# bolero

bolero is a Nextflow pipeline for analysing Nanopore sequencing data obtained after 5'RACE amplification of HBV RNAs.

## Getting the latest updates

To get the latest commits from this repository, use the following command:

```sh
git clone http://gitbio.ens-lyon.fr/xgrand/bolero.git
```

## Getting Started

The pipeline `src/bolero.nf` works with a Nextflow configuration file, `src/nextflow.config`.
The typical command for running the pipeline is as follows:
`nextflow ./src/bolero.nf -c ./src/nextflow.config -profile singularity`

The typical command to obtain help:
`nextflow ./src/bolero.nf --help`

The arguments of this pipeline are described in the table below:

|         Arguments           |                             Description                             | 
|:---------------------------:|:-------------------------------------------------------------------:| 
| -c | Configuration file. This parameter should always be `src/nextflow.config`. |
| -profile | The profile to use. This can be **docker** or **singularity** to run the pipeline in a Docker or Singularity container, respectively. This can also be **psmn** to launch the analysis on the PSMN. |
| --input [path] | Path to the folder containing FASTQ files. If the skip-basecalling option is disabled, path to the folder containing FAST5 files. |
| --adapt [str] | Sequence of 5'RACE adapter. |
| --genome [file] | Path to the FASTA file containing the genome. An HBV preCore reference sequence is available in the `data` folder. |
| --skipBC [boolean] | Skip the basecalling step. If true, give the FASTQ folder as input. Default: true. |
| --flowcell [str] | Nanopore flowcell. Default = FLO-MIN106. |
| --kit [str] | Nanopore kit. Default = SQK-PBK004. |
| --gpu_mode [str] | Guppy basecaller configuration. Default: false. The "gpu" mode is intended for NVIDIA CUDA-compatible systems, according to the Guppy specifications. |
| --min_qscore [float] | Minimum quality score threshold, default = 7.0. |
| --gpu_runners_per_device [int] | Number of runners per device, default = 32 (refer to guppy manual). |
| --num_callers [int] | Number of callers, default = 16 (refer to guppy manual). |
| --chunks_per_runner [int] | Number of chunks per runner, default = 512 (refer to guppy manual). |
| --chunks_size [int] | Chunk size, default = 1900 (refer to guppy manual). |
| --help --h | Display this help message. |
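
As an illustration, the arguments in the table can be combined into a single invocation. The sketch below assembles a command that runs the pipeline with basecalling enabled, using the documented defaults for the flowcell, kit and quality threshold. The input and genome paths are hypothetical placeholders, and the script only prints the command so it can be reviewed before launching.

```sh
# Hypothetical paths: replace them with your own data.
input_dir="/path/to/fast5_folder"
genome="/path/to/reference.fasta"

# Build the argument list from the options documented in the table.
set -- ./src/bolero.nf \
    -c ./src/nextflow.config \
    -profile singularity \
    --input "$input_dir" \
    --genome "$genome" \
    --skipBC false \
    --flowcell FLO-MIN106 \
    --kit SQK-PBK004 \
    --min_qscore 7.0

# Print the command for review; remove "echo" to actually launch it.
echo nextflow "$@"
```

Printing the command first makes it easy to spot a wrong path or profile before a long basecalling run starts.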

## Test Bolero

1. Simulate 5'RACE sequenced reads. This requires the pbsim3 software: https://github.com/yukiteruono/pbsim3

To produce a complete transcriptome, you can run:
```
path_to_bolero=./bolero
path_to_pbsim3=/opt/Programs/pbsim3
mkdir -p 01_basecalling
for i in $(seq 1 30)
do
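    # The first column of the expression file gives the transcript name.
    extract=$(cut -f1 "${path_to_bolero}/data/simulation/expression.transcript_${i}")
    mkdir -p "01_basecalling/${extract}"
    # Simulate Nanopore reads for this transcript with the ONT error model.
    "${path_to_pbsim3}/src/pbsim" --strategy trans --transcript "${path_to_bolero}/data/simulation/expression.transcript_${i}" --id-prefix "${extract}" --method errhmm --errhmm "${path_to_pbsim3}/data/ERRHMM-ONT.model"
    # Keep the simulated reads (sd.fastq), discard the alignment file (sd.maf).
    mv sd.fastq "01_basecalling/${extract}/${extract}.fastq"
    rm sd.maf
    gzip "01_basecalling/${extract}/${extract}.fastq"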
done
```

2. Run Bolero:
```
cd <PATH_TO_Bolero>
nextflow ./src/bolero.nf -c ./src/nextflow.config -profile <PROFILE> --input <PATH_TO_01_basecalling>
```
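
Before launching step 2, it can be useful to confirm that the simulation in step 1 produced a non-empty compressed FASTQ file for each transcript. The sketch below assumes the `01_basecalling/<name>/<name>.fastq.gz` layout created by the loop above and standard four-line FASTQ records; `count_reads` is a helper name introduced here for illustration only.

```sh
# Count reads in a gzipped FASTQ file (assumes 4 lines per record).
count_reads() {
    gzip -dc "$1" | awk 'END { print NR / 4 }'
}

# Report the number of simulated reads for each sample folder.
for fq in 01_basecalling/*/*.fastq.gz
do
    [ -e "$fq" ] || continue   # skip if the glob matched nothing
    printf '%s\t%s\n' "$fq" "$(count_reads "$fq")"
done
```

A folder that reports 0 reads, or that is missing from the output, suggests the corresponding pbsim run did not complete.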

## Reference sequence

The HBV reference sequence, genotype D ayw, is available in the `data` folder.

## Contributing

If you want to add more tools to this project, please read [CONTRIBUTING.md](CONTRIBUTING.md).

## Authors

* **Xavier Grand** - *Maintainer*
* **Alia Rifki** - *Contributor*

## License
[Info](#){.btn .btn-info}
This project is licensed under the CeCILL License; see the [LICENSE](LICENSE) file for details.

[Warning](#){.btn .btn-warning}
The optional basecalling and demultiplexing steps may be carried out if necessary but are not executed automatically. 
To execute these steps, it is essential to adhere to the guidelines provided with the Guppy software from Oxford Nanopore Technologies.

## To Do:

* Short-term updates: replace Guppy with Dorado.