To run a tool within a [Docker container](https://www.docker.com/what-container) you need to write a `Dockerfile`.
The [`Dockerfile`s](./src/docker_modules/kallisto/0.44.0/Dockerfile) are found in the [pipelines/nextflow](https://gitlab.biologie.ens-lyon.fr/pipelines/nextflow) project under `src/docker_modules/`. Each [`Dockerfile`](./src/docker_modules/kallisto/0.44.0/Dockerfile) is paired with a [`docker_init.sh`](./src/docker_modules/kallisto/0.44.0/docker_init.sh) file, as in the following example for `Kallisto` version `0.43.1`:
The [`docker_init.sh`](./src/docker_modules/kallisto/0.44.0/docker_init.sh) is a simple `sh` script with executable rights (`chmod +x`). By executing this script, the user creates a [Docker container](https://www.docker.com/what-container) with the tool installed at a specific version. You can use the [`docker_init.sh`](./src/docker_modules/kallisto/0.44.0/docker_init.sh) file of any implemented tool as a template.
Remember that the name of the [container](https://www.docker.com/what-container) must be in lower case and in the format `<tool_name>:<version>`.
For tools without a version number you can use a commit hash instead.
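As an illustration (the second image name and both paths are hypothetical), the tag passed to `docker build -t` follows this convention:

```sh
# tool with a release number: lower-case name and explicit version
docker build -t 'kallisto:0.44.0' src/docker_modules/kallisto/0.44.0
# tool without a release number: use the commit hash as the version
docker build -t 'sometool:7f3c2e1' src/docker_modules/sometool/7f3c2e1
```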
The recipe to wrap your tool in a [Docker container](https://www.docker.com/what-container) is written in a [`Dockerfile`](./src/docker_modules/kallisto/0.44.0/Dockerfile).
For `Kallisto` version `0.44.0`, the header of the `Dockerfile` is:
...
...
Then we declare the *maintainer* of the container.
You should always declare a variable `TOOLSNAME_VERSION` that contains the version number or commit number of the tool you wrap. In simple cases you just have to modify this line to create a new `Dockerfile` for another version of the tool.
The following lines of the [`Dockerfile`](./src/docker_modules/kallisto/0.44.0/Dockerfile) are a succession of `bash` commands executed as the **root** user within the container.
Each `RUN` block is executed sequentially by `Docker`. If there is an error or a modification in a `RUN` block, only this block and the following ones will be re-executed; the previous blocks are taken from the build cache.
You can learn more about the building of Docker containers [here](https://docs.docker.com/engine/reference/builder/#usage).
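To make these two points concrete, here is a minimal, hypothetical skeleton (the maintainer and the download URL are placeholders; this is not the project's actual `Kallisto` recipe):

```Dockerfile
# hypothetical skeleton, not the real Kallisto Dockerfile
FROM ubuntu:18.04
MAINTAINER your_name <your@email>

# in simple cases, building another version only requires changing this line
ENV KALLISTO_VERSION=0.44.0

# first RUN block: install the dependencies needed to fetch the tool
RUN apt-get update && \
    apt-get install -y curl

# second RUN block: download and install the tool itself
# (placeholder URL: use the tool's real download address)
RUN curl -L -o kallisto.tar.gz "https://example.org/kallisto-v${KALLISTO_VERSION}.tar.gz" && \
    tar -xzf kallisto.tar.gz && \
    cp kallisto*/kallisto /usr/local/bin/
```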
When you build your [`Dockerfile`](./src/docker_modules/kallisto/0.44.0/Dockerfile), instead of launching the [`docker_init.sh`](./src/docker_modules/kallisto/0.44.0/docker_init.sh) script many times to test your [container](https://www.docker.com/what-container), you can connect to a base container in interactive mode to test your commands.
```sh
docker run -it ubuntu:18.04 bash
...
...
The last step to wrap your tool is to make it available in nextflow. For this you need to create at least 4 files, like the following for Kallisto version `0.44.0`:
```sh
ls -lR src/nf_modules/kallisto
src/nf_modules/kallisto/:
total 12
-rw-r--r-- 1 laurent users 551 Jun 18 17:14 index.nf
-rw-r--r-- 1 laurent users 901 Jun 18 17:14 mapping_paired.nf
...
...
-rwxr-xr-x 1 laurent users 627 Jun 18 17:14 tests.sh*
```
The [`.config` files](./src/nf_modules/kallisto/) contain the instructions for two profiles: `psmn` and `docker`.
The [`.nf` files](./src/nf_modules/kallisto/) contain the nextflow processes to use `Kallisto`.
The [`tests/tests.sh`](./src/nf_modules/kallisto/tests/tests.sh) script (with executable rights) contains a series of nextflow calls on the other `.nf` files of the folder. These tests run the `*.nf` files present in the [`kallisto` folder](./src/nf_modules/kallisto/) on the [LBMC/tiny_dataset](https://gitlab.biologie.ens-lyon.fr/LBMC/tiny_dataset) dataset with the `docker` profile. You can read the *Running the tests* section of the [README.md](https://gitlab.biologie.ens-lyon.fr/pipelines/nextflow/blob/master/README.md).
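Each of these calls is an ordinary `nextflow` run with the `docker` profile. As an indicative example (assuming the `nextflow` launcher sits at the root of the repository; the `.config` name and the `--fasta` parameter are placeholders for whatever the module actually defines):

```sh
./nextflow src/nf_modules/kallisto/index.nf \
  -c src/nf_modules/kallisto/index.config \
  -profile docker \
  --fasta "data/tiny_dataset/fasta/tiny_v2.fasta"
```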
The `.config` file defines the configuration to apply to your processes depending on the value of the `-profile` option. You must define a configuration for at least the `psmn` and `docker` profiles.
...
...
The `beforeScript` variable is executed before the main script for the corresponding process.
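Put together, and with hypothetical process, container and module names, the general shape of such a configuration could look like this sketch (not the project's actual file):

```Groovy
profiles {
  docker {
    docker.enabled = true
    process {
      withName: index_fasta {
        container = "kallisto:0.44.0"
      }
    }
  }
  psmn {
    process {
      withName: index_fasta {
        // hypothetical module name: load the tool on the cluster before the script runs
        beforeScript = "module load Kallisto/0.44.0"
      }
    }
  }
}
```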
The `src/docker_modules` folder contains the code to wrap tools in Docker:
```sh
ls -l src/docker_modules/
drwxr-xr-x 3 laurent _lpoperator 96 May 25 15:42 bedtools/
drwxr-xr-x 4 laurent _lpoperator 128 Jun 5 16:14 bowtie2/
drwxr-xr-x 3 laurent _lpoperator 96 May 25 15:42 fastqc/
drwxr-xr-x 4 laurent _lpoperator 128 Jun 5 16:14 htseq/
```
To each `tool/version` folder correspond two files:
```sh
ls -l src/docker_modules/bowtie2/2.3.4.1/
-rw-r--r-- 1 laurent _lpoperator 283 Jun 5 15:07 Dockerfile
-rwxr-xr-x 1 laurent _lpoperator 79 Jun 5 16:18 docker_init.sh*
```
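The `docker_init.sh` is typically just a one-line wrapper around `docker build`; a hypothetical version for this module could look like:

```sh
#!/bin/sh
docker build -t 'bowtie2:2.3.4.1' src/docker_modules/bowtie2/2.3.4.1
```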
...
...
The second step of the pipeline is to trim reads by quality.
Browse to [src/nf_modules/urqt/trimming_paired.nf](https://gitlab.biologie.ens-lyon.fr/pipelines/nextflow/blob/master/src/nf_modules/urqt/trimming_paired.nf); this file contains examples for UrQt. We are interested in the *for paired-end data* section of the code. Copy the process section code into your pipeline and commit it.
This code won’t work if you try to run it: the `fastq_file` channel is already consumed by the `adaptor_removal` process. In nextflow, once a channel is used by a process, it ceases to exist. Moreover, we don’t want to trim the input fastq; we want to trim the fastq that comes from the `adaptor_removal` process.
...
...
```Groovy
set pair_id, file(reads) from fastq_files_cut
```
The two processes are now connected by the channel `fastq_files_cut`.
Add the content of the [src/nf_modules/urqt/trimming_paired.config](https://gitlab.biologie.ens-lyon.fr/pipelines/nextflow/blob/master/src/nf_modules/urqt/trimming_paired.config) file to your `src/RNASeq.config` file and commit it.
You can test your pipeline.
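For example, with a hypothetical `--fastq` parameter and an indicative path to the tiny test dataset, such a run could look like:

```sh
./nextflow src/RNASeq.nf \
  -c src/RNASeq.config \
  -profile docker \
  --fastq "data/tiny_dataset/fastq/*_R{1,2}.fastq"
```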
...
...
Kallisto needs the sequences of the transcripts that need to be quantified. We are going to extract these sequences from the reference `data/tiny_dataset/fasta/tiny_v2.fasta` with the `bed` annotation `data/tiny_dataset/annot/tiny.bed`.
You can copy to your `src/RNASeq.nf` file the content of [src/nf_modules/bedtools/fasta_from_bed.nf](https://gitlab.biologie.ens-lyon.fr/pipelines/nextflow/blob/master/src/nf_modules/bedtools/fasta_from_bed.nf) and to your `src/RNASeq.config` file the content of [src/nf_modules/bedtools/fasta_from_bed.config](https://gitlab.biologie.ens-lyon.fr/pipelines/nextflow/blob/master/src/nf_modules/bedtools/fasta_from_bed.config).
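For reference, the module presumably wraps the `bedtools getfasta` subcommand; outside of nextflow, the extraction would look roughly like this (the output file name is arbitrary):

```sh
bedtools getfasta -fi data/tiny_dataset/fasta/tiny_v2.fasta \
  -bed data/tiny_dataset/annot/tiny.bed \
  -fo transcripts.fasta
```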
Commit your work and test your pipeline with the following command:
...
...
Kallisto runs in two steps: the indexing of the reference and the quantification of the reads on this index.
You can copy to your `src/RNASeq.nf` file the content of the files [src/nf_modules/kallisto/indexing.nf](https://gitlab.biologie.ens-lyon.fr/pipelines/nextflow/blob/master/src/nf_modules/kallisto/indexing.nf) and [src/nf_modules/kallisto/mapping_paired.nf](https://gitlab.biologie.ens-lyon.fr/pipelines/nextflow/blob/master/src/nf_modules/kallisto/mapping_paired.nf). You can add to your `src/RNASeq.config` file the content of the files [src/nf_modules/kallisto/indexing.config](https://gitlab.biologie.ens-lyon.fr/pipelines/nextflow/blob/master/src/nf_modules/kallisto/indexing.config) and [src/nf_modules/kallisto/mapping_paired.config](https://gitlab.biologie.ens-lyon.fr/pipelines/nextflow/blob/master/src/nf_modules/kallisto/mapping_paired.config).
We are going to work with paired-end data, so only copy the relevant processes. The `index_fasta` process needs to take as input the output of your `fasta_from_bed` process, while the `mapping_fastq` process needs to take as input both the output of your `index_fasta` process and the trimmed `fastq` files produced by the `trimming` process.
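Schematically, that wiring could look like the following sketch, using the same `from`/`into` channel syntax as the module files (channel names, output patterns and the exact `kallisto` options are illustrative, not necessarily those of the real modules):

```Groovy
process index_fasta {
  input:
  file fasta from fasta_files                      // output channel of fasta_from_bed
  output:
  file "*.index" into index_files
  script:
  """
  kallisto index -i ${fasta.baseName}.index ${fasta}
  """
}

process mapping_fastq {
  input:
  file index from index_files.collect()            // output channel of index_fasta
  set pair_id, file(reads) from fastq_files_trim   // output channel of the trimming process
  output:
  file "${pair_id}" into counts_files
  script:
  """
  kallisto quant -i ${index} -o ${pair_id} ${reads[0]} ${reads[1]}
  """
}
```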