# Contributing
Please first discuss the change you wish to make by email, or on the [ENS-Bioinfo channel]( before making a change.
## Project organisation
The `LBMC/nextflow` project is structured as follows:
- all the code is in the `src/` folder
- scripts downloading external tools should download them in the `bin/` folder
- all the documentation (including this file) can be found in the `doc/` folder
- the `data` and `results` folders contain the data and results of your pipelines and are ignored by `git`
## Code structure
The `src/` folder is where we want to save the pipeline (`.nf`) scripts. This folder also contains:
- the `src/` to install the nextflow executable at the root of the project.
- some pipeline examples (like the one built during the nf_pratical)
- the `src/nextflow.config` global configuration file which contains the `docker`, `singularity`, `psmn` and `ccin2p3` profiles.
- the `src/nf_modules` folder, which contains per-tool `` modules with predefined processes that users can import in their projects with the [DSL2](
It also contains some hidden folders that users don't need to see when building their pipeline:
- the `src/.docker_modules` folder contains the recipes for the `docker` containers used in the `src/nf_modules/<tool_name>/` files
- `src/.singularity_in2p3` and `src/.singularity_psmn` are symbolic links to the shared folders where the singularity images are downloaded on the PSMN and CCIN2P3
# Proposing a new tool
Each tool named `<tool_name>` must have two dedicated folders:
- `src/nf_modules/<tool_name>` where users can find `.nf` files to include
- `src/.docker_modules/<tool_name>/<version_number>` where we have the `.Dockerfile` to construct the container used in the `` file
## `src/nf_modules` guidelines
1. Ensure any install or build dependencies are removed before the end of the layer when doing a
2. Update the with details of changes to the interface, this includes new environment
variables, exposed ports, useful file locations and container parameters.
3. Increase the version numbers in any example files and the to the new version that this
Pull Request would represent. The versioning scheme we use is [SemVer](
4. You may merge the Pull Request in once you have the sign-off of two other developers, or if you
do not have permission to do that, you may request the second reviewer to merge it for you.
We are going to take the `fastp` `nf_module` as an example.
The `src/nf_modules/<tool_name>` folder should contain a `` file that describes at least one process using `<tool_name>`.
The `container_url` definition can then be used in the `container` attribute of each `process`.
In addition to the `container` directive, each `process` should have one of the following `label` attributes (defined in the `src/nextflow.config` file):
- `big_mem_mono_cpus`
- `big_mem_multi_cpus`
- `small_mem_mono_cpus`
- `small_mem_multi_cpus`
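
For illustration only, a label can be tied to resources with a `withLabel` selector inside a profile. The excerpt below is a hypothetical sketch, not a copy of `src/nextflow.config`: the `cpus` and `memory` values are placeholders chosen for the example.

```groovy
// Hypothetical excerpt of a profile; the cpus and memory values are placeholders
profiles {
  docker {
    docker.enabled = true
    process {
      withLabel: big_mem_mono_cpus {
        cpus = 1
        memory = '16 GB'
      }
      withLabel: big_mem_multi_cpus {
        cpus = 4
        memory = '16 GB'
      }
      withLabel: small_mem_mono_cpus {
        cpus = 1
        memory = '2 GB'
      }
      withLabel: small_mem_multi_cpus {
        cpus = 4
        memory = '2 GB'
      }
    }
  }
}
```

A process declared with `label "big_mem_multi_cpus"` then sees the matching `cpus` value as `${task.cpus}` in its script.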
### process options
```groovy
params.fastp = ""
params.fastp_out = ""
process fastp {
  container = "${container_url}"
  label "big_mem_multi_cpus"
  if (params.fastp_out != "") {
    publishDir "results/${params.fastp_out}", mode: 'copy'
  }
  // input and output blocks omitted here
  script:
  """
  fastp --thread ${task.cpus} \
  ${params.fastp} \
  ...
  """
}
```
The user can then change the value of these variables:
- from the command line: `--fastp "--trim_head1=10"`
- with the `include` command within their pipeline: `include { fastp } from "./nf_modules/fastp/main" addParams(fastp_out: "QC/fastq/")`
- by defining the variable within their pipeline: `params.fastp_out = "QC/fastq/"`
You should always use `tuple` for input and output channel format with at least:
- a `val` containing variable(s) related to the item
- a `path` for the file(s) that you want to process
for example:
```groovy
process fastp {
  container = "${container_url}"
  label "big_mem_multi_cpus"
  tag "$file_id"
  if (params.fastp_out != "") {
    publishDir "results/${params.fastp_out}", mode: 'copy'
  }

  input:
  tuple val(file_id), path(reads)

  output:
  tuple val(file_id), path("*.fastq.gz"), emit: fastq
  tuple val(file_id), path("*.html"), emit: html
  tuple val(file_id), path("*.json"), emit: report
  // script block omitted here
}
```
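The rationale behind taking a `file_id` and emitting the same `file_id` is to facilitate complex channel operations in pipelines without having to rewrite the `process` blocks.
Fastq files opened with `channel.fromFilePairs( params.fastq )` are emitted as a pair identifier followed by the list of matching files, which fits the `tuple val(file_id), path(reads)` input.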
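
As a rough sketch of how this convention plays out in a pipeline, the snippet below feeds paired fastq files to the `fastp` process and reads one of its named outputs. It assumes the pipeline file sits next to the `nf_modules` folder, so the module path is relative to it, and the `params.fastq` glob is a placeholder rather than a path from the project.

```groovy
nextflow.enable.dsl = 2

// Assumed module location, relative to this pipeline file
include { fastp } from "./nf_modules/fastp/main" addParams(fastp_out: "QC/fastq/")

// Placeholder glob: point it at your own paired-end fastq files
params.fastq = "data/fastq/*_{1,2}.fastq.gz"

workflow {
  // fromFilePairs emits [ pair_id, [ read_1, read_2 ] ],
  // which matches `tuple val(file_id), path(reads)`
  channel
    .fromFilePairs(params.fastq)
    .set { fastq_files }

  fastp(fastq_files)

  // each named output carries the same file_id as its first element
  fastp.out.fastq.view { file_id, reads -> "${file_id}: ${reads}" }
}
```

Because every output starts with the same `file_id`, the `fastq`, `html` and `report` channels can be matched with one another (for instance with `join`) without touching the `process` definition.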
### Handling single and paired end data
For process that have to deal with single
