diff --git a/docs/usage.md b/docs/usage.md
index 655b0aa023d6bf492fe33aa9d29a683a4fe8867d..82a79b05c3822c50804753cb11e889f1405e3096 100644
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -185,7 +185,7 @@ NXF_OPTS='-Xms1g -Xmx4g'
 ### Hi-C digestion protocol
 
 Here is an command line example for standard DpnII digestion protocols.
-Alignment will be performed on the `mm10` genome with default paramters.
+Alignment will be performed on the `mm10` genome with default parameters.
 Multi-hits will not be considered and duplicates will be removed.
 Note that by default, no filters are applied on DNA and restriction fragment sizes.
 
@@ -250,13 +250,13 @@ run the pipeline:
 
 ### `--bwt2_index`
 
-The bowtie2 indexes are required to run the Hi-C pipeline. If the
+The bowtie2 indexes are required to align the data with the HiC-Pro workflow. If the
 `--bwt2_index` is not specified, the pipeline will either use the igenome
 bowtie2 indexes (see `--genome` option) or build the indexes on-the-fly
 (see `--fasta` option)
 
 ```bash
---bwt2_index '[path to bowtie2 index (with basename)]'
+--bwt2_index '[path to bowtie2 index]'
 ```
 
 ### `--chromosome_size`
@@ -305,7 +305,7 @@ file with coordinates of restriction fragments.
 
 If not specified, this file will be automatically created by the pipline. In this case,
 the `--fasta` reference genome will be used.
-Note that the `--restriction_site` parameter is mandatory to create this file.
+Note that the `--digestion` or `--restriction_site` parameter is mandatory to create this file.
 
 ## Hi-C specific options
 
@@ -313,7 +313,7 @@ The following options are defined in the `nextflow.config` file, and can be
 updated either using a custom configuration file (see `-c` option) or using
 command line parameter.
 
-### Reads mapping
+### HiC-Pro mapping
 
 The reads mapping is currently based on the two-steps strategy implemented in
 the HiC-pro pipeline. The idea is to first align reads from end-to-end.
@@ -398,6 +398,22 @@ Default: 'AAGCTAGCTT'
 
 Exemple of the ARIMA kit: GATCGATC,GANTGATC,GANTANTC,GATCANTC
 
+
+### DNAse Hi-C
+
+#### `--dnase`
+
+In DNAse Hi-C mode, all options related to digestion Hi-C
+(see previous section) are ignored.
+In this case, it is highly recommended to use the `--min_cis_dist` parameter
+to remove spurious ligation products.
+
+```bash
+--dnase
+```
+
+### HiC-Pro processing
+
 #### `--min_restriction_fragment_size`
 
 Minimum size of restriction fragments to consider for the Hi-C processing.
@@ -434,21 +450,6 @@ Default: '0' - no filter
 --max_insert_size '[numeric]'
 ```
 
-### DNAse Hi-C
-
-#### `--dnase`
-
-In DNAse Hi-C mode, all options related to digestion Hi-C
-(see previous section) are ignored.
-In this case, it is highly recommanded to use the `--min_cis_dist` parameter
-to remove spurious ligation products.
-
-```bash
---dnase'
-```
-
-### Hi-C processing
-
 #### `--min_cis_dist`
 
 Filter short range contact below the specified distance.
@@ -479,16 +480,42 @@ Note that in this case the `--min_mapq` parameter is ignored.
 
 ## Genome-wide contact maps
 
+Once the list of valid pairs is available, the standard is now to move to the `cooler`
+framework to build the raw and balanced contact maps in txt and (m)cool formats.
+
 ### `--bin_size`
 
-Resolution of contact maps to generate (space separated).
-Default:'1000000,500000'
+Resolution of contact maps to generate (comma separated).
+Default: '1000000'
 
 ```bash
---bins_size '[numeric]'
+--bin_size '[string]'
 ```
 
-### `--ice_max_iter`
+### `--res_zoomify`
+
+Define the maximum resolution to reach when zoomifying the cool contact maps.
+Default: '5000'
+
+```bash
+--res_zoomify '[string]'
+```
+
+### HiC-Pro contact maps
+
+Note that by default, the contact maps are now generated with the `cooler` framework.
+However, for backward compatibility, the raw and normalized maps can still be generated
+by HiC-Pro if the `--hicpro_maps` parameter is set.
+
+#### `--hicpro_maps`
+
+If specified, the raw and ICE normalized contact maps will be generated by HiC-Pro.
+
+```bash
+--hicpro_maps
+```
+
+#### `--ice_max_iter`
 
 Maximum number of iteration for ICE normalization.
 Default: 100
@@ -497,7 +524,7 @@ Default: 100
 --ice_max_iter '[numeric]'
 ```
 
-### `--ice_filer_low_count_perc`
+#### `--ice_filter_low_count_perc`
 
 Define which pourcentage of bins with low counts should be force to zero.
 Default: 0.02
@@ -506,7 +533,7 @@ Default: 0.02
 --ice_filter_low_count_perc '[numeric]'
 ```
 
-### `--ice_filer_high_count_perc`
+#### `--ice_filter_high_count_perc`
 
 Define which pourcentage of bins with low counts should be discarded before
 normalization. Default: 0
@@ -515,7 +542,7 @@ normalization. Default: 0
 --ice_filter_high_count_perc '[numeric]'
 ```
 
-### `--ice_eps`
+#### `--ice_eps`
 
 The relative increment in the results before declaring convergence for ICE
 normalization. Default: 0.1
@@ -524,6 +551,54 @@ normalization. Default: 0.1
 --ice_eps '[numeric]'
 ```
 
+## Downstream analysis
+
+### Additional quality controls
+
+#### `--res_dist_decay`
+
+Generates distance vs Hi-C counts plots at a given resolution using HiCExplorer.
+Several resolutions can be specified (comma separated). Default: '250000'
+
+```bash
+--res_dist_decay '[string]'
+```
+
+### Compartment calling
+
+Call open/close compartments for each chromosome, using the `cooltools` command.
+
+#### `--res_compartments`
+
+Resolution to call the chromosome compartments (comma separated).
+Default: '250000'
+
+```bash
+--res_compartments '[string]'
+```
+
+### TADs calling
+
+#### `--tads_caller`
+
+TADs calling can be performed using different approaches.
+Currently available options are 'insulation' and 'hicexplorer'.
+Note that all options can be specified (comma separated).
+Default: 'insulation'
+
+```bash
+--tads_caller '[string]'
+```
+
+#### `--res_tads`
+
+Resolution to run the TADs calling analysis (comma separated).
+Default: '40000'
+
+```bash
+--res_tads '[string]'
+```
+
 ## Inputs/Outputs
 
 ### `--split_fastq`
@@ -578,13 +653,13 @@ genome-wide maps are not built. Usefult for capture-C analysis. Default: false
 --skip_maps
 ```
 
-### `--skip_ice`
+### `--skip_balancing`
 
-If defined, the ICE normalization is not run on the raw contact maps.
+If defined, the contact map normalization is not run on the raw contact maps.
 Default: false
 
 ```bash
---skip_ice
+--skip_balancing
 ```
 
 ### `--skip_cool`
@@ -595,6 +670,30 @@ If defined, cooler files are not generated. Default: false
 --skip_cool
 ```
 
+### `--skip_dist_decay`
+
+Do not run distance decay plots. Default: false
+
+```bash
+--skip_dist_decay
+```
+
+### `--skip_compartments`
+
+Do not call compartments. Default: false
+
+```bash
+--skip_compartments
+```
+
+### `--skip_tads`
+
+Do not call TADs. Default: false
+
+```bash
+--skip_tads
+```
+
 ### `--skip_multiQC`
 
 If defined, the MultiQC report is not generated. Default: false