[LINT] fix linting issues

2a24e909 · nservant · 273cba0e · 2a24e909
Commit 2a24e909 authored 2 years ago by nservant
--- a/docs/output.md
+++ b/docs/output.md
@@ -9,20 +9,20 @@ The directories listed below will be created in the results directory after the

 The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes data using the following steps:

- [From raw data to valid pairs](#from-raw-data-to-valid-pairs)
-  - [HiC-Pro](#hicpro)
-    - [Reads alignment](#reads-alignment)
-    - [Valid pairs detection](#valid-pairs-detection)
-    - [Duplicates removal](#duplicates-removal)
-    - [Contact maps](#hicpro-contact-maps)
- [Hi-C contact maps](#hic-contact-maps)
- [Downstream analysis](#downstream-analysis)
-  - [Distance decay](#distance-decay)
-  - [Compartments calling](#compartments-calling)
-  - [TADs calling](#tads-calling)
- [MultiQC](#multiqc) - aggregate report and quality controls, describing
+* [From raw data to valid pairs](#from-raw-data-to-valid-pairs)
+  * [HiC-Pro](#hicpro)
+    * [Reads alignment](#reads-alignment)
+    * [Valid pairs detection](#valid-pairs-detection)
+    * [Duplicates removal](#duplicates-removal)
+    * [Contact maps](#hicpro-contact-maps)
+* [Hi-C contact maps](#hic-contact-maps)
+* [Downstream analysis](#downstream-analysis)
+  * [Distance decay](#distance-decay)
+  * [Compartments calling](#compartments-calling)
+  * [TADs calling](#tads-calling)
+* [MultiQC](#multiqc) - aggregate report and quality controls, describing
  results of the whole pipeline
- [Export](#exprot) - additionnal export for compatibility with downstream
+* [Export](#exprot) - additionnal export for compatibility with downstream
  analysis tool and visualization

 ## From raw data to valid pairs
@@ -50,17 +50,17 @@ mapping step.

 **Output directory: `results/hicpro/mapping`**

- `*bwt2pairs.bam` - final BAM file with aligned paired data
+* `*bwt2pairs.bam` - final BAM file with aligned paired data

 if `--save_aligned_intermediates` is specified, additional mapping file results
 are available ;

- `*.bam` - Aligned reads (R1 and R2) from end-to-end alignment
- `*_unmap.fastq` - Unmapped reads after end-to-end alignment
- `*_trimmed.fastq` - Trimmed reads after end-to-end alignment
- `*_trimmed.bam` - Alignment of trimmed reads
- `*bwt2merged.bam` - merged BAM file after the two-steps alignment
- `*.mapstat` - mapping statistics per read mate
+* `*.bam` - Aligned reads (R1 and R2) from end-to-end alignment
+* `*_unmap.fastq` - Unmapped reads after end-to-end alignment
+* `*_trimmed.fastq` - Trimmed reads after end-to-end alignment
+* `*_trimmed.bam` - Alignment of trimmed reads
+* `*bwt2merged.bam` - merged BAM file after the two-steps alignment
+* `*.mapstat` - mapping statistics per read mate

 Usually, a high fraction of reads is expected to be aligned on the genome
 (80-90%). Among them, we usually observed a few percent (around 10%) of step 2
@@ -79,14 +79,14 @@ reference genome and the digestion protocol.

 Invalid pairs are classified as follow:

- Dangling end, i.e. unligated fragments (both reads mapped on the same
+* Dangling end, i.e. unligated fragments (both reads mapped on the same
  restriction fragment)
- Self circles, i.e. fragments ligated on themselves (both reads mapped on the
+* Self circles, i.e. fragments ligated on themselves (both reads mapped on the
  same restriction fragment in inverted orientation)
- Religation, i.e. ligation of juxtaposed fragments
- Filtered pairs, i.e. any pairs that do not match the filtering criteria on
+* Religation, i.e. ligation of juxtaposed fragments
+* Filtered pairs, i.e. any pairs that do not match the filtering criteria on
  inserts size, restriction fragments size
- Dumped pairs, i.e. any pairs for which we were not able to reconstruct the
+* Dumped pairs, i.e. any pairs for which we were not able to reconstruct the
  ligation product.

 Only valid pairs involving two different restriction fragments are used to
@@ -102,12 +102,12 @@ can thus be discarded using the `--min_cis_dist` parameter.

 **Output directory: `results/hicpro/valid_pairs`**

- `*.validPairs` - List of valid ligation products
- `*.DEpairs` - List of dangling-end products
- `*.SCPairs` - List of self-circle products
- `*.REPairs` - List of religation products
- `*.FiltPairs` - List of filtered pairs
- `*RSstat` - Statitics of number of read pairs falling in each category
+* `*.validPairs` - List of valid ligation products
+* `*.DEpairs` - List of dangling-end products
+* `*.SCPairs` - List of self-circle products
+* `*.REPairs` - List of religation products
+* `*.FiltPairs` - List of filtered pairs
+* `*RSstat` - Statitics of number of read pairs falling in each category

 Of note, these results are saved only if `--save_pairs_intermediates` is used.  
 The `validPairs` are stored using a simple tab-delimited text format ;
@@ -138,7 +138,7 @@ removed (see `--keep_dups` to disable duplicates filtering).

 **Output directory: `results/hicpro/valid_pairs`**

- `*allValidPairs` - combined valid pairs from all read chunks
+* `*allValidPairs` - combined valid pairs from all read chunks

 Additional quality controls such as fragment size distribution can be extracted
 from the list of valid interaction products.
@@ -160,7 +160,7 @@ detection of valid pairs.

 **Output directory: `results/hicpro/valid_pairs/pairix`**

- `*pairix` - compressed and indexed pairs file
+* `*pairix` - compressed and indexed pairs file

 #### Statistics

@@ -169,10 +169,10 @@ All results are available in `results/hicpro/stats`.

 **Output directory: `results/hicpro/stats`**

- *mapstat - mapping statistics per read mate
- *pairstat - R1/R2 pairing statistics
- *RSstat - Statitics of number of read pairs falling in each category
- *mergestat - statistics about duplicates removal and valid pairs information
+* \*mapstat - mapping statistics per read mate
+* \*pairstat - R1/R2 pairing statistics
+* \*RSstat - Statitics of number of read pairs falling in each category
+* \*mergestat - statistics about duplicates removal and valid pairs information

 #### Contact maps

@@ -192,15 +192,15 @@ is specified on the command line.

 **Output directory: `results/hicpro/matrix`**

- `*.matrix` - genome-wide contact maps
- `*_iced.matrix` - genome-wide iced contact maps
+* `*.matrix` - genome-wide contact maps
+* `*_iced.matrix` - genome-wide iced contact maps

 The contact maps are generated for all specified resolutions
 (see `--bin_size` argument).  
 A contact map is defined by :

- A list of genomic intervals related to the specified resolution (BED format).
- A matrix, stored as standard triplet sparse format (i.e. list format).
+* A list of genomic intervals related to the specified resolution (BED format).
+* A matrix, stored as standard triplet sparse format (i.e. list format).

 Based on the observation that a contact map is symmetric and usually sparse,
 only non-zero values are stored for half of the matrix. The user can specified
@@ -254,8 +254,8 @@ Here, we use the implementation available in the [`cooltools`](https://cooltools

 Results are available in **`results/compartments/`** folder and includes :

- `*cis.vecs.tsv`: eigenvectors decomposition along the genome
- `*cis.lam.txt`: eigenvalues associated with the eigenvectors
+* `*cis.vecs.tsv`: eigenvectors decomposition along the genome
+* `*cis.lam.txt`: eigenvalues associated with the eigenvectors

 ### TADs calling

@@ -266,8 +266,8 @@ TADs calling remains a challenging task, and even if many methods have been prop

 Currently, the pipeline proposes two approaches :

- Insulation score using the [`cooltools`](https://cooltools.readthedocs.io/en/latest/cli.html#cooltools-diamond-insulation) package. Results are availabe in **`results/tads/insulation`**.
- [`HiCExplorer TADs calling`](https://hicexplorer.readthedocs.io/en/latest/content/tools/hicFindTADs.html). Results are available at **`results/tads/hicexplorer`**.
+* Insulation score using the [`cooltools`](https://cooltools.readthedocs.io/en/latest/cli.html#cooltools-diamond-insulation) package. Results are availabe in **`results/tads/insulation`**.
+* [`HiCExplorer TADs calling`](https://hicexplorer.readthedocs.io/en/latest/content/tools/hicFindTADs.html). Results are available at **`results/tads/hicexplorer`**.

 Usually, TADs results are presented as simple BED files, or bigWig files, with the position of boundaries along the genome.

@@ -276,10 +276,10 @@ Usually, TADs results are presented as simple BED files, or bigWig files, with t
 <details markdown="1">
 <summary>Output files</summary>

- `multiqc/`
-  - `multiqc_report.html`: a standalone HTML file that can be viewed in your web browser.
-  - `multiqc_data/`: directory containing parsed statistics from the different tools used in the pipeline.
-  - `multiqc_plots/`: directory containing static images from the report in various formats.
+* `multiqc/`
+  * `multiqc_report.html`: a standalone HTML file that can be viewed in your web browser.
+  * `multiqc_data/`: directory containing parsed statistics from the different tools used in the pipeline.
+  * `multiqc_plots/`: directory containing static images from the report in various formats.

 </details>

@@ -292,10 +292,10 @@ Results generated by MultiQC collate pipeline QC from supported tools e.g. FastQ
 <details markdown="1">
 <summary>Output files</summary>

- `pipeline_info/`
-  - Reports generated by Nextflow: `execution_report.html`, `execution_timeline.html`, `execution_trace.txt` and `pipeline_dag.dot`/`pipeline_dag.svg`.
-  - Reports generated by the pipeline: `pipeline_report.html`, `pipeline_report.txt` and `software_versions.yml`. The `pipeline_report*` files will only be present if the `--email` / `--email_on_fail` parameter's are used when running the pipeline.
-  - Reformatted samplesheet files used as input to the pipeline: `samplesheet.valid.csv`.
+* `pipeline_info/`
+  * Reports generated by Nextflow: `execution_report.html`, `execution_timeline.html`, `execution_trace.txt` and `pipeline_dag.dot`/`pipeline_dag.svg`.
+  * Reports generated by the pipeline: `pipeline_report.html`, `pipeline_report.txt` and `software_versions.yml`. The `pipeline_report*` files will only be present if the `--email` / `--email_on_fail` parameter's are used when running the pipeline.
+  * Reformatted samplesheet files used as input to the pipeline: `samplesheet.valid.csv`.

 </details>
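
Review note: every hunk above makes the same mechanical change. The `-` list markers become `*`, and the bare `*mapstat`-style entries under `results/hicpro/stats` get a backslash so the leading asterisk is not read as markup. The commit does not say how the edit was produced. The sketch below is only one way such a rewrite could be scripted. The function name `dash_to_star` and the single-asterisk escaping heuristic are assumptions made for illustration, not part of the pipeline or of any linter.

```python
import re

# Matches an unordered-list item that uses "-" as its marker:
# optional indentation, a dash, one space, then the item text.
DASH_ITEM = re.compile(r"^(\s*)- (.*)$")


def dash_to_star(markdown: str) -> str:
    """Rewrite '-' list markers to '*', leaving fenced code blocks untouched."""
    out = []
    in_fence = False
    for line in markdown.split("\n"):
        # Toggle on fence lines so list-like lines inside code are preserved.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            out.append(line)
            continue
        match = None if in_fence else DASH_ITEM.match(line)
        if match:
            indent, text = match.groups()
            # Heuristic (assumption): a lone leading "*" such as "*mapstat"
            # is a literal glob, so escape it; text with more than one "*"
            # is left alone because it is probably emphasis.
            if text.startswith("*") and text.count("*") == 1:
                text = "\\" + text
            line = f"{indent}* {text}"
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    sample = "- `*.matrix` - genome-wide contact maps\n- *mapstat - mapping statistics per read mate"
    print(dash_to_star(sample))
```

The sketch does not special-case a `- - -` thematic break, which it would rewrite as a list item, so any real use would need a check against the target file first.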