diff --git a/README.md b/README.md
index 9ba1078d093723191c746198725674ee1384ab8e..73e2665b84470263b8d20c128cba22be7db2d6ae 100644
--- a/README.md
+++ b/README.md
@@ -46,8 +46,8 @@ results highly reproducible.
 4. Export to various contact maps formats ([`HiC-Pro`](https://github.com/nservant/HiC-Pro), [`cooler`](https://github.com/open2c/cooler))
 5. Quality controls ([`HiC-Pro`](https://github.com/nservant/HiC-Pro), [`HiCExplorer`](https://github.com/deeptools/HiCExplorer))
 6. Compartments calling ([`cooltools`](https://cooltools.readthedocs.io/en/latest/))
-8. TADs calling ([`HiCExplorer`](https://github.com/deeptools/HiCExplorer), [`cooltools`](https://cooltools.readthedocs.io/en/latest/))
-9. Quality control report ([`MultiQC`](https://multiqc.info/))
+7. TADs calling ([`HiCExplorer`](https://github.com/deeptools/HiCExplorer), [`cooltools`](https://cooltools.readthedocs.io/en/latest/))
+8. Quality control report ([`MultiQC`](https://multiqc.info/))
 
 ## Quick Start
 
diff --git a/docs/output.md b/docs/output.md
index 342ce3a704e7d0cdf6ba5fed8f28c00a0d4d8f1f..d73bce332fae54e6248816828b963d38b24f5eac 100644
--- a/docs/output.md
+++ b/docs/output.md
@@ -15,7 +15,7 @@ and processes data using the following steps:
 * [Valid pairs detection](#valid-pairs-detection)
 * [Duplicates removal](#duplicates-removal)
 * [Contact maps](#hicpro-contact-maps)
-* [Contact maps](#contact-maps)
+* [Hi-C contact maps](#hic-contact-maps)
 * [Downstream analysis](#downstream-analysis)
 * [Distance decay](#distance-decay)
 * [Compartments calling](#compartments calling)
@@ -193,7 +193,7 @@ files.
 This format is memory efficient, and is compatible with several software for
 downstream analysis.
 
-## Contact maps
+## Hi-C contact maps
 
 Contact maps are usually stored as simple txt (`HiC-Pro`), .hic (`Juicer/Juicebox`) and .(m)cool (`cooler/Higlass`) formats. Note that .cool and .hic format are compressed and usually much more efficient that the txt format.
 
@@ -227,6 +227,7 @@ Although different methods have been proposed for compartment calling, the stand
 Here, we use the implementation available in the [`cooltools`](https://cooltools.readthedocs.io/en/lates) package.
 
 Results are available in **`results/compartments/`** folder and includes :
+
 * `*cis.vecs.tsv`: eigenvectors decomposition along the genome
 * `*cis.lam.txt`: eigenvalues associated with the eigenvectors
 
@@ -236,9 +237,11 @@ TADs has been described as functional units of the genome.
 While contacts between genes and regulatority elements can occur within a single TADs, contacts between TADs are much less frequent, mainly due to the presence of insulation protein (such as CTCF) at their boundaries. Looking at Hi-C maps, TADs look like triangles around the diagonal. According to the contact map resolutions, TADs appear as hierarchical structures with a median size around 1Mb (in mammals), as well as smaller structures usually called sub-TADs of smaller size.
 
 TADs calling remains a challenging task, and even if many methods have been proposed in the last decade, little overlap have been found between their results.
+
 Currently, the pipeline proposes two approaches :
-- Insulation score using the [`cooltools`](https://cooltools.readthedocs.io/en/latest/cli.html#cooltools-diamond-insulation) package. Results are availabe in **`results/tads/insulation`**.
-- [`HiCExplorer TADs calling`](https://hicexplorer.readthedocs.io/en/latest/content/tools/hicFindTADs.html). Results are available at **`results/tads/hicexplorer`**.
+
+* Insulation score using the [`cooltools`](https://cooltools.readthedocs.io/en/latest/cli.html#cooltools-diamond-insulation) package. Results are availabe in **`results/tads/insulation`**.
+* [`HiCExplorer TADs calling`](https://hicexplorer.readthedocs.io/en/latest/content/tools/hicFindTADs.html). Results are available at **`results/tads/hicexplorer`**.
 
 Usually, TADs results are presented as simple BED files, or bigWig files, with the position of boundaries along the genome.
 
diff --git a/docs/usage.md b/docs/usage.md
index 3c39e290986aa5f0efc59c628bac3c2984c8c186..f072ba565b84454879b8f1e9f83cf368f74ae24f 100644
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -398,7 +398,6 @@ Default: 'AAGCTAGCTT'
 
 Exemple of the ARIMA kit: GATCGATC,GANTGATC,GANTANTC,GATCANTC
 
-
 ### DNAse Hi-C
 
 #### `--dnase`