From 97d3d5dae107962cc9d09a2e850082fbd5a4edd5 Mon Sep 17 00:00:00 2001
From: nservant <nservant@curie.fr>
Date: Mon, 14 Oct 2019 11:24:14 +0200
Subject: [PATCH] update doc

---
 README.md                               |  10 +-
 docs/configuration/adding_your_own.md   | 126 --------------------
 docs/configuration/local.md             |  76 ------------
 docs/configuration/reference_genomes.md |  68 -----------
 docs/installation.md                    | 148 ------------------------
 docs/troubleshooting.md                 |  43 -------
 6 files changed, 5 insertions(+), 466 deletions(-)
 delete mode 100644 docs/configuration/adding_your_own.md
 delete mode 100644 docs/configuration/local.md
 delete mode 100644 docs/configuration/reference_genomes.md
 delete mode 100644 docs/installation.md
 delete mode 100644 docs/troubleshooting.md

diff --git a/README.md b/README.md
index b23872f..e59b529 100644
--- a/README.md
+++ b/README.md
@@ -44,14 +44,14 @@ sites (bowtie2)
 
 The nf-core/hic pipeline comes with documentation about the pipeline, found in
 the `docs/` directory:
 
-1. [Installation](docs/installation.md)
+1. [Installation](https://nf-co.re/usage/installation)
 2. Pipeline configuration
-    * [Local installation](docs/configuration/local.md)
-    * [Adding your own system](docs/configuration/adding_your_own.md)
-    * [Reference genomes](docs/configuration/reference_genomes.md)
+    * [Local installation](https://nf-co.re/usage/local_installation)
+    * [Adding your own system config](https://nf-co.re/usage/adding_own_config)
+    * [Reference genomes](https://nf-co.re/usage/reference_genomes)
 3. [Running the pipeline](docs/usage.md)
 4. [Output and how to interpret the results](docs/output.md)
-5. [Troubleshooting](docs/troubleshooting.md)
+5. [Troubleshooting](https://nf-co.re/usage/troubleshooting)
 
 ## Credits

diff --git a/docs/configuration/adding_your_own.md b/docs/configuration/adding_your_own.md
deleted file mode 100644
index b1703c1..0000000
--- a/docs/configuration/adding_your_own.md
+++ /dev/null
@@ -1,126 +0,0 @@
-# nf-core/hic: Configuration for other clusters
-
-It is entirely possible to run this pipeline on other clusters, though you will
-need to set up your own config file so that the pipeline knows how to work with
-your cluster.
-
-> If you think that there are other people using the pipeline who would benefit
-from your configuration (e.g. other common cluster setups), please let us know.
-We can add a new configuration and profile which can be used by specifying
-`-profile <name>` when running the pipeline. The config file will then be
-hosted at `nf-core/configs` and will be pulled automatically before the pipeline
-is executed.
-
-If you are the only person to be running this pipeline, you can create your
-config file as `~/.nextflow/config` and it will be applied every time you run
-Nextflow. Alternatively, save the file anywhere and reference it when running
-the pipeline with `-c path/to/config` (see the
-[Nextflow documentation](https://www.nextflow.io/docs/latest/config.html)
-for more).
-
-A basic configuration comes with the pipeline, which loads the
-[`conf/base.config`](../../conf/base.config) by default. This means that you
-only need to configure the specifics for your system and overwrite any defaults
-that you want to change.
-
-## Cluster Environment
-
-By default, the pipeline uses the `local` Nextflow executor - in other words,
-all jobs are run in the login session. If you're using a simple server, this
-may be fine. If you're using a compute cluster, this is bad as all jobs will
-run on the head node.
-
-To specify your cluster environment, add the following line to your config
-file:
-
-```nextflow
-process.executor = 'YOUR_SYSTEM_TYPE'
-```
-
-Many different cluster types are supported by Nextflow. For more information,
-please see the
-[Nextflow documentation](https://www.nextflow.io/docs/latest/executor.html).
-
-Note that you may need to specify cluster options, such as a project or queue.
-To do so, use the `clusterOptions` config option:
-
-```nextflow
-process {
-  executor = 'slurm'
-  clusterOptions = '-A myproject'
-}
-```
-
-## Software Requirements
-
-To run the pipeline, several software packages are required. How you satisfy
-these requirements is essentially up to you and depends on your system.
-If possible, we _highly_ recommend using either Docker or Singularity.
-
-Please see the [`installation documentation`](../installation.md) for how to
-run using the below as a one-off. These instructions are about configuring a
-config file for repeated use.
-
-### Docker
-
-Docker is a great way to run nf-core/hic, as it manages all software
-installations and allows the pipeline to be run in an identical software
-environment across a range of systems.
-
-Nextflow has
-[excellent integration](https://www.nextflow.io/docs/latest/docker.html)
-with Docker, and beyond installing the two tools, not much else is required -
-Nextflow will automatically fetch, at run time, the
-[nfcore/hic](https://hub.docker.com/r/nfcore/hic/) image that we have created
-and hosted at dockerhub.
-
-To add docker support to your own config file, add the following:
-
-```nextflow
-docker.enabled = true
-process.container = "nfcore/hic"
-```
-
-Note that the dockerhub organisation name annoyingly can't have a hyphen,
-so is `nfcore` and not `nf-core`.
-
-### Singularity image
-
-Many HPC environments are not able to run Docker due to security issues.
-[Singularity](http://singularity.lbl.gov/) is a tool designed to run on such
-HPC systems and is very similar to Docker.
-
-To specify singularity usage in your pipeline config file, add the following:
-
-```nextflow
-singularity.enabled = true
-process.container = "shub://nf-core/hic"
-```
-
-If you intend to run the pipeline offline, nextflow will not be able to
-automatically download the singularity image for you.
-Instead, you'll have to do this yourself manually first, transfer the image
-file and then point to that.
-
-First, pull the image file where you have an internet connection:
-
-```bash
-singularity pull --name nf-core-hic.simg shub://nf-core/hic
-```
-
-Then transfer this file and point the config file to the image:
-
-```nextflow
-singularity.enabled = true
-process.container = "/path/to/nf-core-hic.simg"
-```
-
-### Conda
-
-If you're not able to use Docker or Singularity, you can instead use conda to
-manage the software requirements.
-To use conda in your own config file, add the following:
-
-```nextflow
-process.conda = "$baseDir/environment.yml"
-```
diff --git a/docs/configuration/local.md b/docs/configuration/local.md
deleted file mode 100644
index c3a047f..0000000
--- a/docs/configuration/local.md
+++ /dev/null
@@ -1,76 +0,0 @@
-# nf-core/hic: Local Configuration
-
-If running the pipeline in a local environment, we highly recommend using
-either Docker or Singularity.
-
-## Docker
-
-Docker is a great way to run `nf-core/hic`, as it manages all software
-installations and allows the pipeline to be run in an identical software
-environment across a range of systems.
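If you want to confirm that Docker is set up correctly, or simply fetch the container ahead of a run, a minimal sketch (using the `nfcore/hic` image name referenced throughout these docs):

```bash
# Optional sanity check: pre-pull the pipeline container so the first run
# does not have to wait for the image download
docker pull nfcore/hic
```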
-
-Nextflow has
-[excellent integration](https://www.nextflow.io/docs/latest/docker.html) with
-Docker, and beyond installing the two tools, not much else is required.
-The `nf-core/hic` pipeline comes with a configuration profile for docker, making
-it very easy to use. This also comes with the required presets to use the AWS
-iGenomes resource, meaning that if using common reference genomes you just
-specify the reference ID and it will be automatically downloaded from AWS S3.
-
-First, install docker on your system:
-[Docker Installation Instructions](https://docs.docker.com/engine/installation/)
-
-Then, simply run the analysis pipeline:
-
-```bash
-nextflow run nf-core/hic -profile docker --genome '<genome ID>'
-```
-
-Nextflow will recognise `nf-core/hic` and download the pipeline from GitHub.
-The `-profile docker` configuration lists the
-[nf-core/hic](https://hub.docker.com/r/nfcore/hic/) image that we have created
-and hosted at dockerhub, and this image is downloaded at run time.
-
-For more information about how to work with reference genomes, see
-[`docs/configuration/reference_genomes.md`](reference_genomes.md).
-
-### Pipeline versions
-
-The public docker images are tagged with the same version numbers as the code,
-which you can use to ensure reproducibility. When running the pipeline,
-specify the pipeline version with `-r`, for example `-r 1.0`. This uses
-pipeline code and docker image from this tagged version.
-
-## Singularity image
-
-Many HPC environments are not able to run Docker due to security issues.
-[Singularity](http://singularity.lbl.gov/) is a tool designed to run on such
-HPC systems and is very similar to Docker. Even better, it can create
-images directly from dockerhub.
-
-To use the singularity image for a single run, use `-with-singularity`.
-This will download the docker container from dockerhub and create a singularity
-image for you dynamically.
-
-If you intend to run the pipeline offline, nextflow will not be able to
-automatically download the singularity image for you. Instead, you'll have
-to do this yourself manually first, transfer the image file and then point to
-that.
-
-First, pull the image file where you have an internet connection:
-
-> NB: The "tag" at the end of this command corresponds to the pipeline version.
-> Here, we're pulling the docker image for version 1.0 of the nf-core/hic
-pipeline.
-> Make sure that this tag corresponds to the version of the pipeline that
-you're using.
-
-```bash
-singularity pull --name nf-core-hic-1.0.img docker://nf-core/hic:1.0
-```
-
-Then transfer this file and run the pipeline with this path:
-
-```bash
-nextflow run /path/to/nf-core-hic -with-singularity /path/to/nf-core-hic-1.0.img
-```
diff --git a/docs/configuration/reference_genomes.md b/docs/configuration/reference_genomes.md
deleted file mode 100644
index d584c0c..0000000
--- a/docs/configuration/reference_genomes.md
+++ /dev/null
@@ -1,68 +0,0 @@
-# nf-core/hic: Reference Genomes Configuration
-
-The nf-core/hic pipeline needs a reference genome for alignment and annotation.
-
-These paths can be supplied on the command line at run time (see the
-[usage docs](../usage.md)),
-but for convenience it's often better to save these paths in a nextflow config
-file.
-See below for instructions on how to do this.
-Read [Adding your own system](adding_your_own.md) to find out how to set up
-custom config files.
-
-## Adding paths to a config file
-
-Specifying long paths every time you run the pipeline is a pain.
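For illustration only, a run that spells every reference path out by hand might look like the sketch below; the `--fasta` and `--bwt2_index` parameter names are assumptions used to make the point, so check `docs/usage.md` for the actual options.

```bash
# Hypothetical example: passing each reference file explicitly on every run
nextflow run nf-core/hic -profile docker \
    --fasta /shared/annotation/Homo_sapiens/GRCh38/genome.fa \
    --bwt2_index /shared/annotation/Homo_sapiens/GRCh38/Bowtie2Index/
```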
-To make this easier, the pipeline comes configured to understand reference
-genome keywords which correspond to preconfigured paths, meaning that you can
-just specify `--genome ID` when running the pipeline.
-
-Note that this genome key can also be specified in a config file if you always
-use the same genome.
-
-To use this system, add paths to your config file using the following template:
-
-```nextflow
-params {
-  genomes {
-    'YOUR-ID' {
-      fasta = '<PATH TO FASTA FILE>/genome.fa'
-    }
-    'OTHER-GENOME' {
-      // [..]
-    }
-  }
-  // Optional - default genome. Ignored if --genome 'OTHER-GENOME' specified
-  // on command line
-  genome = 'YOUR-ID'
-}
-```
-
-You can add as many genomes as you like as long as they have unique IDs.
-
-## Illumina iGenomes
-
-To make the use of reference genomes easier, Illumina has developed a
-centralised resource called
-[iGenomes](https://support.illumina.com/sequencing/sequencing_software/igenome.html).
-Multiple reference index types are held together with consistent structure for
-multiple genomes.
-
-We have put a copy of iGenomes up onto AWS S3 hosting and this pipeline is
-configured to use this by default.
-The hosting fees for AWS iGenomes are currently kindly funded by a grant from
-Amazon.
-The pipeline will automatically download the required reference files when you
-run the pipeline.
-For more information about the AWS iGenomes, see
-[AWS-iGenomes](https://ewels.github.io/AWS-iGenomes/).
-
-Downloading the files takes time and bandwidth, so we recommend making a local
-copy of the iGenomes resource.
-Once downloaded, you can customise the variable `params.igenomes_base` in your
-custom configuration file to point to the reference location.
-For example:
-
-```nextflow
-params.igenomes_base = '/path/to/data/igenomes/'
-```
diff --git a/docs/installation.md b/docs/installation.md
deleted file mode 100644
index c3dc018..0000000
--- a/docs/installation.md
+++ /dev/null
@@ -1,148 +0,0 @@
-# nf-core/hic: Installation
-
-To start using the nf-core/hic pipeline, follow the steps below:
-
-1. [Install Nextflow](#1-install-nextflow)
-2. [Install the pipeline](#2-install-the-pipeline)
-    * [Automatic](#21-automatic)
-    * [Offline](#22-offline)
-    * [Development](#23-development)
-3. [Pipeline configuration](#3-pipeline-configuration)
-    * [Software deps: Docker and Singularity](#31-software-deps-docker-and-singularity)
-    * [Software deps: Bioconda](#32-software-deps-bioconda)
-    * [Configuration profiles](#33-configuration-profiles)
-4. [Reference genomes](#4-reference-genomes)
-
-## 1) Install Nextflow
-
-Nextflow runs on most POSIX systems (Linux, macOS, etc.). It can be installed
-by running the following commands:
-
-```bash
-# Make sure that Java v8+ is installed:
-java -version
-
-# Install Nextflow
-curl -fsSL get.nextflow.io | bash
-
-# Add Nextflow binary to your PATH:
-mv nextflow ~/bin/
-# OR system-wide installation:
-# sudo mv nextflow /usr/local/bin
-```
-
-See [nextflow.io](https://www.nextflow.io/) for further instructions on how to
-install and configure Nextflow.
-
-## 2) Install the pipeline
-
-### 2.1) Automatic
-
-This pipeline itself needs no installation - Nextflow will automatically fetch
-it from GitHub if `nf-core/hic` is specified as the pipeline name.
-
-### 2.2) Offline
-
-The above method requires an internet connection so that Nextflow can download
-the pipeline files.
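If the machine you launch from does have occasional internet access, one option is to let Nextflow cache the pipeline ahead of time; a minimal sketch:

```bash
# Pre-fetch the pipeline while online; Nextflow stores it under ~/.nextflow/assets
nextflow pull nf-core/hic
```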
-If you're running on a system that has no internet connection, you'll need to
-download and transfer the pipeline files manually:
-
-```bash
-wget https://github.com/nf-core/hic/archive/master.zip
-mkdir -p ~/my-pipelines/nf-core/
-unzip master.zip -d ~/my-pipelines/nf-core/
-cd ~/my_data/
-nextflow run ~/my-pipelines/nf-core/hic-master
-```
-
-To stop nextflow from looking for updates online, you can tell it to run in
-offline mode by specifying the following environment variable in your
-~/.bashrc file:
-
-```bash
-export NXF_OFFLINE='TRUE'
-```
-
-### 2.3) Development
-
-If you would like to make changes to the pipeline, it's best to make a fork on
-GitHub and then clone the files. Once cloned, you can run the pipeline directly
-as above.
-
-## 3) Pipeline configuration
-
-By default, the pipeline loads a basic server configuration,
-[`conf/base.config`](../conf/base.config).
-This uses a number of sensible defaults for process requirements and is
-suitable for running on a simple (if powerful!) local server.
-
-Be warned of two important points about this default configuration:
-
-1. The default profile uses the `local` executor
-    * All jobs are run in the login session. If you're using a simple server,
-this may be fine. If you're using a compute cluster, this is bad as all jobs
-will run on the head node.
-    * See the
-[nextflow docs](https://www.nextflow.io/docs/latest/executor.html) for
-information about running with other hardware backends. Most job scheduler
-systems are natively supported.
-2. Nextflow will expect all software to be installed and available on the
-`PATH`
-    * You are expected to use an additional config profile for docker, singularity
-or conda support. See below.
-
-### 3.1) Software deps: Docker
-
-First, install docker on your system:
-[Docker Installation Instructions](https://docs.docker.com/engine/installation/)
-
-Then, running the pipeline with the option `-profile docker` tells Nextflow to
-enable Docker for this run. An image containing all of the software
-requirements will be automatically fetched and used from
-[dockerhub](https://hub.docker.com/r/nfcore/hic).
-
-### 3.1) Software deps: Singularity
-
-If you're not able to use Docker then
-[Singularity](http://singularity.lbl.gov/) is a great alternative.
-The process is very similar: running the pipeline with the option
-`-profile singularity` tells Nextflow to enable singularity for this run.
-An image containing all of the software requirements will be automatically
-fetched and used from Singularity Hub.
-
-If running offline with Singularity, you'll need to download and transfer the
-Singularity image first:
-
-```bash
-singularity pull --name nf-core-hic.simg shub://nf-core/hic
-```
-
-Once transferred, use `-with-singularity` and specify the path to the image
-file:
-
-```bash
-nextflow run /path/to/nf-core-hic -with-singularity nf-core-hic.simg
-```
-
-Remember to pull updated versions of the singularity image if you update the
-pipeline.
-
-### 3.2) Software deps: conda
-
-If you're not able to use Docker _or_ Singularity, you can instead use conda to
-manage the software requirements.
-This is slower and less reproducible than the above, but is still better than
-having to install all requirements yourself!
-The pipeline ships with a conda environment file and nextflow has built-in
-support for this.
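For illustration, a conda-based run (set up as described next) could look something like this sketch; the `--genome` value is just an example iGenomes key.

```bash
# Hypothetical example run using the conda profile
nextflow run nf-core/hic -profile conda --genome 'GRCh38'
```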
-To use it, first ensure that you have conda installed (we recommend
-[miniconda](https://conda.io/miniconda.html)), then follow the same pattern
-as above and use the flag `-profile conda`.
-
-### 3.3) Configuration profiles
-
-See [`docs/configuration/adding_your_own.md`](configuration/adding_your_own.md)
-
-## 4) Reference genomes
-
-See [`docs/configuration/reference_genomes.md`](configuration/reference_genomes.md)
diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md
deleted file mode 100644
index df43e8a..0000000
--- a/docs/troubleshooting.md
+++ /dev/null
@@ -1,43 +0,0 @@
-# nf-core/hic: Troubleshooting
-
-## Input files not found
-
-If no files, only one input file, or only read one and not read two are
-picked up, then something is wrong with your input file declaration
-
-1. The path must be enclosed in quotes (`'` or `"`)
-2. The path must have at least one `*` wildcard character. This is true even if
-you are only running one paired-end sample.
-3. When using the pipeline with paired-end data, the path must use `{1,2}` or
-`{R1,R2}` notation to specify read pairs.
-4. If you are running single-end data, make sure to specify `--singleEnd`
-
-If the pipeline can't find your files, then you will get the following error:
-
-```bash
-ERROR ~ Cannot find any reads matching: *{1,2}.fastq.gz
-```
-
-Note that if your sample name is "messy" then you have to be very particular
-with your glob specification. A file name like
-`L1-1-D-2h_S1_L002_R1_001.fastq.gz` can be difficult enough for a human to
-read. Specifying `*{1,2}*.gz` won't work, whilst `*{R1,R2}*.gz` will.
-
-## Data organization
-
-The pipeline can't take a list of multiple input files - it takes a glob
-expression. If your input files are scattered in different paths then we
-recommend that you generate a directory with symlinked files. If running
-in paired-end mode, please make sure that your files are sensibly named so
-that they can be properly paired. See the previous point.
-
-## Extra resources and getting help
-
-If you still have an issue with running the pipeline then feel free to
-contact us.
-Have a look at the [pipeline website](https://github.com/nf-core/hic) to
-find out how.
-
-If you have problems that are related to Nextflow and not our pipeline then
-check out the [Nextflow gitter channel](https://gitter.im/nextflow-io/nextflow)
-or the [Google group](https://groups.google.com/forum/#!forum/nextflow).
-- 
GitLab