Commit c386dd18 authored by nservant

[MODIF] init DSL2

parent c738afc4

Showing with 1183 additions and 1504 deletions
# nf-core/hic: Citations
## [nf-core](https://pubmed.ncbi.nlm.nih.gov/32055031/)
> Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020 Mar;38(3):276-278. doi: 10.1038/s41587-020-0439-x. PubMed PMID: 32055031.
## [Nextflow](https://pubmed.ncbi.nlm.nih.gov/28398311/)
> Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: 10.1038/nbt.3820. PubMed PMID: 28398311.
## Pipeline tools
* [FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)
* [MultiQC](https://www.ncbi.nlm.nih.gov/pubmed/27312411/)
> Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924.
## Software packaging/containerisation tools
* [Anaconda](https://anaconda.com)
> Anaconda Software Distribution. Computer software. Vers. 2-2.4.0. Anaconda, Nov. 2016. Web.
* [Bioconda](https://pubmed.ncbi.nlm.nih.gov/29967506/)
> Grüning B, Dale R, Sjödin A, Chapman BA, Rowe J, Tomkins-Tinch CH, Valieris R, Köster J; Bioconda Team. Bioconda: sustainable and comprehensive software distribution for the life sciences. Nat Methods. 2018 Jul;15(7):475-476. doi: 10.1038/s41592-018-0046-7. PubMed PMID: 29967506.
* [BioContainers](https://pubmed.ncbi.nlm.nih.gov/28379341/)
> da Veiga Leprevost F, Grüning B, Aflitos SA, Röst HL, Uszkoreit J, Barsnes H, Vaudel M, Moreno P, Gatto L, Weber J, Bai M, Jimenez RC, Sachsenberg T, Pfeuffer J, Alvarez RV, Griss J, Nesvizhskii AI, Perez-Riverol Y. BioContainers: an open-source and community-driven framework for software standardization. Bioinformatics. 2017 Aug 15;33(16):2580-2582. doi: 10.1093/bioinformatics/btx192. PubMed PMID: 28379341; PubMed Central PMCID: PMC5870671.
* [Docker](https://dl.acm.org/doi/10.5555/2600239.2600241)
* [Singularity](https://pubmed.ncbi.nlm.nih.gov/28494014/)
> Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific containers for mobility of compute. PLoS One. 2017 May 11;12(5):e0177459. doi: 10.1371/journal.pone.0177459. eCollection 2017. PubMed PMID: 28494014; PubMed Central PMCID: PMC5426675.
/*
========================================================================================
Config file for defining DSL2 per module options
========================================================================================
Available keys to override module options:
args = Additional arguments appended to command in module.
args2 = Second set of arguments appended to command in module (multi-tool modules).
args3 = Third set of arguments appended to command in module (multi-tool modules).
publish_dir = Directory to publish results.
publish_by_meta = Groovy list of keys available in meta map to append as directories to "publish_dir" path
If publish_by_meta = true - Value of ${meta['id']} is appended as a directory to "publish_dir" path
If publish_by_meta = ['id', 'custompath'] - If "id" is in meta map and "custompath" isn't then "${meta['id']}/custompath/"
is appended as a directory to "publish_dir" path
If publish_by_meta = false / null - No directories are appended to "publish_dir" path
publish_files = Groovy map where key = "file_ext" and value = "directory" to publish results for that file extension
The value of "directory" is appended to the standard "publish_dir" path as defined above.
If publish_files = null (unspecified) - All files are published.
If publish_files = false - No files are published.
suffix = File name suffix for output files.
----------------------------------------------------------------------------------------
*/
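// A hypothetical module entry combining the keys above (module name and values
// below are illustrative only, not part of this pipeline's configuration):
//
//     'bowtie2_align' {
//         args            = '--very-sensitive'
//         publish_dir     = 'mapping'
//         publish_by_meta = ['id']
//         publish_files   = ['bam':'', 'log':'logs']
//         suffix          = '.filtered'
//     }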
params {
modules {
'fastqc' {
args = "--quiet"
}
'multiqc' {
args = ""
}
}
}
//
// This file holds several functions used within the nf-core pipeline template.
//
import org.yaml.snakeyaml.Yaml
class NfcoreTemplate {
//
// Check AWS Batch related parameters have been specified correctly
//
public static void awsBatch(workflow, params) {
if (workflow.profile.contains('awsbatch')) {
// Check params.awsqueue and params.awsregion have been set if running on AWSBatch
assert (params.awsqueue && params.awsregion) : "Specify correct --awsqueue and --awsregion parameters on AWSBatch!"
// Check outdir paths to be S3 buckets if running on AWSBatch
assert params.outdir.startsWith('s3:') : "Outdir not on S3 - specify S3 Bucket to run on AWSBatch!"
}
}
//
// Check params.hostnames
//
public static void hostName(workflow, params, log) {
Map colors = logColours(params.monochrome_logs)
if (params.hostnames) {
try {
def hostname = "hostname".execute().text.trim()
params.hostnames.each { prof, hnames ->
hnames.each { hname ->
if (hostname.contains(hname) && !workflow.profile.contains(prof)) {
log.info "=${colors.yellow}====================================================${colors.reset}=\n" +
"${colors.yellow}WARN: You are running with `-profile $workflow.profile`\n" +
" but your machine hostname is ${colors.white}'$hostname'${colors.reset}.\n" +
" ${colors.yellow_bold}Please use `-profile $prof${colors.reset}`\n" +
"=${colors.yellow}====================================================${colors.reset}="
}
}
}
} catch (Exception e) {
log.warn "[$workflow.manifest.name] Could not determine 'hostname' - skipping check. Reason: ${e.message}."
}
}
}
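    //
    // Expected shape of params.hostnames, as consumed by hostName() above: a map of
    // profile name -> list of hostname substrings (values are illustrative only), e.g.
    //
    //     params.hostnames = [ 'mycluster': ['.hpc.example.org'] ]
    //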
//
// Construct and send completion email
//
public static void email(workflow, params, summary_params, projectDir, log, multiqc_report=[]) {
// Set up the e-mail variables
def subject = "[$workflow.manifest.name] Successful: $workflow.runName"
if (!workflow.success) {
subject = "[$workflow.manifest.name] FAILED: $workflow.runName"
}
def summary = [:]
for (group in summary_params.keySet()) {
summary << summary_params[group]
}
def misc_fields = [:]
misc_fields['Date Started'] = workflow.start
misc_fields['Date Completed'] = workflow.complete
misc_fields['Pipeline script file path'] = workflow.scriptFile
misc_fields['Pipeline script hash ID'] = workflow.scriptId
if (workflow.repository) misc_fields['Pipeline repository Git URL'] = workflow.repository
if (workflow.commitId) misc_fields['Pipeline repository Git Commit'] = workflow.commitId
if (workflow.revision) misc_fields['Pipeline Git branch/tag'] = workflow.revision
misc_fields['Nextflow Version'] = workflow.nextflow.version
misc_fields['Nextflow Build'] = workflow.nextflow.build
misc_fields['Nextflow Compile Timestamp'] = workflow.nextflow.timestamp
def email_fields = [:]
email_fields['version'] = workflow.manifest.version
email_fields['runName'] = workflow.runName
email_fields['success'] = workflow.success
email_fields['dateComplete'] = workflow.complete
email_fields['duration'] = workflow.duration
email_fields['exitStatus'] = workflow.exitStatus
email_fields['errorMessage'] = (workflow.errorMessage ?: 'None')
email_fields['errorReport'] = (workflow.errorReport ?: 'None')
email_fields['commandLine'] = workflow.commandLine
email_fields['projectDir'] = workflow.projectDir
email_fields['summary'] = summary << misc_fields
// On success, try to attach the MultiQC report
def mqc_report = null
try {
if (workflow.success) {
mqc_report = multiqc_report.getVal()
if (mqc_report.getClass() == ArrayList && mqc_report.size() >= 1) {
if (mqc_report.size() > 1) {
log.warn "[$workflow.manifest.name] Found multiple reports from process 'MULTIQC', will use only one"
}
mqc_report = mqc_report[0]
}
}
} catch (all) {
if (multiqc_report) {
log.warn "[$workflow.manifest.name] Could not attach MultiQC report to summary email"
}
}
// Check if we are only sending emails on failure
def email_address = params.email
if (!params.email && params.email_on_fail && !workflow.success) {
email_address = params.email_on_fail
}
// Render the TXT template
def engine = new groovy.text.GStringTemplateEngine()
def tf = new File("$projectDir/assets/email_template.txt")
def txt_template = engine.createTemplate(tf).make(email_fields)
def email_txt = txt_template.toString()
// Render the HTML template
def hf = new File("$projectDir/assets/email_template.html")
def html_template = engine.createTemplate(hf).make(email_fields)
def email_html = html_template.toString()
// Render the sendmail template
def max_multiqc_email_size = params.max_multiqc_email_size as nextflow.util.MemoryUnit
def smail_fields = [ email: email_address, subject: subject, email_txt: email_txt, email_html: email_html, projectDir: "$projectDir", mqcFile: mqc_report, mqcMaxSize: max_multiqc_email_size.toBytes() ]
def sf = new File("$projectDir/assets/sendmail_template.txt")
def sendmail_template = engine.createTemplate(sf).make(smail_fields)
def sendmail_html = sendmail_template.toString()
// Send the HTML e-mail
Map colors = logColours(params.monochrome_logs)
if (email_address) {
try {
if (params.plaintext_email) { throw new Exception('Send plaintext e-mail, not HTML') }
// Try to send HTML e-mail using sendmail
[ 'sendmail', '-t' ].execute() << sendmail_html
log.info "-${colors.purple}[$workflow.manifest.name]${colors.green} Sent summary e-mail to $email_address (sendmail)-"
} catch (all) {
// Catch failures and try with plaintext
def mail_cmd = [ 'mail', '-s', subject, '--content-type=text/html', email_address ]
if ( mqc_report != null && mqc_report.size() <= max_multiqc_email_size.toBytes() ) {
mail_cmd += [ '-A', mqc_report ]
}
mail_cmd.execute() << email_html
log.info "-${colors.purple}[$workflow.manifest.name]${colors.green} Sent summary e-mail to $email_address (mail)-"
}
}
// Write summary e-mail HTML to a file
def output_d = new File("${params.outdir}/pipeline_info/")
if (!output_d.exists()) {
output_d.mkdirs()
}
def output_hf = new File(output_d, "pipeline_report.html")
output_hf.withWriter { w -> w << email_html }
def output_tf = new File(output_d, "pipeline_report.txt")
output_tf.withWriter { w -> w << email_txt }
}
//
// Print pipeline summary on completion
//
public static void summary(workflow, params, log) {
Map colors = logColours(params.monochrome_logs)
if (workflow.success) {
if (workflow.stats.ignoredCount == 0) {
log.info "-${colors.purple}[$workflow.manifest.name]${colors.green} Pipeline completed successfully${colors.reset}-"
} else {
log.info "-${colors.purple}[$workflow.manifest.name]${colors.red} Pipeline completed successfully, but with errored process(es) ${colors.reset}-"
}
} else {
hostName(workflow, params, log)
log.info "-${colors.purple}[$workflow.manifest.name]${colors.red} Pipeline completed with errors${colors.reset}-"
}
}
//
// ANSI colours used for terminal logging
//
public static Map logColours(Boolean monochrome_logs) {
Map colorcodes = [:]
// Reset / Meta
colorcodes['reset'] = monochrome_logs ? '' : "\033[0m"
colorcodes['bold'] = monochrome_logs ? '' : "\033[1m"
colorcodes['dim'] = monochrome_logs ? '' : "\033[2m"
colorcodes['underlined'] = monochrome_logs ? '' : "\033[4m"
colorcodes['blink'] = monochrome_logs ? '' : "\033[5m"
colorcodes['reverse'] = monochrome_logs ? '' : "\033[7m"
colorcodes['hidden'] = monochrome_logs ? '' : "\033[8m"
// Regular Colors
colorcodes['black'] = monochrome_logs ? '' : "\033[0;30m"
colorcodes['red'] = monochrome_logs ? '' : "\033[0;31m"
colorcodes['green'] = monochrome_logs ? '' : "\033[0;32m"
colorcodes['yellow'] = monochrome_logs ? '' : "\033[0;33m"
colorcodes['blue'] = monochrome_logs ? '' : "\033[0;34m"
colorcodes['purple'] = monochrome_logs ? '' : "\033[0;35m"
colorcodes['cyan'] = monochrome_logs ? '' : "\033[0;36m"
colorcodes['white'] = monochrome_logs ? '' : "\033[0;37m"
// Bold
colorcodes['bblack'] = monochrome_logs ? '' : "\033[1;30m"
colorcodes['bred'] = monochrome_logs ? '' : "\033[1;31m"
colorcodes['bgreen'] = monochrome_logs ? '' : "\033[1;32m"
colorcodes['byellow'] = monochrome_logs ? '' : "\033[1;33m"
colorcodes['bblue'] = monochrome_logs ? '' : "\033[1;34m"
colorcodes['bpurple'] = monochrome_logs ? '' : "\033[1;35m"
colorcodes['bcyan'] = monochrome_logs ? '' : "\033[1;36m"
colorcodes['bwhite'] = monochrome_logs ? '' : "\033[1;37m"
// Underline
colorcodes['ublack'] = monochrome_logs ? '' : "\033[4;30m"
colorcodes['ured'] = monochrome_logs ? '' : "\033[4;31m"
colorcodes['ugreen'] = monochrome_logs ? '' : "\033[4;32m"
colorcodes['uyellow'] = monochrome_logs ? '' : "\033[4;33m"
colorcodes['ublue'] = monochrome_logs ? '' : "\033[4;34m"
colorcodes['upurple'] = monochrome_logs ? '' : "\033[4;35m"
colorcodes['ucyan'] = monochrome_logs ? '' : "\033[4;36m"
colorcodes['uwhite'] = monochrome_logs ? '' : "\033[4;37m"
// High Intensity
colorcodes['iblack'] = monochrome_logs ? '' : "\033[0;90m"
colorcodes['ired'] = monochrome_logs ? '' : "\033[0;91m"
colorcodes['igreen'] = monochrome_logs ? '' : "\033[0;92m"
colorcodes['iyellow'] = monochrome_logs ? '' : "\033[0;93m"
colorcodes['iblue'] = monochrome_logs ? '' : "\033[0;94m"
colorcodes['ipurple'] = monochrome_logs ? '' : "\033[0;95m"
colorcodes['icyan'] = monochrome_logs ? '' : "\033[0;96m"
colorcodes['iwhite'] = monochrome_logs ? '' : "\033[0;97m"
// Bold High Intensity
colorcodes['biblack'] = monochrome_logs ? '' : "\033[1;90m"
colorcodes['bired'] = monochrome_logs ? '' : "\033[1;91m"
colorcodes['bigreen'] = monochrome_logs ? '' : "\033[1;92m"
colorcodes['biyellow'] = monochrome_logs ? '' : "\033[1;93m"
colorcodes['biblue'] = monochrome_logs ? '' : "\033[1;94m"
colorcodes['bipurple'] = monochrome_logs ? '' : "\033[1;95m"
colorcodes['bicyan'] = monochrome_logs ? '' : "\033[1;96m"
colorcodes['biwhite'] = monochrome_logs ? '' : "\033[1;97m"
return colorcodes
}
//
// Does what it says on the tin
//
public static String dashedLine(monochrome_logs) {
Map colors = logColours(monochrome_logs)
return "-${colors.dim}----------------------------------------------------${colors.reset}-"
}
//
// nf-core logo
//
public static String logo(workflow, monochrome_logs) {
Map colors = logColours(monochrome_logs)
String.format(
"""\n
${dashedLine(monochrome_logs)}
${colors.green},--.${colors.black}/${colors.green},-.${colors.reset}
${colors.blue} ___ __ __ __ ___ ${colors.green}/,-._.--~\'${colors.reset}
${colors.blue} |\\ | |__ __ / ` / \\ |__) |__ ${colors.yellow}} {${colors.reset}
${colors.blue} | \\| | \\__, \\__/ | \\ |___ ${colors.green}\\`-._,-`-,${colors.reset}
${colors.green}`._,._,\'${colors.reset}
${colors.purple} ${workflow.manifest.name} v${workflow.manifest.version}${colors.reset}
${dashedLine(monochrome_logs)}
""".stripIndent()
)
}
}
//
// This file holds several Groovy functions that could be useful for any Nextflow pipeline
//
import org.yaml.snakeyaml.Yaml
class Utils {
//
// When running with -profile conda, warn if channels have not been set up appropriately
//
public static void checkCondaChannels(log) {
Yaml parser = new Yaml()
def channels = []
try {
def config = parser.load("conda config --show channels".execute().text)
channels = config.channels
} catch(NullPointerException | IOException e) {
log.warn "Could not verify conda channel configuration."
return
}
// Check that all channels are present
def required_channels = ['conda-forge', 'bioconda', 'defaults']
def conda_check_failed = !required_channels.every { ch -> ch in channels }
// Check that they are in the right order
conda_check_failed |= !(channels.indexOf('conda-forge') < channels.indexOf('bioconda'))
conda_check_failed |= !(channels.indexOf('bioconda') < channels.indexOf('defaults'))
if (conda_check_failed) {
log.warn "===================================================================================\n" +
"  There is a problem with your Conda configuration!\n\n" +
"  You will need to set up the conda-forge and bioconda channels correctly.\n" +
"  Please refer to https://bioconda.github.io/user/install.html#set-up-channels\n" +
"  NB: The order of the channels matters!\n" +
"==================================================================================="
}
}
//
// Join module args with appropriate spacing
//
public static String joinModuleArgs(args_list) {
return ' ' + args_list.join(' ')
}
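    //
    // For example, joinModuleArgs(['--quiet', '-p 4']) returns ' --quiet -p 4',
    // ready to be appended directly after a command name.
    //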
}
//
// This file holds several functions specific to the workflow/hic.nf in the nf-core/hic pipeline
//
class WorkflowHic {
//
// Check and validate parameters
//
public static void initialise(params, log) {
genomeExistsError(params, log)
if (!params.fasta) {
log.error "Genome fasta file not specified with e.g. '--fasta genome.fa' or via a detectable config file."
System.exit(1)
}
}
//
// Get workflow summary for MultiQC
//
public static String paramsSummaryMultiqc(workflow, summary) {
String summary_section = ''
for (group in summary.keySet()) {
def group_params = summary.get(group) // This gets the parameters of that particular group
if (group_params) {
summary_section += " <p style=\"font-size:110%\"><b>$group</b></p>\n"
summary_section += " <dl class=\"dl-horizontal\">\n"
for (param in group_params.keySet()) {
summary_section += " <dt>$param</dt><dd><samp>${group_params.get(param) ?: '<span style=\"color:#999999;\">N/A</span>'}</samp></dd>\n"
}
summary_section += " </dl>\n"
}
}
String yaml_file_text = "id: '${workflow.manifest.name.replace('/','-')}-summary'\n"
yaml_file_text += "description: ' - this information is collected when the pipeline is started.'\n"
yaml_file_text += "section_name: '${workflow.manifest.name} Workflow Summary'\n"
yaml_file_text += "section_href: 'https://github.com/${workflow.manifest.name}'\n"
yaml_file_text += "plot_type: 'html'\n"
yaml_file_text += "data: |\n"
yaml_file_text += "${summary_section}"
return yaml_file_text
}
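    //
    // The returned string is a MultiQC "custom content" YAML block, rendered
    // roughly as follows (abridged; group names depend on the parameter summary):
    //
    //     id: 'nf-core-hic-summary'
    //     description: ' - this information is collected when the pipeline is started.'
    //     section_name: 'nf-core/hic Workflow Summary'
    //     section_href: 'https://github.com/nf-core/hic'
    //     plot_type: 'html'
    //     data: |
    //         <p style="font-size:110%"><b>(group name)</b></p>
    //         <dl class="dl-horizontal">...</dl>
    //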
//
// Exit pipeline if incorrect --genome key provided
//
private static void genomeExistsError(params, log) {
if (params.genomes && params.genome && !params.genomes.containsKey(params.genome)) {
log.error "===================================================================================\n" +
"  Genome '${params.genome}' not found in any config files provided to the pipeline.\n" +
"  Currently, the available genome keys are:\n" +
"  ${params.genomes.keySet().join(", ")}\n" +
"==================================================================================="
System.exit(1)
}
}
}
//
// This file holds several functions specific to the main.nf workflow in the nf-core/hic pipeline
//
class WorkflowMain {
//
// Citation string for pipeline
//
public static String citation(workflow) {
return "If you use ${workflow.manifest.name} for your analysis please cite:\n\n" +
// TODO nf-core: Add Zenodo DOI for pipeline after first release
//"* The pipeline\n" +
//" https://doi.org/10.5281/zenodo.XXXXXXX\n\n" +
"* The nf-core framework\n" +
" https://doi.org/10.1038/s41587-020-0439-x\n\n" +
"* Software dependencies\n" +
" https://github.com/${workflow.manifest.name}/blob/master/CITATIONS.md"
}
//
// Print help to screen if required
//
public static String help(workflow, params, log) {
def command = "nextflow run ${workflow.manifest.name} --input samplesheet.csv --genome GRCh37 -profile docker"
def help_string = ''
help_string += NfcoreTemplate.logo(workflow, params.monochrome_logs)
help_string += NfcoreSchema.paramsHelp(workflow, params, command)
help_string += '\n' + citation(workflow) + '\n'
help_string += NfcoreTemplate.dashedLine(params.monochrome_logs)
return help_string
}
//
// Print parameter summary log to screen
//
public static String paramsSummaryLog(workflow, params, log) {
def summary_log = ''
summary_log += NfcoreTemplate.logo(workflow, params.monochrome_logs)
summary_log += NfcoreSchema.paramsSummaryLog(workflow, params)
summary_log += '\n' + citation(workflow) + '\n'
summary_log += NfcoreTemplate.dashedLine(params.monochrome_logs)
return summary_log
}
//
// Validate parameters and print summary to screen
//
public static void initialise(workflow, params, log) {
// Print help to screen if required
if (params.help) {
log.info help(workflow, params, log)
System.exit(0)
}
// Validate workflow parameters via the JSON schema
if (params.validate_params) {
NfcoreSchema.validateParameters(workflow, params, log)
}
// Print parameter summary log to screen
log.info paramsSummaryLog(workflow, params, log)
// Check that conda channels are set up correctly
if (params.enable_conda) {
Utils.checkCondaChannels(log)
}
// Check AWS batch settings
NfcoreTemplate.awsBatch(workflow, params)
// Check the hostnames against configured profiles
NfcoreTemplate.hostName(workflow, params, log)
// Check input has been provided
if (!params.input) {
log.error "Please provide an input samplesheet to the pipeline e.g. '--input samplesheet.csv'"
System.exit(1)
}
}
//
// Get attribute from genome config file e.g. fasta
//
public static String getGenomeAttribute(params, attribute) {
def val = ''
if (params.genomes && params.genome && params.genomes.containsKey(params.genome)) {
if (params.genomes[ params.genome ].containsKey(attribute)) {
val = params.genomes[ params.genome ][ attribute ]
}
}
return val
}
}
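//
// Typical usage from the pipeline entry point (as in the nf-core template's main.nf):
//
//     WorkflowMain.initialise(workflow, params, log)
//     params.fasta = WorkflowMain.getGenomeAttribute(params, 'fasta')
//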
{
"name": "nf-core/hic",
"homePage": "https://github.com/nf-core/hic",
"repos": {
"nf-core/modules": {
"fastqc": {
"git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d"
},
"multiqc": {
"git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d"
}
}
}
}
// Import generic module functions
include { initOptions; saveFiles; getSoftwareName } from './functions'
params.options = [:]
options = initOptions(params.options)
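//
// HiC-Pro two-step mapping, step 1: end-to-end alignment of the full-length
// reads. Reads left unmapped (--un) are rescued by the ligation-site trimming
// step, except in DNase Hi-C mode (--dnase) where no ligation site is expected.
//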
process bowtie2_end_to_end {
tag "$sample"
label 'process_medium'
publishDir path: { params.save_aligned_intermediates ? "${params.outdir}/mapping/bwt2_end2end" : params.outdir },
saveAs: { filename -> if (params.save_aligned_intermediates) filename }, mode: params.publish_dir_mode
input:
tuple val(sample), path(reads)
path index
output:
tuple val(sample), path("${prefix}_unmap.fastq"), emit: unmapped_end_to_end
tuple val(sample), path("${prefix}.bam"), emit: end_to_end_bam
script:
prefix = reads.toString() - ~/(\.fq)?(\.fastq)?(\.gz)?$/
def bwt2_opts = params.bwt2_opts_end2end
if (!params.dnase){
"""
INDEX=`find -L ./ -name "*.rev.1.bt2" | sed 's/.rev.1.bt2//'`
bowtie2 --rg-id BMG --rg SM:${prefix} \\
${bwt2_opts} \\
-p ${task.cpus} \\
-x \${INDEX} \\
--un ${prefix}_unmap.fastq \\
-U ${reads} | samtools view -F 4 -bS - > ${prefix}.bam
"""
}else{
"""
INDEX=`find -L ./ -name "*.rev.1.bt2" | sed 's/.rev.1.bt2//'`
bowtie2 --rg-id BMG --rg SM:${prefix} \\
${bwt2_opts} \\
-p ${task.cpus} \\
-x \${INDEX} \\
--un ${prefix}_unmap.fastq \\
-U ${reads} > ${prefix}.bam
"""
}
}
// Import generic module functions
include { initOptions; saveFiles; getSoftwareName } from './functions'
params.options = [:]
options = initOptions(params.options)
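//
// Merge the end-to-end and trimmed alignments of each mate into a single
// name-sorted BAM, and collect per-mate mapping statistics (.mapstat).
//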
process bowtie2_merge_mapping_steps {
tag "$prefix = $bam1 + $bam2"
label 'process_medium'
publishDir "${params.outdir}/hicpro/mapping", mode: params.publish_dir_mode,
saveAs: { filename -> if (params.save_aligned_intermediates && filename.endsWith("stat")) "stats/$filename"
else if (params.save_aligned_intermediates) filename}
input:
tuple val(prefix), path(bam1), path(bam2)
output:
tuple val(sample), path("${prefix}_bwt2merged.bam"), emit:bwt2_merged_bam
tuple val(oname), path("${prefix}.mapstat"), emit:all_mapstat
script:
sample = prefix.toString() - ~/(_R1|_R2)/
tag = prefix.toString() =~/_R1/ ? "R1" : "R2"
oname = prefix.toString() - ~/(\.[0-9]+)$/
"""
samtools merge -@ ${task.cpus} \\
-f ${prefix}_bwt2merged.bam \\
${bam1} ${bam2}
samtools sort -@ ${task.cpus} -m 800M \\
-n \\
-o ${prefix}_bwt2merged.sorted.bam \\
${prefix}_bwt2merged.bam
mv ${prefix}_bwt2merged.sorted.bam ${prefix}_bwt2merged.bam
echo "## ${prefix}" > ${prefix}.mapstat
echo -n "total_${tag}\t" >> ${prefix}.mapstat
samtools view -c ${prefix}_bwt2merged.bam >> ${prefix}.mapstat
echo -n "mapped_${tag}\t" >> ${prefix}.mapstat
samtools view -c -F 4 ${prefix}_bwt2merged.bam >> ${prefix}.mapstat
echo -n "global_${tag}\t" >> ${prefix}.mapstat
samtools view -c -F 4 ${bam1} >> ${prefix}.mapstat
echo -n "local_${tag}\t" >> ${prefix}.mapstat
samtools view -c -F 4 ${bam2} >> ${prefix}.mapstat
"""
}
// Import generic module functions
include { initOptions; saveFiles; getSoftwareName } from './functions'
params.options = [:]
options = initOptions(params.options)
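//
// HiC-Pro two-step mapping, step 2: re-align the reads that were trimmed at the
// ligation site after failing end-to-end mapping. Skipped in DNase Hi-C mode,
// where there is no ligation site to trim at.
//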
process bowtie2_on_trimmed_reads {
tag "$sample"
label 'process_medium'
publishDir path: { params.save_aligned_intermediates ? "${params.outdir}/mapping/bwt2_trimmed" : params.outdir },
saveAs: { filename -> if (params.save_aligned_intermediates) filename }, mode: params.publish_dir_mode
when:
!params.dnase
input:
tuple val(sample), path(reads)
path index
output:
tuple val(sample), path("${prefix}_trimmed.bam"), emit:trimmed_bam
script:
prefix = reads.toString() - ~/(_trimmed)?(\.fq)?(\.fastq)?(\.gz)?$/
"""
INDEX=`find -L ./ -name "*.rev.1.bt2" | sed 's/.rev.1.bt2//'`
bowtie2 --rg-id BMG --rg SM:${prefix} \\
${params.bwt2_opts_trimmed} \\
-p ${task.cpus} \\
-x \${INDEX} \\
-U ${reads} | samtools view -bS - > ${prefix}_trimmed.bam
"""
}
// Import generic module functions
include { initOptions; saveFiles; getSoftwareName } from './functions'
params.options = [:]
options = initOptions(params.options)
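//
// Build raw contact matrices at each requested resolution from the valid
// pairs, using HiC-Pro's build_matrix utility.
//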
process build_contact_maps {
tag "$sample - $mres"
label 'process_highmem'
publishDir "${params.outdir}/hicpro/matrix/raw", mode: params.publish_dir_mode
when:
!params.skip_maps && params.hicpro_maps
input:
tuple val(sample), path(vpairs), val(mres)
path chrsize
output:
tuple val(sample), val(mres), path("*.matrix"), path("*.bed"), emit: raw_maps_4cool
script:
"""
build_matrix --matrix-format upper --binsize ${mres} --chrsizes ${chrsize} --ipath ${vpairs} --oprefix ${sample}_${mres}
"""
}
// Import generic module functions
include { initOptions; saveFiles; getSoftwareName } from './functions'
params.options = [:]
options = initOptions(params.options)
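//
// Pair the R1 and R2 alignments into a single BAM with HiC-Pro's mergeSAM.py,
// keeping multi-hits or filtering on MAPQ according to the pipeline options,
// and emit pairing statistics (.pairstat).
//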
process combine_mates {
tag "$sample = $r1_prefix + $r2_prefix"
label 'process_low'
publishDir "${params.outdir}/hicpro/mapping", mode: params.publish_dir_mode,
saveAs: {filename -> filename.endsWith(".pairstat") ? "stats/$filename" : "$filename"}
input:
tuple val(sample), path(aligned_bam)
output:
tuple val(oname), path("${sample}_bwt2pairs.bam"), emit:paired_bam
tuple val(oname), path("*.pairstat"), emit:all_pairstat
script:
r1_bam = aligned_bam[0]
r1_prefix = r1_bam.toString() - ~/_bwt2merged.bam$/
r2_bam = aligned_bam[1]
r2_prefix = r2_bam.toString() - ~/_bwt2merged.bam$/
oname = sample.toString() - ~/(\.[0-9]+)$/
def opts = "-t"
if (params.keep_multi) {
opts="${opts} --multi"
}else if (params.min_mapq){
opts="${opts} -q ${params.min_mapq}"
}
"""
mergeSAM.py -f ${r1_bam} -r ${r2_bam} -o ${sample}_bwt2pairs.bam ${opts}
"""
}
// Import generic module functions
include { initOptions; saveFiles; getSoftwareName } from './functions'
params.options = [:]
options = initOptions(params.options)
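//
// Call A/B compartments with cooltools (call-compartments) and export the
// cis eigenvector (E1) as a sorted bedGraph track.
//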
process compartment_calling {
tag "$sample - $res"
label 'process_medium'
publishDir "${params.outdir}/compartments", mode: 'copy'
when:
!params.skip_compartments
input:
tuple val(sample), val(res), path(cool), val(r)
path(fasta)
path(chrsize)
output:
path("*compartments*") optional true, emit:out_compartments
script:
"""
cooltools genome binnify --all-names ${chrsize} ${res} > genome_bins.txt
cooltools genome gc genome_bins.txt ${fasta} > genome_gc.txt
cooltools call-compartments --contact-type cis -o ${sample}_compartments ${cool}
awk -F"\t" 'NR>1{OFS="\t"; if(\$6==""){\$6=0}; print \$1,\$2,\$3,\$6}' ${sample}_compartments.cis.vecs.tsv | sort -k1,1 -k2,2n > ${sample}_compartments.cis.E1.bedgraph
"""
}
// Import generic module functions
include { initOptions; saveFiles; getSoftwareName } from './functions'
params.options = [:]
options = initOptions(params.options)
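//
// Reshape HiC-Pro validPairs into a simple contact list
// (readID/chr1/pos1/chr2/pos2/strand1/strand2) consumed by the cooler
// processes below.
//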
process convert_to_pairs {
tag "$sample"
label 'process_medium'
when:
!params.skip_maps
input:
tuple val(sample), path(vpairs)
path chrsize
output:
tuple val(sample), path("*.txt.gz"), emit: cool_build_zoom
script:
"""
## Reorder validPairs fields to: readID/chr1/pos1/chr2/pos2/strand1/strand2
awk '{OFS="\t";print \$1,\$2,\$3,\$5,\$6,\$4,\$7}' $vpairs > contacts.txt
gzip contacts.txt
"""
}
// Import generic module functions
include { initOptions; saveFiles; getSoftwareName } from './functions'
params.options = [:]
options = initOptions(params.options)
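//
// Normalise the raw contact map with cooler's matrix balancing and dump the
// balanced matrix as text with 1-based bin indices.
//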
process cooler_balance {
tag "$sample - ${res}"
label 'process_medium'
publishDir "${params.outdir}/contact_maps/", mode: 'copy',
saveAs: {filename -> filename.endsWith(".cool") ? "norm/cool/$filename" : "norm/txt/$filename"}
when:
!params.skip_balancing
input:
tuple val(sample), val(res), path(cool)
path chrsize
output:
tuple val(sample), val(res), path("${sample}_${res}_norm.cool"), emit:balanced_cool_maps
path("${sample}_${res}_norm.txt"), emit:norm_txt_maps
script:
"""
cp ${cool} ${sample}_${res}_norm.cool
cooler balance ${sample}_${res}_norm.cool -p ${task.cpus} --force
cooler dump ${sample}_${res}_norm.cool --balanced --na-rep 0 | awk '{OFS="\t"; print \$1+1,\$2+1,\$4}' > ${sample}_${res}_norm.txt
"""
}
// Import generic module functions
include { initOptions; saveFiles; getSoftwareName } from './functions'
params.options = [:]
options = initOptions(params.options)
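//
// Bin the contact list into a raw .cool file at each resolution and dump it
// as a sparse text matrix, together with the matching bins BED file.
//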
process cooler_raw {
tag "$sample - ${res}"
label 'process_medium'
publishDir "${params.outdir}/contact_maps/", mode: 'copy',
saveAs: {filename -> filename.endsWith(".cool") ? "raw/cool/$filename" : "raw/txt/$filename"}
input:
tuple val(sample), path(contacts), val(res)
path chrsize
output:
tuple val(sample), val(res), path("*cool"), emit:raw_cool_maps
tuple path("*.bed"), path("${sample}_${res}.txt"), emit:raw_txt_maps
script:
"""
cooler makebins ${chrsize} ${res} > ${sample}_${res}.bed
cooler cload pairs -c1 2 -p1 3 -c2 4 -p2 5 ${sample}_${res}.bed ${contacts} ${sample}_${res}.cool
cooler dump ${sample}_${res}.cool | awk '{OFS="\t"; print \$1+1,\$2+1,\$3}' > ${sample}_${res}.txt
"""
}
// Import generic module functions
include { initOptions; saveFiles; getSoftwareName } from './functions'
params.options = [:]
options = initOptions(params.options)
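//
// Build a balanced, multi-resolution .mcool file from the base-resolution
// contact map with cooler zoomify.
//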
process cooler_zoomify {
tag "$sample"
label 'process_medium'
publishDir "${params.outdir}/contact_maps/norm/mcool", mode: 'copy'
when:
!params.skip_mcool
input:
tuple val(sample), path(contacts)
path chrsize
output:
path("*mcool"), emit:mcool_maps
script:
"""
cooler makebins ${chrsize} ${params.res_zoomify} > bins.bed
cooler cload pairs -c1 2 -p1 3 -c2 4 -p2 5 bins.bed ${contacts} ${sample}.cool
cooler zoomify --nproc ${task.cpus} --balance ${sample}.cool
"""
}
// Import generic module functions
include { initOptions; saveFiles; getSoftwareName } from './functions'
params.options = [:]
options = initOptions(params.options)
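//
// Plot contact counts as a function of genomic distance (distance decay)
// with HiCExplorer's hicPlotDistVsCounts.
//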
process dist_decay {
tag "$sample"
label 'process_medium'
publishDir "${params.outdir}/dist_decay", mode: 'copy'
when:
!params.skip_dist_decay
input:
tuple val(sample), val(res), path(maps), val(r)
output:
path("*_distcount.txt")
path("*.png")
script:
"""
hicPlotDistVsCounts --matrices ${maps} \
--plotFile ${maps.baseName}_distcount.png \
--outFileData ${maps.baseName}_distcount.txt
"""
}