Skip to content

Commit

Permalink
Merge branch 'dev' into dsl2-add-sharding-of-fastqs-before-alignment
Browse files Browse the repository at this point in the history
  • Loading branch information
TCLamnidis authored Oct 13, 2023
2 parents a2310a5 + bc9c7b5 commit 5bfb860
Show file tree
Hide file tree
Showing 19 changed files with 401 additions and 40 deletions.
1 change: 1 addition & 0 deletions .devcontainer/devcontainer.json
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
"name": "nfcore",
"image": "nfcore/gitpod:latest",
"remoteUser": "gitpod",
"runArgs": ["--privileged"],

// Configure tool-specific properties.
"customizations": {
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -32,7 +32,7 @@ jobs:
- "-profile test,docker --mapping_tool bwamem --run_mapdamage_rescaling --run_pmd_filtering --run_trim_bam"
- "-profile test,docker --mapping_tool bowtie2"
- "-profile test,docker --skip_preprocessing"
- "-profile test_humanbam,docker --run_mtnucratio --run_contamination_estimation_angsd"
- "-profile test_humanbam,docker --run_mtnucratio --run_contamination_estimation_angsd --snpcapture_bed 'https://raw.githubusercontent.com/nf-core/test-datasets/eager/reference/Human/1240K.pos.list_hs37d5.0based.bed.gz'"
- "-profile test_multiref,docker" ## TODO add damage manipulation here instead once it goes multiref
steps:
- name: Check out pipeline code
Expand Down
68 changes: 68 additions & 0 deletions .github/workflows/release-announcments.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
name: release-announcements
# Automatic release toot and tweet anouncements
on:
release:
types: [published]
workflow_dispatch:

jobs:
toot:
runs-on: ubuntu-latest
steps:
- uses: rzr/fediverse-action@master
with:
access-token: ${{ secrets.MASTODON_ACCESS_TOKEN }}
host: "mstdn.science" # custom host if not "mastodon.social" (default)
# GitHub event payload
# https://docs.github.com/en/developers/webhooks-and-events/webhooks/webhook-events-and-payloads#release
message: |
Pipeline release! ${{ github.repository }} v${{ github.event.release.tag_name }} - ${{ github.event.release.name }}!
Please see the changelog: ${{ github.event.release.html_url }}
send-tweet:
runs-on: ubuntu-latest

steps:
- uses: actions/setup-python@v4
with:
python-version: "3.10"
- name: Install dependencies
run: pip install tweepy==4.14.0
- name: Send tweet
shell: python
run: |
import os
import tweepy
client = tweepy.Client(
access_token=os.getenv("TWITTER_ACCESS_TOKEN"),
access_token_secret=os.getenv("TWITTER_ACCESS_TOKEN_SECRET"),
consumer_key=os.getenv("TWITTER_CONSUMER_KEY"),
consumer_secret=os.getenv("TWITTER_CONSUMER_SECRET"),
)
tweet = os.getenv("TWEET")
client.create_tweet(text=tweet)
env:
TWEET: |
Pipeline release! ${{ github.repository }} v${{ github.event.release.tag_name }} - ${{ github.event.release.name }}!
Please see the changelog: ${{ github.event.release.html_url }}
TWITTER_CONSUMER_KEY: ${{ secrets.TWITTER_CONSUMER_KEY }}
TWITTER_CONSUMER_SECRET: ${{ secrets.TWITTER_CONSUMER_SECRET }}
TWITTER_ACCESS_TOKEN: ${{ secrets.TWITTER_ACCESS_TOKEN }}
TWITTER_ACCESS_TOKEN_SECRET: ${{ secrets.TWITTER_ACCESS_TOKEN_SECRET }}

bsky-post:
runs-on: ubuntu-latest
steps:
- uses: zentered/[email protected]
with:
post: |
Pipeline release! ${{ github.repository }} v${{ github.event.release.tag_name }} - ${{ github.event.release.name }}!
Please see the changelog: ${{ github.event.release.html_url }}
env:
BSKY_IDENTIFIER: ${{ secrets.BSKY_IDENTIFIER }}
BSKY_PASSWORD: ${{ secrets.BSKY_PASSWORD }}
#
6 changes: 5 additions & 1 deletion CITATIONS.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@

- [FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)

> Andrews, S. (2010). FastQC: A Quality Control Tool for High Throughput Sequence Data [Online]. Available online https://www.bioinformatics.babraham.ac.uk/projects/fastqc/.
> Andrews, S. (2010). FastQC: A Quality Control Tool for High Throughput Sequence Data [Online].
- [MultiQC](https://pubmed.ncbi.nlm.nih.gov/27312411/)

Expand Down Expand Up @@ -98,6 +98,10 @@

> Jun, G., Wing, M. K., Abecasis, G. R., & Kang, H. M. (2015). An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data. Genome Research, 25(6), 918–925. doi: [10.1101/gr.176552.114](https://doi.org/10.1101/gr.176552.114)
- [QualiMap](https://doi.org/10.1093/bioinformatics/btv566)

> QualiMap Okonechnikov, K., Conesa, A., & García-Alcalde, F. (2016). Qualimap 2: advanced multi-sample quality control for high-throughput sequencing data. Bioinformatics , 32(2), 292–294. Download: http://qualimap.bioinfo.cipf.es/
- [DamageProfiler](https://doi.org/10.1093/bioinformatics/btab190)
> DamageProfiler Neukamm, J., Peltzer, A., & Nieselt, K. (2020). DamageProfiler: Fast damage pattern calculation for ancient DNA. In Bioinformatics (btab190). doi: [10.1093/bioinformatics/btab190](https://doi.org/10.1093/bioinformatics/btab190). Download: https://github.com/Integrative-Transcriptomics/DamageProfiler
Expand Down
21 changes: 12 additions & 9 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,8 @@

**A fully reproducible and state-of-the-art ancient DNA analysis pipeline.**

[![AWS CI](https://img.shields.io/badge/CI%20tests-full%20size-FF9900?labelColor=000000&logo=Amazon%20AWS)](https://nf-co.re/eager/results)[![Cite with Zenodo](http://img.shields.io/badge/DOI-10.5281/zenodo.XXXXXXX-1073c8?labelColor=000000)](https://doi.org/10.5281/zenodo.XXXXXXX)
[![GitHub Actions CI Status](https://github.com/nf-core/eager/workflows/nf-core%20CI/badge.svg)](https://github.com/nf-core/eager/actions?query=workflow%3A%22nf-core+CI%22)
[![GitHub Actions Linting Status](https://github.com/nf-core/eager/workflows/nf-core%20linting/badge.svg)](https://github.com/nf-core/eager/actions?query=workflow%3A%22nf-core+linting%22)[![AWS CI](https://img.shields.io/badge/CI%20tests-full%20size-FF9900?labelColor=000000&logo=Amazon%20AWS)](https://nf-co.re/eager/results)[![Cite with Zenodo](http://img.shields.io/badge/DOI-10.5281/zenodo.1465061-1073c8?labelColor=000000)](https://doi.org/10.5281/zenodo.1465061)

[![Nextflow](https://img.shields.io/badge/nextflow%20DSL2-%E2%89%A523.04.0-23aa62.svg)](https://www.nextflow.io/)
[![run with conda](http://img.shields.io/badge/run%20with-conda-3EB049?labelColor=000000&logo=anaconda)](https://docs.conda.io/en/latest/)
Expand Down Expand Up @@ -84,10 +85,11 @@ A graphical overview of suggested routes through the pipeline depending on conte

## Usage

> **Note**
> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how
> to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline)
> with `-profile test` before running the workflow on actual data.
:::note
If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how
to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline)
with `-profile test` before running the workflow on actual data.
:::

First, prepare a samplesheet with your input data that looks as follows:

Expand All @@ -112,10 +114,11 @@ nextflow run nf-core/eager \
--outdir <OUTDIR>
```

> **Warning:**
> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files including those
> provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_;
> see [docs](https://nf-co.re/usage/configuration#custom-configuration-files).
:::warning
Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files including those
provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_;
see [docs](https://nf-co.re/usage/configuration#custom-configuration-files).
:::

For more details and further functionality, please refer to the [usage documentation](https://nf-co.re/eager/usage) and the [parameter documentation](https://nf-co.re/eager/parameters).

Expand Down
4 changes: 2 additions & 2 deletions assets/multiqc_config.yml
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
report_comment: >
This report has been generated by the <a href="https://github.com/nf-core/eager/3.0.0dev" target="_blank">nf-core/eager</a>
This report has been generated by the <a href="https://github.com/nf-core/eager/releases/tag/dev" target="_blank">nf-core/eager</a>
analysis pipeline. For information about how to interpret these results, please see the
<a href="https://nf-co.re/eager/3.0.0dev/output" target="_blank">documentation</a>.
<a href="https://nf-co.re/eager/dev/docs/output" target="_blank">documentation</a>.
report_section_order:
"nf-core-eager-methods-description":
order: -1000
Expand Down
18 changes: 18 additions & 0 deletions conf/modules.config
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,15 @@ process {
]
}

withName: 'MULTIQC' {
ext.args = params.multiqc_title ? "--title \"$params.multiqc_title\"" : ''
publishDir = [
path: { "${params.outdir}/multiqc" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
}

withName: FALCO {
tag = { "${meta.sample_id}_${meta.library_id}_L${meta.lane}" }
ext.args = '--quiet'
Expand Down Expand Up @@ -832,6 +841,15 @@ process {
]
}

withName: "QUALIMAP_BAMQC" {
tag = { "${meta.reference}|${meta.sample_id}_${meta.library_id}" }
publishDir = [
path: { "${params.outdir}/mapstats/qualimap/${meta.reference}/${meta.sample_id}/}" },
mode: params.publish_dir_mode,
enabled: true
]
}

//
// DAMAGE CALCULATION
//
Expand Down
32 changes: 32 additions & 0 deletions docs/output.md
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,10 @@ FastQC results in the `raw/` directory contain the results of running FastQC on

![MultiQC - FastQC adapter content plot](images/mqc_fastqc_adapter.png)

:::note
The FastQC plots displayed in the MultiQC report shows _untrimmed_ reads. They may contain adapter sequence and potentially regions with low quality.
:::

### MultiQC

<details markdown="1">
Expand All @@ -64,6 +68,7 @@ Results generated by MultiQC collate pipeline QC from supported tools e.g. FastQ
- Reports generated by Nextflow: `execution_report.html`, `execution_timeline.html`, `execution_trace.txt` and `pipeline_dag.dot`/`pipeline_dag.svg`.
- Reports generated by the pipeline: `pipeline_report.html`, `pipeline_report.txt` and `software_versions.yml`. The `pipeline_report*` files will only be present if the `--email` / `--email_on_fail` parameter's are used when running the pipeline.
- Reformatted samplesheet files used as input to the pipeline: `samplesheet.valid.csv`.
- Parameters used by the pipeline run: `params.json`.

</details>

Expand Down Expand Up @@ -379,6 +384,33 @@ These curves will be displayed in the pipeline run's MultiQC report, however you

### Mapping Statistics

#### QualiMap

<details markdown="1">
<summary>Output files</summary>

- `qualimap/`

- `<sample_id>/`
- `*.html`: in-depth report including percent coverage, depth coverage, GC content, etc. of mapped reads
- `genome_results.txt`
- `css/`: HTML CSS styling used for the report
- `images_qualimapReport/`: PNG version of images from the HTML report.
- `raw_data_qualimapReport/`: The raw data used to render the HTML report as TXT files.

</details>

[QualiMap](http://qualimap.bioinfo.cipf.es/)
is a tool which provides statistics on the quality of the mapping of your reads to your reference genome. It allows you to assess how well covered your reference genome is by your data, both in 'fold' depth (average number of times a given base on the reference is covered by a read) and 'percentage' (the percentage of all bases on the reference genome that is covered at a given fold depth). These outputs allow you to make decision if you have enough quality data for downstream applications like genotyping, and how to adjust the parameters for those tools accordingly.

> NB: Neither fold coverage nor percent coverage on its own is sufficient to assess whether you have a high quality mapping. Abnormally high fold coverages of a smaller region such as highly conserved genes or un-removed-adapter-containing reference genomes can artificially inflate the mean coverage, yet a high percent coverage is not useful if all bases of the genome are covered at just 1x coverage.
**Note that many of the statistics from this module are displayed in the General Stats table, as they represent single values that are not plottable.**

You will receive output for each sample. This means you will statistics of deduplicated values of all types of libraries combined in a single value (i.e. non-UDG treated, full-UDG, paired-end, single-end all together).

> ⚠️ Warning: If your library has no reads mapping to the reference, this will result in an empty BAM file. Qualimap will therefore not produce any output even if a BAM exists!
#### Bedtools

<details markdown="1">
Expand Down
16 changes: 12 additions & 4 deletions docs/usage.md
Original file line number Diff line number Diff line change
Expand Up @@ -117,7 +117,9 @@ If you wish to repeatedly use the same parameters for multiple runs, rather than

Pipeline settings can be provided in a `yaml` or `json` file via `-params-file <file>`.

> ⚠️ Do not use `-c <file>` to specify parameters as this will result in errors. Custom config files specified with `-c` must only be used for [tuning process resource specifications](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources), other infrastructural tweaks (such as output directories), or module arguments (args).
:::warning
Do not use `-c <file>` to specify parameters as this will result in errors. Custom config files specified with `-c` must only be used for [tuning process resource specifications](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources), other infrastructural tweaks (such as output directories), or module arguments (args).
:::

The above pipeline run specified with a params file in yaml format:

Expand Down Expand Up @@ -154,19 +156,25 @@ This version number will be logged in reports when you run the pipeline, so that

To further assist in reproducbility, you can use share and re-use [parameter files](#running-the-pipeline) to repeat pipeline runs with the same settings without having to write out a command with every single parameter.

> 💡 If you wish to share such profile (such as upload as supplementary material for academic publications), make sure to NOT include cluster specific paths to files, nor institutional specific profiles.
:::tip
If you wish to share such profile (such as upload as supplementary material for academic publications), make sure to NOT include cluster specific paths to files, nor institutional specific profiles.
:::

## Core Nextflow arguments

> **NB:** These options are part of Nextflow and use a _single_ hyphen (pipeline parameters use a double-hyphen).
:::note
These options are part of Nextflow and use a _single_ hyphen (pipeline parameters use a double-hyphen).
:::

### `-profile`

Use this parameter to choose a configuration profile. Profiles can give configuration presets for different compute environments.

Several generic profiles are bundled with the pipeline which instruct the pipeline to use software packaged using different methods (Docker, Singularity, Podman, Shifter, Charliecloud, Apptainer, Conda) - see below.

> We highly recommend the use of Docker or Singularity containers for full pipeline reproducibility, however when this is not possible, Conda is also supported.
:::info
We highly recommend the use of Docker or Singularity containers for full pipeline reproducibility, however when this is not possible, Conda is also supported.
:::

The pipeline also dynamically loads configurations from [https://github.com/nf-core/configs](https://github.com/nf-core/configs) when it runs, making multiple config profiles for various institutional clusters available at run time. For more information and to see if your system is available in these configs please see the [nf-core/configs documentation](https://github.com/nf-core/configs#documentation).

Expand Down
2 changes: 1 addition & 1 deletion lib/WorkflowEager.groovy
Original file line number Diff line number Diff line change
Expand Up @@ -52,7 +52,7 @@ class WorkflowEager {

public static String toolCitationText(params) {

// TODO Optionally add in-text citation tools to this list.
// TODO nf-core: Optionally add in-text citation tools to this list.
// Can use ternary operators to dynamically construct based conditions, e.g. params["run_xyz"] ? "Tool (Foo et al. 2023)" : "",
// Uncomment function in methodsDescriptionText to render in MultiQC report
def citation_text = [
Expand Down
3 changes: 3 additions & 0 deletions main.nf
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,9 @@ nextflow.enable.dsl = 2
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/

// TODO nf-core: Remove this line if you don't need a FASTA file
// This is an example of how to use getGenomeAttribute() to fetch parameters
// from igenomes.config using `--genome`
params.fasta = WorkflowMain.getGenomeAttribute(params, 'fasta')

/*
Expand Down
9 changes: 7 additions & 2 deletions modules.json
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,7 @@
},
"bamutil/trimbam": {
"branch": "master",
"git_sha": "c8e35eb2055c099720a75538d1b8adb3fb5a464c",
"git_sha": "911696ea0b62df80e900ef244d7867d177971f73",
"installed_by": ["modules"]
},
"bbmap/bbduk": {
Expand Down Expand Up @@ -107,7 +107,7 @@
},
"fastqc": {
"branch": "master",
"git_sha": "911696ea0b62df80e900ef244d7867d177971f73",
"git_sha": "9a4517e720bc812e95b56d23d15a1653b6db4f53",
"installed_by": ["modules"]
},
"gunzip": {
Expand Down Expand Up @@ -160,6 +160,11 @@
"git_sha": "911696ea0b62df80e900ef244d7867d177971f73",
"installed_by": ["modules"]
},
"qualimap/bamqc": {
"branch": "master",
"git_sha": "911696ea0b62df80e900ef244d7867d177971f73",
"installed_by": ["modules"]
},
"samtools/faidx": {
"branch": "master",
"git_sha": "911696ea0b62df80e900ef244d7867d177971f73",
Expand Down
2 changes: 1 addition & 1 deletion modules/nf-core/bamutil/trimbam/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 5 additions & 1 deletion modules/nf-core/fastqc/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading

0 comments on commit 5bfb860

Please sign in to comment.