Merge branch 'standardize-conversion-workflow' of https://github.com/…
…nf-core/scrnaseq into standardize-conversion-workflow
fmalmeida committed Nov 29, 2024
2 parents d12f64d + f07cac6 commit 595be8b
Showing 4 changed files with 15 additions and 9 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

- Add `--save_align_intermeds` parameter that publishes BAM files to the output directory (for `starsolo`, `cellranger` and `cellranger multi`) ([#384](https://github.com/nf-core/scrnaseq/issues/384))
- Added support for pre-built indexes in `genomes.config` file for `cellranger`, `cellranger-arc`, `simpleaf` and `simpleaf txp2gene` ([#371](https://github.com/nf-core/scrnaseq/issues/371))
- Cleanup and fix bugs in matrix conversion code, and change to use anndataR for conversions, and cellbender for emptydrops call. ([#369](https://github.com/nf-core/scrnaseq/pull/369))
- Fix problem with `test_full` that was not running out of the box, since code was trying to overwrite parameters in the workflow, which is not possible ([#366](https://github.com/nf-core/scrnaseq/issues/366))

## v2.7.1 - 2024-08-13

16 changes: 8 additions & 8 deletions docs/output.md
@@ -19,7 +19,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
- [Cellranger ARC](#cellranger-arc)
- [Cellranger multi](#cellranger-multi)
- [UniverSC](#universc)
- [Custom emptydrops filter](#custom-emptydrops-filter)
- [Cellbender emptydrops filter](#cellbender-emptydrops-filter)
- [Other output data](#other-output-data)
- [MultiQC](#multiqc)
- [Pipeline information](#pipeline-information)
@@ -141,15 +141,15 @@ Battenberg, K., Kelly, S.T., Ras, R.A., Hetherington, N.A., Hayashi, K., and Min

- Contains the mapped BAM files, filtered and unfiltered HDF5 matrices and output metrics created by the open-source implementation of Cell Ranger run via UniverSC

## Custom emptydrops filter
## Cellbender emptydrops filter

The pipeline also possess a module to perform empty-drops calling and filtering with a custom-made script that uses a library called `bioconductor-dropletutils` that is available in `bioconda`. The process is simple, it takes a raw/unfiltered matrix file, and performs the empty-drops calling and filtering on it, generating another matrix file.
The pipeline also possess a subworkflow imported from scdownstream to perform emptydrops calling and filtering using [cellbender](https://github.com/broadinstitute/CellBender). The process is simple, it takes a raw/unfiltered matrix file, and performs the emptydrops calling and filtering on it, generating another matrix file.

> Users can turn it of with `--skip_emptydrops`.
**Output directory: `results/${params.aligner}/emptydrops_filtered`**
**Output directory: `results/${params.aligner}/${meta.id}/emptydrops_filter`**

- Contains the empty-drops filtered matrices results generated by the `bioconductor-dropletutils` custom script
- Contains the emptydrops filtered matrices results generated by the cellbender subworkflow.

## Other output data

@@ -170,15 +170,15 @@ The pipeline also possess a module to perform empty-drops calling and filtering
- `.mtx` files converted to R native data format, rds, using the [Seurat package](https://github.com/satijalab/seurat)
- One per sample

Because the pipeline has both the data directly from the aligners, and from the custom empty-drops filtering module the conversion modules were modified to understand the difference between raw/filtered from the aligners itself and filtered from the custom empty-drops module. So, to try to avoid confusion by the user, we added "suffixes" to the generated converted files so that we have provenance from what input it came from.
Because the pipeline has both the data directly from the aligners, and from the cellbender empty-drops filtering module, the conversion modules were modified to understand the difference between raw/filtered from the aligners itself and filtered from the empty-drops module. So, to try to avoid confusion by the user, we added "suffixes" to the generated converted files so that we have provenance from what input it came from.

So, the conversion modules generate data with the following syntax: **`*_{raw,filtered,custom_emptydrops_filter}_matrix.{h5ad,rds}`**. With the following meanings:
So, the conversion modules generate data with the following syntax: **`*_{raw,filtered,emptydrops_filter}_matrix.{h5ad,rds}`**. With the following meanings:

| suffix | meaning |
| :----------------------- | :--------------------------------------------------------------------------------------------------------------------------------------- |
| raw | Conversion of the raw/unprocessed matrix generated by the tool. It is also used for tools that generate only one matrix, such as alevin. |
| filtered | Conversion of the filtered/processed matrix generated by the tool |
| custom_emptydrops_filter | Conversion of the matrix that was generated by the new custom empty drops filter module |
| emptydrops_filter | Conversion of the matrix that was generated by the cellbender empty drops filter module |

> Some aligners, like `alevin` do not produce both raw&filtered matrices. When aligners give only one output, they are treated with the `raw` suffix. Some aligners may have an option to give both raw&filtered and only one, like `kallisto`. Be aware when using the tools.
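
As an illustration of the naming convention above, a downstream script could recover the provenance suffix from a converted file name. This is a minimal sketch and not part of the pipeline: the helper name `classify_matrix` and the sample file names are hypothetical, and it assumes only the documented pattern `*_{raw,filtered,emptydrops_filter}_matrix.{h5ad,rds}`.

```python
import re

# Hypothetical helper (not part of nf-core/scrnaseq). It assumes the documented
# pattern: *_{raw,filtered,emptydrops_filter}_matrix.{h5ad,rds}
_MATRIX_NAME = re.compile(
    r"^(?P<prefix>.+)_(?P<suffix>emptydrops_filter|filtered|raw)"
    r"_matrix\.(?P<ext>h5ad|rds)$"
)


def classify_matrix(filename):
    """Return (prefix, suffix, extension), or None if the name does not match."""
    match = _MATRIX_NAME.match(filename)
    if match is None:
        return None
    return match.group("prefix"), match.group("suffix"), match.group("ext")


if __name__ == "__main__":
    # Sample names are made up for illustration.
    for name in (
        "sample1_raw_matrix.h5ad",
        "sample1_filtered_matrix.rds",
        "sample1_emptydrops_filter_matrix.h5ad",
    ):
        print(name, "->", classify_matrix(name))
```

Names that do not follow the pattern return `None`, so a caller can skip unrelated files instead of guessing their provenance.
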
2 changes: 1 addition & 1 deletion nextflow_schema.json
@@ -90,7 +90,7 @@
},
"skip_emptydrops": {
"type": "boolean",
"description": "Skip custom empty drops filter module"
"description": "Skip cellbender empty drops filter subworkflow"
}
}
},
4 changes: 4 additions & 0 deletions subworkflows/local/emptydrops_removal.nf
@@ -1,6 +1,10 @@
include { CELLBENDER_REMOVEBACKGROUND } from '../../modules/nf-core/cellbender/removebackground'
include { ADATA_BARCODES } from '../../modules/local/adata_barcodes'

//
// TODO: Make it a nf-core subworkflow to be shared by scrnaseq and scdownstream pipelines.
//

workflow EMPTY_DROPLET_REMOVAL {
take:
ch_unfiltered