Adapt cellranger #11

Merged
merged 5 commits into from
Sep 11, 2023
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -13,3 +13,9 @@
## [1.0.0] - 2022-12-08

First full, publicly available version of the gExcite analysis pipeline.

## [1.0.1] - 2022-08-21

### Changed
- Give the cellranger ADT and cellranger GEX rules the number of threads specified in the config file (`--localcores`).
- The script `analyse_citeseq.R` has a new parameter `--number_pca_adt`, set through `number_pca_adt` in the config file. It adjusts the number of PCA dimensions used for the UMAP calculation based on ADT counts. The number of PCA dimensions cannot be larger than the number of ADTs in the experiment at hand; otherwise the script fails.
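
The constraint described in this changelog entry could be checked before the R script is launched. The following is a minimal sketch, not part of the pull request, assuming the number of ADTs equals the number of data rows in a cellranger-style feature reference CSV with a header line. The helper names `count_adts` and `check_number_pca_adt` and the example feature reference (including its barcode sequences) are hypothetical.

```python
import csv
import io


def count_adts(feature_ref_csv: str) -> int:
    """Count data rows (one per ADT) in feature-reference CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(feature_ref_csv))
    return sum(1 for _ in reader)


def check_number_pca_adt(number_pca_adt: int, n_adts: int) -> int:
    """Return number_pca_adt unchanged if it is usable, otherwise raise ValueError."""
    if number_pca_adt < 1:
        raise ValueError("number_pca_adt must be a positive integer")
    if number_pca_adt > n_adts:
        raise ValueError(
            f"number_pca_adt={number_pca_adt} exceeds the {n_adts} ADTs in the experiment"
        )
    return number_pca_adt


# Hypothetical three-ADT feature reference, for illustration only.
EXAMPLE_FEATURE_REF = (
    "id,name,read,pattern,sequence,feature_type\n"
    "CD3,CD3,R2,5PNNNNNNNNNN(BC),AAAAAAAAAAAAAAA,Antibody Capture\n"
    "CD4,CD4,R2,5PNNNNNNNNNN(BC),CCCCCCCCCCCCCCC,Antibody Capture\n"
    "CD8,CD8,R2,5PNNNNNNNNNN(BC),GGGGGGGGGGGGGGG,Antibody Capture\n"
)

n_adts = count_adts(EXAMPLE_FEATURE_REF)
# A value of 3 passes; the config default of 20 would raise ValueError for this panel.
check_number_pca_adt(3, n_adts)
```

With a check like this, a panel smaller than the configured value fails early with a readable message rather than inside the UMAP call.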
3 changes: 3 additions & 0 deletions config/config.yaml
@@ -58,6 +58,9 @@ tools:

analyse_citeseq:
numberVariableGenes: 500
# Number of PCA dimensions used for the UMAP calculation based on ADT counts.
# This value cannot be larger than the number of ADTs in the experiment; otherwise the UMAP function (and with it the script) fails.
number_pca_adt: 20

scampi:
# scampi is a snakemake workflow that runs general scRNA processing steps
2 changes: 2 additions & 0 deletions workflow/rules/adt_analyse_citeseq.smk
@@ -38,6 +38,7 @@ rule Rscript_analyse_citeseq:
colorConfig=config["scampi"]["resources"]["colour_config"],
lookup=config["resources"]["adt_lookup"],
numberVariableGenes=config["tools"]["analyse_citeseq"]["numberVariableGenes"],
number_pca_adt=config["tools"]["analyse_citeseq"]["number_pca_adt"],
outdir="results/citeseq_analysis/{sample}/",
custom_script=workflow.source_path("../scripts/analyse_citeseq.R"),
log:
@@ -59,4 +60,5 @@ rule Rscript_analyse_citeseq:
"--threads {threads} "
"--sampleName {wildcards.sample} "
"--number_variable_genes {params.numberVariableGenes} "
"--number_pca_adt {params.number_pca_adt} "
"--output {params.outdir} &> {log} "
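
The rule above reads `config["tools"]["analyse_citeseq"]["number_pca_adt"]` directly, so a config file written before this change, which lacks the key, would raise a `KeyError`. The following is a minimal sketch of a more tolerant lookup, not part of the pull request. The helper `get_number_pca_adt` is hypothetical, and the fallback of 20 is taken from the default of `--number_pca_adt` in `analyse_citeseq.R`.

```python
# Mirrors the default of --number_pca_adt in analyse_citeseq.R.
DEFAULT_NUMBER_PCA_ADT = 20


def get_number_pca_adt(config: dict) -> int:
    """Read tools -> analyse_citeseq -> number_pca_adt, falling back to the script default."""
    section = config.get("tools", {}).get("analyse_citeseq", {})
    return int(section.get("number_pca_adt", DEFAULT_NUMBER_PCA_ADT))


# An older config without the new key still resolves to the default.
old_config = {"tools": {"analyse_citeseq": {"numberVariableGenes": 500}}}
new_config = {"tools": {"analyse_citeseq": {"numberVariableGenes": 500, "number_pca_adt": 10}}}

print(get_number_pca_adt(old_config))  # prints 20
print(get_number_pca_adt(new_config))  # prints 10
```

Whether a silent default or an explicit error is preferable is a design choice; the direct lookup in the rule has the advantage of making a missing key visible immediately.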
5 changes: 4 additions & 1 deletion workflow/rules/adt_cellranger.smk
@@ -71,7 +71,10 @@ rule cellranger_count_adt:
"--transcriptome={input.reference} "
"--libraries={input.library} "
"--feature-ref={input.features_ref} "
- "--nosecondary {params.variousParams} {params.targetCells}) "
+ "--nosecondary "
+ "--localcores={threads} "
+ "{params.variousParams} "
+ "{params.targetCells}) "
"&> {log} "
"&& gunzip -c {params.cr_out}{params.sample}/outs/filtered_feature_bc_matrix/features.tsv.gz > {params.cr_out}{params.sample}/outs/filtered_feature_bc_matrix/features.tsv ; "
"gunzip -c {params.cr_out}{params.sample}/outs/filtered_feature_bc_matrix/barcodes.tsv.gz > {params.cr_out}{params.sample}/outs/filtered_feature_bc_matrix/barcodes.tsv ; "
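
This hunk splits one command fragment into several adjacent string literals. Python, and therefore Snakemake, joins adjacent literals without inserting whitespace, so each fragment needs its own trailing space. The following is a minimal sketch using plain `str.format` as a stand-in for Snakemake's shell formatting. The function `render_flags` and the example values are hypothetical, and the whitespace-collapsing step is an illustration rather than something the pipeline does.

```python
# Adjacent string literals are concatenated as-is; the trailing spaces keep the flags apart.
COMMAND_TEMPLATE = (
    "--nosecondary "
    "--localcores={threads} "
    "{various_params} "
    "{target_cells}"
)


def render_flags(threads: int, various_params: str, target_cells: str) -> str:
    """Fill the template and collapse whitespace left behind by empty optional parameters."""
    rendered = COMMAND_TEMPLATE.format(
        threads=threads, various_params=various_params, target_cells=target_cells
    )
    return " ".join(rendered.split())


# Empty various_params leaves a double space, which the collapse step removes.
print(render_flags(8, "", "--expect-cells=5000"))
# prints: --nosecondary --localcores=8 --expect-cells=5000
```

Dropping the trailing space from any fragment would fuse two flags into one token, for example `--nosecondary--localcores=8`, which is why the added lines each end in a space.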
1 change: 1 addition & 0 deletions workflow/rules/gex_cellranger.smk
@@ -44,6 +44,7 @@ rule cellranger_count_gex:
"--transcriptome={input.reference} "
"--fastqs={input.fastqs_dir} "
"--nosecondary "
"--localcores={threads} "
"{params.variousParams}) "
"&> {log} ; "
"gunzip {params.cr_out}{params.mySample}/outs/filtered_feature_bc_matrix/features.tsv.gz ; "
3 changes: 2 additions & 1 deletion workflow/scripts/analyse_citeseq.R
@@ -37,6 +37,7 @@ option_list <- list(
make_option("--threads", type = "integer", help = "Number of threads that are available for the script. Recommended: 3-5"),
make_option("--sampleName", type = "character", help = "SampleName. Needs to be exactly as used in the thresholds table."),
make_option("--number_variable_genes", type = "integer", default = 500, help = "Number of variable genes that are included when calculating the UMAP embedding with RNA and ADT data."),
make_option("--number_pca_adt", type = "integer", default = 20, help = "Number of PCA dimensions used when calculating the ADT UMAP. Cannot be larger than number of ADTs in experiment."),
make_option("--output", type = "character", help = "Output directory.")
)
opt_parser <- OptionParser(option_list = option_list)
@@ -549,7 +550,7 @@ for (type in c("adt", "gex", "adt_gex")) {
cell_attributes <- plyr::join(df_colData_cells, CellRangerADTextended, by = "barcodes")
print("Working on Adt based UMAP embedding.")
combout <- paste(outfolder, "/ExpressionPlots/adt_based_embedding/", opt$sampleName, ".ADT_only", sep = "")
- umap.adt <- umap(t(as.matrix(CellRangerADT)), n_neighbors = 30, pca = 50, spread = 1, min_dist = 0.3, ret_nn = T)
+ umap.adt <- umap(t(as.matrix(CellRangerADT)), n_neighbors = 30, pca = opt$number_pca_adt, spread = 1, min_dist = 0.3, ret_nn = T)
reducedDim(my_sce, "umap_adt") <- umap.adt$embedding
metadata(my_sce) <- c(metadata(my_sce), list(umap_adt = umap.adt$nn$euclidean))
umap_coord <- as.data.frame(reducedDims(my_sce)$umap_adt)