Add new profiles to hadge #24

Merged
merged 32 commits into from
Nov 17, 2023
ea82311
update docs
wxicu Nov 3, 2023
3534e20
improve conda directive, add profiles
wxicu Nov 4, 2023
e6c1c01
fix warning
wxicu Nov 4, 2023
6139bd5
add test profile to souporcell
wxicu Nov 4, 2023
d8fb67d
upload test config
wxicu Nov 4, 2023
2d5767b
1.add line break 2.remove container setting 3. adapt souporcell testing
wxicu Nov 5, 2023
0b2fd70
add notebook
wxicu Nov 5, 2023
6c41264
add souporcell res in notebook
wxicu Nov 5, 2023
49e8cb6
enable conda and singularity for test config
wxicu Nov 6, 2023
cdcda52
Merge branch 'main' into docs
wxicu Nov 6, 2023
b5f02b2
fix pre commit
wxicu Nov 6, 2023
99589b6
minor changes
wxicu Nov 6, 2023
aa01844
add quick starts for three mode
wxicu Nov 6, 2023
f995c5f
upgrade pandas and python version, simplify summary code, add github …
wxicu Nov 7, 2023
4dcb1ed
remove empty lines
wxicu Nov 7, 2023
33f1bb9
add test action
wxicu Nov 7, 2023
b5b7946
fix typo
wxicu Nov 7, 2023
22b5f63
change on action
wxicu Nov 7, 2023
2ca87c4
update singularity version
wxicu Nov 7, 2023
629fd96
fix bug
wxicu Nov 7, 2023
17d4b20
use subset reference genome as test data
wxicu Nov 9, 2023
e5c5ae9
remove folder
wxicu Nov 9, 2023
6646ab3
fix test data loc bug
wxicu Nov 9, 2023
9f809e6
solve conda env issue
wxicu Nov 9, 2023
59f9ad9
add setup conda section to update conda
wxicu Nov 9, 2023
d577aad
update conda setup
wxicu Nov 9, 2023
3749bf0
solve conda env issue
wxicu Nov 9, 2023
c46c199
disable souporcell
wxicu Nov 12, 2023
4155572
enable new hashing tools
wxicu Nov 12, 2023
688770b
disable new hashing methods + add mudata
wxicu Nov 12, 2023
2e63ce1
disable anndata in test workflow to reduce memory
wxicu Nov 12, 2023
66319ab
disable mudata for memory
wxicu Nov 16, 2023
22 changes: 22 additions & 0 deletions .github/workflows/test_action.yml
@@ -0,0 +1,22 @@
+name: hadge test workflow
+on: push
+jobs:
+  test:
+    name: Run pipeline with test data
+    runs-on: ubuntu-latest
+    steps:
+      - name: Check out pipeline code
+        uses: actions/checkout@v3
+      - name: Setup conda
+        uses: conda-incubator/setup-miniconda@v2
+        with:
+          auto-update-conda: true
+          miniconda-version: "latest"
+          channels: conda-forge, bioconda
+      - name: Install Nextflow
+        uses: nf-core/setup-nextflow@v1
+      - name: Download test dataset
+        run: bash ${GITHUB_WORKSPACE}/test_data/download_data.sh
+      - name: Run pipeline with test data
+        run: |
+          nextflow run ${GITHUB_WORKSPACE} -profile test,conda --souporcell False --generate_anndata False
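The final workflow step can be reproduced outside of GitHub Actions. The following is a minimal Python sketch that only assembles the same command line and does not execute it. The helper name `build_test_command` and the placeholder path `/path/to/hadge` are assumptions for illustration; the profiles and parameter overrides are copied from the workflow file.

```python
import shlex


def build_test_command(workspace, profiles=("test", "conda"), params=None):
    """Return the argument list for a Nextflow test run (nothing is executed)."""
    if params is None:
        # Overrides taken from the last step of test_action.yml.
        params = {"souporcell": "False", "generate_anndata": "False"}
    command = ["nextflow", "run", workspace, "-profile", ",".join(profiles)]
    for name, value in params.items():
        # Pipeline parameters use a double dash; Nextflow options use a single dash.
        command += ["--" + name, str(value)]
    return command


# "/path/to/hadge" is a placeholder for a local checkout of the pipeline.
command = build_test_command("/path/to/hadge")
print(shlex.join(command))
```

Passing the resulting list to `subprocess.run(command, check=True)` would start the run, assuming Nextflow and conda are installed and the test data has been downloaded first.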
26 changes: 13 additions & 13 deletions bin/HTODemux-visualisation.R
@@ -9,30 +9,30 @@ parser <- ArgumentParser("Parameters for HTODemux Visualisation")
 parser$add_argument("--hashtagPath",help="folder where rds object was saved from the first part of HTODemux")
 parser$add_argument("--assay",help="Name of the Hashtag assay HTO by default", default = "HTO")
 #Output graphs - Ridge Plot
-parser$add_argument("--ridgePlot", help = "Generates a ridge plot from the results, True to generate", default = "TRUE")
+parser$add_argument("--ridgePlot", help = "Generates a ridge plot from the results, True to generate", default = "True")
 parser$add_argument("--ridgeNCol", help = "Number of columns for ridgePlot", default = 3, type = "integer")

 #Output graphs - Scatter Feature
-parser$add_argument("--featureScatter",help = "Generates a ridge plot from the results, True to generate", default = "TRUE")
+parser$add_argument("--featureScatter",help = "Generates a ridge plot from the results, True to generate", default = "True")
 parser$add_argument("--scatterFeat1", help = "Feature 1 for Feature Scatter Plot", default = "hto_HTO-A")
 parser$add_argument("--scatterFeat2", help = "Feature 2 for Feature Scatter Plot", default = "hto_HTO-B")

 #Output graphs - Violin Plot
-parser$add_argument("--vlnPlot", help = "Generates a violin plot from the results, True to generate", default = "TRUE")
+parser$add_argument("--vlnPlot", help = "Generates a violin plot from the results, True to generate", default = "True")
 parser$add_argument("--vlnFeatures", help = "Features to plot (gene expression, metrics, PC scores, anything that can be retreived by FetchData)", default = "nCount_RNA")
 parser$add_argument("--vlnLog", help = "plot the feature axis on log scale", action = "store_true")

 #Output graphs - tSNE
-parser$add_argument("--tSNE", help = "Generate a two dimensional tSNE embedding for HTOs", default = "TRUE")
+parser$add_argument("--tSNE", help = "Generate a two dimensional tSNE embedding for HTOs", default = "True")
 parser$add_argument("--tSNEIdents", help = "What should we remove from the object (we have Singlet,Doublet and Negative)", default = "Negative")
-parser$add_argument("--tSNEInvert", action = "store_true") # TRUE
-parser$add_argument("--tSNEVerbose", action = "store_true") # FALSE
-parser$add_argument("--tSNEApprox", action = "store_true") # FALSE
+parser$add_argument("--tSNEInvert", action = "store_true")
+parser$add_argument("--tSNEVerbose", action = "store_true")
+parser$add_argument("--tSNEApprox", action = "store_true")
 parser$add_argument("--tSNEDimMax", help = "max number of donors ",type = "integer", default = 1)
 parser$add_argument("--tSNEPerplexity", help = "value for perplexity", type = "integer", default = 100)

 #Output graphs - Heatmap
-parser$add_argument("--heatMap", help = "Generate a Heatmap", default = "FALSE")
+parser$add_argument("--heatMap", help = "Generate a Heatmap", default = "False")
 parser$add_argument("--heatMapNcells", help ="value for number of cells", type = "integer", default = 500)
 parser$add_argument("--outputdir", help='Output directory')

@@ -56,24 +56,24 @@ hashtag <-readRDS(hash_file)

 # Ridge Plot
 # Group cells based on the max HTO signal
-if (args$ridgePlot == "TRUE") {
+if (args$ridgePlot == "True") {
     Idents(hashtag) <- paste0(args$assay, "_maxID")
     RidgePlot(hashtag, assay = args$assay, features = rownames(hashtag[[args$assay]]), ncol = args$ridgeNCol)
     ggsave(paste0(args$outputdir, '/ridge.jpeg'), device = 'jpeg', dpi = 500) # height = 10, width = 10
 }

-if (args$featureScatter == "TRUE") {
+if (args$featureScatter == "True") {
     FeatureScatter(hashtag, feature1 = args$scatterFeat1, feature2 = args$scatterFeat2)
     ggsave(paste0(args$outputdir, '/featureScatter.jpeg'), device = 'jpeg',dpi = 500)
 }

-if (args$vlnPlot == "TRUE") {
+if (args$vlnPlot == "True") {
     Idents(hashtag) <- paste0(args$assay, "_classification.global")
     VlnPlot(hashtag, features = args$vlnFeatures, pt.size = 0.1, log = args$vlnLog)
     ggsave(paste0(args$outputdir, '/violinPlot.jpeg'), device = 'jpeg', dpi = 500)
 }

-if (args$tSNE == "TRUE") {
+if (args$tSNE == "True") {
     hashtag.subset <- subset(hashtag, idents = args$tSNEIdents, invert = args$tSNEInvert)
     DefaultAssay(hashtag.subset) <- args$assay
     hashtag.subset <- ScaleData(hashtag.subset, features = rownames(hashtag.subset),
@@ -84,7 +84,7 @@ if (args$tSNE == "TRUE") {
     ggsave(paste0(args$outputdir, '/tSNE.jpeg'), device = 'jpeg', dpi = 500)
 }

-if (args$heatMap == "TRUE") {
+if (args$heatMap == "True") {
     HTOHeatmap(hashtag, assay = args$assay, ncells = args$heatMapNcells)
     ggsave(paste0(args$outputdir, '/heatMap.jpeg'), device = 'jpeg', dpi = 500)
 }
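The script compares each flag with the exact string "True", so a value such as "TRUE" or "true" no longer enables the corresponding plot after this change. A case-insensitive conversion avoids that pitfall. The sketch below is written in Python rather than R and is modelled on the accepted values in bin/demuxem.py; `str_to_bool` and the set of false values are assumptions for illustration, not part of hadge.

```python
import argparse

# Accepted spellings; the true values mirror the list used in bin/demuxem.py.
TRUE_VALUES = {"true", "t", "yes", "y", "1"}
FALSE_VALUES = {"false", "f", "no", "n", "0"}  # assumed counterpart list


def str_to_bool(value: str) -> bool:
    """Convert a command-line string to a bool, ignoring case and padding."""
    lowered = value.strip().lower()
    if lowered in TRUE_VALUES:
        return True
    if lowered in FALSE_VALUES:
        return False
    raise argparse.ArgumentTypeError(
        f"expected a boolean-like string, got {value!r}"
    )


parser = argparse.ArgumentParser(description="Illustrative flag parsing")
parser.add_argument("--ridgePlot", type=str_to_bool, default=True)

# "TRUE", "True" and "true" are all accepted.
args = parser.parse_args(["--ridgePlot", "TRUE"])
print(args.ridgePlot)
```

Rejecting unknown spellings with an error, instead of silently treating them as false, makes a mistyped flag visible at start-up rather than as a missing plot.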
12 changes: 2 additions & 10 deletions bin/demuxem.py
@@ -18,7 +18,7 @@
 parser.add_argument('--alpha_noise', help='The Dirichlet prior concenration parameter on the background noise.', type=float, default=1.0)
 parser.add_argument('--tol', help='Threshold used for the EM convergence.', type=float, default=1e-6)
 parser.add_argument('--n_threads', help='Number of threads to use. Must be a positive integer.', type=int, default=1)
-parser.add_argument('--filter_demuxem', help='Use the filter for RNA, true or false', default='true')
+parser.add_argument('--filter_demuxem', help='Use the filter for RNA, True or False', default='True')
 parser.add_argument('--generateGenderPlot', help='Generate violin plots using gender-specific genes (e.g. Xist). <gene> is a comma-separated list of gene names.', default='')
 parser.add_argument('--objectOutDemuxem', help='Output name of demultiplexing results. All outputs will use it as the prefix.', default="demuxem_res")
 parser.add_argument('--outputdir', help='Output directory')
@@ -31,16 +31,8 @@
 if __name__ == '__main__':
     output_name = args.outputdir + "/" + args.objectOutDemuxem
     # load input rna data
-    #data = io.read_input(args.rna_matrix_dir, modality="rna")
     rna_data = sc.read_10x_mtx(args.rna_matrix_dir)
     hashing_data = sc.read_10x_mtx(args.hto_matrix_dir,gex_only=False)
-    #data.subset_data(modality_subset=['rna'])
-    #data.concat_data() # in case of multi-organism mixing data
-    # load input hashing data
-    #data.update(io.read_input(args.hto_matrix_dir, modality="hashing"))
-    # Extract rna and hashing data
-    #rna_data = data.get_data(modality="rna")
-    #hashing_data = data.get_data(modality="hashing")
     filter = ""
     if args.filter_demuxem.lower() in ['true', 't', 'yes', 'y', '1']:
         filter = True
@@ -96,7 +88,7 @@
     pg.write_output(mudata, output_name + ".out.demuxEM.zarr.zip")
     print("\nSummary statistics:")
     print("total\t{}".format(rna_data.shape[0]))
-    for name, value in rna_data.obs["demux_type"].value_counts().iteritems():
+    for name, value in rna_data.obs["demux_type"].value_counts().items():
         print("{}\t{}".format(name, value))
     summary = rna_data.obs["demux_type"].value_counts().rename_axis('classification').reset_index(name='counts')
     total = ["total", rna_data.shape[0]]
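The switch from `.iteritems()` to `.items()` goes with the pandas upgrade in this PR: `Series.iteritems()` was deprecated in pandas 1.5 and removed in 2.0, while `.items()` works in both. A minimal sketch of the same summary logic follows, using a made-up toy column in place of `rna_data.obs["demux_type"]`; the labels and counts are illustrative only.

```python
import pandas as pd

# Toy stand-in for rna_data.obs["demux_type"]; the values are invented.
demux_type = pd.Series(
    ["singlet", "singlet", "doublet", "unknown", "singlet"],
    name="demux_type",
)

counts = demux_type.value_counts()

# .items() yields (label, count) pairs, as .iteritems() did before pandas 2.0.
for name, value in counts.items():
    print("{}\t{}".format(name, value))

# Build the summary table the same way the script does, then append the total row.
summary = counts.rename_axis("classification").reset_index(name="counts")
summary.loc[len(summary)] = ["total", len(demux_type)]
print(summary)
```

Passing `name="counts"` to `reset_index` sets the count column explicitly, so the result does not depend on the default column name that `value_counts()` produces, which changed in pandas 2.0.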
1 change: 0 additions & 1 deletion bin/generate_data.py
@@ -3,7 +3,6 @@
 import os
 import scanpy as sc
 import argparse
-import muon as mu

 parser = argparse.ArgumentParser(description="Parameters for generating anndata and mudata")
 parser.add_argument("--assignment", help="Folder which contains cSV file with demultiplexing assignment", default=None)