Souporcell clustering panicked at 'no entry found for key' for some samples #57

Open
bednarsky opened this issue Jul 4, 2024 · 0 comments

Hi again, and thanks for all the help in my long demultiplexing process! I have encountered a new error, this time in souporcell. I am demultiplexing many samples with reference VCFs. Below is all the information I have collected so far. Let me know if you need anything more, or if you think this is better discussed in the souporcell repo.

ERROR ~ Error executing process > 'HADGE:run_multi:gene_demultiplexing:demultiplex_souporcell:souporcell (SID399)'

Caused by:
Process HADGE:run_multi:gene_demultiplexing:demultiplex_souporcell:souporcell (SID399) terminated with an error exit status (1)

Command executed:

bcftools view path/to/projectsSID_prj/results/demultiplexing/split_wgs_vcf/BSA_0860_SID_prj.SID399.vcf -Ov -o unzipped.vcf

     mkdir souporcell_SID399
     mkdir souporcell_SID399/souporcell_out
     touch souporcell_SID399/params.csv
     echo -e "Argument,Value

bamfile,SID399__filtered_bam_file.bam
barcode,barcodes.tsv.gz
fasta,genome.fa
threads,5
clusters,2
ploidy,2
min_alt,10
min_ref,10
max_loci,2048
restarts,None
common_variant,No common variants are given.
known_genotype,path/to/projectsSID_prj/results/demultiplexing/split_wgs_vcf/BSA_0860_SID_prj.SID399.vcf
known_genotype_sample,No known sample names are given.
skip_remap,True
ignore,False " >> souporcell_SID399/params.csv

     souporcell_pipeline.py --threads 32 -i SID399__filtered_bam_file.bam -b barcodes.tsv.gz -f genome.fa -t 5 -k 2 --ploidy 2 --min_alt 10 --min_ref 10 --max_loci 2048               --known_genotypes unzipped.vcf  --skip_remap True  -o souporcell_SID399/souporcell_out

Command exit status:
1

Command output:
checking modules
imports done
checking bam for expected tags
checking fasta
restarting pipeline in existing directory souporcell_SID399/souporcell_out
using known genotypes
5
running vartrix
running souporcell clustering
/opt/souporcell/souporcell/target/release/souporcell -k 2 -a souporcell_SID399/souporcell_out/alt.mtx -r souporcell_SID399/souporcell_out/ref.mtx --restarts 100 -b barcodes.tsv.gz --min_ref 10 --min_alt 10 --threads 5 --known_genotypes souporcell_SID399/souporcell_out/common_variants_covered.vcf --known_genotypes_sample_names AP101_S144894 AP102_S144893

Command error:
INFO: Converting SIF file to temporary sandbox...
WARNING: While bind mounting 'path/to/projectsSID_prj:path/to/projectsSID_prj': destination is already in the mount point list
Traceback (most recent call last):
File "/opt/souporcell/souporcell_pipeline.py", line 593, in <module>
souporcell(args, ref_mtx, alt_mtx, final_vcf)
File "/opt/souporcell/souporcell_pipeline.py", line 531, in souporcell
subprocess.check_call(cmd, stdout = log, stderr = err)
File "/usr/local/envs/py36/lib/python3.6/subprocess.py", line 311, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/opt/souporcell/souporcell/target/release/souporcell', '-k', '2', '-a', 'souporcell_SID399/souporcell_out/alt.mtx', '-r', 'souporcell_SID399/souporcell_out/ref.mtx', '--restarts', '100', '-b', 'barcodes.tsv.gz', '--min_ref', '10', '--min_alt', '10', '--threads', '5', '--known_genotypes', 'souporcell_SID399/souporcell_out/common_variants_covered.vcf', '--known_genotypes_sample_names', 'AP101_S144894', 'AP102_S144893']' returned non-zero exit status 101.
INFO: Cleaning up image...

Work dir:
path/to/projectsSID_prj/results/demultiplexing/work/73/cbcefa88ffec9fc5f62d2419bec9f4

total loci used 16524
thread '<unnamed>' panicked at 'no entry found for key', src/main.rs:325:30
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
thread '<unnamed>' panicked at 'no entry found for key', src/main.rs:325:30
thread '<unnamed>' panicked at 'no entry found for key', src/main.rs:325:30
thread '<unnamed>' panicked at 'no entry found for key', src/main.rs:325:30
thread '<unnamed>' panicked at 'no entry found for key', src/main.rs:325:30
→ but none of the similar issues I found were resolved in a helpful way

  • I looked at the other work directories for the same process to check whether they finished (more output files? results in clustering.tsv?):

    • Some worked, some didn't
    • Timestamps of the failures and successes are mixed, and they are from all batches of samples
    • The other samples that have 'panicked' in their err files are not mentioned in the hadge nextflow logfile, but the ones that worked are.
  • Another issue that might be relevant: "Doublet detection. Returned non-zero exit status 101", wheaton5/souporcell#175 (comment)

    wheaton5 commented on Apr 12, 2023
    This generally happens when there are no variants that pass filters. But usually just no variants for some silly reason like inconsistent chromosome naming or similar.

    → But I cannot find any mismatch: all chromosome names follow the chr{i} pattern.
    Also, the comment above concerns doublet detection, which runs after clustering.
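
A chromosome-name check of the kind mentioned above can be sketched as follows. This is only a minimal sketch under assumptions: the helper names (`open_text`, `vcf_contigs`, `fasta_contigs`, `compare_contigs`) are made up for illustration and are not part of hadge or souporcell, and the check only compares the CHROM column of a VCF against the sequence names in a FASTA. It does not inspect the BAM header or souporcell's own filters.

```python
import gzip
from pathlib import Path


def open_text(path):
    """Open a plain-text or gzip-compressed file in text mode."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def vcf_contigs(path):
    """Collect the CHROM value of every record line in a VCF (hypothetical helper)."""
    contigs = set()
    with open_text(path) as handle:
        for line in handle:
            # Skip header lines and blank lines; only records carry a CHROM value.
            if line.startswith("#") or not line.strip():
                continue
            contigs.add(line.split("\t", 1)[0])
    return contigs


def fasta_contigs(path):
    """Collect sequence names from FASTA header lines (hypothetical helper)."""
    contigs = set()
    with open_text(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    # The sequence name is the first whitespace-delimited token.
                    contigs.add(fields[0])
    return contigs


def compare_contigs(vcf_path, fasta_path):
    """Report which VCF contigs are missing from the FASTA and which are shared."""
    vcf = vcf_contigs(vcf_path)
    fasta = fasta_contigs(fasta_path)
    return {
        "only_in_vcf": sorted(vcf - fasta),
        "shared": sorted(vcf & fasta),
    }
```

Calling `compare_contigs("unzipped.vcf", "genome.fa")` in the work directory would list any contig names that occur in the VCF but not in the reference. An empty `only_in_vcf` list is consistent with the observation that everything is named like chr{i}, but it does not rule out other reasons for variants being filtered out.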
