Running cellSNP on transcriptomic BAM from salmon alevin #14
Comments
Just following up on this - I have been recommended this tool. However, it isn't clear to me yet whether it can also go the other way - transcriptomic to genomic coordinates, which is what would be needed to run cellSNP on the
Hi, a quick answer is that if you can convert the genome-based VCF into a transcriptome-based VCF, cellSNP may work straightforwardly (in mode 1). This means you need to change the chromosome ID into the transcript ID, and the SNP position into the position on the transcript. Also, keep an eye on the CB and UMI barcodes, as I'm not sure whether salmon works similarly to STAR in cellranger.
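To make the suggested coordinate change concrete, here is a minimal sketch of mapping a genomic SNV onto a transcript. It is an illustration, not part of cellSNP or salmon, and it assumes that exon coordinates are 1-based and inclusive, that only single-nucleotide variants are handled, and that alleles are complemented for minus-strand transcripts. The function names and the transcript ID `TX1` are hypothetical.

```python
from typing import List, Optional, Tuple

# Complement table, used when the transcript lies on the minus strand.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def genomic_to_transcript_pos(
    exons: List[Tuple[int, int]], strand: str, gpos: int
) -> Optional[int]:
    """Return the 1-based position of gpos on the spliced transcript.

    exons: (start, end) genomic intervals, 1-based inclusive (assumed).
    Returns None when gpos falls outside every exon (e.g. in an intron).
    """
    # Walk exons in transcript order: ascending for "+", descending for "-".
    ordered = sorted(exons, reverse=(strand == "-"))
    offset = 0  # transcript bases contributed by earlier exons
    for start, end in ordered:
        if start <= gpos <= end:
            within = gpos - start if strand == "+" else end - gpos
            return offset + within + 1
        offset += end - start + 1
    return None


def convert_snv(
    pos: int, ref: str, alt: str,
    transcript_id: str, exons: List[Tuple[int, int]], strand: str,
) -> Optional[Tuple[str, int, str, str]]:
    """Build (transcript_id, transcript_pos, ref, alt) for one SNV."""
    tpos = genomic_to_transcript_pos(exons, strand, pos)
    if tpos is None:
        return None  # variant is not covered by this transcript
    if strand == "-":
        # Transcript sequence is the reverse complement of the genome.
        ref = ref.translate(COMPLEMENT)
        alt = alt.translate(COMPLEMENT)
    return (transcript_id, tpos, ref, alt)


# Hypothetical two-exon transcript on the minus strand.
exons = [(100, 199), (300, 399)]
record = convert_snv(199, "A", "G", "TX1", exons, "-")
```

A variant that overlaps several transcripts would yield one record per transcript, so the converted VCF can contain more rows than the genomic one.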
Oh cool, thanks for this idea, I'll look into this. I am already manipulating the Yes, the cell barcodes and UMI barcodes were also an issue. We asked the salmon maintainers (Avi Srivastava) about this, and Avi has kindly added an update in the latest development version of salmon, which includes the barcodes (CB and UR fields). (This discussion was in the Bioconductor Slack,
Thanks for this idea. Unfortunately, I don't think it worked in the end. I used the Ensembl Variant Effect Predictor (VEP) tool to get the transcript ID and position for each variant, and then rebuilt the
So unfortunately it seems it may not be possible to use
If you have any other ideas, please let me know, since it would be really useful for us if we could somehow use Thanks again for your help.
Hi, thanks for your previous responses a while back.

I am now trying to use cellSNP/Vireo with a SAM/BAM output file from `salmon alevin` instead of Cell Ranger, and wanted to see if this would be feasible somehow.

The main issue seems to be that `salmon alevin` maps to the transcriptome and creates a transcriptomic BAM, while Cell Ranger maps to the genome and creates a genomic BAM. Since cellSNP expects a genomic BAM, it then gives me errors due to not finding chromosome references in the transcriptomic BAM.

I just checked with the `salmon alevin` authors on Slack this morning to find out more about this - I believe the chromosome reference is normally stored in the `RNAME` field in the genomic BAM, but in the transcriptomic BAM this field contains the transcript name. The genomic/transcriptomic coordinates are also different.

Do you know if there would be a way to run cellSNP directly on a transcriptomic BAM, or convert this internally somehow? Alternatively, the `salmon alevin` authors pointed me towards some tools to directly convert transcriptomic to genomic BAM, which could be a way around this - but I thought I would check here first.

The main reason this has come up is that we tend to have various issues with Cell Ranger (slow runtime, memory problems, multi-mapping reads), so we would prefer to use `salmon alevin` in our pipelines, and were wondering if this has already been considered. Thanks!
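One way to confirm the mismatch described above before running anything is to compare the reference names in the BAM header with the contig names in the VCF. The sketch below is a plain-text check, not a cellSNP feature. It assumes the header text was dumped beforehand (for example with `samtools view -H`) and that the VCF is uncompressed. The helper names and the example IDs are hypothetical.

```python
from typing import List, Set


def sam_reference_names(header_text: str) -> List[str]:
    """Collect the SN: value of every @SQ line in a SAM header."""
    names = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


def vcf_contigs(vcf_text: str) -> Set[str]:
    """Collect the CHROM column of every data line in a VCF."""
    contigs = set()
    for line in vcf_text.splitlines():
        if line and not line.startswith("#"):
            contigs.add(line.split("\t", 1)[0])
    return contigs


def shared_references(header_text: str, vcf_text: str) -> Set[str]:
    """Names present in both files; empty means no variant can be matched."""
    return set(sam_reference_names(header_text)) & vcf_contigs(vcf_text)


# Hypothetical transcriptomic header: RNAME values are transcript names.
header = "@HD\tVN:1.0\n@SQ\tSN:ENST00000000001\tLN:1500\n"
# Hypothetical genomic VCF: CHROM values are chromosome names.
vcf = "##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\n1\t12345\t.\tA\tG\n"

overlap = shared_references(header, vcf)  # empty set for this pair
```

An empty overlap is consistent with the "chromosome reference not found" style of error: the BAM and the VCF simply do not share a coordinate system.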