Segmentation fault on MashMap step of generateDecoyTranscriptome.sh #5
Comments
Hi @jaclyn-taroni, thanks for raising this issue. One other user is also facing a similar issue with the human genome.
Hi @k3yavi, Thanks for the quick reply and the offer. I was planning on using the most recent Ensembl release for zebrafish. Here are the relevant links: ftp://ftp.ensembl.org/pub/release-96/fasta/danio_rerio/dna/Danio_rerio.GRCz11.dna.toplevel.fa.gz Thanks again!
Hi @jaclyn-taroni, @k3yavi has built the decoy transcriptome for zebrafish; you can grab it from the link on the salmon readme. --Rob
Hi @k3yavi, I'm getting the same error with data from a tick species -- any chance you'd be willing to run this for me, too? The genome is (we use the first one, Ixodes-Scapularis-IES6_...): The .gtf (ISE6, same as above): And a transcriptome that is as yet unpublished and unposted -- I'd have to send it.
Hi @cmatKhan,
Very much appreciated. I realized after I hit send that there is a transcriptome on VectorBase -- I assume that's what you used?
Actually, I just used the GTF and the genome to extract the transcriptome.
Hi guys, just a heads-up: we have curated decoy sequences for a subset of model organisms, and they can be found here.
I'm having this issue as well. I've tried it on a couple of machines, although the most RAM so far is 24GB (20 free). Any chance you could generate decoys for RefSeq human and mouse? They give GFF annotation files. I was feeding those directly into step 2 (instead of the exons.bed), and step 2 completes fine, but step 3 fails pretty early with a segmentation fault. Alternatively, can you give an estimate of how much RAM this script uses on your machine, where it successfully completes? Also, how long do you typically find it takes? I've not used MashMap before. I tried a trial run with a smaller genome and gave it 10 threads, and while it didn't hit a segmentation fault, I gave up after ~6 hours in step 3. I didn't really need the decoys, but I was surprised at how long it was taking. Thanks!
Hi, please fill out the following decoy generation request form https://forms.gle/3baJc5SYrkSWb1z48 and we will let you know once we have the decoys. On our machine, the script took ~100G of RAM and approximately an hour to run for human GENCODE data. Thanks!
Hi guys, just wanted to let you know that we recently released a new version of salmon where you don't have to explicitly run the MashMap pipeline. With v1.0, salmon can consume both the genome and the transcriptome without the need for annotations. Please check out the new preprint or follow this tutorial for reindexing.
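As a rough illustration of the genome-as-decoy setup described above, here is a minimal Python sketch that prepares the two inputs for indexing: a list of decoy names taken from the genome FASTA headers, and a combined FASTA with the transcripts first and the genome after them. The function names (`open_maybe_gzip`, `build_gentrome`) and the output directory are hypothetical and are not part of salmon. The output file names mirror the `gentrome.fa` and `decoys.txt` mentioned in this thread. The linked tutorial remains the authoritative procedure.

```python
import gzip
from pathlib import Path


def open_maybe_gzip(path):
    """Open a plain or gzip-compressed text file, chosen by file extension."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "r")


def build_gentrome(transcripts_fa, genome_fa, out_dir):
    """Write decoys.txt and gentrome.fa into out_dir (hypothetical helper).

    decoys.txt  : one genome sequence name per line (header text up to the
                  first whitespace, without the leading '>').
    gentrome.fa : transcript records followed by genome records.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    decoys_path = out_dir / "decoys.txt"
    gentrome_path = out_dir / "gentrome.fa"

    # Collect decoy names from the genome headers.
    with open_maybe_gzip(genome_fa) as genome, open(decoys_path, "w") as decoys:
        for line in genome:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    decoys.write(fields[0] + "\n")

    # Concatenate transcripts first, then the genome, making sure every line
    # ends with a newline so records from the two files never fuse.
    with open(gentrome_path, "w") as out:
        for source in (transcripts_fa, genome_fa):
            with open_maybe_gzip(source) as handle:
                for line in handle:
                    out.write(line if line.endswith("\n") else line + "\n")

    return decoys_path, gentrome_path


# Assumed follow-up step, run outside Python (index name is a placeholder):
#   salmon index -t gentrome.fa -d decoys.txt -i salmon_index
```

The transcripts are written before the genome so that the decoy sequences sit at the end of the combined file. Memory use of this preparation step is small because both files are streamed line by line; the RAM figures quoted in this thread apply to the indexing itself.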
Thank you so much! I asked in the chat, but just in case: any estimate of memory usage during indexing and quantification, assuming a human-genome-like reference? Thanks!
Hi @lpantano, Indexing with the entire human genome as decoy and the whole transcriptome (gencode v29) as the actual target sequence takes ~20G of RAM in our runs. The final (dense) index size is ~19G, so construction RAM is only a little bit more. Interestingly, while the final index when using the whole genome as decoy is considerably bigger than when using the MashMap decoy sequences, the indexing memory is quite a bit smaller.
Hi all,
I get "Segmentation fault (core dumped)" on step 3 of generateDecoyTranscriptome.sh. I've filed marbl/MashMap#21 upstream with more detailed information. I wanted to file an issue here in case you have any insight or I am using the script improperly.
Here's how I'm using this:
I realize you have gentrome.fa and decoys.txt for human here: https://github.com/COMBINE-lab/salmon#pre-computed-decoy-transcriptomes
I'm interested in generating this for zebrafish and happened to run into this problem with human first/before I found that on the Salmon README.
Thank you!