Improvement in Variant calling #28
Comments
Hi @priesgo. Do you have any thoughts about the points I mentioned above? Besides, I have one more question. It appears that the default behaviour of GATK HaplotypeCaller is to decompose MNPs into SNPs (see the documentation, where they use `--max-mnp-distance 0`). This can be achieved using
Thanks for the kind words. There are many points in a single issue; I'll try to cover them all and I will spin off particular tasks from this thread. Starting from the end...
Normalization and MNVs
We actually "phase" the mutations from all variant callers and join them into MNVs if two conditions are met: 1) they are both clonal variants (VAF >= 80 %) and 2) they overlap the same amino acid. We use some custom code for this under
iVar and reference
According to the samtools documentation, when a reference is provided, Base Alignment Quality (BAQ) is computed. So you may be filtering false positives (FPs) when using the reference... I will add this option here: #29
iVar and spanning deletions
This looks reasonable, although we have been focusing on LoFreq results, so I cannot comment. I will add an option to the pipeline though, here: #30
Improvements to BCFtools
Again, we have not been looking much into BCFtools results, but the best for the pipeline would be to add optional BCFtools arguments for end users to tweak variant calling. I will do this here: #31
@Rohit-Satyam we accept contributions if you would like to extend the CoVigator pipeline to your needs instead of developing your own pipeline. That may help us both advance faster, but we are happy to help whatever your decision is.
Sorry, I have a few more doubts.
It would be wonderful if that can happen. The only problem is that I have just started learning Nextflow. I will try to edit your code, and if that doesn't work I will try to share the additional modules (maybe via email or gists) that can be included in the pipeline.
We do the same. In order to get the annotation right you need to merge all variants that overlap the same codon. We do this for all variant callers, not only for iVar. This is the "phasing".
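For illustration, the phasing rule described in this thread (merge clonal variants that fall in the same codon) could be sketched roughly as below. This is a minimal Python sketch and not the CoVigator code: the `Variant` class, the `phase_clonal_snvs` function, the restriction to single-base substitutions, the single coding region starting at `cds_start`, and the choice to give the merged MNV the lowest VAF of its members are all assumptions made for the example.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Variant:
    pos: int    # 1-based position of the first reference base
    ref: str
    alt: str
    vaf: float  # variant allele frequency in [0, 1]


def codon_index(pos, cds_start):
    """Index of the codon containing a 1-based position, for a CDS starting at cds_start."""
    return (pos - cds_start) // 3


def phase_clonal_snvs(variants, reference, cds_start, min_vaf=0.8):
    """Join clonal SNVs (VAF >= min_vaf) that share a codon into one MNV.

    Hypothetical helper: assumes every input variant is a single-base
    substitution and that `reference` is the genome sequence as a string.
    """
    clonal = sorted((v for v in variants if v.vaf >= min_vaf), key=lambda v: v.pos)
    subclonal = [v for v in variants if v.vaf < min_vaf]
    merged = []
    # Sorting by position keeps variants of the same codon adjacent for groupby.
    for _, group in groupby(clonal, key=lambda v: codon_index(v.pos, cds_start)):
        group = list(group)
        if len(group) == 1:
            merged.append(group[0])
            continue
        start, end = group[0].pos, group[-1].pos
        ref = reference[start - 1:end]  # reference bases spanned by the MNV
        alt = list(ref)
        for v in group:
            alt[v.pos - start] = v.alt
        # Assumption: report the lowest VAF among the joined variants.
        merged.append(Variant(start, ref, "".join(alt), min(v.vaf for v in group)))
    return sorted(merged + subclonal, key=lambda v: v.pos)


if __name__ == "__main__":
    toy_reference = "ATGGATTTC"  # made-up sequence: codons ATG, GAT, TTC
    calls = [Variant(4, "G", "A", 0.95), Variant(6, "T", "C", 0.90)]
    for variant in phase_clonal_snvs(calls, toy_reference, cds_start=1):
        print(variant)
```

Variants below the VAF threshold are passed through unchanged, so a subclonal SNV sharing a codon with a clonal one is never merged.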
This is a good question! The phasing does not matter as
Any contribution will be appreciated. You can also create a GitHub branch and do your work there; I can help you make it work if you run into trouble.
@priesgo
@ibn-salem
@ozlemmuslu
I am using CoVigator as a reference to build my own Nextflow pipeline, and thanks for doing a wonderful job. Your code is clean and comprehensible. I have a few queries and suggestions that I need your insight on.
Suggestion
I stumbled upon this post, which recommends using the `-x` flag in `bcftools mpileup` to disable read-pair overlap detection, following an issue on bcftools, and the `-h500` flag so that deletions at high coverage are not missed, see. Also, maybe `-m 10`
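As a rough sketch of how these options could be exposed for end users to toggle, the Python helper below assembles a `bcftools mpileup` command line from the three flags mentioned above (`-x`, `-h`, `-m`). The function name, its parameters and the file names are hypothetical; `-f` is used to pass the reference FASTA. The exact meaning and defaults of each flag depend on the bcftools version, so check `bcftools mpileup --help` before relying on this.

```python
import shlex


def bcftools_mpileup_args(bam, reference, disable_overlap_detection=True,
                          tandem_qual=500, min_ireads=None):
    """Build an argument list for `bcftools mpileup` (hypothetical helper).

    Only flags discussed in this thread are used:
      -x       disable read-pair overlap detection
      -h INT   the `-h500` setting suggested for deletions at high coverage
      -m INT   the `-m 10` setting suggested as a possible addition
    """
    args = ["bcftools", "mpileup", "-f", reference]
    if disable_overlap_detection:
        args.append("-x")
    if tandem_qual is not None:
        args += ["-h", str(tandem_qual)]
    if min_ireads is not None:
        args += ["-m", str(min_ireads)]
    args.append(bam)
    return args


if __name__ == "__main__":
    # File names are placeholders, not files from the pipeline.
    command = bcftools_mpileup_args("sample.bam", "reference.fa", min_ireads=10)
    print(shlex.join(command))
```

Returning a list rather than a single string keeps each argument separate, which makes it straightforward to pass to `subprocess.run` without shell quoting problems.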
EDIT 1: There is also an issue with iVar related to spanning deletions that can be temporarily worked around by using `--ignore-overlaps`. See issue1 and issue2.
EDIT 2: I also observe a weird trend of more variants being reported by iVar when we don't use the `--reference` option in `samtools mpileup`. I am attaching the results obtained on the same sample, with the only difference being `--reference Sars_cov_2.ASM985889v3.dna.toplevel.fa`. Can you tell me what's happening?
ivar-results.xlsx
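One way to pin down which calls differ between the two runs is to compare the variant tables directly. The sketch below is only an illustration: it assumes each result has been exported as a tab-separated table with `REGION`, `POS`, `REF` and `ALT` columns (as in iVar's variant output), and the helper names and the toy rows in the usage example are made up.

```python
import csv
import io


def load_variant_keys(tsv_text):
    """Return the set of (REGION, POS, REF, ALT) keys found in a variant table."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {(row["REGION"], int(row["POS"]), row["REF"], row["ALT"]) for row in reader}


def compare_calls(with_reference_tsv, without_reference_tsv):
    """Split calls into shared ones and ones unique to either run."""
    with_ref = load_variant_keys(with_reference_tsv)
    without_ref = load_variant_keys(without_reference_tsv)
    return {
        "shared": sorted(with_ref & without_ref),
        "only_with_reference": sorted(with_ref - without_ref),
        "only_without_reference": sorted(without_ref - with_ref),
    }


if __name__ == "__main__":
    # Toy tables for demonstration only; not taken from ivar-results.xlsx.
    run_a = "REGION\tPOS\tREF\tALT\nchr\t100\tA\tG\n"
    run_b = "REGION\tPOS\tREF\tALT\nchr\t100\tA\tG\nchr\t250\tC\tT\n"
    for label, keys in compare_calls(run_a, run_b).items():
        print(label, keys)
```

Inspecting the calls that appear only without the reference (their depth, base qualities and allele frequencies) is a reasonable first step to see whether they look like the false positives that the quality adjustment would remove.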