Using GLNexus to merge GATK gVCFs #216
Comments
GLnexus doesn't compute them because of the way it's meant to process subsets of the cohort across compute nodes for genotyping. In a complete workflow, the variant-level aggregates are easier to compute in a downstream analytics environment like Apache Spark. This answer doesn't much help users of the standalone open-source version, I know. Some more of the aggregates can be added by bcftools. If there are a selected few that'd be especially helpful to build in, we'd welcome that feedback.
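For readers wondering what "variant-level aggregates" means in practice, here is a minimal sketch of computing AN, AC and AF for a single site from its genotype strings. The function name `allele_aggregates` and the sample genotypes are hypothetical and only for illustration; this is not GLnexus code. For real files, the bcftools `+fill-tags` plugin is the usual way to add these tags.

```python
from collections import Counter


def allele_aggregates(genotypes, n_alt):
    """Compute AN, AC and AF for one site (hypothetical helper, illustration only).

    genotypes: GT strings such as "0/1", "1|2" or "./."
    n_alt:     number of ALT alleles listed at the site
    """
    counts = Counter()
    for gt in genotypes:
        # Treat phased ("|") and unphased ("/") separators alike.
        for allele in gt.replace("|", "/").split("/"):
            if allele != ".":  # skip missing (no-call) alleles
                counts[int(allele)] += 1

    an = sum(counts.values())                      # called alleles in total
    ac = [counts[i] for i in range(1, n_alt + 1)]  # one count per ALT allele
    # AF is undefined when nothing was called; report None in that case.
    af = [c / an for c in ac] if an else [None] * n_alt
    return {"AN": an, "AC": ac, "AF": af}


site = allele_aggregates(["0/1", "1/1", "./.", "0|2"], n_alt=2)
print(site)
```

Because each of these is a simple sum over samples, they can be computed after genotyping on the merged pVCF, which fits the point above about doing it downstream.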
I'd think of both GLnexus and GenotypeGVCFs as applying first-pass filters mainly meant to prevent the aforementioned blowup of runtime and file size. IIRC we calibrated the [...]
Re multiallelic and indel sites, there is a raft of gnarly issues with overlapping variants in VCF, discussed further on Reading GLnexus pVCFs and #210. Those account for some of the difference. Also notable is that GenotypeGVCFs has a default setting of --max-alternate-alleles much lower than the GLnexus equivalent.
Hi.
Hi, I'm an undergraduate student from Korea University.
Hi,
I'm trying to use GLnexus v.1.2.6 to merge g.vcf files generated by the GATK HaplotypeCaller, using the gatk configuration pre-set. For my project, I'm merging WGS calls from ~300 individuals sequenced at 30-50X mean coverage. I have a couple of questions about the merging process:
Thanks!