This repository contains pipelines and scripts used to assemble, validate, and compare bovine genomes using ONT and HiFi data for three trios:
- Original Braunvieh x Original Braunvieh
- Nellore x Brown Swiss
- gaur x Piedmontese
The pipelines cover multiple current long-read assemblers, including hifiasm, canu, Shasta, and Flye.
There are also pipelines and scripts to construct different pangenomes with minigraph and to analyse the resulting structural variants (SVs), for example their repetitive content and accuracy scores.
Details on the results can be found in our publication or in the assemblies hosted here.
Structural variant-based pangenome construction has low sensitivity to variability of haplotype-resolved bovine assemblies
Alexander S. Leonard, Danang Crysnanto, Zih-Hua Fang, Michael P. Heaton, Brian L. Vander Ley, Carolina Herrera, Heinrich Bollwein, Derek M. Bickhart, Kristen L. Kuhn, Timothy P. L. Smith, Benjamin D. Rosen, Hubert Pausch
Nat Commun 13, 3012 (2022). https://doi.org/10.1038/s41467-022-30680-2
Many of the parameters are tuned for our data and for the ETH Euler cluster (for example, we use a forked version of the LSF Snakemake profile), so some modification may be needed for the pipelines to run smoothly in other contexts. Many tools are assumed to be available in $PATH; all of them are freely available.
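Since the pipelines expect tools to be on $PATH, a quick check before running can save a failed job. The snippet below is a minimal sketch and not part of this repository: the function name `missing_tools` and the executable names in `REQUIRED_TOOLS` are assumptions based on the tools mentioned above, so adjust the list to match your installation and the pipelines you intend to run.

```python
import shutil

# Assumed executable names for the tools mentioned in this README;
# edit this list to match your own installation.
REQUIRED_TOOLS = ["snakemake", "hifiasm", "canu", "shasta", "flye", "minigraph"]


def missing_tools(tools):
    """Return the subset of `tools` that cannot be found on $PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Not found in $PATH: " + ", ".join(missing))
    else:
        print("All listed tools were found in $PATH.")
```

The check only confirms that an executable with that name exists; it does not verify versions.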