Audience | Computational Skills | Prerequisites | Duration |
---|---|---|---|
Biologists | Intermediate bash | Introduction to the command-line interface | 3 hour workshop (~3 hours of trainer-led time) |
This repository has teaching materials for a 3-hour, hands-on workshop, "Exploring genomic variants using GEMINI", led at a relaxed pace.
Exome-seq and whole-genome sequencing (WGS) experiments produce large VCF (Variant Call Format) files describing the variants (SNPs, indels, etc.) present in the dataset. GEMINI (GEnome MINIng) is a framework that helps turn VCF files with millions of rows and thousands of columns into simple, easily accessible databases. Within the database, GEMINI annotates variants with publicly available information, including ENCODE, OMIM, and dbSNP, plus internal annotations such as regions of interest and candidate genes. The resulting database supports interactive exploration of the variants in the context of both known annotations and sample information, to rapidly get to the biology at play.
- Using annotation information to identify important variants
- Creating a GEMINI database using VCF files
- Exploring how to query the GEMINI database
- Extracting important variants based on characteristics, such as severity, type, location, etc.
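
To make the last objective concrete, here is a minimal sketch of the kind of filtering the workshop covers. It does not use GEMINI itself: it builds a toy in-memory SQLite table as a stand-in for a GEMINI database (which is an SQLite file). The column names (`chrom`, `start`, `end`, `type`, `gene`, `impact_severity`) are modelled on GEMINI's `variants` table, but the rows, the gene names, and the `select_variants` helper are made up for illustration.

```python
import sqlite3

# Toy stand-in for a GEMINI database. A real one is an SQLite file built by
# GEMINI from a VCF, with a much wider `variants` table than this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE variants (
        chrom TEXT, start INTEGER, "end" INTEGER,
        ref TEXT, alt TEXT, type TEXT,
        gene TEXT, impact_severity TEXT
    )
    """
)

# Made-up rows for illustration only; these are not real variant calls.
rows = [
    ("chr1", 1000, 1001, "A", "G", "snp", "GENE_A", "HIGH"),
    ("chr1", 2000, 2003, "CTT", "C", "indel", "GENE_B", "LOW"),
    ("chr2", 5000, 5001, "C", "T", "snp", "GENE_C", "MED"),
    ("chr2", 7000, 7001, "G", "GA", "indel", "GENE_D", "HIGH"),
]
conn.executemany("INSERT INTO variants VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)


def select_variants(conn, severity=None, variant_type=None, chrom=None):
    """Hypothetical helper: return (chrom, start, end, gene) for variants
    that match every filter that was supplied."""
    clauses, params = [], []
    filters = (
        ("impact_severity", severity),
        ("type", variant_type),
        ("chrom", chrom),
    )
    for column, value in filters:
        if value is not None:
            clauses.append(f"{column} = ?")
            params.append(value)
    where = " WHERE " + " AND ".join(clauses) if clauses else ""
    sql = (
        'SELECT chrom, start, "end", gene FROM variants'
        + where
        + " ORDER BY chrom, start"
    )
    return conn.execute(sql, params).fetchall()


# Filter by severity alone, then by severity and variant type together.
high_impact = select_variants(conn, severity="HIGH")
high_indels = select_variants(conn, severity="HIGH", variant_type="indel")
print(high_impact)
print(high_indels)
```

In the workshop itself, filters like these are expressed as queries against the GEMINI database rather than through a Python helper; the sketch only shows how severity, type, and location combine into a single selection.
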
These materials were developed for a trainer-led workshop, but are also amenable to self-guided learning.
Lessons | Estimated Duration |
---|---|
Exploring genomic variants using GEMINI | 75 min |
Mac users: No installation requirements.
Windows users: Git Bash