Pangenomics

Overview

This module will introduce you to (graphical) pangenomics and walk you through a pangenomics pipeline. Specifically, you will learn how to build a pangenome graph, map reads to the graph, call variants on the mapped reads, and visualize the graph. All analyses will be performed on the Google Cloud Platform. The estimated cost for the complete module is $?

Background

A pangenome is a collection of genomes from the same species. Compared to a reference genome, a pangenome is a less biased, more comprehensive representation of sequence conservation and variation within a population. Although a pangenome may provide greater insight into the genetic and genomic nature of a species, analyzing it requires bioinformatics tools that differ from those typically used with a single reference genome. This module introduces pangenome graphs and the bioinformatics tools used to analyze them.
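To make the idea of a pangenome graph concrete, here is a minimal sketch in Python. It is a toy illustration only: the node IDs, sequences, genome names, and helper functions are all made up for this example, and it is not the data format or API used by the tools in this module. Shared sequence is stored once as a node, each genome is a path through the nodes, and variation appears as nodes that only some paths visit.

```python
# Toy pangenome graph. All names and sequences are hypothetical.
# Nodes hold sequence fragments; edges say which nodes may follow each other.
nodes = {1: "ACG", 2: "T", 3: "G", 4: "CCA"}
edges = {(1, 2), (1, 3), (2, 4), (3, 4)}

# Each genome in the collection is a walk (path) through the graph.
paths = {
    "genome_A": [1, 2, 4],
    "genome_B": [1, 3, 4],
}


def spell(path):
    """Reconstruct a genome's sequence by concatenating its node sequences."""
    return "".join(nodes[node_id] for node_id in path)


def is_valid(path):
    """Check that every consecutive pair of nodes in the path is an edge."""
    return all((a, b) in edges for a, b in zip(path, path[1:]))


def variable_nodes():
    """Return the nodes that are not visited by every genome."""
    shared = set.intersection(*(set(p) for p in paths.values()))
    return sorted(set(nodes) - shared)


for name, path in paths.items():
    assert is_valid(path)
    print(name, spell(path))

print("variable nodes:", variable_nodes())
```

In this sketch the two genomes differ at a single position, which shows up as two alternative nodes (2 and 3) between the shared flanking nodes (1 and 4). A single linear reference would have to pick one of the two alternatives.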

Before Starting

This module is designed to run on the Google Cloud Platform (GCP). Follow the instructions below to prepare to run the module on GCP.

Setting up GCP

See the Vertex AI Quickstart instructions for details on steps 1-5.

1. Create a Google Cloud account
2. Create a Google Cloud project
3. Enable billing for your Google Cloud project
4. Create a Vertex AI Workbench instance
5. Click "OPEN JUPYTERLAB" on your instance to open JupyterLab
6. Clone this repository into JupyterLab

Installing Software

All software for this module is installed via Conda. To set up the module's Conda environment and install all the software, open a Terminal in JupyterLab (File -> New Launcher -> Terminal) and run the following command:

bash -i ./NIGMS-Sandbox-Pangenomics-Module/scripts/0-setup.sh

After the command completes, close the terminal and refresh the JupyterLab window in your web browser. There should now be a new kernel in the launcher called "conda-nigms-pangenomics". This is the kernel you should use with every notebook in the module.

Getting Started

To begin, we must understand how this repository is organized.

└── module_notebooks/
    ├── 00-environment-setup.ipynb
    ├── 01-intro-to-pangenomics.ipynb
    ├── 02-building-graphs-with-pggb.ipynb
    ├── 03-indexing-graphs-with-vg.ipynb
    ├── 04-read-mapping-with-vg.ipynb
    ├── 05-variant-calling-with-vg.ipynb
    ├── 06-searching-graphs-with-blast.ipynb
    └── 07-visualization.ipynb

module_notebooks/ contains Jupyter notebooks - one for each submodule. To open a notebook, simply double-click on it in your Workbench instance. To begin this module, open the 00-environment-setup.ipynb notebook. This notebook will introduce you to Jupyter notebooks and instruct you on how to install the software for this module.
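Because the notebook filenames begin with a two-digit prefix, sorting them by name gives the intended reading order. The following is a small sketch, assuming it is run from the root of the cloned repository so that the relative path module_notebooks/ resolves; the function name notebooks_in_order is made up for this example.

```python
from pathlib import Path


def notebooks_in_order(notebook_dir="module_notebooks"):
    """Return notebook filenames sorted by name (and so by numeric prefix)."""
    folder = Path(notebook_dir)
    if not folder.is_dir():
        # Assumption: an empty list is a reasonable result when run elsewhere.
        return []
    return sorted(p.name for p in folder.glob("*.ipynb"))


for name in notebooks_in_order():
    print(name)
```

If the directory is not found, for example because the code is run from a different working directory, the function returns an empty list and nothing is printed.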

Software Requirements

The software required for this module is installed as part of the 00-environment-setup.ipynb submodule.

ncgr/NIGMS-Sandbox-Pangenomics-Module
