Initial content for readme
alancleary committed Sep 13, 2024
1 parent f6b9b4a commit 2180207
Showing 1 changed file with 51 additions and 1 deletion.
52 changes: 51 additions & 1 deletion README.md
@@ -1,4 +1,4 @@
# [Module-Name]
# Pangenomics
---------------------------------

## **Contents**
@@ -15,18 +15,68 @@

## **Overview**

This module will introduce you to (graphical) pangenomics and walk you through a pangenomics pipeline.
Specifically, you will learn how to build a pangenome graph, map reads to the graph, call variants on the mapped reads, and visualize the graph.
All analyses will be performed on the Google Cloud Platform.
The estimated cost for the complete module is $?


## **Background**

A *pangenome* is a collection of genomes from the same species.
Compared to a single reference genome, a pangenome is a less biased, more comprehensive representation of sequence conservation and variation within a population.
Although a pangenome can offer greater insight into the genetics and genomics of a species, analyzing it requires bioinformatics tools that differ from those typically used with a reference genome.
This module aims to introduce you to the idea of *pangenome graphs* and the bioinformatics tools used for their analysis.
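
To make the idea of a pangenome graph concrete, here is a minimal sketch in Python. It is an illustration only, not how PGGB or vg represent graphs internally: the `PangenomeGraph` class, the node sequences, and the genome names are all hypothetical. Nodes hold short sequences, and each genome is a path through the nodes, so shared sequence is stored once and variation appears as alternative routes.

```python
from dataclasses import dataclass, field


@dataclass
class PangenomeGraph:
    """Toy sequence graph (hypothetical, for illustration only)."""
    nodes: dict = field(default_factory=dict)   # node id -> DNA sequence
    edges: set = field(default_factory=set)     # (from node id, to node id)
    paths: dict = field(default_factory=dict)   # genome name -> list of node ids

    def add_path(self, name, node_ids):
        """Record a genome as a walk through the graph, adding the edges it uses."""
        for a, b in zip(node_ids, node_ids[1:]):
            self.edges.add((a, b))
        self.paths[name] = list(node_ids)

    def spell(self, name):
        """Reconstruct a genome's sequence by concatenating the nodes on its path."""
        return "".join(self.nodes[n] for n in self.paths[name])

    def shared_nodes(self):
        """Nodes visited by every genome, i.e. sequence conserved across all paths."""
        visited = [set(p) for p in self.paths.values()]
        return set.intersection(*visited) if visited else set()


# Two made-up genomes that differ by a single base (T vs. A) in the middle.
graph = PangenomeGraph(nodes={1: "ACG", 2: "T", 3: "A", 4: "GGC"})
graph.add_path("genome_A", [1, 2, 4])
graph.add_path("genome_B", [1, 3, 4])

print(graph.spell("genome_A"))   # ACGTGGC
print(graph.spell("genome_B"))   # ACGAGGC
print(graph.shared_nodes())      # {1, 4}
```

Nodes 1 and 4 are shared by both genomes, while nodes 2 and 3 form a "bubble" that encodes the variant. Real pangenome graphs follow the same principle at a much larger scale.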


## **Before Starting**

This module is designed to run on the Google Cloud Platform (GCP).
Follow the instructions below to prepare to run the module on GCP.

<details>

<summary>Setting up GCP</summary>

* Create a Google Cloud account
* Create a Google Cloud project
* Enable billing for your Google Cloud project
* Create a Vertex AI Workbench instance
* Clone this repository into your Workbench instance

</details>


## **Getting Started**

To begin, we must understand how this repository is organized.
```
└── module_notebooks/
   ├── 00-environment-setup.ipynb
   ├── 01-intro-to-pangenomics.ipynb
   ├── 02-building-graphs-with-pggb.ipynb
   ├── 03-indexing-graphs-with-vg.ipynb
   ├── 04-read-mapping-with-vg.ipynb
   ├── 05-variant-calling-with-vg.ipynb
   ├── 06-searching-graphs-with-blast.ipynb
   └── 07-visualization.ipynb
```

`module_notebooks/` contains Jupyter notebooks, one for each submodule.
To open a notebook, simply double-click on it in your Workbench instance.
To begin this module, open the `00-environment-setup.ipynb` notebook.
This notebook will introduce you to Jupyter notebooks and instruct you on how to install the software for this module.
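
The zero-padded prefixes in the notebook names give the intended order of the submodules. As a small sketch, assuming you run it from the root of the cloned repository, the following lists the notebooks in that order. The `list_notebooks` helper is hypothetical and not part of this module.

```python
from pathlib import Path


def list_notebooks(directory="module_notebooks"):
    """Return notebook file names sorted by their numeric prefix (00-, 01-, ...)."""
    folder = Path(directory)
    if not folder.is_dir():
        # Not in the repository root (or the folder is missing): nothing to list.
        return []
    # Zero-padded prefixes mean plain alphabetical order is also submodule order.
    return sorted(p.name for p in folder.glob("*.ipynb"))


for name in list_notebooks():
    print(name)
```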


## **Software Requirements**

The following software is required for this module and will be installed as part of the `00-environment-setup.ipynb` submodule:

* [PGGB](https://github.com/pangenome/pggb)
* [vg](https://github.com/vgteam/vg)
* [BLAST](https://www.ncbi.nlm.nih.gov/books/NBK569861/)
* [Bandage](https://rrwick.github.io/Bandage/)
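
For orientation, here is a rough sketch of how these tools might be chained for the build, map, call, and visualize steps. It only assembles command lines and does not run anything. The `pipeline_commands` helper, all file names, and the choice of subcommands and flags are assumptions based on each tool's commonly documented usage. The notebooks may use different commands, so check each tool's `--help` before relying on them. BLAST is left out because this README does not say how it is applied to graphs.

```python
import shlex
from pathlib import Path


def pipeline_commands(fasta, reads, gfa, n_haplotypes, out_dir="results", threads=4):
    """Build (but do not execute) one command per pipeline step.

    Each step is a dict with "cmd" (argv list) and "stdout" (file to redirect
    into, or None). All paths here are hypothetical placeholders.
    """
    out = Path(out_dir)
    prefix = str(out / "index")
    gbz = prefix + ".giraffe.gbz"        # assumed name of the autoindex output
    gam = str(out / "mapped.gam")
    pack = str(out / "mapped.pack")
    t = str(threads)
    return [
        # 1. Build a graph from a FASTA of assemblies (PGGB names its own output GFA,
        #    so the GFA path is passed in separately as `gfa`).
        {"cmd": ["pggb", "-i", fasta, "-o", str(out / "pggb"), "-n", str(n_haplotypes), "-t", t],
         "stdout": None},
        # 2. Index the graph for read mapping.
        {"cmd": ["vg", "autoindex", "--workflow", "giraffe", "-g", gfa, "-p", prefix, "-t", t],
         "stdout": None},
        # 3. Map reads to the graph; alignments are written to stdout.
        {"cmd": ["vg", "giraffe", "-Z", gbz, "-f", reads, "-t", t], "stdout": gam},
        # 4. Summarize read support, then call variants from it.
        {"cmd": ["vg", "pack", "-x", gbz, "-g", gam, "-o", pack], "stdout": None},
        {"cmd": ["vg", "call", gbz, "-k", pack], "stdout": str(out / "calls.vcf")},
        # 5. Render the graph as an image.
        {"cmd": ["Bandage", "image", gfa, str(out / "graph.png")], "stdout": None},
    ]


# Print the commands for a made-up input of four haplotypes.
for step in pipeline_commands("genomes.fa.gz", "reads.fq", "graph.gfa", n_haplotypes=4):
    line = shlex.join(step["cmd"])
    if step["stdout"]:
        line += " > " + shlex.quote(step["stdout"])
    print(line)
```

Keeping the steps as data makes it easy to inspect or adjust a command before running it, for example with `subprocess.run(step["cmd"], check=True)`.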


## **Architecture Design**
