ALIGaToR - Annotator of Loci for IG and T-cell Receptors

A pipeline for annotating genomic contigs from the IG and TR loci. The pipeline includes:

Extract: A parsing script that extracts gene, exon, and RSS name and corrdinates from reference annotations of choice of closely related species.
Predict: A prediction script calls submodule DnaGrep, that predicts RSS sequences based on genomic contigs.
Annotate: Annotator script that uses the extracted reference genome and genomic information to generate a search databse for blast. Blast hits are matched with predicted RSSs. Other scripts are called to check for start and stop codons, and splice sites.

Dependencies/Prerequisites

Python 3.6 or greater
Muscle
Blast+
BedTools

Getting Started

Clone the aligator repository:

	git clone https://github.com/scharch/aligator.git

Install required python packages:

	pip install -r aligator/requirements.txt

Set enviromental variable:

	export ALIGATOR_PATH=$(pwd)/aligator

Quick help:

	`aligator help`

Vignette annotating MF989451 from Ramesh et al Frontiers Immunology 2017:

Data is in aligator/sample_data.

First, get reference genome from IMGT:

	#Download BK063715 fasta file from https://imgt.org/ligmdb/view.action?format=FASTA&id=BK063715
	#Then create bedfile with reference annotations
	aligator extract https://imgt.org/ligmdb/view.action?id=BK063715 BK063715

Find possible RSS motifs in the target contig. For MF989451, the output should look the same as sample_data/MF989451.rss12_pred.bed and sample_data/MF989451.rss23_pred.bed:

	aligator predict $ALIGATOR_PATH/sample_data/MF989451.fa MF989451

Finally, annotate the target contig. For MF989451, the actual annotations provided by Ramesh et al are included as sample_data/MF989451.ground_truth.bed:

	aligator annotate $ALIGATOR_PATH/sample_data/MF989451.fa MF989451.RSS12.bed MF989451.RSS23.bed IGH BK063715.fasta BK063715.bed

Name		Name	Last commit message	Last commit date
Latest commit History 89 Commits
DnaGrep @ 18aadc3		DnaGrep @ 18aadc3
sample_data		sample_data
.gitignore		.gitignore
.gitmodules		.gitmodules
IGHJ.fa		IGHJ.fa
IGHV.fa		IGHV.fa
IGKJ.fa		IGKJ.fa
IGKV.fa		IGKV.fa
IGLJ.fa		IGLJ.fa
IGLV.fa		IGLV.fa
IMGThtmlParser.py		IMGThtmlParser.py
LICENSE		LICENSE
README.md		README.md
TRAJ.fa		TRAJ.fa
TRAV.fa		TRAV.fa
TRBJ.fa		TRBJ.fa
TRBV.fa		TRBV.fa
TRDJ.fa		TRDJ.fa
TRDV.fa		TRDV.fa
__init__.py		__init__.py
aligator		aligator
annotator.py		annotator.py
assignNames.py		assignNames.py
blast2bed.py		blast2bed.py
checkFunctionality.py		checkFunctionality.py
checkSplice.py		checkSplice.py
parseRSS.py		parseRSS.py
parseTRA.py		parseTRA.py
predictRSS.sh		predictRSS.sh
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ALIGaToR - Annotator of Loci for IG and T-cell Receptors

Dependencies/Prerequisites

Getting Started

Vignette annotating MF989451 from Ramesh et al Frontiers Immunology 2017:

About

Releases 1

Packages

Contributors 2

Languages

License

scharch/aligator

Folders and files

Latest commit

History

Repository files navigation

ALIGaToR - Annotator of Loci for IG and T-cell Receptors

Dependencies/Prerequisites

Getting Started

Vignette annotating MF989451 from Ramesh et al Frontiers Immunology 2017:

About

Resources

License

Stars

Watchers

Forks

Releases 1

Packages 0

Contributors 2

Languages

Packages