This repository is a work in progress!
These are the scripts I use to analyze shotgun metagenomes. They can be adapted for amplicon reads, single-end reads, etc.
I currently run them on a high-performance computing cluster that uses the Slurm workload manager.
- bbduk_loop.sh
- bbnorm_loop.sh
- bbmerge_loop.sh
- Spades_Mgm_Error_Correction.sh
- Spades_mgm_Assembly.sh
- can be used for non-metagenomic genome assembly by changing metaspades.py to spades.py and removing the --meta argument
- megahit_loop_hpcc.sh (for running on Slurm cluster system)
- megahit_loop_local.sh (for running locally)
- both MEGAHIT scripts could be used for non-metagenomic genome assembly, but please refer to the MEGAHIT manual for available arguments & syntax.
- Currently metaSPAdes can only handle one library at a time, whereas MEGAHIT can do co-assembly
- bwa_map_to_contigs.sh
- creates a SAM file, which is then converted to a BAM file with Samtools
- After sorting and indexing the BAM file (Samtools needs a coordinate-sorted BAM before it can index it), this script uses Samtools to generate alignment and mapping statistics
- metaBAT_contig_binning_loop_hpcc.sh (for running on Slurm cluster system)
- metaBAT_contig_binning_loop.sh (for running locally)
- Checkm_QA_MAGs.sh
- Check genome completeness, contamination & basic taxonomy
- metaQUAST_Compare_mgm_Assemblies.sh
- Compare genome assemblies, completeness, contamination, & basic taxonomy
- Parse_QA_Results.sh
- separate genome bins based on completeness and contamination percentages
- Examine_MGM_Quality.R
- visually examine genome bins based on completeness, contamination, and other stats (based on CheckM output)
- Taxonomic Annotation
- TBD
- Functional Annotation
- TBD
- PhyloFlash_16S_ID.sh (for running on cluster system)
- assembles metagenome short reads into candidate SSU rRNA genes (16S, 18S) for taxonomic identification
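
As a rough illustration of the mapping steps listed under bwa_map_to_contigs.sh above (SAM, then BAM, then sort, index, and stats), here is a minimal sketch. It is not the script from this repository: the function names, the file names, and the DRY_RUN switch are all hypothetical, and it assumes bwa (0.7.17 or later, for the -o option) and samtools are on your PATH.

```shell
#!/usr/bin/env bash
# Sketch of the bwa + samtools mapping steps; NOT the repository's actual script.
# Function names, file names, and the DRY_RUN switch are illustrative assumptions.
set -euo pipefail

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: map_reads_to_contigs <contigs.fasta> <reads_R1.fastq> <reads_R2.fastq> <output_prefix>
map_reads_to_contigs() {
    local contigs="$1" r1="$2" r2="$3" prefix="$4"

    # Build the BWA index for the assembled contigs.
    run bwa index "$contigs"
    # Align paired-end reads to the contigs and write a SAM file.
    run bwa mem -o "${prefix}.sam" "$contigs" "$r1" "$r2"
    # Convert SAM to BAM.
    run samtools view -b -o "${prefix}.bam" "${prefix}.sam"
    # Sort by coordinate first; samtools index requires a sorted BAM.
    run samtools sort -o "${prefix}.sorted.bam" "${prefix}.bam"
    run samtools index "${prefix}.sorted.bam"
    # Print summary alignment statistics to stdout.
    run samtools flagstat "${prefix}.sorted.bam"
}

# Run only when exactly four arguments are supplied on the command line.
if [ "$#" -eq 4 ]; then
    map_reads_to_contigs "$@"
fi
```

Saved under a hypothetical name such as map_sketch.sh, running `DRY_RUN=1 bash map_sketch.sh contigs.fasta R1.fastq R2.fastq sample` prints the six commands without executing them, which is a cheap way to check paths before submitting a job to Slurm.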