Home
BinSanity contains a suite of scripts designed to cluster contigs generated from metagenomic assembly into putative genomes. What makes BinSanity unique is its use of Affinity Propagation (AP) and its biphasic approach. In the biphasic approach, contigs are first clustered by contig coverage and then refined using GC% and k-mer frequencies, which yields more complete bins than similar methods (see the BinSanity Publication).
Additionally, AP is a deterministic algorithm that does not require the number of clusters to be set in advance. Affinity Propagation has been shown to be more effective than methods such as k-means at clustering images of faces and at identifying regulated transcripts (check out the paper here).
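To make the two properties above concrete (no preset cluster count, and a deterministic result), here is a minimal pure-Python sketch of the standard Affinity Propagation message-passing updates. It is an illustration only, not BinSanity's own code: the function name, the toy coverage values, the negative squared distance similarity, and the preference of -10 are all assumptions chosen for the example.

```python
def affinity_propagation(S, damping=0.5, max_iter=200):
    """Return one exemplar index per item, given a square similarity matrix S.

    S[k][k] is the "preference" of item k to act as an exemplar.
    Illustrative sketch; needs at least two items.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(max_iter):
        # r(i,k) = s(i,k) - max over k' != k of [a(i,k') + s(i,k')]
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=scores.__getitem__)
            top = scores[best]
            runner_up = max(scores[k] for k in range(n) if k != best)
            for k in range(n):
                rival = runner_up if k == best else top
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k) for i' not in {i,k})
        # a(k,k) = sum of positive r(i',k) for i' != k
        for k in range(n):
            support = [max(0.0, R[i][k]) for i in range(n)]
            support[k] = R[k][k]  # self-responsibility enters unclipped
            total = sum(support)
            for i in range(n):
                if i == k:
                    new = total - R[k][k]
                else:
                    new = min(0.0, total - support[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        return list(range(n))  # fallback: every item stands alone
    return [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
            for i in range(n)]


# Hypothetical coverage values for six contigs (two obvious groups).
coverage = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
preference = -10.0  # assumed value; lower preferences give fewer clusters
S = [[preference if i == k else -(a - b) ** 2
      for k, b in enumerate(coverage)]
     for i, a in enumerate(coverage)]

labels = affinity_propagation(S)
print(labels)  # [1, 1, 1, 4, 4, 4]
```

The number of clusters is never passed in. It emerges from the similarities and the preference, and because the updates contain no random step, repeated runs on the same input give the same labels.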
See the Usage section for a detailed run-through of the new features in BinSanity v.0.2.7 and for more detailed descriptions of each script.
-
Binsanity
- BinSanity implements Affinity Propagation to cluster contigs into putative genomes using contig coverage as an input
-
Binsanity-refine
- Binsanity-refine incorporates tetranucleotide frequencies, GC%, and optionally the coverage profile
-
Binsanity-wf
- Binsanity-wf runs Binsanity and Binsanity-refine sequentially to optimize cluster results
-
Binsanity-profile
- Binsanity-profile uses featureCounts to produce the coverage profiles required by Binsanity, Binsanity-refine, and Binsanity-wf
-
Binsanity-lc
- Binsanity-lc is written for large metagenomic assemblies (e.g., >100,000 contigs) where Binsanity and Binsanity-refine become too memory intensive. It uses k-means to subset contigs based on coverage before implementing Binsanity (beta version)
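As a rough illustration of the composition features that Binsanity-refine draws on, the sketch below computes GC% and tetranucleotide frequencies for a single contig. The function names and the example sequence are hypothetical, and the choices made here (forward strand only, windows containing ambiguous bases skipped) are assumptions for the example rather than a description of how BinSanity computes these values.

```python
from itertools import product


def gc_percent(seq):
    """GC content as a percentage of unambiguous (A/C/G/T) bases."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def tetranucleotide_frequencies(seq):
    """Relative frequency of each of the 256 possible 4-mers.

    Assumption: only the forward strand is counted, and any window
    containing a non-ACGT character (such as N) is skipped.
    """
    seq = seq.upper()
    counts = {"".join(p): 0 for p in product("ACGT", repeat=4)}
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:
            counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        return {kmer: 0.0 for kmer in counts}
    return {kmer: c / total for kmer, c in counts.items()}


contig = "ATGCATGCNN"  # toy sequence, far shorter than a real contig
gc = gc_percent(contig)
freqs = tetranucleotide_frequencies(contig)
print(gc)             # 50.0
print(len(freqs))     # 256
print(freqs["ATGC"])  # 0.4 (2 of the 5 valid windows)
```

Each contig then becomes a 256-value frequency vector plus a GC% value, which can be compared between contigs in the same way coverage profiles are.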
Graham ED, Heidelberg JF, Tully BJ. (2017) BinSanity: unsupervised clustering of environmental microbial assemblies using coverage and affinity propagation. PeerJ 5:e3035 https://doi.org/10.7717/peerj.3035
Please reach out if there are any questions or comments.