Amino-Acid k-mer tools for creating, searching, and analyzing phylogenetic signatures from genomes or reads of DNA.
A 64-bit Python 3.4 or greater is required. 8 GB or more of memory is recommended.
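One way to confirm that an interpreter meets the requirement above is a short check like the following. This is only a sketch: the function name `meets_requirements` is hypothetical and not part of aakbar, and it does not check the recommended memory size.

```python
import struct
import sys


def meets_requirements(version_info=sys.version_info,
                       pointer_bits=struct.calcsize("P") * 8):
    """Return True for a 64-bit interpreter running Python 3.4 or greater.

    Hypothetical helper, not part of aakbar. The pointer size reported by
    struct is 64 bits on a 64-bit build and 32 bits on a 32-bit build.
    """
    return tuple(version_info[:2]) >= (3, 4) and pointer_bits == 64


if __name__ == "__main__":
    print("Interpreter OK" if meets_requirements() else "Interpreter too old or not 64-bit")
```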
The Python dependencies of aakbar are: biopython, click>=5.0, click_plugins, numpy, pandas, pyfaidx, and pyyaml. Running the examples also requires the pyfastaq <https://pypi.python.org/pypi/pyfastaq> package.
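To see which of the listed dependencies are missing before installing, a check along these lines may help. It is a sketch rather than anything shipped with aakbar: the names `REQUIRED` and `missing_packages` are hypothetical, and it only tests whether each package can be found, not whether click is at version 5.0 or later. Note that two packages are imported under a different name than they are installed under (biopython as `Bio`, pyyaml as `yaml`).

```python
import importlib.util

# Map of distribution name -> import name (hypothetical helper table).
# biopython and pyyaml are imported under different names than they are installed.
REQUIRED = {
    "biopython": "Bio",
    "click": "click",
    "click_plugins": "click_plugins",
    "numpy": "numpy",
    "pandas": "pandas",
    "pyfaidx": "pyfaidx",
    "pyyaml": "yaml",
}


def missing_packages(required=REQUIRED):
    """Return a sorted list of distribution names whose module cannot be found."""
    return sorted(
        name for name, module in required.items()
        if importlib.util.find_spec(module) is None
    )


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing: " + ", ".join(missing) if missing else "All dependencies found")
```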
If you don't have a Python installation that meets these requirements, I recommend getting Anaconda Python <https://www.continuum.io/downloads> on MacOSX and Windows for its smooth installation and pre-installed packages. Once Anaconda Python is installed, you can get the dependencies like this on MacOSX:
export PATH=~/anaconda/bin:${PATH} # you might want to put this in your .profile
conda install click
conda install --channel https://conda.anaconda.org/IOOS click-plugins
conda install --channel https://conda.anaconda.org/bioconda pyfaidx
conda install --channel https://conda.anaconda.org/bioconda pyfastaq
This package is tested under Linux and MacOS using Python 3.5 and is available from PyPI. To install via pip (or pip3 under some distributions):
pip install aakbar
If you wish to develop aakbar, download a release and run this in the top-level directory:
pip install --editable .
If you wish to have pip install directly from git, use this command:
pip install git+https://github.com/ncgr/aakbar.git
Installation puts a single script called aakbar in your path. The usage format is:
aakbar [GLOBALOPTIONS] COMMAND [COMMANDOPTIONS] [ARGS]
A listing of commands is available via aakbar --help. Currently available commands are:
calculate-peptide-terms | Write peptide terms and histograms. |
conserved-signature-stats | Stats on signatures found in all input genomes. |
define-set | Define an identifier and directory for a set. |
define-summary | Define summary directory and label. |
demo-simplicity | Demo self-provided simplicity outputs. |
filter-peptide-terms | Remove high-simplicity terms. |
init-config-file | Initialize a configuration file. |
install-demo-scripts | Copy demo scripts to the current directory. |
intersect-peptide-terms | Find intersecting terms from multiple sets. |
label-set | Define label associated with a set. |
peptide-simplicity-mask | Lower-case high-simplicity regions in FASTA. |
search-peptide-occurrances | Find signatures in peptide space. |
set-simplicity-window | Define size of window used in simplicity calcs. |
set-plot-type | Define label associated with a set. |
set-simplicity-type | Select function used in simplicity calculation. |
show-config | Print location and contents of config file. |
show-context-object | Print the global context object. |
test-logging | Logs at different severity levels. |
Bash scripts that implement examples for calculating and using signature sets for Firmicutes and Streptococcus, complete with downloading data from GenBank, will be created in the (empty) current working directory when you issue the command:
aakbar install-demo-scripts
On Linux and MacOS, follow the instructions to run the demos. On Windows, you will need bash installed for the scripts to work.
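Because the demo scripts are written into the current working directory, which should be empty, it can be convenient to create a fresh directory first. The sketch below does that from Python and then runs the install-demo-scripts command there. The function name `install_demo_scripts` and the directory prefix are hypothetical; the only aakbar command used is the one given above, and the helper returns None rather than failing when aakbar or bash is not on the path.

```python
import shutil
import subprocess
import tempfile


def install_demo_scripts(executable="aakbar"):
    """Run `<executable> install-demo-scripts` in a new, empty directory.

    Hypothetical helper, not part of aakbar. Returns the directory path,
    or None if the executable or bash cannot be found on the path.
    """
    if shutil.which(executable) is None or shutil.which("bash") is None:
        return None
    workdir = tempfile.mkdtemp(prefix="aakbar_demo_")  # guaranteed to be empty
    subprocess.run([executable, "install-demo-scripts"], cwd=workdir, check=True)
    return workdir


if __name__ == "__main__":
    demo_dir = install_demo_scripts()
    print(demo_dir or "aakbar or bash not found on the path")
```

After it runs, change into the printed directory and follow the instructions that aakbar prints to run the demos.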
In addition to pyfastaq, two tools that you will probably find helpful in working with aakbar are alphabetsoup <https://github.com/ncgr/alphabetsoup> for sanitizing input FASTA files and tsv-utils <https://github.com/eBay/tsv-utils/> for filtering output TSV files.