A small module for easy access to BLOSUM matrices without dependencies.
This module is now part of the bfx suite. See https://py-bfx.readthedocs.io for more information.
The BLOcks SUbstitution Matrices (BLOSUM) are used to score alignments between protein sequences and are therefore mainly used in bioinformatics.
Reading such matrices is not particularly difficult, yet most off-the-shelf packages are overloaded with strange dependencies. And why implement the same reader again if there is a simple module for that?
blosum offers a robust and easy-to-expand implementation without relying on third-party libraries.
Using pip / pip3:
pip install blosum
Or from source:
git clone git@github.com:not-a-feature/blosum.git
cd blosum
pip install .
Or with conda:
conda install blosum
This package provides the most commonly used BLOSUM matrices. You can choose from BLOSUM 45, 50, 62, 80 and 90.
To load a matrix:
import blosum as bl
matrix = bl.BLOSUM(62)
val = matrix["A"]["Y"]
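As a sketch of how such lookups are typically used, the hypothetical helper below (score_ungapped is not part of the package) sums the pairwise scores over an ungapped alignment of two equal-length sequences. It accepts any nested mapping indexed as matrix[a][b]. The small dict reuses the four-letter matrix from the format example in this README, so the snippet runs without the package installed.

```python
def score_ungapped(seq_a, seq_b, matrix):
    """Sum the substitution scores of two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(matrix[a][b] for a, b in zip(seq_a, seq_b))


# Toy four-letter matrix, values taken from the format example in this README.
toy = {
    "A": {"A": 5, "R": -2, "N": -1, "D": -2},
    "R": {"A": -2, "R": 7, "N": 0, "D": -1},
    "N": {"A": -1, "R": 0, "N": 6, "D": 2},
    "D": {"A": -2, "R": -1, "N": 2, "D": 7},
}

# A/A = 5, R/R = 7, N/D = 2, D/N = 2
total = score_ungapped("ARND", "ARDN", toy)
print(total)  # 16
```

With the package installed, passing bl.BLOSUM(62) in place of toy should work the same way, since the loaded matrix supports matrix[a][b] lookups.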
In addition, custom matrices can be loaded by passing the path to the matrix file as the argument.
import blosum as bl
matrix = bl.BLOSUM("path/to/blosum.file")
val = matrix["A"]["Y"]
The matrices are required to have the following format:
# Comments should start with #
# Each value should be separated by one or more whitespace characters
A R N D
A 5 -2 -1 -2
R -2 7 0 -1
N -1 0 6 2
D -2 -1 2 7
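To make the format concrete, here is a minimal sketch of a reader for it. This is not the package's actual implementation: the function name parse_matrix, the conversion of scores to float, and the length check are assumptions made for illustration.

```python
from collections import defaultdict


def parse_matrix(text, default=float("-inf")):
    """Parse a whitespace-separated substitution matrix into nested defaultdicts."""
    matrix = defaultdict(lambda: defaultdict(lambda: default))
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()  # splits on one or more whitespace characters
        if header is None:
            header = fields  # first data line lists the column labels
            continue
        row_label, values = fields[0], fields[1:]
        if len(values) != len(header):
            raise ValueError(
                f"row {row_label!r} has {len(values)} values, expected {len(header)}"
            )
        for col_label, value in zip(header, values):
            matrix[row_label][col_label] = float(value)
    return matrix


example = """\
# Comments should start with #
A R N D
A 5 -2 -1 -2
R -2 7 0 -1
N -1 0 6 2
D -2 -1 2 7
"""

m = parse_matrix(example)
print(m["N"]["D"])  # 2.0
```

The first non-comment line is treated as the column header, and every following line as a row label followed by one score per column.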
Once loaded, the matrix behaves like a defaultdict.
To get a value use:
val = matrix["A"]["Y"]
To get a defaultdict of the row with a given key use:
val_dict = matrix["A"]
If the key cannot be found, the default value float("-inf") is returned.
It is possible to set a custom default score:
matrix = bl.BLOSUM(62, default=0)
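The choice of default matters as soon as scores are summed. The sketch below mimics the documented fallback behavior using only collections.defaultdict from the standard library. It is not the package's own code, and make_matrix and the two-letter scores dict are hypothetical names introduced for illustration.

```python
from collections import defaultdict


def make_matrix(scores, default):
    """Wrap a nested dict so that unknown row or column keys yield `default`."""
    matrix = defaultdict(lambda: defaultdict(lambda: default))
    for row, cols in scores.items():
        matrix[row].update(cols)
    return matrix


scores = {"A": {"A": 5, "R": -2}, "R": {"A": -2, "R": 7}}

strict = make_matrix(scores, default=float("-inf"))
lenient = make_matrix(scores, default=0)

pairs = [("A", "A"), ("R", "X")]  # "X" is not in the matrix
strict_total = sum(strict[a][b] for a, b in pairs)
lenient_total = sum(lenient[a][b] for a, b in pairs)
print(strict_total, lenient_total)  # -inf 5
```

With float("-inf") as the default, a single unknown residue drives the whole sum to negative infinity, which makes unexpected characters easy to spot. A default of 0 ignores them silently instead.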
Copyright (C) 2023 by Jules Kreuer - @not_a_feature
This piece of software is published under the GNU General Public License v3.0. TLDR:
Permissions | Conditions | Limitations |
---|---|---|
✓ Commercial use | Disclose source | ✕ Liability |
✓ Distribution | License and copyright notice | ✕ Warranty |
✓ Modification | Same license | |
✓ Patent use | State changes | |
✓ Private use | | |
Go to LICENSE.md to see the full version.