Skip to content

Commit

Permalink
pairwise_alignment_map: Update to choose alignment with smallest numb…
Browse files Browse the repository at this point in the history
…er of gaps
  • Loading branch information
avivrosenberg committed Jun 2, 2024
1 parent f2b0399 commit 9e06d37
Showing 1 changed file with 10 additions and 1 deletion.
11 changes: 10 additions & 1 deletion src/pp5/align.py
Original file line number Diff line number Diff line change
Expand Up @@ -68,6 +68,8 @@ def pairwise_alignment_map(
Aligns between two AA sequences and produces a map from the indices of
the source sequence to the indices of the target sequence.
Uses biopython's PairwiseAligner with BLOSUM80 matrix.
In case there is more than one alignment option, the one with the highest score
and smallest number of gaps will be chosen.
:param src_seq: Source AA sequence to align.
:param tgt_seq: Target AA sequence to align.
Expand All @@ -91,8 +93,15 @@ def pairwise_alignment_map(
tgt_seq = tgt_seq.replace(unk_aa, UNKNOWN_AA)
src_seq = src_seq.replace(unk_aa, UNKNOWN_AA)

# The aligner returns multiple alignment options
multi_alignments = aligner.align(src_seq, tgt_seq)
alignment = sorted(multi_alignments, key=lambda a: a.score)[-1]

# Choose alignment with maximal score and minimum number of gaps
def _align_sort_key(a: Alignment) -> Tuple[int, int]:
_n_gaps = a.coordinates.shape[1]
return -a.score, _n_gaps

alignment = sorted(multi_alignments, key=_align_sort_key)[0]

# Alignment contains two tuples each of length N (for N matching sub-sequences)
# (
Expand Down

0 comments on commit 9e06d37

Please sign in to comment.