Possibility to filter alignment results on coverage between HMM and target #27
Comments
Hi @jpjarnoux, if I'm not mistaken, this option doesn't exist in the original HMMER either, because of the difficulty of computing coverage for an alignment that can contain more than one domain. I like this example (from the documentation examples):
Hi @jpjarnoux, you can get the length of each HMM from its consensus sequence:

    import pyhmmer

    # Map each HMM name to its length (number of consensus positions).
    hmm_lengths = {}
    with pyhmmer.plan7.HMMFile("Pfam-A.h3m") as hmm_file:
        for hmm in hmm_file:
            hmm_lengths[hmm.name] = len(hmm.consensus)

Then:

    # `hit` is a single hit taken from the search results.
    # "." in the HMM line marks insert columns, which are not model positions.
    alignment = hit.best_domain.alignment
    n_aligned_positions = len(alignment.hmm_sequence) - alignment.hmm_sequence.count(".")
    hmm_coverage = n_aligned_positions / hmm_lengths[alignment.hmm_name]

As @althonos said, this is an oversimplification because a hit can contain multiple domains. But it can be pretty useful for HMMs of full-length proteins.
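To illustrate the multi-domain caveat, here is a minimal sketch of one possible way to combine several domains: merge their HMM coordinate ranges and count each model position once. The function name `union_coverage` is hypothetical and not part of pyhmmer, and this only considers the start and end of each range, so gaps inside an alignment are ignored.

```python
def union_coverage(intervals, model_length):
    """Fraction of model positions covered by at least one interval.

    `intervals` holds 1-based, inclusive (start, end) pairs, the same
    convention as the hmm_from / hmm_to coordinates of an alignment.
    """
    if model_length <= 0:
        raise ValueError("model_length must be positive")
    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end + 1:
            # Disjoint from the running interval: flush it and start a new one.
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the running interval.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return covered / model_length


# Three domains on a 100-position model; the first two overlap.
example_coverage = union_coverage([(1, 40), (30, 60), (81, 100)], 100)
```

With pyhmmer, the intervals could presumably be built as `[(d.alignment.hmm_from, d.alignment.hmm_to) for d in hit.domains.included]`, using the attributes mentioned in this thread.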
Sorry, I forgot to reply.
A less dumb approach:

    import pyhmmer

    def get_hmm_coverage(domain):
        # hmm_from and hmm_to are 1-based and inclusive, hence the + 1.
        n_aligned_positions = domain.alignment.hmm_to - domain.alignment.hmm_from + 1
        return n_aligned_positions / domain.alignment.hmm_length

    # `seqs` is the set of target sequences, loaded beforehand.
    with pyhmmer.plan7.HMMFile("Pfam-A.h3m") as hmm_file:
        for hits in pyhmmer.hmmsearch(hmm_file, seqs, bit_cutoffs="gathering"):
            for hit in hits:
                for domain in hit.domains.included:
                    hmm_coverage = get_hmm_coverage(domain)
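For the filtering part of the question, the same idea can be applied to both sides of the alignment. The sketch below is only an illustration: `span_coverage` and `passes_coverage` are hypothetical helpers, the two threshold parameters are simply named after the PADLOC-DB columns, and it assumes the alignment object also exposes `target_from`, `target_to` and `target_length` next to the `hmm_*` attributes used above. A `SimpleNamespace` stands in for a real alignment so the snippet runs without a database.

```python
from types import SimpleNamespace


def span_coverage(start, end, length):
    """Fraction of `length` spanned by the 1-based inclusive range start..end."""
    # abs() keeps the span positive if the coordinates are reported in reverse.
    return (abs(end - start) + 1) / length


def passes_coverage(alignment, min_hmm_coverage, min_target_coverage):
    """Return True if both HMM and target coverage reach their thresholds."""
    hmm_cov = span_coverage(
        alignment.hmm_from, alignment.hmm_to, alignment.hmm_length
    )
    target_cov = span_coverage(
        alignment.target_from, alignment.target_to, alignment.target_length
    )
    return hmm_cov >= min_hmm_coverage and target_cov >= min_target_coverage


# Stand-in for a real alignment: 80 of 100 HMM positions, 150 of 300 residues.
example = SimpleNamespace(
    hmm_from=11, hmm_to=90, hmm_length=100,
    target_from=101, target_to=250, target_length=300,
)
keep = passes_coverage(example, min_hmm_coverage=0.8, min_target_coverage=0.5)
```

In the loop above, this would amount to keeping a domain only when `passes_coverage(domain.alignment, ...)` is true, with the thresholds read from the metadata file.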
Hi!
I was wondering whether it is possible to report the alignment coverage of the HMM and of the target. I'm using PADLOC-DB as the HMM database, and I noticed that its metadata file includes `hmm.coverage.threshold` and `target.coverage.threshold` for filtering results. I looked for these values in the Hit object, but they're not there, and I couldn't find them in the documentation. Did I miss something?
Thanks