Allow all ranking stats to be computed on individual CollapsedTrees #109

Open
willdumm opened this issue Oct 13, 2022 · 1 comment

@willdumm

Since ctrees are ete trees, and some ranking stats (e.g. mutability parsimony) are only implemented for history DAGs, it's difficult to match ranking stats to individual collapsed trees extracted from a parsimony forest.

There are two options for fixing this. The elegant but inefficient way (which is also not backwards-compatible with older pickled trees) is to store the original history on each ctree object, so that optimal_weight_annotate kwargs may be used to compute any stats of interest.

The more practical way would be to implement a ctree method to compute each ranking stat, directly from the ete tree.

@willdumm

For example, here's how this can be done for mutability parsimony:

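import gctree.mutation_model as mm

# Build the mutation model from the mutability and substitution files.
mut_model = mm.MutationModel(
    mutability_file='path_to_mutability_file',
    substitution_file='path_to_substitution_file',
)
# splits: list of indices where sequences are concatenated.
mutability_distance = mm._mutability_distance(mut_model, splits=splits)

def mutability_parsimony(ctree):
    # Sum the mutability distance over every parent-child edge of the ete tree.
    return sum(mutability_distance(n.up.sequence, n.sequence) for n in ctree.tree.iter_descendants())

# forest: an iterable of ctrees.
for ctree in forest:
    print(mutability_parsimony(ctree))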

Here, splits is a list of the indices at which sequences are concatenated.
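
The same edge-sum pattern should carry over to other ranking stats that decompose over parent-child edges. Below is a minimal, self-contained sketch of that idea which runs without gctree or ete installed. Everything in it is an assumption for illustration: `Node` is a hypothetical stand-in for an ete tree node (only `up`, `sequence`, and `iter_descendants` are mimicked), `hamming_distance` is a toy replacement for the mutability distance, and `edge_sum` is not an existing gctree function.

```python
class Node:
    """Hypothetical stand-in for an ete tree node that carries a sequence."""

    def __init__(self, sequence):
        self.sequence = sequence
        self.up = None          # parent node, None for the root
        self.children = []

    def add_child(self, child):
        child.up = self
        self.children.append(child)
        return child

    def iter_descendants(self):
        # Yield every node below this one, excluding this node itself,
        # so each yielded node is guaranteed to have a parent.
        for child in self.children:
            yield child
            yield from child.iter_descendants()


def hamming_distance(parent_seq, child_seq):
    # Toy per-edge distance: number of mismatched positions.
    return sum(a != b for a, b in zip(parent_seq, child_seq))


def edge_sum(tree, distance):
    # Sum a per-edge distance over every parent-child edge of the tree.
    return sum(distance(n.up.sequence, n.sequence) for n in tree.iter_descendants())


# Small example tree: a root with two children, one of which has a child.
root = Node("AAAA")
left = root.add_child(Node("AAAT"))
left.add_child(Node("AATT"))
root.add_child(Node("CAAA"))

total = edge_sum(root, hamming_distance)
print(total)
```

With a real ctree, `tree` would presumably be `ctree.tree` and `distance` the function returned by `mm._mutability_distance`, as in the snippet above. Keeping the distance as a parameter means each edge-decomposable stat only needs its own distance function, not its own traversal.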
