Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

local branching metric harmonization #99

Open
wsdewitt opened this issue Jun 10, 2022 · 0 comments
Open

local branching metric harmonization #99

wsdewitt opened this issue Jun 10, 2022 · 0 comments

Comments

@wsdewitt
Copy link
Collaborator

wsdewitt commented Jun 10, 2022

Background

There have been persistent discrepancies between the local branching metrics provided by gctree.CollapsedTree.local_branching() and those provided by partis, including on the unit test in gctree. Here I'll walk through analytical calculations of the integrals for some very simple trees, as a basis for discussion with @psathyrella about what the source of the discrepancy is.

Message passing scheme

For reference, the following excerpt from Neher et al. (2014) describes the LBI message passing scheme for computing the exponentially discounted tree length focal to each node:

image

Additional implementation details in gctree/partis

In gctree/partis there are a few additional steps/notes:

  1. In trees that don't have time-calibrated branch lengths, we use the number of mutations on a branch as an estimate of its branch length. If a branch has no mutations, it is assigned a default/prior branch length $\tau_0$ for the purpose of the message integration limits, where $\tau_0\le\tau$ is an additional parameter. This allows clonal offspring count toward LBI fitness, whereas using a branch length 0 would result in vanishing messages from clonal offspring.
  2. For the purpose of message computations, the root node is given a branch length of infinity (which integrates to $\tau$), as a heuristic to stabilize/smooth LB metrics near the root.
  3. In addition to the LBI of Eqn 19, we compute a local branching ratio (LBR), defined as $$LBR_i = \frac{\sum_{j}m_{\uparrow i_j}}{m_{\downarrow i}}.$$
  4. Although rule 3 results in nonzero root node LBR values (given the infinite root branch noted above, which integrates to $\tau$), the root node LBR is re-assigned to zero or NaN.

Single-clone tree

As a first exercise, consider a tree that is a clonal polytomy of 8 identical sampled cells (with no mutations):
image
In the notation of genotype-collapsed trees, this would be represented as a single node (let's denote it index $0$) with abundance 8:
image

Down message

According to rule 2 above, we add an infinite branch above the root of the tree, so $$m_{\downarrow 0} = \int_0^\infty e^{-x/\tau}dx = \tau.$$

Up messages

According to rule 1 above, each of the 8 clonal children $j$ contributes upward message $$m_{\uparrow j} = \int_0^{\tau_0} e^{-x/\tau}dx = \tau(1-e^{-\tau_0/\tau}).$$
So $$\sum_j m_{\uparrow j} = \sum_j \tau(1-e^{-\tau_0/\tau}) = 8\tau(1-e^{-\tau_0/\tau}).$$

LBI

$$\lambda_i = \sum_j m_{\uparrow i_j} + m_{\downarrow i} = \tau(1 + 8(1-e^{-\tau_0/\tau})).$$

If we take $\tau=\tau_0=1$, we have
$$\lambda_i = 1 + 8(1-e^{-1}) \simeq 6.0569644706.$$

LBI from gctree

Using gctree we can validate this analytical result for $\tau=\tau_0=1$.
First, create the tree object:

import ete3
import gctree

ete_tree = ete3.TreeNode(name=0)
ete_tree.abundance = 8
ete_tree.isotype = {}

tree = gctree.CollapsedTree()
tree.tree = ete_tree
tree.render("%%inline")

image
Now, evaluate LBI for $\tau=\tau_0=1$:

tree.local_branching(tau=1, tau0=1)
print(tree.tree.LBI)
6.056964470628461
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant