phylotrackpy: a python phylogeny tracker

In in silico evolution experiments, we have the luxury of being able to perfectly track the phylogenies of our populations, rather than having to just infer them after the fact. Phylotrackpy is a Python package designed to help you do so as efficiently as possible.

At face value, measuring a phylogeny in in silico evolution may seem straightforward: you just need to keep track of what gives birth to what. However, several aspects turn out to be non-trivial. The goal of Phylotrackpy is to implement these things the right way once, so that we can all stop re-implementing them. Phylotrackpy is a Python library designed to flexibly handle all aspects of recording phylogenies in in silico evolution.

Phylogeny Trackers in Other Languages

C++

Phylotrackpy is essentially a wrapper around Phylotracklib, which is implemented in C++. If you need a C++ phylogeny tracker, you can use that one directly (it is part of the larger Empirical library, which is header-only so you can just include the parts you want).

Julia

A phylogeny tracker written in Julia is also available.

Features

Pruning: Ability to prune out taxa that are extinct and have no extant descendants (to keep memory use under control)
Flexible taxon definitions: Flexible control of how taxa are defined (e.g. by genotype, by phenotype, by trait, or by something more complex)
Efficiency: Highly efficient (implemented in C++ under the hood)
Phylostatistics: Includes various phylogenetic topology metrics
Flexible output: Easily add columns to output files
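To make the pruning and taxon-definition features more concrete, here is a minimal pure-Python sketch of the idea. It is not Phylotrackpy's implementation or API: `ToyTracker`, `Taxon`, `birth`, `death`, and `taxon_key` are hypothetical names invented for this illustration, and the real library does this work in C++. The sketch assumes a taxon is defined by whatever the user-supplied key function returns, and that a taxon can be dropped once it has no living organisms and no stored child taxa.

```python
class Taxon:
    """One node in the toy phylogeny (hypothetical, for illustration only)."""

    def __init__(self, info, parent=None):
        self.info = info        # what defines the taxon (e.g. a genotype)
        self.parent = parent    # parent Taxon, or None for a root
        self.num_orgs = 0       # living organisms that belong to this taxon
        self.num_offspring = 0  # child taxa that are still stored


class ToyTracker:
    """Tracks taxa and prunes lineages that have gone entirely extinct."""

    def __init__(self, taxon_key):
        self.taxon_key = taxon_key  # maps an organism to its taxon-defining info
        self.stored = set()         # every taxon currently kept in memory

    def birth(self, org, parent_taxon=None):
        info = self.taxon_key(org)
        if parent_taxon is not None and parent_taxon.info == info:
            taxon = parent_taxon  # offspring stays in its parent's taxon
        else:
            taxon = Taxon(info, parent_taxon)  # offspring founds a new taxon
            self.stored.add(taxon)
            if parent_taxon is not None:
                parent_taxon.num_offspring += 1
        taxon.num_orgs += 1
        return taxon

    def death(self, taxon):
        taxon.num_orgs -= 1
        # Walk toward the root, dropping each taxon that is extinct
        # and has no remaining stored descendants.
        while (taxon is not None and taxon.num_orgs == 0
               and taxon.num_offspring == 0):
            self.stored.discard(taxon)
            if taxon.parent is not None:
                taxon.parent.num_offspring -= 1
            taxon = taxon.parent


# Define taxa by genotype; swapping the key function changes the taxon definition.
tracker = ToyTracker(taxon_key=lambda org: org["genotype"])
a = tracker.birth({"genotype": "AAA"})
b = tracker.birth({"genotype": "AAB"}, parent_taxon=a)
c = tracker.birth({"genotype": "AAB"}, parent_taxon=b)  # joins b's taxon

tracker.death(a)  # a is extinct but kept: it still has a stored descendant
stored_after_a = len(tracker.stored)
tracker.death(b)
tracker.death(c)  # last AAB organism dies, so b and then a are pruned
```

Passing a different key function (for example, one that returns a phenotype) would group the same organisms into different taxa without touching the tracking logic, which is the kind of flexibility the feature list describes.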

Running a parallel/distributed simulation? Check out hstrat, which provides an alternate parallel/distributed-friendly methodology for decentralized phylogenetic tracking.

High level usage

There are three main steps to tracking a phylogeny using phylotrackpy:

You may also want to:

Import and export data

For more detailed instructions, see the documentation.

Installation

Phylotrackpy is available through pip:

pip install phylotrackpy

To install the latest development version:

pip install git+https://github.com/emilydolson/phylotrackpy

To install from a local source copy:

pip install . --upgrade

Note that development and local installs require local compilation of the C++ bindings. Pre-built wheels are available with the PyPI distribution. See our documentation for more complete information on local builds.

Useful background information

There are certain quirks associated with real-time phylogenies (especially digital ones) that you might not be used to thinking about if you're used to dealing with reconstructed phylogenies. Many of these discrepancies are the result of the very different temporal resolutions on which these types of phylogenies are measured, and the fact that the taxonomic units we work with are often at a finer resolution than species. We document some here so that they don't catch you off guard:

Multifurcations are real: In phylogenetic reconstructions, there is usually an assumption that any multifurcation/polytomy (i.e. a node that has more than two child nodes) is an artifact of having insufficient data. In real-time phylogenies, however, we often observe multifurcations that we know for sure actually happened.
Not all extant taxa are leaf nodes: In phylogenetic reconstructions, there is usually an assumption that all extant (i.e. still living) taxa are leaf nodes in the phylogeny (i.e. none of them are parents/offspring of each other; similar taxa are descended from a shared common ancestor). In real-time phylogenies it is entirely possible that one taxon gives birth to something that we have defined as a different taxon and then continues to coexist with that child taxon.
Not all nodes are branch points: In phylogenetic reconstructions, we only attempt to infer where branch points (i.e. common ancestors of multiple taxa) occurred. We do not try to infer how many taxa existed on a line of descent between a branch point and an extant taxon. In real-time phylogenies we observe exactly how many taxa exist on this line of descent and we keep a record of them. In practice there are often a lot of them, depending on how you define your taxa. It is unclear whether we should include these non-branching nodes when calculating phylogenetic statistics (which is why Phylotrackpy lets you choose whether you want to).

The above image represents an actual phylogeny measured from digital evolution. Each rectangle represents a different taxon. Its position along the x-axis represents the span of time for which it existed. Note that there are often sections along a single branch where multiple taxa coexisted for a period of time. Circles represent extant taxa at the end of this run.
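The three quirks listed above can be spotted in a small hand-made example. The following is a sketch in plain Python, not Phylotrackpy code: the `parents` map, the taxon names, and the `extant` set are all made up for illustration. It finds multifurcations, extant taxa that are not leaves, and non-branching nodes in a toy phylogeny.

```python
from collections import defaultdict

# Hypothetical toy phylogeny: taxon name -> parent name (None marks the root).
parents = {"A": None, "B": "A", "C": "B", "D": "B", "E": "B", "F": "E"}
# Taxa that still have living members at the end of the (imaginary) run.
extant = {"B", "C", "D", "F"}

# Invert the parent map so each taxon knows its child taxa.
children = defaultdict(list)
for taxon, parent in parents.items():
    if parent is not None:
        children[parent].append(taxon)

# Multifurcations: nodes with more than two child taxa.
multifurcations = sorted(t for t in parents if len(children[t]) > 2)

# Extant taxa that are internal nodes, i.e. they coexist with their descendants.
extant_internal = sorted(t for t in extant if children[t])

# Non-branching nodes: exactly one child taxon, so no branch point here.
non_branching = sorted(t for t in parents if len(children[t]) == 1)

print("multifurcations:", multifurcations)
print("extant internal taxa:", extant_internal)
print("non-branching nodes:", non_branching)
```

In this toy tree, `B` is both a multifurcation and an extant internal node, while `A` and `E` are non-branching nodes of the kind a reconstructed phylogeny would never contain.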

Dependencies

pybind11 (for wrapping C++ code into Python)
Empirical (where the C++ version of this code lives)

Testing dependencies

pytest

Documentation dependencies

myst_parser (for writing documentation in markdown)
sphinx_rtd_theme (theme for readthedocs)

Contributing

Contributions are welcome! See CONTRIBUTING.md.

Citing

If Phylotrack contributes to a scientific publication, please cite it as

Dolson, E., Rodriguez-Papa, S., & Moreno, M. A. (2024). Phylotrack: C++ and Python libraries for in silico phylogenetic tracking. arXiv preprint arXiv:2405.09389. https://doi.org/10.48550/arXiv.2405.09389

@misc{dolson2024phylotrack,
      doi={10.48550/arXiv.2405.09389},
      url={https://arxiv.org/abs/2405.09389},
      title={Phylotrack: C++ and Python libraries for in silico phylogenetic tracking},
      author={Emily Dolson and Santiago Rodriguez-Papa and Matthew Andres Moreno},
      year={2024},
      eprint={2405.09389},
      archivePrefix={arXiv},
      primaryClass={q-bio.PE}
}

Consider also citing pybind11 if you are using Phylotrackpy. And don't forget to leave a star on GitHub!

Developers

Emily Dolson (lead developer)
Matthew Andres Moreno
