ENH(tree): See whether we can add Usher and/or Maple as experimental tree builders for augur tree #1233

Open
corneliusroemer opened this issue May 22, 2023 · 5 comments
Labels
enhancement New feature or request

Comments

@corneliusroemer
Member

Context

IQ-TREE can struggle with large trees and take a long time to run. We may want to experiment with UShER and/or MAPLE as alternatives. They are probably significantly faster and may be good enough for some use cases, perhaps even better than IQ-TREE.

@bqminh

bqminh commented May 23, 2024

We have also released the CMAPLE tool, https://github.com/iqtree/cmaple, which is 3 times more efficient than MAPLE (https://doi.org/10.1101/2024.05.15.594295). Is there any plan to integrate such a tool into the pipeline? We (with @trongnhanuit) can volunteer to integrate CMAPLE.

@huddlej
Contributor

huddlej commented May 31, 2024

Thanks, @bqminh! Based on the IQ-TREE docs, it looks like Augur might support CMAPLE already by passing custom arguments to IQ-TREE like augur tree --tree-builder-args="--pathogen-force" (where IQ-TREE is already the default tree-builder). Another option to surface CMAPLE which would require a change to Augur would be to wrap that custom IQ-TREE command through a new "method" option like augur tree --method cmaple.

The other technical consideration is that we bundle IQ-TREE with Augur in our Nextstrain runtimes for Conda and Docker. For the Conda runtime, we pull the IQ-TREE package from Bioconda. For the Docker runtime, we download a (slightly out-of-date) binary from GitHub. For CMAPLE to work with augur tree across our various runtimes, we'd just need the Bioconda package and GitHub binaries to reflect the CMAPLE branch of the code. Separately, we are eager to include the latest version of IQ-TREE that supports ARM64 CPUs, but it looks like that development is happening in a separate branch from the CMAPLE work. Is there a plan to have a single release with both CMAPLE and ARM64 support or will these remain as separate development paths for a while?

@bqminh

bqminh commented Jun 12, 2024

Thank you for this information! We'll prioritise getting this IQ-TREE/CMAPLE version to work on ARM. It's good to know that it might already work with these tree builder arguments, but we'll also consider other options.

@corneliusroemer
Member Author

I've managed to build iqtree2+cmaple on my local machine (osx-arm64 macOS 14.6) with a few workarounds, see iqtree2 issue:

Per the logs, this time it really worked. (I first tried the Bioconda version, but that lacks CMAPLE support; see iqtree/iqtree2#274.)

IQ-TREE multicore version 2.3.5 for MacOS ARM 64-bit built Jul 17 2024
Developed by Bui Quang Minh, Nguyen Lam Tung, Olga Chernomor, Heiko Schmidt,
Dominik Schrempf, Michael Woodhams, Ly Trong Nhan, Thomas Wong

Host:    dyn-3-4-29.mobile.unibas.ch (SSE4.2, 32 GB RAM)
Command: iqtree2 -ntmax 4 -s results/hmpxv1/masked_masked-delim.fasta -m GTR -ninit 2 -n 2 -me 0.05 -nt AUTO -redo --pathogen-force
Seed:    131082 (Using SPRNG - Scalable Parallel Random Number Generator)
Time:    Wed Jul 17 16:05:07 2024
Kernel:  SSE2 - auto-detect threads (10 CPU cores detected)

Reading an alignment
Running [C]MAPLE algorithm...
Performing placement
243 sequences have been added to the tree.
Applying a normal tree search
Optimizing branch lengths
Tree log likelihood: -272539.7511423723

MODEL: GTR

ROOT FREQUENCIES
A                       C                       G                       T                       
0.365181        0.157898        0.157473        0.319448        

MUTATION MATRIX
        A                       C                       G                       T                       
A       -2552.16        317.651 1864.83 369.68  
C       734.649 -5636.53        455.989 4445.89 
G       4324.56 457.223 -5524.77        742.987 
T       422.604 2197.54 366.257 -2986.4 

Analysis results written to:
Maximum-likelihood tree:       results/hmpxv1/masked_masked-delim.fasta.treefile
Screen log file:               results/hmpxv1/masked_masked-delim.fasta.log

CMAPLE Runtime: 0.9459710121s
Date and Time: Wed Jul 17 16:05:08 2024
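One way to check a run like this automatically, rather than reading the log by eye, is to look for the CMAPLE lines in the screen log. The following is a minimal sketch, assuming the log lines keep the wording shown above (this may differ between IQ-TREE versions); the function name `summarize_iqtree_log` is made up for illustration and is not part of IQ-TREE or Augur.

```python
import re


def summarize_iqtree_log(log_text):
    """Pull a few CMAPLE-related facts out of an IQ-TREE screen log.

    Assumption: the log uses the same wording as the log quoted above.
    Fields that cannot be found are returned as None.
    """
    def first_match(pattern, convert):
        match = re.search(pattern, log_text)
        return convert(match.group(1)) if match else None

    return {
        # True if the log reports that the CMAPLE code path ran at all.
        "used_cmaple": "Running [C]MAPLE algorithm" in log_text,
        "sequences_added": first_match(
            r"(\d+) sequences have been added to the tree", int
        ),
        "log_likelihood": first_match(
            r"Tree log likelihood:\s*(-?\d+(?:\.\d+)?)", float
        ),
        "cmaple_runtime_seconds": first_match(
            r"CMAPLE Runtime:\s*(\d+(?:\.\d+)?)s", float
        ),
    }


# Excerpt copied from the log above.
EXAMPLE_LOG = """\
Reading an alignment
Running [C]MAPLE algorithm...
Performing placement
243 sequences have been added to the tree.
Tree log likelihood: -272539.7511423723
CMAPLE Runtime: 0.9459710121s
"""

if __name__ == "__main__":
    for key, value in summarize_iqtree_log(EXAMPLE_LOG).items():
        print(f"{key}: {value}")
```

A build without CMAPLE support would presumably not print the "Running [C]MAPLE algorithm" line, so `used_cmaple` would come back as False for such a log.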

On a small build (240 sequences, mpox clade IIb) things look good:

[Screenshot: Brave Browser, 2024-07-17 16:09:06]

I'll keep exploring. I think I can edit the bioconda recipe to add cmaple so we can use it broadly across workflows. See:

@corneliusroemer
Member Author

I've managed to build IQ-TREE with the CMAPLE feature enabled in Bioconda! There's thus no need to change Augur code: one can simply pass the tree builder argument --pathogen-force and CMAPLE should be used automatically.
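For workflows that call Augur from a script, the invocation discussed in this thread could be assembled roughly as follows. This is a sketch only: the helper name `build_augur_tree_command` and the file paths are hypothetical, and it assumes the installed IQ-TREE was built with CMAPLE support. The `--alignment`, `--output`, `--method`, `--nthreads`, and `--tree-builder-args` options are existing `augur tree` options; the `=` form is used for the last one, as in the comment above, because the value itself starts with dashes.

```python
import shlex


def build_augur_tree_command(alignment, output,
                             tree_builder_args="--pathogen-force",
                             nthreads=None):
    """Return the argv list for an `augur tree` run that forwards extra
    arguments to IQ-TREE.

    Assumption: the IQ-TREE binary that Augur finds was built with CMAPLE
    enabled; otherwise --pathogen-force may not be recognized.
    """
    command = [
        "augur", "tree",
        "--alignment", alignment,
        "--output", output,
        "--method", "iqtree",
        # Joined with "=" so the dash-prefixed value is not parsed as a
        # separate option.
        f"--tree-builder-args={tree_builder_args}",
    ]
    if nthreads is not None:
        command += ["--nthreads", str(nthreads)]
    return command


if __name__ == "__main__":
    # Hypothetical input and output paths, for illustration only.
    command = build_augur_tree_command(
        "results/aligned.fasta", "results/tree_raw.nwk", nthreads=4
    )
    # Print a shell-safe version of the command instead of running it.
    print(shlex.join(command))
```

Passing the returned list to `subprocess.run(command, check=True)` would run the build, provided `augur` and a CMAPLE-enabled IQ-TREE are on the PATH.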


3 participants