Pre-release 10 of PF2.0.0

@roblanf roblanf released this 13 Nov 05:27
· 18 commits to develop since this release

This pre-release has a number of new features. Please read the commit history for full details, but I will summarise them here.

  1. --all-states and --min-subset-size now apply to the rcluster algorithm.
    They operate as follows. We first run the rcluster algorithm as usual and wait until it finishes. Then, if either --all-states or --min-subset-size is used, we check the final partitioning scheme for subsets that fail either of those conditions. As soon as we find a failing subset, we merge it with its nearest neighbour subset in the scheme, and then repeat the search (we merge one subset at a time to avoid conflicts while merging). Here, nearest neighbour is defined using the clustering weights that you used for the rcluster algorithm. Thus, the final merging works by trying to merge subsets that are too small, or that don't have all states present, with other similar subsets.
  2. There is a new collection of models called Lie Markov models.
    Since none of these are implemented in RAxML, I suspect the 1Kite community won't be interested in these. If you are interested, you can read the commit message here: d9ae785
  3. There is a new variant of the rcluster algorithm: search = rclusterf;.
    DON'T WORRY, the rcluster algorithm (search = rcluster;) works the same as in pre-release 9. However, we now have a new algorithm: search = rclusterf;. In this algorithm, instead of merging only the best pair of subsets we find at each step, we merge all the subsets that improve the AICc/BIC score. This means the algorithm needs far fewer steps. I have tested this on some 1Kite datasets, and so far it appears to get very similar AICc scores to the normal rcluster algorithm. I suspect it will also be faster in many situations, but I have not examined its performance exhaustively. So, if you have time to compare rcluster and rclusterf, I'd be interested in your findings. I will let you know when I know more about this algorithm.
  4. We now output a summary csv file for schemes.
    This file contains summaries of the best scheme at each step of the rcluster, rclusterf, hcluster, and greedy algorithms. It is only written if you are not using -q/--quick. It's just a summary of what is written in the scheme summary files, and it is written to the same /schemes folder.
  5. The best_scheme.txt file now includes a MrBayes block that specifies the partitioned model (or as close as we can get to the model) output by PartitionFinder.
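
The clean-up step described in item 1 can be sketched roughly as follows. This is an illustration only, not PartitionFinder's code: the dictionary layout, the function names, and the single "rate" value standing in for the weighted clustering distance are all assumptions made for the example.

```python
def fails_conditions(subset, min_size, required_states):
    """True if the subset is too small or is missing a required state."""
    return subset["size"] < min_size or not required_states <= subset["states"]


def merge_pair(a, b):
    """Combine two subsets. The size-weighted mean rate is an assumption."""
    size = a["size"] + b["size"]
    return {
        "names": sorted(a["names"] + b["names"]),
        "size": size,
        "states": a["states"] | b["states"],
        "rate": (a["rate"] * a["size"] + b["rate"] * b["size"]) / size,
    }


def merge_failing_subsets(subsets, min_size=0, required_states=frozenset()):
    """Merge failing subsets into their nearest neighbour, one at a time."""
    subsets = list(subsets)
    while len(subsets) > 1:
        # Find the first subset that fails either condition, if any.
        failing = next(
            (s for s in subsets if fails_conditions(s, min_size, required_states)),
            None,
        )
        if failing is None:
            break
        others = [s for s in subsets if s is not failing]
        # Hypothetical distance: absolute difference in rate. The real
        # algorithm uses the user's rcluster clustering weights.
        partner = min(others, key=lambda s: abs(s["rate"] - failing["rate"]))
        subsets = [s for s in others if s is not partner]
        subsets.append(merge_pair(failing, partner))
        # Loop again: the scheme is re-checked from scratch after each merge.
    return subsets


# Made-up scheme: gene2 is too short and has no T.
scheme = [
    {"names": ["gene1"], "size": 900, "states": set("ACGT"), "rate": 1.0},
    {"names": ["gene2"], "size": 40, "states": set("ACG"), "rate": 1.1},
    {"names": ["gene3"], "size": 700, "states": set("ACGT"), "rate": 5.0},
]
cleaned = merge_failing_subsets(scheme, min_size=100, required_states=set("ACGT"))
print([s["names"] for s in cleaned])
```

In this toy scheme gene2 is merged with gene1, its closest subset by rate, and the loop then finds nothing else to fix.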

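The difference between one step of rcluster and one step of rclusterf (item 3) can be sketched as below. The notes above only say that rclusterf merges all the subsets that improve the score; how overlapping pairs are resolved here (best improvement first, each subset used at most once per step) is an assumption for the example, as are the function names and the tuple layout.

```python
def rcluster_step(candidates):
    """Pick only the single best improving pair, or nothing."""
    improving = [c for c in candidates if c[2] < 0]
    if not improving:
        return []
    best = min(improving, key=lambda c: c[2])
    return [(best[0], best[1])]


def rclusterf_step(candidates):
    """Pick every improving pair, skipping pairs that reuse a subset.

    candidates: (subset_a, subset_b, score_change) tuples, where a negative
    score_change means the merge lowers (improves) the AICc/BIC score.
    """
    improving = sorted((c for c in candidates if c[2] < 0), key=lambda c: c[2])
    used = set()
    accepted = []
    for a, b, _change in improving:
        if a in used or b in used:
            continue  # assumption: a subset can join only one merge per step
        accepted.append((a, b))
        used.update((a, b))
    return accepted


# Made-up score changes for four candidate merges.
candidates = [
    ("s1", "s2", -10.0),
    ("s2", "s3", -8.0),
    ("s3", "s4", -5.0),
    ("s5", "s6", 2.0),
]
one_merge = rcluster_step(candidates)
many_merges = rclusterf_step(candidates)
```

With these made-up numbers, rcluster_step accepts one merge and rclusterf_step accepts two in the same step, which is why the second variant needs fewer steps overall.
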
These changes are not yet written into the manual, because the request for change 1 was that it be tested and released as soon as possible. I will update the manual at some point.