Jumble parameter for tree space search should be user configurable #101
Comments
@wsdewitt From what I can tell, changing the jumble parameter doesn't actually change the trees that dnapars finds. For example, using the sample data, running
mkconfig --jumble 100 deduplicated.phylip dnapars > dnapars.cfg
dnapars < dnapars.cfg > dnapars.log
and running
mkconfig --jumble 10 deduplicated.phylip dnapars > dnapars.cfg
dnapars < dnapars.cfg > dnapars.log
both result in 263 trees found by dnapars, in about the same amount of time. I suspect that dnapars is just ignoring the jumble parameter and only evaluating a single (?) input sequence order. However, if I change the random seed passed to the jumble parameter in
to this config file:
results in 137 (presumably different from the other 263) trees from dnapars. The number following |
I am seeing larger parsimony forests with more jumbles, using the following commands:
deduplicate example/150228_Clone_3-8.fasta --root GL > deduplicated.phylip
mkconfig --jumble 10 deduplicated.phylip dnapars > dnapars.cfg
dnapars < dnapars.cfg > dnapars.log
head -5 outfile
mkconfig --jumble 100 --quick deduplicated.phylip dnapars > dnapars.cfg
rm outfile outtree
dnapars < dnapars.cfg > dnapars.log
head -5 outfile
Note these use the "quick" option. I haven't had time to try without the quick option yet. |
Ok I retried without |
The funny thing is that changing the random seed for shuffling the sequences does change the trees that dnapars finds. You'd think that would be the same as shuffling multiple times starting with one seed. |
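To make that expectation concrete, here is a minimal sketch. It uses Python's `random` module as a stand-in for dnapars' own generator, which is an assumption, and `jumbled_orders` and the sequence names are hypothetical. With one seed, the input orders tried in a 10-jumble run are exactly the first 10 orders of a 100-jumble run, so if dnapars honored the jumble count the larger run should search from everything the smaller one did, plus 90 more starting orders.

```python
import random


def jumbled_orders(names, seed, n_jumbles):
    """Return n_jumbles shuffled input orders drawn from one seeded RNG stream."""
    rng = random.Random(seed)  # stand-in for dnapars' generator (assumption)
    orders = []
    for _ in range(n_jumbles):
        order = list(names)  # every jumble starts from the original order
        rng.shuffle(order)
        orders.append(tuple(order))
    return orders


# Hypothetical sequence names; "GL" mirrors the root used in this thread.
names = ["GL", "seq1", "seq2", "seq3", "seq4", "seq5"]
few = jumbled_orders(names, seed=13, n_jumbles=10)
many = jumbled_orders(names, seed=13, n_jumbles=100)

# Same seed: the 10-jumble orders are a prefix of the 100-jumble orders.
print(many[:10] == few)  # prints True
```

This says nothing about what dnapars does internally; it only shows why identical tree counts for 10 and 100 jumbles look suspicious.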
Good point. Ok maybe we should ask Joe. Do you agree @matsen? |
Y'all are much deeper in this than me, so I don't have much to say, but I'm sure that Joe would be happy to hear from you. |
The "Jumble" option in dnapars determines the number of random starts to try for tree space search, as described in the phylip docs.
This is currently fixed at 10 in the mkconfig program (gctree/gctree/mkconfig.py, lines 48 to 51 in dff3119).
This should be made a parameter (default 10) in the command line interface.
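One way this could look, as a rough sketch rather than the actual mkconfig code: an `argparse` option `--jumble` with default 10. The positional arguments mirror the `mkconfig ... deduplicated.phylip dnapars` invocations in the comments. The helper `jumble_menu_lines`, the seed value 13, and the order of the menu replies (toggle `J`, then seed, then count) are assumptions for illustration and should be checked against the phylip docs.

```python
import argparse


def positive_int(text):
    """Parse a command-line value as an integer >= 1."""
    value = int(text)
    if value < 1:
        raise argparse.ArgumentTypeError("jumble must be a positive integer")
    return value


def build_parser():
    parser = argparse.ArgumentParser(
        description="write a phylip config file to stdout (sketch)"
    )
    parser.add_argument("phylip", help="PHYLIP alignment file")
    parser.add_argument("program", help="phylip program name, e.g. dnapars")
    parser.add_argument(
        "--jumble",
        type=positive_int,
        default=10,
        help="number of random input orders to try (default: 10)",
    )
    return parser


def jumble_menu_lines(n_jumbles, seed=13):
    """Hypothetical helper: menu replies that switch on jumbling.

    Assumed reply order: toggle J, then an odd seed, then the jumble count.
    """
    return ["J", str(seed), str(n_jumbles)]


args = build_parser().parse_args(["deduplicated.phylip", "dnapars", "--jumble", "100"])
print("\n".join(jumble_menu_lines(args.jumble)))
```

Leaving `--jumble` off keeps the current behavior of 10 random starts, so existing invocations would not change.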