-
Notifications
You must be signed in to change notification settings - Fork 240
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document / Improve / Polish / Fix --indels-2.0 #2280
Comments
True, the option I'll use the opportunity to document its current state and explain its main idea: it is intended to replace the old mpileup indel calling code and make it modular, so that new concepts can be plugged in and tested more easily. It also changes the overall structure of the algorithm, so that nested loops naturally support diploid or even polyploid genomes
Currently the algorithm of consensus template creation is as follow:
Apart from small technical problems, such as the assert you're quoting, the consensus building algorithm described above is not perfect and should be improved. |
Understood, buit please think of this from a user perspective:
There's no way for the user to see that the current caller "works well", the 2.0 caller "works less well" and the indels-cns caller "works better". That's in generalised terms of course as obviously all implementations will have some cases they get better and some they get worse. I'm not arguing against doing more work nor that it is necessary, but that it is exposed to the users and advertised before it's in a viable state to be used. Maybe it simply needs removing from the usage statement. If so I'd also change the "with edlib" to "with diploid consensus". I only added the edlib there because I couldn't really use the term I wanted to as it was already being used. Alternatively, drop the EXPERIMENTAL from indels-cns. It's well tested and debugged and is in a different ballpark to indels-2.0 for robustness and polish. Clearly it's not perfect (nothing is, including the original caller), but as far as I'm concerned it's a finished project and I have no intention of further development of that algorithm. |
I am happy for the wording to be changed |
Indels-2.0 is somewhat buggy at present. On several tests it simply fails. For example the test data from #2277.
Often this is due to the use of asserts to validate assumptions. Those validations are perhaps fine, but assert is not. Either it needs the asserts replacing by return codes to indicate errors, so that the alignment is rejected, or it needs canning as functionality. Basically it looks to be part-way through development and more of a prototype than a usable piece of code, although it is labelled as experimental to reflect that.
If I recall it was an attempt to rewrite my old PR (#1679), but in doing so introduced bugs and lost most of the improvements in calling accuracy. Ultimately I improved on that PR and it got merged as
--indels-cns
, so we may need to decide what to do with the--indels-2.0
functionality.Right now though I think it's just confusing for the users having multiple experimental options.
The text was updated successfully, but these errors were encountered: