-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix reconsensus #95
Fix reconsensus #95
Conversation
@@ -41,12 +41,12 @@ impl PangraphNode { | |||
} | |||
} | |||
|
|||
pub fn len(&self) -> usize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I this this function had a bug. But due to possibly having circular genomes, the length of a node cannot always be determined by only start-end positions
self.position.1.saturating_sub(self.position.1) | ||
} | ||
|
||
pub fn is_empty(&self) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Therefore also this function was wrong. I thought about defining is_emtpy
as checking if the start position coincides with the end position, but this is also wrong in one edge case: when the path is circular and composed of a single node, then the start/end position of the node can be the same even if the node is not empty.
@@ -87,8 +91,11 @@ fn reconsensus_mutations(block: &mut PangraphBlock) -> Result<(), Report> { | |||
let subs_at_pos: Vec<_> = edit.subs.iter().filter(|s| s.pos == pos).cloned().collect(); | |||
match subs_at_pos.len() { | |||
0 => { | |||
edit.subs.push(Sub::new(pos, original)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fixed the bug: the substitution should be applied only if the position is not deleted.
@@ -203,14 +226,14 @@ mod tests { | |||
// 01234567890123456789012 | |||
// node 1) .T...--..........A..... | |||
// node 2) .T...--...C......|..... | |||
// node 3) .T...--...C............ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed the test so that it hits this edge-case
let node_idx = path_nodes.iter().position(|&n| n == node_id).unwrap(); | ||
path_nodes.remove(node_idx); | ||
// while there is a node with this id, remove it: | ||
while let Some(node_idx) = path_nodes.iter().position(|&n| n == node_id) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the second fix: sometimes a path could have more than one empty node with the same id. The same id happens only if the nodes are adjacent. This removes all of them.
A more elegant solution might be to not produce these empty nodes at all (modify the merging procedure?) or to make sure they are generated with different id.
Fixed a bug in reconsensus: when transferring a substitution to the consensus, sites that are deleted should not receive the substitution.
This caused a failure in node removal downstream, in which empty nodes were not removed because they featured a substitution, which was erroneously located on a deleted site.
I added additional debug sanity checks and fixed the bug. I also added the edge-case to the test set.