Results differ unexpectedly between `pipeline.predict` and `query_bed` with `ref` score #38
Comments
@krrome Do you have an idea why the predictions are different?
I think that could be the potential hunch. @krrome do you know if that's true?
That was my suspicion, but the docs made it sound like they don't do that for bed files. I tried tracing through the code but frankly ran out of steam before I could figure it out.
@krrome wrote the code so he'll be better able to help here.
Thanks! As an FYI: yesterday, I ended up working around this by using kipoi-interpret's
Glad to hear the Mutation class worked well. Let's still keep this thread open.
I'm using the `DeepSEA/variantEffects` model with `MutationMap` to try and find the most impactful mutations for a set of sequences. However, I've noticed a discrepancy between the predictions I get for the wild-type sequences when using `pipeline.predict` vs. `query_bed` with the `ref` score.

The `pipeline.predict` scores are generally high probabilities for CTCF in cell type A549, which is what I'd expect given that my bed file consists of ChIP-seq peaks for that TF/cell line pair. The `ref`-scored predictions, on the other hand, tend to be really small. A few examples of the difference are:

I'm trying to understand whether this is expected, a result of something I'm doing wrong, or a bug. For context, my code is essentially the following (pared down to make it easier to see the essentials):
Note that both of these use the exact same bed file and therefore should be looking at the same sequences.
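
To check whether two prediction arrays computed for the same regions really disagree by an order of magnitude, a small NumPy helper along these lines may be useful. This is only a sketch: `compare_predictions` is a hypothetical name, the input numbers are made up for illustration (they are not values from this issue), and it assumes both arrays are already aligned row by row.

```python
import numpy as np


def compare_predictions(pred_a, pred_b, eps=1e-12):
    """Summarise how two aligned prediction arrays differ.

    Hypothetical helper: assumes pred_a[i] and pred_b[i] refer to the
    same region and the same model output.
    """
    a = np.asarray(pred_a, dtype=float).ravel()
    b = np.asarray(pred_b, dtype=float).ravel()
    if a.shape != b.shape:
        raise ValueError("prediction arrays must have the same number of entries")
    # log10 ratio makes order-of-magnitude gaps easy to read
    log_ratio = np.log10((a + eps) / (b + eps))
    return {
        "max_abs_diff": float(np.max(np.abs(a - b))),
        "median_log10_ratio": float(np.median(log_ratio)),
        "pearson_r": float(np.corrcoef(a, b)[0, 1]),
    }


# Made-up numbers purely for illustration.
summary = compare_predictions([0.9, 0.8, 0.7], [0.09, 0.08, 0.07])
print(summary)
```

A large median log ratio combined with a high correlation would point toward a systematic rescaling, whereas a low correlation would be more consistent with the two calls scoring different sequences.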
Am I missing some key reason why I should expect these two prediction arrays to differ dramatically? I am using the default rc merge settings for both (and that wouldn't account for the order-of-magnitude differences anyway). The two best ideas I have for why I might be getting such different results are:
- `MutationMap.query_bed` is re-centering on each currently-being-tested variant.
- `ref` doesn't mean what I think it does.
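
The first idea can be illustrated with a toy example that does not use Kipoi at all. It is a sketch under stated assumptions: the genome is random, `toy_score` is a made-up position-sensitive scorer standing in for a real model, and the 1000 bp window width and the coordinates are arbitrary choices. It does not show what `query_bed` actually does; it only shows that a window re-centered on a variant covers a different sequence than a window centered on the peak, so the "reference" score need not match.

```python
import numpy as np

rng = np.random.default_rng(0)
genome = rng.integers(0, 4, size=5000)  # toy genome, bases encoded as 0..3
WINDOW = 1000  # assumed model input width, chosen for illustration


def window_centered_on(center, width=WINDOW):
    """Return the fixed-width slice of the toy genome centered on `center`."""
    start = center - width // 2
    return genome[start:start + width]


def toy_score(seq):
    """Made-up scorer that weights bases near the window center most heavily."""
    pos = np.arange(seq.size)
    weights = np.exp(-((pos - seq.size / 2) ** 2) / (2 * 50.0 ** 2))
    return float(np.sum(weights * (seq == 2)) / np.sum(weights))


peak_center = 2500   # hypothetical center of a bed interval
variant_pos = 2900   # hypothetical variant inside the same interval

wild_type = window_centered_on(peak_center)
recentered = window_centered_on(variant_pos)

# Number of genome positions shared by the two windows
overlap = max(0, WINDOW - abs(variant_pos - peak_center))

print("same sequence:", np.array_equal(wild_type, recentered))
print("shared positions:", overlap)
print("scores:", toy_score(wild_type), toy_score(recentered))
```

If re-centering is what happens, comparing the sequences fed to the model in both code paths for a single region would be a direct way to confirm it.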