scoring tool for pipeline #282
Conversation
This overall looks good, though there are a collection of unrelated changes mixed in.
My main question is: do we want to summarise in R? If so, we need a follow-up to generalise this so it can support all the evaluations we need to do.
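For concreteness, the summarise-in-R route might look like the sketch below. This is a minimal sketch, not the PR's implementation: it assumes RCall.jl and scoringutils' sample-based long format from the 1.x releases (`true_value`, `prediction`, `sample`); newer scoringutils releases rename these columns, so treat the names as an assumption.

```julia
using DataFrames, RCall

# Toy long-format samples table: 10 time points with 100 posterior
# samples each. Column names follow scoringutils' 1.x sample format.
df = DataFrame(
    model = "EpiAware",
    target = "Rt",
    time = repeat(1:10, inner = 100),
    sample = repeat(1:100, outer = 10),
    true_value = repeat(rand(10), inner = 100),
    prediction = rand(1000),
)

@rput df
R"""
library(scoringutils)
scores <- score(df)
score_summary <- summarise_scores(scores, by = c("model", "target"))
"""
@rget score_summary  # back in Julia as a DataFrame
```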
Codecov Report: all modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```
@@           Coverage Diff           @@
##             main     #282   +/- ##
=======================================
  Coverage   92.85%   92.85%
=======================================
  Files          47       47
  Lines         490      490
=======================================
  Hits          455      455
  Misses         35       35
```

☔ View full report in Codecov by Sentry.
Looking again, I question the decision to nest creating the dataframe inside the scoring step.
I think the key comments from @seabbs here are:

> This PR gives a function that can target any parameter in a …
What my questions are trying to get at is: does this function give us the flexibility we need for the analysis plan? I think the answer is no, as it only supports summarising by parameter. This is really locked in by the …
It does some processing by adding the …
Yes, I can see that. I guess that's an argument for a post-processing step which makes one big dataframe out of all the models we trial, and then we send selected columns to R/scoringutils?
Yes, or julia -> score -> julia followed by julia -> summarise -> julia.
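Sketching that two-step flow (hypothetical helper names `score_raw` and `summarise_raw`, assuming RCall.jl and that `scoringutils::score` / `scoringutils::summarise_scores` accept what we pass; this is not the PR's actual API):

```julia
using DataFrames, RCall

# Step 1: julia -> score -> julia. Hand the long-format samples table
# to scoringutils and bring back the *unsummarised* scores, so no
# grouping decision is baked in at this stage.
function score_raw(df::DataFrame)
    @rput df
    R"scores <- scoringutils::score(df)"
    @rget scores
    return scores
end

# Step 2: julia -> summarise -> julia. The caller picks the grouping
# (by time point, model, scenario, ...) at summarise time.
function summarise_raw(scores::DataFrame, by::Vector{String})
    @rput scores
    @rput by
    R"score_summary <- scoringutils::summarise_scores(scores, by = by)"
    @rget score_summary
    return score_summary
end
```

The appeal of this split is that summarising by time point, model, or scenario becomes a caller decision rather than being locked into the scoring function.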
I'm undecided, but I think the … approach.
LGTM. I still have some reservations, but I think we should merge and see how this works out when we stitch it together with the other parts of the pipeline.
This PR adds a scoring utility to the pipeline and closes #247.

The main contribution here is the `score_parameters` function, which collects inference/forecast samples into a `DataFrame`, sends it to an R runtime to use `scoringutils`, and returns a dataframe of summary scores.

Example usage is in `pipeline/test/end-to-end/test_scoring.jl`. My main worry after doing a full run is that the function is a bit low level, given the boilerplate needed to look at all the time points.
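To make the boilerplate concern concrete, looking at every time point currently means a loop along these lines. This is a sketch only: `inference_results` and `truth_data` are placeholder stand-ins, and the `score_parameters` arguments are illustrative rather than its real signature (the actual call is in `pipeline/test/end-to-end/test_scoring.jl`).

```julia
using DataFrames

# Placeholder inputs standing in for the pipeline's real per-time-point
# objects; each call scores one time point, and we stack the results.
all_scores = DataFrame()
for t in eachindex(inference_results)
    scores_t = score_parameters(inference_results[t], truth_data[t])
    scores_t[!, :time] .= t          # tag rows with the time point
    append!(all_scores, scores_t)    # accumulate into one big table
end
```

A follow-up helper that maps over time points and returns the combined dataframe would fold this boilerplate back into the pipeline.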