Bootstrapping and Dataset updates #10
… to run multiple randomized iterations for each fit
…e directory structure
…ith method on `Dataset`
Since `Event`s are now wrapped in `Arc`, there's no need to open a `Dataset` multiple times; we can make new `Dataset`s by referencing the events in the original. We keep the `Arc` wrapper on `Dataset` so that a `Dataset` can be copied quickly (rather than increasing the reference count for each `Event`).
This also adds a bootstrap to the unbinned fit, which should yield more accurate uncertainties.
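As a rough illustration of the sharing described above, here is a minimal sketch. The `Event` and `Dataset` definitions and the `select` helper are hypothetical stand-ins, not the crate's actual types or API; they only show how wrapping both the events and the event list in `Arc` keeps copies and derived datasets cheap.

```rust
use std::sync::Arc;

// Hypothetical stand-in for the crate's event type.
struct Event {
    mass: f64,
}

// Hypothetical dataset: a shared list of shared events.
#[derive(Clone)]
struct Dataset {
    events: Arc<Vec<Arc<Event>>>,
}

impl Dataset {
    fn new(events: Vec<Event>) -> Self {
        Dataset {
            events: Arc::new(events.into_iter().map(Arc::new).collect()),
        }
    }

    // Build a new dataset from the events that pass `keep`. Only `Arc`
    // pointers are cloned; the event data itself is never copied.
    fn select(&self, keep: impl Fn(&Event) -> bool) -> Dataset {
        let mut picked = Vec::new();
        for event in self.events.iter() {
            if keep(&**event) {
                picked.push(Arc::clone(event));
            }
        }
        Dataset {
            events: Arc::new(picked),
        }
    }
}

fn main() {
    let original = Dataset::new(vec![
        Event { mass: 1.0 },
        Event { mass: 1.5 },
        Event { mass: 2.0 },
    ]);

    // Cloning the dataset bumps one reference count (the outer `Arc`),
    // not one count per event.
    let copy = original.clone();
    println!("outer count: {}", Arc::strong_count(&original.events));

    // A derived dataset points at the same events as the original.
    let low = original.select(|e| e.mass < 1.8);
    println!(
        "shared first event: {}",
        Arc::ptr_eq(&low.events[0], &original.events[0])
    );
    println!("copy holds {} events", copy.events.len());
}
```

The outer `Arc` is what makes `clone` a single reference-count increment; without it, copying a `Dataset` would touch the count of every `Event`.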
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

| Coverage Diff | main | #10 | +/- |
|---|---|---|---|
| Coverage | 10.36% | 10.81% | +0.45% |
| Files | 15 | 15 | |
| Lines | 3935 | 3771 | -164 |
| Branches | 3935 | 3771 | -164 |
| Hits | 408 | 408 | |
| Misses | 3527 | 3363 | -164 |
CodSpeed Performance Report
Congrats! CodSpeed is installed 🎉
You will start to see performance impacts in the reports once the benchmarks are run from your default branch.
It's not wrong, it's just missing error estimation and the total unbinned fit.
This PR mainly addresses methods to resample `Event`s via a bootstrap to allow for more accurate error estimation. A bootstrapped `Dataset` contains the same number of events as the original, but resamples the original with replacement, meaning some events might be duplicated and some might be missing entirely. This simulates the process of collecting new data, and by redoing a fit with these new bootstrapped data multiple times, we can more accurately represent how uncertainties accumulate in the fitting process.

Additionally, this PR reorganizes the way data is stored, wrapping each `Event` in an `Arc`. This way, bootstrapped and binned `Dataset`s don't use as much memory as the original and only have to store references to the original `Event`s. I removed the `open_binned` function in favor of a `Dataset::bin_by` method.

This PR also switches the benchmark CI to CodSpeed and adds a data-related benchmark.
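
The resampling idea can be sketched roughly as follows. Everything here is an assumption made for illustration: the `Event` struct, the `bootstrap`, `fit_mean`, and `bootstrap_std_error` functions, and the small `XorShift64` generator (used only so the example needs no external crates) are hypothetical and are not the crate's API. The "fit" is just a sample mean standing in for the real unbinned fit.

```rust
use std::sync::Arc;

// Hypothetical stand-in for the crate's event type.
struct Event {
    value: f64,
}

// Tiny xorshift64 generator so the sketch has no dependencies; the seed
// must be nonzero. A real implementation would more likely use a
// dedicated random-number crate.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    // Index in 0..n (slight modulo bias, acceptable for illustration).
    fn index(&mut self, n: usize) -> usize {
        (self.next_u64() % n as u64) as usize
    }
}

// Draw `events.len()` events with replacement. Only `Arc` pointers are
// cloned, so the resample shares its event data with the original.
fn bootstrap(events: &[Arc<Event>], rng: &mut XorShift64) -> Vec<Arc<Event>> {
    (0..events.len())
        .map(|_| Arc::clone(&events[rng.index(events.len())]))
        .collect()
}

// Stand-in "fit": the sample mean of `value`.
fn fit_mean(events: &[Arc<Event>]) -> f64 {
    events.iter().map(|e| e.value).sum::<f64>() / events.len() as f64
}

// Repeat the fit on `n_boot` resamples and return the sample standard
// deviation of the fitted values as the uncertainty estimate.
fn bootstrap_std_error(events: &[Arc<Event>], n_boot: usize, seed: u64) -> f64 {
    let mut rng = XorShift64(seed);
    let fits: Vec<f64> = (0..n_boot)
        .map(|_| fit_mean(&bootstrap(events, &mut rng)))
        .collect();
    let mean = fits.iter().sum::<f64>() / n_boot as f64;
    let var = fits.iter().map(|f| (f - mean).powi(2)).sum::<f64>() / (n_boot as f64 - 1.0);
    var.sqrt()
}

fn main() {
    let events: Vec<Arc<Event>> = (0..100)
        .map(|i| Arc::new(Event { value: i as f64 }))
        .collect();
    let err = bootstrap_std_error(&events, 200, 42);
    println!("fitted mean = {:.2}, bootstrap error = {:.2}", fit_mean(&events), err);
}
```

The spread of the fitted values across resamples is what serves as the error estimate; more resamples give a more stable estimate at the cost of more fits.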