This is a suggestion by Anders, as an alternative to the "remove negatives" step (see #116) that we currently perform on the background.
Statistics, last paragraph in Section 4: I think we're introducing a slight bias by leaving out the negative-count bins in the background-subtracted spectra. That is, in our simulated spectra we accept statistical fluctuations in one direction (surprisingly low background count and/or surprisingly large total count), but we exclude fluctuations in the opposite direction (high background count and/or low total count).
Would anything break in the math/code if we actually just included the negative-count bins in the fit? To be clear, I don't expect the impact to be large (perhaps not even noticeable), so if it's technically challenging we may want to leave it as is.
Alternatively, I guess we could sample the total count (tot_i) first, and then sample the background count (bkg_i) repeatedly until we get a sample that satisfies bkg_i < tot_i -- so effectively sample the background count from a conditional distribution p(bkg_i | lambda_bkg, bkg_i < tot_i).
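The conditional sampling described above could look roughly like the sketch below. It is only an illustration, not the project's implementation: the function name `sample_counts_conditional`, its parameters and the assumption of Poisson-distributed counts are all hypothetical. Note that a bin with tot_i = 0 can never satisfy the strict condition bkg_i < tot_i, so the sketch flags such bins instead of resampling them forever.

```python
import numpy as np


def sample_counts_conditional(lam_tot, lam_bkg, rng, max_tries=1000):
    """Sample tot_i, then resample bkg_i until bkg_i < tot_i (rejection sampling).

    Hypothetical helper: assumes independent Poisson counts per bin.
    Returns (tot, bkg, ok), where ok marks bins that satisfy bkg < tot.
    """
    lam_tot = np.atleast_1d(np.asarray(lam_tot, dtype=float))
    lam_bkg = np.broadcast_to(np.asarray(lam_bkg, dtype=float), lam_tot.shape)

    tot = rng.poisson(lam_tot)  # total counts are sampled once
    bkg = rng.poisson(lam_bkg)  # first background draw

    # Bins with tot == 0 cannot satisfy bkg < tot, so they are never resampled.
    resample = (bkg >= tot) & (tot > 0)
    for _ in range(max_tries):
        if not resample.any():
            break
        bkg[resample] = rng.poisson(lam_bkg[resample])
        resample = (bkg >= tot) & (tot > 0)

    ok = bkg < tot
    return tot, bkg, ok


rng = np.random.default_rng(0)
tot, bkg, ok = sample_counts_conditional(np.full(1000, 5.0), 3.0, rng)
net = tot[ok] - bkg[ok]  # strictly positive background-subtracted counts
```

One thing to keep in mind with this variant is that it still forces every kept bin to be positive, so it changes the distribution of the background rather than removing the one-sided selection.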
[I think we're encountering a classic statistics issue here: if the true value of some quantity X is close to zero, X < 0 is unphysical, and your individual estimates of X have a significant statistical uncertainty, you should expect some of your X estimates to get a central value in the X < 0 region. If you force each individual estimate to be X >= 0 (e.g. by leaving out the X < 0 estimates) and later combine your X estimates, your combined estimator will be biased towards high X values.]
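The bias described in the bracketed remark can be checked with a small toy simulation. This is a sketch under stated assumptions, not taken from the analysis code: the helper name `mean_with_and_without_negatives`, the Poisson model and the numbers used are made up for illustration. Each estimate is X = tot - bkg, and the average over all estimates is compared with the average over only the X >= 0 estimates.

```python
import numpy as np


def mean_with_and_without_negatives(signal, background, n_samples, rng):
    """Toy check of the selection bias (hypothetical helper).

    Draws n_samples independent estimates X = tot - bkg, with
    tot ~ Poisson(signal + background) and bkg ~ Poisson(background).
    Returns (mean over all estimates, mean over estimates with X >= 0).
    """
    tot = rng.poisson(signal + background, size=n_samples)
    bkg = rng.poisson(background, size=n_samples)
    x = tot - bkg
    return x.mean(), x[x >= 0].mean()


rng = np.random.default_rng(1)
mean_all, mean_kept = mean_with_and_without_negatives(
    signal=0.5, background=10.0, n_samples=200_000, rng=rng
)
print(f"all estimates: {mean_all:.3f}, negatives dropped: {mean_kept:.3f}")
```

The mean over all estimates should scatter around the true signal, while the mean with the negatives dropped comes out higher, which is the one-sided effect discussed above.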
Somewhat along the same lines is #28 and the following comment:
Question about the chi^2 in Section 5: We say that "[...] most bins of the first-generation matrices follow a normal distribution". I assume it's the low-count bins that deviate most strongly from a normal distribution? I wonder if this might improve a bit if we include the negative-count bins in the fit (point 5 above)?
[For the future: it could be interesting to try to replace the chi^2 with a log-likelihood function that also tries to account for the deviations from normal distributions.]
In line with the comments by the referee we might just as well not (by default) cut away the negative counts etc. I'm not working on a branch to implement this.
If one still wishes to run a bg subtraction in the Ensemble class, one could for example use the action_raw, action_unfolded and action_firstgen attributes to apply it to the corresponding matrices.