Remove or Improve background subtraction: Currently introducing a bias? #119

fzeiser · 2020-03-30T10:14:16Z

This is a suggestion by Anders, as an alternative to the "remove negatives" step (see #116) that we currently perform on the background-subtracted spectra.

Statistics, last paragraph in Section 4: I think we're introducing a slight bias by leaving out the negative-count bins in the background-subtracted spectra. That is, in our simulated spectra we accept statistical fluctuations in one direction (surprisingly low background count and/or surprisingly large total count), but we exclude fluctuations in the opposite direction (high background count and/or low total count).
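To make the size of the effect concrete, here is a minimal toy simulation of the bias described above. It is only a sketch: the rates `lam_sig` and `lam_bkg`, the helper names, and the use of Knuth's Poisson sampler are all assumptions chosen for illustration, not anything taken from the actual analysis code.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson variate with Knuth's algorithm (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def mean_net_counts(lam_sig, lam_bkg, n_samples, drop_negative, seed=0):
    """Average tot_i - bkg_i over simulated bins, optionally dropping negatives."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_samples):
        tot = sample_poisson(lam_sig + lam_bkg, rng)  # signal + background
        bkg = sample_poisson(lam_bkg, rng)            # independent background estimate
        net = tot - bkg
        if drop_negative and net < 0:
            continue  # mimics leaving out the negative-count bins
        kept.append(net)
    return sum(kept) / len(kept)


# Hypothetical rates: a weak signal on top of a larger background.
unbiased = mean_net_counts(1.0, 5.0, 20000, drop_negative=False)
truncated = mean_net_counts(1.0, 5.0, 20000, drop_negative=True)
print(f"true signal: 1.0, all bins: {unbiased:.2f}, negatives dropped: {truncated:.2f}")
```

With all bins kept, the average net count should scatter around the true signal rate; with the negative bins dropped, it is pulled upwards, and the shift grows as the signal gets small relative to the background.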

Would anything break in the math/code if we actually just included the negative-count bins in the fit? To be clear, I don't expect the impact to be large (perhaps not even noticeable), so if it's technically challenging we may want to leave it as is.

Alternatively, I guess we could sample the total count (tot_i) first, and then sample the background count (bkg_i) repeatedly until we get a sample that satisfies bkg_i < tot_i -- so effectively sample the background count from a conditional distribution p(bkg_i | lambda_bkg, bkg_i < tot_i).
[I think we're encountering a classic statistics issue here: if the true value of some quantity X is close to zero, X < 0 is unphysical, and your individual estimates of X have a significant statistical uncertainty, you should expect some of your X estimates to get a central value in the X < 0 region. If you force each individual estimate to be X >= 0 (e.g. by leaving out the X < 0 estimates) and later combine your X estimates, your combined estimator will be biased towards high X values.]
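The conditional sampling idea, i.e. drawing bkg_i from p(bkg_i | lambda_bkg, bkg_i < tot_i), could be sketched with plain rejection sampling as follows. The function names, the `max_tries` guard and the handling of `tot_i = 0` are assumptions of this sketch: with the strict inequality as written, a bin with zero total counts has no admissible background count, so the sketch returns `None` for it.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson variate with Knuth's algorithm (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_bkg_conditional(lam_bkg, tot, rng, max_tries=100_000):
    """Sample bkg ~ Poisson(lam_bkg) conditioned on bkg < tot by rejection."""
    if tot <= 0:
        return None  # no non-negative integer satisfies bkg < 0
    for _ in range(max_tries):
        bkg = sample_poisson(lam_bkg, rng)
        if bkg < tot:
            return bkg  # accepted: condition bkg < tot holds
    raise RuntimeError("acceptance probability too low for rejection sampling")


rng = random.Random(1)
lam_sig, lam_bkg = 1.0, 5.0  # hypothetical rates
nets = []
for _ in range(5):
    tot = sample_poisson(lam_sig + lam_bkg, rng)      # sample tot_i first
    bkg = sample_bkg_conditional(lam_bkg, tot, rng)   # then bkg_i given tot_i
    nets.append(None if bkg is None else tot - bkg)
print(nets)
```

Rejection sampling gets slow when tot_i is small compared to lambda_bkg, since most background draws are then rejected; the `max_tries` guard only prevents an endless loop and is not a statement about what the real code should do.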

fzeiser · 2020-03-30T10:18:15Z

Somewhat along the same lines are #28 and the following comment:

Question about the chi^2 in Section 5: We say that "[...] most bins of the first-generation matrices follow a normal distribution". I assume it's the low-count bins that deviate most strongly from a normal distribution? I wonder if this might improve a bit if we include the negative-count bins in the fit (point 5 above)?
[For the future: it could be interesting to try to replace the chi^2 with a log-likelihood function that also tries to account for the deviations from normal distributions.]
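As a rough illustration of what such a replacement could look like for low-count bins, the sketch below compares a Gaussian chi^2 with a Poisson negative log-likelihood. This is an assumption-laden toy: the Poisson form only applies to raw, non-negative integer counts, so background-subtracted bins (which can be negative) would need a different distribution; the function names and the numbers are made up for the example.

```python
import math


def chi2(counts, model, sigma):
    """Gaussian chi^2: sum of squared, uncertainty-scaled residuals."""
    return sum(((n - m) / s) ** 2 for n, m, s in zip(counts, model, sigma))


def poisson_nll(counts, model):
    """Negative Poisson log-likelihood, -sum log P(n | m), for integer n >= 0 and m > 0."""
    return -sum(
        n * math.log(m) - m - math.lgamma(n + 1)  # log of m^n exp(-m) / n!
        for n, m in zip(counts, model)
    )


counts = [0, 1, 3, 2]  # hypothetical low-count bins
for scale in (0.5, 1.0, 2.0):
    model = [scale * 1.5] * len(counts)       # flat model, scaled around the sample mean
    sigma = [math.sqrt(m) for m in model]     # Gaussian approximation: variance = mean
    print(scale, round(chi2(counts, model, sigma), 3), round(poisson_nll(counts, model), 3))
```

For a flat model, the Poisson likelihood is maximized exactly at the sample mean of the counts, whereas the chi^2 with model-dependent uncertainties generally is not, which is one way the two objectives differ in the low-count regime.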

fzeiser · 2020-09-08T08:29:15Z

In line with the comments by the referee, we might just as well not (by default) cut away the negative counts etc. I'm now working on a branch to implement this.

If one still wishes to run a background subtraction in the Ensemble class, one could for example use the action_raw, action_unfolded and action_firstgen attributes to apply it to the corresponding matrices.

Keep negative entries in Ensemble, Unfolder and Firstgen by default.

fzeiser · 2020-09-09T14:12:03Z

See also #148 (comment) for another idea of how to avoid the bias.

fzeiser added the Suggestion (suggestion for new feature/changes) label Mar 30, 2020

fzeiser added this to the Version 2.0 milestone Mar 30, 2020

fzeiser pushed a commit that referenced this issue Sep 8, 2020

Keep negative entries by default. Solves #119.

cf32991

Keep negative entries in Ensemble, Unfolder and Firstgen by default.

fzeiser mentioned this issue Sep 8, 2020

Keep negative entries by default. Solves #119. #148

Closed

fzeiser changed the title ~~Improved background subtraction~~ Remove or Improve background subtraction: Currently introducing a bias? Sep 9, 2020
