Doctor Visits adjusted signal AUC does not match the raw signal #2045
Comments
I believe this was part of @rumackaaron's work. Are we correct in assuming that these should match?
Interesting find! Mathematically, they don't have to match and I think that's the expected behavior in this case. When creating the design matrix in weekday.py, the constraint is that […]. For simplicity, say that there are only two days in the week. Let […]. It may be possible to create a different constraint to ensure that (at least on the training data) the sum of the original signal is the same as that of the adjusted signal. I don't think it's possible to ensure that constraint holds over an arbitrary time interval while using multiplicative day-of-week effects. P.S. I find it concerning that the "sawtooth" pattern is still present in the adjusted signal. I don't know what the training period is for fitting the day-of-week effects, but it may be worth experimenting to find an appropriate period that consistently removes the "sawtooth" pattern.
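To make the two-day-week argument concrete, here is a minimal sketch. It assumes multiplicative effects whose logs sum to zero (that is, the effects multiply to 1); this normalization, the effect values, and the helper `adjust_for_weekday` are hypothetical and chosen for illustration, not taken from weekday.py.

```python
def adjust_for_weekday(raw, effects):
    """Divide each value by the multiplicative effect for its day of the week."""
    period = len(effects)
    return [value / effects[i % period] for i, value in enumerate(raw)]


# Hypothetical two-day "week": the effects multiply to 1 (log-effects sum to 0).
effects = [2.0, 0.5]

# Assumed flat underlying level over three two-day weeks.
level = [10.0] * 6

# The observed signal shows a sawtooth because of the day-of-week effects.
raw = [lvl * effects[i % len(effects)] for i, lvl in enumerate(level)]
adjusted = adjust_for_weekday(raw, effects)

raw_sum = sum(raw)            # 20 + 5 + 20 + 5 + 20 + 5
adjusted_sum = sum(adjusted)  # 10 * 6

print(raw_sum, adjusted_sum)
```

Under these assumptions the adjustment recovers the flat level exactly, yet the two sums differ (75 versus 60), and they also differ on any one-day interval. A constraint on the effects alone does not force the areas to agree.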
Indeed. In fact, it's not possible to ensure that with any modification (think of the special case of an interval of one day). Even if we relax the requirement to all intervals of some fixed length (e.g. 7 days), I think that the only solution is a moving average. But a moving average isn't sufficiently sensitive to the most recent developments. This suggests an asymmetric kernel, e.g. a triangle or half-Gaussian. I think all kernels satisfy some form of long-term AUC equivalence. But this doesn't address the day-of-week effects. We need to send this problem for some research TLC.
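A rough sketch of the kernel idea, under stated assumptions: a causal triangle kernel whose weights sum to 1, applied as a full convolution so that no mass is dropped at the end of the series. The function `causal_smooth`, the weights, and the toy series are all hypothetical.

```python
import math


def causal_smooth(signal, weights):
    """Full causal convolution: out[s] = sum over k of weights[k] * signal[s - k]."""
    out = [0.0] * (len(signal) + len(weights) - 1)
    for t, value in enumerate(signal):
        for k, w in enumerate(weights):
            out[t + k] += w * value
    return out


# Asymmetric triangle kernel: most weight on the most recent day, weights sum to 1.
triangle = [3 / 6, 2 / 6, 1 / 6]

# Toy sawtooth series (assumed values, not real Doctor Visits data).
raw = [20.0, 5.0, 20.0, 5.0, 20.0, 5.0]
smoothed = causal_smooth(raw, triangle)

# Over the whole (extended) series the totals agree up to rounding.
totals_match = math.isclose(sum(raw), sum(smoothed))

# Over a one-day interval they do not.
first_day_matches = math.isclose(raw[0], smoothed[0])

print(totals_match, first_day_matches)
```

In this toy case the totals agree only because the kernel is normalized and the output is allowed to run past the last input day. Truncating the output to the input length would lose some of the most recent mass, so the equivalence is long-term rather than per-interval.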
Actual behavior
When looking at the data from the Doctor Visits signal, the area under the curve of the day-adjusted signal does not match that of the raw signal. The values of the raw signal sum to 67.70, while those of the day-adjusted signal sum to 56.22.
Expected behavior
@RoniRos and I were looking through this yesterday and it was our intuition that the AUC should match between these two signals.
Context
Here's some code to replicate the plot above