Explore Different Lengths of Data Available #17

Open
1 task
chauhankaranraj opened this issue Dec 16, 2020 · 0 comments

Comments

@chauhankaranraj
Member

Feedback no. 4

In the current forecasting notebook, we assumed that the maximum number of days of data we are guaranteed to have at runtime is 6. However, after talking to Ceph subject matter experts, it seems there might be some flexibility there.

On the one hand, having more data available might improve model accuracy. On the other hand, it would mean users have to store more health data locally. The main purpose of this issue is to find the “sweet spot” where little data has to be stored and yet model performance still improves.
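
One way to make the “sweet spot” concrete is sketched below, assuming we already have one accuracy score per candidate number of days. The function name `smallest_sufficient_window` and the `tolerance` parameter are hypothetical and are not part of the existing notebook; this is only one possible selection rule.

```python
from typing import Dict


def smallest_sufficient_window(scores_by_days: Dict[int, float],
                               tolerance: float = 0.01) -> int:
    """Return the fewest days whose score is within `tolerance` of the best.

    `scores_by_days` maps "number of days of data" to a score where higher
    is better (e.g. accuracy). Both the name and the tolerance-based rule
    are assumptions made for illustration.
    """
    if not scores_by_days:
        raise ValueError("scores_by_days must not be empty")
    best = max(scores_by_days.values())
    # Keep every window length that is "close enough" to the best score,
    # then prefer the one that requires storing the least data.
    return min(days for days, score in scores_by_days.items()
               if score >= best - tolerance)
```

A larger `tolerance` favors storing less data, while a smaller one favors accuracy. The right value is a judgment call that would need input from the Ceph subject matter experts.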

As a data scientist, I want to explore how model performance changes with the number of days of data available at runtime, so that we can find a reasonable compromise between the amount of data stored and the model accuracy achieved.

Acceptance criteria:

  • EDA notebook showing the effect of the number of days of data on model accuracy
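
A minimal sketch of how such an experiment could be structured is shown below. It assumes each drive's history is a sequence of daily records ordered oldest to newest, and that the caller supplies a `fit_and_score` function that trains and evaluates the model. The names `last_n_days`, `sweep_window_lengths`, and `fit_and_score` are hypothetical and do not come from the existing notebook.

```python
from typing import Any, Callable, Dict, Iterable, List, Sequence


def last_n_days(history: Sequence[Any], n_days: int) -> List[Any]:
    """Keep only the most recent `n_days` records of one drive's history.

    Assumes `history` is ordered oldest to newest. If fewer than `n_days`
    records exist, the whole history is returned.
    """
    if n_days <= 0:
        raise ValueError("n_days must be positive")
    return list(history[-n_days:])


def sweep_window_lengths(
    histories: Sequence[Sequence[Any]],
    labels: Sequence[Any],
    window_lengths: Iterable[int],
    fit_and_score: Callable[[List[List[Any]], Sequence[Any]], float],
) -> Dict[int, float]:
    """Score the model once per candidate number of days of data.

    `fit_and_score` is a placeholder for whatever training and evaluation
    routine the forecasting notebook uses; it receives the truncated
    histories and the labels, and returns a single score.
    """
    results: Dict[int, float] = {}
    for n_days in window_lengths:
        truncated = [last_n_days(h, n_days) for h in histories]
        results[n_days] = fit_and_score(truncated, labels)
    return results
```

Plotting the returned dictionary (days on the x-axis, score on the y-axis) would give the accuracy-versus-days curve the notebook is meant to show.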
@isabelizimm isabelizimm self-assigned this Jun 2, 2021
@isabelizimm isabelizimm removed their assignment Oct 26, 2021