Computation took so long #28

Open
kurniayazid opened this issue Aug 6, 2020 · 3 comments
Comments

@kurniayazid

Hello, I am Ega from Indonesia and I would like to compute Rt for my country. However, it took me around 13 hours to run the data for one region, whereas in the Python notebook it seems to take you only around 7 minutes per region. Do you use a virtual machine or cloud computing to speed up the computation? Sorry if this is a basic question; I am a newbie. Thanks in advance.

Best,
Ega

@michaelosthege
Collaborator

Hi Ega,
Yes, the model is somewhat expensive to run. However, how you install PyMC3 and its dependencies has a massive impact on performance.

Are you running on Windows or Linux?
Pay attention to any warnings that appear when you import Theano or PyMC3.
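
One way to make those import-time messages harder to miss is to collect them programmatically. The following is a minimal sketch using only the standard library; the helper name `collect_import_messages` is made up for illustration, and it assumes the library reports problems through Python's `warnings` or `logging` modules with log records propagating to the root logger. The module must not have been imported earlier in the same session, otherwise nothing is emitted again.

```python
import importlib
import logging
import warnings


class _ListHandler(logging.Handler):
    """Logging handler that stores WARNING-or-higher records in a list."""

    def __init__(self):
        super().__init__(level=logging.WARNING)
        self.messages = []

    def emit(self, record):
        self.messages.append(f"{record.name}: {record.getMessage()}")


def collect_import_messages(module_name):
    """Import a module and return the warnings and log messages it emitted.

    Hypothetical helper, not part of PyMC3 or Theano.
    """
    handler = _ListHandler()
    root = logging.getLogger()
    root.addHandler(handler)
    try:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")
            try:
                importlib.import_module(module_name)
            except Exception as exc:  # missing or broken installation
                return [f"could not import {module_name}: {exc}"]
    finally:
        root.removeHandler(handler)
    return [str(w.message) for w in caught] + handler.messages


if __name__ == "__main__":
    for name in ("theano", "pymc3"):
        for message in collect_import_messages(name):
            print(f"[{name}] {message}")
```

Run this in a fresh interpreter. An empty result only means nothing was reported through these two channels, not that the installation is necessarily fast.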

cheers

@JessVanN

> Hi Ega,
> Yes, the model is somewhat expensive to run. However, there are aspects about how you install PyMC3 and its dependencies that have a massive impact on the performance.
>
> Are you running on Windows or Linux?
> Pay attention to any warnings that appear when you import Theano or PyMC3.
>
> cheers

Can you clarify this further? What exactly can I research or adjust to improve computational performance? I am trying to run this model for the counties in my state, and each takes 30 minutes to 1 hour, so running multiple regions takes a massive amount of time. I was unfamiliar with PyMC3 before this, but I have tried adjusting the cores and chains in the generative file. I would greatly appreciate any further direction you could provide.

@michaelosthege
Collaborator

Hi @jessplaysclash,

First, I recommend forking and working with https://github.com/rtcovidlive/rtlive-global, which is the more recent version and also the one running for https://rtlive.de/global.
30-60 minutes per region is about what you should expect from the model. Right now it won't get much faster than that.
If you implement reliable data loaders for rtlive-global, we might be able to run your country on our cluster alongside the other regions.
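
To get a feel for what the 30-60 minutes per region adds up to, here is a back-of-the-envelope sketch. The region count and worker count are made-up numbers, and the estimate assumes regions are fitted independently, that every worker has enough cores for its chains, and that parallel runs do not slow each other down.

```python
import math


def estimate_wall_clock_minutes(n_regions, minutes_per_region, n_workers=1):
    """Rough wall-clock time when regions are fitted independently in batches."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    batches = math.ceil(n_regions / n_workers)
    return batches * minutes_per_region


# Hypothetical example: 60 regions, at both ends of the 30-60 minute range.
for minutes in (30, 60):
    sequential = estimate_wall_clock_minutes(60, minutes)
    parallel = estimate_wall_clock_minutes(60, minutes, n_workers=8)
    print(f"{minutes} min/region: {sequential / 60:.0f} h sequential, "
          f"{parallel / 60:.0f} h on 8 workers")
```

In practice the per-region time varies, so treat the result as an order-of-magnitude figure rather than a schedule.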

One thing that could make sampling a little faster would be to initialize the NUTS sampler based on yesterday's tuning results. However, this is not easily accessible with the current PyMC3 API.

Another option would be to shorten the time series: start a few months back instead of at dates as early as March/May 2020 (if you even have data that far back).
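
Trimming the series could look roughly like the sketch below. It uses plain Python dates and a made-up series so it runs anywhere; the helper `trim_history` and the 90-day window are illustrative assumptions, not something taken from the rtlive code.

```python
from datetime import date, timedelta


def trim_history(daily_counts, keep_days):
    """Keep only the last `keep_days` days of a {date: count} mapping."""
    if not daily_counts:
        return {}
    cutoff = max(daily_counts) - timedelta(days=keep_days - 1)
    return {day: n for day, n in sorted(daily_counts.items()) if day >= cutoff}


# Hypothetical data: one made-up count per day starting 1 March 2020.
start = date(2020, 3, 1)
counts = {start + timedelta(days=i): 10 + i for i in range(200)}
recent = trim_history(counts, keep_days=90)
print(len(counts), "->", len(recent), "days; first kept day:", min(recent))
```

With a date-indexed pandas object, the same cut is usually a single label slice on the index. How much history the model needs for stable estimates is a separate question that this sketch does not answer.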
