-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don’t download model files for every test run #238
Comments
Hi Jost, this looks like a good solution for the time being. I think moving all the data to an outside repository/Zenodo entry would solve all of the mentioned problems. |
Moving the data files elsewhere would solve the first point; but not the second and third one. (In particular, having the files outside of GitHub would potentially make the third point much worse—downloading files over the internet from e.g. Zenodo is much slower than downloading them from another GitHub server that’s potentially a few meters away in the same data centre.) For adding new models, yes, being able to use them directly from Zenodo (where modeling groups have hopefully made them available themselves) would be good. For existing model files, I’m not sure whether it’s worth the effort to move them elsewhere. But as you say, that’s a discussion for the long run; if we agree that the proposal above is a good solution in the meantime & since it doesn’t impede other long-term solutions, I think that’s good enough for now. |
For the second problem: having all the models decoupled from the main SNEWPY repository means that the developer will be in exactly the same position as the testing script in the GitHub job. Download the models, or reuse the ones that are already in cache.
I agree, this might be the case (although it's worth checking if this problem is real). |
For Zenodo, we’ve previously observed issues where downloads would fail completely or tests would take ~40 min to run, largely due to slow downloads. It’s possible that was some temporary hiccup for a few days; but even if it’s reliable and faster now, it would be good to avoid even a few min extra wait on every test run. |
I see, thanks for pointing to the issue. One more thought about running tests: it's possible to cache the directories between the runs, I think this is done especially for such use case |
Right now, tests on GitHub Actions need to download all model files for every test run. This has multiple disadvantages:
snewpy.model_path
)At least for model files that are in the GitHub repository anyway, the files are already available on the test runner; it would be sufficient to change
snewpy.model_path
to point to themodels/
directory in the repo. We could then also eliminate a lot of duplicate testing code intests/test_0{1,2}*.py
.(Instead, we would probably need one download test running as part of the release workflow—after tagging, but before uploading the release to PyPI.)
Open questions: How to deal with models whose files are not in the GitHub repo? (E.g. on Zenodo?)
The text was updated successfully, but these errors were encountered: