Validation tests #107
Conversation
@TomNicholas Now the test suite runs in half the time. There is no particular test that takes up a lot of time (as before); it's just a lot of tests. Not optimal yet, but an improvement. The new solution for the validation tests is definitely cleaner, and it means we can easily rewrite the test data if we want to run the tests on even smaller input test data. Something to keep in mind, though: if we overwrite the test data often via the overwrite option, the size of this GitHub repo will grow.
```python
forcing = request.getfixturevalue(forcing_fixture)
fname = _get_fname(name)
forcing.ds.to_zarr(fname, mode="a")
```
@TomNicholas Would it maybe be safer to first delete the entire zarr group before overwriting?
Probably? Do you think it's likely that you will want to change the store on disk by only altering / adding individual variables (which I think is mostly what append is intended for)?
We might want to delete obsolete variables if we take them out of the datasets. So we should probably delete the entire zarr store before rewriting.
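A minimal sketch of that delete-then-rewrite approach, continuing the snippet from the diff above and assuming a local directory store (`_get_fname` and `name` are as in the diff; everything else is illustrative):

```python
import shutil
from pathlib import Path

fname = Path(_get_fname(name))  # _get_fname / name as in the diff above
if fname.exists():
    # Removing the whole store first ensures that variables dropped from
    # the dataset do not linger on disk from a previous write.
    shutil.rmtree(fname)
forcing.ds.to_zarr(fname, mode="w")
```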
Great idea @NoraLoose! I am struggling to scroll through all the changes on my phone, but one suggestion: instead of setting an environment variable outside Python, you could add a config option to the test suite, so that you can do `pytest --overwrite-regression-test-data`. See VirtualiZarr's `--run-network-tests` for an example of how to do this. Presumably in reality we might want to overwrite the test data for only one test at a time? If I set that flag and then run just that test, is that what it will do?
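For reference, a minimal sketch of such a config option via pytest's `pytest_addoption` hook, modeled on the VirtualiZarr pattern mentioned above (the flag name and behavior here follow the suggestion in this comment, not necessarily the PR's final API):

```python
# conftest.py
def pytest_addoption(parser):
    # Hypothetical flag; the PR may name it differently.
    parser.addoption(
        "--overwrite-regression-test-data",
        action="store_true",
        default=False,
        help="Rewrite the saved zarr test data instead of comparing against it.",
    )
```

A test can then read the flag with `request.config.getoption("--overwrite-regression-test-data")` and decide whether to write new data or compare against the saved data.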
(reviewed, see comment above)
With all the new zarr data in the diffs, it's hard to see the essential code change. This is the new test module: `test_validation.py`. As implemented, all test data is overwritten at once.
To clarify, there is only a single test that writes all test data and a single test that checks all test data. Both use parametrization over fixtures. This is the most compact way to write these validation tests, but it would probably be better to have the option to overwrite the test data one by one.
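A minimal sketch of the fixture-parametrized check described here (the fixture list, `FORCING_FIXTURES`, and the `_get_fname` path scheme are illustrative stand-ins, not the exact code in this PR):

```python
import pytest
import xarray as xr

FORCING_FIXTURES = ["grid", "tidal_forcing"]  # stand-in for the eleven fixtures

def _get_fname(name):
    # Hypothetical path scheme for the saved stores.
    return f"tests/test_data/{name}.zarr"

@pytest.mark.parametrize("forcing_fixture", FORCING_FIXTURES)
def test_validation(forcing_fixture, request):
    # Resolve the fixture by name, then compare against the saved zarr store.
    forcing = request.getfixturevalue(forcing_fixture)
    expected = xr.open_zarr(_get_fname(forcing_fixture))
    xr.testing.assert_allclose(forcing.ds, expected)
```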
@TomNicholas I followed your advice and implemented the overwrite mechanism as a pytest config option instead of an environment variable; see the updated PR description.
This PR does two main things:

1. Makes the tests in `test_boundary_forcing.py` operate on coarsened GLORYS test data, to speed up the automated test suite. Closes "Boundary forcing tests are super slow" #105.
2. Adds a new `test_validation.py` module which compares entire xarray Datasets to the expected datasets in a saved zarr store. Closes "Use xarray.testing.assert_allclose" #104.

You can overwrite the saved test data via ~~an environment variable~~ a config option, passing the names of the fixtures to overwrite. For example, `tidal_forcing` and `grid` are two of the (currently) eleven test fixtures; selecting them overwrites their saved data while the other nine fixtures stay untouched. You can also overwrite all test data at once. Otherwise, i.e., if you run the test suite without the option, the test data is not overwritten.
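For illustration, a sketch of how a write test could honor such a per-fixture option (the value-taking `--overwrite` flag, `FORCING_FIXTURES`, and `_get_fname` are assumptions for this sketch, not necessarily the PR's exact interface; the flag would be registered via a `pytest_addoption` hook like the one sketched earlier in the thread):

```python
import pytest

FORCING_FIXTURES = ["grid", "tidal_forcing"]  # stand-in for the eleven fixtures

def _get_fname(name):
    # Hypothetical path scheme for the saved stores.
    return f"tests/test_data/{name}.zarr"

@pytest.mark.parametrize("forcing_fixture", FORCING_FIXTURES)
def test_save_results(forcing_fixture, request):
    selected = request.config.getoption("--overwrite")  # e.g. "tidal_forcing,grid" or "all"
    if selected is None:
        pytest.skip("no --overwrite given; saved test data stays untouched")
    if selected != "all" and forcing_fixture not in selected.split(","):
        pytest.skip(f"{forcing_fixture} was not selected for overwriting")
    forcing = request.getfixturevalue(forcing_fixture)
    forcing.ds.to_zarr(_get_fname(forcing_fixture), mode="w")
```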