Validation tests #107

Merged
merged 23 commits into from
Sep 3, 2024

Conversation

NoraLoose
Collaborator

@NoraLoose NoraLoose commented Aug 30, 2024

This PR does two main things:

  • The tests in test_boundary_forcing.py now operate on coarsened GLORYS test data to speed up the automated test suite. Closes #105 (Boundary forcing tests are super slow).
  • The regression / validation tests are entirely refactored:
    • Instead of hardcoded values in the test modules, there is now a test_validation.py module which compares entire xarray Datasets to the expected datasets saved in zarr format. Closes #104 (Use xarray.testing.assert_allclose).
    • The zarr validation data is overwritten if you invoke pytest with a config option as follows:
pytest --overwrite=tidal_forcing --overwrite=grid

where tidal_forcing and grid are two of the (currently) eleven test fixtures; the validation data for these two will then be overwritten, while the data for the other nine fixtures stays untouched. You can also run

pytest --overwrite=all

to overwrite all test data.

Otherwise, i.e., if you just run

pytest

there is no overwriting of the test data.
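
For readers unfamiliar with pytest config options, here is a minimal sketch of how a repeatable --overwrite option could be registered in a conftest.py. The option name comes from the description above; the help text and the should_overwrite helper are assumptions made for illustration and are not taken from the merged code.

```python
# conftest.py (illustrative sketch, not the merged implementation)


def pytest_addoption(parser):
    """Register a repeatable --overwrite option with pytest."""
    parser.addoption(
        "--overwrite",
        action="append",  # each --overwrite=NAME adds NAME to a list
        default=[],       # plain `pytest` therefore selects nothing
        help="Fixture whose validation data should be rewritten; "
        "may be repeated, or given as 'all'.",
    )


def should_overwrite(fixture_name, selected):
    """Hypothetical helper: decide whether a fixture's data is rewritten.

    `selected` is the list collected from the command line, e.g.
    ["tidal_forcing", "grid"] or ["all"].
    """
    return "all" in selected or fixture_name in selected


# Inside a test, the collected list would be available as:
#     selected = request.config.getoption("--overwrite")
```

With action="append", running plain pytest yields an empty list, so nothing is overwritten by default.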

@NoraLoose
Collaborator Author

@TomNicholas Now the test suite runs in half the time. No single test takes up a lot of time anymore (as was the case before); there are just a lot of tests. Not optimal yet, but an improvement.

The new solution for the validation tests is definitely cleaner, and this means we can easily re-write test data if we want to run the tests on even smaller input test data. Something to keep in mind, though: If we overwrite the test data often via

ROMS_TOOLS_OVERWRITE_TEST_DATA=1 pytest test_validation.py

this will make the size of this GitHub repo grow.


forcing = request.getfixturevalue(forcing_fixture)
fname = _get_fname(name)
forcing.ds.to_zarr(fname, mode="a")
Collaborator Author

@TomNicholas Would it maybe be safer to first delete the entire zarr group before overwriting?

Member

Probably? Do you think it's likely that you will want to change the store on disk by only altering / adding individual variables (which I think is mostly what append is intended for)?

Collaborator Author

We might want to delete obsolete variables if we take them out of the datasets. So we should probably delete the entire zarr store before rewriting.
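
A minimal sketch of the delete-then-rewrite approach discussed in this thread, assuming the store is a local directory and ds is an xarray Dataset. The function name rewrite_zarr_store is hypothetical and does not appear in the PR.

```python
import shutil
from pathlib import Path


def rewrite_zarr_store(ds, path):
    """Replace the zarr store at `path` with the contents of `ds`.

    `ds` is assumed to be an xarray.Dataset (any object with a
    compatible `to_zarr` method works for this sketch).
    """
    path = Path(path)
    if path.exists():
        # Remove the old store entirely so that variables which were
        # dropped from `ds` do not linger on disk.
        shutil.rmtree(path)
    # mode="w" creates a fresh store, unlike mode="a", which appends
    # to or modifies an existing one.
    ds.to_zarr(path, mode="w")
```

Compared with mode="a", this guarantees that the store on disk contains only what is in the dataset at the time of writing.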

@TomNicholas
Member

Great idea @NoraLoose!

I am struggling to scroll through all the changes on my phone but one suggestion: instead of setting an environment variable outside python you could add a config option to the test suite, so that you can do pytest --overwrite-regression-test-data. See VirtualiZarr's --run-network-tests for an example of how to do this.

Presumably in reality we might want to only overwrite the test data for one test at a time? If I set that flag then run just that test is that what it will do?

Member

@TomNicholas TomNicholas left a comment

(reviewed, see comment above)

@NoraLoose
Collaborator Author

I am struggling to scroll through all the changes on my phone

With all the new zarr data in the diffs, it's hard to see the essential code change. This is the new test module:
https://github.com/NoraLoose/roms-tools/blob/refactor-tests/roms_tools/tests/test_setup/test_validation.py

As implemented, all test data is overwritten at once.

@NoraLoose
Collaborator Author

NoraLoose commented Aug 30, 2024

To clarify, there is only a single test that writes all test data, and a single test that checks all test data. Both use parameterization of fixtures.

This is the most compact way to write these validation tests, but it would probably be better to have the option to overwrite the test data one by one.

@NoraLoose
Collaborator Author

NoraLoose commented Aug 30, 2024

@TomNicholas I followed your advice and

  • added a pytest config option (rather than using environment variable), see updated description of PR
  • implemented the option to overwrite test data separately (rather than all at once), see description above

@NoraLoose NoraLoose merged commit 93a8844 into CWorthy-ocean:main Sep 3, 2024
8 checks passed
Development

Successfully merging this pull request may close these issues.

Boundary forcing tests are super slow
Use xarray.testing.assert_allclose
2 participants