Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Active interface for cf python: add active.do_active attribute and uri_analyzer() function #69

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ _sidebar.rst.inc

# test output
tests/test_data/cesm2_native_TREFHT.json
tests/test_data/cesm2_native_cow_TREFHT.json
tests/test_data/daily_data_tas.json
tests/test_data/daily_data_ta.json
tests/test_data/daily_data_masked_ta.json
Expand Down
51 changes: 47 additions & 4 deletions activestorage/active.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,44 @@
from activestorage import netcdf_to_zarr as nz


def uri_analyzer(uri):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we want to assess if active storage reductions are possible, not just that the file exists?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes! But how do we do that? We need a set of standards here to get conditions in

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See below ....

"""Run a first-pass examination on the file with given uri.

Parameters
----------
uri: str
input ile URI.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
input ile URI.
Input file URI.


Returns
-------
bool
Boolean that sets other methods/attributes in Active.

Raises
------
ValueError
if file URI is None
if file URI is pointing to a nonexistent file
"""
#TODO probably best placed in an acctive_tools library
result = True
if uri is None:
raise ValueError(f"Must use a valid file for uri. Got {uri}")
else:
if not os.path.isfile(uri):
raise ValueError(f"Must use existing file for uri. "
f"{uri} not found")
else:
# bogus condition until we fix criteria
if "cow" in uri:
print("This file doesn't support active storage.")
result = False
else:
print("This file supports active storage.")

return result


class Active:
"""
Instantiates an interface to active storage which contains either zarr files
Expand Down Expand Up @@ -43,10 +81,15 @@ def __init__(self, uri, ncvar, missing_value=None, _FillValue=None, valid_min=No
"""
# Assume NetCDF4 for now
self.uri = uri
if self.uri is None:
raise ValueError(f"Must use a valid file for uri. Got {self.uri}")
if not os.path.isfile(self.uri):
raise ValueError(f"Must use existing file for uri. {self.uri} not found")
self.do_active = True
# run in active storage mode only if the uri analyzer returns True
uri_analyzer_result = uri_analyzer(self.uri)
self.do_active = uri_analyzer_result
# otherwise just get out and return active.do_active = False
if not self.do_active:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi V - I thought that if do_active is False then we do not want to return, rather we want to set _version to 2 (or is it 1) ...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeps indeed, I followed up on your change request, thanks, bud!

Copy link
Collaborator

@davidhassell davidhassell Jun 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edit - this was a defunct version of this comment - I shall rewrite it below ....

Hi V, I don't get what uri_analyzer is doing. If the file doesn't exist, then we can do anything, so setting self._version = 2 won't help us.

From a cf-python perspective, I know that the file exists - the question is "is it in location that supports active storage requests?" If "yes" then self._version = 1, and if "no" then self._version = 2

self._version = 1
else:
self._version = 2
self.ncvar = ncvar
if self.ncvar is None:
raise ValueError("Must set a netCDF variable name to slice")
Expand Down
Binary file added tests/test_data/cesm2_native_cow.nc
Binary file not shown.
29 changes: 28 additions & 1 deletion tests/unit/test_active.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
import pytest
import threading

from activestorage.active import Active
from activestorage.active import Active, uri_analyzer


def test_uri_none():
Expand Down Expand Up @@ -105,3 +105,30 @@ def test_lock():

active.lock = None
assert active.lock is False


def test_uri_analyzer_invalid_uri():
"""Test the uri_analyzer functionality."""
#TODO implement correct test cases when fully functional
uri = "tests/test_data/cesm2_native_cow.nc"
ncvar = "TREFHT"
active = Active(uri, ncvar=ncvar)
assert active.do_active is False
raised_exc = "'Active' object has no attribute '_ncvar'"

# test version switching
item = active.__getitem__(index=3)[0, 0]
expected = np.array(277.11945, dtype="float32")
np.testing.assert_array_equal(item, expected)


def test_uri_analyzer_valid_uri():
"""Test the uri_analyzer functionality."""
#TODO implement correct test cases when fully functional
uri = "tests/test_data/cesm2_native.nc"
ncvar = "TREFHT"
active = Active(uri, ncvar=ncvar)
assert active.do_active is True

active._method = "min"
assert active.method([3,444]) == 3