-
-
Notifications
You must be signed in to change notification settings - Fork 286
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make AsyncArray.nchunks_initialized async #2449
Make AsyncArray.nchunks_initialized async #2449
Conversation
This changes the API of AysncArray.nchunks_initialized to change it from a property to an async function. The motivation here comes from 1. general cleanliness (a property access calling async functions doing I/O feels a bit wrong) 2. Work on Array.info, where I hit a strange error, I think from jumping from a - sync Array.info_complete -> - async AsyncArray.info_complete -> - sync AsyncArray.nchunks_initialzed -> - sync collect_aiterator (async list_prefix) With this change, we'll be able to jump from sync to async just once at the boundary. ``` File "/Users/tom/gh/zarr-developers/zarr-python/src/zarr/core/array.py", line 3011, in info_complete return sync(self._async_array.info_complete()) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/tom/gh/zarr-developers/zarr-python/src/zarr/core/sync.py", line 141, in sync raise return_result File "/Users/tom/gh/zarr-developers/zarr-python/src/zarr/core/sync.py", line 100, in _runner return await coro ^^^^^^^^^^ File "/Users/tom/gh/zarr-developers/zarr-python/src/zarr/core/array.py", line 1223, in info_complete "count_chunks_initialized": self.nchunks_initialized, # this should be async? ^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/tom/gh/zarr-developers/zarr-python/src/zarr/core/array.py", line 844, in nchunks_initialized return nchunks_initialized(self) ^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/tom/gh/zarr-developers/zarr-python/src/zarr/core/array.py", line 3035, in nchunks_initialized return len(chunks_initialized(array)) ^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/tom/gh/zarr-developers/zarr-python/src/zarr/core/array.py", line 3061, in chunks_initialized collect_aiterator(array.store_path.store.list_prefix(prefix=array.store_path.path)) File "/Users/tom/gh/zarr-developers/zarr-python/src/zarr/core/sync.py", line 178, in collect_aiterator return sync(_collect_aiterator(data)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/tom/gh/zarr-developers/zarr-python/src/zarr/core/sync.py", line 128, in sync raise SyncError("Calling sync() from within a running loop") zarr.core.sync.SyncError: Calling sync() from within a running loop ```
It's also OK from my POV. This looks like a good change to me! |
tests/test_array.py
Outdated
@@ -366,9 +366,9 @@ def test_chunks_initialized(test_cls: type[Array] | type[AsyncArray[Any]]) -> No | |||
arr[region] = 1 | |||
|
|||
if test_cls == Array: | |||
observed = sorted(chunks_initialized(arr)) | |||
observed = sorted(await chunks_initialized(arr)) # Why doesn't mypy error here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh good, this failed in CI: https://results.pre-commit.ci/run/github/48049137/1730299519.ojFfXsB_QvipnKxxQtQKgA
And it fails for me locally if I run it on --all-files
(but it didn't fail when checking the diff, part of why I don't like putting mypy in pre-commit).
I'll get this fixed up. AFAICT, chunks_initialized
isn't part of the public API so it's safe to make it async only. If it's useful, maybe it could be added as a method on Array
/ AsyncArray
.
This changes the API of AysncArray.nchunks_initialized to change it from a property to an async function. The motivation here comes from
I think the error came from jumping between sync and async multiple times:
With this change, we'll be able to jump from sync to async just once
at the boundary. Maybe
sync
should be able to do that if needed, but in this case we can avoid it.The big downside: this makes
Array.nchunks_initialized
different fromAsyncArray.nchunks_initialized
. IMO that's OK.Array.nchunks_initialized
is bound by backwards compatibility, but that doesn't have to lock AsyncArray into the same API. Even stuff likeresize
are different since the async version needs to be awaited. This is a bit more different since you need to call it and await it, but maybe that's OK for the clarity?TODO: