Extendable zarr arrays #802

Merged: 35 commits into bluesky:main, Nov 18, 2024

Conversation

@dylanmcreynolds (Contributor) commented Oct 27, 2024

Checklist

  • Add a Changelog entry
  • Unit tests
  • User documentation

This is extraordinarily early work towards appendable arrays, which Zarr should be able to support. For use cases where we process data off an instrument and send the results to Tiled, we don't always know the eventual size of the array. Today we can handle this by writing the array directly, outside of Tiled, and then sending a message to Tiled to reindex, but it would be super useful to send the new block to Tiled directly.

Changes:

  • Add a new client method, server route, and ZarrAdapter method:
  • New link for append that the client uses, and a new route in routes.py.
  • In the Zarr adapter and its parent, try to find the largest block along that axis, resize along that axis, and write the data.
  • The client needs an append_block method that sends the block to append and the axis to append it to (see the sketch after this list).
  • The router needs a new PATCH method for the /array/block endpoint.
  • The Zarr adapter gets the message, resizes the Zarr array, and appends the data.
  • If you edit the metadata or specs, we keep those edits in the revisions table. So when you update the structure, it's final; the user can't go back in history. In CatalogArrayAdapter, we need to update the database with the new size. We can look at replace_metadata but need to avoid the revisions update.
  • We also need to rehash the structure (see create_node).

Questions and Issues:

  • Not even sure this is the right approach.
  • Would I be responsible for updating the index in the SQL database about the resize? If so, does the ZarrAdapter have access to database objects?
  • The link with the format /arrays/append is a little unsatisfying, as it uses a verb, which I don't love.
  • @danielballan and I have had many discussions about the HTTP verb to use. Currently this uses PATCH, as it's a change to the array without a complete replacement of it.
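A minimal sketch of how the proposed client call might look (illustrative only: append_block is the method proposed in this list, not merged code, and the profile and key mirror the demo later in this thread):

from tiled.client import from_profile
import numpy

c = from_profile('local', api_key='secret')
ac = c.write_array(numpy.ones((3, 2, 2)), key='x')
# Proposed (draft) API: send a new block and the axis along which to append it.
ac.append_block(numpy.zeros((1, 2, 2)), axis=0)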

@dylanmcreynolds dylanmcreynolds marked this pull request as draft October 27, 2024 21:37
@danielballan (Member)

Notes from discussion with @dylanmcreynolds

Background

  • We let users patch metadata and specs but in general not structure. If you mess up the structure at register-time, just blow it away and register again. If you mess up the structure on data you are uploading to Tiled, then altering later could be a computationally heavy operation, such as changing the data type after the fact. So we do not allow it.
  • We want to let users edit specific things about the structure that can be done in the space of an HTTP request/response cycle.
  • We laid some track for this by placing resizable as a key in the structure for array, sparse, and table. Currently, it's always False. For array and sparse, as the type hints suggest, we anticipated we might extend it to allow a tuple of booleans aligned with the shape tuple. For example, a (10, 1024, 1024) image stack that is appendable along the time axis would have resizable=(True, False, False). Zarr does not require you to declare up front which axes you might resize later. But for HTTP caching purposes, we expect that clients can leverage this knowledge. So we may insist that you declare up front which axes may resize and hold you to that.
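For example (an illustrative sketch of the idea, not the exact tiled data model), the structure for that image stack might carry per-axis flags like this:

# Illustrative only: a structure declaring that only the time axis may grow.
structure = {
    "shape": (10, 1024, 1024),
    "chunks": ((10,), (1024,), (1024,)),
    "resizable": (True, False, False),
}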

Routes

This route updates the shape metadata and uploads data in a single request:

PATCH /array/append/{path}?axis=i

The above works if you can fit the full axis' breadth of data in one request body. This satisfies ALS' current use case.

For other use cases, we may need to upload multiple chunks in separate requests. For example, to do an update on X with O in three requests...

X  X  O
X  X  O
X  X  O

we want two routes: one to declare the change in shape, and then one (existing) route to upload the chunks, the same way we upload chunks for a new array.

PATCH /array/resize/{path}
POST /array/block?block=i,j,...  # N of these requests

where i must correspond to a True in the resizable structure.

@dylanmcreynolds (Author)

More notes from conversation:

  • If you edit the metadata or specs, we keep those edits in the revisions table. So when you update the structure, it's final; the user can't go back in history. In CatalogArrayAdapter, we need to update the database with the new size. We can look at replace_metadata but need to avoid the revisions update.
  • We also need to rehash the structure (see create_node)

@dylanmcreynolds (Author)

This is ready for review, provided that review can start before unit tests and user documentation are done.

@dylanmcreynolds dylanmcreynolds marked this pull request as ready for review October 29, 2024 21:50
@dylanmcreynolds (Author)

Fixed issues causing unit test to fail.

@dylanmcreynolds (Author)

After talking with @danielballan for a while: there is always going to be an issue here where multiple clients appending to an array could write blocks out of order. Should the append_block method take a parameter that says where to put the block in the array, resize if the array needs the space, and simply write to the correct place if it has already been resized?

@danielballan (Member) commented Oct 30, 2024

In our chat I think we identified two problems:

  1. Requests processed out of order
  2. Write contention resulting in data corruption

For (1): Instead of PATCH /append/{path} implement PATCH /array/full/{path}?slice=10,:. That is, instead of "appending" to an array (which is kind of a tabular concept...) overlay a specific slice of data into an array chunk. The above example writes data into row 10. Requests that arrive in any order will achieve the same result.

For (2): Rely on the database as a semaphore. Maybe transactions can get us what we need, or maybe we need explicit row-level locking. Now, if the chunks are stored on a filesystem, there is a fundamental limit to how much we can guarantee. Use cases that push the limits of streaming can upgrade to a Zarr store with stronger locking semantics.


Edit: I would think that, if the PATCH lies outside the current array shape, the array would be reshaped (i.e. allocating new empty chunks) to accommodate the added slice. If we're concerned about accidental resizing, there could be a query parameter to opt in or out.
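A rough sketch of what such a request could look like at the HTTP level, assuming an httpx client; the server URL, path, slice encoding, and extend parameter follow the discussion above and are assumptions, not a final API:

import httpx
import numpy

data = numpy.zeros((1, 2, 2))
# Overlay this data at row 10; extend=true opts in to growing the array if needed.
response = httpx.patch(
    "http://localhost:8000/api/v1/array/full/x",  # hypothetical server and path
    params={"slice": "10,:", "extend": "true"},
    content=data.tobytes(),
    headers={"Content-Type": "application/octet-stream"},
)
response.raise_for_status()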

@danielballan (Member)

Demo:

In [1]: from tiled.client import from_profile

In [2]: import numpy

In [3]: c = from_profile('local', api_key='secret')

In [4]: ac = c.write_array(numpy.ones((3, 2, 2)), key='y')

In [5]: ac
Out[5]: <ArrayClient shape=(3, 2, 2) chunks=((3,), (2,), (2,)) dtype=float64>

In [6]: ac.read()
Out[6]:
array([[[1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.]]])

In [7]: ac.patch(numpy.zeros((1, 2, 2)), slice=slice(3, 4), extend=True)

In [8]: ac.refresh()
Out[8]: <ArrayClient shape=(4, 2, 2) chunks=((3, 1), (2,), (2,)) dtype=float64>

In [9]: ac.read()
Out[9]:
array([[[1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.]],

       [[0., 0.],
        [0., 0.]]])

@dylanmcreynolds (Author) left a comment

Awesome! I left a couple minor comments...

(Inline review comments on tiled/adapters/zarr.py and tiled/catalog/adapter.py, now resolved)
@danielballan (Member)

Review comments addressed. Test updated. Also, some usability improvements:

  1. If you do not pass extend=True and your slice exceeds the array bounds, the server returns 409 CONFLICT and the Python client catches it to raise a clear ValueError.
  2. It is no longer necessary to run a second request with ac.refresh() to update the local cache of the structure (i.e. get the new shape and chunks). The PATCH /array/full endpoint now returns the new ArrayStructure, and the client updates its cache.

Demo of both:

In [4]: ac = c.write_array(numpy.ones((3, 2, 2)), key='x')

In [5]: ac.patch(numpy.zeros((1, 2, 2)), slice=slice(4, 5))
<snipped>
ValueError: Slice slice(4, 5, None) does not fit within current array shape. Pass keyword argument extend=True to extend the array dimensions to fit.

In [6]: ac.patch(numpy.zeros((1, 2, 2)), slice=slice(4, 5), extend=True)

In [7]: ac
Out[7]: <ArrayClient shape=(5, 2, 2) chunks=((3, 2), (2,), (2,)) dtype=float64>

@danielballan (Member)

Needs more testing and user docs, but otherwise this is in coherent shape I think, ready for tires to be kicked.

f"Slice {slice} does not fit into array shape {current_shape}. "
f"Use ?extend=true to extend array dimension to fit."
)
self._array[slice] = data
@dylanmcreynolds (Author):

If the array was resized but an exception happens from here on (like someone passing a bad slicing parameter), the file has been resized but the new data was not written. Would we consider a try block that resizes back to the original size?
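A minimal sketch of that rollback idea as a standalone helper (hypothetical function, not the adapter's actual code):

import zarr

def patch_with_rollback(array: zarr.Array, new_shape, slc, data):
    """Resize, then write; if the write fails, restore the original shape."""
    original_shape = array.shape
    array.resize(new_shape)
    try:
        array[slc] = data
    except Exception:
        # No new data was written, so undo the resize before re-raising.
        array.resize(original_shape)
        raise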

@padraic-shafer (Contributor) left a comment

I like that by specifying the slice indices, requests can safely arrive out of order. It would be useful to add a unit test for out-of-order updates.

From an earlier iteration I liked the implications of "append" coupled with passing the expected "before" size, which for multiple writers would be a conflict-resolution mechanism that favors the first request that arrives...rejecting the others. However this is probably an obscure edge case. Much more likely usage would be a single writer emitting multiple updates with order known a priori. Current solution handles this nicely.

I skimmed, but did not look too closely at:

  • tiled/client/array.py
  • tiled/adapters/zarr.py
  • tiled/catalog/adapter.py
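A minimal sketch of the out-of-order unit test suggested above, assuming a pytest-style setup where client is a writable Tiled client fixture (the fixture and key names are hypothetical):

import numpy

def test_out_of_order_patches(client):
    rng = numpy.random.default_rng(0)
    blocks = {i: rng.random((1, 2, 2)) for i in (7, 5, 6)}
    ac = client.write_array(numpy.ones((3, 2, 2)), key="out_of_order")
    # Apply patches in non-sequential order; slice addressing makes the final
    # result independent of arrival order.
    for i, block in blocks.items():
        ac.patch(block, slice=slice(i, i + 1), extend=True)
    for i, block in blocks.items():
        numpy.testing.assert_array_equal(ac[i:i + 1], block)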

array : array-like
The data to write
slice : NDSlice
Where to place this data in the array
@dylanmcreynolds (Author):

I think it would be nice to give more information on how to format the slice parameter.
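One possible expansion (a wording suggestion only, not merged text; whether a tuple of slices is accepted is an assumption), in the same style as the docstring quoted above:

array : array-like
    The data to write.
slice : NDSlice
    Where to place this data in the array, e.g. a single built-in slice such
    as slice(3, 4), or (if supported) a tuple of slices addressing several
    axes, such as (slice(3, 4), slice(0, 2)).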

@danielballan (Member)

Lifting a comment here from private chats:

I think that concurrent appending to arrays is a broken concept. Appending makes sense for tables, but not for arrays. The atomic unit of a table is a row, and a row can contain metadata (e.g. sequence number, timestamp) that we can use to locate a given value later. The only metadata that arrays have are the coordinates themselves, so when placing data we must address coordinates explicitly.

We can extend arrays, enlarging the canvas of the coordinate system, but when data is added it must be placed at explicit coordinates.

If I'm wrong about that, we can revisit appending later with an additional PATCH /array/append route.

@danielballan changed the title from "Appendendable zarr arrays" to "Extendable zarr arrays" on Nov 1, 2024
@genematx (Contributor) commented Nov 1, 2024

I think this is a great addition to Tiled; thank you, @dylanmcreynolds, for putting all this work together! I've gone through the code and have just been trying to break it. Here are some of my findings; most likely they are just edge cases that fall outside the scope of this PR, or maybe I've missed some assumptions, but I think they are still worth mentioning.

  1. I think we need to check that the shape of the slice in the client request matches the shape of the array being inserted, and to raise an informative exception if they don't match. Alternatively, and I personally would prefer this option, assume that the slice specifies the corner (lowest indices) where the new array should be placed and have the server figure out the full slice itself (maybe only if extend=True, or maybe we need to introduce another flag to control this behaviour). Although I see how requests arriving out of order could be an issue here.
    Currently, ac.patch(numpy.zeros((2, 2, 2)), slice=slice(3, 4), extend=True) just raises a 500 error.

  2. On the other hand, ac.patch(numpy.zeros((1, 2, 2)), slice=slice(33, 34), extend=True) works and fills the array with zeros. Is this the intended behaviour?

  3. It would be nice to have some default value for slice (e.g. -1 or (..., -1)) that would allow appending to the end of an axis, like numpy.append(arr, values, axis=None), but, as I understand it, this may not be possible due to the unknown order of arriving requests.

  4. Perhaps, more interesting:

import zarr
arr = zarr.ones((3,4,5), chunks=(1,2,3))
ac = c.write_array(arr, key='y')
# <ArrayClient shape=(3, 4, 5) chunks=((1, 1, 1), (2, 2), (3, 2)) dtype=float64>

ac.patch(numpy.zeros((1, 4, 5)), slice=slice(0,1))

errors with a 500 error and the following traceback on the server: ValueError: parameter 'value': expected array with shape (1, 4, 6), got (1, 4, 5). Also, this raises an interesting question of how we should rechunk the array (if at all) when it's expanded (perhaps you're handling this already, and I just missed it).

  5. What should we do about dtype mismatch? (See the validation sketch after this comment.)
ac = c.write_array(numpy.ones((3, 2, 2), dtype='int'), key='w')
ac.patch(numpy.zeros((1, 2, 2), dtype='float'), slice=slice(0, 1))

The above code works and converts the new data to int. I wonder if this should raise an exception.

I'll keep digging into it. Really like this new functionality!
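A sketch of the kind of validation that could turn points 1 and 5 above into clear 400-level client errors rather than 500s (hypothetical helper, not server code):

import numpy

def validate_patch(data, target_slices, array_dtype):
    """Check an incoming block against the requested region before writing.

    Assumes each slice has concrete, non-negative start/stop and unit step.
    """
    expected_shape = tuple(slc.stop - slc.start for slc in target_slices)
    if data.shape != expected_shape:
        raise ValueError(
            f"Data shape {data.shape} does not match slice shape {expected_shape}."
        )
    if numpy.dtype(data.dtype) != numpy.dtype(array_dtype):
        raise ValueError(
            f"Data dtype {data.dtype} does not match array dtype {array_dtype}."
        )

For example, validate_patch(numpy.zeros((2, 2, 2)), (slice(3, 4), slice(0, 2), slice(0, 2)), "float64") would raise before any data is written.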

ac.patch(ones * 7, slice=slice(7, 8), extend=True)
ac.patch(ones * 5, slice=slice(5, 6), extend=True)
ac.patch(ones * 6, slice=slice(6, 7), extend=True)
numpy.testing.assert_equal(ac[5:6], ones * 5)
Contributor:

Wouldn't it be better to use something like rng.random instead of ones here?

Member:

Good point, that would avoid rotational confusion.

@danielballan (Member)

  1. This is interesting! The formulation I put in this PR is "overdetermined" in the sense that we give both slice and shape. Your proposal would have us give offset and shape. I think this is worth considering. To be precise, we wouldn't be giving a slice to the corner; we would be giving a tuple of coordinates: tuple[int], not tuple[slice]. I should think about this, but my initial reaction is that this is strictly better, from both an engineering standpoint and a usability standpoint (see the sketch after this list).
  2. Zarr has a "fill value" configured at creation time (default 0), and that's what's happening here. Zeros are not actually written to storage, but filled in. The same situation occurs if you try to read an array before all the chunks have been written. So I would say this is fine and expected.
  3. I'm against this, for the reasons you say. But we can always add it later if we change our minds.
  4. Using offset instead of slice would fix this, I think. There is rechunking code; see new_chunks and the surrounding comments.
  5. I agree, this should raise 400 Bad Request or similar.
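Point 1 can be made concrete with a small sketch (hypothetical helper, not the refactored code): given an offset, the target region is fully determined by the offset plus the shape of the incoming data, with unit step throughout.

import numpy

def offset_to_slices(offset, data_shape):
    """Translate an offset (tuple of ints) and the incoming data's shape into
    the concrete region to write, one slice per dimension."""
    return tuple(slice(o, o + n) for o, n in zip(offset, data_shape))

data = numpy.zeros((1, 4, 5))
print(offset_to_slices((3, 0, 0), data.shape))
# (slice(3, 4, None), slice(0, 4, None), slice(0, 5, None))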

@danielballan (Member)

The only generality you lose with offset is striding, as in 10:60:2. But that is more of a computation idiom than a storage idiom.

@danielballan (Member)

Rebased on main. Refactored to use offset instead of slice, following @genematx's suggestion.

@dylanmcreynolds (Author)

I added text about the new patch method to the writing tutorial.

@danielballan danielballan merged commit 2459624 into bluesky:main Nov 18, 2024
7 of 8 checks passed