Using parse_url to create a store reads arrays with all zeros #343

joshua-gould · 2023-12-15T16:04:50Z

import numpy as np
import zarr
from ome_zarr.io import parse_url

g = zarr.open('test.zarr', 'w')
g['foo'] = np.ones((2, 3))

g2 = zarr.open('test.zarr', mode='r')
a2 = g2['foo'][...]
assert a2.max() == 1 # works

g3 = zarr.open(parse_url('test.zarr', mode='r').store)
a3 = g3['foo'][...]
assert a3.max() == 1 # fails

joshmoore · 2023-12-15T16:07:31Z

Hi @joshua-gould. parse_url enforces dimension_separator="/". Can you try setting that on all of your calls to pure zarr methods?
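To illustrate why a separator mismatch yields zeros rather than an error, here is a minimal sketch in plain Python that does not use zarr itself. The dict-based store and the `chunk_key` and `read_chunk` helpers are hypothetical stand-ins, not zarr or ome-zarr APIs. The one assumption carried over from Zarr v2 is that a chunk whose key is absent from the store is treated as uninitialised and read back as the array's fill value.

```python
def chunk_key(array_path, chunk_index, separator):
    """Build the store key for a chunk, e.g. 'foo/0.0' or 'foo/0/0'."""
    return f"{array_path}/{separator.join(str(i) for i in chunk_index)}"


def read_chunk(store, array_path, chunk_index, separator, fill_value=0.0):
    """Return the stored chunk, or the fill value if the key is absent."""
    return store.get(chunk_key(array_path, chunk_index, separator), fill_value)


# The writer used "." (the zarr-python v2 default), so the key is 'foo/0.0'.
store = {chunk_key("foo", (0, 0), "."): 1.0}

same_sep = read_chunk(store, "foo", (0, 0), ".")   # looks up 'foo/0.0': found
other_sep = read_chunk(store, "foo", (0, 0), "/")  # looks up 'foo/0/0': missing
print(same_sep, other_sep)  # 1.0 0.0
```

The second read does not fail; it simply never finds the chunk, which matches the all-zeros symptom in the report.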

joshua-gould · 2023-12-15T16:17:34Z

Could we add a check so that values are not silently read incorrectly when a user does not set dimension_separator='/'? I've confirmed that an array created with the following code is read correctly:

a1 = g.create_dataset('foo', shape=(2, 3), dimension_separator='/')
a1[:] = np.ones((2, 3))

joshua-gould · 2023-12-15T16:33:04Z

Note that creating an array using ome-zarr and reading in the array using pure zarr works correctly:

import numpy as np
import zarr
from ome_zarr.io import parse_url

g = zarr.open(parse_url('test.zarr', mode='w').store)
g['foo'] = np.ones((2, 3))

g2 = zarr.open('test.zarr', mode='r')
a2 = g2['foo'][...]
assert a2.max() == 1

will-moore · 2023-12-15T16:50:00Z

This is similar to #245.

I think I proposed somewhere that when reading, parse_url should just use whatever dimension separator it finds, but I seem to remember there was an argument against doing that.

joshmoore · 2023-12-19T17:29:43Z

By "find" you mean looking into the directory to see what files are present? On S3, you can't assume that you can list the directories. Combined with the fact that chunks can be missing, this means you will likely need to try more than a handful of paths before knowing for certain whether or not each array uses "." or "/".
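A rough sketch of what such probing could look like, assuming only that the store can answer "does this key exist?" without listing directories. `probe_separator`, its `exists` callable, and the `max_probes` cap are hypothetical names for illustration, not part of zarr or ome-zarr.

```python
from itertools import product


def probe_separator(exists, array_path, chunk_grid_shape, max_probes=16):
    """Guess the separator by probing chunk keys; None if undetermined.

    `exists` is any callable reporting whether a key is present in the store.
    Each probed chunk index costs up to two existence checks.
    """
    probes = 0
    for index in product(*(range(n) for n in chunk_grid_shape)):
        if probes >= max_probes:
            break
        probes += 1
        for sep in (".", "/"):
            key = f"{array_path}/{sep.join(map(str, index))}"
            if exists(key):
                return sep
    return None


# Chunk (0, 0) was never written, so the first probe finds nothing.
sparse_keys = {"foo/1/0"}
found = probe_separator(sparse_keys.__contains__, "foo", (2, 1))
unknown = probe_separator(set().__contains__, "foo", (2, 1))
print(found, unknown)  # / None
```

With missing chunks the number of requests grows with the chunk grid, and an array with no written chunks stays undetermined, which is the cost described above. For 1-D arrays both separators produce the same key, so the answer does not matter there.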

will-moore · 2023-12-19T18:15:07Z

No, I meant looking in .zarray.
It seems wrong to ignore the dimension_separator if it's there.
(I know there's not always one there with earlier versions - I think that was the objection before)

joshmoore · 2023-12-20T20:16:10Z

> (I know there's not always one there with earlier versions - I think that was the objection before)

Exactly.

> No, I meant looking in .zarray.

Interesting. If we add our own .zarray reading logic, then we might be able to do this. It's just that you can't currently tell from the zarr-python metadata whether dimension_separator is missing or set to the default.
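As a sketch of what such custom reading logic might look like, the snippet below parses the raw .zarray JSON directly so that an absent "dimension_separator" key can be told apart from an explicit value. `declared_separator` is a hypothetical helper, and the example only covers a local directory; a remote store would need to fetch the same JSON document by key.

```python
import json
import tempfile
from pathlib import Path


def declared_separator(array_dir):
    """Return the separator written in .zarray, or None if the key is absent."""
    meta = json.loads((Path(array_dir) / ".zarray").read_text())
    return meta.get("dimension_separator")


with tempfile.TemporaryDirectory() as tmp:
    explicit = Path(tmp) / "explicit"
    legacy = Path(tmp) / "legacy"
    for array_dir in (explicit, legacy):
        array_dir.mkdir()

    # Trimmed-down metadata documents; real .zarray files carry more fields.
    (explicit / ".zarray").write_text(
        json.dumps({"zarr_format": 2, "dimension_separator": "/"})
    )
    (legacy / ".zarray").write_text(json.dumps({"zarr_format": 2}))

    explicit_sep = declared_separator(explicit)
    legacy_sep = declared_separator(legacy)

print(explicit_sep, legacy_sep)  # / None
```

A `None` result is the ambiguous case from earlier in the thread: the metadata predates the field, so the reader still has to pick a default.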

dstansby · 2024-07-16T13:28:35Z

I'm running into this too - it's currently breaking my attempts to read in data using ome-zarr-py 😢

dstansby · 2024-07-16T13:46:41Z

I think the frustrating thing here is that zarr-python writes with the dimension separator "." by default, which means v2 zarr data written with zarr-python currently doesn't load with ome-zarr-py.

joshmoore · 2024-07-17T07:36:17Z

@dstansby: definitely an issue. The single dimension separator character led to a large number of incompatibilities. But rather than try to change the default in zarr-python v2, I think getting us onto zarr v3 ASAP is a better use of our time.

will-moore mentioned this issue Nov 27, 2024

Zarr v3 #404

Open
