Add decompressed OME-Zarr dataset size to iohub info #248

edyoshikun · 2024-09-26T00:52:47Z

This addresses issue #247 by adding the store size and array size in GB to the iohub info output. It is simple metadata that is useful to have.

I wanted to know how much memory to request for caching datasets.

ziw-liu · 2024-09-26T02:37:48Z

Is this meant to represent the size on disk (compressed) or size in RAM (decompressed)?

edyoshikun · 2024-09-26T18:33:54Z

I find it more useful when it's decompressed rather than compressed. We can report both if needed. I think zarr.Array has nbytes_stored. What do you guys think?

talonchandler

I think the uncompressed size is the most valuable.

The reported size is the expected size, not the true size (e.g., the array hasn't been filled yet or there was an error). Naming is tricky. Maybe "Expected uncompressed size (GB)", "Est. size in RAM (GB)", or "Est. size (GB)"?
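To illustrate why the reported value is an expected size rather than a measured one: it can be derived from the array shape and element size alone, without reading any chunks. The following is a minimal sketch under that assumption. `expected_nbytes` is a hypothetical helper, not iohub's implementation, and the shape is made up for the example.

```python
import math


def expected_nbytes(shape, itemsize):
    """Expected decompressed size in bytes: element count times bytes per element.

    This depends only on metadata, so it is the same whether or not
    the array has actually been filled with data.
    """
    return math.prod(shape) * itemsize


# Hypothetical (T, C, Z, Y, X) array of 32-bit floats (4 bytes per element)
size = expected_nbytes((2, 3, 4, 512, 512), itemsize=4)
print(f"{size} bytes = {size / 1e9:.3f} GB")
```

Because nothing is read from the store, an empty or partially written array reports the same number as a complete one.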

iohub/reader.py

edyoshikun · 2024-09-28T01:29:59Z

Ended up adding "uncompressed size [GB]".

iohub/reader.py

ziw-liu · 2024-10-26T00:33:17Z

Need to add a test case before merging.

ziw-liu · 2024-11-06T18:54:53Z

Due to upstream issue zarr-developers/zarr-python#2174, nbytes_stored will be wrong for OME-Zarr. I think we should just remove this field since the zarr devs are probably not focusing on v2 bugs now.

ziw-liu · 2024-11-06T18:55:47Z

For example this compression ratio is clearly wrong:

No. bytes:               88473600 [84.4 MiB]
No. bytes stored:        419 [419 B]
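As a quick check of the two values above, the implied compression ratio can be computed directly. This is plain arithmetic on the numbers copied from the output, not a call into zarr.

```python
# Values copied from the info output above
nbytes = 88_473_600      # "No. bytes"
nbytes_stored = 419      # "No. bytes stored"

ratio = nbytes / nbytes_stored
mib = nbytes / 2**20     # 1 MiB = 2**20 bytes

print(f"decompressed: {mib:.1f} MiB")
print(f"implied compression ratio: {ratio:,.0f}x")
```

A ratio above 200,000x is not plausible for image data, which is consistent with the stored byte count covering only a small part of the store rather than the chunk data.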

adding datastore size to info

f374116

edyoshikun requested review from ziw-liu, talonchandler and ieivanov September 26, 2024 00:52

ziw-liu added the enhancement (New feature or request) and NGFF (OME-NGFF, OME-Zarr format) labels Sep 26, 2024

talonchandler reviewed Sep 26, 2024

iohub/reader.py (outdated, resolved)

adding uncompressed string

e70d168

ziw-liu reviewed Sep 29, 2024

iohub/reader.py (outdated, resolved)

edyoshikun added 2 commits October 14, 2024 18:22

adding changes for readability

58ec908

typo

5b1ab9a

edyoshikun requested a review from ziw-liu October 15, 2024 01:25

ziw-liu added 3 commits November 6, 2024 10:58

Only show decompressed size due to zarr-python bug

e5b3d51

add test for size formatting

11864e0

add test for CLI size info

d728430

ziw-liu approved these changes Nov 6, 2024

ziw-liu changed the title ~~adding datastore size to iohub info~~ Add decompressed OME-Zarr dataset size to iohub info Nov 6, 2024

ziw-liu merged commit 16b5571 into main Nov 6, 2024
7 checks passed

ziw-liu deleted the info_data_size branch November 6, 2024 19:36
