Comparison of OME-Zarr libs #407

Open
will-moore opened this issue Nov 25, 2024 · 0 comments

Some discussion about potential changes to ome-zarr-py at #402 inspired me to check out other OME-Zarr libs to understand alternative ways of structuring things...

Work in progress...

Inspired by https://docs.google.com/spreadsheets/d/1airVWkmQGaVopHgtW1tL8k4YJM55DO9I5QA6jnpu1nk/edit?gid=0#gid=0

ngff-zarr

https://github.com/thewtex/ngff-zarr
Testing example at https://ngff-zarr.readthedocs.io/en/latest/quick_start.html

```python
import ngff_zarr as nz
import numpy as np

# Random 2D image, 1000 x 1000
data = np.random.randint(0, 256, int(1e6)).reshape((1000, 1000))
# Build the multiscale pyramid (lazy, dask-backed)...
multiscales = nz.to_multiscales(data)
# ...then write it out as OME-Zarr
nz.to_ngff_zarr('example.ome.zarr', multiscales)
```
  • Pyramid generation is separate from writing to zarr 👍 Pyramid shapes are (1000,1000) and (500,500).
  • 1 line to generate pyramid, 1 line to write to zarr
  • We get array at example.ome.zarr/scale0/image/.zarray with example.ome.zarr/scale0/.zattrs for xarray _ARRAY_DIMENSIONS
  • nz.to_multiscales(image, scale_factors=[2,4,8], chunks=64) generates a Multiscales data object with data as dask delayed pyramid.
  • Can't pass in e.g. a 4D image with shape (1, 512, 512, 512) since it fails to downsample - trying to downsample all dimensions?
  • No support for multi-C or multi-T images??
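For reference, the pyramid shapes reported above are what plain 2× mean downsampling produces. A minimal numpy-only sketch of that operation (this is *not* the ngff-zarr implementation, which is dask-based and supports other downsampling methods):

```python
import numpy as np

def downsample_2x(plane: np.ndarray) -> np.ndarray:
    """Mean-downsample a 2D array by a factor of 2 in each dimension."""
    h, w = plane.shape
    # Trim to even dimensions so 2x2 blocks tile exactly.
    plane = plane[: h - h % 2, : w - w % 2]
    return plane.reshape(plane.shape[0] // 2, 2, plane.shape[1] // 2, 2).mean(axis=(1, 3))

def build_pyramid(base: np.ndarray, levels: int) -> list:
    """Return [base, base/2, base/4, ...] with `levels` entries."""
    pyramid = [base]
    for _ in range(levels - 1):
        pyramid.append(downsample_2x(pyramid[-1]))
    return pyramid

data = np.random.randint(0, 256, int(1e6)).reshape((1000, 1000))
pyramid = build_pyramid(data, levels=2)
print([p.shape for p in pyramid])  # [(1000, 1000), (500, 500)]
```

This also illustrates why a shape like (1, 512, 512, 512) is awkward for a naive implementation: halving *every* dimension would try to shrink the size-1 axis too, so channel/time axes need to be excluded from downsampling.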

pydantic-ome-ngff

https://github.com/janeliascicomp/pydantic-ome-ngff

```python
from pydantic_ome_ngff.v04.multiscale import MultiscaleGroup
from pydantic_ome_ngff.v04.axis import Axis
import numpy as np
import zarr

axes = [
    Axis(name='y', unit='nanometer', type='space'),
    Axis(name='x', unit='nanometer', type='space')
]
arrays = [np.zeros((512, 512)), np.zeros((256, 256))]

group_model = MultiscaleGroup.from_arrays(
    axes=axes,
    paths=['s0', 's1'],
    arrays=arrays,
    scales=[[1.25, 1.25], [2.5, 2.5]],
    translations=[[0.0, 0.0], [1.0, 1.0]],
    chunks=(64, 64),
    compressor=None)

store = zarr.DirectoryStore('min_example2.zarr', dimension_separator='/')
stored_group = group_model.to_zarr(store, path="")
# to_zarr only writes metadata - no chunks exist yet, so fill the stored
# arrays in place (plain assignment would re-create the arrays instead):
stored_group['s0'][:] = arrays[0]
stored_group['s1'][:] = arrays[1]
```
  • We have full control over metadata - e.g. Axis types and downsampling by different factors in various dimensions etc.
  • No help with actually downsampling arrays - lib just helps with metadata creation & validation
  • But flexible in how we write the data to arrays. E.g. could do a plane at a time etc.
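The scales/translations above are just example values; pydantic-ome-ngff accepts whatever you pass. One common convention (an assumption here, not something the library mandates) derives them from the base pixel size and the per-level downsampling factor, with a half-pixel shift so each level stays centred on the same physical region:

```python
def level_transforms(base_scale, factors):
    """For each downsampling factor f, compute (scale, translation) using the
    half-pixel-centre convention:
        scale_i = base_scale * f
        translation_i = (f - 1) / 2 * base_scale
    """
    out = []
    for f in factors:
        scale = [s * f for s in base_scale]
        translation = [(f - 1) / 2 * s for s in base_scale]
        out.append((scale, translation))
    return out

# Base pixel size 1.25 nm in y and x; levels downsampled by 1x and 2x:
for scale, translation in level_transforms([1.25, 1.25], [1, 2]):
    print(scale, translation)
# [1.25, 1.25] [0.0, 0.0]
# [2.5, 2.5] [0.625, 0.625]
```

Under that convention the level-1 translation would be [0.625, 0.625] rather than the [1.0, 1.0] used in the snippet above, which shows the flexibility (and the burden) of the library leaving these numbers entirely up to the caller.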

Others

https://github.com/janelia-cellmap/pydantic-zarr - Not OME-Zarr

https://github.com/CBI-PITT/stack_to_multiscale_ngff - Python-based command-line tool, e.g. for converting TIFF stacks to OME-Zarr:

```shell
python ~/stack_to_multiscale_ngff/stack_to_multiscale_ngff/builder.py \
    '/path/to/tiff/stack/channel1' '/path/to/tiff/stack/channel2' '/path/to/tiff/stack/channel3' \
    '/path/to/output/multiscale.omehans' \
    --scale 1 1 0.280 0.114 0.114 \
    --origionalChunkSize 1 1 1 1024 1024 \
    --finalChunkSize 1 1 64 64 64 \
    --fileType tif
```

https://github.com/bioio-devs/bioio - uses https://github.com/bioio-devs/bioio-ome-zarr which uses ome-zarr-py.
