Sort keys in metadata .json files #211

ieivanov · 2024-02-22T02:05:58Z

Keys in the metadata dump files used to be unordered, making them difficult to browse visually. This PR sorts the keys, resulting in metadata like this:

{
    "0/0/0": {
        "Axes": {
            "z": 0
        },
        "Binning": "2",
        "BitDepth": 12,
        "CameraChannelIndex": 0,
        "Channel": "Default",
        "ChannelIndex": 0,
        "Exposure": 100,
        "Frame": 0,
        "FrameIndex": 0,
        "Height": 1024,
        "PixelSizeAffine": "0.0;0.0;0.0;0.0;0.0;0.0",
        "PixelSizeUm": 0,
        "PixelSize_um": 0,
        "PixelType": "GRAY16",
        "Position": "Default",
        "PositionIndex": 0,
        "ROI": "0-0-1224-1024",
        "Slice": 0,
        "SliceIndex": 0,
        "Time": "2024-02-15 14:08:53 -",
        "Width": 1224,
        "ZPosition_um_Intended": -2,
        "AP Galvo-DA Device": "TS2_DAC03",
        "AP Galvo-Description": "ZStage controlled with voltage provided by a DA board",
        "AP Galvo-Name": "DA Z Stage",
        "AP Galvo-Position": "0.0000",
        "AP Galvo-Stage High Position(um)": "156.5000",
        "AP Galvo-Stage High Voltage": "5.0000",
        "AP Galvo-Stage Low Position(um)": "-156.5000",
        "AP Galvo-Stage Low Voltage": "-5.0000",
        "Blackfly BFP-ADC Bit Depth": "Bit12",
        "Blackfly BFP-Binning": "1",
        "Blackfly BFP-Black Level": "2.0000",
...

One remaining issue is that ElapsedTime-ms is not sorted near the top, but I don't think that matters much.
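For readers who want to see what key ordering does to a metadata dump, here is a minimal sketch. It is not the code from this PR. `grouped_sort` is a hypothetical helper, and the `ElapsedTime-ms` value is made up. The sketch compares plain `json.dumps(..., sort_keys=True)` with a two-group ordering (keys without a hyphen first, then hyphenated `Device-Property` keys). The grouping rule is an assumption that happens to match the sample above, where `Axes` precedes `AP Galvo-...`; it would also place `ElapsedTime-ms` in the second group.

```python
import json


def grouped_sort(frame_meta: dict) -> dict:
    """Hypothetical helper: plain keys first, then hyphenated keys, each group sorted."""
    return dict(sorted(frame_meta.items(), key=lambda kv: ("-" in kv[0], kv[0])))


# Illustrative frame metadata in arbitrary order (values are sample data).
frame_meta = {
    "Width": 1224,
    "AP Galvo-Name": "DA Z Stage",
    "ElapsedTime-ms": 12,
    "Axes": {"z": 0},
    "Binning": "2",
}

# Plain lexicographic ordering, as produced by the standard library.
plain_order = list(json.loads(json.dumps(frame_meta, sort_keys=True)))

# Two-group ordering under the assumption described above.
grouped_order = list(grouped_sort(frame_meta))

print(plain_order)
print(grouped_order)
```

With plain `sort_keys=True`, `AP Galvo-Name` sorts ahead of `Axes` because uppercase `P` compares lower than lowercase `x`, so a grouping step of some kind is needed to get the layout shown in the description.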

ziw-liu · 2024-02-22T03:17:28Z

Can you measure the overhead of doing this? I've been testing the ometiff-uapi branch and observed that converting metadata is already taking more time than converting the images themselves, as each JSON file (for each FOV) could be 300 MB or more.

ieivanov · 2024-02-22T17:40:59Z

Oh interesting - do you have infrastructure for testing this? We can't do without the metadata, but I agree that sorting it is optional.

ziw-liu · 2024-02-24T09:23:22Z

An easy way is to pick a large dataset and measure the time needed for sorting.
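As a rough complement to timing a full conversion, the serialization step can be timed in isolation. The sketch below is only an illustration: it uses synthetic metadata, and `make_fake_metadata` and `time_dump` are hypothetical helpers. It measures `json.dumps` alone, not the converter in this repository.

```python
import json
import random
import string
import time


def make_fake_metadata(n_frames: int, n_keys: int, seed: int = 0) -> dict:
    """Build synthetic per-frame metadata with keys in random order."""
    rng = random.Random(seed)
    keys = ["".join(rng.choices(string.ascii_letters, k=12)) for _ in range(n_keys)]
    meta = {}
    for i in range(n_frames):
        shuffled = rng.sample(keys, len(keys))
        meta[f"0/0/{i}"] = {k: rng.random() for k in shuffled}
    return meta


def time_dump(meta: dict, sort_keys: bool) -> float:
    """Return the seconds spent serializing `meta` to a JSON string."""
    start = time.perf_counter()
    json.dumps(meta, sort_keys=sort_keys, indent=4)
    return time.perf_counter() - start


meta = make_fake_metadata(n_frames=200, n_keys=100)
unsorted_s = time_dump(meta, sort_keys=False)
sorted_s = time_dump(meta, sort_keys=True)
print(f"unsorted: {unsorted_s:.3f} s, sorted: {sorted_s:.3f} s")
```

The numbers depend on the machine and on the shape of the metadata, so a run on a real dataset remains the more meaningful test.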

ieivanov · 2024-04-06T02:41:08Z

Not sure what happened to this PR - maybe it needs to be rebased?

I tested the conversion as you suggested. I picked this ~150 GB dataset:

/hpc/instruments/cm.mantis/2023_09_21_OpenCell_targets/opencell_hcs_1/opencell_hcs_lightsheet_1/

Format:			 ndtiff
FOVs:			 93
FOV shape:		 T=1, C=2, Z=593, Y=300, X=2048
Channel names:		 ['GFP EX488 EM525-45', 'mCherry EX561 EM600-37']
(Z, Y, X) scale (um):	 (0.313, 0.1161, 0.1161)

I copied it to /tmp to avoid some of the I/O overhead. The conversion finished in 383 seconds using the unified-api branch and in 385 seconds using this branch, a difference of about 0.5%. I think that's worthwhile. I confirmed that the metadata keys are unsorted in one version and sorted in the other.

commit fac2c13
Author: Ivan Ivanov <[email protected]>
Date: Tue Apr 9 11:25:36 2024 -0700

    Fix bug reading dragonfly acquisitions (#215)

    * fix bug reading dragonfly acquisitions
    * fix typo
    * style
    * bugfix

commit 0c6984e
Author: Ivan Ivanov <[email protected]>
Date: Mon Mar 11 12:35:51 2024 -0700

    Fix bug determining number of rows and cols (#214)

    * fix bug determining number of rows and cols
    * add another XY Stage variation
    * add docs and fix style

commit 3ab89ba
Author: Ziwen Liu <[email protected]>
Date: Mon Mar 4 11:02:49 2024 -0800

    Universal API implementations for Micro-Manager OME-TIFF and NDTiff (#185)

    * wip: draft mmstack ome-tiff fov
    * MM FOV base class
    * tests
    * bump tifffile
    * comment
    * fix indent after rebase
    * use get default
    * test pixel indexing
    * set MM metadata
    * style
    * update dependencies
    * add xarray
    * move old readers to the `_deprecated` namespace
    * uapi for ndtiff
    * refactor test setup to parametrize by dataset
      use globals instead of fixtures since parametrization happens before fixture evaluation
    * convert mmstack
    * fix and test chunking
    * fix metadata conversion and test ndtiff
    * update cli
    * fix scaling
    * test 1.4 and incomplete ome-tiffs
    * move reader tests
    * deprecate reader tests
    * update deprecated tests
    * update ngff tests
    * isort
    * update black target to 3.10
    * lint
    * fix download paths
    * update docs references and theme
    * untrack autogenerated file
    * ignore execution time file
    * add github icon
    * update docstring
    * update docstring
    * show channel names and chunk size in info
    * print plate chunk size if verbose
    * fallback for pixel size
    * remove log level setting
    * do not filter logs and warnings in reader
    * avoid root logger
    * isort
    * set default logging level to INFO
    * format docstring
    * improve conversion messages
    * black
    * fix ome-tiff channel name indexing
    * fix ndtiff channel name indexing
    * update converter test
    * remove use of os.path in `reader`
    * expand _check_ndtiff checks
    * fix iteration
    * fix python 3.10 using `Path.glob(*/)` to get subdirs was added in 3.11
    * bump zarr version to include resizing fix zarr-developers/zarr-python#1540
    * fix cli default
    * set log level with an environment variable
    * fix unset
    * catch non-existent page
    * implement fallback for incomplete channel names
      workaround for the Dragonfly microscope where the multi-camera setup only has one channel name written
    * add debug logs
    * handle virtual frames
    * try reading pages from TiffFile directly
    * filter error logs about ImageJ metadata being broken
      this is a known MM limitation when writing OME-TIFFs
    * fix regex
    * remove use of os.path in `convert.py`
    * better channel indexing in `_get_summary_metadata`
    * style
    * safer NoneType check
    * private default axis names for NDTiff
    * update documentation to reflect new entry point
    * add repr to MM FOV and dataset types
    * rename mm_meta and expose summary metadata
    * add MicroManagerFOVMapping.root
    * add MicroManagerFOVMapping.zyx_scale
    * add warning log for failed position grid
    * fix grid layout
    * suppress hypothesis flakiness
    * different health check suppression

    ---------

    Co-authored-by: Ivan Ivanov <[email protected]>

ieivanov · 2024-04-15T22:53:21Z

@ziw-liu I think this PR is ready now

ziw-liu added 30 commits September 1, 2023 14:50

wip: draft mmstack ome-tiff fov

89f609b

MM FOV base class

ce80f9c

tests

5e7c8bc

bump tifffile

e0f54c6

comment

33d69df

fix indent after rebase

bff9c84

use get default

7df30be

test pixel indexing

8053fde

set MM metadata

9d95517

style

31c5897

update dependencies

41772c3

Merge branch 'unified-api' into ometiff-uapi

82747d8

add xarray

049d945

move old readers to the _deprecated namespace

e086b4b

uapi for ndtiff

723ab9d

refactor test setup to parametrize by dataset

6efbb34

use globals instead of fixtures since parametrization happens before fixture evaluation

convert mmstack

d698609

fix and test chunking

a4fbd76

fix metadata conversion and test ndtiff

3cace84

update cli

d57008b

fix scaling

5ae6333

test 1.4 and incomplete ome-tiffs

aacd964

move reader tests

8d4265d

deprecate reader tests

6dc5711

update deprecated tests

e82fbe0

update ngff tests

37c2fbc

isort

6a8d38c

update black target to 3.10

2ec73ab

lint

a96135f

fix download paths

b10f333

ziw-liu and others added 15 commits February 15, 2024 11:56

set log level with an environment variable

f93f0f9

fix unset

232ef1c

catch non-existent page

d9e4380

implement fallback for incomplete channel names

2698317

workaround for the Dragonfly microscope where the multi-camera setup only has one channel name written

add debug logs

7001352

handle virtual frames

ff6a038

try reading pages from TiffFile directly

44c3080

filter error logs about ImageJ metadata being broken

dab5857

this is a known MM limitation when writing OME-TIFFs

fix regex

fa8fbc3

remove use of os.path in convert.py

692e2a7

better channel indexing in _get_summary_metadata

a2e9b53

style

e59e403

safer NoneType check

406012b

private default axis names for NDTiff

a6e13a8

sort metadata keys

0468367

ieivanov requested a review from ziw-liu February 22, 2024 02:05

ziw-liu added the μManager Micro-Manager files and metadata label Feb 28, 2024

Base automatically changed from ometiff-uapi to unified-api March 4, 2024 19:02

ieivanov added 4 commits April 15, 2024 15:37

Merge branch 'unified-api' into sort_metadata_keys

adb9ee5

black

e096c3c

bugfix

916c987

ziw-liu approved these changes Apr 15, 2024

ieivanov merged commit 420f052 into unified-api Apr 15, 2024
7 checks passed

ieivanov deleted the sort_metadata_keys branch April 15, 2024 23:56
