Sort keys in metadata .json files #211

Merged
ieivanov merged 75 commits into unified-api from sort_metadata_keys
Apr 15, 2024

Conversation

ieivanov
Contributor

Keys in the metadata dump files used to be unordered, making them difficult to browse visually. This PR sorts the keys, resulting in metadata like this:

{
    "0/0/0": {
        "Axes": {
            "z": 0
        },
        "Binning": "2",
        "BitDepth": 12,
        "CameraChannelIndex": 0,
        "Channel": "Default",
        "ChannelIndex": 0,
        "Exposure": 100,
        "Frame": 0,
        "FrameIndex": 0,
        "Height": 1024,
        "PixelSizeAffine": "0.0;0.0;0.0;0.0;0.0;0.0",
        "PixelSizeUm": 0,
        "PixelSize_um": 0,
        "PixelType": "GRAY16",
        "Position": "Default",
        "PositionIndex": 0,
        "ROI": "0-0-1224-1024",
        "Slice": 0,
        "SliceIndex": 0,
        "Time": "2024-02-15 14:08:53 -",
        "Width": 1224,
        "ZPosition_um_Intended": -2,
        "AP Galvo-DA Device": "TS2_DAC03",
        "AP Galvo-Description": "ZStage controlled with voltage provided by a DA board",
        "AP Galvo-Name": "DA Z Stage",
        "AP Galvo-Position": "0.0000",
        "AP Galvo-Stage High Position(um)": "156.5000",
        "AP Galvo-Stage High Voltage": "5.0000",
        "AP Galvo-Stage Low Position(um)": "-156.5000",
        "AP Galvo-Stage Low Voltage": "-5.0000",
        "Blackfly BFP-ADC Bit Depth": "Bit12",
        "Blackfly BFP-Binning": "1",
        "Blackfly BFP-Black Level": "2.0000",
...

One drawback is that ElapsedTime-ms no longer sorts near the top, but I don't think that's a big problem.
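
For readers who want to try the idea outside this PR, here is a minimal sketch of sorting keys when dumping metadata to JSON. It relies only on the standard library's `json.dumps(..., sort_keys=True)`; the helper name `dump_sorted` and the sample dictionary are hypothetical and are not taken from the PR's code. Note that the sample output above appears to place image-level keys ahead of device property keys, so the PR's actual ordering is probably not one plain lexicographic sort like the one shown here.

```python
import json


def dump_sorted(metadata: dict) -> str:
    """Serialize metadata with keys sorted at every nesting level.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    return json.dumps(metadata, indent=4, sort_keys=True)


# Small made-up example with keys in arbitrary order.
unsorted = {
    "0/0/0": {
        "Width": 1224,
        "Binning": "2",
        "Axes": {"z": 0},
        "Height": 1024,
    }
}

text = dump_sorted(unsorted)
print(text)
```

`sort_keys=True` sorts nested dictionaries too, so the per-frame entries and their sub-dictionaries all come out in a stable order.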

@ieivanov ieivanov requested a review from ziw-liu February 22, 2024 02:05
@ziw-liu
Collaborator

ziw-liu commented Feb 22, 2024

Can you measure the overhead of doing this? I've been testing the ometiff-uapi branch and observed that converting metadata is already taking more time than converting the images themselves, as each JSON file (for each FOV) could be 300 MB or more.

@ieivanov
Contributor Author

Oh interesting - do you have infrastructure for testing this? We can't do without the metadata, but I agree that sorting it is optional.

@ziw-liu
Collaborator

ziw-liu commented Feb 24, 2024

An easy way is to pick a large dataset and measure the time needed for sorting.
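
As a rough illustration of that kind of measurement, the sketch below times JSON serialization with and without key sorting on synthetic metadata. Everything here is an assumption made for the example: the helpers `make_metadata` and `time_dump`, the dictionary sizes, and the use of `json.dumps` as the serializer. It measures only serialization, not a full dataset conversion, so the numbers are not comparable to an end-to-end run.

```python
import json
import random
import string
import time


def make_metadata(n_frames: int, n_keys: int, seed: int = 0) -> dict:
    """Build synthetic per-frame metadata with randomly named keys."""
    rng = random.Random(seed)
    metadata = {}
    for i in range(n_frames):
        frame = {}
        for _ in range(n_keys):
            key = "".join(rng.choices(string.ascii_letters, k=12))
            frame[key] = rng.randint(0, 4096)
        metadata[f"{i}/0/0"] = frame
    return metadata


def time_dump(metadata: dict, sort_keys: bool, repeats: int = 3) -> float:
    """Return the best wall-clock time (seconds) over several dumps."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        json.dumps(metadata, indent=4, sort_keys=sort_keys)
        best = min(best, time.perf_counter() - start)
    return best


metadata = make_metadata(n_frames=200, n_keys=50)
t_unsorted = time_dump(metadata, sort_keys=False)
t_sorted = time_dump(metadata, sort_keys=True)
print(f"unsorted: {t_unsorted:.4f} s, sorted: {t_sorted:.4f} s")
```

Taking the best of several repeats reduces noise from other processes; for a real decision, timing the whole conversion on a large dataset is the more meaningful test.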

@ziw-liu ziw-liu added the μManager Micro-Manager files and metadata label Feb 28, 2024
Base automatically changed from ometiff-uapi to unified-api March 4, 2024 19:02
@ieivanov
Contributor Author

ieivanov commented Apr 6, 2024

Not sure what happened to this PR - maybe it needs to be rebased?

I tested the conversion as you suggested. I picked this ~150 GB dataset:

/hpc/instruments/cm.mantis/2023_09_21_OpenCell_targets/opencell_hcs_1/opencell_hcs_lightsheet_1/

Format:			 ndtiff
FOVs:			 93
FOV shape:		 T=1, C=2, Z=593, Y=300, X=2048
Channel names:		 ['GFP EX488 EM525-45', 'mCherry EX561 EM600-37']
(Z, Y, X) scale (um):	 (0.313, 0.1161, 0.1161)

I copied it to /tmp to avoid some of the I/O overhead. The conversion finished in 383 seconds on the unified-api branch and in 385 seconds on this branch, so sorting adds about 2 seconds (roughly 0.5%). I think that's worthwhile. I confirmed that the metadata keys are unsorted in the first version and sorted in the second.

commit fac2c13
Author: Ivan Ivanov <[email protected]>
Date:   Tue Apr 9 11:25:36 2024 -0700

    Fix bug reading dragonfly acquisitions (#215)

    * fix bug reading dragonfly acquisitions

    * fix typo

    * style

    * bugfix

commit 0c6984e
Author: Ivan Ivanov <[email protected]>
Date:   Mon Mar 11 12:35:51 2024 -0700

    Fix bug determining number of rows and cols (#214)

    * fix bug determining number of rows and cols

    * add another XY Stage variation

    * add docs and fix style

commit 3ab89ba
Author: Ziwen Liu <[email protected]>
Date:   Mon Mar 4 11:02:49 2024 -0800

    Universal API implementations for Micro-Manager OME-TIFF and NDTiff (#185)

    * wip: draft mmstack ome-tiff fov

    * MM FOV base class

    * tests

    * bump tifffile

    * comment

    * fix indent after rebase

    * use get default

    * test pixel indexing

    * set MM metadata

    * style

    * update dependencies

    * add xarray

    * move old readers to the `_deprecated` namespace

    * uapi for ndtiff

    * refactor test setup to parametrize by dataset
    use globals instead of fixtures since parametrization happens before fixture evaluation

    * convert mmstack

    * fix and test chunking

    * fix metadata conversion and test ndtiff

    * update cli

    * fix scaling

    * test 1.4 and incomplete ome-tiffs

    * move reader tests

    * deprecate reader tests

    * update deprecated tests

    * update ngff tests

    * isort

    * update black target to 3.10

    * lint

    * fix download paths

    * update docs references and theme

    * untrack autogenerated file

    * ignore execution time file

    * add github icon

    * update docstring

    * update docstring

    * show channel names and chunk size in info

    * print plate chunk size if verbose

    * fallback for pixel size

    * remove log level setting

    * do not filter logs and warnings in reader

    * avoid root logger

    * isort

    * set default logging level to INFO

    * format docstring

    * improve conversion messages

    * black

    * fix ome-tiff channel name indexing

    * fix ndtiff channel name indexing

    * update converter test

    * remove use of os.path in `reader`

    * expand _check_ndtiff checks

    * fix iteration

    * fix python 3.10

    using `Path.glob(*/)` to get subdirs was added in 3.11

    * bump zarr version to include resizing fix
    zarr-developers/zarr-python#1540

    * fix cli default

    * set log level with an environment variable

    * fix unset

    * catch non-existent page

    * implement fallback for incomplete channel names
    workaround for the Dragonfly microscope where the multi-camera setup only has one channel name written

    * add debug logs

    * handle virtual frames

    * try reading pages from TiffFile directly

    * filter error logs about ImageJ metadata being broken
    this is a known MM limitation when writing OME-TIFFs

    * fix regex

    * remove use of os.path in `convert.py`

    * better channel indexing in `_get_summary_metadata`

    * style

    * safer NoneType check

    * private default axis names for NDTiff

    * update documentation to reflect new entry point

    * add repr to MM FOV and dataset types

    * rename mm_meta and expose summary metadata

    * add MicroManagerFOVMapping.root

    * add MicroManagerFOVMapping.zyx_scale

    * add warning log for failed position grid

    * fix grid layout

    * suppress hypothesis flakiness

    * different health check suppression

    ---------

    Co-authored-by: Ivan Ivanov <[email protected]>
@ieivanov
Contributor Author

@ziw-liu I think this PR is ready now

@ieivanov ieivanov merged commit 420f052 into unified-api Apr 15, 2024
7 checks passed
@ieivanov ieivanov deleted the sort_metadata_keys branch April 15, 2024 23:56