Feature/anndata io #1

Merged
merged 175 commits into from
Jun 20, 2023

minnerbe
Collaborator

This is an attempt to include AnnData support:

  • Datasets can be in either the pre-existing N5-based layout, or an AnnData-conforming layout.
  • For both layouts, all backends supported by the N5 API (FileSystem, HDF5, Zarr) can be used, although some may not work properly until N5 can write string datasets; see also this PR.
  • Containerization of datasets is now optional, but alignment still needs a container due to the inherent multi-dataset structure of the task.

Major changes in the code:

  • Tests were added for all parts of the code that were changed.
  • The io module underwent a major re-design, implementing the features described above.
  • The documentation was updated to reflect these changes.

This should open a simple HDF5 dataset (also sent by Stephan) of a mouse
brain.
A short (but comprehensive?) summary of everything that can be stored in
the .h5ad format can be found at
    https://anndata.readthedocs.io/en/latest/fileformat-prose.html

It seems that only the top-level fields ['X', 'layers', 'obs', 'obsm',
'obsp', 'uns', 'var', 'varm', 'varp'] as well as some sub-fields (like
'_index', 'indices', 'indptr', 'data', ...) are prescribed. Other names
and structure are chosen by the user.
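
As a rough illustration of the observation above, the following minimal sketch splits the names found at the root of a container into the prescribed top-level fields and everything else. It uses only the field list quoted in the message; the function name `split_top_level` and the example name `my_notes` are hypothetical and not part of this PR or of AnnData.

```python
# Top-level field names as listed in the commit message above.
PRESCRIBED_TOP_LEVEL = frozenset(
    ["X", "layers", "obs", "obsm", "obsp", "uns", "var", "varm", "varp"]
)


def split_top_level(names):
    """Split names into (prescribed, user_chosen), each sorted.

    Hypothetical helper for illustration only.
    """
    names = set(names)
    prescribed = sorted(names & PRESCRIBED_TOP_LEVEL)
    user_chosen = sorted(names - PRESCRIBED_TOP_LEVEL)
    return prescribed, user_chosen


if __name__ == "__main__":
    # Hypothetical root listing; "my_notes" is a made-up, user-chosen name.
    print(split_top_level(["X", "obs", "var", "my_notes"]))
```

Sub-fields such as '_index' or 'indptr' are not covered here; the sketch only looks at the root level.
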
The current version of N5 cannot handle strings, which might be a dealbreaker
for AnnData.

Throw away all other stuff that was just for diagnostics/exploring.

The files need to have locations stored under '/obs/locations'.
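
The location requirement can be pictured as a simple path check. The sketch below is only an illustration, assuming the container hierarchy is modelled as nested dictionaries; `has_path` and the sample data are hypothetical and do not reflect how the io module in this PR performs the check.

```python
from collections.abc import Mapping

# Path named in the commit message above.
LOCATIONS_PATH = "/obs/locations"


def has_path(root, path):
    """Return True if every component of `path` exists under `root`.

    `root` is any nested mapping standing in for a container's groups
    (an assumption made for this sketch).
    """
    node = root
    for part in path.strip("/").split("/"):
        if not isinstance(node, Mapping) or part not in node:
            return False
        node = node[part]
    return True


if __name__ == "__main__":
    # Made-up stand-in for a file with two spots and 2D coordinates.
    sample = {"X": [[1.0], [2.0]], "obs": {"locations": [[0.0, 0.0], [1.0, 0.5]]}}
    print(has_path(sample, LOCATIONS_PATH))
```
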
@minnerbe minnerbe merged commit 3e6a6e5 into master Jun 20, 2023
@minnerbe minnerbe deleted the feature/anndata-io branch June 20, 2023 17:10