Allow write to accept file-like objects and HDF5 File or Group #378
Comments
Both ideas are interesting. The Python side of the library largely exposes the C++ code, so it's not trivial to add the glue that would allow this functionality. However, we will look into it to see what is required. Thanks for the suggestions. |
@mgeplf any progress on this? I'd really like to leverage MorphIO in a framework we're developing. We take morphologies from several sources, including straight from NeuroMorpho and other online databases, and having to buffer everything in a temporary file on disk is something I'd like to avoid :) |
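For readers following along, the temporary-file buffering mentioned above can be sketched roughly as follows. This is only an illustration of the workaround, not MorphIO code: `write_to_path` is a hypothetical stand-in for any writer that accepts nothing but a filesystem path, and the `.swc` suffix and sample line are assumptions.

```python
import io
import os
import tempfile


def write_to_path(text, path):
    # Hypothetical stand-in for a writer that only accepts a filesystem path.
    with open(path, "w", newline="") as f:
        f.write(text)


def write_via_tempfile(text, fileobj, suffix=".swc"):
    """Write through a temporary file, then copy its bytes into fileobj."""
    fd, tmp_path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # the path-only writer reopens the file itself
    try:
        write_to_path(text, tmp_path)
        with open(tmp_path, "rb") as tmp:
            fileobj.write(tmp.read())
    finally:
        os.remove(tmp_path)  # always clean up the buffer file


buffer = io.BytesIO()
write_via_tempfile("1 1 0.0 0.0 0.0 1.0 -1\n", buffer)
```

The extra round trip through the disk is exactly the cost the request is trying to avoid.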
We're using BlueBrain/HighFive to wrap access to the HDF5 files. To be able to pass groups into writing (and reading), we would have to first see about how (FWIW, what you are describing sounds similar to some of our envisioned morphology storage concepts) |
Wow, this fell off the radar; sorry about that. I remember looking at it a bit, and h5py exposes an id, but I wasn't sure if it was the HDF5 id. It was mainly the glue from h5py that would be a challenge, but I'm forgetting why.
Is a good point; if you can, @Helveg, can you describe what you're envisioning? Perhaps some special API can be added instead of generic h5py handling. |
I see that my original request was for the In Python it is conventional for I/O functions to check 2 things:
I'm not sure if I'm in a great position to recommend any particular design solutions or APIs for this, being so unfamiliar with your code :) IMHO none of the APIs need to change, they just ought to accept more input types. |
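One common way for a Python I/O function to accept more input types without changing its API is sketched below. It is not necessarily the pair of checks referred to above (that list did not survive in this thread), and `open_target` and `write_text` are hypothetical names, not part of MorphIO.

```python
import io
import os


def open_target(target, mode="w"):
    """Return (fileobj, should_close) for a path or a file-like object."""
    if isinstance(target, (str, bytes, os.PathLike)):
        # Path-like input: we open the file, so we are responsible for closing it.
        return open(target, mode), True
    if hasattr(target, "write"):
        # File-like input: the caller owns it, so we leave it open.
        return target, False
    raise TypeError(f"expected a path or an object with write(), got {type(target).__name__}")


def write_text(text, target):
    fileobj, should_close = open_target(target)
    try:
        fileobj.write(text)
    finally:
        if should_close:
            fileobj.close()


sink = io.StringIO()
write_text("1 1 0.0 0.0 0.0 1.0 -1\n", sink)
```

Checking for path types first keeps strings from ever being treated as file-like objects.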
I looked into it more, and with some quick hacking around, I'm not sure it's possible. On linux, it seems that
Since the way that The next best thing that I can think of is to be able to pass both a path to an h5file and a path to a group within it, when reading or writing morphologies:
But I'm a little reluctant to go down this road, mainly because of how h5 handles file locking. @Helveg would that API work for you? I'd have to ponder adding it some more. |
Yea, those would solve the HDF5 related read/writes! What about Python's file-like objects? I think I have a proposal there: In the C++ library, factor out the file reading code from the content processing code, with everything remaining backwards compatible. The Python bindings can then contain a bit of glue to determine the file path (using |
Ok, I will consider that, and we'll have to see when we can find time to add it.
I don't think it's as simple as that; for instance, not all Python file-like objects have paths, and presumably one of the reasons that one would want to be able to use a file-like object would be to control the position within the file that one reads from or writes to (ie: The text-based parsers already accept content, so factoring things out shouldn't be a problem Line 77 in d493136
|
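A minimal sketch of the content-based route for the text formats, assuming a parser that takes a string plus an extension. `parse_content` is a hypothetical stand-in that only counts non-comment lines so the example runs; it is not the MorphIO API. The point is that the object needs neither a path nor a name, and the caller controls where reading starts.

```python
import io


def parse_content(content, extension):
    # Hypothetical stand-in for a content-based parser: it only counts
    # non-empty, non-comment lines so that the example is runnable.
    lines = [
        ln for ln in content.splitlines()
        if ln.strip() and not ln.lstrip().startswith("#")
    ]
    return {"extension": extension, "n_points": len(lines)}


def read_from_filelike(fileobj, extension):
    """Read from the object's current position; no path or name is needed."""
    data = fileobj.read()
    if isinstance(data, bytes):
        data = data.decode("utf-8")
    return parse_content(data, extension)


stream = io.BytesIO(b"HEADER\n1 1 0 0 0 1 -1\n2 3 0 1 0 1 1\n")
stream.seek(len(b"HEADER\n"))  # the caller decides where reading starts
result = read_from_filelike(stream, "swc")
```

Because the extension is passed explicitly, nothing has to be inferred from a file name the object may not have.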
With these extensions the user could be trying to hand MorphIO a path, or a file-like object, and the 2 need to be consistently separated from each other. Not just strings are |
It's possible that we can disambiguate the ctor based on whether an extension is passed, as well as if there is a |
If two distinct HDF5 libraries are being used, I'm not sure what happens in code such as:
Are we guaranteed that the group is created and visible before reopening the file/group from the C++ HDF5 library? Metadata is often aggregated and written in bulk, e.g. when closing the file. If we need to make sure that we're using the same HDF5 library anyway, then it might be nicer to accept |
We appear to be having 2 discussions in parallel:
@1uc > Are we guaranteed that the group is created and visible before reopening the file/group from the C++ HDF5 library? Good point; I'm starting to shy away from wanting to pass around potentially live h5 objects in the interface, especially because of the split HDF5 library, but not only for that reason: it opens up a can of worms with file locking, flushing, etc., and I don't think we have enough control to do it reliably on the C++ side, since Python's refcounted objects may or may not be closed at any particular point. Perhaps more fruitful information will come out of the discussions @matz-e mentioned.
For the case that one wants to pass a descriptor in, that has a name (ie: I played w/ a way of handling For the write side, I think we could add a |
I'd like to store the created files a bit more dynamically than just in a file on the filesystem, to enable various use cases, and because it is conventional in Python, can the write function also accept file-like objects (anything with a write function) to write the morphology to?
Additionally, when storing HDF5, can we pass the handle to a h5py.File or h5py.Group so that we may store morphologies inside the hierarchy of existing HDF5 files.