Relax restrictions / guidelines #2
Hello Suhas. I was developing some Python code to convert scanning tunneling microscopy data from the proprietary RHK format to HDF5 and, after checking NeXus, I came across USID. I think its core idea of flattening data to 2D is really good, and it could actually allow this format to be used as a universal standard for scientific data. This is its strength, and I see that this is why Ancillary datasets are mandatory. On the other hand, for regularly sampled data, as already pointed out, they are also redundant and unnecessarily complicated (besides wasting quite a lot of space, especially for grayscale images). This is something that is holding me back a bit. While still keeping the Main + Ancillary construct, so that any data can be represented, do you think the definition of the Ancillary datasets could be relaxed? For example, the ancillary attribute of a channel could link either to a "full length" dataset (as it does now) or to a group/dataset (to be precisely defined) containing start/stop/increment. One could then let pyUSID (or an equivalent) handle this and, if necessary, create the full-length dataset "on the fly".
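To make the proposal above concrete, here is a minimal sketch of how a compact description could be expanded into a full-length table on demand. It is not pyUSID's API: `LinearDimension` and `expand_to_full_length` are hypothetical names, and the sketch assumes dimensions are listed from fastest- to slowest-varying. It stores start/step/count rather than start/stop/increment, only because an explicit count avoids floating-point ambiguity about whether the endpoint is included.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class LinearDimension:
    """Hypothetical compact description of one regularly sampled dimension."""
    name: str
    start: float
    step: float
    count: int

    def values(self):
        # Compute each value from its index so rounding error does not accumulate.
        return [self.start + i * self.step for i in range(self.count)]


def expand_to_full_length(dimensions):
    """Build the full-length (n_points x n_dims) table of values on the fly.

    Assumption: `dimensions` is ordered from fastest- to slowest-varying.
    """
    # itertools.product varies its LAST argument fastest, so pass the
    # slowest dimension first and flip each row back to the input order.
    axes = [dim.values() for dim in reversed(dimensions)]
    return [tuple(reversed(row)) for row in product(*axes)]


# A 3 x 2 grid: six rows of (X, Y) generated from two small records.
x = LinearDimension("X", start=0.0, step=0.5, count=3)
y = LinearDimension("Y", start=10.0, step=1.0, count=2)
position_values = expand_to_full_length([x, y])
```

For an image of N pixels, the compact form stores a handful of numbers per dimension instead of N rows per dimension, while irregularly sampled data could still use the full-length dataset.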
@mpanighel Thank you for your interest in USID and for sharing your ideas. You bring up good points. We realize that the strict Main + Ancillary dataset rules make USID overkill for simple, small datasets such as images or single spectra. We also realize that in the majority of data the parameters have been varied linearly, and we have given some thought to how to avoid being verbose where it is unnecessary. We would be happy to work with you to incorporate this capability if you are interested. Please feel free to get in touch with us at [email protected] to discuss more (please send us an email at [email protected] with the email address you would like to use for your Slack account).
Per this Google Group conversation, we should relax any restrictions placed on how data should be stored beyond the core Main Dataset - Ancillary Dataset constructs.
We need to comb through the entire specifications document and relax restrictions unless absolutely essential. The few examples I can think of at the moment are: