-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Add mne_ch_name_mapping #29
Conversation
@jnenonen we need you to approve or question this. |
Hi,
Looks OK, but I would like to leave the decision to Matti Kajola.
BR, Jukka
[A screenshot of a cell phone Description automatically generated]--
From: Eric Larson <[email protected]>
Reply to: mne-tools/fiff-constants <[email protected]>
Date: Thursday 19. November 2020 at 14.59
To: mne-tools/fiff-constants <[email protected]>
Cc: Jukka Nenonen <[email protected]>, Mention <[email protected]>
Subject: [mne-tools/fiff-constants] ENH: Add mne_ch_name_mapping (#29)
@jnenonen<https://github.com/jnenonen> this PR adds a new tag to the MNE namespace to allow channel names longer than 15 characters by giving a JSON dictionary mapping between 15-char and arbitrary-length names. But before doing that, I thought I'd check to see if MEGIN folks would be interested in using this sort of functionality, in which case it could live in the main namespace. Quoting the MNE issue<mne-tools/mne-python#8312 (comment)>:
The reason there is a limitation at all IIUC is beacuse the channel info struct is a fixed-size C struct. We could add a FIFFV_MNE_CHANNEL_NAMES_MAPPING tag as a string JSON dump of the dict mapping short-to-original names that has the following behavior:
1. When writing, (temporarily) replace info['chs'] and info['bads'] names with the shortened ones; write the FIFFV_MNE_CHANNEL_NAMES_MAPPING JSON
2. When reading, check for the presence of the FIFFV_MNE_CHANNEL_NAMES_MAPPING JSON and use it to make the appropriate substitutions in info['chs'] and info['bads'].
In other words, we add a tag that says "do rename_channels(...) during writing and reading".
In theory this could be implemented in MEGIN C programs, too, but I assume it would be more of a pain because of structs/types/assumptions about these channel names.
Any interest in having this in the main DictionaryTags (e.g., if you think you would use it someday), or should we just keep it here in DictionaryTags_MNE.txt ?
…________________________________
You can view, comment on, or merge this pull request online at:
#29
Commit Summary
* ENH: Add mne_ch_name_mapping
File Changes
* M DictionaryTags_MNE.txt<https://github.com/mne-tools/fiff-constants/pull/29/files#diff-63d3e37c89c229299c0b2a25369cf0765f39ca54778ca3a11a3fa503529039db> (1)
Patch Links:
* https://github.com/mne-tools/fiff-constants/pull/29.patch
* https://github.com/mne-tools/fiff-constants/pull/29.diff
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub<#29>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/ADR3D6QPXBWSGZ5HH7UOZ6DSQUJB7ANCNFSM4T3LWVLQ>.
|
@mkajola sounds like it's up to you! |
Hi,
This kind of solution is of course possible, but probably will not be the Megin solution when it will be needed.
Embedded json is probably practical for python, but not necessarily for other platforms, and it kind of breaks the basic idea of the fiff file.
We already have similar kind of formal issue with the acquisition parameters, which are essentially a dump of internals of one program, but one that have been started to be used, then it becomes like part of the file format, while it clearly is not in the specification, and at the same time it binds the acquisition software design (in our case new systems produce that kind of entry, even though there no such variables in the software).
One also needs multiple parsers to read the files.
The idea that has been in the air for a very long time, has been to change the rigid channel info records to normal fiff-blocks. I think that even the block tags are there from some very distant past.
Then the channel info would be as flexible as anything else, and ne could add arbitrary channel information (descriptions, colors, whatever) for each channel.
This would be especially valuable for processed output files, and contexts other than data from out MEG devices.
For backwards compatibility, the ch-info records could still be present.
One small question in this respect, is whether the records should be inside or outside the ch-info blocks. If they are inside, correctly functioning reader that is aware of the tag context should actually ignore it, so that could broke some of the readers (most of the readers tend to be quite sloppy in this respect, so most likely the damage would be surprisingly small). So possibly better to be outside, parallel to blocks.
The contents of the channel info block should also be specified rather carefully.
One of the questions is, that if they are present, should all the info in the ch-info records be also there, so that they can be dropped away at some phase when most of the readers can handle the new structures.
I suppose for overlapping information the block would have precedence.
The ch-info structure is very much channel counting/labeling scheme in Neuromag-122 from last millennium, so it probably would be time to think what is really needed there.
Possibly the old scheme is ok, but if the encoding is changed, then one could think a bit about the contents too.
What is the actual need here? Is it just a longer name for the channel. Then no new logical info would be in the file, just encoded differently.
In some contexts, there seem to be also need to have e.g. user defined labels etc. which are necessarily not the same thing as the ‘name’ of the channel.
One example where we have been bit indecisive, is whether the EEG labels should the ‘channel names’, or should there be both the ‘electrode location’, and the ‘physical channel’ names present.
The exact purposes of the data items should be specified.
Even bigger question about all kinds of names is the string encoding. Current fiff spec requires isolatin 1, though on this century Unicode is the way to go.
t. mjk
From: Eric Larson <[email protected]>
Sent: torstai 19. marraskuuta 2020 16.03
To: mne-tools/fiff-constants <[email protected]>
Cc: Matti Kajola <[email protected]>; Mention <[email protected]>
Subject: Re: [mne-tools/fiff-constants] ENH: Add mne_ch_name_mapping (#29)
@mkajola<https://github.com/mkajola> sounds like it's up to you!
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub<#29 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/AKJMTHBJ6EN4CLZNENHLKLLSQUQQLANCNFSM4T3LWVLQ>.
|
For what it's worth, we chose JSON because of its adoption across multiple languages -- a quick search turned up several possible C parsers (e.g., this one that is MIT/commercial-compatible single source and single header file) and the same goes for other languages. But it is indeed more overhead than using FIF tags directly, as you say, and is really just a quick workaround/band-aid solution to use a single string tag to encode what typically would be encoded using multiple tags (via a new block type, etc.).
Channel-name-length is the most pressing need, but it would be great to encode other information in a more flexible manner, support unicode, etc. as these are also limitations of the current format. A new "channel block" scheme that takes precedence over the old "channel struct" entries is definitely a better solution! To me the question is, what do you see as a potential timeline for finalization and adoption of this sort of scheme? If we knew that this proposal could be discussed, finalized, and implemented within 6 months, there wouldn't be any point in us implementing the band-aid JSON approach. On the other hand, if we knew it take 6 years, then the JSON band-aid would have value as a short/medium-term workaround because it solves a real problem for our users in the meantime. ProposalTo try to keep the proposed "channel block" solution toward the <= 6-month end of the spectrum, how about we start small with only the components we need within a framework we can expand later as we see a need for new tags? Concretely, we need to keep the channel info struct for backward compatibility, so there is no reason to replicate the entries already in there. So let's create a new block and for now only add the types that would take care of two concrete problems -- channel name length, and (optionally) unicode encoding:
Step (4) seems like the most annoying to implement in C, so it could be omitted and implemented later -- LATIN-1 strings could be used for now. To represent this visually with how it would show up in
Does this sort of thing seem like a reasonable starting point? I think it checks all the boxes you mention above @mkajola and it should be easy enough for us to implement at the MATLAB/Python end, and definitely seems better from a long-term perspective. |
(FYI I edited my GitHub comment to fix name consistency in the proposal, so if you're looking at my replies over email it's worth going on GitHub to see the up-to-date version of the proposal. Also the tag/block names and 000's are just placeholders -- if we agree this is a good way to move forward, I can open a PR with some proposed names and we can iterate there!) |
I can only voice my support to a solution over the next few months. We have more and more users hitting the limitations of fif when working with different input formats that do not have the same restrictions. It's becoming problematic for a wider adoption of mne. my 2c |
I think that this would be useful thing, so lets see if we could realize this. That would be then in version 1.5. (Do we have the Manual somewhere in github, or only the tag directory text file). Unicode should then be 2.x later on, since touching the structures/primary types will not be backwards compatible. What would be the correct talking/specification environment for this. I suppose here we are approving/disapproving the json workaround. For me that's fine if it is in the MNE reserved area. |
I opened a PR that is an alternative to / supersedes this one: #33. Let's discuss there as people on GitHub can leave comments directly on the code in line, comment on the diff, etc. If we merge that PR we'll close this JSON band-aid one as it's redundant and a less good solution.
Just text -- let's discuss further in #31, it would be great to have the build automated.
Opened an issue to discuss as people have ideas: #30 |
Closing this as we hopefully won't need it! |
@jnenonen this PR adds a new tag to the MNE namespace to allow channel names longer than 15 characters by giving a JSON dictionary mapping between 15-char and arbitrary-length names. But before doing that, I thought I'd check to see if MEGIN folks would be interested in using this sort of functionality, in which case it could live in the main namespace. Quoting the MNE issue:
In theory this could be implemented in MEGIN C programs, too, but I assume it would be more of a pain because of structs/types/assumptions about these channel names.
Any interest in having this in the main DictionaryTags (e.g., if you think you would use it someday), or should we just keep it here in DictionaryTags_MNE.txt ?