Add reader for annotations in MIT format #12660

Open
drammock opened this issue Jun 13, 2024 · 4 comments · May be fixed by #13030

@drammock
Member

Describe the new feature or enhancement

The CHB-MIT Scalp EEG Database (physionet) has annotation files that MNE can't currently read. It's been asked about at least twice on our forum (one, two) and @Teuniz helpfully explained there what the annotation format is, which means it is possible for us to write a reader for the format.

Describe your proposed implementation

A new private function, _read_annot_mit() or similar, plus accompanying logic in the existing mne.read_annotations() function to dispatch to it.

Describe possible alternatives

A separate public function for reading just this type of annotation. Why might that be preferable? The existing mne.read_annotations() triages based on file extension. In the dataset linked above the file extension is .seizure, but the annotation format was designed for ECG so it's likely that there are files in this format with .ecg extensions out there too, and possibly other extensions as well. This makes it hard to know what file extension(s) to use to identify this format, so a separate reader would be more flexible.

Another possibility is adding a new parameter to the existing reader func (mit_format=False or so). If this switch were True, we triage to the new private func regardless of what the file extension is.

My preference is to overload the existing reader function, and (for now) only triage based on .seizure extension (since it's the only one we know our users want supported). We can always later add to the list of file extensions that map to this format, if users ask us to.

Additional context

Here's the text of the reference spec.

 Each annotation occupies an even number of bytes. The first byte in each pair is the least significant byte. The six most significant bits (A) of each byte pair are the annotation type code, and the ten remaining bits (I) specify the time of the annotation, measured in sample intervals from the previous annotation (or from the beginning of the record for the first annotation). If 0 < A <= ACMAX, then A is defined in <ecg/ecgcodes.h>. Several other possibilities exist:

A = SKIP [59.]
    I = 0; the next four bytes are the interval in PDP-11 long integer format (the high 16 bits first, then the low 16 bits, with the low byte first in each pair). 
A = NUM [60.]
    I = annotation num field for current and subsequent annotations; otherwise, assume previous annotation num (initially 0). 
A = SUB [61.]
    I = annotation subtyp field for current annotation only; otherwise, assume subtyp = 0. 
A = CHN [62.]
    I = annotation chan field for current and subsequent annotations; otherwise, assume previous chan (initially 0). 
A = AUX [63.]
    I = number of bytes of auxiliary information (which is contained in the next I bytes); an extra null, not included in the byte count, is appended if I is odd. 
A = I = 0: End of file. 
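
To make the spec above concrete, here is a rough decoding sketch in plain Python. It is not the proposed MNE implementation. Assumptions that go beyond the quoted text: the function name and output layout are made up, any code below SKIP is treated as an ordinary annotation, NUM/SUB/CHN/AUX pairs are attached to the most recently read annotation, the I field of those pairs is read as unsigned, and the SKIP interval is interpreted as a signed 32-bit value.

```python
import struct

# Special type codes from the spec quoted above.
SKIP, NUM, SUB, CHN, AUX = 59, 60, 61, 62, 63


def parse_mit_annotations(data):
    """Decode MIT-format annotation bytes into a list of dicts (sketch)."""
    annotations = []
    sample = 0      # running time, in sample intervals from the record start
    num = chan = 0  # "sticky" fields that carry over to later annotations
    pos = 0
    while pos + 2 <= len(data):
        # Each pair is little-endian: 6 high bits = A, 10 low bits = I.
        (word,) = struct.unpack_from("<H", data, pos)
        pos += 2
        code, value = word >> 10, word & 0x3FF
        if code == 0 and value == 0:
            break  # A = I = 0 marks end of file
        if code == SKIP:
            # PDP-11 long: high 16-bit word first, low byte first in each word.
            high, low = struct.unpack_from("<HH", data, pos)
            pos += 4
            interval = (high << 16) | low
            if interval >= 1 << 31:  # assumption: two's-complement signed value
                interval -= 1 << 32
            sample += interval
        elif code == NUM:
            num = value
            if annotations:
                annotations[-1]["num"] = num
        elif code == SUB:
            if annotations:
                annotations[-1]["subtyp"] = value
        elif code == CHN:
            chan = value
            if annotations:
                annotations[-1]["chan"] = chan
        elif code == AUX:
            aux = bytes(data[pos:pos + value])
            pos += value + (value % 2)  # skip the pad byte when I is odd
            if annotations:
                annotations[-1]["aux"] = aux
        else:
            # Ordinary annotation: I is the offset from the previous one.
            sample += value
            annotations.append({"sample": sample, "code": code, "subtyp": 0,
                                "chan": chan, "num": num, "aux": b""})
    return annotations
```

The returned `sample` values are cumulative sample indices, so dividing by the recording's sampling rate would give onsets in seconds.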

Copied from the wayback machine because the original page was timing out for me on the day I opened this issue.

@drammock drammock added the ENH label Jun 13, 2024
@larsoner
Member

> My preference is to overload the existing reader function, and (for now) only triage based on .seizure extension (since it's the only one we know our users want supported). We can always later add to the list of file extensions that map to this format, if users ask us to.

Agreed and rather than mit_format I'd rather have fmt="auto" that you could set to "mit" (or any of the other supported formats).

@drammock
Member Author

> Agreed and rather than mit_format I'd rather have fmt="auto" that you could set to "mit" (or any of the other supported formats).

If we triage based on file extension then IMO the extra param isn't necessary. But if in future triaging based on file extension becomes impractical then I agree fmt="auto" | "mit" is better than what I suggested.
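
For comparison, a sketch of how an `fmt="auto"` parameter could sit on top of extension triage. `resolve_annotation_format` and `_SUFFIXES_BY_FORMAT` are hypothetical names, and no `fmt` parameter is claimed to exist in mne.read_annotations() today.

```python
from pathlib import Path

# Hypothetical table of format name -> file suffixes recognized in "auto" mode.
_SUFFIXES_BY_FORMAT = {"mit": {".seizure"}}


def resolve_annotation_format(fname, fmt="auto"):
    """Pick a format name, letting an explicit ``fmt`` override the suffix."""
    if fmt != "auto":
        if fmt not in _SUFFIXES_BY_FORMAT:
            raise ValueError(f"Unknown annotation format {fmt!r}")
        return fmt  # explicit choice: the file extension is ignored
    suffix = Path(fname).suffix.lower()
    for name, suffixes in _SUFFIXES_BY_FORMAT.items():
        if suffix in suffixes:
            return name
    raise ValueError(f"Cannot infer annotation format from suffix {suffix!r}")
```

With this shape, a file with an unexpected extension (say `.ecg`) could still be read by passing `fmt="mit"`, while the default behavior stays purely extension-based.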

@adam2392
Member

Wow this would've helped my life a long time ago :p. I always thought those were junk files! Perhaps some ppl in my old lab are interested.

@withmywoessner
Contributor

withmywoessner commented Dec 11, 2024

Hello All!
It looks like there is already a Python library that can read these annotations:
https://wfdb.readthedocs.io/en/latest/wfdb.html#wfdb-annotations
It matches the annotations on PhysioNet

Maybe it can be incorporated into MNE? Alternatively, I don't mind implementing it myself without using the package.
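
A rough sketch of what leaning on the wfdb package might look like. `wfdb.rdann` and the `sample`, `symbol`, and `fs` attributes of the object it returns come from the wfdb package; the two helper functions are hypothetical, and the sketch assumes `fs` was populated (it may not be if no matching header file is found).

```python
def samples_to_onsets(samples, sfreq):
    """Convert annotation sample indices to onsets in seconds."""
    return [s / sfreq for s in samples]


def read_onsets_with_wfdb(record_name, extension):
    """Read onsets (in seconds) and symbols via wfdb (third-party, optional)."""
    import wfdb  # imported lazily so the helper above works without wfdb

    ann = wfdb.rdann(record_name, extension)
    # Assumption: ann.fs is set; if it is None the sampling rate would have
    # to come from the EEG recording itself.
    return samples_to_onsets(ann.sample, ann.fs), list(ann.symbol)
```

The onsets and symbols returned this way could then be passed to mne.Annotations as onset and description, with durations chosen by the caller.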

@withmywoessner withmywoessner linked a pull request Dec 16, 2024 that will close this issue

4 participants