Load common open-source ECG databases as TensorFlow datasets. The final format is a one-second ECG signal representing a single heartbeat. The R-peak sits at the center of the signal, which is normalized in the range
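The target format described above can be illustrated with a minimal sketch. It is not the repository's actual preprocessing code: the function name `extract_beats`, per-window min-max scaling, and the handling of edge peaks are assumptions made for illustration, and the normalization range is passed in as a parameter rather than assumed.

```python
from typing import List, Sequence, Tuple


def extract_beats(
    signal: Sequence[float],
    r_peaks: Sequence[int],
    sampling_rate: int,
    value_range: Tuple[float, float],
) -> List[List[float]]:
    """Cut one-second windows centered on each R-peak and rescale each window.

    Hypothetical helper: min-max scaling per window is an assumption.
    """
    half = sampling_rate // 2
    low, high = value_range
    beats = []
    for peak in r_peaks:
        start = peak - half
        stop = start + sampling_rate  # exactly one second of samples
        # Skip peaks too close to the edges to yield a full window.
        if start < 0 or stop > len(signal):
            continue
        window = signal[start:stop]
        w_min, w_max = min(window), max(window)
        span = w_max - w_min
        # A flat window has no amplitude to scale; map it to the lower bound.
        if span == 0:
            beats.append([low] * sampling_rate)
            continue
        beats.append([low + (x - w_min) * (high - low) / span for x in window])
    return beats


# Toy usage: 4 samples per second and an arbitrary (0.0, 1.0) range.
demo_signal = [0, 0, 0, 1, 2, 5, 2, 1, 0, 0]
demo_beats = extract_beats(demo_signal, [1, 5, 9], 4, (0.0, 1.0))
```

In the toy call, the peaks at indices 1 and 9 are dropped because a full one-second window would run past the signal boundaries, so only the beat around index 5 is kept.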
- Install the requirements
pip install -r requirements.txt
- Navigate into the desired dataset folder
cd src/dataset/
- Build the dataset
tfds build
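Once a build has finished, the generated files can be read back from Python. The sketch below assumes the default TFDS data directory (`~/tensorflow_datasets`), a dataset without builder configs, and a hypothetical dataset name and version; `built_dataset_dir` and `load_built_dataset` are illustrative helpers, not part of this repository.

```python
from pathlib import Path
from typing import Optional


def built_dataset_dir(name: str, version: str, data_dir: Optional[str] = None) -> Path:
    """Return the folder where TFDS stores a built dataset.

    Assumes the layout <data_dir>/<name>/<version>, i.e. no builder config.
    """
    root = Path(data_dir) if data_dir else Path.home() / "tensorflow_datasets"
    return root / name / version


def load_built_dataset(
    name: str, version: str, split: str = "train", data_dir: Optional[str] = None
):
    """Load one split of an already-built dataset as a tf.data.Dataset."""
    # Imported lazily so the path helper works without TensorFlow installed.
    import tensorflow_datasets as tfds

    builder = tfds.builder_from_directory(
        str(built_dataset_dir(name, version, data_dir))
    )
    return builder.as_dataset(split=split)


# Hypothetical usage; replace the name and version with those of your build:
# ds = load_built_dataset("my_ecg_dataset", "1.0.0")
# for example in ds.take(1):
#     print(example)
```

Reading from the build directory avoids having to import the builder module first, which may be convenient when the dataset is consumed from a different project.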
- Shaoxing (zheng): https://www.nature.com/articles/s41597-020-0386-x
- PTB-XL: https://www.nature.com/articles/s41597-020-0495-6
- Icentia11k: https://physionet.org/content/icentia11k-continuous-ecg/1.0/
- MedalCare-XL: https://www.nature.com/articles/s41597-023-02416-4
- ECGSYN: https://physionet.org/content/ecgsyn/1.0.0/
- Custom: Add a custom dataset
To incorporate a new dataset into the collection, follow these steps:
- Choose an open-source ECG database, e.g., PhysioNet. Ensure that you exclusively select datasets with an appropriate license.
- Go to the src folder and execute the following command:
tfds new DATASET_NAME
- Open and edit the DATASET_NAME/DATASET_NAME_dataset_builder.py file following the provided instructions. Any generic preprocessing steps should be placed in the utils folder.
- Include tests in the DATASET_NAME/DATASET_NAME_dataset_builder_test.py file.
- Include metadata files: CITATIONS.bib, README.md, and TAGS.txt.
- Confirm successful dataset building using the command:
tfds build
- During your initial complete build, register the checksum with:
tfds build --register_checksums
- Navigate to ./electrocardiogram and append the dataset to the collection in electrocardiogram.py.
- Modify the requirements.txt and this README file accordingly.
- Create a pull request and provide a concise motivation, description, and dataset metadata, including details like count, size, dataset license, and source.
Ensure adherence to these steps to seamlessly integrate the new dataset into the collection.
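When editing the dataset builder, keep in mind that the `_generate_examples` method of a TFDS builder yields `(key, example)` pairs and that every key must be unique within a split. The sketch below shows one way such a generator could be written as a plain function that the builder delegates to. The feature names `ecg` and `record_id` and the key scheme are assumptions for illustration, not the names used in this repository.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def generate_examples(
    records: Iterable[Tuple[str, List[List[float]]]],
) -> Iterator[Tuple[str, Dict[str, object]]]:
    """Yield (key, example) pairs, one per heartbeat.

    `records` is an iterable of (record_id, beats), where `beats` is a list
    of already preprocessed one-second windows.
    """
    for record_id, beats in records:
        for index, beat in enumerate(beats):
            # Combining record id and beat index keeps keys unique,
            # provided the record ids themselves are unique.
            key = f"{record_id}_{index}"
            yield key, {"ecg": beat, "record_id": record_id}


# Toy usage with two hypothetical records.
toy_records = [("rec_a", [[0.0, 1.0], [1.0, 0.0]]), ("rec_b", [[0.5, 0.5]])]
toy_examples = list(generate_examples(toy_records))
```

Keeping this logic in a plain function makes it straightforward to unit test without building the full dataset.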
If you use this repository, please cite the following article:
@article{kapsecker2025disentangled,
  title = {Disentangled representational learning for anomaly detection in single-lead electrocardiogram signals using variational autoencoder},
  author = {Kapsecker, Maximilian and Möller, Matthias C and Jonas, Stephan M},
  journal = {Computers in Biology and Medicine},
  volume = {184},
  pages = {109422},
  year = {2025},
  issn = {0010-4825},
  doi = {https://doi.org/10.1016/j.compbiomed.2024.109422},
  url = {https://www.sciencedirect.com/science/article/pii/S0010482524015075},
}