The Data: PhysioNet 🫀
To hack healthcare, you need the right data. For IDSC 2026, our theme is Mathematics and Hope in Healthcare. Participants must use one or more of the official PhysioNet datasets listed below. You are free to choose whichever dataset best matches your team’s computational strengths, whether that’s signal processing, computer vision, or time-series classification.
(Note: External labeled datasets are strictly prohibited. You must train your models using only the provided official datasets.)
🩺 Why PhysioNet? (Real Data, Real Hope)
Unlike perfectly clean synthetic datasets, biomedical data in the real world is noisy, heterogeneous, and sensitive to bias. By using PhysioNet's open-access medical databases, you are tackling real-world clinical dilemmas. Your challenge is to build computational models that are accurate, transparent, reproducible, and clinically meaningful enough to provide hope through earlier screening and better decision support!
🗂️ Choose Your Arena (Official Datasets)
🧠 1. bigP3BCI
Context: P300-based Brain–Computer Interface (BCI) speller studies.
- Modality: EEG signals (+ optional eye-tracking).
- Format: EDF+ standard format.
- Details: Standardized structure aligned with emerging IEEE P2731 terminology. Includes event markers, demographics, and ALS status.
🎯 Mission: P300 detection, BCI decoding, and cross-session generalization.
❤️ 2. Brugada-HUCA
Context: 12-lead ECG recordings for Brugada syndrome study.
- Modality: 12-lead ECG (100 Hz, 12-second strips).
- Format: WFDB format (.dat/.hea) + CSV metadata.
- Details: 363 subjects total (76 Brugada, 287 normal controls), expert-reviewed.
🎯 Mission: Binary classification of Brugada syndrome versus Normal controls.
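With 76 Brugada cases against 287 controls, the two classes are imbalanced at roughly 1:3.8. One common way to account for this during training is inverse-frequency class weighting. The sketch below is only an illustration built from the counts above: the function name `balanced_class_weights` and the weighting formula (total samples divided by number of classes times class count) are our assumptions, not a competition requirement.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

# Class counts taken from the dataset description above.
labels = ["brugada"] * 76 + ["normal"] * 287
weights = balanced_class_weights(labels)

# A 12-second strip sampled at 100 Hz holds 1200 samples per lead.
samples_per_lead = 100 * 12

print(weights)
print(samples_per_lead)
```

The minority class receives the larger weight, so each Brugada recording counts more in a weighted loss. Whether weighting, resampling, or threshold tuning works best is something to validate on your own splits.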
☢️ 3. Myocardial Perfusion SPECT
Context: Rest myocardial perfusion scintigraphy image database.
- Modality: Cardiac nuclear imaging.
- Format: DICOM images + NIfTI volumes/masks.
- Details: 3D volumes intended to support advanced medical image analysis research.
🎯 Mission: Volume segmentation (e.g., left ventricular wall masks) and image preprocessing/QA.
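For segmentation work, predicted masks are often compared against reference masks with the Dice coefficient. This is a minimal pure-Python sketch, assuming binary masks flattened to sequences of 0/1; the metric choice and the helper name `dice_coefficient` are our assumptions, and the toy masks are made up for illustration only.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two binary masks (flat 0/1 sequences)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total

# Toy masks (hypothetical values, not taken from the dataset).
pred = [0, 1, 1, 1, 0, 0, 1, 0]
target = [0, 1, 1, 0, 0, 0, 1, 1]
score = dice_coefficient(pred, target)
print(score)  # 0.75
```

In practice you would flatten a 3D volume (for example one loaded with nibabel) before calling such a function, or use an equivalent array-based implementation.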
👁️ 4. Hillel Yaffe Glaucoma (HYGD)
Context: Gold-standard annotated fundus dataset for eye disease.
- Modality: Retinal fundus images.
- Format: JPG images + `Labels.csv`.
- Details: Glaucomatous optic neuropathy (GON) labels based on comprehensive ophthalmic examination. Includes image quality scores.
🎯 Mission: Glaucoma (GON) detection and quality-aware medical computer vision modeling.
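Because the dataset ships image quality scores next to the labels, one simple form of quality-aware modeling is to filter (or later, weight) images by quality. The sketch below uses a small in-memory stand-in for `Labels.csv`. The column names `image`, `label`, and `quality`, the score scale, and the sample rows are all hypothetical; check the real file header before adapting it.

```python
import csv
import io

# Hypothetical stand-in for Labels.csv; real column names and score scale may differ.
SAMPLE_CSV = """image,label,quality
img_001.jpg,gon,8.5
img_002.jpg,normal,3.0
img_003.jpg,normal,6.2
img_004.jpg,gon,4.9
"""

def load_labels(handle, min_quality=0.0):
    """Return the rows whose quality score is at least min_quality."""
    rows = []
    for row in csv.DictReader(handle):
        quality = float(row["quality"])
        if quality >= min_quality:
            rows.append({"image": row["image"], "label": row["label"], "quality": quality})
    return rows

all_rows = load_labels(io.StringIO(SAMPLE_CSV))
kept = load_labels(io.StringIO(SAMPLE_CSV), min_quality=5.0)
print([r["image"] for r in kept])
```

To read the actual file, pass `open("Labels.csv", newline="")` instead of the `io.StringIO` object. Dropping low-quality images is only one option; report whichever threshold you choose so the result stays reproducible.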
🛠️ How to Access Data & Recommended Tools
- 🌐 PhysioNet Access: Visit each dataset's official PhysioNet page and follow the download instructions there.
- 💻 Data Tooling: Make sure your team utilizes the right libraries for these specialized formats!
  - EEG/ECG: `WFDB` (for Brugada), `MNE-Python` (for bigP3BCI).
  - Imaging: `pydicom`, `nibabel`, `SimpleITK`, or `MONAI` (for SPECT and HYGD).
  - Core Stack: Python, scikit-learn, PyTorch/TensorFlow.
- 🔄 Reproducibility: You MUST rigorously document your data loading, preprocessing steps, and train/val/test splits in your final code repository.
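A documented split is easiest to reproduce when it is generated by seeded code rather than by hand. The following is a minimal sketch of a stratified, subject-level train/val/test split using only the standard library. The function name `stratified_split`, the 70/15/15 fractions, the seed, and the subject IDs are illustrative assumptions; only the class counts come from the Brugada-HUCA description above.

```python
import random
from collections import defaultdict

def stratified_split(subject_labels, fractions=(0.7, 0.15, 0.15), seed=42):
    """Split subject IDs into train/val/test while preserving class proportions.

    subject_labels: dict mapping subject ID -> class label.
    The third fraction is implied: the test set takes whatever remains.
    """
    rng = random.Random(seed)  # local RNG, independent of global random state
    by_class = defaultdict(list)
    for subject, label in sorted(subject_labels.items()):
        by_class[label].append(subject)

    splits = {"train": [], "val": [], "test": []}
    for label in sorted(by_class):
        subjects = by_class[label]
        rng.shuffle(subjects)
        n_train = int(len(subjects) * fractions[0])
        n_val = int(len(subjects) * fractions[1])
        splits["train"] += subjects[:n_train]
        splits["val"] += subjects[n_train:n_train + n_val]
        splits["test"] += subjects[n_train + n_val:]
    return splits

# Hypothetical subject IDs; class counts mirror the Brugada-HUCA description.
subjects = {f"b{i:03d}": "brugada" for i in range(76)}
subjects.update({f"n{i:03d}": "normal" for i in range(287)})
splits = stratified_split(subjects)
print({name: len(ids) for name, ids in splits.items()})
```

Splitting by subject rather than by recording keeps data from one person out of both training and test sets. Committing the seed and the resulting ID lists to your repository lets reviewers regenerate exactly the same partition.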
📌 Mandatory Citation Policy
To ensure open-source integrity, all submissions MUST contain two types of citations:
- 1. The Dataset Citation: You must cite the specific dataset(s) you chose. The exact citation string can be found at the bottom of each dataset's PhysioNet page.
- 2. The Platform Citation: You must cite the PhysioNet platform itself, as requested on their website.