Causal Inference for Hypertension Prediction with Wearable Devices - Aurora Project ECG/PPG Dataset
Abstract
"Large public dataset from Microsoft Research Aurora Project with ECG and PPG signals from wrist-worn wearables, balanced for gender, age, and hypertension status, enabling causal inference research for non-invasive blood pressure prediction with 205 extracted features."
Description
Overview
The Aurora Project Dataset for Hypertension Prediction, published in JMIR Cardio (January 2025), provides high-quality wearable sensor data curated for causal inference and machine learning research on predicting hypertension from non-invasive signals.
Dataset Characteristics
- Large cohort of participants balanced across key demographic and clinical factors: gender (approximately equal male/female representation), age groups (young, middle-aged, elderly), and hypertension status (normotensive vs. hypertensive).
- ECG and PPG signals simultaneously acquired using wrist-worn wearable devices during controlled conditions to ensure signal quality and synchronization.
- Reference blood pressure measurements obtained using validated oscillometric or auscultatory methods, providing ground truth for hypertension classification (systolic BP ≥ 140 mmHg or diastolic BP ≥ 90 mmHg).
- Multi-session data collection enabling longitudinal analysis and within-subject variability assessment.
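As an illustration of the labeling rule quoted above (systolic BP ≥ 140 mmHg or diastolic BP ≥ 90 mmHg), the following minimal sketch assigns a hypertension label to reference readings. The function name and the sample readings are hypothetical and are not taken from the dataset.

```python
def is_hypertensive(systolic_mmhg: float, diastolic_mmhg: float) -> bool:
    """Return True when either reading meets the threshold stated in the text."""
    return systolic_mmhg >= 140 or diastolic_mmhg >= 90


# Hypothetical reference readings as (systolic, diastolic) pairs in mmHg.
readings = [(118, 76), (142, 85), (135, 92), (139, 89)]
labels = [is_hypertensive(s, d) for s, d in readings]
print(labels)  # [False, True, True, False]
```

Because the rule is a logical OR, a single elevated reading (systolic or diastolic) is enough to label a measurement as hypertensive.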
Feature Extraction and Engineering
- 205 features extracted from ECG and PPG waveforms using signal processing and domain knowledge, including:
  - Time-domain features: R-R intervals, pulse arrival time (PAT), pulse transit time (PTT), systolic/diastolic peaks, pulse width, inter-beat intervals.
  - Frequency-domain features: Power spectral density components, heart rate variability frequency bands (LF, HF, LF/HF ratio).
  - Morphological features: Waveform slopes, area under curves, symmetry metrics, and fiducial point characteristics.
  - Statistical metrics (6 categories): Mean, standard deviation, median, skewness, kurtosis, and percentiles computed for each of the 205 features to capture distribution properties.
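To make the time-domain and statistical items above concrete, here is a small sketch that derives R-R intervals and pulse arrival times from already-detected peak timestamps and then summarizes a per-beat series with the six statistics listed. It is not the authors' pipeline: the function names, the choice of the PPG peak as the arrival fiducial point, the 25th/75th percentiles, and the sample timestamps are all assumptions made for illustration.

```python
import numpy as np


def pulse_arrival_times(r_peak_s, ppg_peak_s):
    """Delay from each ECG R-peak to the first PPG peak before the next R-peak.

    Assumes peaks are already detected and given as timestamps in seconds.
    """
    r = np.asarray(r_peak_s, dtype=float)
    p = np.asarray(ppg_peak_s, dtype=float)
    pats = []
    for i, t_r in enumerate(r):
        t_next = r[i + 1] if i + 1 < len(r) else np.inf
        candidates = p[(p > t_r) & (p < t_next)]
        if candidates.size:  # skip beats with no matching PPG peak
            pats.append(candidates[0] - t_r)
    return np.array(pats)


def summarize(values):
    """Mean, SD, median, skewness, excess kurtosis, and two percentiles."""
    x = np.asarray(values, dtype=float)
    mean, std = x.mean(), x.std()  # population SD (ddof=0)
    z = (x - mean) / std if std > 0 else np.zeros_like(x)
    return {
        "mean": mean,
        "std": std,
        "median": np.median(x),
        "skewness": np.mean(z ** 3),
        "kurtosis": np.mean(z ** 4) - 3.0,  # excess kurtosis
        "p25": np.percentile(x, 25),        # percentile choice is an assumption
        "p75": np.percentile(x, 75),
    }


# Hypothetical peak timestamps in seconds, for illustration only.
r_peaks = [0.00, 0.80, 1.62, 2.40]
ppg_peaks = [0.25, 1.06, 1.86, 2.67]

rr = np.diff(r_peaks)                           # R-R intervals
pat = pulse_arrival_times(r_peaks, ppg_peaks)   # one PAT value per beat
print(summarize(rr))
print(summarize(pat))
```

Other fiducial points, such as the PPG foot or the maximum slope, are also commonly used for PAT; swapping one in only changes the timestamps passed as `ppg_peak_s`.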
Causal Inference Methodology
- Rather than relying on purely correlational machine learning, the research applies causal discovery to identify features that have genuine causal relationships with hypertension, not just statistical associations.
- Six different causal graph structures explored to model relationships among extracted features and hypertension outcomes.
- Selected causal features used to train predictive models, improving interpretability and potentially enhancing generalization to new populations.
- Evaluation using multiple metrics (accuracy, sensitivity, specificity, AUC) to assess both prediction performance and clinical utility.
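The four evaluation metrics named above can be computed from true labels and predicted scores as in the sketch below. This is a generic NumPy illustration rather than the study's evaluation code; the function name, the 0.5 decision threshold, and the sample labels and scores are assumptions.

```python
import numpy as np


def classification_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity, specificity, and AUC for a binary classifier.

    Assumes both classes are present in y_true (1 = hypertensive).
    """
    y_true = np.asarray(y_true, dtype=int)
    y_score = np.asarray(y_score, dtype=float)
    y_pred = (y_score >= threshold).astype(int)

    tp = np.sum((y_pred == 1) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))

    # AUC as the probability that a random positive scores higher than a
    # random negative, with ties counted as one half.
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    auc = (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (pos.size * neg.size)

    return {
        "accuracy": (tp + tn) / y_true.size,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "auc": auc,
    }


# Hypothetical labels and model scores, for illustration only.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
metrics = classification_metrics(y_true, y_score)
print(metrics)
```

Sensitivity and specificity depend on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why screening studies usually report both.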
Use Cases
- Cuffless blood pressure estimation: Developing algorithms to predict hypertension status from wearable ECG/PPG without traditional cuff-based measurements.
- Causal modeling in health AI: Advancing interpretable and trustworthy AI by identifying features with causal (not just correlational) links to cardiovascular outcomes.
- Wearable device validation: Benchmarking the accuracy of consumer and medical-grade wearables for hypertension screening.
- Personalized risk assessment: Creating individualized hypertension risk models based on longitudinal wearable data.
- Clinical decision support: Integrating continuous wearable monitoring into primary care for early detection and management of hypertension.
View Data Structure
To explore column names, data types, and sample rows, visit the official dataset page on JMIR Cardio / Microsoft Research.
Cite This Dataset
Gong, Kaiwen, & others (2025). Causal Inference for Hypertension Prediction with Wearable Devices - Aurora Project ECG/PPG Dataset. JMIR Cardio. [Dataset]. JMIR Cardio / Microsoft Research. https://doi.org/10.2196/60238
Original source: JMIR Cardio / Microsoft Research (2025). Visit official page for more details.
Indexed by IoTDataset.com on Feb 01, 2026