Federated IoT Intrusion Detection Dataset - Privacy-Preserving Security
Abstract
Dataset for evaluating federated learning approaches to IoT intrusion detection, published in Scientific Reports (a Nature Portfolio journal) in January 2026. It features distributed network traffic from multiple IoT deployments, with privacy constraints and metrics for evaluating decentralized learning.
Description
Dataset Overview
This dataset, published in Scientific Reports in January 2026, addresses IoT security in privacy-sensitive environments. It supports research into federated learning, where intrusion detection models are trained across distributed IoT networks without centralizing raw data.
Distributed Network Architecture
The dataset simulates multiple independent IoT deployments:
- Number of Nodes: 10+ independent IoT network sites
- Device Diversity: Each site contains different device types and manufacturers
- Network Topologies: Varied architectures (star, mesh, hierarchical)
- Geographic Distribution: Simulated sites in different regions with varying threat landscapes
Privacy-Preserving Data Structure
Local Network Traffic
Each site provides:
- Flow-Based Features: Aggregated traffic statistics without raw packets
- Attack Labels: Local intrusion detection annotations
- Site Metadata: Anonymous identifiers and configuration parameters
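To illustrate what "flow-based features without raw packets" can mean in practice, the following is a minimal sketch that collapses per-packet records into per-flow statistics. The field names (`src`, `dst`, `sport`, `dport`, `proto`, `size`, `ts`) and the sample addresses are hypothetical and are not taken from the dataset's actual schema.

```python
from collections import defaultdict


def aggregate_flows(packets):
    """Group packet records by 5-tuple and keep only aggregate statistics.

    Assumption: each packet is a dict with hypothetical keys
    src, dst, sport, dport, proto, size (bytes) and ts (seconds).
    """
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0, "start": None, "end": None})
    for pkt in packets:
        key = (pkt["src"], pkt["dst"], pkt["sport"], pkt["dport"], pkt["proto"])
        flow = flows[key]
        flow["packets"] += 1
        flow["bytes"] += pkt["size"]
        # Track first and last timestamps so the flow duration can be derived.
        flow["start"] = pkt["ts"] if flow["start"] is None else min(flow["start"], pkt["ts"])
        flow["end"] = pkt["ts"] if flow["end"] is None else max(flow["end"], pkt["ts"])
    # Emit only aggregates; payloads and individual packets are discarded.
    return {
        key: {
            "packets": flow["packets"],
            "bytes": flow["bytes"],
            "duration": flow["end"] - flow["start"],
        }
        for key, flow in flows.items()
    }


# Illustrative input only (made-up addresses and sizes).
sample_packets = [
    {"src": "10.0.0.1", "dst": "10.0.0.2", "sport": 1234, "dport": 80, "proto": "tcp", "size": 100, "ts": 1.0},
    {"src": "10.0.0.1", "dst": "10.0.0.2", "sport": 1234, "dport": 80, "proto": "tcp", "size": 200, "ts": 1.5},
    {"src": "10.0.0.3", "dst": "10.0.0.2", "sport": 5555, "dport": 53, "proto": "udp", "size": 60, "ts": 2.0},
]
sample_flows = aggregate_flows(sample_packets)
```

Only the aggregates leave the function, which is the property that makes flow records less sensitive than packet captures. The features the dataset actually provides may differ from these three.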
Federated Learning Metrics
Performance measures for distributed training:
- Communication Rounds: Number of model update exchanges
- Model Convergence: Accuracy improvement across federation rounds
- Data Heterogeneity: Statistical divergence between sites
- Privacy Budget: Differential privacy parameters (epsilon, delta)
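One common way to quantify the statistical divergence between sites is to compare their label distributions. The sketch below uses the Jensen-Shannon divergence for this; it is an assumed choice for illustration, since the listing does not say which divergence measure the dataset reports. The class names and site contents are made up.

```python
import math
from collections import Counter


def label_distribution(labels, classes):
    """Return the relative frequency of each class in a non-empty label list."""
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(c, 0) / total for c in classes]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; 0 for identical, 1 for disjoint distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing; y_i > 0 whenever x_i > 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical sites with different attack mixes.
classes = ["benign", "ddos", "scan"]
site_a = ["benign"] * 80 + ["ddos"] * 20
site_b = ["benign"] * 50 + ["scan"] * 50

p = label_distribution(site_a, classes)
q = label_distribution(site_b, classes)
divergence = js_divergence(p, q)
```

A value near 0 suggests the two sites see a similar traffic mix, while a value near 1 indicates almost no overlap in classes, the setting in which federated averaging tends to struggle most.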
Attack Coverage
Each site contains varying proportions of attack types:
- DDoS and DoS attacks
- Port scanning and reconnaissance
- Man-in-the-middle (MITM) attacks
- Botnet command-and-control traffic
- Data exfiltration attempts
The heterogeneous attack distribution tests federated models' ability to generalize across diverse threat environments.
Research Contributions
Dataset-Centric Evaluation
The publication emphasizes evaluating federated learning algorithms based on data characteristics rather than just model architectures, providing insights into when federated approaches outperform centralized training.
Benchmark Results
Baseline performance for multiple federated learning algorithms:
- FedAvg (Federated Averaging)
- FedProx (Federated Proximal)
- SCAFFOLD (Stochastic Controlled Averaging)
- FedOpt (Federated Optimization)
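As a point of reference for the first baseline, the server-side aggregation step of FedAvg can be sketched as a sample-weighted average of client parameters. This is a minimal illustration with flat parameter lists and made-up numbers, not the benchmark code from the publication; local training and the other listed algorithms are not shown.

```python
def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local sample count.

    Assumption: every client sends a flat list of floats of equal length.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(weights[i] * size for weights, size in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Two hypothetical sites: the second holds three times as much data.
updates = [[1.0, 2.0], [3.0, 4.0]]
sizes = [100, 300]
global_weights = fedavg(updates, sizes)  # [2.5, 3.5]
```

Weighting by sample count means larger sites pull the global model toward their local optimum, which is one reason heterogeneous attack distributions across sites matter for convergence.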
Practical Applications
- Multi-Organization Security: Collaborative threat detection across competing organizations without sharing sensitive data
- GDPR Compliance: Privacy-preserving security analytics meeting regulatory requirements
- Edge-Cloud Hybrid: Distributed learning between edge devices and cloud infrastructure
- Continuous Adaptation: Models improving over time through federated updates without data movement
Academic Significance
Published in the peer-reviewed journal Scientific Reports, this dataset contributes to both IoT security and privacy-preserving machine learning. It provides reproducible benchmarks for evaluating federated learning in IoT security scenarios.
Cite This Dataset
Al-Essa, M., Andresini, G., Appice, A., & Malerba, D. (2025). Federated IoT Intrusion Detection Dataset - Privacy-Preserving Security. Scientific Reports. [Dataset]. Nature Publishing Group. https://doi.org/10.1038/s41598-025-32567-w
Original source: Nature Publishing Group (2025). Visit official page for more details.
Indexed by IoTDataset.com on Jan 24, 2026