Gotham Dataset 2025: Large-Scale Federated IoT IDS Benchmark

Cybersecurity
Feb 05, 2026

Abstract

"The Gotham Dataset is a large-scale, reproducible benchmark for evaluating decentralized Intrusion Detection Systems (IDS) and Federated Learning in virtualized smart cities. It captures interface-level network traffic from 78 heterogeneous IoT devices, including complex attack vectors like Mirai botnets, Merlin C2 traffic, and CoAP amplification, preserving the non-IID nature of edge data for realistic AI security training."

Description

Overview

The Gotham Dataset 2025 is designed to move beyond centralized security models by providing granular, device-level traffic captures from a virtualized urban IoT infrastructure. It is specifically structured to support Federated Learning (FL) research where data remains local to each node.

What’s inside

  • Data modalities: Structured CSV files containing extracted network features from packet captures.
  • Scale: 78 unique IoT devices including sensors, actuators, and controllers.
  • Metadata: Device IDs, interface identifiers, and detailed attack timestamps.

Collection / Setup

  • Generated using the open-source Gotham testbed for high reproducibility.
  • Captures traffic at each node's own network interface, so every device keeps its own traffic distribution (non-IID) instead of being merged into one pooled capture.
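
One way to see the non-IID property in practice is to compare each device's label mix against the pooled label mix. The sketch below is illustrative only: the device names and label lists are made-up toy values, not taken from the dataset, and total variation distance is just one reasonable choice of skew measure.

```python
from collections import Counter


def label_distribution(labels):
    """Return {label: fraction} for a sequence of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two label distributions.

    0.0 means identical distributions, 1.0 means no overlap at all.
    """
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Hypothetical toy labels per device; real labels would be read from the CSV files.
device_labels = {
    "device_a": ["benign"] * 9 + ["mirai"] * 1,
    "device_b": ["benign"] * 5 + ["coap_amplification"] * 5,
}

# Pooled (centralized) view of all labels.
all_labels = [lab for labels in device_labels.values() for lab in labels]
global_dist = label_distribution(all_labels)

# Skew of each device relative to the pooled distribution.
skew = {
    dev: total_variation(label_distribution(labels), global_dist)
    for dev, labels in device_labels.items()
}
```

A device whose skew is close to 0 looks like the pooled data; larger values indicate a device whose local data would mislead a model trained only on it.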

Labels / Targets

  • Attack types: Mirai Botnet, Merlin C2 (HTTP/1-3/QUIC), Masscan, Nmap, CoAP reflection, and various UDP/TCP Floods.

Recommended tasks

  • Federated Learning benchmarking
  • Anonymized intrusion detection
  • Distributed anomaly detection
  • Privacy-preserving AI validation
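
For the Federated Learning benchmarking task, a common baseline is sample-weighted parameter averaging (often called FedAvg). The listing does not prescribe an aggregation rule, so the following is only a minimal sketch of that general technique; the function name `fedavg` and the flat-list parameter representation are assumptions made for illustration.

```python
def fedavg(client_weights, client_sizes):
    """Sample-weighted average of client parameter vectors.

    client_weights: list of equal-length lists of floats, one per client (device).
    client_sizes:   number of local training samples held by each client.
    """
    if len(client_weights) != len(client_sizes):
        raise ValueError("need exactly one size per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all clients must share the same parameter shape")
    # Each coordinate is averaged with weights proportional to local data size.
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical devices: the second holds three times as much data.
global_model = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Weighting by sample count means devices with more traffic pull the global model harder, which is exactly where the dataset's per-device skew becomes relevant to the benchmark.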

Limitations

  • Traffic is generated in an emulated testbed rather than collected from physical urban sensors.
  • Device-level captures must be concatenated before they can be used for centralized learning tasks.
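
The concatenation step for centralized experiments could look roughly like this. It is a sketch under stated assumptions: that each device's data sits in its own CSV file, that the file name identifies the device, and that a `device_id` column is an acceptable way to keep provenance. None of these are confirmed by the listing, so check the actual file layout first.

```python
import csv
from pathlib import Path


def concat_device_csvs(csv_paths, out_path, id_column="device_id"):
    """Merge per-device CSVs into one file, tagging each row with its source.

    Returns the number of data rows written.
    """
    rows = []
    fieldnames = [id_column]
    for path in map(Path, csv_paths):
        with path.open(newline="") as f:
            reader = csv.DictReader(f)
            # Keep the union of columns in first-seen order.
            for name in reader.fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)
            for row in reader:
                # Assumption: the file stem (name without .csv) identifies the device.
                row[id_column] = path.stem
                rows.append(row)
    with Path(out_path).open("w", newline="") as f:
        writer = csv.DictWriter(
            f, fieldnames=fieldnames, restval="", extrasaction="ignore"
        )
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

This version holds every row in memory, which is fine for a quick check on a few files; for the full dataset a streaming variant that writes rows as they are read would be the safer choice.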

Cite This Dataset

Belarbi, O., Spyridopoulos, T., Anthi, E., Rana, O., Carnelli, P., & Khan, A. (2025). Gotham Dataset 2025: Large-Scale Federated IoT IDS Benchmark. [Dataset]. Zenodo. https://doi.org/10.5281/zenodo.14502760

Original source: Zenodo (2025). Visit official page for more details.

Indexed by IoTDataset.com on Feb 05, 2026
