Data in Brief (Elsevier) + Mendeley Data

MQTTEEB-D: A Real-World IoT Cybersecurity Dataset for AI-Powered Threat Detection in MQTT Networks

Abstract

MQTTEEB-D is a real-world MQTT-based IoT cybersecurity dataset collected from the MQTTEEB testbed at the International University of Rabat. It contains benign traffic and five attack types (DoS, SlowITe, Malformed Data Injection, Brute Force, Publish Flooding) and is provided in multiple processed forms (raw, cleaned, normalized, standardized, SMOTE) for AI-driven intrusion detection research.

Description

Overview

MQTTEEB-D is a real-world IoT cybersecurity dataset designed for intrusion detection in MQTT-based networks. It was captured from MQTTEEB, a live IoT deployment at the International University of Rabat (UIR), Morocco, and is intended to support AI-powered threat detection in such environments.

Testbed and Data Collection

  • IoT testbed composed of MySignals IoT health sensors, a Raspberry Pi 4 gateway, and an MQTT broker server, representing realistic health-related IoT communication.
  • Network traffic captured in real time using PyShark (Python wrapper for tshark) while executing benign scenarios and multiple cyberattacks.
  • Captured traffic is organized into multiple CSV files, and several processed versions are provided: raw, cleaned, normalized, standardized, and SMOTE-balanced datasets.

Attack Types

  • Denial of Service (DoS): high-rate flooding of MQTT messages to overwhelm the broker or clients.
  • SlowITe (Slow DoS against IoT Environments): low-rate, slow-paced attack targeting MQTT communication.
  • Malformed Data Injection: insertion of incorrectly structured or corrupted MQTT payloads.
  • Brute Force: repeated unauthorized connection or login attempts against MQTT services.
  • MQTT Publish Flooding: excessive publication of MQTT messages to specific topics to cause congestion.
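The contrast between high-rate attacks (DoS, Publish Flooding) and low-rate ones (SlowITe) can be illustrated with two simple timing features. The sketch below is only an illustration: the function names and the timestamp values are hypothetical, and it does not reproduce the feature extraction used to build the dataset.

```python
def inter_arrival_times(timestamps):
    """Return the gaps (in seconds) between consecutive packet timestamps."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def messages_per_second(timestamps):
    """Return the average message rate over the observed time span."""
    if len(timestamps) < 2:
        return 0.0
    span = max(timestamps) - min(timestamps)
    if span == 0:
        return float("inf")
    return (len(timestamps) - 1) / span


# Hypothetical timestamps in seconds; not taken from the dataset.
flood_like = [0.00, 0.01, 0.02, 0.03]  # many messages in a short window
slow_like = [0.0, 30.0, 60.0]          # sparse, long-lived activity

flood_rate = messages_per_second(flood_like)
slow_rate = messages_per_second(slow_like)
slow_gaps = inter_arrival_times(slow_like)
```

A flooding pattern shows a high rate and very small gaps, while a slow attack shows the opposite, which is one reason rate thresholds alone tend to miss low-rate attacks.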

Dataset Contents

  • Multiple CSV files representing different preprocessing stages: raw, cleaned, normalized, standardized, and SMOTE-augmented data.
  • Each record includes MQTT-related traffic features extracted from packet captures (e.g., packet length, inter-arrival times, flags, topics, QoS levels) along with labels indicating benign or specific attack type.
  • Detailed metadata is provided in the Mendeley repository, describing file structure, feature descriptions, and preprocessing steps.
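The difference between the normalized and standardized versions can be sketched on a single numeric feature. This page does not state the exact scaling the authors applied, so the example assumes the common definitions (min-max scaling to [0, 1] and z-score standardization); the function names and the sample packet lengths are hypothetical.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift and scale values to zero mean and unit (population) variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant feature: avoid dividing by zero
    return [(v - mu) / sigma for v in values]


# Hypothetical packet lengths in bytes; not taken from the dataset.
packet_lengths = [60, 74, 1514, 90]

normalized = min_max_normalize(packet_lengths)
standardized = standardize(packet_lengths)
```

When reproducing either transformation for model training, the scaling parameters (minimum and maximum, or mean and standard deviation) are normally computed on the training split only and then reused on the test split.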

Use Cases

  • Training and evaluating machine learning and deep learning intrusion detection systems for MQTT-based IoT networks.
  • Benchmarking models across different preprocessing strategies (raw versus normalized versus SMOTE-balanced).
  • Studying the impact of diverse real-time attack types on MQTT traffic patterns.
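For readers comparing the SMOTE-balanced files with the other versions, the core idea of SMOTE is to create a synthetic minority-class record by interpolating between an existing minority record and one of its nearest minority neighbors. The following is a minimal sketch of that interpolation step only, assuming numeric feature vectors; `smote_sample`, its parameters, and the sample points are hypothetical, and it is not the procedure used to generate the published files.

```python
import math
import random


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point between a minority sample and a near neighbor.

    Requires at least two minority samples.
    """
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    # Rank the other minority samples by Euclidean distance to the base sample.
    others = [p for p in minority if p is not base]
    neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
    neighbor = rng.choice(neighbors)
    gap = rng.random()  # interpolation factor in [0, 1)
    return tuple(b + gap * (n - b) for b, n in zip(base, neighbor))


# Hypothetical two-feature minority-class records; not taken from the dataset.
minority_points = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
synthetic = smote_sample(minority_points, k=2, rng=random.Random(42))
```

Because synthetic records are derived from real ones, a benchmark that evaluates on SMOTE-augmented data can overstate performance; a common practice is to oversample the training split only and to test on untouched records.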

Access and License

The dataset is hosted on Mendeley Data and is intended for public use in cybersecurity research. Users should consult the Mendeley page for the exact license terms and citation instructions.

Cite This Dataset

Aqachtoul, A., Najib, M., & others (2025). MQTTEEB-D: A Real-World IoT Cybersecurity Dataset for AI-Powered Threat Detection in MQTT Networks. Data in Brief. [Dataset]. Data in Brief (Elsevier) + Mendeley Data. https://doi.org/10.1016/j.dib.2025.111897

Original source: Data in Brief (Elsevier) + Mendeley Data (2025). Visit official page for more details.

Indexed by IoTDataset.com on Feb 03, 2026
