IoT-23 — Labeled IoT Malware & Benign Traffic [325M Flows, 500+ Hours]

Abstract

Real IoT malware traffic dataset with 325M labeled network flows from 20 malware and 3 benign device captures over 500+ hours. PCAP and Zeek conn.log formats. Used for IoT botnet detection, malware traffic classification, and ML security research.

Description

Overview

IoT-23 (Aposemat IoT-23) is a large-scale, publicly available dataset of labeled network traffic from real IoT devices captured at the Stratosphere Laboratory, AIC Group, FEL, Czech Technical University in Prague. It is the first dataset to combine actual malware execution on physical IoT devices with benign device traffic, making it uniquely representative of real-world IoT threat scenarios.

The dataset contains 23 scenarios: 20 network captures from IoT devices infected with real malware samples (including Mirai, Torii, Okiru, Muhstik, and IRCBot variants) and 3 captures of completely benign IoT devices (Philips Hue smart light bridge, Amazon Echo, and a Somfy smart door lock). In total, it contains more than 760 million packets and 325 million labeled flows spanning over 500 hours of network traffic captured between 2018 and 2019.

Flows are labeled in Zeek (formerly Bro) conn.log files, with labels including Benign, C&C, DDoS, PartOfAHorizontalPortScan, FileDownload, Attack, and Okiru, among others. The research and dataset collection were funded by Avast Software.

Column Schema

Column                    Description
ts                        Timestamp of the connection record.
uid                       Unique connection identifier.
id.orig_h / id.resp_h     Originator and responder IP addresses.
id.orig_p / id.resp_p     Originator and responder port numbers.
proto                     Transport protocol (tcp, udp, icmp).
service                   Application-layer service detected.
duration                  Connection duration in seconds.
orig_bytes / resp_bytes   Bytes transferred by originator and responder.
label                     Traffic class: Benign, C&C, DDoS, PartOfAHorizontalPortScan, FileDownload, etc.
detailed-label            Granular attack sub-label.
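
The schema above can be read with a small header-driven parser. The following is a minimal sketch, not the dataset authors' tooling: it assumes the labeled files follow the usual Zeek text layout with a `#fields` header line, that no field value contains whitespace, and that `-` marks an unset value. The function names `read_conn_log` and `to_float` are hypothetical, and the sample rows are made up for illustration.

```python
import io


def to_float(value):
    """Convert a Zeek numeric field to float; '-' is treated as unset."""
    return None if value == "-" else float(value)


def read_conn_log(handle):
    """Yield one dict per flow from a Zeek conn.log-style text stream.

    Column names come from the '#fields' header line. Splitting on any
    whitespace tolerates tab- and space-separated columns alike, under the
    assumption that field values themselves contain no whitespace.
    """
    fields = []
    for raw in handle:
        line = raw.strip()
        if line.startswith("#fields"):
            fields = line.split()[1:]  # drop the '#fields' token itself
            continue
        if not line or line.startswith("#"):
            continue  # skip other metadata lines and blank lines
        yield dict(zip(fields, line.split()))


# Made-up sample (documentation-range IPs), not real IoT-23 records.
SAMPLE = (
    "#path\tconn\n"
    "#fields\tts\tuid\tid.orig_h\tid.orig_p\tid.resp_h\tid.resp_p\tproto"
    "\tservice\tduration\torig_bytes\tresp_bytes\tlabel\tdetailed-label\n"
    "1545403816.96\tCuid1\t192.0.2.10\t41040\t198.51.100.7\t80\ttcp"
    "\thttp\t3.14\t120\t560\tBenign\t-\n"
    "1545403817.10\tCuid2\t192.0.2.10\t50000\t198.51.100.8\t23\ttcp"
    "\t-\t-\t-\t-   PartOfAHorizontalPortScan   -\n"
)

rows = list(read_conn_log(io.StringIO(SAMPLE)))
for row in rows:
    print(row["id.orig_h"], row["proto"], row["label"], to_float(row["duration"]))
```

For real files, pass an open file object instead of the `io.StringIO` wrapper. Because the generator yields one row at a time, it avoids loading a multi-gigabyte log into memory.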

Key Statistics

  • Total Flows: 325+ million labeled flows
  • Total Packets: 760+ million
  • Traffic Duration: 500+ hours
  • Scenarios: 23 (20 malware + 3 benign)
  • Malware Families: Mirai, Torii, Okiru, Muhstik, IRCBot, and others
  • File Format: PCAP and Zeek conn.log (labeled)
  • Full download: ~20 GB; Light version (flows only): ~8.7 GB
  • Capture Period: 2018–2019; Published: January 2020

Use Cases

  • IoT malware traffic detection and botnet identification
  • Behavioral analysis of compromised IoT devices vs. benign devices
  • C&C communication detection and lateral movement analysis
  • ML-based multi-class network traffic classification for IoT security
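
As a starting point for the detection and classification use cases above, the label column can be tallied and collapsed into a binary benign/malicious target. This is a sketch under assumptions: flows are dicts with a `label` key, any label other than "Benign" (compared case-insensitively) counts as malicious, and the helper names `is_malicious` and `summarize` are hypothetical. The flows shown are made up, not drawn from the dataset.

```python
from collections import Counter


def is_malicious(label):
    """Return 0 for a benign label (case-insensitive), 1 for anything else."""
    return 0 if label.strip().lower() == "benign" else 1


def summarize(flows):
    """Count flows per label and per binary class (0 = benign, 1 = malicious)."""
    per_label = Counter(flow["label"] for flow in flows)
    per_class = Counter()
    for label, count in per_label.items():
        per_class[is_malicious(label)] += count
    return per_label, per_class


# Made-up flows for illustration only.
flows = [
    {"label": "Benign"},
    {"label": "C&C"},
    {"label": "DDoS"},
    {"label": "PartOfAHorizontalPortScan"},
    {"label": "PartOfAHorizontalPortScan"},
]

per_label, per_class = summarize(flows)
print(per_label.most_common())
print(dict(per_class))
```

Checking the class balance this way before training is worthwhile, since captures dominated by a single attack label can make a classifier look better than it is.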

Source & Attribution

Created by Sebastian Garcia, Agustin Parmisano, and Maria Jose Erquiaga at the Stratosphere Laboratory, Czech Technical University in Prague. Funded by Avast Software. Available for download from the Stratosphere IPS website and mirrored on Zenodo (record 4743746).

Cite This Dataset

Garcia, Sebastian, Parmisano, Agustin, & Erquiaga, Maria Jose (2020). IoT-23 — Labeled IoT Malware & Benign Traffic [325M Flows, 500+ Hours]. [Dataset]. Zenodo. https://doi.org/10.5281/ZENODO.4743745

Source: Zenodo (2020) · DOI: 10.5281/ZENODO.4743745

Indexed by IoTDataset.com on Apr 13, 2026
