Kaggle

Bot-IoT Dataset - Large-Scale IoT Botnet Traffic with Full Packet Capture

Cybersecurity
Jan 23, 2026

Abstract

A large-scale IoT botnet dataset that combines legitimate IoT network traffic with realistic botnet attack scenarios. It includes full packet captures (PCAP) and extracted flow features for four attack types: DDoS, DoS, reconnaissance, and data theft.

Description

Dataset Overview

The Bot-IoT dataset is a large-scale IoT security resource that combines realistic benign IoT network traffic with diverse botnet attack scenarios. It provides both raw packet captures and processed network flow features, so it supports packet-level analysis as well as flow-based machine learning.

Dataset Composition

The dataset integrates two distinct traffic sources:

1. Legitimate IoT Traffic

Normal operational traffic from IoT devices and services, including:

  • Smart home device communications
  • IoT sensor data transmissions
  • Device-to-cloud service interactions
  • Inter-device communications in smart environments
  • Firmware updates and maintenance traffic

2. Botnet Attack Traffic

Realistic attack scenarios simulating compromised IoT devices participating in malicious activities across four major categories.

Attack Categories

DDoS Attacks (Distributed Denial of Service)

  • UDP flooding from multiple compromised devices
  • TCP SYN flood attacks
  • HTTP flood targeting web services
  • DNS amplification attacks

Reconnaissance and Information Gathering

  • Network scanning to identify vulnerable devices
  • Port scanning for open services
  • OS fingerprinting attempts
  • Service enumeration

Data Theft and Exfiltration

  • Keylogging traffic patterns
  • Data exfiltration through covert channels
  • Credential harvesting communications

DoS Attacks (Single-Source)

  • Resource exhaustion attacks
  • Protocol-specific DoS targeting IoT services

Data Formats

PCAP Files (Full Packet Capture)

Raw network packets captured at wire level. These files enable deep packet inspection, protocol analysis, payload examination, and the development of signature-based detection systems.
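As a rough illustration of what working with raw captures involves, the sketch below walks the records of a classic libpcap file using only the Python standard library. It assumes the files use the classic microsecond-resolution pcap layout (24-byte global header, 16-byte per-record headers); this listing does not state the exact capture format, so pcapng or nanosecond captures would need a different reader. The function name `read_pcap_records` and the synthetic sample bytes are hypothetical and exist only for this example.

```python
import struct

def read_pcap_records(data: bytes):
    """Yield (timestamp_seconds, captured_bytes) for each record in classic pcap bytes."""
    if len(data) < 24:
        raise ValueError("too short for a pcap global header")
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":      # 0xA1B2C3D4 written little-endian
        endian = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":    # 0xA1B2C3D4 written big-endian
        endian = ">"
    else:
        raise ValueError("not a classic microsecond pcap file")
    offset = 24                            # skip the global header
    while offset + 16 <= len(data):
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", data, offset
        )
        offset += 16
        packet = data[offset:offset + incl_len]
        offset += incl_len
        yield ts_sec + ts_usec / 1_000_000, packet

# Synthetic two-packet capture (not taken from the dataset) to exercise the reader.
sample = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
sample += struct.pack("<IIII", 10, 500000, 3, 3) + b"abc"
sample += struct.pack("<IIII", 11, 0, 2, 2) + b"hi"

records = list(read_pcap_records(sample))
for ts, pkt in records:
    print(f"{ts:.6f}s  {len(pkt)} bytes")
```

For real analysis, a dedicated packet library is usually the better choice; the point here is only that each record carries a timestamp and the captured bytes, which is the raw material for payload inspection and signature work.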

Flow Features (Extracted Statistics)

Precomputed network flow statistics that can be used directly as machine learning features, without packet-level processing:

  • Flow durations and packet counts
  • Byte statistics and protocol distributions
  • Flag counts and connection states
  • Inter-arrival times and burst patterns
  • Bidirectional flow characteristics
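To make the feature families above concrete, here is a minimal sketch of how such statistics could be derived from a list of packets. It is not the extraction pipeline used to build this dataset, which the listing does not describe. The packet tuple layout, the function names `flow_key` and `summarize_flows`, and the output field names are all assumptions made for illustration.

```python
from collections import defaultdict

def flow_key(src, sport, dst, dport, proto):
    """Direction-independent key so both directions map to one bidirectional flow."""
    a, b = (src, sport), (dst, dport)
    return (proto,) + (a + b if a <= b else b + a)

def summarize_flows(packets):
    """packets: iterable of (ts, src, sport, dst, dport, proto, length) tuples."""
    grouped = defaultdict(list)
    for pkt in packets:
        _ts, src, sport, dst, dport, proto, _length = pkt
        grouped[flow_key(src, sport, dst, dport, proto)].append(pkt)

    flows = {}
    for key, pkts in grouped.items():
        pkts.sort(key=lambda p: p[0])
        times = [p[0] for p in pkts]
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        initiator = (pkts[0][1], pkts[0][2])   # endpoint that sent the first packet
        fwd = sum(1 for p in pkts if (p[1], p[2]) == initiator)
        flows[key] = {
            "duration": times[-1] - times[0],
            "packets": len(pkts),
            "bytes": sum(p[6] for p in pkts),
            "mean_iat": sum(gaps) / len(gaps) if gaps else 0.0,
            "fwd_packets": fwd,
            "bwd_packets": len(pkts) - fwd,
        }
    return flows

# Toy packets (made up for this example, not taken from the dataset).
packets = [
    (0.0, "10.0.0.2", 5000, "10.0.0.9", 80, "tcp", 60),
    (0.5, "10.0.0.9", 80, "10.0.0.2", 5000, "tcp", 1500),
    (1.0, "10.0.0.2", 5000, "10.0.0.9", 80, "tcp", 40),
    (2.0, "10.0.0.3", 53000, "10.0.0.9", 53, "udp", 80),
]
flows = summarize_flows(packets)
```

Because the flow files ship with statistics of this kind already computed, a model can be trained on them without repeating this step on the PCAP data.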

Scale and Diversity

The dataset's scale provides:

  • Millions of network flows
  • Diverse attack implementations
  • Realistic traffic mixing (benign and malicious)
  • Multiple device types and manufacturers
  • Temporal patterns spanning extended periods

Research Applications

  • Deep Learning IDS: Train neural networks on flow features for intrusion detection
  • Signature Development: Use PCAP files to create attack signatures
  • Behavioral Analysis: Study differences between legitimate and botnet traffic patterns
  • Protocol Analysis: Examine protocol-level characteristics of attacks
  • Real-Time Detection: Develop systems using flow-based features for online detection

Advantages for ML Research

  • Labels for both binary classification (benign vs. attack) and multi-class classification (attack type)
  • Rich feature set reducing preprocessing requirements
  • Sufficient data volume for deep learning approaches
  • Realistic class imbalance reflecting real networks
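The two label granularities and the class imbalance noted above can be handled along the following lines. This is a sketch under stated assumptions: the category strings ("Normal", "DDoS", "DoS", "Reconnaissance") and the toy counts are hypothetical, since this listing does not give the actual column names or label values. The weighting formula, n_samples / (n_classes * class_count), is the common "balanced" inverse-frequency heuristic, swapped in here as one reasonable option rather than a method prescribed by the dataset.

```python
from collections import Counter

def to_binary(category, benign_label="Normal"):
    """Collapse a multi-class attack category into 0 (benign) or 1 (attack)."""
    return 0 if category == benign_label else 1

def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * count_of_class)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

# Toy label column (hypothetical values, heavily skewed on purpose).
categories = ["DDoS"] * 6 + ["DoS"] * 2 + ["Reconnaissance"] + ["Normal"]

binary_labels = [to_binary(c) for c in categories]
weights = balanced_class_weights(categories)

for cls, w in sorted(weights.items()):
    print(f"{cls:15s} weight={w:.3f}")
```

Rare classes receive larger weights, which can then be passed to a loss function or sampler so that the majority class does not dominate training.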


To explore column names, data types, and sample rows, visit the official dataset page on Kaggle.


Cite This Dataset

Vignesh Venkateswaran (2023). Bot-IoT Dataset - Large-Scale IoT Botnet Traffic with Full Packet Capture. [Dataset]. Kaggle. https://www.kaggle.com/datasets/vigneshvenkateswaran/bot-iot


Original source: Kaggle (2023). Visit official page for more details.

Indexed by IoTDataset.com on Jan 23, 2026
