AI4I 2020 Predictive Maintenance — Milling Machine Sensor Failures [10,000 Records]


Abstract

"Synthetic IIoT dataset reflecting real milling machine predictive maintenance scenarios. 10,000 records with 14 columns, including air temperature, process temperature, rotational speed, torque, tool wear, and 5 binary failure-mode labels. CSV format. Well suited to multi-label fault classification."

Description

Overview

The AI4I 2020 Predictive Maintenance Dataset is a synthetic dataset carefully designed to reflect real industrial predictive maintenance data encountered in milling machine operations. It was created because actual industrial maintenance datasets are rarely publicly available due to proprietary constraints, yet there is strong demand for labeled IIoT failure data in machine learning research.

The dataset models five distinct failure modes — Tool Wear Failure (TWF), Heat Dissipation Failure (HDF), Power Failure (PWF), Overstrain Failure (OSF), and Random Failure (RNF) — alongside a combined machine failure label. Each record includes air and process temperature, rotational speed, torque, and tool wear duration, making it a compact but highly structured multi-label classification benchmark.
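The relationship between the five mode flags and the combined label can be checked with a short sketch. It assumes the combined label is meant to be the logical OR of the five mode flags, which this page does not state explicitly, and it uses the column names from the schema on this page. The rows are made up for illustration and are not taken from the dataset.

```python
FAILURE_MODES = ("TWF", "HDF", "PWF", "OSF", "RNF")


def combined_failure(record):
    """Return 1 if any individual failure-mode flag is set, else 0.

    Assumption: the combined label is the logical OR of the mode flags.
    """
    return int(any(int(record[mode]) for mode in FAILURE_MODES))


def inconsistent_records(records):
    """Return records whose Machine_failure flag disagrees with the mode flags."""
    return [r for r in records if int(r["Machine_failure"]) != combined_failure(r)]


# Made-up rows for illustration only (not real dataset values).
toy_records = [
    {"UDI": 1, "Machine_failure": 0, "TWF": 0, "HDF": 0, "PWF": 0, "OSF": 0, "RNF": 0},
    {"UDI": 2, "Machine_failure": 1, "TWF": 0, "HDF": 1, "PWF": 0, "OSF": 0, "RNF": 0},
    {"UDI": 3, "Machine_failure": 0, "TWF": 0, "HDF": 0, "PWF": 0, "OSF": 0, "RNF": 1},
]

mismatches = inconsistent_records(toy_records)
print([r["UDI"] for r in mismatches])  # prints [3]
```

Running the same check on the downloaded file is a quick way to see whether the combined label can be derived from the mode flags or has to be treated as a separate target.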

With 10,000 records, no missing values, and clear feature-label relationships, AI4I 2020 is widely used to benchmark and compare ML classifiers including Random Forest, XGBoost, SVM, and neural networks for IIoT fault detection and root-cause analysis.

Column Schema

Column                       Description
UDI                          Unique identifier ranging from 1 to 10,000.
Product_ID                   Product identifier: quality-variant letter L (low), M (medium), or H (high) followed by a variant-specific serial number.
Type                         Product quality variant on its own: L, M, or H.
Air_temperature_K            Air temperature in Kelvin (generated via random walk).
Process_temperature_K        Process temperature in Kelvin.
Rotational_speed_rpm         Rotational speed in RPM (calculated from power).
Torque_Nm                    Torque in Newton-metres.
Tool_wear_min                Tool wear duration in minutes.
Machine_failure              Binary combined machine failure label (0/1).
TWF / HDF / PWF / OSF / RNF  Individual binary failure mode labels (5 types).
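As a rough illustration of working with this schema, the sketch below parses CSV text with Python's standard csv module, checks that the expected columns are present, and converts values to numeric types. The column names follow the schema on this page and the headers in the downloaded file may differ. The integer/float split is an assumption, and the two data rows are made-up placeholders rather than real records.

```python
import csv
import io

# Assumption: names follow this page's schema; real file headers may differ.
INT_COLUMNS = ("UDI", "Rotational_speed_rpm", "Tool_wear_min", "Machine_failure",
               "TWF", "HDF", "PWF", "OSF", "RNF")
FLOAT_COLUMNS = ("Air_temperature_K", "Process_temperature_K", "Torque_Nm")
TEXT_COLUMNS = ("Product_ID",)
REQUIRED = INT_COLUMNS + FLOAT_COLUMNS + TEXT_COLUMNS


def parse_rows(handle):
    """Read CSV rows from a text handle and convert numeric columns.

    Raises ValueError if a required column is missing. Any extra
    columns are kept unchanged as text.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for raw in reader:
        row = dict(raw)
        for name in INT_COLUMNS:
            row[name] = int(raw[name])
        for name in FLOAT_COLUMNS:
            row[name] = float(raw[name])
        rows.append(row)
    return rows


# Made-up placeholder rows for illustration only (not real dataset values).
TOY_CSV = (
    "UDI,Product_ID,Air_temperature_K,Process_temperature_K,Rotational_speed_rpm,"
    "Torque_Nm,Tool_wear_min,Machine_failure,TWF,HDF,PWF,OSF,RNF\n"
    "1,M00001,300.0,310.0,1500,40.0,0,0,0,0,0,0,0\n"
    "2,L00002,301.5,311.2,1400,55.5,120,1,0,1,0,0,0\n"
)

rows = parse_rows(io.StringIO(TOY_CSV))
print(len(rows), rows[1]["Torque_Nm"], rows[1]["HDF"])  # prints 2 55.5 1
```

For the real file, the same function can be called with an open file handle, for example `open(path, newline="")`, after adjusting the column-name tuples to match the actual header row.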

Key Statistics

  • Total Records: 10,000
  • Columns: 14 (identifiers, sensor readings, and failure labels)
  • Failure Types: 5 individual modes + 1 combined label
  • Missing Values: None
  • File Format: CSV
  • Donated: August 2020

Use Cases

  • Multi-label failure mode classification for CNC milling machines
  • Benchmarking ML/DL classifiers on IIoT predictive maintenance tasks
  • Feature importance analysis for industrial sensor-driven fault detection
  • Root cause analysis of manufacturing equipment failures
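For the multi-label use case, a common first step is to turn the five failure-mode flags into one target vector per record and to look at how often each mode and each combination of modes occurs. The sketch below shows one way to do this with the standard library. The helper names are hypothetical, the column names follow the schema on this page, and the records are made up for illustration.

```python
from collections import Counter

FAILURE_MODES = ("TWF", "HDF", "PWF", "OSF", "RNF")


def to_label_vector(record):
    """Return the five mode flags as a tuple of 0/1 values (multi-label target)."""
    return tuple(int(record[mode]) for mode in FAILURE_MODES)


def mode_counts(records):
    """Count how many records have each individual failure mode set."""
    counts = {mode: 0 for mode in FAILURE_MODES}
    for record in records:
        for mode in FAILURE_MODES:
            counts[mode] += int(record[mode])
    return counts


def label_set_counts(records):
    """Count how often each combination of failure modes occurs."""
    return Counter(
        tuple(mode for mode in FAILURE_MODES if int(record[mode]))
        for record in records
    )


# Made-up records for illustration only (not real dataset values).
toy_records = [
    {"TWF": 0, "HDF": 0, "PWF": 0, "OSF": 0, "RNF": 0},
    {"TWF": 0, "HDF": 1, "PWF": 0, "OSF": 0, "RNF": 0},
    {"TWF": 0, "HDF": 1, "PWF": 0, "OSF": 1, "RNF": 0},
    {"TWF": 0, "HDF": 0, "PWF": 0, "OSF": 0, "RNF": 0},
]

targets = [to_label_vector(r) for r in toy_records]
print(targets[2])                # prints (0, 1, 0, 1, 0)
print(mode_counts(toy_records))
print(label_set_counts(toy_records))
```

The per-mode counts give a quick view of class imbalance, and the combination counts show whether records with more than one active mode are frequent enough to matter for the chosen classifier.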

Source & Attribution

The dataset was created by Stephan Matzka of HTW Berlin (Hochschule für Technik und Wirtschaft Berlin) and donated to the UCI Machine Learning Repository. It is available via UCI and is widely referenced in Industry 4.0 and IIoT machine learning papers.

View Data Structure

To explore column names, data types, and sample rows, visit the official dataset page on UCI.


Cite This Dataset

Matzka, Stephan (2020). AI4I 2020 Predictive Maintenance — Milling Machine Sensor Failures [10,000 Records]. [Dataset]. UCI Machine Learning Repository. https://archive.ics.uci.edu/ml/datasets/ai4i+2020+predictive+maintenance+dataset

Source: UCI Machine Learning Repository (2020)

Indexed by IoTDataset.com on Apr 10, 2026

