Published April 2025 | Version 1.0.0
Dataset Open

Synthetic Data from Industrial Sensor Monitoring

  • 1. Polytechnic Institute of Porto
  • 2. INESC TEC

Description

Overview
This collection contains five datasets that simulate sensor readings from different industrial production lines. The data represent measurements of temperature, pressure, and, in some cases, elapsed time of industrial machines. Each record is labeled as either normal operation or a potential anomaly.

Data Structure

Common Format
All datasets are in CSV format with the following common fields (the capitalization of column names varies between files, as listed under Dataset Descriptions):
- timestamp: Date and time of measurement (format YYYY-MM-DD HH:MM:SS)
- temperature/Temperature: Temperature value in arbitrary units
- pressure/Pressure: Pressure value in arbitrary units
- label: Binary indicator (0 = normal operation, 1 = anomaly)

Two datasets (LineA_Stable_10K.csv and LineB_Flux.csv) include an additional field:
- elapsed_time/Elapsed_time: Machine runtime in arbitrary units
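
Because the column capitalization differs between files, it can help to normalize the header before working with the data. The following is a minimal sketch using only the Python standard library. The function name `read_sensor_csv` and the sample rows are illustrative assumptions and are not part of the dataset.

```python
import csv
import io
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # YYYY-MM-DD HH:MM:SS, as documented above


def read_sensor_csv(handle):
    """Read one dataset into a list of dicts with lowercase column names."""
    rows = []
    for raw in csv.DictReader(handle):
        # Column capitalization differs between files, so normalize it.
        row = {key.strip().lower(): value.strip() for key, value in raw.items()}
        record = {
            "timestamp": datetime.strptime(row["timestamp"], TIMESTAMP_FORMAT),
            "temperature": float(row["temperature"]),
            "pressure": float(row["pressure"]),
            # Assumption: labels are stored as 0/1 (accepts "1" as well as "1.0").
            "label": int(float(row["label"])),
        }
        # elapsed_time is only present in some datasets.
        if "elapsed_time" in row:
            record["elapsed_time"] = float(row["elapsed_time"])
        rows.append(record)
    return rows


# Made-up sample in the documented layout (not real rows from the dataset).
sample = io.StringIO(
    "timestamp,Temperature,pressure,elapsed_time,label\n"
    "2025-05-01 00:00:00,179.5,159.6,34.2,0\n"
    "2025-05-01 00:01:00,179.9,159.8,34.3,1\n"
)
records = read_sensor_csv(sample)
print(records[0]["temperature"], records[1]["label"])
```

For a downloaded file, the same function can be called as `read_sensor_csv(open("LineA_Stable_10K.csv", newline=""))`, assuming the file sits in the working directory.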

Dataset Descriptions

1. LineA_Stable_10K.csv
   - Period: May 2025
   - Fields: timestamp, Temperature, pressure, elapsed_time, label
   - Characteristics: Low variability in temperature and pressure, larger dataset with 10,000 records
   - Size: 757.50KB
   - Notes: Most stable production line with consistent readings

2. LineB_Flux.csv
   - Period: April 2025
   - Fields: timestamp, temperature, pressure, Elapsed_time, label
   - Characteristics: Medium variability in temperature and pressure
   - Size: 381.55KB
   - Notes: Production line with moderate fluctuations

3. LineC_Turbulent.csv
   - Period: March 2025
   - Fields: timestamp, Temperature, pressure, label
   - Characteristics: High variability in temperature, medium variability in pressure
   - Size: 288.11KB
   - Notes: Production line with turbulent conditions and significant fluctuations

4. LineD_SpikeControl.csv
   - Period: February 2025
   - Fields: timestamp, temperature, Pressure, label
   - Characteristics: High variability in temperature, low variability in pressure
   - Size: 288.18KB
   - Notes: Production line with controlled pressure but temperature spikes

5. LineE_SmoothRun.csv
   - Period: January 2025
   - Fields: timestamp, Temperature, pressure, label
   - Characteristics: Low variability in both temperature and pressure
   - Size: 288.17KB
   - Notes: Production line with smooth operation and minimal fluctuations

Data Statistics

| Dataset | Temperature Range | Pressure Range | Elapsed Time Range | % of Anomalies |
|---|---|---|---|---|
| LineA_Stable_10K | ~179-180 | ~159-160 | ~34-35 | < 1% |
| LineB_Flux | ~188-191 | ~19-20 | ~19-20 | < 1% |
| LineC_Turbulent | ~196-210 | ~97-103 | N/A | < 5% |
| LineD_SpikeControl | ~196-202 | ~97-102 | N/A | < 5% |
| LineE_SmoothRun | ~199-200 | ~99-100 | N/A | 0% |
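
Summary figures of this kind (value ranges and anomaly share) can be recomputed from the raw files. The sketch below assumes each row has already been parsed into a dict with lowercase keys (`temperature`, `pressure`, optional `elapsed_time`, `label`). The helper name `summarize` and the sample values are invented for illustration.

```python
def summarize(rows):
    """Return the (min, max) range of each numeric field and the anomaly percentage."""
    if not rows:
        raise ValueError("rows must not be empty")
    summary = {}
    for field in ("temperature", "pressure", "elapsed_time"):
        values = [row[field] for row in rows if field in row]
        # None marks a field that the dataset does not contain (N/A in the table).
        summary[field] = (min(values), max(values)) if values else None
    anomalies = sum(1 for row in rows if row["label"] == 1)
    summary["anomaly_pct"] = 100.0 * anomalies / len(rows)
    return summary


# Made-up rows for illustration only (not taken from the dataset).
rows = [
    {"temperature": 199.2, "pressure": 99.1, "label": 0},
    {"temperature": 199.8, "pressure": 99.9, "label": 0},
    {"temperature": 205.0, "pressure": 99.5, "label": 1},
    {"temperature": 199.5, "pressure": 99.4, "label": 0},
]
stats = summarize(rows)
print(stats)
```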

Data Generation Methodology
The data were synthetically generated to simulate real operational conditions of industrial production lines. Anomalies were introduced to represent potential failures or abnormal operating conditions. Each dataset represents a different production line with specific characteristics regarding stability and variability.
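
The record does not describe the generator itself. Purely as an illustration of the general idea (baseline readings with noise, plus occasional injected anomalies), a hypothetical generator could look like the sketch below. The Gaussian noise model, the one-minute sampling interval, the size of the anomaly offset, and every name and parameter are assumptions, not the authors' method.

```python
import random
from datetime import datetime, timedelta


def generate_line(n, start, temp_mean, temp_std, pres_mean, pres_std,
                  anomaly_rate, seed=0):
    """Generate n synthetic readings; hypothetical model, not the authors' generator."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    rows = []
    for i in range(n):
        is_anomaly = rng.random() < anomaly_rate
        temperature = rng.gauss(temp_mean, temp_std)
        pressure = rng.gauss(pres_mean, pres_std)
        if is_anomaly:
            # Assumed anomaly model: shift temperature far outside the normal band.
            temperature += rng.choice((-1, 1)) * 6 * temp_std
        rows.append({
            # Assumed sampling interval of one minute.
            "timestamp": (start + timedelta(minutes=i)).strftime("%Y-%m-%d %H:%M:%S"),
            "temperature": round(temperature, 2),
            "pressure": round(pressure, 2),
            "label": int(is_anomaly),
        })
    return rows


demo = generate_line(n=5, start=datetime(2025, 1, 1), temp_mean=199.5, temp_std=0.2,
                     pres_mean=99.5, pres_std=0.2, anomaly_rate=0.0)
print(demo[0]["timestamp"], demo[0]["label"])
```

Varying `temp_std` and `pres_std` is one way to mimic the difference between a stable and a turbulent line under these assumptions.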

Suggested Applications
- Anomaly detection in industrial environments
- Machine failure prediction
- Time series analysis of sensor data
- Development of predictive maintenance systems
- Benchmarking machine learning algorithms for industrial IoT
- Comparative analysis of production line stability
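
As a starting point for the anomaly detection and benchmarking use cases, a very simple baseline is to flag readings that lie far from the mean and score the flags against the `label` column. The sketch below uses a global z-score with a threshold of 3, which is an arbitrary illustrative choice. The function names and the sample series are assumptions, and this baseline is not claimed to be the approach used in the referenced paper.

```python
import statistics


def zscore_flags(values, threshold=3.0):
    """Flag values whose absolute z-score exceeds the threshold (1 = flagged)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0] * len(values)  # a constant series has no outliers
    return [int(abs(value - mean) / std > threshold) for value in values]


def precision_recall(predicted, actual):
    """Compare predicted flags with ground-truth labels (both lists of 0/1)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    fp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    fn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up temperature series with one injected spike (not real dataset values).
temperatures = [200.0 + 0.1 * ((i % 3) - 1) for i in range(30)]
labels = [0] * 30
temperatures[12] = 206.0
labels[12] = 1

flags = zscore_flags(temperatures)
precision, recall = precision_recall(flags, labels)
print(precision, recall)
```

A global z-score ignores temporal structure, so rolling statistics or a learned model would be natural next steps when comparing methods on the more variable lines.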

Contact
 - Davide Carneiro
 - davide.r.carneiro@inesctec.pt
 - Escola Superior de Tecnologia e Gestão, Instituto Politécnico do Porto, 4610-156 Felgueiras, Portugal
 - INESC TEC, R. Dr. Roberto Frias, 4200-465 Porto, Portugal

Files

How_to_Use.pdf

Files (2.2 MB)

- 44.7 kB (md5:60c0ce83e460a365759e4e5908bf4a2d)
- 775.7 kB (md5:664d60672e369352f5235c9376f4c947)
- 390.7 kB (md5:cb1ef784bb5a99098ec43a763b9a1e67)
- 295.0 kB (md5:cb408152cc4729b4384d5b1c9667bd97)
- 295.1 kB (md5:1adc02fec2893d071ecba27d85e1754b)
- 295.1 kB (md5:c29839ffd6ea599dc600b3c353b996ac)
- 55.5 kB (md5:00e5a0d918c28fe083f5fc37b4654048)
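
The MD5 checksums can be used to confirm that a download is intact. The following is a minimal sketch with Python's `hashlib`. The helper names are illustrative, and the demonstration hashes a temporary file containing the bytes `hello` rather than a dataset file.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or plain '<hex>'."""
    expected_hex = expected.lower().split(":")[-1]
    return md5_of_file(path) == expected_hex


# Self-check on a temporary file; the MD5 of b"hello" is a well-known value.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
demo_ok = matches_checksum(tmp.name, "md5:5d41402abc4b2a76b9719d911017c592")
os.remove(tmp.name)
print(demo_ok)
```

To check a downloaded file, pass its local path together with the checksum shown for that file on the record page.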

Additional details

Related works

References
Conference paper: Towards Generalizable Machine Learning Pipelines in Complex Industrial Scenarios (Other)

Funding

Polytechnic Institute of Porto
PRODUTECH R3 C645808870-00000067

Software

Programming language
Python