Published June 3, 2026 | Version v1
Dataset Open

High-Frequency Water Quality Time-Series Dataset for WAWQI Forecasting

  • 1. Telkom University
  • 2. Telkom University School of Applied Science

Description

This dataset contains high-frequency, multivariate time-series data collected from an active freshwater aquaculture lake at Telkom University, Bandung, Indonesia. A multi-sensor monitoring node was deployed over a four-day period and logged observations continuously at 15-second intervals.

The primary purpose of this dataset is to facilitate research in short-term temporal forecasting of the Weighted Arithmetic Water Quality Index (WAWQI) using machine learning algorithms. The dataset consists of 23,502 rows covering seven physicochemical parameters, plus a pre-computed WAWQI score provided for direct use as a forecasting target.
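One common way to set up short-term forecasting on a series like this is a sliding window: a fixed number of past samples is used to predict the value a few steps ahead. The sketch below is only an illustration of that framing. The function name `make_windows`, the `lookback` and `horizon` values, and the toy series are assumptions and are not taken from the authors' baseline.

```python
from typing import List, Sequence, Tuple

SAMPLE_INTERVAL_S = 15  # sampling interval stated in the description


def make_windows(
    series: Sequence[float], lookback: int, horizon: int
) -> Tuple[List[List[float]], List[float]]:
    """Turn a 1-D series into (inputs, targets) pairs.

    Each input holds `lookback` consecutive values; its target is the value
    `horizon` steps after the last value of that input.
    """
    if lookback < 1 or horizon < 1:
        raise ValueError("lookback and horizon must be positive")
    inputs: List[List[float]] = []
    targets: List[float] = []
    last_start = len(series) - lookback - horizon
    for start in range(last_start + 1):
        end = start + lookback
        inputs.append(list(series[start:end]))
        targets.append(series[end + horizon - 1])
    return inputs, targets


# Toy values standing in for WAWQI_Score (not real data).
toy = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
X, y = make_windows(toy, lookback=3, horizon=2)
lead_time_s = 2 * SAMPLE_INTERVAL_S  # a 2-step horizon is 30 seconds ahead
```

With 15-second sampling, the horizon in steps converts directly to lead time in seconds, so the same helper can be reused for different forecast distances.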

Dataset Variables:

  1. Temperature: Water temperature in degrees Celsius (°C)
  2. pH: Potential of Hydrogen (acid-base balance)
  3. Dissolved Oxygen (DO): Measured in milligrams per liter (mg/L)
  4. Turbidity: Measured in Nephelometric Turbidity Units (NTU)
  5. Electrical Conductivity (EC): Measured in milliSiemens per centimeter (mS/cm)
  6. Total Dissolved Solids (TDS): Measured in milligrams per liter (mg/L)
  7. Oxidation-Reduction Potential (ORP): Measured in millivolts (mV)
  8. WAWQI_Score: The computed index score
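This record does not state which formula variant, permissible limits, or ideal values were used to compute WAWQI_Score. For orientation only, the sketch below follows the commonly cited weighted arithmetic form, in which each parameter gets a quality rating relative to its permissible limit and a weight inversely proportional to that limit. The function name `wawqi` and the toy numbers are hypothetical; real limits would have to come from the applicable water quality standard.

```python
from typing import Sequence


def wawqi(
    values: Sequence[float],
    standards: Sequence[float],
    ideals: Sequence[float],
) -> float:
    """Weighted arithmetic index: sum(w_i * q_i) / sum(w_i).

    q_i = 100 * (v_i - ideal_i) / (standard_i - ideal_i)
    w_i = K / standard_i, with K = 1 / sum(1 / standard_i)
    """
    if not (len(values) == len(standards) == len(ideals)):
        raise ValueError("all inputs must have the same length")
    k = 1.0 / sum(1.0 / s for s in standards)
    weights = [k / s for s in standards]
    ratings = [
        100.0 * (v - i) / (s - i)
        for v, s, i in zip(values, standards, ideals)
    ]
    return sum(w * q for w, q in zip(weights, ratings)) / sum(weights)


# Toy example: two made-up parameters, each at half its permissible limit.
score = wawqi(values=[5.0, 2.5], standards=[10.0, 5.0], ideals=[0.0, 0.0])
# score is approximately 50
```

In this form a reading at its ideal value contributes a rating of 0 and a reading at its permissible limit contributes 100. Some published variants take the absolute value of each rating, so results may differ from the pre-computed column.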

Note for reproducibility: To match the exact 23,453-row dataset used in our baseline predictive modeling research, users must exclude the first 36 observations (sensor stabilization) and the final 13 observations (sensor retrieval).
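The trimming described in the note above can be sketched as follows. This assumes the CSV has a single header row and that rows are stored in chronological order; neither is confirmed by this record, and the helper names `trim_rows` and `load_trimmed` are illustrative.

```python
import csv
from typing import List

HEAD_DROP = 36  # sensor stabilization rows, per the reproducibility note
TAIL_DROP = 13  # sensor retrieval rows, per the reproducibility note


def trim_rows(rows: List[dict]) -> List[dict]:
    """Drop the first HEAD_DROP and the last TAIL_DROP observations."""
    return rows[HEAD_DROP : len(rows) - TAIL_DROP]


def load_trimmed(path: str) -> List[dict]:
    """Read the CSV (header row assumed) and return the trimmed rows."""
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    return trim_rows(rows)


# Sanity check on placeholder rows matching the published row count.
placeholder = [{"row": i} for i in range(23_502)]
trimmed = trim_rows(placeholder)
print(len(trimmed))  # 23453
```

Calling `load_trimmed("aquaculture_wawqi_dataset.csv")` on the downloaded file should return 23,453 rows if the assumptions above hold.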

Files

aquaculture_wawqi_dataset.csv

Files (1.1 MB)

md5:b4cecefad13b7485323fb9d539bc076c (1.1 MB)
md5:47b55f41b10201e5a99b5e92de9dfa6e (2.2 kB)
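A downloaded file can be checked against the MD5 checksums listed above. The sketch below assumes that the checksum of the 1.1 MB entry belongs to aquaculture_wawqi_dataset.csv, since this listing does not pair file names with hashes explicitly. The helper names are illustrative.

```python
import hashlib

# Checksum listed for the 1.1 MB file (assumed to be the CSV).
EXPECTED_MD5 = "b4cecefad13b7485323fb9d539bc076c"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_MD5) -> bool:
    """Compare a file's MD5 digest with the expected hex string."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_expected("aquaculture_wawqi_dataset.csv")` returning `False` would suggest an incomplete download or a different file version.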