Published January 7, 2026 | Version v1.0.0

OpenMarineStream v1.0: Raw Multi-Sensor Marine Time Series for Online Anomaly Detection

Authors/Creators

  • 1. Data & AI Engineer

Description

Overview
OpenMarineStream v1.0 is a raw multi-sensor marine time-series dataset created for research on online anomaly detection, sensor fusion, and concept drift in environmental monitoring systems. The archive contains nine raw CSV files, each representing an individual sensor stream exported over the period 10-28 to 10-30, together with a labelled preprocessed dataset:

  • C3 Temperature

  • C3 Turbidity

  • Flow Flow

  • Flow Temperature

  • Optode Concentration

  • Optode Saturation

  • Optode Temperature

  • SEB45 Conductivity

  • SEB45 Salinity

  • A Labelled Preprocessed Dataset

Each file includes a common timestamp column and numerical readings from the corresponding instrument. Together, these streams capture water temperature, turbidity, flow rate, conductivity, salinity, dissolved oxygen concentration, oxygen saturation, and related probe temperatures in a FerryBox-style flow-through system.

Preprocessing (reference pipeline)
For RoLA V2.1 experiments, these raw CSV files were merged and preprocessed as follows:

  1. Timestamp harmonisation & merging

    • Convert all timestamp columns to a standard datetime format.

    • Merge the nine datasets on the timestamp using an inner join, so each row represents a time step where all selected sensors have readings.

    • Standardise column names to a consistent schema (e.g. C3_Temperature, C3_Turbidity, Flow_Flow, Flow_Temperature, Optode_Concentration, Optode_Saturation, Optode_Temperature, SEB45_Conductivity, SEB45_Salinity).

  2. Handling missing data

    • Apply forward-fill (ffill) and backward-fill (bfill) to propagate recent valid values through short gaps.

    • Use linear interpolation to estimate remaining missing values in continuous numeric columns.

  3. Feature selection & normalisation

    • Retain all numerical sensor readings as candidate features for anomaly detection.

    • Optionally apply Min–Max normalisation to scale each series into [0, 1] for certain models; RoLA V2.1 can also operate on raw values.
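
The three preprocessing steps above can be sketched with pandas as follows. This is a minimal illustration under stated assumptions, not the RoLA V2.1 code itself: the function names, the timestamp column name `timestamp`, the assumption that each raw file holds a single value column, and the gap limit of 3 rows are all hypothetical and should be adapted to the actual files.

```python
from functools import reduce

import pandas as pd


def merge_streams(frames, ts_col="timestamp"):
    """Inner-join per-sensor frames on a shared timestamp column.

    `frames` maps a schema name (e.g. "C3_Temperature") to a DataFrame.
    The column name `ts_col` is an assumption, not taken from the dataset.
    """
    prepared = []
    for name, df in frames.items():
        df = df.copy()
        # Step 1a: convert timestamps to a standard datetime dtype.
        df[ts_col] = pd.to_datetime(df[ts_col])
        # Assumes one value column per file; rename it to the schema name.
        value_col = [c for c in df.columns if c != ts_col][0]
        prepared.append(df[[ts_col, value_col]].rename(columns={value_col: name}))
    # Step 1b: keep only time steps where every sensor has a row.
    merged = reduce(lambda a, b: a.merge(b, on=ts_col, how="inner"), prepared)
    return merged.sort_values(ts_col).set_index(ts_col)


def fill_gaps(df, limit=3):
    """Forward/backward fill short gaps, then interpolate what remains."""
    # `limit` (hypothetical value) caps how many consecutive rows are filled.
    filled = df.ffill(limit=limit).bfill(limit=limit)
    return filled.interpolate(method="linear", limit_direction="both")


def min_max_scale(df):
    """Scale each column into [0, 1]; constant columns map to 0."""
    span = (df.max() - df.min()).replace(0, 1.0)  # avoid division by zero
    return (df - df.min()) / span
```

In use, each raw CSV would be read with `pd.read_csv`, collected into a dictionary keyed by the standardised column name, and passed through `merge_streams`, `fill_gaps`, and, where a model needs it, `min_max_scale`.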

Intended use
The dataset is designed to support:

  • Development and benchmarking of real-time anomaly detection algorithms on multivariate environmental data.

  • Experiments with sensor fusion (optical, electrochemical, and flow-through sensors).

  • Studies of concept drift and robustness in marine monitoring streams.

It is directly compatible with the RoLA V2.1 streaming anomaly detection demo, which supports synthetic data, USGS/UK Hydrology sources, and user-uploaded CSVs based on this schema.
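
Before uploading a CSV to the demo, it may help to check that its header follows the schema described above. The sketch below is an illustration only: the column list is taken from the preprocessing section, while the timestamp column name `timestamp`, the helper name `missing_columns`, and the idea that all nine columns are required are assumptions rather than documented behaviour of RoLA V2.1.

```python
# Standardised sensor columns, as listed in the preprocessing section.
EXPECTED_COLUMNS = [
    "C3_Temperature", "C3_Turbidity", "Flow_Flow", "Flow_Temperature",
    "Optode_Concentration", "Optode_Saturation", "Optode_Temperature",
    "SEB45_Conductivity", "SEB45_Salinity",
]


def missing_columns(columns, ts_col="timestamp"):
    """Return the schema columns absent from `columns`, in schema order.

    `ts_col` is a hypothetical name for the shared timestamp column.
    """
    present = set(columns)
    return [c for c in [ts_col, *EXPECTED_COLUMNS] if c not in present]
```

An empty result means the header contains every expected column; otherwise the returned names show what would need to be added or renamed.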

Files

C3_Temperature_10-28_10-30.csv

Files (6.1 MB)

Checksum Size
md5:546d54fe6c5577331e6e7dbbe8e94f99 622.1 kB
md5:387d7a54f90810a4ca8cc488fdb015b2 599.6 kB
md5:1c05cebcc1ca9336b2d5fe4b484b0ddc 615.5 kB
md5:ab545cd034bc96b53beb8c0c589177cc 612.9 kB
md5:08dab1c8d09852d3cc1428257280bcb3 539.0 kB
md5:6797baa99160bff5071d0d4732f99e0e 621.9 kB
md5:f8bc5d483087d56f3fbc45b37d264801 621.7 kB
md5:d420e90ff6a432ee80c46db6ebc7ddde 622.0 kB
md5:ae0f5fd72e4e2e41ae0c5c7d0793ac0c 623.4 kB
md5:8e8bb37cf47c3c47182809e1492a9ac1 603.4 kB
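
The MD5 checksums listed above can be used to confirm that a downloaded file is intact. The following is a minimal sketch using only the Python standard library (Python 3.9 or later); the helper names `md5_of_file` and `matches_published` are hypothetical. Because the listing here does not pair each checksum with a file name, the check only confirms that a file's digest appears among the published values.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published):
    """Return True if the file's digest is among the published checksums.

    `published` holds strings such as "md5:546d54fe..." copied from the
    file listing; the "md5:" prefix is stripped before comparing.
    """
    known = {p.removeprefix("md5:").lower() for p in published}
    return md5_of_file(path) in known
```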

Additional details

Related works

Software

Repository URL
https://github.com/freelansire/RoLA-Anomaly-Detection
Programming language
Python
Development Status
Active