Published May 6, 2026 | Version v1
Dataset Open

Synthetic Pulsed Current Anion Sensor Data for Multiple Nonspecific Ion Selective Electrodes - Simulated Mixed Ion Responses

Description

Dataset Overview

This dataset contains synthetic electrochemical feature data generated to expand coverage of mixed-ion conditions for a multi-electrode polymer sensor array. The synthetic data were created from experimentally measured single-ion responses using an approximate RC-Voigt framework and a noninteracting-ion assumption.

The dataset is intended for machine learning and statistical modeling workflows that require broad concentration-space coverage beyond the experimentally measured combinations.

Scientific Context and Generation Logic

Initial experiments measured potential-versus-time responses of polymer-coated polypyrrole (PPy) electrodes (PAN, PVA, and PVC coatings) in single-ion solutions and selected mixtures. To extend condition coverage, synthetic responses were generated by:

  • fitting single-ion potential-time traces with an RC-type response model;
  • estimating ion-specific resistance and capacitance trends vs concentration using quadratic fits in -log10(c);
  • combining ion responses under a first-order noninteracting assumption to obtain total circuit parameters;
  • generating potential-time trajectories and extracting compact temporal descriptors.
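
The steps above can be sketched roughly as follows. This is a minimal illustration, not the generation code used for the dataset. It assumes that each ion contributes one parallel RC branch, so that resistances combine in parallel and capacitances add, and that the pulse is a constant current. The coefficient tables, the current value, and all function names are hypothetical placeholders.

```python
import math

# Hypothetical quadratic coefficients (a2, a1, a0) in p = -log10(c).
# Placeholders for illustration only, NOT values fitted to the dataset.
R_COEFFS = {  # resistance in ohms
    "nitrate": (2.0e4, 5.0e4, 1.0e5),
    "chloride": (1.5e4, 6.0e4, 1.2e5),
    "hydrogen_phosphate": (2.5e4, 4.0e4, 1.5e5),
}
C_COEFFS = {  # capacitance in farads
    "nitrate": (1.0e-7, 0.0, 5.0e-6),
    "chloride": (1.2e-7, 0.0, 4.0e-6),
    "hydrogen_phosphate": (0.8e-7, 0.0, 6.0e-6),
}


def quadratic_in_p(coeffs, conc_molar):
    """Evaluate a2*p**2 + a1*p + a0 with p = -log10(c)."""
    a2, a1, a0 = coeffs
    p = -math.log10(conc_molar)
    return a2 * p * p + a1 * p + a0


def combine_ions(concs):
    """ASSUMPTION: non-interacting ions behave as parallel RC branches,
    so resistances combine in parallel and capacitances add."""
    r_parts = [quadratic_in_p(R_COEFFS[ion], c) for ion, c in concs.items()]
    c_parts = [quadratic_in_p(C_COEFFS[ion], c) for ion, c in concs.items()]
    r_tot = 1.0 / sum(1.0 / r for r in r_parts)
    c_tot = sum(c_parts)
    return r_tot, c_tot


def rc_potential(t_s, r_tot, c_tot, current_a=1.0e-6):
    """Potential across a parallel RC element under a constant-current step."""
    return current_a * r_tot * (1.0 - math.exp(-t_s / (r_tot * c_tot)))


def potential_trace(concs, times_s=range(1, 11)):
    """Potentials at 1-10 s for one mixed-ion condition."""
    r_tot, c_tot = combine_ions(concs)
    return [rc_potential(float(t), r_tot, c_tot) for t in times_s]


mix = {"nitrate": 1e-3, "chloride": 1e-4, "hydrogen_phosphate": 1e-2}
trace = potential_trace(mix)
```

The ten values in `trace` play the role of the potential descriptors at 1-10 s. The real combination rule and pulse parameters may differ from the parallel-branch assumption used here.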

Synthetic concentration combinations were sampled for nitrate, chloride, and hydrogen phosphate over 10^-5 to 10^-1 M, resulting in 500,000 synthetic samples.
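
The record does not state how the concentration combinations were drawn. As one plausible sketch, the snippet below samples each ion independently and log-uniformly over the stated range. The log-uniform choice, the seed, and the function name are assumptions made for illustration.

```python
import random


def sample_concentrations(n, low_exp=-5.0, high_exp=-1.0, seed=0):
    """Draw n (c1, c2, c3) triples in mol/L.

    ASSUMPTION: each ion is sampled independently and uniformly in
    log10(c); the dataset's actual sampling scheme is not documented here.
    """
    rng = random.Random(seed)
    return [
        tuple(10.0 ** rng.uniform(low_exp, high_exp) for _ in range(3))
        for _ in range(n)
    ]


samples = sample_concentrations(5)
```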

Data Utility

This dataset supports:

  • supervised learning to predict ion concentrations from electrochemical descriptors;
  • feature selection and model benchmarking across electrode-specific response spaces;
  • analysis of sensitivity to electrode-to-electrode variability;
  • rapid simulation-driven hypothesis testing for mixed-ion sensing.

Data Organization

  • Format: Parquet (columnar)
  • Rows: 500,000
  • Columns: 206
  • Row unit: one synthetic mixed-ion condition with engineered response features

Core concentration and indexing fields

  • synthetic_id: unique row identifier added for public release
  • Ca, Cb, Cg: ion concentrations (M) used to generate each sample
  • pCa, pCb, pCg: transformed concentrations, defined as -log10(Ca), -log10(Cb), -log10(Cg)
  • r_tot: modeled total resistance term used in synthetic response generation
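
As a small illustration of the concentration transform, the helpers below convert between a molar concentration and its negative base-10 logarithm. The function names are hypothetical.

```python
import math


def to_p(conc_molar):
    """Return p = -log10(c) for a concentration in mol/L."""
    if conc_molar <= 0:
        raise ValueError("concentration must be positive")
    return -math.log10(conc_molar)


def from_p(p_value):
    """Invert the transform: c = 10**(-p)."""
    return 10.0 ** (-p_value)
```

Under this transform, the sampled range of 10^-5 to 10^-1 M corresponds to p values between 5 and 1.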

Electrode feature blocks

For each electrode (PAN01-03, PVA01-03, PVC01-03), the table includes 22 descriptors:

  • *_v1 to *_v10: potential descriptors at 1-10 s
  • *_slope_1_to_2 ... *_slope_9_to_10: adjacent-interval slopes
  • *_slope_1_to_5, *_slope_5_to_10, *_slope_1_to_10: windowed average slopes
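
The slope descriptors can be reproduced from the ten potential values along these lines. This is a sketch that assumes uniform 1 s spacing and takes a windowed average slope to be the endpoint difference divided by the window length, which for uniform spacing equals the mean of the adjacent-interval slopes in that window. The function name and the unprefixed keys are illustrative.

```python
def slope_descriptors(v, dt_s=1.0):
    """Compute the 12 slope features from potentials sampled at 1..10 s.

    v[0] is the potential at 1 s and v[9] the potential at 10 s.
    ASSUMPTION: samples are uniformly spaced by dt_s seconds.
    """
    if len(v) != 10:
        raise ValueError("expected 10 potential values")
    feats = {}
    # Nine adjacent-interval slopes: 1->2, 2->3, ..., 9->10.
    for k in range(1, 10):
        feats[f"slope_{k}_to_{k + 1}"] = (v[k] - v[k - 1]) / dt_s
    # Three windowed average slopes over longer spans.
    for a, b in ((1, 5), (5, 10), (1, 10)):
        feats[f"slope_{a}_to_{b}"] = (v[b - 1] - v[a - 1]) / ((b - a) * dt_s)
    return feats


example = slope_descriptors([0.1 * t for t in range(1, 11)])
```

Together with the ten potential values, these twelve slopes give the 22 descriptors per electrode.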

Total electrode descriptors: 9 electrodes x 22 features = 198 columns.
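
The column arithmetic can be checked by building the expected names programmatically. The sketch below assumes that column names are formed as the electrode label, an underscore, and the descriptor suffix (for example `PAN01_v1`), and that the concentration fields are stored as three columns each. The exact spelling in the Parquet file should be confirmed against its schema.

```python
# ASSUMPTION: electrode labels and the "<electrode>_<suffix>" pattern are
# inferred from the description, not read from the file itself.
ELECTRODES = [f"{poly}{i:02d}" for poly in ("PAN", "PVA", "PVC") for i in (1, 2, 3)]

CORE_COLUMNS = ["synthetic_id", "Ca", "Cb", "Cg", "pCa", "pCb", "pCg", "r_tot"]


def descriptor_suffixes():
    """Return the 22 per-electrode descriptor suffixes."""
    potentials = [f"v{t}" for t in range(1, 11)]
    adjacent = [f"slope_{k}_to_{k + 1}" for k in range(1, 10)]
    windowed = ["slope_1_to_5", "slope_5_to_10", "slope_1_to_10"]
    return potentials + adjacent + windowed


def electrode_columns(electrode):
    """Return the expected column names for one electrode."""
    return [f"{electrode}_{suffix}" for suffix in descriptor_suffixes()]


feature_columns = [c for e in ELECTRODES for c in electrode_columns(e)]
all_columns = CORE_COLUMNS + feature_columns
```

A list such as `electrode_columns("PVC02")` can then be passed to the `columns` argument of `pandas.read_parquet` to load one electrode's block without reading the full table.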

Files

Files (734.1 MB)

  • Checksum: md5:c8a4e0223bbc053de1c07ceb40050999
  • Size: 734.1 MB
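
After downloading, the file can be checked against the published MD5 checksum. The sketch below uses only the Python standard library and reads the file in chunks so the full 734.1 MB never has to sit in memory. The function names are illustrative, and the local path is whatever the file was saved as.

```python
import hashlib

# Checksum published on this record.
EXPECTED_MD5 = "c8a4e0223bbc053de1c07ceb40050999"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path):
    """True if the file's MD5 equals the checksum listed on the record."""
    return md5_of_file(path) == EXPECTED_MD5
```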

Additional details

Funding

U.S. National Science Foundation
NRT-HDR: Advancing Materials Frontiers with Creativity and Data Science 2243526
U.S. Geological Survey, Wetland and Aquatic Research Center
G21AC10446

Dates

Submitted
2026-05-06