Lab Calibrations of MOS Gas Sensors: Dataset to Investigate Drift Compensation
Description
This dataset was created at the Lab for Measurement Technology (Saarland University). It contains data from two lab calibrations, performed about one year apart, of two resistive-type metal oxide semiconductor (MOS, also known as MOX or SMOX) gas sensors (SGP40 multipixel sensors, Sensirion AG, Switzerland), each comprising four gas-sensitive layers (pixels). The purpose of the dataset is to study drift effects and drift compensation strategies for MOS sensors.
The sensors are run in temperature cycled operation (TCO) (DOI: 10.1016/B978-0-08-102559-8.00012-4). The measured sensor data is proportional to logarithmic conductance and provided together with a time stamp in Unix Time as well as the set temperature of the micro-hotplates. Additionally, for each temperature cycle the gas concentration set points in parts-per-billion (ppb) from the calibration in a custom-built Gas Mixing Apparatus (GMA) (DOI: 10.1515/teme-2023-0075) are provided as target values (labels).
The temperature cycle (applied to all pixels) consists of 24 steps, with a total length of 144 s and a sample rate of 10 Hz, resulting in 1440 datapoints per cycle (i.e. number of columns in the sensor data struct). The cycle starts at 400 °C for 5 s, after which a lower temperature is set for 7 s. After each low temperature, another 400 °C step is set for 5 s. The lower temperatures range from 100 °C to 375 °C, increasing by 25 °C with each step. A picture of the cycle is added to the dataset.
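The cycle description above can be checked with a short sketch that rebuilds the set-temperature profile sample by sample. It is an illustration in Python, not code shipped with the dataset; the constant and function names are hypothetical, and it assumes that step boundaries fall exactly on sample boundaries.

```python
# Sketch of the set-temperature profile described above.
# Names are illustrative; step edges are assumed to align with samples.
SAMPLE_RATE_HZ = 10
HIGH_TEMP_C = 400
HIGH_DURATION_S = 5
LOW_DURATION_S = 7
LOW_TEMPS_C = list(range(100, 376, 25))  # 100, 125, ..., 375 °C (12 values)


def build_cycle():
    """Return the set temperature in °C for every sample of one cycle."""
    profile = []
    for low_temp in LOW_TEMPS_C:
        # Each low-temperature step is preceded by a 400 °C step.
        profile += [HIGH_TEMP_C] * (HIGH_DURATION_S * SAMPLE_RATE_HZ)
        profile += [low_temp] * (LOW_DURATION_S * SAMPLE_RATE_HZ)
    return profile


cycle = build_cycle()
n_steps = 2 * len(LOW_TEMPS_C)
duration_s = len(cycle) / SAMPLE_RATE_HZ
print(n_steps, "steps,", duration_s, "s,", len(cycle), "samples")
```

Twelve pairs of a 5 s high step and a 7 s low step give the 24 steps, 144 s, and 1440 samples per cycle stated above.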
The entire dataset is divided into two laboratory calibrations. In between, a field test was performed over more than one year in a normal office room (12.04.2023 - 02.05.2024), which led to aging of the sensors. The purpose of the second calibration is to allow drift compensation (DOI: 10.3390/atmos12050647). The chronological sequence and time periods of the measurements are as follows:
- First lab calibration: 06.04.2023 - 11.04.2023
- Second lab calibration: 03.05.2024 - 07.05.2024
Both calibrations consist of 200 randomised Unique Gas Mixtures (UGMs) (DOI: 10.5194/jsss-9-411-2020). Each UGM is a mixture of substances inside the given boundaries and is held for 20 minutes. Between two UGMs, the GMA flushes the previous state and sets the new one. Because the concentrations are not in a steady state during this transition, the affected sensor cycles are invalid for evaluation and are marked with NaN in the target vectors.
During the first calibration, the GMA generated mixtures of the following substances: acetone, carbon monoxide, ethanol, ethyl acetate, formaldehyde, hydrogen, toluene, and humidity. Each mixture is unique in its composition. To avoid unwanted correlations, Latin Hypercube Sampling was used, where the concentrations are linearly distributed between the given boundaries and independent of each other (DOI: 10.3390/atmos13101614).
During the second calibration, the same substances were present, except for carbon monoxide and ethyl acetate. The remaining gases have the same concentration ranges as before. For the relative humidity, the maximum value is set to 75% RH instead of 70% RH. All concentration boundaries are listed in the table below:
| Substance | First calibration, min. | First calibration, max. | Second calibration, min. | Second calibration, max. |
|---|---|---|---|---|
| Acetone | 1 ppb | 300 ppb | 1 ppb | 300 ppb |
| Carbon monoxide | 100 ppb | 2000 ppb | / | / |
| Ethanol | 1 ppb | 300 ppb | 1 ppb | 300 ppb |
| Ethyl acetate | 1 ppb | 300 ppb | / | / |
| Formaldehyde | 1 ppb | 300 ppb | 1 ppb | 300 ppb |
| Hydrogen | 400 ppb | 1900 ppb | 400 ppb | 1900 ppb |
| Toluene | 1 ppb | 300 ppb | 1 ppb | 300 ppb |
| Rel. Humidity @20 °C | 25% RH | 70% RH | 25% RH | 75% RH |
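To illustrate how Latin Hypercube Sampling spreads 200 mixtures over the ranges in the table, here is a minimal Python sketch. It is a generic textbook-style implementation, not the code used to program the GMA. The dictionary keys and function name are hypothetical, and drawing uniformly within each stratum is an assumption.

```python
import random

# Ranges of the first calibration, copied from the table above.
RANGES_FIRST_CALIBRATION = {
    "acetone_ppb": (1, 300),
    "carbon_monoxide_ppb": (100, 2000),
    "ethanol_ppb": (1, 300),
    "ethyl_acetate_ppb": (1, 300),
    "formaldehyde_ppb": (1, 300),
    "hydrogen_ppb": (400, 1900),
    "toluene_ppb": (1, 300),
    "rel_humidity_percent": (25, 70),
}


def latin_hypercube(ranges, n_samples, seed=0):
    """Draw n_samples mixtures; each variable hits every stratum once."""
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in ranges.items():
        # One random point inside each of n_samples equal-width strata.
        unit = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so the strata of different variables are paired randomly.
        rng.shuffle(unit)
        columns[name] = [low + u * (high - low) for u in unit]
    return [
        {name: columns[name][i] for name in ranges}
        for i in range(n_samples)
    ]


mixtures = latin_hypercube(RANGES_FIRST_CALIBRATION, n_samples=200)
print(mixtures[0])
```

Each variable is sampled exactly once per stratum, so the full range of every substance is covered evenly, while the independent shuffles keep the substances uncorrelated on average.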
Dataset:
The Matlab mat-file comprises different datasets stored in structs:
- sensor: Struct with data of 736 SGP40 sensors, with each SGP40 sensor being comprised of 4 pixels (sensor0 - sensor3). Each sensor pixel is described by a matrix (A x B), where A represents the number of recorded cycles (observations) and B the number of data points within each cycle (144 s cycle at a 10 Hz sample rate, i.e. 1440 data points). The depicted sensor name is the unique ID of each sensor.
- target: Struct with gas concentrations for every cycle and sensor. NaN values are set for invalid cycles (e.g. when the GMA has not reached steady state between two mixtures).
  - gases: Concentration in ppb
  - water: Relative humidity in %RH @20 °C
  - states: UGM variable, indicating which cycle belongs to which UGM. This allows the user to build group-based training/validation/test splits based on UGMs instead of on observations alone.
- time: Struct with the time data. Time is stored in Unix time and in Matlab datetime format. The time represents the starting time of each cycle.
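The NaN marking and the states variable described above suggest a two-step preparation: drop invalid cycles, then split by UGM so that cycles from one mixture never appear in both training and test data. The Python sketch below illustrates this on toy data. The function names and all values are hypothetical, and it assumes that a UGM number is available for every cycle.

```python
import math
import random


def valid_cycle_indices(target_values):
    """Indices of cycles whose target is not NaN (GMA in steady state)."""
    return [i for i, v in enumerate(target_values) if not math.isnan(v)]


def split_by_ugm(states, valid_indices, test_fraction=0.2, seed=0):
    """Assign whole UGMs to train or test; return two lists of indices."""
    ugm_ids = sorted({states[i] for i in valid_indices})
    rng = random.Random(seed)
    rng.shuffle(ugm_ids)
    n_test = max(1, round(len(ugm_ids) * test_fraction))
    test_ugms = set(ugm_ids[:n_test])
    train = [i for i in valid_indices if states[i] not in test_ugms]
    test = [i for i in valid_indices if states[i] in test_ugms]
    return train, test


# Toy stand-in data (hypothetical values, not taken from the dataset):
# 5 UGMs with 3 cycles each; the first cycle of each UGM is invalid.
nan = float("nan")
states = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5]
acetone = [nan, 120.0, 120.0, nan, 35.5, 35.5, nan, 250.0, 250.0,
           nan, 80.0, 80.0, nan, 10.0, 10.0]

valid = valid_cycle_indices(acetone)
train_idx, test_idx = split_by_ugm(states, valid)
print(len(train_idx), "training cycles,", len(test_idx), "test cycles")
```

Splitting by UGM rather than by single cycle avoids leakage, because consecutive cycles within one mixture share the same target values.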
Files
Preview: Zenodo_Cycle.png
Total size: 419.2 MB
| Checksum | Size |
|---|---|
| md5:fdc49ee985fd34aca04e2891bd33a1ab | 83.3 kB |
| md5:0205135ec71cf52c9b33fe1215d61cd5 | 226.1 MB |
| md5:e0b15cfb6d3fffdefd3ac76ebdec379a | 193.0 MB |