CNR Ozone Sounding Merged (COSM) Dataset
Authors/Creators
Description
The unified database of ozonesounding profiles was obtained through the merging of three existing ozonesounding datasets, provided by the Southern Hemisphere Additional OZonesondes (SHADOZ), the Network for the Detection of Atmospheric Composition Change (NDACC), and the World Ozone and Ultraviolet Radiation Data Centre (WOUDC).
Because the networks use heterogeneous formats and provide varying levels of detail, even for measurements shared across different initiatives, only a selected set of variables of interest, covering both data and metadata, was used to build the unified dataset. These variables are listed in the following table.
| Standard name | Description | Unit |
|---|---|---|
| idstation | The name of the station. | N.A. |
| location_latitude | Latitude of the station. | deg |
| location_longitude | Longitude of the station. | deg |
| location_height | Altitude, elevation, or height of the defined platform plus instrument above sea level. | m |
| date_of_observation | Date when the ozonesonde was launched (in format yyyy-mm-dd hh:mm:ss with time zone). | N.A. |
| time | Elapsed flight time since release. | s |
| pressure | Atmospheric pressure of each level in pascals. | Pa |
| geop_alt | Geopotential height in meters. | m |
| temperature | Air temperature in kelvin. | K |
| relative_humidity | Relative humidity as a dimensionless fraction (unit 1). | 1 |
| wind_speed | Wind speed in meters per second. | m/s |
| wind_direction | Wind direction in degrees. | deg |
| latitude | Observation latitude (during the flight). | deg |
| longitude | Observation longitude (during the flight). | deg |
| altitude | Height of sensor above local ground or sea surface. Positive values for above surface (e.g., sondes), negative for below (e.g., xbt). For visual observations, the height of the visual observing platform. | m (a.s.l.) |
| sample_temperature | Temperature where the sample is measured, in kelvin. | K |
| o3_partial_pressure | The level partial pressure of ozone in pascals. | Pa |
| ozone_concentration | The level mixing ratio of ozone in ppmv. | ppmv |
| ozone_partial_pressure_total_uncertainty | Total uncertainty in the calculation of the ozone partial pressure, as a composite of the individual uncertainty contributions. Uncertainties due to systematic bias are treated as random and assumed to follow a normal distribution. The uncertainty calculation also accounts for the increased uncertainty incurred by homogenizing the data record. | Pa |
| network | Source network of the profile. | N.A. |
| type | Station classification flag. | N.A. |
| vertical_coverage_flag | Boolean flag indicating whether the ozone profile reaches the 10 hPa pressure level. Set to 't' if the profile reaches the 10 hPa level, 'f' otherwise. | N.A. |
| vertical_completeness_flag | Boolean flag indicating whether the ozone profile contains at least one data point every 100 meters throughout its vertical extent. Set to 't' if the profile is vertically complete (i.e., no gaps larger than 100 meters), 'f' otherwise. | N.A. |
| outliers_flag | Boolean flag indicating whether the ozone partial pressure profile (o3_partial_pressure) contains strong outliers, based on the ±3·IQR method. Set to 't' if no strong outliers are found, 'f' otherwise. | N.A. |
| time_series_completeness_flag | Boolean flag indicating whether the time series for a given station includes at least three ozone profiles per month, allowing up to 5% of months without coverage. Set to 't' if this criterion is met, 'f' otherwise. | N.A. |
| filter_check | Profile quality control flag. | N.A. |
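The two ozone variables in the table are linked by a standard relation: a volume mixing ratio is the partial pressure of the gas divided by the total air pressure. The sketch below shows that conversion for a single level. Whether ozone_concentration in the dataset was derived this way or taken directly from the source networks is not stated here, so the function is an illustration only, and the numbers in the example are invented.

```python
def ozone_ppmv(o3_partial_pressure_pa: float, pressure_pa: float) -> float:
    """Ozone volume mixing ratio in ppmv from two pressures given in Pa."""
    if pressure_pa <= 0:
        raise ValueError("pressure must be positive")
    # Ratio of partial pressure to total pressure, scaled to parts per million.
    return o3_partial_pressure_pa / pressure_pa * 1e6


# Illustrative values only (not taken from the dataset):
# 0.015 Pa of ozone at an air pressure of 5000 Pa (50 hPa), roughly 3 ppmv.
example_ppmv = ozone_ppmv(0.015, 5000.0)
print(round(example_ppmv, 3))
```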
The dataset is organized into two main tables:
- unified_header, which contains metadata associated with each ozonesounding profile (idstation, date_of_observation, location_latitude, location_longitude, location_height, network, type, filter_check, vertical_coverage_flag, vertical_completeness_flag, outliers_flag, time_series_completeness_flag);
- unified_value, which includes the actual measurement data (idstation, date_of_observation, time, pressure, geop_alt, temperature, relative_humidity, wind_speed, wind_direction, latitude, longitude, altitude, sample_temperature, o3_partial_pressure, ozone_concentration, ozone_partial_pressure_total_uncertainty).
To improve accessibility and performance, both tables are further subdivided into year-specific subtables, allowing for more efficient querying and data management across temporal ranges.
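As an illustration of how the two tables relate, the sketch below selects profiles from header rows and attaches their measurement levels. It assumes that rows have already been loaded into Python dictionaries and that idstation plus date_of_observation, the two columns shared by both tables, identify a profile. The storage format inside the archive and the naming of the year-specific subtables are not described here, so loading is left out, and all station names and values are invented.

```python
from collections import defaultdict

# Hypothetical in-memory rows; column names follow the variable table above.
header_rows = [
    {"idstation": "STN_A", "date_of_observation": "2001-03-04 12:00:00+00",
     "network": "SHADOZ", "filter_check": 4},
    {"idstation": "STN_B", "date_of_observation": "2001-03-05 11:30:00+00",
     "network": "WOUDC", "filter_check": 2},
]
value_rows = [
    {"idstation": "STN_A", "date_of_observation": "2001-03-04 12:00:00+00",
     "pressure": 100000.0, "o3_partial_pressure": 0.002},
    {"idstation": "STN_A", "date_of_observation": "2001-03-04 12:00:00+00",
     "pressure": 5000.0, "o3_partial_pressure": 0.015},
    {"idstation": "STN_B", "date_of_observation": "2001-03-05 11:30:00+00",
     "pressure": 90000.0, "o3_partial_pressure": 0.003},
]


def profile_key(row):
    """Assumed join key: station identifier plus launch date."""
    return (row["idstation"], row["date_of_observation"])


def group_levels(values):
    """Group measurement rows by profile key."""
    grouped = defaultdict(list)
    for row in values:
        grouped[profile_key(row)].append(row)
    return grouped


def select_profiles(headers, values, min_filter_check):
    """Map each sufficiently good profile key to its list of levels."""
    levels = group_levels(values)
    return {
        profile_key(h): levels.get(profile_key(h), [])
        for h in headers
        if h["filter_check"] >= min_filter_check
    }


selected = select_profiles(header_rows, value_rows, min_filter_check=3)
print(sorted(selected))
```

Filtering on the header first keeps the much larger measurement table out of the selection logic, which is the usual reason for splitting metadata from values.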
Among the metadata variables included in the unified_header table, type and filter_check play a key role in characterizing the quality and coverage of the ozonesounding profiles. The type variable classifies each station based on the continuity of its time series: stations are grouped into Long Coverage (G), Medium Coverage (Y), or Short Coverage (R), depending on whether they provide at least one profile per month for at least 95% of the months in their time series, spanning:
- ≥20 years for Long Coverage,
- ≥10 and <20 years for Medium Coverage,
- <10 years for Short Coverage.
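The classification above can be sketched as a small function over the launch dates of one station. This is one possible reading of the rule, not the procedure used to build the dataset: the time series length is taken as the span from the first to the last month with data, and a station that misses the 95% monthly criterion is returned as None. The text does not say how such stations are labelled, so that choice is an assumption.

```python
from datetime import date


def month_index(d: date) -> int:
    """Number of the calendar month, counted continuously across years."""
    return d.year * 12 + (d.month - 1)


def classify_station(launch_dates, min_fraction=0.95):
    """Return 'G', 'Y' or 'R', or None if the monthly criterion is not met."""
    if not launch_dates:
        return None
    months = {month_index(d) for d in launch_dates}
    total_months = max(months) - min(months) + 1
    # Fraction of months in the span that have at least one profile.
    if len(months) / total_months < min_fraction:
        return None  # assumption: handling of such stations is not specified
    years = total_months / 12
    if years >= 20:
        return "G"  # Long Coverage
    if years >= 10:
        return "Y"  # Medium Coverage
    return "R"      # Short Coverage


# Invented example: one launch on the 15th of every month for 12 years.
monthly = [date(2000 + m // 12, m % 12 + 1, 15) for m in range(12 * 12)]
print(classify_station(monthly))
```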
The filter_check variable is a quality control flag ranging from 0 to 4, summarizing the results of four structural checks applied to each profile: completeness of monthly coverage (at least three ascents per month), vertical coverage (reaching at least 10 hPa), vertical resolution (minimum one data point every 100 meters), and detection of strong outliers (values in ozone profiles beyond ±3·IQR). A higher filter_check value indicates better compliance with these criteria and, consequently, higher data reliability. The individual flags corresponding to each control are also provided in the dataset, allowing users to apply custom quality filters based on their specific research needs.
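The per-profile checks described above can be sketched as follows. Several details are assumptions rather than the dataset's documented implementation: the ±3·IQR rule is read as the fences Q1 - 3·IQR and Q3 + 3·IQR, quartiles come from Python's default statistics.quantiles method, and filter_check is taken to be the count of flags equal to 't'. The monthly coverage check depends on the whole station record, so it enters here only as an already computed flag.

```python
import statistics


def reaches_10_hpa(pressures_pa):
    """True if the lowest pressure in the profile is at or below 10 hPa (1000 Pa)."""
    return min(pressures_pa) <= 1000.0


def vertically_complete(altitudes_m, max_gap_m=100.0):
    """True if no two consecutive levels are more than max_gap_m apart."""
    levels = sorted(altitudes_m)
    return all(upper - lower <= max_gap_m
               for lower, upper in zip(levels, levels[1:]))


def has_no_strong_outliers(values, k=3.0):
    """True if every value lies within [Q1 - k*IQR, Q3 + k*IQR] (assumed fences)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return all(low <= v <= high for v in values)


def count_passed(flags):
    """Assumed meaning of filter_check: number of flags set to 't'."""
    return sum(1 for flag in flags if flag == "t")


def to_flag(passed):
    """Encode a boolean in the 't'/'f' convention used by the flag columns."""
    return "t" if passed else "f"


# Invented profile: pressures in Pa, altitudes in m, ozone partial pressure in Pa.
flags = [
    "t",  # time_series_completeness_flag, supplied from the station record
    to_flag(reaches_10_hpa([100000.0, 5000.0, 900.0])),
    to_flag(vertically_complete([0.0, 50.0, 200.0])),
    to_flag(has_no_strong_outliers([1, 2, 3, 4, 5, 6, 7, 8])),
]
print(flags, count_passed(flags))
```

Keeping each check as its own function mirrors the separate flag columns, so a user who wants a stricter or looser criterion can replace a single check without touching the others.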
In addition to the dataset, two log files are provided to ensure full transparency of the quality control process and to allow users to trace all data removals and better understand the filtering criteria applied during dataset construction:
- delete_outliers.log: lists all strong outlier values removed from the dataset. Each entry includes the station identifier, the profile date, the pressure level, and the corresponding outlier value of o3_partial_pressure.
- delete_wrong_profile.log: contains all ozone profiles that were entirely removed due to being considered erroneous. These profiles typically exhibit values consistently close to zero or deviate significantly from the station’s seasonal climatology. Each entry is catalogued by station and launch date.
Furthermore, a merging algorithm was implemented to handle the different features of the source datasets and to resolve duplicated profiles, i.e. profiles from different networks recorded within a 2-hour time window. In such cases, the profile that passes the greatest number of quality control (filter_check) tests is retained in the unified dataset. If multiple profiles pass the same number of tests, the selection is refined using additional indicators of dataset maturity, such as the availability of metadata, documentation, and peer-reviewed publications, and especially the presence of measurement uncertainties associated with the ozone concentration profiles. This last criterion is prioritized because uncertainties are routinely provided in SHADOZ, are available for only a limited number of profiles in NDACC, and are generally absent in WOUDC.
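The duplicate handling described above can be sketched as a single pass over time-ordered profiles. This is a simplified illustration, not the algorithm used to build the dataset: it assumes duplicates share the same idstation, it compares each profile only with the most recently kept one, and it models the tie-break with a single hypothetical has_uncertainty field, leaving out the other maturity indicators. All example records are invented.

```python
from datetime import datetime, timedelta


def rank(profile):
    """Sort key where a smaller tuple means a preferred profile."""
    # Higher filter_check first, then profiles that carry uncertainties.
    return (-profile["filter_check"], not profile["has_uncertainty"])


def deduplicate(profiles, window=timedelta(hours=2)):
    """Keep one profile per group of cross-network duplicates within the window."""
    ordered = sorted(profiles,
                     key=lambda p: (p["idstation"], p["date_of_observation"]))
    kept = []
    for p in ordered:
        prev = kept[-1] if kept else None
        is_duplicate = (
            prev is not None
            and prev["idstation"] == p["idstation"]
            and prev["network"] != p["network"]
            and p["date_of_observation"] - prev["date_of_observation"] <= window
        )
        if not is_duplicate:
            kept.append(p)
        elif rank(p) < rank(prev):
            kept[-1] = p  # the newcomer is preferred; on a full tie the earlier one stays
    return kept


profiles = [
    {"idstation": "STN_A", "network": "WOUDC", "filter_check": 3,
     "has_uncertainty": False,
     "date_of_observation": datetime(2001, 3, 4, 12, 0)},
    {"idstation": "STN_A", "network": "SHADOZ", "filter_check": 3,
     "has_uncertainty": True,
     "date_of_observation": datetime(2001, 3, 4, 12, 30)},
    {"idstation": "STN_A", "network": "WOUDC", "filter_check": 2,
     "has_uncertainty": False,
     "date_of_observation": datetime(2001, 3, 11, 12, 0)},
]
kept = deduplicate(profiles)
print([(p["network"], p["date_of_observation"].isoformat()) for p in kept])
```

In the example, the first two launches fall within 30 minutes of each other and tie on filter_check, so the profile with uncertainties is retained; the third launch is a week later and is kept as a separate profile.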
Files
| Name | Size | Checksum |
|---|---|---|
| ozone_unified.zip | 2.9 GB | md5:372eddb82c8301aaef3233eeaf79131a |
Additional details
Additional titles
- Alternative title: Unified database of ozonesounding profiles from existing global archives
Dates
- Updated: 2025