Quantitative Content Analysis Data for Hand Labeling Road Surface Conditions in New York State Department of Transportation Camera Images
Creators
- 1. University at Albany, State University of New York
- 2. Atmospheric Sciences Research Center, University at Albany - SUNY
- 3. UAlbany Center of Excellence
- 4. National Center for Atmospheric Research
- 5. Cooperative Institute for Research in the Atmosphere (CIRA)
Description
Foundational Codebook and Data:
Traffic camera images from the New York State Department of Transportation (511ny.org) are used to create a hand-labeled dataset of images classified into one of six road surface conditions: 1) severe snow, 2) snow, 3) wet, 4) dry, 5) poor visibility, or 6) obstructed. Six labelers (authors Sutter, Wirz, Przybylo, Cains, Radford, and Evans) went through a series of four labeling trials in which reliability across all six labelers was assessed using Krippendorff's alpha (KA) (Hayes & Krippendorff, 2007). The online tool by Dr. Freelon (Freelon, 2013; Freelon, 2010) was used to calculate reliability metrics after each trial, and the group achieved inter-coder reliability with a KA of 0.888 on the fourth trial. This process is known as quantitative content analysis, and three pieces of data used in this process are shared: 1) a PDF of the codebook, which serves as a set of rules for labeling images, 2) images from each of the four labeling trials, including the use of New York State Mesonet weather observation data (Brotzge et al., 2020), and 3) an Excel spreadsheet including the calculated inter-coder reliability (ICR) metrics and other summaries used to assess reliability after each trial. The data are included in NYSDOT_quantitative_content_analysis.zip.
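For readers unfamiliar with the metric, the following is a minimal sketch of how Krippendorff's alpha can be computed for categorical labels using the coincidence-matrix formulation. It is not the authors' procedure (the reported values were calculated with the ReCal web service), and it assumes the six classes are treated as nominal categories. The function name `krippendorff_alpha_nominal` and the toy labels are hypothetical and are not drawn from the dataset.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels.

    `units` is a list with one entry per image; each entry is the list of
    labels assigned to that image by the coders (None marks a missing label).
    Assumption: categories are nominal, so any disagreement counts equally.
    """
    # Build the coincidence matrix: every ordered pair of labels from
    # different coders within one image contributes 1 / (m - 1).
    coincidences = Counter()
    for labels in units:
        values = [v for v in labels if v is not None]
        m = len(values)
        if m < 2:
            continue  # an image labeled by fewer than two coders is not pairable
        for a, b in permutations(values, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per category and the total number of pairable labels.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("not enough pairable labels")

    # Observed and expected disagreement (both scaled by n, which cancels).
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one category is used")
    return 1.0 - observed / expected


if __name__ == "__main__":
    # Hypothetical toy data: four images, two coders each.
    toy = [["dry", "dry"], ["dry", "wet"], ["wet", "wet"], ["wet", "wet"]]
    print(round(krippendorff_alpha_nominal(toy), 3))
```

An alpha of 1 indicates perfect agreement, and 0 indicates agreement no better than chance given the observed label frequencies.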
The broader purpose of this work is to enable the six human labelers, after achieving inter-coder reliability, to label large sets of images independently, each contributing to a larger labeled dataset used for training supervised machine learning models to predict road surface conditions from camera images. The xCITE lab (xCITE, 2023) is used to store camera images from 511ny.org, and the lab provides computing resources for training machine learning models.
Obstructed Class Variation:
There are many applications for labeling roadside camera images, and as a variation of the foundational codebook, an addendum codebook provides another version of labeling the obstructed class. Specifically, this variation prioritizes labeling an image as “obstructed” only in extreme circumstances where there is a camera- or image-specific problem that prevents the assessment of any road surfaces. For labelers who want to use this version of the obstructed class (in this document) and also the other five weather-related classes (in the foundational codebook), the guidance is to use both documents in tandem, making sure to use the obstructed rules/definitions in this document while disregarding the obstructed rules/definitions in the foundational codebook. Alternatively, this codebook may be used alone in applications where the sole goal is to classify obstructed vs. not obstructed. To ensure the reliability and quality of this variation, quantitative content analysis was conducted on this addendum codebook, just as it was for the foundational codebook. Two labelers were tested with a sample of 30 images and achieved inter-coder reliability with a Krippendorff's alpha of 0.934 after one trial. The data, including the addendum codebook and labeling trial data (images and results), are included in ObstructedVariation_quantitative_content_analysis.zip.
This material is based upon work supported by the U.S. National Science Foundation under Grant No. RISE-2019758.
Files (76.1 MB)
Checksum | Size |
---|---|
md5:fc7924e225393f732cc9b2545a034ebe | 60.8 MB |
md5:e08db80d3be56d1876de0d738fe41e08 | 15.2 MB |
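A downloaded archive can be checked against the MD5 values listed above. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_any_listed_checksum` are hypothetical, and the local file path is an assumption about where the archive was saved. Because the listing does not pair each checksum with a file name, the sketch accepts a match against either listed value.

```python
import hashlib

# MD5 values as shown in the Files listing above.
LISTED_CHECKSUMS = [
    "md5:fc7924e225393f732cc9b2545a034ebe",
    "md5:e08db80d3be56d1876de0d738fe41e08",
]


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_any_listed_checksum(path, listed=LISTED_CHECKSUMS):
    """True if the file's MD5 equals any listed value (the 'md5:' prefix is ignored)."""
    actual = md5_of_file(path)
    expected = {item.split(":", 1)[-1].lower() for item in listed}
    return actual in expected


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the archive was downloaded.
    archive = "NYSDOT_quantitative_content_analysis.zip"
    try:
        print(archive, "OK" if matches_any_listed_checksum(archive) else "MISMATCH")
    except FileNotFoundError:
        print(archive, "not found")
```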
Additional details
Funding
- U.S. National Science Foundation
- AI Institute: Artificial Intelligence for Environmental Sciences (AI2ES) 2019758
References
- New York State Department of Transportation. (2023). 511NY. https://511ny.org/
- Hayes, A. F., & Krippendorff, K. (2007). Answering the call for a standard reliability measure for coding data. Communication Methods and Measures, 1(1), 77–89.
- Freelon, D. (2013). ReCal OIR: Ordinal, Interval, and Ratio Intercoder Reliability as a Web Service. ijis.net. https://dfreelon.org/publications/2013_ReCal_OIR_Ordinal_Interval_and_Ratio_Intercoder_Reliability_as_a_Web_Service.pdf
- Freelon, D. G. (2010). ReCal: Intercoder reliability calculation as a web service. dfreelon.org. https://dfreelon.org/publications/2010_ReCal_Intercoder_reliability_calculation_as_a_web_service.pdf
- Brotzge, J. A., Wang, J., Thorncroft, C. D., Joseph, E., Bain, N., Bassill, N., Farruggio, N., Freedman, J. M., Hemker, K., Johnston, D., Kane, E., McKim, S., Miller, S. D., Minder, J. R., Naple, P., Perez, S., Schwab, J. J., Schwab, M. J., & Sicker, J. (2020). A Technical Overview of the New York State Mesonet Standard Network. Journal of Atmospheric and Oceanic Technology, 37(10), 1827–1845. https://doi.org/10.1175/JTECH-D-19-0220.1
- xCITE. (2023). ExTreme Collaboration, Innovation, and Technology Laboratory. University at Albany. https://www.albany.edu/asrc/xcite-laboratory