Published July 15, 2025 | Version v2
Dataset Open

An Augmented Dataset of Autonomous Vehicle Collisions in California

  • 1. The University of Texas at Austin

Description

The rapid advancement of autonomous vehicles (AVs) and the emergence of robotaxi services have the potential to transform urban mobility. However, public concerns regarding AV safety remain a significant barrier to widespread adoption. While extensive AV testing has been conducted in controlled environments, real-world accident data is crucial for understanding safety risks and enhancing public trust. This study addresses the gap in AV-specific accident datasets by presenting a comprehensive augmented dataset of AV collisions in California, covering all reported AV-involved accidents from January 1, 2019, to December 31, 2024. The dataset integrates information from California DMV accident reports, geographical data derived using Geographic Information System (GIS) tools, and semantic information extracted via Large Language Models (LLMs). The resulting tabular dataset supports a wide range of applications, including AV crash pattern analysis, contributing factor identification, risk assessment, safety algorithm refinement, regulatory policy development, and urban infrastructure planning.

Files

CA_AV_Collision_2019-2024.csv

Files (791.1 kB)

Name Size
md5:66c0f791b30c383c45e3d4ec8971e89b 669.5 kB
md5:e16b6369dd6fe796b32008a51775f954 121.6 kB
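
The MD5 digests in the file listing can be used to check that a downloaded copy is intact. The following is a minimal sketch, not part of the record itself. The helper names `md5_of` and `matches_listing` are hypothetical, and the local path is an assumed download location. Because the listing does not say which digest belongs to which file, the check only tests whether a file matches either listed digest.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the file listing on this record.
# The listing does not pair digests with file names, so both are accepted.
LISTED_MD5 = {
    "66c0f791b30c383c45e3d4ec8971e89b",
    "e16b6369dd6fe796b32008a51775f954",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path):
    """Return True if the file's MD5 equals one of the listed digests."""
    return md5_of(path) in LISTED_MD5


if __name__ == "__main__":
    # Assumed download location; adjust to wherever the file was saved.
    local = Path("CA_AV_Collision_2019-2024.csv")
    if local.exists():
        print(local.name, md5_of(local), matches_listing(local))
    else:
        print(f"{local} not found; download it from the record first")
```

A result of `False` from `matches_listing` suggests a truncated or altered download, or a file from a different version of the record.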

Additional details

Related works