There is a newer version of the record available.

Published August 8, 2025 | Version v1
Dataset Open

Constructing a Spatiotemporal Knowledge Graph for Urban Traffic from Trajectory Big Data

Description

This dataset contains data derived directly from the urban traffic spatiotemporal knowledge graph (ST-KG) proposed in the accompanying paper, or processed further from it. It demonstrates the applicability of the ST-KG to four key tasks: spatiotemporal analysis of congestion dynamics, traffic speed prediction, intelligent question answering on congestion, and tracing the causes of congestion. The data are organized into the following archives, the first covering ST-KG construction and the remaining four following these tasks:

1. Urban Traffic ST-KG Construction & Congestion Level Assessment.zip

Contains the urban traffic ST-KG entities and relations (converted to CSV format) used to produce Figures 1 and 2 in the paper.

2. Spatiotemporal Analysis of Urban Traffic Congestion Dynamics.zip

Contains processed results extracted from the urban traffic ST-KG and computed to explore the spatiotemporal evolution of congestion. The archive includes three folders (weekday, weekend, and holiday), each containing five subfolders (group 1–5) that store the average traffic speed of each grid cell within the corresponding group.
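As a minimal sketch of how the folder layout described above could be checked after extraction: the snippet below builds the 15 expected day-type/group paths and reports any that are absent. The exact folder spellings ("group 1" rather than, say, "group1") and the helper names are assumptions, not taken from the archive itself.

```python
from pathlib import Path

# Assumed folder names; verify the spelling inside the extracted archive.
DAY_TYPES = ("weekday", "weekend", "holiday")
GROUPS = tuple(f"group {i}" for i in range(1, 6))


def expected_group_dirs(root):
    """Return the 15 day-type/group folders implied by the description."""
    root = Path(root)
    return [root / day / group for day in DAY_TYPES for group in GROUPS]


def missing_group_dirs(root):
    """List the expected folders that do not exist under root."""
    return [p for p in expected_group_dirs(root) if not p.is_dir()]
```

Calling missing_group_dirs on the extraction directory should return an empty list if the layout matches these assumed names.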

3. Traffic Speed Prediction at the Regional Scale.zip

Contains the data used for traffic speed prediction, including the counts of seven types of Points of Interest (POIs) within each predicted grid cell, data from two precipitation stations, and the adjacency matrix of the predicted grid cells. The prediction target is the average traffic speed of the predicted grid cells, calculated at five-minute intervals. The file feature_matrix_X.csv stores the interpolated average traffic speed matrix.
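To illustrate what an average traffic speed "calculated at five-minute intervals" means in practice, the sketch below bins timestamped speed observations into five-minute windows per grid cell and averages them. The record layout (cell id, timestamp, speed), the function names, and the sample values are hypothetical; the paper's actual preprocessing and the interpolation behind feature_matrix_X.csv are not reproduced here.

```python
from collections import defaultdict
from datetime import datetime


def floor_to_five_minutes(ts):
    """Round a datetime down to the start of its five-minute interval."""
    return ts.replace(minute=ts.minute - ts.minute % 5, second=0, microsecond=0)


def average_speed_by_interval(records):
    """Average speed per (cell_id, interval start).

    records: iterable of (cell_id, datetime, speed) tuples (assumed layout).
    """
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for cell_id, ts, speed in records:
        key = (cell_id, floor_to_five_minutes(ts))
        sums[key][0] += speed
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Made-up observations, for illustration only.
sample = [
    ("cell_a", datetime(2025, 1, 1, 8, 1), 30.0),
    ("cell_a", datetime(2025, 1, 1, 8, 4), 40.0),
    ("cell_a", datetime(2025, 1, 1, 8, 5), 20.0),
]
averages = average_speed_by_interval(sample)
```

Here the first two observations fall in the 08:00 window and the third in the 08:05 window, so the result holds two entries for cell_a.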

4. Intelligent Question Answering on Traffic Congestion.zip

Records the specific entities and relations extracted by the LLM-Agent from the urban traffic ST-KG in response to user queries.

5. Tracing the Causes of Non-Recurrent Traffic Congestion.zip

Contains statistical analyses built on the results of the Intelligent Question Answering on Traffic Congestion task, from which traffic flow data are derived. The files include:

  1. date_group_counts.xlsx – daily traffic flow
  2. date_time_range_count.xlsx – traffic flow at five-minute intervals
  3. honeycomb_time_date_count.xlsx – traffic flow of each grid cell at five-minute intervals
  4. honeycomb_time_date_congestion_count.xlsx – traffic flow of each grid cell at five-minute intervals, categorized by congestion level
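The file descriptions suggest, but do not state, that these files are nested aggregations of the same counts. Under that assumption, the sketch below rolls per-cell five-minute counts up to the interval level and the daily level. The tuple layout and all values are hypothetical, and reading the .xlsx files themselves (which needs a third-party library) is left out.

```python
from collections import Counter


def roll_up(cell_rows):
    """Aggregate per-cell five-minute counts to coarser levels.

    cell_rows: iterable of (date, time_range, honeycomb_id, count) tuples,
    an assumed layout for honeycomb_time_date_count.xlsx.
    Returns (counts per (date, time_range), counts per date).
    """
    by_interval = Counter()
    by_date = Counter()
    for date, time_range, _honeycomb_id, count in cell_rows:
        by_interval[(date, time_range)] += count
        by_date[date] += count
    return by_interval, by_date


# Made-up rows, for illustration only.
rows = [
    ("2025-01-01", "08:00-08:05", "h1", 3),
    ("2025-01-01", "08:00-08:05", "h2", 2),
    ("2025-01-01", "08:05-08:10", "h1", 4),
]
interval_counts, daily_counts = roll_up(rows)
```

Comparing such roll-ups against date_time_range_count.xlsx and date_group_counts.xlsx would be one way to check whether the files are in fact consistent with each other.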

Naming Notes

  • In the paper, the "grid" is hexagonal in shape; therefore, it is referred to as "honeycomb" in the dataset.
  • The "state" in the paper is derived from mapped trajectory points and is directly referred to as "trajectory_point" in the dataset.
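When moving between the paper and the files, the two naming differences above can be captured in a small lookup. This is only a convenience sketch; the dictionary and function names are hypothetical, and terms not listed in the notes pass through unchanged.

```python
# Paper term -> dataset term, as given in the naming notes.
PAPER_TO_DATASET = {
    "grid": "honeycomb",
    "state": "trajectory_point",
}


def to_dataset_term(paper_term):
    """Translate a paper term to its dataset name; unknown terms are returned as-is."""
    return PAPER_TO_DATASET.get(paper_term, paper_term)
```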

Files

Spatiotemporal Analysis of Urban Traffic Congestion Dynamics.zip