Published March 18, 2024 | Version v3
Dataset Open

YJMob100K: City-Scale and Longitudinal Dataset of Anonymized Human Mobility Trajectories

Description

The YJMob100K human mobility datasets (YJMob100K_dataset1.csv.gz and YJMob100K_dataset1.csv.gz) contain the movement of a total of 100,000 individuals across a 75 day period, discretized into 30-minute intervals and 500 meter grid cells. The first dataset contains the movement of 80,000 individuals across a 75-day business-as-usual period, while the second dataset contains the movement of 20,000 individuals across a 75-day period (including the last 15 days during an emergency) with unusual behavior. 

While the name or location of the city is not disclosed, the participants are provided with points-of-interest (POIs; e.g., restaurants, parks) data for each grid cell (~85 dimensional vector) as supplementary information (cell_POIcat.csv.gz). The list of 85 POI categories can be found in POI_datacategories.csv. 

For details of the dataset, see Data Descriptor: 

  • Yabe, T., Tsubouchi, K., Shimizu, T., Sekimoto, Y., Sezaki, K., Moro, E., & Pentland, A. (2024). YJMob100K: City-scale and longitudinal dataset of anonymized human mobility trajectories. Scientific Data, 11(1), 397. https://www.nature.com/articles/s41597-024-03237-9  

 

 

--- Details about the Human Mobility Prediction Challenge 2023 (ended November 13, 2023) --- 

The challenge takes place in a mid-sized and highly populated metropolitan area, somewhere in Japan. The area is divided into 500 meters x 500 meters grid cells, resulting in a 200 x 200 grid cell space.

The human mobility datasets (task1_dataset.csv.gz and task2_dataset.csv.gz) contain the movement of a total of 100,000 individuals across a 90 day period, discretized into 30-minute intervals and 500 meter grid cells. The first dataset contains the movement of a 75 day business-as-usual period, while the second dataset contains the movement of a 75 day period during an emergency with unusual behavior.

There are 2 tasks in the Human Mobility Prediction Challenge.

In task 1, participants are provided with the full time series data (75 days) for 80,000 individuals, and partial (only 60 days) time series movement data for the remaining 20,000 individuals (task1_dataset.csv.gz). Given the provided data, Task 1 of the challenge is to predict the movement patterns of the individuals in the 20,000 individuals during days 60-74. Task 2 is similar task but uses a smaller dataset of 25,000 individuals in total, 2,500 of which have the locations during days 60-74 masked and need to be predicted (task2_dataset.csv.gz).

While the name or location of the city is not disclosed, the participants are provided with points-of-interest (POIs; e.g., restaurants, parks) data for each grid cell (~85 dimensional vector) as supplementary information (which is optional for use in the challenge) (cell_POIcat.csv.gz).

For more details, see https://connection.mit.edu/humob-challenge-2023

Files

POI_datacategories.csv

Files (530.8 MB)

Name Size Download all
md5:93af4bb0b417cd20e95fdc1d8f7b459f
626.6 kB Download
md5:fd024e10c1d1dd91356d010413ace390
1.2 kB Preview Download
md5:3781f6f03a118b5f639bdb4f94dcfdb8
419.0 MB Download
md5:ec769fafa7a746caa3d48d2844b9735c
111.2 MB Download