Published February 5, 2024 | Version v2
Dataset Open

Dataset for "Geospatial analysis of toponyms in geotagged social media posts"

  • 1. Aalto-yliopisto
  • 2. Kogakkan University
  • 3. Tohoku University

Description

Geotagged Twitter posts dataset

Dataset used for the research presented in the following paper: Takayuki Hiraoka, Takashi Kirimura, Naoya Fujiwara (2024) "Geospatial analysis of toponyms in geo-tagged social media posts".

We collected georeferenced Twitter posts tagged to coordinates inside the bounding box of Japan between 2012 and 2018. The dataset gives the spatial distribution of all geotagged posts, as well as the spatial distribution of posts whose text contains each of 24 domestic toponyms, 12 common nouns, and 6 foreign toponyms. The code used to analyze the data is available on GitHub.

Data description

  • preprocessed_mcntlt7_selected/: Number of geotagged Twitter posts in each grid cell. Each CSV file under this directory associates a grid cell with the number of geotagged tweets tagged to coordinates inside that cell (tweetcount). A grid cell spans 30 seconds of latitude and 45 seconds of longitude, which is approximately a 1 km x 1 km square, and is identified by an 8-digit code (m3code). file_names.json maps each of the toponyms studied in this work to the corresponding data file (all denotes the full data). Note that these data files have been modified from v1.0.0 to exclude posts that contain seven or more mentions.
  • population/population_center_2020.xlsx: Center of population of each municipality based on the 2020 census. Derived from data published by the Statistics Bureau of Japan on their website (Japanese)
  • population/census2015mesh3_totalpop_setai_area.csv: Resident population in each grid cell based on the 2015 census. Derived from data published by the Statistics Bureau of Japan on e-stat (Japanese)
  • population/economiccensus2016mesh3_jigyosyo_jugyosya_area.csv: Employed population in each grid cell based on the 2016 Economic Census. Derived from data published by the Statistics Bureau of Japan on e-stat (Japanese)
  • japan_MetropolitanEmploymentArea2015map/: Shapefile for the boundaries of Metropolitan Employment Areas (MEA) in Japan. See this website for details of MEA.
  • ward_shapefiles/: Shapefiles for the boundaries of wards in large cities, published by the Statistics Bureau of Japan on e-stat.
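
As an illustration of how the per-cell counts might be read and placed on a map, the sketch below loads one CSV file and converts an m3code to coordinates. It rests on two assumptions that are not stated in this record: that each CSV has a header row with columns named m3code and tweetcount, and that m3code follows the Japanese standard grid square scheme (JIS X 0410, third-level mesh), which is consistent with the 30-second by 45-second cell size given above. The function names are hypothetical and are not taken from the authors' repository.

```python
import csv


def read_tweet_counts(csv_path):
    """Return {m3code: tweetcount} for one data file.

    Assumes a header row with 'm3code' and 'tweetcount' columns.
    """
    counts = {}
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # int(float(...)) tolerates counts stored as "3" or "3.0".
            counts[row["m3code"].strip()] = int(float(row["tweetcount"]))
    return counts


def mesh3_southwest_corner(m3code):
    """Return (lat, lon) in degrees of the southwest corner of a cell.

    Assumes the JIS X 0410 third-level mesh layout 'ppuuqvrw':
      pp, uu : first-level cell (40 minutes of latitude, 1 degree of longitude)
      q, v   : second-level index (5 minutes, 7.5 minutes)
      r, w   : third-level index (30 seconds, 45 seconds)
    """
    code = str(m3code)
    if len(code) != 8 or not code.isdigit():
        raise ValueError(f"expected an 8-digit mesh code, got {m3code!r}")
    p, u = int(code[0:2]), int(code[2:4])
    q, v = int(code[4]), int(code[5])
    r, w = int(code[6]), int(code[7])
    # Work in arc seconds so the arithmetic stays in integers.
    lat_seconds = p * 2400 + q * 300 + r * 30
    lon_seconds = (u + 100) * 3600 + v * 450 + w * 45
    return lat_seconds / 3600, lon_seconds / 3600


def mesh3_center(m3code):
    """Return (lat, lon) of the cell center: corner plus half a cell."""
    lat, lon = mesh3_southwest_corner(m3code)
    return lat + 15 / 3600, lon + 22.5 / 3600
```

Under these assumptions, the counts returned by read_tweet_counts can be paired with mesh3_center to obtain one point per grid cell. The same m3code can also serve as a join key with the population files, provided their cell identifiers use the same scheme.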

Files

data.zip (17.3 MB)
md5:501d52b2b8bf338c37bf54d93c89b70e
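
The MD5 checksum listed above can be used to confirm that a download is intact. A minimal sketch, assuming the archive has been saved locally as data.zip (the local path and the function name are illustrative):

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


EXPECTED_MD5 = "501d52b2b8bf338c37bf54d93c89b70e"  # value listed on this record

if __name__ == "__main__":
    actual = md5_of_file("data.zip")  # assumed local file name
    print("checksum OK" if actual == EXPECTED_MD5 else f"mismatch: {actual}")
```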

Additional details

Related works

Is supplement to
Preprint: arXiv:2410.03250 (arXiv)

Software

Repository URL
https://github.com/takayukihir/geotagged-tweets
Programming language
Jupyter Notebook, Python