Published December 14, 2020 | Version 1.0
Dataset Open

GeoVectors-Asia-tags (v1.0)

  • 1. L3S Research Center, Leibniz University Hannover, Germany
  • 2. Data Science & Intelligent Systems Group (DSIS), University of Bonn, Germany

Description

The GeoVectors corpus is a large-scale linked open corpus of OpenStreetMap (https://www.openstreetmap.org/) entity embeddings that provides latent representations of over 980 million entities. The GeoVectors capture the semantic and geographic similarities of OpenStreetMap entities and make them directly accessible to machine learning applications. The "-tags" datasets provide embeddings that capture the semantic similarities of OpenStreetMap entities. The "-location" datasets provide embeddings that capture their geographic similarities.

Contents

This dataset was derived from an OpenStreetMap snapshot that was taken on November 10, 2020 (© OpenStreetMap contributors).

We provide the GeoVectors in region-specific subsets. This subset contains the tag embeddings for the region "Asia", which covers the following countries and country groups:

  • Afghanistan
  • Armenia
  • Azerbaijan
  • Bangladesh
  • Bhutan
  • Cambodia
  • China
  • Gcc-States
  • India
  • Indonesia
  • Iran
  • Iraq
  • Israel-and-Palestine
  • Japan
  • Jordan
  • Kazakhstan
  • Kyrgyzstan
  • Laos
  • Lebanon
  • Malaysia-Singapore-Brunei
  • Maldives
  • Mongolia
  • Myanmar
  • Nepal
  • North-Korea
  • Pakistan
  • Philippines
  • South-Korea
  • Sri-Lanka
  • Syria
  • Taiwan
  • Tajikistan
  • Thailand
  • Turkmenistan
  • Uzbekistan
  • Vietnam
  • Yemen

File format

The embeddings are provided in the tab-separated values (TSV) format. Each row contains the embedding of a single OpenStreetMap entity. The first column contains the OpenStreetMap type of the entity, which is either node (n), way (w), or relation (r). The second column contains the OpenStreetMap id of the entity. The remaining columns contain the dimensions of the embedding space. See also header.tsv.
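The row layout described above can be read with a few lines of Python. This is a minimal sketch and not an official loader: the names `EntityEmbedding`, `parse_row`, and `read_embeddings` are hypothetical, and it assumes that the data files are uncompressed UTF-8 text without a header row (the column names being kept in header.tsv) and that the type column holds either the one-letter code or the full type name.

```python
from typing import Iterator, List, NamedTuple

# Mapping from the one-letter codes named in the description to full type names.
OSM_TYPE_NAMES = {"n": "node", "w": "way", "r": "relation"}


class EntityEmbedding(NamedTuple):
    """One row of a GeoVectors file (hypothetical container type)."""
    osm_type: str
    osm_id: int
    vector: List[float]


def parse_row(line: str) -> EntityEmbedding:
    """Split one tab-separated row into type, id, and embedding vector."""
    type_field, id_field, *dimensions = line.rstrip("\r\n").split("\t")
    # Assumption: accept both the short code ("n") and the full name ("node").
    osm_type = OSM_TYPE_NAMES.get(type_field, type_field)
    if osm_type not in OSM_TYPE_NAMES.values():
        raise ValueError(f"unexpected OpenStreetMap type: {type_field!r}")
    return EntityEmbedding(osm_type, int(id_field), [float(x) for x in dimensions])


def read_embeddings(path: str) -> Iterator[EntityEmbedding]:
    """Stream embeddings row by row, so large files need not fit in memory."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():  # skip blank lines
                yield parse_row(line)
```

Streaming row by row is a deliberate choice here, since some files in this subset are several gigabytes in size.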

Further information:

For further information, please visit http://geovectors.l3s.uni-hannover.de

Funding:

This work was partially funded by DFG, German Research Foundation (“WorldKG”, DE 2299/2-1), the Federal Ministry of Education and Research (BMBF), Germany (“Simple-ML”, 01IS18054), the Federal Ministry for Economic Affairs and Energy (BMWi), Germany (“d-E-mand”, 01ME19009B), and the European Commission (EU H2020, “smashHit”, grant-ID 871477).

Files (40.4 GB)

Checksum                              Size
md5:d310b06d8ff23afd8d73b286ec596a72  130.3 MB
md5:942573547d955a90ea3f7ff8a8c87e24  195.1 MB
md5:f4ebe0d999a6e8c3de861f38a6bfa4b4  177.8 MB
md5:dca63173f0cd46a3c37b981d562d0d70  790.7 MB
md5:6527369ce7e7c6d8673596043e48d037  32.4 MB
md5:1ae6eab0bcee4c57e0c47ab775db1e70  107.0 MB
md5:04fa28ddad13bef8bacb6ca3f1da6637  4.7 GB
md5:7aadd1e1b6ae9778d17743589132af24  1.2 GB
md5:aeac30cf1c385adb69ab0e962ebb3efd  1.1 kB
md5:457cd67eeab291eb7eda08ec63c85382  3.7 GB
md5:f53cf811f40c30f161f0f4a07a59d3c8  3.6 GB
md5:8c5d4e5771914a4f3614bc1d43342160  1.6 GB
md5:cdcd89326ccca2391c77e9147f13d011  333.1 MB
md5:5786d7880a25b7c4d59111782c0ccbf1  698.3 MB
md5:0c94537d6e4d752dd8445650d0e4cf33  10.2 GB
md5:e5ae49650623d671c3ee8d63d8740076  148.2 MB
md5:26bafc3f8c2da60eb7b4b9681c600499  1.3 GB
md5:222ea9c9b24ce99572d36f2c3a02af3c  250.3 MB
md5:f6b7a139ccce5de0efa72461dec2466f  118.8 MB
md5:898d6c92b2a3ed4952caaad640dd6f6f  85.4 MB
md5:57761f6b4c051b5e7c7f156e4ee6d5e5  1.4 GB
md5:6c1205f2e7930722f2b7565098a9f588  19.2 MB
md5:9b57d23b331f2c3956168151c5320e44  142.3 MB
md5:a97151b774ee8f538216533b6cf6917d  405.5 MB
md5:f35c858d5604da3c5284203d983156d2  893.3 MB
md5:645a516ef4ed23619dbf0d9b654de7db  150.0 MB
md5:e64db6faf4ee8774b2cd3cfb85bd0c9a  401.3 MB
md5:dea98f73d38b07977f0d599d186a50ac  1.9 GB
md5:8717e1f28ba191f87bfa514aca025710  1.4 GB
md5:431c68c4d109bff042b8a28449741c9f  507.9 MB
md5:baf550dbedffbd42c92e6cbd31461ec9  160.5 MB
md5:0953118ccec45cac384189187d564fd7  1.0 GB
md5:b2a40574adc65010e2f7a1e4a098c8d2  127.2 MB
md5:cd60721f629fcbea4f3b7a50e16751fb  1.2 GB
md5:3b3486aae9d8a5f48e609c7a81b311e0  75.7 MB
md5:84d57c85a0767d14d948bbbea1e57d5b  395.0 MB
md5:95b00c60452055d313f327776772ba64  732.9 MB
md5:fa1193e3ef54a2deb801cae41973d300  221.8 MB
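
The MD5 checksums listed above can be used to check that a downloaded file arrived intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_checksum` are hypothetical, and any path passed to them stands for wherever the file was saved locally.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use flat for multi-gigabyte files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str) -> bool:
    """Compare a file against a listed checksum, with or without 'md5:' prefix."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A mismatch usually indicates an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.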