Published June 15, 2021 | Version 1.0
Dataset Open

GeoVectors-Europe-East-location (v1.0)

  • 1. L3S Research Center, Leibniz University Hannover, Germany
  • 2. Data Science & Intelligent Systems Group (DSIS), University of Bonn, Germany

Description

GeoVectors is a comprehensive, large-scale linked open corpus of OpenStreetMap (https://www.openstreetmap.org/) entity embeddings that provides latent representations of over 980 million entities. The embeddings capture the semantic and geographic dimensions of OpenStreetMap entities and make them directly accessible to machine learning applications. The "-tags" datasets provide embeddings that capture the semantic dimension of OpenStreetMap entities; the "-location" datasets provide embeddings that capture the geographic dimension.

Contents

This dataset was derived from an OpenStreetMap snapshot that was taken on November 10, 2020 (© OpenStreetMap contributors).

We provide the GeoVectors in region-specific subsets. This subset contains the location embeddings for the region "Europe-east", which includes the following countries:

  • Albania
  • Belarus
  • Bosnia-Herzegovina
  • Bulgaria
  • Croatia
  • Cyprus
  • Czech-Republic
  • Estonia
  • Finland
  • Georgia
  • Greece
  • Hungary
  • Iceland
  • Kosovo
  • Latvia
  • Lithuania
  • Macedonia
  • Moldova
  • Montenegro
  • Poland
  • Romania
  • Serbia
  • Slovakia
  • Slovenia
  • Sweden
  • Turkey
  • Ukraine

File format

The embeddings are provided in tab-separated values (TSV) format. Each row contains the embedding of a single OpenStreetMap entity. The first column contains the OpenStreetMap type of the entity and the second column contains its OpenStreetMap ID. The type is one of node (n), way (w), or relation (r). The remaining columns hold the dimensions of the embedding space (see also header.tsv).
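The row layout described above can be read with a few lines of Python. This is only a minimal sketch under stated assumptions: the type column is assumed to hold the single-letter codes n, w and r, the data files are assumed to carry no header row (the column names being kept in header.tsv), and the function name `read_embeddings` and the sample rows are hypothetical illustrations, not part of the dataset.

```python
import csv
import io
from typing import Iterator, List, TextIO, Tuple

# Assumed mapping from the single-letter type codes to OpenStreetMap type names.
OSM_TYPES = {"n": "node", "w": "way", "r": "relation"}


def read_embeddings(handle: TextIO) -> Iterator[Tuple[str, int, List[float]]]:
    """Yield (osm_type, osm_id, vector) for each tab-separated row."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        type_code, osm_id = row[0], int(row[1])
        if type_code not in OSM_TYPES:
            raise ValueError(f"unexpected OSM type code: {type_code!r}")
        # Every column after the first two is one embedding dimension.
        yield OSM_TYPES[type_code], osm_id, [float(v) for v in row[2:]]


# Made-up sample rows with three dimensions, used only to show the layout.
sample = io.StringIO("n\t123\t0.1\t-0.2\t0.3\nw\t456\t0.4\t0.5\t-0.6\n")
rows = list(read_embeddings(sample))
for osm_type, osm_id, vector in rows:
    print(osm_type, osm_id, len(vector))
```

For a real file, the same function can be applied to an open file handle, for example `open(path, newline="")`. Because the rows are yielded one at a time, a multi-gigabyte file does not have to fit in memory.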

Further information:

For further information, please visit http://geovectors.l3s.uni-hannover.de

Funding:

This work was partially funded by DFG, German Research Foundation ("WorldKG", DE 2299/2-1), the Federal Ministry of Education and Research (BMBF), Germany ("Simple-ML", 01IS18054), the Federal Ministry for Economic Affairs and Energy (BMWi), Germany ("d-E-mand", 01ME19009B), and the European Commission (EU H2020, "smashHit", grant-ID 871477).

Files (47.8 GB)

Checksum Size
md5:524885a148000f4ea507dee1ef89363f 232.0 MB
md5:343042f30fe9af444e6e861392717f74 2.2 GB
md5:b62d86eca16809c0b54e70c57df19ac6 438.6 MB
md5:33088481638598bd1432b099d791d71e 652.9 MB
md5:31cf356faa8191c733285b3b71c28e7c 847.4 MB
md5:589df217e7f1a7bb102efe6c5d1a7633 122.9 MB
md5:4a19d85c388eb92fced933d15a804202 5.6 GB
md5:eda2f49e8529796111294acc53f4a7ae 635.2 MB
md5:3b9bc1806c4a976a7ec92d61957f9c54 3.3 GB
md5:0b68f022455b1249f7117d08707564c6 264.3 MB
md5:6c587426944d5d76a12ed0482ec9627b 1.1 GB
md5:26381b842a88278272c49de0c8c04188 298 Bytes
md5:500d15bdeb3e9fd9245917d3133bdc4a 1.7 GB
md5:b48f43d21ce05f3143ff84da85c7257a 242.5 MB
md5:492e8ea2b547b91a373e38cbacd61cfa 227.1 MB
md5:e5572b1e38c93494ed3c64948e3d5f26 585.1 MB
md5:6743308c46d361e7ca04aba11141575c 1.0 GB
md5:4a921af861d7c59611a1863ecf14a44f 81.8 MB
md5:c6357a42d6346bae24988a2492b94b8d 461.6 MB
md5:e6391c08636e5b45f0ffd6e244f78418 103.1 MB
md5:6144b6ac9a7a6eb3e1f2e4f4117a2d5e 12.9 GB
md5:6077e1b70a8842a0a75ff789d49dd48e 1.4 GB
md5:9a0de11b5e0f48f3c587c738ea01ea46 696.7 MB
md5:05432c33147cd7c462845a9e6423dd0b 1.8 GB
md5:17a61e7483e73936481395e6e8cbcdc3 1.1 GB
md5:80145dc9b30c8fb1bdad3304c4fa880f 3.1 GB
md5:c2c514bc840e094dc756d60519b6b1cd 2.1 GB
md5:22f5101db367e28e1468a9e0359fbb47 4.9 GB
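
Each file in the listing carries an MD5 checksum, so a download can be checked against it before use. The following is a minimal sketch, not an official tool: the helper names `md5_of_file` and `matches_listing` are hypothetical, and the demo runs on a small temporary file rather than on one of the dataset files.

```python
import hashlib
import os
import tempfile


def md5_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: str, listed_checksum: str) -> bool:
    """Compare a file against a checksum written as 'md5:<hex digest>'."""
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Demo on a throwaway file; its content and checksum are made up for illustration.
content = b"n\t1\t0.5\n"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(content)
    demo_path = tmp.name

demo_checksum = "md5:" + hashlib.md5(content).hexdigest()
demo_ok = matches_listing(demo_path, demo_checksum)   # checksum of the same bytes
demo_bad = matches_listing(demo_path, "md5:" + "0" * 32)  # deliberately wrong
os.remove(demo_path)
print(demo_ok, demo_bad)
```

For a downloaded file, the call would take the local path and the corresponding `md5:` string from the listing. Reading in chunks matters here because the largest file in this subset is 12.9 GB.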