Published December 14, 2020 | Version 1.0
Dataset Open

GeoVectors-Europe-East-tags (v1.0)

  • 1. L3S Research Center, Leibniz University Hannover, Germany
  • 2. Data Science & Intelligent Systems Group (DSIS), University of Bonn, Germany

Description

The GeoVectors corpus is a large-scale linked open corpus of OpenStreetMap (https://www.openstreetmap.org/) entity embeddings that provides latent representations of over 980 million entities. The GeoVectors capture the semantic and geographic similarities of OpenStreetMap entities and make them directly accessible to machine learning applications. The "-tags" datasets provide embeddings that capture the semantic similarities of OpenStreetMap entities. The "-location" datasets provide embeddings that capture their geographic similarities.

Contents

This dataset was derived from an OpenStreetMap snapshot that was taken on November 10, 2020 (© OpenStreetMap contributors).

We provide the GeoVectors in region-specific subsets. This subset contains the tag embeddings for the region "Europe-East", which comprises the following countries:

  • Albania
  • Belarus
  • Bosnia-Herzegovina
  • Bulgaria
  • Croatia
  • Cyprus
  • Czech-Republic
  • Estonia
  • Finland
  • Georgia
  • Greece
  • Hungary
  • Iceland
  • Kosovo
  • Latvia
  • Lithuania
  • Macedonia
  • Moldova
  • Montenegro
  • Romania
  • Serbia
  • Slovakia
  • Slovenia
  • Sweden
  • Turkey
  • Ukraine

File format

The embeddings are provided in the tab-separated values (tsv) format. Each row contains the embedding of a single OpenStreetMap entity. The first column contains the OpenStreetMap type and the second column contains the OpenStreetMap id of the respective entity. The type can either be node (n), way (w), or relation (r). The remaining columns represent the dimensions of the embedding space. (See also header.tsv)
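The row layout described above can be read with a few lines of Python. The following is a minimal sketch, not the authors' tooling: it assumes that the type column holds the single-letter codes n, w, and r, that the data files themselves carry no header row (the column names living in header.tsv), and that all remaining columns parse as floats. The function name read_embeddings and the three-dimensional sample rows are made up for illustration; the actual number of dimensions is whatever the files contain.

```python
import csv
import io
from typing import Iterator, List, TextIO, Tuple

# Assumed single-letter codes for the OpenStreetMap entity types.
OSM_TYPES = {"n": "node", "w": "way", "r": "relation"}


def read_embeddings(handle: TextIO) -> Iterator[Tuple[str, int, List[float]]]:
    """Yield (entity type, OpenStreetMap id, embedding vector) per TSV row."""
    reader = csv.reader(handle, delimiter="\t")
    for row in reader:
        if not row:
            continue  # skip blank lines
        # Column 1: type, column 2: id, remaining columns: embedding dimensions.
        osm_type, osm_id, *dims = row
        if osm_type not in OSM_TYPES:
            raise ValueError(f"unexpected OpenStreetMap type: {osm_type!r}")
        yield OSM_TYPES[osm_type], int(osm_id), [float(x) for x in dims]


# Made-up sample rows with three dimensions, only to show the layout.
sample = "n\t123\t0.1\t-0.2\t0.3\nw\t456\t0.0\t0.5\t-1.0\n"
rows = list(read_embeddings(io.StringIO(sample)))
```

For a downloaded file, the same function can be applied to an open file handle, for example open(path, newline="", encoding="utf-8"). Because it yields one row at a time, it does not need to hold a multi-gigabyte file in memory.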

Further information:

For further information, please visit http://geovectors.l3s.uni-hannover.de

Funding:

This work was partially funded by DFG, German Research Foundation ("WorldKG", DE 2299/2-1), the Federal Ministry of Education and Research (BMBF), Germany ("Simple-ML", 01IS18054), the Federal Ministry for Economic Affairs and Energy (BMWi), Germany ("d-E-mand", 01ME19009B), and the European Commission (EU H2020, "smashHit", grant-ID 871477).

Files (46.9 GB)

Checksum Size
md5:b70f3bbd349f894402980ce467449afc 232.9 MB
md5:ea1faa69e40444468c377109d5c0975c 2.8 GB
md5:9ab58a1c9a86cd695d01c651c21dd8ae 357.5 MB
md5:9983fe116da7728b1666f075c85187cf 823.5 MB
md5:bee67cb56fd6cab8532db559427b7f8a 884.0 MB
md5:17629de983b20b163006f89409441d74 157.2 MB
md5:6ea0d650312b068c9092ff896661aeaa 13.2 GB
md5:cd97f4fef22ca2dfbc6339eaa4b7e124 1.5 GB
md5:7ad4542a404f19ce6f25b49d63acf617 3.5 GB
md5:233ebc6ebf877c48c2da7688f4214f38 284.2 MB
md5:315d38ec86c2f80d33f2a1146194d911 1.3 GB
md5:aeac30cf1c385adb69ab0e962ebb3efd 1.1 kB
md5:3144c9525d608baaf8d9ca13ce57bf4a 2.0 GB
md5:c4dab7bb38f2d018adb98c95ff096253 359.4 MB
md5:584ff3929fb16a44312b1ee6419ffe02 475.7 MB
md5:780daca58ae5cb60e8fd934002176e40 608.8 MB
md5:01f1a5d7d2f96319fb558db0f81cf5aa 1.2 GB
md5:a97db09c7e9eb7fc2ff71ee95dd38d3d 96.8 MB
md5:365150c74fa46678faffb96250ff0582 408.0 MB
md5:bdbce04afc0340a28b6d39f981b6477d 92.3 MB
md5:29df2e2f4a3becb01b8fa9d532ce279a 1.6 GB
md5:c4f6d78cd816a90046faf6863b44973d 736.4 MB
md5:e8e4a33ad84f38b525bf0507721bd4b2 2.7 GB
md5:0cb8b7b4bf61693008a8de2a5745e781 976.0 MB
md5:8536a9b67ed0f6adbbce3cabc265f2bc 4.1 GB
md5:0175b4173d358ff526de374130418df7 1.6 GB
md5:c0d19a022b5f32ed6d489d6ced56cfc7 4.9 GB
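
The MD5 values in the file listing above can be used to check that a download arrived intact. The following is a minimal sketch using only the Python standard library; the helper names md5_of_file and matches_listing are made up for illustration. The demonstration runs on a throwaway file containing the bytes "hello" rather than on a real download, since which local file corresponds to which checksum depends on the file downloaded.

```python
import hashlib
import os
import tempfile


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Chunked reads keep memory use flat even for multi-gigabyte files.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: str, listed: str) -> bool:
    """Compare a file against a listed value such as 'md5:<hex digest>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Demonstration on a temporary file; for real use, pass the path of a
# downloaded file and the corresponding "md5:..." entry from the listing.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
demo_path = tmp.name
demo_digest = md5_of_file(demo_path)
demo_ok = matches_listing(demo_path, "md5:5d41402abc4b2a76b9719d911017c592")
os.remove(demo_path)
```

MD5 is adequate here for detecting corrupted or truncated transfers; it is not a safeguard against deliberate tampering.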