Published January 2, 2025 | Version v2
Dataset Open

MapReader_railspace_London_imago_mundi_2025

  • The Alan Turing Institute

Description

MapReader Outputs

Railspace patches and text for London maps, inferred with the MapReader software using the model https://huggingface.co/Livingwithmachines/mr_resnest101e_finetuned_OS_6inch_2nd_ed_railspace on Ordnance Survey 6-inch-to-1-mile 2nd edition map sheets from the National Library of Scotland.

How we created the London polygon

Our London polygon was defined as a circle with a 20-mile radius around a point in central London:

```python
import geopandas as gpd
from shapely.geometry import Point

# load the point as a GeoDataFrame, for easy CRS conversion
# (Point takes x, y, i.e. longitude then latitude)
london = gpd.GeoDataFrame(
    data=["London"],
    columns=["name"],
    geometry=[Point(-0.1275, 51.507222)],
    crs="EPSG:4326",
)

# convert to British National Grid (EPSG:27700), whose units are meters
london.to_crs("EPSG:27700", inplace=True)

# buffer 20 miles (32186 meters) around the London centroid
london.geometry = london.geometry.buffer(32186)
```
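The buffer distance follows from converting miles to meters. As a quick check of that arithmetic (the helper name `miles_to_meters` is ours, for illustration only), using the international mile of exactly 1609.344 meters:

```python
# One international mile is exactly 1609.344 meters.
METERS_PER_MILE = 1609.344


def miles_to_meters(miles):
    """Convert a distance in miles to meters."""
    return miles * METERS_PER_MILE


radius_m = miles_to_meters(20)

# Dropping the fractional part gives the whole-meter value used as the buffer distance.
radius_whole_m = int(radius_m)
print(radius_whole_m)
```

The exact value is about 32186.88 meters, so the 32186 used for the buffer is that figure with the fraction dropped.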

Files:

  • 100meter_patch_df.csv - data for 586,275 patches
  • 100meter_parent_df.csv - metadata for 329 maps
  • railspace_predictions_patch_df.csv - 586,275 patches classified as either "no" or "railspace": 556,721 "no", 29,554 "railspace"
  • post_processed_railspace_predictions_patch_df.csv - 586,275 patches classified as either "no" or "railspace" after post-processing: 556,880 "no", 29,395 "railspace"
  • geo_predictions_deduplicated_point.csv - georeferenced text spotting predictions for all maps, with polygons simplified to points. This file contains only point data.
  • geo_predictions_deduplicated_centroid.csv - georeferenced text spotting predictions for all maps, with polygons simplified to points. This file contains both the polygons and the points for the text spotting predictions, but loads the points as the geometry by default. To work with the polygons instead, set the `polygon` column as the geometry.
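The label counts quoted above can be re-derived from the prediction files. The following is a minimal sketch using only the Python standard library. The column name `predicted_label` is an assumption (inferred from the "new_predicted_label" column mentioned in this description), and the `image_id` column and sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_labels(rows, column="predicted_label"):
    """Count how often each label appears in an iterable of row dicts.

    `column` is an assumed column name; adjust it to match the real file.
    """
    return Counter(row[column] for row in rows)


# Tiny in-memory stand-in for a predictions CSV (hypothetical rows).
sample = io.StringIO(
    "image_id,predicted_label\n"
    "patch-a,no\n"
    "patch-b,railspace\n"
    "patch-c,no\n"
)
counts = count_labels(csv.DictReader(sample))
print(counts["no"], counts["railspace"])
```

For the real data, pass `csv.DictReader(open(path, newline=""))` instead of the sample, setting the `delimiter` argument if the file is not comma-separated. For railspace_predictions_patch_df.csv the counts should match the figures listed above, which sum to the 586,275 patches.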

Note: the post-processed dataframe has two new columns, "new_predicted_label" and "new_pred", which hold the updated label and label index. This post-processing was done using MapReader's context-based post-processing tool [here](https://mapreader.readthedocs.io/en/latest/using-mapreader/step-by-step-guide/5-post-process.html#context-post-processing). We used the default confidence threshold of 0.7.
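To see which patches were relabelled, the original and updated label columns can be compared row by row. This is a rough sketch rather than part of MapReader. It assumes the original label is stored in a column named `predicted_label`, and that an empty "new_predicted_label" value means the patch was left unchanged. Both are assumptions to check against the actual file, and the sample rows are invented.

```python
import csv
import io
from collections import Counter


def count_label_changes(rows, old_col="predicted_label", new_col="new_predicted_label"):
    """Count (old label, new label) pairs for rows whose label changed.

    Assumption: an empty or missing new label means "unchanged".
    """
    changes = Counter()
    for row in rows:
        old = row[old_col]
        new = row.get(new_col) or old  # fall back to the old label if empty
        if new != old:
            changes[(old, new)] += 1
    return changes


# Hypothetical rows standing in for the post-processed predictions file.
sample = io.StringIO(
    "image_id,predicted_label,new_predicted_label\n"
    "patch-a,railspace,no\n"
    "patch-b,railspace,railspace\n"
    "patch-c,no,\n"
)
changes = count_label_changes(csv.DictReader(sample))
print(changes)
```

In this sample only `patch-a` changes, from "railspace" to "no".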

 

Files (1.9 GB)

  • md5:386aa36f65eac6a465302f95d91ab2fb - 30.7 MB
  • md5:12b6d97d426d127bf823519dab55a8a5 - 427.8 MB
  • md5:77e547b963ac944352b5a3d0ee8e9be7 - 632.9 MB
  • md5:96ca14107e5059179e58cc2388b7f19b - 139.9 MB
  • md5:52ecb5d90198e760729324bdbf01d74c - 360.9 MB
  • md5:de2fe35d6c1ab428bb440070eaabc7bc - 345.4 MB
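The md5 checksums listed above can be used to confirm that a download completed intact. Below is a minimal sketch using Python's standard `hashlib` module. The helper names are ours, and the demo hashes a small temporary file rather than one of the dataset files.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex md5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a listed checksum such as 'md5:386a...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Demo on a throwaway file (not a dataset file).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
demo_path = tmp.name
demo_digest = md5_of_file(demo_path)
demo_ok = matches_checksum(demo_path, "md5:" + demo_digest)
os.remove(demo_path)
```

For a downloaded file, call `matches_checksum(path, "md5:...")` with the checksum shown for that file on the record page.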

Additional details

Related works

Is compiled by
Software: 10.5281/zenodo.8189652 (DOI)
Is new version of
Dataset: 10.5281/zenodo.11241370 (DOI)

Funding

Arts and Humanities Research Council
Data/Culture: Building sustainable communities around Arts and Humanities datasets and tools AH/Y00745X/1
UK Research and Innovation
Living with Machines AH/S01179X/1
UK Research and Innovation
The Alan Turing Institute EP/N510129/1

Software

Repository URL
https://github.com/maps-as-data/MapReader
Development Status
Active