The clean_coordinates function enables fast, automated and reproducible flagging of potentially erroneous occurrence coordinates, based on geographic gazetteers. It flags records that match problems known to be common in biological collections.

Individual tests can be switched on and off (see ?clean_coordinates), and the distance thresholds used by the tests can be adapted. Custom gazetteers can be provided for the tests, for example to obtain a higher level of detail.

Please find a detailed tutorial on how to clean occurrence records (e.g. from GBIF) here and how to clean fossil data (e.g. from PBDB) here.

Switch individual tests on/off

clean_coordinates wraps around multiple tests for common error sources in species distribution records. Individual tests can be included in or excluded from a run with the tests argument of clean_coordinates: a test runs only if its name is in the tests vector, so leaving out "seas" switches the seas test off. Most basic tests are switched on by default, but some of the more complex ones are switched off by default.

library(CoordinateCleaner)
## Registered S3 method overwritten by 'dplyr':
##   method               from  
##   as.data.frame.tbl_df tibble
## Registered S3 methods overwritten by 'ggplot2':
##   method         from 
##   [.quosures     rlang
##   c.quosures     rlang
##   print.quosures rlang

exmpl <- data.frame(species = sample(letters, size = 250, replace = TRUE),
                   decimallongitude = runif(250, min = 42, max = 51),
                   decimallatitude = runif(250, min = -26, max = -11),
                   countries = "MDG")


#run all tests
dat <- clean_coordinates(exmpl, tests = c("capitals", "centroids", "countries", "equal", "gbif", "institutions", "outliers", "seas", "urban", "zeros"), countries = "countries")
## Testing coordinate validity
## Flagged 0 records.
## Testing equal lat/lon
## Flagged 0 records.
## Testing zero coordinates
## Flagged 0 records.
## Testing country capitals
## Flagged 0 records.
## Testing country centroids
## Flagged 0 records.
## Testing sea coordinates
## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\alexander.zizka\AppData\Local\Temp\Rtmpqc5aFR", layer: "ne_50m_land"
## with 1420 features
## It has 3 fields
## Integer64 fields read as strings:  scalerank
## Flagged 156 records.
## Testing urban areas
## No reference for urban areas found. 
##             Using rnaturalearth to download.
## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\alexander.zizka\AppData\Local\Temp\Rtmpqc5aFR", layer: "ne_50m_urban_areas"
## with 2143 features
## It has 4 fields
## Integer64 fields read as strings:  scalerank
## Flagged 0 records.
## Testing country identity
## Flagged 156 records.
## Testing geographic outliers
## Flagged 3 records.
## Testing GBIF headquarters, flagging records around Copenhagen
## Flagged 0 records.
## Testing biodiversity institutions
## Flagged 0 records.
## Flagged 158 of 250 records, EQ = 0.63.

#only run the validity test
dat <- clean_coordinates(exmpl, tests = c(""))
## Testing coordinate validity
## Flagged 0 records.
## Flagged 0 of 250 records, EQ = 0.
| Test | Function | Background | Default |
|---|---|---|---|
| capitals | radius around capitals | georeferenced from location description | on |
| centroids | radius around country and province centroids | geo-referenced from description | on |
| countries | coordinates in the right country | switched lon/lat, data entry errors | off |
| duplicates | records from one species with identical coordinates | repetitive observation of identical individual, same voucher from multiple data sources, genetic data | off |
| gbif | radius around GBIF headquarters | data entry errors, falsely geo-referenced | on |
| institutions | radius around biodiversity institutions | falsely geo-referenced, zoo or garden records | on |
| outliers | records far away from all other records of this species | various | off |
| seas | in the sea | switched lon/lat | on |
| urban | within urban area | cultivated/captivity | off |
| validity | outside reference coordinate system | missing data, data entry errors | on |
| zeros | plain zeros, lat = lon | missing data, data entry errors | on |
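
Because the tests argument is a plain character vector, one way to switch a test off is to remove its name from that vector. The following is a minimal sketch that reuses the test names and the toy data from the full run above; all_tests and selected_tests are illustrative names, the seed is arbitrary, and the call to clean_coordinates is skipped when the package is not installed.

```r
# All test names used in the full run earlier in this document
all_tests <- c("capitals", "centroids", "countries", "equal", "gbif",
               "institutions", "outliers", "seas", "urban", "zeros")

# Switch off the seas and urban tests by dropping their names
selected_tests <- setdiff(all_tests, c("seas", "urban"))
print(selected_tests)

# Run the selection only if CoordinateCleaner is available
if (requireNamespace("CoordinateCleaner", quietly = TRUE)) {
  set.seed(1)  # arbitrary seed so the toy data are reproducible
  exmpl <- data.frame(species = sample(letters, size = 250, replace = TRUE),
                      decimallongitude = runif(250, min = 42, max = 51),
                      decimallatitude = runif(250, min = -26, max = -11),
                      countries = "MDG")
  dat <- CoordinateCleaner::clean_coordinates(exmpl,
                                              tests = selected_tests,
                                              countries = "countries")
}
```

Building the vector with setdiff keeps the list of remaining tests explicit, which may make a cleaning run easier to document and reproduce.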

Custom test radii for capitals, centroids and institutions

The capitals, centroids and institutions tests flag coordinates that fall within a radius around the gazetteer entries. You can change this radius for each test using the _rad arguments (e.g. capitals_rad). The radius is specified in decimal degrees. This means that the actual size of the radius in meters will vary depending on latitude.

clean_coordinates(exmpl, capitals_rad = 0.1)
## Testing coordinate validity
## Flagged 0 records.
## Testing equal lat/lon
## Flagged 0 records.
## Testing zero coordinates
## Flagged 0 records.
## Testing country capitals
## Flagged 0 records.
## Testing country centroids
## Flagged 0 records.
## Testing sea coordinates
## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\alexander.zizka\AppData\Local\Temp\Rtmpqc5aFR", layer: "ne_50m_land"
## with 1420 features
## It has 3 fields
## Integer64 fields read as strings:  scalerank
## Flagged 156 records.
## Testing geographic outliers
## Flagged 3 records.
## Testing GBIF headquarters, flagging records around Copenhagen
## Flagged 0 records.
## Testing biodiversity institutions
## Flagged 0 records.
## Flagged 158 of 250 records, EQ = 0.63.
##     species decimallongitude decimallatitude  val  equ  zer  cap  cen
## 1         l         49.18590       -20.71880 TRUE TRUE TRUE TRUE TRUE
## 2         w         48.73326       -18.32124 TRUE TRUE TRUE TRUE TRUE
## 3         u         42.43554       -15.72547 TRUE TRUE TRUE TRUE TRUE
## 4         c         50.17924       -21.66092 TRUE TRUE TRUE TRUE TRUE
## 5         d         50.75853       -19.11579 TRUE TRUE TRUE TRUE TRUE
## 6         m         43.93606       -15.53044 TRUE TRUE TRUE TRUE TRUE
## 7         h         42.74776       -19.26143 TRUE TRUE TRUE TRUE TRUE
## 8         f         46.16682       -13.53061 TRUE TRUE TRUE TRUE TRUE
## 9         o         43.64763       -14.84481 TRUE TRUE TRUE TRUE TRUE
## 10        u         49.25360       -12.62174 TRUE TRUE TRUE TRUE TRUE
## 11        w         42.64977       -18.68775 TRUE TRUE TRUE TRUE TRUE
## 12        g         43.10648       -21.70366 TRUE TRUE TRUE TRUE TRUE
## 13        x         42.53900       -19.24482 TRUE TRUE TRUE TRUE TRUE
## 14        a         44.42925       -19.23684 TRUE TRUE TRUE TRUE TRUE
## 15        u         48.71508       -19.50283 TRUE TRUE TRUE TRUE TRUE
## 16        r         42.03020       -24.37555 TRUE TRUE TRUE TRUE TRUE
## 17        k         42.52437       -19.41456 TRUE TRUE TRUE TRUE TRUE
## 18        j         43.68465       -25.56188 TRUE TRUE TRUE TRUE TRUE
## 19        k         43.50951       -24.39107 TRUE TRUE TRUE TRUE TRUE
## 20        u         45.78401       -16.86553 TRUE TRUE TRUE TRUE TRUE
## 21        w         44.38697       -16.53746 TRUE TRUE TRUE TRUE TRUE
## 22        m         44.21495       -14.75126 TRUE TRUE TRUE TRUE TRUE
## 23        w         45.91899       -19.84619 TRUE TRUE TRUE TRUE TRUE
## 24        j         49.44800       -13.22954 TRUE TRUE TRUE TRUE TRUE
## 25        o         47.19915       -23.81937 TRUE TRUE TRUE TRUE TRUE
## 26        c         48.33915       -18.87889 TRUE TRUE TRUE TRUE TRUE
## 27        n         43.36642       -12.04434 TRUE TRUE TRUE TRUE TRUE
## 28        e         47.02641       -18.71071 TRUE TRUE TRUE TRUE TRUE
## 29        y         44.49769       -21.45606 TRUE TRUE TRUE TRUE TRUE
## 30        q         49.46204       -11.50508 TRUE TRUE TRUE TRUE TRUE
## 31        p         47.45954       -23.91925 TRUE TRUE TRUE TRUE TRUE
## 32        w         50.81620       -13.85193 TRUE TRUE TRUE TRUE TRUE
## 33        j         42.02057       -18.59120 TRUE TRUE TRUE TRUE TRUE
## 34        j         47.95033       -22.09662 TRUE TRUE TRUE TRUE TRUE
## 35        a         49.01466       -19.52101 TRUE TRUE TRUE TRUE TRUE
## 36        j         48.74231       -20.13698 TRUE TRUE TRUE TRUE TRUE
## 37        f         42.97659       -11.28416 TRUE TRUE TRUE TRUE TRUE
## 38        g         43.76136       -13.61890 TRUE TRUE TRUE TRUE TRUE
## 39        g         49.57766       -12.60070 TRUE TRUE TRUE TRUE TRUE
## 40        y         46.32367       -14.09645 TRUE TRUE TRUE TRUE TRUE
## 41        n         47.06535       -23.20093 TRUE TRUE TRUE TRUE TRUE
## 42        r         45.97036       -21.66840 TRUE TRUE TRUE TRUE TRUE
## 43        r         45.25523       -16.44089 TRUE TRUE TRUE TRUE TRUE
## 44        g         48.99622       -23.53746 TRUE TRUE TRUE TRUE TRUE
## 45        w         49.54365       -14.29799 TRUE TRUE TRUE TRUE TRUE
## 46        v         50.21024       -16.20776 TRUE TRUE TRUE TRUE TRUE
## 47        y         50.61223       -12.84135 TRUE TRUE TRUE TRUE TRUE
## 48        r         43.79180       -24.42490 TRUE TRUE TRUE TRUE TRUE
## 49        o         48.26625       -22.92693 TRUE TRUE TRUE TRUE TRUE
## 50        e         49.23238       -11.06857 TRUE TRUE TRUE TRUE TRUE
## 51        o         47.86576       -18.18920 TRUE TRUE TRUE TRUE TRUE
## 52        h         50.23453       -24.45763 TRUE TRUE TRUE TRUE TRUE
## 53        w         44.99258       -13.44145 TRUE TRUE TRUE TRUE TRUE
## 54        b         49.84508       -18.81094 TRUE TRUE TRUE TRUE TRUE
## 55        f         50.01793       -24.46228 TRUE TRUE TRUE TRUE TRUE
## 56        n         44.84224       -13.18717 TRUE TRUE TRUE TRUE TRUE
## 57        a         46.83722       -14.55670 TRUE TRUE TRUE TRUE TRUE
## 58        i         46.04549       -17.19569 TRUE TRUE TRUE TRUE TRUE
## 59        z         43.11164       -19.64094 TRUE TRUE TRUE TRUE TRUE
## 60        t         46.17499       -22.27887 TRUE TRUE TRUE TRUE TRUE
## 61        a         44.89776       -16.40322 TRUE TRUE TRUE TRUE TRUE
## 62        r         44.56910       -12.66744 TRUE TRUE TRUE TRUE TRUE
## 63        r         48.29982       -22.92505 TRUE TRUE TRUE TRUE TRUE
## 64        y         42.61644       -17.59462 TRUE TRUE TRUE TRUE TRUE
## 65        v         49.91636       -21.74183 TRUE TRUE TRUE TRUE TRUE
## 66        x         49.09550       -17.40778 TRUE TRUE TRUE TRUE TRUE
## 67        l         46.92588       -25.30999 TRUE TRUE TRUE TRUE TRUE
## 68        l         42.51916       -22.83208 TRUE TRUE TRUE TRUE TRUE
## 69        q         49.79692       -12.01504 TRUE TRUE TRUE TRUE TRUE
## 70        w         48.12006       -18.64730 TRUE TRUE TRUE TRUE TRUE
## 71        w         44.10432       -16.25259 TRUE TRUE TRUE TRUE TRUE
## 72        d         48.16776       -22.87364 TRUE TRUE TRUE TRUE TRUE
## 73        s         48.93125       -13.53774 TRUE TRUE TRUE TRUE TRUE
## 74        w         47.91291       -15.93257 TRUE TRUE TRUE TRUE TRUE
## 75        w         49.21318       -22.99666 TRUE TRUE TRUE TRUE TRUE
## 76        y         49.13329       -19.77676 TRUE TRUE TRUE TRUE TRUE
## 77        v         48.58976       -17.13427 TRUE TRUE TRUE TRUE TRUE
## 78        o         49.32941       -21.18135 TRUE TRUE TRUE TRUE TRUE
## 79        a         44.61405       -11.29742 TRUE TRUE TRUE TRUE TRUE
## 80        s         45.94918       -14.73340 TRUE TRUE TRUE TRUE TRUE
## 81        v         48.42274       -18.94465 TRUE TRUE TRUE TRUE TRUE
## 82        a         46.40954       -22.93284 TRUE TRUE TRUE TRUE TRUE
## 83        w         49.99914       -24.95373 TRUE TRUE TRUE TRUE TRUE
## 84        y         50.32892       -18.29211 TRUE TRUE TRUE TRUE TRUE
## 85        r         47.65391       -19.76936 TRUE TRUE TRUE TRUE TRUE
## 86        p         46.64227       -13.90433 TRUE TRUE TRUE TRUE TRUE
## 87        z         47.25414       -24.93289 TRUE TRUE TRUE TRUE TRUE
## 88        s         42.07978       -11.53997 TRUE TRUE TRUE TRUE TRUE
## 89        x         49.38772       -14.72063 TRUE TRUE TRUE TRUE TRUE
## 90        z         49.23698       -23.91420 TRUE TRUE TRUE TRUE TRUE
## 91        x         48.39761       -22.32064 TRUE TRUE TRUE TRUE TRUE
## 92        o         44.07180       -23.31976 TRUE TRUE TRUE TRUE TRUE
## 93        w         45.60565       -24.31831 TRUE TRUE TRUE TRUE TRUE
## 94        u         43.90905       -12.39791 TRUE TRUE TRUE TRUE TRUE
## 95        s         42.95996       -18.67143 TRUE TRUE TRUE TRUE TRUE
## 96        h         45.83874       -15.22579 TRUE TRUE TRUE TRUE TRUE
## 97        r         43.21896       -12.55921 TRUE TRUE TRUE TRUE TRUE
## 98        x         48.69670       -14.70274 TRUE TRUE TRUE TRUE TRUE
## 99        a         46.72645       -20.22503 TRUE TRUE TRUE TRUE TRUE
## 100       p         46.34369       -14.23710 TRUE TRUE TRUE TRUE TRUE
## 101       e         46.53400       -25.20683 TRUE TRUE TRUE TRUE TRUE
## 102       y         42.16734       -18.63340 TRUE TRUE TRUE TRUE TRUE
## 103       z         49.72625       -19.44816 TRUE TRUE TRUE TRUE TRUE
## 104       i         50.45714       -22.12839 TRUE TRUE TRUE TRUE TRUE
## 105       g         45.33411       -14.00022 TRUE TRUE TRUE TRUE TRUE
## 106       i         45.56808       -25.69178 TRUE TRUE TRUE TRUE TRUE
## 107       z         50.04195       -15.04625 TRUE TRUE TRUE TRUE TRUE
## 108       t         44.57765       -20.25642 TRUE TRUE TRUE TRUE TRUE
## 109       b         43.97457       -22.08555 TRUE TRUE TRUE TRUE TRUE
## 110       s         48.81252       -12.32890 TRUE TRUE TRUE TRUE TRUE
## 111       v         44.39171       -24.10033 TRUE TRUE TRUE TRUE TRUE
## 112       d         46.99930       -11.00269 TRUE TRUE TRUE TRUE TRUE
## 113       m         44.60472       -11.05668 TRUE TRUE TRUE TRUE TRUE
## 114       b         50.33073       -12.06249 TRUE TRUE TRUE TRUE TRUE
## 115       y         49.89654       -15.21176 TRUE TRUE TRUE TRUE TRUE
## 116       r         46.61428       -22.01891 TRUE TRUE TRUE TRUE TRUE
## 117       s         46.38357       -23.96059 TRUE TRUE TRUE TRUE TRUE
## 118       s         47.78609       -19.11666 TRUE TRUE TRUE TRUE TRUE
## 119       l         44.13842       -16.00176 TRUE TRUE TRUE TRUE TRUE
## 120       n         49.53060       -20.58976 TRUE TRUE TRUE TRUE TRUE
## 121       l         42.98346       -17.63155 TRUE TRUE TRUE TRUE TRUE
## 122       v         48.48893       -20.20935 TRUE TRUE TRUE TRUE TRUE
## 123       c         49.46769       -13.10034 TRUE TRUE TRUE TRUE TRUE
## 124       s         44.60732       -14.53303 TRUE TRUE TRUE TRUE TRUE
## 125       q         49.28102       -12.04293 TRUE TRUE TRUE TRUE TRUE
## 126       y         48.81690       -16.71582 TRUE TRUE TRUE TRUE TRUE
## 127       z         45.20443       -24.20471 TRUE TRUE TRUE TRUE TRUE
## 128       n         47.42808       -18.53953 TRUE TRUE TRUE TRUE TRUE
## 129       w         44.47183       -15.44532 TRUE TRUE TRUE TRUE TRUE
## 130       f         45.10992       -20.40357 TRUE TRUE TRUE TRUE TRUE
## 131       q         48.11443       -19.13391 TRUE TRUE TRUE TRUE TRUE
## 132       e         43.38047       -14.91228 TRUE TRUE TRUE TRUE TRUE
## 133       v         43.21990       -18.42594 TRUE TRUE TRUE TRUE TRUE
## 134       n         49.87413       -25.31316 TRUE TRUE TRUE TRUE TRUE
## 135       l         44.37213       -25.26315 TRUE TRUE TRUE TRUE TRUE
## 136       r         42.83473       -24.40360 TRUE TRUE TRUE TRUE TRUE
## 137       g         49.25300       -24.54716 TRUE TRUE TRUE TRUE TRUE
## 138       i         42.77585       -11.90303 TRUE TRUE TRUE TRUE TRUE
## 139       i         50.25854       -23.59804 TRUE TRUE TRUE TRUE TRUE
## 140       q         42.75531       -23.96597 TRUE TRUE TRUE TRUE TRUE
## 141       r         43.26437       -15.46460 TRUE TRUE TRUE TRUE TRUE
## 142       w         42.75644       -14.36546 TRUE TRUE TRUE TRUE TRUE
## 143       d         50.32222       -15.65287 TRUE TRUE TRUE TRUE TRUE
## 144       t         43.83573       -16.41725 TRUE TRUE TRUE TRUE TRUE
## 145       e         42.40943       -19.19846 TRUE TRUE TRUE TRUE TRUE
## 146       v         48.22779       -16.12883 TRUE TRUE TRUE TRUE TRUE
## 147       o         44.19167       -22.04497 TRUE TRUE TRUE TRUE TRUE
## 148       w         48.27839       -12.47075 TRUE TRUE TRUE TRUE TRUE
## 149       p         44.54864       -22.23884 TRUE TRUE TRUE TRUE TRUE
## 150       p         45.10895       -22.66670 TRUE TRUE TRUE TRUE TRUE
## 151       j         48.77856       -22.59544 TRUE TRUE TRUE TRUE TRUE
## 152       h         42.64749       -12.14627 TRUE TRUE TRUE TRUE TRUE
## 153       j         45.44877       -16.74619 TRUE TRUE TRUE TRUE TRUE
## 154       r         46.09674       -21.13085 TRUE TRUE TRUE TRUE TRUE
## 155       e         42.41916       -11.89262 TRUE TRUE TRUE TRUE TRUE
## 156       p         43.58359       -15.86326 TRUE TRUE TRUE TRUE TRUE
## 157       n         49.67242       -11.90380 TRUE TRUE TRUE TRUE TRUE
## 158       a         43.45945       -11.39999 TRUE TRUE TRUE TRUE TRUE
## 159       v         48.76599       -18.19055 TRUE TRUE TRUE TRUE TRUE
## 160       c         45.27119       -20.09108 TRUE TRUE TRUE TRUE TRUE
## 161       t         45.91788       -15.96579 TRUE TRUE TRUE TRUE TRUE
## 162       q         49.44890       -22.24958 TRUE TRUE TRUE TRUE TRUE
## 163       q         47.54690       -15.58417 TRUE TRUE TRUE TRUE TRUE
## 164       y         47.04942       -12.43993 TRUE TRUE TRUE TRUE TRUE
## 165       z         42.23309       -17.92042 TRUE TRUE TRUE TRUE TRUE
## 166       p         50.60985       -11.75911 TRUE TRUE TRUE TRUE TRUE
## 167       y         47.95595       -13.23602 TRUE TRUE TRUE TRUE TRUE
## 168       x         46.23414       -21.91418 TRUE TRUE TRUE TRUE TRUE
## 169       t         49.96675       -19.39709 TRUE TRUE TRUE TRUE TRUE
## 170       p         50.24237       -22.72095 TRUE TRUE TRUE TRUE TRUE
## 171       g         44.65291       -24.30957 TRUE TRUE TRUE TRUE TRUE
## 172       v         42.48343       -23.99245 TRUE TRUE TRUE TRUE TRUE
## 173       b         48.72915       -20.04092 TRUE TRUE TRUE TRUE TRUE
## 174       r         50.63772       -24.46493 TRUE TRUE TRUE TRUE TRUE
## 175       e         48.79448       -20.57736 TRUE TRUE TRUE TRUE TRUE
## 176       z         45.17639       -15.87400 TRUE TRUE TRUE TRUE TRUE
## 177       j         45.47444       -17.59675 TRUE TRUE TRUE TRUE TRUE
## 178       h         46.25624       -22.35446 TRUE TRUE TRUE TRUE TRUE
## 179       r         50.90072       -13.08348 TRUE TRUE TRUE TRUE TRUE
## 180       k         42.20132       -18.76894 TRUE TRUE TRUE TRUE TRUE
## 181       f         48.02464       -14.76150 TRUE TRUE TRUE TRUE TRUE
## 182       h         44.99305       -11.14091 TRUE TRUE TRUE TRUE TRUE
## 183       w         43.22851       -23.48841 TRUE TRUE TRUE TRUE TRUE
## 184       y         42.41693       -20.62161 TRUE TRUE TRUE TRUE TRUE
## 185       n         46.49501       -22.67301 TRUE TRUE TRUE TRUE TRUE
## 186       r         47.57834       -16.55797 TRUE TRUE TRUE TRUE TRUE
## 187       u         45.85827       -11.12132 TRUE TRUE TRUE TRUE TRUE
## 188       m         42.51685       -12.47987 TRUE TRUE TRUE TRUE TRUE
## 189       w         43.76790       -19.56184 TRUE TRUE TRUE TRUE TRUE
## 190       s         50.93977       -22.50521 TRUE TRUE TRUE TRUE TRUE
## 191       u         48.59199       -20.75355 TRUE TRUE TRUE TRUE TRUE
## 192       l         43.96398       -25.97814 TRUE TRUE TRUE TRUE TRUE
## 193       f         47.34199       -18.34360 TRUE TRUE TRUE TRUE TRUE
## 194       i         48.90730       -19.90369 TRUE TRUE TRUE TRUE TRUE
## 195       v         43.46306       -14.18464 TRUE TRUE TRUE TRUE TRUE
## 196       c         46.77790       -17.54921 TRUE TRUE TRUE TRUE TRUE
## 197       c         43.89798       -20.27962 TRUE TRUE TRUE TRUE TRUE
## 198       f         44.19955       -25.45451 TRUE TRUE TRUE TRUE TRUE
## 199       p         47.15078       -19.13664 TRUE TRUE TRUE TRUE TRUE
## 200       c         42.29943       -25.85997 TRUE TRUE TRUE TRUE TRUE
## 201       l         48.77541       -19.93621 TRUE TRUE TRUE TRUE TRUE
## 202       r         50.46370       -12.67779 TRUE TRUE TRUE TRUE TRUE
## 203       o         48.01572       -21.25904 TRUE TRUE TRUE TRUE TRUE
## 204       v         47.23324       -25.42260 TRUE TRUE TRUE TRUE TRUE
## 205       t         48.37226       -11.93376 TRUE TRUE TRUE TRUE TRUE
## 206       j         49.69528       -14.67603 TRUE TRUE TRUE TRUE TRUE
## 207       z         46.06268       -22.15271 TRUE TRUE TRUE TRUE TRUE
## 208       b         49.97101       -13.88328 TRUE TRUE TRUE TRUE TRUE
## 209       l         44.11513       -19.65934 TRUE TRUE TRUE TRUE TRUE
## 210       t         50.39503       -15.55402 TRUE TRUE TRUE TRUE TRUE
## 211       p         48.25199       -11.88655 TRUE TRUE TRUE TRUE TRUE
## 212       z         45.90567       -23.02689 TRUE TRUE TRUE TRUE TRUE
## 213       c         46.36629       -12.06558 TRUE TRUE TRUE TRUE TRUE
## 214       y         49.39284       -22.12101 TRUE TRUE TRUE TRUE TRUE
## 215       t         43.52331       -21.02947 TRUE TRUE TRUE TRUE TRUE
## 216       h         44.92737       -16.40351 TRUE TRUE TRUE TRUE TRUE
## 217       h         49.05715       -23.06282 TRUE TRUE TRUE TRUE TRUE
## 218       g         47.47413       -16.59857 TRUE TRUE TRUE TRUE TRUE
## 219       n         45.92569       -14.90568 TRUE TRUE TRUE TRUE TRUE
## 220       u         48.18932       -12.54389 TRUE TRUE TRUE TRUE TRUE
## 221       h         50.77166       -22.98788 TRUE TRUE TRUE TRUE TRUE
## 222       f         47.33253       -20.36102 TRUE TRUE TRUE TRUE TRUE
## 223       d         44.83019       -24.31481 TRUE TRUE TRUE TRUE TRUE
## 224       g         45.10038       -20.41926 TRUE TRUE TRUE TRUE TRUE
## 225       n         42.45133       -19.85839 TRUE TRUE TRUE TRUE TRUE
## 226       p         46.50129       -14.18795 TRUE TRUE TRUE TRUE TRUE
## 227       v         43.63416       -19.66356 TRUE TRUE TRUE TRUE TRUE
## 228       q         43.27273       -12.86189 TRUE TRUE TRUE TRUE TRUE
## 229       m         48.22686       -14.01560 TRUE TRUE TRUE TRUE TRUE
## 230       y         50.69444       -20.24032 TRUE TRUE TRUE TRUE TRUE
## 231       j         42.89292       -15.93477 TRUE TRUE TRUE TRUE TRUE
## 232       m         50.95544       -16.66376 TRUE TRUE TRUE TRUE TRUE
## 233       q         50.89336       -14.72263 TRUE TRUE TRUE TRUE TRUE
## 234       t         47.02953       -16.85623 TRUE TRUE TRUE TRUE TRUE
## 235       z         48.86044       -13.09634 TRUE TRUE TRUE TRUE TRUE
## 236       x         43.37459       -25.73203 TRUE TRUE TRUE TRUE TRUE
## 237       a         45.16985       -24.38672 TRUE TRUE TRUE TRUE TRUE
## 238       f         45.17163       -14.54149 TRUE TRUE TRUE TRUE TRUE
## 239       l         47.70606       -15.85942 TRUE TRUE TRUE TRUE TRUE
## 240       h         45.40316       -24.54162 TRUE TRUE TRUE TRUE TRUE
## 241       b         45.74062       -23.28833 TRUE TRUE TRUE TRUE TRUE
## 242       c         42.57740       -19.61903 TRUE TRUE TRUE TRUE TRUE
## 243       u         44.87698       -22.39369 TRUE TRUE TRUE TRUE TRUE
## 244       s         50.90221       -21.67745 TRUE TRUE TRUE TRUE TRUE
## 245       s         45.40282       -14.35720 TRUE TRUE TRUE TRUE TRUE
## 246       y         49.04774       -13.51070 TRUE TRUE TRUE TRUE TRUE
## 247       m         42.53044       -22.77043 TRUE TRUE TRUE TRUE TRUE
## 248       k         42.19256       -11.68454 TRUE TRUE TRUE TRUE TRUE
## 249       p         45.07205       -13.95623 TRUE TRUE TRUE TRUE TRUE
## 250       r         50.82171       -15.64992 TRUE TRUE TRUE TRUE TRUE
##       sea   otl  gbf inst summary
## 1   FALSE  TRUE TRUE TRUE   FALSE
## 2    TRUE  TRUE TRUE TRUE    TRUE
## 3   FALSE  TRUE TRUE TRUE   FALSE
## 4   FALSE  TRUE TRUE TRUE   FALSE
## 5   FALSE  TRUE TRUE TRUE   FALSE
## 6   FALSE  TRUE TRUE TRUE   FALSE
## 7   FALSE  TRUE TRUE TRUE   FALSE
## 8   FALSE  TRUE TRUE TRUE   FALSE
## 9   FALSE FALSE TRUE TRUE   FALSE
## 10   TRUE  TRUE TRUE TRUE    TRUE
## 11  FALSE  TRUE TRUE TRUE   FALSE
## 12  FALSE  TRUE TRUE TRUE   FALSE
## 13  FALSE  TRUE TRUE TRUE   FALSE
## 14   TRUE  TRUE TRUE TRUE    TRUE
## 15   TRUE  TRUE TRUE TRUE    TRUE
## 16  FALSE  TRUE TRUE TRUE   FALSE
## 17  FALSE  TRUE TRUE TRUE   FALSE
## 18  FALSE  TRUE TRUE TRUE   FALSE
## 19  FALSE  TRUE TRUE TRUE   FALSE
## 20   TRUE FALSE TRUE TRUE   FALSE
## 21  FALSE  TRUE TRUE TRUE   FALSE
## 22  FALSE  TRUE TRUE TRUE   FALSE
## 23   TRUE  TRUE TRUE TRUE    TRUE
## 24   TRUE  TRUE TRUE TRUE    TRUE
## 25   TRUE  TRUE TRUE TRUE    TRUE
## 26   TRUE  TRUE TRUE TRUE    TRUE
## 27  FALSE  TRUE TRUE TRUE   FALSE
## 28   TRUE  TRUE TRUE TRUE    TRUE
## 29   TRUE  TRUE TRUE TRUE    TRUE
## 30  FALSE  TRUE TRUE TRUE   FALSE
## 31   TRUE  TRUE TRUE TRUE    TRUE
## 32  FALSE  TRUE TRUE TRUE   FALSE
## 33  FALSE  TRUE TRUE TRUE   FALSE
## 34   TRUE  TRUE TRUE TRUE    TRUE
## 35  FALSE  TRUE TRUE TRUE   FALSE
## 36  FALSE  TRUE TRUE TRUE   FALSE
## 37  FALSE  TRUE TRUE TRUE   FALSE
## 38  FALSE  TRUE TRUE TRUE   FALSE
## 39   TRUE  TRUE TRUE TRUE    TRUE
## 40  FALSE  TRUE TRUE TRUE   FALSE
## 41   TRUE  TRUE TRUE TRUE    TRUE
## 42   TRUE  TRUE TRUE TRUE    TRUE
## 43   TRUE  TRUE TRUE TRUE    TRUE
## 44  FALSE  TRUE TRUE TRUE   FALSE
## 45   TRUE  TRUE TRUE TRUE    TRUE
## 46  FALSE  TRUE TRUE TRUE   FALSE
## 47  FALSE  TRUE TRUE TRUE   FALSE
## 48   TRUE  TRUE TRUE TRUE    TRUE
## 49  FALSE  TRUE TRUE TRUE   FALSE
## 50  FALSE  TRUE TRUE TRUE   FALSE
## 51   TRUE  TRUE TRUE TRUE    TRUE
## 52  FALSE  TRUE TRUE TRUE   FALSE
## 53  FALSE  TRUE TRUE TRUE   FALSE
## 54  FALSE  TRUE TRUE TRUE   FALSE
## 55  FALSE  TRUE TRUE TRUE   FALSE
## 56  FALSE  TRUE TRUE TRUE   FALSE
## 57  FALSE  TRUE TRUE TRUE   FALSE
## 58   TRUE  TRUE TRUE TRUE    TRUE
## 59  FALSE  TRUE TRUE TRUE   FALSE
## 60   TRUE  TRUE TRUE TRUE    TRUE
## 61   TRUE  TRUE TRUE TRUE    TRUE
## 62  FALSE  TRUE TRUE TRUE   FALSE
## 63  FALSE  TRUE TRUE TRUE   FALSE
## 64  FALSE  TRUE TRUE TRUE   FALSE
## 65  FALSE  TRUE TRUE TRUE   FALSE
## 66   TRUE  TRUE TRUE TRUE    TRUE
## 67  FALSE  TRUE TRUE TRUE   FALSE
## 68  FALSE  TRUE TRUE TRUE   FALSE
## 69  FALSE  TRUE TRUE TRUE   FALSE
## 70   TRUE  TRUE TRUE TRUE    TRUE
## 71  FALSE  TRUE TRUE TRUE   FALSE
## 72  FALSE  TRUE TRUE TRUE   FALSE
## 73   TRUE  TRUE TRUE TRUE    TRUE
## 74   TRUE  TRUE TRUE TRUE    TRUE
## 75  FALSE  TRUE TRUE TRUE   FALSE
## 76  FALSE  TRUE TRUE TRUE   FALSE
## 77   TRUE  TRUE TRUE TRUE    TRUE
## 78  FALSE  TRUE TRUE TRUE   FALSE
## 79  FALSE  TRUE TRUE TRUE   FALSE
## 80  FALSE  TRUE TRUE TRUE   FALSE
## 81   TRUE  TRUE TRUE TRUE    TRUE
## 82   TRUE  TRUE TRUE TRUE    TRUE
## 83  FALSE  TRUE TRUE TRUE   FALSE
## 84  FALSE  TRUE TRUE TRUE   FALSE
## 85   TRUE  TRUE TRUE TRUE    TRUE
## 86  FALSE  TRUE TRUE TRUE   FALSE
## 87  FALSE  TRUE TRUE TRUE   FALSE
## 88  FALSE  TRUE TRUE TRUE   FALSE
## 89   TRUE  TRUE TRUE TRUE    TRUE
## 90  FALSE  TRUE TRUE TRUE   FALSE
## 91  FALSE  TRUE TRUE TRUE   FALSE
## 92   TRUE  TRUE TRUE TRUE    TRUE
## 93   TRUE  TRUE TRUE TRUE    TRUE
## 94  FALSE  TRUE TRUE TRUE   FALSE
## 95  FALSE  TRUE TRUE TRUE   FALSE
## 96  FALSE  TRUE TRUE TRUE   FALSE
## 97  FALSE  TRUE TRUE TRUE   FALSE
## 98   TRUE  TRUE TRUE TRUE    TRUE
## 99   TRUE  TRUE TRUE TRUE    TRUE
## 100 FALSE  TRUE TRUE TRUE   FALSE
## 101 FALSE  TRUE TRUE TRUE   FALSE
## 102 FALSE  TRUE TRUE TRUE   FALSE
## 103 FALSE  TRUE TRUE TRUE   FALSE
## 104 FALSE  TRUE TRUE TRUE   FALSE
## 105 FALSE  TRUE TRUE TRUE   FALSE
## 106 FALSE  TRUE TRUE TRUE   FALSE
## 107  TRUE  TRUE TRUE TRUE    TRUE
## 108  TRUE  TRUE TRUE TRUE    TRUE
## 109  TRUE  TRUE TRUE TRUE    TRUE
## 110 FALSE  TRUE TRUE TRUE   FALSE
## 111  TRUE  TRUE TRUE TRUE    TRUE
## 112 FALSE  TRUE TRUE TRUE   FALSE
## 113 FALSE  TRUE TRUE TRUE   FALSE
## 114 FALSE  TRUE TRUE TRUE   FALSE
## 115  TRUE  TRUE TRUE TRUE    TRUE
## 116  TRUE  TRUE TRUE TRUE    TRUE
## 117  TRUE  TRUE TRUE TRUE    TRUE
## 118  TRUE  TRUE TRUE TRUE    TRUE
## 119 FALSE  TRUE TRUE TRUE   FALSE
## 120 FALSE  TRUE TRUE TRUE   FALSE
## 121 FALSE  TRUE TRUE TRUE   FALSE
## 122  TRUE  TRUE TRUE TRUE    TRUE
## 123  TRUE  TRUE TRUE TRUE    TRUE
## 124 FALSE  TRUE TRUE TRUE   FALSE
## 125 FALSE  TRUE TRUE TRUE   FALSE
## 126  TRUE  TRUE TRUE TRUE    TRUE
## 127  TRUE  TRUE TRUE TRUE    TRUE
## 128  TRUE  TRUE TRUE TRUE    TRUE
## 129 FALSE  TRUE TRUE TRUE   FALSE
## 130  TRUE  TRUE TRUE TRUE    TRUE
## 131  TRUE  TRUE TRUE TRUE    TRUE
## 132 FALSE  TRUE TRUE TRUE   FALSE
## 133 FALSE  TRUE TRUE TRUE   FALSE
## 134 FALSE  TRUE TRUE TRUE   FALSE
## 135 FALSE  TRUE TRUE TRUE   FALSE
## 136 FALSE  TRUE TRUE TRUE   FALSE
## 137 FALSE  TRUE TRUE TRUE   FALSE
## 138 FALSE  TRUE TRUE TRUE   FALSE
## 139 FALSE  TRUE TRUE TRUE   FALSE
## 140 FALSE  TRUE TRUE TRUE   FALSE
## 141 FALSE  TRUE TRUE TRUE   FALSE
## 142 FALSE  TRUE TRUE TRUE   FALSE
## 143  TRUE  TRUE TRUE TRUE    TRUE
## 144 FALSE  TRUE TRUE TRUE   FALSE
## 145 FALSE  TRUE TRUE TRUE   FALSE
## 146  TRUE  TRUE TRUE TRUE    TRUE
## 147  TRUE  TRUE TRUE TRUE    TRUE
## 148 FALSE  TRUE TRUE TRUE   FALSE
## 149  TRUE  TRUE TRUE TRUE    TRUE
## 150  TRUE  TRUE TRUE TRUE    TRUE
## 151 FALSE  TRUE TRUE TRUE   FALSE
## 152 FALSE  TRUE TRUE TRUE   FALSE
## 153  TRUE  TRUE TRUE TRUE    TRUE
## 154  TRUE  TRUE TRUE TRUE    TRUE
## 155 FALSE  TRUE TRUE TRUE   FALSE
## 156 FALSE  TRUE TRUE TRUE   FALSE
## 157 FALSE  TRUE TRUE TRUE   FALSE
## 158 FALSE  TRUE TRUE TRUE   FALSE
## 159  TRUE  TRUE TRUE TRUE    TRUE
## 160  TRUE  TRUE TRUE TRUE    TRUE
## 161  TRUE  TRUE TRUE TRUE    TRUE
## 162 FALSE  TRUE TRUE TRUE   FALSE
## 163  TRUE  TRUE TRUE TRUE    TRUE
## 164 FALSE  TRUE TRUE TRUE   FALSE
## 165 FALSE  TRUE TRUE TRUE   FALSE
## 166 FALSE  TRUE TRUE TRUE   FALSE
## 167 FALSE  TRUE TRUE TRUE   FALSE
## 168  TRUE  TRUE TRUE TRUE    TRUE
## 169 FALSE  TRUE TRUE TRUE   FALSE
## 170 FALSE  TRUE TRUE TRUE   FALSE
## 171  TRUE  TRUE TRUE TRUE    TRUE
## 172 FALSE  TRUE TRUE TRUE   FALSE
## 173  TRUE  TRUE TRUE TRUE    TRUE
## 174 FALSE  TRUE TRUE TRUE   FALSE
## 175 FALSE  TRUE TRUE TRUE   FALSE
## 176 FALSE  TRUE TRUE TRUE   FALSE
## 177  TRUE  TRUE TRUE TRUE    TRUE
## 178  TRUE  TRUE TRUE TRUE    TRUE
## 179 FALSE  TRUE TRUE TRUE   FALSE
## 180 FALSE  TRUE TRUE TRUE   FALSE
## 181  TRUE  TRUE TRUE TRUE    TRUE
## 182 FALSE  TRUE TRUE TRUE   FALSE
## 183 FALSE  TRUE TRUE TRUE   FALSE
## 184 FALSE  TRUE TRUE TRUE   FALSE
## 185  TRUE  TRUE TRUE TRUE    TRUE
## 186  TRUE  TRUE TRUE TRUE    TRUE
## 187 FALSE  TRUE TRUE TRUE   FALSE
## 188 FALSE  TRUE TRUE TRUE   FALSE
## 189 FALSE  TRUE TRUE TRUE   FALSE
## 190 FALSE  TRUE TRUE TRUE   FALSE
## 191 FALSE  TRUE TRUE TRUE   FALSE
## 192 FALSE  TRUE TRUE TRUE   FALSE
## 193  TRUE  TRUE TRUE TRUE    TRUE
## 194 FALSE  TRUE TRUE TRUE   FALSE
## 195 FALSE  TRUE TRUE TRUE   FALSE
## 196  TRUE  TRUE TRUE TRUE    TRUE
## 197 FALSE  TRUE TRUE TRUE   FALSE
## 198 FALSE  TRUE TRUE TRUE   FALSE
## 199  TRUE  TRUE TRUE TRUE    TRUE
## 200 FALSE  TRUE TRUE TRUE   FALSE
## 201  TRUE  TRUE TRUE TRUE    TRUE
## 202 FALSE  TRUE TRUE TRUE   FALSE
## 203  TRUE  TRUE TRUE TRUE    TRUE
## 204 FALSE  TRUE TRUE TRUE   FALSE
## 205 FALSE  TRUE TRUE TRUE   FALSE
## 206  TRUE  TRUE TRUE TRUE    TRUE
## 207  TRUE  TRUE TRUE TRUE    TRUE
## 208  TRUE  TRUE TRUE TRUE    TRUE
## 209 FALSE  TRUE TRUE TRUE   FALSE
## 210  TRUE  TRUE TRUE TRUE    TRUE
## 211 FALSE  TRUE TRUE TRUE   FALSE
## 212  TRUE  TRUE TRUE TRUE    TRUE
## 213 FALSE  TRUE TRUE TRUE   FALSE
## 214 FALSE  TRUE TRUE TRUE   FALSE
## 215 FALSE  TRUE TRUE TRUE   FALSE
## 216  TRUE  TRUE TRUE TRUE    TRUE
## 217 FALSE  TRUE TRUE TRUE   FALSE
## 218  TRUE  TRUE TRUE TRUE    TRUE
## 219 FALSE  TRUE TRUE TRUE   FALSE
## 220 FALSE  TRUE TRUE TRUE   FALSE
## 221 FALSE  TRUE TRUE TRUE   FALSE
## 222  TRUE  TRUE TRUE TRUE    TRUE
## 223  TRUE  TRUE TRUE TRUE    TRUE
## 224  TRUE  TRUE TRUE TRUE    TRUE
## 225 FALSE  TRUE TRUE TRUE   FALSE
## 226 FALSE  TRUE TRUE TRUE   FALSE
## 227 FALSE  TRUE TRUE TRUE   FALSE
## 228 FALSE  TRUE TRUE TRUE   FALSE
## 229  TRUE  TRUE TRUE TRUE    TRUE
## 230 FALSE  TRUE TRUE TRUE   FALSE
## 231 FALSE  TRUE TRUE TRUE   FALSE
## 232 FALSE  TRUE TRUE TRUE   FALSE
## 233 FALSE  TRUE TRUE TRUE   FALSE
## 234  TRUE  TRUE TRUE TRUE    TRUE
## 235  TRUE  TRUE TRUE TRUE    TRUE
## 236 FALSE  TRUE TRUE TRUE   FALSE
## 237  TRUE  TRUE TRUE TRUE    TRUE
## 238 FALSE  TRUE TRUE TRUE   FALSE
## 239  TRUE  TRUE TRUE TRUE    TRUE
## 240  TRUE  TRUE TRUE TRUE    TRUE
## 241  TRUE  TRUE TRUE TRUE    TRUE
## 242 FALSE  TRUE TRUE TRUE   FALSE
## 243  TRUE FALSE TRUE TRUE   FALSE
## 244 FALSE  TRUE TRUE TRUE   FALSE
## 245 FALSE  TRUE TRUE TRUE   FALSE
## 246  TRUE  TRUE TRUE TRUE    TRUE
## 247 FALSE  TRUE TRUE TRUE   FALSE
## 248 FALSE  TRUE TRUE TRUE   FALSE
## 249 FALSE  TRUE TRUE TRUE   FALSE
## 250 FALSE  TRUE TRUE TRUE   FALSE

Test          Default radius [°]  Default radius (lat 0°/45°) [km]
capitals      0.05                5.5/4
centroids     0.01                1.1/0.8
gbif          1                   63.0
institutions  0.001               0.1/0.08
zeros         0.5                 55.6
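
The kilometre values for the capitals, centroids and institutions tests can be approximated from the radius in degrees. The following is a rough sketch only, assuming a spherical Earth on which one degree of longitude spans about 111.32 km at the equator and shrinks with the cosine of the latitude. The helper name deg_to_km is hypothetical and not part of CoordinateCleaner.

```r
# Approximate east-west extent (in km) of a radius given in degrees,
# assuming a spherical Earth (about 111.32 km per degree at the equator).
# deg_to_km is a hypothetical helper, not a CoordinateCleaner function.
deg_to_km <- function(deg, lat = 0, km_per_deg = 111.32) {
  deg * km_per_deg * cos(lat * pi / 180)
}

# default radii (in degrees) taken from the table above
radii <- c(capitals = 0.05, centroids = 0.01, institutions = 0.001)

km_equator <- deg_to_km(radii, lat = 0)
km_lat45 <- deg_to_km(radii, lat = 45)

# one row per latitude, one column per test
round(rbind(km_equator, km_lat45), 2)
```

The same radius in degrees covers a smaller distance at higher latitudes, which is worth keeping in mind when adapting the thresholds for data sets far from the equator.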

Custom gazetteers

You can use custom gazetteers for all clean_coordinates tests via the .ref arguments of the function. For example, the capitals.ref argument controls the reference for the capitals test. Custom reference data must follow the same format as the default reference for the same test. You can check the structure of a gazetteer in its documentation or by inspecting the gazetteer itself (e.g. head(countryref)). For example:

#check the format of the default capitals reference
head(countryref) #a data.frame with four columns: ISO3, capital, longitude, latitude

#create new reference data set from scratch. For real analysis you 
#probably want to load the alternative file from a .txt file
my.cap <- data.frame(ISO3 = LETTERS[1:10],
                     capital = letters[1:10],
                     capital.longitude = runif(10, -180, 180),
                     capital.latitude = runif(10, -90, 90))

flags <- clean_coordinates(exmpl, capitals.ref = my.cap)

In this way, tests can be completely customized. For example, you could provide a gazetteer with the locations of hardware stores (in the capitals format) if you want to flag records around hardware stores.
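
For a real analysis, the custom gazetteer would typically be read from a text file and checked against the default format before use. The sketch below uses base R only. The file is a temporary stand-in, default_ref is a small made-up table that mimics the four-column capitals format used above, and same_format is a hypothetical helper, not a CoordinateCleaner function.

```r
# Stand-in for the default capitals reference (made-up rows, four columns
# as in the example above); in a real session inspect the packaged reference
default_ref <- data.frame(ISO3 = c("MDG", "FRA"),
                          capital = c("Antananarivo", "Paris"),
                          capital.longitude = c(47.52, 2.35),
                          capital.latitude = c(-18.91, 48.86),
                          stringsAsFactors = FALSE)

# Write a custom gazetteer to a tab-separated .txt file (temporary file here)
gaz_file <- tempfile(fileext = ".txt")
my_places <- data.frame(ISO3 = c("AAA", "BBB"),
                        capital = c("store 1", "store 2"),
                        capital.longitude = c(47.5, 49.1),
                        capital.latitude = c(-18.9, -15.2),
                        stringsAsFactors = FALSE)
write.table(my_places, gaz_file, sep = "\t", row.names = FALSE, quote = FALSE)

# Read the gazetteer back in, as you would with your own file
my_ref <- read.delim(gaz_file, stringsAsFactors = FALSE)

# Hypothetical helper: same column names and column classes, in the same order
same_format <- function(custom, default) {
  identical(names(custom), names(default)) &&
    identical(unname(sapply(custom, class)), unname(sapply(default, class)))
}

format_ok <- same_format(my_ref, default_ref)
format_ok
```

If the check fails, compare names() and the column classes of both tables to see which column differs before passing the custom table to a .ref argument.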

Classes of the default gazetteers of clean_coordinates.

Test          Default gazetteer                                                   Class                     Argument
capitals      countryref                                                          data.frame                capitals.ref
centroids     countryref                                                          data.frame                centroids.ref
countries     rnaturalearth::ne_countries(scale = "medium")                       SpatialPolygonsDataFrame  country.ref
institutions  institutions                                                        data.frame                inst.ref
seas          landmass                                                            SpatialPolygonsDataFrame  seas.ref
urban         rnaturalearth::ne_download(scale = 'medium', type = 'urban_areas')  SpatialPolygonsDataFrame  urban.ref

Summary and visualization

You can easily summarize the results of clean_coordinates either with the report option or via summary. If report = TRUE, the summary is written to the working directory as a .txt file; if report is a character string, it is used as the path to which the summary file is written. Alternatively, you can get the number of records flagged by each test with summary.

#via the report option
flags <- clean_coordinates(exmpl, report = TRUE)
## Testing coordinate validity
## Flagged 0 records.
## Testing equal lat/lon
## Flagged 0 records.
## Testing zero coordinates
## Flagged 0 records.
## Testing country capitals
## Flagged 0 records.
## Testing country centroids
## Flagged 0 records.
## Testing sea coordinates
## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\alexander.zizka\AppData\Local\Temp\Rtmpqc5aFR", layer: "ne_50m_land"
## with 1420 features
## It has 3 fields
## Integer64 fields read as strings:  scalerank
## Flagged 156 records.
## Testing geographic outliers
## Flagged 3 records.
## Testing GBIF headquarters, flagging records around Copenhagen
## Flagged 0 records.
## Testing biodiversity institutions
## Flagged 0 records.
## Flagged 158 of 250 records, EQ = 0.63.

#via summary
summary(flags)
## decimallatitude             val             equ             zer 
##               0               0               0               0 
##             cap             cen             sea             otl 
##               0               0             156               3 
##             gbf            inst         summary 
##               0               0             158
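
Judging from the output above, the numbers returned by summary are the counts of records that failed each test, that is, the number of FALSE values per flag column. The following base R sketch reproduces that logic on an invented flag table. The column names mirror the output above, and the values are made up for illustration.

```r
# Invented flag table: TRUE = record passed the test, FALSE = record flagged
toy_flags <- data.frame(sea = c(TRUE, FALSE, FALSE, TRUE, FALSE),
                        otl = c(TRUE, TRUE, FALSE, TRUE, TRUE))

# a record passes overall only if it passes every individual test
toy_flags$summary <- toy_flags$sea & toy_flags$otl

# number of flagged records per test and overall
n_flagged <- colSums(!toy_flags)
n_flagged

# share of records flagged by at least one test
share_flagged <- n_flagged[["summary"]] / nrow(toy_flags)
share_flagged
```

In the run above, 158 of 250 records were flagged, which corresponds to a share of about 0.63.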

Exclude flagged records

The output of clean_coordinates is in the same order as the input, thus you can easily exclude flagged records.

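#exclude records flagged by any test
#(flag columns are named as in the summary output above; some package
#versions prefix them with a dot, e.g. .summary)
clean <- exmpl[flags$summary, ]

#exclude records flagged by the centroids test
clean <- exmpl[flags$cen, ]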
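
Because each flag is a logical vector aligned with the input rows (FALSE marks a flagged record), it belongs in the row position of the subset. A minimal base R sketch with an invented data set and an invented flag vector illustrates how to keep the clean records and how to set the flagged ones aside for inspection:

```r
# Invented occurrence records; the third one has zero coordinates
toy <- data.frame(species = c("a", "b", "c", "d"),
                  decimallongitude = c(47.1, 48.3, 0, 49.9),
                  decimallatitude = c(-18.2, -20.5, 0, -12.4),
                  stringsAsFactors = FALSE)

# Invented summary flag, in the same order as the rows of toy
toy_summary <- c(TRUE, TRUE, FALSE, TRUE)

# keep records that passed all tests (logical vector in the row position)
clean_toy <- toy[toy_summary, ]

# keep the flagged records separately, e.g. for manual checking
flagged_toy <- toy[!toy_summary, ]
```

Keeping the flagged records in a separate object rather than discarding them makes it easier to review whether a flag points to a genuine error.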