The clean_coordinates function enables fast, automated and reproducible flagging of potentially erroneous occurrence coordinates based on geographic gazetteers. It flags records that show problems known to be common in biological collections.

Individual tests can be switched on and off via the tests argument (see ?clean_coordinates), and the distance thresholds of the tests can be adapted. Custom gazetteers can be provided for the cleaning (for a higher level of detail).

Please find a detailed tutorial on how to clean occurrence records (e.g. from GBIF) here and how to clean fossil data (e.g. from PBDB) here.

Switch individual tests on/off

clean_coordinates wraps multiple tests for common error sources in species distribution records. Individual tests can be included in or excluded from a run with the tests argument of clean_coordinates: a test is run only if its name is listed, e.g. including "seas" switches the seas test on and leaving it out switches the test off. Most basic tests are switched on by default, but some of the more complex ones are switched off by default.

library(CoordinateCleaner)
## Registered S3 method overwritten by 'dplyr':
##   method               from  
##   as.data.frame.tbl_df tibble
## Registered S3 methods overwritten by 'ggplot2':
##   method         from 
##   [.quosures     rlang
##   c.quosures     rlang
##   print.quosures rlang

exmpl <- data.frame(species = sample(letters, size = 250, replace = TRUE),
                   decimallongitude = runif(250, min = 42, max = 51),
                   decimallatitude = runif(250, min = -26, max = -11),
                   countries = "MDG")


#run all tests
dat <- clean_coordinates(exmpl, tests = c("capitals", "centroids", "countries", "equal", "gbif", "institutions", "outliers", "seas", "urban", "zeros"), countries = "countries")
## Testing coordinate validity
## Flagged 0 records.
## Testing equal lat/lon
## Flagged 0 records.
## Testing zero coordinates
## Flagged 0 records.
## Testing country capitals
## Flagged 0 records.
## Testing country centroids
## Flagged 0 records.
## Testing sea coordinates
## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\alexander.zizka\AppData\Local\Temp\Rtmp8kHf7i", layer: "ne_50m_land"
## with 1420 features
## It has 3 fields
## Integer64 fields read as strings:  scalerank
## Flagged 160 records.
## Testing urban areas
## No reference for urban areas found. 
##             Using rnaturalearth to download.
## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\alexander.zizka\AppData\Local\Temp\Rtmp8kHf7i", layer: "ne_50m_urban_areas"
## with 2143 features
## It has 4 fields
## Integer64 fields read as strings:  scalerank
## Flagged 0 records.
## Testing country identity
## Flagged 161 records.
## Testing geographic outliers
## Flagged 4 records.
## Testing GBIF headquarters, flagging records around Copenhagen
## Flagged 0 records.
## Testing biodiversity institutions
## Flagged 0 records.
## Flagged 162 of 250 records, EQ = 0.65.
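
The object returned by a run like the one above can be filtered on its flag columns. The following is a minimal sketch that uses a small mock data frame instead of real clean_coordinates output, so it runs without the package. The column names sea, otl and summary are taken from the printed result later in this document; the three records and their values are made up for illustration, and the reading "TRUE = passed, FALSE = flagged" is inferred from that printed result.

```r
# Mock of a clean_coordinates() result with three records (hypothetical values).
# In the flag columns TRUE means the record passed, FALSE means it was flagged.
flags <- data.frame(
  species = c("a", "b", "c"),
  decimallongitude = c(46.5, 43.9, 45.1),
  decimallatitude = c(-19.6, -18.7, -19.8),
  sea = c(TRUE, FALSE, TRUE),
  otl = c(TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# A record passes overall only if it passes every individual test
flags$summary <- flags$sea & flags$otl

# Number of records flagged by each test and overall
n_flagged <- colSums(!flags[, c("sea", "otl", "summary")])
print(n_flagged)

# Keep only the records that passed all tests
cleaned <- flags[flags$summary, ]
print(cleaned)

# Share of flagged records
share_flagged <- mean(!flags$summary)
print(share_flagged)
```

With real output, the same subsetting would be dat[dat$summary, ]. The EQ value in the console messages appears to correspond to this share of flagged records (162 of 250 is about 0.65).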

#only run the validity test
dat <- clean_coordinates(exmpl, tests = c(""))
## Testing coordinate validity
## Flagged 0 records.
## Flagged 0 of 250 records, EQ = 0.

| Test | Function | Background | Default |
|---|---|---|---|
| capitals | radius around capitals | georeferenced from location description | on |
| centroids | radius around country and province centroids | geo-referenced from description | on |
| countries | coordinates in the right country | switched lon/lat, data entry errors | off |
| duplicates | records from one species with identical coordinates | repetitive observation of identical individual, same voucher from multiple data sources, genetic data | off |
| gbif | radius around GBIF headquarters | data entry errors, falsely geo-referenced | on |
| institutions | radius around biodiversity institutions | falsely geo-referenced, zoo or garden records | on |
| outliers | records far away from all other records of this species | various | on |
| seas | in the sea | switched lon/lat | on |
| urban | within urban area | cultivated/captivity | off |
| validity | outside reference coordinate system | missing data, data entry errors | on |
| zeros | plain zeros, lat = lon | missing data, data entry errors | on |
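
Because the tests argument is a plain character vector, a custom selection can be built with base R set operations. The sketch below starts from the test names used in the "run all tests" call above and only constructs the vectors; the variable names all_tests, my_tests and more_tests are illustrative, and the final clean_coordinates call is shown as a comment because it needs the package and the exmpl data.

```r
# Test names taken from the "run all tests" call earlier in this document
all_tests <- c("capitals", "centroids", "countries", "equal", "gbif",
               "institutions", "outliers", "seas", "urban", "zeros")

# Drop the tests that are not wanted for this run (here: seas and urban)
my_tests <- setdiff(all_tests, c("seas", "urban"))
print(my_tests)

# Add a test that is listed in the table but not in the call above
more_tests <- union(my_tests, "duplicates")
print(more_tests)

# The resulting vector would then be passed on, e.g.
# dat <- clean_coordinates(exmpl, tests = my_tests, countries = "countries")
```

Keeping the selection in a named vector makes it easy to record in a script which tests were run for a given analysis.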

Custom test radii for capitals, centroids and institutions

The capitals, centroids and institutions tests use a radius around gazetteer entries to flag coordinates. You can change this radius for each test using the corresponding _rad argument (e.g. capitals_rad). The radius is specified in decimal degrees, which means that the actual size of the radius in meters varies with latitude.

clean_coordinates(exmpl, capitals_rad = 0.1)
## Testing coordinate validity
## Flagged 0 records.
## Testing equal lat/lon
## Flagged 0 records.
## Testing zero coordinates
## Flagged 0 records.
## Testing country capitals
## Flagged 0 records.
## Testing country centroids
## Flagged 0 records.
## Testing sea coordinates
## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\alexander.zizka\AppData\Local\Temp\Rtmp8kHf7i", layer: "ne_50m_land"
## with 1420 features
## It has 3 fields
## Integer64 fields read as strings:  scalerank
## Flagged 160 records.
## Testing geographic outliers
## Flagged 4 records.
## Testing GBIF headquarters, flagging records around Copenhagen
## Flagged 0 records.
## Testing biodiversity institutions
## Flagged 0 records.
## Flagged 161 of 250 records, EQ = 0.64.
##     species decimallongitude decimallatitude  val  equ  zer  cap  cen
## 1         x         43.93877       -18.72588 TRUE TRUE TRUE TRUE TRUE
## 2         e         48.82225       -22.88105 TRUE TRUE TRUE TRUE TRUE
## 3         o         46.47323       -22.27676 TRUE TRUE TRUE TRUE TRUE
## 4         l         47.43149       -23.27714 TRUE TRUE TRUE TRUE TRUE
## 5         r         47.25910       -11.32719 TRUE TRUE TRUE TRUE TRUE
## 6         u         49.72204       -21.32244 TRUE TRUE TRUE TRUE TRUE
## 7         l         42.09530       -15.79168 TRUE TRUE TRUE TRUE TRUE
## 8         v         47.96785       -12.74394 TRUE TRUE TRUE TRUE TRUE
## 9         w         43.86164       -21.32102 TRUE TRUE TRUE TRUE TRUE
## 10        q         49.26223       -13.48887 TRUE TRUE TRUE TRUE TRUE
## 11        d         42.49719       -21.85377 TRUE TRUE TRUE TRUE TRUE
## 12        k         42.32765       -23.56994 TRUE TRUE TRUE TRUE TRUE
## 13        z         50.25511       -24.57668 TRUE TRUE TRUE TRUE TRUE
## 14        h         48.61500       -23.23414 TRUE TRUE TRUE TRUE TRUE
## 15        k         48.32440       -15.88720 TRUE TRUE TRUE TRUE TRUE
## 16        o         48.59958       -11.26924 TRUE TRUE TRUE TRUE TRUE
## 17        u         45.97133       -16.32862 TRUE TRUE TRUE TRUE TRUE
## 18        d         42.64345       -17.43725 TRUE TRUE TRUE TRUE TRUE
## 19        g         48.19080       -24.44149 TRUE TRUE TRUE TRUE TRUE
## 20        l         46.31021       -21.90993 TRUE TRUE TRUE TRUE TRUE
## 21        a         50.51379       -13.56122 TRUE TRUE TRUE TRUE TRUE
## 22        z         44.61765       -15.14601 TRUE TRUE TRUE TRUE TRUE
## 23        t         48.13864       -15.00901 TRUE TRUE TRUE TRUE TRUE
## 24        z         47.95719       -22.63157 TRUE TRUE TRUE TRUE TRUE
## 25        y         45.25720       -11.11015 TRUE TRUE TRUE TRUE TRUE
## 26        y         46.58291       -19.63427 TRUE TRUE TRUE TRUE TRUE
## 27        g         43.99849       -15.19968 TRUE TRUE TRUE TRUE TRUE
## 28        x         46.07308       -16.08053 TRUE TRUE TRUE TRUE TRUE
## 29        x         49.15344       -17.10823 TRUE TRUE TRUE TRUE TRUE
## 30        z         42.12575       -19.65985 TRUE TRUE TRUE TRUE TRUE
## 31        b         46.54492       -11.12280 TRUE TRUE TRUE TRUE TRUE
## 32        e         42.17460       -17.20014 TRUE TRUE TRUE TRUE TRUE
## 33        v         50.17839       -14.95969 TRUE TRUE TRUE TRUE TRUE
## 34        x         50.55986       -11.19349 TRUE TRUE TRUE TRUE TRUE
## 35        w         50.10880       -14.10668 TRUE TRUE TRUE TRUE TRUE
## 36        y         48.43941       -16.07576 TRUE TRUE TRUE TRUE TRUE
## 37        y         44.89317       -22.29173 TRUE TRUE TRUE TRUE TRUE
## 38        d         49.09375       -19.13459 TRUE TRUE TRUE TRUE TRUE
## 39        a         45.51597       -14.69466 TRUE TRUE TRUE TRUE TRUE
## 40        m         43.72659       -18.44612 TRUE TRUE TRUE TRUE TRUE
## 41        w         50.16131       -18.52503 TRUE TRUE TRUE TRUE TRUE
## 42        f         44.42731       -18.09311 TRUE TRUE TRUE TRUE TRUE
## 43        f         49.39479       -16.56903 TRUE TRUE TRUE TRUE TRUE
## 44        y         43.65398       -11.46027 TRUE TRUE TRUE TRUE TRUE
## 45        j         43.75778       -18.61842 TRUE TRUE TRUE TRUE TRUE
## 46        t         44.08682       -18.79788 TRUE TRUE TRUE TRUE TRUE
## 47        x         47.90933       -17.14288 TRUE TRUE TRUE TRUE TRUE
## 48        z         44.96427       -22.83217 TRUE TRUE TRUE TRUE TRUE
## 49        o         47.36346       -24.89971 TRUE TRUE TRUE TRUE TRUE
## 50        f         44.86068       -15.32773 TRUE TRUE TRUE TRUE TRUE
## 51        l         49.00818       -25.71119 TRUE TRUE TRUE TRUE TRUE
## 52        i         47.69323       -22.26402 TRUE TRUE TRUE TRUE TRUE
## 53        m         47.31773       -14.41918 TRUE TRUE TRUE TRUE TRUE
## 54        a         43.30220       -11.42567 TRUE TRUE TRUE TRUE TRUE
## 55        j         49.08877       -21.75620 TRUE TRUE TRUE TRUE TRUE
## 56        l         42.47653       -15.56339 TRUE TRUE TRUE TRUE TRUE
## 57        k         44.70933       -17.99674 TRUE TRUE TRUE TRUE TRUE
## 58        p         44.70178       -18.52975 TRUE TRUE TRUE TRUE TRUE
## 59        x         45.22882       -15.65000 TRUE TRUE TRUE TRUE TRUE
## 60        x         47.97584       -13.37695 TRUE TRUE TRUE TRUE TRUE
## 61        h         44.31021       -20.45666 TRUE TRUE TRUE TRUE TRUE
## 62        i         44.07051       -12.78862 TRUE TRUE TRUE TRUE TRUE
## 63        u         45.32284       -22.66392 TRUE TRUE TRUE TRUE TRUE
## 64        d         43.07597       -11.27800 TRUE TRUE TRUE TRUE TRUE
## 65        h         43.75927       -17.72341 TRUE TRUE TRUE TRUE TRUE
## 66        b         43.49330       -11.08562 TRUE TRUE TRUE TRUE TRUE
## 67        z         49.41341       -16.53169 TRUE TRUE TRUE TRUE TRUE
## 68        i         44.38191       -12.88057 TRUE TRUE TRUE TRUE TRUE
## 69        a         45.87004       -21.40570 TRUE TRUE TRUE TRUE TRUE
## 70        u         48.98747       -14.55868 TRUE TRUE TRUE TRUE TRUE
## 71        a         42.85935       -20.65919 TRUE TRUE TRUE TRUE TRUE
## 72        g         43.31817       -19.12096 TRUE TRUE TRUE TRUE TRUE
## 73        y         43.79234       -18.25547 TRUE TRUE TRUE TRUE TRUE
## 74        l         46.07724       -14.26303 TRUE TRUE TRUE TRUE TRUE
## 75        e         43.20867       -16.27928 TRUE TRUE TRUE TRUE TRUE
## 76        y         48.93672       -13.93983 TRUE TRUE TRUE TRUE TRUE
## 77        w         50.87024       -18.83283 TRUE TRUE TRUE TRUE TRUE
## 78        z         50.68895       -18.57825 TRUE TRUE TRUE TRUE TRUE
## 79        u         50.84024       -25.33612 TRUE TRUE TRUE TRUE TRUE
## 80        w         47.08623       -22.37289 TRUE TRUE TRUE TRUE TRUE
## 81        r         44.09162       -21.30312 TRUE TRUE TRUE TRUE TRUE
## 82        j         50.10791       -11.86249 TRUE TRUE TRUE TRUE TRUE
## 83        f         44.49742       -21.63300 TRUE TRUE TRUE TRUE TRUE
## 84        b         45.68322       -12.36200 TRUE TRUE TRUE TRUE TRUE
## 85        x         46.32936       -19.83927 TRUE TRUE TRUE TRUE TRUE
## 86        d         42.98343       -16.07813 TRUE TRUE TRUE TRUE TRUE
## 87        h         47.13683       -11.58588 TRUE TRUE TRUE TRUE TRUE
## 88        h         50.34479       -22.00389 TRUE TRUE TRUE TRUE TRUE
## 89        l         49.28459       -14.76116 TRUE TRUE TRUE TRUE TRUE
## 90        z         46.27702       -21.22481 TRUE TRUE TRUE TRUE TRUE
## 91        d         43.08123       -18.61452 TRUE TRUE TRUE TRUE TRUE
## 92        w         43.03766       -24.00425 TRUE TRUE TRUE TRUE TRUE
## 93        n         46.72967       -12.27878 TRUE TRUE TRUE TRUE TRUE
## 94        w         45.96897       -20.41649 TRUE TRUE TRUE TRUE TRUE
## 95        h         50.49960       -21.64548 TRUE TRUE TRUE TRUE TRUE
## 96        r         43.10367       -21.69437 TRUE TRUE TRUE TRUE TRUE
## 97        k         44.54938       -19.63173 TRUE TRUE TRUE TRUE TRUE
## 98        k         42.12794       -24.66150 TRUE TRUE TRUE TRUE TRUE
## 99        z         44.24457       -13.05368 TRUE TRUE TRUE TRUE TRUE
## 100       l         47.61909       -14.08426 TRUE TRUE TRUE TRUE TRUE
## 101       b         44.25767       -13.65867 TRUE TRUE TRUE TRUE TRUE
## 102       y         42.63775       -25.72929 TRUE TRUE TRUE TRUE TRUE
## 103       n         50.56177       -15.52224 TRUE TRUE TRUE TRUE TRUE
## 104       o         48.09714       -21.31699 TRUE TRUE TRUE TRUE TRUE
## 105       v         48.42260       -11.64633 TRUE TRUE TRUE TRUE TRUE
## 106       f         45.52129       -25.25189 TRUE TRUE TRUE TRUE TRUE
## 107       s         48.02754       -12.66976 TRUE TRUE TRUE TRUE TRUE
## 108       b         43.79185       -22.98669 TRUE TRUE TRUE TRUE TRUE
## 109       r         47.78217       -15.05166 TRUE TRUE TRUE TRUE TRUE
## 110       z         46.63354       -24.26560 TRUE TRUE TRUE TRUE TRUE
## 111       y         48.38623       -19.55432 TRUE TRUE TRUE TRUE TRUE
## 112       t         42.68095       -15.03948 TRUE TRUE TRUE TRUE TRUE
## 113       e         44.45842       -11.49847 TRUE TRUE TRUE TRUE TRUE
## 114       r         43.98256       -21.37550 TRUE TRUE TRUE TRUE TRUE
## 115       n         46.23768       -15.87168 TRUE TRUE TRUE TRUE TRUE
## 116       n         44.67655       -12.40487 TRUE TRUE TRUE TRUE TRUE
## 117       q         49.28855       -23.36617 TRUE TRUE TRUE TRUE TRUE
## 118       x         46.59162       -25.94746 TRUE TRUE TRUE TRUE TRUE
## 119       e         45.81446       -11.63528 TRUE TRUE TRUE TRUE TRUE
## 120       v         48.19875       -14.26387 TRUE TRUE TRUE TRUE TRUE
## 121       m         45.27996       -18.80533 TRUE TRUE TRUE TRUE TRUE
## 122       h         48.25332       -21.45675 TRUE TRUE TRUE TRUE TRUE
## 123       x         50.67405       -21.57275 TRUE TRUE TRUE TRUE TRUE
## 124       o         43.33540       -19.00190 TRUE TRUE TRUE TRUE TRUE
## 125       b         43.53319       -19.02038 TRUE TRUE TRUE TRUE TRUE
## 126       r         50.59898       -12.41533 TRUE TRUE TRUE TRUE TRUE
## 127       s         47.21971       -17.14605 TRUE TRUE TRUE TRUE TRUE
## 128       f         46.35849       -14.11888 TRUE TRUE TRUE TRUE TRUE
## 129       z         43.42081       -24.02800 TRUE TRUE TRUE TRUE TRUE
## 130       k         48.56654       -16.31181 TRUE TRUE TRUE TRUE TRUE
## 131       a         42.05272       -17.23460 TRUE TRUE TRUE TRUE TRUE
## 132       a         48.03033       -20.29701 TRUE TRUE TRUE TRUE TRUE
## 133       l         43.72870       -15.74660 TRUE TRUE TRUE TRUE TRUE
## 134       v         50.02084       -19.57710 TRUE TRUE TRUE TRUE TRUE
## 135       r         42.71142       -16.65727 TRUE TRUE TRUE TRUE TRUE
## 136       l         43.11430       -17.34768 TRUE TRUE TRUE TRUE TRUE
## 137       d         47.08516       -19.45374 TRUE TRUE TRUE TRUE TRUE
## 138       c         43.21944       -20.40069 TRUE TRUE TRUE TRUE TRUE
## 139       u         45.33228       -20.01697 TRUE TRUE TRUE TRUE TRUE
## 140       m         45.51633       -24.28956 TRUE TRUE TRUE TRUE TRUE
## 141       x         45.54985       -14.43651 TRUE TRUE TRUE TRUE TRUE
## 142       x         44.48579       -15.68046 TRUE TRUE TRUE TRUE TRUE
## 143       w         45.80883       -17.08256 TRUE TRUE TRUE TRUE TRUE
## 144       x         45.95261       -11.95781 TRUE TRUE TRUE TRUE TRUE
## 145       o         42.01165       -16.98168 TRUE TRUE TRUE TRUE TRUE
## 146       l         43.54525       -21.91721 TRUE TRUE TRUE TRUE TRUE
## 147       z         49.80257       -22.04956 TRUE TRUE TRUE TRUE TRUE
## 148       z         50.26173       -23.79491 TRUE TRUE TRUE TRUE TRUE
## 149       l         45.25441       -15.85776 TRUE TRUE TRUE TRUE TRUE
## 150       r         47.38067       -20.34766 TRUE TRUE TRUE TRUE TRUE
## 151       l         45.54300       -18.01769 TRUE TRUE TRUE TRUE TRUE
## 152       j         43.55238       -15.78671 TRUE TRUE TRUE TRUE TRUE
## 153       o         43.95303       -14.57498 TRUE TRUE TRUE TRUE TRUE
## 154       x         43.43715       -20.26976 TRUE TRUE TRUE TRUE TRUE
## 155       i         49.85998       -19.10017 TRUE TRUE TRUE TRUE TRUE
## 156       w         47.26902       -24.79822 TRUE TRUE TRUE TRUE TRUE
## 157       x         49.39018       -24.64833 TRUE TRUE TRUE TRUE TRUE
## 158       b         46.75153       -25.21689 TRUE TRUE TRUE TRUE TRUE
## 159       e         43.94753       -12.26930 TRUE TRUE TRUE TRUE TRUE
## 160       a         42.86798       -18.27541 TRUE TRUE TRUE TRUE TRUE
## 161       r         42.78160       -17.17657 TRUE TRUE TRUE TRUE TRUE
## 162       f         46.61320       -24.41557 TRUE TRUE TRUE TRUE TRUE
## 163       b         46.10434       -15.16312 TRUE TRUE TRUE TRUE TRUE
## 164       n         46.60538       -18.52469 TRUE TRUE TRUE TRUE TRUE
## 165       w         43.78946       -18.61970 TRUE TRUE TRUE TRUE TRUE
## 166       z         47.13337       -21.10885 TRUE TRUE TRUE TRUE TRUE
## 167       u         42.01773       -22.09508 TRUE TRUE TRUE TRUE TRUE
## 168       j         45.23699       -16.44067 TRUE TRUE TRUE TRUE TRUE
## 169       q         43.66431       -24.73657 TRUE TRUE TRUE TRUE TRUE
## 170       y         46.71326       -16.34841 TRUE TRUE TRUE TRUE TRUE
## 171       s         43.03447       -20.61498 TRUE TRUE TRUE TRUE TRUE
## 172       u         50.91772       -23.54362 TRUE TRUE TRUE TRUE TRUE
## 173       u         48.57316       -22.48582 TRUE TRUE TRUE TRUE TRUE
## 174       i         45.94770       -23.25947 TRUE TRUE TRUE TRUE TRUE
## 175       v         44.12810       -16.75064 TRUE TRUE TRUE TRUE TRUE
## 176       t         47.87198       -23.63836 TRUE TRUE TRUE TRUE TRUE
## 177       q         45.09153       -19.76190 TRUE TRUE TRUE TRUE TRUE
## 178       x         46.45874       -16.11367 TRUE TRUE TRUE TRUE TRUE
## 179       y         44.23441       -19.28907 TRUE TRUE TRUE TRUE TRUE
## 180       x         48.62647       -23.39357 TRUE TRUE TRUE TRUE TRUE
## 181       t         46.11653       -12.69036 TRUE TRUE TRUE TRUE TRUE
## 182       q         44.57303       -13.15878 TRUE TRUE TRUE TRUE TRUE
## 183       u         42.88975       -25.11983 TRUE TRUE TRUE TRUE TRUE
## 184       q         44.81103       -11.75950 TRUE TRUE TRUE TRUE TRUE
## 185       k         44.55513       -22.69342 TRUE TRUE TRUE TRUE TRUE
## 186       t         46.13540       -16.28488 TRUE TRUE TRUE TRUE TRUE
## 187       f         47.83236       -18.99767 TRUE TRUE TRUE TRUE TRUE
## 188       l         50.72144       -11.54819 TRUE TRUE TRUE TRUE TRUE
## 189       j         42.79143       -22.72967 TRUE TRUE TRUE TRUE TRUE
## 190       u         43.05746       -17.65287 TRUE TRUE TRUE TRUE TRUE
## 191       d         42.12329       -16.09074 TRUE TRUE TRUE TRUE TRUE
## 192       h         44.36387       -15.12297 TRUE TRUE TRUE TRUE TRUE
## 193       d         50.12582       -11.77490 TRUE TRUE TRUE TRUE TRUE
## 194       f         42.77349       -20.10665 TRUE TRUE TRUE TRUE TRUE
## 195       x         50.82328       -22.33732 TRUE TRUE TRUE TRUE TRUE
## 196       d         49.14350       -11.87275 TRUE TRUE TRUE TRUE TRUE
## 197       q         48.82128       -20.84369 TRUE TRUE TRUE TRUE TRUE
## 198       x         47.67810       -16.18856 TRUE TRUE TRUE TRUE TRUE
## 199       t         43.76357       -25.69082 TRUE TRUE TRUE TRUE TRUE
## 200       e         47.08895       -21.57374 TRUE TRUE TRUE TRUE TRUE
## 201       f         43.17219       -21.42370 TRUE TRUE TRUE TRUE TRUE
## 202       q         47.26448       -12.53670 TRUE TRUE TRUE TRUE TRUE
## 203       p         49.23773       -11.63687 TRUE TRUE TRUE TRUE TRUE
## 204       f         42.65693       -11.26759 TRUE TRUE TRUE TRUE TRUE
## 205       n         48.65318       -16.63039 TRUE TRUE TRUE TRUE TRUE
## 206       t         49.89189       -12.34670 TRUE TRUE TRUE TRUE TRUE
## 207       c         48.85200       -17.42104 TRUE TRUE TRUE TRUE TRUE
## 208       b         49.48822       -25.36624 TRUE TRUE TRUE TRUE TRUE
## 209       x         42.04986       -12.40224 TRUE TRUE TRUE TRUE TRUE
## 210       y         42.65004       -25.73593 TRUE TRUE TRUE TRUE TRUE
## 211       r         48.43692       -21.98941 TRUE TRUE TRUE TRUE TRUE
## 212       o         43.43068       -25.20060 TRUE TRUE TRUE TRUE TRUE
## 213       x         50.04667       -11.27414 TRUE TRUE TRUE TRUE TRUE
## 214       h         42.59626       -15.21479 TRUE TRUE TRUE TRUE TRUE
## 215       l         43.70555       -17.25495 TRUE TRUE TRUE TRUE TRUE
## 216       l         48.93929       -15.85808 TRUE TRUE TRUE TRUE TRUE
## 217       l         43.94709       -23.00420 TRUE TRUE TRUE TRUE TRUE
## 218       i         50.61529       -21.22680 TRUE TRUE TRUE TRUE TRUE
## 219       e         48.56507       -22.64271 TRUE TRUE TRUE TRUE TRUE
## 220       i         45.10744       -12.27912 TRUE TRUE TRUE TRUE TRUE
## 221       y         47.26248       -19.40321 TRUE TRUE TRUE TRUE TRUE
## 222       g         45.80010       -16.22599 TRUE TRUE TRUE TRUE TRUE
## 223       x         49.15514       -11.36680 TRUE TRUE TRUE TRUE TRUE
## 224       s         50.35173       -16.39809 TRUE TRUE TRUE TRUE TRUE
## 225       j         50.80877       -13.54904 TRUE TRUE TRUE TRUE TRUE
## 226       i         43.63994       -20.37702 TRUE TRUE TRUE TRUE TRUE
## 227       z         42.21519       -16.70126 TRUE TRUE TRUE TRUE TRUE
## 228       j         43.87516       -21.99814 TRUE TRUE TRUE TRUE TRUE
## 229       f         42.48480       -15.94132 TRUE TRUE TRUE TRUE TRUE
## 230       g         49.73260       -20.11793 TRUE TRUE TRUE TRUE TRUE
## 231       r         48.15550       -18.47572 TRUE TRUE TRUE TRUE TRUE
## 232       p         50.32137       -25.36853 TRUE TRUE TRUE TRUE TRUE
## 233       l         49.15904       -18.86498 TRUE TRUE TRUE TRUE TRUE
## 234       e         44.34725       -11.44278 TRUE TRUE TRUE TRUE TRUE
## 235       r         44.44456       -17.75734 TRUE TRUE TRUE TRUE TRUE
## 236       s         44.96774       -21.04106 TRUE TRUE TRUE TRUE TRUE
## 237       c         48.46247       -14.85572 TRUE TRUE TRUE TRUE TRUE
## 238       h         49.79673       -13.56381 TRUE TRUE TRUE TRUE TRUE
## 239       o         48.42974       -20.02620 TRUE TRUE TRUE TRUE TRUE
## 240       c         43.51217       -20.45768 TRUE TRUE TRUE TRUE TRUE
## 241       c         47.65279       -21.09283 TRUE TRUE TRUE TRUE TRUE
## 242       n         45.78383       -21.55343 TRUE TRUE TRUE TRUE TRUE
## 243       j         50.87812       -21.37454 TRUE TRUE TRUE TRUE TRUE
## 244       q         49.01058       -23.55376 TRUE TRUE TRUE TRUE TRUE
## 245       f         47.93338       -15.06843 TRUE TRUE TRUE TRUE TRUE
## 246       r         45.10064       -24.33310 TRUE TRUE TRUE TRUE TRUE
## 247       a         42.74028       -21.65079 TRUE TRUE TRUE TRUE TRUE
## 248       i         43.50729       -14.20087 TRUE TRUE TRUE TRUE TRUE
## 249       i         47.48146       -11.81416 TRUE TRUE TRUE TRUE TRUE
## 250       v         47.13322       -17.50790 TRUE TRUE TRUE TRUE TRUE
##       sea   otl  gbf inst summary
## 1   FALSE  TRUE TRUE TRUE   FALSE
## 2   FALSE  TRUE TRUE TRUE   FALSE
## 3    TRUE  TRUE TRUE TRUE    TRUE
## 4    TRUE  TRUE TRUE TRUE    TRUE
## 5   FALSE  TRUE TRUE TRUE   FALSE
## 6   FALSE  TRUE TRUE TRUE   FALSE
## 7   FALSE  TRUE TRUE TRUE   FALSE
## 8   FALSE  TRUE TRUE TRUE   FALSE
## 9    TRUE  TRUE TRUE TRUE    TRUE
## 10   TRUE  TRUE TRUE TRUE    TRUE
## 11  FALSE  TRUE TRUE TRUE   FALSE
## 12  FALSE  TRUE TRUE TRUE   FALSE
## 13  FALSE  TRUE TRUE TRUE   FALSE
## 14  FALSE  TRUE TRUE TRUE   FALSE
## 15   TRUE  TRUE TRUE TRUE    TRUE
## 16  FALSE  TRUE TRUE TRUE   FALSE
## 17   TRUE  TRUE TRUE TRUE    TRUE
## 18  FALSE  TRUE TRUE TRUE   FALSE
## 19  FALSE  TRUE TRUE TRUE   FALSE
## 20   TRUE  TRUE TRUE TRUE    TRUE
## 21  FALSE  TRUE TRUE TRUE   FALSE
## 22  FALSE  TRUE TRUE TRUE   FALSE
## 23   TRUE  TRUE TRUE TRUE    TRUE
## 24  FALSE  TRUE TRUE TRUE   FALSE
## 25  FALSE  TRUE TRUE TRUE   FALSE
## 26   TRUE  TRUE TRUE TRUE    TRUE
## 27  FALSE  TRUE TRUE TRUE   FALSE
## 28   TRUE  TRUE TRUE TRUE    TRUE
## 29   TRUE  TRUE TRUE TRUE    TRUE
## 30  FALSE  TRUE TRUE TRUE   FALSE
## 31  FALSE  TRUE TRUE TRUE   FALSE
## 32  FALSE  TRUE TRUE TRUE   FALSE
## 33   TRUE  TRUE TRUE TRUE    TRUE
## 34  FALSE  TRUE TRUE TRUE   FALSE
## 35   TRUE  TRUE TRUE TRUE    TRUE
## 36   TRUE  TRUE TRUE TRUE    TRUE
## 37   TRUE  TRUE TRUE TRUE    TRUE
## 38  FALSE  TRUE TRUE TRUE   FALSE
## 39  FALSE  TRUE TRUE TRUE   FALSE
## 40  FALSE  TRUE TRUE TRUE   FALSE
## 41  FALSE  TRUE TRUE TRUE   FALSE
## 42   TRUE  TRUE TRUE TRUE    TRUE
## 43   TRUE  TRUE TRUE TRUE    TRUE
## 44  FALSE  TRUE TRUE TRUE   FALSE
## 45  FALSE  TRUE TRUE TRUE   FALSE
## 46  FALSE  TRUE TRUE TRUE   FALSE
## 47   TRUE  TRUE TRUE TRUE    TRUE
## 48   TRUE  TRUE TRUE TRUE    TRUE
## 49  FALSE  TRUE TRUE TRUE   FALSE
## 50  FALSE  TRUE TRUE TRUE   FALSE
## 51  FALSE  TRUE TRUE TRUE   FALSE
## 52   TRUE  TRUE TRUE TRUE    TRUE
## 53  FALSE  TRUE TRUE TRUE   FALSE
## 54   TRUE  TRUE TRUE TRUE    TRUE
## 55  FALSE  TRUE TRUE TRUE   FALSE
## 56  FALSE  TRUE TRUE TRUE   FALSE
## 57   TRUE  TRUE TRUE TRUE    TRUE
## 58   TRUE  TRUE TRUE TRUE    TRUE
## 59  FALSE  TRUE TRUE TRUE   FALSE
## 60  FALSE  TRUE TRUE TRUE   FALSE
## 61   TRUE  TRUE TRUE TRUE    TRUE
## 62  FALSE  TRUE TRUE TRUE   FALSE
## 63   TRUE  TRUE TRUE TRUE    TRUE
## 64  FALSE  TRUE TRUE TRUE   FALSE
## 65  FALSE  TRUE TRUE TRUE   FALSE
## 66  FALSE  TRUE TRUE TRUE   FALSE
## 67   TRUE  TRUE TRUE TRUE    TRUE
## 68  FALSE  TRUE TRUE TRUE   FALSE
## 69   TRUE  TRUE TRUE TRUE    TRUE
## 70   TRUE  TRUE TRUE TRUE    TRUE
## 71  FALSE  TRUE TRUE TRUE   FALSE
## 72  FALSE  TRUE TRUE TRUE   FALSE
## 73  FALSE  TRUE TRUE TRUE   FALSE
## 74  FALSE  TRUE TRUE TRUE   FALSE
## 75  FALSE  TRUE TRUE TRUE   FALSE
## 76   TRUE  TRUE TRUE TRUE    TRUE
## 77  FALSE  TRUE TRUE TRUE   FALSE
## 78  FALSE  TRUE TRUE TRUE   FALSE
## 79  FALSE  TRUE TRUE TRUE   FALSE
## 80   TRUE  TRUE TRUE TRUE    TRUE
## 81   TRUE  TRUE TRUE TRUE    TRUE
## 82  FALSE  TRUE TRUE TRUE   FALSE
## 83   TRUE  TRUE TRUE TRUE    TRUE
## 84  FALSE  TRUE TRUE TRUE   FALSE
## 85   TRUE  TRUE TRUE TRUE    TRUE
## 86  FALSE  TRUE TRUE TRUE   FALSE
## 87  FALSE  TRUE TRUE TRUE   FALSE
## 88  FALSE  TRUE TRUE TRUE   FALSE
## 89   TRUE  TRUE TRUE TRUE    TRUE
## 90   TRUE  TRUE TRUE TRUE    TRUE
## 91  FALSE  TRUE TRUE TRUE   FALSE
## 92  FALSE  TRUE TRUE TRUE   FALSE
## 93  FALSE  TRUE TRUE TRUE   FALSE
## 94   TRUE  TRUE TRUE TRUE    TRUE
## 95  FALSE  TRUE TRUE TRUE   FALSE
## 96  FALSE  TRUE TRUE TRUE   FALSE
## 97   TRUE  TRUE TRUE TRUE    TRUE
## 98  FALSE  TRUE TRUE TRUE   FALSE
## 99  FALSE  TRUE TRUE TRUE   FALSE
## 100 FALSE  TRUE TRUE TRUE   FALSE
## 101 FALSE  TRUE TRUE TRUE   FALSE
## 102 FALSE  TRUE TRUE TRUE   FALSE
## 103 FALSE  TRUE TRUE TRUE   FALSE
## 104  TRUE  TRUE TRUE TRUE    TRUE
## 105 FALSE  TRUE TRUE TRUE   FALSE
## 106  TRUE  TRUE TRUE TRUE    TRUE
## 107 FALSE  TRUE TRUE TRUE   FALSE
## 108  TRUE  TRUE TRUE TRUE    TRUE
## 109  TRUE  TRUE TRUE TRUE    TRUE
## 110  TRUE  TRUE TRUE TRUE    TRUE
## 111  TRUE  TRUE TRUE TRUE    TRUE
## 112 FALSE  TRUE TRUE TRUE   FALSE
## 113 FALSE  TRUE TRUE TRUE   FALSE
## 114  TRUE  TRUE TRUE TRUE    TRUE
## 115  TRUE  TRUE TRUE TRUE    TRUE
## 116 FALSE  TRUE TRUE TRUE   FALSE
## 117 FALSE  TRUE TRUE TRUE   FALSE
## 118 FALSE  TRUE TRUE TRUE   FALSE
## 119 FALSE  TRUE TRUE TRUE   FALSE
## 120  TRUE  TRUE TRUE TRUE    TRUE
## 121  TRUE  TRUE TRUE TRUE    TRUE
## 122  TRUE  TRUE TRUE TRUE    TRUE
## 123 FALSE  TRUE TRUE TRUE   FALSE
## 124 FALSE  TRUE TRUE TRUE   FALSE
## 125 FALSE  TRUE TRUE TRUE   FALSE
## 126 FALSE  TRUE TRUE TRUE   FALSE
## 127  TRUE  TRUE TRUE TRUE    TRUE
## 128 FALSE  TRUE TRUE TRUE   FALSE
## 129 FALSE  TRUE TRUE TRUE   FALSE
## 130  TRUE  TRUE TRUE TRUE    TRUE
## 131 FALSE  TRUE TRUE TRUE   FALSE
## 132  TRUE  TRUE TRUE TRUE    TRUE
## 133 FALSE  TRUE TRUE TRUE   FALSE
## 134 FALSE  TRUE TRUE TRUE   FALSE
## 135 FALSE  TRUE TRUE TRUE   FALSE
## 136 FALSE  TRUE TRUE TRUE   FALSE
## 137  TRUE  TRUE TRUE TRUE    TRUE
## 138 FALSE  TRUE TRUE TRUE   FALSE
## 139  TRUE  TRUE TRUE TRUE    TRUE
## 140  TRUE  TRUE TRUE TRUE    TRUE
## 141 FALSE  TRUE TRUE TRUE   FALSE
## 142 FALSE  TRUE TRUE TRUE   FALSE
## 143  TRUE  TRUE TRUE TRUE    TRUE
## 144 FALSE  TRUE TRUE TRUE   FALSE
## 145 FALSE  TRUE TRUE TRUE   FALSE
## 146  TRUE  TRUE TRUE TRUE    TRUE
## 147 FALSE  TRUE TRUE TRUE   FALSE
## 148 FALSE  TRUE TRUE TRUE   FALSE
## 149 FALSE  TRUE TRUE TRUE   FALSE
## 150  TRUE  TRUE TRUE TRUE    TRUE
## 151  TRUE  TRUE TRUE TRUE    TRUE
## 152 FALSE  TRUE TRUE TRUE   FALSE
## 153 FALSE  TRUE TRUE TRUE   FALSE
## 154 FALSE  TRUE TRUE TRUE   FALSE
## 155 FALSE  TRUE TRUE TRUE   FALSE
## 156 FALSE  TRUE TRUE TRUE   FALSE
## 157 FALSE  TRUE TRUE TRUE   FALSE
## 158 FALSE  TRUE TRUE TRUE   FALSE
## 159 FALSE  TRUE TRUE TRUE   FALSE
## 160 FALSE  TRUE TRUE TRUE   FALSE
## 161 FALSE  TRUE TRUE TRUE   FALSE
## 162  TRUE  TRUE TRUE TRUE    TRUE
## 163 FALSE  TRUE TRUE TRUE   FALSE
## 164  TRUE  TRUE TRUE TRUE    TRUE
## 165 FALSE  TRUE TRUE TRUE   FALSE
## 166  TRUE  TRUE TRUE TRUE    TRUE
## 167 FALSE  TRUE TRUE TRUE   FALSE
## 168  TRUE  TRUE TRUE TRUE    TRUE
## 169 FALSE FALSE TRUE TRUE   FALSE
## 170  TRUE  TRUE TRUE TRUE    TRUE
## 171 FALSE  TRUE TRUE TRUE   FALSE
## 172 FALSE  TRUE TRUE TRUE   FALSE
## 173 FALSE  TRUE TRUE TRUE   FALSE
## 174  TRUE  TRUE TRUE TRUE    TRUE
## 175 FALSE  TRUE TRUE TRUE   FALSE
## 176 FALSE  TRUE TRUE TRUE   FALSE
## 177  TRUE FALSE TRUE TRUE   FALSE
## 178  TRUE  TRUE TRUE TRUE    TRUE
## 179 FALSE  TRUE TRUE TRUE   FALSE
## 180 FALSE  TRUE TRUE TRUE   FALSE
## 181 FALSE  TRUE TRUE TRUE   FALSE
## 182 FALSE  TRUE TRUE TRUE   FALSE
## 183 FALSE  TRUE TRUE TRUE   FALSE
## 184 FALSE FALSE TRUE TRUE   FALSE
## 185  TRUE  TRUE TRUE TRUE    TRUE
## 186  TRUE  TRUE TRUE TRUE    TRUE
## 187  TRUE  TRUE TRUE TRUE    TRUE
## 188 FALSE  TRUE TRUE TRUE   FALSE
## 189 FALSE  TRUE TRUE TRUE   FALSE
## 190 FALSE  TRUE TRUE TRUE   FALSE
## 191 FALSE  TRUE TRUE TRUE   FALSE
## 192 FALSE  TRUE TRUE TRUE   FALSE
## 193 FALSE  TRUE TRUE TRUE   FALSE
## 194 FALSE  TRUE TRUE TRUE   FALSE
## 195 FALSE  TRUE TRUE TRUE   FALSE
## 196 FALSE  TRUE TRUE TRUE   FALSE
## 197 FALSE FALSE TRUE TRUE   FALSE
## 198  TRUE  TRUE TRUE TRUE    TRUE
## 199 FALSE  TRUE TRUE TRUE   FALSE
## 200  TRUE  TRUE TRUE TRUE    TRUE
## 201 FALSE  TRUE TRUE TRUE   FALSE
## 202 FALSE  TRUE TRUE TRUE   FALSE
## 203 FALSE  TRUE TRUE TRUE   FALSE
## 204 FALSE  TRUE TRUE TRUE   FALSE
## 205  TRUE  TRUE TRUE TRUE    TRUE
## 206 FALSE  TRUE TRUE TRUE   FALSE
## 207  TRUE  TRUE TRUE TRUE    TRUE
## 208 FALSE  TRUE TRUE TRUE   FALSE
## 209 FALSE  TRUE TRUE TRUE   FALSE
## 210 FALSE  TRUE TRUE TRUE   FALSE
## 211 FALSE  TRUE TRUE TRUE   FALSE
## 212 FALSE  TRUE TRUE TRUE   FALSE
## 213 FALSE  TRUE TRUE TRUE   FALSE
## 214 FALSE  TRUE TRUE TRUE   FALSE
## 215 FALSE  TRUE TRUE TRUE   FALSE
## 216  TRUE  TRUE TRUE TRUE    TRUE
## 217  TRUE  TRUE TRUE TRUE    TRUE
## 218 FALSE  TRUE TRUE TRUE   FALSE
## 219 FALSE  TRUE TRUE TRUE   FALSE
## 220 FALSE  TRUE TRUE TRUE   FALSE
## 221  TRUE  TRUE TRUE TRUE    TRUE
## 222  TRUE  TRUE TRUE TRUE    TRUE
## 223 FALSE  TRUE TRUE TRUE   FALSE
## 224 FALSE  TRUE TRUE TRUE   FALSE
## 225 FALSE  TRUE TRUE TRUE   FALSE
## 226 FALSE  TRUE TRUE TRUE   FALSE
## 227 FALSE  TRUE TRUE TRUE   FALSE
## 228  TRUE  TRUE TRUE TRUE    TRUE
## 229 FALSE  TRUE TRUE TRUE   FALSE
## 230 FALSE  TRUE TRUE TRUE   FALSE
## 231  TRUE  TRUE TRUE TRUE    TRUE
## 232 FALSE  TRUE TRUE TRUE   FALSE
## 233  TRUE  TRUE TRUE TRUE    TRUE
## 234 FALSE  TRUE TRUE TRUE   FALSE
## 235  TRUE  TRUE TRUE TRUE    TRUE
## 236  TRUE  TRUE TRUE TRUE    TRUE
## 237  TRUE  TRUE TRUE TRUE    TRUE
## 238  TRUE  TRUE TRUE TRUE    TRUE
## 239  TRUE  TRUE TRUE TRUE    TRUE
## 240 FALSE  TRUE TRUE TRUE   FALSE
## 241  TRUE  TRUE TRUE TRUE    TRUE
## 242  TRUE  TRUE TRUE TRUE    TRUE
## 243 FALSE  TRUE TRUE TRUE   FALSE
## 244 FALSE  TRUE TRUE TRUE   FALSE
## 245  TRUE  TRUE TRUE TRUE    TRUE
## 246  TRUE  TRUE TRUE TRUE    TRUE
## 247 FALSE  TRUE TRUE TRUE   FALSE
## 248 FALSE  TRUE TRUE TRUE   FALSE
## 249 FALSE  TRUE TRUE TRUE   FALSE
## 250  TRUE  TRUE TRUE TRUE    TRUE
Test          Default radius [°]  Default radius (lat 0°/45°) [km]
capitals      0.05                5.5/4
centroids     0.01                1.1/0.8
gbif          1                   63.0
institutions  0.001               0.1/0.08
zeros         0.5                 55.6
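
The kilometre values in the table can be approximated from the degree radii: one degree spans roughly 111 km at the equator, and a degree of longitude shrinks with the cosine of the latitude. The following is a minimal sketch in base R. The helper name deg_to_km and the constant 111.32 km per degree are assumptions made for illustration and are not part of CoordinateCleaner. The single gbif value is close to what this approximation gives at the latitude of Copenhagen (about 55.7° N).

```r
# Approximate east-west extent (km) of a radius given in degrees.
# Assumes a spherical Earth with ~111.32 km per degree at the equator.
deg_to_km <- function(deg, lat = 0) {
  deg * 111.32 * cos(lat * pi / 180)
}

# default radii from the table above, in degrees
radii <- c(capitals = 0.05, centroids = 0.01, institutions = 0.001, zeros = 0.5)

# extent at the equator and at 45 degrees latitude
data.frame(deg = radii,
           km_lat0 = deg_to_km(radii, 0),
           km_lat45 = deg_to_km(radii, 45))
```

The same radius in degrees therefore covers less ground at high latitudes, which is worth keeping in mind when adapting the thresholds for a polar or a tropical data set.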

Custom gazetteers

You can use custom gazetteers for all clean_coordinates tests via the .ref arguments of the function. For example, the capitals.ref argument controls the reference for the capitals test. Customized reference data must follow the same format as the default reference for the same test. You can check the structure of a gazetteer in its documentation or by inspecting it directly (e.g. head(countryref)). For example:

#check the format of the default capitals reference
head(countryref) #a data.frame with four columns: ISO3, capital, longitude, latitude

#create new reference data set from scratch. For real analysis you 
#probably want to load the alternative file from a .txt file
my.cap <- data.frame(ISO3 = LETTERS[1:10],
                     capital = letters[1:10],
                     capital.longitude = runif(10, -180, 180),
                     capital.latitude = runif(10, -90, 90))

flags <- clean_coordinates(exmpl, capitals.ref = my.cap)

In this way, tests can be completely customized. For example, you could provide a gazetteer with the locations of hardware stores (in the capitals format) to flag records around hardware stores.
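
For a real analysis the gazetteer will usually come from a file, and it is worth checking its layout before handing it to the cleaning function. The sketch below is one way to do that in base R. The file content is invented, and the expected column names are simply copied from the my.cap example above, which is an assumption. In practice, compare against names() of the default reference for the test in question.

```r
# Hypothetical tab-delimited gazetteer; written to a temporary file here
# only so that the example is self-contained.
gaz_file <- tempfile(fileext = ".txt")
writeLines(c("ISO3\tcapital\tcapital.longitude\tcapital.latitude",
             "AAA\tstore one\t47.5\t-18.9",
             "BBB\tstore two\t49.3\t-12.3"),
           gaz_file)

my.gaz <- read.delim(gaz_file, stringsAsFactors = FALSE)

# Column layout taken from the my.cap example (an assumption, see above).
expected <- c("ISO3", "capital", "capital.longitude", "capital.latitude")

# Stop early if the layout or the coordinate ranges look wrong.
stopifnot(identical(names(my.gaz), expected),
          is.numeric(my.gaz$capital.longitude),
          is.numeric(my.gaz$capital.latitude),
          all(abs(my.gaz$capital.longitude) <= 180),
          all(abs(my.gaz$capital.latitude) <= 90))
```

Failing early on a malformed gazetteer is cheaper than tracking down a test that silently flags nothing.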

Classes of the default gazetteers of clean_coordinates.

Test          Default gazetteer                                                      Class                     Argument
capitals      countryref                                                             data.frame                capitals.ref
centroids     countryref                                                             data.frame                centroids.ref
countries     rnaturalearth::ne_countries(scale = "medium")                          SpatialPolygonsDataFrame  country.ref
institutions  institutions                                                           data.frame                inst.ref
seas          landmass                                                               SpatialPolygonsDataFrame  seas.ref
urban         rnaturalearth::ne_download(scale = "medium", type = "urban_areas")     SpatialPolygonsDataFrame  urban.ref
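
A custom gazetteer of the wrong class is an easy mistake to make, so a quick class check before the cleaning run can help. The following is a minimal sketch for a few of the tests listed above. The lookup vector ref_class and the helper has_ref_class are hypothetical names introduced here, not functions of CoordinateCleaner.

```r
# Classes expected for custom gazetteers, copied from the table above.
ref_class <- c(capitals = "data.frame",
               centroids = "data.frame",
               institutions = "data.frame",
               seas = "SpatialPolygonsDataFrame",
               urban = "SpatialPolygonsDataFrame")

# Hypothetical helper: TRUE if `ref` has the class listed for `test`.
has_ref_class <- function(ref, test) {
  inherits(ref, ref_class[[test]])
}

# Invented one-row gazetteer in the layout of the my.cap example.
my.cap <- data.frame(ISO3 = "AAA", capital = "a",
                     capital.longitude = 47.5, capital.latitude = -18.9)

has_ref_class(my.cap, "capitals")  # a data.frame, as the table requires
has_ref_class(my.cap, "seas")      # not a SpatialPolygonsDataFrame
```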

Summary and visualization

You can easily summarize the results of clean_coordinates either with the report option or via summary. If report = TRUE, the summary is written to the working directory as a .txt file; if report is a character string, it is used as the path to which the summary file is written. Alternatively, you can get the number of records flagged by each test with summary.

#via the report option
flags <- clean_coordinates(exmpl, report = TRUE)
## Testing coordinate validity
## Flagged 0 records.
## Testing equal lat/lon
## Flagged 0 records.
## Testing zero coordinates
## Flagged 0 records.
## Testing country capitals
## Flagged 0 records.
## Testing country centroids
## Flagged 0 records.
## Testing sea coordinates
## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\alexander.zizka\AppData\Local\Temp\Rtmp8kHf7i", layer: "ne_50m_land"
## with 1420 features
## It has 3 fields
## Integer64 fields read as strings:  scalerank
## Flagged 160 records.
## Testing geographic outliers
## Flagged 4 records.
## Testing GBIF headquarters, flagging records around Copenhagen
## Flagged 0 records.
## Testing biodiversity institutions
## Flagged 0 records.
## Flagged 161 of 250 records, EQ = 0.64.

#via summary
summary(flags)
## decimallatitude             val             equ             zer 
##               0               0               0               0 
##             cap             cen             sea             otl 
##               0               0             160               4 
##             gbf            inst         summary 
##               0               0             161
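
The per-test counts can also be derived by hand from the logical flag columns. The sketch below uses a small mock data frame rather than real clean_coordinates output. It assumes that TRUE marks a record that passed a test and FALSE a flagged one, and that the flag columns are called .sea, .otl and .summary. Both the values and the column names are assumptions made for illustration. The EQ value in the report above appears to be the share of flagged records (161 / 250 ≈ 0.64), which is computed here in the same way.

```r
# Mock flags for six records (TRUE = passed, FALSE = flagged).
mock <- data.frame(.sea = c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE),
                   .otl = c(TRUE, TRUE, TRUE, FALSE, TRUE, TRUE))

# A record passes overall only if it passes every individual test.
mock$.summary <- mock$.sea & mock$.otl

# Number of flagged records per column.
n_flagged <- sapply(mock, function(x) sum(!x))
n_flagged

# Share of records flagged by at least one test.
share_flagged <- mean(!mock$.summary)
share_flagged
```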

Exclude flagged records

The output of clean_coordinates is in the same order as the input, thus you can easily exclude flagged records.

#exclude records flagged by any test (TRUE in .summary marks records that passed all tests)
clean <- exmpl[flags$.summary, ]

#exclude records flagged by the centroids test
clean <- exmpl[flags$.cen, ]
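
Because the flags are returned in input order, a logical vector can index the rows of the original table directly. It is often useful to set the flagged records aside for inspection instead of discarding them immediately. The following is a minimal sketch with invented data. The objects occ and passed are hypothetical stand-ins for the occurrence table and for a flag column in which TRUE means that the record passed.

```r
# Mock occurrence table and mock flags in the same row order
# (hypothetical values; TRUE = passed, FALSE = flagged).
occ <- data.frame(species = c("a", "b", "c", "d"),
                  decimallongitude = c(47.1, 50.9, 44.3, 46.0),
                  decimallatitude = c(-19.2, -25.8, -12.1, -20.5))
passed <- c(TRUE, FALSE, TRUE, FALSE)

clean <- occ[passed, ]     # records to keep
flagged <- occ[!passed, ]  # records to inspect before discarding

# every input record ends up in exactly one of the two sets
stopifnot(nrow(clean) + nrow(flagged) == nrow(occ))
```

Note the comma after the logical vector: the index goes in the row position, so rows are dropped and all columns are kept.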