The clean_coordinates function enables fast, automated and reproducible flagging of potentially erroneous occurrence coordinates based on geographic gazetteers. The function flags records based on known problems common to biological collections.
Individual tests can be switched on and off (see ?clean_coordinates), and the distance thresholds of all tests can be adapted. Custom gazetteers can be provided for all tests (for a higher level of detail).
Please find a detailed tutorial on how to clean occurrence records (e.g. from GBIF) here and how to clean fossil data (e.g. from PBDB) here.
clean_coordinates wraps around multiple tests for common error sources in species distribution records. Individual tests can be included in or excluded from a run with the tests argument of clean_coordinates, e.g. leaving "seas" out of tests switches the seas test off. Most basic tests are switched on by default, but some of the more complex ones are switched off by default.
library(CoordinateCleaner)
## Registered S3 method overwritten by 'dplyr':
## method from
## as.data.frame.tbl_df tibble
## Registered S3 methods overwritten by 'ggplot2':
## method from
## [.quosures rlang
## c.quosures rlang
## print.quosures rlang
exmpl <- data.frame(species = sample(letters, size = 250, replace = TRUE),
decimallongitude = runif(250, min = 42, max = 51),
decimallatitude = runif(250, min = -26, max = -11),
countries = "MDG")
#run all tests
dat <- clean_coordinates(exmpl, tests = c("capitals", "centroids", "countries", "equal", "gbif", "institutions", "outliers", "seas", "urban", "zeros"), countries = "countries")
## Testing coordinate validity
## Flagged 0 records.
## Testing equal lat/lon
## Flagged 0 records.
## Testing zero coordinates
## Flagged 0 records.
## Testing country capitals
## Flagged 0 records.
## Testing country centroids
## Flagged 0 records.
## Testing sea coordinates
## OGR data source with driver: ESRI Shapefile
## Source: "C:\Users\alexander.zizka\AppData\Local\Temp\Rtmpqc5aFR", layer: "ne_50m_land"
## with 1420 features
## It has 3 fields
## Integer64 fields read as strings: scalerank
## Flagged 156 records.
## Testing urban areas
## No reference for urban areas found.
## Using rnaturalearth to download.
## OGR data source with driver: ESRI Shapefile
## Source: "C:\Users\alexander.zizka\AppData\Local\Temp\Rtmpqc5aFR", layer: "ne_50m_urban_areas"
## with 2143 features
## It has 4 fields
## Integer64 fields read as strings: scalerank
## Flagged 0 records.
## Testing country identity
## Flagged 156 records.
## Testing geographic outliers
## Flagged 3 records.
## Testing GBIF headquarters, flagging records around Copenhagen
## Flagged 0 records.
## Testing biodiversity institutions
## Flagged 0 records.
## Flagged 158 of 250 records, EQ = 0.63.
#only run the validity test
dat <- clean_coordinates(exmpl, tests = c(""))
## Testing coordinate validity
## Flagged 0 records.
## Flagged 0 of 250 records, EQ = 0.

| Test | Function | Background | Default |
|---|---|---|---|
| capitals | radius around capitals | georeferenced from location description | on |
| centroids | radius around country and province centroids | geo-referenced from description | on |
| countries | coordinates in the right country | switched lon/lat, data entry errors | off |
| duplicates | records from one species with identical coordinates | repetitive observation of identical individual, same voucher from multiple data sources, genetic data | off |
| gbif | radius around GBIF headquarters | data entry errors, falsely geo-referenced | on |
| institutions | radius around biodiversity institutions | falsely geo-referenced, zoo or garden records | on |
| outliers | records far away from all other records of this species | various | on |
| seas | in the sea | switched lon/lat | on |
| urban | within urban area | cultivated/captivity | off |
| validity | outside reference coordinate system | missing data, data entry errors | on |
| zeros | plain zeros, lat = lon | missing data, data entry errors | on |
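As a minimal sketch of how a single test might be left out of a run, the vector passed to tests can be built with base R. The test names are the ones used in the example call above; the commented-out call assumes that CoordinateCleaner is loaded and that exmpl has been created as shown earlier.

```r
# All test names used in the "run all tests" example above
all_tests <- c("capitals", "centroids", "countries", "equal", "gbif",
               "institutions", "outliers", "seas", "urban", "zeros")

# Drop the seas test and keep everything else
tests_no_seas <- setdiff(all_tests, "seas")
print(tests_no_seas)

# Assumed usage (requires CoordinateCleaner and the exmpl data from above):
# dat <- clean_coordinates(exmpl, tests = tests_no_seas, countries = "countries")
```

Building the vector with setdiff keeps the order of the remaining tests and avoids retyping the full list for every variation.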
capitals, centroids and institutions
The capitals, centroids and institutions tests use a radius around gazetteer points to flag coordinates. You can change this radius for each test using the _rad arguments (e.g. capitals_rad). The radius is specified in decimal degrees, which means that its actual size in meters varies with latitude.
clean_coordinates(exmpl, capitals_rad = 0.1)
## Testing coordinate validity
## Flagged 0 records.
## Testing equal lat/lon
## Flagged 0 records.
## Testing zero coordinates
## Flagged 0 records.
## Testing country capitals
## Flagged 0 records.
## Testing country centroids
## Flagged 0 records.
## Testing sea coordinates
## OGR data source with driver: ESRI Shapefile
## Source: "C:\Users\alexander.zizka\AppData\Local\Temp\Rtmpqc5aFR", layer: "ne_50m_land"
## with 1420 features
## It has 3 fields
## Integer64 fields read as strings: scalerank
## Flagged 156 records.
## Testing geographic outliers
## Flagged 3 records.
## Testing GBIF headquarters, flagging records around Copenhagen
## Flagged 0 records.
## Testing biodiversity institutions
## Flagged 0 records.
## Flagged 158 of 250 records, EQ = 0.63.
## species decimallongitude decimallatitude val equ zer cap cen
## 1 l 49.18590 -20.71880 TRUE TRUE TRUE TRUE TRUE
## 2 w 48.73326 -18.32124 TRUE TRUE TRUE TRUE TRUE
## 3 u 42.43554 -15.72547 TRUE TRUE TRUE TRUE TRUE
## 4 c 50.17924 -21.66092 TRUE TRUE TRUE TRUE TRUE
## 5 d 50.75853 -19.11579 TRUE TRUE TRUE TRUE TRUE
## 6 m 43.93606 -15.53044 TRUE TRUE TRUE TRUE TRUE
## 7 h 42.74776 -19.26143 TRUE TRUE TRUE TRUE TRUE
## 8 f 46.16682 -13.53061 TRUE TRUE TRUE TRUE TRUE
## 9 o 43.64763 -14.84481 TRUE TRUE TRUE TRUE TRUE
## 10 u 49.25360 -12.62174 TRUE TRUE TRUE TRUE TRUE
## 11 w 42.64977 -18.68775 TRUE TRUE TRUE TRUE TRUE
## 12 g 43.10648 -21.70366 TRUE TRUE TRUE TRUE TRUE
## 13 x 42.53900 -19.24482 TRUE TRUE TRUE TRUE TRUE
## 14 a 44.42925 -19.23684 TRUE TRUE TRUE TRUE TRUE
## 15 u 48.71508 -19.50283 TRUE TRUE TRUE TRUE TRUE
## 16 r 42.03020 -24.37555 TRUE TRUE TRUE TRUE TRUE
## 17 k 42.52437 -19.41456 TRUE TRUE TRUE TRUE TRUE
## 18 j 43.68465 -25.56188 TRUE TRUE TRUE TRUE TRUE
## 19 k 43.50951 -24.39107 TRUE TRUE TRUE TRUE TRUE
## 20 u 45.78401 -16.86553 TRUE TRUE TRUE TRUE TRUE
## 21 w 44.38697 -16.53746 TRUE TRUE TRUE TRUE TRUE
## 22 m 44.21495 -14.75126 TRUE TRUE TRUE TRUE TRUE
## 23 w 45.91899 -19.84619 TRUE TRUE TRUE TRUE TRUE
## 24 j 49.44800 -13.22954 TRUE TRUE TRUE TRUE TRUE
## 25 o 47.19915 -23.81937 TRUE TRUE TRUE TRUE TRUE
## 26 c 48.33915 -18.87889 TRUE TRUE TRUE TRUE TRUE
## 27 n 43.36642 -12.04434 TRUE TRUE TRUE TRUE TRUE
## 28 e 47.02641 -18.71071 TRUE TRUE TRUE TRUE TRUE
## 29 y 44.49769 -21.45606 TRUE TRUE TRUE TRUE TRUE
## 30 q 49.46204 -11.50508 TRUE TRUE TRUE TRUE TRUE
## 31 p 47.45954 -23.91925 TRUE TRUE TRUE TRUE TRUE
## 32 w 50.81620 -13.85193 TRUE TRUE TRUE TRUE TRUE
## 33 j 42.02057 -18.59120 TRUE TRUE TRUE TRUE TRUE
## 34 j 47.95033 -22.09662 TRUE TRUE TRUE TRUE TRUE
## 35 a 49.01466 -19.52101 TRUE TRUE TRUE TRUE TRUE
## 36 j 48.74231 -20.13698 TRUE TRUE TRUE TRUE TRUE
## 37 f 42.97659 -11.28416 TRUE TRUE TRUE TRUE TRUE
## 38 g 43.76136 -13.61890 TRUE TRUE TRUE TRUE TRUE
## 39 g 49.57766 -12.60070 TRUE TRUE TRUE TRUE TRUE
## 40 y 46.32367 -14.09645 TRUE TRUE TRUE TRUE TRUE
## 41 n 47.06535 -23.20093 TRUE TRUE TRUE TRUE TRUE
## 42 r 45.97036 -21.66840 TRUE TRUE TRUE TRUE TRUE
## 43 r 45.25523 -16.44089 TRUE TRUE TRUE TRUE TRUE
## 44 g 48.99622 -23.53746 TRUE TRUE TRUE TRUE TRUE
## 45 w 49.54365 -14.29799 TRUE TRUE TRUE TRUE TRUE
## 46 v 50.21024 -16.20776 TRUE TRUE TRUE TRUE TRUE
## 47 y 50.61223 -12.84135 TRUE TRUE TRUE TRUE TRUE
## 48 r 43.79180 -24.42490 TRUE TRUE TRUE TRUE TRUE
## 49 o 48.26625 -22.92693 TRUE TRUE TRUE TRUE TRUE
## 50 e 49.23238 -11.06857 TRUE TRUE TRUE TRUE TRUE
## 51 o 47.86576 -18.18920 TRUE TRUE TRUE TRUE TRUE
## 52 h 50.23453 -24.45763 TRUE TRUE TRUE TRUE TRUE
## 53 w 44.99258 -13.44145 TRUE TRUE TRUE TRUE TRUE
## 54 b 49.84508 -18.81094 TRUE TRUE TRUE TRUE TRUE
## 55 f 50.01793 -24.46228 TRUE TRUE TRUE TRUE TRUE
## 56 n 44.84224 -13.18717 TRUE TRUE TRUE TRUE TRUE
## 57 a 46.83722 -14.55670 TRUE TRUE TRUE TRUE TRUE
## 58 i 46.04549 -17.19569 TRUE TRUE TRUE TRUE TRUE
## 59 z 43.11164 -19.64094 TRUE TRUE TRUE TRUE TRUE
## 60 t 46.17499 -22.27887 TRUE TRUE TRUE TRUE TRUE
## 61 a 44.89776 -16.40322 TRUE TRUE TRUE TRUE TRUE
## 62 r 44.56910 -12.66744 TRUE TRUE TRUE TRUE TRUE
## 63 r 48.29982 -22.92505 TRUE TRUE TRUE TRUE TRUE
## 64 y 42.61644 -17.59462 TRUE TRUE TRUE TRUE TRUE
## 65 v 49.91636 -21.74183 TRUE TRUE TRUE TRUE TRUE
## 66 x 49.09550 -17.40778 TRUE TRUE TRUE TRUE TRUE
## 67 l 46.92588 -25.30999 TRUE TRUE TRUE TRUE TRUE
## 68 l 42.51916 -22.83208 TRUE TRUE TRUE TRUE TRUE
## 69 q 49.79692 -12.01504 TRUE TRUE TRUE TRUE TRUE
## 70 w 48.12006 -18.64730 TRUE TRUE TRUE TRUE TRUE
## 71 w 44.10432 -16.25259 TRUE TRUE TRUE TRUE TRUE
## 72 d 48.16776 -22.87364 TRUE TRUE TRUE TRUE TRUE
## 73 s 48.93125 -13.53774 TRUE TRUE TRUE TRUE TRUE
## 74 w 47.91291 -15.93257 TRUE TRUE TRUE TRUE TRUE
## 75 w 49.21318 -22.99666 TRUE TRUE TRUE TRUE TRUE
## 76 y 49.13329 -19.77676 TRUE TRUE TRUE TRUE TRUE
## 77 v 48.58976 -17.13427 TRUE TRUE TRUE TRUE TRUE
## 78 o 49.32941 -21.18135 TRUE TRUE TRUE TRUE TRUE
## 79 a 44.61405 -11.29742 TRUE TRUE TRUE TRUE TRUE
## 80 s 45.94918 -14.73340 TRUE TRUE TRUE TRUE TRUE
## 81 v 48.42274 -18.94465 TRUE TRUE TRUE TRUE TRUE
## 82 a 46.40954 -22.93284 TRUE TRUE TRUE TRUE TRUE
## 83 w 49.99914 -24.95373 TRUE TRUE TRUE TRUE TRUE
## 84 y 50.32892 -18.29211 TRUE TRUE TRUE TRUE TRUE
## 85 r 47.65391 -19.76936 TRUE TRUE TRUE TRUE TRUE
## 86 p 46.64227 -13.90433 TRUE TRUE TRUE TRUE TRUE
## 87 z 47.25414 -24.93289 TRUE TRUE TRUE TRUE TRUE
## 88 s 42.07978 -11.53997 TRUE TRUE TRUE TRUE TRUE
## 89 x 49.38772 -14.72063 TRUE TRUE TRUE TRUE TRUE
## 90 z 49.23698 -23.91420 TRUE TRUE TRUE TRUE TRUE
## 91 x 48.39761 -22.32064 TRUE TRUE TRUE TRUE TRUE
## 92 o 44.07180 -23.31976 TRUE TRUE TRUE TRUE TRUE
## 93 w 45.60565 -24.31831 TRUE TRUE TRUE TRUE TRUE
## 94 u 43.90905 -12.39791 TRUE TRUE TRUE TRUE TRUE
## 95 s 42.95996 -18.67143 TRUE TRUE TRUE TRUE TRUE
## 96 h 45.83874 -15.22579 TRUE TRUE TRUE TRUE TRUE
## 97 r 43.21896 -12.55921 TRUE TRUE TRUE TRUE TRUE
## 98 x 48.69670 -14.70274 TRUE TRUE TRUE TRUE TRUE
## 99 a 46.72645 -20.22503 TRUE TRUE TRUE TRUE TRUE
## 100 p 46.34369 -14.23710 TRUE TRUE TRUE TRUE TRUE
## 101 e 46.53400 -25.20683 TRUE TRUE TRUE TRUE TRUE
## 102 y 42.16734 -18.63340 TRUE TRUE TRUE TRUE TRUE
## 103 z 49.72625 -19.44816 TRUE TRUE TRUE TRUE TRUE
## 104 i 50.45714 -22.12839 TRUE TRUE TRUE TRUE TRUE
## 105 g 45.33411 -14.00022 TRUE TRUE TRUE TRUE TRUE
## 106 i 45.56808 -25.69178 TRUE TRUE TRUE TRUE TRUE
## 107 z 50.04195 -15.04625 TRUE TRUE TRUE TRUE TRUE
## 108 t 44.57765 -20.25642 TRUE TRUE TRUE TRUE TRUE
## 109 b 43.97457 -22.08555 TRUE TRUE TRUE TRUE TRUE
## 110 s 48.81252 -12.32890 TRUE TRUE TRUE TRUE TRUE
## 111 v 44.39171 -24.10033 TRUE TRUE TRUE TRUE TRUE
## 112 d 46.99930 -11.00269 TRUE TRUE TRUE TRUE TRUE
## 113 m 44.60472 -11.05668 TRUE TRUE TRUE TRUE TRUE
## 114 b 50.33073 -12.06249 TRUE TRUE TRUE TRUE TRUE
## 115 y 49.89654 -15.21176 TRUE TRUE TRUE TRUE TRUE
## 116 r 46.61428 -22.01891 TRUE TRUE TRUE TRUE TRUE
## 117 s 46.38357 -23.96059 TRUE TRUE TRUE TRUE TRUE
## 118 s 47.78609 -19.11666 TRUE TRUE TRUE TRUE TRUE
## 119 l 44.13842 -16.00176 TRUE TRUE TRUE TRUE TRUE
## 120 n 49.53060 -20.58976 TRUE TRUE TRUE TRUE TRUE
## 121 l 42.98346 -17.63155 TRUE TRUE TRUE TRUE TRUE
## 122 v 48.48893 -20.20935 TRUE TRUE TRUE TRUE TRUE
## 123 c 49.46769 -13.10034 TRUE TRUE TRUE TRUE TRUE
## 124 s 44.60732 -14.53303 TRUE TRUE TRUE TRUE TRUE
## 125 q 49.28102 -12.04293 TRUE TRUE TRUE TRUE TRUE
## 126 y 48.81690 -16.71582 TRUE TRUE TRUE TRUE TRUE
## 127 z 45.20443 -24.20471 TRUE TRUE TRUE TRUE TRUE
## 128 n 47.42808 -18.53953 TRUE TRUE TRUE TRUE TRUE
## 129 w 44.47183 -15.44532 TRUE TRUE TRUE TRUE TRUE
## 130 f 45.10992 -20.40357 TRUE TRUE TRUE TRUE TRUE
## 131 q 48.11443 -19.13391 TRUE TRUE TRUE TRUE TRUE
## 132 e 43.38047 -14.91228 TRUE TRUE TRUE TRUE TRUE
## 133 v 43.21990 -18.42594 TRUE TRUE TRUE TRUE TRUE
## 134 n 49.87413 -25.31316 TRUE TRUE TRUE TRUE TRUE
## 135 l 44.37213 -25.26315 TRUE TRUE TRUE TRUE TRUE
## 136 r 42.83473 -24.40360 TRUE TRUE TRUE TRUE TRUE
## 137 g 49.25300 -24.54716 TRUE TRUE TRUE TRUE TRUE
## 138 i 42.77585 -11.90303 TRUE TRUE TRUE TRUE TRUE
## 139 i 50.25854 -23.59804 TRUE TRUE TRUE TRUE TRUE
## 140 q 42.75531 -23.96597 TRUE TRUE TRUE TRUE TRUE
## 141 r 43.26437 -15.46460 TRUE TRUE TRUE TRUE TRUE
## 142 w 42.75644 -14.36546 TRUE TRUE TRUE TRUE TRUE
## 143 d 50.32222 -15.65287 TRUE TRUE TRUE TRUE TRUE
## 144 t 43.83573 -16.41725 TRUE TRUE TRUE TRUE TRUE
## 145 e 42.40943 -19.19846 TRUE TRUE TRUE TRUE TRUE
## 146 v 48.22779 -16.12883 TRUE TRUE TRUE TRUE TRUE
## 147 o 44.19167 -22.04497 TRUE TRUE TRUE TRUE TRUE
## 148 w 48.27839 -12.47075 TRUE TRUE TRUE TRUE TRUE
## 149 p 44.54864 -22.23884 TRUE TRUE TRUE TRUE TRUE
## 150 p 45.10895 -22.66670 TRUE TRUE TRUE TRUE TRUE
## 151 j 48.77856 -22.59544 TRUE TRUE TRUE TRUE TRUE
## 152 h 42.64749 -12.14627 TRUE TRUE TRUE TRUE TRUE
## 153 j 45.44877 -16.74619 TRUE TRUE TRUE TRUE TRUE
## 154 r 46.09674 -21.13085 TRUE TRUE TRUE TRUE TRUE
## 155 e 42.41916 -11.89262 TRUE TRUE TRUE TRUE TRUE
## 156 p 43.58359 -15.86326 TRUE TRUE TRUE TRUE TRUE
## 157 n 49.67242 -11.90380 TRUE TRUE TRUE TRUE TRUE
## 158 a 43.45945 -11.39999 TRUE TRUE TRUE TRUE TRUE
## 159 v 48.76599 -18.19055 TRUE TRUE TRUE TRUE TRUE
## 160 c 45.27119 -20.09108 TRUE TRUE TRUE TRUE TRUE
## 161 t 45.91788 -15.96579 TRUE TRUE TRUE TRUE TRUE
## 162 q 49.44890 -22.24958 TRUE TRUE TRUE TRUE TRUE
## 163 q 47.54690 -15.58417 TRUE TRUE TRUE TRUE TRUE
## 164 y 47.04942 -12.43993 TRUE TRUE TRUE TRUE TRUE
## 165 z 42.23309 -17.92042 TRUE TRUE TRUE TRUE TRUE
## 166 p 50.60985 -11.75911 TRUE TRUE TRUE TRUE TRUE
## 167 y 47.95595 -13.23602 TRUE TRUE TRUE TRUE TRUE
## 168 x 46.23414 -21.91418 TRUE TRUE TRUE TRUE TRUE
## 169 t 49.96675 -19.39709 TRUE TRUE TRUE TRUE TRUE
## 170 p 50.24237 -22.72095 TRUE TRUE TRUE TRUE TRUE
## 171 g 44.65291 -24.30957 TRUE TRUE TRUE TRUE TRUE
## 172 v 42.48343 -23.99245 TRUE TRUE TRUE TRUE TRUE
## 173 b 48.72915 -20.04092 TRUE TRUE TRUE TRUE TRUE
## 174 r 50.63772 -24.46493 TRUE TRUE TRUE TRUE TRUE
## 175 e 48.79448 -20.57736 TRUE TRUE TRUE TRUE TRUE
## 176 z 45.17639 -15.87400 TRUE TRUE TRUE TRUE TRUE
## 177 j 45.47444 -17.59675 TRUE TRUE TRUE TRUE TRUE
## 178 h 46.25624 -22.35446 TRUE TRUE TRUE TRUE TRUE
## 179 r 50.90072 -13.08348 TRUE TRUE TRUE TRUE TRUE
## 180 k 42.20132 -18.76894 TRUE TRUE TRUE TRUE TRUE
## 181 f 48.02464 -14.76150 TRUE TRUE TRUE TRUE TRUE
## 182 h 44.99305 -11.14091 TRUE TRUE TRUE TRUE TRUE
## 183 w 43.22851 -23.48841 TRUE TRUE TRUE TRUE TRUE
## 184 y 42.41693 -20.62161 TRUE TRUE TRUE TRUE TRUE
## 185 n 46.49501 -22.67301 TRUE TRUE TRUE TRUE TRUE
## 186 r 47.57834 -16.55797 TRUE TRUE TRUE TRUE TRUE
## 187 u 45.85827 -11.12132 TRUE TRUE TRUE TRUE TRUE
## 188 m 42.51685 -12.47987 TRUE TRUE TRUE TRUE TRUE
## 189 w 43.76790 -19.56184 TRUE TRUE TRUE TRUE TRUE
## 190 s 50.93977 -22.50521 TRUE TRUE TRUE TRUE TRUE
## 191 u 48.59199 -20.75355 TRUE TRUE TRUE TRUE TRUE
## 192 l 43.96398 -25.97814 TRUE TRUE TRUE TRUE TRUE
## 193 f 47.34199 -18.34360 TRUE TRUE TRUE TRUE TRUE
## 194 i 48.90730 -19.90369 TRUE TRUE TRUE TRUE TRUE
## 195 v 43.46306 -14.18464 TRUE TRUE TRUE TRUE TRUE
## 196 c 46.77790 -17.54921 TRUE TRUE TRUE TRUE TRUE
## 197 c 43.89798 -20.27962 TRUE TRUE TRUE TRUE TRUE
## 198 f 44.19955 -25.45451 TRUE TRUE TRUE TRUE TRUE
## 199 p 47.15078 -19.13664 TRUE TRUE TRUE TRUE TRUE
## 200 c 42.29943 -25.85997 TRUE TRUE TRUE TRUE TRUE
## 201 l 48.77541 -19.93621 TRUE TRUE TRUE TRUE TRUE
## 202 r 50.46370 -12.67779 TRUE TRUE TRUE TRUE TRUE
## 203 o 48.01572 -21.25904 TRUE TRUE TRUE TRUE TRUE
## 204 v 47.23324 -25.42260 TRUE TRUE TRUE TRUE TRUE
## 205 t 48.37226 -11.93376 TRUE TRUE TRUE TRUE TRUE
## 206 j 49.69528 -14.67603 TRUE TRUE TRUE TRUE TRUE
## 207 z 46.06268 -22.15271 TRUE TRUE TRUE TRUE TRUE
## 208 b 49.97101 -13.88328 TRUE TRUE TRUE TRUE TRUE
## 209 l 44.11513 -19.65934 TRUE TRUE TRUE TRUE TRUE
## 210 t 50.39503 -15.55402 TRUE TRUE TRUE TRUE TRUE
## 211 p 48.25199 -11.88655 TRUE TRUE TRUE TRUE TRUE
## 212 z 45.90567 -23.02689 TRUE TRUE TRUE TRUE TRUE
## 213 c 46.36629 -12.06558 TRUE TRUE TRUE TRUE TRUE
## 214 y 49.39284 -22.12101 TRUE TRUE TRUE TRUE TRUE
## 215 t 43.52331 -21.02947 TRUE TRUE TRUE TRUE TRUE
## 216 h 44.92737 -16.40351 TRUE TRUE TRUE TRUE TRUE
## 217 h 49.05715 -23.06282 TRUE TRUE TRUE TRUE TRUE
## 218 g 47.47413 -16.59857 TRUE TRUE TRUE TRUE TRUE
## 219 n 45.92569 -14.90568 TRUE TRUE TRUE TRUE TRUE
## 220 u 48.18932 -12.54389 TRUE TRUE TRUE TRUE TRUE
## 221 h 50.77166 -22.98788 TRUE TRUE TRUE TRUE TRUE
## 222 f 47.33253 -20.36102 TRUE TRUE TRUE TRUE TRUE
## 223 d 44.83019 -24.31481 TRUE TRUE TRUE TRUE TRUE
## 224 g 45.10038 -20.41926 TRUE TRUE TRUE TRUE TRUE
## 225 n 42.45133 -19.85839 TRUE TRUE TRUE TRUE TRUE
## 226 p 46.50129 -14.18795 TRUE TRUE TRUE TRUE TRUE
## 227 v 43.63416 -19.66356 TRUE TRUE TRUE TRUE TRUE
## 228 q 43.27273 -12.86189 TRUE TRUE TRUE TRUE TRUE
## 229 m 48.22686 -14.01560 TRUE TRUE TRUE TRUE TRUE
## 230 y 50.69444 -20.24032 TRUE TRUE TRUE TRUE TRUE
## 231 j 42.89292 -15.93477 TRUE TRUE TRUE TRUE TRUE
## 232 m 50.95544 -16.66376 TRUE TRUE TRUE TRUE TRUE
## 233 q 50.89336 -14.72263 TRUE TRUE TRUE TRUE TRUE
## 234 t 47.02953 -16.85623 TRUE TRUE TRUE TRUE TRUE
## 235 z 48.86044 -13.09634 TRUE TRUE TRUE TRUE TRUE
## 236 x 43.37459 -25.73203 TRUE TRUE TRUE TRUE TRUE
## 237 a 45.16985 -24.38672 TRUE TRUE TRUE TRUE TRUE
## 238 f 45.17163 -14.54149 TRUE TRUE TRUE TRUE TRUE
## 239 l 47.70606 -15.85942 TRUE TRUE TRUE TRUE TRUE
## 240 h 45.40316 -24.54162 TRUE TRUE TRUE TRUE TRUE
## 241 b 45.74062 -23.28833 TRUE TRUE TRUE TRUE TRUE
## 242 c 42.57740 -19.61903 TRUE TRUE TRUE TRUE TRUE
## 243 u 44.87698 -22.39369 TRUE TRUE TRUE TRUE TRUE
## 244 s 50.90221 -21.67745 TRUE TRUE TRUE TRUE TRUE
## 245 s 45.40282 -14.35720 TRUE TRUE TRUE TRUE TRUE
## 246 y 49.04774 -13.51070 TRUE TRUE TRUE TRUE TRUE
## 247 m 42.53044 -22.77043 TRUE TRUE TRUE TRUE TRUE
## 248 k 42.19256 -11.68454 TRUE TRUE TRUE TRUE TRUE
## 249 p 45.07205 -13.95623 TRUE TRUE TRUE TRUE TRUE
## 250 r 50.82171 -15.64992 TRUE TRUE TRUE TRUE TRUE
## sea otl gbf inst summary
## 1 FALSE TRUE TRUE TRUE FALSE
## 2 TRUE TRUE TRUE TRUE TRUE
## 3 FALSE TRUE TRUE TRUE FALSE
## 4 FALSE TRUE TRUE TRUE FALSE
## 5 FALSE TRUE TRUE TRUE FALSE
## 6 FALSE TRUE TRUE TRUE FALSE
## 7 FALSE TRUE TRUE TRUE FALSE
## 8 FALSE TRUE TRUE TRUE FALSE
## 9 FALSE FALSE TRUE TRUE FALSE
## 10 TRUE TRUE TRUE TRUE TRUE
## 11 FALSE TRUE TRUE TRUE FALSE
## 12 FALSE TRUE TRUE TRUE FALSE
## 13 FALSE TRUE TRUE TRUE FALSE
## 14 TRUE TRUE TRUE TRUE TRUE
## 15 TRUE TRUE TRUE TRUE TRUE
## 16 FALSE TRUE TRUE TRUE FALSE
## 17 FALSE TRUE TRUE TRUE FALSE
## 18 FALSE TRUE TRUE TRUE FALSE
## 19 FALSE TRUE TRUE TRUE FALSE
## 20 TRUE FALSE TRUE TRUE FALSE
## 21 FALSE TRUE TRUE TRUE FALSE
## 22 FALSE TRUE TRUE TRUE FALSE
## 23 TRUE TRUE TRUE TRUE TRUE
## 24 TRUE TRUE TRUE TRUE TRUE
## 25 TRUE TRUE TRUE TRUE TRUE
## 26 TRUE TRUE TRUE TRUE TRUE
## 27 FALSE TRUE TRUE TRUE FALSE
## 28 TRUE TRUE TRUE TRUE TRUE
## 29 TRUE TRUE TRUE TRUE TRUE
## 30 FALSE TRUE TRUE TRUE FALSE
## 31 TRUE TRUE TRUE TRUE TRUE
## 32 FALSE TRUE TRUE TRUE FALSE
## 33 FALSE TRUE TRUE TRUE FALSE
## 34 TRUE TRUE TRUE TRUE TRUE
## 35 FALSE TRUE TRUE TRUE FALSE
## 36 FALSE TRUE TRUE TRUE FALSE
## 37 FALSE TRUE TRUE TRUE FALSE
## 38 FALSE TRUE TRUE TRUE FALSE
## 39 TRUE TRUE TRUE TRUE TRUE
## 40 FALSE TRUE TRUE TRUE FALSE
## 41 TRUE TRUE TRUE TRUE TRUE
## 42 TRUE TRUE TRUE TRUE TRUE
## 43 TRUE TRUE TRUE TRUE TRUE
## 44 FALSE TRUE TRUE TRUE FALSE
## 45 TRUE TRUE TRUE TRUE TRUE
## 46 FALSE TRUE TRUE TRUE FALSE
## 47 FALSE TRUE TRUE TRUE FALSE
## 48 TRUE TRUE TRUE TRUE TRUE
## 49 FALSE TRUE TRUE TRUE FALSE
## 50 FALSE TRUE TRUE TRUE FALSE
## 51 TRUE TRUE TRUE TRUE TRUE
## 52 FALSE TRUE TRUE TRUE FALSE
## 53 FALSE TRUE TRUE TRUE FALSE
## 54 FALSE TRUE TRUE TRUE FALSE
## 55 FALSE TRUE TRUE TRUE FALSE
## 56 FALSE TRUE TRUE TRUE FALSE
## 57 FALSE TRUE TRUE TRUE FALSE
## 58 TRUE TRUE TRUE TRUE TRUE
## 59 FALSE TRUE TRUE TRUE FALSE
## 60 TRUE TRUE TRUE TRUE TRUE
## 61 TRUE TRUE TRUE TRUE TRUE
## 62 FALSE TRUE TRUE TRUE FALSE
## 63 FALSE TRUE TRUE TRUE FALSE
## 64 FALSE TRUE TRUE TRUE FALSE
## 65 FALSE TRUE TRUE TRUE FALSE
## 66 TRUE TRUE TRUE TRUE TRUE
## 67 FALSE TRUE TRUE TRUE FALSE
## 68 FALSE TRUE TRUE TRUE FALSE
## 69 FALSE TRUE TRUE TRUE FALSE
## 70 TRUE TRUE TRUE TRUE TRUE
## 71 FALSE TRUE TRUE TRUE FALSE
## 72 FALSE TRUE TRUE TRUE FALSE
## 73 TRUE TRUE TRUE TRUE TRUE
## 74 TRUE TRUE TRUE TRUE TRUE
## 75 FALSE TRUE TRUE TRUE FALSE
## 76 FALSE TRUE TRUE TRUE FALSE
## 77 TRUE TRUE TRUE TRUE TRUE
## 78 FALSE TRUE TRUE TRUE FALSE
## 79 FALSE TRUE TRUE TRUE FALSE
## 80 FALSE TRUE TRUE TRUE FALSE
## 81 TRUE TRUE TRUE TRUE TRUE
## 82 TRUE TRUE TRUE TRUE TRUE
## 83 FALSE TRUE TRUE TRUE FALSE
## 84 FALSE TRUE TRUE TRUE FALSE
## 85 TRUE TRUE TRUE TRUE TRUE
## 86 FALSE TRUE TRUE TRUE FALSE
## 87 FALSE TRUE TRUE TRUE FALSE
## 88 FALSE TRUE TRUE TRUE FALSE
## 89 TRUE TRUE TRUE TRUE TRUE
## 90 FALSE TRUE TRUE TRUE FALSE
## 91 FALSE TRUE TRUE TRUE FALSE
## 92 TRUE TRUE TRUE TRUE TRUE
## 93 TRUE TRUE TRUE TRUE TRUE
## 94 FALSE TRUE TRUE TRUE FALSE
## 95 FALSE TRUE TRUE TRUE FALSE
## 96 FALSE TRUE TRUE TRUE FALSE
## 97 FALSE TRUE TRUE TRUE FALSE
## 98 TRUE TRUE TRUE TRUE TRUE
## 99 TRUE TRUE TRUE TRUE TRUE
## 100 FALSE TRUE TRUE TRUE FALSE
## 101 FALSE TRUE TRUE TRUE FALSE
## 102 FALSE TRUE TRUE TRUE FALSE
## 103 FALSE TRUE TRUE TRUE FALSE
## 104 FALSE TRUE TRUE TRUE FALSE
## 105 FALSE TRUE TRUE TRUE FALSE
## 106 FALSE TRUE TRUE TRUE FALSE
## 107 TRUE TRUE TRUE TRUE TRUE
## 108 TRUE TRUE TRUE TRUE TRUE
## 109 TRUE TRUE TRUE TRUE TRUE
## 110 FALSE TRUE TRUE TRUE FALSE
## 111 TRUE TRUE TRUE TRUE TRUE
## 112 FALSE TRUE TRUE TRUE FALSE
## 113 FALSE TRUE TRUE TRUE FALSE
## 114 FALSE TRUE TRUE TRUE FALSE
## 115 TRUE TRUE TRUE TRUE TRUE
## 116 TRUE TRUE TRUE TRUE TRUE
## 117 TRUE TRUE TRUE TRUE TRUE
## 118 TRUE TRUE TRUE TRUE TRUE
## 119 FALSE TRUE TRUE TRUE FALSE
## 120 FALSE TRUE TRUE TRUE FALSE
## 121 FALSE TRUE TRUE TRUE FALSE
## 122 TRUE TRUE TRUE TRUE TRUE
## 123 TRUE TRUE TRUE TRUE TRUE
## 124 FALSE TRUE TRUE TRUE FALSE
## 125 FALSE TRUE TRUE TRUE FALSE
## 126 TRUE TRUE TRUE TRUE TRUE
## 127 TRUE TRUE TRUE TRUE TRUE
## 128 TRUE TRUE TRUE TRUE TRUE
## 129 FALSE TRUE TRUE TRUE FALSE
## 130 TRUE TRUE TRUE TRUE TRUE
## 131 TRUE TRUE TRUE TRUE TRUE
## 132 FALSE TRUE TRUE TRUE FALSE
## 133 FALSE TRUE TRUE TRUE FALSE
## 134 FALSE TRUE TRUE TRUE FALSE
## 135 FALSE TRUE TRUE TRUE FALSE
## 136 FALSE TRUE TRUE TRUE FALSE
## 137 FALSE TRUE TRUE TRUE FALSE
## 138 FALSE TRUE TRUE TRUE FALSE
## 139 FALSE TRUE TRUE TRUE FALSE
## 140 FALSE TRUE TRUE TRUE FALSE
## 141 FALSE TRUE TRUE TRUE FALSE
## 142 FALSE TRUE TRUE TRUE FALSE
## 143 TRUE TRUE TRUE TRUE TRUE
## 144 FALSE TRUE TRUE TRUE FALSE
## 145 FALSE TRUE TRUE TRUE FALSE
## 146 TRUE TRUE TRUE TRUE TRUE
## 147 TRUE TRUE TRUE TRUE TRUE
## 148 FALSE TRUE TRUE TRUE FALSE
## 149 TRUE TRUE TRUE TRUE TRUE
## 150 TRUE TRUE TRUE TRUE TRUE
## 151 FALSE TRUE TRUE TRUE FALSE
## 152 FALSE TRUE TRUE TRUE FALSE
## 153 TRUE TRUE TRUE TRUE TRUE
## 154 TRUE TRUE TRUE TRUE TRUE
## 155 FALSE TRUE TRUE TRUE FALSE
## 156 FALSE TRUE TRUE TRUE FALSE
## 157 FALSE TRUE TRUE TRUE FALSE
## 158 FALSE TRUE TRUE TRUE FALSE
## 159 TRUE TRUE TRUE TRUE TRUE
## 160 TRUE TRUE TRUE TRUE TRUE
## 161 TRUE TRUE TRUE TRUE TRUE
## 162 FALSE TRUE TRUE TRUE FALSE
## 163 TRUE TRUE TRUE TRUE TRUE
## 164 FALSE TRUE TRUE TRUE FALSE
## 165 FALSE TRUE TRUE TRUE FALSE
## 166 FALSE TRUE TRUE TRUE FALSE
## 167 FALSE TRUE TRUE TRUE FALSE
## 168 TRUE TRUE TRUE TRUE TRUE
## 169 FALSE TRUE TRUE TRUE FALSE
## 170 FALSE TRUE TRUE TRUE FALSE
## 171 TRUE TRUE TRUE TRUE TRUE
## 172 FALSE TRUE TRUE TRUE FALSE
## 173 TRUE TRUE TRUE TRUE TRUE
## 174 FALSE TRUE TRUE TRUE FALSE
## 175 FALSE TRUE TRUE TRUE FALSE
## 176 FALSE TRUE TRUE TRUE FALSE
## 177 TRUE TRUE TRUE TRUE TRUE
## 178 TRUE TRUE TRUE TRUE TRUE
## 179 FALSE TRUE TRUE TRUE FALSE
## 180 FALSE TRUE TRUE TRUE FALSE
## 181 TRUE TRUE TRUE TRUE TRUE
## 182 FALSE TRUE TRUE TRUE FALSE
## 183 FALSE TRUE TRUE TRUE FALSE
## 184 FALSE TRUE TRUE TRUE FALSE
## 185 TRUE TRUE TRUE TRUE TRUE
## 186 TRUE TRUE TRUE TRUE TRUE
## 187 FALSE TRUE TRUE TRUE FALSE
## 188 FALSE TRUE TRUE TRUE FALSE
## 189 FALSE TRUE TRUE TRUE FALSE
## 190 FALSE TRUE TRUE TRUE FALSE
## 191 FALSE TRUE TRUE TRUE FALSE
## 192 FALSE TRUE TRUE TRUE FALSE
## 193 TRUE TRUE TRUE TRUE TRUE
## 194 FALSE TRUE TRUE TRUE FALSE
## 195 FALSE TRUE TRUE TRUE FALSE
## 196 TRUE TRUE TRUE TRUE TRUE
## 197 FALSE TRUE TRUE TRUE FALSE
## 198 FALSE TRUE TRUE TRUE FALSE
## 199 TRUE TRUE TRUE TRUE TRUE
## 200 FALSE TRUE TRUE TRUE FALSE
## 201 TRUE TRUE TRUE TRUE TRUE
## 202 FALSE TRUE TRUE TRUE FALSE
## 203 TRUE TRUE TRUE TRUE TRUE
## 204 FALSE TRUE TRUE TRUE FALSE
## 205 FALSE TRUE TRUE TRUE FALSE
## 206 TRUE TRUE TRUE TRUE TRUE
## 207 TRUE TRUE TRUE TRUE TRUE
## 208 TRUE TRUE TRUE TRUE TRUE
## 209 FALSE TRUE TRUE TRUE FALSE
## 210 TRUE TRUE TRUE TRUE TRUE
## 211 FALSE TRUE TRUE TRUE FALSE
## 212 TRUE TRUE TRUE TRUE TRUE
## 213 FALSE TRUE TRUE TRUE FALSE
## 214 FALSE TRUE TRUE TRUE FALSE
## 215 FALSE TRUE TRUE TRUE FALSE
## 216 TRUE TRUE TRUE TRUE TRUE
## 217 FALSE TRUE TRUE TRUE FALSE
## 218 TRUE TRUE TRUE TRUE TRUE
## 219 FALSE TRUE TRUE TRUE FALSE
## 220 FALSE TRUE TRUE TRUE FALSE
## 221 FALSE TRUE TRUE TRUE FALSE
## 222 TRUE TRUE TRUE TRUE TRUE
## 223 TRUE TRUE TRUE TRUE TRUE
## 224 TRUE TRUE TRUE TRUE TRUE
## 225 FALSE TRUE TRUE TRUE FALSE
## 226 FALSE TRUE TRUE TRUE FALSE
## 227 FALSE TRUE TRUE TRUE FALSE
## 228 FALSE TRUE TRUE TRUE FALSE
## 229 TRUE TRUE TRUE TRUE TRUE
## 230 FALSE TRUE TRUE TRUE FALSE
## 231 FALSE TRUE TRUE TRUE FALSE
## 232 FALSE TRUE TRUE TRUE FALSE
## 233 FALSE TRUE TRUE TRUE FALSE
## 234 TRUE TRUE TRUE TRUE TRUE
## 235 TRUE TRUE TRUE TRUE TRUE
## 236 FALSE TRUE TRUE TRUE FALSE
## 237 TRUE TRUE TRUE TRUE TRUE
## 238 FALSE TRUE TRUE TRUE FALSE
## 239 TRUE TRUE TRUE TRUE TRUE
## 240 TRUE TRUE TRUE TRUE TRUE
## 241 TRUE TRUE TRUE TRUE TRUE
## 242 FALSE TRUE TRUE TRUE FALSE
## 243 TRUE FALSE TRUE TRUE FALSE
## 244 FALSE TRUE TRUE TRUE FALSE
## 245 FALSE TRUE TRUE TRUE FALSE
## 246 TRUE TRUE TRUE TRUE TRUE
## 247 FALSE TRUE TRUE TRUE FALSE
## 248 FALSE TRUE TRUE TRUE FALSE
## 249 FALSE TRUE TRUE TRUE FALSE
## 250 FALSE TRUE TRUE TRUE FALSE
| Test | Default radius [°] | Default radius (lat 0°/45°) [km] |
|---|---|---|
| capitals | 0.05 | 5.5/4 |
| centroids | 0.01 | 1.1/0.8 |
| gbif | 1 | 63.0 |
| institutions | 0.001 | 0.1/0.08 |
| zeros | 0.5 | 55.6 |
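The dependence of a degree-based radius on latitude can be illustrated with a short base R sketch. It assumes a spherical Earth with a mean radius of 6371 km, and the helper name deg_to_km is hypothetical (it is not part of CoordinateCleaner), so the results are approximations for orientation only.

```r
# Approximate extent in kilometres of a radius given in decimal degrees.
# Assumption: spherical Earth with a mean radius of 6371 km.
deg_to_km <- function(radius_deg, lat_deg, earth_radius_km = 6371) {
  km_per_deg <- 2 * pi * earth_radius_km / 360
  data.frame(
    lat = lat_deg,
    # north-south extent does not depend on latitude
    north_south_km = radius_deg * km_per_deg,
    # east-west extent shrinks with the cosine of the latitude
    east_west_km = radius_deg * km_per_deg * cos(lat_deg * pi / 180)
  )
}

# Default capitals radius of 0.05 degrees at the equator and at 45 degrees
print(deg_to_km(0.05, c(0, 45)))
```

For 0.05° this gives roughly 5.6 km at the equator and roughly 3.9 km in the east-west direction at 45° latitude, in line with the capitals row of the table.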
You can use custom gazetteers for all clean_coordinates tests via the _ref arguments of the function. For example, the capitals_ref argument controls the reference for the capitals test. Customized reference data must follow the same format as the default reference for the same test. You can check the structure of a gazetteer via its documentation or by inspecting it directly (e.g. head(countryref)). For example:
#check the format of the default capitals reference
head(countryref) #a data.frame with four columns: ISO3, capital, longitude, latitude
#create new reference data set from scratch. For real analysis you
#probably want to load the alternative file from a .txt file
my.cap <- data.frame(ISO3 = LETTERS[1:10],
capital = letters[1:10],
capital.longitude = runif(10, -180, 180),
capital.latitude = runif(10, -90, 90))
flags <- clean_coordinates(exmpl, capitals_ref = my.cap)
In this way, tests can be completely customized. For example, you could provide a gazetteer with the locations of hardware stores (in the capitals format) to flag records around hardware stores.
Classes of the default gazetteers of clean_coordinates.
| Test | Default gazetteer | Class | Argument |
|---|---|---|---|
| capitals | countryref | data.frame | capitals_ref |
| centroids | countryref | data.frame | centroids_ref |
| countries | rnaturalearth::ne_countries(scale = "medium") | SpatialPolygonsDataFrame | country_ref |
| institutions | institutions | data.frame | inst_ref |
| seas | landmass | SpatialPolygonsDataFrame | seas_ref |
| urban | rnaturalearth::ne_download(scale = "medium", type = "urban_areas") | SpatialPolygonsDataFrame | urban_ref |
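Because a custom gazetteer has to match the class and layout of the default one, it can help to compare the two before running a test. The following is a minimal base R sketch. The helper check_ref_format and the stand-in data frames are hypothetical and not part of CoordinateCleaner; with the package loaded, the real default (e.g. countryref) would take the place of default_ref.

```r
# Compare a custom gazetteer against a default one.
# Returns a character vector of problems; an empty vector means no mismatch was found.
check_ref_format <- function(custom, default) {
  problems <- character(0)
  if (!identical(class(custom), class(default))) {
    problems <- c(problems, sprintf("class differs: %s vs %s",
                                    class(custom)[1], class(default)[1]))
  }
  missing_cols <- setdiff(names(default), names(custom))
  if (length(missing_cols) > 0) {
    problems <- c(problems,
                  paste("missing columns:", paste(missing_cols, collapse = ", ")))
  }
  problems
}

# Hypothetical stand-ins; the column names are assumptions for illustration only
default_ref <- data.frame(ISO3 = "A", capital = "a",
                          capital.longitude = 0, capital.latitude = 0)
good_ref <- data.frame(ISO3 = "B", capital = "b",
                       capital.longitude = 10, capital.latitude = 20)
bad_ref <- data.frame(ISO3 = "C", capital = "c", capital.longitude = 5)

print(check_ref_format(good_ref, default_ref))
print(check_ref_format(bad_ref, default_ref))
```

The check only covers class and column names. Column types and coordinate ranges would still need a manual look.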
You can easily summarize the results of clean_coordinates either with the report option or via summary. If report = TRUE, the summary is written to the working directory as a .txt file; if report is a character string, it is the path to which the summary file is written. Alternatively, summary returns the number of records flagged by each test.
#via the report option
flags <- clean_coordinates(exmpl, report = T)
## Testing coordinate validity
## Flagged 0 records.
## Testing equal lat/lon
## Flagged 0 records.
## Testing zero coordinates
## Flagged 0 records.
## Testing country capitals
## Flagged 0 records.
## Testing country centroids
## Flagged 0 records.
## Testing sea coordinates
## OGR data source with driver: ESRI Shapefile
## Source: "C:\Users\alexander.zizka\AppData\Local\Temp\Rtmpqc5aFR", layer: "ne_50m_land"
## with 1420 features
## It has 3 fields
## Integer64 fields read as strings: scalerank
## Flagged 156 records.
## Testing geographic outliers
## Flagged 3 records.
## Testing GBIF headquarters, flagging records around Copenhagen
## Flagged 0 records.
## Testing biodiversity institutions
## Flagged 0 records.
## Flagged 158 of 250 records, EQ = 0.63.
#via summary
summary(flags)
## decimallatitude val equ zer
## 0 0 0 0
## cap cen sea otl
## 0 0 156 3
## gbf inst summary
## 0 0 158
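The counts reported by summary can be reproduced by hand from the logical flag columns, where TRUE means a record passed a test and FALSE means it was flagged. The sketch below uses a small mock data frame rather than a real clean_coordinates result. The column names follow the printout above and are assumptions, so check names() on an actual result before adapting it.

```r
# Mock result: one logical column per test, TRUE = passed, FALSE = flagged
flags_demo <- data.frame(
  species = c("l", "w", "o", "u"),
  sea = c(FALSE, TRUE, FALSE, TRUE),
  otl = c(TRUE, TRUE, FALSE, FALSE)
)

# A record passes overall only if it passes every individual test
flags_demo$summary <- flags_demo$sea & flags_demo$otl

# Number of flagged records per test (what summary reports)
test_cols <- c("sea", "otl", "summary")
n_flagged <- colSums(!as.matrix(flags_demo[, test_cols]))
print(n_flagged)

# Keep only the records that passed all tests
clean <- flags_demo[flags_demo$summary, ]
print(clean)
```

Subsetting on the summary column is a common next step, but inspecting the flagged records before dropping them is usually worthwhile, since a flag marks a record as suspicious rather than as certainly wrong.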