Using the 20m (most generalized) file.
Excluding Puerto Rico:
Final dataset used:
From this set of 3142 counties in 51 states, a 5% sample would mean 157 polygons.
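The sample size follows directly from the county count. A minimal sketch of the arithmetic in base R, where the variable names are illustrative and not taken from the original analysis:

```r
# Sample size for a 5% sample of the county polygons.
n_counties  <- 3142   # counties left after excluding Puerto Rico (figure from the text)
sample_frac <- 0.05   # 5% sample

# 3142 * 0.05 = 157.1, rounded to a whole number of polygons
n_sample <- round(n_counties * sample_frac)
n_sample
```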
Also important to note: there are five counties with zero neighbours. Three are in Hawaii:
And two elsewhere:
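Counties without neighbours can be spotted by counting the entries of a neighbour list. The sketch below uses a small hypothetical list in base R rather than the real county data. It assumes the list stores, for each polygon, the indices of the polygons it touches, with an empty vector for islands (a list built with `spdep` would encode islands differently, so this is only an illustration of the idea).

```r
# Toy neighbour list (hypothetical data): element i holds the indices of
# the polygons that touch polygon i; integer(0) means "no neighbours".
nb <- list(
  c(2L, 3L),   # polygon 1 touches 2 and 3
  c(1L, 3L),   # polygon 2 touches 1 and 3
  c(1L, 2L),   # polygon 3 touches 1 and 2
  integer(0),  # polygon 4 is an island
  integer(0)   # polygon 5 is an island
)

# Number of neighbours per polygon, then the positions with none
n_neighbours <- lengths(nb)
islands <- which(n_neighbours == 0)
islands
```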
Also important to note is this text from the paper:
Within this dataset, the 5 boroughs/counties of New York are treated as a single entity. We have done the same in these analyses, assigning all 5 counties the values associated with New York County
The counties in question are most likely these:
At the moment, in the prepared dataset, these counties are excluded from the analyses, since they cannot be merged to the explanatory variables and no spatial join is possible!
Might be an option. Must it be the same between runs? Most likely not enough counties?
## [1] 51
## [1] "55"
## [1] 72
Texas, as the biggest state, gives 8.1% of the data?
## [1] 254
st_sample
Unfortunately, the result is non-contiguous :/
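Whether a given draw is contiguous can be checked on the neighbour graph: the sampled polygons are contiguous if every one of them can be reached from the first while stepping only through other sampled polygons. The following is a minimal base R sketch under that assumption. The function `is_contiguous()` and the toy neighbour list `path_nb` are hypothetical and not part of the original analysis.

```r
# Check whether the polygons in `ids` form one connected block,
# given a neighbour list `nb` (element i = indices touching polygon i).
is_contiguous <- function(nb, ids) {
  ids <- unique(ids)
  if (length(ids) == 0) return(TRUE)
  visited  <- ids[1]
  frontier <- ids[1]
  while (length(frontier) > 0) {
    # neighbours of the frontier that are sampled and not yet visited
    reach    <- unique(unlist(nb[frontier]))
    new_ids  <- setdiff(intersect(reach, ids), visited)
    visited  <- c(visited, new_ids)
    frontier <- new_ids
  }
  length(visited) == length(ids)
}

# Toy example: five polygons in a row, 1 - 2 - 3 - 4 - 5
path_nb <- list(2L, c(1L, 3L), c(2L, 4L), c(3L, 5L), 4L)

is_contiguous(path_nb, c(1L, 2L, 3L))  # adjacent polygons
is_contiguous(path_nb, c(1L, 3L, 5L))  # every other polygon
```

A purely random draw of 157 out of 3142 polygons will almost always fail such a check, which is why a different sampling approach is needed.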
igraph, spdep and sf custom solution
Solution suggested by @Spacedman here.
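The general idea behind a contiguous sample can be illustrated with simple region growing: start from a random polygon and keep adding a random neighbour of the already selected set until the target size is reached. The sketch below is written in base R on a toy grid and is not necessarily the solution referenced above. The function `grow_contiguous_sample()`, its arguments and the grid are all hypothetical.

```r
# Grow a contiguous set of `n` polygons from a seed, given a neighbour
# list `nb` (element i = indices of polygons touching polygon i).
grow_contiguous_sample <- function(nb, n, seed_id = sample(length(nb), 1)) {
  selected <- seed_id
  while (length(selected) < n) {
    # all polygons touching the current selection but not yet in it
    candidates <- setdiff(unique(unlist(nb[selected])), selected)
    if (length(candidates) == 0) {
      stop("Cannot grow further: the seed's connected block has fewer than n polygons")
    }
    # index explicitly: sample(x, 1) on a single number would draw from 1:x
    pick <- candidates[sample.int(length(candidates), 1)]
    selected <- c(selected, pick)
  }
  selected
}

# Toy data: a 4 x 4 grid where cells sharing an edge are neighbours
side <- 4
cell_id <- function(i, j) (i - 1) * side + j
grid_nb <- vector("list", side * side)
for (i in 1:side) {
  for (j in 1:side) {
    nbrs <- integer(0)
    if (i > 1)    nbrs <- c(nbrs, cell_id(i - 1, j))
    if (i < side) nbrs <- c(nbrs, cell_id(i + 1, j))
    if (j > 1)    nbrs <- c(nbrs, cell_id(i, j - 1))
    if (j < side) nbrs <- c(nbrs, cell_id(i, j + 1))
    grid_nb[[cell_id(i, j)]] <- nbrs
  }
}

set.seed(1)
grown <- grow_contiguous_sample(grid_nb, n = 6)
grown
```

With this approach a seed that falls on a county with zero neighbours stops with an error, so such counties would need to be dropped or the seed redrawn.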
## Deleting source `data/cb_2018_us_county_20m_prep_sample.shp' using driver `ESRI Shapefile'
## Writing layer `cb_2018_us_county_20m_prep_sample' to data source `data/cb_2018_us_county_20m_prep_sample.shp' using driver `ESRI Shapefile'
## Writing 157 features with 2 fields and geometry type Multi Polygon.
These are the counties that will be used as input for the 5% sample analyses!