partition_tiles divides the study area into a specified number of
rectangular tiles. Optionally, small partitions can be merged with adjacent
tiles to achieve a minimum number or percentage of samples in each tile.
partition_tiles(data, coords = c("x", "y"), dsplit = NULL, nsplit = NULL,
  rotation = c("none", "random", "user"), user_rotation,
  offset = c("none", "random", "user"), user_offset, reassign = TRUE,
  min_frac = 0.025, min_n = 5, iterate = 1, return_factor = FALSE,
  repetition = 1, seed1 = NULL)
data |
|
---|---|
coords | vector of length 2 defining the variables in |
dsplit | optional vector of length 2: equidistance of splits in
(possibly rotated) x direction ( |
nsplit | optional vector of length 2: number of splits in
(possibly rotated) x direction ( |
rotation | indicates whether and how the rectangular grid should
be rotated; random rotation is only between |
user_rotation | if |
offset | indicates whether and how the rectangular grid should be shifted by an offset. |
user_offset | if |
reassign | logical (default |
min_frac | numeric >=0, <1: minimum relative size of a partition as a
fraction of the total sample; argument passed to get_small_tiles.
Will be ignored if |
min_n | integer >=0: minimum number of samples per partition;
argument passed to get_small_tiles.
Will be ignored if |
iterate | argument to be passed to tile_neighbors |
return_factor | if |
repetition | numeric vector: cross-validation repetitions
to be generated. Note that this is not the number of repetitions,
but the indices of these repetitions. E.g., use |
seed1 |
|
A represampling object.
Contains length(repetition) resampling objects as repetitions.
The exact number of folds / test-set tiles within each resampling object
depends on the spatial configuration of the data set and on possible
cleaning steps (see min_frac, min_n).
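To make the effect of the cleaning thresholds concrete, the following base R sketch flags small tiles from the per-tile test-set sizes printed by summary(parti) in the examples on this page. The helper find_small_tiles is hypothetical and is not sperrorest's get_small_tiles; it assumes that a tile counts as small when it falls below either threshold, which may differ from the package's actual rule.

```r
# Test-set sizes per tile, copied from the summary(parti) output in the examples
n_tile <- c("X1:Y2" = 65, "X1:Y3" = 86, "X2:Y1" = 40, "X2:Y2" = 85,
            "X2:Y3" = 61, "X3:Y1" = 87, "X3:Y2" = 90, "X3:Y3" = 70,
            "X4:Y1" = 80, "X4:Y2" = 59, "X4:Y3" = 28)

# Hypothetical helper (NOT sperrorest's get_small_tiles): return the names of
# tiles below an absolute count or below a fraction of all samples.
# ASSUMPTION: falling below either threshold marks a tile as small.
find_small_tiles <- function(n_tile, min_n = 5, min_frac = 0.025) {
  small <- n_tile < min_n | n_tile / sum(n_tile) < min_frac
  names(n_tile)[small]
}

find_small_tiles(n_tile)               # defaults: no tile is flagged
find_small_tiles(n_tile, min_n = 50)   # flags X2:Y1 and X4:Y3
find_small_tiles(n_tile, min_n = 100)  # flags all 11 tiles
```

With min_n = 100 every tile of the 4 x 3 grid is below the threshold, which is consistent with the much coarser partition reported for parti2 in the examples.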
Default parameter settings may change in future releases.
This function, especially the rotation and shifting part of it and the
algorithm for cleaning up small tiles, is still somewhat experimental.
Use with caution.
For non-zero offsets (offset != 'none'), the number of tiles may
actually be greater than nsplit[1]*nsplit[2] because fractional
tiles extend into the study region. reassign = TRUE with suitable
thresholds is therefore recommended for non-zero (including random) offsets.
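The following base R sketch illustrates why a shifted grid can produce more tiles than nsplit[1]*nsplit[2]. It is a conceptual illustration only, not the code used by partition_tiles: the helper assign_tiles, the simulated coordinates, and the way the offset is applied (moving the grid origin) are all assumptions made for this example.

```r
# Hypothetical helper (NOT part of sperrorest): label each point with the
# grid cell it falls into, for cells of size `d` on a grid anchored at `origin`.
assign_tiles <- function(x, y, d, origin) {
  ix <- floor((x - origin[1]) / d[1]) + 1
  iy <- floor((y - origin[2]) / d[2]) + 1
  factor(paste0("X", ix, ":Y", iy))
}

set.seed(42)
x <- runif(500, 0, 400)  # simulated study area of 400 x 300 units
y <- runif(500, 0, 300)
d <- c(100, 100)         # a 4 x 3 grid covers the extent exactly

# Grid aligned with the lower-left corner: at most 4 * 3 = 12 tiles
tiles_aligned <- assign_tiles(x, y, d, origin = c(0, 0))

# Grid shifted by half a cell: fractional tiles appear along the edges,
# so up to 5 * 4 = 20 tiles can contain samples
tiles_shifted <- assign_tiles(x, y, d, origin = c(-50, -50))

nlevels(tiles_aligned)
nlevels(tiles_shifted)
sort(table(tiles_shifted))  # edge and corner tiles hold the fewest samples
```

The sparsely populated edge and corner tiles in the shifted grid are the kind of partition that merging with reassign = TRUE is meant to absorb.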
sperrorest, as.resampling.factor, get_small_tiles, tile_neighbors
data(ecuador)
parti <- partition_tiles(ecuador, nsplit = c(4, 3), reassign = FALSE)
# plot(parti,ecuador)
summary(parti)
# tile A4 has only 55 samples
#> $`1`
#>       n.train n.test
#> X1:Y2     686     65
#> X1:Y3     665     86
#> X2:Y1     711     40
#> X2:Y2     666     85
#> X2:Y3     690     61
#> X3:Y1     664     87
#> X3:Y2     661     90
#> X3:Y3     681     70
#> X4:Y1     671     80
#> X4:Y2     692     59
#> X4:Y3     723     28
#>

# same partitioning, but now merge tiles with less than 100 samples to
# adjacent tiles:
parti2 <- partition_tiles(ecuador, nsplit = c(4,3), reassign = TRUE, min_n = 100)
# plot(parti2,ecuador)
summary(parti2)
#> $`1`
#>       n.train n.test
#> X1:Y3     600    151
#> X2:Y2     626    125
#> X3:Y1     584    167
#> X3:Y2     574    177
#> X3:Y3     620    131
#>
# tile B4 (in 'parti') was smaller than A3, therefore A4 was merged with B4,
# not with A3

# now with random rotation and offset, and tiles of 2000 m length:
parti3 <- partition_tiles(ecuador, dsplit = 2000, offset = 'random',
  rotation = 'random', reassign = TRUE, min_n = 100)
# plot(parti3, ecuador)
summary(parti3)
#> $`1`
#>       n.train n.test
#> X1:Y2     584    167
#> X2:Y1     530    221
#> X2:Y2     508    243
#> X3:Y2     631    120
#>