Assessing landscape scale predictions from simulations
Reading in the outputs of the ‘Aggregating_simulations’ script, which was run on the QUT HPC
Load packages
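The package-loading chunk is folded here; a minimal sketch based on the attached packages listed in the session info at the end of this document:
Code
library(tidyverse)  # dplyr, tidyr, purrr, readr, ggplot2, stringr, forcats
library(lubridate)  # date-time handling (with_tz, round_date, hour, yday)
library(amt)        # movement tools (mk_track, hr_kde, hr_mcp, hr_isopleths)
library(terra)      # raster handling (rast, aggregate, extract, global)
library(ecospat)    # Boyce index (ecospat.boyce)
library(ggpattern)  # patterned sf geoms for the isopleth overlays
library(magick)     # assembling the hourly prediction maps into a gif
library(tictoc)     # timing long-running chunks
library(beepr)      # audible alert when chunks finish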
Load observed buffalo data for validation
Code
Rows: 1165406 Columns: 22
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
dbl (18): id, burst_, x1_, x2_, y1_, y2_, sl_, ta_, dt_, hour_t2, step_id_,...
lgl (1): case_
dttm (3): t1_, t2_, t2_rounded
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
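The chunk that reads the data in is folded (the readr message above comes from it); a minimal sketch, with a hypothetical file path:
Code
# hypothetical path - a single CSV of used and random steps for all individuals
buffalo_data_all <- readr::read_csv("data/buffalo_used_and_random_steps.csv")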
Code
buffalo_data_all <- buffalo_data_all %>%
  mutate(t1_ = lubridate::with_tz(t1_, tzone = "Australia/Darwin"),
         t2_ = lubridate::with_tz(t2_, tzone = "Australia/Darwin"))
buffalo_data_all <- buffalo_data_all %>%
mutate(t1_rounded = round_date(t1_, "hour"),
hour_t1 = hour(t1_rounded),
t2 = t2_,
t2_rounded = round_date(t2_, "hour"),
hour_t2 = hour(t2_rounded),
hour = ifelse(hour_t2 == 0, 24, hour_t2),
yday = yday(t2),
year = year(t2),
month = month(t2))
# ensure that we have just the individuals that were used for model fitting
buffalo_year_ids <- c(2005, 2014, 2018, 2021, 2022, 2024, 2039,
2154, 2158, 2223, 2327, 2387, 2393)
buffalo_data_all <- buffalo_data_all %>% filter(id %in% buffalo_year_ids)
# cutoff specified in the local (Australia/Darwin, ACST) time zone
buffalo_data_all <- buffalo_data_all %>%
  filter(t1_ < as.POSIXct("2019-07-25 09:32:42", tz = "Australia/Darwin"))
buffalo_ids <- unique(buffalo_data_all$id)
# make a dataframe of only the presence locations
buffalo_data_all_pres <- buffalo_data_all %>% filter(y == 1)
# convert to track object for step lengths and turning angles etc
buffalo_data_all_pres <- buffalo_data_all_pres %>%
mk_track(id = id, x1_, y1_, t1_, order_by_ts = T, all_cols = T, crs = 3112) %>%
arrange(id)
Subset the buffalo data to the late dry season 2018 for validation
Code
months_wet <- c(1:4, 11:12)
buffalo_data_all_dry2018 <- buffalo_data_all %>%
filter(!month %in% months_wet & year == 2018) # dry season 2018
# subset by spatial extent
buffalo_data_subset_dry2018 <- buffalo_data_all_dry2018 %>% filter(
x2_ > 20000 & x2_ < 40000 & y2_ > -1445000 & y2_ < -1425000)
# total number of observed locations for that period
buffalo_data_subset_dry2018 %>% filter(y == 1) %>% summarise(n = n())
Code
Code
[1] 67.78483
Long-term predictions
Using all simulated locations from each model, which cover a ~4-month period, without splitting them up by the hour of the day.
Load prediction maps
List all the .tif files in the HPC directory, which are the outputs of the DynamicSSF_4a_Aggregating_simulations.R script.
Code
[1] "0p_subset_freq_points_res_25_2024-05-13.tif"
[2] "1p_subset_freq_points_res_25_2024-05-13.tif"
[3] "2p_subset_freq_points_res_25_2024-05-13.tif"
[4] "3p_subset_freq_points_res_25_2024-05-13.tif"
Code
# create a list of the prediction maps from the 'Aggregating simulations' script
# here only pull out the results from the subsetted landscape with 25m resolution
list_all_points_subset <- lapply(sim_data_list[grepl("freq_points_res_25", sim_data_list) &
grepl("subset", sim_data_list)], rast)
# stack into a raster stack
raster_all_points_25x25_subset <- rast(list_all_points_subset)
# check that they have the expected total number of simulated locations
# should be around 300,000
terra::global(raster_all_points_25x25_subset, fun = "sum", na.rm = TRUE)
For the predictions, we aggregate the 25 m cells into 50 m cells. Predictions at a 50 m resolution are fine enough for our purposes, and because each 50 m cell sums four 25 m cells, they require fewer simulated locations to achieve the same density of points per cell (on average a quarter as many).
Code
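The aggregation chunk is folded; a minimal sketch of how the 25 m counts might be summed into 50 m cells with `terra::aggregate()` (the object name `raster_all_points_subset` is taken from its use further down):
# sum the simulated-location counts within each 2 x 2 block of 25 m cells
raster_all_points_subset <- terra::aggregate(raster_all_points_25x25_subset,
                                             fact = 2, fun = sum, na.rm = TRUE)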
Create dataframes of raster values to plot with ggplot
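This chunk is also folded; `df_all_points_subset`, used in the plotting code below, is presumably a list of per-model dataframes of cell coordinates and counts. A sketch following the pattern used in the hourly sections later on (assuming the layer holding the counts is named `sum`, as in the hourly rasters):
Code
df_all_points_subset <- lapply(raster_all_points_subset, function(x) {
  as.data.frame(x, xy = TRUE, na.rm = FALSE)  # columns x, y, sum
})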
KDE of buffalo locations for plotting
We tried several options for estimating the KDE bandwidth. As this is only for plotting, we found that setting the bandwidth manually gave the best results: the reference bandwidth was too broad, and the plug-in bandwidth was too narrow, producing many small clusters of space use that made it difficult to assess each individual buffalo's general space use.
Code
min_x <- min(df_all_points_subset[[1]]$x)
min_y <- min(df_all_points_subset[[1]]$y)
# set the origin of x and y at 0,0
buffalo_data_subset_dry2018 <- buffalo_data_subset_dry2018 %>% mutate(x00 = x1_ - min_x, y00 = y1_ - min_y)
# set origin of template raster to 0,0
template_raster <- raster_all_points_subset[[1]]
ext(template_raster) <- c(0, 2e4, 0, 2e4)
# convert to track object to use HR analyses
buffalo_data_subset_dry2018_pres_track <- buffalo_data_subset_dry2018 %>% filter(y == 1) %>%
mk_track(id = id, x00, y00, t1_, order_by_ts = T, all_cols = T, crs = 3112) %>%
arrange(id)
buffalo_data_subset_nested <- buffalo_data_subset_dry2018_pres_track %>%
nest(data = !id)
buffalo_hr <- buffalo_data_subset_nested %>%
mutate(
hr_mcp = map(data, hr_mcp),
hr_kde_ref = map(data, ~ hr_kde(.,
trast = template_raster,
h = hr_kde_ref(.),
levels = c(0.5, 0.95))),
hr_kde_pi = map(data, ~ hr_kde(.,
trast = template_raster,
h = hr_kde_pi(.),
levels = c(0.5, 0.95))),
hr_kde_manual = map(data, ~ hr_kde(.,
trast = template_raster,
h = c(150,150),
levels = c(0.5, 0.95))),
hr_kde_ref_iso = map(hr_kde_ref, hr_isopleths),
hr_kde_pi_iso = map(hr_kde_pi, hr_isopleths),
hr_kde_manual_iso = map(hr_kde_manual, hr_isopleths)
)
# to plot the KDEs - we decided that the manual approach was the best for plotting
# the reference was too diffuse and the plug-in was too narrow
# walk(buffalo_hr$hr_kde_ref, ~ plot(.x$ud, main = "Reference bandwidth"))
# walk(buffalo_hr$hr_kde_pi, ~ plot(.x$ud, main = "Plug-in bandwidth"))
# walk(buffalo_hr$hr_kde_manual, ~ plot(.x$ud, main = "Manual bandwidth"))
# walk(buffalo_hr$hr_kde_ref_iso, ~ plot(.x$geometry))
# walk(buffalo_hr$hr_kde_pi_iso, ~ plot(.x$geometry))
# walk(buffalo_hr$hr_kde_manual_iso, ~ plot(.x$geometry))
buffalo_isopleths <- buffalo_hr %>% dplyr::select(id, hr_kde_ref_iso, hr_kde_pi_iso, hr_kde_manual_iso) %>%
pivot_longer(!id, names_to = "estimator", values_to = "hr") %>%
unnest(cols = hr)
Plot using ggplot, with buffalo KDE isopleths overlaid
Code
models <- c("0p", "1p", "2p", "3p")
min_x <- min(df_all_points_subset[[1]]$x)
min_y <- min(df_all_points_subset[[1]]$y)
# for equal colour scaling, find min and max number of simulated locations in any cell
# min_prob <- min(df_all_points_subset[[1]]$sum)
# the min number of simulated locations is close to 0
min_prob <- 0
# max number of simulated locations in any one cell is in the 1p predictions,
# we scale the colours to that
max_prob <- max(df_all_points_subset[[2]]$sum)
for(i in 1:length(df_all_points_subset)) {
print(ggplot() +
geom_raster(data = df_all_points_subset[[i]], aes(x = x - min_x, y = y - min_y, fill = sum)) +
scale_fill_viridis_c("Simulated locations", limits = c(min_prob, max_prob)) +
geom_sf_pattern(data = buffalo_isopleths %>% filter(estimator == "hr_kde_manual_iso" & level == 0.95),
aes(geometry = geometry),
pattern = "none",
pattern_angle = 45,
pattern_size = 0.025,
pattern_spacing = 0.005,
pattern_density = 0.01,
pattern_colour = "white",
pattern_fill = "white",
colour = "white",
linewidth = 0.2,
linetype = "twodash",
alpha = 0) +
geom_sf_pattern(data = buffalo_isopleths %>% filter(estimator == "hr_kde_manual_iso" & level == 0.5),
aes(geometry = geometry),
pattern = "none",
pattern_angle = 135,
pattern_size = 0.025,
pattern_spacing = 0.005,
pattern_density = 0.01,
pattern_colour = "white",
pattern_fill = "white",
colour = "white",
linewidth = 0.25,
alpha = 0) +
scale_x_continuous("Easting") +
scale_y_continuous("Northing") +
ggtitle(paste0("Model ", models[i])) +
theme_bw() +
theme(legend.position = "right",
axis.title.x = element_text(size = 8), # reduce x-axis title text size
axis.title.y = element_text(size = 8), # reduce y-axis title text size
axis.text.x = element_text(size = 5), # reduce x-axis text size
axis.text.y = element_text(size = 5), # reduce y-axis text size
plot.title = element_text(size = 10),
legend.text = element_text(size = 6),
legend.title = element_text(size = 7))
)
# ggsave(paste0("outputs/plots/manuscript_figs_R1/sim_predictions_subset_KDE_", models[i], ".png"), #_legend
# width=80, height=75, units="mm", dpi = 1000)
}
Plot a subset to show detail of an area of high observed use (KDE isopleths)
Code
for(i in 1:length(df_all_points_subset)) {
print(ggplot() +
geom_raster(data = df_all_points_subset[[i]], aes(x = x - min_x, y = y - min_y, fill = sum)) +
scale_fill_viridis_c("Simulated locations", limits = c(min_prob, max_prob)) +
geom_sf_pattern(data = buffalo_isopleths %>% filter(estimator == "hr_kde_manual_iso" & level == 0.95),
aes(geometry = geometry),
pattern = "none",
pattern_angle = 45,
pattern_size = 0.05,
pattern_spacing = 0.015,
pattern_density = 0.01,
pattern_colour = "white",
pattern_fill = "white",
colour = "white",
linewidth = 0.25,
alpha = 0) +
geom_sf_pattern(data = buffalo_isopleths %>% filter(estimator == "hr_kde_manual_iso" & level == 0.5),
aes(geometry = geometry),
pattern = "none",
pattern_angle = 135,
pattern_size = 0.05,
pattern_spacing = 0.015,
pattern_density = 0.01,
pattern_colour = "white",
pattern_fill = "white",
colour = "white",
linewidth = 0.5,
alpha = 0) +
scale_x_continuous("Easting", limits = c(5000, 10000)) +
scale_y_continuous("Northing", limits = c(10000, 15000)) +
ggtitle(paste0("Model ", models[i])) +
theme_bw() +
theme(legend.position = "right",
axis.title.x = element_text(size = 8), # Reduce x-axis title text size
axis.title.y = element_text(size = 8), # Reduce y-axis title text size
axis.text.x = element_text(size = 5), # Reduce x-axis text size
axis.text.y = element_text(size = 5), # Reduce y-axis text size
plot.title = element_text(size = 10),
legend.text = element_text(size = 6),
legend.title = element_text(size = 7))
)
# ggsave(paste0("outputs/plots/manuscript_figs_R1/sim_predictions_subset_subset_KDE_", models[i], ".png"), #_legend
# width=80, height=75, units="mm", dpi = 1000)
}
Warning: Removed 150198 rows containing missing values or values outside the scale range
(`geom_raster()`).
Warning: Removed 150198 rows containing missing values or values outside the scale range
(`geom_raster()`).
Warning: Removed 150198 rows containing missing values or values outside the scale range
(`geom_raster()`).
Warning: Removed 150198 rows containing missing values or values outside the scale range
(`geom_raster()`).
Histograms of cells and observed locations
The Boyce index assesses whether the observed locations fall predominantly in cells where the model predicts high use. A histogram of the simulated-location counts per cell, with the counts at the observed locations overlaid, is a useful way to see where a model may be over- or under-predicting.
Code
# relevant presence locations
obs_data <- buffalo_data_subset_dry2018 %>% filter(y == 1) %>% transmute(x = x2_, y = y2_)
for(i in 1:length(df_all_points_subset)) {
# observed (true) locations
obs <- terra::extract(raster_all_points_subset[[i]], obs_data, ID = FALSE)[,1]
print(ggplot() +
geom_histogram(data = df_all_points_subset[[i]],
aes(x = sum, fill = after_stat(x)),
bins = 100) +
scale_fill_viridis_c("Simulated locations", limits = c(min_prob, max_prob)) +
geom_histogram(data = data.frame(sum = obs),
aes(x = sum),
colour = "black", linewidth = 0.1, fill = "white", alpha = 0.9,
bins = 100) +
scale_x_continuous("Simulated locations per cell", breaks = seq(0,4500,500)) +
scale_y_log10("No. of cells") +
ggtitle(paste0("Model ", models[i])) +
theme_classic() +
theme(legend.position = "none",
axis.title.x = element_text(size = 8), # Reduce x-axis title text size
axis.title.y = element_text(size = 8), # Reduce y-axis title text size
axis.text.x = element_text(size = 5), # Reduce x-axis text size
axis.text.y = element_text(size = 5), # Reduce y-axis text size
plot.title = element_text(size = 10)))
# ggsave(paste0("outputs/plots/manuscript_figs_R1/sim_predictions_histogram_", models[i], ".png"),
# width=75, height=30, units="mm", dpi = 1000)
}
Warning in scale_y_log10("No. of cells"): log-10 transformation introduced
infinite values.
Warning: Removed 13 rows containing missing values or values outside the scale range
(`geom_bar()`).
Warning in scale_y_log10("No. of cells"): log-10 transformation introduced
infinite values.
Warning in scale_y_log10("No. of cells"): log-10 transformation introduced
infinite values.
Warning: Removed 6 rows containing missing values or values outside the scale range
(`geom_bar()`).
Warning: Removed 32 rows containing missing values or values outside the scale range
(`geom_bar()`).
Warning in scale_y_log10("No. of cells"): log-10 transformation introduced
infinite values.
Warning in scale_y_log10("No. of cells"): log-10 transformation introduced
infinite values.
Warning: Removed 3 rows containing missing values or values outside the scale range
(`geom_bar()`).
Warning: Removed 14 rows containing missing values or values outside the scale range
(`geom_bar()`).
Warning in scale_y_log10("No. of cells"): log-10 transformation introduced
infinite values.
Warning in scale_y_log10("No. of cells"): log-10 transformation introduced
infinite values.
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_bar()`).
Warning: Removed 18 rows containing missing values or values outside the scale range
(`geom_bar()`).
Validation
Calculating the Boyce index for each model’s long-term predictions. We checked both the Spearman rank correlation coefficient (which assesses monotonicity, i.e. that higher predicted values correspond to consistently higher observed use) and the Pearson correlation coefficient (which assesses linearity). For the manuscript we used the Spearman rank correlation coefficient, as it indicates which areas are likely to be used more often relative to other areas. The Pearson correlation coefficient shows the same relative trend in performance between models.
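To make the F.ratio and HS quantities used below concrete: for each habitat-suitability class (here, a window of simulated-location counts), the Boyce index compares the proportion of observed locations falling in that class (predicted frequency) with the proportion of all cells in that class (expected frequency), and then correlates these predicted-to-expected (F) ratios with the class values. An F-ratio above 1 means a class is used more than expected by chance. A minimal fixed-bin sketch of that calculation (ecospat.boyce itself uses a moving window):
Code
# hand-rolled predicted-to-expected (F) ratio underlying the Boyce index,
# using fixed bins rather than a moving window - for illustration only
boyce_sketch <- function(abs, obs, n_bins = 10) {
  breaks <- seq(min(abs), max(abs), length.out = n_bins + 1)
  mids <- head(breaks, -1) + diff(breaks) / 2
  pred <- hist(obs, breaks = breaks, plot = FALSE)$counts / length(obs)  # predicted frequency
  expd <- hist(abs, breaks = breaks, plot = FALSE)$counts / length(abs)  # expected frequency
  f_ratio <- pred / expd
  ok <- is.finite(f_ratio)
  cor(mids[ok], f_ratio[ok], method = "spearman")
}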
Long-term predictions
Code
# the defaults are 10 and 100
bi_window_width <- 10
bi_res <- 100
models <- c("0p", "1p", "2p", "3p")
F_ratio_list <- vector(mode = "list", length = nlyr(raster_all_points_subset))
spearman_corr_subset <- vector(mode = "numeric", length = nlyr(raster_all_points_subset))
pearson_corr_subset <- vector(mode = "numeric", length = nlyr(raster_all_points_subset))
obs_data <- buffalo_data_subset_dry2018 %>% filter(y == 1) %>% transmute(x = x2_, y = y2_)
for(i in 1:nlyr(raster_all_points_subset)) {
# observed (true) locations
obs <- terra::extract(raster_all_points_subset[[i]], obs_data, ID = FALSE)[,1]
obs <- obs[!is.na(obs)]
# all values of the raster layer
abs <- terra::values(raster_all_points_subset[[i]], na.rm = T)[,1]
abs <- abs[!is.na(abs)]
window.w <- (max(abs) - min(abs))/bi_window_width
boyce_index_spearman <- ecospat.boyce(abs, obs,
window.w = window.w,
res = bi_res,
method = "spearman",
PEplot = F)
spearman_corr_subset[i] <- boyce_index_spearman$cor
boyce_index_pearson <- ecospat.boyce(abs, obs,
method = "pearson",
PEplot = F)
pearson_corr_subset[i] <- boyce_index_pearson$cor
F_ratio_list[[i]] <- data.frame(model = models[i],
F.ratio = boyce_index_pearson$F.ratio,
HS = boyce_index_pearson$HS,
HS_scaled = (boyce_index_pearson$HS - min(boyce_index_pearson$HS)) /
(max(boyce_index_pearson$HS) - min(boyce_index_pearson$HS)))
}
sp_PB_all_subset <- data.frame(hour = 1:nlyr(raster_all_points_subset),
corr = "Spearman",
BI = "PB",
model = c("0p", "1p", "2p", "3p"),
value = spearman_corr_subset)
sp_PB_all_subset
Code
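The folded chunk here presumably builds the Pearson analogue of `sp_PB_all_subset`, which is combined with it further down as `pe_PB_all_subset`; a sketch:
pe_PB_all_subset <- data.frame(hour = 1:nlyr(raster_all_points_subset),
                               corr = "Pearson",
                               BI = "PB",
                               model = c("0p", "1p", "2p", "3p"),
                               value = pearson_corr_subset)
pe_PB_all_subset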
Plotting the F ratio values for each model
We can see from this plot that the predictions from the 0p model don’t capture the areas of high use well, which is likely due to averaging over the habitat selection coefficients. The dynamic models perform mostly similarly, although the 1p model predicts very high buffalo use in areas that weren’t used at all, such as the road in the upper right of the prediction map.
Code
F_ratio_df <- bind_rows(F_ratio_list)
ggplot(F_ratio_df, aes(x = HS, y = F.ratio, colour = model)) +
geom_line(alpha = 0.75, linewidth = 1) +
geom_hline(yintercept = 1, linetype = "dashed") +
scale_colour_viridis_d("Model") +
scale_x_continuous("Simulated locations per cell") +
scale_y_log10("F-ratio") +
theme_bw() +
theme(legend.position = "right",
axis.title.x = element_text(size = 8), # Reduce x-axis title text size
axis.title.y = element_text(size = 8), # Reduce y-axis title text size
axis.text.x = element_text(size = 5), # Reduce x-axis text size
axis.text.y = element_text(size = 5), # Reduce y-axis text size
plot.title = element_text(size = 10),
legend.text = element_text(size = 5),
legend.title = element_text(size = 8))
Warning in scale_y_log10("F-ratio"): log-10 transformation introduced infinite
values.
Plotting the Spearman and Pearson correlation values for each model
Code
subset_BI_results <- bind_rows(sp_PB_all_subset,
pe_PB_all_subset)
ggplot(subset_BI_results %>% filter(BI == "PB"),
aes(x = model, y = value, color = corr)) +
geom_point(position = position_dodge(width = 0.5), size = 3) +
# geom_line() +
scale_y_continuous(limits = c(0, 1)) +
ggtitle("Continuous Boyce Index") +
# facet_wrap(~BI + corr) +
theme_classic()
Subsetting by hour
Here we split the simulated locations into each hour of the day and validate them against the observed buffalo locations for those hours.
As there is a prediction map for each hour, which would make for a very large html file, we plot only a few example prediction maps.
Code
setwd("Z:/ssa_simulations/hourly_prediction_maps/0p")
# list files in the folder
filenames_0p_hourly <- list.files(pattern = "\\.tif$", full.names = F)
# read in as rasters
predictions_0p_hourly <- lapply(filenames_0p_hourly, rast)
# Use a regular expression to extract the hours from each filename
extracted_numbers_chr <- gsub(".*hour_(\\d+).*", "\\1", filenames_0p_hourly)
extracted_numbers <- as.integer(extracted_numbers_chr)
order <- order(extracted_numbers)
# order the layers by hour
list_0p_hour_points_subset <- predictions_0p_hourly[order]
# stack into a raster stack
raster_0p_hourly_points_subset <- rast(list_0p_hour_points_subset)
# aggregate to 50m resolution
raster_all_points_subset <- aggregate(raster_0p_hourly_points_subset, fact = 2, fun = sum)
Create dataframes of raster values to plot with ggplot
[1] 1007
Code
# raster_0p_hourly_points_subset_norm <- rast(lapply(raster_0p_hourly_points_subset, normalise_raster))
# change NA values to 0
raster_0p_hourly_points_subset <- terra::classify(raster_0p_hourly_points_subset, cbind(NA, 0))
df_0p_hourly_points_subset <- lapply(raster_0p_hourly_points_subset, function(x) {
return(as.data.frame(x, xy = TRUE, na.rm = FALSE))
})
Plot using ggplot, with buffalo locations overlaid
Code
##| layout-ncol: 3
which_plots <- c(3,7,12)
max_locs <- max(sapply(1:length(df_0p_hourly_points_subset), function(x) {
max(df_0p_hourly_points_subset[[x]]$sum)
}))
for(i in 1:length(which_plots)) {
print(ggplot() +
geom_raster(data = df_0p_hourly_points_subset[[which_plots[i]]], aes(x = x, y = y, fill = sum)) +
geom_point(data = buffalo_data_subset_dry2018 %>% filter(y == 1 & hour == which_plots[i]),
aes(x = x2_, y = y2_), size = 0.1, col = "white", alpha = 0.5) +
coord_equal() +
scale_x_continuous("Easting") +
scale_y_continuous("Northing") +
ggtitle(paste("Model 0p - Hour ", which_plots[i])) +
scale_fill_viridis_c(limits = c(0, max_locs)) +
theme_bw() +
theme(legend.position = "none",
axis.title.x = element_text(size = 8), # Reduce x-axis title text size
axis.title.y = element_text(size = 8), # Reduce y-axis title text size
axis.text.x = element_text(size = 5), # Reduce x-axis text size
axis.text.y = element_text(size = 5), # Reduce y-axis text size
plot.title = element_text(size = 10)))
# loop over all hours to save the plots for the gif
# ggsave(paste0("outputs/plots/manuscript_figs_R1/hourly/sim_predictions_subset_0p_hour_", i, ".png"),
# width=80, height=75, units="mm", dpi = 600)
}
Create a gif of the hourly predictions
Uncomment the last few lines to have the gif created. We show the gif in the readme of the GitHub repo: https://github.com/swforrest/dynamic_SSF_sims.
Code
sim_preds_0p_img_unordered <- list.files(path = "outputs/plots/manuscript_figs_R1/hourly/",
pattern = "0p",
full.names = F)
# Use a regular expression to extract the hours from each filename
extracted_numbers_chr <- gsub(".*hour_(\\d+).*", "\\1", sim_preds_0p_img_unordered)
extracted_numbers <- as.integer(extracted_numbers_chr)
order <- order(extracted_numbers)
sim_preds_0p_img_files <- sim_preds_0p_img_unordered[order]
# read in the ordered image files into a list to convert to gif
# sim_preds_0p_img <- lapply(paste0("outputs/plots/manuscript_figs_R1/hourly/", sim_preds_0p_img_files), image_read)
# Combine images into a GIF
# img_gif <- image_animate(image_join(sim_preds_0p_img), fps = 4)
# img_gif
# Save the GIF
# image_write(img_gif, "outputs/plots/manuscript_figs_R1/sim_preds_0p_hourly.gif")
Validation
Calculating the Boyce index for the 0p model’s hourly predictions
Code
spearman_corr_subset_0p_hour <- vector(mode = "numeric", length = nlyr(raster_0p_hourly_points_subset))
pearson_corr_subset_0p_hour <- vector(mode = "numeric", length = nlyr(raster_0p_hourly_points_subset))
for(i in 1:nlyr(raster_0p_hourly_points_subset)) {
obs_data <- buffalo_data_subset_dry2018 %>% filter(y == 1 & hour == i) %>% transmute(x = x2_, y = y2_)
# observed (true) locations
obs <- terra::extract(raster_0p_hourly_points_subset[[i]], obs_data, ID = FALSE)[,1]
obs <- obs[!is.na(obs)]
# all values of the raster layer
abs <- terra::values(raster_0p_hourly_points_subset[[i]], na.rm = T)[,1]
abs <- abs[!is.na(abs)]
boyce_index_spearman <- ecospat.boyce(abs, obs, method = "spearman", PEplot = F)
spearman_corr_subset_0p_hour[i] <- boyce_index_spearman$cor
boyce_index_pearson <- ecospat.boyce(abs, obs, method = "pearson", PEplot = F)
pearson_corr_subset_0p_hour[i] <- boyce_index_pearson$cor
}
cat("Spearman correlation = ", spearman_corr_subset_0p_hour, "\n")
cat("Pearson correlation = ", pearson_corr_subset_0p_hour)
sp_PB_subset_0p_hour <- data.frame(hour = 1:24,
corr = "Spearman",
BI = "PB",
model = "0p",
value = spearman_corr_subset_0p_hour)
pe_PB_subset_0p_hour <- data.frame(hour = 1:24,
corr = "Pearson",
BI = "PB",
model = "0p",
value = pearson_corr_subset_0p_hour)
Spearman correlation = 0.223 -0.115 0.071 0.116 0.407 0.343 -0.082 0.361 0.496 0.588 0.183 0.478 0.469 0.602 0.444 0.235 -0.344 -0.165 -0.055 -0.267 -0.125 -0.106 -0.282 0.105
Pearson correlation = 0.383 -0.13 0.115 0.216 0.452 0.406 -0.049 0.395 0.463 0.509 0.517 0.529 0.471 0.508 0.403 0.539 -0.401 -0.112 -0.126 -0.32 -0.128 -0.068 -0.268 0.225
Code
setwd("Z:/ssa_simulations/hourly_prediction_maps/1p")
# list files in the folder
filenames_1p_hourly <- list.files(pattern = "\\.tif$", full.names = F)
# read in as rasters
predictions_1p_hourly <- lapply(filenames_1p_hourly, rast)
# Use a regular expression to extract the hours from each filename
extracted_numbers_chr <- gsub(".*hour_(\\d+).*", "\\1", filenames_1p_hourly)
extracted_numbers <- as.integer(extracted_numbers_chr)
order <- order(extracted_numbers)
# order the layers by hour
list_1p_hour_points_subset <- predictions_1p_hourly[order]
# stack into a raster stack
raster_1p_hourly_points_subset <- rast(list_1p_hour_points_subset)
# aggregate to 50m resolution
raster_all_points_subset <- aggregate(raster_1p_hourly_points_subset, fact = 2, fun = sum)
Create dataframes of raster values to plot with ggplot
[1] 8855
Code
# raster_1p_hourly_points_subset_norm <- rast(lapply(raster_1p_hourly_points_subset, normalise_raster))
# change NA values to 0
raster_1p_hourly_points_subset <- terra::classify(raster_1p_hourly_points_subset, cbind(NA, 0))
df_1p_hourly_points_subset <- lapply(raster_1p_hourly_points_subset, function(x) {
return(as.data.frame(x, xy = TRUE, na.rm = FALSE))
})
Plot using ggplot, with buffalo locations overlaid
Code
##| layout-ncol: 3
which_plots <- c(3,7,12)
max_locs <- max(sapply(1:length(df_1p_hourly_points_subset), function(x) {
max(df_1p_hourly_points_subset[[x]]$sum)
}))
for(i in 1:length(which_plots)) {
print(ggplot() +
geom_raster(data = df_1p_hourly_points_subset[[which_plots[i]]], aes(x = x, y = y, fill = sum)) +
geom_point(data = buffalo_data_subset_dry2018 %>% filter(y == 1 & hour == which_plots[i]),
aes(x = x2_, y = y2_), size = 0.1, col = "white", alpha = 0.5) +
coord_equal() +
scale_x_continuous("Easting") +
scale_y_continuous("Northing") +
ggtitle(paste("Model 1p - Hour ", which_plots[i])) +
scale_fill_viridis_c(limits = c(0, max_locs)) +
theme_bw() +
theme(legend.position = "none",
axis.title.x = element_text(size = 8), # Reduce x-axis title text size
axis.title.y = element_text(size = 8), # Reduce y-axis title text size
axis.text.x = element_text(size = 5), # Reduce x-axis text size
axis.text.y = element_text(size = 5), # Reduce y-axis text size
plot.title = element_text(size = 10)))
# loop over all hours to save the plots for the gif
# ggsave(paste0("outputs/plots/manuscript_figs_R1/hourly/sim_predictions_subset_1p_hour_", i, ".png"),
# width=80, height=75, units="mm", dpi = 600)
}
Create a gif of the hourly predictions
Uncomment the last few lines to have the gif created. We show the gif in the readme of the GitHub repo: https://github.com/swforrest/dynamic_SSF_sims.
Code
sim_preds_1p_img_unordered <- list.files(path = "outputs/plots/manuscript_figs_R1/hourly/",
pattern = "1p",
full.names = F)
# Use a regular expression to extract the hours from each filename
extracted_numbers_chr <- gsub(".*hour_(\\d+).*", "\\1", sim_preds_1p_img_unordered)
extracted_numbers <- as.integer(extracted_numbers_chr)
order <- order(extracted_numbers)
sim_preds_1p_img_files <- sim_preds_1p_img_unordered[order]
# read in the ordered image files into a list to convert to gif
# sim_preds_1p_img <- lapply(paste0("outputs/plots/manuscript_figs_R1/hourly/", sim_preds_1p_img_files), image_read)
# Combine images into a GIF
# img_gif <- image_animate(image_join(sim_preds_1p_img), fps = 4)
# img_gif
# Save the GIF
# image_write(img_gif, "outputs/plots/manuscript_figs_R1/sim_preds_1p_hourly.gif")
Validation
Calculating the Boyce index for the 1p model’s hourly predictions
Code
spearman_corr_subset_1p_hour <- vector(mode = "numeric", length = nlyr(raster_1p_hourly_points_subset))
pearson_corr_subset_1p_hour <- vector(mode = "numeric", length = nlyr(raster_1p_hourly_points_subset))
for(i in 1:nlyr(raster_1p_hourly_points_subset)) {
obs_data <- buffalo_data_subset_dry2018 %>% filter(y == 1 & hour == i) %>% transmute(x = x2_, y = y2_)
# observed (true) locations
obs <- terra::extract(raster_1p_hourly_points_subset[[i]], obs_data, ID = FALSE)[,1]
obs <- obs[!is.na(obs)]
# all values of the raster layer
abs <- terra::values(raster_1p_hourly_points_subset[[i]], na.rm = T)[,1]
abs <- abs[!is.na(abs)]
boyce_index_spearman <- ecospat.boyce(abs, obs, method = "spearman", PEplot = F)
spearman_corr_subset_1p_hour[i] <- boyce_index_spearman$cor
boyce_index_pearson <- ecospat.boyce(abs, obs, method = "pearson", PEplot = F)
pearson_corr_subset_1p_hour[i] <- boyce_index_pearson$cor
}
cat("Spearman correlation = ", spearman_corr_subset_1p_hour, "\n")
cat("Pearson correlation = ", pearson_corr_subset_1p_hour)
sp_PB_subset_1p_hour <- data.frame(hour = 1:24,
corr = "Spearman",
BI = "PB",
model = "1p",
value = spearman_corr_subset_1p_hour)
pe_PB_subset_1p_hour <- data.frame(hour = 1:24,
corr = "Pearson",
BI = "PB",
model = "1p",
value = pearson_corr_subset_1p_hour)
Spearman correlation = -0.003 0.224 -0.003 0.352 0.25 0.15 -0.018 0.525 0.694 0.745 0.816 0.911 0.885 0.805 0.843 0.791 0.441 0.124 -0.317 -0.139 -0.051 -0.397 -0.643 0.355
Pearson correlation = -0.124 -0.084 0.002 0.171 0.181 0.116 0.405 0.539 0.659 0.659 0.653 0.775 0.631 0.515 0.653 0.57 0.248 0.324 -0.276 0.209 -0.172 -0.385 -0.693 0.283
Code
setwd("Z:/ssa_simulations/hourly_prediction_maps/2p")
# list files in the folder
filenames_2p_hourly <- list.files(pattern = "\\.tif$", full.names = F)
# read in as rasters
predictions_2p_hourly <- lapply(filenames_2p_hourly, rast)
# Use a regular expression to extract the hours from each filename
extracted_numbers_chr <- gsub(".*hour_(\\d+).*", "\\1", filenames_2p_hourly)
extracted_numbers <- as.integer(extracted_numbers_chr)
order <- order(extracted_numbers)
# order the layers by hour
list_2p_hour_points_subset <- predictions_2p_hourly[order]
# stack into a raster stack
raster_2p_hourly_points_subset <- rast(list_2p_hour_points_subset)
# aggregate to 50m resolution
raster_all_points_subset <- aggregate(raster_2p_hourly_points_subset, fact = 2, fun = sum)
Create dataframes of raster values to plot with ggplot
[1] 8939
Code
# raster_2p_hourly_points_subset_norm <- rast(lapply(raster_2p_hourly_points_subset, normalise_raster))
# change NA values to 0
raster_2p_hourly_points_subset <- terra::classify(raster_2p_hourly_points_subset, cbind(NA, 0))
df_2p_hourly_points_subset <- lapply(raster_2p_hourly_points_subset, function(x) {
return(as.data.frame(x, xy = TRUE, na.rm = FALSE))
})
Plot using ggplot, with buffalo locations overlaid
Code
##| layout-ncol: 3
which_plots <- c(3,7,12)
max_locs <- max(sapply(1:length(df_2p_hourly_points_subset), function(x) {
max(df_2p_hourly_points_subset[[x]]$sum)
}))
for(i in 1:length(which_plots)) {
print(ggplot() +
geom_raster(data = df_2p_hourly_points_subset[[which_plots[i]]], aes(x = x, y = y, fill = sum)) +
geom_point(data = buffalo_data_subset_dry2018 %>% filter(y == 1 & hour == which_plots[i]),
aes(x = x2_, y = y2_), size = 0.1, col = "white", alpha = 0.5) +
coord_equal() +
scale_x_continuous("Easting") +
scale_y_continuous("Northing") +
ggtitle(paste("Model 2p - Hour ", which_plots[i])) +
scale_fill_viridis_c(limits = c(0, max_locs)) +
theme_bw() +
theme(legend.position = "none",
axis.title.x = element_text(size = 8), # Reduce x-axis title text size
axis.title.y = element_text(size = 8), # Reduce y-axis title text size
axis.text.x = element_text(size = 5), # Reduce x-axis text size
axis.text.y = element_text(size = 5), # Reduce y-axis text size
plot.title = element_text(size = 10)))
# loop over all hours to save the plots for the gif
# ggsave(paste0("outputs/plots/manuscript_figs_R1/hourly/sim_predictions_subset_2p_hour_", i, ".png"),
# width=80, height=75, units="mm", dpi = 600)
}
Create a gif of the hourly predictions
Uncomment the last few lines to have the gif created. We show the gif in the readme of the GitHub repo: https://github.com/swforrest/dynamic_SSF_sims.
Code
sim_preds_2p_img_unordered <- list.files(path = "outputs/plots/manuscript_figs_R1/hourly/",
pattern = "2p",
full.names = F)
# Use a regular expression to extract the hours from each filename
extracted_numbers_chr <- gsub(".*hour_(\\d+).*", "\\1", sim_preds_2p_img_unordered)
extracted_numbers <- as.integer(extracted_numbers_chr)
order <- order(extracted_numbers)
sim_preds_2p_img_files <- sim_preds_2p_img_unordered[order]
# read in the ordered image files into a list to convert to gif
# sim_preds_2p_img <- lapply(paste0("outputs/plots/manuscript_figs_R1/hourly/", sim_preds_2p_img_files), image_read)
# Combine images into a GIF
# img_gif <- image_animate(image_join(sim_preds_2p_img), fps = 4)
# img_gif
# Save the GIF
# image_write(img_gif, "outputs/plots/manuscript_figs_R1/sim_preds_2p_hourly.gif")
Validation
Calculating the Boyce index for the 2p model’s hourly predictions
Code
spearman_corr_subset_2p_hour <- vector(mode = "numeric", length = nlyr(raster_2p_hourly_points_subset))
pearson_corr_subset_2p_hour <- vector(mode = "numeric", length = nlyr(raster_2p_hourly_points_subset))
for(i in 1:nlyr(raster_2p_hourly_points_subset)) {
obs_data <- buffalo_data_subset_dry2018 %>% filter(y == 1 & hour == i) %>% transmute(x = x2_, y = y2_)
# observed (true) locations
obs <- terra::extract(raster_2p_hourly_points_subset[[i]], obs_data, ID = FALSE)[,1]
obs <- obs[!is.na(obs)]
# all values of the raster layer
abs <- terra::values(raster_2p_hourly_points_subset[[i]], na.rm = T)[,1]
abs <- abs[!is.na(abs)]
boyce_index_spearman <- ecospat.boyce(abs, obs, method = "spearman", PEplot = F)
spearman_corr_subset_2p_hour[i] <- boyce_index_spearman$cor
boyce_index_pearson <- ecospat.boyce(abs, obs, method = "pearson", PEplot = F)
pearson_corr_subset_2p_hour[i] <- boyce_index_pearson$cor
}
cat("Spearman correlation = ", spearman_corr_subset_2p_hour, "\n")
cat("Pearson correlation = ", pearson_corr_subset_2p_hour)
sp_PB_subset_2p_hour <- data.frame(hour = 1:24,
corr = "Spearman",
BI = "PB",
model = "2p",
value = spearman_corr_subset_2p_hour)
pe_PB_subset_2p_hour <- data.frame(hour = 1:24,
corr = "Pearson",
BI = "PB",
model = "2p",
value = pearson_corr_subset_2p_hour)
Spearman correlation = 0.518 0.388 0.136 0.227 0.522 0.57 0.749 0.837 0.772 0.828 0.91 0.916 0.926 0.994 0.92 0.86 0.862 0.748 0.687 0.677 0.128 0.501 0.49 0.385
Pearson correlation = 0.424 0.363 0.067 0.234 0.493 0.581 0.729 0.415 0.685 0.533 0.576 0.685 0.681 0.566 0.672 0.694 0.692 0.577 0.526 0.654 0.023 0.466 0.441 0.285
Code
setwd("Z:/ssa_simulations/hourly_prediction_maps/3p")
# list files in the folder
filenames_3p_hourly <- list.files(pattern = "\\.tif$", full.names = F)
# read in as rasters
predictions_3p_hourly <- lapply(filenames_3p_hourly, rast)
# Use a regular expression to extract the hours from each filename
extracted_numbers_chr <- gsub(".*hour_(\\d+).*", "\\1", filenames_3p_hourly)
extracted_numbers <- as.integer(extracted_numbers_chr)
order <- order(extracted_numbers)
# order the layers by hour
list_3p_hour_points_subset <- predictions_3p_hourly[order]
# stack into a raster stack
raster_3p_hourly_points_subset <- rast(list_3p_hour_points_subset)
# aggregate to 50m resolution
raster_all_points_subset <- aggregate(raster_3p_hourly_points_subset, fact = 2, fun = sum)
Create dataframes of raster values to plot with ggplot
[1] 7413
Code
# raster_3p_hourly_points_subset_norm <- rast(lapply(raster_3p_hourly_points_subset, normalise_raster))
# change NA values to 0
raster_3p_hourly_points_subset <- terra::classify(raster_3p_hourly_points_subset, cbind(NA, 0))
df_3p_hourly_points_subset <- lapply(raster_3p_hourly_points_subset, function(x) {
return(as.data.frame(x, xy = TRUE, na.rm = FALSE))
})
Plot using ggplot, with buffalo locations overlaid
Code
##| layout-ncol: 3
which_plots <- c(3,7,12)
max_locs <- max(sapply(1:length(df_3p_hourly_points_subset), function(x) {
max(df_3p_hourly_points_subset[[x]]$sum)
}))
for(i in 1:length(which_plots)) {
print(ggplot() +
geom_raster(data = df_3p_hourly_points_subset[[which_plots[i]]], aes(x = x, y = y, fill = sum)) +
geom_point(data = buffalo_data_subset_dry2018 %>% filter(y == 1 & hour == which_plots[i]),
aes(x = x2_, y = y2_), size = 0.1, col = "white", alpha = 0.5) +
coord_equal() +
scale_x_continuous("Easting") +
scale_y_continuous("Northing") +
ggtitle(paste("Model 3p - Hour ", which_plots[i])) +
scale_fill_viridis_c(limits = c(0, max_locs)) +
theme_bw() +
theme(legend.position = "none",
axis.title.x = element_text(size = 8), # Reduce x-axis title text size
axis.title.y = element_text(size = 8), # Reduce y-axis title text size
axis.text.x = element_text(size = 5), # Reduce x-axis text size
axis.text.y = element_text(size = 5), # Reduce y-axis text size
plot.title = element_text(size = 10)))
# loop over all hours to save the plots for the gif
# ggsave(paste0("outputs/plots/manuscript_figs_R1/hourly/sim_predictions_subset_3p_hour_", i, ".png"),
# width=80, height=75, units="mm", dpi = 600)
}
Create a gif of the hourly predictions
Uncomment the last few lines to have the gif created. We show the gif in the readme of the GitHub repo: https://github.com/swforrest/dynamic_SSF_sims.
Code
sim_preds_3p_img_unordered <- list.files(path = "outputs/plots/manuscript_figs_R1/hourly/",
pattern = "3p",
full.names = F)
# Use a regular expression to extract the hours from each filename
extracted_numbers_chr <- gsub(".*hour_(\\d+).*", "\\1", sim_preds_3p_img_unordered)
extracted_numbers <- as.integer(extracted_numbers_chr)
order <- order(extracted_numbers)
sim_preds_3p_img_files <- sim_preds_3p_img_unordered[order]
# read in the ordered image files into a list to convert to gif
# sim_preds_3p_img <- lapply(paste0("outputs/plots/manuscript_figs_R1/hourly/", sim_preds_3p_img_files), image_read)
# Combine images into a GIF
# img_gif <- image_animate(image_join(sim_preds_3p_img), fps = 4)
# img_gif
# Save the GIF
# image_write(img_gif, "outputs/plots/manuscript_figs_R1/sim_preds_3p_hourly.gif")
Validation
Calculating the Boyce index for the 3p model’s hourly predictions
Code
spearman_corr_subset_3p_hour <- vector(mode = "numeric", length = nlyr(raster_3p_hourly_points_subset))
pearson_corr_subset_3p_hour <- vector(mode = "numeric", length = nlyr(raster_3p_hourly_points_subset))
for(i in 1:nlyr(raster_3p_hourly_points_subset)) {
obs_data <- buffalo_data_subset_dry2018 %>% filter(y == 1 & hour == i) %>% transmute(x = x2_, y = y2_)
# observed (true) locations
obs <- terra::extract(raster_3p_hourly_points_subset[[i]], obs_data, ID = FALSE)[,1]
obs <- obs[!is.na(obs)]
# all values of the raster layer
abs <- terra::values(raster_3p_hourly_points_subset[[i]], na.rm = T)[,1]
abs <- abs[!is.na(abs)]
boyce_index_spearman <- ecospat.boyce(abs, obs, method = "spearman", PEplot = F)
spearman_corr_subset_3p_hour[i] <- boyce_index_spearman$cor
boyce_index_pearson <- ecospat.boyce(abs, obs, method = "pearson", PEplot = F)
pearson_corr_subset_3p_hour[i] <- boyce_index_pearson$cor
}
cat("Spearman correlation = ", spearman_corr_subset_3p_hour, "\n")
cat("Pearson correlation = ", pearson_corr_subset_3p_hour)
sp_PB_subset_3p_hour <- data.frame(hour = 1:24,
corr = "Spearman",
BI = "PB",
model = "3p",
value = spearman_corr_subset_3p_hour)
pe_PB_subset_3p_hour <- data.frame(hour = 1:24,
corr = "Pearson",
BI = "PB",
model = "3p",
value = pearson_corr_subset_3p_hour)
Spearman correlation = 0.048 0.446 0.167 0.158 0.359 0.422 0.509 0.687 0.797 0.888 0.869 0.918 0.919 0.902 0.847 0.703 0.502 0.803 0.384 0.542 0.468 0.561 0.51 0.567
Pearson correlation = -0.024 0.541 0.082 -0.035 0.315 0.352 0.59 0.548 0.6 0.581 0.55 0.63 0.671 0.571 0.657 0.566 0.533 0.667 0.345 0.476 0.403 0.492 0.457 0.485
Hourly validation results
Combine the results dataframes
Code
hourly_BI_results <- bind_rows(sp_PB_subset_0p_hour,
pe_PB_subset_0p_hour,
sp_PB_subset_1p_hour,
pe_PB_subset_1p_hour,
sp_PB_subset_2p_hour,
pe_PB_subset_2p_hour,
sp_PB_subset_3p_hour,
pe_PB_subset_3p_hour)
# add a row for hour = 0 (which is hour = 24)
hourly_BI_results_hour0 <- hourly_BI_results %>% group_by(corr, BI, model) %>% filter(hour == 24)
hourly_BI_results_hour0$hour <- 0
hourly_BI_results <- bind_rows(hourly_BI_results, hourly_BI_results_hour0)
Plotting the Spearman correlation for each model for each hour
Code
ggplot(hourly_BI_results %>% filter(BI == "PB" & corr == "Spearman"),
aes(x = hour, y = value, col = model, fill = model, group = interaction(corr, model))) +
geom_smooth(linewidth = 0.75, alpha = 0.1, se = T) +
geom_point(alpha = 0.5, size = 0.5) +
geom_hline(yintercept = 0, linetype = "dashed") +
scale_colour_viridis_d("Model") +
scale_fill_viridis_d("Model") +
scale_y_continuous("Spearman correlation") +
scale_x_continuous("Hour", limits = c(0,24), breaks = seq(0,24,6)) +
theme_bw() +
theme(legend.position = "right",
axis.title.x = element_text(vjust = -2, size = 8),
axis.title.y = element_text(size = 8), # Reduce y-axis title text size
axis.text.x = element_text(size = 5), # Reduce x-axis text size
axis.text.y = element_text(size = 5), # Reduce y-axis text size
plot.title = element_text(size = 10),
legend.text = element_text(size = 6),
legend.title = element_text(size = 7))
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
Session info
R version 4.4.1 (2024-06-14 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 22631)
Matrix products: default
locale:
[1] LC_COLLATE=English_Australia.utf8 LC_CTYPE=English_Australia.utf8
[3] LC_MONETARY=English_Australia.utf8 LC_NUMERIC=C
[5] LC_TIME=English_Australia.utf8
time zone: Europe/Zurich
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] magick_2.8.4 ggpattern_1.1.1 ecospat_4.1.1 beepr_2.0
[5] tictoc_1.2.1 terra_1.7-78 amt_0.2.2.0 lubridate_1.9.3
[9] forcats_1.0.0 stringr_1.5.1 dplyr_1.1.4 purrr_1.0.2
[13] readr_2.1.5 tidyr_1.3.1 tibble_3.2.1 ggplot2_3.5.1
[17] tidyverse_2.0.0
loaded via a namespace (and not attached):
[1] tidyselect_1.2.1 viridisLite_0.4.2 farver_2.1.2 fastmap_1.2.0
[5] digest_0.6.37 timechange_0.3.0 lifecycle_1.0.4 sf_1.0-16
[9] survival_3.6-4 magrittr_2.0.3 compiler_4.4.1 rlang_1.1.4
[13] tools_4.4.1 utf8_1.2.4 yaml_2.3.10 knitr_1.48
[17] labeling_0.4.3 htmlwidgets_1.6.4 bit_4.0.5 classInt_0.4-10
[21] KernSmooth_2.23-24 withr_3.0.1 grid_4.4.1 fansi_1.0.6
[25] e1071_1.7-14 colorspace_2.1-1 scales_1.3.0 iterators_1.0.14
[29] cli_3.6.3 rmarkdown_2.28 crayon_1.5.3 ragg_1.3.2
[33] generics_0.1.3 rstudioapi_0.16.0 tzdb_0.4.0 DBI_1.2.3
[37] cachem_1.1.0 proxy_0.4-27 audio_0.1-11 splines_4.4.1
[41] parallel_4.4.1 vctrs_0.6.5 Matrix_1.7-0 jsonlite_1.8.8
[45] hms_1.1.3 bit64_4.0.5 systemfonts_1.1.0 foreach_1.5.2
[49] units_0.8-5 glue_1.7.0 codetools_0.2-20 stringi_1.8.4
[53] gtable_0.3.5 munsell_0.5.1 pillar_1.9.0 htmltools_0.5.8.1
[57] R6_2.5.1 textshaping_0.4.0 Rdpack_2.6.1 vroom_1.6.5
[61] evaluate_0.24.0 lattice_0.22-6 rbibutils_2.2.16 backports_1.5.0
[65] memoise_2.0.1 class_7.3-22 Rcpp_1.0.13 nlme_3.1-164
[69] checkmate_2.3.2 mgcv_1.9-1 gridpattern_1.2.2 xfun_0.47
[73] pkgconfig_2.0.3