NOTES:

The plotted figures and .txt files are saved in the ‘Output’ folder. All the results are based on 999 simulations.


Load Libraries

The following libraries were loaded: ‘LibPack’.

Introduction

Funeral landscapes are a specific example of archaeological landscapes that embed the relationship between environment, burials and human behaviour. In this study, we employ the Point Pattern Analysis (PPA) technique to explore the choice of location of a desk-created dataset of several thousands stone-built medieval Islamic funerary monuments (qubbas), found along the foothills of isolated rocky outcrops in the remote semi-arid Kassala province of Eastern Sudan. In the region a comprehensive geomorphological and geoarchaeological survey was carried out in the field to assess the main archaeological features, highlighting the existence of an impressive number of archaic and modern qubbas, whose spatial distribution can be interpreted using specific tools, including, for the first time in archaeology, of the Neyman-Scott Cluster Process.

Two complementary null hypotheses were tested through Point Pattern Analysis (PPA):

H1: At landscape-scale, the density of qubbas is uniform (the intensity of the point pattern is stationary and isotropic). The spatial variables do not affect the spatial distribution of the points.

H2: At local-scale, the qubbas are completely independent from each other presenting no spatial correlation

Part 1: Import Data

The geomorphological analysis of the region served as a starting point for determining a list of potentially explanatory variables. The Elevation values were retrieved from the ALOS World 3D, a global 30-meters resolution Digital Surface Model (DSM). The DSM was then elaborated in GRASS GIS and in SAGA GIS to create the spatial variables used in the PPA. The desk-created dataset comprises 783 tumuli and 10274 qubbas.

# Topographic variables
dem <- raster("GeoTiff/Elevation.tif")
slp <- raster("GeoTiff/Slope.tif")
asp <- raster("GeoTiff/Aspect.tif")

# Aspect Northness (ns) & Eastness (es) [rad]
ns <- writeRaster((cos(asp)* pi / 180),"GeoTiff/Northness.tif", format="GTiff",
            overwrite=TRUE)

es <- writeRaster((sin(asp)* pi / 180),"GeoTiff/Eastness.tif", format="GTiff",
            overwrite=TRUE)

# Topographic Position Index (TPI)
TPI_100<- raster("GeoTiff/TPI100.tif")
TPI_500<- raster("GeoTiff/TPI500.tif")
TPI_1000<- raster("GeoTiff/TPI1000.tif")

# Convergence Index (CI)
CI <- raster("GeoTiff/CI.tif")

# Geomorphon
geomorph <- raster("GeoTiff/Geomorphons.tif")

# Euclidean distances from resources
igne_eu <- raster("GeoTiff/I.R.Dist.tif")
meta_eu <- raster("GeoTiff/M.R.Dist.tif")
stream_eu <-raster("GeoTiff/S.dist.tif")

# Tumuli's Cumulative Viewshed Analysis (CVA)
cva <-raster("GeoTiff/CVA.tif")

# Qubbas sites
sites_q <- remove.duplicates(readOGR(dsn="EsriSHP/qubbas", layer="qubbas"))

# Tumuli sites
sites_t <- remove.duplicates(readOGR(dsn="EsriSHP/tumuli", layer="tumuli"))

# Region Of Interest (ROI)
prj_area <- readOGR(dsn="EsriSHP/ROI", layer="ROI")

# Setting the working region
region<-as(as(prj_area,"SpatialPolygons"),"owin")

# Converting qubbas site locations to point pattern process (ppp)
q_ppp<- as.ppp(coordinates(sites_q), region)

# Save files loaded into R ####

save(dem, TPI_100,TPI_500, TPI_1000, CI, slp, ns, es,igne_eu, meta_eu, stream_eu,cva,
     sites_q, sites_t, prj_area, file="PPA_Data.RData")

# Number of Simulations
NuSim <- 999

Part 2: Non-parametric analysis: Kernel Density Estimation (KDE)

The intensity of sites was initially explored through the KDE that was obtained with the spatstat function density.ppp. The smoothing bandwidth can be specified in the function with the argument sigma. The kernel bandwidth sigma controls the degree of smoothing: a small value may omit a potential general trend, while a large value may omit local details. In this research, the sigma was automatically estimated with the likelihood cross-validation method in spatstat (bw.ppl) and manually adjusted through the function argument adjust (Fig.1).

# Rescaling q_ppp from m to Km (better plot representation !)
q_ppp_KM<- rescale((as.ppp(coordinates(sites_q), region)), 1000, "km")

# Automatic BE 
b = bw.ppl(q_ppp_KM)

# Adjusting BE manually
MBE1 <- density.ppp(q_ppp_KM)
MBE2 <- density.ppp(q_ppp_KM, sigma = b)
MBE3 <- density.ppp(q_ppp_KM, sigma = b, adjust= 10)
MBE4 <- density.ppp(q_ppp_KM, sigma = b, adjust= 20)
MBE5 <- density.ppp(q_ppp_KM, sigma = b, adjust= 30)
MBE6 <- density.ppp(q_ppp_KM, sigma = b, adjust= 40)
Fig. 1 - *Kernel Density estimated with default, automatic and manually adjusted methods. From the top left: the default KDE calculated with the spatstat function density.ppp.; KDE with sigma automatically estimated with the spatstat function bw.ppl; the KDE estimated with ppl sigma adjusted with values 10, 20, 30 and 40 respectively*

Fig. 1 - Kernel Density estimated with default, automatic and manually adjusted methods. From the top left: the default KDE calculated with the spatstat function density.ppp.; KDE with sigma automatically estimated with the spatstat function bw.ppl; the KDE estimated with ppl sigma adjusted with values 10, 20, 30 and 40 respectively

The best representation of the dynamic intensity of the qubbas in the ROI’s landscape was obtained by a manual adjustment of the sigma at adjust = 30 (Fig.2).

*Fig.2 - Kernerl Density Estimation for the study area*

Fig.2 - Kernerl Density Estimation for the study area

Spatial Covariates

To avoid overparameterization in the models, the Pearson correlation’s test was employed to test collinearity between the variables. Collinearity is the high linear relation of two or more predictors. In general, an absolute correlation coefficient of >0.7 (dark blue or dark red) among two or more predictors indicates the presence of collinearity. In this research we considered > 0.6 as a parameter to exclude correlated variables (Fig.3).

Testing Collinearity between covariates

RandomC<- as.data.frame((sampleRandom(
  (stack(dem, TPI_100,TPI_500, TPI_1000, CI, slp, ns, es, 
         geomorph,igne_eu, meta_eu, stream_eu, cva)), 999)))

coll_test <- cor(RandomC, method = "pearson")
Fig. 3 - *Pearson's correlation test results*

Fig. 3 - Pearson’s correlation test results

The topographic and morphometric variables that showed less collinearity (Slope, Northness, Eastness, TPI100, CI and Geomorphon, distance from metamorphic rocks and CVA) were furtherly investigated to evaluate their actual influence on the point process (Fig. 4).

Fig. 4 - *Intensity spatial variation according to covariate values*

Fig. 4 - Intensity spatial variation according to covariate values

Both Northness and Eastness were excluded from the parametric estimation of intensity as they show seem to have no influence on the landscape spatial distribution of the qubbas: the points are uniformly oriented in all cardinal directions. The six resulting covariates employed for the PPM are: Slope, Aspect, Topographic Position Index (TPI 100), Convergence Index (CI), Geomorphon, Euclidean distances from outcrops of metamorphic rock, Tumuli’s Cumulative Viewshed Analysis (CVA) (Fig. 5).

Fig. 5 - *Spatial covariates selected*

Fig. 5 - Spatial covariates selected

Part 3: Parametric analysis

First-order properties

With the spatstat function ppm three different Inhomogeneous Poisson Process (IPP) models have been fitted: Model 1 (Topographic covariates), Model 2 (Lithological covariate), Model 3 (Socio-cultural covariate).

# Convert covariates to objects of class image and create list

covar<-list(Elev=as.im(as(dem,"SpatialGridDataFrame")),
            TPI=as.im(as(TPI_100,"SpatialGridDataFrame")),
            CI=as.im(as(CI,"SpatialGridDataFrame")),
            Geo=as.im(as(geomorph,"SpatialGridDataFrame")),
            Slope=as.im(as(slp,"SpatialGridDataFrame")),
            NS=as.im(as(ns,"SpatialGridDataFrame")),
            ES=as.im(as(es,"SpatialGridDataFrame")),
            Meta=as.im(as(meta_eu,"SpatialGridDataFrame")),
            CVA=as.im(as(cva,"SpatialGridDataFrame")))

# Model 0: homogeneous point process - Null model

q_mod_0<- ppm(q_ppp ~ 1)

# Model 1: parametric inhomogeneous point process - Topographic variables

q_mod_1<- ppm(q_ppp, ~ Slope + TPI + CI + Geo, data = covar)

# Bayesian Information Criterion (BIC): stepwise variable selection

q_mod_1BIC<-stepAIC(q_mod_1,k=log(length(sites_q)))

# Model 2: parametric inhomogeneous point process - Buiding Resource

q_mod_2<- ppm(q_ppp, ~ Meta, data = covar)

# Bayesian Information Criterion (BIC): stepwise variable selection

q_mod_2BIC<-stepAIC(q_mod_2,k=log(length(sites_q)))

# Model 3: parametric inhomogeneous point process - CVA

q_mod_3<- ppm(q_ppp, ~ CVA, data = covar)

# Bayesian Information Criterion (BIC): stepwise variable selection

q_mod_3BIC<-stepAIC(q_mod_3,k=log(length(sites_q)))

# Model 4: parametric inhomogeneous point process - hybrid model

q_mod_4<- ppm(q_ppp, ~ Slope + CI + TPI + Geo + Meta + CVA, data = covar)

# Bayesian Information Criterion (BIC): stepwise variable selection

q_mod_4BIC<-stepAIC(q_mod_4,k=log(length(sites_q)))

# Exporting Tables

# Create PPM lists

ppp_q_models<-list(model0=q_mod_0,model1=q_mod_1BIC,model2=q_mod_2BIC,model3=q_mod_3BIC,model4=q_mod_4BIC)

# Export PPM summary

write.table(capture.output(print(ppp_q_models)),file="Output/PPP_Models.txt")

# Compare BIC scores for the different models

q_BIC<-lapply(ppp_q_models,BIC)
write.table(q_BIC,file="Output/BIC_Models_scores.txt")

# BIC weight for different models (Credits: dr. Francesco Carrer, Newcastle University)

AIC.BIC.weight<-function(x){
  for(i in 1:length(x)){
    x.vect<-as.numeric(c(x[1:i]))}
  delta<-x.vect-min(x.vect)
  L<-exp(-0.5*delta)
  result<-round(L/sum(L),digits=7)
  return(result)
}

write.table(AIC.BIC.weight(q_BIC[1:2]),file="Output/BIC_Models01_weights.txt")
write.table(AIC.BIC.weight(q_BIC[1:3]),file="Output/BIC_Models02_weights.txt")
write.table(AIC.BIC.weight(q_BIC[1:4]),file="Output/BIC_Models03_weights.txt")
write.table(AIC.BIC.weight(q_BIC[1:5]),file="Output/BIC_Models04_weights.txt")

The Schwarz’s Bayesian Information Criterion (BIC) has been employed to compare the competing models and to assess the performance of each covariate. BIC is calculated as the difference between the maximised likelihood of the model and the product of covariates and number of observations (points), therefore a lower BIC corresponds to a better performing model. Following the principle of parsimony, step-wise selection of covariates enables the identification of the combination of variables that minimises BIC values, and the covariates that show less contribution to the performance of the models are excluded during the process (Tab. 1). The remaining covariates were then employed to fit a fourth “hybrid” model that includes the best performing variables (Model 4).

Tab. 1 - Result of the BIC stepwise covariates selection and model selection based on BIC weights.
PPMs Selected Covariates Discarded
Covariates
BIC Weights
Model 0–1
Weights
Model 0–2
Weights
Model 0–3
Weights
Model 0–4
0 - - 285618.727843511 0 0 0 0
1 Slope, TPI100, CI, Geomorphon - 267576.992701805 1 1 1 0
2 Eu. Dist. Metamorphic Rocks - 279957.068568553 - 0 0 0
3 Tumuli’s CVA - 284537.742730204 - - 0 0
4 Slope, TPI100, CI, Geomorphon, Eu. Dist. Metamorphic Rocks, Tumuli’s CVA - 266045.103883927 - - - 1

During the step-wise selection no covariates have been excluded in Model 1, 2, 3 meaning that all the spatial variables show significant correlation with the points. Thus, the hybrid model (Model 4) has been fitted using all the six initially selected covariates and it scored the lowest BIC value (Tab. 1). The coefficients of the hybrid (Model 4) show an inverse correlation of the sites’ locational pattern with Slope, TPI and the distances from metamorphic rocks (Eu. Dist. Meta). Conversely, the coefficients of BIC-selected Model 4 show a direct correlation with CI, Geomorphon and CVA (Tab. 2).

Tab. 2 - Covariates of the BIC-selected Model 4.
Covariates Estimate S.E. CI 95% lo CI 95% hi Z test
Slope -4.492999e-03 1.449379e-03 -7.333730e-03 -1.652268e-03 <0.001
CI 5.986409e-02 6.500641e-04 5.858999e-02 6.113819e-02 <0.001
TPI -1.309407e-01 4.892103e-03 1.405291e-01 -1.213524e-01 <0.001
Geomorphon 1.695567e-01 3.900717e-03 1.619115e-01 1.772020e-01 <0.001
Eu. Dist. Meta -6.269355e-05 1.828274e-06 -6.627690e-05 -5.911020e-05 <0.001
CVA 1.433004e-03 1.617823e-04 1.115916e-03 1.750091e-03 <0.001

Second-order properties

The interpoint dependence of sites (H2) has been assessed applying the **Besag’s L function* (Fig. 6).

Lfun_0 <- envelope(q_mod_0, fun=Lest, correction="trans", nsim=NuSim)

Lfun_1 <- envelope(q_mod_4, fun=Lest, correction="trans", nsim=NuSim)
Fig. 6 - *Results of the L function on model 0 and model 4*

Fig. 6 - Results of the L function on model 0 and model 4

The observed values completely exceed the 95th percentile of simulated values (the highest limit of the confidence envelope) both in the homogeneous (Model 0) and inhomogeneous (Model 4) plots, suggesting an extremely high positive correlation of points that cannot be explained with landscape preferences alone. Because even the KDE highlighted hot-spots in the intensity of the sites and the nearest neighbour (nn) distances evidence that the qubbas are very close placed (min nn distance: 0,72 m; mean nn distance: 23,3 m), the occurrence of a positive dependence between points was tested fitting a cluster process model.

Fitting Neyman-Scott Cluster Point Processes (KPPM)

In this research, a Neyman Scott Cluster process (NSC) was employed. A variant of the NSC process called Thomas cluster process was employed, using the argument method = Thomas in the kppm function. Then, two different kppm have been fitted: a stationary Thomas Cluster model (Model 5) and an inhomogeneous Thomas Cluster model (Model 6) resulted by fitting the point process intensity to the set of covariates of the ppm with the best BIC score (Model 4).

q_mod_5<- kppm(q_ppp ~ 1, "Thomas", method="clik2")

q_mod_6<- kppm(q_ppp, ~  Slope + CI + TPI + Geo + Meta + CVA,
               "Thomas", method="clik2", data = covar)

# Create KPPM lists
kppp_q_models<-list(model5=q_mod_5,model6=q_mod_6)

# Export KPPM summary
write.table(capture.output(print(kppp_q_models)), file="Output/KPP_Models.txt")
Tab. 3 - Parameters of the Stationary cluster point process Model 5 and of the Inhomogeneous cluster point process Model 6. Kappa= intensity of the parents.
Models Intensity Kappa (Intensity of the parents) Scale (cluster radius) Mean cluster size (number of offsprings)
Kppm 5 2.505782e-06 1.606992e-08 4.815851e+02 155.5071 points
Kppm 6 Log intensity: ~Slope + CI + TPI + Geo + Meta + CVA 1.66968e-08 4.72056e+02 Mean cluster size: [pixel image]

The parameters of the two kppm show that the qubbas spatial clusterization is characterised by a very low number of parent points and a mean cluster radius of ≈ 475 m (Tab. 3). The H2 has been addressed applying the L function to Models 6 (Fig. 7).

# L Function Neyman-Scott Cluster Process

Lfun_2 <- envelope(q_mod_5, fun=Lest, correction="trans", nsim=NuSim)

Lfun_3 <- envelope(q_mod_6, fun=Lest, correction="trans", nsim=NuSim)
Fig. 7 - *Results of the L function on model 5 and model 6*

Fig. 7 - Results of the L function on model 5 and model 6

The results display no deviations of the observed values from the confidence envelope. This suggests that the distribution of qubbas can be accounted for by broad landscape preferences with local tendencies towards tomb clustering.

Part 3: Residuals values measure

Finally, the goodness-of-fit of Model 6 was assessed through the calculation of the model’s residual values. The diagnostic smoothed residual values of the Model 6 have been calculated through the spatstat function residuals.kppm.

resv <- residuals.kppm(q_mod_6, drop=T)
Fig. 8 - *Residual values of model 6*

Fig. 8 - Residual values of model 6

As shown in Fig. 8, the smoothed residual values of Model 6 display an overall uniform level of prediction, except for a limited portion in the centre of the ROI in which the model greatly underestimated the true intensity of sites. The area corresponds to a small outcrop, where the remote survey documented a markedly higher occurrence of qubbas than in the surroundings (∼1/3rd of the dataset). This significant positive autocorrelation (similar values are close to each other) might imply that the location became a pole of attraction of disproportionate size against the others in the region.

Part 4: Assessing spatial correlation between qubbas and ancient tumuli

Likely, social memory-driven choices may also emerge assessing the possibility of a cultural continuity between tumuli and qubbas. In fact, the CVA coefficient of Model 4 shows that there is a slight direct correlation between the distribution of qubbas and the intervisibility with tumuli (Tab. 2).

# Create marked ppp with Tumuli and Qubbas

valT<-paste(rep("T",length(sites_t)))
valQ<-paste(rep("Q",length(sites_q)))
val<-c(valT,valQ)

X<-c(coordinates(sites_t)[,1],coordinates(sites_q)[,1])
Y<-c(coordinates(sites_t)[,2],coordinates(sites_q)[,2])

sites_tq<-ppp(X,Y,region,marks=as.factor(val))
rjitter(sites_tq,0.5)

# Cross G- Function r=seq(0,1500,50),

D <- density(split(sites_tq))

xgf <- envelope(sites_tq, Gcross, i= "Q", j="T",
                simulate= expression(rmpoispp(D, types=names(D))), nsim = NuSim)

# Extracting the Cross G function highest observed value
print(xgf$r[xgf$obs>xgf$hi])

The resulting graph of the cross-type nearest-neighbour function Gij(r) shows that for great distances (>400m) the cross G function displays a segregation pattern that is probably related to the dispersal of the qubbas in the ROI towards locations where the tumuli are not found (Fig. 9). On the contrary, at shorter distances (<400m) there is an aggregation of the qubbas around the tumuli that is consistent throughout the region (Fig. 9).

Fig. 9 - *The results of the cross G function*

Fig. 9 - The results of the cross G function