The lczexplore package was developed thanks to the PAENDORA2 (Pour la gestion du confort estival : Données, Outils et Recherche-Action, 2022 -2025, funded by ADEME, a French agency for ecological transition).
Its purpose is to easily compare different Local Climate Zone (LCZ) classifications on the same study territory.
Before using this package, one must assign a local climate zone (LCZ) type
to each Spatial Reference Units of a territory.
These units can be topological units or geometries derived from pixels of a raster approach
the package deals with the conversion). Fo a detailed example of the raster approach,
please read the “lczexplore_raster_vector” vignette (vignette("lczexplore_raster_vector"
).
As several workflows can produce these LCZ types, this package allows to represent them and to compare
pairs of these classifications (pairwise).
It has been extended to compare any pair of qualitative variable set on valid geographical units.
The following allows the installation from the github repository: (devtools::install_github("orbisgis/lczexplore", build_vignettes = TRUE)
).
On can also install from the tar.gz file and install it (see ?install.packages
).
Load the package: library(lczexplore)
.
library(ggplot2)
library(lczexplore)
The function importLCZvect
is the generic import function for the file containing geometries,
or Reference Spatial Units (RSU), and the associated Local Climate Zone Type (LCZ).
Imports of .geojson and .shp files were tested, but any format accepted by the function st_read
of the sf
package should work.
The dirPath
and the file
arguments set the path to the folder and the filename to be loaded.
The compulsory argument column
is the name of the column in which are the LCZ types.
The optional arguments geomID
and confid
allow the user to load column of geometry identifiers
and a column containing a measure of the confidence associated to the LCZ type of a given geometry
(these are needed for the sensitivity analysis described below).
The default output of importLCZvect
is an object of class simple feature
as defined by the openGIS consortium and as handled by the sf
package, an R reference
library for Geographical Information System vector Data [https://doi.org/10.32614/RJ-2018-009].
If one sets the argument output="bBox"
, importLCZvect will output only the bounding
box of the geometries of the input file. It is useful, for instance, when you want to use it to
crop another file and study the same area.
dirPath<-paste0(system.file("extdata",package="lczexplore"),"/")
dirPathOSM<-paste0(dirPath,"osm/2022/Redon")
dirPathBDT<-paste0(dirPath,"bdtopo_2_2/Redon")
redonOSM<-importLCZvect(dirPath=dirPathOSM,file="rsu_lcz.geojson",column = "LCZ_PRIMARY")
redonBDT<-importLCZvect(dirPath=dirPathBDT,file="rsu_lcz.geojson",column = "LCZ_PRIMARY")
#
Once the dataset has been loaded into an R object of class sf
, the function showLCZ
produces the map of it’s repartition. The argumentsf
sets the input dataset,
the column of the LCZ types is passed to the argument column
. The argument wf
is simply a string
to specify which workflow produced the input data (it is used to produce the title).
For standard LCZ types, one sets the argument repr="standard"
, and the function will recognize
levels from 1 to 10 and from 101 to 107.
It is the recommended choice. But standard
representation will also accept levels from 11 to 17 and from A to G.
(for other conventions, see the alter
representation).
showLCZ(sf=redonOSM,wf="OSM",column="LCZ_PRIMARY",repr="standard",LCZlevels="",colors="")
One can note that by default, all the 17 possible types of an LCZ classification are present in the legend. This makes comparison easier when one compares dataset with different present levels.
To only represent the present levels, set
drop=TRUE
.
#> [1] "Compact mid" "Compact low" "Open mid"
#> [4] "Open low" "Large low" "Sparsely built"
#> [7] "Dense trees" "Scattered trees" "Low plants"
#> [10] "Bare rock or paved" "Water"
#> Compact mid Compact low Open mid Open low
#> 2 3 5 6
#> Large low Sparsely built Dense trees Scattered trees
#> 8 9 11 12
#> Low plants Bare rock or paved Water
#> 14 15 17
The non-standard representation, triggered with repr=alter
is useful when:
With repr=alter
, one passes the vector of possible levels to the argument LCZlevels
and the vector of associated colors to the argument colors
.
If no vector of levels or no vector of colors is given, the levels will be deduced from the unique values present in the data, and the colors will be picked from the “Polychrome 36” palette of the grDevices
package.
For instance: one can first plot the column without knowing the LCZ levels present.
# Choice of random colors from a palette
testCol <- palette.colors(n=17, palette="Polychrome 36")
showLCZ(sf=redonOSM,wf="OSM",column="LCZ_PRIMARY",repr="alter",colors=testCol, useStandCol = FALSE, drop=TRUE)
#> [1] "redonOSM"
#> 3.1 : No levels but a color vector which size is greater than the number of levels in the data, unused colors were dropped.
The colors one wants to use can be specified either by their name (if a known color name in R) or it’s RGB code, and passed to the argument colors
.
LCZlevels<-c(2, 3, 5, 6, 8, 9, 101, 102, 104, 105, 107)
couleurs<-c("red","brown1","orange","coral","grey","darkgreen","chartreuse4","springgreen3",
"darkolivegreen","black","#6d67fd")
showLCZ(sf=redonOSM,wf="OSM",column="LCZ_PRIMARY",repr="alter",drop=TRUE,LCZlevels=LCZlevels,colors=couleurs)
#> [1] "redonOSM"
#> One vector of levels, one vector of colors, either the same length (case 9: and 10:), or colors longer (case 12:, unused colors were dropped),
#> reduced to 4: & 5: A single vector was provided, whose names cover the levels in the data (values are expected to be colors).
For instance, one may want to illustrate the contour of a city, by grouping the urban LCZs and the vegetation ones.
Use the groupLCZ
function and set the dataset with the argument sf
and the column with column
.
Then pass as many vectors as there are grouped category, each of them containing the levels to be grouped together.
The newly created categories will be stored in the outCol
column.
If the user doesn’t enter them (by default outCol="grouped"
).
The newly created dataset can be mapped with showLCZ
function.
redonOSMgrouped<-groupLCZ(redonOSM,column="LCZ_PRIMARY",
urban=c("1","2","3","4","5","6","7","8","9"),
industry="10",
vegetation=c("101","102","103","104"),
impervious="105",pervious="106",water="107",colors=c("red","black","green","grey","burlywood","blue"))
showLCZ(redonOSMgrouped, column="grouped",repr="alter",wf="OSM",
LCZlevels = c("urban","industry","vegetation","impervious","pervious","water"),
colors=c("red","black","green","grey","burlywood","blue"))
#> [1] "redonOSMgrouped"
#> One vector of levels, one vector of colors, either the same length (case 9: and 10:), or colors longer (case 12:, unused colors were dropped),
#> reduced to 4: & 5: A single vector was provided, whose names cover the levels in the data (values are expected to be colors).
The purpose of this package is to easily compare local climate zone classifications produced by different algorithms on the same study area. In our example, the GeoClimate workflow was used with two different input data to produce two LCZ classifications: the BD TOPO® v2.2, produced by the French National Institute of Geographic and Forest Information, and OpenStreetMap.
NOTE: The vignette “lczexplore_raster_vector” explores in detail another example with WUDAPT raster data.
One has to call compareLCZ
and feed it, for each dataset,
at least the name of the sf
object and the name of the columns
where the LCZ types are stored.
As the two input data set can have different coordinate reference system (CRS),
you need to tell which dataset is the reference
(by default ref=1
, that is the second data set will be re-transformed into the CRS of the first data set).
NOTE: if the two data sets to compare have the same column names, the compareLCZ function will concatenate a “.1” string to the column names of the second file
The graphical output includes 4 graphs and is easier to read after zooming or exporting.
comparison<-compareLCZ(sf1=redonBDT,column1="LCZ_PRIMARY", wf1="BD TOPO v2.2",
sf2=redonOSM,column2="LCZ_PRIMARY",wf2="Open Street Map", ref=1,
repr="standard",exwrite=F, location="Redon", saveG="comparison")
#> The column LCZ_PRIMARY of the dataset redonBDT is the reference against which the LCZ_PRIMARY column of the dataset redonOSM will be compared.
#> Both sf datasets need to live in the same crs projection (srid / epsg),
#> they will be coerced to the specified reference (redonBDT)
#> Warning: attribute variables are assumed to be spatially constant throughout
#> all geometries
#> Warning: attribute variables are assumed to be spatially constant throughout
#> all geometries
knitr::include_graphics("comparison.png", error=FALSE)
plot of chunk unnamed-chunk-10
The comparison also works on grouped data.
redonOSMgrouped<-groupLCZ(redonOSM,column="LCZ_PRIMARY",
urban=c("1","2","3","4","5","6","7","8","9"),
industry="10",
vegetation=c("101","102","103","104"),
impervious="105",pervious="106",water="107",colors=c("red","black","green","grey","burlywood","blue"))
redonBDTgrouped<-groupLCZ(redonBDT,column="LCZ_PRIMARY",
urban=c("1","2","3","4","5","6","7","8","9"),
industry="10",
vegetation=c("101","102","103","104"),
impervious="105",pervious="106",water="107",colors=c("red","black","green","grey","burlywood","blue"))
comparisonGrouped<-compareLCZ(sf1=redonOSMgrouped,column1="grouped",
sf2=redonBDTgrouped, column2="grouped",
repr="alter",
levels=c("urban","industry","vegetation","impervious","pervious","water"),
colors=c("red","black","green","grey","burlywood","blue"), saveG="comparisonGrouped")
#> The column grouped of the dataset redonOSMgrouped is the reference against which the grouped column of the dataset redonBDTgrouped will be compared.
#> Both sf datasets need to live in the same crs projection (srid / epsg),
#> they will be coerced to the specified reference (redonOSMgrouped)
#> Warning: attribute variables are assumed to be spatially constant throughout
#> all geometries
#> Warning: attribute variables are assumed to be spatially constant throughout
#> all geometries
knitr::include_graphics("comparisonGrouped.png", error=FALSE)
plot of chunk unnamed-chunk-12
A little cheat is possible ! You can try to group and compare on-the-fly with tryGroup=TRUE
. The syntax is then as shown in the nex code chunk.
comparisonGrouped<-compareLCZ(sf1=redonBDT,column1="LCZ_PRIMARY", wf1="groupedBDT",
sf2=redonOSM, column2="LCZ_PRIMARY", wf2="groupedOSM",
repr="alter", ref=2, saveG="", exwrite=FALSE, location="Redon", plot=TRUE,
urban=c("1","2","3","4","5","6","7","8","9"),
industry="10",
vegetation=c("101","102","103","104"),
impervious="105",pervious="106",water="107",
colors=c("red","black","green","grey","burlywood","blue"),tryGroup = TRUE)
#> The column LCZ_PRIMARY of the dataset redonBDT is the reference against which the LCZ_PRIMARY column of the dataset redonOSM will be compared.
#> Both sf datasets need to live in the same crs projection (srid / epsg),
#> they will be coerced to the specified reference (redonOSM)
#> Level names in your 1st dataset didn't match original data.
#> As tryGroup=TRUE, the function groupLCZ will try to create a "grouped" column with level names and levels specified in (...).
#> If this doesn't work, compareLCZ function may fail.
#> As tryGroup=TRUE, the function groupLCZ will try to create a "grouped" column with level names and levels specified in (...).
#> If this doesn't work, compareLCZ function may fail.
#> Warning: attribute variables are assumed to be spatially constant throughout
#> all geometries
#> Warning: attribute variables are assumed to be spatially constant throughout
#> all geometries
CompareLCZ
outputs a list called matConfOut
which contains:
$data
the intersected geometries, their identifiers (if geomID
was not empty), their LCZ types and their areas,$matConfLarge
the confusion matrix,$percAgg
the general agreement of the two classifications on the whole area (as a percentage of the global surface of the area),$areas
the summed area for each LCZ of both classifications.At last: it exWrite=TRUE
the data are exported to a .csv file whose name
is created from the argument wf1
and wf2
. It can then be loaded for further analysis.
comparison$percAgg
#> [1] 61.75
comparisonGrouped$areas
#> typeLevels area1 area2
#> 1 urban 25.85 30.75
#> 2 industry 0.00 0.00
#> 3 vegetation 67.56 58.37
#> 4 impervious 3.17 7.61
#> 5 pervious 0.00 0.00
#> 6 water 3.41 3.28
Some algorithm may provide a confidence value associated to the LCZ type. For instance, GeoClimate, for a given geometry, chooses the LCZ type according to a distance to a hypercube defining each LCZ type (in a space defined by the urban canopy parameters)?. Therefore, GeoClimate assigns the geometry to the closest type, but another type may have been quite close too. GeoClimate supplies a uniqueness value, between 0 and 1. The closer it is to 1, the less the second LCZ type was a good candidate. So this uniqueness can be seen as a confidence value.
When using the importLCZvect
function one needs to specify the confidence value associated
to the LCZ (confid
argument). On can also set the name of the geometry identifier (geomID
argument ).
dirPath<-paste0(system.file("extdata",package="lczexplore"),"/")
dirPathOSM<-paste0(dirPath,"osm/2022/Redon")
dirPathBDT<-paste0(dirPath,"bdtopo_2_2/Redon")
redonOSM<-importLCZvect(dirPath=dirPathOSM,file="rsu_lcz.geojson",column = "LCZ_PRIMARY",geomID = "ID_RSU",confid="LCZ_UNIQUENESS_VALUE")
redonBDT<-importLCZvect(dirPath=dirPathBDT,file="rsu_lcz.geojson",column = "LCZ_PRIMARY",geomID = "ID_RSU",confid="LCZ_UNIQUENESS_VALUE")
redonCompare<-compareLCZ(sf1=redonBDT,wf1="bdt", geomID1 = "ID_RSU",column1 ="LCZ_PRIMARY",
confid1 = "LCZ_UNIQUENESS_VALUE",
sf2=redonOSM,wf2="osm",geomID2 = "ID_RSU",column2="LCZ_PRIMARY",confid2 ="LCZ_UNIQUENESS_VALUE",exwrite=FALSE,plot=FALSE)
#> The column LCZ_PRIMARY of the dataset redonBDT is the reference against which the LCZ_PRIMARY column of the dataset redonOSM will be compared.
#> Both sf datasets need to live in the same crs projection (srid / epsg),
#> they will be coerced to the specified reference (redonBDT)
#> Warning: attribute variables are assumed to be spatially constant throughout
#> all geometries
#> Warning: attribute variables are assumed to be spatially constant throughout
#> all geometries
#> Plot set to FALSE, no plots created
The function confidSensib
performs the sensitivity analysis from the output of compareLCZ
.
NOTE: if the two data sets to compare have the same column names,
the compareLCZ function will concatenate a “.1” string to the column names of the second file,
so the enriched names will be the ones to feed confidSensib
.
The question we want to answer is: Does the agreement between the two classification gets bigger if we keep only the geometries for which the LCZ value is associated to a confidence value bigger than a thershold?
The span of the confidence values is divided on as many points as the user chooses with the argument nPoints
For each of these thresholds, the function only keeps the reference spatial units (RSU, i.e. the geometries) for which the confidence value is higher than the threshold. Then it computes the agreement between the classifications based on these geoms.
names(redonCompare$data)
#> [1] "ID_RSU" "LCZ_PRIMARY" "LCZ_UNIQUENESS_VALUE"
#> [4] "ID_RSU.1" "LCZ_PRIMARY.1" "LCZ_UNIQUENESS_VALUE.1"
#> [7] "area" "agree" "location"
confidRedon<-confidSensib(inputDf=redonCompare$data,filePath="", nPoints=10,
wf1="bdtopo_2_2", wf2="osm",
geomID1="ID_RSU", column1="LCZ_PRIMARY", confid1="LCZ_UNIQUENESS_VALUE",
geomID2="ID_RSU.1",column2="LCZ_PRIMARY.1", confid2="LCZ_UNIQUENESS_VALUE.1",
sep=";", repr="standard",
plot=TRUE, saveG = ".")
#> [1] "confSeq in internFunction "
#> 0% 11.11111% 22.22222% 33.33333% 44.44444% 55.55556%
#> 0.006101597 0.091693921 0.202059986 0.268302499 0.456597082 0.575888942
#> 66.66667% 77.77778% 88.88889% 100%
#> 0.683662384 0.815106352 0.993250260 1.000000000
#> Warning: Removed 1 rows containing missing values (`geom_point()`).
#> Warning: Removed 1 rows containing missing values (`geom_text()`).
#> Warning: Removed 1 rows containing missing values (`geom_point()`).
#> Warning: Removed 1 rows containing missing values (`geom_text()`).
#> [1] "confSeq in internFunction "
#> 0% 11.11111% 22.22222% 33.33333% 44.44444% 55.55556% 66.66667%
#> 0.03755517 0.10117962 0.15640481 0.23005578 0.24716110 0.29995499 0.43094746
#> 77.77778% 88.88889% 100%
#> 0.52734432 0.65608618 0.79049066
#> [1] "confSeq in internFunction "
#> 0% 11.11111% 22.22222% 33.33333% 44.44444% 55.55556%
#> 0.006101597 0.080437929 0.103621330 0.112250552 0.215540816 0.272522342
#> 66.66667% 77.77778% 88.88889% 100%
#> 0.425410263 0.484218592 0.487496806 0.693319494
#> [1] "confSeq in internFunction "
#> 0% 11.11111% 22.22222% 33.33333% 44.44444% 55.55556% 66.66667%
#> 0.03755517 0.06939875 0.08532054 0.08532054 0.08532054 0.08532054 0.08532054
#> 77.77778% 88.88889% 100%
#> 0.18387091 0.30116352 0.43719838
#> [1] "confSeq in internFunction "
#> 0% 11.11111% 22.22222% 33.33333% 44.44444% 55.55556%
#> 0.008741054 0.008861656 0.009826473 0.009826473 0.010955103 0.012365891
#> 66.66667% 77.77778% 88.88889% 100%
#> 0.012365891 0.013056145 0.013253361 0.013253361
#> [1] "confSeq in internFunction "
#> 0% 11.11111% 22.22222% 33.33333% 44.44444% 55.55556%
#> 0.006692699 0.067271124 0.154323145 0.213118523 0.257162253 0.418052676
#> 66.66667% 77.77778% 88.88889% 100%
#> 0.524607611 0.601875346 0.679941194 1.000000000
#> [1] "confSeq in internFunction "
#> 0% 11.11111% 22.22222% 33.33333% 44.44444% 55.55556% 66.66667%
#> 0.06324468 0.12275556 0.18226644 0.24177732 0.24177732 0.24177732 0.24177732
#> 77.77778% 88.88889% 100%
#> 0.24177732 0.24177732 0.24177732
#> [1] "confSeq in internFunction "
#> 0% 11.11111% 22.22222% 33.33333% 44.44444% 55.55556% 66.66667%
#> 0.01899439 0.24990516 0.46084437 0.62301538 0.71562722 0.81304148 0.87950091
#> 77.77778% 88.88889% 100%
#> 0.99051465 1.00000000 1.00000000
#> Warning: Removed 7 rows containing missing values (`geom_point()`).
#> Warning: Removed 7 rows containing missing values (`geom_text()`).
#> Warning: Removed 7 rows containing missing values (`geom_point()`).
#> Warning: Removed 7 rows containing missing values (`geom_text()`).
knitr::include_graphics("GeneralUniquenessSensib.png", error=FALSE)
plot of chunk unnamed-chunk-17
One can see that the agreement between classifications using BDTOPO and OpenStreetMap tends to increase when the confidence of the LCZ we keep increases (except for the open low LCZ !).
knitr::include_graphics("byLCZUniquenessSensib.png", error=FALSE)
plot of chunk unnamed-chunk-18
These are the main functions of the package, for further use, refer to the man page of each function.