lczexplore_en

Matthieu Gousseff

2023-10-25

A package to explore Local Climate Zones: lczexplore

Purpose and origin

The lczexplore package was developed thanks to the PAENDORA2 (Pour la gestion du confort estival : Données, Outils et Recherche-Action, 2022 -2025, funded by ADEME, a French agency for ecological transition).

Its purpose is to easily compare different Local Climate Zone (LCZ) classifications on the same study territory.

Before using this package, one must assign a local climate zone (LCZ) type to each Spatial Reference Units of a territory. These units can be topological units or geometries derived from pixels of a raster approach the package deals with the conversion). Fo a detailed example of the raster approach, please read the “lczexplore_raster_vector” vignette (vignette("lczexplore_raster_vector"). As several workflows can produce these LCZ types, this package allows to represent them and to compare pairs of these classifications (pairwise). It has been extended to compare any pair of qualitative variable set on valid geographical units.

Installation

The following allows the installation from the github repository: (devtools::install_github("orbisgis/lczexplore", build_vignettes = TRUE)).

On can also install from the tar.gz file and install it (see ?install.packages).

Load the package: library(lczexplore).

library(ggplot2)
library(lczexplore)

First use-case: load a file with geometries and standard LCZ

Import files with importLCZvect function

The function importLCZvect is the generic import function for the file containing geometries, or Reference Spatial Units (RSU), and the associated Local Climate Zone Type (LCZ). Imports of .geojson and .shp files were tested, but any format accepted by the function st_read of the sf package should work.

The dirPath and the file arguments set the path to the folder and the filename to be loaded. The compulsory argument column is the name of the column in which are the LCZ types. The optional arguments geomID and confid allow the user to load column of geometry identifiers and a column containing a measure of the confidence associated to the LCZ type of a given geometry (these are needed for the sensitivity analysis described below).

The default output of importLCZvect is an object of class simple feature as defined by the openGIS consortium and as handled by the sf package, an R reference library for Geographical Information System vector Data [https://doi.org/10.32614/RJ-2018-009].

If one sets the argument output="bBox", importLCZvect will output only the bounding box of the geometries of the input file. It is useful, for instance, when you want to use it to crop another file and study the same area.

dirPath<-paste0(system.file("extdata",package="lczexplore"),"/")
dirPathOSM<-paste0(dirPath,"osm/2022/Redon")
dirPathBDT<-paste0(dirPath,"bdtopo_2_2/Redon")
redonOSM<-importLCZvect(dirPath=dirPathOSM,file="rsu_lcz.geojson",column = "LCZ_PRIMARY")
redonBDT<-importLCZvect(dirPath=dirPathBDT,file="rsu_lcz.geojson",column = "LCZ_PRIMARY")
#

Map the LCZ types with the function showLCZ

Once the dataset has been loaded into an R object of class sf, the function showLCZ produces the map of it’s repartition. The argumentsf sets the input dataset, the column of the LCZ types is passed to the argument column. The argument wf is simply a string
to specify which workflow produced the input data (it is used to produce the title).

Standard representation

For standard LCZ types, one sets the argument repr="standard", and the function will recognize levels from 1 to 10 and from 101 to 107. It is the recommended choice. But standard representation will also accept levels from 11 to 17 and from A to G. (for other conventions, see the alter representation).

showLCZ(sf=redonOSM,wf="OSM",column="LCZ_PRIMARY",repr="standard",LCZlevels="",colors="")

plot of chunk unnamed-chunk-4plot of chunk unnamed-chunk-4 One can note that by default, all the 17 possible types of an LCZ classification are present in the legend. This makes comparison easier when one compares dataset with different present levels. To only represent the present levels, set drop=TRUE.

#>  [1] "Compact mid"        "Compact low"        "Open mid"          
#>  [4] "Open low"           "Large low"          "Sparsely built"    
#>  [7] "Dense trees"        "Scattered trees"    "Low plants"        
#> [10] "Bare rock or paved" "Water"             
#>        Compact mid        Compact low           Open mid           Open low 
#>                  2                  3                  5                  6 
#>          Large low     Sparsely built        Dense trees    Scattered trees 
#>                  8                  9                 11                 12 
#>         Low plants Bare rock or paved              Water 
#>                 14                 15                 17

plot of chunk unnamed-chunk-5plot of chunk unnamed-chunk-5

Alternative visualization

The non-standard representation, triggered with repr=alter is useful when:

With repr=alter, one passes the vector of possible levels to the argument LCZlevels and the vector of associated colors to the argument colors.

If no vector of levels or no vector of colors is given, the levels will be deduced from the unique values present in the data, and the colors will be picked from the “Polychrome 36” palette of the grDevices package.

For instance: one can first plot the column without knowing the LCZ levels present.

# Choice of random colors from a palette
testCol <- palette.colors(n=17, palette="Polychrome 36")
showLCZ(sf=redonOSM,wf="OSM",column="LCZ_PRIMARY",repr="alter",colors=testCol, useStandCol = FALSE, drop=TRUE)
#> [1] "redonOSM"
#> 3.1 : No levels but a color vector which size is greater than the number of levels in the data, unused colors were dropped.

plot of chunk unnamed-chunk-6plot of chunk unnamed-chunk-6

The colors one wants to use can be specified either by their name (if a known color name in R) or it’s RGB code, and passed to the argument colors.


LCZlevels<-c(2, 3, 5, 6, 8, 9, 101, 102, 104, 105, 107)
couleurs<-c("red","brown1","orange","coral","grey","darkgreen","chartreuse4","springgreen3",
            "darkolivegreen","black","#6d67fd")
showLCZ(sf=redonOSM,wf="OSM",column="LCZ_PRIMARY",repr="alter",drop=TRUE,LCZlevels=LCZlevels,colors=couleurs)
#> [1] "redonOSM"
#> One vector of levels, one vector of colors, either the same length (case 9: and 10:), or colors longer (case 12:, unused colors were dropped),
#>          reduced to  4: & 5: A single vector was provided, whose names cover the levels in the data (values are expected to be colors).

plot of chunk unnamed-chunk-7plot of chunk unnamed-chunk-7

Group several LCZ types in a category using groupLCZ

For instance, one may want to illustrate the contour of a city, by grouping the urban LCZs and the vegetation ones.
Use the groupLCZ function and set the dataset with the argument sf and the column with column. Then pass as many vectors as there are grouped category, each of them containing the levels to be grouped together. The newly created categories will be stored in the outCol column. If the user doesn’t enter them (by default outCol="grouped").

The newly created dataset can be mapped with showLCZ function.

redonOSMgrouped<-groupLCZ(redonOSM,column="LCZ_PRIMARY",
                           urban=c("1","2","3","4","5","6","7","8","9"),
                           industry="10",
                           vegetation=c("101","102","103","104"),
                           impervious="105",pervious="106",water="107",colors=c("red","black","green","grey","burlywood","blue"))
showLCZ(redonOSMgrouped, column="grouped",repr="alter",wf="OSM",
        LCZlevels = c("urban","industry","vegetation","impervious","pervious","water"),
        colors=c("red","black","green","grey","burlywood","blue"))
#> [1] "redonOSMgrouped"
#> One vector of levels, one vector of colors, either the same length (case 9: and 10:), or colors longer (case 12:, unused colors were dropped),
#>          reduced to  4: & 5: A single vector was provided, whose names cover the levels in the data (values are expected to be colors).

plot of chunk unnamed-chunk-8plot of chunk unnamed-chunk-8

Compare two LCZ classification with the function compareLCZ

The purpose of this package is to easily compare local climate zone classifications produced by different algorithms on the same study area. In our example, the GeoClimate workflow was used with two different input data to produce two LCZ classifications: the BD TOPO® v2.2, produced by the French National Institute of Geographic and Forest Information, and OpenStreetMap.

NOTE: The vignette “lczexplore_raster_vector” explores in detail another example with WUDAPT raster data.

One has to call compareLCZ and feed it, for each dataset, at least the name of the sf object and the name of the columns where the LCZ types are stored. As the two input data set can have different coordinate reference system (CRS), you need to tell which dataset is the reference (by default ref=1, that is the second data set will be re-transformed into the CRS of the first data set).

NOTE: if the two data sets to compare have the same column names, the compareLCZ function will concatenate a “.1” string to the column names of the second file

The graphical output includes 4 graphs and is easier to read after zooming or exporting.

comparison<-compareLCZ(sf1=redonBDT,column1="LCZ_PRIMARY", wf1="BD TOPO v2.2",
                        sf2=redonOSM,column2="LCZ_PRIMARY",wf2="Open Street Map", ref=1,
                        repr="standard",exwrite=F, location="Redon", saveG="comparison")
#>  The column  LCZ_PRIMARY  of the dataset redonBDT is the reference against which the  LCZ_PRIMARY  column of the dataset  redonOSM will be compared.
#> Both sf datasets need to live in the same crs projection (srid / epsg),
#> they will be coerced to the specified reference (redonBDT)
#> Warning: attribute variables are assumed to be spatially constant throughout
#> all geometries

#> Warning: attribute variables are assumed to be spatially constant throughout
#> all geometries

knitr::include_graphics("comparison.png", error=FALSE)
plot of chunk unnamed-chunk-10

plot of chunk unnamed-chunk-10

The comparison also works on grouped data.

redonOSMgrouped<-groupLCZ(redonOSM,column="LCZ_PRIMARY",
                           urban=c("1","2","3","4","5","6","7","8","9"),
                           industry="10",
                           vegetation=c("101","102","103","104"),
                           impervious="105",pervious="106",water="107",colors=c("red","black","green","grey","burlywood","blue"))
redonBDTgrouped<-groupLCZ(redonBDT,column="LCZ_PRIMARY",
                           urban=c("1","2","3","4","5","6","7","8","9"),
                           industry="10",
                           vegetation=c("101","102","103","104"),
                           impervious="105",pervious="106",water="107",colors=c("red","black","green","grey","burlywood","blue"))
comparisonGrouped<-compareLCZ(sf1=redonOSMgrouped,column1="grouped",
           sf2=redonBDTgrouped, column2="grouped",
           repr="alter", 
           levels=c("urban","industry","vegetation","impervious","pervious","water"),
           colors=c("red","black","green","grey","burlywood","blue"), saveG="comparisonGrouped") 
#>  The column  grouped  of the dataset redonOSMgrouped is the reference against which the  grouped  column of the dataset  redonBDTgrouped will be compared.
#> Both sf datasets need to live in the same crs projection (srid / epsg),
#> they will be coerced to the specified reference (redonOSMgrouped)
#> Warning: attribute variables are assumed to be spatially constant throughout
#> all geometries

#> Warning: attribute variables are assumed to be spatially constant throughout
#> all geometries

knitr::include_graphics("comparisonGrouped.png", error=FALSE)
plot of chunk unnamed-chunk-12

plot of chunk unnamed-chunk-12

A little cheat is possible ! You can try to group and compare on-the-fly with tryGroup=TRUE. The syntax is then as shown in the nex code chunk.

comparisonGrouped<-compareLCZ(sf1=redonBDT,column1="LCZ_PRIMARY", wf1="groupedBDT",
    sf2=redonOSM, column2="LCZ_PRIMARY", wf2="groupedOSM",
    repr="alter", ref=2, saveG="", exwrite=FALSE, location="Redon", plot=TRUE,
    urban=c("1","2","3","4","5","6","7","8","9"),
    industry="10",
    vegetation=c("101","102","103","104"),
    impervious="105",pervious="106",water="107",
    colors=c("red","black","green","grey","burlywood","blue"),tryGroup = TRUE)
#>  The column  LCZ_PRIMARY  of the dataset redonBDT is the reference against which the  LCZ_PRIMARY  column of the dataset  redonOSM will be compared.
#> Both sf datasets need to live in the same crs projection (srid / epsg),
#> they will be coerced to the specified reference (redonOSM)
#> Level names in your 1st dataset didn't match original data.
#>       As tryGroup=TRUE, the function groupLCZ will try to create a "grouped" column with level names and levels specified in (...).
#>       If this doesn't work, compareLCZ function may fail.
#> As tryGroup=TRUE, the function groupLCZ will try to create a "grouped" column with level names and levels specified in (...).
#>       If this doesn't work, compareLCZ function may fail.
#> Warning: attribute variables are assumed to be spatially constant throughout
#> all geometries

#> Warning: attribute variables are assumed to be spatially constant throughout
#> all geometries

Output data for further analysis

CompareLCZ outputs a list called matConfOut which contains:

At last: it exWrite=TRUE the data are exported to a .csv file whose name is created from the argument wf1 and wf2. It can then be loaded for further analysis.

comparison$percAgg
#> [1] 61.75
comparisonGrouped$areas
#>   typeLevels area1 area2
#> 1      urban 25.85 30.75
#> 2   industry  0.00  0.00
#> 3 vegetation 67.56 58.37
#> 4 impervious  3.17  7.61
#> 5   pervious  0.00  0.00
#> 6      water  3.41  3.28

Sensitivity of the agreement to the confidence value

Some algorithm may provide a confidence value associated to the LCZ type. For instance, GeoClimate, for a given geometry, chooses the LCZ type according to a distance to a hypercube defining each LCZ type (in a space defined by the urban canopy parameters)?. Therefore, GeoClimate assigns the geometry to the closest type, but another type may have been quite close too. GeoClimate supplies a uniqueness value, between 0 and 1. The closer it is to 1, the less the second LCZ type was a good candidate. So this uniqueness can be seen as a confidence value.

Import the necessary data: LCZ type, but also confidence.

When using the importLCZvect function one needs to specify the confidence value associated to the LCZ (confid argument). On can also set the name of the geometry identifier (geomID argument ).

dirPath<-paste0(system.file("extdata",package="lczexplore"),"/")
dirPathOSM<-paste0(dirPath,"osm/2022/Redon")
dirPathBDT<-paste0(dirPath,"bdtopo_2_2/Redon")

redonOSM<-importLCZvect(dirPath=dirPathOSM,file="rsu_lcz.geojson",column = "LCZ_PRIMARY",geomID = "ID_RSU",confid="LCZ_UNIQUENESS_VALUE")

redonBDT<-importLCZvect(dirPath=dirPathBDT,file="rsu_lcz.geojson",column = "LCZ_PRIMARY",geomID = "ID_RSU",confid="LCZ_UNIQUENESS_VALUE")

redonCompare<-compareLCZ(sf1=redonBDT,wf1="bdt", geomID1 = "ID_RSU",column1 ="LCZ_PRIMARY",
                         confid1 = "LCZ_UNIQUENESS_VALUE",
                         sf2=redonOSM,wf2="osm",geomID2 = "ID_RSU",column2="LCZ_PRIMARY",confid2 ="LCZ_UNIQUENESS_VALUE",exwrite=FALSE,plot=FALSE)
#>  The column  LCZ_PRIMARY  of the dataset redonBDT is the reference against which the  LCZ_PRIMARY  column of the dataset  redonOSM will be compared.
#> Both sf datasets need to live in the same crs projection (srid / epsg),
#> they will be coerced to the specified reference (redonBDT)
#> Warning: attribute variables are assumed to be spatially constant throughout
#> all geometries

#> Warning: attribute variables are assumed to be spatially constant throughout
#> all geometries
#> Plot set to FALSE, no plots created

The function confidSensib performs the sensitivity analysis from the output of compareLCZ.

NOTE: if the two data sets to compare have the same column names, the compareLCZ function will concatenate a “.1” string to the column names of the second file, so the enriched names will be the ones to feed confidSensib.

The question we want to answer is: Does the agreement between the two classification gets bigger if we keep only the geometries for which the LCZ value is associated to a confidence value bigger than a thershold?

The span of the confidence values is divided on as many points as the user chooses with the argument nPoints

For each of these thresholds, the function only keeps the reference spatial units (RSU, i.e. the geometries) for which the confidence value is higher than the threshold. Then it computes the agreement between the classifications based on these geoms.

names(redonCompare$data)
#> [1] "ID_RSU"                 "LCZ_PRIMARY"            "LCZ_UNIQUENESS_VALUE"  
#> [4] "ID_RSU.1"               "LCZ_PRIMARY.1"          "LCZ_UNIQUENESS_VALUE.1"
#> [7] "area"                   "agree"                  "location"
confidRedon<-confidSensib(inputDf=redonCompare$data,filePath="", nPoints=10,
             wf1="bdtopo_2_2", wf2="osm",
             geomID1="ID_RSU", column1="LCZ_PRIMARY", confid1="LCZ_UNIQUENESS_VALUE",
             geomID2="ID_RSU.1",column2="LCZ_PRIMARY.1", confid2="LCZ_UNIQUENESS_VALUE.1",
             sep=";", repr="standard",
             plot=TRUE, saveG = ".")
#> [1] "confSeq in internFunction "
#>          0%   11.11111%   22.22222%   33.33333%   44.44444%   55.55556% 
#> 0.006101597 0.091693921 0.202059986 0.268302499 0.456597082 0.575888942 
#>   66.66667%   77.77778%   88.88889%        100% 
#> 0.683662384 0.815106352 0.993250260 1.000000000
#> Warning: Removed 1 rows containing missing values (`geom_point()`).
#> Warning: Removed 1 rows containing missing values (`geom_text()`).
#> Warning: Removed 1 rows containing missing values (`geom_point()`).
#> Warning: Removed 1 rows containing missing values (`geom_text()`).
#> [1] "confSeq in internFunction "
#>         0%  11.11111%  22.22222%  33.33333%  44.44444%  55.55556%  66.66667% 
#> 0.03755517 0.10117962 0.15640481 0.23005578 0.24716110 0.29995499 0.43094746 
#>  77.77778%  88.88889%       100% 
#> 0.52734432 0.65608618 0.79049066 
#> [1] "confSeq in internFunction "
#>          0%   11.11111%   22.22222%   33.33333%   44.44444%   55.55556% 
#> 0.006101597 0.080437929 0.103621330 0.112250552 0.215540816 0.272522342 
#>   66.66667%   77.77778%   88.88889%        100% 
#> 0.425410263 0.484218592 0.487496806 0.693319494 
#> [1] "confSeq in internFunction "
#>         0%  11.11111%  22.22222%  33.33333%  44.44444%  55.55556%  66.66667% 
#> 0.03755517 0.06939875 0.08532054 0.08532054 0.08532054 0.08532054 0.08532054 
#>  77.77778%  88.88889%       100% 
#> 0.18387091 0.30116352 0.43719838 
#> [1] "confSeq in internFunction "
#>          0%   11.11111%   22.22222%   33.33333%   44.44444%   55.55556% 
#> 0.008741054 0.008861656 0.009826473 0.009826473 0.010955103 0.012365891 
#>   66.66667%   77.77778%   88.88889%        100% 
#> 0.012365891 0.013056145 0.013253361 0.013253361 
#> [1] "confSeq in internFunction "
#>          0%   11.11111%   22.22222%   33.33333%   44.44444%   55.55556% 
#> 0.006692699 0.067271124 0.154323145 0.213118523 0.257162253 0.418052676 
#>   66.66667%   77.77778%   88.88889%        100% 
#> 0.524607611 0.601875346 0.679941194 1.000000000 
#> [1] "confSeq in internFunction "
#>         0%  11.11111%  22.22222%  33.33333%  44.44444%  55.55556%  66.66667% 
#> 0.06324468 0.12275556 0.18226644 0.24177732 0.24177732 0.24177732 0.24177732 
#>  77.77778%  88.88889%       100% 
#> 0.24177732 0.24177732 0.24177732 
#> [1] "confSeq in internFunction "
#>         0%  11.11111%  22.22222%  33.33333%  44.44444%  55.55556%  66.66667% 
#> 0.01899439 0.24990516 0.46084437 0.62301538 0.71562722 0.81304148 0.87950091 
#>  77.77778%  88.88889%       100% 
#> 0.99051465 1.00000000 1.00000000
#> Warning: Removed 7 rows containing missing values (`geom_point()`).
#> Warning: Removed 7 rows containing missing values (`geom_text()`).
#> Warning: Removed 7 rows containing missing values (`geom_point()`).
#> Warning: Removed 7 rows containing missing values (`geom_text()`).

knitr::include_graphics("GeneralUniquenessSensib.png", error=FALSE)
plot of chunk unnamed-chunk-17

plot of chunk unnamed-chunk-17

One can see that the agreement between classifications using BDTOPO and OpenStreetMap tends to increase when the confidence of the LCZ we keep increases (except for the open low LCZ !).


knitr::include_graphics("byLCZUniquenessSensib.png", error=FALSE)
plot of chunk unnamed-chunk-18

plot of chunk unnamed-chunk-18

These are the main functions of the package, for further use, refer to the man page of each function.