Data entry routines¶
Resources
This exercise uses the same dataset used for Spatial relationships. If you need to download the data again, just click on this link. The dataset contains the following:
dorset_cadaster.qgz, a QGIS project that preloads a geopackage containing the following layers:
roads (road network)
water_plan (area of a water management plan where special provisions may apply)
power_line_project (proposed route for a high voltage aerial cable)
parcels (the cadaster)
land_use (land uses as of 2015)
parish (administrative boundaries of the parishes within the Dorset municipality - Tasmania)
party (fictional list of parties)
building (empty layer of type polygon)
topographic_map (a sample topographic map generated from Open Street Map)
building_type (fictional list of types of buildings)
In addition to the project and its datasets, there are also folders with auxiliary files that the exercise may refer to.
Principle¶
Data entry routines are techniques used to optimize Spatial data acquisition, especially in the context of Digitizing and data imports. The idea is to automate the production of new geographic information as much as possible by minimizing the amount of data that a human has to declare manually.
Attention
The more people touch the data during the editing phase, the higher the chance errors and data corruption will occur. In general, it is good policy to restrict data editing as much as possible.
From a technical point of view, it is important to understand that these routines always have to abide by the constraints imposed by the data provider. Therefore, data entry routines exist on two levels: the provider level and the interface level (Fig. 23).
Fig. 23 Data routine levels¶
For this exercise we are going to use the case of a dataset that has to be manually digitized to store the footprints of buildings. The simplified data model (Fig. 24) shows how this dataset is structured and how it relates to other datasets.
Fig. 24 Simplified data model for building dataset¶
This data model is implemented using a geopackage (.gpkg), but ideally it would be implemented in a concurrent access database like PostgreSQL. From the data model we can have an overview of the constraints that have to be enforced. We are now going to optimize the data entry through interface optimizations.
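To illustrate what a provider-level constraint looks like, here is a minimal sketch using Python's built-in sqlite3 module (a geopackage is an SQLite database). The schema is hypothetical and heavily simplified: the column names are borrowed from the building layer, but the NOT NULL, CHECK and foreign key rules are assumptions made for the example, not the actual definition used in the exercise data.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory SQLite database with a hypothetical, simplified schema."""
    con = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when explicitly asked to.
    con.execute("PRAGMA foreign_keys = ON")
    con.executescript("""
        CREATE TABLE building_type (
            fid  INTEGER PRIMARY KEY,
            type TEXT NOT NULL UNIQUE
        );
        CREATE TABLE building (
            fid     INTEGER PRIMARY KEY,
            type    INTEGER NOT NULL REFERENCES building_type (fid),
            storeys INTEGER NOT NULL CHECK (storeys BETWEEN 1 AND 5)
        );
        INSERT INTO building_type (fid, type) VALUES (1, 'residential');
    """)
    return con


def try_insert(con, type_fid, storeys):
    """Return True if the provider accepts the row, False if a constraint rejects it."""
    try:
        with con:  # commits on success, rolls back on error
            con.execute(
                "INSERT INTO building (type, storeys) VALUES (?, ?)",
                (type_fid, storeys),
            )
        return True
    except sqlite3.IntegrityError:
        return False


con = make_demo_db()
print(try_insert(con, 1, 2))   # valid row
print(try_insert(con, 1, 9))   # violates the CHECK on storeys
print(try_insert(con, 99, 2))  # references a building type that does not exist
```

Whatever the editing interface allows, the provider has the last word: a row that breaks these rules is rejected when it is written. Interface-level routines therefore aim to make it hard to produce such a row in the first place.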
Widgets¶
Before starting to tweak our data entry interface, let's take a look at how it looks and feels by default.
Task With topographic_map as the background map, try to digitize at least one of the building footprints into the building layer.
You will see (Fig. 25) that you have to manually enter the information into every field, and in some cases you simply do not know what value to enter.
Fig. 25 Default data entry interface¶
Clearly, this is not convenient at all. In order to make the data entry more efficient we will start by defining what type of widgets should be used for each attribute of our table.
Attention
In the context of the Vector layer properties dialog in QGIS, a widget defines the type of interface the user will see when editing a layer. You can learn more about widgets in the official documentation.
Task From the Layers panel, right-click on the building layer and then click on Properties. From the Layer properties dialog, go to the Attributes form tab (Fig. 26).
Fig. 26 The Attributes form tab¶
Task Define your first widget on field fid. The widget type will be Hidden (Fig. 27).
Fig. 27 Widget definition example¶
Task Continue defining the widget types field by field according to the parameters indicated in Table 4. At the end, hit the Apply button.
| field | widget | editable | remarks |
|---|---|---|---|
| type | Value Relation | yes | Layer='building_type'; Key column='fid'; Value column='type' |
| allowed_use | Text Edit | no | Layer='roads'; Key column='fid'; Value column='pri_nane' |
| historic_building | Checkbox | yes | |
| official | Text Edit | no | |
| registered | Date | yes | |
| storeys | Range | yes | Minimum='1'; Maximum='5'; Step='1' (use the slider widget) |
| street | Text Edit | no | |
| perimeter_m | Text Edit | no | |
| area_m2 | Text Edit | no | |
If you now try to digitize one of the buildings, the interface is different (Fig. 28).
Fig. 28 Data entry interface after widget definition¶
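As a rough illustration of what two of these widgets do with the values, the sketch below mimics a Value Relation widget (the user picks a label, the key is stored) and a Range widget (only values on the configured grid can be picked). The building type names are made up for the example; only the key column (fid), the value column (type) and the range settings come from Table 4.

```python
# Hypothetical rows standing in for the building_type layer: fid -> type.
BUILDING_TYPES = {1: "residential", 2: "commercial", 3: "industrial"}


def value_relation_choices(lookup):
    """Return (label, key) pairs as a drop-down could list them, sorted by label."""
    return sorted((label, key) for key, label in lookup.items())


def stored_value(lookup, chosen_label):
    """Return the key written to the attribute table for the label the user picked."""
    for key, label in lookup.items():
        if label == chosen_label:
            return key
    raise KeyError(chosen_label)


def range_values(minimum, maximum, step):
    """List the values a Range widget with these settings lets the user pick."""
    return list(range(minimum, maximum + 1, step))


print(value_relation_choices(BUILDING_TYPES))
print(stored_value(BUILDING_TYPES, "commercial"))
print(range_values(1, 5, 1))
```

The point of both widgets is the same: the user can no longer type a value that the data model would not accept.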
However, this is still not good enough. Some fields are grayed out and cannot be edited. These fields are meant to be calculated automatically, but for that we need to look into QGIS expressions.
Expressions¶
The QGIS expression engine offers powerful possibilities when it comes to styling, analysis and, of course, editing data.
Attention
Expressions are a fundamental part of workflows and productivity in QGIS. A full description of all the expressions is available in the official documentation.
In our case we are interested in defining what the default value for a given field is. This default value can be the output of an expression.
Task Define your first default expression on field land_use. Enter this expression as the Default value and make sure the option Apply default value on update is checked (Fig. 29):
aggregate( layer:= 'land_use', aggregate:='concatenate', expression:= LU_DESCRIP, concatenator:='', filter:=intersects($geometry, geometry(@parent)))
Fig. 29 Expression example¶
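The aggregate expression can be read as: take every feature of the land_use layer whose geometry intersects the feature being edited (@parent), evaluate LU_DESCRIP on each of them, and join the results with the given concatenator. The sketch below reproduces that logic in plain Python under strong simplifications: axis-aligned rectangles stand in for real geometries, and the land use descriptions and coordinates are made up.

```python
def rectangles_intersect(a, b):
    """True if two axis-aligned rectangles (xmin, ymin, xmax, ymax) share any point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def aggregate_concatenate(features, parent_geom, concatenator=""):
    """Join LU_DESCRIP of every feature whose geometry intersects parent_geom."""
    return concatenator.join(
        f["LU_DESCRIP"]
        for f in features
        if rectangles_intersect(f["geometry"], parent_geom)
    )


# Hypothetical land_use features; descriptions and coordinates are invented.
land_use = [
    {"LU_DESCRIP": "Residential", "geometry": (0, 0, 10, 10)},
    {"LU_DESCRIP": "Grazing", "geometry": (10.5, 0, 20, 10)},
]

building = (2, 2, 4, 4)  # footprint of the feature being digitized
print(aggregate_concatenate(land_use, building))
```

Note that with an empty concatenator, a building that overlaps two land use polygons would get both descriptions glued together without a separator. Whether that matters depends on how the land use polygons are laid out.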
Task Continue defining the default expressions according to the definitions provided in Table 5. At the end, hit the Apply button.
| field | expression |
|---|---|
| official | @user_full_name |
| registered | to_date( now() ) |
| storeys | 2 |
| street | |
| perimeter_m | $perimeter |
| area_m2 | $area |
If everything went well, when you now proceed to digitize your buildings you should observe that most of the fields are pre-filled (Fig. 30), making the data entry process faster and more reliable.
Fig. 30 Data entry interface after defining default expressions¶
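For readers curious about what sits behind the perimeter_m and area_m2 defaults, here is a minimal planar sketch: the perimeter is the sum of the segment lengths of the ring, and the area follows from the shoelace formula. This is an approximation of the idea only. In QGIS, $area and $perimeter respect the project's ellipsoid and unit settings, so their results can differ from a purely planar calculation. The footprint coordinates are invented.

```python
import math


def ring_perimeter(ring):
    """Sum of segment lengths along a closed ring [(x, y), ...] (first point == last)."""
    return sum(math.dist(p, q) for p, q in zip(ring, ring[1:]))


def ring_area(ring):
    """Planar area of a closed ring using the shoelace formula (orientation ignored)."""
    twice_area = sum(
        x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:])
    )
    return abs(twice_area) / 2.0


# A made-up 10 m x 20 m footprint in a projected CRS with metre units.
footprint = [(0, 0), (10, 0), (10, 20), (0, 20), (0, 0)]

print(ring_perimeter(footprint))
print(ring_area(footprint))
```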
To make the interface as simple as possible (Fig. 31), you can also change the widget type of the fields that are currently not editable to Hidden.
Those fields are allowed_use, official, street, perimeter_m and area_m2.
Fig. 31 Data entry interface after hiding autofill fields¶
Workflows¶
Another type of data entry routine relates to importing data from external sources. These sources often take the form of a topographic survey where each surveyed point is stored in a table or CSV file. These points might represent a geographic phenomenon that can be represented by a point, in which case each surveyed point is integrated into the GIS as a point geometry.
This integration can get tricky when these points are actually the vertices of a more complex geometry, for example a polygon representing a recently surveyed land parcel, as shown in Table 6.
| vertex_index | vertex_part | distance | angle | xcoord | ycoord |
|---|---|---|---|---|---|
| 0 | 0 | 0 | 145.4303973 | 542746.806 | 5448568.026 |
| 1 | 0 | 165.5815304 | 191.0113576 | 542715.179 | 5448405.493 |
| 2 | 0 | 493.3223253 | 226.1405706 | 542652.58 | 5448083.786 |
| 3 | 0 | 548.0989514 | 261.011077 | 542598.438 | 5448075.472 |
| 4 | 0 | 602.7958644 | 266.1858122 | 542544.452 | 5448066.682 |
| 5 | 0 | 709.414446 | 272.9521545 | 542437.876 | 5448069.695 |
| 6 | 0 | 755.1472779 | 322.7666605 | 542392.271 | 5448073.112 |
| 7 | 0 | 1303.576168 | 55.54885149 | 542499.249 | 5448611.006 |
| 8 | 0 | 1554.836489 | 145.4303973 | 542746.806 | 5448568.026 |
The workflow required to transform such a table into a geometry depends heavily on the overall data model adopted for survey works. A relatively simple way to do it is a succession of steps where the output of each step is the input to the next operation, until the final output is obtained.
In the example we will explore, this workflow consists of:
Import the CSV as point data >
Generate a line connecting these points >
Close the line to obtain a polygon >
Fix geometry >
FINAL OUTPUT
Running this succession of steps by hand is tedious and time-consuming, especially if it is a recurrent task. A better way to do it is to build a Model (or workflow) in QGIS that chains these steps into one single operation (Fig. 32), which can even be executed as a batch process if needed.
Fig. 32 Model to import survey data in CSV format¶
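To make the chain of steps more tangible outside QGIS, the sketch below walks through the first three of them in plain Python: read the survey rows, order them by vertex_index, close the ring and serialise it as a WKT polygon. It assumes a comma-delimited file with a header row matching the columns of Table 6, which the exercise does not state explicitly, and it does not reproduce the Fix geometry step. The sample uses only the first four vertices of Table 6 so that closing the ring has a visible effect; the full survey in Table 6 already repeats its first vertex at the end.

```python
import csv
import io


def read_vertices(csv_text):
    """Parse survey rows and return (x, y) tuples ordered by vertex_index."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda r: int(r["vertex_index"]))
    return [(float(r["xcoord"]), float(r["ycoord"])) for r in rows]


def close_ring(points):
    """Append the first point at the end unless the ring is already closed."""
    if len(points) < 3:
        raise ValueError("a polygon needs at least three vertices")
    return points if points[0] == points[-1] else points + [points[0]]


def to_wkt_polygon(ring):
    """Serialise a closed ring as a WKT POLYGON string."""
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON(({coords}))"


# First four vertices of Table 6, assuming a comma-delimited layout.
SAMPLE = """vertex_index,vertex_part,distance,angle,xcoord,ycoord
0,0,0,145.4303973,542746.806,5448568.026
1,0,165.5815304,191.0113576,542715.179,5448405.493
2,0,493.3223253,226.1405706,542652.58,5448083.786
3,0,548.0989514,261.011077,542598.438,5448075.472
"""

ring = close_ring(read_vertices(SAMPLE))
wkt = to_wkt_polygon(ring)
print(wkt)
```

Each function consumes the output of the previous one, which is the same idea the QGIS model expresses graphically.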
Along with the data for this exercise you have a folder named surveys. Inside you will see 30 CSV files similar to the one shown in Table 6, each representing a different topographic survey (i.e. a different parcel). We will use those files to demonstrate a possible approach to building a workflow that imports external data.
Task Import the model import_surveys.model3 into your collection of processing tools (Fig. 33). You will find this file inside the models folder.
Fig. 33 Adding a model to the Processing toolbox¶
Task From the Processing Toolbox, filter by import survey. Right-click on the model and choose Execute as Batch Process (Fig. 34).
Fig. 34 Starting a Batch Process¶
Task Provide the necessary parameters to execute the batch operation. Check the video below to see how it is done.
The end product is a collection of 30 layers with point geometries representing the vertices of the polygons and 30 layers with polygon geometries representing the parcels (Fig. 35).
Fig. 35 Result of the batch import¶
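Conceptually, a batch run is a table with one row per input file, where each row carries the input and the outputs to produce. The sketch below builds such a table in plain Python for a folder of CSV files. The function name, the output file naming scheme and the .gpkg extension are all hypothetical; the actual parameters are whatever the import_surveys model exposes.

```python
from pathlib import Path


def plan_batch(survey_dir, output_dir):
    """Return one parameter row per CSV: the input file plus the two outputs to create."""
    rows = []
    for csv_path in sorted(Path(survey_dir).glob("*.csv")):
        rows.append({
            "input": csv_path,
            # Hypothetical naming scheme for the two layers produced per survey.
            "points": Path(output_dir) / f"{csv_path.stem}_points.gpkg",
            "polygon": Path(output_dir) / f"{csv_path.stem}_polygon.gpkg",
        })
    return rows


for row in plan_batch("surveys", "output"):
    print(row["input"].name, "->", row["points"].name, row["polygon"].name)
```

With the 30 survey files in place, this yields 30 rows and therefore 60 output layers, which matches the result described above.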