Data entry routines

Resources

This exercise uses the same dataset as the Spatial relationships exercise. If you need to download the data again, just click on this link. The dataset contains the following:

  • dorset_cadaster.qgz, a QGIS project that preloads a geopackage containing the following layers:

    • roads (road network)

    • water_plan (area of a water management plan where special provisions may apply)

    • power_line_project (proposed route for a high-voltage aerial cable)

    • parcels (the cadaster)

    • land_use (land uses as of 2015)

    • parish (administrative boundaries of the parishes within the Dorset municipality, Tasmania)

    • party (fictional list of parties)

    • building (empty layer of type polygon)

    • topographic_map (a sample topographic map generated from Open Street Map)

    • building_type (fictional list of types of buildings)

In addition to the project and its datasets, there are also folders with auxiliary files that the exercise may refer to.

Principle

Data entry routines are techniques used to optimize spatial data acquisition, especially in the context of digitizing and data imports. The idea is to automate the production of new geographic information as much as possible by minimizing the amount of data that a human has to declare manually.

Attention

The more people touch the data during the editing phase, the higher the chance errors and data corruption will occur. In general, it is good policy to restrict data editing as much as possible.

From a technical point of view, it is important to understand that these routines always have to abide by the constraints imposed by the data provider. Therefore, data entry routines exist on two levels: the provider level and the interface level (Fig. 23).

Fig. 23 Data routine levels
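
To illustrate what a provider-level constraint means in practice, here is a minimal sketch using Python's built-in sqlite3 module (a geopackage is an SQLite database). The table below is a hypothetical, simplified stand-in for the building table, and the CHECK rule on storeys is an assumption made for illustration; the real schema is not shown in this exercise. Whatever the interface allows, the provider has the final word.

```python
import sqlite3

# Hypothetical, simplified stand-in for the building table.
# The constraint values are assumptions chosen for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE building (
        fid INTEGER PRIMARY KEY,
        storeys INTEGER NOT NULL CHECK (storeys BETWEEN 1 AND 5)
    )
    """
)


def try_insert(storeys):
    """Return True if the provider accepts the row, False if it rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO building (storeys) VALUES (?)", (storeys,))
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert(2))     # accepted by the provider
print(try_insert(9))     # rejected by the CHECK constraint
print(try_insert(None))  # rejected by NOT NULL
```

An interface-level routine (a widget or a default value) can make it harder to type an invalid value, but only the provider can guarantee that an invalid value is never stored.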

For this exercise, we are going to use the case of a dataset that has to be manually digitized to store the footprints of buildings. The simplified data model (Fig. 24) shows how this dataset is structured and how it relates to other datasets.

Fig. 24 Simplified data model for building dataset

This data model is implemented using a geopackage (.gpkg), but ideally it would be implemented in a database that supports concurrent access, such as PostgreSQL. The data model gives an overview of the constraints that have to be enforced. We are now going to optimize data entry at the interface level.

Widgets

Before starting to tweak our data entry interface, let's take a look at how it looks and feels by default.

  1. Task With the topographic_map as background map, try to digitize at least one of the building footprints into the building layer.

You will see (Fig. 25) that you have to enter the information manually into every field, and in some cases you simply do not know what value to enter.

Fig. 25 Default data entry interface

Clearly, this is not convenient at all. To make the data entry more efficient, we will start by defining which type of widget should be used for each attribute of our table.

Attention

In the context of the Vector layer properties dialog in QGIS, a widget defines the type of interface the user will see when editing a layer. You can learn more about widgets in the official documentation.

  1. Task From the Layers panel, right-click on the layer building and then click on Properties. In the Layer properties dialog, go to the Attributes form tab (Fig. 26).

Fig. 26 The Attributes form tab

  1. Task Define your first widget on the field fid. The widget type will be Hidden (Fig. 27).

Fig. 27 Widget definition example

  1. Task Continue defining the widget types field by field according to the parameters indicated in Table 4. When you are done, click the Apply button.

Table 4 Building widgets

| field | widget | editable | remarks |
|---|---|---|---|
| type | Value Relation | yes | Layer='building_type'; Key column='fid'; Value column='type' |
| allowed_use | Text Edit | no | Layer='roads'; Key column='fid'; Value column='pri_name' |
| historic_building | Checkbox | yes | |
| official | Text Edit | no | |
| registered | Date | yes | |
| storeys | Range | yes | Minimum='1'; Maximum='5'; Step='1' (use the slider widget) |
| street | Text Edit | no | |
| perimeter_m | Text Edit | no | |
| area_m2 | Text Edit | no | |
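
As a rough mental model of two of the widgets in Table 4, the sketch below mimics in plain Python what a Value Relation widget and a Range widget offer the user. It is not QGIS code. The contents of the lookup table are hypothetical, since the actual rows of building_type are not listed in this exercise.

```python
# Hypothetical contents of the building_type layer (fid -> type).
BUILDING_TYPES = {1: "residential", 2: "commercial", 3: "industrial"}


def value_relation_choices(lookup):
    """Return (label shown to the user, key stored in the field) pairs."""
    return [(label, key) for key, label in sorted(lookup.items())]


def range_values(minimum, maximum, step):
    """List the values a Range widget with these settings lets the user pick."""
    return list(range(minimum, maximum + 1, step))


print(value_relation_choices(BUILDING_TYPES))
print(range_values(1, 5, 1))  # settings of the storeys field in Table 4
```

In both cases the user can only pick from a closed set of values, which removes a whole class of typing errors.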

If you now try to digitize one of the buildings, the interface is different (Fig. 28).

Fig. 28 Data entry interface after widget definition

However, this is still not good enough. Some fields are grayed out and cannot be edited. These fields are meant to be calculated automatically, but for that we need to look into QGIS expressions.

Expressions

The QGIS expression engine offers powerful possibilities when it comes to styling, analysis and, of course, editing data.

Attention

Expressions are a fundamental part of workflows and productivity in QGIS. A full description of all the expressions is available in the official documentation.

In our case we are interested in defining what the default value for a given field is. This default value can be the output of an expression.

  1. Task Define your first default expression on the field allowed_use, which takes its value from the land_use layer. Enter this expression:

    aggregate(
    layer:= 'land_use',
    aggregate:='concatenate',
    expression:= LU_DESCRIP,
    concatenator:='',
    filter:=intersects($geometry, geometry(@parent)))
    

as the Default value and make sure the option Apply default value on update is checked (Fig. 29).

Fig. 29 Expression example
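
The logic of the aggregate expression above can be sketched in plain Python: go through every feature of the other layer, keep those whose geometry intersects the feature being digitized, and join their descriptions. This is only an illustration of the idea, not how QGIS implements it. Axis-aligned rectangles stand in for real geometries, and the land use descriptions are hypothetical.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle used as a stand-in for a real geometry."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersects(a: Box, b: Box) -> bool:
    """True when the two rectangles share at least one point."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


def aggregate_concatenate(features, parent_geom, concatenator=""):
    """Join the description of every feature that intersects parent_geom."""
    return concatenator.join(
        descr for geom, descr in features if intersects(geom, parent_geom)
    )


# Hypothetical land_use features: (geometry, LU_DESCRIP value).
land_use = [
    (Box(0, 0, 10, 10), "Residential"),
    (Box(12, 0, 20, 10), "Agriculture"),
    (Box(50, 50, 60, 60), "Forestry"),
]

print(aggregate_concatenate(land_use, Box(2, 2, 4, 4)))   # inside one polygon
print(aggregate_concatenate(land_use, Box(9, 2, 13, 4)))  # straddles two polygons
```

Note that with an empty concatenator, a building that overlaps two land use polygons gets both descriptions joined without any separator. Using a separator such as ', ' makes that case easier to read.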

  1. Task Continue defining the default expressions according to the definitions provided in Table 5. When you are done, click the Apply button.

Table 5 Building expressions

| field | expression |
|---|---|
| official | @user_full_name |
| registered | to_date( now() ) |
| storeys | 2 |
| street | aggregate( layer:= 'roads', aggregate:='concatenate', expression:= pri_name, concatenator:='', filter:=intersects($geometry, buffer(geometry(@parent),10))) |
| perimeter_m | $perimeter |
| area_m2 | $area |

If everything went well, when you now proceed to digitize your buildings you should observe that most of the fields are pre-filled (Fig. 30), making the data entry process faster and more reliable.

Fig. 30 Data entry interface after defining default expressions
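
The principle behind these default values can be sketched outside QGIS: everything that can be derived from the geometry or from the editing context is computed, so the user only declares what cannot be derived. The sketch below is a plain Python illustration under simplifying assumptions. It uses planar formulas on a closed ring of coordinates, whereas in QGIS the results of $area and $perimeter depend on the project measurement settings. The function names and the sample footprint are hypothetical.

```python
import math
from datetime import date


def ring_perimeter(ring):
    """Planar length of a closed ring (first vertex repeated at the end)."""
    return sum(math.dist(ring[i], ring[i + 1]) for i in range(len(ring) - 1))


def ring_area(ring):
    """Planar area of a closed ring using the shoelace formula."""
    twice_area = sum(
        x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:])
    )
    return abs(twice_area) / 2


def default_attributes(ring, user_full_name, today=None):
    """Pre-fill the attributes that do not need human input."""
    return {
        "official": user_full_name,
        "registered": today or date.today(),
        "storeys": 2,  # same constant default as in Table 5
        "perimeter_m": ring_perimeter(ring),
        "area_m2": ring_area(ring),
    }


# Hypothetical 10 m x 5 m rectangular footprint.
footprint = [(0, 0), (10, 0), (10, 5), (0, 5), (0, 0)]
print(default_attributes(footprint, "Jane Doe", date(2024, 1, 15)))
```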

To make the interface as simple as possible (Fig. 31), you can also change the widget type of the fields that are currently not editable to Hidden. Those fields are allowed_use, official, registered, street, perimeter_m and area_m2.

Fig. 31 Data entry interface after hiding autofill fields

Workflows

Another type of data entry routine is related to importing data from external sources. These sources often take the form of a topographic survey where each surveyed point is stored in a table or CSV file. These points might represent a geographic phenomenon that can be represented by a point, in which case each surveyed point will be integrated into the GIS as a geometry of type point.

This integration can get tricky when these points are actually the vertices of a more complex geometry, for example a polygon representing a recently surveyed land parcel, as shown in Table 6.

Table 6 Parcel vertices

| vertex_index | vertex_part | distance | angle | xcoord | ycoord |
|---|---|---|---|---|---|
| 0 | 0 | 0 | 145.4303973 | 542746.806 | 5448568.026 |
| 1 | 0 | 165.5815304 | 191.0113576 | 542715.179 | 5448405.493 |
| 2 | 0 | 493.3223253 | 226.1405706 | 542652.58 | 5448083.786 |
| 3 | 0 | 548.0989514 | 261.011077 | 542598.438 | 5448075.472 |
| 4 | 0 | 602.7958644 | 266.1858122 | 542544.452 | 5448066.682 |
| 5 | 0 | 709.414446 | 272.9521545 | 542437.876 | 5448069.695 |
| 6 | 0 | 755.1472779 | 322.7666605 | 542392.271 | 5448073.112 |
| 7 | 0 | 1303.576168 | 55.54885149 | 542499.249 | 5448611.006 |
| 8 | 0 | 1554.836489 | 145.4303973 | 542746.806 | 5448568.026 |

The workflow required to transform such a table into a geometry depends heavily on the overall data model adopted for survey work. A relatively simple way to do it is a succession of steps in which the output of each step is the input to the next, until the final output is obtained.

In the example we will explore, this workflow consists of: Import the CSV as point data > Generate a line connecting these points > Close the line to obtain a polygon > Fix geometry > FINAL OUTPUT
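
To make the chain of steps concrete, here is a minimal sketch of the first three steps in plain Python, using the rows of Table 6. It is not the QGIS model itself. It assumes the survey file is comma-separated with a header row matching the column names of Table 6, the function names are hypothetical, and the Fix geometry step is left out.

```python
import csv
import io

# Rows of Table 6, assumed to be stored as comma-separated text with a header.
SURVEY_CSV = """vertex_index,vertex_part,distance,angle,xcoord,ycoord
0,0,0,145.4303973,542746.806,5448568.026
1,0,165.5815304,191.0113576,542715.179,5448405.493
2,0,493.3223253,226.1405706,542652.58,5448083.786
3,0,548.0989514,261.011077,542598.438,5448075.472
4,0,602.7958644,266.1858122,542544.452,5448066.682
5,0,709.414446,272.9521545,542437.876,5448069.695
6,0,755.1472779,322.7666605,542392.271,5448073.112
7,0,1303.576168,55.54885149,542499.249,5448611.006
8,0,1554.836489,145.4303973,542746.806,5448568.026
"""


def read_points(text):
    """Step 1: import the CSV rows as (x, y) points, ordered by vertex_index."""
    rows = sorted(csv.DictReader(io.StringIO(text)),
                  key=lambda row: int(row["vertex_index"]))
    return [(float(row["xcoord"]), float(row["ycoord"])) for row in rows]


def close_ring(points):
    """Steps 2 and 3: connect the points in order and close the line."""
    return points if points[0] == points[-1] else points + [points[0]]


def polygon_wkt(ring):
    """Express the closed ring as a WKT polygon."""
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON(({coords}))"


ring = close_ring(read_points(SURVEY_CSV))
print(polygon_wkt(ring))
```

In Table 6 the last vertex repeats the first one, so the ring is already closed. The close_ring step only matters for surveys that do not repeat the starting vertex.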

This succession of steps is tedious and time-consuming, especially if it is a recurrent task. A better way to do it is to build a Model (or workflow) in QGIS that chains these steps into a single operation (Fig. 32), which can even be executed as a batch process if needed.

Fig. 32 Model to import survey data in CSV format

Along with the data for this exercise you have a folder named surveys. Inside you will see 30 CSV files similar to the one shown in Table 6, each representing a different topographic survey (i.e. a different parcel). We will use those files to demonstrate a possible approach to building a workflow that imports external data.

  1. Task Import the model import_surveys.model3 into your collection of processing tools (Fig. 33). You will find this file inside the models folder.

Fig. 33 Adding a model to the Processing toolbox

  1. Task In the Processing Toolbox, filter by import survey. Right-click on the model and choose Execute as Batch Process (Fig. 34).

Fig. 34 Starting a Batch Process

  1. Task Provide the necessary parameters to execute the batch operation. Check the video below to see how it is done.

The end product is a collection of 30 layers with point geometries representing the vertices of the polygons and 30 layers with polygon geometries representing the parcels (Fig. 35).

Fig. 35 Result of the batch import