RoBivaL data corpus
Creators
- Backe, Christian (Data manager)1
- Wirkus, Malte (Project leader)1
- Hinck, Stefan (Project member)2
- Babel, Jonathan (Project member)1
- Riedel, Vadim (Project member)2
- Reichert, Nele (Project member)1
- Kolesnikov, Andrej (Project member)1
- Stark, Tobias (Project member)1
- Hilljegerdes, Jens (Project member)1
- Kücüker, Hilmi Dogu (Project member)1
- Barcic, Emir (Project member)2
- Klink, Eduard (Project member)2
- Ruckelshausen, Arno (Supervisor)2
- Kirchner, Frank (Supervisor)1
Description
1. Introduction
This data corpus was produced during the RoBivaL project, by robotics and agriculture researchers from DFKI (German Research Center for Artificial Intelligence, Robotics Innovation Center) and HSO (Hochschule Osnabrück, University of Applied Sciences, Agro-Technicum), between August 2021 and October 2023.
The RoBivaL project compared different robot locomotion concepts from both space research and agricultural applications on the basis of experiments conducted under agricultural conditions. Four robot systems were used, two of which (ARTEMIS & SherpaTT) have their origin in futuristic space applications, while the other two (Naio Oz & BoniRob) were developed specifically for agriculture.
The robots were subjected to six experiments, addressing different challenges and requirements for agricultural applications. Since real-world soil conditions usually change with the seasons and can be expected to have a crucial impact on robot performance, the experimental soil conditions were controlled and varied along two dimensions, moisture (dry, moist, wet) and density (tilled, compacted), resulting in six soil conditions. Depending on the specific objectives, each experiment was conducted either on a subset or on all available soil conditions. The experiments were:
- Straight travel: Determine variations of travel speed and directional stability under different soil moisture and density levels, and determine the soil deformation and compaction caused by a traverse under given initial soil conditions.
- Turn around: Examine the effect of steering on soil deformation with moist and tilled soil.
- Repeated rollover: Investigate the effects of repeated axle rollovers on soil compaction, determined by measuring the soil penetration resistance.
- Tensile force: Compare the maximum exerted tractive force under different soil moisture and density levels, and gain insight into how varying soil conditions affect the performance of each system during traction.
- Sill crossing: Determine the ability to overcome different types of obstacles, and compare relevant system characteristics, e.g. ground clearance or center of mass.
- Obstacle avoidance: Demonstrate SherpaTT's ability to step over an obstacle without contact, thanks to its actively controlled suspension.
Field conditions and robot behavior were monitored with various sensors and measuring devices, partly on the robots and partly in the field, in order to document the experiment execution, and to determine the robot performance. The data capturing devices, their roles and deployments are summarized in Table 1.
Table 1: Overview of data capturing devices
| | Device on System | Device on System and in Field | Device in Field |
|---|---|---|---|
| System Monitoring | | | |
| System and Field Monitoring | | | |
| Field Monitoring | | | |
2. File tree
The data corpus is stored in a file tree, which is divided into three main sections:
- Logbook
- Data
- Specification
Each section is described in detail in the following chapters.
Here is a complete overview of the file tree:
logbook/
    csv/
        experiment.csv
        parameter.csv
        possible_robot.csv
        possible_value.csv
        robot.csv
        run__${experiment}.csv
    database/
        logbook.sqlite
    schema/
        logbook_entities.png
        logbook_schema.sqlite
    src/
        create_sqlite_database.sh
data/
    ${experiment}/
        ${robot}/
            ${run}/
                ${datafile}
specification/
    experiment/
        experiment.md
        ${experiment}/
            img/
                ${experiment}.png
            ${experiment}-description.md
            ${experiment}.json
    parameter/
        parameter.json
    datafile/
        ${datafile_stem}.json
    robot/
        ${robot}/
            system_properties.json
        robots.png
    sensor/
        ${sensor}.json
    software/
        ${software}.json
The variables ${experiment}, ${robot}, ${run}, ${datafile}, ${datafile_stem}, ${sensor}, and ${software} are explained in the context of each respective section.
2.1. Logbook
The Logbook is a small relational database. Primarily, it contains one table for every experiment, where each row represents an experiment run. These tables capture all facts and measurements about a run that can be expressed as scalar values, including
- start and end times,
- independent variables (e.g. run track length, soil moisture and density level, name of the tested robot, commanded speed, etc.),
- dependent variables (e.g. wheel track depth and width, heading and offset of the robot after a run, etc.),
- comments about unforeseen events.
Additional measurements that are better managed in separate data files are stored in the Data section of the corpus, which is discussed in Chapter 2.2.
Besides the run tables, the Logbook contains tables to specify the experiments and the available robots, as well as the parameters that are present in the run tables. These additional tables have some overlap with the Specification section of the data corpus, which is discussed in Chapter 2.3. In the Logbook, the specifying tables were used during the run data acquisition in the field, in order to facilitate and live-validate the data entry.
The entire Logbook is stored in the SQLite file logbook/database/logbook.sqlite. Users who prefer tools other than SQLite can find the constituent tables as CSV files in the directory logbook/csv/. The Logbook schema and entity-relationship diagram are in the logbook/schema/ directory. The database can be recreated from the schema and CSV files with the Bash script logbook/src/create_sqlite_database.sh.
The full Logbook file tree is as follows:
logbook/
    csv/
        experiment.csv
        parameter.csv
        possible_robot.csv
        possible_value.csv
        robot.csv
        run__${experiment}.csv
    database/
        logbook.sqlite
    schema/
        logbook_entities.png
        logbook_schema.sqlite
    src/
        create_sqlite_database.sh
The ${experiment} variable refers to the keys at the top level of the data tree, which is discussed in Chapter 2.2.1.
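As a minimal sketch of how the Logbook might be queried from Python, the following uses only the standard library sqlite3 module. It assumes that each run table in the database carries the same name as its CSV file (e.g. run__straight_travel); this is inferred from the file names above and is not confirmed by the schema. The helper names list_tables and load_runs are hypothetical.

```python
import sqlite3


def list_tables(conn: sqlite3.Connection) -> list[str]:
    """Return the names of all tables in the database, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def load_runs(conn: sqlite3.Connection, experiment: str) -> list[dict]:
    """Return all rows of one experiment's run table as dictionaries."""
    # Assumed table name, mirroring the CSV file run__${experiment}.csv.
    table = f"run__{experiment}"
    if table not in list_tables(conn):
        raise ValueError(f"no such table: {table}")
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [column[0] for column in cursor.description]
    return [dict(zip(columns, row)) for row in cursor]
```

With the corpus unpacked in the working directory, a connection could be opened with sqlite3.connect("logbook/database/logbook.sqlite") and passed to either helper.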
2.2. Data
The Data section of the corpus contains all measurements that would be impractical to store directly in a run table of the Logbook database and are better managed as separate data files. In most cases, these are time series issued by a particular sensor and/or by software running on one of the robots.
In addition to the data strictly necessary for evaluation purposes in RoBivaL, there are some extra data streams that were routinely captured on the ARTEMIS robot which were not available for the other systems, as well as data from experiment runs that were considered invalid or performed for testing.
For data recording on the robots, two different approaches were used, due to differences in sensor availability. In the case of SherpaTT, Naio Oz, and BoniRob, a custom-built, stand-alone embedded PC in a battery-equipped box for autonomous operation (the Sensor Box) was attached to the given robot. The Sensor Box includes IMU and GNSS sensors, which are of primary relevance for the experiments. In the case of ARTEMIS, the built-in sensors and data logging functionality could be used, which rely on sensors similar to those of the Sensor Box and employ the same software infrastructure for data recording, based on the Rock software framework.
Table 2 gives an overview of all available data files with a short description and possible sources, including sources outside of the robots (i.e. Force logger, Penetrometer, Tilt scanner, and Video camera). A thorough specification of the data files and their respective hard- and software resources is in the Specification section of the corpus, which is discussed in Chapter 2.3.
Table 2: Overview of file types in the Data section
| Data file name | Description | Possible sources |
|---|---|---|
| | Time series of status of the joint of the mobile base | ARTEMIS |
| | Time series of momentary tractive force exerted by a robot, measured at regular intervals | Force logger |
| | Time series of Cartesian positions measured by GPS in North-West-Up coordinate system | ARTEMIS, Sensor Box |
| | Time series of Cartesian positions measured by GPS in robot coordinate system | ARTEMIS, Sensor Box |
| | Time series of raw values from the GPS sensor | ARTEMIS, Sensor Box |
| | Time series of joystick commands interpreted as motion commands | ARTEMIS |
| | Time series of commands for joints of mobile base | ARTEMIS |
| | Time series of aggregated pose of the odometry component | ARTEMIS |
| | Penetrometer measurements of the soil penetration resistance at multiple depth levels, before or after the experiment run | Penetrometer |
| | Tilt scanner measurements of the track surface, before or after the experiment run, on the front, rear, left, or right side of the robot, in raw pointcloud (.asc) or rasterized and consolidated (.png, .txt) form | Tilt scanner |
| | Video recordings of the robot performing the experiment run. Postprocessed to remove faces for privacy protection. | Video camera |
| | Raw readings of inertial unit | ARTEMIS, Sensor Box |
| | Integrated Cartesian pose measured by inertial unit | ARTEMIS, Sensor Box |
The attribution of each data file to a particular experiment run is managed via a hierarchical directory structure as follows:
data/
    ${experiment}/
        ${robot}/
            ${run}/
                ${datafile}
For example:
data/
    straight_travel/
        artemis/
            dry-tilled-1-035-2023-06-19-14-05/
                bogie_dispatcher.motion_status.csv
                gnss.nwu_position_samples.csv
                ...
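The four-level hierarchy lends itself to simple scripted traversal. The following is a sketch only, assuming the corpus has been unpacked so that the data/ directory is reachable on the local file system; the function name iter_runs is hypothetical.

```python
from pathlib import Path
from typing import Iterator


def iter_runs(data_root: Path) -> Iterator[tuple[str, str, str, list[str]]]:
    """Yield (experiment, robot, run, datafile names) for every run directory."""
    for experiment_dir in sorted(p for p in data_root.iterdir() if p.is_dir()):
        for robot_dir in sorted(p for p in experiment_dir.iterdir() if p.is_dir()):
            for run_dir in sorted(p for p in robot_dir.iterdir() if p.is_dir()):
                # Level 4: the data files recorded for this run.
                files = sorted(f.name for f in run_dir.iterdir() if f.is_file())
                yield experiment_dir.name, robot_dir.name, run_dir.name, files
```

Calling list(iter_runs(Path("data"))) would then give one tuple per experiment run, with the directory names serving as keys into the Logbook and Specification sections.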
All directory and file names are normalized, to facilitate scripting, and to control the references between the Logbook, Data, and Specification sections of the data corpus. As a general rule, the names are composed of a restricted character set: only lowercase ASCII letters, numbers, and punctuation from the set [-_.] are used. There is no whitespace. Additional normalization rules specific to each hierarchy level are discussed in the following subsections.
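The general character-set rule can be checked with a short regular expression. This is a sketch under the assumption that "numbers" means the ASCII digits 0 to 9; the function name is_normalized is hypothetical. Note that the upper-case -INVALID suffix described in Chapter 2.2.3 lies outside this character set, so names of invalid runs are rejected by this check.

```python
import re

# Lowercase ASCII letters, ASCII digits, and the punctuation characters - _ .
NORMALIZED_NAME = re.compile(r"[a-z0-9._-]+")


def is_normalized(name: str) -> bool:
    """Return True if the name uses only the restricted character set."""
    return NORMALIZED_NAME.fullmatch(name) is not None
```

For example, is_normalized("gnss.nwu_position_samples.csv") holds, whereas a name containing whitespace or upper-case letters does not pass.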
2.2.1. Level 1: ${experiment}/
At Level 1 are the six experiments, named
obstacle_avoidance
repeated_rollover
sill_crossing
straight_travel
tensile_force
turn_around
These names serve as keywords both in the Logbook and in the Specification section.
2.2.2. Level 2: ${robot}/
At Level 2 are the four robots, named
artemis
bonirob
naio_oz
sherpatt
These names serve as keywords both in the Logbook and in the Specification section.
2.2.3. Level 3: ${run}/
The names of the ${run} directories at Level 3 are composed from characteristic variables to provide a human-readable "fingerprint" for each experiment run, which is also machine-processable. For example, dry-tilled-1-035-2023-06-19-14-05 is the name of a run of the straight_travel experiment. The general naming pattern of a ${run} directory for all experiments is
${independent_variables}-${repetition_count}-${run_id}-${start_datetime}
if the run is valid. For invalid runs, the suffix -INVALID is appended to this pattern.
The characteristic ${independent_variables} per experiment are:
- Obstacle avoidance: ${type_of_avoidance}
- Repeated rollover: ${soil_moisture_class}-${soil_density}-${number_of_axle_rollovers}
- Sill crossing: ${soil_moisture_class}-${soil_density}-${has_been_traveled}-${sill_type}-${sill_height}
- Straight travel: ${soil_moisture_class}-${soil_density}
- Tensile force: ${soil_moisture_class}-${soil_density}
- Turn around: ${soil_moisture_class}-${soil_density}
The ${repetition_count} reflects that each robot undergoes each experiment multiple times for each combination of its ${independent_variables}. Repetitions are done because the agricultural field conditions are noisy, e.g. due to uneven run tracks, or the limited capability to provide homogeneous soil moisture and density over all tracks at all times. By repeating an experiment with the same combination of its ${independent_variables}, noise can be mitigated with statistical measures during data postprocessing.
The ${run_id} is a numerical ID referring to the run__${experiment} table in the Logbook database, which has been discussed in Chapter 2.1.
Finally, the ${start_datetime} is a human-readable timestamp of the start of the experiment run in local time. Since the experiments are run consecutively, it is a natural identifier in the run__${experiment} table, and therefore provides some redundancy and robustness for the linkage between run data and metadata. The format ${year}-${month}-${day}-${hour}-${minute} is chosen to be mostly compliant with ISO 8601, while respecting the restricted character set mentioned above, which is imposed by technical considerations.
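Because the trailing fields of a ${run} name have a fixed count, the name can be decomposed from the right. The sketch below illustrates this; the class RunName and the function parse_run_name are hypothetical names. Splitting the leading part into individual independent variables additionally assumes that no variable value itself contains a hyphen, which the description above does not guarantee.

```python
from dataclasses import dataclass
from datetime import datetime

INVALID_SUFFIX = "-INVALID"


@dataclass
class RunName:
    independent_variables: list[str]
    repetition_count: int
    run_id: int
    start: datetime  # local time, minute resolution
    valid: bool


def parse_run_name(name: str) -> RunName:
    """Split a ${run} directory name into its documented components."""
    valid = not name.endswith(INVALID_SUFFIX)
    core = name if valid else name[: -len(INVALID_SUFFIX)]
    parts = core.split("-")
    # At least one independent variable, two counters, five datetime fields.
    if len(parts) < 8:
        raise ValueError(f"unexpected run name: {name}")
    *variables, repetition, run_id = parts[:-5]
    start = datetime.strptime("-".join(parts[-5:]), "%Y-%m-%d-%H-%M")
    return RunName(variables, int(repetition), int(run_id), start, valid)
```

Applied to dry-tilled-1-035-2023-06-19-14-05, this yields the independent variables dry and tilled, repetition count 1, run ID 35, and a start time of 2023-06-19 14:05.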
2.2.4. Level 4: ${datafile}
At the lowest level is a collection of all data files for a given experiment run. Names of time series that were logged with the Sensor Box or on the ARTEMIS robot (see Table 2) are a result of the data logging technique and correspond to the names of the software components that produce the data streams. Such components can either be device drivers producing raw device data, or data processing components of any kind, such as control or sensor processing algorithms.
Table 3 lists the maximum set of data files per experiment. It is possible that during individual experiment runs only a subset of the data files is available. This can be due to the corresponding sensor not being present on the system, to a failure of the sensor or capturing device, or to human error. Data files between braces are only available on ARTEMIS; all other data files are available independently of the robot.
Table 3: Maximum set of data files per experiment
| Experiment | Data files |
|---|---|
| Obstacle avoidance | |
| Repeated rollover | |
| Sill crossing | |
| Straight travel | |
| Tensile force | |
| Turn around | |
The contents of the data files are described in the Specification section of the corpus, which is discussed in Chapter 2.3.
2.3. Specification
The Specification section of the corpus contains structural and semantic information to give data (re)users an in-depth understanding of
- each experiment, including purpose, setup, procedure, parameters, and available data files,
- used assets, including robots, sensors, and software,
- contents and format of the resulting data files.
As far as possible, the information is provided in structured form as JSON files, so that it is both machine-processable and human-readable.
The section is subdivided by entities:
experiment
parameter
datafile
robot
sensor
software
Here is a complete file tree of the Specification section:
specification/
    experiment/
        experiment.md
        ${experiment}/
            img/
                ${experiment}.png
            ${experiment}-description.md
            ${experiment}.json
    parameter/
        parameter.json
    datafile/
        ${datafile_stem}.json
    robot/
        ${robot}/
            system_properties.json
        robots.png
    sensor/
        ${sensor}.json
    software/
        ${software}.json
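Since the Specification paths are fully determined by the entity keys, they can be built programmatically. The following sketch does this for the experiment entity; the function name experiment_spec_files and the dictionary keys are hypothetical, and the default root assumes the corpus is unpacked in the working directory.

```python
from pathlib import PurePosixPath


def experiment_spec_files(
    experiment: str, root: str = "specification"
) -> dict[str, PurePosixPath]:
    """Build the paths of the specification files of one experiment."""
    base = PurePosixPath(root) / "experiment" / experiment
    return {
        "json": base / f"{experiment}.json",
        "description": base / f"{experiment}-description.md",
        # Not every experiment ships a setup image (see Chapter 2.3.1).
        "image": base / "img" / f"{experiment}.png",
    }
```

The same pattern would extend to the other entities, e.g. robot/${robot}/system_properties.json or datafile/${datafile_stem}.json.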
2.3.1. Entity experiment
The experiment/ directory contains a detailed specification for each experiment. It is divided into multiple files with the following file tree layout:
experiment/
    experiment.md
    ${experiment}/
        ${experiment}.json
        ${experiment}-description.md
        img/
            ${experiment}.png
The file experiment.md contains some general remarks that apply to all experiments.
The ${experiment} variable refers to the same keys as Level 1 of the Data section, i.e.
obstacle_avoidance
repeated_rollover
sill_crossing
straight_travel
tensile_force
turn_around
The file ${experiment}.json contains machine-processable references to
- the robots that were involved in a particular experiment,
- the names of its available datafiles, and
- its characteristic parameters together with their experiment-specific attributes, i.e. whether a parameter is
    - an independent or a dependent variable,
    - required for every run or not.
Additional parameter attributes that are not experiment-specific are discussed in Chapter 2.3.2.
To facilitate human consumption, a detailed description of each experiment is stored in the separate Markdown-formatted file ${experiment}-description.md. It specifies
- the experiment's objectives,
- its setup, illustrated by the image img/${experiment}.png (except for the repeated_rollover experiment),
- the actions performed during an experiment run, and
- all available measurements, and where they can be found in the data corpus.
2.3.2. Entity parameter
The parameter/ directory contains a single file parameter/parameter.json, which specifies the aspects of each parameter that are not experiment-specific, i.e. its
- definition,
- data type,
- measuring unit (if applicable), and
- possible values (if applicable).
This is done outside the experiment specification experiment/${experiment}/${experiment}.json (discussed in Chapter 2.3.1) to reduce redundancy, because most parameters are employed in multiple experiments.
2.3.3. Entity datafile
For each available datafile (as per Table 2), except video.mp4.defaced.mp4, the datafile/ directory contains a machine-processable datafile format specification datafile/${datafile_stem}.json. The variable ${datafile_stem} depends on the type of datafile:
- For data sources whose output is stored in a single file per experiment run, the ${datafile_stem} is the ${datafile} name without its filename extension, e.g. bogie_dispatcher.motion_status. Incidentally, this applies to all .csv datafiles.
- For data sources whose output is stored in multiple files per experiment run, the ${datafile_stem} is the characteristic prefix of the ${datafile} name. In particular, for penetrometer* or tiltscan* datafiles, the ${datafile_stem} is penetrometer or tiltscan, respectively.
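The two rules above can be expressed as a small lookup function. This is a sketch, not part of the corpus tooling; the function name datafile_stem and the constant names are hypothetical, and the penetrometer and tiltscan file names in the usage note are made-up examples of the penetrometer* and tiltscan* patterns.

```python
from typing import Optional

# Data sources that write several files per run share a characteristic prefix.
MULTI_FILE_PREFIXES = ("penetrometer", "tiltscan")
# The video recording has no format specification file.
UNSPECIFIED = ("video.mp4.defaced.mp4",)


def datafile_stem(datafile: str) -> Optional[str]:
    """Map a ${datafile} name to its ${datafile_stem}, or None if unspecified."""
    if datafile in UNSPECIFIED:
        return None
    for prefix in MULTI_FILE_PREFIXES:
        if datafile.startswith(prefix):
            return prefix
    # Single-file sources: drop only the final filename extension.
    return datafile.rsplit(".", 1)[0]
```

For instance, bogie_dispatcher.motion_status.csv maps to bogie_dispatcher.motion_status, while a hypothetical tiltscan_after_left.asc would map to tiltscan.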
All datafile specification files contain
- a semantic definition of the datafile,
- references to sensors involved in its making (if applicable), and
- references to software involved in its making (if applicable).
Additionally, a specification file may contain information depending on the type of datafile:
- For tabular time series, there is a specification of each table column (i.e. column definition, data type, and measuring unit (if applicable)), and a mention of the primary time column. (The latter is necessary, because two different time columns may be available, capturing sample creation time and log time, respectively.) This holds for all .csv datafiles.
- The Penetrometer file specification contains a description of all its JSON fields.
- The Tilt scanner file specification covers three different file types (raw pointcloud, numeric raster, colorized raster) with individual formats, semantic definitions and software references.
2.3.4. Entity robot
The robot/ directory has the following layout:
robot/
    ${robot}/
        system_properties.json
    robots.png
The ${robot} variable refers to the same keys as Level 2 of the Data section, i.e.
artemis
bonirob
naio_oz
sherpatt
The file ${robot}/system_properties.json lists relevant system properties of the respective robot, namely
- size (length, width, height) of a bounding box
- weight
- description of the type of actuation
- description of the type of wheels
- contact surface
- area density
The robots.png image shows a complete view of all robots in the field, as well as a close-up of each of the wheel types.
2.3.5. Entity sensor
The sensor/ directory contains a file sensor/${sensor}.json for each of the sensors involved in the creation of the datafiles. It lists the values of relevant properties specific to each sensor, e.g. baud rate, update frequency, or connection.
The ${sensor} variable in the file name serves as a key which is referenced in the datafile/${datafile_stem}.json specifications discussed in Chapter 2.3.3, and in the software/${software}.json specifications discussed in Chapter 2.3.6.
2.3.6. Entity software
The software/ directory contains a file software/${software}.json for each of the software components involved in the creation of the datafiles. It lists the values of relevant properties, namely
- type
- format
- URLs of source code repositories (if available)
- developer
- list of sensors
- list of datafile names
The ${software} variable in the file name serves as a key which is referenced in the datafile/${datafile_stem}.json specifications discussed in Chapter 2.3.3, and in the sensor/${sensor}.json specifications discussed in Chapter 2.3.5.
3. Acknowledgements
The work on this dataset is based on the project RoBivaL, funded by the German Federal Ministry for Economic Affairs and Climate Action (BMWK) under grant number 50RP2150. The responsibility for the content of this publication lies with the authors.
The authors would like to thank the Federal Government and the Heads of Government of the Länder, as well as the Joint Science Conference (GWK), for their initiative within the framework of the NFDI4Ing consortium (German Research Foundation (DFG) - project number 442146713).
The dataset has also been supported by the ZLA (NiMWK, Volkswagenstiftung, ZDIN 11-76251-14-3/19) and the Experimentierfeld Agro-Nordwest (BMEL 28DE103B18).
Files
data.zip
Related works
- Is supplement to journal article: 10.1002/rob.22347 (DOI)