RoBivaL data corpus

Backe, Christian; Wirkus, Malte; Hinck, Stefan; Babel, Jonathan; Riedel, Vadim; Reichert, Nele; Kolesnikov, Andrej; Stark, Tobias; Hilljegerdes, Jens; Kücüker, Hilmi Dogu; Barcic, Emir; Klink, Eduard; Ruckelshausen, Arno; Kirchner, Frank

doi:10.5281/zenodo.12547116

Published June 26, 2024 | Version v2

Dataset Open

1. German Research Center for Artificial Intelligence (DFKI)
2. Hochschule Osnabrück

1. Introduction

This data corpus was produced during the RoBivaL project, by robotics and agriculture researchers from DFKI (German Research Center for Artificial Intelligence, Robotics Innovation Center) and HSO (Hochschule Osnabrück, University of Applied Sciences, Agro-Technicum), between August 2021 and October 2023.

The RoBivaL project compared different robot locomotion concepts from both space research and agricultural applications on the basis of experiments conducted under agricultural conditions. Four robot systems were used, two of which (ARTEMIS & SherpaTT) have their origin in futuristic space applications, while the other two (Naio Oz & BoniRob) were developed specifically for agriculture.

The robots were subjected to six experiments, addressing different challenges and requirements for agricultural applications. Since real-world soil conditions usually change with the seasons and can be expected to have a crucial impact on robot performance, the experimental soil conditions were controlled and varied on the two dimensions moisture (dry, moist, wet) and density (tilled, compacted), resulting in six soil condition options. Depending on the specific objectives, each experiment was conducted either on a subset or on all available soil conditions. The experiments were:

Straight travel: Determine variations of travel speed and directional stability under different soil moisture and density levels, and determine the soil deformation and compaction caused by a traverse under given initial soil conditions.
Turn around: Examine the effect of steering on soil deformation with moist and tilled soil.
Repeated rollover: Investigate the effects of repeated axle rollovers on soil compaction, determined by measuring the soil penetration resistance.
Tensile force: Compare the maximum exerted tractive force under different soil moisture and density levels, and gain insight into how varying soil conditions affect the performance of each system during traction.
Sill crossing: Determine the ability to overcome different types of obstacles, and compare relevant system characteristics, e.g. ground clearance, or center of mass.
Obstacle avoidance: Demonstrate SherpaTT's ability to step over an obstacle without contact, thanks to its actively controlled suspension.
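The six soil condition options mentioned above are the Cartesian product of the three moisture levels and the two density levels. The following is a minimal sketch of that enumeration; the lowercase spellings are taken from the prose of this introduction and are assumed, not confirmed, to match the keys used in the Logbook.

```python
from itertools import product

# Level names as given in the introduction; their exact spelling as
# Logbook keys is an assumption.
MOISTURE_LEVELS = ("dry", "moist", "wet")
DENSITY_LEVELS = ("tilled", "compacted")


def soil_conditions():
    """Return every (moisture, density) combination as a list of tuples."""
    return list(product(MOISTURE_LEVELS, DENSITY_LEVELS))


for moisture, density in soil_conditions():
    print(f"{moisture}-{density}")
```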

Field conditions and robot behavior were monitored with various sensors and measuring devices, partly on the robots and partly in the field, in order to document the experiment execution, and to determine the robot performance. The data capturing devices, their roles and deployments are summarized in Table 1.

Table 1: Overview of data capturing devices

	Device on System	Device on System and in Field	Device in Field
System Monitoring	IMU, Force logger	RTK-GPS	Stopwatch, Compass
System and Field Monitoring			Video camera, Ruler
Field Monitoring			Tilt laser scanner, Penetrometer, Moisture meter

2. File tree

The data corpus is stored in a file tree, which is divided into three main sections:

Logbook
Data
Specification

Each section is described in detail in the following chapters.

Here is a complete overview of the file tree:

logbook/
    csv/
        experiment.csv
        parameter.csv
        possible_robot.csv
        possible_value.csv
        robot.csv
        run__${experiment}.csv
    database/
        logbook.sqlite
    schema/
        logbook_entities.png
        logbook_schema.sqlite
    src/
        create_sqlite_database.sh
data/
    ${experiment}/
        ${robot}/
            ${run}/
                ${datafile}
specification/
    experiment/
        experiment.md
        ${experiment}/
            img/
                ${experiment}.png
            ${experiment}-description.md
            ${experiment}.json
    parameter/
        parameter.json
    datafile/
        ${datafile_stem}.json
    robot/
        ${robot}/
            system_properties.json
        robots.png
    sensor/
        ${sensor}.json
    software/
        ${software}.json

The variables ${experiment}, ${robot}, ${run}, ${datafile}, ${datafile_stem}, ${sensor}, and ${software} are explained in the context of each respective section.

2.1. Logbook

The Logbook is a small relational database. Primarily, it contains one table for every experiment, where each row represents an experiment run. These tables capture all facts and measurements about a run that can be expressed as scalar values, including

start and end times,
independent variables (e.g. run track length, soil moisture and density level, name of the tested robot, commanded speed, etc.),
dependent variables (e.g. wheel track depth and width, heading and offset of the robot after a run, etc.),
comments about unforeseen events.

Additional measurements that are better managed in separate data files are stored in the Data section of the corpus, which is discussed in Chapter 2.2.

Besides the run tables, the Logbook contains tables to specify the experiments and the available robots, as well as the parameters that are present in the run tables. These additional tables have some overlap with the Specification section of the data corpus, which is discussed in Chapter 2.3. In the Logbook, the specifying tables were used during the run data acquisition in the field, in order to facilitate and live-validate the data entry.

The entire Logbook is stored in the SQLite file logbook/database/logbook.sqlite. Users who prefer tools other than SQLite can find the constituent tables as CSV files in the directory logbook/csv/. The Logbook schema and entity-relationship diagram are in the logbook/schema/ directory. The database can be recreated from the schema and CSV files with the Bash script logbook/src/create_sqlite_database.sh.
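As an illustration of how the SQLite file might be inspected, here is a minimal sketch using Python's standard `sqlite3` module. The table name pattern `run__${experiment}` comes from this document; the helper names `list_tables` and `fetch_runs` are hypothetical, and no column names are assumed, so rows are returned as generic dictionaries.

```python
import re
import sqlite3
from contextlib import closing


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at db_path."""
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


def fetch_runs(db_path, experiment):
    """Return all rows of the run table of one experiment as dictionaries."""
    # Table names cannot be bound as SQL parameters, so the experiment key
    # is validated before it is interpolated into the statement.
    if not re.fullmatch(r"[a-z_]+", experiment):
        raise ValueError(f"unexpected experiment key: {experiment!r}")
    with closing(sqlite3.connect(db_path)) as conn:
        conn.row_factory = sqlite3.Row
        rows = conn.execute(f'SELECT * FROM "run__{experiment}"').fetchall()
    return [dict(row) for row in rows]
```

A call such as `fetch_runs("logbook/database/logbook.sqlite", "straight_travel")` would then yield one dictionary per experiment run, keyed by whatever columns the table defines.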

The full Logbook file tree is as follows:

logbook/
    csv/
        experiment.csv
        parameter.csv
        possible_robot.csv
        possible_value.csv
        robot.csv
        run__${experiment}.csv
    database/
        logbook.sqlite
    schema/
        logbook_entities.png
        logbook_schema.sqlite
    src/
        create_sqlite_database.sh

The ${experiment} variable refers to the keys at the top level of the data tree, which is discussed in Chapter 2.2.1.
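For users working with the CSV export instead of SQLite, the run table of one experiment could be loaded as sketched below. This assumes that the CSV files are comma-separated, UTF-8 encoded, and start with a header row, none of which is stated in this document; the helper name `load_run_table` is hypothetical.

```python
import csv
from pathlib import Path


def load_run_table(csv_dir, experiment):
    """Load logbook/csv/run__${experiment}.csv as a list of dictionaries.

    Assumption: comma-separated values with a header row. All values are
    returned as strings; no type conversion is attempted.
    """
    path = Path(csv_dir) / f"run__{experiment}.csv"
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

For example, `load_run_table("logbook/csv", "tensile_force")` would read `logbook/csv/run__tensile_force.csv`.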

2.2. Data

The Data section of the corpus contains all measurements that would be impractical to store directly in a run table of the Logbook database, but are better managed as separate data files. In most cases, these are time series issued by a particular sensor, and/or by software running on one of the robots.

In addition to the data strictly necessary for evaluation purposes in RoBivaL, there are some extra data streams that were routinely captured on the ARTEMIS robot which were not available for the other systems, as well as data from experiment runs that were considered invalid or performed for testing.

For data recording on the robots, two different approaches were used, due to different sensor availabilities. In the case of SherpaTT, Naio Oz, and BoniRob, a custom-built, stand-alone embedded PC in a battery-equipped box for autonomous operation (aka Sensor Box) was attached to the given robot. The Sensor Box includes IMU and GNSS sensors, which are of primary relevance for the experiments. In the case of ARTEMIS, the built-in sensors and data logging functionality could be used, which rely on sensors similar to those of the Sensor Box and employ the same software infrastructure for data recording, based on the Rock software framework.

Table 2 gives an overview of all available data files with a short description and possible sources, including sources outside of the robots (i.e. Force logger, Penetrometer, Tilt scanner, and Video camera). A thorough specification of the data files and their respective hard- and software resources is in the Specification section of the corpus, which is discussed in Chapter 2.3.

Table 2: Overview of file types in the Data section

Data file name	Description	Possible sources
bogie_dispatcher.motion_status.csv	Time series of status of the joint of the mobile base	ARTEMIS
force.csv	Time series of momentary tractive force exerted by a robot, measured at regular intervals	Force logger
gnss.nwu_position_samples.csv	Time series of Cartesian positions measured by GPS in North-West-Up coordinate system	ARTEMIS, Sensor Box
gnss.position_samples.csv	Time series of Cartesian positions measured by GPS in robot coordinate system	ARTEMIS, Sensor Box
gnss.solution.csv	Time series of raw values from the GPS sensor	ARTEMIS, Sensor Box
joystick_converter.motion_command.csv	Time series of joystick commands interpreted as motion commands	ARTEMIS
motion_controller.actuators_command.csv	Time series of commands for joints of mobile base	ARTEMIS
odometry.odometry_samples.csv	Time series of aggregated pose of the odometry component	ARTEMIS
penetrometer-after.json penetrometer-before.json penetrometer.json	Penetrometer measurements of the soil penetration resistance at multiple depth levels, before or after the experiment run	Penetrometer
tiltscan-before-front.asc tiltscan-before-rear.asc tiltscan-before-left.asc tiltscan-before-right.asc tiltscan-before.png tiltscan-before.txt tiltscan-after-front.asc tiltscan-after-rear.asc tiltscan-after-left.asc tiltscan-after-right.asc tiltscan-after.png tiltscan-after.txt	Tilt scanner measurements of the track surface, before or after the experiment run, on the front, rear, left, or right side of the robot, in raw pointcloud (.asc) or rasterized and consolidated (.png, .txt) form	Tilt scanner
video.mp4.defaced.mp4	Video recordings of the robot performing the experiment run. Postprocessed to remove faces for privacy protection.	Video camera
xsens.calibrated_sensors.csv	Raw readings of inertial unit	ARTEMIS, Sensor Box
xsens.orientation_samples.csv	Integrated Cartesian pose measured by inertial unit	ARTEMIS, Sensor Box

The attribution of each data file to a particular experiment run is managed via a hierarchical directory structure as follows:

data/
    ${experiment}/
        ${robot}/
            ${run}/
                ${datafile}

For example:

data/
    straight_travel/
        artemis/
            dry-tilled-1-035-2023-06-19-14-05/
                bogie_dispatcher.motion_status.csv
                gnss.nwu_position_samples.csv
                ...
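The four-level hierarchy shown above lends itself to a simple directory walk. The following sketch yields every data file together with the experiment, robot, and run it belongs to; the names `DataFile` and `iter_datafiles` are hypothetical helpers, not part of the corpus.

```python
from pathlib import Path
from typing import Iterator, NamedTuple


class DataFile(NamedTuple):
    experiment: str
    robot: str
    run: str
    path: Path


def iter_datafiles(data_root) -> Iterator[DataFile]:
    """Yield one DataFile per file found at data/<experiment>/<robot>/<run>/."""
    root = Path(data_root)
    # Exactly four path components below the root: the three directory
    # levels plus the data file itself.
    for path in sorted(root.glob("*/*/*/*")):
        if path.is_file():
            experiment, robot, run = path.relative_to(root).parts[:3]
            yield DataFile(experiment, robot, run, path)
```

Iterating over `iter_datafiles("data")` after unpacking data.zip would, for instance, allow filtering all `gnss.solution.csv` files of a given robot.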

All directory and file names are normalized, to facilitate scripting, and to control the references between the Logbook, Data, and Specification sections of the data corpus. As a general rule, the names are composed of a restricted character set: only lowercase ASCII letters, digits, and punctuation from the set [-_.] are used. There is no whitespace. Additional normalization rules specific to each hierarchy level are discussed in the following subsections.
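The restricted character set can be checked with a short regular expression. The sketch below is an assumption-laden illustration: it treats the uppercase `-INVALID` suffix of invalid run directories (see Chapter 2.2.3) as the one permitted exception, and the helper name `is_normalized` is hypothetical.

```python
import re

# Lowercase ASCII letters, digits, and the punctuation characters - _ .
NAME_PATTERN = re.compile(r"[a-z0-9._-]+")
INVALID_SUFFIX = "-INVALID"


def is_normalized(name):
    """Return True if name uses only the restricted character set."""
    # Assumption: the -INVALID marker of invalid runs is tolerated even
    # though it contains uppercase letters.
    if name.endswith(INVALID_SUFFIX):
        name = name[: -len(INVALID_SUFFIX)]
    return NAME_PATTERN.fullmatch(name) is not None


print(is_normalized("video.mp4.defaced.mp4"))
print(is_normalized("Dry Tilled"))
```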

2.2.1. Level 1: `${experiment}/`

At Level 1 are the six experiments, named

obstacle_avoidance
repeated_rollover
sill_crossing
straight_travel
tensile_force
turn_around

These names serve as keywords both in the Logbook and in the Specification section.

2.2.2. Level 2: `${robot}/`

At Level 2 are the four robots, named

artemis
bonirob
naio_oz
sherpatt

These names serve as keywords both in the Logbook and in the Specification section.

2.2.3. Level 3: `${run}/`

The names of the ${run} directories at Level 3 are composed from characteristic variables to provide a human-readable "fingerprint" for each experiment run, which is also machine processable. For example: dry-tilled-1-035-2023-06-19-14-05 is the name of a run of the straight_travel experiment. The general naming pattern of a ${run} directory for all experiments is

${independent_variables}-${repetition_count}-${run_id}-${start_datetime}

if the run is valid. For invalid runs, the suffix -INVALID is appended to this pattern.
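Because the number of independent variables differs per experiment, a run directory name is most easily parsed from the right: the last five tokens form the start datetime, preceded by the run ID and the repetition count. The following is a minimal sketch under the assumption that the repetition count is always an integer; the names `RunName` and `parse_run_name` are hypothetical.

```python
from typing import NamedTuple, Tuple


class RunName(NamedTuple):
    independent_variables: Tuple[str, ...]
    repetition_count: int
    run_id: str          # kept as text to preserve leading zeros
    start_datetime: str  # year-month-day-hour-minute
    valid: bool


def parse_run_name(name):
    """Split a ${run} directory name into its components."""
    valid = not name.endswith("-INVALID")
    if not valid:
        name = name[: -len("-INVALID")]
    tokens = name.split("-")
    # At least one independent variable, repetition count, run ID,
    # and five datetime tokens.
    if len(tokens) < 8:
        raise ValueError(f"not a run directory name: {name!r}")
    *independent, repetition, run_id = tokens[:-5]
    return RunName(
        tuple(independent),
        int(repetition),
        run_id,
        "-".join(tokens[-5:]),
        valid,
    )


print(parse_run_name("dry-tilled-1-035-2023-06-19-14-05"))
```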

The characteristic ${independent_variables} per experiment are:

Obstacle avoidance: ${type_of_avoidance}
Repeated rollover: ${soil_moisture_class}-${soil_density}-${number_of_axle_rollovers}
Sill crossing: ${soil_moisture_class}-${soil_density}-${has_been_traveled}-${sill_type}-${sill_height}
Straight travel: ${soil_moisture_class}-${soil_density}
Tensile force: ${soil_moisture_class}-${soil_density}
Turn around: ${soil_moisture_class}-${soil_density}
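The list above can be turned into a lookup table that attaches variable names to the leading tokens of a run directory name. This sketch assumes that no individual variable value contains a hyphen, which this document does not guarantee; the helper name `label_independent_variables` is hypothetical.

```python
# Variable names per experiment, in the order given in the list above.
INDEPENDENT_VARIABLES = {
    "obstacle_avoidance": ("type_of_avoidance",),
    "repeated_rollover": (
        "soil_moisture_class", "soil_density", "number_of_axle_rollovers",
    ),
    "sill_crossing": (
        "soil_moisture_class", "soil_density", "has_been_traveled",
        "sill_type", "sill_height",
    ),
    "straight_travel": ("soil_moisture_class", "soil_density"),
    "tensile_force": ("soil_moisture_class", "soil_density"),
    "turn_around": ("soil_moisture_class", "soil_density"),
}


def label_independent_variables(experiment, prefix):
    """Map the ${independent_variables} prefix of a run name to named values."""
    names = INDEPENDENT_VARIABLES[experiment]
    values = prefix.split("-")
    if len(values) != len(names):
        raise ValueError(
            f"{experiment} expects {len(names)} values, got {len(values)}"
        )
    return dict(zip(names, values))


print(label_independent_variables("straight_travel", "dry-tilled"))
```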

The ${repetition_count} reflects that each robot undergoes each experiment multiple times for each combination of its ${independent_variables}. Repetitions are necessary because the agricultural field conditions are noisy, e.g. due to uneven run tracks, or the limited capability to provide homogeneous soil moisture and density over all tracks at all times. By repeating an experiment with the same combination of its ${independent_variables}, noise can be mitigated with statistical measures during data postprocessing.
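One simple way to aggregate repetitions is to group runs by their combination of independent variables and report the mean and sample standard deviation per group. The sketch below is only an illustration of that idea, not the evaluation method used in RoBivaL; the helper name `summarize_repetitions` is hypothetical, and the numbers in the usage example are arbitrary placeholders, not measurements from the corpus.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_repetitions(runs):
    """Aggregate (condition, value) pairs into {condition: (n, mean, stdev)}.

    The sample standard deviation is reported as 0.0 when a condition
    has only a single repetition.
    """
    groups = defaultdict(list)
    for condition, value in runs:
        groups[condition].append(value)
    summary = {}
    for condition, values in groups.items():
        spread = stdev(values) if len(values) > 1 else 0.0
        summary[condition] = (len(values), mean(values), spread)
    return summary


# Placeholder values for illustration only, not real measurements.
placeholder_runs = [
    (("dry", "tilled"), 1.0),
    (("dry", "tilled"), 3.0),
    (("wet", "tilled"), 2.0),
]
print(summarize_repetitions(placeholder_runs))
```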

The ${run_id} is a numerical ID to the run__${experiment} table in the Logbook database, which has been discussed in Chapter 2.1.

Finally, the ${start_datetime} is a human-readable timestamp of the start of the experiment run in local time. Since the experiments are run consecutively, it is a natural identifier in the run__${experiment} table, and therefore provides some redundancy and robustness for the linkage between run data and metadata. The format ${year}-${month}-${day}-${hour}-${minute} is chosen to be mostly compliant with ISO 8601, while respecting the restricted character set mentioned above, which is imposed by technical considerations.
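The hyphen-only timestamp format maps directly onto a `strptime` pattern. The following sketch extracts the start time from a run directory name; it returns a naive `datetime`, since the time is local and this document does not specify a time zone. The helper names are hypothetical.

```python
from datetime import datetime

# ${year}-${month}-${day}-${hour}-${minute}
START_FORMAT = "%Y-%m-%d-%H-%M"


def parse_start_datetime(text):
    """Parse a ${start_datetime} string into a naive local datetime."""
    return datetime.strptime(text, START_FORMAT)


def start_of_run(run_name):
    """Extract and parse the start datetime from a ${run} directory name."""
    if run_name.endswith("-INVALID"):
        run_name = run_name[: -len("-INVALID")]
    # The timestamp always occupies the last five hyphen-separated tokens.
    return parse_start_datetime("-".join(run_name.split("-")[-5:]))


print(start_of_run("dry-tilled-1-035-2023-06-19-14-05").isoformat())
```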

2.2.4. Level 4: `${datafile}`

At the lowest level is a collection of all data files for a given experiment run. Names of timeseries that were logged with the Sensor Box or on the ARTEMIS robot (see Table 2) are a result of the data logging technique and correspond with names of the software components that produce the data streams. Such components can either be device drivers producing raw device data, or data processing components of any kind such as control or sensor processing algorithms.

Table 3 lists the maximum set of data files per experiment. During individual experiment runs, only a subset of these data files may be available, because the corresponding sensor was not present on the system, because the sensor or capturing device failed, or because of human error. Data files in parentheses are only available on ARTEMIS; all other data files are available independently of the robot.

Table 3: Maximum set of data files per experiment

Experiment	Data files
Obstacle avoidance	`gnss.nwu_position_samples.csv` `gnss.position_samples.csv` `gnss.solution.csv` `video.mp4.defaced.mp4` `xsens.calibrated_sensors.csv` `xsens.orientation_samples.csv` (`bogie_dispatcher.motion_status.csv`) (`joystick_converter.motion_command.csv`) (`motion_controller.actuators_command.csv`) (`odometry.odometry_samples.csv`)
Repeated rollover	`penetrometer.json`
Sill crossing	`gnss.nwu_position_samples.csv` `gnss.position_samples.csv` `gnss.solution.csv` `video.mp4.defaced.mp4` `xsens.calibrated_sensors.csv` `xsens.orientation_samples.csv` (`bogie_dispatcher.motion_status.csv`) (`joystick_converter.motion_command.csv`) (`motion_controller.actuators_command.csv`) (`odometry.odometry_samples.csv`)
Straight travel	`gnss.nwu_position_samples.csv` `gnss.position_samples.csv` `gnss.solution.csv` `penetrometer-after.json` `penetrometer-before.json` `tiltscan-after-left.asc` `tiltscan-after.png` `tiltscan-after-right.asc` `tiltscan-after.txt` `tiltscan-before-left.asc` `tiltscan-before.png` `tiltscan-before-right.asc` `tiltscan-before.txt` `video.mp4.defaced.mp4` `xsens.calibrated_sensors.csv` `xsens.orientation_samples.csv` (`bogie_dispatcher.motion_status.csv`) (`joystick_converter.motion_command.csv`) (`motion_controller.actuators_command.csv`) (`odometry.odometry_samples.csv`)
Tensile force	`force.csv` `gnss.nwu_position_samples.csv` `gnss.position_samples.csv` `gnss.solution.csv` `video.mp4.defaced.mp4` `xsens.calibrated_sensors.csv` `xsens.orientation_samples.csv` (`bogie_dispatcher.motion_status.csv`) (`joystick_converter.motion_command.csv`) (`motion_controller.actuators_command.csv`) (`odometry.odometry_samples.csv`)
Turn around	`gnss.nwu_position_samples.csv` `gnss.position_samples.csv` `gnss.solution.csv` `penetrometer-after.json` `penetrometer-before.json` `tiltscan-after-front.asc` `tiltscan-after.png` `tiltscan-after-rear.asc` `tiltscan-after.txt` `tiltscan-before-front.asc` `tiltscan-before.png` `tiltscan-before-rear.asc` `tiltscan-before.txt` `video.mp4.defaced.mp4` `xsens.calibrated_sensors.csv` `xsens.orientation_samples.csv` (`bogie_dispatcher.motion_status.csv`) (`joystick_converter.motion_command.csv`) (`motion_controller.actuators_command.csv`) (`odometry.odometry_samples.csv`)
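Since individual runs may lack some of the files in Table 3, it can be useful to check a run directory against the maximum set. The sketch below encodes Table 3 as Python sets and reports the missing files; the helper names `expected_files` and `missing_files` are hypothetical.

```python
from pathlib import Path

COMMON = {
    "gnss.nwu_position_samples.csv", "gnss.position_samples.csv",
    "gnss.solution.csv", "video.mp4.defaced.mp4",
    "xsens.calibrated_sensors.csv", "xsens.orientation_samples.csv",
}
# Files listed in parentheses in Table 3.
ARTEMIS_ONLY = {
    "bogie_dispatcher.motion_status.csv",
    "joystick_converter.motion_command.csv",
    "motion_controller.actuators_command.csv",
    "odometry.odometry_samples.csv",
}
PENETROMETER = {"penetrometer-before.json", "penetrometer-after.json"}


def tiltscan(sides):
    """Return the tilt scan file names for the given robot sides."""
    files = set()
    for phase in ("before", "after"):
        files.update({f"tiltscan-{phase}.png", f"tiltscan-{phase}.txt"})
        files.update(f"tiltscan-{phase}-{side}.asc" for side in sides)
    return files


EXPECTED = {
    "obstacle_avoidance": COMMON,
    "repeated_rollover": {"penetrometer.json"},
    "sill_crossing": COMMON,
    "straight_travel": COMMON | PENETROMETER | tiltscan(("left", "right")),
    "tensile_force": COMMON | {"force.csv"},
    "turn_around": COMMON | PENETROMETER | tiltscan(("front", "rear")),
}


def expected_files(experiment, robot):
    """Return the maximum set of data files for an experiment and robot."""
    files = set(EXPECTED[experiment])
    # Table 3 lists no ARTEMIS-only files for repeated_rollover.
    if robot == "artemis" and experiment != "repeated_rollover":
        files |= ARTEMIS_ONLY
    return files


def missing_files(run_dir, experiment, robot):
    """Return the sorted names of expected files absent from run_dir."""
    present = {p.name for p in Path(run_dir).iterdir() if p.is_file()}
    return sorted(expected_files(experiment, robot) - present)
```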

The contents of the data files are described in the Specification section of the corpus, which is discussed in Chapter 2.3.

2.3. Specification

The Specification section of the corpus contains structural and semantic information to give data (re)users an in-depth understanding of

each experiment, including purpose, setup, procedure, parameters, and available data files,
used assets, including robots, sensors, and software,
contents and format of the resulting data files.

As far as possible, the information is provided in structured form as JSON files, so that it is both machine-processable and human-readable.

The section is subdivided by entities:

experiment
parameter
datafile
robot
sensor
software

Here is a complete file tree of the Specification section:

specification/
    experiment/
        experiment.md
        ${experiment}/
            img/
                ${experiment}.png
            ${experiment}-description.md
            ${experiment}.json
    parameter/
        parameter.json
    datafile/
        ${datafile_stem}.json
    robot/
        ${robot}/
            system_properties.json
        robots.png
    sensor/
        ${sensor}.json
    software/
        ${software}.json

2.3.1. Entity `experiment`

The experiment/ directory contains a detailed specification for each experiment. It is divided into multiple files with the following file tree layout:

experiment/
    experiment.md
    ${experiment}/
        ${experiment}.json
        ${experiment}-description.md
        img/
            ${experiment}.png

The file experiment.md contains some general remarks that apply to all experiments.

The ${experiment} variable refers to the same keys as Level 1 of the Data section, i.e.

obstacle_avoidance
repeated_rollover
sill_crossing
straight_travel
tensile_force
turn_around

The file ${experiment}.json contains machine-processable references to

the robots that were involved in a particular experiment,
the names of its available datafiles, and
its characteristic parameters together with their experiment-specific attributes, i.e. whether a parameter is
- an independent or a dependent variable,
- required for every run or not.

Additional parameter attributes that are not experiment-specific are discussed in Chapter 2.3.2.

To facilitate human consumption, a detailed description of each experiment is stored in the separate Markdown-formatted file ${experiment}-description.md. It specifies

the experiment's objectives,
its setup, illustrated by the image img/${experiment}.png (except for the repeated_rollover experiment),
the actions performed during an experiment run, and
all available measurements, and where they can be found in the data corpus.

2.3.2. Entity `parameter`

The parameter/ directory contains a single file parameter/parameter.json, which specifies the aspects of each parameter that are not experiment-specific, i.e. its

definition,
data type,
measuring unit (if applicable), and
possible values (if applicable).

This is done outside the experiment specification experiment/${experiment}/${experiment}.json (discussed in Chapter 2.3.1.) to reduce redundancy, because most parameters are employed in multiple experiments.

2.3.3. Entity `datafile`

For each available datafile (as per Table 2), except video.mp4.defaced.mp4, the datafile/ directory contains a machine-processable datafile format specification datafile/${datafile_stem}.json. The variable ${datafile_stem} depends on the type of datafile:

For data sources whose output is stored in a single file per experiment run, the ${datafile_stem} is the ${datafile} name without its filename extension, e.g. bogie_dispatcher.motion_status. Incidentally, this applies to all .csv datafiles.
For data sources whose output is stored in multiple files per experiment run, the ${datafile_stem} is the characteristic prefix of the ${datafile} name. In particular, for penetrometer* or tiltscan* datafiles, the ${datafile_stem} is penetrometer or tiltscan, respectively.
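The two rules above can be expressed as a small function that maps a data file name to its specification file. This is a sketch of the stated rules only; the helper names `datafile_stem` and `specification_path` are hypothetical.

```python
# Data sources that write several files per run share one specification.
MULTI_FILE_PREFIXES = ("penetrometer", "tiltscan")
# The video file has no format specification in the datafile/ directory.
UNSPECIFIED = ("video.mp4.defaced.mp4",)


def datafile_stem(datafile):
    """Return the ${datafile_stem} for a ${datafile} name, or None."""
    if datafile in UNSPECIFIED:
        return None
    for prefix in MULTI_FILE_PREFIXES:
        if datafile.startswith(prefix):
            return prefix
    # Single-file sources: drop only the last filename extension.
    stem, dot, _extension = datafile.rpartition(".")
    return stem if dot else datafile


def specification_path(datafile):
    """Return the path of the specification file, relative to the corpus root."""
    stem = datafile_stem(datafile)
    return None if stem is None else f"specification/datafile/{stem}.json"


print(specification_path("bogie_dispatcher.motion_status.csv"))
print(specification_path("tiltscan-before-left.asc"))
```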

All datafile specification files contain

a semantic definition of the datafile,
references to sensors involved in its making (if applicable), and
references to software involved in its making (if applicable).

Additionally, a specification file may contain information depending on the type of datafile:

For tabular timeseries, there is a specification of each table column (i.e. column definition, data type, and measuring unit (if applicable)), and a mention of the primary time column. (The latter is necessary, because two different time columns may be available, capturing sample creation time and log time, respectively.) This holds for all .csv datafiles.
The Penetrometer file specification contains a description of all its JSON fields.
The Tilt scanner file specification covers three different file types (raw pointcloud, numeric raster, colorized raster) with individual formats, semantic definitions and software references.

2.3.4. Entity `robot`

The robot/ directory has the following layout:

robot/
    ${robot}/
        system_properties.json
    robots.png

The ${robot} variable refers to the same keys as Level 2 of the Data section, i.e.

artemis
bonirob
naio_oz
sherpatt

The file ${robot}/system_properties.json lists relevant system properties of the respective robot, namely

size (length, width, height) of a bounding box
weight
description of the type of actuation
description of the type of wheels
contact surface
area density

The robots.png image shows a complete view of all robots in the field, as well as a close-up of each of the wheel types.

2.3.5. Entity `sensor`

The sensor/ directory contains a file sensor/${sensor}.json for each of the sensors involved in the creation of the datafiles. It lists the values of relevant properties specific to each sensor, e.g. baudrate, update frequency, or connection.

The ${sensor} variable in the file name serves as a key which is referenced in the datafile/${datafile_stem}.json specifications discussed in Chapter 2.3.3., and in the software/${software}.json specifications discussed in Chapter 2.3.6.

2.3.6. Entity `software`

The software/ directory contains a file software/${software}.json for each of the software components involved in the creation of the datafiles. It lists the values of relevant properties, namely

type
format
URLs of source code repositories (if available)
developer
list of sensors
list of datafile names

The ${software} variable in the file name serves as a key which is referenced in the datafile/${datafile_stem}.json specifications discussed in Chapter 2.3.3., and in the sensor/${sensor}.json specifications discussed in Chapter 2.3.5.

3. Acknowledgements

The work on this dataset is based on the project RoBivaL, funded by the German Federal Ministry for Economic Affairs and Climate Action (BMWK) under grant number 50RP2150. The responsibility for the content of this publication lies with the authors.

The authors would like to thank the Federal Government and the Heads of Government of the Länder, as well as the Joint Science Conference (GWK), for their initiative within the framework of the NFDI4Ing consortium (German Research Foundation (DFG) - project number 442146713).

The dataset has also been supported by the ZLA (NiMWK, Volkswagenstiftung, ZDIN 11-76251-14-3/19) and the Experimentierfeld Agro-Nordwest (BMEL 28DE103B18).

Files

Files (13.3 GB)

Name	MD5	Size
data.zip	bf19c190cfdd2e53fc9e2a9379933cd4	13.3 GB
logbook.zip	10b7c304759ada344ca3d3f6fba3de3b	328.6 kB
README.md	77eea9a8b9a1035ec10f7cf6ce84e9d3	31.7 kB
specification.zip	1294be287267f84dd9aa9791aca9582a	8.8 MB
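The MD5 checksums listed above can be used to verify a download. The following is a minimal sketch using Python's standard `hashlib` module; the files are read in chunks so that the 13.3 GB archive does not need to fit into memory. The helper names `md5_of_file` and `verify` are hypothetical.

```python
import hashlib

# Checksums as published in the file listing above.
EXPECTED_MD5 = {
    "data.zip": "bf19c190cfdd2e53fc9e2a9379933cd4",
    "logbook.zip": "10b7c304759ada344ca3d3f6fba3de3b",
    "README.md": "77eea9a8b9a1035ec10f7cf6ce84e9d3",
    "specification.zip": "1294be287267f84dd9aa9791aca9582a",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at path."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, name):
    """Return True if the file at path matches the published checksum of name."""
    return md5_of_file(path) == EXPECTED_MD5[name]
```

For example, `verify("downloads/logbook.zip", "logbook.zip")` would return `True` for an intact download; the `downloads/` directory is an assumed location.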

Additional details

Is supplement to: Journal article: 10.1002/rob.22347 (DOI)
