Authors:
Advanced Integrated Sensing lab (ADVISE) / Department of Electrical Engineering (ESAT) / KU Leuven
Other (Recording, supervision, ...):
The dataset is a derivative of the SINS dataset and is meant to be used as an evaluation set for the DCASE2018 Task 5 challenge. The development set to be used can be found here. The SINS dataset contains a continuous recording of one person living in a vacation home over a period of one week. It was collected using a network of 13 microphone arrays distributed over the entire home. Each microphone array consists of 4 linearly arranged microphones. For this dataset, 7 microphone arrays in the combined living room and kitchen area are used. Figure 2 shows the floorplan of the recorded environment along with the positions of the sensor nodes used.
Approximately 200 hours of data from 7 sensor nodes are taken from the SINS dataset. The partitioning of the data was done randomly. The segments belonging to one particular consecutive activity (e.g. a full session of cooking) were kept together. The data provided for each sensor node contain recordings of the same time period. This means that the performed activities are observed from multiple microphone arrays at the same time instant.
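The constraint that all segments of one session stay together can be illustrated with a group-wise random split. This is only a minimal sketch of the idea, not the procedure that was used to build the dataset; the function name `split_by_session`, the `held_out_fraction` parameter and the seed are assumptions introduced for illustration.

```python
import random
from collections import defaultdict

def split_by_session(rows, held_out_fraction=0.5, seed=0):
    """Randomly split (filename, label, session) rows without breaking up a session.

    Hypothetical helper: the split is drawn over session identifiers,
    so every segment of a session lands in the same part.
    """
    by_session = defaultdict(list)
    for row in rows:
        by_session[row[2]].append(row)

    # Shuffle the session identifiers, not the individual segments.
    sessions = sorted(by_session)
    random.Random(seed).shuffle(sessions)

    n_held_out = int(len(sessions) * held_out_fraction)
    held_out = set(sessions[:n_held_out])

    kept_rows = [r for s in sessions if s not in held_out for r in by_session[s]]
    held_out_rows = [r for s in sessions if s in held_out for r in by_session[s]]
    return kept_rows, held_out_rows
```

Splitting at the session level avoids the situation where near-identical 10 s segments of the same cooking session end up on both sides of a split.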
The recordings were split into audio segments of 10 s. Each segment represents one activity. These audio segments are provided as individual files along with the ground truth. The 9 daily activities in this dataset are shown in Table 1, along with the number of available 10 s segments and the number of full sessions of each activity (e.g. a cooking session).
Activity | # 10s segments | # sessions |
---|---|---|
Absence (nobody present in the room) | 21112 | 21 |
Cooking | 4221 | 6 |
Dishwashing | 1477 | 5 |
Eating | 2100 | 6 |
Other (present but not doing any relevant activity) | 1960 | 59 |
Social activity (visit, phone call) | 3815 | 10 |
Vacuum cleaning | 868 | 4 |
Watching TV | 21116 | 4 |
Working (typing, mouse click, ...) | 16303 | 16 |
Total | 72972 | 131 |
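As a quick sanity check, the segment counts in Table 1 can be converted to hours of audio. The sketch below only restates the table; the variable names are illustrative, and the total is summed over all segment files.

```python
SEGMENT_SECONDS = 10  # each file is a 10 s segment

# Number of 10 s segments per activity, copied from Table 1.
segments_per_activity = {
    "Absence": 21112,
    "Cooking": 4221,
    "Dishwashing": 1477,
    "Eating": 2100,
    "Other": 1960,
    "Social activity": 3815,
    "Vacuum cleaning": 868,
    "Watching TV": 21116,
    "Working": 16303,
}

total_segments = sum(segments_per_activity.values())
total_hours = total_segments * SEGMENT_SECONDS / 3600

print(total_segments)         # 72972
print(round(total_hours, 1))  # 202.7
```

The result, about 203 hours, matches the "approximately 200 hours" stated above. The counts also show that the classes are far from balanced: Absence, Watching TV and Working together account for most of the segments.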
The sensor node configuration used in this setup is a control board together with a linear microphone array. The control board contains an EFM32 ARM Cortex-M4 microcontroller from Silicon Labs (EFM32WG980) used for sampling the analog audio. The microphone array contains four Sonion N8AC03 MEMS low-power (±17µW) microphones with an inter-microphone distance of 5 cm. The audio channels are sampled sequentially at a rate of 16 kHz with a bit depth of 12.
The annotation was performed in two phases. First, during the data collection a smartphone application was used to let the monitored person(s) annotate the activities while being recorded. The person could only select from a fixed set of activities. The application was easy to use and did not significantly influence the transition between activities. Second, the start and stop timestamps of each activity were refined using our own annotation software.
Postprocessing and sharing the database involves privacy-related aspects. Besides the person(s) living there, multiple people visited the home. Moreover, during a phone call, one can partially hear the person on the other end. Written informed consent was obtained from all participants.
The content of the dataset is structured in the following manner:
dataset root
│ EULA.pdf End user license agreement
│ meta.txt meta data, tsv-format, [audio file (str)][tab][label (str)][tab][session (str)]\n
│ readme.md Dataset description (markdown)
│ readme.html Dataset description (HTML)
│
└───audio 72972 audio segments, 16-bit 16kHz
│ │ 1.wav name format {segmentID}.wav
│ │ 100.wav
│ │ ...
│
└───evaluation_setup
│ evaluation.txt evaluation file list, tsv-format, [audio file (str)][tab][label (str)][tab][session (str)]\n
│ map.txt mapping between filenames, tsv-format, [audio file (str)][tab][audio file (str)]\n
└───test.txt test file list, tsv-format, [audio file (str)]\n
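The three-column TSV layout of meta.txt and evaluation.txt can be read with the Python standard library. The following is a minimal sketch; the helper names `read_meta` and `count_labels` are hypothetical, and it assumes the files have no header row and contain exactly the three tab-separated fields listed above.

```python
import csv
from collections import Counter

def read_meta(path):
    """Read a [audio file][tab][label][tab][session] file into a list of tuples."""
    rows = []
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        for line_no, fields in enumerate(reader, start=1):
            if not fields:
                continue  # skip empty lines
            if len(fields) != 3:
                raise ValueError(f"{path}:{line_no}: expected 3 fields, got {len(fields)}")
            rows.append(tuple(fields))
    return rows

def count_labels(rows):
    """Count how many segments carry each activity label."""
    return Counter(label for _, label, _ in rows)
```

For example, `count_labels(read_meta("meta.txt"))` gives the per-activity segment counts, which can be compared against Table 1.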
The multi-channel audio files can be found under the directory audio and are named in the following manner:
{segmentID}.wav
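The properties of a segment file can be checked with Python's standard `wave` module. This is a sketch under the assumption that the files are plain PCM WAV; the helper name `describe_wav` is hypothetical.

```python
import wave

def describe_wav(path):
    """Return channel count, sample rate, bit depth and duration of a PCM WAV file."""
    with wave.open(path, "rb") as w:
        frames = w.getnframes()
        rate = w.getframerate()
        return {
            "channels": w.getnchannels(),
            "sample_rate_hz": rate,
            "bits_per_sample": 8 * w.getsampwidth(),  # sample width is in bytes
            "duration_s": frames / rate,
        }
```

Given the description above, one would expect `describe_wav("audio/1.wav")` to report a 16 kHz sample rate, 16 bits per sample and a duration of about 10 s, with one channel per microphone of the array.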
The file meta.txt and the content of the folder evaluation_setup contain filenames and labels. Additionally, a filename mapping (map.txt) is available that maps the filenames to names similar to those used in the development dataset. The dataset is structured so that it can work with the DCASE 2018 Task 5 baseline code.
See file EULA.pdf