Dataset Open Access

To bee or not to bee: An annotated dataset for beehive sound recognition

Nolasco, Inês; Benetos, Emmanouil

-- Dataset documentation --

1- Introduction

The present dataset was developed in the context of our work in [1], which focuses on the automatic recognition of beehive sounds. The problem is posed as the classification of sound segments into two classes: Bee and noBee. The novelty of the explored approach and the need for annotated data dictated the construction of such a dataset.

2- Description

2.1- Audio recordings:

The annotated dataset was developed from a selected set of recordings acquired in the context of two different projects: the Open Source Beehive (OSBH) project [2] and the NU-Hive project [3]. The main goal of both projects is to develop a beehive monitoring system capable of identifying and predicting certain events and states of the hive that are of interest to the beekeeper. Among the many variables that can be measured to help recognize different states of the hive, the sound the bees produce is a major focus for both projects.

The recordings from the OSBH project were acquired through a citizen science initiative that asked members of the general public to record the sound from their beehives and to register the state of the hive at that moment. Because of the amateur and collaborative nature of this project, its recordings are highly diverse, reflecting the very different conditions in which the signals were acquired: different recording devices, different environments where the hives were placed, and even different positions of the microphones inside the hive. This variety of settings makes the dataset well suited to evaluating and challenging the methods developed.

The NU-Hive project is a comprehensive data acquisition effort, concerning not only sound but also a large number of other variables that will allow the study of bee behavior and other unknown aspects. The selected recordings are taken from 2 hives and labeled with one of two states: queen bee present and queen bee not present. In contrast to the OSBH recordings, the recordings from the NU-Hive project come from a much more controlled and homogeneous environment. Here the external sounds are mainly traffic, car horns and birds.

2.2- The annotated dataset:

For each selected recording, time segments are labeled as Bee or noBee depending on whether the perceived source of the sound is the bees or something external to the hive.

The whole annotated dataset consists of 78 recordings of varying lengths, which add up to a total duration of approximately 12 hours, of which 25% is annotated as noBee events.
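As an illustration of how a figure such as the noBee share could be computed, here is a minimal Python sketch. It assumes the annotations have already been read into (start, end, label) tuples with times in seconds; the function name nobee_fraction and the tuple layout are illustrative choices, not part of the dataset.

```python
from typing import Iterable, Tuple

# (start_seconds, end_seconds, label) -- an assumed in-memory layout
Interval = Tuple[float, float, str]


def nobee_fraction(intervals: Iterable[Interval]) -> float:
    """Return the share of annotated time labeled 'nobee'.

    Returns 0.0 when no time is annotated at all.
    """
    total = 0.0
    nobee = 0.0
    for start, end, label in intervals:
        duration = end - start
        total += duration
        if label.lower() == "nobee":
            nobee += duration
    return nobee / total if total > 0 else 0.0


# Toy input: 75 s of bee sound followed by 25 s with an external sound.
toy = [(0.0, 75.0, "bee"), (75.0, 100.0, "nobee")]
print(nobee_fraction(toy))  # 0.25
```

Applied to the intervals of all recordings together, the same function would give a dataset-wide share.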

About 60% of the recordings are from the NU-Hive dataset and represent 2 hives; the remaining recordings are from the OSBH dataset and represent 6 different hives. The recorded hives are from 3 main locations: North America, Australia and Europe.

3- Annotation procedure

The annotation procedure consists of listening to the selected recordings and marking the beginning and the end of every sound that could not be recognized as a beehive sound. External sounds are recognized primarily by ear, with the log-mel frequency spectrum of the signal used as a visual aid. Both functionalities are offered by the Sonic Visualiser software, which was used by two volunteers who are neither bee specialists nor specially trained in sound annotation tasks.

By marking these pairs of time points corresponding to the beginning and end of external sound periods, we obtain a labeling of the whole recording into Bee and noBee intervals. Thus, in the resulting Bee intervals, only pure beehive sounds (no external sounds) should be perceived for the entirety of the segment. The noBee intervals refer to periods where an external sound can be perceived (superimposed on the bee sounds).
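The step from marked external-sound periods to a full labeling can be sketched as follows. This is an illustrative reconstruction, not the tool used by the annotators: it assumes the marks are (start, end) pairs in seconds lying within the recording, and that the recording duration is known. The name label_recording is hypothetical.

```python
from typing import List, Tuple


def label_recording(
    external: List[Tuple[float, float]], duration: float
) -> List[Tuple[float, float, str]]:
    """Label everything inside the marked periods 'nobee' and the rest 'bee'."""
    labeled: List[Tuple[float, float, str]] = []
    cursor = 0.0  # end of the last interval emitted so far
    for start, end in sorted(external):
        start = max(start, cursor)  # clip any overlap with the previous mark
        if end <= start:
            continue  # mark fully covered by an earlier one
        if start > cursor:
            labeled.append((cursor, start, "bee"))
        labeled.append((start, end, "nobee"))
        cursor = end
    if cursor < duration:
        labeled.append((cursor, duration, "bee"))
    return labeled


marks = [(78.46, 78.95), (103.93, 112.48)]
for interval in label_recording(marks, 152.48):
    print(interval)
```

The sketch produces contiguous intervals. The example annotation file in this document shows a 0.01 s step between the end of one interval and the start of the next, which this sketch does not reproduce.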

File Structure:

Each audio file is coupled with its corresponding annotation file, which has the same name and the extension .lab.
For convenience, all the annotations are also collected in a single master label file named beeAnnotations.mlf.
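A small Python sketch of this naming convention, for pairing audio files with their annotations in a local copy of the dataset. The helper names and the assumption that all files sit in one flat folder are illustrative only.

```python
from pathlib import Path
from typing import List, Tuple

# The two audio extensions that appear in the file listing.
AUDIO_SUFFIXES = {".wav", ".mp3"}


def annotation_path(audio_path: Path) -> Path:
    """Same name as the audio file, with the extension replaced by .lab."""
    return audio_path.with_suffix(".lab")


def paired_files(folder: Path) -> List[Tuple[Path, Path]]:
    """List (audio, annotation) pairs for audio files whose .lab file exists."""
    pairs = []
    for path in sorted(folder.iterdir()):
        if path.suffix.lower() in AUDIO_SUFFIXES:
            lab = annotation_path(path)
            if lab.exists():
                pairs.append((path, lab))
    return pairs


print(annotation_path(Path("CF003 - Active - Day - (214).wav")).name)
```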

The .lab files are structured as follows:

• The first row identifies the audio file to which the annotations refer.
• Each line after that describes an interval with its start time, end time and label. The time points are expressed in seconds.

Below is an example of such an annotation file:

Hive3_20_07_2017_QueenBee_H3_audio_15_30_00
0 78.45 bee
78.46 78.95 nobee
78.96 103.92 bee
103.93 112.48 nobee
112.49 152.48 bee
.
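A file in this layout could be read with a few lines of Python. The sketch below is based only on the example above: it assumes fields are separated by whitespace and treats a line containing a single "." as the end of the annotations. The function name parse_lab is hypothetical.

```python
from typing import List, Tuple


def parse_lab(text: str) -> Tuple[str, List[Tuple[float, float, str]]]:
    """Parse the contents of a .lab file into (audio name, intervals)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty annotation file")
    name = lines[0]  # first row: the audio file the annotations refer to
    intervals = []
    for ln in lines[1:]:
        if ln == ".":  # assumed end-of-annotations marker, as in the example
            break
        start, end, label = ln.split(None, 2)
        intervals.append((float(start), float(end), label))
    return name, intervals


sample = """Hive3_20_07_2017_QueenBee_H3_audio_15_30_00
0 78.45 bee
78.46 78.95 nobee
78.96 103.92 bee
103.93 112.48 nobee
112.49 152.48 bee
.
"""
name, intervals = parse_lab(sample)
print(name, len(intervals))
```

To read a file from disk, its contents can be passed in with something like parse_lab(open(path).read()).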


When using this dataset, please cite [1]:

[1] I. Nolasco and E. Benetos, “To bee or not to bee: Investigating machine learning approaches to beehive sound recognition,” in Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2018, submitted.

[2] “Open Source Beehives Project,” https://www.osbeehives.com/.

[3] S. Cecchi, A. Terenzi, S. Orcioni, P. Riolo, S. Ruschioni, and N. Isidoro, “A preliminary study of sounds emitted by honey bees in a beehive,” in Audio Engineering Society Convention 144, 2018.

Files (3.7 GB)
Name Size
beeAnnotations.mlf
md5:de08c3a889049e240c06a7f048f20b09
50.3 kB
CF001 - Missing Queen - Day -.lab
md5:8c752b7fbf749518b443eea64ef993ed
82 Bytes
CF001 - Missing Queen - Day -.mp3
md5:7619dc60d425a1ab8604bd4212e11e83
247.0 kB
CF003 - Active - Day - (214).lab
md5:accfc41ffc92d8654eca9b19dff4d0a7
460 Bytes
CF003 - Active - Day - (214).wav
md5:ef54bf7122b3689f80d1c71852e34975
26.5 MB
CF003 - Active - Day - (215).lab
md5:c897b1f4e7b0252eaee197238584524c
560 Bytes
CF003 - Active - Day - (215).wav
md5:5914a41640ca4c0a31718039c21fa808
26.5 MB
CF003 - Active - Day - (216).lab
117 Bytes
CF003 - Active - Day - (216).wav
md5:8caa6a8c113007f160c188ca449160b6
26.5 MB
CF003 - Active - Day - (217).lab
md5:8daf6a9bdfe268c7dd67570763d11119
798 Bytes
CF003 - Active - Day - (217).wav
md5:3d3141627c7b1a8f8eae9de5968d7a4b
26.5 MB
CF003 - Active - Day - (218).lab
md5:4bdd06811e0944f28a822abe3b7d42f3
280 Bytes
CF003 - Active - Day - (218).wav
md5:3f703c3cfeac42a77b94bed390146698
26.5 MB
CF003 - Active - Day - (219).lab
md5:8d8a5de910662a9efbeaa2b7e54bc5e3
1.3 kB
CF003 - Active - Day - (219).wav
md5:cdb5d11da1672cde4359bab6b3e556f2
26.5 MB
CF003 - Active - Day - (220).lab
md5:b420928d577f910fb4e17b3d4842950e
46 Bytes
CF003 - Active - Day - (220).wav
md5:4024e4f8f4c6a3d57a21cf325e785a5b
26.5 MB
CF003 - Active - Day - (221).lab
md5:319d1c52f4bf21e58e8e1b2469140399
86 Bytes
CF003 - Active - Day - (221).wav
26.5 MB
CF003 - Active - Day - (222).lab
md5:531c8ae89c5fce1659aebf4cdce6fa6f
46 Bytes
CF003 - Active - Day - (222).wav
md5:e519337dd0de723a749f21649ae78297
26.5 MB
CF003 - Active - Day - (223).lab
md5:c8eff4f097f972ba565426d48763e4e7
79 Bytes
CF003 - Active - Day - (223).wav
md5:10f793da1e512be4eb2c9f945fafe2c6
26.5 MB
CF003 - Active - Day - (224).lab
md5:604170385d6efd8d76f740f2057b6b66
126 Bytes
CF003 - Active - Day - (224).wav
md5:69f6e4dde181719c57939a955de1901a
26.5 MB
CF003 - Active - Day - (225).lab
md5:547d2966f4d37499e437e46be152fcaf
46 Bytes
CF003 - Active - Day - (225).wav
md5:fc6cdcc3a4e562949b6576583f3dacb6
26.5 MB
CF003 - Active - Day - (226).lab
md5:54e3cc2f7f69bbfeb4407d1e616598a4
81 Bytes
CF003 - Active - Day - (226).wav
26.5 MB
CF003 - Active - Day - (227).lab
md5:c6df17614dba073b9da2ddd148f22c3c
235 Bytes
CF003 - Active - Day - (227).wav
md5:bd0e8c7b32ca98f05da54b674c359a32
26.5 MB
CJ001 - Missing Queen - Day - (100).lab
md5:de77fa35d8ebfee96ff544aa23fa991e
2.1 kB
CJ001 - Missing Queen - Day - (100).wav
md5:7f609ee00cb3152a8d1d0f34a121413b
26.5 MB
CJ001 - Missing Queen - Day - (101).lab
md5:eeb4ebc581d948b97fd6c8cd4e229532
718 Bytes
CJ001 - Missing Queen - Day - (101).wav
md5:2b18ac05da9b1591ecaae763277be78e
26.5 MB
CJ001 - Missing Queen - Day - (102).lab
md5:56c3577cfaf98e153b3a9b8677f6a2e6
447 Bytes
CJ001 - Missing Queen - Day - (102).wav
md5:7a7b9318e482922c5375acffcb6dd4f3
26.5 MB
CJ001 - Missing Queen - Day - (103).lab
md5:01d0183ba132c4d01c758960240333f4
1.5 kB
CJ001 - Missing Queen - Day - (103).wav
md5:9a6ae560df5a517e52896f68becd0314
26.5 MB
CJ001 - Missing Queen - Day - (104).lab
md5:c7727beb1a2bf8d86b6d0899f7888183
1.9 kB
CJ001 - Missing Queen - Day - (104).wav
md5:833b60e9a109f7eb25801104977f2006
26.5 MB
Dataset documentation.pdf
423.8 kB
GH001 - Active - Day - 141022_0659_0751.lab
md5:5525cfbcb6579e80b92e3933a90e4de0
4.0 kB
GH001 - Active - Day - 141022_0659_0751.mp3
md5:dc0aa0e87b4cb6f3845e28c0e64a10b5
50.0 MB
Hive1_12_06_2018_QueenBee_H1_audio___15_00_00.lab
md5:8e95f7f4a00fd75774e007e4b0f038a8
652 Bytes
Hive1_12_06_2018_QueenBee_H1_audio___15_00_00.wav
76.6 MB
Hive1_12_06_2018_QueenBee_H1_audio___15_10_00.lab
md5:fcc9764e9cb5b6a22f6d02bd985269a8
483 Bytes
Hive1_12_06_2018_QueenBee_H1_audio___15_10_00.wav
md5:29478441c7cb26794b7fdfaea6f4eba7
76.2 MB
Hive1_12_06_2018_QueenBee_H1_audio___15_20_00.lab
md5:dfb6344dabf84b4c2b473eb79e0079b1
917 Bytes
Hive1_12_06_2018_QueenBee_H1_audio___15_20_00.wav
md5:991ca0846251d5f724e03f74ac4479c1
76.6 MB
Hive1_12_06_2018_QueenBee_H1_audio___15_30_00.lab
md5:0db8bd95f521294f89622c0143096884
1.2 kB
Hive1_12_06_2018_QueenBee_H1_audio___15_30_00.wav
md5:f8d3ce9a274775b177c2db85237acbf6
76.6 MB
Hive1_12_06_2018_QueenBee_H1_audio___15_40_00.lab
495 Bytes
Hive1_12_06_2018_QueenBee_H1_audio___15_40_00.wav
md5:df5a7857a2dcb3fca8593b45c012402f
76.6 MB
Hive1_12_06_2018_QueenBee_H1_audio___16_00_00.lab
868 Bytes
Hive1_12_06_2018_QueenBee_H1_audio___16_00_00.wav
md5:ffe657b0861ab28000ba46abc1e16eae
76.6 MB
Hive1_12_06_2018_QueenBee_H1_audio___16_10_00.lab
md5:9db906eca9d09f487a0f3227f3a425e2
677 Bytes
Hive1_12_06_2018_QueenBee_H1_audio___16_10_00.wav
md5:814a2c99101742f6efdbba09869a95ef
76.6 MB
Hive1_12_06_2018_QueenBee_H1_audio___16_20_00.lab
md5:13c9a0654784760203d1a0c6e8a9acb4
334 Bytes
Hive1_12_06_2018_QueenBee_H1_audio___16_20_00.wav
76.6 MB
Hive1_12_06_2018_QueenBee_H1_audio___16_30_00.lab
md5:d9b083f4e3fe00dddd1810340240ce1d
1.3 kB
Hive1_12_06_2018_QueenBee_H1_audio___16_30_00.wav
md5:dfd83a1b4ee5fee33a27e80c9786fbaf
76.6 MB
Hive1_12_06_2018_QueenBee_H1_audio___16_40_00.lab
md5:b0ab908c04a713bba80531d84d891c0a
680 Bytes
Hive1_12_06_2018_QueenBee_H1_audio___16_40_00.wav
md5:ac3d907a6abcb1669aa04e8a38fd7e98
76.6 MB
Hive1_12_06_2018_QueenBee_H1_audio___16_50_00.lab
md5:a321cc3d254a35615f197b2154602ffc
372 Bytes
Hive1_12_06_2018_QueenBee_H1_audio___16_50_00.wav
md5:8029f8d923d45fdf7d3c0ea2397b278e
76.6 MB
Hive1_12_06_2018_QueenBee_H1_audio___17_00_00.lab
md5:67244b6cbd79d47b79290c3779444cf9
435 Bytes
Hive1_12_06_2018_QueenBee_H1_audio___17_00_00.wav
76.2 MB
Hive1_31_05_2018_NO_QueenBee_H1_audio___15_00_00.lab
md5:f99b55b4b97d533e197f1ae69e89b7f7
1.7 kB
Hive1_31_05_2018_NO_QueenBee_H1_audio___15_00_00.wav
md5:96879187a1999a6f6b8e1c9b859e01c3
75.8 MB
Hive1_31_05_2018_NO_QueenBee_H1_audio___15_10_00.lab
md5:fb417ce061cde2691dfeb5611bc980c5
748 Bytes
Hive1_31_05_2018_NO_QueenBee_H1_audio___15_10_00.wav
md5:cf1e9a76bbcdb00d2154a41034da2333
76.2 MB
Hive1_31_05_2018_NO_QueenBee_H1_audio___15_20_00.lab
md5:9322a92042a2a5dfa296859ca9b25b3f
581 Bytes
Hive1_31_05_2018_NO_QueenBee_H1_audio___15_20_00.wav
md5:3f54b05c6c1ae222c3349691bc9338ee
75.4 MB
Hive1_31_05_2018_NO_QueenBee_H1_audio___15_30_00.lab
md5:2d59cfb9e302ca9dcd828e4f32ee7e75
1.8 kB
Hive1_31_05_2018_NO_QueenBee_H1_audio___15_30_00.wav
76.6 MB
Hive1_31_05_2018_NO_QueenBee_H1_audio___15_40_00.lab
md5:62f6ec25fe641811bc79a4288532ac9b
2.6 kB
Hive1_31_05_2018_NO_QueenBee_H1_audio___15_40_00.wav
md5:70008431a366a0a35a49347fa498bec0
76.6 MB
Hive3_12_07_2017_NO_QueenBee_H3_audio___15_00_00.lab
md5:726c51d23ac47cd092760207887284ac
771 Bytes
Hive3_12_07_2017_NO_QueenBee_H3_audio___15_00_00.wav
md5:39b2ef68c170ebc6715449f631e07b54
103.6 MB
Hive3_12_07_2017_NO_QueenBee_H3_audio___15_10_00.lab
md5:90b0c7e7f706801df521487f392f654c
846 Bytes
Hive3_12_07_2017_NO_QueenBee_H3_audio___15_10_00.wav
md5:d1bd96e49d6eb73ed56657d54ce64d24
102.0 MB
Hive3_12_07_2017_NO_QueenBee_H3_audio___15_20_00.lab
md5:614da974e0d91697aa5c8f1e913d8243
773 Bytes
Hive3_12_07_2017_NO_QueenBee_H3_audio___15_20_00.wav
md5:d1db7f03f06a6f13a281e2c964b52d08
103.2 MB
Hive3_12_07_2017_NO_QueenBee_H3_audio___15_30_00.lab
md5:8b2c08a5d3d92957862e41861c07fbdd
608 Bytes
Hive3_12_07_2017_NO_QueenBee_H3_audio___15_30_00.wav
103.2 MB
Hive3_12_07_2017_NO_QueenBee_H3_audio___15_40_00.lab
md5:63fdd2d1cf996b15f9201a8827280f24
264 Bytes
Hive3_12_07_2017_NO_QueenBee_H3_audio___15_40_00.wav
103.6 MB
Hive3_12_07_2017_NO_QueenBee_H3_audio___15_50_00.lab
md5:310edfa49d5d30e789f2f5c9a36b3f38
297 Bytes
Hive3_12_07_2017_NO_QueenBee_H3_audio___16_00_00.lab
md5:825f902b9844a9e5f8f3a53785543b1d
457 Bytes
Hive3_12_07_2017_NO_QueenBee_H3_audio___16_10_00.lab
md5:c9c94a3b2cd16f9b6b3f8cf28c918892
579 Bytes
Hive3_12_07_2017_NO_QueenBee_H3_audio___16_20_00.lab
md5:25d5a16b19425ab2eeae7fd15ea54a54
663 Bytes
Hive3_12_07_2017_NO_QueenBee_H3_audio___16_20_00.wav
md5:09b8dbe04c2fc1cbf5e0c5203a27d3bf
104.0 MB
Hive3_12_07_2017_NO_QueenBee_H3_audio___16_30_00.lab
305 Bytes
Hive3_12_07_2017_NO_QueenBee_H3_audio___16_30_00.wav
md5:751cf921a4cf79157101b78784c77d32
104.4 MB
Hive3_12_07_2017_NO_QueenBee_H3_audio___16_40_00.lab
md5:bf9bc92079d71b11f20431050272d980
576 Bytes
Hive3_12_07_2017_NO_QueenBee_H3_audio___16_40_00.wav
md5:9c65d6fa23ae86397b6c857c8337e7d8
103.6 MB
Hive3_12_07_2017_NO_QueenBee_H3_audio___16_50_00.lab
md5:2a9f207a5ee6180ba97a8b187b43b288
728 Bytes
Hive3_12_07_2017_NO_QueenBee_H3_audio___16_50_00.wav
md5:2d8a77269e8d4c54f15c8fb727695d43
103.6 MB
Hive3_12_07_2017_NO_QueenBee_H3_audio___17_00_00.lab
md5:d2fa29342d208e429a9d5260455ee881
459 Bytes
Hive3_12_07_2017_NO_QueenBee_H3_audio___17_00_00.wav
md5:820c6620c14910d179b1921762a31ab0
104.4 MB
Hive3_15_07_2017_NO_QueenBee_H3_audio___06_10_00.lab
md5:07091eaf23fcf8de430b2906ce3c3196
550 Bytes
Hive3_15_07_2017_NO_QueenBee_H3_audio___06_10_00.wav
md5:a588c9c2c1ce71344286a77251e964a8
104.4 MB
Hive3_15_07_2017_NO_QueenBee_H3_audio___06_20_00.lab
md5:e23b13fde3f8526e24aa0572d2176340
513 Bytes
Hive3_15_07_2017_NO_QueenBee_H3_audio___06_20_00.wav
md5:9e9b22a2b2bdb9ca52b9e08a9146a85e
104.4 MB
Hive3_15_07_2017_NO_QueenBee_H3_audio___06_30_00.lab
md5:f59021d2dd0ce893a97647e56779aa76
526 Bytes
Hive3_15_07_2017_NO_QueenBee_H3_audio___06_30_00.wav
md5:4d850d26101eec13b5a7a2e07dcda1ca
104.0 MB
Hive3_15_07_2017_NO_QueenBee_H3_audio___06_40_00.lab
md5:8e14b046d8e5152b94fce28a3fb2de6c
745 Bytes
Hive3_15_07_2017_NO_QueenBee_H3_audio___06_40_00.wav
md5:d908764485e6fabd3d9d4f4b9e0f318d
102.4 MB
Hive3_15_07_2017_NO_QueenBee_H3_audio___06_50_00.lab
md5:381246831497532585afbb715179dba4
853 Bytes
Hive3_15_07_2017_NO_QueenBee_H3_audio___06_50_00.wav
md5:976473091af608b0fa34c9f573cf5d6a
104.0 MB
Hive3_15_07_2017_NO_QueenBee_H3_audio___07_00_00.lab
md5:47bd76d7a594aab8982c1b875c0eaf37
577 Bytes
Hive3_15_07_2017_NO_QueenBee_H3_audio___07_00_00.wav
md5:f1a35afd9b9a3a4385f49777061d1c00
103.2 MB
Hive3_20_07_2017_QueenBee_H3_audio___06_10_00.lab
599 Bytes
Hive3_20_07_2017_QueenBee_H3_audio___06_10_00.wav
md5:bb5b70ed826b4e30a6f6a3c8594b24f5
104.4 MB
Hive3_20_07_2017_QueenBee_H3_audio___06_20_00.lab
910 Bytes
Hive3_20_07_2017_QueenBee_H3_audio___06_20_00.wav
md5:b6f57051366cdbee631a2107ed8b82d3
104.4 MB
Hive3_20_07_2017_QueenBee_H3_audio___06_30_00.lab
md5:5c2f5d65a62d868be9bfc9c71c06a7fa
497 Bytes
Hive3_20_07_2017_QueenBee_H3_audio___06_40_00.lab
md5:9e43341b260b9f1bafe145094c112ce3
686 Bytes
Hive3_20_07_2017_QueenBee_H3_audio___06_50_00.lab
md5:650d61c0ab90dd47d2884a02120b5fce
549 Bytes
Hive3_20_07_2017_QueenBee_H3_audio___07_00_00.lab
md5:3e6f16a7eb88c9f070b87559e4d08ca4
484 Bytes
Hive3_20_07_2017_QueenBee_H3_audio___15_00_00.lab
md5:80468a4dd956e67315031c1541ef5f73
64 Bytes
Hive3_20_07_2017_QueenBee_H3_audio___15_10_00.lab
md5:902022b68b3f6887d84d7301cb7337eb
64 Bytes
Hive3_20_07_2017_QueenBee_H3_audio___15_20_00.lab
md5:c0e0e110669662fdecede7367efc9355
99 Bytes
Hive3_20_07_2017_QueenBee_H3_audio___15_30_00.lab
md5:14623ab7c0fcd5fc3e96b7bda1804a09
415 Bytes
Hive3_20_07_2017_QueenBee_H3_audio___15_40_00.lab
md5:6dd02be08db4a83edbc22159b7b0f1c1
215 Bytes
Hive3_20_07_2017_QueenBee_H3_audio___15_50_00.lab
md5:828100c4231d12d947c356bb7c249ec9
64 Bytes
Hive3_20_07_2017_QueenBee_H3_audio___16_00_00.lab
md5:9faac5d41c6b9f019043ed0089df5c07
585 Bytes
Hive3_20_07_2017_QueenBee_H3_audio___16_10_00.lab
md5:43df1750e0533a3bcf401f2aee57355a
220 Bytes
Hive3_20_07_2017_QueenBee_H3_audio___16_20_00.lab
md5:66e4ac173582918bdf868ae98b77e718
179 Bytes
Hive3_20_07_2017_QueenBee_H3_audio___16_30_00.lab
md5:b3dde08fc024d4e123522fdef54846ff
457 Bytes
Hive3_20_07_2017_QueenBee_H3_audio___16_40_00.lab
md5:68926f7b65a05d11e3d6fa7ebc4d194d
80 Bytes
Hive3_20_07_2017_QueenBee_H3_audio___16_50_00.lab
md5:6b5117f0208519c4842da5f54b9a74d3
80 Bytes
Hive3_20_07_2017_QueenBee_H3_audio___17_00_00.lab
md5:e1249d358d702a86197e7b398d5516a0
289 Bytes
Sound Inside a Swarming Bee Hive -25 to -15 minutes-sE02T8B2LfA.lab
md5:cca8ac6eba7f9357e344c0d0e509ee56
980 Bytes
Sound Inside a Swarming Bee Hive +5 to +15 minutes-BIZx-8kLrdw.lab
md5:1ac73db76e2aa12424a67ed28e7c1d4a
788 Bytes
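
The md5 values in the listing above can be used to check that a downloaded file is intact. A minimal sketch using Python's standard hashlib module follows; the helper names are illustrative, and the "md5:" prefix handling simply mirrors how the values are written in the listing.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: Path, listed: str) -> bool:
    """Compare a file against a listed value such as 'md5:de08c3a8...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of(path) == expected
```

For example, matches_listing(Path("beeAnnotations.mlf"), "md5:de08c3a889049e240c06a7f048f20b09") should return True for an uncorrupted download of that file.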