Published January 14, 2021 | Version 1.0.0
Dataset Open

Social networks predict the life and death of honey bees - Data

  • 1. Freie Universität Berlin

Description

Interaction matrices and metadata used in "Social networks predict the life and death of honey bees"

Preprint: Social networks predict the life and death of honey bees

See the README file in bb_network_decomposition for example code.

The following files are included:

interaction_networks_20160729to20160827.h5

The social interaction networks as a dense tensor and metadata.

Keys:

  • interactions: Tensor of shape (29, 2010, 2010, 9) (days x individuals x individuals x interaction_types). I_{d,i,j,t} = log(1 + x), where x is the number of interactions of type t between individuals i and j on recording day d. See the methods section of the paper for the interaction types.
  • labels: Names of the 9 interaction types in the order they are stored in the interactions tensor.
  • bee_ids: List of length 2010, mapping from the sequential index used in the interaction tensor to the original BeesBook tag ID of the individual.
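
Because the tensor stores log(1 + x) rather than raw counts, the counts have to be recovered before summing or comparing them. The following is a minimal sketch, not the authors' code: it assumes the logarithm is the natural log (the description does not state the base), that h5py and NumPy are installed, and that the three keys can be read as plain datasets. The function names log_to_counts and load_day are hypothetical.

```python
import numpy as np


def log_to_counts(log_interactions):
    """Invert I = log(1 + x) to recover the interaction counts x.

    Assumes a natural logarithm; the dataset description does not name the base.
    """
    return np.expm1(log_interactions)


def load_day(path, day):
    """Read one recording day, shaped (individuals, individuals, interaction_types).

    Reading a single day avoids loading the full dense tensor into memory.
    """
    import h5py  # imported lazily; only needed when reading the real file

    with h5py.File(path, "r") as f:
        day_slice = f["interactions"][day]  # slices on disk, reads one day only
        labels = f["labels"][()]            # may be byte strings, depending on how they were stored
        bee_ids = f["bee_ids"][()]
    return day_slice, labels, bee_ids


# Synthetic stand-in for a 2 x 2 slice of one interaction type (invented values).
counts = np.array([[0.0, 3.0], [3.0, 0.0]])
stored = np.log1p(counts)          # what the file would hold under the assumption above
recovered = log_to_counts(stored)  # back to raw counts
```

With the real file, load_day("interaction_networks_20160729to20160827.h5", 0) would return the first recording day, and bee_ids[i] would give the BeesBook tag ID for row and column i of that slice.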

alive_bees_bayesian.csv

This file contains the results of the Bayesian lifetime model with one row for each bee.

Columns:

  • bee_id: Numerical unique identifier for each individual.
  • days_alive: Number of days the bee was determined to be alive. If the individual was still alive at the end of the recording, the number of days from the day she hatched until the end of the recording.
  • death_observed: Boolean indicator whether the death occurred during the recording period.
  • annotated_tagged_date: Hatch date of the individual, i.e. the date she was tagged.
  • inferred_death_date: The death date as determined by the model.
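
Since days_alive is only a lower bound for bees that survived the recording, rows with and without an observed death usually need to be handled separately (right-censoring). The sketch below separates the two groups using only the standard library. The sample rows are invented for illustration, and the textual encoding of death_observed ("True"/"False") and the empty inferred_death_date for a surviving bee are assumptions about the CSV, not something the description specifies.

```python
import csv
import io

# Invented sample rows in the documented column layout (not real data).
SAMPLE = """bee_id,days_alive,death_observed,annotated_tagged_date,inferred_death_date
101,23,True,2016-07-20,2016-08-12
102,38,False,2016-07-20,
"""


def split_by_censoring(rows):
    """Return (observed, censored) lists of days_alive values.

    Assumes death_observed is written as "True"/"False" (or "1"/"0").
    """
    observed, censored = [], []
    for row in rows:
        days = float(row["days_alive"])
        if row["death_observed"].strip().lower() in ("true", "1"):
            observed.append(days)  # exact lifetime
        else:
            censored.append(days)  # lower bound: still alive at end of recording
    return observed, censored


observed, censored = split_by_censoring(csv.DictReader(io.StringIO(SAMPLE)))
```

For the real file, the io.StringIO object would be replaced by an open handle on alive_bees_bayesian.csv.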

bee_daily_data.csv

This file contains one row per bee for each day that she was alive during the focal period.

Columns:

  • bee_id: Numerical unique identifier for each individual.
  • date: Date in year-month-day format.
  • age: Age in days. Can be NaN if the bee has no associated death_date.
  • network_age, network_age_1, network_age_2: The first three dimensions of network age.
  • dance_floor, honey_storage, near_exit, brood_area_total: Proportion of the bee's time in each labeled area, normalized (sum to 1). Can be NaN if a bee had no high confidence detections (>0.9) for a given day. Can be 0 if a bee was only seen outside of the annotated areas.
  • location_descriptor_count: The number of minutes the bee was seen in one of the location labels during that day. For example, dance_floor * location_descriptor_count gives the number of minutes the bee was seen on the dance floor on the given day.
  • death_date: Date the bee was last seen in the colony in year-month-day format. Can be NaN for individuals that did not die until the end of the recording period.
  • circadian_rhythm: R² value of a sine with a period of one day fitted to the velocity data of the individual over three days. Can be NaN if the fit did not converge due to a lack of data points.
  • velocity_peak_time: Phase of the circadian sine fit in hours as an offset to 12:00 UTC. Can be NaN if circadian_rhythm is NaN.
  • velocity_day, velocity_night: Mean velocity of the individual between 09:00-18:00 UTC and 21:00-06:00 UTC, respectively. Can be NaN if no velocity data was available for that interval.
  • days_left: Difference in days between date and death_date. Can be NaN if death_date is NaN.
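
Two of the columns above are simple derivations from others, which the following sketch reproduces with the standard library. The row values are invented, the function names are hypothetical, and the sign convention for days_left (death_date minus date, so that it counts down to zero) is an assumption, since the description only says "difference".

```python
from datetime import date


def minutes_in_area(fraction, location_descriptor_count):
    """Convert a normalized area value (e.g. dance_floor) back to minutes."""
    return fraction * location_descriptor_count


def days_left(day, death_date):
    """Days from `day` until `death_date`; None if the death date is missing.

    Assumes days_left = death_date - date (an assumption about the sign).
    """
    if death_date is None:
        return None
    return (date.fromisoformat(death_date) - date.fromisoformat(day)).days


# Invented example row in the documented column layout (not real data).
row = {
    "date": "2016-08-13",
    "death_date": "2016-08-20",
    "dance_floor": 0.25,
    "location_descriptor_count": 480,
}

dance_minutes = minutes_in_area(row["dance_floor"], row["location_descriptor_count"])
remaining = days_left(row["date"], row["death_date"])
```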

location_data.csv

This file contains subsampled position information for all bees during the focal period. The data contains one row for every individual for every minute of the recording if that individual was seen at least once during that minute with a tag confidence of at least 0.9. The first matching detection for each individual is used.

Columns:

In addition to the bee_id and date columns as in the bee_daily_data.csv, the file contains these additional columns:

  • cam_id, cams: The cam_id is a numerical identifier from {0, 1, 2, 3}. Each side of the hive is filmed by two cameras, where {0, 1} and {2, 3} each record one side. The cams column contains either “(0, 1)” or “(2, 3)” and indicates which side of the hive the detection belongs to.
  • x_pos_hive, y_pos_hive: The spatial positions in millimeters on the hive. The two cameras from one side share a common coordinate system.
  • location: The label that was assigned to the comb at (x_pos_hive, y_pos_hive) on the given date. The label “other” indicates detections that were outside of any annotated region. The label “not_comb” indicates the wooden frame or empty space around the comb.
  • timestamp, date: The timestamp indicates the beginning of each one-minute sampling interval and is given in UTC, as indicated (example: “2016-08-13 00:00:00+00:00”). The date part of the timestamp is repeated in the “date” column. Both are given in year-month-day format.
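
Because each row stands for one minute in which a bee was seen, counting rows per bee, day, and location label gives the minutes spent under each label. The sketch below illustrates this with the standard library. The rows are invented, the label "dance_floor" is assumed to appear in the location column under that spelling, and the result is not claimed to reproduce the normalized columns of bee_daily_data.csv exactly.

```python
from collections import Counter
from datetime import datetime

# Invented (bee_id, timestamp, location) rows in the documented timestamp format.
rows = [
    (101, "2016-08-13 00:00:00+00:00", "dance_floor"),
    (101, "2016-08-13 00:01:00+00:00", "dance_floor"),
    (101, "2016-08-13 00:02:00+00:00", "not_comb"),
    (101, "2016-08-14 00:00:00+00:00", "other"),
]


def minutes_per_location(rows):
    """Count one-minute samples per (bee_id, UTC date) and location label."""
    counts = {}
    for bee_id, timestamp, location in rows:
        ts = datetime.fromisoformat(timestamp)  # timezone-aware, UTC offset kept
        key = (bee_id, ts.date().isoformat())
        counts.setdefault(key, Counter())[location] += 1
    return counts


per_day = minutes_per_location(rows)
```

Dividing each count by the day's total would give per-label proportions. Which labels belong in that total is a choice the description leaves open.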

Software used to acquire and analyze the data:

 

Files

Files (849.8 MB)

Checksum Size
md5:a3b4426d9762aeb406cfc22e85d9308c 100.2 kB
md5:63eaba3bfbb4ab712bb8bf3313c06f7a 6.7 MB
md5:e0c8609ea57b75e2345f2c7f47b92724 658.5 MB
md5:beed2acd319f752b0f891c6705ba7816 184.5 MB
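
The MD5 checksums listed above can be used to confirm that a download completed intact. A minimal sketch using Python's hashlib follows. The helper names md5_of_file and matches_listing are hypothetical, and the file is read in chunks so that the larger files do not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed[4:] if listed.startswith("md5:") else listed
    return md5_of_file(path) == expected.lower()
```

For example, matches_listing("alive_bees_bayesian.csv", "md5:...") would be called with whichever listed checksum belongs to that file.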