Social networks predict the life and death of honey bees - Data
Description
Interaction matrices and metadata used in "Social networks predict the life and death of honey bees"
Preprint: Social networks predict the life and death of honey bees
See the README file in bb_network_decomposition for example code.
The following files are included:
interaction_networks_20160729to20160827.h5
The social interaction networks as a dense tensor and metadata.
Keys:
- interactions: Tensor of shape (29, 2010, 2010, 9) (days x individuals x individuals x interaction_types). I_{d,i,j,t} = log(1 + x), where x is the number of interactions of type t between individuals i and j on recording day d. See the methods section of the paper for a description of the interaction types.
- labels: Names of the 9 interaction types in the order they are stored in the interactions tensor.
- bee_ids: List of length 2010, mapping from the sequential index used in the interaction tensor to the original BeesBook tag ID of the individual.
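The following is a minimal sketch of how the tensor might be read and converted back to raw counts. It assumes that the three keys are stored as plain HDF5 datasets, that the logarithm is the natural logarithm, and that h5py is installed; none of this is confirmed by the description above. The function names are hypothetical and not part of bb_network_decomposition.

```python
import numpy as np


def counts_from_log(log_values):
    """Invert I = log(1 + x) to recover the interaction counts x.

    Assumes the natural logarithm was used when the tensor was written.
    """
    return np.expm1(np.asarray(log_values, dtype=float))


def tensor_index(bee_ids, tag_id):
    """Return the sequential tensor index that belongs to a BeesBook tag ID."""
    matches = np.flatnonzero(np.asarray(bee_ids) == tag_id)
    if matches.size == 0:
        raise KeyError(f"tag ID {tag_id} not found in bee_ids")
    return int(matches[0])


def load_day(path, day):
    """Read one day's (individuals x individuals x types) slice from the file."""
    import h5py  # imported lazily so the helpers above work without h5py

    with h5py.File(path, "r") as f:
        # Slicing the dataset reads only the requested day, not the full tensor.
        interactions = f["interactions"][day]
        bee_ids = f["bee_ids"][:]
        labels = f["labels"][:]  # may come back as bytes, depending on how it was saved
    return interactions, bee_ids, labels
```

Reading one day at a time keeps memory use manageable: the full tensor has 29 x 2010 x 2010 x 9 (roughly one billion) entries. A call such as load_day("interaction_networks_20160729to20160827.h5", 0) would return an array of shape (2010, 2010, 9) if the layout matches these assumptions.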
alive_bees_bayesian.csv
This file contains the results of the Bayesian lifetime model with one row for each bee.
Columns:
- bee_id: Numerical unique identifier for each individual.
- days_alive: Number of days the bee was determined to be alive. If the individual was still alive at the end of the recording, the number of days from the day she hatched until the end of the recording.
- death_observed: Boolean indicator whether the death occurred during the recording period.
- annotated_tagged_date: Hatch date of the individual, i.e. the date she was tagged.
- inferred_death_date: The death date as determined by the model.
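Because days_alive is right-censored for bees that outlived the recording, observed and censored lifetimes should usually be handled separately. The sketch below shows one way to summarize the two groups with pandas. It assumes that death_observed is parsed as a boolean (or 0/1) column; summarize_lifetimes is a hypothetical helper, not part of the published code.

```python
import pandas as pd


def summarize_lifetimes(df):
    """Split lifetimes into observed deaths and right-censored individuals."""
    # Assumes death_observed was parsed as bool or 0/1, not as text.
    observed = df["death_observed"].astype(bool)
    return {
        "n_bees": int(len(df)),
        "n_deaths_observed": int(observed.sum()),
        "n_censored": int((~observed).sum()),
        # The median over observed deaths only; censored rows are lower bounds.
        "median_days_alive_observed": float(df.loc[observed, "days_alive"].median()),
    }
```

With the file in the working directory, summarize_lifetimes(pd.read_csv("alive_bees_bayesian.csv")) would produce the summary for the full dataset.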
bee_daily_data.csv
This file contains one row per bee for each day that she was alive during the focal period.
Columns:
- bee_id: Numerical unique identifier for each individual.
- date: Date in year-month-day format.
- age: Age in days. Can be NaN if the bee has no associated death_date.
- network_age, network_age_1, network_age_2: The first three dimensions of network age.
- dance_floor, honey_storage, near_exit, brood_area_total: Normalized (sum to 1). Can be NaN if a bee had no high confidence detections (>0.9) for a given day. Can be 0 if a bee was only seen outside of the annotated areas.
- location_descriptor_count: The number of minutes the bee was seen in one of the labeled locations during that day. For example, dance_floor * location_descriptor_count gives the number of minutes the bee was seen on the dance floor on the given day.
- death_date: Date the bee was last seen in the colony in year-month-day format. Can be NaN for individuals that did not die until the end of the recording period.
- circadian_rhythm: R² value of a sine with a period of one day fitted to the velocity data of the individual over three days. Can be NaN if the fit did not converge due to a lack of data points.
- velocity_peak_time: Phase of the circadian sine fit in hours as an offset to 12:00 UTC. Can be NaN if circadian_rhythm is NaN.
- velocity_day, velocity_night: Mean velocity of the individual between 09:00-18:00 UTC and 21:00-06:00 UTC, respectively. Can be NaN if no velocity data was available for that interval.
- days_left: Difference in days between date and death_date. Can be NaN if death_date is NaN.
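Two of the columns above can be derived from others, which is useful for sanity checks. The sketch below converts the normalized location fractions into minutes and recomputes days_left. It assumes that days_left is death_date minus date (days remaining), a sign convention the description does not state explicitly. Both function names are hypothetical.

```python
import pandas as pd

LOCATION_COLUMNS = ["dance_floor", "honey_storage", "near_exit", "brood_area_total"]


def minutes_per_location(daily):
    """Convert normalized location fractions into minutes per day."""
    minutes = daily[LOCATION_COLUMNS].multiply(
        daily["location_descriptor_count"], axis=0
    )
    return minutes.add_suffix("_minutes")


def days_until_death(daily):
    """Recompute days_left as death_date minus date, in days.

    Rows without a death_date yield NaN. The sign convention is an assumption.
    """
    date = pd.to_datetime(daily["date"])
    death = pd.to_datetime(daily["death_date"])
    return (death - date).dt.days
```

Comparing days_until_death(daily) against the stored days_left column is a quick way to check whether the assumed sign convention matches the file.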
location_data.csv
This file contains subsampled position information for all bees during the focal period. The data contains one row for every individual for every minute of the recording if that individual was seen at least once during that minute with a tag confidence of at least 0.9. The first matching detection for each individual is used.
Columns:
In addition to the bee_id and date columns described for bee_daily_data.csv, the file contains these columns:
- cam_id, cams: The cam_id is a numerical identifier from {0, 1, 2, 3}. Each side of the hive is filmed by two cameras: cameras {0, 1} record one side and cameras {2, 3} record the other. The cams column contains either “(0, 1)” or “(2, 3)” and indicates to which side of the hive this detection belongs.
- x_pos_hive, y_pos_hive: The spatial positions in millimeters on the hive. The two cameras from one side share a common coordinate system.
- location: The label that was assigned to the comb at (x_pos_hive, y_pos_hive) on the given date. The label “other” indicates detections that were outside of any annotated region. The label “not_comb” indicates the wooden frame or empty space around the comb.
- timestamp, date: The timestamp indicates the beginning of each one-minute sampling interval and is given in UTC, as indicated (example: “2016-08-13 00:00:00+00:00”). The date part of the timestamp is repeated in the “date” column. Both are given in year-month-day format.
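The timestamp and camera conventions described above can be illustrated with a short sketch using only the standard library. It assumes that the cams values are stored literally as the strings "(0, 1)" and "(2, 3)" with plain quotes in the CSV; the helper names are hypothetical.

```python
from datetime import datetime, timedelta

# Assumed literal values of the cams column for each cam_id.
CAMS_BY_CAM_ID = {0: "(0, 1)", 1: "(0, 1)", 2: "(2, 3)", 3: "(2, 3)"}


def cams_for(cam_id):
    """Return the cams value (hive side) that a cam_id belongs to."""
    return CAMS_BY_CAM_ID[int(cam_id)]


def sampling_interval(timestamp):
    """Parse a timestamp and return the start and end of its one-minute interval."""
    start = datetime.fromisoformat(timestamp)  # timezone-aware, since the offset is given
    return start, start + timedelta(minutes=1)
```

For the example timestamp “2016-08-13 00:00:00+00:00”, sampling_interval returns a timezone-aware start in UTC and an end one minute later; start.date() then corresponds to the value in the date column.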
Software used to acquire and analyze the data:
- bb_network_decomposition: Network age calculation and regression analyses
- bb_pipeline: Tag localization and decoding pipeline
- bb_pipeline_models: Pretrained localizer and decoder models for bb_pipeline
- bb_binary: Raw detection data storage format
- bb_irflash: IR flash system schematics and Arduino code
- bb_imgacquisition: Recording and network storage
- bb_behavior: Database interaction and data (pre)processing, velocity calculation
- bb_circadian: Circadian rhythm calculations
- bb_tracking: Tracking of bee detections over time
- bb_wdd: Automatic detection and decoding of honey bee waggle dances
- bb_interval_determination: Homography calculation
- bb_stitcher: Image stitching