Descriptive analysis

Author

Johannes Zauner

Preface

This is a work-in-progress descriptive analysis of the BaezaEtAl2025 dataset.

Overview

Data import: wearable data

The first step is the import of wearable data from the head position (mounted on glasses).

#time zone of madrid
tz <-  "Europe/Madrid"
#coordinates for Madrid
coordinates <- c(40.4165, -3.70256)
#regex to extract participant Id and wearing position
# pattern <- "[A-Z]+_S[0-9]{3}_[hcw]"
#regex to extract participant Id
pattern <- "[A-Z]+_S[0-9]{3}"

country_colors <- c(
  Sweden     = "#88CCEE",  # Sky blue
  Spain      = "#CC6677",  # Coral red
  Germany    = "#DDCC77",  # Mustard yellow
  Netherlands= "#117733",  # Dark green
  Turkey     = "#332288",  # Indigo
  Ghana      = "#AA4499",  # Purple-pink
  Costa_Rica = "#44AA99"   # Teal
)
#path to participants
path_part1 <- "../data/raw/individual"
#path to actlumus data sans wearing position
path_part2 <- "/continuous/actlumus_"
#wearing position
wearing_position <- "head"
#getting all subfolders
folders <- dir(path_part1)
#creating complete folder names
paths <- glue("{path_part1}/{folders}{path_part2}{wearing_position}")
#collecting file names
files <- list.files(paths, full.names = TRUE)
files <- files[str_detect(files, "Report", negate = TRUE)]
#there remain some early data from a pilot collection. these will be removed
data <- import$ActLumus(files, tz, auto.id = pattern, dst_adjustment = TRUE, not.before = "2024-08-01")

Successfully read in 1'363'859 observations across 23 Ids from 23 ActLumus-file(s).
Timezone set is Europe/Madrid.
The system timezone is Europe/Berlin. Please correct if necessary!
Observations in the following 2 file(s) and 2 Id(s) cross to or from daylight savings time (DST): 
File: FUSPCEU_S007_h_actlumus_Log_4199_20241028181516382, Group:FUSPCEU_S007
File: FUSPCEU_S008_h_actlumus_Log_3937_20241028191252892, Group:FUSPCEU_S008
The Datetime column was adjusted in these files. For more info on what that entails see `?dst_change_handler`.

First Observation: 2024-10-07 14:58:25
Last Observation: 2025-02-12 13:37:25
Data from before 2024-08-01 were not imported. Adjust with `not.before` if needed. 
Timespan: 128 days

Observation intervals: 
   Id           interval.time     n pct  
 1 FUSPCEU_S003 10s           59695 100% 
 2 FUSPCEU_S003 14s               1 0%   
 3 FUSPCEU_S004 10s           60041 100% 
 4 FUSPCEU_S004 11s               1 0%   
 5 FUSPCEU_S004 17s               1 0%   
 6 FUSPCEU_S005 10s           60080 100% 
 7 FUSPCEU_S006 10s           60333 100% 
 8 FUSPCEU_S007 10s           62713 100% 
 9 FUSPCEU_S008 10s           62931 100% 
10 FUSPCEU_S009 10s           66865 100% 
# ℹ 17 more rows

Regularizing data

In the first step, we will trim the data by the study time.

#import table with study times
Study_dates <- read_excel("../data/Study_dates_MeLiDos_FUSPCEU.xlsx")
#gather the important information
Study_dates <-
  Study_dates |> 
    rename(Id = subjectID_device, start = datetime_trial_start, end = datetime_trial_end) |> 
    select(Id, start, end) |> 
    mutate(across(c(start, end), \(x) force_tz(x, tz)),
           trial = TRUE) |> 
    filter(str_detect(Id, "_h$")) |>
    mutate(Id = str_remove(Id, "_h$")) |> 
  group_by(Id)

#add the trim information to the dataset and filter by it
data <- 
  data |> 
  add_states(Study_dates) |> 
  dplyr::filter(trial) |> 
  select(-trial)

data |> gg_overview()

data |> has_gaps()
[1] FALSE
data |> has_irregulars()
[1] FALSE
data |> gg_gaps(group.by.days = TRUE, show.irregulars = TRUE, full.days = FALSE)
No gaps nor irregular values were found. Plot creation skipped

No irregular data or gaps were found.

data_cleaned <- data 
data_cleaned |> gap_table(MEDI) |> cols_hide(ends_with("_n"))
Summary of available and missing data
Variable: melanopic EDI
Data
Missing
Regular
Irregular
Range
Interval
Gaps
Implicit
Explicit
Time % n1,2 Time Time N ø Time % Time % Time %
Overall 23w 1d 2h 3m 30s 87.1%3 0 26w 4d 2h 10; 60 46 1w 4d 23h 58m 15s 3w 2d 23h 56m 30s 12.9%3 3w 2d 23h 56m 30s 12.9%3 0s 0.0%3
FUSPCEU_S003
6d 21h 24m 86.1% 0 1w 1d 10s 2 13h 18m 1d 2h 36m 13.9% 1d 2h 36m 13.9% 0s 0.0%
FUSPCEU_S004
6d 22h 5m 86.5% 0 1w 1d 10s 2 12h 57m 30s 1d 1h 55m 13.5% 1d 1h 55m 13.5% 0s 0.0%
FUSPCEU_S005
6d 22h 53m 10s 86.9% 0 1w 1d 10s 2 12h 33m 25s 1d 1h 6m 50s 13.1% 1d 1h 6m 50s 13.1% 0s 0.0%
FUSPCEU_S006
6d 23h 21m 30s 87.2% 0 1w 1d 10s 2 12h 19m 15s 1d 38m 30s 12.8% 1d 38m 30s 12.8% 0s 0.0%
FUSPCEU_S007
1w 6h 12m 20s 90.3% 0 1w 1d 1h 10s 2 9h 23m 50s 18h 47m 40s 9.7% 18h 47m 40s 9.7% 0s 0.0%
FUSPCEU_S008
1w 6h 48m 40s 90.6% 0 1w 1d 1h 10s 2 9h 5m 40s 18h 11m 20s 9.4% 18h 11m 20s 9.4% 0s 0.0%
FUSPCEU_S009
1w 12h 8m 83.4% 0 1w 2d 10s 2 17h 56m 1d 11h 52m 16.6% 1d 11h 52m 16.6% 0s 0.0%
FUSPCEU_S010
4d 15h 55m 77.7% 0 6d 10s 2 16h 2m 30s 1d 8h 5m 22.3% 1d 8h 5m 22.3% 0s 0.0%
FUSPCEU_S011
6d 23h 10m 87.1% 0 1w 1d 10s 2 12h 25m 1d 50m 12.9% 1d 50m 12.9% 0s 0.0%
FUSPCEU_S012
1w 1h 37m 20s 88.3% 0 1w 1d 10s 2 11h 11m 20s 22h 22m 40s 11.7% 22h 22m 40s 11.7% 0s 0.0%
FUSPCEU_S013
6d 22h 44m 86.8% 0 1w 1d 10s 2 12h 38m 1d 1h 16m 13.2% 1d 1h 16m 13.2% 0s 0.0%
FUSPCEU_S014
6d 1h 16m 86.5% 0 1w 60s (~1 minutes) 2 11h 22m 22h 44m 13.5% 22h 44m 13.5% 0s 0.0%
FUSPCEU_S015
6d 22h 2m 86.5% 0 1w 1d 10s 2 12h 59m 1d 1h 58m 13.5% 1d 1h 58m 13.5% 0s 0.0%
FUSPCEU_S016
6d 23h 21m 87.2% 0 1w 1d 10s 2 12h 19m 30s 1d 39m 12.8% 1d 39m 12.8% 0s 0.0%
FUSPCEU_S017
6d 23h 24m 87.2% 0 1w 1d 10s 2 12h 18m 1d 36m 12.8% 1d 36m 12.8% 0s 0.0%
FUSPCEU_S018
6d 22h 6m 86.5% 0 1w 1d 10s 2 12h 57m 1d 1h 54m 13.5% 1d 1h 54m 13.5% 0s 0.0%
FUSPCEU_S019
6d 23h 50m 87.4% 0 1w 1d 10s 2 12h 5m 1d 10m 12.6% 1d 10m 12.6% 0s 0.0%
FUSPCEU_S020
1w 2h 40m 88.9% 0 1w 1d 10s 2 10h 40m 21h 20m 11.1% 21h 20m 11.1% 0s 0.0%
FUSPCEU_S021
6d 22h 9m 50s 86.5% 0 1w 1d 10s 2 12h 55m 5s 1d 1h 50m 10s 13.5% 1d 1h 50m 10s 13.5% 0s 0.0%
FUSPCEU_S022
6d 23h 9m 87.1% 0 1w 1d 10s 2 12h 25m 30s 1d 51m 12.9% 1d 51m 12.9% 0s 0.0%
FUSPCEU_S023
6d 20h 4m 40s 85.5% 0 1w 1d 10s 2 13h 57m 40s 1d 3h 55m 20s 14.5% 1d 3h 55m 20s 14.5% 0s 0.0%
FUSPCEU_S024
1w 1d 23h 48m 89.9% 0 1w 3d 10s 2 12h 6m 1d 12m 10.1% 1d 12m 10.1% 0s 0.0%
FUSPCEU_S025
1w 1d 23h 54m 90.0% 0 1w 3d 10s 2 12h 3m 1d 6m 10.0% 1d 6m 10.0% 0s 0.0%
1 If n > 0: it is possible that the other summary statistics are affected, as they are calculated based on the most prominent interval.
2 Number of (missing or actual) observations
3 Based on times, not necessarily number of observations

Visualization

data_cleaned |> 
  mutate(Id = Id |> fct_relabel(\(x) str_remove(x, "FUSPCEU_"))) |> 
grand_overview(coordinates, "Madrid", "Spain", country_colors, photoperiod_sequence = 1)

ggsave("../output/figures/Figure_1.png", width = 17, height = 10, scale = 2, units = "cm")
table_summary <-
light_summary_table(
  data_cleaned, coordinates, "Madrid", "Spain", country_colors["Spain"],
  histograms = TRUE
)

table_summary
Summary table
Madrid, Spain (40.4°N, 3.7°E), TZ: Europe/Madrid
Overview
Participants
23
Participant-days
186
Days ≥80% complete
141
Missing/irregular
12.90% (9.42% - 22.28%)
Photoperiod
11h 7m (10h 19m - 12h 23m) 1 
Metrics2
Dose D (lx·h) 5,908 ±8,341 (0 - 52,104)
Duration above 250 lx TAT250 2h 48m ±2h 11m (0s - 7h 47m)
Duration within 1-10 lx TWT1–10 1h 59m ±1h 32m (0s - 7h 43m)
Duration below 1 lx TBT1 12h 24m ±3h 16m (7h 37m - 1d)
Period above 250 lx PAT250 32m 58s ±30m 35s (0s - 2h 47m)
Duration above 1000 lx TAT1000 37m 2s ±46m 5s (0s - 4h 53m)
First timing above 250 lx FLiT250 09:36 ±02:34 (00:05 - 15:31) 1 
Mean timing above 250 lx MLiT250 13:48 ±01:32 (08:32 - 17:43) 1 
Last timing above 250 lx LLiT250 19:09 ±02:31 (12:22 - 23:55) 1 
Brightest 10h midpoint M10midpoint 14:09 ±02:19 (04:59 - 18:59) 1 
Darkest 5h midpoint L5midpoint 03:16 ±03:11 (01:43 - 22:59) 1 
Brightest 10h mean3 M10mean (lx) 113.7 ±118.0 (0.0 - 631.8)
Darkest 5h mean3 L5mean (lx) 0.0 ±0.0 (0.0 - 0.0)
Interdaily stability IS 0.329 ±0.096 (0.182 - 0.544)
Intradaily variability IV 1.288 ±0.374 (0.475 - 1.962)
values show: mean ±sd (min - max) and are all based on measurements of melanopic EDI (lx)
1 Histogram limits are set from 00:00 to 24:00
2 Metrics are calculated on a by-participant-day basis (n=141) with the exception of IV and IS, which are calculated on a by-participant basis (n=23).
3 Values were log 10 transformed prior to averaging, with an offset of 0.1, and backtransformed afterwards
gtsave(table_summary, here("output/tables/table_summary.png"), vwidth = 860)
gtsave(table_summary, here("output/tables/table_summary.pdf"))
gtsave(table_summary |> cols_hide(c(plot)), here("output/tables/table_summary.docx"))