Import light wrist

Author

Johannes Zauner

Preface

This is a work-in-progress descriptive analysis of the BaezaEtAl2025 dataset.

Overview

Data import: wearable data

The first step is to import the wearable data recorded at the wrist position.

#regex to extract participant Id and wearing position
# pattern <- "[A-Z]+_S[0-9]{3}_[hcw]"
#regex to extract participant Id
pattern <- "[A-Z]+_S[0-9]{3}"
files <- filefinder("actlumus_wrist", continuous = TRUE, negate = "Report")
data <- 
  import$ActLumus(files[c(1:18, 20, 22:23)], tzs[site], auto.id = pattern,
                  dst_adjustment = TRUE)

Successfully read in 1'302'891 observations across 21 Ids from 21 ActLumus-file(s).
Timezone set is Europe/Madrid.
The system timezone is Europe/Berlin. Please correct if necessary!
Observations in the following 2 file(s) and 2 Id(s) cross to or from daylight savings time (DST): 
File: FUSPCEU_S007_w_actlumus_Log_4218_20241028180637836, Group:FUSPCEU_S007
File: FUSPCEU_S008_w_actlumus_Log_3954_20241028185757492, Group:FUSPCEU_S008
The Datetime column was adjusted in these files. For more info on what that entails see `?dst_change_handler`.

First Observation: 2024-10-07 15:19:44
Last Observation: 2025-02-12 14:10:14
Data from before 2001-01-01 were not imported. Adjust with `not.before` if needed. 
Timespan: 128 days

Observation intervals: 
   Id           interval.time     n pct  
 1 FUSPCEU_S003 10s           59713 100% 
 2 FUSPCEU_S003 17s               1 0%   
 3 FUSPCEU_S004 10s           59878 100% 
 4 FUSPCEU_S005 10s           59999 100% 
 5 FUSPCEU_S006 10s           60289 100% 
 6 FUSPCEU_S007 10s           62769 100% 
 7 FUSPCEU_S008 10s           62922 100% 
 8 FUSPCEU_S009 10s           66267 100% 
 9 FUSPCEU_S010 10s           66790 100% 
10 FUSPCEU_S011 10s           60439 100% 
# ℹ 12 more rows
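
As a quick check of what the `auto.id` patterns capture, the following base R sketch applies both regular expressions from the import chunks to one of the file names listed in the import message. It only uses `regmatches()` and `regexpr()`; how `import$ActLumus()` applies the pattern internally is not shown in this document, so this is an illustration rather than a description of the import function.

```r
# File name taken from the DST message in the import output
file_name <- "FUSPCEU_S007_w_actlumus_Log_4218_20241028180637836"

# Pattern used for the wrist import: participant Id only
pattern_id <- "[A-Z]+_S[0-9]{3}"
# Pattern used for the per-participant imports: Id plus wearing position (h, c, or w)
pattern_id_pos <- "[A-Z]+_S[0-9]{3}_[hcw]"

# Extract the first match of each pattern from the file name
id <- regmatches(file_name, regexpr(pattern_id, file_name))
id_pos <- regmatches(file_name, regexpr(pattern_id_pos, file_name))

print(id)      # "FUSPCEU_S007"
print(id_pos)  # "FUSPCEU_S007_w"
```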

Two files need further adjustment, as they contain timestamps from the years 2000 and 2008. The former is a typical issue with ActLumus devices after their battery has run empty. We will use the other two wearing positions (wrist and head) to derive a valid offset.
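
In this document the offset is found by overlaying the wearing positions in a plot and adjusting the shift by eye. As an alternative that is not the method used here, an offset between two simultaneously worn devices could be estimated with a cross-correlation. The sketch below uses synthetic data only; the 10-minute epoch, the series length, and the variable names are assumptions made for the illustration.

```r
set.seed(1)

epoch_min  <- 10   # assumed epoch of the toy series, in minutes
true_shift <- 24   # 24 epochs x 10 min = 4 h, chosen for this illustration
n <- 600

# Synthetic signal standing in for the light exposure both devices record
signal <- rnorm(n)

# `reference` has a correct clock; `shifted` shows the same signal
# `true_shift` epochs earlier in its own (wrong) timestamps
reference <- signal[1:(n - true_shift)]
shifted   <- signal[(1 + true_shift):n]

# ccf(x, y) at lag k estimates cor(x[t + k], y[t])
cc <- ccf(reference, shifted, lag.max = 48, plot = FALSE)
best_lag <- cc$lag[which.max(cc$acf)]

# Convert the lag from epochs to hours
offset_hours <- best_lag * epoch_min / 60
print(offset_hours)
```

With real light data the daily rhythm makes the correlation periodic, so the lag search would need to be limited to a plausible window, and the result should still be checked visually.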

#getting all wearing positions
files <- list.dirs("../data/raw/individual/FUSPCEU_S023/continuous", full.names = TRUE)
files <- files[str_detect(files, "actlumus")] |> list.files(full.names = TRUE)
files <- files[str_detect(files, "Report", negate = TRUE)]

#loading the data
pattern <- "[A-Z]+_S[0-9]{3}_[hcw]"
data2 <- 
  import$ActLumus(files, tzs[site], auto.id = pattern,
                  dst_adjustment = TRUE, not.before = "1970-01-01")

Successfully read in 181'415 observations across 3 Ids from 3 ActLumus-file(s).
Timezone set is Europe/Madrid.
The system timezone is Europe/Berlin. Please correct if necessary!

First Observation: 2000-01-01 00:04:11
Last Observation: 2025-02-03 13:32:52
Timespan: 9166 days

Observation intervals: 
  Id             interval.time     n pct  
1 FUSPCEU_S023_c 10s           60484 100% 
2 FUSPCEU_S023_h 10s           60453 100% 
3 FUSPCEU_S023_w 10s           60475 100% 

data2 |> 
  mutate(
    Datetime = case_when(
      Id == "FUSPCEU_S023_c" ~ Datetime + dyears(25) + ddays(27) + dhours(7) + dminutes(45),
      Id == "FUSPCEU_S023_w" ~ Datetime + dhours(4),
      .default = Datetime)
    ) |> 
  gg_day(geom = "line", aes_col = Id) |> 
  gg_photoperiod(coordinates[[site]])

data2 <- 
data2 |> 
  mutate(
    Datetime = case_when(
      Id == "FUSPCEU_S023_c" ~ Datetime + dyears(25) + ddays(27) + dhours(7) + dminutes(45),
      Id == "FUSPCEU_S023_w" ~ Datetime + dhours(4),
      .default = Datetime)
    )

#getting all wearing positions
files <- list.dirs("../data/raw/individual/FUSPCEU_S021/continuous", full.names = TRUE)
files <- files[str_detect(files, "actlumus")] |> list.files(full.names = TRUE)
files <- files[str_detect(files, "Report", negate = TRUE)]

#loading the data
pattern <- "[A-Z]+_S[0-9]{3}_[hcw]"
data3 <- 
  import$ActLumus(files, tzs[site], auto.id = pattern,
                  dst_adjustment = TRUE)

Successfully read in 135'036 observations across 3 Ids from 3 ActLumus-file(s).
Timezone set is Europe/Madrid.
The system timezone is Europe/Berlin. Please correct if necessary!

First Observation: 2008-01-08 00:38:25
Last Observation: 2025-01-27 13:35:42
Data from before 2001-01-01 were not imported. Adjust with `not.before` if needed. 
Timespan: 6230 days

Observation intervals: 
  Id             interval.time     n pct  
1 FUSPCEU_S021_c 10s           12527 100% 
2 FUSPCEU_S021_h 10s           61221 100% 
3 FUSPCEU_S021_w 10s           61285 100% 

data3 |> 
  mutate(
    Datetime = case_when(
      Id == "FUSPCEU_S021_c" ~ Datetime + dyears(17) + ddays(13) + dhours(4),
      Id == "FUSPCEU_S021_w" ~ Datetime + dhours(4),
      .default = Datetime)
    ) |> 
  gg_day(geom = "line", aes_col = Id, linewidth = 0.5) |> 
  gg_photoperiod(coordinates[[site]])+
  coord_cartesian(xlim = c(20,24)*3600)

data3 <- 
data3 |> 
  mutate(
    Datetime = case_when(
      # Id == "FUSPCEU_S021_c" ~ Datetime + dyears(17) + ddays(13) + dhours(4),
      Id == "FUSPCEU_S021_w" ~ Datetime + dhours(4),
      .default = Datetime)
    )

Taken together, both wrist recordings can be corrected with a 4-hour time shift.
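
For readers without lubridate at hand, the same kind of shift can be sketched in base R, where adding seconds to a `POSIXct` value plays the role of `dhours(4)`. The Ids and timestamps below are hypothetical and only mimic the `_h`/`_w` naming used in this document.

```r
# Hypothetical head and wrist timestamps of the same moment;
# the wrist clock is assumed to be 4 hours behind
obs <- data.frame(
  Id = c("DEMO_S001_h", "DEMO_S001_w"),
  Datetime = as.POSIXct(
    c("2025-01-20 12:00:00", "2025-01-20 08:00:00"),
    tz = "UTC"
  )
)

# Shift only the wrist rows by 4 hours (4 * 3600 seconds)
is_wrist <- grepl("_w$", obs$Id)
obs$Datetime[is_wrist] <- obs$Datetime[is_wrist] + 4 * 3600

# After the shift both positions carry the same timestamp
print(obs)
```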

data2 <-
data2 |> 
  sample_groups(sample = 3) |> 
  mutate(Id = "FUSPCEU_S023")

data3 <-
data3 |> 
  sample_groups(sample = 3) |> 
  mutate(Id = "FUSPCEU_S021")

data <-
  join_datasets(data, data2, data3)
rm(data2, data3)

Regularizing data

First, we trim the data to each participant's trial period (trial start to trial end).

path_study_dates <- paste0("../data/Study_dates_MeLiDos_", site, ".xlsx")

#import table with study times
Study_dates <- read_excel(path_study_dates)
#gather the important information
Study_dates <-
  Study_dates |> 
    rename(Id = subjectID_device, start = datetime_trial_start, end = datetime_trial_end) |> 
    select(Id, start, end) |> 
    mutate(across(c(start, end), \(x) force_tz(x, tzs[site])),
           trial = TRUE) |> 
    filter(str_detect(Id, "_w$")) |>
    mutate(Id = str_remove(Id, "_w$")) |> 
  group_by(Id)

#add the trim information to the dataset and filter by it
data <- 
  data |> 
  add_states(Study_dates) |> 
  dplyr::filter(trial) |> 
  select(-trial)
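
The trimming logic can be sketched in base R as a per-participant interval filter. This is a simplified stand-in for `add_states()` followed by `filter(trial)`, not a reimplementation; the participant Ids, timestamps, and the closed-interval comparison are assumptions for the example.

```r
# Hypothetical observations for two participants (not study data)
obs <- data.frame(
  Id = c("P01", "P01", "P01", "P02", "P02"),
  Datetime = as.POSIXct(
    c("2024-10-07 08:00:00", "2024-10-07 12:00:00", "2024-10-15 12:00:00",
      "2024-10-08 09:00:00", "2024-10-09 09:00:00"),
    tz = "UTC"
  )
)

# Hypothetical trial windows, one row per participant
study_dates <- data.frame(
  Id    = c("P01", "P02"),
  start = as.POSIXct(c("2024-10-07 10:00:00", "2024-10-08 00:00:00"), tz = "UTC"),
  end   = as.POSIXct(c("2024-10-14 10:00:00", "2024-10-10 00:00:00"), tz = "UTC")
)

# Attach each participant's window, then keep observations inside it
merged   <- merge(obs, study_dates, by = "Id")
in_trial <- merged$Datetime >= merged$start & merged$Datetime <= merged$end
trimmed  <- merged[in_trial, c("Id", "Datetime")]

print(trimmed)
```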

data |> gg_overview()

data |> has_gaps()

[1] FALSE

data |> has_irregulars()

[1] FALSE

data |> gg_gaps(group.by.days = TRUE, show.irregulars = TRUE, full.days = FALSE)

No gaps nor irregular values were found. Plot creation skipped

data_cleaned <- 
data |> 
  gap_handler(full.days = TRUE)

data_cleaned |> gap_table(MEDI) |> cols_hide(ends_with("_n"))

Summary of available and missing data
Variable: melanopic EDI

Data

Id            Regular time       Regular %  Irreg. n [1,2]  Range      Interval
Overall       22w 2d 1h 23m 50s  86.7% [3]  0               25w 5d 2h  10
FUSPCEU_S003  6d 21h 24m         86.1%      0               1w 1d      10s
FUSPCEU_S004  6d 21h 59m 30s     86.5%      0               1w 1d      10s
FUSPCEU_S005  6d 22h 36m 10s     86.8%      0               1w 1d      10s
FUSPCEU_S006  6d 23h 22m         87.2%      0               1w 1d      10s
FUSPCEU_S007  1w 6h 4m 40s       90.2%      0               1w 1d 1h   10s
FUSPCEU_S008  1w 6h 34m          90.4%      0               1w 1d 1h   10s
FUSPCEU_S009  1w 12h 7m 20s      83.4%      0               1w 2d      10s
FUSPCEU_S010  4d 15h 47m 30s     77.6%      0               6d         10s
FUSPCEU_S011  6d 23h 10m         87.1%      0               1w 1d      10s
FUSPCEU_S012  1w 1h 38m          88.4%      0               1w 1d      10s
FUSPCEU_S013  6d 22h 43m 50s     86.8%      0               1w 1d      10s
FUSPCEU_S014  6d 1h 16m          86.5%      0               1w         10s
FUSPCEU_S015  6d 22h 1m 50s      86.5%      0               1w 1d      10s
FUSPCEU_S016  6d 23h 20m 40s     87.2%      0               1w 1d      10s
FUSPCEU_S017  6d 23h 24m         87.2%      0               1w 1d      10s
FUSPCEU_S018  4d 22h 55m 20s     82.6%      0               6d         10s
FUSPCEU_S019  6d 23h 50m         87.4%      0               1w 1d      10s
FUSPCEU_S020  1w 2h 11m 20s      88.6%      0               1w 1d      10s
FUSPCEU_S021  6d 22h 10m         86.5%      0               1w 1d      10s
FUSPCEU_S022  6d 23h 8m 10s      87.1%      0               1w 1d      10s
FUSPCEU_S023  6d 19h 58m 20s     85.4%      0               1w 1d      10s
FUSPCEU_S024  6d 23h 48m         87.4%      0               1w 1d      10s
FUSPCEU_S025  6d 23h 53m 10s     87.4%      0               1w 1d      10s

Missing

Id            Gaps N  Gaps ø        Missing time    Missing %  Implicit time  Implicit %  Explicit time   Explicit %
Overall       46      1w 5d 18m 5s  3w 3d 36m 10s   13.3% [3]  0s             0.0% [3]    3w 3d 36m 10s   13.3% [3]
FUSPCEU_S003  2       13h 18m       1d 2h 36m       13.9%      0s             0.0%        1d 2h 36m       13.9%
FUSPCEU_S004  2       13h 15s       1d 2h 30s       13.5%      0s             0.0%        1d 2h 30s       13.5%
FUSPCEU_S005  2       12h 41m 55s   1d 1h 23m 50s   13.2%      0s             0.0%        1d 1h 23m 50s   13.2%
FUSPCEU_S006  2       12h 19m       1d 38m          12.8%      0s             0.0%        1d 38m          12.8%
FUSPCEU_S007  2       9h 27m 40s    18h 55m 20s     9.8%       0s             0.0%        18h 55m 20s     9.8%
FUSPCEU_S008  2       9h 13m        18h 26m         9.6%       0s             0.0%        18h 26m         9.6%
FUSPCEU_S009  2       17h 56m 20s   1d 11h 52m 40s  16.6%      0s             0.0%        1d 11h 52m 40s  16.6%
FUSPCEU_S010  2       16h 6m 15s    1d 8h 12m 30s   22.4%      0s             0.0%        1d 8h 12m 30s   22.4%
FUSPCEU_S011  2       12h 25m       1d 50m          12.9%      0s             0.0%        1d 50m          12.9%
FUSPCEU_S012  2       11h 11m       22h 22m         11.6%      0s             0.0%        22h 22m         11.6%
FUSPCEU_S013  2       12h 38m 5s    1d 1h 16m 10s   13.2%      0s             0.0%        1d 1h 16m 10s   13.2%
FUSPCEU_S014  2       11h 22m       22h 44m         13.5%      0s             0.0%        22h 44m         13.5%
FUSPCEU_S015  2       12h 59m 5s    1d 1h 58m 10s   13.5%      0s             0.0%        1d 1h 58m 10s   13.5%
FUSPCEU_S016  2       12h 19m 40s   1d 39m 20s      12.8%      0s             0.0%        1d 39m 20s      12.8%
FUSPCEU_S017  2       12h 18m       1d 36m          12.8%      0s             0.0%        1d 36m          12.8%
FUSPCEU_S018  2       12h 32m 20s   1d 1h 4m 40s    17.4%      0s             0.0%        1d 1h 4m 40s    17.4%
FUSPCEU_S019  2       12h 5m        1d 10m          12.6%      0s             0.0%        1d 10m          12.6%
FUSPCEU_S020  2       10h 54m 20s   21h 48m 40s     11.4%      0s             0.0%        21h 48m 40s     11.4%
FUSPCEU_S021  2       12h 55m       1d 1h 50m       13.5%      0s             0.0%        1d 1h 50m       13.5%
FUSPCEU_S022  2       12h 25m 55s   1d 51m 50s      12.9%      0s             0.0%        1d 51m 50s      12.9%
FUSPCEU_S023  2       14h 50s       1d 4h 1m 40s    14.6%      0s             0.0%        1d 4h 1m 40s    14.6%
FUSPCEU_S024  2       12h 6m        1d 12m          12.6%      0s             0.0%        1d 12m          12.6%
FUSPCEU_S025  2       12h 3m 25s    1d 6m 50s       12.6%      0s             0.0%        1d 6m 50s       12.6%

[1] If n > 0: it is possible that the other summary statistics are affected, as they are calculated based on the most prominent interval.
[2] Number of (missing or actual) observations
[3] Based on times, not necessarily number of observations
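
In the gap table, all missing time is explicit and none is implicit, which is what one would expect after `gap_handler(full.days = TRUE)` has extended each recording to full days and filled the added rows with `NA`. The base R sketch below illustrates that idea on a small hypothetical series with an hourly epoch; it is not the LightLogR implementation.

```r
# Hypothetical hourly observations covering only part of one day
obs <- data.frame(
  Datetime = as.POSIXct("2024-10-07 10:00:00", tz = "UTC") + 3600 * 0:3,
  MEDI = c(120, 300, 250, 80)
)

# Regular hourly grid spanning the full day of the first observation
day_start <- as.POSIXct(trunc(min(obs$Datetime), units = "days"))
full_day  <- data.frame(Datetime = day_start + 3600 * 0:23)

# Copy observed values onto the grid; unmatched grid points become NA,
# i.e. formerly implicit gaps are now explicit missing values
idx <- match(as.numeric(full_day$Datetime), as.numeric(obs$Datetime))
full_day$MEDI <- obs$MEDI[idx]

n_missing   <- sum(is.na(full_day$MEDI))
pct_missing <- 100 * n_missing / nrow(full_day)
print(c(n_missing = n_missing, pct_missing = round(pct_missing, 1)))
```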

Exporting values

data_cleaned <- 
  data_cleaned |> mutate(position = "wrist")

light_wrist_1min <- 
data_cleaned |> 
  aggregate_Datetime("1 minute", numeric.handler = \(x) mean(x, na.rm = TRUE)) |> 
  remove_partial_data(MEDI, threshold.missing = "3 hours", by.date = TRUE)

This dataset has irregular or singular data. Singular data will automatically be removed. If you are uncertain about irregular data, you can check them with `gap_finder`, `gap_table`, and `gg_gaps`.

light_wrist <- data_cleaned

save(light_wrist_1min, file = "../data/imported/light/light_wrist_1minute.RData")
save(light_wrist, file = "../data/imported/light/light_wrist.RData")
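
The 1-minute aggregation can be pictured as binning the 10-second observations by minute and averaging within each bin. The following base R sketch shows this on hypothetical values; `aggregate_Datetime()` handles more cases (groups, non-numeric columns, bin alignment), so this is a simplified picture under those assumptions.

```r
# Hypothetical 10-second observations spanning two minutes
t0 <- as.POSIXct("2024-10-07 15:00:00", tz = "UTC")
obs <- data.frame(
  Datetime = t0 + seq(0, 110, by = 10),
  MEDI = 1:12
)

# Floor each timestamp to the start of its minute
obs$minute <- as.POSIXct(
  floor(as.numeric(obs$Datetime) / 60) * 60,
  origin = "1970-01-01", tz = "UTC"
)

# Average within each minute, ignoring missing values
light_1min <- aggregate(MEDI ~ minute, data = obs,
                        FUN = function(x) mean(x, na.rm = TRUE))
print(light_1min)
```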

Visualization

prefix <- paste0(site, "_")

data_cleaned |> 
  mutate(Id = Id |> fct_relabel(\(x) str_remove(x, prefix))) |> 
grand_overview(coordinates[[site]], cities[[site]], countries[[site]], 
               country_colors[[site]], photoperiod_sequence = 1)

Stats

Summary table

summary_table(
  data_cleaned, 
  coordinates = coordinates[[site]], 
  location = cities[[site]], 
  site = countries[[site]], 
  color = country_colors[[site]],
  histograms = TRUE
)

Summary table
Madrid, Spain, 40.4°N, 3.7°W, TZ: Europe/Madrid

Overview
Participants                             23
Participant-days                         180 (6 - 9)
Days ≥80% complete                       134 (4 - 7)
Missing/irregular                        13.0% (10.0% - 22.0%)
Photoperiod                              11h 7m (10h 19m - 12h 23m) [1]

Metrics [2]
Dose                       D (lx·h)      5,004 ±7,346 (12 - 44,795)
Duration above 250 lx      TAT250        2h 42m ±2h 18m (0s - 7h 22m)
Duration within 1-10 lx    TWT1-10       2h 25m ±1h 27m (30s - 7h 27m)
Duration below 1 lx        TBT1          13h 42m ±3h 22m (7h 38m - 23h 17m)
Period above 250 lx        PAT250        19m 21s ±18m 26s (0s - 1h 29m)
Duration above 1000 lx     TAT1000       32m 40s ±45m (0s - 4h 30m)
First timing above 250 lx  FLiT250       09:18 ±02:54 (00:09 - 16:54) [1]
Mean timing above 250 lx   MLiT250       14:00 ±01:24 (09:55 - 18:19) [1]
Last timing above 250 lx   LLiT250       19:28 ±02:40 (10:48 - 23:51) [1]
Brightest 10h midpoint     M10midpoint   14:38 ±01:50 (08:58 - 18:59) [1]
Darkest 5h midpoint        L5midpoint    02:59 ±00:56 (00:36 - 07:53) [1]
Brightest 10h mean [3]     M10mean (lx)  64.5 ±83.7 (0.1 - 359.5)
Darkest 5h mean [3]        L5mean (lx)   0.0 ±0.0 (0.0 - 0.0)
Interdaily stability       IS            0.321 ±0.113 (0.160 - 0.556)
Intradaily variability     IV            1.294 ±0.411 (0.468 - 1.934)

Values show: mean ±sd (min - max) and are all based on measurements of melanopic EDI (lx)
[1] Histogram limits are set from 00:00 to 24:00
[2] Metrics are calculated on a by-participant-day basis (n=134) with the exception of IV and IS, which are calculated on a by-participant basis (n=23).
[3] Values were log 10 transformed prior to averaging, with an offset of 0.1, and backtransformed afterwards
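
The log-transformed averaging described in the last footnote can be sketched as follows. The function name `log_mean` and the input values are hypothetical, and subtracting the offset again after backtransforming is an assumption; the footnote does not spell out that detail.

```r
# Mean on the log10 scale with an offset, backtransformed afterwards.
# Assumption: the offset is subtracted again after backtransformation.
log_mean <- function(x, offset = 0.1) {
  10^mean(log10(x + offset), na.rm = TRUE) - offset
}

# Hypothetical melanopic EDI values (lx), including a zero,
# which the offset keeps finite on the log scale
medi <- c(0, 9.9, 99.9)

result <- log_mean(medi)
print(result)      # well below the arithmetic mean
print(mean(medi))  # arithmetic mean for comparison
```

The offset matters for metrics such as L5mean, where many values are at or near 0 lx and a plain `log10()` would return `-Inf`.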