Published January 13, 2020 | Version v1

Slicing: a sustainable approach to structuring samples for analysis in long-term studies

  • 1. University of Leeds
  • 2. University of Edinburgh/Norwegian University of Science and Technology
  • 3. Imperial College London
  • 4. University of Sheffield

Description

  1. The longitudinal study of populations is a core tool for understanding ecological and evolutionary processes. Long-term studies typically collect samples repeatedly over individual lifetimes and across generations. These samples are then analysed in the laboratory over time, in batches (e.g. qPCR plates) and clusters (i.e. groups of batches). However, these analyses are constrained by cross-classified data structures introduced biologically or through experimental design. Separating biological variation from the confounding among-batch and among-cluster variation is crucial, yet often ignored.
  2. The two commonly used approaches to structuring samples for analysis, sequential and randomised allocation, generate bias because the time of collection is not independent of the batch and cluster in which samples are analysed. We propose a new sample structuring strategy, called slicing, designed to separate confounding among-batch and among-cluster variation from biological variation. Through simulations, we tested the statistical power and precision of this novel approach to detect within-individual, between-individual, year and cohort effects.
  3. Our slicing approach, whereby recently and previously collected samples are sequentially analysed together in clusters, bridges clusters and so enables the statistical separation of collection time and cluster effects; we provide a case study to illustrate this. Our simulations show that, with a reasonable slicing width and angle, slicing samples across batches gives similar precision and similar or greater statistical power to detect year, cohort, within-individual and between-individual effects, compared with strategies that aggregate longitudinal samples or use randomised allocation.
  4. While the best approach to analysing long-term datasets depends on the structure of the data and the questions of interest, it is vital to account for confounding among-batch and among-cluster variation. Our slicing approach is simple to apply and creates the necessary statistical independence of batch and cluster from the environmental or biological variables of interest. Crucially, it allows sequential analysis of samples and flexible inclusion of current data in later analyses without completely confounding the analysis. Our approach maximises the scientific value of every sample, as each will optimally contribute to unbiased statistical inference from the data. Slicing thereby maximises the power of growing biobanks to address important ecological, epidemiological and evolutionary questions.
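
The bridging idea in point 3 can be illustrated with a minimal sketch. It is not the authors' implementation: the allocation rule, the `width` parameter and all function names below are hypothetical, and the description above does not define slicing width or angle precisely. The sketch assumes that each year's samples are spread over `width` consecutive clusters, so that no sample is analysed before it is collected, and then checks whether collection years and clusters form a single connected design. Connectedness is used here as a simple proxy for being able to separate year effects from cluster effects.

```python
from collections import defaultdict


def sequential_allocation(samples):
    """Analyse every sample in the cluster matching its collection year."""
    return [(sid, year, year) for sid, year in samples]


def sliced_allocation(samples, width=3):
    """Spread each year's samples over `width` consecutive clusters.

    Hypothetical rule (an assumption, not the published method): the k-th
    sample of a year goes to cluster year + (k mod width), so a sample is
    never analysed in a cluster that precedes its collection year.
    """
    seen = defaultdict(int)
    allocation = []
    for sid, year in samples:
        offset = seen[year] % width
        seen[year] += 1
        allocation.append((sid, year, year + offset))
    return allocation


def count_components(allocation):
    """Count connected components of the year-cluster bipartite graph."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for _, year, cluster in allocation:
        root_year = find(("year", year))
        root_cluster = find(("cluster", cluster))
        parent[root_year] = root_cluster  # union the two components
    return len({find(node) for node in list(parent)})


# Toy data: 5 collection years with 6 samples each (made-up identifiers).
samples = [(f"s{year}_{k}", year) for year in range(5) for k in range(6)]

sequential = sequential_allocation(samples)
sliced = sliced_allocation(samples, width=3)

print(count_components(sequential))  # one component per year: 5
print(count_components(sliced))      # clusters are bridged: 1
```

Under the sequential rule each year sits in its own cluster, so year and cluster are completely confounded. Under the sliced rule, consecutive clusters share collection years, which links all years and clusters into one component. Setting `width=1` reproduces the sequential allocation.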

Files

Files (31.5 kB)

Name Size
md5:2ca8a34615b9769cbdb67cb88509f06c 15.7 kB
md5:9a612ebf68aaa763ed74db65cb7f3c17 15.8 kB