Published December 8, 2025 | Version v3
Dataset Open

Supplementary Data for Coale et al. 2025

  • 1. University of California, Santa Cruz

Description

Supplementary Data 1-16 for Coale et al. 2025.

01: P. calceolata physiology data from Fe/light co-limitation experiment, including cell concentrations and cellular chlorophyll a, C, N, Fe, Cu, and protein contents.

02: P. calceolata transcriptomic data. Counts and CPM of P. calceolata transcripts from Fe/light co-limitation experiment.

03: Annotations and abbreviations for P. calceolata gene models.

04: Fasta file of P. calceolata gene model coding sequences.

05: Fasta file of P. calceolata gene model amino acid sequences.

06: edgeR comparisons made using transcriptomic data.

07: Gene membership and GO enrichment in WGCNA modules.

08: Proteomics intensities (normalized and imputed).

09: Proteomics DE analysis via limma.

10: Results of Fe x Light interaction test - physiological parameters.

11: Results of Fe x Light interaction test - transcriptomics.

12: Results of Fe x Light interaction test - proteomics.

13: P. calceolata dyneins - class, Fe sensitivity, and protein sequence.

14: Environmental data from the NCOG project.

15: Transcriptomics response types determined with edgeR.

16: Results of imputation analysis of proteomics data.

Source data for Figures: SourceData.xlsx

Code

01: R script for edgeR analysis of transcriptomes. 

02: R script for limma analysis of proteomes. 

README for code:

README – RNA-seq and Proteomics Differential Expression Pipelines
================================================================

- Code_01_Transcriptomics_edgeR.R reads `SupplementaryData_02_transcriptomics.xlsx`.
- Code_02_Proteomics_limma.R reads `SupplementaryData_08_proteomics.csv`.


----------------------------------------------------------------
1. Contents
----------------------------------------------------------------

This Zenodo deposit (10.5281/zenodo.17859961) includes:

- Scripts
  - Code_01_Transcriptomics_edgeR.R - RNA-seq differential expression and Fe × Light interaction (edgeR)
  - Code_02_Proteomics_limma.R - Proteomics differential abundance analyses (limma)

 

----------------------------------------------------------------
2. Software Requirements
----------------------------------------------------------------

- R 4.3.x
- CRAN packages
  - readxl ≥ 1.4
  - readr ≥ 2.1
  - dplyr ≥ 1.1
  - tidyr ≥ 1.3
  - stringr ≥ 1.5
  - matrixStats ≥ 1.3
  - ggplot2 ≥ 3.5
  - patchwork ≥ 1.2
  - BiocManager ≥ 1.30
- Bioconductor packages (installed via BiocManager)
  - edgeR ≥ 3.42
  - limma ≥ 3.56
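
The minimum versions listed above can be checked against the local R library before running either script. The following is a minimal sketch, not part of the deposited code; the helper name `check_min_versions` is hypothetical, and the version numbers are copied from the list above.

```r
# Minimum versions copied from the requirements list above.
min_versions <- c(
  edgeR = "3.42", limma = "3.56", readxl = "1.4", readr = "2.1",
  dplyr = "1.1", tidyr = "1.3", stringr = "1.5", matrixStats = "1.3",
  ggplot2 = "3.5", patchwork = "1.2", BiocManager = "1.30"
)

# Hypothetical helper: reports, for each package, the installed version
# (NA if not installed) and whether it meets the stated minimum.
check_min_versions <- function(required) {
  pkgs <- names(required)
  installed <- vapply(pkgs, function(p) {
    if (requireNamespace(p, quietly = TRUE)) {
      as.character(utils::packageVersion(p))
    } else {
      NA_character_
    }
  }, character(1))
  ok <- vapply(seq_along(pkgs), function(i) {
    !is.na(installed[[i]]) &&
      utils::compareVersion(installed[[i]], required[[i]]) >= 0
  }, logical(1))
  data.frame(package = pkgs, required = unname(required),
             installed = unname(installed), ok = ok,
             stringsAsFactors = FALSE)
}

print(check_min_versions(min_versions))
```

Rows with `ok == FALSE` indicate packages that are missing or older than the stated minimum.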


----------------------------------------------------------------
3. RNA-seq Differential Expression (edgeR)
----------------------------------------------------------------

3.1 Input
--------

- File: `SupplementaryData_02_transcriptomics.xlsx`
  - Sheet: `mapped_read_counts`
  - Column 1: gene IDs
  - Remaining columns: raw integer read counts for each RNA-seq sample.

The script:

1. Reads the Excel sheet using `readxl::read_excel`.
2. Converts all count columns to numeric.
3. Uses a manually defined `group` vector that encodes each sample as a treatment combination (e.g., `HLpDay`, `LLmNight`, `HLd00`, `HLr03`, etc.), corresponding to Light (HL/LL), Fe status (p/m/d/r), and time of day (Day/Night/diel times).
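
As an illustration of how the `group` labels encode the treatments, the sketch below splits labels such as `HLpDay` or `HLr03` into their components. It is not taken from the script: the function name `parse_group` and the regular expression are assumptions, and coded times are mapped to Day following the rule stated in section 3.3.

```r
# Hypothetical helper: split group labels into Light, Fe status, and time.
parse_group <- function(group) {
  m <- regmatches(group, regexec("^(HL|LL)([pmdr])(.+)$", group))
  bad <- lengths(m) != 4
  if (any(bad)) {
    stop("Unrecognised group label(s): ", paste(group[bad], collapse = ", "))
  }
  parts <- do.call(rbind, m)  # columns: full match, Light, Fe, time
  time <- parts[, 4]
  data.frame(
    group = group,
    Light = factor(parts[, 2], levels = c("LL", "HL")),
    Fe = factor(parts[, 3], levels = c("p", "m", "d", "r")),
    Time = time,
    # Assumption: anything not labelled "Night" (including coded diel
    # times such as 00 or 03) is treated as Day.
    TimeOfDay = factor(ifelse(time == "Night", "Night", "Day"),
                       levels = c("Day", "Night")),
    stringsAsFactors = FALSE
  )
}

meta <- parse_group(c("HLpDay", "LLmNight", "HLd00", "HLr03"))
print(meta)
```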

3.2 edgeR GLM contrasts
-----------------------

The first part of the script:

1. Constructs a `DGEList` and performs TMM normalization.
2. Fits a negative binomial GLM with design `~ 0 + group`.
3. Defines a panel of pairwise contrasts using `makeContrasts`, including:
   - HL vs LL within a given Fe and time (e.g., `HLpDayvsLLpDay`, `HLmNightvsLLmNight`)
   - +Fe vs −Fe within HL or LL (e.g., `HLpDayvsHLmDay`, `LLpNightvsLLmNight`)
   - Day vs Night within Fe × Light (e.g., `HLmDayvsHLmNight`, `LLpDayvsLLpNight`)
4. For each contrast, runs `glmLRT` and extracts the full ranked gene table with:
   - log2 fold change, logCPM, LR, PValue, FDR
5. Combines all contrasts into a single data frame and writes it to a CSV file.

Output:

- `all_contrasts_with_FDR.csv`
  - A wide table where each contrast contributes five columns (`logFC_*`, `logCPM_*`, `LR_*`, `PValue_*`, `FDR_*`). Each row corresponds to one gene ID.
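
The workflow in section 3.2 can be sketched on simulated counts as follows. This is an illustrative reconstruction under stated assumptions, not the deposited script: the count matrix is simulated, only four groups and two contrasts are shown, and the dispersion step (`estimateDisp`) and the way the tables are combined are assumptions.

```r
library(edgeR)  # attaches limma as well, which provides makeContrasts()

set.seed(1)
# Simulated stand-in for the mapped_read_counts sheet: 200 genes x 12 samples.
group <- factor(rep(c("HLpDay", "LLpDay", "HLmDay", "LLmDay"), each = 3))
counts <- matrix(rnbinom(200 * 12, mu = 100, size = 5), nrow = 200,
                 dimnames = list(paste0("gene", 1:200),
                                 paste0(group, c("A", "B", "C"))))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y, method = "TMM")

# One coefficient per group, no intercept.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)
y <- estimateDisp(y, design)  # assumption: dispersion estimated this way
fit <- glmFit(y, design)

contrasts <- makeContrasts(
  HLpDayvsLLpDay = HLpDay - LLpDay,
  HLpDayvsHLmDay = HLpDay - HLmDay,
  levels = design
)

# One likelihood-ratio test per contrast; keep gene order so the tables
# can be bound column-wise, and suffix each column with the contrast name.
per_contrast <- lapply(colnames(contrasts), function(cn) {
  lrt <- glmLRT(fit, contrast = contrasts[, cn])
  tab <- topTags(lrt, n = Inf, sort.by = "none")$table
  names(tab) <- paste0(names(tab), "_", cn)
  tab
})

wide <- data.frame(gene_id = rownames(per_contrast[[1]]),
                   do.call(cbind, per_contrast), row.names = NULL)
str(wide)
```

Each contrast contributes five columns, so two contrasts yield a table with eleven columns including `gene_id`.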

3.3 Fe × Light interaction (RNA-seq)
------------------------------------

The second part of the script focuses on the Fe × Light interaction, restricting to +Fe (p) and −Fe (m) samples:

1. Builds a `meta` data frame from column names and the `group` vector, with:
   - `Light` (LL / HL)
   - `Fe` (p / m / d / r)
   - `TimeOfDay` (Day / Night; coded times like 00, 23, 03 are treated as Day when not explicitly labeled).
2. Filters to +Fe vs −Fe only (Fe ∈ {p, m}) and drops resupply (r) and DFOB (d) conditions.
3. Runs an edgeR quasi-likelihood pipeline with design:

   design_int <- model.matrix(~ Fe * Light + TimeOfDay, data = meta_int)

   which includes main effects for Fe and Light and their interaction term, plus TimeOfDay as a covariate.
4. Identifies the Fe:Light interaction coefficient and runs `glmQLFTest` on that term.
5. Writes a full ranked table of genes with interaction statistics, plus a summary table.

Outputs (in directory `interaction_rna/`):

- `rna_FeXLight_interaction_edgeR.csv`
  - Columns: `gene_id`, `logFC`, `logCPM`, `F`, `PValue`, `FDR`
  - `logFC` is the estimated effect of the Fe × Light interaction on expression.
- `rna_FeXLight_interaction_summary.csv`
  - Summary counts of genes tested and the number passing FDR < 0.05 and FDR < 0.10.
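
To make the interaction design in section 3.3 concrete, the sketch below builds the same formula on a toy, balanced metadata table and locates the interaction column. The metadata and the reference levels (`p` for Fe, `LL` for Light, `Day` for TimeOfDay) are assumptions for illustration; the script's own factor levels determine the sign of `logFC`.

```r
# Toy metadata: 2 Fe x 2 Light x 2 TimeOfDay x 3 replicates (assumed layout).
meta_int <- expand.grid(
  Rep = c("A", "B", "C"),
  TimeOfDay = c("Day", "Night"),
  Light = c("LL", "HL"),
  Fe = c("p", "m"),
  stringsAsFactors = FALSE
)
# Assumed reference levels: +Fe, low light, Day.
meta_int$Fe <- factor(meta_int$Fe, levels = c("p", "m"))
meta_int$Light <- factor(meta_int$Light, levels = c("LL", "HL"))
meta_int$TimeOfDay <- factor(meta_int$TimeOfDay, levels = c("Day", "Night"))

design_int <- model.matrix(~ Fe * Light + TimeOfDay, data = meta_int)
print(colnames(design_int))

# The interaction coefficient is the only column whose name contains ":".
int_coef <- grep(":", colnames(design_int), fixed = TRUE)
print(colnames(design_int)[int_coef])

# With an edgeR quasi-likelihood fit (fit <- glmQLFit(y, design_int)),
# the interaction would be tested with glmQLFTest(fit, coef = int_coef).
```

With these reference levels, the interaction coefficient measures how the −Fe vs +Fe difference under HL departs from the same difference under LL.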

----------------------------------------------------------------
4. Proteomics Differential Abundance (limma)
----------------------------------------------------------------

4.1 Input
--------

- File: `SupplementaryData_08_proteomics.csv`
  - Contains normalized, imputed protein intensities.
  - The script:
    - Skips lines 2–3 using `read_csv_drop_lines()` (to remove extra annotation rows).
    - Expects one column named exactly `id` for protein IDs.
    - Treats all remaining columns that match the pattern `^(HL|LL)([pmr])(00|11|23)([A-Z])$` as sample columns, where:
      - `HL` / `LL` = light treatment
      - `p` / `m` / `r` = +Fe, −Fe, or resupply
      - `00`, `11`, `23` = sampling time codes
      - `A`, `B`, `C` = biological replicates
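
The sample-column pattern above can be exercised on a few example column names. This is a minimal sketch; the column names are invented for illustration, and the resulting `md` table only approximates the metadata described in section 4.2.

```r
sample_pattern <- "^(HL|LL)([pmr])(00|11|23)([A-Z])$"

# Invented column names: three valid samples, plus non-sample columns.
cols <- c("id", "HLp00A", "LLm23B", "HLr11C", "description", "HLd00A")

is_sample <- grepl(sample_pattern, cols)
sample_cols <- cols[is_sample]

# Capture groups: full match, light, Fe status, time code, replicate.
parts <- do.call(rbind, regmatches(sample_cols,
                                   regexec(sample_pattern, sample_cols)))
md <- data.frame(sample = parts[, 1], Light = parts[, 2], Fe = parts[, 3],
                 Tcode = parts[, 4], Rep = parts[, 5],
                 stringsAsFactors = FALSE)
md$cond <- paste(md$Light, md$Fe, md$Tcode, sep = ".")
print(md)
```

Note that `HLd00A` is not treated as a sample column, because the pattern only accepts `p`, `m`, or `r` as the Fe code.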

4.2 Preprocessing
-----------------

1. Builds a sample metadata table (`md`) with:
   - `Light` ∈ {LL, HL}
   - `Fe` ∈ {p, m, r}
   - `Tcode` ∈ {11, 00, 23}
   - `cond` = interaction of `Light.Fe.Tcode` (e.g., `HL.m.23`).
2. Extracts the expression matrix `X`, converts to numeric, and sets rownames to protein IDs.
3. Applies log2 transform with a pseudocount (offset = 1), then `normalizeBetweenArrays` (quantile normalization).
4. Removes proteins with zero variance across samples using `rowSds`.
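
Steps 3 and 4 can be sketched on a simulated intensity matrix as follows. The data are simulated and the object names are hypothetical; only the function calls (`normalizeBetweenArrays` from limma, `rowSds` from matrixStats) come from the description above.

```r
library(limma)
library(matrixStats)

set.seed(42)
# Simulated intensities: 50 proteins x 6 samples (names are illustrative).
X <- matrix(rlnorm(50 * 6, meanlog = 10), nrow = 50,
            dimnames = list(paste0("prot", 1:50),
                            c("HLp00A", "HLp00B", "HLp00C",
                              "LLp00A", "LLp00B", "LLp00C")))

# log2 transform with a pseudocount of 1, then quantile normalization.
X_log <- log2(X + 1)
X_norm <- normalizeBetweenArrays(X_log, method = "quantile")

# Drop proteins whose standard deviation across samples is zero.
keep <- rowSds(X_norm) > 0
X_filt <- X_norm[keep, , drop = FALSE]
dim(X_filt)
```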

4.3 Design and contrasts
------------------------

1. Builds a design matrix with one coefficient per observed `cond`:

   design <- model.matrix(~ 0 + cond, data = md)

2. Fits a linear model with `lmFit` and `eBayes` (trend = TRUE, robust = TRUE).
3. Programmatically constructs a series of numeric contrast vectors (stored in list `C`), including:

   - Fe effects within light at “day” sampling:
     - (−Fe “day”) − (+Fe “day”) for each light level (LL, HL), pooling Day00/Day23 where appropriate.
   - Resupply vs +Fe within light at day:
     - Resupply “day” vs +Fe “day” for LL and HL.
   - Day vs Night within each Fe × Light combination:
     - (Fe “day”) − (Fe “night”) for p, m, r in LL and HL.
   - HL vs LL within a given Fe at day:
     - (HL, Fe, day) − (LL, Fe, day), each evaluated at its available Day timepoints.

All contrasts are built only if the required timepoints/conditions exist in the design (the code skips missing combinations safely).
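
A numeric contrast vector of the kind described above can be sketched in base R. This is an illustration under assumptions, not the script's code: the helper names are hypothetical, the toy design contains only a few conditions, and time codes `00` and `23` are taken to be the "day" samplings, as suggested by the pooling of Day00/Day23.

```r
# Toy design with one coefficient per observed condition.
cond_levels <- c("HL.p.00", "HL.p.23", "HL.m.00", "HL.m.23",
                 "HL.p.11", "LL.p.00", "LL.m.00")
md <- data.frame(cond = factor(rep(cond_levels, each = 3),
                               levels = cond_levels))
design <- model.matrix(~ 0 + cond, data = md)
colnames(design) <- sub("^cond", "", colnames(design))

# Hypothetical helper: equal weights over whichever requested coefficients
# exist in the design; NULL if none of them exist.
mean_weights <- function(conds, all_cols) {
  present <- intersect(conds, all_cols)
  if (length(present) == 0) return(NULL)
  w <- setNames(numeric(length(all_cols)), all_cols)
  w[present] <- 1 / length(present)
  w
}

# (-Fe "day") - (+Fe "day") within one light level, pooling day codes.
fe_contrast <- function(light, all_cols, day_codes = c("00", "23")) {
  minus <- mean_weights(paste(light, "m", day_codes, sep = "."), all_cols)
  plus <- mean_weights(paste(light, "p", day_codes, sep = "."), all_cols)
  if (is.null(minus) || is.null(plus)) return(NULL)  # skip missing combos
  minus - plus
}

C <- list(HL_Day_m_vs_p = fe_contrast("HL", colnames(design)),
          LL_Day_m_vs_p = fe_contrast("LL", colnames(design)))
C <- Filter(Negate(is.null), C)
print(C)
```

Each vector in `C` has one weight per design column and sums to zero, so it can be supplied as a contrast to `limma::contrasts.fit`.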

4.4 Outputs
-----------

All results are written to the folder:

- `limma_results/`

Key files:

- Per-contrast differential abundance tables
  - For each contrast name (e.g., `HL_Day_r_vs_p`, `LL_Day_m_vs_p`, `HLvsLL_p_Day`), the script writes:

    - `limma_results/<contrast>.csv`
      - Columns:
        - `ProteinID`
        - `logFC`
        - `AveExpr`
        - `t`
        - `P.Value`
        - `adj.P.Val`
        - `B`

- Summaries
  - `limma_results/summary_counts.csv`
    - Basic counts of significant proteins per contrast.
  - `limma_results/summary_counts_with_pct.csv`
    - Adds `n_tested`, `pct_FDR_5`, and counts of up- and down-regulated proteins at FDR ≤ 0.05.

- Design and contrast metadata
  - `limma_results/design_columns.csv`
    - Lists the design matrix coefficients.
  - `limma_results/<contrast>_weights.csv` for each contrast
    - Records the non-zero coefficient weights used to build that contrast.
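
The per-contrast summaries can be illustrated with a small helper applied to one differential abundance table. This is a sketch, not the script's implementation: the function name and the columns `n_sig`, `n_up`, and `n_down` are hypothetical, while `n_tested` and `pct_FDR_5` follow the description above. The toy table is invented.

```r
# Hypothetical helper: summarise one limma result table at a given FDR.
summarise_contrast <- function(tab, fdr = 0.05) {
  sig <- tab$adj.P.Val <= fdr
  data.frame(
    n_tested = nrow(tab),
    n_sig = sum(sig),
    pct_FDR_5 = 100 * mean(sig),
    n_up = sum(sig & tab$logFC > 0),
    n_down = sum(sig & tab$logFC < 0)
  )
}

# Invented example with the relevant columns of a per-contrast table.
toy <- data.frame(
  ProteinID = c("P1", "P2", "P3", "P4"),
  logFC = c(1.2, -0.8, 0.1, -2.0),
  adj.P.Val = c(0.01, 0.04, 0.50, 0.20)
)
print(summarise_contrast(toy))
```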

 

----------------------------------------------------------------
5. Running the Scripts
----------------------------------------------------------------

From R (or RStudio), set the working directory to the folder containing the scripts and supplemental data, for example:

setwd("path/to/unzipped_zenodo_archive")

Then:

- RNA-seq differential expression + Fe × Light interaction

  source("Code_01_Transcriptomics_edgeR.R")

- Proteomics limma analyses

  source("Code_02_Proteomics_limma.R")

(Adjust the filenames if the scripts have been renamed or moved.)

Provided the paths at the top of each script (`infile`, `outdir`, and any hard-coded paths) point to the included supplementary data files, the scripts will reproduce the RNA and protein differential expression results used in the manuscript and supplementary figures.
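
Before sourcing the scripts, it may help to confirm that the expected files are present in the working directory. The check below is a minimal sketch; the helper name `missing_inputs` is hypothetical, and the filenames are the ones named in this README.

```r
# Scripts and input files named in this README.
required_files <- c(
  "Code_01_Transcriptomics_edgeR.R",
  "SupplementaryData_02_transcriptomics.xlsx",
  "Code_02_Proteomics_limma.R",
  "SupplementaryData_08_proteomics.csv"
)

# Hypothetical helper: return the files that are not found in `dir`.
missing_inputs <- function(files, dir = ".") {
  files[!file.exists(file.path(dir, files))]
}

not_found <- missing_inputs(required_files)
if (length(not_found) > 0) {
  message("Missing from working directory: ",
          paste(not_found, collapse = ", "))
} else {
  message("All scripts and input files found.")
}
```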

 

Files (77.5 MB)
