Supplementary Data for Coale et al. 2025
Description
Supplementary Data 1-16 for Coale et al. 2025.
01: P. calceolata physiology data from Fe/light co-limitation experiment including cell concentrations, chlorophyll a, C, N, Fe, Cu and protein cellular contents.
02: P. calceolata transcriptomic data. Counts and CPM of P. calceolata transcripts from Fe/light co-limitation experiment.
03: Annotations and abbreviations for P. calceolata gene models.
04: Fasta file of P. calceolata gene model coding sequences.
05: Fasta file of P. calceolata gene model amino acid sequences.
06: edgeR comparisons made using transcriptomic data.
07: Gene membership and GO enrichment in WGCNA modules.
08: Proteomics intensities (normalized and imputed).
09: Proteomics DE analysis via limma.
10: Results of Fe x Light interaction test - physiological parameters.
11: Results of Fe x Light interaction test - transcriptomics.
12: Results of Fe x Light interaction test - proteomics.
13: P. calceolata dyneins - class, Fe sensitivity, and protein sequence.
14: Environmental data from the NCOG project.
15: Transcriptomics response types determined with edgeR.
16: Results of imputation analysis of proteomics data.
Source data for Figures: SourceData.xlsx
Code
01: R script for edgeR analysis of transcriptomes.
02: R script for limma analysis of proteomes.
README for code:
README – RNA-seq and Proteomics Differential Expression Pipelines
================================================================
- Code_01_Transcriptomics_edgeR.R reads `SupplementaryData_02_transcriptomics.xlsx`.
- Code_02_Proteomics_limma.R reads `SupplementaryData_08_proteomics.csv`.
----------------------------------------------------------------
1. Contents
----------------------------------------------------------------
This Zenodo deposit (10.5281/zenodo.17859961) includes:
- Scripts
  - Code_01_Transcriptomics_edgeR.R - RNA-seq differential expression and Fe × Light interaction (edgeR)
  - Code_02_Proteomics_limma.R - Proteomics differential abundance analyses (limma)
----------------------------------------------------------------
2. Software Requirements
----------------------------------------------------------------
- R 4.3.x
- Bioconductor packages
  - edgeR ≥ 3.42
  - limma ≥ 3.56
- CRAN packages
  - readxl ≥ 1.4
  - readr ≥ 2.1
  - dplyr ≥ 1.1
  - tidyr ≥ 1.3
  - stringr ≥ 1.5
  - matrixStats ≥ 1.3
  - ggplot2 ≥ 3.5
  - patchwork ≥ 1.2
  - BiocManager ≥ 1.30 (used to install the Bioconductor packages)
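Before running either script, it can help to compare the installed package versions against the minimums listed above. The sketch below is not part of the deposited code: the helper name `check_min_versions` is hypothetical, and it relies only on base R (`requireNamespace`, `utils::packageVersion`).

```r
# Hypothetical helper (not in the deposited scripts): report whether each
# package is installed and meets a minimum version.
check_min_versions <- function(min_versions) {
  pkgs <- names(min_versions)
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  ok <- installed
  for (p in pkgs[installed]) {
    ok[p] <- utils::packageVersion(p) >= min_versions[[p]]
  }
  data.frame(package = pkgs, minimum = unname(min_versions),
             installed = unname(installed), ok = unname(ok),
             stringsAsFactors = FALSE)
}

# Minimum versions as listed in this README.
readme_minimums <- c(edgeR = "3.42", limma = "3.56", readxl = "1.4",
                     readr = "2.1", dplyr = "1.1", tidyr = "1.3",
                     stringr = "1.5", matrixStats = "1.3",
                     ggplot2 = "3.5", patchwork = "1.2",
                     BiocManager = "1.30")
print(check_min_versions(readme_minimums))
```

Rows with `ok = FALSE` point to packages that are missing or older than the listed minimum.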
----------------------------------------------------------------
3. RNA-seq Differential Expression (edgeR)
----------------------------------------------------------------
3.1 Input
--------
- File: `SupplementaryData_02_transcriptomics.xlsx`
- Sheet: `mapped_read_counts`
- Column 1: gene IDs
- Remaining columns: raw integer read counts for each RNA-seq sample.
The script:
1. Reads the Excel sheet using `readxl::read_excel`.
2. Converts all count columns to numeric.
3. Uses a manually defined `group` vector that encodes each sample as a treatment combination (e.g., `HLpDay`, `LLmNight`, `HLd00`, `HLr03`): Light (HL/LL), Fe status (p = +Fe, m = −Fe, d = DFOB, r = resupply), and time of day (Day/Night or a diel time code).
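To illustrate how such group labels break down, the sketch below splits the example labels from this README into their Light, Fe, and time components with a base R regular expression. The regular expression and the `decoded` table are illustrative assumptions; the deposited script defines `group` by hand and may decode it differently.

```r
# Example labels taken from this README; the real `group` vector is defined
# by hand in Code_01_Transcriptomics_edgeR.R.
group <- c("HLpDay", "LLmNight", "HLd00", "HLr03")

# Split each label into Light (HL/LL), Fe status (p/m/d/r), and a time token.
parts <- regmatches(group, regexec("^(HL|LL)([pmdr])(.+)$", group))
decoded <- data.frame(
  group = group,
  Light = vapply(parts, `[`, character(1), 2),
  Fe    = vapply(parts, `[`, character(1), 3),
  Time  = vapply(parts, `[`, character(1), 4),
  stringsAsFactors = FALSE
)
print(decoded)
```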
3.2 edgeR GLM contrasts
-----------------------
The first part of the script:
1. Constructs a `DGEList` and performs TMM normalization.
2. Fits a negative binomial GLM with design `~ 0 + group`.
3. Defines a panel of pairwise contrasts using `makeContrasts`, including:
   - HL vs LL within a given Fe and time (e.g., `HLpDayvsLLpDay`, `HLmNightvsLLmNight`)
   - +Fe vs −Fe within HL or LL (e.g., `HLpDayvsHLmDay`, `LLpNightvsLLmNight`)
   - Day vs Night within Fe × Light (e.g., `HLmDayvsHLmNight`, `LLpDayvsLLpNight`)
4. For each contrast, runs `glmLRT` and extracts the full ranked gene table with log2 fold change, logCPM, LR, PValue, and FDR.
5. Combines all contrasts into a single data frame and writes it to a CSV file.
Output:
- `all_contrasts_with_FDR.csv`
- A wide table where each contrast contributes five columns (`logFC_*`, `logCPM_*`, `LR_*`, `PValue_*`, `FDR_*`). Each row corresponds to one gene ID.
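The wide layout described above can be sketched in base R as follows. The per-contrast tables here are made-up stand-ins for edgeR results, `make_toy` is a hypothetical helper, and the Benjamini-Hochberg adjustment is an assumption about how the FDR column is computed.

```r
# Toy per-contrast tables standing in for edgeR results (values are made up).
make_toy <- function(seed) {
  set.seed(seed)
  data.frame(gene_id = c("g1", "g2", "g3"),
             logFC = rnorm(3), logCPM = runif(3, 1, 10),
             LR = rexp(3), PValue = runif(3))
}
results <- list(HLpDayvsLLpDay = make_toy(1), HLpDayvsHLmDay = make_toy(2))

# Add FDR (Benjamini-Hochberg, an assumption) and suffix every statistic
# with the contrast name.
tagged <- lapply(names(results), function(nm) {
  tab <- results[[nm]]
  tab$FDR <- p.adjust(tab$PValue, method = "BH")
  stat_cols <- setdiff(names(tab), "gene_id")
  names(tab)[match(stat_cols, names(tab))] <- paste0(stat_cols, "_", nm)
  tab
})

# Merge on gene_id so each row is one gene and each contrast adds five columns.
wide <- Reduce(function(a, b) merge(a, b, by = "gene_id", all = TRUE), tagged)
print(names(wide))
```

Merging on `gene_id` rather than binding columns side by side keeps rows aligned even if the per-contrast tables are sorted differently.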
3.3 Fe × Light interaction (RNA-seq)
------------------------------------
The second part of the script focuses on the Fe × Light interaction, restricting to +Fe (p) and −Fe (m) samples:
1. Builds a `meta` data frame from column names and the `group` vector, with:
   - `Light` (LL / HL)
   - `Fe` (p / m / d / r)
   - `TimeOfDay` (Day / Night; coded times like 00, 23, 03 are treated as Day when not explicitly labeled).
2. Filters to +Fe vs −Fe only (Fe ∈ {p, m}) and drops resupply (r) and DFOB (d) conditions.
3. Runs an edgeR quasi-likelihood pipeline with design:
       design_int <- model.matrix(~ Fe * Light + TimeOfDay, data = meta_int)
   which includes main effects for Fe and Light and their interaction term, plus TimeOfDay as a covariate.
4. Identifies the Fe:Light interaction coefficient and runs `glmQLFTest` on that term.
5. Outputs a full ranked table of genes with interaction statistics:
Outputs (in directory `interaction_rna/`):
- `rna_FeXLight_interaction_edgeR.csv`
  - Columns: `gene_id`, `logFC`, `logCPM`, `F`, `PValue`, `FDR`
  - `logFC` is the log2 coefficient of the Fe:Light interaction term, i.e., how much the Fe effect on expression differs between the two light levels.
- `rna_FeXLight_interaction_summary.csv`
  - Summary counts of genes tested and the number passing FDR < 0.05 and FDR < 0.10.
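The interaction design can be inspected without any count data. The sketch below builds a toy `meta_int` (made-up replicate structure) and shows which design column corresponds to the Fe × Light interaction. The reference levels chosen here (+Fe and LL) are an assumption; the deposited script may order its factor levels differently.

```r
# Toy metadata: two replicates per Fe x Light x TimeOfDay cell (made up).
meta_int <- expand.grid(rep = 1:2, Fe = c("p", "m"), Light = c("LL", "HL"),
                        TimeOfDay = c("Day", "Night"),
                        stringsAsFactors = FALSE)
# Reference levels set explicitly: +Fe (p) and LL as baselines (assumption).
meta_int$Fe        <- factor(meta_int$Fe, levels = c("p", "m"))
meta_int$Light     <- factor(meta_int$Light, levels = c("LL", "HL"))
meta_int$TimeOfDay <- factor(meta_int$TimeOfDay, levels = c("Day", "Night"))

design_int <- model.matrix(~ Fe * Light + TimeOfDay, data = meta_int)

# The interaction term is the only coefficient whose name contains ":".
int_coef <- grep(":", colnames(design_int), value = TRUE)
print(colnames(design_int))
print(int_coef)
```

With these factor levels, the interaction coefficient equals (m − p in HL) − (m − p in LL). Different reference levels would change its sign or interpretation, so the level order in the actual script should be checked before interpreting `logFC`.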
----------------------------------------------------------------
4. Proteomics Differential Abundance (limma)
----------------------------------------------------------------
4.1 Input
--------
- File: `SupplementaryData_08_proteomics.csv`
  - Contains normalized, imputed protein intensities.
- The script:
  - Skips lines 2–3 using `read_csv_drop_lines()` (to remove extra annotation rows).
  - Expects one column named exactly `id` for protein IDs.
  - Treats all remaining columns that match the pattern `^(HL|LL)([pmr])(00|11|23)([A-Z])$` as sample columns, where:
    - `HL` / `LL` = light treatment
    - `p` / `m` / `r` = +Fe, −Fe, or resupply
    - `00`, `11`, `23` = sampling time codes
    - `A`, `B`, `C` = biological replicates
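The column-matching rule above can be tried on a few example names. In this sketch the column names are hypothetical; only the pattern is quoted from this README. The capture groups yield the light, Fe, time code, and replicate for each sample, and the `cond` label follows the `Light.Fe.Tcode` form described in section 4.2.

```r
# Hypothetical column names: an ID column, an annotation column, six samples.
cols <- c("id", "description", "HLp00A", "HLp00B", "LLm23A", "LLm23C",
          "HLr11B", "LLp11A")

sample_pattern <- "^(HL|LL)([pmr])(00|11|23)([A-Z])$"  # pattern quoted in this README
is_sample <- grepl(sample_pattern, cols)
samples <- cols[is_sample]

# Capture groups give Light, Fe, Tcode and replicate for each sample column.
m <- do.call(rbind, regmatches(samples, regexec(sample_pattern, samples)))
md <- data.frame(sample = m[, 1], Light = m[, 2], Fe = m[, 3],
                 Tcode = m[, 4], Rep = m[, 5], stringsAsFactors = FALSE)
# Condition label in the Light.Fe.Tcode form, e.g. "HL.m.23".
md$cond <- paste(md$Light, md$Fe, md$Tcode, sep = ".")
print(md)
```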
4.2 Preprocessing
-----------------
1. Builds a sample metadata table (`md`) with:
- `Light` ∈ {LL, HL}
- `Fe` ∈ {p, m, r}
- `Tcode` ∈ {11, 00, 23}
- `cond` = interaction of `Light.Fe.Tcode` (e.g., `HL.m.23`).
2. Extracts the expression matrix `X`, converts to numeric, and sets rownames to protein IDs.
3. Applies log2 transform with a pseudocount (offset = 1), then `normalizeBetweenArrays` (quantile normalization).
4. Removes proteins with zero variance across samples using `rowSds`.
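Two of the preprocessing steps above, the log2 transform with a pseudocount and the zero-variance filter, can be sketched in base R on a toy matrix. The values are made up, `apply(..., sd)` is used here as a base R stand-in for `matrixStats::rowSds`, and the quantile normalization step (`normalizeBetweenArrays`) is deliberately left out.

```r
# Toy intensity matrix: 4 proteins x 4 samples (made-up values).
X <- matrix(c(1023, 2047, 4095, 8191,
              511,  511,  511,  511,
              0,    15,   63,   255,
              7,    7,    7,    7),
            nrow = 4, byrow = TRUE,
            dimnames = list(c("prot1", "prot2", "prot3", "prot4"),
                            c("HLp00A", "HLp00B", "LLp00A", "LLp00B")))

# log2 transform with a pseudocount of 1, so zeros map to 0 instead of -Inf.
X_log <- log2(X + 1)

# Drop proteins whose values do not vary across samples. Base R stand-in for
# matrixStats::rowSds; quantile normalization is omitted in this sketch.
row_sd <- apply(X_log, 1, sd)
X_filt <- X_log[row_sd > 0, , drop = FALSE]
print(X_filt)
```

Constant rows carry no information for a linear model and would otherwise produce undefined test statistics.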
4.3 Design and contrasts
------------------------
1. Builds a design matrix with one coefficient per observed `cond`:
       design <- model.matrix(~ 0 + cond, data = md)
2. Fits a linear model with `lmFit` and `eBayes` (trend = TRUE, robust = TRUE).
3. Programmatically constructs a series of numeric contrast vectors (stored in list `C`), including:
   - Fe effects within light at “day” sampling:
     - (−Fe “day”) − (+Fe “day”) for each light level (LL, HL), pooling Day00/Day23 where appropriate.
   - Resupply vs +Fe within light at day:
     - Resupply “day” vs +Fe “day” for LL and HL.
   - Day vs Night within each Fe × Light combination:
     - (Fe “day”) − (Fe “night”) for p, m, r in LL and HL.
   - HL vs LL within a given Fe at day:
     - (HL, Fe, day) − (LL, Fe, day), each taken at its appropriate Day timepoints.
All contrasts are built only if the required timepoints/conditions exist in the design (the code skips missing combinations safely).
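One way to build such numeric contrast vectors, including pooling of timepoints and skipping of missing conditions, is sketched below. This is an illustration under stated assumptions, not the deposited code: `make_contrast` is a hypothetical helper, pooling is taken to mean an equal-weight average, and the `00` and `23` codes are treated as the "day" timepoints.

```r
# Design coefficients named as model.matrix(~ 0 + cond) would name them.
# LL.m.23 is deliberately absent to show how a missing condition is handled.
design_cols <- c("condHL.p.00", "condHL.p.23", "condHL.m.00", "condHL.m.23",
                 "condLL.p.00", "condLL.p.23", "condLL.m.00")

# Hypothetical helper: average the `plus` coefficients, subtract the average
# of the `minus` coefficients, and use only conditions present in the design.
# Returns NULL when either side has no coefficient at all.
make_contrast <- function(design_cols, plus, minus) {
  plus  <- intersect(paste0("cond", plus),  design_cols)
  minus <- intersect(paste0("cond", minus), design_cols)
  if (length(plus) == 0 || length(minus) == 0) return(NULL)
  w <- setNames(numeric(length(design_cols)), design_cols)
  w[plus]  <-  1 / length(plus)
  w[minus] <- -1 / length(minus)
  w
}

C <- list(
  HL_Day_m_vs_p = make_contrast(design_cols, c("HL.m.00", "HL.m.23"),
                                c("HL.p.00", "HL.p.23")),
  LL_Day_m_vs_p = make_contrast(design_cols, c("LL.m.00", "LL.m.23"),
                                c("LL.p.00", "LL.p.23")),
  LL_Day_r_vs_p = make_contrast(design_cols, c("LL.r.00", "LL.r.23"),
                                c("LL.p.00", "LL.p.23"))
)
C <- Filter(Negate(is.null), C)  # skip contrasts that could not be built
print(names(C))
print(C$LL_Day_m_vs_p)
```

Each vector sums to zero, and the resupply contrast is dropped because no `r` condition exists in this toy design.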
4.4 Outputs
-----------
All results are written to the folder:
- `limma_results/`
Key files:
- Per-contrast differential abundance tables
  - For each contrast name (e.g., `HL_Day_r_vs_p`, `LL_Day_m_vs_p`, `HLvsLL_p_Day`), the script writes:
    - `limma_results/<contrast>.csv`
    - Columns:
      - `ProteinID`
      - `logFC`
      - `AveExpr`
      - `t`
      - `P.Value`
      - `adj.P.Val`
      - `B`
- Summaries
  - `limma_results/summary_counts.csv`
    - Basic counts of significant proteins per contrast.
  - `limma_results/summary_counts_with_pct.csv`
    - Adds `n_tested`, `pct_FDR_5`, and counts of up- and down-regulated proteins at FDR ≤ 0.05.
- Design and contrast metadata
  - `limma_results/design_columns.csv`
    - Lists the design matrix coefficients.
  - `limma_results/<contrast>_weights.csv` for each contrast
    - Records the non-zero coefficient weights used to build that contrast.
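The per-contrast summaries can be recomputed from any of the per-contrast tables. The sketch below uses a made-up table with the `logFC` and `adj.P.Val` columns listed above. The helper `summarise_contrast` and the column names `n_FDR_5`, `n_up`, and `n_down` are hypothetical; only `n_tested` and `pct_FDR_5` are named in this README.

```r
# Toy limma-style table (made-up values) for one contrast.
tab <- data.frame(
  ProteinID = paste0("P", 1:6),
  logFC     = c(1.8, -2.1, 0.3, 0.9, -0.2, -1.4),
  adj.P.Val = c(0.001, 0.020, 0.400, 0.049, 0.800, 0.060)
)

# Hypothetical helper mirroring the summary columns described in this README.
summarise_contrast <- function(tab, fdr = 0.05) {
  sig <- tab$adj.P.Val <= fdr
  data.frame(n_tested  = nrow(tab),
             n_FDR_5   = sum(sig),
             pct_FDR_5 = 100 * mean(sig),
             n_up      = sum(sig & tab$logFC > 0),
             n_down    = sum(sig & tab$logFC < 0))
}
print(summarise_contrast(tab))
```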
----------------------------------------------------------------
5. Running the Scripts
----------------------------------------------------------------
From R (or RStudio), set the working directory to the folder containing the scripts and supplemental data, for example:
setwd("path/to/unzipped_zenodo_archive")
Then:
- RNA-seq differential expression + Fe × Light interaction
source("Code_01_Transcriptomics_edgeR.R")
- Proteomics limma analyses
source("Code_02_Proteomics_limma.R")
Provided the paths at the top of each script (`infile`, `outdir`, and any hard-coded paths) point to the included supplementary data files, the scripts will reproduce the RNA and protein differential expression results used in the manuscript and supplementary figures.
Files (77.5 MB)
| MD5 | Size |
|---|---|
| 9b448399b0882e48ac1b3543b7c99419 | 6.1 kB |
| 3009093e9f4767bd9621b48d8e2cb063 | 11.3 kB |
| de33b16e94543709de77e0ec0f47705d | 4.0 MB |
| e60e517ecb84c4b349cbed65d422685e | 10.1 kB |
| 4490c2de9abb3aef6f821ea72101fed6 | 13.1 MB |
| bec35c4eedbff45cc08901c5f294ad9d | 12.4 MB |
| 5468aab063a1d9179a980bf26569e370 | 21.0 MB |
| 189a39fc796e1945f62d15d3506ffd0e | 7.6 MB |
| f37460d1c962cab793b1e4544bb64daf | 9.7 MB |
| da72a51dbf2b081198458682282a19d9 | 3.8 MB |
| 719358a9be91e4257dc5443b413e2133 | 1.3 MB |
| b4a0e71440d205add3c2162309b2e07e | 2.4 MB |
| e80eff4479e85c5a4b9438941453dca3 | 3.8 kB |
| 28649509d08a84df3ab6d036054692d0 | 1.1 MB |
| b0f4045545604b2526f4dfa5455735fd | 336.6 kB |
| f8f4b12b7c4db2a2be7ecbe52f9750ce | 84.0 kB |
| ad2bc918bb508621567983a491bf80ef | 148.0 kB |
| 88021bfed5dee6706c17600b4f5df51a | 308.1 kB |
| c7946d746d7eacedd408a80e1a752009 | 137.5 kB |
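The MD5 checksums in the listing above can be used to confirm that a downloaded file is intact. The sketch below is an illustration only: `verify_md5` is a hypothetical helper built on `tools::md5sum`, and the demonstration runs on a temporary file rather than on a file from this deposit.

```r
# Hypothetical helper: compare a file's MD5 (via tools::md5sum) with the
# checksum published on the record page.
verify_md5 <- function(path, expected) {
  observed <- unname(tools::md5sum(path))
  !is.na(observed) && identical(tolower(observed), tolower(expected))
}

# Self-contained demonstration on a temporary file.
tmp <- tempfile(fileext = ".csv")
writeLines("id,HLp00A", tmp)
checksum <- unname(tools::md5sum(tmp))
print(verify_md5(tmp, checksum))  # compared against its own checksum
print(verify_md5(tmp, "9b448399b0882e48ac1b3543b7c99419"))  # a checksum from the listing
```

For a real check, pass the path of the downloaded file and the checksum shown next to it on the record page.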