Superoxide is promoted by sucrose and affects amplitude of circadian rhythms in the evening
Ángela Román, Xiang Li, Dongjing Deng, John W. Davey, Sally James, Ian A. Graham, Michael J. Haydon
Contact: m.haydon@unimelb.edu.au, johnomics@gmail.com

This repository contains the following files and folders:
adjusted.pvalues.csv - stageR p values after multiple testing correction
Araport11_genes.201606.cdna.fasta.gz - reference transcript sequences
Araport11_GFF3_genes_transposons.201606.gtf.gz - reference annotation
Araport11_GFF3_genes_transposons.201606.salmon.geneMap.tsv - gene to transcript mapping
cluster_analysis.R - R code to generate and analyse clusters
clusters.14.tsv - clusters of genes used in paper from k=14 analysis
dark_vs_sucrose_0.5h.wald.csv - pairwise Sleuth Wald test result for 0.5h Dark vs Sucrose comparison
dark_vs_sucrose_2h.wald.csv - pairwise Sleuth Wald test result for 2h Dark vs Sucrose comparison
dark_vs_sucrose_8h.wald.csv - pairwise Sleuth Wald test result for 8h Dark vs Sucrose comparison
full_abundances.csv - Sleuth scaled reads per base for each sample and gene
full_lrt.csv - Sleuth likelihood ratio test output for full model including all samples
light_vs_dcmu_0.5h.wald.csv - pairwise Sleuth Wald test result for 0.5h Light vs DCMU comparison
light_vs_dcmu_2h.wald.csv - pairwise Sleuth Wald test result for 2h Light vs DCMU comparison
light_vs_dcmu_8h.wald.csv - pairwise Sleuth Wald test result for 8h Light vs DCMU comparison
README.txt - this file
run_sleuth.R - R code to run Sleuth differential expression analyses
run_stageR.R - R code to run stageR multiple testing correction 
salmon - folder of Salmon output for all samples
samples.tsv - Sleuth sample table with Salmon paths and Treatment, Time, Experiment and Replicate variables
sugar_activated_genes.txt - list of sugar activated genes (also in Dataset 2)
sugar_repressed_genes.txt - list of sugar repressed genes (also in Dataset 2)

Full details of each of these files follow.

REFERENCE
---------
Araport11 reference files were downloaded from https://www.arabidopsis.org/download/index-auto.jsp?dir=%2Fdownload_files%2FGenes%2FAraport11_genome_release on 26 April 2017:
Araport11_genes.201606.cdna.fasta.gz
Araport11_GFF3_genes_transposons.201606.gtf.gz

Transcript to gene map for Salmon:
Araport11_GFF3_genes_transposons.201606.salmon.geneMap.tsv

SALMON
------
The salmon folder contains 53 folders, one for each sample. Each folder contains Salmon transcript counts (quant.sf), Salmon gene counts (quant.genes.sf) and the complete Salmon output in Kallisto format (abundance.h5), including bootstrap values.
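
As an illustration of how the per-sample transcript counts could be read, here is a minimal Python sketch (not part of the repository's R scripts). It assumes the standard Salmon quant.sf layout: a tab-separated file with the header columns Name, Length, EffectiveLength, TPM and NumReads. The function name read_quant_sf and the example values are made up for illustration.

```python
import csv
import io

def read_quant_sf(handle):
    """Return {transcript_name: (tpm, num_reads)} from an open quant.sf file."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["Name"]: (float(row["TPM"]), float(row["NumReads"])) for row in reader}

# Tiny made-up example in quant.sf layout (values are illustrative only).
example = io.StringIO(
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "AT1G01010.1\t1688\t1500.0\t12.5\t240.0\n"
)
quant = read_quant_sf(example)
print(quant)
```

To use this on real data, pass an open file handle for one sample's quant.sf in place of the in-memory example. The exact folder names inside the salmon folder are assumed to match the sample names listed below.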

The following samples are included:
Control_0h_Exp1_Rep1
Control_0h_Exp1_Rep2
Control_0h_Exp1_Rep3
Control_0h_Exp2_Rep2
Control_0h_Exp2_Rep3
DCMU_0.5h_Exp1_Rep1
DCMU_0.5h_Exp1_Rep2
DCMU_0.5h_Exp1_Rep3
DCMU_2h_Exp1_Rep1
DCMU_2h_Exp1_Rep2
DCMU_2h_Exp1_Rep3
DCMU_8h_Exp1_Rep1
DCMU_8h_Exp1_Rep2
DCMU_8h_Exp1_Rep3
Dark_0.5h_Exp1_Rep1
Dark_0.5h_Exp1_Rep2
Dark_0.5h_Exp1_Rep3
Dark_2h_Exp1_Rep1
Dark_2h_Exp1_Rep2
Dark_2h_Exp1_Rep3
Dark_2h_Exp2_Rep1
Dark_2h_Exp2_Rep2
Dark_2h_Exp2_Rep3
Dark_8h_Exp1_Rep1
Dark_8h_Exp1_Rep2
Dark_8h_Exp1_Rep3
Dark_8h_Exp2_Rep1
Dark_8h_Exp2_Rep2
Dark_8h_Exp2_Rep3
Light_0.5h_Exp1_Rep1
Light_0.5h_Exp1_Rep2
Light_0.5h_Exp1_Rep3
Light_2h_Exp1_Rep1
Light_2h_Exp1_Rep2
Light_2h_Exp1_Rep3
Light_8h_Exp1_Rep1
Light_8h_Exp1_Rep2
Light_8h_Exp1_Rep3
Sucrose_0.5h_Exp1_Rep1
Sucrose_0.5h_Exp1_Rep2
Sucrose_0.5h_Exp1_Rep3
Sucrose_2h_Exp1_Rep1
Sucrose_2h_Exp1_Rep2
Sucrose_2h_Exp1_Rep3
Sucrose_2h_Exp2_Rep1
Sucrose_2h_Exp2_Rep2
Sucrose_2h_Exp2_Rep3
Sucrose_8h_Exp1_Rep1
Sucrose_8h_Exp1_Rep2
Sucrose_8h_Exp1_Rep3
Sucrose_8h_Exp2_Rep1
Sucrose_8h_Exp2_Rep2
Sucrose_8h_Exp2_Rep3

There are two experiments (Exp1, Exp2), each with three replicates per condition. The first experiment covers all conditions (Control at 0h; treatments Dark, Sucrose, Light and DCMU at 0.5h, 2h and 8h); the second includes only Control 0h, Dark 2h, Dark 8h, Sucrose 2h and Sucrose 8h. Control_0h_Exp2_Rep1 failed sequencing and is not included.
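
The sample names above follow the pattern Treatment_Time_Experiment_Replicate, so the design can be recovered from the names alone. The following Python sketch shows one way to do this; it is not taken from the repository's scripts, and the helper name parse_sample is hypothetical. How these values are encoded inside samples.tsv itself is not shown here.

```python
from collections import Counter

def parse_sample(name):
    """Split a sample name such as 'Dark_2h_Exp2_Rep1' into its four design fields."""
    treatment, time, experiment, replicate = name.split("_")
    return {
        "Treatment": treatment,
        "Time": time,
        "Experiment": experiment,
        "Replicate": replicate,
    }

# A few of the sample names listed in this README.
samples = [
    "Control_0h_Exp1_Rep1",
    "Control_0h_Exp2_Rep2",
    "Dark_2h_Exp2_Rep1",
    "Sucrose_0.5h_Exp1_Rep3",
]
parsed = [parse_sample(s) for s in samples]

# Count how many of these samples belong to each experiment.
per_experiment = Counter(p["Experiment"] for p in parsed)
print(per_experiment)
```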

SLEUTH
------
Sleuth was run using the script run_sleuth.R as follows:
./run_sleuth.R -s samples.tsv

This script produced the following outputs:
full_lrt.csv - Sleuth likelihood ratio test output for full model including all samples
full_abundances.csv - Sleuth scaled reads per base for each sample and gene

Wald test pairwise comparison output:
dark_vs_sucrose_0.5h.wald.csv
dark_vs_sucrose_2h.wald.csv
dark_vs_sucrose_8h.wald.csv
light_vs_dcmu_0.5h.wald.csv
light_vs_dcmu_2h.wald.csv
light_vs_dcmu_8h.wald.csv

Three columns in Dataset 1 are taken from these Sleuth pairwise comparison outputs:
Log2 Effect Size    - b
Log2 Effect Size SE - b_se
Log2 Mean Abundance - mean_obs
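
The column correspondence above can be expressed as a simple rename. This Python sketch is illustrative only: the column names b, b_se and mean_obs are taken from this README as written, while the target_id column, the comma-separated layout of the example and the helper name to_dataset1_row are assumptions. Check the header of the actual .wald.csv files before relying on it.

```python
import csv
import io

# Sleuth column name -> Dataset 1 column name, as listed in this README.
DATASET1_COLUMNS = {
    "b": "Log2 Effect Size",
    "b_se": "Log2 Effect Size SE",
    "mean_obs": "Log2 Mean Abundance",
}

def to_dataset1_row(row):
    """Keep only the mapped columns of one row and rename them to Dataset 1 names."""
    return {DATASET1_COLUMNS[k]: v for k, v in row.items() if k in DATASET1_COLUMNS}

# Made-up single-row example; the real files may contain more columns.
example = io.StringIO("target_id,b,b_se,mean_obs\nAT1G01010,1.2,0.3,5.6\n")
rows = [to_dataset1_row(r) for r in csv.DictReader(example)]
print(rows)
```

Columns not listed in the mapping are dropped, and values are left as strings so that no rounding is introduced by the rename.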

STAGER
------
Sleuth output was processed and corrected for multiple testing with stageR using the script run_stageR.R, which produces adjusted.pvalues.csv, a file of adjusted p values for each gene and pairwise comparison. These p values are the AdjustedP values in Dataset 1.


SUGAR-REGULATED GENE SETS
-------------------------
The sugar-regulated genes listed in Dataset 2 are provided here as text files, as they are required for the cluster analysis (below):
sugar_activated_genes.txt
sugar_repressed_genes.txt


CLUSTER ANALYSIS
----------------
The cluster_analysis.R script requires samples.tsv, sugar_activated_genes.txt, sugar_repressed_genes.txt, full_lrt.csv and full_abundances.csv as input. It produces the following:
clusters.14.tsv - clustering output for k=14 clusters, as used in the paper
elbow_plot.pdf - Elbow plot for Figure S4
trajectories.pdf - cluster trajectories plot for Figure 1D
Dataset3.xlsx - raw cluster data and GO analysis for Dataset 3
enrichment_plot.pdf - GO enrichment diagram for Figure 1E / Dataset 4