Phytophthora in horticultural nursery green waste - a risk to plant health
Creators
- 1. Forest Research
- 2. James Hutton Institute
Description
This dataset on Zenodo accompanies the manuscript Schiffer-Forsyth et al. (2023), Phytophthora in horticultural nursery green waste – a risk to plant health.
There are two files:
- metadata.tsv - plain text table as tab-separated variables
- raw_data.tar.gz - compressed archive of 81 paired raw FASTQ files
This represents a complete Illumina MiSeq run, with the names of the unrelated samples redacted.
To repeat the analysis described in the paper, first install THAPBI PICT. See https://github.com/peterjc/thapbi-pict/ for instructions. At the time of the paper, v0.14.1 was the current release (with a near-identical v1.0.0 expected to be released shortly).
Next, extract the archive into a folder of paired gzipped FASTQ files. There is no need to decompress the individual FASTQ files:
$ tar -zxvf raw_data.tar.gz
$ ls -1 raw_data/
If you wish, verify the checksums to confirm the data integrity:
$ cd raw_data/
$ md5sum -c MD5SUM.txt
$ cd ..
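For readers unfamiliar with checksum manifests, the following is a minimal sketch of what md5sum -c does, run on a throwaway file rather than the real data. The helper name verify_manifest, the file demo.fastq.gz and the temporary directory are hypothetical. It assumes GNU coreutils md5sum and that MD5SUM.txt uses the usual "checksum, two spaces, filename" layout.

```shell
# verify_manifest DIR: check every file listed in DIR/MD5SUM.txt.
# (Hypothetical helper, not part of THAPBI PICT or this dataset.)
verify_manifest() {
    ( cd "$1" && LC_ALL=C md5sum -c MD5SUM.txt )
}

# Build a scratch directory with one stand-in file and its manifest.
demo_dir=$(mktemp -d)
printf 'example\n' > "$demo_dir/demo.fastq.gz"
( cd "$demo_dir" && md5sum demo.fastq.gz > MD5SUM.txt )

# An untouched file is reported as OK and the command exits with status 0.
verify_manifest "$demo_dir"

# After the file is altered, md5sum -c exits non-zero.
printf 'corrupted\n' >> "$demo_dir/demo.fastq.gz"
if verify_manifest "$demo_dir" >/dev/null 2>&1; then
    echo "unexpected: altered file still matches"
else
    echo "mismatch detected, as expected"
fi
```

On the real data, any line not reported as OK would suggest an incomplete download or a damaged archive.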
Set up the output directories:
$ mkdir -p intermediate/ summary/
You can run the analysis in one step. This should take under five minutes:
$ thapbi_pict pipeline -i raw_data/ \
-n raw_data/SynCtrl_*.fastq.gz \
-y raw_data/SynCtrl_*.fastq.gz \
-s intermediate/ -o summary/ \
-t metadata.tsv -x 3 -c 1,2,4,5
The options here are as follows:
- -i raw_data/ - input directory of paired raw FASTQ files
- -n raw_data/SynCtrl_*.fastq.gz - negative controls used to increase the absolute abundance threshold
- -y raw_data/SynCtrl_*.fastq.gz - synthetic controls used to increase the fractional abundance threshold
- -s intermediate/ - optional location to store intermediate files
- -o summary/ - output location for reports
- -t metadata.tsv -x 3 -c 1,2,4,5 - show and sort on metadata columns 1, 2, 4 and 5 from metadata.tsv, using column 3 to cross-reference the FASTQ filename stems (given as semi-colon separated lists where a sample has replicates)
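To illustrate the cross-referencing described in the last option, here is a minimal sketch that expands column 3 into one line per FASTQ filename stem. The table contents, the header row and the function name list_stems are all invented for illustration, and the real metadata.tsv may be laid out differently. This is not how THAPBI PICT itself parses the file.

```shell
# Hypothetical metadata table: columns 1, 2, 4 and 5 are descriptive,
# column 3 holds semi-colon separated FASTQ filename stems.
demo_tsv=$(mktemp)
printf 'Site\tType\tStems\tDate\tNote\n' > "$demo_tsv"
printf 'A\tWater\tA-Water_S1;A-Water_S2\t2020-01\tdemo\n' >> "$demo_tsv"
printf 'B\tSoil\tB-Soil_S3\t2020-02\tdemo\n' >> "$demo_tsv"

# list_stems FILE: print "stem<TAB>column 1<TAB>column 2" for every stem
# found in column 3, skipping the (assumed) header row.
list_stems() {
    awk -F '\t' 'NR > 1 {
        n = split($3, stems, ";")
        for (i = 1; i <= n; i++) print stems[i] "\t" $1 "\t" $2
    }' "$1"
}

list_stems "$demo_tsv"
```

In this made-up table the first row has two replicates, so it expands to two stems that share the same metadata.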
The command above also relies on the following key default settings:
- -a 100 -f 0.001 - default absolute and fractional abundance thresholds
- -d - - default to the provided ITS1 database
With these settings, only synthetic sequences were found in the controls, and therefore the thresholds were not automatically increased any further.
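As a rough illustration of how the two thresholds interact, the sketch below assumes that a sequence is kept in a sample only if its read count meets both the absolute threshold and the fractional threshold applied to that sample's read total. This is a simplification and not THAPBI PICT's actual implementation; the function name passes_thresholds and the example counts are hypothetical.

```shell
# passes_thresholds COUNT TOTAL [ABS] [FRAC]
# Exit status 0 if COUNT reads would survive both cut-offs in a sample with
# TOTAL reads. The defaults mirror -a 100 -f 0.001. (Hypothetical helper.)
passes_thresholds() {
    awk -v count="$1" -v total="$2" -v a="${3:-100}" -v f="${4:-0.001}" \
        'BEGIN { exit !(count >= a && count >= f * total) }'
}

# report COUNT TOTAL: print a one-line verdict for the given counts.
report() {
    if passes_thresholds "$1" "$2"; then
        echo "count=$1 total=$2: kept"
    else
        echo "count=$1 total=$2: dropped"
    fi
}

report 150 50000     # above 100 and above 0.1% of 50,000 (50 reads)
report 150 250000    # above 100 but below 0.1% of 250,000 (250 reads)
report 80 50000      # below the absolute threshold of 100
```

Under this assumption, the fractional threshold only matters for samples with more than 100,000 reads, where 0.1% of the total exceeds the absolute threshold of 100.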
Note that some of these options could change in future releases of the software. In particular, future updates to the default database will likely include additional Phytophthora species or sequences.
Output file summary/ITS1.samples.onebp.xlsx (and .tsv) is equivalent to Table 2 (after pooling replicates, and applying human judgement to resolve ambiguous ITS1 markers shared by multiple species).
Note that P. austrocedri was identified in three samples: N2-Water_S2 and N2-Water_S22, described in this work, and a third sample, REDACTED_S28, from another location.
Files
(1.6 GB)
Name | Size | MD5 |
---|---|---|
metadata.tsv | 1.5 kB | c44b2414b2df26652afb4dbaa07e61f9 |
raw_data.tar.gz | 1.6 GB | 72d75f0f5b08d205b56b0e022bef9f82 |