colford/nbgw-plant-illumina-pipeline: Temperate grass allergy season defined by spatio-temporal shifts in pollen biodiversity
Description
Bioinformatic analysis scripts for "Temperate grass allergy season defined by spatio-temporal shifts in pollen biodiversity".
Initial sequence processing followed a modified version of the workflow described in (16). Briefly, raw sequences were trimmed using Trimmomatic v0.33 (17) to remove short reads (<200 bp), adaptors and low-quality regions. Paired-end reads were merged using FLASH v1.2.11 (10, 18), and merged reads shorter than 450 bp were excluded. Identical reads were collapsed using fastx-toolkit (v0.0.14), and reads were split into ITS2 and rbcL based on primer sequences.
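The last two steps (collapsing identical reads and splitting by marker) can be illustrated with a minimal Python sketch. It is not the archived pipeline, which uses fastx-toolkit for collapsing. The function names are hypothetical, and the primer sequences must be supplied by the caller because they are not given in this description. Matching the primer at the start of the read is also an assumption.

```python
from collections import Counter

# From the description: merged reads shorter than 450 bp are excluded.
MIN_MERGED_LENGTH = 450


def filter_and_dereplicate(reads, min_length=MIN_MERGED_LENGTH):
    """Drop reads shorter than min_length, then count identical sequences."""
    return Counter(seq for seq in reads if len(seq) >= min_length)


def split_by_primer(counts, primers):
    """Assign each unique sequence to the first marker whose primer it starts with.

    `primers` maps a marker name (e.g. "ITS2", "rbcL") to its forward primer.
    Sequences matching no primer are collected under "unassigned".
    """
    bins = {marker: {} for marker in primers}
    bins["unassigned"] = {}
    for seq, n in counts.items():
        for marker, primer in primers.items():
            if seq.startswith(primer):
                bins[marker][seq] = n
                break
        else:
            # No primer matched this sequence.
            bins["unassigned"][seq] = n
    return bins
```

Called as `split_by_primer(filter_and_dereplicate(reads), {"ITS2": its2_primer, "rbcL": rbcl_primer})`, this returns one dictionary of unique sequences and their read counts per marker.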
To prevent spurious BLAST hits, custom databases containing rbcL and ITS2 sequences from UK plant species were generated. A list of species found in the UK was compiled by combining lists of native and alien species (19) with a list of cultivated plants obtained from Botanic Gardens Conservation International (BGCI), which represented horticultural species. All available rbcL and ITS2 records were downloaded from NCBI GenBank, and sequences belonging to UK species were extracted using the script 'creatingselectedfastadatabase.py', archived on GitHub.
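The extraction step can be sketched as follows. This is an illustration only, not the contents of 'creatingselectedfastadatabase.py'. It assumes each FASTA header has the layout `<accession> <Genus> <species> <description>`, which is common for GenBank downloads but not guaranteed. All function names are hypothetical.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def species_of(header):
    """Return 'Genus species' under the assumed header layout, else None."""
    parts = header.split()
    return " ".join(parts[1:3]) if len(parts) >= 3 else None


def select_uk_records(lines, uk_species):
    """Keep only records whose species name appears in the UK species list."""
    wanted = {name.lower() for name in uk_species}
    return [
        (header, seq)
        for header, seq in read_fasta(lines)
        if (species_of(header) or "").lower() in wanted
    ]
```

Comparing names case-insensitively avoids losing records to inconsistent capitalisation. Synonyms and subspecies would need additional handling that this sketch does not attempt.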
Metabarcoding data were searched against the relevant sequence database using blastn (20), via the script 'blast_with_ncbi.py'. The top twenty BLAST hits for each read were tabulated ('blast_summary.py'), then manually filtered to limit results to species currently present in Great Britain. Reads occurring fewer than four times were excluded from further analysis.
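A minimal sketch of the tabulation and abundance filter is shown below. It assumes the BLAST results are in the standard 12-column tabular format (`-outfmt 6`), and that hits are ranked by bit score. Neither assumption is confirmed by this description, and the archived 'blast_summary.py' may work differently. The function names are hypothetical.

```python
from collections import defaultdict

TOP_HITS = 20        # from the description: top twenty BLAST hits
MIN_READ_COUNT = 4   # from the description: reads seen fewer than four times are excluded


def top_hits(tabular_lines, n=TOP_HITS):
    """Group tabular BLAST rows by query and keep the n best hits by bit score.

    Assumed columns: qseqid, sseqid, pident, ..., evalue, bitscore (12 in total).
    Returns {query_id: [(subject_id, percent_identity, bitscore), ...]}.
    """
    hits = defaultdict(list)
    for line in tabular_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed rows
        hits[fields[0]].append((fields[1], float(fields[2]), float(fields[11])))
    return {
        query: sorted(rows, key=lambda row: row[2], reverse=True)[:n]
        for query, rows in hits.items()
    }


def drop_rare_reads(read_counts, min_count=MIN_READ_COUNT):
    """Keep only reads that occur at least min_count times."""
    return {read: count for read, count in read_counts.items() if count >= min_count}
```

The manual filtering to species currently present in Great Britain is a curation step and is deliberately left out of the sketch.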
Files
| Name | Size | Checksum |
|---|---|---|
| colford/nbgw-plant-illumina-pipeline-V1.02.zip | 36.3 kB | md5:d8694dded4f3565a09c1280b9fbaa3b9 |
Additional details
Related works
- Is supplement to: https://github.com/colford/nbgw-plant-illumina-pipeline/tree/V1.02