There is a newer version of the record available.

Published July 5, 2018 | Version V1.02
Software Open

colford/nbgw-plant-illumina-pipeline: Temperate grass allergy season defined by spatio-temporal shifts in pollen biodiversity

Authors/Creators

  • 1. Spirent

Description

Bioinformatic analysis scripts for "Temperate grass allergy season defined by spatio-temporal shifts in pollen biodiversity".

Initial sequence processing followed a modified version of the workflow described in (16). Briefly, raw sequences were trimmed using Trimmomatic v0.33 (17) to remove short reads (<200 bp), adaptors and low-quality regions. Reads were merged using FLASH v1.2.11 (10, 18), and merged reads shorter than 450 bp were excluded. Identical reads were collapsed using fastx-toolkit (v0.0.14), and reads were split into ITS2 and rbcL sets based on their primer sequences.
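The length filter, the collapsing of identical reads and the primer-based split can be illustrated with a minimal Python sketch. This is not the archived pipeline, which uses FLASH and fastx-toolkit for these steps. The function names are hypothetical, the primer strings passed in are placeholders rather than the real ITS2 and rbcL primers, and exact prefix matching is a simplifying assumption (real primers may contain degenerate bases or mismatches).

```python
from collections import Counter


def filter_and_collapse(reads, min_length=450):
    """Drop reads shorter than min_length and count identical sequences."""
    counts = Counter()
    for seq in reads:
        seq = seq.strip().upper()
        if len(seq) >= min_length:
            counts[seq] += 1
    return counts


def split_by_primer(counts, primers):
    """Assign each unique read to the first marker whose primer it starts with.

    primers maps a marker name to a forward primer string. The values are
    placeholders supplied by the caller, not the primers used in the study.
    Reads matching no primer are left unassigned.
    """
    bins = {marker: {} for marker in primers}
    for seq, n in counts.items():
        for marker, primer in primers.items():
            if seq.startswith(primer):
                bins[marker][seq] = n
                break
    return bins
```

Keeping the read count alongside each unique sequence means a later abundance threshold can be applied without re-reading the raw data.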

To prevent spurious BLAST hits, custom reference databases containing rbcL and ITS2 sequences from UK plant species were built. A list of species found in the UK was compiled by combining lists of native and alien species (19) with a list of cultivated plants, representing horticultural species, obtained from Botanic Gardens Conservation International (BGCI). All available rbcL and ITS2 records were downloaded from NCBI GenBank, and sequences belonging to UK species were extracted using the script 'creatingselectedfastadatabase.py', archived on GitHub.
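As a rough illustration of the extraction step, the sketch below keeps only FASTA records whose species appears in a supplied list. It is not the code of 'creatingselectedfastadatabase.py'. The function names are hypothetical, and it assumes each header has the layout "accession Genus epithet ...", which is common for GenBank downloads but is not confirmed by this record.

```python
def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def select_species(records, species):
    """Keep records whose binomial name is in the species list.

    Assumed header layout: "<accession> <Genus> <epithet> ...".
    """
    wanted = {name.lower() for name in species}
    for header, seq in records:
        words = header.split()
        if len(words) >= 3 and " ".join(words[1:3]).lower() in wanted:
            yield header, seq
```

Matching on the binomial alone ignores synonyms and subspecies, so a real species list would likely need name reconciliation first.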

Metabarcoding data were searched against the relevant sequence database using blastn (20), via the script 'blast_with_ncbi.py'. The top twenty BLAST hits were tabulated ('blast_summary.py'), then manually filtered to limit results to species currently present in Great Britain. Reads occurring fewer than four times were excluded from further analysis.
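The tabulation and abundance filter might look like the following sketch. It is not the code of 'blast_summary.py'. It assumes BLAST tabular output with the default twelve columns (query ID first, subject ID second, bit score twelfth), ranks hits by bit score, and uses hypothetical function names. The manual filtering to species present in Great Britain is not reproduced.

```python
from collections import defaultdict


def top_hits(blast_lines, n=20):
    """Return the n best subject IDs per query, ranked by bit score.

    Assumes tab-separated BLAST output with the default 12 columns.
    """
    hits = defaultdict(list)
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        hits[query].append((bitscore, subject))
    return {
        query: [s for _, s in sorted(found, key=lambda h: -h[0])[:n]]
        for query, found in hits.items()
    }


def drop_rare_reads(read_counts, min_count=4):
    """Exclude reads that occur fewer than min_count times."""
    return {read: c for read, c in read_counts.items() if c >= min_count}
```

With min_count=4, a read seen exactly four times is retained, matching "fewer than four times were excluded".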

Files

colford/nbgw-plant-illumina-pipeline-V1.02.zip

Files (36.3 kB)

md5:d8694dded4f3565a09c1280b9fbaa3b9

Additional details