Published January 30, 2018 | Version v.1.0.0
Journal article Open

SECAPR - A bioinformatics pipeline for the rapid and user-friendly alignment of hybrid enrichment sequences, from raw reads to alignments

  • 1. Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
  • 2. Department of Botany and Plant Biology, University of Geneva, Geneva, Switzerland

Description

Evolutionary biology has entered an era of unprecedented amounts of DNA sequence data, as new sequencing technologies such as Massive Parallel Sequencing (MPS) can generate billions of nucleotides in less than a day. The current bottleneck is how to efficiently handle, process, and analyze such large amounts of data in an automated and reproducible way. To tackle this challenge, we introduce the Sequence Capture Processor (SECAPR) pipeline for processing raw sequencing data into multiple sequence alignments for downstream phylogenetic and phylogeographic analyses. SECAPR is user-friendly, and we provide an exhaustive tutorial intended for users with no prior experience in analyzing MPS output. SECAPR is particularly useful for processing sequence capture (= hybrid enrichment) datasets for non-model organisms, as we demonstrate using an empirical dataset of the palm genus Geonoma (Arecaceae). Various quality control and plotting functions help the user decide on the most suitable settings, even for challenging datasets. SECAPR is an easy-to-use, free, and versatile pipeline designed to enable efficient and reproducible processing of MPS data for many samples in parallel.

Files

alignments.zip

Files (3.9 GB)

MD5 checksum                            Size
md5:c5ad7686bff848dcaad37e25a6000973    961.1 kB
md5:cce76de57aded95fa3dec5d2bdae9fb5    21.9 MB
md5:26e6814bc749d8939686c8541d52141b    1.2 GB
md5:f85cd1e6de57b24112dd6312550a67fe    2.6 GB
md5:bfe5b448eb81fedabf266ffffd5140a5    46.1 MB
md5:aba30a47688f19a2e898b6b8ab4bc00f    4.7 MB
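
The MD5 values in the file listing can be used to check that a download arrived intact. The following is a minimal sketch in Python, not part of SECAPR or of this record: the function names `md5_of_file` and `matches_listed_checksum` are hypothetical, and because the listing does not show which checksum belongs to which file name, the file path and the checksum to compare against must both be supplied by the user.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use bounded for multi-gigabyte archives.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a checksum given as 'md5:<hex>' or plain hex."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python verify_md5.py <downloaded file> <md5 from the listing>
    if len(sys.argv) == 3:
        ok = matches_listed_checksum(sys.argv[1], sys.argv[2])
        print("checksum OK" if ok else "checksum MISMATCH")
```

A mismatch usually indicates an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.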