Magnetic beads, a particularly effective novel method for extraction of NGS-ready DNA from macroalgae

Abstract Next Generation Sequencing (NGS) technologies allow for the generation of robust information on the genetic diversity of organisms at the individual, species and higher taxon levels. Indeed, the number of single nucleotide polymorphisms (SNPs) detected using next-generation sequencing is orders of magnitude higher than the number of SNPs and/or alleles detected using traditional barcoding or microsatellites. However, the amount and quality of DNA required for next-generation sequencing are greater than for PCR-based methods. Such requirements can be a hindrance when working on species that are rich in polyphenols and polysaccharides, such as macroalgae (seaweeds). Various protocols, based on column-based DNA extraction kits and CTAB/Phenol:Chloroform, are available to tackle this issue. However, those protocols are usually costly and/or time-consuming and may not be reliable due to variations in the polyphenol/polysaccharide content between individuals of the same species. Here, we report the successful use of a magnetic-bead-based protocol for efficient, reliable, fast and simultaneous DNA extraction from several macroalgal species. DNA extracted from macroalgae using this method is of high quality and purity, allowing successful library preparation for next-generation sequencing. We generated Genotyping-By-Sequencing (GBS) data for 12 Ulva spp. (Sea Lettuce; Ulvophyceae, Ulvaceae) individuals and were able to generate a robust phylogenetic tree that we compared with traditional barcoding methods.


Introduction
In recent years, the widespread use of next-generation sequencing (NGS) technologies has led to the increasing use of restriction site-associated DNA sequencing (RAD-seq) for various genetic studies involving ecological assessment [1], genetic characterisation [2] and GWAS (Genome-Wide Association Study) [3] analyses. RAD-Seq is an umbrella term for techniques using DNA restriction enzymes and subsequent fragment size selection to sequence a reduced representation of the genome of interest [4]. The major advantage of RAD-Seq methods is the potential for multiplexing a large number of samples in a single library and/or sequencing lane, drastically reducing the cost of sequencing per sample [5].
Sanger sequencing methods such as barcoding [6] or microsatellite analysis [7], typically involving a simple PCR, require a very low DNA input (from picograms to a few nanograms) and can work on partially degraded DNA. In addition, the presence of contaminants such as polysaccharides and polyphenols can generally be mitigated by diluting the DNA 40- to 500-fold before performing the PCR [8,9]. RAD-Seq, however, requires a larger starting DNA input (50-500 ng of DNA in a relatively low volume, typically 50 μL), is sensitive to DNA degradation, and requires high-molecular-weight DNA free of contaminants to perform reliably [10]. Hence, obtaining high yields of high-purity DNA represents a major challenge when extracting samples for RAD-Seq and, more generally, NGS-based sequencing. DNA extraction from macroalgal samples has proven difficult over the years, due to the presence of high amounts of polysaccharides and phenolics [11,12] that can inhibit downstream enzymatic reactions [13]. Various methods have been published to reduce the copurification of contaminants with DNA. The most common ones involve the use of column-based DNA extraction kits (QIAGEN DNeasy, MACHEREY-NAGEL NucleoSpin, MOBIO PowerPlant Pro), CTAB/Phenol:Chloroform, or a combination of both [9,11,12,14,15]. However, all of those methods have drawbacks. First, column-based kits are expensive, and efforts to remove contaminating polysaccharides often lead to reduced yield. Second, CTAB-based methods, alone or combined with kits, are time-consuming and involve the use of toxic chemicals such as phenol and chloroform. In addition, population-based genetics or genetic marker discovery requires DNA extractions from several hundred individuals, making 96-well plate DNA extraction protocols a prerequisite for processing a large number of samples in a minimal time.
Here, we report the use of a modified magnetic beads protocol (using the MACHEREY-NAGEL NucleoMag Plant kit) for cheap, efficient and rapid purification of DNA from several macroalgal species. We found that magnetic beads produce the highest yield and purity, and are relatively cheap compared with column-based commercial kits. Moreover, the 96-well format allows for the simultaneous and rapid purification of a large number of samples. We then used the DNA extracted from 12 strains of Ulva spp. to generate RAD-seq libraries, sequenced them, and used the resulting data to generate a phylogenetic tree that we compared with a phylogenetic tree based on barcode sequencing.

Algae materials
Intertidal laminar Ulva individuals were collected from several sites in Ireland (Supplementary Table 1), transported from sampling sites in a cooler box in bags filled with seawater. The Netherlands samples originate from the Eastern Scheldt and were shipped to Ireland in Falcon tubes filled with seawater. Upon arrival to the lab, all Ulva samples were rinsed with distilled water, blotted on tissue paper, and immediately frozen in liquid nitrogen. Prior to further use, samples were stored at −80°C. Samples of the kelp Saccharina latissima (Phaeophyceae, Laminariaceae) meristem (~3 × 3 cm in size), also collected throughout Ireland, were treated in the same way. The other macroalgae samples belong to 10 species, identified morphologically, and consist of six brown algae: three Fucus species (Fucus vesiculosus, F. spiralis and F. serratus); the commercially important (for biostimulant production) Ascophyllum nodosum [16], the kelp Laminaria digitata [17], and Pelvetia canaliculata. Four red algae species were also included: Gracilaria gracilis, important in food-grade agar production [18], Phycodrys rubens, a mixture of Chondrus crispus and Mastocarpus stellatus, producers of carrageenans [19], and the Ascophyllum epiphyte Vertebrata lanosa [20]. Fronds from several individuals from each species were harvested from a sheltered shore at the Claddagh, Galway (Supplementary Table 1), pooled and processed in the laboratory as described above. After freezing, all samples were freeze dried, and ground to a fine powder using a ball mill (QIAGEN TissueLyser II).

DNA extraction protocols
All Ulva spp. and Saccharina latissima DNA extractions were performed on sample aliquots of 7 mg dried powder and 200 μL elution volume, to allow for comparison between DNA extraction methods.
DNA was extracted using the NucleoMag Plant kit (744400.1) following the manufacturer's instructions, with some modifications: the lysis of samples was performed for 2 h at 56°C with the addition of 20 μL of 1 mg/mL of proteinase K (Sigma-Aldrich P6556) and 3 μL of RNAse A (provided). After lysis, samples were centrifuged for 15 min at 4°C (instead of room temperature), and the supernatant was used for the rest of the DNA extraction. The magnetic rack used was the Invitrogen™ Magnetic Stand-96 (AM10027), and the deep well plates were from Sigma-Aldrich (BR701354-24EA).
QIAGEN DNeasy PowerPlant Pro (13400-50) and QIAGEN DNeasy Plant Mini Kit (69106) were used as per the manufacturer's instructions, with the addition of 20 μL of 1 mg/mL of proteinase K at the lysis stage.
The CTAB method was adapted from [21]. Lysis was performed using 600 μL of CTAB buffer (100 mM Tris-HCl pH 7.5, 50 mM EDTA pH 7.5, 2% CTAB, 2 M NaCl, 2% PVP and 1% β-Mercaptoethanol added just before use), 20 μL of 1 mg/mL of proteinase K and 3 μL RNAse A, with incubation for 2 h at 56°C. After lysis, 200 μL of 3 M Potassium Acetate pH 4.8 was added to the samples, which were then incubated for 30 min on ice. After incubation, the samples were centrifuged at 13,000 g for 30 min at 4°C, and the supernatant (avoiding the top white layer containing polysaccharides) was extracted twice with an equal volume of Phenol:Chloroform:Isoamylalcohol (25:24:1), and once with an equal volume of Chloroform:Isoamylalcohol (24:1). DNA was precipitated with 1/10 volume Sodium Acetate pH 5.2 and 2.5 volumes 100% Ethanol for 20 min at −80°C. The pellet was washed twice with 75% Ethanol and resuspended in 200 μL of water. DNA yield and quality were assessed by UV absorbance using an IMPLEN Spectrophotometer.
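The absorbance-based assessment mentioned above can be illustrated with a short sketch. It assumes the conventional conversion factor for double-stranded DNA (an A260 of 1.0 over a 1 cm path corresponds to about 50 ng/μL); the function names and the absorbance readings are hypothetical and are not data from this study.

```python
# Conventional factor: A260 of 1.0 (1 cm path) ~ 50 ng/uL of double-stranded DNA.
DSDNA_NG_PER_UL_PER_A260 = 50.0


def dna_concentration(a260, dilution_factor=1.0, path_cm=1.0):
    """Estimate dsDNA concentration (ng/uL) from the absorbance at 260 nm."""
    return a260 / path_cm * DSDNA_NG_PER_UL_PER_A260 * dilution_factor


def purity_ratios(a230, a260, a280):
    """Return the (A260/A280, A260/A230) purity ratios."""
    return a260 / a280, a260 / a230


# Hypothetical readings, for illustration only (not measurements from the paper).
a230, a260, a280 = 0.30, 0.36, 0.20
conc = dna_concentration(a260)
ratio_280, ratio_230 = purity_ratios(a230, a260, a280)
print(f"{conc:.1f} ng/uL, 260/280 = {ratio_280:.2f}, 260/230 = {ratio_230:.2f}")
```

A 260/280 ratio close to 1.8 is the value expected for pure DNA, while a low 260/230 ratio suggests carry-over of carbohydrates or other contaminants.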

PCR analysis and Sanger sequencing
To test whether DNA extracted with magnetic beads, columns or CTAB was suitable for downstream applications, we used PCR to assess the presence of inhibitors and whether the DNA was of sufficient quality and quantity. Most microsatellite or Sanger sequencing methods used to investigate genetic diversity and/or phylogeny in macroalgae require a 1:50 to 1:500 dilution of the extracted DNA in order to obtain amplifiable DNA, due to the presence of various PCR inhibitors such as polysaccharides and polyphenols [8,9,22]. Hence, we performed PCRs on a dilution series of the DNA extracted from both macroalgae species, targeting the rubisco large subunit rbcL gene for Ulva spp. [9], and the microsatellite SLN510 for Saccharina latissima [7]. PCRs were performed in 25 μL reaction volumes using MyTaq Red Mix (Bioline), following the manufacturer's instructions, using 1 μL of undiluted DNA, or DNA diluted 1:5, 1:10 and 1:100 with water. The PCR program consisted of 95°C for 3 min, followed by 35 cycles of 95°C for 30 s, 58°C for 30 s and 72°C for 30 s, and a final extension of 7 min at 72°C. The list of primers is available in Supplementary Table 2. For barcoding, PCR amplicons were sent to LGC Genomics GmbH (Germany) for Sanger sequencing using rbcL primers SHF1 and SHF4 [9].
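To make the dilution series and cycling program concrete, the following sketch computes the template mass per reaction at each dilution and the total programmed hold time. The stock concentration of 18 ng/μL is an assumption (it is the average NucleoMag concentration reported in the results, and individual samples will differ); the function names are hypothetical, and instrument ramping time is ignored.

```python
REACTION_VOLUME_UL = 25.0       # reaction volume stated in the text
TEMPLATE_VOLUME_UL = 1.0        # template volume stated in the text
ASSUMED_STOCK_NG_PER_UL = 18.0  # assumption: average NucleoMag concentration

DILUTION_FACTORS = {"undiluted": 1, "1:5": 5, "1:10": 10, "1:100": 100}


def template_mass_ng(stock_ng_per_ul, factor, volume_ul=TEMPLATE_VOLUME_UL):
    """Mass of DNA (ng) added to one reaction at a given dilution factor."""
    return stock_ng_per_ul / factor * volume_ul


def program_duration_min(cycles=35, initial_s=180, steps_s=(30, 30, 30), final_s=420):
    """Sum of the programmed hold times in minutes (ramping excluded)."""
    return (initial_s + cycles * sum(steps_s) + final_s) / 60


for label, factor in DILUTION_FACTORS.items():
    ng = template_mass_ng(ASSUMED_STOCK_NG_PER_UL, factor)
    print(f"{label:>9}: {ng:.2f} ng template "
          f"({ng / REACTION_VOLUME_UL:.4f} ng/uL in the reaction)")
print(f"Programmed hold time: {program_duration_min():.1f} min")
```

Under these assumptions, the 1:100 dilution delivers well under 1 ng of template per reaction, which is why amplification at that dilution depends on the integrity of the extracted DNA.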

RAD-Seq: genotyping by sequencing (GBS)
Between 500 ng and 2 μg of DNA extracted using magnetic beads from 12 Ulva individuals (6 individuals from Ireland and 6 individuals from the Netherlands) was sent to Novogene (Hong Kong) for GBS library construction and sequencing [23]. Libraries were constructed using Mse1 and EcoR1 digestion with an insert size of ~300 bp, and sequenced on the Illumina HiSeq platform (150 bp paired-end reads). After demultiplexing and adaptor trimming, paired-end reads were analysed using the ipyrad toolkit [24], with the de novo assembly method and a clustering threshold of 0.85. After analysis, the phylogenetic trees were generated using RAxML [25], and visualised using Figtree (http://tree.bio.ed.ac.uk/software/figtree/). Clean reads (adaptors and barcodes removed) are available at the Sequence Read Archive (https://www.ncbi.nlm.nih.gov/sra) under the accession number SRP131215.

Comparison of DNA extraction methods on Ulva spp. and Saccharina latissima
We selected two species of ecological and agronomical importance, the Sea Lettuce Ulva spp. and the kelp Saccharina latissima. Both species are cultivated [26][27][28], and efforts are ongoing to generate genetic markers and/or characterise the genetic diversity amongst these species, mostly in wild strains [7,14,29,30]. However, most of those studies rely on either Sanger sequencing or microsatellite markers to infer genetic diversity, and little to no genome-wide information is available on those species, possibly due to the difficulty of obtaining good quality DNA from macroalgae.
DNA from four individuals of laminar Ulva spp. and four individuals of Saccharina latissima was extracted using the four different protocols described in the Methods section. As shown in Fig. 1A, the NucleoMag protocol generated the highest yield, with an average of 18 ng·μL⁻¹ DNA, for both macroalgae species. The three other protocols, column-based and CTAB, yielded a similar amount of DNA of 7 ng·μL⁻¹. To assess the purity of the extracted nucleic acids, we measured the absorbance ratios at 260 and 280 nm, as well as 260 and 230 nm, to detect proteins, polysaccharides and other contaminants left in the extracted DNA (Fig. 1B and C). The average 260/280 ratio of samples extracted using NucleoMag and PowerPlant kits was ~1.8 for both species, while QIAGEN and CTAB-extracted samples showed higher deviations from the expected 1.8, due either to the presence of contaminants or to a DNA concentration too low to generate an accurate ratio. The 260/230 ratio for all protocols was generally low, indicating the possible presence of carbohydrates and other contaminants left in the samples. However, the ratios do not indicate whether the DNA is degraded or whether RNA contamination is present in the sample. To investigate DNA integrity, we ran 5 μL of the extracted DNA on a 0.8% agarose gel. As shown in Fig. 2A and B, the DNA extracted using magnetic beads shows a clear band of high molecular weight for both species, while the other protocols lead to partial or complete degradation of the DNA, and some RNA remains in DNeasy and CTAB-extracted Saccharina latissima samples. Taking into account yield, ratios and DNA integrity, magnetic-bead extraction outperformed other methods for both the kelp and Sea Lettuce samples.
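The reported concentrations can be converted into total yield and yield per milligram of dried powder using the 200 μL elution volume and the 7 mg sample aliquots given in the methods. The sketch below is a minimal illustration of that arithmetic; the function names are hypothetical, and the inputs are the rounded averages quoted in the text rather than per-sample measurements.

```python
ELUTION_VOLUME_UL = 200.0  # elution volume stated in the methods
DRY_MASS_MG = 7.0          # dried powder per extraction stated in the methods

# Rounded average concentrations (ng/uL) quoted in the text.
MEAN_CONCENTRATION = {"NucleoMag": 18.0, "columns / CTAB": 7.0}


def total_yield_ng(conc_ng_per_ul, volume_ul=ELUTION_VOLUME_UL):
    """Total DNA recovered in the eluate (ng)."""
    return conc_ng_per_ul * volume_ul


def yield_per_mg(conc_ng_per_ul, dry_mass_mg=DRY_MASS_MG):
    """DNA recovered per mg of dried starting material (ng/mg)."""
    return total_yield_ng(conc_ng_per_ul) / dry_mass_mg


for method, conc in MEAN_CONCENTRATION.items():
    print(f"{method}: {total_yield_ng(conc):.0f} ng total, "
          f"{yield_per_mg(conc):.0f} ng per mg dry weight")
```

With these averages, the magnetic bead protocol corresponds to roughly 3.6 μg of DNA per extraction, against roughly 1.4 μg for the other protocols.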
We next tested whether DNA extracted using magnetic beads was suitable for downstream applications using PCR, on undiluted and diluted DNA. Fig. 2C and D show the gel electrophoresis of the PCR products of the different dilutions in Ulva spp. and Saccharina latissima samples, respectively. We successfully amplified the rbcL gene on undiluted Ulva DNA from all protocols, indicating that the samples are overall free of major PCR inhibitors. Interestingly, after 1:100 dilution only NucleoMag and CTAB-extracted DNA show a successful amplification on all four individual samples, possibly due to better DNA integrity in those two protocols compared with column-based ones (Fig. 2A). For Saccharina latissima samples, however, only DNA extracted using magnetic beads successfully produced a PCR amplicon when undiluted (Fig. 2D). A five-fold dilution was enough for DNA extracted with the PowerPlant kit to produce a PCR amplicon, while a hundred-fold dilution was needed for DNeasy-extracted DNA. Samples extracted using the CTAB protocol failed to yield a PCR amplicon. The difference in yield, purity and quality of DNA extracted using magnetic beads and silica-based columns could be due to several factors: i) magnetic beads can attract and release high molecular weight DNA, while columns tend to retain large DNA fragments in the membrane following elution, due to the physical constraints of the silica matrices; ii) the binding between paramagnetic beads and DNA may be more specific than between silica matrix and DNA, leading to higher purity and lower co-purification of polysaccharides and polyphenols. Overall, we found that the use of magnetic beads allows for higher yield and higher purity compared with column-based kits and the CTAB method.
Moreover, it is the only method that, in those eight samples, would consistently generate DNA of high enough quality to pass the stringent quality control requirements of sequencing companies (typically > 300 ng DNA in < 30 μL, an OD260/280 of 1.8, and neither degradation nor contamination). Lastly, as of 2017 in Ireland, the cost per sample (excl. VAT) for the NucleoMag kit was ~1.3€ compared with ~4€ for the PowerPlant and DNeasy kits, indicating that magnetic beads represent a significantly cheaper solution than column-based kits.
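The quantity requirement and the cost comparison can be checked with a few lines of arithmetic. This sketch assumes that "> 300 ng in < 30 μL" means the eluate is submitted without a concentration step, so that the deliverable mass is simply the concentration multiplied by 30 μL; it uses the rounded average concentrations from Fig. 1A and a 96-sample plate as the unit of comparison. All names are hypothetical.

```python
QC_MIN_MASS_NG = 300.0    # typical minimum mass quoted in the text
QC_MAX_VOLUME_UL = 30.0   # typical maximum volume quoted in the text
SAMPLES_PER_PLATE = 96

# Rounded averages quoted in the text (ng/uL) and 2017 costs per sample (EUR, excl. VAT).
MEAN_CONCENTRATION = {"NucleoMag": 18.0, "columns / CTAB": 7.0}
COST_PER_SAMPLE_EUR = {"NucleoMag": 1.3, "PowerPlant / DNeasy": 4.0}


def max_mass_submitted_ng(conc_ng_per_ul):
    """Largest DNA mass deliverable in the allowed volume, without concentrating."""
    return conc_ng_per_ul * QC_MAX_VOLUME_UL


def passes_mass_qc(conc_ng_per_ul):
    """True if the eluate alone can satisfy the minimum-mass requirement."""
    return max_mass_submitted_ng(conc_ng_per_ul) > QC_MIN_MASS_NG


def plate_cost_eur(cost_per_sample):
    """Kit cost for one full 96-well plate of extractions."""
    return cost_per_sample * SAMPLES_PER_PLATE


for method, conc in MEAN_CONCENTRATION.items():
    print(f"{method}: {max_mass_submitted_ng(conc):.0f} ng in 30 uL, "
          f"passes = {passes_mass_qc(conc)}")
for kit, cost in COST_PER_SAMPLE_EUR.items():
    print(f"{kit}: {plate_cost_eur(cost):.2f} EUR per 96 samples")
```

Under this reading, an eluate at 18 ng/μL supplies 540 ng in 30 μL, whereas an eluate at 7 ng/μL supplies only 210 ng and would need to be concentrated before submission.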

Magnetic beads can be used to extract DNA from a wide variety of macroalgal species
We next tested whether magnetic beads could be used to extract DNA from a wider array of macroalgal species by harvesting fronds of several individuals from 10 species found at the Claddagh, Galway, Ireland. Fig. 3 shows a DNA gel electrophoresis of all 10 seaweed species extracted using the magnetic bead protocol. Apart from Ascophyllum nodosum, we were able to recover good-quality DNA from all of the species, indicating that magnetic beads can be used reliably across a wide range of macroalgal species, even on frond samples that traditionally contain the highest levels of polysaccharides and/or polyphenols.

Genotype by sequencing results on 12 Ulva spp. samples allow for a discrimination per geographical location and taxa
To test whether DNA obtained using paramagnetic beads is suitable for next-generation sequencing, we collected 12 laminar Ulva individuals throughout Ireland and the Netherlands (GPS data available in Supplementary Table 1). DNA was extracted using the NucleoMag protocol and 500 ng to 2 μg of DNA was sent to Novogene (Hong Kong) for GBS analysis. All samples passed the stringent QC of the company and the library construction using Mse1 and EcoR1 digestion was successful, with a selected insert size of 265-315 bp. Table 1 shows the summary statistics of the sequencing data. We generated between 500,000 and 1.8 million reads per sample, yielding between 40,000 and 140,000 clusters of ~290 bp with at least 5× coverage per sample. Each cluster can contain useful information for either phylogenetic studies or marker discovery, and after removing clusters shared between < 6 individuals, 7802 clusters remained. Amongst those common clusters, we identified 26,181 parsimony-informative SNPs within our population, allowing us to generate a robust phylogenetic tree representing the genetic diversity of the 12 sampled individuals (Fig. 4A). Colours on trees indicate the closest species in the NCBI database according to the rbcL sequence. In total, three species were identified within our samples: U. rigida C.Agardh [31], U. pseudorotundata Cormaci, G.Furnari & Alongi [32] (formerly U. rotundata Bliding), and U. australis Areschoug [33]. The tree based on the ~26,000 SNPs between individuals allows for separation of species as well as sampling location, since individuals from similar geographical areas cluster together. We then compared the GBS-based tree with a tree based on the Sanger sequencing results from rbcL (Fig. 4B). Interestingly, while rbcL barcoding does discriminate well amongst species, we were unable to separate samples based on location due to the very low number of SNPs detected in rbcL. Indeed, no SNPs are found between the U. pseudorotundata individuals, and only two and three SNPs are present within U. rigida and U. australis samples, respectively. While using multiple barcoding primers and target genes could generate more SNPs and allow for a finer characterisation of samples within the same species [34], the costs of multiple barcoding reactions and sequencing would be similar to the cost of current-day GBS sequencing (USD 35). In addition, our data support the recent findings based on GBS data demonstrating the importance of using large-scale genetic datasets (> 10,000 parsimony-informative SNPs) to robustly distinguish between closely related species [35]. Hence, with a similar cost and an order of magnitude more SNPs, we consider RAD-Seq methods to be the future method of choice for assessing the genetic diversity of Ulva species, with great potential for discriminating many other problematic macroalgal species.
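The two filtering notions used above, loci shared by at least 6 individuals and parsimony-informative sites, can be sketched as follows. This is a toy illustration and not the ipyrad implementation: a site is counted as parsimony-informative when at least two different bases each occur in at least two samples, missing characters are ignored, and the sample names and alignment are hypothetical.

```python
from collections import Counter

MISSING = set("N-?")  # characters treated as missing data


def keep_locus(samples_with_data, min_samples=6):
    """Keep a locus only if it was recovered in at least `min_samples` individuals."""
    return len(samples_with_data) >= min_samples


def is_parsimony_informative(column):
    """True if at least two bases each occur in at least two samples at this site."""
    counts = Counter(base for base in column.upper() if base not in MISSING)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def count_informative_sites(alignment):
    """Count parsimony-informative columns in a list of equal-length sequences."""
    return sum(is_parsimony_informative("".join(col)) for col in zip(*alignment))


# Hypothetical toy alignment: one row per sample, one column per site.
toy_alignment = [
    "ACGTA",
    "ACGTA",
    "ATGCA",
    "ATGCN",
]
print(count_informative_sites(toy_alignment))
print(keep_locus({"s1", "s2", "s3", "s4", "s5"}))
```

In the toy alignment only the second and fourth columns are informative; a site where a single sample differs from all the others is variable but does not help to group samples in a tree.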

Conclusion
Unravelling the genetic relationships between macroalgae individuals can be challenging, mainly due to the difficulty of obtaining good-quality DNA from polysaccharide- and polyphenol-rich species. Here, we report that DNA obtained using magnetic bead extraction is of sufficient quality and quantity to routinely perform next-generation sequencing on as little as 5 mg dry weight of sample, and we generated robust RAD-Seq results on Ulva spp. We expect that the combination of reliable DNA extractions and next-generation sequencing data will unlock the potential for large-scale genetic/ecological studies as well as GWAS efforts on algae growth and/or metabolites.

Competing interests
The authors declare no conflict of interest.