Genomic tools for comparative conservation genetics among three recently diverged stag beetles (Lucanus, Lucanidae)

1. We are witnessing a rapid decline in global biodiversity. International protocols and local conservation laws have been installed to counter such an unprecedent rate of decline. 2. However, quantitatively evaluating how much biodiversity has been lost due to climatic and anthropogenic effects and how much biodiversity has been restored due to conservation efforts remains challenging. 3. We applied a comparative conservation genomic approach to statistically and quantitatively address these questions using three geographical taxa from a stag beetle species complex. 4. We found that the three sky‐island taxa formed three independently evolving units without detectable post‐divergence gene flow; furthermore, the three taxa, which have been divergent from each other since the mid‐Pleistocene, have experienced episodes of demographic decline in the past. 5. More importantly, even though idiosyncratic anthropogenic exploitations have been hypothesised to impact the recent demographic history (<100 years) differently, we found a shared pattern of continuous decline in effective population size among the three geographical taxa. 6. We argue that future empirical studies should include more taxa, in addition to the focal species, that may or may not be affected by the focal historical events to avoid making biased conservation plans.


INTRODUCTION
Anthropogenic exploitation and climate change have been and will continue to be major threats to biodiversity.The loss of global and local biodiversity may affect the sustainability of resources and the long-term survival of our own species (Butchart et al., 2010;Perrings et al., 1992).Species of cultural importance may also experience high risks of extinction (e.g., Naveda-Rodríguez et al., 2016).Local governments and international coalitions have rapidly legislated conservation plans and environmental protection acts to protect critical resources of biodiversity (Neumann et al., 2018;Prathapan et al., 2018), but how different species respond to climate change and anthropogenic exploitation and how protected species react after an act of conservation/protection have rarely been quantitatively evaluated (Hoey Mattia De Vivo and Min-Hsun Chou contributed equally.et al., 2022;Huang, 2014Huang, , 2019;;Li et al., 2022;Lin et al., 2021;Nunziata et al., 2017).The difficulty in evaluating the impacts on focal species is twofold.First, focal species are often rare and highly vulnerable, so obtaining enough samples for quantitative and statistical evaluations of changes in population size can be challenging and even unethical (e.g., Lin et al., 2021).Second, the evolutionary timescale of the changes to the environment and climate and the impacts and effects of conservation acts on the demographics of the focal species can be extremely short, where conventional toolkits may fail to detect such fine nuances (Nunziata et al., 2017).The inability to statistically and objectively evaluate the efficacy of conservation acts and the impacts of climatic and environmental changes on biodiversity may lead to a false sense of security for successfully protecting and restoring local biodiversity because of the conservation efforts communicated by various media (Butchart et al., 2010), whereas in reality a continuous loss of biodiversity could go undetected.
In this study, we aim to demonstrate that the impact of environmental change and the effect of conservation acts can be statistically evaluated using genomic data, system-specific models of population demographics and a flagship stag beetle species in Taiwan.Additionally, we would also like to explore whether genomic data and modelbased approaches can have the power to distinguish the impacts between historical and recent demographic processes on current genetic diversity.First, conventionally used molecular markers, for example, Sanger sequencing loci and microsatellites, have been effective in detecting genetic structure within species and in reconstructing demographic histories in the Pliocene/Pleistocene (e.g., Chou et al., 2021;Huang & Lin, 2010;Sun et al., 2020), but the power of using these markers to detect fine-scale geographic genetic variation and demographic change in the Anthropocene is limited (Nunziata & Weisrock, 2018).The ability to detect recent demographic change (<100 years) using molecular data, however, is crucial for statistically evaluating the efficacy of conservation acts and could be better explored using population genomic data (e.g., Clark et al., 2023;DeWoody et al., 2022;Hoey et al., 2022;Hogg et al., 2022;Nunziata et al., 2017).More importantly, recent changes in land use (e.g., forest loss due to the development of recreational areas or agricultural land) and the launch of various conservation acts (e.g., the establishment of national parks) are well documented in Taiwan (Chen et al., 2019).
These historical events can help us form explicit species-and policy-specific hypotheses to account for changes in the demographic histories of focal species.That is, we can be at the forefront of transforming biological conservation from pattern description and subjectspecific research (habitat-or species-oriented) into a question-driven and hypothesis-testing scientific field of comparative conservation genomics.
One seldomly applied approach in this exciting new era of conservation genomics is to involve multiple closely related populations/taxa to test the predicted impact of the same anthropogenic exploitation, that is, comparative conservation genomics.Accumulating empirical evidence has shown that the effective population size of focal species can be associated with the observed number of changes in census population sizes (Hoey et al., 2022;Li et al., 2022;Nunziata et al., 2017), and contemporary anthropogenic exploitations have been hypothesised to cause the loss of genetic diversity.However, without studying additional populations that might exhibit different responses to the same historical event, it is challenging to establish a confident link between anthropogenic impacts and the observed demographic changes.In this sense, studying a system where multiple closely related independent lineages exist and have been documented to have experienced different or the same historical anthropogenic events can help us further evaluate the efficacy of omics for conservation studies.We believe a Taiwanese endemic species complex of stag beetles, Lucanus miwai s.l.(Chou et al., 2021), which is composed of three closely related allopatric sky-island taxa may be a good empirical system to statistically and comparatively evaluate the genetic consequence of various anthropogenic exploitations.
L. miwai and its sister species Lucanus yulaoensis are stag beetles endemic to Taiwan (Chou et al., 2021; Figure 1).An additional closely related taxon geographically intermediate between the two sister species has recently been found in Miaoli County (Figure 1).The species/ taxa in this species complex are only found in sky-island habitats during a very short breeding period every year, typically from late-March to mid-April (Huang, 2014).The adult beetles die soon after the short breeding season each year.The forest habitat of L. miwai from Nantou County (the taxon will be referred to as NT hereafter; Figure 1 habitats have been protected.Note that continuous demographic decline can be detected even when a species is located in protected areas (Chowdhury et al., 2023).Specifically, global insect decline due, for example, to climate change has been documented in many insect lineages (Sánchez-Bayo & Wyckhuys, 2019).As a result, we also tested a constant demographic decrease model, which represents the recent trend observed in the global decline of insects.
It is essential to mention that the current genetic diversity within a population and, thus, the estimated effective population size are also shaped by historical events.For example, post-divergence gene flow and Pleistocene climate change could have shaped the current genetic diversity of focal species (Fedorov et al., 2020;Ortego & Knowles, 2022).As a result, before testing the impacts of anthropogenic activity on the changes in effective population size, it is important to account for the confounding impacts of historical events; otherwise, odds are erroneous inferences will be made (Eriksson & Manica, 2012).Although it has been shown that post-divergence gene flow may not play a significant role in the diversification history between L. miwai and L. yulaoensis (Chou et al., 2021), gene flow after secondary contacts between geographic populations has been hypothesised as an important and common force shaping current genetic diversity in some Taiwanese beetle lineages due to distributional shifts (Huang & Lin, 2010;Tsai et al., 2014).With the addition of a geographically intermediate taxon in the system, statistical evaluation for historical gene flow and demographic change are needed before testing for recent changes in effective population sizes.
We studied genetic diversity using genomic data, identified breaks of population genetic structure to define evolutionarily important units for developing conservation management plans and reconstructed historical changes in effective population size in this study.Furthermore, we statistically tested whether the direct anthropogenic habitat disturbances (e.g., changes in land use) could be associated with changes in the effective population sizes of focal taxa.We also tested whether genomic data has the statistical power to distinguish recent demographic events from long-term historical processes.The results were used to discuss why and how different taxa from different localities may or may not respond similarly to the same historical event, and to inform governmental agencies of the efficacies and effects of previous plans and the best practice for future conservation efforts.Specifically, with help from genomic data and system-specific demographic models (Nunziata et al., 2017), we expected the loss of forest habitat due to anthropogenic exploitation for agricultural (started around 1961) and recreational (started around 2001) uses could result in decreases in effective population sizes of NT and HC.On the other hand, the establishment of a national park that protects wildlife and habitats may have led to a stable or an increase in the effective population size of ML.Alternatively, if the impact of island-wide events instead outweighed local changes, such as the loss of forest habitats due to different management plans imposed in different eras (Chen et al., 2019) or global environmental and climate change (Chowdhury et al., 2023;Sánchez-Bayo & Wyckhuys, 2019), then similar demographic histories would result.Because the three taxa of the L. miwai complex prefer the same sky-island habitat type, we also expected that historical changes in climatic conditions during Pleistocene and Pliocene that altered the distribution and size of the montane forest habitats could lead to the same historical demographic response among the three taxa.

DNA extraction and preparation of the ddRADseq libraries
A total of 96 adult beetles were mainly collected in two breeding seasons, 2020 and 2021.Specifically, we used the DNA extractions for samples from HC and NT from a previous study (Chou et al., 2021).
There were some individuals collected in 2016 from HC and 2019 from NT.All the ML samples were newly obtained in 2020 and 2021.
We included exactly 32 adult beetles from each geographic population.All the collected specimens were preserved in absolute alcohol in a À80 C freezer before extraction.Total genomic DNA was extracted from the beetles using the DNeasy Blood & Tissue Kit (Qiagen).We followed the corresponding manufacturer's protocol for the DNA extractions.
We applied a genomic approach, double-digested restrictionsite-associated DNA sequencing (ddRADseq; Peterson et al., 2012), to study the genetic diversity within, and the genetic divergence among, the three populations of stag beetles and to estimate their demographic histories.Briefly, we first digested each DNA sample using two restriction enzymes, EcoR1 and Mse1.The total amount of genomic DNA used for each individual was around 200 ng.Adaptors with specific barcode sequences were then ligated to the digested DNA using T4 ligase.The barcoded DNA samples were individually purified using magnetic beads (Beckman Coulter).Polymerase chain reactions (PCRs) with eight cycles of reactions were applied to the cleaned fragments to attach the Illumina sequencing adaptors.The PCR products were individually cleaned using magnetic beads again and then pooled according to their DNA concentrations.We selected DNA fragments that were between 300 and 400 base pairs (bp) in size using Pipin Prep (Sage Science) before the Illumina sequencing.Single-end sequencing of 150-bp lengths was performed using an Illumina HiSeq2500 system at the sequencing core at BRCAS.A detail protocol of how we prepared the ddRADseq library can be found in Supplementary Materials (both versions in English and Mandarin are available).

Bioinformatic processing of the ddRADseq data set
The produced sequence data were first processed and analysed using the ipyrad pipeline (version 0.9.84;Eaton & Overcast, 2020), with de-novo assembly, Phred Qscore offset of 43, 90% clust threshold, 0.25 maximum of heterozygotes in consensus and minimum sample locus set at 80% of all the samples, while keeping the other settings as defaults.Samples with fewer than 1500 loci were discarded before further analyses.Because ipyrad does not output information for invariant sites, which is critical for subsequent analyses based on the site-frequency spectrum (SFS), we further processed our ddRADseq data set using the Stacks pipeline (version 2.62: Rochette et al., 2019).Specifically, we followed the protocol listed in a previous study (Rivera-Col on & Catchen, 2022), which was modified from the 'R80' method (Paris et al., 2017).Briefly, we kept only the individuals with at least 200,000 raw reads after trimming using a customised script provided by the Stacks program, process_radtags.Such number of reads was chosen after observing the ipyrad output, where individuals with lower number of reads resulted in very low number of loci (<2000) in the final data set after data processing steps.To minimise downstream impacts of missing data (e.g., PCA has been shown to be sensitive to missing data), we excluded these individuals.Subsequently, we tested various values of M (the number of mismatches allowed by the software when it matches putative alleles/stacks to loci) and n (number of mismatches between the loci of individuals and the catalogue of these loci), while keeping the two equal (i.e., M1n1, M2n2, up to 12).We also fixed m (the minimum number of raw reads required to form a putative allele) to the default value of 3. Furthermore, we set the 'r' flag to 80 for having a loci catalogue with at least 80% of the loci shared by the individuals for each population.We selected the optimal loci set by calculating for which M/n value the change in R80 loci tended to come close to zero, while remaining positive; we subtracted the number of R80 loci found in a run with the number of R80 loci found in the previous run (i.e., M2n2 loci-M1n1 loci, M3n3-M2n2 loci, etc.).The results are available in Table S1.

Analyses of population structure and estimation of divergence times
To detect whether population genetic structure existed among the three geographical taxa, we used tools implemented in ipyrad and R (version 4.1.3;R Core Team, 2022) using output files generated by both the ipyrad and Stacks pipelines.We converted the VCF file generated by Stacks to a HDF5 database using a file converter provided by an ipyrad-analysis toolkit (Eaton & Overcast, 2020).Additionally, the VCF file generated by Stacks was filtered to retain only loci with Single Nucleotide Polymorphisms (SNPs) having less than 20% missing data among individuals by modifying customised scripts (Dalapicolla et al., 2021; original scripts available at https://github.com/jdalapicolla/LanGen_pipeline_version2).
We performed a principal component analysis (PCA) with 100 replicates using the kmeans clustering method, which does not require a pre-defined population structure (Eaton & Overcast, 2020).Subsequently, we tested for the number of genetic clusters using STRUCTURE software (version 2.3.4;Pritchard et al., 2000) with K-values (number of genetic clusters) ranging from 2 to 5, 10 independent runs performed for each K-value, 300,000 MCMC iterations and a burn-in of 100,000 steps.The best K-value was determined by calculating the highest ΔK using the get_evanno_table function in the ipyrad API.The PCA and STRUCTURE results were plotted using ipyrad API and the toyplot library (version 0.18.0;Shead, 2014).
Moreover, we used the snapclust method implemented in the R package adegenet (version 2.1.6;Beugin et al., 2018) to test for the population genetic structure and individual assignment.Compared to STRUCTURE, snapclust tends to be faster and more importantly can be used for checking if a panmictic population is the best population genetic hypothesis (K = 1).The best number of clusters was determined by computing both the Akaike information criterion (AIC; Akaike, 1998) and the Bayesian information criterion (BIC; Schwarz, 1978).Specifically, the best K was chosen by having the lowest AIC and BIC values.
Additionally, we used the fineRADstructure package (version 0.3.2;Malinsky et al., 2018), which uses information for haplotype linkage within each locus, to estimate a co-ancestry matrix for population inference.We first converted the VCF file from the ipyrad output into an input file for the software using the hapsFromVCF script implemented inside RADpainter.We then followed the online protocol provided by the authors to analyse our data (https://www.milanmalinsky.org/fineradstructure).We plotted the results of the analyses using a R script available as part of the RADpainter software.The same analysis was performed using the unfiltered fineRADstructure file from the Stacks 'populations' output.
We tested for post-divergence gene flow among the three geographical taxa of the L. miwai species complex using Treemix (version 1.13; Pickrell & Pritchard, 2012).We randomly chose one SNP per locus to minimise the linkage effect for the Treemix analysis.We then estimated the likelihood values for models that included 0-5 migration edges.We plotted the Treemix tree without migration edges (the best model, see the Results section) and the log likelihood value of each model using toyplot and toytree (version 2.0.1)libraries (Eaton, 2020).Note that the same analyses were performed using output files generated by both the ipyrad and Stacks pipelines.
To estimate the divergence times among the three geographical taxa, additional analyses were run with the A00 model implemented into bpp (version 4.4.1;Flouri et al., 2020).First, we manually modified a gphocs input file generated by the ipyrad pipeline into a bpp DNA sequences file format.A total of 2933 variable loci were used in the analysis.We specified individuals to taxa assignment in a population map text file (Imap); subsequently, a control file for A00 analysis that specified the demographic priors and analysis settings was generated using an online resource provided by the authors (available at https://brannala.github.io/bpps/#/).We ran the A00 analysis with a sampling frequency of 2, a total of 2,000,000 samples and a burn-in period of 2 Â 10 4 .The results from the BPP A00 analysis were imported into R for calculating divergence times using the R package bppr (version 0.6.2;Angelis & dos Reis, 2015;Dos Reis, 2018).Because we did not have enough information regarding the variation of the rate of spontaneous mutation in the genome of the Lucanus beetles, we set a standard deviation of 0.00001 Â 10 À9 for the mutation rate.The mean genomic mutation rate was set to 2.9 Â 10 À9 per site per generation (Keightley et al., 2015;Pélissié et al., 2022), and the generation time for the beetles was set to 4 years (Chou et al., 2021;Huang, 2014).

Demographic analyses
The demographic histories of the three geographical taxa were estimated using two different approaches.First, we used Stairway plot 2 (version 2.1.1;Liu & Fu, 2020) to infer the changes in effective population size without specifying historical events.Subsequently, we used fastsimcoal2 (version 2.7.09;Excoffier et al., 2021) to specifically test for changes in effective population size during specified historical time periods when we believe anthropogenic activities may have impacted the species' current genetic diversity.For both softwares, we applied the same mutation rate as used in bpp analyses (2.9 Â 10 À9 per site per generation; Keightley et al., 2015).Additionally, since fastsimcoal2 is a haploid coalescent simulator, we converted our sample size of beetles for each population into the number of haploid individuals before conducting the analyses.For example, the sample size of population NT was specified as 50 because it included 25 diploid beetle samples in the data set.
Before running demographic analyses, we first generated SFS files using the easySFS script (https://github.com/isaacovercast/easySFS) to convert VCF files from the Stacks pipeline with information for both variable and invariant sites.Importantly, a Stairway plot 2 analysis requires information of the total number of sites that have been sequenced (L), which includes both variable and invariable/monomorphic sites.As a result, we could only use the VCF files generated from the Stacks pipeline, because VCF files from ipyrad do not include invariable sites.We chose the number of individuals that maximised the number of segregating sites for each geographical taxon separately (Gutenkunst et al., 2009).Specifically, easySFS accounts for missing data and projects SFS files based on different number of haploid sequences.We included all the SNPs information from each locus to generate SFS files, because we wanted to include accurate zero bins for estimating the effective population sizes.

Inferring historical changes in effective population sizes
For the Stairway plot 2 analyses, we ran models for each geographical taxon by setting the number of sequences (nseq) and total number of sites (L) according to outputs from easySFS.We also used seven random break points and created 1000 inputs (ninputs) for each analysis.
The confidence intervals of the estimated effective population sizes through time for the three taxa were further estimated using 200 bootstraps.

Testing the impacts of recent anthropogenic activities
We tested five main competitive demographic models assuming that anthropogenic activities and/or environmental changes may or may not have impacted the recent demographic history for each of the geographical taxa using fastsimcoal2 (Figure 1).Specifically, to estimate the likelihood values for the models, 100 replicates with 200,000 coalescent simulations (Àn) and 50 optimisation cycles (ECM; ÀL) for estimating parameters based on maximum likelihood (ÀM) were used.Our analyses were performed while considering the monomorphic sites, pooling together SFS entries with fewer than 10 SNPs (ÀC 10) and outputting all SNPs (Às0), as suggested by the user manual.For demographic models that assumed sudden demographic changes due to anthropogenic exploitation (models with double decrease, one past decrease and one recent decrease; Figure 1), the prior of the older event was set between 10 and 20 generations ago (ca.40-80 years ago during the 1960s); the second more recent event was set with a prior range from three to eight generations ago (ca.10-30 years ago in the 2000s).Although it is unlikely that it is based on a previous study (Huang, 2014) and the Stairway plot results (see Results section), we also tested for an additional sixth model that assumes a sudden demographic increase in the recent past as a sanity test.Note that the sudden demographic increase model was not shown in Figure 1 because we do not have a good historical representation for the model (i.e., when and why this event should have occurred).Figure 1 only illustrates five models that are relevant to historical records of anthropogenic activity and global insect decline.
We estimated for each model the current effective population size (NCUR).We set the time of demographic reduction (TBOT) between 10 and 20 generations ago for a model of a past decrease and set the new size (RESBOT) as the ratio between the ancestral population size (NANC) and NCUR (NANC/NCUR).We set a model for a recent decrease similarly, but TBOT was set between 3 and 8.For the model of constant population size, we set two events (EST1 and EST2), but we did not set any change in the effective population size (NCUR = NANC).For the model of constant decline, we estimated the growth rate (GROWTHRT) as the ratio between the logarithm of RESBOT and the possible start of the decline (TENDBOT; RESBOT/TENDBOT).For a model of two demographic reduction events (double decrease), we set the first event (TBOT) between 10 and 20 generations ago and the second one (TOTHER) between 3 and 8, while also bounding these numbers.Note that, we conducted tests on the double decrease model both with and without applying parameter value bounds in order to explore a wider range of parameter space.The first change of population size (RESBOT) was set as NANC/NCUR, while the second one (RESOTHER) was set as the ratio between the population size at TBOT (NBOT) and NCUR (NBOT/ NCUR).For such models, the population sizes (NBOT, NCUR and NANC) were log-uniformly distributed around from 10 to 100,000.
For the sudden increase model, we set the prior range for NCUR from 100 to 1,000,000 and a prior range for NANC from 10 to 10,000; both priors were set to be log-uniformly distributed.The increasing time was set around 10-20 generations ago without bounding.Our settings of the models and parameters to be estimated were specified by modifying and editing the freely shared input files, thanks to Hoey et al. ( 2022) and from the default fastsimcoal templates.
The best likelihood estimate for each model was identified using a bash script (available at https://raw.githubusercontent.com/speciationgenomics/scripts/master/fsc-selectbestrun.sh).We also calculated AIC using the highest likelihood values from each historical model by converting the log 10 -likelihoods reported by fastsimcoal2 to ln-likelihoods.We further calculated AIC weights (Wagenmakers & Farrell, 2004) among the models using the akaike.weight()function in the R package 'qpcR' (version 1.4-1; Ritz & Spiess, 2008).We performed bootstrapped analyses based on the constant decrease model by using 100 bootstrapped VCF files (the bootstrapped VCF files were generated by modifying an existing script available at https:// speciationgenomics.github.io/fastsimcoal2/).Specifically, we performed 100 parameter estimation analyses using each of the 100 bootstrapped VCF files to estimate the means and confidence intervals for the current effective population sizes for the three taxa.
Note that constant decrease model was selected as the best historically relevant model (see Results and Discussion sections).
Testing the efficacy of genomic data to discriminate recent from historical demographic processes We have observed that the effective population sizes of all three populations have been decreasing since their divergence, as indicated by the results from Stairway plot analyses (see Results section).Additionally, the results from fastsimcoal2 analyses demonstrate a decline in effective population sizes in the recent history (see Results section).
Consequently, to further explore the efficacy of genomic data and model-based approach to detect recent demographic changes, we investigated whether this recent decline represents the tail end of an older demographic decline or if it is a separate pattern occurring more recently.In other words, we want to determine if our model-testing approach is capable of distinguishing between the impacts of recent events and long-term historical events.
We tested two additional historical models.First, we considered a scenario where there has been a constant demographic decline since 400,000 years ago (400 kya).Second, we examined a model where there was a constant decline from 400 kya to around 100 years ago, followed by a different regime of constant decline since then.Note that the two new models mentioned in this paragraph were solely compared to each other because the focus here differed from the previous tests.In the long-term continuous decrease model, we defined the range for the start time of the constant demographic decrease event as occurring between 10 and 100,000 generations ago, which corresponds to approximately 400,000 years ago (kya).In the alternative model, which assumes two constant demographic decrease regimes, we set the range for the start of the first decrease event between 25 generations ago (ca. 100 years ago) and 100,000 generations ago.The range for the start of the second decrease event was set between 10 and 25 generations ago.

Bioinformatic processing of the ddRADseq data set
We retained totals of 87 (29 from HC, 26 from ML and 32 from NT) and 86 (31 from HC, 24 from ML and 31 from NT) samples after filtering and processing using ipyrad and Stacks, respectively.After the trimming steps, the retained samples had on average 1232612.552raw reads in the ipyrad analysis.The resulting SNPs matrix from ipyrad processing consisted of 6940 sites (ca.12.77% missing data among individuals).In total, the assembled sequence matrix size was 267,511, with an average of 2588 assembled loci per sample (ca.12.07% missing data among individuals).The chosen (ÀM4n4) Stacks output retained 5124 variable loci (R80) shared among the three geographical taxa (Table S1), with 38.9 Â mean per locus per-sample coverage from an average of 1,102,091 raw reads per sample.Note that these average values also accounted for samples that were discarded due to the low number of reads retained.After filtering out SNPs with >20% missing values, a total of 2292 unlinked SNPs were retained for the following analyses (i.e., PCA, STRUCTURE and SNAPCLUST).Specifically, analyses using output files from the ipyrad and Stacks pipelines resulted in similar inferences (see the following sections), so we only show one representative result for each analysis in the figures (all results are accompanied in the Supplementary Results file).The resulting SFS files contained 25 individuals (50 haploid sequences) for both HC and NT; 20 individuals (40 haploid sequences) were retained for ML.A total of 914,115 nucleic sites (counting both monomorphic and polymorphic ones) were observed for HC, 971,868 for NT and 659,416 for ML.The SFSs are available in Dryad data repository (doi: 10.5061/dryad.jh9w0vtft).

Analyses of population structure and estimated divergence times
Our PCA results showed three genetically distinct clusters in the L. miwai species complex (Figures 2a and S1).On the other hand, although there were some rogue individuals (Figures S2-S7), snapclust analyses using the Stacks filtered output files revealed that there was no intra-taxon genetic subdivision within each of the three taxa (K = 1; Figures S8-S10).The three genetic clusters within the L. miwai complex corresponded to their geographical origins.
The STRUCTURE and snapclust analyses also statistically supported the geographical taxa as genetically distinct entities (Figures 2b and S11-S16).Specifically, ΔK was highest when K = 3 in the STRUCTURE analyses (Figures S13 and S15), whereas both the computed AIC and BIC values from the snapclust analyses were lowest when K = 3 (Figures 2c and S12

Testing the impacts of recent anthropogenic activities
The simulations done with fastsimcoal2 showed that the model of constant decline was the best one for HC, followed by the model of double decrease without applying parameters bounding (Table 1).
On the other hand, the double decrease model without applying parameters bounding was the best model for populations ML and NT, followed by the constant decline model (Table 1).We would like to note that, however, the parameter estimation exceeded the priors for the recent sudden decrease event for all three populations based on the double decrease model without applying parameters bounding (Table 2).Specifically, instead of indicating a decrease around 1960 and a second one around 2000, the best model suggested two consecutive generations of declines.For HC, the decline events occurred  2).
The constant decline models for all populations showed that the current effective population size was lower than 50 haploid individuals.Specifically, the estimated current effective population sizes for ML were 11 (12.52 [mean] ± 2.258 [SD] from 100 bootstrapping analyses), and 12 for both HC (12.52 ± 1.374) and NT (12.29 ± 1.956) (Table S2).Note that all the parameter values regarding the effective population sizes were reported as in the number of haploid individuals (default output values from fastsimcoal2 analyses).Additionally, the growth rate estimated for the model of constant decline was positive, indicating demographic contraction.The current effective population sizes based on the double decrease model were 16, 16 and 12 for HC, ML and NT populations (Table 2).Similarly, the estimated current effective population sizes were 11 for all three taxa according to the double decrease model while applying parameter bounding.
Testing the efficacy of genomic data to discriminate recent from historical demographic processes

DISCUSSION
Genomic data and demographic models specific to biological systems have been shown to effectively help with identifying conservation units, evaluating the vulnerability of the conservation units, and testing different historical demographic models that may explain the observed genetic diversity.Our study not only showed that such approaches could be readily adopted in insects, where a comprehensive sampling design and a comparative approach can be applied, but also further revealed that a comparative sampling design was critical to make inferences regarding the impacts of anthropogenic threats.Specifically, we showed that not only the historical demographic trends were similar across the three geographic taxa of the L. miwai species complex according to the results from Stairway plot 2, but more importantly that they also shared the same recent (<100 years) demographic histories in the Anthropocene according to model-testing approaches implemented in fastsimcoal2.Although idiosyncratic anthropogenic exploitations across time and space may be associated with different responses in the demographic history of the species affected, we argue that our results on the other hand revealed global environmental changes may have played a more important role affecting the effective population sizes of the studied organisms.Such an inference would only be possible when multiple taxa were included in a genomic study of comparative conservation genetics.Note that we are not arguing that local anthropogenic exploitation is not important; instead, the impact of local exploitation may be masked by the impact of global environmental changes.A better sampling strategy, such as time serial sampling with historical samples, may help reveal the demographic nuances associated with more subtle differences in Anthropocene (Clark et al., 2023).Specifically, because of the relatively long generation time of the beetles (4 years; Huang, 2014), we could not employ time serial sampling in this study.Our results also demonstrated that genomic data, coupled with a model-based approach, may have the potential to differentiate historical from recent demographic events.However, interpreting the model test results and parameter estimations can present challenges.

Omics tools for the origin and demographic history of evolutionarily independent lineages
We showed that the three closely related taxa diverged in the mid-Pleistocene around 600 and 200 kya without post-divergence gene T A B L E 1 Estimated likelihood values for the demographic models included in the fastsimcoal2 analyses.flow (Figures 3 and 4) and that an episodic decrease in effective population size characterised their common responses to historical climatic and environmental changes (Figure 5).Similar data and approaches have become commonly used to infer recent speciation events that could not be effectively accessed using conventional tools (Huang et al., 2020;Ortego & Knowles, 2022;Wagner et al., 2013).Furthermore, recent studies that applied novel models to reconstruct the demographic histories of focal taxa using genome-scale data also revealed fine geological timescale changes in effective population sizes that could not be unravelled using Sanger sequence data (Huang, 2019;Hung et al., 2014;Reid & Pinsky, 2022).Our results thus agree with previous studies that genomic tools, including the data and the rapidly developing analytical methods and models, are rapidly advancing our understanding of how endemic, or evolutionarily independent, lineages can be generated and maintained in nature.
Understanding the history of divergence is critical to identifying evolutionarily significant units (ESUs) for conservation (Naveda-Rodríguez et al., 2016) and is especially important to our study system.Our genomic data clearly supported that the three geographic taxa of the L. miwai species complex formed three distinct, nonoverlapping, genetic clusters (Figures 2 and 3).Furthermore, statistically neglectable post-divergence gene flow among the three taxa indicated their independent evolutionary history since the divergences (Figure 4).The divergence among the three taxa of our study can be categorised as at the speciation initiation stage (Huang, 2020), facilitated by allopatric isolation possibly due to climate change and forest-habitat fragmentation in the Pleistocene (Chou et al., 2021).
Whether the three geographical taxa, and especially the newly discov-  (Burbrink et al., 2022) that cannot be properly and objectively evaluated in this study given the allopatric nature of their distribution (Chambers et al., 2022).Note that it is not unprecedented in insects that speciation initiated by an allopatric process could have been completed in fewer than 200 kya (Huang et al., 2020).In Asia, geographical taxa have frequently been reported to have originated around 100-600 kya (Chou et al., 2021;Dong et al., 2017;Morgan & Huang, 2021), where the geographic taxa may or may not be treated as different species (Chou et al., 2022;Dong et al., 2020).
We are reluctant in this study to revise the current taxonomic status of the three taxa, because even though they form three distinct evolutionary lineages, it is also apparent that, according to the RADpainter results (Figure 3), the geographically intermediate taxon, ML, may also be a taxon genetically intermediate between HC and NT.Specifically, the numbers of alleles shared between individuals from ML and HC and between ML and NT are higher than those between HC and NT.As a result, isolation by distance (IBD) may also explain the current genetic divergence among the three taxa (Mason et al., 2020), where the taxonomic status among the taxa might better be treated as intraspecific (Chou et al., 2022;Poelstra et al., 2021).
Without additional geographic samples, however, it is beyond our grasp to statistically test whether IBD is the best model for the genetic divergence in our system.Nevertheless, different species or not, it is clear that the three taxa are ESUs that should be managed as different conservation units.Taxonomically, the scientific name L. yulaoensis should be associated with the newly recorded taxon from ML because of the phylogenetic affinity (Figures 3 and 4).
More importantly, understanding the history of divergence among the three focal taxa of stag beetles helped us form evolutionary historyinformed hypotheses when testing the impacts of recent anthropogenic exploitation.Specifically, we inferred that the three taxa had diverged without gene flow; thus, when applying a model comparison approach to test across historical scenarios that may explain the current genetic diversity within each taxon (Figure 1), we could simply test for different scenarios of demographic change without invoking complicated hypotheses of gene flow that may also contribute to genetic diversity observed within a population (Ortego & Knowles, 2022).On the other hand, our Stairway plots indicated a continuous and drastic demographic decrease since the divergences among the three taxa; specifically, a rapid demographic decline occurred around 400 kya after the divergence between L. miwai (NT) and L. yulaoensis (ML and HC).Another drastic demographic decline occurred after 200 kya when ML and HC further divided into two geographical taxa.Notably, the same demographic declines were registered in the genome across all three taxa, implying that global events, such as climate change and especially habitat fragmentation (Dong et al., 2017;Morgan & Huang, 2021), should have been responsible for the observed pattern.

Omics tools for evaluating anthropogenic impacts
The novelty of our study design that stands out from previous similar studies using genomics and model-based approaches for conservation in Anthropocene (Hoey et al., 2022;Li et al., 2022;Nunziata et al., 2017) is that we studied multiple closely related taxa that may or may not respond to the same anthropogenic exploitation.We not only tested for the changes in effective population size for a taxon during the time when a known historical anthropogenic exploitation occurred, but we also tested the changes in effective population size in other taxa where we did not expect the same anthropogenic exploitation to have had impacts.Surprisingly, instead of revealing idiosyncratic demographic histories in recent times, our demographic-model test results revealed that the recent demographic trends of the three taxa were similar, indicating the importance of shared island-wide environmental or climatic events (e.g., Chen et al., 2019).Specifically, when a model of continuous decline was included in the model comparison for historical demographic inferences of the three taxa, this model outperformed the other historical models (Table 1).Note that, since historical records do not provide indications as to why consecutive generations of sudden demographic decrease may be a valid hypothesis around 80-90 years ago, we selected the continuous demographic decline as the historically relevant best model.Specifically, not all parameter values and models are biologically interpretable and historically relevant.Therefore, it is important for us to carefully select appropriate models before and after conducting statistical and comparative studies (Knowles, 2009).Moreover, even though the double decline model may yield AIC values as good as, or even better than, the constant decline model when parameters were not bounded, the three populations consistently exhibit the same demographic history, for example, for a double decline model to be the best historical scenario, the demographic declines had to occur between two consecutive generations (Table 2).Our study thus unravelled one critical element in conservation genomic studies that have to be accounted for before making inferences, including multiple taxa to test the impacts of focal historical events (i.e., comparative conservation genomics).
Without including a model of continuous decline and the taxa from HC and ML, we may infer that recent anthropogenic exploitation possibly due to habitat loss had major impacts on the decrease of genetic diversity in L. miwai (NT).Specifically, the double sudden decline model (with or without bounding the priors for model parameters) was found to be the best-fitting model for population NT when the constant decline model was not included for model comparison (Table 1).An equally likely, or even better, explanation to the data regarding climate change and island-wide change of habitat size (Chen et al., 2019;Huang, 2014) may have gone overlooked.Our study demonstrated the ramifications of an inadequate sampling design and failed to include a potential relevant historical model when designing conservation studies.Specifically, when a comparative conservation genomic approach was applied, and a model of continuous decrease was tested, we understood that the effective population sizes of the three taxa have been and are still decreasing.A similar pattern has also been reported across many insect groups, even within protected areas (Chowdhury et al., 2023).
Note that our study design is not without limitations.First of all, because of the limitation of Stairway plot analysis to unravel recent history (<100 years), we employed a model comparison approach to test for recent demographic changes.Although such combinations of approaches to infer remote and recent demographic history has been applied in other systems (e.g., Li et al., 2022), we do not fully understand how drastic past demographic events may impact recent demographic inference.Our additional results from testing the potential of genomic data to distinguish historical from recent demographic events did result in better AIC values when a historical continuous demographic decline was accounted for in the model (as shown in Tables 1 and 3).Additionally, as what has been discussed extensively in statistical phylogeography, we could not test for all possible historical models (Knowles, 2009); the true demographic model may not be included in the analyses.For example, we downplayed the importance of the double sudden decline model without parameter bounding in our model comparisons because the results lack reasonable historical support.However, these results may become relevant if new evidence emerges indicating that multiple anthropogenic exploitations may have occurred consecutively around 80 years ago.Furthermore, even though our results may seem to suggest that recent anthropogenic exploitation of forest habitats only led to limited impacts on the genetic diversity of taxon NT, the results may also imply that global climate change is more impactful and thus masks the effect of local anthropogenic impacts.Based on our additional model comparison results (Table 3), we can confidently infer that the recent constant demographic decline in population NT may be operating under a different regime compared to the older constant decline.That is, the current demographic decline is not the tail end of an older demographic decline.However, we are less confident in making the same inference for population ML (based on AIC weight results; Table 3).The subtle nuances observed in the results of the model comparison among populations may reflect different management plans of the habitats in the recent history (i.e., heavily exploited versus protected within a national park).
Alongside the general challenges mentioned above regarding model-based approaches in studying comparative conservation genomics, there are specific difficulties linked to interpreting the results of model testing and parameter estimation in fastsimcoal2 analyses.For example, although it is tempting to infer that the establishment of a protected area helped decrease the rate of population decline in ML, as mentioned in the previous section, it is evident that the current estimated effective population sizes are nevertheless comparable among the three taxa (Table 2).Furthermore, while the estimated recent demographic decline was indeed steeper in NT and HC, the estimated historical demographic decline was steeper in ML.Consequently, the difference in model fits among the three taxa (Table 3) may also be attributed, at least in part, to other unaccounted historical processes.Furthermore, we were unable to determine the reasons why the double sudden decrease model consistently indicated two consecutive generations of demographic decrease among the three taxa when the priors for the model parameters were left unbounded.First, it is possible that this model actually reflects the same underlying pattern as the sudden decline model, as the estimated time of population decline was similar between the two models (Table 2).The two consecutive decline models may have assigned higher weight to a single event (weighted the same event twice), resulting in a better fit to the data.Second, it is plausible, as briefly discussed in previous sections, that multiple deforestation events or acts occurred during that specific period of time, and although we treated it as a single historical event, there could have been consecutive generations exploiting forest habitats throughout Taiwan.However, both explanations would require additional evidence from historical records and computer simulation studies, which fall outside the scope of this study.
The current effective population size, irrespective of the historical models (except the one assuming a constant population size through time), estimated for the three taxa of the L. miwai complex using genomic data are extremely low (Table 2).Note that, even though Stairway plot analyses may not be able to reveal recent demographic changes, the most recent estimate of effective population size for the three taxa are also very low, which is around 100 (Figures 5 and S20-S22).
Specifically, the three taxa have an effective population size that may not be able to maintain their long-term genetic diversity because of strong genetic drift (N e < 50; DeWoody et al., 2021DeWoody et al., , 2022;;Franklin, 1980;Lande, 1995;Teixeira & Huber, 2021).As a result, and in particular due to their geographical isolation (Ryman et al., 2019), they may lose their ability to respond to future environmental and climate change and become extinct.As the Taiwanese government is compiling its first red list for invertebrates, we propose that the three taxa of the L. miwai complex should be included in the list for extinction concern.A dire condition like this may even call for immediate and radical approaches that, although maybe controversial to conventional practitioners of biodiversity conservation (Burridge, 2019), may help to secure the long-term survival of the species, or the species complex as a whole.Genetic rescue and assisted gene flow between geographically isolated taxa have been proposed and demonstrated to be effective using computer simulations (Grummer et al., 2022).
Because of the current geographical distribution of the three taxa and the limitation of the ddRADseq data, we could not evaluate whether local adaptation had occurred nor whether speciation had completed in this species complex, and thus we shall not argue that the approach of assisted gene flow should be applied immediately.All we can say is that males and females from different geographical taxa would mate readily and produce viable offspring under artificial conditions and that different taxa may exhibit a certain degree of morphological difference (Chou et al., 2021).Future studies that generate reference genome resources and that identify possible adaptive or genomic islands between the geographical taxa are needed before we can make recommendations for any further policy suggestions (Formenti et al., 2022).Nevertheless, we believe the L. miwai species complex could become an excellent system to empirically test and observe the impacts and consequences of assisted gene flow in nature.

CONCLUSIONS
Genomic tools and model-based approaches have been readily applied to inform conservation studies of non-model organisms.However, before making conservation inferences, our results demonstrated that a comparative sampling scheme should be taken into account.Specifically, studies should not only test the demographic response of a target taxon to anthropogenic events, but also the demographic responses of other taxa that may or may not be responding to the same events.By doing so, we can prevent making biased inferences of what are the critical historical events that shaped the current genetic diversity of a species and of what could be the best practice to ensure the long-term survival of the species.We believe that comparative conservation genomic approaches can be easily adopted in many empirical systems and can help move the field forward to make practical impacts.
) has been heavily exploited and largely transformed into recreational and agricultural land.Specifically, after the Chinese Civil War, a group of veterans were relocated to Qingjing Farm, Nantou, for new settlements in 1961.The forest habitat has been heavily altered into fruit and vegetable farms ever since.Furthermore, a growing interest of the public for camping and glamping has resulted in severe habitat alteration in this area since 2001 after the Taiwanese government passed a law to promote local tourism.The forest habitat of L. yulaoensis from Hsinchu County (the taxon will be referred to as HC; Figure 1) has also been developed into camping grounds and recreational areas since 2001.On the other hand, the forest habitat of the recently discovered intermediate taxon from Miaoli County (referred to as ML; Figure 1) has always been in a protected forest within a national park (Shei-Pa National Park).These three closely related taxa thus provide a unique opportunity for studying how different intensities and histories of anthropogenic exploitation and destruction to the habitats may have affected their demographic histories.Specifically, based on a generation time of 4 years (Huang, 2014), we would like to test whether the effective population size of NT decreased drastically approximately 60 years/15 generations ago associated with the onset of forest-habitat alteration.Additionally, we would like to test whether the effective population sizes of NT and HC decreased approximately 20 years/5 generations ago because of the rapid development of camping grounds and recreational areas.The demographic history of the newly found intermediate taxon ML can serve as a standard for what to expect when F I G U R E 1 Images of representative adult males of the three geographical taxa of the Lucanus miwai species complex (a-c; photos take by Ming-Hsun Chou) and their geographical distribution (d).The legend in (d) shows elevation in meters above sea level; the x and y axes are longitude and latitude.(e) Graphical representation of the five main tested historical models for the fastsimcoal2 analyses.Effective population size parameters (NCUR, NBOT and NANC) were indicated in different demographic models.HL, Hsinchu County; ML, Miaoli County; NT, Nantou County.
), indicating that the current genetic diversity observed in the data sets could be best explained by three Wright-Fisher populations.Furthermore, individuals from the same geographical origin were assigned to the same population with an extremely high probability (Figures 2b; S11, S13 and S15).The RADpainter and fineRADstructure results also indicated three genetic clusters (Figures 3 and S17), where individuals from the same F I G U R E 2 The results from the population-genetic analyses.(a) A principal component analysis (PCA) plot showing three distinct genetic clusters corresponding to the geographical groups of our studied organisms.(b) Probability of assigning individuals to different populations based on the snapclust analysis.Note that each and every individual sample has a probability of 1.0 of being assigned to the same population as the other individuals from the same geographical origin.(c) Akaike information criterion (AIC) and Bayesian information criterion (BIC) estimates of the snapclust analyses for identifying the best numbers of genetic clusters.Note that both analyses indicated that a K = 3 represents the best number of genetic clusters from the genetic data.HL, Hsinchu County; ML, Miaoli County; NT, Nantou County.U R E 3 Results from the RADpainter and fineRADstructure analyses.The colours in the matrix indicate the level of coancestry inferred from the genetic data (by counting the number of shared alleles between individuals), where a colder colour indicates a higher level of coancestry.A genetic cluster based on the number of estimated shared alleles between individuals is shown above the coancestry matrix.Individuals from the same geographical origin have higher levels of coancestry to each other than do individuals originating from different geographical localities.Note that individuals from the geographically most distant taxa (i.e., between Hsinchu County [HC] and Nantou County[NT]) have the lowest level of coancestry.ML, Miaoli County.geographical origin showed higher levels of coancestry than did individuals from different geographical origins.Interestingly, a phylogenetic cluster estimated based on the coancestry matrix showed that HC was a sister taxon to ML (Figures3 and S17); however, ML, which is geographically intermediate between NT and HC, shared comparatively more alleles with NT than the shared allele between NT and HC.Our Treemix analysis results did not support the inclusion of any admixture edge, because the estimated likelihood values for models with and without migration edges were the same (Figures4a,b; S18 and S19).The finding here is comparable with the results from the previous section that the three geographical taxa were isolated genetic clusters.The estimated phylogenetic relationships among the three taxa from Treemix were also consistent with the inferences from fineRADstructure (Figures3 and 4a).The estimated divergence time from the A00 model of BPP and the bppr programs was around 650 kya for the split between NT and the HC/ML lineage (656,798; 95% CI: 566,895.99-862,072.07)and around 200 kya between HC and ML(201,181; 95% CI: 172,344.76;Figure4c).All parameters from the A00 analyses had effective sample sizes larger than 200.Demographic analysesInferring historical changes in effective population sizes Our Stairway plots analyses revealed similar past demographic trends among the three geographical taxa.Their effective population sizes were highest around 600 kya and had multiple sudden demographic drops during the Pleistocene.More precisely, demographic drops occurred around 400 kya for both ML and NT, while around 300 kya all the taxa experienced sudden decreases in population size.Furthermore, another major drop in effective population size was noticed more recently, around 30 kya (Figure5).Note that the Stairway plot analyses lacked resolution for very recent demographic changes (<100 years ago).The most recent estimates of Ne outputted by Stairway plot were around 113 for ML and 123 for both HC and NT (Figures S20-S22; data can be retrieved from Dryad doi: 10.5061/dryad.jh9w0vtft).As a result, recent demographic changes were investigated using a model comparison approach implemented in fastsimcoal2 (see, for example, inNunziata et al., 2017, Li et al., 2022and Hoey et al., 2022).
24 and 23 generations ago (equivalent to 96 and 92 years ago).For ML, the decline events occurred 23 and 21 generations ago (corresponding to 92 and 84 years ago).While for NT, the decline events occurred 19 and 18 generations ago (ca.76 and 72 years ago).The times of the demographic reduction events estimated from the model of double demographic decrease with parameters bounding were 17 and 8 generations ago (around 68/32 years) for HC, 18 and F I G U R E 4 Estimates of the divergence history among the three geographical taxa of the Lucanus miwai complex.(a) A population tree inferred using treemix without migration edges.(b) The estimated likelihood values of different treemix models assuming different numbers of migration edges.(c) A time-calibrated population/species tree estimated using the A00 model implemented in BPP and plotted using bppr.HL, Hsinchu County; ML, Miaoli County; NT, Nantou County.8 generations ago (around 72/32 years) for ML and 20 and 8 generations ago (around 80/32 years) for NT (Table

For
our additional tests to investigate whether genomic data can distinguish historical from recent regime change in a continuous demographic decline model, we showed that the two constant decline events model yielded better AIC values(112871.4,72122.2 and   114023.0for HC, ML and NT, respectively)  than the one long-term constant decline model for all three populations (112875.9,72124.8 and 114036.1 for HC, ML and NT, respectively; Table3).The times of the regime change between two constant demographic decline events were estimated as 17, 20 and 23 generations ago for HC, ML and NT populations, respectively.Notably, based on the AIC weight results, the two constant decrease model (>99%) was strongly favoured over the one long-term decline model in population NT, which has experienced severe habitat loss due to recent anthropogenic activities.The estimated slope for the historical decline was 0.74 haploid individuals per generation (from 102,088 to 70,964 haploid individuals; historical decline started from 42,015 generations ago), while the slope for the recent decline was 4173 haploid individuals per generation (from 70,964 to 23 haploid individuals).On the other hand, although the two constant decrease model was also preferred for population ML (78.77%), the difference in AIC weights between the two compared models was not as substantial.The estimated slope for the historical decline was 5.02 haploid individuals per generation (from 106,057 to 15,774 haploid individuals; historical decline started from 17,988 generations ago), while the slope for the recent decline was 787.55 haploid individuals per generation (from 15,774 to 23 haploid individuals).F I G U R E 5 Historical demographic inferences using Stairway plot 2. The three geographical taxa shared very similar histories of changes in estimated effective population size through time.Solid lines indicate the means, and dash lines indicate the 95% confidence intervals, of the estimated effective population sizes.HL, Hsinchu County; ML, Miaoli County; NT, Nantou County.Similarly, the AIC weight results showed that the two constant decrease model (90.35%) was favoured over the one long-term decrease model for population HC.The estimated slope for the historical decline was 0.92 haploid individuals per generation (from 108,574 to 71,974 haploid individuals; historical decline started from 39,852 generations ago), while the slope for the recent decline was 3127.87 individuals per generation (from 71,974 to 33 haploid individuals).All three taxa revealed a much steeper slope of demographic decline associated with the recent constant decrease regime.
Estimated parameter values based on the best maximum likelihood (ML) estimates for each model.Estimated Akaike information criterion (AIC) and parameter values based on the best Miaoli County (ML) estimates for long-term constant decline model and a double constant decline model.
ered intermediate taxon ML, should, or can, be taxonomically treated as different species, however, is a challenging philosophical question T A B L E 2 a Estimated time of change between two constant decrease regimes.