Demographic changes and life‐history strategies predict the genetic diversity in crabs

Uncovering what predicts genetic diversity (GD) within species can help us assess the status of populations and their evolutionary potential. Traits related to effective population size show a proportional association to GD, but evidence supports life‐history strategies and habitat as the drivers of GD variation. Instead of investigating highly divergent taxa, focusing on one group could help to elucidate the factors influencing the GD. Additionally, most empirical data is based on vertebrate taxa; therefore, we might be missing novel patterns of GD found in neglected invertebrate groups. Here, we investigated the predictors of the GD in crabs (Brachyura) by compiling the most comprehensive cytochrome c oxidase subunit I (COI) data set available. Eight predictor variables were analysed across 150 species (16 992 sequences) using linear models (multiple linear regression) and comparative methods (PGLS). Our results indicate that population size fluctuation represents the most critical trait predicting GD, with species that have undergone bottlenecks followed by population expansion showing lower GD. Egg size, pelagic larval duration and habitat might play a role, probably because of their association with how species respond to disturbances. Ultimately, K‐strategists that have undergone bottlenecks are the species showing lower GD. Some variables do not show the expected association with GD, most likely due to the taxon‐specific role of some predictors, which should be considered in further investigations and generalizations. This work highlights the complexity underlying the predictors of GD and adds results from a marine invertebrate group to the current understanding of this topic.

the field; however, it is essential that we gain a deeper understanding of the predictors of GD.
The neutral theory of molecular evolution predicts that GD is proportional to the effective population size (Ne) at neutral sites because of the mutation/drift equilibrium. Hence, the bigger the Ne, the bigger the GD. However, directly estimating Ne can be challenging because of the underlying assumptions to calculate this variable. Therefore, scientists must rely on indirect methods to estimate Ne when investigating its effects on GD (Frankham, 1995;Mackintosh et al., 2019;Montgomery et al., 2000;Romiguier et al., 2014).
In this context of using indirect measures to investigate Ne, multiple life-history traits have been found to predict GD across different taxa, probably because of their association with Ne (Kort et al., 2021;Romiguier et al., 2014). A comparison of 31 families from different phyla has shown that the combination of adult size, body mass, maximum longevity, adult dispersion ability, fecundity and propagule size explained more than 70% of the GD variation (Romiguier et al., 2014). Interestingly, propagule size was the primary factor influencing GD, with r-strategist species showing higher GD than K-strategists (Pianka, 1970;Romiguier et al., 2014). A more extensive comparison, but restricted to mammals, birds, reptiles, amphibians and molluscs, also found that body size, longevity and fecundity are important life-history traits predicting GD, depending on the taxa (Kort et al., 2021). Both studies indicate that life-history strategies can predict GD because they affect how species respond to demographic changes. It is known that population size may increase or decrease due to environmental disturbances (e.g., climatic changes, biotic interactions, anthropic effects), subsequently altering GD (Banks et al., 2013;Gurgel et al., 2020;Hewitt, 2004;Lewis & Maslin, 2015;Wenger et al., 2011). Hence, investigations including life-history strategies combined with informed data about signals of population expansion are warranted.
Studies focusing on one group instead of comparing highly divergent taxa could help elucidate the factors influencing the GD (Leffler et al., 2012). The mutation rate for both nuclear DNA (nuDNA) and mitochondrial DNA (mtDNA) is highly variable across animals, potentially affecting comparisons among animal groups (Allio et al., 2017). Also, sometimes there are general trends but taxon-specific patterns (e.g., Kort et al., 2021;Romiguier et al., 2014). Investigating the predictors of GD in phylogenetically closer groups showing variable traits across species could elucidate which factors contribute to GD variation at different taxonomic scales. In addition, current advances on the association of different traits with GD are predominantly based upon vertebrate taxa, and we might be missing new trends due to the lack of investigation of neglected taxa (Barry et al., 2022;Kort et al., 2021;Leffler et al., 2012;Romiguier et al., 2014). For instance, bony fishes show a negative relationship between GD and maximum size, egg diameter and length at maturity as expected (Mitton & Lewis Jr, 1989); some butterfly families show a negative correlation between GD and size but no relationship between GD and egg size, larval host plant and current abundance (Mackintosh et al., 2019).
Habitat has also been shown as an essential factor for GD. Upland Amazonian bird species show higher GD than floodplain species (Harvey et al., 2017), terrestrial birds show higher GD than aquatic birds (Eo et al., 2011), marine fishes show higher GD than freshwater species (DeWoody & Avise, 2000;Martinez et al., 2018), and shallow-water decapod species show higher GD than deep-sea species (García-Merchán et al., 2012). Similar habitats may undergo the same geological and abiotic changes, leading to similar demographic responses that influence the GD of the habitat-associated fauna in terrestrial and marine environments (Gehara et al., 2017;Marko et al., 2010). Furthermore, some habitat types are more connected through species dispersal, resulting in patterns such as canopy bird species being less genetically differentiated than understory species (Burney & Brumfield, 2009) and less genetic differentiation explained by depth in marine animals (Etter et al., 2005;García-Merchán et al., 2012;Selkoe et al., 2014). In many marine species, dispersal occurs through a planktonic larva that remains in the water column and may be transported by currents (Shanks, 2009). For these species, the dispersal ability can be related to the number of larval stages and the pelagic larval duration (PLD), promoting population connectivity (Faurby & Barber, 2012). Population connectivity might hamper the erosion of the GD caused by genetic drift and population size decrease by inputting new individuals into these populations. Yet, there might be differences between the potential and realized dispersal (Weersing & Toonen, 2009), and we still lack the use of the number of larval stages and PLD as predictors of the GD (Kort et al., 2021).
Considering the open questions on the predictors of GD, the benefits of investigating related groups, the need to expand the groups of organisms investigated to include great diversity within the different biomes, and the need to explore GD patterns in neglected taxa, crabs (Brachyura) emerge as a model taxon. Brachyura is one of the most diversified invertebrate groups and one of the most studied crustacean groups (Davie et al., 2015;Ng et al., 2008;Wolfe et al., 2021). To date, 7657 extant brachyuran species in 98 families have been described, showing a plethora of morphological diversity (Davie et al., 2015;Ng et al., 2008;WoRMS, 2022). Crabs are found from abyssal zones to terrestrial environments, occupying most habitats and showing vast variation in life-history traits and dispersal potential (Anger et al., 2015;Davie et al., 2015;Hines, 1982;Hines, 1986). There are indications that species density, fecundity and demographic changes explain the GD for seven mangrove crab species from the Western Indian Ocean (Fratini et al., 2016). A broad taxonomic sampling investigating species from different habitats and showing diverse life-history traits could unveil if this is a general trend in crabs and contribute to our understanding of what affects GD in an invertebrate group (see also Mackintosh et al., 2019).

| Species sequence collection
Brachyura comprises the Podotremata and Eubrachyura groups (Guinot et al., 2013;Ng et al., 2008;Wolfe et al., 2019). We used the Taxonomy Browser tool in NCBI (https://www.ncbi.nlm.nih.gov/taxonomy) to first search all Brachyura genetic sequences publicly available. However, Podotremata crabs were highly underrepresented and did not meet our criteria for retrieving sequences (see below); therefore, they were not included in our data set. We detected inconsistencies in how COI sequences are named in the database, leading to different sequence sets retrieved depending on the name used for searching. Thus, we searched the terms 'cytochrome c oxidase subunit 1', 'cytochrome oxidase subunit 1', 'cytochrome oxidase 1', 'COI' and 'COX1' within Eubrachyura to ensure a total inclusion of COI data. Both 5′ and 3′ regions were included because they do not differ in their intraspecies divergence rate (Lefébure et al., 2006). We noticed that many authors did not upload all sequences generated during their study, but only the unique haplotypes, which could potentially influence our GD estimates. Initially, we retrieved all species showing more than two sequences because they could represent unique haplotypes from a larger data set and recovered 210 crab species (Jan/2021). Then, we reconstructed the species' haplotype frequencies for each study by consulting the reference article or contacting the authors. This step allowed us to analyse the complete data set used by authors, not only unique haplotypes. We could not reconstruct haplotype frequencies for the species with sequences deposited with no reference article. These were excluded from downstream analyses.
We kept only the species with more than 15 sequences in the database or more than 15 sequences after haplotypes frequency reconstruction in our data set. This threshold was chosen based on simulation and empirical results demonstrating that 15 sequences could ensure that we had a comprehensive picture of the intraspecific GD (Goodall-Copestake et al., 2012;Luo et al., 2015;Phillips et al., 2019). We followed the authors' interpretation who generated the sequences when they found cryptic species through molecular data but did not formally describe them. These species were maintained as separate species in our data set (i.e., if the authors addressed more than one species under the same species name, downstream analyses used the sequences for each cryptic species).
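The haplotype-frequency reconstruction and the sequence threshold described above can be sketched as follows. This is a minimal illustration in Python, not the workflow used in the study: the function names, the dictionary layout and the toy haplotypes are assumptions made for the example.

```python
MIN_SEQUENCES = 15  # species are kept only with MORE than 15 sequences


def expand_haplotypes(haplotypes, counts):
    """Rebuild a full sample from unique haplotypes and their frequencies.

    haplotypes: dict mapping a haplotype label to its aligned sequence
    counts: dict mapping the same labels to the number of individuals
            reported for that haplotype in the reference article
    """
    missing = set(haplotypes) - set(counts)
    if missing:
        raise ValueError(f"no frequency reported for: {sorted(missing)}")
    sample = []
    for label, sequence in haplotypes.items():
        sample.extend([sequence] * counts[label])
    return sample


def passes_threshold(sample, minimum=MIN_SEQUENCES):
    """True when the reconstructed sample has more than `minimum` sequences."""
    return len(sample) > minimum


# Hypothetical species deposited as three unique haplotypes only.
haplotypes = {"H1": "ACGTACGT", "H2": "ACGTACGA", "H3": "ACGAACGT"}
counts = {"H1": 12, "H2": 5, "H3": 1}

sample = expand_haplotypes(haplotypes, counts)
print(len(sample), passes_threshold(sample))
```

In this toy case the three deposited haplotypes stand for 18 individuals, so the species would be retained even though only three sequences are in the database.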
For each species, sequences were aligned using MAFFT v.7 (Katoh & Standley, 2013) and visually inspected in Geneious Prime 2020.2.4 (https://www.geneious.com). Within each species alignment, sequences shorter than 375 base pairs (bp) or showing incongruences with the rest of the alignment were excluded.

| Response variable-genetic diversity (GD)
We estimated the GD of each species using the nucleotide diversity (π) calculated in DnaSP v.6 (Rozas et al., 2017). Nucleotide diversity is one measure of GD and represents the average number of nucleotide differences per site between two sampled DNA sequences (Nei & Li, 1979). We then excluded species showing π > 0.02 because of the potential presence of cryptic species in the data set: such values exceed the π estimates found for the majority of our species and are also considered above intraspecific π for other taxa (Goodall-Copestake et al., 2012). After this filter, our data set contained 150 species.
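To make the definition of π concrete, the sketch below computes the average number of pairwise differences per site for a set of aligned sequences and applies the π > 0.02 exclusion rule. It is only an illustration in Python: the study used DnaSP, the function name and toy alignment are hypothetical, and the sketch assumes an alignment without gaps or ambiguous bases, which real software handles explicitly.

```python
from itertools import combinations

PI_CUTOFF = 0.02  # species with pi above this value were excluded


def nucleotide_diversity(seqs):
    """Average number of differences per site between all pairs of sequences."""
    if len(seqs) < 2:
        raise ValueError("at least two sequences are required")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    total_differences = 0
    n_pairs = 0
    for a, b in combinations(seqs, 2):
        total_differences += sum(1 for x, y in zip(a, b) if x != y)
        n_pairs += 1
    return total_differences / (n_pairs * length)


# Toy alignment: three sequences, four sites (assumed, for illustration only).
toy = ["AAAA", "AAAT", "AATT"]
pi = nucleotide_diversity(toy)
print(round(pi, 4), pi > PI_CUTOFF)
```

The toy alignment has 1, 2 and 1 differences across its three pairs, so π = 4 / (3 × 4) ≈ 0.333; values this high are an artefact of the very short example sequences.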

| Predictor variables-life-history and demographic traits
We investigated the influence of fecundity, egg diameter (propagule size), size (maximum carapace width [CW]), number of larval stages, larval development until crab 1 phase in days (pelagic larval duration [PLD]), maximum longevity in years, habitat and signal of population expansion on the GD. Using Google Scholar©, we searched for articles using the species name combined with the keywords 'fecundity' OR 'population' OR 'larva' to obtain data for fecundity, egg size diameter, maximum CW, larval development and habitat.
All data were retrieved from scientific articles in English. We did not include grey literature, dissertations, theses, non-scientific articles or scientific articles in other languages (Appendix S1). When multiple papers containing a targeted variable were available, the trait's value used resulted from the average value among all articles (e.g., average among the fecundity for different localities; an average of larval development days under different temperatures or salinity treatments). This was done to obtain a unique value representing the species trait while also considering its variation. If necessary, egg volume (v) was transformed to egg diameter using the equation $v = (\pi \times d^{3})/6$, where $d$ is the egg diameter (Hines, 1982;Peres et al., 2018;Terossi et al., 2010). Maximum longevity was retrieved from Vogt (2019).
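The volume-to-diameter conversion simply inverts the sphere-volume equation given above. A minimal Python sketch, assuming a spherical egg and a hypothetical function name, could look like this:

```python
import math


def egg_diameter_from_volume(volume):
    """Invert v = (pi * d**3) / 6 to recover the diameter of a spherical egg.

    The returned diameter is in the linear unit matching the volume
    (e.g. mm when the volume is given in cubic mm).
    """
    if volume <= 0:
        raise ValueError("volume must be positive")
    return (6.0 * volume / math.pi) ** (1.0 / 3.0)


# Round trip with an assumed egg diameter of 0.3 mm.
volume = math.pi * 0.3 ** 3 / 6.0
print(round(egg_diameter_from_volume(volume), 6))
```

Here π is the mathematical constant, not the nucleotide diversity used as the response variable.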
The number of larval stages was obtained from larval development articles and, when necessary, extrapolated to species with no available information from the same genus. Number of larval stages and PLD were set as 0 for species lacking a larval phase. Habitats were classified as categorical variables: deep sea, symbiotic, infralittoral, intertidal, mangrove, estuarine and terrestrial. We

| Data sets
After all the filtering steps, our data set (150 species) contained missing values because most of the predictor variables were not available for every species (Appendix S2). We decided to analyse multiple data sets to have more confidence in our results (Figure 1). We followed two different strategies: (1) performing data-imputation methods; (2) trimming our data set in different ways until the data set was complete (no missing values).
Data imputation was performed using two different methods: (a) non-phylogenetic data imputation by chaining random forests using the function missRanger from the missRanger package (Mayer, 2021); (b) phylogenetic data imputation using phylopars from the Rphylopars package (Goolsby et al., 2022) using the phylogenetic tree used in PGLS analyses (see below). Both methods were run because deeper lineages within Brachyura are not well resolved (Tsang et al., 2014), which could affect inferences based solely on phylogenetic data imputation.
Besides that, we designed 15 trimmed data sets by excluding individuals with missing data for specific variables to maximize the number of species included in each model tested. Due to maximum longevity being our least sampled variable, we restricted it to just one model combined with habitat and signal of population expansion (the most sampled variables).

| Multiple linear regression (MLR) and phylogenetic generalized least squares (PGLS)
We ran MLR and PGLS models to test the association between predictor variables and GD (Figure 1). We tested 17 different MLR models (2 imputed data sets + 15 trimmed data sets), followed by model selection (see next section). We also tested 17 different PGLS models (2 imputed data sets + 15 trimmed data sets) using a phylogenetic tree built using genetic data publicly available, followed by model selection (see next section). The most comprehensive crab phylogenetic tree to date is reported in Tsang et al. (2014). The molecular markers used by them are six nuclear protein-coding genes, arginine kinase (AK), enolase, glyceraldehyde-3-phosphate dehydrogenase (GAPDH), histone 3 (H3), sodium-potassium ATPase α-subunit (NaK) and phosphoenolpyruvate carboxykinase (PEPCK), and the large (16S) and small subunit (12S) mitochondrial ribosomal RNA genes, representing eight genes in total. We downloaded all sequences they have used (Appendix S3), plus searched all the eight genes for all species we had in our data set after the filtering steps (150 species). Species were included in the tree if they had at least one gene available; otherwise, they were removed from PGLS (132 species used). We constructed a phylogram using the maximum likelihood approach in IQ-TREE (Minh et al., 2020;Nguyen et al., 2015) using the same outgroups as Tsang et al. (2014); data was partitioned by gene, and the best evolutionary model was chosen according to Bayesian Information Criterion (BIC) (Luo et al., 2010), then used for tree inference. Branch support was assessed by ultrafast bootstrap with 1000 replicates. The resulting tree was used in PGLS analyses and for phylogenetic data imputation. When necessary, tips were pruned using the drop.tip function implemented in the ape package (Paradis & Schliep, 2019).
Although MLR and PGLS have different assumptions, we report both results because some species had to be removed from the PGLS analyses because they were not present or could not be included in the crab tree available for comparative methods and because of the uncertainty of the evolutionary relationships (Tsang et al., 2014).
The complete data set used in MLR and PGLS contained 150 and 132 species, respectively.

| Model selection
We performed MLR and PGLS followed by model selection using the information-theoretic approach (Burnham & Anderson, 2002).
A multimodel information-theoretic approach is preferable over stepwise and backward model selection because it considers all possible variable combinations, and the importance of variables can be explored while considering model uncertainty (Johnson & Omland, 2004;Mundry & Nunn, 2009). For each set, we used the dredge function available in the package MuMIn (Bartoń, 2022).
This function fits models for subsets of the global model, performing an automated model selection for all possible combinations of predictor variables. These models are then ranked based on AICc, and models with ΔAICc < 2 were considered the best ones to explain the relationship between the response and predictor variables in the top subsets (Burnham & Anderson, 2002). When more than one model had ΔAICc < 2, we used the function model.avg to do model averaging and estimate the importance of each predictor variable based on Akaike weights (w). A variable showing w = 1 indicates that the variable was present in all candidate models (Burnham & Anderson, 2002). Standardized partial slope coefficients are reported with the 95% confidence intervals (CI), and significant effects were accepted when CI did not include zero. Before running the analysis, the predictor variables were centred, and we checked for collinearity among our variables using Spearman's rank-order correlations (ρ) and the variance inflation factor (VIF and 1/VIF).
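The ranking and weighting logic described above can be illustrated with a short Python sketch. It is not the MuMIn implementation: the function names, the three candidate models and their AICc values are hypothetical, and the sketch assumes that Akaike weights are renormalized within the ΔAICc < 2 subset before being summed per variable.

```python
import math


def aicc(log_likelihood, n_params, n_obs):
    """Small-sample corrected AIC for a model with n_params parameters."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("too few observations for the AICc correction")
    aic = 2 * n_params - 2 * log_likelihood
    return aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)


def top_model_weights(aicc_values, max_delta=2.0):
    """Akaike weights of the models with delta AICc below max_delta."""
    best = min(aicc_values)
    deltas = [value - best for value in aicc_values]
    kept = [i for i, delta in enumerate(deltas) if delta < max_delta]
    relative = {i: math.exp(-0.5 * deltas[i]) for i in kept}
    total = sum(relative.values())
    return {i: r / total for i, r in relative.items()}


def variable_importance(model_terms, weights):
    """Sum the weights of the retained models that contain each variable."""
    importance = {}
    for i, weight in weights.items():
        for term in model_terms[i]:
            importance[term] = importance.get(term, 0.0) + weight
    return importance


# Hypothetical candidate models and AICc values.
model_terms = [{"expansion", "egg_diameter"}, {"expansion"}, {"habitat"}]
aicc_values = [100.0, 101.0, 105.0]

weights = top_model_weights(aicc_values)
importance = variable_importance(model_terms, weights)
print(sorted(importance.items()))
```

In this toy example the third model (ΔAICc = 5) is dropped, "expansion" appears in both retained models and therefore reaches w = 1, while "egg_diameter" receives only the weight of the single model that includes it.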
Before running both statistical analyses, the normality, homoscedasticity and independence of residuals were visually inspected on residual plots (Boldina & Beninger, 2016). Outliers were removed, or the data were log-transformed when necessary. All analyses were run in R software (version 4.1.3).

| RESULTS
After the filtering steps, our full data set contained 150 species and 16 992 COI sequences (Appendix S2-species removed for PGLS not
Different sets of variables were included in the best models on the data sets tested for both MLR and PGLS (Table 1 and Appendix S4). We decided to interpret only the significant variables included in four models (MLR and PGLS with random-forest and phylogenetic imputation) because they are the largest data sets and their results agree with many of the trimmed data sets (Appendix S4).
Considering all PGLS analyses, λ was estimated as not different from 0 (p > 0.05) in all models tested. Species that have gone through population expansion after a bottleneck were consistently found to show lower GD. Egg diameter, PLD and habitat also play a role in predicting GD. Specifically, smaller eggs and longer PLD are associated with higher GD, and deep-sea species show lower GD than species from other habitats.

| DISCUSSION
Our most consistent evidence obtained from the analyses of multiple models investigating 150 crab species and eight predictor variables indicates that one of the main determinant factors predicting genetic diversity (GD) is if a species went through population
FIGURE 1 Workflow for the model designing after having the complete data set following the filtering steps (150 species, eight predictor variables with missing data). Each arrow indicates the decision-making process for both multiple linear regression (MLR) and phylogenetic generalized least squares (PGLS), which included 2 data-imputation methods + trimming the complete data set in different ways to avoid missing data. At the end, 17 models were tested for each MLR and PGLS analyses.

| Temporal population size fluctuations, r/K strategies and dispersal ability
Temporal fluctuation of population size was the most consistent variable predicting GD. Our results show that species that did not experience population size changes are genetically more diverse, an expected outcome (e.g., Carnaval et al., 2009). Tajima's D can detect population expansion following bottleneck events (Tajima, 1989).
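For readers unfamiliar with the statistic, the sketch below computes Tajima's D from an aligned sample, following the standard formulation in Tajima (1989). It is a Python illustration with a hypothetical function name and toy data, it assumes a gap-free alignment, and it does not reproduce how the signal of population expansion was scored in this study. An excess of rare variants, as expected after a recent expansion, yields negative values.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for aligned sequences of equal length (no gaps assumed)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("the variance term is zero for fewer than four sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if seg_sites == 0:
        return math.nan  # D is undefined without polymorphism

    # k: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    mean_pairwise = sum(
        sum(x != y for x, y in zip(a, b)) for a, b in pairs
    ) / len(pairs)

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_pairwise - seg_sites / a1) / math.sqrt(variance)


# Toy samples: only singletons (expansion-like) versus two divergent groups.
star_like = ["".join("T" if i == j else "A" for j in range(6)) for i in range(6)]
two_groups = ["AAAA"] * 3 + ["TTTT"] * 3
print(tajimas_d(star_like) < 0, tajimas_d(two_groups) > 0)
```

The first toy sample contains only singleton mutations and gives a negative D, whereas the second, with variants at intermediate frequency, gives a positive D.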
Bottleneck events can decrease GD by removing low-frequency alleles from populations or stochastically removing part of the genetic variation. Consequently, genetic drift can be stronger because of decreases in population size (Leberg, 1992;Nei et al., 1975). Many species faced bottleneck events, especially during the Pleistocene climate changes (Hewitt, 2004), including aquatic species (Maggs et al., 2008), which can directly explain our results. Many crabs are known to have undergone population size changes during this time across many regions of the world (Buranelli & Mantelatto, 2019;Deli et al., 2019;He et al., 2010;Parvizi et al., 2018;Peres & Mantelatto, 2020;Ragionieri & Schubart, 2013;Xu et al., 2009).
Species that were able to maintain their population size when facing disturbances could have kept higher GD. Therefore, the question becomes which traits are driving how species respond to environmental changes.
The r/K strategies have been hypothesized to predict different responses to environmental disturbances and, consequently, to explain GD variation across many species (Barry et al., 2022;Chen et al., 2017;Romiguier et al., 2014). These strategies represent different life-history alternatives, with r-strategists defined as the species showing small propagule size, high fecundity, short lifespan, early reproduction and low parental investment in the offspring (Pianka, 1970). Our results show a negative association between egg size and GD (i.e., large egg species show lower GD), in agreement with previous hypotheses showing this trait as the most important one predicting GD across many taxa (Romiguier et al., 2014). The potential explanation is that r-strategists can maintain larger population sizes or recover faster after bottleneck events than K-strategists, even though K-strategists are favoured in stable environments, keeping constant population sizes and competing better for resources (MacArthur & Wilson, 1967;Pianka, 1970). Considering that many species underwent population size fluctuation over time (Hewitt, 2004), our finding that long PLD is associated with higher GD can indicate that genetic rescue can play a role in preserving GD (Whiteley et al., 2015). This means that having the potential to spread alleles across distant populations (long PLD) can result in the immigration of alleles from stable populations to populations that went through population bottlenecks, or that many populations that went through bottlenecks can share alleles among themselves. Both situations can result in the overall maintenance of GD, in opposition to species with short PLD that might not spread new alleles as efficiently.
TABLE 1 Model-averaged standardized regression coefficients (β), 95% confidence intervals (CIs) and weight (w) for each multiple linear regression (MLR) and phylogenetic generalized least squares (PGLS) data set tested
However, we must bear in mind that we are treating PLD as a proxy for connectivity, which is a topic under debate (Butler IV et al., 2011;Iacchei et al., 2013;Timm et al., 2020;Weersing & Toonen, 2009). The lack of association might come from the difference between potential and realized dispersal due to ocean conditions that cause larvae retention and accumulation or accelerate larval development in warmer regions (Álvarez-Noriega et al., 2020;Hedgecock & Pudovkin, 2011;White et al., 2010). Other factors might also affect dispersal, such as larval behaviour (Butler IV et al., 2011) and adults' behaviour (Timm et al., 2020). These explanations might also indicate why the number of larval stages was not an important factor, as it might be a poor indicator of connectivity (Weersing & Toonen, 2009). Although we assumed an association between PLD and dispersal ability, we did not estimate connectivity among crab populations. The association between dispersal and GD can be elucidated by comparing species showing the same distribution and investigating multiple regions across the genome (Gagnaire, 2020), a different framework from the one we have used (macrogenetics: Blanchet et al., 2017;Paz-Vinas et al., 2021).
Studies that estimate connectivity among populations and/or compare it across species found that habitats vary at their connectivity level and that there might be habitat-GD associations (e.g., Harvey et al., 2017;Manel et al., 2020). Our results indicate GD variation across habitats, indicating deep-sea species showing lower GD. Indeed, depth can be associated with less differentiated populations and lower GD in the marine environment (Etter et al., 2005;García-Merchán et al., 2012;Selkoe et al., 2014). This might be explained by the stability of the deep-sea environment, which would favour specialization and refinement, generating low GD (Bretsky & Lorenz, 1970;Sanders, 1968). However, this hypothesis is questionable (McClain & Schlacher, 2015), and a compilation of marine invertebrate population genetic studies shows their GD is comparable to shallow-water species (Taylor & Roterman, 2017). Also, stable environments have been shown to hold high GD (Carnaval et al., 2009;Nielsen et al., 2021), casting doubt on the association we have found.
This particular result might be biased because of the small number of deep-sea species analysed. Future studies will confirm if the habitat is indeed a crucial factor when predicting GD.

| Taxon-specific features explain the lack of importance of variables predicting GD
According to our analyses, not all traits ranging within the r/K continuum can predict GD. Body size, fecundity and maximum longevity did not influence GD, even though these are important traits across broad taxonomic scale investigations (Kort et al., 2021;Romiguier et al., 2014). We believe that taxon-specific features can explain this outcome. Body sizes show a negative relationship to GD in many mammals, birds and fish species (Eo et al., 2011;Kort et al., 2021;Mitton & Lewis Jr, 1989;Wooten & Smith, 1985); alternatively, some studies also do not support this affirmation (Azizan & Paradis, 2021;Doyle et al., 2015;James & Eyre-Walker, 2020;Kort et al., 2021). This might indicate a possible taxon-specific lack of body size-GD correlation. In the case of crabs and other marine species, we hypothesize that large and variable habitats (e.g., infralittoral) can sustain large populations, including large-bodied animals. Therefore, large species can maintain large populations in specific habitats leading to a pattern that does not follow initial predictions (Eo et al., 2011;Martin & Palumbi, 1993;Wooten & Smith, 1985).
Some studies have shown fecundity strongly predicting GD across many animals, although the process behind this pattern is not fully understood (Romiguier et al., 2014). In some cases, the effect of fecundity on GD is taxon-dependent (Chen et al., 2017). For instance, there are examples of fishes showing a positive association (Martinez et al., 2018) or no relationship between fecundity and GD (Mitton & Lewis Jr, 1989). The lack of association is also found in another invertebrate group, the butterflies (Mackintosh et al., 2019).
Interestingly, Fratini et al. (2016) show a negative association between fecundity and GD for seven mangrove crab species, but this might result from their data set, and we found that the association was lost when we combined more species data. This might be explained by the sweepstakes reproductive success (SRS) hypothesis (Hedgecock & Pudovkin, 2011). Like other marine species, crabs produce many eggs that develop into larvae that are released into the water column. The selective pressures acting on the larvae can lead to variance in the individual reproduction success because most of the offspring die before metamorphosing into adults, thus influencing their contribution to the future genetic pool despite the larvae genetic pool. The SRS predicts that high fecundity might not be related to GD due to random processes acting on which set of larvae will contribute to the next generation, disrupting the association between fecundity and GD.
Finally, longevity is expected to negatively correlate to GD because it is often associated with large species, small populations and slower mutation rates (Martin & Palumbi, 1993;Nabholz, Glémin, & Galtier, 2008). Multiple taxa comparisons show this association (Chen et al., 2017;Kort et al., 2021;Romiguier et al., 2014); however, longevity often shows no relationship to GD within a taxon (Mackintosh et al., 2019;Mitton & Lewis Jr, 1989;Nabholz, Mauffrey, et al., 2008). A lack of association between longevity and GD aligns with the body size results. Although a general trend exists across many groups, crabs might show a taxon-specific pattern that might be explained by larger habitats sustaining larger populations of larger animals. In the future, when more longevity data is available, this association should be explored again in a more comprehensive view.

| Limitations and future directions
We also have to consider that we have used the COI (an mtDNA gene) to represent the species GD. The patterns of GD can vary significantly across the genome depending on the region and the forces acting upon it, having effects on population size and GD estimates (Charlesworth, 2009;Gossmann et al., 2011). The use of mtDNA is widely debated, mainly due to its lack of recombination, susceptibility to positive selection and selective sweeps, which can impact the GD (Ballard & Kreitman, 1995;Galtier et al., 2009).
However, many studies show an association between mtDNA and nuDNA variation (Mulligan et al., 2006;Nabholz, Mauffrey, et al., 2008;Piganeau & Eyre-Walker, 2009). Even if selection acts on mtDNA, it might not be the dominant evolutionary force influencing variation, and we can assume the mtDNA is nearly neutral (Figuet et al., 2016;Ohta, 1992) and a reliable source to estimate
Besides that, we suggest researchers deposit all sequences generated during their studies, not just unique haplotypes, always linking the sequences with a reference paper and using a standardized notation when submitting the sequence name. We recommend using the term 'cytochrome c oxidase subunit I' to make it easier for public database searching algorithms to find all available sequences. Once a standard submission name is guaranteed, abbreviated names like COI, coxI and CO1 can be used without interfering with the search. Therefore, studies that need to compile data will benefit (Paz-Vinas et al., 2021).

| CONCLUSIONS
We provide insights on the association of ecological and life-history traits with crabs' GD and contribute taxon-specific results to the field investigating the predictors of GD. Our work takes advantage of publicly available COI sequence data to investigate the predictors of GD, following the tendency of recent approaches (Manel et al., 2020;Miraldo et al., 2016;Theodoridis et al., 2020) but focusing on an invertebrate group. Using eight life-history and demographic traits, we support the hypothesis that population size fluctuation is the most crucial factor predicting GD, but other life-history traits might also play a role (Kort et al., 2021;Romiguier et al., 2014).
Therefore, our results indicate a complex scenario in which GD is predicted by population size fluctuations coupled with the species' ability to recover from a disturbance (r-strategy and the spread of alleles). Many other variables were not evaluated and could provide novel associations (e.g., species range size, age at maturity, latitude), demanding further investigation. Some traits within the r/K spectrum do not explain GD, probably representing a taxon-dependent result.
However, we still lack studies investigating other taxa, especially marine invertebrates and specific groups among the crustaceans, to expand our comparisons. Unfortunately, genetic, life-history and ecological data may not be available for many invertebrate species.
Further studies are warranted, and we encourage others to explore the influence of life-history traits in the GD of other groups to test which general trends represent ecological and evolutionary rules.

AUTHOR CONTRIBUTIONS
PAP: designed research, performed research, analysed and interpreted data, wrote and reviewed drafts of the paper; FLM: designed research, mentored research, interpreted data, obtained licence for collection, funding acquisition and reviewed drafts of the article.

ACKNOWLEDGEMENTS
We are grateful to all scientists that have been investigating the ecology and evolution of crabs and their contributions (genetic sequences or biological data) to the field; compiling data would not be possible without the efforts of past research. We would also like to 11777-1 MMA/IBAMA/SISBIO). Mantelatto.

CONFLICT OF INTEREST
We have no conflict of interest to declare.

PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1111/jeb.14138.

DATA AVAILABILITY STATEMENT
The data set used in this study (including nucleotide diversity calculations and traits) and the references used to obtain all trait data are provided as supplementary data; the GenBank accession numbers and the alignment used to build the phylogenetic tree are provided at 10.5061/dryad.5x69p8d72.