Why Are There So Many Flowering Plants? A Multiscale Analysis of Plant Diversification

The causes of the rapid diversification and extraordinary richness of flowering plants (angiosperms) relative to other plant clades is a long-standing mystery. Angiosperms are only one among 10 major land plant clades (phyla) but include ∼90% of land plant species. However, most studies that have tried to identify which traits might explain the remarkable diversification of angiosperms have focused only on richness patterns within angiosperms and tested only one or a few traits at a single hierarchical scale. Here, we assemble a database of 31 diverse traits among 678 families and analyze relationships between traits and diversification rates across all land plants at three hierarchical levels (phylum, order, and family) using phylogenetic multiple regression. We find that most variation (∼85%) in diversification rates among major clades (phyla) is explained by biotically mediated fertilization (e.g., insect pollination) and clade-level geographic range size. Different sets of traits explain diversification at different hierarchical levels, with geographic range size dominating among families. Surprisingly, we find that traits related to local-scale species interactions (i.e., biotic fertilization) are particularly important for explaining diversification patterns at the deepest timescales, whereas large-scale geographic factors (i.e., clade-level range size) are more important at shallower timescales. This dichotomy might apply broadly across organisms.


Introduction
Explaining the dramatic differences in diversification and species richness among major plant clades is a long-standing and unresolved problem (Raven 1977;Stebbins 1981;Doyle and Donoghue 1986;Friis et al. 1987;Crepet and Niklas 2009;Givnish 2010). Land plants currently contain ∼300,000 spe-cies distributed among 10 major clades (phyla sensu Niklas 2016). The differences in richness among these clades are striking ( fig. 1), with ∼90% of described land plant species belonging to angiosperms (flowering plants; Magnoliophyta). Given that angiosperms also appear to be relatively young, they clearly have a high overall diversification rate (speciation minus extinction; Morlon 2014). The challenge is to identify which traits might help explain this accelerated diversification, including traits related to ecology, morphology, physiology, reproduction, and genomic characteristics. More generally, to resolve the question of why there are such dramatic differences in richness among major clades of land plants, we need to know which traits are most strongly correlated with variation in diversification rates among them.
Many studies have attempted to explain high angiosperm species richness in terms of particular traits. Nevertheless, most studies had at least one of four limitations. First, many studies focused specifically on explaining richness and diversification patterns among angiosperm clades (e.g., Ricklefs and Renner 1994;Davies et al. 2004;Sargent 2004;Kay et al. 2006;Vamosi and Vamosi 2010;Bromham et al. 2015;O'Meara et al. 2016;Igea et al. 2017;Magallón et al. 2019; recently reviewed in Sauquet and Magallón 2018;Vamosi et al. 2018). Although this is an important topic, these studies do not necessarily explain why angiosperms are more diverse than other plant clades. Specifically, it is unclear whether traits that explain variation within angiosperms will also explain differences between angiosperms and other major plant clades. Second, most studies have focused on testing the impacts of single (or a few) traits on diversity and diversification even though numerous traits have been proposed as possible explanations for variation in species richness and diversification rates. Third, some studies have not addressed how much variation in diversification rates or richness is statistically explained by each trait. Thus, traits could show a significant association with diversification or richness but still explain little overall variation in diversification rates and richness among clades. Fourth, some studies focused on richness patterns alone without analyzing diversification rates directly. This is important because land plant clades differ considerably in their ages ( fig. 1). Traits may be especially important if they occur in younger clades with high richness (thus having rapid diversification rates) rather than in older clades with high richness (since older clades are expected to have more species, all else being equal). Here, we address these limitations, building on previous studies that identified potentially important traits. The traits previously hypothesized to be drivers of diversification include tropical distributions (Ricklefs and Renner 1994), zygomorphic flowers (Sargent 2004), floral nectar spurs (Kay et al. 2006), geographic extent (Vamosi and Vamosi 2010), and rates of genome size evolution (Puttick et al. 2015).
The factors that explain patterns of plant richness may also have implications beyond plants. A prominent theme in ecology is that local-scale species interactions are most important for explaining community diversity patterns at shallow timescales, whereas large-scale factors (like geographic and climatic distributions) are important at deeper timescales. For example, this idea was emphasized in a classic article by Ricklefs (1987) and in other well-known articles (e.g., fig. 1 in Cavender- Bares et al. 2009). This idea raises the broader question: Which types of traits are most important for explaining richness and diversification at  (Gnetum, Welwitschia, Ephedra) (pines, firs, spruces, cypresses, etc) Magnoliophyta Polypodiophyta Figure 1: Summary of the phylogeny, species richness, and net diversification rates of the major clades (phyla) of land plants. Angiosperms (flowering plants) are Magnoliophyta. The apex of each colored triangle indicates the crown group age of each phylum. Colors within triangles indicate the approximate diversification rate of each clade (based on stem group ages and ε p 0:5). The tree shown here is used in the primary analyses (FPCM tree) and combines estimates from Fiz-Palacios et al. (2011) for nonangiosperm relationships and from Magallón et al. (2015) within angiosperms. Alternative trees used are given in appendix I. Source of photos: free images from pixabay.com and public domain images from flickr.com. Photographers, from top to bottom: anonymous, K. Paulick, Nhelia, S. Buissine, W. Claussen, M. Gaida, M. Machová (pixabay.com), Tab Tannery (www.flickr.com/photos/tabtannery/25648174636/, CC BY-NC-SA 2.0), banusevim, hekakoskinen. different scales? Surprisingly, recent studies in animals have shown the importance of traits related to local-scale species interactions for explaining deep-scale diversification patterns (Wiens et al. 2015;Jezkova and Wiens 2017). However, these studies did not compare the importance of these interaction-related traits relative to large-scale ecological factors. Here, we provide the first test of the hypothesis that traits related to biotic interactions are more important for explaining diversification rates at deeper timescales, relative to large-scale factors (like range size).
In this study, we perform a multiscale analysis of diversification in land plants. This analysis involves three main steps. First, we compile a database of 31 potentially relevant traits for 678 families. We include traits related to fertilization (e.g., pollination mechanisms), dispersal mechanisms, reproduction, life history (annual vs. perennial), genomic characteristics (rate of genome size change, polyploidy), and geographic range (e.g., range size, tropical vs. temperate distribution). We then estimate the proportion of species in each clade having each trait, at different phylogenetic scales. Specifically, we analyze data on 10 plant phyla, 140 orders, and up to 678 families (apps. A-J; apps. A-R are available online). Collectively, these higher taxa incorporate all (or almost all) known land plant species. To our knowledge, these higher taxa are all monophyletic and thus represent clades. Taxonomic ranks are somewhat arbitrary, and named clades may not represent a random sample of all possible clades in a tree (Beaulieu and O'Meara 2019). However, it is unclear how this issue would negatively impact our analysis. Furthermore, the use of named clades allows us to incorporate all species in every clade (not just those included in trees) and to readily address patterns at different scales (given that families are generally younger than orders, which are younger than phyla). Specifically, using the FPC tree with all families (see below), the mean ages for land plant phyla, orders, and families are 393 Ma (range p 199-511), 149 Ma (48-428), and 79 Ma (7-428).
Second, we estimate net diversification rates for each clade at each rank. Younger clades with many species will have higher rates, and older clades with fewer species will have lower rates (regardless of variation in rates within clades over time or among subclades). An important advantage of using rate estimates for each clade (combined with regression) is that one can include multiple variables simultaneously and quantify how much variation in diversification rates among clades is explained by each trait at each scale, which is not possible using most other approaches (e.g., state-dependent speciation-extinction [SSE] methods;FitzJohn et al. 2009).
Third, we test for pairwise relationships between each trait and diversification rates, using phylogenetic generalized least squares (PGLS) regression (Martins and Han-sen 1997). PGLS corrects for the potential nonindependence of trait values and diversification rates among clades due to phylogeny. We do this for each phylogenetic scale (phyla, orders, and families) and both across land plants and within angiosperms. We then conduct multiple regression analyses to identify which combination of traits best explains variation in diversification rates at each scale. This three-part approach (i.e., estimating trait frequencies and diversification rates within clades and then linking traits and rates with multiple regression) has now been successfully applied to many organisms (e.g., Wiens et al. 2015;Jezkova and Wiens 2017). Note that we use "explain" in a statistical sense here and throughout.
Using this approach, we can explain considerable variation in diversification rates among plant clades. Thus, our results help resolve the long-standing mystery of why major plant clades differ so dramatically in their richness. Our results also show that different types of traits are important for explaining diversification rates and richness patterns at different phylogenetic scales. For example, biotic fertilization (e.g., insect pollination) explains considerable variation in diversification rates among major plant clades (phyla). In contrast, geographic range sizes of clades dominate diversification patterns at shallower timescales (among families). More generally, these results contrast with the idea that traits related to local-scale species interactions (like biotic fertilization) are primarily important for diversity patterns at shallower timescales, whereas large-scale geographic factors (like range size) dominate at deeper timescales.

Trait Database
An initial list of 678 land plant families (and their species richness) was obtained from Fiz-Palacios et al. (2011). Data were then updated from various sources for bryophytes Vanderpoorten and Goffinet 2009) and for ferns, gymnosperms, and angiosperms (Kubitzki 1990(Kubitzki -2015The Plant List 2013;Stevens 2015).
We focused on traits previously hypothesized to be associated with increased species richness or diversification rates of plant clades, following major reviews (e.g., Crepet and Niklas 2009;Givnish 2010) and previous studies (see above). However, we did not focus simply on synapomorphies of clades. We also emphasized traits with data available for most phyla, although we included some angiosperm-specific traits (e.g., flower morphology) and some lacking data in some families (see below). The traits are listed in table A1 (app. A). We describe the traits in further detail and how taxa were assigned to each state in appendix A. We did not always follow conventional terminology for trait names (app. A) because different names have sometimes been applied to the same feature in different groups (e.g., pollination vs. fertilization). Data for each family (including literature sources), order, and phylum are given in appendixes B-H. All data are available in the appendixes, which have also been deposited in the Dryad Digital Repository (https://doi.org/10.5061 /dryad.5tb2rbp10; Hernández-Hernández and Wiens 2019).
We typically estimated the number of species in each family having each state based on genus-level descriptions and richness (given the challenge of scoring 31 traits for ∼300,000 species). This approach was potentially problematic when genera were variable for a given trait, but the potential distortion of estimated trait frequencies should have limited impact on our results, given that our analyses are among phyla, orders, and families, not genera. Specifically, distortion should decrease at higher levels (i.e., the effects of variable genera on frequency estimation should be mitigated by invariant genera and families at the level of orders and phyla; app. A). Furthermore, analyses linking diversification rates and trait frequencies of clades can be robust to considerable error in assigning traits to species (at least ∼20%; Moen and Wiens 2017, their app. B). We emphasize that trait frequencies are estimates, not known values. Some traits were also coded at the family level (e.g., zygomorphy), and we also incorporated species-level information in some cases (app. A).
For most traits, each clade was characterized by the estimated proportion of species having that trait. Species in different clades (or even the same clade) need not share the same trait through a single evolutionary origin. Indeed, statistical relationships from PGLS can be stronger if traits arise independently in different clades. These analyses do not assume that each trait originated concomitantly with the origin of a given clade. Instead, they assume that traits that are more frequent among species in a clade can more strongly influence the clade's overall net diversification rate.
We acknowledge that there are scenarios whereby trait frequencies might appear to be related to diversification but without a causal relationship. For example, imagine a clade with two subclades (A and B), with the trait of interest in A but increased diversification in B (Jezkova and Wiens 2017). However, this problematic scenario would need to be repeated across multiple clades to yield a strong relationship between the trait and diversification, which seems unlikely. Moreover, our analyses across multiple scales should reduce these artifacts (i.e., treating A and B as separate clades).
It is also possible that we failed to include one or more relevant traits. Nevertheless, to our knowledge we included more traits than any previous analysis of plant diversi-fication. Furthermore, this same criticism could apply to any study, no matter how many traits were included. Finally, our results show that we identified important traits that explain considerable variation in diversification rates at many scales.

Time-Calibrated Phylogenies
We used three tree topologies at each of three taxonomic levels (phyla, orders, and families). We also analyzed angiosperm orders and families separately. We also performed separate family-level analyses within the more diverse nonangiosperm clades (bryophytes, liverworts, ferns, and gymnosperms). However, our trait sampling did not focus on these latter groups.
Our primary analyses used a time-calibrated phylogeny that combined a comprehensive tree of land plant families (Fiz-Palacios et al. 2011) with a more recent phylogeny of angiosperm families (Magallón et al. 2015). The former tree (Fiz-Palacios et al. 2011) includes all land plant families (Stevens 2015) and was estimated from concatenated plastid and nuclear genes. The topology is broadly congruent with more recent studies having more extensive gene sampling but more limited taxon sampling (e.g., Wickett et al. 2014;APGIII 2016). The phylum-level relationships and divergence times are very similar to those of Magallón et al. (2013) and are broadly similar to Morris et al. (2018). However, we utilized the tree of Fiz-Palacios et al.
(2011) given its more comprehensive sampling of families.
The primary tree of Fiz-Palacios et al. (2011) was calibrated by fixing the age of eudicots at 121 Ma, using 16 calibration points, and setting a maximum age of 725 Ma. We refer to this as the unconstrained phylogeny (FPU). These authors also performed an analysis that constrained the maximum age of angiosperms to a young age (130 Ma) similar to other estimates. We refer to this as the constrained phylogeny (FPC).
Our main analyses combined the nonangiosperm portion of the FPC tree with a more recent angiosperm tree (Magallón et al. 2015). The latter study is the most extensive dating analysis of higher-level angiosperm phylogeny to date, including 792 taxa and 137 fossil calibration points. We first pruned this tree to include one species per family (note: all species have the same length to the family stem age). Next, we substituted the angiosperm portion of the FPC tree with this pruned tree (Magallón et al. 2015). We refer to this as the FPCM tree. The dates for angiosperm origins within these trees differ only by ∼9.4 million years (within the confidence interval from Magallón et al. 2015). The FPCM tree was used for the main analyses, but we addressed the robustness of the results with two alternative trees (FPU, FPC). At the family level, these two source trees (Fiz-Palacios et al. 2011;Magallón et al. 2015) differ in taxon sampling and treatment of some groups. Thus, we assembled different databases for angiosperms and land plants (apps. E-H). The FPC and FPU trees and databases included 678 families, whereas the FPCM tree and database included 603. The three trees differed primarily within angiosperms. Trees were pruned to obtain phylogenies at the order and phylum levels (i.e., including one terminal taxon per order and phylum). The stem age for angiosperms is the same in the FPC and FPCM trees. Therefore, we used only two pruned topologies for analyses among phyla (FPCM/FPC and FPU). All trees used are available in appendix I. Overall, different trees had relatively little impact on our conclusions (see "Results"), so we focused primarily on the preferred FPCM tree.

Diversification Rates
Net diversification rates for each clade were estimated using the standard method-of-moments estimator for stem group ages (Magallón and Sanderson 2001), with the R package GEIGER (ver. 2.0.6; Pennell et al. 2014). Many other approaches are available to study diversification (Morlon 2014). However, most other methods would be impractical here, since they utilize detailed species-level phylogenies within each clade, which are unavailable for many plant clades. Furthermore, the use of net diversification rates makes it straightforward to use regression to assess how much variation in rates is explained by each trait. This would be impossible to infer using most other methods (e.g., SSE models). Finally, our goal here is to explain differences in richness and diversification rates among clades, not shifts in diversification rates over time.
Diversification rates were estimated using the stem group estimator. The crown group estimator requires a time-calibrated tree with many species per clade (currently difficult for many families), whereas the stem group estimator requires only one species per clade. Furthermore, incomplete species sampling can bias rate estimates using crown ages (by underestimating crown ages) but not stem ages . Most importantly, stembased estimators generally show stronger relationships with true rates in simulations, even with complete taxon sampling .
The method-of-moments estimator requires the age and species richness of each clade and a relative extinction fraction (ε). Epsilon is intended to correct for the overestimation of diversification rates due to the failure to include extinct clades (a potential source of ascertainment bias when estimating diversification rates; Beaulieu and O'Meara 2019). It is typically assumed across an entire tree, not estimated within individual clades. Following standard practice, three values (0, 0.5, 0.9) were used. Re-sults were similar across different values (see "Results"). Therefore, we primarily focused on the intermediate value (0.5). In simulations, different ε values yield similar relationships between true and estimated rates for the stem estimator . Use of a single ε value does not assume that extinction rates are identical in all clades: diversification rates within clades reflect the balance of speciation and extinction over time, and estimated diversification rates can still accurately reflect the true rates when a single ε value is assumed but extinction rates vary among clades .
Some authors (Rabosky et al. 2012) have stated that these rate estimators require constant rates within clades and are valid only if there is a positive relationship between clade age and species richness among clades. However, true and estimated rates are strongly correlated in simulations, regardless of whether there is a positive or negative relationship between clade age and richness or between clade age and diversification rates (Kozak and Wiens 2016) and regardless of variation in rates between subclades within a clade  and within clades over time ). Thus, these estimators can correctly assign high net rates to young clades with many species and low net rates to older clades with few species. Nevertheless, fast diversification rates in younger clades can potentially uncouple diversification rates and richness patterns, which would make it problematic to interpret diversification rates as explaining richness patterns (Kozak and Wiens 2016;Scholl and Wiens 2016). Strong positive relationships between age and richness can also yield weak relationships between diversification rates and richness (Kozak and Wien 2016;Scholl and Wiens 2016). Therefore, it is important to test whether diversification rates and richness are significantly related. The mere fact that richness is included in the calculation of diversification rates does not make such an analysis circular since these variables can be unrelated (Kozak and Wiens 2016;Scholl and Wiens 2016). We confirmed that diversification rates and richness are strongly related at all levels (app. K). For phyla, the r 2 between richness and diversification rates ranged from 0.80 to 0.92 (across trees and ε values), for orders it ranged from 0.75 to 0.86, and for families it ranged from 0.40 to 0.74 (with lower values in more complete family-level trees). Thus, diversification rates are relevant for explaining richness patterns among clades, especially for orders and phyla. Yet diversification rates and richness are not identical, especially among families (despite studies that focused on explaining family-level richness without incorporating diversification rates).
In general, variation in richness among clades can be explained by variation in their diversification rates (speciation-extinction) and/or ages (e.g., Scholl and Wiens 2016) because only speciation and extinction directly change richness in a clade. We confirmed that all relationships between clade ages and richness were weak and negative using the FPCM tree among land plant phyla, orders, and families (app. K). This further confirms that variation in richness is explained by variation in diversification rates.
Finally, we also performed analyses among phyla using diversification rates estimated from the crown group age of each clade. This was important because the crown age of angiosperms may be substantially younger than the stem age (leading to slower diversification rates considering only stem group estimates). Phylum-level analyses should not suffer from incomplete species sampling because all major clades within each phylum were sampled. We used ε p 0:9, given that crown group estimates are most accurate using this value in simulations . We used only the FPC tree, the tree with the younger age estimate for crown group angiosperms.

Testing Relationships between Traits and Diversification Rates
We estimated relationships between diversification rates (dependent variable) and other traits (independent variables) using PGLS regression with the R package caper (ver. 0.5.2; Orme 2013). Following standard practice, we used the maximum likelihood transformation of branch lengths optimized for the data (l p ML), based on the estimated phylogenetic signal (l). Kappa and delta branchlength transformations were fixed at 1. Lambda estimates and corrects for the observed phylogenetic signal in the data but does not require that all traits evolve following a Brownian motion model. PGLS is also valid when independent variables are categorical and the dependent variable is continuous (like diversification rates; see Martins and Hansen 1997). We first conducted pairwise regression analyses between diversification rates and each trait. We then performed multiple regression analyses with models that included only those traits showing significant (P ! :05) and marginally significant (P ! :10) relationships. Excluding some traits was especially important given the many traits considered (i.e., the number of possible trait combinations would be intractable, but many combinations would be suboptimal given traits with weak relationships in pairwise analyses). We did not correct P values for multiple comparisons because our goal was simply to identify candidate traits for inclusion in multiple regression models. Multiple regression models were then chosen on the basis of their overall fit, not their P values.
In multiple regression analyses, all candidate traits were included and then sequentially excluded (i.e., backward elimination). Thus, after running the first multiple regres-sion model, the trait with the highest P value was excluded and the analysis was rerun. We repeated this process until models included only two variables. Preliminary analyses using other variable selection procedures (e.g., forward selection) yielded similar results. The best-fitting model was generally considered the one with the lowest Akaike information criterion (AIC; Burnham and Anderson 2002). However, models within 0-4 AIC units may be considered indistinguishable from the best model (Burnham and Anderson 2002). When the best models differed by only 0-4 AIC units, we selected the model that included the fewest traits. No analyses included more traits than taxa. Traits that are highly correlated should be removed by this selection procedure, since redundant traits will add unnecessary parameters without increasing model fit. We also confirmed that for our best-fitting models (table 1) the relationships between included traits were either nonsignificant or very weak (all r 2 ! 0:12, with most r 2 ! 0:05; app. P).
Prior to these analyses, we excluded traits with a low frequency in all taxa in each data set (!10% of species having that trait across all clades). It would be problematic to infer that a trait was responsible for a clade's high diversification rate if that trait were present in relatively few species. The specific value (10%) is arbitrary, but alternative values should have little impact, given that the traits in the best-fitting models either do not involve frequencies (e.g., number of ecozones, rates of genome size evolution) or differ strongly in frequencies among taxa (e.g., biotic fertilization, tropical distribution, annual life history, zygomorphic flowers, nonwater dispersal).
Some traits had missing data in some taxa, such as reproductive traits in poorly studied families. Thus, our data for each trait at each taxonomic level had different proportions of missing data (0-0.70; app. J). Mean completeness per taxon varied on the basis of taxonomic level (family p 0.80 [range p 0.20-1.00]; order p 0.87 [range p 0.31-1.00]; phylum p 0.90 [range p 0.77-1.00]). PGLS analyses eliminate taxa with any missing values (Orme 2013). We used two strategies to deal with missing data. First, to minimize data loss in each round of analyses, we excluded traits with extensive missing data across taxa (10.30; see app. J). However, deleting observations with missing data reduces statistical power and can increase estimation bias (Nakagawa and Freckleton 2008). Therefore, we also used phylogenetic imputation to fill in missing data with the R package Rphylopars (Goolsby et al. 2016). Imputation uses a phylogenetic covariance matrix to estimate trait values in taxa lacking data for that trait. To estimate the covariance matrix, we first fit three evolutionary models to each data set: Brownian motion, Ornstein-Uhlenbeck, and early burst (Goolsby et al. 2016). We selected the best-fitting model on the basis of AIC values and then used this model to estimate the covariance matrix with likelihood. The final imputed matrix had no missing data. Details are given in appendix Q. Imputed data are in appendixes B-H. Importantly, PGLS results from the reduced data sets and the imputed data were similar (e.g., compare table R1 with table R2 as well as table 1 with table R3). This similarity is not surprising given the limited missing data overall (∼10%-20%; see above). For brevity, the main analyses were based on the imputed data set, which is more complete.
Finally, for the best-fitting multiple regression models for the main results (FPCM tree, ε p 0:5; imputed data), we calculated standardized partial regression coefficients (SPRCs) for each variable (code from Moen and Wiens 2017). These coefficients quantify the relative importance of different variables in the multiple regression models, showing the influence of each variable when all others are held constant (Sokal and Rohlf 1995).
Among angiosperm families, the three traits related to range size (geographic extent, number of ecozones, and area available for expansion) were strongly associated with each other (r 2 p 0:7-0:8; P ! :0001; app. K). For these analyses, we included only the trait showing the highest r 2 and lowest P value in multiple trait models (geographic extent). For analyses across land plants and in nonangiosperm taxa, we used the number of ecozones since precise data on geographic extent and available area were not available. We used the eight ecozones from Vamosi and Vamosi (2010; Nearctic, Neotropical, Palearctic, Afrotropical, Indo-Malayan, Australasia, Oceania, and Antarctic) and tallied the number of ecozones that each clade occurred in.

Phylum-Level Analyses
The phylogeny, species richness, and diversification rates of the 10 phyla are summarized in figure 1. For this and all other analyses, we describe results using our preferred tree (FPCM), imputed data (for missing entries), and an intermediate ε (0.5; see apps. K-L for complete results). Twenty-three traits were individually tested for relationships with diversification rates (table R1; app. K), excluding eight traits relevant only to angiosperms or having low frequencies. Five traits showed significant or marginally significant relationships (table R1; app. K). The best-fitting model included two traits (table 1; app. L) and explained 85% of the variation in diversification rates. SPRCs showed that the most influential trait was biotic fertilization (0.51), followed closely by range size (0.48; number of ecozones). Among phyla, biotic fertilization was present at high frequencies (110%) only in angiosperms (87%) and cycads (75%) and is predominantly mediated by insects in both groups (app. B). Range size is largely invariant among phyla (80% are present in all or almost all eight ecozones), but the phylum with the lowest diversification rate (Gingkophyta) is present in only one (app. B). These two traits together .54 !.0001 Note: The best-fitting multiple regression model is shown for each analysis. The full results for each analysis (and results with alternative methods and trees) are given in appendix L. For each analysis, the mean stem age of the included clades (and range of ages) are given in parentheses with the taxonomic scope. Numbers in parentheses adjacent to each trait indicate standardized partial regression coefficients, showing the relative contribution of each trait to the multiple regression model. Analyses are based on the imputed data, the FPCM tree, and diversification rates estimated using ε p 0:5. Comparable results using nonimputed data are summarized in table R3. explain most variation in diversification rates among phyla, but biotic fertilization is clearly the one most relevant to the rapid diversification of angiosperms. Models including two additional variables (genome size rates, tropical distribution) explain more variance in diversification rates (93%) but have almost identical fit (app. L).
Results were generally similar across topologies, ε values, and nonimputed matrices (tables R2, R3; app. L). Specifically, the best-fitting model included biotic fertilization and range size and explained similar amounts of variance in diversification rates. However, using the nonimputed data and ε p 0:9, rates of genome size evolution were also included in the best-fitting model, and using the FPU tree, nonimputed data, and ε p 0:9, the best-fitting model included biotic fertilization, range size, dioecy, and annual life history and explained 97% of the variance in diversification rates. Analyses using crown age rate estimates (and emphasizing variables present in 125% of species; app. O) supported a model including only biotic fertilization and genome size rates (but not range size), which explained 90% of the variance.

Order-Level Analyses across Land Plants
Twenty-five traits were analyzed among the 140 land plant orders, excluding traits present only in angiosperms (e.g., floral characters; table A1; app. K). Many traits showed significant relationships with diversification rates in pairwise analyses (table R1; app. K), but only one explained 110% of the variation in diversification rates (number of ecozones p 43%). The best-fitting model included seven traits and explained 56% of the variance in diversification rates among orders (table 1). The number of ecozones was the most influential trait (SPRC p 0:30). Biotic fertilization was also influential (0.16), as was nonwater fertilization (0.23). Although these two variables might seem redundant, they were not significantly related (r 2 p 0:108, P p 0:35). Annual life history, rate of genome size evolution, nonwater dispersal, and polyploidy were also included in the model but showed smaller effects (SPRC p 0:06-0:09; table 1). Results were broadly similar using alternative data, trees, and rate estimates, with range size and annual life history being consistently important, along with traits related to fertilization and dispersal (app. L).

Angiosperm Order-Level Analyses
A total of 28 traits were analyzed among the 66 angiosperm orders (app. K). Twelve traits showed significant or marginally significant relationships with diversification (app. K). Besides number of ecozones (r 2 p 0:54), all remaining traits explained 20% or less of the variation in diversification rates (table R1). The best-fitting multiple re-gression model explained 72% of the variation (table 1) and included four traits: number of ecozones (SPRC p 0:43), annual life history (0.20), zygomorphic flowers (0.22), and nonwater dispersal (0.12). Analyses using alternative data, trees, and rate estimates consistently supported range size and annual life history as important and almost always supported zygomorphic flowers, whereas inclusion of other traits (e.g., dispersal, fertilization, growth form) was more variable (app. L).

Land Plant Family-Level Analyses
Twenty-five traits were analyzed among 603 families (app. K). Many showed significant relationships in pairwise analyses. The number of ecozones explained 42% of the variation, and other traits explained 5% or less (table R1). The best-fitting model included six traits and explained 45% of the variation (table 1), with the number of ecozones being the most important (SPRC p 0:55) and other traits less so (SPRC p 0:05-0:22; annual life history, biotic fertilization, nonwater dispersal, rates of genome size evolution, tropical distribution). Across trees, data, and rates, the variables included in the best-fitting model varied substantially, but they always included range size and nonwater dispersal and only sometimes included biotic fertilization (app. L).

Angiosperm Family-Level Analyses
Thirty traits were analyzed among 363 angiosperm families (app. K). Seventeen showed significant or marginally significant relationships (app. K). Most explained relatively little variation in diversification rates, but the three traits related to range size each explained substantial variation (140% ; table R2). The best model explained 54% of the variation and included six variables (table 1). Geographic extent was the most influential variable (SPRC p 0:57), whereas other traits had weaker effects (zygomorphic flowers, biotic fertilization, epiphytic habit, rates of genome size evolution, and nonwater dispersal). These results were for the FPCM tree, which did not include all angiosperm families. Analyses using alternative trees, rates, and data all consistently supported range size (geographic extent). Inclusion of other variables in the best-fitting model varied (app. L) but typically included zygomorphic flowers, biotic fertilization, and/or herbaceous growth form.

Nonangiosperm Clades
Few studies have addressed correlates of large-scale diversification patterns in nonangiosperm clades. Analyses of the clades with the most families (mosses, liverworts, ferns, gymnosperms) showed that range size (number of ecozones) was strongly related to diversification rates in all four (table 2; app. M). This variable alone explained considerable (56%; app. L) variation in diversification rates in gymnosperms. Range size and tropical distribution were the most important variables in mosses and liverworts (together explaining 44% and 30% of the total variance in each group, respectively; table 2; app. N). Range size and leaf-vein density were most important in ferns (40%).

Discussion
In this study, we conducted a multiscale analysis of diversification among land plant clades. At each scale, we identified a limited set of traits that explained considerable variation in diversification rates. Among phyla, most variation (85%) was explained by biotic fertilization (pollination) and range size (table 1). More specifically, the rapid diversification of angiosperms was largely explained by biotic fertilization. Furthermore, these two traits (biotic fertilization, range size) remained influential among land plant orders, with smaller contributions from other traits. Within angiosperms, most variation among orders (72%) was explained by range size, along with annual life history, zygomorphic flowers, and nonwater dispersal. Range size was then the most important trait at the family level among land plants, within angiosperms, and within nonangiosperm clades (tables 1, 2). Overall, these results contrast with the idea that species interactions (like biotic fertilization) are important primarily at shallow timescales, whereas large-scale geographic factors (like range size) are important at deeper timescales. Below, we discuss patterns across land plants and angiosperms, focusing on biotic fertilization and range size (not all traits), and then discuss broader implications beyond plants.

Patterns across Land Plants
A major goal of our study was to analyze patterns across all land plants, not merely angiosperms. Few studies have examined correlates of diversification at this scale. One study (Puttick et al. 2015) analyzed genome sizes and rates of genome size evolution across land plants. We confirmed that genome size rates are significantly related to diversification (app. K), but this variable is not included in the bestfitting model (table 1).
We strongly supported the importance of biotic fertilization and the idea that flowers drove rapid angiosperm diversification. We acknowledge that almost any trait unique to angiosperms and widespread among them might show a significant relationship with diversification rates among phyla, given the high rate in angiosperms. At the same time, few traits actually had such a distribution. In fact, we did not directly include flowers or fruit. Instead, we treated mechanisms of biotic and abiotic fertilization and dispersal as traits, including specific abiotic mechanisms (wind, water) and biotic agents (insects, vertebrates). Our analyses did not support vertebrate or biotic seed dispersal (e.g., fruit) as major drivers of diversification. Indeed, neither trait is present in 150% of angiosperm species, and they actually show higher frequencies in another phylum (Gnetophyta). Instead, we supported biotically mediated fertilization in general (present in 87% of angiosperms; app. B) and insect-mediated fertilization in particular (in 70%) as explanations for high angiosperm richness. Among phyla, biotic fertilization is largely confined to angiosperms and cycads, where it occurs at high frequencies (and at very low frequencies in Bryophyta and Gnetophyta). This trait presumably evolved independently in each group. Intriguingly, cycads also have relatively high diversification rates (see also Nagalingum et al. 2011), especially based on their crown group ages (app. O). Thus, biotic fertilization itself seems to be important, not just flowers.
Biotic fertilization (i.e., animal pollination) has long been suggested as a key to angiosperm success (e.g., Stebbins 1981;Niklas 2016). Nevertheless, the specific mechanisms linking diversification and flowers remain an area of .73 .0008 Note: The best-fitting multiple regression model is shown for each analysis. Full results are given in appendix N. For each analysis, the mean stem age of the included clades (and range of ages) are given in parentheses with the taxonomic scope. Numbers in parentheses adjacent to each trait indicate standardized partial regression coefficients, showing the relative contribution of each trait to the multiple regression model. Analyses are based on the imputed data, the FPC tree, and diversification rates estimated using ε p 0:5. Gymnosperms include the phyla Cycadophyta, Ginkophyta, Gnetophyta, and Pinophyta. active investigation (Crepet and Niklas 2009;Van der Niet et al. 2014). Pollinator-mediated divergence in floral traits is considered an important reproductive isolating mechanism in plants (Grant 1949(Grant , 1981Raven 1977;Doyle and Donoghue 1986;Crepet and Niklas 2009). Yet the observation that specialist pollinator species are infrequent (Robertson 1928;Ollerton et al. 2009) suggests that there is not a simple one-to-one relationship between pollinator shifts and speciation. Nevertheless, there are dramatic differences in flower morphology that appear to have evolved in association with particular groups of pollinator species (Harder and Barrett 2006). Furthermore, particular flower morphologies (i.e., zygomorphy) significantly impacted diversification rates within angiosperms (table 1). Specifically, zygomorphy is thought to be important in restricting access to pollen by nonspecialist pollinators and thereby increasing pollinator specificity and reproductive isolation among plant species (reviewed in Sargent 2004). Biotic fertilization might also influence diversification without being so directly linked to speciation. For example, Raven (1977) suggested that insect pollination was advantageous in increasing accurate transfer of pollen among widely spaced conspecific individuals. This could increase range sizes and buffer species from extinction. Overall, there is an extensive literature on pollination biology and speciation, which is too vast to review here (see instead, among others, Kay et al. 2006;Crepet and Niklas 2009;Givnish 2010). Our results strongly support the idea that biotic fertilization drives large-scale patterns of plant diversification and should provide greater impetus for smaller-scale studies of the specific mechanisms by which flowers drive speciation and diversification.
Our study does not address patterns of diversification over time but rather variation in extant richness and net diversification among clades. However, since net diversification rates reflect both speciation and extinction, our results are fully compatible with the idea that some speciespoor extant clades (e.g., ferns, gymnosperms) were once more species rich than they are today (Niklas 2016).

Patterns within Angiosperms
Many previous studies have addressed diversity patterns in angiosperms (see the introduction), and we expand on them by including many variables, considering different scales (orders, families), and utilizing diversification rates instead of richness. Our results show several areas of agreement with these studies. For example, we supported the importance of range sizes (e.g., Vamosi and Vamosi 2010;Tang et al. 2017), although previous studies analyzed richness rather than diversification rates. Our results also supported zygomorphic flowers (Sargent 2004) and rates of genome size evolution (Puttick et al. 2015). We also found annual life history and nonwater seed dispersal (including biotic and wind) to be important, especially among orders (table 1). Dispersal modes were included in some analyses of angiosperm diversity (e.g., Ricklefs and Renner 1994), and they may increase diversification by facilitating range expansion (see below). Annual life history is less studied but can decrease generation times and thereby increase the potential for rapid evolutionary change. Overall, our study is unique in combining these traits and showing that each remains important when considered in the context of the others (table 1). Interestingly, we found that range size is the most important variable among angiosperm orders and families (table 1) and among orders and families across all land plants.
We explained considerable variance in diversification rates among angiosperm orders (72%) and families (54%; table 1), but some remained unexplained at each level. This remaining variance might be explained by variation in substitution rates (e.g., Bromham et al. 2015), rates of change in seed size (e.g., Igea et al. 2017), and other aspects of flower morphology (e.g., O'Meara et al. 2016).

Effects of Range Size
Our results on range size raise at least three important questions for future research. First, does range size actually drive diversification, or does diversification drive range size? Second, how does range size increase diversification? Third, why do some clades have larger range sizes than others?
Faster diversification rates might contribute to larger clade range sizes, especially by driving higher species richness. Higher species richness within a region might then facilitate dispersal among regions. However, many clades can be broadly distributed without high species richness, including a family with one species shared across all eight regions (Lunulariaceae; app. F). Many broadly distributed low-richness families are in Bryophyta and Marchantiophyta, but some are angiosperms (e.g., Nelumbonaceae, with two species across five regions, and Ceratophyllaceae, with six species across six regions). Interestingly, the most diverse plant families (Orchidaceae, Asteraceae, Fabaceae) occur in only seven regions, not all eight. Thus, high richness is not necessary (or sufficient) for dispersal among all regions, and there is considerable variation in diversification rates unrelated to range size. Our results do not resolve this issue, but they should motivate further research on the mechanisms linking range area and diversification in plants, which could disentangle the causal relationships between these variables.
If clade range sizes primarily drive diversification instead of the converse, how might this occur? In general, large range sizes may provide more opportunities for allopatric speciation and might buffer clades from extinction (Rosenzweig 1995). Future studies that separately estimate speciation and extinction rates (e.g., with specieslevel phylogenies) could address how these two processes are related to range sizes of clades. Dispersal to new regions might also increase diversification rates by creating new ecological opportunities (e.g., Vamosi and Vamosi 2010;Yoder et al. 2010), and this should also be tested.
The third question is why do some clades have larger range sizes than others? This question could be addressed by testing which traits are most strongly related to clade range sizes, such as different seed-dispersal mechanisms. Note that range sizes in some clades might be influenced by range contractions over time, not merely expansions.

Broader Implications
Beyond plants, our results offer insights into the general factors that drive diversification. By analyzing many kinds of traits at different levels (phyla, order, family), our results address whether the types of traits that are most important in driving diversification change with phylogenetic scale. Intriguingly, our results here concur with recent studies in animals showing the importance of traits related to local-scale species interactions (between unrelated clades) as a major driver of diversification patterns at deep timescales (i.e., with clades that originated ∼500-200 Ma; fig. 1). For example, herbivory increases diversification rates among insect orders (∼500-100 Ma; Wiens et al. 2015), as does parasitism among animal phyla (∼900-300 Ma; Jezkova and Wiens 2017).
Furthermore, our results show that a trait related to localscale species interactions (biotic fertilization) is as important for explaining diversification patterns as large-scale geographic factors at this deep phylum-level scale (table 1). Yet traits related to large-scale distributions (i.e., range size) were the most important at shallower timescales, especially among families (tables 1, 2). These results contrast with the idea that local-scale interactions are less important for diversity patterns at deeper timescales relative to large-scale geographic factors (Ricklefs 1987).
We speculate that these scale-specific patterns might be explained by greater variability in traits related to localscale interactions at deeper timescales and greater variability in large-scale geographic factors at shallower time-scales (Wiens 2017). For example, most phyla are cosmopolitan in distribution, reducing potential differences in range size among them (60% in all eight ecoregions, compared with only 12% of 603 families; apps. B, F). Conversely, biotic fertilization varies among plant phyla but is relatively invariant within most clades (i.e., present in most angiosperms and cycads but absent in most other major clades).
We acknowledge that levels of variability in these traits can depend on how they are defined. For example, biotic fertilization is defined broadly here, not in terms of specific pollinator species (which would be more variable). Nevertheless, traits were defined the same way across all scales here, allowing for comparison.
Finally, we note that the meaning of "deep" versus "shallow" may depend on the study. The deeper timescales here are similar to those in studies of animal phyla and insect orders, with plant phyla having a mean age of 393 Ma and orders a mean age of 149 Ma. However, our "shallow" timescale here might be considered relatively old (mean of 91 Ma for the FPCM tree). Regardless, the generality of these patterns will require testing across many other groups, traits, and timescales.

Conclusions
In summary, we provide an analysis of land plant diversification based on multiple traits at multiple phylogenetic scales. We show that different traits are important at different scales and that multiple traits are important at every scale. Thus, focusing on a single trait at a single scale may be problematic. At the largest scale (phyla), biotic fertilization (e.g., insect pollination) helps explain the extraordinary richness of flowering plants. Thus, local-scale species interactions seem to help drive diversity patterns at deep phylogenetic scales. Conversely, variation in geographic range sizes of clades dominates diversification patterns at shallower timescales (among families). Overall, these results run counter to the idea that local-scale interactions are primarily important at shallower timescales and that large-scale distributions dominate at deeper timescales.