The biogeography of community assembly: latitude and predation drive variation in community trait distribution in a guild of epifaunal crustaceans

Collin Gross
Department of Evolution and Ecology
University of California, Davis
4343 Storer Hall
1 Shields Ave
Davis, CA 95616
colgross@ucdavis.edu

Data available for use under an Attribution License (ODC-By). Users are free to use the data in new and different ways, provided they attribute the source of the data and/or the database.

Environmental and species abundance data were collected between May and September 2014 from eelgrass beds in East Asia, Europe, and the east and west coasts of North America; model selection was performed in spring 2021.

This dataset consists of 4 data files.

1. Model_Selection.xlsx, created August 3, 2021.
Coefficients and AICc values for all a priori models used to model the effects of environmental predictors on community dispersion (as SES), as well as backwards-elimination model selection steps.

Permutation Algorithm: Algorithm used to generate the null distribution of diversity metrics (independent swap or tip shuffle)
Metric: Metric of trait diversity calculated for a given community (mean pairwise distance or mean nearest taxon distance)
Species pool: Species pool within which permutations take place (Global, Atlantic, or Pacific)
Trait set: Which traits were used to calculate diversity metrics (Diet, Microhabitat, or All)
A priori model name: Name of the model used to model standard effect sizes against site-level variables, ordered according to AICc
Model coefficients: Model coefficients (including intercepts) for a priori and composite models (14 columns), with predictor significance derived from model ANOVAs
AICc: Akaike's Information Criterion (AIC) for each model, corrected for small sample sizes
dAICc: Difference in AICc from the a priori model with the lowest score

Further details are provided in the READ ME tab of the spreadsheet.

2. site_data.csv, created July 28, 2021.
Site-level environmental parameters used to build the a priori models described above.

Site: 2- or 4-character code for each site
Site Name: full name of each site
Ocean: Ocean basin where the site is located (Atlantic or Pacific)
Coast: Sub-basin of the ocean where the corresponding site is located (East Atlantic, West Atlantic, East Pacific, West Pacific). Note that "east" and "west" refer to the region of the ocean basin, not the continent (e.g. East Atlantic corresponds to Europe)
Continental margin: Whether the Coast is on the Eastern or Western side of the ocean basin
Latitude: Latitude in degrees north of the Equator
Longitude: Longitude in degrees east of the Prime Meridian
Month: Month in which in-situ eelgrass and community data were collected
Date.Collected: Date of collection
Temperature.C: In-situ temperature in degrees centigrade at the time of sampling
Salinity.ppt: In-situ salinity in parts per thousand
Mean.Shoots.Zmarina.per.m2: mean number of eelgrass shoots per square meter, averaged across 20 sample plots per site
Mean.Site.Std.Periphyton: mean eelgrass epiphyte dry mass (g) per g dry mass eelgrass, averaged across 80 shoots per site
Mean.Macroalgae.g.m2: mean dry mass (g) of macroalgae per square meter, averaged across 20 sample plots per site
Sheath.Width.cm.: mean widest measurement (cm) of eelgrass leaf sheaths, averaged across 100 shoots per site
Sheath.Length.cm.: mean length (cm) of eelgrass leaf sheaths from the meristem to the top of the sheath, averaged across 100 shoots per site
Longest.Leaf.Length.cm.: mean length (cm) of the longest eelgrass leaf per shoot from meristem to leaf tip, averaged across 100 shoots per site
Above.Zmarina.g: mean aboveground dry biomass (g) of eelgrass per square meter, averaged across 20 plots per site
Mean.Leaf.PercN: mean leaf percent nitrogen in the eelgrass tissue, averaged across 100 shoots per site
Mean.Pred.Amphipod: mean percent of tethered amphipod prey removed, averaged across 20 prey tethering units per site (raw data are presence-absence)
Mean.Epifaunal.Richness: mean number of all epifaunal species per plot, averaged across 20 plots per site
Mean.Std.Total.Abund.Crustaceans: mean number of epifaunal crustaceans per eelgrass dry mass (g) from sample grab bags, averaged across 20 plots per site
Site.Epifaunal.Richness: total number of epifaunal species per site, summed across 20 plots per site
crustaceangrazers.med.size: median size of epifaunal crustacean grazers across 20 plots per site
chlomean: mean annual surface chlorophyll a (mg per cubic meter) from the region surrounding each site, from Bio-ORACLE (Tyberghein et al. 2012)
nitrate: mean annual water column nitrate (micromoles per liter) from the region surrounding each site, from Bio-ORACLE (Tyberghein et al. 2012)
parmean: mean annual photosynthetically active radiation (Einsteins per square meter per day) from the region surrounding each site, from Bio-ORACLE (Tyberghein et al. 2012)
sstmean: mean annual sea surface temperature (degrees centigrade) from the region surrounding each site, from Bio-ORACLE (Tyberghein et al. 2012)
sstrange: annual sea surface temperature range (degrees centigrade) from the region surrounding each site, from Bio-ORACLE (Tyberghein et al. 2012)
Site.Peracarid.Richness: total number of peracarid crustacean species per site, summed across 20 plots per site
PC1: first principal component axis of eelgrass bed morphology, derived from sheath length, sheath width, longest leaf length, and aboveground biomass
PC2: second principal component axis of eelgrass bed morphology, derived from sheath length, sheath width, longest leaf length, and aboveground biomass

Missing data are coded as NA.

3. species_matrix.csv, created July 28, 2021.
Species-by-site abundance matrix for all peracarids identified to species in our study.

4. trait_matrix.csv, created July 28, 2021.
Trait-by-species matrix including values of traits for all species included in our analyses, as outlined in Table 1 of the main text.

Missing data are coded as NA.

Model selection was performed as described in Gross et al. 2022 (doi: 10.1098/rspb.2021.1762).
Species abundance and environmental parameters were collected according to protocols described in Gross et al. 2022 and Reynolds et al. 2018 (doi: 10.1002/ecy.2064).
Trait data were compiled from literature cited in Appendix 1 of Gross et al. 2022.
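
Example: the derived quantities in Model_Selection.xlsx

The sketch below illustrates the three derived quantities named in the Model_Selection.xlsx description (SES, AICc, and dAICc) using their textbook definitions. It is not the authors' code, and the exact procedure used for the published values is the one described in Gross et al. 2022. The function names are hypothetical, and the use of the sample standard deviation of the null distribution in the SES is an assumption.

```python
import statistics


def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC (textbook form).

    k is the number of estimated parameters and n the sample size.
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def delta_aicc(scores):
    """Difference between each model's AICc and the lowest AICc in the set.

    scores maps a model name to its AICc value.
    """
    best = min(scores.values())
    return {name: value - best for name, value in scores.items()}


def ses(observed, null_values):
    """Standardized effect size of an observed metric against a null distribution.

    Assumption: the null spread is the sample standard deviation of the
    permuted values (for example, MPD values from repeated permutations).
    """
    null_mean = statistics.mean(null_values)
    null_sd = statistics.stdev(null_values)
    return (observed - null_mean) / null_sd
```

Under these definitions, a negative SES means the observed trait diversity is lower than the null expectation, a positive SES means it is higher, and the best-supported model in a set has dAICc equal to 0.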
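
Example: reading the CSV files with NA as missing

Because missing values in the CSV files are coded as NA, a reader needs to convert that code explicitly before doing arithmetic. The following is a minimal sketch using only the Python standard library. The helper name read_table is hypothetical, treating empty cells as missing is an assumption, and the column headers in the actual files may be spelled differently from the names listed in this README, so check the header row before relying on them.

```python
import csv
import io


def read_table(handle, numeric_columns):
    """Read a CSV file object into a list of dicts, one per row.

    Cells equal to "NA" (or empty, by assumption) become None.
    Columns named in numeric_columns are converted to float.
    """
    rows = []
    for record in csv.DictReader(handle):
        row = {}
        for name, value in record.items():
            if name is None:
                continue  # skip surplus cells beyond the header width
            text = (value or "").strip()
            if text in ("NA", ""):
                row[name] = None
            elif name in numeric_columns:
                row[name] = float(text)
            else:
                row[name] = text
        rows.append(row)
    return rows


# Tiny in-memory demonstration; the site code and values are made up.
demo = io.StringIO("Site,Latitude,Salinity.ppt\nXX,40.5,NA\n")
demo_rows = read_table(demo, {"Latitude", "Salinity.ppt"})
```

To read the real file, open it with open("site_data.csv", newline="") and pass the handle to read_table together with the set of columns that should be numeric.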