A geographic mosaic of coevolution between Eurosta solidaginis (Fitch) and its host plant tall goldenrod Solidago altissima (L.)

A geographic mosaic of coevolution has produced local reciprocal adaptation in tall goldenrod, Solidago altissima (L.), and the goldenrod ball‐gall fly, Eurosta solidaginis (Fitch 1855). The fly is selected to induce gall diameters that minimize mortality from natural enemies, and the plant is selected to limit gall growth that reduces plant fitness. We conducted a double reciprocal transplant experiment where S. altissima and E. solidaginis from three sites were grown in gardens at each site to partition the gall morphology variation into fly genotype, plant genotype, and the environment components. The host plant gall diameter induced by each E. solidaginis population was adapted to inhibit local natural enemies from ovipositing on or consuming enclosed larvae. Reciprocally, increasing the gall size induced by the local fly population increased the resistance of the local plant host population to gall growth. Differences among sites in natural enemies produced a mosaic of hotspots of coevolutionary arms races between flies selecting for greater gall diameter and plants for smaller diameters, and coldspots where there is no selection on plant or fly for a change in gall diameter. In contrast, the geographic variations of gall length and gall shape were not due to coevolutionary interactions.

The geographic mosaic theory of coevolution (GMTC) (Thompson 2005) states that geographic and environmental variation can exert selection on pair-wise species interactions leading to reciprocal local adaptation in traits of both species. A geographic mosaic of coevolution has three defining components (Thompson 1999, 2005, 2009). First, geographic mosaics of natural selection on species pairs vary across the landscape because of genotype × genotype × environment interactions. Geographic variation in reciprocal selection on a landscape scale results in coevolutionary arms races that produce mosaics of local reciprocal adaptation in defense traits by one species and counter-defenses in another species. Examples include jaw size in squirrels and pine cone traits (Smith 1970) and rostrum length in weevils and pericarp thickness in fruits (Toju and Sota 2006; Toju 2008). Selection mosaics also occur in mutualistic plant-pollinator interactions (Anderson and Johnson 2007), and interactions between species can vary geographically from antagonistic to mutualistic (Thompson and Cunningham 2002).
The second prediction of the GMTC is that there are hotspots where two species strongly interact, producing locally adapted coevolved traits, and coldspots where interactions are weak or absent and lack coevolved traits. Coldspots occur due to the absence of one of the interacting species or due to the presence or absence of a third species (Thompson 2005; Gomulkiewicz et al. 2007). For example, crossbills, Loxia curvirostra, only have bills locally adapted to match defensive traits of Aleppo pine, Pinus halepensis, cones in coevolutionary hotspots that occur in the absence of squirrels. Coldspots in this coevolved interaction occur where squirrels are present and strongly coevolve with pine cone traits (Siepielski and Benkman 2004; Mezquida and Benkman 2005). Similarly, Zangerl and Berenbaum (2003) found coevolved hotspots with matching chemical defenses in wild parsnips and webworm detoxifying abilities when cow parsnips were absent; however, when cow parsnips were present, a wild parsnip-webworm coldspot occurred due to selection on the webworm to adapt to cow parsnip defenses. A coldspot occurred in the plant-pollinator mutualism between the moth Greya politella, which pollinates and consumes seeds of Lithophragma parviflorum, in sites where the presence of non-seed-eating pollinators shifted the impact of Greya from a mutualist to a parasite (Thompson and Cunningham 2002).
The third GMTC prediction is that mismatched coevolved traits will occur due to trait remixing. Trait remixing can occur due to gene flow, extinction of local populations, genetic drift, or mutations, although the impact of these processes can be challenging to assess because they require historical information about populations and their interactions (Thompson 2005). Mathematical models show that gene flow among hotspots and coldspots can significantly impact the distribution of coevolved traits (Nuismer et al. 1999; Gomulkiewicz et al. 2000; Fernandes et al. 2019). Coevolution in the presence of gene flow has been demonstrated (Lively 1999; Burdon et al. 2002; Martin-Galvez et al. 2007; Seppa et al. 2020), and its impact on geographic mosaics of coevolution has been experimentally demonstrated (Forde et al. 2004). Gene flow can impact the structure of geographic mosaics of coevolution, influencing the distribution of traits of interacting species (Dybdahl and Lively 1996; Chaves-Campos et al. 2011). Gene flow is hypothesized to be the cause of mismatched traits found in several coevolved interactions (Brodie et al. 2002; Siepielski and Benkman 2004; Martin-Galvez et al. 2007). Complex biogeographic history, including past colonization events and historical patterns of habitat connectivity and fragmentation, has also been suggested as a cause of trait mismatching (Reimche et al. 2020).
The GMTC hypothesizes that there will be geographic patterns of genotypic matching and mismatching of traits, but previous studies have measured only phenotypic matching. Showing a geographic pattern of matched or mismatched traits alone is insufficient to support or reject the GMTC (Gomulkiewicz et al. 2007; Nuismer et al. 2010). Spatial correlation of adaptive traits can occur without coevolution for various reasons, including similar responses by two species to the abiotic environment and nonreciprocal evolution of traits, where one species adapts to preexisting variation in another species (Gomulkiewicz et al. 2007; Nuismer et al. 2010). Geographic variation measured for each species must be partitioned into its genotypic and environmental components to measure coevolution in interacting species. Nuismer and Gandon (2008) argue that the design of coevolutionary studies has limited their ability to measure the genotypic component of geographic variation in each species. We partitioned the geographic variation in putatively coevolved traits in the interaction of a gall-inducing fly, Eurosta solidaginis, and its host plant, Solidago altissima, using a double reciprocal transplant design, where specific traits are measured in all possible combinations of two interacting species' geographic populations and their local environments. Local reciprocal adaptation of two species (G₁, G₂) is a key prediction of the GMTC, and local adaptation in a population (G₁) consists of the fitness consequences of three components (Nuismer and Gandon 2008): (1) Local environmental adaptation resulting from the G₁ × E interaction. (2) Local adaptation to the other species resulting from the G₁ × G₂ interaction. (3) Local adaptation to the interaction of the other species and the environment resulting from the G₁ × G₂ × E interaction.
We tested hypotheses on the geographic mosaic of coevolution of the plant and the gall-maker. Insect galls are abnormal growths of plant tissue induced by insects (Raman 2011), and the interaction of insect genes, plant genes, and the environment determines gall morphology (Weis and Abrahamson 1986). Because insect genes partially determine gall morphology, it is part of the galler's extended phenotype (Bailey et al. 2009), and there is selection on the insect to induce adaptive gall morphology. The enemy hypothesis states that defense against natural enemies is an important factor in the evolution of gall morphology (Price et al. 1987). Natural enemies must penetrate gall walls to reach their victims, so variation in gall morphology influences natural enemy mortality (Cornell 1983; Craig et al. 1990; Stone and Schonrogge 2003) and natural enemy community structure (Craig 1994; Van Hezewijk and Roland 2003; Bailey et al. 2009). Several gall traits influence natural enemy mortality, including toughness (Weis 1982; Craig et al. 1990), hairiness (Dixon et al. 1998; Bailey et al. 2009), stickiness (Bailey et al. 2009), and size (Price and Clancy 1986; Weis and Abrahamson 1986; Rossi et al. 1992; Ito and Hijii 2002; Marchosky and Craig 2004; Gil-Tapetado et al. 2021; Hernández-Lopez et al. 2021). A parasitoid's ovipositor length can limit its ability to attack host larvae in large galls (Weis and Abrahamson 1986; Hernández-Lopez et al. 2021), producing the potential for coevolution between gall size and ovipositor length. Gallers reduce plant fitness (Price et al. 1987; Sacchi et al. 1988), so there is selection on host plants to eliminate or reduce gall growth, producing a conflict between selection for gallers to increase gall size for defense and plants to minimize gall size.
Geographic variation in gall morphology provides the opportunity to test the assumptions of the GMTC that gall traits are under reciprocal selection by the galler and the plant and that genetic variation in the plant and galler influences these traits.

INTERACTION
The goldenrod ball gall fly, Eurosta solidaginis (Fitch), lays eggs into the stems of the herbaceous, clonal host plant tall goldenrod Solidago altissima (L.), inducing swelling in stems as the gall grows. Uhler (1951) and Abrahamson and Weis (1997) have detailed the natural history of this interaction, which is summarized below. Dissection of females showed that they contain over 200 eggs (Uhler 1951). In late spring, females insert eggs into stems immediately below the apical meristem, and hatching larvae burrow down into the stem, inducing gall growth (Uhler 1951). Insertion of the ovipositor leaves a visible mark termed an ovipuncture, and while not every ovipuncture results in oviposition, there is a positive correlation between ovipuncture and egg number (Hess et al. 1996; Craig et al. 2000). An individual female can oviposit multiple eggs in an internode, and multiple females can lay eggs in one bud (Hess et al. 1996; Craig et al. 2000). An individual female oviposits on multiple stems (Craig et al. 1993; Abrahamson and Weis 1997). However, only one larva develops in an internode due to intraspecific competition (Hess et al. 1996; Craig et al. 2000). Multiple galls can develop on a stem (Hess et al. 1996), and since multiple females oviposit on individual stems and individual females oviposit on multiple stems, we infer that galls sharing a stem are frequently induced by different females.
Galls are visible after about three weeks and finish growth within eight weeks (Weis and Abrahamson 1985). Larvae develop in a central chamber surrounded by a variable amount of tissue from 8 to 12 mm thick. Larvae complete development between September and October depending on local climate; larvae then enter diapause. Depending on the local climate, they break diapause in the late spring and pupate and emerge from May to mid-June. Natural enemies attack after gall growth is complete (Weis and Abrahamson 1985). In late summer, the parasitoid wasp Eurytoma gigantea (Walsh) oviposits through the gall into the Eurosta larval chamber (Uhler 1951). Also in late summer, the inquiline beetle Mordellistena convicta (LeConte) oviposits on the gall surface, and larvae burrow into the gall, feeding on plant tissue and gall occupants (Abrahamson and Weis 1997). Black-capped chickadees, Poecile atricapillus (L.), and downy woodpeckers, Dryobates pubescens (L.), attack the larvae by pecking through the gall during the winter (Uhler 1951).

EUROSTA, AND NATURAL ENEMIES INTERACTION
These natural enemies of Eurosta larvae exert selection on gall size that varies geographically (Craig et al. 2007, 2020). Birds exert selection for smaller gall diameters and gall lengths because they preferentially attack larvae in long, large-diameter galls. In contrast, the parasitoid E. gigantea and the inquiline beetle M. convicta cause higher mortality in the smaller-diameter, shorter galls where the larvae are easier to reach with their ovipositors, thereby exerting selection for longer and larger-diameter galls (Weis and Abrahamson 1986; Craig et al. 2007, 2020). The combination of birds, E. gigantea, and M. convicta creates stabilizing selection for intermediate gall diameters in the forest (Weis and Abrahamson 1985, 1986; Weis et al. 1992; Weis 1996). In contrast, in the prairie where these birds are rare or absent, there is directional selection for larger galls, resulting in larger gall diameters in the prairie than in the forest (Craig et al. 2007; Craig and Itami 2011). In transitional sites on the prairie-forest biome border, with intermixed patches of treeless prairie and forest vegetation, selection by tree-dwelling birds is inconsistent in time and space, producing a range of intermediate gall sizes. Craig et al. (2020) found that at the prairie-forest biome border selection on gall diameter varied on the scale of a few kilometers, producing a small-tiled geographic mosaic of intermediate gall diameters and shapes. However, Craig et al. (2020) found that gall length at a site was not correlated with selection on gall length at that site, indicating a lack of response to selection. The reasons for this lack of response to selection are unknown. Gall shape also varies geographically (Craig et al. 2007, 2020; Craig and Itami 2011), and this could result from either coevolution or evolution with different factors influencing gall diameter and length.
Different studies have divided Eurosta solidaginis into subspecies based on pigmentation patterns in the wing (Ming 1989; Foote et al. 1993) and host races based on host plant adaptation (Craig and Itami 2011). Ovipositing flies from prairie and forest biomes each prefer their local host plants, and larval performance is higher on their local host plants (Craig and Itami 2011). Also, there is statistically significant but incomplete assortative mating in E. solidaginis on S. altissima subspecies based on host preference, leading to the Craig and Itami (2011) designation of the fly populations as host races. Further research is required to clarify the degree of reproductive isolation between host-associated fly populations and their place on the speciation continuum. The degree of reproductive isolation varies geographically (Craig et al. 2020). The intermediate wing pigmentation patterns indicate gene flow between the E. solidaginis subspecies/host races in the intermixed prairie and forest habitats at the biome border (Brown and Cooper 2006; Craig et al. 2020). Semple et al. (2015) defined geographical subspecies of S. altissima based on plant morphology, with S. altissima altissima in the forest and S. a. gilvocanescens in the prairie. The degree of reproductive isolation between these subspecies remains uncertain and requires further research. Craig et al. (2020) found intermediates between the subspecies based on morphological traits in the mosaic of prairie and forest habitats on the biome border, indicating gene flow between the tall goldenrod subspecies.
The complex variation in the spatial distribution of the host races/subspecies of the flies and the subspecies of S. altissima, and the potential for gene flow among populations of both of the interacting species means that the fly and plant populations at sites potentially could consist of a mix of pure host races/subspecies of E. solidaginis, and a mixture of S. altissima subspecies and intermediates between them. This potential for gene flow and trait mixing meets the third prediction of GMTC.

COEVOLUTION HYPOTHESIS
Demonstrating a geographic mosaic of coevolution requires evidence of reciprocal local adaptation in an interacting pair of species. Parasitoids are larger and have longer ovipositors in prairie sites where gall diameters are larger than in the forest, indicating a coevolutionary arms race between E. solidaginis and its parasitoid E. gigantea. Eurytoma gigantea were smaller and had shorter ovipositors in the forest where gall diameters are small (Craig et al. 2007). However, Weis et al. (1989) demonstrated that the parasitoid feeds on the nutritive layer of gall tissue after consuming the fly larva. Larger galls have more food for parasitoids, and gall size is correlated with parasitoid size, so an apparent arms race between gall size and parasitoid size could occur without genetic coevolution. Nevertheless, prairie populations of parasitoids are significantly larger with longer ovipositors than forest populations even when the effects of gall size are accounted for (Craig and Itami unpublished manuscript), indicating genetic differentiation of populations. The study by Weis et al. (1989) still illustrates that it is crucial to partition phenotypic traits into environmental and genetic components in testing for coevolution.
We tested the geographic mosaic of coevolution hypothesis in the pair-wise interaction of E. solidaginis and S. altissima; previous studies have not tested the hypothesis that E. solidaginis and S. altissima have coevolved. As discussed above, there is stronger selection on E. solidaginis by natural enemies for larger gall diameters and longer gall lengths in the prairie than in the forest. Eurosta galls negatively impact S. altissima fitness, exerting selection for the plant to limit gall growth (Hartnett and Abrahamson 1979; Abrahamson and McCrea 1986). Increasing E. solidaginis gall size diverts resources and reduces S. altissima vegetative growth and flower production (Stinner and Abrahamson 1979). Therefore, we predicted that S. altissima in the prairie, where E. solidaginis induces larger-diameter galls that cause a greater reduction in fitness, will be more resistant to gall diameter growth than in the forest, where Eurosta induces smaller galls that cause a lower reduction in fitness. We predicted that gall sizes would be intermediate in transitional areas between prairie and forest because of variable selection in time and space. These predictions assume that resistance to gall growth is costly, and therefore there is a trade-off between resistance to gall growth and plant growth and reproduction.
We tested the prediction that the geographic mosaic of coevolution has produced a spatially structured positive relationship between the size of the gall growth induced by the fly and the strength of resistance to gall growth by the plant. To do this, we conducted a double reciprocal transplant experiment where the local morphs of both Eurosta and S. altissima were transplanted in a complete factorial design between forest and prairie sites to measure the three components of local adaptation (G × E, G × G, and G × G × E). Common garden and single reciprocal transplant experiments do not accurately measure local adaptation (Nuismer and Gandon 2008). However, a double reciprocal transplant design where populations of both fly and host plant species are reciprocally transplanted among their geographic origins can quantify genetic adaptation because it allows measurements of all three-way G × G × E interactions. Geographic phenotypic variation in traits results from variation in the genotype of species 1 (Gsp₁) × genotype of species 2 (Gsp₂) × environment (E) interaction. Measuring the genetic differences among the local genotypes of species 1 (G₁sp₁) in response to the local genotype of species 2 (G₁sp₂) requires partitioning the genetic effects the two species have on each other from the environmental effects on each species alone. Common garden experiments measure the traits of a species in a single environment, either the home environment of one of the species (i.e., G₁sp₁ × G₁sp₂ × E₁ vs. G₁sp₁ × G₂sp₂ × E₁) or a neutral one (i.e., G₁sp₁ × G₁sp₂ × E₃ vs. G₁sp₁ × G₂sp₂ × E₃), and thereby prevent the separation of the effects of species 2 and the environment on species 1. Reciprocal transplant experiments compare the performance of a population of a species in its environment and a foreign environment (G₁sp₁ × G₁sp₂ × E₁ vs. G₁sp₁ × G₂sp₂ × E₂), again making the separation of the effects of the interacting species and the environment impossible because the environment and the population of the second species change simultaneously.
A fully crossed experimental design in a double reciprocal transplant experiment allows partitioning of the effects of two species on each other from environmental effects. In a double reciprocal transplant experiment, a fully factorial design allows measurements of traits of all possible combinations of plant and insect populations from different environments (G₁sp₁ × G₁sp₂ × E₁, G₁sp₁ × G₂sp₂ × E₁, G₁sp₁ × G₁sp₂ × E₂, G₁sp₁ × G₂sp₂ × E₂). This design makes it possible to partition the variance in the traits of each species in the interaction. It partitions variance into that caused by variance among populations of the other species (G × G), the different environments (G × E), and the interaction of populations and environment (G × G × E).
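The argument about confounded comparisons can be made concrete with a small sketch. The Python below is only an illustration, not part of the study's analysis: the labels G1, G2, E1, and E2 are placeholders for two populations of each species and two environments, and the helper `differing_factors` is a hypothetical name. It lists which factors change in each kind of comparison and counts, in a fully crossed 2 × 2 × 2 layout, the pairs of treatments that differ in exactly one factor.

```python
from itertools import combinations, product

# A treatment is (species 1 genotype, species 2 genotype, environment).
FACTORS = ("species 1 genotype", "species 2 genotype", "environment")


def differing_factors(a, b):
    """Return the names of the factors whose levels differ between two treatments."""
    return [name for name, x, y in zip(FACTORS, a, b) if x != y]


# Common garden: the partner population changes, the environment is held fixed,
# so environmental effects cannot be estimated at all.
common_garden = (("G1", "G1", "E1"), ("G1", "G2", "E1"))

# Single reciprocal transplant: partner population and environment change together.
reciprocal_transplant = (("G1", "G1", "E1"), ("G1", "G2", "E2"))

# Fully crossed design: every combination of the three factors.
full_factorial = list(product(("G1", "G2"), ("G1", "G2"), ("E1", "E2")))

# Count the pairs of treatments that differ in exactly one factor.
clean_contrasts = {name: 0 for name in FACTORS}
for a, b in combinations(full_factorial, 2):
    changed = differing_factors(a, b)
    if len(changed) == 1:
        clean_contrasts[changed[0]] += 1

print(differing_factors(*common_garden))
print(differing_factors(*reciprocal_transplant))
print(clean_contrasts)
```

In the single reciprocal transplant, two factors change at once, so their effects cannot be told apart. The fully crossed layout contains comparisons in which only one factor changes, for every factor.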
Gall morphology is partly due to the interaction of plant and insect genotypes, and genotypic variation in these genotypes will produce a distribution of outcomes in gall morphology. The shape of the distribution of outcomes between coevolved species influences interactions with other species in the community, contributing to the geographical variation in ecological communities (Thompson 1999, 2005). The shape of the distribution of E. solidaginis gall morphologies influences its interactions with the natural enemies and impacts the structure of the natural enemy community (Craig et al. 2007).

RECIPROCAL TRANSPLANT GARDEN
We conducted the experiment in three locations where previous research had found different gall sizes and where the environments differ (Craig et al. 2007; Craig and Itami 2011) (Fig. 1). Our forest biome site was the University of Minnesota Duluth Research and Field Studies Center (46°52′07″N, 92°02′59″W), which is in a habitat classified by the Minnesota Department of Natural Resources (DNR) (2005) as Laurentian mixed forest region, Laurentian upland subsection (212LB). Galls in this area have ovoid shapes and small diameters (Craig et al. 2007; Craig and Itami 2011). Flies all had "forest" wing patterns as defined in Craig et al. (2020) (voucher specimens for this study are available in the UMD insect collection). Plants from this site were classified as "forest" (Craig and Itami 2011), and this corresponds with the S. a. altissima subspecies designation of Semple et al. (2015). The forest-prairie biome transition site was at the Cedar Creek Ecosystem Science Reserve (45°24′07″N, 93°11′27″W), which is classified by the Minnesota DNR (2005) as dry sand-gravel oak savanna (Ups14c). Galls at this site were ellipsoid-shaped and intermediate in diameter (Craig and Itami 2007, 2011; Craig et al. 2020). Flies from this site had "forest" wing patterns. However, plants were intermediate in some characters (Craig and Itami 2011), and Craig et al. (2020) found that plant populations in the prairie-forest biome transition had variable and often intermediate scores on the prairie-forest plant continuum. The third site was the prairie site at the Minnesota State University Moorhead Regional Science Center (46°52′09″N, 96°27′07″W). The site is immediately adjacent to Buffalo River State Park and Bluestem Prairie Scientific Natural Area, one of the region's largest areas of preserved prairie, and the Minnesota DNR (2005) classifies this area as northern mesic prairie (UPn23). Galls at this site have large gall diameters and spherical shapes (Craig et al. 2007; Craig and Itami 2011). All flies had the "prairie" wing pattern.
Populations of E. solidaginis and S. altissima were transplanted from each site to all three sites resulting in 27 different combinations of plant origins, insect origins, and sites, with 3 replicates each. We set up 84 two-by-two-meter plots at each site separated by 46 cm deep aluminum flashing buried in the ground to prevent rhizome spread among plots. The plots were arranged in three blocks of 28 plots, each composed of two rows of 14 plots. All three sites had identical designs of randomized plots of plants. Each site was cleared of all vegetation and cultivated before the plots were initiated.
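As a rough illustration of the 27 treatment combinations, the sketch below enumerates every fly origin × plant origin × garden site combination for the three sites and groups them by which partners are local to the garden. The grouping labels and the function name `classify` are hypothetical conveniences for this example, not terms used in the study.

```python
from collections import Counter
from itertools import product

SITES = ("Duluth", "Cedar Creek", "Moorhead")

# Every (fly origin, plant origin, garden site) combination.
combinations_27 = list(product(SITES, SITES, SITES))


def classify(fly_origin, plant_origin, garden):
    """Label a combination by which partners originate from the garden site."""
    fly_local = fly_origin == garden
    plant_local = plant_origin == garden
    if fly_local and plant_local:
        return "fly and plant local"
    if fly_local:
        return "fly local only"
    if plant_local:
        return "plant local only"
    return "neither local"


tally = Counter(classify(*combo) for combo in combinations_27)
per_garden = Counter(garden for _, _, garden in combinations_27)

print(len(combinations_27))
print(dict(tally))
print(dict(per_garden))
```

Each garden receives nine fly-by-plant combinations, only one of which pairs the local fly with the local plant.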
Our goal was for each plot to contain a unique, diverse combination of plant genotypes from one site. In April 2010, we collected multiple rhizomes from 50 plants from a 20 km radius of the three sites (150 plants total). We defined a plant as being ramets physically connected by rhizomes indicating that they were from the same genet. We maximized the number of different genets by collecting plants from distinct clumps separated from adjoining plants by at least five meters, in many instances by several kilometers, with morphologically distinct traits. We assumed that these represented different genotypes, but we did not complete genetic analyses to test this assumption. We collected plants from sites with relatively recent disturbance that supports rapidly expanding plants producing many rhizomes. Plants collected in 2009 consisted of multiple physically connected stems with multiple new rhizomes attached (usually 5-20). We used a 2009 stem and all attached ramets as a planting unit. We randomly assigned nine plants to each plot, with the restriction that each plot had a unique combination of plants. In each plot, we planted nine planting units equally spaced within the plot. Although we did not count the rhizomes per plot, we estimate that each plot contained approximately 90 to 100 rhizomes from nine plants.
In 2010, we did not conduct any experiments. We provided supplemental water, weeded the plots of all plants other than the rhizomes we had planted, and removed all seed heads of plants at the end of the growing season to minimize the probability of different genotypes colonizing the plots.
In fall 2010, we collected a total of more than 30,000 galls from areas within a 20 km radius of each of our three sites. We placed one-third of the galls from each site at each of the three garden sites in the fall so that the population at each site would experience local environmental conditions. Galls were placed in mesh bags and stored in wire mesh cages elevated one meter off the ground. On 10 May 2011, we removed the galls from the cages and hung them on lines in small mesh bags. We checked and removed all emerging flies each day, sorted them by sex into small transport cages, and assigned them to specific garden plots.
To determine the relationship between the gall morphology of the source populations and those in the experiments, we randomly sampled 180 galls from each site and measured their diameters and lengths with dial calipers. We dissected galls to determine if Eurosta or a natural enemy had survived. To analyze gall size and shape, we included only those galls where Eurosta or a natural enemy had survived because gall growth depends on the survival of living larvae, and gall size growth is completed before the natural enemy attack (Weis and Abrahamson 1986). Gall diameter was measured as the maximum diameter of the gall, and gall length was measured between the points on either side of the center of the gall where the stem increased in diameter. We calculated gall shape by dividing gall diameter by gall length.
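The measurement rules above can be sketched in a few lines. This is a minimal illustration, assuming caliper measurements recorded in millimeters (the unit is an assumption) and using made-up example values. The `Gall` record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Gall:
    diameter_mm: float        # maximum gall diameter (unit assumed)
    length_mm: float          # length between the points where the stem widens
    occupant_survived: bool   # Eurosta or a natural enemy survived, per dissection


def gall_shape(gall):
    """Gall shape as defined in the text: diameter divided by length."""
    return gall.diameter_mm / gall.length_mm


def analyzable(galls):
    """Keep only galls in which Eurosta or a natural enemy had survived."""
    return [g for g in galls if g.occupant_survived]


# Made-up example values, for illustration only.
sample = [
    Gall(20.0, 25.0, True),
    Gall(15.0, 24.0, False),  # excluded: no surviving occupant
    Gall(24.0, 24.0, True),
]

kept = analyzable(sample)
shapes = [gall_shape(g) for g in kept]
print(len(kept), shapes)
```

Under this definition a shape value near 1 describes a nearly spherical gall, and smaller values describe galls that are longer than they are wide.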
In spring 2011, we did not weed the plots or provide any supplemental water as we had done in 2010. During the first week of May, we placed cages over each plot consisting of rebar pounded into the ground, with sections of plastic irrigation pipe placed over the ends of each pair of rebar to form an arch (Fig. 1A). To exclude insects from colonizing the plots, we covered these arches with Agribon® (Berry Plastics, Evansville, IN, USA), a nonwoven fabric used in organic farms to protect crops from herbivory. The cloth edges were stapled to 2-by-4 dimensional lumber to provide the cage edges and then buried in the ground to secure the cages. The different plants within each plot had produced many new rhizomes and had become intermixed, forming a dense stand, which made it difficult to identify which of the nine plants had produced a stem (Fig. 1B), and we did not record plant identity when collecting data.
Our goal was to maximize gall formation in each cage. To accomplish this, we introduced groups of five male and five female flies into each cage. Every three days, we checked the number of ovipunctures, which are easily visible marks indicating that a Eurosta has inserted its ovipositor in the bud, although an ovipuncture does not necessarily indicate that an egg was laid (Abrahamson and Weis 1995). If fewer than 50% of the stems in the plot had been ovipunctured, we added another five fly pairs. We repeated this process until the 50% threshold was reached or the supply of flies was exhausted. Not all plots had 50% of the stems ovipunctured.
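The stocking rule described above amounts to a simple decision made at each three-day check. The sketch below is one possible reading of it, not the authors' protocol in code. The function name `pairs_to_add` and its parameters are hypothetical, and adding whatever pairs remain when fewer than five are left is an assumption the text does not state.

```python
def pairs_to_add(ovipunctured_stems, total_stems, pairs_in_reserve,
                 batch=5, threshold=0.5):
    """Number of fly pairs to add to a cage at one three-day check."""
    if total_stems <= 0:
        raise ValueError("a plot must contain at least one stem")
    if ovipunctured_stems / total_stems >= threshold:
        return 0  # threshold reached, stop stocking this cage
    # Assumption: if fewer than a full batch remain, add what is left.
    return min(batch, pairs_in_reserve)


print(pairs_to_add(20, 100, 30))  # below threshold, flies available
print(pairs_to_add(50, 100, 30))  # threshold reached
print(pairs_to_add(10, 100, 0))   # supply exhausted
```

The last case corresponds to the plots that never reached 50% ovipunctured stems because no flies remained.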
The cages were removed at each site 14 days after the last fly emerged, ensuring no additional ovipunctures were added to the plots. The plants were then allowed to grow normally (Fig. 1C). Galls were collected in early October prior to bird predation and overwintered in bags in screen cages at each of the sites. We brought the galls into the laboratory on May 1, 2012 and individually reared them at room temperature in compostable 120 ml clear cups (Eco-Products, Boulder, CO, USA). The organisms emerging from each gall were recorded daily. After emergence was completed in July 2012, gall maximum diameter and length were measured using dial calipers. Gall shape was calculated by dividing gall diameter by gall length. As described for the wild galls to analyze gall size and shape, we dissected all galls but included only those galls where Eurosta or a natural enemy had survived in the analyses. Gall diameter, gall length, and shape were analyzed using ANOVA with fly origin, plant origin, and garden location (site) as fixed factors because we chose these populations for their specific characteristics. We used the mean gall diameter from each plot in our analysis of variance because the plot was the experimental unit.
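Because the plot, not the individual gall, was the experimental unit, gall measurements are collapsed to one mean per plot before the analysis of variance. A minimal sketch of that aggregation step follows. The plot identifiers and diameters are made-up values, and the function name `plot_means` is hypothetical. The ANOVA itself was run in statistical software and is not reproduced here.

```python
from collections import defaultdict
from statistics import mean


def plot_means(records):
    """Collapse (plot_id, gall_diameter) records to one mean diameter per plot."""
    by_plot = defaultdict(list)
    for plot_id, diameter in records:
        by_plot[plot_id].append(diameter)
    return {plot_id: mean(values) for plot_id, values in by_plot.items()}


# Made-up measurements, for illustration only.
records = [
    ("plot A", 20.0),
    ("plot A", 22.0),
    ("plot B", 18.0),
]

means = plot_means(records)
print(means)
```

Averaging within plots keeps galls from the same cage from being treated as independent observations.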

STATISTICAL ANALYSIS
All statistical analyses and graphs described in the methods and results were completed in MINITAB 18® Statistical Software (State College, PA, USA). We graphed the distribution of gall traits to examine the distribution of outcomes and used Levene's test for equal variances. We first conducted multivariate analysis of variance (MANOVA) to test whether there were overall differences among plant and insect populations when all traits were considered. When the MANOVA showed significant differences, we used analysis of variance to test for the significance of fly origin, plant origin, and garden location. We used the Akaike Information Criterion in model reduction to choose the best-fit model. We evaluated models containing all combinations of the main effects with interaction terms. The main effects were retained in all models, but interaction terms were eliminated if they were non-significant at the P < 0.05 level.

(C) Gall shape = diameter/length. The interquartile range box represents the pooled data for the middle 50% of each field site, whiskers represent the ranges for the bottom 25% and the top 25%, and gall means for each field site are plotted with diamond crosshair symbols.
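To illustrate how an information criterion trades fit against model size during model reduction, the sketch below uses the common least-squares form AIC = n ln(RSS/n) + 2k, which omits an additive constant. The residual sums of squares, sample size, and parameter counts are made-up values, and the exact AIC variant reported by the statistical software is not stated in the text, so this is only an assumption-laden example.

```python
import math


def aic_least_squares(rss, n, k):
    """AIC for a Gaussian least-squares fit, up to an additive constant.

    rss: residual sum of squares, n: number of observations,
    k: number of estimated parameters.
    """
    return n * math.log(rss / n) + 2 * k


# Hypothetical candidate models: (residual sum of squares, parameter count).
n_obs = 100
candidates = {
    "main effects only": (120.0, 7),
    "main effects + all interactions": (115.0, 27),
}

scores = {name: aic_least_squares(rss, n_obs, k)
          for name, (rss, k) in candidates.items()}
best = min(scores, key=scores.get)

print(scores)
print(best)
```

In this made-up case the interaction terms reduce the residual sum of squares only slightly, so the penalty for the extra parameters favors the main-effects model.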

GALL MORPHOLOGY IN THE FIELD
Gall diameter, gall length, and gall shape varied significantly among sites (Wilks F 6,744 = 26.736, P < 0.0001). The Akaike index indicated that a model without interaction terms was the best-fit model. All of the interaction terms removed were statistically non-significant at the P < 0.05 level. A Tukey's multiple range test showed that each site was a unique group, with galls from Moorhead having the largest diameters, Cedar Creek having intermediate diameters, and Duluth having the smallest diameters (F 2,461 = 76.80, P < 0.0001, Fig. 2A). In addition, a Tukey's multiple range test showed that Duluth galls were significantly shorter than Cedar Creek and Moorhead galls, which together formed a homogeneous group (F 2,461 = 16.37, P < 0.0001, Fig. 2B). Moorhead galls were significantly more spherical than Cedar Creek and Duluth galls, which together formed a homogeneous group (F 2,461 = 32.51, P < 0.0001, Fig. 2C).
The natural population of galls at each site significantly differed in variance for gall shape (Levene's test = 4.23, P < 0.0015), but not in gall diameter or length. The shapes of the normal curves for gall diameter (Fig. 3A) and gall length (Fig. 3B) were similar among the three sites, but Moorhead flies produced a wider range of gall shapes (Fig. 3C).
Table notes (fragments): The number of replicates used in the analysis of each fly origin by plant origin category is indicated in parentheses. The means of gall diameter for each plot were used in the analysis.

TRANSPLANT EXPERIMENT
In the experiment, 2997 galls were formed. We analyzed the gall morphology of the 1588 galls that contained Eurosta or natural enemies. Galls with larvae that died before natural enemy attack were excluded from the analysis because gall growth requires larval survival. We also excluded 104 galls that had larvae at the time of natural enemy attack but occurred in adjacent nodes. These galls overlapped in their alteration of the stem, which made it difficult to determine the dimensions of the two or more galls growing together, and they were statistically significant outliers. We used plots as independent statistical units in the analysis (Table 1). Gall diameter, length, and shape varied significantly by fly origin (Wilks F 8,362 = 7.532, P < 0.0001), plant origin (Wilks F 8,362 = 6.369, P < 0.0001), and the fly origin by plant origin interaction (Wilks F 8,362 = 2.933, P < 0.0001).
The Akaike index indicated that a model without interaction terms was the best-fit model explaining gall diameter, and all of the interaction terms removed were statistically non-significant at the P < 0.05 level. The fly origin and plant origin significantly influenced gall diameter, but garden location and interactions among factors did not significantly affect diameter (Table 2). The variances for gall diameter did not vary significantly due to either plant or fly origin. The shapes of the distribution of gall diameter were similar, but Moorhead flies had a wider variance on Duluth plants, and conversely, Duluth flies had a wider variance on Moorhead plants (Fig. 5).
The Akaike index again indicated that a model without interaction terms was the best-fit model for gall length and that all of the interaction terms removed were statistically non-significant at the P < 0.05 level. The plant origin significantly influenced gall length, but the fly origin, garden location, and interactions among factors had no significant effect on gall length (Table 3, Fig. 4B). Gall length was significantly greater on Duluth plants (LSE ± SE = 21.77 ± 0.60, Tukey Group a) than on Cedar Creek plants (LSE ± SE = 19.44 ± 0.42, Tukey Group b) and Moorhead plants (LSE ± SE = 18.59 ± 0.83, Tukey Group b). The variances for gall length did not differ significantly due to either plant or fly origin. The shapes of the distribution of gall length were similar, but Moorhead flies had a wider variance on Duluth plants, and conversely, Duluth flies had a wider variance on Moorhead plants (Fig. 5).
As for the other gall traits, the Akaike index indicated that a model without interaction terms was the best-fit model for gall shape, and all of the interaction terms removed were statistically non-significant at the P < 0.05 level. The fly origin significantly influenced gall shape, but the plant origin, garden location, and the interactions between factors had no significant impact on gall shape (Table 4, Fig. 4C). A sphere would have a diameter/length ratio of 1.0. Moorhead flies induced significantly more spherical galls (LSE ± SE = 0.91 ± 0.017, Tukey Group a) than Cedar Creek flies (LSE ± SE = 0.82 ± 0.013, Tukey Group b) or Duluth flies (LSE ± SE = 0.83 ± 0.011, Tukey Group b). The variances for gall shape did not vary significantly due to either plant or fly origin. The shapes of the distribution of gall shape did not differ among plant or insect origins (Fig. 5).

Discussion
The double reciprocal transplant design demonstrated adaptive genetically-based geographic variation in gall diameter growth in the flies and resistance to gall diameter growth in host plants. Gall trait distribution due to the interaction of coevolved plant and fly genotypes determines the susceptibility of E. solidaginis larvae to natural enemies.
Fly populations differed in their resistance to natural enemies due to differences in the means of their trait distributions, not the shape of their trait distributions. Eurosta solidaginis showed locally adaptive genetic variation by inducing galls that provided the greatest protection against local natural enemies. At the prairie site (Moorhead), where bird predation is largely absent, flies induced the largest diameter galls that provide the greatest protection against parasitoids and inquilines. At the forest site (Duluth), where birds, parasitoids, and inquilines are present, exerting stabilizing selection for gall diameter, flies induced smaller diameter galls, and at the transitional oak savannah site (Cedar Creek), flies induced intermediate diameter galls.
Solidago altissima also showed locally adaptive genetic resistance to gall diameter growth. Plants were most resistant to gall growth in the prairie (Moorhead), where the E. solidaginis population induced the largest diameter galls, and plants were least resistant in the forest (Duluth), where the fly induced smaller diameter galls, and intermediate resistance occurred in the oak savannah vegetation (Cedar Creek) where flies induced intermediate diameter galls. This resistance pattern indicates that the plant increases defenses against the gallmaker where the fly induces increasing gall diameter growth. Thus, the prairie is a hotspot for the coevolution of E. solidaginis and S. altissima where the coevolutionary arms race exerts strong selection for the fly to induce larger diameter galls and the plant to resist induction of larger diameter galls. In contrast, the forest is a coldspot with no selection for a change in gall diameter by the fly or for plants to evolve stronger resistance to gall diameter growth. The Cedar Creek sites on the forest-prairie biome border had an intermediate level of gall growth induction by the fly and resistance to gall growth.
The distributed outcomes in gall traits due to the interaction of the coevolved insect and plant genotypes resulted in a shift in the mean of these traits in different environments but little shift in the variance or the shape of the distribution. Local variation in gall morphology is consistent with differences due to additive genetic variance from interacting coevolved insect and plant genotypes (Fig. 2). Naturally occurring galls in Moorhead are
This experiment's results indicate that selection limits gene flow between fly and plant populations in different habitats. If prairie host race flies migrated to the forest and induced relatively large diameter galls by ovipositing on forest plants, they would suffer very high mortality from bird predation. Conversely, if forest host race flies migrated to the prairie and induced relatively small diameter galls in the prairie, they would suffer very high mortality from the parasitoid and the inquiline. Plants growing in their non-local habitat would also suffer reduced fitness. If forest plants grew in the prairie, their fitness would be reduced by the very large gall masses induced by prairie flies to which they had little resistance. On the other hand, if prairie plants grew in the forest, they would invest resources in defense against gall diameter growth that would be wasted against forest flies that could not induce large-diameter galls. The small-tiled mosaic of variation in gall diameter and plant traits in areas where prairie and forest habitats are intermixed on a small geographic scale supports the hypothesis that selection is strong enough to overcome the influence of gene flow (Craig et al. 2020).
Table notes (fragments): The means of gall length for each plot were used in the analysis. The means of gall shape for each plot were used in the analysis.
Variation in bird attack produces hot and coldspots in the E. solidaginis -S. altissima geographic mosaic of coevolution. A coldspot occurs in the forest where stabilizing selection resulting from high bird mortality on larger diameter galls combined with high parasitoid and inquiline predation on larvae in small galls eliminates any selection for a coevolutionary arms race between fly and plant for changes in gall diameter growth. A hotspot occurs in the prairie where the combination of the absence of birds and the presence of parasitoids and inquilines exerts strong selection on the fly to induce larger galls, and as a result, selection is exerted on plants to reduce gall growth resulting in an escalating arms race between the fly and plant for control of gall-diameter growth. This cascading effect also creates a coldspot in the coevolution of gall diameter and the ovipositor length in the fly's parasitoid wasp Eurytoma gigantea in the forest and a hotspot for increasing gall diameter in the fly and increasing ovipositor length in the parasitoid in the prairie.
Gall length shows a lack of coevolution between E. solidaginis and S. altissima as gall length was influenced only by the plant origin and not by fly origin. Craig et al. (2020) demonstrated selection on gall length, but there was no response to this selection. Further analysis of the data used by Craig et al. (2020) showed that variation in bird predation produced the variation in selection on gall growth (unpublished data). Birds caused higher mortality in the very longest galls, and therefore it would be predicted that flies would induce shorter galls in the forest. Consistent with this prediction, field-collected Duluth galls had the shortest mean length. However, the experimental data showed that flies had no influence on gall length and that Duluth plants produced the longest galls. The differences in gall length in the experiment and the field data could be due to environmental effects. The galls used in the 2011 experiment were galls initiated in 2010, and the environmental conditions vary among years. The Duluth galls were smallest in both diameter and length in the 2010 cohort, indicating that the environmental effects limited overall gall growth in this area.
This lack of fly influence on gall length could indicate such strong selection on plant traits that influence gall length that plants have evolved resistance that prevents flies from influencing gall length. Weis and Abrahamson (1986) demonstrated that gall length is a heritable trait in E. solidaginis, but plant resistance may have prevented a response to selection by the fly. A common garden experiment showed genetic differences between S. altissima altissima and S. a. gilvocanescens in several plant traits (Craig and Itami 2011). The maintenance of significant differentiation between the plant subspecies on small geographic scales indicates that these differences are adaptive (Craig et al. 2020). The forest subspecies grew significantly taller than the prairie subspecies, indicating significant geographical differences in selection on stem length growth by the plant that counter any selection by the fly to influence stem elongation to alter gall length. Environmental differences between prairie and forest environments, including differences in drought stress and interspecific competition, could select different plant growth patterns that would influence gall elongation, but further research is needed on selection for differentiation between the subspecies to test this hypothesis.
Prairie galls are more spherical than forest galls, and our results indicate this is the result of selection acting independently on gall diameter and gall length to produce different shapes. The fly population had the only significant impact on gall shape because differences among fly populations altered only one dimension: gall diameter. In contrast, plant origin determined both gall diameter and gall length. Because Duluth plants had larger gall diameters and longer gall lengths, and Moorhead and Cedar Creek plants had smaller gall diameters and shorter gall lengths, differences among plant populations did not alter gall shape. The more spherical galls produced by Moorhead flies result from the flies' ability to increase gall diameter but not length. The more ellipsoid galls of Cedar Creek and Duluth flies result from smaller diameter galls induced on plants that did not significantly differ in gall length from Moorhead plants.
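The arithmetic behind this argument can be made explicit with a small sketch. The shape index is the diameter/length ratio defined in the results; the baseline dimensions and the 10% changes below are hypothetical values chosen only for illustration, and the assumption that a plant effect changes both dimensions by the same proportion is a simplification, not a result of the study.

```python
def gall_shape(diameter, length):
    """Shape index used in the text: diameter divided by length (1.0 = sphere)."""
    return diameter / length


# Hypothetical baseline gall dimensions in mm (illustrative, not study data).
base_diameter = 16.0
base_length = 20.0
base_shape = gall_shape(base_diameter, base_length)

# Fly effect: only the diameter changes, so the shape index changes.
fly_shape = gall_shape(base_diameter * 1.1, base_length)

# Plant effect (simplifying assumption): diameter and length change by the
# same proportion, so the shape index stays the same.
plant_shape = gall_shape(base_diameter * 1.1, base_length * 1.1)
```

Under these assumptions the fly effect moves the ratio toward 1.0 (a more spherical gall), while the plant effect leaves it at the baseline value.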
An alternative hypothesis that gall shape varies due to bird predation has not been supported in other studies. Birds are rare or absent in the prairie, and if they preferentially preyed on spherical galls, this could explain the evolution of more ellipsoid galls in the forest. However, in an experiment where birds were offered the choice between prairie and forest galls, gall shape was not a significant predictor of preference (unpublished data). Also, Craig et al. (2020) found no response to the variable selection on gall length among geographic populations.
The lack of a significant impact of among-site environmental variation on gall morphology was an unexpected result. Previous work had demonstrated that the environment influences gall morphology (Weis and Abrahamson 1986), and we had selected the sites based on large environmental differences. The lack of impact of environmental variance among sites does not imply a lack of environmental effects on gall morphology. The large amount of gall trait variation that was not explained by plant or fly origin was probably due to within-site environmental variation. Multiple within-site factors could have influenced variation in gall morphology among galls, including the internode attacked, intraspecific competition or facilitation by other Eurosta, interspecific competition or facilitation by other herbivores, and variation in ramet growth within and among plots. We hypothesize that there may be strong among-site environmental influences on gall morphology that fluctuate through time, and that the G × E and G × G × E interactions might be significant in other years. The environment in 2011 may have been particularly benign for plant and gall growth, minimizing the among-site effects. There was above-average rainfall at all sites and a normal growing season, so all plant populations grew well at all sites (National Weather Service). For example, Duluth and Cedar Creek plants are adapted to higher rainfall locations and may have grown well even in Moorhead with abundant rainfall. Later studies in years with more extreme weather at these sites found large differences in gall morphology due to site effects. For example, weather produced large variations in conditions among sites in 2012. Moorhead had extremely warm spring weather and dry conditions compared to Duluth and Cedar Creek, and this produced large among-site differences in gall morphology among gardens (unpublished data).
The strong variation in gall numbers among treatments shows that, in addition to the selection mosaics on gall morphology, there are selection mosaics for fly oviposition preference and the ability to survive on a host plant, and for plant resistance to oviposition and fly survival. We will report these differences in a subsequent paper. The differences in gall numbers were due to the combined effect of variation in the fly populations' oviposition preferences for host plant populations and the flies' ability to survive on host plant populations. A logical prediction would be that flies that survived on a plant population resistant to them would have relatively low vigor and would not be able to control gall growth, but this prediction was not supported. The ability to induce a gall was uncorrelated with the gall size induced. For example, few Moorhead larvae survived to induce galls on Duluth and Cedar Creek plants, but those larvae induced the largest diameter galls in the entire experiment (Fig. 2a). This indicates that the most critical stage in E. solidaginis survival is the ability to induce a gall, and once a fly induces a gall, the gall growth is typical for that fly population. These results suggest that different genes control fly survival and gall induction ability.
Geographic populations of E. solidaginis and S. altissima demonstrate all of the requirements necessary for a geographic mosaic of coevolution (Thompson 2005). First, there is a geographic selection mosaic for the strength of gall diameter growth induction in E. solidaginis and a reciprocal geographic selection mosaic for resistance to gall growth in S. altissima (Craig et al. 2007; Craig et al. 2020). Second, there are coevolutionary hotspots and coldspots. There is a coevolutionary hotspot with a coevolutionary arms race in the prairie for an increase in gall diameter in E. solidaginis and resistance to increased gall diameter in S. altissima. In contrast, the forest is a coldspot due to stabilizing selection on gall diameter by natural enemies, which produces no selection for an increase in gall diameter by the fly or for increased resistance by the plant. Finally, Craig et al. (2020) found interactions between the populations in prairie hotspots and forest coldspots in areas where prairie and forest habitats are mixed in a small-tiled geographic mosaic.
Our results indicate the importance of the bottom-up effects of abiotic variation in producing a geographic mosaic of coevolution. The presence of birds determines whether there will be a hotspot in the interaction between the plant and the gall-inducer. The presence of tree-dwelling birds (woodpeckers and chickadees) depends on the presence of trees, which is ultimately determined by climate. Climatic variation producing the distribution of prairie and forest biomes determines the distribution of the coevolved genotypes of S. altissima and E. solidaginis and the resulting distribution of gall morphologies. Thus, as Smith (1970) argued in his classic paper on coevolution, the abiotic environment is the ultimate independent variable determining the course of the coevolutionary interactions among species.

FURTHER RESEARCH
The interpretation of these results relies on assumptions that require further research. A critical assumption is that increasing gall growth decreases plant fitness. Stinner and Abrahamson (1979) found that galls containing E. gigantea were smaller and diverted less energy to gall growth, reducing their impact on plant fitness, and they interpreted this as the result of the parasitoid inhibiting gall growth. Weis and Abrahamson (1985) subsequently demonstrated that parasitoid attack does not inhibit gall growth because the parasitoids attack larvae after gall growth is complete. Eurytoma gigantea parasitizes only small galls because its short ovipositor cannot reach larvae in the center of larger galls. Therefore, small galls cause less reduction in fitness to S. altissima because small galls require less allocation of the plant's resources. Measuring the impact of increasing gall size on plant fitness across the complete range of gall sizes in both subspecies of S. altissima would allow quantification of selection for plant defense. Sakata et al. (2014) supported the hypothesis that S. altissima defense against herbivores is costly. They found that introduced S. altissima in Japan that had not been exposed to herbivory by the lace bug Corythucha marmorata for 100 years had lost their resistance to the lace bug but rapidly regained it following the accidental introduction of the lace bug to Japan. The newly resistant S. altissima populations had higher sexual and asexual reproduction when exposed to lace bug herbivory, indicating resistance was lost when it was costly and regained when benefits exceeded costs.
Another limitation of this study was that it was restricted to the interactions of plant and fly populations at three sites. In a double reciprocal transplant experiment, each added site rapidly expands the number of plots. However, since the plant and fly genetic effects on gall trait variation were large compared to environmental effects, future studies could use simpler common garden designs to measure gall variation due to plant and fly genotypes.
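How quickly the design grows can be sketched by counting treatment combinations. The sketch assumes a fully crossed design in which every site supplies a garden, a plant population, and a fly population, and in which every fly origin by plant origin combination is grown in every garden; the function name is hypothetical, and replication within each combination is ignored.

```python
from itertools import product


def treatment_combinations(sites):
    """List every (garden, plant origin, fly origin) combination.

    Assumes a fully crossed design: each site contributes one garden,
    one plant population, and one fly population.
    """
    return list(product(sites, repeat=3))


sites = ["Moorhead", "Cedar Creek", "Duluth"]
combos = treatment_combinations(sites)
n_three_sites = len(combos)

# Adding a fourth (hypothetical) site under the same assumption.
n_four_sites = len(treatment_combinations(sites + ["Site 4"]))
```

Under this assumption the number of combinations is the cube of the number of sites, so moving from three to four sites more than doubles the number of treatment combinations before any replication is added.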

BROADER IMPLICATIONS
Advancing from phenotypic descriptions of local variation in putative geographic mosaics of coevolution to measuring the genotypic and environmental components of local variation in traits is crucial in understanding how geographic mosaics of coevolution evolve (Nuismer and Gandon 2008). Partitioning the three components of local adaptation using the Nuismer and Gandon (2008) design, as we did in this study, provides a rigorous test of the central assumption of the GMTC that reciprocal adaptive genetic differences in interacting species contribute to geographic variation. It allows the description of the G × E and G × G × E selection mosaics that produce geographic mosaics of coevolution and the potential to compare how they differ among coevolved interactions.
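One piece of such a partition can be illustrated with a much-simplified sketch: splitting the variation in a balanced table of cell means into a fly-origin component, a plant-origin component, and their interaction. This is not the analysis used in this study; it ignores garden effects and within-cell replication, and the function name and the table values are hypothetical.

```python
def partition_cell_means(table):
    """Partition variation in a balanced table of cell means.

    table[i][j] is the mean trait value for fly origin i on plant origin j.
    Returns sums of squares for fly origin, plant origin, and interaction.
    """
    n_fly = len(table)
    n_plant = len(table[0])
    grand = sum(sum(row) for row in table) / (n_fly * n_plant)

    # Marginal means for each fly origin (rows) and plant origin (columns).
    fly_means = [sum(row) / n_plant for row in table]
    plant_means = [sum(row[j] for row in table) / n_fly
                   for j in range(n_plant)]

    ss_fly = n_plant * sum((m - grand) ** 2 for m in fly_means)
    ss_plant = n_fly * sum((m - grand) ** 2 for m in plant_means)
    # Interaction: what remains after removing both additive effects.
    ss_interaction = sum(
        (table[i][j] - fly_means[i] - plant_means[j] + grand) ** 2
        for i in range(n_fly)
        for j in range(n_plant)
    )
    return {"fly": ss_fly, "plant": ss_plant, "interaction": ss_interaction}


# Hypothetical, purely additive table: each cell is a fly effect plus a
# plant effect, so the interaction component should be zero.
additive = [
    [0.0, 10.0, 20.0],
    [1.0, 11.0, 21.0],
    [2.0, 12.0, 22.0],
]
parts = partition_cell_means(additive)
```

In a table where the effect of fly origin depends on plant origin, the interaction term would be greater than zero; a full analysis of a transplant design would add garden location and its interactions as further terms.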
Partitioning the genetic and environmental components of local adaptation in traits makes it possible to understand an interaction's evolutionary trajectory and predict how perturbations would alter it. For example, in the year we conducted the study, we found G × G interactions strongly influenced gall morphology, but G × E and G × G × E interactions were relatively weak. On this basis, we would predict that introducing foreign genotypes into a local population would strongly impact gall morphology but that environmental changes such as a shift in climate would have relatively small effects. Species pairs found in the same ecological community that differed in the impact of their G × G, G × E, and G × G × E interactions would have different evolutionary trajectories. Measuring these interactions in multiple species pairs in an ecological community would provide insight into their impact on community structure and how environmental or genetic perturbations would alter this structure. For example, another herbivore that had coevolved with S. altissima might be strongly influenced by the G × E and G × G × E interactions, and large environmental differences among sites would alter the interaction of this species with both goldenrod and E. solidaginis.
The geographic variation in ecological communities is at least partially due to geographic mosaics of coevolved interactions. Measurement of the coevolutionary interaction components of multiple species pairs in ecological communities could partially explain geographic variation among communities. Whitham et al. (2003) proposed the concept of community genetics, where the genetics of key species can influence the structure of an ecological community. A logical extension of this concept is that understanding the coevolved genetics of interacting species can help explain geographic variation in communities. Local coevolved differences in the traits of interacting species could have extensive ramifications that affect the structure of the whole community, creating geographic differences among communities. Understanding the underlying mechanisms that produce these community differences requires partitioning the components of the coevolved interaction that form the basis of these differences.
Finally, applying the techniques used in this study will permit the search for patterns in the forces creating geographic mosaics of coevolution. The structure of phenotypic mosaics of traits of interacting species does not reveal the genetic and environmental forces producing them: similar phenotypic mosaics can result from a range of combinations of G × G, G × E, and G × G × E interactions. Measuring the impact of these interactions on traits in multiple coevolved interactions will allow comparative studies that could produce generalizations about the underlying forces producing geographic mosaics of coevolution. Models could then be developed that would predict where coevolutionary