Embryo retention, character optimization, and the origin of the extra‐embryonic membranes of the amniotic egg

This study summarizes the data relevant to understanding the appearance of the amniotic egg and provides new analyses to determine the inferences that can be drawn from these data. A survey of the recent literature shows that a consensus exists on the probably primitive absence of extended embryo retention in caecilians, despite recent suggestions to the contrary. The two most recent studies on the evolution of embryo retention in sarcopterygians both suggest that early amniotes lacked extended embryo retention. New analyses of length distribution of random trees suggest that the data on embryo retention do not include a phylogenetic signal, and this implies that character optimization of these data does not yield reliable information on the primitive condition for amniotes. Thus, the study of ancestral features of amniotes will probably have to exploit data from the fossil record.


Introduction
The origin of the amniotic egg is a classical evolutionary problem that has interested palaeontologists and evolutionary biologists for decades (Romer 1956, 1957; Carroll 1970, 1991; Lombardi 1994). Because of the poor preservation potential of eggs that lack a mineralized external shell, the oldest known fossil amniotic egg dates only from the Upper Triassic (Bonaparte and Vince 1979), although nests (without remains of the eggs that they must have contained) have been found in the Lower Triassic (Coyne 1999). Therefore, the origin of this type of egg must be studied indirectly, either using characters that are thought to be correlated with egg type or size, or by studying the egg types of extant taxa and using character optimization to try to reconstruct the condition of the distant common ancestor of amniotes that lived more than 310 million years ago (Laurin 2004). Both approaches have been used recently. The first approach was adopted by Laurin (2004), who used optimization by squared-change parsimony, independent contrasts, and permuted multiple linear regressions on a dataset of 107 species of early stegocephalians to study the evolution of body size in early amniotes and their close relatives and hence test Carroll's (1970, 1991) scenario on the origin of the amniotic egg. The second method was used by Laurin and Reisz (1997), Wilkinson and Nussbaum (1998), Laurin and Girondot (1999), Laurin et al. (2000), and Wilkinson et al. (2002), who optimized the presence or absence of extended embryo retention to test Lombardi's (1994) hypothesis that some structures of the ''amnion and serosa evolved in response to the selective pressures for hypertrophied respiratory exchange surfaces in an embryo-retaining ancestor''. Of Lombardi's (1994) idea, only the presence or absence of extended embryo retention in an ancestor of amniotes was tested through the use of optimization of this character in extant sarcopterygians (Laurin and Girondot 1999).
The main conclusion of this series of studies was that the hypothetical ancestors of amniotes probably did not exhibit extended embryo retention, thus rejecting Lombardi's (1994) interesting hypothesis. This result holds for the most widely accepted phylogeny of sarcopterygians, if caecilians are considered not to display extended embryo retention. However, under other phylogenies, if caecilians (whose development is poorly documented compared to that of other sarcopterygians) are considered to display extended embryo retention, the optimization of the character in the last common ancestor of amniotes becomes ambiguous and the test can neither reject nor lend support to Lombardi's (1994) suggestion.
A brief review of the previous studies of the evolution of embryo retention in sarcopterygians may be useful to understand the background against which the present study was undertaken. Laurin and Reisz (1997) provided the first phylogenetic test of Lombardi's (1994) hypothesis that the extra-embryonic membranes appeared in an embryo-retaining ancestor. That study was criticized by Wilkinson and Nussbaum (1998), based largely on the erroneous premise that extended embryo retention equated to viviparity. Laurin et al. (2000) clarified this point, and the clarification was accepted by Wilkinson et al. (2002). Laurin and Girondot (1999) performed an additional literature search, provided a more objective and explicit coding scheme for embryo retention, and reevaluated the evolution of this character in sarcopterygians. As such, that study (Laurin and Girondot 1999) expressed my most recent point of view on embryo retention. More recently, Wilkinson et al. (2002) provided a valuable review of the evidence on the developmental stage at which caecilian eggs are laid, but their comments in reference to my work on this topic focused on two earlier studies (Laurin and Reisz 1997; Laurin et al. 2000) that contain points of view that I had since expressly rejected (Laurin and Girondot 1999). Thus, several comments made by Wilkinson et al. (2002) are misleading. Wilkinson et al. (2002) concluded that ''neither the ancestral caecilian nor the ancestral amniote had EER [extended embryo retention]''. Here, arguments are presented that suggest, contrary to the conclusion of Wilkinson et al. (2002), that the presence or absence of extended embryo retention in the ancestors of amniotes cannot be known with the available data and analytical techniques.

A few facts
As mentioned above, much of the criticism of Wilkinson et al. (2002) is misdirected because it discusses the somewhat more dated studies of Laurin and Reisz (1997) and Laurin et al. (2000) as opposed to the most up-to-date report of Laurin and Girondot (1999), as shown by the fact that these studies are cited 29, 26, and 15 times, respectively, in Wilkinson et al. (2002). Similarly, the three figures (1-3) that show the evolution of embryo retention in Wilkinson et al. (2002) are derived from Laurin and Reisz (1997), the oldest of my three studies on this topic. Possibly, Wilkinson et al. (2002) did not notice that Laurin and Girondot (1999), although published before Laurin et al. (2000), was accepted for publication on 24 June 1999, whereas Laurin et al. (2000) had been accepted for publication about five months earlier, on 11 January 1999. I do not wish to provide a detailed list of such unnecessary criticism because this would be of little interest to the reader, but I will concentrate on a few important points.
Figure 1. Sarcopterygian phylogeny showing an optimization of embryo retention (character 1), as previously advocated by Laurin and Girondot (1999). The only modification is that all terminal taxa are in the present tree, instead of collapsing Monotremata and Theria into Mammalia, to better match the character distribution shown in Table I, and that Actinistia is coded as unknown (as shown by the absence of a data box below that taxon).
Figure 2 (partial caption). The reference tree requires nine steps (*) for this character. Note that for both characters, many random trees require no more steps than the reference tree, thus implying that there is no phylogenetic signal in these characters.
Wilkinson et al. (2002) stated that ''Confusingly, Laurin and Girondot (1999) reach a very different conclusion from Laurin et al. (2000) regarding the coding of caecilians with respect to EER [extended embryo retention], but neither contribution refers to the other or provides any explanation of the obvious discrepancy''. This statement includes two errors. First, Laurin and Girondot (1999) cited Laurin et al. (2000), which was then in press (reference 21 in Laurin and Girondot 1999). Second, an explanation for the difference in coding was provided: ''Laurin, Reisz and Girondot [21] concluded that not enough evidence was available to infer safely the ancestral state for the embryonic stage at oviposition of gymnophiones (beyond the inference that they were oviparous). Therefore, the possibility that stem-amniotes performed extended embryo retention could not be excluded [21]. However, an extensive literature search on this topic has revealed additional data that requires modifying this viewpoint''. Laurin and Girondot (1999) coded caecilians as lacking extended embryo retention, just as Wilkinson et al. (2002) subsequently argued. Wilkinson et al. (2002) stated that ''Under other plausible relationships, or under scorings of caecilians that are the most plausible (absence of EER in the ancestral caecilian) or simply equivocal, the condition of the ancestral amniote is unambiguous: it lacks EER. Thus, despite the objections of Laurin et al. (2000), the alternative hypothesis of Laurin and Reisz (1997) is less parsimonious than the terrestrial egg hypothesis as an explanation for the origin of the amniotic egg''. Perhaps most importantly, Wilkinson et al. (2002) failed to mention that Laurin and Girondot (1999) also suggested that the ancestral amniote did not display extended embryo retention.
Figure 3. Sarcopterygian phylogeny showing an optimization of the developmental stage at oviposition (character 2, with ordered states). This optimization suggests that the ancestral amniote laid its eggs at the gastrula developmental stage (equivalent to absence of extended embryo retention). If the character is left unordered, the ancestral condition for amniotes is to lay eggs in the post-neurula embryonic stage (equivalent to presence of extended embryo retention).
Wilkinson et al. (2002) objected to the character coding of extended embryo retention documented in Laurin and Girondot (1999). Developmental stage at oviposition is an intrinsically continuous character, but it is very difficult to quantify, even when detailed information is available, because the main events of development of the main sarcopterygian taxa do not necessarily happen in the same sequence, and some important features of development differ considerably. Because of these problems, Laurin and Girondot (1999) considered fairly broadly defined, easily characterized developmental stages at oviposition, such as blastula, gastrula, neurula, etc. A histogram of distribution of the developmental stage at oviposition was built for the terminal taxa considered (Latimeria, Dipnoi, Anura, Gymnophiona, Urodela, Monotremata, Theria, Chelonia, Squamata, Sphenodon, Crocodylia, and Aves). The histogram showed a bimodal distribution, with a gap at the neurula stage. Thus, a taxon whose eggs are laid at a developmental stage that precedes the neurula was coded as lacking extended embryo retention, and a taxon whose eggs are laid at a post-neurula stage was coded as displaying extended embryo retention. This dichotomous coding resulted in seven taxa lacking extended embryo retention (including Gymnophiona), and five possessing it (Laurin and Girondot 1999). Wilkinson et al. (2002) suggested that the gastrula stage is not an appropriate cut-off point between presence and absence of extended embryo retention, but they admitted that they were unable to provide a better alternative: ''Thus, instead we might seek to identify developmental events that co-occur with the development of the extra-embryonic membranes in amniotes and use these to define 'significant' EER in anamniotes. Unfortunately, we have not been able to identify any such events''. Lacking a viable alternative method, I have retained the coding suggested by Laurin and Girondot (1999) for the new analyses presented below. I also note that choosing a developmental stage more advanced than the gastrula would necessarily result in fewer terminal taxa showing extended embryo retention, and as Laurin and Girondot (1999) had already found the absence of embryo retention to be the most parsimonious condition in the ancestral amniote, this would only add bias to strengthen the conclusions of Laurin and Girondot (1999).
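The dichotomous coding described above can be illustrated with a minimal sketch. It assumes the ordered states of character 2 as defined with Table I and places the cut-off before the post-neurula state; the names `STAGES`, `FIRST_EER_STATE`, and `eer_from_stage` are hypothetical and introduced here only for illustration. The coding actually used rests on the literature survey of Laurin and Girondot (1999), not on a mechanical rule of this kind.

```python
# Ordered states of character 2 (developmental stage at oviposition), as defined with Table I.
STAGES = {
    0: "unfertilized egg",
    1: "blastula",
    2: "gastrula",
    3: "post-neurula embryonic stage",
    4: "viviparity",
}

# Assumption for this sketch: the first state counted as extended embryo retention (EER).
FIRST_EER_STATE = 3


def eer_from_stage(possible_stages):
    """Map a set of possible character-2 states to character 1.

    Returns 0 (EER absent) or 1 (EER present), or None when the possible
    states straddle the cut-off, in which case the binary coding is unknown.
    """
    codes = {int(stage >= FIRST_EER_STATE) for stage in possible_stages}
    return codes.pop() if len(codes) == 1 else None
```

Under these assumptions, a taxon laying eggs at the gastrula stage (`{2}`) is coded 0, a viviparous taxon (`{4}`) is coded 1, and an uncertain coding such as `{0, 1, 2, 3}` yields no binary code.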

Phylogenetic signal and the reliability of character optimization
Any character of known distribution can be optimized on any phylogeny to yield an optimization, but such an exercise is not always biologically meaningful. Data devoid of phylogenetic information (such as saturated data in molecular biology, but the equivalent exists for all types of characters) should not be used to produce phylogenies (Archie 1989; Faith and Cranston 1991; Huelsenbeck 1991a; Ho and Jermiin 2004) because the resulting trees would probably not be reliable. Similarly, character data devoid of a phylogenetic signal should probably not be used to infer conditions of hypothetical ancestors because these inferences would probably not be reliable either (Cubo et al. 2002; Laurin 2004). Thus, I present below the first analysis of a phylogenetic signal in extended embryo retention data in sarcopterygians, to determine whether or not this approach, most recently attempted by Wilkinson et al. (2002), can be expected to yield reliable data about the ancestral condition of amniotes.

Method of detection of a phylogenetic signal
Using the equiprobable tree generation algorithm of MacClade 3.06 (Maddison and Maddison 2001), 10,000 random trees were created. This algorithm samples evenly all possible dichotomous rooted trees. By comparing the number of steps required by the optimization of a character of interest (here, the presence or absence of extended embryo retention, or the ontogenetic stage at which the egg is laid) on the reference phylogeny to that of the random trees, a phylogenetic signal can be detected. More specifically, if the number of steps of the character over the reference phylogeny is less than the number of steps on at least 95% of the random trees, we can reject, at a 5% threshold, the null hypothesis that the character states are randomly distributed with respect to the phylogeny. Such a result would indicate the presence of a phylogenetic signal.
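The logic of this randomization test can be sketched in a few lines of Python. This is not MacClade's implementation: it assumes that equiprobable rooted dichotomous trees can be drawn by grafting each taxon onto a uniformly chosen branch (root branch included), and that the number of steps is counted with Fitch parsimony for unordered states and Wagner (interval) parsimony for ordered states. All function names and the toy data are hypothetical; the toy data are not the coding of Table I.

```python
import random


def count_nodes(tree):
    """Number of nodes (tips and internal) in a nested-tuple tree; tips are strings."""
    return 1 if isinstance(tree, str) else 1 + count_nodes(tree[0]) + count_nodes(tree[1])


def attach(tree, taxon, k):
    """Graft `taxon` onto the branch above the k-th node (preorder numbering).
    Returns (new_tree, remaining_k); remaining_k is -1 once the graft is done."""
    if k == 0:
        return (tree, taxon), -1
    if isinstance(tree, str):
        return tree, k - 1
    left, k = attach(tree[0], taxon, k - 1)
    if k < 0:
        return (left, tree[1]), -1
    right, k = attach(tree[1], taxon, k)
    return (left, right), k


def random_rooted_tree(taxa, rng):
    """Draw one rooted dichotomous tree; every labelled topology is equally likely."""
    tree = (taxa[0], taxa[1])
    for taxon in taxa[2:]:
        tree, _ = attach(tree, taxon, rng.randrange(count_nodes(tree)))
    return tree


def steps(tree, tips, ordered=False):
    """Parsimony length of one character; `tips` maps each taxon to its set of possible states.
    Ordered tips are reduced to a (min, max) range, so uncertainty is assumed contiguous."""
    def visit(node):
        if isinstance(node, str):
            s = tips[node]
            return ((min(s), max(s)) if ordered else frozenset(s)), 0
        (a, cost_a), (b, cost_b) = visit(node[0]), visit(node[1])
        if ordered:  # Wagner parsimony on state intervals
            lo, hi = max(a[0], b[0]), min(a[1], b[1])
            if lo <= hi:
                return (lo, hi), cost_a + cost_b
            return (hi, lo), cost_a + cost_b + (lo - hi)
        common = a & b  # Fitch parsimony on state sets
        return (common, cost_a + cost_b) if common else (a | b, cost_a + cost_b + 1)
    return visit(tree)[1]


def signal_test(reference_tree, tips, n_random=10_000, ordered=False, seed=0):
    """Return (steps on the reference tree, proportion of random trees needing no more steps)."""
    rng = random.Random(seed)
    taxa = sorted(tips)
    ref = steps(reference_tree, tips, ordered)
    hits = sum(steps(random_rooted_tree(taxa, rng), tips, ordered) <= ref
               for _ in range(n_random))
    return ref, hits / n_random


# Hypothetical toy data (NOT the coding of Table I).
REFERENCE = ((("A", "B"), ("C", "D")), (("E", "F"), ("G", "H")))
CLUSTERED = {t: {1} if t in "ABCD" else {0} for t in "ABCDEFGH"}  # one state per clade
SCATTERED = {t: {i % 2} for i, t in enumerate("ABCDEFGH")}        # states alternate among tips
```

In this toy example, the clustered character requires a single step on the reference tree and few random trees match it, whereas the scattered character already requires the maximum possible number of steps on the reference tree, so every random tree does at least as well and the proportion equals 1.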
Two characters are evaluated (Table I). The first is the absence or presence of extended embryo retention, as coded by Laurin and Girondot (1999, Figures 2, 3) in their character optimization. The second one is the developmental stage at oviposition, as it was compiled by Laurin and Girondot (1999, Table 1). This character includes more detailed information and its states are considered ordered, as it exhibits a continuous variation from oviposition of unfertilized eggs to viviparity. This second character is included to maximize the power of the test, because failure to detect a phylogenetic signal could result from an inappropriate delimitation of character states.
Table I. Developmental character data of sarcopterygians analysed in this study (Figures 1-3).
[Table body not preserved; surviving column headings: Taxon, Character 1.]
Note: The coding and state definitions follow exactly Laurin and Girondot (1999), except for Actinistia. State definitions: character 1 (extended embryo retention): absent (0) or present (1); character 2 (developmental stage at oviposition): unfertilized egg (0), blastula (1), gastrula (2), post-neurula embryonic stage (3), viviparity (4).
For both characters, I have attempted to code the primitive condition for each taxon. Thus, the coding does not represent the full range of variation found in each terminal taxon. This method was used because when trying to optimize the evolution of a character or to determine the ancestral condition of a hypothetical ancestor of at least two of the included terminal taxa, variations that appeared within these terminal taxa are not relevant. Thus, Urodela is coded as lacking extended embryo retention and laying unfertilized eggs because this condition is observed in cryptobranchoids and probably in sirenids (Duellman and Trueb 1986, p 462, 496), and under several recent phylogenies, cryptobranchids, hynobiids, and sirenids constitute the two or three most basal lineages of urodeles (Duellman and Trueb 1986; Larson and Dimick 1993; Wiens et al. 2005). Thus, even though internal fertilization may be more common than external fertilization in urodeles, the latter is probably the primitive condition for that taxon (Duellman and Trueb 1986, p 22). Furthermore, most urodele species that have internal fertilization do not retain the eggs in the oviducts for extended periods of time; therefore, extended embryo retention in urodeles is only present in a few salamandrid species (Duellman and Trueb 1986, p 22; Pough et al. 2004, p 311), and coding of character 1 for Urodela is unproblematic. Similarly, most anurans have external fertilization; of those that display internal fertilization, the most basal is Ascaphus truei Stejneger, 1899. That species was at some time thought to be the sister-group of all other anuran species (Cannatella and Hillis 1993; Hedges and Maxson 1993), but most recent phylogenies supported by morphological (Gao and Wang 2001) or molecular (Roelants and Bossuyt 2005) data place Ascaphus as the sister-group of Leiopelma, and this clearly implies that the ancestral condition for anurans is external fertilization. Viviparous anurans (e.g. Eleutherodactylus coqui) are too deeply nested to be relevant here (Packard et al. 1996). However, changing the coding of Anura or Urodela or both taxa for character 2 does not alter substantially the results of the analyses of phylogenetic signal or character optimization reported below. A single modification must be made, compared to the coding that I last advocated (Laurin and Girondot 1999); namely, there is evidence that the Carboniferous actinistian Rhabdoderma was oviparous (Schultze 1985).
Since an ancient species has a greater probability of retaining the ancestral condition (Huelsenbeck 1991b) than a related extant species (since the ancient species has had less time to evolve away from the ancestral condition), I accept this as evidence that actinistians were more likely to have been primitively oviparous, contrary to the condition in the extant taxon Latimeria, which is viviparous. However, there is no way to determine at which stage the eggs of Rhabdoderma were laid; the smallest preserved embryos appear to be in a post-neurula embryonic stage (Schultze 1985), but the possibility that the eggs were laid much earlier cannot be excluded. Therefore, I have changed the coding of character 2 to states 0/1/2/3, to more responsibly reflect this uncertainty, and I have replaced Latimeria by Actinistia in the figures and table. The correct coding for character 1 is less obvious, but I have decided to code it as unknown for actinistians, as there is evidence that the reproductive mode of Latimeria is not primitive for Actinistia (Schultze 1985).

Results
Character 1 (extended embryo retention) requires three steps over the reference phylogeny, and parsimony suggests that the ancestral amniote lacked extended embryo retention (Figure 1). Of the 10,000 random, equiprobable trees, 4406 require no more than three steps (P = 0.4406) for this character (Figure 2a).
Character 2 (embryonic stage at oviposition), when ordered, requires nine steps over the reference phylogeny (Figure 3). The optimization suggests that the ancestral amniote laid its eggs at the gastrula developmental stage, which is equivalent to the absence of extended embryo retention. Of the 10,000 random, equiprobable trees, 1357 require no more than nine steps (P = 0.1357) for this character (Figure 2b). If this character is treated as unordered (a biologically less justifiable approach based on known developmental sequences, but tested here for the sake of completeness), its optimization requires six steps over the reference phylogeny (not shown), and suggests that the ancestral amniote laid eggs in a post-neurula embryonic stage (equivalent to presence of extended embryo retention). Of the 10,000 random trees, 3534 require no more than six steps (P = 0.3534).
Thus, the null hypothesis that these two characters are randomly distributed with respect to the tree cannot be rejected. This suggests that character optimization cannot be expected to yield reliable information about the ancestral condition of the various nodes of the tree for extended embryo retention or developmental stage at oviposition.
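For reference, the reported probabilities follow directly from the counts given above. The notation is introduced here for illustration only: N is the number of random trees, L_ref the number of steps on the reference phylogeny, and L_i the number of steps on the i-th random tree.

```latex
\[
P = \frac{\left|\{\, i : L_i \le L_{\mathrm{ref}} \,\}\right|}{N}, \qquad N = 10\,000
\]
\[
P_{1} = \frac{4406}{10\,000} = 0.4406, \qquad
P_{2,\mathrm{ordered}} = \frac{1357}{10\,000} = 0.1357, \qquad
P_{2,\mathrm{unordered}} = \frac{3534}{10\,000} = 0.3534
\]
```

All three values exceed the 0.05 threshold, so the null hypothesis is not rejected for either character, whether character 2 is treated as ordered or unordered.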

Discussion
Both character 1 (a binary character) and character 2 (multi-state), when the states of the latter are ordered, suggest that the ancestral amniote did not have extended embryo retention. This congruence was to be expected because both characters evaluate the same aspect of sarcopterygian development. The fact that character 2, when its states are left unordered, suggests the presence of extended embryo retention (oviposition in a post-neurula stage) is inconsistent with the other results, but it probably reflects the fact that such a continuous character should not be treated as unordered because this ignores critical information about the similarity between the various states.
More importantly, the absence of a phylogenetic signal in the studied developmental characters suggests that using such data on extant taxa alone will not provide a reliable inference of the ancestral amniote condition. This absence is not surprising for character 1 given that the derived state of this binary character appears mostly as autapomorphies of terminal taxa (with the exception of being a synapomorphy of Mammalia). However, this limitation does not apply to character 2, which displays several changes in internal and terminal branches; the absence of a phylogenetic signal in this case is far more convincing evidence that the optimization is unreliable. Using a dense sampling of lower-level terminal taxa (e.g. individual genera or species) could perhaps yield a phylogenetic signal, but this would only mean that there is a signal within some of the large terminal taxa used in the present analysis, and such a result would not imply that the ancestral condition for amniotes can be inferred. Indeed, the phylogenetic signal may be found in subsets of the sampled taxa, as was recently shown for histological and microanatomical characters of sauropsids (Cubo et al. 2005). Thus, contrary to Wilkinson et al. (2002, p 2196), Lombardi's (1994) hypothesis can be neither rejected nor supported using optimization of the embryo retention data discussed above (Table I), and these optimizations should not be viewed as an accurate depiction of the evolution of these characters (Figures 1, 3). This does not mean that the evolution of embryo retention should no longer be studied in extant taxa because it might be possible to trace the evolution of such characters in taxa, such as Squamata (Lee and Shine 1998) or the lepidosaur Lerista (Fairbairn et al. 1998), that started differentiating more recently than Amniota.
Also, my comments do not invalidate other, physiological, ecological, or morphometric approaches to study the evolution of embryo retention in squamates (e.g. Shine and Guillette 1988; Qualls and Shine 1998; Shine 1999, 2002; Cei et al. 2003; Shine et al. 2003; Parker et al. 2004).
The ancestral condition for some developmental features of amniotes could be studied indirectly, by studying, in extinct species, osteological features that are correlated with developmental characters. Osteological features fossilize readily and can thus be studied in taxa that lived just before or just after the last common ancestor of amniotes, and that were separated from it by just a few million years of independent evolution. Such a method was recently used to test Carroll's (1970, 1991) suggestion that the eggs of an ancestral stem-amniote were laid on land, and that such anamniotic eggs were limited to a diameter of less than 1 cm because of gas exchange constraints reflecting the initial absence of the extra-embryonic membranes that were to appear later in the amniotic egg. Carroll (1970, 1991) argued that egg size is correlated with adult body size in squamates and plethodontids that lay eggs outside the water on moist ground, and that this implied that the ancestral stem-amniote measured less than 10 cm in snout-vent length. After the extra-embryonic membranes had appeared, this size constraint was released, and amniotes (or their precursors) increased in size, according to Carroll's (1970, 1991) hypothesis. Parts of this reasoning can be questioned. For instance, caecilian eggs measure up to about 4 cm in diameter (Breckenridge and Jayasinghe 1979), possibly because the embryonic external gills of caecilian larvae mediate gas exchange, just like some amniotic extra-embryonic membranes. This raises the possibility that the size of reptiliomorph eggs was not as tightly constrained as Carroll (1970, 1991) argued. Another potential problem is that growth in extant amphibians (Duellman and Trueb 1986) and squamates (Buffrénil et al. 1994) is indeterminate, and this is probably a primitive condition for stegocephalians, tetrapods, and amniotes, as shown by skeletochronological studies in most early members of these taxa (e.g. Ricqlès 1974; Steyer et al. 2004). In taxa with indeterminate growth, the presence of a correlation between egg size and adult body size is not necessarily expected, and indeed, Carroll (1970, 1991) only showed scatter plots (some of which look convincing), but no results of statistical tests to support his hypothesis.
I do not wish to further criticize Carroll's (1970, 1991) reasoning, but I recently tested the size of the last common hypothetical ancestor of amniotes by optimizing body size on a phylogeny that included several taxa that were probably older than, contemporaneous to, or only slightly (20-30 million years) more recent than the last common ancestor of amniotes. A phylogenetic signal was found, and the size of the ancestor was inferred with a reasonably small 95% confidence interval (Laurin 2004). Methods that can exploit quantitative information in extinct taxa, as in this example, may be more productive ways to study the origin of amniotes and of their unique characters than the optimization of characters that can be documented only in extant taxa, which have had at least 310 million years to evolve away from the ancestral amniotic condition.
The absence of a phylogenetic signal in the developmental characters evaluated here may not be atypical. A thorough survey of the phylogenetic signal in the many characters whose evolution has been studied using parsimony character optimization is beyond the scope of this study, but I would like to emphasize the need to assess the presence of a phylogenetic signal before performing any character optimization. This is clearly not a standard procedure; among the previous studies on embryo retention and the origin of the amniotic egg (Laurin and Reisz 1997; Wilkinson and Nussbaum 1998; Laurin and Girondot 1999; Laurin et al. 2000; Wilkinson et al. 2002), none performed this test. Most other studies of which I am aware that studied character evolution did not include a test of the presence of a phylogenetic signal; this includes studies on various characters linked with the evolution of the egg (Skulan 2000) and palaeohistological studies (Padian et al. 2001, 2004; Ray et al. 2004). This is not meant to criticize these studies because I believe that the vast majority of papers dealing with character evolution do not consider the presence of a phylogenetic signal, and the exceptions that I know of are fairly recent (e.g. Irwin 1996; Winberger and de Queiroz 1996; Blomberg and Garland 2002; Cubo et al. 2002, 2005; Freckleton et al. 2002; Blomberg et al. 2003; Laurin 2004). Nevertheless, I hope that the presence of a phylogenetic signal will be assessed more routinely in the future in comparative studies that attempt to trace character history.