Use of sublethal endpoints in sediment toxicity tests with the amphipod Hyalella azteca

Short‐term sediment toxicity tests that only measure effects on survival can be used to identify high levels of contamination but may not be able to identify marginally contaminated sediments. The objective of the present study was to develop a method for determining the potential sublethal effects of contaminants associated with sediment on the amphipod Hyalella azteca (e.g., reproduction). Exposures to sediment were started with 7‐ to 8‐d‐old amphipods. On day 28, amphipods were isolated from the sediment and placed in water‐only chambers where reproduction was measured on day 35 and 42. Typically, amphipods were first in amplexus at about day 21 to 28 with release of the first brood between day 28 to 42. Endpoints measured included survival (day 28, 35, and 42), growth (as length and weight on day 28 and 42), and reproduction (number of young/female produced from day 28 to 42). This method was used to evaluate a formulated sediment and field‐collected sediments with low to moderate concentrations of contaminants. Survival of amphipods in these sediments was typically >85% after the 28‐d sediment exposures and the 14‐d holding period in water to measure reproduction. Reproduction was more variable than growth; hence, more replicates might be needed to establish statistical differences among treatments. Previous studies have demonstrated that growth of H. azteca in sediment tests often provides unique information that can be used to discriminate toxic effects of exposure to contaminants. Either length or weight can be measured in sediment tests with H. azteca. However, additional statistical options are available if length is measured on individual amphipods, such as nested analysis of variance that can account for variance in length within replicates. Ongoing water‐only studies testing select contaminants will provide additional data on the relative sensitivity and variability of sublethal endpoints in toxicity tests with H. azteca.


INTRODUCTION
A variety of standard methods has been developed for assessing the toxicity of contaminants associated with sediments using amphipods, midges, polychaetes, oligochaetes, mayflies, or cladocerans [1][2][3][4][5][6][7]. Several endpoints are suggested in these methods to measure effects of contaminants in sediment including survival, growth, behavior, or reproduction; however, survival of test organisms in 10-d exposures is the endpoint most commonly reported. These short-term exposures, which only measure effects on survival, can be used to identify high levels of contamination but may not be able to identify marginally contaminated sediments. Sublethal endpoints in sediment tests may also prove to be better estimates of responses of benthic communities to contaminants in the field [1]. The objective of the present study was to develop a method for determining the potential sublethal effects of contaminants associated with sediment on the amphipod Hyalella azteca including effects on reproduction. Companion studies have been conducted that evaluated sublethal endpoints in sediment tests based on a life-cycle test with the midge Chironomus tentans [8][9][10].
Presented at the 16th Annual Meeting, Society of Environmental Toxicology and Chemistry, November 5-9, 1995, Vancouver, British Columbia, Canada.
and growth [1,6]. At our laboratory, sediment toxicity tests with H. azteca have typically been conducted for 28 d, starting with about 14-d-old organisms at 20 to 23°C. Endpoints measured in these tests include survival, growth, and sexual maturation [20]. While survival and growth endpoints have provided unique information on the toxicity of sediments, most samples that resulted in a lower percentage of amphipods becoming sexually mature were the same samples that reduced growth [20].
The reproductive biology of H. azteca is compatible with the measurement of reproduction as an endpoint in sediment tests. Hyalella azteca develops through five to eight prereproductive instars and an indefinite number of postreproductive instars [21,22]. The first five instars correspond to the juvenile stage of development, instars 6 and 7 correspond to the adolescent stage (when sexes can be differentiated), instar 8 corresponds to the nuptial stage, and all later instars represent the adult stage [21][22][23]. Reproduction can reportedly occur from 10 to 28°C with highest reproduction at 26 to 28°C [24]. Positive intrinsic rates of natural increase were reported above 10°C with maxima between 20 and 25°C [22].
Reproduction in H. azteca starts with amplexus, in which a male grasps the female with its gnathopods while on the back of the female. After 1 to 7 d in amplexus, the pair separates for a short time while the female sheds her exoskeleton, then reunites briefly for copulation. After copulation, the pair again separates and the female releases eggs from her oviducts into the marsupium where the eggs are fertilized. The developing embryos and newly hatched young are kept in the marsupium until the next molt. The next amplexus occurs during incubation of the previous brood in the marsupium [1,21]. At 24 to 28°C, hatching ranges from 5 to 10 d after fertilization [22,25]. The time between molts for females ranges from 18 to 20 d at 20 to 22°C [25], 9 to 10 d at 25°C [26], and 7 to 8 d at 26 to 28°C [25]. Hyalella azteca averages 15 broods in 152 d with brood sizes averaging 18 eggs/brood [23]. The size of the first brood ranges from 4 to 10 young/female [22,26] with larger organisms typically producing larger broods [22].
Several designs were considered for measuring reproduction in sediment exposures based on the reproductive biology of H. azteca. The first design considered was a continuation of the 28-d sediment exposures described in Ingersoll et al. [20] for an additional 2 weeks to determine the number of young produced in the first brood. The limitation of this design is the difficulty in quantitatively isolating young amphipods from sediment [27]. A second design considered was extension of the 28-d sediment exposure for an additional month or longer until several broods are released. These multiple broods would then be isolated from the sediment. The limitation of this second design is that specific effects on reproduction could not be differentiated from reduced survival of offspring and it would still be difficult to isolate the young amphipods from sediment. A third design considered, and the one evaluated in the present study, was to expose amphipods in sediment until a few days before the release of the first brood. The amphipods could then be sieved from the sediment and held in water to determine the number of young produced. This test design allows a quantitative measure of reproduction. However, one limitation to this design is that amphipods might recover from effects of sediment exposure during this holding period in clean water.
This paper describes results of three studies: (1) an evaluation of the time for newborn H. azteca to produce a first brood in water-only exposures using three different diets, (2) exposure of H. azteca for 28 d to formulated or field-collected sediments (low to moderate levels of contamination) followed by a 14-d holding period in water to measure reproduction, and (3) exposure of H. azteca in three types of sediments to better define a feeding ration to optimize water quality, survival, growth, and reproduction. Results of these studies were used both to develop a procedure for quantitatively evaluating reproduction in sediment tests with H. azteca and to evaluate relationships between growth (i.e., length or weight) and reproduction of H. azteca.

Culture of amphipods
Mixed-age amphipods were cultured in 80-L glass aquaria containing 50 L of water that received about 6 volume additions/d of well water (hardness 283 mg/L as CaCO₃). Cultures were maintained at 23°C (±1°C) at light intensity of about 500 lux and were fed Tetramin fish food (Tetra-Werke, Melle, Germany) and presoaked maple leaves ad libitum. Each aquarium contained six nylon substrates (20-cm-diameter sections of nylon "Coiled-web material"; 3-M, St. Paul, MN, USA) [27]. Known-age amphipods were obtained by sieving organisms from the mixed-age culture through a no. 25 (710-µm mesh) U.S. standard size sieve placed under water. Mature amphipods retained on the sieve were pipeted into a no. 40 (425-µm mesh) sieve, placed in a shallow glass pan containing water, and left overnight to release newborn amphipods [27]. After 24 h, the <24-h-old amphipods (neonates) were rinsed through the no. 40 sieve into the surrounding water. The neonates were held in a 2-L beaker for 7 to 10 d before the start of a test. On the first day of isolation, the neonates were fed 10 ml of YCT (yeast-cerophyl-trout chow, 1,800 mg/L stock solution [4]; cerophyl as dried cereal leaves obtained from Sigma Chemical, St. Louis, MO, USA and trout chow obtained from Zeigler Brothers, Gardners, PA, USA) and 10 ml of Selenastrum capricornutum (about 3 × 10⁷ cells/ml). On the third, fifth, seventh, and ninth day after isolation, the amphipods were fed 5 ml of both YCT and S. capricornutum. Amphipods were initially fed a higher volume to establish a layer of food on the bottom of the culture chamber. If dissolved oxygen dropped below 4 mg/L, about 50% of the water was replaced. Preparation of YCT and algae followed procedures outlined in ASTM [1] and EPA [4].

Feeding study in well water
A 49-d water-only feeding study was conducted in well water to compare survival and reproduction of H. azteca fed three different diets: (1) YCT and algae, (2) Purina Rabbit Chow (Ralston Purina, St. Louis, MO, USA), or (3) Tetramin. The objective of this feeding study was to determine the influence of these three diets that are commonly used in sediment tests with H. azteca on time to amplexus, release of the first brood, and size of the first brood.
Three diet treatments were evaluated: (1) 1.5 ml YCT (1,800 mg/L stock solution) and 3 × 10⁵ cells of S. capricornutum added daily to each beaker; (2) 6 mg Rabbit Chow added three times/week to each beaker; and (3) 2.5 mg Tetramin added three times/week to each beaker. These feeding levels were chosen to be representative of levels recommended by the U.S. Environmental Protection Agency (U.S. EPA) [4] for YCT, by Kemble et al. [28] for Rabbit Chow, and by Day et al. [18] for Tetramin. Eight replicate beakers were tested for each diet treatment. Ten 10- to 11-d-old amphipods were placed in 300-ml beakers containing 150 ml of well water and a 5-cm × 5-cm piece of Nitex screen (Nylon bolting cloth; 44% open area and 280-µm aperture, Wildlife Supply Company, Saginaw, MI, USA). We have observed improved survival of amphipods when a substrate is provided in water-only exposures. These beakers have a 1.6-cm hole cut above the 150-ml level that is covered with stainless-steel cloth (36% open area and 40 × 40 mesh/inch, McMaster-Carr, Chicago, IL, USA). Two volume additions/day of well water were added to each beaker using an automated water-delivery system [29]. Exposures were conducted at 23°C (±1°C) on a 16 light:8 dark photoperiod at a light intensity of about 500 to 1,000 lux.
The following information was obtained for amphipods in each beaker: (1) weekly survival, (2) time to first occurrence of amplexus, (3) length (four replicates destructively sampled on day 28), and (4) daily production of young. At the start of the exposure, about 20 amphipods were also archived in 8% sugar formalin for later measurement of amphipod length [1]. Length of amphipods was measured from the base of the first antenna to the tip of the third uropod along the curve of the dorsal surface using a microscope and digitizing system [28]. Excess food and debris were removed from the beaker when weekly survival estimates were made. Reproduction was determined by removing amphipods in amplexus from each beaker and placing these paired organisms into individual 300-ml beakers containing 150 ml of water and a 5-cm × 5-cm piece of Nitex screen. Each pair of isolated amphipods was fed at 20% of the original feeding level with water replacement as described above. Daily observations were made to determine duration of amplexus, time to release of the first brood, and number of young released.
Fig. 1. Results of the water-only feeding study with the YCT diet. ■, The day a pair of amphipods was observed in amplexus and isolated into individual chambers; ◆, days the paired amphipods were not in amplexus; *, death of one of the paired amphipods; and a number indicates the number of young produced by a pair of amphipods on a particular day.
Results of this water-only feeding study are presented in Table 1 and Figure 1. A slight buildup of food was observed in the bottom of the beakers in each of the diets indicating the amphipods were fed in excess. Survival of amphipods was >90% through day 28 and >78% through day 35 across all three diets. Amphipods were first observed in amplexus on day 23 when fed the YCT + algae diet, on day 25 when fed the Rabbit Chow diet, and on day 31 when fed the Tetramin diet. Length of amphipods on day 28 was greater with the YCT + algae diet (4.2 mm) compared to either the Rabbit Chow (2.8 mm) or Tetramin (2.9 mm) diets.
By day 35, 14 pairs of amphipods had been isolated from the YCT + algae diet compared to only 8 in the Rabbit Chow diet and 2 in the Tetramin diet (Fig. 1). Ten of the 15 pairs of amphipods fed the YCT + algae diet had their first brood by day 48 and a second brood was produced by one of these pairs (Table 1). In contrast, only two pairs of amphipods fed either the Rabbit Chow or Tetramin diets produced broods by day 48. The number of young/female was also higher at day 38 and 48 with amphipods fed the YCT + algae diet compared to the other two diets. Isolated pairs of amphipods fed the YCT + algae diet were maintained until day 56 when most of these pairs had produced a second brood. Timing and duration of amplexus and size of first brood in this study are similar to results of previous studies conducted at comparable temperatures [21,22,23,25,26]. A reduction in growth and reproduction and a delay in the onset of amplexus of H. azteca were also reported in a study that varied feeding level of Rabbit Chow [30].

Sediment study conducted with field-collected samples
The objective of this study was to evaluate effects of field-collected sediments on survival, growth, and reproduction of H. azteca in 42-d exposures (28-d exposure to sediment followed by a 14-d holding period in water to measure reproduction). The sources of sediment samples evaluated in this study were (1)  . The control sediment was obtained from West Bearskin Lake, Minnesota (WB) [15]. Sediment samples from the field were collected from about the upper 6 cm sediment surface using a petite Ponar grab (225 cm²), with a hand-held scoop, or with a coring device (1996 Rio Grande samples). Sediments were stored at 4°C in high-density polyethylene containers until the start of the tests. All of the field-collected sediment samples were tested concurrently except for the samples from the upper Mississippi and the second set of samples from the Rio Grande.
Based on the results of the first feeding study (Table 1), sediment tests with H. azteca were conducted for 42 d with a diet of YCT + algae. Exposures to sediment were started with 7- to 8-d-old amphipods. On day 28, amphipods were isolated from the sediment and placed in water-only chambers where reproduction was measured on day 35 and 42 (Appendix). Using this design, amphipods can be expected to be in amplexus first at about day 21 to 28 with release of the first brood between day 28 to 42. Survival was measured on day 28, 35, and 42, growth (length) was measured on day 28 and 42, and reproduction (number of young/female produced) was measured on day 35 and 42. Starting the test with 7- to 8-d-old amphipods would probably not reduce the sensitivity of the test. Collyard et al. [32] reported that the sensitivity of H. azteca to a variety of contaminants was relatively similar up to at least 24- to 26-d-old organisms.
Ten amphipods were exposed in 300-ml beakers containing 100 ml of sediment and 175 ml of overlying water (Appendix [29]). At the start of a test, about 20 amphipods were archived in sugar formalin for later measurement of length. Exposures were conducted at 23°C (±1°C) on a 16 light:8 dark photoperiod at a light intensity of about 500 to 1,000 lux. Eight replicate beakers/sediment were tested (four for 28-d length and four for reproduction and 42-d length; except for the samples from the Upper Mississippi River where all eight replicates were used to measure reproduction and 42-d length). Each sediment sample was thoroughly mixed, visually inspected to judge homogeneity, and subsamples were added to the test beakers the day before start of the sediment test (day −1).
Well water used as the source of overlying water was added on day −1 in a manner that minimized suspension of sediment. For sediment samples from Aberdeen that were collected from brackish water, overlying water was amended with the addition of Instant Ocean salt (Aquarium Systems, Mentor, OH, USA) to achieve a salinity of two parts per thousand. Two volume additions of overlying water were added each day using an automated water-delivery system [29]. Well water was used as the source of overlying water because Kemble et al. [31] and McNulty [33] observed poor survival of H. azteca in tests conducted 14 to 28 d using a variety of reconstituted waters including the reconstituted water (reformulated moderately hard reconstituted water) described in Smith et al. [34] and U.S. EPA [4]. Borgmann [35] described a reconstituted water for culturing H. azteca; however, other laboratories have not found this reconstituted water to be an improvement over use of a natural water (T.J. Norberg-King, personal communication).
Feeding rates differed somewhat among sets of sediments tested at different times, reflecting refinement in feeding regimes. In most of the studies, amphipods in each beaker were fed a 1.5-ml mixture of the YCT stock solution (1,800 mg/L) and 3 × 10⁵ cells of S. capricornutum three times/week. In exposures with upper Mississippi River sediments, amphipods were fed this same quantity of YCT and algae daily. In exposures with two Rio Grande sediments (RG-04 and RG-05), amphipods were fed 1 ml of YCT daily without the addition of algae. Results of a subsequent feeding study in sediment (described in the next section) led to modification of this feeding regime (to 1 ml of YCT daily without the addition of algae; Appendix).
On day 28 of the exposures, four of the replicate beakers/sediment were sieved through a no. 50 sieve (300-µm mesh), and surviving amphipods were preserved in sugar formalin for later length measurements. The remaining four replicate beakers/sediment to be used for determination of reproduction were also sieved on day 28. Surviving amphipods isolated from these beakers were placed in corresponding 300-ml water-only beakers containing 150 ml of water and a 5-cm × 5-cm piece of Nitex screen. In a subsequent study, improved reproduction of H. azteca was observed when the Nitex screen was replaced with a 3-cm × 3-cm piece of the nylon "Coiled-web material" described above for use in culturing amphipods (T.J. Norberg-King, personal communication). Each water-only beaker received two volume additions of water daily and YCT + algae as previously described. Production of young amphipods in these beakers was determined on day 35 and 42 by removing and counting the adults and young in each beaker. On day 35, the adults were then returned to the same water-only beakers. Adult amphipods surviving on day 42 were preserved in sugar formalin for subsequent determination of sex and length. The number of adult males and females in each beaker was determined from the day 42 sample (mature male amphipods were distinguished by the presence of an enlarged second gnathopod). This information was used to calculate the number of young produced/female/beaker from day 28 to 42.
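The young/female/beaker calculation described above can be sketched as follows. This is a minimal illustration, assuming that young counted on day 35 and day 42 are summed (young are removed at each count) and divided by the number of females identified in the day 42 sample. The function name and all counts are hypothetical, not data from this study.

```python
def young_per_female(young_day35, young_day42, females_day42):
    """Young produced per female in one beaker from day 28 to 42.

    Assumption: young are removed when counted, so the two counts add.
    Returns None when no females were recovered on day 42.
    """
    if females_day42 <= 0:
        return None
    return (young_day35 + young_day42) / females_day42


# Hypothetical counts for four replicate beakers:
# (young on day 35, young on day 42, females on day 42)
beakers = [(3, 14, 5), (0, 9, 4), (6, 10, 6), (2, 7, 3)]

per_beaker = [young_per_female(*b) for b in beakers]
valid = [v for v in per_beaker if v is not None]
treatment_mean = sum(valid) / len(valid)

print("young/female by beaker:", [round(v, 2) for v in valid])
print("treatment mean:", round(treatment_mean, 2))
```

Treating the beaker, not the individual female, as the replicate keeps the calculation consistent with the one-way ANOVA on beaker means.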
A buildup of food and mold was observed on the sediment surface after day 14 of exposures conducted with samples from the upper Mississippi River (i.e., a diet of 1.5 ml YCT and algae daily). Thus, feeding was reduced in these exposures for 7 of the remaining 14 d of the sediment exposure. In the subsequent exposures with the other sediments listed above, the feeding rate was adjusted to 1.5 ml of YCT and algae three times/week. With this reduced feeding rate, a buildup of food was not observed on the surface of the sediment. However, survival and growth of amphipods fed three times/week in the formulated sediment were reduced relative to the amphipods fed daily (see the Results and Discussion section).

Feeding study in sediment
To better assess the importance of food quantity, a subsequent feeding study was conducted to evaluate the performance of amphipods in 42-d exposures using the following sediments: (1) West Bearskin, (2) formulated sediment, and (3) Florissant soil (FL) [13]. Each of these sediments has previously been used as a control sediment. For this study, four feeding levels of YCT were evaluated: 1.5, 1.0, 0.75, and 0.5 ml/d/beaker (1,800 mg/L stock solution of YCT) without the addition of algae (the addition of algae was not found to improve the performance of amphipods in 42-d exposures, unpublished data). All other test conditions were the same as described for the exposures conducted with field-collected sediments (Appendix).
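For readers comparing rations across studies, the volumes above can be converted to an approximate mass of YCT per beaker per day. The sketch below assumes that the 1,800 mg/L figure is the concentration of solids in the stock; the helper name is illustrative only.

```python
YCT_STOCK_MG_PER_L = 1800.0  # stock concentration stated in the text


def yct_solids_mg(volume_ml, stock_mg_per_l=YCT_STOCK_MG_PER_L):
    """Mass of YCT (mg) delivered by a given volume (ml) of stock."""
    return volume_ml / 1000.0 * stock_mg_per_l


# The four daily rations evaluated in the sediment feeding study
for ration_ml in (1.5, 1.0, 0.75, 0.5):
    print(f"{ration_ml} ml/d/beaker is about {yct_solids_mg(ration_ml):.2f} mg/d")
```

Under this assumption the highest ration (1.5 ml) supplies about 2.7 mg of YCT per beaker per day and the lowest (0.5 ml) about 0.9 mg.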
Endpoints measured in the feeding study included survival on day 28, 35, and 42, length on day 28 and 42, and reproduction from day 28 to 42. In addition, dry weight of amphipods in each replicate was determined after length was measured on day 28 and 42 using amphipods preserved in sugar formalin. Gaston et al. [36] and Duke et al. [37] have shown that weight or length of several aquatic invertebrates did not significantly change after 2 to 4 weeks of storage in 10% formalin. Dry weight of amphipods was determined by (1) transferring the archived amphipods from a replicate out of the sugar formalin solution into a crystallization dish, (2) rinsing amphipods with deionized water, (3) transferring rinsed amphipods to a preweighed aluminum pan, (4) drying samples for 24 h at 60°C, and (5) weighing the pan and dried amphipods on a balance to the nearest 0.00001 g. Average dry weight of individual amphipods in each replicate was calculated from these data. Due to the small size of amphipods, caution must be taken during weighing (10 dried amphipods after a 28-d sediment exposure weigh only about 2.5-3.5 mg). Weigh pans need to be handled carefully using powderless gloves and the balance should be calibrated with standard weights with each use. We initially planned to report dry weight for the study conducted with field-collected sediments; however, these data are not reported due to high variability in the measurements of weight at day 28 and 42. Subsequent use of smaller aluminum pans (7 × 22 × 7 mm, Sigma Chemical, St. Louis, MO, USA) reduced variability in measurements of dry weight. Others have also used weigh boats constructed from sheets of aluminum foil.
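The average dry weight per amphipod follows directly from the pan weights. A minimal sketch is given below; the function name and the example weights are hypothetical and are chosen only to fall in the 2.5 to 3.5 mg range quoted above for 10 amphipods.

```python
def mean_dry_weight_mg(pan_g, pan_plus_amphipods_g, n_amphipods):
    """Average dry weight (mg) of an individual amphipod in one replicate.

    pan_g: weight of the preweighed aluminum pan (g)
    pan_plus_amphipods_g: weight of pan plus dried amphipods (g)
    n_amphipods: number of amphipods dried in the pan
    """
    if n_amphipods <= 0:
        raise ValueError("at least one amphipod is required")
    net_g = pan_plus_amphipods_g - pan_g
    return net_g * 1000.0 / n_amphipods


# Hypothetical replicate: 10 amphipods with a net dry weight of 3.0 mg
weight = mean_dry_weight_mg(0.01250, 0.01550, 10)
print(f"mean dry weight: {weight:.3f} mg/amphipod")

# Balance resolution (0.00001 g = 0.01 mg) relative to the net weight
resolution_fraction = 0.01 / 3.0
print(f"balance resolution is about {resolution_fraction:.1%} of the net weight")
```

Because the net weight is obtained as the difference between two weighings of a much heavier pan, small handling errors on the pan carry over in full, which is consistent with the reduction in variability reported after switching to smaller pans.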

Water quality measurements
Pore-water samples were analyzed for total sulfide and ammonia, alkalinity, pH, hardness, conductivity, and dissolved oxygen using methods described in Kemble et al. [28]. About 170 ml of pore water was isolated from each sediment sample by centrifugation at 4°C for 15 min at 5,200 rpm (7,000 g) before the start of the tests (day −1 of the sediment exposure). A wide range in the water quality characteristics of the pore water was observed for pH (5.81-8.  Table 2). Concentrations of total ammonia (<12.9 mg/L), unionized ammonia (<0.043 mg/L), and hydrogen sulfide (<0.05 mg/L) were relatively low in these pore-water samples.
Environ. Toxicol. Chem. 17, 1998. C.G. Ingersoll et al.
Water quality characteristics of the overlying water of sediment measured on day −1 (the day before organisms were placed into beakers) and at the end of each sediment test included dissolved oxygen, temperature, conductivity, pH, alkalinity, total hardness, and total ammonia (Table 3). Dissolved oxygen was measured weekly in the overlying water. Overlying water quality characteristics were similar among all treatments and the in-flowing test water (Table 3). Dissolved oxygen measurements were at or above acceptable levels in all treatments throughout the study (>40% saturation) [1].

Physical characterization of sediment samples
Physical characterizations of sediment samples included organic carbon content, water content, and particle size (Table 4). See Kemble et al. [28] for a description of these methods. Sediment samples exhibited a wide range in total organic carbon (0.3-9.6%), water content (19-81%), and particle size (Table 4).

Chemical characterization of sediment samples
Chemical analyses included PAHs, organochlorine pesticides (OCs), PCBs, acid volatile sulfide (AVS), and simultaneously extracted metals (SEM; Table 5). Kemble et al. [31] (N.E. Kemble, unpublished results) describe methods used to perform chemical characterization of sediment samples or results of these analyses for samples from the upper Mississippi River, Aberdeen Proving Grounds, Rio Grande (RG-04 and RG-05), formulated sediment, and Florissant soil.
Metals (cadmium, copper, nickel, lead, mercury, and zinc) in sediments were extracted with dilute hydrochloric acid (3 N HCl; simultaneously extracted metal; SEM) at room temperature for 1 h simultaneously with AVS determination [39]. Minimum detection limits (in µg/g) were 0.001 for cadmium, 0.045 for copper, 0.051 for nickel, 0.003 for lead, 0.097 for zinc, and 0.004 for sulfur. Mercury was <0.09 µg/g in all samples. The AVS has been demonstrated to control pore-water concentrations and bioavailability of divalent metals in sediment toxicity and bioaccumulation tests [40]. Divalent metals in sediment with molar concentration of SEM less than AVS would not be predicted to be toxic to aquatic organisms. The SEM metal concentrations in the samples were typically low with an excess of AVS relative to SEM (Table 5) [31] (N.E. Kemble; unpublished results). Samples NB-04, NB-07, NB-10, and RG-04 had low concentrations (<0.023 µmol/g) of AVS; however, the magnitude of the difference between SEM and AVS was quite small and other sediment-binding phases in addition to AVS may also limit the bioavailability of metals in sediment (e.g., organic carbon [40]).
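The molar comparison of SEM with AVS described above can be illustrated with a short calculation. The sketch assumes metal concentrations reported in µg/g and AVS in µmol/g, uses standard atomic masses, and omits mercury because it was below detection in all samples. The function names and the sample values are hypothetical and are not taken from Table 5.

```python
# Standard atomic masses (g/mol) for the divalent metals summed as SEM
ATOMIC_MASS = {"Cd": 112.41, "Cu": 63.546, "Ni": 58.693, "Pb": 207.2, "Zn": 65.38}


def sem_umol_per_g(metals_ug_per_g):
    """Sum of simultaneously extracted metals in umol/g.

    metals_ug_per_g maps an element symbol to its concentration in ug/g;
    ug/g divided by g/mol gives umol/g.
    """
    return sum(conc / ATOMIC_MASS[metal] for metal, conc in metals_ug_per_g.items())


def sem_minus_avs(metals_ug_per_g, avs_umol_per_g):
    """Molar difference SEM - AVS; a negative value means AVS is in excess."""
    return sem_umol_per_g(metals_ug_per_g) - avs_umol_per_g


# Hypothetical sediment sample
sample = {"Cd": 0.5, "Cu": 12.0, "Ni": 8.0, "Pb": 20.0, "Zn": 60.0}
difference = sem_minus_avs(sample, avs_umol_per_g=2.0)

print(f"SEM = {sem_umol_per_g(sample):.3f} umol/g")
print(f"SEM - AVS = {difference:.3f} umol/g")
print("AVS in excess" if difference < 0 else "SEM exceeds AVS")
```

A negative difference indicates that, on a molar basis, the metals would not be predicted to be toxic; a positive difference does not by itself demonstrate toxicity, because other binding phases such as organic carbon may also limit bioavailability.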

Statistical analyses
Statistical analyses were conducted using one-way analysis of variance (ANOVA) at α = 0.05 for all endpoints except length, which was analyzed using a one-way nested ANOVA at α = 0.05 (amphipods nested within a beaker). Percent survival data were arcsin transformed, and length, weight, and reproduction data were log transformed before analysis. Mean separation was performed by Fisher's protected least-significant difference test at α = 0.05. Spearman rank correlation procedures were used to evaluate relationships between the responses of amphipods exposed to field-collected sediments (Table 6) and the physical characteristics of sediment (Table 4), the water quality characteristics of the pore water (Table 2) or overlying water (Table 3), or the metals concentrations in sediment (Table 5). Concentrations of organic contaminants in sediment samples were typically below detection and were not evaluated using rank correlations. Statistical significance for the rank correlations was established at p = 0.0005 to minimize experiment-wise error (Bonferroni method) [41]. All statistical analyses were performed with Statistical Analysis System (SAS) programs [42].
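The transformation and one-way ANOVA steps can be sketched in a few lines. This is an illustration only and not the SAS procedure used in the study: it assumes the arcsine transformation is the usual arcsine square-root of the survival proportion, computes only the F statistic and its degrees of freedom (the p value would come from an F distribution, which is not in the Python standard library), and uses hypothetical survival data.

```python
import math


def arcsine_sqrt(proportion):
    """Arcsine square-root transformation of a proportion (0 to 1)."""
    return math.asin(math.sqrt(proportion))


def one_way_anova_f(groups):
    """F statistic and degrees of freedom for a one-way ANOVA.

    groups is a list of lists, one list of replicate values per treatment.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical survival (proportion of 10 amphipods) in four beakers per treatment
survival = {
    "control": [1.0, 0.9, 0.9, 1.0],
    "sample_A": [0.9, 0.8, 1.0, 0.9],
    "sample_B": [0.7, 0.6, 0.8, 0.7],
}
transformed = [[arcsine_sqrt(p) for p in reps] for reps in survival.values()]
f_stat, df1, df2 = one_way_anova_f(transformed)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

Length, weight, and reproduction would be log transformed in the same way before the ANOVA; the nested ANOVA for length requires an additional within-beaker variance term that is not shown here.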

Sediment study conducted with field-collected samples
Mean survival of amphipods across sediment types was typically >85% and was not significantly different among treatments after the 28-d sediment exposures and the 14-d water-only reproduction period (Table 6). Exceptions to this trend were lower survival of amphipods in formulated sediment fed three times/week (68% survival) compared to the amphipods fed daily (88% survival) and lower survival of amphipods in the CC-REF sample on day 28 (80% survival). Significant differences in mean length of amphipods were evident among sediment treatments after both the 28-d sediment exposure (3.3-4.3 mm) and after the 14-d water-only reproduction period (3.0-5.0 mm). Increase in length of amphipods between day 28 and 42 was typically 5 to 25%; however, mean lengths in two treatments (NB-10 and CC-01) were lower at day 42 compared to day 28. This smaller length at day 42 may have been the result of measuring length on different replicates on day 28 compared to day 42. Lengths of amphipods in the control sediment were typically similar to, or less than, lengths of amphipods in the other treatments. An exception to this trend was significantly shorter lengths of amphipods in the CC-02 treatment at day 28 and the CC-01 and NB-07 treatments at day 42 compared to the control.
Mean reproduction ranged from 0.8 to 8.3 young/female and was typically not significantly different among treatments. Exceptions to this trend were significantly higher reproduction of amphipods in Rio Grande sediments (RG-02, RG-03) compared to the control sediment and significantly lower reproduction in the CC-02 sediment (0.8 young/female) compared to a reference sediment (3.5 young/female). In addition, reproduction of amphipods in formulated sediment fed a daily ration of YCT (8.3 young/female) was significantly higher than reproduction in the control or formulated sediments fed YCT three times/week (1.2 and 1.8 young/female).
While daily feeding 1.5 ml of YCT in formulated sediment increased reproduction relative to feeding three times/week, dissolved oxygen in overlying water was consistently lower in the treatment receiving food daily. No obvious pattern was observed between reproduction and the percentage of females in a treatment or in individual replicates (Table 6).

Feeding study in sediment
The second feeding study was designed to determine if reproduction and dissolved oxygen concentration would be improved in sediment tests by feeding a lower ration of YCT daily instead of the higher ration three times/week. Table 7 summarizes mean survival, growth, and reproduction of amphipods fed four rations of YCT daily in three sediments. During the 28-d sediment exposure, dissolved oxygen was >40% saturation across all treatments. However, dissolved oxygen in overlying water was consistently lower at the highest feeding ration of YCT (1.5 ml/day) compared to the lower feeding rations (Table 3).
Mean survival of amphipods across feeding rations and sediment types was typically >85% and was not significantly different among treatments at day 42 (Table 7). Exceptions to this trend were significantly lower survival of amphipods fed 1.5 ml of YCT in the Florissant soil or formulated sediment compared to the West Bearskin sediment.
Significant differences in mean length and weight of amphipods were evident among feeding or sediment treatments. Mean lengths and weights after the 28-d sediment exposure and the 14-d water-only reproduction period tended to increase with increasing feeding ration. Exceptions to this pattern were lower weight at day 28 and shorter length on day 28 and 42 of amphipods fed 1.5 ml of YCT in formulated sediment. Amphipods at the lower feeding rations in formulated sediment were generally larger than amphipods fed the same ration in the Florissant soil or West Bearskin treatments.
Mean reproduction was related to the quantity of food supplied. With the exception of West Bearskin, reproduction in Florissant soil or formulated sediment was highest in amphipods fed 1.0 ml of YCT. In addition, a feeding ration of 1.0 ml of YCT maximized growth relative to the lower feeding rations of 0.5 or 0.75 ml and maximized survival and dissolved oxygen in the overlying water relative to the higher feeding ration of 1.5 ml of YCT. Brood sizes of amphipods fed 1.0 ml of YCT daily were comparable to the size of the first broods (about 4-10 young/female) for H. azteca [22,26]. No obvious pattern was observed between reproduction and the percentage of females in a treatment or in individual replicates (Table 7).
a Means (standard error of the means in parentheses) within a column and within a group of samples followed by a common letter are not significantly different (p > 0.05). For means not followed by a letter, the ANOVA was not significant (p > 0.05).
b Starting body length of amphipods was 1.2 mm (0.03 SE, n = 20); n = 4 replicate beakers for all samples except for day 28 survival where n = 8. Amphipods in each beaker were fed YCT + algae three times/week (except for the FS-7X that received YCT + algae daily).
c Starting body length of amphipods was 1.6 mm (0.06 SE, n = 16); n = 8 replicate beakers for all samples. Amphipods were fed YCT + algae daily.
d Starting body length of amphipods was 1.2 mm (0.05 SE, n = 19); n = 4 replicate beakers for all samples except for day 28 survival where n = 8. Amphipods were fed YCT daily.
e NM = not measured.

Relationships between physical and chemical characteristics of sediments to biological endpoints
No significant Spearman rank correlations were observed among the biological endpoints listed in Table 6 and the physical (Table 4), pore-water (Table 2), overlying water (Table 3), or chemical characteristics (Table 5) of the sediments (p > 0.0005). Weak relationships were evident between mean reproduction of amphipods and percent clay (r = 0.48, p = 0.03), percent silt (r = 0.42, p = 0.06), and percent sand (r = −0.45, p = 0.04). Additional study is needed to better evaluate potential relationships between reproduction of H. azteca and the physical characteristics of the sediment. The weak relationship between particle size of sediment and reproduction in the present study may have arisen because samples with higher amounts of sand (i.e., Aberdeen and U.S. Naval Air Station) also had higher concentrations of organic contaminants compared to other samples. Hyalella azteca tolerated a wide range in sediment particle size and organic matter in 10- to 28-d tests measuring effects on survival or growth [15,17,20] (N.E. Kemble, unpublished results).
Until additional studies have been conducted that substantiate this lack of a correlation between physical characteristics of sediment and the endpoints measured in this test, it would be desirable to test control sediments that are representative of the physical characteristics of field-collected sediments. Formulated sediments could be used to bracket the ranges in physical characteristics expected in the field-collected sediments being evaluated [1,5] (N.E. Kemble, unpublished results). Addition of YCT should provide a minimum amount of food needed to support adequate survival, growth, and reproduction of H. azteca in sediments low in organic matter. Without addition of food, H. azteca can starve during exposures [33], making it impossible to differentiate effects of contaminants from other sediment characteristics.
Table footnote. Means across sediment types within a feeding level followed by a common letter (designated x, y, or z) are not significantly different (p > 0.05). For means not followed by a letter, the ANOVA was not significant (p > 0.05). One-way ANOVA was used to analyze these data because the primary objective of this study was to evaluate the influence of feeding level on the responses. Interaction among sediments across feeding levels was not a critical issue and might be expected due to different nutrient content in the sediments.
Figure caption fragment. ... [20] or Kemble et al. [31]. Odd numbers indicate toxic samples and even numbers indicate nontoxic samples from ...
In addition to the correlation procedure described above, sediment chemistry was evaluated using previously published effect range medians (ERMs) for 28-d toxicity tests with H. azteca [20]. An ERM was calculated as the median concentration for a chemical in toxic sediment samples above which an effect is usually or always observed. Toxicity endpoints measured in these 28-d tests included survival, growth, or sexual maturation of amphipods. Ingersoll et al. [20] reported ERMs primarily for metals and PAHs. Use of ERMs to classify samples as toxic or not toxic minimized type I (false positive) and type II (false negative) errors relative to other sediment quality guidelines reported in Ingersoll et al. [20].
In sediment assessments by Ingersoll et al. [20] and Kemble et al. [31], the frequency of samples classified as toxic was highest when the proportion of ERMs exceeded was >0.4 or when the mean ERM quotient was >1 (Fig. 2). In the present study, only three samples had a proportion of ERMs exceeded greater than about 0.4 or a mean ERM quotient >1 (CC-01, CC-02, and NB-07; Table 5 and Fig. 2). These three samples were also designated as toxic relative to the control sediment (significantly shorter lengths at day 28 or 42 relative to the control; Table 6).
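The two screening statistics used above can be sketched in a few lines. This is a minimal illustration, assuming the mean ERM quotient is the average of concentration/ERM ratios across the chemicals with a published ERM and that an ERM counts as exceeded when that ratio is greater than 1. The chemical names and all concentrations are hypothetical placeholders, not values from Table 5 or from Ingersoll et al. [20].

```python
def erm_summary(concentrations, erms):
    """Return (proportion of ERMs exceeded, mean ERM quotient) for one sample.

    Only chemicals present in both dictionaries are used. A quotient is the
    measured concentration divided by the ERM for that chemical.
    """
    shared = [chem for chem in concentrations if chem in erms]
    if not shared:
        raise ValueError("no chemicals with both a concentration and an ERM")
    quotients = [concentrations[chem] / erms[chem] for chem in shared]
    proportion_exceeded = sum(q > 1 for q in quotients) / len(quotients)
    mean_quotient = sum(quotients) / len(quotients)
    return proportion_exceeded, mean_quotient


def flagged_by_chemistry(proportion_exceeded, mean_quotient):
    """Apply the thresholds named in the text (>0.4 exceeded or mean quotient >1)."""
    return proportion_exceeded > 0.4 or mean_quotient > 1


# Hypothetical ERMs and one hypothetical sample (same units per chemical).
example_erms = {"chem_A": 100.0, "chem_B": 50.0, "chem_C": 10.0, "chem_D": 200.0}
example_sample = {"chem_A": 150.0, "chem_B": 25.0, "chem_C": 20.0, "chem_D": 100.0}

proportion, mean_q = erm_summary(example_sample, example_erms)
print(proportion, mean_q, flagged_by_chemistry(proportion, mean_q))
```

For this hypothetical sample, two of four ERMs are exceeded (proportion 0.5) and the mean quotient is 1.125, so the sample would be flagged on either criterion.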
Three additional samples from the Upper Mississippi River were designated as toxic (significantly shorter lengths at day 42 relative to the control; UM-04C, UM-11C, UM-14C; Table 6) but did not exceed any ERMs and had a mean ERM quotient of <0.2 (Fig. 2). These three samples had <7% reduction in 42-d length of amphipods relative to the control. In contrast, the remaining toxic samples with elevated concentrations of contaminants (CC-01, CC-02, NB-07) had a 10 to 27% reduction in length relative to the control (Table 6). The lack of correspondence between toxicity and chemistry in the Upper Mississippi River samples may have been due to unmeasured contaminants or other stressors in these samples. However, these responses may only be statistical differences rather than a true toxic effect (i.e., low variability in the responses). In a database described by Ingersoll et al. [20], a difference of about 10% in length of H. azteca was needed to consistently identify sediment samples as toxic relative to contamination.
In summary, of the 18 field-collected samples in Table 6, 83% of the samples were correctly classified as toxic (n = 3) or not toxic (n = 12), 17% were toxic samples classified as not toxic (false negatives, n = 3), and none of the nontoxic samples were classified as toxic (false positives; Fig. 2). The majority of the samples (67%) were low in contamination and not designated as toxic, and none of the remaining samples designated as toxic were severely contaminated or consistently toxic to all endpoints measured. This is consistent with the low to moderate concentrations of contaminants in these field-collected sediments.
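The percentages in this summary follow directly from the four counts given in the text. The short tally below reproduces that arithmetic; the variable names are illustrative only, and the counts are the ones stated above (3 correctly classified toxic, 12 correctly classified not toxic, 3 false negatives, 0 false positives).

```python
# Counts taken from the summary paragraph (18 field-collected samples).
counts = {
    "toxic_predicted_toxic": 3,        # correctly classified as toxic
    "nontoxic_predicted_nontoxic": 12, # correctly classified as not toxic
    "toxic_predicted_nontoxic": 3,     # false negatives
    "nontoxic_predicted_toxic": 0,     # false positives
}

total = sum(counts.values())


def percent(n):
    """Percentage of all samples, rounded to the nearest whole number."""
    return round(100 * n / total)


summary = {
    "correct": percent(counts["toxic_predicted_toxic"]
                       + counts["nontoxic_predicted_nontoxic"]),
    "false_negative": percent(counts["toxic_predicted_nontoxic"]),
    "false_positive": percent(counts["nontoxic_predicted_toxic"]),
    "nontoxic_and_low_contamination": percent(counts["nontoxic_predicted_nontoxic"]),
}
print(total, summary)
```

With these counts the tally gives 83% correct, 17% false negatives, 0% false positives, and 67% of samples both low in contamination and not toxic.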

Relationship between growth and reproduction endpoints
Natural or anthropogenic stressors that affect growth of invertebrates may also affect reproduction because a minimum body size is needed for reproduction [9,10,30,43-46]. In the present study, there was a significant correlation between reproduction from day 28 to 42 and length of amphipods on day 28 when data were plotted by the mean of each treatment (Fig. 3a; Spearman rank correlation of 0.59, p = 0.0001, n = 53). Based on 28-d lengths, smaller amphipods (<3.5 mm) tended to have lower reproduction and larger amphipods (>4.3 mm) tended to have higher reproduction; however, the range in reproduction was wide for amphipods 3.5 to 4.3 mm in length. Based on 42-d lengths, there was a weaker correlation between length and reproduction (i.e., reproduction and length measured in paired replicates; Fig. 3b, Spearman rank correlation of 0.49, p = 0.0001, n = 58). Similarly, plotting data by individual replicates (data not shown) did not improve the relationship between 42-d length and reproduction compared to the plots by the mean of each treatment (Fig. 3b). Weaker relationships were observed between reproduction and weight measured on day 28 (Fig. 4a, Spearman rank correlation of 0.44, p = 0.0037, n = 42) or weight measured on day 42 (Fig. 4b, Spearman rank correlation of 0.34, p = 0.0262, n = 42). Recently completed round-robin studies have generated additional data to further evaluate relationships between growth and reproduction of H. azteca in sediment tests using the procedures outlined in the Appendix (G.A. Burton, personal communication).
Fig. 3. Relationships between amphipod length and reproduction by (a) treatment means for 28-d length or (b) treatment means for 42-d length. Data are from the feeding study with sediments (Table 7), field-collected sediments (Table 6), Waukegan Harbor (N.E. Kemble, unpublished data), and round-robin testing (G.A. Burton, unpublished data).
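For readers unfamiliar with the statistic, a Spearman rank correlation is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below illustrates the calculation on treatment means; the five length and reproduction values are hypothetical and are not data from Figure 3, and the function names are ours.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranked data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical treatment means: 28-d length (mm) and young/female.
length_mm = [3.2, 3.6, 3.9, 4.1, 4.4]
young_per_female = [1.0, 4.0, 2.5, 5.0, 7.5]
print(spearman(length_mm, young_per_female))
```

In this hypothetical set only one pair of treatments is out of rank order, which gives a coefficient of 0.9. The sketch does not compute a p value; a statistical package would normally be used for that.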
A significant correlation was evident between length and weight of amphipods (Fig. 5, Spearman rank correlation of 0.80, p = 0.0001, n = 415), indicating that either length or weight could be monitored in sediment tests with H. azteca. However, additional statistical options are available if length is measured on individual amphipods, such as nested ANOVA that can account for variance in length within replicates [47]. Analyses are ongoing evaluating the ability of length versus weight to discriminate between contaminated and uncontaminated samples in the database described in Ingersoll et al. [20].
The relatively weak relationship between growth and reproduction probably reflects the fact that these comparisons were made within a fairly narrow range in length (3.5-4.5 mm; Fig. 3) or weight (0.25-0.50 mg; Fig. 4). Other investigators have reported a similar degree of variability in reproduction of H. azteca within a narrow range of length or weight, with stronger correlations observed over wider ranges [26,30,48,49]. The degree of correlation between growth and reproduction may also be dependent on the strain of H. azteca evaluated [26,50].
The proportion of males to females within a treatment or by replicate was not correlated to young production (Tables 6 and 7) but may have contributed to variation in reproduction. Wen [49] reported that when two or three males were placed in a beaker with one female H. azteca, the frequency of successful amplexus was reduced, possibly from aggression among males. Future study is needed to determine if increasing the number of amphipods/beaker would result in a more consistent proportion of males to females within a beaker and would reduce variability in reproduction.
Reproduction was often more variable than growth (Tables 6 and 7). The coefficient of variation (CV) was typically <10% for growth and >20% for reproduction. This difference in variation affects the statistical power of comparisons and the number of replicates required for a test. For example, detection of a 20% difference between treatment means at a statistical power of 0.8 would require about 4 replicates at a CV of 10% and 14 replicates at a CV of 20% [4]. In the present study, to detect a 20% difference among treatments at a power of 0.8, 4 to 8 replicates/treatment would be adequate for measuring effects on growth, but not reproduction. Fewer replicates would be required if detection of only larger differences among treatment means were of interest. Ongoing water-only studies testing select contaminants should provide additional data on the relative sensitivity and variability of sublethal endpoints in toxicity tests with H. azteca.
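The dependence of replicate number on CV can be illustrated with a common normal-approximation formula for comparing two treatment means, n = 2(z_alpha/2 + z_beta)^2 (CV/delta)^2, where delta is the relative difference to be detected. This is a rough sketch only: the two-sided alpha of 0.05 and the formula itself are our assumptions, not necessarily the basis of the replicate numbers cited from [4], so the results agree in order of magnitude rather than exactly.

```python
from math import ceil
from statistics import NormalDist


def replicates_needed(cv, relative_difference, alpha=0.05, power=0.8):
    """Approximate replicates per treatment for a two-sample comparison.

    cv and relative_difference are fractions of the mean (0.20 means 20%).
    Uses a two-sided normal approximation (an assumption of this sketch).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n = 2 * (z_alpha + z_beta) ** 2 * (cv / relative_difference) ** 2
    return ceil(n)


# CVs typical of growth (10%) and reproduction (20%), 20% difference to detect.
for cv in (0.10, 0.20):
    print(cv, replicates_needed(cv, 0.20))
```

Under these assumptions the approximation gives 4 replicates at a CV of 10% and 16 at a CV of 20%, compared with the values of about 4 and 14 cited above. In either case, doubling the CV roughly quadruples the number of replicates needed.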
Kubitz et al. [19] recommended a two-step process for assessing growth in sediment tests with H. azteca. A limited number of replicates are tested in a screening step and samples identified as possibly toxic are then tested in a confirmatory step with additional replicates. This two-step analysis conserves laboratory resources and increases statistical power when needed to discriminate sublethal effects. A similar approach could be applied to evaluate reproductive effects of contaminants in sediment where a limited number of replicates could be initially tested to evaluate potential effects. Samples identified as possibly toxic based on reproduction could then be re-evaluated using an increased number of replicates. However, the use of sediments stored for extended periods of time may introduce variability in results between the two studies [1].

Relative endpoint sensitivity
Measurement of sublethal endpoints in sediment tests with H. azteca provides unique information that has been used to discriminate toxic effects of exposure to contaminants. Table 8 compares the relative sensitivity of survival and growth endpoints in 14- and 28-d tests with H. azteca in a historic database [51] generated at our laboratory with contaminated sediments. When 14-d and 28-d tests were conducted concurrently measuring both survival and growth, both tests identified 34% of the samples as toxic and 53% of the samples as not toxic (N = 32). Each test alone identified an additional 6% of the samples as toxic. Survival or growth endpoints identified a similar percentage of samples as toxic in both the 14- and 28-d tests. However, the majority of the samples used to make these comparisons were highly contaminated. We have not compared responses in 14-d and 28-d tests using moderately contaminated samples. Additional exposures conducted with moderately contaminated sediment may exhibit a higher percentage of sublethal effects in the 28-d test compared to the 14-d test.
When both survival and growth were measured in 14-d tests (N = 25), only 4% of the samples reduced both survival and growth; however, 20% reduced survival only and 16% reduced growth only (60% did not reduce survival or growth). Hence, if survival was the only endpoint measured in 14-d tests, 16% of the samples (those that reduced growth only) would be incorrectly classified as not toxic. Similar percentages are also observed for the 28-d tests. When both survival and growth were measured in the 28-d test (N = 44), 16% of the samples reduced both survival and growth, 14% reduced survival only, 18% reduced growth only, and 52% did not reduce survival or growth.
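The consequence of measuring survival alone can be worked through from the percentages reported above. The sketch below takes the four response categories for each test duration and computes the share of samples that a survival-only test would miss, both as a fraction of all samples and as a fraction of the samples toxic by either endpoint. The second figure is our own derived quantity, not one reported in Table 8.

```python
def survival_only_miss(both, survival_only, growth_only, neither):
    """Fraction of samples missed when only survival is measured.

    Inputs are percentages of samples in each response category. Samples
    that reduced growth only are the ones a survival-only test would miss.
    """
    total = both + survival_only + growth_only + neither
    toxic = both + survival_only + growth_only
    return {
        "missed_of_all_samples": growth_only / total,
        "missed_of_toxic_samples": growth_only / toxic,
    }


# Percentages reported in the text for the 14-d (N = 25) and 28-d (N = 44) tests.
result_14d = survival_only_miss(both=4, survival_only=20, growth_only=16, neither=60)
result_28d = survival_only_miss(both=16, survival_only=14, growth_only=18, neither=52)
print(result_14d)
print(result_28d)
```

On these figures a survival-only test misses 16% of all samples in the 14-d test and 18% in the 28-d test, which corresponds to 40% and 37.5% of the samples that were toxic by either endpoint.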
The endpoint comparisons in Table 8 represent only samples in which both survival and growth could be measured. If a sample was extremely toxic, it would not be included in this comparison because growth could not be measured. Moderately contaminated sediments that did not severely reduce survival could have reduced growth. For example, in 28-d tests with sediments from the Clark Fork River, growth was a more sensitive endpoint compared to survival or maturation. Only 13% of the samples reduced survival and 20% of the samples reduced maturation; however, growth was reduced in 53% of the samples [28].
Other investigators have reported that measurement of growth in tests with H. azteca often provides unique information that can help to discriminate toxic effects of exposure to contaminants in sediment [19,47,52] or water [53-55]. Similarly, in sediment tests with the midge C. tentans, sublethal endpoints are often more sensitive than survival as indicators of contaminant stress [9,10,56]. In contrast, Borgmann et al. [57] reported that growth or reproduction did not add additional information beyond measurement of survival in H. azteca water-only exposures with cadmium or pentachlorophenol. Similarly, Day et al. [18] reported that weight did not add additional information beyond measurement of survival in 14-d sediment tests with H. azteca. Ramirez-Romero [58] reported that reproduction of H. azteca was not affected by exposure to sublethal concentrations of fluoranthene in sediment when exposures were started with juvenile amphipods. Brasher and Ogle [53] started exposures with adult amphipods and observed that the sensitivity of reproduction compared to survival of H. azteca was dependent on the chemical tested (reproduction more sensitive to selenite and survival more sensitive to selenate in water-only exposures). Long-term exposures starting with juvenile amphipods would likely be more appropriate to assess effects of contaminants on reproduction (i.e., [59,60]).
In summary, the method outlined in the Appendix can be used to measure sublethal effects of contaminated sediments on H. azteca in 42-d exposures and these sublethal endpoints provide unique information regarding effects of contaminants. Both this method and the method described by Benoit et al. [8] for C. tentans are being considered as possible revisions to existing methods published by U.S. EPA [4] and ASTM [1]. The method outlined in the Appendix has also been evaluated in a preliminary round-robin test with 12 laboratories using two clean sediments. After the 28-d sediment exposures with H. azteca, laboratories typically reported survival >80%, length >3.2 mm/individual, and weight >0.15 mg/individual. Reproduction was more variable within and among laboratories (0-11 young/female; G.A. Burton, personal communication).
Either length or weight of amphipods can be measured in the sediment test. However, additional statistical options are available if length is measured on individual amphipods, such as nested ANOVA, which can account for variance in length within replicates. Reproduction was more variable than growth; hence, more replicates might be needed to establish statistical differences among treatments. A two-step analysis could be used where a limited number of replicates are initially evaluated and then select samples identified as possibly affecting reproduction could be re-evaluated using an increased number of replicates.
Additional studies are needed to evaluate further the use of reconstituted water and the influence of sediment particle size and ammonia in these long-term exposures with H. azteca. Ongoing water-only toxicity tests with select chemicals (i.e., cadmium, ammonia, and an organic compound) should generate data that can be used to better determine the relative sensitivity of endpoints measured in this test. These water-only studies will also be used to evaluate potential recovery of amphipods after transfer into clean water to measure reproduction. Results of these water-only studies will help to determine if there is a substantial increased sensitivity in this 42-d test compared to the 10-d sediment test with H. azteca where only lethality is measured. In addition to evaluating the relative sensitivity of endpoints in the laboratory, the ultimate measure of the utility of this or any other toxicity test is the ability of the test to estimate effects on populations in the field. Therefore, additional research is also needed to evaluate the ability of survival, growth, or reproductive endpoints measured in the laboratory tests to estimate responses of benthic organisms exposed in the field to contaminated sediments.