The relative influence of catchment and site variables on fish and macroinvertebrate richness in cerrado biome streams

Landscape and site-scale data analyses aid the interpretation of biological data and thereby help us develop more cost-effective natural resource management strategies. Our study focused on environmental influences on stream assemblages and we evaluated how three classes of environmental variables (geophysical landscape, land use and cover, and site habitat) influence fish and macroinvertebrate assemblage richness in the Brazilian Cerrado biome. We analyzed our data using multiple linear regression (MLR) models built with the three classes of predictor variables alone and in combination. The four MLR models explained dissimilar amounts of benthic macroinvertebrate taxa richness (geophysical landscape R² ≈ 35 %, land use and cover R² ≈ 28 %, site habitat R² ≈ 36 %, and combined R² ≈ 51 %). For fish assemblages, the geophysical landscape, land use and cover, site habitat, and combined models explained R² ≈ 28 %, R² ≈ 10 %, R² ≈ 31 %, and R² ≈ 47 % of the variability in fish species richness, respectively. We conclude that (1) environmental variables differed in the degree to which they explain assemblage richness, (2) the amounts of variance in assemblage richness explained by geophysical landscape and site habitat were similar, (3) the variables explained more variability in macroinvertebrate taxa richness than in fish species richness, and (4) all three classes of environmental variables studied were useful for explaining assemblage richness in Cerrado headwater streams. These results help us to understand the drivers of assemblage patterns at regional scales in tropical areas.


Introduction
Freshwater ecosystems are the most threatened environments in the world (Dudgeon et al. 2006), with species extinctions exceeding those of terrestrial environments (Sala et al. 2000). Richness is a common measure of biodiversity, and understanding richness patterns and pressures at various spatial scales is essential to reduce biodiversity loss, because we can use it as an indicator of resistance and resilience to disturbance, habitat simplification, and biological condition (Hughes and Noss 1992; Vinson and Hawkins 1998). We focus on fish and macroinvertebrate assemblages, the most commonly used taxonomic groups for assessing stream condition in large national and continental monitoring programs because of their value as indicators (e.g., Hughes and Peck 2008; Marzin et al. 2012a; USEPA 2013). Environmental patterns at various spatial scales of analysis within a watershed directly affect the structure of biological communities (Vannote et al. 1980; Frissell et al. 1986; Tonn 1990). Further, the differing sensitivities, mobilities and physiologies of fish and benthic macroinvertebrate assemblages should yield differing sensitivities to environmental variables. Benthos are more sensitive to disturbances than fish, which can move or physiologically adapt to changing conditions, especially those arising from anthropogenic sources (e.g., Wang et al. 1997; Lammert and Allan 1999; Walser and Bart 1999; Hrodey et al. 2009; Walters et al. 2009; Marzin et al. 2012a).
At a broad spatial scale, climate, geology and topography influence the geomorphic processes that govern smaller scale energy inputs and site habitat structure for aquatic assemblages (Frissell et al. 1986; Allan 2004; Goldstein et al. 2007). In addition, those geophysical factors influence human occupation and land and water use (Whittier et al. 2006; Steel et al. 2010). Geophysical factors, land use, and human impacts in a watershed affect the structure and composition of riparian zones, substrates, flow and thermal regimes, nutrient inputs, and potential inputs of pollutants, which directly affect the availability of site habitats for aquatic assemblages (Wang et al. 1997; Allan 2004). In turn, these site-scale physical and chemical habitats are primary factors influencing the structure and composition of aquatic assemblages (Vannote et al. 1980; Frissell et al. 1986; Allan 2004) because of their greater proximity to the organisms. Therefore, the hierarchical organization of various environmental factors and spatial scales affects the characteristics of aquatic assemblages (Hierarchy Theory; O'Neill et al. 1989; Fig. 1). However, because site-scale chemical and physical habitats are structured by environmental factors at the catchment scale, it is difficult to determine the importance of various levels of environmental variables on aquatic communities (Frissell et al. 1986; Allan 2004). Nonetheless, with increased availability of digital maps of landscape variables and greater statistical computing power, various combinations of those variables have been used to explain biological patterns across multiple biomes and ecoregions (Wang et al. 2006; Sály et al. 2011; Marzin et al. 2012b). Based on predictions from Hierarchy Theory, we expect that landscape factors will affect biota through their effects on physical and chemical habitat (Frissell et al. 1986; O'Neill et al. 1989; Tonn 1990).
Savanna biomes are distributed across tropical zones around the world (IBGE 1991). In South America, the Cerrado is the largest savanna biome, the second largest biome in Brazil (smaller only than the Amazon), and represents about 23 % of Brazil (Ratter et al. 1997). This biome is a world biodiversity hotspot with many rare and endemic species (Myers et al. 2000). It is one of the world's most threatened biomes, mainly because of the replacement of natural vegetation by pastures and row crops (Ratter et al. 1997; Myers et al. 2000). Thus, multiple effects of anthropogenic influences threaten Cerrado rivers, mainly by pollution and sedimentation (Wantzen et al. 2006). Although well studied around the world (e.g., Vinson and Hawkins 1998; Beauchard et al. 2003; Clarke et al. 2008; Oberdorff et al. 2011), spatial patterns in aquatic assemblage richness are poorly studied in the Brazilian Cerrado. Instead, most Cerrado diversity research is centered on the spatial patterns of richness of terrestrial frogs, birds, and mammals (e.g., Diniz-Filho et al. 2005; Rangel et al. 2006; Blamires et al. 2008; Melo et al. 2009). Therefore, our objective was to identify how landscape variables assessed at varying spatial scales influence benthic macroinvertebrate and fish assemblage richness in the Brazilian Cerrado. We tested two hypotheses: (1) the variability of geophysical landscape variables, catchment land use and cover, and site habitat all influence benthic macroinvertebrate and fish richness; (2) geophysical landscape variables influence land use and cover and site habitat; therefore, they affect macroinvertebrate richness and fish richness more than site variables.
Our analytical approach was based on building multiple linear regression (MLR) models to explain assemblage richness with three sets of predictor variables: (1) geophysical landscape variables that are not influenced by human action, (2) land use and cover that represent anthropogenic pressures, and (3) site physical and chemical habitat.

Study areas
We conducted this study in wadeable 1st-3rd order streams in two Brazilian Cerrado basins: Upper Rio Araguari Basin (Fig. 2a) and Upper Rio São Francisco Basin (Fig. 2b). Both study areas were demarcated upstream of the first hydroelectric plant in each basin (Nova Ponte and Três Marias, respectively). We collected samples at the end of the dry season in September 2009 (Araguari) and 2010 (São Francisco).

Site selection
We randomly selected 80 sampling sites, 40 in each basin. Site selection followed the generalized random tessellation stratified (GRTS) sampling design developed for the U.S. EPA's Wadeable Stream Assessment (Stevens and Olsen 2004; Olsen and Peck 2008). Our target was wadeable perennial streams, so we excluded all tributaries greater than Strahler order 3 on a digital 1:100,000 scale map. We excluded all stream channels >35 km from the shore of each respective reservoir to limit the effect of differing fish species dispersal capacities, as recommended by Hitt and Angermeier (2008).

Fig. 1 Hierarchical organization and interactions of landscape elements. The focal processes of aquatic biota richness, abundance, and diversity can be acted on directly by site-scale factors, or indirectly from catchment-scale factors and anthropogenic factors

Geophysical landscape variables
We calculated rainfall through use of time series data from the Brazilian National Water Agency (ANA 2011) obtained at 31 stations. Each station had total annual rainfall records for ≥30 years, and those data were extracted, geo-referenced, and interpolated using ordinary kriging (Johnston et al. 2001). The mean annual rainfall value of the grid cell overlapping each site was assigned to that site.
Using GIS software, we extracted geographic variables representing each of the 80 catchments. The watershed of each sampled site was manually delineated to include its entire upstream drainage area, using elevation data from the Shuttle Radar Topography Mission (SRTM, 3 arc second; USGS 2005). Drainage density (km/km²) was calculated by dividing the total length of streams by catchment area. Catchment elevation (range, mean and standard deviation) was extracted directly from SRTM imagery, whereas mean catchment slope was calculated from the maximum rate of change in elevation in every grid cell, based on the SRTM elevation raster. We calculated the upstream stream segment slope by dividing the altitude range between the site and the mapped initiation of the stream by the channel length between the two points. We calculated the proportions of various geologic units in the catchments after extracting data from Brazil (2004). Continuous variables were log transformed and proportional data were arcsine square root transformed to improve normality of the distributions.
We assessed catchment land use and cover for each site through screen digitizing of land use and land cover. We interpreted September Landsat TM sensor multispectral imagery (R4G3B2 false color band combination) in conjunction with fine resolution imagery (0.6-5 m spatial resolution, Google Earth data; Google 2010). The fine resolution images provided information about the shape and texture of the elements, and the Landsat images showed specific spectral response for each land use or vegetation cover. For example, in fine resolution imagery, vegetation targets are usually the same color (e.g., forest and sugar plantation are both green), but their responses in the infrared band in multispectral imagery are different because their leaf structures (physiognomies) differ considerably.
Our mapping identified four natural land covers (woodland savanna, grassy-woody savanna, savanna park, wetland palm swamps), and four human influenced land uses (eucalyptus forest, pasture, agriculture, urban) in each catchment. We also calculated total natural land cover and total anthropogenic land use by summing the preceding four land covers and four land uses, respectively. To further characterize anthropogenic influences on the sites, we measured Euclidean distance between each site and nearby towns and highways. Additionally, we calculated road density in each catchment (km/km²) by extracting the roads from a digital 1:100,000 scale map and dividing their total length by catchment area. Finally, the location of each household in the study area was extracted from the 2010 Brazilian Population Census (IBGE 2011). From those data we calculated the density of households (houses/km²) in each catchment and the proximity of houses to the sites through a spatial proximity kernel (Johnston et al. 2001), at a distance of up to 10 km between households and sites. We transformed these variables to improve the normality of their distributions, as we did with the geophysical landscape variables.
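The density metrics and normality transformations described above reduce to simple arithmetic, which the following minimal Python sketch illustrates. It is not the authors' GIS workflow: the function names, the base-10 logarithm, and the example values are assumptions made for illustration, and the text does not state the log base or how zero values were handled.

```python
from math import asin, log10, sqrt


def density(total, catchment_area_km2):
    """Amount per unit catchment area, e.g. stream or road length in km/km²."""
    return total / catchment_area_km2


def log_transform(value):
    """Log transform for continuous variables (base 10 is an assumption)."""
    return log10(value)


def arcsine_sqrt(proportion):
    """Arcsine square-root transform for a proportion between 0 and 1."""
    return asin(sqrt(proportion))


# Hypothetical catchment: 12 km of mapped streams in 8 km², 25 % pasture.
drainage_density = density(12.0, 8.0)   # 1.5 km/km²
pasture = arcsine_sqrt(0.25)            # asin(0.5), about 0.524 radians
```

The same `density` helper would serve for road density (km/km²) and household density (houses/km²), since each divides a catchment total by catchment area.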

Site habitat variables
We assessed physical habitat through use of preprinted field forms that could be quickly and precisely completed by checking or circling options (Peck et al. 2006) and that are widely used in regional (Kaufmann et al. 2009; Bryce et al. 2010) and national (Paulsen et al. 2008; Stoddard et al. 2008) studies. The length of each stream site sampled was 40 times its mean wetted width, with a minimum length of 150 m. Each site was divided into 11 equally spaced transects. At each transect we quantified channel dimensions (e.g., wetted width, depth, bankfull width), bank angle, riparian vegetation condition (e.g., tree canopy, understory and ground cover), presence of in-stream fish cover (e.g., undercut banks, overhanging shrubs, filamentous algae, macrophytes), and presence of human activities (e.g., pasture, agriculture, trash, pipes). Between transects, we determined channel slope (with a clinometer) and sinuosity (with a compass), and every 1.5 m we recorded flow habitat type (e.g., riffles, pools, glides) and thalweg depth. Substrate size was sampled by visually classifying the diameter class (e.g., sand, gravel, boulder) of a total of 105 individual particles at five systematic points on each of 21 cross-sections of the wetted channel to ensure stable and precise substrate estimates (Kaufmann et al. 1999). We measured instantaneous discharge at a cross section with non-turbulent or near-laminar flow in or near the site (Peck et al. 2006) and calculated physical habitat metrics as described in Kaufmann et al. (1999), but with relative bed stability (Lrbs) calculated as recommended by Kaufmann et al. (2009). Site habitat variables were normally distributed and so were not transformed.
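The reach layout rules in the preceding paragraph can be sketched in a few lines of Python. This is only an illustration of the stated arithmetic: the function names are hypothetical, and placing transects at both ends of the reach (so that 11 transects bound 10 equal intervals) is an assumption about the field protocol.

```python
def reach_length_m(mean_wetted_width_m, multiplier=40, minimum_m=150.0):
    """Reach length: 40 times the mean wetted width, but never under 150 m."""
    return max(multiplier * mean_wetted_width_m, minimum_m)


def transect_positions_m(length_m, n_transects=11):
    """Distances along the reach of equally spaced transects.

    Assumes transects sit at both ends of the reach, giving 10 intervals.
    """
    spacing = length_m / (n_transects - 1)
    return [i * spacing for i in range(n_transects)]


# Five systematic points on each of 21 cross-sections.
SUBSTRATE_PARTICLES = 5 * 21

narrow = reach_length_m(2.0)            # 40 x 2 m = 80 m, so the minimum applies
wide = reach_length_m(5.0)              # 40 x 5 m = 200 m
positions = transect_positions_m(wide)  # 0, 20, ..., 200 m
```

For a stream with a 2 m mean wetted width the 150 m minimum governs, whereas a 5 m wide stream yields a 200 m reach with transects every 20 m.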
We measured temperature, electrical conductivity, pH, and total dissolved solids (TDS) in situ with a multiprobe. In the laboratory, dissolved oxygen, turbidity, total alkalinity, total nitrogen, and total phosphorus were analyzed following APHA (1998). We log transformed water physical and chemical variables when necessary.

Benthic macroinvertebrate sampling
We sampled benthic macroinvertebrate assemblages through the use of D-frame kick nets (30 cm aperture, 500 μm mesh). Sampling followed a systematic zigzag pattern, with the first transect sampled near the left margin, the second transect sampled in mid-channel, the third transect near the right margin, and so on. At each of a site's eleven transects we sampled a 0.09 m² quadrat, totaling about 1 m² per site. This sampling area and distribution were found sufficient to typically yield 500 individuals, sample all major habitat types, and provide sufficiently precise and accurate estimates of macroinvertebrate taxa richness for regional (Li et al. 2001; Cao et al. 2002; Gerth and Herlihy 2006) and national (Paulsen et al. 2008; Stoddard et al. 2008) studies. The samples were preserved in the field with 4 % formalin and taken to the laboratory. In the laboratory, the samples were washed through a 500 μm mesh sieve, sorted, and identified under stereo microscopes at 32×. We identified specimens to family, except for Annelida, Mollusca and Arachnida, with the aid of Pérez (1988), Merritt and Cummins (1996), and Mugnai et al. (2010). The specimens were cataloged and deposited in the macroinvertebrate reference collection of the Federal University of Minas Gerais.

Fish sampling
We sampled fish assemblages with two hand nets made from mosquito screen (1 mm mesh) attached to an 80 cm hemispherical steel frame. We used a single type of net because we could thrust them into macrophyte beds and under overhanging banks and vegetation, drive fish into them by overturning rocks immediately upstream of the net and allowing the current to flush fish into the nets, and dash and splash through pools to drive fish downstream into the nets. Each site was sampled for two hours (12 min between each of 10 transects, which was adequate for these small and shallow headwater streams), thoroughly lifting substrates and netting between each transect. The efficiency of this method was previously tested through use of various estimators, whose efficiency of 78-85 % for both benthic and water column species, was superior to several other studies conducted in Brazil (Junqueira 2011), considering the high beta diversity in a neotropical hotspot (Allan and Flecker 1993). Fish were tagged separately by transect and preserved in 10 % formalin. In the laboratory, fish were identified to species through use of Britski et al. (1988) and Graça and Pavanelli (2007), preserved in 70 % ethanol, and deposited in the fish reference collection of the Federal University of Lavras.

Data analyses
Our statistical approach was based on the construction of multiple linear regression (MLR) models. This approach is widely used to determine the environmental factors most strongly associated with patterns of taxa richness (e.g., MacNally 2000; Diniz-Filho et al. 2003; Graham 2003). The environmental variables were divided into three groups and two levels (see Fig. 1): geophysical landscape variables that are not controlled by human action, anthropogenic pressure variables represented by land use and land cover variables (both at catchment scale), and site habitat variables (site scale). First, we eliminated the environmental variables that had more than 90 % zero values. We next analyzed Pearson correlations among the remaining 87 candidate predictor variables to identify highly correlated variables (|r| > 0.8; see Table SM1). Third, we calculated Pearson correlations between the candidate predictor variables (only those not highly correlated with each other) and richness values, and limited the potential predictors for the MLR models to those with |r| > 0.1. These steps yielded 50 and 37 potential predictors of macroinvertebrate and fish assemblages, respectively (Table SM2).
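The three screening steps above can be sketched in plain Python as follows. This is an illustrative sketch rather than the authors' procedure: the function names and example data are hypothetical, and the rule used for a highly correlated pair (drop the member less correlated with richness) is an assumption, because the text does not say which member of each pair was retained.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def screen_predictors(predictors, richness, max_zero_frac=0.9,
                      collinear_r=0.8, min_response_r=0.1):
    # Step 1: drop variables with more than 90 % zero values.
    kept = {name: vals for name, vals in predictors.items()
            if sum(v == 0 for v in vals) / len(vals) <= max_zero_frac}
    with_richness = {name: pearson(vals, richness) for name, vals in kept.items()}
    # Step 2: within each highly correlated pair, drop the member that is
    # less correlated with richness (assumed rule, see lead-in).
    dropped = set()
    for a, b in combinations(kept, 2):
        if a in dropped or b in dropped:
            continue
        if abs(pearson(kept[a], kept[b])) > collinear_r:
            dropped.add(a if abs(with_richness[a]) < abs(with_richness[b]) else b)
    # Step 3: keep variables whose correlation with richness exceeds the floor.
    return [name for name in kept
            if name not in dropped and abs(with_richness[name]) > min_response_r]


# Hypothetical data for six sites.
richness = [10, 12, 15, 18, 20, 25]
predictors = {
    "wetted_width": [1.0, 1.5, 2.0, 2.2, 3.0, 3.5],
    "wetted_area": [2.1, 3.0, 4.2, 4.3, 6.1, 7.2],   # nearly collinear with width
    "urban_cover": [0, 0, 0, 0, 0, 0],               # all zeros, removed in step 1
    "unrelated": [3, 1, 3, 1, 1, 3],                 # uncorrelated with richness
}
selected = screen_predictors(predictors, richness)
```

With these made-up values, only one of the two collinear channel-size variables survives; the all-zero and uncorrelated variables are screened out.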
To evaluate the best environmental level to explain both fish species richness and macroinvertebrate family richness, we developed three multivariate models of the three metric groups separately (geophysical landscape variables, land use and land cover, and site habitat), through stepwise regression (forward selection; P-to-enter ≈ 0.15). Each model was validated by analyzing the normality (Harrel 2001) and spatial autocorrelation of its residuals (Diniz-Filho et al. 2003; Rangel et al. 2010).
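For readers unfamiliar with forward selection, the following self-contained Python sketch shows one common form of it: at each step the candidate with the smallest partial F-test p-value enters the model, provided that p-value does not exceed the entry threshold. The software and exact entry test used in the study are not stated, so this is an assumption-laden illustration; the function names and data are hypothetical, and the p-value is obtained by numerically integrating the t density so that no external library is needed.

```python
from math import exp, lgamma, pi, sqrt


def solve(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def residual_ss(columns, y):
    """Residual sum of squares of an OLS fit of y on an intercept plus columns."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def two_sided_t_p(t, df, steps=2000):
    """Two-sided p-value of a t statistic via Simpson integration of the t density."""
    const = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    pdf = lambda x: const * (1 + x * x / df) ** (-(df + 1) / 2)
    h = abs(t) / steps
    area = pdf(0.0) + pdf(abs(t))
    area += sum((4 if i % 2 else 2) * pdf(i * h) for i in range(1, steps))
    return max(0.0, 1.0 - 2.0 * area * h / 3.0)


def forward_select(candidates, y, p_enter=0.15):
    """Add one predictor at a time while the best candidate has p <= p_enter."""
    selected, current_rss = [], residual_ss([], y)
    while True:
        best = None
        for name in candidates:
            if name in selected:
                continue
            cols = [candidates[s] for s in selected] + [candidates[name]]
            df_resid = len(y) - len(cols) - 1
            if df_resid <= 0:
                continue
            new_rss = residual_ss(cols, y)
            if new_rss == 0:
                p = 0.0  # perfect fit
            else:
                f_stat = (current_rss - new_rss) / (new_rss / df_resid)
                p = two_sided_t_p(sqrt(max(f_stat, 0.0)), df_resid)
            if best is None or p < best[1]:
                best = (name, p, new_rss)
        if best is None or best[1] > p_enter:
            return selected
        selected.append(best[0])
        current_rss = best[2]


# Hypothetical data: richness tracks bankfull width; canopy score is unrelated.
richness = [5.5, 7.5, 8.5, 10.5, 12.5, 14.5, 17.5, 19.5]
candidates = {
    "bankfull_width": [1, 2, 3, 4, 5, 6, 7, 8],
    "canopy_score": [3, 1, 1, 3, 3, 1, 1, 3],
}
chosen = forward_select(candidates, richness)
```

A relatively liberal entry threshold such as 0.15 lets weak but potentially informative predictors enter, which is why the resulting models still need the residual checks described in the text.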
To evaluate the relative influence of each environmental level on the richness of both assemblages, we performed partial linear regression (Legendre and Legendre 1998). First, we developed a new MLR from the set of all environmental variables, using the same screening criteria used in developing the three separate models. The fish and macroinvertebrate combined models were analyzed by variance partitioning, to evaluate the relative importance of each set of variables on the combined model performance (Legendre and Legendre 1998; Goldstein et al. 2007).
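Variance partitioning among three sets of predictors amounts to inclusion-exclusion arithmetic on the R² values of models fitted with every combination of the sets. The sketch below shows that arithmetic under stated assumptions: the labels G (geophysical landscape), L (land use and cover) and S (site habitat), the function name, and all R² values are hypothetical, and plain (unadjusted) R² is assumed because the text does not say whether an adjustment was applied.

```python
def partition_variance(r2):
    """Split explained variance into unique and shared fractions.

    `r2` maps each combination of predictor sets ("G", "L", "S", "GL",
    "GS", "LS", "GLS") to the R² of the model fitted with those sets.
    Shared fractions can be negative, e.g. when sets act as suppressors.
    """
    total = r2["GLS"]
    return {
        "G only": total - r2["LS"],
        "L only": total - r2["GS"],
        "S only": total - r2["GL"],
        "G and L": r2["GS"] + r2["LS"] - r2["S"] - total,
        "G and S": r2["GL"] + r2["LS"] - r2["L"] - total,
        "L and S": r2["GL"] + r2["GS"] - r2["G"] - total,
        "G, L and S": (r2["G"] + r2["L"] + r2["S"]
                       - r2["GL"] - r2["GS"] - r2["LS"] + total),
        "unexplained": 1.0 - total,
    }


# Hypothetical R² values, chosen only to make the arithmetic easy to follow.
example_r2 = {"G": 0.30, "L": 0.20, "S": 0.25,
              "GL": 0.40, "GS": 0.45, "LS": 0.38, "GLS": 0.50}
fractions = partition_variance(example_r2)
explained = sum(v for k, v in fractions.items() if k != "unexplained")
```

The seven explained fractions always sum to the R² of the combined model, so a negative shared fraction must be offset by larger values elsewhere.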

Results
We identified 84 benthic macroinvertebrate taxa and 77 fish species, with 77 benthic macroinvertebrate taxa and 38 fish species in the Upper Araguari Basin, and 80 and 54, respectively, in the Upper São Francisco Basin. Few environmental variables were highly correlated (|r| > 0.8); most correlations had 0.3 < |r| < 0.5 (Table SM1).
Stream sites occurred in catchments with variable area, drainage density, altitude, and slope, but with rainfall generally <1,500 mm. The predominant geological units were schist and mudstones. Phyllites, arkoses, sandstones and conglomerates occurred in smaller but similar proportions. Sedimentary rocks occupy most of the study area. Overall, the sampled watersheds had moderate levels of natural cover, low levels of pasture and agriculture, and very low levels of urbanization and non-native eucalyptus forest; sites were distant (>6 km) from urban centers and highways. Most sites were shallow and narrow, dominated by fine sediment with low geometric mean diameter. The flow habitat type was predominantly slow water, mostly glides. Riparian gallery forests were typical of the Cerrado, with a predominance of mid-canopy and understory cover. Site human disturbances were mostly agricultural (pasture, agriculture and eucalyptus forest) rather than cities or roads. The water quality was good, with high concentrations of dissolved oxygen, and low levels of total nitrogen, total phosphorus, turbidity, and dissolved solids (Table SM2).

Relationships between environmental variables and assemblage richness
Both benthic macroinvertebrate and fish assemblage richness were weakly to moderately correlated with all three classes of metrics studied. In relation to the geophysical landscape, both assemblage richnesses were similarly correlated with rainfall (negative), elevation, and slope patterns. Fish species richness was correlated with basin area and benthos taxa richness with drainage density. Both assemblage richnesses were negatively correlated with sandstones and macroinvertebrate richness was correlated with phyllites (Table SM2).
Regarding land use and cover, fish and macroinvertebrate richness showed distinct correlations: benthos richness was moderately and negatively correlated with wetland, agricultural and urban land use, but positively with natural land cover, especially parkland savanna. Fish richness was weakly correlated negatively with urban land use and positively with parkland savanna and eucalyptus planted forest.
At the site scale, morphological parameters were similarly correlated to both fish and macroinvertebrate richness: positive and moderate correlation with wetted width, bankfull width, and wetted area. Fish richness also correlated positively with the length of the sampled reach. There was no correlation between substrate and fish assemblages; benthos correlated only with excess fines (negatively) and relative bed stability. Neither assemblage was correlated with the presence of wood substrate. Macroinvertebrates were not correlated with flow habitat type, but fish were correlated negatively with pools and positively with glides. The results for riparian vegetation conditions were also different between taxa: benthos richness was only weakly correlated with canopy presence whereas fish responded positively to canopy absence (negative with shading and positively with presence of groundlayer cover). Fish richness was correlated with macrophyte cover, whereas benthos richness was correlated with sum of natural types of fish cover, but particularly with boulder cover. Riparian disturbance was correlated only with macroinvertebrates, mainly to non-agricultural types. Regarding water chemistry, dissolved oxygen and phosphorus (negatively) were correlated with both assemblages, but more strongly with benthos richness, which also was negatively correlated with turbidity. Other water quality parameters, such as conductivity and nitrogen, were not correlated significantly with the richness of either assemblage.

Multiple linear regression (MLR) models
The MLRs built with geophysical landscape and site habitat explained similar amounts of variation in macroinvertebrate richness: ≈35 and ≈36 %, respectively. Land use and cover explained ≈28 % (Table 1). The geophysical landscape model was composed of elevation range, drainage density, phyllite and rainfall (negatively). The land use and cover model was composed of percentage of wetland (negatively), and percentage of natural cover. The site habitat MLR was composed of phosphorus, natural fish cover, undercut bank fish cover, bankfull width, and presence of canopy layer. None of the MLRs had spatially correlated residuals (Fig. 3).
Regarding fish, the geophysical landscape MLR was composed of sandstones (negative), drainage area, slope range, and thalweg slope and explained ≈28 % of species richness (Table 2). The land use and cover model was the weakest, explaining only ≈10 %, and was composed of house density, percent parkland and city distance. The site habitat model explained ≈31 % of richness and was composed of phosphorus (negative), percentage of glides, and proportion of macrophyte cover. Furthermore, the MLRs had no spatially correlated residuals (Fig. 3).

Partial linear regression models
The macroinvertebrate richness model based on the three sets of variables combined explained ≈52 % of the macroinvertebrate richness, and was based on percent phyllites, rainfall, percentages of natural cover and wetland, total phosphorus concentration, and undercut bank fish cover (Table 3). The variance partitioning analysis indicated that land use and cover alone explained ≈14 %, geophysical landscape ≈12 %, and site habitat ≈10 % of the macroinvertebrate richness. Geophysical landscape and land use and cover shared ≈10 %, geophysical landscape and site habitat shared 1 % and land use and cover and site habitat shared <1 % of the explained variance. Approximately 3 % of the explained variance was shared among all three types of variables.
The composite fish richness model included three geophysical landscape variables (percent sandstones (negative), slope range, drainage area), three site variables (percent glides, proportion of macrophyte cover, ground-layer cover) and only one land use and cover variable (distance to cities) and explained ≈47 % of fish species richness (Table 3). The variance partitioning analysis indicated that geophysical landscape and site habitat explained similar amounts of fish richness (≈22 %) whereas land use and cover explained only ≈5 %. Geophysical landscape and land use and cover explained ≈2 % of the explained variance, and the other combinations were negative.

Discussion
Our results showed that benthic macroinvertebrate and fish assemblage richness correlated similarly with the level of geophysical landscape and site habitat variability when analyzed alone; however, compared with fish, benthos richness was more strongly correlated (negatively) with anthropogenic land use and cover alone. In national surveys, Brown et al. (2009) and USEPA (2013) also reported that macroinvertebrate assemblages were more sensitive to disturbance than fish assemblages. Presumably the more restricted mobility and physiology of aquatic benthic macroinvertebrates makes them more sensitive to the pressures and stressors we evaluated whereas the fish are more mobile and physiologically adaptable. It is also possible that the Cerrado fish species are simply more responsive to geophysical landscape and site variables than are Cerrado macroinvertebrate families (Table 3). Analysis at the macroinvertebrate species or genus (versus family) level might produce different results, but such taxonomic keys are unavailable for Brazil. However, Whittier and Van Sickle (2010) concluded that there was little difference between family and genus tolerances for western USA benthos in relation to synthetic catchment and local habitat disturbances.
Landscape composition and configuration derived from remotely sensed data and rigorous characterization of in-stream physical habitat have not been used in previous studies in tropical areas (Casatti et al. 2008; Moreno et al. 2009; Pinto et al. 2009; Moya et al. 2011; Feio et al. 2013). We used geospatial data (e.g., SRTM radar images, satellite images, rainfall time series, population census data) integrated in GIS for properly distinguishing landscape controls and anthropogenic pressure classes. To characterize physical habitat structure quantitatively in our synoptic survey context, we applied current methods used by the USEPA in its national surface water surveys. Consequently, we believe that our study reveals more accurate and repeatable results regarding the multi-scalar relationships between environmental predictor variables and taxa richness than those of earlier tropical stream studies. A recurring problem in ecological studies is the bias caused by spatial autocorrelation of assemblage responses (Diniz-Filho et al. 2003; Stevens and Olsen 2004; Steel et al. 2010). In this study, we successfully used a spatially balanced sampling network to counteract this bias. Further, such a study design helps guarantee the principle of sample independence when extrapolating the results to entire basins (Whittier et al. 2007a). In the USA, this approach is already used at both national and regional scales; however, this is a relatively new approach in Brazil, with other studies conducted using unbalanced sampling networks (e.g., Casatti et al. 2008; Moreno et al. 2009; Pinto et al. 2009; Feio et al. 2013).

Relationships between geophysical landscape variables and assemblage richness
The associations of benthos and fish assemblages with geophysical landscape level variability were, together with site habitat variables, the best fit set of environmental factors to explain the richness of both assemblages. This is not surprising, because geophysical aspects drive site habitat structure (Frissell et al. 1986; Allan 2004; Goldstein et al. 2007) and land use and cover (Whittier et al. 2006; Steel et al. 2010). In fact, interactions between topography and geology are important drivers of catchment size and shape, energy inputs, and fine sediment dynamics (Montgomery 1999). Thus, we can see the interaction of slope and geology in explaining the richness of both assemblages (Tables 1 and 2). The contribution of rain has an unclear effect. At a global scale, higher rainfall levels are positively correlated with fish richness. However, biomes and ecoregions are very different with respect to their climatic regimes and rainfall overall. In the case of the Brazilian Cerrado, taxa richness was lower in areas with increased rainfall, possibly because of increased freshets or other processes in these hydrologically flashy systems. The Cerrado has well-defined wet and dry seasons, with variation in the driest areas less than variation in wetter areas (ANA 2011). Thus, these geophysical landscape variables are very useful because similar variables are used for developing ecoregions (Pinto et al. 2006; Omernik et al. 2011) and ecoregions were useful for developing differing expectations in spatially extensive assessments of the biological condition of USA streams (Paulsen et al. 2008; USEPA 2013). Further, landscape variables on broad regional scales could be very useful for explaining assemblage patterns in Cerrado and other neotropical studies.

Relationships between land use and cover and assemblage richness
The MLR relating anthropogenic pressure to assemblage richness was stronger for benthic macroinvertebrates than for fish (≈28 vs ≈10 %, respectively). Others have reported similar patterns for lotic systems (Wang et al. 1997; Lammert and Allan 1999; Walser and Bart 1999; Hrodey et al. 2009; Walters et al. 2009). However, Marzin et al. (2012a) found that fish assemblages, when summarized by functional traits, were more responsive than macroinvertebrates.
Biotic multimetric condition indices are influenced by land use and cover conditions (Hughes et al. 1998; Karr 1999) and commonly used in making regional bioassessments. One reason that macroinvertebrate taxa richness is more often used than fish species richness in such indices is that the former tend to be more sensitive to disturbance (e.g., Kerans and Karr 1994; Fore et al. 1996; Karr 1999; Weigel et al. 2002). On the other hand, fish species richness is not commonly used now as a metric in such indices (e.g., McCormick et al. 2001; Hughes et al. 2004; Whittier et al. 2007b; Pont et al. 2009). However, macroinvertebrates are also easier and quicker to sample effectively than fish. Nonetheless, fish assemblages have been employed effectively in regional-scale (Whittier et al. 2007b; Pont et al. 2009) and continental-scale (Pont et al. 2007; Esselman et al. 2013) biological assessments.
Low percentages of natural cover and parkland savanna are proxies for general disturbances (e.g., pasture, agriculture, urbanization), and high percentages indicate less disturbed environmental conditions (Ligeiro et al. 2013). However, in the Cerrado, natural palm swamp wetlands are characterized by low gradients, floodplains with little riparian cover and abundant fine sediments, which produce high levels of fines in streams that are inhospitable to many macroinvertebrate taxa (e.g., Bryce et al. 2010).

Relationships between site habitat and assemblage richness
Site habitat explained 31-37 % of the richness of both assemblages. Others have reported that physical habitat was a primary factor influencing the structure and composition of lotic assemblages (Frissell et al. 1986;Allan 2004;Hughes et al. 2010). However, geodynamic aspects drive the site habitats, especially at broad spatial scales (Goldstein et al. 2007). The MLR models showed the relative importance of channel morphology, canopy, fish cover and water quality in macroinvertebrate and fish richness. Although substrate is considered an important variable for both assemblages in temperate streams (Kaufmann et al. 2009;Bryce et al. 2010), our models did not detect this relationship, possibly because of covariance effects or the lower variability in substrate sizes of Cerrado streams.
Similar to published studies, we found that bankfull width and length of sample reach increase with drainage area, and greater wetted area provides more resources and the capacity to support more species (Vinson and Hawkins 1998;Brooks et al. 2002). The negative influence of excess phosphorus reflects its contribution to eutrophication, which leads to loss of sensitive species (Allan and Flecker 1993). The increased amounts of natural fish cover and riparian canopy indicate greater riparian zone quality, which results in greater habitat complexity and high and low flow refugia (Allan et al. 1997;Johnson et al. 2003).
Interestingly, we found a positive relationship between fish richness and macrophyte cover and herbaceous ground cover. Macrophytes provide cover, food, and foraging sites for fish in tropical and subtropical streams (Casatti et al. 2003), but their presence indicates anthropogenic pressure resulting from the removal of riparian trees. Similar to aquatic macrophyte cover, herbaceous ground cover is indicative of anthropogenic disturbance of riparian vegetation, reinforcing the idea that moderate pressures can lead to increased fish species richness in subtropical streams. Lyons et al. (1996), Hughes et al. (2004) and Davies and Jackson (2006) reported a similar relationship for some minimally disturbed temperate streams. We suspect that intermediate disturbance releases more nutrients and energy into the systems and tropical and subtropical biota are less disturbed than temperate biota by elevated temperatures as long as there is sufficient food. Accordingly, further studies of fish assemblage structure relative to disturbance of subtropical streams are needed to better understand how human pressures affect those assemblages.

Relative influence of geophysical landscape, land use and cover, and site habitat models
Models incorporating interaction among the three variable types had greater explanatory power than the models based on a single variable type, justifying the composite models. Analyzing which variables were selected in single models and were not selected in the combined model helps us to understand how landscape factors influence habitat and in turn biota. Further, analyzing the shared explanation, we can clarify the covariance effect among the three environmental levels (geophysical landscape, land use and cover, site habitat) on richness. Thus, our results are consistent with Hierarchy Theory (O'Neill et al. 1989). According to the theory, ecological processes operate hierarchically, and the behavior of an ecological system at smaller spatial scales is constrained by processes at larger spatial scales (Frissell et al. 1986;Tonn 1990).
In the case of separate models, geophysical landscape and site habitat had similar explanatory power (≈10 and ≈20 % for benthos and fish, respectively) while land use and cover in the catchment was relatively more important for macroinvertebrates than for fish, relative to the other two sets of environmental variables. However, the shared explanation among geophysical landscape and land use and cover reveals the relative influence of geodynamic factors over anthropogenic pressures. In the combined macroinvertebrate model, two variables (drainage density, elevation range) previously incorporated by the geophysical landscape MLR were not incorporated into the combined model (compare Tables 1 and 3).
However, the fractions explained by landscape and by land use and cover were similar (≈10 and ≈12 %, respectively). Likewise, two land use and cover variables (house density, parkland savanna) were eliminated in the combined fish model but were included in the separate MLR (compare Tables 2 and 3). In general, these omitted variables were correlated (see Table SM1). In this case, sandstone is often predisposed to erosion (e.g., Kaufmann and Hughes 2006) and occurred in flat and rainy areas, suitable for agriculture. Thus, that erosion is amplified by agriculture (Wang et al. 1997;Walser and Bart 1999). Parkland savanna typically occurs in mountain range areas, like phyllite lithology, and this factor hinders human occupation and anthropogenic disturbance.
Interactions among site habitats and geophysical landscape and land use and cover had less shared explanation, probably because the variables selected for the combined models had low correlation with both sets of catchment variables (see Table SM1). Further, variables not selected for the combined model (e.g., bankfull width, length of sample reach, natural fish cover, canopy cover) were correlated with geophysical landscape and land use and cover. Bankfull width and site length increase with drainage area, and natural fish cover and riparian canopy tend to decline (Allan et al. 1997;Johnson et al. 2003).
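The idea of a "shared explanation" between two sets of predictors can be illustrated with a minimal variance-partitioning sketch. This is not the analysis used in this study: the data are synthetic, the variable names (`slope`, `pasture`, `richness`) and all coefficients are hypothetical, and each predictor set contains a single variable for brevity. The sketch assumes the common partitioning identity in which the shared fraction equals the sum of the two separate R² values minus the R² of the combined model.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def r_squared(columns, y):
    """R^2 of an ordinary least-squares fit of y on the predictor columns plus an intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(X[i][j] * X[i][k] for i in range(n)) for k in range(p)] for j in range(p)]
    xty = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def partition(set_a, set_b, y):
    """Split the combined R^2 into fractions unique to each predictor set and a shared fraction."""
    r2_a = r_squared(set_a, y)
    r2_b = r_squared(set_b, y)
    r2_ab = r_squared(set_a + set_b, y)
    return {
        "unique_a": r2_ab - r2_b,
        "unique_b": r2_ab - r2_a,
        "shared": r2_a + r2_b - r2_ab,
        "combined": r2_ab,
    }

# Synthetic, hypothetical data: steeper catchments carry less pasture,
# so the geophysical and land use predictors are correlated.
rng = random.Random(42)
n = 200
slope = [rng.gauss(0, 1) for _ in range(n)]
pasture = [-0.7 * s + 0.7 * rng.gauss(0, 1) for s in slope]
richness = [1.0 * s - 0.8 * p + rng.gauss(0, 1.5) for s, p in zip(slope, pasture)]

fractions = partition([slope], [pasture], richness)
for name, value in fractions.items():
    print(f"{name}: {value:.3f}")
```

Because the two synthetic predictors are correlated, part of the explained variance cannot be attributed to either set alone and appears in the shared fraction. This is the same kind of overlap discussed above for geophysical landscape and land use and cover.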
The combined models increased the ability of the three separate MLRs to explain richness variability from low (10-35 %) to moderate (47-51 %) levels and similar results have been obtained for European and United States streams in exploratory landscape studies. With 104 Oregon stream sites, Kaufmann and Hughes (2006) could only explain 52-79 % of the variability in fish index of biotic integrity scores with 28 potential predictor variables. Sály et al. (2011) studying 54 Hungarian stream sites could explain only 31-57 % of the variability in fish species presence or relative abundance with 62 potential predictor variables. Studying 302 French sites and using 39 potential predictor variables, Marzin et al. (2012b) explained only 29-30 % of fish species abundance. This suggests four issues: (1) it is very difficult to explain most of assemblage variability even with relatively large data sets, (2) the choices of predictor and response indicators likely affect the amount of variance that can be explained, (3) the geographic location of the study likely affects predictor-response relationships, and (4) sampling variability related to both the environmental and biological indicators limits the amount of variability that can be explained (Kaufmann et al. 1999).
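Issue (4) can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reanalysis of any data set cited here: a single hypothetical driver determines "true" richness exactly, and independent sampling error is added to the observed richness. Under these assumptions the attainable R² is bounded by the ratio of signal variance to signal plus noise variance. All names and values are illustrative.

```python
import random

def r_squared_simple(x, y):
    """Squared Pearson correlation, which equals R^2 for a one-predictor linear regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def simulate(noise_sd, n=5000, seed=1):
    """R^2 obtained when observed richness = true richness + sampling error."""
    rng = random.Random(seed)
    driver = [rng.gauss(0, 1) for _ in range(n)]
    true_richness = [20 + 3 * d for d in driver]  # signal SD is 3 by construction
    observed = [t + rng.gauss(0, noise_sd) for t in true_richness]
    return r_squared_simple(driver, observed)

def expected_ceiling(signal_sd, noise_sd):
    """Theoretical upper bound on R^2 when the response carries independent noise."""
    return signal_sd ** 2 / (signal_sd ** 2 + noise_sd ** 2)

for sd in (0.0, 3.0, 6.0):
    print(f"noise SD {sd}: simulated R2 = {simulate(sd):.2f}, "
          f"ceiling = {expected_ceiling(3.0, sd):.2f}")
```

The sketch covers only error in the response variable. Measurement error in the predictors would lower the attainable R² further.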
The low to moderate amount of variation explained in preceding studies may seem discouraging to some readers. However, analyses designed to compare influences on aquatic organisms from landscape-scale and in-stream habitat factors differ from modeling single species distributions. In species distribution modeling, researchers attempt to attain the highest possible predictive capacity for a single species. In more exploratory analyses such as our study, we sought to test and characterize discrete sets of influences on entire assemblages. When one considers the multitude of factors affecting distributions of multiple aquatic species throughout their lifetimes (e.g., reproduction, predation, competition, exploitation, disease, migration, evolutionary histories, unassessed landscape and habitat variables), it is unreasonable to expect a high R 2 without accounting for such variables. Despite the lower predictive capacity of our models, we believe our results are highly relevant for understanding how landscape patterns structure aquatic species assemblages.

Conclusions
Our analyses of the relationships between multiple environmental factors and assemblage richness indicated the importance of geophysical landscape, land use and cover, and site habitat variables in explaining macroinvertebrate and fish assemblage richness in the Brazilian Cerrado. The reduced mobility and physiological adaptability of larval macroinvertebrates probably led to the greater importance of land use and cover in explaining the richness of that assemblage versus fish, because fish tend to be more mobile and physiologically adaptable. We also demonstrated that, when combined, these differing sets of environmental factors can explain moderate (≈50 %) amounts of the variability in the benthos and fish assemblage richness of Cerrado headwater streams. Geophysical landscape variables have an important role in regulating the magnitude and timing of water and sediment inputs, as well as the competence of streams to transport or store various sizes of sediments. Human activities frequently increase sediment production and pollution, and reduce riparian vegetation, which in turn alters assemblage richness. Richness of macroinvertebrate and fish taxa is typically reduced in river basins strongly influenced by anthropogenic pressure; however, moderate riparian disturbance can increase fish species richness. This nonlinearity in fish species richness response to disturbance, versus what has been observed with fish species traits or guilds, makes it a poor indicator of site disturbance in some regions. Site habitat conditions, independent or allied with a disturbance gradient, are also useful richness predictors. All three classes of environmental variables influenced the richness of both assemblages in Cerrado streams, confirming our first hypothesis. Landscape variables explained relatively more of the variability in richness in both assemblages when combined, confirming our second hypothesis.
For the same biome, fish and macroinvertebrate assemblages responded differently to the same sets of predictor variables, corroborating other studies in temperate zones and indicating the value of assessing both assemblages when conducting bioassessments.