Spatial statistical analysis of land-use determinants in the Brazilian Amazonia: Exploring intra-regional heterogeneity

The goal of this paper is to explore intra-regional differences in land-use determining factors. We built spatial regression models to assess the determining factors of deforestation, pasture, temporary and permanent agriculture in four space partitions: the whole Amazon; the Densely Populated Arch (southern and eastern parts of the Amazon), where most deforestation has occurred; Central Amazon, where the new frontiers are located; and Occidental Amazon, still mostly undisturbed. Our land-use data combine deforestation maps derived from remote sensing and the 1996 agricultural census. We compiled a spatially explicit database with 50 socio-economic and environmental potential factors using 25 km × 25 km regular cells. Our results show that the concentrated deforestation pattern in the Arch is related to the diffusive nature of land-use change, proximity to urban centers and roads, reinforced by the higher connectivity to the more developed parts of Brazil and more favorable climatic conditions, expressed as intensity of the dry season. Distance to urban centers was used as a proxy of accessibility to local markets, and was found to be as important as distance to roads in most models. However, distance to roads and to urban centers does not explain intra-regional differences, which were captured by other factors, such as connection to national markets and more favorable climatic conditions in the Arch. Agrarian structure results show that areas in which the land structure is dominated by large and medium farms have a higher impact on deforestation and pasture extent. Temporary and permanent agriculture patterns were concentrated in areas where small farms are dominant. We conclude that the heterogeneous occupation patterns of the Amazon can only be explained when combining several factors related to the organization of the productive systems, such as favorable environmental conditions and access to local and national markets.
Agrarian structure and land-use analysis reinforced this conclusion, indicating the heterogeneity of land-use systems by type of actor, and the inﬂuence of the agrarian structure on land-use patterns across the region.


Introduction
The Brazilian Amazonia rain forest covers an area of 4 million km². The intense human occupation process is among the main threats to the forest (Margulis, 2004). The enormous potential impact of deforestation in Amazonia calls for qualified and comprehensive assessments of the factors affecting it. Such analysis has to take into account the enormous socio-economic and biophysical diversity of the region, aiming at understanding intra-regional differences. The process of human occupation in Brazilian Amazonia is heterogeneous in space and time. Until the 1950s, human occupation in the Brazilian Amazonia was concentrated along the rivers and coastal areas (Costa, 1997; Machado, 1998). The biggest changes in the region started in the 1960s and 1970s, due to an effort by the Federal Government to populate the region and integrate it with the rest of the country, including infrastructure network investments (roads, energy, telecommunication), colonization and development zones, and credit policies (Becker, 1997; Costa, 1997; Machado, 1998). Since the mid-1980s, occupation has continued intensively, but driven more by market forces (wood extraction, cattle, soybeans) than subsidized by the Federal Government (Becker, 2005). Human occupation followed concentrated patterns along the axes of rivers and roads, kept apart by large forest masses. These forest areas have scattered population and include indigenous lands and conservation units. According to Alves (2002), deforestation tends to occur close to previously deforested areas, showing a marked spatially dependent pattern. Most of it is concentrated within 100 km of major roads and the 1970s development zones, but not uniformly. As the occupation process is linked to agricultural production, deforestation also tends to be concentrated along roads that provide an easier connection to the more prosperous economic areas in the center and south of Brazil (Alves, 2002).
According to Becker (2001), subregions with different speeds of change coexist in the Amazon, due to the diversity of ecological, socio-economic, political and accessibility conditions.
Recent estimates indicate that, on average, close to 110,000 km² of forest were cut in Amazonia in the period [...]-2005 (INPE, 2005). The land cover change has also been associated with a concentration of land ownership. Farmers with large properties tend to be the dominant economic actors in the region, whereas the vast majority of the population lives in substandard conditions (Becker, 2005). Given the importance of the Brazilian Amazonia region at both the national and international scales, it is important to derive sound indicators for public policy making. As stated by Becker (2001), "understanding the differences is the first step to appropriate policy actions". Informed policymaking requires a quantitative assessment of the factors that bring about change in Amazonia. Quantifying land-use determinant factors is also a requirement for the development of LUCC models that could be used to evaluate the potential impact of alternative policy actions.
For instance, predictions of future deforestation presented by Laurance et al. (2001) are based on the assumption that the road infrastructure is the prime factor driving deforestation. Such predictions are based on a simple and uniform extrapolation of past patterns of change into the medium-term future (2020), disregarding Amazonia's biophysical and socio-economic heterogeneity, and the web of immediate and underlying conditions that influence the location and the different rates of change in space and time. Predictions based on such an over-simplified view of reality may even lead to ineffective policy recommendations, unable to deal with the real factors affecting the Amazon occupation process.
In that context, this paper develops a spatial statistical analysis of the determinants associated with land-use change in Amazonia. We use a spatially explicit database (25 km × 25 km regular cells covering the original forest areas), including 50 environmental and socio-economic variables, to support the analysis. Measures of territorial connectivity received special attention. We use spatial statistical analysis methods to understand the relative importance of the immediate factors related to deforestation, pasture and temporary agriculture patterns, and to explore the intra-regional differences between these factors. The paper also compares the results of conventional linear regression models to spatial regression models, and discusses the use of the two approaches in LUCC dynamic models and scenario analysis.
The paper is organized as follows. Section 2 presents a review of previous work on assessment of factors of deforestation in tropical forests. Section 3 presents the methods used in the assessment of determinant factors for land-use patterns in Amazonia. Section 4 presents the results and discusses them. We close the paper with final considerations regarding the use of spatial regression methods in LUCC modeling, and summarizing the main findings regarding the Amazonia human occupation process.

Review of previous work
In this section, we consider previous work on the assessment of factors associated with land-use change in Amazonia, focusing mainly on studies that cover the whole region. Table 1 summarizes results of previous studies in Amazonia, including econometric models and grid-based models, as described below. For other tropical forest areas, Kaimowitz and Angelsen (1998) present a broad review of deforestation models. One of the approaches reviewed is the use of econometric methods based on municipal data. Along this line, Reis and Guzmán (1994) developed a non-spatial econometric analysis of deforestation at the region-wide level. They found that population density, road network density and extension of cultivated areas were the most important factors.
Also using econometric methods, Andersen and Reis (1997) analyzed the determining factors of deforestation from 1975 to 1995, using municipal data at a region-wide level. Their results indicate that deforestation was started by governmental action associated with road construction and the establishment of development programs. Later on, local market forces became the more important factor, replacing government action as the main driver of deforestation. Their model indicates that land-use change is caused by 11 factors: distance to the federal capital, road length, earlier deforestation in the area, earlier deforestation in neighboring municipalities, rural population density, land prices, urban GDP growth, size of cattle herd, change in the size of cattle herd, change in agricultural production and change in land prices.

Table 1
Author | Goal | Approach | Most important factors/results
Reis and Guzmán (1994) | Determining factors of deforestation | Econometric model/municipal data | Population density, road network density and extension of cultivated areas
Andersen and Reis (1997) | Determining factors of deforestation | Econometric model/municipal data from 1975 to 1995 | Distance to the federal capital, road length, earlier deforestation in the area, earlier deforestation in neighboring municipalities, rural population density, land prices, urban GDP growth, size of cattle herd, change in the size of cattle herd, change in agricultural production, and change in land prices
Pfaff (1999) | Determining factors of deforestation | Econometric model/municipal data from 1978 to 1988, associated with remote sensing deforestation data | Biophysical variables (soil quality and vegetation type), transportation-related variables (road network density in the area and in its neighbors) and government-related variables (development policies)
Perz and Skole (2003) | Social determinants of secondary vegetation | Spatial regression model/demographic (1980 and 1991) and agricultural (1980 and 1985) census data | Factors have a significant spatial variation among the three subregions considered by the authors (remote, frontier, consolidated). Social factors are organized into: (1) settlement history, (2) agricultural intensification, (3) non-traditional land use, (4) crop productivity, (5) tenure insecurity, (6) fuelwood extraction and (7) rural in-migration
Laurance et al. (2002) and Kirby et al. (2006) | Spatial determinants of deforestation | Statistical analysis to assess the relative importance of 10 factors at two spatial resolutions: 50 km × 50 km and 20 km × 20 km (with sampling to avoid auto-correlation) | Factors analyzed: paved roads, unpaved roads, urban population size, rural population density, annual rainfall, soil fertility, soil water logging. Both at the coarser and finer scales, three factors are most relevant: urban and rural population density, distance to paved roads and dry season extension. Soils were not considered relevant
Soares-Filho et al. (2006) | Spatial determinants of deforestation (to feed a dynamic model) | Logistic regression/regular grid of 1.25 km on sample areas | Distance to paved and unpaved roads, distance to urban areas, relief, existence of protected areas. Deforestation is not influenced by soil quality, nor does it necessarily follow rivers

Pfaff (1999) analyzed the determining factors of deforestation using an econometric model based on municipal data from 1978 to 1988, associated with deforestation data obtained from remote sensing surveys, covering the whole region. His results indicate the relevance of biophysical variables (soil quality and vegetation type), transportation-related variables (road network density in the area and in its neighbors) and government-related variables (development policies). Population density was only considered a significant factor when the model used a non-linear (quadratic) formulation. The author concluded that, in a newly occupied area, earlier migration has a stronger impact on deforestation than later settlements.
Margulis (2004) presents an econometric model that analyzes the Amazon occupation, quantifying the relationships in space and time of the main agricultural activities (wood extraction, pasture and crops) and their effects on deforestation in the region. He also considers the ecological and economic factors conditioning these relationships. Models are based on municipal panel data from five agricultural censuses, from 1970 to 1996, complemented by geo-ecological information (vegetation cover, relief, average rainfall and rainfall in June) and transport costs (transport cost to São Paulo by road). Results indicate: (a) no evidence of precedence between the wood extraction and pasture activities; (b) rainfall seems to be the major agro-ecological determinant; (c) reducing transportation cost induces intensification, but results were not conclusive as to whether intensification increases or decreases deforestation.
The second type of research on causes of land-use change in Amazonia studies social factors based on municipal data and remote sensing. Perz and Skole (2003) developed a spatial regression model for secondary vegetation using social indicators as determining factors. They used demographic (1980 and 1991) and agricultural (1980 and 1985) census data, aggregated at the municipal level. The results show that the factors have a significant spatial variation among the three subregions considered by the authors (remote, frontier, consolidated). Their study points out that analysis of factors that influence land-use change in Amazonia should consider regional differences.
A third line of work uses regular cells as analysis units. Laurance et al. (2002) performed a statistical analysis to assess the relative importance of 10 factors at two spatial resolutions: 50 km × 50 km and 20 km × 20 km. Their main conclusions were that, both at the coarser and finer scales, three factors are most relevant for deforestation: population density, distance to roads and dry season extension. Kirby et al. (2006) refine this analysis, and reinforce that both paved and non-paved roads are the main factor determining deforestation. Soares-Filho et al. (2006) performed a statistical analysis to define spatial determinants of deforestation to feed a dynamic model, using a regular grid of 1.25 km². The dynamic model allocates deforestation using empirical relationships between forest conversion in a given period of time and spatial factors. These factors include proximity to roads, rivers and towns, land-use zoning and biophysical features. To establish such relationships, sample regional studies were used, and calibrated for 12 Landsat TM scenes. Results were then used in the dynamic model to construct scenarios for the whole Amazonia. Their results indicate that the most important factor for predicting deforestation location is proximity to roads; indigenous reserves are important as a deterrent of deforestation; proximity to urban centers increases deforestation; and deforestation is related to relief, being smaller in low wetlands and in areas with higher altitude and slope. According to their results, deforestation is not influenced by soil quality and vegetation type, and does not necessarily follow the river network.
Also using regular grids as the unit of analysis, another line of work consists of subregional studies that consider specific areas and localized factors. Soares-Filho et al. (2002) analyzed a small colonist area in the north of Mato Grosso during two time periods: 1986-1991 and 1991-1994. They constructed logistic regression models to analyze the determining factors for the following transitions: forest to deforested, deforested to secondary vegetation, and secondary vegetation to removal of secondary vegetation. The factors considered were: vegetation type, soil fertility, distance to rivers, distance to main roads, distance to secondary roads, distance to deforestation, distance to secondary vegetation and urban attractiveness factor. Mertens et al. (2002) studied the deforestation patterns in the São Felix do Xingu region (Pará State). They divided the study area into subregions according to patterns identified by remote sensing and identified different types of social actors. Then they applied logistic regression to analyze deforestation determining factors by type of actor in three time periods (before 1986, 1986-1992, 1992-1999). The factors analyzed were: presence of colonization areas, presence of protected areas, presence of relief, distance to cities, distance to villages, distance to dairy industries, distance to main roads, distance to secondary roads and distance to rivers.
Our work adds to these efforts in four aspects. First, most studies in Amazonia are restricted to deforestation factors, while we go a step further, decomposing deforestation patterns into pasture, temporary and permanent agriculture. Second, our study investigates intra-regional differences through comparative analyses of alternative space partitions. Third, we use a spatial regression model, which allows us to investigate the spatial dependence of deforestation. Fourth, in addition to the socio-economic and biophysical factors adopted in previous works, the model includes measures of connectivity to national markets and to ports, and introduces agrarian structure indicators that have not been used before. Our approach is fully described in the next section of this paper.

Study area, spatial resolution and spatial partitions
The study area is the Brazilian Amazonia rain forest (around 4 million km²). To perform a spatially explicit analysis, all variables representing land-use patterns and potential factors are decomposed into regular cells of 25 km × 25 km. The model considers two spatial partitions: the whole Brazilian Amazonia and three macro-zones defined by Becker (2005), namely the Densely Populated Arch, the Central Amazonia and the Occidental Amazonia. The Densely Populated Arch is associated with higher demographic densities, roads and the core economic activities. The Central Amazonia is the area crossed by the new axes of development, from the center of Pará state to the eastern part of Amazonas state. According to Becker (2004, 2005), it is currently the most vulnerable area, where the new occupation frontiers are located. The Occidental Amazonia is the more preserved region, outside the influence of the main road axes, with a unique population concentration in the city of Manaus. Fig. 1 illustrates the study area, the three macroregions, the nine Federative States, and the distribution of protected areas in the region.

Land cover/use patterns
The analysis uses the deforestation maps compiled by the Brazilian National Institute of Space Research (INPE, 2005). Cells with a major proportion of clouds, non-forest vegetation, or outside the Brazilian Amazonia were eliminated from our analysis. Cloud cover in 1997 represents around 13% of forest area. Using a deforestation map that presents the accumulated deforestation until 1997, we computed the proportion of deforestation for each valid 25 km × 25 km cell, as illustrated in Fig. 2. The deforestation patterns were decomposed into the main agricultural uses for which area estimates were available from the IBGE (Brazilian Institute for Geography and Statistics) Agricultural Census of 1996 (IBGE, 1996). In this paper, we focus on pasture, temporary and permanent agriculture patterns. Although more recent information would be available for specific crops (e.g., soya), the 1996 Agricultural Census is the last available source for planted pasture area, and, as seen below, pasture occupies around 70% of the deforested area in 1997. Municipality-based census data were converted from polygon-based data to the cell space of 25 km × 25 km. Comparison between the agricultural area reported by census data and that measured by remote sensing showed disagreements in total area (INPE, 2005). The total agricultural area for each municipality was therefore taken from the remote sensing survey, and the proportion of each agricultural land-use category was taken from the census. The conversion process assumed that the proportion of land-use types is uniformly distributed over the deforested areas of the municipality. Fig. 3 presents the resulting pasture, temporary agriculture and permanent agriculture patterns.
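The conversion rule described above can be illustrated with a minimal sketch. It assumes, for simplicity, a cell that lies entirely within one municipality and a census breakdown that covers all uses of the deforested area; the function name, the "other" category and the numbers are hypothetical and serve only as an illustration, not as the procedure actually implemented for the database.

```python
def allocate_land_use(cell_deforested_fraction, census_areas):
    """Split a cell's deforested fraction among land-use categories.

    cell_deforested_fraction: share of the cell mapped as deforested
        (from the remote sensing survey).
    census_areas: land-use name -> area reported by the census for the
        municipality containing the cell (any consistent unit).
    Assumes land-use proportions are uniform over the deforested area
    of the municipality.
    """
    total = sum(census_areas.values())
    if total <= 0:
        return {name: 0.0 for name in census_areas}
    return {
        name: cell_deforested_fraction * area / total
        for name, area in census_areas.items()
    }


# Hypothetical municipality (areas in hectares, illustrative values only).
census = {"pasture": 7000.0, "temporary": 1300.0,
          "permanent": 300.0, "other": 1400.0}
shares = allocate_land_use(0.40, census)  # 40% of the cell is deforested
```

In this example the pasture share of the cell is 0.40 × 0.70, and the category shares always add up to the remotely sensed deforested fraction, which is how the remote sensing total takes precedence over the census total.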
As Fig. 3 shows, pasture is spread over the whole deforested area, being the major land use in 1996/1997. It covers approximately 70% of total deforested area, in agreement with the estimates presented by Margulis (2004). Temporary crops represent approximately 13% of the deforested area, and permanent crops approximately 3% of the deforested area. Agricultural patterns are considerably more concentrated than pasture. Table 2 presents some quantitative indicators of the heterogeneity of distribution of the three land-use patterns across the region, considering different Federative States.
As shown in Table 2 [...] Maranhão states. The state of Mato Grosso and the areas along the main rivers in the Amazonas state also present a significant area proportion of the temporary agriculture pattern. The temporary agriculture class we adopted encompasses around 80 types of temporary crops, and includes both subsistence and capitalized agriculture. According to the 1996 IBGE census information (IBGE, 1996), the temporary agriculture pattern seen in the south border of Mato Grosso is already related to the expansion of capitalized agriculture (especially soybeans) into forest areas (Becker, 2001). On the other hand, in old occupation areas such as the northeast of Pará and Maranhão, and also in some municipalities in the north of Mato Grosso, the agrarian structure is dominated by small holders. According to the IBGE database (IBGE, 1996), the dominant temporary crops in 1996 were manioc and corn. Permanent crops occupy a smaller area than the other two land uses, concentrated in the old occupation areas of the northeast of Pará state and along the Amazon River, and in Rondônia, where most occupation is related to official settlement projects (Becker, 2005). These specific characteristics of the distribution of the temporary and permanent agriculture patterns reinforced the need to include agrarian structure indicators in our regression analysis, as discussed in the next section.

Spatial database of potential determinants
The spatially explicit database is organized as a cellular space of 25 km × 25 km. It includes 50 environmental and socioeconomic variables that could potentially explain macro and intra-regional differences in land use. The complete list of variables is in Appendix A. Dependent variables are those associated with land use (deforestation, pasture, temporary and permanent agriculture). The potential explanatory variables were grouped into seven types:
• Accessibility to markets: distance to roads, rivers and urban centers, connection to national markets and ports, derived from IBGE (Brazilian Institute for Geography and Statistics) cartographic maps.
• Economic attractiveness: capacity to attract new occupation areas, measured as distance to timber-production facilities and to mineral deposits. Timber-production facility data were provided by IBAMA (Brazilian Institute of Environment and Natural Resources) and mineral deposit data by CPRM (Brazilian Geological Service).
• Agrarian structure: land distribution indicators, indicating the proportion (in terms of number of properties and in terms of area inside the municipality) of small (<200 ha), medium (200-1000 ha) and large (>1000 ha) farms. These measures use the IBGE (1996) [...]
The measures of accessibility to markets include the connections to markets and ports. These variables deserved special attention. According to Becker (2001), road building has considerably modified the pattern of connectivity in Amazonia. Until the 1960s, the main connections were the Amazonas river and its main tributaries; after the road building of the last decades of the 20th century, the importance of such connections was largely supplanted by transversal connections of roads crossing the valleys of the main tributary rivers. As Becker (2001) states: "connection distance and time were reduced from weeks to hours". For our analysis, we computed connectivity indicators for each cell.
We measured the minimum path distance through the road network from each cell to national markets and to ports. The connectivity indicator for each cell was taken as inversely proportional to this minimum path distance. We distinguished paved from non-paved roads (non-paved roads are assumed to double the distances). These measures were computed using the generalized proximity matrix (GPM), described in Aguiar et al. (2003). The GPM is an extension of the spatial weights matrix used in many spatial analysis methods (Bailey and Gatrell, 1995), in which the spatial relations are computed taking into account not only absolute space relations (such as Euclidean distance), but also relative space relations (such as topological connection on a network). Currently, most spatial data structures and spatial analytical methods used in GIS, and also in LUCC modeling, embody the notion of space as a set of absolute locations in a Cartesian coordinate system, thus failing to incorporate spatial relations that depend on topological connections and fluxes between physical or virtual networks. Our connection measures are an attempt to combine both when assessing land-use determining factors. As pointed out by Verburg et al. (2004), understanding the role of networks is essential to understanding land-use structure, and is considered a LUCC research priority.
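The connectivity measure can be sketched as a shortest-path computation in which non-paved segments count double. The sketch below uses Dijkstra's algorithm on a toy road graph; it is an illustration under our own assumptions (an undirected graph, hypothetical node names and segment lengths, and a proportionality constant of 1), and it does not reproduce the GPM implementation of Aguiar et al. (2003).

```python
import heapq


def network_distance(edges, source, targets):
    """Minimum path distance from source to the nearest target node.

    edges: iterable of (node_a, node_b, length_km, is_paved).
    Non-paved segments are counted at twice their length.
    """
    graph = {}
    for a, b, length_km, is_paved in edges:
        cost = length_km if is_paved else 2.0 * length_km
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node in targets:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, cost in graph.get(node, []):
            candidate = dist + cost
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return float("inf")  # no route to any target


def connectivity(distance_km):
    """Indicator inversely proportional to distance (constant assumed = 1)."""
    if distance_km <= 0:
        raise ValueError("distance must be positive")
    return 0.0 if distance_km == float("inf") else 1.0 / distance_km


# Hypothetical network: a paved route of 300 km and a shorter unpaved
# route of 170 km that counts as 340 km.
roads = [
    ("cell", "A", 100.0, True), ("A", "market", 200.0, True),
    ("cell", "B", 50.0, False), ("B", "market", 120.0, False),
]
d_market = network_distance(roads, "cell", {"market"})
c_market = connectivity(d_market)
```

Here the paved route is preferred even though it is physically longer, which is the effect the doubling rule is meant to capture.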
Other measures of accessibility to markets include distances to roads, rivers and urban centers. The distance to roads measure uses the minimum Euclidean distance from each cell to the nearest road. Distances from each cell to urban centers, and rivers were measured in the same way.
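For illustration, the Euclidean measure can be sketched as follows, assuming that each cell is represented by its centroid and each road, river or urban center by a set of sampled points in projected coordinates (km). Both representations and the function name are our assumptions for the example.

```python
import math


def min_euclidean_distance(cell_xy, feature_points):
    """Distance from a cell centroid to the nearest sampled feature point."""
    cx, cy = cell_xy
    return min(math.hypot(cx - fx, cy - fy) for fx, fy in feature_points)


# Hypothetical road vertices and two cell centroids (coordinates in km).
road_points = [(0.0, 10.0), (30.0, 0.0)]
d_first = min_euclidean_distance((0.0, 0.0), road_points)
d_second = min_euclidean_distance((30.0, 40.0), road_points)
```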
The agrarian structure indicators are based on municipality-level information. The percentage of small, medium and large farms in area was computed in relation to the total area of farms inside the municipality. It disregards non-farm areas inside the municipality, such as protected areas or land owned by the Federal government. Thus, the small, medium and large categories sum to 100%. Alternative variables were also computed, giving the proportion of the number of small, medium and large farms in relation to the total number of farms in the municipality. These six variables are indicators of the dominance of a certain type of actor in a certain region. As the variables are highly correlated, we chose to use the small farms area proportion in our analysis. Demographic, technological and settlement variables are also derived from municipality-level data. Variable values in the 25 km × 25 km cells were computed by taking the average of the corresponding values in each municipality (e.g., number of settled families), weighted by the area of intersection between the municipalities and the cell.
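The area-weighted conversion from municipalities to cells can be sketched as below. The sketch assumes that the intersection areas have already been computed in a GIS and that the weights are normalized by the total intersected area; the function name and the values are hypothetical.

```python
def area_weighted_value(intersections):
    """Area-weighted average of municipal values for one cell.

    intersections: list of (municipal_value, intersection_area_km2), one
    entry per municipality overlapping the cell.
    Returns None when the cell intersects no municipality.
    """
    total_area = sum(area for _, area in intersections)
    if total_area == 0:
        return None
    return sum(value * area for value, area in intersections) / total_area


# Hypothetical 625 km2 cell split between two municipalities.
cell_value = area_weighted_value([(120.0, 400.0), (30.0, 225.0)])
```

In this example the cell receives (120 × 400 + 30 × 225) / 625 = 87.6, closer to the value of the municipality that covers most of the cell.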
The measure of environmental protection areas uses the percentage of each cell that intersects a protected area. Soil variables use a fertility classification based on the IBGE soils map that considers soil type, morphology, texture and drainage information. Based on this classification, we grouped the soils into three categories: fertile soils, non-fertile soils and wetland soils. The soil variables considered in our analysis represent the proportion of each of these categories in the 25 km × 25 km cells.
Climate data use monthly averages of precipitation, humidity and temperature from 1961 to 1990, on a grid with a spacing of 0.25° of latitude and longitude. Since the three indices were highly correlated, we chose to work with humidity, which has a higher correlation with deforestation than the other two climatic variables. The humidity data were converted into 25 km × 25 km cells by computing the intensity of the dry season in each cell. The dry season does not occur in the same period in each cell, and varies from June-July-August in the state of Mato Grosso to November-December-January in the state of Roraima. The climate indicator for each cell is a measure that accounts for these differences, by taking the average of the three driest consecutive months in each cell.
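The dry season indicator can be sketched as a search over the twelve possible three-month windows, allowing windows that wrap from December to January. The function name and the humidity values below are hypothetical and serve only to illustrate the calculation for one cell.

```python
def dry_season_intensity(monthly_humidity):
    """Mean humidity of the driest run of three consecutive months.

    monthly_humidity: 12 monthly averages, January first.
    Returns (mean_of_driest_window, index_of_first_month), with windows
    allowed to wrap around the end of the year.
    """
    if len(monthly_humidity) != 12:
        raise ValueError("expected 12 monthly values")
    windows = [
        sum(monthly_humidity[(start + k) % 12] for k in range(3)) / 3.0
        for start in range(12)
    ]
    driest_start = min(range(12), key=windows.__getitem__)
    return windows[driest_start], driest_start


# Hypothetical cell with a November-December-January dry season.
humidity = [60.0, 75.0, 80.0, 85.0, 88.0, 90.0,
            90.0, 88.0, 85.0, 80.0, 66.0, 63.0]
intensity, first_month = dry_season_intensity(humidity)
```

Because the window is chosen cell by cell, a cell in Mato Grosso and a cell in Roraima are compared on the intensity of their own dry seasons rather than on a fixed calendar period.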

Exploratory analysis and selection of variables
An initial exploratory statistical analysis showed that some of the relationships between potential explanatory variables and the land-use variables were not linear. We applied a logarithmic transformation to the land-use variables and to some explanatory variables. The log transformation improved the regression results significantly. This improvement suggests that the explanatory variables are related to the initial choice of areas to be occupied. After the initial choice, land-use change behaves as a spatial diffusion process because deforestation tends to occur close to previously deforested areas (Alves, 2002). There was a high degree of correlation among potential explanatory factors. When choosing between highly correlated variables, we preferred those related to public policies of infrastructure (accessibility) and conservation (protected areas), to support the next step of this work, which aims at LUCC dynamic modeling and policy scenario analysis. Within the same category, alternative possibilities were tested. For instance, out of the many environmental variables, we chose the average humidity in the drier months. The final choice of explanatory variables for regression analysis does not include demographic or technological factors, which are captured indirectly by other variables. As a result, the statistical analysis used only a representative subset of all variables, as shown in Table 3. This subset was selected to cover the broadest possible range of categories, while minimizing correlation problems.
Even in the subset of variables presented above, there was still a high degree of correlation, which varied across the spatial partitions. We decided to build different spatial regression models, where each model had potentially explanatory variables with less than 50% correlation between them. To build the regression models, we selected as primary variables those with potentially greater explanatory power in relation to deforestation: distance to urban centers, distance to roads, climatic conditions and connection to markets. Then we tested these four variables for correlation to select the leading variables for each model. Distance to urban centers and distance to roads were correlated in all spatial partitions, except in the Occidental one. Distance to roads and connection to national markets could not be placed in the same subgroup for the whole Amazon. Climatic conditions and connection to national markets were also highly correlated, except in the central region. This cross-correlation analysis between the potentially explanatory variables led to the models summarized in Table 4. An automatic linear forward stepwise regression was applied to refine the models and discard non-significant variables. Some variables were found to be significant in some of the models and non-significant in others, as shown in Table 4 (n/s: non-significant variables discarded in the automatic forward stepwise procedure). The resulting models are:
• Amazonia: for the whole region, we considered three models: one including distance to urban centers and connection to markets (urban + connection), one including distance to urban centers and climatic conditions (urban + climate), and a third one including distance to roads and climatic conditions (roads + climate).
• Densely Populated Arch: for this region, we considered two models. The first is led by distance to urban centers and connection to markets (urban + connection) and the second includes distance to roads and connection to markets (roads + connection).
• Central Amazonia: for this region, we considered two models. The first is led by distance to urban centers and connection to markets (urban + connection) and the second includes distance to roads and connection to markets (roads + connection).
• Occidental Amazonia: for this region, we considered a single model that includes distance to urban centers, distance to roads, and connection to markets (urban + roads + connection).
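The correlation screening that precedes model building can be sketched as follows. The sketch computes Pearson correlations between candidate leading variables and keeps only the pairs whose absolute correlation is below 0.5; the variable names and values are hypothetical, and the stepwise refinement itself is not reproduced here.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def compatible_pairs(variables, threshold=0.5):
    """Pairs of variable names with |r| below the threshold."""
    return [
        (a, b)
        for a, b in combinations(sorted(variables), 2)
        if abs(pearson(variables[a], variables[b])) < threshold
    ]


# Hypothetical cell values for three candidate leading variables.
candidates = {
    "dist_urban": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "dist_roads": [1.1, 2.3, 2.9, 4.2, 5.1, 5.8],
    "climate": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],
}
pairs = compatible_pairs(candidates)
```

In this toy data set the two distance variables are almost collinear, so they cannot lead the same model, whereas each of them can be combined with the climate variable. Running the same screening separately for each spatial partition is what produces a different set of admissible models per region.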

Spatial regression modeling
We used spatial regression models to establish the relative importance of the determining factors for different land uses. One of the basic hypotheses in linear regression models is that observations are not correlated, and consequently the residuals of the models are not correlated either. In land-use data, this hypothesis frequently does not hold: land-use data tend to be spatially autocorrelated, because land-use changes in one area tend to propagate to neighboring regions. This work applies a spatial lag regression model (Anselin, 2001) to assess the relative importance of potential explanatory factors. In this method, the spatial structure is assumed to be captured in one parameter. The linear regression model can be written as

$$Y = X\beta + \varepsilon$$

where $Y$ is an ($n \times 1$) vector of observations on a dependent variable taken at each of $n$ locations, $X$ is the ($n \times k$) matrix of exogenous variables, $\beta$ is the ($k \times 1$) vector of parameters, and $\varepsilon$ is the ($n \times 1$) vector of disturbances. The spatial lag model adds a spatial dependence term that incorporates the spatial autocorrelation as part of the explanatory component of the model:

$$Y = \rho W Y + X\beta + \varepsilon$$

where $W$ is the spatial weights matrix, the product $WY$ expresses the spatial dependence on $Y$, and $\rho$ is the spatial autoregressive coefficient. The spatial autoregressive lag model aims at exploring the global patterns of spatial autocorrelation in the data set. This spatial model considers that the spatial process whose observations are being analyzed is stationary. This implies that the spatial autocorrelation patterns can be captured in a single regression term. This method was employed by Overmars et al. (2003) in a study in Ecuador. In the Brazilian Amazon, Perz and Skole (2003) used a spatial lag model, focusing on social factors related to secondary vegetation.
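To make the spatially lagged term WY concrete, the sketch below computes it for a small regular grid of cells. It assumes rook contiguity (cells sharing an edge) and a row-standardized W, so that each entry of WY is the mean of the dependent variable over a cell's neighbors. Both choices, the function names and the grid values are illustrative assumptions, since the weights specification is not given in this section.

```python
def rook_neighbors(rows, cols):
    """Map each cell index (row-major) to the indices of cells sharing an edge."""
    nbrs = {}
    for r in range(rows):
        for c in range(cols):
            idx = r * cols + c
            candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            nbrs[idx] = [rr * cols + cc for rr, cc in candidates
                         if 0 <= rr < rows and 0 <= cc < cols]
    return nbrs

def spatial_lag(values, nbrs):
    """Row-standardized W times y: the mean of y over each cell's neighbors."""
    return [sum(values[j] for j in nbrs[i]) / len(nbrs[i])
            for i in range(len(values))]

# Hypothetical 3 x 3 grid of log deforestation percentages, row-major order
log_def = [0.0, 1.0, 2.0,
           1.0, 2.0, 3.0,
           2.0, 3.0, 4.0]
w_log_def = spatial_lag(log_def, rook_neighbors(3, 3))
```

The resulting vector plays the role of the additional regressor in the spatial lag model. Its coefficient measures how strongly a cell's value follows the average of its neighbors.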
In this work, we compare the results of the spatial lag models with those of a non-spatial linear regression model for the whole Amazonia. This helps to understand how explanatory factors contribute to spatial dependence in this case. This is also the basis for the analysis of how the different methods could be used in LUCC dynamic modeling.
These results will be presented in the next section. In order to compare the models, we will present the R² value (coefficient of multiple determination) and the Akaike information criterion (AIC). As stated by Anselin (2001), the R² value is not a reliable indicator of goodness of fit when the data are spatially autocorrelated. The Akaike information criterion (Akaike, 1974) is a more suitable performance measure than the R² value for spatially correlated data. The model with the highest AIC absolute value is the best. To compare the relative importance of the determining factors in each model, the standardized regression coefficients (beta) and the associated significance level (p-level) for each variable will be presented.
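The two comparison measures can be illustrated with a short sketch. Under the usual definition, AIC = 2k − 2 ln L, where k is the number of estimated parameters and L the maximized likelihood, and smaller values indicate a better model. When the values are negative, as in the hypothetical numbers below, the smallest AIC is also the one with the largest absolute value. The standardized coefficient is obtained by rescaling a raw slope by the ratio of sample standard deviations. All numbers and function names are assumptions for illustration only.

```python
from math import sqrt

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood

def sample_sd(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    m = sum(values) / len(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def standardized_beta(raw_slope, x, y):
    """Rescale a raw regression slope to standard-deviation units."""
    return raw_slope * sample_sd(x) / sample_sd(y)

# Hypothetical log-likelihoods: the spatial lag model has one extra parameter
aic_linear = aic(log_likelihood=100.0, n_params=4)
aic_lag = aic(log_likelihood=120.0, n_params=5)
better = "spatial lag" if aic_lag < aic_linear else "linear"

# Hypothetical data lying exactly on y = 2x, so the standardized slope is 1
beta = standardized_beta(2.0, [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

Because standardized coefficients are unit-free, they allow variables measured on different scales (distances, percentages, counts) to be ranked by relative importance within a model.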

Results and discussion
This section summarizes our main findings, organized as follows. Section 4.1 presents the deforestation determining factors for the whole Amazonia. It compares the results obtained by linear regression with those of spatial regression. The comparison shows how determinants change their importance when spatial autocorrelation is considered, and what this indicates in terms of spatial dependence and land-use structure. Section 4.2 presents a comparison of deforestation factors across the four spatial partitions (Amazonia, Densely Populated Arch, Central and Occidental macro-zones), using spatial regression models. Section 4.3 presents a comparison of the main land-use (pasture, temporary and permanent agriculture) determinants, also using spatial regression models. The results of pasture and agriculture determinants are presented only for the Arch macro-zone, where occupation is more consolidated. Appendix B shows the spatial distribution of the most important factors analyzed in the next sections.

Deforestation factors in the whole Amazonia
In this section, we present and discuss regression models for the whole Amazonia. A pre-processing step kept in the models only variables less than 50% correlated with each other, and eliminated those found non-significant by an automatic forward stepwise procedure (see Table 4). The three models we compare are: urban + connection, urban + climate and roads + climate. Table 5 presents the statistical analysis results for the three models and compares the non-spatial linear regression model with the spatial lag model, where the dependent variable is the log percentage of deforestation for each 25 km × 25 km cell. The spatial lag model includes one additional variable (w log def) that measures the extent of spatial autocorrelation in the deforestation process. In Table 5, we present the R² value (coefficient of multiple determination) and the Akaike information criterion for all models. On both indicators, the spatial regression models performed better than the non-spatial linear model. The spatial coefficient of the spatial lag models is significant and higher than 0.70 in all models. This is quantitative evidence that corroborates earlier assessments that deforestation is a diffusive process in the Amazon, and tends to occur close to previously opened areas (Alves, 2002). The other variables found to be important (with higher betas) are distance to urban centers (log), distance to roads (log), connection to markets, humidity and protected areas.
We also compared the strength of the most important factors considering the linear regression model and the spatial lag model, as shown in Table 6. The table groups the distance to urban centers and distance to roads variables, which are highly correlated, and then the connection to markets and climate variables, also highly correlated. As expected, using the spatial lag regression model, all betas get lower, but not in a uniform way. When considering the intrinsic spatial dependence of deforestation, the 'connection to markets' variable (and the climate one) decreases proportionally more than the others, although it is still one of the main factors. Therefore, these variables carry a large part of the spatial dependence. This corroborates earlier assessments (Alves, 2002) that showed that deforestation tends to occur along roads that provide an easier connection to the more developed areas in Brazil. These areas also present the driest climate in the Amazon, with more favorable conditions for agriculture (and also for infrastructure construction and maintenance) than the more humid areas in western Amazonia, in accordance with previous results (Schneider et al., 2000). Our statistical results indicate that these factors (the diffusive nature of deforestation, distance to roads and to urban centers, climate and connection to markets), and the interaction among them, contributed significantly to the pattern of deforestation in 1996/1997. The existence of protected areas also plays an important role in avoiding deforestation in high-pressure areas, as will be further discussed in the next section.
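The statement that some betas decrease proportionally more than others can be expressed as a relative drop between the two models. The sketch below uses purely hypothetical coefficient values (they are not the values of Table 6) and a hypothetical helper name. It only illustrates how the comparison could be computed.

```python
def relative_drop(beta_linear, beta_lag):
    """Share of the coefficient's magnitude lost when moving from the
    non-spatial linear model to the spatial lag model."""
    return (abs(beta_linear) - abs(beta_lag)) / abs(beta_linear)

# Hypothetical standardized coefficients: (linear model, spatial lag model)
betas = {
    "dist_urban": (-0.40, -0.20),
    "connection_markets": (0.30, 0.06),
}
drops = {name: relative_drop(lin, lag) for name, (lin, lag) in betas.items()}
largest = max(drops, key=drops.get)
```

A variable with a larger relative drop shares more of its explanatory power with the spatially lagged term. This is the sense in which it carries part of the spatial dependence.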
Previous studies of the causes of land-use change in Amazonia tended to emphasize distance to roads as the main determinant (Kirby et al., 2006; Laurance et al., 2002). The results from this paper indicate that distance to urban centers is as important as distance to roads as a determining factor for land-use change. Distance to urban centers is a population indicator and also a proxy for local markets. In 1996, 61% of the approximately 20 million people in Amazonia lived in urban areas; in 2000, the share was 69% of the total population (Becker, 2004). Urban population grows faster in Amazonia than in other parts of Brazil, not only in the larger cities but also in those with less than 100,000 people (Becker, 2001). Faminow (1997) showed that the local demand for cattle products such as beef and milk is an overlooked cause of the increase in cattle production and, consequently, of deforestation. Our results reinforce the need to further understand the relationship between land-use change and this process of urban population growth in Amazonia.
In summary, our results indicate that the strong, spatially concentrated pattern of deforestation in Amazonia is related to the diffusive nature of the land-use change process. The concentration of this pattern in the southern and eastern parts of Amazonia is related to proximity to urban centers and roads, reinforced by the higher connectivity to the more developed parts of Brazil, and more favorable climatic conditions in comparison to the rest of the region. Therefore, more favorable production conditions in terms of climate, connection to national markets, and proximity to local markets seem to be the key factors in explaining the deforestation process.

Comparison of deforestation determining factors across space partitions
In this section, we present and discuss the regression models for three spatial partitions: Densely Populated Arch, Central and Occidental Amazonia. For each space partition, two alternative models were considered, one including the 'distance to urban centers' variable, and one with the 'distance to roads' variable (except in the Occidental partition, where they were allowed to be in the same model). A pre-processing step kept in the models only variables less than 50% correlated with each other, and eliminated those found non-significant by an automatic forward stepwise procedure (see Table 4). The following models are compared: urban + climate (Arch), roads + connection (Arch), urban + climate + connection (Central), roads + climate + connection (Central) and urban + roads (Occidental). Table 7 presents the statistical analysis results for these models, including the R² and the Akaike information criterion. Both criteria indicate that the Arch models are the best fit. The spatial autoregressive coefficient (w log def) is significant and higher than 0.67 in all models of the Arch and Central regions. In the Occidental region, it is also significant, but presents a lower value (0.54), indicating a less marked spatial pattern. The Occidental region is still quite undisturbed, except for the areas close to the main rivers and around Manaus. As stated by Becker (2001), Amazonia presents regions with different speeds of modification. The lower spatial dependence is an indicator that occupied areas in the Occidental region do not spread to the neighboring cells at the same pace as the ones in the main axes of development in the Arch and Central regions.
The other variables found to be important (with higher betas), or that present some relevant variation among the spatial partitions, are: distance to urban centers (log), distance to roads (log), protected areas, connection to markets, connection to ports, distance to large rivers, soil fertility, number of settled families, and agrarian structure. Fig. 4 illustrates graphically the most important differences found among these factors. The first main difference is the relatively higher value of the protected areas variable (percent of all types of protected areas in each cell, including Indigenous Lands and Federal and State Conservation Units). In the Arch, it is the second most important factor (after the spatial autocorrelation coefficient), preceding distance to roads and distance to urban centers. Indigenous lands and conservation units correspond, respectively, to 22% and 6% of the Amazon region (Becker, 2001), spread over the region (see Fig. 2). Our results indicate quantitatively that protected areas can be important instruments in avoiding deforestation in high-pressure areas such as the Arch. This is in accordance with earlier results showing that protected areas are in general effective in restraining deforestation, even if some level of deforestation is found inside them (Ferreira and Almeida, 2005). Their efficacy depends on the clear demarcation of their limits, on the socio-economic context in which they are created, and on appropriate monitoring and control measures, as discussed by Ribeiro et al. (2005).
Distance to roads and distance to urban centers are not the most important determinants in all macro-regions. Also, they do not explain intra-regional differences, as they are both similarly important in all macro-zones, except in the Occidental macro-zone, where distance to urban centers is considerably more important. In the Occidental macro-zone, distance to large rivers also plays an important role. This result is consistent with the small disturbance of the area, concentrated mostly in Manaus and close to the rivers.
On the other hand, connection measures (connection to markets and connection to ports) play different roles across the partitions. Connection to markets is important in explaining Arch deforestation patterns, but not in the other macro-regions. In the central macro-region it loses significance in one of the models, when distance to roads is also used. Connection to ports is important only in the central region, whose historical occupation process is related to the rivers. Climate (intensity of dry season) is also important in explaining deforestation in the Arch and central partitions.
In the central spatial partition, the climate variable did not present correlation to the connection to markets variable, and both could be placed in the same regression model. In the Arch, climate and connection to markets are correlated, and were analyzed in different models, both presenting significant coefficient values. This indicates that both factors created favorable conditions to occupation in the eastern part of the Amazon.
The differences between the models for the Arch and the central regions are important. They point to an occupation process in the Arch that uses roads as its main connections. In the Arch, the existence of protected areas is the main factor that is statistically significant as an impediment to deforestation. A second deterrent is unfavorable climatic conditions, in areas where the dry season is less intense. Since the area in the south of the Arch (see Fig. 1 and Appendix B) still has a considerable extent of primary forest outside protected areas, is close to the mechanized agriculture belt in the south of Mato Grosso, and also benefits from a drier climate, the creation of protected areas in that region would be an important factor in deterring the deforestation process.
In the central region, due to its historical occupation process, connection to national markets is not significant in one of the models. There is a stronger influence of river connections (variables distance to rivers and connection to ports). The central region is currently the most vulnerable region, where new frontiers are located (Becker, 2004). As the agricultural production systems of the newly occupied areas in the central region become stronger, these statistical relationships will be modified to reflect the new reality, but will not necessarily replicate the Arch relationships. For instance, connection to ports may continue to be important in the central region due to the presence of export ports on the Amazon River, but road connection to the rest of the country may also gain importance, linking productive areas to their markets. In relation to protected areas, the statistical relationship was not as strong as in the Arch in the period of analysis. However, the creation of protected areas in the central region, in appropriate socio-economic contexts, would also be an important instrument for the conservation of areas that may become threatened by the new frontiers.
In the next paragraphs, we discuss results related to other significant variables: soil fertility, number of settled families and agrarian structure indicators. The soil fertility indicator (percentage of fertile soils in each cell) has a positive relationship to deforestation in the Arch and in the whole Amazonia models. Comparing the deforestation patterns and the patterns of medium and high fertility soils in the 25 km × 25 km cell space shown in Appendix B, one can notice the existence of better quality soils in Rondônia and along the Transamazônica, where most colonization programs were placed. Better soils are also found in Mato Grosso. The Federal Government took existing soil surveys into consideration when planning the development projects and colonization settlements of the 1970s and 1980s (the RADAM project in the 1970s mapped vegetation, soils, geology and geomorphology).
As expected, the number of families settled by official colonization programs (accumulated from 1970 to 1999) has a positive and significant relationship with deforestation in the Arch and central regions (and also in the whole Amazonia, as Table 5 shows). On the other hand, the agrarian structure indicator (percentage in area of farms smaller than 200 ha) is also significant in the Arch, but has a negative sign, indicating that deforestation is more associated with areas with a greater proportion of medium and large farms than with areas occupied by small farms. This relationship is also significant in the whole Amazonia.
Many authors have presented diverse estimates of the share of small and large farmers in relation to deforestation (for instance, Fearnside, 1993; Walker et al., 2000). As stated by Walker et al. (2000) and Margulis (2004), the relative importance of small, medium and large farms for deforestation varies from one region to another, as the dynamics of deforestation are very distinct in different localities. However, most previous works show that, when considering the overall deforestation extent in the Amazon, a more significant impact is caused by large farms (Margulis, 2004). Our results provide further evidence that areas occupied by large and medium farms have a higher impact on deforestation than areas occupied by small farms, when the whole Arch macro-zone is analyzed. This can be explained by the relative contribution of Pará, Tocantins and Mato Grosso states. As Fig. 5 illustrates, small farm areas are concentrated in Rondônia, northeast of Pará and Maranhão. In most of the Arch area, the agrarian structure is predominantly one of medium and large farms. For instance, in Mato Grosso the mean value of the agrarian structure indicator is 0.07 (0.07 standard deviation), meaning that on average only 7% of the farm land is occupied by properties of less than 200 ha.
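The agrarian structure indicator (share of farm area held by farms smaller than 200 ha) can be illustrated with a minimal sketch. The per-farm list below is a hypothetical simplification used only to show the arithmetic. The function name and the sample areas are assumptions, and the actual census data may be tabulated differently.

```python
def small_farm_area_share(farm_areas_ha, threshold_ha=200.0):
    """Fraction of total farm area occupied by farms smaller than threshold_ha."""
    total = sum(farm_areas_ha)
    if total == 0:
        return 0.0  # no farm land in the cell
    small = sum(area for area in farm_areas_ha if area < threshold_ha)
    return small / total

# Hypothetical cell: two small farms and two large farms (areas in hectares)
share = small_farm_area_share([50.0, 150.0, 800.0, 1000.0])
```

In this example the small farms hold 200 ha out of 2000 ha, giving an indicator of 0.10. A value such as the 0.07 reported for Mato Grosso is read in the same way.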

Comparison of land-use determining factors in the Arch partition
This section presents and discusses the results of the spatial lag models for the Arch partition, in which the dependent variables are the log percentage of pasture, temporary agriculture and permanent agriculture in each 25 km × 25 km cell. For each of these three types of land use, we consider two alternative models, one including the 'distance to urban centers' variable (urban + climate model), and one with the 'distance to roads' variable (roads + connection), as summarized in Table 4. Table 8 presents the statistical analysis results for the six models. The R² and the Akaike information criterion are presented as measures of goodness of fit to compare the models. All indices are similar, but temporary agriculture models perform slightly better according to the log likelihood. The spatial autoregressive coefficient of the spatial lag models is significant and higher than 0.70 in all models, with the highest values in the permanent agriculture models (above 0.80), indicating a stronger clustering of this land use (see Fig. 2). The other relevant factors that will be analyzed in this section are: distance to urban centers (log), distance to roads (log), protected areas, connection to markets and agrarian structure. Fig. 6 illustrates graphically the most important differences found among these factors.
As with deforestation in the Arch macro-region, protected areas, distance to roads and distance to urban centers are the most important variables in explaining the distribution of land-use patterns. Connection to markets is significant for temporary agriculture and pasture, but not for permanent agriculture. The main difference is the sign of the agrarian structure variable (percentage in area of farms smaller than 200 ha). The beta value for the agrarian structure is positive in both the temporary agriculture and permanent agriculture models. In the pasture model, the beta is negative.
Pasture is spread over the region (see Fig. 3), and its determining factors are very similar to those of deforestation, discussed in the previous section. Our results indicate that medium and large farms have a larger proportion of pasture areas when considering the whole Arch extent. The relative share of small, medium and large farms in terms of pasture area varies across localities. Rondônia, for instance, has a significant pasture area (see Table 2), and an agrarian structure related to small farmers. The negative sign our model captures is related to the proportionally larger area of Mato Grosso and Pará States, in which the agrarian structure is predominantly one of large farms.
On the other hand, temporary and permanent agriculture present differentiated and concentrated patterns, as discussed in Section 3.2. Our results indicate a tendency for temporary and permanent agriculture to occupy areas associated with small farms, when considering the whole Arch, in our period of analysis. Permanent crops are present in northeastern Pará, Rondônia and along the Amazon River. These three areas have a land structure related mostly to small properties, which explains the positive sign in the permanent agriculture model. In the temporary agriculture model, the positive sign can be explained by the fact that the temporary agriculture practiced in Pará and Maranhão by small farmers occupies a larger area than the mechanized agriculture found in the south of Mato Grosso (see Table 2). This statistical relationship may change with the expansion of mechanized agriculture into forest areas (Becker, 2005), which requires large tracts of flat land and is practiced by capitalized actors. Even so, our results indicate the existence of a land-use system based on temporary agriculture practiced by small farms, especially in old occupation areas such as Maranhão and northeast Pará.
The land-use pattern analysis we conducted provides further evidence of the heterogeneity of the region, both in terms of agrarian structure and of the land-use trajectories adopted in different localities. For instance, both Rondônia and the northeastern part of Pará State have a dominance of small farms. However, in Rondônia temporary crops are not as significant as in northeastern Pará. On the other hand, there is a significant pattern of permanent crops in Rondônia. Soybean expansion may change the statistical relationship with the agrarian structure that we obtained for temporary crops, but not the fact that these other land-use systems exist, and that effective policy action should take this heterogeneity into consideration.

Spatial regression and dynamic modeling
One of the basic hypotheses in linear regression models is that observations are not correlated, and consequently the residuals of the models are not correlated either. In land-use data, this hypothesis is usually not true. Land-use data tend to be spatially autocorrelated, as land-use changes in one area tend to propagate to neighboring regions. Spatial dependence could be seen as a methodological disadvantage, as it interferes with linear regression results, but on the other hand it is exactly what gives us information on spatial pattern, structure and process (Overmars et al., 2003). In Section 4.1, we compared the results of the spatial lag models with those of a non-spatial linear regression model for the whole Amazonia to understand how explanatory factors contribute to spatial dependence. Results show that the spatial coefficient of the spatial lag models is significant and higher than 0.70 in all models, quantitative evidence that corroborates earlier assessments that deforestation is a diffusive process in the Amazon, and tends to occur close to previously opened areas (Alves, 2002). Results also show that, when using the spatial lag regression model, the coefficients of the determining factors in the regression equation get lower, but not in a uniform way. Connectivity to markets and climate factors carry a larger part of the spatial dependence, and reinforce the diffusive pattern of deforestation.
One of the goals of empirically quantifying the relationships between land-use patterns and determining factors is to feed dynamic LUCC models. Our results indicate that, in areas similar to Amazonia, with such spatially marked patterns, there is, however, a risk in using the spatial lag model for dynamic LUCC modeling. For instance, in the case of deforestation, the spatial autocorrelation parameter is related to the previous deforestation in the neighborhood. A model using the spatial lag coefficients would therefore tend to concentrate changes in previously occupied areas, not allowing new patterns to emerge. Thus, we consider it more appropriate to tie the diffusive aspect of deforestation to scenario-dependent variables such as connectivity to markets and distance to roads. New patterns could then emerge as connectivity characteristics change. Similar considerations are presented by Overmars et al. (2003).
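The feedback described above can be made explicit by solving the spatial lag model for Y. The rearrangement below is a standard one and is not taken from the paper. It writes the spatial autoregressive coefficient as ρ (a symbol assumed here) and assumes that W is row-standardized with |ρ| < 1, so that the inverse exists and the series converges.

```latex
\begin{aligned}
Y &= \rho W Y + X\beta + \varepsilon \\
(I - \rho W)\,Y &= X\beta + \varepsilon \\
Y &= (I - \rho W)^{-1}(X\beta + \varepsilon)
   = \left(I + \rho W + \rho^{2} W^{2} + \cdots\right)(X\beta + \varepsilon)
\end{aligned}
```

Each power of W brings in neighbors of increasingly higher order. With a coefficient above 0.70, the higher-order terms decay slowly, so the predicted value of a cell depends strongly on the existing pattern in its neighborhood, which is the concentration effect noted above.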

Amazonia intra-regional heterogeneity
We conducted the spatial lag regression analysis to explore intra-regional differences in the relative importance of land-use determining factors in the Amazon, based on a cellular database including several environmental, socio-economic and political potential factors. The quantitative results we obtained using this methodology corroborate the hypothesis of intra-regional heterogeneity as stated by Becker (2001): subregions with different speeds of change coexist in the Amazon, owing to the diversity of ecological, socio-economic, political and accessibility conditions. The use of spatial regression models also corroborated earlier assessments about the diffusive nature of land-use change in the Amazon (Alves, 2002), as shown by the high values of the autocorrelation coefficient in all models. Only in the Occidental region were the values slightly lower, indicating a less intense diffusive pattern and speed of change.
Our models show the significance of several of the potential determining factors, demonstrating that focusing on single factor analysis can be misleading. It is the interaction of many factors that can explain the land-use patterns in the Amazon. The relative importance of such factors varies from one region to another, and reveals the heterogeneity of the region in terms of patterns and speed of change. For instance, when only the Arch is analyzed, protected areas become the second most important factor, after the deforestation spatial dependence coefficient, preceding distance to roads and to urban centers, indicating how they play an important role in avoiding deforestation in high-pressure areas. On the other hand, distance to roads is an important factor in all space partitions. But our multi-factor analysis shows that the heterogeneous occupation patterns of the Amazon can only be explained when combining roads with other factors related to the organization of the productive systems in different regions, such as favorable environmental conditions and access to local and national markets. This provides further evidence that the implantation of roads and development poles in the 1970s was a first incentive to deforestation, but that deforestation remained higher in regions that established productive systems linked to the center, south and northeast of Brazil (Alves, 2001; Alves, 2002). The municipality of São Felix do Xingu, a current deforestation hot-spot, exemplifies this: it has led deforestation rates in recent years (INPE, 2005), although it is not served by a paved road. The land market plays an important role there, as does the lack of State presence, but the municipality also has a very well organized beef market chain.
Our agrarian structure and specific land-use analysis results reinforce the conclusions in relation to the importance of the productive systems, as they point out the heterogeneity of land-use systems adopted by different actors, and the influence of the agrarian structure on land-use pattern distribution across the region.
We conclude that incorporating this heterogeneity of factors, actors, land-use and productive systems is essential to a sound understanding of the land-use change process in the region, especially to support policy decisions appropriate for each subregion, rather than uniform and potentially misleading ones.