About the Maps
Maps depict model-predicted species distribution and provide information about relative habitat suitability based on current climate and landcover. The suitability ranking of any mapped grid cell is the sum of the probabilities of that grid cell and all other grid cells with equal or lower probability, multiplied by 100 to give a percentage. This value represents the percentage of grid cells with a lower suitability value within the boreal/hemiboreal study region. Higher-value pixels represent higher habitat suitability for a given species. Models do not account for physiographic barriers that may prevent colonization of otherwise suitable habitat, e.g., the Canadian Cordillera (Erskine 1977). Therefore, actual species distributions may be over-estimated in certain regions, particularly in Alaska.
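The ranking rule described above can be illustrated with a minimal sketch. It assumes a plain list of raw per-cell probabilities; the function name `cumulative_suitability` and the toy values are hypothetical and are not taken from the Maxent software or the report.

```python
def cumulative_suitability(raw):
    """For each cell, return 100 * the summed probability of that cell and
    every other cell whose probability is equal or lower."""
    total = sum(raw)
    # Normalise so the probabilities sum to 1 over the (toy) study region.
    probs = [p / total for p in raw]
    # Cell indices ordered from lowest to highest probability.
    order = sorted(range(len(probs)), key=lambda i: probs[i])
    result = [0.0] * len(probs)
    running = 0.0
    i = 0
    while i < len(order):
        # Accumulate a whole group of tied cells before assigning values,
        # so that cells with equal probability share one ranking.
        j = i
        while j < len(order) and probs[order[j]] == probs[order[i]]:
            running += probs[order[j]]
            j += 1
        for k in range(i, j):
            result[order[k]] = 100.0 * running
        i = j
    return result


# Four toy cells: the cell with the highest probability always scores 100.
print(cumulative_suitability([0.1, 0.4, 0.2, 0.3]))
```

In this toy case the cells receive values of approximately 10, 100, 30, and 60, so the ordering of the raw probabilities is preserved while the scale becomes a percentage.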
The maximum entropy (Maxent) method (Phillips et al. 2006, Phillips and Dudik 2008) was used to develop species distribution models (SDMs) for passerine species during the breeding season. Maxent is a powerful machine-learning algorithm with demonstrated high predictive accuracy compared to other SDM methods (Elith et al. 2006). Although Maxent was developed for presence-only data (e.g., museum records), it is also appropriate for datasets compiled from disparate sources with varying levels of effort, such that information on species absence varies across model units. The resulting predictions cannot be interpreted as probability of occurrence, but they are robust representations of rank-order suitability. The power of Maxent lies in the complexity of relationships (e.g., non-linear, threshold, multiplicative) that it can readily handle, producing detailed, high-accuracy predictions.
The key consideration in development of Maxent models, as with traditional resource selection functions (Manly 1993), is the selection of appropriate “background” data (Phillips et al. 2009). Otherwise, sample bias can lead to biased predictions. As recommended (Phillips et al. 2009), we constrained our background to all locations surveyed for birds. Due to high spatial aggregation of survey locations and the resulting potential for bias, we aggregated occurrence records at the level of 4-km grid cells corresponding with the resolution of our climate data. A species was considered present in a grid cell if at least one individual had been counted over all point-count surveys contained in the grid cell. Model background was thus defined as all surveyed 4-km grid cells (n = 29,059).
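The aggregation step can be sketched as follows. This is an illustration only: it assumes survey locations are given in a projected coordinate system in metres and that each survey carries a dictionary of species counts. The names `cell_id` and `aggregate_presence` and the species codes are hypothetical.

```python
import math

CELL_SIZE_M = 4000  # 4-km grid cells, matching the climate data resolution


def cell_id(x, y, cell_size=CELL_SIZE_M):
    """Map projected coordinates (metres) to an integer grid-cell index."""
    return (math.floor(x / cell_size), math.floor(y / cell_size))


def aggregate_presence(surveys, species):
    """surveys: iterable of (x, y, counts), where counts maps species -> count.

    Returns a dict mapping each surveyed cell to True if at least one
    individual of `species` was counted in any survey in that cell.
    The dict keys together form the model background (all surveyed cells).
    """
    presence = {}
    for x, y, counts in surveys:
        cell = cell_id(x, y)
        detected = counts.get(species, 0) > 0
        presence[cell] = presence.get(cell, False) or detected
    return presence


# Toy data: two surveys fall in cell (0, 0), one in (1, 0), one in (-1, 0).
surveys = [
    (100, 200, {"AMRO": 0}),
    (3900, 3900, {"AMRO": 2}),
    (4100, 100, {"OVEN": 1}),
    (-50, 10, {}),
]
cells = aggregate_presence(surveys, "AMRO")
print(len(cells), "background cells;", sum(cells.values()), "presence cell(s)")
```

Collapsing clustered surveys to one record per cell means that a heavily surveyed cell contributes no more weight to the model than a cell surveyed once.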
Climate variables were derived from 4-km monthly climate normals (1961-1990) based on a combination of PRISM (Daly et al. 2002) and WorldClim (Hijmans et al. 2005) climate data. The western North America portion of these data is described in Wang et al. (2011). We used a set of 17 derived bioclimatic variables presumed to adequately summarize climate conditions within the boreal forest region (Table 1 in report). We were not concerned with high correlation among these covariates because models were developed for prediction purposes only and our goal for this particular exercise was not to interpret the importance of individual covariates.
Landcover data consisted of a 2005 classified landcover map of North America developed by the Commission for Environmental Cooperation (http://www.cec.org/Page.asp?PageID=924&ContentID=2819). We used 15 of the 19 landcover classifications as inputs to bird models (Table 2 in report). Because landcover was mapped at a 250-m resolution, we summarized the proportion of each landcover type within a 4-km grid cell for prediction purposes. For model-building purposes, we summarized landcover proportions according to the distribution of survey locations within the 4-km grid cell. This summary was based on the landcover type at each point-count center, reflecting the dominant type surveyed.
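The two landcover summaries differ only in which set of labels is tallied, as the following sketch suggests. It assumes landcover is available as a simple list of class labels; the function name `class_proportions` and the class names are hypothetical. A 4-km cell holds 16 x 16 = 256 pixels at 250-m resolution.

```python
from collections import Counter


def class_proportions(labels):
    """Return the proportion of each landcover class in a list of labels."""
    if not labels:
        raise ValueError("at least one label is required")
    counts = Counter(labels)
    n = len(labels)
    return {cls: count / n for cls, count in counts.items()}


# Prediction: tally all 256 pixels (250 m each) inside one 4-km grid cell.
cell_pixels = ["conifer"] * 192 + ["wetland"] * 64
print(class_proportions(cell_pixels))

# Model building: tally only the class found at each point-count center
# that falls inside the same grid cell.
station_classes = ["conifer", "conifer", "wetland", "shrub"]
print(class_proportions(station_classes))
```

The station-based summary can differ from the pixel-based one when surveys are concentrated in a subset of the habitats present in the cell, which is why the two are computed separately.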
Distribution models were developed for all passerine species (+ 2 non-passerine landbird species) with at least 100 occurrence records in separate 4-km grid cells (n=94). Avian occurrence records were obtained from two major datasets: (1) the Boreal Avian Modelling (BAM) point-count dataset (Cumming et al. 2010) and (2) the North American Breeding Bird Survey (BBS) point-count dataset from USGS (http://www.pwrc.usgs.gov/bbs/). Due to large discrepancies in survey characteristics and species detectability, which have already been addressed for the purpose of density estimation (Sólymos et al. 2013), the focus here was on the occurrence portion of the dataset only. Future efforts will integrate detectability offsets into bioclimatic density models that can be used to generate regional abundance estimates.
To improve model predictive power within the boreal forest region, data from neighboring hemiboreal regions were also incorporated (as well as data from arctic and mountain regions where possible). Because the core BAM dataset is largely restricted to the boreal forest region, ancillary data consisted primarily of point-level BBS data (breeding bird atlas datasets were notable exceptions). BBS data were obtained for the level 3 ecoregions (http://www.epa.gov/wed/pages/ecoregions/na_eco.htm#Level III) that intersected the combined Brandt (2009) boreal/hemiboreal boundary. These additional data improved coverage of climate and landcover conditions at species’ range limits, thereby providing more opportunities to detect differences in habitat suitability. A total of 117,179 point-count locations were used to summarize species occurrence within 29,059 surveyed grid cells. See Table 3 in report for numbers of individual species occurrence records.
Maxent Model Details and Accuracy Assessment
Models were developed using Maxent version 3.3.3e. We used the cumulative probability output format, allowed all feature types except hinge features, and used a regularization multiplier of 1. We ran the model 10 times using bootstrapped subsamples of the BAM/BBS dataset, each time holding out a random 50% for validation purposes (test data). Model predictions were averaged across the 10 bootstrap replicates. Although models were developed using data from outside of the Brandt boreal/hemiboreal boundary, predictions were constrained to this region.
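The replicate-and-average scheme can be outlined as below. This sketch does not call Maxent; it only shows a random 50% holdout per replicate and cell-wise averaging of predictions. The function names, the fixed seed, and the toy prediction values are assumptions made for illustration.

```python
import random


def replicate_splits(n_records, n_replicates=10, test_fraction=0.5, seed=0):
    """Yield (train_indices, test_indices) for each replicate.

    Each replicate shuffles the record indices and holds out a random
    fraction (here 50%) as test data.
    """
    rng = random.Random(seed)
    n_test = int(round(n_records * test_fraction))
    for _ in range(n_replicates):
        indices = list(range(n_records))
        rng.shuffle(indices)
        yield indices[n_test:], indices[:n_test]


def average_predictions(per_replicate):
    """Average cell-wise predictions across replicates.

    per_replicate: list of equal-length lists, one list per replicate.
    """
    n = len(per_replicate)
    return [sum(values) / n for values in zip(*per_replicate)]


splits = list(replicate_splits(100))
print(len(splits), "replicates;", len(splits[0][1]), "test records each")

# Toy predictions for two grid cells from two replicates.
print(average_predictions([[0.0, 10.0], [10.0, 30.0]]))
```

In practice each training subset would be passed to the modelling software and the resulting prediction surfaces averaged; only the bookkeeping is shown here.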
The accuracy of each model was assessed by calculating the area under the curve (AUC) of the receiver operating characteristic plot (Fielding and Bell 1997) based on test data. AUC values were also averaged across the 10 replicates. The AUC value can be interpreted as the likelihood that a randomly-selected presence location will have a higher suitability score than a randomly-selected background location.
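The pairwise interpretation of AUC can be made concrete with a short sketch that compares every presence score with every background score, counting ties as half. The function name `auc` and the toy scores are hypothetical; this brute-force form is for illustration and would be slow for large datasets.

```python
def auc(presence_scores, background_scores):
    """Fraction of (presence, background) pairs in which the presence
    location has the higher suitability score; ties count as 0.5."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Toy suitability scores on the 0-100 cumulative scale.
presence = [80, 60, 90]
background = [10, 60, 70, 20]
print(auc(presence, background))  # 10.5 of 12 pairs -> 0.875
```

A value of 0.5 corresponds to a model that ranks presence and background locations no better than chance, and 1.0 to perfect separation.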
In general, models were reasonably accurate in their prediction of species’ distributions. Average AUC scores ranged from 0.56 for American Robin to 0.97 for American Tree Sparrow (Table 3). Across all 94 species, AUC scores averaged 0.81 ± 0.09 (SD). Models for 31 species were considered acceptable (0.7 ≤ AUC < 0.8), 31 were excellent (0.8 ≤ AUC < 0.9), and 17 had outstanding discrimination ability (AUC ≥ 0.9) (Hosmer and Lemeshow 1989). AUC scores reflected the ability to discriminate among different levels of habitat suitability within the greater boreal/hemiboreal region. Thus, species with distinct range limits within this region were more accurately predicted.
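The thresholds above translate directly into a small classification helper, sketched here. The function name is hypothetical, and the label "below acceptable" for AUC < 0.7 is an assumption of this sketch, since the text does not name that class. The counts given above also imply that 94 - (31 + 31 + 17) = 15 species fell below 0.7.

```python
def discrimination_class(auc_value):
    """Classify an AUC score using the thresholds quoted in the text."""
    if auc_value >= 0.9:
        return "outstanding"
    if auc_value >= 0.8:
        return "excellent"
    if auc_value >= 0.7:
        return "acceptable"
    # Label chosen for this sketch; the text does not name this class.
    return "below acceptable"


# Species-level values quoted in the text.
for name, score in [("American Robin", 0.56), ("American Tree Sparrow", 0.97)]:
    print(name, score, discrimination_class(score))

# Species not accounted for by the three named classes.
print(94 - (31 + 31 + 17))
```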
Models and occurrence records were overlaid with published range maps from NatureServe (http://datazone.birdlife.org/species/requestdis) for comparison purposes. For all but three species, the BAM/BBS dataset contained occurrence records outside of NatureServe range map limits. This discrepancy is reflected in the Maxent model predictions. Thus, both occurrence records and model predictions may be used to refine the range limits for several species. All but nine species had observations north of their published range limits. Although better range maps may exist for many species (e.g., in recently revised Birds of North America volumes, http://bna.birds.cornell.edu/bna/), digital versions are not generally available for comparison.
References
Brandt, J. P. 2009. The extent of the North American boreal zone. Environmental Reviews 17:101–161.
Cumming, S. G., K. L. Lefevre, E. Bayne, T. Fontaine, F. K. A. Schmiegelow, and S. J. Song. 2010. Toward conservation of Canada's boreal forest avifauna: design and application of ecological models at continental extents. Avian Conservation and Ecology 5(2):8.
Daly, C., W. P. Gibson, G. H. Taylor, G. L. Johnson, and P. Pasteris. 2002. A knowledge-based approach to the statistical mapping of climate. Climate Research 22:99-113.
Elith, J., C. H. Graham, R. P. Anderson, M. Dudik, S. Ferrier, A. Guisan, R. J. Hijmans, F. Huettmann, J. R. Leathwick, A. Lehmann, J. Li, L. G. Lohmann, B. A. Loiselle, G. Manion, C. Moritz, M. Nakamura, Y. Nakazawa, J. McC. M. Overton, A. Townsend Peterson, S. J. Phillips, K. Richardson, R. Scachetti-Pereira, R. E. Schapire, J. Soberón, S. Williams, M. S. Wisz, and N. E. Zimmermann. 2006. Novel methods improve prediction of species' distributions from occurrence data. Ecography 29:129-151.
Erskine, A. J. 1977. Birds in boreal Canada: communities, densities, and adaptations. Ottawa, Canada.
Fielding, A. H. and J. F. Bell. 1997. A review of methods for the assessment of prediction errors in conservation presence/absence models. Environmental Conservation 24:38-49.
Hijmans, R. J., S. E. Cameron, J. L. Parra, P. G. Jones, and A. Jarvis. 2005. Very high resolution interpolated climate surfaces for global land areas. International Journal of Climatology 25:1965-1978.
Hosmer, D. W. and S. Lemeshow. 1989. Applied logistic regression. John Wiley and Sons, New York.
Manly, B. F. J. 1993. Resource Selection by Animals: Statistical Design and Analysis for Field Studies. Chapman and Hall, London.
Phillips, S. J., R. P. Anderson, and R. E. Schapire. 2006. Maximum entropy modeling of species geographic distributions. Ecological Modelling 190:231-259.
Phillips, S. J. and M. Dudik. 2008. Modeling of species distributions with Maxent: new extensions and a comprehensive evaluation. Ecography 31:161-175.
Phillips, S. J., M. Dudik, J. Elith, C. H. Graham, A. Lehmann, J. Leathwick, and S. Ferrier. 2009. Sample selection bias and presence-only distribution models: implications for background and pseudo-absence data. Ecological Applications 19:181-197.
Sólymos, P., S. M. Matsuoka, E. M. Bayne, S. R. Lele, P. Fontaine, S. G. Cumming, D. Stralberg, F. K. A. Schmiegelow, and S. J. Song. 2013. Calibrating indices of avian density from non-standardized survey data: making the most of a messy situation. Methods in Ecology and Evolution 4:1047-1058.
Wang, T., A. Hamann, D. L. Spittlehouse, and T. Q. Murdock. 2011. ClimateWNA-High-Resolution Spatial Climate Data for Western North America. Journal of Applied Meteorology and Climatology 51.