2015-09-10

a note on using this presentation

  • press 'p' to see a note with a brief explanation for each slide
  • press 'f' to see the presentation in 'full screen' mode

Model validation is a big issue

points covered here:

  • what is validation?

  • the purpose of SDMs
  • criteria for SDM adequacy

  • current 'SDM-validation' practices
  • suggestions for improvement

What is validation?

  • perhaps not a good term, better: evaluation (Oreskes, 1998)

  • Three important questions (Beck, 2002):
    • were approved materials & methods used for model construction?
    • does it approximate the 'real thing' well?
    • does it serve its intended purpose?

right materials & methods?

Data

- predictor variables
    - quality
    - resolution
    - realism
- occurrence/abundance data
    - quality
    - size

Method

- appropriate/balanced workflow
    - model-type/algorithm
    - training/testing cycle
    - error-metrics
- quality checks/reproducibility

the purpose of SDMs

  • predict where a species occurs
  • suggest why a species does(n't) occur somewhere

Other reasons to construct a model (for instance):

  • encode contemporary knowledge about a system unambiguously
  • provide exploratory tool to generate ideas and make ignorance explicit
  • communicate scientific notions to a non-expert audience

when does an SDM achieve its purpose?

How to measure predictive performance?

- choice of index/indices
- splitting data for training & testing (evaluating)
- avoiding artifacts & comparing across studies

How to learn from a model?

- interpreting differences among models
- dealing with scale effects
- avoiding over-interpretation of statistical relations

When does an SDM predict well?

  • high score for a performance index on unseen data
    • AUC, TSS, or any other index based on the confusion matrix (see the sketch below)
    • metrics of spatial overlap (e.g. Schoener's index)
  • consistency of response functions across different data sets
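
a minimal sketch (Python, scikit-learn) of how AUC and TSS can be computed on
held-out data; y_test and p_test are invented stand-ins, not data from any study:

    import numpy as np
    from sklearn.metrics import roc_auc_score, confusion_matrix

    rng = np.random.default_rng(0)
    y_test = rng.integers(0, 2, size=200)                         # stand-in observed presence/absence
    p_test = np.clip(y_test * 0.4 + rng.random(200) * 0.6, 0, 1)  # stand-in predicted probabilities

    auc = roc_auc_score(y_test, p_test)                           # threshold-independent

    # TSS = sensitivity + specificity - 1; needs a threshold on the probabilities
    tn, fp, fn, tp = confusion_matrix(y_test, (p_test > 0.5).astype(int)).ravel()
    tss = tp / (tp + fn) + tn / (tn + fp) - 1

    print(round(auc, 2), round(tss, 2))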

important issues:

  • absolute scores of indices are often case-dependent
    • comparison to a null model is important (see the sketch below)
    • conflicting results among studies (for unknown reasons)
  • sensitivity of outcome to arbitrary or undocumented model choices
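
a rough illustration (Python, scikit-learn) of a null-model comparison: the same model
is refitted on shuffled occurrence labels, and the observed AUC is compared to the
resulting null distribution; all data and names here are simulated for the sketch:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(1)
    X = rng.normal(size=(300, 4))                                     # stand-in predictors
    y = (X[:, 0] + rng.normal(scale=0.5, size=300) > 0).astype(int)   # stand-in occurrences

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)
    model = LogisticRegression().fit(X_tr, y_tr)
    observed_auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])

    null_aucs = []
    for _ in range(99):                        # null models: occurrence labels carry no signal
        y_null = rng.permutation(y_tr)
        null = LogisticRegression().fit(X_tr, y_null)
        null_aucs.append(roc_auc_score(y_te, null.predict_proba(X_te)[:, 1]))

    print(observed_auc, np.quantile(null_aucs, 0.95))   # observed score vs the null expectation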

A note about performance indices

Who has a favourite?

  • Which one(s)?

Arguments in favour of/against using an index?

  • has been used previously in similar studies
  • can be calculated, given the constraints
  • has desirable properties (which properties?)

the ideal index doesn't exist (sorry)

Why is data splitting needed at all?

Different ways to split

single data source/survey

  • split once
  • random (unstratified) cross-validation
  • stratified cross-validation
  • hold out a different part of the spatial/temporal domain (see the sketch below)
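
a minimal sketch (Python, scikit-learn) of the three cross-validation variants above,
using invented coordinates and a made-up 2 x 2 spatial blocking:

    import numpy as np
    from sklearn.model_selection import KFold, StratifiedKFold, GroupKFold

    rng = np.random.default_rng(2)
    n = 200
    y = rng.integers(0, 2, size=n)                                # presence/absence
    lon, lat = rng.uniform(0, 10, n), rng.uniform(0, 10, n)
    block = (lon // 5).astype(int) * 2 + (lat // 5).astype(int)   # four spatial blocks

    X_dummy = np.zeros(n)                                         # placeholder; only indices matter here
    splitters = {
        "random":     KFold(n_splits=4, shuffle=True, random_state=2).split(X_dummy),
        "stratified": StratifiedKFold(n_splits=4, shuffle=True, random_state=2).split(X_dummy, y),
        "spatial":    GroupKFold(n_splits=4).split(X_dummy, y, groups=block),
    }
    for name, splits in splitters.items():
        sizes = [len(test_idx) for _, test_idx in splits]
        print(name, "test-fold sizes:", sizes)                    # fit on train_idx, score on test_idx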

Different ways to split

different data sources/surveys

  • split once
  • cross-over designs (train on one survey, evaluate on the others; see the sketch below)
  • etc.
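
a rough sketch of a cross-over design, assuming a hypothetical dict of surveys, each a
(predictors, occurrences) pair; the surveys are simulated here purely for illustration:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(3)

    def fake_survey(n):                       # stand-in for reading a real survey
        X = rng.normal(size=(n, 3))
        y = (X[:, 0] + rng.normal(scale=0.7, size=n) > 0).astype(int)
        return X, y

    surveys = {"survey_A": fake_survey(150), "survey_B": fake_survey(120), "survey_C": fake_survey(90)}

    for train_name, (X_tr, y_tr) in surveys.items():
        model = LogisticRegression().fit(X_tr, y_tr)              # train on one survey
        for test_name, (X_te, y_te) in surveys.items():
            if test_name == train_name:
                continue                                          # evaluate on all the others
            auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
            print(f"train on {train_name}, evaluate on {test_name}: AUC = {auc:.2f}")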

Model evaluation with various data sources

Synthetic data in model validation

  • correct model
  • incorrect model
  • different levels of noise (e.g. random predictors; see the sketch below)
  • etc.
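
one way such a synthetic test could look (Python, scikit-learn): a virtual species with a
known response to a single predictor, padded with purely random predictors; everything
below is simulated and only meant as an illustration:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(4)
    n = 500
    temp = rng.normal(size=n)                        # the predictor that truly matters
    noise = rng.normal(size=(n, 5))                  # random predictors with no effect
    p_true = 1 / (1 + np.exp(-2 * temp))             # known ('correct') response curve
    y = rng.binomial(1, p_true)                      # simulated occurrences

    X = np.column_stack([temp, noise])
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=4)
    model = LogisticRegression().fit(X_tr, y_tr)

    print("AUC on synthetic data:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
    print("coefficients (the first should dominate):", model.coef_.round(2))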

Fresh ideas

  • make evaluation spatially explicit (see the sketch below)
  • explore the relation between predictor resolution, model resolution and predictive performance
  • apply the concepts of 'behavioural models' & equifinality
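
a small sketch of what a spatially explicit evaluation could look like: one score per
spatial block instead of a single global number (data and block definition are made up):

    import numpy as np
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(5)
    n = 400
    lon, lat = rng.uniform(0, 10, n), rng.uniform(0, 10, n)
    y_obs = rng.integers(0, 2, size=n)                           # stand-in observations
    p_pred = np.clip(y_obs * 0.4 + rng.random(n) * 0.6, 0, 1)    # stand-in predictions

    block = (lon // 5).astype(int) * 2 + (lat // 5).astype(int)  # 2 x 2 spatial blocks
    for b in np.unique(block):
        idx = block == b
        if len(np.unique(y_obs[idx])) == 2:                      # AUC needs both classes present
            print("block", b, "AUC:", round(roc_auc_score(y_obs[idx], p_pred[idx]), 2))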

Think spatially

And the role of ZOÖN?

  • e.g. implement the cross-validation functions from ENMeval (Muscarella et al., 2014)
  • design, test & share your own ideal model evaluation workflow