Testing prediction algorithms as null hypotheses: Application to assessing the performance of deep neural networks
Description
Bayesian models use posterior predictive distributions to quantify the uncertainty of their predictions. Similarly, the point predictions of neural networks and other machine learning algorithms may be converted to predictive distributions by various bootstrap methods. The predictive performance of each algorithm can then be assessed by quantifying the performance of its predictive distribution. Previous methods for assessing such performance are relative, indicating whether certain algorithms perform better than others. This paper proposes performance measures that are absolute in the sense that they indicate whether or not an algorithm performs adequately, without requiring comparisons to other algorithms. The first proposed performance measure is a predictive p value that generalizes a prior predictive p value, with the prior distribution taken to be the posterior distribution given previous data. The other proposed performance measures use the generalized predictive p value for each prediction to estimate the proportion of target values that are compatible with the predictive distribution. The new performance measures are illustrated by using them to evaluate the predictive performance of deep neural networks applied to a large housing price data set that serves as a standard benchmark in machine learning.
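To make the two ingredients of the abstract concrete, the sketch below shows one way to (a) turn a point-prediction algorithm into a bootstrap predictive distribution and (b) compute a two-sided predictive p value per target, then report the proportion of targets compatible with the predictive distribution. This is a minimal illustration under stated assumptions, not the paper's exact procedure: the function names (`bootstrap_predictive_pvalues`, `compatible_proportion`), the `fit_predict` callback, and the simple tail-probability p value are all hypothetical stand-ins for the generalized predictive p value developed in the preprint.

```python
import numpy as np

def bootstrap_predictive_pvalues(fit_predict, X_train, y_train,
                                 X_test, y_test, n_boot=200, seed=None):
    """Two-sided predictive p values from a bootstrap predictive distribution.

    fit_predict(X_tr, y_tr, X_te) should return point predictions for X_te;
    inputs are assumed to be NumPy arrays. Illustrative sketch only.
    """
    rng = np.random.default_rng(seed)
    n = len(y_train)
    draws = np.empty((n_boot, len(y_test)))
    for b in range(n_boot):
        # Refit on a bootstrap resample of the training data; the predictions
        # form one draw from each test point's predictive distribution.
        idx = rng.integers(0, n, size=n)
        draws[b] = fit_predict(X_train[idx], y_train[idx], X_test)
    # How extreme each observed target is relative to its bootstrap draws.
    lower = (draws <= y_test).mean(axis=0)
    upper = (draws >= y_test).mean(axis=0)
    return np.minimum(1.0, 2.0 * np.minimum(lower, upper))

def compatible_proportion(p_values, alpha=0.05):
    """Share of targets whose p value does not flag incompatibility at level
    alpha: an absolute measure in the abstract's sense, needing no rival
    algorithm for comparison."""
    return float(np.mean(p_values > alpha))
```

For example, with a scikit-learn regressor one could pass `fit_predict=lambda X_tr, y_tr, X_te: RandomForestRegressor(n_estimators=50).fit(X_tr, y_tr).predict(X_te)` and report `compatible_proportion(p_values)` alongside conventional error metrics.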
Files

| Name | md5 | Size |
|---|---|---|
| assessment-preprint191026.pdf | 319bd8a9a81dd881dd7e3f698a457838 | 432.1 kB |