Design Analysis - Bayes Factor

Anna Vesely

2019-12-05

Bayes Factor

Consider \(n_1\) observations \(\mathbf{x}_1\) from a \(\mathcal{N}(0,1)\) distribution and \(n_2\) observations \(\mathbf{x}_2\) from a \(\mathcal{N}(d,1)\) distribution. The sample means and their difference are distributed as \[\begin{align*} \bar{X}_1\sim\mathcal{N}\left(0,\frac{1}{n_1}\right),\quad\quad \bar{X}_2\sim\mathcal{N}\left(d,\frac{1}{n_2}\right),\quad\quad Z=(\bar{X}_2-\bar{X}_1)\sim\mathcal{N}\left(d,\frac{n_1+n_2}{n_1\,n_2}\right). \end{align*}\] Suppose that we want to assess the difference between the group means.
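As a quick check of the variance of \(Z\), assuming that the two samples are independent (an assumption implied by the variance stated above): \[\begin{align*} \text{Var}(Z)=\text{Var}(\bar{X}_2)+\text{Var}(\bar{X}_1)=\frac{1}{n_2}+\frac{1}{n_1}=\frac{n_1+n_2}{n_1\,n_2}. \end{align*}\] For instance, with the illustrative values \(n_1=n_2=50\), \(\text{Var}(Z)=100/2500=0.04\), so \(Z\) has standard deviation \(0.2\).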

In particular, consider the null hypothesis \(H_0:\,d=0\), and the alternative hypothesis \(H_1\,:\, d\sim F\), where \(F\) is a distribution (analysis prior) on a set \(D\) containing zero. For simplicity, assume that \(F\) is continuous with density \(f\). Then the Bayes factor \(B\) is computed as follows: \[\begin{align*} &L(d\,|\,z)=\sqrt{\frac{n_1\,n_2}{2\pi(n_1+n_2)}}\exp\left\{-\frac{n_1\,n_2}{2(n_1+n_2)}(z-d)^2\right\}\\ &P(z\,|\,H_0)=P(z\,|\,d=0),\quad\quad P(z\,|\,H_1)=\int_{D} P(z\,|\,d=v)\,f(v)\,\text{d}v\\ & B = \frac{P(z\,|\,H_1)}{P(z\,|\,H_0)}= \int_{D} \exp\left\{\frac{n_1\,n_2}{n_1+n_2}\,v\,\left(z-\frac{v}{2}\right)\right\}\,f(v)\,\text{d}v. \end{align*}\]
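As an illustration, the integral defining \(B\) can be approximated numerically. The following is a minimal sketch, not the author's implementation: it assumes a \(\mathcal{N}(0,1)\) analysis prior, takes \(D\) to be the interval \([-8,8]\), and uses a composite Simpson rule. The names `normal_pdf` and `bayes_factor`, the integration bounds and the number of steps are all hypothetical choices made for this example.

```python
import math


def normal_pdf(v, mean=0.0, sd=1.0):
    """Density of a N(mean, sd^2) distribution (illustrative analysis prior f)."""
    return math.exp(-0.5 * ((v - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def bayes_factor(z, n1, n2, prior_pdf=normal_pdf, lower=-8.0, upper=8.0, steps=2000):
    """Approximate B = int_D exp{c v (z - v/2)} f(v) dv, with c = n1 n2 / (n1 + n2).

    D is taken to be [lower, upper]; the integral uses a composite Simpson rule.
    """
    c = n1 * n2 / (n1 + n2)
    if steps % 2:  # Simpson's rule needs an even number of sub-intervals
        steps += 1
    h = (upper - lower) / steps

    def integrand(v):
        return math.exp(c * v * (z - v / 2.0)) * prior_pdf(v)

    total = integrand(lower) + integrand(upper)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0  # odd nodes get weight 4, even nodes 2
        total += weight * integrand(lower + i * h)
    return total * h / 3.0


if __name__ == "__main__":
    # Observed mean difference z = 0.5 with 50 observations per group.
    print(bayes_factor(z=0.5, n1=50, n2=50))
```

Passing a different `prior_pdf` (and suitable bounds) changes the analysis prior. Very large sample sizes or values of \(z\) could overflow the exponential in this simple sketch.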

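For a given threshold \(k\) (e.g. \(k=3\)), the Bayes factor is read as evidence in favor of \(H_0\) if \(B<1/k\), as inconclusive if \(1/k\leq B\leq k\), and as evidence in favor of \(H_1\) if \(B>k\).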

As shown in the following table:

\[\begin{array}{l|ccc} & B<1/k & 1/k \leq B \leq k & B > k\\ & \text{in favor of } H_0 & \text{inconclusive} & \text{in favor of } H_1\\ \hline H_0 \text{ is true} & \gamma & e_I^{(0)} & e_P\\ H_1 \text{ is true} & e_N & e_I^{(1)} & \beta,\; e_S,\; e_M \end{array}\]

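we want to estimate the probability of each outcome: \(\gamma\), \(e_I^{(0)}\) and \(e_P\) when \(H_0\) is true; \(e_N\), \(e_I^{(1)}\) and \(\beta\) when \(H_1\) is true. In addition, among the results with \(B>k\) when \(H_1\) is true, we want to estimate \(e_S\), the proportion whose sign is opposite to that of the true effect, and \(e_M\), the ratio of the average absolute observed difference to the true effect.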

Estimation from Fixed Sample Sizes

Fix the sample sizes \(n_1\) and \(n_2\), an analysis prior \(F\) and a threshold \(k\). Moreover, consider a plausible effect size \(d^*\). The above-mentioned quantities can be estimated through \(2T\) simulations. By default:

Firstly, for \(i=1,\ldots, T\), generate \(z_i\) from a \(\mathcal{N}\left(d^*,\frac{n_1+n_2}{n_1\,n_2}\right)\) distribution, and compute the Bayes factor \[ B_i = \int_{D}\exp\left\{\frac{n_1\,n_2}{n_1+n_2}\,v\,\left(z_i-\frac{v}{2}\right)\right\}\,f(v)\,\text{d}v. \] Then \[\begin{align*} &\hat{\beta}=\frac{\#\{i\,:\,B_i>k\}}{T},\quad\quad\hat{e}_N=\frac{\#\{i\,:\,B_i<1/k\}}{T},\quad\quad\hat{e}_I^{(1)}=\frac{\#\{i\,:\,1/k\leq B_i\leq k\}}{T}\\ &\hat{e}_S=\frac{\#\{i\,:\,B_i>k,\;\text{sgn}(z_i)\neq\text{sgn}(d^*)\}}{\#\{i\,:\,B_i>k\}},\quad\quad\hat{e}_M=\frac{\text{mean}\{|z_i|\,:\,B_i>k\}}{d^*}. \end{align*}\]

Subsequently, for \(j=1,\ldots, T\), generate \(z_j\) from a \(\mathcal{N}\left(0,\frac{n_1+n_2}{n_1\,n_2}\right)\) distribution, and compute the Bayes factor \(B_j\) as before. Then \[\begin{align*} &\hat{\gamma}=\frac{\#\{j\,:\,B_j<1/k\}}{T},\quad\quad\hat{e}_P=\frac{\#\{j\,:\,B_j>k\}}{T},\quad\quad\hat{e}_I^{(0)}=\frac{\#\{j\,:\,1/k\leq B_j\leq k\}}{T}. \end{align*}\]
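The two simulation steps can be sketched as follows. This is a hedged illustration rather than the author's code: it assumes a \(\mathcal{N}(0,s^2)\) analysis prior on \(D=\mathbb{R}\), for which completing the square in the integral gives the closed form \(B=(1+c\,s^2)^{-1/2}\exp\left\{c^2s^2z^2/\left(2(1+c\,s^2)\right)\right\}\) with \(c=n_1n_2/(n_1+n_2)\). The names `normal_prior_bf`, `design_analysis` and `sign`, the default values and the seed are hypothetical.

```python
import math
import random


def normal_prior_bf(z, n1, n2, prior_sd=1.0):
    """Closed-form B under an assumed N(0, prior_sd^2) analysis prior on D = R."""
    c = n1 * n2 / (n1 + n2)
    s2 = prior_sd ** 2
    return math.exp(c * c * s2 * z * z / (2.0 * (1.0 + c * s2))) / math.sqrt(1.0 + c * s2)


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def design_analysis(d_star, n1, n2, k=3.0, T=10000, bf=normal_prior_bf, seed=1):
    """Estimate the design-analysis quantities from 2T simulated values of z."""
    rng = random.Random(seed)
    sd_z = math.sqrt((n1 + n2) / (n1 * n2))  # standard deviation of Z

    # Step 1: z_i ~ N(d*, (n1 + n2) / (n1 n2)), i.e. H1 holds with effect d*.
    z1 = [rng.gauss(d_star, sd_z) for _ in range(T)]
    b1 = [bf(z, n1, n2) for z in z1]
    hits = [z for z, b in zip(z1, b1) if b > k]  # z_i with B_i > k
    est = {
        "beta": len(hits) / T,
        "e_N": sum(b < 1.0 / k for b in b1) / T,
        "e_I1": sum(1.0 / k <= b <= k for b in b1) / T,
    }
    if hits:
        est["e_S"] = sum(sign(z) != sign(d_star) for z in hits) / len(hits)
        est["e_M"] = sum(abs(z) for z in hits) / len(hits) / d_star
    else:  # no B_i exceeded k, so e_S and e_M are undefined
        est["e_S"] = est["e_M"] = float("nan")

    # Step 2: z_j ~ N(0, (n1 + n2) / (n1 n2)), i.e. H0 holds.
    b0 = [bf(rng.gauss(0.0, sd_z), n1, n2) for _ in range(T)]
    est["gamma"] = sum(b < 1.0 / k for b in b0) / T
    est["e_P"] = sum(b > k for b in b0) / T
    est["e_I0"] = sum(1.0 / k <= b <= k for b in b0) / T
    return est


if __name__ == "__main__":
    for name, value in design_analysis(d_star=0.5, n1=50, n2=50).items():
        print(name, round(value, 3))
```

The `bf` argument accepts any function of `(z, n1, n2)`, so a numerical integrator for a different analysis prior can be substituted for the closed form.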