Bootstrap-based improved estimators for the two-parameter Birnbaum–Saunders distribution

In this paper, we consider the two-parameter Birnbaum–Saunders distribution proposed by Birnbaum and Saunders [Birnbaum, Z.W. and Saunders, S.C., 1969, A new family of life distributions. Journal of Applied Probability, 6, 319–327], which is commonly used for modeling the lifetime of materials and equipment. We consider different bootstrap-based strategies, both parametric and nonparametric, for correcting the bias of the maximum-likelihood estimators of the parameters that index the distribution. The numerical evidence favors a particular bootstrap estimator based on parametric resampling. Finally, an example with real data is presented and discussed.


Introduction
Birnbaum and Saunders [1] proposed a new family of life distributions for modeling the lifetime of materials and equipment under the influence of dynamic loads. This two-parameter Birnbaum-Saunders distribution was derived from a model in which failures happen due to the development and growth of a dominant crack. Later, this distribution was derived in a more general way by Desmond [2], on the basis of a biological model. Desmond also extended the physical justification for its use by relaxing some of the assumptions made in Birnbaum and Saunders [1]. Moreover, Desmond [3] investigated the relationship between the Birnbaum-Saunders distribution and the inverse Gaussian distribution.
A random variable T is said to follow the two-parameter Birnbaum-Saunders distribution with parameters α, β > 0, denoted by B-S(α, β), if its cumulative distribution function is given by

F(t) = Φ( (1/α) [ (t/β)^{1/2} − (β/t)^{1/2} ] ),  t > 0,  (1)

where Φ(·) denotes the standard normal distribution function, α is a shape parameter and β is a scale parameter. The corresponding density function is

f(t) = (2αβ(2π)^{1/2})^{−1} [ (β/t)^{1/2} + (β/t)^{3/2} ] exp{ −(2α²)^{−1} (t/β + β/t − 2) },  t > 0.  (2)
The density (2) is asymmetric to the right, the asymmetry being more pronounced for large values of α and decreasing as α decreases. In figure 1, we present the density (2) for different values of α, with β = 1. Note that as α decreases, the density becomes more symmetric around β, which is the median of the distribution. Note also that the variance decreases as α decreases.
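These shape properties can be verified numerically. The sketch below is a minimal illustration that assumes the standard Birnbaum-Saunders parameterization of the cdf and density above; it checks that the density integrates to one and that β = 1 is the median for any α:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def bs_pdf(t, alpha, beta):
    """Birnbaum-Saunders density: phi(z(t)) * dz/dt with
    z(t) = (1/alpha) * (sqrt(t/beta) - sqrt(beta/t))."""
    z = (np.sqrt(t / beta) - np.sqrt(beta / t)) / alpha
    dz = (np.sqrt(beta / t) + np.sqrt(beta / t) ** 3) / (2.0 * alpha * beta)
    return norm.pdf(z) * dz

def bs_cdf(t, alpha, beta):
    """Birnbaum-Saunders cdf: Phi((1/alpha)(sqrt(t/beta) - sqrt(beta/t)))."""
    return norm.cdf((np.sqrt(t / beta) - np.sqrt(beta / t)) / alpha)

# The density integrates to one and the median is beta, whatever alpha is.
for alpha in (0.1, 0.5, 1.0):
    total, _ = quad(bs_pdf, 1e-6, 200.0, args=(alpha, 1.0),
                    points=[1.0], limit=200)
    assert abs(total - 1.0) < 1e-5
    assert abs(bs_cdf(1.0, alpha, 1.0) - 0.5) < 1e-12
```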

Bootstrap methods
In what follows, we shall present different bootstrapping schemes that can be used to reduce the bias of the MLE. Let y = (y_1, ..., y_n)^T be a random sample of size n, where each element is a random draw from the random variable Y, which has distribution function F = F_θ(y).
Here, θ is the parameter that indexes the distribution and is viewed as a functional of F, i.e., θ = t(F). Finally, let θ̂ be an estimator of θ based on y; we write θ̂ = s(y). The bootstrap method proposed by Efron [15] consists in obtaining, from the original sample y, a large number of pseudo-samples y* = (y*_1, ..., y*_n)^T and then extracting information from these samples to improve inference. Bootstrap methods can be classified into two classes, depending on how the resampling is performed: parametric and nonparametric. In the parametric version, the bootstrap samples are obtained from the fitted distribution F_θ̂, whereas in the nonparametric version they are obtained from the empirical distribution function F̂, through sampling with replacement. Note that the nonparametric bootstrap entails no parametric assumptions.
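The two resampling schemes can be sketched as follows. This is a minimal illustration: moment-type estimates (the arithmetic/harmonic-mean estimates, an assumption made here for brevity) stand in for the MLEs, and parametric samples are drawn via the normal representation of the Birnbaum-Saunders distribution used later for simulation.

```python
import numpy as np

rng = np.random.default_rng(123)

def bs_sample(alpha, beta, n, rng):
    """Parametric draw from B-S(alpha, beta) via its normal representation:
    if X ~ N(0, alpha^2/4), then beta*(1 + 2X^2 + 2X*sqrt(1 + X^2)) is B-S."""
    x = rng.normal(0.0, alpha / 2.0, size=n)
    return beta * (1.0 + 2.0 * x**2 + 2.0 * x * np.sqrt(1.0 + x**2))

y = bs_sample(0.5, 1.0, 30, rng)  # stand-in for the observed sample y

# Nonparametric bootstrap sample: draw from the empirical distribution
# F-hat, i.e. resample the n observations with replacement.
y_star_np = rng.choice(y, size=y.size, replace=True)

# Parametric bootstrap sample: draw from the fitted distribution F_theta-hat.
# Moment-type estimates stand in for the MLEs here (an assumption).
s, r = y.mean(), 1.0 / (1.0 / y).mean()  # arithmetic and harmonic means
alpha_hat = np.sqrt(2.0 * (np.sqrt(s / r) - 1.0))
beta_hat = np.sqrt(s * r)
y_star_p = bs_sample(alpha_hat, beta_hat, y.size, rng)
```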
Let B_F(θ̂, θ) denote the bias of the estimator θ̂ = s(y), that is,

B_F(θ̂, θ) = E_F[θ̂] − θ = E_F[s(y)] − t(F),

where the subscript F indicates that the expectation is taken with respect to F. The bootstrap estimators of the bias in the parametric and nonparametric versions are obtained by replacing the true distribution F, which generated the original sample, with F_θ̂ and F̂, respectively, in the expression above. Therefore, the parametric and nonparametric estimates of the bias are given by

B_{F_θ̂}(θ̂, θ) = E_{F_θ̂}[s(y)] − t(F_θ̂)  and  B_F̂(θ̂, θ) = E_F̂[s(y)] − t(F̂).

If B bootstrap samples (y*_1, y*_2, ..., y*_B) are generated independently from the original sample y, and the respective bootstrap replications (θ̂*_1, θ̂*_2, ..., θ̂*_B) are calculated, where θ̂*_b = s(y*_b), b = 1, 2, ..., B, then it is possible to approximate the bootstrap expectations E_{F_θ̂}[s(y)] and E_F̂[s(y)] by the mean θ̂*(·) = (1/B) Σ_{b=1}^B θ̂*_b. Therefore, the bootstrap bias estimates based on B replications of θ̂ are

B̂_{F_θ̂}(θ̂, θ) = θ̂*(·) − s(y)  and  B̂_F̂(θ̂, θ) = θ̂*(·) − s(y),

for the parametric and nonparametric versions, respectively. An alternative bootstrap bias estimate was proposed by Efron [17]. His approach is nonparametric and uses an auxiliary vector known as the resampling vector, which records the proportions of the original observations y = (y_1, ..., y_n)^T included in the bootstrap sample. We denote the resampling vector by P* = (P*_1, P*_2, ..., P*_n), where its components P*_j, j = 1, 2, ..., n, are defined with respect to a given bootstrap sample y* = (y*_1, ..., y*_n)^T as P*_j = n^{−1} #{y*_k = y_j}, j = 1, 2, ..., n. The vector P^0 = (1/n, 1/n, ..., 1/n) corresponds to the original sample.
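The mechanics of the bootstrap bias estimate θ̂*(·) − θ̂ and of the resampling vector can be sketched as follows. As an illustration only, the estimator used here is the biased MLE of a normal variance, whose exact bias −σ²/n is known; it is a stand-in, not the Birnbaum-Saunders MLE.

```python
import numpy as np

rng = np.random.default_rng(42)
n, B = 50, 2000
y = rng.normal(0.0, 1.0, size=n)

def s(sample):
    """Estimator theta-hat = s(y). For illustration, the (biased) MLE of a
    normal variance, chosen because its exact bias -sigma^2/n is known."""
    return sample.var()  # ddof=0, i.e. divides by n

theta_hat = s(y)

# Nonparametric bootstrap replications theta*_b = s(y*_b), b = 1, ..., B.
theta_star = np.empty(B)
for b in range(B):
    theta_star[b] = s(y[rng.integers(0, n, size=n)])

bias_hat = theta_star.mean() - theta_hat       # B-hat = theta*(.) - theta-hat
theta_1 = 2.0 * theta_hat - theta_star.mean()  # bias-corrected: theta-hat - B-hat

# Resampling vector of one bootstrap sample: P*_j = #{y*_k = y_j} / n.
idx = rng.integers(0, n, size=n)
P_star = np.bincount(idx, minlength=n) / n
```

With this estimator the exact bias is −θ̂/n under resampling from the empirical distribution, so bias_hat should be close to that value up to Monte Carlo noise.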
As proposed by Efron [17], the bias-corrected estimator θ̂_2 requires the original estimator θ̂ to have closed form. In general, however, θ̂ does not have closed form. To circumvent this problem, Cribari-Neto et al. [19] proposed an adaptation of Efron's method that can be applied to estimators without closed form. Their proposal is to use the resampling vector to modify the log-likelihood function and then maximize the modified log-likelihood. The idea is to write the log-likelihood function in terms of P^0, replace this quantity by P*(·), the mean of the B resampling vectors, and then maximize the resulting modified log-likelihood function.
This adaptation of Efron's method is useful for performing inference in the two-parameter Birnbaum-Saunders distribution, as the MLEs of the parameters do not have closed form. Let t = (t_1, ..., t_n)^T be a random sample of size n from the Birnbaum-Saunders distribution with parameters α and β. Apart from an unimportant constant, the log-likelihood function can be written in terms of P^0 as

ℓ(α, β; P^0) = n Σ_{j=1}^n P^0_j { −log α − log β + log[ (β/t_j)^{1/2} + (β/t_j)^{3/2} ] − (2α²)^{−1}(t_j/β + β/t_j − 2) }.

One then obtains P*(·) from a nonparametric bootstrapping scheme based on B replications, replaces P^0 by P*(·), and maximizes the modified log-likelihood function ℓ(α, β; P*(·)) instead of the original log-likelihood function. MacKinnon and Smith [18] argue that the estimators θ̂_1 and θ̂_3, which they call constant bias-correcting (CBC) estimators, are designed to work well when the bias function B(θ) is flat, that is, when it does not depend on θ. They also considered the situation in which the bias function is linear in θ, that is,

B(θ) = a + cθ.

The estimation of the bias function now involves the estimation of two parameters, namely a and c: we need estimates of two points on the bias line, which are then used to obtain estimates of a and c.
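The weighted-likelihood scheme can be sketched as follows. This is a minimal sketch assuming the standard Birnbaum-Saunders density; Nelder-Mead is used here for simplicity, whereas the paper uses BFGS with analytical derivatives.

```python
import numpy as np
from scipy.optimize import minimize

def bs_loglik(params, t, P):
    """Weighted log-likelihood l(alpha, beta; P) = n * sum_j P_j * l_j, where
    l_j is the log-density contribution of t_j (additive constant dropped).
    With P = P0 = (1/n, ..., 1/n) this is the usual log-likelihood; with
    P = P*(.) it is the modified one."""
    alpha, beta = params
    if alpha <= 0.0 or beta <= 0.0:
        return -np.inf
    r = beta / t
    lj = (-np.log(alpha) - np.log(beta)
          + np.log(np.sqrt(r) + np.sqrt(r) ** 3)
          - (t / beta + r - 2.0) / (2.0 * alpha**2))
    return t.size * np.sum(P * lj)

# Simulated data with alpha = 0.5, beta = 1 via the normal representation.
rng = np.random.default_rng(7)
x = rng.normal(0.0, 0.25, size=40)  # sd = alpha/2
t = 1.0 + 2.0 * x**2 + 2.0 * x * np.sqrt(1.0 + x**2)

n, B = t.size, 200
P_mean = np.zeros(n)                # mean resampling vector P*(.)
for _ in range(B):
    P_mean += np.bincount(rng.integers(0, n, size=n), minlength=n) / n
P_mean /= B

fit = minimize(lambda p: -bs_loglik(p, t, P_mean), x0=[0.5, 1.0],
               method="Nelder-Mead")
alpha_2, beta_2 = fit.x  # maximizer of the modified log-likelihood
```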
The procedure can be summarized as follows. Using the original sample y, we compute the estimate θ̂ = s(y). To estimate the first point, we use a parametric bootstrapping scheme to obtain a bootstrap estimate of the bias of θ̂, denoted by B̂ and given by B̂ = θ̂*(·) − θ̂. Next, to estimate the second point, we use another parametric bootstrapping scheme based on θ̃ = 2θ̂ − θ̂*(·); for each bootstrap sample generated from F_θ̃, we obtain the corresponding replications θ̂*_b(F_θ̃), b = 1, ..., B. We then estimate the bias of θ̃ by B̃ = θ̂*(·)(F_θ̃) − θ̃, where θ̂*(·)(F_θ̃) is the mean of the bootstrap replications of θ̃. Note that one needs to perform 2B bootstrap replications in total, i.e., twice the number required by the CBC estimators. Finally, using the point estimates θ̂ and θ̃ and their respective estimated biases B̂ and B̃, we arrive at the following system of two simultaneous equations:

B̂ = a + c θ̂,   B̃ = a + c θ̃.

Solving for a and c yields ĉ = (B̂ − B̃)/(θ̂ − θ̃) and â = B̂ − ĉ θ̂. We then obtain the following bias-corrected estimator, known as the linear bias-correcting (LBC) estimator [18] and denoted by θ̂_4:

θ̂_4 = (θ̂ − â)/(1 + ĉ).

Its variance is a function of the variance of θ̂:

var(θ̂_4) = var(θ̂)/(1 + ĉ)².

If the estimated value ĉ belongs to the set A = (−2, 0)\{−1}, then (1 + ĉ)² < 1 and the variance of θ̂_4 will exceed that of θ̂.
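Under the idealized assumption that the two bias estimates lie exactly on a line B(θ) = a + cθ, the LBC construction recovers a and c and removes the bias entirely; a minimal sketch:

```python
def lbc_estimate(theta_hat, bias_hat, theta_tilde, bias_tilde):
    """Linear bias-correcting (LBC) estimator of MacKinnon and Smith.

    Fits the bias line B(theta) = a + c*theta through the two estimated
    points (theta_hat, B-hat) and (theta_tilde, B-tilde), then solves
    theta_4 = theta_hat - a - c*theta_4 for theta_4.
    Note var(theta_4) = var(theta_hat) / (1 + c)**2."""
    c = (bias_hat - bias_tilde) / (theta_hat - theta_tilde)
    a = bias_hat - c * theta_hat
    theta_4 = (theta_hat - a) / (1.0 + c)
    return theta_4, a, c

# Idealized check: true theta = 1 with bias line B(theta) = 0.1 + 0.1*theta,
# so E[theta-hat] = 1.2. With exact bias estimates B-hat = B(1.2) = 0.22,
# theta-tilde = 1.2 - 0.22 = 0.98 and B-tilde = B(0.98) = 0.198,
# the LBC estimator recovers theta exactly.
theta_4, a, c = lbc_estimate(1.2, 0.22, 0.98, 0.198)
assert abs(a - 0.1) < 1e-9 and abs(c - 0.1) < 1e-9 and abs(theta_4 - 1.0) < 1e-9
```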
A detailed discussion of bootstrap methods and their applications can be found in Davison and Hinkley [20] and in Efron and Tibshirani [21].

Numerical results
The simulation of B-S(α, β) independent deviates can be performed using the monotone mapping

X = (1/2)[ (T/β)^{1/2} − (β/T)^{1/2} ].

That is, from equation (1), we have that X ~ N(0, (1/4)α²). Therefore, inverting the mapping above, T can be written as

T = β[ 1 + 2X² + 2X(1 + X²)^{1/2} ].   (3)

In other words, pseudo-random numbers from the two-parameter Birnbaum-Saunders distribution can be obtained from pseudo-random normal numbers using equation (3). The Monte Carlo simulations were performed using the object-oriented matrix programming language Ox [22,23]. We used R = 5000 Monte Carlo replications and B = 500 bootstrap replications. The MLEs of α and β were obtained by maximizing the log-likelihood function using the BFGS quasi-Newton nonlinear optimization algorithm with analytical derivatives; this method is generally regarded as the best-performing nonlinear optimization method [24, p. 199]. Note that the simulations are computationally intensive, as each experiment requires millions of nonlinear maximizations. We evaluated, through Monte Carlo simulation, the finite-sample performance of the MLE θ̂ = (α̂, β̂)^T of the parameter vector θ = (α, β)^T of the Birnbaum-Saunders distribution and of its corrected versions: θ̂_1 = (α̂_1, β̂_1)^T (nonparametric CBC), θ̂_2 = (α̂_2, β̂_2)^T (based on the resampling vector P*(·)), θ̂_3 = (α̂_3, β̂_3)^T (parametric CBC) and θ̂_4 = (α̂_4, β̂_4)^T (parametric LBC). We also computed the corrected MLE proposed by Ng et al. [14], denoted here by θ̄ = (ᾱ, β̄)^T. The sample sizes considered were n = 10, 20, 40 and 60, and the values of the shape parameter α were 0.10, 0.25, 0.50, 0.75 and 1.00. Without loss of generality, the scale parameter β was set to 1 in all experiments.
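In a language such as Python (the paper itself uses Ox), the generator implied by equation (3) can be sketched as:

```python
import numpy as np

def rbs(alpha, beta, size, rng):
    """Draw B-S(alpha, beta) pseudo-random numbers from normal deviates:
    X ~ N(0, alpha^2/4) and T = beta*(1 + 2*X^2 + 2*X*sqrt(1 + X^2))."""
    x = rng.normal(0.0, alpha / 2.0, size=size)
    return beta * (1.0 + 2.0 * x**2 + 2.0 * x * np.sqrt(1.0 + x**2))

rng = np.random.default_rng(2024)
t = rbs(0.5, 1.0, 100_000, rng)

# Sanity checks: T > 0, median(T) is approximately beta, and the mean is
# approximately beta * (1 + alpha^2 / 2) = 1.125 for alpha = 0.5, beta = 1.
assert np.all(t > 0)
assert abs(np.median(t) - 1.0) < 0.01
assert abs(t.mean() - 1.125) < 0.01
```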
For each sample size, we computed relative bias estimates (the relative bias of an estimator θ̂ of a scalar parameter θ is defined as {E(θ̂) − θ}/θ) and estimates of the root mean-square error. In tables 1 and 2, we present the estimated relative biases. Note, in table 1, that the estimators α̂_1, α̂_2, α̂_3, α̂_4 and ᾱ displayed relative biases (in absolute value) smaller than that of the usual MLE α̂ for all sample sizes considered. Note also that the bias correction proposed by Ng et al. [14] (estimator ᾱ) was the least effective among all bias corrections considered. For instance, for n = 20 and α = 0.10, the estimated relative bias of ᾱ was 0.01238, whereas the estimated relative biases of α̂_1, α̂_2, α̂_3 and α̂_4 were, in absolute value, 0.00286, 0.00361, 0.00167 and 0.00430, respectively. Furthermore, among the bootstrap estimators, α̂_3 (parametric CBC) displayed the best finite-sample performance, followed by α̂_1 (nonparametric CBC). For instance, for n = 40 and α = 0.25, the relative biases of α̂_1, α̂_2, α̂_3 and α̂_4 were, in absolute value, 0.00029, 0.00105, 0.00003 and 0.00483, respectively. The figures in table 2 show that the estimators β̂_1 and β̂_3 were again the most accurate for all sample sizes. For instance, for n = 10 and α = 0.25, the estimated relative biases of β̂_1 and β̂_3 were nearly 10 times smaller than that of the MLE β̂. Of these two bootstrap estimators, the one with the best performance was β̂_3 (parametric CBC); indeed, this estimator was the most accurate in small samples.
In tables 3 and 4, we present the square root of the estimated mean-square error (√MSE) of the estimators of α and β, respectively. Note, in table 3, that the √MSE estimates of α̂_1, α̂_3 and ᾱ were similar to that of the MLE α̂. In contrast, the √MSE estimates of α̂_2 and α̂_4 were significantly larger than that of α̂. For instance, for n = 60 and α = 0.10, the √MSE estimates of α̂, α̂_2 and α̂_4 were 0.00902, 0.01268 and 0.02113, respectively. Note, in table 4, that, similar to what happens in table 3, the √MSE estimates of β̂_1, β̂_3 and β̄ were close to that of β̂, whereas those of β̂_2 and β̂_4 were considerably larger. For instance, for n = 60 and α = 0.75, the √MSE estimates of β̂, β̂_2 and β̂_4 were 0.08952, 0.12467 and 0.12026, respectively.
It is noteworthy that as the shape parameter α increases, the estimates of the scale parameter β become less accurate (tables 2 and 4). For instance, when n = 20, the estimated relative bias of the MLE β̂ was 0.00039 for α = 0.10 and 0.02150 for α = 1.00, an increase of almost 56 times. Table 5 contains the mean of the 5000 estimates of c, denoted by c̄, as well as the proportion of the 5000 estimates ĉ which belonged to the set A defined previously. Note that all c̄ values for the shape parameter α belonged to A. Furthermore, these values are close to zero, especially in small samples (fewer than 40 observations), which suggests that c may be approximated by zero when estimating α; this corresponds to a constant bias function. Note also that the c̄ values for α were similar across all sample sizes. For instance, for n = 20, the values of c̄ were −0.0779, −0.0781, −0.0786, −0.0790 and −0.0793 for α = 0.10, 0.25, 0.50, 0.75 and 1.00, respectively. Still, many of the estimates ĉ corresponding to the shape parameter α belonged to A (about 50% in all cases considered), which explains the large variability of the estimator α̂_4 relative to the MLE α̂ (table 3). Note in table 5 that the c̄ values for the scale parameter β vary considerably. For instance, for n = 20, the c̄ values were −2.0242, 3.6810, 0.7426, 0.0686 and −0.3013 for α = 0.10, 0.25, 0.50, 0.75 and 1.00, respectively. As a result, we do not recommend the use of the estimator β̂_4, since such oscillations may lead to poor estimates of the slope of the bias function and hence to poor estimates of β. For instance, note in table 2 that, for n = 10 and α = 0.50, the estimated relative biases of β̂ and β̂_4 were, in absolute value, 0.01185 and 0.17898, respectively. Finally, as the value of the shape parameter α increases, the proportion of the estimates ĉ corresponding to the scale parameter β that belonged to A also increased significantly. For instance, for n = 10, the percentage was 28.54 for α = 0.10 and 57.60 for α = 1.00.
In order to further understand some of the results presented earlier, we have produced two figures (figures 2 and 3), containing estimated density plots and box plots. Density estimation was performed using kernel methods with a Gaussian kernel. Figure 2 corresponds to the estimation of α, whereas figure 3 corresponds to the estimation of β. Two values of α were used, namely 0.25 and 0.75. We note from these figures that the distributions of α̂_2 and β̂_2 (weighted bootstrap estimators) are clearly less peaked than the remaining ones, thus resulting in larger variance. The box plots also show the existence of atypical values, especially in the upper tail. The larger variability of α̂_4 and β̂_4 (linear parametric bootstrap estimators) stems from a few very atypical realizations in the upper tail of the distribution; their densities are not markedly less peaked than those of the parametric and nonparametric bootstrap estimators.
It is noteworthy that the weighted nonparametric bootstrap estimator, i.e., θ̂_2 = (α̂_2, β̂_2)^T (based on the resampling vector P*(·)), was clearly outperformed by the other bootstrap-based estimators. This contrasts with the numerical results in Cribari-Neto et al. [19], in the context of processing radar image data, where this estimator was the best performing among several bootstrap-based estimators.
The MLEs α̂ and β̂ of α and β are considerably biased (tables 1 and 2), especially in small samples, and we thus strongly recommend bias correction. To that end, we recommend the parametric bootstrap CBC estimator, which displayed the best finite-sample performance.

Application to real data
The source of the data is McCool [25]. The data describe the lifetime, in hours, of 10 sustainers of a certain type; they were used as an illustration of the three-parameter Weibull distribution in Cohen et al. [26] and are given in table 6. The MLEs of the parameters α and β and their corrected versions, together with their respective standard errors (in parentheses), are presented in table 7, where NP stands for nonparametric bootstrap and P for parametric bootstrap. Note that the corrected estimates α̂_1, α̂_2, α̂_3, α̂_4 and ᾱ are larger than the MLE α̂, and that the corrected estimates β̂_1, β̂_2, β̂_3, β̂_4 and β̄ are smaller than the MLE β̂. Note also that the estimates α̂_2 and β̂_2 have large standard errors relative to α̂ and β̂, respectively.

Concluding remarks
This paper considered different bootstrapping schemes (parametric and nonparametric) for correcting the biases of the MLEs of the parameters that index the two-parameter Birnbaum-Saunders distribution, as well as the corrected estimators proposed by Ng et al. [14]. The numerical evidence showed that the bias-correcting schemes are generally effective, even when the sample size is small.
The best performing estimator was the parametric bootstrap CBC estimator. We therefore strongly recommend that practitioners use this estimator when modeling data with the Birnbaum-Saunders distribution.