Statistics-based Bayesian Modeling Framework for Uncertainty Quantification and Propagation

A new Bayesian modeling framework is proposed to account for the uncertainty in the model parameters arising from model and measurement errors, as well as experimental, operational, environmental and manufacturing variabilities. Uncertainty is embedded in the model parameters using a single-level hierarchy where the uncertainties are quantified by Normal distributions with the mean and the covariance treated as hyperparameters. Unlike existing hierarchical Bayesian modeling frameworks, the likelihood function for each observed quantity is built on the Kullback–Leibler divergence, which quantifies the discrepancy between the probability density functions (PDFs) of the model predictions and the measurements. The likelihood function is constructed assuming that this discrepancy for each measured quantity follows a truncated normal distribution. For Gaussian PDFs of measurements and response predictions, the posterior PDF of the model parameters depends on the lower two moments of the respective PDFs. This representation of the posterior is also used for non-Gaussian PDFs of measurements and model predictions to approximate the uncertainty in the model parameters. The proposed framework can handle the situation where only PDFs or statistical characteristics are available for the measurements. The propagation of uncertainties is accomplished through sampling. Two applications demonstrate the use and effectiveness of the proposed framework. In the first one, structural model parameter inference is considered using simulated statistics for the modal frequencies and mode shapes. In the second one, uncertainties in the parameters of the probabilistic S-N curves used in fatigue are quantified based on experimental data.


Introduction
The general Bayesian statistical framework proposed by Beck and Katafygiotis [1] provides a rigorous mathematical means to address the model updating problem under uncertainty. Based on this framework, numerous studies have addressed applications such as parameter estimation [2][3][4][5][6][7], model selection [8,9], damage identification [10][11][12], and robust uncertainty propagation [13,14]. Among these, parameter estimation has received the most attention because it serves as the foundation for the other applications. Bayesian parameter inference is accomplished by embedding a parameterized probabilistic model to describe the discrepancy between model predictions and measurements. The posterior distribution of the structural and prediction error model parameters is then formulated as proportional to the product of the likelihood function and the prior distribution of the model parameters. To estimate the posterior distribution, the likelihood function is usually built from a functional relation between model predictions and measurements by defining a probabilistic structure for the prediction error.
For industrial applications, the parameter estimates obtained from different measurements show distinct variations. These variations usually arise from load uncertainty, model error, measurement noise, and changing environmental/operational conditions [15][16][17]. Variations in the parameters of a model introduced to simulate a population of identically manufactured structures also arise from manufacturing variabilities [18,19]. Therefore, it is important to describe these variations. The hierarchical Bayesian modeling (HBM) framework [20][21][22][23][24][25][26] has been proposed to quantify the uncertainty in the model parameters and prediction errors due to the aforementioned variabilities. The core of HBM is a parameterized prior distribution of the model parameters, in which an extra layer of hyperparameters describes the variation of the model parameters.
In this paper, a new probabilistic model is proposed within the Bayesian framework. The main difference lies in the principle used to build the likelihood function, which is based on the relationship between the statistics of the model predictions and those of the measurements for each model output. Compared with HBM, it does not require collecting datasets: the measurements can be PDFs or statistics of the measured quantities, so the approach is more widely applicable. It can also handle cases in which only statistics such as the mean and variance are available from the measurements. In such cases it is unreasonable to build datasets from samples generated from these statistics and then apply existing conventional or hierarchical Bayesian modeling frameworks.
This paper is organized as follows. In Section 2, the proposed Bayesian modeling framework is described in detail, including the construction of the probabilistic model, the uncertainty quantification of the parameters, and the uncertainty propagation to quantities of interest (QoI). The application to structural dynamics based on measured modal properties is presented in Section 3. In Section 4, a three-DOF spring-mass chain system is taken as a simulated example to illustrate the effectiveness of the proposed framework. The application to the parameter inference and uncertainty quantification of probabilistic S-N curves used in fatigue damage accumulation is given in Section 5. Conclusions are presented in Section 6.
Figure 1 shows the structure of the proposed probabilistic modeling framework. Assume a parameterized model of a structural system and let $q_k(\theta)$, $k = 1, \ldots, n_q$, be the model predictions for $n_q$ output quantities, where $\theta$ is the model parameter vector to be identified from measurements available for these output quantities. To account for model error and environmental/operational variabilities in the model predictions, an additive error term $e_k$ is considered so that the predictions from the model are taken as

Probabilistic model
where $w_k$ is a weighting factor that scales the error terms $e_k$. Uncertainties are embedded in the model parameter set $\theta$ by assigning to it a Gaussian distribution with mean vector $\mu_\theta$ and covariance matrix $\Sigma_\theta$. The weighting factor $w_k$ can be chosen to correspond to a measure of the intensity of the respective measured or response quantity, such as the mean of the measured or response quantity.

Figure 1 Proposed probabilistic modeling framework
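To distinguish them from the model parameters $\theta$ and the error term $e_k$, the parameters of these distributions are referred to as hyperparameters and denoted by $h$. Also, if the number of data $n_d$ is not large enough, the measurement PDF of $y_k$ can be assumed to be a Gaussian distribution with mean and variance calculated from the data set $D_k$.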
Given the measurement PDF of $y_k$ and the model-predicted PDF $p(q_k \mid h)$, the discrepancy between them is quantified by the Kullback-Leibler divergence (KL-div) [31]. Because the KL-div is asymmetric, a symmetric measure of the discrepancy can be used instead, given for Gaussian PDFs by Eq. (8). Note that the first two terms of Eq. (8) measure the error between the variance of the experimental value and the variance predicted from the model. These two terms become zero when the variance $\sigma_{y_k}^2$ of the experimental value equals the variance $\sigma_{q_k|h}^2$ of the model prediction. The last term gives the error between the mean of the experimental value and the mean of the model predictions, and it vanishes when the two means are equal. The discrepancy as defined by the KL-div is therefore a weighted sum of the discrepancies between the variances and between the means of the two PDFs. The KL-div thus provides a rational way to assign these weights, which one would otherwise have to select arbitrarily.
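As a worked illustration of the three-term structure described above, the following sketch derives a symmetric discrepancy for two univariate Gaussian PDFs. It assumes that the symmetric measure is the plain sum of the two directed KL divergences; the symbol $\delta_k$ is introduced here only for illustration, and the original Eq. (8) may differ from this expression by a constant factor.

```latex
% Directed KL divergence between N(\mu_1,\sigma_1^2) and N(\mu_2,\sigma_2^2):
\mathrm{KL}(p_1 \,\|\, p_2)
  = \ln\frac{\sigma_2}{\sigma_1}
  + \frac{\sigma_1^2 + (\mu_1-\mu_2)^2}{2\sigma_2^2} - \frac{1}{2}

% Adding the two directions cancels the logarithms. With
% p_1 = measurement PDF (\mu_{y_k}, \sigma_{y_k}^2) and
% p_2 = model-predicted PDF (\mu_{q_k|h}, \sigma_{q_k|h}^2):
\delta_k
  = \frac{1}{2}\left(\frac{\sigma_{y_k}^2}{\sigma_{q_k|h}^2} - 1\right)
  + \frac{1}{2}\left(\frac{\sigma_{q_k|h}^2}{\sigma_{y_k}^2} - 1\right)
  + \frac{1}{2}\left(\frac{1}{\sigma_{y_k}^2} + \frac{1}{\sigma_{q_k|h}^2}\right)
    \left(\mu_{y_k} - \mu_{q_k|h}\right)^2
```

In this form the first two terms vanish when the two variances coincide, the third vanishes when the two means coincide, and the sum of the first two terms is never negative because $r + 1/r \ge 2$ for any variance ratio $r > 0$.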
For non-Gaussian PDFs arising from nonlinear models, Eq. (8) for the KL-div can also be used as an approximate measure of the discrepancy between the two PDFs in terms of their first two moments. Alternatively, for nonlinear models, the integrals in the KL-div can be approximated by Monte Carlo (MC) sample estimates, Eq. (9). However, estimating the KL-div from Eq. (9) requires a large number of samples and can be computationally very demanding. Simplified approximations, such as Eq. (8), based on the first two moments of the non-Gaussian PDFs are therefore preferred.
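The trade-off between a moment-based expression and a sample-based estimate can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the numerical values, and the choice of the symmetric measure as the sum of both directed divergences are assumptions made for illustration. Both PDFs are taken as Gaussian here so that the sample average can be checked against the closed form.

```python
import numpy as np

def sym_kl_gauss(mu_y, var_y, mu_q, var_q):
    """Moment-based symmetric KL (sum of both directed KLs) for 1-D Gaussians."""
    return (0.5 * (var_y / var_q - 1.0)
            + 0.5 * (var_q / var_y - 1.0)
            + 0.5 * (1.0 / var_y + 1.0 / var_q) * (mu_y - mu_q) ** 2)

def log_gauss(x, mu, var):
    """Log-density of N(mu, var) evaluated at x."""
    return -0.5 * np.log(2.0 * np.pi * var) - 0.5 * (x - mu) ** 2 / var

def sym_kl_mc(x_y, logpdf_y, x_q, logpdf_q):
    """Sample-average estimate of KL(y||q) + KL(q||y)."""
    kl_yq = np.mean(logpdf_y(x_y) - logpdf_q(x_y))
    kl_qy = np.mean(logpdf_q(x_q) - logpdf_y(x_q))
    return kl_yq + kl_qy

rng = np.random.default_rng(0)
mu_y, var_y = 1.0, 0.04   # hypothetical measurement statistics
mu_q, var_q = 1.1, 0.09   # hypothetical model-prediction statistics
x_y = rng.normal(mu_y, np.sqrt(var_y), 200_000)
x_q = rng.normal(mu_q, np.sqrt(var_q), 200_000)

closed_form = sym_kl_gauss(mu_y, var_y, mu_q, var_q)
mc_estimate = sym_kl_mc(
    x_y, lambda x: log_gauss(x, mu_y, var_y),
    x_q, lambda x: log_gauss(x, mu_q, var_q),
)
print(closed_form, mc_estimate)
```

For a nonlinear model the log-density of the predictions would typically not be available in closed form and would itself have to be estimated from samples, which adds to the cost of the sample-based route.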

Estimation of hyper parameters uncertainty
The first task of uncertainty quantification is to identify the hyperparameters in the proposed probabilistic model. This is accomplished by introducing a probabilistic model to represent the discrepancy variables. According to the relationship between measurements and model predictions described by Eq. (7), Bayes' theorem is applied to infer the posterior distribution of the hyperparameters. Using the assumed truncated normal distribution for the discrepancy of each measured quantity, with all discrepancies non-negative, and substituting Eqs. (11) and (12), the posterior distribution is expressed in terms of the function in Eq. (14), which is the mean square discrepancy function formed from the individual discrepancies for each measured property. It should be noted that this function stabilizes to a finite value as the number of measured output quantities increases. Given the prior distribution, samples of the hyperparameters can then be generated from the posterior distribution.
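A minimal Python sketch of such a posterior is given below for a hypothetical scalar-parameter linear model. Several ingredients are assumptions rather than the paper's equations: the truncated normal is taken as a zero-mean normal truncated to non-negative values with scale `sigma_d`, the discrepancy is the sum of both directed KL divergences between Gaussians, the prior is uniform on a box, and the names `a`, `w`, `sd_e` and `sigma_d` are illustrative only.

```python
import numpy as np

def sym_kl_gauss(mu_y, var_y, mu_q, var_q):
    """Symmetric KL (sum of both directed KLs) between 1-D Gaussians."""
    return (0.5 * (var_y / var_q - 1.0)
            + 0.5 * (var_q / var_y - 1.0)
            + 0.5 * (1.0 / var_y + 1.0 / var_q) * (mu_y - mu_q) ** 2)

def log_likelihood(deltas, sigma_d):
    """Independent zero-mean normals truncated to [0, inf) (assumed form)."""
    deltas = np.asarray(deltas, dtype=float)
    n = deltas.size
    J = np.mean(deltas ** 2)  # mean square discrepancy over all outputs
    log_norm = np.log(2.0) - 0.5 * np.log(2.0 * np.pi) - np.log(sigma_d)
    return n * log_norm - n * J / (2.0 * sigma_d ** 2)

def log_posterior(h, a, w, mu_y, var_y, lower, upper):
    """h = (mu_theta, sd_theta, sd_e, sigma_d); uniform prior on [lower, upper]."""
    h = np.asarray(h, dtype=float)
    if np.any(h < lower) or np.any(h > upper):
        return -np.inf
    mu_t, sd_t, sd_e, sigma_d = h
    # Hypothetical linear model q_k = a_k * theta + w_k * e_k
    mu_q = a * mu_t
    var_q = (a * sd_t) ** 2 + (w * sd_e) ** 2
    deltas = sym_kl_gauss(mu_y, var_y, mu_q, var_q)
    return log_likelihood(deltas, sigma_d)

# Hypothetical "measured" statistics generated from known hyperparameters
a = np.array([1.0, 2.0])
w = np.array([1.0, 2.0])
mu_y = a * 1.0
var_y = (a * 0.10) ** 2 + (w * 0.05) ** 2
lower = np.array([0.5, 0.01, 0.01, 0.01])
upper = np.array([1.5, 0.50, 0.50, 1.00])

lp_true = log_posterior([1.0, 0.10, 0.05, 0.1], a, w, mu_y, var_y, lower, upper)
lp_off = log_posterior([1.2, 0.10, 0.05, 0.1], a, w, mu_y, var_y, lower, upper)
lp_out = log_posterior([2.0, 0.10, 0.05, 0.1], a, w, mu_y, var_y, lower, upper)
print(lp_true, lp_off, lp_out)
```

A function of this kind can be passed to any sampler that accepts a log-posterior. Under the stated assumptions, the likelihood depends on the data only through the mean square discrepancy `J`.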

Uncertainty quantification of structural model parameters and error term
The posterior distribution of the model parameters can be derived using the total probability theorem. To calculate the multi-dimensional integrals efficiently, the univariate dimension reduction method (UDRM) [33] is introduced to approximate them. The UDRM involves an additive decomposition of a multivariate response function into multiple univariate functions, so that the multi-dimensional integrals required for the response moments are approximated by a series of one-dimensional integrals of these univariate functions. According to the UDRM decomposition in Eq. (23), the mean and variance can be approximated by a series of one-dimensional integrals of univariate functions, as in Eq. (24) [33]. These integrals are easily evaluated by numerical integration, which saves considerable computation.
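The additive decomposition can be sketched in Python as follows. This is an illustrative variant, not a reproduction of Eqs. (23) and (24): it assumes independent Gaussian inputs, evaluates each one-dimensional integral by Gauss-Hermite quadrature, and computes the variance of the additive surrogate as the sum of the variances of its univariate terms. The function name `udrm_moments` and the test function `g` are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def udrm_moments(g, mu, sd, n_nodes=7):
    """Mean and variance of g(x) for independent x_i ~ N(mu_i, sd_i^2),
    using the surrogate  g(x) ~ sum_i g(mu_1,..,x_i,..,mu_n) - (n-1) g(mu)."""
    mu = np.asarray(mu, dtype=float)
    sd = np.asarray(sd, dtype=float)
    nodes, weights = hermegauss(n_nodes)       # weight function exp(-z^2/2)
    weights = weights / np.sqrt(2.0 * np.pi)   # normalise so weights sum to 1
    mean = -(mu.size - 1) * g(mu)
    var = 0.0
    for i in range(mu.size):
        vals = np.empty(n_nodes)
        for j, z in enumerate(nodes):
            x = mu.copy()
            x[i] = mu[i] + sd[i] * z           # vary one input, others at mean
            vals[j] = g(x)
        m1 = np.dot(weights, vals)             # E[g_i(x_i)]
        m2 = np.dot(weights, vals ** 2)        # E[g_i(x_i)^2]
        mean += m1
        var += m2 - m1 ** 2                    # univariate terms are independent
    return mean, var

def g(x):
    """Hypothetical nonlinear response used only to exercise the routine."""
    return x[0] ** 2 + 3.0 * x[1]

mean_g, var_g = udrm_moments(g, mu=[1.0, 2.0], sd=[0.2, 0.5])
print(mean_g, var_g)
```

For `n` inputs this needs `n * n_nodes + 1` model evaluations instead of a full `n`-dimensional quadrature grid. The result is exact when the response is additive in its inputs, as in the test function here, and approximate otherwise.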

Simulated Example
A population of 3-DOF linear systems manufactured to be identical is taken as a simulated example. Due to manufacturing variabilities, the properties of the system, such as the stiffness, vary for each member in the population. The modal properties of the members of the population are chosen as measured quantities to study the effectiveness of the proposed probabilistic modeling framework. Consider a three-DOF model, shown in Figure 2, introduced to represent each member in the population.

Uncertainty quantification
The properties of each member in the population are simulated as follows. To consider the variation of model parameters from member to member due to manufacturing variability, the stiffness of link $i$ of each member is generated from a Gaussian distribution, using $10^3$ MC samples. The error terms defined in Eq. (21) are assumed to follow a Normal distribution with zero mean and standard deviation equal to 0.05, corresponding to a 5% model error. The mean and variance of the modal properties are then computed and listed in Table 2, serving as the known statistics of the measurements. The uncertainty quantification for the stiffness parameters can then be conducted according to the methodology presented in Section 3.
The 3-DOF model shown in Figure 2 is used to represent each member in the group. To take into account the variation in the model properties, the stiffness of each link is parameterized. Based on Table 3, the nested sampling algorithm [32] is implemented to generate samples of the hyperparameters. The results are shown in Figure 3, while the mean and standard deviation of the posterior distributions of the hyperparameters are summarized in Table 4. The corresponding statistics for the model parameters and error terms are given in Table 5, where they are compared with the nominal values assigned to simulate the measurements. It can be seen that the mean and standard deviation of the posterior distributions of the model parameters and error terms are very close to the nominal values. The samples of the discrepancy parameter are close to zero, which indicates that the discrepancy between the PDFs of the measurements and the predictions is small.
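The way such measurement statistics can be simulated is sketched below in Python. The details of Figure 2 are not reproduced here, so the configuration is an assumption: a fixed-free chain with unit masses, a hypothetical nominal stiffness of 1.0 with 5% standard deviation, and an error term applied multiplicatively to each modal frequency. Only modal frequencies are treated; mode shapes are omitted for brevity.

```python
import numpy as np

def modal_frequencies(k, m=1.0):
    """Natural frequencies (rad/s) of a fixed-free 3-DOF chain with equal masses."""
    k1, k2, k3 = k
    K = np.array([[k1 + k2, -k2, 0.0],
                  [-k2, k2 + k3, -k3],
                  [0.0, -k3, k3]])
    # With M = m * I the eigenvalues of K / m are the squared frequencies
    return np.sqrt(np.linalg.eigvalsh(K / m))

rng = np.random.default_rng(0)
n_mc = 1000                  # number of simulated population members
k_mean, k_sd = 1.0, 0.05     # hypothetical stiffness statistics
err_sd = 0.05                # 5% error term, as stated in the text

freqs = np.empty((n_mc, 3))
for s in range(n_mc):
    k = rng.normal(k_mean, k_sd, size=3)      # member-to-member variability
    e = rng.normal(0.0, err_sd, size=3)       # error term per frequency
    freqs[s] = modal_frequencies(k) * (1.0 + e)

freq_mean = freqs.mean(axis=0)                # statistics playing the role of
freq_var = freqs.var(axis=0, ddof=1)          # the measured mean and variance
print(freq_mean, freq_var)
```

Only `freq_mean` and `freq_var` would be passed on to the inference step, which reflects the point that the framework works from statistics rather than from the individual samples.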

Uncertainty propagation
Using the samples of $\theta$ and of the error terms, the mean and variance of the modal properties are predicted and listed in Table 6. These values should be compared with the mean and variance of the measurements in Table 2. The predictions are accurate, with a maximum relative error of less than 4%. The predicted PDFs of the modal properties are also computed and compared with the Gaussian PDFs of the measurements in Figure 5. Very good agreement is again observed, validating the effectiveness of the proposed methodology in identifying the model parameters. It should be noted that such uncertainty bounds are expected to be narrow for the classical Bayesian framework based on multiple datasets for the modal properties [34], because in classical Bayesian approaches the level of uncertainty is expected to decrease as the number of data increases.
Furthermore, based on the identified modal properties, the response time history of displacement, velocity or acceleration can also be predicted. For this, a zero-mean discrete Gaussian white noise base excitation with standard deviation 1, shown in Figure 6, is applied, and the mean and variance of the displacement time history of the third DOF are estimated, taking into account the uncertainties in the model parameters and error terms assumed to simulate the measurements in Section 4.2. Modal analysis is used to perform the corresponding predictions based on the predicted modal properties, with the error terms taken into consideration. For comparison, the mean and the 90% credible interval bounds obtained from the measurements and from the predictions are shown in Figure 7. The model predictions match the measurements very well.
When the error terms of the modal properties are not considered in the predictions, the predicted uncertainty intervals are narrower than the measurement uncertainty intervals, as shown in Figure 8. This indicates that the error terms must be included in the propagation analysis for accurate model predictions.

Figure 7
Comparison between measured and predicted results for the displacement time history at DOF 3

Figure 8
Comparison between measured and predicted results for the displacement time history at DOF 3 without consideration of error terms in prediction

Estimation of S-N Curve Model Parameters using Fatigue Data
In this section, the experimental data from fatigue tests are used to infer the uncertainties in the model parameters of the S-N curves. As the S-N data is usually modelled by a linear model, the equations derived for the linear model in Section 2.1 are directly applicable.

Model description
Material fatigue performance is commonly characterized in the form of an S-N curve, which is usually described by Basquin's relation [35]
$N_k = A S_k^{B}$ (25)
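To make the link to a linear model explicit, the following sketch takes logarithms of Basquin's relation. The parameter vector $\theta = (\log A, B)$, the regressor vector $a_k$, and the error variance $\sigma_e^2$ are notation introduced here for illustration, and the additive weighted error term is assumed to enter as in the probabilistic model of Section 2.

```latex
% Taking logarithms of N_k = A S_k^B gives a relation linear in (log A, B):
\log N_k = \log A + B \,\log S_k = a_k^{T}\theta ,
\qquad a_k = \begin{bmatrix} 1 \\ \log S_k \end{bmatrix},
\quad \theta = \begin{bmatrix} \log A \\ B \end{bmatrix}

% With \theta \sim N(\mu_\theta, \Sigma_\theta) and an independent zero-mean
% error term w_k e_k, e_k \sim N(0, \sigma_e^2), the output q_k = \log N_k is
% Gaussian with
\mu_{q_k|h} = a_k^{T}\mu_\theta ,
\qquad
\sigma_{q_k|h}^{2} = a_k^{T}\Sigma_\theta\, a_k + w_k^{2}\sigma_e^{2}
```

Under these assumptions the predicted PDF at each stress level is fully described by its first two moments, so it can be compared directly with the measured mean and variance at that stress level.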

Uncertainty quantification using fatigue tests
The data listed in Table 7, taken from [37], are used to infer the model parameters. Fatigue tests were conducted with standard plate specimens of aluminum alloy 2524-T3 under four stress levels with about 15 observations each. The data from each stress level are assumed to follow a Gaussian distribution, and the mean and standard deviation are computed as the measured statistics. Based on the probabilistic model described above, the proposed probabilistic modeling framework can be implemented. The prior distributions of all the hyperparameters are taken as uniform, with upper and lower bounds listed in Table 8, and the inference results are reported in Table 9. Based on Eq. (17), the posterior distributions of the model parameters are shown in Figure 10. Using the samples in Figure 9, the uncertainties in the estimates of the hyperparameters are of the order of 0.83%

Figure 10
Samples of posterior distribution of  ,  and 

Uncertainty propagation
Using the samples of the model parameters generated by Eq. (17), samples of $q_k$ for various stress levels are predicted and the 90% credible interval is estimated. Results are shown in Figure 11 and compared with the measured data available for the four stress levels. These 90% credible intervals take into account the uncertainty in the hyperparameters. Results are also presented for the 90% credible intervals estimated by ignoring the uncertainties in the hyperparameters. This is achieved by drawing samples from the distribution $N(\theta \mid \mu_\theta, \Sigma_\theta)$, with the hyperparameters fixed at their most probable values (MPV), and propagating these samples to predict the fatigue life for the different stress levels in Figure 11. It can be seen that the 90% uncertainty intervals that consider the uncertainties in the hyperparameters are narrow enough and contain the fatigue data available at the four stress levels. Moreover, these uncertainty intervals are comparable to the uncertainty intervals obtained by methods based on HBM [38,39]. The uncertainty intervals computed using the MPV of the hyperparameters, which ignore the uncertainties in the hyperparameters, are narrower and contain a large percentage of the fatigue data. It is evident, however, that propagation based on the MPV of the hyperparameters fails to contain all the data. The discrepancy between the two credible intervals is expected to decrease as fatigue data from more than four stress levels are included.
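The difference between the two propagation strategies can be sketched in Python with synthetic numbers. Everything numerical below is hypothetical: the stress level, the MPV of the hyperparameters, and the spread of the posterior hyperparameter samples are placeholders, not values from Tables 8 or 9, and the two model parameters are treated as independent for simplicity.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
a = np.array([1.0, np.log10(300.0)])   # regressors [1, log10 S] at a hypothetical stress

# Hypothetical MPV of the hyperparameters of theta = (log10 A, B)
mu_mpv = np.array([12.0, -3.0])
sd_mpv = np.array([0.10, 0.02])

# Hypothetical posterior samples of the hyperparameters around the MPV
mu_samples = mu_mpv + rng.normal(0.0, [0.05, 0.01], size=(n, 2))
sd_samples = np.abs(sd_mpv + rng.normal(0.0, [0.01, 0.002], size=(n, 2)))

# (i) Full propagation: one theta per posterior hyperparameter sample
theta_full = rng.normal(mu_samples, sd_samples)
# (ii) Plug-in propagation: hyperparameters fixed at the MPV
theta_mpv = rng.normal(mu_mpv, sd_mpv, size=(n, 2))

q_full = theta_full @ a                # predicted log10 N
q_mpv = theta_mpv @ a

ci_full = np.percentile(q_full, [5.0, 95.0])
ci_mpv = np.percentile(q_mpv, [5.0, 95.0])
width_full = ci_full[1] - ci_full[0]
width_mpv = ci_mpv[1] - ci_mpv[0]
print(ci_full, ci_mpv)
```

In this sketch the plug-in interval is narrower because the spread of the hyperparameter samples is not carried through, which is the qualitative behavior described for Figure 11.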

Figure 11
Predicted 90% credible intervals and comparison with measured fatigue data

Finally, from the results in Figure 9, it is observed that the values of the discrepancy parameter are not close to zero, which means that there are some discrepancies between the predicted PDFs and the measured PDFs. This is also visible in Figure 11, as well as in Figure 12, which compares, for each stress level, the Gaussian PDF of $q_k$ predicted by the model with the Gaussian PDF based on the measurements. For the model predictions, the PDF corresponding to the most probable values of the hyperparameters is presented along with the PDF that takes into account the uncertainties in the hyperparameters. It can be seen that the most probable values of the two PDFs in each figure are consistent, while the uncertainty predicted using only the MPV is narrower than that obtained when the uncertainties of the hyperparameters are considered. The reason for the discrepancies between the measured and model-predicted PDFs is that the variation of the available fatigue data across the four stress levels deviates from the linear model, so that the predictions from the linear model cannot exactly reproduce the mean and variance of the experimental data for all four stress levels simultaneously.

Figure 12
Comparison of PDF of experimental data and model predicted PDFs for the four stress levels

Conclusion
The new Bayesian inference method proposed in this work addresses the underestimation of the uncertainty in the model parameters due to model error, mentioned in [24], which arises when multiple measurements are available for a structure or when measurements are available for members of a population of identically manufactured structures [18,19]. The proposed method offers an alternative to the HBM methods recently proposed in the literature [21,25] for correctly addressing the uncertainty in the model parameters. In the proposed framework, uncertainties are embedded in the model parameters by assigning a Normal distribution whose mean and covariance constitute the hyperparameters to be estimated using Bayesian inference. The posterior distribution of the hyperparameters is computed directly by applying Bayes' theorem to KL-div measures between the model-predicted PDF and the PDF of the experimental data. In particular, the proposed framework is applicable when only the statistics of the measured quantities are available. Computationally efficient and insightful analytical expressions for the posterior distribution of the hyperparameters were developed for the case in which the PDFs are approximated by Normal distributions. Normal distributions for the predictions arise when the output QoI depend linearly on the model parameters. In this case the posterior PDF of the model parameters depends on the lower two moments of the respective PDFs. This representation of the posterior is also used for non-Gaussian PDFs to approximate the uncertainty in the model parameters. For nonlinear relations between the output QoI and the model parameters, the univariate dimension reduction method (UDRM) is used to efficiently estimate the multi-dimensional integrals involved in the lower two moments of the model-predicted PDF.
An application to structural dynamics based on measured modal properties from a group of identically manufactured 3-DOF systems is presented using simulated data to illustrate the proposed framework. The effectiveness of the methodology is demonstrated by the fact that the estimates of the hyperparameters, model parameters, and uncertainties recover the values used to simulate the data. The method is also applied to the quantification of uncertainties in the parameters of S-N curves based on experimental data from fatigue tests, demonstrating that the proposed framework yields results competitive with alternative methods based on HBM.