Data-driven simulation for general purpose multibody dynamics using deep neural networks

In this paper, a machine learning-based simulation framework for general-purpose multibody dynamics is introduced. The aim of the framework is to generate a well-trained meta-model of multibody dynamics (MBD) systems. To this end, a deep neural network (DNN) is employed to construct a data-based meta-model representing multibody systems. Constructing a well-defined training data set that includes the time variable is essential to obtain accurate and reliable motion data such as displacements, velocities, accelerations, and forces. With the introduced approach, the meta-model provides motion estimates of the system dynamics without solving the analytical equations of motion. The performance of the proposed DNN meta-modeling is evaluated on several MBD systems.


Introduction
Using Machine Learning (ML) with big data is an important subject in science and engineering. This is because ML is effective at handling and interpreting big data sets for the purpose of finding patterns in the data.
In particular, a Deep Neural Network (DNN), which is an Artificial Neural Network (ANN) with multiple hidden layers between the input and output layers, can represent complex nonlinear functions of multi-dimensional input data. DNNs have been successfully used in a large number of practical applications. A well-trained neural network then provides precise pattern recognition based on data sets in real time.
These features of ML approaches, big data recognition and real-time estimation of nonlinear functions, are attractive to dynamics and control engineers who handle nonlinear system dynamics with real-world data. There have been several previous studies on applying ML, DNN, or other big-data handling techniques to rigid multibody system problems. For example, Bayesian formulations [4,13,15] in combination with Markov random field approximation, Kalman filters, or particle filters have been applied to various multibody dynamics (MBD) problems to handle noisy data effectively in real-life applications, generate reliable models at an efficient computational cost, estimate multibody systems in a probabilistic sense, or identify nonlinear parameters in governing equations. ML approaches [14,16,19,17,18] such as regression methods, reinforcement learning algorithms, and surrogate models have also been employed. Many different types of regression methods can be used in ML. In addition to the simple linear regression model, one can select techniques such as polynomial regression, support vector regression, decision tree regression, and random forest regression to suit a given problem. Based on the investigated input-label values, surrogate models perform a probabilistic estimate of an unknown objective function. This is an approach that uses an interpretable model to describe complex models. The most commonly used surrogate model is the Gaussian process. The methods proposed in these studies enhanced the accuracy of prediction, especially over long time scales, and increased computational efficiency in simulating the dynamic response of multibody systems. Moreover, neural networks [19,20,21,22,23] have been suggested as effective alternatives to conventional algorithms for multibody dynamics simulation. These approaches have been shown to be fast and reliable in describing and predicting the characteristics of multibody systems.
It is important to note that previous studies [4,17,18,20,21,22,23] are focused on particular MBD problems, mainly on contact, railways, vehicles, gaits, robotics, or tracking. Accordingly, a general MBD problem has not yet been introduced and analyzed through DNN techniques.
To address these shortcomings, this study introduces a procedure to generate a solver based on a DNN meta-model for general-purpose multibody systems, which allows MBD to be predicted with high accuracy in real time. Among the various ML methods, a supervised learning technique is used for the mathematical and/or numerical data set of the MBD model in the training process. The data preparation and training process is called the off-line stage, and its trained result is known as the meta-model. Using the meta-model, time-varying results such as the displacement, velocity, and acceleration of the multibody system can be estimated without directly solving the governing equations of MBD; this estimation process is called the on-line stage. In particular, feed forward networks (FFN) with hidden layers and nonlinear activation functions are employed among the various DNN methods, since they can efficiently represent continuous functions. Three representative MBD problems, the single pendulum, the double pendulum, and the slider crank mechanism, are considered to evaluate the performance of the proposed DNN-based meta-modeling framework. To obtain a reliable meta-model, a sufficient and accurate training data set of MBD is a prerequisite, and random search is also important for defining appropriate hyper-parameters such as the number of hidden layers, the batch size, the number of epochs, and the optimizer. In particular, the numerical results imply that the placement of the time variable as input or output data is crucial for obtaining a usable transient response of MBD.
In Section 2, the governing equations of MBD are briefly reviewed. In Section 3, an overview of neural networks for MBD and the meta-modeling process is presented. It should be noted that the framework of the proposed meta-modeling provides fundamental ideas for handling experimental or real-world data and exploiting their structures and relations to understand the dynamics of general multibody systems. Regardless of the complexity of MBD systems, the present meta-modeling helps to achieve real-time and robust simulations with accurate motion results. In addition, high-level engineering simulations can be employed not only for engineering design, but also for motion-related Internet of Things (IoT) applications. Section 4 describes the case studies of the meta-modeling process using the single pendulum, the double pendulum, and the slider crank mechanism. Conclusions are given in Section 5.

Brief Review on Common General Purpose MBD Governing Equations
Multibody system dynamics offers a straightforward approach to constructing and solving equations of motion for mechanical systems. Multibody system dynamics includes a large number of procedures that can be categorized based on the coordinates used. In topological approaches, such as the semi-recursive formulation, relative coordinates between the bodies are used. In the global approaches, in turn, a set of coordinates defines each body of the system. It is important to note that although topological and global approaches both lead to identical dynamic responses, their numerical performance differs. In this section, frequently used global methods are briefly reviewed.
In the augmented formulation, constraint equations are accounted for in the equations of motion by employing Lagrange multipliers. In this approach, the equations of motion can be written as

\[
\begin{bmatrix} M & C_q^{T} \\ C_q & 0 \end{bmatrix}
\begin{bmatrix} \ddot{q} \\ \lambda \end{bmatrix}
=
\begin{bmatrix} F_a \\ F_c \end{bmatrix}, \tag{1}
\]

where $M$ is the mass matrix, $C$ is the constraint vector, $C_q$ is the Jacobian matrix of the constraint vector $C$, $F_a$ is the vector of applied generalized forces, and $F_c$ is the vector obtained by differentiating the constraints twice with respect to time. The equations of motion are solved to obtain the generalized coordinates $q$ and the Lagrange multipliers $\lambda$.
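The other commonly used form of the equations of motion for multibody systems is obtained by applying the embedding technique to the global coordinates in (1). The embedding technique reduces the generalized coordinates to be solved from $q$ to a set of independent generalized coordinates $q_{ind}$. In practice, this reduction can be accomplished using a transformation matrix $T$:

\[ \ddot{q} = T\,\ddot{q}_{ind} + r, \tag{2} \]

where $r$ is a remainder vector. Substituting (2) into the first row of the augmented system (1) and pre-multiplying by $T^{T}$ yields

\[ T^{T} M T\,\ddot{q}_{ind} + T^{T} M r + T^{T} C_q^{T} \lambda = T^{T} F_a. \tag{3} \]

By applying the identity $T^{T} C_q^{T} = 0$, equation (3) can be simplified into

\[ \bar{M}\,\ddot{q}_{ind} = T^{T}\left(F_a - M r\right), \tag{4} \]

where $\bar{M} := T^{T} M T$.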
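As a small numerical illustration of the augmented formulation, the following sketch assembles the block system built from the mass matrix, the constraint Jacobian, and its transpose, and solves it for the accelerations and the Lagrange multiplier. The example system (a point-mass pendulum in Cartesian coordinates with the constraint $x^2 + y^2 - L^2 = 0$), the function name, and the numerical values are assumptions chosen for illustration; they are not taken from the paper.

```python
import numpy as np

def augmented_accelerations(q, qd, m=0.3, g=9.81):
    """Solve the augmented system for a point-mass pendulum (illustrative setup).

    Assumptions: q = (x, y) lies on the constraint C = x^2 + y^2 - L^2 = 0,
    gravity acts in the negative y direction.
    """
    x, y = q
    M = m * np.eye(2)                       # mass matrix
    Cq = np.array([[2.0 * x, 2.0 * y]])     # constraint Jacobian dC/dq
    Fa = np.array([0.0, -m * g])            # applied generalized forces (gravity)
    # Differentiating C twice in time gives Cq @ qdd = -2 (xd^2 + yd^2) =: Fc
    Fc = np.array([-2.0 * (qd[0] ** 2 + qd[1] ** 2)])
    # Block system [[M, Cq^T], [Cq, 0]] [qdd; lam] = [Fa; Fc]
    A = np.block([[M, Cq.T], [Cq, np.zeros((1, 1))]])
    b = np.concatenate([Fa, Fc])
    sol = np.linalg.solve(A, b)
    return sol[:2], sol[2]

# Pendulum hanging at rest below the pivot (L = 1): no acceleration is expected,
# and the multiplier balances gravity.
qdd, lam = augmented_accelerations(q=(0.0, -1.0), qd=(0.0, 0.0))
```

In this rest configuration the constraint force alone carries the weight, so the accelerations vanish and the multiplier equals $mg/(2L)$ for the chosen form of the constraint.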

Deep Neural Network for Multibody Dynamics Systems
In this section, a brief introduction to DNN that will be used in numerical examples is presented, and training of the DNN for MBD systems is also described.
Machine Learning (ML) aims to develop technologies and algorithms that enable computers to analyze and predict the mechanisms of a system by learning the structures of large amounts of data. ML allows important tasks to be performed by generalizing from examples [12]. ML has already powered many aspects of modern society, from web searches and item recognition to image classification, speech recognition [11], and cyber-physical systems (CPS). Being a part of ML, Artificial Neural Networks (ANN) are clusters of nodes (or neurons) designed to mimic the decision-making process of the human brain, see Fig. 1. Nodes form layers, i.e. the input layer, the hidden layer, and the output layer. The input and output layers consist of the input and output parameters, respectively, of a meta-model. Nodes of each layer contain information and exchange it through weights. One of the main purposes of training an ANN is to find the weights that maximize the performance of a given neural network. Rumelhart et al. [2] developed an error back-propagation algorithm to find weights and improve neural networks efficiently.
To describe and represent more complicated and intricate data, more than one hidden layer can be considered. In this case, the ANN is referred to as a Deep Neural Network (DNN). Increasing the number of hidden layers increases the number of nodes and weights, which raises the computational cost and makes the model more difficult to train. Despite these shortcomings, DNN yields better meta-models for solving complex nonlinear problems.
The structure of a DNN can be specified in more detail by hyper-parameters such as the number of layers, the number of nodes in each layer, the batch size, the activation functions, the regularization method, and the optimizers. The performance of a DNN relies heavily on the proper choice of hyper-parameters. Some important hyper-parameters mentioned in the numerical tests (Section 4) are briefly summarized as follows:

• Batch size
The batch size is the number of training data samples used in one pass for updating the weights. Due to memory limitations, it is not recommended to perform training with all available data samples at once. The larger the batch size, the fewer weight updates are needed per epoch, which generally lowers the computational cost of training.

• Activation function
In a DNN, the values at the nodes of a layer are not transferred directly to the next layer, but are transformed through a nonlinear function called the activation function. It helps keep the values of nodes from diverging during training and allows complex problems to be solved with a small number of nodes. If an unsuitable activation function is chosen, the gradients of the DNN (in the error back-propagation process) can vanish, which severely slows learning. Activation functions such as tanh, sigmoid, and ReLU are known to be appropriate choices.

• Optimizers
Weights of a DNN are found by the error back-propagation process, which sequentially updates the weights to minimize a loss function defined by a given error, such as $E_{mse}$ described in (7). In this process, a local minimum problem needs to be solved, and an efficient optimizer helps to reduce the solution time. Representative techniques are stochastic gradient descent, Adam [1], and RMSprop [3].
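The roles of the batch size, the activation function, and the optimizer can be seen together in a minimal training loop. The sketch below is only an illustration under stated assumptions: it uses a single hidden layer with tanh activation, plain mini-batch gradient descent (not Adam or RMSprop), a mean-squared-error loss, and a toy target $y = \sin(x)$. The function name `train_ffn` and all numerical settings are hypothetical and do not reproduce the networks used in the paper.

```python
import numpy as np

def train_ffn(X, Y, n_hidden=16, batch_size=32, n_epochs=200, lr=0.02, seed=0):
    """Train a one-hidden-layer tanh network with mini-batch gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, (n_hidden, Y.shape[1]))
    b2 = np.zeros(Y.shape[1])

    def predict(x):
        return np.tanh(x @ W1 + b1) @ W2 + b2

    history = [float(np.mean((predict(X) - Y) ** 2))]   # MSE before training
    for _ in range(n_epochs):
        order = rng.permutation(len(X))                  # reshuffle every epoch
        for start in range(0, len(X), batch_size):
            idx = order[start:start + batch_size]
            xb, yb = X[idx], Y[idx]
            h = np.tanh(xb @ W1 + b1)                    # hidden activations
            err = h @ W2 + b2 - yb
            # Back-propagate the gradient of the mean-squared error
            g_out = 2.0 * err / err.size
            g_h = (g_out @ W2.T) * (1.0 - h ** 2)        # tanh'(z) = 1 - tanh(z)^2
            W2 -= lr * (h.T @ g_out)
            b2 -= lr * g_out.sum(axis=0)
            W1 -= lr * (xb.T @ g_h)
            b1 -= lr * g_h.sum(axis=0)
        history.append(float(np.mean((predict(X) - Y) ** 2)))
    return predict, history

# Toy regression data (assumed for illustration only)
X = np.linspace(-np.pi, np.pi, 256).reshape(-1, 1)
Y = np.sin(X)
predict, history = train_ffn(X, Y)
```

The list `history` holds the loss on the full data set after each epoch, which makes it easy to compare different batch sizes or learning rates.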

Meta-model using Neural Networks
ML methods can be categorized by learning style into three types: supervised, unsupervised, and reinforcement learning. Supervised learning trains a meta-model by considering both reference response features, called labels, and predictive features, and by gradually improving the model to fit the given training data. Supervised learning mainly comprises classification and regression methods. In unsupervised learning, in contrast to supervised learning, label (or reference) features are not designated. It focuses on how the training data is structured. Reinforcement learning is an effective algorithm for optimization analysis. It learns by making decisions that maximize a user-specified reward. Users need to design appropriate model conditions such as environments, actions, and rewards.
MBD problems can be mainly dealt with supervised or reinforcement learning techniques since many MBD problems aim to seek robust and optimal design considering a set of design parameters.
To apply supervised learning, training data need to be prepared beforehand for learning the model. The training data for MBD meta-models can be obtained in several ways, usually by computational methods. In the case of reinforcement learning, a multibody system simulation environment is required to train an agent according to the cumulative reward for each action. Both learning approaches require time-consuming tasks to learn the meta-models of MBD: a data preparation task for supervised learning and a simulation task for reinforcement learning. However, once the meta-model is built, it resolves MBD problems in real time and yields dynamic responses.
In this research, the supervised learning of an MBD meta-model based on training data is mainly considered. Supervised learning finds an approximation function $M$ that minimizes a loss $L(x; M)$ over samples $x$. An algorithm $A_{\alpha}$ produces $M$ for a training set $X_{train}$ through the optimization of a training criterion with respect to a set of parameters, given hyper-parameters $\alpha$ [10]. The built function is therefore $M = A(X_{train}; \alpha)$. A neural network algorithm is one of the powerful machine learning algorithms for minimizing the loss $L(x; A(X_{train}; \alpha))$.
Specifically, the algorithm uses a network structure and optimizes the parameters of the network, i.e. the weights and biases, by utilizing the back-propagation algorithm, which is an extension of the gradient descent method to neural network structures.
In this research, neural networks are adopted to build the meta-models of MBD problems, since they can be generalized to fit various shapes of nonlinear functions with multi-dimensional input data. In particular, feed forward networks (FFN) with hidden layers and nonlinear activation functions are considered, which are universal approximators that can effectively represent continuous functions. Owing to these characteristics, FFN is a strong candidate for implementing the meta-models of general-purpose MBD problems. Moreover, many techniques for DNN, including activation functions such as ReLU that accelerate training, dropout, regularization, and batch normalization, have strengthened the potential of FFN with deep layers for modeling general-purpose MBD problems.
The flowchart in Fig. 2 shows brief outlines of meta-modeling of MBD problems and its benefits.
Design of Neural Networks for Meta-models
MBD problems rarely have high dimensionality of input or output data, compared to common DNN applications such as image, speech, and text data. Rather than high dimensionality, in general, MBD considers complicated nonlinear functions and requires accurate and robust solutions.
Once an MBD problem is given, the design of the input and output layers is typically determined. For example, each variable of the input (or output) data is mapped to a single node of the input (or output) layer if the variable is numeric, but if the variable is nominal, it should be mapped to multiple nodes through one-hot encoding. In one-hot encoding, each value of a nominal variable with $K$ possible values is transformed to one of the one-hot vectors

\[ (1, 0, \ldots, 0),\ (0, 1, \ldots, 0),\ \ldots,\ (0, 0, \ldots, 1) \in \mathbb{R}^{K}. \]

Unlike the input and output layers, the design of the hidden layers is not fixed by the problem. The number of hidden layers and the number of nodes are the most critical hyper-parameters, and their best design must be decided along with the other hyper-parameters at the hyper-parameter tuning step. Empirically, it is known that deeper hidden layers are more effective than wider but shallower hidden layers if two FFN models have similar numbers of parameters such as weights and biases.
To build expressive MBD meta-models, FFN models with enough width and depth are necessary. However, proper regularization methods such as $L_1$ and $L_2$ regularization, dropout, and batch normalization are required to achieve generalized meta-models, because FFN models with too many parameters are often overfitted to the given training data [9].
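The mapping of a nominal variable to multiple input nodes can be sketched as follows. The nominal variable used here (a joint type with three possible values) and the helper name `one_hot` are hypothetical and serve only to illustrate the encoding.

```python
import numpy as np

def one_hot(values, categories):
    """Map each nominal value to a one-hot row vector of length len(categories)."""
    index = {category: i for i, category in enumerate(categories)}
    encoded = np.zeros((len(values), len(categories)))
    for row, value in enumerate(values):
        encoded[row, index[value]] = 1.0
    return encoded

# Hypothetical nominal variable with three possible values
joint_types = ["revolute", "prismatic", "fixed"]
encoded = one_hot(["prismatic", "revolute"], joint_types)
```

Each row of `encoded` occupies three input nodes, one per category, while a numeric variable would occupy a single node.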

Hyper-parameters Optimization of meta-models
Similar to typical ML algorithms, the neural network algorithm does not provide a method to find the optimal hyper-parameters $\alpha$. The hyper-parameters of a DNN are critical to the accuracy and robustness of the meta-model. Unfortunately, there is no perfect scheme for building the most accurate and robust DNN model from given training data. One must search for the best set of hyper-parameters, such as the number of hidden layers, the number of nodes in each hidden layer, the activation function, the optimizer, the learning rate, and the number of epochs.
Generally, two kinds of search methods are used for hyper-parameter optimization: in the grid search method, a given set of candidate values for each hyper-parameter is investigated, whereas in the random search method, randomly selected values of the hyper-parameters are evaluated. It is known that random search is more efficient than grid search in finding optimal hyper-parameters [10]. Recently, AutoML has been actively researched in academic and practical fields to find the best design of DNN. When AutoML techniques mature, it is expected that the optimal design of DNN-based meta-models can be found more easily and quickly [8].
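The difference between the two search strategies lies in how candidate hyper-parameter sets are generated. The following sketch enumerates candidates both ways; the search space, its values, and the function names are assumptions made for illustration, and the training and scoring of each candidate are left out.

```python
import itertools
import random

# Hypothetical search space (values chosen only for illustration)
search_space = {
    "n_hidden_layers": [2, 4, 6],
    "n_nodes": [32, 64, 128],
    "batch_size": [64, 256],
}

def grid_candidates(space):
    """Grid search: every combination of the listed candidate values."""
    keys = list(space)
    return [dict(zip(keys, combo)) for combo in itertools.product(*space.values())]

def random_candidates(space, n_trials, seed=0):
    """Random search: n_trials independent draws, one value per hyper-parameter."""
    rng = random.Random(seed)
    return [{key: rng.choice(values) for key, values in space.items()}
            for _ in range(n_trials)]

grid = grid_candidates(search_space)                  # 3 * 3 * 2 combinations
trials = random_candidates(search_space, n_trials=5)
```

The grid grows multiplicatively with every added hyper-parameter, whereas the number of random trials is set directly by the available budget.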

Generation of MBD Training Data
In this paper, it is assumed that one can obtain as many MBD sample data as needed to train the meta-model and achieve a reliable model. In other words, the case of an insufficient training data set is not considered. Nevertheless, since MBD data collection takes a long time for complex multibody systems, more efficient ways of collecting training data are needed.
First, the number of training samples can be determined according to some criteria. Incremental learning methods can be applied to learn the meta-models. For instance, a certain level of a performance measure, such as the root mean squared error or the mean absolute percentage error, can be adopted as the criterion to stop feeding more samples to the meta-model. In the case of the random search method, more random samples can simply be provided to the less trained meta-model, and in the case of the grid search method, finer-grained grid samples can be used [7].
Second, the range of each design parameter for additional training samples can be adjusted after identifying the ranges of design parameters in which the meta-model is less accurate. This relies on the assumption that the model complexity often differs across ranges of the nonlinear hyperplanes. In such cases, adaptive sampling methods such as focused grid search can be less exhaustive than the uniform design of the typical grid search method [6].

Detailed Assumptions and Conditions for Meta-modeling Process
The following are some assumptions and comments on the meta-modeling developed for MBD problems. The same conditions are applied to the numerical tests in Section 4.

Training Data
• Sufficiently many sets of training data
As mentioned in Section 3.1, it is assumed that as many sets of data for training and tests as desired are available. Since the most important objective of this research is to achieve a highly accurate meta-model, other issues such as computational efficiency and insufficient training data are not the main concern.

• Uniform Meshes
Training data for input parameters are uniformly meshed in a given finite range.

• Data without Noise
Training data for output responses such as displacements, velocities, or accelerations are calculated exactly from the governing equations of the MBD problems. In other words, training data are artificially generated without any noise.

• Time Variable and Structures of Training Data
An important question in meta-modeling for dynamic problems is whether the time variable $t$ needs to be handled as an input parameter or not. Table 1 shows an example of a training data set where the time variable $t$ is considered as an input. All the discrete time instants are contained in the set of training data. On the other hand, if the time variable is not considered as an input parameter, there are $\#\{t_n\}$ sets of training data, each with time fixed to $t = t_n$, as shown in Table 2. The two types of training data structures are referred to as $S_{full}$ and $S_{fixed}$, respectively.
It may seem that $S_{fixed}$ is simpler than $S_{full}$, in that the former considers a fixed time instant $t = t_n$ and has a much smaller training data set than $S_{full}$, especially when the number of discrete time instants is very large. However, handling the time variable as a non-input ($S_{fixed}$) is not adequate for MBD analysis in the following two major respects:
(a) It requires as many meta-models as the number of discrete time instants $t_n$, $n = 0, 1, \ldots$. Moreover, if a grid search is performed for each meta-model to find the best hyper-parameters, this approach can be computationally infeasible.
(b) Each resulting meta-model provides predictions only for a specific time $t = t_n$, which makes it difficult to figure out the time-varying tendency of MBD.
Thus, in this research, it is concluded that a meta-model for MBD problems needs to be generated from training data of the form $S_{full}$, where the time variable is considered as an input. More details on the training data structure and its results are described in Section 4.

• Unseen Data
The performance of a resulting meta-model is evaluated with some sets of test data which are unseen from training process.

• Randomly Distributed Data
Unlike training data, input parameters for test are not uniformly meshed.They are randomly distributed in the same given range.

Grid Search and Hyper-parameters
Grid search is performed to find appropriate hyper-parameters for each MBD example, which helps to yield a highly accurate meta-model. From the grid search, the number of hidden layers, the number of nodes in each layer, the batch size, the number of epochs, the optimizer, and the loss function need to be decided.
Still, there can be other sets of hyper-parameters that result in similar or better performance.

Evaluation of Performance
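The performance of a resulting meta-model $M$ is evaluated in terms of two measures: the R-squared value and the absolute mean-squared error (MSE), denoted by $R^2$ and $E_{mse}$, respectively. When an output label $y$ is given for a set of test data, and the meta-model $M$ yields a prediction $\hat{y}$ for the test set, the performance measures are defined by

\[ R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2}, \]

\[ E_{mse} = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2, \tag{7} \]

where $N$ is the number of test samples and $\bar{y}$ is the mean of the labels $y_i$.

For each example, a meta-model is generated through FFN, and its performance is evaluated in various ways, as described in Section 3.2.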
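The two performance measures, $R^2$ and $E_{mse}$, can be computed from labels and predictions as in the following sketch. It uses the standard definitions of the coefficient of determination and the mean-squared error; the function names and the sample values are illustrative assumptions.

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)        # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)     # total sum of squares about the mean
    return 1.0 - ss_res / ss_tot

def mse(y, y_hat):
    """Mean-squared error between labels and predictions."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.mean((y - y_hat) ** 2))

# Made-up labels and predictions, for illustration only
labels = np.array([1.0, 2.0, 3.0, 4.0])
predictions = np.array([1.1, 1.9, 3.2, 3.8])
score = r_squared(labels, predictions)
error = mse(labels, predictions)
```

For these sample values the residual sum of squares is 0.10 and the total sum of squares is 5.0, so `score` is 0.98 and `error` is 0.025.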

Damped Single Pendulum
The damped single pendulum shown in Fig. 3 is described by a governing equation of motion in which $g$ is the gravitational acceleration, $L$ is the length of the massless rod, $m$ is the mass, and $c$ is the damping coefficient. The variables $\theta$ and $\dot{\theta}$ are the time-varying angle and its velocity, whose initial values are specified as $\theta_0$ and $\dot{\theta}_0$, respectively. Although all the input parameters $(g, L, m, c, \theta_0, \dot{\theta}_0)$ affect the dynamics of the single pendulum in Fig. 3, it is empirically noticed that the parameters $(L, c, \dot{\theta}_0)$ have a major influence on the dynamic response characteristics. Thus, it is assumed that the relatively insignificant parameters $(g, m, \theta_0)$ are fixed to the values $(9.81\,[\mathrm{m/s^2}],\ 0.3\,[\mathrm{kg}],\ \pi/2\,[\mathrm{rad}])$, while the parameters $(L, c, \dot{\theta}_0)$ are left free. The objective of this example is to generate a meta-model that yields the dynamics of the damped single pendulum as outputs when a particular set of input parameters $(L, c, \dot{\theta}_0)$ is given.

Fig. 3: Damped single pendulum with mass $m$, angle $\theta$, and rod length $L$.
For efficient learning, it is assumed that $(L, c, \dot{\theta}_0)$ are chosen within finite ranges (8), in which $(\Delta L, \Delta c, \Delta\dot{\theta}_0)$ denote the uniform mesh sizes for training data. In evaluating a meta-model, the uniform meshes are not applied, and arbitrarily chosen input values are used.
To describe the dynamics of the damped single pendulum, the time-varying solutions $\theta(t)$, $\dot{\theta}(t)$, and $\ddot{\theta}(t)$ are obtained as outputs of a meta-model. For the time variable $t$, discrete time instants $\{t_n\}$ with a uniform mesh size $\Delta t$ are considered in an interval $[0, t_f]$, where $t_f = 2$:

\[ t_n = n\,\Delta t, \quad \Delta t = t_f / 200 = 0.01, \quad \text{for } n = 0, 1, \ldots, 200. \]
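Noise-free labels on this time grid can be produced by integrating the pendulum equation numerically. The sketch below is an illustration under explicit assumptions: the equation of motion is taken to be $mL^2\ddot{\theta} + c\,\dot{\theta} + mgL\sin\theta = 0$, which is one common form of a damped pendulum and may differ from the paper's governing equation; the initial angle is fixed to $\pi/2$ and the initial angular velocity is treated as the varied input; a classical fourth-order Runge-Kutta scheme is used. The function name `pendulum_labels` is hypothetical.

```python
import numpy as np

def pendulum_labels(L, c, theta_dot0, g=9.81, m=0.3, theta0=np.pi / 2,
                    t_f=2.0, n_steps=200):
    """Return time instants and labels (theta, theta_dot, theta_ddot) per instant."""

    def accel(theta, omega):
        # ASSUMED equation of motion: m L^2 theta'' + c theta' + m g L sin(theta) = 0
        return -(c / (m * L ** 2)) * omega - (g / L) * np.sin(theta)

    dt = t_f / n_steps
    t = np.linspace(0.0, t_f, n_steps + 1)
    labels = np.empty((n_steps + 1, 3))
    theta, omega = theta0, theta_dot0
    for n in range(n_steps + 1):
        labels[n] = theta, omega, accel(theta, omega)
        # One classical RK4 step for the first-order system (theta, omega)
        k1t, k1w = omega, accel(theta, omega)
        k2t = omega + 0.5 * dt * k1w
        k2w = accel(theta + 0.5 * dt * k1t, omega + 0.5 * dt * k1w)
        k3t = omega + 0.5 * dt * k2w
        k3w = accel(theta + 0.5 * dt * k2t, omega + 0.5 * dt * k2w)
        k4t = omega + dt * k3w
        k4w = accel(theta + dt * k3t, omega + dt * k3w)
        theta += dt * (k1t + 2.0 * k2t + 2.0 * k3t + k4t) / 6.0
        omega += dt * (k1w + 2.0 * k2w + 2.0 * k3w + k4w) / 6.0
    return t, labels

# One illustrative parameter set (values are assumptions)
t, labels = pendulum_labels(L=1.0, c=0.1, theta_dot0=0.0)
```

Each call yields 201 rows of outputs for one input combination; repeating it over all input combinations gives the label side of the training set.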
As described in Section 3.2, the time variable $t$ can be handled as an input ($S_{full}$) or fixed to a certain instant ($S_{fixed}$). Results from the two structures are compared. The $S_{full}$ case generates only one meta-model, while the $S_{fixed}$ case generates $\#\{t_n\} = 201$ meta-models. Thus, for $S_{full}$, the input and output of the meta-model are four- and three-dimensional, respectively, and the total number of training data is $1{,}331 \times 201 = 267{,}531$. $S_{fixed}$ has a three-dimensional input, and the number of its training data is 1,331 for each model.
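The sizes quoted for the two data structures follow directly from the uniform meshes, as the following sketch shows. The parameter ranges are hypothetical placeholders; only the counts (11 values per input parameter, giving $11^3 = 1{,}331$ combinations, and 201 time instants) are taken from the numbers in the text.

```python
import numpy as np

# Hypothetical ranges; 11 uniform values per parameter give 11^3 = 1,331 combinations
L_vals = np.linspace(0.5, 1.5, 11)
c_vals = np.linspace(0.0, 0.2, 11)
w0_vals = np.linspace(0.0, 1.0, 11)       # initial angular velocity (assumed input)
t_vals = np.linspace(0.0, 2.0, 201)       # t_n = n * 0.01, n = 0, ..., 200

# S_full: time is a fourth input column, one data set for a single meta-model
grid_full = np.stack(
    np.meshgrid(L_vals, c_vals, w0_vals, t_vals, indexing="ij"), axis=-1)
X_full = grid_full.reshape(-1, 4)

# S_fixed: one three-column input set per time instant, one meta-model each
grid_fixed = np.stack(
    np.meshgrid(L_vals, c_vals, w0_vals, indexing="ij"), axis=-1)
X_fixed = {n: grid_fixed.reshape(-1, 3) for n in range(len(t_vals))}
```

Here `X_full` has 267,531 rows with four columns, while `X_fixed` holds 201 separate input sets of 1,331 rows with three columns each.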
Hyper-parameters found from grid search are shown in Table 3.

Table 3: Hyper-parameters for the damped single pendulum problem.
Fig. 4 displays the scatter plots in which labels, i.e. reference solutions, and predictions of the outputs $(\theta, \dot{\theta}, \ddot{\theta})$ are compared. The results are obtained from a set of test data that is unseen during training. The $R^2$ scores are around 0.997, which implies that the DNN model predicts the outputs with high accuracy.
Table 4: Input parameters of multiple cases for Fig. 6.
In Fig. 5, oscillatory waves are observed in the case of $S_{fixed}$ (Left). On the other hand, $S_{full}$ (Right) gives relatively smooth solutions. Fig. 6 compares the performance of $S_{fixed}$ (Left) and $S_{full}$ (Right) for the other input parameters summarized in Table 4. As in Fig. 5, oscillatory waves are observed in the case of $S_{fixed}$. Some are more severe than others, which makes the prediction error greater. On the other hand, $S_{full}$ yields smooth and accurate predictions for all cases.

Hyper-parameters for $S_{full}$ and $S_{fixed}$
In the damped single pendulum problem, the same hyper-parameters are used for both types of training data, $S_{full}$ and $S_{fixed}$, where the hyper-parameters are found from a grid search for $S_{full}$. Since the data structures of $S_{full}$ and $S_{fixed}$ are different, it would be best to carry out an independent grid search for each structure when comparing their results. The performance of $S_{fixed}$ is expected to improve if more appropriate hyper-parameters are applied. To clarify the benefits and drawbacks of employing better hyper-parameters for $S_{fixed}$, independent grid searches for the $S_{fixed}$ models are performed. Since there are $\#\{t_n\} = 201$ models in $S_{fixed}$, $\#\{t_n\}$ grid searches are required. The hyper-parameters found for $S_{fixed}$ are listed in Table 5.
Compared to the hyper-parameters for $S_{full}$ in Table 3, those in Table 5 improve the performance of $S_{fixed}$. The improved results corresponding to Fig. 5 (Left) and Fig. 6 (Left) are shown in Fig. 7 (Left) and Fig. 7 (Right), respectively. Compared to the results shown in Fig. 5 (Left) and Fig. 6 (Left), the accuracies of the solutions from independent grid searches are clearly enhanced, which can be confirmed by the orders of magnitude of $E_{mse}$.
However, oscillations are still observed, which yield less smooth solutions compared to the results of $S_{full}$ shown in Fig. 5 (Right) and Fig. 6 (Right). In addition, the $\#\{t_n\}$ grid searches for $S_{fixed}$ impose a heavy computational burden. The normalized clock times of the grid searches for $S_{full}$ and $S_{fixed}$ are compared in Table 6.
Thus, using the same hyper-parameters for both $S_{full}$ and $S_{fixed}$ is not a serious hindrance to comparing the performance of the two types of training data sets. For simplicity and computational feasibility, the hyper-parameters found from $S_{full}$ are employed for both $S_{full}$ and $S_{fixed}$ in the numerical examples in Sections 4.2 and 4.3.

Double Pendulum
The double pendulum shown in Fig. 8 is described by governing equations of motion in which $\theta_1$ and $\theta_2$ represent the time-varying angles of the links, as shown in Fig. 8. The parameter $g$ is the gravitational constant, $L_i$ is the length of massless rod $i$, $m_i$ is the mass, $\theta_i^0$ is the initial angle, and $\dot{\theta}_i^0$ is the initial angular velocity, where $i = 1, 2$ denotes the body.
Figure caption: Labels (dashed) and predictions (solid) are compared for test data. Left: $\#\{t_n\}$ meta-models are generated, one for each fixed time $t = t_n$ ($S_{fixed}$). Some oscillations are observed. Right: The time variable $t$ is considered as an input parameter ($S_{full}$). Relatively smooth solutions are achieved.
Fig. 7: While the results in Fig. 5 (Left) and Fig. 6 (Left) employ the hyper-parameters of $S_{full}$, the present results use the hyper-parameters from independent grid searches on the $\#\{t_n\} = 201$ $S_{fixed}$ models. While the accuracies of the solutions are improved, oscillations are still observed.
In the meta-modeling, it is assumed that $(L_1, L_2, \theta_1^0, \theta_2^0)$ are independent input parameters and $(\theta_1, \theta_2, \dot{\theta}_1, \dot{\theta}_2)$ are output parameters. As in the single pendulum problem (8), the inputs are chosen within some ranges. The other parameters are fixed to given constants. More details on the ranges and mesh sizes of the parameters are summarized in Table 9.
As in the previous numerical example, two types of training data, i.e. $S_{fixed}$ and $S_{full}$, are compared. For $S_{fixed}$, there are $\#\{t_n\} = 501$ meta-models, where each model is trained from a data set of 14,641 samples. For $S_{full}$, there is only one meta-model, trained from a data set of $14{,}641 \times 501 = 7{,}335{,}141$ samples. For both the $S_{fixed}$ and $S_{full}$ types of training data, the hyper-parameters are chosen as in Table 7.

Table 7: Hyper-parameters for the double pendulum problem.
The scatter plots in Fig. 9 show that the meta-model from $S_{full}$ predicts the output parameters $(\theta_1, \theta_2, \dot{\theta}_1, \dot{\theta}_2)$ with great accuracy. The $R^2$ values are over 0.997 for all solutions.
The performances of the meta-models from the $S_{fixed}$ and $S_{full}$ types of training data are compared in Fig. 10 and Fig. 11. They show the dynamic changes of the predictions (solid) from the meta-models in comparison with their labels (dashed), for the multiple cases shown in Table 8.
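As observed in the single pendulum cases shown in Fig. 5 and Fig. 6, the meta-model from $S_{fixed}$ shows many oscillations in its dynamic responses. Here the oscillations are quite severe, especially when $t$ is large. Though these results can be improved if [...]

[...] zeros, and the angular acceleration of the crank shaft $\ddot{\theta}(t)\,[\mathrm{rad/s^2}]$ is given as

\[
\begin{aligned}
\theta(t) &= \theta_0 = 0, && \text{where } t = 0, \\
\dot{\theta}(t) &= \dot{\theta}_0 = 0, && \text{where } t = 0, \\
\ddot{\theta}(t) &= \sin(\tau t), && \text{where } t \in [0, t_f],
\end{aligned} \tag{11}
\]

for some constant $\tau \in \mathbb{R}$. Then the angle of the crank shaft $\theta(t)$ and its temporal derivatives can be rewritten explicitly, for $t \in [0, t_f]$ and $\tau \neq 0$, as

\[
\theta(t) = \frac{t}{\tau} - \frac{\sin(\tau t)}{\tau^{2}}, \qquad
\dot{\theta}(t) = \frac{1 - \cos(\tau t)}{\tau}, \qquad
\ddot{\theta}(t) = \sin(\tau t). \tag{12}
\]

In DNN modeling, three independent parameters $(\tau, r, L/r)$ are considered as inputs, while the time variable $t$ can be fixed to an instant ($S_{fixed}$) or considered as an input ($S_{full}$). More details on the ranges and mesh sizes of the parameters are summarized in Table 11.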
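The explicit crank motion follows from integrating the prescribed acceleration $\ddot{\theta}(t) = \sin(\tau t)$ twice with zero initial angle and zero initial angular velocity, which gives $\dot{\theta}(t) = (1 - \cos(\tau t))/\tau$ and $\theta(t) = t/\tau - \sin(\tau t)/\tau^2$ for $\tau \neq 0$. The sketch below evaluates these expressions and checks them against finite differences; the function name, the time grid, and the final time are assumptions for illustration, while $\tau = 1.780$ is the test case value quoted later in the text.

```python
import numpy as np

def crank_motion(t, tau):
    """Crank angle and derivatives for theta'' = sin(tau t), theta(0) = theta'(0) = 0."""
    theta = t / tau - np.sin(tau * t) / tau ** 2
    theta_dot = (1.0 - np.cos(tau * t)) / tau
    theta_ddot = np.sin(tau * t)
    return theta, theta_dot, theta_ddot

# Assumed time grid for the check (the final time is a placeholder)
t = np.linspace(0.0, 2.0, 2001)
theta, theta_dot, theta_ddot = crank_motion(t, tau=1.780)

# Finite-difference derivatives should reproduce the closed-form expressions
fd_theta_dot = np.gradient(theta, t)
fd_theta_ddot = np.gradient(theta_dot, t)
max_err_dot = np.max(np.abs(fd_theta_dot[1:-1] - theta_dot[1:-1]))
max_err_ddot = np.max(np.abs(fd_theta_ddot[1:-1] - theta_ddot[1:-1]))
```

Because the inputs enter through closed-form expressions, labels for the crank angle can be generated without any numerical time integration.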
Although the slider crank mechanism is not a dynamic problem, this kinematic example is instructive because kinematics can be treated as a special case of dynamic problems. To describe the kinematics of the slider crank, seven kinematic solutions $\theta$, $\phi$, $\dot{\phi}$, $\ddot{\phi}$, $x_B$, $\dot{x}_B$, and $\ddot{x}_B$ are considered as output parameters, where $x_B$ denotes the $x$-directional translation of the slider.
The output solutions other than $(\theta, \dot{\theta}, \ddot{\theta})$ can be found from the kinematic equations (13). Two meta-models are generated from the $S_{fixed}$ and $S_{full}$ types of training data, by employing the hyper-parameters found from grid searches for the case of $S_{full}$, shown in Table 10.

Hyper-parameters Choice
The number of hidden layers The scatter plots in Fig. 14 compares labels and predictions of the meta-model from S f ull , and verifies that the meta-model produces almost accurate results.Its performance is much better than the other meta-models of previous examples, which seems to be caused by a simple form of kinematic equations ( 13 Table 11: Summary on parameters of slider crank problem.In S f ixed , a fixed time instant is considered.In S f ull , all the time instants are treated as inputs. Since the predictions for test data are highly accurate as confirmed in Fig. 14, Fig. 15, 16, and 17 present results only for a specific case of test data: τ = 1.780, r = 1.360,L/r = 3.050.Fig. 15 shows changes of translation and velocities of the slider mass B in time t.As shown in previous Sections 4.1 and 4.2, S f ixed (Left) shows oscillatory waves, while S f ull yields smooth solutions.The error E mse compares the difference of their accuracies more clearly.
Fig. 16 displays the time-varying relations between the angle of the connecting rod φ(t) and its temporal derivatives (φ̇(t), φ̈(t)). Oscillations are observed in the case of S_fixed (Left). Fig. 17 shows the relations between the displacement of the slider x_B and its derivatives. The difference in performance between the two training data sets S_fixed (Left) and S_full (Right) is clearer here than in Fig. 16. S_full yields smoother and more accurate results than S_fixed.
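The two accuracy measures used in these comparisons, the R² score and the error E_mse, can be sketched as follows. This is a minimal sketch that assumes the standard definitions (coefficient of determination and mean squared error), which this section does not restate; the function names are illustrative.

```python
def mse(labels, predictions):
    """Mean squared error between reference labels and predictions."""
    n = len(labels)
    return sum((y - p) ** 2 for y, p in zip(labels, predictions)) / n


def r2_score(labels, predictions):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(labels) / len(labels)
    ss_res = sum((y - p) ** 2 for y, p in zip(labels, predictions))
    ss_tot = sum((y - mean) ** 2 for y in labels)
    return 1.0 - ss_res / ss_tot
```

A value of R² close to 1 together with a small E_mse indicates that predictions track the labels closely, but neither measure detects oscillations in time by itself, which is why the time histories are also inspected.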

Conclusions
The present study introduces a procedure that combines machine learning with the solution of general-purpose multibody dynamics. The paper contributes to data-driven modeling of multibody systems in two respects. First, deep neural network learning is applied not to one particular type of problem but to general multibody dynamic problems. This generality makes it possible to employ the proposed DNN algorithm for other multibody system problems in future research. Second, the present work analyzes and suggests how training data need to be structured for more effective DNN learning. In particular, it is found that treating the time variable as an input parameter enhances the accuracy and smoothness of the resulting predictions. This observation is noteworthy, since the smoothness of physical variables in the time direction is significant in dynamic problems. The paper demonstrates that accurate solutions of general-purpose multibody dynamics can be achieved by the DNN procedure.

Beyond the numerical results presented here, the data-based learning algorithm can be improved through further studies. For one thing, smart sampling that selects more suitable ranges and non-uniform mesh sizes of data would improve computational efficiency in generating a meta-model. Moreover, to make fundamental progress in data-driven design of MBD, further studies are required on a variety of subjects, from theories of probability, uncertainty, and physics to new data-handling techniques.

Fig. 1: Structure of Artificial Neural Networks (ANN). If there are multiple hidden layers, the ANN is referred to as a Deep Neural Network (DNN).

Fig. 2: Flow of meta-modeling for MBD. By analyzing and learning data on MBD, a meta-model can be generated. The meta-model is intended to yield real-time dynamic responses of given MBD problems. The performance of the meta-model can be evaluated by comparing its results with experimental or real-world data. The evaluation helps to reconstruct or improve the off-line learning algorithm.


Fig. 4 displays the scatter plots where labels, i.e. reference solutions, and predictions of the outputs (θ, θ̇, θ̈) are compared. The results are obtained from a set of test data that were not seen during training. The R² scores are around 0.997, which implies that the DNN model predicts the outputs with high accuracy. Fig. 5 shows the dynamics of the angle θ (Top), angular velocity θ̇ (Middle), and angular acceleration θ̈ (Bottom) for a specific case: L = 0.1911 [m], c = 3.78 [kg·m/s], θ̇_0 = 0.055 [rad/s]. Labels (blue dashed, crosses) and predictions (red solid, circles) are shown for each solution. Results of S_fixed (Left) and S_full (Right) are compared. Although both S_fixed and S_full yield highly accurate results, some oscillations [...]

Fig. 4: Labels vs. predictions for test data. The meta-model for the damped single pendulum problem is generated from the S_full type of training set. Test data are unseen during training. The R² values are almost 1, which implies that the meta-model predicts output solutions with high accuracy.

Fig. 5: Dynamic responses of the damped single pendulum for the specific input L = 0.1911 [m], c = 3.78 [kg·m/s], θ̇_0 = 0.055 [rad/s]. Labels (blue dashed, crosses) and predictions (red solid, circles) are compared for test data. Left: #{t_n} meta-models are generated, one for each fixed time t = t_n (S_fixed). Some oscillations are observed. Right: The time variable t is considered as an input parameter (S_full). Relatively smooth solutions are achieved.

Fig. 7: Dynamic responses of the damped single pendulum obtained from S_fixed training data with the hyper-parameters in Table 5. While the results in Fig. 5 (Left) and Fig. 6 (Left) employ the hyper-parameters of S_full, the present results use the hyper-parameters from independent grid searches on the #{t_n} = 201 S_fixed models. While the accuracy of the solutions is improved, the oscillations are still observed.

Fig. 9: Labels vs. predictions for test data. The meta-model for the double pendulum problem is generated from the S_full type of training set. Test data are unseen during training. The R² scores are almost 1, which implies that the meta-model yields accurate solutions.

Fig. 14: Labels vs. predictions for normalized test data. The meta-model for the slider crank problem is generated from the S_full type of training set. Test data are unseen during training. The R² scores are almost 1, which implies that the DNN model predicts output solutions with high accuracy.

Fig. 15: Dynamic responses of the slider crank: labels (blue dashed) vs. predictions (red solid) for the specific input τ = 1.780, r = 1.360, and L/r = 3.050. Left: #{t_n} meta-models are generated, one for each fixed time t = t_n (S_fixed). Some oscillations are observed. Right: The time variable t is considered as an input parameter (S_full). Relatively smooth solutions are achieved.

Fig. 16: Relations between dynamic responses of the slider crank problem when τ = 1.780, r = 1.360, and L/r = 3.050: labels (black dashed) and predictions (red solid, circles) are given. Results from the two types of training data set, S_fixed (Left) and S_full (Right), are compared. S_full yields smoother and more accurate dynamic results.

Fig. 17: Relations between dynamic responses of the slider crank problem when τ = 1.780, r = 1.360, and L/r = 3.050: labels (black dashed) and predictions (red solid, circles) are given. Results from the two types of training data set, S_fixed (Left) and S_full (Right), are compared. S_full yields smoother and more accurate dynamic results.

In this section, three fundamental MBD examples, the single pendulum, double pendulum, and slider crank mechanism, are investigated. For each example, a data-driven [...]

Table 1: Structure of the training data set for DNN, where the time variable t is considered as an input. This type of training data structure is denoted by S_full. In this case, a single meta-model is generated.

Table 2: Structure of the training data set for DNN, where the time variable t is fixed and not considered as an input. This type of training data structure is denoted by S_fixed. In this case, #{t_n} meta-models are generated, corresponding to #{t_n} sets of training data.
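The difference between the two training data structures can be illustrated with a short Python sketch. Everything in it is hypothetical: the parameter meshes, the time instants, and `reference_solution`, which is only a placeholder standing in for an MBD solver and is not a physical model from the paper.

```python
import itertools
import math


def reference_solution(length, damping, t):
    """Hypothetical placeholder for an MBD solver output at time t."""
    return math.exp(-damping * t) * math.cos(t / math.sqrt(length))


lengths = [0.1, 0.2]           # hypothetical parameter mesh
dampings = [1.0, 2.0, 3.0]     # hypothetical parameter mesh
times = [0.0, 0.5, 1.0, 1.5]   # the time instants {t_n}

# S_full: time is an input column, so a single data set feeds one meta-model.
s_full = [((L, c, t), reference_solution(L, c, t))
          for L, c, t in itertools.product(lengths, dampings, times)]

# S_fixed: one data set, and hence one meta-model, per fixed instant t_n.
s_fixed = {t: [((L, c), reference_solution(L, c, t))
               for L, c in itertools.product(lengths, dampings)]
           for t in times}
```

In this toy setting S_full holds 24 samples with three inputs each, while S_fixed holds #{t_n} = 4 separate sets of 6 samples with two inputs each, so nothing ties together the predictions of neighboring time instants.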

Table 5: Hyper-parameters for S_fixed training data, obtained from independent grid searches for the #{t_n} = 201 S_fixed models.

Table 6: Comparison of the data structures S_full and S_fixed, and the normalized clock times taken for independent grid searches. Grid searches for S_fixed require a heavy computational cost.

Table 7: Hyper-parameters for the double pendulum problem.

Table 10: Hyper-parameters for the slider crank problem.