Discrete adjoint gradient evaluations for linear stress and vibration analysis

AbstractThis paper presents methods used to perform discrete adjoint gradient evaluations for linear stress and vibration analysis. The methods are implemented within the framework of a discrete adjoint structural solver being developed for multidisciplinary adjoint optimizations of turbomachinery components. The code is differentiated using the algorithmic differentiation tool CoDiPack in tandem with manual treatment of the iterative solvers. Stress analysis leads to a linear system of equations that is typically solved by an iterative solver (e.g. GMRES). To ensure accuracy, the adjoint problem is formulated as a new linear system of equations to be solved. Vibration analysis results in a generalized eigenvalue problem that is also typically solved by an interative solver. The adjoint problem takes out the generalized eigenvalue solve and replaces it by one outer product per eigenfrequency, leading to significantly cheap eigenfrequency gradients for vibration analysis.



Introduction
State of the art adjoint optimization methods in the field of turbomachinery focus mainly on aerodynamic cost functions and constraints [26,13,25]. While such optimization methods can produce an aerodynamically optimized shape, the structural requirements may not be met. These are usually evaluated a posteriori rather than within the optimization process, leading to several reiterations in practice. In order to take structural constraints into account and reduce design iterations, a multidisciplinary adjoint optimization framework for turbomachinery is introduced in section 2. To realize this, a discrete adjoint structural solver that is equipped to meet the needs of turbomachinery design is required.
Stress and vibration analyses play a significant role in evaluating the structural requirements of turbomachinery components. These components undergo high rotational speeds in operation, which can lead to large centrifugal forces and structural deformations. As a result of these deformations, the structural stresses increase and need to be kept under the specified safety limit. Additionally, unsteady pressure forces from the surrounding fluid contribute to undesired vibrations that can lead to structural failure. Evaluating the maximum von Mises stress in a static stress analysis and the eigenfrequencies using a vibration analysis, provides engineers with vital information to design structurally feasible components. For gradient-based optimization methods, the sensitivities of stress and vibrational quantities of interest with respect to design parameters are required. To compute the required sensitivites efficiently, an adjoint structural solver that offers stress and vibration analysis has been developed. While recent works [14,9] have differentiated structural solvers for multidiscplinary optimizations (MDOs) using a more analytical approach, this work focuses on the efficient use of algorithmic differentiation (AD). The discrete adjoint implementations of the linear stress and vibration analysis are discussed in sections 3 and 4, respectively.
While the gradients of the stiffness and mass matrices can be derived analytically, one of the advantages of using AD for code differentiation is that additional code contributions are automatically differentiated. By using an object-oriented design approach, this provides a maintainable code base for future contributions. In section 5, an example is shown where additional contributions for composite material applications to the primal code are automatically differentiated and gradients for material design parameters can be easily obtained.

Multidisciplinary Adjoint Optimization
Performing an optimization in a single discipline, such as aerodynamics, neglects crucial constraints imposed by other disciplines, such as structural mechanics. Ideally, the structural constraints are considered during the optimization procedure, such that the resulting shape is aerodynamically optimized and satisfies the structural constraints. This section will briefly motivate the use of gradientbased optimization methods using adjoints and discuss what is required for a multidisciplinary adjoint framework.
Gradient-free optimization methods have been applied successfully in turbomachinery [17,24,16,22,10]. The methods are non-intrusive and one only needs to evaluate the cost function. Unfortunately, these methods suffer from the curse of dimensionality. This means that the cost of performing a gradientfree optimization is dependent on the size of the design space m, which can be significantly large. Convergence often requires a large amount of function evaluations. Gradient-based methods, on the other hand, use the gradient of the cost function J(x) ∈ R with respect to the design x ∈ R m , to converge towards an optimum with less iterations. However, these methods only converge towards the global optimum if the problem is convex and computing the sensitivities can be cumbersome.
The least intrusive way of approximating the gradient (1) is to use the finite differences (FD) method. However, the cost is dependent on the number of input variables m, costing 2m · cost(J) when using a second order approximation.
Here cost(J) refers to the computational cost of evaluating the primal model J(x). As the size of the design space increases, computing the gradients may become infeasible. The adjoint approach, which was first applied in aerodynamic optimization by Pironneau [19], followed by Jameson [7,6], offers an efficient alternative for computing the gradients. Evaluating the adjoint model withJ = 1 computes the gradient (1) at a cost dependent on the number of outputs, rather than the number of inputs. The bar notationJ denotes an adjoint variable. Typically the number of outputs in an optimization is much lower than the number of inputs. Thus, the adjoint method is superior in terms of computational cost. Moreover, being independent of the number of inputs gives engineers a broader design space to work with.
Two approaches can be taken to implement the adjoint model. The continuous adjoint approach [8] involves deriving a continuous adjoint model, then discretizing it to solve numerically. This results in computing the gradients of the continuous problem. The discrete adjoint approach [15], on the other hand, derives an adjoint model from the discretized system of equations, and computes the gradients of the discretized problem. The source code transformation method algorithmic differentiation (AD) [4,18], sometimes referred to as automatic differentiaton, can be used to generate the adjoint model of an existing code. Several AD tools are available and are listed on the AD community's web page www.autodiff.org. In this work, the discrete adjoint approach is taken using the AD tool CoDiPack [20].
In the context of AD, evaluating the adjoint model (2) is also referred to as reverse mode and may be used interchangeably in this text. AD can also be used to differentiate a primal model J(x) in forward mode using the chain rule of differentiation. This paper will concentrate on using AD to implement the adjoint, or reverse mode. For a more detailed description of algorithmic differentiation, the reader is referred to e.g. [4,18].
In addition to the gradients (1), one requires gradients of constraints with respect to design parameters to formulate a gradient-based multidisciplinary optimization problem. For turbomachinery applications, structural constraints are of particular interest. Previous development efforts of implementing the structural constraints into this multidsciplinary optimization framework include an adjoint CAD transformation for cold-to-hot deformations [21]. To complement this, the gradients of structural quantities of interest, such as the maximum von Mises stress and eigenfrequencies, with respect to design parameters are required. The discrete adjoint models that compute these gradients are the focus of this paper. They are introduced in the next sections 3 and 4.

Discrete Adjoint Linear Stress Analysis
The structural linear stress analysis is used to assess the structural integrity based on given forces. In the case of turbomachinery, relevant forces are composed primarily of the centrifugal forces due to high rotational speeds. Additionally, pressure forces from the surrounding fluid may be considered as well. The structural solver discussed in this paper is discretized using the finite element method (FEM). For a detailed discussion on the FEM approach for structural applications, the interested reader is referred to [2].

Linear Stress Equations
Using the weak formulation of the balance of momentum, the finite element formulation for linear stress analysis can be derived [2] as with the stiffness matrix K ∈ R m×m , mass matrix M ∈ R m×m , load vector b ∈ R m , and displacements u ∈ R m . For a static linear stress analysis, Mü = 0. One is left with a linear system of the form where the stiffness matrix and the load vector depend on the design variables x ∈ R m . The linear system is solved for the displacements which are then used to compute a structural cost function, such as the maximum von Mises stress σ max (u) ∈ R. Note that the maximum is computed using a continuous function, e.g. the Kreisselmeier-Steinhauser function [11] or the p-norm For the inclusion of structural constraints in a gradient-based multidisciplinary optimization, the sensitivities have to be computed.

Adjoint Implementation
The implementation of the discrete adjoint linear stress analysis can be broken down into a three-step algorithm (figure 1). First, the linear system is assembled by computing the stiffness matrix K and the load vector b. Then, the linear system is solved for the displacement field u. Finally, the displacement results are used to evaluate the cost function σ max . Note that here the mentioned cost function is σ max , yet this can actually be any cost function that is dependent on the displacements u.
The adjoint model has a similar three step form, only in reverse (figure 1). Seeding the adjoint modelx = ∂σ max ∂x Tσ max (8) withσ max = 1 and performing a reverse run will compute the sensitivities However, a black-box differentiation using AD would involve differentiating through the iterative linear solver of step 2 in figure 1. This could result in a significantly high memory consumption and computational cost proportional to the number of iterations. Additionally, the same number of iterations and the same basis for the Krylov subspace used by the primal linear solver would be used to compute the derivatives. At that point, the adjoint variables may not actually be converged and inaccurate derivatives would be propagated through the calculation.
Alternatively, the adjoint of the linear system can be treated explicitly as described in [23]. This method would result in the equations for the adjoints of K and b. Due to the symmetric positive definite structure of the stiffness matrix for linear stress analyses, an additional simplification can be made such that Kb =ū.
As a result, computing the adjoints of step 2 involves an additional linear system solve with the same matrix K. Recording the operations of the iterative linear solver with AD is no longer necessary. The vectorū comes from the adjoint of the cost function evaluation (step 1 of figure 1), which is differentiated using AD. AfterK andb have been computed using (9), they are plugged into the adjoint linear system assembly (step 3 of figure 1) to computex.

Accuracy and Performance Results
To assess the accuracy of the discussed adjoint formulation, the gradient (7) is computed using a rotating cantilever beam test case with 270 degrees of freedom (figure 2). The beam has fixed boundaries on the left hand edge (figure 2) and concentrated loads on the opposite edge, where the displacements are most prominent. The sensitivities are visualized in figure 3 and plotted in figure 4. The gradient computed using the adjoint method is in good agreement with the gradient computed using finite differences and a forward AD differentiation.
An axial fan geometry, which has been used as a baseline for an optimization in [1], is used to perform run time and memory consumption tests. Run time and memory consumption results using the axial fan test case are shown in table 1. The results reflect the computational effort and memory requirements for computing the gradient (7). A small test case with approximately 30, 000 degrees of freedom and a larger test case with approximately 200, 000 degrees of freedom ( figure 5) are compared, using the RealReverse AD type of CoDiPack. The resulting sensitivities are shown in figure 6. The larger test case shows better run time and memory ratios, reaching a computational run time of around 2.1 times the primal run time. This reflects the fact that two linear systems are solved using the same matrix K and solving the linear system accounts for   figure 7. Note that here the same numerical set up is used to solve both linear systems. The results also show that the RealReverseIndex AD type of CoDiPack offers a lower peak memory consumption at the expense of a higher computational effort.

A free vibration analysis starts with equation (3) in unloaded form:
With a proposed solution of u = u k e iω k t , a generalized eigenvalue problem of the form λ k is the k-th eigenfrequency and u k the k-th eigenvector. The vibration analysis recycles the same stiffness matrix K from the linear stress analysis, but requires an additional matrix, the mass matrix M ∈ R m×m . Note that K and M are symmetric, i.e.
and the eigenvectors can be normalized such that The system (14) can be solved using an iterative eigenvalue solver and the process can be broken down into a two-step algorithm ( figure 8). The gradient of interest in this case is where up to roughly twenty eigenfrequencies are usually of interest.

Tangent Model
The tangent model of the vibration analysis is required to derive the adjoint model and will be introduced in this section. One starts by differentiating the Left multiplying the result by u T k gives The definition of the eigenvalue problem (14), symmetry of K and M (15), and mass normalization (16) are used to simplify further to which results in Finally, this is rearranged to Given that the eigenvector u k , eigenfrequency λ k , and the gradients ∂K ∂xi , ∂M ∂xi have been computed, the eigenfrequency sensitivity (24) can be directly evaluated.

Adjoint Model
Previous work done by Lee [12] shows an adjoint formulation of the generalized eigenvalue problem, where an augmented response function was used to construct the adjoint equations. The method boils down to solving a linear system of equations for the adjoint variables and plugging them into the differentiated response function to compute the sensitivities (17). However, the gradients of the mass and stiffness matrices with respect to the design variables are still required to complete the equation. Dhondt et al. [3] computed the sensitivities via a symbolic approach as well by direct differentiation of the generalized eigenvalue problem. However, they used a perturbed stiffness and mass matrix to complete the equation, which led to numerically sensitive results.
The method presented in this paper uses a different approach and focuses more on the adjoint method by algorithmic differentiation, which will yield algorithmically exact gradients of the stiffness and mass matrices. While the majority of the discrete problem can be handled directly with AD, the iterative solver of the generalized eigenvalue problem has to be handled separately. It is well understood how this needs to be done for linear systems ( [23] and section 3.2 of this paper), but not yet for generalized eigenvalue problems. This is presented in this section.
The adjoint model of the vibration analysis is seeded withλ k = 1 to compute the sensitivities of each eigenfrequency of interest. The adjoint implementation (figure 8) involves a black-box AD differentation of the assembly (step 2). As was the case for the linear stress analysis, one would want to avoid differentiating through an iterative solver for the generalized eigenvalue problem of (14). A similar approach can also be used to compute the adjointsK andM . Consider the generalized eigenvalue system solver as a function that takes matrices K and M as input arguments and outputs the eigenfrequency λ k . The following dot product relationship [18,4] between the tangent and adjoint models holds: Plugging in the tangent model (24) yields In index notation, this can be expressed as From relationship R1, an expression for the adjoint stiffness matrixK ij can be deduced: Similarly, an expression for the adjoint mass matrix is found using relationship R2: After having solved for the eigenfrequency λ k and eigenvector u k , adjoint equations (32) and (34) can be seeded withλ k = 1. The adjointsK andM are then plugged into step 2 of figure 8, the adjoint system assembly, to finally compute the gradient (25). Note that unlike the adjoint linear solver, no additional eigenvalue systems have to be solved and the adjoints can even be evaluated explicitly.

Accuracy and Performance Results
The accuracy of the adjoint vibration analysis is assessed by computing the gradient (17) with the same cantilever beam test case as in section 3.3. The accuracy results are shown in figure 9. The gradients computed using the adjoint implementation show an excellent accordance with the finite differences and forward AD computed gradients.
A run time comparison of the primal versus adjoint implementations was made using the larger axial fan test case described in section 3.3. The results of the run time breakdown are shown in figure 11. Note that in this case only the assembly of the mass matrix M is measured, since the stiffness matrix K from the linear stress analysis can be reused. Nearly the entire computational effort (99.5%) of the primal run is due to the iterative eigenvalue solver. Not needing to differentiate through this expensive function allows more efficient adjoint runs. Ten eigenfrequencies and their respective gradients are computed for the run time breakdown. This means that the timing results are for ten adjoint evaluations, each amounting to approximately 1% of the considered run time.

Differentiation of Further Contributions
In addition to the node coordinates x, which mainly serve the purpose of shape design parameters, alternative design parameters may be of interest as well. For instance, in the case of composite material applications based on first-order shear deformation theory (FSDT), the lamination parameters V ∈ R l can be used to manipulate the material properties [5]. Gradients with respect to the lamination parameters could e.g. be used for material optimization problems. The design space could thus be extended as As a result, the stiffness and mass matrix computations are now defined to be dependent on the extended design space such that the adjoint vibration analysis as described in section 4 produces gradients with respect to node coordinates, as well as lamination parameters This is mainly achieved by using an object-oriented design approach in the implementation of composite material capabilities into the code of the vibration solver. The subclass CompositeElement, a child of class Element (fig 12), defines the construction of the stiffness and mass matrices using lamination Figure 12: IsotropicElement and CompositeElement defined as subclasses of Element parameters V . The algorithmically differentiated portion of the code was implemented for the parent class Element, enabling the introduction of additional element classes without additional AD differentiation. As a result, the composite element subclass and its matrix construction functions are automatically differentiated, which ensures maintainability of the differentiated code for future contributions.
Using a flat plate composite material test case (fig 13), the eigenfrequency gradients with respect to lamination parameters ∂λ k ∂V were computed. A comparison of adjoint-computed gradients with FD and forward AD in figure 14 show a positive agreement.

Conclusion
This paper presented the methods used for a discrete adjoint gradient evaluation for linear stress and vibration analysis within the context of a discrete adjoint structural solver for turbomachinery applications. The adjoint linear stress analysis involves solving an additional linear system to avoid the blackbox algorithmic differentiation of an iterative linear solver. Performance tests show that computing the sensitivities using the adjoint implementation costs as little as approximately 2.1 times the primal run time with 6.4 times the memory consumption, or 2.8 times the primal run time with 5.8 times the memory consumption.
The adjoint vibration analysis uses a similar approach to compute the adjoints of the generalized eigenvalue problem explicitly, without the need of solving any additional eigenvalue systems. Performance tests have shown that one reverse evaluation of this adjoint implementation costs about 1% of the forward run time. The accuracy test results for linear stress and vibration analysis show a good agreement of sensitivities computed using finite differences, forward AD, and adjoint, i.e. reverse, AD. With an AD differentiated code base, additional code contributions are automatically differentiated. As a result, eigenfrequency gradients with respect to composite material design parameters could be directly computed and the sensitivities correlate with the accuracy previously shown.
Further work would include extending the linear stress analysis to a nonlinear stress analysis, and a pre-loaded vibration analysis.

Funding
This project has received funding from the European Unions Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No. 642959.