Optimal control of MIMO input-quadratic nonlinear systems

We study the infinite-horizon optimal control problem for nonlinear, multi-input, input-quadratic systems. It is shown that optimality of the input-quadratic closed-loop system is intimately related to the property that an auxiliary input-affine system possesses an ${\mathcal{L}_2}$-gain less than or equal to one. Such equivalence is established, or approximated, by relying on (a combination of) three alternative sets of technical conditions, based (i) on the inclusion of the gradient of the underlying storage function in a certain co-distribution, (ii) on the verification of specific algebraic inequalities, or (iii) on a dynamic construction that immerses the original nonlinear plant into a system defined on an augmented state space.


I. INTRODUCTION
While the class of input-affine nonlinear systems has been thoroughly studied in the past decades (see, e.g., [3], [4], [13]), the literature concerning more general systems is instead mainly focused on plants that exhibit simultaneously a generic nonlinear dependence both on the state and on the control input, hence typically losing the particularly interesting structural and constructive insight acquired for input-affine systems. It is then not surprising that results dealing with input-quadratic nonlinear systems are rather limited and mostly hinging upon the notion of Control Lyapunov Function [5], [14], [6], despite the fact that the study of such a class of systems is significantly motivated by practical applications, including for instance magnetic systems [6] and micro-electromechanical systems (MEMS), based on electromagnetic or electrostatic actuation forces [9]. Moreover, quadratic inputs may appear in intermediate steps of the popular back-stepping stabilizing procedure, see e.g. [1] for the ball and beam example or [2] for the transient stabilization problem in multimachine power systems.
The main contribution of this paper consists in formulating and addressing the infinite-horizon optimal control problem for nonlinear, multi-input, input-quadratic systems. In particular, the property of optimality of the closed-loop system is shown to be intimately related to the property that an auxiliary input-affine system possesses an $\mathcal{L}_2$-gain less than or equal to one, provided that the gradient of the corresponding storage function belongs to a certain co-distribution. Alternatively, the explicit solution of the underlying Hamilton-Jacobi partial differential equation and the requirement of the above inclusion may be relaxed to a pair of algebraic inequalities by relying on the use of a dynamic extension, thus obtaining an approximate solution to the optimal control problem. Note that the case of single-input models, which is dealt with in [10], does not require any additional technical condition to establish the above equivalence. Therefore, with respect to [10], herein the analysis of the inclusion of the gradient of the storage function in the annihilator of a specific distribution is expanded and the constructive scheme based on algebraic solutions and dynamic extension is introduced.
This work has been partially supported by the European Union's Horizon 2020 Research and Innovation Programme under grant agreement No 739551 (KIOS CoE).
A. Astolfi is with the Department of Electrical and Electronic Engineering, Imperial College London, London SW7 2AZ, UK and with the Dipartimento di Ingegneria Civile e Ingegneria Informatica, Università di Roma "Tor Vergata", Via del Politecnico, 1 00133 Roma, Italy (Email: a.astolfi@ic.ac.uk).
The rest of the paper is organized as follows. A few preliminary results are briefly reviewed in Section II, while the considered problem is formulated in Section III. The main results concerning optimality of the underlying input-quadratic system and its approximation via immersion and dynamic extension are the topic of Sections IV and V, respectively. Finally, a simple numerical example illustrates the theory in Section VI.

II. NOTATION AND PRELIMINARIES
The aim of this section is to briefly review a few basic definitions and results that are instrumental for the following derivations. To this end, consider a nonlinear, input-affine system described by equations of the form
$\dot{x} = f(x) + \sum_{i=1}^{m} g_i(x) v_i + \sum_{j=1}^{d} p_j(x) w_j$, (1)
where $x(t) \in \mathcal{X} \subseteq \mathbb{R}^n$ denotes the state of the system, $v(t) \in \mathcal{V} \subseteq \mathbb{R}^m$ is a control input and $w(t) \in \mathcal{W} \subseteq \mathbb{R}^d$ is an exogenous signal. Assume that $f$, $g_i$, $i = 1, ..., m$, and $p_j$, $j = 1, ..., d$, are smooth vector fields mapping $x \mapsto T_x\mathcal{X}$, where $T_x\mathcal{X}$ denotes the tangent space to $\mathcal{X}$ at $x$. Moreover, let $y(t) \in \mathcal{Y} \subseteq \mathbb{R}^q$ denote the output of system (1).

Assumption 1:
The origin of $\mathbb{R}^n$, contained in $\mathcal{X}$, is an equilibrium point of system (1) with $v(t) = 0$ and $w(t) = 0$ for all $t \geq 0$, namely $f(0) = 0$.
• Given a continuous function $V : \mathbb{R}^n \to \mathbb{R}$, the following notation is employed in the rest of the paper. The notation $V : \mathbb{R}^n \to \mathbb{R}_{>0}$ denotes a function that is positive definite around the origin, namely such that $V(0) = 0$ and $V(x) > 0$ for all $x \neq 0$ in a neighborhood of the origin. Moreover, the notation $V : \mathbb{R}^n \to \mathbb{R}_{\geq 0}$ denotes a locally positive semi-definite function, namely such that $V(x) \geq 0$ in a neighborhood of the origin.

Definition 1:
A nonlinear system $\dot{x} = f(x) + g(x)v$ with output $y$ is said to be zero-state detectable from the output $y$ if, for any trajectory such that $v(t) \equiv 0$, the condition $y(t) \equiv 0$ implies $\lim_{t \to \infty} x(t) = 0$.
Assuming initially that $v(t) = 0$ for all $t \geq 0$, consider, on the space $\mathbb{R}^d \times \mathbb{R}^q$ of the external variables of (1), a function $s : \mathbb{R}^d \times \mathbb{R}^q \to \mathbb{R}$, referred to as the supply rate.
Definition 2: [13] The system (1), with $v(t) \equiv 0$, is said to be dissipative with respect to the supply rate $s$ if there exists a function $V : \mathbb{R}^n \to \mathbb{R}_{\geq 0}$, called storage function, such that for all $x_0 = x(0) \in \mathcal{X}$, $T \geq 0$ and inputs $w$
$V(x(T)) - V(x_0) \leq \int_0^T s(w(t), y(t))\, dt$. (2)
Moreover, if (2) holds with the equality sign, then system (1) is lossless with respect to $s$. •
Definition 3: [13] Let $\gamma > 0$. The system (1), with $v(t) \equiv 0$, has $\mathcal{L}_2$-gain less than or equal to $\gamma$ if it is dissipative with respect to the supply rate $s(w, y) = \gamma^2 \|w\|^2 - \|y\|^2$. •
The following classical result relates the possibility of imposing, via feedback, a desired $\mathcal{L}_2$-gain to the existence of a solution to a certain first-order quadratic partial differential equation, the so-called Hamilton-Jacobi (HJ) equation.
Proposition 1: [12] Consider the nonlinear system (1) and let $\gamma > 0$. Suppose that there exists a smooth solution $V : \mathbb{R}^n \to \mathbb{R}_{\geq 0}$ of the Hamilton-Jacobi equation (3) with $V(0) = 0$. Then system (1) in closed loop with $v = -g(x)^\top V_x(x)^\top$ has $\mathcal{L}_2$-gain less than or equal to $\gamma$ from the input $w$ to the output $[y^\top, v^\top]^\top$, with $V$ as a storage function.
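As a rough numerical illustration of Definitions 2 and 3, the following sketch checks the dissipation inequality along simulated trajectories. The scalar system $\dot{x} = -x + w$, $y = x$, the candidate storage function $V(x) = x^2$, the test inputs and the forward-Euler discretization are all assumptions chosen for illustration and do not appear in the paper. For this system $\dot{V} = -2x^2 + 2xw = (w^2 - y^2) - (x - w)^2 \leq w^2 - y^2$, which suggests an $\mathcal{L}_2$-gain less than or equal to $\gamma = 1$.

```python
import math


def simulate_dissipation(x0, w_of_t, horizon=10.0, dt=1e-3):
    """Forward-Euler run of the hypothetical system x' = -x + w, y = x.

    Returns (V(x(T)) - V(x0), supplied), where V(x) = x**2 is the candidate
    storage function and `supplied` is a left Riemann sum of the supply rate
    s(w, y) = gamma**2 * w**2 - y**2 with gamma = 1.
    """
    x = x0
    supplied = 0.0
    for k in range(int(round(horizon / dt))):
        w = w_of_t(k * dt)
        y = x  # output equation y = x (assumed)
        supplied += (w * w - y * y) * dt
        x += dt * (-x + w)  # Euler step of x' = -x + w
    return x * x - x0 * x0, supplied


def satisfies_dissipation_inequality(x0, w_of_t, tol=1e-9):
    """Check V(x(T)) - V(x0) <= integral of s along the simulated trajectory."""
    stored, supplied = simulate_dissipation(x0, w_of_t)
    return stored <= supplied + tol


if __name__ == "__main__":
    for name, w in (("zero input", lambda t: 0.0), ("sine input", math.sin)):
        print(name, satisfies_dissipation_inequality(1.5, w))
```

A finite set of simulated inputs can only fail to falsify the inequality; it does not replace the analytical argument, which must hold for all initial conditions, horizons and inputs.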

III. PROBLEM DEFINITION
Consider a nonlinear, multi-input system, quadratic in the control input, described by equations of the form (4), where $x(t) \in \mathbb{R}^n$ denotes the state of the system and $u(t) \in \mathbb{R}^m$ is the control input. The mappings $f : \mathbb{R}^n \to \mathbb{R}^n$, $g : \mathbb{R}^n \to \mathbb{R}^{n \times m}$ and $h_i : \mathbb{R}^n \to \mathbb{R}^{m \times m}$, $i = 1, ..., n$, are assumed to be sufficiently smooth and such that $h_i(x) = h_i(x)^\top$ for all $x$. Throughout the paper we assume that $f(0) = 0$, namely the nonlinear system (4) possesses an equilibrium at the origin with $u = 0$. The optimal control problem of interest can be formulated as follows.
Problem 1: Consider the nonlinear, input-quadratic, system (4) together with the cost functional (5), where $q : \mathbb{R}^n \to \mathbb{R}_{\geq 0}$ is a smooth positive semi-definite function, $q(0) = 0$, and $k : \mathbb{R}^n \to \mathbb{R}^n$ is such that $q(x) = k(x)^\top k(x)$ for all $x \in \mathbb{R}^n$. The infinite-horizon optimal control problem consists in determining a state-feedback control law $u^\star = \alpha(x)$, with $\alpha : \mathbb{R}^n \to \mathbb{R}^m$ a smooth mapping such that $\alpha(0) = 0$, for which $J_{x_0}(u^\star) \leq J_{x_0}(u)$ for any $u$ and all $x_0$. •
In the following, we refer to the function $V : \mathbb{R}^n \to \mathbb{R}_{>0}$, defined as $V(x_0) = J_{x_0}(u^\star)$ for all $x_0 \in \mathbb{R}^n$, as the value function of the optimal control problem. Consider now the following standing assumption, which is supposed to hold throughout the paper.
• Since the presence of the input nonlinearity in (4) may render the solution to Problem 1 a daunting task, a relaxed formulation is provided. In particular, the simplification is twofold: on one hand we allow for the presence of an additional running cost while, on the other hand, the design of a dynamic control law, instead of a static state feedback, is permitted, as summarized in the following statement.
Problem 2: Consider the nonlinear, input-quadratic, system (4) together with the cost functional (5) and Assumption 2. The infinite-horizon dynamic optimal control problem with stability consists in determining an integer $\nu \geq 0$, a positive semi-definite function mapping $\mathbb{R}^n \times \mathbb{R}^\nu$ into $\mathbb{R}_{\geq 0}$, a dynamic control law described by equations of the form (6), and an open set $U \subset \mathbb{R}^n \times \mathbb{R}^\nu$ containing the origin such that: (i) the zero equilibrium of the interconnected system (4), (6) is asymptotically stable with region of attraction containing $U$; (ii) for any $\tilde{u}(x, \xi)$ and any $(x_0, \xi_0)$ such that the trajectory of system (4), (6a) interconnected by $\tilde{u}$ remains in $U$, the inequality $\bar{J}_{x_0,\xi_0}(u) \leq \bar{J}_{x_0,\xi_0}(\tilde{u})$ holds, with the augmented cost $\bar{J}$ defined in (7). •
For convenience the following notation is introduced. Let the function $h_k^{ij} : \mathbb{R}^n \to \mathbb{R}$, $i = 1, ..., m$, $j = 1, ..., m$, $k = 1, ..., n$, denote the $(i, j)$ entry of the matrix-valued function $h_k$, as in (8), and define the mappings $\mu_{i,j} : \mathbb{R}^n \to \mathbb{R}^n$ as in (9). Note that, by symmetry of the matrix-valued functions $h_i$ in (4), it follows that $\mu_{i,j} = \mu_{j,i}$ for all $i = 1, ..., m$ and $j = 1, ..., m$. The section is concluded by stating a property of the vector field $f$, the matrix-valued functions $g$ and $h_i$, and the positive semi-definite function $q$ that is instrumental for the derivations in the following sections.
Fact 1: There exist non-negative integers $n_p$, $n_d$ and $n_l$ and smooth mappings (among them $p$ and $l$) such that the decompositions (10) and (11) hold for all $x \in \mathbb{R}^n$. •
Note that Fact 1 essentially summarizes, in (10), the property that a symmetric matrix-valued function can be decomposed into its positive and negative semi-definite parts and, in (11), the property that a positive semi-definite matrix-valued function can be written as the outer product of a certain mapping $l(x)$, the dimension of which is related to the rank of the matrices on the right-hand side.
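The two linear-algebra properties underlying Fact 1 can be sketched numerically for a constant $2 \times 2$ symmetric matrix. The code below is a minimal illustration under that assumption, not the construction used in the paper: the function names are hypothetical, the matrix is constant rather than state-dependent, and smoothness of the factors with respect to $x$ is not addressed. It splits $S$ into positive and negative semi-definite parts via its eigen-decomposition and returns each part as a sum of outer products, with as many factors as the rank of the corresponding part.

```python
import math


def sym2_eigenpairs(a, b, c):
    """Eigenvalues and orthonormal eigenvectors of S = [[a, b], [b, c]]."""
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    lam_max, lam_min = mean + radius, mean - radius
    if b == 0.0:
        # Diagonal case: the eigenvectors are the coordinate axes.
        v_max = (1.0, 0.0) if a >= c else (0.0, 1.0)
    else:
        # (S - lam_max I) v = 0 is solved by v = (b, lam_max - a).
        norm = math.hypot(b, lam_max - a)
        v_max = (b / norm, (lam_max - a) / norm)
    v_min = (-v_max[1], v_max[0])  # orthogonal direction in the plane
    return [(lam_max, v_max), (lam_min, v_min)]


def signed_factors(a, b, c, tol=1e-12):
    """Columns (pos, neg) with S = sum(p p^T for p in pos) - sum(n n^T for n in neg)."""
    pos, neg = [], []
    for lam, (vx, vy) in sym2_eigenpairs(a, b, c):
        if lam > tol:
            pos.append((math.sqrt(lam) * vx, math.sqrt(lam) * vy))
        elif lam < -tol:
            neg.append((math.sqrt(-lam) * vx, math.sqrt(-lam) * vy))
    return pos, neg


def rebuild(pos, neg):
    """Recombine the factors into the 2x2 matrix they represent."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for sign, cols in ((1.0, pos), (-1.0, neg)):
        for x, y in cols:
            s[0][0] += sign * x * x
            s[0][1] += sign * x * y
            s[1][0] += sign * x * y
            s[1][1] += sign * y * y
    return s
```

For instance, `signed_factors(0.0, 1.0, 0.0)` returns one positive and one negative factor for the indefinite matrix with unit off-diagonal entries, while a rank-one positive semi-definite matrix yields a single positive factor and no negative one.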

IV. OPTIMALITY OF MIMO INPUT-QUADRATIC SYSTEMS
In this section we explore the relation between the properties of optimality for the underlying input-quadratic system (4) and of dissipativity of an auxiliary input-affine nonlinear system. To provide concise statements of the following propositions, given the matrix-valued functions $h_i$, $i = 1, ..., n$, and a continuously differentiable function $V : \mathbb{R}^n \to \mathbb{R}$, define the matrix-valued function $M$ : [...] indicator function of the set $E \triangleq \{r \in \mathbb{N} : r \geq 3\}$, and with $D_2$ such that $D_2(V_x, \mu_d(x), 0) = 0$ for any $x \in \mathbb{R}^n$. Note that the terms $D_1$ and $D_2$ are identically equal to zero if $m \leq 2$. Moreover, [...] where $M^{ad}$ denotes the adjoint matrix of $M$ and the matrix-valued function $\Psi$ is such that [...].
The following result provides the characterization of the optimal solution to Problem 1 in a somewhat trivial case. Nonetheless, its proof illustrates the main ideas on which the proofs of the subsequent results are also based, hence it is explicitly reported.
Proposition 2: Consider the nonlinear, input-quadratic, system (4) together with the cost functional (5) and Assumption 2. Consider the input-affine system (16), where $v(t) \in \mathbb{R}^m$ is a control input, and suppose that there exists a smooth solution $V : \mathbb{R}^n \to \mathbb{R}_{\geq 0}$ to the HJ equation (17). Suppose, in addition, that the value function $V : \mathbb{R}^n \to \mathbb{R}_{>0}$ is such that the Lie derivatives satisfy $L_{\mu_{i,j}} V = 0$, for all $i = 1, ..., m$, $j = 1, ..., m$.
Then there exists $U \subseteq \mathbb{R}^n$, containing the origin, such that the state feedback (18) solves Problem 1 for any $x \in U$. Moreover, $V$ is the corresponding value function.
Proof: Consider the input-quadratic system (4) together with the cost functional (5). The corresponding Hamilton-Jacobi-Bellman (HJB) partial differential equation is (19), in the unknown $V : \mathbb{R}^n \to \mathbb{R}$, $V(0) = 0$. Then, by recalling (13) and by continuity of the involved functions, there exists a non-empty neighborhood $W$ of the origin such that $\det(M(x, V_x)) > 0$ for any $x \in W$. Therefore, the minimum of $HJB(x, u)$ with respect to the control input is attained at a point $\hat{u}(x)$ that depends continuously on $x$. Replacing $\hat{u}(x)$ into (19), the latter reduces to (20). Therefore, by relying on equations (13) and (14) and by recalling the decomposition in (10), the partial differential equation (20) can be equivalently arranged as (21). It is evident that equation (21) reduces to (17) provided the solution to the latter satisfies $L_{\mu_{i,j}} V = 0$, for all $i = 1, ..., m$ and $j = 1, ..., m$, thus concluding the proof.
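The minimization step in the proof can be sketched for two inputs as follows. The sketch assumes that the part of the Hamiltonian depending on $u$ has the form $\frac{1}{2} u^\top M u + b^\top u$, with $M = M(x, V_x)$ symmetric and $b = g(x)^\top V_x^\top$, evaluated at a fixed $x$; this scaling is an assumption, chosen to be consistent with the feedback $u^\star(x) = -M(x, V_x)^{-1} g(x)^\top V_x^\top$ stated in Proposition 3. The function names and numerical values are hypothetical.

```python
def minimizer_2x2(M, b):
    """Return u = -M^{-1} b for a symmetric positive definite 2x2 matrix M."""
    (m11, m12), (m21, m22) = M
    det = m11 * m22 - m12 * m21
    # Sylvester's criterion: the quadratic form is strictly convex iff
    # both leading principal minors are positive.
    if not (m11 > 0.0 and det > 0.0):
        raise ValueError("M must be positive definite for a unique minimizer")
    u1 = -(m22 * b[0] - m12 * b[1]) / det
    u2 = -(-m21 * b[0] + m11 * b[1]) / det
    return (u1, u2)


def quadratic(M, b, u):
    """Evaluate 0.5 * u^T M u + b^T u."""
    mu0 = M[0][0] * u[0] + M[0][1] * u[1]
    mu1 = M[1][0] * u[0] + M[1][1] * u[1]
    return 0.5 * (u[0] * mu0 + u[1] * mu1) + b[0] * u[0] + b[1] * u[1]


def gradient(M, b, u):
    """Gradient M u + b of the quadratic form with respect to u."""
    return (M[0][0] * u[0] + M[0][1] * u[1] + b[0],
            M[1][0] * u[0] + M[1][1] * u[1] + b[1])


if __name__ == "__main__":
    M = [[1.0, 0.5], [0.5, 1.0]]  # hypothetical value of M(x, V_x)
    b = (1.0, -2.0)               # hypothetical value of g(x)^T V_x^T
    u_hat = minimizer_2x2(M, b)
    print(u_hat, gradient(M, b, u_hat))
```

The explicit positive-definiteness check mirrors the role of the neighborhood $W$ in the proof: away from it the stationary point need not be a minimum, so the sketch refuses to return a value.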

Example 1.
To illustrate the constructions of Proposition 2, consider a nonlinear, input-quadratic system described by (23), which possesses the form (4), with $f(x) = Ax$, $g(x) = B$, where [...] and with [...] for some continuous functions $\sigma_i : \mathbb{R}^2 \to \mathbb{R}$, $i = 1, 2, 3$. Consider the cost functional (5) with $q(x) = x_1^2 + 2 x_1 x_2 + x_2^2$, which is positive semi-definite. Then, recalling the auxiliary input-affine (linear) system defined in (16), it can be easily shown that equation (17) reduces to the classic Algebraic Riccati Equation (ARE) $0 = PA + A^\top P + Q - P B B^\top P$, with $Q \in \mathbb{R}^{2 \times 2}$ such that $q(x) = x^\top Q x$, which admits the positive definite solution $P = I$. Therefore, and noting that by definition [...], it follows that $L_{\mu_{i,j}} V = 0$ for any $x \in \mathbb{R}^2$ and for any $i = 1, 2$, $j = 1, 2$. Thus, the optimal solution of the auxiliary problem for the linear system, i.e. $v^\star(x) = [-x_2, -x_1]^\top$, also constitutes the optimal solution for the family of input-quadratic nonlinear systems defined in (23).
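The Riccati computation in Example 1 can be checked numerically with the sketch below. The matrices `A` and `B` are assumptions, not values taken from the paper: `B` is chosen so that the standard linear-quadratic feedback $-B^\top P x$ with $P = I$ reproduces $v^\star(x) = [-x_2, -x_1]^\top$, and `A` is one hypothetical matrix, among infinitely many, whose symmetric part is compatible with $P = I$ solving the stated ARE. The matrix `Q` follows from $q(x) = x_1^2 + 2 x_1 x_2 + x_2^2 = x^\top Q x$.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def are_residual(A, B, Q, P):
    """Return the matrix P A + A^T P + Q - P B B^T P."""
    PA = matmul(P, A)
    AtP = matmul(transpose(A), P)
    PBBtP = matmul(matmul(matmul(P, B), transpose(B)), P)
    n = len(P)
    return [[PA[i][j] + AtP[i][j] + Q[i][j] - PBBtP[i][j]
             for j in range(n)] for i in range(n)]


def auxiliary_feedback(B, P, x):
    """Linear-quadratic feedback v = -B^T P x for the auxiliary linear system."""
    gain = matmul(transpose(B), P)
    return [-sum(gain[i][j] * x[j] for j in range(len(x)))
            for i in range(len(gain))]


A = [[0.0, -1.0], [0.0, 0.0]]  # hypothetical: satisfies A + A^T = I - Q
B = [[0.0, 1.0], [1.0, 0.0]]   # assumed: gives -B^T x = [-x2, -x1]
Q = [[1.0, 1.0], [1.0, 1.0]]   # from q(x) = x1^2 + 2 x1 x2 + x2^2
P = [[1.0, 0.0], [0.0, 1.0]]   # candidate solution P = I

if __name__ == "__main__":
    print(are_residual(A, B, Q, P))
    print(auxiliary_feedback(B, P, [3.0, 5.0]))
```

With these assumed data the residual vanishes entry by entry, which is the property the example relies on; a different admissible `A` would change the closed-loop dynamics but not the check itself.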
• The statement of the previous proposition entails that the optimal solution to Problem 1 is derived from a value function for the system (16), provided the latter function satisfies additional conditions in terms of its Lie derivatives along the vector fields $\mu_{i,j}$. The latter technical condition can be circumvented if a certain function of the state and of the storage function is (locally) negative semi-definite.
Proposition 3: Consider the nonlinear, input-quadratic, system (4) together with the cost functional (5) and Assumption 2. Consider the input-affine system (28), where $v(t) \in \mathbb{R}^{n_l}$ is a control input and $w(t) \in \mathbb{R}^{n_p}$ is a disturbance input. Suppose that there exists a smooth solution $V : \mathbb{R}^n \to \mathbb{R}_{>0}$ to the HJ equation (29) with the property that there exists an open set $U \subset \mathbb{R}^n$, containing the origin, such that $R(V_x, \mu_d, \mu_o) \leq 0$ for any $x \in U$, with $R$ defined in (22). Then there exists $\hat{U} \subseteq U$ such that the state feedback $u^\star(x) = -M(x, V_x)^{-1} g(x)^\top V_x^\top$ solves Problem 2 for any $x \in \hat{U}$, with $\nu = 0$ and with the additional running cost given by $-R(V_x, \mu_d, \mu_o)$.
Proof: The claim is proved by following the same arguments as those employed in the proof of Proposition 2. In particular, it can be easily shown that a solution to the partial differential equation (29) with the additional property that $R(V_x, \mu_d, \mu_o) \leq 0$ for all $x$ in a neighborhood of the origin is such that (21), hence (19), also holds with the inequality sign. As a consequence, Problem 2 is solved by a static feedback and with the additional running cost provided by the positive semi-definite term $-R$, with the set $\hat{U}$ obtained by considering the intersection of $W$, defined in the proof of Proposition 2, $U$ and a level set of the function $V$.
Remark 1: By inspecting the structure of the auxiliary input-affine system (28) and of the partial differential equation (29) in the statement of Proposition 3, it is evident that the optimality properties of the original nonlinear, input-quadratic system (4) with respect to (7) are in fact related to the property that the input-affine system (28) in closed loop with $v = -l(x)^\top V_x^\top$ possesses an $\mathcal{L}_2$-gain from the (virtual) disturbance input $w$ to the output $[y^\top, v^\top]^\top$, with $y = k(x)$, less than or equal to one, provided the underlying storage function satisfies an additional technical condition.
The following result, instead, combines (relaxed) versions of the conditions introduced in the two previous statements. To this end, consider a modified version of the decompositions in Fact 1, namely define $\bar{p}$ and $\bar{l}$ such that [...]. Note that the off-diagonal functions $\mu_o$ are equal to zero if the matrix-valued functions $h_i$ are diagonal, namely if (4) does not contain mixed terms $u_i u_j$ with $i \neq j$.
Proposition 4: Consider the nonlinear, input-quadratic, system (4) together with the cost functional (5) and Assumption 2. Consider the input-affine system (30), where $v(t) \in \mathbb{R}^{n_l}$ is a control input and $w(t) \in \mathbb{R}^{n_p}$ is a disturbance input. Suppose that there exists a smooth solution $V : \mathbb{R}^n \to \mathbb{R}_{>0}$ to the HJ inequality (31), namely system (30) in closed loop with $v = -\bar{l}(x)^\top V_x^\top$ has $\mathcal{L}_2$-gain less than or equal to one from $w$ to $[y^\top, v^\top]^\top$. Suppose, in addition, that the storage function $V : \mathbb{R}^n \to \mathbb{R}_{>0}$ is such that (i) $L_{\mu_{i,j}} V = 0$, for any mapping $\mu_{i,j}$ belonging to $\mu_o(x)$; (ii) there exists an open set $U \subset \mathbb{R}^n$, containing the origin, such that (32) holds, with $g_i : \mathbb{R}^n \to \mathbb{R}^n$ denoting the $i$-th column of $g$, for all $x \in U$. Then there exists $\hat{U} \subseteq U$ such that the state feedback (33) [...]
[...] denotes the left annihilator of $\Delta_{\mu_o}$. Moreover, the existence of at least one exact co-vector in $\Delta^\perp_{\mu_o}(x)$ is guaranteed provided the distribution $\Delta_{\mu_o}$ is nonsingular around the origin and its involutive closure has dimension smaller than $n$. Note that in general the distribution $\Delta_{\mu_o}$ is spanned by $m(m - 1)/2$ vector fields in $\mathbb{R}^n$, hence it is reasonable to expect that the distribution is nonsingular whenever $n > m(m - 1)/2$.

V. DYNAMIC SOLUTION VIA SYSTEM IMMERSION
While the discussion in Remark 2 deals with the requirement in item (i), the following statement tackles the inequality condition in item (ii). In particular, the rationale behind the following result is that, instead of satisfying the inequality $R_r(V_x, \mu_d) \leq 0$ by implementing the static state feedback (33), a similar inequality is dynamically enforced via the selection of the time evolution of a dynamic extension $\xi$, as suggested in (6). More precisely, by relying on constructions similar to those introduced in [11] and then further extended in [7] and [8], the following statement yields a characterization of the solution to Problem 2 based only on algebraic conditions rather than partial differential equations. To this end, consider first the partial differential equation (35), the solution of which, provided it satisfies item (i) of Proposition 4, should be approximated via an algebraic solution and the immersion of the system into an auxiliary one defined on an extended state space.
Following [7], the matrix-valued function $P : \mathbb{R}^n \to \mathbb{R}^{n \times n}$, $P(x) = P(x)^\top \geq 0$ for any $x \in \mathbb{R}^n$, is an approximate algebraic solution of (35) if there exists a matrix-valued function $\Sigma : \mathbb{R}^n \to \mathbb{R}^{n \times n}$, with $\Sigma(0) > 0$ and $\Sigma(x) \geq 0$ for any $x \in \mathbb{R}^n \setminus \{0\}$, such that (36) holds, where $\bar{F} : \mathbb{R}^n \to \mathbb{R}^{n \times n}$ is any matrix-valued function such that $\bar{f}(x) = \bar{F}(x) x$ for all $x$. Moreover, to provide a concise statement of the following result, define the function $V$ : [...] with $R = R^\top > 0$ such that $V$ is locally positive definite, with the property that its partial derivatives satisfy [...]. Finally, let the function $\pi : \mathbb{R}^n \to \mathbb{R}^n$ be such that $V_\xi |_{\xi = \pi(x)} = 0$ for any $x \in \mathbb{R}^n$.
Proposition 5: Consider the nonlinear, input-quadratic, system (4) together with the cost functional (5) and Assumption 2. Consider the input-affine system (30), where $v(t) \in \mathbb{R}^{n_l}$ is a control input and $w(t) \in \mathbb{R}^{n_p}$ is a disturbance input, and suppose that there exists a continuous algebraic solution $P$ to the algebraic inequality (36) such that (i) $(x^\top P(x) + \delta(x, \xi)^\top) \mu_{i,j}(x) = 0$, for any mapping $\mu_{i,j}$ belonging to $\mu_o$; (ii) there exists an open set $U \subset \mathbb{R}^n$, containing the origin, in which [...] $\delta(x, \xi)^\top (\bar{p}(x) \bar{p}(x)^\top - \bar{l}(x) \bar{l}(x)^\top)(2 P(x) x + \delta(x, \xi)) |_{\xi = \pi(x)}$ [...]

VI. A SIMPLE EXAMPLE
The theory is corroborated in this section by discussing a simple numerical example. To this end, consider the input-quadratic system described by the equation [...] for any $x \in \mathbb{R}$. According to the notation introduced above, $\mu_{1,1}(x) = \mu_{2,2}(x) = 0$ and $\mu_{1,2}(x) = 1$ for any $x$, and hence $\det(M(x, V_x)) = 1 - V_x^2$ and [...]. Therefore, the partial differential equation associated to the auxiliary disturbance attenuation problem is [...], with $R(V_x, \mu_d, \mu_o)$ in (22) equal to zero for all $x$, the solution of which is given by [...]. The latter function then yields the optimal control input [...].

VII. CONCLUSIONS
In this paper we have studied the infinite-horizon optimal control problem for multi-input input-quadratic nonlinear systems. It has been shown that the property of optimality of the closed-loop system is in fact strongly related to the property that an auxiliary input-affine system possesses an $\mathcal{L}_2$-gain less than or equal to one, provided that an additional technical condition is verified. Such conditions have then been relaxed to a pair of algebraic inequalities that are combined with a dynamic extension to yield an approximate solution to the optimal control problem.