Dynamic Certainty Equivalence Adaptive Control by Nonlinear Parameter Filtering

This paper presents a novel solution to the problem of designing an implementable (i.e., differentiator-free) model-reference output-feedback direct-adaptive controller for single-input-single-output linear time-invariant systems with relative degree possibly larger than one. The new paradigm is based on a version of the Dynamic Certainty Equivalence (DyCE) principle. The approach proposed in this work consists in realizing the DyCE control through surrogate parameter derivatives, made available by a Nonlinear Parameter Filter (NPF), instead of feeding the DyCE controller with the derivatives of the estimates produced by a High-Order Tuner (HOT). The proposed adaptive controller does not require error augmentation or normalization, allowing the use of large adaptation gains for fast convergence. Moreover, the proposed architecture can be easily equipped with well-known robust modifications of tuning laws. The performance of the proposed algorithm is demonstrated via comparative simulations with an error-augmentation-based method and a simplified HOT algorithm.


I. INTRODUCTION AND PROBLEM FORMULATION
Model Reference Adaptive Control (MRAC) is undoubtedly one of the most intensively studied problems by the adaptive control community, dating back to the 1950s. Despite this long and rich history, there are still several open issues in MRAC design that involve, in particular, systems with non-unitary relative degree.
The different MRAC approaches that have emerged from decades of research can be roughly grouped into two main categories. The first one, known as indirect-adaptive control, includes identifiers of the plant and observers of the states. Its main restriction is the dependence of control and observation performance on the identification process, where lack of persistence of excitation (PE) of the regressor may degrade the control performance or even cause instability. The second category involves direct-adaptive control schemes, where the controller parameters are directly updated online without the need of estimating plant parameters. Typically, direct-adaptive schemes enjoy the favourable property of guaranteeing asymptotic convergence of the tracking error even in the absence of PE. In this paper, we will only consider the direct-adaptive case.
The majority of direct MRAC schemes for uncertain linear time-invariant (LTI) systems reported in the literature relies on the certainty equivalence (CE) principle. This principle rests on the fact that the ideal model-matching controller is time-invariant and affine in the state variables of the system. Based on this paradigm, the unavailable (unknown) parameters of the ideal model-matching controller are replaced by their estimates. CE-type controllers can be readily applied to Strictly Positive Real (SPR) systems. Conversely, when applied to systems with relative degree larger than 2, direct MRAC schemes require error augmentation and normalization of the parameter adaptation law [1]. Normalization makes the convergence speed trajectory-dependent (i.e., non-uniform in the initial conditions) and typically slower in the presence of large initial parametric mismatch. These issues prompt an investigation of adaptive controllers that are not based on the CE paradigm and do not require normalization.
In this regard, an alternative approach to CE is DyCE adaptive control. This technique dates back to 1987, when Mudgett and Morse presented it in [2], although the name and the acronym for this method were coined by Ortega in [3]. In order to apply DyCE control to systems with arbitrary relative degree, high-order derivatives of the parameter vector are needed. Unfortunately, the tuners (adaptation laws for the parameters) available at the time of Mudgett and Morse were only able to provide the first derivative [4], [5]. In the seminal work [6], Morse provided DyCE with a modified parameter adaptation scheme, named HOT, which did not require normalization. HOT update laws are able to produce, without direct differentiation, the estimated parameter vector plus its first ρ derivatives, where ρ is the relative degree of the plant. Compared to augmented-error CE methods, the DyCE+HOT architecture is characterized by an enhanced transient performance, due to the possibility of using large adaptation gains without normalization. The flip side of the coin is the increased complexity of the original HOT method of Morse, which, in the absence of robust modifications, is known to suffer from a lack of robustness in case of poor excitation, and may exhibit parameter drift in the presence of unstructured perturbations. Nikiforov [7] incorporated a leakage modification into the HOT to make it robust in the presence of bounded perturbations (in the sense of guaranteeing boundedness of all signals). Moreover, the dynamical order of the tuner was reduced by a factor of 2n via a simplification of its structure. However, it is known that leakage modifications alter the adaptation dynamics so that the parameters are not guaranteed to converge to the true values, even in case of Persistence of Excitation (PE).
Other robust modifications used in conventional CE adaptive control, most notably parameter projection, are not affected by this problem and allow exact parameter convergence to be achieved under PE conditions and in the absence of unstructured perturbations. However, these modifications cannot be easily applied to the DyCE+HOT framework, due to the nonlinearity of the tuner.
Finally, it is worth mentioning that the Adaptive Backstepping method by Krstic, Kanellakopoulos and Kokotovic [8], like DyCE+HOT and its variants, does not need normalization and achieves enhanced performance compared to CE adaptive control. Adaptive backstepping is inherently nonlinear, as it involves the use of nonlinear tuning functions. However, backstepping design is less procedural than DyCE+HOT, in the sense that it calls for specific customization when applied to different plant models. This increased complexity with respect to DyCE+HOT comes with the advantage of being able to cope with unmatched model uncertainties. Like all other adaptive schemes, backstepping requires robust techniques such as projection, dead-zone and leakage modifications to enforce signal boundedness in case of poor excitation. Nonetheless, due to its complexity, it shares with DyCE+HOT a certain lack of popularity among practitioners.
In this paper, we propose a new approach to DyCE adaptive control. Instead of relying on Morse's HOT, our method consists in filtering the estimated parameter vector through a NPF, combined with a conventional first-order tuner. The NPF devised in this work is able to produce surrogate signals to be used in the DyCE control law in place of the unavailable higher-order derivatives of the parameters. The proposed adaptive controller does not require error augmentation or normalization, allowing the designer to use large adaptation gains, thus achieving, in principle, a faster convergence. As a noteworthy advantage over conventional DyCE+HOT methods, the proposed architecture can be easily equipped with all the robust modifications found in conventional adaptive control, such as leakage and parameter projection.
Notation: The Laplace transform of a signal u(t) : R → R will be denoted by u(s) = L{u(t)}. This notation slightly departs from the one (typical of the adaptive-control literature) making use of simple square brackets. Simple square brackets in this paper will be used for vectors and matrices, e.g. v = [1 0 0]ᵀ ∈ R³. | · | denotes the absolute value, whereas (·)ᵀ denotes the transpose of vectors and matrices.

II. CERTAINTY EQUIVALENCE CONTROLLER BY NONLINEAR PARAMETER FILTERING

Consider the single-input-single-output (SISO) LTI system described by

y(t) = b (N(s)/D(s)) u(t),    (1)

where u, y ∈ R denote the plant input and output, respectively, N(s) and D(s) are monic and coprime polynomials with unknown coefficients, and b is the high-frequency gain of the system. The following standard assumptions of output-feedback MRAC [5], [9] are made:
(A.1) The degrees m and n of N(s) and D(s) are known, and the relative degree ρ := n − m ≥ 1;
(A.2) b is non-zero, has known sign and is bounded in norm from above by a known constant b̄;
(A.3) The polynomial N(s) is Hurwitz.
The objective is to determine u(t) using a differentiator-free controller such that the trajectories of the closed-loop system are bounded, and the plant output y(t) tends asymptotically to the output y_r(t) of the reference model

y_r(t) = (1/D_r(s)) r(t),    (2)

where r(t) is referred to as the "reference command". The reference model satisfies the following assumptions:
(B.1) The command r(t) is a uniformly bounded piece-wise continuous function of time;
(B.2) The polynomial D_r(s) is monic and Hurwitz, of degree ρ = n − m.
Next, the DyCE will be used to design the MRAC. Using well-known results from adaptive control theory, the tracking error ỹ(t) := y(t) − y_r(t) can be written in the form where ε(t) is an exponentially-decaying signal which incorporates the effect of the unknown initial conditions, θ ∈ R^{2n} is a vector of unknown constant parameters, and η(t) ∈ R^{2n} is the regressor vector obtained by filtering the input, output and reference signals, with L(s) an arbitrary Hurwitz polynomial of degree n − 1.
The DyCE method consists in transforming (3), of relative degree ρ, into a new error equation of relative degree 0, for which the design of the parameter adaptation law becomes much simpler. Define a vector of filtered regressors, obtained by filtering η(t) as: Then the tracking error can be written as The DyCE paradigm applied to output-feedback MRAC consists in choosing a control law of the form where θ̂(t) is the estimated parameter vector, whose update law will be determined later. Defining the estimation error θ̃(t) := θ̂(t) − θ and substituting (6) into (5) yields the relative degree-0 (a.k.a. "algebraic" or "static") error model The main difficulty in this approach arises when one expresses (6) in the time domain, as u(t) will depend on the first ρ time-derivatives of both ξ(t) and θ̂(t), as seen in the following: where d_i ∈ R, i = 0, . . . , ρ, are the coefficients of the polynomial D_r(s), with d_ρ = 1.
It is noted that, in equation (7), the required ρ derivatives of ξ(t) are available without direct differentiation, due to the fact that ξ(t) is obtained from η(t) through the filter (4), which has relative degree equal to ρ. Conversely, the need to compute the first ρ time-derivatives of θ̂ prompts the development of specific adaptation laws capable of producing all the required derivatives without direct differentiation. In this regard, the DyCE control scheme will be applied using a NPF to develop the tuner, as an alternative to the HOT. The NPF method takes the current estimate of the parameter vector as input, and produces a filtered version together with all the needed derivatives. The parameter vector feeding the NPF can then be updated via a conventional first-order tuner (see (13) below), typically used in CE adaptive control. The proposed DyCE+NPF control law takes the following form: equivalent to the time-domain representation where θ̂_NPF(t) and its first ρ derivatives are the filtered parameter vector and its derivatives, all generated by the NPF to be determined.
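To illustrate why the derivatives of ξ(t) come for free, the following sketch (illustrative only; it uses D_r(s) = s² + 2s + 1, i.e. ρ = 2, as in the example of Section IV) realizes the filter 1/D_r(s) in companion form. The filter states are ξ and its first derivative, and the ρ-th derivative follows from the filter equation itself, so no differentiator is needed.

```python
import numpy as np

# Companion-form realization of xi = (1/D_r(s)) eta with
# D_r(s) = s^2 + 2s + 1 (hypothetical choice, rho = 2).
# The two states ARE xi and xi_dot; xi_ddot comes from the filter ODE.
A = np.array([[0.0, 1.0],
              [-1.0, -2.0]])   # companion matrix of D_r(s)
B = np.array([0.0, 1.0])

def filter_step(x, eta, dt):
    """One forward-Euler step; returns the new state, xi and xi_dot."""
    x = x + dt * (A @ x + B * eta)
    return x, x[0], x[1]

# usage: a constant input eta = 1 drives xi to the DC gain 1/D_r(0) = 1
x = np.zeros(2)
for _ in range(200_000):            # 20 s of simulated time at dt = 1e-4
    x, xi, xi_dot = filter_step(x, 1.0, 1e-4)
xi_ddot = 1.0 - 2.0 * xi_dot - xi   # xi'' = eta - 2 xi' - xi, no differentiation
```

The same idea extends to any monic Hurwitz D_r(s) of degree ρ: the companion states deliver ξ, ξ̇, …, ξ^(ρ−1), and ξ^(ρ) is recovered algebraically from the filter equation.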
Substituting the control law (8) into the error equation (5) and adding the term −b ξᵀ(t)θ̂(t) to each side yields where the effect of the exponentially-decaying term ε(t) has been neglected. Defining the NPF error vector θ̃_NPF(t) := θ̂_NPF(t) − θ̂(t), and the scalar signals (projected errors) equation (10) can be expressed in the time domain as Let the update law for θ̂(t) be given by the first-order tuner where µ > 0 is an arbitrary scalar gain. Substituting (12) into (13) and considering (11), one obtains This expression will be used later in the stability analysis.
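A toy illustration of such a first-order gradient tuner (this is the generic unnormalized gradient form, not a reproduction of the paper's law (13); gains, the regressor, and the true parameters below are hypothetical): for the static error model ε = ξᵀθ̃ with a persistently exciting regressor, the update θ̂̇ = −µ ξ ε drives the estimate to the true parameters, and µ can be made large for speed since no normalization is involved.

```python
import numpy as np

# Generic unnormalized first-order gradient tuner (illustrative sketch).
def tuner_step(theta_hat, xi, eps, mu, dt):
    """One Euler step of theta_hat' = -mu * xi * eps."""
    return theta_hat - dt * mu * eps * xi

theta_true = np.array([1.0, 2.0])   # hypothetical unknown parameters
theta_hat = np.zeros(2)
mu, dt = 2.0, 1e-3
for k in range(100_000):            # 100 s of simulated time
    t = k * dt
    xi = np.array([np.cos(t), np.sin(t)])      # persistently exciting regressor
    eps = xi @ (theta_hat - theta_true)        # projected error xi^T theta_tilde
    theta_hat = tuner_step(theta_hat, xi, eps, mu, dt)
```

With the rotating regressor above, the averaged error dynamics decay at rate roughly µ/2, so the estimate converges exponentially to θ.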
The NPF that generates the filtered parameter vector θ̂_NPF(t) takes on the following cascaded structure where Γ(t) ∈ R^{2n×2n} is a (possibly time-varying) gain matrix to be determined.
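A minimal sketch of such a cascaded structure (names and gains hypothetical; the paper's Γ(t) is designed in Section III, while here a constant gain is used for illustration): each layer is a first-order filter driven by the previous one, θ̂₁' = −Γ(θ̂₁ − θ̂) and θ̂ᵢ' = −Γ(θ̂ᵢ − θ̂ᵢ₋₁) for i = 2, …, ρ, with θ̂_NPF = θ̂_ρ. The derivatives of θ̂_NPF then follow from the layer states by repeated substitution, without differentiation.

```python
import numpy as np

def npf_step(layers, theta_hat, Gamma, dt):
    """One Euler step of the cascaded NPF; layers is a list of 2n-vectors."""
    new_layers = []
    prev = theta_hat
    for th in layers:
        d_th = -Gamma @ (th - prev)      # first-order layer dynamics
        new_layers.append(th + dt * d_th)
        prev = th                        # next layer is driven by this one
    return new_layers

# usage: with a constant input theta_hat, every layer converges to theta_hat
rho, n2 = 2, 4                           # hypothetical rho and parameter dimension
Gamma = 5.0 * np.eye(n2)                 # constant gain, for illustration only
layers = [np.zeros(n2) for _ in range(rho)]
theta_hat = np.ones(n2)
for _ in range(20_000):                  # 20 s at dt = 1e-3
    layers = npf_step(layers, theta_hat, Gamma, 1e-3)
theta_npf = layers[-1]                   # filtered parameter vector
```

Each layer's derivative is available in closed form from its own state and input, which is what makes the higher derivatives of θ̂_NPF computable in the DyCE law.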
Remark 2.1: The use of a conventional first-order adaptive law makes it easy to apply the usual robust modifications, including the σ- and e₁-modifications. In particular, parameter projection can be used whenever a convex admissible parameter set is known a priori. The main advantage of using parameter projection is that, in nominal noise-free conditions, the estimated parameters are guaranteed to converge to the true ones under sufficient excitation.
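As a sketch of how projection attaches to a first-order tuner (assuming, hypothetically, that the admissible set is a ball |θ| ≤ M): when the raw update would push the estimate outside the ball while pointing outward, the outward radial component is removed, which keeps the estimate admissible without altering the adaptation dynamics in the interior.

```python
import numpy as np

# Illustrative projection onto the ball |theta| <= M (M hypothetical).
def proj_ball(theta, update, M):
    """Strip the outward radial part of `update` on the boundary of the ball."""
    norm = np.linalg.norm(theta)
    if norm >= M and theta @ update > 0:        # on the boundary, pointing out
        n = theta / norm                        # outward unit normal
        update = update - (n @ update) * n      # tangential component only
    return update

# usage: on the boundary of the ball of radius 5, an outward update
# is projected onto the tangent plane
theta = np.array([3.0, 4.0])                    # |theta| = 5
raw = np.array([1.0, 1.0])                      # points outward
safe = proj_ball(theta, raw, 5.0)
```

The projected update is orthogonal to the outward normal, so the estimate cannot leave the admissible set, while updates in the interior pass through unchanged.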

III. STABILITY ANALYSIS
In this section, we present the stability analysis of the proposed scheme, using the following intermediate result. Lemma 3.1 (Invariance-like Lemma): If two scalar functions V(t), W(t) : R≥0 → R≥0 satisfy the following conditions: The proof of the above Lemma follows from the Lyapunov-like Lemma in [10] (which, in turn, is derived from Barbălat's Lemma) and is omitted here due to space limitations.
To aid the discussion, let us introduce the parametric differences between successive layers of the NPF: θ̃₁(t) := θ̂₁(t) − θ̂(t) and θ̃ᵢ(t) := θ̂ᵢ(t) − θ̂ᵢ₋₁(t), for i = 2, …, ρ. Moreover, the layer-to-layer differences obey the dynamics: where we have taken advantage of the relation d/dt θ̂ᵢ(t) = −Γ(t)θ̃ᵢ(t). It is worth noticing that the NPF error vector, θ̃_NPF(t), can be expressed as Moreover, defining the scalar signals then ε_NPF(t) can be expressed as The main result concerning the stability of the proposed scheme is reported in the following theorem. Theorem 3.1: If Assumptions (A.1)-(B.2) hold, then for system (1) in closed loop with the adaptive controller comprising the DyCE controller (9), the tuner (13) and the filter (15), there exists a proper choice of Γ(t) such that the trajectories of the closed-loop system originating from any initial condition are bounded, and the tracking error ỹ(t) converges to zero asymptotically.
Proof: In view of (12), the following implication holds: Therefore, the proof consists in showing that the DyCE+NPF scheme makes ε_θ(t) and ε_NPF(t) converge to zero simultaneously. With this in mind, consider the following quadratic function as a building block of the overall candidate Lyapunov-like function: where α : 0 < α < 1 and σ_Ξ > 0 are arbitrary positive scalars, and Ξ(t) := ξ(t)ξ(t)ᵀ. Thanks to (11), (17) and (19), one can use V_NPF as a building block of the candidate Lyapunov-like function to establish the convergence of ε_NPF(t). Before proceeding further, Γ(t) is given the following structure: with Λ(t) to be determined. Taking into account (16) and (22), the derivative of V_NPF along the trajectories of the closed-loop system reads as To study the stability of the closed-loop adaptive system, consider the following candidate Lyapunov-like function The challenge is to design Λ(t) so as to make V a Lyapunov-like function for the closed-loop adaptive system. In what follows, the notation will be streamlined by omitting the explicit time-dependence of all signals, unless strictly required. By exploiting (14), the derivative of V along the system's trajectories can be written as Assigning the matrix Λ(t) = σ_Ξ Ξ(t) + Λ̄(t), with Λ̄(t) a (possibly time-varying) symmetric positive-definite matrix to be determined, one obtains Application of Young's inequality and the identity ε_NPF = Σᵢ₌₁^ρ ξᵀθ̃ᵢ to selected terms of the above expression yields where Ψ(t) is a time-varying non-negative-definite symmetric matrix defined as Ψ(t) := (I + σ_Ξ Ξ(t))ξξᵀ(I + σ_Ξ Ξ(t)). Application of Young's inequality to the last two terms of the right-hand side of the inequality for V̇ yields where Φ(t) is a time-varying strictly positive-definite symmetric matrix defined as Φ(t) := σ_Ξ (I + Ξ(t)ᵀΞ(t)).
Accordingly, the derivative of the candidate Lyapunov-like function can be bounded as follows The previous inequality can be rewritten in compact form as: where γ₁(Ψ, Φ, Ξ), γ_ρ(Ψ, Φ, Ξ) and γᵢ(Ψ, Φ, Ξ) are defined as for i = 2, …, ρ − 1. Next, a selection of Λ must be made so that γ₁, γ_ρ and γᵢ become strictly positive for all t ≥ 0, to make the right-hand side of (24) negative semi-definite. To this end, Γ and Λ are selected as where λ > 0 is an arbitrary scalar constant, while the matrix Λ* will be assigned to be a time-varying positive-definite matrix dependent on available signals. Substituting (26) into (25), one obtains Next, Λ* = Λ*(Ψ, Φ, Ξ) will be designed such that γⱼ > 0, for all j = 1, 2, …, ρ and t ≥ 0. Noticing that Λ ≥ Λ* + λΞ, by virtue of (26), one obtains where i = 2, …, ρ − 1 and ρ̄ := µb̄(1 + ρ/2). Choosing for all i = 2, …, ρ − 1. As a result, both terms on the right-hand side of (28) are positive, as for all i ∈ {2, …, ρ − 1}. Since 0 < α < 1, one can easily see that (1 + α)/(1 − α) > 1. Moreover, due to the choice of λ made in (29), we have so that the following bound holds for γ_ρ: Finally, (29) also guarantees that for any α ∈ (0, 1). Therefore, one obtains Substituting the lower bounds in (31), (33) and (35) into (24), recalling the relations defined in (11) and (18), and that Ξ(t) = ξ(t)ξ(t)ᵀ, one finally obtains where It is straightforward to see that V(t), defined in (23), and W(t) are positive semi-definite, while V̇(t) is negative semi-definite in view of (36). Therefore, from (21), (23) and (36) we can infer that θ̃, θ̃ᵢ ∈ L∞ and ε_θ, εᵢ ∈ L₂, for i = 1, …, ρ. This in turn implies that ε_NPF ∈ L₂, and therefore ỹ ∈ L₂.
By invoking Assumptions (A.3) and (B.2) (i.e., customary minimum-phase arguments), and owing to the fact that y is the sum of y_r ∈ L∞ and ỹ ∈ L₂, one can easily show that both the filtered regressor vector ξ and its first derivative ξ̇ are bounded (the vector ξ can be written as the output of a bank of strictly proper stable linear filters taking the signal y as input). From the boundedness of ξ and ỹ we can conclude that also the derivative of θ̂ belongs to L∞ and, consequently (each layer of the filter (16) being stable by design), that θ̂ᵢ and their derivatives belong to L∞, for i = 1, …, ρ. Then, the boundedness of θ̂, ξ, ξ̇, θ̂ᵢ and their derivatives implies that Ẇ(t) is bounded. Hence, W(t) is a uniformly continuous function. Then, according to Lemma 3.1, W(t) converges to zero asymptotically, implying that ε_θ(t) and εᵢ(t), with i = 1, …, ρ, converge to zero as well. Thanks to (19), the asymptotic convergence of ε_NPF(t) follows. Finally, in light of (20), the convergence of ε_θ(t) and ε_NPF(t) implies that of ỹ(t).
Remark 3.1 (Implementability-Causality): This remark is aimed at showing that the DyCE+NPF adaptive controller is implementable. Compared to the HOT of Morse, which contains matrix-gain terms depending on Ξ(t) = ξ(t)ξ(t)ᵀ, the gain matrix Γ(t) of the NPF reads as hence, Γ(t) depends on the regressor vector ξ(t) and its derivative ξ̇(t). The dependence of Γ(t) on ξ̇(t) calls for further considerations about the implementability (that is, the realizability via a causal system) of the ρ-th derivative of the filtered parameter vector. It is noted that the derivatives of θ̂_NPF(t) required in the implementation of the DyCE control satisfy the following functional dependence: θ̂_NPF^(i)(t) = fᵢ(θ̂_ρ(t), …, θ̂₁(t), θ̂(t), ξ(t), …, ξ^(ρ)(t)).
Since the derivatives of ξ(t) are available up to the ρ-th order, the derivatives θ̂_NPF^(i)(t), i = 1, …, ρ, can be computed by a causal system, and the proposed controller is therefore implementable.

IV. ILLUSTRATIVE EXAMPLE
In this section, a numerical example is provided to demonstrate the effectiveness of the proposed adaptive controller. The proposed algorithm is compared with the classical augmented-error-based MRAC with normalization [9] and the simplified HOT [7] of Nikiforov. The Runge-Kutta integration method with fixed sampling interval T_s = 10⁻³ s has been employed for all simulations.
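For reference, the fixed-step fourth-order Runge-Kutta scheme used for the simulations can be sketched as follows (a generic textbook implementation, not the authors' code):

```python
import math

def rk4_step(f, x, t, h):
    """One classical fixed-step RK4 step for x' = f(t, x)."""
    k1 = f(t, x)
    k2 = f(t + h / 2, x + h / 2 * k1)
    k3 = f(t + h / 2, x + h / 2 * k2)
    k4 = f(t + h, x + h * k3)
    return x + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)

# usage: integrate x' = -x from x(0) = 1 over 1 s with h = 1e-3;
# the result approaches exp(-1) with O(h^4) global error
x, t, h = 1.0, 0.0, 1e-3
for _ in range(1000):
    x = rk4_step(lambda t, x: -x, x, t, h)
    t += h
```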
Consider the relative degree-two unstable LTI plant with unknown parameters a₂ = 2, a₁ = −1 and b = 2 and external disturbance d(t). The upper bound b̄ = 4 is assumed on the high-frequency gain b, whose sign is known a priori. The reference model is selected as y_r(t) = (1/(s² + 2s + 1)) r(t). We first consider a sinusoidal reference signal r(t) = 4 sin(0.8t) in a disturbance-free scenario, i.e., d(t) = 0. For all three methods considered, the plant model is initialized with the same initial condition, y(0) = ẏ(0) = 0, whereas the gains are tuned so as to achieve comparable convergence speed. More specifically, the AugE (short for augmented-error adaptive controller with normalization) is tuned with the following selection of the gains: Λ(s) = s + 2, γ = 0.5 and Γ = 0.5 I_{2×2}; conversely, the parameters of the HOT (short for Nikiforov's HOT controller) have been selected as λ = 1, µ = 1, γ = 1 and σ = 0. The parameters of the proposed controller, denoted by the acronym NPF, have been chosen as: µ = 1, α = 0.5 and λ = 10, whereas Γ(t) is given by (27) and (29). The behavior of the tracking error for the three MRAC algorithms is shown in Figure 1.
From the analysis of Figure 1, it is noted that the three methods all succeed in tracking the reference signal with similar convergence time. However, the DyCE methods based on HOT and NPF display a better transient behavior.
Next, the performance of the three algorithms in the presence of the bounded disturbance d(t) = 2 sin(3t) + 0.1 sin(20t) is compared. The results of the simulations are reported in Figure 2, which suggests that the proposed method achieves a higher tolerance to high-frequency disturbances than the other two methods.

V. CONCLUDING REMARKS

In this paper, we have proposed a new model-reference output-feedback adaptive controller for SISO LTI systems with relative degree possibly larger than one. The new adaptive controller is based on the DyCE principle and consists of a nonlinear parameter filter combined with a first-order tuner. The rationale behind the proposed architecture consists in designing a cascaded nonlinear low-pass filter that, once applied to the parameter vector obtained by standard adaptation laws, produces a filtered parameter vector whose derivatives are computable, making the DyCE control law implementable.
One remarkable feature of the proposed adaptive controller is that it does not require normalization, which is instead mandatory in augmented-error adaptive controllers for systems with relative degree larger than two. Hence, the proposed DyCE adaptive controller with NPF is able to use large adaptation gains, thus achieving fast convergence, similarly to the HOT formulation of Morse. Moreover, the proposed architecture can be easily equipped with the robust modifications of update laws, including leakage modifications and parameter projection.