Enhancing Robot-Environment Physical Interaction via Optimal Impedance Profiles

Physical interaction of robots with their environment is a challenging problem because of the exchanged forces. Hybrid position/force control schemes often exhibit problems during the contact phase, whereas impedance control appears to be more simple and reliable, especially when impedance is shaped to be energetically passive. Even if recent technologies enable shaping the impedance of a robot, how best to plan impedance parameters for task execution remains an open question. In this paper we present an optimization-based approach to plan not only the robot motion but also its desired end-effector mechanical impedance. We show how our methodology is able to take into account the transition from free motion to a contact condition, typical of physical interaction tasks. Results are presented for planar and three-dimensional open-chain manipulator arms. The compositionality of mechanical impedance is exploited to deal with kinematic redundancy and multi-arm manipulation.


I. INTRODUCTION
Recent frontiers of robotics research have encountered a major obstacle in physical human-robot interaction. Contact with the environment requires fine control not only of robot position but also of the wrench exerted. The importance of this concern is even more pressing for applications in which robots are intended to interact physically with humans for working activities, for example with cobots, or for assistive applications, as is the case with exoskeletons, rehabilitative and perhaps companion robots (see [1], [2] for review).
Classical approaches, such as position control, typically fail to modulate force effectively, while force control is effective only during contact. For this reason, several studies focused on methodologies able to control in both domains. Proposed solutions to this problem can be categorized loosely into two broad classes: i) Active interaction control; ii) Passive interaction control. The first group is essentially a 'hybrid' combination of position and force control [3]. The reader may refer for example to [4], [5]. These approaches typically require end-effector force sensors for their implementation, even if some techniques are available for estimating force [6]. Moreover, force control may result in unstable behavior during contact with external objects [7]. The second group, instead, implement energetically passive control of force which is not affected by typical force-control problems. Among the second group, one of the simplest effective strategies is impedance control [8]. The idea behind the impedance control, originally presented as a general control framework for manipulators in free or constrained motion, with and without dynamic interaction with the environment, is conceptually simple. The goal of the controller is to produce and maintain a desired impedance, i.e. force as a dynamic function of the deviation of position from its zero-force value. This concept appears to be effective since it manages contact with external objects in the environment without explicitly programming the transition between free motion and contact, yet preserves adequate tracking of a zero-force trajectory during free movements. Several applications have benefited from impedance control; the interested reader may refer for example to [9], [10], [11].
One characteristic of impedance control implementation is an increase in the number of parameters to be specified in the controller. That is, in addition to the zero-force trajectory, the designer is required to tune the robot stiffness, damping and (potentially) parameters related to higher-order derivatives (e.g. inertia). How best to modulate impedance parameters is still an open question. Typical implementations pre-tune these parameters depending on the specific task. It has been observed that high impedance values are desirable for fine movements, while lower values should be used for large/rapid movements [12], [13]. In [14], the authors propose a strategy to adapt robot impedance depending on the human's behavior during a physical human-robot interaction task. Adaptive impedance control has been used to account for differences between model and manipulator parameters [15], [16]. In addition, the ability to adapt controller parameters may be used to improve system performance and has been proposed to optimize task efficiency [17], [18], [19].
In this paper, we build upon the state-of-the art and tackle the problem of planning the whole robot behavior in terms of both motion and impedance profiles. More specifically, we propose an optimization-based approach to modulate the Cartesian impedance of a generic manipulator, in terms of the desired stiffness and damping, together with the zero-force trajectory commanded to the controller. This would shape the manipulator overall behavior in a coherent and coordinated way. The results presented in this paper demonstrate that our implementation is able to cope with different kinematic structures and effectively exploit the properties of impedance Fig. 1. Sample task for a single arm. The robot starts from an initial configuration x S and moves to push against the spring to reach the goal position x G . This task includes a transition from a free movement condition to an environment-interaction condition. compositionality to manage robot redundancy. To demonstrate our approach, we present a series of simulation experiments, first with a simple planar structure, then generalizing to a multi-arm redundant three-dimensional robot. To test the effectiveness of our methodology, all tasks include a transition from free motion to contact with a compliant environment. Although an experimental validation remains to be conducted, the method described in this work develops an approach to plan the motion and impedance of a manipulator as a whole, and potentially provides a means to ensure safe and effective use of assistive and rehabilitative devices in daily life.

II. BACKGROUND
The theory of impedance control is well established in the literature. This section reviews the main concepts necessary for the rest of this paper. As mentioned in the introduction, the general idea is to control the dynamic relation between the generalized configuration of a manipulator and the energydual generalized forces. Given a general operational space with p dimensions, we can define the robot generalized configuration x ∈ R p , its derivativesẋ,ẍ, · · · ∈ R p , and the output force/torque F ∈ R p . A common implementation of mechanical impedance is where Z is an arbitrary non-linear operator and x 0 ,ẋ 0 ,ẍ 0 , · · · ∈ R p define the motions of the robot that yield zero force/torque, F = 0. One typical implementation is to assume a linear operator and truncate the derivatives at the first order, in such a way that the external perceived behavior is the effect of a spring and a damper. This results in the following linear controller [8] It is important to recognize that all forces/torques-whatever their origin-superimpose their effects on the inertial dynamics of a robot. As a result, multiple controllers defined in the form of Eq. 2 may be combined linearly to increase the complexity of the behavior. That is, mechanical impedance has the property of compositionality. So far we defined x as a generalized configuration variable but Eq. 2 can be defined in different spaces. For physical interaction, one may want to shape the impedance perceived at the robot end-effector. In that case, without any loss of generality, consider a manipulator with n revolute joints q = [q 1 , . . . , q n ] T and an end-effector moving in an m-dimensional Cartesian space which has fk(q) forward kinematics and a Jacobian matirx J(q) ∈ R m,n so that Eq. 2 may be rewritten as where K EE ∈ R m,m and B EE ∈ R m,m are the endpoint stiffness and damping matrices respectively, q ∈ R n is the vector of the joint angular values and x 0 ,ẋ 0 ∈ R m are the end-effector zero-force position and velocity.
This formulation enables control of the robot trajectory directly in Cartesian space, with a desired impedance also specified in Cartesian space. If endpoint stiffness is calculated at zero endpoint force, the map from endpoint impedance matrices to joint impedance matrices is straightforward: Importantly, note that no kinematic inversion is required. For additional details on this topic, the interested reader is referred to [8], [3]. It is worth mentioning that the major goal of impedance control is to shape the interaction between the robot and the external environment rather than accurately control its motion. For this reason, in case of external forces, the actual motion of the robot will be different from the zero-force motion x 0 ,ẋ 0 . Hereafter we will use the subscript 0 to refer to the zero-force motion, and use the subscript goal to refer to the desired motion. As previously stated, in the presence of external forces x 0 = x goal . The difference between desired and zero-force motion will be used as an additional degree of freedom for the impedance optimization presented in the remainder of this paper.

A. Redundant robots
In the case of robots with a more degrees of freedom than the dimension of Cartesian space, the impedance control implemented in Eq. 3 has a nullspace cardinality higher than zero. In other words, a desired Cartesian impedance can be achieved via a manifold of robot configurations. This may be an undesired effect, since a small perturbation may produce an arbitrarily large movement in the nullspace. To address this problem, one solution is to include a second impedance control which defines a configurationspace impedance with respect to a desired rest position. In this paper we implemented this concept at the joint level as: where τ J are the joint torques required to control the robot configuration toward the rest position q rest with stiffness and damping matrices K J ∈ R n,n and B J ∈ R n,n respectively.
Note that, since the rest configuration defined by the two controllers is different in general, their superimposition may produce a steady-state error at the end-effector with respect to the rest position of the Cartesian impedance control. This can be arbitrarily reduced by design if K J << K EE . Alternatively, zero error may be achieved via null-space projection [20], [21]. Yet another possibility is that the optimization (presented below) could adjust x 0 to achieve zero steady-state error (that is x = x goal ). Although we are aware that null-space projections could solve the issue in this case, other causes may also result in tracking errors (e.g. imperfect gravity compensation). In this work, we decided to follow the latter approach to show the potential of optimization to compensate for disturbances that may result in tracking error.

III. IMPEDANCE OPTIMIZATION
In Sec. II we discussed the stiffness and damping matrices as coefficients of the controller. Typical implementations use fixed parameters selected through heuristic strategies. However, as mentioned in the introduction, having the possibility to modulate the controller parameters during task execution may be a major advantage for robot physical interaction, yet how to shape time-dependent impedance profiles is still an open question. In the following we propose a solution to this problem by presenting a novel approach to jointly optimize the impedance parameters and reference profiles for a given (arbitrary) robot interacting with its environment.

A. Interaction modeling
Modeling physical contact with an external object is a complex topic. However, in many applications it is sufficient to schematize it as a force experience at the contact point. A typical approach to simulating contact with a compliant object is to calculate such a force/torque as the reaction of a spring pushed by the robot against a fixed reference. We will refer to this model in the following through the parameter k E , which relates the strain in the contact with the corresponding exchanged force. The model can be formalized as W ext = k E ε, where the strain ε is calculate as the difference ε = x − x rest between the endpoint position and the rest configuration of the spring (all defined in the same reference frame). Note that the particular model of the external environment that we assumed is neither mandatory nor a limitation of this work, since it is only used to simulate a sample task and any other model would require similar computational cost. The general experiment considered in this work is depicted in Fig. 1.

B. Robot definition
To show the theory behind our implementation clearly, we build this paper starting from a simple basic building block, a planar 4R manipulator (see Fig. 2). This keeps the initial complexity low, yet preserves the general goal and the essential challenges as well as the consequences of robot redundancy. Later in this paper, the complexity of the model will be increased, moving toward bi-manual 4R structures and fully humanoid structures. The motion of the manipulator is simulated by integrating a dynamic model of the serial links through the Robotics Toolbox [22], which can be formalized as in the following: where the term related to the gravity is assumed to be compensated by a gravity compensation controller, M(q) is the inertia matrix of the manipulator and c(q,q) takes into account Coriolis and centripetal forces.

C. Cost function
Since mechanical impedance is a relation between force and motion, it is important to include in the objective function to be minimized at least one term related to force and one related to robot motion at the same point. The reason to do this is also practical. A penalty on end-effector position is needed to follow a desired path, while minimizing end-effector force is useful for several practical purposes, including increased safety while interacting with humans or the environment, lower power consumption, etc. [23]. The general form of the scalar objective function can be written as: where x is the time-varying configuration of the endeffector, x goal is a desired reference (virtual) trajectory, x t f in and x t f in goal are the end-effector position and target at the end of the action, W ext is the external wrench exerted on the endeffector. α 1 , α 2 and α 3 ∈ R + are task-dependent constants with proper units, and || • || is the norm operator. Note that the first and third terms quantify the tracking error and external wrench along the whole trajectory. The particular selection of α 1 , α 2 and α 3 will affect the shape of the optimization problem and, hence, the final solution to which Fig. 3. Snapshots of the optimal robot behavior detailed in Fig. 4. Visualization was performed using the Robotic Toolbox [22] the algorithm will converge. Using higher values for α 1 and α 2 will provide more precise path following, at the expense of higher interaction-forces; vice-versa, a larger value of α 3 will result in a reduction of W ext up to the point of preventing a satisfactory path following of robot endpoint. It is hence clear that the particular selection of relative weights of cost function terms is a critical point that deserves particular care, also considering task specific conditions.
Given the cost function in Eq. 7, the optimization problem can be formalized as: where K * EE , B * EE and x * re f are the optimal parameters for the impedance control of Eq. 3, k i, j , b i, j are the i, j entries of K EE , B EE , and l k i, j , u k i, j , l b i, j , u b i, j , l x goal,i , u x goal,i are the lower-and upper-bounds for stiffness, damping and controller reference configuration in task space respectively. The optimal values K * EE , B * EE are shaped to be diagonal, while lower bounds are set to zero to guarantee that matrices are always positive definite.
The problem of Eq. 8 is in general non-convex. However, a good approximation can be achieved with a reasonable selection of an initial guess. In our case we heuristically selected constant values for K EE and B EE , while the initial guess for x 0 was a minimum-jerk trajectory between the initial and final desired positions. This choice of trajectory was inspired by human movements (see e.g. [24], [25]), and is typically used to generate continuous, smooth movements in robotics [17], [26]. Let x 1 and x N be the initial and final goal positions of the robot end-effector; the minimum jerk trajectory between x 1 and x N is calculated as

D. Problem complexity and generalization to parallel robots
The optimization problem formulated in Eq. 8 is written in a general form and is valid both for planar and for 3D structures. The main difference between these two conditions is the dimensionality of K * EE , B * EE and x * 0 . For the planar case the end-effector is defined in m = 3 dimensions, while for the general case m = 6. This results in an increased computational time, but the formulation of the problem is still valid. For this reason, we will present results on a planar robot first, and on a 3D humanoid robot later. To optimize the whole robot motion, each optimization variable is defined for a series of time frames, i.e. K EE ∈ R m×m×T , B EE ∈ R m×m×T and x 0 ∈ R m×T , where T is the number of time knot-points (or via-points) considered. Note that the greatest contribution to computational complexity does not arise from the dimensionality of the robot's configuration space or its end-effector workspace but rather from the number of time knot-points considered in the optimization.
To reduce complexity we defined a set of six evenly spaced knot-points and optimized the controller parameters at these points. The impedance and reference profiles were then interpolated using a piece-wise cubic polynomial. Specifically, between two adjacent knots the function was approximated by a cubic Hermite polynomial with a constraint on the derivatives at the knots. The resulting profile is guaranteed to be continuous with a continuous first derivative, while the interpolation error at the knots is zero. Interpolation via piece-wise cubic Hermite polynomials is shape preserving. This results in interpolated stiffness and reference profiles that preserve the shape of the optimized data, respect monotonicity and maintain local minima/maxima. Note that, even if the approximation of motion and impedance profiles by piece-wise cubic Hermite polynomials produces continuous variation, this does not imply continuous dynamics; interaction with the external environment may introduce discontinuities in the robot's actual motion. For additional details on the implementation we refer to [27], [28].

IV. RESULTS
The impedance optimization algorithm discussed in Section III was implemented and tested on several kinematic structures. To show the effectiveness of our approach we present a set of simulations performed on a 4R planar robot (see Fig. 2). In this implementation, the objective function introduced in Eq. 8 was minimized via an Interior Point Algorithm. For each iteration, the algorithm simulated the robot motion using an updated set of parameters. The endeffector goal trajectory of Eq. 7 was a minimum jerk trajectory between an initial configuration in free space and a final configuration that was within the simulated object, thus in a configuration where W ext > 0 (See Fig. 1). The initial guess for the optimization parameters were constant stiffness and damping and a minimum-jerk trajectory between initial and final configuration (the same as the goal trajectory). Weighting coefficients for the cost function were selected as α 1 = 100, α 2 = 1 and α 3 = 5. Results are depicted in Fig. 4.
The algorithm was able to identify a solution that locally minimized ζ in a small number of iterations (fewer than 40 for the 4R case). For each iteration, the time-varying profiles for the impedance control parameters were updated, until the minimum was approximated with suitable precision (step size less than 10 −15 ). It is interesting that, without any constraint on the shapes of the impedance profiles and reference trajectory, the calculated optimal solution found impedance variations that would increase the stiffness in proximity of the impact, while maintaining good tracking of the reference trajectory and avoiding overshoot. In addition, since the task space has three dimensions while the robot has four revolute joints, we can expect undesired movements of the manipulator in the nullspace of the controller. As discussed in Sec. II-A, we solve this problem using a secondary jointspace impedance controller. The parameters of this second controller were not optimized and its reference position was a constant rest configuration. Note that, even though the two controllers have different target behaviors and hence compete, the overall behavior was not affected because the optimization was able to learn the effect of the second controller on the end-effector behavior and compensate for it by commanding a modified zero-force motion to the Cartesian impedance controller, while still preserving its contribution at the joint level. This effect can be observed in Fig. 4-A. Indeed, before contact (black vertical line), one would expect a commanded trajectory (yellow-to-red lines) close to the goal trajectory (blue line), as for typical impedance control. Here, instead, the optimal solution results with a modulation of the commanded trajectory also in case of no external forces, as a compensation for the undesired Cartesian effects of τ J and, potentially, of other sources of steady-state errors (as, for example, inexact gravity compensation).

A. Extension to parallel robots
Since the procedure presented in this work is focused on the end-effector behavior, it is, from a theoretical point of view, straightforward to extend it to more complex structures. In this section, we generalize to multi-arm manipulation. Let us consider two identical robots, the first placed in a configuration resembling a right arm and the second in a left-arm configuration. The choice to use two identical manipulators was only made for convenience and is not a limitation of this work, as the only required condition is that both arms are able move along the optimized trajectory. The task in this case is depicted in Fig. 6.
The problem formalized in Sec. III can be easily extended here with the same computational complexity. Indeed, since the optimization is designed to shape the end-effector behavior, the control of each of the two manipulators can be straightforwardly mapped as a rigid transform of the optimum behavior in the coordinates of the paired manipulators. The final robot behavior is depicted in Fig. 5. In this case, the algorithm was able to find a solution that satisfied all the task requirements. Fig. 5. Snapshots of the optimal bi-manual robot behavior. Visualization was performed using the Robotic Toolbox [22] Fig. 6. Sample task for the dual-arm robot. The robot started from the initial configuration x S and pushed against the spring to reach the goal position x S . The dual-arm robot end-effector was defined as the mean between the single arms end-effectors. This task included a transition from a free movement condition to an environment-interaction condition.

B. Generalization to 3D robots
In the previous sections, we discussed the application of impedance optimization to planar robots. The extension to non-planar manipulators comes with no additional theoretical complexity, and all of the framework developed in Sec. II and III is still valid. Nevertheless, from a computational point of view, the extension to 3D robots introduces nonnegligible challenges, and may require excessive computation time. However, leveraging the compositionality of impedance, our implementation was still able to provide a solution in a reasonable time. We report here a set of experiments carried out on a model of a 14 degree-of-freedom Baxter Robot. In this case, the task was the same as Fig. 6 with two 7 degreeof-freedom arms moving in a 6 degree-of-freedom Cartesian space. The initial guess for the optimization parameters was constant stiffness and damping and a minimum-jerk trajectory between initial and final configuration (the same as the goal trajectory). Weighting coefficients for the cost function were selected as α 1 = 100, α 2 = 10 and α 3 = 3.5. The optimal robot behavior (iteration 71) is reported in Fig. 8, while the resulting impedance profile and reference trajectory are presented in Fig. 7. As in the previous case, the optimization was able to provide a desired behavior at the endeffector while the kinematic redundancy was handled through additional joint-space impedance controllers. As may be seen in Fig. 7, the goal trajectory (blue line) was implemented by the manipulator endpoint with high fidelity, resulting in a maximum average linear deviation from the goal trajectory equal to 6mm (worst case along the X direction, up to a total motion of the manipulator along X equal to 52cm). As in the previous cases, the optimization has i) learned the effect of the joint impedance controller at the endpoint and ii) planned a counter-action to maintain a good trajectory tracking while preserving the effect of τ J at the joint level.

V. DISCUSSION AND CONCLUSIONS
This paper presented an optimization-based algorithm that is able to identify an optimal end-effector impedance profile for a generic task of interaction with an external compliant body. This implementation provides a coordinated, coherent and independent planning of robot dynamic behavior in Cartesian space, in terms of apparent stiffness and damping, together with the end-effector desired trajectory. Although it is common to impose a relationship between stiffness and damping, for instance to get critically damped systems (see [19] among others), in this work we decided to optimise the three parameters independently. In this way, the optimal solution was free to implement under-damped or over-damped conditions in specific portions of the motion, for example to increase velocity.
This task-oriented approach comes with several advantages, since it enables i) a clear interpretation of the results and ii) generalization to different kinematic structures. The results, derived from extensive simulations, with multiple kinematic configurations and considering both single-and multi-robot manipulation, show that this approach is effective and reliable.
We reduced computational complexity through the selection of a limited number of via-points. Controller parameter values at these points were optimized by the planner and then used as knot-points of an interpolating spline. The number of knot-points is one of the task-dependent parameters of the optimization. For example, a larger number may be required to approximate fast movements. For this reason, we are also testing the use of other methods of dimensionality reduction, for example exploiting an incremental enrollment of functional Principal Components extracted from human movements, which has already been profitably exploited in  human-like motion planning [29], [30]. The idea is to identify, from a dataset of human examples, a basis set of functional Principal Components, ordered by decreasing dataset variance explained. A weighted combination of these components can be used to accurately reconstruct an arbitrary time-domain signal. Exploiting the hierarchy imposed by the explained variance -in which some components are 'more important' than others -an algorithm can generate an approximation of the desired time-series with a limited number of functional Principal Components. In this case, the optimization is free to modulate the weighting coefficients of these functions rather their value at each time point. The hierarchy embedded in the definition of functional Principal Components has the additional benefit that a recursive algorithm can search for the solution in a low dimensional space with the option to increase the number of independent functional Principal Components in the optimization (hence augmenting the cardinality of the optimization state) when the problem is not solved with a desired accuracy [31].
Future work will also implement this algorithm on real platforms. Particular effort will be devoted to improvement of the optimization routine in terms of computation time, to enable on-line implementation. The use of Machine Learning techniques is currently under evaluation to infer impedance and trajectory profiles for an unknown case, leveraging on a set of previously solved examples [32], [33].
The implementation presented in this paper does not guarantee convergence to the global minimum. However, the particular selection of the optimization method is a free parameter of the problem. Indeed, the same approach can be implemented with global optimization algorithms, such as Genetic Algorithms, Multi-Search approaches or, leveraging on the particular bi-linear formulation of the cost function, with specific optimization algorithms [34]. However, we anticipate that those approaches may incur a significant expansion of the time required to converge. For this reason, whether it is worth searching for the global optimum, or instead accepting a local solution that can be identified in a reduced amount of time, remains an open question to be addressed in future developments of this work.