Progressive automation of periodic tasks on planar surfaces of unknown pose with hybrid force/position control

This paper presents a teaching by demonstration method for contact tasks with periodic movement on planar surfaces of unknown pose. To learn the motion on the plane, we utilize frequency oscillators with periodic movement primitives and we propose modified adaptation rules along with an extraction method of the task's fundamental frequency that automatically discards near-zero frequency components. Additionally, we utilize an online estimate of the normal vector to the plane, so that the robot is able to quickly adapt to rotated hinged surfaces such as a window or a door. Using the framework of progressive automation for compliance adaptation, the robot transitions seamlessly and bi-directionally between hand guidance and autonomous operation within a few repetitions of the task. While the level of automation increases, a hybrid force/position controller is progressively engaged for the autonomous operation of the robot. Our methodology is verified experimentally on surfaces of different orientation, with the robot being able to adapt to surface orientation perturbations.


I. INTRODUCTION
Kinesthetic teaching is a promising way to easily teach a robot new tasks and reduce its programming time. In such a teaching by demonstration process the operator physically grabs the robot and provides motion and force information. Contact tasks pose an additional challenge since, apart from the motion, the robot also has to learn and regulate an appropriate force on the environment. Hybrid force/position control is a well-known approach to address this problem given knowledge of the constraint frame of the surface [1]. However, the orientation of the surface and the desired normal force might be unknown beforehand.
By focusing on contact tasks of periodic motion, such as wiping a window, we utilize the notion of progressive automation [2]. With progressive automation a human can teach repetitive tasks to a robot with a seamless transition from kinesthetic guidance to autonomous operation. While the operator demonstrates a task a few times, the robot's stiffness gradually increases based on the correspondence between consecutive demonstrations, so that the robot learns and accurately tracks the desired trajectory. Unlike [2], which focused only on motion encoding with manual segmentation of movements, the approach presented in this paper aims at periodic movements that need not be segmented and at learning the unknown desired force profile on a non-fixed planar surface of unknown pose. The presented approach combines the following sub-components: a) frequency oscillators and a novel extraction method of the fundamental task frequency, b) a progressive automation strategy to smoothly adapt robot compliance towards autonomy, c) a periodic motion generation system with variable adaptation rates to learn and reproduce the reference trajectory, d) a force extraction method to learn and reproduce the desired contact force profile using an estimator of the normal vector, and e) a hybrid force/position controller that combines the output from all modules.
A method to determine the frequency and encode the waveform of a periodic movement was proposed in [3], [4], that consists of adaptive frequency oscillators and periodic Dynamic Movement Primitives (DMP) to encode the motion pattern [5]. However, to encode multiple degrees of freedom (DOF) with coupled frequencies, the fundamental frequency of the task needs to be extracted among the learned frequencies in each DOF and be used in a common Canonical System for both learning and reproduction, otherwise drifting may occur. This extraction usually requires treatments such as logical operations [3], which can lead to side-effects like canceling or doubling of frequencies.
For periodic contact tasks, such as the wiping of a surface, the authors in [6] initially learned the periodic movement with DMP and in a second phase they adapted the DMP to apply a certain normal force. In [7] a method was proposed to gradually adapt a wiping task to non-rapid changes of the environment. A passivity based iterative learning was proposed in [8] to gradually modify the anchor point of a periodic DMP with a pre-specified pattern, with respect to external forces due to changes of the environment. Under the same objective, an adaptation mechanism was proposed in [9] to modify the spatial parameters of dynamical systems in periodic tasks of pre-specified patterns. These methods assume specific motion patterns and slow adaptation to changes of the environment, which are considerable limitations in cases when the operator desires to significantly change the task (force profile, constraint frame, pattern, frequency). In specific, a rotation of the surface in [8] alters the executed task because the motion generation system does not consider the change of orientation.
Hybrid force/position control for periodic tasks was implemented in [10], [11], [12], combining adaptive frequency oscillators in a single degree of freedom with periodic DMP learning. The authors implemented a unidirectional progressive transition from learning to autonomous execution, which in [10] was based on the tracking error and in [12] on the fatigue level of the operator measured with EMG sensors. In contrast, a key element of progressive automation that is present in our approach is the seamless, bidirectional and uninterrupted transition between hand guidance and autonomous execution, which, with the proposed setup, allows the robot to distinguish between user intervention and task disturbances that need to be rejected. A common characteristic of learning with hybrid force/position control is the use of predetermined values for the desired force magnitude in [8], [13], [14] and for the force direction in [10], [12], which, however, may be unknown to the robot. Alternative approaches to learn force and position tasks have also been proposed using DMP [15], where the robot can learn and reproduce a force profile instead of a constant desired force. Probabilistic methods have also been proposed for force/position tasks in [16], [17]; these, however, aim at generalization and depend on data variability, which requires the operator to provide sufficient statistical information about the task during demonstration.
In this paper we present a novel approach to progressively automate contact tasks of periodic movement with teaching by demonstration. The proposed approach aims at fast and effective learning of tasks from few demonstrations, where the robot needs to perform a periodic movement on a planar surface of unknown pose while applying a time-varying force profile that is unknown to the robot and is demonstrated by the operator. The contributions of this work are highlighted below:
• We extend the formulation of adaptive frequency oscillators with variable adaptation rules and we propose a method to determine the fundamental frequency of a task by automatically discarding near-zero components, based on the variance of the position signal, thereby avoiding logical operations. The frequency extraction is combined with periodic DMP that automatically stop adapting once the task has been learned.
• We allow the robot to adapt to surface orientation perturbations by utilizing a continuous estimation of the normal vector to the surface based on the robot's tool velocity; hence, the estimate is not affected by the surface friction.
• We integrate progressive automation to quickly learn tasks through demonstration, without requiring re-demonstration under surface orientation perturbations or even under gradual changes of the surface orientation.

A. The proposed system with surface normal estimation
The concept of the proposed system involves an operator who guides the robotic manipulator kinesthetically and demonstrates a task of periodic movement on a surface while applying a force normal to that surface, as illustrated in Fig. 1. The robot is controlled by a hybrid force/position control scheme with gravity compensation and initially has zero control gains (for position and force) to allow kinesthetic guidance. After a few demonstrations the robot transitions automatically and gradually from kinesthetic guidance into autonomous operation, according to the automation level strategy, based on the operator's force F_h and the position error projected on the surface. During the demonstration it is assumed that the robot interacts with the environment at the end-effector and with the operator at the intermediate links, as illustrated in Fig. 1. Since the modification of the automation level is based on the intervention of the human, the operator's force F_h needs to be extracted. For that purpose we utilize the external joint torque estimates [18] projected at the endpoint of the manipulator, F_rob, and the direct measurement of the contact force F_c from a force/torque sensor attached to the wrist of the robot, both expressed with respect to the base frame {B}. By neglecting internal robot dynamics (such as joint friction), which are relatively small compared to the interaction forces, the operator's force can then be estimated as F_h = F_rob − F_c. Alternatively, the operator's force could be measured directly without F_rob, assuming interaction through a sensorized handle.
The desired force F_d and the trajectory p_d ∈ R^3 of the robot are learned incrementally during the demonstration without prior knowledge and they are simultaneously provided as reference to the hybrid controller. With this approach there is no distinction between a learning and a reproduction phase, but a gradual transition by increasing the control gains, while the reference trajectory p_d approximates the demonstrated trajectory p ∈ R^3 and the operator does not apply significant forces to the robot. The application of high forces reduces the control gains and re-enables kinesthetic guidance to allow a new demonstration of the task.
The identification and the tracking of the task frame {T} is necessary for the hybrid force/position controller in order to determine the force- and the position-controlled coordinates with respect to the base frame {B} of the robot. To determine the task frame, we estimate the normal unit vector n_c ∈ R^3 to the surface using the kinematic adaptive control law for ṅ_c proposed in [19], in which γ_n, β_n are constant gains. An initial estimate n_c(0) = −F_c/||F_c|| is provided by the measured contact force vector when contact is detected. This estimator guarantees exponential convergence to zero angle error for a stationary planar surface, given persistent excitation of ṗ, which holds in the case of periodic movements. By aligning the (arbitrarily selected) z-axis of the task frame {T} with the direction of the desired force F_d, which is expressed with respect to the base frame, we define a rotation matrix R = ^T R_B from the base frame to the task frame. We can then utilize the diagonal binary selection matrix S ∈ R^{6×6}, selecting S_{3,3} = 0 to activate force control along that z-axis and S_{i,i} = 1 to activate position control along the rest of the axes (i = 1, 2, 4, 5, 6). The adaptive law is updated even when the robot operates autonomously in order to detect an unknown rotation of the contact surface. Assuming that R_0 is the orientation of the surface during teaching, if the surface is rotated around an axis passing through the origin point p_t (Fig. 1), then the relative orientation R_0^T R is used to rotate the produced reference trajectory and desired force to align with the surface's orientation.
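The role of the task frame and the selection matrix can be illustrated with a minimal sketch. It assumes only the translational 3×3 part of S (diag(1, 1, 0)) and builds R from an estimated normal n_c by choosing the in-plane axes arbitrarily; the function names (`task_frame_rotation`, `project_on_surface`) and the helper-axis choice are illustrative assumptions, not part of the described system.

```python
import math

def normalize(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def task_frame_rotation(n_c):
    """Rotation R (base -> task) whose rows are the task axes in base coordinates.

    The z-axis is the estimated normal; x and y are chosen arbitrarily in the plane
    (assumption: any right-handed completion is acceptable).
    """
    z = normalize(n_c)
    helper = [1.0, 0.0, 0.0] if abs(z[0]) < 0.9 else [0.0, 1.0, 0.0]
    x = normalize(cross(helper, z))
    y = cross(z, x)
    return [x, y, z]

def project_on_surface(n_c, error):
    """Compute R^T S R error with S = diag(1, 1, 0): keeps only the in-plane part."""
    R = task_frame_rotation(n_c)
    e_task = [sum(R[i][j] * error[j] for j in range(3)) for i in range(3)]
    e_sel = [e_task[0], e_task[1], 0.0]  # S zeroes the force-controlled z-axis
    return [sum(R[i][j] * e_sel[i] for i in range(3)) for j in range(3)]
```

For a horizontal surface with n_c = (0, 0, 1), an error of (1, 2, 3) is reduced to its in-plane part (1, 2, 0); for any other normal, the result is orthogonal to n_c.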
The objective of the system involves the operator demonstrating the task as many times as required until the system has learned the fundamental frequency Ω, the desired trajectory p_d, the desired force profile F_d and the normal vector n_c. During demonstration, the position is encoded by periodic DMP with incremental regression learning in each coordinate i of the base frame, and an adaptive frequency oscillator determines the basic frequency ω_i of the input signal p_i. The fundamental frequency Ω ∈ R of the task is then determined by discarding near-zero frequencies and is used as a common clock to achieve synchronization of the produced trajectory p_d generated by the periodic DMP.
During the initial demonstrations, however, the current estimate of the constraint frame might be incompatible with the actual one, which would result in incorrect behavior of the hybrid controller. For that reason, the control gains for position and force are initially zero and the robot gradually becomes autonomous depending on the automation level, which is determined by the variable κ ∈ [0, 1]. We utilize κ (Fig. 2) as a weight in the control gains for shared control as well as in the adaptation rules, to suspend any further adaptation of the DMP when the robot has learned the task. In the following subsections we present each module of the proposed system in detail.

B. Automation level strategy
The automation level κ transitions the behavior of the robot from pure gravity compensation to accurately following the DMP trajectory and regulating the desired force. The rate of change κ_r of the automation level κ depends on the interaction force of the operator F_h, on the tracking error p̃ = p_d − p transformed onto the xy plane of {T} using R^T S R p̃, and on the current value of κ(t). The design parameter f_min is a positive constant that induces a gain increase when κ = 0, and f_r is a scaling term. The robot gradually becomes autonomous while the level κ increases. The transition rate κ_r depends on the current value of κ, so that the rate is initially slow, requiring the operator to demonstrate the task for a few periods until the constraint frame, force, frequency and waveform are learned.
With the increase of κ, the rate increases as well. When the robot moves autonomously (κ = 1), a high interaction force F_h in any direction reverts the robot to gravity compensation mode to allow modifications. Small tracking errors and incorrect estimates of F_h can be compensated by appropriately setting the thresholds λ_1, λ_2, which have physical units and correspond to the impact of the tracking error and of the interaction force, respectively, on the rate of change. In addition, the third power of the force norm is a design choice that strengthens the effect of high interaction forces. A gradual transition in the level of automation κ provides haptic feedback to operators, so that they can smoothly feel the difference in the robot behavior while it is learning the task and avoid abrupt behavior during task modifications. The proposed strategy can easily scale to different types of tasks regardless of their complexity, because it depends strongly on the interaction force F_h: the automation level should not be increased as long as the operator makes adjustments and corrections through F_h.
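The qualitative behavior described in this subsection can be sketched as follows. The exact rate law is not reproduced here; the form κ_r = f_r (κ + f_min)(1 − ||p̃_T||/λ_1 − ||F_h||^3/λ_2) used below is an assumption chosen only to match the stated properties (slow start at κ = 0, faster growth with larger κ, decrease under large errors or forces), and the saturation of κ in [0, 1] by clamping is likewise an assumption. Default values are taken from the experimental section.

```python
def kappa_rate(kappa, err_norm, force_norm,
               f_r=1.0, f_min=0.01, lam1=0.02, lam2=20.0 ** 3):
    """Assumed rate of change of the automation level (hypothetical form).

    err_norm: norm of the in-plane tracking error [m]
    force_norm: norm of the operator force F_h [N]
    """
    return f_r * (kappa + f_min) * (1.0 - err_norm / lam1 - force_norm ** 3 / lam2)

def step_kappa(kappa, err_norm, force_norm, dt):
    """One explicit Euler step, keeping kappa inside [0, 1] (assumed saturation)."""
    kappa_next = kappa + dt * kappa_rate(kappa, err_norm, force_norm)
    return min(1.0, max(0.0, kappa_next))
```

With these assumed values, a 30 N intervention gives ||F_h||^3/λ_2 = 27000/8000 ≈ 3.4, so the rate becomes strongly negative even with zero tracking error, while an undisturbed robot at κ = 0 starts rising slowly at the rate f_r·f_min.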

C. Modified adaptive frequency oscillators with automatic fundamental frequency extraction
The two objectives of progressive automation are the ability of the robot to continue executing the demonstrated task autonomously after it has been encoded sufficiently and the ability of the operator to intervene during the autonomous execution for spatial or temporal modification. Regarding the temporal modification, the frequency adaptation needs to stop and restart accordingly.
To learn the basic frequency in each axis of the demonstrated movement, we utilize adaptive frequency oscillators [4], with adaptation rules modified according to the automation level κ. The proposed adaptation is weighted by (1 − κ) to smoothly stop the learning when the robot is in the autonomous mode (κ = 1) and to smoothly re-enable it when the operator intervenes. In that way a continuous and bidirectional transition is allowed. In the oscillator dynamics for φ̇ and ω̇, E = diag(p − p̂) is a diagonal matrix with the error between the position input p and the estimate p̂ ∈ R^3, ω ∈ R^3 is the vector of basic frequencies (with ω_i ≥ 0), φ ∈ R^3 is the vector of the corresponding phases and a ∈ R is a coupling constant. The vector of estimates p̂ = [p̂_1, ..., p̂_3]^T is given by p̂_i = Σ_{c=0}^{M} (α_{i,c} cos(cφ_i) + β_{i,c} sin(cφ_i)), where the parameter M is the number of Fourier components. The amplitudes α_{i,c}, β_{i,c} are updated by rules for α̇_{i,c} and β̇_{i,c}, in which η is the learning constant and the error e_i is the i-th diagonal element of E. With the proposed modification, the adaptation rates in (5), (6), (7), (8) are reduced while the automation level increases (κ → 1) and the adaptation stops completely when the robot is fully autonomous (κ = 1). The application of high correction forces by the operator causes the automation level to drop and the adaptation of the oscillators to be re-enabled.
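A single-axis sketch of such an oscillator is given below. It uses the standard adaptive frequency oscillator with a Fourier-series estimate from the literature and assumes that the factor (1 − κ) multiplies every adaptation term; the exact placement of that factor, the explicit Euler integration and the function name `afo_step` are assumptions made for illustration.

```python
import math

def afo_step(state, p, kappa, dt, a=50.0, eta=1.0):
    """One Euler step of a kappa-weighted adaptive frequency oscillator (one axis).

    state = (phi, omega, alpha, beta); alpha and beta hold M + 1 Fourier amplitudes.
    p is the measured position, kappa the automation level in [0, 1].
    """
    phi, omega, alpha, beta = state
    # Fourier-series estimate of the input and its error
    p_hat = sum(alpha[c] * math.cos(c * phi) + beta[c] * math.sin(c * phi)
                for c in range(len(alpha)))
    e = p - p_hat
    gain = 1.0 - kappa  # assumed weighting: adaptation fades out as kappa -> 1
    coupling = gain * a * e * math.sin(phi)
    new_phi = phi + dt * (omega - coupling)
    new_omega = max(0.0, omega - dt * coupling)  # keep omega_i >= 0
    new_alpha = [alpha[c] + dt * gain * eta * math.cos(c * phi) * e
                 for c in range(len(alpha))]
    new_beta = [beta[c] + dt * gain * eta * math.sin(c * phi) * e
                for c in range(len(beta))]
    return (new_phi, new_omega, new_alpha, new_beta)
```

With κ = 1 the step reduces to pure phase integration (φ advances by ω·dt and nothing else changes), which is the behavior required for autonomous execution; with κ = 0 the frequency and amplitudes are driven by the estimation error.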
For encoding a complex periodic movement, using M = 1 the system will learn the strongest frequency component in each axis. The fundamental frequency Ω ∈ R of the task can then be extracted as the minimum non-zero frequency among the components of ω. Because a single fundamental frequency is used, no drifting problems occur. To automatically discard near-zero frequency components, which can result e.g. from zero velocity of the demonstration along a principal axis, we propose a formula that utilizes the variance p_v = VAR(p) of the demonstration over a window of N_v measurements. The fundamental frequency of the task is then calculated from ω and the weights p̂_i, where 0 < p̂_i < 1 is a logistic sigmoid function p̂_i(p_v^i) = 1/(1 + e^{−a_s(p_v^i − p_0)}), with a_s being the steepness of the curve. With this method, if the position variance p_v^i is smaller than the threshold p_0, then p̂_i is close to zero and, as a result, the corresponding frequency ω_i is no longer considered as the minimum. Conversely, the further the variance p_v^i exceeds the threshold p_0, the closer p̂_i gets to 1. Finally, the term max{p̂_i} is used to handle the case when all components of p_v are below the threshold. Notice that the extraction of the fundamental frequency is decoupled from the estimation of the normal force vector.
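The variance gating can be sketched as below. The sigmoid weight follows the expression given in the text; the selection step, however, is a simplified hard-threshold analogue (keep the axes with weight above 0.5 and take the minimum of their frequencies), not the smooth formula of the paper, and returning 0 when no axis is active is an assumption. The steepness value is illustrative and chosen for position variances expressed in m².

```python
import math
from statistics import pvariance

def variance_weight(p_var, a_s, p0):
    """Logistic sigmoid weight: near 0 below the threshold p0, near 1 above it."""
    return 1.0 / (1.0 + math.exp(-a_s * (p_var - p0)))

def fundamental_frequency(omega, windows, a_s=1e5, p0=1e-4):
    """Simplified extraction: minimum frequency among axes that actually move.

    omega: learned basic frequency per axis
    windows: per-axis lists with the last N_v position samples
    """
    weights = [variance_weight(pvariance(w), a_s, p0) for w in windows]
    active = [w_i for w_i, g in zip(omega, weights) if g > 0.5]
    return min(active) if active else 0.0  # assumed fallback when nothing moves
```

For a planar motion with ω = (3, 6, 0.01) rad/s and no movement along the third axis, the spurious 0.01 rad/s component is ignored and the extracted value is 3 rad/s.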

D. Recursive learning of periodic DMP and force profile
To learn the periodic movement online from demonstration we use the periodic DMP formulation. In parallel with learning the fundamental frequency, the DMP can encode the waveform using that frequency and at the same time produce the reference trajectory p_d for the position controller. The output of the DMP p_d, which is expressed with respect to the base frame {B}, is rotated according to the relative orientation of the surface using p_d(t) = R_0^T R (p_d(t) − p_t) + p_t. The produced trajectory is specified by an attractor landscape around an anchor point g ∈ R^3 and is governed by the phase Φ ∈ R, which is the common canonical system among the coordinates, given by Φ̇ = Ω, with Φ(0) = 0.
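The rotation of the reference about the pivot point can be written out directly from the expression p_d(t) = R_0^T R (p_d(t) − p_t) + p_t. The sketch below is a plain transcription with rotation matrices stored as nested lists; `rot_z` and `rotate_reference` are illustrative names, and the rotation about the base z-axis is used only as an example.

```python
import math

def rot_z(angle):
    """Rotation matrix about the z-axis (used here only as an example input)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transpose(M):
    return [[M[j][i] for j in range(3)] for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def rotate_reference(p_d, p_t, R0, R):
    """Apply R0^T R to the DMP output about the pivot p_t."""
    R_rel = matmul(transpose(R0), R)
    offset = [a - b for a, b in zip(p_d, p_t)]
    return [r + t for r, t in zip(matvec(R_rel, offset), p_t)]
```

The transformation leaves the pivot p_t fixed and preserves the distance of every reference point from it, so the learned pattern keeps its shape on the rotated surface.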
The reference trajectory is produced by the DMP dynamics, where a_y, β_y are constants. Parameter N is the number of basis functions Ψ = exp(h(cos(Φ − c) − 1)), where h, c are the widths and centers over a period. The weights in w ∈ R^{3×N} are updated online with Recursive Least Squares (RLS). The adaptation law for each weight vector w_j ∈ R^3 of the basis function Ψ_j uses P_j ∈ R^3, the vector of inverse covariances associated with the weights w_j, with a forgetting factor λ and elements P_{j,i} (i = 1, 2, 3). To smoothly stop the learning when the robot is in autonomous mode and to smoothly re-enable it when the operator intervenes, we propose a modification of the RLS fitting error e_r that introduces the factor (1 − κ) in the error, where f_s is the target trajectory shape. When the automation level is κ = 1, the error e_r is zero and the adaptation of the DMP weights stops. Notice that the DMP formulation can also be used in the case when the orientation of the robot needs to be encoded as well. Moreover, the desired force F_d needs to be learned during the demonstration, depending on how much force the operator applies to the robot. The direct measurement of −F_c is used for this purpose and F_d is encoded and reproduced with radial basis functions, where w_f ∈ R^{3×N} are the weights updated online with RLS according to the fitting error e_f.
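The effect of the factor (1 − κ) on the weight adaptation can be sketched for a single coordinate. The update below is the standard recursive least squares rule used with periodic DMP in the literature, with the fitting error scaled by (1 − κ); the amplitude factor r = 1, the guard against vanishing basis activations and the function names are assumptions made for this illustration.

```python
import math

def basis(phase, centers, h):
    """Periodic basis functions Psi_j = exp(h (cos(phase - c_j) - 1))."""
    return [math.exp(h * (math.cos(phase - c) - 1.0)) for c in centers]

def rls_update(w, P, psi, f_target, kappa, lam=0.999, r=1.0):
    """One kappa-weighted RLS step for the weights of one coordinate.

    w, P, psi: lists of length N (weights, inverse covariances, activations)
    f_target: target shape value at the current phase
    """
    new_w, new_P = [], []
    for w_j, P_j, psi_j in zip(w, P, psi):
        if psi_j < 1e-12:  # basis inactive: leave this weight untouched
            new_w.append(w_j)
            new_P.append(P_j)
            continue
        P_next = (P_j - (P_j ** 2 * r ** 2) / (lam / psi_j + P_j * r ** 2)) / lam
        e_r = (1.0 - kappa) * (f_target - w_j * r)  # error vanishes at kappa = 1
        new_w.append(w_j + psi_j * P_next * r * e_r)
        new_P.append(P_next)
    return new_w, new_P
```

The same update structure could be reused for the force weights w_f with the fitting error e_f, under the same assumptions.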

E. Hybrid force/position control for progressive automation
The proposed method is implemented on a gravity compensated n-DOF manipulator with a control law in which τ_p, τ_f ∈ R^n are the joint torques that implement the desired Cartesian position and force, respectively. Let x̃ = [p̃^T e^T]^T ∈ R^6 be the generalized tracking error, where Q̃ = Q_d * Q^{−1} is the quaternion error between the desired orientation Q_d and the current orientation Q of the tool, with Q̃ = [η e^T]^T. The velocity error is defined accordingly. Within this paper, we set the desired orientation Q_d of the tool (+Z) to align with the estimated vector n_c (−Z), so that the tool always remains normal to the surface during autonomous operation.
The position and force controllers are then defined in (19) and (20), where J is the manipulator's Jacobian matrix and U is a block transformation matrix. This transformation matrix is used to apply the Cartesian position controller (19) in a way that is compatible with the task plane orientation. Similarly, the transformation I − U in (20) enables force control only along the z-axis of frame {T}, utilizing a feed-forward and a proportional-integral term on the force error with gains K_P, K_I ∈ R^{6×6}. Notice that the force controller also depends on the level of automation κ. The matrices K_d, D_d ∈ R^{6×6} are the variable PD gains that correspond to a spring-damper behavior of the end-effector. The variable stiffness matrix K_d is selected as a function of κ, where k_T, k_R ∈ R_{>0} are the maximum desired translational and torsional stiffness for autonomous operation. The damping matrix D_d is then chosen accordingly for critically damped behavior. Since K_d is a variable matrix, the system can lose its passivity property [20]. This problem can be overcome by introducing the varying stiffness via an energy tank, as proposed in previous works [21], [22], so that the system remains passive and stable under the changes of κ.

III. EXPERIMENTAL EVALUATION
The effectiveness of the proposed method is verified in two experiments using a 7-DOF KUKA LWR4+ robot, with an operator demonstrating contact tasks with simple and more complex periodic movements on a surface that can be rotated, as shown in Fig. 3.
The parameters used for the automation level strategy are f_r = 1, f_min = 0.01, λ_1 = 0.02 m and λ_2 = (20 N)^3 (more details on tuning can be found in [2]). For the Cartesian position/orientation stiffness we use k_T = 2500 N/m and k_R = 100 Nm/rad, and for the force control gains K_P = K_I = 0.1 I_{6×6}. For the DMP we use a_y = 20, β_y = 5 with forgetting factor λ = 0.999 and N = 30 basis functions, while for the frequency oscillators we use a = 50, η = 1 and M = 1.
In the normal vector estimator the parameters are selected as γ_n = 60, β_n = 1. In the frequency extraction we use a steepness of a_s = 10, a window of N_v = 2000 and a threshold p_0 = 10^{−4}.
In the first experiment, the operator initially demonstrates a circular wiping task on a horizontal surface until the robot has learned it. The surface has been oriented so that its normal aligns with a principal axis of the base frame, in order to test the extraction of the fundamental frequency of the task by discarding near-zero frequency components. While this task is executed autonomously, the operator modifies it by simply grabbing the robot again and demonstrating a more complex one (a "figure-8" pattern). The operator then pauses the execution of the task and restarts it after rotating the surface by 40 degrees. The objectives of this experiment are to determine how fast the robot can learn the overall task, how effectively the operator can make task modifications, and to assess the identification of the surface normal vector under an orientation perturbation without needing to re-demonstrate the task.
The results of the first experiment are illustrated in Fig. 4. In particular, the learned periodic DMP trajectory p_d is overlaid with the robot's position p in Fig. 4a and the basic frequencies ω in each axis are overlaid with the extracted fundamental frequency Ω in Fig. 4b. Additionally, we show the level of automation κ in Fig. 4c, the estimated normal vector n_c with the actual one in Fig. 4d, and finally, the estimated force norm of the operator F_h in Fig. 4e. Initially, the operator grabs the robot, gets in contact with the surface and demonstrates a cyclic movement while applying a normal force to the surface. The normal vector n_c is determined very quickly because the force vector is used as an initial estimate when contact is detected. After 4 periods of demonstration (around t = 20 s), the fundamental frequency has been extracted and the DMP has learned the waveform. The low tracking error causes the increase of the automation level κ, which reaches its maximum level at t = 22 s. Then the operator stops interacting with the robot and the latter continues to execute the learned contact task autonomously. Notice that during the autonomous operation (22 s < t < 27 s) the estimated operator's force F_h is almost zero (Fig. 4e); the non-zero values appear because of unmodeled robot dynamics in F_rob (e.g. joint friction). At t = 27 s the operator grabs the robot aiming to modify the task and the high interaction force (spike in Fig. 4e) causes the automation level κ to drop quickly, allowing the operator to demonstrate another task. The operator then demonstrates a more complex "figure-8" pattern and the robot learns to execute it within 2-3 periods.
Fig. 4. Experimental results of the proposed method with the operator demonstrating contact tasks with different periodic patterns on a surface that is abruptly rotated without requiring re-demonstration.
In both patterns, the fundamental frequency extraction method successfully discards the zero frequency in the Z axis of the horizontal plane (Fig. 4b), since there is no movement in that axis (Fig. 4a).
The task is manually paused by the operator at t = 56 s and the surface is rotated by 40 degrees around the given point p_t. The pause function is implemented so that the current automation level and all learned information (waveform, force profile, frequency) are not forgotten. After contact with the rotated surface is detected at t = 83 s, the task is automatically restarted, with the estimator quickly detecting the correct normal vector and the learned "figure-8" pattern continuing to be executed without requiring re-demonstration.
In the second experiment, the surface is rotated gradually while the robot is executing a circular wiping task on it, in order to verify the ability of the method to continuously estimate the time-varying normal vector and perform the task. Fig. 5 presents the rotated produced trajectory p_d, the evolution of the normal vector n_c, the desired and the measured forces F_d, −F_c, and the rotation of the task. It can be seen that the normal vector to the rotated surface is quickly estimated, producing the appropriately rotated desired trajectory and force profile that are compatible with the rotating task frame.

IV. CONCLUSIONS
In this work we proposed the progressive automation of contact tasks with periodic movements on planar surfaces of unknown pose using teaching by demonstration. To detect the desired contact force direction we utilize an estimator of the surface normal, from which we determine the constraint frame and progressively implement hybrid force/position control. Our approach utilizes an automation level strategy and extends adaptive frequency oscillators with customized rules. We also proposed a method to extract the fundamental frequency of the task by automatically discarding near-zero components, and then encode the waveform with periodic DMP and the desired force profile with basis functions.
The proposed method was experimentally verified in two tasks with surface orientation perturbations. The results showed that the constraint frames were identified quickly and that the fundamental frequency of the tasks was extracted correctly. The robot was able to learn the demonstrated task within a few seconds. Although the learning stops during the autonomous operation, the robot is able to adapt to changes in the orientation of the surface by autonomously reorienting the constraint frame. A limitation of the normal vector estimator is that it cannot detect a rotation of the surface around the axis that is normal to it. For such a rotation additional sensor modalities are required. Another limitation is premature learning, which occurs when the operator lets go of the robot before it has fully transitioned to autonomous mode. In this case the robot accuracy might be compromised. This effect can be mitigated by using a visual indicator to inform the human when the robot has reached its maximum stiffness. Future work includes the automatic estimation of the origin of the rotation frame, which is considered known in this paper.