The KIT Gripper: A Multi-Functional Gripper for Disassembly Tasks

We introduce a multi-functional robotic gripper equipped with a set of actions required for disassembly of electromechanical devices. The gripper consists of a robot arm with 5 degrees of freedom (DoF) for manipulation and a jaw gripper with a 1-DoF rotation joint and a 1-DoF closing joint. The system enables manipulation in 7 DoF and offers the ability to reposition objects in hand and to perform tasks that usually require bimanual systems. The sensor system of the gripper includes relative and absolute joint encoders, force and pressure sensors that provide feedback about interaction forces, and a tool-mounted camera for screw detection and precise placement of the tool tip using image-based visual servoing. We present a data-driven method for estimating joint torques based on the output voltage and motor speed. Further, we provide methods for teaching disassembly actions based on human demonstration, their representation as movement primitives and their execution based on sensory feedback. We provide quantitative results regarding positioning and torque estimation accuracy and disassembly success rate, as well as qualitative results regarding the successful disassembly of hard disc drives.


I. INTRODUCTION
Recycling of raw materials from electronic devices at the end of their lifetime is crucial for the sustainable supply of materials to our technical world, and in recent years sustainability has become an important objective of the EU and other organizations [1]. However, recycling processes for these products are often based on a destructive separation of parts and components [2], which results in material mixtures. This aggravates the problem of separating valuable and hazardous compounds and can prevent the recovery of material in its original quality. Manual dismantling is cost-intensive and might bring workers into contact with hazardous substances in electronic waste. From a robotics point of view, dismantling is a challenging task because a robot system has to be able to handle unknown object structures and perform complex manipulation actions.
In contrast to assembly in manufacturing, where industrial robots are already being used intensively, the dismantling of disposed devices is rarely performed by robots. The disassembly process confronts robots with a large variety of product models from different manufacturers, which in addition might be damaged or in unclean condition. This requires a number of complex manipulation actions, precise execution, the usage of multiple tools, and the ability to adapt to new devices and to recover from failures. To increase flexibility concerning the type and order of disassembly steps, we propose a multi-functional robotic gripper design that is able to execute a wide range of disassembly actions, instead of a disassembly line with multiple single robots with specialized end-effectors.
In this paper, we present the realization of the first physical prototype of our KIT multi-functional gripper (Fig. 1) as well as methods for kinesthetic teaching and autonomous disassembly. The gripper is based on our previously presented concept of the KIT Swiss knife gripper [3], with optimizations of the kinematics, sensors and embedded systems. It has been designed for dismantling operations that normally require bimanual manipulation but are here performed with a single arm and exchangeable tools.

II. RELATED WORK
Various approaches have been proposed for increased automation in recycling. Some works present specialized robots for certain tasks like automated disassembly of snap-fit covers [4] or unscrewing [5]. In [6], the unscrewing during disassembly for the recycling of electric vehicle batteries is performed by a robot in a workspace shared with a worker. In [7], a system composed of the small industrial robot manipulator Mitsubishi RV-M1 [8], a camera and a range sensor is used for partial disassembly of desktop computers. The manipulator has 5 DoF and a jaw gripper and can carry a payload of 1.2 kg.
With Apple Inc. trying to reduce its ecological footprint [9], several disassembly robots [10]-[13] for the iPhone or its subcomponents were developed. Other works use full-size industrial robots like KUKA KR 240-2 [14] or ABB IRB 140 [15] combined with additional tools for multiple disassembly steps with one robot. Biwidi et al. use a system with an industrial robot arm, a force/torque sensor, a disassembly tool and a workspace for automatic tool change for dismantling of electric vehicle motors [14]. Chen et al. describe a multi-head tool for robot-based disassembly with a drill, grinder, and screwdriver mounted to a standard robot arm and combined with an external flip table [15]. Different approaches to disassembly are presented by Poschmann et al. [16]. The authors state that "disassembly has not been in the focus of mainstream robotics research so far" and some of the works in this survey also prefer a destructive approach. The partially destructive disassembly of LCD screens in [17] is performed by an industrial robot arm with a circular saw attached.
While the works mentioned above either rely on a multi-stage disassembly line [10]-[12], focus on partial disassembly [4]-[7] or use large industrial robots [14], [15], our work aims at providing an integrated and compact robot capable of all actions required for a complete disassembly process of electromechanical devices such as HDDs. This provides maximum flexibility and has the potential to address large-scale autonomous disassembly of non-homogeneous device types.
Other multi-functional grippers that target a general action set without special reference to recycling can also be found in the literature, e.g., [18]-[21]. To the best of our knowledge, no other existing robot can solve the complete disassembly task including tool change and in-hand manipulation with one multi-functional and low-cost robot.

III. MECHATRONIC DESIGN
In this section, we present the mechanical design together with electronics, sensors and torque estimation to perform complex disassembly tasks in the context of recycling.

A. Kinematics and Mechanical Design
In our previous work on the KIT Swiss knife gripper [3], we described the concept of a multi-functional gripper for bimanual manipulation with a single arm and discussed its advantages for disassembly tasks over a conventional approach with a disassembly line. Several experiments with a first prototype of the gripper showed, however, that the 4-DoF SCARA arm kinematics proposed in our previous work is not sufficient for the execution of all disassembly actions of an HDD. In particular, an additional DoF was needed for actions like levering in a vertical median plane, e.g., to remove the magnets in the HDD. Thus, the kinematics of the gripper was extended, resulting in a robot arm with 5 rotational DoF. Together with the jaw gripper (1 DoF rotation and 1 DoF closing), 7 DoF are provided for disassembly actions.
The kinematic chain of the gripper is a tree structure with two sub-chains starting from the base segment, which includes a magazine for exchangeable tools (Fig. 2, F) and provides a mechanical interface to connect either to a stationary mounting or to an additional robot arm, which is not investigated further in this paper. The first sub-chain consists of the gripper arm with 5 rotational DoF, starting with a yaw joint followed by 3 pitch joints and a tool rotation joint. The second sub-chain consists of the jaw gripper rotation and closing joints. The actuation of the jaw gripper is realized by a globoid worm drive and levers. The arm yaw rotation is actuated by a DC motor [22] in combination with a harmonic drive (CSF-8-30-1U-CC-F1 100:1), see 1 in Fig. 2. All other joints (2, 3, 5, 6, 7, 8) are actuated by DC motors [22] and planetary gears (231:1 and 243:1). Only the rotation of the jaw gripper is realized off-axis in combination with a timing belt. Flat spiral springs are connected in parallel to joints (2, 3, 5) to counteract torques induced by gravity and to mitigate the effects of backlash in the planetary gears.
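The arm sub-chain described above (one yaw joint, three pitch joints, then the tool rotation) can be illustrated with a simplified forward-kinematics sketch. It assumes that the three pitch axes are parallel and that the links are straight. The link lengths are hypothetical placeholders, not the dimensions of the KIT gripper, and the tool rotation joint is omitted because it does not move the tool tip in this simplified model.

```python
import math

# Hypothetical link lengths in metres (NOT taken from the paper).
LINK_LENGTHS = (0.10, 0.10, 0.05)


def tool_tip_position(yaw, pitches, link_lengths=LINK_LENGTHS):
    """Return (x, y, z) of the tool tip for a yaw joint followed by pitch joints.

    Assumes parallel pitch axes and straight links; all angles are in radians.
    """
    radial = 0.0            # horizontal distance from the yaw axis
    height = 0.0            # height above the first pitch joint
    cumulative_pitch = 0.0  # pitch angles add up along the chain
    for pitch, length in zip(pitches, link_lengths):
        cumulative_pitch += pitch
        radial += length * math.cos(cumulative_pitch)
        height += length * math.sin(cumulative_pitch)
    # The yaw joint rotates the vertical arm plane about the z axis.
    return (radial * math.cos(yaw), radial * math.sin(yaw), height)


# Fully stretched arm pointing along x.
print(tool_tip_position(0.0, (0.0, 0.0, 0.0)))
```

A real model of the gripper would additionally need the joint offsets and the tool geometry, which are not given in the text.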
The lever mechanism for the closing of the gripper jaws is actuated by a laser-sintered (IGUS iglidur I6-PL) globoid worm drive with two worm wheels of custom design. With this construction, the jaw gripper offers a grasping force of more than 100 N with a subassembly weight of 760 g. Specifications of the gripper are summarized in Tab. I.
To provide the best accuracy, we aimed at maximizing the stiffness of the mechanical structure while keeping the design cost low. Therefore, we use a sandwich design consisting of laser-cut sheet aluminum with a selective laser sintered (SLS) polyamide structure in between. High stiffness of the base segment is achieved using SLS polyamide with an internal support structure optimized using numerical simulations.
The tool head includes the axially rotatable suction cup, which can also function as a tool adapter, as well as a miniature camera for a macro view of the tool and LED lighting. A vacuum pump (BOXER 20K, 4 in Fig. 2) in combination with two miniature 3/2-way solenoid valves provides pressure and vacuum for the suction cup. A rotary air transmission (C) between the gearbox and the tool holder (D) feeds the air to the rotating suction cup.

B. Electronics and Sensors
The electronics of the gripper follows a modular concept with multiple hardware units interconnected by a real-time data and power bus. Each robot segment includes a local control unit built from a common controller/communication base PCB, which is shared across all four segments of the robot, and a segment-specific add-on board for sensor and actuator interfacing, as shown in Fig. 3. This hardware architecture reduces cable routing across moving joints, while single components of the system can easily be modified to follow changes in the sensor and actuator design during robot development. The controller units are interconnected by an 8-pin ribbon cable that carries the data bus lines and the 48 V supply voltage (100 W). The gripper is equipped with several sensors needed for feedback and control. All motors include incremental encoders that allow precise velocity and position control. 24-bit absolute encoders (RLS d.o.o., [23]) in five joints (1, 2, 3, 5, 8) provide more than sufficient resolution. The pressure in the suction system is sensed by a ±15 psi pressure sensor (III, Honeywell [24]). Two strain-gauge-based force sensors (VI, ME-Meßsysteme GmbH [25]) measure the axial force exerted by the tool. Additional torque information for all motors, without the need for expensive torque sensors, is provided by torque estimation, described in the following section.
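To put the encoder resolution into perspective, the angular step of an ideal N-bit absolute encoder is 360°/2^N, which for 24 bits is roughly 2.1e-5 degrees per count. The short calculation below is only an illustrative check; it ignores noise and non-linearity of the real sensor, and the function name is chosen for this example.

```python
def encoder_step_deg(bits):
    """Angular step in degrees of an ideal absolute encoder with `bits` bits."""
    return 360.0 / (2 ** bits)


step = encoder_step_deg(24)
print(f"{step:.2e} deg per count")
```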

C. Torque Estimation
Direct torque sensing using a dedicated sensor provides the highest accuracy; however, it comes with the disadvantages of a high demand for space in the design, additional weight and significant cost. Thus, we developed a method for estimating the torque of the motors based on the known relation between motor current and torque. Since an exact measurement of the current in a motor with pulse width modulation (PWM) control is difficult, we estimate the torque from the motor speed $n$ obtained from the incremental encoder and the PWM motor voltage $U$. Using a linear model, the torque $\tau_{\mathrm{motor}}$ of a DC motor can be approximately described using the two motor-specific constants $a_{\mathrm{motor}}$ and $b_{\mathrm{motor}}$:
$$\tau_{\mathrm{motor}} = \frac{a_{\mathrm{motor}} \, U - n}{b_{\mathrm{motor}}}$$
where $a_{\mathrm{motor}}$ is referred to as the speed constant and $b_{\mathrm{motor}}$ as the speed-torque gradient. These parameters are provided by the manufacturer or can be obtained experimentally. However, this linear model neglects non-linear terms including friction in the gearbox and bearings as well as the non-linearity of the motor drivers, which reduces the accuracy of the model. In [26], the torque was estimated using a detailed motor model that is parametrized using manual system identification. In our work, we follow a data-driven approach for learning a model that allows torque estimation. To this end, we apply neural networks to predict such additional terms. We collected data on a test stand (Fig. 4) and, in addition, on the jaw closing joint and the tool rotation joint directly on the robots. We achieved the best results using a neural network with 4 hidden layers including 32-8 nodes per layer. Based on this, a pseudo zero-torque control was implemented, allowing manual guiding of the gripper arm and jaw rotation joint for programming manipulation actions by kinesthetic teaching (see subsection IV-B). In section V, we present evaluation results of the torque estimation for the jaw closing joint and tool rotation joint.
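The two-stage idea, a linear motor model plus a learned correction, can be sketched as follows. The sketch assumes the common linear form in which the no-load speed is a_motor * U and the torque is (a_motor * U - n) / b_motor. The residual model is a hypothetical callable standing in for the trained neural network, and all numeric values are illustrative rather than parameters of the motors used here.

```python
def linear_motor_torque(voltage, speed, a_motor, b_motor):
    """Linear DC-motor model: torque from PWM voltage U and measured speed n.

    a_motor: speed constant (e.g. rpm/V)
    b_motor: speed-torque gradient (e.g. rpm/mNm)
    The no-load speed is a_motor * voltage; the torque grows linearly as the
    measured speed drops below that value.
    """
    return (a_motor * voltage - speed) / b_motor


def estimate_torque(voltage, speed, a_motor, b_motor, residual_model=None):
    """Linear model plus an optional learned correction term.

    residual_model is a hypothetical callable (voltage, speed) -> torque that
    stands in for the trained network predicting the neglected terms.
    """
    torque = linear_motor_torque(voltage, speed, a_motor, b_motor)
    if residual_model is not None:
        torque += residual_model(voltage, speed)
    return torque


# Illustrative values only (assumptions, not data from the paper).
A_MOTOR = 500.0  # rpm/V
B_MOTOR = 100.0  # rpm/mNm
print(estimate_torque(12.0, 5000.0, A_MOTOR, B_MOTOR))
```

Whether the network output is added to the linear estimate or replaces it is a design choice; the additive form is used here only because the text speaks of predicting additional terms.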

IV. PROGRAMMING AND EXECUTING ACTIONS
In this section, we introduce the programming and execution of the disassembly actions of the gripper. We describe the employed software framework (subsection IV-A) and the robot-assisted kinesthetic teaching (subsection IV-B). Subsection IV-C describes how disassembly actions are represented, and subsection IV-D shows the use of visual servoing to compensate for mechanical positioning errors while unscrewing.

A. Software Control Architecture
The software architecture of the KIT gripper is implemented in the robot development environment and control framework ArmarX. The control structure consists of three layers: a real-time-capable low-level unit running at 1 kHz with hardware-specific joint controllers and hardware-independent multi-joint controllers for Cartesian control or trajectory execution; a mid-level layer with an abstraction of the underlying controllers and a multi-purpose component for the detection of unplanned collisions and joint torque peaks; and a high-level layer that implements disassembly tasks as sequences of actions, realized as statecharts [28] with both symbolic action information and the parameters needed for execution. The control architecture offers interfaces to ROS to allow the integration and use of the gripper in different robotic setups.

B. Robot-Assisted Kinesthetic Teaching
One challenging aspect of kinesthetic teaching in the context of disassembly is that many primitive actions, such as unscrewing or changing tools, cannot easily be demonstrated by a human teacher, since the involved actuators (especially the bit actuator and the pump) can only be controlled through software. To solve this problem, we developed a robot-assisted kinesthetic teaching component that makes use of pseudo zero-torque control. Apart from recording data, this component triggers and activates certain actuators depending on sensor data and the state of the gripper, e.g., when a force is sensed at the Tool Center Point (TCP) while the TCP is in a designated area of interest. We identified three different areas of interest, namely at the HDD tray, at the jaw, and at the tool magazine. If, for example, the human demonstrator wants to pick up a specific tool, it is sufficient to guide the arm to the tool magazine area and slightly push the TCP into one of the tools, which automatically triggers the pump to pick up the bit. The component is implemented as a finite state machine (FSM), with states such as pick tool X, place tool X, pick HDD, place HDD, unscrew, and lever to model the kind of assistance, as well as guiding, to indicate no assistance. The FSM is outlined in Fig. 5. The recorded data includes, for each timestamp, all joint angle positions, the forces applied on the TCP, as well as semantic information such as the attached tool, HDD in jaw, and the FSM state. The FSM state in particular is useful later, as it semantically segments the recorded demonstration into labeled parts that are important for learning certain primitive actions (e.g., unscrewing, picking a tool). The parts of the demonstration labeled as guiding can be used to extract trajectories from uninterrupted human disassembly recordings, which are useful for learning the movement primitives of the subsequent action.
Further, multiple recordings of segmented and labeled disassembly demonstrations for a specific product can be used to extract task constraints (e.g., temporal constraints of certain actions) and to learn task models to describe human strategies in executing disassembly tasks.
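A minimal sketch of such an assistance state machine and of the state-based segmentation of a recording is given below. It only covers the tool magazine transitions; the force threshold, the transition rule, and all names are assumptions made for illustration and do not reproduce the FSM of Fig. 5.

```python
from enum import Enum, auto
from itertools import groupby


class Area(Enum):
    NONE = auto()
    HDD_TRAY = auto()
    JAW = auto()
    TOOL_MAGAZINE = auto()


class State(Enum):
    GUIDING = auto()  # no assistance, the human moves the arm freely
    PICK_TOOL = auto()
    PLACE_TOOL = auto()
    PICK_HDD = auto()
    PLACE_HDD = auto()
    UNSCREW = auto()
    LEVER = auto()


FORCE_THRESHOLD_N = 2.0  # hypothetical contact threshold at the TCP


def next_state(area, tcp_force_n, tool_attached,
               force_threshold_n=FORCE_THRESHOLD_N):
    """Select the assistance state from the sensed force and the TCP area.

    Only the tool magazine is handled in this sketch; every other situation
    falls back to plain guiding.
    """
    if tcp_force_n < force_threshold_n:
        return State.GUIDING
    if area is Area.TOOL_MAGAZINE:
        return State.PLACE_TOOL if tool_attached else State.PICK_TOOL
    return State.GUIDING


def segment_by_state(states):
    """Group consecutive identical states into (state, start, end) segments.

    `end` is exclusive, so samples[start:end] is the labeled part.
    """
    segments = []
    index = 0
    for state, group in groupby(states):
        length = len(list(group))
        segments.append((state, index, index + length))
        index += length
    return segments


recorded = [State.GUIDING, State.GUIDING, State.PICK_TOOL, State.GUIDING]
print(segment_by_state(recorded))
```

Storing the state with every sample keeps the segmentation trivial: each labeled segment can be passed directly to the learning step for the corresponding primitive action.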

C. Representing Disassembly Actions
The semantically segmented demonstrations of disassembly tasks obtained from kinesthetic teaching data are used to learn movement primitives for disassembly actions. These movement primitives are learned from multiple demonstrations and are represented using the Via-points Movement Primitives (VMPs) formulation, see [29]. VMPs provide a compact probabilistic representation and allow the adaptation of the learned motions to different via-points while ensuring the extrapolation to new areas in the space, especially in the case of a small number of demonstrations. In our case, we use a maximum of 3 demonstrations for learning a disassembly action.
The initial action parameters are derived from the demonstrations. These include the suction location, the screw and levering poses, and the time slots for activating the gripper LED lighting, the pump, and the bit rotation. The learned VMPs of each disassembly action and the corresponding initial action parameters, preconditions, and effects that are extracted from the demonstrations (see [30], [31]) are stored as one instance in an action descriptors (ADES) database [32]. The following disassembly actions are learned from kinesthetic teaching and stored in the ADES database: pick, drop, lever, tool change, flip, shake, unscrew, push, cut, and reposition of the device. These actions cover most of the disassembly tasks in the context of recycling HDDs.
During execution (see Figure 6), the actions are retrieved and dynamically parameterized according to the current scene. The action parameterization is enabled by the perception system using RGB-D data (see [33], [34]), which is not part of this work. The suction cup is used to perform picking, dropping, and automatic tool change actions. An automatic visual calibration procedure of the tool tip position and orientation using a marker is performed to improve the precision of unscrewing and levering actions. Our system is able to choose among several actions of the same action type from the database depending on the current task parameters. Figure 6d and Figure 6e show two potential levering actions for removing the magnet, which can be selected depending on the situation to achieve better generalization to unseen devices.
The required action type can be determined by reasoning over a symbolic plan (see [30], [31]), which includes the capability of updating the probability of each action by measuring the action effects and the success rate in order to improve the plan in the next execution.
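One way to picture such a database entry is as a record that bundles the learned primitive with its parameters, preconditions, and effects. The sketch below is a hypothetical in-memory stand-in; the field names and the class `ActionDatabase` are assumptions for illustration and do not reflect the actual ADES format of [32].

```python
from dataclasses import dataclass, field


@dataclass
class ActionDescriptor:
    """Hypothetical record for one learned disassembly action."""
    action_type: str  # e.g. "unscrew" or "lever"
    vmp_parameters: list  # placeholder for the learned VMP parameters
    initial_parameters: dict = field(default_factory=dict)
    preconditions: list = field(default_factory=list)
    effects: list = field(default_factory=list)


class ActionDatabase:
    """Minimal in-memory stand-in for an action descriptor database."""

    def __init__(self):
        self._entries = []

    def store(self, descriptor):
        self._entries.append(descriptor)

    def by_type(self, action_type):
        """Return all stored variants of one action type."""
        return [d for d in self._entries if d.action_type == action_type]


database = ActionDatabase()
database.store(ActionDescriptor("lever", [0.1, 0.2], {"variant": "a"}))
database.store(ActionDescriptor("lever", [0.3, 0.1], {"variant": "b"}))
database.store(ActionDescriptor("unscrew", [0.0], {"bit_rotation_s": 3.0},
                                preconditions=["tool_attached"],
                                effects=["screw_removed"]))
print(len(database.by_type("lever")))
```

Keeping several variants per action type allows a later selection step to pick the variant that best matches the current task parameters.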

D. Visual Servoing for Unscrewing Actions
Unscrewing actions require placing the tool inside the screw head and thus a high positioning accuracy, which cannot be achieved by open-loop control. Therefore, we use the tool-mounted camera to locate screws and the tool tip in the image and apply image-based visual servoing to guide the tool tip to the screw head.
1) Screw Detection: For the detection of screw positions, we rely on a combination of screw localization and classification. In contrast to a direct pixel-wise segmentation of screws, we first search for possible candidates for screw positions and then classify these candidates. The circular outline of screws makes it possible to detect their positions in the images by applying the Circle Hough Transform (CHT) (Fig. 7). We use the OpenCV implementation, which allows setting detection parameters including the minimum mutual distance as well as the minimum and maximum radius. Thereby, overlapping circles can be filtered out and only approximately screw-sized circles are found in the image.
To decide whether a detected circular image contour is a screw or another circular object, e.g., a hole, we designed a binary classification algorithm using a convolutional neural network. To train the network, we recorded a dataset of images with possible screw position candidates. The dataset consists of 1,791 images of actual screws and 2,488 images of other circular objects. The best classification results were obtained with a network consisting of three convolutional layers with 16, 32 and 64 features followed by two fully-connected layers which output 64 and 1 features. Training with 50 batches yields a maximum of 95.3 % training accuracy and 93.6 % evaluation accuracy. To track the positively classified screw positions, we follow the detected coordinates over consecutive images and apply low-pass filtering to the classification results. These positions are used by the image-based visual servo controller to reduce the position error between the detected screw and the tool position.
2) Visual Servoing Unscrewing: Inserting the tool tip into the screw requires the error to be less than 1 mm, which cannot be achieved from an initially guessed target pose. To achieve sufficient accuracy in placing the tool tip, we implement a visual servoing process. First, the gripper is brought into contact with the surface, which is detected by the force sensor. It then moves such that the screw gradually approaches the calibrated location of the tool tip in the image frame. To realize the motion, the gripper can either slide the tool tip directly on the complicated surface of the hard disk, which can lead to the tool tip getting stuck, or first move the tool tip up and then towards the screw without contact, which cannot guarantee that the tool tip is right above the screw before the insertion action. We combine both strategies with a distance threshold. When the tool tip is close enough, the gripper slides it towards the screw; otherwise, it raises the tool tip and moves towards the screw without contact. The process has a success rate above 80 %.
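A candidate search of this kind can be sketched with `cv2.HoughCircles`, whose `minDist`, `minRadius` and `maxRadius` arguments correspond to the parameters mentioned above. The blur kernel, the two threshold values, the tolerance, and the helper `radius_bounds_px` are assumptions chosen for illustration, not the settings used in our system.

```python
import math


def radius_bounds_px(screw_radius_mm, px_per_mm, tolerance=0.25):
    """Return (min_radius, max_radius) in pixels around the expected screw size."""
    nominal = screw_radius_mm * px_per_mm
    return (math.floor(nominal * (1.0 - tolerance)),
            math.ceil(nominal * (1.0 + tolerance)))


def detect_screw_candidates(gray_image, min_radius, max_radius, min_distance):
    """Return a list of (x, y, r) circle candidates in an 8-bit grayscale image."""
    import cv2  # imported lazily so the helper above works without OpenCV

    blurred = cv2.medianBlur(gray_image, 5)  # suppress noise before the CHT
    circles = cv2.HoughCircles(
        blurred,
        cv2.HOUGH_GRADIENT,
        1,             # dp: accumulator resolution equal to the image resolution
        min_distance,  # minDist: minimum distance between circle centres
        param1=100,    # upper Canny edge threshold (illustrative value)
        param2=30,     # accumulator threshold (illustrative value)
        minRadius=min_radius,
        maxRadius=max_radius,
    )
    if circles is None:
        return []
    return [(float(x), float(y), float(r)) for x, y, r in circles[0]]


# Example: a screw head of 3 mm radius imaged at 10 px/mm.
print(radius_bounds_px(3.0, 10.0))
```

Each returned candidate would then be cropped from the image and passed to the classifier.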
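The tracking and filtering step can be illustrated with a small sketch. It assumes nearest-neighbour association between frames and an exponential moving average as the low-pass filter; both are choices made for this example, and the thresholds are hypothetical.

```python
import math


class TrackedScrew:
    """Image position and filtered classifier score of one screw candidate."""

    def __init__(self, x, y, score):
        self.x, self.y, self.score = x, y, score


def update_tracks(tracks, detections, max_jump_px=15.0, alpha=0.3):
    """Associate detections (x, y, score) with tracks and low-pass filter scores.

    Each detection updates the nearest track within max_jump_px; otherwise it
    starts a new track. alpha is the weight of the newest classifier score.
    """
    for x, y, score in detections:
        nearest = min(tracks,
                      key=lambda t: math.hypot(t.x - x, t.y - y),
                      default=None)
        if nearest is not None and math.hypot(nearest.x - x, nearest.y - y) <= max_jump_px:
            nearest.x, nearest.y = x, y
            nearest.score = (1.0 - alpha) * nearest.score + alpha * score
        else:
            tracks.append(TrackedScrew(x, y, score))
    return tracks


def confirmed_screws(tracks, threshold=0.5):
    """Positions whose filtered score still indicates a screw."""
    return [(t.x, t.y) for t in tracks if t.score >= threshold]


tracks = update_tracks([], [(100.0, 50.0, 1.0)])
tracks = update_tracks(tracks, [(102.0, 51.0, 0.0)])  # one misclassified frame
print(confirmed_screws(tracks))
```

With the filter in place, a single misclassified frame lowers the score but does not immediately drop the screw from the list of servoing targets.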
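The switching between the two motion strategies can be sketched as a single servoing step in image space. The threshold, the proportional gain, and the function name are hypothetical, and the mapping from the image-space correction to a Cartesian motion of the gripper is omitted.

```python
import math

SLIDE_DISTANCE_PX = 40.0  # hypothetical switching threshold in the image
GAIN = 0.5                # hypothetical proportional gain


def servo_step(screw_px, tool_tip_px,
               slide_distance_px=SLIDE_DISTANCE_PX, gain=GAIN):
    """One image-based servoing step.

    Returns (mode, (du, dv)): mode is "slide" (stay in contact with the
    surface) or "lift_and_move" (raise the tip and move without contact);
    (du, dv) is the commanded image-space correction in pixels.
    """
    error_u = screw_px[0] - tool_tip_px[0]
    error_v = screw_px[1] - tool_tip_px[1]
    distance = math.hypot(error_u, error_v)
    mode = "slide" if distance <= slide_distance_px else "lift_and_move"
    return mode, (gain * error_u, gain * error_v)


print(servo_step((130.0, 100.0), (100.0, 60.0)))   # far from the screw
print(servo_step((103.0, 104.0), (100.0, 100.0)))  # close to the screw
```

Sliding only over the last short distance limits the risk of the tip getting stuck while still keeping the tip on the surface right before insertion.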

V. EVALUATION
To evaluate the gripper design we determine critical parameters like torque estimation accuracy of tool joint and jaw closing joint as well as mechanical accuracy. We evaluate the repeatability of the joint angle accuracy of the jaw rotation joint and the tool position accuracy. We also conducted extensive disassembly experiments with HDDs, which show high success rates and thus a sufficient accuracy of our system.

A. Mechanical Accuracy
The accuracy of reaching the same position and orientation is important for several tasks. The total error $e_{\mathrm{total}}$ of reaching the same position and orientation is composed of a mechanical error $e_{\mathrm{mech}}$ and a control error $e_{\mathrm{control}}$. The mechanical error is the difference between the robot model and the real robot and is caused by errors in the length of the mechanical parts, errors in joint position sensing and elasticity in the robot structure. The control error is the difference between the goal position and the position reached by the robot model. For the exact measurement of the rotation angle, a laser pointer is attached to the jaw gripper. The laser points on a millimeter grid at a distance of 4 m. The laser spot can be located with an accuracy better than 0.25 mm, which results in an angular resolution of 0.003°. We measure the accuracy of reaching the same position of the tool tip on a plane in between the gripper jaws, which corresponds to the repeat accuracy of reaching a screw with a bit on an HDD. For the test setup, we grasped a translucent plate with a millimeter grid on it. On a macro photo taken from below, the position of a torque bit on the plate can be measured with a resolution of 0.05 mm. The mean absolute errors for the rotation of the jaw gripper from 19 data points and the position of the tool tip from 30 data points are given in Tab. II.

B. Torque Estimation Accuracy
We used two of the identically built grippers to train the model and to test it on the same as well as on a different gripper. As ground truth for the torque estimation, a force/torque sensor (ATI mini 45) with a resolution of 0.125 N in $F_z$ direction and 0.001 N m in $T_z$ direction was used. We evaluate the torque estimation accuracy for the jaw gripper and for the tool rotation. The estimation of the exerted grasping force is important for a stable but non-destructive grasp. A mock-up of an HDD with the FT sensor in the middle plane was used for measuring the grasping force. The torque in the levers of the jaw gripper was calculated from the measured force. The estimation of the tool torque is important for screwing and unscrewing tasks as well as for levering. The mean absolute error for the linear model ($e_{\mathrm{linear}}$) and for the torque estimation ($e_{\mathrm{est}}$) for both the tool rotation and the closing of the parallel jaw gripper is given in Tab. III. The estimated torques and ground truth values during grasping of an object and (un-)screwing are shown in Fig. 8.
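Assuming the reported error is the mean absolute error between estimate and ground truth, the comparison of the two estimators can be computed as in the following sketch. The torque values are made up for illustration and are not measurements from our experiments.

```python
def mean_absolute_error(estimates, ground_truth):
    """Mean absolute deviation between two equally long sequences."""
    if not estimates or len(estimates) != len(ground_truth):
        raise ValueError("sequences must be non-empty and of equal length")
    total = sum(abs(e - g) for e, g in zip(estimates, ground_truth))
    return total / len(estimates)


# Made-up torque samples in N m (illustrative only).
truth = [0.10, 0.20, 0.30]
linear_model = [0.14, 0.26, 0.22]
network = [0.11, 0.19, 0.31]

e_linear = mean_absolute_error(linear_model, truth)
e_est = mean_absolute_error(network, truth)
print(e_linear > e_est)
```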

C. Disassembly Success Rate
To evaluate the capabilities of the system, we repeatedly perform a sequence of disassembly actions for a hard drive with a fixed action sequence and predefined action parameters. Note that planning, action effects, and ADES are not part of this work and are out of the scope of the evaluation. We conducted 10 repeated disassembly experiments with the same HDD and collected data about 180 disassembly actions in total. Each trial involved 8 categories of disassembly actions (see Figure 9), each of which can be executed multiple times, e.g., we executed the unscrew action 5 times per trial. When an action failed, we waited until the planned time for this action had elapsed and manually resumed the disassembly task. The average execution time of each trial was 16.3 min, which is a satisfying result but not competitive with human performance. In total, 10 actions out of 180 failed; 9 of them were unscrew actions, and the remaining one was caused by parts stuck in the case during the shake action. To improve the unscrewing success rate, a higher perception accuracy of screw and tool position would be beneficial, as well as a more reliable detection method to ensure that the tool tip is placed correctly. However, our current perception system can detect screws that were not removed successfully, and the planner will decide to try again or switch to a more probable action. In future work, we aim at the evaluation of the complete system integration including high-level planning. With our current design, we achieve high success rates of the disassembly actions, which supports our chosen concept and suggests that performance can be increased further.
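The success rates implied by these counts follow from simple arithmetic, shown below as a short check. The resulting unscrew rate of 82 % is consistent with the success rate above 80 % stated for the visual servoing process; the variable names are chosen for this example.

```python
TRIALS = 10
TOTAL_ACTIONS = 180
FAILED_ACTIONS = 10
UNSCREW_PER_TRIAL = 5
FAILED_UNSCREW = 9

# Share of all executed actions that succeeded.
overall_success = (TOTAL_ACTIONS - FAILED_ACTIONS) / TOTAL_ACTIONS

# Share of unscrew actions that succeeded.
unscrew_total = TRIALS * UNSCREW_PER_TRIAL
unscrew_success = (unscrew_total - FAILED_UNSCREW) / unscrew_total

print(f"overall: {overall_success:.1%}, unscrew: {unscrew_success:.1%}")
# prints "overall: 94.4%, unscrew: 82.0%"
```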

VI. CONCLUSION
We present a concept and complete physical realization of a multi-functional gripper for disassembly tasks. Three copies of the gripper were built and evaluated and are used for experimental disassembly. The sensor setup allows precise proprioception of the robot state, which is enhanced with torque estimation by a neural network. The tool head offers a tool holder combined with a suction cup, sensors and a camera. The gripper is capable of executing a set of disassembly actions including grasping, tool change, levering and unscrewing. The actions are learned from human demonstration and can be adapted to new tasks. We evaluated the mechanical accuracy and tested the abilities of the gripper in the context of disassembly tasks of HDDs. We show our results in an attached video. Future work will address the transferability of the learned disassembly actions for HDDs to other devices of similar size.
We believe our prototype is a step towards the use of robots for automated disassembly of electronic devices and can contribute to a more sustainable use of resources.