A Robot Collision Avoidance Method Using Kinect and Global Vision

This paper introduces a robot collision avoidance method that uses Kinect and global vision to improve the safety of industrial robots. A global vision camera is installed above the robot, and a combination of the background-difference method and the Otsu algorithm is used to detect obstacles. Human skeleton detection is then introduced to obtain the location of the human body. Colliding objects are classified into nonhuman and human obstacles, and human obstacles are further categorized into the head and non-head areas such as the arm. The Kalman filter is used to predict the human gesture. The human joint danger index is used to evaluate the risk level for the human on the basis of the human body joints and the robot's motion information. Finally, a motion control strategy is adopted according to the obstacle category and the human joint danger index. Results show that the proposed method can effectively improve robot safety in real time.


Introduction
A safety control strategy is required to avoid collisions between a moving robot and operators in the working space [1]. Strategies for safe human-robot collaboration (HRC) can be broadly divided into two categories: pre-collision strategies [2][3][4] and post-collision strategies [5]. The former detect danger before a collision and take measures to prevent it, whereas the latter must act during the collision, which demands higher real-time performance, to suppress the impact force and ensure the safety of operators and robots. Therefore, pre-collision methods can normally achieve safer results in HRC. Researchers have worked on various pre-collision methods and reported important findings and solutions. In one pre-collision approach, sensors such as ultrasonic [6] and photoelectric sensors are installed on the robot to detect the relative position of human and machine. When the sensors identify danger, the robot is stopped immediately. Such a strategy is very simple, but it greatly reduces the reliability of collision avoidance and the work efficiency of the robot.
The robot should follow the collision avoidance strategy throughout its run time, so a more reliable safety strategy is essential. Sanderud [7] presented a proactive safety strategy based on a quantified measure of risk for human-robot collaboration. The risk field is established from an analysis of the human's movement and the consequence of a collision with different human limbs, combined with a likelihood analysis. Similarly, a simulation tool using real-world geometrical data was proposed to investigate different algorithms and safety strategies [8]. Kulić [9,10] suggested a method of safe robot trajectory planning based on minimizing a danger index. However, the application of this method is limited by the large amount of environmental data required.
One of the most common and useful technologies for detecting intruding obstacles is robot vision, which has developed in recent years and could be a feasible solution for collision avoidance. A robot manipulator automatic path planning strategy based on a 3D-TOF sensor was presented for a peg-in-hole assembly process [11]. Similarly, a two-fold strategy was presented to automatically generate safe paths for robot trajectories based on data from a TOF sensor [12]. Kuhn [13] used monocular vision to measure the human-robot distance in manipulator space and thus identified the risk. Yet this method is not conducive to human-robot collaboration because of the limited detection capability of monocular vision. Kinect is an advanced motion tracking sensor that can be used to identify individuals in the robot work space. In [14], an obstacle detection method using Kinect was designed to obtain the obstacle's information; by designing an artificial parallel system of the manipulator in the obstacle avoidance controller, the proposed control method enables the robot to move away from the obstacle. He has also investigated [15] and used Kinect to obtain the obstacle information. This scheme is mainly used for mobile robots to avoid obstacles and is not used in a man-machine cooperation environment. Flacco [16] calculated the human-robot distance to determine the danger level and took safety control measures in the Kinect depth space. This method cannot accurately distinguish human body parts, quantitatively evaluate the movement state, or take effective control measures in time.
This study proposes an improved method for detecting collision danger and controlling the robot. Global vision is used to identify the danger of collision. Kinect, combined with the Kalman filter, is used to discriminate the obstacle and detect the human gesture. A specific motion control strategy is then adopted depending on the situation.

Robot Collision Avoidance System Scheme
As shown in Figure 1, the system consists of a six-degrees-of-freedom (DOF) robot arm, Kinect, an industrial camera, and a personal computer (PC). Kinect and the camera transmit data to the PC through universal serial bus (USB) ports, while the PC and the six-DOF robot arm exchange parameters and motion commands via the transmission control protocol/internet protocol (TCP/IP). The industrial camera is installed above the six-DOF robot arm, and Kinect is installed behind it. By adjusting the locations of the industrial camera and Kinect, the detection area can cover the entire working area of the robot arm (Figure 2). In this system, the collision signal of an obstacle must be identified first. Hence, global vision and the background-difference method are adopted to detect human and nonhuman obstacles. Upon sensing a potential collision, global vision transmits the danger signal to the robot arm controller so that the arm can take safety control measures in time. Kinect then identifies whether the obstacle is a part of the human body [17,18]. If it is, Kinect combined with the Kalman filter detects and predicts the pose of the human body to calculate the current human joint danger index. The index is compared with the danger index threshold to classify the danger level, and the corresponding danger signal is transmitted so that the robot arm adopts the appropriate control strategy.
The Kalman filter can predict the next position and velocity of an object on the basis of its current position [19,20]. Noise often interferes with the measurement of the position and velocity of the object; thus, the Kalman filter uses the dynamic information of the object to reduce the influence of noise and estimate the position of the object more accurately.

Obstacle Hazard Detection

Global Vision

Coordinate System Transformation
The pinhole imaging principle was adopted in this study to describe the transformation between the image coordinate system and the world coordinate system. $(X_c, Y_c, Z_c)$ and $(X_w, Y_w, Z_w)$ are the coordinates in the camera coordinate system and the world coordinate system, respectively, and $(u, v)$ is the image coordinate in pixel units. The projection is given by Equation (1) [21], where $d$ is the image distance; this expression converts the image coordinate system into the world coordinate system. The chessboard method was adopted for calibration, which yields the expression relating the camera coordinate system to the world coordinate system. Thus, the transformation between the robot arm coordinate system and the image coordinate system can be realized.
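The coordinate chain described above can be sketched with the standard pinhole model. This is a minimal illustration and not the paper's calibrated Equation (1): the focal lengths `fx`, `fy` (in pixels), the principal point `(u0, v0)`, and the extrinsic rotation `R` and translation `t` are assumed names whose values would come from the chessboard calibration.

```python
import numpy as np


def project_to_pixel(p_cam, fx, fy, u0, v0):
    """Project a camera-frame point (Xc, Yc, Zc) to pixel coordinates (u, v)."""
    x, y, z = p_cam
    return np.array([fx * x / z + u0, fy * y / z + v0])


def pixel_to_camera(uv, depth, fx, fy, u0, v0):
    """Back-project a pixel to the camera frame, assuming its depth Zc is known.

    For an overhead camera watching a flat work surface, the depth can be
    taken as the (assumed known) camera height above that surface.
    """
    u, v = uv
    return np.array([(u - u0) * depth / fx, (v - v0) * depth / fy, depth])


def camera_to_world(p_cam, R, t):
    """Invert the extrinsic relation p_cam = R @ p_world + t."""
    return R.T @ (np.asarray(p_cam, dtype=float) - np.asarray(t, dtype=float))
```

Chaining `pixel_to_camera` and `camera_to_world` gives a pixel-to-world mapping; one further rigid transform (not shown) would take world coordinates into the robot arm coordinate system.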

Detect Potential Collision Objects
The entire control system must run in real time. Thus, the amount of image data to be processed should be minimized while still meeting the requirements of target detection. A Gaussian image pyramid was used to downsample the image and reduce its resolution.
The background-difference method was adopted to detect obstacles. Its background modeling options include (1) the simple background model, (2) the single Gaussian background model, and (3) the mixed Gaussian background model. In the experimental comparison, the background-difference method based on the simple background model was the most efficient and can meet the real-time requirement of the system. In this system, the background model was built via image sequence synthesis: frame $i$ of the sampled image sequence is denoted $I_i(x, y)$, and the background image is synthesized from these frames. However, the background-difference method based on the simple background model needs a good threshold; otherwise, achieving the desired detection effect is difficult. Hence, the Otsu algorithm is used to obtain a threshold value that improves obstacle detection.
With the Otsu algorithm, processing each frame requires approximately 300 ms in the experiment, which cannot meet the real-time requirements of the system, so the algorithm cannot be applied directly during operation. The working environment of the system is human-controlled, and environmental brightness is the main influencing factor. Therefore, different brightness levels were produced in this study by changing the number of lights in the working environment, and the threshold segmentation algorithm was used to calculate the best threshold for each brightness level in advance. The targets used to calculate the segmentation threshold include the head, the arm, and some nonhuman obstacles. In this way, the background-difference method based on the simple background model both meets the real-time requirements and detects obstacles accurately. Finally, the image was eroded and dilated by advanced morphological transformation.
All contours in the binarized image were detected, and the outermost bounding rectangle of each was computed, including the center point coordinates and the rectangle's length (L) and width (W). Image interference can be partially eliminated by screening each contour.
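The detection scheme described above can be sketched as follows. This is a rough illustration under stated assumptions rather than the authors' implementation: the background is assumed to be a per-pixel mean of the sampled frames, the Otsu threshold is computed offline on a difference image, and `pick_threshold` with its brightness-keyed table is a hypothetical helper for the per-brightness lookup.

```python
import numpy as np


def build_background(frames):
    """Simple background model: per-pixel mean of grayscale frames (assumption)."""
    return np.mean(np.stack(frames), axis=0).astype(np.uint8)


def abs_difference(frame, background):
    """Absolute difference image between a frame and the background (uint8)."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff.astype(np.uint8)


def otsu_threshold(gray):
    """Otsu threshold of a uint8 image: maximizes the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # probability of class "<= t"
    mu = np.cumsum(prob * np.arange(256))    # cumulative mean up to t
    denom = omega * (1.0 - omega)
    numer = (mu[-1] * omega - mu) ** 2
    safe = np.where(denom > 0, denom, 1.0)   # avoid division by zero
    sigma_b = np.where(denom > 0, numer / safe, 0.0)
    return int(np.argmax(sigma_b))


def pick_threshold(frame, table):
    """Runtime lookup: use the offline threshold of the nearest brightness level.

    `table` maps a mean-brightness level to a precomputed threshold
    (hypothetical structure; the values are not taken from the paper).
    """
    brightness = float(frame.mean())
    nearest = min(table, key=lambda level: abs(level - brightness))
    return table[nearest]


def foreground_mask(frame, background, threshold):
    """Pixels whose difference from the background exceeds the threshold."""
    return abs_difference(frame, background) > threshold
```

The slow `otsu_threshold` call runs once per brightness level during setup; at run time only `pick_threshold` and `foreground_mask` are executed per frame, which is the point of the precomputation.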
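The contour step can be illustrated with a small connected-region search on the binary mask. This sketch substitutes a plain breadth-first labeling for whatever contour routine the authors used; the `min_area` filter, the 4-connectivity, and the convention that L is the longer side of the rectangle are assumptions made for the example.

```python
from collections import deque

import numpy as np


def bounding_rectangles(mask, min_area=1):
    """Return center, L, and W of the bounding rectangle of each region.

    Regions are 4-connected groups of True pixels; regions with fewer than
    `min_area` pixels are discarded as interference.
    """
    height, width = mask.shape
    seen = np.zeros(mask.shape, dtype=bool)
    rects = []
    for r0, c0 in zip(*np.nonzero(mask)):
        if seen[r0, c0]:
            continue
        queue = deque([(int(r0), int(c0))])
        seen[r0, c0] = True
        rmin = rmax = int(r0)
        cmin = cmax = int(c0)
        area = 0
        while queue:
            r, c = queue.popleft()
            area += 1
            rmin, rmax = min(rmin, r), max(rmax, r)
            cmin, cmax = min(cmin, c), max(cmax, c)
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if (0 <= rr < height and 0 <= cc < width
                        and mask[rr, cc] and not seen[rr, cc]):
                    seen[rr, cc] = True
                    queue.append((rr, cc))
        if area >= min_area:
            side_x = cmax - cmin + 1
            side_y = rmax - rmin + 1
            rects.append({
                "center": ((cmin + cmax) / 2.0, (rmin + rmax) / 2.0),  # (x, y)
                "L": max(side_x, side_y),
                "W": min(side_x, side_y),
            })
    return rects
```

Discarding small regions is one simple way to realize the screening mentioned above; other criteria, such as aspect ratio, could be applied to the same records.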

Human Body Posture Recognition and Detection
Kinect was used to detect human skeleton information and was combined with the Kalman filter to predict human posture. Taking the first left arm joint in Figure 3 as an example, the realization of the Kalman filter on the X axis is discussed below; the filter on the Y axis is realized in the same way.

Establishment of the Kalman State Equation
In the motion equation of joint 1, $x_k$, $v_{kx}$, and $a_{kx}$ are the coordinate, velocity, and acceleration of joint 1 on the X axis at time $k$, respectively, and $t$ is the time interval between two frames.

Establishment of the Kalman System Observation Equation [22]
The Kalman system observation vector not only corrects the prediction of the previous round of the Kalman iteration but also prepares for the next round. Because the observed joint position is the actual position of the joint in the Kinect coordinate system, the observation equation expresses the joint position through the observation coefficient matrix.

Initialization Filter
The first iteration of the Kalman filter requires the various vectors to be initialized. When the human body first appears, the filter state is initialized from the observed position of the human joints. The initial state vector covariance matrix can take large values on its diagonal; these values are chosen according to the actual measurement situation and have little influence after the filter has run for some time. The system dynamic noise covariance matrix is specified at the same stage.

Kalman Filtering Algorithm Iterative Process
On the basis of the state vector prediction equation, the state of joint 1 in frame k+1 is predicted. Each iteration then evaluates, in order, the prediction equation of the state vector covariance matrix, the Kalman weighting matrix, the state vector update equation, and the state vector covariance update equation. In the same way, the states of joints 1 and 2 in the next frame can be predicted. Furthermore, state prediction was combined with the precise joint data provided by Kinect. After a short iterative process, constantly updating the state vector and its covariance matrix using Equations (14) to (17) increases the accuracy of the predicted position. Thus, the velocity vector of the joint motion can be calculated.
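The iteration described in this section follows the usual predict and update cycle of a linear Kalman filter, which can be sketched for one axis of one joint as below. The state layout `[position, velocity, acceleration]`, the observation matrix `H = [1, 0, 0]`, and the values of `p0`, `q`, and `r` are assumptions chosen for illustration; the paper's own matrices are not reproduced here.

```python
import numpy as np


class JointKalman1D:
    """Constant-acceleration Kalman filter for one joint coordinate (sketch)."""

    def __init__(self, first_position, dt, p0=1e4, q=1e-2, r=25.0):
        # State transition for [x, v, a] over one frame interval dt.
        self.F = np.array([[1.0, dt, 0.5 * dt ** 2],
                           [0.0, 1.0, dt],
                           [0.0, 0.0, 1.0]])
        self.H = np.array([[1.0, 0.0, 0.0]])   # only position is observed
        self.Q = q * np.eye(3)                 # dynamic noise covariance (assumed)
        self.R = np.array([[r]])               # measurement noise covariance (assumed)
        # Initialize from the first observed position; large diagonal covariance.
        self.x = np.array([first_position, 0.0, 0.0])
        self.P = p0 * np.eye(3)

    def predict(self):
        """State and covariance prediction for the next frame."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x

    def update(self, measured_position):
        """Correct the prediction with the position reported by the sensor."""
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)        # Kalman weighting matrix
        innovation = measured_position - self.H @ self.x
        self.x = self.x + K @ innovation
        self.P = (np.eye(3) - K @ self.H) @ self.P
        return self.x
```

In use, `predict()` would be called once per frame before the new Kinect joint position arrives and `update()` once it has; after a few iterations `x[1]` serves as the joint velocity estimate along that axis.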

Danger Index
When the robot arm moves, the state of the human body's movement plays an important role in the collision risk level. Different parts of the human body can bear different degrees of impact force. Therefore, distinguishing the different parts of the human body when detecting collision danger and quantifying the level of danger for each body part are crucial. The human head and arm are taken as examples to calculate and illustrate the danger index and to set the specific motion control strategy; the experimental results show that the strategy is effective.
Figure 4 shows the start and end points of the robot arm movement. Suppose the human's left arm is a potential collision site; joint 1 is taken as the reference point if it is closer to the robot arm than the other joints. As shown in Figure 4, the instantaneous velocities of joint 1 along the $X_r$ and $Y_r$ axes of the robot arm coordinate system are $v_{kx}$ and $v_{ky}$, respectively. The velocity component of joint 1 along the line between the center point and the robot arm is $v_k$, which is the sum of the components of $v_{kx}$ and $v_{ky}$ along this line. $\mathbf{v}_m$ is the velocity vector of the robot arm joint movement, $v_m$ is the velocity component of the robot along this line, and $X_m$ and $Y_m$ are the coordinates of the robot arm joint on the $X_r$ and $Y_r$ axes, respectively. As a quantitative analysis parameter of collision, the danger index is calculated only when the danger signal is detected; the danger index is therefore 0 in the safe state. The human joint danger index is computed from the velocities $v_k$ and $v_m$, which include the moving direction information of the human body joint and the industrial robot. Therefore, the danger index also contains the direction information of the relative movement of the robot arm and joint 1, and it plays a vital role in detecting the danger signal.
If joint 2 is closer to the robot arm than other parts, then it is used as a reference point. When the obstacle is a human head, the calculation is the same as above. The method is based on the danger index and combined with global vision's obstacle information detection to sense the collision danger in real time.
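The velocity components that feed the danger index can be illustrated by projecting both velocities onto the line joining the reference joint and the robot. The sketch below computes only these inputs and the resulting closing speed; it does not reproduce the paper's danger index formula, and the function name and sign convention are assumptions.

```python
import numpy as np


def line_of_centers_components(joint_pos, joint_vel, robot_pos, robot_vel):
    """Project joint and robot velocities onto the joint-to-robot line.

    Returns (v_k, v_m, closing_speed, distance). A positive closing speed
    means the gap between the joint and the robot is shrinking.
    """
    joint_pos = np.asarray(joint_pos, dtype=float)
    robot_pos = np.asarray(robot_pos, dtype=float)
    line = robot_pos - joint_pos
    distance = float(np.linalg.norm(line))
    if distance == 0.0:
        raise ValueError("joint and robot positions coincide")
    unit = line / distance                      # unit vector from joint to robot
    v_k = float(np.dot(joint_vel, unit))        # sum of v_kx and v_ky along the line
    v_m = float(np.dot(robot_vel, unit))        # robot velocity along the same line
    return v_k, v_m, v_k - v_m, distance
```

Because both components are signed, a joint moving away from the robot yields a negative `v_k`, which is how direction information enters any index built on these quantities.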

Specific Motion Control Strategy
When collision danger is detected, control measures must be taken promptly to ensure safety. In this study, the control strategy depends on whether the obstacle is a human body and, if so, on which part of the body it is, so that the system can protect both the robot arm and the people [23].
The collision danger signal is composed of the signals provided by Kinect and global vision, which determine the type of obstacle and the degree of collision danger. Therefore, the selection of the control strategy is mainly based on the following three aspects:
(1) The control strategy should protect the robot as much as possible, so that the robot can deal better with an impending dangerous situation.
(2) When the obstacle is a part of the human body, the control strategy should give priority to protecting the human body.
(3) The control strategy, while providing protection, should not affect the efficiency of the robot after the danger signal is released.

If the Obstacle is Not Human
The robot arm was controlled to move perpendicular to the line connecting the end-effector and the obstacle center. During image processing, the center point of the detected obstacle is A(x, y). When the danger signal is detected, the vector $a$ between the end-effector of the robot arm and the obstacle center is computed from A and the end-effector coordinates (Equation (24)). The motion direction of the end-effector is then determined using the angle between the velocity vector of the end-effector and the direction vector of the obstacle.
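One possible reading of this avoidance rule is sketched below: the escape direction is perpendicular to the vector from the end-effector to the obstacle center, and of the two perpendiculars the one that agrees better with the current end-effector velocity is chosen. The sign rule is an assumption made for the example, since the paper's condition on the angle is not reproduced here, and the function name is hypothetical.

```python
import numpy as np


def escape_direction(effector_xy, obstacle_xy, effector_vel):
    """Return (a, n): the effector-to-obstacle vector and a unit escape direction.

    n is perpendicular to a. Of the two candidates, the one with a
    non-negative component along the current velocity is kept (assumption).
    """
    a = np.asarray(obstacle_xy, dtype=float) - np.asarray(effector_xy, dtype=float)
    norm = np.linalg.norm(a)
    if norm == 0.0:
        raise ValueError("end-effector coincides with the obstacle center")
    n = np.array([-a[1], a[0]]) / norm          # a rotated by +90 degrees
    if np.dot(n, effector_vel) < 0.0:
        n = -n                                  # prefer the side the arm is heading to
    return a, n
```

Choosing the perpendicular closer to the current velocity avoids reversing the arm's motion abruptly, which is one plausible reason for involving the angle between the two vectors.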

If the Obstacle is a Human Head
When the danger index DI > F (where F is a pre-set head danger index threshold), the robot arm takes a repulsive action and moves away from the head.
When the danger signal was detected by global vision, the robot arm was driven in the reverse direction according to a repulsion equation whose parameters are the safe detection distance of the robot, the motion velocity $v_{kx}$ of the obstacle, and a shape factor, which was taken as 6 in the experiment. If the human head is still in the area of robot movement when the danger signal is released, the robot movement is recovered only after the head has left the area.

If the Obstacle is the Arm
When the danger index DI > $F_h$ (where $F_h$ is a pre-set arm danger index threshold), the robot arm was controlled to move along the direction of the end-effector and the center of the arm, and a new path to the target point was planned.
The acceleration of the robot's end-effector and the moving direction vector of the end-effector are then computed, where c is the proportional constant, taken as 5 in the experiment. Global vision provides the arm center coordinates A(x, y) and the length L and width W of the arm's outer contour. Once the position coordinate of the robot reaches the safe position, the robot is controlled to move along a new path to the target point.
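The three cases in this section can be summarized as a small dispatch on obstacle type and danger index. The enum names, the strategy labels, and the behavior when the index does not exceed its threshold (the robot simply continues) are assumptions introduced for this sketch; the thresholds stand for F and F_h.

```python
from enum import Enum


class Obstacle(Enum):
    NONHUMAN = "nonhuman"
    HEAD = "head"
    ARM = "arm"


class Strategy(Enum):
    MOVE_PERPENDICULAR = "move perpendicular to the effector-obstacle line"
    RETREAT_FROM_HEAD = "retreat away from the head, resume after it leaves"
    SIDESTEP_AND_REPLAN = "move to a safe position, then replan to the target"
    CONTINUE = "keep the current trajectory"


def select_strategy(obstacle, danger_index, head_threshold, arm_threshold):
    """Pick a control strategy from the obstacle type and the danger index."""
    if obstacle is Obstacle.NONHUMAN:
        # Nonhuman obstacles are handled without a danger index.
        return Strategy.MOVE_PERPENDICULAR
    if obstacle is Obstacle.HEAD:
        if danger_index > head_threshold:
            return Strategy.RETREAT_FROM_HEAD
        return Strategy.CONTINUE                # assumption: below threshold
    if obstacle is Obstacle.ARM:
        if danger_index > arm_threshold:
            return Strategy.SIDESTEP_AND_REPLAN
        return Strategy.CONTINUE                # assumption: below threshold
    raise ValueError(f"unknown obstacle type: {obstacle!r}")
```

Keeping the head and arm thresholds as separate parameters reflects the point made earlier that different body parts tolerate different impact forces.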

Experimental Results and Analysis
The experimental platform is a six-DOF ABB robot arm that communicates through the PC interface with a PC equipped with an Intel Core i3-4130 3.4 GHz CPU and 4 GB of RAM. A first-generation Kinect sensor and a 1,200,000-pixel industrial camera were used to acquire the detection data. The system was implemented in VS2010 to process images and analyze data.

Human Gesture Prediction Experiment
To verify the reliability of the human gesture prediction in this study, the errors of Kinect detection and of Kalman filtering must be tested. In the experiment, Kinect was used to track the motion of the hand and the head along the $X_r$ axis of the robot coordinate system, and the Kalman filter was then applied to predict the moving gesture of the hand and head. As shown in Figure 5 and Figure 6, the white rectangle marks the changing position of the hand and the head, and the white line is the path of the palm and the head. To compare the actual values with the Kinect detection values and the Kalman filter values, a tape was set up in the experiment as a reference, and the hand and the head moved along the direction of the tape. The actual values, Kinect detection values, and Kalman filter values for the hand and head are shown in Figure 7. The errors between the actual values and the Kinect detection values, as well as between the actual values and the Kalman filter values, are shown in Figure 8.
A typical experimental result is presented in Figure 7, where both the Kinect detection values and the Kalman filter values are almost the same as the actual values. The corresponding errors are summarized in Figure 8. The absolute errors of both the Kinect detection values and the Kalman filter values are below 50 mm, which is small enough not to affect the outcome of human gesture prediction. In summary, using Kinect combined with the Kalman filter to predict the human gesture is reliable and practicable.

The Robot Collision Avoidance Experiment
To verify the real-time performance and effectiveness of the system, a series of experiments was carried out for the following situations.

The Obstacle Is Not Human
An umbrella was placed in the robot's working space so that the robot's trajectory passed through the position of the umbrella. When the system detects the umbrella as an obstacle, it controls the robot arm to stop and escape in the reverse direction (Figure 9). For instance, when the robot arm moves close to the umbrella (Figure 9a), the system detects that the arm is too close, and the arm stops (Figure 9b). The system then takes further measures to avoid the collision (Figure 9c). The collision avoidance trajectory of the end-effector in its working space is shown in Figure 9d. When the obstacle reaches the dangerous point, the robot arm moves in the safe direction along the collision avoidance trajectory and thus avoids the collision in time (Figure 9).

The Obstacle is the Head and the Danger Index f > F
When a head is detected as an obstacle by the system and judged to be at risk, the system controls the robot arm to move in the reverse direction of the head's movement (Figure 10). For instance, when a head appears in the moving space of the robot arm and both are approaching each other (Figure 10a), the system detects the collision danger between the head and the robot arm (Figure 10b). The robot arm then retreats in the reverse direction while the head continues to move (Figure 10c). The collision avoidance trajectory of the end-effector in its working space is shown in Figure 10d.
As shown in Figure 10, global vision and Kinect detect the presence of the head and, combined with the Kalman filter, predict the head's gesture; the resulting danger index is greater than the critical value. The robot arm therefore applies the control strategy shown in Figure 10d in time to avoid a collision with the head.

The Obstacle is the Arm and the Danger Index f > $F_h$
When the palm is detected as an obstacle by the system and judged to be at risk, the system controls the robot arm to retreat perpendicular to the vector direction of the palm (Figure 11). When the palm appears in the moving space of the robot arm, the system detects the collision danger (Figure 11a). The robot arm retreats perpendicular to the vector direction of the palm (Figure 11b) and then moves along a new trajectory to the original target (Figure 11c). The collision avoidance trajectory of the end-effector in its working space is shown in Figure 11d.
When the palm reaches the dangerous point, the robot arm promptly moves in the safe direction along the collision avoidance trajectory. When the robot reaches the safe position, the system replans the trajectory so that the robot reaches the original target position (Figure 11).
The system can detect different obstacles, such as the head or the arm of a human body, and then control the robot promptly and effectively to avoid collisions with different risks. The average response time was 45-55 ms, which met the real-time requirements. Furthermore, the robot arm returns to the original trajectory when the danger signal is released.

Conclusion
This paper presents a pre-collision method based on global vision and Kinect for the safety of robot movement. The system is simple in structure, recognizes human and nonhuman obstacles accurately, and calculates the danger index using Kinect and the Kalman filter algorithm when the obstacle is a human body. Moreover, the system uses the danger index as the basis for selecting among different safety control strategies, thereby ensuring the safety of the human body and the robot arm.
This pre-collision method can be applied to man-machine cooperation with industrial robots. A series of experiments on the six-DOF ABB robot arm confirmed the real-time effectiveness and satisfactory performance of the method. Thus, the method can effectively improve the safety of the robot's movement while having minimal effect on the working efficiency of the robot.