A New Shopfloor Orchestration Approach for Collaborative Human-Robot Device Disassembly

We present a new approach to task assignment and scheduling in human-robot teams that undertake collaborative device disassembly tasks. The proposed approach is a hybrid between a global search metaheuristic and an adaptive greedy operation assignment and scheduling algorithm. We propose the concept of an "Adaptive (work) Cell" (aCell), which becomes the basis for the hierarchical organization of the proposed search approach. At the high level, metaheuristic search establishes resource constraints for each aCell and determines parameters for the task-level operation scheduling. At the low level, the task-level scheduling algorithm produces feasible assignments and schedules within a single aCell by backtracking through feasible time slots using an adaptive score metric. The advantage of the proposed approach is that it clearly delineates between higher-level state space exploration and focused, task-oriented exploitation. We validate the proposed approach on a class of novel multi-objective benchmark problems involving human-robot teams collaborating throughout a factory floor. These benchmarks address, for the first time, device disassembly tasks relevant to WEEE recycling, with additional constraints, and we obtain favorable results on them.


I. INTRODUCTION
Human-Robot Collaboration (HRC) is a major research direction that has been gaining increasing attention over the past years, owing to the proliferation of industry 4.0 as a prominent research agenda. Within this setting, an important research topic is the orchestration of human-robot agent teams undertaking collaborative tasks. One example of such an application is HRC in device assembly scenarios, where versatile teams of robots and humans carry out assembly of devices of varying specifications characterized by flexible, agile assembly processes. A much less explored application in the field of HRC concerns the disassembly of electric and electronic devices, which is highly relevant to Waste Electric and Electronic Equipment (WEEE) recycling. Despite the ever-increasing importance of WEEE recycling, much of the disassembly is still performed by human workers. WEEE recycling presents a series of challenges that render close HRC throughout a factory floor the only possibility towards automation. Indicatively, WEEE arriving as input to the plant is highly diverse, while not all of its components are expected to be extracted by a robot in an efficient manner (e.g. small, delicate parts of device sub-components).
Coordination of small human-robot teams undertaking collaborative tasks is a well-studied topic in the literature [1], [2], [3], [4]. In this type of problem, small teams of humans and robots collaborate on a single task, and the goal is to assign and sequence operations so as to achieve a problem-dependent objective, such as minimizing makespan or cost. On the other hand, the problem of orchestrating large groups comprising multiple human-robot teams undertaking multiple collaborative tasks simultaneously is one that has received less attention so far, even though efficient orchestration in such large-scale collaborative scenarios is key to achieving factory-wide efficiency [5].
1 Authors are with the Centre for Research and Technology Hellas / Information Technologies Institute (CERTH / ITI) (ihatz,dgiakoum,Dimitrios.Tzovaras)@iti.gr
Efficient orchestration of resources at a factory-level is a commonly occurring problem in resource assignment and scheduling literature [6]. The goal is to assign a number of operations each of which belongs to a task, to suitable resources, and schedule their processing so as to optimize one or more defined objectives. It is possible to have more than a single type of resource, in which case each operation is assigned a resource of each type [7], [8]. Operations within a task are typically determined by a strict precedence relationship. Resource assignment and scheduling is a class of problems that is proven to be NP-Hard [6], and several categories of methods are proposed to address them, such as exact [9], heuristic [5], [10] and metaheuristic [11], [12].
The problem of large-scale human-robot team orchestration addressed in this paper shares many common features with assignment and scheduling problems, but also presents significant differences. First, the co-existence of human and robotic agents introduces a resource category, that of the robots, which is diversified with respect to (i.) skills pertaining to different operations necessary for task completion and (ii.) mobility characteristics (e.g. mobile robotic agents versus agents fixed to workstations), both of which impose additional constraints on resource assignment and scheduling. Second, when considering a full factory shopfloor, work is performed on significantly more than one physical workstation, and workstations are topologically located according to a specific floor plan; this imposes temporal constraints on resource exchange between workstations and on the possible distribution of task operations among workstations. Third, precedence relations among operations are optional, and dictated by the device disassembly process; a separation of device components early on offers the opportunity for two or more agents to work on component extraction in parallel. Finally, the problem is characterized by multiple conflicting objectives pertaining to aspects such as completion times, production costs and environmental factors, which are in turn more complex in definition due to the diversity of the resources involved, i.e. human and robot workers.

A. Contribution
The contribution of this paper is twofold. First, we propose a novel multi-objective problem definition that includes aspects necessary to tackle real-world, large-scale (i.e. factory-wide) collaborative human-robot team orchestration, as required to achieve efficiency on the scale of a factory conforming to industry 4.0 realizations. Second, we propose a novel approach that is able to address this class of problems effectively. The proposed approach comprises a hybrid resource assignment and scheduling algorithm combining high-level global metaheuristic search and adaptive greedy operation assignment. We introduce the concept of an "Adaptive Workcell" (aCell), a flexible hierarchical boundary between the global and the local search counterparts. Within this context, search operates first at the high level, constructing the aCells by determining resources and scheduling parameters for each of them. In this stage, global search is performed by a Multi-Objective Evolutionary Algorithm (MOEA). Subsequently, at the low level, local search identifies assignments and schedules within each aCell, using dispatch rules with adaptive goals. Our aCell-based approach favors solutions with compact temporal distribution of task operations, by incorporating task continuity into the solution's definition. In addition, the distinction between global and task-level optimization strikes an efficient balance between exploration and exploitation, as evidenced by the results of the computational experiments.

B. Paper Outline
The rest of the paper is organized as follows. Section II presents the state of the art in collaborative human-robot team orchestration and in multi-objective dual-resource constrained jobshop scheduling. Section III elaborates the problem definition, including the objective functions and a Mixed-Integer Linear Programming (MILP) formulation. Section IV presents, in separate subsections, an overview of the proposed architecture and an elaboration of each component. Section V presents experimental results. Section VI concludes the paper.

II. RELATED WORK
The problem of efficiently orchestrating multiple collaborating human-robot teams is strongly related to resource assignment and scheduling in operations research. In this problem, a series of tasks comprising multiple operations need to be executed considering limited resources, with the aim of minimizing one or more objectives. The problem addressed here includes two limiting resource types, namely agents and workstations, which is known in operations research as a Dual-Resource Constrained Flexible Jobshop Scheduling Problem (DRCFJSP) [7]. DRCFJSP is a well-studied problem with extensive coverage in the literature [7], [8], [13]. In DRCFJSP two resource types constrain the scheduling operation, and each operation is assigned to one candidate of each resource type. The multi-objective variant of DRCFJSP has gained attention recently, due to increased interest in reducing production costs, environmental emissions etc. Hamedi et al. [14] proposed a Multi-Objective Tabu Search algorithm based on Goal Programming. Lei and Tan [15] proposed a hybrid Genetic Algorithm with Local Search with Controlled Deterioration for the multi-objective DRCFJSP. The algorithm in question performs global search using the Genetic Algorithm and incorporates a local search step using Variable Neighborhood Search (VNS) in a similar fashion to previous work of Lei and Guo [11], but also accepts dominated solutions in addition to dominating ones, with a deteriorating factor. Finally, Xixing and Yi [16] propose an algorithm for the multi-objective DRCFJSP based on the Multi-Objective Evolutionary Algorithm based on Decomposition (MOEA/D) [17], with the objectives of minimizing makespan, minimizing equipment load and minimizing processing cost.
HRC is a research field relevant to this work that is rapidly gaining attention in the literature, largely owing to the proliferation of industry 4.0-related research. In a recent work, Johannsmeier et al. [1] demonstrated efficient human-robot interaction planning at the device assembly task level. Their approach employs a graph traversal algorithm based on A* applied on AND/OR graphs corresponding to assembly sequences. Chen et al. [2] propose a Multi-Objective Genetic Algorithm for task planning in small human-robot teams on one- and two-assembly-stream problems. Zhang and Shah [5] develop multi-abstraction search for addressing the multi-agent placement and task orchestration problem, where agent placement, task assignment and task scheduling happen sequentially and form the starting point for a satisficing scheduler. Bogner et al. [4] present a Mixed-Integer Linear Programming (MILP) formulation for single- and multi-task HRC with application in PCB production, together with a set of efficient heuristics and a metaheuristic approach.
Despite the relevance of existing approaches, HRC in factory-scale device disassembly applications entails a combination of unique characteristics, namely: i. the presence of heterogeneous agents, including humans and robots with varying skills and mobility characteristics, ii. the topological organization of workstations in a factory floor, and the temporal constraints that they impose with respect to mobile agents, iii. precedence constraints for operations within a single task that are optional and thus allow for parallelism, iv. multiple conflicting objectives and, v. operations carried out at scale and through multiple collaborating agent teams. To the best of the authors' knowledge, there does not exist a work that addresses all of the above aspects simultaneously.

III. PROBLEM FORMULATION

A. Illustrative Domain
In line with the vision of industry 4.0, collaborative robotics can expand applications of robots to a diverse range of industrial settings where the application of current industrial automation is infeasible. HRC in industrial settings has the potential to alleviate production costs by allowing robots to undertake tasks requiring skilled labor in collaboration with humans [18]. The exemplary task examined in the present paper is that of skilled disassembly of WEEE devices in recycling factories, which is of major importance for increasing the volume of reclaimed materials per device. Despite its importance, it is a task handled almost exclusively through manual labor, as it entails skillful manipulation and a high degree of uncertainty, challenges that industrial robots cannot easily overcome. Novel collaborative robots, on the other hand, are able to work alongside humans with the required precision and increased confidence, as evidenced by studies on collaborative device assembly by small human-robot teams [1]. However, despite attention to small-scale HRC, the problem of orchestrating multiple human-robot teams has not yet been sufficiently addressed.
In WEEE disassembly, device dismantling takes place on workstations organized within a factory floor, where device parts are separated and relevant components extracted through a series of operations. Optional precedence relationships exist between operations, enabling parallel operation execution in some cases within a single task. Each agent involved in the process has different skill levels with respect to the operations required for task completion, and some agents lack certain skills altogether. Mobility of agents imposes temporal constraints that are in direct relation to the topological organization of the shopfloor and agent mobility characteristics. In addition, while some agents such as humans and AGVs possess mobility, typical robotic manipulators are fixed to workstations and thus immobile, forcing tasks to be assigned to a limited number of workstations.
In our optimization problem we consider two objectives, namely minimization of makespan and minimization of disassembly cost. Minimization of makespan ensures that all operations are performed in a timely manner taking full advantage of the shopfloor capabilities. Disassembly cost is defined as the sum of labor cost and robot operation cost, for the periods that an agent (human or robot) is executing an operation. Outside these periods agents are considered not to incur costs.
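The two objectives can be illustrated with a minimal Python sketch. It assumes a finished schedule is available as a list of operation records; the record type `ScheduledOp`, the agent names and the cost rates are hypothetical and chosen only for illustration. Cost is accrued only while an agent executes an operation, as stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScheduledOp:
    """One scheduled operation (hypothetical record type for illustration)."""
    task: str
    op: str
    agent: str
    start: float
    end: float


def makespan(schedule):
    """Completion time of the last operation over all tasks."""
    return max((s.end for s in schedule), default=0.0)


def disassembly_cost(schedule, cost_rate):
    """Sum of agent cost per unit time over the periods agents are busy."""
    return sum(cost_rate[s.agent] * (s.end - s.start) for s in schedule)


# Illustrative data only: one skilled worker and one cheaper, slower robot.
schedule = [
    ScheduledOp("t1", "unscrew", "human_1", 0.0, 2.0),
    ScheduledOp("t1", "extract", "robot_1", 2.0, 6.0),
    ScheduledOp("t2", "unscrew", "human_1", 2.0, 4.0),
]
cost_rate = {"human_1": 1.0, "robot_1": 0.05}

print("makespan:", makespan(schedule))
print("cost:", disassembly_cost(schedule, cost_rate))
```

In this toy schedule the robot extends the makespan while adding little cost, which is the kind of tradeoff the two objectives are meant to capture.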

B. MILP Formulation
The following problem inputs are defined:
• $T$, a set of tasks.
• $A$, a set of agents, including humans and robots.
• $C$, a set of operations relevant to the tasks at hand.
• $K$, a matrix of values denoting times to complete an exemplar operation corresponding to a skill, indexed by agent and skill, $K \in \mathbb{R}^{|A|,|C|}$. Lower values denote better times, thus higher skill level. If agent $a$ does not possess the skill relevant to operation $c$, then $K_{a,c} = \infty$.
• $V$, a list of values denoting achievable agent speed for inter-workstation mobility, indexed by agent, $V \in \mathbb{R}^{|A|}$.
• $W$, a set of workstations that are suitable for carrying out any one of the operations defined in $C$.
• $Z_t$, a list comprising required operations in sequence, indexed by task, $t \in T$, $Z_{ti} \in C \ \forall i \in \{1, \dots, |Z_t|\}$.
• $\Psi_t$, a matrix of boolean values denoting requirements on operation precedence. For any two operations $i$ and $j$ in task $t$, if $i$ precedes $j$, then $\Psi_{t;ij} = 1$.
• $F$, a matrix of boolean values, indexed by agent and workstation, indicating whether an agent is allowed at a workstation. If agent $a$ is allowed at workstation $w$, then $F_{a,w} = 1$, else $F_{a,w} = 0$, with $a \in A$, $w \in W$, $F \in \mathbb{Z}_2^{|A|,|W|}$.
• $D_{w,w'}$, a matrix of inter-workstation travel distances.
• $Q_a$, a vector of agent costs per unit time, indexed by agent.
In addition, the following decision variables are defined:
• $C_{\max} \in \mathbb{R}^+$, the makespan, i.e. total processing time for all tasks.
• $Q_{tot} \in \mathbb{R}^+$, the total production cost, which translates to the sum of cost per unit time for each of the involved agents and operations.
• $\tau^S_{c,t} \in \mathbb{R}^+$, starting time of operation $c$ in task $t$.
• $\tau^E_{c,t} \in \mathbb{R}^+$, ending time of operation $c$ in task $t$.
• $P_{(c_1,t_1),(c_2,t_2)} \in \{0, 1\}$, operation precedence matrix of binary values. If $c_1$ in $t_1$ precedes $c_2$ in $t_2$ then $P_{(c_1,t_1),(c_2,t_2)} = 1$, with $P \in \mathbb{Z}_2^{|Z|,|Z|}$.
• $S^A_{a,c,t} \in \{0, 1\}$, assignment of operation $c$ in task $t$ to agent $a$.
• $S^W_{w,c,t} \in \{0, 1\}$, assignment of operation $c$ in task $t$ to workstation $w$.
The Mixed-Integer Linear Problem is formulated as follows: Equations 2 and 3 ensure that all operations are assigned an agent and a workstation. Inequalities 4 and 5 constrain the makespan and cost values respectively. Inequalities 6, 7 and 8, 9 ensure that operations assigned to the same agents and workstations, respectively, do not overlap. Inequality 10 ensures that operation end times are assigned according to operation duration. Inequality 11 ensures that operations within a task that need to be ordered maintain start and end time order. Inequality 12 ensures that agents are only assigned to allowed workstations. $M$ is a large positive number for encoding conditional constraints.
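As a reading aid, the constraint descriptions above can be written out with the symbols of this section. The block below is a hedged sketch of one standard big-M form that matches those descriptions; it is an assumption, not necessarily the exact form of the numbered equations. It ignores inter-workstation travel time, shows the non-overlap pair for agents only (the workstation pair would be analogous with $S^W$), and treats $K_{a,c} = \infty$ as a sufficiently large finite constant. It assumes the `amsmath` package.

```latex
\begin{align*}
\min\ & \left( C_{\max},\; Q_{tot} \right)
  && \text{two objectives} \\
& \sum_{a \in A} S^A_{a,c,t} = 1
  && \text{each operation gets one agent} \\
& \sum_{w \in W} S^W_{w,c,t} = 1
  && \text{each operation gets one workstation} \\
& C_{\max} \ge \tau^E_{c,t}
  && \text{makespan bounds every end time} \\
& Q_{tot} \ge \sum_{t \in T} \sum_{c \in Z_t} \sum_{a \in A}
    Q_a \, K_{a,c} \, S^A_{a,c,t}
  && \text{cost accrued while busy} \\
& \tau^S_{c_2,t_2} \ge \tau^E_{c_1,t_1}
    - M \left( 3 - P_{(c_1,t_1),(c_2,t_2)} - S^A_{a,c_1,t_1} - S^A_{a,c_2,t_2} \right)
  && \text{same agent, } (c_1,t_1) \text{ first} \\
& \tau^S_{c_1,t_1} \ge \tau^E_{c_2,t_2}
    - M \left( 2 + P_{(c_1,t_1),(c_2,t_2)} - S^A_{a,c_1,t_1} - S^A_{a,c_2,t_2} \right)
  && \text{same agent, } (c_2,t_2) \text{ first} \\
& \tau^E_{c,t} \ge \tau^S_{c,t} + \sum_{a \in A} K_{a,c} \, S^A_{a,c,t}
  && \text{end time follows duration} \\
& \tau^S_{j,t} \ge \tau^E_{i,t} - M \left( 1 - \Psi_{t;ij} \right)
  && \text{required precedence within a task} \\
& S^A_{a,c,t} + S^W_{w,c,t} \le 1 + F_{a,w}
  && \text{agent allowed at workstation}
\end{align*}
```

In the non-overlap pair, the big-M term vanishes only when both operations are assigned to agent $a$ and the precedence variable selects that ordering, so exactly one of the two inequalities is active for any pair sharing an agent.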

A. Architecture
The proposed approach is organized in a two-level hierarchical architecture. At the high level, a MOEA evolves a population of solutions. Solution genomes represent assignments of resources, namely human and robot agents and workstations, for each of the tasks. In addition, each solution contains information that determines the order in which each task will be considered for scheduling by the low-level scheduler. For each of the prescribed tasks, the corresponding resource assignments form an abstraction over the detailed schedule that is termed Adaptive Workcell, or aCell. The importance of the aCell is that it abstracts away information regarding the detailed schedule from the metaheuristic search. In this sense, it simplifies the search space by offloading detailed scheduling decisions to the low-level scheduler.
The low-level scheduler processes each individual of the global search population in the sequence determined at high level, greedily assigning operations in a fashion similar to dispatch rules [19], however with two notable differences: i. the scheduler considers potential time slots earlier than the last scheduled operation, and, ii. the dispatch rule itself is adaptive, with adaptation parameters being part of the individual genome. The latter allows the low-level scheduler to accommodate varying objectives with different priority, while the balance of objectives becomes itself a matter of evolution. The performance of each individual according to both objectives is measured on the basis of the final schedule derived by the low-level scheduler.

B. Global Search
Global search occurs in the combined state space of resource assignment and task scheduling order. We employ a MOEA, wherein the encoding of each individual reflects resource availability, order of scheduling for each of the tasks defined in the problem, and parameter values for the task-level assignment. One advantage of using a population-based optimization algorithm is that in real-world problems with multiple objectives, such as the one presented in this paper, obtaining a set of Pareto-optimal solutions is straightforward, which is not the case with a trajectory-based algorithm such as VNS [11]. The overall scheme of the proposed search algorithm is available in Fig. 2. The encoding comprises four parts: a vector that determines the scheduling order of each task using Random Key encoding [20]; a matrix of agents by tasks that determines agent constraints for each task; a matrix of workstations by tasks that determines workstation constraints; and a vector of task-level assignment parameters. The total length of the encoding is $\lambda = |T| \times (1 + |S^A| + |S^W|) + |p|$, where $p$ corresponds to the vector of task-level scheduling parameters, in our case $|p| = 4$, as will be discussed in section IV-C.
Fig. 1: Architecture overview of the proposed HRC scheduling approach
In our experiments we combine aCell with two different MOEAs, namely NSGA-II [21] and MOEA/D [17], in order to draw comparative results and evaluate the contribution of the aCell approach. NSGA-II employs tournament selection. Two constraints are defined, which correspond to equations 2 and 3. The constraints are evaluated on the basis of the definite schedule that is generated for each individual. We handle constraints through a tournament operator that employs lexicographical ordering of solutions, as proposed in [22].
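The four-part encoding can be sketched in Python as follows. This is a minimal illustration under stated assumptions: $|S^A|$ and $|S^W|$ are taken to be the number of agents and workstations, genes are real-valued keys in [0, 1), tasks are scheduled in ascending key order, and the k largest keys in a row select the resources of a task. The function names and the part layout within the flat vector are hypothetical.

```python
import random


def genome_length(n_tasks, n_agents, n_workstations, n_params=4):
    """Length of the flat encoding: |T| * (1 + agents + workstations) + |p|."""
    return n_tasks * (1 + n_agents + n_workstations) + n_params


def split_genome(genome, n_tasks, n_agents, n_workstations, n_params=4):
    """Cut a flat genome into order keys, agent keys, workstation keys, params."""
    expected = genome_length(n_tasks, n_agents, n_workstations, n_params)
    if len(genome) != expected:
        raise ValueError(f"expected {expected} genes, got {len(genome)}")
    pos = n_tasks
    order_keys = genome[:pos]
    agent_keys = [genome[pos + t * n_agents: pos + (t + 1) * n_agents]
                  for t in range(n_tasks)]
    pos += n_tasks * n_agents
    ws_keys = [genome[pos + t * n_workstations: pos + (t + 1) * n_workstations]
               for t in range(n_tasks)]
    pos += n_tasks * n_workstations
    params = genome[pos:]
    return order_keys, agent_keys, ws_keys, params


def decode_task_order(order_keys):
    """Random Key decoding: sort task indices by their key (ascending here)."""
    return sorted(range(len(order_keys)), key=lambda t: order_keys[t])


def top_k(keys, k):
    """Indices of the k largest keys, i.e. the resources kept for one task."""
    return sorted(range(len(keys)), key=lambda j: keys[j], reverse=True)[:k]


rng = random.Random(0)
n_tasks, n_agents, n_ws = 3, 4, 2
genome = [rng.random() for _ in range(genome_length(n_tasks, n_agents, n_ws))]
order_keys, agent_keys, ws_keys, params = split_genome(genome, n_tasks, n_agents, n_ws)
print("task order:", decode_task_order(order_keys))
print("agents kept for task 0:", top_k(agent_keys[0], k=2))
```

Because every gene is a real number, standard real-coded variation operators always produce a decodable genome, which is the usual motivation for Random Key encodings.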
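A lexicographic, feasibility-first tournament can be sketched as below. This is an illustrative assumption about how such an operator may look, not the exact rule of [22]: individuals are compared first by total constraint violation and only then by Pareto dominance on the objectives. The dictionary keys `violation` and `objectives` are hypothetical, and the rank and crowding-distance tie-breaks of NSGA-II are omitted.

```python
import random


def dominates(f, g):
    """True if objective vector f Pareto-dominates g (minimization)."""
    return all(a <= b for a, b in zip(f, g)) and any(a < b for a, b in zip(f, g))


def better(ind_a, ind_b):
    """Lexicographic comparison: constraint violation first, then dominance."""
    if ind_a["violation"] != ind_b["violation"]:
        return ind_a["violation"] < ind_b["violation"]
    return dominates(ind_a["objectives"], ind_b["objectives"])


def tournament(population, rng):
    """Binary tournament; ties are broken uniformly at random."""
    a, b = rng.sample(population, 2)
    if better(a, b):
        return a
    if better(b, a):
        return b
    return rng.choice([a, b])


# Illustrative individuals: objectives are (makespan, cost).
population = [
    {"violation": 0.0, "objectives": (12.0, 5.0)},
    {"violation": 0.0, "objectives": (10.0, 4.0)},
    {"violation": 2.0, "objectives": (6.0, 1.0)},  # infeasible, loses to feasible
]
winner = tournament(population, random.Random(1))
print(winner)
```

With this ordering an infeasible individual can never win against a feasible one, regardless of how good its objective values are.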

C. Low-Level Operation Scheduling
The low-level operation scheduling algorithm operates by following a myopic optimization rule that schedules each operation within a task, going through tasks one at a time. The low-level objective is a weighted summation of four criteria that bear similarity to dispatch rules [19]. The low-level scheduler maintains a series of feasible timeslots for each agent and workstation, defined by the quintuplet $(a, w, \tau^S, t, i)$, where $a$ is an agent, $w$ is a workstation, $\tau^S$ is the starting time of the timeslot, $t$ is the corresponding task and $i$ is the operation index. The optimal scheduling time slot is defined by equation (13). The weights $w_j$ are part of the genome of each individual in the global search population, and thus are subject to evolution themselves. Each time an operation gets assigned to a timeslot, the occupied timeslot is replaced with one or more non-overlapping timeslots, depending on the temporal proximity of preceding and following operations. Even though this process increases the computational complexity of the scheduling operation, we have observed that by limiting the workstation and agent assignment seats available to the low-level scheduler only to the k highest random key values, it is possible to manage complexity with minimal impact on solution quality.
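The slot bookkeeping and weighted scoring can be sketched in Python. This is a sketch under stated assumptions: the four criteria used here (start time, duration, cost, travel time) are hypothetical placeholders, since the text only states that four dispatch-rule-like criteria are summed with evolved weights, and the best slot is assumed to be the one with the lowest score.

```python
import math


def split_slot(slot_start, slot_end, op_start, op_end):
    """Replace an occupied free slot by the non-overlapping remainders."""
    if not (slot_start <= op_start <= op_end <= slot_end):
        raise ValueError("operation must lie inside the free slot")
    remainders = []
    if op_start > slot_start:
        remainders.append((slot_start, op_start))  # gap before the operation
    if op_end < slot_end:
        remainders.append((op_end, slot_end))      # gap after the operation
    return remainders


def score(criteria, weights):
    """Weighted sum of criteria values; lower is assumed to be better."""
    return sum(w * c for w, c in zip(weights, criteria))


def pick_best(candidates, weights):
    """Greedy choice among candidate (label, criteria) pairs."""
    return min(candidates, key=lambda cand: score(cand[1], weights))


# Hypothetical criteria per candidate: (start, duration, cost, travel time).
candidates = [
    (("human_1", "ws_1"), (0.0, 2.0, 2.0, 1.0)),
    (("robot_1", "ws_2"), (1.0, 4.0, 0.2, 0.0)),
]
print(pick_best(candidates, weights=(1.0, 1.0, 0.0, 0.0))[0])  # time-focused
print(pick_best(candidates, weights=(0.0, 0.0, 1.0, 0.0))[0])  # cost-focused
print(split_slot(0.0, math.inf, 3.0, 5.0))
```

Changing the weights changes which candidate wins, which is how evolving the weights in the genome lets the same greedy rule serve different objective priorities. Because a slot can be split around an operation, earlier gaps remain available to later operations rather than only the time after the last scheduled operation.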

V. EXPERIMENTAL RESULTS
We introduce a new class of benchmark problems that account for idiosyncrasies of the shopfloor HRC orchestration problem, in the context of device disassembly. The premise is that, overall, robots exhibit lower skill levels compared to skilled human workers and are only capable of some operations; at the same time, the operation cost per unit time of a robotic agent is significantly lower than the labor cost per unit time of a skilled worker. Thus, the tradeoff between cost and makespan is established. We consider that some robots are fixed, while humans can move among workstations, and define five instances representing increasingly complex orchestration scenarios. Labor cost assumes values $q_h \sim U(1.0, 1.1)$ and robot operation cost is $q_r = 0.05$. Workstations are topologically arranged in a single row with neighbor distance of $d = 1.0$. The attributes of the proposed problem instances are available in Table I.
The proposed aCell approach is compared with Random Key encoding NSGA-II and MOEA/D in minimizing makespan and production cost. The population size is 200 in all cases and each algorithm was allowed to run for a total of 500 generations. For each problem instance, the mean and best of each objective value were established over a series of 10 experimental runs. The algorithms are implemented in Python and make use of the PAGMO framework [23]. The results are presented in Table II and an indicative objective space distribution of each algorithm is available in Fig. 3.
Results indicate that the aCell configuration is advantageous over the pure random key encoding. For all but one problem instance, both aCell algorithms (aCell-NSGAII and aCell-MOEA/D) achieve mostly dominant results over their pure random key counterparts, which suggests that the advantage of aCell is independent of algorithm selection. In addition, in most cases aCell converges faster to the Pareto front. NSGA-II performs better than MOEA/D in all problem instances except C2, in both the aCell and pure random key cases. Interestingly, in some larger problem instances aCell solutions dominate the makespan objective, while Random Key solutions dominate the cost objective. It is possible that this is due to the limitations imposed as described in subsection IV-C. In this respect, one step towards achieving a wider Pareto front would be the relaxation of agent selection limits for the low-level scheduler, at the expense of computation time.
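The notion of dominance used in this comparison can be made concrete with a short Python sketch that filters a set of (makespan, cost) points down to its non-dominated subset. The points below are made-up illustrative values, not results from the experiments, and the quadratic-time filter is chosen for clarity rather than efficiency.

```python
def dominates(f, g):
    """True if f is no worse than g in every objective and better in one."""
    return all(a <= b for a, b in zip(f, g)) and any(a < b for a, b in zip(f, g))


def pareto_front(points):
    """Keep the points that no other point dominates (minimization)."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]


# Illustrative (makespan, cost) pairs only.
points = [(10.0, 5.0), (12.0, 4.0), (11.0, 6.0), (15.0, 3.5)]
print(pareto_front(points))
```

Here (11.0, 6.0) is removed because (10.0, 5.0) is better in both objectives, while the remaining three points trade makespan against cost and are mutually non-dominated.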

VI. CONCLUSIONS
This paper presented a new optimization approach for addressing human-robot collaborative (HRC) orchestration problems, focusing on the application field of device disassembly. The proposed problem definition is derived from real-world use cases to address the specific aspects that a large-scale collaborative human-robot operation presents. The proposed optimization approach introduces the concept of the Adaptive Workcell (aCell), which proves beneficial in reducing search complexity. By considering operations within a task in a continuous sequencing order, we prioritize solutions that favor task coherency, both in the temporal and in the spatial domain, effectively reducing the complexity of search while still identifying feasible and favorable solutions. The proposed optimization approach achieves favorable results compared to two Random Key-coded MOEAs, and produces schedules that minimize undesirable aspects such as agent mobility while maintaining task coherency.