An Effective PSO-inspired Algorithm for Workflow Scheduling

ABSTRACT


INTRODUCTION
The Cloud is a computing platform that provides convenient, on-demand access to a shared pool of configurable computing resources such as networks, servers and storage [1]. Workflow scheduling is one of the challenges that the Cloud must tackle, especially when a large number of tasks are executed on geographically distributed servers. This demands a reasonable scheduling algorithm in order to attain a minimal completion time (called the makespan).
The rest of the paper is organized as follows. Section 2 reviews related work on workflow scheduling algorithms. Section 3 describes the computation and communication model on which Cloud tasks operate. Based on this model, Section 4 presents our proposed scheduling algorithm, LPSO (Local-search Particle Swarm Optimization). Section 5 describes the experiments we conducted using the CloudSim simulation tool [2] to evaluate the proposed algorithm. Section 6 concludes the paper and sketches future work.
In general, the scheduling problem, i.e., the mapping of tasks to computation resources such as servers, is NP-complete [3]. Hence, past work has relied mostly on heuristic-based solutions for scheduling workflows.
For example, S. Parsa [4] proposed a scheduling algorithm that minimizes the makespan of the workflow in the Grid environment. A. Agarwal [5] studied a greedy algorithm that assigns appropriate priority sequence numbers to tasks. J. Huang [6] proposed a workflow task scheduling algorithm based on a genetic algorithm. S. Pandey [7] presented an effective scheduling algorithm (PSO_H) to minimize the cost of execution. R. Buyya [2] presented a brief description of CloudSim, the simulation toolkit that is used in this paper to simulate the execution of tasks under different scheduling policies.
J. Jintao [8] proposed a task scheduling algorithm based on service quality and the advantages of the Min-min algorithm. Guo-Ning and Ting-Lei [9] presented an optimized algorithm for task scheduling based on Hybrid Genetic Algorithms. Their study covered QoS requirements such as completion time, bandwidth, cost, distance and reliability for different types of tasks. L. Guo [10] presented a model for task scheduling in the Cloud that minimizes the overall time of execution and transmission, and proposed a PSO (Particle Swarm Optimization) algorithm based on the small position value rule. R. Rajkumar [11] proposed a hierarchical scheduling algorithm that helps satisfy service level agreements with a quick response from the service provider. S. J. Xue [12] proposed a hybrid PSO algorithm to minimize the execution cost of the workflow; crossover and mutation from genetic algorithms are embedded into the PSO algorithm to improve the global search. J. Liu et al. [13] presented the components of an intelligent job scheduling system in cloud computing.

The particle swarm optimization method
Particle Swarm Optimization (PSO) is an evolutionary optimization technique introduced in 1995 by Kennedy and Eberhart [14]. Many later studies build on the PSO strategy, such as [15], [16]. The formula for updating the position vector is as follows:

$$v_i^{k+1} = \omega \, v_i^{k} + c_1 \, rand_1 \, (pbest_i - x_i^{k}) + c_2 \, rand_2 \, (gbest - x_i^{k}) \qquad (1)$$

$$x_i^{k+1} = x_i^{k} + v_i^{k} \qquad (2)$$

where
a. $v_i^{k}$, $v_i^{k+1}$: velocity of particle i at iterations k and k+1
b. $x_i^{k}$, $x_i^{k+1}$: position of particle i at iterations k and k+1
c. $\omega$: inertia weight; $c_1$, $c_2$: acceleration coefficients
d. $rand_1$, $rand_2$: random numbers between 0 and 1
e. $pbest_i$: best position of particle i; $gbest$: position of the best particle in the population

The goal of PSO is to find the position that minimizes the fitness function, denoted by: Fitness(gbest) → Min
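As a minimal sketch of one update for a single particle, the following function applies the velocity and position rules to plain Python lists. The function name `pso_step` and the default values of `w`, `c1` and `c2` are illustrative assumptions, not values taken from this paper.

```python
import random


def pso_step(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=random):
    """One velocity/position update for a single particle.

    x, v, pbest and gbest are equal-length lists of floats.
    w, c1 and c2 are illustrative defaults (assumptions), not the paper's settings.
    """
    r1, r2 = rng.random(), rng.random()  # random numbers in [0, 1)
    # Velocity: inertia term + pull toward pbest + pull toward gbest.
    new_v = [
        w * vi + c1 * r1 * (pb - xi) + c2 * r2 * (gb - xi)
        for xi, vi, pb, gb in zip(x, v, pbest, gbest)
    ]
    # Position: move by the current velocity v, following the x + v form used in
    # the text; many implementations use the freshly updated new_v here instead.
    new_x = [xi + vi for xi, vi in zip(x, v)]
    return new_x, new_v
```

When a particle already sits on both pbest and gbest, the two attraction terms vanish and only the inertia term changes the velocity.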

Topological neighborhood for the PSO
In the standard PSO, all particles are directly connected to each other, so there are no neighborhood relationships between them. The position of each particle is updated according to its personal best position (pbest) and the global best position among all the particles (gbest). However, various personal relationships, such as parent-child relationships, do exist in the real world. This led some researchers [17] to propose topological neighborhoods between particles in PSOs. They applied various topological neighborhoods, such as the Ring neighborhood and the von Neumann neighborhood, in which each particle shares its local best position with neighboring particles in the topological space. Each particle is therefore affected by the local best (lbest) in its local neighborhood instead of pbest. In PSOs that use a local best position, the formula for updating the position vector is

$$v_i^{k+1} = \omega \, v_i^{k} + c_1 \, rand_1 \, (lbest_i - x_i^{k}) + c_2 \, rand_2 \, (gbest - x_i^{k}) \qquad (3)$$

where $lbest_i$ is the local best position of particle i, i.e. the position with the best fitness value among its neighbors.
As shown in Figure 1, the neighborhood relationships are determined based on each topology. For example, in the Ring topology, each particle has k neighbors. In this paper we set k=2 so each particle xi connects directly to its left-neighbor (Left(xi)) and its right-neighbor (Right(xi)). Based on the Ring topology, we build a searching function described as follows
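A minimal sketch of the Ring-neighborhood lookup alone, with k = 2, might look as follows. It is not the searching function itself. The names `ring_neighbors` and `local_best` are hypothetical, and counting the particle itself among the lbest candidates is an assumption.

```python
def ring_neighbors(i, swarm_size):
    """Indices of the left and right neighbors of particle i on a ring (k = 2)."""
    return (i - 1) % swarm_size, (i + 1) % swarm_size


def local_best(i, positions, fitness):
    """Best position among particle i and its two ring neighbors.

    positions: one candidate position per particle, e.g. each particle's
               best-known position.
    fitness:   callable returning the value to minimize (lower is better).
    Including particle i itself among the candidates is an assumption.
    """
    left, right = ring_neighbors(i, len(positions))
    best = min((left, i, right), key=lambda j: fitness(positions[j]))
    return positions[best]
```

The modulo arithmetic closes the ring, so the first particle's left neighbor is the last particle.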

PROBLEM FORMULATION
We denote the workflow as a Directed Acyclic Graph (DAG) G = (V, E), where:
a. V is the set of vertices; each vertex represents a task
b. T = {T1, T2, …, TM} is the set of tasks, where M is the number of tasks
c. E represents the data dependencies between these tasks. An edge (Ti, Tj) ∈ E means that task Ti is the parent of task Tj: the data produced by Ti is consumed by Tj.
The communication time between tasks Ti and Tj is

Formally, we seek to minimize the execution time of the workflow:

makespan → min

where the execution time, called the makespan, is the time difference between the start and the finish of the workflow's sequence of tasks.
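To make the objective concrete, the sketch below computes the makespan of a DAG for a given task-to-server assignment. The cost model is an assumption made only for illustration: execution time is taken as workload divided by server power, transfer time as data size divided by bandwidth (zero when both tasks run on the same server), and each server runs one task at a time in the given topological order. The function name and all parameter names are hypothetical.

```python
def makespan(tasks, edges, assignment, workload, power, data, bandwidth):
    """Finish time of the last task under a given task-to-server assignment.

    tasks:      task ids in topological order (parents before children).
    edges:      list of (parent, child) pairs.
    assignment: dict mapping task -> server.
    Assumed model (hypothetical): execution time = workload / power,
    transfer time = data / bandwidth, zero when both tasks share a server.
    """
    parents = {t: [] for t in tasks}
    for p, c in edges:
        parents[c].append(p)

    finish = {}
    server_free = {}  # time at which each server becomes idle
    for t in tasks:
        s = assignment[t]
        ready = server_free.get(s, 0.0)
        for p in parents[t]:
            sp = assignment[p]
            transfer = 0.0 if sp == s else data[(p, t)] / bandwidth[(sp, s)]
            # A task cannot start before every parent's data has arrived.
            ready = max(ready, finish[p] + transfer)
        finish[t] = ready + workload[t] / power[s]
        server_free[s] = finish[t]
    return max(finish.values())
```

A scheduler would call such a function as the fitness of a candidate assignment and prefer assignments with smaller return values.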

PROPOSED ALGORITHM

Escaping local extrema
During their execution, PSO-based algorithms may get trapped in local extrema. Our idea for escaping such local extrema is as follows: when the swarm falls into the area around a local extremum, we combine a PSO that uses a topological neighborhood with a neighborhood searching function [18] that moves the particles to a new area.

Variable Neighborhood Searching Function
In order to help the swarm escape from the area around a local extremum, we devised two operators named Exchange and RotateRight, as illustrated in Figure 2, and built a Variable_Neighborhood_Searching function based on these operators.
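The exact behavior of the two operators is defined by Figure 2. As a hedged illustration based only on their names, the sketch below assumes that Exchange swaps two elements of a position vector and that RotateRight shifts all elements cyclically to the right. Both function bodies and their parameters are assumptions.

```python
def exchange(position, i, j):
    """Return a copy of the position vector with elements i and j swapped.

    Assumed reading of the Exchange operator (hypothetical).
    """
    out = list(position)
    out[i], out[j] = out[j], out[i]
    return out


def rotate_right(position, steps=1):
    """Return a copy of the position vector rotated right by `steps` places.

    Assumed reading of the RotateRight operator (hypothetical).
    """
    n = len(position)
    if n == 0:
        return []
    steps %= n
    # position[n - steps:] is the tail that wraps around to the front.
    return list(position[n - steps:]) + list(position[:n - steps])
```

Both operators keep the multiset of server assignments unchanged and only alter which task receives which server, so under this reading they produce a neighboring schedule rather than a random restart.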

The LPSO algorithm
The LPSO algorithm can be described as follows. In each iteration, LPSO updates the position vectors of the particles based on gbest and lbest, using formulas (2) and (3). If the deviation of gbest is less than ε during K consecutive generations, the swarm is trapped in a local extremum area, and hence the function Variable_Neighborhood_Searching( ) should be called. This function moves (migrates) the swarm to a new area and produces a new generation.
If gbest is not improved significantly, i.e. the deviation of gbest is still less than ε after K consecutive generations,
In the worst case, LPSO always finds a better position after the function Variable_Neighborhood_Searching( ) is executed, without getting trapped in a local extremum, rendering LPSO an exhaustive search. Our default threshold for the number of generations is 300. LPSO stops upon reaching this threshold.
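The control flow described above can be sketched as a driver loop that counts stagnant generations and triggers an escape step. This is a sketch under stated assumptions: the "deviation of gbest" is interpreted as the absolute change in gbest's fitness between consecutive generations, and `step`, `escape` and `fitness_of_gbest` are hypothetical callbacks standing in for one PSO generation, the Variable_Neighborhood_Searching function, and a gbest fitness lookup.

```python
def run_with_stagnation_escape(step, escape, fitness_of_gbest,
                               epsilon, k, max_generations=300):
    """Run at most max_generations, calling escape() after k stagnant generations.

    step():             performs one PSO generation (hypothetical callback).
    escape():           stands in for Variable_Neighborhood_Searching (hypothetical).
    fitness_of_gbest(): returns the current fitness of gbest.
    Returns the final gbest fitness and the number of escape calls.
    """
    stagnant = 0
    escapes = 0
    previous = fitness_of_gbest()
    for _ in range(max_generations):
        step()
        current = fitness_of_gbest()
        # Count consecutive generations in which gbest changed by less than epsilon.
        stagnant = stagnant + 1 if abs(previous - current) < epsilon else 0
        previous = current
        if stagnant >= k:
            escape()
            escapes += 1
            stagnant = 0
            previous = fitness_of_gbest()
    return previous, escapes
```

The default of 300 generations mirrors the stopping threshold stated in the text; ε and K are left as required arguments because their values are not given here.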

RESULTS AND DISCUSSION
We conducted experiments to compare the performance of the LPSO algorithm with that of two others, namely PSO_H [7] and Random [19]. Our experimental setup consists of a computer with an Intel Core i5 2.2 GHz processor, 4 GB of RAM, and Windows 7 Ultimate. The experiments were carried out using the CloudSim simulation package, the Jswarm library [20] and Java.

Problem instance
We use both random and real-world instances in our experiments, based on the following data sets:
a. The computation power of the servers and the bandwidth of the connections between servers are collected from Cloud providers such as Amazon [21] and their Web sites (e.g., http://aws.amazon.com/ec2/pricing)
b. The sets of working data are collected from the Montage project [22]
We denote the ratio of the number of edges to the number of vertices of graph G as follows:

Results
Each problem instance was executed 30 consecutive times. The results summarized in Table 1 show that the mean value (column Mean) and the standard deviation (column STD) of LPSO are better than those of PSO_H [7] and Random [19] in most cases. When the number of servers (N) and the number of tasks (M) are relatively large (i.e. a larger-scale cloud), for example M=20, N=8; M=25, N=8; M=50, N=8, LPSO outperforms PSO_H and Random with respect to all metrics: mean, standard deviation and best value (column Best).
Since the number of servers (N) is a finite integer, the elements of the position vector, denoted by $x_i^{k}[t]$ (t = 1..M), must be integers too. In Equation (2), the left-hand side $x_i^{k+1}$ is an integer, while the right-hand side ($x_i^{k} + v_i^{k}$) is a real number. Pandey [7] resolved this by rounding the real value of the right-hand side to the nearest integer. For example, if $x_i^{k}[t] + v_i^{k}[t] = 3.2$, then task Tt is assigned to server S3; if $x_i^{k}[t] + v_i^{k}[t] = 3.8$, then Tt is assigned to server S4. Inevitably, this introduces a degree of randomness into the assignment of servers in the PSO_H algorithm [7], and hence it cannot maintain the diversity of the swarm. For this reason, PSO_H often gets trapped in local extrema.
Alternatively, we propose a novel method in which we assign the left-hand side $x_i^{k+1}$ to the server whose computation power is the closest to ($x_i^{k} + v_i^{k}$).
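The two discretization rules can be contrasted in a short sketch. The function names, the clamping of the rounded value to the valid server range, the 0-based index returned by the second function, and the example list of computation powers in the usage note are all assumptions made for illustration.

```python
def assign_by_rounding(value, num_servers):
    """Rounding rule: nearest integer server number, clamped to 1..num_servers.

    The clamping step is an assumption added to keep the result in range.
    """
    return min(max(round(value), 1), num_servers)


def assign_by_closest_power(value, powers):
    """Closest-power rule: 0-based index of the server whose computation
    power is nearest to the real value computed from the position vector.
    """
    return min(range(len(powers)), key=lambda j: abs(powers[j] - value))
```

With the rounding rule, 3.2 maps to server 3 and 3.8 maps to server 4, as in the example above. With hypothetical powers `[1.0, 2.5, 4.0, 8.0]`, the closest-power rule maps 3.8 to the server with power 4.0 and 3.2 to the server with power 2.5.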
In other words, the particle's new position is the one that assigns the task to the server whose computation power is closest to the real value computed from the position vector.
The results in Table 1 show that the mean value (column Mean) and the standard deviation (column STD) of LPSO are better than those of PSO_H [7] and Random [19] in most cases. The solutions of LPSO are smaller than those of PSO_H by 1% to 12%. The standard deviations of LPSO are smaller than those of PSO_H by 53% to 84%. These results show that LPSO is stable and better than both PSO_H [7] and Random [19]. Table 2 compares the makespan of LPSO with that of the other algorithms for Montage workflows (in seconds). Figures 3, 4, 5 and 6 depict the performance of the three algorithms, i.e. the proposed LPSO, PSO_H [7] and Random [19], where the vertical axis represents the makespan of the schedule in seconds. For each instance, we compare the best position vector (column BEST), the mean value (column MEAN) and the standard deviation (column STD). For the first instance, LPSO was even able to find the optimal solution.

CONCLUSION
The ultimate goal of any scheduling algorithm is to minimize the execution time. In this work, we showed that LPSO is advantageous as it averts getting trapped in local extrema. The contributions of our paper are: The experimental results show that LPSO is superior to its predecessor, especially when LPSO works in a larger-scale Cloud. In the future, we wish to investigate how to improve the LPSO algorithm in order to solve bigger instances within a reasonable makespan.