Effective ACO-Based Memetic Algorithms for Symmetric and Asymmetric Dynamic Changes

Ant colony optimization (ACO) algorithms have proved to be suitable for solving dynamic optimization problems (DOPs). The integration of local search operators with ACO has also proved to significantly improve the output of ACO algorithms. However, almost all previous works of ACO in DOPs do not utilize local search operators. In this work, the ${\mathcal M}{\mathcal A}{\mathcal X}-{\mathcal M}{\mathcal I}{\mathcal N}$ Ant System (${\mathcal M}{\mathcal M}$AS), one of the best ACO variations, is integrated with advanced and effective local search operators, i.e., the Lin-Kernighan and the Unstringing and Stringing heuristics, resulting in powerful memetic algorithms. The best solution constructed by ACO is passed to the operator for local search improvements. The proposed memetic algorithms aim to combine the adaptation capabilities of ACO for DOPs and the superior performance of the local search operators. The travelling salesperson problem is used as the base problem to generate both symmetric and asymmetric dynamic test cases. Experimental results show that the ${\mathcal M}{\mathcal M}$AS is able to provide good initial solutions to the local search operators especially in the asymmetric dynamic test cases.


I. INTRODUCTION
Ant colony optimization (ACO) algorithms have proved to be powerful problem-solving tools. They are able to provide optimal (or near-optimal) solutions for difficult combinatorial optimization problems (e.g., the travelling salesperson problem (TSP) [1]). Traditionally, researchers have focused their attention on static optimization problems, where the environment of the problem remains fixed during the optimization process of an algorithm. However, many real-world applications are subject to dynamic environments. Dynamic optimization problems (DOPs) are challenging since the aim of an algorithm is not only to locate the optimum of the problem quickly, but also to efficiently track the moving optimum when changes occur [2]. A dynamic change may involve factors like the objective function, input variables, problem instance, and constraints.
The integration of local search operators with ACO (or other metaheuristics), resulting in the so-called memetic algorithms, has been shown to significantly improve the output in static problems [5], [6] and, recently, in DOPs [14]. This is because ACO-based memetic algorithms can explore a local neighbourhood in the search space better than conventional ACO algorithms. In addition, ACO-based memetic algorithms take advantage of the global search capabilities of ACO to guide local search in neighbourhoods, leading to high quality solutions. Furthermore, ACO-based memetic algorithms in DOPs inherit the adaptation capabilities of ACO when dynamic changes occur. In this paper, we consider two advanced local search operators: 1) Lin-Kernighan (LK) [8], and 2) Unstringing and Stringing (US) [9]. Although the US operator has already been investigated in the ACO-based memetic algorithm in [7], [14], in this work the investigation is extended with comparisons against algorithms based on the advanced LK operator.
TSP is used as the base problem to systematically generate dynamic test cases [11]. Almost all of the dynamic test cases considered in the literature focused only on symmetric dynamic changes [7], [11], [18]. However, the real world is often asymmetric [12], [14]. For example, in transportation routing problems the time when driving in one direction is not necessarily the same in the opposite direction (e.g., it may be affected by different traffic conditions). Hence, in this work, we consider both symmetric and asymmetric dynamic changes. The rest of the paper is organized as follows. Section II describes the TSP and the construction of the dynamic test cases. Section III describes the memetic framework based on one of the best variations of ACO. The core logic of the two local search operators is also described to make the paper self-contained. Section IV gives the experimental results and analysis. Finally, Section V concludes this paper.

A. Base Problem Formulation
TSP is used as the base problem to generate dynamic test cases. Typically, a TSP instance is modelled by a fully connected weighted graph $G = (N, A)$, where $N = \{1, \ldots, n\}$ is a set of $n$ nodes and $A = \{(i, j) \mid i, j \in N, i \neq j\}$ is a set of arcs. For the classic TSP, nodes and arcs represent the cities and the links between them. Each arc $(i, j) \in A$ is associated with a non-negative value $w_{ij} \in \mathbb{R}^{+}$, which for the classic TSP represents the distance between nodes $i$ and $j$. For symmetric TSPs, these distances are independent of the direction of traversing the arcs, that is, $w_{ij} = w_{ji}$ for every pair of nodes. If $w_{ij} \neq w_{ji}$ for at least one pair of nodes, then the TSP becomes asymmetric. Every problem instance consists of a weight matrix, i.e., $W = \{w_{ij}\}_{n \times n}$, that contains all the weights associated with the arcs of the corresponding graph $G$.
978-1-7281-2153-6/19/$31.00 © 2019 IEEE

B. Generating Dynamic Test Environments
To generate dynamic test cases, the weight matrix of the problem instance becomes dynamic as follows [11]:
$$W(T) = \{w_{ij}(T)\}_{n \times n}, \tag{1}$$
where $T = \lceil t/f \rceil$ is the period of a dynamic change, $t$ is the evaluation count and $f$ is the frequency of change. Note that the introduced dynamic changes are synchronized with the optimization process of the algorithm. Hence, the parameter $f$ is expressed in algorithmic evaluations.
A particular dynamic test case can be generated by assigning an increasing or decreasing factor value to the arc connecting nodes $i$ and $j$ as follows:
$$w_{ij}(T+1) = \begin{cases} w_{ij}(0) + R_{ij}, & \text{if arc } (i, j) \in A_S(T), \\ w_{ij}(T), & \text{otherwise,} \end{cases}$$
where $w_{ij}(0)$ is the initial weight of the arc connecting nodes $i$ and $j$ (i.e., from the static problem instance when $T = 0$), $R_{ij}$ is a normally distributed random number (with zero mean and standard deviation set to $0.2 \cdot w_{ij}(0)$ [12]) that defines the modified factor value of the arc, $A_S(T) \subset A$ defines the set of arcs randomly selected for the change at that period and $T$ defines the environmental period index as defined in Eq. (1).
The size of the set $A$ is defined by the number of arcs (i.e., $n(n-1)$ for asymmetric cases and $n(n-1)/2$ for symmetric cases). Hence, the size of $A_S(T)$ is defined by the magnitude of change (i.e., $m \in [0, 1]$) and the size of $A$. For example, at period $T$, exactly $\lceil mn(n-1) \rceil$ and $\lceil mn(n-1)/2 \rceil$ arcs will be selected to change their weights in asymmetric and symmetric cases, respectively. The higher the value of $m$, the more arcs will be selected for changes. Note that the arcs are directed in asymmetric cases, whereas they are undirected in symmetric cases. Therefore, if $w_{ij}$ changes in symmetric cases, then $w_{ji}$ changes uniformly (i.e., $w_{ji} = w_{ij}$). On the contrary, when $w_{ij}$ changes in asymmetric cases, $w_{ji}$ will not change unless arc $(j, i)$ is also selected for a change (and then not necessarily uniformly with $w_{ij}$).
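As a rough illustration of the generator described above, the following Python sketch produces the weight matrix of the next period from the initial and the current matrices. The function name `next_environment`, the list-of-lists matrix layout, and the clamping of negative weights to zero are assumptions made for this example; they are not taken from [11] or [12].

```python
import math
import random


def next_environment(w0, w_cur, m, symmetric, rng):
    """Return the weights of period T+1, given w(0) and w(T) (hypothetical helper)."""
    n = len(w0)
    if symmetric:
        # Undirected arcs: n(n-1)/2 candidates.
        arcs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    else:
        # Directed arcs: n(n-1) candidates.
        arcs = [(i, j) for i in range(n) for j in range(n) if i != j]

    # Magnitude of change m fixes how many arcs are modified.
    selected = rng.sample(arcs, math.ceil(m * len(arcs)))

    w_next = [row[:] for row in w_cur]  # unselected arcs keep w(T)
    for i, j in selected:
        r = rng.gauss(0.0, 0.2 * w0[i][j])  # zero mean, std 0.2 * w_ij(0)
        new_w = max(0.0, w0[i][j] + r)      # assumption: keep weights non-negative
        w_next[i][j] = new_w
        if symmetric:
            w_next[j][i] = new_w            # mirror the change
    return w_next
```

Passing a seeded `random.Random` instance as `rng` makes a sequence of environments reproducible across algorithms, which matters when several algorithms are compared on the same test case.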
A particular solution $\pi = [\pi_1, \ldots, \pi_n]$ in the search space is specified by a permutation of the node indices, and for the dynamic TSP (DTSP), it is evaluated as follows:
$$\phi(\pi, t) = w_{\pi_n \pi_1}(T) + \sum_{i=1}^{n-1} w_{\pi_i \pi_{i+1}}(T).$$
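The evaluation of a tour under the current weight matrix can be sketched as below. The name `tour_cost` and the 0-based node indices are assumptions for this example. The small matrix shows why the direction of travel matters in asymmetric cases.

```python
def tour_cost(tour, w):
    """Sum the weights of consecutive arcs, including the arc closing the tour."""
    n = len(tour)
    return sum(w[tour[i]][tour[(i + 1) % n]] for i in range(n))


# Illustrative asymmetric weights: w[i][j] differs from w[j][i].
w = [
    [0, 1, 5],
    [2, 0, 3],
    [4, 6, 0],
]
forward = tour_cost([0, 1, 2], w)   # arcs 0->1, 1->2, 2->0
backward = tour_cost([0, 2, 1], w)  # the same cycle traversed in reverse
```

In a symmetric instance `forward` and `backward` would be equal; here they differ, so reversing a tour segment changes its cost.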

III. ACO-BASED MEMETIC ALGORITHMS
The ACO metaheuristic consists of a colony of $\omega$ ants that construct solutions and share their information among each other via their pheromone trails. One of the best performing ACO variations, i.e., the ${\mathcal M}{\mathcal A}{\mathcal X}-{\mathcal M}{\mathcal I}{\mathcal N}$ Ant System (MMAS) [13], is used in the memetic framework shown in Algorithm 1. Since the TSP is used as the base problem to generate dynamic test cases, it is used as a concrete example to describe the ACO-based memetic framework.

A. Constructing Solutions
Ants read pheromones to construct solutions and write pheromones to mark their constructed solutions. The probability with which ant $k$, currently at node $i$, will move to node $j$ is calculated as follows:
$$p_{ij}^{k} = \frac{[\tau_{ij}]^{\alpha}\,[\eta_{ij}]^{\beta}}{\sum_{l \in N_i^k} [\tau_{il}]^{\alpha}\,[\eta_{il}]^{\beta}}, \quad \text{if } j \in N_i^k,$$
where $\tau_{ij}$ and $\eta_{ij}$ are the existing pheromone trail and the heuristic information available a priori between nodes $i$ and $j$, respectively. The pheromone trails are initialized uniformly with a value $\tau_0$. The heuristic information is calculated as $\eta_{ij} = 1/w_{ij}(T)$, where $w_{ij}(T)$ is defined as in Eq. (1). $N_i^k$ is the set of unvisited nodes for ant $k$ adjacent to node $i$. $\alpha$ and $\beta$ are the two parameters which determine the relative influence of $\tau_{ij}$ and $\eta_{ij}$, respectively.
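The construction rule can be sketched as follows. The function names, the dictionary return type, and the requirement that all off-diagonal weights are strictly positive are assumptions of this example rather than details given in the text.

```python
import random


def transition_probabilities(i, unvisited, tau, w, alpha=1.0, beta=5.0):
    """Probability of moving from node i to each unvisited node j."""
    scores = {
        j: (tau[i][j] ** alpha) * ((1.0 / w[i][j]) ** beta)  # eta_ij = 1 / w_ij
        for j in unvisited
    }
    total = sum(scores.values())
    return {j: s / total for j, s in scores.items()}


def construct_tour(start, tau, w, rng, alpha=1.0, beta=5.0):
    """Build one ant's tour by repeated probabilistic node selection."""
    tour = [start]
    unvisited = [j for j in range(len(w)) if j != start]
    while unvisited:
        probs = transition_probabilities(tour[-1], unvisited, tau, w, alpha, beta)
        nodes = list(probs)
        nxt = rng.choices(nodes, weights=[probs[j] for j in nodes], k=1)[0]
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour
```

With uniform pheromone trails the choice is driven by the heuristic term alone, so a large $\beta$ makes the first iterations behave almost like a randomized nearest-neighbour construction.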

B. Applying Local Search Operator
Stützle and Hoos [13] applied local search operators to the iteration-best ant of MMAS after every iteration, whereas in [5], they further applied local search operators to all ants. Considering that local search operators are computationally expensive methods, such an extensive usage may not be very efficient for DTSP because the computation time naturally increases. As previously discussed, for DTSPs, algorithms must produce high quality solutions quickly [12]. In [14] the local search operator is applied to the best-so-far ant of MMAS only when a new best solution is found. This is because local search operators are typically executed until no further improvement is possible. In case a new best solution is not found, the local search is not applied because it would unnecessarily increase the computation time (and waste evaluations) trying to "improve" a solution for which basically no further improvement is possible. In this work we investigate two powerful local search heuristics designed for the TSP, i.e., LK and US heuristics, described below.
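The triggering policy of [14], as summarized above, can be sketched as follows. The function name and the callback-style interface (`cost`, `local_search`) are hypothetical and chosen only to keep the example independent of any particular operator.

```python
def apply_local_search_if_improved(tours, cost, local_search, best_tour, best_cost):
    """Run the local search only when the iteration-best beats the best-so-far."""
    iteration_best = min(tours, key=cost)
    if cost(iteration_best) >= best_cost:
        # No new best-so-far solution: skip the expensive operator.
        return best_tour, best_cost
    refined = local_search(iteration_best)
    if cost(refined) < cost(iteration_best):
        iteration_best = refined
    return iteration_best, cost(iteration_best)
```

In a full algorithm the returned tour would then be the one used for the pheromone update, and the evaluations spent inside `local_search` would be counted against the budget.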
1) Lin-Kernighan: The LK heuristic performs a series of $\lambda$-opt moves to transform a TSP tour into a shorter one [8]. A $\lambda$-opt move consists of the exchange of a set of $\lambda$ tour arcs by a set of $\lambda$ new arcs. The LK heuristic starts with two empty arc sets: $X$ (i.e., out-arcs) and $Y$ (i.e., in-arcs). At each step one arc that currently belongs to the tour will be added to $X$ and a new arc that does not belong to the tour will be added to $Y$. After the first step, the LK heuristic will favour arc insertions that result in a shorter complete TSP tour. When a new complete tour is achieved, the algorithm will begin a new phase of arc exchanges and this process will continue until there is no further improvement.
[Algorithm 1, final lines only: if ($\phi(\pi^{ib}, t) < \phi(\pi^{bs}, t)$) then ... PheromoneUpdate; end while; OUTPUT: $\pi^{bs}$ (best TSP solution)]
A set of rules that each step must follow was established in order to enhance the algorithm's efficiency, as follows:
• Each arc removed must share a node with its added counterpart. After the first arc exchange in each cycle, each arc being removed must also share a node with the previously added arc. Fig. 1 illustrates an example, where on the first step arc $(V_1, V_2)$ is removed and arc $(V_2, V_4)$, which shares the node $V_2$ with its removed counterpart, is added. On the second step arc $(V_3, V_4)$, which shares node $V_4$ with the previously added arc, is removed and arc $(V_3, V_1)$ is added, closing the tour.
• No exchanges that result in the tour being broken into multiple closed circuits are allowed. An example of this type of exchange is shown in Fig. 2, where arcs $(V_1, V_2)$ and $(V_3, V_4)$ are removed and arcs $(V_4, V_1)$ and $(V_3, V_2)$ are added. In this case, the addition of either of the arcs would not be accepted because it would result in a segment of the tour forming a cycle.
• Each pair of arcs exchanged must be gainful, meaning that each arc being added must be shorter than its removed counterpart.
• Once an arc is removed, until the tour is closed again, it cannot be added again in subsequent exchanges.
Although LK was originally designed for symmetric problems [8], it can also be applied to asymmetric problems by transforming an asymmetric weight matrix into a symmetric weight matrix (i.e., by doubling the nodes of the graph) [15], [22].
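One commonly used way to realize such a node-doubling transformation is sketched below. The text does not spell out the construction used in [15], [22], so the details here (a ghost copy of every node, a strongly negative weight on the arc linking a node to its ghost, and a large penalty on all remaining arcs) are assumptions, as are the function name `symmetrize` and the constant `big`.

```python
def symmetrize(c, big=10**6):
    """Map an n x n asymmetric matrix to a 2n x 2n symmetric one (assumed scheme).

    Node i keeps index i; its ghost copy gets index n + i.
    """
    n = len(c)
    size = 2 * n
    s = [[big] * size for _ in range(size)]  # forbidden arcs get a large penalty
    for a in range(size):
        s[a][a] = 0
    for i in range(n):
        # A node and its ghost are tied together by a very cheap arc.
        s[i][n + i] = s[n + i][i] = -big
        for j in range(n):
            if i != j:
                # Leaving i for j is modelled as: ghost of i -- real j.
                s[n + i][j] = s[j][n + i] = c[i][j]
    return s
```

Under this scheme a tour of the doubled graph that alternates real and ghost nodes (0, n+0, 1, n+1, ...) corresponds to a directed tour of the original graph, and its cost equals the original cost minus `n * big`.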
2) Unstringing and Stringing: The US heuristic is based on the removal (or unstringing) of nodes from the tour and their subsequent re-insertion (or stringing) [9]. The main feature of the algorithm is that the re-insertion of nodes can happen between non-adjacent nodes, resulting in a tour where both nodes become adjacent to the node being inserted. Suppose that we wish to insert $V_x$ between any two nodes $V_i$ and $V_j$. For a given orientation of a tour, consider $V_k$ a node in the subtour from $V_j$ to $V_i$, and $V_l$ a node in the subtour from $V_i$ to $V_j$. We also consider, for any node $V_h$ on the tour, $V_{h+1}$ its successor and $V_{h-1}$ its predecessor. The re-insertion of $V_x$ between $V_i$ and $V_j$ can be done in several ways using different types of insertions and removals. In [7], [9], only symmetric problem instances were considered and tackled with Type I and Type II removals (Fig. 3(a) and Fig. 3(b)) and Type I and Type II insertions (Fig. 4(a) and Fig. 4(b)). In [10], [14] another two types of removals and another two types of insertions were considered in order to tackle asymmetric problem instances: Type III and Type IV removals (Fig. 3(c) and Fig. 3(d)), and Type III and Type IV insertions (Fig. 4(c) and Fig. 4(d)).
The unstringing procedure removes a given node from the tour and repairs the connections with the remaining nodes in order to have a closed tour. The procedure consists of four types of removals as follows:
• Type I removal: Assume that $V_j$ belongs to the neighbourhood of $V_{i+1}$ and $V_k$ belongs to the neighbourhood of $V_{i-1}$, with $V_k$ being part of the subtour $(V_{i+1}, \ldots, V_{j-1})$. The removal of node $V_i$ results in the deletion of arcs [...] and $(V_j, V_{j+1})$; and the insertion of arcs [...]. Also, the subtours $(V_{i+1}, \ldots, V_k)$ and $(V_{k+1}, \ldots, V_j)$ are reversed.
• Type II removal: Assume that $V_j$ belongs to the neighbourhood of $V_{i+1}$, $V_k$ belongs to the neighbourhood of [...], and $V_l$ belongs to the neighbourhood of $V_{k+1}$, with $V_l$ being part of the subtour $(V_j, \ldots, V_{k-1})$. The removal of node $V_i$ results in the deletion of arcs [...] and $(V_l, V_{l+1})$; and the insertion of arcs [...]. As above, the subtours $(V_{i+1}, \ldots, V_{j-1})$ and $(V_{l+1}, \ldots, V_k)$ are reversed.
• Type III removal: Assume that $V_j$ belongs to the neighbourhood of $V_{i+1}$ and $V_k$ belongs to the neighbourhood of $V_{i-1}$, with $V_k$ being part of the subtour $(V_{i+1}, \ldots, V_{j-1})$. The removal of node $V_i$ results in the deletion of arcs [...] and the insertion of arcs $(V_{i+1}, V_j)$, $(V_{i-1}, V_k)$, and $(V_{j-1}, V_{k-1})$. As above, the subtours $(V_{j-1}, \ldots, V_k)$ and $(V_{i-1}, \ldots, V_j)$ are reversed.
• Type IV removal: Assume that $V_j$ belongs to the neighbourhood of $V_{i+1}$, $V_k$ belongs to the neighbourhood of $V_{i-1}$ with $V_k$ being part of the subtour $(V_{l+1}, \ldots, V_{i-2})$, and $V_l$ belongs to the neighbourhood of $V_{j-1}$ with $V_l$ being part of the subtour $(V_{j+1}, \ldots, V_{k-1})$. The removal of node $V_i$ results in the deletion of arcs [...], and $(V_k, V_{k+1})$; and the insertion of arcs $(V_k, V_{i-1})$, $(V_{j-1}, V_l)$, $(V_{k+1}, V_{l-1})$, and $(V_j, V_{i+1})$. As above, the subtours $(V_{k+1}, \ldots, V_{i-1})$ and $(V_j, \ldots, V_{l-1})$ are reversed.
The stringing procedure is basically the reverse of the unstringing procedure and consists of four types of insertions as follows:
• Type I insertion: Assume that $V_k \neq V_i$ and $V_k \neq V_j$. The insertion of $V_x$ results in the deletion of arcs $(V_i, V_{i+1})$, $(V_j, V_{j+1})$ and $(V_k, V_{k+1})$, and the insertion of arcs $(V_i, V_x)$, $(V_x, V_j)$, $(V_{i+1}, V_k)$ and $(V_{j+1}, V_{k+1})$. Also, the subtours $(V_{i+1}, \ldots, V_j)$ and $(V_{j+1}, \ldots, V_k)$ are reversed.
• Type III insertion: Basically, this type of insertion can be seen as the inverse of Type I insertion. When node $V_x$ is inserted between $V_i$ and $V_j$, the subtour of nodes is rearranged in such a way that almost the entire sequence is inverted. The aim is to explore other promising regions of the search space. As in Type I insertion, assume $V_k \neq V_i$ and $V_k \neq V_j$. The insertion of $V_x$ results in the deletion of arcs [...] and the insertion of arcs $(V_i, V_x)$, $(V_x, V_j)$, $(V_k, V_{j-1})$ and $(V_{k-1}, V_{i-1})$. As above, the subtours $(V_i, \ldots, V_{j-1})$ and $(V_k, \ldots, V_{i-1})$ are reversed.
• Type IV insertion: Similarly, this type of insertion can be seen as the reverse of Type II insertion. As in Type II, assume that [...] and the insertion of arcs [...] and $(V_k, V_{j-1})$. As above, the subtours $(V_i, \ldots, V_l)$ and $(V_{l+1}, \ldots, V_{j-1})$ are reversed.

C. Updating Pheromones
The pheromone trails in MMAS are updated by applying evaporation as follows:
$$\tau_{ij} \leftarrow (1 - \rho)\,\tau_{ij}, \quad \forall (i, j),$$
where $\rho$ is the evaporation rate which satisfies $0 < \rho \leq 1$, and $\tau_{ij}$ is the existing pheromone value. After evaporation, the best ant deposits pheromone as follows:
$$\tau_{ij} \leftarrow \tau_{ij} + \Delta\tau_{ij}^{best}, \quad \forall (i, j) \in \pi^{best},$$
where $\Delta\tau_{ij}^{best} = 1/\phi(\pi^{best}, t)$ is the amount of pheromone that the best ant deposits. The best ant that is allowed to deposit pheromone may be either the best-so-far ant, in which case $\pi^{best} = \pi^{bs}$, or the iteration-best ant, in which case $\pi^{best} = \pi^{ib}$. Both ants are used to deposit pheromones to achieve a transition from a stronger exploration of the search space early on to a stronger exploitation of the best-so-far solution later. More precisely, let $f_{bs}$ denote the number of algorithmic iterations between two updates performed by the best-so-far ant, and let $I$ be the iteration counter of the algorithm. Following the pre-defined schedule of [5] over $I$, the best-so-far ant deposits pheromone more and more frequently as the search progresses. In other words, the emphasis for the pheromone update is shifted gradually from the iteration-best ant to the best-so-far ant. The schedule is restarted at the beginning of every dynamic change. It must be noted that pheromones are updated after the local search improvements, so that the improvements are marked in the pheromone trails and can be exploited in the following iterations.
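A minimal sketch of the evaporation and deposit steps is given below. The function name, the in-place update of a list-of-lists matrix, and the `symmetric` flag (which mirrors the deposit onto the reverse arc) are assumptions of this example; the choice between the best-so-far and iteration-best tour is left to the caller.

```python
def update_pheromones(tau, best_tour, best_cost, rho, symmetric=False):
    """Evaporate all trails, then reinforce the arcs of the chosen best tour."""
    n = len(tau)
    for i in range(n):
        for j in range(n):
            tau[i][j] *= (1.0 - rho)          # evaporation on every arc

    deposit = 1.0 / best_cost                 # amount deposited by the best ant
    m = len(best_tour)
    for k in range(m):
        i, j = best_tour[k], best_tour[(k + 1) % m]
        tau[i][j] += deposit
        if symmetric:
            tau[j][i] += deposit              # assumption: undirected arcs share a trail
    return tau
```

With a high evaporation rate such as $\rho = 0.8$, arcs that are not reinforced lose most of their pheromone after a single iteration, which is what lets the trails follow a changed environment quickly.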

D. Maintaining Diversity
Maintaining the diversity of the constructed solutions is one of the key factors when addressing DOPs. This is because it helps the search to escape from (outdated) solutions of previously optimized environments and adapt to the new ones [20].
Since only the best ant is allowed to deposit pheromone, the search may quickly converge towards the best solution found in the first iterations. Therefore, the pheromone trails are occasionally reinitialized to the current $\tau_{max}$ value to increase exploration. For example, whenever stagnation behaviour occurs or when no improved solution is found for a given number of iterations, the pheromone reinitialization mechanism is triggered.
In addition, lower and upper limits $\tau_{min}$ and $\tau_{max}$ are imposed on the pheromone trail values. In this way, the probability of selecting an arc will always be $p_{ij}^{k} > 0$, and, consequently, all arcs will have a chance to be selected. The $\tau_{max}$ value is set to $\tau_{max} = 1/(\rho \cdot \phi(\pi^{bs}, t))$, and is updated whenever a new best-so-far ant is found. The $\tau_{min}$ value is set to $\tau_{min} = \tau_{max}/(2n)$, where $n$ is the number of nodes.
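The trail limits and the reinitialization step can be sketched as below. The three function names are hypothetical, and the decision of when to call `reinitialize` (for example, a stagnation test) is deliberately left out.

```python
def pheromone_limits(best_cost, rho, n):
    """Return (tau_min, tau_max) derived from the best-so-far tour cost."""
    tau_max = 1.0 / (rho * best_cost)
    tau_min = tau_max / (2.0 * n)
    return tau_min, tau_max


def clamp_trails(tau, tau_min, tau_max):
    """Force every trail into [tau_min, tau_max] so each arc stays selectable."""
    for row in tau:
        for j in range(len(row)):
            row[j] = min(tau_max, max(tau_min, row[j]))
    return tau


def reinitialize(tau, tau_max):
    """Reset all trails to the current upper limit to boost exploration."""
    for row in tau:
        for j in range(len(row)):
            row[j] = tau_max
    return tau
```

For instance, with $\rho = 0.8$, a best-so-far cost of 100 and $n = 10$, the limits are $\tau_{max} = 0.0125$ and $\tau_{min} = 0.000625$.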

E. Responding to Dynamic Changes
ACO algorithms are able to use knowledge from previous environments via their pheromone trails and can be applied directly to DOPs without any modifications [17], [18]. For example, when the changing environments are similar, the pheromone trails of the previous environment may provide knowledge to speed up the optimization process in the new environment. However, the algorithm must be flexible enough to accept the knowledge transferred from the pheromone trails, or eliminate the pheromone trails, in order to adapt well to the new environment. When a dynamic change occurs, evaporation eliminates the pheromone trails of the previous environment from areas that were generated around the old optimum and helps ants to explore areas for the new optimum.
In case the changing environments are completely different, pheromone reinitialization may be a better choice than transferring the knowledge from previous pheromone trails [17]-[19].

A. Experimental Setup
In the experiments, we investigate the effect of having ACO provide its solutions for local search improvements, rather than using randomly generated solutions, under symmetric and asymmetric dynamic changes. Comparisons of two effective local search operators are performed. In particular, the performance of the following algorithms is investigated:
• MMAS+US: the US operator is applied in MMAS whenever a new best-so-far ant is found until there is no further improvement.
• MMAS+LK: the LK operator is applied in MMAS whenever a new best-so-far ant is found until there is no further improvement.
• US: the US operator applied on a random initial solution rather than the best-so-far solution generated by MMAS.
• LK: the LK operator applied on a random initial solution rather than the best-so-far solution generated by MMAS.
All algorithmic parameters were set to commonly used values: $\alpha = 1$, $\beta = 5$, $\rho = 0.8$ and the number of ants was set to $\omega = 50$. DTSPs are generated from five static benchmark instances obtained from TSPLIB: pcb442, u574, pcb1173, rat783, lin318, using the dynamic generator described in Section II. The first three benchmark instances arise from the task of drilling holes in printed circuit boards, the next benchmark instance arises from a rattled grid, and the last benchmark instance from the travel cost between cities. The frequency of change $f$ was set to change every 10e4 algorithmic evaluations and the magnitude of change $m$ was set to 0.05, 0.1, 0.2 and 0.4, indicating small to medium changing environments. In total, a series of 4 DTSP test cases were constructed from each stationary instance, for symmetric and asymmetric changes, to systematically analyze the performance of the algorithms (all asymmetric problem instances have an extension of .atsp at the end of the problem label). For each algorithm on a DTSP, 30 independent runs were executed on the same set of random seed numbers. For each run, 100 environmental changes were allowed and an observation (i.e., the value of the best-so-far ant after a dynamic change) was recorded. For a fair comparison, all the algorithms performed the same number of evaluations. The proportional evaluations required when applying the US and LK operators are added to the total evaluations of the algorithms.
The offline performance [21] was used to evaluate the overall performance of the algorithms, which is defined as:
$$\bar{P}_{offline} = \frac{1}{E} \sum_{t=1}^{E} \phi(\pi^{bs}, t),$$
where $E$ is the total number of evaluations and $\phi(\pi^{bs}, t)$ is the best-so-far solution quality after a change.
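The measure can be computed from a log of evaluated solution costs, as in the sketch below. The function name, the 0-based evaluation index, and the choice to forget the previous best at every change (instead of re-evaluating it in the new environment) are simplifying assumptions of this example.

```python
def offline_performance(costs, f):
    """Average of the best-so-far cost over all evaluations.

    costs[t] is the cost of the solution evaluated at evaluation t;
    the environment is assumed to change every f evaluations.
    """
    total = 0.0
    best = float("inf")
    for t, c in enumerate(costs):
        if t % f == 0:
            best = float("inf")   # a change occurred: the old best is outdated
        best = min(best, c)
        total += best
    return total / len(costs)


# Two environments of 4 and 3 evaluations (the second one is cut short).
example = offline_performance([10, 8, 9, 7, 12, 11, 13], f=4)
```

Here the best-so-far values are 10, 8, 8, 7 in the first environment and 12, 11, 11 in the second, so `example` equals 67/7. Lower values are better for a minimization problem such as the DTSP.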

B. Experimental Results and Discussion
The experimental results regarding the offline performance of the investigated algorithms for all DTSPs are presented in Table I. The corresponding statistical results are presented in Table II, in which pairwise Mann-Whitney statistical tests with a significance level of 0.05 were performed. In Table II, the results are shown as "+", "−" and "∼" when the first algorithm is significantly better than the second one, when the second algorithm is significantly better than the first one, and when the two algorithms are not significantly different, respectively. In Fig. 5 and Fig. 6, the dynamic average offline performance against the algorithmic iterations of MMAS+US, MMAS+LK, US and LK is plotted for the last ten environmental changes to better understand the behaviour of the algorithms in symmetric and asymmetric dynamic changes, respectively. From the experimental results the following observations can be drawn.
First, MMAS+US significantly outperforms US in both symmetric and asymmetric cases (see the comparisons in Table II). This is because MMAS can provide the US local search heuristic with an initial solution from a promising neighbourhood (probably the one that contains the global optimum solution) in the search space, whereas this is less likely when starting from a random initial solution as in the US algorithm. On the contrary, MMAS+LK significantly outperforms LK only in asymmetric DTSPs, but it has no effect in symmetric DTSPs (since the improvement is not significant); see the comparisons in Table II. It is well known that the LK local search heuristic is considered by far the best heuristic on symmetric cases [22], and, consequently, starting from a random initial solution (as in the LK) will still result in good performance. However, it is still interesting to observe that the guidance provided by MMAS is very effective for the LK heuristic in asymmetric DTSPs. This is because the LK heuristic was designed specifically for symmetric cases, and, thus, loses its effectiveness in asymmetric cases. However, with the guidance of MMAS it will still explore some promising neighbourhoods.
Second, MMAS+US significantly outperforms MMAS+LK in most asymmetric cases whereas MMAS+LK significantly outperforms MMAS+US in most symmetric cases. This is because MMAS+US performs a wide range of alternative untried moves that tend to improve the solution quality of the tour. However, when the LK heuristic is dealing with asymmetric cases it performs a very specific set of moves that preserve the direction of the tour. The moves are composed of a specific case of 3-opt (i.e., cycle patching) and non-sequential 4-opt (i.e., the so-called double-bridge move) moves. On the contrary, when the US heuristic is dealing with asymmetric cases the segments of the tour whose direction is not preserved will be recalculated. In this way, the set of moves performed by the US heuristic will not be restricted only to the moves that preserve the direction of the tour. In addition, Type III and Type IV moves are designed to cope with asymmetric cases, by performing moves in the opposite direction of Type I and Type II moves, hoping that new regions (in the search space) with high quality solutions will be found.

V. CONCLUSIONS
In this paper, we integrate two advanced local search operators (i.e., LK and US) with the MMAS for dynamic environments. The aim of the integration is to take advantage of the adaptation capabilities of MMAS and the solution

TABLE II: Statistical results regarding the average offline performance of algorithms on symmetric DTSPs (upper half) and asymmetric DTSPs (lower half).