Analysis of Evolutionary Diversity Optimisation for the Maximum Matching Problem
Abstract
This paper delves into the enhancement of solution diversity in evolutionary algorithms (EAs) for the maximum matching problem, with a particular focus on complete bipartite graphs and paths. We utilize binary string encoding for matchings and employ Hamming distance as the metric for measuring diversity, aiming to maximize it. Central to our research are the $(\mu+1)$-EAD and 2P-EAD, applied to diversity optimization, which we analyze both theoretically and empirically.
For complete bipartite graphs, our runtime analysis demonstrates that, for reasonably small , the -EA achieves maximal diversity with an expected runtime of for the small gap case (where the population size is less than the difference in the sizes of the bipartite partitions) and otherwise. For paths we give an upper bound of . Additionally, for the 2P-EAD we give stronger performance bounds of for the small gap case, otherwise, and for paths. Here is the total number of vertices and the number of edges. Our empirical studies, examining the scaling behavior with respect to and , complement these theoretical insights and suggest potential for further refinement of the runtime bounds.
1 Introduction
Evolutionary algorithms (EAs) stand as a robust class of heuristics that navigate the intricate landscapes of various domains, from combinatorial optimization to bioinformatics, and have proven especially valuable in addressing problems within graph theory [24]. Central to the discussion in the field is the concept of diversity within EAs, which has been pivotal in enhancing the search process and preventing premature convergence on suboptimal solutions [11].
1.1 Related work
Recent research in evolutionary computation investigates various connections between quality and diversity. Quality Diversity (QD) has gained recognition as a widely adopted search paradigm, particularly in the fields of robotics and games [28, 5, 16, 1, 4]. The goal of QD is to illuminate the space of solution behaviours by exploring various niches in the feature space and maximizing quality within each specific niche. In particular, the popular MAP-elites algorithm divides the search space into cells to identify the solution with the highest possible quality for each cell [18, 31, 1, 32].
The area of evolutionary diversity optimization (EDO) aims to find a maximally diverse set of solutions that all meet a given quality criterion. EDO approaches have been applied in a wide range of settings. Diversity, while typically a means to avoid stagnation in the search for a single optimal solution, is leveraged here to yield a set of diverse, high-quality solutions. This is advantageous for decision-makers who value a variety of options from which to select the most fitting solution, accounting for different practical considerations and trade-offs [29, 30]. For example, different diversity measures have been explored for evolving diverse sets of TSP instances that exhibit differences in the performance of algorithms for the traveling salesperson problem, as well as diverse sets of variations of a given image that differ in terms of their features [6]. In the classical context of combinatorial optimization, EDO algorithms have been designed for problems such as the knapsack problem [2], the computation of minimum spanning trees [3], and communication networks [15, 23], for computing sets of problem instances [12, 21, 22], and for computing diverse sets of solutions for monotone submodular functions under given constraints [20, 8]. Furthermore, Pareto Diversity Optimization (PDO), developed in [19], is a coevolutionary approach that optimizes the quality of the best possible solution while also computing a diverse set of solutions meeting a given threshold value. EDO approaches have been analyzed with respect to their theoretical behavior for simple single- and multi-objective pseudo-Boolean functions [10] as well as simple scenarios of the traveling salesperson problem [6, 26, 25], the minimum spanning tree problem [3], the traveling thief problem [27], permutation problems [7], and the optimization of submodular functions [20].
1.2 Our contribution
This paper builds upon the methodology of [13], applying the theoretical runtime analysis framework to the maximum matching problem, specifically in bipartite graphs and paths. We aim to provide a deeper understanding of how diversity mechanisms influence the efficiency of population-based EAs in converging to a diverse set of high-quality maximum matchings.
To achieve this, we adopt a binary string representation for matchings and use Hamming distance as a measure of diversity. We then delve into the theoretical underpinnings of evolutionary diversity optimization for the maximum matching problem, examining structural properties that impact the performance of diversity-enhancing mechanisms within EAs. We provide runtime analysis for evolutionary algorithms, shedding light on their scalability for different problem instances. Finally, we present our experimental investigations to assess how closely the theoretical runtime bounds match the experimental runtimes.
In summary, our research provides theoretical insights and empirical evidence to understand how diversity can be effectively maximized for the maximum matching problem. Our findings contribute to a deeper understanding of the interplay between diversity and optimization in EAs and pave the way for further research in this direction.
The paper is organized as follows. In Section 2, we introduce the maximum matching problem and the evolutionary diversity optimization approaches analyzed in this study. We then explore structural properties and present runtime analyses for diversity optimization on complete bipartite graphs (Section 3) and paths (Section 4). Experimental investigations are detailed in Section 5, followed by concluding remarks and suggestions for future research directions (Section 6).
2 Preliminaries
In this part of the paper, we present the core concepts related to diversity optimization for matchings in bipartite graphs. We start by establishing the definitions and measures of diversity that will be used throughout our discussion.
2.1 Maximum matching problem and diversity optimization
Our study is concerned with the matching problem in bipartite graphs, described by a graph . The aim is to find a maximum matching , which is a collection of edges that do not share common vertices. It is presumed that each individual in the starting population represents a valid maximum matching. Our analysis is directed at determining how long it takes evolutionary algorithms to cultivate a population that is not only diverse but also meets a specified quality benchmark.
Let represent a bitstring where each bit corresponds to an edge in , indicating whether the edge is included in the matching. We define the fitness function as follows, adapting the approach introduced by Giel and Wegener [14]:
Here, is the collision number, representing the count of pairs of edges that are included in and share a common endpoint, rendering an invalid matching, and is the number of edges included in the matching represented by .
This fitness function imposes a penalty for invalid matchings proportional to the number of edge conflicts, thereby encouraging the evolution of valid matchings. The goal is to maximize , which aligns with identifying a maximum matching that has no edge collisions.
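The two quantities that enter the fitness function, the collision number and the number of selected edges, can be computed directly from the bitstring. The following is a minimal sketch; the function names and the example path are illustrative assumptions, and the way the penalty is combined into a single fitness value is deliberately left out.

```python
from itertools import combinations

def collision_number(edges, x):
    """Count pairs of selected edges that share a common endpoint."""
    chosen = [e for e, bit in zip(edges, x) if bit]
    return sum(1 for (a, b), (c, d) in combinations(chosen, 2)
               if {a, b} & {c, d})

def matching_size(x):
    """Number of edges selected by the bitstring x."""
    return sum(x)

def is_valid_matching(edges, x):
    """A bitstring encodes a valid matching iff it has no collisions."""
    return collision_number(edges, x) == 0

# Hypothetical example: the path 0-1-2-3 with edges e0, e1, e2.
path_edges = [(0, 1), (1, 2), (2, 3)]
print(collision_number(path_edges, [1, 1, 1]))   # e0/e1 and e1/e2 collide
print(is_valid_matching(path_edges, [1, 0, 1]))  # e0 and e2 are disjoint
```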
The divergence between individuals is gauged using the Hamming distance, which is appropriate given our binary string representation of solutions. This distance measures how many bits differ between two strings.
2.2 Diversity measure
The diversity of a multiset $P$ (duplicates allowed) of search points (called population in the following) is defined as the sum of the Hamming distances over all pairs of distinct individuals in $P$. This is mathematically expressed as

$$D(P) = \sum_{\{x, y\} \subseteq \tilde{P}} H(x, y),$$

where $\tilde{P}$ is the set (no duplicates) containing all solutions in $P$, and $H(x, y)$ is the Hamming distance between any two solutions $x$ and $y$. The contribution of a solution $x$ within a population $P$ is quantified as the loss in diversity if $x$ were to be excluded, and is defined as

$$c(x) = D(P) - D(P \setminus \{x\}).$$
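As a small illustration of these definitions, the sketch below computes the diversity of a population as the sum of pairwise Hamming distances over its distinct members, and the contribution of one individual as the diversity lost when one copy of it is removed. The function names and the list-of-tuples representation are assumptions made for this example.

```python
from itertools import combinations

def hamming(x, y):
    """Number of positions in which two equal-length bitstrings differ."""
    return sum(a != b for a, b in zip(x, y))

def diversity(population):
    """Sum of pairwise Hamming distances over the distinct individuals."""
    distinct = sorted(set(population))
    return sum(hamming(x, y) for x, y in combinations(distinct, 2))

def contribution(population, index):
    """Diversity lost when the individual at position `index` is removed."""
    rest = population[:index] + population[index + 1:]
    return diversity(population) - diversity(rest)

# A population (list of tuples) containing one duplicate.
P = [(1, 0, 1, 0), (0, 1, 0, 1), (1, 0, 1, 0)]
print(diversity(P))        # only the two distinct strings count
print(contribution(P, 0))  # a duplicate contributes nothing
print(contribution(P, 1))  # the unique string carries all the diversity
```

Removing a duplicate leaves the set of distinct solutions unchanged, which is why duplicates have contribution zero and are the first candidates for removal.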
2.3 Algorithms
The $(\mu+1)$-EAD (see Algorithm 1) operates on a principle of maintaining and enhancing diversity within a population. It starts with a population of $\mu$ solutions and iteratively evolves them through mutation. In each iteration, it selects a solution uniformly at random, applies mutation, and, if the new solution meets the quality criterion, adds it to the population. To maintain the population size, the individual contributing least to diversity (or one of them, if there are several) is removed. This process continues until the termination criterion is met. In our case, the termination criterion is reaching maximal diversity, and the quality criterion is being a valid maximum matching.
The Two-Phase Matching EAD (see Algorithm 2) is also designed to generate diverse solutions in the population. The first phase involves "unmatching" a random subset of vertices in a solution, while the second phase focuses on "rematching" these vertices to other unmatched vertices in the graph. The algorithm keeps adding these newly formed solutions to the population if they fulfill the quality criterion and, similar to the $(\mu+1)$-EAD, removes the individual contributing least to diversity to maintain the population size. The algorithm continues this process until the termination criterion is met, aiming to achieve a diverse set of high-quality matchings.
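The loop described above can be sketched as follows. This is an illustrative sketch rather than a transcription of Algorithm 1: the standard bit mutation with rate one over the number of edges, the tie-breaking rule, the fixed step budget in place of a termination test, and all names are assumptions of this example.

```python
import random
from itertools import combinations

def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

def diversity(population):
    distinct = sorted(set(population))
    return sum(hamming(x, y) for x, y in combinations(distinct, 2))

def is_matching(edges, x):
    """True iff no two selected edges share an endpoint."""
    used = set()
    for (u, v), bit in zip(edges, x):
        if bit:
            if u in used or v in used:
                return False
            used.update((u, v))
    return True

def mutate(x, rng):
    """Assumed operator: flip each bit independently with probability 1/len(x)."""
    m = len(x)
    return tuple(bit ^ (rng.random() < 1.0 / m) for bit in x)

def ea_step(population, edges, opt_size, rng):
    parent = rng.choice(population)          # uniform parent selection
    child = mutate(parent, rng)
    if sum(child) != opt_size or not is_matching(edges, child):
        return population                    # quality criterion not met
    extended = population + [child]
    # Drop the individual whose removal leaves the most diversity,
    # i.e. one with the smallest contribution (first such index on ties).
    drop = max(range(len(extended)),
               key=lambda i: diversity(extended[:i] + extended[i + 1:]))
    return extended[:drop] + extended[drop + 1:]

def run(edges, start, opt_size, steps, seed=0):
    rng = random.Random(seed)
    population = list(start)
    history = [diversity(population)]
    for _ in range(steps):
        population = ea_step(population, edges, opt_size, rng)
        history.append(diversity(population))
    return population, history

# Hypothetical instance: complete bipartite graph with sides of size 2 and 4,
# started from three copies of the same maximum matching.
edges = [(("a", i), ("b", j)) for i in range(2) for j in range(4)]
first = {(("a", 0), ("b", 0)), (("a", 1), ("b", 1))}
start = [tuple(int(e in first) for e in edges)] * 3
population, history = run(edges, start, opt_size=2, steps=500)
```

Because discarding the offspring is always one of the removal options, the diversity recorded in `history` never decreases from one step to the next.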
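The two phases can be sketched for a complete bipartite graph as below. This is a hedged illustration, not Algorithm 2 itself: the representation (`match[i]` is the larger-side partner of smaller-side vertex `i`), the unmatching probability of one over the number of matched vertices, and the uniform rematching rule are all assumptions of this example.

```python
import random

def two_phase_mutation(match, n_large, rng):
    """Unmatch a random subset of smaller-side vertices, then rematch them.

    match[i] is the larger-side partner of smaller-side vertex i;
    n_large is the size of the larger side (n_large >= len(match)).
    """
    k = len(match)
    # Phase 1 (assumed rule): release each vertex independently with prob. 1/k.
    released = [i for i in range(k) if rng.random() < 1.0 / k]
    kept = {match[i] for i in range(k) if i not in released}
    # Larger-side vertices that are free after phase 1.
    free = sorted(set(range(n_large)) - kept)
    child = list(match)
    # Phase 2: rematch each released vertex to a uniformly chosen free vertex.
    for i in released:
        child[i] = free.pop(rng.randrange(len(free)))
    return tuple(child)

rng = random.Random(1)
parent = (0, 1, 2)                      # smaller side 3, larger side 5
child = two_phase_mutation(parent, 5, rng)
```

In this representation every offspring is again a maximum matching by construction, so a single vertex can change its partner in one mutation instead of requiring two coordinated bit flips.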
2.4 Drift theorems
We analyse the considered algorithms with respect to their runtime behaviour. The expected runtime refers to the expected number of generated offspring until a given goal has been achieved (usually until a valid population of maximal diversity has been computed). For our analysis, we make use of the additive and multiplicative drift theorems, which we state in the following.
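Theorem 2.1 (Additive Drift Theorem [17]).
Let $S \subseteq \mathbb{R}$ be a finite set of positive numbers and let $(X_t)_{t \in \mathbb{N}}$ be a sequence of random variables over $S \cup \{0\}$. Let $T$ be the random variable that denotes the first point in time $t \in \mathbb{N}$ for which $X_t = 0$. Suppose that there exists a constant $\delta > 0$ such that
$$E[X_t - X_{t+1} \mid T > t] \geq \delta$$
holds. Then
$$E[T \mid X_0] \leq \frac{X_0}{\delta}.$$
If there exists a constant $\delta > 0$ such that
$$E[X_t - X_{t+1} \mid T > t] \leq \delta$$
holds, then
$$E[T \mid X_0] \geq \frac{X_0}{\delta}.$$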
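Theorem 2.2 (Multiplicative Drift Theorem [9]).
Let $(X_t)_{t \in \mathbb{N}}$ be random variables over $\mathbb{R}$, $x_{\min} > 0$, and let $T = \min\{t \mid X_t < x_{\min}\}$. Furthermore, suppose that
(a) $X_0 \geq x_{\min}$ and, for all $t \leq T$, it holds that $X_t \geq 0$, and that
(b) there is some value $\delta > 0$ such that, for all $t < T$, it holds that $X_t - E[X_{t+1} \mid X_0, \ldots, X_t] \geq \delta X_t$.
Then
$$E[T \mid X_0] \leq \frac{1 + \ln(X_0 / x_{\min})}{\delta}.$$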
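To make the use of the two theorems concrete, the following sketch evaluates the upper bounds in their commonly stated forms: initial potential divided by the drift for the additive case, and one plus the logarithm of the ratio of initial to minimal potential, divided by the drift factor, for the multiplicative case. The function and parameter names are assumptions of this example.

```python
import math

def additive_drift_upper_bound(x0, delta):
    """Upper bound x0 / delta on the expected hitting time of 0,
    given an expected decrease of at least delta per step."""
    return x0 / delta

def multiplicative_drift_upper_bound(x0, x_min, delta):
    """Upper bound (1 + ln(x0 / x_min)) / delta on the expected time to
    drop below x_min, given an expected relative decrease of at least delta."""
    return (1.0 + math.log(x0 / x_min)) / delta

# Hypothetical numbers: initial potential 10, drift 0.5 per step.
print(additive_drift_upper_bound(10, 0.5))
# Hypothetical numbers: initial potential 1000, minimum 2, drift factor 0.01.
print(multiplicative_drift_upper_bound(1000, 2, 0.01))
```

The multiplicative bound grows only logarithmically in the initial potential, which is where the logarithmic factors in runtime bounds derived this way come from.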
3 Runtime Analysis for complete bipartite graphs
This section introduces key theoretical results on complete bipartite graphs. We commence with a lemma that characterizes the conditions for maximal diversity within a population. Subsequently, we present a series of theorems that delineate the expected runtime to achieve this optimal diversity. These theorems compare the performance of the $(\mu+1)$-EAD and 2P-EAD algorithms, providing a quantitative basis for assessing their efficacy.
Lemma 3.1 (Diversity of a Population).
Maximal diversity on a complete bipartite graph for a population of size , is attained if and only if all matchings in are pairwise edge-disjoint.
Proof.
Consider a set of matchings in , where each matching is a solution in the population. Let the diversity of this set be denoted by , defined as the sum of pairwise Hamming distances between all matchings.
A matching in involves pairing each vertex in with a unique vertex in , yielding edges in each matching. The Hamming distance between any two distinct matchings is the count of edges that differ between them.
To maximize , each pair of matchings should differ by the greatest number of edges. This maximum difference is , occurring when the matchings share no common edges.
Given matchings, the number of distinct pairs of matchings is . If all matchings are disjoint, each pair contributes to , leading to .
If any pair of matchings shares at least one edge, the Hamming distance for that pair is strictly less than , thus reducing . Therefore, is maximized if and only if all matchings are pairwise edge-disjoint.
This argument hinges on the fact that , ensuring the feasibility of having disjoint matchings in since each matching uses edges and there are possible edges in . Consequently, it is possible to construct disjoint matchings, each utilizing a different subset of edges from the total pool. ∎
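One way to see that pairwise edge-disjoint maximum matchings exist is a cyclic-shift construction, sketched below. The construction, the vertex labelling, and the names are assumptions of this example rather than part of the proof: matching number `i` pairs smaller-side vertex `j` with larger-side vertex `(j + i) mod n_large`, which yields disjoint matchings as long as the population size does not exceed the size of the larger side.

```python
from itertools import combinations

def shifted_matchings(k, n_large, mu):
    """mu matchings of size k; matching i maps j to (j + i) mod n_large."""
    assert k <= n_large and mu <= n_large
    return [frozenset((j, (j + i) % n_large) for j in range(k))
            for i in range(mu)]

def diversity(matchings):
    """Sum of pairwise Hamming distances, computed as symmetric differences."""
    return sum(len(a ^ b) for a, b in combinations(matchings, 2))

pop = shifted_matchings(k=3, n_large=5, mu=4)
pairwise_disjoint = all(not (a & b) for a, b in combinations(pop, 2))
print(pairwise_disjoint)
print(diversity(pop))  # 6 pairs, each at Hamming distance 2 * 3
```

Two edge-disjoint matchings of size `k` differ in exactly `2k` bits, so the diversity of such a population is `2k` times the number of pairs.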
In the following theorem we show that there is always a local improvement, needing bit flips, to reach a population with maximum diversity if the difference in size between both partitions is larger than the population size.
Theorem 3.2.
Let be a complete bipartite graph with , and . In the -EAD applied to , the expected time until the diversity is maximized is .
Proof.
We define the potential function as the difference between the optimal diversity and the current diversity at time :
In each solution, exactly vertices from are adjacent to a matching edge, leaving vertices in unadjacent in every solution. Additionally, each vertex in can be matched to at most different vertices across all solutions, ensuring that, for each vertex in R, there exists a vertex in L that is not matched with it in any solution.
To show that there is always a 2-bit flip which improves diversity by at least , we focus on a sequence of improving -bit flips. Each -bit flip corresponds to changing a match for a vertex in , which entails deactivating one edge (currently part of a matching) and activating another edge (currently not part of the matching). This process is akin to reassigning a vertex in to a different, unmatched vertex in .
Consider an edge used in solutions. When this edge is deactivated (removed from the matching), the diversity change is , since solutions lose a unique edge, reducing diversity. Conversely, when a new edge is activated (added to the matching) that is unused across all other solutions, it contributes to the diversity.
Thus, for each such -bit flip involving edge , the total change in diversity is:
This calculation demonstrates that the diversity improvement achieved by applying the -bit flip for an edge in the sequence either decreases or remains unchanged if it is flipped later in the sequence. Note that in each step of the sequence the new maximum matching contains an edge unused by any other matching, so the offspring is always valid and the diversity improvement is at least , since this would be achieved by replacing the parent. Also, as soon as all edges are unique across all solutions, the population is optimal, and thus the total change across all such edges equals the difference to the optimum .
Let represent the count of such "imperfect" edges (edges used in more than one solution). Applying the 2-bit flip to one edge of the sequence gives at least the diversity increase it achieves in the sequence, since the value of can only decrease or remain unchanged, and it is at most . Thus , which implies . The expected drift then is:
Given that is the maximum diversity, attained when all edges are pairwise distinct, it holds that . The application of the multiplicative drift theorem then yields an expected runtime of to achieve maximum diversity. ∎
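The effect of a single improving 2-bit flip can be checked on a small example. In the sketch below, which uses a hypothetical population of four pairwise distinct matchings, an edge shared by `k` solutions is swapped in one solution for an edge that no solution uses, and the offspring replaces its parent. A direct count under the pairwise-sum definition of diversity gives a gain of `2 * (k - 1)` in this situation; the instance and names are assumptions of this example.

```python
from itertools import combinations

def diversity(matchings):
    """Sum of pairwise Hamming distances, computed as symmetric differences."""
    return sum(len(a ^ b) for a, b in combinations(matchings, 2))

# Edges are (smaller-side vertex, larger-side vertex); smaller side has size 2.
population = [
    frozenset({(0, 0), (1, 1)}),
    frozenset({(0, 0), (1, 2)}),
    frozenset({(0, 0), (1, 3)}),
    frozenset({(0, 4), (1, 5)}),
]
shared, fresh = (0, 0), (0, 6)                 # (0, 6) is used by no solution
k = sum(shared in s for s in population)       # multiplicity of the shared edge

before = diversity(population)
offspring = (population[0] - {shared}) | {fresh}
after = diversity([offspring] + population[1:])

print(before, after, after - before)           # 18 22 4
```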
We now show that the Two-Phase Matching Algorithm achieves a significant speedup, since it is no longer necessary to flip two edges in order to change the partner of a single vertex.
Theorem 3.3.
Let be a complete bipartite graph with , and . In the Two-Phase Matching Evolutionary Algorithm applied to , the expected time until the diversity is maximized is , where .
Proof.
We define the potential function as the difference between the optimal diversity and the current diversity at time :
The maximal diversity is achieved when all matchings in the population are pairwise edge-disjoint. The drift in the potential function at each step of the algorithm is analyzed as follows:
In each step, the algorithm first selects a solution and a subset of vertices, which it rematches with unmatched vertices in . Let represent the count of such "imperfect" edges (edges used in more than one solution). As shown in Theorem 3.2 it holds that . The expected drift then is obtained by selecting the corresponding solution to any of the edges, unmatching the adjacent vertex in and rematching it to include an edge unused by any solution. The probability to unmatch any and no other particular vertex in is , and the probability of matching it with an appropriate unmatched vertex in is at least .
The expected decrease in the potential function per step, or the expected drift, is then given by:
where the factor accounts for the probability of selecting the right vertex and making a beneficial rematch.
Given that is the maximum diversity, attained when all edges are pairwise distinct, it holds that . The application of the multiplicative drift theorem then yields an expected runtime of to achieve maximum diversity. ∎
Theorem 3.5 covers the case missing in the previous theorem, which gives a much larger runtime bound. Intuitively this happens because as gets greater than the gap between it is no longer guaranteed that we can always find a new rematch such that this matching edge is not used by any other solution, thus making more than two bit flips necessary. Theorem 3.4 includes such a situation with a theoretical lower bound.
Theorem 3.4.
Let be a complete bipartite graph with . Consider a population size , satisfying and . There exists a starting population such that when the -EAD is applied to , the expected time to reach a population with maximal diversity is .
Proof.
Consider a bipartite graph with vertex partitions and . Define a matrix representing solutions to a matching problem, where each row of corresponds to a solution, and each column (for ) indicates the match in for vertex in .
The matrix is constructed as follows:
1. The first column of , denoted , is defined as:
2. For each row (for ), the entries in the row are filled by rotating the elements of such that:
3. This process results in each row of sharing the same sequence of vertices from , except for the first entry, with a cyclical shift to the right in each subsequent row.
This matrix represents distinct solutions for the bipartite graph matching problem, where each row corresponds to a different solution, and each column represents a match between a vertex in and a vertex in , arranged according to the specified rotating pattern.
This matrix exemplifies the construction of solutions, with each row depicting a unique solution in the bipartite graph matching problem.
For each such matrix only the first column has two solutions using the same edge and the distance to optimal diversity is 2. Selecting any solution except these two cannot increase the diversity. Moreover, for each of these two rows there is no value we can change the assignment of to without creating another duplicate edge or creating an invalid matching. Thus we have to change to one of the edges not part of the row and subsequently deactivate that edge and activate to one of the edges not used in the row. The probability of doing this is at most . The remaining runtime is . ∎
For the given hard instance, while there is no improving 2-bit flip there is however an improving 4-bit flip of the following form, changing two matches. We make use of the fact that there is a match we can alter freeing a vertex () we can match to , which is unique in all solutions.
In the following theorem we generalize that such a 4-bit flip can always be found.
Theorem 3.5.
For a complete bipartite graph where , let the population size satisfy and . When the -EAD is applied to , the expected time to achieve maximal diversity is bounded by .
Proof.
We investigate the expected time for the -EAD to maximize diversity in a complete bipartite graph with the given conditions. Initially, we note that for any maximum matching there exist unmatched vertices from the left partition.
Let be a maximum matching in . Consider that full diversity is not achieved yet and thus an edge is part of multiple maximum matchings. We define to be the set of vertices in that are matched to a vertex in at least one maximum matching. Given that and since a matching pairs each vertex in with at most one vertex in , there must exist more than vertices in that are not paired with in any maximum matching. Let be the set of vertices in that are adjacent to these unpaired vertices in .
In the context of the -EAD, by strategically reassigning the pairs in , we can ensure an increase in diversity without decreasing the matching size. We denote by the vertex in to which a vertex is matched under .
Now, for the sake of contradiction, assume that . This would suggest that each vertex in is matched to a vertex in under . However, since and , this situation is not possible.
Therefore, there must exist a vertex such that . This implies that we can activate an edge connecting with an unmatched vertex in and deactivate the edge currently matching without reducing the size of the matching, thereby increasing diversity. Just as in Theorem 3.2 each of those 4-bit flips only decreases or does not change the multiplicities of other edges, since they are both unique edges over all solutions. Also, successively applying these 4-bit flips at most times will result in optimal diversity, so holds.
Define to be the difference between the optimal diversity and the current diversity at time . Then, we observe a positive drift in the expected diversity increase per time step, similarly to Theorem 3.2, which can be bounded below by:
Here, represents the probability of selecting the correct individual for reassignment, and the term accounts for the probability of selecting the appropriate edges for activation and deactivation.
Given that is the maximum diversity, attained when all edges are pairwise distinct, it holds that . The Multiplicative Drift Theorem then provides us with a runtime bound of to achieve maximum diversity. ∎
A similar speedup as for the small gap case can be shown by applying the 2P-EAD.
Theorem 3.6.
Given a complete bipartite graph with , consider a population size that fulfills and . For the 2P-EAD, the expected time to reach maximal diversity is .
Proof.
Consider the 2P-EAD applied to a complete bipartite graph under the condition . Define the potential function as in the previous theorem:
In this adapted algorithm, we focus on efficiently increasing diversity by unmatching and then rematching only two vertices at a time. This process targets the subset of vertices in that can be rematched to different vertices in to increase diversity more effectively.
Let be the number of edges that are shared across different matchings. The expected drift in per step, considering the efficient selection and rematching process of only two vertices, is given by:
where the factor accounts for the probability of selecting the right solution and pair of vertices and of making a beneficial rematch. The term considers the probability of unmatching and rematching exactly two vertices without affecting the others.
Given that is the maximum diversity, attained when all edges are pairwise distinct, it holds that . Applying the Multiplicative Drift Theorem then yields an expected runtime of to achieve maximum diversity. Now since it holds that , which implies . Also by definition , so and we get a bound of . ∎
4 Runtime Analysis for paths
This section introduces key theoretical results on paths. We commence with an introduction of useful notation to simplify the following proofs. Subsequently, we present a series of theorems that delineate the expected runtime to achieve this optimal diversity. These theorems compare the performance of the $(\mu+1)$-EAD and 2P-EAD algorithms, providing a quantitative basis for assessing their efficacy.
In a path with an even number of edges, such as when , there are multiple ways to form a maximum matching. Each maximum matching includes exactly three edges, ensuring that no two edges in the matching share a vertex. The notation is used to represent these matchings, where and denote the number of edges with even and odd indices in the matching, respectively. The detailed proof is given in the following Lemma.
Lemma 4.1.
The number of different maximum matchings on a path with edges is for even and for odd and each is of size . Also for even each maximum matching can be described as . For odd the unique solution has the form .
Proof.
We approach the proof of this lemma by employing induction to verify the claim regarding the number and arrangement of maximum matchings in path graphs of varying edge counts.
Base Case (,):
Clearly for there is only one solution consisting of one edge with index , so the unique solution is . For only one of the two edges of the path can be part of the maximum matching so the maximum matchings are or .
Inductive Step:
In a maximum matching of size either the last or the second to last edge of the path has to be included, else we could increase the size by including the last edge.
Case 1: even
If the last edge is part of the matching, then the first edges must also form a maximum matching, since the choice of being in the matching is independent of the last two edges. By the induction hypothesis we can extend each maximum matching on the edges by .
If we instead include the second to last edge of the path, then the last and third to last edge of the path cannot be part of the matching, while the remaining edges are independent of the choice and must thus also form a maximum matching. The remaining path is then odd and thus has a unique maximum matching , so inductively the only maximum matching of this form is
Case 2: odd
If the last edge is part of the matching, then by the induction hypothesis the maximum matching for the remaining edges is unique and thus the maximum matching including the last edge of even index is .
If we instead include the second to last edge of the path, then the last and third to last edge of the path cannot be part of the matching, while the remaining edges are independent of the choice and must thus also form a maximum matching. The remaining path is then even and by the induction hypothesis each matching will have edges, which is not maximum since by instead including the last edge we obtain a matching of size .
∎
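The structure described in Lemma 4.1 can be checked by brute force on small paths. The sketch below assumes that edges are indexed from 0 along the path, so that consecutive indices share a vertex; this indexing, like the function names, is an assumption of the example. For the path with six edges it finds four maximum matchings of size three, and for small even and odd edge counts the enumeration gives half the number of edges plus one, and exactly one, maximum matchings respectively.

```python
from itertools import combinations

def maximum_matchings(m):
    """All maximum matchings of a path with edges e_0, ..., e_{m-1}.

    A path with m edges has m + 1 vertices, so a maximum matching
    has (m + 1) // 2 edges; chosen indices must differ by at least 2.
    """
    size = (m + 1) // 2
    return [subset for subset in combinations(range(m), size)
            if all(b - a >= 2 for a, b in zip(subset, subset[1:]))]

def has_even_then_odd_form(matching):
    """True iff all even-indexed edges come before all odd-indexed ones."""
    parities = [i % 2 for i in matching]
    return parities == sorted(parities)

for m in range(1, 9):
    found = maximum_matchings(m)
    print(m, len(found), all(has_even_then_odd_form(x) for x in found))

print(maximum_matchings(6))
# [(0, 2, 4), (0, 2, 5), (0, 3, 5), (1, 3, 5)]
```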
With an even number of edges, such as , there are the following maximum matching configurations, represented as (matching edges in red)
Matching :
Matching :
Matching :
Matching :
With an odd number of edges, such as , there is only one maximum matching configuration, represented as
Matching :
In each case, every vertex is incident to at most one matching edge, and the notation describes the composition of the matching in terms of even and odd-indexed edges.
The following Lemma characterizes the conditions for maximal diversity within a population using this notation.
Lemma 4.2 (Diversity of a Population).
The population with optimum diversity for even contains for each from to the individuals and . For odd and it further contains any one individual of the form .
Proof.
We approach the proof of this lemma by employing induction to verify the claim regarding the number and arrangement of maximum matchings in path graphs of varying population sizes.
Base Case (, ):
For any solution maximizes the diversity of .
For , the population with maximum diversity contains and with maximum diversity of . Suppose that there exists another population of maximum diversity of size . Since the diversity has to be , if the first matching is then the second matching must be the complement . As soon as edges with both even and odd indices are part of , either or does not have the form and cannot be a valid maximum matching by Lemma 4.1.
Inductive Step:
Suppose by way of contradiction that is not part of the population. Then there exists an such that for all solutions in the population the first edges with even index are part of the solution and the th edge of one solution has odd index. By changing the th edge of the solution to also be of odd index we would increase the diversity, which contradicts the assumption of having maximum diversity. Analogously this holds for . Since all individuals are distinct all other solutions must start with an even edge and end with an odd edge. To the remaining individuals restricted on the inner edges we can then apply the Induction Hypothesis, so for even the population further contains for each from to the individuals and . For odd and it further contains any one individual of the form .
∎
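For small instances the shape of a diversity-optimal population can be confirmed by exhaustive search. The sketch below, which assumes 0-based edge indices and uses illustrative names, enumerates all populations of a given size over the maximum matchings of a path with six edges and keeps those of maximal diversity. For two individuals the unique optimum consists of the all-even and the all-odd matching; for three individuals every optimum contains these two plus one further matching.

```python
from itertools import combinations, combinations_with_replacement

def maximum_matchings(m):
    """Maximum matchings of a path with edges e_0, ..., e_{m-1}."""
    size = (m + 1) // 2
    return [s for s in combinations(range(m), size)
            if all(b - a >= 2 for a, b in zip(s, s[1:]))]

def diversity(population):
    """Sum of pairwise Hamming distances over the distinct individuals."""
    distinct = sorted(set(population))
    return sum(len(set(a) ^ set(b)) for a, b in combinations(distinct, 2))

def best_populations(m, mu):
    """Maximum diversity and all multisets of size mu attaining it."""
    candidates = list(combinations_with_replacement(maximum_matchings(m), mu))
    best = max(diversity(p) for p in candidates)
    return best, [p for p in candidates if diversity(p) == best]

best2, optima2 = best_populations(6, 2)
best3, optima3 = best_populations(6, 3)
print(best2, optima2)
print(best3, optima3)
```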
Building on this, in the following theorem we show that there is always a local improvement, needing bit flips, to improve diversity.
Theorem 4.3.
In the -EAD applied to a path with edges, the expected time until the diversity is maximized is .
Proof.
We consider a path graph with an even number of edges , where multiple maximum matchings are possible. The maximum matching is unique when is odd, hence the maximum diversity is trivially obtained in that case. Therefore, our analysis focuses on when is even.
Within a population, suppose there is duplication. By Lemma 4.2 it follows that there exists at least one individual for which the first matched edges have even indices without another individual having the first matched edges with even indices, or an individual where the last matched edges have odd indices without another individual having the last matched edges with odd indices.
Considering that the total number of distinct maximum matchings for a path with edges exceeds , the likelihood of choosing an individual from the current population and correctly flipping two edges to enhance diversity is at least . This lower bound on the probability yields a diversity improvement of at least 1.
If the population has not reached maximal diversity but consists of pairwise distinct maximum matchings, then there must exist a maximal such that or is not present in the population. W.l.o.g. let this be . We focus on the individual with most odd edges. By applying a 2-bit flip we get . The diversity change, by replacing the parent, would be only determined by this edge change. This new odd edge is already used by matchings, since is maximal, and only those since else would not have the most odd edges of the remaining population. By symmetry the deactivated even edge is used in solutions (excluding the parent). Thus the change in diversity by replacing the parent would be . By choice of this is strictly positive. Since replacing the parent is possible, the diversity increase is at least of this size. Let denote the difference between the optimal diversity and the current diversity at time . The possibility of enhancing diversity via a two-bit flip provides us with a drift given by
Since the initial diversity deficit is at most (each pair of solutions can have a Hamming distance of at most ), applying the additive drift theorem results in a runtime estimation of . ∎
Theorem 4.4.
In the 2P-EAD applied to a path with edges, the expected time until the diversity is maximized is .
Proof.
Since the proof follows closely the arguments presented in Theorem 4.3, we will focus only on the different bounds on drift, which is the main differing element.
Any maximum matching can be chosen with probability and be mutated to by unmatching the th vertex and rematching it with probability to its unmatched left neighbour. Since all previous edges have to be of even index this neighbour must be unmatched. Analogously it holds for to . For both the case of having duplicates or not being optimal in Theorem 4.3 we make use of such a local edge swap. The drift is therefore given by
where is the probability of not rematching any other vertex. Given that the initial diversity deficit is at most (each pair of solutions can have a Hamming distance of at most ), the additive drift theorem provides an upper bound on the expected runtime of , since . ∎
5 Empirical Analysis
In this section, we present our empirical findings on the performance of the evolutionary diversity algorithms on complete bipartite graphs and paths. Our experiments were designed to test the theoretical predictions made in previous sections, particularly focusing on the efficiency of the algorithm in terms of the number of iterations required to achieve optimal diversity.
5.1 Experimental Setup
Our experiments were designed to explore the performance dynamics of the algorithms under two specific conditions: when the population size is held constant and when the number of edges remains fixed.
Complete Bipartite Graphs
The starting condition for complete bipartite graphs involves a maximum matching where for each is matched to , forming a homogeneous initial population. In the constant scenario, we increase the size of both and by one unit per iteration to maintain a steady difference, allowing a controlled analysis of the algorithms’ scalability. In the constant scenario we simply increase by one per iteration.
Paths
For paths, the initial population comprises maximum matchings including all even-indexed edges. With a fixed , the number of edges is increased by ten in each iteration in order to cover a wider range of problem sizes while remaining experimentally feasible. In the constant case, for reasons of feasibility, we simply increase by one per iteration.
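As a minimal sketch of this starting condition for paths, the snippet below builds a homogeneous population and evaluates its diversity. It assumes an even number of edges `m`, 1-based edge indices, and diversity measured as the sum of pairwise Hamming distances; the helper names are hypothetical.

```python
from itertools import combinations
from typing import List


def hamming(x: List[int], y: List[int]) -> int:
    """Number of positions in which two bit strings differ."""
    return sum(a != b for a, b in zip(x, y))


def diversity(population: List[List[int]]) -> int:
    """Sum of pairwise Hamming distances (assumed diversity measure)."""
    return sum(hamming(x, y) for x, y in combinations(population, 2))


def even_edge_matching(m: int) -> List[int]:
    """Bit string for a path with m edges in which every even-indexed
    edge (1-based) is active."""
    return [1 if i % 2 == 0 else 0 for i in range(1, m + 1)]


def initial_population(m: int, mu: int) -> List[List[int]]:
    """mu identical copies of the even-edge matching."""
    return [even_edge_matching(m) for _ in range(mu)]


population = initial_population(m=6, mu=4)
start_diversity = diversity(population)
```

Because all individuals are identical, the diversity of this population is zero, so every run starts from the largest possible diversity deficit for that encoding.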
5.2 Methodology
Each experiment was conducted 30 times to determine the average number of iterations and the standard deviation, estimating the algorithms’ asymptotic runtime for both fixed population size () and a fixed number of edges (). For complete bipartite graphs and fixed we chose and for the small gap case and and for the big gap case, such that the number of edges for the small gap case and for the big gap case are comparable in size.
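The aggregation step can be sketched as follows. The paper does not state how the empirical growth rates were estimated; the log-log least-squares fit below is one common choice and is an assumption, as are the function names `summarise` and `fit_exponent`.

```python
import math
import statistics
from typing import Sequence, Tuple


def summarise(iterations: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation over repeated runs."""
    return statistics.mean(iterations), statistics.stdev(iterations)


def fit_exponent(sizes: Sequence[float], means: Sequence[float]) -> float:
    """Least-squares slope of log(mean iterations) against log(size).

    If the runtime behaves like c * size**k, the slope estimates k.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(v) for v in means]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


# Synthetic illustration only: data that grows exactly quadratically.
sizes = [10, 20, 40, 80]
synthetic_means = [3 * s ** 2 for s in sizes]
exponent = fit_exponent(sizes, synthetic_means)
mean_iters, std_iters = summarise([10, 12, 14])
```

In an actual experiment, each entry of `means` would be the average over the 30 repetitions for one problem size, obtained with `summarise`.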
5.3 Complete Bipartite Graphs
This subsection focuses on the performance of evolutionary diversity algorithms on complete bipartite graphs, specifically examining the -EAD and 2P-EAD algorithms.
-EAD
In Figure 1(a), we show the average number of iterations for a fixed population size of and different values of . Specifically, we examine cases where the difference is either , referred to as the ‘small gap’ scenario, or , the ‘big gap’ scenario. For the big gap case, the number of iterations of the -EAD algorithm grew quadratically in , empirically estimated as , suggesting that the algorithm outperforms the theoretical bound by a factor of approximately . For the small gap case, we empirically estimate the runtime as , which suggests an even larger improvement, by a factor of , over the theoretical bound of .
2P-EAD
In Figure 1(b) for fixed and Figure 1(d) for fixed, we zoom in on the results for the 2P-EAD algorithm. In both the small and big gap cases, the number of iterations of the 2P-EAD algorithm increased linearly with respect to when was held constant, and vice versa. Empirically, the runtime of 2P-EAD was observed to be close to , a notable deviation from the predicted . Table 1 summarizes these observations. The 2P-EAD algorithm not only performs better in practice; the results also suggest that our theoretical bounds may be refined to predict the empirical outcomes more closely.
5.4 Paths
This subsection focuses on the performance of evolutionary diversity algorithms on paths, specifically examining the -EAD and 2P-EAD algorithms.
-EAD
In Figure 2(a), we present the average number of iterations when the population size is fixed at 8. The graph illustrates how the number of iterations required for convergence changes as the number of edges in the path increases. Figure 2(c) shows the average number of iterations for a fixed number of edges and varying population size . For the -EAD algorithm, a trend of polynomial growth in the number of iterations is observed as a function of the problem size. When is fixed at 8, the empirical runtime grows in line with , which could indicate a performance better than the theoretical upper bound of by a factor of .
2P-EAD
When we examine the 2P-EAD algorithm in Figure 2(b) for a fixed , and in Figure 2(d) for a fixed , we notice a similar pattern. The empirical runtime of the 2P-EAD is consistently around , also possibly deviating by a factor of from the theoretical bound. Table 2 summarizes these observations. The 2P-EAD algorithm not only performs better in practice; the results also suggest that our theoretical bounds may be refined to predict the empirical outcomes more closely.
Algo. | | | | |
---|---|---|---|---|
 | Empirical | Theor. UB | Empirical | Theor. UB |
EAD | | | | |
2P | | | | |
Algorithm | Empirical | Theor. UB |
---|---|---|
EAD | ||
2P |
6 Conclusions
In this study, we explored the application of evolutionary algorithms (EAs) for maximizing diversity in solving the maximum matching problem in complete bipartite graphs and paths. Our methodology was structured into two distinct phases: a rigorous theoretical analysis followed by comprehensive empirical evaluations. We specifically looked at the -EAD and the Two-Phase Matching Evolutionary Algorithm (2P-EAD), finding that both could achieve maximal diversity in expected polynomial time, with 2P-EAD showing a speed advantage in all scenarios. Our findings not only underscore the utility of EAs in combinatorial diversity problems but also open up avenues for further research. A significant future direction would be to refine the theoretical upper bounds on these algorithms’ runtime. Additionally, applying these insights to other graph problems and exploring real-world applications could provide practical benefits.
Acknowledgements
This work has been supported by the Australian Research Council through grant DP190103894.
References
- [1] Alvarez, A., Dahlskog, S., Font, J.M., Togelius, J.: Empowering quality diversity in dungeon design with interactive constrained map-elites. In: IEEE Conference on Games, CoG 2019. pp. 1–8. IEEE (2019). https://doi.org/10.1109/CIG.2019.8848022
- [2] Bossek, J., Neumann, A., Neumann, F.: Breeding diverse packings for the knapsack problem by means of diversity-tailored evolutionary algorithms. In: Chicano, F., Krawiec, K. (eds.) GECCO ’21: Genetic and Evolutionary Computation Conference, Lille, France, July 10-14, 2021. pp. 556–564. ACM (2021). https://doi.org/10.1145/3449639.3459364
- [3] Bossek, J., Neumann, F.: Evolutionary diversity optimization and the minimum spanning tree problem. In: Chicano, F., Krawiec, K. (eds.) GECCO ’21: Genetic and Evolutionary Computation Conference, Lille, France, July 10-14, 2021. pp. 198–206. ACM (2021). https://doi.org/10.1145/3449639.3459363
- [4] Bossens, D.M., Tarapore, D.: QED: using quality-environment-diversity to evolve resilient robot swarms. IEEE Trans. Evol. Comput. 25(2), 346–357 (2021). https://doi.org/10.1109/TEVC.2020.3036578
- [5] Cully, A., Demiris, Y.: Quality and diversity optimization: A unifying modular framework. IEEE Trans. Evol. Comput. 22(2), 245–259 (2018). https://doi.org/10.1109/TEVC.2017.2704781
- [6] Do, A.V., Bossek, J., Neumann, A., Neumann, F.: Evolving diverse sets of tours for the travelling salesperson problem. In: Coello, C.A.C. (ed.) GECCO ’20: Genetic and Evolutionary Computation Conference, Cancún Mexico, July 8-12, 2020. pp. 681–689. ACM (2020). https://doi.org/10.1145/3377930.3389844
- [7] Do, A.V., Guo, M., Neumann, A., Neumann, F.: Analysis of evolutionary diversity optimization for permutation problems. ACM Trans. Evol. Learn. Optim. 2(3), 11:1–11:27 (2022). https://doi.org/10.1145/3561974
- [8] Do, A.V., Guo, M., Neumann, A., Neumann, F.: Diverse approximations for monotone submodular maximization problems with a matroid constraint. In: Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI 2023. pp. 5558–5566. ijcai.org (2023). https://doi.org/10.24963/IJCAI.2023/617
- [9] Doerr, B., Johannsen, D., Winzen, C.: Multiplicative drift analysis. Algorithmica 64(4), 673–697 (2012). https://doi.org/10.1007/S00453-012-9622-X
- [10] Friedrich, T., Horoba, C., Neumann, F.: Illustration of fairness in evolutionary multi-objective optimization. Theor. Comput. Sci. 412(17), 1546–1556 (2011). https://doi.org/10.1016/J.TCS.2010.09.023
- [11] Friedrich, T., Oliveto, P.S., Sudholt, D., Witt, C.: Analysis of diversity-preserving mechanisms for global exploration. Evol. Comput. 17(4), 455–476 (2009). https://doi.org/10.1162/EVCO.2009.17.4.17401
- [12] Gao, W., Nallaperuma, S., Neumann, F.: Feature-based diversity optimization for problem instance classification. Evol. Comput. 29(1), 107–128 (2021). https://doi.org/10.1162/EVCO_A_00274
- [13] Gao, W., Pourhassan, M., Neumann, F.: Runtime analysis of evolutionary diversity optimization and the vertex cover problem. In: Silva, S., Esparcia-Alcázar, A.I. (eds.) Genetic and Evolutionary Computation Conference, GECCO 2015, Companion Material Proceedings. pp. 1395–1396. ACM (2015). https://doi.org/10.1145/2739482.2764668
- [14] Giel, O., Wegener, I.: Evolutionary algorithms and the maximum matching problem. In: Alt, H., Habib, M. (eds.) STACS 2003, 20th Annual Symposium on Theoretical Aspects of Computer Science. Lecture Notes in Computer Science, vol. 2607, pp. 415–426. Springer (2003). https://doi.org/10.1007/3-540-36494-3_37
- [15] Gounder, S., Neumann, F., Neumann, A.: Evolutionary diversity optimisation for sparse directed communication networks. In: Genetic and Evolutionary Computation Conference, GECCO 2024. ACM (2024), to appear
- [16] Gravina, D., Khalifa, A., Liapis, A., Togelius, J., Yannakakis, G.N.: Procedural content generation through quality diversity. In: IEEE Conference on Games, CoG 2019, London, United Kingdom, August 20-23, 2019. pp. 1–8. IEEE (2019). https://doi.org/10.1109/CIG.2019.8848053
- [17] He, J., Yao, X.: A study of drift analysis for estimating computation time of evolutionary algorithms. Nat. Comput. 3(1), 21–35 (2004). https://doi.org/10.1023/B:NACO.0000023417.31393.C7
- [18] Mouret, J.B., Clune, J.: Illuminating search spaces by mapping elites. arXiv preprint arXiv:1504.04909 (2015)
- [19] Neumann, A., Antipov, D., Neumann, F.: Coevolutionary pareto diversity optimization. In: GECCO ’22: Genetic and Evolutionary Computation Conference. pp. 832–839. ACM (2022). https://doi.org/10.1145/3512290.3528755
- [20] Neumann, A., Bossek, J., Neumann, F.: Diversifying greedy sampling and evolutionary diversity optimisation for constrained monotone submodular functions. In: GECCO ’21: Genetic and Evolutionary Computation Conference. pp. 261–269. ACM (2021). https://doi.org/10.1145/3449639.3459385
- [21] Neumann, A., Gao, W., Doerr, C., Neumann, F., Wagner, M.: Discrepancy-based evolutionary diversity optimization. In: Proceedings of the Genetic and Evolutionary Computation Conference. pp. 991–998. ACM (2018). https://doi.org/10.1145/3205455.3205532
- [22] Neumann, A., Gao, W., Wagner, M., Neumann, F.: Evolutionary diversity optimization using multi-objective indicators. In: Proceedings of the Genetic and Evolutionary Computation Conference, GECCO 2019. pp. 837–845. ACM (2019). https://doi.org/10.1145/3321707.3321796
- [23] Neumann, A., Gounder, S., Yan, X., Sherman, G., Campbell, B., Guo, M., Neumann, F.: Diversity optimization for the detection and concealment of spatially defined communication networks. In: Proceedings of the Genetic and Evolutionary Computation Conference, GECCO 2023. pp. 1436–1444. ACM (2023). https://doi.org/10.1145/3583131.3590405
- [24] Neumann, F., Witt, C.: Bioinspired computation in combinatorial optimization: algorithms and their computational complexity. In: Blum, C., Alba, E. (eds.) Genetic and Evolutionary Computation Conference, GECCO ’13. pp. 567–590. ACM (2013). https://doi.org/10.1145/2464576.2466738
- [25] Nikfarjam, A., Bossek, J., Neumann, A., Neumann, F.: Computing diverse sets of high quality TSP tours by eax-based evolutionary diversity optimisation. In: FOGA ’21: Foundations of Genetic Algorithms XVI. pp. 9:1–9:11. ACM (2021). https://doi.org/10.1145/3450218.3477310
- [26] Nikfarjam, A., Bossek, J., Neumann, A., Neumann, F.: Entropy-based evolutionary diversity optimisation for the traveling salesperson problem. In: GECCO ’21: Genetic and Evolutionary Computation Conference. pp. 600–608. ACM (2021). https://doi.org/10.1145/3449639.3459384
- [27] Nikfarjam, A., Neumann, A., Neumann, F.: Evolutionary diversity optimisation for the traveling thief problem. In: GECCO ’22: Genetic and Evolutionary Computation Conference. pp. 749–756. ACM (2022). https://doi.org/10.1145/3512290.3528862
- [28] Pugh, J.K., Soros, L.B., Stanley, K.O.: Quality diversity: A new frontier for evolutionary computation. Frontiers Robotics AI 3, 40 (2016). https://doi.org/10.3389/FROBT.2016.00040
- [29] Ulrich, T., Bader, J., Thiele, L.: Defining and optimizing indicator-based diversity measures in multiobjective search. In: Schaefer, R., Cotta, C., Kolodziej, J., Rudolph, G. (eds.) Parallel Problem Solving from Nature - PPSN XI, 11th International Conference 2010, Proceedings, Part I. Lecture Notes in Computer Science, vol. 6238, pp. 707–717. Springer (2010). https://doi.org/10.1007/978-3-642-15844-5_71
- [30] Ulrich, T., Thiele, L.: Maximizing population diversity in single-objective optimization. In: Krasnogor, N., Lanzi, P.L. (eds.) 13th Annual Genetic and Evolutionary Computation Conference, GECCO 2011, Proceedings, Dublin, Ireland, July 12-16, 2011. pp. 641–648. ACM (2011). https://doi.org/10.1145/2001576.2001665
- [31] Vassiliades, V., Chatzilygeroudis, K., Mouret, J.B.: Using centroidal voronoi tessellations to scale up the multidimensional archive of phenotypic elites algorithm. IEEE Transactions on Evolutionary Computation 22(4), 623–630 (2017)
- [32] Zhang, H., Chen, Q., Xue, B., Banzhaf, W., Zhang, M.: Map-elites for genetic programming-based ensemble learning: An interactive approach [ai-explained]. IEEE Comput. Intell. Mag. 18(4), 62–63 (2023). https://doi.org/10.1109/MCI.2023.3304085