Spanning and Splitting: Integer Semidefinite Programming for the Quadratic Minimum Spanning Tree Problem
Abstract
In the quadratic minimum spanning tree problem (QMSTP) one wants to find the minimizer of a quadratic function over all possible spanning trees of a graph. We give two formulations of the QMSTP as mixed-integer semidefinite programs exploiting the algebraic connectivity of a graph. Based on these formulations, we derive a doubly nonnegative relaxation for the QMSTP and investigate classes of valid inequalities to strengthen the relaxation using the Chvátal-Gomory procedure for mixed-integer conic programming.
Solving the resulting relaxations is out of reach for off-the-shelf software. We therefore develop and implement a version of the Peaceman-Rachford splitting method that allows to compute the new bounds for graphs from the literature. The numerical results demonstrate that our bounds significantly improve over existing bounds from the literature in both quality and computation time, in particular for graphs with more than 30 vertices.
This work is further evidence that semidefinite programming is a valuable tool to obtain high-quality bounds for problems in combinatorial optimization, in particular for those that can be modelled as a quadratic problem.
Keywords: Combinatorial Optimization, Spanning Trees, Integer Semidefinite Programming, Algebraic Connectivity, Projection Methods
1 Introduction
The quadratic minimum spanning tree problem (QMSTP) is the problem of finding a spanning tree of a connected, undirected graph such that the sum of interaction costs over all pairs of edges in the tree is minimized. The QMSTP was introduced by Assad and Xu [1] in 1992. The adjacent-only quadratic minimum spanning tree problem (AQMSTP), that is, the QMSTP where the interaction costs of all non-adjacent edge pairs are assumed to be zero, is also introduced in [1]. Assad and Xu proved that both the QMSTP and AQMSTP are strongly -hard problems. Interestingly, the QMSTP remains -hard even when the cost matrix is of rank one [36].
There are many existing variants of the QMSTP problem, such as the minimum spanning tree problem with conflict pairs, the quadratic bottleneck spanning tree problem, and the bottleneck spanning tree problem with conflict pairs. For a description of those problems, see e.g., Ćustić et al. [12]. The QMSTP has various applications in telecommunication, transportation, energy and hydraulic networks, see e.g., [1, 7, 8].
There is a lot of research on lower-bounding approaches and exact algorithms for the QMSTP. The majority of lower bounding approaches for the QMSTP may be classified into Gilmore-Lawler (GL) type bounds [1, 11, 30, 37] and reformulation linearization technique (RLT) based bounds [34, 37]. The GL procedure is a well-known approach to construct lower bounds for quadratic binary optimization problems, see e.g., [18, 25]. The RLT is a method to derive a hierarchy of convex approximations of mixed-integer programming problems [38] where integer variables are binary. Lower bounding approaches based on an extended formulation of the minimum spanning tree problem are derived in [39]. For an overview of the above-mentioned lower bounding approaches and their comparison, see e.g., [39]. Semidefinite programming (SDP) lower bounds for the QMSTP are considered in [20]. SDP bounds incorporated in a branch-and-bound algorithm provide the best exact solution approach for the problem up to date [20]. Different exact approaches for solving the QMSTP are considered in [1, 11, 34, 33]. For a comparison of various heuristic approaches for solving the QMSTP, see Palubeckis et al. [31].
In this paper, we derive two mixed-integer semidefinite (MISDP) formulations for the QMSTP by exploiting the algebraic connectivity of a tree. Algebraic connectivity was also exploited in [13] and [14] to derive ISDP formulations for the traveling salesman problem (TSP) and the quadratic TSP, respectively. We prove that the continuous relaxation of the cut-set QMSTP formulation of the QMSTP is at least as strong as the continuous relaxations of MISDP formulations of the QMSTP. Further, we derive several classes of valid inequalities for our MISDPs by exploiting the Chvátal-Gomory (CG) procedure for mixed-integer conic programming [6, 14]. In particular, we show that the classical cut-set constraints and the first level RLT constraints are CG cuts. The cut-set constraints are derived from the linear matrix inequality (LMI) that is related to the algebraic connectivity of a tree. The RLT-type constraints are derived using two LMIs from the MISDP formulation of the QMSTP.
Our preliminary computational results show that the cut-set constraints have a small impact on the quality of our doubly nonnegative (DNN) relaxation of the QMSTP, but the RLT-type constraints improve the DNN bound. Therefore, we add RLT-type constraints to the DNN relaxation of the QMSTP. The resulting relaxation has a large number of constraints, and it is difficult to solve using state-of-the-art interior point methods. Therefore, we design a version of the Peaceman-Rachford splitting method (PRSM) that is able to handle a large number of cutting-planes efficiently. In particular, the PRSM algorithm is adding violated RLT-type inequalities iteratively while using warm-starts. The numerical results show that our bounds for the QMSTP outperform bounds from the literature in quality, as well as in computational time required to obtain them. Our approach shows significant improvement over other methods from the literature, particularly for larger instances, specifically, for graphs with more than 30 vertices.
Notation
The set of real symmetric matrices is denoted by . The space of symmetric matrices is considered with the trace inner product, which for any is defined as . The associated norm is the Frobenius norm . The cone of symmetric positive semidefinite matrices of order is defined as . We order the eigenvalues of as follows . If it is clear from the context to which matrix the eigenvalues relate, we denote eigenvalues by . The Hadamard product of two matrices and of the same size is denoted by , and defined as follows The operator maps a square matrix to a vector consisting of its diagonal elements. The adjoint operator of diag is denoted by .
We denote by the vector of all ones of length , and define . The indicator vector of is denoted by The all-zero matrix of order is denoted by . We use to denote the identity matrix of order , while its -th column is given by . In case the dimension of , , and is clear from the context, we omit the subscript.
We define the -simplex as and the capped -simplex as . By we denote the projection operator onto the set . We use to denote the set of integers
Given a subset of vertices in a graph , we denote the set of edges with both endpoints in by and the cut induced by by . However, when we define .
2 The Quadratic Minimum Spanning Tree Problem
In this section, we formally introduce the QMSTP. Let be a connected, undirected graph with vertices and edges. Let be a matrix of interaction costs between edges of , where represents the cost of edge .
The QMSTP can be formulated as the following binary quadratic programming problem:
where denotes the set of all spanning trees in . Each spanning tree in is represented by its incidence vector of length , and therefore
(1) |
The constraints of the type
(2) |
are known as the cut-set constraints, and they ensure connectivity of a subgraph from . If is a diagonal matrix then the QMSTP reduces to the minimum spanning tree problem that is solvable in polynomial time [24, 35].
Let us now fix an ordering for the edges . For define such that if and , and otherwise. Then the QMSTP can be formulated as the following mixed-integer programming problem, see e.g., [1]:
(3) |
One can verify that the constraints, in combination with the binarity of , are sufficient to obtain the coupling between and . Note that each row in is an incidence vector of a tree. The above model was introduced by Assad and Xu [1]. We refer to the above program the cut-set formulation of the QMSTP.
3 MISDP formulations for the QMSTP
In 1973, Fiedler [17] defined the algebraic connectivity, , of a graph as the second smallest eigenvalue of the Laplacian matrix of the graph. It is well-known that the algebraic connectivity is greater than zero if and only if is a connected graph. In this section, we will exploit the algebraic connectivity of a tree to derive two MISDP formulations of the QMSTP. We also prove that the continuous relaxations of our MISDP formulations are at least as strong as the continuous relaxation of the cut-set QMSTP formulation (3).
It is known that the algebraic connectivity for the graph class of trees with vertices lies in the interval between and , see e.g., [19]. Here, is the algebraic connectivity of the path graph, and is the algebraic connectivity of the star graph. It is also known that a tree on vertices has exactly edges. Hence, a tree can be characterized as a connected graph with exactly edges, see also (1). We use those facts to characterize trees by means of positive semidefiniteness.
Proposition 1.
Let be a simple graph on vertices with edges. Let be its Laplacian matrix and let with and . Then, is a tree if and only if
Proof.
Let be the eigenvalues of the Laplacian matrix . We denote by and for the eigenvectors of such that they form a basis of . The matrix has eigenvalue whose corresponding eigenvector is , and eigenvalue of multiplicity with eigenvectors for . Therefore, and from where it follows that the eigenvalues of are and for . Using the fact that we have and thus is positive semidefinite if and only if its eigenvalue is nonnegative.
Now suppose that is a tree. In this case, we know that holds. Therefore, we have that and thus .
On the other hand, if then . Since , it follows that and, thus is connected. As has vertices and edges, it is a tree. ∎
The previous result can be generalized for any graph as follows.
Proposition 2.
Let be a simple graph on vertices and be the Laplacian matrix of . Then if and only if .
Proof.
Let be the eigenvalues of , the Laplacian matrix of . The eigenvalues of are and . If , then all eigenvalues of are greater or equal than and therefore . Conversely, if , then all eigenvalues of greater or equal to and therefore . ∎
In the sequel, we exploit Proposition 1 to derive MISDP formulations for the QMSTP. Let us first define the set of adjacency matrices of induced subgraphs of with vertices and edges:
(4) |
The set of all adjacency matrices of spanning trees on vertices is:
(5) |
where and . There is a bijection , see (1), where maps to a column vector containing the entries of X corresponding to with respect to the fixed ordering of the edge set. Hence, the QMSTP can be written as the following MISDP problem:
(6a) | ||||
s.t. | (6b) | |||
(6c) | ||||
(6d) |
One can verify that the integrality of the matrix variable in (6) follows from the integrality of the matrix variable . Let us compare the continuous relaxations of (6) and the continuous relaxation of the cut-set QMSTP formulation (3). We first show the following result.
Proposition 3.
Let be a matrix such that , , , and . Then .
Proof.
The proof is similar to the proof of Statement 4.3. in [17]. ∎
It is not difficult to show that for a feasible for the continuous relaxation of the cut-set QMSTP formulation (3) one can construct a feasible pair for the continuous relaxation of (6). This leads us to the following result.
Corollary 1.
The continuous relaxation of the cut-set QMSTP formulation is at least as strong as the continuous relaxations of (6).
This result is not very surprising. Namely, Goemans and Rendl [40] show a similar result that relates the subtour elimination relaxation and an algebraic connectivity based SDP relaxation for the traveling salesman problem.
One can also formulate the QMSTP by exploiting theory on discrete PSD matrices from [15], i.e., the following result.
Theorem 1 ([15]).
Let with . Then, if and only if .
Now, by using the previous result, we formulate the QMSTP as the following MISDP:
(7a) | ||||
s.t. | (7b) | |||
(7c) | ||||
(7d) | ||||
(7e) |
where is given in (4). In (7), we do not impose integrality on the off-diagonal elements of as those follow by the integrality of , see [15]. Due to the integrality of , the constraints (7b) and (7c) ensure that . Now, by using the same arguments as earlier, one can show the following result.
Corollary 2.
The continuous relaxation of the cut-set QMSTP formulation is at least as strong as the continuous relaxations of (7).
It is difficult to directly compare the continuous relaxations of (6) and (7). However, by adding the constraint to (7) (which is redundant in the presence of integrality), we have the following result.
Corollary 3.
3.1 Valid inequalities
In this section, we derive Chvátal-Gomory cuts from the MISDP formulations of the QMSTP from the previous section. Some of those cuts coincide with well-known cuts from the literature. In particular, we show that the cut-set constraints (2) and some of the first level RLT constraints are CG cuts.
Let us first present a result that applies to any graph having several connected components.
Proposition 4.
Let be the Laplacian matrix of a graph on vertices, consisting of exactly connected components. Let be the partition of the vertices implied by these components. For each , let be the vector defined as
Then for all and .
Proof.
Recall the LMI from Proposition 2. We can write , . It is not difficult to verify that and for are eigenvectors of corresponding to the zero eigenvalue. It further holds that for all , and therefore, . This implies that is an eigenvector of corresponding to the eigenvalue , and therefore the above inequality holds. ∎
A similar result was obtained in [14] in the context of a directed node-disjoint cycle cover. Let us restate Proposition 4 in terms of the adjacency matrix of a graph.
Corollary 4.
Let be a graph with vertices consisting of connected components. Denote by the set of vertices in component . Let be the adjacency matrix of . Further, let for all and let . Then
for all and .
Proof.
The claim follows from Proposition 4, the fact that and using . ∎
We can now use the result of Corollary 4 to derive Chvátal-Gomory cuts for the QMSTP. Let , then we have the following CG cut:
where is defined as in Proposition 4. One can use the above cuts within a branch-and-cut framework to solve (6) and/or (7). In particular, those cuts may be used to separate matrices that are in , see (4), but not in , see (5).
In the sequel, we derive the cut-set constraints (2) as CG cuts. Let , and be feasible for (6) or (7). Then, for the PSD matrix we have that
(8) |
is a valid inequality for (6) and (7). After rewriting (8) and exploiting , we have Since the left-hand side of this inequality is integer, we may round the right-hand side, which results in the following CG cut which after rewriting the left-hand side results in the following inequality
(9) |
For and we have that , and the above CG cut implies the cut-set constraint , see also (2). Let us summarize the previous discussion.
Proposition 5.
Subsequently, we derive valid inequalities by exploiting the constraint (7c) that may be equivalently reformulated as . Let , be feasible for (7), , and be the indicator vector of . For , we define the following positive semidefinite matrix where the index corresponds to the ordering number of the edge , i.e., . Since , it follows that By rewriting the left-hand side, we have
from where it follows since due to the fact that the underlying graph is connected. Moreover, is the cut-set constraint that is a CG cut. Thus, we have the following constraints
(10) |
Interestingly, these constraints follow also from the reformulation-linearization technique [38] applied to the cut-set constraints (2) with . Namely, after multiplying both sides of (2) by and replacing by , one obtains the constraints (10). We refer later to the constraints (10) as RLT-type constraints.
4 DNN relaxation
Here, we derive two doubly nonnegative relaxations for the QMSTP and derive their facially reduced formulations.
To this end, instead of the matrix , we introduce a vector that results in relaxing to . We then use formulation (7) where we drop the linear matrix inequality (7d), and relax the constraint , see (7e), to . Furthermore, we add the constraint that can be derived from (7). Additionally, we impose nonnegativity constraints on the matrix variable, and obtain the following DNN relaxation:
(11a) | ||||
s.t. | (11b) | |||
(11c) | ||||
(11d) |
The above relaxation does not include the connectivity constraint (7d), because that constraint has only a small impact on the bound. However, it makes the relaxation more difficult to solve. In order to include a type of connectivity constraints in (11), we consider valid inequalities from Section 3.1. Preliminary numerical results show that by adding the cut-set constraints (2), see also Proposition 5, the resulting bound only marginally improves on the DNN bound (11). The RLT-type cuts (10), however, turn out to have a more positive impact on the bound value. We therefore present the following strengthening of the relaxation (11):
(12) |
In the remaining part of this section, we perform facial reduction of the DNN relaxations. Let and It is not difficult to verify that
(13) |
is an eigenvector corresponding to the zero eigenvalue of any matrix feasible for (11). Since there is no feasible matrix which is positive definite, the DNN relaxation (11) has no Slater feasible point.
To provide a facially reduced DNN relaxation of (11), let be a matrix whose columns form a basis for , see (13). As we will show in Theorem 3 later on in this section, the relaxation (11) may be equivalently written as the following facially reduced relaxation:
(14) |
We obtained this relaxation from (11) by replacing with and removing redundant constraints. Note that the feasible set of (11) is contained in , which is a face of . To show that (14) has an interior point, we use Theorem 3.15 from [22]. That theorem additionally takes into account a zero pattern in the feasible matrix, which is not present in our problem.
Theorem 2 (Theorem 3.15 in [22]).
Let where is a linear transformation, be the feasible set of a quadratically constrained program. Suppose with . Then, there exist a matrix with full row rank and such that
Let and be a matrix such that its columns form a basis of . Let and be its complement. Then, there exists a Slater point for the facially reduced, DNN feasible set:
We are now ready to state the following result on our facially reduced problem.
Proof.
Let
where with
Note that in the definition of , the equality models the constraint for all . The constraint models the constraint . For the indices , the constraint models the redundant constraint for all .
The convex hull equals . For each index there exist vectors such that and , hence, we get that the affine hull is and has dimension . Hence, where is from Theorem 2 and given in (13). Let be a matrix whose columns form a basis of the nullspace of . Then, a face of containing the feasible set of (11) is of the form . Therefore, one can replace with in (11).
5 Peaceman-Rachford splitting method for the QMSTP
Interior point solvers have difficulties computing our DNN relaxations for medium-sized problems in a reasonable time due to the large number of (inequality) constraints. Therefore, we use the Peaceman-Rachford splitting method (PRSM) for computing the bounds. The PRSM was first proposed in [32, 27] and is a symmetric variant of the alternating direction method of multipliers (ADMM). For more details and convergence results we refer to [21].
5.1 PRSM for solving the DNN relaxation
In this section, we outline the main steps of the Peaceman-Rachford splitting method for solving the DNN relaxation for the QMSTP (14).
Recall that the matrix should be such that its columns provide a basis for . For reasons explained later, we additionally require the columns of to be orthonormal. Therefore, we take as the matrix obtained from applying a QR decomposition to .
Now, we define the following sets
(15) | ||||
(16) |
and rewrite (14) as
(17) |
Note that we added redundant constraints to and , where the constraint holds, since the columns in are orthonormalized. Those redundant constraints help for the efficiency of the algorithm, see e.g., [16, 29, 26].
For a fixed penalty parameter , the augmented Lagrangian function of (17) w.r.t. the constraint is
The basic idea of the PRSM is to iteratively alternate between optimizing over and and updating the dual variable . The -th iteration of the PRSM to minimize the augmented Lagrangian function is
with step lengths and satisfying and , see [21]. The optimization problems occurring in this PRSM scheme can be simplified to projection problems. Namely, optimizing the augmented Lagrangian over can be simplified to
where we exploited the fact that the columns of are orthonormal. The projection of a matrix onto the set can be computed by projecting the eigenvalues of in the spectral decomposition onto the -simplex , see e.g., [26]. In more detail, let be the eigenvalue decomposition of with denoting the vector of eigenvalues of , then The projection onto the simplex can be performed efficiently. We refer to [9] for an overview of algorithms for projecting onto the simplex and their complexities.
Similarly, the optimization problem over the polyhedral set can be reformulated as
The projection onto can then be done in the following way
where and denotes the elementwise projection onto the interval .
5.2 PRSM for solving the strengthened DNN relaxation
In this subsection, we modify the previously described PRSM algorithm that solves the relaxation (14), so that it can handle additional RLT-type constraints.
Let us extend the set , see (16), by adding the RLT-type constraints, yielding
Thus, the strengthened DNN relaxation (17) is as follows
(18) |
The RLT-type constraints make the projection onto significantly harder. To the best of our knowledge, there is no closed-form expression for the projection onto . However, one may write as an intersection of sets that are easier to project on and then use an algorithm to project onto the intersection of convex sets. The cyclic Dykstra’s projection algorithm [5] is a suitable algorithm. An overview and analysis of algorithms to project onto the intersection of convex sets can be found in [2].
To apply Dykstra’s cyclic projection algorithm, let denote a coloring of the graph , i.e., is a partitioning of into independent sets of . We then define the polyhedral sets as
for . With this we can now rewrite as
The projection onto the sets can be performed independently over each row of and the corresponding entries of in . This allows us to restrict ourselves to projections onto the following type of sets
(19) |
where the first entries correspond to the -th row of and the last entry corresponds to . The projection onto can then be computed as presented in the following proposition.
Proposition 7.
Let , and let denote a coloring of . For each , we define and sort these values in non-increasing order, i.e., , where and is an appropriate sorting permutation. For each , let
where denotes the degree of vertex in . If for all , then , where for all and . Otherwise, let denote the largest index for which . Then, , where
Proof.
First, observe that if , then for all . Consequently, the projection of onto is given by , where is such that for all and .
If , then Hence, the largest index for which , i.e., the index , exists. Next, we prove that the projection is of the described form.
Using the fact that , the vector can be obtained as the solution of the following optimization problem, where we restrict to the support of the constraints in .
(20) |
Let , , denote the dual variables corresponding to the constraints of (20). We further denote by the two vertices in adjacent to . Then, the KKT optimality conditions for (20) are as follows
(21) | ||||
(22) | ||||
(23) | ||||
(24) | ||||
(25) |
It follows from (21) and (22) that we have
and | (26) | |||||
Suppose is the set of vertices for which at an optimal solution of (20). The complementary slackness constraints (24) then imply that for all and for . Note that since is an independent set in . By exploiting (26) and , these equations can be rewritten to
(27) |
for all . Summing the latter equations over all yields
or equivalently, After substitution into (27), we obtain
(28) |
for all . For each , we have . The inequalities (23) for these then read
or equivalently,
(29) |
By combining (28) and (29) we obtain the following optimality conditions on the dual variables concerning the indices in
(30) |
We conclude from the conditions (30) that the support of restricted to always consists of the vertices for which lies above a certain threshold value. To find this threshold value, we sort the ’s in non-increasing order and check all possible candidate sets for corresponding to the first entries in this sorted list. Let denote an according sorting permutation, i.e., is bijective and fulfills . For each candidate set , it suffices to check whether is strictly larger than the candidate threshold value
If is the largest index for which this holds, then this candidate set equals . The existence of such a is guaranteed by the existence of a solution to the projection problem (20).
It follows from Lemma 7 that the projection onto involves both a sorting and an enumeration of a list of elements. Hence, the worst-case time complexity is .
In fact, for computational purposes, we are not going to project on but iteratively add violated cuts only. For that, we denote by the set of violated cuts that we to add to , where an element represents the cut . We further define analogously to the polyhedral set
The projection follows the same idea as explained above for the projection onto , but in this case, instead of partitioning the vertex set into independent sets, we can partition the constraints in for each edge separately. For a fixed , we partition the vertices occurring together with in into independent sets . Note that the number of independent sets for an edge will probably be way smaller than the number of colors needed to color the whole graph, which can, in the worst case of a complete graph, be the number of vertices. Furthermore, as mentioned above, it is possible to project independently over each row , which allows us to parallelize this step. Hence, we cluster the cut constraints in for with and obtain where we can easily project onto using Proposition 7. A pseudocode for the Cyclic Dykstra projection algorithm to project onto can be found in Algorithm 1.
To compute the lower bound (18) with a PRSM algorithm, we first compute the DNN bound (17) with the PRSM, as explained in the previous subsection. Then, we separate violated cuts from the current solution and add the ncutsmax most violated ones to . We then proceed to compute (17) with the additional new cuts in with the PRSM and use the solution from before for a warm-start. This process of separating and adding new cuts to in an outer loop is iterated until one of the stopping criteria is met. Algorithm 2 provides a pseudocode for the described algorithm.
5.3 Stopping criteria and post-processing of the PRSM algorithm
In this subsection, we briefly discuss the stopping criteria and the post-processing phase of our PRSM algorithm.
Stopping criteria
We use several criteria to decide when to stop the inner and outer iterations of Algorithm 2. The main stopping criteria for the inner while loop is when the primal and dual errors satisfy
cf. [4]. We further stop the inner iterations when the maximum number of total PRSM iterations or a time limit is reached. In that case, we compute a valid dual bound as described below, and stop the algorithm.
For the outer loop, we have the following possible stopping criteria. If an upper bound is known, the algorithm stops as soon as the obtained valid lower bound closes the gap. We further stop the algorithm if the number of new violated cuts found is below a certain threshold ncutsmin. If the improvement of the valid lower bound compared to the valid lower bound of the previous outer iteration is smaller than epslbimprov, we stop the algorithm as well. And finally, we stop after a maximum of noutermax outer iterations.
Valid lower bound
The value obtained as an output of Algorithm 2 does not necessarily provide a lower bound for the problem, as the convergence of the PRSM is typically not monotonic, and one stops the algorithm earlier. Therefore, it is necessary to perform a postprocessing procedure to obtain a valid lower bound. We apply the approach presented in [26]. The safe lower bound derived by this method is then given by
where denotes the dual matrix variable resulting from (an early stop of) the PRSM. The computation of this lower bound boils down to computing the largest eigenvalue and solving a linear program. Similarly, one can obtain a valid lower bound from the PRSM algorithm that solves (17), by replacing with , see (16), in the above expression.
6 Numerical results
We implemented777The code can be found on https://github.com/melaniesi/QMST.jl and as ancillary files on the arXiv page of this paper. our algorithm in Julia [3] version 1.10.0. For solving the linear program to compute a valid lower bound, we are using the solver HiGHS [23] with the modeling language JuMP [28]. The projection onto is multithreaded. All computations were carried out on an AMD EPYC 7343 with 16 cores with 4.00GHz and 1024GB RAM, operated under Debian GNU/Linux 11.
Parameter setting
We initialize the matrices, penalty parameters, and step lengths as follows. As starting values for the matrices, we choose and
Based on the results of numerical tests, we have determined the values for the penalty parameter and step lengths. We set the step length parameters to , . For the penalty parameter, let and , we then set
We run our algorithm for all instances with and the parameter is set to . Violated cuts are considered if the violation is greater than and after each outer iteration, the ncutsmax = most violated cuts are added. No further cuts are added if the improvement of the lower bound is smaller than epslbimprov = or the number of new violated cuts found is less than ncutsmin = 10. The maximum wall-clock time for running our algorithm is set to 3 hours per instance, and the maximum number of total iterations is set to . We set the number of maximum outer iterations to noutermax = 10.
Benchmark instances
We test our algorithm on the following three benchmark sets. The first benchmark set OP was introduced in [30] by Öncan and Punnen. The benchmark set consists of 3 different classes, each consisting of 160 instances on complete graphs: the OPsym, OPvsym and OPesym instances. The OPsym instances have diagonal entries chosen uniformly from , and the off-diagonal values are uniformly distributed at random in . For instances in the class OPvsym, the diagonal values are uniformly distributed in , and the off-diagonal values are computed as , where assigns to each vertex in the graph a uniformly distributed weight at random in . The cost matrix for instances of the type OPesym is constructed in the following way. First, the vertex coordinates are randomly chosen in the box , and the edges are represented as straight lines connecting vertices. The edge cost is then set as the length of the edge , and the interaction cost between two edges and is computed as the Euclidean distance between the midpoints of and . For each of those test sets, they randomly generated 10 instances each for . We do not include the benchmark instances of type OPesym and in our study, as we were unable to locate the correct instances888 In the benchmark set https://data.mendeley.com/datasets/cmnh9xc6wb/1, the instances indicated as type OPesym for are the OPvsym for ..
The second family of benchmark instances CP was introduced by Cordone and Passeri in [10]. The benchmark set consists of 108 instances divided into 4 classes, specifying the sets from which the diagonal and off-diagonal values of the cost matrix are chosen uniformly at random. For each pair of the number of vertices , density and class, one random graph was generated. The values of the cost matrix are uniformly distributed on the sets as listed below.
class | CP1 | CP2 | CP3 | CP4 |
---|---|---|---|---|
diagonal values | [10] | [10] | [100] | [100] |
off-diagonal values | [10] | [100] | [10] | [100] |
The last benchmark set SV was introduced by Sotirov and Verchére in their recent paper [39]. It consists of 24 instances, with one random graph for each pair of and . They constructed the cost matrices in such a way that for a given maximum cost for the diagonal entries, and a maximum cost for the off-diagonal entries, 10% of the edges have high interaction costs with each other (between 90 and 100% of the maximum off-diagonal cost) and low interaction costs with the rest (between 20 and 40% of the maximum off-diagonal cost). The other 90% of edges have an interaction cost of between 50 and 70% of the maximum off-diagonal cost with each other. The diagonal entries are chosen to be between 0 and 20% of the maximum diagonal cost.
Bounds from the literature
We compare our numerical results to lower bounds from [20, 34, 39]. The upper bounds on the benchmark instances are taken from the literature.
The bounds from [20], called LAGN and LAGP, are used in the to-date best exact algorithm for the QMSTP. Those bounds are obtained from two different ways of dualizing an SDP relaxation of QMSTP. For LAGN, the semidefiniteness constraint is dualized, and a subgradient method is used to compute the optimum. Whereas for computing LAGP, there is no semidefiniteness constraint present, but a semi-infinite reformulation together with polyhedral cutting planes is solved using a bundle method.
The lower bounds VS1 and VS2 were introduced by Sotirov and Verchére in [39]. These lower bounds are based on an extended formulation of the minimum quadratic spanning tree problem and are strengthened by facet defining inequalities of the Boolean Quadric polytope. The lower bound VS2 is stronger than VS1.
Pereira et al. [34] solved several benchmark problems of sizes up to 50 vertices using a RLT based relaxation RLT1. RLT1 is an incomplete first level RLT relaxation and is computed by dualizing the symmetry constraint, applying the GL procedure, and using a subgradient algorithm. Another RLT based bound among the strongest relaxations in the literature is RLT2, presented in [37]. The authors of [37] use a dual-ascent procedure for computing their relaxation based on the second-level of RLT.
Computational results
We first present a comparison of our algorithm to the results from [20], where the authors also compute SDP bounds. Their computations were carried out on a machine with 32 GB RAM and two E5645 Intel Xeon processors, with six 2.40GHz cores each.
The structure of Section 6 is analogous to Table 4 in [20] and reads as follows. The rows are grouped into 3 blocks, each reporting the results averaged over all CP instances with the same property as specified in the first column of the table. The first block of rows averages over instances of the same size, the second averages the results over the densities of the graphs, and the last block averages over the different classes of the CP instances. In the second column of Section 6, we report the average gap obtained by the valid lower bound obtained with our PRSM algorithm when stopping after the first outer iteration, cf. (17). We compute the relative gap between that lower bound () and the best known upper bound (UB) from the literature using . We remark here that the same gap was calculated in Guimarães et al. [20].999There was a typo in that paper that claims differently, but our statement can be easily verified. In the third column, we report the average wall clock time in seconds needed to compute this lower bound. In column 4, we report the average gap obtained by the bound returned by Algorithm 2, cf. (18), and in column 5, the average time needed to compute this bound. In the sixth and seventh column, we list the average gaps and computation times for the bound LAGN of [20], which is used in the best up-to-date exact algorithm for the QMSTP. The average gaps and computation times of LAGP, the second lower bound introduced in [20], are given in the last two columns of Section 6.
The results in Section 6 show that for the CP instances, our lower bounds are, on average, significantly stronger than the SDP bounds LAGN and LAGP. Except for the instances with , the average computation times for solving our relaxations are smaller than those reported for computing SDP bounds LAGN and LAGP. The average time to compute the DNN + CUTS bound, that is (18), over all CP instances is seconds, compared to and seconds for LAGN and LAGP, respectively. More significant difference in the computation times and relative gaps can be seen for larger instances. One can also observe that the less dense the instances are, the smaller the average relative gap. Furthermore, the effect of adding cuts is more significant for sparse graphs than for dense graphs. Guimarães et al. [20, Table 4] compare their bounds to RLT1 [34], which can be computed approximately three times faster than LAGN but yields much weaker bounds. The average gap of bound RLT2 [37] over all instances of size for each of the four CP classes is at least three times larger than our reported average gaps for (17). Overall, Section 6 shows that, especially for larger CP instances, our bounds are significantly stronger and faster to compute than any other bounds.
In the LABEL:tab:CP1, LABEL:tab:CP2, LABEL:tab:CP3, LABEL:tab:CP4 and LABEL:tab:SV we report the numerical results for all benchmark instances of the test sets CP and SV. The first four columns give details about the instance as the number of vertices, the edge density, the number of edges and an upper bound on the QMST. The next three columns report the valid lower bound (17) obtained after the first outer loop of our PRSM algorithm, the relative gap to the upper bound , and the wall clock time in seconds needed to compute that bound. The last six columns outline the numerical results of our algorithm to compute (18). In columns 8 to 10, we provide the valid lower bound returned by our algorithm, the relative gap, and the wall clock time needed to compute the lower bound. The next two columns list the total number of iterations and the total number of cuts added. In the last column, we report the relative gap closed by adding the RLT-type cuts to the DNN relaxation (17). This performance measurement is computed as where LBDNN refers to the lower bound (17) reported in column 5 and LBDNN+CUTS is the lower bound (18) reported in column 8 in each table. This metric gives information on how much the gap to the upper bound was improved.
LABEL:tab:CP1, LABEL:tab:CP2, LABEL:tab:CP3 and LABEL:tab:CP4 show that especially for CP instances with vertices and edge density 100% there were only a few violated cuts found. Hence, the relative improvement of the DNN relaxation by adding those cuts was only marginal. One can further observe that the improvement of the relative gap and the relative gap closed, is better for smaller instances. For larger instances, adding cuts such as the RLT-type of the cut-set constraints for subsets of size 2 and larger, might further improve the DNN bounds.
LABEL:tab:SV presents the results of our algorithm for the benchmark set SV introduced in [39]. To the best of our knowledge, there are no results on LAGN, LAGP, and RLT2 for this benchmark set. The by far best lower bound up to date for the SV instance set was VS2. Our DNN relaxation bound without cuts outperforms VS2 for all instances, with the number of edges , except for the instance with and . Both our relaxations yield a relative gap of less than 1%. The relative gap of VS2 ranges between 0 and 16.4%. The maximum runtime to compute the DNN bound for these instances is less than 5 seconds, whereas computation time for bound VS2 of and was reported to be 45 minutes. Computing the DNN bound with cuts is faster than the reported time to compute VS2 for all instances with more than 80 edges.
LABEL:tab:OPsym, LABEL:tab:OPesym and LABEL:tab:OPvsym read similarly to the tables for the CP and SV benchmark sets but the results are averaged over all instances of the same size. Again, to the best of our knowledge, we are not aware of any detailed and complete results for LAGN and LAGP on the OP benchmark set.
LABEL:tab:OPsym reports the results obtained for the benchmark set OPsym. The lower bound (18) with cuts outperforms VS2 for , and RLT2 for with the exception of , where the average relative gap for RLT2 is reported to be 33% and is 33.41% for the DNN bound with cuts. For , no bounds were reported. One can observe that the absolute improvement by adding RLT cuts to (17) for is approximately 20.
LABEL:tab:OPesym shows that for the benchmark set OPesym adding the RLT-type cuts to (17) yields a substantial improvement of the relative gap. The DNN lower bound with cuts yields better bounds compared to VS2 but is clearly dominated by RLT1, giving an average relative gap between and for instances with .
The authors of [39] report that the relative gap of the VS1 lower bound is less than or equal 0.2% for all instances of the class OPvsym. Although, on average, not many violated cuts to be added were found, the averaged relative bound closed is above 49% for all instances except that with , where on average only 0.5 violated cuts were found. Considering the instances with , the average relative bound closed is even above 80%.
The time limit of 3 hours was reached by all instances from OPesym and OPvsym of size and almost all of those instances of size . The higher computational costs for those two classes of benchmark instances can be explained, among other things, by the high number of clusters , cf. Section 5.2. The number of clusters has a direct effect on the computation time of Dykstra’s algorithm, which accounts for a substantial part of the overall computation time. The average number of clusters needed for the OPvsym and OPesym instances are 6.43 and 6.38, whereas the average over all other benchmark instances is 3.26. Note that for those two classes of instances, added RTL-type constraints significantly improve lower bounds. Additionally, as for the CP3 instances, one can observe the higher number of iterations until convergence of the algorithm compared to other classes in our benchmark sets.
This study | Guimarães et al. [20] | |||||||
DNN | DNN + CUTS | LAGN | LAGP | |||||
gap (%) | time (s) | gap (%) | time (s) | gap (%) | time (s) | gap (%) | time (s) | |
\csvreader[head to column names, late after line = | ||||||||
\lagptime \csvreader[head to column names, late after line = | \dnntime | \lbgapp | \lbtime | \lagngap | \lagntime | \lagpgap | ||
\lagptime \csvreader[head to column names, late after line = | \dnntime | \lbgapp | \lbtime | \lagngap | \lagntime | \lagpgap | ||
\lagptime | \dnntime | \lbgapp | \lbtime | \lagngap | \lagntime | \lagpgap |
7 Conclusion
This paper provides two mixed-integer semidefinite programming formulations for the quadratic minimum spanning tree problem. Each of these formulations includes only one connectivity constraint, which is a linear matrix inequality based on the algebraic connectivity of trees. By exploiting the MISDP formulations, we derive a DNN relaxation for the QMSTP. We also derive the cut-set and RLT-type constraints as Chvátal-Gomory cuts of the MISDP by applying a CG procedure for mixed integer conic programming. The RLT-type constraints are added to the DNN relaxation, resulting in a strengthened DNN relaxation. An iterative cutting plane Peaceman-Rachford splitting method is designed to compute the DNN relaxation with the RLT-type constraints of the QMSTP efficiently.
The computational experiments on the benchmark instances from the literature demonstrate that our bounds significantly outperform existing bounds both in quality and computation time. While other approaches struggled to compute bounds for larger instances, we compute strong bounds in short time.
Given these results, incorporating our new bounds in a branch-and-bound algorithm would be the obvious next step for further research. Another topic for future research would be to incorporate additional RLT-type cut-set constraints to further strengthen the DNN relaxation.
Instance | DNN | DNN + CUTS | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
(%) | UB | LB | gap (%) | time (s) | LB | gap (%) | time (s) | iterations | cuts | closed (%) | ||
\csvreader[head to column names, late after line = | ||||||||||||
\relgapclosed |
Instance | DNN | DNN + CUTS | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
(%) | UB | LB | gap (%) | time (s) | LB | gap (%) | time (s) | iterations | cuts | closed (%) | ||
\csvreader[head to column names, late after line = | ||||||||||||
\relgapclosed |
Instance | DNN | DNN + CUTS | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
(%) | UB | LB | gap (%) | time (s) | LB | gap (%) | time (s) | iterations | cuts | closed (%) | ||
\csvreader[head to column names, late after line = | ||||||||||||
\relgapclosed |
Instance | DNN | DNN + CUTS | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
(%) | UB | LB | gap (%) | time (s) | LB | gap (%) | time (s) | iterations | cuts | closed (%) | ||
\csvreader[head to column names, late after line = | ||||||||||||
\relgapclosed |
Instance | DNN | DNN + CUTS | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
(%) | UB | LB | gap (%) | time (s) | LB | gap (%) | time (s) | iterations | cuts | closed (%) | ||
\csvreader[head to column names, late after line = | ||||||||||||
\relgapclosed |
Instance | DNN | DNN + CUTS | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
UB | LB | gap (%) | time (s) | LB | gap (%) | time (s) | iterations | cuts | closed (%) | ||
\csvreader[head to column names, late after line = | |||||||||||
\relgapclosed |
Instance | DNN | DNN + CUTS | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
UB | LB | gap (%) | time (s) | LB | gap (%) | time (s) | iterations | cuts | closed (%) | ||
\csvreader[head to column names, late after line = | |||||||||||
\relgapclosed |
Instance | DNN | DNN + CUTS | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
UB | LB | gap (%) | time (s) | LB | gap (%) | time (s) | iterations | cuts | closed (%) | ||
\csvreader[head to column names, late after line = | |||||||||||
\relgapclosed |
References
- [1] Arjang Assad and Weixuan Xu. The quadratic minimum spanning tree problem. Naval Res. Logist., 39(3):399–417, 1992.
- [2] Heinz H. Bauschke and Valentin R. Koch. Projection methods: Swiss army knives for solving feasibility and best approximation problems with halfspaces. In Infinite products of operators and their applications, volume 636 of Contemp. Math., pages 1–40. AMS, Providence, RI, 2015.
- [3] Jeff Bezanson, Alan Edelman, Stefan Karpinski, and Viral B Shah. Julia: A fresh approach to numerical computing. SIAM Review, 59(1):65–98, 2017.
- [4] Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato, and Jonathan Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends® in Machine Learning, 3(1):1–122, 2011.
- [5] James P. Boyle and Richard L. Dykstra. A method for finding projections onto the intersection of convex sets in hilbert spaces. In Richard Dykstra, Tim Robertson, and Farroll T. Wright, editors, Advances in Order Restricted Statistical Inference, pages 28–47, New York, NY, 1986.
- [6] M. Tolga Çezik and Garud N. Iyengar. Cuts for mixed 0-1 conic programming. Math. Program., 104:179–202, 2005.
- [7] Tzu-Chiang Chiang, Chien-Hung Liu, and Yueh-Min Huang. A near-optimal multicast scheme for mobile ad hoc networks using a hybrid genetic algorithm. Expert Syst. Appl., 33(3):734 – 742, 2007.
- [8] Wushow Chou and Aaron Kershenbaum. A unified algorithm for designing multidrop teleprocessing networks. IEEE Trans. Commun., 22:1762–1772, 1974.
- [9] Laurent Condat. Fast projection onto the simplex and the ball. Math. Program., 158(1–2):575–585, 2016.
- [10] Roberto Cordone and Gianluca Passeri. Heuristic and exact approaches to the quadratic minimum spanning tree problem. In Seventh Cologne Twente Workshop on Graphs and Comb. Opt., Gargano, Italy, 13-15 May, 2008, pages 52–55. University of Milan, 2008.
- [11] Roberto Cordone and Gianluca Passeri. Solving the quadratic minimum spanning tree problem. Appl. Math. Comput., 218:11597–11612, 2012.
- [12] Ante Ćustić, Ruonan Zhang, and Abraham P. Punnen. The quadratic minimum spanning tree problem and its variations. Discrete Optim., 27:73–87, 2018.
- [13] D. Cvetković, M. Čangalović, and V. Kovačević-Vujčić. Semidefinite programming methods for the symmetric traveling salesman problem. In G. Cornuj́ols, R.E. Burkard, and G.J. Woeginger, editors, Integer programming and Combinatorial Optimization (IPCO 1999), volume 1610 of Lecture Notes in Comput. Sci. Springer, Berlin, Heidelberg, 1999.
- [14] Frank de Meijer and Renata Sotirov. The Chvátal-Gomory procedure for integer SDPs with applications in combinatorial optimization. Math. Program., 2024.
- [15] Frank de Meijer and Renata Sotirov. On integrality in semidefinite programming for discrete optimization. SIAM J. Optim., 34(1), 2024.
- [16] Frank de Meijer, Renata Sotirov, Angelika Wiegele, and Shudian Zhao. Partitioning through projections: Strong SDP bounds for large graph partition problems. Comput. Oper. Res., 151:106088, 2023.
- [17] Miroslav Fiedler. Algebraic connectivity of graphs. Czechoslov. Math. J., 23(2):298–305, 1973.
- [18] Paul C. Gilmore. Optimal and suboptimal algorithms for the quadratic assignment problem. J Soc Ind Appl Math, 10(2):305–313, 1962.
- [19] Robert Grone and Russell Merris. Ordering trees by algebraic connectivity. Graphs Combin., 6:229–237, 1990.
- [20] Dilson A. Guimarães, Alexandre S. da Cunha, and Dilson L. Pereira. Semidefinite programming lower bounds and branch-and-bound algorithms for the quadratic minimum spanning tree problem. European J. of Oper. Res., 280(1):46–58, 2020.
- [21] Bingsheng He, Feng Ma, and Xiaoming Yuan. Convergence study on the symmetric version of ADMM with larger step sizes. SIAM J. on Imaging Sci., 9(3):1467–1501, 2016.
- [22] Hao Hu, Renata Sotirov, and Henry Wolkowicz. Facial reduction for symmetry reduced semidefinite and doubly nonnegative programs. Math. Program., 200(1):475–529, 2023.
- [23] Qi Huangfu and J. A. Julian Hall. Parallelizing the dual revised simplex method. Math. Program. Comput., 10(1):119–142, 2018.
- [24] Joseph B. Kruskal. On the shortest spanning subtree of a graph and the traveling salesman problem. In Proc. Amer. Math. Soc., 7, 1956.
- [25] Eugene L. Lawler. The quadratic assignment problem. Manag. Sci., 9(4):586–599, 1963.
- [26] Xinxin Li, Ting Kei Pong, Hao Sun, and Henry Wolkowicz. A strictly contractive Peaceman-Rachford splitting method for the doubly nonnegative relaxation of the minimum cut problem. Comput. Optim. Appl., 78(3):853–891, 2021.
- [27] Pierre-Louis Lions and Bertrand Mercier. Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal., 16(6):964–979, 1979.
- [28] Miles Lubin, Oscar Dowson, Joaquim Dias Garcia, Joey Huchette, Benoît Legat, and Juan Pablo Vielma. JuMP 1.0: Recent improvements to a modeling language for mathematical optimization. Math. Program. Comput., 15:581–589, 2023.
- [29] Danilo E. Oliveira, Henry Wolkowicz, and Yangyang Xu. ADMM for the SDP relaxation of the QAP. Math. Program. Comput., 10:631–658, 2018.
- [30] Temel Öncan and Abraham P. Punnen. The quadratic minimum spanning tree problem: A lower bounding procedure and an efficient search algorithm. Comput. Oper. Res., 37(10):176–1773, 2010.
- [31] Gintaras Palubeckis, Dalius Rubliauskas, and Aleksandras Targamadz. Metaheuristic approaches for the quadratic minimum spanning tree problem. Inform. Tech. Control, 29:257––268, 2010.
- [32] Donald W. Peaceman and Henry H. Rachford. The numerical solution of parabolic and elliptic differential equations. SIAM J. Appl. Math., 3(1):28–41, 1955.
- [33] Dilson L. Pereira, Michel Gendreau, and Alexandre S. da Cunha. Branch-and-cut and branch-and-cut-and-price algorithms for the adjacent only quadratic minimum spanning tree problem. Networks, 65:367–379, 2015.
- [34] Dilson L. Pereira, Michel Gendreau, and Alexandre S. da Cunha. Lower bounds and exact algorithms for the quadratic minimum spanning tree problem. Comput. Oper. Res., 63:149 – 160, 2015.
- [35] Robert C. Prim. Shortest connection networks and some generalizations. The Bell Systems Technical Journal, 36(6):1389–1401, 1957.
- [36] Abraham P. Punnen. Combinatorial optimization with multiplicative objective function. Int. J. Oper. Quant. Manag., 7:205–209, 2001.
- [37] Borzou Rostami and Federico Malucelli. Lower bounds for the quadratic minimum spanning tree problem based on reduced cost computation. Comput. Oper. Res., 64:178–188, 2015.
- [38] Hanif D. Sherali and Warren P. Adams. A reformulation-linearization technique for solving discrete and continuous nonconvex problems, volume 31. SSBM, 2013.
- [39] Renata Sotirov and Zoe Verchére. The quadratic minimum spanning tree problem: Lower bounds via extended formulations. Vietnam J. Math., 2024.
- [40] Henry Wolkowicz, Romesh Saigal, and Lieven Vandenberghe. Handbook of Semidefinite Programming: Theory, Algorithms, and Applications. Internat. Ser. Oper. Res. Management Sci. Springer, 2000.