Spanning and Splitting: Integer Semidefinite Programming for the Quadratic Minimum Spanning Tree Problem

Frank de Meijer (Delft Institute of Applied Mathematics, Delft University of Technology, Mekelweg 4, 2628 CD Delft, The Netherlands, f.j.j.demeijer@tudelft.nl; corresponding author), Melanie Siebenhofer (Institut für Mathematik, Alpen-Adria-Universität Klagenfurt, Universitätstraße 65-67, 9020 Klagenfurt, melanie.siebenhofer@aau.at), Renata Sotirov (Tilburg University, Department of Econometrics & Operations Research, CentER, 5000 LE Tilburg, r.sotirov@tilburguniversity.edu), Angelika Wiegele (Institut für Mathematik, Alpen-Adria-Universität Klagenfurt, Universitätstraße 65-67, 9020 Klagenfurt, angelika.wiegele@aau.at; Universität zu Köln, Weyertal 86–90, 50931 Köln, Germany)

This research was funded in part by the Austrian Science Fund (FWF) [10.55776/DOC78]. For open access purposes, the authors have applied a CC BY public copyright license to any author-accepted manuscript version arising from this submission.

Abstract

In the quadratic minimum spanning tree problem (QMSTP) one wants to find the minimizer of a quadratic function over all possible spanning trees of a graph. We give two formulations of the QMSTP as mixed-integer semidefinite programs exploiting the algebraic connectivity of a graph. Based on these formulations, we derive a doubly nonnegative relaxation for the QMSTP and investigate classes of valid inequalities to strengthen the relaxation using the Chvátal-Gomory procedure for mixed-integer conic programming.

Solving the resulting relaxations is out of reach for off-the-shelf software. We therefore develop and implement a version of the Peaceman-Rachford splitting method that allows us to compute the new bounds for graphs from the literature. The numerical results demonstrate that our bounds significantly improve over existing bounds from the literature in both quality and computation time, in particular for graphs with more than 30 vertices.

This work is further evidence that semidefinite programming is a valuable tool to obtain high-quality bounds for problems in combinatorial optimization, in particular for those that can be modelled as a quadratic problem.

Keywords: Combinatorial Optimization, Spanning Trees, Integer Semidefinite Programming, Algebraic Connectivity, Projection Methods

1 Introduction

The quadratic minimum spanning tree problem (QMSTP) is the problem of finding a spanning tree of a connected, undirected graph such that the sum of interaction costs over all pairs of edges in the tree is minimized. The QMSTP was introduced by Assad and Xu [1] in 1992. The adjacent-only quadratic minimum spanning tree problem (AQMSTP), that is, the QMSTP where the interaction costs of all non-adjacent edge pairs are assumed to be zero, is also introduced in [1]. Assad and Xu proved that both the QMSTP and AQMSTP are strongly $\mathcal{NP}$-hard problems. Interestingly, the QMSTP remains $\mathcal{NP}$-hard even when the cost matrix is of rank one [36].

There are many variants of the QMSTP, such as the minimum spanning tree problem with conflict pairs, the quadratic bottleneck spanning tree problem, and the bottleneck spanning tree problem with conflict pairs. For a description of those problems, see e.g., Ćustić et al. [12]. The QMSTP has various applications in telecommunication, transportation, energy and hydraulic networks, see e.g., [1, 7, 8].

There is a substantial body of research on lower-bounding approaches and exact algorithms for the QMSTP. The majority of lower-bounding approaches for the QMSTP may be classified into Gilmore-Lawler (GL) type bounds [1, 11, 30, 37] and reformulation linearization technique (RLT) based bounds [34, 37]. The GL procedure is a well-known approach to construct lower bounds for quadratic binary optimization problems, see e.g., [18, 25]. The RLT is a method to derive a hierarchy of convex approximations of mixed-integer programming problems [38] where integer variables are binary. Lower-bounding approaches based on an extended formulation of the minimum spanning tree problem are derived in [39]. For an overview of the above-mentioned lower-bounding approaches and their comparison, see e.g., [39]. Semidefinite programming (SDP) lower bounds for the QMSTP are considered in [20]. SDP bounds incorporated in a branch-and-bound algorithm provide the best exact solution approach for the problem to date [20]. Different exact approaches for solving the QMSTP are considered in [1, 11, 34, 33]. For a comparison of various heuristic approaches for solving the QMSTP, see Palubeckis et al. [31].

In this paper, we derive two mixed-integer semidefinite programming (MISDP) formulations for the QMSTP by exploiting the algebraic connectivity of a tree. Algebraic connectivity was also exploited in [13] and [14] to derive integer semidefinite programming (ISDP) formulations for the traveling salesman problem (TSP) and the quadratic TSP, respectively. We prove that the continuous relaxation of the cut-set formulation of the QMSTP is at least as strong as the continuous relaxations of our MISDP formulations. Further, we derive several classes of valid inequalities for our MISDPs by exploiting the Chvátal-Gomory (CG) procedure for mixed-integer conic programming [6, 14]. In particular, we show that the classical cut-set constraints and the first level RLT constraints are CG cuts. The cut-set constraints are derived from the linear matrix inequality (LMI) that is related to the algebraic connectivity of a tree. The RLT-type constraints are derived using two LMIs from the MISDP formulation of the QMSTP.

Our preliminary computational results show that the cut-set constraints have a small impact on the quality of our doubly nonnegative (DNN) relaxation of the QMSTP, but the RLT-type constraints improve the DNN bound. Therefore, we add RLT-type constraints to the DNN relaxation of the QMSTP. The resulting relaxation has a large number of constraints, and it is difficult to solve using state-of-the-art interior point methods. Therefore, we design a version of the Peaceman-Rachford splitting method (PRSM) that is able to handle a large number of cutting planes efficiently. In particular, the PRSM algorithm adds violated RLT-type inequalities iteratively while using warm starts. The numerical results show that our bounds for the QMSTP outperform bounds from the literature in quality, as well as in the computational time required to obtain them. The improvement over other methods from the literature is most pronounced for larger instances, specifically for graphs with more than 30 vertices.

Notation

The set of $n\times n$ real symmetric matrices is denoted by ${\mathcal{S}}^{n}$ . The space of symmetric matrices is considered with the trace inner product, which for any $X,Y\in{\mathcal{S}}^{n}$ is defined as $\langle X,Y\rangle\coloneqq\text{tr}(XY)$ . The associated norm is the Frobenius norm $\|X\|_{F}\coloneqq\sqrt{\text{tr}(XX)}$ . The cone of symmetric positive semidefinite matrices of order $n$ is defined as ${\mathcal{S}}_{+}^{n}\coloneqq\{X\in{\mathcal{S}}^{n}:X\succeq\mathbf{0}\}$ . We order the eigenvalues of $X\in{\mathcal{S}}^{n}$ as follows $\lambda_{1}(X)\leq\cdots\leq\lambda_{n}(X)$ . If it is clear from the context to which matrix the eigenvalues relate, we denote eigenvalues by $\lambda_{i}$ . The Hadamard product of two matrices $X=(x_{ij})$ and $Y=(y_{ij})$ of the same size is denoted by $\circ$ , and defined as follows $(X\circ Y)_{ij}\coloneqq x_{ij}y_{ij}.$ The operator $\text{diag}\colon\mathbb{R}^{n\times n}\rightarrow\mathbb{R}^{n}$ maps a square matrix to a vector consisting of its diagonal elements. The adjoint operator of diag is denoted by $\text{Diag}\colon\mathbb{R}^{n}\rightarrow\mathbb{R}^{n\times n}$ .

We denote by ${\mathbf{1}}_{n}$ the vector of all ones of length $n$ , and define ${\mathbf{J}}_{n}\coloneqq{\mathbf{1}}_{n}{\mathbf{1}}_{n}^{\top}$ . The indicator vector of $S\subseteq V$ is denoted by $\mathbb{1}_{S}.$ The all-zero matrix of order $n$ is denoted by ${\mathbf{0}}_{n}$ . We use ${\mathbf{I}}_{n}$ to denote the identity matrix of order $n$ , while its $i$ -th column is given by ${\mathbf{u}}_{i}$ . In case the dimension of ${\mathbf{1}}_{n}$ , ${\mathbf{0}}_{n}$ , ${\mathbf{J}}_{n}$ and $\mathbf{I}_{n}$ is clear from the context, we omit the subscript.

We define the $n$-simplex as $\Delta_{n}\coloneqq\{x\in\mathbb{R}^{p}:x\geq\mathbf{0},\ \sum_{i=1}^{p}x_{i}=n\}$ and the capped $n$-simplex as $\bar{\Delta}_{n}\coloneqq\{x\in\mathbb{R}^{p}:\mathbf{0}\leq x\leq\mathbf{1},\ \sum_{i=1}^{p}x_{i}=n\}$. By $\mathcal{P}_{\mathcal{M}}$ we denote the projection operator onto the set $\mathcal{M}$. We use $[n]$ to denote the set of integers $\{1,\ldots,n\}$.

Given a subset $S\subseteq V$ of vertices in a graph $G=(V,E)$, we denote the set of edges with both endpoints in $S$ by $E(S)\coloneqq\{\{i,j\}\in E~:~i,j\in S\}$ and the cut induced by $S$ by $\partial S\coloneqq\big\{\{i,j\}\in E~:~i\in S,j\notin S\big\}$. When $S=\{i\}$ consists of a single vertex, we write $\delta(i)\coloneqq\partial S$.

2 The Quadratic Minimum Spanning Tree Problem

In this section, we formally introduce the QMSTP. Let $G=(V,E)$ be a connected, undirected graph with $n=|V|$ vertices and $m=|E|$ edges. Let $Q=(q_{ef})\in{\mathcal{S}}^{m}$ be a matrix of interaction costs between edges of $G$ , where $q_{ee}$ represents the cost of edge $e$ .

The QMSTP can be formulated as the following binary quadratic programming problem:

$$\min_{x\in\mathcal{T}}~\sum_{e\in E}\sum_{f\in E}q_{ef}x_{e}x_{f},$$

where $\mathcal{T}$ denotes the set of all spanning trees in $G$ . Each spanning tree in $\mathcal{T}$ is represented by its incidence vector of length $m$ , and therefore

$$\mathcal{T}\coloneqq\bigg\{x\in\{0,1\}^{m}~:~\sum_{e\in E}x_{e}=n-1,~\sum_{e\in\partial S}x_{e}\geq 1~~\forall S\subsetneq V,~S\neq\emptyset\bigg\}. \tag{1}$$

The constraints of the type

$$\sum_{e\in\partial S}x_{e}\geq 1 \tag{2}$$

are known as the cut-set constraints, and they ensure connectivity of a subgraph from $\mathcal{T}$ . If $Q$ is a diagonal matrix then the QMSTP reduces to the minimum spanning tree problem that is solvable in polynomial time [24, 35].
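To make the definition concrete, the following minimal sketch solves a tiny QMSTP instance by enumerating all edge subsets of size $n-1$ and keeping those that form a spanning tree. The instance (the complete graph on four vertices with made-up costs) and the helper names `is_spanning_tree` and `qmstp_brute_force` are illustrative assumptions, not part of the original text. Enumeration is only viable for very small graphs.

```python
from itertools import combinations

def is_spanning_tree(n, edges):
    """Return True if the given edges form a spanning tree on vertices 0..n-1."""
    parent = list(range(n))

    def find(v):
        # Union-find root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri == rj:              # this edge would close a cycle
            return False
        parent[ri] = rj
    return len(edges) == n - 1    # acyclic with n-1 edges means connected

def qmstp_brute_force(n, edges, Q):
    """Minimise sum_{e,f} q_ef x_e x_f over all spanning trees by enumeration."""
    best_cost, best_tree = None, None
    for subset in combinations(range(len(edges)), n - 1):
        if not is_spanning_tree(n, [edges[k] for k in subset]):
            continue
        cost = sum(Q[e][f] for e in subset for f in subset)
        if best_cost is None or cost < best_cost:
            best_cost, best_tree = cost, subset
    return best_cost, best_tree

# Hypothetical toy instance: complete graph K4 with made-up costs.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
m = len(edges)
Q = [[0] * m for _ in range(m)]
for e in range(m):
    Q[e][e] = e + 1               # q_ee: linear cost of edge e
Q[0][1] = Q[1][0] = 5             # interaction cost between edges 0 and 1

best_cost, best_tree = qmstp_brute_force(4, edges, Q)
print(best_cost, [edges[k] for k in best_tree])
```

In this instance the interaction term makes the cheapest linear tree (the star using edges 0, 1 and 2) unattractive, which illustrates why the quadratic problem does not reduce to a minimum spanning tree computation unless $Q$ is diagonal.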

Let us now fix an ordering for the edges $E=\{e_{1},\ldots,e_{m}\}$ . For $x\in{\mathcal{T}}$ define $Y\coloneqq(y_{ef})\in{\mathcal{S}}^{m}$ such that $y_{ef}=1$ if $x_{e}=1$ and $x_{f}=1$ , and $y_{ef}=0$ otherwise. Then the QMSTP can be formulated as the following mixed-integer programming problem, see e.g., [1]:

$$\begin{aligned} \min\ &\langle Q,Y\rangle\\ \text{s.t. }&\text{diag}(Y)=x,\quad Y\mathbf{1}_{m}=(n-1)x\\ &\mathbf{0}\leq Y\leq\mathbf{J}_{m},\quad Y\in\mathcal{S}^{m},\quad x\in\mathcal{T}.\end{aligned} \tag{3}$$

One can verify that the constraints, in combination with the binarity of $x$, are sufficient to obtain the coupling between $Y$ and $x$. Note that each nonzero row of $Y$ is the incidence vector of a tree. The above model was introduced by Assad and Xu [1]. We refer to the above program as the cut-set formulation of the QMSTP.

3 MISDP formulations for the QMSTP

In 1973, Fiedler [17] defined the algebraic connectivity, $a(G)$, of a graph $G$ as the second smallest eigenvalue of the Laplacian matrix of the graph. It is well-known that the algebraic connectivity is greater than zero if and only if $G$ is a connected graph. In this section, we exploit the algebraic connectivity of a tree to derive two MISDP formulations of the QMSTP. We also prove that the continuous relaxation of the cut-set QMSTP formulation (3) is at least as strong as the continuous relaxations of our MISDP formulations.

It is known that the algebraic connectivity for the graph class of trees with $n\geq 3$ vertices lies in the interval between $2\left(1-\cos\left(\frac{\pi}{n}\right)\right)$ and $1$ , see e.g., [19]. Here, $2\left(1-\cos\left(\frac{\pi}{n}\right)\right)$ is the algebraic connectivity of the path graph, and $1$ is the algebraic connectivity of the star graph. It is also known that a tree on $n$ vertices has exactly $n-1$ edges. Hence, a tree can be characterized as a connected graph with exactly $n-1$ edges, see also (1). We use those facts to characterize trees by means of positive semidefiniteness.

Proposition 1.

Let $G$ be a simple graph on $n\geq 3$ vertices with $n-1$ edges. Let ${L}$ be its Laplacian matrix and let $\alpha,\beta\in\mathbb{R}$ with $\alpha\geq\frac{\beta}{n}$ and $0<\beta\leq 2\left(1-\cos\left(\frac{\pi}{n}\right)\right)$ . Then, $G$ is a tree if and only if ${Z}={L}+\alpha\mathbf{J}_{n}-\beta\mathbf{I}_{n}\succeq\mathbf{0}.$

Proof.

Let $0=\lambda_{1}\leq\lambda_{2}\leq\dots\leq\lambda_{n}$ be the eigenvalues of the Laplacian matrix $L$. We denote by $v^{1}=\mathbf{1}$ and $v^{i}$ for $i\in\{2,\dots,n\}$ corresponding eigenvectors of $L$, chosen such that they form an orthogonal basis of $\mathbb{R}^{n}$. The matrix $\mathbf{J}$ has eigenvalue $n$ with corresponding eigenvector $\mathbf{1}$, and eigenvalue $0$ of multiplicity $n-1$ with eigenvectors $v^{i}$ for $i\in\{2,\dots,n\}$, since each $v^{i}$ is orthogonal to $\mathbf{1}$. Therefore, $Z\mathbf{1}=(L+\alpha\mathbf{J}-\beta\mathbf{I})\mathbf{1}=(\alpha n-\beta)\mathbf{1}$ and $Zv^{i}=(\lambda_{i}-\beta)v^{i}$, from which it follows that the eigenvalues of $Z$ are $\alpha n-\beta$ and $\lambda_{i}-\beta$ for $i\in\{2,\dots,n\}$. Using the fact that $\alpha\geq\frac{\beta}{n}$ we have $\alpha n-\beta\geq 0$, and thus $Z$ is positive semidefinite if and only if its eigenvalue $\lambda_{2}-\beta$ is nonnegative.

Now suppose that $G$ is a tree. In this case, we know that $a(G)=\lambda_{2}\geq 2\left(1-\cos\left(\frac{\pi}{n}\right)\right)$ holds. Therefore, we have that $\lambda_{2}-\beta\geq 2\left(1-\cos\left(\frac{\pi}{n}\right)\right)-\beta\geq 0,$ and thus $Z\succeq\mathbf{0}$ .

On the other hand, if $Z\succeq\mathbf{0}$ then $\lambda_{2}-\beta\geq 0$ . Since $\beta>0$ , it follows that $a(G)=\lambda_{2}>0$ and, thus $G$ is connected. As $G$ has $n$ vertices and $n-1$ edges, it is a tree. ∎
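Proposition 1 can be checked numerically on small graphs. The sketch below, which assumes NumPy is available and uses helper names of our own choosing, builds $Z$ with $\beta=2(1-\cos(\pi/n))$ and $\alpha=\beta/n$ for two graphs on four vertices with three edges each: a star (a tree) and a triangle plus an isolated vertex (not a tree). A small tolerance is used because the smallest eigenvalue of $Z$ is exactly zero for this choice of $\alpha$.

```python
import numpy as np

def laplacian(n, edges):
    """Laplacian matrix of the simple graph on vertices 0..n-1 with the given edges."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, i] += 1
        L[j, j] += 1
        L[i, j] -= 1
        L[j, i] -= 1
    return L

def min_eig_Z(n, edges):
    """Smallest eigenvalue of Z = L + alpha*J - beta*I for the largest admissible beta."""
    beta = 2 * (1 - np.cos(np.pi / n))
    alpha = beta / n
    Z = laplacian(n, edges) + alpha * np.ones((n, n)) - beta * np.eye(n)
    return np.linalg.eigvalsh(Z)[0]   # eigvalsh returns eigenvalues in ascending order

n = 4
star = [(0, 1), (0, 2), (0, 3)]                     # a tree
triangle_plus_isolated = [(0, 1), (1, 2), (0, 2)]   # n-1 edges, but disconnected

tol = 1e-9  # numerical tolerance for the PSD test
star_is_psd = min_eig_Z(n, star) >= -tol
disconnected_is_psd = min_eig_Z(n, triangle_plus_isolated) >= -tol
print(star_is_psd, disconnected_is_psd)
```

For the disconnected graph the smallest eigenvalue of $Z$ equals $-\beta$, in line with the eigenvalue computation in the proof.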

The previous result can be generalized for any graph as follows.

Proposition 2.

Let $G$ be a simple graph on $n\geq 3$ vertices and $L$ be the Laplacian matrix of $G$ . Then $a(G)\geq\beta$ if and only if $L+\frac{\beta}{n}\mathbf{J}_{n}-\beta\mathbf{I}_{n}\succeq\mathbf{0}$ .

Proof.

Let $0=\lambda_{1}\leq\lambda_{2}\leq\dots\leq\lambda_{n}$ be the eigenvalues of $L$, the Laplacian matrix of $G$. The eigenvalues of $L+\frac{\beta}{n}\mathbf{J}$ are $\beta$ and $a(G)=\lambda_{2}\leq\dots\leq\lambda_{n}$. If $a(G)\geq\beta$, then all eigenvalues of $L+\frac{\beta}{n}\mathbf{J}$ are greater than or equal to $\beta$ and therefore $L+\frac{\beta}{n}\mathbf{J}-\beta\mathbf{I}\succeq\mathbf{0}$. Conversely, if $L+\frac{\beta}{n}\mathbf{J}-\beta\mathbf{I}\succeq\mathbf{0}$, then all eigenvalues of $L+\frac{\beta}{n}\mathbf{J}$ are greater than or equal to $\beta$ and therefore $a(G)\geq\beta$. ∎

In the sequel, we exploit Proposition 1 to derive MISDP formulations for the QMSTP. Let us first define the set of adjacency matrices of spanning subgraphs of $G$ with $n$ vertices and $n-1$ edges:

$$\mathcal{F}\coloneqq\left\{X\in\{0,1\}^{n\times n}\cap\mathcal{S}^{n}~:~\langle X,\mathbf{J}_{n}\rangle=2(n-1),\ x_{ij}=0\text{ if }\{i,j\}\notin E\right\}. \tag{4}$$

The set of all adjacency matrices of spanning trees on $n$ vertices is:

$$\mathcal{T}_{M}=\mathcal{F}\cap\left\{X\in\mathcal{S}^{n}~:~\text{Diag}(X\mathbf{1})-X+\alpha\mathbf{J}-\beta\mathbf{I}\succeq\mathbf{0}\right\}, \tag{5}$$

where $\alpha\geq\frac{\beta}{n}$ and $0<\beta\leq 2\left(1-\cos\left(\frac{\pi}{n}\right)\right)$. There is a bijection $\mathcal{B}:\mathcal{T}_{M}\to\mathcal{T}$, see (1), where $\mathcal{B}$ maps $X$ to the column vector containing the entries of $X$ corresponding to $E$ with respect to the fixed ordering of the edge set. Hence, the QMSTP can be written as the following MISDP problem:


$$\begin{aligned}
\min\ & \langle Q,Y\rangle && \text{(6a)}\\
\text{s.t. } & \text{diag}(Y)=\mathcal{B}(X),\quad Y\mathbf{1}_{m}=(n-1)\mathcal{B}(X) && \text{(6b)}\\
& \text{Diag}(X\mathbf{1})-X+\alpha\mathbf{J}_{n}-\beta\mathbf{I}_{n}\succeq\mathbf{0} && \text{(6c)}\\
& \mathbf{0}\leq Y\leq\mathbf{J}_{m},\quad Y\in\mathcal{S}^{m},\quad X\in\mathcal{F}. && \text{(6d)}
\end{aligned}$$

One can verify that the integrality of the matrix variable $Y$ in (6) follows from the integrality of the matrix variable $X$ . Let us compare the continuous relaxations of (6) and the continuous relaxation of the cut-set QMSTP formulation (3). We first show the following result.

Proposition 3.

Let $X\in\mathcal{S}^{n}$ be a matrix such that $\mathbf{0}\leq X\leq{\mathbf{J}}$ , $\text{diag}(X)=\mathbf{0}$ , $\langle X,{\mathbf{J}}\rangle=2(n-1)$ , and $\min_{\emptyset\neq S\subsetneq V}\sum_{i\in S}\sum_{j\notin S}x_{ij}=1$ . Then $\lambda_{2}(\text{Diag}(X\mathbf{1})-X)\geq 2(1-\cos\frac{\pi}{n})$ .

Proof.

The proof is similar to the proof of Statement 4.3 in [17]. ∎

It is not difficult to show that for a feasible $(x,Y)$ for the continuous relaxation of the cut-set QMSTP formulation (3) one can construct a feasible pair $(X,Y)$ for the continuous relaxation of (6). This leads us to the following result.

Corollary 1.

The continuous relaxation of the cut-set QMSTP formulation is at least as strong as the continuous relaxations of (6).

This result is not very surprising. Namely, Goemans and Rendl [40] show a similar result that relates the subtour elimination relaxation and an algebraic connectivity based SDP relaxation for the traveling salesman problem.

One can also formulate the QMSTP by exploiting theory on discrete PSD matrices from [15], i.e., the following result.

Theorem 1 ([15]).

Let $Z=\begin{pmatrix}X&x\\ x^{\top}&1\\ \end{pmatrix}\succeq\mathbf{0}$ with $\text{diag}({X})={x}$ . Then, ${\text{rank}(Z)}=1$ if and only if ${{X}\in\{0,1\}^{n\times n}}$ .

Now, by using the previous result, we formulate the QMSTP as the following MISDP:


$$\begin{aligned}
\min\ & \langle Q,Y\rangle && \text{(7a)}\\
\text{s.t. } & \text{diag}(Y)=\mathcal{B}(X) && \text{(7b)}\\
& \begin{pmatrix}Y&\mathcal{B}(X)\\ \mathcal{B}(X)^{\top}&1\end{pmatrix}\succeq\mathbf{0} && \text{(7c)}\\
& \text{Diag}(X\mathbf{1})-X+\alpha\mathbf{J}_{n}-\beta\mathbf{I}_{n}\succeq\mathbf{0} && \text{(7d)}\\
& Y\in\mathcal{S}^{m},\quad X\in\mathcal{F}, && \text{(7e)}
\end{aligned}$$

where $\mathcal{F}$ is given in (4). In (7), we do not impose integrality on the off-diagonal elements of $Y$ as those follow by the integrality of ${\mathcal{B}}(X)$ , see [15]. Due to the integrality of $Y$ , the constraints (7b) and (7c) ensure that $Y={\mathcal{B}}(X){\mathcal{B}}(X)^{\top}$ . Now, by using the same arguments as earlier, one can show the following result.

Corollary 2.

The continuous relaxation of the cut-set QMSTP formulation is at least as strong as the continuous relaxations of (7).

It is difficult to directly compare the continuous relaxations of (6) and (7). However, by adding the constraint $Y\mathbf{1}_{m}=(n-1){\mathcal{B}}(X)$ to (7) (which is redundant in the presence of integrality), we have the following result.

Corollary 3.

The continuous relaxation of (7) with additional constraint $Y\mathbf{1}_{m}=(n-1){\mathcal{B}}(X)$ dominates the continuous relaxation of (6).

3.1 Valid inequalities

In this section, we derive Chvátal-Gomory cuts from the MISDP formulations of the QMSTP from the previous section. Some of those cuts coincide with well-known cuts from the literature. In particular, we show that the cut-set constraints (2) and some of the first level RLT constraints are CG cuts.

Let us first present a result that applies to any graph having several connected components.

Proposition 4.

Let $L$ be the Laplacian matrix of a graph on $n\geq 3$ vertices, consisting of exactly $k\geq 2$ connected components. Let $\{S_{1},\dots,S_{k}\}$ be the partition of the vertices implied by these components. For each $\ell\in[k]$ , let ${v}_{\ell}$ be the vector defined as

$$(v_{\ell})_{i}\coloneqq\begin{cases}n-\lvert S_{\ell}\rvert&\text{if }i\in S_{\ell},\\ -\lvert S_{\ell}\rvert&\text{if }i\notin S_{\ell}.\end{cases}$$

Then $\langle v_{\ell}v^{\top}_{\ell},L+\frac{\beta}{n}\mathbf{J}-\beta\mathbf{I}\rangle<0$ for all $\ell\in[k]$ and $\beta>0$.

Proof.

Recall the LMI from Proposition 2. We can write $v_{\ell}=n\mathbb{1}_{S_{\ell}}-\lvert S_{\ell}\rvert\mathbf{1}$, $\ell\in[k]$. It is not difficult to verify that $\mathbf{1}$ and $\mathbb{1}_{S_{\ell}}$ for $\ell\in[k]$ are eigenvectors of $L$ corresponding to the zero eigenvalue, so that $Lv_{\ell}=\mathbf{0}$. It further holds that $\mathbf{J}\mathbb{1}_{S_{\ell}}=\lvert S_{\ell}\rvert\mathbf{1}$ for all $\ell\in[k]$, and therefore, $\mathbf{J}v_{\ell}=n\lvert S_{\ell}\rvert\mathbf{1}-\lvert S_{\ell}\rvert n\mathbf{1}=\mathbf{0}$. This implies that $v_{\ell}$ is an eigenvector of $L+\frac{\beta}{n}\mathbf{J}-\beta\mathbf{I}$ corresponding to the eigenvalue $-\beta$. Since $v_{\ell}\neq\mathbf{0}$, the inner product above equals $-\beta\|v_{\ell}\|^{2}<0$, and therefore the above inequality holds. ∎

A similar result was obtained in [14] in the context of a directed node-disjoint cycle cover. Let us restate Proposition 4 in terms of the adjacency matrix of a graph.

Corollary 4.

Let $G$ be a graph with $n\geq 3$ vertices consisting of $k\geq 2$ connected components. Denote by $S_{\ell}$ the set of vertices in component $\ell\in[k]$ . Let $X$ be the adjacency matrix of $G$ . Further, let $v_{\ell}=n\mathbb{1}_{S_{\ell}}-\lvert S_{\ell}\rvert\mathbf{1}$ for all $\ell\in[k]$ and let ${v}^{(2)}_{\ell}={v}_{\ell}\circ{v}_{\ell}$ . Then

$$\langle v_{\ell}v^{\top}_{\ell}-v^{(2)}_{\ell}\mathbf{1}^{\top},X\rangle>\Big\langle v_{\ell}v^{\top}_{\ell},\frac{\beta}{n}\mathbf{J}-\beta\mathbf{I}\Big\rangle$$

for all $\ell\in[k]$ and $\beta>0$ .

Proof.

The claim follows from Proposition 4, the fact that $L=\text{Diag}(X\mathbf{1})-X$ and using $\langle v_{\ell}^{(2)}\mathbf{1}^{\top},X\rangle=\langle v_{\ell}v^{\top}_{\ell},\text{Diag}(X\mathbf{1})\rangle$. ∎

We can now use the result of Corollary 4 to derive Chvátal-Gomory cuts for the QMSTP. Let $\beta=2\left(1-\cos\left(\frac{\pi}{n}\right)\right)$ , then we have the following CG cut:

$$\langle v_{\ell}v^{\top}_{\ell}-v^{(2)}_{\ell}\mathbf{1}^{\top},X\rangle\leq\left\lfloor\bigg\langle v_{\ell}v^{\top}_{\ell},\frac{\beta}{n}\mathbf{J}-\beta\mathbf{I}\bigg\rangle\right\rfloor,\quad\ell\in[k],$$

where $v_{\ell}$ is defined as in Proposition 4. One can use the above cuts within a branch-and-cut framework to solve (6) and/or (7). In particular, those cuts may be used to separate matrices that are in $\mathcal{F}$ , see (4), but not in $\mathcal{T}_{M}$ , see (5).
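As an illustration of how such a cut separates a disconnected member of $\mathcal{F}$, the sketch below evaluates both sides of the CG cut for a vertex set $S$. It uses the identity $\langle vv^{\top}-v^{(2)}\mathbf{1}^{\top},X\rangle=-\sum_{\{i,j\}\in E(X)}(v_{i}-v_{j})^{2}$, which follows from the symmetry of $X$. The function name `cg_cut` and the two test graphs on six vertices are our own illustrative choices.

```python
import math

def cg_cut(n, edges, S):
    """Return (lhs, rhs) of the cut
    <v v^T - v^(2) 1^T, X> <= floor(<v v^T, (beta/n) J - beta I>)
    for v = n * 1_S - |S| * 1 and X the adjacency matrix of the given edges."""
    beta = 2 * (1 - math.cos(math.pi / n))
    v = [n - len(S) if i in S else -len(S) for i in range(n)]
    # X is symmetric, so edge {i, j} contributes 2*v_i*v_j to <v v^T, X>
    # and v_i^2 + v_j^2 to <v^(2) 1^T, X>.
    lhs = sum(2 * v[i] * v[j] - v[i] ** 2 - v[j] ** 2 for i, j in edges)
    rhs = math.floor(beta / n * sum(v) ** 2 - beta * sum(t * t for t in v))
    return lhs, rhs

n = 6
S = {0, 1, 2}
# Triangle plus a path on three vertices: n-1 edges, two components.
disconnected = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5)]
# Path on six vertices: a spanning tree.
path = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]

lhs_d, rhs_d = cg_cut(n, disconnected, S)
lhs_p, rhs_p = cg_cut(n, path, S)
print(lhs_d <= rhs_d, lhs_p <= rhs_p)
```

With $S$ chosen as one connected component, the left-hand side is zero for the disconnected graph while the right-hand side is strictly negative, so the cut is violated. The path satisfies it.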

In the sequel, we derive the cut-set constraints (2) as CG cuts. Let $S\subsetneq V$ , $S\neq\emptyset$ and $X$ be feasible for (6) or (7). Then, for the PSD matrix $\mathbb{1}_{S}\mathbb{1}_{S}^{\top}$ we have that

$$\langle\mathbb{1}_{S}\mathbb{1}_{S}^{\top},\text{Diag}(X\mathbf{1})-X+\alpha\mathbf{J}-\beta\mathbf{I}\rangle\geq 0, \tag{8}$$

is a valid inequality for (6) and (7). After rewriting (8) and exploiting $\langle\mathbb{1}_{S}\mathbb{1}_{S}^{\top},\text{Diag}(X\mathbf{1})\rangle=\langle\mathbb{1}_{S}\mathbf{1}^{\top},X\rangle$, we have $\langle\mathbb{1}_{S}\mathbb{1}_{S}^{\top}-\mathbb{1}_{S}\mathbf{1}^{\top},X\rangle\leq\langle\mathbb{1}_{S}\mathbb{1}_{S}^{\top},\alpha\mathbf{J}-\beta\mathbf{I}\rangle.$ Since the left-hand side of this inequality is integer, we may round down the right-hand side, which results in the CG cut $\langle\mathbb{1}_{S}\mathbb{1}_{S}^{\top}-\mathbb{1}_{S}\mathbf{1}^{\top},X\rangle\leq\lfloor\langle\mathbb{1}_{S}\mathbb{1}_{S}^{\top},\alpha\mathbf{J}-\beta\mathbf{I}\rangle\rfloor.$ Rewriting both sides yields the inequality

$$-\sum_{i\in S}\sum_{j\notin S}x_{ij}\leq\left\lfloor\lvert S\rvert(\lvert S\rvert\alpha-\beta)\right\rfloor. \tag{9}$$

For $\alpha=\frac{\beta}{n}$ and $\beta=2\left(1-\cos\left(\frac{\pi}{n}\right)\right)$ we have that $\lfloor\lvert S\rvert(\lvert S\rvert\alpha-\beta)\rfloor=-1$ , and the above CG cut implies the cut-set constraint $\sum_{e\in\partial S}x_{e}\geq 1$ , see also (2). Let us summarize the previous discussion.

Proposition 5.

Let $S\subsetneq V$ , $S\neq\emptyset$ . Then, the cut-set constraint (2) is a Chvátal-Gomory cut with respect to the MISDPs (6) and (7).
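The claim that the rounded right-hand side of (9) equals $-1$ for $\alpha=\frac{\beta}{n}$ and $\beta=2\left(1-\cos\left(\frac{\pi}{n}\right)\right)$ can be confirmed numerically. The following sketch, with a helper name of our own choosing, evaluates $\lfloor\lvert S\rvert(\lvert S\rvert\alpha-\beta)\rfloor$ for every admissible cardinality $\lvert S\rvert\in\{1,\dots,n-1\}$ over a range of graph sizes. The upper limit 200 is arbitrary.

```python
import math

def cut_set_rhs(n, s):
    """Right-hand side of (9) for |S| = s, alpha = beta/n, beta = 2(1 - cos(pi/n))."""
    beta = 2 * (1 - math.cos(math.pi / n))
    alpha = beta / n
    return math.floor(s * (s * alpha - beta))

# Check every nonempty proper subset size for 3 <= n < 200.
always_minus_one = all(
    cut_set_rhs(n, s) == -1 for n in range(3, 200) for s in range(1, n)
)
print(always_minus_one)
```

The reason is that $\lvert S\rvert(\lvert S\rvert\alpha-\beta)=-\beta\lvert S\rvert(n-\lvert S\rvert)/n$ lies strictly between $-1$ and $0$, so (9) becomes $\sum_{i\in S}\sum_{j\notin S}x_{ij}\geq 1$.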

Subsequently, we derive valid inequalities by exploiting the constraint (7c), which may be equivalently reformulated as $Y-\mathcal{B}(X)\mathcal{B}(X)^{\top}\succeq\mathbf{0}$. Let $X$, $Y$ be feasible for (7), $i\in V$, and $\mathbb{1}_{\delta(i)}$ be the indicator vector of $\delta(i)$. For $f\in E$, we define the positive semidefinite matrix $P_{f}\coloneqq\mathbf{u}_{k}\mathbb{1}_{\delta(i)}^{\top}+\mathbb{1}_{\delta(i)}\mathbf{u}_{k}^{\top}+\mathbf{I}_{m}+(n-1)\mathbf{u}_{k}\mathbf{u}_{k}^{\top},$ where the index $k$ is the position of the edge $f$ in the fixed edge ordering, i.e., $\mathcal{B}(X)_{k}=x_{f}$. Since $P_{f}\succeq\mathbf{0}$, it follows that $\langle Y-\mathcal{B}(X)\mathcal{B}(X)^{\top},P_{f}\rangle\geq 0.$ Because $\text{diag}(Y)=\mathcal{B}(X)$ is binary, the terms involving $\mathbf{I}_{m}$ and $\mathbf{u}_{k}\mathbf{u}_{k}^{\top}$ vanish, and rewriting the left-hand side gives

$$\langle Y-\mathcal{B}(X)\mathcal{B}(X)^{\top},P_{f}\rangle=2\sum_{e\in\delta(i)}y_{fe}-2x_{f}\,\mathbb{1}_{\delta(i)}^{\top}\mathcal{B}(X)\geq 0,$$

from where it follows $\sum_{e\in\delta(i)}y_{fe}\geq x_{f}\,\mathbb{1}_{\delta(i)}^{\top}\mathcal{B}(X)\geq x_{f},$ since $\mathbb{1}_{\delta(i)}^{\top}\mathcal{B}(X)\geq 1$ due to the fact that the underlying graph is connected. Moreover, $\mathbb{1}_{\delta(i)}^{\top}\mathcal{B}(X)\geq 1$ is the cut-set constraint for $S=\{i\}$, which is a CG cut. Thus, we have the following constraints

$$\sum_{e\in\delta(i)}y_{fe}\geq x_{f}\qquad\forall f\in E,~\forall i\in V. \tag{10}$$

Interestingly, these constraints follow also from the reformulation-linearization technique [38] applied to the cut-set constraints (2) with $|S|=1$ . Namely, after multiplying both sides of (2) by $x_{f}$ and replacing $x_{f}x_{e}$ by $y_{fe}$ , one obtains the constraints (10). We refer later to the constraints (10) as RLT-type constraints.

Proposition 6.

Let $i\in V$ and $f\in E$ . Then, the constraint (10) is a Chvátal-Gomory cut with respect to MISDP (7).
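The RLT-type constraints (10) are easy to check for a given pair $(x,Y)$. The sketch below, whose function names and toy data are illustrative assumptions, lists all pairs $(f,i)$ for which (10) is violated. It is applied to $Y=xx^{\top}$ for a spanning tree of the complete graph on four vertices and for an edge set with $n-1$ edges that leaves one vertex isolated.

```python
def rlt_violations(n, edges, x, Y):
    """Return all pairs (f, i) with sum_{e in delta(i)} y_fe < x_f, see (10)."""
    violated = []
    for i in range(n):
        # Indices of the edges incident to vertex i, i.e. delta(i).
        delta_i = [e for e, (a, b) in enumerate(edges) if i in (a, b)]
        for f in range(len(edges)):
            if sum(Y[f][e] for e in delta_i) < x[f]:
                violated.append((f, i))
    return violated

def outer(x):
    """Rank-one lifting Y = x x^T as a list of lists."""
    return [[a * b for b in x] for a in x]

# Complete graph K4 with a fixed edge ordering.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

x_tree = [1, 0, 0, 1, 0, 1]   # path 0-1-2-3, a spanning tree
x_bad = [1, 1, 0, 1, 0, 0]    # triangle on {0, 1, 2}; vertex 3 is isolated

print(rlt_violations(4, edges, x_tree, outer(x_tree)))
print(rlt_violations(4, edges, x_bad, outer(x_bad)))
```

For the second vector, every selected edge $f$ violates (10) at the isolated vertex $i=3$, which shows how these constraints carry connectivity information into the lifted space.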

4 DNN relaxation

Here, we derive two doubly nonnegative relaxations for the QMSTP and derive their facially reduced formulations.

To this end, instead of the matrix $X$ , we introduce a vector $y$ that results in relaxing ${\mathcal{B}}(X)$ to $y$ . We then use formulation (7) where we drop the linear matrix inequality (7d), and relax the constraint $X\in\mathcal{F}$ , see (7e), to $\mathbf{1}^{\top}y=n-1$ . Furthermore, we add the constraint $Y\mathbf{1}=(n-1)y$ that can be derived from (7). Additionally, we impose nonnegativity constraints on the matrix variable, and obtain the following DNN relaxation:


$$\begin{aligned}
\min\ & \langle Q,Y\rangle && \text{(11a)}\\
\text{s.t. } & \text{diag}(Y)=y && \text{(11b)}\\
& Y\mathbf{1}=(n-1)y,\quad \mathbf{1}^{\top}y=n-1 && \text{(11c)}\\
& Y\geq\mathbf{0},\quad \begin{pmatrix}Y&y\\ y^{\top}&1\end{pmatrix}\succeq\mathbf{0}. && \text{(11d)}
\end{aligned}$$

The above relaxation does not include the connectivity constraint (7d), because that constraint has only a small impact on the bound while making the relaxation more difficult to solve. In order to include a type of connectivity constraints in (11), we consider valid inequalities from Section 3.1. Preliminary numerical results show that by adding the cut-set constraints (2), see also Proposition 5, the resulting bound only marginally improves on the DNN bound (11). The RLT-type cuts (10), however, turn out to have a larger impact on the bound value. We therefore present the following strengthening of the relaxation (11):

$$\begin{aligned} \min\ &\langle Q,Y\rangle\\ \text{s.t. }&\text{(11b)--(11d)}\\ &\sum_{e\in\delta(i)}y_{fe}\geq y_{f}\qquad\forall f\in E,~\forall i\in V.\end{aligned} \tag{12}$$

In the remaining part of this section, we perform facial reduction of the DNN relaxations. Let $\widetilde{Y}=\left(\begin{smallmatrix}Y&y\\ {y}^{\top}&1\end{smallmatrix}\right)$ and $\widetilde{Q}=\left(\begin{smallmatrix}Q&\mathbf{0}_{m}\\ {\mathbf{0}}^{\top}_{m}&0\end{smallmatrix}\right).$ It is not difficult to verify that

$$T=\begin{pmatrix}\mathbf{1}_{m}\\ -(n-1)\end{pmatrix} \tag{13}$$

is an eigenvector corresponding to the zero eigenvalue of any matrix $\widetilde{Y}$ feasible for (11), as follows from the constraints (11c). Hence, no feasible matrix $\widetilde{Y}$ is positive definite, and the DNN relaxation (11) has no Slater feasible point.

To provide a facially reduced DNN relaxation of (11), let $W\in\mathbb{R}^{(m+1)\times m}$ be a matrix whose columns form a basis for ${\mathcal{W}}={\rm null}(T^{\top})$ , see (13). As we will show in Theorem 3 later on in this section, the relaxation (11) may be equivalently written as the following facially reduced relaxation:

$$\begin{aligned} \min\ &\langle\widetilde{Q},WRW^{\top}\rangle\\ \text{s.t. }&\text{diag}(WRW^{\top})=(WRW^{\top})\mathbf{u}_{m+1}\\ &(WRW^{\top})_{m+1,m+1}=1\\ &WRW^{\top}\geq\mathbf{0},\quad R\succeq\mathbf{0}.\end{aligned} \tag{14}$$

We obtained this relaxation from (11) by replacing $\widetilde{Y}$ with $WRW^{\top}$ and removing redundant constraints. Note that the feasible set of (11) is contained in $W\mathcal{S}^{m}_{+}W^{\top}$ , which is a face of $\mathcal{S}^{m+1}_{+}$ . To show that (14) has an interior point, we use Theorem 3.15 from [22]. That theorem additionally takes into account a zero pattern in the feasible matrix, which is not present in our problem.
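The substitution $\widetilde{Y}=WRW^{\top}$ can be illustrated numerically. The sketch below assumes NumPy and one particular choice of basis, $W=\begin{pmatrix}\mathbf{I}_{m}\\ \frac{1}{n-1}\mathbf{1}_{m}^{\top}\end{pmatrix}$. This choice is our assumption for illustration (any basis of ${\rm null}(T^{\top})$ works) and is not necessarily the one used by the authors. For this $W$, the matrix $R$ is simply the leading $m\times m$ block $Y$ of $\widetilde{Y}$.

```python
import numpy as np

n, m = 4, 6
# Incidence vector with n-1 ones (hypothetical feasible point, lifted as Y = x x^T).
x = np.array([1.0, 0.0, 0.0, 1.0, 0.0, 1.0])
Y_tilde = np.block([
    [np.outer(x, x), x[:, None]],
    [x[None, :], np.ones((1, 1))],
])

# T from (13) spans the orthogonal complement of the minimal face.
T = np.append(np.ones(m), -(n - 1))

# Assumed basis of null(T^T): column j is (u_j, 1/(n-1)).
W = np.vstack([np.eye(m), np.ones((1, m)) / (n - 1)])

R = Y_tilde[:m, :m]   # with this W, R equals the leading block Y

T_in_null_space = np.allclose(Y_tilde @ T, 0)        # Y_tilde T = 0
W_spans_null_T = np.allclose(T @ W, 0)               # T^T W = 0
reconstruction_ok = np.allclose(W @ R @ W.T, Y_tilde)
print(T_in_null_space, W_spans_null_T, reconstruction_ok)
```

The reconstruction works because $Y\mathbf{1}=(n-1)y$ and $\mathbf{1}^{\top}y=n-1$ recover the last column and the corner entry of $\widetilde{Y}$ from $Y$ alone, which is exactly the redundancy that facial reduction removes.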

Theorem 2 (Theorem 3.15 in [22]).

Let $\mathcal{Q}=\Bigg\{y\in\mathbb{R}^{m}~:~\mathcal{A}\bigg(\begin{pmatrix}yy^{\top}&y\\ y^{\top}&1\end{pmatrix}\bigg)=\mathbf{0},\ y\geq\mathbf{0}\Bigg\},$ where $\mathcal{A}$ is a linear transformation, be the feasible set of a quadratically constrained program. Suppose $\text{aff}(\operatorname{conv}(\mathcal{Q}))=\mathcal{L}$ with $\dim(\mathcal{L})=p$. Then, there exist a matrix $C$ with full row rank and a vector $d$ such that $\mathcal{L}=\big\{y\in\mathbb{R}^{m}:Cy=d\big\}.$

Let $M=(\begin{matrix}C&-d\end{matrix})$ and $W$ be a matrix such that its columns form a basis of ${\rm null}(M)$ . Let $\mathcal{J}=\big{\{}(i,j):y_{i}y_{j}=0\ \forall y\in\mathcal{Q}\big{\}}$ and $\mathcal{J}^{c}$ be its complement. Then, there exists a Slater point $\hat{R}$ for the facially reduced, DNN feasible set:

$$\hat{\mathcal{Q}}_{R}=\Big\{R\in\mathcal{S}^{p+1}:R\succeq\mathbf{0},\ \big(WRW^{\top}\big)_{\mathcal{J}}=\mathbf{0},\ \big(WRW^{\top}\big)_{\mathcal{J}^{c}}\geq\mathbf{0},\ \mathcal{A}\big(WRW^{\top}\big)=\mathbf{0}\Big\}.$$

We are now ready to state the following result on our facially reduced problem.

Theorem 3.

For $n\geq 3$ , the DNN relaxation (14) is a strictly feasible equivalent reformulation of (11).

Proof.

Let

$$\begin{aligned}
\mathcal{Q}_{n}(m) &= \big\{y\in\{0,1\}^{m}:\mathbf{1}_{m}^{\top}y=n-1\big\}\\
&= \big\{y\in\mathbb{R}^{m}:\mathbf{1}_{m}^{\top}y=n-1,\ y_{i}y_{i}=y_{i}\ \forall i\in[m]\big\}\\
&= \bigg\{y\in\mathbb{R}^{m}:\mathcal{A}\bigg(\begin{pmatrix}yy^{\top}&y\\ y^{\top}&1\end{pmatrix}\bigg)=\mathbf{0},\ y\geq\mathbf{0}\bigg\},
\end{aligned}$$

where $\mathcal{A}(X)=\begin{pmatrix}\mathcal{A}_{1}(X),\cdots,\mathcal{A}_{2m+1}(X)\end{pmatrix}^{\top}$ with

$$\begin{aligned}
\mathcal{A}_{i}(X) &= \bigg\langle\begin{pmatrix}\mathbf{u}_{i}\mathbf{u}_{i}^{\top}&-\frac{1}{2}\mathbf{u}_{i}\\ -\frac{1}{2}\mathbf{u}_{i}^{\top}&0\end{pmatrix},X\bigg\rangle && \text{for all }i\in[m]\text{, and}\\
\mathcal{A}_{m+i}(X) &= \bigg\langle\frac{1}{2}\bigg(\mathbf{u}_{i}\begin{pmatrix}\mathbf{1}_{m}^{\top}&-(n-1)\end{pmatrix}+\begin{pmatrix}\mathbf{1}_{m}\\ -(n-1)\end{pmatrix}\mathbf{u}_{i}^{\top}\bigg),X\bigg\rangle && \text{for all }i\in[m+1].
\end{aligned}$$

Note that in the definition of $\mathcal{Q}_{n}(m)$ , the equality $\mathcal{A}_{i}(X)=0$ models the constraint $y_{i}^{2}=y_{i}$ for each $i\in[m]$ . The constraint $\mathcal{A}_{2m+1}(X)=0$ models the constraint $\mathbf{1}_{m}^{\top}y=n-1$ . For each index $i\in[m]$ , the constraint $\mathcal{A}_{m+i}(X)=0$ models the redundant constraint $y_{i}(\mathbf{1}_{m}^{\top}y)=(n-1)y_{i}$ .

The convex hull equals $\operatorname{conv}(\mathcal{Q}_{n}(m))=\big{\{}y\in[0,1]^{m}:\mathbf{1}_{m}^{% \top}y=n-1\big{\}}$ . For each index $i\in[m]$ there exist vectors $y^{1},y^{2}\in\mathcal{Q}_{n}(m)$ such that $y^{1}_{i}>0$ and $y^{2}_{i}<1$ , hence, we get that the affine hull is $\text{aff}(\operatorname{conv}(\mathcal{Q}_{n}(m)))=\{y\in\mathbb{R}^{m}:% \mathbf{1}_{m}^{\top}y=n-1\},$ and has dimension $m-\text{rank}(\mathbf{1}_{m}^{\top})=m-1$ . Hence, $M=T^{\top}$ where $M$ is from Theorem 2 and $T$ given in (13). Let $W\in\mathbb{R}^{(m+1)\times m}$ be a matrix whose columns form a basis of the nullspace of $M$ . Then, a face of ${\mathcal{S}}^{m+1}_{+}$ containing the feasible set of (11) is of the form $W{\mathcal{S}}^{m}_{+}W^{\top}$ . Therefore, one can replace $\widetilde{Y}$ with $WRW^{\top}$ in (11).

Moreover, since $n-1\geq 2$ , for each pair of indices $(i,j)\in[m]\times[m]$ there exists a vector $y\in\mathcal{Q}_{n}(m)$ such that $y_{i}=y_{j}=1$ , and hence the index set $\mathcal{J}=\big{\{}(i,j):y_{i}y_{j}=0\ \forall y\in\mathcal{Q}_{n}(m)\big{\}}$ is empty. Thus, by Theorem 2, there exists a Slater feasible point for the facially reduced DNN relaxation (14). ∎

On top of imposing strict feasibility, facial reduction reduces both the number of variables and constraints. Therefore, the relaxation (14) is preferred over (11). In a similar fashion, relaxation (12) can be rewritten by replacing $\widetilde{Y}$ in (12) by $WRW^{\top}$ .

5 Peaceman-Rachford splitting method for the QMSTP

Interior point solvers have difficulties computing our DNN relaxations for medium-sized problems in a reasonable time due to the large number of (inequality) constraints. Therefore, we use the Peaceman-Rachford splitting method (PRSM) for computing the bounds. The PRSM was first proposed in [32, 27] and is a symmetric variant of the alternating direction method of multipliers (ADMM). For more details and convergence results we refer to [21].

5.1 PRSM for solving the DNN relaxation

In this section, we outline the main steps of the Peaceman-Rachford splitting method for solving the DNN relaxation for the QMSTP (14).

Recall that the matrix $W$ should be such that its columns provide a basis for $\mathcal{W}={\rm null}(T^{\top})$ . For reasons explained later, we additionally require the columns of $W$ to be orthonormal. Therefore, we take $W$ as the matrix obtained from applying a QR decomposition to $(\begin{matrix}(n-1)\mathbf{I}_{m}&\mathbf{1}_{m}\end{matrix})^{\top}$ .
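As an illustration of this choice of $W$ , the following Python sketch orthonormalizes the columns of $(\begin{matrix}(n-1)\mathbf{I}_{m}&\mathbf{1}_{m}\end{matrix})^{\top}$ . It is a minimal sketch under stated assumptions: it uses modified Gram-Schmidt in place of a library QR routine (both yield an orthonormal basis of the same column space), and the function name `orthonormal_basis` is hypothetical. It assumes $n\geq 2$ .

```python
import math

def orthonormal_basis(n, m):
    """Return m orthonormal vectors of length m + 1 spanning the column
    space of ((n-1) I_m  1_m)^T, computed by modified Gram-Schmidt."""
    basis = []
    for i in range(m):
        # i-th column of ((n-1) I_m  1_m)^T: (n-1) e_i stacked on top of 1
        v = [0.0] * (m + 1)
        v[i] = float(n - 1)
        v[m] = 1.0
        # remove the components along the directions accepted so far
        for q in basis:
            dot = sum(vj * qj for vj, qj in zip(v, q))
            v = [vj - dot * qj for vj, qj in zip(v, q)]
        norm = math.sqrt(sum(vj * vj for vj in v))
        basis.append([vj / norm for vj in v])
    return basis
```

Each returned vector $w$ satisfies $\mathbf{1}_{m}^{\top}w_{1:m}-(n-1)w_{m+1}=0$ , i.e., it lies in the nullspace of $M=(\begin{matrix}\mathbf{1}_{m}^{\top}&-(n-1)\end{matrix})$ , because every original column does and Gram-Schmidt only forms linear combinations of them.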

Now, we define the following sets

	$\displaystyle\mathcal{R}$	$\displaystyle\coloneqq\left\{R\in S^{m}~{}\colon~{}R\succeq\mathbf{0},\ \text{% tr}(R)=n\right\},$		(15)
	$\displaystyle\mathcal{Y}$	$\displaystyle\coloneqq\bigg{\{}\widetilde{Y}\in S^{m+1}~{}\colon~{}\widetilde{% Y}=\begin{pmatrix}Y&y\\ y^{\top}&1\end{pmatrix},\ \text{diag}(Y)=y,\ \mathbf{0}\leq\widetilde{Y}\leq% \mathbf{J},\ \text{tr}(\widetilde{Y})=n\bigg{\}},$		(16)

and rewrite (14) as

\min~{}\Big{\{}\big{\langle}\widetilde{Q},\widetilde{Y}\big{\rangle}~{}\colon~% {}\widetilde{Y}=WRW^{\top},\ R\in\mathcal{R},\ \widetilde{Y}\in\mathcal{Y}\Big% {\}}.

(17)

Note that we added redundant constraints to $\mathcal{Y}$ and $\mathcal{R}$ , where the constraint $\text{tr}(R)=n$ holds since the columns of $W$ are orthonormal. These redundant constraints improve the efficiency of the algorithm, see e.g., [16, 29, 26].

For a fixed penalty parameter $\beta>0$ , the augmented Lagrangian function of (17) w.r.t. the constraint $\widetilde{Y}=WRW^{\top}$ is

\mathcal{L}_{\beta}(R,\widetilde{Y},S)=\big{\langle}\widetilde{Q},\widetilde{Y% }\big{\rangle}+\big{\langle}S,\widetilde{Y}-WRW^{\top}\big{\rangle}+\frac{% \beta}{2}\big{\lVert}\widetilde{Y}-WRW^{\top}\big{\rVert}^{2}_{F}.

The basic idea of the PRSM is to iteratively alternate between optimizing $\mathcal{L}_{\beta}$ over $R$ and $\widetilde{Y}$ and updating the dual variable $S$ . The $(k+1)$ -th iteration of the PRSM to minimize the augmented Lagrangian function is

	$\displaystyle R^{k+1}$	$\displaystyle=\operatorname*{arg\,min}_{R\in\mathcal{R}}\mathcal{L}_{\beta}(R,% \widetilde{Y}^{k},S^{k})$
	$\displaystyle S^{\frac{k+1}{2}}$	$\displaystyle=S^{k}+\gamma_{1}\beta(\widetilde{Y}^{k}-WR^{k+1}W^{\top})$
	$\displaystyle\widetilde{Y}^{k+1}$	$\displaystyle=\operatorname*{arg\,min}_{\widetilde{Y}\in\mathcal{Y}}\mathcal{L% }_{\beta}(R^{k+1},\widetilde{Y},S^{\frac{k+1}{2}})$
	$\displaystyle S^{k+1}$	$\displaystyle=S^{\frac{k+1}{2}}+\gamma_{2}\beta(\widetilde{Y}^{k+1}-WR^{k+1}W^% {\top}),$

with step lengths $\gamma_{1}\in(-1,1)$ and $\gamma_{2}\in\big{(}0,\frac{1+\sqrt{5}}{2}\big{)}$ satisfying $\gamma_{1}+\gamma_{2}>0$ and $\lvert\gamma_{1}\rvert<1+\gamma_{2}-\gamma_{2}^{2}$ , see [21]. The optimization problems occurring in this PRSM scheme can be simplified to projection problems. Namely, optimizing the augmented Lagrangian over $\mathcal{R}$ can be simplified to

\displaystyle R^{k+1}=\operatorname*{arg\,min}_{R\in\mathcal{R}}~{}\langle S^{k},-WRW^{\top}\rangle+\frac{\beta}{2}\big{\lVert}\widetilde{Y}^{k}-WRW^{\top}\big{\rVert}^{2}_{F}=\mathcal{P}_{\mathcal{R}}\bigg{(}W^{\top}\bigg{(}\widetilde{Y}^{k}+\frac{1}{\beta}S^{k}\bigg{)}W\bigg{)},

where we exploited the fact that the columns of $W$ are orthonormal. The projection $\mathcal{P}_{\mathcal{R}}(M)$ of a matrix $M\in\mathcal{S}^{m}$ onto the set $\mathcal{R}$ can be computed by projecting the eigenvalues of $M$ in the spectral decomposition onto the $n$ -simplex $\Delta_{n}$ , see e.g., [26]. In more detail, let $M=U\text{Diag}({\lambda})U^{\top}$ be the eigenvalue decomposition of $M$ with ${\lambda}$ denoting the vector of eigenvalues of $M$ , then $\mathcal{P}_{\mathcal{R}}(M)=U\text{Diag}(\mathcal{P}_{\Delta_{n}}({\lambda}))% U^{\top}.$ The projection onto the simplex can be performed efficiently. We refer to [9] for an overview of algorithms for projecting onto the simplex and their complexities.
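The eigenvalue step of this projection can be sketched as follows. The Python function below implements the standard sort-based Euclidean projection onto the scaled simplex $\{x\geq\mathbf{0}:\mathbf{1}^{\top}x=s\}$ , which would be called with $s=n$ on the vector of eigenvalues. It is only a sketch: the name `project_simplex` is hypothetical, the eigenvalue decomposition itself is omitted (it would require a linear algebra library), and this is one of several algorithms surveyed in [9], not necessarily the one used in the implementation.

```python
def project_simplex(lam, s):
    """Euclidean projection of the vector lam onto {x >= 0, sum(x) = s}, s > 0."""
    u = sorted(lam, reverse=True)
    cumsum = 0.0
    theta = 0.0
    for j, uj in enumerate(u, start=1):
        cumsum += uj
        t = (cumsum - s) / j
        # keep the shift belonging to the largest j with u_j - t > 0
        if uj - t > 0:
            theta = t
    # shift all entries by theta and clip at zero
    return [max(x - theta, 0.0) for x in lam]
```

The projected eigenvalues $\mathcal{P}_{\Delta_{n}}(\lambda)$ are then reassembled as $U\text{Diag}(\mathcal{P}_{\Delta_{n}}(\lambda))U^{\top}$ . The sort dominates the cost, so this variant runs in $O(m\log m)$ time for $m$ eigenvalues.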

Similarly, the optimization problem over the polyhedral set $\mathcal{Y}$ can be reformulated as

\displaystyle\widetilde{Y}^{k+1}=\operatorname*{arg\,min}_{\widetilde{Y}\in\mathcal{Y}}\big{\langle}\widetilde{Q},\widetilde{Y}\big{\rangle}+\big{\langle}S^{\frac{k+1}{2}},\widetilde{Y}\big{\rangle}+\frac{\beta}{2}\big{\lVert}\widetilde{Y}-WR^{k+1}W^{\top}\big{\rVert}^{2}_{F}=\mathcal{P}_{\mathcal{Y}}\Big{(}WR^{k+1}W^{\top}-\frac{1}{\beta}\big{(}\widetilde{Q}+S^{\frac{k+1}{2}}\big{)}\Big{)}.

The projection onto $\mathcal{Y}$ can then be done in the following way

\mathcal{P}_{\mathcal{Y}}\bigg{(}\begin{pmatrix}Z&z\\ z^{\top}&\omega\end{pmatrix}\bigg{)}=\mathcal{P}_{[0,1]}\Bigg{(}\begin{pmatrix% }Z-\text{Diag}(\text{diag}(Z))+v&v\\ v^{\top}&1\end{pmatrix}\Bigg{)},

where $v=\mathcal{P}_{\bar{\Delta}(n-1)}\big{(}\frac{1}{3}\text{diag}(Z)+\frac{2}{3}z% \big{)}$ and $\mathcal{P}_{[0,1]}$ denotes the elementwise projection onto the interval $[0,1]$ .

5.2 PRSM for solving the strengthened DNN relaxation

In this subsection, we modify the previously described PRSM algorithm that solves the relaxation (14), so that it can handle additional RLT-type constraints.

Let us extend the set $\mathcal{Y}$ , see (16), by adding the RLT-type constraints, yielding

\mathcal{Y}_{RLT}=\bigg{\{}\widetilde{Y}\in S^{m+1}\colon\widetilde{Y}=\begin{% pmatrix}Y&y\\ y^{\top}&1\end{pmatrix},\ \text{diag}(Y)=y,\ \text{tr}(\widetilde{Y})=n,\ % \mathbf{0}\leq\widetilde{Y}\leq\mathbf{J},\\ \sum_{e\in\delta(i)}y_{fe}\geq y_{f}\quad\forall f\in E,~{}\forall i\in V\bigg% {\}}.

Thus, the strengthened version of the DNN relaxation (17) reads as follows

\min~{}\Big{\{}\big{\langle}\widetilde{Q},\widetilde{Y}\big{\rangle}~{}\colon~% {}\widetilde{Y}=WRW^{\top},\ R\in\mathcal{R},\ \widetilde{Y}\in\mathcal{Y}_{% RLT}\Big{\}}.

(18)

The RLT-type constraints make the projection onto $\mathcal{Y}_{RLT}$ significantly harder. To the best of our knowledge, there is no closed-form expression for the projection onto $\mathcal{Y}_{RLT}$ . However, one may write $\mathcal{Y}_{RLT}$ as an intersection of sets that are easier to project onto and then use an algorithm for projecting onto the intersection of convex sets. Dykstra’s cyclic projection algorithm [5] is a suitable choice. An overview and analysis of algorithms to project onto the intersection of convex sets can be found in [2].

To apply Dykstra’s cyclic projection algorithm, let $\mathcal{K}$ denote a coloring of the graph $G$ , i.e., $\mathcal{K}=\{K_{1},\ldots,K_{N}\}$ is a partitioning of $V$ into independent sets of $G$ . We then define the polyhedral sets $\mathcal{Y}^{k}$ as

\displaystyle\mathcal{Y}^{k}\coloneqq\left\{\widetilde{Y}\in\mathbb{R}^{(m+1)% \times(m+1)}\,:\,\,\widetilde{Y}=\begin{pmatrix}Y&y\\ y^{\top}&1\end{pmatrix},\ \text{diag}(Y)=y,~{}\sum_{e\in\delta(i)}y_{fe}\geq y% _{f}\quad\forall f\in E,~{}\forall i\in K_{k}\right\},

for $k=1,\ldots,N$ . With this we can now rewrite $\mathcal{Y}_{RLT}$ as $\mathcal{Y}_{RLT}=\mathcal{Y}\cap\left(\bigcap_{k=1}^{N}\mathcal{Y}^{k}\right).$

The projection onto the sets $\mathcal{Y}^{k}$ can be performed independently over each row $f\in E$ of $Y$ and the corresponding entries $y_{f}$ of $\widetilde{Y}$ . This allows us to restrict ourselves to projections onto the following type of sets

\displaystyle T^{k}_{f}\coloneqq\left\{z\in\mathbb{R}^{m+2}\,:\,\,z_{f}=z_{m+1% }=z_{m+2},~{}\sum_{e\in\delta(i)}z_{e}\geq z_{f}\quad\forall i\in K_{k}\right\},

(19)

where the first $m+1$ entries correspond to the $f$ -th row of $\widetilde{Y}$ and the last entry $z_{m+2}$ corresponds to $\widetilde{Y}_{m+1,f}$ . The projection onto $T_{f}^{k}$ can then be computed as presented in the following proposition.

Proposition 7.

Let $a\in\mathbb{R}^{m+2}$ , $f\in E$ and let $\mathcal{K}=\{K_{1},\ldots,K_{N}\}$ denote a coloring of $G$ . For each $i\in K_{k}$ , we define $g_{i}\coloneqq\frac{a_{f}+a_{m+1}+a_{m+2}}{3}-\sum_{e\in\delta(i)}a_{e}$ and sort these values in non-increasing order, i.e., $g_{\sigma(1)}\geq g_{\sigma(2)}\geq\dots\geq g_{\sigma(n_{k})}$ , where $n_{k}=|K_{k}|$ and $\sigma\colon[n_{k}]\to K_{k}$ is an appropriate sorting permutation. For each $p\in[n_{k}]$ , let

\displaystyle\omega(p)\coloneqq\frac{\sum_{j=1}^{p}\frac{g_{\sigma(j)}}{d({% \sigma(j)})}}{3+\sum_{j=1}^{p}\frac{1}{d({\sigma(j)})}},

where $d(\sigma(j))$ denotes the degree of vertex $\sigma(j)$ in $G$ . If $g_{i}\leq 0$ for all $i\in K_{k}$ , then $\mathcal{P}_{T^{k}_{f}}(a)=z$ , where $z_{e}=a_{e}$ for all $e\in E\setminus\{f\}$ and $z_{f}=z_{m+1}=z_{m+2}=\frac{a_{f}+a_{m+1}+a_{m+2}}{3}$ . Otherwise, let $p^{*}$ denote the largest index $p$ for which $g_{\sigma(p)}>\omega(p)$ . Then, $\mathcal{P}_{T^{k}_{f}}(a)=z$ , where

\displaystyle z_{e}=\begin{cases}\frac{a_{f}+a_{m+1}+a_{m+2}}{3}-\omega(p^{*})&\text{if $e\in\{f,m+1,m+2\}$},\\ a_{e}+\frac{1}{d(i)}(g_{i}-\omega(p^{*}))&\text{if $e\in\delta(i)\setminus\{f\}$ for $i\in K_{k}\setminus V(f)$ with $\sigma^{-1}(i)\leq p^{*}$,}\\ a_{e}-\frac{1}{d(i)-1}\sum_{e^{\prime}\in\delta(i)\setminus\{f\}}a_{e^{\prime}}&\text{if $e\in\delta(i)\setminus\{f\}$ for $i\in K_{k}\cap V(f)$ with $\sum_{e^{\prime}\in\delta(i)\setminus\{f\}}a_{e^{\prime}}<0$},\\ a_{e}&\text{otherwise.}\end{cases}

Proof.

First, observe that if $g_{\sigma(1)}\leq 0$ , then $g_{i}\leq 0$ for all $i\in K_{k}$ . Consequently, the projection of $a$ onto $T^{k}_{f}$ is given by $z$ , where $z$ is such that $z_{e}=a_{e}$ for all $e\in E\setminus\{f\}$ and $z_{f}=z_{m+1}=z_{m+2}=\frac{a_{f}+a_{m+1}+a_{m+2}}{3}$ .

If $g_{\sigma(1)}>0$ , then $\omega(1)=\frac{\frac{g_{\sigma(1)}}{d(\sigma(1))}}{3+\frac{1}{d(\sigma(1))}}<% \frac{g_{\sigma(1)}}{d(\sigma(1))}\leq g_{\sigma(1)}.$ Hence, the largest index $p$ for which $g_{\sigma(p)}>\omega(p)$ , i.e., the index $p^{*}$ , exists. Next, we prove that the projection $z=\mathcal{P}_{T^{k}_{f}}(a)$ is of the described form.

Using the fact that $z_{f}=z_{m+1}=z_{m+2}$ , the vector $z$ can be obtained as the solution of the following optimization problem, where we restrict to the support of the constraints in $T^{k}_{f}$ .

\displaystyle\begin{aligned} \min_{z}\quad&\sum_{i\in K_{k}}\,\sum_{e\in\delta% (i)\setminus\{f\}}||a_{e}-z_{e}||_{2}^{2}+||a_{f}-z_{f}||^{2}_{2}+||a_{m+1}-z_% {f}||^{2}_{2}+||a_{m+2}-z_{f}||^{2}_{2}\\ \text{s.t.}\quad&\sum_{e\in\delta(i)}z_{e}\geq z_{f}\qquad\forall i\in K_{k}.% \end{aligned}

(20)

Let $\lambda_{i}$ , $i\in K_{k}$ , denote the dual variables corresponding to the constraints of (20). We further denote by $V(f)$ the two vertices in $G$ adjacent to $f\in E$ . Then, the KKT optimality conditions for (20) are as follows

$\displaystyle 2(z_{e}-a_{e})-\lambda_{i}$	$\displaystyle=0\qquad\forall e\in\delta(i)\setminus\{f\},~{}\forall i\in K_{k}$	(21)
$\displaystyle 6z_{f}-2(a_{f}+a_{m+1}+a_{m+2})+\sum_{i\in K_{k}\setminus V(f)}% \lambda_{i}$	$\displaystyle=0$	(22)
$\displaystyle\sum_{e\in\delta(i)}z_{e}$	$\displaystyle\geq z_{f}\quad\ \,\forall i\in K_{k}$	(23)
$\displaystyle\lambda_{i}(z_{f}-\sum_{e\in\delta(i)}z_{e})$	$\displaystyle=0\qquad\forall i\in K_{k}$	(24)
$\displaystyle\lambda_{i}$	$\displaystyle\geq 0\qquad\forall i\in K_{k}.$	(25)

It follows from (21) and (22) that we have

	$\displaystyle z_{f}$	$\displaystyle=\frac{a_{f}+a_{m+1}+a_{m+2}}{3}-\frac{1}{6}\sum_{i\in K_{k}% \setminus V(f)}\lambda_{i},$		and		(26)
	$\displaystyle z_{e}$	$\displaystyle=a_{e}+\frac{1}{2}\lambda_{i}$		$\displaystyle\forall e\in\delta(i)\setminus\{f\},\ \forall i\in K_{k}.$		(26)

Suppose $K^{*}\subseteq K_{k}$ is the set of vertices for which $\lambda_{i}>0$ at an optimal solution of (20). The complementary slackness constraints (24) then imply that $z_{f}=\sum_{e\in\delta(i)}z_{e}$ for all $i\in K^{*}\setminus V(f)$ and $\sum_{e\in\delta(i)\setminus\{f\}}z_{e}=0$ for $i\in K^{*}\cap V(f)$ . Note that $\lvert K^{*}\cap V(f)\rvert\leq 1$ since $K_{k}$ is an independent set in $G$ . By exploiting (26) and $\sum_{j\in K_{k}\setminus V(f)}\lambda_{j}=\sum_{j\in K^{*}\setminus V(f)}% \lambda_{j}$ , these equations can be rewritten to

$\displaystyle\frac{a_{f}+a_{m+1}+a_{m+2}}{3}$	$\displaystyle-\frac{1}{6}\sum_{j\in K^{*}\setminus V(f)}\lambda_{j}=\sum_{e\in% \delta(i)}\left(a_{e}+\frac{1}{2}\lambda_{i}\right)$
$\displaystyle\Longleftrightarrow\quad\lambda_{i}$	$\displaystyle=\frac{2}{d(i)}\left(\frac{a_{f}+a_{m+1}+a_{m+2}}{3}-\sum_{e\in% \delta(i)}a_{e}-\frac{1}{6}\sum_{j\in K^{*}\setminus V(f)}\lambda_{j}\right)$
$\displaystyle\Longleftrightarrow\quad\lambda_{i}$	$\displaystyle=\frac{2}{d(i)}\left(g_{i}-\frac{1}{6}\sum_{j\in K^{*}\setminus V% (f)}\lambda_{j}\right)$	(27)

for all $i\in K^{*}\setminus V(f)$ . Summing the latter equations over all $i\in K^{*}\setminus V(f)$ yields

\sum_{i\in K^{*}\setminus V(f)}\lambda_{i}=2\sum_{i\in K^{*}\setminus V(f)}% \frac{g_{i}}{d(i)}-\frac{1}{3}\sum_{i\in K^{*}\setminus V(f)}\frac{1}{d(i)}% \sum_{j\in K^{*}\setminus V(f)}\lambda_{j},

or equivalently, $\sum_{i\in K^{*}\setminus V(f)}\lambda_{i}=\frac{2\sum_{i\in K^{*}\setminus V(% f)}\frac{g_{i}}{d(i)}}{1+\frac{1}{3}\sum_{i\in K^{*}\setminus V(f)}\frac{1}{d(% i)}}.$ After substitution into (27), we obtain

\displaystyle\lambda_{i}=\frac{2}{d(i)}\left(g_{i}-\frac{\sum_{j\in K^{*}\setminus V(f)}\frac{g_{j}}{d(j)}}{3+\sum_{j\in K^{*}\setminus V(f)}\frac{1}{d(j)}}\right)>0

(28)

for all $i\in K^{*}\setminus V(f)$ . For each $i\in(K_{k}\setminus K^{*})\setminus V(f)$ , we have $\lambda_{i}=0$ . The inequalities (23) for these $i$ then read

\sum_{e\in\delta(i)}a_{e}\geq\frac{a_{f}+a_{m+1}+a_{m+2}}{3}-\frac{\sum_{j\in K^{*}\setminus V(f)}\frac{g_{j}}{d(j)}}{3+\sum_{j\in K^{*}\setminus V(f)}\frac{1}{d(j)}},

or equivalently,

g_{i}-\frac{\sum_{j\in K^{*}\setminus V(f)}\frac{g_{j}}{d(j)}}{3+\sum_{j\in K^{*}\setminus V(f)}\frac{1}{d(j)}}\leq 0.

(29)

By combining (28) and (29) we obtain the following optimality conditions on the dual variables $\lambda$ concerning the indices in $K_{k}\setminus V(f)$

\displaystyle\left\{\begin{aligned} \frac{2}{d(i)}\left(g_{i}-\frac{\sum_{j\in K^{*}\setminus V(f)}\frac{g_{j}}{d(j)}}{3+\sum_{j\in K^{*}\setminus V(f)}\frac{1}{d(j)}}\right)&>0&&\text{for all $i\in K^{*}\setminus V(f)$,}\\ g_{i}-\frac{\sum_{j\in K^{*}\setminus V(f)}\frac{g_{j}}{d(j)}}{3+\sum_{j\in K^{*}\setminus V(f)}\frac{1}{d(j)}}&\leq 0&&\text{for all $i\in(K_{k}\setminus K^{*})\setminus V(f)$.}\end{aligned}\right.

(30)

We conclude from the conditions (30) that the support of $\lambda$ restricted to $K_{k}\setminus V(f)$ always consists of the vertices for which $g_{i}$ lies above a certain threshold value. To find this threshold value, we sort the $g_{i}$ ’s in non-increasing order and check all possible candidate sets for $K^{*}\setminus V(f)$ corresponding to the first $p$ entries in this sorted list. Let $\sigma\colon[n_{k}]\to K_{k}$ denote a corresponding sorting permutation, i.e., $\sigma$ is bijective and fulfills $g_{\sigma(1)}\geq g_{\sigma(2)}\geq\dots\geq g_{\sigma(n_{k})}$ . For each candidate set $\{\sigma(1),\ldots,\sigma(p)\}\subseteq K_{k}\setminus V(f)$ , it suffices to check whether $g_{\sigma(p)}$ is strictly larger than the candidate threshold value

\omega(p)\coloneqq\frac{\sum_{j=1}^{p}\frac{g_{\sigma(j)}}{d(\sigma(j))}}{3+% \sum_{j=1}^{p}\frac{1}{d(\sigma(j))}}.

If $p^{*}$ is the largest index for which this holds, then this candidate set equals $K^{*}\setminus V(f)$ . The existence of such a $p^{*}$ is guaranteed by the existence of a solution to the projection problem (20).

Finally, we need to address the optimality conditions for all $i\in K_{k}\cap V(f)$ . In case $i\in K^{*}\cap V(f)$ , we have $\lambda_{i}>0$ , and due to complementary slackness (24) it holds that

0=\sum_{e\in\delta(i)\setminus\{f\}}z_{e}=\sum_{e\in\delta(i)\setminus\{f\}}% \Big{(}a_{e}+\frac{1}{2}\lambda_{i}\Big{)},~{}~{}\text{or equivalently, }~{}~{% }\lambda_{i}=-\frac{2}{d(i)-1}\sum_{e\in\delta(i)\setminus\{f\}}a_{e}>0.

We note here that we may w.l.o.g. assume that $d(i)>1$ . Namely, if $d(i)=1$ , then the set $\delta(i)\setminus\{f\}$ is empty, hence $\lambda_{i}$ will not appear anywhere in (26), making this dual variable redundant.

For the case $i\in(K_{k}\setminus K^{*})\cap V(f)$ , and hence $\lambda_{i}=0$ , condition (24) with (23) reads as $\sum_{e\in\delta(i)\setminus\{f\}}a_{e}\geq 0$ . Combining both cases, we obtain the following optimality conditions for $i\in K_{k}\cap V(f)$ :

\left\{\begin{aligned} \sum_{e\in\delta(i)\setminus\{f\}}a_{e}&<0&&\text{for $% i\in K^{*}\cap V(f)$,}\\ \sum_{e\in\delta(i)\setminus\{f\}}a_{e}&\geq 0&&\text{for $i\in(K_{k}\setminus K% ^{*})\cap V(f)$.}\end{aligned}\right.

Altogether, the equations (26) then imply

\displaystyle z_{e}=\begin{cases}\frac{a_{f}+a_{m+1}+a_{m+2}}{3}-\omega(p^{*})&\text{if $e\in\{f,m+1,m+2\}$,}\\ a_{e}+\frac{1}{d(i)}(g_{i}-\omega(p^{*}))&\text{if $e\in\delta(i)\setminus\{f\}$ and $i\in K^{*}\setminus V(f)$,}\\ a_{e}-\frac{1}{d(i)-1}\sum_{e^{\prime}\in\delta(i)\setminus\{f\}}a_{e^{\prime}}&\text{if $e\in\delta(i)\setminus\{f\}$ and $i\in K_{k}\cap V(f)$ with $\sum_{e^{\prime}\in\delta(i)\setminus\{f\}}a_{e^{\prime}}<0,$}\\ a_{e}&\text{otherwise.}\end{cases}

∎

It follows from Proposition 7 that the projection onto $T^{k}_{f}$ involves both a sorting and an enumeration of a list of $n_{k}$ elements. Hence, the worst-case time complexity is $O(n_{k}\log n_{k})$ .
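The thresholding procedure can be illustrated in Python for the simplified case $K_{k}\cap V(f)=\emptyset$ , i.e., when no vertex of the color class is an endpoint of $f$ , so that only the first, second and last cases of the formula for $z_{e}$ occur. This is a sketch under that assumption, not the authors' implementation. The function name `project_onto_T`, the 0-based indexing (edges $0,\dots,m-1$ , with positions $m$ and $m+1$ playing the roles of $z_{m+1}$ and $z_{m+2}$ ) and the dictionary `delta` of incident edge indices are hypothetical choices.

```python
def project_onto_T(a, f, m, delta, K):
    """Project a (length m + 2) onto T_f^k, assuming no vertex in K is an
    endpoint of edge f. delta[i] lists the edges incident to vertex i."""
    avg = (a[f] + a[m] + a[m + 1]) / 3.0
    g = {i: avg - sum(a[e] for e in delta[i]) for i in K}
    order = sorted(K, key=lambda i: g[i], reverse=True)  # non-increasing g
    # evaluate omega(p) for p = 1..|K| and keep the largest p with g > omega(p)
    num, den = 0.0, 3.0
    p_star, omega_star = 0, 0.0
    for p, i in enumerate(order, start=1):
        num += g[i] / len(delta[i])
        den += 1.0 / len(delta[i])
        if g[i] > num / den:
            p_star, omega_star = p, num / den
    z = list(a)
    # the three tied entries move to their mean, lowered by the threshold
    z[f] = z[m] = z[m + 1] = avg - omega_star
    # edges at the active vertices are raised uniformly
    for i in order[:p_star]:
        shift = (g[i] - omega_star) / len(delta[i])
        for e in delta[i]:
            z[e] = a[e] + shift
    return z
```

If all $g_{i}\leq 0$ , no index passes the test, `omega_star` stays zero, and the function reduces to averaging the three tied entries, as in the first case of Proposition 7. The degree $d(i)$ is taken as `len(delta[i])`.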

In fact, for computational purposes, we do not project onto $\mathcal{Y}_{RLT}$ but iteratively add only violated cuts. To this end, we denote by $\mathcal{C}\subseteq V\times E$ the set of violated cuts that we add to $\mathcal{Y}$ , where an element $(i,f)$ represents the cut $\sum_{e\in\delta(i)}y_{fe}\geq y_{f}$ . Analogously to $\mathcal{Y}_{RLT}$ , we further define the polyhedral set

\mathcal{Y}_{\mathcal{C}}\coloneqq\left\{\widetilde{Y}\in\mathbb{R}^{(m+1)% \times(m+1)}\,:\,\,\widetilde{Y}=\begin{pmatrix}Y&y\\ y^{\top}&1\end{pmatrix},\ \text{diag}(Y)=y,~{}\sum_{e\in\delta(i)}y_{fe}\geq y% _{f}\quad\forall(i,f)\in\mathcal{C}\right\}.

The projection follows the same idea as explained above for the projection onto $\mathcal{Y}_{RLT}$ , but in this case, instead of partitioning the vertex set $V$ into independent sets, we can partition the constraints in $\mathcal{C}$ for each edge $f$ separately. For a fixed $f$ , we partition the vertices occurring together with $f$ in $\mathcal{C}$ into independent sets $K^{f}_{1},\dots,K^{f}_{N_{f}}$ . Note that the number of independent sets $N_{f}$ for an edge is typically much smaller than the number of colors needed to color the whole graph, which can, in the worst case of a complete graph, be the number of vertices. Furthermore, as mentioned above, it is possible to project independently over each row $f\in E$ , which allows us to parallelize this step. Hence, we cluster the cut constraints in $\mathcal{C}_{k}=\big{\{}(i,f)\in{\mathcal{C}}:f\in E,\ i\in K^{f}_{k}\big{\}}$ for $1\leq k\leq N_{max}$ with $N_{max}\coloneqq\max\{N_{f}:f\in E\}$ and obtain $\mathcal{Y}_{\mathcal{C}}=\mathcal{Y}\cap\Bigg{(}\bigcap_{k=1}^{N_{max}}\mathcal{Y}_{\mathcal{C}_{k}}\Bigg{)},$ where we can easily project onto $\mathcal{Y}_{\mathcal{C}_{k}}$ using Proposition 7. Pseudocode for Dykstra’s cyclic projection algorithm to project onto $\mathcal{Y}_{\mathcal{C}}$ can be found in Algorithm 1.

Algorithm 1 Dykstra’s cyclic projection algorithm to project onto $\mathcal{Y}_{\mathcal{C}}$

Input: matrix $M$ , cuts $\mathcal{C}$ , $\varepsilon_{proj}$
Output: the projection $\mathcal{P}_{\mathcal{Y}_{\mathcal{C}}}(M)$ of $M$ onto $\mathcal{Y}_{\mathcal{C}}$

1: cluster $\mathcal{C}$ into $\{\mathcal{C}_{1},\dots,\mathcal{C}_{N_{max}}\}$
2: initialize $X=M$ , $P=\mathbf{0}$ , $Q_{1}=\dots=Q_{N_{max}}=\mathbf{0}$
3: repeat
4:   $X_{old}=X$
5:   $X_{tmp}=X+P$
6:   $X=\mathcal{P}_{\mathcal{Y}}(X_{tmp})$
7:   $P=X_{tmp}-X$
8:   for $k=1,\dots,N_{max}$ do
9:     $X_{tmp}=X+Q_{k}$
10:     $X=\mathcal{P}_{\mathcal{Y}_{\mathcal{C}_{k}}}(X_{tmp})$
11:     $Q_{k}=X_{tmp}-X$
12:   end for
13: until $\lVert X_{old}-X\rVert<\varepsilon_{proj}$
14: return $X$
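To make the structure of Algorithm 1 concrete, the following Python sketch runs Dykstra's cyclic projection scheme on vectors for an arbitrary list of projection operators. It is a toy illustration, not the authors' code: the names `dykstra`, `project_box` and `project_halfspace` are hypothetical, and the two example sets (the unit box and a halfspace) merely stand in for $\mathcal{Y}$ and the sets $\mathcal{Y}_{\mathcal{C}_{k}}$ .

```python
import math

def dykstra(x0, projections, eps=1e-10, max_iter=10000):
    """Dykstra's cyclic projection onto the intersection of convex sets,
    each given by its projection operator (a function on lists)."""
    x = list(x0)
    # one correction vector per set (P and Q_1, ..., Q_N in Algorithm 1)
    corrections = [[0.0] * len(x0) for _ in projections]
    for _ in range(max_iter):
        x_old = list(x)
        for idx, proj in enumerate(projections):
            tmp = [xi + ci for xi, ci in zip(x, corrections[idx])]
            x = proj(tmp)
            corrections[idx] = [ti - xi for ti, xi in zip(tmp, x)]
        change = math.sqrt(sum((u - v) ** 2 for u, v in zip(x_old, x)))
        if change < eps:
            break
    return x

def project_box(x):
    """Projection onto [0, 1]^d."""
    return [min(max(xi, 0.0), 1.0) for xi in x]

def project_halfspace(x, c=1.5):
    """Projection onto {x : sum(x) >= c}."""
    deficit = c - sum(x)
    if deficit <= 0:
        return list(x)
    return [xi + deficit / len(x) for xi in x]
```

For instance, `dykstra([2.0, 0.2], [project_box, project_halfspace])` approaches the point $(1,0.5)$ , which is the Euclidean projection of $(2,0.2)$ onto the intersection of the two sets. Plain alternating projections without the correction terms would in general only return some point in the intersection, not the nearest one.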

To compute the lower bound (18) with a PRSM algorithm, we first compute the DNN bound (17) with the PRSM, as explained in the previous subsection. Then, we separate violated cuts from the current solution and add the ncutsmax most violated ones to $\mathcal{C}$ . We then proceed to compute (17) with the additional new cuts in $\mathcal{C}$ with the PRSM and use the solution from before for a warm-start. This process of separating and adding new cuts to $\mathcal{C}$ in an outer loop is iterated until one of the stopping criteria is met. Algorithm 2 provides a pseudocode for the described algorithm.

Algorithm 2 PRSM algorithm to compute lower bounds on the QMST

Input: graph $G=(V,E)$ , cost matrix $\widetilde{Q}$
Output: (valid) lower bound LB

1: initialize $\widetilde{Y}^{0}$ , $S^{0}$ , $\beta$ , $\gamma_{1}$ , $\gamma_{2}$ , set $\mathcal{C}=\emptyset$   $\triangleright$ cf. Section 6
2: compute $W$ , e.g., apply QR decomposition to $(\begin{matrix}(n-1)\mathbf{I}_{m}&\mathbf{1}_{m}\end{matrix})^{\top}$
3: $k=0$
4: while no stopping criteria met do
5:   while no stopping criteria met do
6:     $R^{k+1}=\mathcal{P}_{\mathcal{R}}(W^{\top}(\widetilde{Y}^{k}+\frac{1}{\beta}S^{k})W)$
7:     $S^{\frac{k+1}{2}}=S^{k}+\gamma_{1}\beta(\widetilde{Y}^{k}-WR^{k+1}W^{\top})$
8:     $\widetilde{Y}^{k+1}=\mathcal{P}_{\mathcal{Y}_{\mathcal{C}}}\big{(}WR^{k+1}W^{\top}-\frac{1}{\beta}\big{(}\widetilde{Q}+S^{\frac{k+1}{2}}\big{)}\big{)}$
9:     $S^{k+1}=S^{\frac{k+1}{2}}+\gamma_{2}\beta(\widetilde{Y}^{k+1}-WR^{k+1}W^{\top})$
10:     $k=k+1$
11:   end while
12:   compute a valid lower bound LB from $S^{k}$   $\triangleright$ cf. Section 5.3
13:   separate violated cuts and add the ncutsmax most violated ones to $\mathcal{C}$
14:   cluster the cuts in $\mathcal{C}$
15: end while
16: return LB

5.3 Stopping criteria and post-processing of the PRSM algorithm

In this subsection, we briefly discuss the stopping criteria and the post-processing phase of our PRSM algorithm.

Stopping criteria

We use several criteria to decide when to stop the inner and outer iterations of Algorithm 2. The main stopping criterion for the inner while loop is that the primal and dual errors satisfy

\max\Bigg{\{}\frac{\big{\lVert}\widetilde{Y}^{k+1}-WR^{k+1}W^{\top}\big{\rVert% }_{F}}{1+\big{\lVert}\widetilde{Y}^{k+1}\big{\rVert}_{F}},\ \beta\frac{\big{% \lVert}W^{\top}\big{(}\widetilde{Y}^{k}-\widetilde{Y}^{k+1}\big{)}W\big{\rVert% }_{F}}{1+\big{\lVert}S^{k+1}\big{\rVert}_{F}}\Bigg{\}}\leq\varepsilon_{PRSM},

cf. [4]. We further stop the inner iterations when the maximum number of total PRSM iterations or a time limit is reached. In that case, we compute a valid dual bound as described below, and stop the algorithm.
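The two relative errors in the inner stopping test can be evaluated as in the following Python sketch, which works on small dense matrices stored as nested lists. The helper names (`matmul`, `transpose`, `sub`, `fro`, `prsm_errors`) are hypothetical, and a practical implementation would rely on an optimized linear algebra library instead.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def fro(A):
    """Frobenius norm."""
    return math.sqrt(sum(x * x for row in A for x in row))

def prsm_errors(Y_new, Y_old, R_new, S_new, W, beta):
    """Relative primal and dual errors of one PRSM iteration."""
    Wt = transpose(W)
    lifted = matmul(matmul(W, R_new), Wt)            # W R W^T
    primal = fro(sub(Y_new, lifted)) / (1.0 + fro(Y_new))
    reduced = matmul(matmul(Wt, sub(Y_old, Y_new)), W)  # W^T (Y^k - Y^{k+1}) W
    dual = beta * fro(reduced) / (1.0 + fro(S_new))
    return primal, dual
```

The inner loop would terminate once `max(primal, dual)` drops below $\varepsilon_{PRSM}$ .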

For the outer loop, we have the following possible stopping criteria. If an upper bound is known, the algorithm stops as soon as the obtained valid lower bound closes the gap. We further stop the algorithm if the number of new violated cuts found is below a certain threshold ncutsmin. If the improvement of the valid lower bound compared to the valid lower bound of the previous outer iteration is smaller than epslbimprov, we stop the algorithm as well. And finally, we stop after a maximum of noutermax outer iterations.

Valid lower bound

The objective value of the final PRSM iterate in Algorithm 2 does not necessarily provide a lower bound for the problem, as the convergence of the PRSM is typically not monotonic and one usually stops the algorithm before convergence. Therefore, it is necessary to perform a postprocessing procedure to obtain a valid lower bound. We apply the approach presented in [26]. The safe lower bound derived by this method is then given by

\text{lb}(S^{\text{out}})=\min_{\widetilde{Y}\in\mathcal{Y}_{\mathcal{C}}}% \langle\widetilde{Q}+S^{\text{out}},\widetilde{Y}\rangle-n\lambda_{\max}(W^{% \top}S^{\text{out}}W),

where $S^{\text{out}}$ denotes the dual matrix variable resulting from (an early stop of) the PRSM. The computation of this lower bound boils down to computing the largest eigenvalue and solving a linear program. Similarly, one can obtain a valid lower bound from the PRSM algorithm that solves (17), by replacing $\mathcal{Y}_{\mathcal{C}}$ with $\mathcal{Y}$ , see (16), in the above expression.

6 Numerical results

We implemented our algorithm in Julia [3] version 1.10.0 (the code can be found at https://github.com/melaniesi/QMST.jl and as ancillary files on the arXiv page of this paper). For solving the linear program to compute a valid lower bound, we use the solver HiGHS [23] with the modeling language JuMP [28]. The projection onto $\mathcal{Y}_{\mathcal{C}_{k}}$ is multithreaded. All computations were carried out on an AMD EPYC 7343 with 16 cores at 4.00GHz and 1024GB RAM, operated under Debian GNU/Linux 11.

Parameter setting

We initialize the matrices, penalty parameters, and step lengths as follows. As starting values for the matrices, we choose $S^{0}=\mathbf{0}$ and

\widetilde{Y}^{0}=\left(\begin{smallmatrix}\frac{(n-1)}{m}\mathbf{I}+\frac{(n-% 1)(n-2)}{m(m-1)}(\mathbf{J}-\mathbf{I})&~{}\frac{(n-1)}{m}\mathbf{1}\\[6.45831% pt] \frac{(n-1)}{m}\mathbf{1}^{\top}&1\end{smallmatrix}\right)
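As a quick sanity check of this starting point, the following Python sketch builds $\widetilde{Y}^{0}$ as a nested list. The function name `initial_Y` is hypothetical, and $m\geq 2$ is assumed so that the off-diagonal value is well defined.

```python
def initial_Y(n, m):
    """Build the (m+1) x (m+1) starting matrix for a graph with n vertices
    and m edges (m >= 2)."""
    d = (n - 1) / m                        # diagonal of Y and entries of y
    o = (n - 1) * (n - 2) / (m * (m - 1))  # off-diagonal entries of Y
    Y = [[d if i == j else o for j in range(m)] + [d] for i in range(m)]
    Y.append([d] * m + [1.0])              # last row (y^T, 1)
    return Y
```

By construction, the matrix has trace $n$ , entries in $[0,1]$ and $\text{diag}(Y)=y$ , and each row of the block $Y$ sums to $(n-1)y_{i}$ , so it satisfies the redundant constraints $y_{i}(\mathbf{1}_{m}^{\top}y)=(n-1)y_{i}$ in lifted form.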

Based on the results of numerical tests, we have determined the values for the penalty parameter $\beta$ and step lengths. We set the step length parameters to $\gamma_{1}=0.9$ , $\gamma_{2}=1$ . For the penalty parameter, let $q_{\max}\coloneqq\max\{\text{tr}(Q),\lVert Q\rVert_{F}\}$ and $q_{\min}\coloneqq\min\{\text{tr}(Q),\lVert Q\rVert_{F}\}$ . We then set

\beta=\begin{cases}\sqrt{\frac{q_{\min}}{m+1}\lVert Q\rVert_{F}}&\text{if }% \frac{q_{\max}}{q_{\min}}<1.2,\\ \sqrt{\frac{q_{\max}}{q_{\min}}\lVert Q\rVert_{F}}&\text{else.}\end{cases}

We run our algorithm for all instances with $\varepsilon_{PRSM}=10^{-4}$ and the parameter $\varepsilon_{proj}$ is set to $10^{-5}$ . Violated cuts are considered if the violation is greater than $10^{-3}$ and after each outer iteration, the ncutsmax = $m$ most violated cuts are added. No further cuts are added if the improvement of the lower bound is smaller than epslbimprov = $10^{-3}$ or the number of new violated cuts found is less than ncutsmin = 10. The maximum wall-clock time for running our algorithm is set to 3 hours per instance, and the maximum number of total iterations is set to $10\,000$ . We set the number of maximum outer iterations to noutermax = 10.
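The parameter rules above can be written down compactly. The Python sketch below evaluates the case distinction for $\beta$ and checks the step-length conditions stated in Section 5.1. The function names are hypothetical, the trace and Frobenius norm of $Q$ are passed in as plain numbers, and $q_{\min}>0$ is assumed.

```python
import math

def step_lengths_admissible(gamma1, gamma2):
    """Check the conditions on the PRSM step lengths gamma1, gamma2."""
    golden = (1 + math.sqrt(5)) / 2
    return (-1 < gamma1 < 1
            and 0 < gamma2 < golden
            and gamma1 + gamma2 > 0
            and abs(gamma1) < 1 + gamma2 - gamma2 ** 2)

def penalty_parameter(trace_Q, fro_Q, m):
    """Penalty parameter beta from tr(Q), ||Q||_F and the number of edges m."""
    q_max = max(trace_Q, fro_Q)
    q_min = min(trace_Q, fro_Q)
    if q_max / q_min < 1.2:
        return math.sqrt(q_min / (m + 1) * fro_Q)
    return math.sqrt(q_max / q_min * fro_Q)
```

With $\gamma_{1}=0.9$ and $\gamma_{2}=1$ , the last condition reads $0.9<1+1-1=1$ , so the chosen step lengths are admissible.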

Benchmark instances

We test our algorithm on the following three benchmark sets. The first benchmark set OP was introduced in [30] by Öncan and Punnen. The benchmark set consists of 3 different classes, each consisting of 160 instances on complete graphs: the OPsym, OPvsym and OPesym instances. The OPsym instances have diagonal entries chosen uniformly from $[100]$ , and the off-diagonal values are uniformly distributed at random in $[20]$ . For instances in the class OPvsym, the diagonal values are uniformly distributed in $[10\,000]$ , and the off-diagonal values $Q_{\{i,j\},\{k,l\}}$ are computed as $w(i)w(j)w(k)w(l)$ , where $w\colon V\to[10]$ assigns to each vertex in the graph a uniformly distributed weight at random in $[10]$ . The cost matrix for instances of the type OPesym is constructed in the following way. First, the vertex coordinates are randomly chosen in the box $[0,100]\times[0,100]$ , and the edges are represented as straight lines connecting vertices. The edge cost $Q_{ee}$ is then set as the length of the edge $e$ , and the interaction cost between two edges $e$ and $f$ is computed as the Euclidean distance between the midpoints of $e$ and $f$ . For each of those test sets, they randomly generated 10 instances each for $n\in\{6,7,\dots,17,18\}\cup\{20,30,50\}$ . We do not include the benchmark instances of type OPesym with $n=20$ in our study, as we were unable to locate the correct instances (in the benchmark set https://data.mendeley.com/datasets/cmnh9xc6wb/1, the instances indicated as type OPesym for $n=20$ are the OPvsym instances for $n=6$ ).

The second family of benchmark instances CP was introduced by Cordone and Passeri in [10]. The benchmark set consists of 108 instances divided into 4 classes, specifying the sets from which the diagonal and off-diagonal values of the cost matrix are chosen uniformly at random. For each combination of the number of vertices $n\in\{10,15,20,25,30,35,40,45,50\}$ , density $d\in\{33\%,67\%,100\%\}$ and class, one random graph was generated. The values of the cost matrix $Q$ are uniformly distributed on the sets as listed below.

class	CP1	CP2	CP3	CP4
diagonal values	[10]	[10]	[100]	[100]
off-diagonal values	[10]	[100]	[10]	[100]

The last benchmark set SV was introduced by Sotirov and Verchére in their recent paper [39]. It consists of 24 instances, with one random graph for each pair of $n\in\{10,12,14,16,18,20,25,30\}$ and $d\in\{33\%,67\%,100\%\}$ . They constructed the cost matrices in such a way that for a given maximum cost for the diagonal entries, and a maximum cost for the off-diagonal entries, 10% of the edges have high interaction costs with each other (between 90 and 100% of the maximum off-diagonal cost) and low interaction costs with the rest (between 20 and 40% of the maximum off-diagonal cost). The other 90% of edges have an interaction cost of between 50 and 70% of the maximum off-diagonal cost with each other. The diagonal entries are chosen to be between 0 and 20% of the maximum diagonal cost.

Bounds from the literature

We compare our numerical results to lower bounds from [20, 34, 39]. The upper bounds on the benchmark instances are taken from the literature.

The bounds from [20], called LAGN and LAGP, are used in the best exact algorithm for the QMSTP to date. Those bounds are obtained from two different ways of dualizing an SDP relaxation of the QMSTP. For LAGN, the semidefiniteness constraint is dualized, and a subgradient method is used to compute the optimum. For LAGP, in contrast, no semidefiniteness constraint is present; instead, a semi-infinite reformulation together with polyhedral cutting planes is solved using a bundle method.

The lower bounds VS1 and VS2 were introduced by Sotirov and Verchére in [39]. These lower bounds are based on an extended formulation of the minimum quadratic spanning tree problem and are strengthened by facet defining inequalities of the Boolean Quadric polytope. The lower bound VS2 is stronger than VS1.

Pereira et al. [34] solved several benchmark problems of sizes up to 50 vertices using a RLT based relaxation RLT1. RLT1 is an incomplete first level RLT relaxation and is computed by dualizing the symmetry constraint, applying the GL procedure, and using a subgradient algorithm. Another RLT based bound among the strongest relaxations in the literature is RLT2, presented in [37]. The authors of [37] use a dual-ascent procedure for computing their relaxation based on the second-level of RLT.

Computational results

We first present a comparison of our algorithm to the results from [20], where the authors also compute SDP bounds. Their computations were carried out on a machine with 32 GB RAM and two E5645 Intel Xeon processors, with six 2.40GHz cores each.

The structure of Table 1 is analogous to Table 4 in [20] and reads as follows. The rows are grouped into 3 blocks, each reporting the results averaged over all CP instances with the same property as specified in the first column of the table. The first block of rows averages over instances of the same size, the second averages the results over the densities of the graphs, and the last block averages over the different classes of the CP instances. In the second column of Table 1, we report the average gap of the valid lower bound obtained with our PRSM algorithm when stopping after the first outer iteration, cf. (17). We compute the relative gap between that lower bound ( $\text{LB}_{DNN}$ ) and the best known upper bound (UB) from the literature using $100(\text{UB}-\text{LB}_{DNN})/\text{UB}$ . We remark that the same gap was calculated in Guimarães et al. [20]. (There is a typo in that paper that suggests otherwise, but our statement can be easily verified.) In the third column, we report the average wall clock time in seconds needed to compute this lower bound. In column 4, we report the average gap of the bound returned by Algorithm 2, cf. (18), and in column 5, the average time needed to compute this bound. In the sixth and seventh columns, we list the average gaps and computation times for the bound LAGN of [20], which is used in the best exact algorithm for the QMSTP to date. The average gaps and computation times of LAGP, the second lower bound introduced in [20], are given in the last two columns of Table 1.
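To make the three blocks of averages concrete, the following Python sketch groups per-instance gaps by size, density and class. The helper `block_average`, the record layout and the numbers are hypothetical placeholders chosen for illustration; they are not results from this paper or from [20].

```python
from collections import defaultdict


def block_average(records, key, value="gap"):
    """Average `value` over all records that share the same `key` entry."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec[value])
    return {k: sum(v) / len(v) for k, v in sorted(groups.items())}


# Made-up placeholder records, NOT values reported in any table.
records = [
    {"n": 10, "d": 33, "cls": "CP1", "gap": 2.0},
    {"n": 10, "d": 67, "cls": "CP2", "gap": 4.0},
    {"n": 15, "d": 33, "cls": "CP1", "gap": 6.0},
]

by_size = block_average(records, "n")       # first block of rows
by_density = block_average(records, "d")    # second block of rows
by_class = block_average(records, "cls")    # third block of rows
```

Each instance contributes to all three blocks, so the blocks are three different summaries of the same set of results.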

The results in Table 1 show that for the CP instances, our lower bounds are, on average, significantly stronger than the SDP bounds LAGN and LAGP. Except for the instances with $n\in\{10,15\}$ , the average computation times for solving our relaxations are smaller than those reported for computing the SDP bounds LAGN and LAGP. The average time to compute the DNN + CUTS bound, that is (18), over all CP instances is $51$ seconds, compared to $1\,360$ and $5\,652$ seconds for LAGN and LAGP, respectively. The differences in computation times and relative gaps are more pronounced for larger instances. One can also observe that the less dense the instances are, the smaller the average relative gap. Furthermore, the effect of adding cuts is more significant for sparse graphs than for dense graphs. Guimarães et al. [20, Table 4] compare their bounds to RLT1 [34], which can be computed approximately three times faster than LAGN but yields much weaker bounds. The average gap of the bound RLT2 [37] over all instances of size $n\leq 35$ for each of the four CP classes is at least three times larger than our reported average gaps for (17). Overall, Table 1 shows that, especially for larger CP instances, our bounds are significantly stronger and faster to compute than any other bounds.

In Tables 2, 3, 4, 5 and 6, we report the numerical results for all benchmark instances of the test sets CP and SV. The first four columns give details about the instance: the number of vertices, the edge density, the number of edges and an upper bound on the optimal value of the QMSTP. The next three columns report the valid lower bound (17) obtained after the first outer loop of our PRSM algorithm, the relative gap to the upper bound $100(\text{UB}-\text{LB}_{DNN})/\text{UB}$ , and the wall clock time in seconds needed to compute that bound. The last six columns outline the numerical results of our algorithm to compute (18). In columns 8 to 10, we provide the valid lower bound returned by our algorithm, the relative gap, and the wall clock time needed to compute the lower bound. The next two columns list the total number of iterations and the total number of cuts added. In the last column, we report the relative gap closed by adding the RLT-type cuts to the DNN relaxation (17). This performance measure is computed as $100(\text{LB}_{DNN+CUTS}-\text{LB}_{DNN})/(\text{UB}-\text{LB}_{DNN}),$ where $\text{LB}_{DNN}$ refers to the lower bound (17) reported in column 5 and $\text{LB}_{DNN+CUTS}$ is the lower bound (18) reported in column 8 in each table. This metric indicates how much of the remaining gap to the upper bound is closed by the cuts.
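The two performance measures can be written as a short Python sketch. The function names and the example numbers are hypothetical and chosen only for illustration; the guard for a zero denominator is an assumption, since the text does not say how that case is handled.

```python
def relative_gap(ub, lb):
    """Relative gap in percent: 100 * (UB - LB) / UB."""
    return 100.0 * (ub - lb) / ub


def gap_closed(ub, lb_dnn, lb_cuts):
    """Share (in percent) of the gap UB - LB_DNN that is closed by the cuts."""
    if ub == lb_dnn:
        return 0.0  # assumption: no gap left, so nothing can be closed
    return 100.0 * (lb_cuts - lb_dnn) / (ub - lb_dnn)


# Made-up example values, not taken from any table.
ub, lb_dnn, lb_cuts = 200.0, 180.0, 190.0
gap_dnn = relative_gap(ub, lb_dnn)        # 10.0
gap_cuts = relative_gap(ub, lb_cuts)      # 5.0
closed = gap_closed(ub, lb_dnn, lb_cuts)  # 50.0
```

In this toy example, the cuts halve the relative gap, which corresponds to a relative gap closed of 50%.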

Tables 2, 3, 4 and 5 show that, especially for CP instances with $n\geq 30$ vertices and edge density 100%, only a few violated cuts were found. Hence, the relative improvement of the DNN relaxation by adding those cuts was only marginal. One can further observe that both the improvement of the relative gap and the relative gap closed are larger for smaller instances. For larger instances, adding further cuts, such as the RLT-type cut-set constraints for subsets $S$ of size 2 and larger, might further improve the DNN bounds.

Table 6 presents the results of our algorithm for the benchmark set SV introduced in [39]. To the best of our knowledge, there are no results on LAGN, LAGP, and RLT2 for this benchmark set. By far the best lower bound to date for the SV instance set was VS2. Our DNN relaxation bound without cuts outperforms VS2 for all instances with $m\geq 45$ edges, except for the instance with $n=12$ and $d=67\%$ . Both our relaxations yield a relative gap of less than 1%. The relative gap of VS2 ranges between 0 and 16.4%. The maximum runtime to compute the DNN bound for these instances is less than 5 seconds, whereas the computation time for the bound VS2 for $n=30$ and $d=100\%$ was reported to be 45 minutes. Computing the DNN bound with cuts is faster than the reported time to compute VS2 for all instances with more than 80 edges.

Tables 7, 8 and 9 read similarly to the tables for the CP and SV benchmark sets, but the results are averaged over all instances of the same size. Again, to the best of our knowledge, there are no detailed and complete results for LAGN and LAGP on the OP benchmark set.

Table 7 reports the results obtained for the benchmark set OPsym. The lower bound (18) with cuts outperforms VS2 for $n\geq 10$ , and RLT2 for $n\geq 8$ with the exception of $n=18$ , where the average relative gap is reported to be 33% for RLT2 and is 33.41% for the DNN bound with cuts. For $n=50$ , no bounds were reported. One can observe that the absolute improvement by adding RLT-type cuts to (17) for $n\geq 9$ is approximately 20.

Table 8 shows that for the benchmark set OPesym, adding the RLT-type cuts to (17) yields a substantial improvement of the relative gap. The DNN lower bound with cuts is stronger than VS2 but is clearly dominated by RLT1, which gives an average relative gap between $0.2\%$ and $1.7\%$ for instances with $n\leq 30$ .

The authors of [39] report that the relative gap of the VS1 lower bound is at most 0.2% for all instances of the class OPvsym. Although, on average, only a few violated cuts were found, the average relative gap closed is above 49% for all instances except those with $n\in\{6,7\}$ , where on average only 0.5 violated cuts were found. For the instances with $n\geq 11$ , the average relative gap closed is even above 80%.

The time limit of 3 hours was reached by all instances from OPesym and OPvsym of size $n=50$ and by almost all of those instances of size $n=30$ . The higher computational costs for those two classes of benchmark instances can be explained, among other things, by the high number of clusters $N_{\max}$ , cf. Section 5.2. The number of clusters has a direct effect on the computation time of Dykstra’s algorithm, which accounts for a substantial part of the overall computation time. The average number of clusters needed for the OPvsym and OPesym instances is 6.43 and 6.38, respectively, whereas the average over all other benchmark instances is 3.26. Note that for those two classes of instances, the added RLT-type constraints significantly improve the lower bounds. Additionally, as for the CP3 instances, one can observe a higher number of iterations until convergence of the algorithm compared to the other classes in our benchmark sets.

	This study				Guimarães et al. [20]
	DNN		DNN + CUTS		LAGN		LAGP
	gap (%)	time (s)	gap (%)	time (s)	gap (%)	time (s)	gap (%)	time (s)

Table 1: Comparison to averaged results on lower bounds for CP instances.

7 Conclusion

This paper provides two mixed-integer semidefinite programming formulations for the quadratic minimum spanning tree problem. Each of these formulations includes only one connectivity constraint, which is a linear matrix inequality based on the algebraic connectivity of trees. By exploiting the MISDP formulations, we derive a DNN relaxation for the QMSTP. We also derive the cut-set and RLT-type constraints as Chvátal-Gomory cuts of the MISDP by applying a CG procedure for mixed-integer conic programming. The RLT-type constraints are added to the DNN relaxation, resulting in a strengthened DNN relaxation. An iterative cutting-plane Peaceman-Rachford splitting method is designed to efficiently compute the DNN relaxation of the QMSTP with the RLT-type constraints.

The computational experiments on the benchmark instances from the literature demonstrate that our bounds significantly outperform existing bounds in both quality and computation time. While other approaches struggle to compute bounds for larger instances, we compute strong bounds in a short time.

Given these results, incorporating our new bounds in a branch-and-bound algorithm would be the obvious next step for further research. Another topic for future research would be to incorporate additional RLT-type cut-set constraints to further strengthen the DNN relaxation.

Instance				DNN			DNN + CUTS
$n$	$d$ (%)	$m$	UB	LB	gap (%)	time (s)	LB	gap (%)	time (s)	iterations	cuts	closed (%)

Table 2: Results for CP1.

Instance				DNN			DNN + CUTS
$n$	$d$ (%)	$m$	UB	LB	gap (%)	time (s)	LB	gap (%)	time (s)	iterations	cuts	closed (%)

Table 3: Results for CP2.

Instance				DNN			DNN + CUTS
$n$	$d$ (%)	$m$	UB	LB	gap (%)	time (s)	LB	gap (%)	time (s)	iterations	cuts	closed (%)

Table 4: Results for CP3.

Instance				DNN			DNN + CUTS
$n$	$d$ (%)	$m$	UB	LB	gap (%)	time (s)	LB	gap (%)	time (s)	iterations	cuts	closed (%)

Table 5: Results for CP4.

Instance				DNN			DNN + CUTS
$n$	$d$ (%)	$m$	UB	LB	gap (%)	time (s)	LB	gap (%)	time (s)	iterations	cuts	closed (%)

Table 6: Results for SV instances.

Instance			DNN			DNN + CUTS
$n$	$m$	UB	LB	gap (%)	time (s)	LB	gap (%)	time (s)	iterations	cuts	closed (%)

Table 7: Results for OPsym instances.

Instance			DNN			DNN + CUTS
$n$	$m$	UB	LB	gap (%)	time (s)	LB	gap (%)	time (s)	iterations	cuts	closed (%)

Table 8: Results for OPesym instances.

Instance			DNN			DNN + CUTS
$n$	$m$	UB	LB	gap (%)	time (s)	LB	gap (%)	time (s)	iterations	cuts	closed (%)

Table 9: Results for OPvsym instances.

References

[1] Arjang Assad and Weixuan Xu. The quadratic minimum spanning tree problem. Naval Res. Logist., 39(3):399–417, 1992.
[2] Heinz H. Bauschke and Valentin R. Koch. Projection methods: Swiss army knives for solving feasibility and best approximation problems with halfspaces. In Infinite products of operators and their applications, volume 636 of Contemp. Math., pages 1–40. AMS, Providence, RI, 2015.
[3] Jeff Bezanson, Alan Edelman, Stefan Karpinski, and Viral B Shah. Julia: A fresh approach to numerical computing. SIAM Review, 59(1):65–98, 2017.
[4] Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato, and Jonathan Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends® in Machine Learning, 3(1):1–122, 2011.
[5] James P. Boyle and Richard L. Dykstra. A method for finding projections onto the intersection of convex sets in Hilbert spaces. In Richard Dykstra, Tim Robertson, and Farroll T. Wright, editors, Advances in Order Restricted Statistical Inference, pages 28–47, New York, NY, 1986.
[6] M. Tolga Çezik and Garud N. Iyengar. Cuts for mixed 0-1 conic programming. Math. Program., 104:179–202, 2005.
[7] Tzu-Chiang Chiang, Chien-Hung Liu, and Yueh-Min Huang. A near-optimal multicast scheme for mobile ad hoc networks using a hybrid genetic algorithm. Expert Syst. Appl., 33(3):734–742, 2007.
[8] Wushow Chou and Aaron Kershenbaum. A unified algorithm for designing multidrop teleprocessing networks. IEEE Trans. Commun., 22:1762–1772, 1974.
[9] Laurent Condat. Fast projection onto the simplex and the $\ell_{1}$ ball. Math. Program., 158(1–2):575–585, 2016.
[10] Roberto Cordone and Gianluca Passeri. Heuristic and exact approaches to the quadratic minimum spanning tree problem. In Seventh Cologne Twente Workshop on Graphs and Comb. Opt., Gargano, Italy, 13-15 May, 2008, pages 52–55. University of Milan, 2008.
[11] Roberto Cordone and Gianluca Passeri. Solving the quadratic minimum spanning tree problem. Appl. Math. Comput., 218:11597–11612, 2012.
[12] Ante Ćustić, Ruonan Zhang, and Abraham P. Punnen. The quadratic minimum spanning tree problem and its variations. Discrete Optim., 27:73–87, 2018.
[13] D. Cvetković, M. Čangalović, and V. Kovačević-Vujčić. Semidefinite programming methods for the symmetric traveling salesman problem. In G. Cornuéjols, R.E. Burkard, and G.J. Woeginger, editors, Integer Programming and Combinatorial Optimization (IPCO 1999), volume 1610 of Lecture Notes in Comput. Sci. Springer, Berlin, Heidelberg, 1999.
[14] Frank de Meijer and Renata Sotirov. The Chvátal-Gomory procedure for integer SDPs with applications in combinatorial optimization. Math. Program., 2024.
[15] Frank de Meijer and Renata Sotirov. On integrality in semidefinite programming for discrete optimization. SIAM J. Optim., 34(1), 2024.
[16] Frank de Meijer, Renata Sotirov, Angelika Wiegele, and Shudian Zhao. Partitioning through projections: Strong SDP bounds for large graph partition problems. Comput. Oper. Res., 151:106088, 2023.
[17] Miroslav Fiedler. Algebraic connectivity of graphs. Czechoslov. Math. J., 23(2):298–305, 1973.
[18] Paul C. Gilmore. Optimal and suboptimal algorithms for the quadratic assignment problem. J Soc Ind Appl Math, 10(2):305–313, 1962.
[19] Robert Grone and Russell Merris. Ordering trees by algebraic connectivity. Graphs Combin., 6:229–237, 1990.
[20] Dilson A. Guimarães, Alexandre S. da Cunha, and Dilson L. Pereira. Semidefinite programming lower bounds and branch-and-bound algorithms for the quadratic minimum spanning tree problem. European J. of Oper. Res., 280(1):46–58, 2020.
[21] Bingsheng He, Feng Ma, and Xiaoming Yuan. Convergence study on the symmetric version of ADMM with larger step sizes. SIAM J. on Imaging Sci., 9(3):1467–1501, 2016.
[22] Hao Hu, Renata Sotirov, and Henry Wolkowicz. Facial reduction for symmetry reduced semidefinite and doubly nonnegative programs. Math. Program., 200(1):475–529, 2023.
[23] Qi Huangfu and J. A. Julian Hall. Parallelizing the dual revised simplex method. Math. Program. Comput., 10(1):119–142, 2018.
[24] Joseph B. Kruskal. On the shortest spanning subtree of a graph and the traveling salesman problem. In Proc. Amer. Math. Soc., 7, 1956.
[25] Eugene L. Lawler. The quadratic assignment problem. Manag. Sci., 9(4):586–599, 1963.
[26] Xinxin Li, Ting Kei Pong, Hao Sun, and Henry Wolkowicz. A strictly contractive Peaceman-Rachford splitting method for the doubly nonnegative relaxation of the minimum cut problem. Comput. Optim. Appl., 78(3):853–891, 2021.
[27] Pierre-Louis Lions and Bertrand Mercier. Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal., 16(6):964–979, 1979.
[28] Miles Lubin, Oscar Dowson, Joaquim Dias Garcia, Joey Huchette, Benoît Legat, and Juan Pablo Vielma. JuMP 1.0: Recent improvements to a modeling language for mathematical optimization. Math. Program. Comput., 15:581–589, 2023.
[29] Danilo E. Oliveira, Henry Wolkowicz, and Yangyang Xu. ADMM for the SDP relaxation of the QAP. Math. Program. Comput., 10:631–658, 2018.
[30] Temel Öncan and Abraham P. Punnen. The quadratic minimum spanning tree problem: A lower bounding procedure and an efficient search algorithm. Comput. Oper. Res., 37(10):1762–1773, 2010.
[31] Gintaras Palubeckis, Dalius Rubliauskas, and Aleksandras Targamadzė. Metaheuristic approaches for the quadratic minimum spanning tree problem. Inform. Tech. Control, 29:257–268, 2010.
[32] Donald W. Peaceman and Henry H. Rachford. The numerical solution of parabolic and elliptic differential equations. SIAM J. Appl. Math., 3(1):28–41, 1955.
[33] Dilson L. Pereira, Michel Gendreau, and Alexandre S. da Cunha. Branch-and-cut and branch-and-cut-and-price algorithms for the adjacent only quadratic minimum spanning tree problem. Networks, 65:367–379, 2015.
[34] Dilson L. Pereira, Michel Gendreau, and Alexandre S. da Cunha. Lower bounds and exact algorithms for the quadratic minimum spanning tree problem. Comput. Oper. Res., 63:149–160, 2015.
[35] Robert C. Prim. Shortest connection networks and some generalizations. The Bell Systems Technical Journal, 36(6):1389–1401, 1957.
[36] Abraham P. Punnen. Combinatorial optimization with multiplicative objective function. Int. J. Oper. Quant. Manag., 7:205–209, 2001.
[37] Borzou Rostami and Federico Malucelli. Lower bounds for the quadratic minimum spanning tree problem based on reduced cost computation. Comput. Oper. Res., 64:178–188, 2015.
[38] Hanif D. Sherali and Warren P. Adams. A reformulation-linearization technique for solving discrete and continuous nonconvex problems, volume 31. SSBM, 2013.
[39] Renata Sotirov and Zoe Verchére. The quadratic minimum spanning tree problem: Lower bounds via extended formulations. Vietnam J. Math., 2024.
[40] Henry Wolkowicz, Romesh Saigal, and Lieven Vandenberghe. Handbook of Semidefinite Programming: Theory, Algorithms, and Applications. Internat. Ser. Oper. Res. Management Sci. Springer, 2000.