On Implementing 2D Rectangular Assignment Algorithms

DAVID F. CROUSE, Member, IEEE
Naval Research Laboratory
Washington, DC, USA

This paper reviews research into solving the two-dimensional (2D) rectangular assignment problem and combines the best methods to implement a k-best 2D rectangular assignment algorithm with bounded runtime. This paper condenses numerous results, as an understanding of the "best" algorithm, a strong polynomial-time algorithm with a low polynomial order (a shortest augmenting path approach), would require assimilating information from many separate papers, each making a small contribution. 2D rectangular assignment Matlab code is provided.

Manuscript received December 21, 2014; revised August 28, 2015, December 30, 2015; released for publication March 14, 2016.
DOI. No. 10.1109/TAES.2016.140952.
Refereeing of this contribution was handled by S. Maskell.
This research is supported by the Office of Naval Research through the Naval Research Laboratory (NRL) Base Program.
Author's address: Naval Research Laboratory, Code 534, 4555 Overlook Ave., SW, Washington, DC 20375-5320. E-mail: (david.crouse@nrl.navy.mil).

I. INTRODUCTION

The two-dimensional (2D) assignment problem, also known as the linear sum assignment problem and the bipartite matching problem, arises in many contexts such as scheduling, handwriting recognition, and multitarget tracking, as discussed in [12]. Strong and weak polynomial time algorithms exist for solving the 2D assignment problem, unlike assignment problems involving more than two indices, such as the multiframe/S-dimensional assignment problem, which are NP complete [40, ch. 15.7].¹ The execution time of strong polynomial algorithms scales polynomially with the size of the problem; the execution time of weak polynomial algorithms scales polynomially with the size of the problem but also depends on values within the problem, in some cases allowing for very slow worst case execution times depending on the particular values chosen. This paper considers the task of obtaining the best (lowest cost) and the k-best solutions to the 2D rectangular assignment problem.² Only strong polynomial time algorithms are given serious consideration as they are best suited for critical applications that cannot tolerate rare, very slow run times given certain inputs.

¹ The relationship between the complexity classes P and NP is a major unsolved problem in theoretical computer science for which a million dollar prize is offered [17]. Many believe that P ≠ NP, though no proof exists to date.
² This paper is an extension of the second half of the conference publication [22].

Given an N_R × N_C matrix C of costs (which might be positive, negative, or zero) with N_C ≥ N_R, the 2D
rectangular assignment problem consists of choosing one
element in each row and at most one element in each
column such that the sum of the chosen elements is
minimized or maximized. For example, a hotel might want
to assign rooms to clients based upon the price that the
clients have bid to stay in each room. If there are more
rooms than clients, then the clients are the rows and some
rooms will remain unassigned; if there are more clients
than rooms, then the rooms are the rows and some clients
will not be able to stay in the hotel.
Expressed mathematically, the 2D rectangular assignment problem for minimization is

X^* = \arg\min_{x} \sum_{i=1}^{N_R}\sum_{j=1}^{N_C} c_{i,j}x_{i,j}   (1)

subject to

\sum_{j=1}^{N_C} x_{i,j} = 1 \quad \forall i   (Every row is assigned to a column.)   (2)

\sum_{i=1}^{N_R} x_{i,j} \leq 1 \quad \forall j   (Not every column is assigned to a row.)   (3)

x_{i,j} \in \{0,1\} \quad \forall x_{i,j}   (Equivalent to x_{i,j} \geq 0 \ \forall x_{i,j}.)   (4)
where min is replaced by max if one wishes to maximize the cost function, c_{i,j} is the element in row i and column j of the cost matrix C, and the matrix X is the set of all of the x_{i,j}. If x_{i,j} = 1, then the item in row i is assigned to the item in column j. Implicitly, the cost of not assigning a column to a row is zero.

In (4), it is indicated that the binary constraint on the x_{i,j} terms can be replaced by a nonnegativity constraint. This substitution is acceptable as it has been proven that such a substitution does not change the optimal value of the 2D optimization problem [11, ch. 7.3, 7.8]. The optimal X satisfying all the constraints will still be such that all elements are binary. The substitution of inequality constraints for integer constraints is possible on families of optimization problems that are considered unimodular [7, ch. 5.5.1]. This substitution turns the 2D assignment problem into a linear programming problem.

This paper focuses on solving the problem in (1) when the cost of the globally optimal solution is finite. The solution is derived for the minimization problem where all of the costs in the matrix are positive, because any minimization problem with finite negative costs can be transformed to have all positive costs and any maximization problem can be transformed into an equivalent minimization problem. Specifically, if one wishes to perform minimization on a matrix C̃ (where an element in the ith row and jth column is c̃_{i,j}) that might have negative elements, one can transform the matrix into a usable cost function C as

C = \tilde{C} - \min_{i,j} \tilde{c}_{i,j}.   (5)

The requirement that the globally optimal solution have a finite cost ensures that the term \min_{i,j} \tilde{c}_{i,j} > -\infty. However, the transformation in (5) does not preclude certain c̃_{i,j} values being set to ∞ to forbid certain assignments. Detecting whether a problem with positive, infinite costs is feasible (whether any finite-cost solution exists) will be subsequently discussed and is essential to implementing an efficient algorithm for finding the k-best assignments. Similarly, the problem of maximizing the cost of assignments on a matrix C̃ can be transformed into a problem of minimizing the cost of assignments with the matrix

C = -\tilde{C} + \max_{i,j} \tilde{c}_{i,j},   (6)

under the assumption that none of the elements of C̃ is ∞, though elements of C̃ are allowed to be −∞ to forbid certain assignments.
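As a concrete illustration, the following Matlab fragment applies the transformations of (5) and (6) to small example matrices before minimization. It is only a sketch of the idea; the variable names and example data are illustrative, though the Appendix code performs the same kind of shift (using a variable called CDelta) and adds the offset back when the gain is computed.

% Sketch: preparing a cost matrix for a minimization-only solver.
% Ctilde may contain negative entries; Inf (for minimization) or -Inf
% (for maximization) entries mark forbidden assignments.
Ctilde=[-3, 2, Inf;
         1,-5, 4];
% Minimization, (5): shift so that all finite costs are nonnegative.
% Inf entries remain Inf and still forbid those assignments.
CMin=Ctilde-min(min(Ctilde));

% Maximization, (6): negate and shift; -Inf (forbidden) becomes Inf.
CtildeMax=[3,-2,-Inf;
           1, 5,  4];
CMax=-CtildeMax+max(max(CtildeMax));
% The shift changes the total cost by a known constant, so the optimal
% assignment itself is unchanged; the constant is added back afterward.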
Section II describes how the 2D rectangular assignment algorithm can be solved in polynomial time with an upper bound on the total number of instructions necessary to run to completion and how such an algorithm can be implemented to quickly determine whether or not a cost matrix presents a feasible solution. Simulation examples demonstrating the runtime of the algorithm are also given. Section III then describes how a k-best 2D assignment algorithm can be implemented using the 2D rectangular assignment algorithm of Section II, where simulation examples are also presented. The results are summarized in Section IV. To facilitate the understanding and use of the algorithms described here, the Matlab code implementing the 2D rectangular shortest augmenting path algorithm is given in the Appendix.

II. A 2D RECTANGULAR ASSIGNMENT ALGORITHM WITH FEASIBILITY DETECTION

A. Problem Formulation and Background

One of the most frequently used techniques for solving the linear sum assignment problem is the auction algorithm, described in its basic form in [4, 11, ch. 7.8]. The basic form of the auction algorithm assumes that N_R = N_C, that is, that all items represented by rows must be assigned to all items represented by columns, which limits the scope of problems that can be solved by the algorithm. Generalizations of the basic auction algorithm are discussed in [5, 8, 9, 42]. However, the auction algorithm is not always the best solution. All formulations of the auction algorithm are weakly polynomial time algorithms. That means that the worst case computational complexity of the algorithms depends not only on the size of the problem (in this case on N_R and N_C), but also on the relative values of the elements of C. Given an appropriately degenerate C matrix, if one wants to be guaranteed the globally optimal solution, then an upper bound on the execution time of the algorithm can become arbitrarily long. Versions of the auction algorithm utilizing ε-scaling have the lowest theoretical bound, which does not always translate into fast execution times in practice [6, ch. 7.1.4], [10, ch. 5.4], as demonstrated in Subsection II-D.³

³ It is not unusual for an algorithm with a low worst case bound to have poor average performance. For example, when considering linear programming, the popular simplex algorithm has an exponential worst case complexity [11, ch. 3.7], whereas the ellipsoid method is weakly polynomial in complexity [11, ch. 3.7]. However, the simplex method is generally much faster than the ellipsoid method [11, ch. 3.7].

On the other hand, a number of other 2D assignment algorithms exist. Many of these have strong polynomial complexity; their worst case execution time scales polynomially, dependent only on the dimensions N_R and N_C and not on the actual values of the elements of C. The first such algorithm is often referred to as the "Hungarian algorithm"⁴ and is described in [14, ch. 4.2], among many other places. When considering a square cost matrix, N_R = N_C = N, the Hungarian algorithm has a complexity of O(N⁴). Many of the most efficient 2D assignment algorithms tend to be variants of the Hungarian algorithm.

⁴ The algorithm was first named the "Hungarian" algorithm by Kuhn [34], who based the approach on work done by Jenő Egerváry that was published in Hungarian.
For example, the algorithm of Jonker and Volgenant [32], which unbeknownst to many can be considered a particularly efficient variant of the Hungarian algorithm [14, ch. 4.4], has a complexity of O(N³) [32]. A variant of the Jonker-Volgenant algorithm that has been generalized to rectangular cost matrices is often called the JVC algorithm, with the C standing for Castañon, who provided a generalized implementation of the algorithm to work with rectangular matrices and replaced the original initialization step with a few iterations of the auction algorithm, which is faster [27].⁵

⁵ In other words, the shortest augmenting path algorithm of Jonker and Volgenant was kick-started by a form of the auction algorithm. Jonker and Volgenant's algorithm does not actually have to be initialized, though that can speed it up.

A number of studies have been conducted comparing algorithms for 2D assignment in numerous applications, such as multiple target tracking. In [41], three 2D assignment algorithms are compared considering their performance in multiframe optimization, namely, the auction algorithm, the RELAX II algorithm and the generalized signature method, with the auction algorithm performing the best. In [27], the JVC algorithm is compared with three variants of the Munkres algorithm [38],⁶ with the JVC algorithm performing the best. In [15], the JVC algorithm is compared with variants of the auction algorithm, with a scaled forward-reverse auction algorithm performing the best. Other studies have concluded that the JVC algorithm is often a better alternative to the auction algorithm. In [33], the Munkres, JVC, deepest hole,⁷ and auction algorithms are compared based on a measurement assignment problem for multiple target tracking, where the JVC algorithm is found to be the best overall, with the auction algorithm being faster on sparse problems. In the assignment problem, "sparse" means that many elements of C are not finite (certain rows cannot be assigned to certain columns).

⁶ The Munkres algorithm is an old O(N⁴) version of the Hungarian algorithm.
⁷ The deepest hole algorithm is suboptimal. Given the speed of optimal 2D assignment algorithms, there is seldom need to use a suboptimal approach now.

However, it is noted that the auction algorithm in the simulations in [33] does not always produce optimal results. Such suboptimal solutions can arise when the ε parameter in the auction algorithm is not small enough. The auction algorithm is a type of relaxation dual optimization technique [7, ch. 6.3.4]. When considering N_R = N_C = N, the accuracy of the cost function is within Nε of the optimal value in both the forward and reverse [9] versions of the algorithm. Though papers on the auction algorithm generally consider setting ε in view of integer values of c_{i,j}, nothing in the proof [4] of that accuracy bound requires the costs to be integers. Thus, to ensure convergence to an optimal solution, Nε should be less than the minimum nonzero difference between all pairs of elements in C. However, as ε decreases, the worst case computational complexity of the auction algorithm increases [4].

In [36], it is noted that the JVC algorithm had been implemented in a subsystem used in a (at that time) next generation helicopter for the U.S. Army. The JVC algorithm is compared with an unpublished, proprietary, heuristic algorithm called "competition" and is shown to be faster. However, the competition algorithm is shown to have fewer assignment errors. Note, however, that Jonker and Volgenant's algorithm [32] is guaranteed to converge to the globally optimal solution at each time-step. The source of the suboptimal convergence of the JVC algorithm used in [36] probably comes from the fact that the auction algorithm was used in an initialization step. The auction algorithm does not strictly guarantee complementary slackness (which shall subsequently be defined), as the Jonker-Volgenant algorithm requires, but only does so within a factor of ε. Thus, the accelerated initialization provided with the code in [27], which uses a fixed, heuristic value of ε for all problems, is probably the cause of the suboptimal results. Such problems will be avoided in the 2D assignment algorithm presented in this section.

In [35], the JVC algorithm, the auction algorithm, the Munkres algorithm, and a suboptimal greedy approach are compared in a tracking scenario. The Munkres algorithm is found to be significantly slower than the other methods, agreeing with the study done in [44] that found the auction algorithm to be faster than the Munkres algorithm. The JVC algorithm is found to be fast enough to negate any speed benefit from using a suboptimal greedy technique. In [28], it is also concluded that the speed of the JVC algorithm negated the need for greedy approximations.

In [35], the JVC algorithm is shown to be faster than the auction algorithm in all instances when implemented in C, and on dense problems (mostly finite elements in C) when implemented in Matlab. At first, this seems to contradict the results of [33], which deemed the auction algorithm superior on sparse problems. However, the focus of [33] was on sparse problems for target tracking applications, which at the time were considered difficult, because one did not always include missed detection hypotheses in the hypothesis matrix C. In such an instance, if the only thing that two targets could be assigned to was a single, common measurement, no feasible assignment would be possible and 2D assignment algorithms would fail.

Eight different assignment algorithms are compared with N_R = N_C in [14, ch. 4.10]. Variants of the Jonker-Volgenant algorithm are shown to perform the best on the majority of dense problems, and are competitive on sparse problems (when many elements of C are not finite, representing forbidden assignments). However, the Jonker-Volgenant algorithm variants are beaten on the most difficult problem by the cost scaling implementation [30] of the push-relabel algorithm [18, ch. 26.4], which performs poorly on one of the sparse problems. The cost scaling algorithm of [30] is an ε-scaling technique like that used in the auction algorithm. This paper avoids such algorithms as the choice of ε is
will be formulated. The function

g(\mathbf{u},\mathbf{v}) = \min_{\mathbf{x}\geq 0}\left[\sum_{i=1}^{N_R}\sum_{j=1}^{N_C} c_{i,j}x_{i,j} + \sum_{i=1}^{N_R} u_i\left(1-\sum_{j=1}^{N_C}x_{i,j}\right) + \sum_{j=1}^{N_C} v_j\left(1-\sum_{i=1}^{N_R}x_{i,j}\right)\right]   (14a)

= \min_{\mathbf{x}\geq 0}\ \mathbf{c}^T\mathbf{x} + \mathbf{u}^T(\mathbf{1}-A\mathbf{x}) + \mathbf{v}^T(\mathbf{1}-B\mathbf{x})   (14b)

= \min_{\mathbf{x}\geq 0}\ \left(\mathbf{c}-A^T\mathbf{u}-B^T\mathbf{v}\right)^T\mathbf{x} + \mathbf{u}^T\mathbf{1} + \mathbf{v}^T\mathbf{1}   (14c)

is known as the dual cost function. The u and v variables are known as Lagrange multipliers or dual variables and their use in eliminating constraints in optimization problems is known as Lagrangian relaxation [7, ch. 3]. There is one Lagrange multiplier variable per constraint that has been eliminated, except for the nonnegativity constraint on x, which will not be relaxed. Due to the extra degrees of freedom introduced by the dual variables, under the constraint that v ≤ 0, it can be shown that g(\mathbf{u},\mathbf{v}) \leq \mathbf{c}^T\mathbf{x}^* [11, ch. 4.1]. In other words, the dual cost function forms a lower bound on the value of the primal cost function at the globally optimal solution. The dual optimization problem seeks to find the values of u and v that maximize this lower bound.

To formulate the dual optimization problem note that

\left(\mathbf{c}-A^T\mathbf{u}-B^T\mathbf{v}\right)^T\mathbf{x} = \sum_{i=1}^{N_R}\sum_{j=1}^{N_C}\left(c_{i,j}-u_i-v_j\right)x_{i,j}   (15)

Given that the binary constraint on x has been relaxed to a nonnegativity constraint, if c_{i,j} − u_i − v_j < 0, then x_{i,j} can be chosen arbitrarily large so that g(u, v) is arbitrarily small. Since the dual optimization problem concerns maximizing the dual cost function, it makes sense to only consider solutions greater than −∞ by introducing the constraint that c_{i,j} − u_i − v_j ≥ 0, or expressed in vector form, that \mathbf{c}-A^T\mathbf{u}-B^T\mathbf{v} \geq 0. This constraint is known as a complementary slackness condition [11, ch. 4.3]. However, if c_{i,j} − u_i − v_j > 0, then the value of x_{i,j} that minimizes (15) is 0, implying that the globally minimum value of (15) is always zero if c_{i,j} − u_i − v_j ≥ 0. Put differently, given the complementary slackness condition, the following equation is true,

\min_{\mathbf{x}:\,\mathbf{x}\geq 0}\left(\mathbf{c}-A^T\mathbf{u}-B^T\mathbf{v}\right)^T\mathbf{x} = 0.   (16)

Substituting this into the dual cost function of (14c), the dual optimization problem is

\{\mathbf{u}^*,\mathbf{v}^*\} = \arg\max_{\mathbf{u},\mathbf{v}}\ \mathbf{u}^T\mathbf{1} + \mathbf{v}^T\mathbf{1}   (17)

subject to \mathbf{v} \leq 0   (18)

\mathbf{c}-A^T\mathbf{u}-B^T\mathbf{v} \geq 0   (Complementary Slackness)   (19)

Because the binary constraint on x in the primal problem was replaced by a vector nonnegativity (inequality) constraint, the primal problem is a linear programming problem. For linear programming problems, the strong duality theorem says that the duality gap, that is, the expression \mathbf{c}^T\mathbf{x}^* - g(\mathbf{u}^*,\mathbf{v}^*), is zero [11, ch. 4.3]. Thus, solving the dual problem provides the value of \mathbf{c}^T\mathbf{x}^*, but does not directly provide the value of \mathbf{x}^*. Given (16), one can say that x_{i,j} = 0 if c_{i,j} − u_i − v_j > 0. However, this does not explicitly say which values of x should be one. It could be possible that multiple values of x, not all of which satisfy the constraints of the original problem, yield the same cost.

Consider, first, the case where N_R = N_C, meaning that the inequality constraint in (3) becomes an equality constraint. In this case, given the optimal dual solution, valid solutions for x are those satisfying the equality constraints. The satisfaction of the constraints means that the terms in (14b) involving dual variables disappear so that the dual and primal costs are equal, as expected. On the other hand, if N_R ≤ N_C, then by the strong duality theorem, it is known that the dual and primal cost functions should be equal at the globally optimal values \mathbf{u}^*, \mathbf{v}^*, and \mathbf{x}^*. To eliminate the terms in (14b) involving v, it is necessary that

\mathbf{v}^T\left(\mathbf{1}-B\mathbf{x}\right) = 0   (20)

Expressed in scalar form, this says that

v_j\left(1-\sum_{i=1}^{N_R}x_{i,j}\right) = 0 \quad \forall j   (21)

This requirement is also known as another complementary slackness condition [11, ch. 4.3], [7, ch. 3.3]. The complementary slackness theorem [11, ch. 4.3] springs logically from this. The complementary slackness theorem says that given vectors u, v, and x such that (16) and (20) hold, then u, v, and x are optimal solutions to both the primal as well as the dual optimization problems.
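To make these conditions concrete, the following Matlab sketch numerically checks dual feasibility, (18) and (19), and the complementary slackness conditions associated with (16) and (20) for a candidate solution. The function name and interface are illustrative and are not part of the paper's code; col4row(i) holds the column assigned to row i, u holds the row duals, and v holds the column duals, as in the text.

function isOptimal=checkSlackness(C,col4row,u,v,tol)
%Returns true if the candidate assignment and dual variables satisfy the
%optimality conditions to within the tolerance tol. Illustrative only.
[numRow,numCol]=size(C);
slack=C-u(:)*ones(1,numCol)-ones(numRow,1)*v(:).';%c_{i,j}-u_i-v_j
cond18=all(v(:)<=tol);%(18): column duals are nonpositive.
cond19=all(slack(:)>=-tol);%(19): all reduced costs are nonnegative.
%Assigned entries must have zero reduced cost, so that (16) holds with
%the ones of x placed at the assigned (row, column) pairs.
idx=sub2ind([numRow,numCol],(1:numRow).',col4row(:));
cond16=all(abs(slack(idx))<=tol);
%(20)/(21): columns left unassigned must have a zero dual variable.
assignedCols=false(numCol,1);
assignedCols(col4row)=true;
cond20=all(abs(v(~assignedCols))<=tol);
isOptimal=cond18&&cond19&&cond16&&cond20;
end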
The complementary slackness theorem plays an important role in algorithms such as the Jonker-Volgenant algorithm that use shortest augmenting path techniques for solving the assignment problem, as well as in more general augmenting path methods, such as the Ford-Fulkerson and Edmonds-Karp algorithms [18, ch. 26.2], for use in general network optimization problems. Such algorithms solve the assignment problem by sequentially solving a series of assignment problems with N_R = 1, N_R = 2, et cetera. The complementary slackness theorem is used to verify that the globally optimal solution to each subproblem is obtained. Most presentations of shortest augmenting path algorithms, such as in [26, 46] and [14, ch. 4.4], relate them to minimum cost network flow problems from which the shortest path algorithm can be derived. Others, such as [24], simply provide the algorithm and then show that the complementary slackness conditions hold after each step. Due to the complexity of the minimum cost network flow problem, a direct derivation along those lines will be avoided.
shortest augmenting path in Fig. 1(b) starting from node t2 is highlighted, and the new assignment, which is given by the parts of the path going from rows to columns, is illustrated.

In this example, it was decided that the shortest augmenting path starting at row t2 would be found. However, the shortest augmenting path among all possible augmenting paths is t1 → e5, which has a cost of 3. If one were to augment using that path, one would get new assignments of t3 → e1, t4 → e2, and t1 → e5. Because that path did not overlap with any already existing assignments, those assignments would remain unchanged. To solve the complete assignment problem, however, it does not matter whether a shortest augmenting path starting from t1 or t2 is found, because the algorithm provides the minimum cost assignment for the rows that have been chosen to be assigned [26], [14, ch. 4.4.2], as shall be elucidated when the dual update step is discussed. An important aspect of this realization is that if an augmenting path to add a given unassigned target to the partial assignment cannot be found, then the assignment problem is infeasible.⁹ The rapid identification of infeasible assignment problems is useful in efficiently implementing techniques that find the k-best assignments and is often overlooked in the literature.

⁹ When considering infeasible assignments as having infinite cost, an augmenting path can always be found, but it might have infinite cost. Since adding more rows (targets) to the problem can only increase the cost of the assignment problem, the total cost will always be infinite, so the algorithm can be stopped once it finds a single infeasible (infinite cost) path.

The simplest way to show that augmenting with the shortest augmenting path algorithm produces an optimal assignment for the subproblem involving only the rows that one has chosen to add to the problem is to present the algorithm with its dual update step and show that the result satisfies the complementary slackness conditions for the dual problem that were previously mentioned. The procedure for finding the shortest augmenting path (given a partial assignment) that is commonly used in the assignment problem is the Dijkstra algorithm [25], [18, ch. 24.3], [11, ch. 7.9]. A particularly clear description of the Dijkstra algorithm is given in [11, ch. 7.9]. A modified version of the algorithm that properly utilizes and updates the dual variables for calculating the reduced costs is presented in [14, ch. 4.4], [32] and is given as steps 2–4 of a complete rectangular version of the 2D Jonker-Volgenant algorithm as follows.

1) Initialize. Initialize the N_R × 1 vector u, the dual cost vector for the rows, and the N_C × 1 vector v, the dual cost vector for the columns, to all zeros. The scalar value curRow will be the index of the current unassigned row that is to be assigned. Set curRow = 0 to select the first row to assign (indexation is assumed to start from 0). The N_R × 1 vector col4row and the N_C × 1 vector row4col will hold the values of the assigned indices. As nothing has been assigned yet, set all of their elements to −1. The set AC will be the set of all columns. Allocate N_C × 1 space for a vector called path, which will specify which row is associated with which column in the minimum cost path.

2) Prepare for Augmentation. Set all elements of the N_C × 1 vector shortestPathCosts equal to infinity. Set SR and SC equal to the empty sets. They will hold the row and column vertices, respectively, that have been reached by a shortest path emanating from row k. Set sink = −1, minVal = 0, and the current row to be assigned is set to i = curRow. The variable minVal will ultimately hold the cost of the shortest augmenting path that is found, and sink will ultimately be the index of the final column in the alternating path. The variable j will select the current column.

3) Find the Shortest Augmenting Path.
while sink = −1 do
    SR ← SR ∪ {i};
    for all j ∈ AC\SC such that minVal + C[i,j] − u[i] − v[j] < shortestPathCosts[j] do
        path[j] ← i;
        shortestPathCosts[j] ← minVal + C[i,j] − u[i] − v[j];
    end for
    j ← arg min (shortestPathCosts[h] for h ∈ AC\SC);   (If shortestPathCosts[j] = ∞, then infeasible!)
    SC ← SC ∪ {j};
    minVal ← shortestPathCosts[j];
    if row4col[j] = −1 then
        sink ← j;
    else
        i ← row4col[j];
    end if
end while

The ∪ operation means that the two sets are being merged. Thus, SR ← SR ∪ {i} means that row i is being added to the collection of rows that have been visited. A backslash means that the right-hand quantity is subtracted from the set. Vectors are indexed using brackets. The shortest augmenting path ends at column sink. The first row in the path is thus r = path[sink]. The next column is given by col4row[r]. The following row is then path[col4row[r]], and so on. The path is traced until row curRow, which began the path, is reached. After updating the dual variables, the assignments from the shortest augmenting path step must be saved for the next iteration.

4) Update the Dual Variables.
u[curRow] ← u[curRow] + minVal;
for all i ∈ SR\{curRow} do
    u[i] ← u[i] + minVal − shortestPathCosts[col4row[i]];
end for
for all j ∈ SC do
    v[j] ← v[j] − minVal + shortestPathCosts[j];
end for

5) Augment the Previous Solution.
j ← sink;
do
    i ← path[j];
TABLE I
The Median and Worst Observed Execution Times of the Modified Jonker-Volgenant Algorithm without Initialization when Run on Random Matrices of the Sizes Indicated, Implemented in Matlab, C, and C++. In Matlab and C, the Algorithm was Implemented Either so that the Innermost Loop Scanned the Data Across Rows or Across Columns. The C++ Implementation, Which Forms the Basis of the First Step of the k-Best Rectangular 2D Assignment Algorithm of Section III, Scanned only Across Rows. The Execution Times are Taken from 1000 Monte Carlo Runs. Note that the Execution Times for the 500 × 1000 Problem are Always Less Than Those for the 500 × 500 Problem and that the Median Execution Times of the Row-Wise Algorithms are Always Less Than Those of the Corresponding Column-Wise Algorithms

Problem Size  | Matlab, Rows (Median / Worst) | Matlab, Columns (Median / Worst) | C, Rows (Median / Worst) | C, Columns (Median / Worst) | C++, Rows (Median / Worst)
100 × 100     | 22.1 ms / 63.4 ms             | 22.4 ms / 68.4 ms                | 0.518 ms / 1.34 ms       | 0.553 ms / 1.28 ms          | 0.723 ms / 2.20 ms
200 × 200     | 65.9 ms / 139 ms              | 71.5 ms / 145 ms                 | 2.59 ms / 5.07 ms        | 3.08 ms / 9.18 ms           | 3.09 ms / 6.12 ms
500 × 500     | 376 ms / 697 ms               | 530 ms / 892 ms                  | 20.9 ms / 54.2 ms        | 46.4 ms / 163 ms            | 26.4 ms / 62.3 ms
500 × 1000    | 165 ms / 288 ms               | 137 ms / 206 ms                  | 14.7 ms / 26.2 ms        | 23.0 ms / 57.5 ms           | 18.1 ms / 32.2 ms
3000 × 3000   | 20.6 s / 25.4 s               | 49.2 s / 63.3 s                  | 1.90 s / 3.56 s          | 9.81 s / 1351 s             | 2.16 s / 3.73 s
The C implementations of the algorithm mirror the Matlab implementations, except care is taken to allocate all memory outside of the loops. The reason two implementations are present each in C and in Matlab is because modern processors, such as the Intel Xeon E5645 [19], on which the simulations are run, contain sophisticated prefetch algorithms that try to fill the processor's cache with data (prefetch data) that it anticipates the program will need. However, if the program requests data from widely-separated places in memory, then the prefetch algorithms will perform poorly, leading to a large number of cache misses and slower execution time. The implementations of the 2D assignment algorithm in Matlab and C thus differ in the order in which the rows or columns of the assignment matrix are scanned.

The assignment matrices given to the different algorithms are generated in Matlab. The implementations in C and C++ are called from Matlab. Matlab stores matrices in memory with column-major ordering.¹¹ Consequently, the Jonker-Volgenant algorithm implemented such that the shortest path portion scans across rows rather than columns (unlike the implementation described in Section II-B) would be expected to be faster. The C++ implementation of the 2D rectangular algorithm is shared by the k-best 2D rectangular assignment algorithm of Section III and only scans across rows in the innermost loop of the algorithm.

One benefit of the simple implementation of the Jonker-Volgenant algorithm without initialization is that the worst case execution time can be estimated without running many Monte Carlo runs. Ignoring the influence of background processes running on a computer, the execution time of the algorithm varies only depending upon how long it takes to find the shortest augmenting path each time. Thus, by modifying the termination condition to force the algorithm to take the maximum number of loops, modifying the elements in the loops to force the if-statement to always be true, and adjusting the code to make sure that no invalid memory locations are accessed, one can estimate the worst case execution time of the algorithm. Whereas such worst case execution times were given in the conference work preceding this paper [22], they are omitted here as a truly firm bound is very processor specific, requiring one to force as many false branch predictions¹² and cache misses on modern processors as possible for the bound to be valid.

¹² When an "if" statement arises in the code, modern processors might speculatively begin executing the branch that they predict will be taken before the condition has been evaluated; a false prediction forces the speculative work to be discarded, which slows execution.

The algorithms are run on random assignment matrices of varying sizes, as shown in Table I. All of the algorithms modify copies of the input matrices using (5) and (6) to guarantee that the matrices are appropriate for the algorithm. Only minimization is performed in the simulations. One thousand Monte Carlo runs are performed on a computer made by the Xi Corporation running Windows 7 with two Intel Xeon E5645 processors and 12 gigabytes (GB) of random access memory (RAM) in Matlab 2013b. In order to speed up the simulations, the parallel processing toolbox is used to run Monte Carlo runs across 12 processor cores simultaneously. (The computer has 24 cores in total).

As can be seen, even for hundreds of targets, the execution time of the algorithms implemented in C is on the order of milliseconds. The row-wise implementations of the algorithms, which allow for fewer cache misses, are faster than the column-wise implementations, with the difference increasing as a function of the dimensionality of the cost matrix. The Matlab implementations, with Matlab being an interpreted language rather than a compiled programming language, are the slowest of all.
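The effect of Matlab's column-major storage can be seen with a generic timing sketch like the one below; it is illustrative only and is not the benchmark used for Table I. Summing a large matrix one column at a time touches contiguous memory, whereas summing it one row at a time strides through memory and defeats the prefetcher.

% Illustrative only: contiguous (column-wise) versus strided (row-wise)
% access in Matlab's column-major storage.
N=3000;
A=rand(N);

tic;
s=0;
for j=1:N
    s=s+sum(A(:,j));%Each column is contiguous in memory.
end
tContiguous=toc;

tic;
s=0;
for i=1:N
    s=s+sum(A(i,:));%Each row is strided across the columns.
end
tStrided=toc;

fprintf('Column sweep: %.3f s, row sweep: %.3f s\n',tContiguous,tStrided);

In the terminology used above, scanning "across rows" for a fixed column corresponds to the contiguous case here, which appears to be why the row-wise scan order is the faster one in Matlab.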
The execution time for the rectangular assignment problem is less than that of the square assignment problem with the same number of rows. An explanation for this is that the extra columns decreased the likelihood that two rows would contest the same column.

The 3000 × 3000 matrix example is chosen to demonstrate how the need for parallelization has changed with advances in hardware and algorithms over the years. In 1991, a parallelized shortest augmenting path algorithm that ran over 14 processors (and was implemented in such a manner that the run time depended on the range of values of the costs) took 811 s to run in the worst case on a random 3000 × 3000 matrix [2]. That is about 427 times slower than the median run time of this algorithm. However, for such large problems, the quality of the assignment algorithm is more important than the hardware on which the algorithm is run. For example, if one were to solve the 3000 × 3000 assignment problem by evaluating all combinations via brute force, one would need to consider 3000! ≈ 4 × 10^9130 different possible assignments, which is not computationally feasible on modern hardware.

In the event that the algorithm given here is not fast enough to solve a particularly massive optimization problem within a desired time interval, then modifications for parallelization considered in [2, 48] can be used. Additional references for parallelization techniques applied to shortest augmenting path algorithms are given in [14, ch. 4.11.3]. One of the main arguments for using the auction algorithm over the shortest augmenting path algorithms is its ability to be more easily parallelized. Given well-structured problems, a well-implemented, highly parallelized auction algorithm can be faster than a shortest augmenting path algorithm for assignment in the average case [51]. However, the modified Jonker-Volgenant algorithm used in this paper is good on generic, unstructured problems.

D. Comparison to Auction Algorithms

To better understand why this paper focusses on shortest augmenting path 2D assignment algorithms rather than using variants of the auction algorithm, which are significantly more common in the literature, this subsection looks at specific scenarios where the auction algorithm can perform poorly. Here, all of the auction algorithm variants are implemented in Matlab. The variants considered are:

1) the forward auction algorithm using ε-scaling described in [6, ch. 7.1] using the open-source implementation for square matrices given in [3]. Like most versions of the auction algorithm in the literature, this is only suited for integer-valued cost matrices C. The ε-scaling causes this variant of the auction algorithm to have a particularly low computational complexity. The default heuristic method of initially setting the ε parameter used in the code is

\epsilon = \frac{\max_{i,j} c_{i,j}}{N+1}   (24)

if the cost matrix C is an N × N matrix.

2) the basic forward auction algorithm for square matrices with a fixed ε set to guarantee an optimal solution. This is described in [6, ch. 7.2], among other sources. An optimal solution is guaranteed by setting the ε as

\epsilon = \frac{\Delta_{\min}}{1.01 N}   (25)

where \Delta_{\min} is the smallest, positive nonzero difference between entries in C. The 1.01 term could be any value larger than 1 to guarantee that the algorithm converges to the globally optimal solution.
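The two choices of ε can be computed directly. The sketch below is illustrative only (the variable names and the example matrix are not from the paper), and it assumes that C holds at least two distinct finite values so that \Delta_{\min} exists.

% Illustrative: the heuristic epsilon of (24) and the optimality-
% guaranteeing epsilon of (25) for an integer-valued square cost matrix.
N=64;
C=randi(1000,N,N);%Example data only.

epsHeuristic=max(C(:))/(N+1);%(24)

vals=unique(C(isfinite(C)));%Distinct finite costs, sorted.
gaps=diff(vals);
deltaMin=min(gaps(gaps>0));%Smallest positive difference between entries.
epsOptimal=deltaMin/(1.01*N);%(25); any factor >1 in place of 1.01 works.

For integer costs the smallest possible gap is 1, in which case (25) reduces to ε = 1/(1.01N), just below the familiar 1/N threshold for integer-valued problems.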
Table II shows the runtimes of the different algorithms when run on random integer cost matrices with up to 9
TABLE III
The Execution Times of the Auction Algorithms and the Modified Jonker-Volgenant Algorithm when Run in Matlab 2015b on Magic Matrices or Matrices Full of Ones of the Indicated Dimensions (32 × 32, 64 × 64, ...). The Forward Auction Algorithm Performs Significantly Worse than with Random Matrices in Table II as the Size Increases. It Can be Seen that the Values of the Matrices have an Effect on Runtime. This Changes with Sparsity, as Table IV Shows

Problem Size  | ε-Scaled Auction | Forward Auction | Modified Jonker-Volgenant
32 (Magic)    | 11.3 ms          | 235 ms          | 1.87 ms
32 (Ones)     | 4.49 ms          | 8.17 ms         | 2.62 ms
64 (Magic)    | 33.8 ms          | 4.64 s          | 6.61 ms
64 (Ones)     | 9.04 ms          | 34.3 ms         | 10.0 ms
128 (Magic)   | 119 ms           | 93.6 s          | 31.8 ms
128 (Ones)    | 115 ms           | 147 ms          | 46.6 ms

TABLE IV
The Execution Times of the Auction Algorithms and the Modified Jonker-Volgenant Algorithm when Run in Matlab 2015b on Magic Matrices of the Indicated Dimensions where all Odd Numbered Entries were Marked as Impermissible Assignments. The Results are Compared with Matrices of all Ones with the Same Impermissible Entries. Compare with Table III to see the Effects of Sparsity on the Problem

Problem Size  | ε-Scaled Auction | Forward Auction | Modified Jonker-Volgenant
32 (Magic)    | 14.3 ms          | 2.60 ms         | 1.34 ms
32 (Ones)     | 2.92 ms          | 4.72 ms         | 1.80 ms
64 (Magic)    | 42.2 ms          | 9.45 ms         | 4.17 ms
64 (Ones)     | 5.30 ms          | 17.58 ms        | 6.20 ms
128 (Magic)   | 155 ms           | 38.5 ms         | 18.0 ms
128 (Ones)    | 64.27 ms         | 76.5 ms         | 28.7 ms
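The structured test matrices described in the captions of Tables III and IV can be generated with a few lines of Matlab. The sketch below is one plausible construction; the captions do not state how impermissible entries were encoded, so marking them with Inf, as in Section I, is an assumption here.

% Illustrative construction of the structured test matrices.
N=64;
CMagic=magic(N);%A magic-square cost matrix (Table III).
COnes=ones(N);%A matrix of all ones (Table III).

%Table IV: forbid every odd-valued entry of the magic matrix and the
%same positions in the matrix of ones.
forbidden=(mod(CMagic,2)==1);
CMagicSparse=CMagic;
CMagicSparse(forbidden)=Inf;
COnesSparse=COnes;
COnesSparse(forbidden)=Inf;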
TABLE V
The Median and Worst Case Execution Times to Generate the Given Number of Hypotheses Shown for Varying Problem Sizes with 1000 Monte Carlo
Runs in Matlab and C++. The Median Speed of the C++ Implementation is up to 379 Times Faster than the Matlab Implementation in the Scenarios
Considered
equal to zero, or removing rows and columns from the cost matrix for assignments that are fixed, such an implementation is quite computationally inefficient, particularly for large cost matrices, requiring allocating and copying large amounts of data for each cost matrix in addition to allocating space for the partial solutions that are constrained to exist. A more efficient implementation never copies the cost matrix, but rather keeps track of the partial assignment as well as which rows and columns are fixed or forbidden. Such constraints can be stored in arrays. Moreover, an efficient implementation would inherit the dual variables and partial solution from each problem as it splits to bring down the computational complexity to O(kN³). Doing that, however, would require the addition of dummy rows to a rectangular cost matrix when N_R ≠ N_C. Dummy rows were used in the implementation of the algorithm for this paper in Matlab and C++. C++ was used instead of C so that the priority_queue template class could be used as an ordered list. In Matlab, a binary heap class was created to function as an ordered list.
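As an illustration of the bookkeeping described above, a subproblem in the k-best search can be represented by a small structure such as the one below rather than by a modified copy of C. All field names are illustrative and are not taken from the paper's implementation; the splitting rule shown is the standard one for Murty's algorithm.

% Illustrative subproblem ("node") representation for the k-best search.
numRow=4;
numCol=6;
node.forbidden=false(numRow,numCol);%true marks a forbidden (row, column) pair.
node.numFixed=0;%Rows 1:numFixed keep their assignments from the parent split.
node.col4row=(1:numRow).';%Assignment found for this node (placeholder values).
node.u=zeros(numRow,1);%Dual variables inherited when the node is solved...
node.v=zeros(numCol,1);%...and passed on to its children to warm-start them.
node.cost=0;%Cost of col4row; the key used in the ordered list (heap).

%When a solved node is split, child p inherits the parent's first p-1
%assignments as fixed and forbids the parent's p-th assignment.
p=2;
child=node;
child.numFixed=p-1;
child.forbidden(p,node.col4row(p))=true;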
Table V shows the execution times for 1000 Monte Carlo runs of the two implementations of Murty's algorithm on assignment matrices of various sizes. As is the case in Section II-C, the cost matrices are generated with elements randomly generated uniformly between 0 and 1. Both implementations of the algorithm scanned the assignment matrices row-wise to obtain the best computational performance. The results demonstrate that the algorithm can produce a large number of hypotheses on moderate sized problems in a fraction of a second.

IV. CONCLUSIONS

An overview of the literature covering 2D assignment algorithms for rectangular problems and for k-best assignment was given. It was determined that the auction algorithm, despite its simplicity, is not always an ideal choice for performing 2D assignment if one requires globally optimal solutions, due to difficulties associated with choosing the complementary slackness parameter. A complementary slackness parameter that is too large cannot guarantee a globally optimal solution; a parameter that is too small can have a very slow rate of convergence, and adaptive methods for setting the complementary slackness parameter can be difficult to implement for use with arbitrary cost matrices. Moreover, simulations demonstrated that a forward auction algorithm with ε-scaling of the complementary slackness parameter was slower than a modified form of the Jonker-Volgenant algorithm. Consequently, the Jonker-Volgenant shortest augmenting path algorithm was chosen for implementation in this paper.

The implementation combines concepts from multiple papers in the literature to create an efficient algorithm that can be generalized to find the k-best 2D assignments rather than just the best 2D assignment. Whereas previous work has considered dual variable inheritance to improve k-best 2D assignment algorithms, this paper appears to be the only work that combines aspects of rapid infeasibility detection and dual variable inheritance to create a more efficient k-best 2D assignment algorithm. It was demonstrated that the order in which one scans the elements in the cost matrix matters, with the influence being attributable to cache prediction algorithms that are built into the processor. This is an implementation aspect that is not necessarily obvious to noncomputer science professionals. Additionally, execution time differences between the C and the C++ implementations of the 2D assignment algorithm demonstrate the difference that the compiler and a few minor changes between languages can make.

Matlab code implementing the 2D rectangular shortest augmenting path algorithm is given in the Appendix. It has been tested in Matlab 2013b, 2014a, 2014b, and 2015a.
numRow=numCol;
numCol=temp;
didFlip=true;
end
%The cost matrix must have all non-negative elements for the assignment
%algorithm to work. This forces all of the elements to be positive. The
%delta is added back in when computing the gain in the end.
if(maximize== true)
CDelta=max(max(C));
C=-C + CDelta;
else
CDelta=min(min(C));
C=C-CDelta;
end
%These store the assignment as it is made.
col4row=zeros(numRow, 1);
row4col=zeros(numCol, 1);
u=zeros(numCol, 1);%The dual variable for the columns
v=zeros(numRow, 1);%The dual variable for the rows.
%Initially, none of the columns are assigned.
for curUnassCol=1:numCol
%This finds the shortest augmenting path starting at k and returns
%the last node in the path.
[sink,pred,u,v]=ShortestPath(curUnassCol,u,v,C,col4row,row4col);
%If the problem is infeasible, mark it as such and return.
if(sink== 0)
col4row=[];
row4col=[];
gain=-1;
return;
end
%We have to remove node k from those that must be assigned.
j=sink;
while(1)
i=pred(j);
col4row(j)=i;
h=row4col(i);
row4col(i)=j;
j=h;
if(i== curUnassCol)
break;
end
end
end
%Calculate the gain that should be returned.
if(nargout>2)
gain=0;
for curCol=1:numCol
gain=gain + C(row4col(curCol),curCol);
end
%Adjust the gain for the initial offset of the cost matrix.
if(maximize== true)
gain=-gain + CDelta*numCol;
else
gain=gain + CDelta*numCol;
end
end
if(didFlip== true)
sink=closestRow;
else
curCol=col4row(closestRow);
end
end
%Dual Update Step
%Update the first row in the augmenting path.
u(curUnassCol)=u(curUnassCol)+delta;
%Update the rest of the rows in the augmenting path.
sel=(ScannedCols~=0);
sel(curUnassCol)=0;
u(sel)=u(sel)+delta-shortestPathCost(row4col(sel));
%Update the scanned columns in the augmenting path.
sel=ScannedRow~=0;
v(sel)=v(sel)-delta+shortestPathCost(sel);
end

REFERENCES

[1] Akgül, M. A genuinely polynomial primal simplex algorithm for the assignment problem. Discrete Applied Mathematics, 45, 2 (1993), 93–115.
[2] Balas, E., Miller, D., Pekny, J., and Toth, P. A parallel shortest augmenting path algorithm for the assignment problem. Journal of the Association for Computing Machinery, 38, 4 (Oct. 1991), 985–1004.
[3] Bernard, F. Fast linear assignment problem using auction algorithm. Nov. 13, 2014. [Online]. Available: http://www.mathworks.com/matlabcentral/fileexchange/48448-fast-linear-assignment-problem-using-auction-algorithm
[4] Bertsekas, D. P. The auction algorithm: A distributed relaxation method for the assignment problem. Annals of Operations Research, 14, 1 (Dec. 1988), 105–123.
[5] Bertsekas, D. P. Auction algorithms for network flow problems: A tutorial introduction. Computational Optimization and Applications, 1, 1 (Oct. 1992), 7–66.
[6] Bertsekas, D. P. Network Optimization: Continuous and Discrete Models. Belmont, MA: Athena Scientific, 1998.
[7] Bertsekas, D. P. Nonlinear Programming, 2nd ed. Belmont, MA: Athena Scientific, 2003.
[8] Bertsekas, D. P., and Castañón, D. A. A forward/reverse auction algorithm for asymmetric assignment problems. Computational Optimization and Applications, 1, 3 (Dec. 1992), 277–297.
[9] Bertsekas, D. P., Castañón, D. A., and Tsaknakis, H. Reverse auction and the solution of inequality constrained assignment problems. SIAM Journal on Optimization, 3, 2 (May 1993), 268–297.
[10] Bertsekas, D. P., and Tsitsiklis, J. N. Parallel and Distributed Computation: Numerical Methods. Englewood Cliffs, NJ: Prentice-Hall, 1989.
[11] Bertsimas, D., and Tsitsiklis, J. N. Introduction to Linear Optimization. Belmont, MA: Athena Scientific/Dynamic Ideas, 1997.
[12] Bijsterbosch, J., and Volgenant, A. Solving the rectangular assignment problem and applications. Annals of Operations Research, 181, 1 (Dec. 2010), 443–462.
[13] Blackman, S. S., and Popoli, R. Design and Analysis of Modern Tracking Systems. Norwood, MA: Artech House, 1999.
[14] Burkard, R., Dell'Amico, M., and Martello, S. Assignment Problems. Philadelphia, PA: Society for Industrial and Applied Mathematics, 2009.
[15] Castañón, D. A. New assignment algorithms for data association. In Proceedings of SPIE: Signal and Data Processing of Small Targets Conference, Orlando, FL, Apr. 20, 1992, 313–323.
[16] Chegireddy, C. R., and Hamacher, H. W. Algorithms for finding k-best perfect matchings. Discrete Applied Mathematics, 18, 2 (Nov. 1987), 155–165.
[17] Cook, S. The P versus NP problem. In The Millennium Prize Problems, J. Carlson, A. Jaffe, and A. Wiles (Eds.), Providence, RI: The American Mathematical Society for the Clay Mathematics Institute, 2006.
[18] Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. Introduction to Algorithms, 2nd ed. Cambridge, MA: The MIT Press, 2001.
[19] Intel Corporation. Intel(R) 64 and IA-32 architectures optimization reference manual. Intel Corporation, Tech. Rep. 248966-026, Apr. 2012. [Online]. Available: http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html
[20] Cox, I. J., and Miller, M. L. On finding ranked assignments with application to multi-target tracking and motion correspondence. IEEE Transactions on Aerospace and Electronic Systems, 32, 1 (Jan. 1995), 486–489.
[21] Cox, I. J., Miller, M. L., Danchick, R., and Newman, G. E. A comparison of two algorithms for determining ranked assignments with application to multitarget tracking and motion correspondence. IEEE Transactions on Aerospace and Electronic Systems, 33, 1 (Jan. 1997), 295–301.
[22] Crouse, D. F. Advances in displaying uncertain estimates of multiple targets. In Proceedings of SPIE: Signal Processing, Sensor Fusion, and Target Recognition XXII, Baltimore, MD, Apr. 2013.
[23] Crouse, D. F., and Willett, P. Identity variance for multi-object estimation. In Proceedings of SPIE: Signal and Data Processing of Small Targets, Vol. 8137, San Diego, CA, Aug. 25, 2011.
[24] Derigs, U. The shortest augmenting path method for solving assignment problems. Annals of Operations Research, 4, 1 (Dec. 1985), 57–102.
[25] Dijkstra, E. W. A note on two problems in connection with graphs. Numerische Mathematik, 1, 1 (Dec. 1959), 269–271.
[26] Dorhout, B. Experiments with some algorithms for the linear assignment problem. Stichting Mathematisch Centrum, Amsterdam, The Netherlands, Tech. Rep., Nov. 1970.
[27] Drummond, O., Castañon, D. A., and Bellovin, M. Comparison of 2-D assignment algorithms for sparse, rectangular, floating point, cost matrices. Journal of the SDI Panels on Tracking, 4 (1990), 81–97.
[28] Fitzgerald, R. J. Performance comparisons of some association algorithms.
David Frederic Crouse (S’05—M’12) received B.S., M.S., and Ph.D. degrees in
electrical engineering in 2005, 2008, and 2011 from the University of Connecticut
(UCONN). He also received a B.A. degree in German from UCONN for which he spent
a year at the Ruprecht-Karls Universität in Heidelberg, Germany.
He is currently employed at the Naval Research Laboratory in Washington, D.C.
and serves as an associate editor for the IEEE Aerospace and Electronic Systems
Magazine and has shared online a library of reusable algorithms for target trackers
called the Tracker Component Library. His interests lie in the areas of stochastic signal
processing and tracking.