A competitive local search heuristic for the subset sum problem
Diptesh Ghosh∗ and Nilotpal Chakravarti
Indian Institute of Management Calcutta,
P.O. Box 16757, P.O. Alipore, Calcutta 700027, India
Abstract
Subset sum problems are a special class of difficult singly constrained zero-one integer programming problems. Several heuristics for solving these problems have been reported in the
literature. In this paper we propose a new heuristic based on local search which improves upon
the previous best.
Keywords : subset sum, local search, heuristics
1 Introduction
Subset sum problems are a special class of binary knapsack problems which interest both theoreticians and practitioners. The problem has varied applications. To cite one example, the problem
of workload allocation of parallel unrelated machines with setup times gives rise to a 0-1 integer
program in which coefficient reduction can be achieved by solving some subset sum problems ( see
Dietrich and Escudero [3, 4] ). The subset sum problem is also useful in cryptography ( see Konheim [8] ) and the calculation of power indices in cooperative voting games ( see Prasad and Kelly [12]
and Chakravarti et al. [1] ). In this paper, we consider the following optimization version of the
problem.
Problem Subset Sum

    Maximize    ∑_{j=1}^{n} wj xj
    Subject to  ∑_{j=1}^{n} wj xj ≤ c,
                xj ∈ {0, 1}, 1 ≤ j ≤ n,

where each wj , 1 ≤ j ≤ n, and c are positive real numbers.
We will refer to this problem as SSP(n, {w1 , w2 , . . . , wn }, c). Any vector X = {x1 , x2 , . . . , xn } ∈ {0, 1}^n such that ∑_{j=1}^{n} wj xj ≤ c is called a solution to the above problem.
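The definitions above can be made concrete in a short sketch. Python is used here purely for illustration (the authors' implementation, not shown in the paper, was in C), and the function names are our own:

```python
def is_solution(x, w, c):
    """Check that the 0-1 vector x satisfies the knapsack constraint sum(wj*xj) <= c."""
    return sum(wj * xj for wj, xj in zip(w, x)) <= c

def objective(x, w):
    """Objective value of the 0-1 vector x, i.e. the total weight packed."""
    return sum(wj * xj for wj, xj in zip(w, x))

# The small instance SSP(4, {8, 7, 6, 2}, 13) used as an example later in the paper:
w, c = [8, 7, 6, 2], 13
x = [1, 0, 0, 1]   # a solution with objective value 10
```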
SSP’s can be solved to optimality using exact algorithms based on branch and bound or dynamic
programming. However, the problem is known to be NP-complete ( see Garey and Johnson [7] ),
which means that it is extremely unlikely that any exact algorithm will solve all its instances efficiently. ( For some classes of SSP’s which are difficult to solve in practice, see Chvátal [2]. ) Hence
SSP’s are often solved using heuristics — fast methods which yield approximate solutions.
∗ Currently at the Indian Institute of Management Lucknow, Prabandh Nagar, Off Sitapur Road, Lucknow 226013,
India, E-mail: diptesh@iiml.ac.in
Previous studies ( see Martello and Toth [9] ) suggest that the best heuristic for SSP is a scheme
developed by Martello and Toth. They use ideas from the Sahni scheme for the binary knapsack
problem ( see Sahni [15] ) to modify one of their own heuristics for the subset sum problem. We refer
to this heuristic as the Martello, Toth and Sahni scheme ( MTSS ). It takes as an input parameter
an integer k and generates candidate solutions as follows. A subset S ⊆ {1, 2, . . . , n} such that
|S| ≤ k is chosen. If ∑_{j∈S} wj ≤ c, then a partial solution is constructed by setting
xs = 1 ∀s ∈ S. A candidate solution is obtained by augmenting this partial solution using the
Martello and Toth greedy scheme ( see Martello and Toth [10] ) on the SSP(n − |S|, {wr | xr = 0},
c − ∑_{j=1}^{n} wj xj ). The heuristic generates all possible candidate solutions and outputs the
best of these. It requires O(n^{k+2}) time and its worst case performance ratio is
(3k + 3)/(3k + 4) ( see Fischetti [6] ). This heuristic is usually implemented with k = 2 and is
extremely good in practice.
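The scheme just described can be sketched as follows. This is a simplified illustration, not the authors' code: the greedy completion here scans the remaining items in plain index order rather than in the specific order used by the Martello and Toth greedy scheme [10], and all identifiers are our own.

```python
from itertools import combinations

def greedy_complete(x, w, c):
    """Greedily add remaining items in index order while capacity allows.
    (A simplification of the Martello-Toth greedy completion.)"""
    residual = c - sum(wj * xj for wj, xj in zip(w, x))
    x = list(x)
    for j, wj in enumerate(w):
        if x[j] == 0 and wj <= residual:
            x[j] = 1
            residual -= wj
    return x

def mtss(w, c, k=2):
    """Sketch of the MTSS scheme: fix every subset S with |S| <= k,
    complete each partial solution greedily, return the best candidate."""
    n = len(w)
    best, best_val = [0] * n, 0
    for size in range(k + 1):
        for S in combinations(range(n), size):
            if sum(w[j] for j in S) > c:
                continue                     # infeasible partial solution
            x = [0] * n
            for j in S:
                x[j] = 1
            x = greedy_complete(x, w, c)
            val = sum(wj * xj for wj, xj in zip(w, x))
            if val > best_val:
                best, best_val = x, val
    return best, best_val
```

On the instance SSP(4, {8, 7, 6, 2}, 13), fixing the subset containing the item of weight 7 already leads to the optimal candidate of value 13.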
Recently there has been considerable interest in the use of local search and related heuristics to solve
discrete optimization problems. Local search is a well-established template algorithm which can be
easily modified to yield problem-specific heuristics. It is an iterative procedure. We begin with an
arbitrary “current solution” and search its neighbourhood for a better one. ( The neighbourhood
is a correspondence which maps each solution to a set of solutions. ) If better solutions exist
in the neighbourhood, we replace the current solution by the best among them and repeat the
neighbourhood search from the new solution. The search stops when the current solution has no
better neighbour. Local search terminates with the first locally optimal solution it finds, which is
usually not a global optimum.
Local search has several variants. For example one might choose the first neighbour which offers
an improvement rather than the best. Many general purpose heuristics such as simulated annealing
and tabu search are also derived from local search. These permit occasional worsening moves in
order to escape from local optima, and can yield better solutions at the expense of increased
execution time.
Local search and its generalizations have the advantage of being widely applicable. Although variants
of local search are the heuristics of choice for some problem domains, for example vehicle routing and
graph partitioning, their major use is in the solution of otherwise intractable problems. Conventional
wisdom suggests that good problem specific heuristics, wherever available, are preferable to local
search. ( See for example, Pirlot [11] and Reeves [13, p 64]. )
The choice of the neighbourhood is perhaps the single most important problem-specific decision in
local search. The most intuitive one for SSP is perhaps the k-swap neighbourhood structure. A vector
X = {x1 , x2 , . . . , xn } is said to belong to the k-swap neighbourhood of a solution
Y = {y1 , y2 , . . . , yn } if and only if ∑_{j=1}^{n} wj xj ≤ c and ∑_{j=1}^{n} |xj − yj | ≤ k.
Notice that as the value of k increases,
the neighbourhood becomes larger. This results in the possibility of more dramatic improvements
in solution value at each iteration at the cost of longer execution time per iteration. It is easy to
see that for a k-swap neighbourhood, each local search iteration takes O(nk ) time. This makes local
search iterations quite expensive for k > 2.
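A best-improvement local search over the k-swap neighbourhood can be sketched as follows. The paper does not prescribe an implementation; the function and its argument names are our own, and the starting solution is assumed feasible.

```python
from itertools import combinations

def k_swap_local_search(x, w, c, k=2):
    """Best-improvement local search over the k-swap neighbourhood:
    repeatedly flip up to k coordinates whenever the result is feasible
    and strictly better, until no improving neighbour exists."""
    n = len(w)
    val = sum(wj * xj for wj, xj in zip(w, x))
    improved = True
    while improved:
        improved = False
        best_val, best_x = val, x
        for size in range(1, k + 1):                 # O(n^k) neighbours per iteration
            for flips in combinations(range(n), size):
                y = list(x)
                for j in flips:
                    y[j] = 1 - y[j]                  # flip at most k coordinates
                y_val = sum(wj * yj for wj, yj in zip(w, y))
                if y_val <= c and y_val > best_val:  # feasible and strictly better
                    best_val, best_x = y_val, y
        if best_val > val:
            x, val = best_x, best_val
            improved = True
    return x, val
```

Starting from the empty solution of SSP(4, {8, 7, 6, 2}, 13), a single 2-swap iteration already reaches the optimal value 13.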
We propose in the present paper a new local search heuristic based on a rather different neighbourhood structure, which we term the switch neighbourhood structure. It outputs solutions superior in
quality to the MTSS scheme, though it also requires more time.
2 The New Heuristic
The following observations prompt the switch neighbourhood structure.
1. The optimal solution to a SSP is always maximal, i.e. it is a vector which ceases to be a
solution if any element which is 0 is changed to 1 while the others are left unchanged.
2. Given any permutation (i1 , i2 , . . . , in ) the greedy solution corresponding to that permutation
is obtained by the following algorithm.
Algorithm greedy (n, c, {w1 , . . . , wn }, {i1 , . . . , in })
Step 0 ( Initialization ) Set j ← 1, residual ← c and xk ← 0, 1 ≤ k ≤ n. Go to Step 1.
Step 1 ( Termination ) If j = n + 1, output {x1 , x2 , . . . , xn } and terminate. Else go to Step
2.
Step 2 ( Augmentation ) If wij ≤ residual then set residual ← residual − wij and xij ← 1. Set
j ← j + 1 and go to Step 1.
Any maximal solution is the greedy solution corresponding to some permutation. Thus, for
example, a maximal solution X with xi = 1∀i ∈ S and xi = 0 otherwise is the greedy solution
corresponding to any permutation of {1, 2, . . . , n} in which the elements of S precede all the
other integers.
It should be noted that several permutations may give rise to the same maximal solution.
( For example, in the SSP(4, {8, 7, 6, 2}, 13), the permutations (1, 2, 3, 4), (1, 2, 4, 3), (1, 3, 2, 4),
(1, 3, 4, 2), (1, 4, 2, 3), (1, 4, 3, 2), (4, 1, 2, 3) and (4, 1, 3, 2) give rise to the same solution x1 =
x4 = 1, x2 = x3 = 0 with an objective function value of 10. )
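Algorithm greedy above admits a direct transcription. Python is used for illustration; the permutation is taken with the paper's 1-based item indices while the weight list itself is 0-based.

```python
def greedy(n, c, w, perm):
    """Algorithm greedy from the paper: scan items in the order given by
    the permutation `perm` (1-based indices) and take every item that fits."""
    x = [0] * n          # Step 0: initialization
    residual = c
    for i in perm:       # Steps 1-2: one pass over the permutation
        if w[i - 1] <= residual:
            residual -= w[i - 1]
            x[i - 1] = 1
    return x

# The paper's example, SSP(4, {8, 7, 6, 2}, 13): the permutations listed there,
# e.g. (1, 2, 3, 4) and (4, 1, 3, 2), all yield the maximal solution x1 = x4 = 1.
```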
The switch neighbourhood structure is defined on the set of all maximal solutions. Suppose that
X is a maximal solution and P = (p1 , p2 , . . . , pn ) is a permutation corresponding to X. Let Q =
(q1 , q2 , . . . , qn ) be a transposition of P , i.e. let Q be a permutation which can be obtained from P
by switching the position of exactly two elements. Let Y be the solution to which Q corresponds.
Then in the switch neighbourhood, X and Y are neighbours.
Algorithm switch (n, c, {w1 , . . . , wn })
Step 0 ( Initialization ) Choose an arbitrary permutation P = (p1 , p2 , . . . , pn ) of
{1, 2, . . . , n}. Obtain X∗ = {x∗1 , x∗2 , . . . , x∗n } = greedy(n, c, {w1 , . . . , wn },
{p1 , . . . , pn }). Set z∗ ← ∑_{j=1}^{n} wj x∗j . Go to Step 1.
Step 1 ( Termination ) Choose a permutation Q = (q1 , q2 , . . . , qn ) obtained from P by switching
the position of exactly two elements of P , such that the solution output by greedy(n, c,
{w1 , . . . , wn }, {q1 , . . . , qn }) has objective function value greater than z∗ . If no such
permutation exists, output {x∗1 , x∗2 , . . . , x∗n } and terminate. Else go to Step 2.
Step 2 ( Update ) Set X∗ ← greedy(n, c, {w1 , . . . , wn }, {q1 , . . . , qn }),
z∗ ← ∑_{j=1}^{n} wj x∗j and P ← Q. Go to Step 1.
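Algorithm switch can be transcribed compactly as below. This is a sketch under the natural reading of the pseudocode: Step 1 is implemented as a first-improvement scan over all transpositions, the current permutation is replaced by Q after each improving switch, and 0-based indices and our own identifiers are used throughout.

```python
from itertools import combinations

def greedy(n, c, w, perm):
    """One pass over the permutation (0-based), taking every item that fits."""
    x, residual = [0] * n, c
    for i in perm:
        if w[i] <= residual:
            residual -= w[i]
            x[i] = 1
    return x

def switch_local_search(n, c, w):
    """Sketch of Algorithm switch: local search over transpositions
    of the current permutation, using first improvement."""
    P = list(range(n))                       # Step 0: arbitrary starting permutation
    x = greedy(n, c, w, P)
    z = sum(w[j] * x[j] for j in range(n))
    improved = True
    while improved:                          # Steps 1-2
        improved = False
        for a, b in combinations(range(n), 2):
            Q = list(P)
            Q[a], Q[b] = Q[b], Q[a]          # switch exactly two elements of P
            y = greedy(n, c, w, Q)
            zy = sum(w[j] * y[j] for j in range(n))
            if zy > z:                       # better neighbour found: move to it
                P, x, z = Q, y, zy
                improved = True
                break
    return x, z                              # Step 1 termination: no better neighbour
```

On SSP(4, {8, 7, 6, 2}, 13) the starting permutation (1, 2, 3, 4) gives the greedy solution of value 10, and the very first transposition tried already improves it to the optimal value 13.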
The switch neighbourhood structure has several advantages over the k-swap neighbourhoods. Since
non-maximal solutions have been removed from the search space, the objective function graph is
less “spiky”, i.e. has fewer local optima. Local search performs better on such smooth objective
function graphs. Also, the switch neighbourhood contains, in addition to some k-swap neighbours
of the current solution, others which can only be obtained from the current solution after multiple
k-swaps. Let us consider the effect of a switch iteration on the space of all solutions equipped with
the k-swap neighbourhood structure; the iteration produces a solution in the k-swap neighbourhood
of the current solution only if it is good enough; otherwise it results in a jump to some other portion
of the search space. Therefore local search with the switch neighbourhood may be considered an
adaptive version of multiple start local search using k-swap neighbourhoods, which searches the
crucial portion of the search space more thoroughly.

Heuristic                  n = 75      n = 125     n = 251     n = 501
MTSS                       0.9999854   0.9999978   0.9999993   0.9999993
Local Search (2-swap)      0.9993987   0.9997773   0.9999404   0.9999852
Local Search (Switch)      0.9999993   0.9999993   0.9999993   0.9999993

Figure 1: wj ∈ [1, 10000], c = 0.25 ∑_{j=1}^{n} wj .

Heuristic                  n = 75      n = 125     n = 251     n = 501
MTSS                       0.9999837   0.9999997   0.9999993   0.9999993
Local Search (2-swap)      0.9996682   0.9998937   0.9999693   0.9999943
Local Search (Switch)      0.9999993   0.9999993   0.9999993   0.9999993

Figure 2: wj ∈ [1, 10000], c = 0.5 ∑_{j=1}^{n} wj .
However this neighbourhood structure does have one serious disadvantage. The generation of each
neighbouring solution and the calculation of its objective function value require linear time. As a
result, each local search iteration, requiring O(n3 ) execution time, is quite expensive. The idea of
switch-type neighbourhoods was developed in the context of number partitioning problems by Ruml
et al. [14] where it proved very effective.
3 Computational Results
In this section we compare the MTSS heuristic ( with the input parameter k = 2 ) and local search
with 2-swap and switch neighbourhoods on the basis of their performance on randomly generated
problems. The algorithms have been coded in C and run on an 80486 machine running SCO UNIX.
The implementations were fairly straightforward since we were more interested in the quality of
solutions output than in either the execution time or the maximum size of problems which could be
solved.
Twelve problem sets were generated for computational comparison. Each problem set contained 100
problem instances for each problem size. (Recall that problem size refers to the parameter n in the
SSP.) The quality of a solution is given by the fraction ( ∑_{j=1}^{n} wj xj )/c ( i.e. the fraction
of c “filled” by the solution ). The displayed output is the average of the solution quality values
obtained for all
instances of that size in the set.
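The experimental setup above can be sketched as follows. This is a hypothetical reconstruction, not the authors' generator: the paper does not specify the random number generator used, and the integer weight ranges and the definition of c as a fixed fraction of the total weight are read off the figure captions. All identifiers are our own.

```python
import random

def generate_instance(n, lo, hi, frac, seed=None):
    """Generate one test instance in the style of the paper's experiments:
    n weights drawn uniformly from [lo, hi], with c a fixed fraction of
    the total weight (e.g. frac = 0.25 for c = 0.25 * sum of wj)."""
    rng = random.Random(seed)
    w = [rng.randint(lo, hi) for _ in range(n)]
    c = frac * sum(w)
    return w, c

def quality(x, w, c):
    """Solution quality as defined in the paper: fraction of c 'filled'."""
    return sum(wj * xj for wj, xj in zip(w, x)) / c

# e.g. one instance of the first problem set: n = 75, wj in [1, 10000], c = 0.25 * sum(wj)
w, c = generate_instance(75, 1, 10000, 0.25, seed=0)
```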
Figures 1 through 12 summarize the results that we obtained. It may be observed that the quality
of solutions output by each heuristic improves with problem size and deteriorates when the interval
from which the numbers are picked becomes wider.
Heuristic                  n = 75      n = 125     n = 251
MTSS                       0.9999781   0.9999955   0.9999992
Local Search (2-swap)      0.9997891   0.9999219   0.9999811
Local Search (Switch)      0.9999993   0.9999993   0.9999993

Figure 3: wj ∈ [1, 10000], c = 0.75 ∑_{j=1}^{n} wj .

Heuristic                  n = 75      n = 125     n = 251
MTSS                       0.9999684   0.9999939   0.9999991
Local Search (2-swap)      0.9933293   0.9948497   0.9966084
Local Search (Switch)      0.9999738   0.9999908   0.9999938

Figure 4: wj ∈ [10000, 20000], c = 0.25 ∑_{j=1}^{n} wj .

Heuristic                  n = 75      n = 125     n = 251
MTSS                       0.9999657   0.9999940   0.9999990
Local Search (2-swap)      0.9931960   0.9967080   0.9985452
Local Search (Switch)      0.9999810   0.9999971   0.9999993

Figure 5: wj ∈ [10000, 20000], c = 0.5 ∑_{j=1}^{n} wj .

Heuristic                  n = 75      n = 125     n = 251
MTSS                       0.9998764   0.9999784   0.9999982
Local Search (2-swap)      0.9952351   0.9972560   0.9987549
Local Search (Switch)      0.9999868   0.9999939   0.9999990

Figure 6: wj ∈ [10000, 20000], c = 0.75 ∑_{j=1}^{n} wj .

Heuristic                  n = 75      n = 125     n = 251
MTSS                       0.9999831   0.9999954   0.9999992
Local Search (2-swap)      0.9994085   0.9997849   0.9999448
Local Search (Switch)      0.9999863   0.9999993   0.9999993

Figure 7: wj ∈ [1, 10000000], c = 0.25 ∑_{j=1}^{n} wj .

Heuristic                  n = 75      n = 125     n = 251
MTSS                       0.9999815   0.9999972   0.9999992
Local Search (2-swap)      0.9996778   0.9998801   0.9999722
Local Search (Switch)      0.9999992   0.9999993   0.9999993

Figure 8: wj ∈ [1, 10000000], c = 0.5 ∑_{j=1}^{n} wj .

Heuristic                  n = 75      n = 125     n = 251
MTSS                       0.9999776   0.9999954   0.9999991
Local Search (2-swap)      0.9997764   0.9999147   0.9999800
Local Search (Switch)      0.9999992   0.9999993   0.9999993

Figure 9: wj ∈ [1, 10000000], c = 0.75 ∑_{j=1}^{n} wj .

Heuristic                  n = 75      n = 125     n = 251
MTSS                       0.9999659   0.9999932   0.9999991
Local Search (2-swap)      0.9933215   0.9948471   0.9966031
Local Search (Switch)      0.9999726   0.9999903   0.9999958

Figure 10: wj ∈ [1000000, 2000000], c = 0.25 ∑_{j=1}^{n} wj .

Heuristic                  n = 75      n = 125     n = 251
MTSS                       0.9999655   0.9999930   0.9999991
Local Search (2-swap)      0.9931902   0.9968154   0.9985417
Local Search (Switch)      0.9999804   0.9999974   0.9999993

Figure 11: wj ∈ [1000000, 2000000], c = 0.5 ∑_{j=1}^{n} wj .

Heuristic                  n = 75      n = 125     n = 251
MTSS                       0.9998764   0.9999782   0.9999979
Local Search (2-swap)      0.9952316   0.9972528   0.9987528
Local Search (Switch)      0.9999871   0.9999942   0.9999990

Figure 12: wj ∈ [1000000, 2000000], c = 0.75 ∑_{j=1}^{n} wj .
Both the MTSS heuristic and local search with switch neighbourhoods output solutions of excellent
quality. In fact, for many of our randomly generated test instances, they generated the optimal
solution! (We did not calculate the optimal solution for all instances and hence do not report the
exact number of instances in which each heuristic generated the optimal solution.) Local search
using switch neighbourhoods outperformed MTSS fairly consistently. This is all the more impressive
since the MTSS heuristic is so good! Local search using 2-swap neighbourhoods in general was
outperformed by both the MTSS heuristic and local search with switch neighbourhoods.
In general the MTSS heuristic required shorter execution times. Local search using the switch
neighbourhood took, on average, 1.33 times more time when the data was in the interval [1, 10000],
14 times more time when the data was in the interval [10000, 20000], 68.5 times more time when the
data was in the interval [1, 10000000], and 67 times more time when the data was in the interval
[1000000, 2000000]. However, the actual time taken to solve the problems was in most cases less
than two CPU seconds, making both heuristics eminently applicable.
4 Conclusions
In this paper we have presented a local search heuristic with a new switch neighbourhood structure
for the subset sum problem. This compares favourably with the best problem-specific heuristic.
This is all the more impressive since SSP is well-studied and the MTSS heuristic is very good.
Any local search heuristic can be easily modified to obtain a simulated annealing and a tabu search
variant, which may provide better quality solutions at the expense of longer execution times. Consequently, if further improvements in solution quality are required, tabu search or simulated annealing
versions of the heuristic of the present paper may be implemented.
The switch neighbourhood structure described here is quite general in scope and may be used in
many other problem domains. We expect this to be the subject of further fruitful research.
References
[1] N. Chakravarti, A. M. Goel and T. Sastry, Easy weighted majority games. Working Paper
Series WPS-287/97 Indian Institute of Management Calcutta.
[2] V. Chvátal, Hard Knapsack Problems. Operations Research 28, 1402 – 1411 (1980).
[3] B. L. Dietrich and L. F. Escudero, Coefficient reduction for knapsack-like constraints in 0-1
programs with variable upper bounds. Operations Research Letters 9, 9 – 14 (1990).
[4] B. L. Dietrich, L. F. Escudero and F. Chance, Efficient reformulation for 0-1 programs: methods
and computational results. Discrete Applied Mathematics 42, 144 – 176 (1993).
[5] L. F. Escudero, S. Martello and P. Toth, A framework for tightening 0-1 programs based on
extensions of pure 0-1 KP and SS problems, 110-123, in E. Balas and J. Clausen (eds.) Integer
Programming and Combinatorial Optimization. Springer, Berlin.
[6] M. Fischetti, Worst-case analysis of an approximation scheme for the subset-sum problem.
Operations Research Letters 5, 283 – 284 (1986).
[7] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman & Co., San Francisco (1979).
[8] A. G. Konheim, Cryptography: A Primer. John Wiley & Sons (1981).
[9] S. Martello and P. Toth, Knapsack Problems, Algorithms and Computer Implementations. John
Wiley & Sons (1989).
[10] S. Martello and P. Toth, Worst-case analysis of greedy algorithms for the subset-sum problem.
Mathematical Programming 28, 198 – 205 (1984).
[11] M. Pirlot, General local search methods. European Journal of Operational Research 92, 493 –
511 (1996).
[12] K. Prasad and J. S. Kelly, NP-Completeness of some problems concerning voting games. International Journal of Game Theory 19, 1–9 (1990).
[13] C. R. Reeves (Ed.), Modern Heuristic Techniques for Combinatorial Problems. Orient Longman,
Great Britain (1993).
[14] W. Ruml, J. T. Ngo, J. Marks and S. M. Schieber, Easily searched encodings for number
partitioning. Journal of Optimization Theory and Applications 89, 251 – 291 (1996).
[15] S. Sahni, Approximate algorithms for the 0-1 knapsack problem. Journal of the Association for Computing Machinery 22, 115 – 124 (1975).