1 Introduction

Construct, Merge, Solve & Adapt (CMSA) (Blum et al., 2016) is a hybrid metaheuristic for solving combinatorial optimization problems. It is an iterative approach that solves, at each iteration, a subinstance of the considered problem instance. In this context, note that the search space (set of feasible solutions) of a subinstance is a subset of the search space of the original problem instance. In other words, every feasible solution to a subinstance is also a feasible solution to the original problem instance. Today’s existing CMSA variants are all covered by the algorithmic framework shown in Fig. 1. After generating an initial subinstance—which is empty in the simplest case—any CMSA algorithm executes four steps at each iteration. First, in the Construct step, valid solutions to the considered problem instance are generated in some way. Then, in the Merge step, the solution components found in these solutions are added to the incumbent subinstance, thereby extending it. The incumbent subinstance is solved in the third step, the Solve step. Note, in this context, that the subinstance is not necessarily solved to optimality. Finally, an aging mechanism is employed in the Adapt step to adapt the subinstance based on the outcome of the Solve step. This adaptation reduces the incumbent subinstance by removing the solution components identified by the aging mechanism.

Fig. 1: General framework of any CMSA approach

While the Merge and Adapt steps are implemented in the same way in all existing CMSA variants, the Construct and Solve steps offer a rather large degree of freedom. In the case of the Construct step, valid solutions to the considered problem instance might, for example, be generated by a probabilistic greedy heuristic. A second option consists of feeding the incumbent subinstance of CMSA with solution components from solutions generated by a metaheuristic run in parallel. Concerning the Solve step, an integer linear programming (ILP) solver might be used to solve the incumbent subinstance if the considered optimization problem can be expressed by means of an ILP model. Other options include specialized exact techniques and short runs of an available metaheuristic for the tackled problem.

In this paper, we study different ways of solving the incumbent subinstance in the Solve step of the CMSA framework. We do so in the context of the application of CMSA to combinatorial optimization problems whose goal is the partitioning of a finite set of items into disjoint subsets. This problem type comprises large families of important problems such as bin packing problems, multiple knapsack problems, assembly line balancing problems, and vehicle routing problems, to mention just a few. ILP solvers such as CPLEX and Gurobi generally find it difficult to solve standard assignment-type ILP models of these problems. Therefore, the Operations Research community has developed specialized exact techniques based on set covering models for solving them. Concerning exact techniques, set covering models are especially useful in the context of column generation methods; see, for example, (Barnhart et al., 1998; Desrochers & Soumis, 1989; Parker & Ryan, 1993). However, the transformation of vehicle routing and packing problems to set covering problems has also been exploited in the context of heuristic methods; see, for example, (Monaci & Toth, 2006; Cacchiani et al., 2014; Machado et al., 2021). In this paper, we show that replacing standard assignment-type ILP models with set covering formulations for solving subinstances within CMSA algorithms may often result in high-performance techniques. In addition, these set-covering-based CMSA techniques are, in contrast to column generation techniques, rather easy to program.

1.1 Literature Review

As mentioned above, different CMSA variants—that is, different instantiations of the framework shown in Fig. 1—have been proposed in the related literature. The most-utilized CMSA variant to date is a generic CMSA based on probabilistically constructing solutions in the Construct step and using an ILP solver for solving subinstances in the Solve step. This algorithm was first applied to the minimum covering arborescence problem and the minimum common string partition problem (Blum et al., 2016). More recent applications include prioritized pairwise test data generation in software product lines (Ferrer et al., 2021), the maximum happy vertices problem (Thiruvady & Lewis, 2022), and the max tree coverage problem (Zhou & Zhang, 2024). A more recent CMSA variant is self-adaptive CMSA, proposed in Akbay et al. (2022). This variant is characterized by the fact that the values of important algorithm parameters are handled in a self-adaptive way and that the probabilistic generation of valid solutions in the Construct step is biased towards the best-found solution. Existing applications of this CMSA variant include the multi-dimensional multi-way number partitioning problem (Djukanović et al., 2023) and the two-echelon electric vehicle routing problem with simultaneous pickup and deliveries (Akbay et al., 2023). Finally, it is worth mentioning that CMSA has already been applied to complex, practical problems such as the optimization of refueling and maintenance planning of nuclear power plants (Dupin & Talbi, 2021).

Next, we review works from the literature based on ideas similar to CMSA’s. The core concept of CMSA closely aligns with that found in large neighborhood search (LNS) (Pisinger & Røpke, 2010). Like CMSA, LNS is based on iteratively solving subinstances of the tackled problem instance. The main difference is in the way in which these subinstances are generated and adapted from one iteration to the next. Applegate et al. (1999) and Cook and Seymour (2003) solved the classical traveling salesman problem (TSP) as follows. Initially, in a preliminary phase, they employ a metaheuristic to generate a collection of high-quality TSP solutions. Subsequently, these solutions are merged to create a subinstance, which is then solved by an exact solver. In other words, their algorithm applies one iteration of CMSA (without the execution of the Adapt step). A similar approach is used in (Cavaliere et al., 2022) for the capacitated vehicle routing problem (CVRP). Another example of work related to CMSA can be found in (Nepomuceno et al., 2007), where the so-called generate-and-solve (GS) framework is proposed. A recent application of this framework is presented in (Sá Santos & Nepomuceno, 2022). The method known as kernel search (Angelelli et al., 2010) employs a heuristic strategy that involves identifying a limited set of solution components likely to be present in optimal solutions (the kernel) and then utilizing an ILP solver to find the best solutions that contain the kernel; see (Lamanna et al., 2022) for a recent application. Another approach related to CMSA worth mentioning is merge search (MS) (Kenny et al., 2018). Like CMSA, MS creates a subinstance in each iteration and attempts to solve it with an exact solver. However, the key difference lies in how these subinstances are formed. CMSA focuses on identifying a set of variables with fixed values in high-quality solutions, shifting optimization to the remaining variables. In contrast, MS groups variables with consistent values across good solutions, although the exact values within these groups are still optimized. An example of MS application can be found in Thiruvady et al. (2020).

1.2 Our Contribution

The contribution of this work is twofold. First, we provide a conceptual contribution and show that—in the context of problems that can be modeled both by means of standard assignment-type ILP models and by set-covering-based ILP models—the use of set-covering-based models for solving subinstances within CMSA may significantly outperform the use of standard assignment-type models. This is shown both for a problem from the bin packing field and for a complex problem from the field of electric vehicle routing. The paper’s second contribution consists of the state-of-the-art results that our algorithms obtain for both considered problems. In fact, in the case of the variable-sized bin packing problem, new best-known solutions are found in 68 out of 150 cases. Furthermore, in the context of the electric vehicle routing problem, we compare to our previous work in which this complex problem was introduced, and we show that the results obtained by our new CMSA variant significantly outperform the previous ones.

2 CMSA: construct, merge, solve & adapt

In this section, we describe the generic CMSA algorithm variant, which will be used in our first application example concerning the variable-sized bin packing problem. The first action to be taken when applying CMSA to a combinatorial optimization problem is defining the set C of solution components, that is, the components of which solutions to the considered problem are composed. Later, we will provide specific examples of such a definition. For the moment, however, let \(C = \{c_1, \ldots , c_n\}\) be a generic set of solution components. Moreover, any valid solution \(S \in \mathscr {S}\) to our problem—where \(\mathscr {S}\) is the set of all valid solutions—can be expressed as a subset of C, that is, \(S \subseteq C\) for all \(S \in \mathscr {S}\). Finally, for the following discussion, let \(f: \mathscr {S} \rightarrow \mathbb {N}^+\) be the objective function to be minimized, and let \(f(\emptyset ):= \infty \).

Algorithm 1: Pseudo-code of a generic CMSA

Algorithm 1 provides the pseudo-code of our generic CMSA. The algorithm starts by initializing both the best-so-far solution \(S^{\textrm{bsf}}\) and the subinstance \(C'\), which is always a subset of C, to the empty set. Furthermore, the so-called age values of all solution components are initialized to zero, that is, \( age [c]:= 0\) for all \(c \in C\). The main loop of the CMSA algorithm consists of four actions, as already shown in the general CMSA framework from Fig. 1. First, in the Construct step of CMSA, a number of \({n_{\textrm{a}}}\) valid solutions to the considered problem are probabilistically generated (see line 8 of Algorithm 1). Second, in the Merge step of CMSA, the current subinstance \(C'\) is updated with the solution components found in these \({n_{\textrm{a}}}\) solutions (see lines 10–13). Third, in the Solve step of CMSA, an ILP solver is applied in order to find the best possible solution that only contains components from subinstance \(C'\), within a time limit of \(t_{\textrm{ILP}}\) CPU seconds (see line 15). Fourth, in the Adapt step of CMSA, subinstance \(C'\) is adapted based on the solution \(S^{\textrm{ILP}}\) returned by the ILP solver (see line 17). These four steps are iterated until a given CPU time limit is reached. The output of the algorithm is \(S^{\textrm{bsf}}\), the best solution found during the whole process.

Note that the Construct and Solve steps of CMSA are problem-dependent. In particular, for the Construct step, a greedy heuristic for the tackled problem is generally used in a randomized way, while the Solve step depends on the availability of an ILP model for the tackled problem. Moreover, the way in which an ILP model is generated based on subinstance \(C'\) leaves room for variation. In contrast, the Merge and Adapt steps are problem-independent. In the Merge step, those solution components that (1) are found in at least one of the \({n_{\textrm{a}}}\) constructed solutions and (2) do not currently form part of \(C'\) are added to \(C'\), and their age value is set to zero. Finally, the Adapt step consists in the application of function Adapt(\(C'\), \(S^{\textrm{ILP}}\), \( age _{\max }\)), which adapts subinstance \(C'\) in the following way. First, the age values of all components in \(C'\setminus S^{\textrm{ILP}}\) are incremented by one. Second, the age values of all components in \(S^{\textrm{ILP}}\) are set to zero. The final action in the Adapt step removes all those components from \(C'\) whose age value has reached the maximum allowed age (\( age _{\max }\)). This is done in order to prevent components that never appear in \(S^{\textrm{ILP}}\) from slowing down the ILP solver in subsequent CMSA iterations. Note that the age value \( age [c]\) of a solution component \(c \in C\) indicates, at any time, the number of consecutive CMSA iterations for which c has formed part of subinstance \(C'\) without having been included in the ILP solution to the subinstance.
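The following C++ sketch summarizes the main loop together with the Merge and Adapt logic just described. It is an illustration rather than the actual implementation: the problem-specific Construct and Solve steps are passed in as callables, and all names are chosen for exposition.

```cpp
#include <chrono>
#include <functional>
#include <iterator>
#include <set>
#include <vector>

using Component = int;
using Solution  = std::set<Component>;

// Generic CMSA main loop: Construct, Merge, Solve, Adapt.
Solution cmsa(int n_components, double t_total, int n_a, int age_max,
              const std::function<Solution()>& construct,
              const std::function<Solution(const std::set<Component>&)>& solveSub,
              const std::function<double(const Solution&)>& f) {
    Solution S_bsf;                          // empty; f(empty) := infinity
    std::set<Component> C_prime;             // incumbent subinstance
    std::vector<int> age(n_components, 0);
    const auto start = std::chrono::steady_clock::now();
    auto elapsed = [&] {
        return std::chrono::duration<double>(
            std::chrono::steady_clock::now() - start).count();
    };
    while (elapsed() < t_total) {
        for (int a = 0; a < n_a; ++a)                   // Construct
            for (Component c : construct())             // Merge: add new components
                if (C_prime.insert(c).second) age[c] = 0;
        Solution S_ilp = solveSub(C_prime);             // Solve (time-limited)
        if (S_bsf.empty() || f(S_ilp) < f(S_bsf)) S_bsf = S_ilp;
        for (auto it = C_prime.begin(); it != C_prime.end(); ) {  // Adapt
            Component c = *it;
            if (S_ilp.count(c)) age[c] = 0; else ++age[c];
            it = (age[c] >= age_max) ? C_prime.erase(it) : std::next(it);
        }
    }
    return S_bsf;
}
```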

3 Application to variable-sized bin packing

The variable-sized bin packing problem (VSBPP) can technically be described as follows. Given is a set \(S = \{1,\ldots ,n\}\) of n items, where each item \(i \in S\) has a weight \(w_i > 0\). Also given is a set \(B = \{1,\ldots ,m\}\) of m bin types. Each bin type \(k \in B\) is characterized by a capacity \(W_k > 0\) and a cost \(C_k\). We henceforth assume, without loss of generality, that \(W_1< \ldots < W_m\). The goal of the VSBPP is to pack the n items into bins such that the sum of the costs of the utilized bins is minimized. Note that there is no restriction on the number of times a bin type may be used.

3.1 Assignment-Type ILP Model of the VSBPP

The VSBPP can be modeled in the following way as an assignment-type integer linear program; see also (Haouari & Serairi, 2009). This model is henceforth denoted by \(\texttt {ILP}_{\textrm{std}}^{\textrm{VSBPP}}\):

$$\begin{aligned} {\textbf {Min}} \quad&\sum \limits _{j=1}^{n} \sum \limits _{k=1}^{m} C_k\, y_{jk} \end{aligned}$$
(1)
$$\begin{aligned} {\textbf {s.t.}} \quad&\sum \limits _{j=1}^{n} x_{ij} = 1&\forall i = 1,\ldots ,n \end{aligned}$$
(2)
$$\begin{aligned} \quad&\sum \limits _{k=1}^{m} y_{jk} \le 1&\forall j = 1,\ldots ,n \end{aligned}$$
(3)
$$\begin{aligned} \quad&\sum \limits _{i=1}^{n} w_i\, x_{ij} \le \sum \limits _{k=1}^{m} W_k\, y_{jk}&\forall j = 1,\ldots ,n \end{aligned}$$
(4)
$$\begin{aligned} \quad&x_{ij} \in \{0,1\}&\forall i,j = 1,\ldots ,n \end{aligned}$$
(5)
$$\begin{aligned} \quad&y_{jk} \in \{0,1\}&\forall j = 1,\ldots ,n,\; k = 1,\ldots ,m \end{aligned}$$
(6)

This ILP model makes use of two sets of binary variables. A setting of \(x_{ij} = 1\), for example, means that item i is placed in bin j. Moreover, a setting of \(y_{jk} = 1\) means that bin j has bin type k. Note also that the number of used bins can safely be limited to n, which is the number of items. Constraints (2) make sure that each item is assigned to exactly one bin. In a similar way, constraints (3) enforce that each used bin has exactly one bin type. In addition, constraints (4) ensure that bin capacities are respected. Finally, note that the VSBPP is NP-hard due to being a generalization of the one-dimensional bin packing problem. For an example of a small VSBPP problem instance, see Fig. 2.

Fig. 2: An example instance with \(n=3\) items and \(m=3\) different bin types. The item weights are provided in (a), while the bin type capacities and costs are indicated in (b). (c) shows a valid solution in which item 3 is assigned to a bin of type 2 (and cost 4), while items 1 and 2 are assigned to a bin of type 3 (and cost 5). Therefore, the solution in (c) has value \(4+5=9\). (d) shows the optimal solution, where item 1 is assigned to a bin of type 1 (and cost 3), while items 2 and 3 are assigned to a bin of type 3 (and cost 5). The cost of the optimal solution is, therefore, \(3+5=8\)

3.2 Literature Review Concerning the VSBPP

We focus on the original version of the VSBPP, where—as mentioned above—the number of bins available per bin type is unlimited. Crainic et al. (2011) derived lower bounds and developed heuristics for the more general version of the VSBPP with explicit limits on the number of bins per bin type. Haouari and Serairi (2009) introduced a range of greedy heuristics, in addition to a genetic algorithm. Finally, Hemmelmayr et al. (2012) proposed a rather sophisticated variable neighborhood search (VNS) algorithm for solving the VSBPP. The results presented in (Hemmelmayr et al., 2012) have not been improved since then. Instead, researchers have focused on the VSBPP with additional constraints. Recent examples include the VSBPP with time windows (Fleszar, 2023) and the VSBPP with conflicts (Ekici, 2023).

3.3 Set-Covering Based ILP Model of the VSBPP

An alternative, set-covering-based ILP model for the VSBPP is obtained as follows. Let \(\mathscr {B}\) be the set of all possible bins with respect to their content, where a bin \(b \in \mathscr {B}\) is defined by the set of items it contains. The weight \(w_b\) of a bin \(b \in \mathscr {B}\) is defined as the sum of the weights of the items assigned to that bin. Moreover, the cost \(c_b\) of a bin \(b \in \mathscr {B}\) is defined as the cost of the lowest-cost bin type able to accommodate all the items assigned to b. Finally, let \(\mathscr {B}_i \subset \mathscr {B}\) be the set of bins that contain item i. After introducing a binary variable \(x_b\) for each bin \(b \in \mathscr {B}\), the set-covering-based ILP model for the VSBPP, henceforth denoted as \(\texttt {ILP}_{\textrm{setcov}}^{\textrm{VSBPP}}\), can be stated as follows.

$$\begin{aligned} {\textbf {Min}} \quad&\sum \limits _{b \in \mathscr {B}} c_b\, x_b \end{aligned}$$
(7)
$$\begin{aligned} {\textbf {s.t.}} \quad&\sum \limits _{b \in \mathscr {B}_i} x_b \ge 1&\forall i = 1,\ldots ,n \end{aligned}$$
(8)
$$\begin{aligned} \quad&x_b \in \{0,1\}&\forall b \in \mathscr {B} \end{aligned}$$
(9)

Note that an exact correspondence between \(\texttt {ILP}_{\textrm{std}}^{\textrm{VSBPP}}\) and \(\texttt {ILP}_{\textrm{setcov}}^{\textrm{VSBPP}}\) would be obtained by replacing the “\(\ge \)” symbol in constraints (8) with the equality symbol (“\(=\)”). However, every optimal solution obtained with the “\(\ge \)” symbol is easily transformed into an optimal solution of the model with the “\(=\)” symbol by removing each item that appears in several bins from all but one of them. Moreover, according to Barnhart et al. (1998), the linear programming relaxation of the model using the “\(\ge \)” symbol is numerically far more stable and thus easier to solve, which facilitates solving the ILP by means of ILP solvers such as CPLEX or Gurobi.

3.4 Application of standard CMSA to the VSBPP

First, we applied CMSA in the following standard way to the VSBPP. This algorithm variant is henceforth labeled Cmsa-Std. As ILP model for solving subinstances, Cmsa-Std uses model \(\texttt {ILP}_{\textrm{std}}^{\textrm{VSBPP}}\) from Sect. 3.1. A generic way of defining the set of solution components is as follows. For each binary variable of the model, exactly two solution components are introduced: one component that corresponds to setting the variable to zero, and another that corresponds to setting the variable to one. In the case of model \(\texttt {ILP}_{\textrm{std}}^{\textrm{VSBPP}}\), this means that C consists of solution components \(cx_{ij}^0\) and \(cx_{ij}^1\) for all binary variables \(x_{ij}\) (\(i,j = 1,\ldots , n\)), and of solution components \(cy_{jk}^0\) and \(cy_{jk}^1\) for all binary variables \(y_{jk}\) (\(j=1,\ldots , n\); \(k=1,\ldots ,m\)). Hereby, \(cx_{ij}^0\), for example, corresponds to \(x_{ij} = 0\), while \(cx_{ij}^1\) corresponds to \(x_{ij} = 1\). Moreover, \(C = \{cx_{11}^0,\ldots ,cx_{nn}^0, cx_{11}^1, \ldots , cx_{nn}^1, cy_{11}^0,\ldots ,cy_{nm}^0, cy_{11}^1, \ldots , cy_{nm}^1\}\) is the complete set of \(2n^2 + 2nm\) solution components. Any valid solution S is a subset of C with \(\vert S \vert = n^2+nm\) because, for each binary variable, a solution S contains exactly one of the two corresponding solution components.

3.4.1 Probabilistic Solution Construction

For the following discussion, a bin b is a set of items, that is, \(b \subseteq \{1, \ldots , n\}\). Moreover, a bin b is always characterized by the following four well-defined measures:

  1. \(b_{\textrm{load}}\): the load of a bin b is defined as the sum of the weights of the items it contains, that is, \(b_{\textrm{load}}:= \sum _{i \in b} w_i\).

  2. \(b_{\textrm{type}}\): the type of a bin b is defined as the lowest-cost bin type that is able to accommodate the load of the bin, that is, \(b_{\textrm{type}}:= k\) such that \(W_k \ge b_{\textrm{load}}\) and \(C_k \le C_r\) for all \(r \in \{1, \ldots , m\}\) with \(W_r \ge b_{\textrm{load}}\).

  3. \(b_{\textrm{cost}}\): the cost of a bin is defined as the cost of its type, that is, \(b_{\textrm{cost}}:= C_{b_{\textrm{type}}}\).

  4. \(b_{\textrm{ratio}}\): the ratio between the cost and the load of a bin, that is, \(b_{\textrm{ratio}}:= \frac{b_{\textrm{cost}}}{b_{\textrm{load}}}\).

In addition, let \(\max _{\textrm{load}}\) be defined as the maximum capacity of all bin types, that is, \(\max _{\textrm{load}}:= \max \{W_j \mid j = 1,\ldots ,m\}\). For the probabilistic construction of a solution, the following simple procedure is applied; see also Algorithm 2. First, the n items are randomly ordered; see line 3. Then, the set of bins B is initialized by placing the first item from the list in a new bin, whose load, type, cost, and ratio are determined as defined above. Then, in the pre-determined order, the remaining items are placed into bins. In particular, among all options for placing an item, options that result in a bin with a lower ratio are probabilistically preferred. This is encapsulated in function ChooseOption(O, \(d_{\textrm{rate}}\), \(l_{\textrm{size}}\)) (see line 16), where O is the current set of options. This function works as follows. First, a number z is chosen uniformly at random from [0, 1]. In case \(z \le d_{\textrm{rate}}\), the chosen option is the one with the lowest ratio. Otherwise, the \(\min \{|O|, l_{\textrm{size}}\}\) options with the lowest ratios are pre-selected from O, and the chosen option is randomly determined among those. When all items are placed into bins, the set of bins is sorted by bin ratio (from small to large); see function Sort(B) in line 22. As a tie-breaking criterion, we utilized the smallest item index of a bin (preferring smaller ones). Finally, the constructed solution is transformed into the corresponding set S of solution components in function ExtractSolutionComponents(B); see line 23. In the case of the set of solution components outlined above, this works as follows. If the first bin (after sorting B) is of type k, then \(cy_{1k}^1\) is added to S. Moreover, all \(cy_{1r}^0\) (for \(r \in \{1, \ldots , m\}\) with \(r \ne k\)) are added to S. The same is done for all other bins in B. Similarly, for each item i in the first bin (after sorting B), component \(cx_{i1}^1\) is added to S. Moreover, all \(cx_{ir}^0\) (with \(r = 2, \ldots , n\)) are added to S. The same is done for the items of all other bins from B.
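A compact C++ sketch of function ChooseOption, under the assumption that each option is stored together with the ratio of the bin that would result from applying it (names are illustrative, not taken from the actual implementation):

```cpp
#include <algorithm>
#include <random>
#include <vector>

struct Option { int placement_id; double resulting_ratio; };

// Returns the index (in the sorted vector O) of the chosen option.
int chooseOption(std::vector<Option>& O, double d_rate, int l_size,
                 std::mt19937& rng) {
    // Sort options by the ratio of the resulting bin, best (lowest) first.
    std::sort(O.begin(), O.end(), [](const Option& a, const Option& b) {
        return a.resulting_ratio < b.resulting_ratio;
    });
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    if (unif(rng) <= d_rate) return 0;  // deterministic choice: best option
    // Otherwise: choose uniformly among the min{|O|, l_size} best options.
    int k = std::min(static_cast<int>(O.size()), l_size);
    std::uniform_int_distribution<int> pick(0, k - 1);
    return pick(rng);
}
```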

Algorithm 2: Probabilistic construction of a valid VSBPP solution

3.4.2 Subinstance generation and solving

In the Solve step, we first generate a reduced problem instance on the basis of \(C'\). This is done by adding the following constraints to model \(\texttt {ILP}_{\textrm{std}}^{\textrm{VSBPP}}\) before applying an ILP solver:

  1. For all \(i,j = 1, \ldots , n\):

     • If \(cx_{ij}^0 \in C'\) and \(cx_{ij}^1 \notin C'\): add constraint \(x_{ij} = 0\) to \(\texttt {ILP}_{\textrm{std}}^{\textrm{VSBPP}}\)

     • If \(cx_{ij}^0 \notin C'\) and \(cx_{ij}^1 \in C'\): add constraint \(x_{ij} = 1\) to \(\texttt {ILP}_{\textrm{std}}^{\textrm{VSBPP}}\)

  2. For all \(j = 1, \ldots , n\) and \(k = 1, \ldots , m\):

     • If \(cy_{jk}^0 \in C'\) and \(cy_{jk}^1 \notin C'\): add constraint \(y_{jk} = 0\) to \(\texttt {ILP}_{\textrm{std}}^{\textrm{VSBPP}}\)

     • If \(cy_{jk}^0 \notin C'\) and \(cy_{jk}^1 \in C'\): add constraint \(y_{jk} = 1\) to \(\texttt {ILP}_{\textrm{std}}^{\textrm{VSBPP}}\)

In other words, whenever subinstance \(C'\) contains only one of the two solution components corresponding to a variable, the value of this variable is fixed accordingly. As a consequence, the more such constraints are added to the original ILP model, the smaller the search space to be explored by the ILP solver for solving the subinstance becomes.
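The fixing rule for the \(x_{ij}\) variables can be sketched as follows in C++ (a hedged illustration: the subinstance is assumed to be stored as a set of (i, j, value) triples, and the returned triples are then turned into constraints via the solver API):

```cpp
#include <set>
#include <tuple>
#include <vector>

// A solution component for variable x_ij set to value v (v in {0,1}).
using XComp = std::tuple<int, int, int>;   // (i, j, v)

// Returns the list of variable fixings implied by subinstance C':
// a triple (i, j, v) means "add constraint x_ij = v to the ILP model".
std::vector<XComp> impliedFixings(const std::set<XComp>& Cprime, int n) {
    std::vector<XComp> fixings;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            bool has0 = Cprime.count({i, j, 0}) > 0;
            bool has1 = Cprime.count({i, j, 1}) > 0;
            if (has0 != has1)              // exactly one component present
                fixings.push_back({i, j, has1 ? 1 : 0});
        }
    return fixings;   // analogous loops handle the y_jk variables
}
```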

3.5 Application of set-covering based CMSA to the VSBPP

The ILP model for solving subinstances in this variant of CMSA—henceforth labeled Cmsa-SetCov—is model \(\texttt {ILP}_{\textrm{setcov}}^{\textrm{VSBPP}}\) from Sect. 3.3. In this case, the complete set of solution components C consists of a component \(c^b\) for each valid bin b from \(\mathscr {B}\) (see Sect. 3.3), that is, \(C:= \{c^b \mid b \in \mathscr {B}\}\). Any subset \(S \subset C\) in which each item \(i \in \{1, \ldots , n\}\) appears in exactly one bin b with \(c^b \in S\) is a valid solution to the tackled VSBPP problem instance.

The probabilistic construction of solutions works exactly in the same way as outlined in Sect. 3.4.1. The only difference lies in the implementation of function ExtractSolutionComponents(B) that assembles the solution components corresponding to a set of bins B. Here, this function simply adds for each \(b \in B\) the corresponding solution component \(c^b\) to S.

The ILP model solved in the Solve step of this CMSA variant is obtained by replacing \(\mathscr {B}\) in model \(\texttt {ILP}_{\textrm{setcov}}^{\textrm{VSBPP}}\) with \(C'\), that is, by replacing the set of all possible valid bins with the set of those bins that form part of the current subinstance \(C'\). The solution S obtained from the ILP solver after \(t_{\textrm{ILP}}\) CPU seconds is then checked for duplicate occurrences of items. If duplicates exist, items are randomly removed from the bins in S until each item appears in exactly one bin of S.
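A sketch of this repair step in C++ (assuming bins are stored as item lists and items are indexed from 0; bin types, costs, and ratios would be recomputed afterwards):

```cpp
#include <algorithm>
#include <random>
#include <vector>

// After solving the set-covering model, an item may appear in several
// bins. Keep it in one randomly chosen bin and erase it from all others.
void removeDuplicates(std::vector<std::vector<int>>& bins, int n,
                      std::mt19937& rng) {
    std::vector<std::vector<int>> occurrences(n);  // item -> bins containing it
    for (int b = 0; b < (int)bins.size(); ++b)
        for (int item : bins[b]) occurrences[item].push_back(b);
    for (int item = 0; item < n; ++item) {
        auto& occ = occurrences[item];
        if (occ.size() <= 1) continue;             // item appears exactly once
        std::uniform_int_distribution<int> pick(0, (int)occ.size() - 1);
        int keep = occ[pick(rng)];
        for (int b : occ)
            if (b != keep)
                bins[b].erase(std::remove(bins[b].begin(), bins[b].end(), item),
                              bins[b].end());
    }
}
```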


3.6 Experimental evaluation

The proposed algorithms were implemented in C++, and the experiments were conducted on a cluster of machines equipped with Intel® Xeon® 5670 CPUs with 12 cores at 2.933 GHz and at least 32 GB of RAM. To solve the respective subinstances within both versions of CMSA, we used CPLEX version 22.1 in one-threaded mode.

3.6.1 Problem instances

The set of problem instances used was introduced in (Haouari & Serairi, 2009). In this set, the number of bin types (m) is 7, and the bin capacities are \(W_1 = 70\), \(W_2 = 100\), \(W_3 = 130\), \(W_4 = 160\), \(W_5 = 190\), \(W_6 = 220\), and \(W_7 = 250\). Moreover, item weights are randomly drawn from [1, 250]. The benchmark set is composed of three different classes of instances: class B1 is characterized by a linear bin cost function \(C_i = W_i\) (\(i=1,\ldots ,7\)), class B2 by a concave cost function \(C_i = \lceil 10\sqrt{W_i} \rceil \) (\(i=1,\ldots ,7\)), and class B3 by a convex cost function \(C_i = \lceil 0.1 {W_i}^{3/2} \rceil \) (\(i=1,\ldots ,7\)). There are 10 problem instances for each combination of \(n \in \{100, 200, 500, 1000, 2000\}\) (number of items) and bin cost function class, which makes a total of 150 problem instances. Optimal solutions to these instances are not known.
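To make the three cost functions concrete, the following C++ snippet transcribes them directly; for example, a bin of capacity \(W = 250\) costs 250 in class B1, \(\lceil 10\sqrt{250} \rceil = 159\) in class B2, and \(\lceil 0.1 \cdot 250^{3/2} \rceil = 396\) in class B3.

```cpp
#include <cmath>

// Bin type costs for the three instance classes (a direct transcription
// of the cost functions given above; W is the capacity of a bin type).
int costB1(int W) { return W; }                                      // linear
int costB2(int W) { return (int)std::ceil(10.0 * std::sqrt(W)); }    // concave
int costB3(int W) { return (int)std::ceil(0.1 * std::pow(W, 1.5)); } // convex
```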

3.6.2 Parameter tuning

Both Cmsa-Std and Cmsa-SetCov require well-working parameter values. In fact, the same five parameters, shown in the first column of Table 1, must be tuned in both cases. The domains that we allowed for these parameters are indicated in the second column of the same table. Parameter tuning was conducted with the irace tool (López-Ibáñez et al., 2016). Each algorithm variant was tuned separately for each instance class, with a budget of 2000 runs. Moreover, each run was limited to 150 CPU seconds, as in (Hemmelmayr et al., 2012). The obtained parameter values, as shown in Table 1, were used for the final experimentation.

Table 1 Parameter settings for Cmsa-Std and Cmsa-SetCov for the three classes of instances

3.6.3 Numerical results

Table 2 Results for the 50 instances of class B1 (linear cost function)
Table 3 Results for the 50 instances of class B2 (concave cost function)
Table 4 Results for the 50 instances of class B3 (convex cost function)

The results of the current state-of-the-art technique (VNS) from (Hemmelmayr et al., 2012), Cmsa-Std, and Cmsa-SetCov are provided in Table 2 (instances of class B1), Table 3 (instances of class B2), and Table 4 (instances of class B3). The first two columns of these tables indicate the number of items (n) and the instance number (#), respectively. For each problem instance (table row), the results of the three algorithms are provided in terms of the best solution found over 10 runs, the average of the best solutions from the 10 runs, and the average times at which these solutions were found within the time limit of 150 CPU seconds per run. The following observations can be made:

  • First, Cmsa-SetCov clearly outperforms both Cmsa-Std and VNS. In only two out of 150 cases is Cmsa-Std able to find a solution of the same quality as the one found by Cmsa-SetCov. Moreover, while Cmsa-SetCov and VNS perform comparably on rather small problem instances with \(n \in \{100, 200\}\) items, Cmsa-SetCov clearly outperforms VNS on larger problem instances (especially for \(n \in \{1000, 2000\}\)).

  • Cmsa-SetCov is able to find new best-known solutions in 68 out of 150 cases. In particular, Cmsa-SetCov finds 27 new best-known solutions in the case of the 50 B1 instances, 26 new best-known solutions in the case of the 50 B2 instances, and 15 new best-known solutions in the case of the 50 B3 instances.

  • In only 7 out of 150 cases is the best solution found by Cmsa-SetCov slightly worse than the one found by VNS.

In order to test the statistical significance of these results, we produced so-called critical difference (CD) plots with the scmamp toolkit (Calvo & Santafe, 2016) of the R statistical language, which implements the procedure recommended in García and Herrera (2008) for the comparison of multiple algorithms over multiple problem instances. The horizontal axis of such a CD plot shows the range of algorithm ranks, while each vertical line represents the average rank of the corresponding algorithm for the considered instance group. Bold horizontal lines connecting algorithm markers indicate that the corresponding algorithms perform statistically equivalently, that is, the difference between their average ranks is smaller than the critical difference at significance level 0.05. Note also that, in the context of these plots, we added the results of the GA approach from (Crainic et al., 2011). The plot in Fig. 3a shows that, from a global point of view, Cmsa-SetCov outperforms all other algorithms with statistical significance. When considering the instances of the three classes separately, no statistically significant difference between Cmsa-SetCov and VNS can be detected for the B3 class (see Fig. 3d).

Fig. 3: Critical difference (CD) plots showing the statistical significance of the results

Fig. 4: Radar charts concerning the comparison of the two ILP models applied to a B1 instance with 100 items (see (a) and (b)), and to a problem instance with 2000 items (see (c) and (d))

3.6.4 Performance Difference Between the two VSBPP ILP Models

Finally, we aim to show why Cmsa-SetCov outperforms Cmsa-Std so clearly. For this purpose, we generated subinstances of different sizes, translated them into both model \(\texttt {ILP}_{\textrm{std}}^{\textrm{VSBPP}}\) and model \(\texttt {ILP}_{\textrm{setcov}}^{\textrm{VSBPP}}\), and solved them with CPLEX. In particular, we generated 10 subinstances by probabilistically constructing five (respectively 20) solutions and merging their solution components. This was done for one B1 instance with \(n=100\) items and for another B1 instance with \(n=2000\) items. Figure 4 shows radar charts that present the obtained results in the four different cases. Each radar chart provides four different measures, averaged over 10 runs: (1) the number of variables in the models of the subinstances (top), (2) the relative MIP gap after termination of CPLEX (right), (3) the computation time required by CPLEX (bottom), and (4) the absolute improvement of the result of solving the subinstance over the best individual solution used to generate the subinstance (left). Note that the time limit for CPLEX was set to 20 CPU seconds in all cases. In this context, a model is promising if the improvement (left) is large, while the number of variables (top), the relative MIP gap (right), and the required time (bottom) are low. The four radar plots indicate that this is indeed the case for model \(\texttt {ILP}_{\textrm{setcov}}^{\textrm{VSBPP}}\), while the opposite holds for model \(\texttt {ILP}_{\textrm{std}}^{\textrm{VSBPP}}\). Moreover, the differences become more pronounced with growing problem instance size.

4 Application to an electric vehicle routing problem

As mentioned before, another large family of problems for which standard ILP models can be replaced by set-covering-based ILP models is vehicle routing. In recent years, due to environmental concerns, many researchers have focused on vehicle routing problems (VRPs) concerned with electric vehicles, so-called electric vehicle routing problems (EVRPs). In this section, with the aim of presenting a second example of the benefits of using set-covering-based ILP models within CMSA algorithms, we consider the so-called electric vehicle routing problem with time windows and simultaneous pickups and deliveries (EVRP-TW-SPD). In preliminary work (Akbay et al., 2023), we already presented the application of a CMSA variant called Adapt-CMSA based on the standard assignment-type ILP model of the problem. In this section, we present an improved algorithm version together with the variant that makes use of the set-covering-based ILP model.

Fig. 5: Illustration of an EVRP instance and its solution. (a) presents a map showing the locations of a depot, five customers, and three charging stations based on Cartesian coordinates. Gray dashed lines indicate a fully connected graph connecting each pair of nodes. (b) shows a valid solution to the given instance on the same map, with two distinct tours represented by arrows of different colors. Both routes begin and end at the depot, passing through various customers and charging stations

The standard ILP model of the EVRP-TW-SPD is an extension of the model for the EVRP-TW-PR problem proposed in (Keskin & Çatay, 2016), which—in turn—is a modified variant of the model for the EVRP-TW problem proposed in (Schneider et al., 2014). In particular, we extend the model further in order to consider SPD constraints. When dealing with SPD constraints in the context of vehicle routing problems, it is important to note that each customer’s demand may consist of two distinct requirements: (1) delivering goods to the demand point, known as the “delivery demand”, and (2) collecting goods from the demand point, known as the “pickup demand”. It is necessary to satisfy both demands simultaneously when a vehicle visits a particular customer.

We adopt the same notation as in the previous works to maintain consistency with the existing literature. In particular, the EVRP-TW-SPD problem involves a set of N customers, denoted by \(V=\{1,\ldots ,N\}\), and a set of charging stations, denoted by F.

As we will model the EVRP-TW-SPD by means of a two-index ILP model, which allows each node (customer or charging station) to be visited at most once, we enable multiple visits (possibly by different vehicles) to each charging station by defining a set \(F^{'}\) that contains multiple copies of each charging station from F. Note that the utilization of such dummy nodes in the context of two-index ILP models is well known from the related literature (Bard et al., 1998; Erdoğan & Miller-Hooks, 2012). However, determining the appropriate number of copies for each charging station is non-trivial. Insufficient replication, which restricts the number of permissible visits to the same charging station, may exclude an optimal solution from the search space. Conversely, an excessive number of replicas could inflate the model size, leading to prolonged execution times for ILP solvers. The way in which the number of charging station copies was determined for each problem instance is described in Sect. 4.6.3.

The depot is represented by nodes 0 and \(N+1\), where node 0 is the starting point and node \(N+1\) is the ending point for each route. Note that both 0 and \(N+1\) refer to the same depot. The set \(V^{'}=V \cup F^{'}\) contains all customers and dummy charging stations. Moreover, sub-indexes 0, \(N+1\), or both, added to a node-set, indicate the inclusion of the respective depot instances to that set. For example, \(V_0\) is the node set that contains all customers and the instance 0 of the depot. Based on the above notations, we define the following sets:

  1. \(F^{'}_0:= F^{'}\cup \{0\}\)

  2. \(V^{'}_0:= V^{'}\cup \{0\}\)

  3. \(V^{'}_{N+1}:= V^{'}\cup \{N+1\}\)

  4. \(V^{'}_{0,N+1}:= V^{'}\cup \{0\}\cup \{N+1\}\)

Using these sets and notations, the EVRP-TW-SPD can be defined on a complete, directed graph \(G(V^{'}_{0,N+1},A)\), where \(A=\{(i,j)\mid i,j\in V^{'}_{0,N+1},i\ne j\}\) is the set of arcs, and each arc has a corresponding distance \(d_{ij}\) and travel time \(t_{ij}\). Note that both distances and travel times are symmetric, that is, it always holds that \(d_{ij} = d_{ji}\) and \(t_{ij} = t_{ji}\). The energy consumed per unit distance traveled by an electric vehicle (EV) is denoted by a constant h. A fleet of electric vehicles with identical loading capacity \(EV_{cap}\) and battery capacity \(B_{cap}\) is stationed at a depot to simultaneously satisfy the delivery demand \(q_i > 0\) and pickup demand \(p_i > 0\) of each customer i. Each vertex \(i\in V^{'}_{0, N+1}\) may only be visited within a time window \([e_i, l_i]\) that indicates the earliest and latest possible visiting times. Moreover, each customer \(i\in V\) has a service time \(s_i\), which refers to the time an electric vehicle spends visiting the customer. When an EV visits a charging station, its battery is charged as determined by a parameter \(g > 0\), which indicates the units of time required to charge one unit of energy. Note that this way of fixing the charging speed is very common in EVRPs; see, for example, (Schneider et al., 2014; Keskin & Çatay, 2016).
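For concreteness, the instance data just described could be represented as follows (a C++ sketch with illustrative names, not the data structures of the actual implementation):

```cpp
#include <vector>

struct Node {
    double x, y;       // Cartesian coordinates
    double q, p;       // delivery and pickup demand (0 for depot and stations)
    double e, l;       // time window [e, l]
    double s;          // service time (0 for depot and stations)
};

struct Instance {
    int N;                                   // number of customers (1..N)
    std::vector<Node> nodes;                 // depot copies 0 and N+1, customers,
                                             // and charging station copies (F')
    std::vector<std::vector<double>> d, t;   // symmetric distances / travel times
    double EV_cap, B_cap;                    // loading and battery capacity
    double h;                                // energy consumed per unit distance
    double g;                                // time to recharge one unit of energy
};
```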

The following decision variables are used to formulate the problem’s ILP model. The binary decision variable \(x_{ij}\) takes value 1 if arc \(a_{ij}\) is included in a route, and 0 otherwise. The start of service at each customer visited by an electric vehicle is tracked by the decision variable \(\tau _i\). Moreover, to keep track of the battery’s state of charge upon arrival at and departure from each vertex \(i \in V^{'}_{0,N+1}\), the decision variables \(y_i\) and \(Y_i\) are employed, respectively. Furthermore, the remaining cargo to be delivered to the customers of the route and the amount of cargo already collected (picked up) at previously visited customers are represented by the variables \(u_{ij}\) and \(v_{ij}\), respectively. Model \(\texttt {ILP}_{\textrm{std}}^{\textrm{EVRP}}\) can then be stated as follows.

$$\begin{aligned} {\textbf {Min}} \quad&\sum \limits _{i \in V^{'}_{0},j \in V^{'}_{N+1} } { d_{ij} x_{ij}}+\sum \limits _{j \in V^{'}_{N+1} } M{x_{0j}} \end{aligned}$$
(10)
$$\begin{aligned} {\textbf {s.t.}} \quad&\sum \limits _{j \in V^{'}_{N+1}, i \ne j} {x_{ij}}= 1&\forall i \in V \end{aligned}$$
(11)
$$\begin{aligned} \quad&\sum \limits _{j \in V^{'}_{N+1}, i \ne j}{x_{ij}} \le 1&\forall i \in F^{'} \end{aligned}$$
(12)
$$\begin{aligned} \quad&\sum \limits _{i \in V^{'}_{0}, i \ne j}{x_{ij}}-\sum \limits _{i \in V^{'}_{N+1}, i \ne j}{x_{ji}} = 0&\forall j \in V^{'} \end{aligned}$$
(13)
$$\begin{aligned} \quad&\tau _{i}+(t_{ij}+s_{i})x_{ij}-l_{0}(1-x_{ij})\le \tau _{j}&\forall i \in V_{0}, j \in V^{'}_{N+1}, i\ne j \end{aligned}$$
(14)
$$\begin{aligned} \quad&\tau _{i}+t_{ij}x_{ij}+g(Y_{i}-y_{i})-(l_{0}+gB_{cap})(1-x_{ij}) \le \tau _{j}&\forall i \in F^{'}, \forall j \in V^{'}_{N+1}, i\ne j \end{aligned}$$
(15)
$$\begin{aligned} \quad&e_{j} \le \tau _{j} \le l_{j}&\forall j \in V^{'}_{0,N+1} \end{aligned}$$
(16)
$$\begin{aligned} \quad&0 \le u_{0j} \le EV_{cap}&\forall j \in V^{'}_{N+1} \end{aligned}$$
(17)
$$\begin{aligned} \quad&v_{0j} = 0&\forall j \in V^{'}_{N+1} \end{aligned}$$
(18)
$$\begin{aligned} \quad&\sum \limits _{i \in V^{'}_{0}, i \ne j}{u_{ij}}-\sum \limits _{i \in V^{'}_{N+1}, i \ne j}{u_{ji}} = q_j&\forall j \in V^{'} \end{aligned}$$
(19)
$$\begin{aligned} \quad&\sum \limits _{i \in V^{'}_{N+1}, i \ne j}{v_{ji}}-\sum \limits _{i \in V^{'}_{0}, i \ne j}{v_{ij}} = p_j&\forall j \in V^{'} \end{aligned}$$
(20)
$$\begin{aligned} \quad&u_{ij}+v_{ij} \le EV_{cap} x_{ij}&\forall i \in V^{'}_{0}, j \in V^{'}_{N+1}, i \ne j \end{aligned}$$
(21)
$$\begin{aligned} \quad&0 \le y_{j} \le y_{i}-(hd_{ij})x_{ij}+B_{cap}(1-x_{ij})&\forall i \in V, \forall j \in V^{'}_{N+1}, i\ne j \end{aligned}$$
(22)
$$\begin{aligned} \quad&0 \le y_{j} \le Y_{i}-(hd_{ij})x_{ij}+B_{cap}(1-x_{ij})&\forall i \in F^{'}_{0}, \forall j \in V^{'}_{N+1}, i\ne j \end{aligned}$$
(23)
$$\begin{aligned} \quad&y_{i} \le Y_{i} \le B_{cap}&\forall i \in F^{'}_{0} \end{aligned}$$
(24)
$$\begin{aligned} \quad&x_{ij} \in \{0,1\}&\forall i \in V^{'}_{0}, j \in V^{'}_{N+1}, i\ne j \end{aligned}$$
(25)

The distance-based objective function from (Keskin & Çatay, 2016) is extended in order to prioritize solutions that utilize fewer vehicles, even if the total distance traveled is greater than in other solutions. This is done by introducing an additional cost parameter \(M > 0\) per utilized vehicle. Note that the number of vehicles used in a solution corresponds to the number of variables on outgoing arcs of the depot (0) that take value 1. Accordingly, the objective function (10) minimizes the total travel and vehicle cost. Constraints (11) ensure that each customer is visited by an electric vehicle, while constraints (12) allow vehicles to visit a charging station only when required. Constraints (13) guarantee that each vehicle that visits a particular node also departs from that node. The arrival and departure times are calculated using constraints (14) and (15), which consider the service and battery charging times. Constraints (16) require vehicles to visit each node within the corresponding time window. At the same time, constraints (14)–(16) prevent sub-tours. Constraints (17)–(21) ensure that the delivery and pickup demands of the customers are simultaneously met. Finally, constraints (22)–(24) concern the battery state of charge. For an example instance together with a solution, see Fig. 5.

4.1 Literature Review Concerning the EVRP-TW-SPD

In response to mounting environmental concerns and the consequent need for alternative fuel sources in logistics, recent research has focused on developing routing strategies that optimize the transportation of goods while also considering the limited driving range and en-route charging necessities associated with electric vehicles. These problems are commonly referred to as EVRPs or, more broadly, Green Vehicle Routing Problems. Recent, comprehensive surveys of research on EVRPs can be found in (Asghari & Mirzapour Al-e-Hashem, 2021; Moghdani et al., 2021). As the main focus of our work is on the methodology for solving certain types of problems instead of specific problem variants, we refer the interested reader to these survey papers for further reference.

Instead, we point out the differences between the EVRP-TW-SPD and existing problems from the literature. Apart from time window constraints, our problem also considers simultaneous pickup and delivery (SPD) constraints regarding customer deliveries. This practice is commonly associated with reverse logistics. However, despite the pivotal role of reverse logistics in promoting sustainability, the number of publications that examine variants of the EVRP-SPD is very limited. So far, only Yilmaz and Kalayci (2022) have considered SPD constraints within the scope of EVRPs. Finally, in conventional EVRP models, electric vehicle batteries are assumed to be fully charged upon visiting a charging station. In contrast, our problem considers a more realistic scenario by allowing partial recharging.

4.2 Set-Covering Based ILP Model of the EVRP-TW-SPD

Assignment-type ILP models such as the one presented above for the EVRP-TW-SPD generally do not allow the derivation of good lower bounds (see, for example, (Angelelli & Mansini, 2002)). In addition, experiments reported in (Akbay et al., 2023) showed that it is difficult for CPLEX to find any feasible solution to the corresponding model within reasonable execution times, even in the context of small-sized subinstances of the original problem instances.

In a way similar to the one presented for the VSBPP in Sect. 3.3, the EVRP-TW-SPD can be modeled in terms of a set-covering-based ILP in the following way. Let \(\mathscr {T}\) be the set of all possible (and feasible) tours, where a tour is defined as the trip of one single vehicle returning to the depot from which it originally left. Each tour \(T_r \in \mathscr {T}\) is evaluated by the total distance traveled \(d_r\), that is, the sum of the distances of all arcs on the tour. Finally, let \(\mathscr {T}_i \subset \mathscr {T}\) be the set of tours that serve customer \(i \in V\). After introducing a binary variable \(x_r\) for each tour \(T_r \in \mathscr {T}\), the set-covering-based ILP model for the EVRP-TW-SPD, henceforth denoted as \(\texttt {ILP}_{\textrm{setcov}}^{\textrm{EVRP}}\), can be stated as follows.

$$\begin{aligned} {\textbf {Min}} \quad&\sum \limits _{T_r \in \mathscr {T}} (d_r + M)\, x_r \end{aligned}$$
(26)
$$\begin{aligned} {\textbf {s.t.}} \quad&\sum \limits _{T_r \in \mathscr {T}_i} x_r \ge 1&\forall i \in V \end{aligned}$$
(27)
$$\begin{aligned} \quad&x_r \in \{0,1\}&\forall T_r \in \mathscr {T} \end{aligned}$$
(28)

The objective function (26) minimizes the total travel and vehicle costs, and constraints (27) ensure that each customer is visited at least once. Note that the set-covering-based formulation is generally used as a post-optimization method in the VRP literature. In (Rochat & Taillard, 1995), for example, the authors suggest aggregating, after the termination of their algorithm, all tours from the solutions it generated into a set. They then attempt to find an even better solution by solving the set-covering model defined by the tours in this set. In contrast, our results will show that CMSA provides a suitable algorithmic framework for iteratively applying heuristic and exact components.

4.3 Application of standard Adapt-CMSA to the EVRP-TW-SPD

In the case of the EVRP-TW-SPD problem, we apply the Adapt-CMSA variant from (Akbay et al., 2022). In particular, we first develop an Adapt-CMSA version for the EVRP-TW-SPD based on the assignment-type ILP model, that is, model \(\texttt {ILP}_{\textrm{std}}^{\textrm{EVRP}}\). This version of Adapt-CMSA is henceforth labeled Adapt-Cmsa-Std. In the context of Adapt-Cmsa-Std, the complete set of solution components consists of a component \(c_{ij}\) for each arc \(a_{ij}\) from \(A=\{(i,j)\mid i,j\in V^{'}_{0,N+1},i\ne j\}\). Consider the following example. The vector \(\textbf{I} = (0, 1, \ldots , 9)\) comprises all the node indexes for a small problem instance involving three charging stations and five customers. Nodes indexed with 0 and 6 denote the depot, nodes 1–5 are the customers, and nodes 7–9 are the charging stations.

Now consider a solution consisting of two tours \(T_1 = (0, 9, 1, 4, 6)\) and \(T_2 = (0, 2, 8, 3, 7, 5, 6)\). In the context of Adapt-Cmsa-Std, this solution is represented by \(S=\{c_{0,9}, c_{9,1}, c_{1,4}, c_{4,6}, c_{0,2}, c_{2,8}, c_{8,3}, c_{3,7}, c_{7,5}, c_{5,6}\}\), that is, a solution S in Adapt-Cmsa-Std is stored as the set of solution components representing the arcs used in any of its tours.

4.4 The Adapt-Cmsa-Std Algorithm

The pseudo-code presented in Algorithm 3 is common to Adapt-Cmsa-Std and Adapt-Cmsa-SetCov. It describes the general algorithmic framework of Adapt-CMSA for the EVRP-TW-SPD. First, a feasible solution is generated by calling function GenerateGreedySolution() to initialize the best-so-far solution \(S^{\textrm{bsf}}\). More precisely, this function applies an insertion heuristic, which is further explained in Sect. 4.4.1. Following that, in lines 4 and 5, parameters \(\alpha _{\textrm{bsf}}\), \(n_a\), and \(l_{\textrm{size}}\) are given initial values. How these parameters are managed within the algorithm is explained below.

Algorithm 3: Pseudo-code of Adapt-CMSA for the EVRP-TW-SPD

At each iteration of Adapt-Cmsa-Std, a subinstance \(C'\) of the original problem instance is created. As with the solution representation, a subinstance is a set of solution components, that is, \(C' \subseteq C\); it is initialized with the components of the best solution found so far (\(S^{\textrm{bsf}}\)) at the beginning of each iteration. Subsequently, the construction process shown in lines 8–12 probabilistically generates \(n_a\) solutions using function ProbabilisticSolutionConstruction(\(S^{\textrm{bsf}}\), \(\alpha _{\textrm{bsf}}\), \(l_{size}\)). Apart from \(S^{\textrm{bsf}}\), this function takes two additional parameters: \(\alpha _{\textrm{bsf}}\) (\(0 \le \alpha _{\textrm{bsf}}< 1\)), which biases the creation of new solutions towards the best-so-far solution, and \(l_{size}\), which determines the number of options considered at each solution construction step. Note that higher values of \(\alpha _{\textrm{bsf}}\) increase the similarity between the constructed solutions and \(S^{\textrm{bsf}}\). Conversely, a higher value of \(l_{size}\) results in the construction of more diverse solutions and, consequently, contributes to forming a larger subinstance.

After constructing a solution S by calling the above-mentioned function in line 9 of Algorithm 3, each tour of S undergoes local search, as indicated in line 10. This local search procedure applies the well-known intra-route operators relocation, swap, and two_opt in sequential order, each with a best-improvement strategy. The relocation operator sequentially extracts each node from its current position within a route and repositions it at an alternative location inside the same route. The swap operator interchanges the positions of a pair of selected nodes belonging to the same route. Lastly, the two_opt neighborhood considers every feasible choice of two non-adjacent nodes in the same route and reverses the order of the nodes situated between them.
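As an illustration, a best-improvement two_opt pass over a single route might look as follows in C++ (a sketch: the route includes the depot at both ends, distances are symmetric as stated in the problem description, and the re-checks of time window and battery feasibility after a reversal are omitted):

```cpp
#include <algorithm>
#include <vector>

// One best-improvement two_opt pass; returns true if the route was improved.
bool twoOptBestImprovement(std::vector<int>& route,
                           const std::vector<std::vector<double>>& dist) {
    double bestDelta = 0.0;
    int bestI = -1, bestJ = -1;
    for (int i = 1; i + 1 < (int)route.size() - 1; ++i)
        for (int j = i + 1; j < (int)route.size() - 1; ++j) {
            // Reversing route[i..j] replaces arcs (i-1,i) and (j,j+1)
            // by (i-1,j) and (i,j+1); delta is valid for symmetric distances.
            double delta = dist[route[i-1]][route[j]] + dist[route[i]][route[j+1]]
                         - dist[route[i-1]][route[i]] - dist[route[j]][route[j+1]];
            if (delta < bestDelta) { bestDelta = delta; bestI = i; bestJ = j; }
        }
    if (bestI < 0) return false;                 // no improving move found
    std::reverse(route.begin() + bestI, route.begin() + bestJ + 1);
    return true;
}
```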

After the application of local search, the Merge step is executed in line 11 in the same way as in any other CMSA algorithm. After probabilistically constructing \({n_{\textrm{a}}}\) solutions and forming the subinstance \(C'\), the subinstance is solved by first generating a corresponding ILP model based on model \(\texttt {ILP}_{\textrm{std}}^{\textrm{EVRP}}\) and then solving this model with CPLEX, with a CPU time limit of \(t_{\textrm{ILP}}\) seconds, in function SolveSubinstance(\(C'\), \(t_{\textrm{ILP}}\)). In order to generate this model, the following constraints are added to \(\texttt {ILP}_{\textrm{std}}^{\textrm{EVRP}}\):

$$\begin{aligned} \quad x_{ij} = 0 \quad \text {for all } c_{ij} \in C \setminus C' \end{aligned}$$
(29)

In other words, if an arc \(a_{ij}\) has not been used in any of the solutions that were merged into \(C'\), its use is forbidden by fixing the value of \(x_{ij}\) to zero. It is important to note that incorporating more constraints into the original ILP reduces the search space of the resulting ILP model, thereby making it easier for CPLEX to find a high-quality solution, or even an optimal one, for the corresponding subinstance. However, note that—due to the CPU time limit employed for each application of CPLEX—the output of function SolveSubinstance(\(C'\), \(t_{\textrm{ILP}}\)), denoted as \(S^{\textrm{ILP}}\), is not necessarily an optimal solution to the subinstance. In any case, \(S^{\textrm{ILP}}\) is subject to the application of a local search method different from the one described before. In particular, this local search procedure utilizes the inter-tour neighborhoods exchange (1,1) and shift (1,0). The exchange (1,1) neighborhood investigates all potential swaps of two customers that are not part of the same tour, whereas the shift (1,0) neighborhood examines every option of removing a customer from its current tour and placing it at any possible position in the other tours. As in LocalSearch1(S), the operators used by LocalSearch2(S) employ the best-improvement strategy.

The self-adaptive nature of Adapt-Cmsa-Std lies in the dynamic adjustment of the values of parameters \(\alpha _{\textrm{bsf}}\), \({n_{\textrm{a}}}\), and \(l_{\textrm{size}}\). The value of the dynamic parameter \(\alpha _{\textrm{bsf}}\) is bounded from below by \(\alpha ^{\textrm{LB}}\) and from above by \(\alpha ^{\textrm{UB}}\), both of which are input parameters of the algorithm. Moreover, a step size parameter \(\alpha _{\textrm{red}}\) is employed for systematically reducing the value of \(\alpha _{\textrm{bsf}}\) when needed. Initially, \(\alpha _{\textrm{bsf}}\) is set to the highest possible value, \(\alpha ^{\textrm{UB}}\), as shown in line 4. If the resulting ILP is solved within a computation time \(t_{\textrm{solve}}\) that is below a proportion \(t_{\textrm{prop}}\) of the maximum possible computation time \(t_{\textrm{ILP}}\)—that is, if \(t_{\textrm{solve}}\le t_{\textrm{prop}}\cdot t_{\textrm{ILP}}\)—the value of \(\alpha _{\textrm{bsf}}\) is reduced by \(\alpha _{\textrm{red}}\), as seen in line 15. The reasoning behind this step is as follows. If the ILP for the respective subinstance can easily be solved to optimality, the search space is too small, owing to a relatively low number of free variables. To increase the number of free variables in the ILP, the solutions produced by ProbabilisticSolutionConstruction(\(S^{\textrm{bsf}}\), \(\alpha _{\textrm{bsf}}\), \(l_{size}\)) should differ more from \(S^{\textrm{bsf}}\), which is achieved by lowering the value of \(\alpha _{\textrm{bsf}}\).

The adjustment of parameters \({n_{\textrm{a}}}\) and \(l_{\textrm{size}}\) follows a similar scheme. Their initial values are set as follows: \({n_{\textrm{a}}}:= n^{\textrm{init}}\) and \(l_{\textrm{size}}:= l_{\textrm{size}}^{\textrm{init}}\), which is done in function Initialize(\({n_{\textrm{a}}}\), \(l_{\textrm{size}}\)). This function is called under three distinct circumstances: (1) at the beginning of the algorithm (line 5), (2) when solution \(S^{\textrm{ILP}}\) is strictly better than \(S^{\textrm{bsf}}\) (line 18), and (3) when solution \(S^{\textrm{ILP}}\) is strictly worse than \(S^{\textrm{bsf}}\) while \({n_{\textrm{a}}}\) concurrently exceeds \(n^{\textrm{init}}\) (line 21). On the other hand, when \(S^{\textrm{ILP}}\) and \(S^{\textrm{bsf}}\) have the same objective function value, the algorithm can afford to create larger subinstances, leading to an increase of the parameter values in function Increment(\({n_{\textrm{a}}}, l_{\textrm{size}}\)). Specifically, \({n_{\textrm{a}}}\) is increased by \(n^{\textrm{inc}}\), and \(l_{\textrm{size}}\) is increased by \(l_{\textrm{size}}^{\textrm{inc}}\).
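The parameter update logic just described can be summarized in the following C++ sketch (an assumption of how the logic around lines 15–21 of Algorithm 3 may be organized; variable names mirror the paper's parameters):

```cpp
#include <algorithm>

struct SelfAdaptiveParams { double alpha_bsf; int n_a; int l_size; };

// f_ilp / f_bsf: objective values of S_ILP and S_bsf (minimization).
// Only the reduction rule for alpha_bsf described above is shown.
void adaptParameters(SelfAdaptiveParams& p, double f_ilp, double f_bsf,
                     double t_solve, double t_prop, double t_ilp,
                     double alpha_lb, double alpha_red,
                     int n_init, int n_inc, int l_init, int l_inc) {
    // Subinstance solved quickly -> search space too small -> reduce alpha_bsf
    // so that constructed solutions differ more from S_bsf.
    if (t_solve <= t_prop * t_ilp)
        p.alpha_bsf = std::max(alpha_lb, p.alpha_bsf - alpha_red);
    if (f_ilp < f_bsf) {                           // strict improvement: reset
        p.n_a = n_init; p.l_size = l_init;
    } else if (f_ilp > f_bsf && p.n_a > n_init) {  // worse and already enlarged
        p.n_a = n_init; p.l_size = l_init;
    } else if (f_ilp == f_bsf) {                   // tie: allow larger subinstances
        p.n_a += n_inc; p.l_size += l_inc;
    }
}
```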

4.4.1 Probabilistic Solution Construction

A call to function ProbabilisticSolutionConstruction(\(S^{\textrm{bsf}}\), \(\alpha _{\textrm{bsf}}\), \(l_{size}\)) invokes the execution of one of two heuristics: either (1) a version of the Clarke & Wright (C&W) savings algorithm (Clarke & Wright, 1964), or (2) a variant of the insertion algorithm. The choice between the two heuristics is made uniformly at random. Both heuristics exclusively generate feasible solutions. In the following, both construction algorithms and their variants are described in detail.

Probabilistic C&W savings algorithm. Similar to the original C&W approach, our algorithm variant begins by generating a set of direct routes, one per customer, that is, routes of the form \((0, i, N+1)\) for \(i \in V\). Next, the algorithm initializes a savings list L consisting of pairs of nodes (i, j), where i and j represent customers and charging stations. The savings value \(\sigma _{ij}\) for each pair is computed using the following equation:

$$\begin{aligned} \sigma _{ij} := d_{0i}+d_{0j}-\lambda d_{ij} + \mu |d_{0i}-d_{0j}| \end{aligned}$$
(30)

Here, \(\lambda \) and \(\mu \) are the so-called route shape and asymmetry scaling parameters, respectively. The route shape parameter \(\lambda \) prioritizes the selection of nodes based on their distance from each other (Yellow, 1970); parameter \(\mu \), on the other hand, scales the asymmetry between nodes i and j (Paessens, 1988). Well-working values for these parameters are identified through the parameter tuning procedure described in Sect. 4.6.2. It is important to note that L only includes pairs of nodes (i, j) that satisfy two conditions: (1) i and j are part of two different tours, and (2) both i and j are adjacent to the depot in the tour of which they form part. Moreover, solution construction is influenced not only by the savings values of node pairs (i, j) but also by whether or not arc \(a_{ij}\) appears in the current best-so-far solution \(S^{\textrm{bsf}}\). For this purpose, an additional value, \(q_{ij}\), is calculated for each entry \((i,j) \in L\):

$$\begin{aligned} q_{ij} := {\left\{ \begin{array}{ll} (\sigma _{ij} + 1) \cdot \alpha _{\textrm{bsf}} & \quad \text {if } c_{ij} \in S^{\textrm{bsf}}\\ (\sigma _{ij} + 1) \cdot (1 - \alpha _{\textrm{bsf}}) & \quad \text {otherwise} \end{array}\right. } \end{aligned}$$
(31)
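
For concreteness, the following C++ fragment shows how Eqs. (30) and (31) can be evaluated. It is a sketch under assumptions: the distance matrix `d` (with index 0 representing the depot), the membership flag `arcInBsf`, and the function names are ours.

```cpp
#include <cmath>
#include <vector>

// Savings value of the node pair (i, j) according to Eq. (30).
double savings(const std::vector<std::vector<double>>& d,
               int i, int j, double lambda, double mu) {
    return d[0][i] + d[0][j] - lambda * d[i][j]
           + mu * std::fabs(d[0][i] - d[0][j]);
}

// Biased value q_ij according to Eq. (31): arcs of the best-so-far
// solution are weighted by alpha_bsf, all others by (1 - alpha_bsf).
double biasedValue(double sigma_ij, bool arcInBsf, double alpha_bsf) {
    return (sigma_ij + 1.0) * (arcInBsf ? alpha_bsf : 1.0 - alpha_bsf);
}
```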

The algorithm performs the following sequence of steps until the savings list L is empty.

1. After computing \(q_{ij}\) for all entries in L, the list is sorted in non-increasing order with respect to the \(q_{ij}\) values, and a reduced list \(L_r\) is created, containing the first \(l_{\textrm{size}}\) elements of L.

2. Next, an entry (i, j) is chosen from \(L_r\) with respect to the following probabilities (see also the code sketch further below):

$$\begin{aligned} \textbf{p}(i,j) := \frac{q_{ij}}{\sum _{(i',j') \in L_{r}} q_{i'j'}} \quad \forall \; (i,j) \in L_{r} \end{aligned}$$
(32)

Note that the higher the value of \(\alpha _{\textrm{bsf}}\), where \(0 \le \alpha ^{\textrm{LB}}\le \alpha _{\textrm{bsf}}\le \alpha ^{\textrm{UB}}\le 1\), the higher the probability of selecting arcs that are part of the best-so-far solution \(S^{\textrm{bsf}}\).

3. Then, the tours corresponding to nodes i and j are merged; in the following, \(T_1\) denotes the tour containing i and \(T_2\) the tour containing j. The merging process is determined by one of the following four cases, depending on the direct connection of nodes i and j to the depot (the conditions follow from the requirement that the merged tour \(T_m\) connects i directly to j):

   (a) Case 1: i immediately follows the depot in \(T_1\), and j immediately follows the depot in \(T_2\). Merging: reverse \(T_1\), obtaining \(\textsf {rev}(T_1)\), and concatenate it with \(T_2\). Result: \(T_m\) consists of \(\textsf {rev}(T_1)\) followed by \(T_2\).

   (b) Case 2: i immediately follows the depot in \(T_1\), and j immediately precedes the depot in \(T_2\). Merging: reverse both tours, obtaining \(\textsf {rev}(T_1)\) and \(\textsf {rev}(T_2)\), and concatenate them. Result: \(T_m\) consists of \(\textsf {rev}(T_1)\) followed by \(\textsf {rev}(T_2)\).

   (c) Case 3: i immediately precedes the depot in \(T_1\), and j immediately follows the depot in \(T_2\). Merging: concatenate \(T_1\) and \(T_2\). Result: \(T_m\) consists of \(T_1\) followed by \(T_2\).

   (d) Case 4: i immediately precedes the depot in \(T_1\), and j immediately precedes the depot in \(T_2\). Merging: reverse \(T_2\), obtaining \(\textsf {rev}(T_2)\), and concatenate \(T_1\) with it. Result: \(T_m\) consists of \(T_1\) followed by \(\textsf {rev}(T_2)\).

In all cases, the reversed form of a tour \(T\) is denoted by \(\textsf {rev}(T)\). Subsequently, the feasibility of the merged tour \(T_m\) is checked with respect to vehicle load capacity and time windows. If the merged tour violates vehicle capacity and/or time window constraints, it is deemed infeasible, the corresponding entry is eliminated from the savings list, and a new candidate is chosen following the procedure outlined above. In the event that the merged tour is infeasible due to battery constraints, a charging station is inserted into the tour. For determining the insertion position, the first node of the tour at which the electric vehicle arrives with a negative battery level is identified, and a charging station is inserted between this node and its predecessor. After determining the insertion position, the charging station that results in the minimum increase of the overall tour distance is selected and placed at this position. If the tour remains infeasible, the same procedure is applied to the preceding arcs. In those cases in which the infeasibility persists even after attempting to insert charging stations, the merged tour is discarded, and the associated nodes are removed from the savings list. Then, the next candidate pair of nodes is selected from the savings list following the procedure described above. The tour merging process is repeatedly executed until the savings list is exhausted. Once the merging phase is complete, some of the previously added charging stations may no longer be necessary; therefore, redundant charging stations are identified and removed from the constructed tours.

4. Finally, the savings list L is updated as described above.

As a last step, the final set of tours is converted into its corresponding set of solution components.
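
The following C++ sketch illustrates steps 1 and 2 of the loop above: sorting the savings list by the \(q_{ij}\) values, truncating it to the reduced list \(L_r\), and performing the roulette-wheel selection of Eq. (32). The `Entry` type and the function name are hypothetical; only the selection logic is taken from the text.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

struct Entry { int i, j; double q; };  // node pair plus its q-value (Eq. (31))

// Sort L by non-increasing q, keep the first l_size entries (the reduced
// list L_r), and draw one entry with probability q_ij / sum of q (Eq. (32)).
Entry pickFromReducedList(std::vector<Entry> L, std::size_t l_size,
                          std::mt19937& rng) {
    std::sort(L.begin(), L.end(),
              [](const Entry& a, const Entry& b) { return a.q > b.q; });
    L.resize(std::min(l_size, L.size()));
    std::vector<double> weights;
    for (const Entry& e : L) weights.push_back(e.q);
    std::discrete_distribution<std::size_t> roulette(weights.begin(),
                                                     weights.end());
    return L[roulette(rng)];
}
```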

Probabilistic insertion algorithm. Our second constructive heuristic operates by inserting customers into available tours in a sequential manner until all customers are visited. The first customer to be inserted is chosen based on either its distance from the depot or its latest possible visiting time. In particular, the initial tour is established by inserting the customer with either the greatest distance from the depot or the earliest deadline. We then produce a cost list L that outlines all possible insertion points for each unvisited customer, along with the associated costs. The cost of inserting customer i between nodes j and k is calculated by the following equation:

$$\begin{aligned} c(j,i,k) = d_{ji}+d_{ik}-d_{jk} \end{aligned}$$
(33)

Then, \(q_{jik}\) is calculated for each entry \((j,i,k) \in L\) as follows:

$$\begin{aligned} q_{jik} := {\left\{ \begin{array}{ll} (c(j,i,k) + 1) \cdot (1 - \alpha _{\textrm{bsf}})^2 & \quad \text {if } c_{ji} \in S^{\textrm{bsf}}\text { and } c_{ik} \in S^{\textrm{bsf}}\\ (c(j,i,k) + 1) \cdot \alpha _{\textrm{bsf}}^2 & \quad \text {if } c_{ji} \notin S^{\textrm{bsf}}\text { and } c_{ik} \notin S^{\textrm{bsf}}\\ (c(j,i,k) + 1) \cdot \alpha _{\textrm{bsf}}\,(1 - \alpha _{\textrm{bsf}}) & \quad \text {otherwise} \end{array}\right. } \end{aligned}$$
(34)

Subsequently, an entry (j, i, k) is selected from the cost list based on probabilities derived from the \(q_{jik}\) values of Eq. (34), analogously to Eq. (32). Next, if the remaining load capacity of the vehicle permits, customer i is inserted between nodes j and k. If this insertion is infeasible in terms of battery restrictions, a charging station is inserted into the tour using the process described for the C&W savings algorithm. In situations where the insertion of a customer results in the vehicle exceeding its load capacity or battery capacity (even after charging station insertion), or causes a time window violation, a new tour is initiated, which includes only the respective customer.

After all customers have been inserted and a complete solution is derived, the obtained set of tours is transformed into the corresponding set S of solution components.
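
Analogously to the C&W case, the insertion cost of Eq. (33) and the biased value of Eq. (34) can be sketched as follows; the distance matrix `d` and the two membership flags are illustrative assumptions.

```cpp
#include <vector>

// Cost of inserting customer i between nodes j and k (Eq. (33)).
double insertionCost(const std::vector<std::vector<double>>& d,
                     int j, int i, int k) {
    return d[j][i] + d[i][k] - d[j][k];
}

// Biased value q_jik (Eq. (34)), depending on whether the arcs (j, i)
// and (i, k) appear in the best-so-far solution.
double biasedInsertionValue(double cost, bool jiInBsf, bool ikInBsf,
                            double alpha_bsf) {
    double w;
    if (jiInBsf && ikInBsf)        w = (1.0 - alpha_bsf) * (1.0 - alpha_bsf);
    else if (!jiInBsf && !ikInBsf) w = alpha_bsf * alpha_bsf;
    else                           w = alpha_bsf * (1.0 - alpha_bsf);
    return (cost + 1.0) * w;
}
```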

4.5 The Adapt-Cmsa-SetCov Algorithm

The ILP model for solving subinstances in this variant of Adapt-CMSA is model \(\texttt {ILP}_{\textrm{setcov}}^{\textrm{EVRP}}\) from Sect. 4.2. In this case, the complete set of solution components C consists of a component \(c_r\) for each valid tour \(T_r \in \mathscr {T}\) (see Sect. 4.2), that is, \(C:= \{c_r \mid T_r \in \mathscr {T}\}\). Any subset \(S \subset C\) such that each customer \(i \in V\) is served by exactly one tour of S is a valid solution to the EVRP-TW-SPD problem instance.

The probabilistic solution construction process in Adapt-Cmsa-SetCov works in exactly the same way as in Adapt-Cmsa-Std. The only difference is that the returned solutions consist of solution components that directly correspond to tours (instead of arcs, as in the case of Adapt-Cmsa-Std).

Another difference is—as mentioned above—the ILP model used to solve subinstances. In fact, given a subinstance \(C'\), the corresponding ILP model is obtained by replacing each occurrence of \(\mathscr {T}\) with \(C'\); that is, the model only considers those tours as eligible that appear in subinstance \(C'\). However, since the set covering formulation allows a customer to be served by more than one selected tour, the solution is checked for duplicate occurrences of customers before being returned; that is, all redundant customers are first identified. Afterward, the benefit of removing each redundant customer, which is directly related to the distances between the respective customer and its adjacent nodes, is computed. Subsequently, redundant customers, beginning with the one with the greatest removal benefit, are removed until each customer appears in a single tour only. A sketch of this post-processing step is given below.
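
A minimal sketch of this duplicate-removal step follows, assuming tours are stored as node sequences with the depot (node 0) at both ends and, for simplicity, that all interior nodes are customers (charging stations would have to be skipped). Data structures and names are ours.

```cpp
#include <cstddef>
#include <map>
#include <vector>

using Tour = std::vector<int>;  // node sequence, depot (0) at both ends

// Distance saved by removing the node at position pos from tour t.
double removalBenefit(const Tour& t, std::size_t pos,
                      const std::vector<std::vector<double>>& d) {
    int j = t[pos - 1], i = t[pos], k = t[pos + 1];
    return d[j][i] + d[i][k] - d[j][k];
}

// Repeatedly remove the redundant customer with the greatest benefit
// until every customer appears in exactly one tour.
void removeRedundantCustomers(std::vector<Tour>& tours,
                              const std::vector<std::vector<double>>& d) {
    while (true) {
        std::map<int, int> count;  // customer -> number of tours serving it
        for (const Tour& t : tours)
            for (std::size_t p = 1; p + 1 < t.size(); ++p) ++count[t[p]];
        double best = -1.0;
        std::size_t bestTour = 0, bestPos = 0;
        for (std::size_t ti = 0; ti < tours.size(); ++ti)
            for (std::size_t p = 1; p + 1 < tours[ti].size(); ++p)
                if (count[tours[ti][p]] > 1) {
                    double b = removalBenefit(tours[ti], p, d);
                    if (b > best) { best = b; bestTour = ti; bestPos = p; }
                }
        if (best < 0.0) break;  // no redundant customer left
        tours[bestTour].erase(tours[bestTour].begin()
                              + static_cast<std::ptrdiff_t>(bestPos));
    }
}
```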

4.6 Computational Experiments

The experiments were conducted on the same machines as those used for the VSBPP problem, that is, on a cluster of machines equipped with Intel® Xeon® 5670 CPUs with 12 cores of 2.933 GHz and at least 32 GB of RAM. The proposed algorithms were implemented in C++. Subinstances in both Adapt-CMSA variants were solved using CPLEX version 20.1 in single-threaded mode. Furthermore, the ILP models representing complete problem instances were solved using CPLEX version 20.1 in standalone mode.

4.6.1 Generation of the problem instances for EVRP-TW-SPD

The algorithms’ performance was evaluated using the EVRP-TW problem instances derived by Schneider et al. (2014) from the classical VRPTW instances of Solomon (1987). This dataset includes a total of 92 instances: 36 small-sized and 56 large-sized ones. Small-sized instances include 5, 10, or 15 customers, while large-sized instances contain 100 customers and 21 charging stations. These instances are organized into three distinct groups based on the spatial distribution of customer locations: clustered instances (marked by the prefix “c”), randomly distributed instances (prefix “r”), and a hybrid of random and clustered distributions (prefix “rc”). Each group further contains two sub-classes (type 1 and type 2), which differentiate instances based on factors such as time windows, vehicle load, and battery capacity.

Schneider’s modifications primarily involved integrating charging stations and adjusting battery capacities to ensure instance feasibility. Specifically, one charging station was positioned at the depot, with the remaining stations distributed randomly, yet in such a way that every customer could be reached using at most two charging stations. The battery capacity was set to the maximum of (1) the capacity needed to travel 60% of the average route length of the best-known solution and (2) twice the capacity required to traverse the longest arc between a customer and a charging station. These changes also necessitated new time windows, as the original ones from Solomon became infeasible due to the added charging times.

Since Schneider’s instances only provide a single demand value per customer, we adapted them to fit the requirements of our EVRP-TW-SPD model. This adaptation involved separating the combined delivery and pickup demands. We applied the method described by Salhi and Nagy (1999) to calculate a ratio \(\rho _i = \min \{\frac{x_i}{y_i}, \frac{y_i}{x_i}\}\) using the Cartesian coordinates \((x_i, y_i)\) of each customer \(i \in V\). The delivery demand \(q_i\) was then computed by multiplying the original demand \(\delta _i\) by \(\rho _i\), and the pickup demand \(p_i\) was obtained by subtracting \(q_i\) from \(\delta _i\). These modified instances are available at https://github.com/manilakbay/EVRP-TW-SPD-Instances (accessed on 25/04/2024).
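
The following small C++ function sketches this demand split; variable names are ours, and a customer with a zero coordinate would require special handling, since the ratio is undefined in that case.

```cpp
#include <algorithm>

struct Demands { double delivery, pickup; };

// Split the original demand delta of a customer at coordinates (x, y)
// following Salhi and Nagy (1999): rho = min{x/y, y/x}, q = rho * delta,
// p = delta - q.
Demands splitDemand(double x, double y, double delta) {
    double rho = std::min(x / y, y / x);
    double q   = rho * delta;
    return {q, delta - q};
}
```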

4.6.2 Parameter tuning

Table 5 Parameters, their domains, and the chosen values as determined by irace

As in the case of the VSBPP problem, we employed the tuning software irace (López-Ibáñez et al., 2016) to derive well-working parameter values for Adapt-Cmsa-Std and Adapt-Cmsa-SetCov. The tuning process was conducted using six instances, namely r107, r205, rc101, rc104, rc105, and rc205. The budget of irace—that is, the number of algorithm runs allowed for tuning—was set to 2500, and the time limit per instance was fixed to 900 CPU seconds. Moreover, the precision of irace was fixed to two decimal places for numerical parameters. Table 5 presents a summary of the parameters, their domains, and the final values selected for the experimentation.

It is worth highlighting that the obtained values for \(n^{\textrm{init}}\) and \(n^{\textrm{inc}}\) are significantly smaller for Adapt-Cmsa-Std than for Adapt-Cmsa-SetCov. One possible explanation is that the ILP model used within Adapt-Cmsa-Std makes it difficult for the algorithm to be successful: the subinstances apparently need to be as small as possible so that CPLEX can generate valid solutions within the restricted time when solving them. This explanation is also supported by the obtained values for \(t_{\textrm{ILP}}\): the limit on the running time of CPLEX for solving the ILP models at each iteration is about twice as high in the case of Adapt-Cmsa-Std. In contrast, the value of the \(l_{\textrm{size}}^{\textrm{init}}\) parameter determined for Adapt-Cmsa-Std is much higher than that determined for Adapt-Cmsa-SetCov. A higher \(l_{\textrm{size}}^{\textrm{init}}\) value may be regarded as a diversification mechanism that compensates for the small subinstances.

4.6.3 Numerical results

In this section, we provide a detailed experimental evaluation of the proposed algorithms. To gain a better understanding of their behavior in different situations, we tested them on small-sized instances with 5, 10, and 15 customers, as well as on large-sized instances with 100 customers. The numerical results for the small-sized instances can be found in Tables 6, 7 and 8, while the results for the large-sized instances are presented in Tables 9, 10 and 11.

Table 6 Computational results for small-sized instances with 5 customers
Table 7 Computational results for small-sized instances with 10 customers
Table 8 Computational results for small-sized instances with 15 customers. The better result between Adapt-Cmsa-Std and Adapt-Cmsa-SetCov is highlighted in case it improves over the corresponding CPLEX result

To assess the effectiveness of the algorithms in handling small problem instances, we compared Adapt-Cmsa-Std and Adapt-Cmsa-SetCov with the application of CPLEX to the full-size instances. However, since CPLEX cannot handle larger problem instances, we used our probabilistic Clarke & Wright savings algorithm (pC&W) and our probabilistic sequential insertion algorithm (pSI) as benchmarks in those scenarios. We ensured that the parameters of both algorithms were set in the same way as for their application within Adapt-Cmsa-Std. Finally, we imposed a computation time limit of 150 CPU seconds for small problem instances and of 900 CPU seconds for larger problem instances. Each algorithm was applied 10 times to each problem instance. Note also that, to compute objective function values, we set the cost of each vehicle used in a solution to 1000, that is, \(M=1000\).

The first multi-column of each result table concerning the small problem instances (Tables 6–8) presents the instance names and the value of cs, the number of copies of each charging station (dummy charging stations) permitted to be utilized by our algorithms. The value of cs was determined for each problem instance as follows. In the first step, the corresponding ILP model was solved with \(cs=1\), a setting in which each charging station can be visited at most once. Then, the value of cs was iteratively incremented until the optimal solution to the ILP model no longer improved with respect to the previous iteration, and the final value of cs was set to the value of the previous iteration. Unfortunately, this procedure is not feasible for large-sized problem instances, because the ILP solver cannot derive an optimal solution in a reasonable time frame. Therefore, we adopted a constant number of five dummy charging stations per instance. Next, the columns with heading ‘m’ show the number of vehicles used in the respective solutions. For algorithms Adapt-Cmsa-Std, Adapt-Cmsa-SetCov, pC&W, and pSI, the columns labeled ‘best’ display the best objective function values among the solutions obtained in ten runs, and the columns with heading ‘avg.’ show the average objective function values over the best solutions of each of the 10 runs. Moreover, the ‘time’ columns show the computation time of CPLEX, respectively the average computation times of both Adapt-CMSA variants for generating the best solutions of each run. All indicated times are in CPU seconds. The time limit for CPLEX was set to two hours. The ‘gap(%)’ columns provide the percentage gap between the best solutions found by CPLEX and the best lower bounds it achieved; a gap value of zero means that CPLEX found a proven optimal solution.

Based on the obtained results, the following observations can be made. For the small-sized problem instances, CPLEX optimally solved 31 instances. For the remaining 5 instances (rc201C10, c103C15, r102C15, r202C15, rc204C15), it only provided feasible solutions. These solutions were the best found within 2 h of computation time, except for rc201C10, for which we allowed 9 h of running time to derive the presented solution. Both versions of Adapt-CMSA found all solutions proven optimal by CPLEX. In the case of the r202C15 instance, Adapt-Cmsa-Std and Adapt-Cmsa-SetCov were even able to improve over the solution obtained by CPLEX, by 0.15% and 36.17%, respectively. Furthermore, both Adapt-Cmsa-Std and Adapt-Cmsa-SetCov improved over the solution obtained by CPLEX by 1.51% in the case of the rc204C15 instance. Moreover, both variants of Adapt-CMSA require considerably less computation time than CPLEX: while CPLEX found its best solutions on average in 2965.35 s, Adapt-Cmsa-Std was able to do so in 23.04 s, and Adapt-Cmsa-SetCov in just 21.07 s.

The numerical results for the large-sized instances demonstrate that both variants of Adapt-CMSA are superior to pC&W and pSI in terms of both best and average performance. Although the average computation time required by Adapt-Cmsa-Std and Adapt-Cmsa-SetCov is higher than that required by pC&W and pSI, this is because pC&W and pSI were not able to improve their best-found solutions any further, while the Adapt-CMSA algorithms were still exploring the search space in order to find even better solutions. It is also worth noting that the solutions found by both versions of Adapt-CMSA utilize fewer vehicles than those found by pC&W and pSI.

In addition, the following observations can be made concerning the comparison between Adapt-Cmsa-Std and Adapt-Cmsa-SetCov. First, Adapt-Cmsa-SetCov significantly outperforms Adapt-Cmsa-Std on the random and random-clustered instances; see below for statistical tests. However, from the results presented in Table 9, it seems difficult to come to a definite conclusion for the clustered-type instances, for which Adapt-Cmsa-SetCov seems to provide a slightly better performance both in terms of best and average results. In order to back these claims up in a scientifically well-founded way, we also present critical difference (CD) plots as a statistical tool for the evaluation of the obtained results. As in the case of the VSBPP problem, we used the scmamp tool (Calvo & Santafe, 2016) to generate the CD plots. Remember that, in these plots, each algorithm variant is positioned along the horizontal axis according to its average ranking for the considered subset of problem instances. Algorithm variants whose performance difference falls below the critical difference threshold, computed with a significance level of 0.05, are considered statistically equivalent, as indicated by the horizontal bars connecting their respective markers. According to Fig. 6, there is no significant difference between the performance of Adapt-Cmsa-SetCov and Adapt-Cmsa-Std on clustered instances, while both outperform the probabilistic implementations of the construction heuristics.

In summary, both variants of Adapt-CMSA show a very satisfactory performance in the context of both small and large problem instances. Moreover, Adapt-Cmsa-SetCov shows superiority over Adapt-Cmsa-Std, particularly in the context of random and random-clustered instances. These claims are backed up statistically by the graphics in Fig. 6. However, we also observed that the performance of Adapt-Cmsa-SetCov decreases in the context of instances with a long scheduling horizon (C2*, R2*, and RC2*); see Fig. 6f. Solutions to those instances include fewer routes and hence more customers per route when compared to the solutions of instances with short scheduling horizons (C1*, R1*, and RC1*).

Table 9 Computational results for large-sized clustered instances. The better result between Adapt-Cmsa-Std and Adapt-Cmsa-SetCov is highlighted in case it improves over the best result of pC&W and pSI
Table 10 Computational results for large-sized random instances. The better result between Adapt-Cmsa-Std and Adapt-Cmsa-SetCov is highlighted in case it improves over the best result of pC&W and pSI
Table 11 Computational results for large-sized random clustered instances. The better result between Adapt-Cmsa-Std and Adapt-Cmsa-SetCov is highlighted in case it improves over the best result of pC&W and pSI
Fig. 6
figure 6

Critical difference (CD) plots concerning the results for large instances. The results in (a) consider all instances together, while the subsequent plots display the results for subsets of the set of large instances: (b) clustered instances; (c) random instances; (d) random-clustered instances; (e) instances R1*, C1* and RC1*; (f) instances R2*, C2* and RC2*

4.6.4 Performance Difference Between the Two EVRP-TW-SPD ILP Models

Finally, as in the case of the VSBPP, we want to show why Adapt-Cmsa-SetCov generally outperforms Adapt-Cmsa-Std. For this purpose, we again generated subinstances of different sizes, translated them into both model \(\texttt {ILP}_{\textrm{std}}^{\textrm{EVRP}}\) and model \(\texttt {ILP}_{\textrm{setcov}}^{\textrm{EVRP}}\), and solved them with CPLEX. In particular, we generated 10 subinstances by probabilistically constructing 100, respectively 500, solutions (small problem instance) and 50, respectively 100, solutions (large problem instance) and by merging their solution components. This was done for the small problem instance r202C15 with 15 customers and for the large problem instance c101 with 100 customers. Figure 7 shows radar charts that present the obtained results for the four different cases. Each radar chart provides four different measures, averaged over the 10 subinstances: (1) the number of variables in the models of the subinstances (top), (2) the relative MIP gap after termination of CPLEX (right), (3) the computation time required by CPLEX (bottom), and (4) the absolute improvement obtained when comparing the result of solving the subinstance with the best individual solution used to generate the subinstance. Note that the time limit for CPLEX was set to 20 CPU seconds in all cases. A model is promising if the improvement is large while the number of variables, the relative MIP gap, and the required time are low. The radar charts concerning the large problem instance (see Fig. 7c and d) indicate that this is the case for model \(\texttt {ILP}_{\textrm{setcov}}^{\textrm{EVRP}}\), while the opposite holds for model \(\texttt {ILP}_{\textrm{std}}^{\textrm{EVRP}}\). Especially the case of the small problem instance with the lower number of solution constructions (see Fig. 7a) indicates that subinstances must not be too small, as otherwise few improvements can be found with model \(\texttt {ILP}_{\textrm{setcov}}^{\textrm{EVRP}}\).

Fig. 7
figure 7

Radar charts concerning the comparison of the two ILP models for the EVRP-TW-SPD problem applied to a small problem instance with 15 customers (see (a) and (b)), and to a large problem instance with 100 customers (see (c) and (d))

5 Conclusions and future research directions

In this paper, we presented the application of different CMSA variants to two NP-hard combinatorial optimization problems: the variable-sized bin packing problem and the electric vehicle routing problem with time windows and simultaneous pickup and delivery. Both optimization problems share the property that they can be modeled both by an assignment-type integer linear program and by a set-covering-based integer linear program. Both models were used to solve subinstances at each iteration of the presented CMSA algorithms. In both cases, the results have convincingly shown that the CMSA variants using the set-covering-based models significantly outperform the CMSA variants using the standard assignment-type models. In fact, in our opinion, CMSA algorithms are an ideal algorithmic framework for profiting from set-covering-based models of optimization problems that can be modeled in this way, because they are less sophisticated and easier to implement than column generation approaches. In addition, in contrast to simpler heuristic methods from the literature devised to exploit set-covering-based models, CMSA algorithms are able to explore the search space.

When faced with a combinatorial optimization problem that can be modeled by both types of integer linear programming models, we recommend testing the suitability of the set-covering-based model by performing experiments such as the ones that led to the radar plots in Figs. 4 and 7. If these experiments show that solving subinstances with the set-covering formulation yields solutions of much higher quality at a much lower computational effort than the standard integer linear programming model, the use of the set-covering-based model within CMSA is indicated.

We envisage at least two possible lines of future work. One consists of consolidating our findings in the context of additional combinatorial optimization problems that can be modeled by set-covering-based formulations. Another consists of improving the CMSA algorithms presented in this work. In the context of the VSBPP problem, for example, we only implemented the first solution construction approach that came to mind. Adding further greedy heuristics for the construction step of CMSA could help to generate potentially different bins that, in combination with other bins, could lead to even better solutions, or could improve our results in the few cases in which our algorithm is not able to compete with the state-of-the-art variable neighborhood search technique. Also, in the case of the electric vehicle routing problem, we see potential for improvement by adding additional solution construction techniques.