Flowshop Final Report
2022/2023
Abstract
The scheduling of jobs in flowshop environments has become increasingly
important in various industries. As the number of machines and jobs to be
scheduled increases, the complexity of the problem also increases, which
necessitates efficient scheduling techniques. In this report, we will
investigate various methods for solving flow shop scheduling problems,
including branch and bound, heuristics, metaheuristics, and hyperheuristics. We
will compare the results obtained from each method and study their
performance under different input parameters. Specifically, we will examine the
makespan, which is an essential parameter for measuring the efficiency of
scheduling. This report aims to provide a comprehensive analysis of the different
techniques used in flow shop scheduling and identify the most efficient method
for obtaining optimum results.
Contribution of members
[Task assignment table: responsibilities per member, including Benazzoug Nour El Houda.]
Table of contents
Chapter 01 : Scheduling problems
1. Introduction
4. Objectives
Chapter 02 : Flowshop problem
1. Problem Definition
3. Makespan calculation
Chapter 03 : Exact methods
1. Overview
3. Tests
a. Random instance
c. Common instance
d. Taillard instance
i. Random initialization
Chapter 04 : Heuristics
1. Overview
a. NEH Heuristic
b. Greedy NEH
c. Johnson’s heuristic
d. Ham heuristic
e. Palmer’s heuristic
f. CDS heuristic
g. Gupta heuristic
h. PRSKE heuristic
i. Artificial heuristic
3. Tests
a. First instance
b. Seventh instance
Chapter 05 : Local search based Metaheuristics
1. Overview
a. Random walk
b. Hill climbing
c. Simulated annealing
d. Tabu Search
e. VNS
3. Tests
a. Random initialization
b. NEH initialization
c. Hyperparameters tuning
i. Simulated annealing
ii. VNS
Chapter 06 : Population based Metaheuristics
1. Overview
2. Genetic algorithm
5. Tests
Chapter 07 : Discussion
Chapter 01 : Scheduling problems
1. Introduction
Scheduling is a crucial decision-making process used in a variety of
manufacturing and service industries. Its aim is to optimize one or more
objectives by allocating resources to tasks over specific time periods. Resources
and tasks can take various forms, such as machines in a workshop, runways at
an airport, crew at a construction site, processing units in a computing
environment, etc. Tasks may have specific characteristics, such as priority levels,
earliest starting times, and due dates. The objectives can also vary, such as
minimizing tasks completed after their respective due dates.
Scheduling plays a vital role in most manufacturing and production
systems, information processing environments, transportation and distribution
settings, and other types of service industries.
● Flow Shop Scheduling - In a flow shop, every job must be processed on
every machine, following the same machine order, and every job is distinct.
Therefore, while evaluating a flow shop problem, we consider the different
possible sequences in which the jobs can be carried out, and the best among
those is chosen.
● Job Shop Scheduling - This differs from Flow Shop Scheduling in that
it's not mandatory for all jobs to be processed on every machine available.
Each job may be processed on a distinct number of machines, and the
sequence of jobs processed on each machine is also unique. The best
sequence is selected based on the requirements of the problem, after
considering several possible sequences.
4. Objectives
The main objective of this study is to investigate different methods for
scheduling jobs in a flowshop environment, with a focus on performance
evaluation.
Through this study, we aim to contribute to the field of flowshop
scheduling by providing insights into the performance of various scheduling
methods, which can help practitioners and researchers in making informed
decisions when selecting the appropriate method for their specific scheduling
problem.
Chapter 02 : Flowshop problem
1. Problem Definition
Flowshop scheduling is a problem where a set of n jobs, each with m
operations, must be processed on m machines. Each operation of a job must be
executed on a specific machine, and no machine can perform more than one
operation simultaneously. The objective is to determine the optimal arrangement
of jobs that will result in the shortest possible total job execution time, also
known as makespan. This is typically achieved by finding the optimal order in
which to process the jobs on the machines.
3. Makespan calculation
The makespan of a sequence in a flowshop scheduling problem is the time
required to complete all jobs on all machines in the sequence. It can be
calculated by starting with the first job on the first machine and adding the time
required to process each job on each machine, while ensuring that the machines
are not idle and the jobs are processed in the specified order. Here’s the code
that calculates the makespan of a sequence.
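A minimal version of such a function might look like the following sketch (the function name and the `times[job][machine]` matrix layout are illustrative assumptions):

```python
def makespan(sequence, times):
    """Completion time of the last job on the last machine.

    times[j][m] is the processing time of job j on machine m.
    """
    n_machines = len(times[0])
    # completion[m] = completion time of the last scheduled job on machine m
    completion = [0] * n_machines
    for job in sequence:
        completion[0] += times[job][0]
        for m in range(1, n_machines):
            # a job starts on machine m once it has finished on machine m-1
            # and machine m has finished its previous job
            completion[m] = max(completion[m], completion[m - 1]) + times[job][m]
    return completion[-1]
```

For example, with `times = [[2, 3], [4, 1]]`, the sequence `[0, 1]` yields a makespan of 7 while `[1, 0]` yields 9.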
Chapter 03 : Exact methods
1. Overview
Exact methods are algorithms used to find optimal solutions to
optimization problems, where the goal is to minimize or maximize a certain
objective function subject to a set of constraints. In many real-world
applications, such as logistics, scheduling, finance, and engineering, it is critical
to find the best possible solution to these problems. Exact methods provide a
way to obtain optimal solutions with a guaranteed level of optimality, but they
can be computationally expensive, especially for large-scale problems.
One powerful exact method that has been widely used in solving a variety
of optimization problems is Branch and Bound. Branch and Bound is a
tree-based search algorithm that divides the search space into smaller
subproblems, evaluates the subproblems, and prunes those that cannot lead to
better solutions than the current best known solution. Branch and Bound has
proven to be a flexible and effective method for solving optimization problems,
especially for mixed-integer programming problems, where some variables are
restricted to take integer values.
The power of Branch and Bound lies in its ability to systematically
eliminate large portions of the search space that are guaranteed to contain
suboptimal solutions. This is achieved through the use of bounding techniques
that enable the algorithm to quickly prune subproblems that cannot lead to
better solutions than the current best known solution.
One of the trickiest aspects of Branch and Bound is finding the lower
bound for the problem, which is used to prune subproblems that cannot lead to
a better solution. The lower bound is a value that provides an estimate of the
minimum possible value of the objective function for a given subproblem. The
lower bound is used to prune branches of the search tree that cannot lead to a
better solution than the current best known solution, thereby reducing the
number of subproblems that need to be explored.
The algorithm starts with an initial node, which represents the empty
schedule, and generates child nodes by adding one job at a time to the schedule.
The lower bound is then used to prune branches of the search tree that cannot
lead to a better solution.
As the search tree is built, the Branch and Bound algorithm prunes
branches that cannot possibly lead to an optimal solution, by using the lower
bound to eliminate nodes that have a makespan greater than the current best
solution. This allows the algorithm to focus on the most promising branches of
the search tree, and ultimately find the optimal solution.
We derived a lower bound by considering the remaining processing times
of the jobs and computing the completion time of the partial schedule. By
incorporating this lower bound into our Branch and Bound algorithm, we were
able to efficiently explore the solution space and find the optimal solution to the
scheduling problem. Below is the code for the Branch and Bound algorithm:
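A minimal sketch of such an algorithm is given below. It uses a simple illustrative lower bound, the completion time of the partial schedule on the last machine plus all remaining work on that machine, which is not necessarily the exact bound used in this report:

```python
def branch_and_bound(times):
    """Return (order, makespan) minimizing the makespan; exponential worst case.

    times[j][m] is the processing time of job j on machine m.
    """
    n_jobs, n_machines = len(times), len(times[0])

    def completion(partial):
        comp = [0] * n_machines
        for job in partial:
            comp[0] += times[job][0]
            for m in range(1, n_machines):
                comp[m] = max(comp[m], comp[m - 1]) + times[job][m]
        return comp

    best_order, best_cmax = None, float("inf")

    def explore(partial, remaining):
        nonlocal best_order, best_cmax
        comp = completion(partial)
        if not remaining:
            if comp[-1] < best_cmax:
                best_order, best_cmax = list(partial), comp[-1]
            return
        # lower bound: work already scheduled on the last machine plus
        # the remaining processing times on that machine
        bound = comp[-1] + sum(times[j][-1] for j in remaining)
        if bound >= best_cmax:
            return  # prune: no extension of this node can improve the best
        for job in sorted(remaining):
            explore(partial + [job], remaining - {job})

    explore([], frozenset(range(n_jobs)))
    return best_order, best_cmax
```

Each node of the search tree is a partial schedule; the bound check discards a node as soon as no completion of it can beat the incumbent.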
We also implemented a pure Branch and Bound, which explores the search
tree without a lower bound, to compare its performance to Branch and Bound
with lower bound. By comparing the two methods, we can evaluate the
effectiveness of using lower bounds in reducing the search space and improving
the efficiency of the search algorithm.
3. Tests
a. Random instance
We conducted an experiment to compare the performance of pure Branch
and Bound, Branch and Bound with lower bound, and a brute force algorithm
on a randomly generated instance of 8 jobs and 5 machines. The processing
times for each job on each machine were also randomly generated. We ran each
algorithm and recorded the results, summarized below.
We observed that Branch and Bound with lower bound performed better
than the pure version. Specifically, it visited significantly fewer nodes than
pure Branch and Bound. This result indicates that the lower bound was
effective in pruning branches of the search tree that could not lead to a better
solution.
The implementation of Branch and Bound required slightly more time
than Brute force because of the computational cost of calculating the lower
bound. Although the lower bound provides a way to prune branches of the
search tree, the complexity of its calculation may offset some of the time savings.
c. Common instance
A common random instance was provided for us to test our branch and bound
algorithm on, and we were also able to compare our results with other teams in
the class. The instance consisted of 10 jobs and 5 machines, and we ran our
algorithm multiple times to ensure accuracy. The results are summarized below.
d. Taillard instance
We tested our branch and bound algorithm on the first instance of the
first Taillard benchmark (20 jobs and 5 machines) for 30 minutes before stopping it.
The results are summarized below.
i. Random initialization
One way to accelerate Branch & Bound is to parallelize the search and
allocate the search space among multiple workers. This approach can lead to
significantly faster attainment of the optimal solution.
Branch & Bound algorithms are particularly effective when the search
space is not excessively large. However, when the search space becomes too
large, there are other methods that can be employed. We will explore these
methods in the upcoming chapters.
Chapter 04 : Heuristics
1. Overview
Exact methods, such as branch and bound or brute force, guarantee
optimality, but they can be computationally expensive and may require a
significant amount of time to find an optimal solution, especially for large
problem instances. Therefore, for many practical problems, exact methods are
not always feasible. Heuristics, on the other hand, are approximate algorithms
that provide good-quality solutions in a reasonable amount of time. They are
designed to quickly generate feasible solutions that are close to optimal, without
the guarantee of finding the global optimum. The flowshop problem has been
widely studied in the literature and various heuristics have been proposed for
solving it. Below are some of the most known heuristics for solving the problem.
a. NEH Heuristic
The NEH heuristic (Nawaz, Enscore and Ham) is one of the best known
constructive heuristics for the permutation flowshop. Jobs are first sorted in
decreasing order of their total processing time. Then, starting from the job
with the largest total time, each job is inserted in turn into every possible
position of the current partial sequence, and the position yielding the smallest
partial makespan is kept. The process continues until all jobs are placed.
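The NEH insertion procedure (sort jobs by decreasing total processing time, then insert each job at its best position) can be sketched as follows; a minimal, self-contained version in which the function names are illustrative:

```python
def neh(times):
    """NEH: sort jobs by decreasing total processing time, then insert each
    job at the position of the partial sequence minimizing the makespan."""
    n_machines = len(times[0])

    def cmax(seq):
        comp = [0] * n_machines
        for job in seq:
            comp[0] += times[job][0]
            for m in range(1, n_machines):
                comp[m] = max(comp[m], comp[m - 1]) + times[job][m]
        return comp[-1]

    # seed order: jobs by decreasing sum of processing times
    jobs = sorted(range(len(times)), key=lambda j: -sum(times[j]))
    sequence = []
    for job in jobs:
        # try every insertion position, keep the best partial schedule
        candidates = [sequence[:i] + [job] + sequence[i:]
                      for i in range(len(sequence) + 1)]
        sequence = min(candidates, key=cmax)
    return sequence, cmax(sequence)
```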
b. Greedy NEH
The Greedy NEH algorithm is a constructive heuristic approach used to
solve permutation flow shop scheduling problems. At each step, instead of
selecting the best partial solution as dictated by the original NEH method, a
modification is introduced: rather than choosing the absolute best partial
solution, the algorithm randomly selects one out of the best five partial solutions
with equal probabilities and keeps it for the next steps of the solution
construction. When we have a complete solution, we reiterate the construction
process, having at each iteration a high-quality individual that is different from
its predecessor. We keep track of the best found solution, and at the end of the
execution, it's the one Greedy NEH returns. While it does not guarantee an
optimal solution, the Greedy NEH algorithm provides a quick and effective
method for solving permutation flowshop scheduling problems.
c. Johnson’s heuristic
Johnson's algorithm is a simple and effective method for finding the
optimal sequence for the two-machine flowshop problem, and it can also be
applied to problems with more than two machines under certain conditions.
The algorithm repeatedly selects, among the unscheduled jobs, the one with the
smallest processing time on either machine. If that minimum processing time is
on the first machine, the job is placed at the beginning of the sequence, and if it
is on the second machine, it is placed at the end of the sequence. The selected
job is then removed from the list, and the process is repeated until all jobs are
sequenced. This rule reduces idle time on the second machine and maximizes
machine utilization.
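Johnson's rule as described can be sketched compactly (assuming `times[j] = (t_j1, t_j2)` holds the processing times of job j on the two machines):

```python
def johnson(times):
    """Johnson's rule for the two-machine flowshop.

    times[j] = (t_j1, t_j2): processing times of job j on machines 1 and 2.
    Returns an optimal job order for the two-machine case.
    """
    front, back = [], []
    # examine jobs in increasing order of their smallest processing time
    for job in sorted(range(len(times)), key=lambda j: min(times[j])):
        if times[job][0] <= times[job][1]:
            front.append(job)      # minimum on machine 1: as early as possible
        else:
            back.insert(0, job)    # minimum on machine 2: as late as possible
    return front + back
```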
d. Ham heuristic
The Ham heuristic divides the execution time matrix into two
sub-matrices and, based on them, generates two solutions, of which the best
is retained. Each sub-matrix contains m/2 columns, where m is the number of
machines: the first sub-matrix contains the first m/2 machines and the second
sub-matrix contains the last m/2 machines. For each sub-matrix, the sum of
execution times of each job i is calculated, giving these two sums:

P_i1 = Σ_{j=1}^{m/2} t_ij

P_i2 = Σ_{j=m/2+1}^{m} t_ij
e. Palmer’s heuristic
The Palmer heuristic involves assigning a weight to each machine and then
calculating a weighted sum for each job. To start the process, the problem of
scheduling n jobs on m machines with a processing time matrix noted t is
considered. The weight of each job is evaluated using the formula below:

f(i) = Σ_{j=1}^{m} (m − 2j + 1) t_ij

The jobs are then sorted in ascending order of their weights f(i), and a
sequence is formed based on this sorting.
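Palmer's slope index can be sketched in a few lines (the function name is an illustrative assumption):

```python
def palmer(times):
    """Palmer's slope index: f(i) = sum over j of (m - 2j + 1) * t_ij,
    with jobs sequenced in ascending order of f(i)."""
    m = len(times[0])

    def slope(i):
        # j runs from 1 to m in the formula; the Python index is j - 1
        return sum((m - 2 * j + 1) * times[i][j - 1] for j in range(1, m + 1))

    return sorted(range(len(times)), key=slope)
```

Jobs with small times on early machines and large times on late machines get small weights and are scheduled first.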
f. CDS heuristic
The CDS heuristic, named after Campbell, Dudek and Smith, is a
constructive approach that generalizes Johnson's rule. It derives m − 1
artificial two-machine problems from the original m-machine problem: for the
k-th problem, the processing time of a job on the first artificial machine is the
sum of its times on the first k machines, and on the second artificial machine
the sum of its times on the last k machines. Johnson's algorithm is applied to
each of these two-machine problems, each application yields a complete
sequence, and the sequence with the smallest makespan on the original problem
is retained.
g. Gupta heuristic
Gupta's heuristic assigns to each job i the weight

f(i) = sign(t_i1 − t_im) / min_{1≤j≤m−1} (t_ij + t_i(j+1))

The jobs are then sorted in ascending order of their weights, and a
sequence is formed based on this sorting.
h. PRSKE heuristic
PRSKE is another heuristic that orders jobs by assigning a weight to each
job using the following formula:

f(i) = AVG_i + STD_i + abs(SKE_i)
where AVG_i is the average of the processing times of job i, STD_i stands for
their standard deviation, and abs(SKE_i) for the absolute value of their
skewness.
i. Artificial heuristic
This heuristic algorithm works by iteratively reducing the number of
machines considered in the problem until we are left with a single machine. At
each iteration, we compute a weight matrix that assigns a weight to each
operation based on its position in the processing sequence. We then use the
Johnson's algorithm, which is a well-known heuristic for the flow shop
scheduling problem, to compute a feasible sequence of operations for the
reduced problem. We evaluate the quality of the sequence using a cost function
that computes the completion time of the jobs on the reduced set of machines.
We repeat the process with the reduced problem until we are left with a single
machine. At the end, we return the best solution found.
3. Tests
We conducted tests for each implemented heuristic on the first and
seventh instance of the following Taillard benchmarks:
- 20 jobs and 5 machines.
- 50 jobs and 10 machines.
- 100 jobs and 10 machines.
The quality of the results for each heuristic was evaluated using the
deviation metric, which measures the difference between the makespan of the
heuristic solution and the best known solution for a given instance. It is
represented as:
quality = (Cmax − Upper Bound) / Upper Bound
a. First instance
The table provided below presents a summary of the test results for the
initial instance of each benchmark. The test results offer a comprehensive
overview of the performance and outcomes obtained from evaluating the
algorithms on these specific instances.
b. Seventh instance
The table provided below presents a summary of the test results for the
seventh instance of each benchmark. The test results offer a comprehensive
overview of the performance and outcomes obtained from evaluating the
algorithms on these specific instances.
4. Discussion and Analysis
After conducting experiments with all of the heuristics mentioned above,
we have determined that heuristics can provide solutions of reasonable quality
within a reasonable amount of time. Additionally, heuristics can serve as a useful
initialization step for subsequent use with Branch & Bound or metaheuristic
algorithms, as we will explore in the next chapter.
Heuristics are often used in real-world applications due to their speed and
practicality, even though they may not always yield optimal solutions.
Additionally, the choice of which heuristic to use can depend on the specific
problem and its constraints, as different heuristics may perform better in
different scenarios.
Chapter 05 : Local search based
Metaheuristics
1. Overview
Local search metaheuristics are a class of optimization algorithms that use
local search methods to explore the search space and find good solutions to
optimization problems. These algorithms start with an initial solution and then
iteratively improve it by making small changes and evaluating the new solution.
The goal is to find the best solution within a given time or iteration limit. These
algorithms are often used for solving combinatorial optimization problems.
Compared to heuristics, local search metaheuristics are more effective for
solving the flowshop problem because they can escape from local optima and
find better solutions. Heuristics are simpler algorithms that make use of
domain-specific knowledge to generate solutions. While they can be effective in
some cases, they often get stuck in suboptimal solutions and cannot explore the
search space as thoroughly as local search metaheuristics.
a. Random walk
Random walk is the simplest local search strategy: at each iteration, a
random neighbor of the current solution is generated and accepted
unconditionally, regardless of its quality, while the best solution encountered
so far is recorded.
b. Hill climbing
Hill climbing is another simple metaheuristic optimization algorithm that
iteratively improves a candidate solution. The algorithm starts with an initial
solution 𝑆0 and then repeatedly evaluates neighboring solutions and moves to
the best neighboring solution that improves the objective function until a local
optimum is reached. There are several types of hill climbing algorithms,
including:
- Simple hill climbing: This is the most basic type of hill climbing
algorithm. It chooses the first neighbor solution that is better than the
current solution. The search stops either when no further improvements
can be made or when the stop condition is met.
- Steepest-ascent hill climbing: This variant of hill climbing generates
all neighboring solutions and selects the best one that improves the
objective function the most. It is more computationally expensive than
simple hill climbing, but can converge to better solutions.
- Stochastic hill climbing: To add variation in the search process, this
variant of hill climbing selects a random neighbor from all the neighbors
that outperform the current solution, rather than choosing the best one.
This randomness aids in avoiding local optima and promoting a more
comprehensive search of the solution space.
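The steepest-ascent variant can be sketched as follows; a minimal illustration using a swap neighborhood (an assumption here, as other neighborhoods such as insertion are equally valid):

```python
def hill_climbing(times, start):
    """Steepest-ascent hill climbing over the swap neighborhood."""
    n_machines = len(times[0])

    def cmax(seq):
        comp = [0] * n_machines
        for j in seq:
            comp[0] += times[j][0]
            for m in range(1, n_machines):
                comp[m] = max(comp[m], comp[m - 1]) + times[j][m]
        return comp[-1]

    current, cur_val = list(start), cmax(start)
    while True:
        best_neighbor, best_val = None, cur_val
        # evaluate every swap neighbor and keep the most improving one
        for i in range(len(current)):
            for j in range(i + 1, len(current)):
                neighbor = list(current)
                neighbor[i], neighbor[j] = neighbor[j], neighbor[i]
                val = cmax(neighbor)
                if val < best_val:
                    best_neighbor, best_val = neighbor, val
        if best_neighbor is None:
            return current, cur_val  # local optimum: no swap improves
        current, cur_val = best_neighbor, best_val
```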
c. Simulated annealing
Simulated annealing is a metaheuristic optimization algorithm inspired by
the process of annealing in metallurgy. The algorithm starts with an initial
solution and a high temperature, and then iteratively generates a new solution by
making a small perturbation to the current solution. The acceptance of the new
solution is determined by the probability function, which depends on the
difference in the objective function value and the current temperature. The
temperature is gradually reduced over time according to a cooling schedule,
which determines the rate at which the temperature decreases. The algorithm
terminates when the final temperature is reached.
The effectiveness of the algorithm depends on the choice of the cooling
schedule and the initial temperature, as well as the perturbation strategy used to
generate new solutions.
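A minimal sketch with a swap perturbation and a geometric cooling schedule is shown below; the parameter defaults are illustrative, not the tuned values reported later:

```python
import math
import random

def simulated_annealing(times, t0=300.0, t_final=1.0, alpha=0.95, seed=0):
    """Simulated annealing with a swap perturbation and geometric cooling."""
    rng = random.Random(seed)
    n, n_machines = len(times), len(times[0])

    def cmax(seq):
        comp = [0] * n_machines
        for j in seq:
            comp[0] += times[j][0]
            for m in range(1, n_machines):
                comp[m] = max(comp[m], comp[m - 1]) + times[j][m]
        return comp[-1]

    current = list(range(n))
    rng.shuffle(current)
    cur_val = cmax(current)
    best, best_val = list(current), cur_val
    temp = t0
    while temp > t_final:
        # perturb the current solution by swapping two random positions
        i, j = rng.sample(range(n), 2)
        cand = list(current)
        cand[i], cand[j] = cand[j], cand[i]
        delta = cmax(cand) - cur_val
        # always accept improvements; accept worse moves with
        # probability exp(-delta / temp), which shrinks as temp drops
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, cur_val = cand, cur_val + delta
            if cur_val < best_val:
                best, best_val = list(current), cur_val
        temp *= alpha  # geometric cooling schedule
    return best, best_val
```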
d. Tabu Search
Tabu search is another popular metaheuristic optimization algorithm that
is used to find high-quality solutions to combinatorial optimization problems.
The algorithm is based on the idea of using a "tabu list" to keep track of
previously visited solutions and prevent the algorithm from revisiting them, in
order to encourage exploration of new parts of the search space. The tabu list is
a memory structure that stores information about recently visited solutions, such
as the moves that were made to reach them. These moves are "tabu" or forbidden
for a certain number of iterations, in order to prevent the algorithm from
revisiting solutions that have already been explored. This encourages the
algorithm to explore new parts of the search space and avoid getting stuck in
local optima.
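A minimal sketch of tabu search over swap moves, with a fixed-tenure tabu list and an aspiration criterion (a tabu move is accepted only when it improves on the global best); the move representation and parameters are illustrative choices:

```python
from collections import deque

def tabu_search(times, start, iterations=50, tenure=5):
    """Tabu search over swap moves with a fixed-tenure tabu list."""
    n, n_machines = len(start), len(times[0])

    def cmax(seq):
        comp = [0] * n_machines
        for j in seq:
            comp[0] += times[j][0]
            for m in range(1, n_machines):
                comp[m] = max(comp[m], comp[m - 1]) + times[j][m]
        return comp[-1]

    current = list(start)
    best, best_val = list(current), cmax(current)
    tabu = deque(maxlen=tenure)  # recently applied (i, j) swap moves
    for _ in range(iterations):
        move, neigh, neigh_val = None, None, float("inf")
        for i in range(n):
            for j in range(i + 1, n):
                cand = list(current)
                cand[i], cand[j] = cand[j], cand[i]
                val = cmax(cand)
                # aspiration: a tabu move is allowed only if it beats the best
                if (i, j) in tabu and val >= best_val:
                    continue
                if val < neigh_val:
                    move, neigh, neigh_val = (i, j), cand, val
        if neigh is None:
            break  # every move is tabu and none improves the best solution
        current = neigh
        tabu.append(move)  # forbid repeating this swap for `tenure` iterations
        if neigh_val < best_val:
            best, best_val = list(current), neigh_val
    return best, best_val
```

Note that the best non-tabu neighbor is taken even when it is worse than the current solution, which is what lets the search climb out of local optima.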
e. VNS
Variable Neighborhood Search is another local search algorithm for
solving combinatorial optimization problems. It starts with an initial solution
and iteratively explores the search space by generating neighboring solutions and
evaluating their quality. Unlike other local-search metaheuristics, VNS uses
multiple search neighborhoods of increasing size and diversity to escape from
local optima. At each iteration, the algorithm selects a random neighborhood
and applies a perturbation to escape the current local minimum. The search then
continues in the new neighborhood until a better solution is found or a stopping
criterion is met.
3. Tests
The tests for local search metaheuristics were divided into two parts, each
focusing on generating the initial solution using different methods. The purpose
of this division was to compare the performance of the metaheuristic algorithms
based on the initial solutions obtained.
In the first part of the tests, the initial solution was randomly generated
whereas in the second part, the initial solution was generated using the NEH
heuristic.
a. Random initialization
The following table summarizes the test results on the first and seventh
instances of the first benchmark. The initial solution was generated randomly.
b. NEH initialization
The following table summarizes the test results on the first and seventh
instances of the first benchmark. The initial solution was generated by the NEH
heuristic.
c. Hyperparameters tuning
Hyperparameters play a critical role in the performance of local search
metaheuristics, as they can significantly impact the resulting solution quality and
computation time. Selecting appropriate values for hyperparameters, such as the
initial temperature and cooling rate for Simulated Annealing, can be crucial in
achieving optimal results. Additionally, tuning hyperparameters can involve a
trade-off between solution quality and computation time, as choosing certain
hyperparameters may lead to better solutions but also result in longer execution
times. Below are some tests to elaborate on this point.
i. Simulated annealing
Here are the results obtained using an initial temperature of 300, a final
temperature of 1, an alpha value of 0.1, and random insertion as the
neighboring method.
ii. VNS
Here are the results obtained using a maximum number of iterations equal
to 400, and k max equal to 6.
Chapter 06 : Population based
Metaheuristics
1. Overview
Population-based metaheuristics, such as genetic algorithms, differential
evolution, particle swarm optimization, and ant colony optimization, are a class
of optimization algorithms that can be applied to a wide range of problems.
These algorithms generate a set of candidate solutions and iteratively update the
population by applying a set of operators.
One key characteristic that sets population-based metaheuristics apart
from local search methods is their ability to explore a larger portion of the
search space. Local search algorithms start from a single solution and iteratively
improve upon it by searching the neighborhood of the current solution.
However, these methods can easily get stuck in a local optimum, preventing the
algorithm from finding a better solution that may be located in a different region
of the search space.
On the other hand, population-based metaheuristics maintain a diverse set
of solutions and explore different regions of the search space by using various
operators to create new solutions. However, they tend to require more
computation time compared to local search algorithms. This is because they
need to maintain and update a population of candidate solutions, and perform
operations such as crossover and mutation, which can be computationally
expensive.
2. Genetic algorithm
One of the most well-known population-based metaheuristics is the
genetic algorithm (GA) which simulates the process of natural selection and
evolution to search for the optimal solution. At each iteration, a population of
candidate solutions is evaluated based on a fitness function, and then the most fit
solutions are selected for reproduction. The reproduction is performed by
combining the selected solutions through crossover and mutation operators to
create new solutions. These new solutions replace the least fit solutions in the
population, and the process is repeated until a termination criterion is met.
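The loop described above can be sketched as follows; a minimal illustration with tournament selection, order crossover and swap mutation, where the choice of operators and all parameter values are illustrative assumptions:

```python
import random

def genetic_algorithm(times, pop_size=20, generations=50, mut_rate=0.2, seed=0):
    """GA sketch: tournament selection, order crossover (OX), swap mutation,
    and elitist replacement of the population."""
    rng = random.Random(seed)
    n, n_machines = len(times), len(times[0])

    def cmax(seq):
        comp = [0] * n_machines
        for j in seq:
            comp[0] += times[j][0]
            for m in range(1, n_machines):
                comp[m] = max(comp[m], comp[m - 1]) + times[j][m]
        return comp[-1]

    def tournament(pop):
        # the fittest of three random individuals wins
        return min(rng.sample(pop, 3), key=cmax)

    def order_crossover(p1, p2):
        # copy a slice of p1, fill the rest with p2's genes in p2's order
        a, b = sorted(rng.sample(range(n), 2))
        middle = p1[a:b]
        rest = [g for g in p2 if g not in middle]
        return rest[:a] + middle + rest[a:]

    population = []
    for _ in range(pop_size):
        perm = list(range(n))
        rng.shuffle(perm)
        population.append(perm)

    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:
            child = order_crossover(tournament(population), tournament(population))
            if rng.random() < mut_rate:  # swap mutation
                i, j = rng.sample(range(n), 2)
                child[i], child[j] = child[j], child[i]
            offspring.append(child)
        # elitism: keep the fittest individuals of parents and offspring
        population = sorted(population + offspring, key=cmax)[:pop_size]

    best = min(population, key=cmax)
    return best, cmax(best)
```

Order crossover is used because it always produces a valid permutation, which an ordinary one-point crossover would not.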
One advantage of GA over other population-based metaheuristics is its
ability to handle multiple objectives simultaneously, known as multi-objective
optimization. GA can search for a set of solutions that optimize multiple
objectives, rather than a single optimal solution. Additionally, GA can handle
various types of decision variables, such as binary, integer, and real-valued
variables.
Ant Colony Optimization (ACO) builds schedules by simulating artificial
ants that construct job sequences guided by pheromone trails and heuristic
information. By leveraging the collective intelligence of the ant population,
ACO can effectively explore the solution space of the FSP and potentially
discover high-quality schedules that minimize the makespan or other objective
criteria.
5. Tests
We conducted tests on the shared instance (10 jobs 5 machines) as well as
the first, second, fifth and seventh instances of the first benchmark of Taillard
instances (20 jobs and 5 machines) to show the performance of the different
metaheuristics that we implemented. The results are presented below:
4. Discussion and Analysis
After conducting experiments with all the population-based metaheuristics
mentioned earlier, we have observed that these approaches can produce
promising results. Among them, the hybrid algorithm (GA combined with VNS)
stood out by achieving the
best makespan outcomes. However, it is important to note that this algorithm
required more computational time compared to other metaheuristics and
heuristics. Additionally, it is worth mentioning that the effectiveness of these
methods is also influenced by the selection and tuning of hyperparameters. For
the hybrid algorithm, a good compromise would be to exclude the VNS from the
iterations, keeping only the GA, but to pass the highest quality individual that
the GA returns through a VNS.
Chapter 07 : Discussion
In conclusion, we have explored various optimization techniques for the
flowshop scheduling problem, ranging from exact algorithms such as Branch and
Bound, to heuristic methods like NEH, local search metaheuristics such as VNS, and
population-based metaheuristics like GA. Each method has its own strengths and
weaknesses, and the choice of method depends on the specific problem instance
and constraints. While exact methods provide optimal solutions, they may be
computationally infeasible for large problem instances. Heuristic methods
provide good solutions within reasonable time limits, but may get stuck in local
optima. Local search metaheuristics are more effective for solving the
flowshop problem because they can escape from local optima and find better
solutions. On the other hand, Population-based methods explore a larger portion
of the search space and can provide better solutions but at the expense of higher
computational time. Overall, it is important to carefully select and adapt
optimization methods based on the specific problem requirements and
constraints.