
SUDOKU

DAA CASE STUDY

DEPARTMENT OF INFORMATION TECHNOLOGY
Academic Year: 2020-21 Class: TE SEM-II

Subject: Design And Analysis Of Algorithm

Topic : SUDOKU

Guide : Mrs. T. S. Pawar

Name : Kavita Dullabh Jadhav


Class : TE (IT)
Roll No : 24
SUDOKU

Abstract :
In this bachelor thesis three different Sudoku solving algorithms are studied. The study
is primarily concerned with solving ability, but also includes the following: difficulty rating,
puzzle generation ability, and suitability for parallelizing. These aspects are studied for
individual algorithms but are also compared between the different algorithms. The evaluated
algorithms are backtrack, rule-based and Boltzmann machines. Measurements are carried out by timing each solver on a database of 17-clue puzzles, with easier versions used for the Boltzmann machine. Results are presented as solving time distributions for every algorithm, and relations between the algorithms are also shown. We conclude that the rule-based algorithm is by far the most efficient when it comes to solving Sudoku puzzles. It is also shown that some correlation in difficulty rating exists between the backtrack and rule-based algorithms. Parallelization is applicable to all algorithms to a varying extent, with clear implementations for search-based solutions. Generation is shown to be suitable to implement using deterministic algorithms such as backtrack and the rule-based solver.

Introduction :
Sudoku is a game that has gained popularity in recent years. Many newspapers today contain Sudoku puzzles and there are even competitions devoted to Sudoku solving. It is therefore of interest to study how to solve, generate and rate such puzzles with the help of computer algorithms. This thesis explores these concepts for three chosen algorithms.

1. Details of the algorithmic strategy to which the problem belongs:


The algorithm uses four deterministic strategies, plus a random fallback, to determine a valid value for a cell on the Sudoku board. The strategies are the following:
Search for cells with only one possible value.
Search for rows with a unique cell for a specific digit.
Search for columns with a unique cell for a specific digit.
Search for "squares" with a unique cell for a specific digit.
Pick a value randomly from the available valid values for a cell.
The program applies the first four strategies in a loop. These four strategies have proven to be sufficient to solve any Sudoku puzzle with one solution; in other words, if a Sudoku puzzle has only one solution, these four strategies are enough to find it. If the Sudoku puzzle has more than one solution, the last strategy is needed: pick a random value for the cell. A sketch of the first two strategies is given below.
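
As a rough illustration, the first two strategies could be sketched in Python as follows. The board representation (a 9x9 list of lists with 0 marking an empty cell) and the candidates() helper are assumptions made for this sketch, not the report's actual code:

    def candidates(board, r, c):
        # digits that can legally be placed at cell (r, c)
        used = set(board[r]) | {board[i][c] for i in range(9)}
        br, bc = 3 * (r // 3), 3 * (c // 3)
        used |= {board[br + i][bc + j] for i in range(3) for j in range(3)}
        return set(range(1, 10)) - used

    def naked_singles(board):
        # strategy 1: cells where only one value is possible
        for r in range(9):
            for c in range(9):
                if board[r][c] == 0:
                    cand = candidates(board, r, c)
                    if len(cand) == 1:
                        yield r, c, cand.pop()

    def hidden_singles_in_rows(board):
        # strategy 2: rows where a digit fits in exactly one cell
        for r in range(9):
            for d in range(1, 10):
                spots = [c for c in range(9)
                         if board[r][c] == 0 and d in candidates(board, r, c)]
                if len(spots) == 1:
                    yield r, spots[0], d

The column and "square" variants are symmetric; a solver would apply these rules in a loop until none fires, and only then fall back to the random pick.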
2. Problem statement in detail:
There are multiple algorithms for solving Sudoku puzzles. This report is limited to the study of three different algorithms, each representing a different solving approach. Primarily, the focus is to measure and analyze them according to their solving potential. However, there are also other aspects that will be covered in this thesis: difficulty rating, Sudoku puzzle generation, and how well the algorithms are suited for parallelizing. The goal of this thesis is to conclude how well each of the algorithms performs in these aspects and how they relate to one another. Another goal is to see if any general conclusions regarding Sudoku puzzles can be drawn. The evaluated algorithms are backtrack, rule-based and Boltzmann machines. All algorithms, with their respective implementation issues, are further discussed in section 2 (background).

3. Algorithm or code:

Boltzmann Machine Algorithm :


The concept of Boltzmann machines is gradually introduced by beginning with the
neuron, network of neurons and finally concluding with a discussion on simulation techniques.
The central part of an artificial neural network (ANN) is the neuron, as pictured in figure 2.1.
A neuron can be considered a single computation unit. It begins by summing up all weighted inputs and thresholding the value against some constant threshold θ. Then a transfer function is applied, which sets the binary output if the input value is over the limit.
In the case of Boltzmann machines the activation function is stochastic and the probability of
a neuron being active is defined as follows:
p(i = on) = 1 / (1 + e^(−ΔE_i / T))

where ΔE_i is the summed-up energy input of the whole network into neuron i, which is fully connected to all other neurons. A neural network is simply a collection of nodes interconnected in some way.
All weights are stored in a weight matrix describing the connections between all the neurons. T is a temperature constant that controls the rate of change, through the probability p(i = on), during simulation. ΔE_i is defined as follows [9]:

ΔE_i = Σ_j w_ij · s_j − θ
where s_j is a binary value set if neuron j is in an active state, which occurs with probability p(j = on), and w_ij are the weights between the current node and node j. θ is a constant offset used to control the overall activation.
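
As a minimal numeric sketch of the two formulas above (the function names and the list-of-lists weight matrix are illustrative assumptions):

    import math

    def delta_energy(weights, states, i, theta):
        # ΔE_i = Σ_j w_ij · s_j − θ, with binary states s_j in {0, 1}
        return sum(w * s for w, s in zip(weights[i], states)) - theta

    def activation_probability(delta_e, temperature):
        # p(i = on) = 1 / (1 + e^(−ΔE_i / T))
        return 1.0 / (1.0 + math.exp(-delta_e / temperature))

A node whose activation would violate many constraints accumulates a large negative ΔE_i, so its probability of switching on falls toward zero as the temperature drops.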
The state of every node and the associated weights describe the entire network and encode the problem to be solved. In the case of Sudoku there is a need to represent all 81 grid values,
each having 9 possible values. The resulting 81∗9 = 729 nodes are fully connected and have a
binary state which is updated at every discrete time step. Some of these nodes will have
predetermined outputs since the initial puzzle will fix certain grid values and simplify the
problem. In order to produce valid solutions it is necessary to insert weights describing known
relations. This is done by inserting negative weights, making the interconnected nodes less
likely to fire at the same time, resulting in reduced probability of conflicts. Negative weights
are placed in rows, columns, boxes, and between nodes in the same square, since a single square
should only contain a single active digit. In order to produce a solution the network is simulated
in discrete time steps. For every step, all probabilities are evaluated and states are assigned
active with the given probability.
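
A sketch of how the weight encoding just described could be built, with node (r, c, d) meaning "digit d in cell (r, c)"; the −10/−20 magnitudes mirror the pseudocode in the next section and are otherwise arbitrary assumptions:

    def weight(a, b):
        # negative weight between two conflicting nodes, 0 otherwise
        (r1, c1, d1), (r2, c2, d2) = a, b
        if a == b:
            return 0
        if (r1, c1) == (r2, c2):
            return -20                    # two digits in the same cell
        same_box = (r1 // 3, c1 // 3) == (r2 // 3, c2 // 3)
        if d1 == d2 and (r1 == r2 or c1 == c2 or same_box):
            return -10                    # digit repeated in a row/column/box
        return 0

    # all 81 * 9 = 729 nodes, fully connected through weight()
    nodes = [(r, c, d) for r in range(9) for c in range(9) for d in range(1, 10)]
    weights = [[weight(a, b) for b in nodes] for a in nodes]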
Finally the grid is checked for conflicts; no conflicts implies a valid solution, which is gathered by inspecting which nodes are in an active state. Even though the procedure detailed above will eventually find a solution, there are enhanced techniques used in order to converge faster to a valid solution. The temperature, T, can be controlled over time and is used to adjust the rate of change in the network while still allowing larger state changes to occur. A typical scheme is simulated annealing [12]. By starting off with a high temperature (typically T_0 = 100) and gradually decreasing the value as time progresses, it is possible to reach a global minimum. Due to practical constraints it is not possible to guarantee a solution, but simulated annealing provides a good foundation and was used here. The temperature descent is described by the following function, where i is the current iteration:

T(i) = T_0 · exp(K_t · i)

K_t controls the steepness of the temperature descent and can be adjusted to make sure that low temperatures are not reached too early.
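
The descent itself is a one-liner. T_0 = 100 comes from the text above, while the K_t value here is only an assumed illustration:

    import math

    T_0 = 100.0      # typical starting temperature, as suggested above
    K_T = -0.0005    # steepness of the decline; assumed for illustration

    def temperature(i):
        # T(i) = T_0 · exp(K_t · i); K_t < 0 gives an exponential decline
        return T_0 * math.exp(K_T * i)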
The result section describes two different decline rates and their respective properties. There are some implications of using a one-pass temperature descent, which was chosen to fit the puzzles as well as possible. Typically, solutions are much less likely to appear in a Boltzmann machine before the temperature has been lowered to a critical level. This is due to the scaling of probabilities in the activation function. At a high temperature all probabilities are more or less equal, even though the energies are vastly different. With a low temperature the energy differences will be scaled up and produce a wider range of values, resulting in an increasing probability of ending up with fewer conflicts. This motivates the choice of an exponential decline in temperature over time, allowing solutions at lower temperatures to appear earlier. An overview of the Boltzmann machine is given here in pseudocode.

Code:

boltzmann(puzzle):
    temperature = T_0
    i = 0
    // encode the puzzle: zeroed weight matrix, then negative weights
    // between conflicting nodes
    nodes <- {0}
    for each square in puzzle:
        nodes[same row as square][square] = -10
        nodes[same column as square][square] = -10
        nodes[same box as square][square] = -10        // boxes are also constrained (see text)
        nodes[same square, other digit][square] = -20  // one digit per cell
    // iterate until a valid solution is found
    while checkSudoku(nodes) != VALID:
        // update the state of all nodes
        for each node in nodes:
            node.offset = calculateOffset(nodes, node)
            // stochastic activation: p = 1 / (1 + e^(−ΔE / T))
            probability = 1 / (1 + exp(-node.offset / temperature))
            node.active = rand() < probability
        // perform the temperature decline
        i++
        temperature = T_0 * exp(TEMP_DECLINE * i)
    return nodes

checkSudoku(nodes):
    // begin by building the Sudoku grid
    grid = {0}
    for each node in nodes:
        // for every square, keep the node with the largest offset
        if node.offset > offset of every other node in the same square:
            grid.add(node)
    // check the constraints on the grid
    if unique rows && unique columns && unique subsquares:
        return true
    return false

calculateOffset(nodes, selected):
    offset = 0
    // iterate over all nodes and sum the incoming weights;
    // many negative connections imply a large negative offset
    for each node in nodes:
        offset += nodes[node][selected]
    return offset
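
Combining the Python sketches from earlier in this section (delta_energy, activation_probability, weight and temperature), a complete annealed run might look roughly as below. conflict_free() is a hypothetical stand-in for checkSudoku, and the whole function is a sketch rather than the thesis implementation:

    import random

    def conflict_free(nodes, states):
        # hypothetical checker: one active digit per cell and no digit
        # repeated in any row, column or box
        active = [n for n, s in zip(nodes, states) if s]
        cells = {(r, c) for r, c, _ in active}
        if len(active) != 81 or len(cells) != 81:
            return False
        rows = {(r, d) for r, _, d in active}
        cols = {(c, d) for _, c, d in active}
        boxes = {(r // 3, c // 3, d) for r, c, d in active}
        return len(rows) == 81 and len(cols) == 81 and len(boxes) == 81

    def solve(nodes, weights, clues, theta=0.0, max_steps=50_000):
        states = [0] * len(nodes)
        for step in range(max_steps):
            t = temperature(step)              # annealing schedule above
            for i, (r, c, d) in enumerate(nodes):
                if (r, c) in clues:            # clamp the given cells
                    states[i] = 1 if clues[(r, c)] == d else 0
                else:
                    de = delta_energy(weights, states, i, theta)
                    states[i] = 1 if random.random() < activation_probability(de, t) else 0
            if conflict_free(nodes, states):
                return states                  # valid solution found
        return None                            # gave up within max_steps

clues here is assumed to be a dict mapping (row, column) to the given digit.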

4. Analysis:
In this section multiple results are presented, together with a discussion of how the results can be interpreted. Section 4.1 is devoted to presenting how the different algorithms perform. Section 4.2 shows how the algorithms perform relative to each other and discusses different aspects of comparison. Section 4.3 explores the idea of difficulty rating and the concept of some puzzles being inherently difficult. Section 4.4 compares the algorithms by how well they are suited for generation and parallelizing.

4.1 Time distributions
To get an idea of how each algorithm performs, it is suitable to plot solving times in a histogram. Another way of displaying the performance is to sort the solving times and plot puzzle index versus solving time. Both of these are of interest, since they can reveal different things about an algorithm's performance.

4.1.1 Rule-based solver
The rule-based solver was by far the fastest algorithm in the study, with a mean solving time of 0.02 seconds. Variation in solving time was also small, with a standard deviation of 0.02 seconds. It solved all 49151 17-clue puzzles in the puzzle database used for testing, and none of the puzzles resulted in an unstable measurement of solving time.

Figure 4.1 is a histogram on a logarithmic scale that shows how the rule-based solver performed over all test puzzles. It is observable that there is a quite small time interval within which most puzzles are solved. This is probably due to the use of logic rules with a polynomial time complexity. When the solver instead starts to use guessing, the time complexity changes to exponential, and it is therefore reasonable to believe that the solving time will then increase substantially. As will be seen in section 4.1.2, the backtrack algorithm has a similar behavior, which is also taken as a reason to believe that the rule-based solver starts to use guessing after the peak. Guessing might of course be used sparingly at or even before the peak, but the peak is thought to decrease as a result of a more frequent use of guessing.

Figure 4.2 shows the solver's time distribution among all 49151 puzzles; its bars represent half the time interval compared to figure 4.1. All puzzles were solved and none had an unstable measurement in running time. The confidence level for the measured solving times was 95% at an interval of 0.05 seconds.

Another way to visualize the result is shown in figure 4.3, which plots the puzzle indices, sorted by solving time, against their solving times. Note that the y-axis is a logarithmic scale of the solving time. As in figure 4.2, only a few puzzles had relatively high solving times. This figure also more clearly illustrates the idea explored above: namely, that the algorithm's solving time increases fast at a certain point. That point is, as mentioned, thought to be where the solver starts to rely more upon guessing than the logical rules. From that, it can be concluded that only a small portion of all Sudoku puzzles are difficult, in the sense that the logic rules the rule-based solver uses are not enough.
Puzzle difficulty :
Both the backtrack solver and the rule-based solver were executed on the same set of puzzles. One interesting aspect to study is whether some of those puzzles are difficult for both algorithms, or if the two are independent when it comes to which puzzles they perform well at. Even if the rule-based solver uses backtrack search as a last resort, it is not clear if the most difficult puzzles correlate between the two algorithms. The reason for this is that a puzzle can be very hard for the backtrack algorithm but still trivial for the rule-based solver. This has to do with the naked tuple rule in the rule-based solver, which can quickly reduce the number of candidates in each square.
To test for independence, the statistical method described in section 3.4.1 is used. The measurements show that about 20% of the worst 10% of puzzles are common to both algorithms. This means that some puzzles are inherently difficult regardless of which of the two algorithms is used. If that had not been the case, only 10% of the worst puzzles for one algorithm would have been among the 10% worst puzzles for the other algorithm. The statistical test also confirms this with a high confidence level, higher than 99.9%. While there is interest in correlating the results of the Boltzmann machine solver with the others, there are difficulties with doing this: considering the large variance in running time for individual puzzles, there is little room for statistical significance in the results.
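
The 20%-versus-10% overlap figure is easy to compute given per-puzzle solving times for two solvers (the input lists are assumed; no thesis data is reproduced here):

    def hardest(times, fraction=0.10):
        # indices of the hardest `fraction` of puzzles by solving time
        k = max(1, int(len(times) * fraction))
        return set(sorted(range(len(times)), key=times.__getitem__)[-k:])

    def shared_hard_fraction(times_a, times_b):
        # share of solver A's hardest puzzles that are also hardest for B;
        # about 0.10 under independence, about 0.20 in the report
        a, b = hardest(times_a), hardest(times_b)
        return len(a & b) / len(a)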

Conclusion :
Three different Sudoku solvers have been studied: backtrack search, the rule-based solver and Boltzmann machines. All solvers were tested, with statistically significant results being produced. They have been shown to be dissimilar to each other in terms of performance and general behavior.

Backtrack search and the rule-based solver are deterministic and form execution time distributions that are precise, with relatively low variance. Their execution time was also shown to have rather low variance when sampling the same puzzle repeatedly, which is believed to result from the highly deterministic behavior. Comparing the two algorithms leads to the conclusion that the rule-based solver performs better overall. There were some exceptions for certain puzzles, but overall solution times were significantly lower.

The Boltzmann machine solver was not capable of solving harder puzzles with fewer clues within a reasonable time frame. A suitable number of clues was found to be 46 with a 20 second execution time limit, resulting in vastly worse general capabilities than the other solvers. Due to the stochastic behavior that is a central part of the Boltzmann solver, there was a relatively large variance when sampling the execution time of a single puzzle. Another important aspect of the Boltzmann machine is the method of temperature descent, in this case selected to be simulated annealing with a single descent. This affected the resulting distribution of solving times in a way that makes the probability of puzzles being solved below a certain critical temperature limit high. The critical temperature was found to be about 0.5% of the starting temperature, with no puzzles being solved after this interval.
