Solving Sudoku Using Artificial Intelligence
Eric Pass
BitBucket: https://bitbucket.org/ecp89/aipracticumproject Demo: https://youtu.be/-7Mv2_UlsAs
Background
Overview
Sudoku problems are some of the most recognizable and popular puzzle games played around the
world. A standard Sudoku puzzle is a two dimensional square grid containing 81 cells which are well
distributed in 9 rows and 9 columns. The grid is further divided into 9 smaller squares called units which
contain 9 cells each. (Zhai and Zhang) For the discussion in this paper, we will denote a puzzle of size or
order N to mean a Sudoku puzzle that has N columns, N rows and N units.
The rules of Sudoku are an extension of the rules used in Latin Square puzzles. (Soedarmadji and
McEliece) The rules for a general N by N Sudoku puzzle are:
1. All of the N*N squares must be assigned a single value from 1 to N.
2. Each row must contain each number from 1 to N exactly once.
3. Each column must contain each number from 1 to N exactly once.
4. Each unit must contain each number from 1 to N exactly once.
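The four rules above can be checked mechanically. The following is a minimal sketch, assuming the grid is stored as an int[][] and that N is a perfect square so that each unit is a square block of side sqrt(N); the class and method names are hypothetical and are not taken from the project.

```java
final class SudokuRules {
    /** Returns true if a fully filled n-by-n grid obeys rules 1 to 4. */
    static boolean isValidSolution(int[][] grid) {
        int n = grid.length;
        int b = (int) Math.round(Math.sqrt(n)); // assumption: units are b-by-b squares
        if (b * b != n) return false;
        for (int i = 0; i < n; i++) {
            boolean[] row = new boolean[n + 1];
            boolean[] col = new boolean[n + 1];
            boolean[] unit = new boolean[n + 1];
            for (int j = 0; j < n; j++) {
                int ur = (i / b) * b + j / b; // row of the j-th cell of unit i
                int uc = (i % b) * b + j % b; // column of the j-th cell of unit i
                if (!mark(row, grid[i][j]) || !mark(col, grid[j][i]) || !mark(unit, grid[ur][uc])) {
                    return false;
                }
            }
        }
        return true;
    }

    /** Records v as seen; fails if v is out of range (rule 1) or repeated (rules 2 to 4). */
    private static boolean mark(boolean[] seen, int v) {
        if (v < 1 || v >= seen.length || seen[v]) return false;
        seen[v] = true;
        return true;
    }
}
```

Each pass of the outer loop checks row i, column i and unit i at the same time, so the whole check visits 3*N*N cells.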
Even though the rules of Sudoku are incredibly simple, the task of correctly completing a partially
filled Sudoku puzzle is exceedingly difficult. Felgenhauer and Jarvis calculate in their paper
“Enumerating Possible Sudoku Grids” that there are 6.671 x 10^21 valid completed 9 by 9 Sudoku grids.
(Felgenhauer and Jarvis) Since the state space is so enormous, considerable effort has to go into
developing algorithms which will yield a correct solution in a reasonable amount of computing time.
Figure 1: A 9 by 9 Sudoku Puzzle with a unit highlighted in light blue.
Complexity of Sudoku
In 2002, Yato and Seta at The University of Tokyo proved that Sudoku is NP-Complete by reducing the
Latin square completion problem to Sudoku. (Yato and Seta) Because Sudoku is NP-Complete, it can be
reduced in polynomial time to any other NP-Complete problem. Two common and natural reductions are
reducing Sudoku to an exact cover problem and reducing Sudoku to a Boolean satisfiability problem.
Sudoku as an Exact Cover
Donald Knuth in his “Dancing Links” paper explains the exact cover problem very intuitively. The
problem statement is: Given a matrix of 0s and 1s, does it have a set of rows containing exactly one 1 in
each column? For example, the matrix shown in Figure 2
has such a set (rows 1, 4, and 5). (Knuth) Sudoku can be reduced to an exact cover problem; for a 9 by 9
puzzle, the matrix has 729 rows and 324 columns. Each row of this matrix encodes the assignment of one
value to one cell, identified by its row and column. Since, for a 9 by 9 puzzle, each of the 9 possible
values could occupy any of the 9 columns and any of the 9 rows, we have a total of 9*9*9 = 729
rows in this matrix. The columns of this matrix encode the four constraints of the problem: that each cell
must be assigned a single value between 1 and 9 and every row, column and unit must contain exactly one
instance of each number between 1 and 9. This means there is a total of 4*9*9 = 324 columns in this
matrix. Once Sudoku has been reduced to an exact cover problem, one can use various algorithms like
Donald Knuth’s “Algorithm X” which utilize the “Dancing Links” technique to solve these problems.
Figure 2: An example exact cover problem matrix from Donald Knuth’s paper “Dancing Links.”
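To make the 729 by 324 count concrete, the sketch below builds the exact cover matrix for an order-n puzzle. The column ordering (cell constraints first, then rows, columns and units) is one common convention chosen here for illustration; the names ExactCoverIndex, columnsFor and buildMatrix are hypothetical, and square units of side sqrt(n) are assumed.

```java
final class ExactCoverIndex {
    /** The four constraint columns covered by placing value v at row r, column c (all 0-based). */
    static int[] columnsFor(int r, int c, int v, int n) {
        int b = (int) Math.round(Math.sqrt(n)); // assumption: units are b-by-b squares
        int unit = (r / b) * b + c / b;
        int nn = n * n;
        return new int[] {
            r * n + c,             // cell (r, c) holds exactly one value
            nn + r * n + v,        // row r contains value v
            2 * nn + c * n + v,    // column c contains value v
            3 * nn + unit * n + v  // the unit contains value v
        };
    }

    /** Builds the 0/1 matrix with n^3 candidate rows and 4n^2 constraint columns. */
    static boolean[][] buildMatrix(int n) {
        boolean[][] m = new boolean[n * n * n][4 * n * n];
        for (int r = 0; r < n; r++)
            for (int c = 0; c < n; c++)
                for (int v = 0; v < n; v++)
                    for (int col : columnsFor(r, c, v, n))
                        m[(r * n + c) * n + v][col] = true;
        return m;
    }
}
```

Every row of the matrix contains exactly four 1s, one per constraint family, and every column contains exactly n 1s. A solved puzzle corresponds to choosing n*n rows that together cover each column once.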
Sudoku as a SAT Problem
Sudoku can just as easily be encoded as a Boolean Satisfiability Problem (SAT). In order to
represent Sudoku as a SAT problem, one will need a considerable number of propositional variables. We
will denote S_{i,j,z} to be assigned true if and only if the cell at row i and column j is assigned value z. This
means for a 9 by 9 Sudoku puzzle we will need 729 different propositional variables, the same as the
number of rows in the Exact Cover reduction. Adding the constraints that each cell must contain a single
value from 1 to 9 and that each row, column and unit must contain exactly one instance of each value
from 1 to 9, the size of both the minimal and extended encoding can be shown to be O(n^4) where n is
the order of the Sudoku puzzle. (Lynce and Ouaknine) As explained by Lynce and Ouaknine in their paper
“Sudoku as a SAT Problem,” the minimal encoding for a Sudoku puzzle requires that each cell contains
at least one value and that each value appears at most once in each row, column and unit.
By encoding Sudoku as a SAT problem, one can use various SAT solving algorithms such as the
famous Davis-Putnam-Logemann-Loveland (DPLL) algorithm to solve a Sudoku puzzle.
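The O(n^4) growth can be seen by generating the clauses directly. The sketch below follows my reading of the minimal encoding (at least one value per cell; each value at most once per row, column and unit) and emits clauses as DIMACS-style integer arrays. The variable numbering and all names are assumptions made for illustration, and square units of side sqrt(n) are assumed.

```java
import java.util.ArrayList;
import java.util.List;

final class MinimalSatEncoding {
    /** 1-based variable id for "cell (i, j) holds value z"; i, j, z are 0-based. */
    static int var(int i, int j, int z, int n) {
        return (i * n + j) * n + z + 1;
    }

    static List<int[]> clauses(int n) {
        int b = (int) Math.round(Math.sqrt(n)); // assumption: units are b-by-b squares
        List<int[]> out = new ArrayList<>();
        // Each cell holds at least one value: n^2 clauses of n literals.
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                int[] atLeastOne = new int[n];
                for (int z = 0; z < n; z++) atLeastOne[z] = var(i, j, z, n);
                out.add(atLeastOne);
            }
        }
        // Each value appears at most once per row, column and unit:
        // one binary clause for every pair of cells in the same group.
        for (int z = 0; z < n; z++) {
            for (int g = 0; g < n; g++) {
                for (int a = 0; a < n; a++) {
                    for (int k = a + 1; k < n; k++) {
                        out.add(new int[] {-var(g, a, z, n), -var(g, k, z, n)}); // row g
                        out.add(new int[] {-var(a, g, z, n), -var(k, g, z, n)}); // column g
                        out.add(new int[] {-unitVar(g, a, z, n, b), -unitVar(g, k, z, n, b)});
                    }
                }
            }
        }
        return out;
    }

    /** Variable for the idx-th cell of the given unit holding value z. */
    private static int unitVar(int unit, int idx, int z, int n, int b) {
        return var((unit / b) * b + idx / b, (unit % b) * b + idx % b, z, n);
    }
}
```

For n = 9 this yields 81 + 3 * 81 * 36 = 8,829 clauses. The pairwise clauses dominate: there are 3 * n^2 * n(n-1)/2 of them, which is where the n^4 term comes from.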
Objectives
The objective of this project was to develop an interactive Sudoku solver generalized to handle N by
N Sudoku puzzles in addition to creating a powerful and intuitive graphical user interface (GUI) that can
be used to compare the efficiency of different Sudoku solving algorithms against a wide array of Sudoku
puzzles differing in size and complexity. Overall, I met these goals by creating five different Sudoku
solvers, all of which can handle N by N Sudoku puzzles, although they might not produce a solution in a
reasonable amount of time. Each of the solvers can be tested interactively with a GUI I adapted and
enhanced from Gilbert Le Blanc. While using the GUI, the user can manually enter in the starting clues,
load a puzzle from the database of a given difficulty, or generate a new puzzle specifying how many clues
the puzzle should contain. Over the course of this project, I was able to experiment with different design
patterns such as Singleton, Strategy, and MVC, as well as implement features that I had not originally
planned, such as a database to load and save puzzles. Additionally, my project can be run from the
command line without a GUI as a tool to provide data, such as the number of nodes explored and
the time it took to find the solution, in order to analyze the performance of an AI against a large sample of
Sudoku puzzles. Overall, my project, which spanned 27 classes for a total of 2351 lines of code, turned out
to be an enjoyable exploration into larger scale software development and an investigation into techniques
to solve Sudoku puzzles utilizing artificial intelligence.
Software Written
Overview
I utilized the Model View Controller design pattern to organize this project and create a robust GUI.
There are four main packages: model, view, controller, and solvers. The state of the Sudoku puzzle is
maintained in the model package by two classes: SudokuPuzzle and SudokuCell. The model
package also contains the implementation of the Singleton used to interface with the database in addition
to other classes and enumerations such as PuzzleEntity, which is a class abstracting a puzzle record in the
database, and SudokuPuzzleSize, which is an enumeration of supported sizes for Sudoku puzzles
encapsulating drawing logic per puzzle size. Most interestingly, the model package contains the
implementation of the puzzle generator, whose implementation and performance will be discussed
later in this paper.
The controller and view packages contain all the code related to presenting and updating the Sudoku
puzzle. These two packages are entirely GUI oriented and were written using Java Swing. If the reader
is interested in these details, they are encouraged to view the actual code via the BitBucket link listed on the
cover page of this paper.
The last package is solvers, which is the home for my implementation of the five different Sudoku
solvers. The solvers I implemented are all general backtracking algorithms and each will be discussed
individually in terms of implementation and performance in the “Solvers” section. In the solvers package,
I utilized the Strategy design pattern to create an abstract class, SudokuSolver, which each
solver implementation is required to extend. The three abstract methods in SudokuSolver are the two
methods for solving a Sudoku puzzle with and without a GUI to update, and a function to return the name
of the solver.
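The shape of such an abstract class might look like the sketch below. The method names and signatures are hypothetical since the paper does not list them; a plain int[][] stands in for SudokuPuzzle and a Runnable stands in for the GUI callback. TrivialSolver exists only to show how a concrete strategy plugs in.

```java
/** Hypothetical stand-in for the project's SudokuSolver; names and signatures are assumptions. */
abstract class SudokuSolverSketch {
    /** Solves the puzzle in place without touching any GUI. */
    abstract boolean solve(int[][] grid);

    /** Solves the puzzle in place, invoking onUpdate whenever the display should refresh. */
    abstract boolean solve(int[][] grid, Runnable onUpdate);

    /** Human-readable name shown when the user picks a solver. */
    abstract String getName();
}

/** A deliberately trivial strategy: succeeds only if the grid has no empty (0) cells. */
final class TrivialSolver extends SudokuSolverSketch {
    @Override
    boolean solve(int[][] grid) {
        return solve(grid, () -> { }); // headless run: the callback does nothing
    }

    @Override
    boolean solve(int[][] grid, Runnable onUpdate) {
        for (int[] row : grid)
            for (int v : row)
                if (v == 0) return false;
        onUpdate.run();
        return true;
    }

    @Override
    String getName() {
        return "Trivial";
    }
}
```

Because callers depend only on the abstract type, the GUI and the command line runner can swap one solver for another without any other code changing.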
The Sudoku Puzzle Abstraction
For my implementation of a Sudoku puzzle, I chose to use a two dimensional array of SudokuCell
objects, stored in a private field of SudokuPuzzle called cells. The general abstraction function states
that given a puzzle P where r denotes the row and c denotes the column, the value of the cell P_{r,c},
where r=0 and c=0 is the top-left cell, is represented by the SudokuCell at cells[c][r] whose value is
equal to P_{r,c}. If the value of P_{r,c} is not known then the value of the SudokuCell at cells[c][r] is
set to 0. The SudokuCell object also keeps track of whether this cell was a given value in the puzzle and
encapsulates all the logic necessary to present the cell in the GUI.
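A stripped-down version of this abstraction is sketched below. It keeps the column-first indexing cells[c][r] and the convention that 0 means unknown, but omits all drawing logic; the class and method names are hypothetical rather than the project's own.

```java
/** Hypothetical stand-in for SudokuCell, without any GUI logic. */
final class CellSketch {
    int value;      // 0 means the value is not yet known
    boolean given;  // true if the value was one of the starting clues
}

/** Hypothetical stand-in for SudokuPuzzle. */
final class PuzzleSketch {
    private final CellSketch[][] cells; // indexed [column][row], as described in the text

    PuzzleSketch(int n) {
        cells = new CellSketch[n][n];
        for (int c = 0; c < n; c++)
            for (int r = 0; r < n; r++)
                cells[c][r] = new CellSketch();
    }

    int valueAt(int r, int c) { return cells[c][r].value; }

    boolean isGiven(int r, int c) { return cells[c][r].given; }

    /** Records a starting clue at row r, column c. */
    void setGiven(int r, int c, int v) {
        cells[c][r].value = v;
        cells[c][r].given = true;
    }

    /** Assigns a value during solving; refuses to overwrite a starting clue. */
    boolean assign(int r, int c, int v) {
        if (cells[c][r].given) return false;
        cells[c][r].value = v;
        return true;
    }
}
```

Hiding the [c][r] order behind accessors that take (r, c) keeps the transposed indexing in one place.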
Choosing a two dimensional array to represent a Sudoku puzzle was a natural abstraction. This
implementation made it very easy and efficient to traverse and interact with the model. However, because
I chose to use a two dimensional array, I was unable to utilize Donald Knuth’s Dancing Links technique,
which requires an exact cover matrix whose cells are all connected in doubly linked lists. If I had
implemented the model as a matrix of doubly linked lists, I could have utilized Donald Knuth’s
Algorithm X which is an efficient recursive, non-deterministic, depth-first, backtracking algorithm for
solving the exact cover problem.
The Database and Puzzle Generation
Building a Sudoku puzzle generator was a surprisingly difficult task. My initial hope was that there
would exist a general Sudoku puzzle generator which could generate puzzles of a given difficulty and
size. Unfortunately, such an instrument does not exist. Stephen Ostermiller’s qqwing was the closest tool
available. Qqwing is a very robust Sudoku generator and solver; however, it is limited in that it can only
generate 9 by 9 Sudoku puzzles. One very nice feature of qqwing is that it can generate puzzles of a
specified difficulty. This is important because some Sudoku puzzles are inherently harder to solve than
others and it is generally difficult to tell if a given puzzle will be trivial or hard to solve.
Qqwing grades a puzzle on the basis of how many different Sudoku solving techniques are needed to
solve that given puzzle. These techniques are used by advanced Sudoku players to solve puzzles by hand.
Qqwing’s grading system is based on the premise that the more techniques needed to solve a puzzle, the
harder the puzzle is to solve. An interesting concept discussed in Ansotegui, et al’s paper “Generating
Highly Balanced Sudoku Problems as Hard Problems” is that certain properties, such as how evenly the
holes of a puzzle are distributed, contribute to how hard a Sudoku puzzle is to solve. (Ansotegui, et al)
Unfortunately, my generator neither counts the number of different techniques needed to solve the
puzzle nor considers properties such as singly, doubly and fully balanced Sudoku puzzles. It opts
instead for the naive approach of randomly generating a solved Sudoku puzzle and then inserting holes
into the puzzle arbitrarily. While this approach does not yield any insight into the difficulty of the puzzle,
my generator can generate random 16 by 16 Sudoku puzzles and qqwing cannot.
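The naive approach described above can be sketched in a few lines. This is an illustration under assumptions, not the project's generator: it fills an empty grid by backtracking with a shuffled value order, then blanks randomly chosen cells until the requested number of clues remains. It makes no attempt to guarantee a unique solution, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class NaiveGenerator {
    /** Returns an n-by-n puzzle with exactly `clues` filled cells (0 marks a hole). */
    static int[][] generate(int n, int clues, Random rng) {
        int[][] g = new int[n][n];
        if (!fill(g, 0, rng)) throw new IllegalStateException("no solved grid found");
        List<Integer> positions = new ArrayList<>();
        for (int p = 0; p < n * n; p++) positions.add(p);
        Collections.shuffle(positions, rng);
        for (int k = 0; k < n * n - clues; k++) { // punch holes at arbitrary positions
            int p = positions.get(k);
            g[p / n][p % n] = 0;
        }
        return g;
    }

    /** Solves the empty grid cell by cell, trying values in random order. */
    private static boolean fill(int[][] g, int pos, Random rng) {
        int n = g.length;
        if (pos == n * n) return true;
        int r = pos / n, c = pos % n;
        List<Integer> values = new ArrayList<>();
        for (int v = 1; v <= n; v++) values.add(v);
        Collections.shuffle(values, rng);
        for (int v : values) {
            if (fits(g, r, c, v)) {
                g[r][c] = v;
                if (fill(g, pos + 1, rng)) return true;
                g[r][c] = 0;
            }
        }
        return false;
    }

    /** True if v does not already appear in the row, column or unit of (r, c). */
    static boolean fits(int[][] g, int r, int c, int v) {
        int n = g.length, b = (int) Math.round(Math.sqrt(n)); // assumption: square units
        for (int i = 0; i < n; i++) {
            if (g[r][i] == v || g[i][c] == v) return false;
            if (g[(r / b) * b + i / b][(c / b) * b + i % b] == v) return false;
        }
        return true;
    }
}
```

Passing a seeded Random makes the output reproducible, which is convenient when the same puzzles must be replayed against several solvers.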
Generating Sudoku puzzles takes a non-trivial amount of computational resources because in order to
create a new puzzle, you must first solve an empty puzzle. This was the primary motivation for creating a
database to store generated puzzles. The existence of a database also makes it easy to compare
different solvers because you can ensure that they are facing the same puzzles. My database is a MySQL
database with two tables, PUZZLES and RUNS. I interact with the database through the
DatabaseManangerSingleton class which resides in the model package.
Solvers
Overview
For this project, I implemented five different solvers, all of which are backtracking algorithms. From
least sophisticated to most sophisticated, I implemented a simple backtracking algorithm, a backtracking
algorithm with forward checking, a backtracking algorithm using the minimum remaining values
heuristic, a backtracking algorithm with a probabilistic heuristic, and
finally a backtracking algorithm using constraint propagation. The only algorithm that performed
superbly was the constraint propagation algorithm, which was able to solve an empty 25 by 25 Sudoku
puzzle, which none of the other algorithms could manage.
Presentation of Results
Each algorithm was tested on a 2014 MacBook Pro with a 2.8 GHz Intel Core i7 processor and 16 GB
of memory. Each algorithm faced the exact same puzzles in the same order. Each algorithm was presented
with 200 simple, 200 easy, 200 intermediate and 400 expert Sudoku puzzles generated and graded by
qqwing. The maximum number of nodes a solver was allowed to evaluate for any given puzzle was
10,000,000. If a solution was not found before reaching 10,000,000 nodes explored, the search was
aborted and the solver moved on to the next puzzle.
Simple Backtracking
This algorithm unsurprisingly fared miserably. Simple backtracking could only solve the most basic
of 9 by 9 puzzles and was almost always orders of magnitude higher in terms of nodes explored than
the other algorithms.
As we can see from the two graphs, it was rare for simple backtracking to return a solution to a problem
of any difficulty in under 10,000,000 nodes explored. My implementation of this simple
backtracking algorithm, which is consistent with Russell and Norvig’s backtracking algorithm presented
on page 215 of the third edition of Artificial Intelligence: A Modern Approach, enumerates all the possible
values for a cell and for each assignment checks to see if this does not violate any of the constraints. If the
assignment does violate a constraint, then the enumeration continues until either a valid assignment is
found or all possibilities are exhausted. If a valid assignment is not identified, then the algorithm
backtracks to the previous assignment and the enumeration process continues from that previous node.
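The procedure described above, together with the 10,000,000 node cap, can be sketched as follows. This is a minimal illustration and not the project's code: it treats every attempted assignment as one node, which is an assumption about how nodes were counted, and the class and method names are hypothetical.

```java
final class SimpleBacktracking {
    private final long maxNodes;
    private long nodes;

    SimpleBacktracking(long maxNodes) {
        this.maxNodes = maxNodes;
    }

    long nodesExplored() {
        return nodes;
    }

    /** Solves g in place (0 marks an empty cell); returns false if unsolved or over budget. */
    boolean solve(int[][] g) {
        nodes = 0;
        return search(g, 0);
    }

    private boolean search(int[][] g, int pos) {
        int n = g.length;
        while (pos < n * n && g[pos / n][pos % n] != 0) pos++; // skip filled cells
        if (pos == n * n) return true;
        int r = pos / n, c = pos % n;
        for (int v = 1; v <= n; v++) {                // enumerate every possible value
            if (++nodes > maxNodes) return false;     // node budget exhausted: abort
            g[r][c] = v;
            if (consistent(g, r, c) && search(g, pos + 1)) return true;
            g[r][c] = 0;                              // undo and try the next value
        }
        return false;                                 // all values failed: backtrack
    }

    /** True if the value at (r, c) is not repeated in its row, column or unit. */
    static boolean consistent(int[][] g, int r, int c) {
        int n = g.length, b = (int) Math.round(Math.sqrt(n)); // assumption: square units
        for (int i = 0; i < n; i++) {
            int ur = (r / b) * b + i / b, uc = (c / b) * b + i % b;
            if (i != c && g[r][i] == g[r][c]) return false;
            if (i != r && g[i][c] == g[r][c]) return false;
            if ((ur != r || uc != c) && g[ur][uc] == g[r][c]) return false;
        }
        return true;
    }
}
```

When the budget runs out, each level of the recursion resets its own cell before returning, so the grid is handed back in its original state.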
The fundamental problem with this algorithm is that it fails to use all of the information available. In
visiting the exact same number of nodes, the simple backtracking algorithm can be dramatically improved
by finding all the valid assignments of this cell and only trying those instead of checking to see if a given
assignment is valid. This is exactly what is done in the forward checking backtracking solver.
Overall, simple backtracking was able to solve 69% of simple puzzles, 66% of easy puzzles, 65% of
intermediate puzzles, and 68% of expert puzzles, exploring an average of 2,516,820 nodes and
taking an average of 4,807,164 nanoseconds.
Figure 3: Graph showing time vs. difficulty for simple backtracking algorithm.
Figure 4: Graph showing number of nodes explored vs. difficulty for simple backtracking algorithm.
Forward Checking Solver
The first improvement made to the simple backtracking solver was to employ forward checking. This
solver is almost the exact same as the simple backtracking solver except that it only assigns values to cells
it knows will preserve the consistency of the puzzle instead of checking each possible assignment. Like
the simple backtracking algorithm, if a correct assignment does not exist, the algorithm backtracks to the
previous cell.
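A sketch of this variant follows. It implements forward checking in the sense used in this paper: the legal values for a cell are computed up front and only those are tried. Textbook forward checking additionally prunes the domains of unassigned neighbors after each assignment, which this sketch does not do. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ForwardCheckingSketch {
    /** Values that do not yet appear in the row, column or unit of the empty cell (r, c). */
    static List<Integer> candidates(int[][] g, int r, int c) {
        int n = g.length, b = (int) Math.round(Math.sqrt(n)); // assumption: square units
        boolean[] used = new boolean[n + 1];
        for (int i = 0; i < n; i++) {
            used[g[r][i]] = true;
            used[g[i][c]] = true;
            used[g[(r / b) * b + i / b][(c / b) * b + i % b]] = true;
        }
        List<Integer> out = new ArrayList<>();
        for (int v = 1; v <= n; v++) if (!used[v]) out.add(v);
        return out;
    }

    /** Solves g in place; 0 marks an empty cell. */
    static boolean solve(int[][] g) {
        return search(g, 0);
    }

    private static boolean search(int[][] g, int pos) {
        int n = g.length;
        while (pos < n * n && g[pos / n][pos % n] != 0) pos++; // skip filled cells
        if (pos == n * n) return true;
        int r = pos / n, c = pos % n;
        for (int v : candidates(g, r, c)) { // only values known to keep the grid consistent
            g[r][c] = v;
            if (search(g, pos + 1)) return true;
        }
        g[r][c] = 0; // no candidate worked: clear the cell and backtrack
        return false;
    }
}
```

The candidate list is computed once per cell while that cell is still empty, so no per-value consistency check is needed inside the loop.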
As a result of the improvement of forward checking, many more Sudoku puzzles can be solved;
however, we are still unable to solve slightly less than 10% of the puzzles at each difficulty. Furthermore,
we see a speedup of a little more than 25% in the time to find a solution when compared to simple
backtracking.
Figure 5: Graph showing time vs. difficulty for forward checking solver.
Figure 6: Graph showing number of nodes explored vs. difficulty for forward checking solver.
Minimum Remaining Values Heuristic
The first heuristic I investigated was the minimum remaining values heuristic (abbreviated MVR in this
paper; elsewhere it is more commonly written MRV). This heuristic
orders the search to check the puzzle from the most constrained cells, the cells with the fewest possible
legal assignments, to the most unconstrained cells. By selecting the most constrained cell, you are
effectively picking the variable that is most likely to cause a failure, thereby pruning the search tree.
(Russell and Norvig, 216) I implemented MVR using a PriorityQueue where the cells with the fewest
possible assignments had the highest priority.
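The cell selection can be sketched with java.util.PriorityQueue as below. This is an illustration under assumptions rather than the project's implementation: the queue is rebuilt at every step because candidate counts change after each assignment, each queue entry is a hypothetical {row, column, candidateCount} triple, and the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class MvrSketch {
    /** Values not yet used in the row, column or unit of the empty cell (r, c). */
    static List<Integer> candidates(int[][] g, int r, int c) {
        int n = g.length, b = (int) Math.round(Math.sqrt(n)); // assumption: square units
        boolean[] used = new boolean[n + 1];
        for (int i = 0; i < n; i++) {
            used[g[r][i]] = true;
            used[g[i][c]] = true;
            used[g[(r / b) * b + i / b][(c / b) * b + i % b]] = true;
        }
        List<Integer> out = new ArrayList<>();
        for (int v = 1; v <= n; v++) if (!used[v]) out.add(v);
        return out;
    }

    /** Returns {row, column, candidateCount} of the most constrained empty cell, or null. */
    static int[] mostConstrainedCell(int[][] g) {
        PriorityQueue<int[]> queue =
                new PriorityQueue<>(Comparator.comparingInt((int[] e) -> e[2]));
        for (int r = 0; r < g.length; r++)
            for (int c = 0; c < g.length; c++)
                if (g[r][c] == 0) queue.add(new int[] {r, c, candidates(g, r, c).size()});
        return queue.peek(); // fewest remaining values has the highest priority
    }

    /** Backtracking search that always branches on the most constrained cell. */
    static boolean solve(int[][] g) {
        int[] cell = mostConstrainedCell(g);
        if (cell == null) return true; // no empty cells left
        int r = cell[0], c = cell[1];
        for (int v : candidates(g, r, c)) {
            g[r][c] = v;
            if (solve(g)) return true;
        }
        g[r][c] = 0;
        return false;
    }
}
```

A cell with zero candidates is selected first, so a dead end is detected immediately instead of after filling unrelated cells.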
The performance of the MVR solver was surprisingly good. It was able to solve all the
simple puzzles, 98% of the easy puzzles and 97% of the intermediate and expert puzzles. More
impressively, MVR cut the number of nodes it needed to visit in half and
produced a speedup of 25% when compared to forward checking.
Figure 7: Graph showing time vs. difficulty for MVR solver.
Figure 8: Graph showing number of nodes explored vs. difficulty for MVR solver.
Probabilistic Heuristic
The second heuristic I investigated was a probabilistic heuristic in which, out of the
valid assignments for a cell, the solver enumerates from the least frequently assigned
value to the most frequently assigned value in this puzzle. The logic for this is that you know that for
an N by N puzzle you need exactly N instances of every number from 1 to N. Therefore, it is more
likely that the assignment of this cell falls into a less frequently assigned value than a more
frequently assigned value.
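The value ordering can be sketched as a single function that counts how often each value already appears in the grid and sorts a cell's candidates by that count. The class and method names are hypothetical, and ties are left in their original order, which is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class RarityOrdering {
    /** Returns the candidates sorted from least to most frequently assigned in the grid. */
    static List<Integer> orderByRarity(int[][] g, List<Integer> candidates) {
        int[] freq = new int[g.length + 1]; // freq[0] counts empty cells and is ignored
        for (int[] row : g)
            for (int v : row)
                freq[v]++;
        List<Integer> ordered = new ArrayList<>(candidates);
        // List.sort is stable, so equally frequent values keep their original order.
        ordered.sort(Comparator.comparingInt((Integer v) -> freq[v]));
        return ordered;
    }
}
```

For example, in a 4 by 4 grid where 1 has been placed three times, 2 once, and 3 and 4 not at all, the candidates [1, 2, 3, 4] are tried in the order [3, 4, 2, 1]. The frequency table costs an extra pass over the grid, or extra state to maintain incrementally, at every node.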
As Figures 9 and 10 show, this algorithm did not perform any better, and may be a little worse, than the
forward checking solver. Due to its additional memory needs and complexity, one cannot justify
using this over forward checking.
Figure 9: Graph showing time vs. difficulty for probabilistic solver.
Figure 10: Graph showing number of nodes explored vs. difficulty for probabilistic solver.
Constraint Propagation
The last solver I implemented utilized constraint propagation. This was by far the most technically
challenging solver, with many small details that are easy to overlook. My implementation, in addition to
using constraint propagation, used the MVR heuristic. The heart of this algorithm is again a PriorityQueue
which contains the current valid assignments for every cell, with priority given to those cells which have
the fewest possible assignments. If this queue is empty, then we have found a solution to the puzzle. If the
queue is not empty, then we poll the first cell off the queue and begin to enumerate through its valid
assignments. It is important to capture the state of the puzzle before starting the enumeration so that it
can be restored later. After selecting a valid assignment for this
cell, you then traverse the puzzle and get the new possible assignments for all the cells that were affected
by this assignment. You then iterate through each new assignment making sure that there is no cell that
has been left without a valid assignment. If one is identified, you restore the state to before the original
assignment and continue the enumeration with the next value. If an affected cell is left with only one valid
assignment, you then assign that value to that cell and collect the updates from that assignment. You
continue this way until there are no more assignments or you reach a cell with no valid assignment.
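The steps above can be sketched as follows. This is a simplified illustration, not the project's solver: for brevity it picks the most constrained cell with a linear scan instead of a PriorityQueue, snapshots the whole grid rather than only the affected cells, and recomputes candidates on demand. All names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class PropagationSketch {
    static boolean solve(int[][] g) {
        int n = g.length, bestR = -1, bestC = -1, bestSize = Integer.MAX_VALUE;
        for (int r = 0; r < n; r++)
            for (int c = 0; c < n; c++)
                if (g[r][c] == 0 && candidates(g, r, c).size() < bestSize) {
                    bestR = r; bestC = c; bestSize = candidates(g, r, c).size();
                }
        if (bestR < 0) return true;                       // no empty cell: solved
        for (int v : candidates(g, bestR, bestC)) {
            int[][] saved = new int[n][];                 // capture the old state
            for (int i = 0; i < n; i++) saved[i] = g[i].clone();
            if (assignAndPropagate(g, bestR, bestC, v) && solve(g)) return true;
            for (int i = 0; i < n; i++) System.arraycopy(saved[i], 0, g[i], 0, n); // restore
        }
        return false;
    }

    /** Assigns v at (r, c), then keeps assigning cells left with exactly one candidate. */
    static boolean assignAndPropagate(int[][] g, int r, int c, int v) {
        int n = g.length, b = (int) Math.round(Math.sqrt(n));
        Deque<int[]> pending = new ArrayDeque<>();
        pending.add(new int[] {r, c, v});
        while (!pending.isEmpty()) {
            int[] a = pending.poll();
            if (g[a[0]][a[1]] != 0) {
                if (g[a[0]][a[1]] != a[2]) return false;  // conflicting forced values
                continue;
            }
            if (!candidates(g, a[0], a[1]).contains(a[2])) return false;
            g[a[0]][a[1]] = a[2];
            for (int i = 0; i < n; i++) {                 // visit every affected cell
                int[][] peers = {{a[0], i}, {i, a[1]},
                        {(a[0] / b) * b + i / b, (a[1] / b) * b + i % b}};
                for (int[] p : peers) {
                    if (g[p[0]][p[1]] != 0) continue;
                    List<Integer> cand = candidates(g, p[0], p[1]);
                    if (cand.isEmpty()) return false;     // a cell has no valid value left
                    if (cand.size() == 1) pending.add(new int[] {p[0], p[1], cand.get(0)});
                }
            }
        }
        return true;
    }

    /** Values not yet used in the row, column or unit of the empty cell (r, c). */
    static List<Integer> candidates(int[][] g, int r, int c) {
        int n = g.length, b = (int) Math.round(Math.sqrt(n)); // assumption: square units
        boolean[] used = new boolean[n + 1];
        for (int i = 0; i < n; i++) {
            used[g[r][i]] = true;
            used[g[i][c]] = true;
            used[g[(r / b) * b + i / b][(c / b) * b + i % b]] = true;
        }
        List<Integer> out = new ArrayList<>();
        for (int v = 1; v <= n; v++) if (!used[v]) out.add(v);
        return out;
    }
}
```

Copying the full grid at every branch is wasteful for large puzzles; a real implementation would more likely record only the cells changed by the propagation and undo those.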
The performance of this algorithm was astounding, as it was able to solve every 9 by 9 puzzle
effortlessly, averaging only 2,433 nodes explored and taking on average only 1,478,335 nanoseconds
to complete. Also, this algorithm was able to solve 16 by 16 puzzles better than all the other algorithms
and it was the only algorithm able to solve an empty 25 by 25 grid.
Final Thoughts
This project was a fun exploration into attempting to solve the very hard and ubiquitous
Sudoku puzzle. My entire project is versatile as it can be used to test Sudoku solving algorithms,
draw and experiment with large Sudoku puzzles, as well as play and solve Sudoku problems by
hand. My five algorithms overall did a good job and the last solver I implemented, the constraint
propagation solver, is incredibly exciting. The overall performance of my algorithms is
depicted in the three figures below.
Figure 11: Graph showing time vs. difficulty for constraint propagation solver.
Figure 12: Graph showing number of nodes explored vs. difficulty for constraint propagation solver.
Figure 13: Graph showing percentage solved for each solver subdivided by difficulty of the puzzle.
Figure 14: Graph showing average number of nodes explored while solving a puzzle.
References
Ansotegui C, Bejar R, Fernandez C, Gomes C, Mateu C. Generating Highly Balanced Sudoku Problems As Hard Problems. J Heuristics 2011;17:589-614.
Felgenhauer B, Jarvis F. Enumerating Possible Sudoku Grids. Dresden, Germany: TU Dresden. Sheffield, UK: University of Sheffield June 20, 2005.
Knuth DE. Dancing Links. arXiv:cs.DS 1, 1-26. 11-15-2000.
Lynce I, Ouaknine J. Sudoku as a SAT Problem. Lisbon, Portugal: Technical University of Lisbon. Oxford, UK: Oxford University. 2015.
Russell, Stuart J, Peter Norvig, and Ernest Davis. Artificial Intelligence: A Modern Approach. Upper Saddle River, NJ: Prentice Hall, 2010. Print.
Soedarmadji E, McEliece R. Iterative Decoding for Sudoku and Latin Square Codes. Pasadena, CA: California Institute of Technology; 2007.
Yato T, Seta T. Complexity and Completeness of Finding Another Solution and its Application to Puzzles. Tokyo, Japan: University of Tokyo; 2005.
Zhai G, Zhang J. Solving Sudoku Puzzles Based on Customized Information Entropy. International Journal of Hybrid Information Technology. Jan 2013;6(1):77-91.