maandag 10 januari 2011

A Sudoku solving algorithm

When I make long trips by using the train I often take a newspaper to try to make the long travel less boring. Unfortunately the contents of the newspaper is not always very interesting, so I end up at the back of the newspaper rather quickly. Luckily that section always contains a lot of puzzles. The most interesting, and time consuming, of them all are usually the Sudoku puzzles. I often solve them without too much effort, but sometimes one of the puzzles is really hard and I am unable to solve it. So I have done what every self respecting programmer would have done: I wrote a algorithm to solve Sudoku puzzles. Now of course there are already a lot of websites that have Sudoku solving algorithms, but where is the fun in using a premade solution?

My algorithm is heavily based on how humans solve Sudoku puzzles. Basically what the algorithms does for each of the 81 tiles is marking whether a number is possible on that space or not. This can be visualized as a 9x9x9 volume of boolean values where the width and length of the volume lie parallel to the Sudoku puzzle and the depth signifies the numbers 1 to 9. This will be called the state volume. In the state volume a value of true means that a number is possible at that tile and false means it is not.


The algorithm works as follows:

  1. Create an empty 9x9 grid, this is called the solved numbers.
  2. Create a 9x9x9 state volume with all elements set to possible.
  3. Put the given numbers of the Sudoku puzzle in a container, these are the new numbers.
  4. Use the new numbers to mark elements in the state volume as not possible. Transfer the new numbers to the correct position in the solved numbers grid.
  5. Search the state volume for new numbers and put them in the container, if they are not already in the solved numbers grid. 
  6. If there are new numbers then go to step 4.
  7. The solved numbers grid contains the solved Sudoku.

Now the most interesting steps are of course step 4 and step 5, because how should we mark the elements as not possible and how should we search for the new numbers? The rules of a Sudoku puzzle say that a number should only appear once in a row, once in a column and once in a square. We also want to mark the elements in the state volume in this way.

This image shows how the fourth slice of the state volume is affected when it is marked by the number 4 in the middle of the puzzle.


The tiles that are now marked yellow cannot contain the number 4. There are also elements that are marked in the state volume that are not shown in this image, namely the elements directly above and below the center of the fourth slice. This is because the number 4 now occupies the center tile on the Sudoku puzzle and this means it is impossible for any other number to exist there.

Searching happens in the same four ways that the volume is marked. Along the rows, along the columns, in a square and in the depth of the volume. In the case of searching along the rows, the algorithm checks for each slice and for each row whether there is only one element in the row on that slice that is still possible. If this is the case then the column and row number become the position on the Sudoku and the depth in the volume becomes the number at that location. Along the columns is almost the same as along the rows of course, just switch the two words. Searching in a square happens for each slice and for each of the 9 3x3 squares. Inside a square at that slice the algorithm checks if only one element is still possible. Searching in the depth happens for each column and for each row.

When I finished coding it and started testing the algorithm I noticed that the algorithm solved the puzzles within the blink of an eye. Also note that the running time of the algorithm, and any other Sudoku solving algorithm for that matter, depends on the number of tiles that are not given. This is of course assuming that the size of the puzzle stays constant. Unfortunately when I tried to solve one of the Sudoku’s on Wikipedia that are marked “Hardest”. The algorithm didn’t work. This was a bit disappointing and I think that due to the fact that the algorithm is based on the methods that humans use when solving Sudoku’s, but I already have a good idea of how to extend the algorithm so it might even solve these puzzles.

Oh, before I forget. The source code below (in C++) simply uses a 9x9 grid as the container for the new numbers. This way I don’t have to make an extra data type to remember the positions on the Sudoku grid. However I do have to search over this entire grid every time I want to mark the elements in the state volume. It is not a big overhead though.


 

Geen opmerkingen:

Een reactie posten