Solving Sudoku's with C++

What is Sudoku?

A Sudoku is a puzzle which consists of a 9x9 grid. The rule is that you need to enter the numbers 1 through 9, but each row, column, and square, may only contain each number once. So it is not possible to use the same number twice for any given row, column or square. 


A Sudoku has exactly one unique solution, but this could not always be the case. When there is a 'bad' puzzle, it could contain more solutions, or none at all.

Format for Puzzles

A convenient format for a Sudoku puzzle, is this puzzle string format. Each of the cell in the grid is translated to a 1D array which consists of a total of 81 characters. If a square is empty, this is denoted with a zero (0).

An exampe of a puzzle string for the puzzle above:

530070000600195000098000060800060003400803001700020006060000280000419005000080079

The grid

The puzzle will be stored in a 1D array which contains 81 integers. The value 0 indicates that the cell is empty, and the values 1 through 9 indicate which number is used for the cell. The grid is defined as:

int grid[81];


We also need a small utility function which will load the grid from the puzzle string, which is the following function:

void LoadPuzzle(std::string puzzleString) {
    // Check if the puzzle string has the correct size.
    if (puzzleString.size() != 81)
        throw std::invalid_argument("invalid puzzle.");
    for (int i = 0; i < 81; i++)
        grid[i] = puzzleString[i] - 48; // 48 is the ASCII code for 0.
}

Checking rows

For any given cell, we know that the y-coordinate is the row that must be checked. If we let r be the row, the we can index each of the cells in the row with: {r, r + 1, r + 2, r + 3, ..., r + 8}. We also want to keep track of the numbers that are used, when a number is used twice, the row is invalid. The following function checks if a row is valid:

bool IsValidRow(int r) {
    bool digits[10] = { false, false, false, false, false,
                        false, false, false, false, false };
    for (int i = 9 * r; i < 9 * r + 9; i++)  // i is in the index in the grid for row r.
        if (!digits[grid[i]] || !grid[i])
            digits[grid[i]] = true;
        else
            return false;
    return true;
}

Checking columns

This is the same as for the row, but now the x-coordinate is used to indicate which column is checked. If we let c be the column, then we can index each of the cells in column with: {c, c + 1*9, c + 2*9, ..., c + 8*9}. The following functions checks if a column is valid:

bool IsValidColumn(int c) {
    bool digits[10] = { false, false, false, false, false,
                        false, false, false, false, false };
    for (int i = c; i < 81; i += 9)  // i is the index in the grid for column c.
        if (!digits[grid[i]] || !grid[i])
            digits[grid[i]] = true;
        else
            return false;
    return true;
}

Encoding squares

There is no such simple formula for indexing the cells in a square. To make this easier, a lookup array will be used which will return an array with the indices of the cells for a square. 

int squares[9][9] = {
    {0,  1,  2,  9,  10, 11, 18, 19, 20},    // square 1
    {3,  4,  5,  12, 13, 14, 21, 22, 23},    // square 2
    {6,  7,  8,  15, 16, 17, 24, 25, 26},    // square 3
    {27, 28, 29, 36, 37, 38, 45, 46, 47},    // square 4
    {30, 31, 32, 39, 40, 41, 48, 49, 50},    // square 5
    {33, 34, 35, 42, 43, 44, 51, 52, 53},    // square 6
    {54, 55, 56, 63, 64, 65, 72, 73, 74},    // square 7
    {57, 58, 59, 66, 67, 68, 75, 76, 77},    // square 8
    {60, 61, 62, 69, 70, 71, 78, 79, 80}     // square 9
};

Checking squares

With the lookup array in place, checking the squares goes very similar to checking a row. The following function will check if a square s is valid.

bool IsValidSquare(int s) {
    bool digits[10] = { false, false, false, false, false,
                        false, false, false, false, false };
    for (int i = 0; i < 9; i++) {  // i is the index in the square lookup for square s.
        int index = squares[s][i];
        if (!digits[grid[index]] || !grid[index])
            digits[grid[index]] = true;
        else
            return false;
    }
    return true;
}

Partial valid solution

We can now create a function that will check if the grid contains a valid partial solution, with partial meaning that there are still zero's in the grid. For any given cell c, we check if the numbers appear only once in the row, column, and square.

bool IsValidPartialSolution() {
    // Check all the rows, columns, and squares to have exactly
    // one number of 1..9.
    for (int i = 0; i < 9; i++)
        if (!IsValidRow(i) || !IsValidColumn(i) || !IsValidSquare(i))
            return false;
    return true;
}

Backtracking search procedure

The core of the algorithm is the backtracking search procedure. The following illustration visualizes the algorithm:


Here we search for an empty cell (= 0), and then try each value in {1, 2, ..., 9}. If the pick creates a valid partial solution, we will recursively call the search procedure, and move on to the next cell. When the recursive function returns, we will also unset the number that was picked. By undoing the picked number, the algorithm will backtrack when the partial solution becomes invalid. 

void Search(int k=0) {
    if (k >= 81)
        return PrintSolution();
    if (grid[k] == 0) {  // Check if the cell is empty.
        for (int i = 1; i <= 9; i++) {  // Try each value in 1..9.
            grid[k] = i;   // Set the value of the cell.
            if(IsValidPartialSolution())
                Search(k + 1); // Continue searching with this value.
            grid[k] = 0;   // Remove the value (backtrack).
        }
    }
    else Search(k + 1);  // Cell is already filled, go to the next cell.
}

Example 

Now the algorithm is completed, we can test it. We will solve the puzzle that is shown above. With the following command, the solver is called:

.\sudosolve.exe <puzzle string>

The program gives the following output: 

-- Sudoku Solver --

PuzzleString:
530070000600195000098000060800060003400803001700020006060000280000419005000080079

Puzzle:
5 3 0 0 7 0 0 0 0
6 0 0 1 9 5 0 0 0
0 9 8 0 0 0 0 6 0
8 0 0 0 6 0 0 0 3
4 0 0 8 0 3 0 0 1
7 0 0 0 2 0 0 0 6
0 6 0 0 0 0 2 8 0
0 0 0 4 1 9 0 0 5
0 0 0 0 8 0 0 7 9

Solution 1:
5 3 4 6 7 8 9 1 2
6 7 2 1 9 5 3 4 8
1 9 8 3 4 2 5 6 7
8 5 9 7 6 1 4 2 3
4 2 6 8 5 3 7 9 1
7 1 3 9 2 4 8 5 6
9 6 1 5 3 7 2 8 4
2 8 7 4 1 9 6 3 5
3 4 5 2 8 6 1 7 9

Executed in 32 milliseconds


It has found the correct solution for the puzzle.

Improvements

There are many improvements that can be made to increase the speed of the algorithm. One such a thing, would be to keep track of the allowable picks for each cell. Then when we pick a number, we remove the pick from all the cells in row, column, and square. If it happens to be that a cell has no picks left, we have an invalid partial solution. This is also called inference, because we infer some information after making a choice. It is possible to infer some information because of the constraints of the puzzle. 

However, to keep the example simple, no such optimizations have been made to the algorithm. The following puzzle string will take about 170 seconds to solve, which makes it a good candidate for a more optimized algorithm.

000000010000002003000400000000000500401600000007100000050000200000080040030910000

A more efficient method to solve this problem is to formulate it as an exact cover problem, and solve it with Knuth's Algorithm X. It is also implemented with the dancing links data structure, which is a lot faster than using a matrix, because the matrix would become very sparse. More information can be found at the following links.

Download

The program can be downloaded from GitHub.

Source

The complete source code is available on GitHub.