In the last post we discussed how to extract the sudoku region from the captured frame of a live stream and then we applied perspective transform to the extracted region and then we slide a window through each cell and recognized the digits in the cells. Now in this post we discuss how to solve the sudoku puzzle using Backtracking algorithm. There are several approaches to solve the sudoku puzzle. For example, you can find some of them in this paper.
Though there are several approaches to solve it, let’s stick to the traditional backtracking algorithm to solve the puzzle. For our understanding, just have a look at how the sudoku is presented below.
Consider the sample puzzle presented above for further explanation. I know that we all know sudoku rules. But for completeness those are listed below
- It contains 81 cells in the 9×9 sudoku puzzle.
- And every 3×3 cells are called as a block (marked as dark in above image).
- Each cell should contain a number that lies in [1-9].
- The number $N$ in any cell should be placed in such a way that the corresponding rows,columns should not contain the same number.
- Also the number $N$ should not be presented in the corresponding block other than the current cell.
- Finally there should not be any empty cells after we solve the puzzle. (ofcourse that’s what we meant by solving)
Backtracking
Let’s first try to solve the puzzle a bit and then try to understand the backtracking algorithm. For that consider the image below.
If we started filling out with a number $3$ in the first cell, then we should check for whether it is safe or not by checking the corresponding row,column and block.
Now that the backtracking algorithm can be used to solve this puzzle very easily. The procedure to solve this puzzle using backtracking is given below.
solve(puzzle){ find an empty cell E if no empty cell{ return True } for each number in 1 to 9{ check whether the number is safe or not if safe{ fill that number in the cell if solve(puzzle){ return True } else{ undo filling the number before } } } }
To explain the above algorithm in simple terms we can think of that we find the empty cell, fill it with valid number, and then solve the puzzle as we did before, if everything works fine it is solved, otherwise the number we started to fill the first cell is changed and the process is continued. That’s it. If it still seems to be confusing, you better have a look at this video for better intuition.
Solve the puzzle
All right! We can now code our sudoku solver, for that let’s open up a file named sudoku.py
and start coding.
import numpy as np class SolveSudoku(object): def __init__(self,grid): self.grid = grid
We start off by creating a SolveSudoku
class which takes a puzzle grid
for instantiation. Let’s consider the grid to be a numpy
array, since the computations on numpy
arrays are a bit easier.
def _check_row(self,digit,row): return digit not in self.grid[row,:]
The _check_row
method takes in a digit
and the row
where the digit is to be placed and verify whether any element in the same row
is not equal to the digit
we are filling with.
def _check_column(self,digit,column): return digit not in self.grid[:,column]
Similarly the _check_column
method takes in a digit
and the column
where the digit
to be placed and verify whether any element in the same column
is not equal to the digit
we are filling with.
def _check_box(self,digit,row,col): rc = (row/3)*3 cc = (col/3)*3 return digit not in self.grid[rc:rc+3,cc:cc+3]
The above method verifies the given digit is safe to be placed in the block or not. For that we compute to which the given row and column belong to and then check in that block for validity.
def _check_safe(self,digit,row,col): return self._check_box(digit,row,col) \ and self._check_row(digit,row) \ and self._check_column(digit,col)
This is just a helper method that checks whether a digit
is safe to place in the row
and the column
given as arguments to the method.
def _find_empty(self,stop): if 0 in self.grid: stop[0] = np.where(self.grid==0)[0][0] stop[1] = np.where(self.grid==0)[1][0] return True else: return False
We find an empty cell using the above defined method. It takes a list
which is a mutable object and places the row
and column
indices in to that list respectively. Remember we use a mutable object and also it is not the class attribute. The reason we do this is because we need each list
for each recursive function call which we will see later.
def _solve(self): stop = [0,0] if not self._find_empty(stop): return True for digit in np.arange(1,10): row = stop[0] col = stop[1] if self._check_safe(digit,row,col): self.grid[row,col] = digit if self._solve(): return True self.grid[row,col]=0 return False
The _solve
is the actual method which solves the puzzle using the backtracking algorithm defined previously. Note that the stop
is a list
and not a consistent class attribute, something like self.stop
. We do like this because if the stop
indices are going to be changed by each recursive function call and they are consistent then we would not have a persistent approach to backtrack to the optimal solution. Remember this method uses recursive calls and the line self.grid[row,col]=0
triggers to backtrack which in case the filled number is not correct.
def solve(self): if self._solve(): return self.grid raise("Sudoku can not be solved")
Finally we define solve
method which just wraps around _solve
method and it raises a ValueError
if the given sudoku is not a valid one.
We now have our SolveSudoku
class defined and we are good to go for the actual solving of the captured puzzle in the previous post. Let’s open up solve.py
file and add the code to the bottom of it and continue where we left.
#convert to numpy array of 9x9 grid = np.array(labels).reshape(9,9) #find the indices of empty cells gz_indices = zip(*np.where(grid==0)) #center co-ordinates of all the cells gz_centers = np.array(centers).reshape(9,9,2)
We convert the labels to a numpy
array of shape 9x9
which contains the recognized digits of each cell in the puzzle as a list
. Then find the indices of the empty cells in the puzzle and then reshape the center co-ordinates of cells to 9x9x2
.
#solve the sudoku sudoku = SolveSudoku(grid) grid = sudoku.solve() #fill the solved numbers in empty cells for row,col in gz_indices: cv2.putText(warped,str(grid[row][col]),tuple(gz_centers[row][col]),cv2.FONT_HERSHEY_SIMPLEX,0.6,(0,0,0),2) cv2.imshow("Solved",warped) cv2.waitKey(0)
We then create a SolveSudoku
object and pass our unsolved grid and then finally solve the puzzle using solve()
method. We then place the solved digits on the warped
image. The solved warped
image is as shown below.
Reproject the solved puzzle to the original image
Now that we have our puzzle solved. All we need to do is to project back the solved puzzle on to the original image. We can achieve this using homography technique. If you are not familiar with homography please visit this link before continuing further.
#process the src and dst points pt_src = [[0,0],[warped.shape[1],0],[warped.shape[1],warped.shape[0]],[0,warped.shape[0]]] pt_src = np.array(pt_src,dtype="float") pt_dst = poly.reshape(4,2) pt_dst = pt_dst.astype("float") #align points in order pt_src = order_points(pt_src) pt_dst = order_points(pt_dst) #calculate homography matrix H,_ = cv2.findHomography(pt_src,pt_dst) #reproject the puzzle to original image im_out = cv2.warpPerspective(warped,H,dsize=(gray.shape[1],gray.shape[0])) im_out = cv2.addWeighted(gray,0.9,im_out,0.2,0) cv2.imshow("Projected",im_out) cv2.waitKey(0)
We first gather the source points which in case the solved puzzle co-ordinates and the destination co-ordinates which are the actual points of the puzzle in the original image. We then order our points as top-left, top-right, bottom-right, bottom-left
using order_points
function available in imutils.perspective
library. We do like this as the order of points in both source and destination should in the same order.
We then construct the homography matrix using cv2.findHomography
function and apply perspective transform for the warped
image as source. We then blend the two images using cv2.addWeighted
. Finally we have our projected back our solved puzzle on to the original image.
You can get the code snippets from my github account.
It’s a lot of fun writing this post. And in the next post we dig somewhat deeper to actually build a Facial Expression Recognition system completely from scratch.
Thank you, Have a nice day…