Before we consider how computers play chess, let’s talk about how humans do it.
How humans play chess
The classic image most of us have of chess grandmasters is that they are calculating deep sequences of possibilities; “if I do this, and they do that, and then I play here, …”. In reality this image is not entirely precise.
There is no doubt that good players can calculate many moves in advance, but they can do so only because they have an incredible intuitive understanding of positions. Great players can take one look at a board and accurately determine who is winning and why, where often the first move that comes to mind is the best one.
This observation about human intelligence (with respect to chess) will guide how we think of Artificial Intelligence (AI) in general. You’ve probably heard before that “the number of chess positions is larger than the number of atoms in the universe”. We cannot hope to consider so many possibilities, and therefore to create a chess playing computer we will provide it with “intuition” that can reduce the set of positions it considers in the first place.
Intuition in decision making
There are many ways we can define intuition, but for our purposes it is a quick estimate of the value of a state. In AI, we call this concept a heuristic.
You use heuristics every day. Rounding provides an estimate which makes future arithmetic easier at a small cost to accuracy. When deciding what computer to buy you don’t consider all possible choices, but rather simplify the decision by picking some key attributes you care about, such as battery life. When we decide on accepting candidates for a job we use quick judgements based on limited information that hopefully captures their quality.
Let’s now return to chess. Pretend that every time you see a position a little voice in your head screams “Terrible”, “Bad”, “Ok”, “Good”, or “Amazing”, which for convenience we can refer to as (-2, -1, 0, 1, 2). One should clearly always trust the voices in their head, so how would you use it to play chess better?
One option would be to branch and bound. Consider every move from the current position, and only keep the ones which lead to positions the voice is most positive about. Do this again for every possible move from those next positions. Keep doing this as many times as your memory can handle, and then choose the first move as the one which leads to the most positive final position.
Each step we branch from the current set of positions, then bound (or prune) the heuristic values and only keep the higher ones. This procedure allows us to calculate what move to play by only considering a smaller set of possible sequences.
It’s important to keep in mind that heuristics can be wrong, for example the yellow state in the image is at first evaluated as “1” even though later we see it leads to a better state which is evaluated as “2”. If we had pruned away states valued at “1” we would never have found this sequence.
Encoding heuristics for chess
Now that we know how to use a heuristic to make decisions, let’s describe some heuristics for chess.
Who is winning in the position to the right, white or black? Even if you don’t know much about chess you can likely guess that white is winning because they have so many more pieces. It may seem stupid but it’s the first and most powerful heuristic in chess - whoever has more and stronger pieces is probably (but not always!) winning.
Most chess coaches start teaching with the same set of mantras: “control the center”, “develop your pieces”, “castle early”, “a knight in the corner is a mourner”, and so on. These are just heuristics! None of them are necessarily always true, but if you follow these basic rules your first games will go significantly better.
With a large set of heuristics you can add or average their separate “opinions” about a position to get a better estimate for whether it’s worth considering.
Generalizing beyond chess
That’s it! The best chess engines in the world are, at their core, nothing more than a large set of handwritten heuristics combined with a “branch and bound” like algorithm for making decisions. While humans are still significantly better at evaluating individual positions than these computers, the engine makes up for it by calculating and remembering much deeper sequences of moves than humans.
The ideas we considered here are not just applicable to chess, but to all of AI. Artificial Intelligence is the study of large or complicated search problems, in which we cannot hope to search through all possibilities and instead must use tricks and heuristics to narrow the scope.
Incorporating machine learning
Machine Learning (ML) is not Artificial Intelligence. In AI we solve search problems. In ML we try to approximate a function from examples of inputs and outputs to that function. Linear regression (finding the line of best fit) is the simplest example of such a thing.
We can think of a heuristic as just a function, whose input is the current position and whose output is a numerical evaluation of its goodness. Truly modern chess engines no longer use a large set of handwritten heuristics, but rather an ML model (in the form of a Neural Network) which learns good heuristics using billions of games as examples. We’ll leave the discussion for how this is done for another day, but the key to remember is that the underlying algorithm is exactly the same!