As a degree requirement, my university requires students to complete a “major qualifying project” (MQP) in their field of study. Usually these projects are done in teams of two to three people, and often they take place abroad. MQPs last seven weeks and they culminate with the group writing a thesis-length paper (80+ pages) and giving a twenty minute presentation.
One of the projects available last quarter was about teaching computers to play Go. It took place in Budapest, Hungary, because one of the foremost researchers in the field (Levente Kocsis) works at the renowned Hungarian Academy of Sciences.
I really enjoy Go. It is an amazingly deep game where complex play is derived from very simple rules. Computers have a very difficult time playing it. Amateurs can defeat the best programs even when they run on supercomputers. Playing Go is very much an open problem in artificial intelligence.
The project was researching Go with Levente Kocsis. I was really eager to get on board! Because of my (bad) academic standing I had problems with the bilious idiots in the registrar and administration, but eventually I got in. In the end, no one else joined. I had to tackle the whole project myself. That’s implementation, research, testing, and writing that paper — all in seven weeks. Well, that’s a lot for one student with absolutely no AI experience.
If I had known this ahead of time, and I had also known that I’d be going into real compiler research, I might not have done this Go project. On the other hand, I loved Budapest. It’s a great city, and I’d go back in a heartbeat. I also learned a lot. So in another one or two blog posts, I will try to share a bit about the project, before it completely slips my mind.
Go is a two-player board game. The board is a 19×19 grid. Players take turns placing black and white stones on the intersections. The object of the game is to surround as many empty intersections as possible. The game ends when both players agree there are no more moves, at which point the player with more “territory” is the winner.
So why is Go really difficult for computers, while a desktop can beat grandmasters at Chess?
The standard Chess algorithm is α-β pruning over minimax trees. Unfortunately Go’s search space is too large for this to work efficiently (see chart). The average number of moves per position in Chess is around 35, whereas in Go it is 250. Even if you play Go on a small board (such as 9×9, for beginners), there are around 40-60 moves per position.
There are additional complications. In Chess there are opening books and end-game solvers. In Go there are opening styles and patterns, but they evolve rapidly over time and are not rote as in Chess. It is also possible to quickly evaluate the value of a Chess position, whereas in Go the game must be played until the end to get an accurate scoring.
So, that’s what doesn’t work work. What does work?
If there’s one thing a computer can do quickly, it’s picking moves at random until there is a position that can be scored. With enough random sampling, the true value of a position can be approximated within a certain accuracy. This is the basis of all Monte-Carlo algorithms, and it has become the foremost area of research for Go.
The most common application is Monte-Carlo Tree Search. The idea is to play random games from various positions. The results can then be used to explore the most interesting paths in the game tree. The idea seems ridiculous both in its simplicity and stochasticity, but it works.
Next week: Bandit problems.