Over the years, people have tried to exploit the capabilities of computers to create a better world. One such capability is making decisions as humans do. Board games are a good way to demonstrate a computer's decision-making, since playing involves constantly evaluating the current board position and selecting the next best move throughout the game. They are also a good way to measure how well a computer performs compared with a human.
Traditionally, a computer plays a board game by searching the game tree, a tree containing all the possible moves together with their corresponding weights, which indicate how likely a player is to win the game with those moves. The weights are assigned by an evaluation function.
Because storage is limited and the game search tree is complex, the computer has to prune the branches with lower weights when making a decision.
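The idea of pruning low-weight branches can be sketched as follows. This is only a minimal illustration, not the search used in this project: the nested-list tree format, the `estimate` helper and the `width` parameter are all assumptions made for the example, and the technique shown is plain minimax with forward pruning (keeping only the most promising moves at each position).

```python
from typing import List, Union

# A toy game tree (hypothetical format): a number is a leaf holding the
# weight given by the evaluation function; a list holds the positions
# reachable in one move.
Tree = Union[float, List["Tree"]]


def estimate(node: Tree) -> float:
    """Rough weight of a position: a leaf's own weight, or the average of its children's estimates."""
    if isinstance(node, (int, float)):
        return float(node)
    values = [estimate(child) for child in node]
    return sum(values) / len(values)


def search(node: Tree, maximizing: bool, width: int) -> float:
    """Minimax that only explores the `width` most promising moves per position."""
    if isinstance(node, (int, float)):
        return float(node)
    # Rank the moves from the viewpoint of the player to move, then prune the rest.
    ranked = sorted(node, key=estimate, reverse=maximizing)
    kept = ranked[:width]
    values = [search(child, not maximizing, width) for child in kept]
    return max(values) if maximizing else min(values)


tree = [[3, 5], [6, 9], [1, 2]]
full = search(tree, maximizing=True, width=3)    # explores every branch
pruned = search(tree, maximizing=True, width=1)  # keeps one branch per position
```

On this toy tree both searches return the same value, but a narrow `width` can miss the best move when the estimate is inaccurate: for the tree `[[8, 0], [3, 3]]` the full search returns 3 while `width=1` returns 0. This is why the accuracy of the evaluation function matters for pruning.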
Inspired by the human brain, a neural network is a way of processing information. Its nodes are organized in several layers and connected together, and the connections change as the system is trained. A large amount of training data is used to fine-tune the connections during the training process so that the network produces a specific output corresponding to a given input. A deep learning neural network is a network with many layers. Through a series of learning processes, such a network can help increase the accuracy of the results generated by the evaluation function and hence help the computer make better decisions about which branches to prune when searching for the next best move.
This project aims to demonstrate how powerful deep learning neural networks can be for game playing. The game Othello was chosen because deep learning neural network techniques can be applied to it, and because its size and complexity are suitable for a one-year project with limited resources.
The objective of this project is to develop a computer Othello program with the following attributes:
While the ultimate outcome is to deliver the above program, the key to this project is to have the evaluation function constructed by the program itself, without human logic.
Figure 1 describes the development model of this project. The detailed steps of the project flow are listed below:
The simple flow of building the evaluation function is shown in Figure 2. Sets of board configurations from different games will be pre-processed and used as the training data for the neural network. Once a learning model is built and configured, it learns from the training data during the learning process, and the evaluation function is updated accordingly. An updated evaluation function for board configurations is obtained at the end of the learning process.
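The flow above (pre-process, build a model, learn, obtain an updated evaluation function) can be illustrated with a very small sketch. Everything here is an assumption made for the example rather than the project's actual design: the 4-cell "boards", the `encode` scheme, the made-up win/loss labels, and the tiny one-hidden-layer network trained by gradient descent. A real Othello board has 64 squares and would need a much larger, deeper network and far more data.

```python
import math
import random


def encode(board: str, player: str = "X") -> list:
    """Pre-process a board string: own disc -> 1, opponent disc -> -1, empty -> 0."""
    return [0.0 if c == "." else (1.0 if c == player else -1.0) for c in board]


def init_network(n_inputs: int, n_hidden: int, rng: random.Random) -> dict:
    """Build the learning model: one hidden layer, one output unit, small random weights."""
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net: dict, x: list):
    """Return the hidden activations and the output, a score in (-1, 1)."""
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(net["w1"], net["b1"])]
    output = math.tanh(sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"])
    return hidden, output


def evaluate(net: dict, x: list) -> float:
    """The learned evaluation function: higher means better for `player`."""
    return forward(net, x)[1]


def train_step(net: dict, x: list, target: float, lr: float) -> float:
    """One gradient-descent update on the squared error; returns the loss before the update."""
    hidden, output = forward(net, x)
    d_out = (output - target) * (1.0 - output * output)
    for j, h in enumerate(hidden):
        d_hidden = d_out * net["w2"][j] * (1.0 - h * h)  # uses w2 before it is updated
        net["w2"][j] -= lr * d_out * h
        for i, v in enumerate(x):
            net["w1"][j][i] -= lr * d_hidden * v
        net["b1"][j] -= lr * d_hidden
    net["b2"] -= lr * d_out
    return 0.5 * (output - target) ** 2


def train(net: dict, samples: list, epochs: int, lr: float) -> list:
    """Learning process: repeat over the training data, recording the mean loss per epoch."""
    history = []
    for _ in range(epochs):
        total = sum(train_step(net, x, t, lr) for x, t in samples)
        history.append(total / len(samples))
    return history


# Made-up training data: (board, outcome for X), with +1 a win and -1 a loss.
samples = [(encode("XXXO"), 1.0), (encode("OOXO"), -1.0),
           (encode("X.XX"), 1.0), (encode("OO.O"), -1.0)]
net = init_network(n_inputs=4, n_hidden=3, rng=random.Random(0))
history = train(net, samples, epochs=200, lr=0.1)
```

After training, `evaluate(net, encode(board))` plays the role of the updated evaluation function, and `history` shows how the average error changed over the learning process.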
The computer Othello program will be tested against two types of opponents: moderate computer opponents and strong computer opponents. The choice of opponents at both levels will be discussed and agreed with the supervisor during the second phase of the project.
The program will play 50 games against each type of opponent. The target winning rate is 50% against a moderate computer opponent and 35% against a strong computer opponent.
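To make these targets concrete, the number of wins they imply over 50 games can be worked out as below. The helper names are hypothetical, and the sketch assumes that a draw is not counted as a win, which the targets above do not specify.

```python
def wins_needed(games: int, target_percent: int) -> int:
    """Smallest number of wins whose winning rate is at least target_percent."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return (games * target_percent + 99) // 100


def winning_rate(wins: int, games: int) -> float:
    """Winning rate as a percentage of all games played."""
    return 100.0 * wins / games


moderate = wins_needed(50, 50)  # 25 wins out of 50
strong = wins_needed(50, 35)    # 18 wins out of 50, since 17 wins is only 34%
```

Because 35% of 50 games is 17.5, the target against a strong opponent is met only from 18 wins onward.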
It is believed that when both sides play perfectly, the game will very likely end in a draw. Hence, if a winning rate of 50% is achieved, our computer Othello program can be said to be comparable with existing computer Othello programs developed with the traditional method.
A lower winning rate of 35% is set for games against a strong computer opponent, for the following reasons. The neural network may not be trained very well because of limited time and other resources. In addition, the number of different board configurations obtained as training data may not be large enough to construct a good evaluation function. These factors will affect the performance of our computer Othello program, and hence a target winning rate of 35% is set.
Date | Task(s) | Deliverables | Status |
---|---|---|---|
Sep 2016 | | | Completed |
2 Oct 2016 | | | Completed |
Mid-Oct 2016 | | | Completed |
16 Nov 2016 | | | Completed |
21 Dec 2016 | | | Completed |
28 Dec 2016 | | | Completed |
9 - 13 Jan 2017 | | | Completed |
22 Jan 2017 | | | Completed |
08 Feb 2017 | | | Completed |
Mid-Mar 2017 | | | Completed |
16 Apr 2017 | | | Completed |
02 May 2017 | | | In Progress |
Supervisor
Chan Lok Wang
Team member
Yip Tsz Kwan
Team member
Please contact us at fyp16020@cs.hku.hk