2. Abstract
• An algorithm for computer Shogi (Japanese
chess)
• Contents
– Exhibition of Dobutsu Shogi
– Min-max method (conventional)
– Monte-Carlo method (conventional)
– Win rate first search (presented)
3. Dobutsu shogi
• This slide explains a computer game
algorithm using Dobutsu Shogi
• Dobutsu Shogi: a miniature shogi
• Shogi: Japanese chess
• Dobutsu: animal
• Normal shogi is too large a game for
examining new methods
4. Rule of Dobutsu Shogi 1
Five kinds of pieces
The initial position is as in the figure
You win if you capture the opponent's lion
You win if your lion reaches the opposite end
A chick promotes to a chicken
5. Rule of Dobutsu Shogi 2
All pieces move by one step
(lion: the surrounding 8 squares; giraffe: vertical and horizontal;
elephant: diagonal; chick: forward;
chicken: forward, forward-diagonal, vertical, and horizontal)
You can reuse (drop) the pieces that you took
6. Copyright of Dobutsu Shogi
• I do not know who holds the copyright
– FUJITA Maiko (illustration)
– KITAO Madoka (rule design)
– LPSA (the organization the two designers belonged to)
– GENTOSHA Education (toy seller)
7. Illustration on this slide
• Because of that complicated copyright, I use
illustrations from the website below in this
slide, instead of FUJITA's ones
• “SOZAIYA JUN”
• (http://park18.wakwak.com/~osyare/)
8. Exhibition initial position
Black: win rate first search (presented)
White: min-max method, search depth 9,
with an evaluation function composed of
only piece values (conventional)
39. Exhibition 31st move
Black took the chick with the lion, and White
resigned
After that: White drops the giraffe beside the
lion, Black's giraffe takes the elephant with
check, White's lion takes it, Black's chick
advances, White's lion moves backward,
Black drops a chick: checkmate
40. Min-max method
• A conventional method
• Today it is the most successful method for shogi
• Explained using a tree structure from the next
page
41. Min-max example: depth 3
[Figure: a 3-depth game tree with the present board
position at the root and board positions after
1, 2, and 3 moves below it]
43. Scores after 2 moves are the maximum of
their children's scores
[Figure: min-max tree; the leaf scores after 3 moves
(-8, 23), (5, -9), (3, 10), (-3, -4)
back up to the 2-move scores 23, 5, 10, -3]
44. Scores after 1 move are the minimum of
their children's scores
[Figure: the 2-move scores (23, 5) and (10, -3)
back up to the 1-move scores 5 and -3]
45. Select the move having the maximum score
[Figure: the root takes the maximum of the
1-move scores 5 and -3, so the move with
score 5 is selected]
46. Min-max method
• Theoretically you can select the move that
has the maximum score after N moves
• Theoretically, if we could obtain the scores at
the end of the game, we would always win
the game
• Practically, the computational cost is too
large, so we cannot calculate all moves
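The recursion described above can be sketched as follows; the values come from the example tree on the previous slides, and the nested-list representation (with integer leaves) is only an assumption for illustration.

```python
def minimax(node, maximizing):
    """Return the min-max score of a node.

    A node is either a leaf score (int) or a list of child nodes;
    levels alternate between the maximizing and minimizing player.
    """
    if isinstance(node, int):
        return node
    scores = [minimax(child, not maximizing) for child in node]
    return max(scores) if maximizing else min(scores)

# The 3-depth example tree from the slides: leaves are scores after 3 moves.
tree = [
    [[-8, 23], [5, -9]],   # first candidate move
    [[3, 10], [-3, -4]],   # second candidate move
]

# Scores after 1 move (opponent minimizes) are 5 and -3;
# the root selects the move with the maximum score, 5.
best = max(range(len(tree)), key=lambda i: minimax(tree[i], False))
```

Running this reproduces the backed-up values of the example: the first move scores 5, the second -3, so the first move is chosen.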
47. Min-max method
• Many methods for reducing the
computational cost have been presented, but
they are not covered in this slide (reducing
the number of searched nodes is called
pruning)
48. Conclusion of min-max method
• It uses a tree structure
• Scores after N moves are needed
• Pruning is needed
49. Monte-Carlo method
• While I do not know the history of the Monte-
Carlo method, it has been successful for
computer "go" (more precisely, successful
as Monte-Carlo tree search)
• It is said to be difficult to apply to
computer shogi (or chess-like games) yet
50. Outline of Monte-Carlo
• Repeat random moves from the first move
• Then the game finishes and the winner is
revealed
• Making the game end by random moves is
called a playout
[Figure: first move, then random moves,
down to the end of the game: one playout]
51. Outline of Monte-Carlo
• Repeat the playout
• Obtain the win rate of each first move:
(number of wins) / (number of playouts)
• Finally, select the move having the
highest win rate
52. Outline of Monte-Carlo
• That is the whole outline
• For "Go", this method has become
stronger by combining it with a tree structure,
making Monte-Carlo tree search (this slide
does not cover it)
• Another improvement is a playout that uses
moves based on knowledge of "Go" instead of
simple random moves
53. Example of knowledge of “Go”
• Observe 3x3 squares
• Set a low probability of dropping a
black stone at the center of the
upper figure
• Set a high probability of dropping a
black stone at the center of the
lower figure
54. Monte-Carlo for shogi
• The simple Monte-Carlo method does not work
for shogi (too many bad moves appear)
• A cause must be that only a few of all legal
moves are good in shogi
• I do not want to use knowledge of shogi,
whether from machine learning or manual setting
55. Why Monte-Carlo for shogi
• It determines the move by the result at the
end of the game, which seems beautiful
• No evaluation function is needed, and no
preset knowledge is needed
56. Discussion: Monte-Carlo using a tree
Simple random moves lead to equal win
rates between green and red
The truth is that green wins and red loses
This tells the importance of the tree structure
57. Discussion: Monte-Carlo using a tree
Suppose you obtain win rates after 3 moves
by playout:
0.1 0.3 0.7 0.8 0.2 0.6 0.9 0.4
Obtain the win rates of green and red from
these 3-move rates
58. Discussion: Monte-Carlo using a tree
Ideally the rates are equal to the
ones of the min-max method
[Figure: leaf rates 0.1 0.3, 0.7 0.8, 0.2 0.6, 0.9 0.4
back up to 0.3, 0.8, 0.6, 0.9,
and then to 0.3 and 0.6 at the top]
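The claim that the rates ideally back up like min-max scores can be checked with a small sketch; the nested-list tree encoding is an assumption, and the rates are the ones from the figure.

```python
def backup(node, my_turn):
    """Back up win rates as min-max does: each player ignores the
    children that are bad for them (max on my turn, min on the opponent's)."""
    if isinstance(node, float):
        return node
    rates = [backup(child, not my_turn) for child in node]
    return max(rates) if my_turn else min(rates)

# Leaf win rates after 3 moves, as in the figure.
tree = [[[0.1, 0.3], [0.7, 0.8]],
        [[0.2, 0.6], [0.9, 0.4]]]

# The two moves at the top back up to 0.3 and 0.6, matching the figure.
top = [backup(child, my_turn=False) for child in tree]
```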
59. Discussion: Monte-Carlo using a tree
• Q: How do you calculate the parent
node's rate 0.6 from its children's
rates 0.2 and 0.6?
• A: Ignore 0.2
[Figure: a parent node 0.6 with children 0.2 and 0.6]
60. Discussion: Monte-Carlo using a tree
• Q: How do you ignore 0.2?
• A1: Always search the node with the
maximum win rate
• A2: Sometimes search through a
node randomly
[Figure: the same parent with children 0.2 and 0.6]
61. Discussion: Monte-Carlo using a tree
Search the node that has the
maximum win rate
[Figure: descending the tree with leaf rates
0.1 0.3 0.7 0.8 0.2 0.6 0.9 0.4]
This tactic finds the best path
62. Win rate first search
• Remember the win rate of every searched node
• Almost always search the node that has the
maximum win rate
• Sometimes search randomly (ideally this is
not needed)
• Then this algorithm finds the best move
63. Additional explanation
• Update win rates at every playout
• Keep the win rate as a numerator and a
denominator
• Add a constant to both the numerator and the
denominator when the playout is won
• Add the constant to only the denominator
when the playout is lost
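The bookkeeping and selection rule above can be sketched as follows; the `Node` class, the constant `C`, and the `epsilon` for occasional random search are assumptions not fixed by the slides, and the optimistic default rate of 1 for an unlearned node follows the later slide on unlearned win rates.

```python
import random

C = 1.0  # assumed constant added at each playout update

class Node:
    """Win rate kept as a numerator/denominator pair, updated per playout."""
    def __init__(self):
        self.num = 0.0
        self.den = 0.0

    def win_rate(self):
        # An unlearned node gets the optimistic default rate of 1.
        return self.num / self.den if self.den > 0 else 1.0

    def update(self, won):
        # Win: add C to both numerator and denominator.
        # Loss: add C to the denominator only.
        if won:
            self.num += C
        self.den += C

def select(children, epsilon=0.05):
    """Almost always search the child with the maximum win rate;
    sometimes search randomly instead."""
    if random.random() < epsilon:
        return random.choice(children)
    return max(children, key=lambda n: n.win_rate())
```

After one won and one lost playout, a node's rate is 1.0 / 2.0 = 0.5; with `epsilon=0` the selection is purely greedy.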
64. Problems of the presented method
• Win rates of nodes that have not been
searched yet are discussed on the next pages
• Many other issues must be hiding, though I
have not identified them
65. Unreached node
• What to do with a node that has not
been searched and has no win rate?
[Figure: sibling nodes with win rates
0.4, 0.6, 0.3 and one unreached node]
66. Another win rate
• Before this page, no knowledge of shogi
appeared; only the graph was used
• This win rate uses knowledge of shogi
• The win rate is calculated from the kind of move
• For example, capturing a piece, promotion,
etc.
67. Another win rate
• Calculate the win rate from these factors:
– Piece positions before and after the move
– Kinds of the moving piece and the taken piece
– Whether the destination square is controlled or not
• A win-rate table for all combinations of these
factors is prepared
• These win rates are learned by playouts;
the values are not preset
68. Another smaller win rate
• Another, smaller win-rate table is prepared
with only these factors:
– Kinds of the moving piece and the taken piece
– Whether the destination square is controlled or not
• Since it is small, it learns fast
• It is used when the larger win-rate table has
not been learned yet
• If none of the three kinds of win rate has
been learned, let the win rate be 1
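The fallback order above can be sketched as follows; the function name and table keys are illustrative assumptions, with `None` standing for "not learned yet".

```python
def estimated_win_rate(node_rate, large_table, small_table,
                       large_key, small_key):
    """Use the node's own learned rate first, then the large move-feature
    table, then the smaller table; if none is learned, return 1."""
    for rate in (node_rate,
                 large_table.get(large_key),
                 small_table.get(small_key)):
        if rate is not None:
            return rate
    return 1.0
```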
69. Conclusion of presented method
• Win rates of all searched nodes are
remembered and learned by playouts
• In a playout, select the node that has the
highest win rate ("win rate first search")
• Sometimes select a node randomly
• If a win rate has not been learned yet, the
other win rates are used
70. Condition of simulation game
• Win rate first search vs. a simple min-max
method (evaluation function composed of
only piece values)
• If the game continues to 80 moves, it is
counted as a draw (a special rule for this
simulation)
71. Result of simulation 1
Number of playouts         10000   30000   100000
Presented method: black    22-76   44-52   48-49
Presented method: white    16-81   30-68   61-35

Win-lose record for the presented method in 100 games
(some drawn games exist)
The search depth of the min-max method is 6
The more playouts there are, the stronger the method is
72. Result of simulation 2
Depth of min-max           4      5      6      7      8      9
Presented method: black    94-6   77-20  48-49  37-61  24-73  14-85
Presented method: white    78-21  78-20  61-35  38-57  40-52  20-74

Win-lose record for the presented method in 100 games
(some drawn games exist)
100000 playouts for the presented method
Almost the same strength as the 6-depth min-max method
73. Impressions of a human viewer
• The presented method frequently takes bad
moves
• Although it is a variation of the Monte-Carlo
method, it can find mating sequences
• It is good at finding narrow routes
• The difference in the number of playouts shows
clearly as a difference in strength
74. Conclusion and future issues
• Conclusion
– Playouts guided by win rate first
– Selects moves without preset knowledge
– Selects moves by the results of playouts
• Future
– Someone can apply it to "Go" or other
chess-like games
– I will return to research on speech signal
processing