From Monte-Carlo
to win rate first search
for “Dobutsu Shogi”

                        2010/05/22
                    IHARA Takehiro
Abstract
• On an algorithm for computer shogi (Japanese
  chess)
• Contents
   – Exhibition of Dobutsu Shogi
   – Min-max method (conventional)
   – Monte-Carlo method (conventional)
   – Win rate first search (presented)
Dobutsu shogi
• These slides discuss computer game
  algorithms using Dobutsu Shogi as the example
• Dobutsu Shogi: a miniature shogi
• Shogi: Japanese chess
• Dobutsu: animal
• Standard shogi is too large for testing new
  methods
Rule of Dobutsu Shogi 1
Five kinds of pieces
The initial position is as shown in the figure
You win if you capture the opponent's lion
You win if your lion reaches the opposite end
A chick promotes to a chicken
Rule of Dobutsu Shogi 2
All pieces move one step
  Chick: forward
  Chicken: vertical, horizontal, and forward-diagonal
  Lion: any of the 8 surrounding squares
  Elephant: diagonal
  Giraffe: vertical and horizontal
You can reuse (drop) the pieces that you have captured
Copyright of Dobutsu Shogi
• I do not know who holds the copyright
   – FUJITA Maiko (illustration)
   – KITAO Madoka (rule design)
   – LPSA (the organization the two designers belonged to)
   – GENTOSHA Education (toy seller)
Illustration on this slide
• Because the copyright situation is complex, these
  slides use illustrations from the website below
  instead of FUJITA's
• “SOZAIYA JUN”
• (http://park18.wakwak.com/~osyare/)
Exhibition initial position
                Black: win rate first
                search (presented)
                White: min-max
                method, search depth
                9, evaluation function
                composed of piece
                values only
                (conventional)
Exhibition 1st move
              Black advanced
              giraffe
Exhibition 2nd move
              White advanced
              giraffe
Exhibition 3rd move
              Black took chick with
              chick
Exhibition 4th move
              White took chick with
              elephant
Exhibition 5th move
              Black advanced
              elephant
Exhibition 6th move
              White dropped chick
              for defense
Exhibition 7th move
              Black moved giraffe
              backward
Exhibition 8th move
              White advanced
              giraffe
Exhibition 9th move
              Black dropped chick
              for defense
Exhibition 10th move
              White took elephant
              with giraffe
Exhibition 11th move
              Black took giraffe with
              lion
Exhibition 12th move
              White dropped
              elephant
              This elephant
              combination is a
              strong formation
Exhibition 13th move
              Black lion escaped
Exhibition 14th move
              White advanced lion
Exhibition 15th move
              Black dropped giraffe,
              giving check
Exhibition 16th move
              White lion escaped
Exhibition 17th move
              Black advanced
              giraffe
              Black forced white to
              choose between taking the
              giraffe and saving the elephant
Exhibition 18th move
              White took giraffe with
              elephant
Exhibition 19th move
              Black took elephant with
              lion
Exhibition 20th move
              White dropped giraffe
Exhibition 21st move
              Black dropped
              elephant behind lion
Exhibition 22nd move
             White moved elephant
             backward
Exhibition 23rd move
              Black advanced
              elephant
Exhibition 24th move
              White gave check with giraffe
Exhibition 25th move
              Black took giraffe with
              elephant
Exhibition 26th move
              White took elephant
              with chick
              If white had taken with the
              elephant, white would
              have been mated
Exhibition 27th move
              Black lion escaped
Exhibition 28th move
              White dropped
              elephant
Exhibition 29th move
              Black gave check with giraffe
Exhibition 30th move
              White took giraffe with
              elephant
Exhibition 31st move
              Black took chick with
              lion, and white
              resigned
              After this, if white drops a giraffe beside the
              lion, black's giraffe takes the elephant with
              check, white's lion takes it, black's chick
              advances, white's lion moves backward, and
              black drops a chick: checkmate
Min-max method
• A conventional method
• Currently the most successful method for shogi
• Explained with a tree structure starting on the
  next slide
Min-max        Example: depth 3
[Figure: game tree with the present board position at the root, then the board positions after 1, 2, and 3 moves]
Min-max
Suppose the scores after 3 moves were revealed
[Figure: the eight positions after 3 moves are scored, in order: -8, 23, 5, -9, 3, 10, -3, -4]
Min-max
Scores after 2 moves are the maximum of each pair of scores after 3 moves
[Figure: max(-8, 23) = 23, max(5, -9) = 5, max(3, 10) = 10, max(-3, -4) = -3]
Min-max
Scores after 1 move are the minimum of each pair of scores after 2 moves
[Figure: min(23, 5) = 5, min(10, -3) = -3]
Min-max
Select the move having the maximum score
[Figure: max(5, -3) = 5, so the first move that scores 5 is selected]
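The depth-3 example can be reproduced with a few lines of Python. This is only a sketch: the nested-list tree encoding and the names TREE, minimax, and best_move are assumptions made for illustration, not part of the original slides. The leaf scores are the ones from the figure.

```python
# Leaf scores after 3 moves, grouped as in the example tree:
# root -> 2 positions after 1 move -> 4 after 2 moves -> 8 after 3 moves.
TREE = [[[-8, 23], [5, -9]], [[3, 10], [-3, -4]]]


def minimax(node, maximizing):
    """Return the min-max score of a nested-list tree; ints are leaves."""
    if isinstance(node, int):
        return node
    scores = [minimax(child, not maximizing) for child in node]
    return max(scores) if maximizing else min(scores)


def best_move(tree):
    """Return (index, score) of the first move with the maximum score."""
    # After our first move it is the opponent's turn, so those nodes minimize.
    scores = [minimax(child, maximizing=False) for child in tree]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


print(best_move(TREE))  # (0, 5)
```

The first move (index 0) is chosen with score 5, matching the last step of the example.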
Min-max method
• In theory, you can select the move that
  has the maximum score after N moves
• In theory, if we could obtain the scores at
  the end of the game, we would always win
  the game
• In practice, the computational cost is too
  large, so we cannot calculate all
  moves
Min-max method
• Many methods for reducing the
  computational cost have been presented, but
  they are not covered in these slides (reducing
  the number of searched nodes is called
  pruning)
Conclusion of min-max method
• It uses a tree structure
• Scores after N moves are needed
• Pruning is needed
Monte-Carlo method
• I do not know the history of the Monte-Carlo
  method, but it has been successful for
  computer “Go” (more precisely, Monte-Carlo
  tree search has been successful)
• It is said to be still difficult to apply to
  computer shogi (or other chess-like games)
Outline of Monte-Carlo
• Repeat random moves
• Then the game finishes and the winner is
  revealed
• Playing a game to the end by random moves is
  called a playout
[Figure: a first move followed by random moves (the playout) down to the end of the game]
Outline of Monte-Carlo
• Repeat playouts
• Obtain the win rate of
  the first move
• (number of wins) /
  (number of playouts)
• Finally, select the move with
  the highest win rate
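The outline above can be sketched in Python on a toy game. Everything here is an assumption for illustration: the game (players alternately take 1 to 3 stones, and whoever takes the last stone wins) stands in for shogi, and the names legal_moves, playout, and monte_carlo_move are hypothetical.

```python
import random


def legal_moves(stones):
    """Toy game (an assumption, not Dobutsu Shogi): take 1 to 3 stones."""
    return [n for n in (1, 2, 3) if n <= stones]


def playout(stones, my_turn, rng):
    """Play random moves to the end; True if 'I' take the last stone."""
    while True:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return my_turn  # the player who just moved wins
        my_turn = not my_turn


def monte_carlo_move(stones, playouts_per_move=1000, seed=0):
    """Return (best first move, win rate of every first move)."""
    rng = random.Random(seed)
    rates = {}
    for move in legal_moves(stones):
        remaining = stones - move
        if remaining == 0:
            wins = playouts_per_move  # taking the last stone wins at once
        else:
            # After my first move it is the opponent's turn.
            wins = sum(playout(remaining, False, rng)
                       for _ in range(playouts_per_move))
        rates[move] = wins / playouts_per_move  # wins / playouts
    best = max(rates, key=rates.get)
    return best, rates


print(monte_carlo_move(3)[0])  # 3
```

With 3 stones left, taking all 3 wins immediately, so that move has win rate 1 and is selected.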
Outline of Monte-Carlo
• That is the whole outline
• For “Go”, this method has become
  stronger by combining it with a tree structure,
  giving Monte-Carlo tree search (these slides
  do not cover it)
• Another improvement is that playouts use
  moves based on knowledge of “Go” instead of
  simple random moves
Example of knowledge of “Go”
     • Observe 3x3 squares
     • Assign a low probability to placing
       a black stone at the center of
       the upper figure
     • Assign a high probability to placing
       a black stone at the center of
       the lower figure
Monte-Carlo for shogi
• The simple Monte-Carlo method does not work
  for shogi (too many bad moves appear)
• The cause is probably that only a few of the
  legal moves in shogi are good
• I do not want to use shogi knowledge, whether
  obtained by machine learning or set manually
Why Monte-Carlo for shogi
• It can determine the move from the result at
  the end of the game, which seems beautiful
• No evaluation function and no preset
  knowledge are needed
Discussion: Monte-Carlo using a tree
Simple random moves lead to an equal win rate between green and red
The truth is that green wins and red loses
This shows the importance of the tree structure
[Figure: game tree with green and red branches]
Discussion: Monte-Carlo using a tree
Suppose you obtain the win rates after 3 moves
[Figure: win rates of the eight positions after 3 moves: 0.1, 0.3, 0.7, 0.8, 0.2, 0.6, 0.9, 0.4]
Obtain the win rates of green and red from
these after-3-moves rates by playout
Discussion: Monte-Carlo using a tree
Ideally the rates are equal to those of the min-max method
          0.3                         0.6


  0.3             0.8         0.6           0.9


0.1     0.3 0.7     0.8 0.2         0.6 0.9       0.4
Discussion: Monte-Carlo using a tree
• Q: How do you calculate the parent node's 0.6
  from the child nodes' 0.2 and 0.6?
• A: Ignore 0.2
[Figure: parent node 0.6 with child nodes 0.2 and 0.6]
Discussion: Monte-Carlo using a tree
• Q: How do you ignore 0.2?
• A1: Always search the node with the maximum
  win rate
• A2: Sometimes search a node at random
[Figure: parent node 0.6 with child nodes 0.2 and 0.6]
Discussion: Monte-Carlo using a tree
Search the node that has the maximum win rate
[Figure: tree whose positions after 3 moves have win rates 0.1, 0.3, 0.7, 0.8, 0.2, 0.6, 0.9, 0.4]
This tactic finds the best path
Win rate first search
• Remember the win rate of each searched node
• Almost always search the node that has the
  maximum win rate
• Sometimes search at random (ideally this is
  not needed)
• Then this algorithm finds the best move
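The selection rule in these bullets might look like the following. This is a minimal sketch: the function name select_child and the parameter explore_prob (how often "sometimes" is) are assumptions, since the slides give no value.

```python
import random


def select_child(win_rates, rng, explore_prob=0.1):
    """Pick a child index: usually the maximum win rate, sometimes random."""
    if rng.random() < explore_prob:
        return rng.randrange(len(win_rates))  # sometimes search randomly
    return max(range(len(win_rates)), key=win_rates.__getitem__)


rng = random.Random(0)
# With no random search, the 0.2 child is ignored and the 0.6 child is chosen.
print(select_child([0.2, 0.6], rng, explore_prob=0.0))  # 1
```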
Additional explanation
• Update the win rate at every playout
• Keep the win rate as a numerator and a
  denominator
• Add a constant to both the numerator and the
  denominator when the playout is won
• Add the constant to the denominator only
  when the playout is lost
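The update rule can be written as a small counter. The class name WinRate and the default constant of 1.0 are assumptions for this sketch; the slides do not state the constant's value.

```python
from dataclasses import dataclass


@dataclass
class WinRate:
    numerator: float = 0.0
    denominator: float = 0.0

    def update(self, won, constant=1.0):
        """Add the constant to the denominator, and to the numerator on a win."""
        if won:
            self.numerator += constant
        self.denominator += constant

    @property
    def value(self):
        """Win rate, or None if no playout has passed through this node."""
        if self.denominator == 0:
            return None
        return self.numerator / self.denominator


rate = WinRate()
for won in (True, False, True):
    rate.update(won)
print(rate.numerator, rate.denominator)  # 2.0 3.0
```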
Problems of presented method
• How to handle the win rates of nodes that have
  not been searched is discussed on the following slides
• Many other issues probably remain, though I
  have not identified them
Unreached node
• About a node that has not been searched
  and has no win rate
[Figure: parent node whose children have win rates 0.4, 0.6, (unreached), 0.3]
Another win rate
• Up to this point, no knowledge of shogi
  appears; only the graph is used
• This win rate uses knowledge of shogi
• The win rate is calculated from the kind of move
• For example: capturing a piece, promotion,
  etc.
Another win rate
• Calculate the win rate from these factors
   – Piece position before and after the move
   – Kinds of the moving piece and the captured piece
   – Whether the position is controlled or not
• A win rate table covering all combinations of
  these factors is prepared
• These win rates are learned from playouts;
  their values are not preset
Another smaller win rate
• Another, smaller win rate table is prepared
   – Kinds of the moving piece and the captured piece
   – Whether the position is controlled or not
• Since it is small, it learns fast
• It is used when the larger win rate table has
  not been learned yet
• If none of the three kinds of win rate has been
  learned, let the win rate be 1
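The fallback order (node win rate, then the larger table, then the smaller table, then 1) could be sketched as below. The dictionary layout, the key tuples, and the name lookup_win_rate are all hypothetical; the slides only name the factors that make up each table.

```python
def lookup_win_rate(node_key, large_key, small_key,
                    node_rates, large_table, small_table):
    """Return the first learned win rate, falling back to 1.0."""
    for table, key in ((node_rates, node_key),
                       (large_table, large_key),
                       (small_table, small_key)):
        if key in table:
            return table[key]
    return 1.0  # none of the three kinds has been learned


# Hypothetical keys: the large key adds the from/to squares to the small key.
small = ("giraffe", "chick", True)       # moving piece, captured piece, controlled
large = ("b2", "b3") + small             # squares before and after the move
print(lookup_win_rate("node-7", large, small, {}, {}, {small: 0.25}))  # 0.25
print(lookup_win_rate("node-7", large, small, {}, {}, {}))             # 1.0
```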
Conclusion of presented method
• The win rates of all searched nodes are
  remembered and learned from playouts
• In a playout, select the node that has the
  highest win rate (“win rate first search”)
• Sometimes select a node at random
• If a win rate has not been learned, the other
  win rates are used
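Putting these points together, a complete but much simplified win rate first search might look like the sketch below. It is not the author's implementation. Assumptions: the toy game (take 1 to 3 stones, taking the last stone wins) replaces Dobutsu Shogi; win rates are stored per (stones left, move) from the mover's point of view rather than per tree node; the shogi-knowledge tables are omitted, so an unlearned win rate is simply 1; and the names and the values of explore_prob and constant are hypothetical.

```python
import random


def win_rate_first_search(stones, playouts=2000, explore_prob=0.1,
                          constant=1.0, seed=0):
    """Choose how many stones (1 to 3) to take; requires stones >= 1."""
    rng = random.Random(seed)
    stats = {}  # (stones_left, move) -> [numerator, denominator]

    def rate(left, move):
        num, den = stats.get((left, move), (0.0, 0.0))
        return num / den if den else 1.0  # not learned yet: win rate 1

    def moves(left):
        return [n for n in (1, 2, 3) if n <= left]

    for _ in range(playouts):
        path, left, mover = [], stones, 0
        while left > 0:
            if rng.random() < explore_prob:
                move = rng.choice(moves(left))  # sometimes random
            else:  # usually the move with the highest win rate
                move = max(moves(left), key=lambda m: rate(left, m))
            path.append((mover, left, move))
            left -= move
            winner = mover  # whoever moved last took the last stone
            mover = 1 - mover
        # Update every move on the path with the result of this playout.
        for who, was_left, move in path:
            entry = stats.setdefault((was_left, move), [0.0, 0.0])
            if who == winner:
                entry[0] += constant
            entry[1] += constant

    return max(moves(stones), key=lambda m: rate(stones, m))


print(win_rate_first_search(3))  # 3
```

With 3 stones left, the search settles on taking all 3, the only move that always wins.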
Condition of simulation game
• Win rate first search vs. a simple min-max
  method (evaluation function composed
  of piece values only)
• If the game reaches 80 moves, it is
  regarded as a draw (special rule for
  this simulation)
Result of simulation 1
Number of playouts           10000    30000    100000
Presented method as black    22-76    44-52    48-49
Presented method as white    16-81    30-68    61-35

Win-loss record of the presented method in 100 games
Some games were draws
Depth of the min-max method is 6
The more playouts, the stronger the method
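The trend is easier to see when the black and white results are added together. The short sketch below does that arithmetic on the numbers in the table; the names RESULTS and totals are assumptions, and draws are taken to be the games out of 100 that are neither wins nor losses.

```python
# (wins, losses) of the presented method, copied from the table above.
RESULTS = {
    10000: {"black": (22, 76), "white": (16, 81)},
    30000: {"black": (44, 52), "white": (30, 68)},
    100000: {"black": (48, 49), "white": (61, 35)},
}
GAMES_PER_CELL = 100


def totals(playouts):
    """Return (wins, losses, draws) summed over black and white."""
    cells = list(RESULTS[playouts].values())
    wins = sum(w for w, _ in cells)
    losses = sum(l for _, l in cells)
    draws = GAMES_PER_CELL * len(cells) - wins - losses
    return wins, losses, draws


for n in RESULTS:
    print(n, totals(n))
# 10000 (38, 157, 5)
# 30000 (74, 120, 6)
# 100000 (109, 84, 7)
```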
Result of simulation 2
Depth of min-max             4      5      6      7      8      9
Presented method as black    94-6   77-20  48-49  37-61  24-73  14-85
Presented method as white    78-21  78-20  61-35  38-57  40-52  20-74

Win-loss record of the presented method in 100 games
Some games were draws
100000 playouts for the presented method
Roughly equal in strength to depth-6 min-max
Impression by human viewer
• The presented method frequently makes bad
  moves
• Although it is a variation of the Monte-Carlo
  method, it can find a mating sequence
• It is good at finding a narrow path
• A difference in the number of playouts clearly
  shows up as a difference in strength
Conclusion and future issue
• Conclusion
   – Playout by win rate first
   – Select moves without preset knowledge
   – Select moves by result of playout
• Future
   – Someone could apply it to “Go” or other
     chess-like games
   – I will return to research on speech signal
     processing

win rate first search

  • 1. From Monte-Carlo to win rate first search for “Dobutsu Shogi” 2010/05/22 IHARA Takehiro
  • 2. Abstract • On an algorithm for computer shogi (Japanese chess) • Contents – Exhibition of Dobutsu Shogi – Min-max method (conventional) – Monte-Carlo method (conventional) – Win rate first search (presented)
  • 3. Dobutsu Shogi • These slides discuss computer game algorithms using Dobutsu Shogi • Dobutsu Shogi: a miniature shogi • Shogi: Japanese chess • Dobutsu: animal • Normal shogi is too large for testing new methods
  • 4. Rules of Dobutsu Shogi 1 • Five kinds of pieces • The initial position is as shown in the figure • You win if you capture the opponent's lion • You win if your lion reaches the opposite end • A chick promotes to a chicken
  • 5. Rules of Dobutsu Shogi 2 • All pieces move one step • Chick: forward • Chicken: vertical, horizontal, and forward-diagonal • Lion: any of the 8 surrounding squares • Elephant: diagonal • Giraffe: vertical and horizontal • You can reuse (drop) the pieces that you took
  • 6. Copyright of Dobutsu Shogi • I do not know who holds the copyright – FUJITA Maiko (illustration) – KITAO Madoka (rule design) – LPSA (the organization the two designers belonged to) – GENTOSHA Education (toy seller)
  • 7. Illustrations in these slides • Because the copyright situation is complex, these slides use illustrations from the website below instead of FUJITA's • “SOZAIYA JUN” • (http://park18.wakwak.com/~osyare/)
  • 8. Exhibition initial position • Black: win rate first search (presented) • White: min-max method (conventional), search depth 9, evaluation function composed only of piece values
  • 9. Exhibition 1st move: Black advanced giraffe
  • 10. Exhibition 2nd move: White advanced giraffe
  • 11. Exhibition 3rd move: Black took chick with chick
  • 12. Exhibition 4th move: White took chick with elephant
  • 13. Exhibition 5th move: Black advanced elephant
  • 14. Exhibition 6th move: White dropped chick for defense
  • 15. Exhibition 7th move: Black moved giraffe backward
  • 16. Exhibition 8th move: White advanced giraffe
  • 17. Exhibition 9th move: Black dropped chick for defense
  • 18. Exhibition 10th move: White took elephant with giraffe
  • 19. Exhibition 11th move: Black took giraffe with lion
  • 20. Exhibition 12th move: White dropped elephant. This formation of combined elephants is strong
  • 21. Exhibition 13th move: Black's lion escaped
  • 22. Exhibition 14th move: White advanced lion
  • 23. Exhibition 15th move: Black dropped giraffe, giving check
  • 24. Exhibition 16th move: White's lion escaped
  • 25. Exhibition 17th move: Black advanced giraffe. This forced White to choose between taking the giraffe and moving the elephant to safety
  • 26. Exhibition 18th move: White took giraffe with elephant
  • 27. Exhibition 19th move: Black took elephant with lion
  • 28. Exhibition 20th move: White dropped giraffe
  • 29. Exhibition 21st move: Black dropped elephant behind lion
  • 30. Exhibition 22nd move: White moved elephant backward
  • 31. Exhibition 23rd move: Black advanced elephant
  • 32. Exhibition 24th move: White gave check with giraffe
  • 33. Exhibition 25th move: Black took giraffe with elephant
  • 34. Exhibition 26th move: White took elephant with chick. If White had taken with the elephant, White would have been mated
  • 35. Exhibition 27th move: Black's lion escaped
  • 36. Exhibition 28th move: White dropped elephant
  • 37. Exhibition 29th move: Black gave check with giraffe
  • 38. Exhibition 30th move: White took giraffe with elephant
  • 39. Exhibition 31st move: Black took chick with lion, and White resigned. Had play continued: White drops giraffe beside lion, Black's giraffe takes elephant with check, White's lion takes it, Black's chick advances, White's lion moves backward, Black drops chick, checkmate
  • 40. Min-max method • A conventional method • Currently the most successful method for shogi • Explained with a tree structure starting on the next slide
  • 41. Min-max • Example: depth 3 • [Figure: game tree with the present board position at the root, followed by the board positions after 1, 2, and 3 moves]
  • 42. Min-max • Suppose the scores after 3 moves are known • Leaf scores, grouped by parent: (-8, 23), (5, -9), (3, 10), (-3, -4)
  • 43. Min-max • The score after 2 moves is the maximum of the child scores • max(-8, 23) = 23, max(5, -9) = 5, max(3, 10) = 10, max(-3, -4) = -3
  • 44. Min-max • The score after 1 move is the minimum of the child scores • min(23, 5) = 5, min(10, -3) = -3
  • 45. Min-max • Select the move with the maximum score • max(5, -3) = 5
  • 46. Min-max method • In theory, you can select the move that leads to the maximum score after N moves • In theory, if we could obtain the scores at the end of the game, we would always win the game • In practice, the computational cost is too large to calculate all moves
  • 47. Min-max method • Many methods for reducing the computational cost have been proposed, but they are not covered in these slides (reducing the number of searched nodes is called pruning)
  • 48. Conclusion of min-max method • It uses a tree structure • Scores after N moves are needed • Pruning is needed
  • 49. Monte-Carlo method • I do not know the history of the Monte-Carlo method, but it has been successful for computer Go (more precisely, Monte-Carlo tree search has been successful) • It is said to be still difficult to apply to computer shogi (or chess-like games)
  • 50. Outline of Monte-Carlo • Repeat random moves • Then the game finishes and the winner is revealed • Playing a game to its end by random moves is called a playout • [Figure: a first move, followed by random moves (the playout) down to the end of the game]
  • 51. Outline of Monte-Carlo • Repeat playouts • Obtain the win rate of each first move • Win rate = (number of wins) / (number of playouts) • Finally, select the move with the highest win rate
  • 52. Outline of Monte-Carlo • That is the whole outline • For Go, this method became stronger when combined with a tree structure, giving Monte-Carlo tree search (not covered in these slides) • Another improvement is for playouts to choose moves using knowledge of Go instead of purely random moves
  • 53. Example of knowledge of Go • Look at 3x3 squares • Assign a low probability to placing a black stone at the center of the upper figure • Assign a high probability to placing a black stone at the center of the lower figure
  • 54. Monte-Carlo for shogi • The simple Monte-Carlo method does not work for shogi (too many bad moves appear) • The likely cause is that only a few of the legal moves in shogi are good • I do not want to use knowledge of shogi, whether obtained by machine learning or set manually
  • 55. Why Monte-Carlo for shogi • It determines the move from the result at the end of the game, which seems elegant • No evaluation function and no preset knowledge are needed
  • 56. Discussion: Monte-Carlo using a tree • Simple random moves lead to an equal win rate for green and red • In truth, green wins and red loses • This shows the importance of the tree structure
  • 57. Discussion: Monte-Carlo using a tree • Suppose you obtain the win rates after 3 moves by playout: 0.1, 0.3, 0.7, 0.8, 0.2, 0.6, 0.9, 0.4 • Obtain the win rates of green and red from these rates
  • 58. Discussion: Monte-Carlo using a tree • Ideally the rates are equal to those of the min-max method • After 3 moves: 0.1, 0.3, 0.7, 0.8, 0.2, 0.6, 0.9, 0.4 • After 2 moves: 0.3, 0.8, 0.6, 0.9 • After 1 move: 0.3, 0.6
  • 59. Discussion: Monte-Carlo using a tree • Q: How do you calculate the parent node's 0.6 from the child nodes' 0.2 and 0.6? • A: Ignore 0.2
  • 60. Discussion: Monte-Carlo using a tree • Q: How do you ignore 0.2? • A1: Always search the node with the maximum win rate (0.6) • A2: Sometimes search a node at random
  • 61. Discussion: Monte-Carlo using a tree • Search the node that has the maximum win rate • Win rates after 3 moves: 0.1, 0.3, 0.7, 0.8, 0.2, 0.6, 0.9, 0.4 • This tactic finds the best path
  • 62. Win rate first search • Remember the win rate of each searched node • Almost always search the node that has the maximum win rate • Sometimes search randomly (ideally this is not needed) • Then this algorithm finds the best move
  • 63. Additional explanation • Update the win rate at every playout • Keep the win rate as a numerator and a denominator • Add a constant to both the numerator and the denominator when the playout is won • Add the constant to the denominator only when the playout is lost
  • 64. Problems of presented method • Win rates for nodes that have not been searched are discussed starting on the next slide • Many other issues probably remain, though I have not identified them
  • 65. Unreached node • A node that has not been searched has no win rate • [Figure: nodes with win rates 0.4, 0.6, and 0.3, and one unreached node]
  • 66. Another win rate • Up to this slide, no knowledge of shogi has appeared and only the graph is used • This win rate uses knowledge of shogi • The win rate is calculated by kind of move • For example, taking a piece, promotion, etc.
  • 67. Another win rate • Calculate the win rate from these factors – Piece position before and after the move – Kind of piece moving and kind of piece taken – Whether the square is controlled or not • A win rate table covering all combinations of these factors is prepared • These win rates are learned by playout; their values are not preset
  • 68. Another smaller win rate • Another, smaller win rate table is prepared – Kind of piece moving and kind of piece taken – Whether the square is controlled or not • Since it is small, it learns fast • It is used when the larger win rate table has not been learned yet • If none of the three kinds of win rate has been learned, let the win rate be 1
  • 69. Conclusion of presented method • Win rates of all searched nodes are remembered and learned by playout • In a playout, select the node that has the highest win rate (“win rate first search”) • Sometimes select a node randomly • If a win rate has not been learned, the other win rates are used
  • 70. Conditions of simulation games • Win rate first search vs. simple min-max method (evaluation function composed only of piece values) • If a game reaches 80 moves, it is counted as a draw (special rule for this simulation)
  • 71. Result of simulation 1 • Win-loss record of the presented method in 100 games (some games were drawn) • Depth of min-max method is 6
        Number of playouts          10000   30000   100000
        Presented method as Black   22-76   44-52   48-49
        Presented method as White   16-81   30-68   61-35
    • The more playouts, the stronger the method
  • 72. Result of simulation 2 • Win-loss record of the presented method in 100 games (some games were drawn) • 100000 playouts for the presented method
        Depth of min-max            4       5       6       7       8       9
        Presented method as Black   94-6    77-20   48-49   37-61   24-73   14-85
        Presented method as White   78-21   78-20   61-35   38-57   40-52   20-74
    • Roughly equal in strength to depth-6 min-max
  • 73. Impressions of a human viewer • The presented method frequently plays bad moves • Although it is a variation of the Monte-Carlo method, it can find mating sequences • It is good at finding narrow lines • The difference in the number of playouts clearly shows up as a difference in strength
  • 74. Conclusion and future work • Conclusion – Playouts by win rate first – Select moves without preset knowledge – Select moves by the results of playouts • Future – Someone can apply it to Go or other chess-like games – I will return to research on speech signal processing
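
The min-max example on slides 42 to 45 can be reproduced in a few lines of Python. This is a minimal sketch: the nested-list tree encoding and the function name `minimax` are assumptions made for illustration and do not come from the slides; only the eight leaf scores do.

```python
def minimax(node, maximizing):
    """Return the min-max value of a node.

    A node is either a leaf score (a number) or a list of child nodes.
    Scores are from the point of view of the player at the root.
    """
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


# Leaf scores from slide 42: two first moves, two replies each,
# and two third moves after each reply.
tree = [
    [[-8, 23], [5, -9]],
    [[3, 10], [-3, -4]],
]

# After our first move the opponent is to move, so those nodes minimize.
first_move_scores = [minimax(child, maximizing=False) for child in tree]
best = max(range(len(tree)), key=lambda i: first_move_scores[i])
print(first_move_scores, best)
```

The first-move scores come out as 5 and -3, so the first move is selected, matching slide 45.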
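
The simple Monte-Carlo outline on slides 50 and 51 (repeat playouts, compute wins divided by playouts for each first move, pick the highest) can be sketched as follows. The game here is not Dobutsu Shogi but a tiny stand-in chosen only to keep the example self-contained: players alternately take 1 to 3 stones, and whoever takes the last stone wins. All function names are hypothetical.

```python
import random


def legal_moves(stones):
    """Moves in the stand-in game: take 1, 2, or 3 stones."""
    return [n for n in (1, 2, 3) if n <= stones]


def playout(stones, my_turn, rng):
    """Play random moves to the end of the game.

    Returns True if "I" take the last stone. Assumes stones > 0.
    """
    while True:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return my_turn
        my_turn = not my_turn


def monte_carlo_move(stones, n_playouts, rng):
    """Estimate the win rate of each first move and pick the best one."""
    win_rate = {}
    for move in legal_moves(stones):
        remaining = stones - move
        if remaining == 0:
            win_rate[move] = 1.0  # taking the last stone wins at once
            continue
        # After my first move the opponent is to move.
        wins = sum(playout(remaining, False, rng) for _ in range(n_playouts))
        win_rate[move] = wins / n_playouts
    best = max(win_rate, key=win_rate.get)
    return best, win_rate


best, rates = monte_carlo_move(3, 200, random.Random(0))
print(best, rates)
```

Because every move after the first is random, the estimate for a first move reflects the average over all replies rather than the opponent's best reply, which is the weakness slide 56 points out.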
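
Slides 62, 63, and 68 describe win rate first search only in outline, so the following is one possible reading rather than the author's implementation. It assumes that each node stores a numerator and denominator from the point of view of the player who moved into it, that both sides pick the child with the highest stored win rate, that a random move is played with probability `EPSILON`, and that an unsearched node has win rate 1. The shogi-knowledge tables of slides 66 to 68 are left out. `EPSILON`, `STEP`, the function names, and the stone-taking stand-in game (take 1 to 3 stones, last stone wins) are all assumptions for illustration.

```python
import random

EPSILON = 0.1  # assumed probability of searching a random node
STEP = 1.0     # assumed constant added to numerator and denominator


def legal_moves(stones):
    """Moves in the stand-in game: take 1, 2, or 3 stones."""
    return [n for n in (1, 2, 3) if n <= stones]


def win_rate(stats, node):
    """Stored win rate of a node; an unsearched node counts as 1."""
    if node not in stats:
        return 1.0
    num, den = stats[node]
    return num / den


def playout(stones, stats, rng):
    """One playout guided by win rates. Assumes stones > 0.

    Returns the visited (node, mover) pairs and the winner (0 or 1).
    A node is the tuple of moves played from the root.
    """
    path, node, mover = [], (), 0
    while True:
        moves = legal_moves(stones)
        if rng.random() < EPSILON:
            move = rng.choice(moves)
        else:
            move = max(moves, key=lambda m: win_rate(stats, node + (m,)))
        node = node + (move,)
        path.append((node, mover))
        stones -= move
        if stones == 0:
            return path, mover
        mover = 1 - mover


def search(stones, n_playouts, seed=0):
    """Run playouts, update win rates, and return the best first move."""
    rng = random.Random(seed)
    stats = {}
    for _ in range(n_playouts):
        path, winner = playout(stones, stats, rng)
        for node, mover in path:
            num, den = stats.get(node, (0.0, 0.0))
            if mover == winner:
                num += STEP  # win: add to the numerator as well
            stats[node] = (num, den + STEP)  # always add to the denominator
    best = max(legal_moves(stones), key=lambda m: win_rate(stats, (m,)))
    return best, stats


best, stats = search(3, 2000)
print(best, win_rate(stats, (best,)))
```

With three stones, taking all three wins immediately, so that node's numerator and denominator stay equal once it has been searched. Larger piles need more playouts, in line with the observation on slide 71 that strength grows with the number of playouts.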