SlideShare uma empresa Scribd logo
1 de 21
Reinforcement Learning in  Partially Observable Environments Michael L. Littman
Temporal Difference Learning (1) Q learning: reduce discrepancy between successive Q estimates One step time difference: Why not two steps? Or  n ? Blend all of these:
Temporal Difference Learning (2) ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Equivalent expression:
non-Markov Examples ,[object Object]
Markov Decision Processes ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Partially Observable MDPs ,[object Object],[object Object],[object Object]
State Approaches to POMDPs ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Learning a POMDP ,[object Object],[object Object],[object Object],state occupation probabilities POMDP model E: Forward-backward M: Fractional counting
EM Pitfalls ,[object Object],[object Object],[object Object],[object Object]
Information State ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Plan with Information States ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Predictions as State ,[object Object],[object Object],[object Object],up   blue left   red up  not  blue left  not  red up   blue left  not  red predict:  up   blue ?  left   red ? up  not  blue left   red
Experience as State ,[object Object],[object Object],[object Object],[object Object]
Classification Dialog  (Keim & Littman 99) ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Incremental Pruning Output ,[object Object],S T S R S M
S T = 0.00
S T = 0.02
S T = 0.22
S T = 0.76
S T = 0.90
Wrap Up ,[object Object],[object Object],[object Object],[object Object],[object Object]

Mais conteúdo relacionado

Destaque

resume.doc
resume.docresume.doc
resume.doc
butest
 
Learning
LearningLearning
Learning
butest
 
Lecture 7
Lecture 7Lecture 7
Lecture 7
butest
 
Machine Learning CSCI 5622
Machine Learning CSCI 5622Machine Learning CSCI 5622
Machine Learning CSCI 5622
butest
 
Machine Learning: Decision Trees Chapter 18.1-18.3
Machine Learning: Decision Trees Chapter 18.1-18.3Machine Learning: Decision Trees Chapter 18.1-18.3
Machine Learning: Decision Trees Chapter 18.1-18.3
butest
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...
butest
 
EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBE
butest
 
Lecture 18: Gaussian Mixture Models and Expectation Maximization
Lecture 18: Gaussian Mixture Models and Expectation MaximizationLecture 18: Gaussian Mixture Models and Expectation Maximization
Lecture 18: Gaussian Mixture Models and Expectation Maximization
butest
 
Learning coordination strategies using reinforcement learning myriam z abrams...
Learning coordination strategies using reinforcement learning myriam z abrams...Learning coordination strategies using reinforcement learning myriam z abrams...
Learning coordination strategies using reinforcement learning myriam z abrams...
Chang Ching-Chao
 

Destaque (12)

resume.doc
resume.docresume.doc
resume.doc
 
Learning
LearningLearning
Learning
 
Lecture 7
Lecture 7Lecture 7
Lecture 7
 
Robot Motor Skill Coordination with EM-based Reinforcement Learning
Robot Motor Skill Coordination with EM-based Reinforcement LearningRobot Motor Skill Coordination with EM-based Reinforcement Learning
Robot Motor Skill Coordination with EM-based Reinforcement Learning
 
Machine Learning CSCI 5622
Machine Learning CSCI 5622Machine Learning CSCI 5622
Machine Learning CSCI 5622
 
Machine Learning: Decision Trees Chapter 18.1-18.3
Machine Learning: Decision Trees Chapter 18.1-18.3Machine Learning: Decision Trees Chapter 18.1-18.3
Machine Learning: Decision Trees Chapter 18.1-18.3
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...
 
EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBE
 
Lecture 18: Gaussian Mixture Models and Expectation Maximization
Lecture 18: Gaussian Mixture Models and Expectation MaximizationLecture 18: Gaussian Mixture Models and Expectation Maximization
Lecture 18: Gaussian Mixture Models and Expectation Maximization
 
Clustering:k-means, expect-maximization and gaussian mixture model
Clustering:k-means, expect-maximization and gaussian mixture modelClustering:k-means, expect-maximization and gaussian mixture model
Clustering:k-means, expect-maximization and gaussian mixture model
 
Learning coordination strategies using reinforcement learning myriam z abrams...
Learning coordination strategies using reinforcement learning myriam z abrams...Learning coordination strategies using reinforcement learning myriam z abrams...
Learning coordination strategies using reinforcement learning myriam z abrams...
 
Psych 101 - Introduction to Psychology - Lecture 1
Psych 101 - Introduction to Psychology - Lecture 1Psych 101 - Introduction to Psychology - Lecture 1
Psych 101 - Introduction to Psychology - Lecture 1
 

Semelhante a Lecture notes

lecture_21.pptx - PowerPoint Presentation
lecture_21.pptx - PowerPoint Presentationlecture_21.pptx - PowerPoint Presentation
lecture_21.pptx - PowerPoint Presentation
butest
 
Relational Transfer in Reinforcement Learning
Relational Transfer in Reinforcement LearningRelational Transfer in Reinforcement Learning
Relational Transfer in Reinforcement Learning
butest
 
Hidden Markov Models with applications to speech recognition
Hidden Markov Models with applications to speech recognitionHidden Markov Models with applications to speech recognition
Hidden Markov Models with applications to speech recognition
butest
 
Hidden Markov Models with applications to speech recognition
Hidden Markov Models with applications to speech recognitionHidden Markov Models with applications to speech recognition
Hidden Markov Models with applications to speech recognition
butest
 

Semelhante a Lecture notes (20)

lecture_21.pptx - PowerPoint Presentation
lecture_21.pptx - PowerPoint Presentationlecture_21.pptx - PowerPoint Presentation
lecture_21.pptx - PowerPoint Presentation
 
Reinfrocement Learning
Reinfrocement LearningReinfrocement Learning
Reinfrocement Learning
 
Hierarchical Reinforcement Learning
Hierarchical Reinforcement LearningHierarchical Reinforcement Learning
Hierarchical Reinforcement Learning
 
REINFORCEMENT LEARNING
REINFORCEMENT LEARNINGREINFORCEMENT LEARNING
REINFORCEMENT LEARNING
 
Matineh Shaker, Artificial Intelligence Scientist, Bonsai at MLconf SF 2017
Matineh Shaker, Artificial Intelligence Scientist, Bonsai at MLconf SF 2017Matineh Shaker, Artificial Intelligence Scientist, Bonsai at MLconf SF 2017
Matineh Shaker, Artificial Intelligence Scientist, Bonsai at MLconf SF 2017
 
reiniforcement learning.ppt
reiniforcement learning.pptreiniforcement learning.ppt
reiniforcement learning.ppt
 
RL intro
RL introRL intro
RL intro
 
TensorFlow and Deep Learning Tips and Tricks
TensorFlow and Deep Learning Tips and TricksTensorFlow and Deep Learning Tips and Tricks
TensorFlow and Deep Learning Tips and Tricks
 
Reinforcement Learning.ppt
Reinforcement Learning.pptReinforcement Learning.ppt
Reinforcement Learning.ppt
 
Introduction to Reinforcement Learning for Molecular Design
Introduction to Reinforcement Learning for Molecular Design Introduction to Reinforcement Learning for Molecular Design
Introduction to Reinforcement Learning for Molecular Design
 
YijueRL.ppt
YijueRL.pptYijueRL.ppt
YijueRL.ppt
 
RL_online _presentation_1.ppt
RL_online _presentation_1.pptRL_online _presentation_1.ppt
RL_online _presentation_1.ppt
 
Goprez sg
Goprez  sgGoprez  sg
Goprez sg
 
Reinforcement Learning : A Beginners Tutorial
Reinforcement Learning : A Beginners TutorialReinforcement Learning : A Beginners Tutorial
Reinforcement Learning : A Beginners Tutorial
 
meng.ppt
meng.pptmeng.ppt
meng.ppt
 
Relational Transfer in Reinforcement Learning
Relational Transfer in Reinforcement LearningRelational Transfer in Reinforcement Learning
Relational Transfer in Reinforcement Learning
 
Hidden Markov Models with applications to speech recognition
Hidden Markov Models with applications to speech recognitionHidden Markov Models with applications to speech recognition
Hidden Markov Models with applications to speech recognition
 
Hidden Markov Models with applications to speech recognition
Hidden Markov Models with applications to speech recognitionHidden Markov Models with applications to speech recognition
Hidden Markov Models with applications to speech recognition
 
Ydstie
YdstieYdstie
Ydstie
 
This Week in Machine Learning and AI Feb 2019
This Week in Machine Learning and AI Feb 2019This Week in Machine Learning and AI Feb 2019
This Week in Machine Learning and AI Feb 2019
 

Mais de butest

1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同
butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
butest
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jackson
butest
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
butest
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer II
butest
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
butest
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.doc
butest
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1
butest
 
Facebook
Facebook Facebook
Facebook
butest
 
Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...
butest
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENT
butest
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.doc
butest
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.doc
butest
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.doc
butest
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!
butest
 
Download
DownloadDownload
Download
butest
 
Download.doc.doc
Download.doc.docDownload.doc.doc
Download.doc.doc
butest
 

Mais de butest (20)

1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jackson
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer II
 
PPT
PPTPPT
PPT
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.doc
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1
 
Facebook
Facebook Facebook
Facebook
 
Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENT
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.doc
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.doc
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.doc
 
hier
hierhier
hier
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!
 
Download
DownloadDownload
Download
 
Download.doc.doc
Download.doc.docDownload.doc.doc
Download.doc.doc
 

Lecture notes