Emotion-Driven Reinforcement Learning
Bob Marinier & John Laird
University of Michigan, Computer Science and Engineering
CogSci’08
Introduction
  • Interested in the functional benefits of emotion for a cognitive agent
  • Appraisal theories of emotion
  • PEACTIDM theory of cognitive control
  • Use emotion as a reward signal to a reinforcement learning agent
  • Demonstrates a functional benefit of emotion
  • Provides a theory of the origin of intrinsic reward
Outline
  • Background
  • Integration of emotion and cognition
  • Integration of emotion and reinforcement learning
  • Implementation in Soar
  • Learning task
  • Results
Appraisal Theories of Emotion
  • A situation is evaluated along a number of appraisal dimensions, many of which relate the situation to current goals
  • Novelty, goal relevance, goal conduciveness, expectedness, causal agency, etc.
  • Appraisals influence emotion
  • Emotion can then be coped with (via internal or external actions)
  [Diagram labels: Situation, Goals, Appraisals, Emotion, Coping]
Appraisals to Emotions (Scherer 2001)
Cognitive Control: PEACTIDM (Newell 1990)
Unification of PEACTIDM and Appraisal Theories
  [Diagram. Stage labels: Perceive, Encode, Attend, Comprehend, Intend, Decode, Motor. Appraisal labels: Suddenness, Unpredictability, Goal Relevance, Intrinsic Pleasantness, Outcome Probability, Causal Agent/Motive, Discrepancy, Conduciveness, Control/Power. Other labels: Raw Perceptual Information, Environmental Change, Stimulus Relevance, Stimulus chosen for processing, Current Situation Assessment, Prediction, Action, Motor Commands]
Distinction between emotion, mood, and feeling (Marinier & Laird 2007)
  • Emotion: result of appraisals; is about the current situation
  • Mood: “average” over recent emotions; provides historical context
  • Feeling: emotion “+” mood; what the agent actually perceives
Emotion, mood, and feeling
  [Diagram labels: Cognition, Active Appraisals, Emotion, Mood, Combination Function, Feeling, Perceived Feeling, Pull, Decay]
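The relationship between emotion, mood, and feeling described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `pull_rate` and `decay_rate` parameters and their default values, and the particular combination function are all hypothetical. They are chosen only to show the shape of the idea: mood is pulled toward the current emotion and decays toward neutral, and feeling combines the two while staying in [-1, 1].

```python
from typing import List


def update_mood(mood: List[float], emotion: List[float],
                pull_rate: float = 0.1, decay_rate: float = 0.01) -> List[float]:
    """Pull each mood dimension toward the current emotion, then decay toward 0.

    pull_rate and decay_rate are illustrative values, not taken from the talk.
    """
    new_mood = []
    for m, e in zip(mood, emotion):
        pulled = m + pull_rate * (e - m)              # "Pull": move toward emotion
        new_mood.append(pulled * (1.0 - decay_rate))  # "Decay": shrink toward neutral
    return new_mood


def combine(e: float, m: float) -> float:
    """Hypothetical bounded, non-linear combination of two values in [-1, 1].

    (e + m) / (1 + e*m) never leaves [-1, 1]; it stands in for the talk's
    combination function, which the transcript does not give.
    """
    denom = 1.0 + e * m
    if abs(denom) < 1e-12:  # only when e and m are exact opposite extremes
        return 0.0
    return (e + m) / denom


def feeling(emotion: List[float], mood: List[float]) -> List[float]:
    """Feeling is emotion "+" mood, computed per appraisal dimension."""
    return [combine(e, m) for e, m in zip(emotion, mood)]
```

Treating each dimension separately relies on the assumption that appraisal dimensions are independent.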
Intrinsically Motivated Reinforcement Learning (Sutton & Barto 1998; Singh et al. 2004)
  [Diagram labels: External Environment, Internal Environment, “Organism”, Agent, Critic, Appraisal Process, Actions, Sensations, States, Rewards, Decisions, +/- Feeling Intensity]
  Reward = Intensity * Valence
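As a rough illustration of how the rule Reward = Intensity * Valence could feed a learner, the sketch below computes the reward and applies it in a generic tabular Q-learning update. The transcript does not say which update rule the agent uses, so the Q-learning step, the names `intrinsic_reward` and `q_update`, the `alpha` and `gamma` defaults, and the assumed ranges (intensity in [0, 1], valence in [-1, 1]) are all assumptions made for the example.

```python
from typing import Dict, Hashable, Iterable, Tuple

QTable = Dict[Tuple[Hashable, Hashable], float]


def intrinsic_reward(intensity: float, valence: float) -> float:
    """Reward = Intensity * Valence, with assumed ranges checked explicitly."""
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity is assumed to lie in [0, 1]")
    if not -1.0 <= valence <= 1.0:
        raise ValueError("valence is assumed to lie in [-1, 1]")
    return intensity * valence


def q_update(q: QTable, state: Hashable, action: Hashable, reward: float,
             next_state: Hashable, actions: Iterable[Hashable],
             alpha: float = 0.1, gamma: float = 0.9) -> None:
    """One generic Q-learning step (a stand-in; not confirmed by the source)."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)


# Usage: a strongly felt negative situation yields a strongly negative reward.
q: QTable = {}
r = intrinsic_reward(intensity=0.8, valence=-1.0)
q_update(q, state="s0", action="north", reward=r,
         next_state="s1", actions=["north", "south"])
```

Because the feeling is available on every step, the learner receives a reward signal far more often than it would from a goal-only reward.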
Extending Soar with Emotion (Marinier & Laird 2007)
  [Diagram labels: Symbolic Long-Term Memories (Procedural, Semantic, Episodic), Reinforcement Learning, Chunking, Semantic Learning, Episodic Learning, Appraisal Detector, Short-Term Memory (Situation, Goals), Decision Procedure, Visual Imagery, Perception, Action, Body]
Extending Soar with Emotion (Marinier & Laird 2007)
  [Same diagram, with additional labels: Appraisals; Emotion (.5, .7, 0, -.4, .3, …); Mood (.7, -.2, .8, .3, .6, …); Feeling (.9, .6, .5, -.1, .8, …); Feelings; +/- Intensity; Knowledge; Architecture]
Learning task
  [Figure labels: Start, Goal]
Learning task: Encoding
  • North: Passable: false; On path: false; Progress: true
  • East: Passable: false; On path: true; Progress: true
  • West: Passable: false; On path: false; Progress: true
  • South: Passable: true; On path: true; Progress: true
Learning task: Encoding & Appraisal
  • North: Intrinsic Pleasantness: Low; Goal Relevance: Low; Unpredictability: High
  • East: Intrinsic Pleasantness: Low; Goal Relevance: High; Unpredictability: High
  • West: Intrinsic Pleasantness: Low; Goal Relevance: Low; Unpredictability: High
  • South: Intrinsic Pleasantness: Neutral; Goal Relevance: High; Unpredictability: Low
Learning task: Attending, Comprehending & Appraisal
  • South: Intrinsic Pleasantness: Neutral; Goal Relevance: High; Unpredictability: Low; Conduciveness: High; Control: High; …
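The step from encoded features to appraisal values in the slides above can be mimicked with a small lookup rule. The rule below is only inferred from the four example directions shown (North, East, West, South); the talk does not state how the agent actually generates appraisal values, so the `Encoding` class, the `appraise` function, and the mapping itself are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Encoding:
    """Encoded features of one adjacent cell, as listed on the Encoding slide."""
    passable: bool
    on_path: bool
    progress: bool


def appraise(enc: Encoding) -> Dict[str, str]:
    """Hypothetical rule that reproduces the four examples on the slides."""
    return {
        # Impassable cells are appraised as unpleasant, passable ones as neutral.
        "intrinsic_pleasantness": "Neutral" if enc.passable else "Low",
        # Cells on the path are appraised as relevant to the goal.
        "goal_relevance": "High" if enc.on_path else "Low",
        # In the examples, only the passable cell has low unpredictability.
        "unpredictability": "Low" if enc.passable else "High",
    }


# The four directions from the Encoding slide.
directions = {
    "North": Encoding(passable=False, on_path=False, progress=True),
    "East": Encoding(passable=False, on_path=True, progress=True),
    "West": Encoding(passable=False, on_path=False, progress=True),
    "South": Encoding(passable=True, on_path=True, progress=True),
}

appraisals = {name: appraise(enc) for name, enc in directions.items()}
```

Four examples cannot pin down the real rule (for instance, Progress is true in every case), so this mapping should be read as one of many that fit the slides.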
Learning task: Tasking
Learning task: Tasking
  [Figure label: Optimal Subtasks]
What is being learned?
  • When to Attend vs. Task
  • If Attending, what to Attend to
  • If Tasking, which subtask to create
  • When to Intend vs. Ignore
Learning Results
Results: With and without mood
Discussion
  • Agent learns both internal (tasking) and external (movement) actions
  • Emotion allows for more frequent rewards, and thus the agent learns faster than with standard RL
  • Mood “fills in the gaps”, allowing for even faster learning and less variability

Marinier Laird Cogsci 2008 Emotionrl Pres

  • 23. Conclusion & Future Work. Demonstrated a computational model that integrates emotion and cognitive control. Confirmed emotion can drive reinforcement learning. We have already successfully demonstrated similar learning in a more complex domain. Would like to explore multi-agent scenarios.
  • 24. Circumplex models. Emotions can be described in terms of intensity and valence, as in a circumplex model. [Figure: circumplex with an intensity axis (high to low) and a valence axis (negative to positive); emotion labels: alert, excited, elated, happy, contented, serene, relaxed, calm, fatigued, lethargic, depressed, sad, upset, stressed, nervous, tense.] Adapted from Feldman Barrett & Russell (1998).
  • 25. Computing Feeling from Emotion and Mood. Assumption: appraisal dimensions are independent. Limited range: inputs and outputs are in [0,1] or [-1,1]. Distinguishability: very different inputs should lead to very different outputs. Non-linear: linearity would violate limited range and distinguishability.
  • 26. Computing Feeling Intensity. Motivation: intensity gives a summary of how important (i.e., how good or bad) the situation is. Limited range: should map onto [0,1]. No dominant appraisal: no single value should drown out all the others; can’t just multiply values, because if any are 0, then intensity is 0. Realization principle: expected events should be less intense than unexpected events.
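The intensity constraints listed above can be illustrated with a small function. This is not the equation used in the talk, which the transcript does not include; it is one hypothetical function that happens to satisfy the stated constraints. The name `feeling_intensity`, the `unexpectedness` input, and the 0.5 baseline are assumptions made for the sketch.

```python
from typing import Sequence


def feeling_intensity(appraisals: Sequence[float], unexpectedness: float) -> float:
    """Hypothetical intensity in [0, 1] from appraisal values in [-1, 1].

    - Limited range: a mean of magnitudes times a factor in [0.5, 1] stays in [0, 1].
    - No dominant appraisal: averaging (rather than multiplying) means a single
      0 does not force the result to 0, and no single value decides the outcome.
    - Realization principle: the surprise factor grows with unexpectedness,
      so expected events come out less intense than unexpected ones.
    """
    if not appraisals:
        raise ValueError("at least one appraisal value is required")
    if any(abs(a) > 1.0 for a in appraisals):
        raise ValueError("appraisal values are assumed to lie in [-1, 1]")
    if not 0.0 <= unexpectedness <= 1.0:
        raise ValueError("unexpectedness is assumed to lie in [0, 1]")

    magnitude = sum(abs(a) for a in appraisals) / len(appraisals)
    surprise = 0.5 + 0.5 * unexpectedness  # illustrative baseline of 0.5
    return surprise * magnitude


# Usage: the same appraisals feel more intense when the event was unexpected.
values = [1.0, 0.0, -1.0]
expected = feeling_intensity(values, unexpectedness=0.0)
unexpected = feeling_intensity(values, unexpectedness=1.0)
```

Note that the zero in `values` lowers the average but does not zero out the intensity, which is the behavior the "can't just multiply" constraint asks for.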

Editor's Notes

  1. Be careful about how to say the agent generates appraisal values
  2. Say prediction is our extension
  3. A cognitive architecture is a set of task-independent mechanisms that interact to give rise to behavior.
  4. In this environment, the agent’s sensing is limited: it can only see the cells immediately adjacent to it in the four cardinal directions. The agent has a sensor that tells it its Manhattan distance to the goal. However, the agent has no knowledge as to the effects of its actions, and thus cannot evaluate possible actions relative to the goal until it has actually performed them. Even then, it cannot always blindly move closer to the goal because given the shape of the maze, it must sometimes increase its Manhattan distance to the goal in order to make progress in the maze.
  5. Mention relaxation and direction
  6. 15 episodes; 50 trials; cutoff at 10k dcs; median
  7. 1st and 3rd quartiles shown. Reach optimality at the same time, but mood is less variable.
  8. This is an extension of previous work. These constraints define a set of equations. This is one possible equation, which improves on previous work and seems to work well for our current models.
  9. This is an extension of previous work. Unifies intensity for all feelings in one equation (others use different equations for each “kind” of feeling). Again, these constraints define a set of possible functions, of which this is one that seems to work well for us.