Introduction
                       Stochastic Hybrid Systems
                             A learning Approach
                        Discussion & Future Work




         A Learning Approach to Verification and Control of
         Stochastic Hybrid Systems
         Literature Colloquium
         Sofie Haesaert, DCSC
         Supervisors: dr. ir. A. Abate and prof. dr. R. Babuška
         May 28, 2012




                                                                             Honeywell.com



1 / 25                                              Verification & Control of SHS


Outline
         Introduction
             Air Traffic Safety and Control
             Applications

         Stochastic Hybrid Systems
            Discrete-time Stochastic Hybrid Systems
            Stochastic Hybrid Systems: Properties
            Verification and Control

         A learning Approach
             Current Methods: Dynamic Programming
             Related Work

         Discussion & Future Work




Air Traffic Safety and Control


         Flight Safety
             Avoid: other airplanes, bad weather conditions, restricted
             airspace,...
             Reach end destination

         Analysis based on air traffic model
             Hybrid state space
             Stochastic due to wind and turbulence disturbance of flight
             path

         Safety → must be assessed probabilistically
                                                                                     [J. Hu, 2003]


Introduction



         Other Applications
             Systems Biology
                   DNA replication
                   HIV Treatment
             Industrial Robotics
                   Pick-and-place tasks
             ...




                                                                                     [A. Singh, 2010]


Discrete-Time Stochastic Hybrid Systems


             Hybrid state space $S = \bigcup_{q \in Q} \{q\} \times \mathbb{R}^{n(q)}$
             Stochastic transitions
             Transition kernels in discrete time for
                  Discrete transitions Tq
                  Reset transition Tr
                  Continuous transitions Tx
             Controlled / Autonomous
                  Controls act on the transitions; the action space is either continuous or finite
                  Policy = sequence of controls
         ⇒ Definitions of SHS vary across the literature, e.g. initial states vs
         initial subsets.
                                                                                      [A. Abate, 2008]


Reachability Analysis




         [Figure: initial state s0 and target set K]

         Determine whether a given SHS will reach a target set K within the
         time horizon [0, N], starting from a set of initial states $s_0$. N can
         be either finite or infinite.




Reach-Avoid Problem




         [Figure: initial state s0, safe set A and target set K]

         Determine the probability $r_{s_0}$ that, given an initial state $s_0$, the
         SHS will reach a target set K within the time horizon [0, N]
         while staying inside the safe set A.






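     j = first hitting time of target set K
     Reach-avoid trajectory:
             j ≤ N
             State trajectory stays in safe set A until j

     [Figure: trajectory from s0 inside safe set A to target set K]

     $$r_{s_0} = E_{s_0}\left[ \sum_{j=0}^{N} \left( \prod_{i=0}^{j-1} \mathbf{1}_{A \setminus K}(s_i) \right) \mathbf{1}_K(s_j) \right]$$

     Indicator function:
     $$\mathbf{1}_K(s_k) = \begin{cases} 1, & \text{if } s_k \in K \\ 0, & \text{otherwise} \end{cases}$$

                                                                                        [S. Summers, 2010]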
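
The sum-of-products expression for the reach-avoid probability can be made concrete with a short sketch. The code below is illustrative only: the function names, the 1-D sets A = [0, 1] and K = [0.75, 1], and the random-walk simulator are assumptions introduced for this example and do not come from the slides. It evaluates the term inside the expectation along one trajectory and averages it over simulated runs as a plain Monte Carlo estimate.

```python
import random

def reach_avoid_payoff(trajectory, in_safe, in_target):
    """Evaluate sum_j (prod_{i<j} 1_{A minus K}(s_i)) * 1_K(s_j) for one trajectory."""
    payoff = 0
    alive = 1  # running product of 1_{A minus K}(s_i) over all i < j
    for s in trajectory:
        payoff += alive * int(in_target(s))
        alive *= int(in_safe(s) and not in_target(s))
    return payoff  # 1 if K is hit before leaving A, else 0

def estimate_reach_avoid(simulate, n_runs, in_safe, in_target):
    """Monte Carlo estimate of r_{s0}: mean payoff over simulated trajectories."""
    total = sum(reach_avoid_payoff(simulate(), in_safe, in_target)
                for _ in range(n_runs))
    return total / n_runs

# Hypothetical 1-D example (assumption): A = [0, 1], K = [0.75, 1].
def in_A(s):
    return 0.0 <= s <= 1.0

def in_K(s):
    return 0.75 <= s <= 1.0

_rng = random.Random(0)  # fixed seed so repeated runs give the same estimate

def random_walk(s0=0.5, horizon=20, step=0.125):
    """Simulate s_0, ..., s_N for a symmetric random walk (toy dynamics)."""
    traj = [s0]
    for _ in range(horizon):
        traj.append(traj[-1] + _rng.choice([-step, step]))
    return traj

estimate = estimate_reach_avoid(random_walk, 1000, in_A, in_K)
```

Because the running product drops to zero as soon as the state leaves A \ K, each trajectory contributes at most 1, and only through its first visit to K.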




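         Verification: Find the probability associated with a reach-avoid
         problem

         $$r_{s_0} = E_{s_0}\left[ \sum_{j=0}^{N} \left( \prod_{i=0}^{j-1} \mathbf{1}_{A \setminus K}(s_i) \right) \mathbf{1}_K(s_j) \right]$$

         Control: Find a policy π that maximizes $r_{s_0}^{\pi}$

         $$\sup_{\pi} r_{s_0}^{\pi} = \sup_{\pi} E_{s_0}^{\pi}\left[ \sum_{j=0}^{N} \left( \prod_{i=0}^{j-1} \mathbf{1}_{A \setminus K}(s_i) \right) \mathbf{1}_K(s_j) \right]$$

                                                                                       [S. Summers, 2010]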


Dynamic Programming (1/2)




      Define the value function $V_k : S \to [0, 1]$

      $$V_k(s) = E_s\left[ \sum_{j=k}^{N} \left( \prod_{i=k}^{j-1} \mathbf{1}_{A \setminus K}(s_i) \right) \mathbf{1}_K(s_j) \right]$$

      Then it follows that $V_0(s_0) = r_{s_0}$.






Dynamic Programming (2/2)


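      Verification: For k = N − 1, . . . , 0, iterate backward

      $$V_k(s) = \mathbf{1}_K(s) + \mathbf{1}_{A \setminus K}(s)\, E_s[V_{k+1}], \qquad \forall s \in S$$

      with $V_N(s) = \mathbf{1}_K(s)$ for all $s \in S$. Then $V_0(s_0) = r_{s_0}$.


      Control: Maximization at every iteration step

      $$V_k^*(s) = \sup_{\pi} \left\{ \mathbf{1}_K(s) + \mathbf{1}_{A \setminus K}(s)\, E_s^{\pi}[V_{k+1}^*] \right\}, \qquad \forall s \in S$$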
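
For a system that has already been reduced to finitely many states and actions, the reach-avoid recursion can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method of the cited work: the transition array `P[a, s, s']`, the boolean masks `in_K` and `in_A`, and the three-state example are all hypothetical. With a single action the loop performs the verification recursion; with several actions the maximum over actions gives the control recursion.

```python
import numpy as np

def reach_avoid_dp(P, in_K, in_A, N):
    """Backward recursion V_k = 1_K + 1_{A minus K} * max_a E[V_{k+1}].

    P    : array (n_actions, n_states, n_states), P[a, s, s'] = transition prob.
    in_K : boolean mask of target states (assumed to lie inside A)
    in_A : boolean mask of safe states
    N    : time horizon
    Returns V_0 as an array over states.
    """
    ind_K = in_K.astype(float)
    ind_A_minus_K = (in_A & ~in_K).astype(float)
    V = ind_K.copy()                      # terminal condition V_N = 1_K
    for _ in range(N):                    # k = N-1, ..., 0
        expected = P @ V                  # shape (n_actions, n_states)
        V = ind_K + ind_A_minus_K * expected.max(axis=0)
    return V

# Hypothetical example: state 0 = start (safe), 1 = target, 2 = unsafe.
P = np.array([
    [[0.5, 0.3, 0.2], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],   # action 0
    [[0.0, 0.4, 0.6], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],   # action 1
])
in_K = np.array([False, True, False])
in_A = np.array([True, True, False])

V0 = reach_avoid_dp(P, in_K, in_A, N=2)
```

In this toy example the value at state 0 grows with the horizon, because a longer horizon leaves more time to reach the target before the unsafe state absorbs the trajectory.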






Computational Issues



      Recursion often cannot be written out analytically
      → Approximation: $V_k \approx \hat{V}_k$

          Curse of Dimensionality
          Difference between exact solution and approximation
      Approximations of value function and/or policy include:
          Discrete approximation: Discretization of action-state space
          Functional approximation over action-state space






Current Methods: Dynamic Programming


      Discretization: Partitioning of state space


     Finite action space and hybrid state
     space
       ⇓
     Markov Chain
     (= finite action-state process)
     $\hat{V}_k$ = tabular form
                                                               Figure: Discretization of the
                                                               hybrid state space S

                                                                                 [A. Abate, 2007]
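
As an illustration of how partitioning turns a continuous kernel into a finite Markov chain, the sketch below grids a one-dimensional state space. Everything specific here is an assumption made for the example: the linear-Gaussian dynamics x' = a x + w with w ~ N(0, sigma^2), the uniform partition, and the choice of cell centers as representative points. Each row holds the probability of moving from one cell's representative point into every cell; the missing mass in a row is the probability of leaving the gridded domain.

```python
import math
import numpy as np

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def discretize_kernel(a, sigma, lo, hi, n_cells):
    """Approximate x' = a*x + w, w ~ N(0, sigma^2), by a chain on n_cells cells.

    Returns (centers, P) where P[i, j] is the probability of landing in
    cell j when starting from the center of cell i.
    """
    edges = np.linspace(lo, hi, n_cells + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    P = np.zeros((n_cells, n_cells))
    for i, c in enumerate(centers):
        mean = a * c
        cdf_at_edges = np.array([normal_cdf((e - mean) / sigma) for e in edges])
        P[i] = np.diff(cdf_at_edges)      # mass of each cell under N(mean, sigma^2)
    return centers, P

centers, P_grid = discretize_kernel(a=0.9, sigma=0.1, lo=-1.0, hi=1.0, n_cells=10)
```

The number of cells grows exponentially with the state dimension when every coordinate is gridded this way, which is the scaling problem the slides refer to as the curse of dimensionality.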




      Discretization:

                  +   Error bounds
                  −   Partitioning method
                  −   Curse of dimensionality: scales poorly to higher
                      dimensions
      Goal
             Less conservative error bounds
             Optimal partitioning
             Action partitioning
             Functional approximation






Current Methods: Dynamic Programming



      Functional Approximation:
      SHS with finite action space

      $$\hat{V}_k(s, \theta_k) = \sum_{i=1}^{h} \theta_{i,k}^{q}\, \phi_i(x), \qquad s = (q, x) \in S$$

      Parameter vector $\theta_k = (\theta_k^{q_1}, \ldots, \theta_k^{q_m})$,
      and for each discrete mode q: $\theta_k^{q} = (\theta_{1,k}^{q}, \ldots, \theta_{h,k}^{q})$



                                                                                   [A. Abate, 2008]
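
A linear-in-parameters value function with one weight vector per discrete mode can be evaluated as sketched below. The slides do not specify the basis functions; the Gaussian radial basis functions, the shared centers and width, and the dictionary layout of `theta` are assumptions chosen purely for illustration.

```python
import numpy as np

def rbf_features(x, centers, width):
    """Hypothetical Gaussian basis functions phi_1(x), ..., phi_h(x)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))     # continuous state, shape (n,)
    diffs = x[None, :] - centers                      # shape (h, n)
    return np.exp(-np.sum(diffs ** 2, axis=1) / (2.0 * width ** 2))

def value_estimate(state, theta, centers, width):
    """V_hat(s, theta) = sum_i theta[q][i] * phi_i(x) for hybrid state s = (q, x)."""
    q, x = state
    return float(theta[q] @ rbf_features(x, centers, width))

# Two modes, two basis functions on a 1-D continuous state (toy values).
centers = np.array([[0.0], [1.0]])
theta = {0: np.array([1.0, 0.0]), 1: np.array([0.0, 2.0])}

v_mode0 = value_estimate((0, 0.0), theta, centers, width=1.0)
v_mode1 = value_estimate((1, 1.0), theta, centers, width=1.0)
```

Only h weights per mode need to be stored at each time step, in contrast to one table entry per grid cell in the discretized approach.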




      Functional Approximation:

             −   Only applied to safety problems
             −   Only applied to finite action spaces
             −   No error bounds (yet)
             +   Curse of dimensionality: better scaling properties






A Learning Approach: Related Work on Discounted Return Problems


      Control objective:
      $$\max_{\pi} J^{\pi}(s_0) = \max_{\pi} E_{s_0}^{\pi}\left[ \sum_{k=0}^{N} \gamma^k r_k \right]$$
      with $r_k$ the reward at time k and $\gamma \in [0, 1)$ the discount factor.
          Model free
          Samples $(s_k, a, s_{k+1}, r_k)$
          Approximation methods for continuous state spaces
          Most methods target N → ∞
      e.g. (approximate) Q-learning, LSPI, actor-critic, ...





Fitted Value Iteration (1/2)



          1. Collect samples $(s_k, a, s_{k+1}, r_k)$ at M states:
             $s_i$, i = 1, . . . , M
          2. Estimate the value function at the M states:
             $\tilde{V}_k(s_i)$, i = 1, . . . , M
          3. Fit the value function to $\tilde{V}_k$:

             $$\hat{V}_k = \mathrm{fit}(\tilde{V}_k)$$

          4. k ← k − 1, go to 2


                                                                                    [R. Munos, 2008]


Fitted Value Iteration (2/2)

          Finite action space & continuous state space
          Monte-Carlo approximations
          Probabilistic error bounds on the value functions, depending on
                     the descriptive power of the approximation functions
                     the limited number of samples in the Monte Carlo
                         approximations
      Extensions/variations available for
          Sample usage
          Action-value function: Q(s, a)
          Continuous states and actions





Plan of Work



          1. Control Synthesis: Fitted Value Iteration for SHS with
                 Finite control Space
                 Finite Horizon N
                 Batch samples
                 Kernel Based approximation
          2. Finite Horizon Error Bounds
          3. Infinite Horizon Error Bounds
          4. Extensions: tree-based fitted Q-iteration, continuous action,
             Infinite Horizon







      Other research lines:
          Functional approximation after discretization
          LSPI for N → ∞
          ...








      Thank you for your time
                    Are there any questions?




Discounted Return vs Reach-avoid
                                   FVI: Formulas




Appendix Slides



      Discounted Return vs Reach-avoid



      FVI: Formulas








      Discounted return

      $$V_k(s) = r_k + \gamma\, E_s[V_{k+1}], \qquad \forall s \in S$$

      Reach-avoid

      $$V_k(s) = \mathbf{1}_K(s) + \mathbf{1}_{A \setminus K}(s)\, E_s[V_{k+1}], \qquad \forall s \in S$$








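          1. Estimate the value function at the M states: $\tilde{V}(s_i)$, i = 1, . . . , M
             For a given function $\hat{V}_k$, and using samples, the Monte Carlo
             estimate $\tilde{V}$ of $T(\hat{V}_k)$ can be determined at the M base
             points as follows

             $$\tilde{V}(s_i) = \max_{a \in A} \frac{1}{h} \sum_{j=1}^{h} \left[ r_j^{s_i,a} + \gamma\, \hat{V}_k\!\left(s_{k+1,j}^{s_i,a}\right) \right]$$

             where $r_j^{s_i,a}$ and $s_{k+1,j}^{s_i,a}$ are the j-th sampled reward and
             successor state for the pair $(s_i, a)$.

          2. Fit the value function to $\tilde{V}$

             $$\hat{V}_{k+1} = \arg\min_{f \in F} \sum_{i=1}^{M} \left| f(s_i) - \tilde{V}(s_i) \right|^{p}$$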
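
The two steps of fitted value iteration can be sketched as follows for the discounted-return setting. This is a minimal sketch under stated assumptions rather than the algorithm exactly as analysed in [R. Munos, 2008]: the function class is linear in a user-supplied feature map, the fit uses p = 2 (ordinary least squares), and the toy dynamics, reward, features, and all names are hypothetical.

```python
import numpy as np

def fitted_value_iteration(sample, actions, base_points, features,
                           gamma, n_iters, h, rng):
    """Return weights theta such that V_hat(s) = features(s) @ theta."""
    Phi = features(base_points)                   # (M, d) design matrix
    theta = np.zeros(Phi.shape[1])                # start from V_hat = 0
    for _ in range(n_iters):
        targets = np.empty(len(base_points))
        for i, s in enumerate(base_points):
            backups = []
            for a in actions:
                total = 0.0
                for _j in range(h):               # h sampled transitions per (s_i, a)
                    s_next, r = sample(s, a, rng)
                    v_next = (features(np.array([s_next])) @ theta)[0]
                    total += r + gamma * v_next
                backups.append(total / h)
            targets[i] = max(backups)             # step 1: Monte Carlo estimate at s_i
        # step 2: least-squares fit (p = 2) over the linear function class
        theta, *_ = np.linalg.lstsq(Phi, targets, rcond=None)
    return theta

# Hypothetical toy problem: state in [0, 1], reward equals the next state.
def sample(s, a, rng):
    s_next = float(np.clip(s + a + 0.01 * rng.standard_normal(), 0.0, 1.0))
    return s_next, s_next

def features(s):
    s = np.asarray(s, dtype=float)
    return np.stack([np.ones_like(s), s, s ** 2], axis=1)   # quadratic basis

base_points = np.linspace(0.0, 1.0, 11)
theta = fitted_value_iteration(sample, [-0.1, 0.1], base_points, features,
                               gamma=0.5, n_iters=20, h=5,
                               rng=np.random.default_rng(0))
v_hat = features(np.array([0.0, 1.0])) @ theta
```

In this toy problem moving right is always better, so the fitted value at s = 1 should exceed the fitted value at s = 0. Replacing the one-step reward and discount by the indicator terms of the reach-avoid recursion would be the natural adaptation to SHS, which the slides list as planned work rather than an existing result.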




25 / 25                                                     Verification & Control of SHS

Mais conteúdo relacionado

Último

Último (20)

Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - English
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptx
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptx
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 

Destaque

How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
ThinkNow
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
Kurio // The Social Media Age(ncy)
 

Destaque (20)

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPT
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage Engineerings
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
 

Presentation

  • 1. Introduction Stochastic Hybrid Systems A learning Approach Discussion & Future Work A Learning Approach to Verification and Control of Stochastic Hybrid Systems Literature Colloquium Sofie Haesaert, DCSC Supervisors: dr. ir. A. Abate and prof. dr. R. Babuška May 28, 2012 Honeywell.com 1 / 25 Verification & Control of SHS
  • 2. Outline. Introduction (Air Traffic Safety and Control; Applications). Stochastic Hybrid Systems (Discrete-time Stochastic Hybrid Systems; Stochastic Hybrid Systems: Properties; Verification and Control). A Learning Approach (Current Methods: Dynamic Programming; Related Work). Discussion & Future Work.
  • 3. Air Traffic Safety and Control. Flight safety: avoid other airplanes, bad weather conditions, restricted airspace, ...; reach the end destination. Analysis based on an air traffic model: hybrid state space; stochastic due to wind and turbulence disturbances of the flight path. Safety → probabilistic. [J. Hu, 2003]
  • 4. Other Applications. Systems biology: DNA replication, HIV treatment. Industrial robotics: pick-and-place tasks. ... [A. Singh, 2010]
  • 5. Discrete-Time Stochastic Hybrid Systems. Hybrid state space: $S = \bigcup_{q \in Q} \{q\} \times \mathbb{R}^{n(q)}$. Stochastic transitions: transition kernels in discrete time for discrete transitions $T_q$, reset transitions $T_r$, and continuous transitions $T_x$. Controlled / autonomous: control of transitions, with either a continuous or a finite action space; policy = string of controls. ⇒ Lots of variations in the definition of SHS, e.g. initial states vs. initial subsets. [A. Abate, 2008]
  • 6. Reachability Analysis. Determine whether a given SHS will reach a certain target set $K$ within a time horizon $[0, N]$, starting from a set of initial states $s_0$. $N$ can be either finite or infinite. [Figure: target set $K$ and initial state $s_0$]
  • 7. Reach-Avoid Problem. Determine the probability $r_{s_0}$ that, given an initial state $s_0$, the SHS will reach a certain target set $K$ within a time horizon $[0, N]$ while staying inside the safe set $A$. [Figure: safe set $A$, target set $K$ and initial state $s_0$]
  • 8. Reach-Avoid trajectory. Let $j$ be the first hitting time of the target set $K$, with $j \le N$; the state trajectory stays in the safe set $A$ until $j$. Then $r_{s_0} = E_{s_0}\left[ \sum_{j \in [0,N]} \left( \prod_{i=0}^{j-1} 1_{A \setminus K}(s_i) \right) 1_K(s_j) \right]$, with the indicator function $1_K(s_k) = 1$ if $s_k \in K$ and $0$ otherwise. [S. Summers, 2010]
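The expectation on this slide can be approximated by simulation. The following is a minimal sketch, not taken from the slides: the scalar random walk `drift_walk`, the sets `in_target` and `in_safe`, and all numerical values are hypothetical choices made only for illustration.

```python
import random

def reach_avoid_indicator(traj, in_K, in_A):
    """Evaluate sum_j [prod_{i<j} 1_{A minus K}(s_i)] * 1_K(s_j) along one trajectory."""
    total = 0
    prod = 1  # running product of 1_{A minus K}(s_i) for i < j
    for s in traj:
        total += prod * (1 if in_K(s) else 0)
        prod *= 1 if (in_A(s) and not in_K(s)) else 0
    return total

def estimate_reach_avoid(s0, step, in_K, in_A, N, runs, seed=0):
    """Monte-Carlo estimate of r_{s0}: fraction of simulated runs that reach K while staying in A."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        traj = [s0]
        for _ in range(N):
            traj.append(step(traj[-1], rng))
        hits += reach_avoid_indicator(traj, in_K, in_A)
    return hits / runs

# Hypothetical example system (an assumption, not from the slides):
# a scalar random walk with positive drift and Gaussian noise.
def drift_walk(s, rng):
    return s + 0.1 + rng.gauss(0.0, 0.2)

in_target = lambda s: 0.8 <= s <= 1.0   # target set K
in_safe = lambda s: -1.0 <= s <= 1.0    # safe set A

r_hat = estimate_reach_avoid(0.0, drift_walk, in_target, in_safe, N=10, runs=2000)
```

Because a trajectory can contribute at most once (the product becomes zero as soon as the state enters $K$ or leaves $A$), the indicator is either 0 or 1 and the average is a probability estimate.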
  • 12. Verification and Control. Verification: find the probability associated with a reach-avoid problem, $r_{s_0} = E_{s_0}\left[ \sum_{j \in [0,N]} \left( \prod_{i=0}^{j-1} 1_{A \setminus K}(s_i) \right) 1_K(s_j) \right]$. Control: find a policy $\pi$ that maximizes $r^{\pi}_{s_0}$, $\sup_{\pi} r^{\pi}_{s_0} = \sup_{\pi} E^{\pi}_{s_0}\left[ \sum_{j \in [0,N]} \left( \prod_{i=0}^{j-1} 1_{A \setminus K}(s_i) \right) 1_K(s_j) \right]$. [S. Summers, 2010]
  • 13. Dynamic Programming (1/2). Define the value function $V_k : S \to [0, 1]$ as $V_k(s) = E_{s}\left[ \sum_{j \in [k,N]} \left( \prod_{i=k}^{j-1} 1_{A \setminus K}(s_i) \right) 1_K(s_j) \right]$. Then it follows that $V_0(s_0) = r_{s_0}$.
  • 14. Dynamic Programming (2/2). Verification: starting from $V_N(s) = 1_K(s)$, $\forall s \in S$, iterate backward for $k = N-1, \dots, 0$: $V_k(s) = 1_K(s) + 1_{A \setminus K}(s)\, E_{s}[V_{k+1}]$, $\forall s \in S$. Then $V_0(s_0) = r_{s_0}$. Control: maximization at every iteration step, $V^{*}_k(s) = \sup_{\pi} \left\{ 1_K(s) + 1_{A \setminus K}(s)\, E^{\pi}_{s}[V^{*}_{k+1}] \right\}$, $\forall s \in S$.
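On a finite state space the recursion above reduces to a few lines of code. The sketch below assumes a small, hypothetical four-state Markov chain (state 0 is unsafe and absorbing, state 3 is the target); the transition matrices and the action names `"a"` and `"b"` are illustrative only. With a single action the maximization is trivial and the function performs verification; with several actions it performs the control recursion.

```python
def reach_avoid_dp(kernels, K, A, N):
    """Backward recursion V_k = 1_K + 1_{A minus K} * max_a E[V_{k+1}], with V_N = 1_K.

    kernels: dict mapping each action to a row-stochastic matrix P[s][t].
    K, A: sets of state indices (target set and safe set).
    Returns the list V_0 over all states.
    """
    n = len(next(iter(kernels.values())))
    ind_K = [1.0 if s in K else 0.0 for s in range(n)]
    ind_AmK = [1.0 if (s in A and s not in K) else 0.0 for s in range(n)]
    V = ind_K[:]  # terminal condition V_N = 1_K
    for _ in range(N):  # k = N-1, ..., 0
        V = [
            ind_K[s] + ind_AmK[s] * max(
                sum(P[s][t] * V[t] for t in range(n)) for P in kernels.values()
            )
            for s in range(n)
        ]
    return V

# Hypothetical chain: 0 = unsafe (absorbing), 1 and 2 = safe, 3 = target.
P_a = [
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 1.0],
]
# Second, equally hypothetical action: from state 1 move to state 2 with certainty.
P_b = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 1.0],
]

V_verify = reach_avoid_dp({"a": P_a}, K={3}, A={1, 2, 3}, N=2)             # [0.0, 0.25, 0.5, 1.0]
V_control = reach_avoid_dp({"a": P_a, "b": P_b}, K={3}, A={1, 2, 3}, N=2)  # [0.0, 0.5, 0.5, 1.0]
```

The control result is never smaller than the verification result, since adding an action can only increase the maximum at each step.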
  • 15. Computational Issues. The recursion often cannot be written out analytically → approximation: $V_k \sim \hat{V}_k$. Issues: curse of dimensionality; difference between the exact solution and the approximation. Approximations of the value function and/or policy include: discrete approximation (discretization of the action-state space); functional approximation over the action-state space.
  • 16. Current Methods: Dynamic Programming. Discretization: partitioning of the state space. Finite action space and hybrid state space ⇒ Markov chain (= finite action-state process); $\hat{V}_k$ = tabular form. Figure: Discretization of Hybrid State Space ($S$). [A. Abate, 2007]
  • 17. Discretization. (+) Error bounds. (−) Partitioning method. (−) Curse of dimensionality: bad scaling towards higher dimensions. Goal: less conservative error bounds; optimal partitioning; action partitioning; functional approximation.
  • 18. Current Methods: Dynamic Programming. Functional approximation: SHS with finite action space. $\hat{V}_k(s, \theta_k) = \sum_{i=1}^{h} \theta^{q}_{i,k}\, \phi_i(x)$, $s = (q, x) \in S$. Parameter vector $\theta_k = (\theta^{q_1}_k, \dots, \theta^{q_m}_k)$, where for each discrete mode $q$: $\theta^{q}_k = (\theta^{q}_{1,k}, \dots, \theta^{q}_{h,k})$. [A. Abate, 2008]
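A linear-in-parameters approximator of this form can be evaluated as follows. This is only a sketch under stated assumptions: the slide does not specify the basis functions $\phi_i$, so Gaussian radial basis functions on a scalar continuous state are assumed here, and the mode names `"q1"`, `"q2"` and the parameter values are hypothetical.

```python
import math

def gaussian_basis(centers, width):
    """Return basis functions phi_i(x) = exp(-(x - c_i)^2 / (2 * width^2)) (assumed form)."""
    return [
        lambda x, c=c: math.exp(-((x - c) ** 2) / (2.0 * width ** 2))
        for c in centers
    ]

def v_hat(s, theta, basis):
    """Evaluate V_hat(s, theta) = sum_i theta[q][i] * phi_i(x) for a hybrid state s = (q, x)."""
    q, x = s
    return sum(th * phi(x) for th, phi in zip(theta[q], basis))

# One parameter vector per discrete mode (hypothetical values).
basis = gaussian_basis(centers=[0.0, 1.0], width=1.0)
theta = {"q1": [1.0, 0.0], "q2": [0.0, 2.0]}

value_q1 = v_hat(("q1", 0.0), theta, basis)  # only the first basis function contributes
value_q2 = v_hat(("q2", 1.0), theta, basis)  # only the second basis function contributes
```

The number of parameters grows with the number of modes and basis functions rather than with the number of partition cells, which is the scaling advantage the slides attribute to functional approximation.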
  • 19. Functional Approximation. (−) Only applied to safety problems. (−) Only applied to finite action spaces. (−) No error bounds (yet). (+) Curse of dimensionality: better scaling qualities.
  • 20. A Learning Approach: Related Work on Discounted Return Problems. Control objective: $\max_{\pi} J^{\pi}(x) = \max_{\pi} E^{\pi}_{s_0}\left[ \sum_{k=0}^{N} \gamma^{k} r_k \right]$, with $r_k$ the reward at $k$ and $\gamma \in [0, 1)$ the discount factor. Model free: samples $(s_k, a, s_{k+1}, r_k)$. Approximation methods for continuous state spaces. Most methods are for $N \to \infty$, e.g. (approximate) Q-learning, LSPI, actor-critic, ...
  • 21. Fitted Value Iteration (1/2). 1. Collect samples $(s_k, a, s_{k+1}, r_k)$ at $M$ states $s_i$, $i = 1, \dots, M$. 2. Estimate the value function at the $M$ states: $\tilde{V}_k(s_i)$, $i = 1, \dots, M$. 3. Fit the value function to $\tilde{V}_k$: $\hat{V}_k = \mathrm{fit}(\tilde{V}_k)$. 4. $k \leftarrow k - 1$, go to 2. [R. Munos, 2008]
  • 22. Fitted Value Iteration (2/2). Finite action space and continuous state space. Monte-Carlo approximations. Probabilistic error bounds on the value functions, depending on the descriptive power of the approximation functions and on the limited number of samples in the Monte-Carlo approximations. Extensions/variations available for: sample usage; action-value function $Q(s, a)$; continuous states and actions.
  • 23. Plan of Work. 1. Control synthesis: fitted value iteration for SHS with finite control space, finite horizon $N$, batch samples, kernel-based approximation. 2. Finite-horizon error bounds. 3. Infinite-horizon error bounds. 4. Extensions: tree-based fitted Q-iteration, continuous action, infinite horizon.
  • 24. Other research lines: functional approximation after discretization; LSPI for $N \to \infty$; ...
  • 25. Thank you for your time. Are there any questions?
  • 26. Appendix Slides. Discounted Return vs Reach-avoid; FVI: Formulas.
  • 27. Discounted Return vs Reach-avoid. Discounted return: $V_k(s) = r_k + \gamma E_{s}[V_{k+1}]$, $\forall s \in S$. Reach-avoid: $V_k(s) = 1_K(s) + 1_{A \setminus K}(s)\, E_{s}[V_{k+1}]$, $\forall s \in S$.
  • 28. FVI: Formulas. 1. Estimate the value function at $M$ states, $\tilde{V}_k(s_i)$, $i = 1, \dots, M$: for a given function $\hat{V}_k$, and using samples, the Monte-Carlo estimate $\tilde{V}$ of $T(\hat{V}_k)$ can be determined at the $M$ base points as $\tilde{V}(s_i) = \max_{a \in A} \frac{1}{h} \sum_{j=1}^{h} \left[ r_j^{s_i,a} + \gamma \hat{V}_k\left(s_{k+1}^{s_i,a}\right) \right]$, where the sum runs over the $h$ samples drawn from $(s_i, a)$. 2. Fit the value function to $\tilde{V}$: $\hat{V}_{k+1} = \arg\min_{f \in F} \sum_{i=1}^{M} \left| f(s_i) - \tilde{V}(s_i) \right|^{p}$.
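The two steps (Monte-Carlo estimate at base points, then a fit) can be combined with the reach-avoid backup from the appendix. The sketch below is an illustration under explicit assumptions, not the method proposed in the slides: it swaps the discounted-return backup for the reach-avoid backup, runs backward over a finite horizon, uses $p = 2$ with piecewise-constant functions as the function class $F$ (so the least-squares fit is a bin average), and the scalar system `noisy_move` with its sets and parameters is hypothetical.

```python
import random

def fit_piecewise_constant(points, targets, lo, hi, bins):
    """Least-squares fit (p = 2) over functions that are constant on equal-width bins."""
    width = (hi - lo) / bins
    def index(x):
        return min(bins - 1, max(0, int((x - lo) / width)))
    sums, counts = [0.0] * bins, [0] * bins
    for x, y in zip(points, targets):
        sums[index(x)] += y
        counts[index(x)] += 1
    means = [sums[b] / counts[b] if counts[b] else 0.0 for b in range(bins)]
    return lambda x: means[index(x)]

def make_value(fit, in_K, in_A):
    """Value function 1 on K, 0 outside A, and the fitted function on A minus K."""
    def value(s):
        if in_K(s):
            return 1.0
        if not in_A(s):
            return 0.0
        return fit(s)
    return value

def fitted_value_iteration(base_points, actions, sample_next, in_K, in_A,
                           N, h, lo, hi, bins, seed=0):
    """Finite-horizon FVI for reach-avoid; base_points are assumed to lie in A minus K."""
    rng = random.Random(seed)
    value = make_value(lambda s: 0.0, in_K, in_A)  # V_N = 1_K
    for _ in range(N):  # k = N-1, ..., 0
        # Step 1: Monte-Carlo estimate of the backup at each base point.
        targets = [
            max(sum(value(sample_next(s, a, rng)) for _ in range(h)) / h
                for a in actions)
            for s in base_points
        ]
        # Step 2: fit a function to the estimates.
        fit = fit_piecewise_constant(base_points, targets, lo, hi, bins)
        value = make_value(fit, in_K, in_A)
    return value

# Hypothetical system: the action shifts the state, plus bounded uniform noise.
def noisy_move(s, a, rng):
    return s + a + rng.uniform(-0.05, 0.05)

in_target = lambda s: 0.8 <= s <= 1.0   # target set K
in_safe = lambda s: 0.0 <= s <= 1.0     # safe set A
grid = [0.05 + 0.1 * i for i in range(8)]  # base points in A minus K

V0 = fitted_value_iteration(grid, [-0.1, 0.1], noisy_move, in_target, in_safe,
                            N=5, h=20, lo=0.0, hi=1.0, bins=10)
```

Handling $K$ and the complement of $A$ outside the fit keeps the approximation from assigning a nonzero value to unsafe states, which a clamped or extrapolating function class could otherwise do.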