[REPEAT] Get hands on with AWS DeepRacer & compete in the AWS DeepRacer League - AIM203-R - Santa Clara AWS Summit
1. SUMMIT
SANTA CLARA
2. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Get hands on with AWS DeepRacer & compete in
the AWS DeepRacer League
DeClercq Wentzel
Senior Product Manager
Amazon Web Services
AIM203
3. Agenda
• AWS DeepRacer origin
• RL for the Sunday driver
• Virtual simulator
• Under the hood
• Rubber meets the road
5. How can we put reinforcement learning in the hands of all developers? Literally.
6. AWS DeepRacer: An exciting way for developers to get hands-on experience with reinforcement learning
• 1/18 scale autonomous race car
• Virtual simulator, to train and evaluate
• Global racing league
8. Reinforcement learning in the broader AI context
• Reinforcement learning
• Supervised learning
• Unsupervised learning
9. Machine learning overview
SUPERVISED UNSUPERVISED REINFORCEMENT
10. Reinforcement learning in the real world
• Reward positive behavior
• Don’t reward negative behavior
• The result!
11. Reinforcement learning terms
AGENT ENVIRONMENT STATE
ACTION
EPISODE REWARD
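The terms above fit together in a simple interaction loop. The following is a minimal sketch, not AWS DeepRacer code: the `LineWorld` environment, its reward values, and both policies are hypothetical, chosen only to show where agent, environment, state, action, reward, and episode appear.

```python
import random


class LineWorld:
    """Hypothetical ENVIRONMENT: a 1-D track with cells 0..goal."""

    def __init__(self, goal=5):
        self.goal = goal
        self.position = 0

    def reset(self):
        """Start a new episode and return the initial STATE."""
        self.position = 0
        return self.position

    def step(self, action):
        """Apply an ACTION (-1 or +1); return (next_state, reward, done)."""
        self.position = max(0, min(self.goal, self.position + action))
        done = self.position == self.goal
        # REWARD: small penalty per step, bonus for reaching the goal.
        reward = 1.0 if done else -0.1
        return self.position, reward, done


def run_episode(env, policy, max_steps=100):
    """One EPISODE: the AGENT (the policy) acts until the goal or step limit."""
    state = env.reset()
    total_reward = 0.0
    steps = 0
    while steps < max_steps:
        action = policy(state)
        state, reward, done = env.step(action)
        total_reward += reward
        steps += 1
        if done:
            break
    return total_reward, steps


def random_policy(state):
    """An untrained agent: picks a direction at random."""
    return random.choice([-1, 1])


def always_forward(state):
    """A hand-written agent that always moves toward the goal."""
    return 1


total, steps = run_episode(LineWorld(goal=5), always_forward)
print(f"always_forward: reward={total:.1f} in {steps} steps")
```

Swapping `always_forward` for `random_policy` shows the same loop with an agent that has not learned anything yet; training amounts to improving the policy from the logged rewards.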
12. The reward function
The reward function incentivizes particular behaviors and is at the core of reinforcement learning.
13. The reward function in a race grid
S G = 2
AGENT GOAL
14. Incentivizing centerline behavior
Reward per cell:
0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
S 2 2 2 2 2 2 G = 2
0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
Discounted value per cell (discount per step: 0.9):
8.6 9.5 8.5 7.5 6.3 5.0 3.5 1.9
S 10.4 9.4 8.2 6.9 5.4 3.8 G = 2
8.6 9.5 8.5 7.5 6.3 5.0 3.5 1.9
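The numbers in the second grid appear consistent with a simple rule: each cell's value is its own reward plus 0.9 times the value of the best neighboring cell. That reading is inferred from the numbers, not stated on the slide. A minimal sketch under that assumption (all function names are hypothetical):

```python
GAMMA = 0.9              # "Discount per step" from the slide
GOAL_REWARD = 2.0        # G = 2
CENTER_REWARD = 2.0      # reward on each centerline cell
OFF_CENTER_REWARD = 0.1  # reward on each off-center cell
N_CENTER_CELLS = 6       # centerline cells between S and G


def centerline_values(n_cells=N_CENTER_CELLS, gamma=GAMMA):
    """Values of the centerline cells, ordered from the S side to the G side."""
    values = []
    value_ahead = GOAL_REWARD
    # Work backward from the goal: V(cell) = reward + gamma * V(next cell).
    for _ in range(n_cells):
        value_ahead = CENTER_REWARD + gamma * value_ahead
        values.append(value_ahead)
    values.reverse()
    return values


def off_center_values(center, gamma=GAMMA):
    """Values of the eight cells in one off-center row, left to right."""
    # Assumption: a cell next to a centerline cell (or the goal) steps onto it.
    row = [OFF_CENTER_REWARD + gamma * v for v in center + [GOAL_REWARD]]
    # Assumption: the leftmost cell (next to S) first moves along its own row.
    first = OFF_CENTER_REWARD + gamma * row[0]
    return [first] + row


center = centerline_values()
outer = off_center_values(center)
print([round(v, 1) for v in center])  # [10.4, 9.4, 8.2, 6.9, 5.4, 3.8]
print([round(v, 1) for v in outer])   # [8.6, 9.5, 8.5, 7.5, 6.3, 5.0, 3.5, 1.9]
```

Because every off-center cell is worth less than the centerline cell beside it, an agent that maximizes discounted reward is pulled toward the centerline.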
15. AWS DeepRacer problem formulation
STATE
16. How does learning happen?
VALUE FUNCTION
POLICY FUNCTION
17. RL algorithms: Vanilla policy gradient
• Data is only used once
• High variance of rewards
• Magnitude of update could be too large
[Figure: landscape of the objective J(θ), with two weight updates labeled "New weights" at 0.4 ± 𝛿 and 0.3 ± 𝛿]
* Image source: landscape image is CC0 1.0 public domain
18. RL algorithms: Proximal policy optimization (PPO)
(State, action, reward, next state)
(s_t, a_t, r_t, s_{t+1})
Advantage
Improved model
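The slide does not show a formula. As an illustration, the clipped surrogate objective from the original PPO paper can be sketched as below. It limits how far a single update can move the policy, which addresses the "magnitude of update could be too large" problem of vanilla policy gradient. Whether AWS DeepRacer uses exactly this variant is an assumption, and `epsilon=0.2` is a commonly used default rather than a value from the talk.

```python
def clip(x, low, high):
    """Clamp x to the interval [low, high]."""
    return max(low, min(high, x))


def ppo_clipped_objective(ratios, advantages, epsilon=0.2):
    """Mean clipped surrogate objective over a batch of samples.

    ratios[i]     = pi_new(a_i | s_i) / pi_old(a_i | s_i)
    advantages[i] = estimated advantage of action a_i in state s_i
    """
    if len(ratios) != len(advantages) or not ratios:
        raise ValueError("ratios and advantages must be non-empty and equal in length")
    terms = []
    for ratio, advantage in zip(ratios, advantages):
        unclipped = ratio * advantage
        clipped = clip(ratio, 1.0 - epsilon, 1.0 + epsilon) * advantage
        # Taking the minimum removes any incentive to push the ratio
        # outside [1 - epsilon, 1 + epsilon].
        terms.append(min(unclipped, clipped))
    return sum(terms) / len(terms)


# A good action (positive advantage) whose probability rose by 50%:
# the objective is capped at 1.2 * advantage instead of 1.5 * advantage.
print(ppo_clipped_objective([1.5], [1.0]))
```

In training, this quantity is maximized with respect to the new policy's weights, and the same batch of (s_t, a_t, r_t, s_{t+1}) experience can be reused for several gradient steps.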
19. RL vs. other approaches for robotic racing
METHOD Supervised learning
HOW IT WORKS An expert driver controls a real-world car that has a camera. Save the images from the camera as inputs and the corresponding driving actions (speed and steering angle) as outputs. Train a model.
RESULT Provide the state (image) to the model and receive a driving action
METHOD Reinforcement learning
HOW IT WORKS A virtual agent repeatedly interacts with a simulated environment and logs experience (image, action, new state, reward). The experience is used to train a model, and the new model is used to gather more experience.
RESULT Provide the state (image) to the model and receive a driving action
21. AWS DeepRacer simulator architecture
[Architecture diagram: AWS DeepRacer running in the AWS Cloud inside a VPC with a NAT gateway; outputs shown are models, simulation video, and metrics]
22. AWS DeepRacer console diagram
23. Programming your own reward function
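As an illustration, here is a minimal sketch of a reward function that favors staying near the centerline. It assumes the `params` dictionary interface documented for later versions of the AWS DeepRacer console, with the keys `all_wheels_on_track`, `track_width`, and `distance_from_center`; the interface available at the time of this talk may have differed. The band widths and reward values are illustrative choices.

```python
def reward_function(params):
    """Reward the car for staying close to the track centerline."""
    # Assumed input keys (see the lead-in): all_wheels_on_track,
    # track_width, distance_from_center.
    if not params["all_wheels_on_track"]:
        return 1e-3  # near-zero reward when the car leaves the track

    track_width = params["track_width"]
    distance_from_center = params["distance_from_center"]

    # Three bands around the centerline, as fractions of the track width.
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    if distance_from_center <= marker_1:
        reward = 1.0
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.1
    else:
        reward = 1e-3  # likely off track

    return float(reward)


example = {"all_wheels_on_track": True, "track_width": 1.0, "distance_from_center": 0.2}
print(reward_function(example))  # 0.5
```

The banded shape mirrors the race-grid example earlier in the deck: the highest reward sits on the centerline and drops off toward the edges.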
24. Track components
TRACK CENTER
TRACK WALL
TRACK SURFACE aka ON-TRACK
FIELD aka OFF-TRACK
TRACK BOUNDARIES
25. Coordinate system and track waypoints
OUTER BOUNDARY WAYPOINTS
TRACK CENTER WAYPOINTS
INNER BOUNDARY WAYPOINTS
X
Y
TRACK WIDTH
CAR DIRECTION
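Waypoints and the car direction can be combined to check whether the car is pointing along the track. The sketch below assumes that waypoints are (x, y) pairs and that the car direction is a heading in degrees measured counterclockwise from the X axis; both conventions and all function names are assumptions for illustration.

```python
import math


def track_direction_deg(prev_point, next_point):
    """Direction of the track segment from prev_point to next_point, in degrees."""
    dx = next_point[0] - prev_point[0]
    dy = next_point[1] - prev_point[1]
    return math.degrees(math.atan2(dy, dx))


def heading_error_deg(heading_deg, prev_point, next_point):
    """Absolute angle between the car heading and the track direction (0 to 180)."""
    diff = abs(track_direction_deg(prev_point, next_point) - heading_deg)
    # Angles wrap around, so 350 degrees apart is really 10 degrees apart.
    if diff > 180.0:
        diff = 360.0 - diff
    return diff


# Track segment runs along +X; the car points 30 degrees to the left of it.
print(heading_error_deg(30.0, (0.0, 0.0), (1.0, 0.0)))
```

A reward function could scale its output down when this error exceeds a chosen threshold, which discourages zig-zagging.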
26. Hyperparameters control the training algorithm
27. Action space
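The slide text gives no detail beyond the title. One common way to define a discrete action space for a car is a grid of steering angles and speeds; the sketch below builds such a grid. The function name, parameter names, and the example values are hypothetical and are not taken from the talk.

```python
def build_action_space(max_steering_deg, steering_levels, max_speed, speed_levels):
    """Return every (steering angle, speed) combination as a list of dicts."""
    if steering_levels < 2 or speed_levels < 1:
        raise ValueError("need at least 2 steering levels and 1 speed level")

    # Steering angles are spread evenly from -max to +max (degrees).
    step = 2.0 * max_steering_deg / (steering_levels - 1)
    angles = [-max_steering_deg + i * step for i in range(steering_levels)]

    # Speeds are spread evenly from max/levels up to max.
    speeds = [max_speed * (j + 1) / speed_levels for j in range(speed_levels)]

    return [
        {"steering_angle": angle, "speed": speed}
        for angle in angles
        for speed in speeds
    ]


actions = build_action_space(max_steering_deg=30, steering_levels=5,
                             max_speed=3.0, speed_levels=3)
print(len(actions))  # 15
```

A larger grid gives the model finer control but more actions to evaluate, so training generally takes longer to converge.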
28. Lab 1 – AWS DeepRacer service
OBJECTIVE Build your first AWS DeepRacer RL model
TIME 50 min.
1. Find the lab content here:
https://github.com/aws-samples/aws-deepracer-workshops/
2. Navigate to: Workshops/2019-AWSSummits-AWSDeepRacerService/Lab1
30. AWS DeepRacer car specifications
CAR 18th scale 4WD with monster truck chassis
CPU Intel Atom processor
MEMORY 4 GB RAM
STORAGE 32 GB (expandable)
WI-FI 802.11ac
CAMERA 4 MP camera with MJPEG
DRIVE BATTERY 1000 mAh lithium polymer
COMPUTE BATTERY 13600 mAh USB-C
SENSORS Integrated accelerometer and gyroscope
PORTS 4x USB-A, 1x USB-C, 1x Micro-USB, 1x HDMI
SOFTWARE Ubuntu OS 16.04.3 LTS, Intel OpenVINO toolkit, ROS Kinetic
31. AWS DeepRacer software architecture
[Architecture diagram. Legend: ROS msg node, stored file, ROS nodes. Components: web server, web server publisher, model, model optimizer, optimized model, media engine, camera, video (MJPEG), inference engine, video inference results, navigation node, control node, autonomous drive, manual drive, servo & motor]
32. Simulation-to-real domain transfer
SIM-to-REAL CHALLENGE
Train the model using simulated images, but race the car using the images the car experiences in the real world
STRATEGIES
• Environment control
• Domain randomization
• Modularity and abstraction
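Domain randomization, one of the strategies listed above, means varying the simulated conditions between training episodes so the model does not overfit to one exact rendering of the track. A minimal sketch follows; the parameter names and ranges are hypothetical and do not describe the actual AWS DeepRacer simulator.

```python
import random


def sample_sim_conditions(rng):
    """Draw one random set of (hypothetical) simulator conditions."""
    return {
        "light_intensity": rng.uniform(0.5, 1.5),    # relative brightness
        "track_friction": rng.uniform(0.7, 1.0),     # relative surface grip
        "camera_noise_std": rng.uniform(0.0, 0.05),  # pixel noise level
    }


def randomized_episodes(n_episodes, seed=0):
    """Return one set of conditions per training episode."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    return [sample_sim_conditions(rng) for _ in range(n_episodes)]


for conditions in randomized_episodes(3):
    print(conditions)
```

Each training episode would then run under its own sampled conditions, so the real world looks to the model like one more variation it has already seen.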
34. Race for prizes and glory in the AWS DeepRacer League
Train your AWS DeepRacer model and compete:
• Online in the Virtual Circuit
• In person in the Summit Circuit (visit the Expo Hall)
www.deepracerleague.com
35. Thank you!