Reinforcement Learning with SageMaker, DeepRacer and RoboMaker
- 1. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Racing with Artificial Intelligence
Alex Coqueiro
Head of Public Sector Solutions Architecture for Canada, Latin America and Caribbean
AWS
@alexbcbr
- 2. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Rubik’s cube challenge
- 3. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
43,252,003,274,489,856,000
43 QUINTILLION
UNIQUE COMBINATIONS
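The figure above can be reproduced with the standard counting argument for a 3×3×3 cube, given here as background rather than taken from the slides: 8 corners can be permuted and twisted, 12 edges can be permuted and flipped, and the last twist, the last flip, and the overall permutation parity are each forced. A minimal Python sketch of that arithmetic (the function name is illustrative):

```python
from math import factorial

def rubiks_cube_states():
    """Count the reachable positions of a standard 3x3x3 Rubik's cube."""
    # 8 corners: any permutation, 3 orientations each, last twist is forced.
    corner_arrangements = factorial(8) * 3**7
    # 12 edges: any permutation, 2 orientations each, last flip is forced.
    edge_arrangements = factorial(12) * 2**11
    # Corner and edge permutations must share the same parity, halving the total.
    return corner_arrangements * edge_arrangements // 2

print(f"{rubiks_cube_states():,}")
```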
- 4. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Don’t code the patterns, let the system learn through data
- 5. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
F2 U' R' L F2 R L' U'
Model
Data
- 6. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
F2 U' R' L F2 R L' U'
Confidence
1%
accuracy
R U r U R U2 r U
2%
accuracy
Training Models
Model
- 7. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Confidence: 2%, 20%, 40%, 60%, 80%, 95% accuracy
R U r U R U2 r U
Training Models
Model
- 8. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Confidence
95%
accuracy
?
F2 R F R′ B′ D F D′ B D F
Inference
Model
- 9. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
- 10. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
SOLVED IN 0.9 SECONDS
- 11. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Show me how to do it
- 12. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Use Case – Autonomous Driving
- 13. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Our problem re-formulation
- 14. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Different problems require different learning strategies
Labeled training data
Complexity of decisions
Supervised learning
Non-labeled training data
- 15. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Autonomous Driving Development
- 16. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Robocar (Donkey Car Project)
https://github.com/sunilmallya/donkey/tree/master/sagetrain
http://awsrobocar.s3-website-us-east-1.amazonaws.com/
https://github.com/tescal2/donkey
- 17. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
SSD MultiBox: Real-Time Object Detection + Behavioral Cloning
- 18. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon SageMaker: Build, Train, and Deploy ML Models at Scale
- 19. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Different problems require different learning strategies
Labeled training data
Complexity of decisions
Supervised learning
Unsupervised learning
Reinforcement learning
Non-labeled training data
- 20. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Reinforcement learning in the real world
Reward positive behavior
Don’t reward negative behavior
The result!
- 21. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
AWS DeepRacer
A fully autonomous 1/18th-scale race car designed to help you learn about reinforcement learning through autonomous driving
• Build machine learning models in Amazon SageMaker
• Train, test, and iterate on the track using the AWS DeepRacer 3D racing simulator
• Compete in the world’s first global autonomous racing league, to race for prizes and a chance to advance to win the coveted AWS DeepRacer Cup
- 22. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
DeepRacer: An exciting way for developers to get hands-on experience with Reinforcement Learning
Robotic autonomous race car
Virtual simulator, to train and experiment
Racing League
- 23. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Track components
TRACK CENTER
TRACK WALL
TRACK SURFACE aka ON-TRACK
FIELD aka OFF-TRACK
TRACK BOUNDARIES
- 24. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Action space
- 25. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
The reward function in a race grid
S G = 2
AGENT
GOAL
- 26. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Incentivizing centerline behavior
REWARD FUNCTION
0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
S 2 2 2 2 2 2 G = 2
0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
MAX VALUE OF EACH STATE AFTER LOTS OF EXPLORING (discount per step: 0.9)
8.6 9.5 8.5 7.5 6.3 5.0 3.5 1.9
S 10.4 9.4 8.2 6.9 5.4 3.8 G = 2
8.6 9.5 8.5 7.5 6.3 5.0 3.5 1.9
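The state values in the grid above are consistent with a simple backup: each cell's value is its own reward plus 0.9 times the value of the best cell it can move to next. The sketch below recomputes the centerline row this way, working backwards from G. For the off-center rows it assumes that each cell steps onto the centerline cell in its own column; that assumption matches the slide's numbers but is not stated in the slides, and the leftmost off-center value (8.6) is left out because it depends on how the start cell S is treated.

```python
GAMMA = 0.9              # "Discount per step" from the slide
CENTER_REWARD = 2.0      # reward for centerline cells and for the goal G
OFF_CENTER_REWARD = 0.1  # reward for cells beside the centerline

def centerline_values(n_cells):
    """Values of the centerline cells, ordered from the start towards the goal."""
    values = []
    next_value = CENTER_REWARD  # value of the goal cell G
    for _ in range(n_cells):
        # Backup: own reward plus the discounted value of the next cell.
        next_value = CENTER_REWARD + GAMMA * next_value
        values.append(next_value)
    return values[::-1]

center = centerline_values(6)
# Assumption: an off-center cell moves onto the centerline cell (or G) in its column.
off_center = [OFF_CENTER_REWARD + GAMMA * v for v in center + [CENTER_REWARD]]

print([round(v, 1) for v in center])      # [10.4, 9.4, 8.2, 6.9, 5.4, 3.8]
print([round(v, 1) for v in off_center])  # [9.5, 8.5, 7.5, 6.3, 5.0, 3.5, 1.9]
```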
- 27. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Programming your own reward function
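A minimal sketch of such a function, modeled on the "follow the center line" example in the AWS DeepRacer documentation. The `params` keys `track_width` and `distance_from_center` are documented DeepRacer inputs; the band widths and reward values below are illustrative choices, not values from these slides.

```python
def reward_function(params):
    """Reward the car more the closer it stays to the track center."""
    track_width = params["track_width"]
    distance_from_center = params["distance_from_center"]

    # Three bands around the centerline, as fractions of the track width.
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    if distance_from_center <= marker_1:
        reward = 1.0
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.1
    else:
        reward = 1e-3  # likely off track: near-zero reward

    return float(reward)

# Local check with a hand-built params dictionary.
print(reward_function({"track_width": 1.0, "distance_from_center": 0.05}))
```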
- 28. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Let’s go deeper…
- 29. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
AWS DeepRacer Neural Network Architecture
An overview of the network architecture that AWS DeepRacer uses:
Output
- 30. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon SageMaker RL
- 31. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Reinforcement Learning Algorithms Compared
Value Approximation
Advantages: More stable performance when it works, and tends to converge on the global optimum
Disadvantages: Difficult to converge if there are too many (state, action) combinations, slower convergence in general, and can’t learn stochastic policies
Examples: Q-Learning, Deep Q Network, Double Deep Q Network
Policy Approximation
Advantages: Effective in continuous action spaces, can learn stochastic policies, and faster convergence
Disadvantages: Typically converges to a local rather than global optimum, high variance in estimating the gradient adversely affects stability, and evaluating a policy is generally inefficient
Examples: Policy Gradient, Proximal Policy Optimization (PPO)
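To make the value-approximation side concrete, here is a minimal sketch of one tabular Q-learning update, the simplest of the examples named above. The function name, the state and action labels, and the step size `alpha` are illustrative assumptions; this is not the algorithm configuration DeepRacer itself uses.

```python
from collections import defaultdict

def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """One tabular Q-learning step.

    Moves Q(state, action) towards reward + gamma * max over a' of Q(next_state, a').
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]

# Hypothetical two-action problem; unseen (state, action) pairs default to 0.0.
q_table = defaultdict(float)
actions = ["left", "right"]
first = q_learning_update(q_table, "s0", "right", 2.0, "s1", actions)
second = q_learning_update(q_table, "s0", "right", 2.0, "s1", actions)
print(first, second)
```

The table needs one entry per (state, action) pair, which is why this family struggles when that number of combinations grows large.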
- 32. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Hyper parameters control the training algorithm
- 33. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Coordinate system and track waypoints
OUTER BOUNDARY WAYPOINTS
TRACK CENTER WAYPOINTS
INNER BOUNDARY WAYPOINTS
X
Y
TRACK WIDTH
CAR DIRECTION
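Waypoints and car direction can be combined inside a reward function to check whether the car points along the track. The sketch below uses the documented DeepRacer inputs `waypoints`, `closest_waypoints`, and `heading` (in degrees); the helper name `direction_difference` and the sample track are illustrative assumptions.

```python
import math

def direction_difference(params):
    """Angle in degrees between the car's heading and the local track direction."""
    waypoints = params["waypoints"]
    prev_idx, next_idx = params["closest_waypoints"]
    prev_point = waypoints[prev_idx]
    next_point = waypoints[next_idx]

    # Track direction: angle of the segment joining the two closest waypoints.
    track_direction = math.degrees(
        math.atan2(next_point[1] - prev_point[1], next_point[0] - prev_point[0])
    )

    # Fold the difference into the range [0, 180].
    diff = abs(track_direction - params["heading"])
    if diff > 180:
        diff = 360 - diff
    return diff

# Hypothetical track segment along the x axis, car heading 30 degrees off it.
sample = {"waypoints": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)],
          "closest_waypoints": [0, 1],
          "heading": 30.0}
print(direction_difference(sample))
```

A reward function could then scale its reward down when this difference exceeds some threshold.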
- 34. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
AWS DeepRacer Car Specifications
CAR 1/18th scale 4WD with monster truck chassis
CPU Intel Atom™ Processor
MEMORY 4GB RAM
STORAGE 32GB (expandable)
WI-FI 802.11ac
CAMERA 4 MP camera with MJPEG
DRIVE BATTERY 7.4V/1100mAh lithium polymer
COMPUTE BATTERY 13600mAh USB-C PD
SENSORS Integrated accelerometer and gyroscope
PORTS 4x USB-A, 1x USB-C, 1x Micro-USB, 1x HDMI
SOFTWARE Ubuntu OS 16.04.3 LTS, Intel® OpenVINO™ toolkit, ROS Kinetic
- 35. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Robotic Arms
International Space Station
Drones
Education
Water
Home
Self-Driving Vehicles
Autonomous Walker
Rover
Robot landscape
- 36. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Robotics trends in 2018
Robotics is undergoing fundamental change in collaboration, autonomous mobility, and increasing intelligence
Robots are being put to work every day across many industries
• Logistics
• Construction
• Retail
• Hospitality
• Healthcare
• Agriculture
• Energy Management
• Oil and Gas
• Facilities Management
• Household chores
By 2023, it’s estimated that mobile autonomous robots will emerge as the standard for logistics and fulfillment processes
By 2030, 70% of all mobile material handling equipment will be autonomous
Source: IDTechEx
- 37. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Robotic development cycle
1) Select robotics software framework
2) Develop robotics application
3) Test and simulate application
4) Deploy and manage application
New application release and update
- 38. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Robot Operating System (ROS) primer
Most widely used software framework for teaching and learning about robotics: over 16 million .deb (Linux Debian) packages downloaded in 2018, a 400% increase since 2014
Founded in Stanford labs over 10 years ago, now managed by the Open Source Robotics Foundation (OSRF)
Global open-source community supports two products: Robot Operating System (ROS) and Gazebo
ROS
A set of software libraries and tools, from drivers to algorithms, that help developers build robot applications
Gazebo
Robust physics engine, high-quality graphics, and programmatic and graphical interfaces to help developers simulate robots
- 39. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Introducing AWS RoboMaker
A service that makes it easy for developers to develop, test, and deploy robotics applications, as well as build intelligent robotics functions using cloud services
- 40. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
AWS RoboMaker service suite
Development Environment
Simulation
Cloud Extensions for ROS
Fleet Management
- 41. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
AWS RoboMaker
Sample Robot Applications
Hello World
Navigation and Person Recognition
Voice Commands
Robot Monitoring
Object-following using RL
Self-driving using RL
- 43. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
AWS DeepRacer Software Architecture
[Diagram of ROS nodes, ROS message nodes, and stored files on the car. Components shown: Camera, Media engine, M-JPEG video, Web Server, Web Server Publisher, Model, Model Optimizer, Optimized Model, Inference engine, Inference Results, Autonomous Drive, Manual Drive, Control Node, Navigation Node, Servo & Motor.]
- 44. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
DATA
- 45. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
We are not spectators,
but actors of the future
Herb Simon, 2000
- 46. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
ml.aws
@alexbcbr