
Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019 Technical Sessions

Review state-of-the-art techniques that use neural networks to synthesize motion, such as mode-adaptive neural networks and phase-functioned neural networks. See how next-generation CPUs can deliver better performance for reinforcement learning.



  1. SIGGRAPH 2019 | LOS ANGELES | 28 JULY - 1 AUGUST
  2. Bringing Intelligent Motion Using Reinforcement Learning on Intel Client (Manuj Sabharwal, Yaz Khabiri)
  3. Agenda
     • Overview of Reinforcement Learning (RL)
     • Reinforcement Learning in Gaming
     • Training RL Algorithms
     • Intelligent Motion Use Case
     • Performance Optimization on Intel® CPU
     • Inference of RL Algorithms
     • Understanding Motion Models
     • Using DirectML* to Leverage Intel GPUs
     • Summary
  4. Overview of Machine Learning
     [Diagram: machine learning splits into three branches]
     • Supervised: data + labels → class (task-driven)
     • Unsupervised: data → clusters
     • Reinforcement: state → action (learns from mistakes)
  5. Successes of Reinforcement Learning
  6. High-Level Reinforcement Learning Overview
     • The agent gets state (s) from the environment
     • The agent takes action (a) using policy (π)
     • The agent receives reward (r)
     • Goal: maximize the expected future return (R)
     https://unity3d.com/machine-learning
     (A minimal loop sketch follows below.)
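To make the loop concrete, here is a minimal agent-environment interaction sketch using OpenAI Gym (listed in the references); the random action stands in for the policy π, and the environment choice is an assumption for brevity:

    import gym

    # The agent observes state s, takes action a, and receives reward r.
    env = gym.make("CartPole-v0")  # assumed toy environment
    state = env.reset()
    episode_return = 0.0

    for _ in range(200):
        action = env.action_space.sample()  # placeholder for pi(a|s)
        state, reward, done, info = env.step(action)
        episode_return += reward            # goal: maximize the return R
        if done:
            break
    print("episode return:", episode_return)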
  7. Examples of RL Algorithms
     • Actor-critic algorithms: reduce the variance of the policy gradient by combining an actor (the policy) with a critic (the value function)
     • Value-based: Q-learning finds the best action in the current state
     • Policy-based: Trust Region Policy Optimization, Generalized Advantage Estimation
     http://rail.eecs.berkeley.edu/deeprlcourse-fa17/f17docs/lecture_3_rl_intro.pdf
  8. The Brains Behind the Algorithms
     • Value functions: estimate how good a state or action is by predicting the total future reward (the return)
     • Policy methods: find the best action directly by optimizing the policy (behavior) itself
     • Vanilla policy gradients: for every episode with positive reward, take a gradient step that increases the probability of the actions taken (see the sketch below)
     • Improved policy gradients: multiple gradient steps per episode
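As a concrete illustration of one vanilla policy gradient step, here is a hedged REINFORCE-style sketch; the softmax policy with linear features, the toy sizes, and the learning rate are all assumptions, not the presenters' setup:

    import numpy as np

    rng = np.random.default_rng(0)
    n_features, n_actions = 4, 3
    theta = np.zeros((n_features, n_actions))  # policy parameters

    def policy(state):
        # Softmax over linear scores: pi(a|s)
        logits = state @ theta
        p = np.exp(logits - logits.max())
        return p / p.sum()

    state = rng.normal(size=n_features)   # toy state
    probs = policy(state)
    action = rng.choice(n_actions, p=probs)
    episode_return = 1.0                  # pretend this episode paid off

    # grad log pi(a|s) for a softmax policy: outer(s, one_hot(a) - probs)
    one_hot = np.eye(n_actions)[action]
    grad_log_pi = np.outer(state, one_hot - probs)
    theta += 0.01 * episode_return * grad_log_pi  # raise p(action) when return > 0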
  9. Popular Paths to Bring Machine Learning into Games
     • Microsoft*: DirectML (DML) framework
     • Ubisoft* LaForge: bringing research into industry, with access to game engines and data
     • Unity*: first-party support via ML-Agents, an interface between research and gaming; DML backend coming soon
  10. Motion with Reinforcement Learning
     • Understanding the path- or motion-planning problem is crucial in unstructured environments
     • Data-driven input combined with a physics-based animated character creates smooth, robust animation
     • RL offers a convenient framework for learning different strategies without mountains of data
     • Solves generalization problems in path and motion planning
     (Deep Q-Networks: Volodymyr Mnih, Deep RL Bootcamp, Berkeley, DeepMind*)
  11. From Equations to a Framework (Q-Learning → DQN)
     • Q-learning learns a function Q : State × Action → value; if we know the value of taking each action in a given state, we can construct a policy that maximizes reward: a* = argmax_a Q(s, a)
     • A neural network can stand in for Q, since neural networks are universal function approximators
     • Bellman update: Q(s, a) = r + γ · max_a′ Q(s′, a′)
     [Diagram: a DQN mapping the state through conv and fully connected layers, with activation functions, to Q-values for the actions Straight / Left / Right]
     (A tabular sketch of the update follows below.)
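A minimal tabular Q-learning sketch of the Bellman update above; the state/action counts, learning rate, and sample transition are assumptions for illustration. A DQN simply replaces the table with the network in the diagram:

    import numpy as np

    n_states, n_actions = 16, 3   # toy sizes
    gamma, alpha = 0.99, 0.1      # discount and learning rate (assumed)
    Q = np.zeros((n_states, n_actions))

    def q_update(s, a, r, s_next):
        # Bellman target: r + gamma * max_a' Q(s', a')
        target = r + gamma * Q[s_next].max()
        Q[s, a] += alpha * (target - Q[s, a])

    def greedy_action(s):
        # a* = argmax_a Q(s, a)
        return int(Q[s].argmax())

    q_update(s=0, a=1, r=1.0, s_next=4)   # one sample transition
    print(greedy_action(0))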
  12. Evaluating Motion Algorithms on Intel® Core Processors
     https://github.com/xbpeng/DeepMimic
     [Chart: training time (minutes) vs. iterations (millions), TensorFlow baseline]
     • ~52 hours of training on an 8-core platform; ~52 hours to train on CPU → can we do better?
     Testing by Intel as of June 28th, 2019. Intel® i9-9900K, 95 W TDP, 8C/16T, 4.3 GHz, Turbo enabled; graphics: NVIDIA* RTX 2080; memory: 4x8 GB @ 2133 MHz; storage: Intel SSD 545 Series 240 GB; OS: Windows* 10 RS5; BIOS build CFLSFX1.R00.X151B01. All data collected with TensorFlow* 1.12 and the DeepMimic branch dated June 28th, 2019.
  13. Analyzing the Software Stack (Intel® VTune™ Amplifier XE)
     • Only ~20% of the time is spent in actual compute; the rest is overhead, largely inefficiency due to spin waits
  14. Optimizing the Software Stack (1)
     • Re-evaluate the libraries in the DeepMimic software stack: recompile TensorFlow* with Intel® MKL-DNN

       bazel --output_base=output_dir build --config=mkl --config=opt //tensorflow/tools/pip_package:build_pip_package
       python -c "import tensorflow; print(tensorflow.pywrap_tensorflow.IsMklEnabled())"   # Result: True

     • Evaluate different threading parameters to reduce spin time

       import tensorflow   # importing sets KMP_BLOCKTIME and OMP_PROC_BIND
       import os
       # delete the existing values so different settings can be tried
       del os.environ['OMP_PROC_BIND']
       del os.environ['KMP_BLOCKTIME']

     • Move the Python installation to optimized Intel Python libraries: simple wins from switching numpy to the more efficient Intel numpy build
     (A threading-configuration sketch follows below.)
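For reference, a hedged sketch of the kind of threading settings one might try with an MKL-DNN TensorFlow 1.x build; the specific values below are assumptions to tune per machine, not the settings used in the talk:

    import os

    # KMP/OMP settings must be in the environment before TensorFlow loads MKL.
    os.environ["KMP_BLOCKTIME"] = "0"    # don't spin after parallel regions
    os.environ["KMP_AFFINITY"] = "granularity=fine,compact,1,0"
    os.environ["OMP_NUM_THREADS"] = "8"  # e.g. physical cores on an i9-9900K

    import tensorflow as tf

    config = tf.ConfigProto(
        intra_op_parallelism_threads=8,  # threads inside one op (e.g. a matmul)
        inter_op_parallelism_threads=2,  # independent ops run in parallel
    )
    sess = tf.Session(config=config)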
  15. Optimizing the Software Stack (2)
     • Optimize the math libraries to use the FP32 datatype and parallelism instead of double precision and scalar code
     • Map from scalar Eigen to Eigen backed by MKL: compile Eigen with MKL, and Bullet3 (a real-time collision/physics SDK), to use the AVX2 code path
  16. Optimization Results (baseline vs. after optimizations): putting CPUs to work
     • The application now trains with useful compute instead of spinning
     • Most of the OpenMP/threading spin is removed by TensorFlow with MKL-DNN
     • The Eigen MKL library in DeepMimic Core takes advantage of intrinsic code
  17. Training Results with the Optimized Stack
     • Optimizing training is the first step toward deployment
     • The right libraries and datatypes are critical for deep-learning training performance
     [Chart: minutes vs. iterations (millions) for TensorFlow baseline, TensorFlow + MKL-DNN, and TensorFlow + MKL-DNN + Eigen libs]
     • Enabling multithreading and using MKL-DNN instead of Eigen reduces training time by 2.6x: from ~50 hours to ~19 hours
  18. Takeaway: using optimized libraries to train machine-learning algorithms boosts performance and reduces training time
  19. Bringing Motion to Production
  20. Understanding the Inference Model
     • A training checkpoint must be turned into an inference model: how can a developer read it? (A freezing sketch follows below.)
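One common TF 1.x route from a training checkpoint to a single readable inference file is to freeze variables into constants; a hedged sketch, where the checkpoint path and output node name are hypothetical:

    import tensorflow as tf

    ckpt = "model.ckpt"       # hypothetical checkpoint path
    output_node = "output"    # hypothetical output op name

    saver = tf.train.import_meta_graph(ckpt + ".meta")
    with tf.Session() as sess:
        saver.restore(sess, ckpt)
        # Bake current variable values into the graph as constants.
        frozen = tf.graph_util.convert_variables_to_constants(
            sess, sess.graph_def, [output_node])

    # Serialize one self-contained inference graph.
    tf.train.write_graph(frozen, ".", "model_frozen.pb", as_text=False)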
  21. Unity® ML-Agents: bridging the gap between research and game integration
  22. Overview: Unity ML-Agents
     [Diagram: inside the Unity environment, the Agent collects observations and receives a vector action from its Brain, coordinated by the Academy; the Unity Inference Engine runs the model on a compute shader (CS) or CPU, with a DirectML path]
     (An illustrative loop sketch follows below.)
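A purely illustrative sketch of the observe → decide → act loop in the diagram; the class and method names are hypothetical, not the actual ML-Agents API (the 50/20 vector sizes anticipate the Puppo example on the next slide):

    # Hypothetical names for illustration only; not the real ML-Agents API.
    class Brain:
        def decide(self, observations):
            # In ML-Agents this would run the trained policy through the
            # inference engine (compute shader, DirectML, or CPU backend).
            return [0.0] * 20   # vector action

    class Agent:
        def __init__(self, brain):
            self.brain = brain

        def collect_observations(self):
            return [0.0] * 50   # observation vector fed to the brain

        def step(self):
            obs = self.collect_observations()
            return self.brain.decide(obs)  # action applied in the scene

    agent = Agent(Brain())
    action = agent.step()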
  23. Puppo: Motion Using a Unity ML Agent
     • Goal: the puppy runs for the bone
     • Agent: a corgi
     • About 50 float32 inputs
     • Three hidden layers of 512 nodes
     • About 20 float outputs
     (A network sketch follows below.)
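The slide's numbers describe a small fully connected policy network; here is a sketch in tf.keras, assuming ReLU activations and a linear output layer (the slide does not specify either):

    import tensorflow as tf

    # ~50 float32 observations in, ~20 floats (the action vector) out,
    # three hidden layers of 512 units each. Activations are assumptions.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(512, activation="relu", input_shape=(50,)),
        tf.keras.layers.Dense(512, activation="relu"),
        tf.keras.layers.Dense(512, activation="relu"),
        tf.keras.layers.Dense(20),   # linear action outputs
    ])
    model.summary()   # roughly 0.56M parameters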
  24. Analyzing Inference Performance with 1 Agent
     • No metacommands: 1.8 seconds/inference
     • Metacommands: 0.8 seconds/inference
     • Execution time reduced by ~2x with metacommands at the kernel level
     https://devblogs.microsoft.com/pix/download/
  25. Microsoft® PIX Tool: Benefits of Using Metacommands
     • 3.064 msec without metacommands vs. 1.364 msec with metacommands
     • The more agents, the better the performance with metacommands
  26. Results
     [Chart: scaling with multiple agents (1, 10, 50): compute-shader vs. metacommand latency in msec, plus gain (%); lower is better]
     • Metacommands give a significant performance boost by leveraging Intel® graphics driver optimizations
  27. Intel® Graphics Performance Analyzer (GPA): DX12 Profiling Preview
     • DX12 DirectML profiling in Intel® GPA
  28. Summary
     • TensorFlow with an Intel® MKL-DNN build is now available on Windows, leveraging new instruction sets on Intel® Xeon™ and Core™ processors
     • Training gets a performance boost, since reinforcement-learning use cases are CPU-friendly
     • Optimized pre- and post-processing libraries give an end-to-end (E2E) performance boost
     • Microsoft's DirectML leverages metacommands, which give a solid performance boost for workloads that mix gaming and deep learning
  29. References
     • TensorFlow: https://www.tensorflow.org/
     • TensorFlow optimization guide: https://software.intel.com/en-us/articles/intel-optimization-for-tensorflow-installation-guide
     • DeepMimic: https://github.com/xbpeng/DeepMimic/tree/master/learning
     • AI4Animation: https://github.com/sebastianstarke/AI4Animation
     • Unity ML-Agents: https://github.com/Unity-Technologies/ml-agents
     • RL beginner guide: https://skymind.ai/wiki/deep-reinforcement-learning
     • Gym: https://gym.openai.com/
     • Ubisoft: https://montreal.ubisoft.com/en/our-engagements/research-and-development/
     • Intel® GPA: https://software.intel.com/en-us/gpa
