Deep Learning at Scale

Hanlin Tang presents at RE-Work NYC. He discusses Nervana's Deep Learning Platform.

  1. Nervana’s Deep Learning Platform: Making Machines Smarter.™ Hanlin Tang, PhD, Algorithms Engineer. Proprietary and confidential. Do not distribute.
  2. Deep Learning. Facebook DeepMask; Silver et al., 2016; The Atlantic, March 2016. “The error rate has been cut by a factor of two in all the languages, more than a factor of two in many cases. That’s mostly due to deep learning and the way we have optimized it …” (Alex Acero, Siri Senior Director, Apple; article in Backchannel/WIRED, Aug 2016)
  3. neon deep learning framework (fastest deep learning framework); train, deploy, explore; nervana engine; cloud
  4. Unprecedented computing power; 10x speedup over current Maxwell GPUs (~55 TeraOps); 32 GB of High-Bandwidth Memory; six bi-directional high-bandwidth links for a 3D torus interconnect; 8 chips in a box, scaling seamlessly to multiple chassis.
  8. neon vs. Caffe, time in ms. Forward: neon 101, Caffe 719, speed-up 7.1x. Backward: neon 164, Caffe 746, speed-up 4.5x. Total: neon 265, Caffe 1455, speed-up 5.5x.
  9. neon v1.6 + mgpu v1.6. neon v2.0: modular dataloader (aeon); neural machine translation model. neon v3.0: Nervana Graph; TensorFlow interoperability; graph-enabled models; distributed computing.
  10. “Training neural networks is a dark art.” Hyperparameters: number and type of units/layers; convolution filter size; weight initialization; optimization method; learning rate schedule.
  11. Command-line client; web interface.
  12. Nervana in action. Healthcare: tumor detection. Automotive: speech interfaces. Finance: time-series search engine. Agricultural robotics. Oil & gas. Proteomics: sequence analysis.
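
The speed-up column in the neon vs. Caffe benchmark slide is simply the Caffe time divided by the neon time for the same pass. The short sketch below recomputes it from the timings as listed on the slide; the dictionary layout and the `speedup` helper are illustrative names, not part of neon or Caffe.

```python
# Timings in milliseconds, copied as listed on the benchmark slide.
timings_ms = {
    "Forward": {"neon": 101, "caffe": 719},
    "Backward": {"neon": 164, "caffe": 746},
    "Total": {"neon": 265, "caffe": 1455},
}


def speedup(neon_ms, caffe_ms):
    """Return how many times faster neon is than Caffe for one pass."""
    return caffe_ms / neon_ms


# Round to one decimal place, matching the precision used on the slide.
speedups = {
    name: round(speedup(t["neon"], t["caffe"]), 1)
    for name, t in timings_ms.items()
}

for name, value in speedups.items():
    print(f"{name}: {value}x")
```

Each ratio uses only its own row, so the result does not depend on how the total was measured.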

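
The hyperparameters listed on the "dark art" slide can be treated as a search space. The slides do not say how Nervana's platform explores that space, so the sketch below swaps in plain random search as one common baseline. Every option value and name here (`SEARCH_SPACE`, `sample_config`, `random_search`) is a hypothetical choice made for illustration, not a neon or Nervana cloud API.

```python
import random

# Hypothetical search space: the keys mirror the hyperparameters named on the
# slide, while the candidate values are illustrative assumptions.
SEARCH_SPACE = {
    "num_layers": [2, 4, 8],
    "layer_type": ["conv", "affine"],
    "conv_filter_size": [3, 5, 7],
    "weight_init": ["gaussian", "uniform", "xavier"],
    "optimizer": ["sgd_momentum", "adagrad", "adam"],
    "lr_schedule": ["constant", "step_decay", "exponential_decay"],
}


def sample_config(rng):
    """Draw one configuration by picking a random option for every key."""
    return {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}


def random_search(num_trials, seed=0):
    """Return num_trials sampled configurations; a fixed seed makes it repeatable."""
    rng = random.Random(seed)
    return [sample_config(rng) for _ in range(num_trials)]


configs = random_search(5)
for config in configs:
    print(config)
```

In practice each sampled configuration would be used to train a model and the best validation score kept; that training step is omitted here because it depends on the framework and dataset.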