Deep Learning for FinTech
GEETA CHAUHAN, CTO SVSG
Agenda
 AI & Deep Learning in FinTech
 What is Deep Learning?
 Rise of Specialized Compute
 Techniques for Optimization
 Look into future
 Steps for starting your AI journey
 References
Source: CBInsights
Deep Learning in FinTech
Visual Chart Pattern trading (AlpacaAlgo)
AI - Crypto Hedge Fund (Numerai)
Trading Gym (Prediction Machines)
Real Time Fraud Detection (Feedzai, Kabbage)
FX Trading across time zones (QuantAlea)
Cyber Security (Deep Instinct)
Personal Finance Assistant (Cleo AI)
Customer Experience AI (AugmentHQ)
What is Deep Learning?
 AI Neural Networks composed of many layers
 Learn like humans
 Automated Feature Learning
 Layers are like Image Filters
Rise of Deep Learning
Major Advances in AI
• Computer Vision, Language Translation, Speech Recognition, Question & Answer, …
Challenging to build & deploy for large scale applications
• Latency, Cost, Power consumption issues
• Complexity & size outpacing commodity “General purpose compute”
• Hyper-parameter tuning, Black box
Exascale, 15 Watts
Shift towards Specialized Compute
 Special purpose Cloud
 Google TPU, Microsoft Brainwave, Intel Nervana, IBM Power AI, Nvidia V100
 Bare Metal Cloud – Preview AWS, GCE coming April 2018
 Spectrum: CPU, GPU, FPGA, Custom ASICs
 Edge Compute: Hardware accelerators, AI SoC
 Intel Neural Compute Stick, Nvidia Jetson, Nvidia Drive PX (Self driving cars)
 Architectures
 Cluster Compute, HPC, Neuromorphic, Quantum compute
 Complexity in Software
 Model tuning/optimizations specific to hardware
 Growing need for compilers to optimize based on deployment hardware
 Workload specific compute: Model training, Inference
CPU Optimizations
 Leverage high-performance compute tools
 Intel Python, Intel Math Kernel Library (MKL), NNPACK (for multi-core CPUs)
 Compile TensorFlow from source for CPU optimizations
 Proper batch size, using all cores & memory
 Proper data format
 NCHW for CPUs vs. TensorFlow default NHWC
 Use Queues for Reading Data
Source: Intel Research Blog
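The two data formats differ only in the order in which the same values are stored. A minimal pure-Python sketch of the index arithmetic follows; the function names are illustrative and are not part of TensorFlow or MKL.

```python
def flat_index_nchw(n, c, h, w, C, H, W):
    """Offset of element (n, c, h, w) in a contiguous NCHW buffer."""
    return ((n * C + c) * H + h) * W + w


def flat_index_nhwc(n, c, h, w, C, H, W):
    """Offset of the same element in a contiguous NHWC buffer."""
    return ((n * H + h) * W + w) * C + c


def nhwc_to_nchw(buf, N, C, H, W):
    """Reorder a flat NHWC buffer into a flat NCHW buffer."""
    out = [0] * (N * C * H * W)
    for n in range(N):
        for c in range(C):
            for h in range(H):
                for w in range(W):
                    src = flat_index_nhwc(n, c, h, w, C, H, W)
                    dst = flat_index_nchw(n, c, h, w, C, H, W)
                    out[dst] = buf[src]
    return out


# One 2x2 image with 3 channels, stored channel-last (NHWC).
nhwc = list(range(12))
nchw = nhwc_to_nchw(nhwc, N=1, C=3, H=2, W=2)
```

In NCHW all values of one channel sit next to each other, whereas NHWC interleaves the channels pixel by pixel.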
TensorFlow CPU Optimizations
 Compile from source
 git clone https://github.com/tensorflow/tensorflow.git
 Run ./configure from the TensorFlow source directory
 Select option MKL (CPU) Optimization
 Build pip package for install
 bazel build --config=mkl --copt=-DEIGEN_USE_VML -c opt //tensorflow/tools/pip_package:build_pip_package
 Install the optimized TensorFlow wheel
 bazel-bin/tensorflow/tools/pip_package/build_pip_package ~/path_to_save_wheel
 pip install --upgrade --user ~/path_to_save_wheel/wheel_name.whl
 Intel Optimized Pip Wheel files
Parallelize your models
 Data Parallelism
 TensorFlow Estimator + Experiments
 Parameter Server, Worker cluster
 Intel BigDL Spark Cluster
 Baidu’s Ring AllReduce
 Uber’s Horovod Tensor Fusion
 HyperTune Google Cloud ML
 Model Parallelism
 Graph too large to fit on one machine
 TensorFlow Model Towers
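Ring AllReduce, listed above under data parallelism, can be sketched as a single-process simulation. This is an illustration of the communication pattern only, not Baidu's or Horovod's implementation; the function name and the list-based "workers" are assumptions made for the example.

```python
def ring_allreduce(worker_grads):
    """Sum equal-length gradient lists across workers using a simulated ring."""
    n = len(worker_grads)
    length = len(worker_grads[0])
    # Chunk k covers indices bounds[k] .. bounds[k + 1] - 1.
    bounds = [k * length // n for k in range(n + 1)]
    data = [list(g) for g in worker_grads]  # copy so the inputs stay untouched

    def chunk(k):
        k %= n
        return range(bounds[k], bounds[k + 1])

    # Phase 1, scatter-reduce: after n - 1 steps worker i holds the full sum
    # of chunk (i + 1) % n.
    for step in range(n - 1):
        # Snapshot all payloads first to mimic simultaneous sends.
        sends = [(i - step, [data[i][j] for j in chunk(i - step)]) for i in range(n)]
        for i, (k, payload) in enumerate(sends):
            dst = (i + 1) % n
            for j, v in zip(chunk(k), payload):
                data[dst][j] += v

    # Phase 2, all-gather: pass the finished chunks around the ring.
    for step in range(n - 1):
        sends = [(i + 1 - step, [data[i][j] for j in chunk(i + 1 - step)]) for i in range(n)]
        for i, (k, payload) in enumerate(sends):
            dst = (i + 1) % n
            for j, v in zip(chunk(k), payload):
                data[dst][j] = v

    return data


grads = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400]]
reduced = ring_allreduce(grads)
```

Each worker sends and receives only one chunk per step, so the data moved per worker stays roughly constant as the number of workers grows.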
Optimizations for Training
Source: Amazon MXNet
Workload Partitioning
Source: Amazon MXNet
 Minimize communication time
 Place neighboring layers on same GPU
 Balance workload between GPUs
 Different layers have different memory-compute properties
 Model on left more balanced
 LSTM unrolling: ↓ memory, ↑ compute time
 Encode/Decode: ↑ memory
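One way to combine "neighboring layers on same GPU" with "balance workload" is to split the layer sequence into contiguous groups while minimizing the heaviest group. The sketch below assumes a non-empty list of integer per-layer costs; the function name and the cost numbers are hypothetical and do not come from MXNet.

```python
def partition_layers(costs, num_gpus):
    """Split layers into at most num_gpus contiguous groups,
    minimizing the load of the heaviest group."""

    def groups_for(cap):
        # Greedily fill a group until the next layer would exceed cap.
        groups, current, load = [], [], 0
        for idx, cost in enumerate(costs):
            if current and load + cost > cap:
                groups.append(current)
                current, load = [], 0
            current.append(idx)
            load += cost
        groups.append(current)
        return groups

    # Binary search for the smallest cap that needs no more than num_gpus groups.
    lo, hi = max(costs), sum(costs)
    while lo < hi:
        mid = (lo + hi) // 2
        if len(groups_for(mid)) <= num_gpus:
            hi = mid
        else:
            lo = mid + 1
    return groups_for(lo)


layer_costs = [4, 2, 3, 7, 1, 5]  # hypothetical per-layer compute cost
placement = partition_layers(layer_costs, num_gpus=3)
```

Here `placement` lists the layer indices assigned to each GPU. Only neighboring layers share a device, so activations cross a GPU boundary at most `num_gpus - 1` times.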
Optimizations for Inferencing
 Graph Transform Tool
 Freeze graph (variables to constants)
 Quantization (32 bit float → 8 bit)
 Quantize weights (20 M weights for IV3)
 Inception v3 93 MB → 1.5 MB
 AlexNet 35x smaller, VGG-16 49x smaller
 3x to 4x speedup, 3x to 7x more energy-efficient
bazel build tensorflow/tools/graph_transforms:transform_graph
bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
--in_graph=/tmp/classify_image_graph_def.pb \
--outputs="softmax" --out_graph=/tmp/quantized_graph.pb \
--transforms='add_default_attributes strip_unused_nodes(type=float, shape="1,299,299,3")
remove_nodes(op=Identity, op=CheckNumerics)
fold_constants(ignore_errors=true)
fold_batch_norms fold_old_batch_norms quantize_weights quantize_nodes
strip_unused_nodes sort_by_execution_order'
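The idea behind 32-bit float to 8-bit weight quantization can be sketched in a few lines. This is a simplified min/max linear scheme for illustration, not the Graph Transform Tool's actual code; the function names are assumptions.

```python
def quantize_weights(weights):
    """Map floats to 8-bit codes (0..255) using the tensor's min/max range."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for a constant tensor
    codes = bytes(round((w - lo) / scale) for w in weights)
    return codes, lo, scale


def dequantize_weights(codes, lo, scale):
    """Recover approximate floats from the 8-bit codes."""
    return [lo + c * scale for c in codes]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes, lo, scale = quantize_weights(weights)
restored = dequantize_weights(codes, lo, scale)
```

Each weight now takes one byte instead of four, and the round-trip error per weight is at most half of `scale`.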
Cluster Optimizations
 Define your ML Container locally
 Evaluate with different parameters in the cloud
 Use EFS / GFS for data storage and sharing across nodes
 Create separate Data processing container
 Mount EFS/GFS drive on all pods for shared storage
 Avoid GPU Fragmentation problems by bundling jobs
 Placement optimizations – Kubernetes Bundle as pods, Mesos placement constraints
 GPU Drivers bundling in container a problem
 Mount as read-only volume, or use nvidia-docker
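GPU fragmentation shows up when jobs are scattered so that no node has enough free GPUs left for the next job. As a rough illustration of the placement idea, here is a best-fit-decreasing sketch. It is not how Kubernetes or Mesos schedule; job names, node sizes, and the function name are hypothetical.

```python
def place_jobs(job_gpus, node_capacity, num_nodes):
    """Best-fit decreasing: put each job on the node that is left
    with the fewest idle GPUs after placement."""
    free = [node_capacity] * num_nodes
    placement = {}
    # Largest jobs first, so small jobs fill the remaining gaps.
    for job, need in sorted(job_gpus.items(), key=lambda kv: -kv[1]):
        candidates = [i for i in range(num_nodes) if free[i] >= need]
        if not candidates:
            placement[job] = None  # no single node can host the whole job
            continue
        best = min(candidates, key=lambda i: free[i] - need)
        free[best] -= need
        placement[job] = best
    return placement, free


jobs = {"a": 4, "b": 4, "c": 2, "d": 2, "e": 4}  # GPUs requested per job
placement, free = place_jobs(jobs, node_capacity=8, num_nodes=2)
```

With these numbers, both 8-GPU nodes end up fully used and every job lands on a single node.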
Uber’s Horovod on Mesos
 Peloton Gang Scheduler
 MPI based bandwidth optimized communication
 Code for one GPU, replicates across cluster
 Nested Containers
Source: Uber MesosCon
Future: FPGA Hardware Microservices
Project Brainwave Source: Microsoft Research Blog
FPGA Optimizations
Brainwave Compiler Source: Microsoft Research Blog
Can FPGA Beat GPU Paper:
➢ Optimizing CNNs on Intel FPGA
➢ FPGA vs GPU: 60x faster, 2.3x more energy-efficient
➢ <1% loss of accuracy
ESE on FPGA Paper:
➢ Optimizing LSTMs on Xilinx FPGA
➢ FPGA vs CPU: 43x faster, 40x more energy-efficient
➢ FPGA vs GPU: 3x faster, 11.5x more energy-efficient
Future: Neuromorphic Compute
Intel’s Loihi: Brain Inspired AI Chip
Neuromorphic memristors
Future: Quantum Computers
Source: opentranscripts.org
+ Monte Carlo Simulations & Dynamic Portfolio Optimization
? Cybersecurity a big challenge
Where to start your AI journey?
 Level 1: Just Starting
 Start with a lower-risk use case such as AI-driven customer service or RPA
 Level 2: Intermediate
 Invest in data cleansing and provenance for building richer systems
 Combine 3rd party data sets for greater insights
 Level 3: Advanced
 Experiment with Deep Learning models for complex scenarios
 Or new, innovative use cases like face recognition for banking app security
 Level 4: Mature
 Add a feedback loop to your models, learning from outcomes
 Experiment with Deep Reinforcement Learning
 Industrialize the ML/DL pipeline, with a shared model repository across the company
Resources
 CBInsights AI in FinTech Market Map: https://www.cbinsights.com/research/ai-fintech-startup-market-map/
 Deep Portfolios Paper: http://onlinelibrary.wiley.com/doi/10.1002/asmb.2209/pdf
 Opening the Blackbox of Financial AI with ClearTrade: https://arxiv.org/pdf/1709.01574.pdf
 Trading Gym: https://github.com/Prediction-Machines/Trading-Gym
 TensorFlow Intel CPU Optimized: https://software.intel.com/en-us/articles/tensorflow-optimizations-on-modern-intel-architecture
 TensorFlow Quantization: https://www.tensorflow.org/performance/quantization
 Deep Compression Paper: https://arxiv.org/abs/1510.00149
 Microsoft’s Project Brainwave: https://www.microsoft.com/en-us/research/blog/microsoft-unveils-project-brainwave/
 Can FPGAs Beat GPUs?: http://jaewoong.org/pubs/fpga17-next-generation-dnns.pdf
 ESE on FPGA: https://arxiv.org/abs/1612.00694
 Intel Spark BigDL: https://software.intel.com/en-us/articles/bigdl-distributed-deep-learning-on-apache-spark
 Baidu’s PaddlePaddle on Kubernetes: http://blog.kubernetes.io/2017/02/run-deep-learning-with-paddlepaddle-on-kubernetes.html
 Uber’s Horovod Distributed Training framework for TensorFlow: https://github.com/uber/horovod
 A Study of Complex Deep Learning Networks on High Performance, Neuromorphic, and Quantum Computers: https://arxiv.org/pdf/1703.05364.pdf
Questions?
Contact
http://bit.ly/geeta4c
geeta@svsg.co
@geeta4c
