Talk presented by Pedro Mário Cruz e Silva, Solution Architect at NVIDIA, as part of the program of the VIII Semana de Inverno de Geofísica (8th Winter Week of Geophysics), on 19 July 2017.
Potential Applications of Deep Learning in the Oil Industry
1. Pedro Mario Cruz e Silva (pcruzesilva@nvidia.com)
Solution Architect
Enterprise Latin America
Global O&G Team
HPC, DEEP LEARNING, AND
BIG DATA & ANALYTICS
2. 2
May 8 - 11, 2017 | Silicon Valley | #GTC17
www.gputechconf.com
CONNECT: Connect with technology experts from NVIDIA and other leading organizations
LEARN: Gain insight and valuable hands-on training through hundreds of sessions and research posters
DISCOVER: See how GPUs are creating amazing breakthroughs in important fields such as deep learning and AI
INNOVATE: Hear about disruptive innovations from startups
The world’s most important event for GPU developers
May 8 – 11, 2017 in Silicon Valley
http://on-demand-gtc.gputechconf.com
4. 4
LIFE AFTER MOORE’S LAW
[Figure: 40 Years of Microprocessor Trend Data. X-axis: 1980 to 2020. Y-axis: logarithmic, 10^2 to 10^7. Series: Transistors (thousands); Single-threaded perf, annotated 1.5X per year, then 1.1X per year.]
Original data up to the year 2010 collected and plotted by M. Horowitz, F. Labonte, O. Shacham, K. Olukotun, L. Hammond, and C. Batten. New plot and data collected for 2010-2015 by K. Rupp.
5. 5
200B CORE HOURS OF LOST SCIENCE
Data Center Throughput is the Most Important Thing for HPC
Source: NSF XSEDE Data: https://portal.xsede.org/#/gallery
NU = Normalized Computing Units are used to compare compute resources across supercomputers and are
based on the result of the High Performance LINPACK benchmark run on each system
[Figure: National Science Foundation (NSF XSEDE) Supercomputing Resources, 2009 to 2015. Y-axis: Normalized Units (billions), 0 to 400. Series: Computing Resources Requested; Computing Resources Available.]
6. 6
THE ADVANTAGES OF GPU-ACCELERATED DATA CENTER
Fast GPU + Strong CPU
[Figure: TFLOPS, 0.0 to 6.0, by year, 2008 to 2016. Series: NVIDIA GPU; x86 CPU. GPU points labeled M1060, M2090, K20, K40, K80, P100.]
7. 7
RISE OF GPU COMPUTING
[Figure: microprocessor trend plot with GPU computing added. X-axis: 1980 to 2020. Y-axis: logarithmic, 10^2 to 10^7. Series: Single-threaded perf, annotated 1.5X per year, then 1.1X per year; GPU-Computing perf, annotated 1.5X per year and 1000X by 2025.]
GPU computing stack: APPLICATIONS, SYSTEMS, ALGORITHMS, CUDA, ARCHITECTURE
Original data up to the year 2010 collected and plotted by M. Horowitz, F. Labonte, O. Shacham, K. Olukotun, L. Hammond, and C. Batten. New plot and data collected for 2010-2015 by K. Rupp.
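The growth annotations on the trend slides (1.5X per year, 1.1X per year, 1000X) can be related with a short compound-growth calculation. This is only a sketch: it assumes a constant year-over-year factor, and the function names are illustrative, not from the talk.

```python
import math

def years_to_reach(total_gain, gain_per_year):
    """Years of constant compounding needed to reach total_gain."""
    return math.log(total_gain) / math.log(gain_per_year)

def gain_after(years, gain_per_year):
    """Cumulative gain after compounding gain_per_year for the given years."""
    return gain_per_year ** years

# Annotations taken from the slides: 1.5X per year, 1.1X per year, 1000X.
years_to_1000x = years_to_reach(1000, 1.5)
decade_at_1_5x = gain_after(10, 1.5)
decade_at_1_1x = gain_after(10, 1.1)

print(f"1000X at 1.5X per year takes about {years_to_1000x:.1f} years")
print(f"Ten years at 1.5X per year: {decade_at_1_5x:.1f}X")
print(f"Ten years at 1.1X per year: {decade_at_1_1x:.1f}X")
```

Under the constant-factor assumption, 1000X corresponds to roughly 17 years of 1.5X annual growth, while a decade at 1.1X per year yields well under 3X.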
8. 8
U.S. TO BUILD TWO FLAGSHIP SUPERCOMPUTERS
Pre-Exascale Systems Powered by the Tesla Platform
100-300 PFLOPS Peak
IBM POWER9 CPU + NVIDIA Volta GPU
NVLink High Speed Interconnect
40 TFLOPS per Node, >3,400 Nodes
2017
Summit & Sierra Supercomputers
9. 9
TEN YEARS OF GPU COMPUTING
Timeline: 2006, 2008, 2010, 2012, 2014, 2016
Fermi: World’s First HPC GPU
Oak Ridge Deploys World’s Fastest Supercomputer w/ GPUs
World’s First Atomic Model of HIV Capsid
GPU-Trained AI Machine Beats World Champion in Go
Stanford Builds AI Machine using GPUs
World’s First 3-D Mapping of Human Genome
CUDA Launched
World’s First GPU Top500 System
Google Outperforms Humans in ImageNet
Discovered How H1N1 Mutates to Resist Drugs
AlexNet beats expert code by huge margin using GPUs
11. 11
OIL & GAS - NVIDIA INDEX
Leading HPC Tool to Analyze Large-Scale Data for Faster Discoveries
Interactive
Built for Large-Scale Data
Remote Visualization
Performance @ Scale
Visualize Anywhere
13. 13
“IBM-NVIDIA SERVERS ACHIEVE HIGH-PERFORMANCE COMPUTING MILESTONE IN OIL INDUSTRY”
25 April 2017
https://www.forbes.com/sites/aarontilley/2017/04/25/ibm-nvidia-servers-achieve-high-performance-computing-milestone-in-oil-industry/#8e3b56626330
1 Billion Cell Reservoir Model
ExxonMobil using the Blue Waters facility at NCSA: Servers 22,400 | Processors per server 24 | Total CPUs 537,600
ECHELON (Stone Ridge Technologies), simulation on GPUs: Servers 30 | GPUs per server 4 | Total GPUs 120
14. 14
RESERVOIR SIMULATION
Performance Comparison
Company | Simulator/Method | Model | Production Simulation | Runtime | Reference | Cores/Servers
Saudi Aramco | GIGAPOWERS, three-phase black oil | 1.03 billion cells, 3,000 wells | 60 years | 4 days | [1] | not listed
Saudi Aramco | GIGAPOWERS, three-phase black oil | 1.03 billion cells, 3,000 wells | 60 years | 21 hours | [2] | 5,640 cores, 470 servers
Total/Schlumberger | INTERSECT | 1.1 billion cells, 361 wells | 20 years | 10.5 hours | [3] | 576 cores, 288 servers
ExxonMobil | ? | 1 billion cells | ? | ? | ? | 716,800 cores, 22,400 servers
Stone Ridge | ECHELON, three-phase black oil | 1.01 billion cells, 1,000 wells | 45 years | 92 minutes | ? | 120 GPUs, 30 servers
[1] SPE 119272, “A Next-Generation Parallel Reservoir Simulator for Giant Reservoirs”, A. Dogru et al., 2009 SPE Reservoir Simulation Symposium.
[2] SPE 142297, “New Frontiers in Large Scale Reservoir Simulation”, A. Dogru et al., 2011 SPE Reservoir Simulation Symposium.
[3] IPTC 17648, “Giga Cell Compositional Simulation”, E. Obi et al., 2014 International Petroleum Technology Conference.
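One rough way to read the runtimes in this comparison is to normalize them to simulated production years per wall-clock hour. The sketch below uses only the years and runtimes listed on the slide; the ExxonMobil run is left out because its runtime is given as "?". The models differ in cell count, well count, physics, and hardware, so this is not a like-for-like benchmark, and the metric itself is an illustrative choice, not one used in the talk.

```python
# (label, simulated production years, wall-clock hours), as listed on the slide.
RUNS = [
    ("Saudi Aramco GIGAPOWERS [1]", 60, 4 * 24),       # 4 days
    ("Saudi Aramco GIGAPOWERS [2]", 60, 21),           # 21 hours
    ("Total/Schlumberger INTERSECT [3]", 20, 10.5),    # 10.5 hours
    ("Stone Ridge ECHELON", 45, 92 / 60),              # 92 minutes
]

def simulated_years_per_hour(years, hours):
    """Simulated production years completed per hour of wall-clock time."""
    return years / hours

rates = {label: simulated_years_per_hour(years, hours)
         for label, years, hours in RUNS}

for label, rate in rates.items():
    print(f"{label}: {rate:.2f} simulated years per hour")
```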
19. 19
LSDalton | Quantum Chemistry | 12X speedup in 1 week
Numeca | CFD | 10X faster kernels, 2X faster app
PowerGrid | Medical Imaging | 40 days to 2 hours
INCOMP3D | CFD | 3X speedup
NekCEM | Computational Electromagnetics | 2.5X speedup, 60% less energy
COSMO | Climate, Weather | 40X speedup, 3X energy efficiency
CloverLeaf | CFD | 4X speedup, single CPU/GPU code
MAESTRO, CASTRO | Astrophysics | 4.4X speedup, 4 weeks effort
20. 20
OPENACC FOR EVERYONE
New PGI Community Edition Now Available
PROGRAMMING MODELS
OpenACC, CUDA Fortran, OpenMP,
C/C++/Fortran Compilers and Tools
PLATFORMS
x86, OpenPOWER, NVIDIA GPU
UPDATES | 1-2 times a year | 6-9 times a year | 6-9 times a year
SUPPORT | User Forums | PGI Support | PGI Enterprise Services
LICENSE | Annual | Perpetual | Volume/Site
FREE
23. 23
LEARNING FROM DATA
AND SOME BUZZ WORDS
ARTIFICIAL INTELLIGENCE: Knowledge & Reason, Learning, Planning, Communicating, Perceiving
MACHINE LEARNING: Learning from data, Expert systems, Handcrafted features
DEEP LEARNING: Learning from data, Neural networks, Computer learned features
24. 24
A NEW COMPUTING MODEL
TRAINING: Training Data (Input + “Label”) -> Output: Trained Neural Network
INFERENCE: Input -> Trained Neural Network -> Output: “Label”
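The split between training and inference on this slide can be illustrated with a deliberately tiny model. The sketch below swaps the deep neural network for a single perceptron, and the labeled data (the logical AND of two inputs) is a hypothetical stand-in for real training data; the function names are illustrative.

```python
def train(training_data, epochs=20):
    """TRAINING: labeled (input, label) pairs go in, a trained model comes out."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for (x1, x2), label in training_data:
            predicted = 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0
            error = label - predicted          # perceptron learning rule
            weights[0] += error * x1
            weights[1] += error * x2
            bias += error
    return weights, bias

def infer(model, sample):
    """INFERENCE: an unlabeled input goes in, a predicted label comes out."""
    weights, bias = model
    x1, x2 = sample
    return 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0

# Hypothetical training data: label is 1 only when both inputs are 1.
TRAINING_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

model = train(TRAINING_DATA)
predictions = [infer(model, sample) for sample, _ in TRAINING_DATA]
print(predictions)
```

The same trained model object is reused for every inference call, which mirrors the slide: training happens once on labeled data, and inference only applies the result.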
25. 25
A NEW COMPUTING MODEL
Outperform experts, facts, rules with software that writes software
Deep Learning Object Detection
DNN + Data + GPU
Traditional Computer Vision
Experts + Time
Deep Learning Achieves
“Superhuman” Results
[Figure: ImageNet results by year, 2009 to 2016. Y-axis: 0% to 100%. Series: Traditional CV; Deep Learning.]
26. 26
“ACCELERATING EULERIAN FLUID SIMULATION WITH
CONVOLUTIONAL NETWORKS”
Tompson, J., Schlachter, K., Sprechmann, P., & Perlin, K. (2016). Accelerating Eulerian Fluid
Simulation With Convolutional Networks. arXiv preprint arXiv:1607.03597.
29. 29
$500B OPPORTUNITY OVER 10 YRS
[Figure: Deep Learning Software Revenue by Industry. Segments: Ad Service, Technology, Investment, Media, Oil & Gas, Mfg, Retail, Other.]
[Figure: Deep Learning Total Revenue by Segment.]
IBM: “Cognitive business represents a $2T opportunity”
SOURCE: “Deep Learning for Enterprise Applications,” 4Q 2015, Tractica
31. 31
SAP AI FOR THE
ENTERPRISE
First commercial AI offerings
from SAP
Brand Impact, Service Ticketing,
Invoice-to-Record applications
Powered by NVIDIA GPUs on
DGX-1 and AWS
35. 35
AN AMAZING YEAR FOR SELF-DRIVING CARS
Uber Enters the Race
Toyota Invests $1B in AI Lab
Volvo Drive Me on Public Roads in 2017
NHTSA: Computer Counts as Driver
Tesla Model 3: 300K pre-orders
Audi, BMW, Daimler Buy HERE
Tesla Model S Auto-pilot
Baidu Enters the Race
Honda, Nissan, Toyota Team Up
GM Buys Cruise
36. 36
NEW AI DRIVING
Training on
DGX-1
Driving with
DriveWorks
KALDI
LOCALIZATION
MAPPING
DRIVENET
DAVENET
NVIDIA DGX-1 NVIDIA DRIVE PX
39. 39
POWERING THE DEEP LEARNING ECOSYSTEM
NVIDIA SDK accelerates every major framework
COMPUTER VISION
OBJECT DETECTION IMAGE CLASSIFICATION
SPEECH & AUDIO
VOICE RECOGNITION LANGUAGE TRANSLATION
NATURAL LANGUAGE PROCESSING
RECOMMENDATION ENGINES SENTIMENT ANALYSIS
DEEP LEARNING FRAMEWORKS
Mocha.jl
NVIDIA DEEP LEARNING SDK
developer.nvidia.com/deep-learning-software
40. 40
NVIDIA DIGITS
Interactive Deep Learning GPU Training System
developer.nvidia.com/digits
Interactive deep neural network development
environment for image classification and object
detection
Schedule, monitor, and manage neural network training
jobs
Analyze accuracy and loss in real time
Track datasets, results, and trained neural networks
Scale training jobs across multiple GPUs automatically
41. 41
DEEP LEARNING WORKFLOWS
IMAGE CLASSIFICATION: Classify images into classes or categories. Object of interest could be anywhere in the image. Example output: 98% Dog, 2% Cat.
OBJECT DETECTION: Find instances of objects in an image. Objects are identified with bounding boxes.
IMAGE SEGMENTATION (New in DIGITS 5): Partition image into multiple regions. Regions are classified at the pixel level.
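Class percentages such as "98% Dog, 2% Cat" are typically obtained by passing a classifier's raw scores through a softmax. The sketch below shows that step only; the raw scores are hypothetical values chosen for illustration, and nothing here is taken from DIGITS itself.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores.values())                      # subtract max for stability
    exps = {label: math.exp(s - peak) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

# Hypothetical raw scores from the last layer of an image classifier.
raw_scores = {"Dog": 4.0, "Cat": 0.1}

probabilities = softmax(raw_scores)
predicted_class = max(probabilities, key=probabilities.get)

for label, p in probabilities.items():
    print(f"{p:.0%} {label}")
print("Predicted class:", predicted_class)
```

For segmentation, the same computation would be applied per pixel rather than once per image.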
42. 42
DLI – DEEP LEARNING INSTITUTE
http://www.nvidia.com/object/deep-learning-institute.html
44. 44
DATA DELUGE TO DATA HUNGRY
INCREASING DATA VARIETY
Data types: Search Marketing; Behavioral Targeting; Dynamic Funnels; User Generated Content; Mobile Web; SMS/MMS; Sentiment; HD Video; Speech To Text; Product/Service Logs; Social Network; Business Data Feeds; User Click Stream; Sensors; Infotainment Systems; Wearable Devices; Cyber Security Logs; Connected Vehicles; Machine Data; IoT Data; Dynamic Pricing; Payment Record; Purchase Detail; Purchase Record; Support Contacts; Segmentation; Offer Details; Web Logs; Offer History; A/B Testing; Streaming Video; Natural Language Processing
Stages: BUSINESS PROCESS, WEB, DIGITAL, AI
Data volume: GIGABYTES, TERABYTES, PETABYTES, EXABYTES, ZETTABYTES
45. 45
DATA & ANALYTICS USE CASES
AUTOMOTIVE: Auto sensors reporting location, problems
COMMUNICATIONS: Location-based advertising
CONSUMER PACKAGED GOODS: Sentiment analysis of what’s hot, problems
FINANCIAL SERVICES: Risk & portfolio analysis, new products
EDUCATION & RESEARCH: Experiment sensor analysis
HIGH TECHNOLOGY / INDUSTRIAL MFG.: Mfg. quality, warranty analysis
LIFE SCIENCES: Clinical trials
MEDIA/ENTERTAINMENT: Viewers / advertising effectiveness
ON-LINE SERVICES / SOCIAL MEDIA: People & career matching
HEALTH CARE: Patient sensors, monitoring, EHRs
OIL & GAS: Drilling exploration sensor analysis
RETAIL: Consumer sentiment
TRAVEL & TRANSPORTATION: Sensor analysis for optimal traffic flows
UTILITIES: Smart Meter analysis for network capacity
LAW ENFORCEMENT & DEFENSE: Threat analysis - social media monitoring, photo analysis
46. 46
DGX-1 FOR ANALYTICS SOLUTIONS
+ ARCHITECTURES
Spark Scheduler
CORE
TECHNOLOGIES
GPU-ACCELERATED
DATA CENTER
ACCELERATED
VISUALIZATION
ACCELERATED
DATABASES
DEEP
LEARNING
Cloud
NVIDIA DGX Products
CORE
TECHNOLOGIES
TRADITIONAL
DATA CENTER
VISUALIZATION
DATABASES
NVIDIA Tesla GPUs
Mesos
47. 47
GPU-ACCELERATION HAS NO LIMITS
MapD
BlazeGraph
Kinetica
Leading In-Memory DB
> 50x Slower
NoSQL DB’s
> 100x Slower
Aggregate of queries - Time (s)
Less is better!
SQream
1403
1843
700
GPUs 700X-800X faster
than graphs in all cases
700M Edges Single Node
Xeon 2650 vs 2 K80
1.98B Edges 16 EC2
r3.xlarge vs 16 K40s
1.98B Edges 16 EC2
r3.4xlarge vs 16 K40s2
1.98B Edges Spark CPU
Baseline
1
Speed-up over baseline spark CPU configuration
Speed-up (higher is faster)
50. 50
WEAK NODES
Lots of Nodes Interconnected with
Vast Network Overhead
STRONG NODES
Few Lightning-Fast Nodes with
Performance of Hundreds of Weak Nodes
Network
Fabric
Server
Racks
51. 51
150B XTORS | 5.3TF FP64 | 10.6TF FP32 | 21.2TF FP16 | 14MB SM RF | 4MB L2 Cache
TESLA P100
THE MOST ADVANCED
HYPERSCALE DATACENTER GPU EVER BUILT
53. 53
Instant productivity — plug-and-
play, supports every AI framework
and accelerated analytics
software applications
Performance optimized across
the entire stack
Always up-to-date via the cloud
Mixed framework environments
— baremetal and containerized
Direct access to NVIDIA experts
DGX STACK
Complete Analytics and Deep Learning platform
54. 54
Fastest AI Supercomputer in TOP500
#28 Top500
4.9 Petaflops Peak FP64
19.6 Petaflops Peak FP16
Most Energy Efficient Supercomputer
#1 Green500
9.5 GFLOPS per Watt
Rocket for Cancer Moonshot
CANDLE Development Platform
Common platform with DOE labs – ANL, LLNL,
ORNL, LANL
NVIDIA DGX SATURNV
Giant Leap Towards Exascale AI
56. 56
TESLA REVOLUTIONIZES
DEEP LEARNING
GOOGLE BRAIN APPLICATION
Metric | BEFORE TESLA | AFTER TESLA
Cost | $5,000K | $200K
Servers | 1,000 Servers | 16 Tesla Servers
Energy | 600 kW | 4 kW
Performance | 1x | 6x
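The before/after figures on this slide combine into ratios that are easier to compare. The sketch below uses only the four rows of the slide; the derived metrics (performance per dollar, performance per kilowatt) are illustrative and are not quoted in the talk.

```python
# Values as listed on the slide (cost in thousands of US dollars).
before = {"cost_k_usd": 5000, "servers": 1000, "energy_kw": 600, "performance": 1}
after = {"cost_k_usd": 200, "servers": 16, "energy_kw": 4, "performance": 6}

cost_reduction = before["cost_k_usd"] / after["cost_k_usd"]
energy_reduction = before["energy_kw"] / after["energy_kw"]
performance_gain = after["performance"] / before["performance"]

# Derived metrics: more performance delivered for less cost and less energy.
perf_per_dollar_gain = performance_gain * cost_reduction
perf_per_kw_gain = performance_gain * energy_reduction

print(f"Cost reduction: {cost_reduction:.0f}x")
print(f"Energy reduction: {energy_reduction:.0f}x")
print(f"Performance per dollar: {perf_per_dollar_gain:.0f}x")
print(f"Performance per kW: {perf_per_kw_gain:.0f}x")
```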
57. 57
ANNOUNCING TESLA V100
GIANT LEAP FOR AI & HPC
VOLTA WITH NEW TENSOR CORE
21B xtors | TSMC 12nm FFN | 815 mm²
5,120 CUDA cores
7.5 FP64 TFLOPS | 15 FP32 TFLOPS
NEW 120 Tensor TFLOPS
20MB SM RF | 16MB Cache
16GB HBM2 @ 900 GB/s
300 GB/s NVLink
58. 58
NEW TENSOR CORE
New CUDA TensorOp instructions
& data formats
4x4 matrix processing array
D[FP32] = A[FP16] * B[FP16] + C[FP32]
Optimized for deep learning
Activation Inputs | Weights Inputs | Output Results
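The operation D[FP32] = A[FP16] * B[FP16] + C[FP32] on 4x4 matrices can be emulated in software to see what the mixed precision means. This is a plain-Python sketch, not how the hardware or any CUDA API is implemented: it rounds the inputs to half precision with the standard `struct` module and rounds the accumulator to single precision after every addition, which is an assumption about rounding order that the slide does not specify.

```python
import struct

def round_to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]

def round_to_fp32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def tensor_op(a, b, c, n=4):
    """Compute D = A * B + C with FP16 inputs and FP32 accumulation."""
    a16 = [[round_to_fp16(v) for v in row] for row in a]
    b16 = [[round_to_fp16(v) for v in row] for row in b]
    d = []
    for i in range(n):
        row = []
        for j in range(n):
            acc = round_to_fp32(c[i][j])
            for k in range(n):
                acc = round_to_fp32(acc + a16[i][k] * b16[k][j])
            row.append(acc)
        d.append(row)
    return d

# Illustrative inputs: A is the identity, so D should equal B + C.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
b = [[float(4 * i + j) for j in range(4)] for i in range(4)]
c = [[1.0] * 4 for _ in range(4)]
d = tensor_op(identity, b, c)
print(d)
```

Values that are not exactly representable in half precision, such as 0.1, are rounded on input, which is why the accumulation is kept in the wider FP32 format.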
60. 60
Tesla P100 vs Tesla V100
Specification | Tesla P100 (Pascal) | Tesla V100 (Volta)
Memory | 16 GB (HBM2) | 16 GB (HBM2)
Memory Bandwidth | 720 GB/s | 900 GB/s
NVLink | 160 GB/s | 300 GB/s
CUDA Cores (FP32) | 3584 | 5120
CUDA Cores (FP64) | 1792 | 2560
Tensor Cores (TC) | NA | 640
Peak TFLOPS (FP32) | 10.6 | 15
Peak TFLOPS (FP64) | 5.3 | 7.5
Peak TFLOPS (TC) | NA | 120
Power | 300 W | 300 W
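The core counts and peak TFLOPS in this comparison are linked by simple arithmetic: peak rate is roughly cores times operations per core per cycle times clock rate. The sketch below works backwards from the table to the clock each row implies. It assumes each core retires one fused multiply-add (counted as 2 floating-point operations) per cycle; that factor and all names are assumptions for illustration, and the clock is not stated on the slide.

```python
FLOPS_PER_CORE_PER_CYCLE = 2  # assumption: one fused multiply-add per cycle

# Core counts and peak TFLOPS as listed in the comparison table.
SPECS = {
    "Tesla P100": {"fp32_cores": 3584, "fp64_cores": 1792,
                   "fp32_tflops": 10.6, "fp64_tflops": 5.3},
    "Tesla V100": {"fp32_cores": 5120, "fp64_cores": 2560,
                   "fp32_tflops": 15.0, "fp64_tflops": 7.5},
}

def implied_clock_ghz(cores, peak_tflops):
    """Clock rate (GHz) implied by a core count and a peak TFLOPS figure."""
    flops_per_cycle = cores * FLOPS_PER_CORE_PER_CYCLE
    return peak_tflops * 1e12 / flops_per_cycle / 1e9

clocks = {}
for gpu, s in SPECS.items():
    clocks[gpu] = {
        "fp32": implied_clock_ghz(s["fp32_cores"], s["fp32_tflops"]),
        "fp64": implied_clock_ghz(s["fp64_cores"], s["fp64_tflops"]),
    }
    print(f"{gpu}: FP32 implies {clocks[gpu]['fp32']:.2f} GHz, "
          f"FP64 implies {clocks[gpu]['fp64']:.2f} GHz")
```

Both precisions imply the same clock for a given GPU, which is consistent with the FP64 core count being half the FP32 count in the table.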
61. 61
ANNOUNCING
NVIDIA DGX-1 WITH TESLA V100
ESSENTIAL INSTRUMENT OF AI RESEARCH
960 Tensor TFLOPS | 8x Tesla V100 | NVLink Hybrid Cube
From 8 days on TITAN X to 8 hours
400 servers in a box
63. 63
Registry of
Containers, Datasets,
and Pre-trained models
NVIDIA
GPU CLOUD
CSPs
ANNOUNCING
NVIDIA GPU CLOUD
Containerized in NVDocker | Optimization across the full stack
Always up-to-date | Fully tested and maintained by NVIDIA | Beta in July
GPU-accelerated Cloud Platform Optimized for Deep Learning
65. 65
WELL-LOG ESTIMATION
Korjani, N., Popa, A., Grijalva, E., Cassidy, S., Ershaghi, E. (2016), “A New Approach to Reservoir Characterization Using Deep Learning Neural Networks”, SPE 2016
University of Southern California
Chevron North America Exp & Prod
SPE Western Regional Meeting, 23-26 May 2016
66. 66
WELL-LOG ESTIMATION
Input: >20,000 wells
Kern River Field, San Joaquin Valley, California.
Test: newly drilled wells (“A” and “B”).
Results (Correlation)
Log | WELL “A” | WELL “B”
DRES | 81% | 82%
GR | 80% | 81%
NPHI | 79% | 72%
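The slide reports results as a correlation between estimated and measured logs but does not say which coefficient was used. The sketch below assumes the Pearson correlation coefficient and computes it for a pair of short synthetic curves; the sample values are made up for illustration and are not from the paper.

```python
import math

def pearson(measured, predicted):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)

# Synthetic example: a measured log and a slightly noisy estimate of it.
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.1]

r = pearson(measured, predicted)
print(f"Correlation: {r:.0%}")
```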
67. 67
FACIES CLASSIFICATION
Deep Learning Approach
Hall, B. (2016). “Facies classification using
machine learning”.
The Leading Edge, October 2016, 906-909.
68. 68
FACIES CLASSIFICATION
1) Gamma ray (GR)
2) Resistivity (ILD_log10)
3) Photoelectric effect (PE)
4) Neutron-density porosity difference (DeltaPHI)
5) Average neutron-density porosity (PHIND)
6) Nonmarine/marine indicator (NM_M)
7) Relative position (RELPOS)
8) Depth
Deep Learning Approach
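Before a classifier sees the eight inputs listed on this slide, each one is usually rescaled so that logs with large numeric ranges (such as depth) do not dominate the others. The sketch below shows one common choice, zero-mean and unit-variance standardization per feature. The short feature names other than those given in parentheses on the slide, and all sample values, are hypothetical placeholders, not data from the cited tutorial.

```python
from statistics import mean, pstdev

# Feature order follows the slide; "GR" and "Depth" are shorthand labels.
FEATURES = ["GR", "ILD_log10", "PE", "DeltaPHI", "PHIND", "NM_M", "RELPOS", "Depth"]

def standardize(rows):
    """Rescale each feature column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # A constant column has zero spread; divide by 1.0 to leave it at zero.
    spreads = [pstdev(col) or 1.0 for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, spreads)]
            for row in rows]

# Made-up sample rows, one per depth step, in FEATURES order.
samples = [
    [70.0, 0.50, 4.0, 10.0, 12.0, 1, 1.00, 1000.0],
    [80.0, 0.60, 3.5, 12.0, 13.0, 1, 0.50, 1000.5],
    [90.0, 0.70, 3.0, 14.0, 14.0, 1, 0.00, 1001.0],
]

scaled = standardize(samples)
for name, column in zip(FEATURES, zip(*scaled)):
    print(name, [round(v, 2) for v in column])
```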