© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Deep Learning with MXNet workshop
Sunil Mallya
Solutions Architect, Deep Learning
smallya@amazon.com
@sunilmallya
Agenda
• Deep Learning motivation and basics
• Apache MXNet overview
• MXNet programming model deep dive
• Train our first neural network using MXNet
Deep Learning basics
Biological Neuron
slide from http://cs231n.stanford.edu/
Neural Network basics: http://cs231n.github.io/neural-networks-1/
Artificial Neuron
(diagram: inputs scaled by synaptic weights → output)
• Input
Vector of training data x
• Output
Linear function of inputs
• Nonlinearity
Transform output into desired range
of values, e.g. for classification we
need probabilities [0, 1]
• Training
Learn the weights w and bias b
• Activation functions govern the behavior of
neurons.
• The transformation of inputs as they pass through
the network is called forward propagation.
• Activations are the values passed on to the next
layer from each previous layer. These values are
the output of the activation function of each
artificial neuron.
• Some of the more popular activation functions
include:
• Linear
• Sigmoid
• Hyperbolic tangent (tanh)
• ReLU
• Softmax
• Step function
Activation Functions
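The activation functions listed above can be sketched in a few lines of NumPy (a minimal illustration for intuition, not MXNet's implementations):

```python
import numpy as np

def sigmoid(x):
    # squashes any real input into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # squashes input into (-1, 1)
    return np.tanh(x)

def relu(x):
    # zeroes out negative inputs, passes positives through
    return np.maximum(0.0, x)

def softmax(x):
    # turns a vector of scores into probabilities summing to 1;
    # subtracting max(x) first keeps exp() numerically stable
    e = np.exp(x - np.max(x))
    return e / e.sum()
```

For classification, softmax is what maps the network's raw outputs into the [0, 1] probabilities mentioned on the earlier slide.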
Deep Neural Network
hidden layers
A common rule of thumb: the size of a hidden
layer (number of neurons) falls somewhere
between the size of the input layer and the size
of the output layer.
(diagram: input layer → hidden layers → output)
The “Learning” in Deep Learning
(diagram: input X with its label is fed forward through the weights, e.g. 0.4, 0.3, 0.2, 0.9, to produce a prediction X1)
When the prediction X1 != the label X, the error is fed back through backpropagation (gradient descent), nudging each weight by a small amount, e.g. 0.4 ± 𝛿, 0.3 ± 𝛿, to produce new weights.
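The weight-update idea on this slide can be illustrated with a one-parameter model fit by gradient descent (pure NumPy; the data, learning rate, and iteration count are hypothetical choices for the sketch):

```python
import numpy as np

# toy data generated from y = 3x; the "true" weight 3.0 is what we hope to learn
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 3.0 * x

w = 0.0    # initial weight
lr = 0.02  # learning rate (the 0.2 on a later slide plays the same role)

for _ in range(200):
    pred = w * x                          # forward pass
    grad = 2.0 * ((pred - y) * x).mean()  # dLoss/dw for mean-squared-error loss
    w -= lr * grad                        # backprop step: w_new = w ± δ
```

After a few hundred steps `w` sits very close to the true value 3.0, which is the whole "learning" loop of the slide in miniature: forward pass, compare to label, push the error back into the weights.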
Apache MXNet
Apache MXNet
Programmable — simple syntax, multiple languages
Portable — highly efficient models for mobile and IoT
High Performance — near-linear scaling across hundreds of GPUs
88% efficiency on 256 GPUs; a 1024-layer ResNet is ~4GB
(chart: scaling vs. no. of GPUs, 1 to 256, for Inception v3, ResNet, and AlexNet against the ideal — ~88% efficiency)
• CloudFormation with Deep Learning AMI
• 16x P2.16xlarge, mounted on EFS
• Inception and ResNet: batch size 32; AlexNet: batch size 512
• ImageNet: 1.2M images, 1K classes
• 152-layer ResNet: 5.4 days on 4x K80s (1.2h per epoch), 0.22 top-1 error
Scaling with MXNet
http://bit.ly/deepami
Deep Learning any way you want on AWS
Tool for data scientists and developers
Setting up a DL system takes (install) time & skill
Keep packages up to date and compiled (MXNet, TensorFlow, Caffe, Torch,
Theano, Keras)
Anaconda, Jupyter, Python 2 and 3
NVIDIA Drivers for G2 and P2 instances
Intel MKL Drivers for all other instances (C4, M4, …)
Deep Learning AMIs
MXNet Programming model
import numpy as np
a = np.ones(10)
b = np.ones(10) * 2
c = b * a
• Straightforward and flexible.
• Take advantage of language
native features (loop,
condition, debugger)
• E.g. Numpy, Matlab, Torch, …
• Hard to optimize
PROS
CONS
d = c + 1
Easy to tweak
with Python code
Imperative Programming
• More chances for optimization
• Cross different languages
• E.g. TensorFlow, Theano,
Caffe
• Less flexible
PROS
CONS
C can share memory with D
because C is deleted later
A = Variable('A')
B = Variable('B')
C = B * A
D = C + 1
f = compile(D)
d = f(A=np.ones(10),
B=np.ones(10)*2)
(computation graph: inputs A and B feed a × node, whose result feeds a + 1 node)
Declarative Programming
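The deferred-evaluation idea behind the declarative style can be sketched in plain Python (a toy stand-in, not MXNet's symbol API): expressions only record a graph, and compile() returns a function that evaluates the whole graph when data is supplied.

```python
import numpy as np

class Expr:
    # a node in the expression graph: an operator and its arguments
    def __init__(self, op, args):
        self.op, self.args = op, args
    def __mul__(self, other): return Expr('*', [self, other])
    def __add__(self, other): return Expr('+', [self, other])

def Variable(name):
    # placeholder node, bound to real data only at execution time
    return Expr('var', [name])

def compile(expr):  # shadows the builtin on purpose, to mirror the slide
    def evaluate(node, env):
        if node.op == 'var':
            return env[node.args[0]]
        left = evaluate(node.args[0], env)
        right = node.args[1]
        right = evaluate(right, env) if isinstance(right, Expr) else right
        return left * right if node.op == '*' else left + right
    return lambda **env: evaluate(expr, env)

A = Variable('A')
B = Variable('B')
C = B * A          # nothing is computed yet, only recorded
D = C + 1
f = compile(D)     # here a real framework would optimize the graph
d = f(A=np.ones(10), B=np.ones(10) * 2)
```

Because the full graph is known before execution, a framework can apply optimizations such as the memory sharing between C and D mentioned on the slide.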
IMPERATIVE (NDARRAY API):
>>> import mxnet as mx
>>> a = mx.nd.zeros((100, 50))
>>> b = mx.nd.ones((100, 50))
>>> c = a + b
>>> c += 1
>>> print(c)
DECLARATIVE (SYMBOLIC EXECUTOR):
>>> import mxnet as mx
>>> net = mx.symbol.Variable('data')
>>> net = mx.symbol.FullyConnected(data=net, num_hidden=…)
>>> net = mx.symbol.SoftmaxOutput(data=net)
>>> texec = mx.module.Module(net)
>>> texec.forward(data=c)
>>> texec.backward()
NDArray can be set
as input to the graph
MXNet: Mixed programming paradigm
Embed symbolic expressions into imperative programming
texec = mx.module.Module(net)
for batch in train_data:
    texec.forward(batch)
    texec.backward()
    for param, grad in zip(texec.get_params(), texec.get_grads()):
        param -= 0.2 * grad
MXNet: Mixed programming paradigm
Hands on Lab
https://github.com/dmlc/mxnet-notebooks/blob/master/python/tutorials/linear-regression.ipynb
Linear Regression
Linear regression
train_data = np.array([[1,2],[3,4],[5,6],[3,2],[7,1],[6,9]])
Y = aX + b, such that the error is minimized
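As a NumPy-only sketch of what the fit converges toward, here is an ordinary least-squares solve on the same train_data. The labels are generated from assumed weights [1.0, 2.0], chosen purely for illustration; the notebook's actual labels are not restated on this slide.

```python
import numpy as np

# same train_data as on the slide
X = np.array([[1, 2], [3, 4], [5, 6], [3, 2], [7, 1], [6, 9]], dtype=float)

# hypothetical ground-truth weights used to synthesize labels
w_true = np.array([1.0, 2.0])
y = X @ w_true

# minimizing the L2 loss for a linear model has a closed-form solution;
# gradient-based training (what MXNet does) approaches the same answer
w_fit, *_ = np.linalg.lstsq(X, y, rcond=None)
```

With noise-free labels the recovered `w_fit` matches `w_true` almost exactly; with noisy labels (like the eval_label values later, which add 0.1) it would land nearby instead.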
Defining the Model
Variables: A variable is a placeholder for future data
X = mx.sym.Variable('data')
Y = mx.symbol.Variable('lin_reg_label')
Neural Network Layers: The layers of a network or any other type of model are also defined by
Symbols
fully_connected_layer = mx.sym.FullyConnected(data=X,
name='fc1', num_hidden = 1)
Output Symbols: Output symbols are MXNet's way of defining a loss
lro = mx.sym.LinearRegressionOutput(data=fully_connected_layer,
                                    label=Y, name="lro")
Layers: Fully Connected
A fully connected layer of a neural
network, without any activation applied, is
in essence just a linear regression on the
input attributes.
It takes the following parameters:
a. data: Input to the layer
b. num_hidden: # of hidden
dimensions, specifies the size of the
output of the layer
Layers: Linear Regression Output
Linear Regression Output: output layers in MXNet
implement a loss function.
Here we apply an L2 loss (least-squares error).
The parameters to this layer are:
a. data: Input to this layer (specify the symbol whose
output should be fed here)
b. label: The training label against which the layer's
input is compared when computing the L2 loss
Defining the Model
model = mx.mod.Module(
    symbol = lro,
    data_names = ['data'],
    label_names = ['lin_reg_label']  # network structure
)
model.fit(train_iter, eval_iter,
    optimizer_params={'learning_rate': 0.01, 'momentum': 0.9},
    num_epoch=1000,
    batch_end_callback=mx.callback.Speedometer(batch_size, 2))
Visualizing Networks
mx.viz.plot_network(symbol=lro)
Inference
#Evaluation Data
eval_data = np.array([[7,2],[6,10],[12,2]])
eval_label = np.array([11.1, 26.1, 16.1])  # adding 0.1 to each of the values
eval_iter = mx.io.NDArrayIter(eval_data, eval_label, batch_size,
                              shuffle=False)
#Inference
model.predict(eval_iter).asnumpy()
#Evaluation
metric = mx.metric.MSE()
model.score(eval_iter, metric)
MNIST Notebook
https://github.com/dmlc/mxnet-notebooks/blob/master/python/tutorials/mnist.ipynb
NDArray Data Iterator
import mxnet as mx
import numpy as np

def to4d(img):
    return img.reshape(img.shape[0], 1, 28, 28).astype(np.float32) / 255
batch_size = 100
train_iter = mx.io.NDArrayIter(to4d(train_img), train_lbl, batch_size, shuffle=True)
val_iter = mx.io.NDArrayIter(to4d(val_img), val_lbl, batch_size)
Each batch is a 4-D array with shape (batch_size, num_channels, width, height).
For the MNIST dataset there is only one color channel, and both width
and height are 28
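The reshape and scaling above can be checked with NumPy alone (synthetic pixel data standing in for MNIST; no MXNet required):

```python
import numpy as np

def to4d(img):
    # (N, 28, 28) uint8 images → (N, 1, 28, 28) float32 scaled to [0, 1]
    return img.reshape(img.shape[0], 1, 28, 28).astype(np.float32) / 255

# fake batch of 5 MNIST-sized images with random pixel values
batch = np.random.randint(0, 256, size=(5, 28, 28), dtype=np.uint8)
x = to4d(batch)
```

The singleton channel axis is what lets the same iterator layout serve RGB datasets, where num_channels would be 3.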
• Input Layer: how input data (vectors) are fed
into the network. The number of neurons in the
input layer typically equals the number of input
features.
• Hidden Layer: the weight values on the connections
between layers are how an ANN encodes what it
learns. Hidden layers are crucial for learning
non-linear functions.
• Output Layer: represents the predictions; the
output can be a regression or a classification.
• Connections Between Layers: in a feed-forward
network, connections link each layer to the next.
Each connection has a weight, and the weights
together encode the knowledge of the network.
Neural Network basics: http://cs231n.github.io/neural-networks-1/
Feed Forward Network
Multilayer Perceptron
Y = WX + b
Network Definition
Model
model = mx.model.FeedForward(
    symbol = mlp,        # network structure
    num_epoch = 10,      # number of data passes for training
    learning_rate = 0.1  # learning rate of SGD
)
model.fit(
    X = train_iter,       # training data
    eval_data = val_iter, # validation data
    batch_end_callback = mx.callback.Speedometer(batch_size, 200)
                          # output progress every 200 data batches
)
Predictions and Validation
# prediction on a single image
prob = model.predict(val_img[0:1].astype(np.float32)/255)[0]
# get the class with highest probability
print 'Classified as %d with probability %f' % (prob.argmax(), max(prob))
# run the model on the validation set and calculate the score with eval_metric
valid_acc = model.score(val_iter)
Convolutional Neural Network (CNN)
A CNN arranges neurons in 3
dimensions (width, height, depth)
CNN Layers
Convolutional Layer
Pooling Layer
Activation
Fully-Connected Layer
Convolutions
More info: http://cs231n.github.io/convolutional-networks/
Pooling Layer
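Max pooling can be sketched in a few lines of NumPy (a minimal 2x2, stride-2 version on a single feature map; a real layer such as MXNet's pooling operator also handles batches, channels, padding, and arbitrary strides):

```python
import numpy as np

def max_pool_2x2(x):
    # non-overlapping 2x2 max pooling on an (H, W) feature map; H and W even
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# toy 4x4 feature map for illustration
fmap = np.array([[1, 2, 0, 1],
                 [4, 3, 1, 0],
                 [0, 1, 5, 6],
                 [2, 2, 7, 8]])
pooled = max_pool_2x2(fmap)  # each output cell is the max of one 2x2 block
```

Pooling halves the spatial resolution while keeping the strongest response in each region, which is what gives CNNs a degree of translation tolerance.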
Model Definition
Running the model
model = mx.model.FeedForward(
    ctx = mx.gpu(0),  # use GPU 0 for training; other settings same as before
    symbol = lenet,
    num_epoch = 10,
    learning_rate = 0.1)
model.fit(
    X = train_iter,
    eval_data = val_iter,
    batch_end_callback = mx.callback.Speedometer(batch_size, 200)
)
Visualizing hidden layers
(feature visualizations for Layer 1, Layer 2, and Layer 3)
LeNet-5 Architecture
img src: http://yann.lecun.com/exdb/publis/pdf/lecun-98.pdf
Thank You
smallya@amazon.com
sunilmallya
Mais conteúdo relacionado

Mais procurados

Machine Learning With Python | Machine Learning Algorithms | Machine Learning...
Machine Learning With Python | Machine Learning Algorithms | Machine Learning...Machine Learning With Python | Machine Learning Algorithms | Machine Learning...
Machine Learning With Python | Machine Learning Algorithms | Machine Learning...
Simplilearn
 

Mais procurados (20)

Machine Learning With Python | Machine Learning Algorithms | Machine Learning...
Machine Learning With Python | Machine Learning Algorithms | Machine Learning...Machine Learning With Python | Machine Learning Algorithms | Machine Learning...
Machine Learning With Python | Machine Learning Algorithms | Machine Learning...
 
Hands-on Deep Learning in Python
Hands-on Deep Learning in PythonHands-on Deep Learning in Python
Hands-on Deep Learning in Python
 
CIFAR-10
CIFAR-10CIFAR-10
CIFAR-10
 
Anomaly Detection using Deep Auto-Encoders
Anomaly Detection using Deep Auto-EncodersAnomaly Detection using Deep Auto-Encoders
Anomaly Detection using Deep Auto-Encoders
 
Reinventing Deep Learning
 with Hugging Face Transformers
Reinventing Deep Learning
 with Hugging Face TransformersReinventing Deep Learning
 with Hugging Face Transformers
Reinventing Deep Learning
 with Hugging Face Transformers
 
Intro to LLMs
Intro to LLMsIntro to LLMs
Intro to LLMs
 
Deep learning crash course
Deep learning crash courseDeep learning crash course
Deep learning crash course
 
Fine tune and deploy Hugging Face NLP models
Fine tune and deploy Hugging Face NLP modelsFine tune and deploy Hugging Face NLP models
Fine tune and deploy Hugging Face NLP models
 
Introduction to Keras
Introduction to KerasIntroduction to Keras
Introduction to Keras
 
Deep Generative Models
Deep Generative Models Deep Generative Models
Deep Generative Models
 
An introduction to computer vision with Hugging Face
An introduction to computer vision with Hugging FaceAn introduction to computer vision with Hugging Face
An introduction to computer vision with Hugging Face
 
Image classification using CNN
Image classification using CNNImage classification using CNN
Image classification using CNN
 
Notes from Coursera Deep Learning courses by Andrew Ng
Notes from Coursera Deep Learning courses by Andrew NgNotes from Coursera Deep Learning courses by Andrew Ng
Notes from Coursera Deep Learning courses by Andrew Ng
 
Federated learning
Federated learningFederated learning
Federated learning
 
Deep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural NetworksDeep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural Networks
 
Deep learning
Deep learningDeep learning
Deep learning
 
RNN & LSTM: Neural Network for Sequential Data
RNN & LSTM: Neural Network for Sequential DataRNN & LSTM: Neural Network for Sequential Data
RNN & LSTM: Neural Network for Sequential Data
 
Lecture: Word Sense Disambiguation
Lecture: Word Sense DisambiguationLecture: Word Sense Disambiguation
Lecture: Word Sense Disambiguation
 
Introduction of Deep Learning
Introduction of Deep LearningIntroduction of Deep Learning
Introduction of Deep Learning
 
Semi-Supervised Learning
Semi-Supervised LearningSemi-Supervised Learning
Semi-Supervised Learning
 

Semelhante a Scalable Deep Learning Using Apache MXNet

Separating Hype from Reality in Deep Learning with Sameer Farooqui
 Separating Hype from Reality in Deep Learning with Sameer Farooqui Separating Hype from Reality in Deep Learning with Sameer Farooqui
Separating Hype from Reality in Deep Learning with Sameer Farooqui
Databricks
 
NeuralProcessingofGeneralPurposeApproximatePrograms
NeuralProcessingofGeneralPurposeApproximateProgramsNeuralProcessingofGeneralPurposeApproximatePrograms
NeuralProcessingofGeneralPurposeApproximatePrograms
Mohid Nabil
 

Semelhante a Scalable Deep Learning Using Apache MXNet (20)

MXNet Workshop
MXNet WorkshopMXNet Workshop
MXNet Workshop
 
Distributed Deep Learning on AWS with Apache MXNet
Distributed Deep Learning on AWS with Apache MXNetDistributed Deep Learning on AWS with Apache MXNet
Distributed Deep Learning on AWS with Apache MXNet
 
Separating Hype from Reality in Deep Learning with Sameer Farooqui
 Separating Hype from Reality in Deep Learning with Sameer Farooqui Separating Hype from Reality in Deep Learning with Sameer Farooqui
Separating Hype from Reality in Deep Learning with Sameer Farooqui
 
Startup.Ml: Using neon for NLP and Localization Applications
Startup.Ml: Using neon for NLP and Localization Applications Startup.Ml: Using neon for NLP and Localization Applications
Startup.Ml: Using neon for NLP and Localization Applications
 
AI powered emotion recognition: From Inception to Production - Global AI Conf...
AI powered emotion recognition: From Inception to Production - Global AI Conf...AI powered emotion recognition: From Inception to Production - Global AI Conf...
AI powered emotion recognition: From Inception to Production - Global AI Conf...
 
AI powered emotion recognition: From Inception to Production - Global AI Conf...
AI powered emotion recognition: From Inception to Production - Global AI Conf...AI powered emotion recognition: From Inception to Production - Global AI Conf...
AI powered emotion recognition: From Inception to Production - Global AI Conf...
 
Power ai tensorflowworkloadtutorial-20171117
Power ai tensorflowworkloadtutorial-20171117Power ai tensorflowworkloadtutorial-20171117
Power ai tensorflowworkloadtutorial-20171117
 
NeuralProcessingofGeneralPurposeApproximatePrograms
NeuralProcessingofGeneralPurposeApproximateProgramsNeuralProcessingofGeneralPurposeApproximatePrograms
NeuralProcessingofGeneralPurposeApproximatePrograms
 
Towards neuralprocessingofgeneralpurposeapproximateprograms
Towards neuralprocessingofgeneralpurposeapproximateprogramsTowards neuralprocessingofgeneralpurposeapproximateprograms
Towards neuralprocessingofgeneralpurposeapproximateprograms
 
Multilayer Perceptron - Elisa Sayrol - UPC Barcelona 2018
Multilayer Perceptron - Elisa Sayrol - UPC Barcelona 2018Multilayer Perceptron - Elisa Sayrol - UPC Barcelona 2018
Multilayer Perceptron - Elisa Sayrol - UPC Barcelona 2018
 
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech TalksA Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
 
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech TalksA Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
 
Introduction To Tensorflow
Introduction To TensorflowIntroduction To Tensorflow
Introduction To Tensorflow
 
Deep learning (2)
Deep learning (2)Deep learning (2)
Deep learning (2)
 
Diving into Deep Learning (Silicon Valley Code Camp 2017)
Diving into Deep Learning (Silicon Valley Code Camp 2017)Diving into Deep Learning (Silicon Valley Code Camp 2017)
Diving into Deep Learning (Silicon Valley Code Camp 2017)
 
Apache MXNet ODSC West 2018
Apache MXNet ODSC West 2018Apache MXNet ODSC West 2018
Apache MXNet ODSC West 2018
 
Android and Deep Learning
Android and Deep LearningAndroid and Deep Learning
Android and Deep Learning
 
Cnn
CnnCnn
Cnn
 
Scaling Deep Learning with MXNet
Scaling Deep Learning with MXNetScaling Deep Learning with MXNet
Scaling Deep Learning with MXNet
 
Scalable Deep Learning on AWS Using Apache MXNet - AWS Summit Tel Aviv 2017
Scalable Deep Learning on AWS Using Apache MXNet - AWS Summit Tel Aviv 2017Scalable Deep Learning on AWS Using Apache MXNet - AWS Summit Tel Aviv 2017
Scalable Deep Learning on AWS Using Apache MXNet - AWS Summit Tel Aviv 2017
 

Mais de Amazon Web Services

Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
Amazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
Amazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
Amazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
Amazon Web Services
 

Mais de Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Último

Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptxChiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
raffaeleoman
 
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
Sheetaleventcompany
 
If this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New NigeriaIf this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New Nigeria
Kayode Fayemi
 
Uncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac FolorunsoUncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac Folorunso
Kayode Fayemi
 

Último (20)

ANCHORING SCRIPT FOR A CULTURAL EVENT.docx
ANCHORING SCRIPT FOR A CULTURAL EVENT.docxANCHORING SCRIPT FOR A CULTURAL EVENT.docx
ANCHORING SCRIPT FOR A CULTURAL EVENT.docx
 
Dreaming Marissa Sánchez Music Video Treatment
Dreaming Marissa Sánchez Music Video TreatmentDreaming Marissa Sánchez Music Video Treatment
Dreaming Marissa Sánchez Music Video Treatment
 
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort ServiceBDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
 
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptx
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptxMohammad_Alnahdi_Oral_Presentation_Assignment.pptx
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptx
 
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesVVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
 
Causes of poverty in France presentation.pptx
Causes of poverty in France presentation.pptxCauses of poverty in France presentation.pptx
Causes of poverty in France presentation.pptx
 
Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...
Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...
Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...
 
Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510
 
Introduction to Prompt Engineering (Focusing on ChatGPT)
Introduction to Prompt Engineering (Focusing on ChatGPT)Introduction to Prompt Engineering (Focusing on ChatGPT)
Introduction to Prompt Engineering (Focusing on ChatGPT)
 
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptxChiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
 
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
 
If this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New NigeriaIf this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New Nigeria
 
Presentation on Engagement in Book Clubs
Presentation on Engagement in Book ClubsPresentation on Engagement in Book Clubs
Presentation on Engagement in Book Clubs
 
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night EnjoyCall Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
 
ICT role in 21st century education and it's challenges.pdf
ICT role in 21st century education and it's challenges.pdfICT role in 21st century education and it's challenges.pdf
ICT role in 21st century education and it's challenges.pdf
 
Uncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac FolorunsoUncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac Folorunso
 
Air breathing and respiratory adaptations in diver animals
Air breathing and respiratory adaptations in diver animalsAir breathing and respiratory adaptations in diver animals
Air breathing and respiratory adaptations in diver animals
 
My Presentation "In Your Hands" by Halle Bailey
My Presentation "In Your Hands" by Halle BaileyMy Presentation "In Your Hands" by Halle Bailey
My Presentation "In Your Hands" by Halle Bailey
 
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdf
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdfThe workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdf
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdf
 
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
 

Scalable Deep Learning Using Apache MXNet

  • 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Deep Learning with MXNet workshop Sunil Mallya Solutions Architect, Deep Learning smallya@amazon.com @sunilmallya
  • 2. Agenda • Deep Learning motivation and basics • Apache MXNet overview • MXNet programing model deep dive • Train our first neural network using MXNet
  • 3. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Deep Learning basics
  • 4. Biological Neuron slide from http://cs231n.stanford.edu/ Neural Network basics: http://cs231n.github.io/neural-networks-1/
  • 5. Artificial Neuron output synaptic weights • Input Vector of training data x • Output Linear function of inputs • Nonlinearity Transform output into desired range of values, e.g. for classification we need probabilities [0, 1] • Training Learn the weights w and bias b
  • 6. • Activation functions governs behavior of neurons. • Transition of input is called forward propagation. • Activations are the values passed on to the next layer from each previous layer. These values are the output of the activation function of each artificial neuron. • Some of the more popular activation functions include: • Linear • Sigmoid • Hiberbolic Tangant • Relu • Softmax • Step function Activation Functions
  • 7. Deep Neural Network hidden layers The optimal size of the hidden layer (number of neurons) is usually between the size of the input and size of the output layers Input layer output
  • 8. The “Learning” in Deep Learning 0.4 0.3 0.2 0.9 ... back propogation (gradient descent) X1 != X 0.4 ± 𝛿 0.3 ± 𝛿 new weights new weights 0 1 0 1 1 . . - - X input label ... X1
  • 9. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Apache MXNet
  • 10. Apache MXNet Programmable Portable High Performance Near linear scaling across hundreds of GPUs Highly efficient models for mobile and IoT Simple syntax, multiple languages 88% efficiency on 256 GPUs Resnet 1024 layer network is ~4GB
  • 11. Ideal Inception v3 Resnet Alexnet 88% Efficiency 1 2 4 8 16 32 64 128 256 No. of GPUs • Cloud formation with Deep Learning AMI • 16x P2.16xlarge. Mounted on EFS • Inception and Resnet: batch size 32, Alex net: batch size 512 • ImageNet, 1.2M images,1K classes • 152-layer ResNet, 5.4d on 4x K80s (1.2h per epoch), 0.22 top-1 error Scaling with MXNet
  • 12.
  • 13. http://bit.ly/deepami Deep Learning any way you want on AWS Tool for data scientists and developers Setting up a DL system takes (install) time & skill Keep packages up to date and compiled (MXNet, TensorFlow, Caffe, Torch, Theano, Keras) Anaconda, Jupyter, Python 2 and 3 NVIDIA Drivers for G2 and P2 instances Intel MKL Drivers for all other instances (C4, M4, …) Deep Learning AMIs
  • 14. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved MXNet Programing model
  • 15. import numpy as np a = np.ones(10) b = np.ones(10) * 2 c = b * a • Straightforward and flexible. • Take advantage of language native features (loop, condition, debugger) • E.g. Numpy, Matlab, Torch, … • Hard to optimize PROS CONS d = c + 1c Easy to tweak with python codes Imperative Programing
  • 16. • More chances for optimization • Cross different languages • E.g. TensorFlow, Theano, Caffe • Less flexible PROS CONS C can share memory with D because C is deleted later A = Variable('A') B = Variable('B') C = B * A D = C + 1 f = compile(D) d = f(A=np.ones(10), B=np.ones(10)*2) A B 1 + X Declarative Programing
  • 17. IMPERATIVE NDARRAY API DECLARATIVE SYMBOLIC EXECUTOR >>> import mxnet as mx >>> a = mx.nd.zeros((100, 50)) >>> b = mx.nd.ones((100, 50)) >>> c = a + b >>> c += 1 >>> print(c) >>> import mxnet as mx >>> net = mx.symbol.Variable('data') >>> net = mx.symbol.FullyConnected(data=net, num_hidde >>> net = mx.symbol.SoftmaxOutput(data=net) >>> texec = mx.module.Module(net) >>> texec.forward(data=c) >>> texec.backward() NDArray can be set as input to the graph MXNet: Mixed programming paradigm
  • 18. Embed symbolic expressions into imperative programming texec = mx.module.Module(net) for batch in train_data: texec.forward(batch) texec.backward() for param, grad in zip(texec.get_params(), texec.get_grads()): param -= 0.2 * grad MXNet: Mixed programming paradigm
  • 19. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Hands on Lab
  • 21. Linear regression train_data = np.array([[1,2],[3,4],[5,6],[3,2],[7,1],[6,9]]) Fit Y = aX + b such that the error is minimized
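Before handing the problem to MXNet, it helps to see what "minimize the error" means concretely. Below is a plain-Python sketch of gradient descent on the squared error for y = a·x + b, using made-up 1-D data generated from a = 2, b = 1 (not the slide's train_data):

```python
# Toy data: y = 2x + 1, so gradient descent should recover a≈2, b≈1.
data = [(x, 2.0 * x + 1.0) for x in [0.0, 1.0, 2.0, 3.0, 4.0]]

a, b, lr = 0.0, 0.0, 0.02
for _ in range(2000):
    grad_a = grad_b = 0.0
    for x, y in data:
        err = (a * x + b) - y          # prediction error
        grad_a += 2 * err * x          # d/da of err^2
        grad_b += 2 * err              # d/db of err^2
    a -= lr * grad_a / len(data)       # step against the mean gradient
    b -= lr * grad_b / len(data)

print(round(a, 2), round(b, 2))  # → 2.0 1.0
```

This is the same loop `model.fit` runs internally, with MXNet computing the gradients for you via backpropagation.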
  • 22. Defining the Model Variables: A variable is a placeholder for future data X = mx.sym.Variable('data') Y = mx.symbol.Variable('lin_reg_label') Neural Network Layers: The layers of a network or any other type of model are also defined by Symbols fully_connected_layer = mx.sym.FullyConnected(data=X, name='fc1', num_hidden = 1) Output Symbols: Output symbols are MXNet's way of defining a loss lro = mx.sym.LinearRegressionOutput(data=fully_connected_layer, label=Y, name="lro")
  • 23. Layers: Fully Connected A fully connected layer of a neural network (without any activation applied) is, in essence, just a linear regression on the input attributes. It takes the following parameters: a. data: input to the layer b. num_hidden: number of hidden dimensions; specifies the size of the layer's output
  • 24. Layers: Linear Regression Output Output layers in MXNet implement a loss function; here we apply an L2 loss (least-squares error). The parameters to this layer are: a. data: input to this layer (the symbol whose output should be fed here) b. label: the training label against which the layer's input is compared when computing the L2 loss
  • 25. Defining the Model model = mx.mod.Module( symbol=lro, # network structure data_names=['data'], label_names=['lin_reg_label'] ) model.fit(train_iter, eval_iter, optimizer_params={'learning_rate': 0.01, 'momentum': 0.9}, num_epoch=1000, batch_end_callback=mx.callback.Speedometer(batch_size, 2))
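The `optimizer_params` above configure SGD with momentum. One common formulation of that update rule, sketched in plain Python with the slide's hyperparameters (MXNet's built-in optimizer additionally handles weight decay, gradient rescaling, and clipping):

```python
# Momentum SGD: keep a running "velocity" so consistent gradients accelerate.
def sgd_momentum_step(weight, grad, velocity, lr=0.01, momentum=0.9):
    velocity = momentum * velocity - lr * grad   # accumulate a running direction
    return weight + velocity, velocity

w, v = 1.0, 0.0
for g in [0.5, 0.5, 0.5]:     # pretend the gradient stays constant for 3 steps
    w, v = sgd_momentum_step(w, g, v)
print(round(w, 6), round(v, 6))
```

Note how the step size grows across the three iterations (0.005, 0.0095, 0.01355) even though the gradient is constant — that is the momentum term at work.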
  • 27. Inference #Evaluation data (adding 0.1 to each of the label values) eval_data = np.array([[7,2],[6,10],[12,2]]) eval_label = np.array([11.1,26.1,16.1]) eval_iter = mx.io.NDArrayIter(eval_data, eval_label, batch_size, shuffle=False) #Inference model.predict(eval_iter).asnumpy() #Evaluation metric = mx.metric.MSE() model.score(eval_iter, metric)
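What the MSE metric computes can be written out in a few lines of plain Python. Using the slide's evaluation labels (each perturbed by 0.1), a model that predicted the unperturbed values would score:

```python
def mse(preds, labels):
    """Mean squared error, as an MSE metric computes it."""
    return sum((p - l) ** 2 for p, l in zip(preds, labels)) / len(preds)

# Hypothetical predictions vs. the slide's eval_label values:
score = mse([11.0, 26.0, 16.0], [11.1, 26.1, 16.1])
print(score)  # ≈ 0.01 (each prediction is off by 0.1; 0.1² = 0.01)
```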
  • 29. NDArray Data Iterator import mxnet as mx def to4d(img): return img.reshape(img.shape[0], 1, 28, 28).astype(np.float32)/255 batch_size = 100 train_iter = mx.io.NDArrayIter(to4d(train_img), train_lbl, batch_size, shuffle=True) val_iter = mx.io.NDArrayIter(to4d(val_img), val_lbl, batch_size) Batch of 4-D matrix with shape (batch_size, num_channels, width, height) For the MNIST dataset, there is only one color channel, and both width and height are 28
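The batching behavior of `NDArrayIter` can be sketched in plain Python (shuffling and the 4-D reshape omitted; data and sizes below are illustrative):

```python
def iter_batches(data, labels, batch_size):
    """Yield (data, label) batches in order, like NDArrayIter without shuffle.
    The last batch may be smaller if the dataset size is not a multiple."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size], labels[start:start + batch_size]

# Toy "images": scale raw pixel values from [0, 255] to [0, 1], as to4d() does.
raw = [[0, 128, 255]] * 10
images = [[p / 255.0 for p in row] for row in raw]
labels = list(range(10))

batches = list(iter_batches(images, labels, batch_size=4))
print([len(x) for x, _ in batches])  # [4, 4, 2]
```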
  • 30. • Input Layer: This layer is how we get input data (vectors) fed into our network. The number of neurons in an input layer is typically the same number as the input feature to the network. • Hidden Layer: The weight values on the connections between the layers are how an ANN encodes what it learns. Hidden layers are crucial in learning non-linear functions. • Output Layer: Output layer represents predictions. Output can be regression or classification. • Connections Between Layers: In a feed-forward network connections link a layer to the next layer of an ANN. Each connection has a weight. The weights of connections are encoding of the knowledge of the network. Neural Network basics: http://cs231n.github.io/neural-networks-1/ Feed Forward Network
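The forward pass described above — inputs flowing through weighted connections and an activation into the next layer — fits in a few lines of plain Python. The weights below are made up purely for illustration:

```python
import math

def dense(inputs, weights, biases):
    """One fully connected layer: out_j = sum_i inputs[i] * weights[j][i] + biases[j]."""
    return [sum(i * w for i, w in zip(inputs, neuron)) + b
            for neuron, b in zip(weights, biases)]

def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-v)) for v in values]

x = [1.0, 0.5]                                  # 2 input features
W1 = [[0.1, 0.2], [0.3, -0.1]]; b1 = [0.0, 0.1] # hidden layer: 2 neurons
W2 = [[0.5, -0.5]];             b2 = [0.2]      # output layer: 1 neuron

hidden = sigmoid(dense(x, W1, b1))  # non-linearity makes deeper layers useful
output = dense(hidden, W2, b2)
print(output)
```

This is exactly what `mx.sym.FullyConnected` plus an activation symbol builds, just without the vectorization and autograd.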
  • 33. Model model = mx.model.FeedForward( symbol = mlp, # network structure num_epoch = 10, # number of data passes for training learning_rate = 0.1 # learning rate of SGD ) model.fit( X=train_iter, # training data eval_data=val_iter, # validation data batch_end_callback = mx.callback.Speedometer(batch_size, 200) # output progress for each 200 data batches )
  • 34. Predictions and Validation # prediction on a single image prob = model.predict(val_img[0:1].astype(np.float32)/255)[0] # get the class with highest probability print 'Classified as %d with probability %f' % (prob.argmax(), max(prob)) # Run the model on the validation set and calculate the score with eval_metric. valid_acc = model.score(val_iter)
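The `prob` vector above comes from a softmax output, and `argmax` picks the most likely class. Both are easy to sketch in plain Python (scores below are made up):

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    exps = [math.exp(s - max(scores)) for s in scores]  # shift for stability
    total = sum(exps)
    return [e / total for e in exps]

prob = softmax([1.0, 3.0, 0.5])                          # scores for 3 classes
predicted = max(range(len(prob)), key=prob.__getitem__)  # argmax
print(predicted, round(max(prob), 3))  # class 1 wins
```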
  • 35. Convolutional Neural Network (CNN) A CNN arranges neurons in 3 dimensions (width, height, depth) CNN layers: Convolutional Layer, Pooling Layer, Activation, Fully-Connected Layer
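The two layer types that distinguish a CNN — convolution and pooling — can be sketched in plain Python on a tiny single-channel image (toy data, no padding or stride options):

```python
def conv2d(img, kernel):
    """'Valid' 2-D cross-correlation: slide the kernel over the image."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(img[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(len(img[0]) - kw + 1)]
            for i in range(len(img) - kh + 1)]

def max_pool_2x2(fmap):
    """Non-overlapping 2x2 max pooling: keep the strongest activation per block."""
    return [[max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]

img = [[i + j for j in range(5)] for i in range(5)]  # 5x5 toy image
fmap = conv2d(img, [[1, 0], [0, 1]])                 # 4x4 feature map
print(max_pool_2x2(fmap))  # [[6, 10], [10, 14]]
```

MXNet's `Convolution` and `Pooling` symbols do the same per channel and filter, batched and on the GPU.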
  • 39. Running the model model = mx.model.FeedForward( ctx = mx.gpu(0), # use GPU 0 for training, others are same as before symbol = lenet, num_epoch = 10, learning_rate = 0.1) model.fit( X=train_iter, eval_data=val_iter, batch_end_callback = mx.callback.Speedometer(batch_size, 200) )
  • 41. LeNet-5 Architecture img src: http://yann.lecun.com/exdb/publis/pdf/lecun-98.pdf
  • 42. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Thank You smallya@amazon.com sunilmallya

Editor's Notes

  1. Learn about the features and benefits of Apache MXNet. Learn about the deep learning AMIs with the tools you need for DL. Learn how to train a neural network using MXNet.
  2. Cell body – soma; dendrites are appendages that listen to other neurons. A single axon carries the output of the computation the neuron performs. The cell body receives multiple inputs, and if they align, the neuron can spike, sending an action potential down the axon, which branches out to other neurons. Neurons are connected through synapses. Crude model: each neuron-to-neuron connection is through a synapse and has a weight, which is a function of "how much does this neuron like the other neuron".
  3. For computational efficiency, neurons are arranged in layers.
  4. Hard to define a network by hand: the definition of the Inception network is >1k lines of code in Caffe. Memory consumption is linear in the number of layers.
  5. Executes operations step by step: c = b ⨉ a invokes a kernel operation. NumPy programs are imperative.
  6. Declares the computation, then compiles it into a function. C = B ⨉ A only specifies the requirement. SQL is declarative.
  7. @zz: “data_shape” is confusing, one would expect it to bind with some input data, not "shape"
  8. Executes operations step by step: c = b ⨉ a invokes a kernel operation. NumPy programs are imperative.