Neural Networks - Universal Function Approximators
- Prakhar Mishra
Agenda
● Machine Learning Refresher
○ An Example
○ Hierarchical Division
○ Split Ratio
○ Evaluation Metric
● Neural Networks
○ Inspiration
○ Computation Graph
○ Architecture
○ Hyperparameters
○ Regularization
○ Backpropagation
Machine Learning - Quick Refresher
(figures omitted)
● Feature Engineering: features are hand-crafted in classical ML; deep learning "figures them out itself".
● Split Ratio: typically 70%-80% of the data for training, 30%-20% for testing.
Machine Learning - Evaluation Metrics
● Confusion Matrix
○ Evaluates the performance of a classification model.
● Accuracy = (TP + TN) / total samples
Machine Learning - Evaluation Metrics
● Root Mean Squared Error
○ Spread of the predicted y-values about the actual y-values.

RMSE = √( (1/N) · Σᵢ (Ŷᵢ − Yᵢ)² )

where N = total samples, Ŷᵢ = predicted value, Yᵢ = actual value.
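A minimal sketch of both metrics in Python with NumPy; the toy arrays are hypothetical, not from the slides:

```python
import numpy as np

# Hypothetical binary-classification results
y_true = np.array([1, 0, 1, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 0, 1])

# Accuracy = (TP + TN) / total samples; for 0/1 labels this is simply
# the fraction of positions where prediction and truth agree.
accuracy = np.mean(y_true == y_pred)               # 0.8333...

# Hypothetical regression results
y_actual = np.array([3.0, -0.5, 2.0, 7.0])
y_hat = np.array([2.5, 0.0, 2.0, 8.0])

# RMSE: spread of the predicted values about the actual values
rmse = np.sqrt(np.mean((y_hat - y_actual) ** 2))   # 0.6124...
```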
Rise of Neural Nets
Scale drives Deep Learning.
Learning from Data: Structured and Unstructured.
Neural Nets - Supervised

Input                 | Output            | Application
Home Features         | Cost              | Real Estate
Ad, User Information  | Click on Ad?      | Online Advertising
Image                 | Class (1...1000)  | Photo Tagging
Audio                 | Text              | Speech Recognition
English               | Chinese           | Machine Translation
Computation Graph

J(a, b, c) = 3(a + bc)

Substitution:
U = b·c
V = a + U
J = 3V

Input: a = 5, b = 3, c = 2
Forward pass: U = 6, V = 11, J = 33

How does J change if we change V a bit?
∂J/∂V = 3, since J = 3V

How does J change if we change a a bit?
a → V → J, so ∂J/∂a = (∂J/∂V) × (∂V/∂a)

How does J change if we change b a bit?
b → U → V → J, so ∂J/∂b = (∂J/∂V) × (∂V/∂U) × (∂U/∂b)

Forward → computes the values; Backward ← computes the derivatives.
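The same graph as a minimal Python sketch: the forward pass computes the values on the slide (U = 6, V = 11, J = 33), and the backward pass applies the chain rule:

```python
# Forward pass: J(a, b, c) = 3(a + bc)
a, b, c = 5, 3, 2
U = b * c        # 6
V = a + U        # 11
J = 3 * V        # 33

# Backward pass, right to left through the graph (chain rule)
dJ_dV = 3                # J = 3V
dJ_da = dJ_dV * 1        # V = a + U, so dV/da = 1  -> 3
dJ_dU = dJ_dV * 1        # V = a + U, so dV/dU = 1  -> 3
dJ_db = dJ_dU * c        # U = bc,    so dU/db = c  -> 6
dJ_dc = dJ_dU * b        # U = bc,    so dU/dc = b  -> 9
```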
Architecture

(network diagram: inputs i1 ... in, each scaled by a weight w1 ... wn, feed a neuron; outputs o1 ... on leave it)

X = w1·i1 + w2·i2 + ... + wn·in + b
Output = F(X), where F = Activation Function

A 3-Layer NN.
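A minimal sketch of the single-neuron computation above (NumPy; the tanh choice for F is illustrative, any activation fits):

```python
import numpy as np

def neuron(i, w, b, F=np.tanh):
    # X = w1*i1 + w2*i2 + ... + wn*in + b, then apply the activation F
    X = np.dot(w, i) + b
    return F(X)

out = neuron(i=np.array([0.5, -1.0]), w=np.array([0.1, 0.4]), b=0.2)
```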
Hyperparameters
● There are a number of parameters that can be tuned while building your neural network.
○ Number of Hidden Layers
○ Epochs
○ Loss Function
○ Optimization Function
○ Weight Initialization
○ Activation Functions
○ Batch Size
○ Learning Rate
Weight Initialization
● If the weights in a network start too small, then the signal shrinks as it
passes through each layer until it’s too tiny to be useful.
● If the weights in a network start too large, then the signal grows as it
passes through each layer until it’s too massive to be useful.
Solution: Xavier Initialization

Wᵢ = √(2 / nᵢ), where nᵢ = the number of inputs to layer i.
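A sketch of this scheme in NumPy. The √(2/nᵢ) scaling follows the slide; note that Xavier's paper and the related He initialization use slightly different variants of this factor:

```python
import numpy as np

def init_weights(n_in, n_out, seed=0):
    rng = np.random.default_rng(seed)
    # Scale standard-normal draws by sqrt(2 / n_in) so the signal neither
    # shrinks nor explodes as it passes through the layer.
    return rng.standard_normal((n_in, n_out)) * np.sqrt(2.0 / n_in)

W1 = init_weights(784, 128)   # e.g. first layer of an MNIST-sized network
```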
Loss Functions
● Binary Cross Entropy
● Categorical Cross Entropy
● Root Mean Squared Error
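Minimal NumPy sketches of the three losses; the epsilon clipping is a standard numerical guard, not something on the slide:

```python
import numpy as np

EPS = 1e-12  # guards against log(0)

def binary_cross_entropy(y, p):
    # y in {0, 1}, p = predicted probability of class 1
    p = np.clip(p, EPS, 1 - EPS)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def categorical_cross_entropy(y_onehot, p):
    # One row per sample; p sums to 1 across classes (e.g. a softmax output)
    p = np.clip(p, EPS, 1.0)
    return -np.mean(np.sum(y_onehot * np.log(p), axis=1))

def rmse(y, y_hat):
    return np.sqrt(np.mean((y - y_hat) ** 2))
```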
Optimization Functions
● Adagrad Optimizer
● Gradient Descent Optimizer
● Adam Optimizer
● Stochastic Gradient Descent Optimizer
● RMSProp Optimizer
Optimization Functions - Adam
(update-rule figures omitted)
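The Adam slides were figures; below is a minimal sketch of the update rule they depict, with the usual β₁, β₂, ε defaults (standard Adam, not taken from the slides themselves):

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # Exponential moving averages of the gradient (m) and its square (v)
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction: m and v start at zero, so early estimates are scaled back up
    m_hat = m / (1 - beta1 ** t)   # t = 1, 2, ... (update step count)
    v_hat = v / (1 - beta2 ** t)
    # Per-parameter adaptive update
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```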
Learning Rate
● Decaying the learning rate over time is seen to speed up the learning process/convergence.
Learning Rate - Intuition (figure omitted)
Learning Rate - Formula

Alpha₁ = (1 / (1 + decay × epoch_num)) × Alpha₀
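As a sketch (the epoch-based form above, following Andrew Ng's formulation, which this deck references):

```python
def decayed_lr(alpha0, decay, epoch):
    # Alpha1 = (1 / (1 + decay * epoch_num)) * Alpha0
    return alpha0 / (1.0 + decay * epoch)

# e.g. alpha0 = 0.2, decay = 1.0 gives 0.2, 0.1, 0.067, 0.05 over epochs 0..3
```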
Learning Rate - Special Case

Wᵢ = Wᵢ₋₁ − Alpha × Slope

Pseudo self-adaptive on a convex curve: as W approaches the minimum, the slope shrinks, so the effective step size shrinks with it.
Activation Functions
Biologically inspired by the activity of our brain, where different neurons are activated by different stimuli.
Activation Functions - Sigmoid
Activation Functions - Tanh
Activation Functions - ReLU
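The three slides above are plots of the functions; their definitions as a minimal NumPy sketch:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))   # output in (0, 1), mean around 0.5

def tanh(x):
    return np.tanh(x)                 # output in (-1, 1), zero-centered

def relu(x):
    return np.maximum(0.0, x)         # 0 for negatives, identity for positives
```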
Activation Functions - Standards
● In practice, Tanh outperforms Sigmoid for internal layers.
○ Tanh outputs are zero-centered (mean ≈ 0); Sigmoid outputs are not (mean ≈ 0.5).
○ In ML, we tend to center our data to avoid any kind of bias behaviour.
● Rule of thumb: ReLU for hidden layers generally performs well.
● Avoid Sigmoid for hidden layers.
● Sigmoid is a good candidate for the output of a Binary Classification problem.
● The Identity Function makes no sense for hidden layers (see "Why?" below).
Activation Functions - ReLU or Tanh?
ReLU > Tanh
- Avoids the Vanishing Gradient problem
- Is it the best? [No]
Activation Functions - Why?
Because composing linear functions gives another linear function: fLinear ∘ fLinear = fLinear. A stack of N linear layers therefore collapses to an equivalent shallower network of (N - X) layers; only trivial functions are learned.
Activation Functions - Why?
● More advanced functions are nonlinear.
● Should be differentiable, for Backpropagation.
Batch Size
● The Batch Size is the number of samples passed through the network at a time (see the iteration sketch after this list).
● Advantages
○ Your machine might not be able to fit all the data in memory at any given instance.
○ You want your model to generalize quickly.
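A minimal sketch of mini-batch iteration; the per-epoch shuffle is conventional, not from the slide:

```python
import numpy as np

def iterate_minibatches(X, y, batch_size):
    idx = np.random.permutation(len(X))        # reshuffle every epoch
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]  # last batch may be smaller
        yield X[batch], y[batch]

# Hypothetical usage:
# for X_batch, y_batch in iterate_minibatches(X_train, y_train, batch_size=32):
#     ...one forward + backward pass per batch...
```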
Training - Prerequisites
1. Derivative
2. Partial Derivative
3. Chain Rule
(figures omitted)
Training - Example

Input layer (Xᵢ): X1 = 0.05, X2 = 0.10, X3 = 0.02
Weights (input → hidden): w1 = 0.15, w2 = 0.20, w3 = 0.30
Hidden layer: H1, H2, H3
Weight (hidden → output): 0.33
Output layer: O1 = Y (the output)
Training - Forward Propagation

Hᵢ = Σᵢ wᵢ·xᵢ (compact representation)
H1 = w1·x1 + w2·x2 + w3·x3 (expanded representation)
H1 = 0.15·0.05 + 0.20·0.10 + 0.30·0.02 = 0.0335
Hσ1 = σ(0.0335)

O1 = Σᵢ Hᵢ·wᵢ (compact representation)
O1 = H1·0.33 = 0.0335·0.33 = 0.011055
Oσ1 = σ(0.011055)

σ(H) = 1 / (1 + e⁻ᴴ)
Error = |Y − Yᵢ|
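A sketch reproducing this forward pass in NumPy; the numbers match the slide (H1 = 0.0335, O1 = 0.011055, with O1 computed from the pre-activation H1 exactly as the slide does):

```python
import numpy as np

def sigmoid(h):
    return 1.0 / (1.0 + np.exp(-h))

x = np.array([0.05, 0.10, 0.02])   # X1, X2, X3
w = np.array([0.15, 0.20, 0.30])   # w1, w2, w3

H1 = np.dot(w, x)                  # 0.0335
H_sigma1 = sigmoid(H1)

O1 = H1 * 0.33                     # 0.011055, using H1 as on the slide
O_sigma1 = sigmoid(O1)
```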
Training - Backward Propagation
The goal is to update each of the weights in the network so that they bring the actual output closer to the target output.
Training - Backward Propagation

∂Error/∂wᵢ = partial derivative of the Error with respect to wᵢ, where Error = |Y − Yᵢ|

For the hidden-to-output weight w4 (w4 → O1 → Oσ1 → Error):
∂Error/∂w4 = (∂Error/∂Oσ1) × (∂Oσ1/∂O1) × (∂O1/∂w4)

For an input-to-hidden weight w1 (w1 → H1 → Hσ1 → Error):
∂Error/∂w1 = (∂Error/∂Hσ1) × (∂Hσ1/∂H1) × (∂H1/∂w1)
Training - Backward Propagation

With Hσ1 feeding two output units (weights 0.33 and 0.04, errors Eo1 and Eo2), the total error is Etotal = Eo1 + Eo2 and the contributions add:

∂Etotal/∂Hσ1 = ∂Eo1/∂Hσ1 + ∂Eo2/∂Hσ1
∂Eo1/∂Hσ1 = (∂Eo1/∂O1) × (∂O1/∂Hσ1)
∂Eo2/∂Hσ1 = (∂Eo2/∂O2) × (∂O2/∂Hσ1)

∂Etotal/∂w1 = (∂Etotal/∂Hσ1) × (∂Hσ1/∂H1) × (∂H1/∂w1)
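A numeric sketch of the w4 gradient under illustrative assumptions: squared error E = ½(Y − Oσ1)² (the slide's |Y − Yᵢ| is not differentiable at 0, so the squared form is the usual stand-in), a hypothetical target Y = 1, and the activated hidden output Hσ1 feeding the output unit:

```python
import numpy as np

def sigmoid(h):
    return 1.0 / (1.0 + np.exp(-h))

# Forward values
H_sigma1 = sigmoid(0.0335)
w4 = 0.33
O1 = H_sigma1 * w4
O_sigma1 = sigmoid(O1)
Y = 1.0                               # hypothetical target

# dE/dw4 = (dE/dO_sigma1) * (dO_sigma1/dO1) * (dO1/dw4)
dE_dOs  = -(Y - O_sigma1)             # from E = 0.5 * (Y - O_sigma1)**2
dOs_dO1 = O_sigma1 * (1 - O_sigma1)   # derivative of the sigmoid
dO1_dw4 = H_sigma1                    # since O1 = H_sigma1 * w4
dE_dw4  = dE_dOs * dOs_dO1 * dO1_dw4

alpha = 0.5                           # hypothetical learning rate
w4_new = w4 - alpha * dE_dw4          # gradient-descent weight update
```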
Works perfectly on training data?
Regularization
A technique for preventing overfitting.
Regularization reduces overfitting by adding a penalty to the loss function.
Regularization- Dropout
● Dropout refers to ignoring a randomly chosen set of units (i.e. neurons) during the training phase.
● Avoids co-dependency amongst neurons during training.
● Dropout is applied with a given probability (20%-50%) in each weight update cycle.
● Dropout at each layer of the network has shown good results.
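A minimal sketch of (inverted) dropout on a layer's activations; the division by keep_prob, which keeps the expected activation unchanged, is an implementation detail beyond the slide:

```python
import numpy as np

def dropout(activations, drop_prob=0.5, training=True):
    if not training:
        return activations            # dropout is a training-time technique only
    keep_prob = 1.0 - drop_prob
    # Zero out a random ~drop_prob fraction of units, rescale the survivors
    mask = (np.random.rand(*activations.shape) < keep_prob) / keep_prob
    return activations * mask
```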
References
● Adam Optimization
● Andrew Ng Youtube
● Siraj Raval Youtube
● Cross Entropy
● Deep Learning Basics
● BackPropagation