Short Story Submission
Rimzim Thube
SJSU ID- 014555021
A Survey of Convolutional Neural Networks:
Analysis, Applications, and Prospects
Zewen Li, Wenjie Yang, Shouheng Peng, Fan Liu, Member, IEEE
Introduction to Convolutional Neural Network (CNN)
• Applications using CNN:
• Face recognition
• Autonomous vehicles
• Self-service supermarket
• Intelligent medical treatment
Emergence of CNN
• McCulloch and Pitts – First mathematical MP model of
neurons
• Rosenblatt - Added learning capability to MP model
• Hinton – Proposed multi-layer feedforward network trained
by the error Back Propagation – BP network
• Waibel - Time Delay Neural Network (TDNN) for speech
recognition
• LeCun – First convolutional network (LeNet) to recognize
handwritten text
Overview of CNN
• Feedforward neural network
• Extracts features from data using convolution structures
• Architecture inspired by visual perception
• Biological neuron corresponds to an artificial neuron
• CNN kernels represent different receptors that can respond to various
features
• Activation function transmits a signal to the next neuron only if it exceeds a
certain threshold
• Loss functions and optimizers teach the whole CNN system to learn
Advantages of CNN
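• Local connections – Each neuron connects to only a small number of
neurons in the previous layer rather than all of them, which reduces
parameters and speeds up convergence.
• Weight sharing – A group of connections shares the same weights,
which reduces parameters further.
• Down-sampling dimensionality reduction – Pooling reduces the
amount of data while retaining useful information.
• These characteristics make CNN one of the most representative
algorithms in deep learning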
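To make the parameter saving from local connections and weight sharing concrete, here is a minimal sketch that counts weights for one layer. The 28 × 28 input, 24 × 24 output, and 5 × 5 kernel are hypothetical sizes chosen for illustration, and biases are ignored.

```python
def dense_weights(in_h, in_w, out_h, out_w):
    """Weights when every output unit connects to every input pixel."""
    return in_h * in_w * out_h * out_w


def shared_kernel_weights(kernel_h, kernel_w):
    """Weights for one kernel that is shared across all output positions."""
    return kernel_h * kernel_w


# Hypothetical sizes: 28x28 single-channel input mapped to a 24x24 output.
fully_connected = dense_weights(28, 28, 24, 24)
convolutional = shared_kernel_weights(5, 5)
print(fully_connected, convolutional)
```

Under these assumptions the fully-connected mapping needs 451,584 weights, while the shared 5 × 5 kernel needs 25.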
Components of CNN
• Convolution – Pivotal step for feature extraction; the output is a
feature map
• Padding – Enlarges the input with zero values around its border
• Stride – Controls the density of convolution
• Pooling – Down-samples the feature map to obviate redundancy
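The four components above can be sketched in plain Python. This is an illustrative, single-channel implementation rather than how any library does it; the function names are made up for this example, and the "convolution" is the cross-correlation form commonly used in CNNs.

```python
def conv_output_size(n, k, padding=0, stride=1):
    """Output length along one axis: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - k) // stride + 1


def conv2d(image, kernel, padding=0, stride=1):
    """Naive single-channel 2-D convolution over nested lists."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    # Zero padding enlarges the input on every side.
    padded = [[0.0] * (w + 2 * padding) for _ in range(h + 2 * padding)]
    for i in range(h):
        for j in range(w):
            padded[i + padding][j + padding] = image[i][j]
    out_h = conv_output_size(h, kh, padding, stride)
    out_w = conv_output_size(w, kw, padding, stride)
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            # The stride decides how far the window moves between outputs.
            out[i][j] = sum(
                padded[i * stride + a][j * stride + b] * kernel[a][b]
                for a in range(kh)
                for b in range(kw)
            )
    return out


def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling; incomplete trailing windows are dropped."""
    return [
        [
            max(fmap[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]
```

For example, a 4 × 4 input convolved with a 2 × 2 kernel gives a 3 × 3 feature map at stride 1, a 2 × 2 map at stride 2, and a 5 × 5 map when one ring of zero padding is added.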
LeNet-5
• Composed of 7 trainable layers containing 2 convolutional layers, 2
pooling layers, and 3 fully-connected layers
• Combines local receptive fields, shared weights, and spatial or temporal
subsampling to ensure a degree of shift, scale, and distortion invariance
• Used for handwriting recognition
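As a rough illustration of how the two convolutional and two pooling layers shrink the feature map, the sketch below traces the spatial size through them. The 32 × 32 input, 5 × 5 kernels, and 2 × 2 stride-2 pooling are taken from the original LeNet-5 design and are assumptions here, since the slide does not state them.

```python
def output_size(n, k, stride=1, padding=0):
    """Spatial size after a convolution or pooling window of width k."""
    return (n + 2 * padding - k) // stride + 1


# Assumed LeNet-5 layout: conv 5x5, pool 2x2, conv 5x5, pool 2x2.
layers = [("conv", 5), ("pool", 2), ("conv", 5), ("pool", 2)]

size = 32  # assumed input width and height
trace = [size]
for kind, k in layers:
    stride = 2 if kind == "pool" else 1
    size = output_size(size, k, stride=stride)
    trace.append(size)

print(trace)  # [32, 28, 14, 10, 5]
```

The resulting 5 × 5 maps are what the three fully-connected layers would then consume.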
AlexNet
• Has 8 layers, containing 5 convolutional layers and 3 fully-connected
layers
• Uses ReLU as the activation function to mitigate the vanishing
gradient problem
• Dropout is used in the last few layers to avoid overfitting
• Local Response Normalization (LRN) is used to enhance the
generalization of the model
AlexNet
• Employs 2 powerful GPUs; the two feature maps generated by the two
GPUs can be combined as the final output
• Uses data augmentation to enlarge the dataset and averages the
predictions on the augmented samples as the final result
• Uses Principal Component Analysis (PCA) to change the RGB values of
the training set
VGGNet
• LRN layer was removed
• VGGNets use 3 × 3 convolution kernels rather than 5 × 5 or 7 × 7
ones, since a stack of several small kernels has the same receptive field
as one larger kernel while adding more nonlinear variations.
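The receptive-field argument can be checked with a little arithmetic. The sketch below assumes stride-1 convolutions with the same number of input and output channels (64 is a hypothetical value) and ignores biases.

```python
def stacked_receptive_field(kernel, layers):
    """Receptive field of `layers` stacked stride-1 convolutions."""
    return 1 + layers * (kernel - 1)


def conv_weights(kernel, channels, layers=1):
    """Weights of `layers` convolutions with `channels` inputs and outputs."""
    return layers * kernel * kernel * channels * channels


C = 64  # hypothetical channel count

# Two 3x3 layers see a 5x5 region; three 3x3 layers see a 7x7 region.
print(stacked_receptive_field(3, 2), conv_weights(3, C, 2), conv_weights(5, C))
print(stacked_receptive_field(3, 3), conv_weights(3, C, 3), conv_weights(7, C))
```

In both cases the stack of 3 × 3 kernels covers the same region with fewer weights, and each extra layer adds another nonlinearity.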
GoogLeNet - Inception v1
• CNN formed by stacking Inception modules
• Inception v1 deploys 1 × 1, 3 × 3, 5 × 5 convolution kernels to
construct a “wide” network
• Convolution kernels with different sizes can extract the feature maps
of different scales of the image
• 1 × 1 convolution kernels are used to reduce the number of channels,
which reduces computational cost
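To see why a 1 × 1 convolution lowers the cost, the following sketch counts multiplications for a 5 × 5 convolution with and without a 1 × 1 channel reduction in front of it. All sizes (28 × 28 map, 192 input channels, 16 reduced channels, 32 output channels) are hypothetical values chosen for illustration.

```python
def conv_mults(h, w, k, c_in, c_out):
    """Multiplications for a stride-1, same-padded k x k convolution."""
    return h * w * k * k * c_in * c_out


# Direct 5x5 convolution from 192 to 32 channels.
direct = conv_mults(28, 28, 5, 192, 32)

# 1x1 convolution down to 16 channels, then the 5x5 convolution.
reduced = conv_mults(28, 28, 1, 192, 16) + conv_mults(28, 28, 5, 16, 32)

print(direct, reduced)
```

With these numbers the reduced path needs roughly a tenth of the multiplications of the direct one.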
GoogLeNet - Inception v2
• Output of every layer is normalized to increase the robustness of the
model and allow training with a high learning rate
• A single 5 × 5 convolutional layer can be replaced by two 3 × 3 ones
• One n × n convolutional layer can be replaced by one 1 × n and one
n × 1 convolutional layer
• Filter banks are expanded wider to improve high-dimensional
representations
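Two of these ideas lend themselves to a short sketch: the weight saving from replacing an n × n kernel with a 1 × n and an n × 1 kernel, and normalizing a layer output over a mini-batch. The normalization below follows the usual batch-normalization formula, which is an assumption about what the slide means by "normalized"; `gamma`, `beta`, and `eps` are illustrative parameter names.

```python
import math


def square_kernel_weights(n, channels):
    """Weights of one n x n convolution (same channels in and out, no bias)."""
    return n * n * channels * channels


def asymmetric_weights(n, channels):
    """Weights of a 1 x n convolution followed by an n x 1 convolution."""
    return 2 * n * channels * channels


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a mini-batch, then scale and shift it."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]


print(square_kernel_weights(7, 64), asymmetric_weights(7, 64))
print(batch_norm([1.0, 2.0, 3.0, 4.0]))
```

The normalized values have a mean of about 0 and a variance of about 1, regardless of the scale of the inputs.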
ResNet
• Two-layer residual block is constructed with a shortcut connection
• 50-layer ResNet, 101-layer ResNet, and 152-layer ResNet utilize
three-layer residual blocks
• Three-layer residual block is also called the bottleneck module: the
1 × 1 convolutions at its two ends first reduce and then restore the number
of channels, so the 3 × 3 convolution in the middle operates on fewer channels
• Mitigates the vanishing gradient problem since the gradient can
flow directly through shortcut connections
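A two-layer residual block can be sketched as follows. Small square matrices stand in for the convolutions, which is a simplification made only to keep the example short; the function names are illustrative.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def linear(v, weights):
    """Multiply vector v by a square weight matrix (stand-in for a convolution)."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def residual_block(x, w1, w2):
    """Two-layer residual block: y = relu(F(x) + x), F(x) = W2 relu(W1 x)."""
    fx = linear(relu(linear(x, w1)), w2)
    # The shortcut connection adds the input back onto the residual F(x).
    return relu([f + xi for f, xi in zip(fx, x)])


zeros = [[0.0, 0.0], [0.0, 0.0]]
print(residual_block([1.0, 2.0], zeros, zeros))  # [1.0, 2.0]
```

With all-zero weights the block still passes its input through unchanged, which illustrates the direct path that the shortcut provides.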
DCGAN
• A GAN has a generative model G and a discriminative model D
• Given random noise z, the model G generates a sample G(z) that
follows the data distribution P_data learned by G.
• The model D determines whether the input sample is real data x
or generated data G(z).
• Both G and D can be nonlinear functions. The aim of G is to generate
data that looks real; the aim of D is to distinguish fake data generated
by G from the real data
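The opposing aims of G and D can be written as two loss functions. The sketch below uses the binary cross-entropy form with the non-saturating generator loss; that choice is an assumption, since the slide does not give a formula. `d_real` and `d_fake` are hypothetical names for D's output probability on a real sample x and on a generated sample G(z).

```python
import math


def discriminator_loss(d_real, d_fake):
    """D wants d_real close to 1 and d_fake close to 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """G wants D to assign a high probability to its sample G(z)."""
    return -math.log(d_fake)


# A confident, correct D has a low loss; a fooled D makes G's loss low.
print(discriminator_loss(0.9, 0.1), discriminator_loss(0.6, 0.4))
print(generator_loss(0.9), generator_loss(0.1))
```

When D cannot tell real from fake and outputs 0.5 for both, its loss equals 2 ln 2.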
MobileNets
• Lightweight models proposed by Google for embedded devices such
as mobile phones
• Use depth-wise separable convolutions and several advanced techniques
to build thin deep neural networks.
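A depth-wise separable convolution splits a standard convolution into a per-channel spatial filter and a 1 × 1 channel-mixing step. The sketch below compares weight counts; the 3 × 3 kernel and the 128 and 256 channel counts are hypothetical, and biases are ignored.

```python
def standard_conv_weights(k, c_in, c_out):
    """Weights of a standard k x k convolution."""
    return k * k * c_in * c_out


def depthwise_separable_weights(k, c_in, c_out):
    """Weights of a depth-wise k x k step plus a point-wise 1 x 1 step."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


print(standard_conv_weights(3, 128, 256))        # 294912
print(depthwise_separable_weights(3, 128, 256))  # 33920
```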
ShuffleNets
• Series of CNN-based models that address the insufficient
computing power of mobile devices
• Combine pointwise group convolution and channel shuffle, which
significantly reduce the computational cost with little loss of accuracy
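Channel shuffle is usually described as reshaping the channel axis into (groups, channels per group), transposing, and flattening again, so that the next group convolution sees channels from every group. A minimal sketch on a plain list of channel labels, with an illustrative function name:

```python
def channel_shuffle(channels, groups):
    """Interleave channels so each new group mixes all original groups."""
    n = len(channels)
    if n % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    # Equivalent to reshape (groups, per_group), transpose, flatten.
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


print(channel_shuffle([0, 1, 2, 3, 4, 5], groups=2))  # [0, 3, 1, 4, 2, 5]
```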
GhostNet
• Existing CNNs extract large amounts of redundant features for image
cognition; GhostNet is used to reduce this computational cost effectively
• Similar feature maps in traditional convolution layers are called ghosts
• The traditional convolution layer is divided into two parts
• First, fewer convolution kernels are used directly in feature extraction
• Then these features are processed by linear transformations to acquire
multiple feature maps
• The authors showed that the Ghost module also applies to
other CNN models
Activation function
• In a multilayer neural network, the function applied between two layers
is called the activation function
• Determines which information should be transmitted to the next
neuron
• Without an activation function, the output of the network would be only a
linear function of the input
• Nonlinear functions are introduced as activation functions to enhance
the expressive ability of the neural network
Types of activation function
• Sigmoid function maps a real number to (0, 1), so it can be used
for binary classification problems.
• Tanh function maps a real number to (-1, 1), which achieves a
normalization effect and makes the next layer easier to learn.
• Rectified Linear Unit (ReLU): when x is less than 0, its value is 0; when
x is greater than or equal to 0, its value is x itself. Speeds up learning.
• ELU function takes negative values for negative inputs, so the average of
its output is close to 0, making convergence faster than with ReLU.
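The four functions above can be written directly from their definitions. This is a scalar sketch; `alpha=1.0` for ELU is a commonly used default and an assumption here.

```python
import math


def sigmoid(x):
    """Map a real number to (0, 1); branches avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def tanh(x):
    """Map a real number to (-1, 1)."""
    return math.tanh(x)


def relu(x):
    """0 for negative inputs, x itself otherwise."""
    return max(0.0, x)


def elu(x, alpha=1.0):
    """x for non-negative inputs, alpha * (exp(x) - 1) for negative inputs."""
    return x if x >= 0 else alpha * (math.exp(x) - 1.0)


for x in (-2.0, 0.0, 2.0):
    print(x, sigmoid(x), tanh(x), relu(x), elu(x))
```

Unlike ReLU, ELU returns a small negative value for negative inputs, which is what pulls its average output toward 0.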
Loss/Cost function
• Calculates the distance between the predicted value and the actual
value
• Used as a learning criterion of the optimization problem
• Common loss functions include Mean Absolute Error (MAE), Mean Square
Error (MSE), and Cross Entropy
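The three losses named above can be sketched for plain Python lists. The cross-entropy version assumes a single sample with one true class and a list of predicted class probabilities; the function names are illustrative.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: average of |prediction - target|."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean Square Error: average of (prediction - target)^2."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_entropy(true_index, probs):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[true_index])


print(mae([1.0, 2.0, 3.0], [2.0, 2.0, 5.0]))  # 1.0
print(mse([1.0, 2.0, 3.0], [2.0, 2.0, 5.0]))
print(cross_entropy(0, [0.5, 0.25, 0.25]))
```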
Rules of Thumb for Loss Function Selection
• For CNN models that solve regression problems, choose L1 loss or L2 loss
as the loss function.
• For classification problems, select from the rest of the loss functions
• Cross entropy loss is the most popular choice, with a softmax layer at
the end.
• The selection of loss function in CNNs also depends on the
application scenario. For example, in face recognition,
contrastive loss and triplet loss have turned out to be the commonly
used ones nowadays.
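Two of the choices above can be sketched briefly: cross entropy on top of a softmax layer, and a triplet loss on embedding vectors. The triplet loss uses squared Euclidean distances with a margin, a common formulation that is assumed here; the margin of 0.2 is a hypothetical value.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities (max subtracted for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_cross_entropy(logits, true_index):
    """Cross entropy of the softmax output against the true class."""
    return -math.log(softmax(logits)[true_index])


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Push the anchor closer to the positive than to the negative."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return max(0.0, sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin)


print(softmax_cross_entropy([2.0, 1.0, 0.1], 0))
print(triplet_loss([0.0, 0.0], [0.0, 0.0], [1.0, 0.0]))  # 0.0
```

The triplet loss is zero once the negative sample is farther from the anchor than the positive sample by at least the margin.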
Optimizer
• Training convolutional neural networks often requires optimizing
non-convex functions.
• Mathematical methods require huge computing power, so optimizers
are used in the training process to minimize the loss function and
obtain optimal network parameters within an acceptable time.
• Common optimization algorithms are Momentum, RMSprop, Adam,
etc.
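As an illustration of how such optimizers update a parameter, the sketch below implements Momentum and Adam for a single scalar and applies them to the toy loss (x - 3)^2. The toy loss is convex and is used only to keep the example checkable; the hyperparameter values are common defaults and assumptions here.

```python
import math


def sgd_momentum(grad, x, lr=0.1, beta=0.9, steps=300):
    """Gradient descent with a velocity that accumulates past gradients."""
    v = 0.0
    for _ in range(steps):
        v = beta * v + grad(x)
        x = x - lr * v
    return x


def adam(grad, x, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=300):
    """Adam: per-step size scaled by running gradient moments."""
    m = 0.0  # running mean of gradients
    v = 0.0  # running mean of squared gradients
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction for early steps
        v_hat = v / (1 - beta2 ** t)
        x = x - lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


def toy_grad(x):
    """Gradient of the toy loss (x - 3)^2."""
    return 2.0 * (x - 3.0)


print(sgd_momentum(toy_grad, 0.0), adam(toy_grad, 0.0))
```

Both runs start at x = 0 and end close to the minimizer x = 3.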
Applications of one-dimensional CNN
• Time Series Prediction
• Electrocardiogram (ECG) time series, weather forecast, and highway
traffic flow prediction
• Signal Identification
• ECG signal identification, structural damage identification, and system fault
identification
Applications of two-dimensional CNN
• Image Classification
• Medical image classification, traffic scene classification, and
breast cancer tissue classification
• Object Detection
• Image Segmentation
• Face Recognition
Applications of multi-dimensional CNN
• Human Action Recognition
• Object Recognition/Detection
Conclusion
• Owing to advantages such as local connection, weight sharing, and
down-sampling dimensionality reduction, convolutional neural networks
have been widely deployed in both research and industry projects
• First, we discussed the basic building blocks of CNN and how to construct
a CNN-based model from scratch
• Second, we reviewed some excellent CNN networks
• Third, we introduced activation functions, loss functions, and optimizers
for CNN
• Fourth, we discussed some typical applications of CNN
• CNN can be refined further in terms of model size, security, and easier
hyperparameter selection. Moreover, there are many problems that
convolution handles poorly, such as low generalization ability, lack of
equivariance, and poor results on crowded scenes, which points to several
promising directions.
