Deep Learning and its Applications - Computer Vision Zipfian Academy Meetup
Deep learning is useful for detecting anomalies such as fraud, spam, and money laundering; identifying similarities to augment search and text analytics; predicting customer lifetime value and churn; and recognizing faces and voices.
The DL4J framework's neural nets include restricted Boltzmann machines, deep-belief networks, deep autoencoders, convolutional nets, and recursive neural tensor networks.
Deep Learning
And Its Applications: Computer Vision
Adam Gibson
deeplearning4j.org // skymind.io // Zipfian Academy
• Object Recognition
• Image Categorization
• Scene Parsing
• Face Recognition
Computer Vision: A Primer
• OpenCV
• SIFT
• Filters/Edge Detection
• Feature Extraction
What’s currently done?
• Representation Learning
• More precise than hand-engineered features
• Captures non-linearities and higher-order trends
• Pretraining and Hessian-free optimization
This is manual!
• Representation Learning
• Position Invariance with convolutions
• Semantic Hashing
Deep Learning and Images
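Semantic hashing, mentioned above, is usually described as mapping each image to a short binary code so that similar images get codes that differ in only a few bits. The lookup side can be sketched as follows. This is a minimal sketch: the 8-bit codes and the image names are hand-written, hypothetical values standing in for what a trained network would produce.

```python
def hamming(a, b):
    """Number of bit positions in which two integer codes differ."""
    return bin(a ^ b).count("1")

def nearest(query_code, index, k=2):
    """Return the k image names whose codes are closest to the query."""
    return sorted(index, key=lambda name: hamming(query_code, index[name]))[:k]

# Hypothetical 8-bit codes; a real system would learn these from pixels.
index = {
    "cat_1": 0b10110010,
    "cat_2": 0b10110011,
    "car_1": 0b01001100,
}

matches = nearest(0b10110110, index, k=2)
print(matches)
```

Because the comparison is a bitwise XOR plus a bit count, lookups stay cheap even over large collections.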
• Normal pixels (0-255): normalization
• Sparse: binarization (depending on pixel presence)
Different kinds of images
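The two preprocessing paths above can be sketched in a few lines. This is a minimal sketch under stated assumptions: images are plain nested lists of 0-255 intensities, and "pixel presence" is taken to mean any intensity above a threshold (0 by default). The function names are illustrative, not part of any library.

```python
def normalize(pixels, max_value=255):
    """Scale 0-255 intensities into the range [0.0, 1.0]."""
    return [[p / max_value for p in row] for row in pixels]

def binarize(pixels, threshold=0):
    """Map each pixel to 1 if it is present (above threshold), else 0."""
    return [[1 if p > threshold else 0 for p in row] for row in pixels]

image = [
    [0, 128, 255],
    [0, 0, 64],
]

scaled = normalize(image)   # dense, real-valued input
sparse = binarize(image)    # sparse, binary input
print(scaled)
print(sparse)
```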
• Faces = a collection of images.
• With persistent patterns of pixels.
• Pixel patterns = features.
• Nets learn to identify features in data, to classify faces as faces and label them: John or Sarah.
• Nets train by reconstructing faces from features many times.
• Measuring their work against a benchmark.
Facial recognition
DL4J’s Facial Reconstructions
• Slices of a feature space (Max pooling)
• Learns different portions for easily scalable and robust feature engineering.
Position Invariance - Convolutions
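A small example may help show why max pooling gives some position invariance. This is a minimal sketch, assuming non-overlapping 2x2 windows over a feature map stored as nested lists. The two feature maps are made-up values in which the same strong response sits one pixel apart.

```python
def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled

# The same feature response (9), shifted by one pixel between the two maps.
map_a = [[0, 0, 0, 0],
         [0, 9, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
map_b = [[9, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]

pooled_a = max_pool(map_a)
pooled_b = max_pool(map_b)
print(pooled_a, pooled_b)
```

Both maps pool to the same 2x2 summary, so a later layer sees the feature regardless of the small shift. The invariance only holds for shifts that stay inside one pooling window.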
Visual Example - Convolutions
Pen Strokes
• Facebook uses facial recognition to make itself stickier and know more about us.
• Government agencies use it to secure national borders.
• Video game makers use it to construct more realistic worlds.
• Stores use it to identify customers and track behavior.
What are faces for?
• 2 layers of neuron-like nodes.
• The 1st is the visible, or input, layer
• The 2nd is “hidden.” It identifies features in input
• Symmetrically connected.
• “Restricted” = no visible-visible or hidden-hidden ties
• All connections happen between layers.
Restricted Boltzmann Machines (RBMs)
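The structure described above can be sketched as a single weight matrix shared by both directions. This is a minimal sketch, not DL4J's implementation: the class name TinyRBM, the layer sizes, and the random initial weights are assumptions for illustration, and no training is shown.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyRBM:
    """Hypothetical toy RBM: one weight per visible-hidden pair only."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        # W[i][j] links visible unit i to hidden unit j. There are no
        # visible-visible or hidden-hidden weights ("restricted").
        self.W = [[self.rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_visible = [0.0] * n_visible
        self.b_hidden = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(self.b_hidden[j] +
                        sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.b_hidden))]

    def visible_probs(self, h):
        # Same W as hidden_probs: the connections are symmetric.
        return [sigmoid(self.b_visible[i] +
                        sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b_visible))]

    def sample(self, probs):
        return [1 if self.rng.random() < p else 0 for p in probs]

    def reconstruct(self, v):
        """One visible -> hidden -> visible pass."""
        h = self.sample(self.hidden_probs(v))
        return self.visible_probs(h)

rbm = TinyRBM(n_visible=6, n_hidden=3)
reconstruction = rbm.reconstruct([1, 0, 1, 1, 0, 0])
print(reconstruction)
```

With untrained weights the reconstruction is close to 0.5 everywhere. Training would adjust W so that reconstructions resemble the inputs.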
• A stack of RBMs.
• Each RBM’s hidden layer → Next RBM’s visible/input layer.
• DBNs learn more & more complex features
• Example:
• 1) Pixels = input;
• 2) H1 learns an edge or line;
• 3) H2 learns a corner or set of lines;
• 4) H3 learns two groups of lines forming an object: a face!
• Final layer classifies feature groups: sunset, elephant, flower, John, Sarah.
Deep-Belief Net (DBN)
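The stacking described above, where each hidden layer feeds the next layer's input, can be sketched as a simple loop. This is a minimal sketch under stated assumptions: the layer sizes (16, 8, 4, 2) and the random, untrained weights are illustrative, and the greedy layer-by-layer pretraining and the final classifier are left out.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one RBM-style layer."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_out)]
               for _ in range(n_in)]
    biases = [0.0] * n_out
    return weights, biases

def hidden_activations(layer, visible):
    weights, biases = layer
    return [sigmoid(biases[j] +
                    sum(visible[i] * weights[i][j]
                        for i in range(len(visible))))
            for j in range(len(biases))]

def dbn_forward(layers, pixels):
    """Feed each layer's hidden activations to the next layer as input."""
    outputs = []
    current = pixels
    for layer in layers:
        current = hidden_activations(layer, current)
        outputs.append(current)
    return outputs

rng = random.Random(0)
sizes = [16, 8, 4, 2]  # pixels -> H1 -> H2 -> H3 (hypothetical sizes)
layers = [make_layer(sizes[i], sizes[i + 1], rng)
          for i in range(len(sizes) - 1)]
pixels = [rng.random() for _ in range(sizes[0])]

h1, h2, h3 = dbn_forward(layers, pixels)
print(len(h1), len(h2), len(h3))
```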
• 2 DBNs.
• 1st DBN *encodes* data into vector of 10-30 numbers = Pre-training.
• 2nd DBN decodes data into original state.
• Backprop only happens on 2nd DBN
• 2nd is the fine-tuning stage (reconstruction entropy).
• Reduces documents or images to compact vectors.
• Useful in search, QA and information retrieval.
Deep Autoencoder
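The fine-tuning objective named above can be illustrated with a short function. This is a minimal sketch, assuming "reconstruction entropy" refers to the binary cross-entropy between the original input and its decoded reconstruction. The sample vectors are made up for illustration.

```python
import math

def reconstruction_cross_entropy(original, reconstruction, eps=1e-12):
    """Sum of binary cross-entropy terms between input and reconstruction."""
    total = 0.0
    for x, z in zip(original, reconstruction):
        z = min(max(z, eps), 1.0 - eps)  # keep log() finite
        total -= x * math.log(z) + (1.0 - x) * math.log(1.0 - z)
    return total

original = [1.0, 0.0, 1.0, 0.0]
close = [0.9, 0.1, 0.8, 0.2]   # a good reconstruction
far = [0.4, 0.6, 0.5, 0.5]     # a poor reconstruction

loss_close = reconstruction_cross_entropy(original, close)
loss_far = reconstruction_cross_entropy(original, far)
print(loss_close, loss_far)
```

A lower value means the decoder recovered the input more faithfully from the compact code, which is what fine-tuning tries to achieve.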
Deep Autoencoder Architecture
Image Search Results
• Top-down & hierarchical rather than feed-forward (DBNs).
• Handles sequence-based classification, windows of several events, entire scenes (multiple objects).
• Features themselves are vectors.
• A tensor = a multi-dimensional matrix, or multiple matrices of the same size.
Recursive Neural Tensor Net
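The role of the tensor can be sketched with one composition step. This is a minimal sketch, assuming the commonly described RNTN form in which two child vectors are concatenated into c and each parent entry is tanh(c^T V[k] c + W[k] c). The dimension d = 2 and all weight values are hand-picked, hypothetical numbers.

```python
import math

def compose(a, b, V, W):
    """Combine two child vectors of size d into one parent vector of size d."""
    c = a + b  # concatenation of the children, length 2d
    parent = []
    for k in range(len(W)):
        # V[k] is one 2d x 2d slice of the tensor: "multiple matrices
        # of the same size", one slice per output dimension.
        bilinear = sum(c[i] * V[k][i][j] * c[j]
                       for i in range(len(c)) for j in range(len(c)))
        linear = sum(W[k][j] * c[j] for j in range(len(c)))
        parent.append(math.tanh(bilinear + linear))
    return parent

d = 2
V = [[[0.1 if i == j else 0.0 for j in range(2 * d)]
      for i in range(2 * d)] for _ in range(d)]
W = [[0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5]]

left, right, extra = [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]
pair = compose(left, right, V, W)    # two parts form a larger unit
scene = compose(pair, extra, V, W)   # the unit combines with another part
print(pair, scene)
```

Because the parent has the same size as each child, the same function can be applied recursively up a tree of objects or words.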
RNTNs & Scene Composition