Deep Learning and its Applications - Computer Vision

Deep Learning
And Its Applications: Computer Vision
Adam Gibson
deeplearning4j.org // skymind.io // Zipfian Academy

•  Object Recognition
•  Image Categorization
•  Scene Parsing
•  Face Recognition
Computer Vision: A Primer

•  OpenCV
•  SIFT
•  Filters/Edge Detection
•  Feature Extraction
What’s currently done?

•  Representation Learning
•  More precise than hand-done features
•  Non-linearities and higher-order trends
•  Pretraining and Hessian-free optimization
This is manual!

•  Representation Learning
•  Position Invariance with convolutions
•  Semantic Hashing
Deep Learning and Images

•  Normal pixels (0-255): normalization
•  Sparse: binarization (depending on pixel presence)
Different kinds of images
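
The two preprocessing options above can be sketched in a few lines of Python. This is a minimal illustration, not DL4J's implementation: the function names `normalize` and `binarize` and the `threshold` parameter are assumptions made for the example.

```python
def normalize(pixels, max_value=255):
    """Scale 0-255 intensities into the range [0.0, 1.0]."""
    return [p / max_value for p in pixels]


def binarize(pixels, threshold=0):
    """Map a pixel to 1 if it is present (above the threshold), else 0."""
    return [1 if p > threshold else 0 for p in pixels]


# One row of a grayscale image (hypothetical values).
row = [0, 51, 255, 0, 102]
normalized = normalize(row)   # dense input for real-valued units
binary = binarize(row)        # sparse input for binary units
```

Normalized values suit inputs where intensity matters; binarized values suit sparse images such as pen strokes, where only the presence of ink matters.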

•  Faces = a collection of images.
•  With persistent patterns of pixels.
•  Pixel patterns = features.
•  Nets learn to identify features in data, to classify faces as faces and label them: John or Sarah.
•  Nets train by reconstructing faces from features many times.
•  Measuring their work against a benchmark.
Facial recognition

DL4J’s Facial Reconstructions

•  Slices of a feature space (Max pooling)
•  Learns different portions for easily scalable and robust feature engineering.
Position Invariance - Convolutions
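
As a rough sketch of how convolution plus max pooling gives position tolerance, the Python below slides a small filter over an image and then keeps the maximum of each non-overlapping window. The helper names, the 2x2 all-ones filter, and the 5x5 test images are assumptions chosen for illustration only.

```python
def convolve2d(image, kernel):
    """'Valid' 2D cross-correlation of a list-of-lists image with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling: keep the largest value in each window."""
    return [
        [
            max(fmap[i + di][j + dj]
                for di in range(size) for dj in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]


def image_with_block(top, left, n=5):
    """Hypothetical n x n image containing one bright 2x2 block."""
    return [
        [1 if top <= r < top + 2 and left <= c < left + 2 else 0
         for c in range(n)]
        for r in range(n)
    ]


block_detector = [[1, 1], [1, 1]]
pooled_a = max_pool(convolve2d(image_with_block(0, 0), block_detector))
pooled_b = max_pool(convolve2d(image_with_block(1, 1), block_detector))
```

The block moves by one pixel between the two images, yet the strongest filter response lands in the same pooled cell in both cases. That is the sense in which pooling slices of the feature space makes features robust to small shifts.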

Visual Example - Convolutions

Pen Strokes

•  Facebook uses facial recognition to make
itself stickier and know more about us.
•  Government agencies use it to secure
national borders.
•  Video game makers use it to construct more
realistic worlds.
•  Stores use it to identify customers and track
behavior.
What are faces for?

•  2 layers of neuron-like nodes.
•  The 1st is the visible, or input, layer.
•  The 2nd is “hidden.” It identifies features in the input.
•  Symmetrically connected.
•  “Restricted” = no visible-visible or hidden-hidden ties.
•  All connections happen between layers.
Restricted Boltzmann
Machines (RBMs)
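
The structure described above can be sketched as a small Python class. This is a minimal illustration, not DL4J's implementation: the class and method names are hypothetical, and the training rule shown is one step of contrastive divergence (CD-1), a common way to train RBMs that the slides do not name explicitly.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """Two layers joined by one symmetric weight matrix W.

    "Restricted": W only links visible unit i to hidden unit j;
    there are no visible-visible or hidden-hidden weights.
    """

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.W = [[self.rng.gauss(0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_v = [0.0] * n_visible
        self.b_h = [0.0] * n_hidden

    def hidden_probs(self, v):
        """Visible -> hidden: probability each hidden feature is on."""
        return [sigmoid(self.b_h[j] + sum(v[i] * self.W[i][j]
                                          for i in range(len(v))))
                for j in range(len(self.b_h))]

    def visible_probs(self, h):
        """Hidden -> visible, through the same weights (symmetric)."""
        return [sigmoid(self.b_v[i] + sum(h[j] * self.W[i][j]
                                          for j in range(len(h))))
                for i in range(len(self.b_v))]

    def sample(self, probs):
        return [1 if self.rng.random() < p else 0 for p in probs]

    def cd1_update(self, v0, lr=0.1):
        """One CD-1 step; returns the squared reconstruction error."""
        h0 = self.hidden_probs(v0)
        v1 = self.visible_probs(self.sample(h0))  # reconstruction
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
        for i in range(len(v0)):
            self.b_v[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_h[j] += lr * (h0[j] - h1[j])
        return sum((a - b) ** 2 for a, b in zip(v0, v1))


rbm = RBM(n_visible=6, n_hidden=3)
pattern = [1, 1, 0, 0, 1, 0]
errors = [rbm.cd1_update(pattern) for _ in range(20)]
features = rbm.hidden_probs(pattern)
```

Each update reconstructs the input from the hidden features and nudges the weights toward a better reconstruction, which matches the "train by reconstructing many times" description earlier in the deck.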

•  A stack of RBMs.
•  Each RBM’s hidden layer → next RBM’s visible/input layer.
•  DBNs learn more & more complex features
•  Example:
•  1) Pixels = input;
•  2) H1 learns an edge or line;
•  3) H2 learns a corner or set of lines;
•  4) H3 learns two groups of lines forming an object: a face!
•  Final layer classifies feature groups: sunset, elephant, flower, John, Sarah.
Deep-Belief Net (DBN)
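
The stacking idea can be sketched as follows: the output of each layer becomes the input of the next. This Python sketch only shows the upward pass through randomly initialized layers; the layer sizes, the helper names, and the omission of training and of the final classifier are all simplifying assumptions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Weights and hidden biases for one (untrained) RBM-style layer."""
    return {
        "W": [[rng.gauss(0, 0.1) for _ in range(n_out)] for _ in range(n_in)],
        "b": [0.0] * n_out,
    }


def up(v, layer):
    """Hidden activations of one layer given its visible input."""
    return [
        sigmoid(layer["b"][j] + sum(v[i] * layer["W"][i][j]
                                    for i in range(len(v))))
        for j in range(len(layer["b"]))
    ]


def propagate(pixels, layers):
    """Feed each layer's hidden activations to the next layer as input."""
    activations = [pixels]
    for layer in layers:
        activations.append(up(activations[-1], layer))
    return activations


rng = random.Random(0)
sizes = [16, 8, 4, 2]            # pixels -> H1 -> H2 -> H3 (hypothetical)
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
pixels = [1, 0] * 8              # a 16-pixel toy input
activations = propagate(pixels, layers)
```

In a trained DBN each layer would first be fit to the activations of the layer below it, so that H1, H2, and H3 capture progressively more complex features; here the weights are random and only the data flow is shown.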

•  2 DBNs.
•  1st DBN *encodes* data into a vector of 10-30 numbers = pre-training.
•  2nd DBN decodes data into its original state.
•  Backprop only happens on the 2nd DBN.
•  2nd is the fine-tuning stage (reconstruction entropy).
•  Reduces documents or images to compact vectors.
•  Useful in search, QA and information retrieval.
Deep Autoencoder
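
A rough Python sketch of the encode/decode round trip and of a reconstruction cross-entropy score follows. It is a simplification under stated assumptions: a single encoding layer stands in for the whole first DBN, the decoder reuses the encoder weights transposed, the weights are random rather than trained, and all names and sizes are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def encode(x, W, b_code):
    """Compress the input into a short code vector."""
    return [sigmoid(b_code[j] + sum(x[i] * W[i][j] for i in range(len(x))))
            for j in range(len(b_code))]


def decode(code, W, b_out):
    """Expand the code back to input size using the transposed weights."""
    return [sigmoid(b_out[i] + sum(code[j] * W[i][j]
                                   for j in range(len(code))))
            for i in range(len(b_out))]


def reconstruction_cross_entropy(x, x_hat, eps=1e-12):
    """Lower is better; near zero when x_hat matches the binary input x."""
    return -sum(xi * math.log(xh + eps) + (1 - xi) * math.log(1 - xh + eps)
                for xi, xh in zip(x, x_hat))


rng = random.Random(0)
n_in, n_code = 64, 10            # 64 pixels squeezed into 10 numbers
W = [[rng.gauss(0, 0.1) for _ in range(n_code)] for _ in range(n_in)]
x = [rng.choice([0, 1]) for _ in range(n_in)]

code = encode(x, W, [0.0] * n_code)
x_hat = decode(code, W, [0.0] * n_in)
loss = reconstruction_cross_entropy(x, x_hat)
```

Fine-tuning would adjust the weights by backpropagation to lower `loss`; the short `code` vector is what a search or retrieval system would store and compare.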

Deep Autoencoder Architecture

Image Search Results

•  Top-down & hierarchical rather than feed-forward (DBNs).
•  Handles sequence-based classification, windows of several events, entire scenes (multiple objects).
•  Features themselves are vectors.
•  A tensor = a multi-dimensional matrix, or multiple matrices of the same size.
Recursive Neural Tensor Net
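
The points above can be illustrated with the composition step commonly used in recursive neural tensor nets: two child feature vectors are concatenated into c, and each output dimension k combines a bilinear term c^T V[k] c (one matrix slice of the tensor per dimension) with a linear term W[k] c, passed through tanh. The Python below is a minimal sketch with random, untrained parameters; the function name, dimension, and example vectors are assumptions.

```python
import math
import random


def compose(a, b, V, W):
    """Merge two child vectors of size d into one parent vector of size d.

    V is a tensor stored as d matrices, each of size 2d x 2d.
    W is a d x 2d matrix.
    """
    c = a + b                     # list concatenation: [a; b]
    n = len(c)
    parent = []
    for k in range(len(W)):
        bilinear = sum(c[i] * V[k][i][j] * c[j]
                       for i in range(n) for j in range(n))
        linear = sum(W[k][j] * c[j] for j in range(n))
        parent.append(math.tanh(bilinear + linear))
    return parent


d = 2
rng = random.Random(0)
V = [[[rng.gauss(0, 0.1) for _ in range(2 * d)] for _ in range(2 * d)]
     for _ in range(d)]
W = [[rng.gauss(0, 0.1) for _ in range(2 * d)] for _ in range(d)]

# Hypothetical feature vectors for three parts of a scene.
sky, tree, grass = [0.2, -0.1], [0.7, 0.3], [-0.4, 0.5]

# Build the scene bottom-up: (tree + grass) first, then add the sky.
ground = compose(tree, grass, V, W)
scene = compose(sky, ground, V, W)
```

Because a parent has the same size as its children, the same function can be applied recursively up a parse tree of any shape, which is how one set of parameters can describe a whole scene with multiple objects.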

RNTNs & Scene Composition
