1
Machine learning
for document analysis and
understanding
TC10/TC11 Summer School on Document Analysis:
Traditional Approaches and New Trends
@La Rochelle, France. 8:30-10:30, 4th July 2018
Seiichi Uchida, Kyushu University, Japan
2
The Nearest Neighbor Method
The simplest ML for pattern recognition;
Everything starts from it!
3
The nearest neighbor method:
Learning = memorizing
Input: which reference pattern is the most similar?
Reference patterns: pork, beef, orange, watermelon, pineapple, fish
4
Each pattern is represented
as a feature vector
Color feature
Texture
feature
Pork=(10, 2.5, 4.3)
*Those numbers are just a random example
Note: In the classical nearest neighbor method, those features are designed by humans
5
A different pattern becomes a different
feature vector
Beef = (8, 2.6, 0.9)
Pork = (10, 2.5, 4.3)
*Those numbers are just a random example
Color feature
Texture
feature
6
Reference patterns
in the feature vector space
Color feature
Texture
feature
7
An input pattern
in the feature vector space
We want to
recognize this
input x
Color feature
Texture
feature
8
input x
Nearest neighbor method
in the feature vector space
Nearest
neighbor
input = orange
Color feature
Texture
feature
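As a concrete illustration, here is a minimal 1-nearest-neighbor sketch in Python; the reference vectors below are made-up numbers in the spirit of the slides:

```python
import numpy as np

# Learning = memorizing: the "model" is just the stored reference patterns.
references = {
    "pork":   np.array([10.0, 2.5, 4.3]),
    "beef":   np.array([8.0, 2.6, 0.9]),
    "orange": np.array([2.0, 9.1, 3.0]),
}

def nearest_neighbor(x):
    # Classification = finding the reference pattern with the smallest distance.
    return min(references, key=lambda name: np.linalg.norm(x - references[name]))

print(nearest_neighbor(np.array([9.5, 2.4, 4.0])))   # -> "pork"
```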
9
How do you define
“the nearest neighbor”?
Distance-based
The smallest distance gives the nearest neighbor
Ex. • Euclidean distance
Similarity-based
The largest similarity gives the nearest neighbor
Ex. • Inner product • Cosine similarity
10
Do you remember an important property
of the “inner product”?
If $\mathbf{x}$ and $\mathbf{y}$ point in similar directions, their
inner product becomes larger.
The inner product evaluates
the similarity between $\mathbf{x}$ and $\mathbf{y}$.
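As a reminder, this property follows from the standard identity $\mathbf{x}^\top \mathbf{y} = \|\mathbf{x}\|\,\|\mathbf{y}\|\cos\theta$: for fixed vector lengths, the inner product is largest when the angle $\theta$ between $\mathbf{x}$ and $\mathbf{y}$ is zero, i.e., when they point in the same direction.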
11
Well, two different types of features
(Note: important to understand deep learning)
Features defined by the pattern itself
 Orange pixels → Many
 Blue pixels → Rare
 Roundness → High
 Symmetry → High
 Texture → Fine
…
Features defined by the similarity to others
 Similarity to “car” → Low
 Similarity to “apple” → High
 Similarity to “monkey” → Low
 Similarity to “Kaki” (persimmon) → Very high
…
12
The nearest neighbor method with
similarity-based feature vectors
Similarity
to “Kaki”
Similarity to “car”
Important note:
Similarity is used for not
only feature extraction
but also classification
13
A shallow explanation of
neural networks
Don’t think it is a black box.
If you know the “inner product”, it becomes easy to understand.
14
The neuron – its reality
https://commons.wikimedia.org/
15
From reality to computational model
https://commons.wikimedia.org/
[Model diagram: inputs $x_1, \dots, x_j, \dots, x_d$, weights $w_1, \dots, w_j, \dots, w_d$, a non-linear function $f$, and the output $g(\mathbf{x})$]
16
The neuron by computer
$g(\mathbf{x}) = f\Bigl(\sum_{j=1}^{d} w_j x_j + b\Bigr) = f(\mathbf{w}^\top \mathbf{x} + b)$
input: $x_1, \dots, x_j, \dots, x_d$; weights: $w_1, \dots, w_j, \dots, w_d$; bias: $b$
$f$: non-linear func.
output: $g(\mathbf{x})$
17
The neuron by computer
$g(\mathbf{x}) = f\Bigl(\sum_{j=1}^{d} w_j x_j + b\Bigr) = f(\mathbf{w}^\top \mathbf{x} + b)$
$f$: non-linear func. Let’s forget $f$ for a moment.
18
The neuron by computer
$g(\mathbf{x}) = \sum_{j=1}^{d} w_j x_j + b$
Let’s forget the bias $b$ as well.
19
The neuron by computer
$g(\mathbf{x}) = \sum_{j=1}^{d} w_j x_j = \mathbf{w}^\top \mathbf{x}$
just the “inner product” of two vectors $\mathbf{w}$ and $\mathbf{x}$
20
So, a neuron calculates…
$\mathbf{w}^\top \mathbf{x}$ : a similarity between $\mathbf{w}$ and $\mathbf{x}$
= 0.9 if they are similar
= 0.02 if they are dissimilar
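A minimal NumPy sketch of this similarity computation (the weight and input values are made-up examples):

```python
import numpy as np

w = np.array([0.6, 0.8, 0.0])             # weight vector = a "reference pattern"
x_similar = np.array([0.9, 1.2, 0.1])     # points roughly the same way as w
x_dissimilar = np.array([-0.7, 0.1, 0.9])

print(w @ x_similar)      # large inner product -> high similarity (1.5)
print(w @ x_dissimilar)   # small (here negative) inner product -> low similarity (-0.34)
```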
21
So, if we have K neurons, we have
a K-dimensional similarity-based feature vector
$\begin{pmatrix} \mathbf{w}_1^\top \mathbf{x} \\ \mathbf{w}_2^\top \mathbf{x} \\ \vdots \\ \mathbf{w}_K^\top \mathbf{x} \end{pmatrix} = \begin{pmatrix} 0.9 \\ 0.05 \\ \vdots \\ 0.75 \end{pmatrix}$
Each neuron $k$ holds its own weight $\mathbf{w}_k$ and outputs $\mathbf{w}_k^\top \mathbf{x}$ for the same input $\mathbf{x} = (x_1, \dots, x_j, \dots, x_d)$.
22
K-dimensional similarity-based feature
vector by K neurons
input $\mathbf{x}$ → $(0.9, 0.05, \dots, 0.75)^\top$
equiv. each component = similarity to one weight pattern $\mathbf{w}_k$
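A layer of K neurons is therefore just a matrix–vector product; a minimal NumPy sketch (the sizes and values are made up):

```python
import numpy as np

K, d = 3, 4
rng = np.random.default_rng(0)
W = rng.normal(size=(K, d))   # row k = weight pattern w_k of neuron k
x = rng.normal(size=d)        # input feature vector

similarities = W @ x          # K-dimensional similarity-based feature vector
print(similarities.shape)     # (3,)
```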
23
Another function of the inner product
Similarity-based classification!
(Yes, the nearest neighbor method!)
[Diagram: a neuron whose weight vector is set to the reference pattern of class $k$; its output $\mathbf{w}_k^\top \mathbf{x}$ scores the input $\mathbf{x}$ against that class]
24
Note: Multiple functions are realized by just combining neurons!
Just by layering the neuron elements, we
can have a complete recognition system!
[Two-layer diagram: input $x_1, \dots, x_j, \dots, x_d$ → feature extraction by $\mathbf{w}_1, \mathbf{w}_2, \dots, \mathbf{w}_K$ → classification by $\mathbf{v}_A, \mathbf{v}_B, \mathbf{v}_C$ → similarity to class A / B / C → choose max]
25
Now it is time for deep neural networks
[Diagram: input $x_1, \dots, x_j, \dots, x_d$ → repeated feature extraction layers (each neuron applying $f$) → classification layer]
26
An example: AlexNet
“Deep” neural network called AlexNet
A Krizhevsky, NIPS2012
feature extraction layers
classification
layers
27
Now it is time for deep neural networks
[Same diagram: input → feature extraction layers → classification]
Why do we need to repeat feature extraction?
28
Why do we need to repeat feature
extraction?
[Figure: six classes A–F arranged so that no single linear boundary separates them: a difficult classification task]
29
Why do we need to repeat feature
extraction?
[Same figure, with two weight patterns $\mathbf{w}_1$ and $\mathbf{w}_2$ overlaid]
30
Why do we need to repeat feature
extraction?
[Figure: the patterns are re-plotted in a space whose axes are “similarity to $\mathbf{w}_1$” and “similarity to $\mathbf{w}_2$”; patterns with a large similarity to $\mathbf{w}_1$ get a large first coordinate, dissimilar ones a small one]
Note: The lower picture is not very accurate, because it uses a distance-based rather than an inner-product-based space transformation. However, I believe it does not seriously damage the explanation here.
31
Why do we need to repeat feature
extraction?
[Same similarity-space figure]
It becomes more separable, but still not very separable.
32
Why do we need to repeat feature
extraction?
[Figure: two further weight patterns $\mathbf{w}_3$ and $\mathbf{w}_4$ are placed in the similarity space]
33
Why do we need to repeat feature
extraction?
[Figure: the points are mapped again, into the space of “similarity to $\mathbf{w}_3$” and “similarity to $\mathbf{w}_4$”]
34
Why do we need to repeat feature extraction?
[Figure: applying $\mathbf{w}_3$ and $\mathbf{w}_4$ to the previous similarity space] Wow, they become separable!
35
Why do we need to repeat feature
extraction?
[Figure: after the repeated mappings with $\mathbf{w}_1, \dots, \mathbf{w}_4$] Now the two classes become totally separable by $\mathbf{v}_1$ and $\mathbf{v}_2$.
36
Remembering the non-linear function
$g(\mathbf{x}) = f(\mathbf{w}^\top \mathbf{x} + b)$, with $f$: non-linear func.
37
The typical non-linear function:
Rectified linear function (ReLU)
$g(\mathbf{x}) = f(\mathbf{w}^\top \mathbf{x} + b)$ with the rectified linear function $f(a) = \max(0, a)$.
38
How does ReLU affect
the similarity-based feature?
Negative elements in the feature vector are forced to be zero:
$f(\mathbf{w}_k^\top \mathbf{x}) = \mathbf{w}_k^\top \mathbf{x}$ if $\mathbf{w}_k^\top \mathbf{x} \ge 0$ (unchanged), and $0$ otherwise, for $k = 1, \dots, K$.
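A one-line NumPy sketch of this effect (the similarity values are made up):

```python
import numpy as np

similarities = np.array([0.9, -0.4, 0.05, -1.2, 0.75])  # w_k^T x for K = 5 neurons
relu = np.maximum(0.0, similarities)   # negative similarities are clipped to zero
print(relu)                            # [0.9  0.   0.05 0.   0.75]
```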
39
How to train neural networks:
Super-superficial explanation
40
In order to realize a DNN with
an expected “input-output” relation
[Diagram: input $x_1, \dots, x_j, \dots, x_d$ → feature extraction by $\mathbf{w}_1, \mathbf{w}_2, \dots, \mathbf{w}_K$ → classification by $\mathbf{v}_A, \mathbf{v}_B, \mathbf{v}_C$ → similarity to class A / B / C]
All of those parameters ($\mathbf{w}_1, \mathbf{w}_2, \dots, \mathbf{v}_A, \dots$) should be tuned.
41
Training DNN; the goal
[Diagram: a DNN as a box covered with knobs; tuning the knobs should produce a perfect classification boundary between class A and class B]
Note: The actual number of knobs (= #parameters) is enormous, often millions.
42
Training DNN;
error-correcting learning by back propagation
[Figure: starting from an initial boundary, the boundary is tuned step by step while some training samples are still misrecognized (NG), until every sample is classified correctly (OK, end)]
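The full algorithm is beyond this slide, but the flavor of error-correcting tuning can be shown for a single neuron; a minimal caricature (a sketch with made-up settings, not the complete back-propagation procedure):

```python
import numpy as np

def train_neuron(X, t, lr=0.1, epochs=100):
    """Tune the knobs (w, b) of one sigmoid neuron by gradient descent on squared error.
    Real DNN training back-propagates this kind of gradient through all layers."""
    rng = np.random.default_rng(0)
    w, b = rng.normal(size=X.shape[1]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, t):
            y = 1.0 / (1.0 + np.exp(-(w @ x + b)))   # current output
            grad = (y - target) * y * (1.0 - y)      # "NG" amount, through the sigmoid
            w -= lr * grad * x                       # turn the knobs a little
            b -= lr * grad
    return w, b
```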
43
Advanced topic: Why does (SGD-based) back-propagation work?
Much theoretical research has been done [Choromanska+, PMLR2015] [Wu+, arXiv2017].
Under several assumptions, local minima are close to the global minimum.
flat basin of loss surface
44
Knob = weight
= a pattern for similarity-based feature
[Diagram: a neuron with input $x_1, \dots, x_j, \dots, x_d$ and weight $\mathbf{w}$; the weight is itself a pattern, and the output is the similarity to it]
This pattern is automatically
derived through training…
45
Optimal feature is extracted automatically
through training (Representation learning)
Google’s cat
https://googleblog.blogspot.jp/2012/06/
The patterns to which similarity is measured are determined automatically.
46
DNN for image classification:
Convolutional neural networks
(CNN)
47
How to deal with images by DNN?
If a whole image is flattened into a 400-million-dim vector $\mathbf{x}$, each neuron needs a 400-million-dim weight $\mathbf{w}_k$ to compute $\mathbf{w}_k^\top \mathbf{x}$:
① Intractable computations
② Enormous parameters
48
Convolution
= Repeating “local inner product” operations
= Linear filtering
Each output is $\mathbf{w}_k^\top \mathbf{x}_{i,j}$, where $\mathbf{x}_{i,j}$ is the low-dimensional vector of pixels in a small window at location $(i, j)$ of the image $\mathbf{x}$:
① Tractable computations
② Trainable #parameters
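A minimal sketch of this “local inner product at every location” idea in plain NumPy (the filter weights are made up):

```python
import numpy as np

def convolve2d(image, w):
    """Valid-mode 2-D convolution sketch (no kernel flipping, i.e. cross-correlation,
    as in CNN usage): one local inner product per location (i, j)."""
    H, W = image.shape
    h, w_ = w.shape
    out = np.zeros((H - h + 1, W - w_ + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i+h, j:j+w_]      # x_{i,j}: the local window
            out[i, j] = np.sum(w * patch)     # w_k^T x_{i,j}
    return out

image = np.random.rand(8, 8)
w = np.array([[1., 0., -1.]] * 3)   # a 3x3 vertical-edge filter (made-up weights)
print(convolve2d(image, w).shape)   # (6, 6): the "filtered" image
```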
49
Convolutional layer
= Use the same weight $\mathbf{w}_k$ (filter coefficients) at all locations $(i, j)$, producing a “filtered” image
50
Pooling layer
Keep only the maximum value within each local window:
① Deformation compensation
② Local info aggregation
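A minimal non-overlapping max-pooling sketch in NumPy:

```python
import numpy as np

def max_pool(feature_map, size=2):
    """Keep only the maximum value in each size x size window."""
    H, W = feature_map.shape
    H2, W2 = H // size, W // size
    out = feature_map[:H2*size, :W2*size].reshape(H2, size, W2, size)
    return out.max(axis=(1, 3))

fm = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool(fm))   # [[ 5.  7.]
                      #  [13. 15.]]
```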
51
Application to DAR:
Isolated character recognition
machine printed
handwritten
designed fonts
95.49%
99.79%
99.99%
[Uchida+, ICFHR2016]
Near-human performance
52
Application to DAR:
Breaking Captcha
99.8% by 1 million training samples
[Goodfellow+, ArXiv, 2014]
53
Application to DAR:
Detecting a component in a character image
Multi-part component
[Iwana+, ICDAR2017]
Q: Can CNN detect complex components accurately?
54
Application to DAR:
Font Recognition (DeepFont)
[Wang+, ACMMM2015]
55
Several tips about DNN/CNN
56
CNN can be used as a feature extractor
[Diagram: the classification layers of a trained CNN are discarded; the feature extraction layers are kept, and their output feeds another classifier (e.g., SVM or LSTM), an anomaly detector, or clustering — this works great]
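A minimal Keras-style sketch of this reuse; the architecture and the layer name "feature" are hypothetical, for illustration only:

```python
import numpy as np
from tensorflow import keras

# A small CNN trained for classification (hypothetical toy architecture).
cnn = keras.Sequential([
    keras.layers.Conv2D(16, 3, activation="relu", input_shape=(32, 32, 1)),
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(64, activation="relu", name="feature"),
    keras.layers.Dense(10, activation="softmax"),   # classification head
])

# Discard the classification head; keep the feature extraction layers.
extractor = keras.Model(inputs=cnn.input, outputs=cnn.get_layer("feature").output)
features = extractor.predict(np.zeros((1, 32, 32, 1)))   # 64-dim feature for another classifier
print(features.shape)   # (1, 64)
```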
57
The current CNN does not
“understand” characters yet
Adversarial examples
[Abe+, unpublished]
Motivated by [Nguyen+, CVPR2015]
Likelihood values for
classes “A” and “B”
58
On the other hand, CNN can learn “math operations” through images
input images output “image”
showing the sum
[Hoshen+, AAAI, 2016]
59
Visualization for deep learning:
DeCAF [Donahue+, arXiv 2013]
Visualizing the pattern distribution at each
layer
Near to the input layer Near to the output layer
60
Visualization for deep learning:
DeepDream and its relations
Finding an input image that excites a neuron
at a certain layer
https://distill.pub/2017/feature-visualization/
61
Visualization for deep learning:
Layer-wise Relevance Propagation (LRP)
Finding pixels which contribute to the final decision by a backward process
http://www.explain-ai.org/
62
Visualization for deep learning:
Local sensitivity analysis by making a hole
Motivated by [Zeiler+, arXiv, 2013][Ide+, Unpublished]
Likelihood of class “0” degrades a lot by making a hole around the pixel
63
Visualization for deep learning:
Grad-CAM [Selvaraju+, arXiv2016]
Finding pixels which contribute to the final decision by a backward process
http://gradcam.cloudcv.org/
64
tensorflow playground by Google
https://playground.tensorflow.org/
65
Several Variants of DNN/CNN
66
Auto-encoder
(= Nonlinear principal component analysis)
Training the network to output the input
The narrow middle layer gives a compact representation of the input
App: Denoising by convolutional auto-encoder
wikipedia
https://blog.sicara.com/keras-tutorial-content-based-image-retrieval-convolutional-denoising-autoencoder-dc91450cc511
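A minimal dense auto-encoder sketch in Keras; the layer sizes are made-up assumptions (e.g., 784 for a flattened 28x28 image):

```python
from tensorflow import keras

# Trained to reproduce its own input, so the 32-dim middle layer
# becomes a compact representation of the input.
autoencoder = keras.Sequential([
    keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    keras.layers.Dense(32, activation="relu", name="code"),   # compact representation
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(784, activation="sigmoid"),            # reconstruction of the input
])
autoencoder.compile(optimizer="adam", loss="mse")
# autoencoder.fit(X, X, epochs=10)   # note: the target is the input itself
```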
67
U-Net:
Conv-Deconv net that outputs an image
[Ronneberger+, MICCAI2015]
Skip connection
cell
image
cell boundary
image
68
Application to DAR:
Scene text eraser
[Nakamura+, ICDAR2017]
69
Application to DAR:
Binarization
ICDAR-DIBCO2017 Winner (Smart Engines
Ltd, Moscow, Russia) used U-net
[Pratikakis+, ICDAR2017]
70
Application to DAR:
Dewarping [Ma+, CVPR2018]
Stacked U-nets
71
Note: Deep Image Prior
[Ulyanov+, CVPR2018]
The Conv-Deconv structure has inherent characteristics that are suitable for image completion and other “low-pass” operations:
train a conv-deconv net just to generate the left image, but it results in the right image.
72
Generative Adversarial Networks
The battle of two neural networks:
Generator: generates a “fake bill” VS Discriminator: discriminates fake or real bills.
The fake bill becomes more and more realistic.
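For reference, this battle is usually written as the minimax objective from the original GAN formulation (a standard result, stated here without derivation):

$$\min_G \max_D \; \mathbb{E}_{\mathbf{x} \sim p_{\mathrm{data}}}\bigl[\log D(\mathbf{x})\bigr] + \mathbb{E}_{\mathbf{z} \sim p_{\mathbf{z}}}\bigl[\log\bigl(1 - D(G(\mathbf{z}))\bigr)\bigr]$$

The discriminator $D$ pushes real samples toward 1 and generated samples toward 0, while the generator $G$ tries to fool it.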
73
Application to DAR:
(Our) Style-consistent font generation
[Hayashi+, unpublished]
74
Application to DAR:
Oh no... CVPR2018 was filled with Font-GANs
75
Huge variety of GANs:
Just several examples…
Standard GAN (DCGAN), Conditional GAN (conditioned on a class), CycleGAN, StackGAN
https://www.slideshare.net/YunjeyChoi/generative-adversarial-networks-75916964
76
Style Transfer [Gatys+, CVPR2016]
style image
(given)
content image
(given)
generated
image
77
Style Transfer [Gatys+, CVPR2016]
style image
(given)
content image
(given)
generated
image
similar internal
outputs
similar internal
output
78
Application to DAR:
Font Style Transfer
[Gantugs+, DAS2018]
79
SSD (Single Shot MultiBox Detector)
Fully-Conv Net that outputs bounding boxes
[Liu+, ECCV2016]
80
Application to DAR:
EAST: An Efficient and Accurate Scene Text Detector
[Zhou+, “EAST: An Efficient and Accurate Scene Text Detector”, CVPR2017]
Evaluating bounding box shape
81
Long short-term memory (LSTM),
the most typical recurrent neural network
82
LSTM (Long short-term memory):
A recurrent neural network
Recurrent structure → info from all the past
Gate structure → active info selection
input vector → output vector
Also very effective for solving the vanishing gradient problem in the $t$-direction [Graves+, TPAMI2009]
83
Recurrent NN: recurrent structure → info from all the past
LSTM NN: gate structure (input gate, forget gate, output gate) → active info selection
[Graves+, TPAMI2009]
84
Standard LSTM NN-based HWR (handwriting recognition)
Character
class
Feature vector sequence
85
Extension to Bi-directional LSTM
Character
class
Feature vector sequence
Combine the output using the past info with the output using the future info
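A minimal BLSTM classifier sketch in Keras; the feature dimension, hidden size, and class count are assumptions for illustration:

```python
from tensorflow import keras

# A feature vector sequence is read forward and backward by two LSTMs,
# and the two outputs are combined before the character-class decision.
model = keras.Sequential([
    keras.layers.Bidirectional(keras.layers.LSTM(64), input_shape=(None, 20)),
    keras.layers.Dense(26, activation="softmax"),   # character class
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```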
86
Deep BLSTM network
[Frinken-Uchida, ICDAR2015]
Output
layer
Input
layer
LSTM layer
LSTM layer
LSTM layer
87
Application to DAR:
Convolutional Recurrent Neural Network (CRNN)
[Shi+, “An End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene Text Recognition”, IEEE TPAMI, 2017]
88
Image Captioning (CNN+LSTM):
Converting an image to a “document”
[Vinyals+, arXiv2015]
89
Application to DAR:
End-to-end math OCR (Image → LaTeX)
[Deng+, Image-to-Markup Generation with Coarse-to-Fine Attention, arXiv2017]
90
More conventional machine
learning techniques
(SVM, φ-machine, AdaBoost)
91
Support Vector Machines (SVM)
Still the best choice when the amount of data is
insufficient
92
Linear discriminant function
[Figure: training patterns from classes A and B plotted along the axis $x$]
93
Linear discriminant function
[Figure: a linear function $\mathbf{w}^\top \mathbf{x} + b$ over the training patterns]
positive = class A
negative = class B
94
Linear discriminant function
[Same figure: positive = class A, negative = class B]
Some training patterns are misrecognized.
95
Linear discriminant function
[Same figure with a better linear function: no misrecognition!]
96
Which one is the best?
[Figure: several candidate linear functions, each with positive = class A, negative = class B]
All of those functions can recognize all training patterns…
97
Don’t forget unseen patterns…
[Same figure] We might have unseen patterns around the class boundary.
98
Max-margin classification
[Figure: the linear function that leaves the maximum margin on both sides of the boundary]
99
How can we get it?
Minimize the slope under constraints
[Figure: the function value $\mathbf{w}^\top \mathbf{x} + b$ must be at least $+1$ on every class-A pattern and at most $-1$ on every class-B pattern]
100
How can we get it?
Minimize the slope under constraints
[Figure: functions violating the $\pm 1$ constraints are NG; one satisfying them is OK]
101
How can we get it?
Minimize the slope under constraints
[Figure: the constraints act like nails that fix the function in place]
102
How can we get it?
Minimize the slope under constraints
[Figure: the minimum slope satisfying the constraints]
103
How can we get it?
Minimize the slope under constraints
[Figure] It also gives the maximum margin classification!
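In symbols, with labels $y_i \in \{+1, -1\}$ for classes A and B, this is the standard hard-margin SVM problem:

$$\min_{\mathbf{w}, b} \; \frac{1}{2}\|\mathbf{w}\|^2 \quad \text{s.t.} \quad y_i(\mathbf{w}^\top \mathbf{x}_i + b) \ge 1 \;\; \text{for all } i$$

Minimizing $\|\mathbf{w}\|$ (the slope) under these constraints maximizes the margin, which equals $2 / \|\mathbf{w}\|$.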
104
Support vectors
[Figure: the patterns lying exactly on the $+1$ / $-1$ levels are the support vectors (SVs)]
Only those SVs contribute to determining the discriminant function.
105
Two-dimensional case
Minimize the slope
106
No solution that satisfies the constraints:
Not linearly-separable
[Figure: a class-B pattern lies inside class A's region; no linear function can satisfy the $\pm 1$ constraints]
107
A relaxation:
Replace the constraints with a penalty
[Figure: patterns violating the constraints incur a penalty]
Minimize “slope + penalty”
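Written out, this relaxation is the standard soft-margin objective with the hinge loss as the penalty:

$$\min_{\mathbf{w}, b} \; \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_i \max\bigl(0,\; 1 - y_i(\mathbf{w}^\top \mathbf{x}_i + b)\bigr)$$

where the constant $C$ balances the slope term against the total penalty.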
108
φ-machine
An old partner of the linear classifier
The idea of the “kernel” comes from this
109
Mapping the feature vector space
to a higher-dimensional space
[Figure: four points in $(x_1, x_2)$ with labels 0/1 in an XOR-like arrangement: not linearly separable. After the mapping: linearly separable!]
$\phi:\; (x_1, x_2) \;\mapsto\; (x_1,\, x_2,\, x_1 x_2)$
110
What happens in the original space
[Figure: the lifted space with axes $x_1$, $x_2$, $x_1 x_2$]
111
What happens in the original space
Rewrite the lifted coordinates as $(y_1, y_2, y_3)$:
$a y_1 + b y_2 + c y_3 + d = 0$ — a plane in 3D space
112
What happens in the original space
Revert $(y_1, y_2, y_3) = (x_1, x_2, x_1 x_2)$:
$a x_1 + b x_2 + c x_1 x_2 + d = 0$ — ??? What is this?
113
What happens in the original space
[Figure: the XOR-like points with the resulting curved boundary]
Classification boundary: $x_2 = -\dfrac{a x_1 + d}{b + c x_1}$
Linear classification in the higher-dimensional space corresponds to a non-linear classification in the original space.
114
Another example
[Figure: class A points surrounded by class B points in $(x_1, x_2)$; after the mapping, the classes become linearly separable]
$\phi:\; (x_1, x_2) \;\mapsto\; (x_1,\, x_2,\, x_1^2 + x_2^2)$
115
What happens in the original space
A plane in the lifted space, $a x_1 + b x_2 + c(x_1^2 + x_2^2) + d = 0$, becomes a circle in the original $(x_1, x_2)$ space: again, a non-linear classification boundary.
[Figure: class A inside the circle, class B outside]
116
Notes about the φ-machine
Combination with SVM is popular
 The φ-function leads to the “kernel”
Choosing a good mapping is not trivial
In the past, the choice was made by trial and error
Recently….
In the SVM dual, the training patterns appear only through inner products, so the explicit mapping can be replaced by a kernel function:
$\sum_i \alpha_i - \frac{1}{2}\sum_{i,j} \alpha_i \alpha_j y_i y_j\, \mathbf{x}_i^\top \mathbf{x}_j \;\to\; \sum_i \alpha_i - \frac{1}{2}\sum_{i,j} \alpha_i \alpha_j y_i y_j\, \phi(\mathbf{x}_i)^\top \phi(\mathbf{x}_j) = \sum_i \alpha_i - \frac{1}{2}\sum_{i,j} \alpha_i \alpha_j y_i y_j\, k(\mathbf{x}_i, \mathbf{x}_j)$
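With scikit-learn, the kernel supplies the mapping implicitly; a minimal usage sketch (the data below is made up):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0]**2 + X[:, 1]**2 > 1).astype(int)   # circular boundary: not linearly separable

clf = SVC(kernel="rbf")   # the RBF kernel plays the role of phi without computing it
clf.fit(X, y)
print(clf.score(X, y))    # high training accuracy despite the non-linear boundary
```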
117
Deep neural networks can find
a good mapping automatically
 Feature extraction layers = a mapping $\phi$
 The mapping is specified by the weights
 The weights (i.e., $\phi$) are optimized via training
 It is so-called “representation learning”
118
Classifier Ensemble
and AdaBoost
119
Majority voting
Each two-class classifier $g_1, \dots, g_c, \dots, g_C$ returns $+1$ for class A and $-1$ for class B.
For input $\mathbf{x}$, the individual outputs (e.g., $+1, -1, +1, +1$, i.e., A, B, A, A) are summed:
if sum > 0 then A; else B → A
120
Weighted majority voting
Each two-class classifier $g_1, \dots, g_C$ returns $+1$ for class A, $-1$ for class B, and has a weight (e.g., 0.7, 0.02, 0.2, 0.15).
The weighted outputs (e.g., $+0.7, \dots, -0.2, \dots$) are summed:
if sum > 0 then A; else B → A
Well, how do we decide the weights?
121
AdaBoost:
A set of complementary classifiers
1. Train $g_1$ on the training patterns
2. Compute its reliability (e.g., 0.7)
Final decision: if (weighted) sum > 0 then A; else B
122
AdaBoost:
A set of complementary classifiers
3. Give a large (small) weight to each sample which is misrecognized (correctly recognized) by $g_1$
123
AdaBoost:
A set of complementary classifiers
4. Train $g_2$ with the sample weights (patterns with larger weight should be recognized correctly)
5. Compute its reliability (e.g., 0.43)
Final decision: if (weighted) sum > 0 then A; else B
124
AdaBoost:
A set of complementary classifiers
6. Give a large (small) weight to each sample which is misrecognized (correctly recognized) by $g_2$
Repeat until convergence of training accuracy
Final decision: if (weighted) sum > 0 then A; else B
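AdaBoost is simple enough to sketch directly; a minimal version with decision stumps, under the slide's +1/−1 two-class setting (the round count and stump choice are assumptions):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_train(X, y, T=10):
    """y must contain +1 (class A) / -1 (class B). Returns stumps and reliabilities."""
    n = len(y)
    w = np.full(n, 1.0 / n)                      # sample weights, initially uniform
    stumps, alphas = [], []
    for _ in range(T):
        g = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = g.predict(X)
        err = np.sum(w[pred != y]) / np.sum(w)   # weighted training error
        if err >= 0.5:                           # no better than chance: stop
            break
        alpha = 0.5 * np.log((1 - err) / err)    # reliability of this classifier
        w *= np.exp(-alpha * y * pred)           # up-weight misrecognized samples
        w /= w.sum()
        stumps.append(g)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    score = sum(a * g.predict(X) for g, a in zip(stumps, alphas))
    return np.where(score > 0, 1, -1)            # if sum > 0 then A; else B
```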
125
Today I cannot explain the following ML
techniques…
 Semi-supervised learning methods
 ex. constrained clustering, virtual adversarial training,
 Weakly-supervised learning methods
 ex. Multiple-instance learning
 Unsupervised learning methods
 Clustering, self-organizing feature maps, intrinsic dimensionality
 Ensemble methods
 Random forests, ECOC, bagging, random subspace
 Robust regression
 Hidden Markov models, graphical models
 Error-correcting learning (and perceptron)
 Statistical inference
 Esp. Gaussian mixtures, maximum likelihood, Bayesian estimation
126
Concluding remarks:
New DAR research by ML
127
Near-human performance has been
achieved by big data and neural networks
machine printed
handwritten
designed fonts
95.49%
99.79%
99.99%
[Uchida+, ICFHR2016]
[Zhou+, CVPR2017]
Scene text detection
Scene text recognition
CRNN [Shi+, TPAMI, 2017]
F value=0.8 on
ICDAR2015 Incidental scene text
89.6% word recog. rate
on ICDAR2013
128
Now we can imagine
what we can do in the world
129
Beyond 100% = Computer can detect, read,
and collect all text information perfectly
Texts on notebook
Texts on object label
Texts on digital display
Texts on book page
Texts on signboard
Texts on poster / ad
So, what do you want to do
with the perfect recognition results?
130
In fact, our real goal should NOT be perfect recognition results.
[Diagram: poor recognition results → perfect recognition results (a tentative goal) → the real goals]
Real goals:
Ultimate applications using perfect recognition results
Scientific discovery by analyzing perfect recognition results
131
What will you do
in the world beyond 100%?
Ultimate application
 Education
 “Total-recall” for perfect
information search
 Welfare
 Alarm, translation,
information complement
 “Life log”-related apps
 Summary, log compression,
captioning, question
answering, behavior
prediction, reminder
Scientific discovery
 With social science
 Interaction between scene
text and human
 Text statistics
 With design science
 Font shape and impression
 Discovering typographic
knowledge
 With humanities
 Historical knowledge
 Semiology
132
Another direction:
Use characters to understand ML
Simple binary and stroke-structured patterns
Less background clutter
Small size (ex. 32x32)
Big data (ex. 80,000 samples / class)
Predefined classes (ex. 10 classes for digits)
ML has achieved near-human performance
Very good “testbed” for
not only evaluating but also understanding ML
133
The last message...
... and please do NOT become an accuracist, parameter-tuner, or libraholic!

Mais conteúdo relacionado

Mais procurados

Credit card fraud detection using random forest & cart algorithm
Credit card fraud detection using random forest & cart algorithmCredit card fraud detection using random forest & cart algorithm
Credit card fraud detection using random forest & cart algorithmVenkat Projects
 
Webinar on Graph Neural Networks
Webinar on Graph Neural NetworksWebinar on Graph Neural Networks
Webinar on Graph Neural NetworksLucaCrociani1
 
Object Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkObject Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkNader Karimi
 
Machine Learning and Inductive Inference
Machine Learning and Inductive InferenceMachine Learning and Inductive Inference
Machine Learning and Inductive Inferencebutest
 
Adaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud DetectionAdaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud DetectionAndrea Dal Pozzolo
 
Unified Approach to Interpret Machine Learning Model: SHAP + LIME
Unified Approach to Interpret Machine Learning Model: SHAP + LIMEUnified Approach to Interpret Machine Learning Model: SHAP + LIME
Unified Approach to Interpret Machine Learning Model: SHAP + LIMEDatabricks
 
Intro to deep learning
Intro to deep learning Intro to deep learning
Intro to deep learning David Voyles
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learningTamir Taha
 
GAN - Theory and Applications
GAN - Theory and ApplicationsGAN - Theory and Applications
GAN - Theory and ApplicationsEmanuele Ghelfi
 
畳み込みニューラルネットワークの研究動向
畳み込みニューラルネットワークの研究動向畳み込みニューラルネットワークの研究動向
畳み込みニューラルネットワークの研究動向Yusuke Uchida
 
IRJET- Fake News Detection using Logistic Regression
IRJET- Fake News Detection using Logistic RegressionIRJET- Fake News Detection using Logistic Regression
IRJET- Fake News Detection using Logistic RegressionIRJET Journal
 
Smart Data Slides: Machine Learning - Case Studies
Smart Data Slides: Machine Learning - Case StudiesSmart Data Slides: Machine Learning - Case Studies
Smart Data Slides: Machine Learning - Case StudiesDATAVERSITY
 
An introduction to machine learning in biomedical research: Key concepts, pr...
An introduction to machine learning in biomedical research:  Key concepts, pr...An introduction to machine learning in biomedical research:  Key concepts, pr...
An introduction to machine learning in biomedical research: Key concepts, pr...FranciscoJAzuajeG
 
Anomaly Detection Using Isolation Forests
Anomaly Detection Using Isolation ForestsAnomaly Detection Using Isolation Forests
Anomaly Detection Using Isolation ForestsTuri, Inc.
 
Parametric & Non-Parametric Machine Learning (Supervised ML)
Parametric & Non-Parametric Machine Learning (Supervised ML)Parametric & Non-Parametric Machine Learning (Supervised ML)
Parametric & Non-Parametric Machine Learning (Supervised ML)Rehan Guha
 
【DL輪読会】Free Lunch for Few-shot Learning: Distribution Calibration
【DL輪読会】Free Lunch for Few-shot Learning: Distribution Calibration【DL輪読会】Free Lunch for Few-shot Learning: Distribution Calibration
【DL輪読会】Free Lunch for Few-shot Learning: Distribution CalibrationDeep Learning JP
 
Credit Card Fraudulent Transaction Detection Research Paper
Credit Card Fraudulent Transaction Detection Research PaperCredit Card Fraudulent Transaction Detection Research Paper
Credit Card Fraudulent Transaction Detection Research PaperGarvit Burad
 
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...Preferred Networks
 
大域的探索から局所的探索へデータ拡張 (Data Augmentation)を用いた学習の探索テクニック
大域的探索から局所的探索へデータ拡張 (Data Augmentation)を用いた学習の探索テクニック 大域的探索から局所的探索へデータ拡張 (Data Augmentation)を用いた学習の探索テクニック
大域的探索から局所的探索へデータ拡張 (Data Augmentation)を用いた学習の探索テクニック 西岡 賢一郎
 

Mais procurados (20)

Credit card fraud detection using random forest & cart algorithm
Credit card fraud detection using random forest & cart algorithmCredit card fraud detection using random forest & cart algorithm
Credit card fraud detection using random forest & cart algorithm
 
Deep learning
Deep learningDeep learning
Deep learning
 
Webinar on Graph Neural Networks
Webinar on Graph Neural NetworksWebinar on Graph Neural Networks
Webinar on Graph Neural Networks
 
Object Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkObject Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning Framework
 
Machine Learning and Inductive Inference
Machine Learning and Inductive InferenceMachine Learning and Inductive Inference
Machine Learning and Inductive Inference
 
Adaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud DetectionAdaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud Detection
 
Unified Approach to Interpret Machine Learning Model: SHAP + LIME
Unified Approach to Interpret Machine Learning Model: SHAP + LIMEUnified Approach to Interpret Machine Learning Model: SHAP + LIME
Unified Approach to Interpret Machine Learning Model: SHAP + LIME
 
Intro to deep learning
Intro to deep learning Intro to deep learning
Intro to deep learning
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learning
 
GAN - Theory and Applications
GAN - Theory and ApplicationsGAN - Theory and Applications
GAN - Theory and Applications
 
畳み込みニューラルネットワークの研究動向
畳み込みニューラルネットワークの研究動向畳み込みニューラルネットワークの研究動向
畳み込みニューラルネットワークの研究動向
 
IRJET- Fake News Detection using Logistic Regression
IRJET- Fake News Detection using Logistic RegressionIRJET- Fake News Detection using Logistic Regression
IRJET- Fake News Detection using Logistic Regression
 
Smart Data Slides: Machine Learning - Case Studies
Smart Data Slides: Machine Learning - Case StudiesSmart Data Slides: Machine Learning - Case Studies
Smart Data Slides: Machine Learning - Case Studies
 
An introduction to machine learning in biomedical research: Key concepts, pr...
An introduction to machine learning in biomedical research:  Key concepts, pr...An introduction to machine learning in biomedical research:  Key concepts, pr...
An introduction to machine learning in biomedical research: Key concepts, pr...
 
Anomaly Detection Using Isolation Forests
Anomaly Detection Using Isolation ForestsAnomaly Detection Using Isolation Forests
Anomaly Detection Using Isolation Forests
 
Parametric & Non-Parametric Machine Learning (Supervised ML)
Parametric & Non-Parametric Machine Learning (Supervised ML)Parametric & Non-Parametric Machine Learning (Supervised ML)
Parametric & Non-Parametric Machine Learning (Supervised ML)
 
【DL輪読会】Free Lunch for Few-shot Learning: Distribution Calibration
【DL輪読会】Free Lunch for Few-shot Learning: Distribution Calibration【DL輪読会】Free Lunch for Few-shot Learning: Distribution Calibration
【DL輪読会】Free Lunch for Few-shot Learning: Distribution Calibration
 
Credit Card Fraudulent Transaction Detection Research Paper
Credit Card Fraudulent Transaction Detection Research PaperCredit Card Fraudulent Transaction Detection Research Paper
Credit Card Fraudulent Transaction Detection Research Paper
 
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
 
大域的探索から局所的探索へデータ拡張 (Data Augmentation)を用いた学習の探索テクニック
大域的探索から局所的探索へデータ拡張 (Data Augmentation)を用いた学習の探索テクニック 大域的探索から局所的探索へデータ拡張 (Data Augmentation)を用いた学習の探索テクニック
大域的探索から局所的探索へデータ拡張 (Data Augmentation)を用いた学習の探索テクニック
 

Semelhante a Machine learning for document analysis and understanding

Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Gaurav Mittal
 
Deep Learning, Keras, and TensorFlow
Deep Learning, Keras, and TensorFlowDeep Learning, Keras, and TensorFlow
Deep Learning, Keras, and TensorFlowOswald Campesato
 
Deep learning for molecules, introduction to chainer chemistry
Deep learning for molecules, introduction to chainer chemistryDeep learning for molecules, introduction to chainer chemistry
Deep learning for molecules, introduction to chainer chemistryKenta Oono
 
Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...
Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...
Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...Jedha Bootcamp
 
Artificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningArtificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningSujit Pal
 
Paper review: Measuring the Intrinsic Dimension of Objective Landscapes.
Paper review: Measuring the Intrinsic Dimension of Objective Landscapes.Paper review: Measuring the Intrinsic Dimension of Objective Landscapes.
Paper review: Measuring the Intrinsic Dimension of Objective Landscapes.Wuhyun Rico Shin
 
AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...
AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...
AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...Tomoyuki Suzuki
 
Applying your Convolutional Neural Networks
Applying your Convolutional Neural NetworksApplying your Convolutional Neural Networks
Applying your Convolutional Neural NetworksDatabricks
 
nlp dl 1.pdf
nlp dl 1.pdfnlp dl 1.pdf
nlp dl 1.pdfnyomans1
 
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習台灣資料科學年會
 
Document Analysis with Deep Learning
Document Analysis with Deep LearningDocument Analysis with Deep Learning
Document Analysis with Deep Learningaiaioo
 
Citython presentation
Citython presentationCitython presentation
Citython presentationAnkit Tewari
 
Data Summer Conf 2018, “From the math to the business value: machine learning...
Data Summer Conf 2018, “From the math to the business value: machine learning...Data Summer Conf 2018, “From the math to the business value: machine learning...
Data Summer Conf 2018, “From the math to the business value: machine learning...Provectus
 
Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)
Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)
Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)Universitat Politècnica de Catalunya
 
KaoNet: Face Recognition and Generation App using Deep Learning
KaoNet: Face Recognition and Generation App using Deep LearningKaoNet: Face Recognition and Generation App using Deep Learning
KaoNet: Face Recognition and Generation App using Deep LearningVan Huy
 
Machine Learning in Computer Chess: Genetic Programming and KRK
Machine Learning in Computer Chess: Genetic Programming and KRKMachine Learning in Computer Chess: Genetic Programming and KRK
Machine Learning in Computer Chess: Genetic Programming and KRKbutest
 
Functional Programming with Immutable Data Structures
Functional Programming with Immutable Data StructuresFunctional Programming with Immutable Data Structures
Functional Programming with Immutable Data Structureselliando dias
 
brief Introduction to Different Kinds of GANs
brief Introduction to Different Kinds of GANsbrief Introduction to Different Kinds of GANs
brief Introduction to Different Kinds of GANsParham Zilouchian
 

Semelhante a Machine learning for document analysis and understanding (20)

Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)
 
Java and Deep Learning
Java and Deep LearningJava and Deep Learning
Java and Deep Learning
 
Deep Learning, Keras, and TensorFlow
Deep Learning, Keras, and TensorFlowDeep Learning, Keras, and TensorFlow
Deep Learning, Keras, and TensorFlow
 
Deep learning for molecules, introduction to chainer chemistry
Deep learning for molecules, introduction to chainer chemistryDeep learning for molecules, introduction to chainer chemistry
Deep learning for molecules, introduction to chainer chemistry
 
Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...
Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...
Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...
 
Artificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningArtificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep Learning
 
Paper review: Measuring the Intrinsic Dimension of Objective Landscapes.
Paper review: Measuring the Intrinsic Dimension of Objective Landscapes.Paper review: Measuring the Intrinsic Dimension of Objective Landscapes.
Paper review: Measuring the Intrinsic Dimension of Objective Landscapes.
 
AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...
AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...
AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...
 
Applying your Convolutional Neural Networks
Applying your Convolutional Neural NetworksApplying your Convolutional Neural Networks
Applying your Convolutional Neural Networks
 
Deep Learning for Computer Vision: Attention Models (UPC 2016)
Deep Learning for Computer Vision: Attention Models (UPC 2016)Deep Learning for Computer Vision: Attention Models (UPC 2016)
Deep Learning for Computer Vision: Attention Models (UPC 2016)
 
nlp dl 1.pdf
nlp dl 1.pdfnlp dl 1.pdf
nlp dl 1.pdf
 
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習
 
Document Analysis with Deep Learning
Document Analysis with Deep LearningDocument Analysis with Deep Learning
Document Analysis with Deep Learning
 
Citython presentation
Citython presentationCitython presentation
Citython presentation
 
Data Summer Conf 2018, “From the math to the business value: machine learning...
Data Summer Conf 2018, “From the math to the business value: machine learning...Data Summer Conf 2018, “From the math to the business value: machine learning...
Data Summer Conf 2018, “From the math to the business value: machine learning...
 
Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)
Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)
Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)
 
KaoNet: Face Recognition and Generation App using Deep Learning
KaoNet: Face Recognition and Generation App using Deep LearningKaoNet: Face Recognition and Generation App using Deep Learning
KaoNet: Face Recognition and Generation App using Deep Learning
 
Machine Learning in Computer Chess: Genetic Programming and KRK
Machine Learning in Computer Chess: Genetic Programming and KRKMachine Learning in Computer Chess: Genetic Programming and KRK
Machine Learning in Computer Chess: Genetic Programming and KRK
 
Functional Programming with Immutable Data Structures
Functional Programming with Immutable Data StructuresFunctional Programming with Immutable Data Structures
Functional Programming with Immutable Data Structures
 
brief Introduction to Different Kinds of GANs
brief Introduction to Different Kinds of GANsbrief Introduction to Different Kinds of GANs
brief Introduction to Different Kinds of GANs
 

Mais de Seiichi Uchida

1 データとデータ分析
1 データとデータ分析1 データとデータ分析
1 データとデータ分析Seiichi Uchida
 
13 分類とパターン認識
13 分類とパターン認識13 分類とパターン認識
13 分類とパターン認識Seiichi Uchida
 
12 非構造化データ解析
12 非構造化データ解析12 非構造化データ解析
12 非構造化データ解析Seiichi Uchida
 
0 データサイエンス概論まえがき
0 データサイエンス概論まえがき0 データサイエンス概論まえがき
0 データサイエンス概論まえがきSeiichi Uchida
 
14 データ収集とバイアス
14 データ収集とバイアス14 データ収集とバイアス
14 データ収集とバイアスSeiichi Uchida
 
10 確率と確率分布
10 確率と確率分布10 確率と確率分布
10 確率と確率分布Seiichi Uchida
 
8 予測と回帰分析
8 予測と回帰分析8 予測と回帰分析
8 予測と回帰分析Seiichi Uchida
 
6 線形代数に基づくデータ解析の基礎
6 線形代数に基づくデータ解析の基礎6 線形代数に基づくデータ解析の基礎
6 線形代数に基づくデータ解析の基礎Seiichi Uchida
 
5 クラスタリングと異常検出
5 クラスタリングと異常検出5 クラスタリングと異常検出
5 クラスタリングと異常検出Seiichi Uchida
 
4 データ間の距離と類似度
4 データ間の距離と類似度4 データ間の距離と類似度
4 データ間の距離と類似度Seiichi Uchida
 
3 平均・分散・相関
3 平均・分散・相関3 平均・分散・相関
3 平均・分散・相関Seiichi Uchida
 
2 データのベクトル表現と集合
2 データのベクトル表現と集合2 データのベクトル表現と集合
2 データのベクトル表現と集合Seiichi Uchida
 
「あなたがいま読んでいるものは文字です」~画像情報学から見た文字研究のこれから
「あなたがいま読んでいるものは文字です」~画像情報学から見た文字研究のこれから「あなたがいま読んでいるものは文字です」~画像情報学から見た文字研究のこれから
「あなたがいま読んでいるものは文字です」~画像情報学から見た文字研究のこれからSeiichi Uchida
 
データサイエンス概論第一=8 パターン認識と深層学習
データサイエンス概論第一=8 パターン認識と深層学習データサイエンス概論第一=8 パターン認識と深層学習
データサイエンス概論第一=8 パターン認識と深層学習Seiichi Uchida
 
データサイエンス概論第一=7 画像処理
データサイエンス概論第一=7 画像処理データサイエンス概論第一=7 画像処理
データサイエンス概論第一=7 画像処理Seiichi Uchida
 
An opening talk at ICDAR2017 Future Workshop - Beyond 100%
An opening talk at ICDAR2017 Future Workshop - Beyond 100%An opening talk at ICDAR2017 Future Workshop - Beyond 100%
An opening talk at ICDAR2017 Future Workshop - Beyond 100%Seiichi Uchida
 
データサイエンス概論第一 6 異常検出
データサイエンス概論第一 6 異常検出データサイエンス概論第一 6 異常検出
データサイエンス概論第一 6 異常検出Seiichi Uchida
 

Mais de Seiichi Uchida (20)

1 データとデータ分析
1 データとデータ分析1 データとデータ分析
1 データとデータ分析
 
9 可視化
9 可視化9 可視化
9 可視化
 
13 分類とパターン認識
13 分類とパターン認識13 分類とパターン認識
13 分類とパターン認識
 
12 非構造化データ解析
12 非構造化データ解析12 非構造化データ解析
12 非構造化データ解析
 
0 データサイエンス概論まえがき
0 データサイエンス概論まえがき0 データサイエンス概論まえがき
0 データサイエンス概論まえがき
 
15 人工知能入門
15 人工知能入門15 人工知能入門
15 人工知能入門
 
14 データ収集とバイアス
14 データ収集とバイアス14 データ収集とバイアス
14 データ収集とバイアス
 
10 確率と確率分布
10 確率と確率分布10 確率と確率分布
10 確率と確率分布
 
8 予測と回帰分析
8 予測と回帰分析8 予測と回帰分析
8 予測と回帰分析
 
7 主成分分析
7 主成分分析7 主成分分析
7 主成分分析
 
6 線形代数に基づくデータ解析の基礎
6 線形代数に基づくデータ解析の基礎6 線形代数に基づくデータ解析の基礎
6 線形代数に基づくデータ解析の基礎
 
5 クラスタリングと異常検出
5 クラスタリングと異常検出5 クラスタリングと異常検出
5 クラスタリングと異常検出
 
4 データ間の距離と類似度
4 データ間の距離と類似度4 データ間の距離と類似度
4 データ間の距離と類似度
 
3 平均・分散・相関
3 平均・分散・相関3 平均・分散・相関
3 平均・分散・相関
 
2 データのベクトル表現と集合
2 データのベクトル表現と集合2 データのベクトル表現と集合
2 データのベクトル表現と集合
 
「あなたがいま読んでいるものは文字です」~画像情報学から見た文字研究のこれから
「あなたがいま読んでいるものは文字です」~画像情報学から見た文字研究のこれから「あなたがいま読んでいるものは文字です」~画像情報学から見た文字研究のこれから
「あなたがいま読んでいるものは文字です」~画像情報学から見た文字研究のこれから
 
データサイエンス概論第一=8 パターン認識と深層学習
データサイエンス概論第一=8 パターン認識と深層学習データサイエンス概論第一=8 パターン認識と深層学習
データサイエンス概論第一=8 パターン認識と深層学習
 
データサイエンス概論第一=7 画像処理
データサイエンス概論第一=7 画像処理データサイエンス概論第一=7 画像処理
データサイエンス概論第一=7 画像処理
 
An opening talk at ICDAR2017 Future Workshop - Beyond 100%
An opening talk at ICDAR2017 Future Workshop - Beyond 100%An opening talk at ICDAR2017 Future Workshop - Beyond 100%
An opening talk at ICDAR2017 Future Workshop - Beyond 100%
 
データサイエンス概論第一 6 異常検出
データサイエンス概論第一 6 異常検出データサイエンス概論第一 6 異常検出
データサイエンス概論第一 6 異常検出
 

Último

DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxUnduhUnggah1
 

Último (20)

DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docx
 

Machine learning for document analysis and understanding

  • 1. 1 Machine learning for document analysis and understanding TC10/TC11 Summer School on Document Analysis: Traditional Approaches and New Trends @La Rochelle, France. 8:30-10:30, 4th July 2018 Seiichi Uchida, Kyushu University, Japan
  • 2. 2 The Nearest Neighbor Method The simplest ML for pattern recognition; Everything starts from it!
  • 3. 3 The nearest neighbor method: Learning = memorizing Input PorkBeef Orange Watermelon Pineapple Fish Which reference pattern is the most similar? Reference patterns
  • 4. 4 Each pattern is represented as a feature vector Color feature Texture feature Pork=(10, 2.5, 4.3) *Those numbers are just a random example Note: In the classical nearest neighbor method, those features are designed by human
  • 5. 5 A different pattern becomes a different feature vector Beef = (8, 2.6, 0.9) *Those numbers are just a random example Pork=(10, 2.5, 4.3) *Those numbers are just a random example Color feature Texture feature
  • 6. 6 Reference patterns in the feature vector space Color feature Texture feature
  • 7. 7 An input pattern in the feature vector space We want to recognize this input x Color feature Texture feature
  • 8. 8 input x Nearest neighbor method in the feature vector space Nearest neighbor input = orange Color feature Texture feature
  • 9. 99 How do you define “the nearest neighbor”? Distance-based The smallest distance gives the nearest neighbor Ex. • Euclidean distance / Similarity-based The largest similarity gives the nearest neighbor Ex. • Inner product • Cosine similarity 𝐱 𝐲 𝐱 𝐲 x ?
  • 10. 1010 Do you remember an important property of “inner product”? If and are in the similar direction, their inner product becomes larger The inner product evaluates the similarity between and
  • 11. 11 Well, two different types of features (Note: important to understand deep learning) Features defined by the pattern itself  Orange pixels→ Many  Blue pixels → Rare  Roundness → High  Symmetry →High  Texture → Fine … Features defined by the similarity to others  Similarity to ”car” → Low  Similarity to ”apple” → High  Similarity to “monkey”→Low  Similarity to “Kaki” (persimmon) →Very high …
  • 12. 12 The nearest neighbor method with similarity-based feature vectors Similarity to “Kaki” Similarity to “car” Important note: Similarity is used for not only feature extraction but also classification
  • 13. 13 A shallow explanation of neural networks Don’t think it is a black box. If you know “inner-product”, it becomes
  • 14. 14 The neuron – its reality https://commons.wikimedia.org/
  • 15. 15 From reality to computational model https://commons.wikimedia.org/ input g  xgx 1x jx dx 1w jw dw f  xg
  • 16. 16 The neuron by computer Σ  xg 1x jx dx 1 …… b  bf bxwfg T d j jj            xw x     1 )( x 1w jw dw f f: non-linear func. input output
  • 17. 17 The neuron by computer Σ  xg 1x jx dx 1 …… b x 1w jw dw f f: non-linear func. Let’s forget  bf bxwfg T d j jj            xw x     1 )(
  • 18. 18 The neuron by computer Σ  xg 1x jx dx 1 …… b x 1w jw dwLet’s forget   d j jj bxwg 1 )(x
  • 19. 19 The neuron by computer Σ 1x jx dx …… xwT just “inner product” of two vectors x 1w jw dw w xw x T d j jj xwg        1 )(
  • 20. 20 So, a neuron calculate… xwT Σ 1x jx dx …… 1w jw dw xw andbetweensimilarityA =0.9 if they are similar =0.02 if they are dissimilar
  • 21. 21 So, if we have K neurons, we have a K-dimensional similarity-based feature vector …               xw xw xw T K T T  2 11w 2w Kwx 1x jx dx 0.9 0.05 0.75 x
  • 22. 22 K-dimensional similarity-based feature vector by K neurons 0.9 0.05 0.75 input equiv. similarity to similarity to
  • 23. 23 Another function of the inner product Similarity-based classification! (Yes, the nearest neighbor method!) Σ 1x jx dx …… x reference pattern of class k
  • 24. 24 Note: Multiple functions are realized by just combining neurons! Just by layering the neuron elements, we can have a complete recognition system! … Feature extraction 1w Kw 1x jx dx …… 2w Classification AV CV BV Similarity to class A Similarity to class B Similarity to class C Choose max
  • 25. 25 Now the time for deep neural networks 1x jx dx feature extraction layers …… … f f f … classification f f f
  • 26. 26 An example: AlexNet “Deep” neural network called AlexNet A Krizhevsky, NIPS2012 feature extraction layers classification layers
  • 27. 27 Now the time for deep neural networks 1x jx dx feature extraction layers …… … f f f … Classification f f f Why do we need to repeat feature extraction?
  • 28. 28 Why do we need to repeat feature extraction? A D C B E F A difficult classification task
  • 29. 29 Why do we need to repeat feature extraction? A D C B E F 1w2w
  • 30. 30 Why do we need to repeat feature extraction? A D C B E F 1w2w F A B C D E Large similarity to 𝐰 Small similarity to 𝐰 similarity to similarity to𝐰 Note: The lower picture is not very accurate (because it does not use inner-product-based but distance-based space transformation. However I believe that it does not seriously damage the explanation here.
  • 31. 31 Why do we need to repeat feature extraction? A D C B E F 1w2w F A B C D E It becomes more separable but still not very separable similarity to similarity to𝐰
  • 32. 32 Why do we need to repeat feature extraction? A D C B E F 1w2w F A B C D E3w 4w similarity to similarity to𝐰
  • 33. 33 Why do we need to repeat feature extraction? A D C B E F 1w2w F A B C D E3w 4w similarity to similarity to𝐰 A D E B C F similarity to similarity to𝐰
  • 34. 34 F A B C D E3w 4w similarity to similarity to𝐰 A D E B C F similarity to similarity to𝐰 Why do we need to repeat feature extraction? A D C B E F 1w2w 3w 4w Wow, they become separable!
  • 35. 35 Why do we need to repeat feature extraction? A D C B E F 1w2w Now two classes become totally separable by 2v 1v A D E B C F similarity to similarity to𝐰 A D E B C F similarity to similarity to𝐰 F A B C D E3w 4w similarity to similarity to𝐰
  • 36. 36 Remembering the non-linear function Σ  xg 1x jx dx 1 …… b x 1w jw dw f f: non-linear func.
  • 37. 37 The typical non-linear function: Rectified linear function (ReLU) Σ  xg 1x jx dx 1 …… b x 1w jw dw f Rectified linear function
  • 38. 3838 How does ReLU affect the similarity-based feature? Minus elements in the feature vector are forced to be zero xwT 1 xwT K f Unchanged Unchanged f
  • 39. 39 How to train neural networks: Super-superficial explanation
  • 40. 40 In order to realize a DNN with an expected “input-output” relation … 1w Kw 1x jx dx …… 2w AV CV BV Similarity to class A Similarity to class B Similarity to class C Those parameters should be tuned 1w AV2w
  • 41. 41 Training DNN; the goal Class B Class A DNN Knobs Perfect classification boundary Note: Actual number of #knobs (=#parameters)
  • 42. 42 Training DNN; error-correcting learning by back propagation NG tuning NG NG NG Initial status tuning OK, end. boundary
  • 43. 43 Advanced topic: why does (SGD-based) back-propagation work? Much theoretical research has been done [Choromanska+, PMLR2015] [Wu+, arXiv2017]: under several assumptions, local minima are close to the global minimum, and the loss surface has a flat basin.
  • 44. 44 Knob = weight = a pattern for the similarity-based feature: each weight vector is itself a pattern to which the neuron measures similarity, and this pattern is automatically derived through training. [figure]
  • 45. 45 The optimal features are extracted automatically through training (representation learning); the patterns the weights respond to are determined automatically, e.g., Google's "cat" neuron. https://googleblog.blogspot.jp/2012/06/
  • 46. 46 DNN for image classification: Convolutional neural networks (CNN)
  • 47. 47 How to deal with images in a DNN? If an image is flattened into one huge vector $\mathbf{x}$ and a neuron computes $\mathbf{w}_k^T\mathbf{x}$ over the whole image, both vectors can be, e.g., 400-million-dimensional: ① intractable computations, ② enormous parameters.
  • 48. 48 Convolution = repeating "local inner product" operations = linear filtering: each output is $\mathbf{w}_k^T\mathbf{x}_{i,j}$, where $\mathbf{x}_{i,j}$ is a low-dimensional local patch: ① tractable computations, ② a trainable number of parameters.
  • 49. 49 Convolutional layer: use the same weight (filter coefficients) $\mathbf{w}_k$ at all locations $(i, j)$; the result is a "filtered" image. [figure]
  • 50. 50 Pooling layer: within each local region, keep only the maximum value: ① deformation compensation, ② local info aggregation. [figure]
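A toy NumPy sketch of both operations, i.e., convolution as repeated local inner products followed by 2×2 max pooling (the image size and the filter are made up):

```python
import numpy as np

def conv2d(img, w):
    k = w.shape[0]
    out = np.empty((img.shape[0] - k + 1, img.shape[1] - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(w * img[i:i+k, j:j+k])  # inner product with the local patch x_ij
    return out

def maxpool2x2(fmap):
    h, w = fmap.shape[0] // 2, fmap.shape[1] // 2
    return fmap[:2*h, :2*w].reshape(h, 2, w, 2).max(axis=(1, 3))  # keep only the max

img = np.random.default_rng(0).random((8, 8))   # a toy 8x8 "image"
w = np.array([[1., 0., -1.]] * 3)               # the same filter is used at all locations
print(maxpool2x2(conv2d(img, w)).shape)         # (3, 3)
```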
  • 51. 51 Application to DAR: isolated character recognition of machine-printed / handwritten / designed-font characters, with accuracies of 95.49%, 99.79%, and 99.99% [Uchida+, ICFHR2016]: near-human performance.
  • 52. 52 Application to DAR: breaking Captcha: 99.8% accuracy with 1 million training samples [Goodfellow+, arXiv, 2014].
  • 53. 53 Application to DAR: detecting a multi-part component in a character image [Iwana+, ICDAR2017]. Q: Can a CNN detect complex components accurately?
  • 54. 54 Application to DAR: Font Recognition (DeepFont) [Wang+, ACMMM2015]
  • 56. 56 A CNN can be used as a feature extractor: discard the classification layers and feed the output of the feature extraction layers to another classifier (e.g., an SVM or LSTM), an anomaly detector, or clustering. [figure]
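A common recipe for this; a sketch assuming PyTorch/torchvision and scikit-learn are available (the choice of ResNet-18, its weights string, and the input preparation are arbitrary assumptions, not from the slides):

```python
import torch
import torchvision.models as models
from sklearn.svm import SVC

cnn = models.resnet18(weights="IMAGENET1K_V1")  # any pretrained CNN would do
cnn.fc = torch.nn.Identity()                    # discard the classification layer
cnn.eval()

with torch.no_grad():
    feats = cnn(images)    # images: an (N, 3, 224, 224) tensor, prepared elsewhere
svm = SVC().fit(feats.numpy(), labels)          # labels: a length-N array, prepared elsewhere
```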
  • 57. 57 The current CNN does not "understand" characters yet: adversarial examples (non-character images with high likelihood values for classes "A" and "B") [Abe+, unpublished], motivated by [Nguyen+, CVPR2015]. [figure]
  • 58. 58 On the other hand, a CNN can learn "math operations" through images: it takes input images and outputs an "image" showing the sum [Hoshen+, AAAI, 2016].
  • 59. 59 Visualization for deep learning: DeCAF [Donahue+, arXiv 2013]: visualizing the pattern distribution at each layer, from near the input layer to near the output layer. [figure]
  • 60. 60 Visualization for deep learning: DeepDream and its relatives: finding an input image that excites a neuron at a certain layer. https://distill.pub/2017/feature-visualization/
  • 61. 61 Visualization for deep learning: Layer-wise Relevance Propagation (LRP): finding the pixels which contribute to the final decision by a backward process. http://www.explain-ai.org/
  • 62. 62 Visualization for deep learning: local sensitivity analysis by making a hole: the likelihood of class "0" degrades a lot when a hole is made around important pixels [Ide+, unpublished], motivated by [Zeiler+, arXiv, 2013].
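A sketch of this occlusion analysis; `model` is a hypothetical function returning class likelihoods for a 2-D image, and the hole size is an arbitrary choice:

```python
import numpy as np

def sensitivity_map(model, x, target_class, hole=4):
    base = model(x)[target_class]                        # likelihood on the intact image
    smap = np.zeros_like(x)
    for i in range(x.shape[0] - hole + 1):
        for j in range(x.shape[1] - hole + 1):
            occluded = x.copy()
            occluded[i:i+hole, j:j+hole] = 0.0           # make a hole
            drop = base - model(occluded)[target_class]  # how much the likelihood degrades
            smap[i:i+hole, j:j+hole] = np.maximum(smap[i:i+hole, j:j+hole], drop)
    return smap   # large values = pixels the decision depends on
```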
  • 63. 63 Visualization for deep learning: Grad-CAM [Selvaraju+, arXiv2016]: finding the pixels which contribute to the final decision by a backward process. http://gradcam.cloudcv.org/
  • 64. 64 TensorFlow Playground by Google: https://playground.tensorflow.org/
  • 66. 66 Auto-encoder (= nonlinear principal component analysis): training the network to output its own input, which forces a compact representation of the input in the middle. Application: denoising by a convolutional auto-encoder. wikipedia https://blog.sicara.com/keras-tutorial-content-based-image-retrieval-convolutional-denoising-autoencoder-dc91450cc511
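A minimal auto-encoder sketch, assuming PyTorch; the fully-connected architecture and all sizes are arbitrary choices for illustration:

```python
import torch
import torch.nn as nn

ae = nn.Sequential(
    nn.Linear(784, 32), nn.ReLU(),    # encoder: compress to a 32-dim representation
    nn.Linear(32, 784), nn.Sigmoid()  # decoder: reconstruct the input
)
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def train_step(x):                 # x: a (batch, 784) tensor, prepared elsewhere
    opt.zero_grad()
    loss = loss_fn(ae(x), x)       # the target is the input itself
    loss.backward()
    opt.step()
    return loss.item()
    # for denoising: feed a noisy version of x in and keep the clean x as the target
```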
  • 67. 67 U-Net: a conv–deconv net that outputs an image, with skip connections; e.g., a cell image in, a cell boundary image out [Ronneberger+, MICCAI2015].
  • 68. 68 Application to DAR: Scene text eraser [Nakamura+, ICDAR2017]
  • 69. 69 Application to DAR: binarization: the ICDAR-DIBCO2017 winner (Smart Engines Ltd, Moscow, Russia) used a U-net [Pratikakis+, ICDAR2017].
  • 70. 70 Application to DAR: dewarping with stacked U-nets [Ma+, CVPR2018].
  • 71. 71 Note: Deep Image Prior [Ulyanov+, CVPR2018]: the conv–deconv structure has an inherent characteristic that suits image completion and other "low-pass" operations: training a conv–deconv net just to generate the left image nevertheless yields the right image. [figure]
  • 72. 72 Generative Adversarial Networks: the battle of two neural networks. A generator generates "fake bills" while a discriminator discriminates fake bills from real ones; through this competition, the fake bills become more and more realistic.
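A compact sketch of this two-player training loop, assuming PyTorch; the architectures, sizes, and learning rates are made up:

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 784), nn.Tanh())            # generator
D = nn.Sequential(nn.Linear(784, 64), nn.LeakyReLU(0.2), nn.Linear(64, 1), nn.Sigmoid())  # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

def train_step(real):                    # real: a (batch, 784) tensor, prepared elsewhere
    n = real.size(0)
    fake = G(torch.randn(n, 16))
    opt_d.zero_grad()                    # discriminator: real -> 1, fake -> 0
    loss_d = bce(D(real), torch.ones(n, 1)) + bce(D(fake.detach()), torch.zeros(n, 1))
    loss_d.backward(); opt_d.step()
    opt_g.zero_grad()                    # generator: fool the discriminator (fake -> 1)
    loss_g = bce(D(fake), torch.ones(n, 1))
    loss_g.backward(); opt_g.step()
```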
  • 73. 73 Application to DAR: (Our) Style-consistent font generation [Hayashi+, unpublished]
  • 74. 74 Application to DAR: oh no... CVPR2018 was filled with font GANs.
  • 75. 75 Huge variety of GANs, just several examples: standard GAN (DCGAN), Conditional GAN (conditioned on a class), CycleGAN, StackGAN. https://www.slideshare.net/YunjeyChoi/generative-adversarial-networks-75916964
  • 76. 76 Style transfer [Gatys+, CVPR2016]: given a style image and a content image, generate an image that combines both. [figure]
  • 77. 77 Style transfer [Gatys+, CVPR2016]: the generated image is optimized so that its internal network outputs are similar to those of the content image and of the style image. [figure]
  • 78. 78 Application to DAR: Font Style Transfer [Gantugs+, DAS2018]
  • 79. 79 SSD (Single Shot MultiBox Detector): a fully-convolutional net that outputs bounding boxes [Liu+, ECCV2016].
  • 80. 80 Application to DAR: EAST, an efficient and accurate scene text detector that directly evaluates the bounding box shape [Zhou+, CVPR2017].
  • 81. 81 Long short-term memory (LSTM), the most typical recurrent neural network
  • 82. 82 LSTM (long short-term memory): a recurrent neural network mapping an input vector sequence to an output vector sequence. The recurrent structure carries info from all the past; the gate structure performs active info selection. It is also very effective for solving the vanishing gradient problem in the t-direction [Graves+, TPAMI2009].
  • 83. 83 From the recurrent NN to the LSTM NN: a recurrent NN has the recurrent structure (info from all the past); the LSTM NN adds the gate structure (input gate, forget gate, output gate) for active info selection [Graves+, TPAMI2009].
  • 84. 84 Standard LSTM-NN-based handwriting recognition: a feature vector sequence goes in, and the character class comes out. [figure]
  • 85. 85 Extension to bi-directional LSTM: combine the output using the past info with the output using the future info to decide the character class. [figure]
  • 86. 86 Deep BLSTM network [Frinken-Uchida, ICDAR2015]: an input layer, several stacked LSTM layers, and an output layer.
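A sketch of such a deep bi-directional LSTM sequence classifier, assuming PyTorch; all sizes and the read-out choice are arbitrary:

```python
import torch
import torch.nn as nn

blstm = nn.LSTM(input_size=16, hidden_size=32, num_layers=3, bidirectional=True)
readout = nn.Linear(2 * 32, 10)   # combines the forward (past) and backward (future) outputs

x = torch.randn(50, 1, 16)        # a toy feature vector sequence: 50 steps, 16-dim features
h, _ = blstm(x)                   # (50, 1, 64): past + future info at every time step
scores = readout(h[-1])           # e.g., decide the character class from the last step
```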
  • 87. 87 Application to DAR: Convolutional Recurrent Neural Network (CRNN), an end-to-end trainable neural network for image-based sequence recognition, applied to scene text recognition [Shi+, TPAMI, 2017].
  • 88. 88 Image Captioning (CNN+LSTM): Converting an image to a “document” [Vinyals+, arXiv2015]
  • 89. 89 Application to DAR: end-to-end math OCR (image to LaTeX) [Deng+, "Image-to-Markup Generation with Coarse-to-Fine Attention", arXiv2017].
  • 90. 90 More conventional machine learning techniques (SVM, Φ-machine, AdaBoost)
  • 91. 91 Support Vector Machines (SVM) Still the best choice when the amount of data is insufficient
  • 92. 92 Linear discriminant function: we are given training patterns from classes A and B. [figure]
  • 93. 93 Linear discriminant function $\mathbf{w}^T\mathbf{x} + b$: positive → class A, negative → class B. [figure]
  • 94. 94 Linear discriminant function: this choice of $\mathbf{w}^T\mathbf{x} + b$ misrecognizes a training pattern. [figure]
  • 95. 95 Linear discriminant function: this choice gives no misrecognition! [figure]
  • 96. 96 Which one is the best? All of those functions can recognize all training patterns. [figure]
  • 97. 97 Don't forget unseen patterns: we might get patterns of A and B around the class boundary, so a boundary that hugs the training patterns is risky. [figure]
  • 98. 98 Max-margin classification: choose the $\mathbf{w}^T\mathbf{x} + b$ whose margin to the nearest training patterns of both classes is maximal. [figure]
  • 99. 99 How can we get it? Minimize the slope of $\mathbf{w}^T\mathbf{x} + b$ under the constraints that the function value is at least 1 for every class A pattern and at most −1 for every class B pattern. [figure]
  • 100. 100 How can we get it? Minimize the slope under the constraints: slopes violating the constraints are NG; a feasible one is OK. [figure]
  • 101. 101 How can we get it? Minimize the slope under the constraints: each constraint acts like a "nail" that the function must pass above or below. [figure]
  • 102. 102 How can we get it? The answer is the minimum slope satisfying the constraints. [figure]
  • 103. 103 How can we get it? The minimum slope under the constraints also gives the maximum-margin classification! [figure]
  • 104. 104 Support vectors (SVs): the patterns lying exactly on the ±1 lines. Only those SVs contribute to determining the discriminant function. [figure]
  • 106. 106 When no solution satisfies the constraints, the data is not linearly separable. [figure]
  • 107. 107 A relaxation: replace each hard constraint by a penalty and minimize "slope + penalty". [figure]
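This "slope + penalty" (soft-margin) formulation is what standard SVM libraries solve; a sketch assuming scikit-learn, on made-up toy data (C weights the penalty term):

```python
import numpy as np
from sklearn.svm import LinearSVC

# Toy 2-D training patterns: class A = +1, class B = -1.
X = np.array([[2.0, 2.0], [1.5, 2.5], [2.5, 1.8], [0.5, 0.5], [0.2, 1.0], [1.0, 0.3]])
y = np.array([1, 1, 1, -1, -1, -1])

svm = LinearSVC(C=1.0).fit(X, y)   # minimizes slope (||w||^2) + C * sum of penalties
print(svm.coef_, svm.intercept_)   # the learned w and b of w^T x + b
```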
  • 108. 108 The Φ-machine: an old partner of the linear classifier; the idea of the "kernel" comes from it.
  • 109. 109 Mapping the feature vector space to a higher-dimensional space: data that is not linearly separable in $(x_1, x_2)$ becomes linearly separable after $\Phi: (x_1, x_2) \mapsto (x_1, x_2, x_1 x_2)$. [figure]
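A quick check of this particular mapping on XOR-like toy data (values made up):

```python
import numpy as np
from sklearn.svm import LinearSVC

X = np.array([[0., 0.], [1., 1.], [0., 1.], [1., 0.]])   # XOR-like: not linearly separable
y = np.array([1, 1, -1, -1])

Phi = np.column_stack([X[:, 0], X[:, 1], X[:, 0] * X[:, 1]])   # (x1, x2, x1*x2)
print(LinearSVC(C=100.0).fit(Phi, y).score(Phi, y))            # expect 1.0: now separable
```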
  • 110. 110 What happens in the original space? Look at the mapped space with axes $x_1$, $x_2$, and $x_1 x_2$. [figure]
  • 111. 111 What happens in the original space? Rewrite the three axes as $(y_1, y_2, y_3)$: the linear classifier $a y_1 + b y_2 + c y_3 + d = 0$ is a plane in the 3D space.
  • 112. 112 What happens in the original space? Reverting the mapping gives $a x_1 + b x_2 + c x_1 x_2 + d = 0$. What is this?
  • 113. 113 What happens in the original space? Solving for $x_2$ gives the classification boundary $x_2 = -\dfrac{a x_1 + d}{b + c x_1}$: linear classification in the higher-dimensional space corresponds to a non-linear classification in the original space.
  • 114. 114 Another example: $\Phi: (x_1, x_2) \mapsto (x_1, x_2, x_1^2 + x_2^2)$ separates class A from the surrounding class B. [figure]
  • 115. 115 What happens in the original space? The plane $a x_1 + b x_2 + c\,(x_1^2 + x_2^2) + d = 0$ reverts to a quadratic, circle-like boundary $x_2 = \dfrac{-b \pm \sqrt{b^2 - 4c\,(a x_1 + c x_1^2 + d)}}{2c}$ in the original space. [figure]
  • 116. 116 Notes about the Φ-machine: its combination with SVM is popular, and the Φ-function leads to the "kernel": in the SVM dual objective $\sum_i \alpha_i - \frac{1}{2}\sum_{i,j} \alpha_i \alpha_j y_i y_j\, \mathbf{x}_i^T \mathbf{x}_j$, the patterns appear only through inner products, so $\mathbf{x}_i^T \mathbf{x}_j$ can be replaced by $\Phi(\mathbf{x}_i)^T \Phi(\mathbf{x}_j)$ and finally by a kernel $k(\mathbf{x}_i, \mathbf{x}_j)$. Choosing a good mapping is not trivial; in the past, the choice was done by trial and error. Recently…
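With a kernel, Φ never has to be computed explicitly; a sketch assuming scikit-learn (the RBF kernel and its parameters are arbitrary choices):

```python
from sklearn.svm import SVC
from sklearn.datasets import make_circles

# Concentric circles: not linearly separable in the original space.
X, y = make_circles(n_samples=200, factor=0.4, noise=0.05, random_state=0)

# The RBF kernel k(x, x') = exp(-gamma * ||x - x'||^2) plays the role of
# Phi(x)^T Phi(x') for an implicit, very-high-dimensional Phi.
clf = SVC(kernel="rbf", gamma=2.0, C=1.0).fit(X, y)
print(clf.score(X, y))
```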
  • 117. 117 Deep neural networks can find a good mapping automatically: the feature extraction layers form a mapping Φ, the mapping is specified by the weights, and the weights (i.e., Φ) are optimized via training. This is so-called "representation learning".
  • 120. 120 Weighted majority voting: each two-class classifier $g_c$ returns +1 for class A and −1 for class B; the classifiers' outputs are combined with weights (e.g., 0.7, 0.2, 0.15, …) and if the weighted sum > 0 then A, else B. Well, how do we decide the weights?
  • 121. 121 AdaBoost: building a set of complementary classifiers. 1. Train a classifier $g_1$ on the training patterns. 2. Compute its reliability (e.g., 0.7), which becomes its voting weight.
  • 122. 122 AdaBoost: 3. Give a large (small) weight to each sample which is misrecognized (correctly recognized) by $g_1$.
  • 123. 123 AdaBoost: 4. Train the next classifier with those sample weights (patterns with larger weight should be recognized correctly). 5. Compute its reliability (e.g., 0.43).
  • 124. 124 AdaBoost: 6. Give a large (small) weight to each sample which is misrecognized (correctly recognized) by the new classifier. Repeat until the training accuracy converges.
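Putting steps 1–6 together, here is a minimal NumPy sketch of AdaBoost with decision stumps as the weak classifiers (a toy implementation using the usual exponential-loss weight updates):

```python
import numpy as np

def stump_predict(X, j, thr, s):
    return s * np.where(X[:, j] > thr, 1, -1)

def train_adaboost(X, y, rounds=10):        # y in {-1, +1}
    d_w = np.full(len(y), 1.0 / len(y))     # sample weights (steps 3/6 update these)
    ensemble = []
    for _ in range(rounds):
        # steps 1/4: train a weak classifier = pick the stump with the smallest weighted error
        best = min(((np.sum(d_w[stump_predict(X, j, thr, s) != y]), j, thr, s)
                    for j in range(X.shape[1])
                    for thr in np.unique(X[:, j])
                    for s in (+1, -1)))
        err, j, thr, s = best
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))        # steps 2/5: reliability
        d_w *= np.exp(-alpha * y * stump_predict(X, j, thr, s))  # steps 3/6: reweight samples
        d_w /= d_w.sum()
        ensemble.append((alpha, j, thr, s))
    return ensemble

def predict(ensemble, X):                   # weighted majority voting
    score = sum(a * stump_predict(X, j, thr, s) for a, j, thr, s in ensemble)
    return np.where(score > 0, 1, -1)
```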
  • 125. 125 Today I cannot explain the following ML techniques:
 Semi-supervised learning methods (ex. constrained clustering, virtual adversarial training)
 Weakly-supervised learning methods (ex. multiple-instance learning)
 Unsupervised learning methods (clustering, self-organizing feature maps, intrinsic dimensionality)
 Ensemble methods (random forests, ECOC, bagging, random subspace)
 Robust regression
 Hidden Markov models, graphical models
 Error-correcting learning (and the perceptron)
 Statistical inference (esp. Gaussian mixtures, maximum likelihood, Bayesian estimation)
  • 127. 127 Near-human performance has been achieved by big data and neural networks: isolated character recognition of machine-printed / handwritten / designed-font characters at 95.49% / 99.79% / 99.99% [Uchida+, ICFHR2016]; scene text detection at F-value = 0.8 on ICDAR2015 incidental scene text [Zhou+, CVPR2017]; scene text recognition with CRNN at an 89.6% word recognition rate on ICDAR2013 [Shi+, TPAMI, 2017].
  • 128. 128 Now we can imagine what we can do in the world beyond 100%
  • 129. 129 Beyond 100% = computers can detect, read, and collect all text information perfectly: texts on notebooks, object labels, digital displays, book pages, signboards, posters/ads. So, what do you want to do with the perfect recognition results?
  • 130. 130 In fact, our real goal should NOT be perfect recognition results; they are only a tentative goal. The real goals are the ultimate applications that use perfect recognition results and the scientific discoveries made by analyzing them.
  • 131. 131 What will you do in the world beyond 100%?
Ultimate applications:
 Education
 "Total-recall" for perfect information search
 Welfare (alarm, translation, information complement)
 "Life-log"-related apps (summary, log compression, captioning, question answering, behavior prediction, reminder)
Scientific discovery:
 With social science (interaction between scene text and humans; text statistics)
 With design science (font shape and impression; discovering typographic knowledge)
 With humanities (historical knowledge; semiology)
  • 132. 132 Another direction: use characters to understand ML. Characters are simple binary, stroke-structured patterns with little background clutter, small size (ex. 32x32), big data (ex. 80,000 samples per class), and predefined classes (ex. 10 classes for digits); since ML has achieved near-human performance on them, they are a very good "testbed" for not only evaluating but also understanding ML.
  • 133. 133 The last message... ...and please do NOT become an accuracist, a parameter-tuner, or a libraholic!