2. • Objective assessment of QoE
• A brief intro to Machine Learning
• Setting up your ML-based objective metric:
  • Feature space definition
  • ML paradigm selection
  • Model selection & robust testing
• A practical example
• SWOT analysis + Conclusions
3.
4. [Diagram: the media lifecycle: experience-centered technology design, configuration of technology settings, quality assessment (e.g., flagging lower quality with Q = 0.3), quality restoration, and quality preservation.]
We should be able to predict visual quality at any point of the media lifecycle.
5. “Degree of delight or annoyance of the user of an application or service.
It results from the fulfillment of his or her expectations
with respect to the utility and/or enjoyment of the
application or service in the light of the user’s
personality and current state.”
Qualinet White Paper, 2012
6. 1. Reproducing the Human Brain
• Modeling perceptual, cognitive and affective processes triggered by media consumption
2. Using a mimicking approach
• Modeling (parts of) the overall transfer function
• E.g. input: pixel intensities, user profile; output: QoE judgment
7. “A machine is said to learn from experience E
with respect to some task T and performance measure P,
if its performance at task T, as measured by P, improves
with experience E”
Mitchell, 1997
Judith Redi – VPQM 2012
8.
9. • You have a task T to perform, i.e., link inputs x to outputs y in some (unknown) domain E through γ: x → y
TASK: map images into QoE scores
• All you know about E is a bunch of examples (experience)
E = {(xi, yi), i = 1, …, p}
• A learning machine is something that implements some form of
ŷ(x) = Σi ωi φi(x) + ω0
and learns from the examples in E how to set the {φi, ωi, ω0} so that T is performed with a performance P, and the larger is E, the better is P
NOTE: no specific model of γ is assumed a priori
10. • Empirical learning (from the examples in E)
→ an accurate knowledge or representation of the domain E is not needed
And we have subjective databases! (next talk)
• Highly non-linear models can be implemented
• Which is useful when perceptual, cognitive and affective processes are involved
• Most of the computational effort is spent in training
→ once the parameters are set, ML paradigms are computationally efficient tools
11. [Diagram: objective QoE assessment pipeline. From the MEDIA SPACE, feature extraction produces a small-sized descriptor in the FEATURE SPACE; a non-linear mapping, implemented via machine learning, maps it to a quality score Q in the QUALITY SPACE.]
• Computationally efficient metric
• Small-sized descriptor
12. Given E, a subjective quality dataset
E = {(Mi, qi)}, M ∈ ℝ^A, q ∈ ℝ
1. Select a good feature space ℝ^B, B << A
2. Select the most appropriate ML paradigm to implement γ: x ∈ ℝ^B → y
3. Select the best configuration for the system (set {φi, ωi, ω0}) and test its performance in a robust way
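The three steps above can be sketched in code. This is a minimal illustration assuming scikit-learn, with random stand-in data in place of a real feature extractor and subjective dataset; support vector regression is just one possible choice of paradigm.

```python
# Sketch of the recipe: (1) a compact feature space, (2) an ML paradigm
# (here, support vector regression), (3) configuration selection with a
# held-out test. Data are synthetic stand-ins, not real QoE features.
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split, GridSearchCV

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))                                  # B = 8 features per medium
y = X @ rng.normal(size=8) + rng.normal(scale=0.1, size=200)   # stand-in quality scores

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Step 3: choose hyperparameters on the training set only, then test once.
search = GridSearchCV(SVR(kernel="rbf"),
                      {"C": [1, 10], "gamma": ["scale", 0.1]}, cv=5)
search.fit(X_tr, y_tr)
print(search.best_params_, round(search.score(X_te, y_te), 3))
```

Keeping the test set out of the hyperparameter search is exactly the "robust testing" point developed later in the talk.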
13. • The feature space has to encode all and only the media information that is relevant for quality prediction
→ no ML paradigm can repair a defective feature space design by restoring missing information
14. • Encode all relevant information for quality assessment
→ Study the perceptual, cognitive and affective processes that regulate QoE and design features that are actually related to them (e.g., Moorthy and Bovik, 2011; Liu et al., 2010, 2011)
→ Computational complexity can be kept low (Liu et al., 2010, 2011)
• Encode only relevant information for quality assessment
→ FEATURE SELECTION (PCA, Gastaldo et al., 2005; SVD, Narwaria and Lin, 2010)
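As a small sketch of the PCA route to "only relevant information", the snippet below compresses a redundant descriptor; the dataset and dimensions are illustrative, not from the cited works.

```python
# Shrinking a redundant raw descriptor with PCA: 40 correlated features
# generated from 5 underlying degrees of freedom (synthetic example).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
latent = rng.normal(size=(300, 5))            # 5 "true" degrees of freedom
X = latent @ rng.normal(size=(5, 40))         # A = 40 correlated raw features
X += rng.normal(scale=0.01, size=X.shape)     # small measurement noise

pca = PCA(n_components=0.99)                  # keep 99% of the variance
X_small = pca.fit_transform(X)
print(X_small.shape)                          # far fewer than 40 components survive
```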
15. • Structure of the feature space
• High number of features → machines less prone to the curse of dimensionality, such as SVMs (Moorthy and Bovik, 2011)
• Structure of the problem
• E.g. time delays in video quality assessment → Time Delay NN (Le Callet et al., 2006)
• Application domain
• Complexity vs. accuracy
16. • Overfitting = excessive specialization of the (parameters of the) mapping function γ on the training set
[Diagram: a model trained on the dataset X = {(xp, yp), p = 1…Np} fits the Np training examples closely but mispredicts a new input (x*, y*) ∉ X.]
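A classic way to see this: fit a degree-7 polynomial through 8 noisy samples of a sine. The training error is essentially zero because the model memorizes the noise, while the error on unseen inputs stays much larger. All data here are synthetic.

```python
# Overfitting illustration: a degree-7 polynomial interpolates 8 noisy
# training points (near-zero training error) but errs on unseen inputs.
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(2)
x_train = np.linspace(0.0, 1.0, 8)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.1, size=8)

overfit = Polynomial.fit(x_train, y_train, deg=7)   # 8 coefficients, 8 points

x_test = np.linspace(0.05, 0.95, 50)                # unseen inputs
train_err = np.mean((overfit(x_train) - y_train) ** 2)          # memorized
test_err = np.mean((overfit(x_test) - np.sin(2 * np.pi * x_test)) ** 2)
print(train_err, test_err)
```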
17. • Model selection: select the configuration of your ML paradigm (types and number of φi, ωi) that minimizes the risk of overfitting
• Typically, too many parameters → higher risk of overfitting
• Empirical methods to select the best model while training → e.g., cross-validation
• ROBUST TESTING!
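Cross-validation for model selection can be sketched as below, assuming scikit-learn: the number of hidden units of a small neural network is chosen by 5-fold CV on synthetic stand-in data.

```python
# Model selection by 5-fold cross-validation: pick the hidden-layer size
# that generalizes best on held-out folds (synthetic regression data).
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
X = rng.normal(size=(150, 6))
y = np.tanh(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=150)

scores = {}
for n_hidden in (2, 8, 32):
    model = MLPRegressor(hidden_layer_sizes=(n_hidden,),
                         max_iter=2000, random_state=0)
    scores[n_hidden] = cross_val_score(model, X, y, cv=5).mean()

best = max(scores, key=scores.get)          # configuration with best CV score
print(best, round(scores[best], 3))
```

The winning configuration should then be evaluated once more on a test set never touched during this search.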
18. [Diagram: robust testing. The dataset is split into a TRAINING SET, a VALIDATION SET used to choose among the candidate models M1, M2, …, MN, and a held-out TEST SET.]
19. Image restoration algorithms:
which one to use? Which parameter settings?
→ Objective quality metric
20. Subjective studies: overall quality is related to the integrity of the image structure; color matters for visual quality too
→ Color correlogram features to describe structure
→ 5 possible features, including irrelevant/redundant information
→ FEATURE SELECTION: Kolmogorov-Smirnov test
Finds “active features”, whose values computed for undistorted and distorted images differ significantly
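The active-feature screening can be sketched with SciPy's two-sample Kolmogorov-Smirnov test; the feature values below are synthetic, with one feature that shifts under distortion and one that does not.

```python
# "Active feature" screening: keep features whose value distributions
# differ significantly between undistorted and distorted images.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(4)
n = 200
# feature 0 shifts under distortion (active); feature 1 does not
pristine = np.column_stack([rng.normal(0, 1, n), rng.normal(0, 1, n)])
distorted = np.column_stack([rng.normal(1, 1, n), rng.normal(0, 1, n)])

active = [j for j in range(pristine.shape[1])
          if ks_2samp(pristine[:, j], distorted[:, j]).pvalue < 0.01]
print(active)
```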
21. • Clustering algorithms look for a structure in the data distribution, without using target information
• Cluster: a collection of objects which are “similar” among each other and are “dissimilar” to the objects belonging to other clusters
→ Vector Quantization
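As a sketch of clustering used as a vector quantizer, k-means (one common choice; the talk does not prescribe a specific algorithm) replaces each feature vector with the centroid of its cluster. The 2-D data below are synthetic.

```python
# k-means as a vector quantizer: each vector is encoded by the centroid
# of its cluster (the "codebook" entry). Two well-separated synthetic groups.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
X = np.vstack([rng.normal(0, 0.2, (100, 2)),
               rng.normal(3, 0.2, (100, 2))])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
codebook = km.cluster_centers_          # the quantizer's codebook
quantized = codebook[km.labels_]        # nearest-centroid encoding
print(codebook.round(1))
```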
22. [Histogram: number of images (0-350) falling into each of 40 clusters (0-39), for the features Absolute Value, Inverse Difference and IMC. Conditions: Original (s = 0), Noise medium quality (s = 0.001), Noise low quality (s = 0.005), JPEG high quality (q100), JPEG medium quality (q60), JPEG low quality (q20).]
3200 images, 127 original contents, 2 types of distortions, different quality levels
23. [Diagram: the original image passes through a transmission system, yielding the distorted image; a feature extractor computes the descriptor x from the original and x(r) from the distorted image. The VQA system maps the descriptors to a quality score.]
QA system: a regression problem or a p-class classification problem.
Redi et al., 2009, 2010: ensembles of ANNs / SVMs in a One-vs-All strategy, with modules trained for a specific distortion.
24. • CBP feed-forward neural networks
• K-fold cross-validation for model selection and test
• K groups of images, each including different image contents
• Model selection decides the number of hidden neurons
[Diagram: the image dataset is partitioned into groups G1-G5; in each fold one group serves as test set, one as validation set for model selection, and the remaining groups for training.]
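The content-based grouping above, where all distorted versions of one original never straddle a split, can be sketched with scikit-learn's GroupKFold; the identifiers and counts below are illustrative.

```python
# Content-aware K-fold splitting: images derived from the same original
# content never appear in both training and test folds (GroupKFold).
import numpy as np
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(6)
content_id = np.repeat(np.arange(12), 5)   # 12 contents, 5 distorted versions each
X = rng.normal(size=(60, 4))               # stand-in feature vectors
y = rng.normal(size=60)                    # stand-in quality scores

for train_idx, test_idx in GroupKFold(n_splits=5).split(X, y, groups=content_id):
    # verify that no content leaks across the split
    assert not set(content_id[train_idx]) & set(content_id[test_idx])
```

Splitting by content rather than by image prevents the network from scoring well merely by recognizing familiar scenes.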
25. [Bar chart: correlation between predicted and subjective scores on LIVE (range 0.40-1.00) for the JP2K1, JP2K2, Noise, Blur, JPEG1 and JPEG2 subsets, comparing CBP without feature selection, CBP with FS, and CELM with FS.]
ELM requires a much higher number of neurons
→ trade-off complexity vs. accuracy
26. SWOT analysis:
Strengths (internal origin, helpful in achieving the objective):
• Empirical learning
• Ability to implement highly non-linear models
• Computationally inexpensive at runtime
Weaknesses (internal origin, harmful):
• The fewer the training examples, the less accurate
• Overfitting
Opportunities (external origin, helpful):
• Crowdsourcing
• Databases
• QoE-centered ML design
• Standardization of robust testing procedures
Threats (external origin, harmful):
• The black box temptation!