2. • Objective assessment of QoE
• A brief intro to Machine Learning
• Setting up your ML-based objective metric:
  • Feature space definition
  • ML paradigm selection
  • Model selection & robust testing
• A practical example
• SWOT analysis + Conclusions
3.
4. [Diagram: the media lifecycle: experience-centered technology design, configuration of technology settings, quality assessment (e.g., flagging lower quality with Q = 0.3), quality restoration, and quality preservation.]
We should be able to predict visual quality at any point of the media lifecycle.
5. “Degree of delight or annoyance of the user of an application or service.
It results from the fulfillment of his or her expectations
with respect to the utility and/or enjoyment of the
application or service in the light of the user’s
personality and current state.”
Qualinet White Paper, 2012
6. 1. Reproducing the Human Brain
• Modeling perceptual, cognitive and affective processes triggered by media consumption
2. Using a mimicking approach
• Modeling (parts of) the overall transfer function
• E.g. input: pixel intensities, user profile; output: QoE judgment
7. “A machine is said to learn from experience E
with respect to some task T and performance measure P,
if its performance at task T, as measured by P, improves
with experience E”
Mitchell, 1997
Judith Redi – VPQM 2012
8.
9. • You have a task T to perform, i.e., link inputs x to outputs y in some (unknown) domain E through γ: x → y
TASK: map images into QoE scores
• All you know about E is a bunch of examples (experience)
E = {(xi, yi), i = 1, …, p}
• A learning machine is something that implements some form of
ŷ(x) = Σi ωi φi(x) + ω0
and learns from the examples in E how to set the {φi, ωi, ω0} so that T is performed with a performance P, and the larger is E, the better is P
NOTE: no specific model of γ is assumed a priori
10. • Empirical learning (from the examples in E)
→ an accurate knowledge or representation of the domain E is not needed
And we have subjective databases! (next talk)
• Highly non-linear models can be implemented
• Which is useful when perceptual, cognitive and affective processes are involved
• Most of the computational effort is spent in training
→ once the parameters are set, ML paradigms are computationally efficient tools
11. [Diagram: objective QoE assessment pipeline. From the MEDIA SPACE, feature extraction produces a small-sized descriptor in the FEATURE SPACE; a non-linear mapping, implemented via machine learning, maps it to a quality score Q in the QUALITY SPACE.]
• Computationally efficient metric
• Small-sized descriptor
12. Given E, a subjective quality dataset
E = {(Mi, qi)}, M ∈ ℝ^A, q ∈ ℝ
1. Select a good feature space ℝ^B, B << A
2. Select the most appropriate ML paradigm to implement γ: x ∈ ℝ^B → y
3. Select the best configuration for the system (set {φi, ωi, ω0}) and test its performance in a robust way
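The three steps above can be sketched in code. This is a minimal illustration assuming scikit-learn, with random stand-in data in place of a real feature extractor and subjective dataset; support vector regression is just one possible choice of paradigm.

```python
# Sketch of the recipe: (1) a compact feature space, (2) an ML paradigm
# (here, support vector regression), (3) configuration selection with a
# held-out test. Data are synthetic stand-ins, not real QoE features.
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split, GridSearchCV

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))                                  # B = 8 features per medium
y = X @ rng.normal(size=8) + rng.normal(scale=0.1, size=200)   # stand-in quality scores

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Step 3: choose hyperparameters on the training set only, then test once.
search = GridSearchCV(SVR(kernel="rbf"),
                      {"C": [1, 10], "gamma": ["scale", 0.1]}, cv=5)
search.fit(X_tr, y_tr)
print(search.best_params_, round(search.score(X_te, y_te), 3))
```

Keeping the test set out of the hyperparameter search is exactly the "robust testing" point developed later in the talk.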
13. • The feature space has to encode all and only the media information that is relevant for quality prediction
→ no ML paradigm can repair a defective feature space design by restoring missing information
14. • Encode all relevant information for quality assessment
→ Study the perceptual, cognitive and affective processes that regulate QoE and design features that are actually related to them (e.g., Moorthy and Bovik, 2011; Liu et al., 2010, 2011)
→ Computational complexity can be kept low (Liu et al., 2010, 2011)
• Encode only relevant information for quality assessment
→ FEATURE SELECTION (PCA, Gastaldo et al., 2005; SVD, Narwaria and Lin, 2010)
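As a small sketch of the PCA route to "only relevant information", the snippet below compresses a redundant descriptor; the dataset and dimensions are illustrative, not from the cited works.

```python
# Shrinking a redundant raw descriptor with PCA: 40 correlated features
# generated from 5 underlying degrees of freedom (synthetic example).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
latent = rng.normal(size=(300, 5))            # 5 "true" degrees of freedom
X = latent @ rng.normal(size=(5, 40))         # A = 40 correlated raw features
X += rng.normal(scale=0.01, size=X.shape)     # small measurement noise

pca = PCA(n_components=0.99)                  # keep 99% of the variance
X_small = pca.fit_transform(X)
print(X_small.shape)                          # far fewer than 40 components survive
```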
15. • Structure of the feature space
• High number of features → machines less prone to the curse of dimensionality, such as SVMs (Moorthy and Bovik, 2011)
• Structure of the problem
• E.g. time delays in video quality assessment → Time Delay NN (Le Callet et al., 2006)
• Application domain
• Complexity vs. accuracy
16. • Overfitting = excessive specialization of the (parameters of the) mapping function γ on the training set
[Diagram: a model trained on the dataset X = {(xp, yp), p = 1…Np} fits the Np training examples closely but mispredicts a new input (x*, y*) ∉ X.]
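A classic way to see this: fit a degree-7 polynomial through 8 noisy samples of a sine. The training error is essentially zero because the model memorizes the noise, while the error on unseen inputs stays much larger. All data here are synthetic.

```python
# Overfitting illustration: a degree-7 polynomial interpolates 8 noisy
# training points (near-zero training error) but errs on unseen inputs.
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(2)
x_train = np.linspace(0.0, 1.0, 8)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.1, size=8)

overfit = Polynomial.fit(x_train, y_train, deg=7)   # 8 coefficients, 8 points

x_test = np.linspace(0.05, 0.95, 50)                # unseen inputs
train_err = np.mean((overfit(x_train) - y_train) ** 2)          # memorized
test_err = np.mean((overfit(x_test) - np.sin(2 * np.pi * x_test)) ** 2)
print(train_err, test_err)
```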
17. • Model selection: select the configuration of your ML paradigm (types and number of φi, ωi) that minimizes the risk of overfitting
• Typically, too many parameters → higher risk of overfitting
• Empirical methods to select the best model while training → e.g., cross-validation
• ROBUST TESTING!
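Cross-validation for model selection can be sketched as below, assuming scikit-learn: the number of hidden units of a small neural network is chosen by 5-fold CV on synthetic stand-in data.

```python
# Model selection by 5-fold cross-validation: pick the hidden-layer size
# that generalizes best on held-out folds (synthetic regression data).
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
X = rng.normal(size=(150, 6))
y = np.tanh(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=150)

scores = {}
for n_hidden in (2, 8, 32):
    model = MLPRegressor(hidden_layer_sizes=(n_hidden,),
                         max_iter=2000, random_state=0)
    scores[n_hidden] = cross_val_score(model, X, y, cv=5).mean()

best = max(scores, key=scores.get)          # configuration with best CV score
print(best, round(scores[best], 3))
```

The winning configuration should then be evaluated once more on a test set never touched during this search.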
18. [Diagram: robust testing. The dataset is split into a TRAINING SET, a VALIDATION SET used to choose among the candidate models M1, M2, …, MN, and a held-out TEST SET.]
19. Image restoration algorithms:
which one to use? Which parameter settings?
→ Objective quality metric
20. Subjective studies: overall quality is related to the integrity of the image structure; color matters for visual quality too
→ Color correlogram features to describe structure
→ 5 possible features, including irrelevant/redundant information
→ FEATURE SELECTION: Kolmogorov-Smirnov test
Finds “active features”, whose values computed for undistorted and distorted images differ significantly
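The active-feature screening can be sketched with SciPy's two-sample Kolmogorov-Smirnov test; the feature values below are synthetic, with one feature that shifts under distortion and one that does not.

```python
# "Active feature" screening: keep features whose value distributions
# differ significantly between undistorted and distorted images.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(4)
n = 200
# feature 0 shifts under distortion (active); feature 1 does not
pristine = np.column_stack([rng.normal(0, 1, n), rng.normal(0, 1, n)])
distorted = np.column_stack([rng.normal(1, 1, n), rng.normal(0, 1, n)])

active = [j for j in range(pristine.shape[1])
          if ks_2samp(pristine[:, j], distorted[:, j]).pvalue < 0.01]
print(active)
```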
21. • Clustering algorithms look for a structure in the data distribution, without using target information
• Cluster: a collection of objects which are “similar” among each other and are “dissimilar” to the objects belonging to other clusters
→ Vector Quantization
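As a sketch of clustering used as a vector quantizer, k-means (one common choice; the talk does not prescribe a specific algorithm) replaces each feature vector with the centroid of its cluster. The 2-D data below are synthetic.

```python
# k-means as a vector quantizer: each vector is encoded by the centroid
# of its cluster (the "codebook" entry). Two well-separated synthetic groups.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
X = np.vstack([rng.normal(0, 0.2, (100, 2)),
               rng.normal(3, 0.2, (100, 2))])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
codebook = km.cluster_centers_          # the quantizer's codebook
quantized = codebook[km.labels_]        # nearest-centroid encoding
print(codebook.round(1))
```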
22. [Histogram: number of images (0-350) falling into each of 40 clusters (0-39), for the features Absolute Value, Inverse Difference and IMC. Conditions: Original (s = 0), Noise medium quality (s = 0.001), Noise low quality (s = 0.005), JPEG high quality (q100), JPEG medium quality (q60), JPEG low quality (q20).]
3200 images, 127 original contents, 2 types of distortions, different quality levels
23. [Diagram: the original image passes through a transmission system, yielding the distorted image; a feature extractor computes the descriptor x from the original and x(r) from the distorted image. The VQA system maps the descriptors to a quality score.]
QA system: a regression problem or a p-class classification problem.
Redi et al., 2009, 2010: ensembles of ANNs / SVMs in a One-vs-All strategy, with modules trained for a specific distortion.
24. • CBP feed-forward neural networks
• K-fold cross-validation for model selection and test
• K groups of images, each including different image contents
• Model selection decides the number of hidden neurons
[Diagram: the image dataset is partitioned into groups G1-G5; in each fold one group serves as test set, one as validation set for model selection, and the remaining groups for training.]
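The content-based grouping above, where all distorted versions of one original never straddle a split, can be sketched with scikit-learn's GroupKFold; the identifiers and counts below are illustrative.

```python
# Content-aware K-fold splitting: images derived from the same original
# content never appear in both training and test folds (GroupKFold).
import numpy as np
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(6)
content_id = np.repeat(np.arange(12), 5)   # 12 contents, 5 distorted versions each
X = rng.normal(size=(60, 4))               # stand-in feature vectors
y = rng.normal(size=60)                    # stand-in quality scores

for train_idx, test_idx in GroupKFold(n_splits=5).split(X, y, groups=content_id):
    # verify that no content leaks across the split
    assert not set(content_id[train_idx]) & set(content_id[test_idx])
```

Splitting by content rather than by image prevents the network from scoring well merely by recognizing familiar scenes.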
25. [Bar chart: correlation between predicted and subjective scores on LIVE (range 0.40-1.00) for the JP2K1, JP2K2, Noise, Blur, JPEG1 and JPEG2 subsets, comparing CBP without feature selection, CBP with FS, and CELM with FS.]
ELM requires a much higher number of neurons
→ trade-off complexity vs. accuracy
26. SWOT analysis:
Strengths (internal origin, helpful in achieving the objective):
• Empirical learning
• Ability to implement highly non-linear models
• Computationally inexpensive at runtime
Weaknesses (internal origin, harmful):
• The fewer the training examples, the less accurate
• Overfitting
Opportunities (external origin, helpful):
• Crowdsourcing
• Databases
• QoE-centered ML design
• Standardization of robust testing procedures
Threats (external origin, harmful):
• The black box temptation!