Assessing 3DTV QoE and beyond a look on testing methodologies
1. Colloquium on Quality of Experience in Multimedia Systems
and Services - Klagenfurt
Patrick Le Callet
November 2012
4. S-3D Objective Metric: a first approach
Quality of stereoscopic images, A. Benoit, P. Le Callet, P. Campisi and R.
Cousseau, EURASIP Journal on Image and Video Processing, special issue on
3D Image and Video Processing, vol. 2008, doi:10.1155/2008/659024, 2008.
6. Performance?
2D metrics are able to estimate the visual quality of 3D
content …if the latter is measured subjectively
with the usual protocols!
=> Need to define 3D QoE
7. S-3DTV quality: new issues
Quality & 3D => what should be measured?
S-3D needs to be accepted by the end user
…some have already announced its death
R. Ebert, “Why I hate 3-d (and you should too),” Newsweek, May 2010.
[Online]. Available: http://www.newsweek.com/2010/04/30/whyi-hate-3-d-and-you-should-too.html
M. Kermode, “Come in number 3d, your time is up,” BBC News,
December 2009. [Online]. Available:
http://www.bbc.co.uk/blogs/markkermode/2009/12/come_in_number_3d_your_time_is.html
12. S-3D is cheating our perception
Depth cues combination: correlation vs ambiguity
…
Dominance, Dissociation, Reinterpretation
Question :
Cue enhancement without reliability with
others => cognitive load ?
13. S-3DTV quality: new issues
Quality & 3D => what should be measured ?
S-3D needs to be accepted by the end user
- Comfortable viewing experience
- Added value compared to 2D services =>
enhanced experience
Immersiveness, naturalness
=> Moving from visual quality evaluation to
Quality of Experience evaluation
14. Definition of Quality of Experience
Quality of Experience (QoE) is the degree of delight
or annoyance of the user of an application or
service.
It results from the fulfillment of his or her
expectations with respect to the utility and / or
enjoyment of the application or service in the light
of the user’s personality and current state.
[Qualinet White Paper on Definitions of Quality of Experience (2012).
European Network on Quality of Experience in Multimedia Systems and
Services (COST Action IC 1003).
Available at http://www.qualinet.eu]
15. S-3DTV: from Visual Quality to visual
Quality of Experience
Quality & 3D => multidimensional
3D visual experience (Seuntiëns 2006)
+ Visual fatigue ?
Towards QoE of 3DTV
Step 1: how to measure it with observers ?
Step 2: objective metric
17. Measuring QoE: explorative studies
Explorative studies: focus groups
Feelings and reactions towards 3D services are explored
Attribute | Method | Scale | References
Texture Quality and Sharpness | Single Stimulus | ITU-R quality scale with or without adjectives | (Seuntiëns, 2006), (Seuntiëns et al., 2006), (Lambooij et al., 2011)
Amount of Depth | Single Stimulus | Numerical scale (0-5) | (Seuntiëns, 2006), (Lambooij et al., 2011), (Strohmeier et al., 2010)
Quality of Depth | Single Stimulus, Pair Comparison | Numerical scale (0-10) | (IJsselsteijn et al., 2000), (Barkowsky et al., 2009)
Visual comfort, Eye strain and Visual Annoyance | Single Stimulus, SSCQE, DSIS | ITU impairment and quality scale, adapted impairment scale from ITU | (Wöpking, 1992), (Yano et al., 2002), (Kooi and Toet, 2004), (Seuntiëns et al., 2006)
18. Measuring QoE: explorative studies
Attribute | Method | Scale | References
Visual Fatigue | Questionnaire, objective measurement (e.g. EEG) | (none listed) | (Yano et al., 2004, Hyung-Chul et al., 2008, Li et al., 2008, Emoto et al., 2004)
Viewing experience, overall image quality, visual experience | Single Stimulus | ITU quality scale | (Seuntiens et al., 2005), (Seuntiëns et al., 2006), (Seuntiëns, 2006), (Lambooij et al., 2011), (Goldmann et al., 2010b, Goldmann et al., 2010a), (Strohmeier et al., 2010)
Naturalness | Single Stimulus | Numerical scale (0-10), ITU quality scale | (IJsselsteijn et al., 2000), (Seuntiens et al., 2005), (Seuntiëns, 2006), (Lambooij et al., 2011)
Presence and enjoyment | Single Stimulus | ITU quality scale | (Seuntiëns, 2006)
19. Measuring QoE: ITU Status
Video: ITU-R BT.1438 Recommendation
=> lacks specifications for the new characteristics needed to assess S-3DTV.
ITU-R WP6 and ITU-T SG9 have addressed Question Q.2 and Q.12 and are
making progress on the draft recommendations
Recommendation | Title | Content
ITU-R BT.[3DTV SubMEth] | Subjective Methods for the Assessment of Stereoscopic Three-Dimensional Television (3DTV) systems | Recommendation covering subjective assessment methods for 3DTV
ITU-T P.3D-sam | Subjective assessment methods for 3D video quality | Recommendation regarding 3D assessment methods for the current 3D environment
ITU-T J.3D-fatigue | Assessment methods of visual fatigue and safety guideline for 3D video | Visual fatigue and safety assessment guideline for 3D video
ITU-T J.3D-disp-req | Display requirements for 3D video quality assessment | Requirements for displays used for 3D assessment testing
21. Multidimension: measuring on
different scales
Multidimensional: assessment of several attributes by the end user
[Figure: separate rating scales for visual quality (Excellent, Good, Fair, Poor, Bad), visual comfort (eye strain, headache, etc.) and 3D perception (naturalness, presence, visual experience, etc.); a second diagram links visual quality, visual comfort and depth]
• Several scales => several interpretations. Inter-observer variability?
• Image quality measurement may be copied from traditional methods
• New scale attributes have to be developed for the other scales
*New requirement of subjective video quality assessment methodologies for 3DTV - VPQM 2010
Wei Chen, Jérôme Fournier, Marcus Barkowsky, Patrick Le Callet, Orange Labs R&D - IRCCyN
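One possible way to look at inter-observer variability on each scale is to report a mean opinion score together with a confidence interval per attribute. The sketch below is only an illustration: the ratings are made up, the attribute names are taken from the slide, and the 1.96 factor assumes an approximately normal distribution of scores, which a small panel may not satisfy.

```python
import math
import statistics


def mos_with_ci(ratings, z=1.96):
    """Return (mean opinion score, half-width of an approximate 95% CI)."""
    mean = statistics.mean(ratings)
    # Sample standard deviation across observers; needs at least two ratings.
    spread = statistics.stdev(ratings)
    return mean, z * spread / math.sqrt(len(ratings))


# Hypothetical ratings from five observers on 5-point scales (5 = best).
session = {
    "visual quality": [4, 5, 3, 4, 4],
    "visual comfort": [2, 5, 1, 4, 3],
}

for attribute, ratings in session.items():
    mos, half_width = mos_with_ci(ratings)
    print(f"{attribute}: MOS = {mos:.2f} +/- {half_width:.2f}")
```

A much wider interval on one scale than on another, for the same observers and stimulus, is a hint that the scale is being interpreted differently from one observer to the next.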
22. Choosing attributes
User centered approach: cf OPQ (Open Profiling of Quality)
[Strohmeier, D., Jumisko-Pyykko, S., Kunze, K., & Bici, M. O. (2011). The Extended-OPQ Method for
User-Centered Quality of Experience Evaluation: A Study for Mobile 3D Video Broadcasting over
DVB-H. EURASIP Journal on Image and Video Processing, 2011(1)]
Fixed attributes: visual experience, naturalness, depth rendering, 2D image quality, depth quantity, visual comfort
[P. Seuntiëns, "Visual experience of 3D TV," PhD Thesis, Eindhoven University of Technology, 2006]
[Chen, W., Fournier, J., Barkowsky, M., & Le Callet, P. (2012). Exploration of Quality of Experience of Stereoscopic Images: Binocular Depth. VPQM]
23. Scale interpretation & observer
variability
Joint quality and comfort ratings for 4 observers
*Towards a Framework of Inter-Observer Analysis in Multimedia Quality Assessment - QoMEX 2011 - Ulrich Engelke, Yohann Pitrey, Patrick Le Callet
24. Scales vs Pair comparison
Observers are always capable of indicating a preference when
confronted with two samples while they may not be able to
project their decision onto scale values
Conversion to scale values possible using Bradley-Terry or
Thurstone-Mosteller models
Paired Comparison experiments may be conducted:
- Time sequential presentation, difficult for observers if conditions are close
- Side-by-Side presentation on two displays, exact calibration of screens and temporal
synchronization of playback is required
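The conversion from preferences to scale values mentioned on this slide can be sketched with a Bradley-Terry fit. The code below is a minimal illustration, not the procedure used in the cited studies: it uses a simple iterative update for the maximum-likelihood strengths, the win counts are invented, and it assumes every stimulus has won and lost at least once.

```python
import math


def bradley_terry(wins, iterations=200):
    """Estimate Bradley-Terry strengths.

    wins[i][j] is the number of times stimulus i was preferred over j.
    Returns strengths normalised to sum to 1.
    """
    n = len(wins)
    p = [1.0] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            updated.append(total_wins / denom)
        norm = sum(updated)
        p = [value / norm for value in updated]
    return p


# Hypothetical counts: 10 observers compared each pair of 3 conditions.
wins = [
    [0, 7, 9],
    [3, 0, 6],
    [1, 4, 0],
]
strengths = bradley_terry(wins)
# Log-strengths are on an interval scale, comparable to scale values.
scale_values = [math.log(s) for s in strengths]
print(scale_values)
```

Only differences between the resulting scale values are meaningful; the origin is arbitrary.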
25. Pair comparison
Large number of comparisons required: N(N-1)/2
Suitable mostly for obtaining ground-truth data, e.g. the effort of the
Video Quality Experts Group (VQEG) – 3DTV
…but subset selection algorithms exist to reduce the number of
pairs from O(N²) to O(N)
Dykstra’s Square design method was evaluated recently
An optimal selection criterion for the construction of the square was developed and
evaluated
(see Jing Li et al., ICIP 2012)
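To give a feel for the saving, the sketch below counts the pairs of a full design against those of a plain square design, in which the stimuli are placed in a square matrix and only pairs sharing a row or a column are compared. It assumes N is a perfect square and places stimuli in index order; it does not implement the optimised selection criterion of the cited work. A plain square design needs N(√N − 1) pairs, which is already far fewer than N(N−1)/2 but is not the O(N) figure quoted for subset selection in general.

```python
import math
from itertools import combinations


def full_design(n):
    """All N(N-1)/2 unordered pairs of n stimuli."""
    return list(combinations(range(n), 2))


def square_design(n):
    """Pairs sharing a row or a column when stimuli fill a square matrix."""
    side = math.isqrt(n)
    if side * side != n:
        raise ValueError("this sketch assumes n is a perfect square")
    pairs = []
    for a, b in combinations(range(n), 2):
        same_row = a // side == b // side
        same_column = a % side == b % side
        if same_row or same_column:
            pairs.append((a, b))
    return pairs


for n in (16, 64, 144):
    print(n, len(full_design(n)), len(square_design(n)))
```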
26. Challenge: measuring long term QoE
attribute
Short term: visual experience, naturalness, depth rendering, 2D image quality, depth quantity, visual comfort
Long term: visual fatigue
- a decrease in performance of the visual system
- a measurable criterion that is of particular value for ascertaining long-term adaptive processes of the visual system
*New requirement of subjective video quality assessment methodologies for 3DTV - VPQM 2010
Wei Chen, Jérôme Fournier, Marcus Barkowsky, Patrick Le Callet, Orange Labs R&D - IRCCyN
27. Challenge: measuring long term QoE
factor
[Figure: measurement tools around the end user: questionnaires (visual quality: Excellent, Good, Fair, Poor, Bad; visual comfort: eye strain, headache, etc.; 3D perception: naturalness, presence, visual experience, etc.), optometry, EEG - EMG, eye tracking]
*New requirement of subjective video quality assessment methodologies for 3DTV - VPQM 2010
Wei Chen, Jérôme Fournier, Marcus Barkowsky, Patrick Le Callet, Orange Labs R&D - IRCCyN
28. Long term visual fatigue measurement
by EEG signal
In [Li 2008], the authors found that in most of the channels, the
power of high frequencies (higher than 12 Hz) is stronger in 3D
conditions than in 2D conditions, and that it tends to increase as
presentation duration increases.
But …the conclusion might differ with the “quality” of the content
[Figure panels: 2D vs 3D; before vs after]
[Li 2008]: Li et al., “Measurement of 3D Visual Fatigue Using Event-Related Potential (ERP): 3D Oddball
Paradigm”, in 3DTV Conference: The True Vision – Capture, Transmission and Display of 3D Video, pp. 213-216
29. Impact on task performance (pre and post
3D)
Performance measurement: Q & A => eye-tracking measurement
Influence of autostereoscopic 3D displays on subsequent task performance - SPIE Stereoscopic
Displays and Applications 2010
M. Barkowsky, P. Le Callet
30. Impact on performance (pre and post 3D)
Performance measurement: psychophysics + optometry
Performance is better even though observers report discomfort
(questionnaire)
Is visual fatigue changing the perceived depth accuracy on an autostereoscopic display?
M. Barkowsky, R. Cousseau, P. Le Callet - in SPIE Stereoscopic Displays and Applications 2011
32. test conditions: the display
In 2D: transparent (or almost: LCD) displays … far from being the case
in S-3D!
Issues: luminance rendering and depth rendering
New requirement of subjective video quality assessment methodologies for 3DTV - VPQM 2010
W. Chen, J. Fournier, M. Barkowsky, P. Le Callet, Orange Labs R&D - IRCCyN
33. Luminance rendering: perceived
crosstalk with autostereoscopic display
*Subjective Crosstalk Assessment Methodology for Auto-stereoscopic
Displays - IEEE ICME 2012
L. Xing, J. Xu, K. Skildheim, A. Perkis, T. Ebrahimi
34. depth rendering and source content
• Strong relation between shooting parameters and viewing
configuration
– Shooting parameters: focal length (f), inter-camera baseline (b), convergence distance (d)
– Visualisation parameters: screen distance (D), screen size (M), inter-ocular distance (B)
Restituted space = F(shooting parameters, visualisation parameters)
• Conformity rules have to be defined between the real and the restituted spaces
• Visual comfort has to be considered with regard to the human visual system.
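[Figure: shooting and viewing geometry. Cameras (focal length f, baseline b, convergence distance d) capture the real object; eyes (inter-ocular distance B) view the screen (size M) at distance D.]
Is it comfortable?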
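The dependence of the restituted space on both parameter sets can be sketched with a standard similar-triangles model. The code below is an illustration under stated assumptions, not the model of the cited papers: cameras are parallel and converged by sensor shift, all lengths are in metres, and M is taken here as the magnification from sensor to screen rather than the screen size itself. The function names are hypothetical.

```python
def screen_parallax(Z, f, b, d, M):
    """Parallax on the screen for a point at real depth Z.

    Positive values place the point behind the screen, negative in front.
    """
    sensor_disparity = f * b * (1.0 / d - 1.0 / Z)
    return M * sensor_disparity


def restituted_depth(Z, f, b, d, M, D, B):
    """Perceived distance from the viewer for a point at real depth Z."""
    P = screen_parallax(Z, f, b, d, M)
    if P >= B:
        raise ValueError("parallax reaches the inter-ocular distance: eyes would diverge")
    return D * B / (B - P)


# Hypothetical setup: 50 mm lens, 65 mm baseline, convergence at 2 m,
# viewer at 2 m with 65 mm inter-ocular distance, magnification 40.
setup = dict(f=0.05, d=2.0, M=40.0, D=2.0, B=0.065)
for baseline in (0.065, 0.0325):
    depth = restituted_depth(5.0, b=baseline, **setup)
    print(f"baseline {baseline * 1000:.1f} mm -> object at 5 m perceived at {depth:.2f} m")
```

A point at the convergence distance has zero parallax and lands on the screen plane; with these numbers, halving the baseline compresses the depth behind the screen.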
35. Depth distortion and Shape distortion example
Only the camera baseline is changed in the stereoscopic system
[Figure panels: optimal camera baseline; 0.5x camera baseline; 1.5x camera baseline]
For details, see: New stereoscopic video shooting rule based on stereoscopic distortion
parameters and comfortable viewing zone - SPIE EI/SDA 2012, W. Chen, J. Fournier, M.
Barkowsky, P. Le Callet
36. conformity of stereoscopic images
• A compliant image
– Looks natural
– Avoids (or minimizes) visual fatigue and visual annoyance of observers
• 3 types of conformities can be defined
– Total conformity: of shapes and dimensions
– Relative conformity: limited to a slice of space
– Partial conformity: of shapes without dimensions
[Figure: three plots, one per conformity type, of restituted depth (m) against real depth (m), each with curves for focal lengths of 100 mm and 300 mm]
38. High Dynamic Range
• HDR fills the gap between
the abilities of the human visual system
…and capture and display technologies
=> more realistic visual experience compared to current Low
Dynamic Range (LDR) imaging
39. HDR and visual quality: new issues
Image capture
Almost no native HDR sensor exists, so the capture
step is multi-phased.
=> Many quality issues can arise from such
processing (geometric distortions, ghosting,
noise, etc.).
HDR Content delivery
Need for efficient compression techniques to store
HDR data
=> impact on QoE ?
40. HDR and visual quality: new issues
HDR and LDR technologies will have to coexist for some
time !
=> HDR to LDR operations (tone mapping, TMO) and LDR
to HDR operations (inverse tone-mapping, iTMO) will
need to be used.
Impact of TMO and iTMO on QoE?
They can even change the original artistic intention
simple contrast reduction with local tone mapper
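As a rough illustration of why a tone mapping operator can alter the look of a scene, the sketch below applies a simple global curve of the form L / (1 + L), in the style of Reinhard's basic operator, and its inverse. This is an assumption chosen for brevity: it is not one of the operators evaluated in the talk, and real TMOs (especially local ones) are considerably more elaborate. The luminance values are made up.

```python
def tone_map(luminance):
    """Global curve mapping scene luminance in [0, inf) to display range [0, 1)."""
    return luminance / (1.0 + luminance)


def inverse_tone_map(display_value):
    """Inverse of tone_map; only defined for display values below 1."""
    if not 0.0 <= display_value < 1.0:
        raise ValueError("display value must lie in [0, 1)")
    return display_value / (1.0 - display_value)


# Hypothetical relative scene luminances spanning five orders of magnitude.
hdr = [0.01, 0.5, 1.0, 10.0, 1000.0]
ldr = [tone_map(value) for value in hdr]

for scene, display in zip(hdr, ldr):
    print(f"scene {scene:>8.2f} -> display {display:.4f}")

# The two brightest samples differ by a factor of 100 in the scene,
# but only slightly once mapped to the display range.
print(ldr[-1] / ldr[-2])
```

The round trip is exact only in floating point; once the LDR signal is quantised for delivery, the inverse operator can no longer recover the original highlight contrast.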
41. HDR and visual quality: new issues
TMO perceptual evaluation so far ….
quality and aesthetic appeal studies
– TMOs comparison
• Rating (features) [Drago2003]
– Contrast, detail, naturalness
• Paired comparison (preference) [Kuang2007]
– Features study [Kuang2007]
• Highlight details, shadow details, overall contrast, sharpness,
colorfulness, artifacts
• One feature rating can predict overall rating
– Comparison with “real life” scenes [Yoshida2005] or HDR
display [Ledda2005]
• Naturalness, overall contrast, overall brightness, detail in dark
and bright regions
• Differences in rating if real or HDR image [Ashikimin2006]
42. TMO and QoE: VA as an indicator of Visual
Experience
• Visual attention (VA) has many technological
applications
– Video coding / compression
– Quality estimation
– Computer vision
– Etc.
• Also a great ‘tool’ for artists
– Guide user’s discovery of a scene
– Highlight / hide parts of a scene
– Convey a message or emotions
• TMOs should preserve VA behavior
– Need to study the effect of TMOs on
visual attention deployment
[Figure labels: bottom-up “intention”; top-down “intention”]
43. TMO and QoE: A recent study
• Eye-tracking experiments on
88 images (tone mapped by
11 TMOs)
• Analysis* shows that different
TMOs modify VA to different
extents
[Figure panels: Tumblin, iCam, Linear]
M. Narwaria, M. Silva, P. Le Callet and R. Pepion “Effect of Tone Mapping on Visual Attention
Deployment”, SPIE Conference on Applications of Digital Image Processing XXVII (Special Session
on High Dynamic Range Imaging), vol. 8499, 2012
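One common way to quantify how much a TMO shifts visual attention is to correlate the fixation density map recorded on the tone-mapped image with the one recorded on a reference. The sketch below is only an illustration of that idea and is not the analysis of the cited study: the fixation coordinates are invented, the grid is tiny, and the Gaussian smoothing normally applied to fixation maps is left out.

```python
import math


def fixation_map(fixations, width, height):
    """Count fixations per cell of a width x height grid."""
    grid = [[0.0] * width for _ in range(height)]
    for x, y in fixations:
        grid[y][x] += 1.0
    return grid


def pearson_cc(map_a, map_b):
    """Pearson correlation between two maps of identical size.

    Both maps must contain some variation, otherwise the result is undefined.
    """
    a = [value for row in map_a for value in row]
    b = [value for row in map_b for value in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return covariance / math.sqrt(var_a * var_b)


# Hypothetical fixations (x, y) on a 4 x 3 grid.
reference = fixation_map([(1, 1), (1, 1), (2, 1)], width=4, height=3)
tone_mapped = fixation_map([(3, 2), (3, 2), (2, 1)], width=4, height=3)

print(pearson_cc(reference, reference))    # identical maps
print(pearson_cc(reference, tone_mapped))  # attention has moved
```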
44. Take away messages …
• Quality of (Visual) Experience is not only Video Quality
• The environment for subjective experiments needs to
be redefined
• Subjective measurement methods need to be refined
• Multiscale methods need to be validated
• New test methodologies are required: Continuous
measurements, non-intrusive measurements, …
• Objectively measuring the observer’s response with
psychophysical devices may be required
45. VQEG (Video Quality Experts Group)
3DTV Group activities:
- Impact of viewing conditions on quality
- Methodologies for subjective QA
HDR Group, one mission: develop methods for assessing the quality of HDR video.
46. Next challenges: Ultra HD
Higher resolutions, Ultra-HD 4K/8K, “retina” displays:
• Exceeding the visual acuity of standard observers
• Higher frame rates leading to fluent motion
reconstruction
How to measure content quality / added value?
Loss of reference system in reality: How to avoid simulator
sickness?
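The point about exceeding visual acuity can be made concrete by computing how many pixels fall within one degree of visual angle. The sketch below assumes an acuity limit of about 60 pixels per degree (one arcminute per pixel, the usual figure for 20/20 vision); the screen width and viewing distance are made-up values and the function name is hypothetical.

```python
import math

# Assumption: a standard observer resolves about 1 arcminute, i.e. 60 pixels per degree.
ACUITY_LIMIT_PPD = 60.0


def pixels_per_degree(horizontal_pixels, screen_width_m, viewing_distance_m):
    """Pixels per degree of visual angle at the centre of the screen."""
    pixel_pitch = screen_width_m / horizontal_pixels
    pixel_angle_deg = math.degrees(
        2.0 * math.atan(pixel_pitch / (2.0 * viewing_distance_m))
    )
    return 1.0 / pixel_angle_deg


# Hypothetical living-room setup: screen 1.2 m wide, viewed from 1.8 m.
for label, pixels in (("HD", 1920), ("4K", 3840), ("8K", 7680)):
    ppd = pixels_per_degree(pixels, screen_width_m=1.2, viewing_distance_m=1.8)
    verdict = "above" if ppd > ACUITY_LIMIT_PPD else "below"
    print(f"{label}: {ppd:.1f} pixels per degree ({verdict} the assumed acuity limit)")
```

In this setup HD stays below the assumed limit while 4K already exceeds it, which is why the added value of still higher resolutions has to be measured rather than assumed.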