Content-gnostic Bitrate Ladder
Prediction for Adaptive Video Streaming
29/09/2020
Angeliki Katsenou
Structure
I. Motivation
II. Compression and Content
III. Proposed Framework
IV. Results
V. Conclusion and Future Work
I. Motivation
Cisco's reports on internet data traffic estimate that video's share of the data is
expected to reach 80% by 2022 and to keep increasing. [1]
Due to the pandemic and the recent shift towards a remote-work lifestyle, this figure
is probably already close to reality.
Video providers employ adaptive streaming to address users' specifications.
Traditionally, this is achieved by creating several versions of a video sequence
using different encoding parameters, such as resolution.
This, however, requires a large number of encodings, which impacts time, cost
and energy (increased CO2 footprint).
Figure: Distribution of energy consumption for production and use in 2017 (categories: TVs (production), Computers (production), Smartphones (production), Others, Terminals (use), Networks (use), Data Centers (use)).
“…as of the end of December last year,
the maximum number of daily meeting
participants, both free and paid,
conducted on Zoom was approximately
10 million. In March this year, we reached
more than 200 million daily meeting
participants, both free and paid.” [2]
Eric S. Yuan 
Founder and CEO, Zoom
I. Motivation
Fig.1 Sample frames from a dataset of 100 4K sequences.
Fig.2 PSNR-log(Rate) curves across resolutions (RQ curves for 4K, FHD and HD; x-axis: bitrate in kbps, log scale; y-axis: PSNR in dB).
One ladder
does not fit all!
Table 1 The encoding ladder presented in Apple Tech Note TN2224.
I. Motivation
How can we find the “best” bitrate ladder per content so that we do not compromise the quality of
experience?
How could we make this process more computationally efficient without degrading the delivered
video quality?
Table 1 The encoding ladder presented in Apple Tech Note TN2224.
Table 2 Netflix's per-title encoding can change both the
number of rungs and their resolution. [3, 4]
Other per-title approaches: Bitmovin, Mux, CAMBRIA, etc.
I. Motivation
How can we find the “best” bitrate ladder per content so that we do
not compromise the quality of experience?
How could we make this process more computationally efficient
without degrading the delivered video quality?
Fig.3 RD curves and convex hull (labels: convex hull as the optimal encoding solution; sub-optimal encoding solutions; practical approach).
Ideally, the optimal solution would be to build the ladder by sampling
the convex hull of the RQ curves across resolutions.
We propose a content-gnostic, machine-learning-based approach that predicts the
bitrate ladder.
II. Content Features and Compression
Fig.4 Correlation matrix of HM coding statistics to
spatio-temporal features. [5]
Fig.5 Examples of predicted PSNR-Rate curves. [5]
III. Proposed Framework
Fig.2 PSNR-log(Rate) curves across resolutions.
Fig.6 Example of RQ curves' intersection (4K, FHD and HD curves with their convex hull; cross-over points {QP_4K, QP_FHD^low} and {QP_FHD^high, QP_HD}; x-axis: log(Bitrate (kbps)); y-axis: PSNR in dB).
Finding the cross-over points helps define
where the resolution switches on the convex hull.
We assume that the RQ curves intersect in an ordered, monotonic fashion (e.g. 2160p intersects with
1080p, 1080p with 720p, etc.).
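A minimal sketch of locating one cross-over point, assuming each RQ curve is available as sampled (bitrate, PSNR) pairs sorted by bitrate and that the two curves cross once within their common bitrate range. The function names and the sample values are hypothetical, and linear interpolation in the log-rate domain is an illustrative choice, not necessarily the procedure used in the talk.

```python
import math

def interp_quality(curve, log_rate):
    """Linearly interpolate quality at log_rate; curve is sorted by bitrate."""
    for (r0, q0), (r1, q1) in zip(curve, curve[1:]):
        x0, x1 = math.log10(r0), math.log10(r1)
        if x0 <= log_rate <= x1:
            t = (log_rate - x0) / (x1 - x0)
            return q0 + t * (q1 - q0)
    raise ValueError("log_rate outside the sampled range")

def find_crossover(high_res, low_res, iters=60):
    """Bisection on the quality gap (high_res - low_res) in the log-rate domain."""
    lo = math.log10(max(high_res[0][0], low_res[0][0]))
    hi = math.log10(min(high_res[-1][0], low_res[-1][0]))

    def gap(x):
        return interp_quality(high_res, x) - interp_quality(low_res, x)

    if gap(lo) * gap(hi) > 0:
        return None  # no sign change: the curves do not cross in the common range
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gap(lo) * gap(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 10 ** (0.5 * (lo + hi))  # cross-over bitrate in kbps

# Hypothetical (bitrate in kbps, PSNR in dB) samples, not data from the talk.
curve_4k = [(2000, 30.0), (8000, 36.0), (32000, 42.0)]
curve_fhd = [(1000, 31.0), (4000, 35.0), (16000, 38.0)]
crossover_kbps = find_crossover(curve_4k, curve_fhd)
```

Below the returned bitrate the lower resolution gives the higher PSNR, and above it the higher resolution does, which is what makes it a switching point.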
III. Proposed Framework
Fig.7 Scatterplots of cross-over QPs.
• QP_4K vs. QP_FHD^low: PCC: .9917, SROCC: .9888
• QP_FHD^high vs. QP_HD: PCC: .9817, SROCC: .9538
This relation can be used to improve cross-over QP predictions.
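For readers who want to check such a relation on their own data, the sketch below computes the Pearson (PCC) and Spearman rank-order (SROCC) correlation coefficients with their standard definitions. The QP pairs are hypothetical values chosen for illustration, and the rank helper assumes there are no tied values.

```python
import math

def pearson(x, y):
    """Pearson linear correlation coefficient (PCC)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """Ranks 1..n; assumes no ties (ties would need average ranks)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman(x, y):
    """Spearman rank-order correlation (SROCC): Pearson on the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical cross-over QP pairs, not data from the talk.
qp_4k = [18, 22, 27, 31, 36, 40]
qp_fhd_low = [16, 21, 24, 30, 33, 39]
pcc = pearson(qp_4k, qp_fhd_low)
srocc = spearman(qp_4k, qp_fhd_low)
```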
III. Proposed Framework
Fig.8 Proposed method.
Training process:
• Training videos at native resolution are downscaled to all considered resolutions and encoded with the video codec.
• The decoded training videos are upscaled to native resolution and quality metrics are computed, giving the ground-truth RQ convex hull and the training videos' cross-over QPs.
• Spatio-temporal features are extracted from the training videos at native resolution and used, together with the cross-over QPs, to train the machine learning-based regression.
Testing process:
• Spatio-temporal features are extracted from the testing videos at native spatial resolution, and the regression predicts the cross-over QPs per resolution.
• The testing videos are encoded at the predicted cross-over QPs; the decoded videos are upscaled, and the quality metric values and bitrates at the cross-over points are computed.
• RQ convex hull fitting yields the predicted bitrate ladder: the RQ convex hull equation, the rate-QP equation, and the resolution switching rate points.
III. Proposed Framework
Fig.9 RQ convex hulls (blue: 2160p, red: 1080p, yellow: 720p, purple: 540p, green: 480p).
III. Proposed Framework
We fitted the convex hull with a third-order polynomial.
This means that after determining the cross-over QPs, we need four encodes to determine
the four polynomial parameters.
Then, we can sample the convex hull and build the bitrate ladder.
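The four-encode fit could look like the following sketch. It assumes the polynomial is fitted to PSNR as a function of log2(bitrate); the slides do not reproduce the fitted models of Table 3, so that choice, the variable names and the four sample points are all hypothetical.

```python
import numpy as np

# Hypothetical (bitrate in kbps, PSNR in dB) results of the four encodes.
rates_kbps = np.array([1500.0, 4000.0, 12000.0, 40000.0])
psnr_db = np.array([33.1, 36.4, 39.2, 41.5])

# Fit Q(x) = a3*x^3 + a2*x^2 + a1*x + a0 with x = log2(rate).
# Four points determine the four coefficients of a cubic exactly.
x = np.log2(rates_kbps)
coeffs = np.polyfit(x, psnr_db, deg=3)
hull_model = np.poly1d(coeffs)

def predict_quality(rate_kbps):
    """Evaluate the fitted hull model at an arbitrary bitrate (kbps)."""
    return float(hull_model(np.log2(rate_kbps)))
```

With exactly four encodes the cubic passes through every measured point, so any encode beyond the fourth would turn the fit into a least-squares approximation.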
Table 3 Fitted Models.
III. Proposed Framework
Fig.10 PSNR-Rate Ladder. Fig.11 VMAF-Rate Ladder (x-axis: log2(Bitrate); y-axes: PSNR in dB and VMAF).
Building the bitrate ladder:
1. Determine the operational bitrate range $(R_{\min}, R_{\max})$;
2. Sample the bitrate: $R_{L,i} \simeq 2R_{L,i-1}$ or $\log_2(R_{L,i}) \simeq 1 + \log_2(R_{L,i-1})$, where $R_{L,i} \in (R_{\min}, R_{\max})$;
3. Sample the quality: $Q_{L,i}(R_{L,i}) \leq Q_{\max}$ and $\frac{dQ_{L,i}}{dR_L} > \epsilon$, where $\epsilon \to 0$.
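The three steps can be sketched as follows, assuming the convex hull is already available as a function from bitrate to quality. The function names, the toy hull and the thresholds are hypothetical; the derivative condition is approximated by a finite difference between consecutive rungs, and the range bounds are treated as inclusive for simplicity.

```python
import math

def build_ladder(quality_of, r_min, r_max, q_max, eps=1e-4):
    """Return (bitrate, quality) rungs sampled from a hull model.

    quality_of: callable mapping bitrate (kbps) to quality on the convex hull.
    eps: minimum quality gain per kbps (its unit depends on the quality metric).
    """
    ladder = []
    rate = r_min
    while rate <= r_max:                       # step 1: stay in the operational range
        q = quality_of(rate)
        if q > q_max:
            break                              # step 3: quality cap reached
        if ladder:
            prev_rate, prev_q = ladder[-1]
            slope = (q - prev_q) / (rate - prev_rate)  # finite-difference dQ/dR
            if slope <= eps:
                break                          # step 3: extra bitrate no longer buys quality
        ladder.append((rate, q))
        rate *= 2                              # step 2: R_i = 2 * R_{i-1}
    return ladder

def toy_hull(rate_kbps):
    """Hypothetical hull: 25 dB at 250 kbps, +2 dB per bitrate doubling."""
    return 25.0 + 2.0 * math.log2(rate_kbps / 250.0)

ladder = build_ladder(toy_hull, r_min=250, r_max=16000, q_max=45.0)
```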
IV. Results
Table 4 List of Features [5].
Fig.12 Example of content dependency of cross-over QPs (x-axis: feature labelled "stdTC std"; y-axis: QP_4K).
IV. Results
Fig.1 Sample frames from a dataset of 100 4K sequences.
Fig.13 Spatial and Temporal Information of the dataset.
IV. Results
From the PCS 2019 paper [6]: we tested the proposed framework with HM16.20, considering the resolutions
{2160p, 1080p, 720p}.
A Lanczos-3 filter (ffmpeg implementation) was used for the spatial down/up-sampling.
We compare our method against two state-of-the-art solutions:
• Brute force method: we performed encodings with a QP step equal to 1. The brute force method theoretically
creates the optimal convex hull. This is considered our ground truth.
• Interpolation-based method: we performed 7 encodings per resolution (using equidistant QPs to cover the range) and
used piecewise cubic Hermite interpolation for the in-between QPs. This method constructs a suboptimal
convex hull, but it provides a good approximation while significantly
reducing the number of pre-encodes.
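Both reference methods end by taking the convex hull of the pooled rate-quality points of all resolutions. A minimal sketch of that step is given below, using the standard monotone-chain construction of the upper hull. It assumes points are given as (log2 bitrate, PSNR) pairs; the function names and the sample points are hypothetical and do not come from the paper.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); negative means a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def upper_hull(points):
    """Upper convex hull of (rate, quality) points, ordered by increasing rate."""
    hull = []
    for p in sorted(set(points)):
        # Drop the last hull point while it lies on or below the segment
        # joining its predecessor to the new point p.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull

# Hypothetical (log2 bitrate, PSNR in dB) points per resolution.
hd = [(9.0, 26.0), (10.0, 30.0), (11.0, 32.0)]
fhd = [(10.0, 29.0), (11.0, 33.0), (12.0, 35.0)]
uhd = [(11.0, 31.0), (12.0, 35.5), (13.0, 37.0)]
rq_hull = upper_hull(hd + fhd + uhd)
```

In this toy data the hull starts on the HD curve, moves to FHD and ends on 4K, mirroring the resolution switching that the cross-over points describe.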
IV. Results
We applied feature selection, specifically Recursive Feature Elimination, to the set of spatio-temporal
features.
We perform a sequential prediction of the QPs, starting from the highest resolution:
• For the QP_4K prediction, we relied only on spatio-temporal features.
• For the rest of the predictions, we made use of the identified relations and considered the previously predicted QPs (of the
higher resolutions) as features.
We tested various regression methods, such as support vector machines (SVMs) with different kernels and random forests (RFs), but Gaussian processes (GPs) were the best
performing models.
To avoid overfitting, we performed 10-fold cross-validation.
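The sequential scheme with 10-fold cross-validation can be sketched as below. This is an illustration under stated assumptions, not the authors' pipeline: a plain least-squares linear model stands in for the Gaussian process regressor, feature selection is omitted, the second stage is trained on the true QP_4K values and tested on the predicted ones, and all function names are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares linear model with intercept (stand-in for the GP regressor)."""
    A = np.column_stack([X, np.ones(len(X))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict_linear(w, X):
    return np.column_stack([X, np.ones(len(X))]) @ w

def sequential_cv(features, qp_4k, qp_fhd, k=10, seed=0):
    """k-fold CV: predict QP_4K from features, then QP_FHD from features + QP_4K."""
    n = len(features)
    folds = np.array_split(np.random.default_rng(seed).permutation(n), k)
    pred_4k, pred_fhd = np.empty(n), np.empty(n)
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        # Stage 1: QP_4K from the spatio-temporal features only.
        w1 = fit_linear(features[train_idx], qp_4k[train_idx])
        pred_4k[test_idx] = predict_linear(w1, features[test_idx])
        # Stage 2: the higher-resolution QP becomes an extra feature
        # (true values when training, predicted values when testing).
        train_aug = np.column_stack([features[train_idx], qp_4k[train_idx]])
        test_aug = np.column_stack([features[test_idx], pred_4k[test_idx]])
        w2 = fit_linear(train_aug, qp_fhd[train_idx])
        pred_fhd[test_idx] = predict_linear(w2, test_aug)
    return pred_4k, pred_fhd
```

Each sequence is predicted only by models that never saw it during training, which is what lets the cross-validated error indicate how the predictor might behave on unseen content.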
IV. Results
Fig.14 Predicted vs. true cross-over QP_4K.
Fig.15 Predicted vs. true cross-over QP_FHD^high.
Fig.16 Predicted vs. true cross-over QP_HD.
Table 5 Results on cross-over QPs prediction.
IV. Results
The different distributions are due to the different reference convex hulls.
Fig.17 BDRate Histogram. Fig.18 BDPSNR Histogram.
Most outliers correspond to sequences that do not comply with the hypothesis that the RQ curves
intersect in a resolution-monotonic manner.
IV. Results
Fig.19 Examples of results. Each example shows the RQ curves per resolution (2160p, 1080p, 720p) with their convex hull, and the ground-truth convex hull against the predicted one (x-axis: bitrate in kbps; y-axis: PSNR in dB).
• campfireparty_gop1: BDRate = 0.18%, BDPSNR = -0.004 dB
• barscene_gop1: BDRate = 2.009%, BDPSNR = -0.009 dB
IV. Results
94.2% fewer encodings compared to the brute
force method and 80.95% compared to the
interpolation-based method.
Proposed method overhead: the ratio of the average feature
extraction time for a sequence at 4K resolution to
the average 4K encoding time for a sequence at
QP=27 is 0.18.
Table 6 Comparison of the number of encodes required per method.
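The reported savings can be cross-checked with a few lines of arithmetic. Table 6 is not reproduced in this transcript, so the counts below are reconstructed from the text: two encodes per cross-over point and 7 encodings per resolution for the interpolation-based method. The brute-force count of 69 encodes is an assumption inferred from the reported 94.2%, not a number stated in the slides.

```python
def reduction_percent(proposed, baseline):
    """Percentage of encodes saved relative to a baseline method."""
    return 100.0 * (1.0 - proposed / baseline)

resolutions = 3                               # {2160p, 1080p, 720p}
crossover_points = resolutions - 1            # 4K/FHD and FHD/HD
proposed_encodes = 2 * crossover_points       # two encodes per intersection point
interpolation_encodes = 7 * resolutions       # 7 encodings per resolution
brute_force_encodes = 69                      # assumption inferred from the 94.2% figure

vs_interpolation = reduction_percent(proposed_encodes, interpolation_encodes)
vs_brute_force = reduction_percent(proposed_encodes, brute_force_encodes)
```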
V. Conclusion and Future Work
Conclusions:
We proposed a method that can predict the bitrate ladders of the considered resolutions based on spatio-temporal
features extracted from the uncompressed videos at their native resolution and with a few video encodings (two
encodes per RQ intersection point).
The first results are promising compared to the ground truth, while requiring 94.2% and 81% fewer pre-encodes
compared to the brute force method and the interpolation-based method, respectively.
Future Work:
Our focus will be on validating the presented method across different codecs.
We will also work on identifying cross-codec optimization of bitrate ladders.
References
1. “Global Mobile Data Traffic Forecast Update 2017-2022”, White Paper, Cisco, 2018.
2. E. S. Yuan, “A message to our users”, https://blog.zoom.us/a-message-to-our-users/
3. J. De Cock, Z. Li, M. Manohara, and A. Aaron, “Complexity-based consistent quality encoding in the Cloud”, IEEE ICIP 2016.
4. J. Sole, L. Guo, A. Norkin, M. Afonso, K. Swanson, and A. Aaron, “Performance comparison of video coding standards: an adaptive streaming perspective,” https://medium.com/netflix-techblog/performance-comparison-of-video-coding-standards-an-adaptive-streaming-perspective-d45d0183ca95, 2018.
5. A. Katsenou, M. Afonso, D. Agrafiotis, and D. R. Bull, “Predicting Video Rate-Distortion Curves using Textural Features,” in PCS 2016.
6. A. V. Katsenou, J. Sole, and D. R. Bull, “Content-gnostic Bitrate Ladder Prediction for Adaptive Video Streaming,” in PCS 2019.
Thanks to
Dr Joel Sole
Dr Mariana Afonso
Prof David Bull
pcs2021.org
You are invited!

Content-gnostic Bitrate Ladder Prediction for Adaptive Video Streaming

  • 1. Content-gnostic Bitrate Ladder Prediction forAdaptive Video Streaming 29/09/2020 Angeliki Katsenou
  • 2. Structure I. Motivation II. Compression and Content III. Proposed Framework IV. Results V.Conclusion and Future Work
  • 3. I. Motivation Cisco reports on internet data traffic estimate that the video data share is expected to reach 80% by 2022 and is expected to increase more. [1] Due to the pandemic and recently shift towards remote work life-style, this figure is probably almost a reality. Video providers employ adaptive streaming to address the users specifications. Traditionally, this is achieved by creating several versions for a video sequence using different encoding parameters, such as resolution. This, however, requires a huge amount of encodings, which impacts on time, cost and energy (increased CO2 footprint). 11% 17% 11% 6%20% 16% 19% Distribution of energy consumption for production and use in 2017 TVs (production) Computers (production) Smartphones (production) Others Terminals (use) Networks (use) Data Centers (use) “…as of the end of December last year, the maximum number of daily meeting participants, both free and paid, conducted on Zoom was approximately 10 million. In March this year, we reached more than 200 million daily meeting participants, both free and paid.” [2] Eric S. Yuan  Founder and CEO, Zoom
  • 4. I. Motivation Fig.1 Sample frames of a 100 4K dataset. 101 102 103 104 105 106 Bitrate (kbps) 25 30 35 40 45 50 55 PSNR(dB) 4K RQsFHD RQs HD RQs Fig.2 PSNR-log(Rate) curves across resolutions. One ladder does not fit all! Table 1 The encoding ladder presented in Apple Tech Note TN2224.
  • 5. I. Motivation How can we find the “best” bitrate ladder per content so that we do not compromise the quality of experience? How could we make this process more computationally efficient without degrading the delivered video quality? Table 1 The encoding ladder presented in Apple Tech Note TN2224. Table 2 Netflix’s per-title can change both the number of rungs and their resolution. [3, 4] Other Per-Title Approaches: Bitmovin, Mux, CAMBRIA, etc
  • 6. I. Motivation How can we find the “best” bitrate ladder per content so that we do not compromise the quality of experience? How could we make this process more computationally efficient without degrading the delivered video quality? Convex Hull- Optimal Encoding Solution Sub-optimal Encoding Solution Sub-optimal Encoding Solution Practical Approach Fig.3 RD curves and convex hull. Ideally the optimal solution would to build the ladder by sampling the convex hull of the RQ curves across resolutions. We propose a content-gnostic machine- learning based approach that predicts the bitrate ladder.
  • 7. II. Content Features and Compression Fig.4 Correlation matrix of HM coding statistics to spatio-temporal features. [5] Fig.5 Examples of predicted PSNR-Rate curves. [5]
  • 8. III. Proposed Framework 101 102 103 104 105 106 Bitrate (kbps) 25 30 35 40 45 50 55 PSNR(dB) 4K RQsFHD RQs HD RQs Fig.2 PSNR-log(Rate) curves across resolutions. 5000 10000 50000 log (Bitrate (kbps)) 32 34 36 38 40 PSNR(dB) 4K FHD HD Convex Hull {QP high FHD ,QP HD } {QP 4K ,QP low FHD } Fig.6 Example of RQ curves’ intersection. Finding the cross-over points helps defining the switching of resolution on the convex hull. We assume that the RQs are intersecting in an ordered monotonic fashion (e.g. 2160p intersects with the 1080p, 1080p with the 720p, etc).
  • 9. III. Proposed Framework Fig.7 Scatterplots of cross-over QPs (QP_4K vs. QP_FHD^low: PCC 0.9917, SROCC 0.9888; QP_FHD^high vs. QP_HD: PCC 0.9817, SROCC 0.9538). This relation can be used to improve cross-over QP predictions.
  • 10. III. Proposed Framework Fig.8 Proposed method (block diagram). Training process: training videos at native resolution; downscaling to all considered resolutions; video codec; decoded training videos upscaled to native resolution; quality metrics computation; ground-truth RQ convex hull; training videos' cross-over QPs; content features extraction giving spatio-temporal features of the training videos; machine learning-based regression. Testing process: testing videos at native spatial resolution; content features extraction giving spatio-temporal features of the testing videos; predicted cross-over QPs per resolution; video codec; decoded testing videos at the cross-over QPs, upscaled; quality metric values and bitrates at the cross-over points; RQ convex hull fitting. Output: predicted bitrate ladder (RQ convex hull equation, rate-QP equation, resolution-switching rate points).
  • 11. III. Proposed Framework Fig.9 RQ convex hulls (blue: 2160p, red: 1080p, yellow: 720p, purple: 540p, green: 480p).
  • 12. III. Proposed Framework We fitted the convex hull with a third-order polynomial. This means that, after determining the cross-over QPs, we need four encodes to determine the polynomial parameters. Then we can sample the convex hull and build the bitrate ladder. Table 3 Fitted Models.
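Four encodes are enough because a third-order polynomial has four coefficients, so four (rate, quality) points determine it exactly. The sketch below illustrates this under the assumption that quality is modelled as a cubic in log2(bitrate); the fitted models of Table 3 are not reproduced in this transcript, so the choice of variable, the function names and the sample encodes are all hypothetical.

```python
import math

def fit_cubic(points):
    """Return (a0, a1, a2, a3) of q = a0 + a1*x + a2*x^2 + a3*x^3 passing
    through four (x, q) points, solved by Gaussian elimination."""
    if len(points) != 4:
        raise ValueError("exactly four points are needed for a cubic")
    n = 4
    # Augmented Vandermonde system [1, x, x^2, x^3 | q].
    rows = [[1.0, x, x * x, x ** 3, q] for x, q in points]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        if abs(rows[col][col]) < 1e-12:
            raise ValueError("x values must be distinct")
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n + 1):
                rows[r][c] -= factor * rows[col][c]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(rows[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (rows[i][n] - tail) / rows[i][i]
    return tuple(coeffs)

def eval_cubic(coeffs, x):
    """Evaluate the cubic at x using Horner's scheme."""
    a0, a1, a2, a3 = coeffs
    return a0 + x * (a1 + x * (a2 + x * a3))

# Four hypothetical encodes: (bitrate in kbps, PSNR in dB).
encodes = [(2000, 33.0), (4000, 36.0), (8000, 38.5), (16000, 40.0)]
hull = fit_cubic([(math.log2(r), q) for r, q in encodes])
print(eval_cubic(hull, math.log2(6000)))  # model quality at 6000 kbps
```

With more than four encodes the same model would be fitted by least squares instead of exact interpolation.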
  • 13. III. Proposed Framework Fig.10 PSNR-Rate Ladder (x-axis: log2(Bitrate); y-axis: PSNR (dB)). Fig.11 VMAF-Rate Ladder (x-axis: log2(Bitrate); y-axis: VMAF). Building the bitrate ladder: 1. Determine the operational bitrate range; 2. Sample the bitrate: $R_{L,i} \simeq 2R_{L,i-1}$, or $\log_2(R_{L,i}) \simeq 1 + \log_2(R_{L,i-1})$, where $R_{L,i} \in (R_{\min}, R_{\max})$; 3. Sample the quality: $Q_{L,i}(R_{L,i}) \leq Q_{\max}$ and $\frac{dQ_{L,i}}{dR_L} > \epsilon$, where $\epsilon \to 0$.
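The three ladder-building steps can be sketched as a short loop. This is only an illustration under stated assumptions: `quality_of` stands for any model of the convex hull (here a made-up logarithmic toy), the bitrate range is treated as inclusive, and the derivative condition is approximated by the quality gain between consecutive rungs.

```python
import math

def build_ladder(quality_of, r_min, r_max, q_max, eps=1e-3):
    """Sample rungs by doubling the bitrate from r_min up to r_max.

    `quality_of(rate)` is any model of the convex hull (hypothetical here).
    Sampling stops once quality exceeds q_max, or once the quality gain per
    doubling (a finite-difference stand-in for dQ/dR) is eps or less.
    """
    ladder = []
    rate = r_min
    prev_q = None
    while rate <= r_max:
        q = quality_of(rate)
        if q > q_max:
            break
        if prev_q is not None and q - prev_q <= eps:
            break
        ladder.append((rate, q))
        prev_q = q
        rate *= 2  # R_i = 2 * R_(i-1), i.e. +1 in log2(rate)
    return ladder

def toy_hull(rate):
    """Made-up hull: 30 dB at 500 kbps, +2.5 dB per doubling."""
    return 30.0 + 2.5 * math.log2(rate / 500)

for rate, quality in build_ladder(toy_hull, 500, 20000, q_max=42.0):
    print(rate, quality)
```

For this toy hull the loop yields rungs at 500, 1000, 2000, 4000 and 8000 kbps; the 16000 kbps rung is dropped because its modelled quality exceeds the chosen `q_max`.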
  • 14. IV. Results Table 4 List of Features [5]. Fig.12 Example of content dependency of cross-over QPs (x-axis: stdTC std; y-axis: QP_4K).
  • 15. IV. Results Fig.1 Sample frames of a dataset of 100 4K sequences. Fig.13 Spatial and Temporal Information of the dataset.
  • 16. IV. Results From the PCS 2019 paper [6]: We tested the proposed framework with HM 16.20, considering the resolutions {2160p, 1080p, 720p}. A Lanczos-3 filter (ffmpeg implementation) was used for the spatial down/up-sampling. We compare our method against two state-of-the-art solutions: • Brute force method: we performed encodings with a QP step equal to 1. The brute force method theoretically creates the optimal convex hull and is considered our ground truth. • Interpolation-based method: 7 encodings per resolution (using equidistant QPs to cover the range), with piecewise cubic Hermite interpolation for the in-between QPs. This method constructs a suboptimal convex hull, but it can provide a good approximation of the optimal one while significantly reducing the number of pre-encodes.
  • 17. IV. Results We applied feature selection, specifically Recursive Feature Elimination, on the set of spatio-temporal features. We perform a sequential prediction of the QPs, starting from the highest resolution: • For the QP_4K prediction, we relied only on spatio-temporal features. • For the rest of the predictions, we made use of the identified relations and considered the previously predicted QPs (of the higher resolutions) as features. We tested various regression methods, such as support vector machines (SVMs) with different kernels, random forests (RFs), etc., but Gaussian processes (GPs) were the best-performing models. To avoid overfitting, we performed 10-fold cross-validation.
  • 18. IV. Results Fig.14 Predicted cross QP 4K. Fig.15 Predicted cross QP FHD high. Fig.16 Predicted cross QP HD. (Each figure is a scatterplot of true vs. predicted cross-over QP.) Table 5 Results on cross-over QPs prediction.
  • 19. IV. Results The different distributions are due to the different reference convex hulls. Fig.17 BDRate Histogram. Fig.18 BDPSNR Histogram. Most outliers correspond to sequences that do not comply with the hypothesis that the RQ curves intersect in a resolution-monotonic manner.
  • 20. IV. Results Fig.19 Examples of results (RQ curves per resolution with their convex hull, and ground-truth vs. predicted convex hulls; x-axis: Bitrate (kbps); y-axis: PSNR (dB)). campfireparty gop1: BDRate = 0.18%, BDPSNR = -0.004 dB. barscene gop1: BDRate = 2.009%, BDPSNR = -0.009 dB.
  • 21. IV. Results The proposed method requires 94.2% fewer encodings than the brute force method and 80.95% fewer than the interpolation-based method. Proposed method overhead: the ratio of the average feature extraction time for a sequence at 4K resolution to the average 4K encoding time for a sequence at QP=27 is 0.18. Table 6 Comparison of the number of encodes required per method.
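The 80.95% figure can be reproduced from numbers stated on earlier slides: the interpolation-based method uses 7 encodes for each of the 3 resolutions, and the proposed method needs four encodes (two per cross-over point). The brute-force count comes from Table 6, which is not reproduced in this transcript; the value of 69 used below is back-computed from the reported 94.2% and should be read as an assumption, not as a figure from the paper.

```python
def reduction(proposed, baseline):
    """Percentage of encodes saved relative to a baseline method."""
    return 100.0 * (1.0 - proposed / baseline)

resolutions = 3                      # {2160p, 1080p, 720p}
proposed = 2 * (resolutions - 1)     # two encodes per cross-over point -> 4
interpolation = 7 * resolutions      # 7 encodes per resolution -> 21
brute_force_assumed = 69             # ASSUMPTION: inferred from 94.2%, not from Table 6

print(round(reduction(proposed, interpolation), 2))        # 80.95
print(round(reduction(proposed, brute_force_assumed), 1))  # 94.2
```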
  • 22. V. Conclusion and Future Work Conclusions: We proposed a method that predicts the bitrate ladder over the considered resolutions based on spatio-temporal features extracted from the uncompressed videos at their native resolution, using only a few video encodings (two encodes per RQ intersection point). The first results are promising compared to the ground truth, while requiring 94.2% and 81% fewer pre-encodes than the brute force method and the interpolation-based method, respectively. Future Work: Our focus will be on validating the presented method across different codecs. We will also work on cross-codec optimization of bitrate ladders.
  • 23. References 1. “Global Mobile Data Traffic Forecast Update 2017-2022”, White Paper, Cisco, 2018. 2. E. S. Yuan, “A message to our users”, https://blog.zoom.us/a-message-to-our-users/ 3. J. De Cock, Z. Li, M. Manohara, and A. Aaron, “Complexity-based consistent quality encoding in the Cloud”, IEEE ICIP 2016. 4. J. Sole, L. Guo, A. Norkin, M. Afonso, K. Swanson, and A. Aaron, “Performance comparison of video coding standards: an adaptive streaming perspective”, https://medium.com/netflix-techblog/performance-comparison-of-video-coding-standards-an-adaptive-streaming-perspective-d45d0183ca95, 2018. 5. A. Katsenou, M. Afonso, D. Agrafiotis, and D. R. Bull, “Predicting Video Rate-Distortion Curves using Textural Features”, PCS 2016. 6. A. V. Katsenou, J. Sole, and D. R. Bull, “Content-gnostic Bitrate Ladder Prediction for Adaptive Video Streaming”, PCS 2019.
  • 24. Thanks to Dr Joel Sole, Dr Mariana Afonso and Prof David Bull