Intelligent Thumbnail Selection
Kamil Sindi, Lead Data Scientist
JW Player
1. Company
a. Open-source video player
b. Hosting platform
c. 5% of global internet video traffic
d. 150+ team
2. Data Team
a. Handling 5MM events per minute
b. Storing 1TB+ per day
c. Stack: Storm (Trident), Kafka, Luigi, Elasticsearch, Spark, AWS, MySQL
Customers
Thumbnails are Important
● Your video's first impression
● Types: Upload, Manual, Auto (default)
● Manual >> Auto in Play Rate
● Current Auto thumbnail is the frame at the 10th second
● Many big publishers only use Manual
● 90% of Thumbnails are Auto! :-(
source: tastingtable.com (2016-10-12)
What’s a “Good” Thumbnail?
It’s subjective to the viewer!
Common themes:
● Not blurry
● Balanced brightness
● Centered objects
● Large text overlay
● Relevant to subject
vs
Source: Big Buck Bunny, Blender Studios
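Two of the themes above ("Not blurry", "Balanced brightness") can be approximated with simple pixel statistics. The sketch below is only an illustration: it assumes a grayscale frame stored as a list of rows with values 0-255, and the variance-of-Laplacian sharpness proxy and the function names are choices made here, not something the talk specifies.

```python
from statistics import mean, pvariance

def laplacian_variance(gray):
    """Sharpness proxy: variance of a 4-neighbour Laplacian over interior pixels."""
    h, w = len(gray), len(gray[0])
    responses = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            responses.append(lap)
    return pvariance(responses)

def mean_brightness(gray):
    """Average pixel value; mid-range values suggest balanced exposure."""
    return mean(v for row in gray for v in row)

# Toy frames (assumed data): a high-contrast checkerboard and a flat grey image
sharp = [[255 if (x + y) % 2 == 0 else 0 for x in range(8)] for y in range(8)]
flat = [[128] * 8 for _ in range(8)]

print(laplacian_variance(sharp) > laplacian_variance(flat))
print(mean_brightness(flat))
```

A flat image has no edges, so its Laplacian response is zero everywhere; a blurry frame behaves similarly, which is why a low variance is a common warning sign.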
Manually Creating a Model is Hard
● Which features to extract?
● How to describe those features?
● How to weight features?
● How to penalize overfitting of models?
● Many techniques: SIFT, SURF, HOG?
Need to be an expert in Computer Vision :-(
So Many Image Features...
Edge Detection, Color Histogram, Pixel Segmentation
Deep Learning
● Learn features implicitly
● Learn from examples
● Techniques to avoid overfitting
● Success in a lot of applications:
○ Image classification
○ Image captioning
○ Machine translation
○ Speech-to-Text
Inception
● Learn multiple models in parallel; concatenate
their outputs (“modules”)
● Factoring convolutions (“towers”): e.g. 1x1
convs followed by 3x3
● Parameter reduction: GoogLeNet (5MM) vs.
AlexNet (60MM), VGG-16 (~140MM)
● Auxiliary classifiers for regularization
● Residual connections (Inceptionv4)
● Depthwise separable convolutions (Xception)
https://www.udacity.com/course/deep-learning--ud730
https://arxiv.org/abs/1409.4842
Source: Rethinking the Inception Architecture for Computer Vision
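The parameter savings from placing a 1x1 convolution before a 3x3 convolution can be checked with simple arithmetic. This is a rough sketch: the channel counts (256 in, 256 out, reduced to 64) are assumed for illustration and are not taken from the Inception papers, and biases are ignored.

```python
def conv_params(kernel, c_in, c_out):
    """Number of weights in a kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out

# Illustrative channel counts (assumptions, not from the talk)
c_in, c_out, reduced = 256, 256, 64

# A single 3x3 convolution applied directly
direct = conv_params(3, c_in, c_out)

# A 1x1 convolution that shrinks the channels, followed by the 3x3
bottleneck = conv_params(1, c_in, reduced) + conv_params(3, reduced, c_out)

print(direct, bottleneck, round(direct / bottleneck, 1))
```

With these numbers the factored version needs 163,840 weights instead of 589,824, a reduction of about 3.6x.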
1x1 Convolutions: what’s the point?
1. Dimensionality reduction: fewer channels, strides, feature pooling
2. Parameter reduction: faster, less overfitting
3. “Cheap” nonlinearity: a 1x1 conv followed by a 3x3 conv adds an extra nonlinearity
4. Cross-channel correlations handled separately from (⊥) spatial correlations
Figures: 1x1 convolution with strides; pooling with 1x1 convolution
Source: http://iamaaditya.github.io/2016/03/one-by-one-convolution/
In Convolutional Nets, there is no such thing as
“fully-connected layers”. There are only
convolution layers with 1x1 convolution kernels. –
Yann LeCun
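The quote can be made concrete: a 1x1 convolution applies the same small fully-connected layer to the channel vector at every pixel. Below is a minimal sketch in plain Python, with a made-up 2 x 2 x 3 feature map and a made-up weight matrix that reduces 3 channels to 2.

```python
def dense(vector, weights):
    """Fully-connected layer without bias: one output per weight row."""
    return [sum(w * v for w, v in zip(w_row, vector)) for w_row in weights]

def conv1x1(feature_map, weights):
    """feature_map: H x W x C_in nested lists; weights: C_out x C_in."""
    return [[dense(pixel, weights) for pixel in row] for row in feature_map]

# Assumed toy data: 2 x 2 spatial grid with 3 channels per pixel
fmap = [[[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]],
        [[2.0, 0.0, 1.0], [1.0, 1.0, 1.0]]]
W = [[1.0, 0.0, -1.0],
     [0.5, 0.5, 0.5]]   # 3 channels -> 2 channels

out = conv1x1(fmap, W)
print(out[0][0])  # same result as dense(fmap[0][0], W)
```

The spatial size is unchanged while the channel dimension shrinks, which is the dimensionality-reduction role listed above.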
InceptionV3 Architecture
https://research.googleblog.com/2016/03/train-your-own-image-classifier-with.html
Dog (0.80)
Cat (0.05)
Rat (0.01)
...
Transfer Learning
1,000,000 images, 1,000 categories
● Use pre-trained model
○ Cheaper (no GPU required)
○ Faster
○ Prevents overfitting
● Penultimate (“Bottleneck”) layer contains
image’s “essence” (CNN codes); acts as a
feature extractor
● Just add a linear classifier (Softmax; lin-SVM)
to Bottleneck
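The "linear classifier on top of the bottleneck" idea can be sketched as plain softmax regression. This is only an illustration under loud assumptions: the 2-D vectors below are synthetic stand-ins for bottleneck features (a real pipeline would extract them from the pre-trained network), and the learning rate and epoch count are arbitrary.

```python
import math
import random

def softmax(logits):
    m = max(logits)                      # subtract max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def scores(W, b, x):
    return [sum(w * v for w, v in zip(W[k], x)) + b[k] for k in range(len(W))]

def train_softmax(features, labels, n_classes, lr=0.5, epochs=200):
    """Batch gradient descent on the cross-entropy loss."""
    dim, n = len(features[0]), len(features)
    W = [[0.0] * dim for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        gW = [[0.0] * dim for _ in range(n_classes)]
        gb = [0.0] * n_classes
        for x, y in zip(features, labels):
            probs = softmax(scores(W, b, x))
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == y else 0.0)
                gb[k] += err
                for j in range(dim):
                    gW[k][j] += err * x[j]
        for k in range(n_classes):
            b[k] -= lr * gb[k] / n
            for j in range(dim):
                W[k][j] -= lr * gW[k][j] / n
    return W, b

def predict(W, b, x):
    s = scores(W, b, x)
    return max(range(len(s)), key=s.__getitem__)

random.seed(0)
# Synthetic stand-in features: two well-separated clusters (assumed data)
features = ([[random.gauss(-2, 0.5), random.gauss(-2, 0.5)] for _ in range(50)]
            + [[random.gauss(2, 0.5), random.gauss(2, 0.5)] for _ in range(50)])
labels = [0] * 50 + [1] * 50

W, b = train_softmax(features, labels, n_classes=2)
accuracy = sum(predict(W, b, x) == y for x, y in zip(features, labels)) / len(labels)
print(accuracy)
```

Only the small weight matrix is trained; the feature extractor stays frozen, which is why this step is cheap and can run without a GPU.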
Fine Tuning + Tips
● Replace the classification layer, then
backpropagate into earlier layers
● Idea:
Early layers do basic filters; later
layers more dataset specific
● Generally use a pre-trained model
regardless of data size or similarity
Data Size (per class):  < 500       > 500                   > 5,000
Similar to original:    Too small   TL                      TL + FT earlier layers
Not similar:            Too small   TL on earlier layers    TL + FT entire network
(TL = transfer learning; FT = fine tuning)
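The table above can be read as a simple lookup. A minimal sketch follows; the function name is hypothetical, and the handling of exactly 500 or 5,000 examples is an assumption, since the slide only gives "<" and ">" ranges.

```python
def transfer_strategy(examples_per_class, similar_to_original):
    """Map data size and dataset similarity to the strategy from the table."""
    if examples_per_class < 500:
        return "Too small"
    if examples_per_class <= 5000:       # boundary handling is an assumption
        return "TL" if similar_to_original else "TL on earlier layers"
    if similar_to_original:
        return "TL + FT earlier layers"
    return "TL + FT entire network"

print(transfer_strategy(2000, similar_to_original=True))
print(transfer_strategy(20000, similar_to_original=False))
```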
Other Applications of Transfer Learning
Image Captioning: Google “Show and Tell”
https://github.com/tensorflow/models/tree/master/im2txt
Image Search
http://www.slideshare.net/ScottThompson90/applying-transfer-learning-in-tensorflow
Training: Thesis
Train to differentiate between Manual and Auto
● Manual thumbnails are (usually) better than Auto
● Positives: Manual thumbnails with high views and play rate;
negatives: randomly selected Auto thumbnails with low plays
● We have a lot of examples: 10K+ manual
● We used InceptionV3 pre-trained on ImageNet
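The labelling scheme described above could be assembled roughly as follows. Everything specific here is hypothetical: the record fields (`kind`, `views`, `plays`), the definition of play rate as plays divided by views, the thresholds, and the choice to balance the two classes are assumptions for illustration, not details from the talk.

```python
import random

def play_rate(t):
    # Assumption: play rate = plays / views
    return t["plays"] / t["views"] if t["views"] else 0.0

def build_training_set(thumbnails, min_views=1000, high_rate=0.5,
                       low_rate=0.1, seed=0):
    """Return (id, label) pairs: 1 = good Manual, 0 = low-performing Auto."""
    positives = [t for t in thumbnails
                 if t["kind"] == "manual"
                 and t["views"] >= min_views
                 and play_rate(t) >= high_rate]
    low_auto = [t for t in thumbnails
                if t["kind"] == "auto" and play_rate(t) <= low_rate]
    # Randomly sample negatives, at most as many as there are positives
    rng = random.Random(seed)
    negatives = rng.sample(low_auto, min(len(positives), len(low_auto)))
    return [(t["id"], 1) for t in positives] + [(t["id"], 0) for t in negatives]

# Made-up catalogue entries
catalog = [
    {"id": "a", "kind": "manual", "views": 5000, "plays": 3000},
    {"id": "b", "kind": "manual", "views": 200, "plays": 150},   # too few views
    {"id": "c", "kind": "auto", "views": 4000, "plays": 200},
    {"id": "d", "kind": "auto", "views": 4000, "plays": 2400},   # Auto that plays well
    {"id": "e", "kind": "auto", "views": 3000, "plays": 90},
]
dataset = build_training_set(catalog)
print(dataset)
```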
Training: Examples
Positive (Manual)
Negative Examples
Negative (Auto)
Video Pre-Filter
Use FFmpeg to select the top 100 frame
candidates
Methods:
● Color histogram changes to avoid
dupes
● Coded Macroblock information
● Remove “black” frames
● Measure motion vectors
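Two of the methods above, dropping "black" frames and using histogram changes to skip near-duplicates, can be illustrated in plain Python. This is a sketch of the idea only, not the FFmpeg-based pipeline from the talk: it uses grayscale frames as nested lists for brevity (a colour version would compute one histogram per channel), and the thresholds are arbitrary assumptions.

```python
def histogram(gray, bins=8):
    """Normalised intensity histogram of a frame with values 0-255."""
    counts = [0] * bins
    n = 0
    for row in gray:
        for v in row:
            counts[min(v * bins // 256, bins - 1)] += 1
            n += 1
    return [c / n for c in counts]

def hist_distance(h1, h2):
    """Half the L1 distance: 0 for identical histograms, 1 for disjoint ones."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))

def prefilter(frames, black_level=16, min_change=0.2):
    """Return indices of frames that are not black and differ from the last kept frame."""
    kept, last_hist = [], None
    for idx, gray in enumerate(frames):
        pixels = [v for row in gray for v in row]
        if sum(pixels) / len(pixels) < black_level:
            continue                      # drop "black" frames
        h = histogram(gray)
        if last_hist is not None and hist_distance(h, last_hist) < min_change:
            continue                      # drop near-duplicates
        kept.append(idx)
        last_hist = h
    return kept

# Toy 4 x 4 frames (assumed data)
black = [[0] * 4 for _ in range(4)]
mid = [[100] * 4 for _ in range(4)]
bright = [[220] * 4 for _ in range(4)]

kept = prefilter([black, mid, mid, bright])
print(kept)
```

Filtering cheaply before scoring keeps the number of frames sent to the neural network small.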
Motion Vectors
Source: Sintel, Blender Studios
Engineering
Demo: Evaluation Tool
Demo: Examples
Original Auto (10th second frame) vs. top scored frames from new model
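Picking the top scored frames comes down to ranking candidates by the model's output. A small sketch, assuming a hypothetical mapping from frame index to score; the values are made up.

```python
import heapq

def top_frames(scores, k=3):
    """scores: mapping of frame index -> model score; returns the k best indices."""
    return heapq.nlargest(k, scores, key=scores.get)

# Made-up scores for five candidate frames
scores = {0: 0.12, 40: 0.91, 95: 0.55, 130: 0.78, 210: 0.30}
best = top_frames(scores, k=3)
print(best)
```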
What’s Next
● Refinements:
○ Fine tuning to earlier layers
○ Other models: ResNetV2, Xception
○ Pre-Filtering: adaptive, hardware accel.
● Products:
○ New auto thumbnails
○ Thumbstrips
Resources
Blog Posts:
● https://research.googleblog.com/2016/03/train-your-own-image-classifier-with.html
● https://github.com/tensorflow/models/tree/master/inception
● http://iamaaditya.github.io/2016/03/one-by-one-convolution/
● http://www.slideshare.net/ScottThompson90/applying-transfer-learning-in-tensorflow
● https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
● http://cs231n.github.io/transfer-learning/
● https://research.googleblog.com/2015/10/improving-youtube-video-thumbnails-with.html
● https://pseudoprofound.wordpress.com/2016/08/28/notes-on-the-tensorflow-implementation-of-inception-v3/
● https://adeshpande3.github.io/adeshpande3.github.io/The-9-Deep-Learning-Papers-You-Need-To-Know-About.html
Papers:
● Rethinking the inception architecture for computer vision. https://arxiv.org/abs/1512.00567
● Xception: Deep Learning with Depthwise Separable Convolutions. https://arxiv.org/abs/1610.02357
● CNN Features off-the-shelf: an Astounding Baseline for Recognition. https://arxiv.org/abs/1403.6382
● DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition. https://arxiv.org/abs/1310.1531
● How transferable are features in deep neural networks? https://arxiv.org/abs/1411.1792
