2. Context (1/2)
Increasing consumption of online video content
easy-to-use devices and online services
cheap storage and bandwidth
more and more people going online
Increasing availability of online video content
digitization of professional video archives
popularity of user-generated video content
8/11/2012 2
3. Context (2/2)
Some statistics
professional video content
BBC Motion Gallery (as of January 2009)
offers over 2.5 million hours of video content
with video content dating back 60 years in time
user-generated video content
YouTube (as of October 2012)
people watch 4 billion hours of video content each month
people upload 72 hours of video content each minute
4. Digital Video Overload (1/2)
Problem description
our ability to manage video content cannot keep
up with our ability to create video content
Cause
to facilitate text-based video search, we need to
manually annotate video content with textual labels
5. Digital Video Overload (2/2)
Real cause
people experience manual video annotation as time-consuming
and cumbersome, and thus forgo the effort
Solution
automatic video content understanding
that is, computerized translation of pixels into text
example text label: “Curiosity on Mars”
6. Automatic Video Content Understanding
Traditionally: video content analysis
works reasonably well in highly controlled environments
room for improvement in terms of applicability and
effectiveness
Nowadays: video content analysis, enhanced with
unstructured knowledge from the Social Web, and/or
structured knowledge from the Semantic Web
two use cases
7. Social Video Face Annotation (1/2)
Description
improving face annotation for personal video collections
by harvesting online social network context
Goal of video face annotation
[Figure: faces labeled person 1, person 2, and person 3]
Search for people
8. Social Video Face Annotation (2/2)
[Figure: social context harvested from an online social network]
Contact list: contact 1, contact 2, contact 3, contact 4, contact 5, contact 6
Labeled face images
occurrence probabilities + co-occurrence probabilities
video face recognition using
visual features
robust video face recognition
using visual and social features
[ published in IEEE ToMM, 2011 ]
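One way to read "visual and social features" is as a rescaling of visual match scores by the occurrence and co-occurrence probabilities derived from the contact list. The sketch below is an illustration under that assumption, not the published method: the function name `fuse_scores`, the multiplicative fusion rule, and all numbers are hypothetical.

```python
def fuse_scores(visual, occurrence, co_occurrence=None, known=None):
    """Rescale visual match scores by social-context probabilities.

    visual:        contact -> visual similarity score for one detected face
    occurrence:    contact -> probability that the contact appears at all
    co_occurrence: (known, contact) -> probability of appearing together
    known:         a contact already identified in the same scene, if any
    """
    fused = {}
    for contact, score in visual.items():
        weight = occurrence.get(contact, 0.0)
        if co_occurrence is not None and known is not None:
            weight *= co_occurrence.get((known, contact), 0.0)
        fused[contact] = score * weight
    total = sum(fused.values())
    if total == 0.0:
        # No social evidence at all: fall back to the visual scores.
        return dict(visual)
    # Normalize so the fused scores sum to 1.
    return {contact: s / total for contact, s in fused.items()}


# Hypothetical numbers, for illustration only.
visual = {"contact 1": 0.40, "contact 2": 0.35, "contact 3": 0.25}
occurrence = {"contact 1": 0.10, "contact 2": 0.60, "contact 3": 0.30}

fused = fuse_scores(visual, occurrence)
best = max(fused, key=fused.get)
print(best)
```

With these made-up inputs, visual features alone favor contact 1, while the fused score favors contact 2 because that contact appears far more often in the collection.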
9. Annotation of Live Soccer Video (1/2)
Description
annotation of live soccer video by harvesting collective
knowledge from Twitter
Goal of annotating soccer video
Example event labels: logo, attack, goal, trainer, logo
Search for events
10. Annotation of Live Soccer Video (2/2)
[Figure: tweet rate over time; x-axis: Time (s), 0 to 10; y-axis: Tweets/s, 0 to 6]
soccer event detection using
visual features
Twitter-assisted annotation
of live soccer video
What is happening?
What are people saying?
[ submitted to IEEE ToMM, 2012 ]
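The tweet-rate plot suggests that "What is happening?" can be approximated by looking for bursts in Tweets/s. A minimal sketch, assuming a simple mean-plus-k-standard-deviations threshold; the function name `detect_bursts`, the parameter `k`, and the sample rates are hypothetical and not taken from the submitted paper.

```python
from statistics import mean, pstdev


def detect_bursts(tweets_per_second, k=2.0):
    """Return the seconds whose tweet rate exceeds mean + k * std."""
    mu = mean(tweets_per_second)
    sigma = pstdev(tweets_per_second)
    threshold = mu + k * sigma
    return [t for t, rate in enumerate(tweets_per_second) if rate > threshold]


# Hypothetical tweet rates for 11 consecutive seconds (0 s to 10 s).
rates = [1, 1, 2, 1, 1, 6, 5, 1, 2, 1, 1]

bursts = detect_bursts(rates, k=1.0)
print(bursts)
```

The seconds flagged as bursts are candidate event moments; the text of the tweets posted in those seconds ("What are people saying?") would then supply the event label.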
11. Other Use Cases
Movie actor recognition
Semantic video copy detection
Audiovisual enrichment of text documents
12. Research Challenges (1/2)
Design of techniques that jointly take advantage
of unstructured and structured knowledge
unstructured knowledge: collective knowledge
structured knowledge: Linked Data Cloud
cf. “Everything is Connected” for video content enrichment
http://everythingisconnected.be/
Design of techniques for translating unstructured
knowledge into structured knowledge
velocity, volume, and variety
sparsity, ambiguity, and complexity
13. Research Challenges (2/2)
Design of effective semantic similarity metrics
visual distance
semantic distance
Design of user-oriented performance metrics
need to go beyond the use of precision and recall
need to better capture whether the needs of users
have been met by a video content retrieval system
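For reference, precision and recall are the two baseline metrics named above. A minimal sketch of how they are computed over sets of video identifiers; the function name `precision_recall` and the identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Compute set-based precision and recall for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: four videos returned, three videos actually relevant.
retrieved = ["v1", "v2", "v3", "v4"]
relevant = ["v2", "v4", "v7"]

precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)
```

Both values depend only on set overlap. They ignore ranking order and whether the user actually found what they were looking for, which is one reason a user-oriented metric may be needed.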