Bringing Context to Multimedia Content Analysis

Multimedia Content Understanding:
Bringing Context to Content
Dr. Benoit Huet
HDR Presentation

Université de Nice Sophia-Antipolis
October 3rd 2012

Welcome
 Jury Members
 Tat-Seng CHUA (NUS, Singapore) [reviewer]
 Patrick GROS (INRIA, France) [reviewer]
 Alan SMEATON (DCU, Ireland) [reviewer]
 Edwin HANCOCK (University of York, UK) [member]
 Bernard MERIALDO (EURECOM, France) [member]
 Nicu SEBE (University of Trento, Italy) [member]

03/10/2012

B. HUET - HDR Presentation

-2

Talk Outline
 Curriculum Vitae
 Research Activities

 A Technical Presentation
 Event Media Modeling based on User Generated Content

 Conclusions and Research Perspectives
 Questions/Discussion
03/10/2012


-3

Curriculum Vitae


Ecole Superieure de Technologie Electrique (1992)
Bachelor of Science: Electrical Engineering and Computing

University Of Westminster (1993)
MSc Artificial Intelligence (with distinction)
 Thesis: Recurrent neural networks for temporal sequence recognition

University Of York (1999)
PhD Computer Science (Computer Vision)
 Thesis: Object recognition from large libraries of line-patterns

University Of Westminster (1994-95)
Part-time Lecturer

University Of York (R. Associate) (1998)
Learning 2D and 3D object models from 2D scenes

Eurecom (Assistant/MdC) (since 1999)
Multimedia content analysis, indexing and retrieval

03/10/2012


-4

Curriculum Vitae
 Teaching:
 Multimedia Technologies (course leader)
• Rated 3.25/4.00 by students (33) in Spring 2012

 Multimedia Advanced Topics (course leader)
• Rated 3.25/4.00 by students (25) in Fall 2012

 Intelligent Systems (course leader: B. Merialdo)
 Multimedia Information Retrieval (course leader: B. Merialdo)
 Artificial Neural Networks (94/95)
03/10/2012


-5

Curriculum Vitae


Mentoring



PhD Advisor:



Mathilde Sahuguet, “Multimedia Mining on the Web”, since March 2012.
Xueliang Liu, "Event-based Social Media Data Mining" since 2009.



Stephane Turlier, PhD from Telecom ParisTech in 2011
"Personalisation and Aggregation of Infotainment for Mobile Platforms“
Marco Paleari, PhD from Telecom ParisTech in 2009
"Affective Computing; Display, Recognition and Artificial Intelligence"
Rachid Benmokhtar, PhD from Telecom ParisTech in 2009
"Fusion multi-niveau pour l'indexation et la recherche multimedia par le contenu semantique"
Eric Galmar, defended his PhD from Telecom ParisTech in 2008
"Representation and Analysis of Video Content for Automatic Object Extraction".







PhD Co-advisor (with B. Merialdo):




Joakim Jiten, PhD from Telecom ParisTech in 2007
"Multidimensional hidden Markov model applied to image and video analysis"
Fabrice Souvannavong, PhD from Telecom Paris in June 2005
"Semantic video content indexing and retrieval“
Ithery Yahiaoui, PhD from Telecom Paris in October 2003
"Automated Video Summary Construction“

03/10/2012


-6

Curriculum Vitae
 Scientific Visibility
 Editorial Boards:
 Multimedia Tools and Application (Springer),
 Multimedia Systems (Springer),
 Guest Editor for EURASIP Journal on Image and Video Processing: selected
papers from MultiMedia Modeling 2009,
 Guest Editor for IEEE Multimedia special issue on Large Scale Multimedia
Retrieval and Mining (July 2011),
 Guest Editor for IEEE Multimedia special issue on Large Scale Multimedia Data
Collections (July 2012),
 Guest Editor for the Journal of Media Technology and Applications special issue
on Multimedia Content Analysis,
 Guest Editor for Multimedia Systems special issue on Social Media Mining and
Knowledge Discovery.

03/10/2012


-7

Curriculum Vitae
 Scientific Visibility
 Reviewing:
 Most journals of the field: ACM Multimedia Systems, IEEE PAMI, IEEE Multimedia, …
 Most conferences of the field: ACM Multimedia, IEEE ICME, ACM SIGIR, MMM, …

 Conference Organisation





MMM 2009 General Chair
ACM MM 2012: Area Chair (Content Processing Track)
ACM ICMR 2012: Tutorial Chair
…

 Technical Committees
 Chair of the IEEE Multimedia Communication Technical Committee (VAIG) Visual
Analysis and Content Management for Communications
 Vice Chair of the IAPR Technical Committee 14 Signal Analysis for Machine
Intelligence

03/10/2012


-8

Curriculum Vitae
 Publications:
 Books and book chapters: 5
 Journals: 11
 Conferences: 99 (90 International + 9 National)
 Technical reports: 6
 Invited Talks/Seminars: 16
 Panels: 3 ( 2 panelist + 1 moderator)
 Patent: 1
03/10/2012


-9

Curriculum Vitae


Current Projects
 ALIAS (EU AAL/ANR): Adaptable Ambient LIving ASsistant
 mobile robot system that interacts with elderly users
 promotes social inclusion by creating connections to people and events in the wider world.

 EventMap (EIT ICT Labs):
 demonstrate the use of explicit representations of events to organize the provision and exchange of
information and media.

 LinkedTV (EU FP7): TeleVision Linked to the Web
 A novel practical approach to Future Networked Media
 Based on four phases: annotation, interlinking, search, and usage (including personalization,
filtering, etc.).

 MediaMixer (EU FP7):
 re-purposing and re-using media fragments
 media production, library, TV archive, news production, e-learning and UGC portal industries.



Past Projects
 8 (2 direct contracts, 2 National, 4 European)

03/10/2012


- 10

Research Activities
 Multimedia Content Analysis/Understanding
 Information/Data Overload
 72 hours of video are uploaded every minute
 4 billion hours of video are watched each month
 More than 20% of views come from mobile devices
 3 hours of video is uploaded per minute from mobile devices
 3M Photos/day
 85M Photos/day
 Video is over 40% of today’s Internet Traffic

 Need for Efficient and Scalable Content-Based Indexing/Search
Tools
03/10/2012


- 11

Research Activities

 Content and Context
Ubiquitous Media Capturing Devices
Sensor data complements Media
• Clock / GPS / Gyroscope / Accelerometer

 Using Context to better analyse Content
03/10/2012


- 12

Research Activities
 Social Networks
 User Generated Content
 Additional data complements Media
• Comments, Tags, +1/Like, etc…

 Reliability Issues!
 Few UGC feature tags!

 Using Context to better analyse Content
03/10/2012


- 13

Research Activities

 Social Networks
 Long term research objective:

 How CONTEXTual information can help analyse CONTENT
better
Content without context is meaningless
[Ramesh Jain, 2008]

03/10/2012


- 14

Research Activities
 Bringing Context to Content:
 Internal Context
Knowledge/Data extracted from within the document

 External Context
Knowledge/Data associated or found in conjunction with the
document

03/10/2012


- 15

Research Activities

 Spatio-temporal Video Segmentation
 Context at the pixel level
Spatial Segmentation
Create atomic homogeneous color regions,
constrained by a contour map

Temporal Grouping with Feature Points
Establish temporal edges between regions
Group regions that are strongly linked

Space-Time Merging
Refine linkage of static regions using local
and global constraints

Initialization
Define the coarseness of the
segmentation

03/10/2012


ARG Matching
Compare ARGs between frame pairs to
validate creation of new regions

- 16

[CIVR‟07]

Research Activities

 Structural Representation of Video Objects
 Region co-occurrence

[CIVR‟05/IEE VISP‟05]
03/10/2012


- 17

Research Activities

 Spatio-Temporal Semantic Segmentation
 Concept Co-occurrence

[MMSP‟08]
03/10/2012


- 18

Research Activities

 High-Level Fusion
 Concept Co-Detection

[MIR‟08]
03/10/2012


- 19

Research Activities

 Multimodal Emotion
Recognition

[CBMI’08, CIVR’10]
03/10/2012


- 20

Research Activities
Knowledge/Data extracted from within the document

Knowledge/Data associated or found in conjunction with the
document

03/10/2012


- 21

Research Activities

 High-Level Fusion
 Concept Co-occurrence

[MTAP‟11]
03/10/2012


- 22

Research Activities

 Large Scale MM Annotation
 Tags
 Categories
 Comments
[VLSMCMR‟10]
03/10/2012


- 23

Research Activities

 Mining Events from the Web

Event 1

 Machine tags
 Geolocation
 Timestamps
[ACM WSM‟11]
03/10/2012


- 24

Event Media Modeling
based on User Generated Content
Xueliang Liu and Benoit Huet

What is an Event?

03/10/2012


- 26

What is an Event?
VIGTA
2012
Capri
Italy

03/10/2012


- 27

Big Data!
Media Sharing

Event
Directory

03/10/2012


Social Apps/Networks

Search Engines

- 28

Search For media

03/10/2012


- 29

Searching for an event

03/10/2012


- 30

Media explicitly associated with the event

03/10/2012


- 31

REST API for query

03/10/2012


- 32

Objective
 Automatically and explicitly associate media with their
originating event
 Build event visual appearance models

 Model training requires both positive and negative
samples
 Can we mine the training set automatically online?
03/10/2012


- 33

Related Works – Event Based
 EventBurn.com
 Create summaries about given events (searching Twitter, Facebook,
and Flickr)

 Firan et al. [CIKM’10]
 Event categorization from social media data

 Gao et al. [WWW’11]
 Employing Twitter data to enrich event information

 Mattivi et al. [ACM workshop on Modeling and Representing
Events’11]

 Event and Sub-Event clustering (visual features and time)
03/10/2012


- 34

Related Works – Concept Based
 Li et al. [CVPR ’07]
 OPTIMOL: automatic Online Picture collecTion via Incremental MOdel
Learning

 Schroff et al. [IEEE PAMI ’11]
 Harvesting image databases from the Web

 Li et al. [ICMR ’11]
 Social negative bootstrapping to model concepts

 Automatically collect samples for specified concepts

 Extension to Events using Contextual information
03/10/2012


- 35

Automated Event Modeling FrameWork

Positive
Samples
Event

Negative
Samples

03/10/2012


- 36

Event
Model

Event Machine Tags
 A way to explicitly link Media and Events

03/10/2012


- 37

Positive Samples Collection
 Machine Tag

 Abbreviation of events name + Geo-Tag
 For example “ACMMM12” is the tag to query photos from “ACM
Multimedia 2012”
03/10/2012


- 38


Positive
Samples
Event

Negative
Samples

03/10/2012


- 39

Event
Model

Mining Photos from Sharing Platforms
[Figure: Flickr API query by Location (Melkweg, Amsterdam) and Date]


- 40

Negative Samples Collection
 Assumption 1: Photos with recurrent tags captured near the
location of the event describe the location/region not the
event.
 Assumption 2: Photos taken near the location of the event
and in the same period offer better discriminating power
than random photos.
 Collecting Approach
 Collect all the media captured near the event’s location and time
 Extract tags from the collection, and rank them according to their frequency of
appearance.
 Keep the top tags as common tags and use them to rank photos by
similarity
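
The collecting approach above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation behind the slides: the photo records (dicts with `id` and `tags`), the function names, and the use of Jaccard overlap as the tag similarity are illustrative choices, since the slides do not name the similarity measure.

```python
from collections import Counter


def common_tags(photos, top_n):
    """Return the top_n most frequent tags over all candidate photos."""
    # Count each tag at most once per photo.
    counts = Counter(tag for p in photos for tag in set(p["tags"]))
    return [tag for tag, _ in counts.most_common(top_n)]


def rank_negative_candidates(photos, top_n, top_m):
    """Rank photos by similarity to the common tags and keep the top_m."""
    common = set(common_tags(photos, top_n))

    def score(photo):
        # Assumption: Jaccard overlap between photo tags and common tags.
        tags = set(photo["tags"])
        union = tags | common
        return len(tags & common) / len(union) if union else 0.0

    return sorted(photos, key=score, reverse=True)[:top_m]
```

Photos whose tags mostly describe the place (for instance "amsterdam", "canal") rise to the top and serve as negative samples, while photos tagged with event-specific words are pushed down.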
03/10/2012


- 41


Positive
Samples
Event
tag1

tags

Top N
tags

tag2
tag3
tagN

Top M
Photos

Pic2
Pic3

……….

Rank tags
by frequency

03/10/2012

Pic1

PicM

Negative
Samples

……

Rank Photos
by distance to tags


- 42

Event
Model

The DataSet


10 LastFM concerts, 3 international conferences
and 1 popular carnival
EventID             Positive   Negative    Testing   Testing
                    Samples    Candidate   Pos       Neg
lastfm:804783       441        1063        466       64
lastfm:1830095      716        748         398       134
lastfm:1858887      408        745         431       266
lastfm:1499065      348        712         16        153
lastfm:1787326      446        913         0         313
lastfm:1351984      307        584         498       19
lastfm:1842684      602        1125        535       78
lastfm:2020655      538        745         750       6
lastfm:1301748      944        541         1157      80
lastfm:1370837      592        1025        592       115
ACMMM10             100        557         178       23
SIGIR2010           30         525         0         201
ACMMM07             118        64          15        44
NICECarnival2011    52         848         60        209
Total               5642       10195       5096      1705

- 43

DataSet Examples
Positive Samples

03/10/2012

Negative Samples


Test Positive

Test Negative

- 44

Event Model Training
 Feature:
 400D Bag of Words from SIFT features.

 Model:
 SVM implemented with libSVM
 RBF kernel
 Cross validation is used to
optimize the parameters

03/10/2012


- 45

The (Negative Samples) Model Parameters
 R: the distance between the location where a photo was taken and the
event venue

 D: the time span between the moment a photo was taken and the time of
the event
-An example on event lastfm:804783:

03/10/2012


R and D should be
large enough to
pool a diverse set
of photos
- 46

Visual Event Modeling Results
EventID             Query    Our         k-NN      Random   Uniform
                             Algorithm   Pruning   Sample   Negative
lastfm:804783       87.92    88.68       46.98     50.00    75.85
lastfm:1830095      74.81    78.38       80.26     96.62    84.96
lastfm:1858887      61.84    63.41       63.56     76.47    73.89
lastfm:1499065      9.47     90.53       89.94     92.90    89.35
lastfm:1787326      0.00     98.40       92.65     97.12    42.49
lastfm:1351984      96.32    96.32       55.32     86.65    93.81
lastfm:1842684      87.28    87.93       67.86     79.28    87.11
lastfm:2020655      99.21    91.80       71.69     75.00    94.58
lastfm:1301748      93.53    93.53       73.73     64.83    93.21
lastfm:1370837      83.73    85.15       73.83     60.25    80.62
ACMMM10             88.56    91.04       87.56     86.57    89.05
SIGIR2010           0.00     60.19       42.28     16.41    22.38
ACMMM07             25.42    57.62       46.61     28.81    27.18
NICECarnival2011    22.30    76.58       59.10     55.39    56.51
Average Accuracy    69.41    83.31       68.64     70.07    73.42

03/10/2012


- 48

Conclusions
 Visual modeling of events allows media to be attached to
their corresponding event
 Device and User Metadata provide interesting and
valuable clues for automatically constructing a
ground truth
 Visual Event Models can be created in an
unsupervised way

 Detecting Events from Social Media activity
03/10/2012


- 49

Conclusions and Future Work
 Combine multiple information sources
(Tweets, Social Graph, etc…) to detect events and
enrich them with media.
 Meta-Objective: Social Event analysis based on
connections between events, media and participants

 CONTEXTual Information contributes
significantly to CONTENT understanding
03/10/2012


- 50

Research Directions
 Accenture Technology Vision [2012]
Context-based services
Social-driven IT

 Cisco forecasts 65% of Internet Traffic
will be video by 2015
 Need for efficient and effective
Multimedia Content Understanding
 Are these multimedia content related?
 What event is depicted in this document?
 What is this video about?
03/10/2012


- 51

Research Directions
 Are these multimedia content related?
 Future Digital Television
 Interactive TV: no commercial success
 2nd Screen:
70% of mobile device users use them while watching TV.

 Need for relevant additional content
 Web Media Mining for objects, people and events
based on A/V Content and Contextual Information
03/10/2012


- 52

Research Directions
 What event is depicted in this document?
 Events are a natural structuring element for
Humans
 Public events (show,…) vs Private events (birthday,…)

 Initial promising results on public events

 Extension to private events
 Social Media Graph

 2 ACM MM Grand Challenges in 2012
03/10/2012


- 53

Research Directions
 What is this video about?
 Users are becoming Broadcasters
 User Generated Content
YouTube tutorials, product test, etc…

 Business Intelligence
 Harvest the social web for media documents related to
products and understand its content
 visual detection of product
 emotion recognition
03/10/2012


- 54

Questions?

 Thank you for your attention.
03/10/2012


- 55

Visual Data in the 90’s
 Huet & Hancock [WACV’96]

Digital Map
Ground Truth

03/10/2012

Corresponding aerial images
taken at different aircraft altitudes


- 57

Large Scale in the 90’s
 Huet & Hancock [IEEE PAMI’99]
 Cartographic Database





22 original images
Aerial scenes
Main features: roads
100-1000 lines per image

 Trademarks and logos Database [Flickner et al. ’95]
 Over 1000 original images
 Scanned data
 B&W, Various resolution
03/10/2012


- 58

The TRECVID years (2001 to date)
 2001: 11 hrs from BBC & OpenVideo Project
 2003 first collaborative ground truth annotation

 2005-2006: 170 hrs (Nov.’04 news in Arabic, Chinese,
and English)
 High-level feature extraction (10)

 2007-2009: 100hrs from the Netherlands Institute for
Sound and Vision (news magazine, science news, news reports,
documentaries, educational programming, and archival video)

 2010-2011: 600hrs of MPEG-4 Creative Commons
Videos
 High-level feature extraction (light=50 full=364)
03/10/2012


- 59

The Trend:
 Datasets are going Large-Scale (Web-Scale)
...slowly...
Multimedia / Computer Vision researchers
are tackling and experimenting
with Large-Scale data
MIRflickr / NUSWide / ImageNet / MCG-WEBV
 Issue:
1 research objective <-> 1 data corpus

03/10/2012


- 60

Talk Outline
 The scene / motivation
 Social Events and Big Data
 Using social platforms for creating a corpus automatically

 Social Event Detection
 Using social media for detecting events

 Social Event Media Mining
 Enriching Event‟s Illustrations through Web Mining

 Conclusions
03/10/2012


- 61

Event Detection
by Temporal Analysis
X. Liu, R. Troncy and B. Huet

Event Detection - Related Work
 EventBurn.com
 Create summaries about given events (searching
Twitter, Facebook, and Flickr)

 Firan et al. (CIKM’10)
 Event categorization from social media data

 Gao et al. (WWW’11)
 Employing Twitter data to enrich event information
03/10/2012


- 63

How to mine events from PhotoSet…

Events ??

03/10/2012


- 64

Observation
 Media are captured during events and shared

03/10/2012


- 65

How fast are media uploaded?

03/10/2012


- 66

Experiment Data
 9 Attractive Venues WorldWide
Venue Name

NbEvents

NbUsers

352
151
106
24
79
148
79
212
204

Melkweg
Koko
HMV Forum
111 Minna Gallery
HMV Hammersmith Apollo
Circolo degli Artisti
Circolo Magnolia
Ancienne Belgique
Rotown

NbPhotos
6912
3546
2650
1369
2124
2571
2190
7831
3623

266
155
130
105
96
86
76
56
49

 Event Ground Truth obtained from the official agendas
03/10/2012


- 67

Detecting and Identifying Events
 Our solution consists of 3 steps:
 Location Monitoring: finding the bounding-box of venues.
 Temporal Analysis: detecting events by analyzing the
uploading behavior along time.
 Event Topic Identification: identifying detected events’ topics
through tag analysis.
[Figure: Location Monitoring, then Temporal Analysis (plot of daily activity from 10/05/01 to 10/05/31), then Event Topic Identification, leading to Results]
- 68

Event Detections
 Region Monitoring

03/10/2012


- 69

Venue Bounding Box Estimation
INPUT: VenueName
OUTPUT: BoundingBox
PhotoSet ← []
Center ← GetInfo(VenueName)
EventSet ← GetPastEvents(VenueName)
foreach event in EventSet do
    photos ← GetFlickrPhoto(event)
    PhotoSet.append(photos)
end
GeoSet ← GetGeoInfo(PhotoSet)
Filter(GeoSet, Center, threshold = 1km)
RETURN MinRect(GeoSet)
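
The last two steps of the procedure (filtering geo-points around the venue centre, then taking the minimal rectangle) can be sketched in Python. This is only an illustration under assumptions: the function names are hypothetical, points are plain `(latitude, longitude)` pairs in degrees, and the great-circle (haversine) distance is one reasonable choice for the 1 km filter, which the slide does not specify.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def venue_bounding_box(center, geo_points, threshold_km=1.0):
    """Drop points farther than threshold_km from center, then return the
    minimal rectangle (min_lat, min_lon, max_lat, max_lon), or None."""
    kept = [p for p in geo_points if haversine_km(center, p) <= threshold_km]
    if not kept:
        return None
    lats = [lat for lat, _ in kept]
    lons = [lon for _, lon in kept]
    return (min(lats), min(lons), max(lats), max(lons))
```

The filter discards photos whose geo-tag is clearly wrong or belongs to another place, so that a few outliers do not inflate the rectangle.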
03/10/2012


- 70

Venue Bounding Boxes (a selection)

Paradiso

Melkweg
03/10/2012

HMV Hammersmith Apollo

KoKo

- 71

Analyzing the number of Photos
[Figure: REST query by Location (Melkweg) and Date]


- 72

Our Media DataSet
 Flickr Photos
 Taken in May 2010
 In either one of the 9 selected locations:
Name
Number of Photos
Koko
Rotown
Melkweg
HMV Forum
111 Minna Gallery
Ancienne Belgique
Circolo Magnolia
Hammersmith Apollo
Total :

03/10/2012

Geo-tagged
372
90
363
184
937
2206
70
95
287
4604

Venue Name tagged
2040
273
700
412
3
288
553
236
84
4589


Overlap

Total

3
1
8
0
0
2
1
0
0
15

2409
362
1055
596
940
2492
622
331
371
9178

- 73

Analyzing the number of Photos
[Figure: daily number of photos from 10/05/01 to 10/05/31 (y-axis 0 to 250); peaks annotated "Events ??"]

Number of Photos taken in Melkweg (NL) in May 2010
03/10/2012


- 74

Analyzing the number of Photos Owners
[Figure: daily number of photo owners from 10/05/01 to 10/05/31 (y-axis 0 to 14); peaks annotated "Events ??"]

Number of Photo Owners in Melkweg in May 2010
03/10/2012


- 75

Event Detection Approach
 Based on media upload activity
 At a given time
 At a given location

 Events can be detected by: $e_t = \arg_i (t_i > T)$

 Where
$t_i = N_{photos} \times N_{owners}$
$T$: Threshold
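
The detection rule can be illustrated with a short Python sketch. It rests on assumptions: the function name and the dict-based inputs are hypothetical, and the threshold is taken as the mean or median of the daily scores over the observed period, which is one plausible reading of the "mean" and "median" thresholds reported in the results.

```python
from statistics import mean, median


def detect_events(daily_photos, daily_owners, threshold="median"):
    """Return the days whose score N_photos * N_owners exceeds the threshold T.

    daily_photos / daily_owners map a day label to a count.
    Assumption: T is the mean or median of all daily scores.
    """
    scores = {day: n * daily_owners.get(day, 0)
              for day, n in daily_photos.items()}
    values = list(scores.values())
    t = median(values) if threshold == "median" else mean(values)
    return sorted(day for day, s in scores.items() if s > t)
```

Multiplying the two counts favours days on which many different people uploaded photos, and dampens days where a single user uploaded a large batch.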


- 76

Event Topics Mining
 Keep the top N most frequent tags

 Result:

melkweg anouk amsterdam jemaine 2010 european flight flightoftheconchords
conchords fotc mckenzie clement tour bret evelyn

03/10/2012


- 77

Number of photos * Number of photo owners

Event Detection Example

03/10/2012

Melkweg in May 2010


- 78

Number of photos * Number of photo owners

Event Detection Example

03/10/2012

111 Minna Gallery in May 2010


- 79

Event Detection Results
 Detection results on different conditions
Source        Threshold   True Predict   False Predict   F1
Image         mean        43             21              0.211
Image         median      64             51              0.279
Owner         mean        56             56              0.246
Owner         median      58             62              0.251
Image*Owner   mean        34             18              0.172
Image*Owner   median      67             53              0.289

03/10/2012


- 80

Event Detection Results
 Event Detection Statistics
                     Ground   Our Method                               LastFM
Venues               Truth    Detect   Matched   Precision   Recall
Melkweg              69       15       12        0.800       0.174    44
Koko                 20       15       8         0.533       0.400    0
HMV Forum            14       12       9         0.750       0.643    14
111 Minna Gallery    23       15       2         0.133       0.087    0
Ancienne Belgique    38       15       9         0.600       0.237    28
Rotown               16       15       8         0.533       0.500    13
Circolo degli Artisti 22      15       8         0.533       0.364    12
Circolo Magnolia     25       3        1         0.333       0.040    11
Hammersmith Apollo   15       15       10        0.667       0.667    14
In total             242      120      67        0.558       0.277    136
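
As a worked check of the precision and recall figures, the small Python sketch below recomputes them from the counts in the statistics above (for Melkweg: 15 detected, 12 matched, 69 ground-truth events; in total: 120 detected, 67 matched, 242 ground-truth events). The function name is an illustrative assumption; the formulas are the standard definitions, precision = matched / detected and recall = matched / ground truth.

```python
def precision_recall(matched, detected, ground_truth):
    """Precision = matched / detected, recall = matched / ground_truth,
    both rounded to three decimals as on the slide."""
    precision = matched / detected if detected else 0.0
    recall = matched / ground_truth if ground_truth else 0.0
    return round(precision, 3), round(recall, 3)


melkweg = precision_recall(matched=12, detected=15, ground_truth=69)
overall = precision_recall(matched=67, detected=120, ground_truth=242)
```

Here `melkweg` evaluates to `(0.8, 0.174)` and `overall` to `(0.558, 0.277)`, in agreement with the reported values.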


- 81

Events Detection at Melkweg
Venue | Detected Date | Detected Tags | Ground Truth Date | Ground Truth Title | LastFM ID | LastFM Title
melkweg | 03/05/2010 | parkwaydrive drive parkway | 03/05/2010 | Parkway Drive / Despised Icon / Winds Of Plague / The Warriors / 50 Lions | 1336473 | Parkway Drive
melkweg | 02/05/2010 | flight flightoftheconchords conchords | 02/05/2010 | Flight Of The Conchords - UITVERKOCHT | 1439320 | Flight of the Conchords
melkweg | 04/05/2010 | flightoftheconchords | 04/05/2010 | - | 1439407 | Flight of the Conchords
melkweg | 05/05/2010 | mayerhawtorne mayer hawthorne | 05/05/2010 | Mayer Hawthorne & The County | 1416229 | Mayer Hawthorne & The County
melkweg | 11/05/2010 | bonobo | 11/05/2010 | Bonobo - UITVERKOCHT | 1398102 | Bonobo
melkweg | 14/05/2010 | paulweller paul | 14/05/2010 | Paul Weller - UITVERKOCHT | 1406677 | Paul Weller
melkweg | 18/05/2010 | brokensocialscene | 18/05/2010 | Broken Social Scene - UITVERKOCHT | 1334429 | Broken Social Scene
melkweg | 19/05/2010 | mikestern richardbona | 19/05/2010 | Mike Stern band with special guest Richard Bona featuring Dave Weckl & Bob Malach | - | -
melkweg | 25/05/2010 | beattimemelkweg | 24/05/2010 | Beattime - The Kika Edition | - | -
melkweg | 26/05/2010 | beattime | 24/05/2010 | - | - | -
melkweg | 28/05/2010 | offcentre | 28/05/2010 | Off Centre - day 3 - night met Kode 9 / Falty DL / Gold Panda / Kelpe | - | -
melkweg | 30/05/2010 | joannanewsom | 30/05/2010 | Joanna Newsom | 1425481 | Joanna Newsom

03/10/2012

- 82

Collage For illustration
She & Him in Koko 07/05/2010

03/10/2012


- 83

Conclusions on Event Detection
 A novel approach for automatically detecting
social events is presented
 The key idea consists in temporally monitoring
media shared on social web sites at a specific
location (Geo Localized Photo)

 Automatic Efficient Social Event Detection and
Identification can be achieved
03/10/2012


- 84

Enriching Events
with Social Media
X. Liu, R. Troncy and B. Huet

Searching for media about an event

03/10/2012


- 86

Finding more media that illustrate an event
A. Compute the bounding box area of a venue
B. Retrieve all media geo-tagged in this area

C. Retrieve all media with a similar title
D. Prune the results with visual analysis
E. Extend the result set with all media from the
same uploader
03/10/2012


- 87

A. Bounding box of Nouveau Casino?

03/10/2012


- 88

B. 74 photos taken in this area this day

03/10/2012


- 89

C. 85 additional photos with a similar title

03/10/2012


- 90

D. 6 photos after visual pruning






  
03/10/2012




- 91



How is the visual pruning performed?
 Model dataset: photo event id + photo geo
 Testing dataset: similar title

 Low-level features used:
 Color moments, Gabor texture, Edge histogram

 L1 distance on the k-Nearest Neighbors (k-NN)
 Threshold
 Min L1 distance between two model image pairs
 Conservative approach
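
A minimal Python sketch of this pruning step follows. Several details are assumptions rather than the method as implemented: feature vectors are plain lists of numbers (standing in for the concatenated colour, texture and edge descriptors), the k nearest-neighbour distances are averaged, and a candidate is kept when that average does not exceed the threshold. All function names are hypothetical.

```python
def l1(a, b):
    """L1 (city-block) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def pruning_threshold(model):
    """Minimum L1 distance between any two distinct model images.
    Requires at least two model images."""
    return min(l1(model[i], model[j])
               for i in range(len(model))
               for j in range(i + 1, len(model)))


def visual_prune(model, candidates, k=1):
    """Keep the candidates whose mean L1 distance to their k nearest
    model images is at most the (conservative) threshold."""
    t = pruning_threshold(model)
    kept = []
    for name, feat in candidates:
        nearest = sorted(l1(feat, m) for m in model)[:k]
        if sum(nearest) / len(nearest) <= t:
            kept.append(name)
    return kept
```

Because the threshold is the smallest distance observed inside the model set, only candidates that sit very close to a known event photo survive, which is consistent with the high precision and low recall reported for this step.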
03/10/2012


- 92

E. 66 photos after uploader heuristics


hellerpop

DustGraph / Stefan
cartoixa

13 photos
03/10/2012


46 photos
- 93

Same process for videos



1 video (id)
3 videos (geo)
26 videos (title)


03/10/2012

Visual pruning
performed on
key frames
Nb positive > 50%


- 94

How illustrated are the events?
Query By ID

Photos
Videos
(title)

Videos
(title+venue)

Query By Geo

Query By Title

Visual Pruning

Heuristic

5

74 (74)

85 (85)

6 (6)

66 (66)

1

3 (0)

23 (0)

13 (0)

-

10 (10)

 20 events
 Model dataset: 785 photos
 Testing dataset: 1766 photos (1573 positive, 193 negative)
 Results: 439 photos (99% precision, 28% recall)
03/10/2012


- 95

Conclusions
 Method for finding media illustrating scheduled
events
 Search media with machine and geo tags
 Search media with title and normal tags
 Prune visually and retrieve all media from confirmed users
 Challenge: do not necessarily trust the geo-coordinates

03/10/2012


- 96

Event detection with latent
topic model

Framework

Tj

Priori knowledge
Validating Data

Tj
Decision

Learning
Inference
Mass of data

03/10/2012

Ti
Semantic Space
Cluster the documents
by concepts


events distribution on
semantic space

- 98

Ti

Infer topics with LDA model
Parameters:
α and β are the Dirichlet priors on
the per-document topic distribution
and the per-topic word distribution
Inputs:
W is the set of observed words.

 To learn the topics from large-scale data

 To estimate topic distribution on new data
03/10/2012


- 99

Estimate Event Distribution
 Estimate Validating data Distribution (D)
 Learn the event distribution

Dist : KL divergence

 Decision rule:
Where
03/10/2012


- 100

Threshold K
 Inference on validating set

[Figure: number of documents (#Docs) plotted against the threshold K]

Decision:
k = 0.3
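
One possible reading of the decision step, sketched in Python. The decision rule itself did not survive on the slide, so the direction of the comparison (a document is attributed to the event when the KL divergence between its topic distribution and the event's topic distribution is below k) is an assumption, as are the function names and the small smoothing constant `eps`.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL divergence D(p || q) between two discrete distributions.
    eps avoids division by zero when q has empty topics (assumption)."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def is_event_document(doc_topics, event_topics, k=0.3):
    """Assumed rule: accept the document when its divergence from the
    event's topic distribution is below the threshold k."""
    return kl_divergence(doc_topics, event_topics) < k
```

With an event distribution concentrated on a "concert" topic, a photo whose tags load on the same topic is accepted, while a photo dominated by a "canal" or "architecture" topic is rejected.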

03/10/2012


- 101

Examples on Melkweg, Amsterdam
 LDA topics in Amsterdam
-- topic: 0
bus: 0.032322
berkhof: 0.012423
autobus: 0.011886
chair: 0.011348
man: 0.010810
gvb: 0.010810
-- topic: 1
park: 0.042738
green: 0.030343
nature: 0.027130
museum: 0.024835
tree: 0.018408
bird: 0.017031
-- topic: 2
dutch: 0.044336
building: 0.037352
architecture: 0.0359
centrum: 0.028923
cityscape: 0.024829
urban: 0.017123
-- topic: 3
………
canal: 0.059344
boat: 0.029258
bridge: 0.021845
river: 0.015741
water: 0.012253
……..
-- topic: 4
………
event: 0.069694
paradiso: 0.067306
concert: 0.039046
music: 0.033872
live: 0.032280
……..
-- topic: 5
………
03/10/2012

- 102

Decision on Photos on Melkweg

tags="holland netherlands
amsterdam bike canal"
title="201005037961-2"

tags="portrait music
netherlands face rock
concert live band may "
title="Imelda May"
03/10/2012

tags="amsterdam snoekbaars
nikond90 thepowerofplots
pjotrp" title="Amsterdam"

tags="longexposure canon lights
nightshot tram leidseplein lijn2"
title="Carnival Ride"

tags="twitter" title="Gewoon
omdat we niet mogen
fotograferen van She &
Him @ Melkweg"

tags="tweeted" title="#eskimojoe
soundcheck @melkweg sounds
good!"


- 103

Detected Events on Melkweg
Event Title

Date
2010/5/3

Parkway Drive / Despised Icon / Winds Of Plague / The Warriors / 50 Lions
Mayer Hawthorne & The County
She & Him
Gert Vlok Nel
Bonobo - UITVERKOCHT
Gentleman & The Evolution - UITVERKOCHT
Paul Weller - UITVERKOCHT
Broken Social Scene - UITVERKOCHT

2010/5/2
2010/5/4
2010/5/5
2010/5/6
2010/5/7
2010/5/11
2010/5/12
2010/5/14
2010/5/18

Mike Stern band with special guest Richard Bona featuring Dave Weckl & Bob Malach
Off Centre - day 3 - night met Kode 9 / Falty DL / Gold Panda / Kelpe
Joanna Newsom

2010/5/19
2010/5/24
2010/5/28
2010/5/30

03/10/2012


- 104

Results Summary (1)
Social Media Data statistics over Event
Detection
Venue
Melkweg
Koko
111 Minna Gallery
Ancienne Belgique
Rotown
HMV Forum
total

03/10/2012

Total Post

Detection
355
724
313
496
118
167
97
2270


Positive
42
95
26
32
6
46
18
265

Precision
32
44
10
19
4
36
15
160

- 105

0.76
0.46
0.38
0.59
0.67
0.78
0.83
0.60

Results Summary (2)
Social Event Detection Performance
Venue
Melkweg
Koko
111 Minna Gallery
Ancienne Belgique
Rotown
HMV Forum
Total

03/10/2012

GroundTruth
27
15
4
19
7
17
10
99


Detection
14
12
4
10
2
15
6
63

Recall
0.52
0.80
1.00
0.53
0.29
0.88
0.60
0.64

- 106

Conclusion
 A novel approach for detecting social events is
presented
 The idea consists in mining the event distribution
on concepts learned from large scale data
 Future work:
 Exploring the multimodality of data (visual features, EXIF
data…) for event detection
 Modeling the topics efficiently (varying along time)
03/10/2012


- 107

Multimedia Challenges
 Gartner Group “Twelve Technologies for 2000 to 2010”
 content-based retrieval and object recognition.

 Ever increasing volume of multimedia data
(internet + p2p, set-top-box, pda, mobile phone, etc…)

 Cross-device access…
(wireless or wired)

 Access to data remains mostly text based
 Data indexes remain mostly text based
(filename and possibly a few user-supplied metadata)

 Multimedia content analysis for automatic semantic
metadata creation
03/10/2012


- 108

Current Research Themes
 Recognition and Retrieval
 Retrieval in technical drawings [E.P.O.]
 Video Object Analysis
Graph representation

 EC projects: GMF for ITV (PT, IRT, JRS, ...)

 Video Summarisation
 Automated summary construction
 Evaluation of summary‟s performance
 EC projects: SPATION (Philips, Uni. of Brescia, ...)
03/10/2012


- 109

3W3S
 World Wide Web Safe Surfing Services
 The 3W3S Filter is composed of several
components:
 HTTP proxy / ICAP Server [WebWasher],
 a URL filter [WebWasher],
 a PICS filter [Thales],
 Content Word filter [Eurecom].

 Implementation of the content word filter as an
03/10/2012


- 110

SPATION
 Services Platforms and Applications for Transparent Information
management in an in-hOme Network
 Analysis of Audio-Visual Descriptors for Summary Construction [Eurecom]
 1 PhD on Automatic Video Summarisation
 Image similarity measure
TV
Photo
 Patent: Using textual summaries for video summarisation

HiFi

[Philips/Eurecom co-inventor]
Browser

Set-Top-Box
03/10/2012

PDA


- 111
(Remote Control)

GMF4iTV
 Generic Media Framework for Interactive
Television
User
Profile
Semantic Ontology

 Semantic Video Object/Shot Classification
video
 Interactive Personalization



Annotations
Annotations
Annotations
2Extra Annotations
PhD students
content MPEG7
03/10/2012

Set-Top-Box

User
Extra
Content

(Co-advisor)

- 112

Attributed Relational Graph for
Semantic VideoFrame
Object Retrieval Frame
Video
Query Video

Segmentation
Regions
Adjacency Graph

03/10/2012

Object Graph
Matching
Salient object characteristics
are obtained through
Latent Semantic Analysis


113

Perspective Research Themes
 Multimedia Data Analysis and Mining
 Object segmentation (MPEG4),
 Object representation and recognition (MPEG7).

 Multimedia Information Organisation
 Increasing amount of material,
 Tools for easy and rapid navigation and search,
 Internet (WWW, Peer2Peer), Home network.

 Multimedia Document Understanding
03/10/2012


- 114

Multimedia Data Analysis and
Mining

 Object segmentation

 Static vs Dynamic… combined strategy

 Object representation and recognition:
 Objects vs Images
 Multimedia vs ”mono”media
Image
 Extensive use of MPEG7
Motion

 Major

Text
objectives: Speech
of structural constraints
Sounds + Music

 Use
(Attributed Graph) to
improve robustness to feature detection
03/10/2012


- 115

Multimedia Information Organisation
 Organisation for efficient retrieval:
 Matching/recognition is slow, efficient data organisation and
indexing is crucial
 Distributed Index: Peer2Peer and home network
P2P scenario under investigation with PF.

 Organisation for visualisation:
Summarisation
 Reduced in time or space, not in semantic content
 Some “Semantic” information required
 Combination of multiple cues in selection mechanism
Allow MPEG7 descriptors to be used as indexing/search terms.
03/10/2012
- 116

Multimedia Document Understanding
 Making sense of multimedia data:
 Manual annotation of documents is time consuming
 Automatic association of “low-level” descriptors with corresponding key-words (ontology)
 Extension of the work on object recognition and retrieval
 Bridging the

[Diagram: Multimedia archives, MPEG7 descriptors; Outdoor/Indoor? Studio? M/F?]

Research Objective and Strategy
 “Habilitation à diriger des recherches”
 Funding for research
 EC RTD Sixth Framework
IST Key Action III Multimedia Content and Tools
 RIAM: Numerisation, Indexation des contenus et gestion
des flux audiovisuel.
 Bourse CIFRE: Bouygues Telecom.
 ACI Masse des données: Pidot [CNRS].

 Prospective Industrial Partners

Conclusion
 Research Themes
 Address challenges of multimedia content analysis
 Build upon existing expertise
 Complementary with Eurecom's current research
projects and themes

 Research Strategy
 Strengthen industrial contacts
 Participate in project proposals

 Strong Points

Questions?


Publications
 Books and Book Chapters
 Raphael Troncy, Benoit Huet, Simon Schenk, "Multimedia semantics:
metadata, analysis and interaction", Wiley-Blackwell, July 2011, ISBN: 9780470747001, pp 1-328
 Rachid Benmokhtar, Benoit Huet, Gael Richard, Slim Essid, "Feature
extraction for multimedia analysis", Book Chapter no. 4 in "Multimedia
Semantics: Metadata, Analysis and Interaction", Wiley, July 2011, ISBN: 978-0-470-74700-1, pp 35-58
 Slim Essid, Marine Campedel, Gael Richard, Tomas Piatrik, Rachid
Benmokhtar, Benoit Huet, "Machine learning techniques for multimedia
analysis", Book Chapter no. 5 in "Multimedia Semantics: Metadata, Analysis
and Interaction", Wiley, July 2011, ISBN: 978-0-470-74700-1, pp 59-80
 Benoit Huet, Alan F. Smeaton, Ketan Mayer-Patel, Yannis Avrithis,
"Advances in Multimedia Modeling", Springer, Lecture Notes in Computer
Science, Subseries: Information Systems and Applications, incl.
Internet/Web, and HCI, Vol. 5371, ISBN: 978-3-540-92891-1
 Benoit Huet and Bernard Merialdo, "Automatic video summarization",
Chapter in "Interactive Video, Algorithms and Technologies" by Hammoud,
Riad (Ed.), 2006, XVI, 250 p, ISBN: 3-540-33214-6, pp 27-41.

 Journals
 Benoit Huet, Tat-Seng Chua and Alexander Hauptmann, "Large-Scale
Multimedia Data Collections", IEEE Multimedia, Volume 19, No. 3, July-September 2012.
 Rachid Benmokhtar and Benoit Huet, "An ontology-based evidential
framework for video indexing using high-level multimodal fusion", Multimedia
Tools and Applications, Springer, December 2011 , pp 1-27
 Rong Yan, Benoit Huet, Rahul Sukthankar, "Large-scale multimedia retrieval
and mining", IEEE Multimedia, Vol 18, No. 1, January-March 2011
 Benoit Huet, Alan F. Smeaton, Ketan Mayer-Patel, Yannis Avrithis, "Selected
papers from multimedia modeling conference 2009", EURASIP Journal on
Image and Video Processing Volume 2010, Article ID 792567
 Fabrice Souvannavong, Lukas Hohl, Bernard Merialdo and Benoit Huet,
"Structurally Enhanced Latent Semantic Analysis for Video Object Retrieval ",
Special Issue of the IEE Proceedings on Vision, Image and Signal
Processing , Volume 152, No. 6, 9 December 2005, pp 859-867.
 Fabrice Souvannavong, Bernard Merialdo and Benoit Huet, "Partition
sampling: an active learning selection strategy for large database
annotation", Special Issue of the IEE Proceedings on Vision, Image and
Signal Processing ,Volume 152 No. 3, May 2005, Special section on
Technologies for interactive multimedia services , pp 347-355.
 Ithery Yahiaoui, Bernard Merialdo and Benoit Huet, "Comparison of multi-episode video summarisation algorithms", EURASIP Journal on Applied
Signal Processing, Special issue on Multimedia Signal Processing, Vol.
2003, No. 1, page 48-55, January 2003.
 Huet B. and E. R. Hancock, "Relational Object Recognition from Large
Structural Libraries", Pattern Recognition, Vol. 35, No. 9, page 1895-1915,
Sept 2002.
 Huet B. and E. R. Hancock, "Line Pattern Retrieval Using Relational
Histograms", IEEE Transactions on Pattern Analysis and Machine
Intelligence, Vol. 21, No. 12, page 1363-1370, December 1999.
 Huet B., A.D.J. Cross and E.R. Hancock, "Shape Recognition from Large
Image Libraries by Inexact Graph Matching", Pattern Recognition in Practice
VI, June 2-4 1999, Vlieland, The Netherlands. Appeared in a special issue of
Pattern Recognition Letters, 20, page 1259-1269, December 1999.
 Huet B. and E.R. Hancock, "Object Recognition from Large Structural
Libraries", Advances in Pattern Recognition: Lecture Notes in Computer
Science (SSPR98), Springer-Verlag, 1451, August 1998.


 International Conferences and Workshops
 Xueliang Liu and Benoit Huet, "Social Event Visual Modeling from Web
Media Data", ACM Multimedia'12 Workshop on Socially-Aware Multimedia,
Nara, Japan, 2012.
 Xueliang Liu and Benoit Huet, "Social Event Discovery by Topic Inference",
WIAMIS 2012, 13th International Workshop on Image Analysis for
Multimedia Interactive Services, 23-25 May 2012, Dublin City University,
Dublin, Ireland.
 Xueliang Liu, Raphael Troncy and Benoit Huet, "Using social media to
identify events" WSM'11, ACM Multimedia 3rd Workshop on Social Media,
November 18-December 1st, 2011, Scottsdale, Arizona, USA
 Symeon Papadopoulos, Raphael Troncy, Vasileios Mezaris, Benoit Huet,
Ioannis Kompatsiaris, "Social event detection at MediaEval 2011:
Challenges, dataset and evaluation", MediaEval 2011, MediaEval
Benchmarking Initiative for Multimedia Evaluation, September 1-2, 2011,
Pisa, Italy
 Xueliang Liu, Raphael Troncy and Benoit Huet, " EURECOM @ MediaEval
2011 social event detection task" MediaEval 2011, MediaEval Benchmarking
Initiative for Multimedia Evaluation, September 1-2, 2011, Pisa, Italy
 Xueliang Liu, Raphael Troncy and Benoit Huet, "Finding media illustrating
events", ICMR'11, 1st ACM International Conference on Multimedia
Retrieval, April 17-20, 2011, Trento, Italy
 Marco Paleari, Ryad Chellali and Benoit Huet, "Bimodal emotion
recognition", ICSR'10, International Conference on Social Robotics,
November 23-24, 2010, Singapore - Also published as LNCS Volume
6414/2010, pp 305-314
 Xueliang Liu and Benoit Huet, "Concept detector refinement using social
videos", VLS-MCMR'10, International workshop on Very large-scale
multimedia corpus, mining and retrieval, October 29, 2010, Firenze, Italy , pp
19-24
 Benoit Huet, Tat-Seng Chua and Alexander Hauptmann, "ACM international
workshop on very-large-scale multimedia corpus, mining and retrieval",
ACMMM'10, ACM Multimedia 2010, October 25-29, 2010, Firenze, Italy , pp
1769-1770
 Xueliang Liu, Benoit Huet, "Automatic concept detector refinement for large-scale video semantic annotation", ICSC'10, IEEE 4th International
Conference on Semantic Computing, September 22-24, 2010, Pittsburgh,
PA, USA , pp 97-100
 Marco Paleari, Benoit Huet, Ryad Chellali, "Towards multimodal emotion
recognition : A new approach", CIVR 2010, ACM International Conference on
Image and Video Retrieval, July 5-7, Xi'an, China , pp 174-181
 Marco Paleari, Ryad Chellali, Benoit Huet, "Features for multimodal emotion
recognition : An extensive study", CIS'10, IEEE International Conference on
Cybernetics and Intelligent Systems, June 28-30, 2010, Singapore , pp 90-95
 Marco Paleari, Vivek Singh, Benoit Huet, Ramesh Jain, "Toward
environment-to-environment (E2E) affective sensitive communication
systems", MTDL'09, Proceedings of the 1st ACM International Workshop on
Multimedia Technologies for Distance Learning at ACM Multimedia, October
23rd, 2009, Beijing, China , pp 19-26
 Benoit Huet, Jinhui Tang, Alex Hauptmann, "ACM SIGMM the first workshop
on web-scale multimedia corpus", MM'09: Proceedings of the seventeen ACM
international conference on Multimedia, October 19-24, 2009, Beijing, China,
pp 1163-1164
 Marco Paleari, Carmelo Velardo, Benoit Huet, Jean-Luc Dugelay, "Face
dynamics for biometric people recognition" MMSP'09, IEEE International
Workshop on Multimedia Signal Processing, October 5-7, 2009, Rio de
Janeiro, Brazil
 Rachid Benmokhtar and Benoit Huet, "Hierarchical ontology-based robust
video shots indexing using global MPEG-7 visual descriptors", CBMI 2009,
7th International Workshop on Content-Based Multimedia Indexing, June 3-5, 2009, Chania, Crete Island, Greece
 Rachid Benmokhtar and Benoit Huet, "Ontological reranking approach for
hybrid concept similarity-based video shots indexing", WIAMIS 2009, 10th
International Workshop on Image Analysis for Multimedia Interactive
Services, May 6-8, 2009, London, UK
 Marco Paleari, Rachid Benmokhtar and Benoit Huet, "Evidence theory based
multimodal emotion recognition", MMM 2009, 15th International MultiMedia
Modeling Conference, January 7-9, 2009, Sophia Antipolis, France, pp 435-446
 Thanos Athanasiadis, Nikolaos Simou, Georgios Th. Papadopoulos, Rachid
Benmokhtar, Krishna Chandramouli, Vassilis Tzouvaras, Vasileios Mezaris,
Marios Phiniketos, Yannis Avrithis, Yiannis Kompatsiaris, Benoit Huet, Ebroul
Izquierdo, "Integrating image segmentation and classification for fuzzy
knowledge-based multimedia indexing", MMM 2009, 15th International
MultiMedia Modeling Conference, January 7-9, 2009, Sophia Antipolis,
France


 Rachid Benmokhtar, Eric Galmar and Benoit Huet, "K-Space at TRECVid
2008", TRECVid'08, 12th International Workshop on Video Retrieval
Evaluation, November 17-18, 2008, Gaithersburg, USA
 Rachid Benmokhtar and Benoit Huet, "Perplexity-based evidential neural
network classifier fusion using MPEG-7 low-level visual features", MIR 2008,
ACM International Conference on Multimedia Information Retrieval 2008,
October 27- November 01, 2008, Vancouver, BC, Canada , pp 336-341
 L. Goldmann, T. Adamek, P. Vajda, M. Karaman, R. Mörzinger, E. Galmar,
T. Sikora, N. O'Connor, T. Ha-Minh, T. Ebrahimi, P. Schallauer, B. Huet,
"Towards fully automatic image segmentation evaluation" ACIVS 2008,
Advanced Concepts for Intelligent Vision Systems, October 20-24, 2008,
Juan-les-Pins, France
 Eric Galmar and Benoit Huet, "Spatiotemporal modeling and matching of
video shots", 1st ICIP Workshop on Multimedia Information Retrieval : New
Trends and Challenges, October 12-15, 2008, San Diego, California, USA ,
pp 5-8
 Marco Paleari, Benoit Huet, Antony Schutz and Dirk T. M. A. Slock, "A
multimodal approach to music transcription", 1st ICIP Workshop on
Multimedia Information Retrieval : New Trends and Challenges, October 12-15, 2008, San Diego, USA, pp 93-96
 Eric Galmar, Thanos Athanasiadis, Benoit Huet, Yannis Avrithis,
"Spatiotemporal semantic video segmentation" MMSP 2008, 10th IEEE
International Workshop on MultiMedia Signal Processing, October 8-10,
2008, Cairns, Queensland, Australia , pp 574-579
 Stephane Turlier, Benoit Huet, Thomas Helbig, Hans-Jorg Vogel,
"Aggregation and personalization of infotainment, an architecture illustrated
with a collaborative scenario" 8th International Conference on Knowledge
Management and Knowledge Technologies, September 4th, 2008, Graz,
Austria
 Marco Paleari, Benoit Huet, Antony Schutz and Dirk T. M. A. Slock, "Audiovisual guitar transcription", Jamboree 2008 : Workshop By and For KSpace
PhD Students, July, 25 2008, Paris, France
 Rachid Benmokhtar, Benoit Huet and Sid-Ahmed Berrani, "Low level feature
fusion models for soccer scene classification", 2008 IEEE International
Conference on Multimedia & Expo, June 23-26, 2008, Hannover, Germany
 Marco Paleari, Benoit Huet, "Toward emotion indexing of multimedia
excerpts" CBMI 2008, 6th International Workshop on Content Based
Multimedia Indexing, June, 18-20th 2008, London, UK [Best student paper
award]
 Marco Paleari, Benoit Huet, Brian Duffy, "SAMMI, Semantic affect enhanced
multimedia indexing", SAMT 2007, 2nd International Conference on
Semantic and Digital Media Technologies, 5-7 December 2007, Genoa, Italy
 Rachid Benmokhtar, Eric Galmar and Benoit Huet, "Eurecom at TRECVid
2007: Extraction of high level features", TRECVid'07, 11th International
Workshop on Video Retrieval Evaluation, November 2007, Gaithersburg,
USA
 Rachid Benmokhtar, Eric Galmar and Benoit Huet, "K-Space at TRECVid
2007", TRECVid'07, 11th International Workshop on Video Retrieval
Evaluation, November 2007, Gaithersburg, USA
 Marco Paleari, Brian Duffy and Benoit Huet, "ALICIA, an architecture
for intelligent affective agents", IVA 2007 7th International Conference on
Intelligent Virtual Agents, 17th - 19th September 2007 Paris, France | Also
published in LNAI Volume 4722 , pp 397-398
 Marco Paleari, Brian Duffy and Benoit Huet, "Using emotions to tag media",
Jamboree 2007: Workshop By and For KSpace PhD Students, September,
15th 2007, Berlin, Germany
 Eric Galmar and Benoit Huet, "Analysis of vector space model and
spatiotemporal segmentation for video indexing and retrieval", CIVR 2007,
ACM International Conference on Image and Video Retrieval, July 9-11
2007, Amsterdam, The Netherlands
 Rachid Benmokhtar, Benoit Huet, Sid-Ahmed Berrani, Patrick Lechat, "Video
shots key-frames indexing and retrieval through pattern analysis and fusion
techniques", FUSION'07, 10th International Conference on Information
Fusion, July 9-12 2007, Quebec, Canada
 Rachid Benmokhtar and Benoit Huet, "Multi-level fusion for semantic
indexing video content", AMR'07, International Workshop on Adaptive
Multimedia Retrieval, June 5-6 2007, Paris, France
 Rachid Benmokhtar and Benoit Huet, "Performance analysis of multiple
classifier fusion for semantic video content indexing and retrieval", MMM'07,
International MultiMedia Modeling Conference, January 9-12 2007, Singapore
- Also published as LNCS Volume 4351, pp 517-526
 Rachid Benmokhtar and Benoit Huet, "Neural network combining classifier
based on Dempster-Shafer theory for semantic indexing in video content",
MMM'07, International MultiMedia Modeling Conference, January 9-12 2007,
Singapore - Also published as LNCS Volume 4351 , pp 196-205


 Rachid Benmokhtar, Emilie Dumont, Bernard Merialdo and Benoit Huet,
"Eurecom in TrecVid 2006: high level features extractions and rushes study",
TrecVid 2006, 10th International Workshop on Video Retrieval Evaluation,
November 2006, Gaithersburg, USA
 Peter Wilkins, Tomasz Adamek, Paul Ferguson, Mark Hughes, Gareth J F
Jones, Gordon Keenan, Kevin McGuinness, Jovanka Malobabic, Noel E.
O'Connor, David Sadlier, Alan F. Smeaton, Rachid Benmokhtar, Emilie
Dumont, Benoit Huet, Bernard Merialdo, Evaggelos Spyrou, George
Koumoulos, Yannis Avrithis, R. Moerzinger, P. Schallauer, W. Bailer, Qianni
Zhang, Tomas Piatrik, Krishna Chandramouli, Ebroul Izquierdo, Lutz
Goldmann, Martin Haller, Thomas Sikora, Pavel Praks, Jana Urban, Xavier
Hilaire and Joemon M. Jose, "K-Space at TRECVid 2006", TrecVid 2006,
10th International Workshop on Video Retrieval Evaluation, November 2006,
Gaithersburg, USA
 Isao Echizen, Stephan Singh, Takaaki Yamada, Koichi Tanimoto, Satoru
Tezuka and Benoit Huet, "Integrity verification system for video content by
using digital watermarking", ICSSSM'06, IEEE International Conference on
Services Systems and Services Management, 25-27 October 2006, Troyes,
France
 Eric Galmar and Benoit Huet, "Graph-based spatio-temporal region
extraction", ICIAR 2006, 3rd International Conference on Image Analysis and
Recognition, September 18-20, 2006, Povoa de Varzim, Portugal | Also
published as Lecture Notes in Computer Science (LNCS) Volume 4141 , pp
236-247
 Rachid Benmokhtar and Benoit Huet, "Classifier fusion : combination
methods for semantic indexing in video content", ICANN 2006, International
Conference on Artificial Neural Networks, 10-14 September 2006, Athens,
Greece - also published as LNCS Volume 4132 , pp 65-74
 Bernard Merialdo, Joakim Jiten, Eric Galmar and Benoit Huet, "A new
approach to probabilistic image modeling with multidimensional hidden
Markov models", AMR 2006, 4th International Workshop on Adaptive
Multimedia Retrieval , 27-28 July 2006, Geneva, Switzerland |Also published
as LNCS Volume 4398
 Fabrice Souvannavong and Benoit Huet, "Continuous behaviour knowledge
space for semantic indexing of video content", Fusion 2006, 9th International
Conference on Information Fusion, 10-13 July 2006, Florence Italy
 Benoit Huet and Bernard Merialdo, "Automatic video summarization",
Chapter in "Interactive Video, Algorithms and Technologies" by Hammoud,
Riad (Ed.), 2006, XVI, 250 p, ISBN: 3-540-33214-6, pp 27-41
 Joakim Jiten, Bernard Merialdo and Benoit Huet, "Multi-dimensional
dependency-tree hidden Markov models", ICASSP 2006, 31st IEEE
International Conference on Acoustics, Speech, and Signal Processing, May
14-19, 2006, Toulouse, France
 Joakim Jiten, Benoit Huet and Bernard Merialdo, "Semantic feature
extraction with multidimensional hidden Markov model", SPIE Conference on
Multimedia Content Analysis, Management and Retrieval 2006, January 17-19, 2006 - San Jose, USA - SPIE proceedings Volume 6073,
pp 211-221
 Joakim Jiten, Fabrice Souvannavong, Bernard Merialdo and Benoit Huet,
"Eurecom at TRECVid 2005: extraction of high-level features", TRECVid
2005, TREC Video Retrieval Evaluation, November 14, 2005, USA
 Benoit Huet, Joakim Jiten, Bernard Merialdo, "Personalization of hyperlinked
video in interactive television", IEEE International Conference on Multimedia
& Expo July 6-8, 2005, Amsterdam, The Netherlands.
 B. Cardoso, F. de Carvalho, L. Carvalho, G. Fernandez, P. Gouveia, B. Huet,
J. Jiten, A. Lopez, B. Merialdo, A. Navarro, H. Neuschmied, M. Noe, R.
Salgado, G. Thallinger, "Hyperlinked video with moving object in digital
television", IEEE International Conference on Multimedia & Expo, July 6-8,
2005, Amsterdam, The Netherlands.
 F. Souvannavong, B. Merialdo and B. Huet, "Region-based video content
indexing and retrieval", Fourth International Workshop on Content-Based
Multimedia Indexing (CBMI'05), June 21-23, 2005 Riga, Latvia.
 F. Souvannavong, B. Merialdo and B. Huet, "Multi-modal classifier fusion for
video shot content", 6th International Workshop on Image Analysis for
Multimedia Interactive Services (WIAMIS'05), Montreux, Switzerland, April
2005.
 Fabrice Souvannavong, L. Hohl, B. Merialdo and B. Huet, "Enhancing latent
Semantic Analysis Video Object Retrieval with Structural Information", IEEE
International Conference on Image Processing, October 24-27, 2004
Singapore.
 Fabrice Souvannavong, B. Merialdo and B. Huet, "Latent Semantic Analysis
For An Effective Region Based Video Shot Retrieval System", 6th ACM
SIGMM International Workshop on Multimedia Information Retrieval, held in
conjunction with ACM Multimedia 2004, October 15-16, 2004, New York, NY
USA.
 Fabrice Souvannavong, B. Merialdo and B. Huet, "Eurecom at Video-TREC
2004: Feature Extraction Task ", NIST Special Publication, The 13th Text
Retrieval Conference (TREC 2004 Video Track).


 Bernardo Cardoso and Fausto de Carvalho and Gabriel Fernandez and
Benoit Huet and Joakim Jiten and Alejandro Lopez and Bernard Merialdo
and Helmut Neuschmied and Miquel Noe and David Serras Pereira and
Georg Thallinger. "Personalization of Interactive Objects in the GMF4iTV
project ". Proceedings of TV'04: the 4th Workshop on Personalization in
Future TV held in conjunction with Adaptive Hypermedia 2004, Eindhoven,
The Netherlands, August 23, 2004.
 Fabrice Souvannavong, L. Hohl, B. Merialdo and B. Huet, "Using Structure
for Video Object Retrieval", International Conference on Image and Video
Retrieval, July 21-23, 2004, Dublin City University, Ireland .
 Fabrice Souvannavong, B. Merialdo and B. Huet, "Improved Video Content
Indexing By Multiple Latent Semantic Analysis", International Conference on
Image and Video Retrieval, July 21-23, 2004, Dublin City University, Ireland .
 Fabrice Souvannavong, B. Merialdo and B. Huet, "Latent Semantic Indexing
For Semantic Content Detection Of Video Shots", IEEE International
Conference on Multimedia and Expo (ICME'2004), June 27th - 30th, 2004,
Taipei, Taiwan.
 Fabrice Souvannavong, B. Merialdo and B. Huet, "Partition Sampling for
Active Video Database Annotation", 5th International Workshop on Image
Analysis for Multimedia Interactive Services (WIAMIS'04), April 21-23, 2004,
Instituto Superior Tecnico, Lisboa, Portugal.
 Fabrice Souvannavong, B. Merialdo and B. Huet, "Latent Semantic Indexing
for Video Content Modeling and Analysis", NIST Special Publication, The
12th Text Retrieval Conference (TREC 2003 Video Track).
 Fabrice Souvannavong, B. Merialdo and B. Huet, "Video Content
Structuration With Latent Semantic Analysis", Third International Workshop
on Content-Based Multimedia Indexing, CBMI 2003, 22-24 September 2003,
Rennes, France.
 Fabrice Souvannavong, B. Merialdo and B. Huet, "Semantic Feature
Extraction using Mpeg Macro-block Classification", NIST Special Publication:
SP 500-251, The Eleventh Text Retrieval Conference (TREC 2002 Video
Track).
 Gerhard Mekenkamp, Mauro Barbieri, Benoit Huet, Itheri Yahiaoui, Bernard
Merialdo, Riccardo Leonardi and Michael Rose, "Generating TV Summaries
for CE Devices", ACM Multimedia 2002, December 3-5 2002, Juan Les Pins,
France.


 Benoit Huet, Itheri Yahiaoui, Bernard Merialdo, "Image Similarity for
Automatic Video Summarization", EUSIPCO 2002 - 11th European Signal
Processing Conference, September 3-6 2002, Toulouse, France.
 Bernard Merialdo, B. Huet, I. Yahiaoui, Fabrice Souvannavong, "Automatic
Video Summarization", International Tyrrhenian Workshop on Digital
Communications, Advanced Methods for Multimedia Signal Processing,
September 8th - 11th, 2002, Palazzo dei Congressi, Capri, Italy.
 Benoit Huet, G. Guarascio, N. Kern and B. Merialdo, "Relational skeletons
for retrieval in patent drawings", IEEE International Conference Image
Processing (ICIP2001), October 7-10 2001, Thessaloniki, Greece.
 Ithery Yahiaoui, Bernard Merialdo and Benoit Huet, "Automatic Summarization
of Multi-episode Videos with the Simulated User Principle", Workshop on
MultiMedia Signal Processing (MMSP'01), October 3-5, 2001, Cannes,
France.
 Itheri Yahiaoui, Bernard Merialdo and Benoit Huet, "Optimal video
summaries for simulated evaluation", European Workshop on Content-Based
Multimedia Indexing, September 19-21, 2001 Brescia, Italy.
 Itheri Yahiaoui, Bernard Merialdo and Benoit Huet, "Automatic Video
Summarization", MMCBIR 2001 - Indexation et Recherche par le
Contenu dans les Documents Multimedia, 24-25 September 2001, INRIA Rocquencourt, France.
 Ithery Yahiaoui, Bernard Merialdo and Benoit Huet, "Generating Summaries of
Multi-Episodes Video", International Conference on Multimedia & Expo
(ICME2001), August 22-25, 2001 Tokyo, Japan.
 Itheri Yahiaoui, Bernard Merialdo and Benoit Huet, "Automatic construction
of multi-video summaries", ISKO: Filtrage et resume automatique de
l'information sur les reseaux, July 5-6 2001, Nanterre, France.
 Benoit Huet, Ithery Yahiaoui and Bernard Merialdo, "Multi-Episodes Video
Summaries", International Conference on Media Futures 2001, 8-9 May
2001, Florence, Italy.
 Arnd Kohrs, Benoit Huet and Bernard Merialdo, "Multimedia Information
Recommendation and Filtering on the Web", Networking 2000, May 14 - 19,
2000, Paris, France.
 Merialdo B., S. Marchand-Maillet and B. Huet, "Approximate Viterbi decoding
for 2D-Hidden Markov Models", IEEE International Conference on Acoustics,
Speech and Signal Processing (ICASSP2000), Istanbul Turkey, June 5-9
2000.


 Huet B. and E. R. Hancock, "Sensitivity Analysis for Object Recognition from
Large Structural Libraries", IEEE International Conference on Computer
Vision (ICCV99), Kerkyra, Greece, September 20-27, 1999.
 Huet B. and E. R. Hancock, "Inexact Graph Retrieval", IEEE CVPR99
Workshop on Content-based Access of Image and Video Libraries (CBAIVL99), Fort Collins, Colorado USA, June 22, 1999.
 Huet B., A.D.J. Cross and E.R. Hancock, "Shape Retrieval by Inexact Graph
Matching", IEEE International Conference on Multimedia Computing and
Systems (ICMCS'99), Florence, Italy, page 772-776, 7-11 June 1999.
 Huet B. and E.R. Hancock, "Structural Sensitivity for Large-Scale Line-Pattern Recognition", Third International Conference on Visual Information
Systems (VISUAL99), page 711-718, 2-4 June, 1999, The Netherlands.
 Huet B., A.D.J. Cross and E.R. Hancock, "Graph Matching for Shape
Retrieval", Advances in Neural Information Processing Systems 11, Edited
by M.J. Kearns, S.A. Solla and D.A. Cohn, MIT Press, June 1999.
 Worthington P., B. Huet and E.R. Hancock, "Appearance-Based Object
Recognition Using Shape-From-Shading", Proceeding of the 14th
International Conference on Pattern Recognition (ICPR'98), Brisbane
(Australia), page 412-416, 16-20 August 1998.
 Huet B. and E.R. Hancock, "Relational Histograms for Shape Indexing",
IEEE International Conference on Computer Vision (ICCV98), Mumbai India,
page 563-569, Jan 1998.
 Huet B. and E.R. Hancock, "Fuzzy Relational Distance for Large-scale
Object Recognition", IEEE Conference on Computer Vision and Pattern
Recognition (CVPR'98), Santa Barbara California USA, page 138-143, 1998.
 Huet B. and E.R. Hancock, "Pairwise Representation for Image Database
Indexing", Sixth International Conference on Image Processing and its
Applications (IPA97), Dublin (Ireland), 15-17 July 1997.
 Huet B. and E.R. Hancock, "Cartographic Indexing into a Database of
Remotely Sensed Images", Third IEEE Workshop on Applications of
Computer Vision (WACV96), Sarasota Florida (USA), page 8-14, 1996.
 Huet B. and E.R. Hancock, "Structural Indexing of infra-red images using
Statistical Histogram Comparison", Third International Workshop on Image
and Signal Processing (IWISP'96), Manchester (UK), 4-7 Nov 1996.
 Charlton P. and Huet B., "Intelligent Agents for Image Retrieval", Research
and Technology Advances in Digital Libraries, Virginia (USA), May 1995.





 Charlton P. and Huet B., "Using Multiple Agents For Content-Based Image
Retrieval", European Research Seminar on Advances in Distributed
Systems, L'Alpe D'Huez (France), April 1995 .

 National Conferences and Workshops
 E. Galmar and B. Huet, "Methode de segmentation par graphe pour le suivi
de regions spatio-temporelles". CORESA 2005, 10emes journees
Compression et representation des signaux audiovisuels, 7-8 November
2005, Rennes, France.
 Fabrice Souvannavong, B. Merialdo and B. Huet, "Classification Semantique
des Macro-Blocs Mpeg dans le Domaine Compresse.", CORESA 2003, 16-17 January 2003, Lyon, France.
 Itheri Yahiaoui, Bernard Merialdo, Benoit Huet, "User Evaluation of Multi-Episode Video Summaries", Indexation de documents et Recherche
d'informations, GDR I3 et ISIS, July 9 2002, Grenoble, France.
 Itheri Yahiaoui, Bernard Merialdo, Benoit Huet, "Construction et Evaluation
automatique de resumes multi-videos", Analyse et Indexation Multimedia,
June 20 2002, Universite Bordeaux 1, France.
 I. Yahiaoui, B. Merialdo et B. Huet, "Construction automatique de resumes
multi-videos", CORESA 2001, Nov 2001, Universite de Dijon, France.
 I. Yahiaoui, B. Merialdo et B. Huet, "Resumes automatiques de sequences
video", CORESA2000, 19-20 October 2000, Universite de Poitiers,
Futuroscope, France.
 Worthington P., B. Huet and E.R. Hancock, "Increased Extent of
Characteristic Views using Shape-from-Shading for Object Recognition",
Proceeding of the British Machine Vision Conference (BMVC'98),
Southampton (UK), page 710-719, 7-10 Sept 1998.
 Huet B. and E.R. Hancock, "Structurally Gated Pairwise Geometric
Histograms for Shape Indexing", Proceeding of the British Machine Vision
Conference (BMVC97), Colchester (UK), page 120-129, 8-11 Sept 1997.
 Huet B. and E.R. Hancock, "A Statistical Approach to Hierarchical Shape
Indexing", Intelligent Image Databases (IEE and BMVA), London (UK), May
1996.



 Thesis:
 Huet B., "Multimedia Content Understanding: Bringing Context to Content",
HDR, University Nice Sophia-Antipolis, France, Oct 2012.
 Huet B., "Object Recognition from Large Libraries of Line-Patterns", PhD
Thesis, University of York, May 1999.
