A weekend software hack called “Movie Hack Attack”.
Video content is played and analyzed in realtime for sentiment, emotions, and more.
Sentiment is shown in a chart below the video; emotions and objects of attention appear to the left, and persons/locations/organizations to the right, each illustrated with a random picture grabbed from Google image search.
2. Aim of the Hack
(As it’s really only a small hack, the real aim is slightly less grand:
playing with Python as a REST API/server and making some nice stuff in a
couple of days… and not spending time on that wasteful activity called “sleep”)
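The “Python as a REST API/server” part can be sketched with nothing but the standard library. The hack’s actual framework, routes, and payload shapes aren’t documented here, so everything below (the `/analyze` endpoint, the JSON fields) is made up for illustration:

```python
# Minimal sketch of a subtitle-analysis REST endpoint using only the
# Python standard library. Route and payload shape are illustrative,
# not the hack's actual API.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def analyze(text):
    # The real analysis (sentiment, NER, emotions) would plug in here;
    # this sketch just tokenizes on whitespace.
    return {"text": text, "tokens": text.split()}

class AnalyzeHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(analyze(payload.get("text", ""))).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# To serve: HTTPServer(("localhost", 8000), AnalyzeHandler).serve_forever()
```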
❖ Provide extra content information to users while watching a movie or TV show:
❖ Moods and sentiment of a movie
❖ Persons and places mentioned or featured in a movie
3. and now… The Hack!
Persons/Locations/Organizations
Objects/Expressions

In this case the phrase “mandarin of television” is recognized as an Expression. The words “Mandarin” and “Television” are recognized individually, and matching pictures are searched for and shown.

In this case the name “Howard Beale” is recognized as a “Person”.
4. Network (1976) [Movie]
Sentiment
Current positive or negative sentiment of each sentence, plotted over time.
(You can see it needs more normalization for long video items.)
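One simple way to normalize the over-time chart for long items is to smooth the per-sentence scores with a trailing moving average. This is a sketch of one possible approach, not necessarily what the hack does:

```python
# Sketch: smoothing per-sentence sentiment with a trailing moving
# average, one possible way to normalize the over-time chart.
def moving_average(scores, window=3):
    smoothed = []
    for i in range(len(scores)):
        chunk = scores[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

print(moving_average([1.0, -1.0, 1.0, 1.0], window=2))  # → [1.0, 0.0, 0.0, 1.0]
```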
5. Obama 2012 victory speech [Talk]
Persons/Locations/Organizations
In this case the word “America” is recognized as the Location “USA”.

Emotions
In this case the word “Fall” is recognized as the emotion “Triumph” (because of the context of the sentence).

The pictures are random matches on the specific concept, grabbed from Google Images/Flickr in realtime.
6. Language Tech
(Don’t worry, only 2 slides about tech… but c’mon, it’s a hackathon, isn’t it?)
❖ Analysis of subtitles is done through:
❖ Language pre- and post-processing (tokenize, remove stopwords and punctuation, etc.) [nltk]
❖ Part-of-Speech (POS) tagging for identifying the grammar of a sentence [nltk POS tagger + Brown corpus]
❖ Named Entity Recognition (NER) for identifying “objects” of interest: persons, locations, organizations [Stanford NER tagger]
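The preprocessing step above can be sketched in plain Python. The hack uses nltk (which needs corpus downloads), so this version only mirrors the same steps with an illustrative mini stopword list; the nltk equivalents are noted in comments:

```python
# Sketch of the preprocessing step (tokenize, drop stopwords and
# punctuation). The hack uses nltk; this plain-Python version mirrors
# the same steps without needing any corpus downloads.
import re

STOPWORDS = {"the", "a", "an", "of", "is", "and", "to", "in"}  # tiny illustrative set

def preprocess(sentence):
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return [t for t in tokens if t not in STOPWORDS]

# With nltk, the equivalent pipeline would be roughly:
#   tokens = nltk.word_tokenize(sentence)
#   tagged = nltk.pos_tag(tokens)       # POS tags
#   tree   = nltk.ne_chunk(tagged)      # PERSON/GPE/ORGANIZATION chunks

print(preprocess("The mandarin of television is angry"))  # → ['mandarin', 'television', 'angry']
```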
7. Language Tech (2)
❖ Analysis of subtitles is done through:
❖ Sentiment extraction through a trained sentiment model, adapted/hacked to be more applicable to movie data [SentiWordNet (a sentiment-annotated lexicon built on Princeton’s WordNet) + hack]
❖ Matching emotions through many different techniques [Princeton’s WordNet annotated synset lexicon, all previous steps, and many… many… hacks]
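Lexicon-based sentiment in the SentiWordNet style can be sketched like this: each word carries a (positive, negative) score pair, and a sentence score is the summed positive minus the summed negative. The tiny lexicon below is made up for illustration; in the real hack the scores come from SentiWordNet’s annotated synsets:

```python
# Sketch of lexicon-based sentence sentiment in the SentiWordNet style.
# The tiny lexicon below is made up; real (pos, neg) scores come from
# SentiWordNet's annotated synsets.
LEXICON = {
    "good":  (0.75, 0.0),
    "happy": (0.80, 0.0),
    "bad":   (0.0, 0.65),
    "angry": (0.0, 0.70),
}

def sentence_sentiment(tokens):
    pos = sum(LEXICON.get(t, (0.0, 0.0))[0] for t in tokens)
    neg = sum(LEXICON.get(t, (0.0, 0.0))[1] for t in tokens)
    return pos - neg  # > 0: positive, < 0: negative

print(sentence_sentiment(["howard", "beale", "is", "angry"]))  # → -0.7
```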
8. Possible use cases
❖ Get sentiment and extracted emotional values from news broadcasts on different channels (e.g. Al Jazeera, CNN, Russia Today) and get a quick indication of their specific viewpoint (or “bias”) on a “news event”;
❖ Filter content by emotional thresholds (today I only want to read “happy” news / items with an overall positive sentiment / emotional values);
❖ Plug movie/video content into the Semantic Web by linking extracted subtitle entities/chunks to their specific ontologies (adding ref header tags to movie information pages);
❖ Enable richer user interaction through adding extra meta information to existing content and user interfaces;
❖ Enable smart semantic (textual) searching of non-textual content through feature extraction by some of the technologies showcased here.
❖ (…)
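The “happy news only” use case reduces to a threshold filter over scored items. A minimal sketch, with made-up titles and scores:

```python
# Sketch of the "happy news only" filter: keep items whose sentiment
# score clears a threshold. Titles and scores are made-up examples.
def filter_by_sentiment(items, threshold=0.0):
    # items: iterable of (title, sentiment_score) pairs
    return [title for title, score in items if score > threshold]

news = [("Local team wins", 0.8), ("Market crashes", -0.6), ("Sunny weekend", 0.4)]
print(filter_by_sentiment(news))  # → ['Local team wins', 'Sunny weekend']
```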