11. What to Collect to Measure
• Type of event (a Zync player command or a normal chat message)
• Anonymous hash (uniquely identifies the sender and the receiver without exposing personal account data)
• URL of the shared video
• Timestamp of the event
• The player time (with respect to the specific video) at the point the event occurred
• The number of characters and the number of words typed (for chat messages)
• Emoticons used in the chat message
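As a concrete reading of this list, here is a minimal sketch of what one logged event might look like; the field names and types are assumptions for illustration, not Zync’s actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ZyncEvent:
    """One logged Zync event (hypothetical field names, not the real schema)."""
    event_type: str                     # e.g. "play", "pause", "scrub", or "chat"
    pair_hash: str                      # anonymous hash identifying sender/receiver
    video_url: str                      # URL of the shared video
    timestamp: float                    # wall-clock time of the event (Unix epoch)
    player_time: float                  # position within the video, in seconds
    char_count: Optional[int] = None    # chat messages only
    word_count: Optional[int] = None    # chat messages only
    emoticons: List[str] = field(default_factory=list)  # emoticons in the message
```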
15. Reciprocity
• In 43.6% of sessions, the invitee played at least one video back to the session’s initiator.
• 77.7% sharing reciprocation.
• Pairs of people often exchanged more than one set of videos in a session.
• In the Nonprofit, Technology, and Shows categories, the invitees shared more videos.
16. How do we know what people are watching?
How can we give them better things to watch?
CLASSIFICATION
18. Five-star ratings have been the golden egg for recommendation systems so far; implicit human cooperative sharing activity works better.
CLASSIFICATION BASED ON IMPLICIT CONNECTED SOCIAL
20. Used and Unused Data
Used:
  YouTube: Duration (video), Views (video), Rating*
  Zync: Duration (session)*, # of Play/Pause*, # of Scrubs*, # of Chats*
Not used:
  YouTube: Tags, Comments, Favorites
  Zync: Emoticons, User ID data, # of Sessions, # of Loads
21. Phone in your favorite ML technique.
FIRST ORDER DATA WASN’T PRETTY
22. Naïve Bayes Classification
Type                      Accuracy
Random Chance             23.0%
YouTube Features          14.6%
YouTube Top 5 Categories  32.4%
Zync Features             53.9%
Humans                    60.9%
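For anyone who wants to phone the technique in themselves, here is a minimal sketch of this kind of setup using scikit-learn’s CategoricalNB over nominal feature codes; the toy rows and labels below are invented, not the study’s data.

```python
# Minimal sketch of Naive Bayes over nominal (categorical) video features.
# Assumes scikit-learn; the toy rows below are invented, not the study data.
import numpy as np
from sklearn.naive_bayes import CategoricalNB

# Each row is one video; columns are nominal feature codes,
# e.g. binned duration, binned view count, binned rating.
X = np.array([[0, 2, 1], [1, 0, 2], [2, 1, 0], [0, 1, 1],
              [1, 2, 0], [2, 0, 2], [0, 0, 0], [1, 1, 2]])
y = np.array(["comedy", "music", "news", "comedy",
              "music", "news", "comedy", "music"])

clf = CategoricalNB(min_categories=3)  # every feature has 3 possible codes
clf.fit(X, y)
print(clf.predict(np.array([[0, 2, 2], [2, 0, 1]])))  # predicted genre labels
```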
23. What about these three videos? Which one do you like?
Nominal Factorization
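As I read it, nominal factorization here means quantizing a continuous feature (duration, views, rating) into a small set of nominal bins so the classifier sees categories instead of raw numbers. A rough sketch, with invented bin edges:

```python
# Rough sketch of nominal factorization: quantize continuous features into
# nominal bin codes. The bin edges below are invented for illustration.
import numpy as np

def factor(values, edges):
    """Map continuous values to nominal bin codes 0..len(edges)."""
    return np.digitize(values, edges)

durations = np.array([45.0, 180.0, 600.0, 3600.0])  # seconds
codes = factor(durations, edges=[60, 300, 1200])
print(codes)  # [0 1 2 3] -- roughly "short", "medium", "long", "epic"
```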
25. Classification with Factoring
Type                             Accuracy
Random Chance                    23.0%
YouTube Features                 14.6%
YouTube Top 5 Categories         32.4%
YT Top 5 Factoring Duration      51.8%
Humans                           60.9%
YT Top 5 Factoring Views         66.9%
YT Top 5 Factoring Ratings       75.5%
YT Top 5 Factoring All Features  75.9%
psst, yes we know that more training will do the same thing eventually, I just don’t like waiting.
26. Classification w/ Zync Features
Type                             Accuracy
Random Chance                    23.0%
YouTube Features                 14.6%
YouTube Top 5 Categories         32.4%
YT Top 5 Factoring Duration      51.8%
Humans                           60.9%
YT Top 5 Factoring Views         66.9%
YT Top 5 Factoring Ratings       75.5%
YT Top 5 Factoring All Features  75.9%
Zync Factored All Features       87.8%
psst, yes we know that more training will do the same thing eventually, I just don’t like waiting.
32. Sept 26, 2008 18:23 EST
RT: @jowyang If you are watching the debate you’re
invited to participate in #tweetdebate Here is the 411
http://tinyurl.com/3jdy67
INDIRECT ANNOTATION
33. Repeated (retweet) content starts with RT.
Address other users with an @.
Rich media embeds via links.
Tags start with #.
RT: @jowyang If you are watching the debate you’re
invited to participate in #tweetdebate Here is the 411
http://tinyurl.com/3jdy67
ANATOMY OF A TWEET
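These parts are easy to pull out mechanically. A small sketch using regular expressions; the patterns are simplified and will miss edge cases:

```python
# Small sketch of decomposing a tweet into its anatomical parts.
# The regexes are simplified and will miss edge cases.
import re

TWEET = ("RT: @jowyang If you are watching the debate you're invited to "
         "participate in #tweetdebate Here is the 411 http://tinyurl.com/3jdy67")

is_retweet = TWEET.startswith("RT")
mentions   = re.findall(r"@(\w+)", TWEET)        # addressed users
hashtags   = re.findall(r"#(\w+)", TWEET)        # tags
links      = re.findall(r"https?://\S+", TWEET)  # rich-media embeds

print(is_retweet, mentions, hashtags, links)
# True ['jowyang'] ['tweetdebate'] ['http://tinyurl.com/3jdy67']
```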
34. Tweet Crawl circa 2008
• Three hashtags: #current #debate08 #tweetdebate
• 97 mins debate + 53 mins following = 2.5 hours total.
• 3,238 tweets from 1,160 people.
– 1,824 tweets from 647 people during the debate.
– 1,414 tweets from 738 people post debate.
• 577 @ mentions (reciprocity!)
– 266 mentions during the debate
– 311 afterwards.
• Low RT: 24 retweets in total
– 6 during
– 18 afterwards.
35. Volume of Tweets by Minute
Crawled from the Twitter RESTful search API.
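A tiny sketch of the per-minute counting behind a plot like this; the tweet record shape (a dict with an ISO-8601 "created_at" string) is an assumption:

```python
# Tiny sketch of computing tweet volume by minute.
# Assumes each tweet is a dict with an ISO-8601 "created_at" field.
from collections import Counter
from datetime import datetime

def volume_by_minute(tweets):
    counts = Counter()
    for t in tweets:
        dt = datetime.fromisoformat(t["created_at"])
        counts[dt.replace(second=0, microsecond=0)] += 1
    return counts

print(volume_by_minute([{"created_at": "2008-09-26T21:03:15"},
                        {"created_at": "2008-09-26T21:03:42"}]))
```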
36. Tweets During and After the Debates
Conversation swells after the debate.
42. Automatic Segment Detection
We use Newton’s Method to find extrema outside μ±σ as candidate markers. Any marker that immediately follows a marker in the previous minute is ignored.
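A discrete sketch of that heuristic, using finite differences in place of a literal Newton’s-method extrema search; the toy volumes are invented:

```python
# Discrete sketch of the marker heuristic: flag minutes whose volume is a
# local extremum lying outside mean +/- std, then drop any marker that
# immediately follows another. Finite differences stand in for Newton's method.
import numpy as np

def candidate_markers(volume):
    v = np.asarray(volume, dtype=float)
    mu, sigma = v.mean(), v.std()
    markers = []
    for t in range(1, len(v) - 1):
        extremum = (v[t] - v[t - 1]) * (v[t + 1] - v[t]) < 0  # slope sign change
        outlier = abs(v[t] - mu) > sigma                      # outside mu +/- sigma
        if extremum and outlier and (not markers or markers[-1] != t - 1):
            markers.append(t)
    return markers

print(candidate_markers([5, 6, 30, 7, 6, 5, 28, 6, 5]))  # -> [2, 6]
```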
48. Sentiment/affect judgements from the debate. [1]
[1] Diakopoulos, N. A., and Shamma, D. A. Characterizing debate performance via aggregated Twitter sentiment. In CHI ’10: Proceedings of the 28th International Conference on Human Factors in Computing Systems (New York, NY, USA, 2010), ACM, pp. 1195–1198.
49. Sentiment/affect judgements by candidate. [1]
51. Tweet Stream circa 2009
• Data Mining Feed
• 600 Tweets per minute
• 90 Minutes
• 54,000 Tweets from 1.5 hours
• Constant data rate means the volume method doesn’t work.
56. Terms as Topic Points
[Figure: term timeline; the marked slice shows no significant occurrence of “remaking”.]
Using a TF/IDF window of 5 mins, find terms that are only relevant to that slice; subtract out salient, non-stoplisted terms like Obama, president, and speech.
57. Terms as Sustained Interest
[Figure callouts: a peak occurrence of “remaking”; a slice containing an occurrence of “remaking” less significant than the peak; a slice with no significant occurrence of “remaking”.]
Using a TF/IDF window of 5 mins, find terms that are only relevant to that slice; subtract out salient, non-stoplisted terms like Obama, president, and speech.
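A loose sketch of that windowed TF/IDF step with scikit-learn, treating each 5-minute slice as one document; the example windows and the background term list are invented:

```python
# Loose sketch of windowed TF/IDF: each 5-minute slice of tweets is one
# "document", so slice-specific terms score high; globally salient terms
# are then subtracted out. Windows and background list are invented.
from sklearn.feature_extraction.text import TfidfVectorizer

windows = [
    "obama takes the oath roberts flubs the oath",     # minutes 0-5
    "obama speech remaking america remaking economy",  # minutes 5-10
    "crowd cheers president waves to the crowd",       # minutes 10-15
]
background = {"obama", "president", "speech"}  # salient non-stoplisted terms

vec = TfidfVectorizer(stop_words="english")
tfidf = vec.fit_transform(windows).toarray()
terms = vec.get_feature_names_out()

for i, row in enumerate(tfidf):
    ranked = sorted(zip(row, terms), reverse=True)
    print(f"window {i}:", [t for score, t in ranked
                           if score > 0 and t not in background][:3])
```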
58. Sustained Interest & Background Whispers.
Some topics continue over time with a higher conversational context.
59. Sustained Interest & Background Whispers.
These terms are not salient by any standard term/document model.
60. People Announce
(12:05) Bastille71: OMG - Obama just messed
up the oath - AWESOME! he’s human!
(12:07) ryantherobot: LOL Obama messed up
his inaugural oath twice! regardless, Obama is
the president today! whoooo!
(12:46) mattycus: RT @deelah: it wasn’t Obama
that messed the oath, it was Chief Justice
Roberts: http://is.gd/gAVo
(12:53) dawngoldberg: @therichbrooks He
flubbed the oath because Chief Justice
61. People Reply
(12:05) Bastille71: OMG - Obama just messed
up the oath - AWESOME! he’s human!
(12:07) ryantherobot: LOL Obama messed up
his inaugural oath twice! regardless, Obama is
the president today! whoooo!
(12:46) mattycus: RT @deelah: it wasn’t Obama
that messed the oath, it was Chief Justice
Roberts: http://is.gd/gAVo
(12:53) dawngoldberg: @therichbrooks He
flubbed the oath because Chief Justice
69. You may ask yourself…
“Earlier you said synchronous; what’s that all about?”
70. Understanding Engagement
Better recommendations.
Better understanding of the relationship between users and the sharing/consumption of media content.
Better organization and classification of media for efficient navigation and content retrieval.
Better advertising.
71. Remember these Verbs?
Which one was added?
OPTIONS GET HEAD POST
CONNECT TRACE DELETE PUT
PATCH MONITOR
72. More and more sync…
Tweeting Together
• External Associated Media
• HTTP Long Poll
Watching Together
• Embedded & Manipulated Media
• Connected Sharing Session
Creating Together
• Appropriated Collaborative Media
• Connected Action & Motion
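On the “HTTP Long Poll” point above: a bare-bones sketch of a long-poll client loop; the endpoint URL and the JSON shape (an "events" list plus a "cursor") are hypothetical, and it assumes the requests package.

```python
# Bare-bones long-poll loop: hold a GET open until the server has an event,
# then immediately re-issue the request. Endpoint and payload shape are
# hypothetical.
import requests

def long_poll(url):
    cursor = None
    while True:
        try:
            resp = requests.get(url, params={"cursor": cursor}, timeout=35)
        except requests.exceptions.Timeout:
            continue  # server held the connection too long; just ask again
        if resp.ok:
            payload = resp.json()
            cursor = payload.get("cursor")
            for event in payload.get("events", []):
                yield event

# for event in long_poll("https://example.com/sync/events"):  # hypothetical URL
#     print(event)
```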
77. Dancers & People Acting Together
“They thought we were controlling the images, once they learned that they were controlling it was interesting to see their delight in that and how it brought them to a new place of play with the phones and then they got a little bit more engaged and excited.” (D3)
78. Body Moving, Body Moving
“I want to just start moving my
body so much even though I
know it doesn’t make a
difference.” (A3)
79. Recapitulation
Human conversation is human conversation… start there before you go all “Big Data”.
Connected action provides better signals. (Implicit synchronous sharing activity is richer than asynchronous annotations.)
Metrics and instrumentation should account for social interactions & engagement (which may be measured by the lack of signal).
Nominal factorization assists classification (enough with the Big Data thing already).
The Verbs are in need of updating. How we build the next generation of tools and appliances shouldn’t be limited by them.
80. +
Thanks, round one…
To my fellow artists (Renata & Jürgen). Also to our amazing dancers Christy Funsch, Nol Simonse, and Erin Mei‐Ling Stuart; their contribution, advice, and patience during many a rehearsal section.
81. Fin.
Thanks to N. Diakopoulos, E. Churchill, L. Kennedy, J. Yew, S. Pentland, A. Brooks, J. Antin, J. Dunning, Chloe S., Ben C., Marc S., & M. Cameron J.
Human-to-Dancer Interaction: Designing for Embodied Performances in a Participatory Installation. David A. Shamma; Renata Sheppard; Jürgen Scheible. DIS 2010.
Tweet the Debates: Understanding Community Annotation of Uncollected Sources. David A. Shamma; Lyndon Kennedy; Elizabeth F. Churchill. ACM Multimedia, ACM, 2009.
Understanding the Creative Conversation: Modeling to Engagement. David A. Shamma; Dan Perkel; Kurt Luther. Creativity and Cognition, ACM, 2009.
Spinning Online: A Case Study of Internet Broadcasting by DJs. David A. Shamma; Elizabeth Churchill; Nikhil Bobb; Matt Fukuda. Communities & Technologies, ACM, 2009.
Zync with Me: Synchronized Sharing of Video through Instant Messaging. David A. Shamma; Yiming Liu. In Pablo Cesar, David Geerts, & Konstantinos Chorianopoulos (eds.), Social Interactive Television: Immersive Shared Experiences and Perspectives. Information Science Reference, IGI Global, 2009.
Enhancing Online Personal Connections through the Synchronized Sharing of Online Video. Shamma, D. A.; Bastéa-Forte, M.; Joubert, N.; Liu, Y. Human Factors in Computing Systems (CHI), ACM, 2008.
Supporting Creative Acts Beyond Dissemination. David A. Shamma; Ryan Shaw. Creativity and Cognition, ACM, 2007.
Watch What I Watch: Using Community Activity to Understand Content. David A. Shamma; Ryan Shaw; Peter Shafton; Yiming Liu. ACM Multimedia Workshop on Multimedia Information Retrieval (MIR), ACM, 2007.
Zync: The Design of Synchronized Video Sharing. Yiming Liu; David A. Shamma; Peter Shafton; Jeannie Yang. Designing for User eXperiences, ACM, 2007.
Editor's Notes
There are many of us, but this is the work of three.
If you know what these are, good. If not, no problem. Take a note here about the methods of HTTP… there will be a quiz later. A lot of my research begins with feeling restricted by these words.
These verbs have us trapped in 1998… oh, and the anti-Flash silliness doesn’t help.
Transactional. There is MORE to tagging and comments in social media than how we currently think of it as the single browser/site/startup.
These tags and comments are relegated to anchored explicit annotation. This is the problem. Temporally, there is a gap: we cannot leverage these components like we have with photos. Some tags and notes are added as deep annotation, but that’s rare.
Gift giving at its finest.
So we started looking at classification based on two datasets, YouTube and Zync. Each is about 5,000 videos (or sessions).
I come from a strong AI family… so I don’t wanna get too into it…
So we started to think about what the data was saying to us…
Triangulate between the classifier results, the survey results, and the interviews: determine whether the Naïve Bayes classifier or humans are better at deciding whether a video belongs to the “comedy” genre, and determine whether the “ground truth” genre categories provided by the original uploader are reliable.
Many people tweet while they watch TV; many TV shows call for people to follow the Twitter stream.
Not only of the tweet to the video, but the rich data within the tweet.
So the question is how a tweet relates to human conversation… does it map to the same patterns?
More on this later… but for the few of you that haven’t used my tool.
http://www.flickr.com/photos/wvs/3833148925/
This is a three-part talk where I’ll discuss IM, chatrooms, and Twitter.
But I’m not going to talk about SNA today.
Some techniques may be applicable: Wei-Hao Lin and Alexander Hauptmann, identifying a news video’s ideological viewpoint or bias.
Will it scale?
Conversational gasp.
EXPLAIN THIS! The whispers in the background.
Scale! Will this scale?
Blue is Kanye.
What’s MONITOR supposed to look like? Can we start to prototype protocols for synchronous interaction?
“An ambiguous relationship is always the most interesting one.” One painter (A1) liked how the dancers’ effect was slowly revealed, citing he was comfortable with painting by the time he noticed them. Other audience members enjoyed having performance movement in the crowd.
Sandy Pentland says a pause can be the best signal of engagement.
Nol and Christy were Goldie Award winners, one of SF’s most prestigious dance awards.