In the online world, user engagement refers to the quality of the user experience that emphasises the phenomena associated with wanting to use an application longer and more frequently. User engagement is a multifaceted, complex phenomenon; this gives rise to a number of measurement approaches. Common ways to evaluate user engagement include self-report measures, e.g., questionnaires; physiological methods, e.g., cursor and eye tracking; and web analytics, e.g., number of site visits, click depth. These methods represent various trade-offs in terms of the setting (laboratory versus in the wild), object of measurement (user behaviour, affect or cognition) and scale of data collected. This talk will present various efforts aiming at combining approaches to measure engagement. A particular focus will be what these measures, individually and combined, can and cannot tell us about user engagement. The talk will use examples of studies on news sites, social media, and native advertising.
Measuring user engagement: the do, the do not do, and the we do not know
1. Measuring user engagement: the do, the do not do, and the we do not know
Mounia Lalmas
mounia@acm.org
World Usability Day Berlin – November 2014
2. About me
§ Since October 2013: Principal Research Scientist at Yahoo Labs
London
› User engagement, native advertising, social media, search
§ 2011 - 2013: Visiting Principal Scientist at Yahoo Labs Barcelona
› User engagement, social media, search
§ 2008 - 2010: Microsoft Research/RAEng Research Professor at
the University of Glasgow
› Quantum theory to model information retrieval
§ 1999 - 2008: Lecturer (assistant professor) to Professor at Queen
Mary, University of London
› XML retrieval and evaluation (INEX)
Blog: labtomarket.wordpress.com
3. This talk
§ What is user engagement
› Definitions
› Characteristics
› Approaches
§ Attributes of user engagement measurement
› Scalability
› Setting
› Objectivity versus subjectivity
› Temporality
5. What is user engagement?
“User engagement is a quality of the user
experience that emphasizes the phenomena
associated with wanting to use a technological
resource longer and frequently” (Attfield et al, 2011)
self-report: happy, sad,
enjoyment, …
analytics: click, upload,
read, comment, share …
physiology: gaze, body heat,
mouse movement, …
emotional, cognitive and behavioural connection
that exists, at any point in time and over time, between
a user and a technological resource
6. Why is it important to engage users?
§ In today’s wired world, users have enhanced expectations
about their interactions with technology
… resulting in increased competition amongst the
purveyors and designers of interactive systems.
§ In addition to utilitarian factors, such as usability, we must
consider the hedonic and experiential factors of interacting
with technology, such as fun, fulfillment, play, and user
engagement.
(O’Brien, Lalmas & Yom-Tov, 2014)
7. Patterns of user engagement
Online sites differ concerning their engagement!
Games: users spend much time per visit
Search: users come frequently and do not stay long
Social media: users come frequently and stay long
Niche: users come on average once a week, e.g. weekly post
News: users come periodically, e.g. morning and evening
Service: users visit the site when needed, e.g. to renew a subscription
(Lehmann et al, 2012)
8. Why is it important to measure and
interpret user engagement well?
CTR
9. Characteristics of user engagement
• Users must be focused to be engaged
• Distortions in the subjective perception of time used to
measure it
Focused attention
(Webster & Ho, 1997; O’Brien,
2008)
• Emotions experienced by user are intrinsically motivating
• Initial affective “hook” can induce a desire for exploration,
active discovery or participation
Positive Affect
(O’Brien & Toms, 2008)
• Sensory, visual appeal of interface stimulates user &
promotes focused attention
• Linked to design principles (e.g. symmetry, balance,
saliency)
Aesthetics
(Jacques et al, 1995; O’Brien,
2008)
• People remember enjoyable, useful, engaging
experiences and want to repeat them
• Reflected in e.g. the propensity of users to recommend
an experience/a site/a product
Endurability
(Read, MacFarlane, & Casey,
2002; O’Brien, 2008)
10. Characteristics of user engagement
• Novelty, surprise, unfamiliarity and the unexpected
• Appeal to users’ curiosity; encourages inquisitive
behavior and promotes repeated engagement
Novelty
(Webster & Ho, 1997; O’Brien,
2008)
• Richness captures the growth potential of an activity
• Control captures the extent to which a person is able
to achieve this growth potential
Richness and control
(Jacques et al, 1995; Webster &
Ho, 1997)
• Trust is a necessary condition for user engagement
• Implicit contract among people and entities which is
more than technological
Reputation, trust and
expectation (Attfield et al,
2011)
• Difficulties in setting up “laboratory” style
experiments
• Why should users engage?
Motivation, interests,
incentives, and
benefits (Jacques et al.,
1995; O’Brien & Toms, 2008)
11. Attributes of user engagement
§ Scale (large versus small)
§ Setting (laboratory versus field)
§ Objective versus subjective
§ Temporality (short- versus long-term)
One is not better than the other:
it depends on aims and constraints.
12. Measuring user engagement
Measures | Attributes
Self-report (questionnaire, interview, think-aloud and think-after protocols) | Subjective; short- and long-term; lab and field; small scale
Physiology (EEG, SCL, fMRI, eye tracking, mouse tracking) | Objective; short-term; lab and field; small and large scale
Analytics (intra- and inter-session metrics, data science) | Objective; short- and long-term; field; large scale
14. dozen – qualitative & physiology
hundred to thousand – online survey
million – analytics
from rich but limited in generalisation … to powerful but hard to explain
Scalability
15. Large scale measurement – analytics
Metrics
• Dwell time
• Session duration
• Bounce rate
• Play time (video)
• Mouse movement
• Click through rate
(CTR)
• Number of pages
viewed (click depth)
• Conversion rate
• Number of UGC
(comments)
• …
Dwell time as a proxy of user interest
Dwell time as a proxy of relevance
Dwell time as a proxy of conversion
Intra-session measurement
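As a rough illustration of how some of the intra-session metrics above could be derived from a page-view log, here is a minimal sketch. The `PageView` record, its fields and the function name are hypothetical and not from the talk; dwell time is approximated as the gap between consecutive page views in a session, so the last page of each session has no observable dwell time.

```python
from dataclasses import dataclass


@dataclass
class PageView:
    session_id: str   # hypothetical session identifier
    timestamp: float  # seconds since some reference point
    clicked: bool     # whether the user clicked an item on this page


def intra_session_metrics(views):
    """Compute simple intra-session metrics from a list of PageView records."""
    if not views:
        raise ValueError("at least one page view is required")

    # Group page views by session, in time order.
    sessions = {}
    for v in sorted(views, key=lambda v: v.timestamp):
        sessions.setdefault(v.session_id, []).append(v)

    dwell_times = []  # gaps between consecutive page views within a session
    durations = []    # first-to-last page view per session
    bounces = 0       # sessions consisting of a single page view
    for pages in sessions.values():
        durations.append(pages[-1].timestamp - pages[0].timestamp)
        if len(pages) == 1:
            bounces += 1
        for prev, nxt in zip(pages, pages[1:]):
            dwell_times.append(nxt.timestamp - prev.timestamp)

    n_sessions = len(sessions)
    return {
        "mean_dwell_time": sum(dwell_times) / len(dwell_times) if dwell_times else 0.0,
        "mean_session_duration": sum(durations) / n_sessions,
        "bounce_rate": bounces / n_sessions,
        "click_depth": len(views) / n_sessions,  # pages viewed per session
        "ctr": sum(v.clicked for v in views) / len(views),
    }


views = [
    PageView("a", 0, True),
    PageView("a", 30, False),
    PageView("a", 90, False),
    PageView("b", 10, False),
]
metrics = intra_session_metrics(views)
```

Real analytics pipelines define sessions, bounces and clicks in product-specific ways; the definitions here are deliberately the simplest ones.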
16. Small scale measurement – eye tracking
18 users, 16 tasks each
(chose one story & rate it)
eye movement recorded
Attention (gaze)
interest has no role
position > saliency
Selection
mainly driven by interest
position > attention
(Lin et al, 2007)
(Navalpakkam et al, 2012)
17. Small scale measurement – focused
attention questionnaire
5-point scale (strongly disagree to strongly agree)
1. I lost myself in this news tasks experience
2. I was so involved in my news tasks that I lost track of time
3. I blocked things out around me when I was completing the
news tasks
4. When I was performing these news tasks, I lost track of the
world around me
5. The time I spent performing these news tasks just slipped
away
6. I was absorbed in my news tasks
7. During the news tasks experience I let myself go
(O'Brien & Toms, 2010)
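A minimal sketch of how responses to the seven items could be turned into a single score. Averaging the item ratings is an assumption made here for illustration (the slide does not state the scoring rule), and the function name is hypothetical.

```python
def focused_attention_score(responses):
    """Average seven 5-point ratings (1 = strongly disagree, 5 = strongly agree).

    Assumption: the scale score is the mean of the item ratings.
    """
    if len(responses) != 7:
        raise ValueError("expected exactly 7 item ratings")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("ratings must be integers from 1 to 5")
    return sum(responses) / len(responses)


score = focused_attention_score([4, 5, 3, 4, 4, 5, 3])
```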
18. Small scale measurement – PANAS
questionnaire
(10 positive items and 10 negative items)
§ You feel this way right now, that is, at the present moment
[1 = very slightly or not at all; 2 = a little; 3 = moderately;
4 = quite a bit; 5 = extremely]
[randomize items]
distressed, upset, guilty, scared, hostile,
irritable, ashamed, nervous, jittery, afraid
interested, excited, strong, enthusiastic, proud,
alert, inspired, determined, attentive, active
(Watson, Clark & Tellegen, 1988)
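A small sketch of PANAS scoring, assuming the standard approach of summing the ten positive items and the ten negative items separately, which gives two scores between 10 and 50. The item lists are taken from the slide; the function name and input format are hypothetical.

```python
POSITIVE_ITEMS = [
    "interested", "excited", "strong", "enthusiastic", "proud",
    "alert", "inspired", "determined", "attentive", "active",
]
NEGATIVE_ITEMS = [
    "distressed", "upset", "guilty", "scared", "hostile",
    "irritable", "ashamed", "nervous", "jittery", "afraid",
]


def panas_scores(ratings):
    """Return (positive_affect, negative_affect) from a dict of item -> rating (1-5)."""
    for item in POSITIVE_ITEMS + NEGATIVE_ITEMS:
        if ratings.get(item) not in (1, 2, 3, 4, 5):
            raise ValueError(f"missing or invalid rating for {item!r}")
    positive = sum(ratings[item] for item in POSITIVE_ITEMS)
    negative = sum(ratings[item] for item in NEGATIVE_ITEMS)
    return positive, negative


# Example: moderately positive, not at all negative.
ratings = {item: 3 for item in POSITIVE_ITEMS}
ratings.update({item: 1 for item in NEGATIVE_ITEMS})
pa, na = panas_scores(ratings)
```

In a pre- and post-task design, the change in positive affect would simply be the difference between the two positive scores.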
19. Small scale measurement – gaze and self-reporting
§ News
§ interest
§ 57 users
§ reading task (114)
Three metrics: gaze,
focused attention and
positive affect
§ questionnaire (qualitative data)
§ record eye tracking
(quantitative data)
All three metrics align:
interesting content promotes
all engagement metrics
(Arapakis et al, 2014)
20. From small- to large-scale
measurement – mouse tracking
§ Navigation & interaction with digital
environment usually involves the use
of a mouse (selecting, positioning, clicking)
§ Several works show mouse cursor as
weak proxy of gaze (attention)
§ Low-cost, scalable alternative
§ Can be performed in a non-invasive
manner, without removing users from
their natural setting
21. Relevance, dwell time & cursor
(Guo & Agichtein, 2012)
“reading” a relevant long document vs “scanning” a long non-relevant
document
23. Towards a taxonomy of mouse gestures for user
engagement measurement
§ The top-ranked clustering configuration is the Spectral Clustering
for the original dataset, with hyperbolic tangent kernel, for k = 38
• certain types of mouse gestures occur more or less often, depending on user
interest in article
• significant correlations between certain types of mouse gestures and self-report
measures
• cursor behaviour goes beyond measuring frustration
• inform about the positive and negative interaction
24. laboratory
“in the wild”
from high level of consistency and control … to greater external validity
and more “true to life”
Setting
25. Crowdsourcing and self-report
§ How the visual catchiness (saliency) of
“relevant” information impacts
› focused attention
› affect
§ Saliency model of visual attention developed
by (Itti & Koch, 2000)
27. Study design
§ 8 tasks = finding latest news or headline on celebrity or
entertainment topic
§ Affect measured pre- and post-task using the Positive and Negative Affect
Schedule (PANAS), e.g. “determined”, “attentive”
§ Focused attention measured with 7-item focused attention scale
e.g. “I was so involved in my news tasks that I lost track of time”, “I blocked
things out around me when I was completing the news tasks” and
perceived time
§ Interest level in topics (pre-task) and questionnaire (post-task) e.g.
“I was interested in the content of the web pages”, “I wanted to find out
more about the topics that I encountered on the web pages”
§ 189 (90+99) participants from Amazon Mechanical Turk
28. Using crowdsourcing works
§ When headlines are non-salient:
users are slow at finding them, report more distraction
due to web page features, and show a drop in affect
§ When headlines are salient:
users find them faster, report that it is easy to focus, and
maintain positive affect
§ Users reported “easier to focus in the salient condition”
BUT no significant improvement in the focused attention
scale or differences in perceived time spent on tasks
User interest in web page content is a good predictor
→ of focused attention, itself a good predictor
→ of positive affect
29. objective – analytics and physiological
subjective – self-report
towards reliability and validity … mapping objective and subjective
measurement
Objectivity vs
Subjectivity
31. Mouse tracking and self-reporting
§ 324 users from Amazon Mechanical Turk (between
subject design)
§ Two domains (BBC News and Wikipedia)
§ Two tasks (reading and search)
§ “Normal vs Ugly” interface
§ Questionnaires (qualitative data)
› focused attention, positive affect
› interest, aesthetics
› + demographics, hardware
§ Mouse tracking (quantitative data)
› movement speed, movement rate, click rate, pause length, percentage of time
still
(Warnock & Lalmas, 2013)
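To make the mouse-tracking features more concrete, here is a rough sketch of how some of them could be computed from timestamped cursor samples. The sample format, the definition of "still" (no position change between consecutive samples) and the function name are assumptions for illustration, not the feature definitions used in the study.

```python
import math


def cursor_features(samples, clicks=0):
    """Compute simple cursor features.

    samples: list of (t, x, y) tuples sorted by time t (seconds), x/y in pixels.
    clicks:  number of clicks observed over the same period.
    """
    if len(samples) < 2:
        raise ValueError("need at least two cursor samples")
    total_time = samples[-1][0] - samples[0][0]
    if total_time <= 0:
        raise ValueError("samples must span a positive amount of time")

    distance = 0.0       # total distance travelled, in pixels
    still_time = 0.0     # total time with no cursor movement
    pauses = []          # lengths of uninterrupted still periods
    current_pause = 0.0
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dt = t1 - t0
        step = math.hypot(x1 - x0, y1 - y0)
        if step == 0:
            still_time += dt
            current_pause += dt
        else:
            distance += step
            if current_pause > 0:
                pauses.append(current_pause)
                current_pause = 0.0
    if current_pause > 0:
        pauses.append(current_pause)

    moving_time = total_time - still_time
    return {
        "movement_speed": distance / moving_time if moving_time > 0 else 0.0,
        "mean_pause_length": sum(pauses) / len(pauses) if pauses else 0.0,
        "pct_time_still": 100 * still_time / total_time,
        "click_rate": clicks / total_time,
    }


samples = [(0, 0, 0), (1, 3, 4), (2, 3, 4), (4, 3, 4), (5, 6, 8)]
features = cursor_features(samples, clicks=2)
```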
32. Mouse tracking could not tell much about
§ focused attention and positive affect
§ user interests in the task/topic
§ aesthetics
§ BUT
› “ugly” variant did not result in lower USER aesthetics scores
› although BBC > Wikipedia
• BUT – the comments left …
› Wikipedia: “The website was simply awful. Ads flashing everywhere, poor
text colors on a dark blue background.”; “The webpage was entirely blue. I don't
know if it was supposed to be like that, but it definitely detracted from the
browsing experience.”
› BBC News: “The website's layout and color scheme were a bitch to
navigate and read.”; “Comic sans is a horrible font.”
33. Flawed methodology? Non-existing signal?
Wrong metric?
§ Wrong measure?
§ Hawthorne effect?
§ Design
› Usability versus engagement
› Within- versus between-subject
§ Mouse movement was not sophisticated enough
as shown by recent work (Arapakis et al, 2014)
35. Large scale measurements – analytics
• intra-session engagement measures success in attracting users to
remain on site for as long as possible.
• inter-session engagement measured by observing lifetime user value.
intra-session measures
• Dwell time
• Session duration
• Bounce rate
• Play time (video)
• Mouse movement
• Click through rate (CTR)
• Number of pages viewed (click
depth)
• Conversion rate
• Number of UGC (comments)
• …
inter-session measures
• Fraction of return visits
• Time between visits (inter-session time,
absence time)
• Total view time per month (video)
• Lifetime value (number of actions)
• Number of sessions per unit of time
• Total usage time per unit of time
• Number of friends on site (social
networks)
• Number of UGC (comments)
• …
loyalty
popularity
activity
36. Inter-session metric – absence time
short absence is
a sign of loyalty
important indication
of user engagement
(Dupret & Lalmas, 2013)
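A minimal sketch of how absence time could be extracted from one user's activity log: consecutive events separated by more than a session gap are treated as belonging to different visits, and the gap itself is the absence time. The 30-minute threshold and the function name are assumptions for illustration; this does not reproduce the analysis in the cited work.

```python
def absence_times(event_times, session_gap=1800):
    """Return the absence times (seconds) between one user's visits.

    event_times: timestamps (seconds) of the user's events, in any order.
    session_gap: assumed inactivity threshold that separates two visits.
    """
    times = sorted(event_times)
    absences = []
    for prev, nxt in zip(times, times[1:]):
        gap = nxt - prev
        if gap > session_gap:
            absences.append(gap)
    return absences


# Three visits: events at 0-600 s, 5000-5300 s, and one at 100000 s.
absences = absence_times([0, 600, 5000, 5300, 100000])
```

Shorter values in the returned list correspond to the user coming back sooner, which the slide reads as a sign of loyalty.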
37. Absence time – search experience
search session metrics absence time
1. Clicks after the 5th result reflect poorer user experience;
users cannot find what they are looking for
2. No click means a bad user experience
3. Clicking at bottom is a sign of low quality overall ranking
4. Users finding their answers quickly (click sooner) return
sooner to the search application
5. Returning to the same search result page is a worse user
experience than reformulating the query.
39. Measuring User Engagement
What is a good signal?
What is a good metric?
What is a correct interpretation?
1. No one measurement is perfect or complete.
2. Studies have different constraints.
3. Measurement should be applied consistently with
attention to reliability.
4. Mostly “normal” interaction.
5. “It is a capital mistake to theorize before one has
data.” - Arthur Conan Doyle
40. Danke schön
This talk is based on tutorial
& book “Measuring User
Engagement” (with Heather
O’Brien and Elad Yom-Tov)