Ei09 Thousands Observers

Thousands of Online Observers is Just the Beginning

Nathan Moroney, HP Labs

Human Vision and Electronic Imaging XIV
Session 2: Social Software, Internet Experiments and New Paradigms for the Web
Monday, January 19, 2009, 1:00-1:30 PM

Outline
• Brief History of Crowd-Sourcing
• Online Experiments
− Unconstrained color naming
− Color name comparison
− Color difference description
− Image quality description
− World Wide Gamma
• Online Tools
− Color Thesaurus, Color Zeitgeist & Italian Color Thesaurus
• Eight considerations

1/27/2009 2

Brief History of Crowdsourcing: Part 1

“Since the beginning, it was
just the same. The only
difference, the crowds are
bigger now.”
Elvis


Brief History of Crowdsourcing: Part 2

“The future belongs to crowds.”
Mao II
Don DeLillo

(Left as an exercise for the audience to do an Elvis – DeLillo mash-up)


Online Experiments
• Basic pieces
− Experimental design – unconstrained text
− Software, a server – JavaScript
− Communication network – World Wide Web
− Participants – volunteers
• Results
− Direct Data
− Usage Data
• Optional but useful – lab data for validation


Unconstrained Color Naming
• Seven colored patches
• Randomly selected
− 6x6x6 RGB sampling
• Text field for names
• Provide the “best” name
• Optional comments
• Started in 2002
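
The patch selection above can be sketched in a few lines. This is a minimal illustration, not the experiment's actual code: it assumes the 6x6x6 sampling uses six evenly spaced levels per channel (0, 51, ..., 255) and that the seven patches shown to a participant are distinct. The function names are hypothetical.

```python
import itertools
import random

# Assumption: six evenly spaced levels per channel (0, 51, ..., 255).
LEVELS = [round(i * 255 / 5) for i in range(6)]


def rgb_grid():
    """Return all 6 x 6 x 6 = 216 RGB triplets on the sampling grid."""
    return list(itertools.product(LEVELS, repeat=3))


def pick_patches(n=7, rng=None):
    """Pick n distinct grid colors at random for one participant."""
    rng = rng or random.Random()
    return rng.sample(rgb_grid(), n)


def to_hex(rgb):
    """Format an (r, g, b) triplet as a CSS-style hex string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


if __name__ == "__main__":
    # Show one participant's seven patches.
    for patch in pick_patches():
        print(to_hex(patch), patch)
```

Sampling from a fixed grid keeps the stimulus set small enough that each color is seen repeatedly as participation grows.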


On-Line vs. Berlin & Kay
[Figure: scatter plot of CIECAM02 hue angle, On-Line (Web) on the x-axis vs. Berlin & Kay on the y-axis, both axes 0 to 360]
Linear fit: y = 0.9971x + 28.986, R² = 0.9859

Color Name Comparison
• Text only
• Eleven color names
• Non-repeating random walk
• Eleven triads
− Which color is least like the other two?
• Collect additional demographic data
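
The arithmetic behind the triad design can be sketched as follows. Eleven names give C(11, 3) = 165 possible triads, so at eleven triads per participant at least 15 participants are needed to see every triad once. The sketch assumes the eleven names are the Berlin & Kay basic color terms (the slide does not list them) and reads "non-repeating" as "no triad repeats within a session"; it uses plain random sampling rather than the slide's random walk, whose details are not given.

```python
import itertools
import math
import random

# Assumption: the eleven Berlin & Kay basic color terms.
NAMES = ["white", "black", "red", "green", "yellow", "blue",
         "brown", "purple", "pink", "orange", "gray"]


def all_triads(names):
    """Every unordered set of three names."""
    return list(itertools.combinations(names, 3))


def session_triads(names, per_session=11, rng=None):
    """Draw distinct triads for one participant, in random order."""
    rng = rng or random.Random()
    triads = rng.sample(all_triads(names), per_session)
    # Shuffle within each triad so screen position carries no cue.
    return [tuple(rng.sample(t, 3)) for t in triads]


def min_sessions(names, per_session=11):
    """Fewest sessions that could cover every triad once."""
    return math.ceil(math.comb(len(names), 3) / per_session)


if __name__ == "__main__":
    print(len(all_triads(NAMES)), "triads,", min_sessions(NAMES), "sessions minimum")
    for triad in session_triads(NAMES):
        print("Which is least like the other two?", triad)
```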


Clustering Nominal Comparisons


Color Difference Description
• Five pairs of colored patches.
• Best describe the difference
• Text field per pair
− Unconstrained description
• Randomly sample RGB cube
− Constrained RGB offsets


Frequencies of Words
• ‘More’ is six times as frequent as ‘less’
• ‘Darker’ is twice as frequent as ‘lighter’,
− same for ‘dark’ and ‘light’
• Lime and magenta are not in the top 100 terms
− But they are in the top 10 of unconstrained naming.

Proportion  Word
0.048  right
0.045  more
0.031  left
0.028  one
0.018  color
0.017  green
0.017  darker
0.015  blue
0.012  than
0.012  saturated
0.011  patch
0.011  first
0.010  purple
0.009  lighter
0.009  second
0.008  dark
0.007  less
0.007  brown
0.007  red
0.006  different
0.006  yellow
0.006  difference
0.006  brighter
0.006  hue
0.005  pink
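
Word proportions like those above can be computed from free-text responses with a simple token count. The sketch below is one plausible way to do it, not the analysis actually used. The tokenizer (lowercase, letters and apostrophes only) is an assumption, and the sample descriptions are invented for illustration.

```python
import re
from collections import Counter


def token_proportions(descriptions):
    """Map each token to its share of all tokens, most frequent first."""
    tokens = []
    for text in descriptions:
        # Assumption: lowercase words made of letters and apostrophes.
        tokens.extend(re.findall(r"[a-z']+", text.lower()))
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tok: n / total for tok, n in counts.most_common()}


if __name__ == "__main__":
    # Invented responses, for illustration only.
    sample = [
        "The right one is darker",
        "left is more green",
        "right patch is more saturated",
    ]
    for token, share in token_proportions(sample).items():
        print(f"{share:.3f} {token}")
```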

Image Quality Description
• Overall and specific description of image quality
• Demographic questions

Proportion vs. Token
0.089  the
0.033  of
0.032  is
0.031  and
0.021  color(s)
0.017  to
0.016  good
0.014  on
0.014  a
0.013  in


Opt-In Demographics: n=338
[Figure: four pie charts of opt-in responses]
• Gender: Male 56%, Female 44%
• English Proficiency: Native 65%, Non-Native 35%
• Age (years): 20-40 59%, >60 1%; the < 20 and 40-60 slices are 17% and 23% (pairing unclear)
• Color Vision (self-described): Normal 89%, Don’t Know 9%, Maybe Color Blind 1%, Definitely Color Blind 1%

World Wide Gamma
• Lightness partitioning task, benchmark to a nominal display and existing lightness scales, such as L*.
[Figure: two panels labeled Before and After]
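
For reference, the L* benchmark mentioned above can be related to a nominal display as follows. The sketch uses the standard CIE 1976 lightness formula. The display model is an assumption: a pure power law with gamma 2.2 and no black offset or flare, which is not stated on the slides. The function names are hypothetical.

```python
def display_luminance(level, gamma=2.2):
    """Relative luminance Y/Yn for a normalized gray level in [0, 1].

    Assumes a pure power-law display with no black offset or flare.
    """
    return level ** gamma


def cie_lightness(y):
    """CIE 1976 lightness L* for relative luminance y = Y/Yn in [0, 1]."""
    delta = 6 / 29
    if y > delta ** 3:
        f = y ** (1 / 3)
    else:
        f = y / (3 * delta ** 2) + 4 / 29
    return 116 * f - 16


def level_for_lightness(l_star, gamma=2.2):
    """Inverse mapping: the gray level that yields a given L*."""
    delta = 6 / 29
    f = (l_star + 16) / 116
    y = f ** 3 if f > delta else 3 * delta ** 2 * (f - 4 / 29)
    return max(y, 0.0) ** (1 / gamma)


if __name__ == "__main__":
    # Gray levels that would split the nominal display into equal L* steps.
    for l_star in range(0, 101, 10):
        print(l_star, round(level_for_lightness(l_star), 3))
```

Comparing observers' partitions against these equal-L* levels is one way to express how far a given display departs from the nominal one.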

World Wide Gamma
• Red is >600 participants
• Black is current results
• Specific experimental feedback
• Offset for darkest levels but quite linear


Online Color Thesaurus
• Interface to the underlying database of color names
• Largest number of users


Color Zeitgeist
• Usage data – tool use creates data, which in turn creates another tool


Italian Color Thesaurus
• Italian data < English data
• Adaptive tools
− Qualification through ratings
− Quantity through instance-based harvesting, collect new data only for missing colors
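
One possible reading of "collect new data only for missing colors" is sketched below: bin the names gathered so far onto a color grid and present only the grid colors that still lack names. This is an interpretation, not the tool's documented behavior. The 6x6x6 grid with levels 0, 51, ..., 255, the `min_names` threshold and the sample records are all assumptions for illustration.

```python
import itertools
from collections import Counter

# Assumption: the same six-level grid as a 6x6x6 RGB sampling.
LEVELS = [0, 51, 102, 153, 204, 255]


def nearest_grid_color(rgb):
    """Snap an (r, g, b) triplet to the closest grid color."""
    return tuple(min(LEVELS, key=lambda lv: abs(lv - c)) for c in rgb)


def missing_colors(records, min_names=1):
    """Grid colors with fewer than min_names collected names.

    records is an iterable of ((r, g, b), name) pairs.
    """
    counts = Counter(nearest_grid_color(rgb) for rgb, _name in records)
    grid = itertools.product(LEVELS, repeat=3)
    return [color for color in grid if counts[color] < min_names]


if __name__ == "__main__":
    # Invented records, for illustration only.
    collected = [((250, 5, 0), "rosso"), ((0, 0, 255), "blu")]
    todo = missing_colors(collected)
    print(len(todo), "grid colors still need names")
```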


Consideration 1: Scale
• Yes, online experiments mean bigger crowds
− Larger & more diverse pool of possible participants
− Logarithmic scale of participation

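[Figure: logarithmic scale of possible participants: 1, 10, 100, 1K, 10K, 100K, 1M, 10M, 100M]
Population examples above the axis: HP Department, HP Labs, Stanford (under), Palo Alto, San Jose, California
Activities below the axis: Lab Prototypes & Experiments, Web-based Color naming experiment, English Color Thesaurus, Application Based Color Picker, OS Based Color Picker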


Observers per Experiment by Year
[Figure: number of observers per experiment (log scale, 1 to 10000) plotted against year of experiment (1990 to 2010)]
Note on the plot: these should also have error bars and connecting lines…

Consideration 2: Distributed Design
• Minimize the effort from any single participant
− Increase volunteer participation rate?
− Minimize impact of any single, systematically disruptive participant
• A ‘knob’ that can be used to dial the target “time to completion” for any given web participant
• Applicable to even relatively complex tasks
− Triadic comparison

vs.


Consideration 3: Ambiguity
• Lack of constraints is a trade-off
− May make the task more difficult for observers
− May enable a different set of questions
− General bias is towards unconstrained tasks
− Implicitly include real world variability
• Sources of variability are vast; robustness comes from scale – and a focus on categories, not thresholds

“wasn’t sure whether you wanted
accurate or poetic names.”
Anonymous Comment
June 8, 2002


Consideration 4: Hypotheses vs Training
• Thresholds versus Categories
• Individual performance versus collective capability
• Numbers versus Words

Pixel-by-pixel machine color naming – see ‘Lexical Image Processing’, CIC 16


Consideration 5: Simplicity
• In both tasks and tools
• The simpler the task – likely the less confusion over
instructions, higher the volunteer participation rate
• The simpler the tools – lowest common denominator
infrastructure, minimum number of versions over the
years, likely widest audience


Consideration 6: Global & Open-Ended
• Global scale for participation
• Effort is front-loaded – once uploaded, no real penalty to indefinite data collection
• Data ‘evolves’ as it changes scale
• Especially true for
− inter-related experiments,
− variations in experimental designs and
− results that are in pursuit of an aggregate property
− results that change over time
[Figure: number of observers per experiment (log scale, 1 to 10000) by year (1990 to 2010)]


Consideration 7: Usage as Data
• Any online interaction creates data
• The boundary between experiments and tools is potentially fuzzy
• Useful experiments can be formatted as a useful tool, and the more useful the tool, the greater the potential data.
• An important implication and possible advantage is that a tool defines context for the task; the pragmatics are inherent.


Consideration 8: Mutual Bootstrapping
• Mutual bootstrapping – machine learning applied to training data gathered online, which in turn creates processed data which can enable human learning.
• Social data can be educational.
[Figure: color example labeled “Chartreuse”]
• Revisiting approaches to laboratory experiments – if the goals are simplicity, categorization, ambiguity, larger scale and so on, how are the designs different?
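
As a toy illustration of machine learning applied to online naming data, the sketch below trains a nearest-centroid namer from (RGB, name) pairs and then names a new color. It is not the method behind the tools in this talk, which the slides do not describe. Distance is measured in raw RGB for brevity, where a perceptual space would be the more defensible choice, and the training samples are invented.

```python
from collections import Counter


def train_centroids(samples):
    """Average the RGB values collected for each name.

    samples is an iterable of ((r, g, b), name) pairs.
    """
    sums, counts = {}, Counter()
    for rgb, name in samples:
        acc = sums.setdefault(name, [0.0, 0.0, 0.0])
        for i, channel in enumerate(rgb):
            acc[i] += channel
        counts[name] += 1
    return {n: tuple(v / counts[n] for v in acc) for n, acc in sums.items()}


def name_color(rgb, centroids):
    """Return the name whose centroid is closest to rgb (squared distance)."""
    def dist2(center):
        return sum((a - b) ** 2 for a, b in zip(rgb, center))
    return min(centroids, key=lambda n: dist2(centroids[n]))


if __name__ == "__main__":
    # Invented training samples, for illustration only.
    data = [
        ((255, 0, 0), "red"), ((200, 30, 30), "red"),
        ((0, 0, 255), "blue"),
        ((127, 255, 0), "chartreuse"), ((120, 240, 20), "chartreuse"),
    ]
    model = train_centroids(data)
    print(name_color((130, 250, 10), model))
```

A reader who has never used the word can then look up what the crowd calls chartreuse, which is the sense in which the processed data feeds back into human learning.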


Questions?

Elvis’s favorite color?

That would be blue.
