Introduction to Web Survey Usability Design and Testing
1. Introduction to Web Survey Usability Design and Testing
DC-AAPOR Workshop
Amy Anderson Riemer
Jennifer Romano Bergstrom
The views expressed on statistical or methodological issues are those of the presenters
and not necessarily those of the U.S. Census Bureau.
2. Schedule
9:00 – 9:15 Introduction & Objectives
9:15 – 11:45 Web Survey Design: Desktop & Mobile
11:45 – 12:45 Lunch
12:45 – 2:30 Assessing Your Survey
2:30 – 2:45 Break
2:45 – 3:30 Mixed Modes Data Quality
3:30 – 4:00 Wrap Up
3. Objectives
Web Survey Design: Desktop & Mobile
• Paging vs. Scrolling
• Navigation
• Scrolling lists vs. double-banked response options
• Edits & Input fields
• Checkboxes & Radio buttons
• Instructions & Help
• Graphics
• Emphasizing Text & White Space
• Authentication
• Progress Indicators
• Consistency

Assessing Your Survey
• Paradata
• Usability

Quality of Mixed Modes
• Mixed Mode Surveys
• Response Rates
• Mode Choice
4. Web Survey Design
The views expressed on statistical or methodological issues are those of the presenters
and not necessarily those of the U.S. Census Bureau.
5. Activity #1
1. Today’s date
2. How long did it take you to get to BLS today?
3. What do you think about the BLS entrance?
6. Why is Design Important?
• No interviewer present to correct/advise
• Visual presentation affects responses
– (Couper’s activity)
• While the Internet provides many ways to
enhance surveys, design tools may be misused
7. Why is Design Important?
• Respondents extract meaning from how
question and response options are displayed
• Design may distract from or interfere with
responses
• Design may affect data quality
8. Why is Design Important?
http://www.cc.gatech.edu/gvu/user_surveys/
9. Why is Design Important?
• Many surveys are long (> 30min)
• Long surveys have higher nonresponse rates
• Length affects quality
Adams & Darwin, 1982; Dillman et al., 1993; Heberlein & Baumgartner, 1978
10. Why is Design Important?
• Respondents are more tech savvy today and
use multiple technologies
• It is not just about reducing respondent
burden and nonresponse
• We must increase engagement
• High-quality design = trust in the designer
Adams & Darwin, 1982; Dillman et al., 1993; Heberlein & Baumgartner, 1978
23. Web Survey Design
• Paging vs. Scrolling
• Navigation
• Scrolling vs. Double-Banked
• Edits and Input Fields
• Checkboxes and Radio Buttons
• Instructions and Help
• Graphics
• Emphasizing Text
• White Space
• Authentication
• Progress Indicators
• Consistency
24. Paging vs. Scrolling
Paging
• Multiple questions per page
• Complex skip patterns
• Not restricted to one item per screen
• Data from each page saved
– Can be suspended/resumed
• Order of responding can be controlled
• Requires more mouse clicks

Scrolling
• All on one static page
• No data is saved until submitted at end
– Can lose all data
• Respondent can review/change responses
• Questions can be answered out of order
• Similar look-and-feel as paper
25. Paging vs. Scrolling
• Little advantage (breakoffs, nonresponse,
time, straightlining) of one over the other
• Mixed approach may be best
• Choice should be driven by content and target
audience
– Scrolling for short surveys with few skip patterns;
respondent needs to see previous responses
– Paging for long surveys with intricate skip
patterns; questions should be answered in order
Couper, 2001; Gonyea, 2007; Peytchev, 2006; Vehovar, 2000
26. Web Survey Design
• Paging vs. Scrolling
• Navigation
• Scrolling vs. Double-Banked
• Edits and Input Fields
• Checkboxes and Radio Buttons
• Instructions and Help
• Graphics
• Emphasizing Text
• White Space
• Authentication
• Progress Indicators
• Consistency
27. Navigation
• In a paging survey, after entering a response
– Proceed to next page
– Return to previous page (sometimes)
– Quit or stop
– Launch separate page with Help, definitions, etc.
28. Navigation: NP
• Next should be on the left
– Reduces the amount of time to move cursor to
primary navigation button
– Frequency of use
Couper, 2008; Dillman et al., 2009; Faulkner, 1998; Koyani et al., 2004; Wroblewski, 2008
36. Method
• Lab-based usability study
• TA read introduction and left letter on desk
• Separate rooms
• R read letter and logged in to survey
• Think Aloud
• Eye Tracking
• Satisfaction Questionnaire
• Debriefing
Romano & Chen, 2011
38. Results: Satisfaction II
[Four bar charts comparing mean satisfaction ratings for the N_P and PN versions:]
• Overall reaction to the survey: terrible – wonderful. p < 0.05.
• Information displayed on the screens: inadequate – adequate. p = 0.07.
• Arrangement of information on the screens: illogical – logical. p = 0.19.
• Forward navigation: impossible – easy. p = 0.13.
Romano & Chen, 2011
39. Eye Tracking
• Participants looked at Previous and Next in PN conditions
• Many participants looked at Previous in the N_P conditions
– Couper et al. (2011): Previous gets used more when it is on the right.
40. N_P vs. PN: Respondent Debriefing
• N_P version
– Counterintuitive
– Don’t like the “buttons being flipped.”
– Next on the left is “really irritating.”
– Order is “opposite of what most people would
design.”
• PN version
– “Pretty standard, like what you typically see.”
– The location is “logical.”
Romano & Chen, 2011
41. Navigation Alternative
• Previous below Next
– Buttons can be closer
– But what about older adults?
– What about on mobile?
Couper et al., 2011; Wroblewski, 2008
46. Web Survey Design
• Paging vs. Scrolling
• Navigation
• Scrolling vs. Double-Banked
• Edits and Input Fields
• Checkboxes and Radio Buttons
• Instructions and Help
• Graphics
• Emphasizing Text
• White Space
• Authentication
• Progress Indicators
• Consistency
47. Long List of Response Options
• One column: Scrolling
– Visually appear to belong to one group
– When there are two columns, 2nd one may not be
seen (Smyth et al., 1997)
• Two columns: Double banked
– No scrolling
– See all options at once
– Appears shorter
48. 1 Column vs. 2 Column Study
Romano & Chen, 2011
49. Seconds to First Fixation
[Bar chart: mean seconds to first fixation on the first and second halves of the list, for the 2-column and 1-column versions. * p < 0.01]
Romano & Chen, 2011
50. Total Number of Fixations
[Bar chart: total number of fixations on the first and second halves of the list, for the 2-column and 1-column versions]
Romano & Chen, 2011
51. Time to Complete Item
[Bar chart: mean, minimum, and maximum seconds to complete the item, for the 1-column and 2-column versions]
Romano & Chen, 2011
52. 1 Col. vs. 2 Col.: Debriefing
• 25 had a preference
– 6 preferred one column
• They had received the one-column version
– 19 preferred 2 columns
• 7 had received the one-column version
• Prefer not to scroll
• Want to see and compare everything at once
• It is easier to “look through,” to scan, to read
• Re one column, “How long is this list going to be?”
Romano & Chen, 2011
53. Long Lists
• Consider breaking list into smaller questions
• Consider series of yes/no questions
• Use logical order or randomize
• If using double-banked, do not separate
columns widely
54. Web Survey Design
• Paging vs. Scrolling
• Navigation
• Scrolling vs. Double-Banked
• Edits and Input Fields
• Checkboxes and Radio Buttons
• Instructions and Help
• Graphics
• Emphasizing Text
• White Space
• Authentication
• Progress Indicators
• Consistency
56. Input Fields
• Smaller text boxes = more restricted
• Larger text boxes = less restricted
– Encourage longer responses
• Visual/Verbal Miscommunication
– Visual may indicate “Write a story”
– Verbal may indicate “Write a number”
• What do you want to allow?
57. Types of Open-Ended Responses
• Narrative
– E.g., Describe…
• Short verbal responses
– E.g., What was your occupation?
• Single word/phrase responses
– E.g., Country of residence
• Frequency/Numeric response
– E.g., How many times…
• Formatted number/verbal
– E.g., Telephone number
62. Open-Ended Responses: Numeric
• Use of templates reduces ill-formed responses
– E.g., $_________.00
Couper et al., 2009; Fuchs, 2007
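A minimal sketch of how such a template might be enforced client-side (TypeScript; the function name and examples are illustrative, not from the workshop materials):

function parseWholeDollars(raw: string): number | null {
  // Tolerate "$", commas, and spaces, mirroring a "$____.00" template.
  const cleaned = raw.replace(/[$,\s]/g, "");
  // Accept digits only; anything else should trigger a helpful error message.
  return /^\d+$/.test(cleaned) ? Number(cleaned) : null;
}

console.log(parseWholeDollars("$1,250"));  // 1250
console.log(parseWholeDollars("about 12")); // null

Rejecting ill-formed entries at entry time, rather than after submission, is what keeps the template from becoming a source of frustration.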
63. Open-Ended Responses: Date
• Not a good use: intended response will always
be the same format
• Same for state, zip code, etc.
• Note
– “Month” = text
– “mm/yyyy” = #s
64. Web Survey Design
• Paging vs. Scrolling
• Navigation
• Scrolling vs. Double-Banked
• Edits and Input Fields
• Checkboxes and Radio Buttons
• Instructions and Help
• Graphics
• Emphasizing Text
• White Space
• Authentication
• Progress Indicators
• Consistency
65. Check Boxes and Radio Buttons
• Perceived Affordances
• Design according to existing conventions and
expectations
• What are the conventions?
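The core convention: radio buttons signal "choose one" (mutually exclusive options sharing one name), check boxes signal "choose all that apply." A minimal sketch of that convention (TypeScript; names and labels are illustrative):

function renderOptions(name: string, options: string[], selectAllThatApply: boolean): string {
  // Checkboxes afford multiple selections; radio buttons sharing a `name` afford one.
  const type = selectAllThatApply ? "checkbox" : "radio";
  return options
    .map((label, i) => `<label><input type="${type}" name="${name}" value="${i}"> ${label}</label>`)
    .join("\n");
}

console.log(renderOptions("q1", ["Very satisfied", "Neutral", "Very dissatisfied"], false));

Respondents perceive these affordances immediately, so swapping the two controls violates expectations even when the survey logic still works.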
73. Web Survey Design
• Paging vs. Scrolling
• Navigation
• Scrolling vs. Double-Banked
• Edits and Input Fields
• Checkboxes and Radio Buttons
• Instructions and Help
• Graphics
• Emphasizing Text
• White Space
• Authentication
• Progress Indicators
• Consistency
80. Placement of Clarifying Instructions
• Help respondents have the same
interpretation
• Definitions, instructions, examples
Conrad & Schober, 2000; Conrad et al., 2006; Conrad et al., 2007; Martin, 2002; Schober & Conrad, 1997; Tourangeau et al., 2010
82. Placement of Clarifying Instructions
• Percentage of valid responses was higher with
clarification
• Longer response time when before item
• No effects of changing the font style
• Before item is better than after
• Asking a series of questions is best
Redline, 2013
83. Placement of Help
• People are less likely to use help when they
have to click than when it is near item
• “Don’t make me think”
84. Placement of Error Message
• Should be near the item
• Should be positive and helpful, suggesting
HOW to help
• Bad error message:
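A sketch of placing a specific, constructive message next to the item rather than in a generic pop-up (TypeScript/DOM; the element ids and message text are hypothetical):

function showInlineError(inputId: string, message: string): void {
  const input = document.getElementById(inputId);
  if (!input) return;
  let note = document.getElementById(`${inputId}-error`);
  if (!note) {
    note = document.createElement("span");
    note.id = `${inputId}-error`;
    input.insertAdjacentElement("afterend", note); // message appears beside the item
  }
  // Say HOW to fix it, e.g., "Please enter a whole number, such as 3."
  note.textContent = message;
}

The key design choice is proximity plus guidance: the respondent should never have to hunt for which item failed or guess what a valid answer looks like.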
88. Web Survey Design
• Paging vs. Scrolling
• Navigation
• Scrolling vs. Double-Banked
• Edits and Input Fields
• Checkboxes and Radio Buttons
• Instructions and Help
• Graphics
• Emphasizing Text
• White Space
• Authentication
• Progress Indicators
• Consistency
89. Graphics
• Improve motivation, engagement, satisfaction
with “fun”
• Decrease nonresponse & measurement error
• Improve data quality
• Gamification
Henning, 2012; Manfreda et al., 2002
90. Graphics
• Use when they supply meaning
– Survey about advertisements
• Use when user experience is improved
– For children or video-game players
– For low literacy
Libman, 2012
93. Graphics Experiment 1.1
• Appearance
– Decreasing boldness (bold → faded)
– Increasing boldness (faded → bold)
– Adding face symbols to response options
• ~2,400 respondents
• Rated satisfaction with health-related items
• 5-pt scale: very satisfied to very dissatisfied
Medway & Tourangeau, 2011
94. Graphics Experiment 1.2
• Bold side selected more
[Example item "Your physician" with all five options labeled: Very satisfied, Somewhat satisfied, Neutral, Somewhat dissatisfied, Very dissatisfied]
• Less satisfaction when face symbols present
[Example item "Your physician" with only the endpoints labeled: Very satisfied … Very dissatisfied]
Medway & Tourangeau, 2011
95. Graphics Experiment 2.1
• Appearance
– Radio buttons
– Face symbols
• ~1,000 respondents
• Rated satisfaction with a journal
• 6-pt scale: very dissatisfied to very satisfied
Emde & Fuchs, 2011
96. Graphics Experiment 2.2
• Faces were equivalent to radio buttons
• Respondents were more attentive when faces
were present
– Time to respond
Emde & Fuchs, 2011
97. Slider Usability Study
• Participants thought 1 was selected and did not move the slider. 0 was actually
selected if they did not respond.
Strohl, Romano Bergstrom & Krulikowski, 2012
98. Graphics Experiment 3.1
• Modified the visual design of survey items
– Increase novelty and interest on select items
– Other items were standard
• ~ 100 respondents in experimental condition
• ~ 1200 in control
• Questions about military perceptions and
media usage
• Variety of question types
Gibson, Luchman & Romano Bergstrom, 2013
100. Graphics Experiment 3.3
• Slight differences:
– Those with enhanced version skipped more often
– Those in standard responded more negatively.
Gibson, Luchman & Romano Bergstrom, 2013
101. Graphics Experiment 3.4
• Slight differences:
– Those with enhanced version skipped more often
Gibson, Luchman & Romano Bergstrom, 2013
107. Web Survey Design
• Paging vs. Scrolling
• Navigation
• Scrolling vs. Double-Banked
• Edits and Input Fields
• Checkboxes and Radio Buttons
• Instructions and Help
• Graphics
• Emphasizing Text
• White Space
• Authentication
• Progress Indicators
• Consistency
108. Emphasizing Text
• Font
– Never underline plain text
– Never use red for plain text
– Use bold and italics sparingly
111. Emphasizing Text
• Hypertext
– Use meaningful
words and phrases
– Be specific
– Avoid “more” and
“click here.”
112. Web Survey Design
• Paging vs. Scrolling
• Navigation
• Scrolling vs. Double-Banked
• Edits and Input Fields
• Checkboxes and Radio Buttons
• Instructions and Help
• Graphics
• Emphasizing Text
• White Space
• Authentication
• Progress Indicators
• Consistency
113. White Space
• White space on a page
• Differentiates sections
• Don’t overdo it
115. Web Survey Design
• Paging vs. Scrolling
• Navigation
• Scrolling vs. Double-Banked
• Edits and Input Fields
• Checkboxes and Radio Buttons
• Instructions and Help
• Graphics
• Emphasizing Text
• White Space
• Authentication
• Progress Indicators
• Consistency
116. Authentication
• Ensures respondent is the selected person
• Prevents entry by those not selected
• Prevents multiple entries by selected
respondent
117. Authentication
• Passive
– ID and password embedded in URL
• Active
– E-mail entry
– ID and password entry
• Avoid ambiguous passwords (Couper et al., 2001)
– E.g., contains 1, l, 0, o
• Security concerns can be an issue
• Don’t make it more difficult than it needs to be
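A sketch of generating login credentials that omit the ambiguous characters noted above (TypeScript; the URL, parameter name, and length are hypothetical):

const UNAMBIGUOUS = "abcdefghijkmnpqrstuvwxyz23456789"; // excludes 1, l, 0, o

function makeLoginId(length = 8): string {
  let id = "";
  for (let i = 0; i < length; i++) {
    id += UNAMBIGUOUS[Math.floor(Math.random() * UNAMBIGUOUS.length)];
  }
  return id;
}

// Passive authentication: embed the ID in the mailed URL so the
// respondent does not have to type it at all.
console.log(`https://survey.example.gov/start?id=${makeLoginId()}`);

Passive embedding trades a little security for a lot of usability; whether that trade is acceptable depends on the sensitivity of the survey.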
119. Web Survey Design
• Paging vs. Scrolling
• Navigation
• Scrolling vs. Double-Banked
• Edits and Input Fields
• Checkboxes and Radio Buttons
• Instructions and Help
• Graphics
• Emphasizing Text
• White Space
• Authentication
• Progress Indicators
• Consistency
120. Progress Indicators
• Reduce breakoffs
• Reduce burden by displaying length of survey
• Enhance motivation and visual feedback
• Not needed in scrolling design
• Little evidence of benefit
Couper et al., 2001; Crawford et al., 2001; Conrad et al., 2003, 2005; Sakshaug & Crawford, 2009
125. Web Survey Design
• Paging vs. Scrolling
• Navigation
• Scrolling vs. Double-Banked
• Edits and Input Fields
• Checkboxes and Radio Buttons
• Instructions and Help
• Graphics
• Emphasizing Text
• White Space
• Authentication
• Progress Indicators
• Consistency
126. Consistency
• Predictable
– User can anticipate what the system will do
• Dependable
– System fulfills user’s expectations
• Habit-forming
– System encourages behavior
• Transferable
– Habits in one context can transfer to another
• Natural
– Consistent with user’s knowledge
131. Assessing Your Survey
The views expressed on statistical or methodological issues are those of the presenters
and not necessarily those of the U.S. Census Bureau.
132. Assessing Your Survey
Paradata
• Background
• Uses of Paradata by mode
• Paradata issues

Usability
• Usability vs. User Experience
• Why, When, What?
• Methods
– Focus Groups, In-Depth Interviews
– Ethnographic Observations, Diary Studies
– Usability & Cognitive Testing
• Lab, Remote, In-the-Field
• Obstacles
134. Types of Data
• Survey Data – collected information from R’s
• Metadata – data that describes the survey
– Codebook
– Description of the project/survey
• Paradata – data about the process of
answering the survey at the R level
• Auxiliary/Administrative Data – not collected
directly, but acquired from external sources
135. Paradata
• Term coined by Mick Couper
– Originally described data that were by-products of
computer-assisted interviewing
– Expanded to include data from other self-
administered modes
• Main uses:
– Adaptive / Responsive design
– Nonresponse adjustment
– Measurement error identification
138. Adaptive / Responsive Design
• Create process indicators
• Real-time monitoring (charts & “dashboards”)
• Adjust resources during data collection to achieve
higher response rate and/or cost savings
• Goal:
– Achieve high response rates in a cost-effective way
– Introduce methods to recruit uncooperative – and
possibly different – sample members (reducing
nonresponse bias)
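As a sketch of what a process indicator might look like in code (TypeScript; the record shape and field names are hypothetical), a dashboard could recompute response rates by subgroup as cases resolve:

interface CaseRecord {
  subgroup: string;   // e.g., a geographic area or sample stratum
  responded: boolean; // whether the case has a completed interview
}

function responseRateBySubgroup(cases: CaseRecord[]): Map<string, number> {
  const tallies = new Map<string, { total: number; responded: number }>();
  for (const c of cases) {
    const t = tallies.get(c.subgroup) ?? { total: 0, responded: 0 };
    t.total += 1;
    if (c.responded) t.responded += 1;
    tallies.set(c.subgroup, t);
  }
  return new Map([...tallies].map(([g, t]) => [g, t.responded / t.total]));
}

Charting these rates daily is the kind of real-time monitoring that lets managers shift interviewer effort toward lagging subgroups mid-collection.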
139. Nonresponse Adjustment
• Decreasing response rates have encouraged
researchers to look at other sources of
information to learn about nonrespondents
– Doorstep interactions
– Interviewer observations
– Contact history data
140. Contact History Instrument (CHI)
• CHI developed by the U.S. Census Bureau
(Bates, 2003)
• Interviewers take time after each attempt (refusal
or non-contact) to answer questions in the CHI
• Use CHI information to create models (i.e., heat
maps) to identify optimal contact time
• Typically a quick set of questions to answer
• European Social Survey uses a standard contact
form (Stoop et al., 2003)
144. Uses of Paradata - CAPI
• Information collected can include:
– Interviewer time spent calling sampled households
– Time driving to sample areas
– Time conversing with household members
– Interview time
– GPS coordinates (tablets/mobile devices)
• Information can be used to:
– Inform cost-quality decisions (Kreuter, 2009)
– Develop cost per contact
– Predict the likelihood of response by using interviewer
observations of the response unit (Groves & Couper, 1998)
– Monitor interviewers and identify any falsification
145. Uses of Paradata - CATI
• Information collected can include:
– Call transaction history (record of each attempt)
– Contact rates
– Sequence of contact attempts & contact rates
• Information can be used to:
– Optimize call back times
– Interviewer monitoring
– Inform a responsive design
146. Uses of Paradata - Web
• Server-side vs. client-side
• Information collected can include:
– Device information (e.g., browser type, operating system, screen resolution, detection of JavaScript or Flash)
– Questionnaire navigation information
Callegaro, 2012
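A sketch of client-side device capture in the browser (TypeScript; the field names are illustrative, not a standard schema):

function collectDeviceParadata() {
  return {
    userAgent: navigator.userAgent,    // browser and operating system
    screenWidth: window.screen.width,  // screen resolution
    screenHeight: window.screen.height,
    javaScriptEnabled: true,           // this code only runs when JS is available
    collectedAt: new Date().toISOString(),
  };
}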
147. Web Paradata - Server-side
• Page requests or “visits” to a web page from
the web server
• Identify device information and monitor
survey completion
148. Web Paradata - Server-side cont.
• Typology of response behaviors in web surveys
1. Complete responders
2. Unit non-responders
3. Answering drop-outs
4. Lurkers
5. Lurking drop-outs
6. Item non-responders
7. Item non-responding drop-outs
Bosnjak, 2001
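One plausible way to operationalize this typology from server-side counts (a sketch, not Bosnjak's own coding rules; the mapping below is an assumption):

function classifyResponder(viewed: number, answered: number, totalItems: number, reachedEnd: boolean): string {
  if (viewed === 0) return "unit non-responder";
  if (reachedEnd && answered === totalItems) return "complete responder";
  if (reachedEnd && answered === 0) return "lurker";
  if (reachedEnd) return "item non-responder";
  if (answered === 0) return "lurking drop-out";
  if (answered < viewed) return "item non-responding drop-out";
  return "answering drop-out";
}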
149. Web Paradata – Client-Side
• Collected on the R’s computer
• Logs each “meaningful” action
• Heerwegh (2003) developed code and guidance for client-side paradata collected using JavaScript
– Clicking on a radio button
– Clicking and selecting a response option in a drop-
down box
– Clicking a check box (checking / unchecking)
– Writing text in an input field
– Clicking a hyperlink
– Submitting the page
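A minimal sketch of this kind of client-side logging (TypeScript; the /paradata endpoint and action labels are assumptions, not Heerwegh's code):

const actionLog: { action: string; target: string; t: number }[] = [];

function logAction(action: string, target: string): void {
  actionLog.push({ action, target, t: Date.now() });
}

document.addEventListener("change", (e) => {
  const el = e.target as HTMLInputElement;
  if (el.type === "radio") logAction("radio-click", el.name);
  else if (el.type === "checkbox") logAction(el.checked ? "check" : "uncheck", el.name);
  else if (el.type === "text") logAction("text-input", el.name);
});

// Send the log with the page so each submission carries its paradata.
document.addEventListener("submit", () => {
  navigator.sendBeacon("/paradata", JSON.stringify(actionLog));
});

Deciding in advance which actions count as "meaningful" keeps the log small enough to store and analyze.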
150. Web Paradata – Client-Side cont.
• Stern (2008) used Heerwegh’s paradata
techniques to identify:
– Whether R’s changed answers; what direction
– The order that questions are answered when more
than one are displayed on the screen
– Response latencies – the time that elapsed between
when the screen loaded on the R’s computer and they
submitted an answer
• Heerwegh (2003) found that the longer the response
time, the greater the probability of changing answers and an
incorrect response
151. Browser Information / Operating System Information
• Programmers use this information to ensure they are
developing the optimal design
• Desktop, laptop, smartphone, tablet, or other device
• Sood (2011) found a correlation between browser type
and survey breakoff & number of missing items
– Completion rates for older browsers were lower
– Using browser type as a proxy for age of device and
possible connection speed
– Older browsers were more likely to display survey
incorrectly; possible explanation for higher drop-out rates
152. JavaScript & Flash
• Helps to understand what the R can see and do in
a survey
• JavaScript adds functionality such as question
validations, auto-calculations, interactive help
– 2% or less of computer users have JavaScript disabled
(Zakas, 2010)
• Flash is used for question types such as drag &
drop or slide-bar questions
– Without Flash installed, R’s may not see the question
154. Questionnaire Navigation Paradata
• Mouse clicks/coordinates
– Captured with JavaScript
– Excessive movements can indicate
• An issue with the question
• Potential for lower quality
• Changing answers
– Can indicate potential confusion with a question
– Paradata can capture answers that were erased
– Changes more frequent for opinion questions than factual questions
Stieger & Reips, 2010
155. Questionnaire Navigation Paradata cont.
• Order of answering
– When multiple questions are displayed on a
screen
– Can indicate how respondents read the questions
• Movement through the questionnaire
(forward and back)
– Unusual patterns can indicate confusion and a possible issue with the questionnaire (e.g., poor question order)
156. Questionnaire Navigation Paradata cont.
• Number of prompts/error messages/data
validation messages
• Quality Index (Haraldsen, 2005)
• Goal is to decrease number of activated errors by
improving the visual design and clarity of the
questions
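The workshop materials describe the index in terms of errors the respondent actually triggered relative to all errors the instrument could raise; a sketch under that assumption (the exact formula is inferred, not Haraldsen's published one):

function qualityIndex(activatedErrors: number, possibleErrors: number): number {
  // possibleErrors = all programmed error messages, prompts, and validations.
  if (possibleErrors === 0) return 1;
  return 1 - activatedErrors / possibleErrors; // closer to 1 = fewer triggered errors
}

console.log(qualityIndex(12, 300).toFixed(2)); // "0.96"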
157. Questionnaire Navigation Paradata cont.
• Clicks on non-question links
– Help, FAQs, etc.
– Indication of when and where Rs use help or other
information built into the survey and displayed as a
link
• Last question answered before dropping out
– Helps to determine if the data collected can be
classified as complete, partial, or breakoff
– Used for response rate computation
– Peytchev (2009) analyzed breakoff by question type
• Open ended increased break-off chances by 2.5x; long
questions by 3x; slider bars by 5x; introductory screens by
2.6x
158. Questionnaire Navigation Paradata cont.
• Time per screen / time latency
– Attitude strength
– Response uncertainty
– Response error
• Examples
– Heerwegh (2003)
• R’s with weaker attitudes take more time in answering
survey questions than R’s with stronger attitudes
– Yan and Tourangeau (2008)
• Higher-educated R’s respond faster than lower-educated R’s
• Younger R’s respond faster than older R’s.
159. Uses of Paradata – Call Centers
• Self-administered (mail or electronic) surveys
• Call transaction history software
– Incoming calls
• Date and time: useful for hiring, staffing, and workflow
decisions
• Purpose of the call
– Content issue: useful for identifying problematic questions or
support information
– Technical issue: useful for identifying usability issues or system
problems
• Call outcome: type of assistance provided
162. Reliability of data collected
• Interviewers can erroneously record housing
unit characteristics, misjudge features about
respondents & fail to record a contact attempt
• Web surveys can fail to load properly, and
client-side paradata fails to be captured
• Recordings of interviewers can be unusable
(e.g., background noise, loose microphones)
Casas-Cordero, 2010; Sinibaldi, 2010; West, 2010
163. Paradata costs
• Data storage – very large files
• Instrument performance
• Development within systems
• Analysis
164. Privacy and Ethical Issues
• IP addresses along with e-mail address or
other information can be used to identify a
respondent
• This information needs to be protected
165. Paradata Activity
• Should the respondent be informed that the
organization is capturing paradata?
• If so, how should that be communicated?
166. Privacy and Ethical Issues cont.
• Singer & Couper asked members of the Dutch
Longitudinal Internet Studies for the Social
Sciences (LISS) panel at the end of the survey
if they could collect paradata – 38.4% agreed
• Asked before the survey – 63.4% agreed
• Evidence that asking permission to use
paradata might make R’s less willing to
participate in a survey
Couper & Singer, 2011
167. Privacy and Ethical Issues cont.
• Reasons for failing to inform R’s about
paradata or get their consent
– Concept of paradata is unfamiliar and difficult for
R’s to grasp
– R’s associate it with the activities of advertisers,
hackers or phishers
– Asking for consent gives it more salience
– Difficult to convey benefits of paradata for the R
171. Background Knowledge
• What does usability mean to you?
• Have you been involved in usability research?
• How is “user experience” different from “usability”?
173. Usability vs. User Experience
• Usability: “the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use.” (ISO 9241-11)
• Usability.gov
• User experience includes emotions, needs and
perceptions.
174. Understanding Users
Whitney’s 5 E’s of Usability | Peter’s User Experience Honeycomb
The 5 Es to Understanding Users (W. Quesenbery):
http://www.wqusability.com/articles/getting-started.html
User Experience Design (P. Morville):
http://semanticstudios.com/publications/semantics/000029.php
178. Measuring the UX
• How does it work for the end user?
• What does the user expect?
• How does it make the user feel?
• What are the user’s story and habits?
• What are the user’s needs?
184. Why is Testing Important?
• Put it in the hands of the users.
• Things may seem straightforward to you but
maybe not to your users.
• You might have overlooked something big!
185. When to test
[Diagram: test with users at every stage — Test Concept → Test Prototype → Test Final Product]
186. What can be tested?
• Existing surveys
• Low-fidelity prototypes
– Paper mockups or mockups on computer
– Basic idea is there but not functionality or
graphical look
• High-fidelity prototypes
– As close as possible to final interface in look and
feel
188. Methods to Understand Users
[Diagram: Method Assessment]
• Ethnographic Observation – understand interactions in natural environment
• Usability Testing – ensure users can use products efficiently & with satisfaction
• Cognitive Testing – ensure content is understood as intended
• User Experience Research – assess emotions, perceptions, and reactions
• Linguistic Analysis – assess interactional motivations and goals
• Focus Groups and In-Depth Interviews – discuss users’ perceptions and reactions
• Surveys – randomly sample the population of interest
189. Focus Groups
• Structured script
• Moderator discusses the survey
with actual or typical users
– Actual usage of survey
– Workflow beyond survey
– Expectations and opinions
– Desire for new features and
functionality
• Benefit of participants
stimulating conversations, but
risk of “group think”
190. In-Depth Interviews
• Structured or unstructured
• Talk one-on-one with
users, in person or
remotely
– Actual usage of the survey
– Workflow beyond survey
– Expectations and opinions
– Desire for new features and
functionality
192. Ethnographic Observations
• Observe users in home, office
or any place that is “real-
world.”
• Observer is embedded in the
user’s culture.
• Allows conversation & activity
to evolve naturally, with
minimum interference.
• Observe settings and artifacts
(other real-world objects).
• Focused on context and
meaning making.
193. Diaries/Journals
• Users are given a journal or a web site to
complete on a regular basis (often daily).
• They record how/when they used the survey,
what they did, and what their perceptions were.
• User-defined data
• Feedback/responses develop and change over time
• Insight into how technology is used “on-the-go.”
• There is often a daily set of structured questions
and/or free-form comments.
196. Usability Testing
•Participants respond to survey items
•Assess interface flow and design
• Understanding
• Confusion
• Expectations
•Ensure intricate skip patterns work as intended
•Can test final product or early prototypes
197. Cognitive Testing
•Participants respond to survey items
•Assess text
• Confusion
• Understanding
• Thought process
•Ensure questions are understood as intended
and resulting data is valid
•Proper formatting is not necessary.
198. Usability vs. Cognitive Testing
Usability Testing Metrics
• Accuracy
– In completing item/survey
– Number/severity of errors
• Efficiency
– Time to complete item/survey
– Path to complete item/survey
• Satisfaction
– Item-based
– Survey-based
• Verbalizations

Cognitive Testing Metrics
• Accuracy
– Of interpretations
• Verbalizations
199. Moderating Techniques
Concurrent Think Aloud (CTA)
• Pros: Understand participants’ thoughts as they occur and as they attempt to work through issues they encounter; elicit real-time feedback and emotional responses
• Cons: Can interfere with usability metrics, such as accuracy and time on task

Retrospective Think Aloud (RTA)
• Pros: Does not interfere with usability metrics
• Cons: Overall session length increases; difficulty in remembering thoughts from up to an hour before = poor data

Concurrent Probing (CP)
• Pros: Understand participants’ thoughts as they attempt to work through a task
• Cons: Interferes with natural thought process and progression that participants would make on their own, if uninterrupted

Retrospective Probing (RP)
• Pros: Does not interfere with usability metrics
• Cons: Difficulty in remembering = poor data
Romano Bergstrom, Moderating Usability Tests:
http://www.usability.gov/articles/2013/04/moderating-usability-tests.html
200. Choosing a Moderating Technique
• Can the participant work completely alone?
• Will you need time on task and accuracy data?
• Are the tasks multi-layered and/or do they require concentration?
• Will you be conducting eye tracking?
201. Tweaking vs. Redesign
Tweaking
• Less work
• Small changes occur quickly.
• Small changes are likely to happen.

Redesign
• Lots of work after much has already been invested
• May break something else
• A lot of people
• A lot of meetings
203. Lab vs. Remote vs. In the Field
Laboratory
• Controlled environment
• All participants have the same experience
• Record and communicate from control room
• Observers watch from control room and provide additional probes (via moderator) in real time
• Incorporate physiological measures (e.g., eye tracking, EDA)

Remote
• Participants in their natural environments (e.g., home, work)
• Use video chat (moderated sessions) or online programs (unmoderated)
• Conduct many sessions quickly
• Recruit participants in many locations (e.g., states, countries)
• No travel costs

In the Field
• Participants tend to be more comfortable in their natural environments
• Recruit hard-to-reach populations (e.g., children, doctors)
• Moderator travels to various locations
• Bring equipment (e.g., eye tracker)
• Natural observations
204. Lab-Based Usability Testing
[Photos of the Fors Marsh Group UX Lab:]
• Participant in the testing room
• Live streaming close-up shot of the participant’s screen
• Large screens to display material during focus groups
• Observation area for clients
• We maneuver the cameras, record, and communicate through microphones and speakers from the control room so we do not interfere
Fors Marsh Group UX Lab
206. Remote Moderated Testing
[Photos:]
• Moderator working from the office
• Observer taking notes, remains unseen from participant
• Participant working on the survey from her home in another state
Fors Marsh Group UX Lab
207. Field Studies
[Photos:]
• Researcher goes to participant’s workplace to conduct the session; she observes and takes notes
• Participant is in her natural environment, completing tasks on a site she normally uses for work
• Participant uses books from her natural environment to complete tasks on the website
209. Obstacles to Testing
• “There is no time.”
– Start early in development process.
– One morning a month with 3 users (Krug)
– 12 people in 3 days (Anderson Riemer)
– 12 people in 2 days (Lebson & Romano Bergstrom)
• “I can’t find representative users.”
– Everyone is important.
– Travel
– Remote testing
• “We don’t have a lab.”
– You can test anywhere.
210. Final Thoughts
• Test across devices.
• “User experience is an ecosystem.”
• Test across demographics.
• Older adults perform differently than younger adults.
• Start early.
Kornacki, 2013, The Long Tail of UX
212. Quality of Mixed Modes
The views expressed on statistical or methodological issues are those of the
presenters and not necessarily those of the U.S. Census Bureau.
214. Mixed Mode Surveys
• Definition: Any combination of survey data
collection methods/modes
• Mixed vs. Multi vs. Multiple – Modes
• Survey organization goal:
– Identify optimal data collection procedure (for the
research question)
– Reduce Total Survey Error
– Stay within time/budget constraints
215. Mixed Mode Designs
• Sequential
– Different modes for different phases of interaction
(initial contact, data collection, follow-up)
– Different modes used in sequence during data collection (e.g., a panel survey that begins in one mode and moves to another)
• Concurrent – different modes implemented at
the same time
de Leeuw & Hox, 2008
217. Mixed Modes – Cost Savings
• Mixed mode designs give an opportunity to
compensate for the weaknesses of each individual
mode in a cost effective way (de Leeuw, 2005)
• Dillman’s 2009 Internet, Mail, and Mixed-Mode Surveys book:
– Organizations often start with lower cost mode
and move to more expensive one
• In the past: start with paper then do CATI or in person
nonresponse follow-up (NRFU)
• Current: start with Internet then paper NRFU
218. Mixed Modes – Cost Savings cont.
• Examples:
• U.S. Current Population Survey (CPS) – panel survey
– Initially does in-person interview and collects a
telephone number
– Subsequent calls made via CATI to reduce cost
• U.S. American Community Survey
– Phase 1: mail
– Phase 2: CATI NRFU
– Phase 3: face-to-face with a subsample of remaining
nonrespondents
219. Mixed Mode - Timeliness
• Collect responses more quickly
• Examples:
– Current Employment Statistics (CES) offers 5
modes (Fax, Web, Touch-tone Data Entry,
Electronic Data Interchange, & CATI) to facilitate
timely monthly reporting
222. Mixed Mode - Coverage Error
• Definition: proportion of the target population
that is not covered by the survey frame and
the difference in the survey statistic between
those covered and not covered
• Telephone penetration
• Landlines vs mobile phones
– Web penetration
Groves, 1989
223. Coverage – Telephone
• 88% of U.S. adults have a cell phone
• Young adults, those with lower education, and
lower household income more likely to use
mobile devices as main source of internet
access
Smith, 2011; Zickuhr & Smith, 2012
224. Coverage - Internet
• Coverage is limited
– No systematic directory of addresses
• 1 in 5 in U.S. do not use the Internet
Zickuhr & Smith, 2012
227. Coverage – Web cont.
• Indications that Internet adoption rates have
leveled off
• Demographics least likely to have Internet
– Older
– Less education
– Lower household income
• Main reason for not going online: not relevant
Pew, 2012
229. Coverage - Web cont.
• R’s reporting via Internet can be different from
those reporting via other modes
– Internet vs. mail (Diment & Garrett-Jones, 2007;
Zhang, 2000)
• R’s cannot be contacted through the Internet
because e-mail addresses lack structure for
generating random samples (Dillman, 2009)
230. Mixed Mode – Nonresponse Error
• Definition: inability to obtain complete
measurements on the survey sample
(Groves, 1998)
– Unit nonresponse - entire sampling unit fails to
respond
– Item nonresponse – R’s fail to respond to all
questions
• Concern is that respondents and non-
respondents may differ on variable of interest
231. Mixed Mode – Nonresponse cont.
• Overall response rates have been declining
• Mixed mode is a strategy used to increase
overall response rates while keeping costs low
• Some R’s have a mode preference
(Miller, 2009)
232. Mixed Mode – Nonresponse cont.
• Some evidence of a reduction in overall
response rates when multiple modes offered
concurrently in population/household surveys
– Examples: Delivery Sequence File Study
(Dillman, 2009); Arbitron Radio Diaries
(Gentry, 2008), American Community Survey
(Griffin et al., 2001), Survey of Doctorate
Recipients (Grigorian & Hoffer, 2008)
• Could assign R’s to modes based on known
preferences
233. Mixed Mode – Measurement Error
• Definition: “observational errors” arising from
the interviewer, instrument, mode of
communication, or respondent (Groves, 1998)
• Providing mixed modes can help reduce the
measurement error associated with collecting
sensitive information
– Example: Interviewer begins face-to-face
interview (CAPI) then lets R continue on the
computer with headphones (ACASI) to answer
sensitive questions
234. Mode Comparison Research
• Meta-analysis of 67 articles on mode comparisons:
– Harder to get mail responses
– Overall non-response rates & item non-response
rates are higher in self-administered
questionnaires, BUT answered items are of high
quality
– Small difference in quality between face-to-face
and telephone (CATI) surveys.
– Face-to-face surveys had slightly lower item non-response rates
de Leeuw, 1992
235. Mode Comparison Research cont.
• Question order and response order effects
less likely in self-administered than telephone
– R’s more likely to choose last option heard in CATI
(recency effect)
– R’s more likely to choose the first option seen in
self-administered (primacy effect)
– Mixed results on item-nonresponse rates in Web
de Leeuw, 1992; 2008
236. Mode Comparison Research cont.
• Some indication that Internet surveys are
more like mail than telephone surveys
– Visual presentation vs auditory
• Conflicting evidence on item non-response (some show higher item non-response on Internet vs. mail, while others show no difference)
• Some evidence of better quality data
– Fewer post-data collection edits needed for electronic vs. mail responses
Sweet & Ramos, 1995; Griffin et al., 2001
237. Disadvantages of Mixed Mode
• Mode Effects
– Concerns for measurement error due to the mode
• R’s providing different answers to the same questions
displayed in different modes
– Different contact/cooperation rates because of
different strategies used to contact R’s
238. Disadvantages of Mixed Mode
• Decrease in overall response rates
– Why: effects of offering a mail/web mode choice
– What: meta-analysis of 16 studies that compared mixed-mode surveys with mail and web options
– Results: empirical evidence that offering mail and Web concurrently resulted in a significant reduction in response rates
Medway & Fulton, 2012
239. Response Rates in Mixed Mode Surveys
• Why is this happening?
– Potential Hypothesis #1: R’s dissuaded from
responding because they have to make a choice
• Offering multiple modes increases burden (Dhar, 1997)
• While judging pros/cons of each mode, neither appear
attractive (Schwartz, 2004)
– Potential Hypothesis #2: R’s choose Web, but never
actually do it
• If R’s receive invitation in mail, there is a break in their
response process (Griffin, et. al, 2001)
– Potential Hypothesis #3: R’s that choose Web may get
frustrated with the instrument and abandon the
whole process (Couper, 2000)
240. Overall Goals
• Find the optimal mix given the research
questions and population of interest
• Other factors to consider:
– Reducing Total Survey Error (TSE)
– Budget
– Time
– Ethics and/or privacy issues
Biemer & Lyberg, 2003
242. Technique for Increasing Response
Rates to Web in Multi-Mode Surveys
• “Pushing” R’s to the web
– Sending R’s an invitation to report via Web
– No paper questionnaire in the initial mailing
– Invitation contains information for obtaining the
alternative version (typically paper)
– Paper versions are mailed out during follow-up to capture responses from those that do not have web access or do not want to respond via web
– “Responding to Mode in Hand” Principle
243. “Pushing” Examples
• Example 1: Lewiston-Clarkston Quality of Life
Survey
• Example 2: 2007 Stockholm County Council
Public Health Survey
• Example 3: American Community Survey
• Example 4: 2011 Economic Census Re-file
Survey
244. Pushing Example 1 –
Lewiston-Clarkston Quality of Life Survey
• Goals: increase web response rates in a
paper/web mixed-mode survey and identify
mode preferences
• Method:
– November 2007 – January 2008
– Random sample of 1,800 residential addresses
– Four treatment groups
– To assess mode preference, this question was at the
end of the survey:
• “If you could choose how to answer surveys like this, which
one of the following ways of answering would you prefer?”
• Answer options: web or mail or telephone
Miller, O’Neill, Dillman, 2009
245. Pushing Example 1 – cont.
• Group A: Mail preference with web option
– Materials suggested mail was preferred but web
was acceptable
• Group B: Mail Preference
– Web option not mentioned until first follow-up
• Group C: Web Preference
– Mail option not mentioned until first follow-up
• Group D: Equal Preference
247. Pushing Example 1 – cont.
“If you could choose how to answer surveys like this, which one
of the following ways of answering would you prefer?”
251. Pushing Example 2 – 2007 Stockholm
County Council Public Health Survey
• Goal: increase web response rates in a
paper/web mixed-mode survey
• Method:
– Sample of 50,000 (62% overall response rate)
– 4 treatments that varied in “web intensity”
– Plus a “standard” option – paper and web login
data
Holmberg, Lorenc, Werner, 2008
252. Pushing Example 2 – Cont.
• Overall response rates
S = Standard; A1 = very paper “intense”; A2 = paper “intense”; A3 = web “intense”; A4 = very web “intense”
253. Pushing Example 2 – Cont.
• Web responses
S = Standard; A1 = very paper “intense”; A2 = paper “intense”; A3 = web “intense”; A4 = very web “intense”
254. Pushing Example 3 – American
Community Survey
• Goals:
– Increase web response rates in a paper/web
mixed-mode survey
– Identify ideal timing for non-response follow-up
– Evaluate advertisement of web choice
Tancreto et. al., 2012
255. Pushing Example 3 – Cont.
• Method
– Push: 3 versus 2 weeks until paper questionnaire
– Choice: Prominent and Subtle
– Mail only (control)
– Tested among segments of US population
• Targeted
• Not Targeted
256. Response Rates by Mode in Targeted
Areas
[Stacked bar chart: overall response rates by mode (Internet vs. Mail) for Ctrl (Mail only), Prominent Choice, Subtle Choice, Push (3 weeks), and Push (2 weeks) in targeted areas]
257. Response Rates by Mode in Not
Targeted Areas
[Stacked bar chart: overall response rates by mode (Internet vs. Mail) for Ctrl (Mail only), Prominent Choice, Subtle Choice, Push (3 weeks), and Push (2 weeks) in not targeted areas]
258. Example 4: Economic Census Refile
• Goal: to increase Internet response rates in a
paper/Internet establishment survey during
non-response follow-up
• Method: 29,000 delinquent respondents were
split between two NRFU mailings
– Letter-only mailing mentioning Internet option
– Letter and paper form mailing
Marquette, 2012
261. Why Do Respondents Choose Their Mode?
• Concern about “mode paralysis”
– When two options are offered, R’s must choose between tradeoffs
– This choice makes each option less appealing
– Offering a choice between Web and mail may thus discourage response
Miller and Dillman, 2011
262. Mode Choice
• American Community Survey – Attitudes and
Behavior Study
• Goals:
– Measure why respondents chose the Internet or
paper mode during the American Community
Survey Internet Test
– Why there was nonresponse and if it was linked to
the multi-mode offer
Nichols, 2012
263. Mode Choice – cont.
• CATI questionnaire was developed in
consultation with survey methodologists
• Areas of interest included:
– Salience of the mailing materials and messages
– Knowledge of the mode choice
– Consideration of reporting by Internet
– Mode preference
265. Mode Choice – cont.
• Results
– Choice/Push Internet respondents opted for perceived
benefits – easy, convenient, fast
– Push R’s noted that not having the paper form
motivated them to use the Internet to report
– Push R’s that reported via mail did so because they did not have Internet access or had computer problems
– The placement of the message about the Internet
option was reasonable to R’s
– R’s often recalled the letter that accompanied the
mailing package mentioning the mode choice
266. Mode Choice – cont.
• Results cont.
– Several nonrespondents cited not knowing that a
paper option was available as a reason for not
reporting
– Very few nonrespondents attempted to access the
online form
– Salience of the mailing package and being busy
were main reasons for nonresponse
– ABS study did NOT find “mode paralysis”
268. Amy Anderson Riemer
US Census Bureau
amy.e.anderson.riemer@census.gov
Jennifer Romano Bergstrom
Fors Marsh Group
jbergstrom@forsmarshgroup.com
Editor’s Notes
Explain lab environment with eye tracker
Metadata includes any contextual information that can be relevant to interpret the dataset. Auxiliary data may not be available at an R level; it may only be available at an aggregate level. This type of data can be used for nonresponse adjustment. It can be used as a benchmark in order to assess the quality of the reported data. It can also be used prior to data collection for statistical adjustments for sampling purposes.
Dr. Couper is a research professor at the survey research center at the University of Michigan. His most recent work is focused on web survey design.
This is the TSE framework developed by Groves. The circles in the middle represent all of the different sources of error that can occur during the development and administration of a survey. Later in the day we will actually come back to a few of these sources of error to discuss how the different modes of electronic data collection that we have been discussing can effect those errors.
Kreuter has taken Groves’ framework and identified where in the survey process paradata can provide information.
In a responsive design there are process indicators that allow for real-time monitoring, where survey managers can make decisions to adjust their resources and achieve better outcomes. Managers can use information from early results to help cut costs and achieve higher response rates. Managers can also use information learned from early paradata to augment methods for trying to reach uncooperative respondents. These nonrespondents could be different from your respondents, so by successfully recruiting them, you can possibly reduce nonresponse bias.
Researchers are using information learned from paradata about nonrespondents to make nonresponse adjustments.
In order to collect information about each contact, contact forms are completed. The Census Bureau uses the CHI to do this.
Example of options given: advance letter given, scheduled an appointment, left a note/appointment card, offered incentive, checked with neighbors, contacted other family members, contacted property manager, none.
Recording the date/time of prior contact allows schedulers to vary contact with the hope of increasing the probability of a successful contact and saving costs.
Answering drop-outs = answer questions but quit before completing the survey. Lurkers = view all of the questions but don’t answer any. Lurking drop-outs = view some of the Qs without responding and quit before completing. Item non-responding drop-outs = view some questions, answer some but not all, break off before completing the survey.
It is necessary to decide ahead of time which actions are meaningful to record, based on the interests of the researchers/managers. These are examples of the meaningful actions that Heerwegh decided to collect.
It can even detect when users respond through devices such as book readers or game consoles.
An area of growing concern is the use of tablets to answer surveys and the inability of some to use Flash. Paradata can allow survey managers to monitor what devices their population (or similar populations, if it is a one-time survey) are using and choose to, for example, redesign questions so that Flash isn’t necessary if tablet usage increases.
Paradata will show you all of the answers that a respondent entered into a field. Excessive movement can indicate an issue with the question.
This navigational paradata can also be captured on CATI instruments to assess the design of those questionnaires as well.
The quality of online questionnaires is now evaluated using this formula. The number of all possible errors is the sum of all potential error messages, potential prompts, and validation messages programmed into the instrument. The number of activated errors is the number that were encountered by the R during the survey.
Help links… if you expect/want R’s to see your help information more frequently, then you may need to list it on the question screen somehow rather than leaving it as a link.
Note that some call centers may attempt to route the caller using menus to different staff based on the issue. For example, there may be a separate number or prompt that will take R’s with technical issues to support staff trained to deal with those issues.
Interviewers can have variability in the way they interpret different things about the R or the household that is asked of them in the Contact History Interview. Some studies have shown high missing data for interviewer-provided paradata (CHI information). If the paradata needed is provided by interviewers, it can be incomplete or entirely skipped, and therefore not as reliable.
Paradata can create very large files that need to be saved on servers. Client-side paradata that collects such things as mouse movements and keystrokes is known to be incredibly large. This is partly the reason why researchers are encouraged to think ahead of time about the types of “meaningful” information that are needed. Capturing large amounts of paradata in electronic instruments could slow down the performance of the instrument for R’s. Resources must be dedicated to develop the tools to collect the paradata within the system (web, CATI, CAPI). Resources must be dedicated to monitor the paradata in a responsive design and/or to analyze the data post hoc for nonresponse adjustment or changes to the instrument. There can often be massive amounts of paradata to work with and sift through.
Mixed/Multi/Multiple are used interchangeably. There may be other practical or regulatory constraints that also affect the data collection design.
Concurrent --
Because of the high cost of in-person interviews, survey organizations are using them to either establish their initial contact and ensure lower coverage error (such as in the CPS example) OR they are using it as a NRFU option after other NRFU options have been exhausted.
There are some examples where organizations are offering mixed modes to save time. TDE is when R’s can just enter data directly over the phone after being prompted by an automatic recording. EDI is the transmission of data between two entities.
Households with no telephones. Cell phone only households. No directory of cell phone numbers.
Based on demographic characteristics. This limits the usefulness for data collection by Internet only. People share e-mail addresses or have multiple ones. There is no equivalent to the RDD algorithm used to select phone numbers for generating a random sample. Researchers have come to rely more on self-selected panels (which aren’t necessarily representative of the general population).
This phenomenon is not necessarily seen in establishment surveys.
Some of this information is based on de Leeuw’s meta-analysis of 67 articles/papers on mode comparisons.
de Leeuw’s International Handbook of Survey Methodology – mode effects do exist but tend to be small in well-conducted surveys. The biggest differences are seen with sensitive questions. Interviewer modes produced more socially desirable and less consistent answers, but also more detailed answers to open-ended questions.
Two current Ph.D. students at JPSM. Some articles were saying offering both options raised response rates, and some were saying that it had an overall decrease in response rates.
Because there weren’t a large number of studies available, they couldn’t identify any study characteristics that would explain why this was going on. Hyp 2: in the process of sorting mail and organizing themselves to go on the Web, R’s could have forgotten or misplaced the survey invitation. Hyp 3: although in the studies they looked at, approximately 80% of R’s that started the Web instrument completed it.
Overall response rate from the survey was 62%. The mail preference group had the highest response rate followed by the equal preference group. Dillman felt that although the response rate for the web preference was lower than the other groups, it was a respectable web response rate and he was encouraged by the fact that 41% of respondents could be ‘pushed’ into the web option. This suggested to Dillman that withholding paper could be a tool for surveyors to drive more respondents onto the web.
The telephone was not an option to respondents in this survey, so further analysis of this low percentage was not given.
Dillman also looked at the relationship between mode preferences and actual mode of response completed. There is a strong relationship between the mode completed and the mode preferred. This was especially interesting because there were treatments in which respondents were ‘pushed’ to respond in one mode versus the other. R’s appeared to prefer the mode they were pushed into, suggesting that ‘pushing’ could work.
Dillman then isolated the Web Preference group to see how strongly they preferred reporting via Web. He found a strong relationship between the mode that the respondent reported and their preference for that mode. This suggested that by pushing respondents to a certain mode, not only will they report in that mode, but they will like it!
It may seem exciting that you can push people to the web, but through additional chi-square analysis, Dillman looked further at different characteristics of respondents to see what limitations there would be to pushing respondents to the web. Because of coverage issues associated with the web, certain people will have no other choice but to respond via mail. Mail respondents are older than web respondents and have lower levels of education and lower income. Mail respondents are also more likely to be female and married. These results show that only certain individuals may be ‘pushed’ successfully to the web. If you are doing a survey without an alternative mode to the web, you could introduce nonresponse bias.
There is a condition that doesn’t mention web as an option until much later in the mailouts (Day 23). There is an alternative option that doesn’t mail out a paper until later in the mailouts (Day 23). The other two options are a mix of varying “web intensities”. So there was one that started out with a strong push to the web without mentioning paper. And one that had a strong push towards paper and didn’t mention web until the last mail out. The other two in the middle mentioned web and paper at different times in the middle mailings.
Two differences are statistically significant: S and A2 and S and A3.
The results showed promise for promoting the web as an alterative strategy earlier. In the A4 sample, R’s didn’t see a paper form until the third contact. The authors also noted that the A2-A4 strategies could also lead to a cost reduction of between 12-20% compared to the cost of the standard strategy.
The current ACS has three modes: paper, CATI, and CAPI. In 2011, the Census Bureau conducted two Internet tests to evaluate the feasibility of offering a web mode and to identify the best way to present that mode to promote self-response. These are results from the first of the two tests.
Along with the control panel there were Push and Choice panels. Each one tested two different strategies. For the Push panel, they tested how quickly the follow-up (with paper form) should occur. The choice panel tested advertising of the web option in a prominent place or a subtle/inconspicuous place. The Targeted group consisted of areas that contained households that used the Internet at a higher rate. The remaining tracts were in the Not Targeted group.
The accelerated Push strategy (on the right) was seen as very successful and the first test where the Census Bureau saw a push strategy perform well in a household survey. It performed better than the Prominent Choice strategy or the control.
The accelerated Push strategy continued to show a strong presence in the Not Targeted areas, which was an unexpected finding. Response rates were not significantly different between the control and the choice strategies. Similar to the targeted areas, the push strategy was successful in having most of its responses sent via Internet.
This testing of a push strategy on just the form follow-up showed that the letter-only strategy had a higher percentage of web responses than the form-and-letter group. There was no effect on overall response rate between the two groups. Similar finding to other studies: the mode offered is the mode preferred.
This table represents the various calls that were made to the different treatments offered in the ACS Internet Test. There were five different treatments (control, prominent choice, not prominent choice, push regular, push accelerated). Then among those there were respondents that replied via internet, mail, and were overall nonrespondents. This table shows the number of calls that were attempted and analyzed.
The letter was the primary way of communicating the mode choice to R’s. We are always working on the materials that go into the mailing packages to encourage R’s to respond to the survey and to respond in the mode that we want them to if we give them a choice. So this was hopeful information, because we always worry about whether R’s remember these pieces of information. But just because they remember it doesn’t mean that they read it.