On December 7, 2016, Mark Menig, Chief Executive Officer of TrueSample, and Lisa Wilding-Brown, Chief Research Officer of Innovate MR, explored various strategies to help research professionals navigate the challenging landscape of online sample quality. The webinar addressed:
• A brief overview of quality through the years. Where have we been and where are we going?
• What are current examples of online sample fraud (e.g., bots, hijackers, foreign click shops, etc.)?
• What are the challenges and costs associated with today’s online fraud? How does online fraud impact data quality, specifically B2B research?
• What technical and behavioral strategies help to protect online research?
Webinar: Everyone cares about sample quality but not everyone values it!
1. EVERYONE CARES ABOUT SAMPLE QUALITY, BUT NOT EVERYONE VALUES IT!
A review of responsibilities and techniques you can implement to protect
your online research and beyond
3. Agenda
■ Quality through the years (brief overview of where we’ve been and where we are going)
■ Current landscape (e.g., bots, hijackers, foreign click shops in China, etc.)
■ Challenges & costs associated with today’s online fraud and how it impacts data quality
■ Implementing an effective solution (multi-layered approach)
– Technical approaches: Digital fingerprinting (when and where); Respondent validation;
algorithmic solutions over a member’s lifetime, other 3rd-party techniques, etc.
– Behavioral approaches: Knowledge question design (red-herrings); Pre-survey screening; smart
survey design (do’s and don’ts)
■ The Path Forward: Responsibility, Accountability, & Collaboration
4. Care About vs. Value
■ When you care about something, you simply have at least minimal regard for someone
or something.
■ When you VALUE something, you consider it important and worthwhile. ...As a verb,
it means "holding something in high regard," (like "I value our friendship") but it can
also mean "determine how much something is WORTH," like a prize valued at $200.
6. 2000 2006 2008 2012 2016 2020
■ 2000: The industry rapidly becomes enamored with the speed and cost savings of moving to online
■ 2006: P&G speaks out about online data quality issues at the Client Summit, sparking industry-wide discourse
■ 2008: Industry associations launch major initiatives to investigate and restore online research quality
■ 2012: Rapid evolution and diversification of devices; engaging respondents migrates from a proximity-fixed experience to a portable experience
■ 2016: Fraud continues to morph and evolve with the emergence of new threats
■ 2020: The only constant is change! Continual innovation is required in order to stay ahead, recognizing the battle is never over
7. Current Landscape
Dr. Liz Nelson, co-founder of
TNS, advisor to the board of Fly
Research and a fellow of the
Market Research Society, talks
about how the need for speed is
affecting the quality of
research.
Research Live – November 24, 2016
“I would say immediately that the emphasis on speed is what’s happening now. Clients demand
immediate results with the survey in field on Friday, and 2000 results the next day. I think the sad bit is
that quality suffers”
8. Current Landscape
Recent advances in big data and artificial intelligence are
now making it possible to teach a machine to understand
and speak to humans.
It's very difficult to simply look at the data provided by
some of the more sophisticated bots and identify what to
remove, because it's all gray goo inside, just like a real
brain, and may be indistinguishable from real data.
Need a real world example? Take out your iPhone and ask
Siri a question.
Forums like the one to the left abound online with users
looking for and sharing information about how to utilize
tools to create/mimic bots and automate the process of
filling in surveys.
9. Current Landscape
“Here is survey bot attempting to
complete a survey with no given
information. The creator ran this on 6
surveys a day for two weeks (fully
automated of course) and got the total
sum of £14.95p, with no user
interaction what so ever!”
That was 10 questions completed in
under 17 seconds in case you lost
count!
11. Current Landscape
The Tor software protects users by bouncing their
communications around a distributed network of relays run
by volunteers all around the world.
The Tor Browser gives access to Tor on Windows, Mac OS X,
or Linux without needing to install any software.
Survey click shops are popping up around the globe.
They comprise many “unique” devices in a single location,
used by a group of fraudsters to game surveys and
generate incentives.
12. Current Landscape
Device Emulators. In computing, an emulator is hardware
or software that enables one computer system (called the
host) to behave like another computer system (called the
guest).
This threat will only get worse as computers and global
computer networks continue to advance and emulator
developers grow more skilled in their work.
Datacenters, VPNs, anonymous proxies, etc. are favorite
tools for fraudsters because they allow them to spoof their
device so that it appears to be coming from a different country,
case by case, as needed to meet the requirements of
a given survey.
13. Challenges & Costs
Satisfaction with sample providers (share of respondents at each satisfaction level)

Attribute                  Not at all  Slightly  Moderately  Very  Completely  Top 2 box
Timeliness of fielding         2%        11%        33%      44%       9%         54%
Purchase process               2%         8%        37%      44%       9%         53%
Ease of accessing panel        2%        12%        36%      42%       8%         50%
Customer service               3%        10%        39%      40%       8%         49%
Quantity of respondents        5%        17%        41%      31%       5%         36%
Cost of panel                  5%        15%        46%      30%       5%         34%
Quality of respondents         7%        26%        42%      23%       3%         26%
2016 GRIT Report
14. Challenges & Costs
“Technology, or lack thereof, is the prime culprit for sample
getting worse: from bots, to survey design, to mobile enabled
surveys, all these are driving sample quality down. Many folks
have a strong sense that there are only professional survey
takers and fraudulent bots that are taking all the surveys
because there is a race to the bottom in terms of cost.”
“Sample providers should only actively communicate on
issue of representativeness, not quality or design.”
[Chart: Sample Quality by Buyers vs. Suppliers. Rows: Insights Buyer or Client; Insights Provider or Supplier. Legend: Better, Worse, Stay the Same, Not Sure. Axis: 0% to 100%.]
2016 GRIT Report
15. Implementing an Effective Solution
Technical Approaches
[Chart: Most Adopted Fraud Detection Tools. Categories: Identity/Address Validation; IP Geo Location Information; Device Fingerprinting. Legend: Currently Using, Planning New Implementation. Bar labels as extracted: 67%, 51%, 32%, 11%, 13%, 17%. Axis: 0 to 90.]
2016 Fraud Report
16. Implementing an Effective Solution
Technical Approaches
DEVICE FINGERPRINT
A device fingerprint or machine fingerprint or browser
fingerprint is information collected about a remote
computing device for the purpose of identification.
Fingerprints can be used to fully or partially identify
individual users or devices even when cookies are turned off.
Motivation for the device fingerprint concept stems from
the forensic value of human fingerprints. In the "ideal" case,
all web client machines would have a different fingerprint
value (diversity), and that value would never change
(stability). Under those assumptions, it would be possible to
uniquely distinguish between all machines on a network,
without the explicit consent of the users themselves.
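As a rough illustration of the diversity and stability ideas above, the following minimal sketch hashes a set of device attributes into a single fingerprint value. The attribute names and sample values are hypothetical and are not taken from the webinar; production fingerprinting tools collect many more signals and handle attributes that drift over time.

```python
import hashlib
import json


def device_fingerprint(attributes: dict) -> str:
    """Return a hex digest derived from a dict of device attributes."""
    # Sort keys so the same attributes always serialize to the same string,
    # regardless of the order in which they were collected (stability).
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attributes for illustration only.
device_a = {
    "user_agent": "Mozilla/5.0 (Windows NT 10.0)",
    "screen": "1920x1080",
    "timezone": "America/Chicago",
    "language": "en-US",
}
# Same device, attributes collected in a different order.
device_a_reordered = dict(reversed(list(device_a.items())))
# A device that differs in one attribute (diversity).
device_b = dict(device_a, screen="1366x768")

fp_a = device_fingerprint(device_a)
fp_a_again = device_fingerprint(device_a_reordered)
fp_b = device_fingerprint(device_b)
```

Here fp_a and fp_a_again are identical, while fp_b differs, so a repeat visit from the same device can be recognized without a cookie.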
17. Implementing an Effective Solution
Technical Approaches
IDENTITY VALIDATION
Identity validation solutions allow for the evaluation of
names, postal addresses, and/or email addresses against
third-party consumer databases to determine if they're
legitimate and correspond with one another. They provide
confidence in knowing that a participant is who they say they
are and lives where they say they live. These solutions also allow for the
removal of duplicates within and across sources.
Layering in a Geo-Location Distance Check adds additional
fraud detection by calculating the distance (in miles across the
surface of a sphere) between the latitude/longitude
coordinates of the postal address and the latitude/longitude
coordinates that the user’s IP address resolves to.
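The distance across the surface of a sphere can be computed with the haversine formula. The sketch below is one way to do it, not necessarily how any particular vendor implements the check; the 100-mile threshold and the function names are assumptions chosen for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, treating the Earth as a sphere


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def passes_distance_check(address_coords, ip_coords, max_miles=100.0):
    """True if the postal address and the IP location are within max_miles.

    max_miles is a hypothetical threshold; a real deployment would tune it.
    """
    return haversine_miles(*address_coords, *ip_coords) <= max_miles


# Example: postal address geocoded to New York, IP resolving to Los Angeles.
new_york = (40.7128, -74.0060)
los_angeles = (34.0522, -118.2437)
distance = haversine_miles(*new_york, *los_angeles)
```

A participant whose address is in New York but whose IP address resolves to Los Angeles would fail this check and could be routed to further review rather than rejected outright.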
18. Implementing an Effective Solution
Technical Approaches
FRAUD DETECTION
At the device level, there are key markers that can be identified
to indicate the risk of first-time user fraud:
Language Check
Geo-Browser Language Check
Geo-OS Language Check
Geo-Time Zone Check
Geo-Off Hours Check
Geo-Country Check
Multi-Device Check
Bot Check
Anonymous Check
Blacklist Check
Browser Status Check
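The slide names these checks without defining them, so the sketch below rests on an assumed reading: a "Geo-" check compares what the IP address says about location with what the device itself reports. The reference table, field names, and the three checks shown are hypothetical and cover only a subset of the list.

```python
from dataclasses import dataclass


@dataclass
class DeviceSignals:
    ip_country: str          # ISO country code the IP address resolves to
    browser_language: str    # e.g. "en-US", as reported by the browser
    utc_offset_minutes: int  # offset reported by the device clock


# Hypothetical reference data for a study targeting the US and Germany:
# plausible browser languages and UTC offsets (in minutes) per country.
TARGET_COUNTRIES = {
    "US": {"languages": {"en", "es"}, "offsets": range(-600, -239)},
    "DE": {"languages": {"de", "en"}, "offsets": range(60, 121)},
}


def failed_checks(signals: DeviceSignals) -> list:
    """Return the names of the checks this device fails (assumed semantics)."""
    expected = TARGET_COUNTRIES.get(signals.ip_country)
    if expected is None:
        # IP resolves to a country outside the study's target list.
        return ["Geo-Country Check"]
    failures = []
    primary_language = signals.browser_language.split("-")[0].lower()
    if primary_language not in expected["languages"]:
        failures.append("Geo-Browser Language Check")
    if signals.utc_offset_minutes not in expected["offsets"]:
        failures.append("Geo-Time Zone Check")
    return failures
```

For example, a device with a US IP address, a "zh-CN" browser language, and a UTC+8 clock would fail both the language and time zone checks. Any single failure is only a risk marker, which fits the layered approach the deck recommends.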
19. Implementing an Effective Solution
Technical Approaches
SURVEY VALIDATION
A respondent can be flagged as unengaged in the survey if he or
she speeds on at least X% of the pages they saw in the survey. The
norms and standard deviations of the times for each page should
be calculated in real-time as the page submissions from the
respondents are received by the survey platform.
It can also be useful to consider the response patterns that are
being submitted as another key indicator. Respondents who
provide undesirable response patterns on more than X% of pages
can also be classified as unengaged for the survey.
Good Response Validation tools leverage real-time Bayesian
statistical models/analysis to determine engagement.
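The speeding rule above can be sketched with per-page running statistics that update as each page submission arrives. This example uses Welford's online algorithm and a simple cutoff of one standard deviation below the page mean; it is a plain frequentist stand-in, not the Bayesian modeling the slide mentions. The cutoff, the 50% value for X, and all names are assumptions.

```python
import math
from collections import defaultdict


class RunningStats:
    """Online mean and sample standard deviation (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def std(self) -> float:
        return math.sqrt(self._m2 / (self.n - 1)) if self.n > 1 else 0.0


# One RunningStats per survey page, updated as submissions are received.
page_stats = defaultdict(RunningStats)


def record_page_time(page_id: str, seconds: float) -> None:
    page_stats[page_id].update(seconds)


def is_unengaged(page_times: dict, z_cutoff: float = 1.0,
                 speeding_share: float = 0.5) -> bool:
    """Flag a respondent who speeds on at least speeding_share of pages seen.

    page_times maps page_id to seconds spent by one respondent. A page counts
    as "sped" when its time is more than z_cutoff standard deviations below
    the current norm for that page (hypothetical definition).
    """
    if not page_times:
        return False
    sped = 0
    for page_id, seconds in page_times.items():
        stats = page_stats.get(page_id)
        if stats and stats.n > 1 and seconds < stats.mean - z_cutoff * stats.std:
            sped += 1
    return sped / len(page_times) >= speeding_share


# Build norms from earlier respondents, then evaluate two new ones.
for t in (20, 22, 18, 21, 19):
    record_page_time("p1", t)
for t in (30, 28, 32, 31, 29):
    record_page_time("p2", t)

speeder_flag = is_unengaged({"p1": 3, "p2": 4})
typical_flag = is_unengaged({"p1": 20, "p2": 30})
```

With these norms the first respondent is flagged and the second is not. The same structure could count pages with undesirable response patterns instead of sped pages.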
20. Implementing an Effective Solution
Behavioral Approaches
There are three channels to address
in order to ensure superior data
quality in your study:
Sample Design & Management
Survey Design
Member Management
21. Implementing an Effective Solution
Behavioral Approaches – Sample Design & Management
Vendor selection is key. Understand how your vendor’s sample is sourced,
managed and incentivized.
Ask the tough questions! How is sample outgo balanced? What measures are
implemented to ensure the highest quality sample is provided?
Demographic balance
Activity & tenure balance
Survey field time
Invitation/introductory language
Competing survey inventory
Survey frequency & variation
Routing/project prioritization
22. Implementing an Effective Solution
Behavioral Approaches – Survey Design
Question design is key!
Use non-leading wording
Provide an out for all respondents
Use open-ends sparingly
Avoid yes/no format
23. Implementing an Effective Solution
Behavioral Approaches – Survey Design
Avoid burdensome question formats (e.g., extensive grids and lists longer
than 10-15 attributes).
Strive to keep your survey short and simple.
Clear, concise wording – write for a 5th grader!
Avoid multiple questions on one screen – visual clutter will result in
respondent fatigue.
Mobile-compatible and mobile-friendly are two different things!
24. Implementing an Effective Solution
Behavioral Approaches – Member Management
Trap Questions
Honey Pots
Algorithmic solutions
Tracking activity over time (LOI completions & invalids)
Profiling & third-party data validation sources
Demo consistency checks
Quality exists across a wide spectrum; lifetime
management is critical
25. Implementing an Effective Solution
Behavioral Approaches – Trap Questions Do’s & Don’ts
Not all trap questions are effective! Trap questions shouldn’t be too simple or too complex.
Types:
Instructional (e.g., Select the image which shows a book.)
Skill-based (e.g., 2+2 = ?)
Honesty-based (e.g., What brand(s) are you aware of? What activities have you done in the
last 12 months?)
Implement multiple measures to assess quality; never rely on a single question within the
survey to dictate quality.
Be mindful of question position within the survey (e.g., adding your trap question at minute 45
will yield false positives that arguably are a result of a lengthy survey, NOT a poorly behaving
respondent).
26. Implementing an Effective Solution
Applying Our Learnings to B2B Research
Know thy sample source!
Always use multiple knowledge-based trap questions (e.g.,
looking for experts in cloud computing? Test their knowledge
on various storage products vs. the color of the sky).
Implement multiple measures to assess quality (inclusive of
technical and behavioral approaches).
When possible, leverage 3rd party data sources to validate
member data.
Never become complacent – your research will always be a hot
target for fraud. Stay protected!
27. The Path Forward: Responsibility,
Accountability, & Collaboration
Every company up and down the supply chain involved in the execution of online
research has a role/responsibility as it relates to data quality/fraud detection. What
you are responsible for depends on which part of the research process you have
operational control over (i.e., you can’t just push responsibility down to the
operational layer below you; everyone has to do their part, or the whole system
suffers).
There is no silver bullet solution. Effective solutions require a layered
technique/approach that incorporates redundancies and failsafe mechanisms.
It’s not enough to simply care about data quality and fraud detection; you must
VALUE it!