User Interfaces and Algorithms
for Fighting Phishing
Jason I. Hong
Carnegie Mellon University
Everyday Privacy and Security Problem
This entire process is known as phishing
Fast Facts on Phishing
• Estimated 3.5 million people have fallen for phishing
• Estimated to cost $1-2 billion a year (and growing)
• 9255 unique phishing sites reported in June 2006
• Easier (and safer) to phish than rob a bank
Supporting Trust Decisions
• Goal: help people make better trust decisions
– Focus on anti-phishing
• Large multi-disciplinary team project at CMU
– Supported by NSF, ARO, CMU CyLab
– Six faculty, five PhD students, undergrads, staff
– Computer science, human-computer interaction,
public policy, social and decision sciences, CERT
Our Multi-Pronged Approach
• Human side
– Interviews to understand decision-making
– Embedded training
– Anti-phishing game
• Computer side
– Email anti-phishing filter
– Automated testbed for anti-phishing toolbars
– Our anti-phishing toolbar
Automate where possible, support where necessary
What do users know
about phishing?
Interview Study
• Interviewed 40 Internet users, including 35 non-experts
• “Mental models” interviews included email role play
and open ended questions
• Interviews recorded and coded
J. Downs, M. Holbrook, and L. Cranor. Decision Strategies
and Susceptibility to Phishing. In Proceedings of the
2006 Symposium On Usable Privacy and Security, 12-14
July 2006, Pittsburgh, PA.
Little Knowledge of Phishing
• Only about half knew meaning of the term “phishing”
“Something to do with the band Phish, I take it.”
Minimal Knowledge of Lock Icon
“I think that it means secured, it symbolizes
some kind of security, somehow.”
• 85% of participants were aware of lock icon
• Only 40% of those knew that it was supposed
to be in the browser chrome
• Only 35% had noticed https, and many of those did
not know what it meant
Little Attention Paid to URLs
• Only 55% of participants said they had ever noticed
an unexpected or strange-looking URL
• Most did not consider them to be suspicious
Some Knowledge of Scams
• 55% of participants reported being cautious when
email asks for sensitive financial info
– But very few reported being suspicious of email asking
for passwords
• Knowledge of financial phish reduced likelihood of
falling for these scams
– But did not transfer to other scams, such as
amazon.com password phish
Naive Evaluation Strategies
• The most frequent strategies don’t help much in
identifying phish
– This email appears to be for me
– It’s normal to hear from companies you do business with
– Reputable companies will send emails
“I will probably give them the information that they asked for.
And I would assume that I had already given them that
information at some point so I will feel comfortable giving it to
them again.”
Other Findings
• Web security pop-ups are confusing
“Yeah, like the certificate has expired. I don’t actually
know what that means.”
• Don’t know what encryption means
• Summary
– People generally not good at identifying scams they
haven’t specifically seen before
– People don’t use good strategies to protect themselves
Can we train people not to
fall for phishing?
Web Site Training Study
• Laboratory study of 28 non-expert computer users
• Two conditions, both asked to evaluate 20 web sites
– Control group evaluated 10 web sites, took 15 minute break
to read email or play solitaire, evaluated 10 more web sites
– Experimental group same as above, but spent 15 minute
break reading web-based training materials
• Experimental group performed significantly better
identifying phish after training
– Less reliance on “professional-looking” designs
– Looking at and understanding URLs
– Web site asks for too much information
People can learn from web-based training materials,
if only we could get them to read them!
How Do We Get People Trained?
• Most people don’t proactively look for training
materials on the web
• Many companies send “security notice” emails
to their employees and/or customers
• But these tend to be ignored
– Too much to read
– People don’t consider them relevant
– People think they already know how to protect themselves
Embedded Training
• Can we “train” people during their normal use of
email to avoid phishing attacks?
– Periodically, people get sent a training email
– Training email looks like a phishing attack
– If person falls for it, intervention warns and highlights
what cues to look for in succinct and engaging format
P. Kumaraguru, Y. Rhee, A. Acquisti, L. Cranor, J. Hong, and E.
Nunge. Protecting People from Phishing: The Design and
Evaluation of an Embedded Training Email System. CyLab
Technical Report. CMU-CyLab-06-017, 2006.
http://www.cylab.cmu.edu/default.aspx?id=2253
[to be presented at CHI 2007]
Diagram Intervention
• Explains why they are seeing this message
• Explains how to identify a phishing scam
• Explains what a phishing scam is
• Explains simple things you can do to protect yourself
Comic Strip Intervention
Embedded Training Evaluation
• Lab study comparing our prototypes to standard
security notices
– EBay, PayPal notices
– Diagram that explains phishing
– Comic strip that tells a story
• 10 participants in each condition (30 total)
• Each participant went through roughly 19 emails, with
4 phishing attacks and 2 training emails scattered throughout
– Emails are in context of working in an office
Embedded Training Results
• Existing practice of security notices is ineffective
• Diagram intervention somewhat better
• Comic strip intervention worked best
– Statistically significant
• Pilot study showed interventions most
effective when based on real brands
Next Steps
• Iterate on intervention design
– Have already created newer designs, ready for testing
• Understand why comic strip worked better
– Story? Comic format?
• Preparing for larger scale deployment
– Include more people
– Evaluate retention over time
– Deploy outside lab conditions if possible
• Real world deployment and evaluation
– Need corporate partners to let us spoof their brand
Anti-Phishing Phil
• A game to teach people not to fall for phish
– Embedded training focuses on email
– Game focuses on web browser, URLs
• Goals
– How to parse URLs
– Where to look for URLs
– Use search engines instead
• Available on our web
site soon
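
The "how to parse URLs" goal can be illustrated with a short sketch. It uses Python's standard urllib.parse module; the helper name host_and_naive_domain is hypothetical, and taking the last two labels of the host name is a simplifying assumption that ignores multi-part suffixes (such as .co.uk) and IP-address hosts.

```python
from urllib.parse import urlsplit


def host_and_naive_domain(url):
    """Return (host, naive_domain) for a URL.

    naive_domain is just the last two dot-separated labels of the host,
    which is an assumption that fails for suffixes like .co.uk.
    """
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    naive_domain = ".".join(labels[-2:]) if len(labels) >= 2 else host
    return host, naive_domain


# The brand name appears early in the host, but the site belongs to example.com.
host, domain = host_and_naive_domain(
    "http://www.paypal.update.example.com/update.cgi"
)
print(host)
print(domain)
```

Reading the host name from right to left is the point of the exercise: the labels nearest the path decide who controls the site, not the familiar brand name on the left.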
Anti-Phishing Phil
Outline
• Human side
– Interviews to understand decision-making
– Embedded training
– Anti-phishing game
• Computer side
– Email anti-phishing filter
– Automated testbed for anti-phishing toolbars
– Our anti-phishing toolbar
How accurate are today’s
anti-phishing toolbars?
Some Users Rely on Toolbars
• Dozens of anti-phishing toolbars offered
– Built into security software suites
– Offered by ISPs
– Free downloads
– Built into latest version of popular web browsers
• Previous studies demonstrated usability
problems that need further work
• But how well do they detect phish?
Testing the Toolbars
• April 2006: Manual evaluation of 5 toolbars
– Required lots of undergraduate labor over 2-week period
• Summer 2006: Created a semi-automated test bed
• September 2006: Automated evaluation of 5 toolbars
– Used APWG feed as source of phishing URLs
• November 2006: Automated evaluation of 10 toolbars
– Used phishtank.com as source of phishing URLs
– Evaluated 100 phish and 510 legit sites in just 2 days
L. Cranor, S. Egelman, J. Hong and Y. Zhang. Phinding
Phish: An Evaluation of Anti-Phishing Toolbars. CyLab
Technical Report. CMU-CyLab-06-018, 2006.
http://www.cylab.cmu.edu/default.aspx?id=2255
[to be presented at NDSS]
Testbed for Anti-Phishing Toolbars
• Manual evaluation was tedious, slow, error-prone
• Created a testbed that could semi-automatically
evaluate these toolbars
– Just give it a set of URLs to check (labeled as phish or not)
– Checks all the toolbars, aggregates statistics
• How to automate this for different toolbars?
– Different APIs (if any), different browsers
– Image-based approach, take screenshots of web browser
and compare relevant portions to known states
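
The "checks all the toolbars, aggregates statistics" step might look roughly like the sketch below. It is not the testbed's actual code: each toolbar is represented by a hypothetical callable that returns True when it flags a URL, standing in for the screenshot comparison described above, and the function and key names are invented for illustration.

```python
def evaluate_toolbars(labeled_urls, toolbars):
    """Aggregate per-toolbar statistics.

    labeled_urls: list of (url, is_phish) pairs.
    toolbars: dict mapping a toolbar name to a callable(url) -> bool,
              where True means the toolbar flagged the URL as phishing.
    """
    n_phish = sum(1 for _, is_phish in labeled_urls if is_phish)
    n_legit = len(labeled_urls) - n_phish
    stats = {}
    for name, flags in toolbars.items():
        caught = sum(1 for url, is_phish in labeled_urls if is_phish and flags(url))
        false_alarms = sum(
            1 for url, is_phish in labeled_urls if not is_phish and flags(url)
        )
        stats[name] = {
            "catch_rate": caught / n_phish if n_phish else 0.0,
            "false_positive_rate": false_alarms / n_legit if n_legit else 0.0,
        }
    return stats


# Made-up URLs and two toy "toolbars" for demonstration only.
urls = [
    ("http://phish1.example", True),
    ("http://phish2.example", True),
    ("http://legit1.example", False),
    ("http://legit2.example", False),
]
toy_toolbars = {
    "flag_everything": lambda url: True,
    "tiny_blacklist": lambda url: url == "http://phish1.example",
}
print(evaluate_toolbars(urls, toy_toolbars))
```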
Testbed System Architecture
Finding Fresh Phish for Test
• Need a source with lots of fresh phishing URLs
– Can’t use toolbar black lists if we are testing their tools
– Sites get taken down within a few days, need phish
less than one day old
• To observe how fast black lists get updated, the fresher
the better
• Experimented with several sources
– APWG - high volume, but many duplicates and legitimate
URLs included
– Phishtank.com - lower volume but easier to extract phish
– Other phish archives - often low volume or not fresh enough
• Choice of feed impacts results
November 2006 Evaluation
• Tested 10 toolbars
– Microsoft Internet Explorer v7.0.5700.6
– Netscape Navigator v8.1.2
– EarthLink v3.3.44.0
– eBay v2.3.2.0
– McAfee SiteAdvisor v1.7.0.53
– NetCraft v1.7.0
– TrustWatch v3.0.4.0.1.2
– SpoofGuard
– Cloudmark v1.0.
– Google Toolbar v2.1 (Firefox)
• Most use blacklists and simple heuristics
– SpoofGuard only one to rely solely on heuristics
November 2006 Evaluation
• Test URLs
– 100 manually confirmed fresh phish from phishtank.com
(reported within 6 hours)
• Did not use the fully confirmed ones
– 60 legitimate sites linked to by phishing messages
– 510 legitimate sites tested by 3Sharp in Sept 2006 report
Results
[Figure: phishing sites correctly identified (0% to 100%) versus time
(0, 1, 2, 12, 24 hours) for SpoofGuard, EarthLink, Netcraft, Google, IE7,
Cloudmark, TrustWatch, eBay, Netscape, and McAfee. Chart annotations:
"38% false positives" and "1% false positives".]
Results
• The only toolbar with >90% accuracy has a high false positive rate
• Several catch 70-85% of phish with few false positives
– After 15 minutes of training, users seem to do as well
• Few improvements in catch rates seen over 24 hours
– Suggests most toolbars not taking advantage of
available phish feeds to quickly update black lists
• Combination of heuristics and frequently updated black list
(and white list?) seems to be most promising approach
• Plan to periodically repeat study every quarter
• Should only consider this a rough ordering
– Different sources of phishing URLs lead to different results
Our Anti-Phishing Toolbar
Robust Hyperlinks
• Developed by Phelps and Wilensky to solve
“404 not found” problem
• Key idea was to add a lexical signature to URLs that
could be fed to a search engine if URL failed
– Ex. http://abc.com/page.html?sig=“word1+word2+...+word5”
• How to generate signature?
– Found that TF-IDF was fairly effective
• Informal evaluation found five words was sufficient
for most web pages
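
A minimal sketch of generating a lexical signature with TF-IDF and attaching it to a URL follows. It is not Phelps and Wilensky's implementation: the tokenizer, the tiny in-memory corpus used for document frequencies, the tie-breaking rule, and the function names are all assumptions made for illustration, and the signature is appended without the surrounding quotes shown in the slide's example.

```python
import math
import re
from collections import Counter
from urllib.parse import quote_plus


def tokenize(text):
    """Lowercase the text and keep alphabetic words only (an assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def lexical_signature(page_text, corpus, k=5):
    """Return the k terms of page_text with the highest TF-IDF weight.

    corpus is a list of other documents used to estimate how common
    each term is; ties are broken alphabetically.
    """
    tf = Counter(tokenize(page_text))
    doc_tokens = [set(tokenize(doc)) for doc in corpus]
    n_docs = len(corpus) + 1  # the corpus plus the page itself

    def idf(term):
        df = 1 + sum(term in tokens for tokens in doc_tokens)
        return math.log(n_docs / df)

    return sorted(tf, key=lambda t: (-tf[t] * idf(t), t))[:k]


def robust_url(url, signature):
    """Append the signature to the URL as a sig query parameter."""
    separator = "&" if "?" in url else "?"
    return url + separator + "sig=" + "+".join(quote_plus(w) for w in signature)


corpus = ["the cat sat on the mat", "the dog sat on the log"]
signature = lexical_signature("the phish phish bank the login", corpus, k=3)
print(robust_url("http://abc.com/page.html", signature))
```

Words that appear in every document (such as "the") get an IDF of zero, so the signature favors terms that are frequent on the page but rare elsewhere.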
Adapting TF-IDF for Anti-Phishing
• Can same basic approach be used for anti-phishing?
– Scammers often directly copy web pages
– With the Google search engine, the fake should have a low PageRank
[Screenshots: fake page and real page side by side]
Adapting TF-IDF for Anti-Phishing
• Rough algorithm
– Given a web page, calculate TF-IDF for each word on page
– Take five terms with highest TF-IDF weights
– Feed these terms into a search engine (Google)
– If domain name of current web page is in top N search
results, consider it legitimate (N=30 worked well)
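
The final step of the rough algorithm (checking whether the page's domain appears in the top N search results) can be sketched as follows. The search argument is a hypothetical caller-supplied function that takes a query string and returns a list of result URLs; no real search engine API is used here, and comparing host names after stripping a leading "www." is a simplifying assumption.

```python
from urllib.parse import urlsplit


def domain_of(url):
    """Return the lowercased host name without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def looks_legitimate(page_url, terms, search, n=30):
    """Return True if the page's domain is among the top n search results.

    terms: the highest-weighted TF-IDF terms of the page.
    search: hypothetical callable(query) -> list of result URLs.
    """
    page_domain = domain_of(page_url)
    if not page_domain:
        return False
    results = search(" ".join(terms))[:n]
    return page_domain in {domain_of(result) for result in results}


def fake_search(query):
    """Stand-in for a search engine; returns canned results."""
    return ["https://www.paypal.com/signin", "https://en.wikipedia.org/wiki/PayPal"]


terms = ["paypal", "account", "login", "password", "secure"]
print(looks_legitimate("https://www.paypal.com/login", terms, fake_search))
print(looks_legitimate("http://paypal.example.net/login", terms, fake_search))
```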
Evaluation #1
• 100 phishing URLs from PhishTank.com
• 100 legitimate URLs from 3Sharp’s study
[Bar chart; values and column labels as shown on the slide]
Approach                    True negative   False positive
Basic-TF-IDF                94%             30%
Basic-TF-IDF+domain         67%             10%
Basic-TF-IDF+ZMP            94%             31%
Basic-TF-IDF+domain+ZMP     97%             10%
Discussion of Evaluation #1
• Very good results (97%), but false positives (10%)
• Added several heuristics to reduce false positives
– Many of these heuristics used by other toolbars
– Age of domain
– Known images
– Suspicious URLs (has @ or -)
– Suspicious links (see above)
– IP Address in URL
– Dots in URL (>= 5 dots)
– Page contains text entry field
– TF-IDF
• Used simple forward linear model to weight these
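
Three of the URL-based heuristics above, combined by a weighted sum, might be sketched like this. The weights are invented for illustration and are not the values learned in the talk; checking for "-" only in the host name (rather than anywhere in the URL) is also an assumption, and the heuristics that need outside data (domain age, known images, page content, TF-IDF) are left out.

```python
import ipaddress
from urllib.parse import urlsplit


def url_features(url):
    """Compute three boolean URL heuristics from the list above."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    try:
        ipaddress.ip_address(host)
        is_ip = True
    except ValueError:
        is_ip = False
    return {
        "suspicious_url": "@" in parts.netloc or "-" in host,
        "ip_address": is_ip,
        "many_dots": url.count(".") >= 5,
    }


# Hypothetical weights, chosen only to make the example concrete.
HYPOTHETICAL_WEIGHTS = {"suspicious_url": 0.3, "ip_address": 0.4, "many_dots": 0.3}


def phish_score(features, weights=HYPOTHETICAL_WEIGHTS):
    """Linear model: sum the weights of the heuristics that fired."""
    return sum(weights.get(name, 0.0) for name, fired in features.items() if fired)


for url in ("http://128.23.34.45/blah",
            "http://www.paypal.update.example.com/update.cgi"):
    print(url, phish_score(url_features(url)))
```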
Evaluation #2
• Compared to SpoofGuard and NetCraft
– SpoofGuard uses all heuristics
– NetCraft 1.7.0 uses heuristics (?) and extensive blacklist
• 100 phishing URLs from PhishTank.com
• 100 legitimate URLs
– Sites often attacked (citibank, paypal)
– Top pages from Alexa (most popular sites)
– Random web pages from random.yahoo.com
Results of Evaluation #2
[Bar chart; values and column labels as shown on the slide]
Approach                      True negative   False positive
Final-TF-IDF                  97%             6%
Final-TF-IDF + Heuristics     89%             1%
SpoofGuard                    91%             48%
Netcraft                      97%             0%
Discussion
• Pretty good results for TF-IDF approach
– 97% with 6% false positive, 89% with 1% false positive
– False positives due to JavaScript phishing sites
• Limitations
– Does not work well for non-English web sites (TF-IDF)
– System performance (querying Google each time)
• Attacks by criminals
– Using images instead of words
– Invisible text
– Circumventing TF-IDF and PageRank (hard in practice?)
Summary
• Large multi-disciplinary team project at CMU looking
at trust decisions, currently anti-phishing
• Human side
– Interviews to understand decision-making
– Embedded training
– Anti-phishing game
• Computer side
– Automated testbed for anti-phishing toolbars
– Our anti-phishing toolbar
Embedded Training Results
[Figure: percentage of users who clicked on a link (0 to 100), for each
email that had links in it (3: Phish, 5: Training, 7: Real, 8: Spam,
11: Training, 12: Spam, 13: Real, 14: Phish, 16: Phish, 17: Phish),
shown for Group A, Group B, and Group C.]
Email Anti-Phishing Filter
• Philosophy: automate where possible, support
where necessary
• Goal: Create an email filter that detects phishing
emails
– Well explored area for spam
– Can we do better for phishing?
Email Anti-Phishing Filter
• Heuristics combined in SVM
– IP addresses in links (http://128.23.34.45/blah)
– Age of linked-to domains (younger domains likely phishing)
– Non-matching URLs (ex. most links point to PayPal)
– “Click here to restore your account”
– HTML email
– Number of links
– Number of domain names in links
– Number of dots in URLs
(http://www.paypal.update.example.com/update.cgi)
– JavaScript
– SpamAssassin rating
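
A few of the link-based features above can be extracted from an HTML email body with Python's standard library, as in the sketch below. This is not the filter's actual code: the function and key names are hypothetical, distinct host names are used as a rough stand-in for domain names, and features that need outside data (domain age, SpamAssassin rating) as well as the SVM itself are omitted.

```python
import ipaddress
from html.parser import HTMLParser
from urllib.parse import urlsplit


class _LinkCollector(HTMLParser):
    """Collect href values of <a> tags and note any <script> tag."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.has_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "script":
            self.has_script = True


def _is_ip(host):
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def email_features(html_body):
    """Return a dict of simple link-based features for one email body."""
    collector = _LinkCollector()
    collector.feed(html_body)
    hosts = [urlsplit(link).hostname or "" for link in collector.links]
    return {
        "num_links": len(collector.links),
        "num_domains": len(set(hosts) - {""}),  # distinct host names
        "ip_links": sum(_is_ip(host) for host in hosts),
        "max_dots": max((link.count(".") for link in collector.links), default=0),
        "javascript": collector.has_script,
    }


body = (
    '<p>Click <a href="http://128.23.34.45/blah">here</a> or '
    '<a href="http://www.paypal.update.example.com/update.cgi">here</a></p>'
    "<script>x()</script>"
)
print(email_features(body))
```

The resulting dictionary could be turned into a numeric vector and passed to any classifier; that step is left out here.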
Email Anti-Phishing Filter Evaluation
• Ham corpora from SpamAssassin (2002 and 2003)
– 6950 good emails
• Phishing corpus
– 860 phishing emails
Email Anti-Phishing Filter Evaluation
                    Is it legitimate?
Our label           Yes               No
Yes                 True positive     False positive
No                  False negative    True negative
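
The four outcomes can be counted with a few lines of code. This sketch assumes the reading of the matrix in which "positive" means the filter labeled an email as legitimate; the function name and the boolean encoding are hypothetical.

```python
from collections import Counter


def confusion_counts(actual_legit, labeled_legit):
    """Count outcomes, treating 'legitimate' as the positive class.

    actual_legit: list of bools, True if the email really is legitimate.
    labeled_legit: list of bools, True if the filter labeled it legitimate.
    """
    counts = Counter()
    for actual, labeled in zip(actual_legit, labeled_legit):
        if labeled and actual:
            counts["true_positive"] += 1
        elif labeled and not actual:
            counts["false_positive"] += 1
        elif not labeled and actual:
            counts["false_negative"] += 1
        else:
            counts["true_negative"] += 1
    return counts


# Made-up labels for five emails.
actual = [True, True, False, False, False]
labeled = [True, False, True, False, False]
print(confusion_counts(actual, labeled))
```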
User Interfaces and Algorithms for Fighting Phishing, at Google Tech Talk Jan 2007
User Interfaces and Algorithms for Fighting Phishing, at Google Tech Talk Jan 2007
User Interfaces and Algorithms for Fighting Phishing, at Google Tech Talk Jan 2007

More Related Content

What's hot

Top Security Challenges Facing Credit Unions Today
Top Security Challenges Facing Credit Unions TodayTop Security Challenges Facing Credit Unions Today
Top Security Challenges Facing Credit Unions TodayChris Gates
 
What you need to know about OSINT
What you need to know about OSINTWhat you need to know about OSINT
What you need to know about OSINTJerod Brennen
 
Man vs Internet - Current challenges and future tendencies of establishing tr...
Man vs Internet - Current challenges and future tendencies of establishing tr...Man vs Internet - Current challenges and future tendencies of establishing tr...
Man vs Internet - Current challenges and future tendencies of establishing tr...Luis Grangeia
 
Web Security - Introduction v.1.3
Web Security - Introduction v.1.3Web Security - Introduction v.1.3
Web Security - Introduction v.1.3Oles Seheda
 
Co-Presented: YOU are the Alpha and Omega of a Secure Future (Kottova / Dray)...
Co-Presented: YOU are the Alpha and Omega of a Secure Future (Kottova / Dray)...Co-Presented: YOU are the Alpha and Omega of a Secure Future (Kottova / Dray)...
Co-Presented: YOU are the Alpha and Omega of a Secure Future (Kottova / Dray)...Kimberley Dray
 
Password and Account Management Strategies - April 2019
Password and Account Management Strategies - April 2019Password and Account Management Strategies - April 2019
Password and Account Management Strategies - April 2019Kimberley Dray
 
CMS Hacking Tricks - DerbyCon 4 - 2014
CMS Hacking Tricks - DerbyCon 4 - 2014CMS Hacking Tricks - DerbyCon 4 - 2014
CMS Hacking Tricks - DerbyCon 4 - 2014Greg Foss
 
LonestarPHP 2014 Security Keynote
LonestarPHP 2014 Security KeynoteLonestarPHP 2014 Security Keynote
LonestarPHP 2014 Security KeynoteAlison Gianotto
 
Hacking Web Apps by Brent White
Hacking Web Apps by Brent WhiteHacking Web Apps by Brent White
Hacking Web Apps by Brent WhiteEC-Council
 
Workshop threat-hunting
Workshop threat-huntingWorkshop threat-hunting
Workshop threat-huntingTripwire
 
Threat Hunting with Splunk
Threat Hunting with SplunkThreat Hunting with Splunk
Threat Hunting with SplunkSplunk
 
The most dangerous places on the web
The most dangerous places on the webThe most dangerous places on the web
The most dangerous places on the webJoel May
 
Webinar - Tips and Tricks on Website Security
Webinar - Tips and Tricks on Website SecurityWebinar - Tips and Tricks on Website Security
Webinar - Tips and Tricks on Website SecurityStopTheHacker
 
Bug Bounty Hunter's Manifesto V1.0
Bug Bounty Hunter's Manifesto V1.0Bug Bounty Hunter's Manifesto V1.0
Bug Bounty Hunter's Manifesto V1.0Dinesh O Bareja
 
Negative Unemployment and Great Job Satisfaction? Why infosec is AWESEOME
Negative Unemployment and Great Job Satisfaction? Why infosec is AWESEOMENegative Unemployment and Great Job Satisfaction? Why infosec is AWESEOME
Negative Unemployment and Great Job Satisfaction? Why infosec is AWESEOMEjeffmcjunkin
 
MacIT 2014 - Essential Security & Risk Fundamentals
MacIT 2014 - Essential Security & Risk FundamentalsMacIT 2014 - Essential Security & Risk Fundamentals
MacIT 2014 - Essential Security & Risk FundamentalsAlison Gianotto
 
Windows Incident Response is hard, but doesn't have to be
Windows Incident Response is hard, but doesn't have to beWindows Incident Response is hard, but doesn't have to be
Windows Incident Response is hard, but doesn't have to beMichael Gough
 
Bug Bounty - Hackers Job
Bug Bounty - Hackers JobBug Bounty - Hackers Job
Bug Bounty - Hackers JobArbin Godar
 
Wi-Fi Hotspot Attacks
Wi-Fi Hotspot AttacksWi-Fi Hotspot Attacks
Wi-Fi Hotspot AttacksGreg Foss
 
SSL: Past, Present and Future
SSL: Past, Present and FutureSSL: Past, Present and Future
SSL: Past, Present and FutureLuis Grangeia
 

What's hot (20)

Top Security Challenges Facing Credit Unions Today
Top Security Challenges Facing Credit Unions TodayTop Security Challenges Facing Credit Unions Today
Top Security Challenges Facing Credit Unions Today
 
What you need to know about OSINT
What you need to know about OSINTWhat you need to know about OSINT
What you need to know about OSINT
 
Man vs Internet - Current challenges and future tendencies of establishing tr...
Man vs Internet - Current challenges and future tendencies of establishing tr...Man vs Internet - Current challenges and future tendencies of establishing tr...
Man vs Internet - Current challenges and future tendencies of establishing tr...
 
Web Security - Introduction v.1.3
Web Security - Introduction v.1.3Web Security - Introduction v.1.3
Web Security - Introduction v.1.3
 
Co-Presented: YOU are the Alpha and Omega of a Secure Future (Kottova / Dray)...
Co-Presented: YOU are the Alpha and Omega of a Secure Future (Kottova / Dray)...Co-Presented: YOU are the Alpha and Omega of a Secure Future (Kottova / Dray)...
Co-Presented: YOU are the Alpha and Omega of a Secure Future (Kottova / Dray)...
 
Password and Account Management Strategies - April 2019
Password and Account Management Strategies - April 2019Password and Account Management Strategies - April 2019
Password and Account Management Strategies - April 2019
 
CMS Hacking Tricks - DerbyCon 4 - 2014
CMS Hacking Tricks - DerbyCon 4 - 2014CMS Hacking Tricks - DerbyCon 4 - 2014
CMS Hacking Tricks - DerbyCon 4 - 2014
 
LonestarPHP 2014 Security Keynote
LonestarPHP 2014 Security KeynoteLonestarPHP 2014 Security Keynote
LonestarPHP 2014 Security Keynote
 
Hacking Web Apps by Brent White
Hacking Web Apps by Brent WhiteHacking Web Apps by Brent White
Hacking Web Apps by Brent White
 
Workshop threat-hunting
Workshop threat-huntingWorkshop threat-hunting
Workshop threat-hunting
 
Threat Hunting with Splunk
Threat Hunting with SplunkThreat Hunting with Splunk
Threat Hunting with Splunk
 
The most dangerous places on the web
The most dangerous places on the webThe most dangerous places on the web
The most dangerous places on the web
 
Webinar - Tips and Tricks on Website Security
Webinar - Tips and Tricks on Website SecurityWebinar - Tips and Tricks on Website Security
Webinar - Tips and Tricks on Website Security
 
Bug Bounty Hunter's Manifesto V1.0
Bug Bounty Hunter's Manifesto V1.0Bug Bounty Hunter's Manifesto V1.0
Bug Bounty Hunter's Manifesto V1.0
 
Negative Unemployment and Great Job Satisfaction? Why infosec is AWESEOME
Negative Unemployment and Great Job Satisfaction? Why infosec is AWESEOMENegative Unemployment and Great Job Satisfaction? Why infosec is AWESEOME
Negative Unemployment and Great Job Satisfaction? Why infosec is AWESEOME
 
MacIT 2014 - Essential Security & Risk Fundamentals
MacIT 2014 - Essential Security & Risk FundamentalsMacIT 2014 - Essential Security & Risk Fundamentals
MacIT 2014 - Essential Security & Risk Fundamentals
 
Windows Incident Response is hard, but doesn't have to be
Windows Incident Response is hard, but doesn't have to beWindows Incident Response is hard, but doesn't have to be
Windows Incident Response is hard, but doesn't have to be
 
Bug Bounty - Hackers Job
Bug Bounty - Hackers JobBug Bounty - Hackers Job
Bug Bounty - Hackers Job
 
Wi-Fi Hotspot Attacks
Wi-Fi Hotspot AttacksWi-Fi Hotspot Attacks
Wi-Fi Hotspot Attacks
 
SSL: Past, Present and Future
SSL: Past, Present and FutureSSL: Past, Present and Future
SSL: Past, Present and Future
 

Viewers also liked

Money making systems
Money making systemsMoney making systems
Money making systemsHellen Meyer
 
50 Essential Content Marketing Hacks (Content Marketing World)
50 Essential Content Marketing Hacks (Content Marketing World)50 Essential Content Marketing Hacks (Content Marketing World)
50 Essential Content Marketing Hacks (Content Marketing World)Heinz Marketing Inc
 
Prototyping is an attitude
Prototyping is an attitudePrototyping is an attitude
Prototyping is an attitudeWith Company
 
10 Insightful Quotes On Designing A Better Customer Experience
10 Insightful Quotes On Designing A Better Customer Experience10 Insightful Quotes On Designing A Better Customer Experience
10 Insightful Quotes On Designing A Better Customer ExperienceYuan Wang
 
Learn BEM: CSS Naming Convention
Learn BEM: CSS Naming ConventionLearn BEM: CSS Naming Convention
Learn BEM: CSS Naming ConventionIn a Rocket
 
SEO: Getting Personal
SEO: Getting PersonalSEO: Getting Personal
SEO: Getting PersonalKirsty Hulse
 
How to Build a Dynamic Social Media Plan
How to Build a Dynamic Social Media PlanHow to Build a Dynamic Social Media Plan
How to Build a Dynamic Social Media PlanPost Planner
 

Viewers also liked (7)

Money making systems
Money making systemsMoney making systems
Money making systems
 
50 Essential Content Marketing Hacks (Content Marketing World)
50 Essential Content Marketing Hacks (Content Marketing World)50 Essential Content Marketing Hacks (Content Marketing World)
50 Essential Content Marketing Hacks (Content Marketing World)
 
Prototyping is an attitude
Prototyping is an attitudePrototyping is an attitude
Prototyping is an attitude
 
10 Insightful Quotes On Designing A Better Customer Experience
10 Insightful Quotes On Designing A Better Customer Experience10 Insightful Quotes On Designing A Better Customer Experience
10 Insightful Quotes On Designing A Better Customer Experience
 
Learn BEM: CSS Naming Convention
Learn BEM: CSS Naming ConventionLearn BEM: CSS Naming Convention
Learn BEM: CSS Naming Convention
 
SEO: Getting Personal
SEO: Getting PersonalSEO: Getting Personal
SEO: Getting Personal
 
How to Build a Dynamic Social Media Plan
How to Build a Dynamic Social Media PlanHow to Build a Dynamic Social Media Plan
How to Build a Dynamic Social Media Plan
 

Similar to User Interfaces and Algorithms for Fighting Phishing, at Google Tech Talk Jan 2007

User Interfaces and Algorithms for Fighting Phishing, Cylab Seminar talk 2007
User Interfaces and Algorithms for Fighting Phishing, Cylab Seminar talk 2007User Interfaces and Algorithms for Fighting Phishing, Cylab Seminar talk 2007
User Interfaces and Algorithms for Fighting Phishing, Cylab Seminar talk 2007Jason Hong
 
Usable Privacy and Security: A Grand Challenge for HCI, Human Computer Inter...
Usable Privacy and Security: A Grand Challenge for HCI, Human Computer Inter...Usable Privacy and Security: A Grand Challenge for HCI, Human Computer Inter...
Usable Privacy and Security: A Grand Challenge for HCI, Human Computer Inter...Jason Hong
 
Pentesting Tips: Beyond Automated Testing
Pentesting Tips: Beyond Automated TestingPentesting Tips: Beyond Automated Testing
Pentesting Tips: Beyond Automated TestingAndrew McNicol
 
Do security toolbars actually prevent phishing attacks
Do security toolbars actually prevent phishing attacksDo security toolbars actually prevent phishing attacks
Do security toolbars actually prevent phishing attacksPankaj Saharan
 
Achieving Behavioral Change, for ISSA 2011 in San Francisco Feb 2011
Achieving Behavioral Change, for ISSA 2011 in San Francisco Feb 2011Achieving Behavioral Change, for ISSA 2011 in San Francisco Feb 2011
Achieving Behavioral Change, for ISSA 2011 in San Francisco Feb 2011Jason Hong
 
Introducing Bugcrowd
Introducing BugcrowdIntroducing Bugcrowd
Introducing BugcrowdCasey Ellis
 
CANTINA: A Content-Based Approach to Detecting Phishing Web Sites, at WWW2007
CANTINA: A Content-Based Approach to  Detecting Phishing Web Sites, at WWW2007CANTINA: A Content-Based Approach to  Detecting Phishing Web Sites, at WWW2007
CANTINA: A Content-Based Approach to Detecting Phishing Web Sites, at WWW2007Jason Hong
 
Teaching Johnny Not to Fall for Phish, for ISSA 2010 on May 2010
Teaching Johnny Not to Fall for Phish, for ISSA 2010 on May 2010Teaching Johnny Not to Fall for Phish, for ISSA 2010 on May 2010
Teaching Johnny Not to Fall for Phish, for ISSA 2010 on May 2010Jason Hong
 
Teaching Johnny Not to Fall for Phish, for ISSA 2011 in Pittsburgh on Feb2011
Teaching Johnny Not to Fall for Phish, for ISSA 2011 in Pittsburgh on Feb2011Teaching Johnny Not to Fall for Phish, for ISSA 2011 in Pittsburgh on Feb2011
Teaching Johnny Not to Fall for Phish, for ISSA 2011 in Pittsburgh on Feb2011Jason Hong
 
Keeping you and your library safe and secure
Keeping you and your library safe and secureKeeping you and your library safe and secure
Keeping you and your library safe and secureLYRASIS
 
How an Attacker "Audits" Your Software Systems
How an Attacker "Audits" Your Software SystemsHow an Attacker "Audits" Your Software Systems
How an Attacker "Audits" Your Software SystemsSecurity Innovation
 
Bug bounties - cén scéal?
Bug bounties - cén scéal?Bug bounties - cén scéal?
Bug bounties - cén scéal?Ciaran McNally
 
2020 FRSecure CISSP Mentor Program - Class 9
2020 FRSecure CISSP Mentor Program - Class 92020 FRSecure CISSP Mentor Program - Class 9
2020 FRSecure CISSP Mentor Program - Class 9FRSecure
 
Staying safe on the internet
Staying safe on the internetStaying safe on the internet
Staying safe on the internetArthur Landry
 
Attack Simulation and Hunting
Attack Simulation and HuntingAttack Simulation and Hunting
Attack Simulation and Huntingnathi mogomotsi
 
What You Can Do to Keep Your Email, Bank Accounts and Business Safe from Cybe...
What You Can Do to Keep Your Email, Bank Accounts and Business Safe from Cybe...What You Can Do to Keep Your Email, Bank Accounts and Business Safe from Cybe...
What You Can Do to Keep Your Email, Bank Accounts and Business Safe from Cybe...nexxtep
 
Blitzing with your defense bea con
Blitzing with your defense bea conBlitzing with your defense bea con
Blitzing with your defense bea conInnismir
 

Similar to User Interfaces and Algorithms for Fighting Phishing, at Google Tech Talk Jan 2007 (20)

User Interfaces and Algorithms for Fighting Phishing, Cylab Seminar talk 2007
User Interfaces and Algorithms for Fighting Phishing, Cylab Seminar talk 2007User Interfaces and Algorithms for Fighting Phishing, Cylab Seminar talk 2007
User Interfaces and Algorithms for Fighting Phishing, Cylab Seminar talk 2007
 
Usable Privacy and Security: A Grand Challenge for HCI, Human Computer Inter...
Usable Privacy and Security: A Grand Challenge for HCI, Human Computer Inter...Usable Privacy and Security: A Grand Challenge for HCI, Human Computer Inter...
Usable Privacy and Security: A Grand Challenge for HCI, Human Computer Inter...
 
Janitor vs cleaner
Janitor vs cleanerJanitor vs cleaner
Janitor vs cleaner
 
Pentesting Tips: Beyond Automated Testing
Pentesting Tips: Beyond Automated TestingPentesting Tips: Beyond Automated Testing
Pentesting Tips: Beyond Automated Testing
 
Do security toolbars actually prevent phishing attacks
Do security toolbars actually prevent phishing attacksDo security toolbars actually prevent phishing attacks
Do security toolbars actually prevent phishing attacks
 
Achieving Behavioral Change, for ISSA 2011 in San Francisco Feb 2011
Achieving Behavioral Change, for ISSA 2011 in San Francisco Feb 2011Achieving Behavioral Change, for ISSA 2011 in San Francisco Feb 2011
Achieving Behavioral Change, for ISSA 2011 in San Francisco Feb 2011
 
Introducing Bugcrowd
Introducing BugcrowdIntroducing Bugcrowd
Introducing Bugcrowd
 
CANTINA: A Content-Based Approach to Detecting Phishing Web Sites, at WWW2007
CANTINA: A Content-Based Approach to  Detecting Phishing Web Sites, at WWW2007CANTINA: A Content-Based Approach to  Detecting Phishing Web Sites, at WWW2007
User Interfaces and Algorithms for Fighting Phishing, at Google Tech Talk Jan 2007

  • 1. User Interfaces and Algorithms for Fighting Phishing Jason I. Hong Carnegie Mellon University
  • 2. Everyday Privacy and Security Problem
  • 4. Fast Facts on Phishing • Estimated 3.5 million people have fallen for phishing • Estimated to cost $1-2 billion a year (and growing) • 9255 unique phishing sites reported in June 2006 • Easier (and safer) to phish than rob a bank
  • 5. Supporting Trust Decisions • Goal: help people make better trust decisions – Focus on anti-phishing • Large multi-disciplinary team project at CMU – Supported by NSF, ARO, CMU CyLab – Six faculty, five PhD students, undergrads, staff – Computer science, human-computer interaction, public policy, social and decision sciences, CERT
  • 6. Our Multi-Pronged Approach • Human side – Interviews to understand decision-making – Embedded training – Anti-phishing game • Computer side – Email anti-phishing filter – Automated testbed for anti-phishing toolbars – Our anti-phishing toolbar Automate where possible, support where necessary
  • 7. What do users know about phishing?
  • 8. Interview Study • Interviewed 40 Internet users, including 35 non-experts • “Mental models” interviews included email role play and open-ended questions • Interviews recorded and coded J. Downs, M. Holbrook, and L. Cranor. Decision Strategies and Susceptibility to Phishing. In Proceedings of the 2006 Symposium On Usable Privacy and Security, 12-14 July 2006, Pittsburgh, PA.
  • 9. Little Knowledge of Phishing • Only about half knew meaning of the term “phishing” “Something to do with the band Phish, I take it.”
  • 10. Minimal Knowledge of Lock Icon “I think that it means secured, it symbolizes some kind of security, somehow.” • 85% of participants were aware of lock icon • Only 40% of those knew that it was supposed to be in the browser chrome • Only 35% had noticed https, and many of those did not know what it meant
  • 11. Little Attention Paid to URLs • Only 55% of participants said they had ever noticed an unexpected or strange-looking URL • Most did not consider them to be suspicious
  • 12. Some Knowledge of Scams • 55% of participants reported being cautious when email asks for sensitive financial info – But very few reported being suspicious of email asking for passwords • Knowledge of financial phish reduced likelihood of falling for these scams – But did not transfer to other scams, such as amazon.com password phish
  • 13. Naive Evaluation Strategies • The most frequent strategies don’t help much in identifying phish – This email appears to be for me – It’s normal to hear from companies you do business with – Reputable companies will send emails “I will probably give them the information that they asked for. And I would assume that I had already given them that information at some point so I will feel comfortable giving it to them again.”
  • 14. Other Findings • Web security pop-ups are confusing “Yeah, like the certificate has expired. I don’t actually know what that means.” • Don’t know what encryption means • Summary – People generally not good at identifying scams they haven’t specifically seen before – People don’t use good strategies to protect themselves
  • 15. Can we train people not to fall for phishing?
  • 16. Web Site Training Study • Laboratory study of 28 non-expert computer users • Two conditions, both asked to evaluate 20 web sites – Control group evaluated 10 web sites, took 15 minute break to read email or play solitaire, evaluated 10 more web sites – Experimental group same as above, but spent 15 minute break reading web-based training materials • Experimental group performed significantly better identifying phish after training – Less reliance on “professional-looking” designs – Looking at and understanding URLs – Web site asks for too much information People can learn from web-based training materials, if only we could get them to read them!
  • 17. How Do We Get People Trained? • Most people don’t proactively look for training materials on the web • Many companies send “security notice” emails to their employees and/or customers • But these tend to be ignored – Too much to read – People don’t consider them relevant – People think they already know how to protect themselves
  • 18. Embedded Training • Can we “train” people during their normal use of email to avoid phishing attacks? – Periodically, people get sent a training email – Training email looks like a phishing attack – If person falls for it, intervention warns and highlights what cues to look for in succinct and engaging format P. Kumaraguru, Y. Rhee, A. Acquisti, L. Cranor, J. Hong, and E. Nunge. Protecting People from Phishing: The Design and Evaluation of an Embedded Training Email System. CyLab Technical Report. CMU-CyLab-06-017, 2006. http://www.cylab.cmu.edu/default.aspx?id=2253 [to be presented at CHI 2007]
  • 20. Diagram Intervention Explains why they are seeing this message
  • 21. Diagram Intervention Explains how to identify a phishing scam
  • 23. Diagram Intervention Explains simple things you can do to protect self
  • 25. Embedded Training Evaluation • Lab study comparing our prototypes to standard security notices – EBay, PayPal notices – Diagram that explains phishing – Comic strip that tells a story • 10 participants in each condition (30 total) • Roughly, go through 19 emails, 4 phishing attacks scattered throughout, 2 training emails too – Emails are in context of working in an office
  • 26. Embedded Training Results • Existing practice of security notices is ineffective • Diagram intervention somewhat better • Comic strip intervention worked best – Statistically significant • Pilot study showed interventions most effective when based on real brands
  • 27. Next Steps • Iterate on intervention design – Have already created newer designs, ready for testing • Understand why comic strip worked better – Story? Comic format? • Preparing for larger scale deployment – Include more people – Evaluate retention over time – Deploy outside lab conditions if possible • Real world deployment and evaluation – Need corporate partners to let us spoof their brand
  • 28. Anti-Phishing Phil • A game to teach people not to fall for phish – Embedded training focuses on email – Game focuses on web browser, URLs • Goals – How to parse URLs – Where to look for URLs – Use search engines instead • Available on our web site soon
  • 30. Outline • Human side – Interviews to understand decision-making – Embedded training – Anti-phishing game • Computer side – Email anti-phishing filter – Automated testbed for anti-phishing toolbars – Our anti-phishing toolbar
  • 31. How accurate are today’s anti-phishing toolbars?
  • 32. Some Users Rely on Toolbars • Dozens of anti-phishing toolbars offered – Built into security software suites – Offered by ISPs – Free downloads – Built into latest version of popular web browsers
  • 34. Some Users Rely on Toolbars • Dozens of anti-phishing toolbars offered – Built into security software suites – Offered by ISPs – Free downloads – Built into latest version of popular web browsers • Previous studies demonstrated usability problems that need further work • But how well do they detect phish?
  • 35. Testing the Toolbars • April 2006: Manual evaluation of 5 toolbars – Required lots of undergraduate labor over 2-week period • Summer 2006: Created a semi-automated test bed • September 2006: Automated evaluation of 5 toolbars – Used APWG feed as source of phishing URLs • November 2006: Automated evaluation of 10 toolbars – Used phishtank.com as source of phishing URLs – Evaluated 100 phish and 510 legit sites in just 2 days L. Cranor, S. Egelman, J. Hong and Y. Zhang. Phinding Phish: An Evaluation of Anti-Phishing Toolbars. CyLab Technical Report. CMU-CyLab-06-018, 2006. http://www.cylab.cmu.edu/default.aspx?id=2255 [to be presented at NDSS]
  • 36. Testbed for Anti-Phishing Toolbars • Manual evaluation was tedious, slow, error-prone • Created a testbed that could semi-automatically evaluate these toolbars – Just give it a set of URLs to check (labeled as phish or not) – Checks all the toolbars, aggregates statistics • How to automate this for different toolbars? – Different APIs (if any), different browsers – Image-based approach, take screenshots of web browser and compare relevant portions to known states
  • 38. Finding Fresh Phish for Test • Need a source with lots of fresh phishing URLs – Can’t use toolbar black lists if we are testing their tools – Sites get taken down within a few days, need phish less than one day old • To observe how fast black lists get updated, the fresher the better • Experimented with several sources – APWG - high volume, but many duplicates and legitimate URLs included – Phishtank.com - lower volume but easier to extract phish – Other phish archives - often low volume or not fresh enough • Choice of feed impacts results
  • 39. November 2006 evaluation • Tested 10 toolbars – Microsoft Internet Explorer v7.0.5700.6 – Netscape Navigator v8.1.2 – EarthLink v3.3.44.0 – eBay v 2.3.2.0 – McAfee SiteAdvisor v1.7.0.53 – NetCraft v1.7.0 – TrustWatch v3.0.4.0.1.2 – SpoofGuard – Cloudmark v1.0. – Google Toolbar v2.1 (Firefox) • Most use blacklists and simple heuristics – SpoofGuard only one to rely solely on heuristics
  • 40. November 2006 Evaluation • Test URLs – 100 manually confirmed fresh phish from phishtank.com (reported within 6 hours) • Did not use the fully confirmed ones – 60 legitimate sites linked to by phishing messages – 510 legitimate sites tested by 3Sharp in Sept 2006 report
  • 41. Results • Line chart: percentage of phishing sites correctly identified vs. time (0, 1, 2, 12, 24 hours) for SpoofGuard, EarthLink, Netcraft, Google, IE7, Cloudmark, TrustWatch, eBay, Netscape, McAfee • Chart annotations: 38% false positives, 1% false positives
  • 42. Results • Only toolbar with >90% accuracy has a high false positive rate • Several catch 70-85% of phish with few false positives – After 15 minutes of training, users seem to do as well • Few improvements in catch rates seen over 24 hours – Suggests most toolbars are not taking advantage of available phish feeds to quickly update black lists • Combination of heuristics and frequently updated black list (and white list?) seems to be most promising approach • Plan to repeat study every quarter • Should only consider this a rough ordering – Different sources of phishing URLs lead to different results
  • 44. Robust Hyperlinks • Developed by Phelps and Wilensky to solve “404 not found” problem • Key idea was to add a lexical signature to URLs that could be fed to a search engine if URL failed – Ex. http://abc.com/page.html?sig=“word1+word2+...+word5” • How to generate signature? – Found that TF-IDF was fairly effective • Informal evaluation found five words was sufficient for most web pages
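The lexical-signature idea on the slide above can be sketched in a few lines of Python. This is only an illustration, not Phelps and Wilensky's implementation: the function names `add_signature` and `signature_query` and the choice of `sig` as the parameter name (taken from the slide's example URL) are assumptions, and choosing the signature words is left to the caller.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_signature(url, words, param="sig"):
    """Return url with the signature words appended as one query parameter."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    # urlencode turns the spaces between words into '+', as in the slide's example.
    query.append((param, " ".join(words)))
    return urlunsplit(parts._replace(query=urlencode(query)))


def signature_query(url, param="sig"):
    """Recover the signature words, e.g. to send to a search engine if url fails."""
    for key, value in parse_qsl(urlsplit(url).query):
        if key == param:
            return value.split()
    return []
```

If the original address later returns a 404, the words recovered by `signature_query` could be submitted to a search engine to look for the page's new location.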
  • 45. Adapting TF-IDF for Anti-Phishing • Can same basic approach be used for anti-phishing? – Scammers often directly copy web pages – With Google search engine, fake should have low page rank Fake Real
  • 46. Adapting TF-IDF for Anti-Phishing • Rough algorithm – Given a web page, calculate TF-IDF for each word on page – Take five terms with highest TF-IDF weights – Feed these terms into a search engine (Google) – If domain name of current web page is in top N search results, consider it legitimate (N=30 worked well)
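The rough algorithm above could look something like the following Python sketch. Several things here are assumptions rather than details from the talk: the tokenizer, the exact TF-IDF formula, the `doc_freq` table of document frequencies, and the exact-hostname comparison standing in for "domain name". The search step is not implemented; the caller is assumed to supply the result URLs returned by whatever search engine is used.

```python
import math
import re
from collections import Counter
from urllib.parse import urlsplit


def top_terms(page_text, doc_freq, num_docs, k=5):
    """Return the k words on the page with the highest TF-IDF weight."""
    words = re.findall(r"[a-z]+", page_text.lower())
    counts = Counter(words)
    total = len(words)

    def weight(word):
        tf = counts[word] / total
        # Assumed smoothing: +1 so unseen words do not divide by zero.
        idf = math.log(num_docs / (1 + doc_freq.get(word, 0)))
        return tf * idf

    # Highest weight first; ties broken alphabetically for determinism.
    return sorted(counts, key=lambda w: (-weight(w), w))[:k]


def looks_legitimate(page_url, result_urls, n=30):
    """True if the page's host is among the hosts of the top n search results."""
    host = urlsplit(page_url).hostname
    top_hosts = {urlsplit(u).hostname for u in result_urls[:n]}
    return host is not None and host in top_hosts
```

A copied page hosted on an unrelated server would yield the same top terms as the real one, but its host would not be expected to appear in the search results, so `looks_legitimate` would return `False`.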
  • 48. Evaluation #1 • 100 phishing URLs from PhishTank.com • 100 legitimate URLs from 3Sharp’s study • Bar chart (true negative / false positive): Basic-TF-IDF 94% / 30%; Basic-TF-IDF+domain 67% / 10%; Basic-TF-IDF+ZMP 94% / 31%; Basic-TF-IDF+domain+ZMP 97% / 10%
  • 49. Discussion of Evaluation #1 • Very good results (97%), but false positives (10%) • Added several heuristics to reduce false positives – Many of these heuristics used by other toolbars – Age of domain – Known images – Suspicious URLs (has @ or -) – Suspicious links (see above) – IP Address in URL – Dots in URL (>= 5 dots) – Page contains text entry field – TF-IDF • Used simple forward linear model to weight these
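Three of the URL-based heuristics listed above (suspicious characters, IP address in URL, and number of dots) are simple enough to sketch directly. The function name `url_heuristics` and the returned dictionary keys are hypothetical, and the other heuristics (domain age, known images, text entry fields) and the linear model that weights them are not shown.

```python
import ipaddress
from urllib.parse import urlsplit


def url_heuristics(url, max_dots=5):
    """Evaluate three URL checks named on the slide; True means suspicious."""
    host = urlsplit(url).hostname or ""
    try:
        ipaddress.ip_address(host)
        is_ip = True
    except ValueError:
        is_ip = False
    return {
        "has_at_or_dash": "@" in url or "-" in url,
        "ip_address_in_url": is_ip,
        # The slide flags URLs with 5 or more dots.
        "many_dots": url.count(".") >= max_dots,
    }
```

Each flag is only a weak signal on its own (many legitimate URLs contain a dash, for instance), which is presumably why the slide combines them in a weighted model.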
  • 50. Evaluation #2 • Compared to SpoofGuard and NetCraft – SpoofGuard uses all heuristics – NetCraft 1.7.0 uses heuristics (?) and extensive blacklist • 100 phishing URLs from PhishTank.com • 100 legitimate URLs – Sites often attacked (citibank, paypal) – Top pages from Alexa (most popular sites) – Random web pages from random.yahoo.com
  • 51. Results of Evaluation #2 • Bar chart (true negative / false positive): Final-TF-IDF 97% / 6%; Final-TF-IDF + Heuristics 89% / 1%; SpoofGuard 91% / 48%; Netcraft 97% / 0%
  • 52. Discussion • Pretty good results for TF-IDF approach – 97% with 6% false positive, 89% with 1% false positive – False positives due to JavaScript phishing sites • Limitations – Does not work well for non-English web sites (TF-IDF) – System performance (querying Google each time) • Attacks by criminals – Using images instead of words – Invisible text – Circumventing TF-IDF and PageRank (hard in practice?)
  • 53. Summary • Large multi-disciplinary team project at CMU looking at trust decisions, currently anti-phishing • Human side – Interviews to understand decision-making – Embedded training – Anti-phishing game • Computer side – Automated testbed for anti-phishing toolbars – Our anti-phishing toolbar
  • 56. Email Anti-Phishing Filter • Philosophy: automate where possible, support where necessary • Goal: Create an email filter that detects phishing emails – Well explored area for spam – Can we do better for phishing?
  • 57. Email Anti-Phishing Filter • Heuristics combined in SVM – IP addresses in links (http://128.23.34.45/blah) – Age of linked-to domains (younger domains likely phishing) – Non-matching URLs (ex. most links point to PayPal) – “Click here to restore your account” – HTML email – Number of links – Number of domain names in links – Number of dots in URLs (http://www.paypal.update.example.com/update.cgi) – JavaScript – SpamAssassin rating
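A few of the link-based features above can be extracted from an HTML email body with the Python standard library alone. This is a minimal sketch under stated assumptions: the names `LinkCollector` and `link_features` are hypothetical, only `<a href>` links are considered, and distinct hostnames stand in for "domain names". Training the SVM on the resulting features is not shown.

```python
import ipaddress
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def _is_ip(host):
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def link_features(html_body):
    """Return a few of the slide's link features for one email body."""
    collector = LinkCollector()
    collector.feed(html_body)
    hosts = [urlsplit(h).hostname or "" for h in collector.hrefs]
    return {
        "num_links": len(collector.hrefs),
        "num_domains": len({h for h in hosts if h}),
        "ip_in_links": any(_is_ip(h) for h in hosts),
        "max_dots": max((h.count(".") for h in collector.hrefs), default=0),
    }
```

The resulting dictionary could be turned into a numeric vector, alongside the other features on the slide, before being passed to a classifier.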
  • 58. Email Anti-Phishing Filter Evaluation • Ham corpora from SpamAssassin (2002 and 2003) – 6950 good emails • Phishing corpus – 860 phishing emails
  • 60. Confusion matrix (rows: our label; columns: is it legitimate?)
        Our label | Legitimate: Yes | Legitimate: No
        Yes       | True positive   | False positive
        No        | False negative  | True negative
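The four cells on the slide above can be expressed as a small function, which makes the convention explicit: "positive" here means "legitimate", so a phishing item that is correctly flagged counts as a true negative. The function names `outcome` and `tally` are hypothetical, and the sample data in the test is made up for illustration.

```python
from collections import Counter


def outcome(is_legitimate, labeled_legitimate):
    """Name the confusion-matrix cell for one item, with 'positive' = legitimate."""
    if labeled_legitimate:
        return "true positive" if is_legitimate else "false positive"
    return "false negative" if is_legitimate else "true negative"


def tally(pairs):
    """Count cells over (is_legitimate, labeled_legitimate) pairs."""
    return Counter(outcome(truth, label) for truth, label in pairs)
```

Counting the cells over a labeled test set gives the raw numbers from which rates such as those reported in the evaluations can be derived.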

Editor's Notes

  1. 2-3.5 million http://www.gartner.com/it/page.jsp?id=498245
  2. Email #16 was from CardMember Services with the subject "Your Online Statement Is Now Available" Email #17 was from [email_address] with the subject "Reactivate your PayPal Account"