Shady Paths: Leveraging Surfing Crowds to Detect Malicious Web Pages
Gianluca Stringhini, Christopher Kruegel, and Giovanni Vigna
University of California, Santa Barbara
The Web is a Dangerous Place

• Drive-by downloads
• Social engineering

Current Detection Techniques
Static Analysis
Suspicious elements in
• URLs
• JavaScript
• Flash
Weakness: obfuscation

Dynamic Analysis
Visit the web page (honeyclients)
• Signs of exploitation
Weakness: cloaking

Can only detect attacks that exploit vulnerabilities!

Our Technique

Redirection Graphs

No need to analyze the final page!

By analyzing the characteristics of the set of visitors and of the redirection graph, we can determine whether the destination page is malicious

Legitimate Uses of Redirections
• Inform visitors that a web page has moved
• Login functionalities
• Advertisements

We cannot flag all redirections as malicious
Luckily, malicious redirection graphs look different

Malicious Redirection Graphs
Uniform software configuration

Malicious Redirection Graphs
Cross-domain redirections

evil.co.cc

malicious.ru
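
One way to make the cross-domain property concrete is to count how many hops in a redirection chain move from one host to another. The sketch below is only an illustration: the function name `count_cross_domain_hops`, the example URLs, and the choice to compare full hostnames (rather than registered domains, which would need a public-suffix list) are assumptions, not details taken from the slides.

```python
from urllib.parse import urlsplit


def count_cross_domain_hops(chain):
    """Count consecutive URL pairs in a chain whose hostnames differ.

    Hypothetical helper: hostnames are compared verbatim, so
    www.example.com and example.com would count as different hosts.
    """
    hosts = [urlsplit(url).hostname for url in chain]
    return sum(1 for a, b in zip(hosts, hosts[1:]) if a != b)


# Illustrative chain that reuses the two hostnames shown on the slide;
# the paths and the order of the hops are made up.
chain = [
    "http://evil.co.cc/start",
    "http://evil.co.cc/next",
    "http://malicious.ru/landing",
]
print(count_cross_domain_hops(chain))
```

A chain that stays on one host yields zero, so a high count relative to chain length is the kind of signal this slide points at.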

Malicious Redirection Graphs
“Hubs” to aggregate traffic
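
A hub can be approximated as a host that receives redirections from several distinct hosts. The following sketch assumes that chains are given as lists of hostnames and that in-degree alone identifies a hub; the function `find_hubs`, the `min_sources` threshold, and the `.example` hostnames are hypothetical and are not described in the slides.

```python
from collections import defaultdict


def find_hubs(chains, min_sources=2):
    """Return hosts reached from at least `min_sources` distinct other hosts.

    Assumption: a "hub" is approximated by in-degree only.
    """
    predecessors = defaultdict(set)
    for chain in chains:
        for src, dst in zip(chain, chain[1:]):
            if src != dst:
                predecessors[dst].add(src)
    return sorted(
        host for host, sources in predecessors.items()
        if len(sources) >= min_sources
    )


# Three made-up chains that all funnel through the same intermediate host.
chains = [
    ["blog.example", "hub.example", "exploit.example"],
    ["shop.example", "hub.example", "exploit.example"],
    ["news.example", "hub.example", "exploit.example"],
]
print(find_hubs(chains))
```

Here only `hub.example` qualifies, because the final host is reached from a single predecessor.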

Malicious Redirection Graphs
“Infected” websites

System Overview

Our System: SpiderWeb
We leverage the differences between legitimate and malicious redirection graphs for detection
Three components:
• Data collection
• Creation of redirection graphs
• Classification component

Data Collection
SpiderWeb needs a set of navigation data from a diverse population of users
Dataset obtained from a large AV vendor
• Users of a browser security tool
• Data collection was opt-in only
• Data was anonymized

Creation of Redirection Graphs

[Figure: example redirection chains over a.com, b.com, c.com, and d.com]

When we specify the final page, we allow wildcards
(e.g., malicious.com/*) → Groupings
We need to discard groupings that are too general
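
The grouping step can be sketched as follows. This is a minimal illustration under stated assumptions: chains are non-empty lists of URLs, the only wildcard pattern used is `host/*`, and a graph is stored as a set of host-to-host edges. The names `grouping_key` and `build_redirection_graphs` are hypothetical and do not come from the slides.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def grouping_key(final_url):
    """Map a final URL to a wildcard pattern such as 'malicious.com/*'."""
    return f"{urlsplit(final_url).hostname}/*"


def build_redirection_graphs(chains):
    """Merge chains that share a grouping key into one edge set per key.

    Assumes every chain is a non-empty list of URLs, ordered from the
    referrer to the final page.
    """
    graphs = defaultdict(set)
    for chain in chains:
        # Looking up the key first ensures single-URL chains still get a graph.
        edges = graphs[grouping_key(chain[-1])]
        for src, dst in zip(chain, chain[1:]):
            edges.add((urlsplit(src).hostname, urlsplit(dst).hostname))
    return dict(graphs)


# Two made-up chains that end on different pages of the same host.
chains = [
    ["http://a.com/", "http://c.com/r", "http://d.com/x"],
    ["http://b.com/", "http://c.com/r", "http://d.com/y"],
]
graphs = build_redirection_graphs(chains)
print(graphs)
```

Both chains land in the `d.com/*` grouping even though their final URLs differ. The sketch does not attempt to discard groupings that are too general, which the slide says is also needed.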

Classification Component
Five categories of features
• Client features (3 features)
• Referrer features (4 features)
• Landing page features (4 features)
• Final page features (5 features)
How diverse are these elements: distinct URLs, parameters, TLD, domain is an IP
• Redirection graph features (12 features)
Length of chains, same country across referrer and final page, intra-domain redirections, hubs

We use Support Vector Machines for classification
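
The slides do not say how diversity is measured. As a rough sketch, one simple choice is the ratio of distinct values to total observations; that formula, the `diversity` helper, the field names, and the sample visits below are all assumptions made for illustration.

```python
def diversity(values):
    """Ratio of distinct values to total observations (0.0 when empty)."""
    values = list(values)
    return len(set(values)) / len(values) if values else 0.0


# Made-up visitor records for a single redirection graph.
visits = [
    {"country": "US", "browser": "IE 8"},
    {"country": "DE", "browser": "IE 8"},
    {"country": "US", "browser": "IE 8"},
    {"country": "BR", "browser": "IE 8"},
]

features = {
    "country_diversity": diversity(v["country"] for v in visits),
    "browser_diversity": diversity(v["browser"] for v in visits),
}
print(features)
```

In this toy data the visitors come from several countries but share one browser, so the browser diversity is low. Values like these would form part of the feature vector handed to the classifier; training the SVM itself is not shown.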

Evaluation

Evaluation Dataset
388,098 redirection chains, collected over two months
• 34,011 final URLs
• 13,780 distinct user IP addresses per week
• 145 countries

Labeled dataset for training
• 2,533 redirection chains leading to 1,854 malicious URLs
• 2,466 redirection chains leading to 510 legitimate URLs

Analysis of the Classifier
SpiderWeb’s performance depends on the complexity of the redirection graph
• Complexity ≥ 6 causes no false positives (FPs) and no false negatives (FNs)
• Our dataset is limited → we discard graphs with complexity < 4
We need to accept a certain amount of FPs and FNs
Full URL grouping: 1.2% FP rate, 17% FN rate
Redirection-graph-specific features are the most important: without them, FNs rise to 67%
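
For reference, the two rates quoted above follow the usual confusion-matrix definitions. The sketch below shows the arithmetic only; the counts are invented for the example and are not taken from the evaluation.

```python
def error_rates(tp, fp, tn, fn):
    """Return (false positive rate, false negative rate).

    FP rate = FP / (FP + TN): share of legitimate items flagged as malicious.
    FN rate = FN / (FN + TP): share of malicious items that were missed.
    """
    fp_rate = fp / (fp + tn) if (fp + tn) else 0.0
    fn_rate = fn / (fn + tp) if (fn + tp) else 0.0
    return fp_rate, fn_rate


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
fp_rate, fn_rate = error_rates(tp=90, fp=5, tn=195, fn=10)
print(f"FP rate: {fp_rate:.1%}, FN rate: {fn_rate:.1%}")
```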

Detection in the Wild
3,549 redirection graphs with complexity ≥ 4

564 flagged as malicious → 3,368 URLs
778 URLs undetected by the AV vendor
• We could not confirm 1.5% of them
• Effectively complements state of the art

Comparison with Previous Work
A few previous systems leverage redirection information to detect malicious web pages
These systems also use other types of information
• WarningBird: uses Twitter profile information
• SURF: SEO specific
If this additional information is not present, SpiderWeb outperforms previous systems

Possible Use Cases
Offline detection (blacklist)
Online detection
Users get infected until the required “complexity” is reached
We performed a chronological experiment
SpiderWeb would have protected 93% of users
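
The idea behind a chronological experiment can be sketched as a replay of visits in time order: early visitors are exposed, and later visitors are protected once the graph is complex enough to classify. The sketch assumes that complexity means the number of distinct redirection chains seen so far, which the slides do not define, and that the graph is flagged as soon as it reaches the threshold. The function `protected_fraction` and the sample visits are hypothetical.

```python
def protected_fraction(visits, min_complexity=4):
    """Fraction of visits arriving after the graph became classifiable.

    `visits` is a time-ordered list of redirection chains (tuples of hosts)
    leading to the same malicious page. Assumption: complexity is the
    number of distinct chains observed before the current visit.
    """
    seen = set()
    protected = 0
    for chain in visits:
        if len(seen) >= min_complexity:
            protected += 1  # the graph was already classifiable
        seen.add(tuple(chain))
    return protected / len(visits) if visits else 0.0


# Ten made-up visits; the fifth visit contributes the fourth distinct chain.
visits = [
    ("s1", "bad"), ("s2", "bad"), ("s1", "bad"), ("s3", "bad"), ("s4", "bad"),
    ("s5", "bad"), ("s1", "bad"), ("s2", "bad"), ("s6", "bad"), ("s3", "bad"),
]
print(protected_fraction(visits))
```

In this toy replay the first five visitors are exposed and the remaining five are protected, which illustrates why a lower complexity requirement protects more users at the cost of classification accuracy.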

Discussion
Limitations
• Graphs with high complexity are required
• Groupings are not perfect
• Attackers might redirect users to legitimate pages

Attackers might make their redirections look legitimate
• Stop using cloaking (makes their pages easier to detect by previous work)
• Stop using hubs (raises the bar)

Conclusions
• We showed that malicious and legitimate redirection graphs differ
• We presented a system that analyzes redirection graphs to detect malicious web pages
• We showed that our system is effective, and complements existing systems

Questions?
gianluca@cs.ucsb.edu
@gianlucaSB


