MonkeySpider at Sicherheit 2008

Monkey-Spider
Detecting Malicious Websites with
Low-Interaction Honeyclients
Ali Ikinci Thorsten Holz Felix Freiling
ali.ikinci(at)contentkeeper.com
{holz|freiling}(at)informatik.uni-mannheim.de
presented at Sicherheit 2008
Outline
➔ Problem and related work
➔ Challenge and requirements analysis
➔ Honeypots and honeyclients
➔ Monkey-Spider and its limitations
➔ Preliminary results
➔ Key Findings
➔ Future Trends
Malicious Web sites ...
● Are Web sites that could threaten the
security of the client computers requesting them
● Even visiting such a site without any further
interaction can be a threat (so-called drive-by downloads)
● Such Web sites can ...
● host all sorts of malware and malicious code
● exploit browser vulnerabilities
● exploit vulnerabilities of other client software
● install backdoors, spyware or keyloggers
● steal confidential information
The Problem continued
● No comprehensive, up-to-date and free
database of threats on the Internet
● Any Web site could serve malicious content,
even trusted ones
● Manual malware analysis of malicious Web
sites is too slow and too expensive
● Even automatic analysis is often too slow to
cover millions of Web sites
Related Work
Categories:
a) Dedicated honeyclient
b) Browsing tool that uses normal Web users' PCs as honeyclients
c) Offline, honeyclient-like code analyzer
d) Browsing tool to control access to malicious Web sites
e) Database
[Table: classification of the following tools into categories a to e]
● Caffeine Monkey
● Capture-HPC
● Explabs LinkScanner
● Finjan SecureBrowsing
● Firekeeper
● HoneyC
● Malzilla
● McAfee SiteAdvisor
● Microsoft HoneyMonkey
● MITRE Honeyclient
● Monkey-Spider
● Phoneyc
● SHELIA
● SpyBye
● TrendMicro TrendProtect
● UW Spycrawler
● Web exploit finder
Challenge
● Fast and broad scope analysis of
millions of resources on the Internet
● Find actual threats and zero-day
exploits on the Internet
● Collect malicious code
● Allow various infection vectors
● Build a database with detailed relevant
information about threats
● Continuous monitoring of suspicious
resources
Monkey-Spider simplified
[Diagram: Internet, Crawler, Scanner a ... Scanner z, DB]
Requirements Analysis
• Overall Requirements
– Performance!
– Modularity and multi-threaded modules
– Expandability
– Scalability
– Logging and statistics
• Crawler
– Crawling policies
– Link extraction
– URL normalization
– Efficient storage
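The URL normalization requirement can be illustrated with a minimal sketch. It is not taken from the Monkey-Spider sources (the crawling itself is done by Heritrix); the function name `normalize_url` and the chosen rules are assumptions for illustration only. The idea is that a crawler maps equivalent spellings of a URL to one form so that the same resource is not queued twice.

```python
from urllib.parse import urlsplit, urlunsplit

# Default ports that can be dropped without changing the resource.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Return a canonical form of `url` (hypothetical helper).

    Rules applied: lower-case scheme and host, drop default ports,
    use "/" for an empty path, and drop the fragment. Userinfo is
    discarded and IPv6 literals are not handled in this sketch.
    """
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    netloc = host
    # Keep the port only if it differs from the scheme's default.
    if parts.port is not None and DEFAULT_PORTS.get(scheme) != parts.port:
        netloc = f"{host}:{parts.port}"
    path = parts.path or "/"
    # The fragment is never sent to the server, so it is removed.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


# Three spellings of the same resource collapse to a single queue entry.
candidates = [
    "HTTP://Example.COM:80/a/b?x=1#top",
    "http://example.com/a/b?x=1",
    "http://EXAMPLE.com/a/b?x=1#section",
]
unique = {normalize_url(u) for u in candidates}
print(unique)
```

A real crawler would typically apply further rules (percent-encoding case, dot-segment removal, query ordering); these are left out to keep the sketch short.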
Requirements Analysis
• Malware scanner
– Multiple malware scanners
– Support for automated malware analysis tools
– Client-side scripting support
• JavaScript, VBScript, ActionScript ...
– Client software support
• Media Players, Office Applications,
Acrobat Reader ...
Solution ideas
● Do not reinvent the wheel
● Use existing Free Software
● Use existing honeypot
techniques
● Use extensive prototyping
● Only superficial detection
Honeypots
● Honeypots are dedicated
deception devices
● Two types:
– server honeypots, or simply honeypots
– client honeypots, or honeyclients
• Both can be classified as:
– low-interaction honeypots or
– high-interaction honeypots
Our Solution - The Monkey-Spider
● A crawler-based low-interaction
honeyclient
● Started as a diploma thesis in 2006
● Available under the GPL at
http://monkeyspider.sourceforge.net
● Written in Python
● Makes use of Heritrix, PostgreSQL, ClamAV,
Web Services
● Command line tool set for the analysis of
crawled content
Monkey-Spider - Setup
Monkey-Spider - Queue Generation
● Provide starting point(s) (seeds)
utilizing different approaches:
– Web search seeders (MSN and Yahoo)
– (Spam) mail seeder
– Hosts file seeder
• Future seeders might include
– Monitoring seeder
– Typosquatting seeder
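As an illustration of a seeder, here is a minimal sketch of how a hosts file seeder might turn a blocklist into crawl seeds. It assumes the input follows the usual hosts-file layout (an IP address followed by one or more host names, with `#` comments); the function name `seeds_from_hosts_file` and the choice of `http://` seed URLs are hypothetical and not taken from the Monkey-Spider code.

```python
def seeds_from_hosts_file(text: str) -> list[str]:
    """Turn hosts-file content into a de-duplicated list of seed URLs."""
    seeds = []
    seen = set()
    for line in text.splitlines():
        # Strip comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # fields[0] is the IP address; the rest are host names.
        for host in fields[1:]:
            host = host.lower()
            if host in ("localhost", "localhost.localdomain") or host in seen:
                continue
            seen.add(host)
            seeds.append(f"http://{host}/")
    return seeds


sample = """\
# example blocklist in hosts-file format
127.0.0.1 localhost
127.0.0.1 bad.example.com   # ad server
0.0.0.0   Tracker.Example.net bad.example.com
"""
for seed in seeds_from_hosts_file(sample):
    print(seed)
```

The resulting list could then be handed to the crawler as its starting points, in the same way as seeds obtained from Web searches or spam mail.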
Monkey-Spider - Malware Scanner
● ARC files are unpacked and
examined
● Malware scanners are executed on the
crawled content
– Detected malware is stored for optional
further research
• Information about the malware is
stored in the database
[Figure: sample of extracted file names]
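The scanning step can be sketched roughly as follows, assuming the crawled content has already been unpacked into a directory and that the ClamAV command-line scanner `clamscan` is installed. The helper names `parse_clamscan_output` and `scan_directory` are hypothetical; the actual Monkey-Spider scripts may invoke the scanner differently.

```python
import subprocess


def parse_clamscan_output(output: str) -> list[tuple[str, str]]:
    """Extract (file path, signature name) pairs from clamscan output.

    clamscan reports a detection as "<path>: <signature> FOUND".
    """
    findings = []
    for line in output.splitlines():
        line = line.strip()
        if not line.endswith(" FOUND"):
            continue
        path, _, signature = line[: -len(" FOUND")].rpartition(": ")
        if path:
            findings.append((path, signature))
    return findings


def scan_directory(directory: str) -> list[tuple[str, str]]:
    """Run clamscan recursively over `directory` and return its detections."""
    result = subprocess.run(
        ["clamscan", "--recursive", "--infected", "--no-summary", directory],
        capture_output=True,
        text=True,
        check=False,
    )
    # clamscan exits with 0 (nothing found) or 1 (malware found);
    # any other code signals an error.
    if result.returncode not in (0, 1):
        raise RuntimeError(f"clamscan failed: {result.stderr.strip()}")
    return parse_clamscan_output(result.stdout)


# Parsing can be tried without ClamAV installed:
example = "crawl/a.exe: Trojan.Hotkey FOUND\ncrawl/b.html: OK\n"
print(parse_clamscan_output(example))
```

Each returned pair could then be written to the database together with the URL the file was fetched from, and the file itself kept for further research.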
Limitations for now
● Analysis is limited to the publicly indexable Web
● Only known malware is recognized and stored
● Drive-by download sites and heavily obfuscated
JavaScript are not detected
● Zero-day exploits are not recognized
● Full scan of the Web is not possible with Heritrix
(yet?)
● Separate crawl jobs are not yet aware when they
examine the same sites and content
Preliminary Results
● We ran various crawls over two months,
in March and April 2007
● We crawled for various topics and ran a hosts-file-based
crawl
● Defective crawl settings caused incomplete
preliminary results
[Figure: MIME-type distribution of crawled content]
Results
Top 10 malware types
HTML.MediaTickets.A         487
Trojan.Aavirus-1             92
Trojan.JS.RJump              91
Adware.Casino-3              22
Adware.Trymedia-2            12
Adware.Casino                10
Worm.Mytob.FN                 9
Dialer-715                    8
Adware.Casino-5               7
Trojan.Hotkey                 6
Top 10 malware sites
desktopwallpaperfree.com    487
waterfallscenes.com          92
pro.webmaster.free.fr        91
astalavista.com              15
bunnezone.com                14
oss.sgi.com                  12
ppd-files.download.com       12
888casino.com                11
888.com                      11
bigbenbingo.com              10
Topic maliciousness in %
Pirate        2.6
Wallpaper     2.5
Hosts file    1.7
Games         0.3
Celebrity     0.3
Adult         0.1
Total         1.0
Performance
● Measurements on a standard PC
● Measured as throughput rather than per Web site
● Crawl performance of 1 MB/sec
● Malware analysis (excluding crawling) takes
0.05 seconds per downloaded item and
2.35 seconds per downloaded,
compressed MB
● Resulting in about 3.35 seconds per analyzed
MB of content
● In comparison:
● other low-interaction honeyclients require a
minimum of 3 seconds per Web site
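The slide does not spell out how the 3.35 seconds are obtained. One reading that reproduces the figure, assuming crawling and analysis run one after the other, is to add the crawl time per MB (1 second at 1 MB/sec) to the analysis time per MB (2.35 seconds). The per-item cost of 0.05 seconds is left out of this sketch, and the helper `estimated_hours` is a hypothetical addition for illustration.

```python
# Figures taken from the slide.
CRAWL_MB_PER_SECOND = 1.0
ANALYSIS_SECONDS_PER_MB = 2.35

# Assumption: crawl and analysis are sequential, so their times add up.
crawl_seconds_per_mb = 1.0 / CRAWL_MB_PER_SECOND
total_seconds_per_mb = crawl_seconds_per_mb + ANALYSIS_SECONDS_PER_MB
print(f"{total_seconds_per_mb:.2f} seconds per analyzed MB")  # 3.35


def estimated_hours(total_mb: float) -> float:
    """Rough wall-clock estimate for crawling and scanning `total_mb` MB."""
    return total_mb * total_seconds_per_mb / 3600.0


print(f"{estimated_hours(1024):.2f} hours for 1 GB of content")
```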
Key Findings
● 1% of all examined Web sites are
malicious
● Adult Web sites are relatively harmless
● Most malware is spread through pirate and
wallpaper Web sites
● A Web site has to be completely crawled
and analyzed to gather representative
results
● The scope of the crawl has to be chosen
carefully
● We know very little about malicious Web
sites and their operators
Future Trends
● Attacks are shifting more and more
from the server to the client
● Client programs other than the Web client
are targeted more often, such as media
players, Flash and PDF interpreters
● Malware increasingly contains advanced techniques
for detecting honeypots, virtual machines and
anti-virus programs, which complicates
its detection
● Web exploitation kits, which provide an
infrastructure for Web-based attacks, are
on the rise
Credits
This presentation was made possible by the kind support of
http://www.contentkeeper.com
Thank you for your attention
Questions?
Further information is available at
http://monkeyspider.sourceforge.net