2. Agenda
● Introduction
● Bots and botnets: short walk-through
● Taming botnets: Detection and Evasion
● Our approach
● Case studies
● Conclusion
● Disclaimer:
We steal our images
from Google Images :)
3. Introduction
● Why are we doing this research?
● Objectives
● Our data sources
● Our environment: a bunch of code in Node.js
and Python, a customized sandboxing platform
(Cuckoo-based), and data indexed in Solr
4. Introduction: bots
● “Bot”: a software program installed on target
machine(s) in order to use the machine's
computational/network resources or to
collect information
● A typical bot is controlled by an external party
and therefore needs a communication channel
in order to receive commands and pass
information back
● Bots typically are used for malicious purposes ;-)
5. Introduction: bots (lifecycle)
● Installation (infection) phase: often by means of
a software exploit or a social engineering
technique (fake antivirus, fake software update)
● Post-infection phase: communication (C&C,
peer-to-peer, etc.)
6. Introduction
● Our basic assumption is that a bot needs to be
able to communicate back in order to be useful.
● Our analysis is primarily “black-box”: we observe the
network traffic of a large network infrastructure in
order to identify possible infections and
“communication” links
● We also utilize sandboxing techniques to
observe behavior (mainly from the network side)
● We do not attempt to reverse engineer
(manually or automatically) botnet software
7. Botnets
● Infection vectors → often target end-user
machines (clients) in large numbers by
exploiting a software vulnerability in the
browser or related components
● C&C communication:
● Remember IRC bots? :)
● over HTTP (most common)
● Proprietary protocol
● Centralized or P2P infrastructure
10. How do you get bots on your
machine? ;-)
● Compromised servers: most widespread, often
through silly vulns (e.g. WordPress!), but
high-profile web sites are also affected, or domains
are taken over (DNS poisoning and more)
● Placing a JavaScript iframe on a compromised
high-traffic machine is way more profitable than
defacing it (hacktivism is only for hippies? ;)
11. How do you get bots (pt 2)
● SEO poisoning/manipulation.
12. How do you get bots (pt 3)
● Advertisements and malvertisements: whole
new ecosystem:
OpenX is a huge security hole ;)
13. Anyways
● Once infected, the bot talks back...
Let's look at some real-life cases. (The data is very
recent, mostly from the past few months.)
15. Carberp
● Bot Infection: Drive-By-HTTP
● Payload and intermediate malware domains: normal, just
registered/DynDNS
● Distributed via: many, many compromised web sites; the top
score was more than 100 compromised resources detected in one
week.
● C&C domains are usually generated, but see some special cases
below ;-).
● C&C and malware domains are located in the same AS (from
the bot's point of view). Easy to detect.
● Typical bot activity: Mass HTTP Post
19. Detection during infection and by
post-infection activity
● Infection: executable transfer from a just-registered
domain, for example lifenews-sport.org, or from
DynDNS domains like
uphchtxmji.homelinux.com
● Updates: executable transfer from a just-registered
or DynDNS domain
● Post-infection activity: mass HTTP POST to
generated domains like
n87e0wfoghoucjfe0id.org; URLs end with
different extensions
20. Netprotocol.exe
● Bot Infection was: Drive-By-FTP,
now: Drive-By-FTP, Drive-By-HTTP
● Payload and intermediate malware domains: normal, obfuscated
● Distributed via: compromised web sites
● C&C domains are usually generated; many domains are in the .be zone.
● C&C and malware domains are located in different ASes. The bot
updates its payload via HTTP
● Typical bot activity: HTTP POST, payload updates via HTTP.
22. Attack analysis
- A script from www.java.com is used during the attack.
- The applet exp.jar is loaded via FTP
- The FTP server's IP address is obfuscated to avoid
detection
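One way to undo this kind of obfuscation is to treat an all-digit host name as a 32-bit integer and convert it back to dotted-quad form. The sketch below is a minimal illustration, not the tooling used in this research; the function name `decode_numeric_host` is hypothetical, and the sample URL is the one shown on the detection slide for this bot.

```python
import ipaddress
from urllib.parse import urlsplit


def decode_numeric_host(url):
    """Return the dotted-quad form of an all-digit URL host, or None."""
    host = urlsplit(url).hostname
    if not host or not (host.isascii() and host.isdigit()):
        return None  # ordinary host name or dotted-quad address
    value = int(host)
    if value > 0xFFFFFFFF:
        return None  # too large to be an IPv4 address
    return str(ipaddress.IPv4Address(value))


print(decode_numeric_host("ftp://3645456330/6/e.jar"))  # 217.73.63.202
```

A URL whose host is a bare integer is rare in normal browsing, so the mere presence of such a host in FTP or HTTP logs is probably worth an alert on its own.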
24. Activity example
Date/Time           2012-04-29 02:05:48 MSD    2012-04-29 02:06:08 MSD
Tag Name            HTTP_Post                  HTTP_Post
Target IP Address   217.73.60.107              208.73.210.29
:server             rugtif.be                  eksyghskgsbakrys.com
:URL                /check_system.php          /check_system.php
Domain registered: 2012-04-21
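The "just registered" signal can be checked mechanically by comparing the WHOIS creation date with the date of the observed event. This is a minimal sketch using the two dates from the slide above; the 30-day threshold and the function names are assumptions chosen for illustration.

```python
from datetime import date


def domain_age_days(registered, seen):
    """Number of days between domain registration and the observed event."""
    return (seen - registered).days


def is_recently_registered(registered, seen, max_age_days=30):
    """True if the domain was at most max_age_days old when it was seen.

    max_age_days is an assumed threshold, not a value from the talk.
    """
    age = domain_age_days(registered, seen)
    return 0 <= age <= max_age_days


registered = date(2012, 4, 21)  # "Domain registered" on the slide
first_post = date(2012, 4, 29)  # date of the HTTP_Post events
print(domain_age_days(registered, first_post))  # 8
print(is_recently_registered(registered, first_post))  # True
```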
25. On-host detection and activity
Payload: usually netprotocol.exe, located in
Users\USER_NAME\AppData\Roaming,
which periodically downloads other malware
Further payload is loaded via HTTP:
http://64.191.65.99/view_img.php?c=4&k=a4422297a462ec0f01b83bc96068e064
26. Detection by AV
Sample from May 09 2012: detection ratio 1/42
● (demos, recorded as videos)
27. Detection during infection and by
post-infection activity
● Infection: .jar and .dat files downloaded via FTP; server name
= obfuscated IP address, for example ftp://3645456330/6/e.jar
The Java version is sent as the FTP password, for example Java1.6.0_29@
● Updates: executable transfer from some Internet host,
for example GET http://184.82.0.35/f/kwe.exe
● Post-infection activity: mass HTTP POST to normal and
generated domains with the URL check_system.php
09:04:46 POST http://hander.be/check_system.php
09:05:06 POST http://aratecti.be/check_system.php
09:06:48 POST http://hander.be/check_system.php
09:07:11 POST http://aratecti.be/check_system.php
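The log lines above show the same URL path being POSTed to several unrelated domains within minutes. A simple way to surface that pattern is to group POST requests by path and count the distinct hosts. The sketch below assumes a three-field log format (time, method, URL) like the excerpt above; the function names and the `min_hosts` threshold are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit

LOG = """\
09:04:46 POST http://hander.be/check_system.php
09:05:06 POST http://aratecti.be/check_system.php
09:06:48 POST http://hander.be/check_system.php
09:07:11 POST http://aratecti.be/check_system.php
"""


def hosts_per_post_path(lines):
    """Map each POSTed URL path to the set of hosts it was sent to."""
    paths = defaultdict(set)
    for line in lines:
        fields = line.split()
        if len(fields) != 3 or fields[1] != "POST":
            continue  # skip anything that is not "time POST url"
        url = urlsplit(fields[2])
        paths[url.path].add(url.hostname)
    return paths


def shared_callback_paths(lines, min_hosts=2):
    """Paths POSTed to at least min_hosts distinct hosts (assumed threshold)."""
    return {
        path: sorted(hosts)
        for path, hosts in hosts_per_post_path(lines).items()
        if len(hosts) >= min_hosts
    }


print(shared_callback_paths(LOG.splitlines()))
# {'/check_system.php': ['aratecti.be', 'hander.be']}
```

On real traffic, common paths such as /index.php would need a whitelist or a higher threshold to keep false positives down.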
28. Noproblemslove.com,
whoismistergreen.com, etc...
● Bot Infection: Drive-By-HTTP
● Payload and intermediate malware
domains: normal/DynDNS
● Distributed via: compromised web sites.
● C&C domains: normal.
● C&C and malware domains are located in
different ASes. Sophisticated attack scheme.
Timeout before activity.
● Typical bot activity: mass HTTP POST
30. Interesting domains from range
184.82.149.178-184.82.149.180 (Feb 2012)
Domain Name IP
www.google-analylics.com 184.82.149.179
google-anatylics.com 184.82.149.178
www.google-analitycs.com 184.82.149.180
webmaster-google.ru 184.82.149.178
paged2.googlesyndlcation.com 184.82.149.179
googlefilter.ru 184.82.149.179
rambler-analytics.ru 184.82.149.179
site-yandex.net 184.82.149.180
www.yandex-analytics.ru 184.82.149.178
googles.4pu.com 184.82.149.178
googleapis.www1.biz 184.82.149.178
syn1-adriver.ru 184.82.149.178
31. Hoster range and AS
www.google-analylics.com looks good,
BUT
Google, Rambler and Yandex together on
184.82.149.176/29?
The hoster range and autonomous system (AS)
are useful when you analyze suspicious events.
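The check described here can be sketched as: group observed domains by the small network they resolve into, then see whether names imitating different brands land in the same range. The sample data is a subset of the table on the previous slide; the brand keyword list, the function names and the choice of a /29 grouping are assumptions for illustration.

```python
import ipaddress
from collections import defaultdict

OBSERVED = {
    "www.google-analylics.com": "184.82.149.179",
    "webmaster-google.ru": "184.82.149.178",
    "rambler-analytics.ru": "184.82.149.179",
    "site-yandex.net": "184.82.149.180",
}
BRANDS = ("google", "rambler", "yandex")  # assumed keyword list


def group_by_prefix(domain_ips, prefix_len=29):
    """Group domains by the network of the given prefix length."""
    groups = defaultdict(list)
    for domain, ip in domain_ips.items():
        network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        groups[str(network)].append(domain)
    return dict(groups)


def brands_sharing_a_range(domain_ips, brands=BRANDS, prefix_len=29):
    """Networks that host names mentioning more than one brand."""
    result = {}
    for network, domains in group_by_prefix(domain_ips, prefix_len).items():
        seen = {b for b in brands for d in domains if b in d}
        if len(seen) > 1:
            result[network] = sorted(seen)
    return result


print(brands_sharing_a_range(OBSERVED))
# {'184.82.149.176/29': ['google', 'rambler', 'yandex']}
```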
34. What's common
whoismistergreen.com
IP address: 213.5.68.105
Created: 2011-07-26
Registrant Name: JOHN ABRAHAM
Address: ul. Dubois 119
City: Lodz

noproblemslove.com
IP address: 213.5.68.105
Created: 2011-12-07
Registrant Contact: Whois Privacy Protection Service, Whois Agent, gmvjcxkxhs@whoisservices.cn

patr1ckjane.com
IP was: 176.65.166.28
IP now: 213.5.68.105
Created: 2011-07-21
Registrant Name: patrick jane
Address: ul. Dubois 119
City: Lodz

noproblemsbro.com
IP address: 176.65.166.28
Created: 2011-12-07
Registrant Contact: Whois Privacy Protection Service, Whois Agent, gmvjcxkxhs@whoisservices.cn
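The overlap between these four domains can be found automatically by indexing every (attribute, value) pair and linking domains that share one. The sketch below uses the values from this slide; the record layout and the function names are assumptions, not the authors' implementation.

```python
from collections import defaultdict

ADDRESS = ("address", "ul. Dubois 119, Lodz")
AGENT = ("email", "gmvjcxkxhs@whoisservices.cn")
RECORDS = {
    "whoismistergreen.com": {("ip", "213.5.68.105"), ADDRESS},
    "noproblemslove.com": {("ip", "213.5.68.105"), AGENT},
    "patr1ckjane.com": {("ip", "176.65.166.28"), ("ip", "213.5.68.105"), ADDRESS},
    "noproblemsbro.com": {("ip", "176.65.166.28"), AGENT},
}


def shared_attributes(records):
    """Attribute/value pairs that appear on more than one domain."""
    index = defaultdict(set)
    for domain, attrs in records.items():
        for attr in attrs:
            index[attr].add(domain)
    return {attr: doms for attr, doms in index.items() if len(doms) > 1}


def clusters(records):
    """Connected groups of domains linked by at least one shared attribute."""
    neighbours = defaultdict(set)
    for domains in shared_attributes(records).values():
        for d in domains:
            neighbours[d] |= domains - {d}
    seen, result = set(), []
    for start in records:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            d = stack.pop()
            if d not in component:
                component.add(d)
                stack.extend(neighbours[d] - component)
        seen |= component
        result.append(component)
    return result


for group in clusters(RECORDS):
    print(sorted(group))
```

With these records all four domains end up in one group, linked through the shared IP addresses, postal address and WHOIS agent e-mail.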
35. Detection during infection and by
post-infection activity
● Infection: executable transfer from just-registered
or DynDNS domains, like
fx58.ddns.us
● Updates: application/octet-stream bulk data
load from the C&C
● Post-infection activity: mass HTTP POST to
seemingly normal domains, e.g.
noproblemslove.com,
whoismistergreen.com, etc.
38. Cross-correlation data sources
● WHOIS (including team cymru whois)
● Our own DNS index, also talking to ISC about
possibilities of data swaps
● Sandbox farm (mainly to detect compromised
websites automagically and study behavior)
● Public “malicious IP address” databases.
● Public reputation (i.e. ToS) databases.
● (still work in progress)
39. Detection
● Manual and Automated
● Automated detection is largely based on
analysis of network traffic:
● Anomaly detection
● Pattern-based analysis
● Signatures (snort!)
● Traffic profiling (DNS traffic profiling, HTTP traffic
profiling etc)
40. Detection
● Detecting malicious botnet activity is very
popular in academia (interesting problem).
● In our research we do not claim extreme
novelty but rather will demonstrate our
experience and a few practical solutions that
seem to work :-)
42. Detection: interesting bits
● Botnet detection has evolved from a pattern-based
approach (hardcoding bot CMD patterns and
capturing them with Snort) to a complex field of
generic detection of automated “call-back”
communication channels.
43. Detection
● Different “callback” methods, as seen in the
wild, possess interesting properties, such as:
● Large number of failed DNS requests
● Large number of DNS requests for IP addresses,
which are offline
● Connection attempts to mostly dead IP addresses
● Traffic pattern (differs from regular browsing)
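The first property in the list, a large number of failed DNS requests, can be turned into a per-client ratio. The sketch below assumes DNS events are already available as (client, query name, response code) tuples; that input format, the thresholds and the sample events are made up for illustration.

```python
from collections import Counter


def nxdomain_stats(events):
    """Return {client: (total_queries, failed_queries)}."""
    total, failed = Counter(), Counter()
    for client, _qname, rcode in events:
        total[client] += 1
        if rcode == "NXDOMAIN":
            failed[client] += 1
    return {client: (total[client], failed[client]) for client in total}


def flag_clients(events, min_queries=5, min_ratio=0.5):
    """Clients with enough queries and a high share of failed lookups.

    Both thresholds are assumptions and need tuning on real traffic.
    """
    flagged = []
    for client, (total, failed) in nxdomain_stats(events).items():
        if total >= min_queries and failed / total >= min_ratio:
            flagged.append(client)
    return sorted(flagged)


# Synthetic events: one noisy client, one normal client.
events = (
    [("10.0.0.5", f"gen{i}.example", "NXDOMAIN") for i in range(5)]
    + [("10.0.0.5", "www.example.org", "NOERROR")]
    + [("10.0.0.9", "www.example.org", "NOERROR")] * 5
    + [("10.0.0.9", "typo.example", "NXDOMAIN")]
)
print(flag_clients(events))  # ['10.0.0.5']
```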
44. Cat and mouse game
● Of course, all of this is easy to evade once you
know the method. But security is always a
'cat-n-mouse' game ;-)
45. Detection
● Detecting botnet activities by analyzing DNS
traffic
● Analyzing DNS names (dictionary comparison,
alphanumeric characters, detection of “generated”
domain names by similarities/patterns)
● Analyzing failed DNS queries
● DNS “ranking” (based on whois information)
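As a rough illustration of the name analysis in the list above, the sketch below scores a domain label by its share of digits and its share of vowels among letters. It is a deliberately simple stand-in for the dictionary and pattern comparisons mentioned on the slide; the thresholds and function names are assumptions, and the sample domains are taken from earlier slides.

```python
VOWELS = set("aeiou")


def second_level_label(domain):
    """Naive label extraction: the part right before the TLD.

    No public-suffix handling, so names like x.homelinux.com need the
    caller to pick the right label.
    """
    parts = domain.lower().rstrip(".").split(".")
    return parts[-2] if len(parts) >= 2 else parts[0]


def name_features(label):
    """Share of digits in the label and share of vowels among its letters."""
    letters = [c for c in label if c.isalpha()]
    digits = [c for c in label if c.isdigit()]
    vowels = [c for c in letters if c in VOWELS]
    return {
        "digit_ratio": len(digits) / len(label) if label else 0.0,
        "vowel_ratio": len(vowels) / len(letters) if letters else 0.0,
    }


def looks_generated(domain, max_digit_ratio=0.1, min_vowel_ratio=0.2):
    """Flag labels with many digits or few vowels (assumed thresholds)."""
    f = name_features(second_level_label(domain))
    return f["digit_ratio"] > max_digit_ratio or f["vowel_ratio"] < min_vowel_ratio


for name in ("n87e0wfoghoucjfe0id.org", "eksyghskgsbakrys.com", "noproblemslove.com"):
    print(name, looks_generated(name))
```

Short, pronounceable C&C names such as rugtif.be pass this check, which is one reason the failed-query and WHOIS-based signals in the same list are still needed.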
49. Detection
● Further step: cross-correlation to domain
names which have the same WHOIS attributes
● Sandboxing (we use a modified version of
Cuckoo Sandbox with user event simulation; not
perfect, but it works)
● Challenges:
– Simulate complex user behavior (mouse movements)
– Simulate complex user browsing pattern (visiting X with
search engine (image?) as referer)
51. Detection
(visualization)
● Parallel coordinates (see also the recent talk by
Alexandre Dulaunoy from CIRCL.LU and
Sebastien Tricaud from Picviz Labs at
CanSecWest)
52. Detection
● (demos, let's look at some videos :)
53. Conclusions
● Detection is still trivial, but keep your methods
“private” ;-)
● Detecting 'advanced' botnets (name your
favourite traffic profiling evasion method!) is out
of the question here, unless such evasion becomes
widespread
● Cat and mouse game is still fun! ;-)
54. Tips and recommendations
● For infected machines: boot from clean media
and periodically do OFFLINE AV checking
● Monitor network traffic for any unusual activity
● Default-deny firewall policies + block any active
executable content
55. Questions
● Contact us at:
● fygrave@gmail.com
● vladimir.b.kropotov@gmail.com
http://github.com/fygrave/dnslyzer for some code