SlideShare uma empresa Scribd logo
1 de 38
Baixar para ler offline
Scrutinizing a Country using Passive DNS and Picviz
    or how to analyze big dataset without loosing your mind




                                         Sebastien Tricaud,
                                        Alexandre Dulaunoy


                                          March 10, 2012
Disclaimer



• Passive DNS is a technique to collect only valid answers from
  caching/recursive nameservers and authoritative nameservers
• By its design, privacy is preserved (e.g. no source IP addresses
  from resolvers are captured1 )
• The research is done in the sole purpose to detect malicious
  IP/domains or content to better protect users




   1
     Except
2 of 38
              if the web application abused DNS answers to track back their users.
IP overview - some properties

                                                                                     MX
          Abuse                   CSIRT
          Helper                  RTIR


                                                                                                  A



                    has been                                                      has the DNS
                   reported to                       an IP address                 properties



                                 dshield.org                                                    CNAME




                                                                                      NS

                   AMaDa
                   Blocklist             has been                    announced
                                        tracked by                   by an ASN


                                                                                    peering
                                                                                     with
                            Zeus
                           Tracker
                                                                      stability




3 of 38
Introduction or Problem Statement



• Datasets become larger and larger (even for a small country)
• Malicious (and non malicious) activities are distributed across IP
  addresses or domain names
• Time to live of Internet resources (especially the malicious ones) is
  low
• → Attackers abuse and benefit from these facts




4 of 38
Passive DNS

                                                                                     MX
          Abuse                   CSIRT
          Helper                  RTIR


                                                                                                  A



                    has been                                                      has the DNS
                   reported to                       an IP address                 properties



                                 dshield.org                                                    CNAME




                                                                                      NS

                   AMaDa
                   Blocklist             has been                    announced
                                        tracked by                   by an ASN


                                                                                    peering
                                                                                     with
                            Zeus
                           Tracker
                                                                      stability




5 of 38
Storing Passive DNS or how to do trial and error?



• Implementing the storage of a Passive DNS can be challenging
• Starting from standard RDBMS to key-value store
• We learned to hate2 hard disk drive and to love random access
  memory
• Loving memory is great especially when it’s now cheap and
  addressable in 64bits




   2
     exception
6 of 38
                 → only used for data store snapshot
A minimalist and scalable implementation of a passive
DNS




• Our passive DNS implementation is a toolkit for experimenting
  classification or visualization techniques

7 of 38
Redis - Passive DNS data structure




8 of 38
Redis - a sample query

redis> SMEMBERS "r:www.linkedin.com:5"
1) "dub.linkedin.com"
redis> SMEMBERS "r:dub.linkedin.com:1"
1) "91.225.248.80"
redis> SMEMBERS "v:dub.linkedin.com"
1) "www.linkedin.com"
redis> GET "s:www.linkedin.com:dub.linkedin.com"
"1331057300"
redis> GET "l:www.linkedin.com:dub.linkedin.com"
"1331057412"
redis> GET "o:www.linkedin.com:dub.linkedin.com"
"3"

9 of 38
BGP Ranking on IP attributes

                                                                                      MX
           Abuse                   CSIRT
           Helper                  RTIR


                                                                                                   A



                     has been                                                      has the DNS
                    reported to                       an IP address                 properties



                                  dshield.org                                                    CNAME




                                                                                       NS

                    AMaDa
                    Blocklist             has been                    announced
                                         tracked by                   by an ASN


                                                                                     peering
                                                                                      with
                             Zeus
                            Tracker
                                                                       stability




10 of 38
AS Ranking Calculation


     Formula
                                     #s                    
                                 s=1
                                     ( Occ Simpact ) 
                    ASrank   =1+
                                                    
                                        ASsize       


      • Number of malicious occurrence per unique IP (Occ)
      • Weight of the blacklist source (Simpact )
      • Grand total of IP addresses announced by the ASN (ASsize )
      • Each iteration of the Occ sum is saved (e.g. to discard a
           source blacklist from the ranking calculation)

11 of 38
Why Ranking ISPs?



• CSIRTs can assess the level of trust per ISPs (e.g. know to host
   drive-by-download website, reactive to abuse handling, ...)
• Improve assessment between ISPs (e.g. IP peering policies)
• Detecting common suspicious activities among ISPs/ASN
• Can be used as an additional weight factor to abuse handling (e.g.
   detect outliers in large set of IP addresses)




12 of 38
A daily use: ease your log analysis



• 300 million lines of proxy logs? You have 30 minutes to find out
   what’s happened? or discarding the noise of ”known” malware
   communication?
• Prefix the ranking AS15169,1.00273578519859,74.125.... to the
   log file
• logs-ranking → sort -r -g -t”,” -k2 proxy.log-ranked




13 of 38
A daily use: ease your memory dump analysis



• During large incident, we got many memory dumps in a single day
• Dumping all the memory per process and we extracted all URLs
   and IPs from each memory dump
• Ranking URLs and IPs, and analyzing the processes with the
   higher malicious rank
• Ranking can be used for a lot of reverse analysis techniques (from
   finding malicious process to artefacts of antivirus in memory)




14 of 38
Ranked domains - Where Picviz can help

• Now, we have 50 millions lines of ranked hostname...

...
www.stopacta.info. = 1.0
www.vista-care.com. = 1.0
breadworld.com. = 1.00002301767
o-o.resolver.A.B.C.D.5xevqnwsds5zdq34.metricz.
l.google.com. = 1.00303388648
www.thechinagarden.com. = 1.00009822292
smtp10.dti.ne.jp. = 1.00010586629
...


15 of 38
Detection of multi-homed compromised systems




• Regularly malicious links are posted on compromised systems
• Ranking increased for the ASN and its announced subnet
• Passive DNS collects associated hostnamed to a subnet (usually
   filling the gap in the subnet)
• But how to find thoses cases?




16 of 38
Ooops wrong visualization




• For the ones who were at the party ;-)



17 of 38
Why visualization?




• Understand big data
• Find stuff we cannot guess




18 of 38
Problem with usual visualizations




• Limited
  ◦ Top 10 (!)
  ◦ Just to display tendencies. . .
  ◦ Hide most of information
• Hard to get meaningful/useful information
• Folks mostly use it to display stuff in a different way
19 of 38
Problem with usual visualizations




20 of 38
Choosing Parallel Coordinates




• Display as much dimensions wanted (yes, as many)
• Display as much data wanted (I mean it!)
21 of 38
Interesting patterns




22 of 38
Dataset




23 of 38
Picvizing the whole dataset




24 of 38
Splitting the URL



• We want to get the TLD, subdomains etc. . .
• A regex does not work: 192.168.0.1, http://localhost, google.com,
  www.slashdot.org:80, . . .
• We simply put them according to their ascii value
    ◦ a is at the axis bottom
    ◦ zzzzzzzzzzzzzzzzzzz{500} is on the very top




25 of 38
Picviz with the whole url split




26 of 38
Reward: highest is youtube




27 of 38
Subdomain entropy


Only one sub-domain has an entropy3 >4.8




   3
     Shannon
28 of 38
               entropy
Subdomain entropy


Only one sub-domain has an entropy4 >4.8




   4
     Shannon
29 of 38
               entropy
Scatter plot - finding outliers




30 of 38
Scatter plot - finding outliers - covert channel?




030066363663643937306531[..].36393764313333653763.lbl8.mailshell.net
t10000.u1318235395163.s203679668[..]-1329.zv6lit-null.zrdtd-1311.zr6td-
null.results.potaroo.net
03003064303831663965386[..].64306561343837346533.lbl8.mailshell.net

31 of 38
Searching for Zeus
Using the broad Polish CERT regex
[a-z0-9]{32,48}.(ru|com|biz|info|org|net)




• We get some cool domains:
  ◦ cg79wo20kl92doowfn01oqpo9mdieowv5tyj.com
  ◦ eef795a4eddaf1e7bd79212acc9dde16.net
• but more important we got a visualization profile to find outliers
   not matching the regexp
32 of 38
Zoom on NS answer domain




33 of 38
Back to the global view




• request domain: ns2.speed-tube.net
34 of 38
Investigating ns2.speed-tube.net



• Grab cool stuff that are not ranked like:
   adsforadsense.co.cc;1.0;ns2.speed-tube.net;1.0
   extra-tube.net;1.0001125221;ns2.speed-tube.net;1.0 ...
• A recurring (reactivated or cached) malicious site:
   adsforadsense.co.cc rogue safebrowsing.clients.google.com
   20110315 20110125




35 of 38
Conclusion



• Passive DNS is an infinite source of security data mining
• The toolkit is now available on github and this is the basis for
   more research
• (adequate) Visualization is an appropriate way to discover
   unknown malicious or suspicious services
• This finally helps CSIRTs to act earlier on the incidents




36 of 38
Free Software



• BGP Ranking software
   https://www.github.com/CIRCL/BGP-Ranking -
   http://bgpranking.circl.lu/
• Passive DNS toolkit
   https://www.github.com/adulau/pdns-viz/ - first commit for
   CanSecWest - more modules to come
• Domain Classification
   https://www.github.com/adulau/DomainClassifier/




37 of 38
Q&A




• @adulau - alexandre.dulaunoy@circl.lu
• @tricaud - sebastien@honeynet.org




38 of 38

Mais conteúdo relacionado

Semelhante a Scrutinizing a Country using Passive DNS and Picviz

Cassandra Performance and Scalability on AWS
Cassandra Performance and Scalability on AWSCassandra Performance and Scalability on AWS
Cassandra Performance and Scalability on AWSAdrian Cockcroft
 
AWS reInvent: Building an enterprise class backup and archival solution on AWS
AWS reInvent: Building an enterprise class backup and archival solution on AWSAWS reInvent: Building an enterprise class backup and archival solution on AWS
AWS reInvent: Building an enterprise class backup and archival solution on AWSDruva
 
DFX Architecture for High-performance Multi-core Microprocessors
DFX Architecture for High-performance Multi-core MicroprocessorsDFX Architecture for High-performance Multi-core Microprocessors
DFX Architecture for High-performance Multi-core MicroprocessorsIshwar Parulkar
 
DNS Measurements
DNS MeasurementsDNS Measurements
DNS MeasurementsAFRINIC
 

Semelhante a Scrutinizing a Country using Passive DNS and Picviz (7)

Cassandra Performance and Scalability on AWS
Cassandra Performance and Scalability on AWSCassandra Performance and Scalability on AWS
Cassandra Performance and Scalability on AWS
 
ION Hangzhou - Why Deploy DNSSEC?
ION Hangzhou - Why Deploy DNSSEC?ION Hangzhou - Why Deploy DNSSEC?
ION Hangzhou - Why Deploy DNSSEC?
 
ION Mumbai - Shailesh Gupta: Business Case for IPv6 and DNSSEC
ION Mumbai - Shailesh Gupta: Business Case for IPv6 and DNSSECION Mumbai - Shailesh Gupta: Business Case for IPv6 and DNSSEC
ION Mumbai - Shailesh Gupta: Business Case for IPv6 and DNSSEC
 
AWS reInvent: Building an enterprise class backup and archival solution on AWS
AWS reInvent: Building an enterprise class backup and archival solution on AWSAWS reInvent: Building an enterprise class backup and archival solution on AWS
AWS reInvent: Building an enterprise class backup and archival solution on AWS
 
DFX Architecture for High-performance Multi-core Microprocessors
DFX Architecture for High-performance Multi-core MicroprocessorsDFX Architecture for High-performance Multi-core Microprocessors
DFX Architecture for High-performance Multi-core Microprocessors
 
Backtrack Manual Part3
Backtrack Manual Part3Backtrack Manual Part3
Backtrack Manual Part3
 
DNS Measurements
DNS MeasurementsDNS Measurements
DNS Measurements
 

Último

Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 

Último (20)

Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 

Scrutinizing a Country using Passive DNS and Picviz

  • 1. Scrutinizing a Country using Passive DNS and Picviz or how to analyze big dataset without loosing your mind Sebastien Tricaud, Alexandre Dulaunoy March 10, 2012
  • 2. Disclaimer • Passive DNS is a technique to collect only valid answers from caching/recursive nameservers and authoritative nameservers • By its design, privacy is preserved (e.g. no source IP addresses from resolvers are captured1 ) • The research is done in the sole purpose to detect malicious IP/domains or content to better protect users 1 Except 2 of 38 if the web application abused DNS answers to track back their users.
  • 3. IP overview - some properties MX Abuse CSIRT Helper RTIR A has been has the DNS reported to an IP address properties dshield.org CNAME NS AMaDa Blocklist has been announced tracked by by an ASN peering with Zeus Tracker stability 3 of 38
  • 4. Introduction or Problem Statement • Datasets become larger and larger (even for a small country) • Malicious (and non malicious) activities are distributed across IP addresses or domain names • Time to live of Internet resources (especially the malicious ones) is low • → Attackers abuse and benefit from these facts 4 of 38
  • 5. Passive DNS MX Abuse CSIRT Helper RTIR A has been has the DNS reported to an IP address properties dshield.org CNAME NS AMaDa Blocklist has been announced tracked by by an ASN peering with Zeus Tracker stability 5 of 38
  • 6. Storing Passive DNS or how to do trial and error? • Implementing the storage of a Passive DNS can be challenging • Starting from standard RDBMS to key-value store • We learned to hate2 hard disk drive and to love random access memory • Loving memory is great especially when it’s now cheap and addressable in 64bits 2 exception 6 of 38 → only used for data store snapshot
  • 7. A minimalist and scalable implementation of a passive DNS • Our passive DNS implementation is a toolkit for experimenting classification or visualization techniques 7 of 38
  • 8. Redis - Passive DNS data structure 8 of 38
  • 9. Redis - a sample query redis> SMEMBERS "r:www.linkedin.com:5" 1) "dub.linkedin.com" redis> SMEMBERS "r:dub.linkedin.com:1" 1) "91.225.248.80" redis> SMEMBERS "v:dub.linkedin.com" 1) "www.linkedin.com" redis> GET "s:www.linkedin.com:dub.linkedin.com" "1331057300" redis> GET "l:www.linkedin.com:dub.linkedin.com" "1331057412" redis> GET "o:www.linkedin.com:dub.linkedin.com" "3" 9 of 38
  • 10. BGP Ranking on IP attributes MX Abuse CSIRT Helper RTIR A has been has the DNS reported to an IP address properties dshield.org CNAME NS AMaDa Blocklist has been announced tracked by by an ASN peering with Zeus Tracker stability 10 of 38
  • 11. AS Ranking Calculation Formula  #s   s=1 ( Occ Simpact )  ASrank =1+   ASsize  • Number of malicious occurrence per unique IP (Occ) • Weight of the blacklist source (Simpact ) • Grand total of IP addresses announced by the ASN (ASsize ) • Each iteration of the Occ sum is saved (e.g. to discard a source blacklist from the ranking calculation) 11 of 38
  • 12. Why Ranking ISPs? • CSIRTs can assess the level of trust per ISPs (e.g. know to host drive-by-download website, reactive to abuse handling, ...) • Improve assessment between ISPs (e.g. IP peering policies) • Detecting common suspicious activities among ISPs/ASN • Can be used as an additional weight factor to abuse handling (e.g. detect outliers in large set of IP addresses) 12 of 38
  • 13. A daily use: ease your log analysis • 300 million lines of proxy logs? You have 30 minutes to find out what’s happened? or discarding the noise of ”known” malware communication? • Prefix the ranking AS15169,1.00273578519859,74.125.... to the log file • logs-ranking → sort -r -g -t”,” -k2 proxy.log-ranked 13 of 38
  • 14. A daily use: ease your memory dump analysis • During large incident, we got many memory dumps in a single day • Dumping all the memory per process and we extracted all URLs and IPs from each memory dump • Ranking URLs and IPs, and analyzing the processes with the higher malicious rank • Ranking can be used for a lot of reverse analysis techniques (from finding malicious process to artefacts of antivirus in memory) 14 of 38
  • 15. Ranked domains - Where Picviz can help • Now, we have 50 millions lines of ranked hostname... ... www.stopacta.info. = 1.0 www.vista-care.com. = 1.0 breadworld.com. = 1.00002301767 o-o.resolver.A.B.C.D.5xevqnwsds5zdq34.metricz. l.google.com. = 1.00303388648 www.thechinagarden.com. = 1.00009822292 smtp10.dti.ne.jp. = 1.00010586629 ... 15 of 38
  • 16. Detection of multi-homed compromised systems • Regularly malicious links are posted on compromised systems • Ranking increased for the ASN and its announced subnet • Passive DNS collects associated hostnamed to a subnet (usually filling the gap in the subnet) • But how to find thoses cases? 16 of 38
  • 17. Ooops wrong visualization • For the ones who were at the party ;-) 17 of 38
  • 18. Why visualization? • Understand big data • Find stuff we cannot guess 18 of 38
  • 19. Problem with usual visualizations • Limited ◦ Top 10 (!) ◦ Just to display tendencies. . . ◦ Hide most of information • Hard to get meaningful/useful information • Folks mostly use it to display stuff in a different way 19 of 38
  • 20. Problem with usual visualizations 20 of 38
  • 21. Choosing Parallel Coordinates • Display as much dimensions wanted (yes, as many) • Display as much data wanted (I mean it!) 21 of 38
  • 24. Picvizing the whole dataset 24 of 38
  • 25. Splitting the URL • We want to get the TLD, subdomains etc. . . • A regex does not work: 192.168.0.1, http://localhost, google.com, www.slashdot.org:80, . . . • We simply put them according to their ascii value ◦ a is at the axis bottom ◦ zzzzzzzzzzzzzzzzzzz{500} is on the very top 25 of 38
  • 26. Picviz with the whole url split 26 of 38
  • 27. Reward: highest is youtube 27 of 38
  • 28. Subdomain entropy Only one sub-domain has an entropy3 >4.8 3 Shannon 28 of 38 entropy
  • 29. Subdomain entropy Only one sub-domain has an entropy4 >4.8 4 Shannon 29 of 38 entropy
  • 30. Scatter plot - finding outliers 30 of 38
  • 31. Scatter plot - finding outliers - covert channel? 030066363663643937306531[..].36393764313333653763.lbl8.mailshell.net t10000.u1318235395163.s203679668[..]-1329.zv6lit-null.zrdtd-1311.zr6td- null.results.potaroo.net 03003064303831663965386[..].64306561343837346533.lbl8.mailshell.net 31 of 38
  • 32. Searching for Zeus Using the broad Polish CERT regex [a-z0-9]{32,48}.(ru|com|biz|info|org|net) • We get some cool domains: ◦ cg79wo20kl92doowfn01oqpo9mdieowv5tyj.com ◦ eef795a4eddaf1e7bd79212acc9dde16.net • but more important we got a visualization profile to find outliers not matching the regexp 32 of 38
  • 33. Zoom on NS answer domain 33 of 38
  • 34. Back to the global view • request domain: ns2.speed-tube.net 34 of 38
  • 35. Investigating ns2.speed-tube.net • Grab cool stuff that are not ranked like: adsforadsense.co.cc;1.0;ns2.speed-tube.net;1.0 extra-tube.net;1.0001125221;ns2.speed-tube.net;1.0 ... • A recurring (reactivated or cached) malicious site: adsforadsense.co.cc rogue safebrowsing.clients.google.com 20110315 20110125 35 of 38
  • 36. Conclusion • Passive DNS is an infinite source of security data mining • The toolkit is now available on github and this is the basis for more research • (adequate) Visualization is an appropriate way to discover unknown malicious or suspicious services • This finally helps CSIRTs to act earlier on the incidents 36 of 38
  • 37. Free Software • BGP Ranking software https://www.github.com/CIRCL/BGP-Ranking - http://bgpranking.circl.lu/ • Passive DNS toolkit https://www.github.com/adulau/pdns-viz/ - first commit for CanSecWest - more modules to come • Domain Classification https://www.github.com/adulau/DomainClassifier/ 37 of 38
  • 38. Q&A • @adulau - alexandre.dulaunoy@circl.lu • @tricaud - sebastien@honeynet.org 38 of 38