
Anomaly Detection in DNS Traffic

We are building scikit-learn based tools to detect anomalous behavior in DNS traffic, using three different algorithms together with machine learning. This research is not finished yet, so this presentation covers only the basics: what we are doing now and what we are planning to deploy.


  1. Anomaly Detection in DNS Traffic. bdNOG10 Conference, Chittagong, Bangladesh, April 26, 2019
  2. $whoami | A. S. M. Shamim Reza, Deputy Manager, Network Operation Centre, Link3 Technologies Limited, Dhaka, Bangladesh. shamimreza@link3.net / sohag.shamim@gmail.com. @asmshamimreza on LinkedIn
  3. What we are going to discuss: • What is an anomaly • Why detect anomalies in DNS traffic • What we have been doing: the conventional way • What we are planning: the AI way. bdNOG10, 26 April 2019, Chittagong, Bangladesh
  4. What is an anomaly? A deviation from the common rule, type, arrangement, or form.
  5. Example! Or Connection?
  6. Example! Or Connection?
  7. Example! Or Connection? Equifax data breach in July 2017: personally identifiable information of 150 million people. Reason: slow detection of anomalous activities.** (**United States Government Accountability Office, GAO-18-559. Published: Aug 30, 2018. Publicly released: Sep 7, 2018.)
  8. Example! Or Connection? October 2016 Dyn cyberattack:* • 1.2 Tbps of DDoS traffic • accomplished through a large number of DNS lookup requests from tens of millions of IP addresses • executed through a botnet built with malware named Mirai. (* https://en.wikipedia.org/wiki/2016_Dyn_cyberattack)
  9. Why Detecting Anomalies in DNS Traffic. A Latin proverb, found in the 13th century: “It is better and more useful to meet a problem in time than to seek a remedy after the damage is done.” Source: Wikipedia
  10. Why Detecting Anomalies in DNS Traffic. Source: Cisco Security Research Center
  11. Why Detecting Anomalies in DNS Traffic. “As an anomaly detection system looks for strange events, it has the potential to detect new, unknown, or unlisted events, which can fill the gaps of an IDS.”
  12. Let’s come to the point
  13. What we are doing right now: the conventional way • Time-series data analysis • Business intelligence analysis • Passive check of DNS queries • Monitoring DNS service and statistics
  14. What do we have? Anycast DNS infrastructure serving 820.8 million queries per day with 5 caching DNS servers
  15. Time Series Analysis: standard protocols graph
  16. Time Series Analysis: top talker graph
  17. Time Series Analysis: standard protocols data
  18. Business Analysis: DNS Queries Success & Failure Ratio. Query requests: 570,000/minute. Success: 88.97%. Failure: 11.03%.
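The per-minute rate on this slide can be cross-checked against the per-day volume quoted on slide 14 with a few lines of arithmetic. The sketch below is only an illustration; the variable names are made up for this example and the failed-queries figure is derived from the slide's ratio, not reported by the author.

```python
# Figures taken from slides 14 and 18; names are illustrative.
QUERIES_PER_MINUTE = 570000
FAILURE_RATIO = 0.1103
MINUTES_PER_DAY = 24 * 60

# Daily volume implied by the per-minute rate.
queries_per_day = QUERIES_PER_MINUTE * MINUTES_PER_DAY

# Approximate number of failed queries each minute (derived, not measured).
failed_per_minute = round(QUERIES_PER_MINUTE * FAILURE_RATIO)

print(queries_per_day, failed_per_minute)
```

The daily total works out to 820,800,000 queries, which agrees with the 820.8 million queries per day stated for the anycast infrastructure.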
  19. Business Analysis: DNS Queries Failure Reason. Failure ratio: 11.03%. Chart values: 80.99%, 10.91%, 4.55%, 0.02%, 3.53%. Chart categories: Non-Existent Domain Name, No Response, Server Failure, Format Error, Query Refused.
  20. Business Analysis: DNS Resolution Time Distribution. Chart values: 89.03, 2.4, 0.51, 0.3, 7.23, 1.53. Chart categories: 0ms - 10ms, 10ms - 20ms, 20ms - 30ms, 30ms - 40ms, 40ms - 50ms, 50ms above, No Response.
  21. Business Analysis: Top Domain Resolution. Chart values: 59.7, 21.7, 6.3, 2.3. Chart categories: .com, .net, .th, .org, .gov, .cn, in-addr.arpa, .mobi, .bd, .in, .me, .ly, .biz, .io, .tw, .cc, .app, .hosting, .info, .co.
  22. Passive check of DNS queries: a hell of a headache
  23. Monitoring DNS service and statistics: - DNS query response: OK/NOT - DNS query response time: <=10ms - CPU utilization - Process load - How many TCP and UDP sockets are serving. Statistics on: success, failure, referral, duplicate, dropped, recursion, NXDOMAIN, and so on.
  24. I am not happy at all
  25. I am not happy at all
  26. – Too much human interaction – Human error – Time consuming – Static thresholds are not always helpful – Expanding the infrastructure means lots of extra data, which means extra HR hours, which means ...
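To illustrate why a static threshold is not always helpful, the sketch below compares a fixed limit with a rolling z-score on made-up per-minute query counts. This is not the author's tooling: the function names, the window size, the limits, and the sample data are all assumptions chosen for the example.

```python
import numpy as np

def static_alerts(qpm, limit):
    """Indices where the query rate exceeds a fixed limit."""
    qpm = np.asarray(qpm, dtype=float)
    return np.flatnonzero(qpm > limit)

def rolling_zscore_alerts(qpm, window=5, z_limit=3.0):
    """Indices where a sample deviates from the previous `window`
    samples by more than `z_limit` standard deviations."""
    qpm = np.asarray(qpm, dtype=float)
    alerts = []
    for i in range(window, len(qpm)):
        history = qpm[i - window:i]
        mean, std = history.mean(), history.std()
        if std == 0:
            std = 1.0  # flat history: avoid division by zero
        if abs(qpm[i] - mean) / std > z_limit:
            alerts.append(i)
    return np.array(alerts, dtype=int)

# Hypothetical quiet-hour traffic with a burst at index 7.
night = [100000, 101000, 99500, 100500, 100200, 99800, 100100, 180000, 100300]

# A limit sized for peak traffic (hypothetical value) misses the burst.
print(static_alerts(night, limit=600000))
# The rolling baseline flags the sample that breaks the recent pattern.
print(rolling_zscore_alerts(night))
```

A limit tuned for the daily peak stays silent during quiet hours, while a baseline computed from recent samples adapts to the time of day. A rolling z-score is only one simple option; the deck itself does not say which adaptive method will be used.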
  27. These terms are like nightmares! • DNS flood attack • DRDoS • Cache poisoning • DNS tunneling • DNS hijacking
  28. So how have we been dealing with this until now? • Watching the dashboard with our own eyes • Setting up static thresholds
  29. So how have we been dealing with this until now? • Watching the dashboard with our own eyes • Setting up static thresholds • Machine learning ← missing
  30. What do we want? An automated tool to help us detect abnormal activities at run time: • Open DNS resolvers • Malware domain queries • Cache poisoners
  31. How we are going to do this. Source: Newtium
  32. Hypothesis of Detecting Anomalies. Algorithm: Open DNS Resolver
      function GetOpenDNSResolver(W: local domains, L: local network, F_ext: analysed flows)
        F_responses = {F_ext | F_ext.IP_src = L ∧ F_ext.IP_dst ≠ L ∧ F_ext.P_src = 53 ∧ F_ext.P_dst ≠ 53
                       ∧ F_ext.Qname ≠ W_1 ∧ ··· ∧ F_ext.Qname ≠ W_n ∧ F_ext.Rcode = 0};
        aggregate F_responses by IP_src and Qname to F_resolvers;
        for each F_resolver in F_resolvers do
          request all information about domain F_resolver.Qname by ANY query type;
          if domain information contains IP address from L then
            add F_resolver.Qname to W;
          else
            return "F_resolver.IP_src is open DNS resolver";
          end if
        end for
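The flow filter at the start of the open-resolver algorithm can be sketched in Python as follows. This is a simplified illustration, not the author's implementation: the `Flow` record, the function name, and the example addresses are assumptions, and the follow-up verification with an ANY query is left out because it needs live DNS lookups.

```python
import ipaddress
from collections import namedtuple

# Hypothetical flow record; field names mirror the pseudocode attributes.
Flow = namedtuple("Flow", "ip_src ip_dst p_src p_dst qname rcode")

def candidate_open_resolvers(flows, local_net, local_domains):
    """Return (ip_src, qname) pairs for successful DNS responses sent from
    the local network to outside clients for non-local domains."""
    net = ipaddress.ip_network(local_net)
    candidates = set()
    for f in flows:
        src_local = ipaddress.ip_address(f.ip_src) in net
        dst_local = ipaddress.ip_address(f.ip_dst) in net
        if (src_local and not dst_local
                and f.p_src == 53 and f.p_dst != 53
                and f.qname not in local_domains
                and f.rcode == 0):
            candidates.add((f.ip_src, f.qname))
    return candidates

flows = [
    # Local host answering an outside client about a foreign domain.
    Flow("192.0.2.10", "198.51.100.7", 53, 40000, "example.org", 0),
    # Local authoritative answer for a local domain: expected behaviour.
    Flow("192.0.2.11", "198.51.100.8", 53, 40001, "local.example", 0),
    # Response arriving from outside: not a local resolver.
    Flow("198.51.100.9", "192.0.2.20", 53, 40002, "example.org", 0),
]
suspects = candidate_open_resolvers(flows, "192.0.2.0/24", {"local.example"})
print(suspects)
```

Each pair returned here is only a candidate. In the pseudocode, the queried domain is then checked with an ANY query, and the source is reported as an open resolver only if the domain does not resolve into the local network.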
  33. Hypothesis of Detecting Anomalies. Algorithm: Malware Domains Queries Detection
      function GetMalwareAffectedDevices(N: number of checked domains, F_ext: analysed flows)
        F_queries = {F_ext | F_ext.P_src ≠ 53 ∧ F_ext.P_dst = 53 ∧ F_ext.Qname = dns.msftncsi.com
                     ∧ (F_ext.Qtype = A ∨ F_ext.Qtype = AAAA)};
        aggregate F_queries by IP_src to F_starts;
        for each F_start in F_starts do
          F_domains = {F_ext | F_ext.IP_src = F_start.IP_src ∧ F_ext.P_src ≠ 53 ∧ F_ext.P_dst = 53
                       ∧ F_ext.T_start ≥ F_start.T_start ∧ F_ext.T_start ≤ (F_start.T_start + 5 minutes)
                       ∧ F_ext.Qname ≠ *windowsupdate.com ∧ F_ext.Qname ≠ *msftncsi.com
                       ∧ F_ext.Qname ≠ *microsoft.com};
          select first N queried domains D from F_domains;
          for all queried domains D do
            exclude D.Qname contained in the Alexa top domains list;
            check if domain D.Qname is reported as malware domain;
            if D.Qname is marked as malware domain then
              return "F_start.IP_src queried malware domain D.Qname";
            end if
          end for
        end for
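A minimal Python sketch of the malware-domain check is shown below. It assumes the input already contains only client queries (destination port 53), keeps just the first start event per device, and uses small in-memory sets in place of the Alexa list and a malware-domain feed. The function name, record layout, and sample data are hypothetical.

```python
def malware_query_suspects(queries, blocklist, popular, n=10, window=300):
    """queries: iterable of (timestamp_seconds, ip_src, qname, qtype),
    sorted by time. Returns (ip_src, qname) pairs for blocklisted domains
    among the first n domains queried within `window` seconds after a
    device's connectivity check (a lookup of dns.msftncsi.com)."""
    queries = list(queries)
    ignored_suffixes = ("windowsupdate.com", "msftncsi.com", "microsoft.com")

    # First start event per device (simplification of the aggregation step).
    starts = {}
    for ts, ip, qname, qtype in queries:
        if qname == "dns.msftncsi.com" and qtype in ("A", "AAAA"):
            starts.setdefault(ip, ts)

    hits = []
    for ip, t0 in starts.items():
        after_start = [q for ts, src, q, _ in queries
                       if src == ip and t0 <= ts <= t0 + window
                       and not q.endswith(ignored_suffixes)]
        for qname in after_start[:n]:
            if qname in popular:
                continue  # skip well-known domains
            if qname in blocklist:
                hits.append((ip, qname))
    return hits

sample = [
    (0, "10.0.0.5", "dns.msftncsi.com", "A"),
    (5, "10.0.0.6", "bad.example.net", "A"),       # no start event seen
    (10, "10.0.0.5", "ctldl.windowsupdate.com", "A"),
    (20, "10.0.0.5", "example.com", "A"),
    (30, "10.0.0.5", "bad.example.net", "A"),
    (400, "10.0.0.5", "late.example.net", "A"),    # outside the 5-minute window
]
hits = malware_query_suspects(
    sample,
    blocklist={"bad.example.net", "late.example.net"},
    popular={"example.com"},
)
print(hits)
```

The idea behind the window is that the first lookups after a machine comes online are mostly generated automatically, so a blocklisted name among them points to software on the device rather than to a user's browsing.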
  34. Hypothesis of Detecting Anomalies. Algorithm: Delay Fast Packet
      PacketDictionary.Init()
      StatsDictionary.Init()
      loop
        NewPacket ⇐ SniffDNSPacket()
        ID ⇐ NewPacket.TransactionID + NewPacket.Type
        5Tuple ⇐ NewPacket.5Tuple
        IsQuery ⇐ NewPacket.IsQuery
        if IsQuery then
          if [5Tuple, ID] already in PacketDictionary then
            PacketDictionary[5Tuple, ID] = NewPacket
            print DUP REQUEST FOUND: REMOVING PACKET AFTER TIMEOUT
          else
            PacketDictionary.Insert([5Tuple, ID], NewPacket)
          end if
        else
          RequestPacket ⇐ PacketDictionary.Get([5Tuple, ID])
          if [5Tuple, ID] is delayed then
            Drop all packets with ID [5Tuple, ID]
            SEND: WARNING, ‘TOO FAST’ PACKET RECEIVED, POSSIBLE ATTACK
          else
            if RequestPacket not exists then
              SEND: WARNING, UNMATCHED REPLY FOUND, POSSIBLE ATTACK
            end if
            RTT ⇐ NewPacket.TimeOfArrival − RequestPacket.TimeOfArrival
            DelayTime ⇐ StatsDictionary[[NewPacket.SourceIP, NewPacket.ReplyType]].AddSample(RTT, NewPacket.SourceIP, NewPacket.ReplyType)
            DelayPacket(DelayTime)
            PacketDictionary.Remove([5Tuple, ID])
          end if
        end if
      end loop
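The query/reply bookkeeping in the Delay Fast Packet algorithm can be sketched offline in Python by replaying packet records instead of sniffing live traffic. This is a simplified illustration under several assumptions: the record fields are hypothetical, `five_tuple` is taken to be direction-independent so a reply matches its query, replies are held for a fixed `hold_time` rather than a delay derived from RTT statistics, and an unmatched reply is skipped after the warning.

```python
from collections import defaultdict

def inspect_dns_packets(packets, hold_time=0.05):
    """Replay DNS packet records (dicts) in arrival order and return
    (warnings, rtt_stats)."""
    pending = {}                    # (five_tuple, txid) -> query arrival time
    held = {}                       # (five_tuple, txid) -> release time of held reply
    rtt_stats = defaultdict(list)   # (server_ip, reply_type) -> list of RTTs
    warnings = []

    for p in packets:
        key = (p["five_tuple"], p["txid"])
        if p["is_query"]:
            if key in pending:
                warnings.append(("DUP_REQUEST", key))
            pending[key] = p["time"]
            continue

        # Reply: a second answer while the first is still held is suspicious.
        if key in held and p["time"] < held[key]:
            warnings.append(("TOO_FAST_REPLY", key))
            del held[key]           # drop every reply for this query
            continue
        if key not in pending:
            warnings.append(("UNMATCHED_REPLY", key))
            continue

        rtt = p["time"] - pending.pop(key)
        rtt_stats[(p["server_ip"], p["reply_type"])].append(rtt)
        held[key] = p["time"] + hold_time
    return warnings, rtt_stats

key_a = ("client-a<->198.51.100.53", 0x1A2B)
key_b = ("client-b<->198.51.100.53", 0x3C4D)
trace = [
    {"time": 0.0, "five_tuple": key_a[0], "txid": key_a[1], "is_query": True},
    {"time": 0.02, "five_tuple": key_a[0], "txid": key_a[1], "is_query": False,
     "server_ip": "198.51.100.53", "reply_type": "A"},
    # Second answer for the same query while the first is still held.
    {"time": 0.025, "five_tuple": key_a[0], "txid": key_a[1], "is_query": False,
     "server_ip": "198.51.100.53", "reply_type": "A"},
    # Answer without any recorded query.
    {"time": 0.03, "five_tuple": key_b[0], "txid": key_b[1], "is_query": False,
     "server_ip": "198.51.100.53", "reply_type": "A"},
]
warnings, rtt_stats = inspect_dns_packets(trace)
print(warnings)
```

Holding the first reply briefly gives a forged or competing answer time to show up, so two replies to one query inside the hold window become visible as a possible cache-poisoning attempt.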
  35. What are we doing now? Blending those three hypotheses into one single place. • Debian 9.8 • Python 3.5 • NumPy • Matplotlib • SciPy • math
  36. What are we doing now? Training another model with: ● k-NN (k-nearest neighbors) ● Debian 9.8 ● Python 3.5 ● NumPy ● SciPy ● Numba ● scikit-learn
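As a rough idea of what a k-NN model in scikit-learn looks like, the sketch below trains `KNeighborsClassifier` on synthetic per-client features. The slides do not say which features or labels are used, so the two features here (queries per minute and NXDOMAIN ratio), the generated data, and `n_neighbors=5` are all assumptions for illustration.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)

# Synthetic per-client features: [queries per minute, NXDOMAIN ratio].
normal = np.column_stack([rng.normal(60, 10, 200), rng.uniform(0.0, 0.15, 200)])
attack = np.column_stack([rng.normal(900, 100, 20), rng.uniform(0.6, 1.0, 20)])
X = np.vstack([normal, attack])
y = np.array([0] * len(normal) + [1] * len(attack))  # 0 = normal, 1 = anomalous

# Scale the features first: k-NN is distance based, so an unscaled
# query rate would dominate the NXDOMAIN ratio.
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
model.fit(X, y)

new_clients = np.array([[55, 0.05], [1000, 0.9]])
labels = model.predict(new_clients)
print(labels)
```

A supervised k-NN classifier needs labelled examples of both normal and anomalous traffic. Where labels are scarce, distance to the nearest neighbors can instead be used as an unsupervised anomaly score.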
  37. Why are we using two different methods? Because “we are trying to build a hybrid model”. “When building a large-scale machine learning based anomaly detection system, working on just one single model will not help.”
  38. Reference • Hoai-Vu Nguyen and Yongsun Choi, “Proactive Detection of DDoS Attacks Utilizing k-NN Classifier in an Anti-DDos Framework”, World Academy of Science, Engineering and Technology, International Journal of Computer and Information Engineering, Vol. 4, No. 3, 2010. • Przemysław Berezinski, Bartosz Jasiul, and Marcin Szpyrka, “An Entropy-Based Network Anomaly Detection Method”, Entropy 2015, 17, 2367-2408 (https://www.mdpi.com/journal/entropy). • Das, S., Islam, M.R., Jayakodi, N.K. and Doppa, J.R., “Active Anomaly Detection via Ensembles: Insights, Algorithms, and Interpretability”, arXiv preprint arXiv:1901.08930, 2019. • Ro, K., Zou, C., Wang, Z. and Yin, G., “Outlier detection for high-dimensional data”, Biometrika, 2015. • Suri, N.R. and Athithan, G., “Research Issues in Outlier Detection”, in Outlier Detection: Techniques and Applications, 2019. • Milan Cermak, Pavel Celeda, Jan Vykopal, “Detection of DNS Traffic Anomalies in Large Networks”, Masaryk University, 2014 (https://www.researchgate.net/publication/290882059). • Zdrnja, B., Brownlee, N., Wessels, D., “Passive Monitoring of DNS Anomalies”, in Detection of Intrusions and Malware, and Vulnerability Assessment, 2007. • Manasrah, A.M., Hasan, A., Abouabdalla, O.A., Ramadass, S., “Detecting Botnet Activities Based on Abnormal DNS Traffic”, (IJCSIS) International Journal of Computer Science & Information Security, 2009.
  39. Thank you for your attention
