Strata San Jose 2016 - Reduce False Positives in Security

How to reduce false positives in security systems through feedback and rules.

You will learn about:
1) Implicit Feedback
2) Applying Rules above ML systems
3) Applying Rules as Features
4) Combining them using MLN


  1. Powerball Predictor: A crystal ball tells me with 99% accuracy whether a Powerball prediction is a winner. (Photo credit: Sean McGrath)
  2. Powerball Predictor: ~300 million samples; ~3 million false positives; 1 true positive.
  3. Powerball Predictor: The overwhelming majority of tickets are not winners. Failing to recognize this is falling victim to the base rate fallacy.
  4. Security Crystal Ball: The overwhelming majority of log entries and data points do not represent fraud or intrusions. Failing to recognize this is falling victim to the base rate fallacy.
  5. Source: MXLabs
  6. Base Rate Fallacy
  7. Why False Positives?
  8. Case Study: Outlier Detection. Using an outlier detection system to identify fraudsters within the environment.
  9. Given a set of generating mechanisms, find the unusual ones.
  10. Example Time Series
  11. Concept Drift: Change in the data over time in unforeseen ways. (Photo credit: SuperCar-RoadTrip.fr, under Creative Commons Attribution 2.0)
  12. Solution: Feedback Loop
  13. Explicit Feedback Loop (Photo credit: Alan Levine, under Creative Commons Attribution 2.0)
  14. Explicit Feedback Loop; Implicit Feedback Loop
  15. Fraud: Takeaways. Concept drift is a shift in behavior. Feedback combats concept drift. Implicit Feedback > Explicit Feedback.
  16. IDS: Anatomy of Successful Detection
  17. Context: Security Analyst
  18. Red team Kill Chain
  19. Blue team Kill Chain
  20. False positives: Lose Ability to Triage
  21. Fact: You cannot salvage a false positive with Contextual Info or Visualization
  22. What is a Successful Detection? Properties + Frameworks
  23. Successful detection captures Adversary TTP from Sensor data, ignoring Expected activity. (Source: @MSwannMSFT)
  24. Properties of a Successful Detection: Adaptability, Credible, Interpretability, Actionable
  25. Framework for a Successful Detection [Chart axes: Sophistication of Algorithms (Basic to Advanced), Usefulness of Alerts (Less Useful to More Useful), Security Domain Knowledge]
  26. Framework for a Successful Detection [Same chart, with Outlier added]
  27. Framework for a Successful Detection [Same chart, with Outlier and Anomaly added]
  28. Framework for a Successful Detection [Same chart, with Outlier, Anomaly, and Security Interesting Alerts added] Successful detections incorporate domain knowledge.
  29. How to encode Domain Knowledge: Embrace Rules. • Business heuristics to filter out the “Security interesting anomalies” • Rules can take many forms: TI feeds; IOCs, IOAs; TTPs • Rules are awesome: Credible, Interpretable, Adaptable (to some extent), Actionable! • Highest Precision • Highest Recall
  30. Three ways to combine ML and Rules
  31. Three Ways to combine Rules and ML. 1. Above Machine Learning Systems: (a) Business heuristics to filter alerts, e.g. “For account _foo_, only raise sev 2 alerts until March 28th, 2016”
  32. Work by Dan Mace et al., Microsoft
  33. 2. Below Machine Learning Systems: (a) Featurizations, e.g. “If IP address present in List of malicious IP dataset, flag 1” (b) Utilizes Threat Intel feeds (Cymru, VirusTotal, FireEye)
  34. 3. Combining Rules and Machine Learning together using Markov Logic Networks. Initial ideas given by Vinod Nair, MSR
  35. Intuition • Rules alone place a set of hard constraints on the set of possible worlds • Let’s make them soft constraints: when a world violates a formula, it becomes less probable, not impossible • Give each formula a weight (higher weight ⇒ stronger constraint) (Source: Lectures by Pedro Domingos)
  36. Example: Service Accounts. Domain knowledge: interactive logons from service accounts cause attacks; similar service accounts tend to have similar logon behavior.
  37. Example: Service Accounts. Encode as First Order Logic
  38. Example: Service Accounts. Associate each rule with the learned weight (1.5 and 1.1)
  39. Example: Service Accounts (weights 1.5 and 1.1). Consider two service accounts, A and B. Nodes: Attack(A), InteractiveLogon(A), InteractiveLogon(B), Attack(B)
  40. Example: Service Accounts (weights 1.5 and 1.1). Nodes: Attack(A), InteractiveLogon(A), InteractiveLogon(B), Attack(B), Similar(A, B), Similar(B, A), Similar(A, A), Similar(B, B)
  41. Example: Service Accounts (weights 1.5 and 1.1). Nodes: Attack(A), InteractiveLogon(A), InteractiveLogon(B), Attack(B), Similar(A, B), Similar(B, A), Similar(A, A), Similar(B, B)
  42. Example: Service Accounts (weights 1.5 and 1.1). Nodes: Attack(A), InteractiveLogon(A), InteractiveLogon(B), Attack(B), Similar(A, B), Similar(B, A), Similar(A, A), Similar(B, B)
  43. • How to learn the structure? Begin with hand-coded rules. Use Inductive Logic Programming, but it needs to infer arbitrary clauses. • How to learn the weights? For generative learning, rely on pseudolikelihood. • Check out Alchemy: http://alchemy.cs.washington.edu/
  44. Call for Action: After the conference. • One week: Review @CodyRioux's IPython Notebook and @Ram_ssk's follow-up material; think comprehensively about rules. • One month: Ask your data scientists to review the literature section; implement the rules on TOP of ML systems. • One quarter: Implement a feedback system to capture training data; implement all TI feeds within an ML system; play with Alchemy.
  45. Literature ● The Base-Rate Fallacy and its Implications for the Difficulty of Intrusion Detection (Axelsson, 1999) ● Enhancing Performance Prediction Robustness by Combining Analytical Modeling and Machine Learning (Didona et al., 2015) ● Richardson, Matthew, and Pedro Domingos. "Markov logic networks." Machine Learning 62.1-2 (2006): 107-136.
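
The Powerball figures on slides 1 to 3 can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming that "99% accuracy" means a 1% false positive rate and that the single winning ticket is flagged correctly; the variable names are illustrative and not from the slides.

```python
# Base rate arithmetic for the Powerball example (slides 1-3).
# Assumption: "99% accuracy" is read as a 1% false positive rate,
# and the one winning ticket is correctly flagged.
total_tickets = 300_000_000
winners = 1
false_positive_rate = 0.01

# Non-winning tickets that the crystal ball wrongly flags as winners.
false_positives = round((total_tickets - winners) * false_positive_rate)
true_positives = winners

# Precision: of all tickets flagged as winners, how many really are?
precision = true_positives / (true_positives + false_positives)

print(f"false positives: {false_positives:,}")
print(f"precision: {precision:.2e}")
```

Under these assumptions, roughly one flagged ticket in three million is a real winner, which is the base rate fallacy in numbers: a detector that is right 99% of the time still produces alerts that are almost all wrong when the event is this rare.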

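
Slides 31 and 33 describe rules placed above and below an ML system. The sketch below illustrates both placements under stated assumptions: the IP list, the scoring function, the severity thresholds, and all function names are hypothetical stand-ins, not the system described in the talk. The quoted rule for account foo is read here as "suppress any alert that is not severity 2 until March 28th, 2016", which is one possible interpretation.

```python
from datetime import date

# Hypothetical threat-intel set (documentation-range IPs); a real system
# would populate this from TI feeds.
MALICIOUS_IPS = {"203.0.113.7", "198.51.100.23"}


def featurize(event):
    """Rule BELOW the model: a TI lookup becomes a binary feature."""
    return {
        "ip_in_malicious_list": 1 if event["src_ip"] in MALICIOUS_IPS else 0,
        "large_transfer": 1 if event["bytes_out"] > 1_000_000 else 0,
    }


def toy_score(features):
    """Stand-in for a trained model (assumption): integer score 0-100."""
    return 20 + 40 * features["ip_in_malicious_list"] + 30 * features["large_transfer"]


def severity(score):
    """Map a score to a severity level; thresholds are illustrative."""
    if score >= 80:
        return 1
    if score >= 50:
        return 2
    return 3


def should_raise(account, sev, today):
    """Rule ABOVE the model: a business heuristic that filters alerts."""
    if account == "foo" and today <= date(2016, 3, 28):
        return sev == 2
    return True


def process(event, today):
    """Featurize, score, then apply the business rule on top."""
    sev = severity(toy_score(featurize(event)))
    return sev, should_raise(event["account"], sev, today)


quiet = {"account": "foo", "src_ip": "203.0.113.7", "bytes_out": 500}
noisy = {"account": "foo", "src_ip": "203.0.113.7", "bytes_out": 5_000_000}
print(process(quiet, date(2016, 3, 1)))   # sev 2, raised
print(process(noisy, date(2016, 3, 1)))   # sev 1, filtered by the rule
print(process(noisy, date(2016, 4, 1)))   # sev 1, rule has expired
```

Keeping the business rule outside the model means it can be added, changed, or expired without retraining, while the TI lookup inside `featurize` lets the model weigh that signal against other features.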
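
The service account example on slides 36 to 42 can be worked through by brute force, since two accounts give only 8 ground atoms and 256 possible worlds. In a Markov Logic Network, each world's probability is proportional to exp of the sum, over formulas, of weight times the number of satisfied groundings. The sketch below assumes the two rules are encoded as InteractiveLogon(x) ⇒ Attack(x) with weight 1.5 and Similar(x, y) ⇒ (InteractiveLogon(x) ⇔ InteractiveLogon(y)) with weight 1.1; the slides show the weights but the exact formulas and the pairing of weights to rules are assumptions here.

```python
from itertools import product
from math import exp

ACCOUNTS = ["A", "B"]
# Weights 1.5 and 1.1 appear on the slides; which rule gets which
# weight, and the exact logical form, are assumptions of this sketch.
W_LOGON_IMPLIES_ATTACK = 1.5
W_SIMILAR_LOGONS = 1.1

# All ground atoms for the two accounts (8 in total).
ATOMS = (
    [("Attack", a) for a in ACCOUNTS]
    + [("InteractiveLogon", a) for a in ACCOUNTS]
    + [("Similar", a, b) for a in ACCOUNTS for b in ACCOUNTS]
)


def log_weight(world):
    """Sum of weight * satisfied groundings for one truth assignment."""
    total = 0.0
    for x in ACCOUNTS:
        # InteractiveLogon(x) => Attack(x)
        if (not world[("InteractiveLogon", x)]) or world[("Attack", x)]:
            total += W_LOGON_IMPLIES_ATTACK
    for x in ACCOUNTS:
        for y in ACCOUNTS:
            # Similar(x, y) => (InteractiveLogon(x) <=> InteractiveLogon(y))
            same = world[("InteractiveLogon", x)] == world[("InteractiveLogon", y)]
            if (not world[("Similar", x, y)]) or same:
                total += W_SIMILAR_LOGONS
    return total


def probability(query, evidence=None):
    """P(query | evidence) by enumerating every possible world."""
    evidence = evidence or {}
    num = den = 0.0
    for values in product([False, True], repeat=len(ATOMS)):
        world = dict(zip(ATOMS, values))
        if any(world[atom] != val for atom, val in evidence.items()):
            continue  # world contradicts the evidence
        weight = exp(log_weight(world))
        den += weight
        if all(world[atom] == val for atom, val in query.items()):
            num += weight
    return num / den


prior = probability({("Attack", "B"): True})
posterior = probability(
    {("Attack", "B"): True},
    {
        ("InteractiveLogon", "A"): True,
        ("Similar", "A", "B"): True,
        ("Similar", "B", "A"): True,
    },
)
print(f"P(Attack(B)) = {prior:.3f}")
print(f"P(Attack(B) | A logged on interactively, A and B similar) = {posterior:.3f}")
```

Because the rules are soft constraints, an interactive logon on account A raises the probability of an attack on a similar account B without making it certain. Exhaustive enumeration only works for toy domains like this one; larger networks need approximate inference of the kind Alchemy provides.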