Mais conteúdo relacionado

Apresentações para você(20)

Similar a Big Data Visualization(20)


Mais de Raffael Marty(20)



Big Data Visualization

  1. Raffael Marty, CEO Big Data Visualization London February, 2015
  2. Security. Analytics. Insight.2 • Visualization • Design Principles • Dashboards • SOC Dashboard • Data Discovery and Exploration • Data Requirements for Visualization • Big Data Lake Overview
  3. Security. Analytics. Insight.3 I am Raffy - I do Viz! IBM Research
  4. 4 Visualization
  5. Security. Analytics. Insight.5 Why Visualization? the stats ... the data...
  6. Security. Analytics. Insight.6 Why Visualization? Human analyst: • patterndetection • remembers context • fantasticintuition • canpredict
  7. Security. Analytics. Insight.7 Visualization To … Present / Communicate Discover / Explore
  8. Design Principles
  9. Security. Analytics. Insight.9 Choosing Visualizations Objective AudienceData
  10. Security. Analytics. Insight.10 • Objective: Find attackers in the network moving laterally • Defines data needed (netflow, sflow, …) • maybe restrict to a network segment • Audience: security analyst, risk team, … • Informs how to visualize / present data For Example - Lateral Movement Recon Weaponize Deliver Exploit Install C2 Act
  11. Security. Analytics. Insight.11 • Show  comparisons, contrasts, differences • Show  causality, mechanism, explanation, systematic structure. • Show  multivariate data; that is, show more than 1 or 2 variables. by Edward Tufte Principals of Analytic Design
  12. Security. Analytics. Insight.12 Show Context 42
  13. Security. Analytics. Insight. 42 is just a number and means nothing without context 13 Show Context
  14. Security. Analytics. Insight.15 Use Numbers To Highlight Most Important Parts of Data Numbers Summaries
  15. Security. Analytics. Insight.16 Additional information about objects, such as: • machine • roles • criticality • location • owner • … • user • roles • office location • … Add Context source destination machine and 
 user context machine role user role
  16. Security. Analytics. Insight.17 Traffic Flow Analysis With Context
  17. Security. Analytics. Insight.18 • Black background • Blue or green colors • Glow Aesthetics Matter
  18. Security. Analytics. Insight.19 B O R I N G
  19. Security. Analytics. Insight.20 Sexier
  20. Security. Analytics. Insight.21 • Audience, audience, audience! • Comprehensive Information (enough context) • Highlight important data • Use graphics when appropriate • Good choice of graphics and design • Aesthetically pleasing • Enough information to decide if action is necessary • No scrolling • Real-time vs. batch? (Refresh-rates) • Clear organization Dashboard Design Principles
  21. 22 SOC Dashboards
  22. Security. Analytics. Insight.23 Mostly Blank
  23. Security. Analytics. Insight.24 • Disappears too quickly • Analysts focus is on their own screens • SOC dashboard just distracts • Detailed information not legible • Put the detailed dashboards on the analysts screens! Dashboards For Discovery
  24. Security. Analytics. Insight.25 • Provide analyst with context • “What else is going on in the environment right now?” • Bring Into Focus • Turn something benign into something interesting • Disprove • Turn something interesting into something benign Use SOC Dashboard For Context Environment informs detection policies
  25. Security. Analytics. Insight.26 Show Comparisons Current Measure week prior
  26. Security. Analytics. Insight.27 • News feed summary (FS ISAC feeds, mailinglists, threat feeds) • Monitoring twitter or IRC for certain activity / keywords • Volumes or metrics (e.g., #firewall blocks, #IDS alerts, #failed transactions) • Top N metrics: • Top 10 suspicious users • Top 10 servers connecting outbound What To Put on Screens Provide context to individual security alerts
  27. 28 Data Discovery & Exploration
  28. Security. Analytics. Insight.29 Visualize Me Lots (>1TB) of Data
  29. Security. Analytics. Insight.30 Information Visualization Mantra Overview Zoom / Filter Details on Demand Principle by Ben Shneiderman • summary / aggregation • data mining • signal detection (IDS, behavioral, etc.)
  30. Security. Analytics. Insight.31 • Access to data • Parsed data and data context • Data architecture for central data access and fast queries • Application of data mining (how?, what?, scalable, …) • Visualization tools that support • Complex visual types (||-coordinates, treemaps, 
 heat maps, link graphs) • Linked views • Data mining (clustering, …) • Collaboration, information sharing • Visual analytics workflow Visualization Challenges
  31. Big Data Lake
  32. Security. Analytics. Insight.33 • One central location to store all cyber security data • “Data collected only once and third party software leveraging it” • Scalability and interoperability • More than deploying an off the shelf product from a vendor • Data use influences both data formats and technologies to store the data • search, analytics, relationships, and distributed processing • correlation, and statistical summarization • What to do with Context? Enrich or join? • Hard problems: • Parsing: can you re-parse? Common naming scheme! • Data store capabilities (search, analytics, distributed processing, etc.) • Access to data: SQL (even in Hadoop context), how can products access the data? The Big Data Lake
  33. Security. Analytics. Insight.34 Federated Data Access SIEM dispatcher SIEM 
 connector SIEM console Prod A AD / LDAP HR … IDS FW Prod B DBs Data Lake Caveats: • Dispatcher? • Standard access to dispatcher /
 products enabled • Data lake technology? SNMP
  34. Security. Analytics. Insight.35 Multiple Data Stores raw logs key-value structured real-time
 processing (un)-structured data context SQL s t o r a g e stats index queue distributed
 processing a c c e s s graph Caveat: • Need multiple types of 
 data stores
  35. Security. Analytics. Insight.36 Technologies (Example) raw logs key-value (Cassandra) columnar (parquet) real-time
 processing (Spark) (un)-structured data context SQL (Impala, SparkSQL) H D F S aggregates index (ES) queue (Kafka) distributed
 processing (Spark) a c c e s s graph (GraphX) Caveat: • No out of the box solution available
  36. Security. Analytics. Insight.37 SIEM Integration - Log Management First SIEM columnar or search engine
 or log management processing SIEM 
 connector raw logs SIEM console SQL or search
 interface processing filtering H D F S e.g., PIG parsing
  37. Security. Analytics. Insight.38 Simple SIEM Integration raw, csv, json flume log data SQL (Impala, with SerDe) H D F S SIEM 
 connector SIEM Requirement: • SIEM connector to forward text- based data to Flume. SQL interface Tableau, etc. SIEM console
  38. Security. Analytics. Insight.39 SIEM Integration - Advanced SIEM columnar (parquet) processing syslog data SQL (Impala, SparkSQL) H D F S index (ES) queue (Kafka) a c c e s s other data sources SIEM 
 connector raw logs SIEM console SQL and search 
 interface Tableau, Kibana, etc. requires parsing and formatting in a SIEM readable format (e.g., CEF)
  39. Security. Analytics. Insight.40 What I am Working On Data Stores Analytics Forensics Models Admin --> --> --> --> --> 80 53 Anomalies Decomposition Data Seasonal Trend Anomaly Details “Hunt” ExplainVisual Search • Big data backend • Own visualization engine (Web-based) • Visualization workflows
  40. Security. Analytics. Insight.41 BlackHat Workshop Visual Analytics - Delivering Actionable Security Intelligence August 1-6 2015, Las Vegas, USA big data | analytics | visualization
  41. Security. Analytics. Insight.42 List: Twitter: @secviz Share, discuss, challenge, and learn about security visualization. Security Visualization Community
  42. Security. Analytics. Insight. and @secviz Further resources: