Security Analytics at Web Scale
pratim_mukherjee@symantec.com
Why Should You Care?
• Bangladesh Bank Chief Resigns After Cyber Theft of $81 million
  New York Times (Mar 15, 2016)
• Cybercrime is a key fraud risk in India
  ey.com (Jan 20, 2016)
• Target settles for $39 million over data breach
  CNN.com (Dec 2, 2015)
• Anthem is warning consumers about its huge data breach
  Los Angeles Times (Mar 2015)
• Ashley Madison
  Anyone here?!
What Is Security Analytics?
• Incident Response
  • Identify the root cause and fix vulnerabilities
• Intrusion Detection
  • Monitor networks and systems for malicious activity
• Alert Prioritization
  • Reduce false positives so that the highest-impact threats are stopped first
• Predicting Compromises
  • Predict attacks based on vulnerabilities, command & control activity, and past infections
• Access Analytics
  • Isolate unusual user behavior, e.g. concurrent logins from different geographical locations
• Simulation
  • Simulate various attacks through internal pen testing and take precautions based on log mining
  • Simulate an insider attack on data loss prevention software and take precautions based on its logs
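The access-analytics bullet (concurrent logins from different locations) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the event shape `(user, epoch_seconds, country)`, the function name, and the one-hour window are not from the slides.

```python
from collections import defaultdict

def concurrent_geo_logins(events, window_seconds=3600):
    """Return the users who logged in from two different countries
    within `window_seconds` of each other.

    `events` is an iterable of (user, epoch_seconds, country) tuples;
    this shape is hypothetical, chosen only for the example.
    """
    by_user = defaultdict(list)
    for user, ts, country in events:
        by_user[user].append((ts, country))

    flagged = set()
    for user, logins in by_user.items():
        logins.sort()
        # Comparing neighbours in time order is enough: if any two logins
        # within the window differ in country, some adjacent pair between
        # them also differs and is at least as close in time.
        for (t1, c1), (t2, c2) in zip(logins, logins[1:]):
            if c1 != c2 and t2 - t1 <= window_seconds:
                flagged.add(user)
                break
    return flagged

# Made-up sample events
events = [
    ("alice", 0, "JP"), ("alice", 600, "US"),    # 10 minutes apart
    ("bob", 0, "IN"), ("bob", 90000, "JP"),      # more than a day apart
    ("carol", 0, "JP"), ("carol", 100, "JP"),    # same country
]
suspicious = concurrent_geo_logins(events)
```

A production rule would likely also consider travel distance and VPN egress points; the sketch only shows the basic windowed comparison.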
Web Scale - Dealing with Petabytes
• No real-time queries on petabytes of raw data
• Reduce data in stages, like a funnel
[Pipeline diagram. Box labels: Streaming Logs, Kafka, Log Parser, Hive, Semi Aggregates, HBase, MOLAP Cubes, Kafka Client]
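The funnel idea (shrink the data at each stage so that only small aggregates are queried) can be sketched in memory. This is a stand-in under stated assumptions: the log line format, the sample lines, and the function names are hypothetical, and plain Python counters take the place of the Kafka, Hive, and HBase stages named on the slide.

```python
from collections import Counter

# Hypothetical raw log lines: "<date> <infection_type> <country> <host>"
RAW_LOGS = [
    "2016-01-03 GEN JP host-a",
    "2016-01-03 GEN JP host-b",
    "2016-01-17 GEN JP host-a",
    "2016-02-01 GEN JP host-c",
    "2016-01-09 GEN JO host-d",
]

def semi_aggregate(lines):
    """Stage 1: collapse raw events into per-day detection counts."""
    daily = Counter()
    for line in lines:
        day, infection_type, country, _host = line.split()
        daily[(infection_type, country, day)] += 1
    return daily

def to_month_cells(daily):
    """Stage 2: collapse daily counts into monthly cube cells."""
    cells = Counter()
    for (infection_type, country, day), n in daily.items():
        month_id = int(day[5:7])  # "2016-01-03" -> 1
        cells[(infection_type, country, month_id)] += n
    return cells

daily_counts = semi_aggregate(RAW_LOGS)
month_cells = to_month_cells(daily_counts)
```

Each stage outputs fewer rows than it reads, which is what makes interactive queries on the final cells feasible.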
Hybrid OLAP
• Relational OLAP (ROLAP)
  • SQL-style queries from client front-end tools against a relational back-end database
  • ROLAP servers include optimization for each DBMS back end, implementation of aggregation navigation logic, and additional tools and services
  • ROLAP technology tends to scale better than MOLAP technology
• Multi-dimensional OLAP (MOLAP)
  • Queries run against materialized views; think of them as a partially ordered set (poset)
  • A data cube allows fast indexing into pre-computed summarized data, so queries are usually much faster than with ROLAP
  • Difficult to scale because of the "curse of dimensionality"
Visualization of MOLAP as Lattice
0-D (apex) cuboid
1-D cuboids: Infection_type, monthId, country
2-D cuboids: (Infection_type,monthId), (country,monthId), (Infection_type,country)
3-D (base) cuboid: (Infection_type,country,monthId)
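The lattice above can be enumerated mechanically: every subset of the dimension list is one cuboid. The sketch below uses the three dimension names from the slide; the function name is an assumption for illustration.

```python
from itertools import combinations

DIMENSIONS = ("Infection_type", "country", "monthId")

def cuboid_lattice(dimensions):
    """Map each level k to the list of k-dimensional cuboids.

    Level 0 is the apex cuboid (no group-by dimensions) and level
    len(dimensions) is the base cuboid.
    """
    return {
        k: list(combinations(dimensions, k))
        for k in range(len(dimensions) + 1)
    }

lattice = cuboid_lattice(DIMENSIONS)
total_cuboids = sum(len(cuboids) for cuboids in lattice.values())

for level, cuboids in lattice.items():
    print(f"{level}-D:", cuboids)
```

With n dimensions there are 2^n cuboids (8 here), which is one way to see why fully materialized cubes become hard to scale as dimensions are added.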
HBase MOLAP View

ROWKEY                              Aggregate Column Family
[Infection_type,country,monthId]    Detection Count    Hyperloglog Hash (COUNT Distinct)
GEN-JP-1                            10                 4a44dc15364204a
GEN-JP-2                            12                 e80e9039455cc
GEN-JP-3                            9                  f1e5233ade6af
GEN-JP-4                            15                 a80fe80e90
GEN-JP-5                            5                  3ade6af1dd5
GEN-JP-6                            12                 a44dc1536420
GEN-JO-1                            2                  ….
GEN-JO-2                            1                  ….
GEN-JO-3                            0                  ….
GEN-JO-4                            5                  …..
GEN-JO-5                            2                  …..
GEN-JO-6                            1                  …...
**hashes are representative
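As a minimal sketch of how the base cuboid rolls up, the Detection Count values from the table can be summed over monthId to produce the (Infection_type,country) cuboid. The dictionary stands in for an HBase scan and the function name is hypothetical; only the row keys and counts come from the slide.

```python
from collections import defaultdict

# Row keys and Detection Count values copied from the table above
BASE_CUBOID = {
    "GEN-JP-1": 10, "GEN-JP-2": 12, "GEN-JP-3": 9,
    "GEN-JP-4": 15, "GEN-JP-5": 5, "GEN-JP-6": 12,
    "GEN-JO-1": 2, "GEN-JO-2": 1, "GEN-JO-3": 0,
    "GEN-JO-4": 5, "GEN-JO-5": 2, "GEN-JO-6": 1,
}

def roll_up_month(base):
    """Drop the monthId component of each row key and sum the counts."""
    rolled = defaultdict(int)
    for row_key, count in base.items():
        infection_type, country, _month_id = row_key.split("-")
        rolled[f"{infection_type}-{country}"] += count
    return dict(rolled)

by_type_country = roll_up_month(BASE_CUBOID)
```

Plain addition works for Detection Count because it is an additive measure. A distinct count is not additive (the same host can appear in several months), which is why the table carries a HyperLogLog value next to it.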
Probabilistic Data Structures
• HyperLogLog (HLL)
  • Used for approximate count-distinct queries
  • Store the HLL hash in 5 bytes in HBase columns
  • Apply the monoid SUM pattern to roll up
• Bloom Filter
  • Used for checking whether an incoming stream element is "not" a member of a set
  • False negatives never happen, i.e. an answer of "definitely not in set" is always correct
  • Also used by HBase to ascertain whether an input row key is part of an HFile
• Count-Min Sketch (CMS)
  • Used for counting frequencies of specific elements in sub-linear space
• HLL and CMS are implemented with Twitter's Algebird library on Spark
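To make the HyperLogLog roll-up concrete, here is a small pure-Python sketch. It is not Algebird's API and does not reproduce the 5-byte storage format mentioned on the slide; the class, the SHA-1 based hash, and the precision `p=12` are assumptions for illustration. The point is that two sketches combine with an associative, commutative `merge` (register-wise maximum), which plays the role of the monoid "sum" when cube cells are rolled up.

```python
import hashlib
import math

class HyperLogLog:
    """Minimal HyperLogLog with 2**p registers (illustrative only)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")        # 64-bit hash value
        idx = h >> (64 - self.p)                     # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)        # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other):
        """Combine two sketches; equivalent to sketching the union."""
        merged = HyperLogLog(self.p)
        merged.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return merged

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)        # bias constant for m >= 128
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)  # small-range correction
        return raw

# Hypothetical monthly sketches with overlapping hosts
month_1, month_2 = HyperLogLog(), HyperLogLog()
for i in range(3000):
    month_1.add(f"host-{i}")
for i in range(2000, 5000):
    month_2.add(f"host-{i}")

both_months = month_1.merge(month_2)
estimate = both_months.count()  # approximates the 5000 distinct hosts
```

Because `merge` never double counts overlapping elements, distinct counts for a coarser cuboid can be derived from the finer cells without rescanning the raw logs.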
Real-Time Query Response Server
[Flow diagram. Labels: Incoming Query, Query Controller, decision "Is Cuboid Found?", Yes branch: Calcite HBase Adapter (HBaseQuery), No branch: Spark Driver on Jetty (SparkSQLQuery), storage: HDFS/Hive/HBase, Query Response]
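The routing decision in the diagram can be sketched as a lookup against the set of materialized cuboids. This is a guess at the logic, not the deck's implementation: the function name, the route labels, the set of materialized cuboids, and the rule that a cuboid is "found" only when the query's group-by dimensions match it exactly are all assumptions.

```python
# Hypothetical set of cuboids that have been materialized in HBase
MATERIALIZED_CUBOIDS = {
    frozenset({"Infection_type", "country", "monthId"}),
    frozenset({"Infection_type", "country"}),
    frozenset({"country", "monthId"}),
}

def route_query(group_by_dims, materialized=MATERIALIZED_CUBOIDS):
    """Pick a back end for a query based on its group-by dimensions.

    Returns "hbase" when a matching pre-computed cuboid exists (the
    "Yes" branch) and "spark_sql" otherwise (the "No" branch).
    """
    if frozenset(group_by_dims) in materialized:
        return "hbase"
    return "spark_sql"

fast_path = route_query(["country", "Infection_type"])
slow_path = route_query(["Infection_type", "monthId"])
```

Using `frozenset` makes the lookup independent of the order in which the query lists its dimensions.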
 Questions/Comments
Thank You
