Building the High Speed Cyber Security
Data Pipeline Using Apache NiFi
Praveen Kanumarlapudi

Cybersecurity requires an organization to collect data, analyze it, and alert on cyber anomalies in near real time. This is challenging given the variety of data sources that must be collected and analyzed: application logs, network events, authentication systems, IoT devices, business events, cloud service logs, and more. In addition, multiple data formats need to be transformed and conformed so that both humans and ML/AI algorithms can understand them.
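As a rough illustration of what "transformed and conformed" can mean, the sketch below maps two differently formatted records (one JSON, one key=value) onto a single schema. The field names, the sample events, and the target schema are all hypothetical and are not taken from the talk; in the platform described here this work would be done by NiFi processors rather than hand-written Python.

```python
import json


def parse_key_value(line: str) -> dict:
    """Parse a 'key=value key=value' log line into a dict."""
    return dict(pair.split("=", 1) for pair in line.split() if "=" in pair)


def normalize(raw: str) -> dict:
    """Map a JSON or key=value record onto one (hypothetical) common schema."""
    raw = raw.strip()
    record = json.loads(raw) if raw.startswith("{") else parse_key_value(raw)
    return {
        "timestamp": record.get("ts") or record.get("time"),
        "source_ip": record.get("src") or record.get("client_ip"),
        "event": record.get("action") or record.get("event"),
    }


# Made-up sample events in two different source formats.
json_event = '{"time": "2019-06-01T12:00:00Z", "client_ip": "10.0.0.5", "event": "login_failed"}'
kv_event = "ts=2019-06-01T12:00:01Z src=10.0.0.7 action=port_scan"

normalized = [normalize(json_event), normalize(kv_event)]
```

Both records end up with the same three keys, so downstream consumers (a SIEM, a model, a report) can treat them uniformly.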

To solve this problem, the Aetna Global Security team developed the Unified Data Platform based on Apache NiFi, which allows them to remain agile and adapt to new security threats and the onboarding of new technologies in the Aetna environment. The platform currently has over 60 different data flows, 95% of which do real-time ETL, and handles over 20 billion events per day. In this session, learn from Aetna's experience building an edge-to-AI high-speed data pipeline with Apache NiFi.
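To put the stated volume in perspective, a quick back-of-the-envelope calculation converts 20 billion events per day into an average per-second rate. This assumes a perfectly even load across the day, which real traffic will not have, so peak rates are likely higher.

```python
EVENTS_PER_DAY = 20_000_000_000  # "over 20 billion events per day" (from the abstract)
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds

# Average rate, assuming events are spread evenly over the day.
avg_events_per_second = EVENTS_PER_DAY / SECONDS_PER_DAY
print(f"{avg_events_per_second:,.0f} events/s on average")  # prints "231,481 events/s on average"
```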

Published in: Technology


  1. Building the High Speed Cyber Security Data Pipeline Using Apache NiFi (Praveen Kanumarlapudi)
  2. Cyber Security
  3. 60% of Small Businesses Fold Within 6 Months of a Cyber Attack.
  4. How to make it a success?
  5. Global Security Key Stakeholders
     • Security Operations Center: An information security operations center ("ISOC" or "SOC") is a facility where enterprise information systems (websites, applications, databases, data centers and servers, networks, desktops and other endpoints) are monitored, assessed, and defended. Technology: SIEM
     • Data Scientists: Security data scientists have the skills to understand complex algorithms, build advanced models for threat and anomaly detection, and apply these concepts to real security data sets in single or clustered environments. Technology: Python, R, Big Data, Spark/Scala or MATLAB…
     • Data Analysts: Map and trace data from system to system to solve a given business or incident problem. Design and create data reports, using various reporting tools, that help business executives make better decisions. Implement new metrics for the business (KPIs). Technology: SQL, SIEM, Big Data, reporting tools
     • Executives: CSOs, CISOs
  6. Cyber Security 'BIG data' challenges
     • Speed, Volume and Variety: Data Ingestion, Cleansing, Transformation
     • Data reliance: Executives (KPI Metrics), Data Scientists, SOC, Data Analysts
     • Real-Time context
  7. A couple of years ago! (architecture diagram)
     • Data sources (unstructured and (semi)structured): Network logs, Web logs, AD logs, Infrastructure logs, Application logs, Threat Intel, 3rd Party RG, RDBMS
     • Stages: Data Source, Ingestion, Integration, Delivery
     • Tools shown: Syslog servers, SIEM APP, Sqoop, Flume, PySpark, SIEM Tool, UBA Tools
     • Consumers: SOC, Data Science, KPI/Reporting
  8. Challenges
     • Complexity of Architecture
     • Debugging
     • Data Source Dependencies
     • Lack of Centralized Logging
     • Multiple Data Copies
     • Stress on Network
     • Transformations with respect to destination
  9. Solution Framework
     • Single data entry point: avoids network traffic and duplicate data flowing around
     • Transformations according to destination: reduces the reliance on the source
     • Should be capable of handling different formats and different sources
     • Flow steps shown (diagram): Ingest, Clean/Route, Transform for 1, Transform for 2, Route to 1, Route to 2, Archive
  10. Deployment Models (diagram; label: Data Sources)
  11. Challenges
      • Good architectural understanding of all systems
      • Good amount of coding effort
      • Long development hours
      • Maintenance overheads
      • Maintaining sync between the systems
      • Provenance
  12. • Guaranteed delivery
      • Processors that support multiple formats
      • Easy to develop flows and deploy them in minutes
      • Open source with a rich community
  13. The Data Gateway (architecture diagram)
      • Data sources (unstructured and (semi)structured): Network logs, Web logs, AD logs, Infrastructure logs, Application logs, Threat Intel, 3rd Party RG, RDBMS
      • Stages: Data Source, Data Gateway, Delivery
      • Consumers: SOC, Data Science, KPI/Reporting
  14. Sample Flow
  15. To Azure
  16. To Splunk
  17. Grafana dashboard – Last 7 days
  18. Yesterday
  19. In the middle of the day
  20. Metrics
      • 100+ production flows
      • ~20 billion events
      • 1000+ transformations
  21. Next?
      • MiNiFi
      • Stateless NiFi
      • Registry
      • SAM
      • Real-Time Model training
      • CI/CD, NiFi APIs
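
The "Solution Framework" slide (single entry point, per-destination transformations, archive) can be sketched in a few lines of plain Python. This is only an illustration of the idea, not NiFi code and not the team's implementation: the `source|message` input format, the two destinations, and every function name here are hypothetical.

```python
import json
from typing import Callable, Dict, List, Optional

Record = Dict[str, str]


def clean(raw: str) -> Optional[Record]:
    """Clean/Route step: drop malformed lines, split 'source|message'."""
    raw = raw.strip()
    if "|" not in raw:
        return None
    source, message = raw.split("|", 1)
    return {"source": source.strip(), "message": message.strip()}


def to_siem(record: Record) -> str:
    """Transform for destination 1 (hypothetical): key=value text."""
    return f"source={record['source']} msg=\"{record['message']}\""


def to_data_lake(record: Record) -> str:
    """Transform for destination 2 (hypothetical): JSON document."""
    return json.dumps(record, sort_keys=True)


# Each destination gets its own transformation, so sources need no changes.
DESTINATIONS: Dict[str, Callable[[Record], str]] = {
    "siem": to_siem,
    "data_lake": to_data_lake,
}


def run_gateway(raw_events: List[str]) -> Dict[str, List[str]]:
    """Ingest once, archive the raw copy, then transform and route per destination."""
    outputs: Dict[str, List[str]] = {name: [] for name in DESTINATIONS}
    outputs["archive"] = []
    for raw in raw_events:
        outputs["archive"].append(raw)  # Archive: keep every raw event
        record = clean(raw)
        if record is None:
            continue  # malformed input is archived but not routed
        for name, transform in DESTINATIONS.items():
            outputs[name].append(transform(record))
    return outputs


result = run_gateway(["firewall| deny tcp 10.0.0.5 ", "", "ad|login failed for user42"])
```

Because each source is read once and fanned out inside the gateway, adding a new destination in this sketch means adding one entry to `DESTINATIONS` rather than building another ingestion path from every source.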