FireEye & Scylla :
Intel Threat Analysis
using Graph Database
Rahul Gaikwad, Staff DevOps Engineer
&
Krishna Palati, Senior DevOps Manager
Presenters
Rahul Gaikwad, Staff DevOps Engineer
❖ Role
➢ Database Administrator - SQL / NoSQL / Graph DB / Big Data
➢ Infrastructure & Cloud Operations
➢ DevOps Automation Engineer
❖ Education
➢ Master of Computer Applications (MCA) & Executive MBA
➢ Pursuing PhD Research in AIOps
❖ Certifications
➢ Scylla | OCP | CCAH | HDPCA | RHCSA | AWS - SA | AWS - SysOps | Confluent Kafka
Krishna Palati, Senior DevOps Manager
❖ Role
➢ Senior DevOps Manager
➢ Cloud Infrastructure, DevOps Automation and Database Systems
❖ Education
➢ Bachelors and Masters Degrees in Computer Science and Engineering
❖ Hobbies
➢ Running, Biking, Hiking and Playing tennis
Agenda
■ Background
■ Why ScyllaDB
■ ScyllaDB at FireEye
■ Conclusion
■ Q&A
Background
Introduction to FireEye
Solutions
■ Threat Intelligence
■ Helix Security Platform
■ Endpoint Security
■ Network Security and Forensics
■ Email Security
■ Managed Defense
Services
■ Breach Response
■ Security Assessment
■ Security Enhancement
■ Security Transformation
★ FireEye is an intelligence-led cybersecurity company
★ We offer solutions that blend security technologies, threat intelligence and consulting.
Forrester New Wave
Leading Threat Intel Services
FireEye Threat Intelligence
A portfolio of subscriptions and services designed to address all aspects of an
organization’s intelligence needs.
■ Intelligence Subscriptions
■ Intelligence Enablement
■ Intelligence Capability Development
■ Digital Threat Monitoring
■ Advanced Intelligence Access
Application Use Case
■ Homegrown custom graph database on Postgres
■ Centralizes, organizes and processes cyber threat intelligence data
■ Tracks threat groups by recording all of the analytic correlations
■ Provides analytic results by processing and analysing historical data
■ Data Objects - DNS data, RSS feeds, file md5s, FQDNs and URLs
■ Data Size: Nodes ~500M and Edges ~1.5B
Existing System as Graph DB
Structure of the Graph
■ Stores data as “nodes” or “edges”
■ Also allows storing tags
Nodes
■ Each node represents a single object, event or piece of evidence
■ E.g. organizations, actors, hosts, files and FQDNs are represented as nodes in the graph
Edges
■ Edges represent the relationships between nodes.
■ E.g. an edge exists from a threat actor to their location
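The node / edge / tag structure described on this slide can be sketched in a few lines of Python. This is only an illustration of the data model, not the existing Postgres-backed implementation; the class names, field names and sample values (`example-actor`, `located_in`) are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A single object, event or piece of evidence."""
    node_id: int
    kind: str                      # e.g. "actor", "host", "filemd5", "fqdn"
    value: str
    tags: set = field(default_factory=set)


@dataclass
class Edge:
    """A directed relationship between two nodes."""
    source: int
    target: int
    label: str
    tags: set = field(default_factory=set)


class Graph:
    def __init__(self):
        self.nodes = {}            # node_id -> Node
        self.edges = []            # list of Edge

    def add_node(self, node_id, kind, value, tags=()):
        self.nodes[node_id] = Node(node_id, kind, value, set(tags))
        return self.nodes[node_id]

    def add_edge(self, source, target, label, tags=()):
        # Both endpoints must already exist as nodes.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        edge = Edge(source, target, label, set(tags))
        self.edges.append(edge)
        return edge

    def neighbours(self, node_id):
        """Nodes reachable over one outgoing edge."""
        return [self.nodes[e.target] for e in self.edges if e.source == node_id]


# Hypothetical data: a threat actor with an edge to its location.
graph = Graph()
graph.add_node(1, "actor", "example-actor", tags={"tracked"})
graph.add_node(2, "location", "example-location")
graph.add_edge(1, 2, "located_in")
print([n.value for n in graph.neighbours(1)])
```

A linear scan over all edges is fine for a toy example; at ~1.5B edges the lookup would need an index or a storage layer built for adjacency queries.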
Existing System - Graph Example
Challenges of Existing System
Limitations :
■ Slow performance
■ Not easily scalable
■ Not stable
■ Not highly available
■ Not distributed
Objectives:
■ Replace the current system with a new scalable, highly available,
distributed system.
Tech Evaluation for Graph DB
Evaluation Targets - Multiple Graph DBs
■ OrientDB
■ Synapse
■ AWS Neptune
■ JanusGraph
Evaluation Criteria - Based on MoSCoW Model
■ Functional
■ Non-Functional
■ Supportability
Why JanusGraph?
Opinionated Selection Criteria for JanusGraph:
■ Indexing capabilities that can be controlled by the user.
■ Free / Full Text search
■ Embedded as well as Server mode setup capability
■ Schema Management
■ Triggers
■ OLAP Capabilities - Distributed Graph Processing
Result:
■ Based on our requirements, tech evaluation and test results, we selected JanusGraph.
JanusGraph is...
■ Distributed
■ Open source
■ Massively scalable
■ Graph Database
also...
■ Supports pluggable Backend Storage
● ScyllaDB
● Cassandra
● HBase
● Berkeley DB
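To make "pluggable backend storage" concrete: JanusGraph selects its storage backend through a properties file, and ScyllaDB is typically reached through the CQL backend. The deck does not show FireEye's configuration, so the sketch below only renders a minimal properties fragment; the host names and keyspace are hypothetical placeholders.

```python
def render_janusgraph_properties(hosts, keyspace="janusgraph"):
    """Render a minimal JanusGraph properties fragment for a CQL backend.

    Only the storage-related keys are shown; a real deployment would
    carry more settings (index backend, consistency levels, and so on).
    """
    settings = {
        "storage.backend": "cql",              # CQL backend, used for Cassandra and ScyllaDB
        "storage.hostname": ",".join(hosts),   # comma-separated contact points
        "storage.cql.keyspace": keyspace,
    }
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


# Hypothetical contact points and keyspace name.
props = render_janusgraph_properties(
    ["scylla-1.example.internal", "scylla-2.example.internal"],
    keyspace="threat_graph",
)
print(props)
```

Swapping the backend then comes down to changing these storage keys rather than the application's graph code.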
Motivation for ScyllaDB
Why ScyllaDB ?
Based on tech evaluations and tests, we determined that ScyllaDB is the right
backend storage.
Features :
■ Easy Cluster setup
■ Self Tuning
■ Equal Load distribution
■ Easy to Manage On Cloud
■ Less Administration
■ No GC
■ Compression
ScyllaDB Usage
ScyllaDB Usage for Threat Analysis
■ Since data represents threat activity, we can get answers to questions about:
● Threat actors
● Malware
● Threat activity
● Victims
● Various other things.
■ Graph DB tells a story about data by connecting dots
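"Connecting the dots" amounts to a graph traversal: start at one node and follow edges until a node of interest is reached. The sketch below shows the idea with a breadth-first search over a small in-memory adjacency list. It is not the JanusGraph/Gremlin traversal used in production, and every node name, edge label and address in it is made up for illustration.

```python
from collections import deque

# Hypothetical adjacency list: node -> [(edge label, neighbour), ...]
EDGES = {
    "actor:example-group": [("uses", "email:sender@example.com")],
    "email:sender@example.com": [("sent", "filemd5:0123abcd")],
    "filemd5:0123abcd": [
        ("delivered_to", "email:victim@example.org"),
        ("served_from", "ipv4:192.0.2.10"),
    ],
}


def find_path(start, goal):
    """Breadth-first search returning the shortest chain of nodes, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for _label, neighbour in EDGES.get(path[-1], []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Which chain of evidence links the actor to the victim?
path = find_path("actor:example-group", "email:victim@example.org")
print(" -> ".join(path))
```

Edges are followed in their stored direction only, so searching from the IP address back to the actor returns `None` unless reverse edges are added.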
Graph Traversing Examples
Architecture
Graph DB with ScyllaDB
Environment
Configurations
■ Running on AWS Cloud
■ Single Region (Multi AZ) deployment
■ Using EC2 instances
■ AWS Instance - i3.8xlarge
■ Each Cluster has 7 nodes
■ Clusters - DEV, QA, STAGING, PROD.
H/W         Per Node   Per Cluster
CPU         32         224
RAM (GB)    244        1708
Disk (TB)   16         112
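The per-cluster column is the per-node figure multiplied by the seven nodes in each cluster. A quick check of that arithmetic, using only the numbers from the slide (the variable names are illustrative):

```python
NODES_PER_CLUSTER = 7

# Per-node figures as listed on the slide.
PER_NODE = {"CPU": 32, "RAM (GB)": 244, "Disk (TB)": 16}

# Per-cluster capacity is per-node capacity times the node count.
per_cluster = {name: value * NODES_PER_CLUSTER for name, value in PER_NODE.items()}

for name, total in per_cluster.items():
    print(f"{name}: {PER_NODE[name]} per node, {total} per cluster")
```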
Deployment
ScyllaDB - Infrastructure Management
Terraform is a tool for building, changing, and versioning infrastructure
safely and efficiently.
ScyllaDB - Configuration Management
Puppet is a Configuration Management tool that is used for deploying,
configuring and managing servers.
Comparison
Conclusion
FireEye Traversing with ScyllaDB
■ Very good experience and results observed so far
■ Cost Effective
■ Admin Friendly
■ Superfast
■ Looking at potential opportunities to use ScyllaDB in other projects
Thank You All ..!!
■ FireEye
● Architects
● Engineers: Developers, DevOps & QA
● Project and Program Managers
■ JanusGraph
■ ScyllaDB
● Scylla University
● Community
● Summit Organisers
Thank you Stay in touch
Any questions?
Rahul Gaikwad
rahul.gaikwad@fireeye.com
Krishna Palati
krishna.palati@FireEye.com
linkedin.com/in/rahul-gaikwad-2712b02a
linkedin.com/in/krishnapalati

Editor's Notes

  1. KP: Hello everyone, hope you are enjoying the CA weather. As you heard in the introduction video, today we will talk about how we at FireEye used ScyllaDB to redesign an existing product and build a new solution for our Intel product portfolio.
  2. KP: I am Krishna Palati. I manage the DevOps team for Solutions Engineering, comprising Intel, Managed Defense and Incident Response for FireEye. We are responsible for core DevOps, cloud infrastructure operations and database systems. In this presentation we will talk about how we used Scylla to implement a solution that is critical for our Intelligence product portfolio. RG: Hello, I am Rahul Gaikwad. I am a Staff DevOps Engineer at FireEye. I am responsible for continuous integration and deployment, administration of different databases, and cloud operations. I came from India to speak at Scylla Summit about how we are doing Intel threat analysis using a graph database. We will be talking about the challenges with existing systems and how ScyllaDB helps us solve some of these challenges.
  3. KP
  4. KP
  5. KP: FireEye is a unique cybersecurity company in the sense that we bring our security appliances and intelligence capabilities together for our customers. Appliances can be physical or virtual and include a range of products such as Endpoint (HX), Network (NX) and Email (ETP). Solutions include Intel, Managed Defense and Incident Response.
  6. KP: According to the Forrester report, FireEye is the leader in cyber threat intelligence offerings, both for current content and for our strategy. We are focusing specifically on Intel because, during the rest of this presentation, we will be discussing the problems we encountered with our current technology and the solutions we implemented to address them.
  7. KP: As is evident here, we are an industry-recognized thought leader in cyber intelligence and are often called upon to provide our analysis and thoughts on this topic.
  8. KP Subscription: Access to published intelligence reports Enablement: Include onboarding and provisioning, API integration with your security systems, analyst access, workshops. Digital Threat Monitoring: Tailored, proactive monitoring and analysis of threats to your brand, your VIPs. Advanced Intelligence Access: This capability enables direct queries into global visibility, insights and intelligence from FireEye. https://www.fireeye.com/content/dam/fireeye-www/products/pdfs/pf/intel/ds-fireeye-threat-intelligence.pdf
  9. KP: Now that we went through the business aspects of why and how we do Threat Intel, let's briefly talk about our current application and what it does at very high level.
  10. RG: Our custom graph system stores data as “nodes” and “edges”. It also allows analysts to define and apply tags to nodes and edges; these can be thought of as attributes or characteristics. Each node represents a single object, event or piece of evidence. For example, organizations, actors, hackers, host computers, files and FQDNs are all represented as nodes in the graph database. Edges represent the relationships between nodes. For example, an edge exists from a threat actor to their location.
  11. RG: In the above diagram, blue circles indicate nodes, green arrows are edges, red labels are properties, and orange labels are aspects. Node 1 (email) is the sender's mail ID; node 2 (filemd5) is the email message content or file attachment; node 3 (email) is the receiver's mail ID; node 4 (ipv4addr) is the IP address of the filemd5 node. The SenderEmail-ID node sent the filemd5 email to ReceiverEmail-ID. Each node has properties in our intel system. For example, SenderEmail-ID is associated with the APT3 actor, a known hacking group; filemd5 has been associated with an email phishing campaign; ReceiverEmail-ID is tagged as a victim; and filemd5 is associated with the IP address from which such phishing campaigns have been executed in the past.
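
The four-node example above can be written down as a small data model. This is only an illustrative Python sketch, not FireEye's schema: the class names, the edge labels (`sent`, `received_by`, `associated_with`) and the tag strings are hypothetical stand-ins for whatever the intel system actually stores.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    kind: str                                        # e.g. "email", "filemd5", "ipv4addr"
    properties: dict = field(default_factory=dict)   # key/value facts about the node
    tags: set = field(default_factory=set)           # analyst-applied tags


@dataclass
class Edge:
    source: int   # node_id the edge starts from
    target: int   # node_id the edge points to
    label: str    # relationship name (hypothetical labels below)


# The four nodes from the slide; tag names are illustrative assumptions.
nodes = {
    1: Node(1, "email", {"role": "sender"}, {"APT3"}),
    2: Node(2, "filemd5", {"content": "message or attachment"}, {"phishing-campaign"}),
    3: Node(3, "email", {"role": "receiver"}, {"victim"}),
    4: Node(4, "ipv4addr", {}, {"phishing-source"}),
}

edges = [
    Edge(1, 2, "sent"),
    Edge(2, 3, "received_by"),
    Edge(2, 4, "associated_with"),
]


def neighbors(node_id):
    """Return the nodes reachable over one outgoing edge from node_id."""
    return [nodes[e.target] for e in edges if e.source == node_id]


print([n.kind for n in neighbors(2)])  # ['email', 'ipv4addr']
```

Starting from the filemd5 node, one hop reaches both the receiver and the IP address, which is the kind of correlation an analyst would follow in the graph.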
  12. RG: Over time, our intel system became very effective and popular. Its usage grew from a handful of analysts to several hundred analysts spread across the globe. We became a victim of our own success as we started running into performance limitations.
  13. RG: Based on our objectives, we started evaluating graph database technologies such as OrientDB, Synapse, AWS Neptune and JanusGraph. Our evaluation criteria fell into three groups. Functional: traversal speed, full-text search, concurrent users. Non-functional: pluggable storage backend, high availability and disaster recovery. Supportability: strong and active user community, already deployed in production, documentation.
  14. RG: Indexing capabilities: we can define indexes per use case. Free/full-text search: the system lets users search for records that include one or more words within a free-text field. Embedded: we can embed JanusGraph (JG) in the application code layer. Schema management: lets us define and change the schema, and validates incoming data against it. Triggers: the system generates events when certain specific actions are performed on the underlying database store. OLAP (Online Analytical Processing): using distributed graph processing.
  15. RG :
  16. RG
  17. RG: When we set up or scale the cluster, we just need to run the scylla_setup script, which sets up the configs automatically. During data migration from the existing system to the new one, we got an 80% compression rate.
  18. RG
  19. RG
  20. RG: Here is an example of how those questions are asked. We are showing a Gremlin query that selects a node with a specific property, then traverses the graph and finds all the other nodes it is connected to via edges. As shown in the red highlighted box, the query traversed 15,000 nodes and returned results in 322 ms, about 10 times faster than in our current system.
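
The query on the slide is not reproduced in these notes, so the following is only a plain-Python stand-in for the same two steps: pick a start node by property, then walk outgoing edges to collect everything reachable. It is not Gremlin and not the presenters' query; the node names, the property key `tag` and the tiny graph are made up for illustration.

```python
from collections import deque

# Hypothetical toy graph: node name -> properties, and node name -> outgoing edges.
properties = {
    "sender": {"type": "email", "tag": "APT3"},
    "file": {"type": "filemd5", "tag": "phishing-campaign"},
    "receiver": {"type": "email", "tag": "victim"},
    "ip": {"type": "ipv4addr", "tag": "phishing-source"},
    "unrelated": {"type": "fqdn", "tag": "benign"},
}
adjacency = {
    "sender": ["file"],
    "file": ["receiver", "ip"],
}


def find_by_property(key, value):
    """Step 1: select the nodes whose property `key` equals `value`."""
    return [name for name, props in properties.items() if props.get(key) == value]


def connected_nodes(start):
    """Step 2: breadth-first walk over outgoing edges, returning every other reachable node."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in adjacency.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)  # report only the *other* nodes
    return seen


start = find_by_property("tag", "APT3")[0]
print(sorted(connected_nodes(start)))  # ['file', 'ip', 'receiver']
```

In a real deployment the graph database performs this walk against its storage backend; the sketch only shows the shape of the question being asked.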
  21. KP
  22. KP: This is a high-level overview of what we built in the cloud. It is an N-tier architecture whose components are the App UI, the App API, JanusGraph, ScyllaDB (primary storage) and Elasticsearch (search). The system is designed with redundancy for each of these components for scalability and HA. They are built across multiple Availability Zones, so we are protected against AZ failures. Everything is in a private VPC with restricted access. Access comes in via Nginx. Authentication and authorization are handled via an Nginx/OpenResty combination against our internal IDAM server. All the business logic is abstracted in the Application Tier.
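
To make the JanusGraph-to-storage wiring concrete, here is a hedged Python sketch that renders a JanusGraph properties file pointing at ScyllaDB through the CQL storage backend, with Elasticsearch as the search index. The configuration keys are standard JanusGraph options; the host addresses, the keyspace name and the helper function are assumptions for illustration, not the presenters' actual settings.

```python
def render_janusgraph_properties(scylla_hosts, es_hosts, keyspace="janusgraph"):
    """Build the text of a JanusGraph properties file (hypothetical helper)."""
    settings = {
        "gremlin.graph": "org.janusgraph.core.JanusGraphFactory",
        # ScyllaDB speaks CQL, so JanusGraph's CQL storage backend is used.
        "storage.backend": "cql",
        "storage.hostname": ",".join(scylla_hosts),
        "storage.cql.keyspace": keyspace,
        # Elasticsearch serves as the mixed (full-text) index backend.
        "index.search.backend": "elasticsearch",
        "index.search.hostname": ",".join(es_hosts),
    }
    return "\n".join(f"{key}={value}" for key, value in settings.items())


# Placeholder private addresses, e.g. one node per Availability Zone.
config_text = render_janusgraph_properties(
    scylla_hosts=["10.0.1.10", "10.0.2.10", "10.0.3.10"],
    es_hosts=["10.0.1.20", "10.0.2.20"],
)
print(config_text)
```

Listing several hosts per backend mirrors the redundancy described above: JanusGraph can reach the storage and index tiers even if a single node or zone is unavailable.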
  23. RG
  24. RG: As Krishna mentioned, we have set up all system components in the AWS cloud. We went through several iterations to arrive at the optimal cluster size and resources to accomplish our goals, such as functionality and data migration from the current system.
  25. RG
  26. RG
  27. RG: Using these automation tools, we can build the whole stack shown in the architecture diagram within minutes to an hour.
  28. RG: We ran a set of queries on the existing and new systems and found that the new system, based on Scylla, is 10 times faster than the existing one.
  29. KP
  30. KP: Our experience with ScyllaDB has been very good. It is cost-effective and performant. We are looking at opportunities to use ScyllaDB in other projects within FireEye.
  31. KP: Finally, a big thanks to our internal FireEye team of architects, developers, QA and DevOps. Architects and developers worked closely with DevOps to iterate on and improve this solution. Our teams are spread across Reston, VA, Amsterdam and Pune, India, and we work very closely to deliver world-class solutions. I would also like to extend our gratitude to JanusGraph and ScyllaDB for the excellent Scylla University resources, the community and the organizers of this Summit.
  32. KP