SlideShare uma empresa Scribd logo
1 de 69
Baixar para ler offline
Drilling Security Data
Charles S. Givre, CISSP

charles.givre@gtkcyber.com

www.thedataist.com
• Lead Data Scientist at Deutsche Bank
• PMC Member of Apache Drill Project
• Co-founder GTK Cyber
• Senior Lead Data Scientist @ Booz Allen
• 5 Years @ CIA
• Working on Apache Drill Book
• Masters Degree in Middle Eastern Studies
• Undergraduate in Comp.Sci & Music
Charles Givre
The Problem: Analyzing
Security Data is Hard
gtkcyber.com
Security Data Analysis is Hard…Really Hard
Why?
Security Data comes in Many
Forms
Security Data Comes in Many Forms
• Standard Types such as JSON/CSV/XML

• Log Files: Event Logs, Database, Web Server etc.

• Syslog

• PCAP / PCAP-NG: Binary, raw network traffic
Security Data Lives In Many Places
• File systems

• Cloud Storage: Azure / S3

• Databases

• Real Time Event Streams

• Other sources
Few tools can effectively analyze
these data types effectively
Even fewer tools can effectively analyze
ALL these data types effectively
Splunk and ELK are solid SIEM
platforms
Wireshark is good for PCAP
analysis
gtkcyber.com
Security data is not arranged in an
optimal way for ad-hoc analysis
gtkcyber.com
Security data is not arranged in an optimal way
for ad-hoc analysis
ETL Data Warehouse
ETL is expensive and wasteful
gtkcyber.com
Analytics teams spend
between 50%-90% of their
time preparing their data.
gtkcyber.com
76% of Data Scientists say
this is the least enjoyable
part of their job.
http://visit.crowdflower.com/rs/416-ZBE-142/images/CrowdFlower_DataScienceReport_2016.pdf
gtkcyber.com
The ETL Process consumes the
most time and contributes almost
no value to the end product.
gtkcyber.com https://goleansixsigma.com/8-wastes/
So where does Drill fit in?
Apache Drill is an SQL Engine
for self-describing data.
Drill lets you query anything*,
wherever it is*, no matter its size**
using standard SQL.
* well.. almost anything
** within reason
Drill acts as a universal
translator for data.
gtkcyber.com
ETL Data Warehouse
gtkcyber.com
gtkcyber.com
SELECT src_ip, COUNT(*)

FROM dfs.test.`bad.pcap`
GROUP BY src_ip
Table
Whoa! I can query PCAP as if it
was a database!!
gtkcyber.com
So let's take a look...
gtkcyber.com
For instructions on how to
use Apache Drill with Apache
Superset (Incubating)
http://thedataist.com/visualize-anything-with-superset-and-drill/
gtkcyber.com
158.222.5.157 - - [25/Oct/2015:04:24:37 +0100] "GET /acl_users/
credentials_cookie_auth/require_login?came_from=http%3A//
howto.basjes.nl/join_form HTTP/1.1" 200 10716 "http://
howto.basjes.nl/join_form" "Mozilla/5.0 (Windows NT 6.3; WOW64;
rv:34.0) Gecko/20100101 Firefox/34.0 AlexaToolbar/alxf-2.21"
158.222.5.157 - - [25/Oct/2015:04:24:39 +0100] "GET /login_form
HTTP/1.1" 200 10543 "http://howto.basjes.nl/" "Mozilla/5.0
(Windows NT 6.3; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0
AlexaToolbar/alxf-2.21"
gtkcyber.com https://github.com/nielsbasjes/logparser/tree/master/examples/demolog
gtkcyber.com https://github.com/nielsbasjes/logparser/tree/master/examples/demolog
Who is Trying to Hack This Site?
• The url /join_form is not public so anyone attempting to access this site,
so almost anyone trying to access this probably a hacker... 

• Let's see who is looking...
https://github.com/nielsbasjes/logparser/tree/master/examples/demolog
Who is Trying to Hack This Site?
SELECT request_receive_time,
connection_client_host,
request_useragent
FROM dfs.test.`hackers-access.httpd`
WHERE request_firstline_uri = '/join_form'
https://github.com/nielsbasjes/logparser/tree/master/examples/demolog
Who is Trying to Hack This Site?
SELECT request_receive_time,
connection_client_host,
request_useragent
FROM dfs.test.`hackers-access.httpd`
WHERE request_firstline_uri = '/join_form'
Who is Trying to Hack This Site?
Let's take a look at those IP
addresses shall we?
Who is Trying to Hack This Site?
SELECT request_receive_time,
connection_client_host,
request_useragent
FROM dfs.test.`hackers-access.httpd`
WHERE request_firstline_uri = '/join_form'
Drill's Flexible UDF Interface Allows
you to write your own functions.
A collection of GeoIP Functions
is available on GitHub.
https://github.com/cgivre/drill-geoip-functions
Drill GeoIP Functions
• getCountryName( <ip> ): This function returns the country name of the IP address, "Unknown" if the IP is unknown or invalid.

• getCountryConfidence( <ip> ): This function returns the confidence score of the country ISO code of the IP address.

• getCountryISOCode( <ip> ): This function returns the country ISO code of the IP address, "Unknown" if the IP is unknown or invalid.

• getCityName( <ip> ): This function returns the city name of the IP address, "Unknown" if the IP is unknown or invalid.

• getCityConfidence( <ip> ): This function returns confidence score of the city name of the IP address.

• getLatitude( <ip> ): This function returns the latitude associated with the IP address.

• getLongitude( <ip> ): This function returns the longitude associated with the IP address.

• getTimezone( <ip> ): This function returns the timezone associated with the IP address.

• getAccuracyRadius( <ip> ): This function returns the accuracy radius associated with the IP address, 0 if unknown.

• getAverageIncome( <ip> ): This function returns the average income of the region associated with the IP address, 0 if unknown.

• getMetroCode( <ip> ): This function returns the metro code of the region associated with the IP address, 0 if unknown.

• getPopulationDensity( <ip> ): This function returns the population density associated with the IP address.

• getPostalCode( <ip> ): This function returns the postal code associated with the IP address.

• getCoordPoint( <ip> ): This function returns a point for use in GIS functions of the lat/long of associated with the IP address.

• getASN( <ip> ): This function returns the autonomous system of the IP address, "Unknown" if the IP is unknown or invalid.

• getASNOrganization( <ip> ): This function returns the autonomous system organization of the IP address, "Unknown" if the IP is
unknown or invalid.

• isEU( <ip> ), isEuropeanUnion( <ip> ): This function returns true if the ip address is located in the European Union, false if not.

• isAnonymous( <ip> ): This function returns true if the ip address is anonymous, false if not.

• isAnonymousVPN( <ip> ): This function returns true if the ip address is an anonymous virtual private network (VPN), false if not.

• isHostingProvider( <ip> ): This function returns true if the ip address is a hosting provider, false if not.

• isPublciProxy( <ip> ): This function returns true if the ip address is a public proxy, false if not.

• isTORExitNode( <ip> ): This function returns true if the ip address is a known TOR exit node, false if not.
https://github.com/cgivre/drill-geoip-functions
Who is Trying to Hack This Site?
SELECT request_receive_time, connection_client_host,
request_useragent,
getCountryName(connection_client_host ) as countryName,
getCityName(connection_client_host ) as cityName
FROM dfs.test.`hackers-access.httpd`
WHERE request_firstline_uri = '/join_form'
Who is Trying to Hack This Site?
Who is Trying to Hack This Site?
SELECT request_receive_time, connection_client_host, request_useragent,
getCountryName(connection_client_host ) as countryName,
getCityName(connection_client_host ) as cityName,
getLatitude(connection_client_host ) as latitude,
getLongitude(connection_client_host ) as longitude
FROM dfs.test.`hackers-access.httpd`
WHERE request_firstline_uri = '/join_form'
Who is Trying to Hack This Site?
What equipment are they using?
What equipment are they using?
• Drill has support for multidimensional data structures including KV Pairs
and Lists.

• There is a pre-existing UDF (https://github.com/nielsbasjes/yauaa) for
Apache Drill which can parse User Agent Strings and get you a lot of
useful information from the UA string.
Who is Trying to Hack This Site?
SELECT parse_user_agent(request_useragent) AS ua
FROM dfs.test.`hackers-access.httpd`
WHERE request_firstline_uri = '/join_form'
Who is Trying to Hack This Site?
SELECT table1.ua.OperatingSystemNameVersion as os,
table1.ua.AgentNameVersionMajor as browser
FROM
(
SELECT
parse_user_agent(request_useragent) AS ua
FROM dfs.test.`hackers-access.httpd`
WHERE request_firstline_uri = '/join_form'
) AS table1
Questions so far?
Let's have some fun with PCAP
Let's have some fun with PCAP
• PCAP is short for Packet Capture and represent raw network traffic. 

• Files are encoded in binary format, but various tools exist to analyze
PCAP files or convert them into more easily accessible formats, such as
JSON.
PCAP Analysis
PCAP Analysis
• Drill has a comprehensive collection of networking functions to facilitate
network analysis

• is_private_ip(), is_valid_ip(), in_network() and many others. 

• inet_aton() and inet_ntoa() exist to facilitate sorting IP ranges.
PCAP Analysis
Let's look for who is talking to
whom...
PCAP Analysis
SELECT DISTINCT src_ip, dst_ip
FROM dfs.test.`slowdownload.pcap`
WHERE src_ip is not null
SynFlood Detection
SYN
SYN/ACK
ACK
SynFlood Detection
SYN
SYN/ACK
SYN
Drill can automate SYN Flood
detection with a custom UDF.
https://github.com/cgivre/drill-synflood-udf
SYN Flood Detection
SELECT tcp_session
FROM dfs.test.`synscan.pcap`
GROUP BY tcp_session
HAVING is_syn_flood(tcp_session, tcp_flags_syn, tcp_flags_ack)
SYN Flood Detection
SELECT *
FROM dfs.test.`synscan.pcap`
WHERE tcp_session IN
(
SELECT tcp_session
FROM dfs.test.`synscan.pcap`
GROUP BY tcp_session
HAVING is_syn_flood(tcp_session, tcp_flags_syn, tcp_flags_ack)
)
Questions so far?
How about log files?
Dec 12 2018 06:50:25 sshd[36669]: Failed password for root from 61.160.251.136 port 1313 ssh2
Dec 12 2018 06:50:25 sshd[36669]: Failed password for root from 61.160.251.136 port 1313 ssh2
Dec 12 2018 06:50:24 sshd[36669]: Failed password for root from 61.160.251.136 port 1313 ssh2
Dec 12 2018 03:36:23 sshd[41875]: Failed password for root from 222.189.239.10 port 1350 ssh2
Dec 12 2018 03:36:22 sshd[41875]: Failed password for root from 222.189.239.10 port 1350 ssh2
Dec 12 2018 03:36:22 sshlockout[15383]: Locking out 222.189.239.10 after 15 invalid attempts
Dec 12 2018 03:36:22 sshd[41875]: Failed password for root from 222.189.239.10 port 1350 ssh2
Dec 12 2018 03:36:22 sshlockout[15383]: Locking out 222.189.239.10 after 15 invalid attempts
Dec 12 2018 03:36:22 sshd[42419]: Failed password for root from 222.189.239.10 port 2646 ssh2
Drill can natively query Syslog
formatted data*
* This feature will be available in Drill version 1.16
Questions?
https://amzn.to/2HDwE92
Thank You!!
Charles S. Givre, CISSP

charles.givre@gtkcyber.com

www.thedataist.com

Mais conteúdo relacionado

Mais procurados

Building a Big Data Pipeline
Building a Big Data PipelineBuilding a Big Data Pipeline
Building a Big Data PipelineJesus Rodriguez
 
Monitoring your Python with Prometheus (Python Ireland April 2015)
Monitoring your Python with Prometheus (Python Ireland April 2015)Monitoring your Python with Prometheus (Python Ireland April 2015)
Monitoring your Python with Prometheus (Python Ireland April 2015)Brian Brazil
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake OverviewJames Serra
 
Lambda architecture for real time big data
Lambda architecture for real time big dataLambda architecture for real time big data
Lambda architecture for real time big dataTrieu Nguyen
 
Near real-time statistical modeling and anomaly detection using Flink!
Near real-time statistical modeling and anomaly detection using Flink!Near real-time statistical modeling and anomaly detection using Flink!
Near real-time statistical modeling and anomaly detection using Flink!Flink Forward
 
Data Democratization at Nubank
 Data Democratization at Nubank Data Democratization at Nubank
Data Democratization at NubankDatabricks
 
Introduction to ETL process
Introduction to ETL process Introduction to ETL process
Introduction to ETL process Omid Vahdaty
 
Types of connections in Power BI
Types of connections in Power BITypes of connections in Power BI
Types of connections in Power BISwapnil Jadhav
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with HadoopPhilippe Julio
 
Data Quality & Data Governance
Data Quality & Data GovernanceData Quality & Data Governance
Data Quality & Data GovernanceTuba Yaman Him
 
Metadata & Taxonomy: The Foundation for Content and Digital Asset Management
Metadata & Taxonomy: The Foundation for Content and Digital Asset ManagementMetadata & Taxonomy: The Foundation for Content and Digital Asset Management
Metadata & Taxonomy: The Foundation for Content and Digital Asset ManagementVeeva Systems
 
07. Analytics & Reporting Requirements Template
07. Analytics & Reporting Requirements Template07. Analytics & Reporting Requirements Template
07. Analytics & Reporting Requirements TemplateAlan D. Duncan
 
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...Simplilearn
 
Data strategy in a Big Data world
Data strategy in a Big Data worldData strategy in a Big Data world
Data strategy in a Big Data worldCraig Milroy
 
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...HostedbyConfluent
 
Accelerating Data Ingestion with Databricks Autoloader
Accelerating Data Ingestion with Databricks AutoloaderAccelerating Data Ingestion with Databricks Autoloader
Accelerating Data Ingestion with Databricks AutoloaderDatabricks
 
Data Observability Best Pracices
Data Observability Best PracicesData Observability Best Pracices
Data Observability Best PracicesAndy Petrella
 

Mais procurados (20)

Building a Big Data Pipeline
Building a Big Data PipelineBuilding a Big Data Pipeline
Building a Big Data Pipeline
 
Monitoring your Python with Prometheus (Python Ireland April 2015)
Monitoring your Python with Prometheus (Python Ireland April 2015)Monitoring your Python with Prometheus (Python Ireland April 2015)
Monitoring your Python with Prometheus (Python Ireland April 2015)
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake Overview
 
Lambda architecture for real time big data
Lambda architecture for real time big dataLambda architecture for real time big data
Lambda architecture for real time big data
 
Near real-time statistical modeling and anomaly detection using Flink!
Near real-time statistical modeling and anomaly detection using Flink!Near real-time statistical modeling and anomaly detection using Flink!
Near real-time statistical modeling and anomaly detection using Flink!
 
Data Democratization at Nubank
 Data Democratization at Nubank Data Democratization at Nubank
Data Democratization at Nubank
 
Introduction to ETL process
Introduction to ETL process Introduction to ETL process
Introduction to ETL process
 
Big Data Readiness Assessment
Big Data Readiness AssessmentBig Data Readiness Assessment
Big Data Readiness Assessment
 
Types of connections in Power BI
Types of connections in Power BITypes of connections in Power BI
Types of connections in Power BI
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
 
Data Infra and Data Access in Nubank
Data Infra and Data Access in NubankData Infra and Data Access in Nubank
Data Infra and Data Access in Nubank
 
Data Quality & Data Governance
Data Quality & Data GovernanceData Quality & Data Governance
Data Quality & Data Governance
 
Metadata & Taxonomy: The Foundation for Content and Digital Asset Management
Metadata & Taxonomy: The Foundation for Content and Digital Asset ManagementMetadata & Taxonomy: The Foundation for Content and Digital Asset Management
Metadata & Taxonomy: The Foundation for Content and Digital Asset Management
 
Analysing Data in Real-time
Analysing Data in Real-timeAnalysing Data in Real-time
Analysing Data in Real-time
 
07. Analytics & Reporting Requirements Template
07. Analytics & Reporting Requirements Template07. Analytics & Reporting Requirements Template
07. Analytics & Reporting Requirements Template
 
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...
 
Data strategy in a Big Data world
Data strategy in a Big Data worldData strategy in a Big Data world
Data strategy in a Big Data world
 
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...
 
Accelerating Data Ingestion with Databricks Autoloader
Accelerating Data Ingestion with Databricks AutoloaderAccelerating Data Ingestion with Databricks Autoloader
Accelerating Data Ingestion with Databricks Autoloader
 
Data Observability Best Pracices
Data Observability Best PracicesData Observability Best Pracices
Data Observability Best Pracices
 

Semelhante a Drilling Cyber Security Data With Apache Drill

Creating PostgreSQL-as-a-Service at Scale
Creating PostgreSQL-as-a-Service at ScaleCreating PostgreSQL-as-a-Service at Scale
Creating PostgreSQL-as-a-Service at ScaleSean Chittenden
 
PyCon AU 2012 - Debugging Live Python Web Applications
PyCon AU 2012 - Debugging Live Python Web ApplicationsPyCon AU 2012 - Debugging Live Python Web Applications
PyCon AU 2012 - Debugging Live Python Web ApplicationsGraham Dumpleton
 
Who pulls the strings?
Who pulls the strings?Who pulls the strings?
Who pulls the strings?Ronny
 
The basics of hacking and penetration testing 이제 시작이야 해킹과 침투 테스트 kenneth.s.kwon
The basics of hacking and penetration testing 이제 시작이야 해킹과 침투 테스트 kenneth.s.kwonThe basics of hacking and penetration testing 이제 시작이야 해킹과 침투 테스트 kenneth.s.kwon
The basics of hacking and penetration testing 이제 시작이야 해킹과 침투 테스트 kenneth.s.kwonKenneth Kwon
 
How ElasticSearch lives in my DevOps life
How ElasticSearch lives in my DevOps lifeHow ElasticSearch lives in my DevOps life
How ElasticSearch lives in my DevOps life琛琳 饶
 
Exploring the Final Frontier of Data Center Orchestration: Network Elements -...
Exploring the Final Frontier of Data Center Orchestration: Network Elements -...Exploring the Final Frontier of Data Center Orchestration: Network Elements -...
Exploring the Final Frontier of Data Center Orchestration: Network Elements -...Puppet
 
System insight without Interference
System insight without InterferenceSystem insight without Interference
System insight without InterferenceTony Tam
 
Hadoop Eagle - Real Time Monitoring Framework for eBay Hadoop
Hadoop Eagle - Real Time Monitoring Framework for eBay HadoopHadoop Eagle - Real Time Monitoring Framework for eBay Hadoop
Hadoop Eagle - Real Time Monitoring Framework for eBay HadoopDataWorks Summit
 
Presto anatomy
Presto anatomyPresto anatomy
Presto anatomyDongmin Yu
 
(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...
(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...
(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...Amazon Web Services
 
Running High-Speed Serverless with nuclio
Running High-Speed Serverless with nuclioRunning High-Speed Serverless with nuclio
Running High-Speed Serverless with nuclioiguazio
 
Ansible benelux meetup - Amsterdam 27-5-2015
Ansible benelux meetup - Amsterdam 27-5-2015Ansible benelux meetup - Amsterdam 27-5-2015
Ansible benelux meetup - Amsterdam 27-5-2015Pavel Chunyayev
 
introduction to node.js
introduction to node.jsintroduction to node.js
introduction to node.jsorkaplan
 
SwampDragon presentation: The Copenhagen Django Meetup Group
SwampDragon presentation: The Copenhagen Django Meetup GroupSwampDragon presentation: The Copenhagen Django Meetup Group
SwampDragon presentation: The Copenhagen Django Meetup GroupErnest Jumbe
 
Docker Logging and analysing with Elastic Stack - Jakub Hajek
Docker Logging and analysing with Elastic Stack - Jakub Hajek Docker Logging and analysing with Elastic Stack - Jakub Hajek
Docker Logging and analysing with Elastic Stack - Jakub Hajek PROIDEA
 
Docker Logging and analysing with Elastic Stack
Docker Logging and analysing with Elastic StackDocker Logging and analysing with Elastic Stack
Docker Logging and analysing with Elastic StackJakub Hajek
 
Being HAPI! Reverse Proxying on Purpose
Being HAPI! Reverse Proxying on PurposeBeing HAPI! Reverse Proxying on Purpose
Being HAPI! Reverse Proxying on PurposeAman Kohli
 

Semelhante a Drilling Cyber Security Data With Apache Drill (20)

Creating PostgreSQL-as-a-Service at Scale
Creating PostgreSQL-as-a-Service at ScaleCreating PostgreSQL-as-a-Service at Scale
Creating PostgreSQL-as-a-Service at Scale
 
Apache Spark v3.0.0
Apache Spark v3.0.0Apache Spark v3.0.0
Apache Spark v3.0.0
 
PyCon AU 2012 - Debugging Live Python Web Applications
PyCon AU 2012 - Debugging Live Python Web ApplicationsPyCon AU 2012 - Debugging Live Python Web Applications
PyCon AU 2012 - Debugging Live Python Web Applications
 
Who pulls the strings?
Who pulls the strings?Who pulls the strings?
Who pulls the strings?
 
The basics of hacking and penetration testing 이제 시작이야 해킹과 침투 테스트 kenneth.s.kwon
The basics of hacking and penetration testing 이제 시작이야 해킹과 침투 테스트 kenneth.s.kwonThe basics of hacking and penetration testing 이제 시작이야 해킹과 침투 테스트 kenneth.s.kwon
The basics of hacking and penetration testing 이제 시작이야 해킹과 침투 테스트 kenneth.s.kwon
 
Logstash
LogstashLogstash
Logstash
 
How ElasticSearch lives in my DevOps life
How ElasticSearch lives in my DevOps lifeHow ElasticSearch lives in my DevOps life
How ElasticSearch lives in my DevOps life
 
Exploring the Final Frontier of Data Center Orchestration: Network Elements -...
Exploring the Final Frontier of Data Center Orchestration: Network Elements -...Exploring the Final Frontier of Data Center Orchestration: Network Elements -...
Exploring the Final Frontier of Data Center Orchestration: Network Elements -...
 
System insight without Interference
System insight without InterferenceSystem insight without Interference
System insight without Interference
 
Hadoop Eagle - Real Time Monitoring Framework for eBay Hadoop
Hadoop Eagle - Real Time Monitoring Framework for eBay HadoopHadoop Eagle - Real Time Monitoring Framework for eBay Hadoop
Hadoop Eagle - Real Time Monitoring Framework for eBay Hadoop
 
Presto anatomy
Presto anatomyPresto anatomy
Presto anatomy
 
BIG DATA ANALYSIS
BIG DATA ANALYSISBIG DATA ANALYSIS
BIG DATA ANALYSIS
 
(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...
(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...
(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...
 
Running High-Speed Serverless with nuclio
Running High-Speed Serverless with nuclioRunning High-Speed Serverless with nuclio
Running High-Speed Serverless with nuclio
 
Ansible benelux meetup - Amsterdam 27-5-2015
Ansible benelux meetup - Amsterdam 27-5-2015Ansible benelux meetup - Amsterdam 27-5-2015
Ansible benelux meetup - Amsterdam 27-5-2015
 
introduction to node.js
introduction to node.jsintroduction to node.js
introduction to node.js
 
SwampDragon presentation: The Copenhagen Django Meetup Group
SwampDragon presentation: The Copenhagen Django Meetup GroupSwampDragon presentation: The Copenhagen Django Meetup Group
SwampDragon presentation: The Copenhagen Django Meetup Group
 
Docker Logging and analysing with Elastic Stack - Jakub Hajek
Docker Logging and analysing with Elastic Stack - Jakub Hajek Docker Logging and analysing with Elastic Stack - Jakub Hajek
Docker Logging and analysing with Elastic Stack - Jakub Hajek
 
Docker Logging and analysing with Elastic Stack
Docker Logging and analysing with Elastic StackDocker Logging and analysing with Elastic Stack
Docker Logging and analysing with Elastic Stack
 
Being HAPI! Reverse Proxying on Purpose
Being HAPI! Reverse Proxying on PurposeBeing HAPI! Reverse Proxying on Purpose
Being HAPI! Reverse Proxying on Purpose
 

Mais de Charles Givre

Data Exploration with Apache Drill: Day 2
Data Exploration with Apache Drill: Day 2Data Exploration with Apache Drill: Day 2
Data Exploration with Apache Drill: Day 2Charles Givre
 
Data Exploration with Apache Drill: Day 1
Data Exploration with Apache Drill:  Day 1Data Exploration with Apache Drill:  Day 1
Data Exploration with Apache Drill: Day 1Charles Givre
 
Killing ETL with Apache Drill
Killing ETL with Apache DrillKilling ETL with Apache Drill
Killing ETL with Apache DrillCharles Givre
 
What Does Your Smart Car Know About You? Strata London 2016
What Does Your Smart Car Know About You?  Strata London 2016What Does Your Smart Car Know About You?  Strata London 2016
What Does Your Smart Car Know About You? Strata London 2016Charles Givre
 
Apache Drill Workshop
Apache Drill WorkshopApache Drill Workshop
Apache Drill WorkshopCharles Givre
 
Merlin: The Ultimate Data Science Environment
Merlin: The Ultimate Data Science EnvironmentMerlin: The Ultimate Data Science Environment
Merlin: The Ultimate Data Science EnvironmentCharles Givre
 
Apache Drill and Zeppelin: Two Promising Tools You've Never Heard Of
Apache Drill and Zeppelin: Two Promising Tools You've Never Heard OfApache Drill and Zeppelin: Two Promising Tools You've Never Heard Of
Apache Drill and Zeppelin: Two Promising Tools You've Never Heard OfCharles Givre
 
Strata NYC 2015 What does your smart device know about you?
Strata NYC 2015 What does your smart device know about you?Strata NYC 2015 What does your smart device know about you?
Strata NYC 2015 What does your smart device know about you?Charles Givre
 

Mais de Charles Givre (9)

Blockchain and UDFs
Blockchain and UDFsBlockchain and UDFs
Blockchain and UDFs
 
Data Exploration with Apache Drill: Day 2
Data Exploration with Apache Drill: Day 2Data Exploration with Apache Drill: Day 2
Data Exploration with Apache Drill: Day 2
 
Data Exploration with Apache Drill: Day 1
Data Exploration with Apache Drill:  Day 1Data Exploration with Apache Drill:  Day 1
Data Exploration with Apache Drill: Day 1
 
Killing ETL with Apache Drill
Killing ETL with Apache DrillKilling ETL with Apache Drill
Killing ETL with Apache Drill
 
What Does Your Smart Car Know About You? Strata London 2016
What Does Your Smart Car Know About You?  Strata London 2016What Does Your Smart Car Know About You?  Strata London 2016
What Does Your Smart Car Know About You? Strata London 2016
 
Apache Drill Workshop
Apache Drill WorkshopApache Drill Workshop
Apache Drill Workshop
 
Merlin: The Ultimate Data Science Environment
Merlin: The Ultimate Data Science EnvironmentMerlin: The Ultimate Data Science Environment
Merlin: The Ultimate Data Science Environment
 
Apache Drill and Zeppelin: Two Promising Tools You've Never Heard Of
Apache Drill and Zeppelin: Two Promising Tools You've Never Heard OfApache Drill and Zeppelin: Two Promising Tools You've Never Heard Of
Apache Drill and Zeppelin: Two Promising Tools You've Never Heard Of
 
Strata NYC 2015 What does your smart device know about you?
Strata NYC 2015 What does your smart device know about you?Strata NYC 2015 What does your smart device know about you?
Strata NYC 2015 What does your smart device know about you?
 

Último

%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...masabamasaba
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...chiefasafspells
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...masabamasaba
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrandmasabamasaba
 
%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Hararemasabamasaba
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension AidPhilip Schwarz
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplatePresentation.STUDIO
 
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisamasabamasaba
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...masabamasaba
 
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech studentsHimanshiGarg82
 
tonesoftg
tonesoftgtonesoftg
tonesoftglanshi9
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfonteinmasabamasaba
 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2
 
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyviewmasabamasaba
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastPapp Krisztián
 

Último (20)

%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
 
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
 
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
tonesoftg
tonesoftgtonesoftg
tonesoftg
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
 
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the past
 

Drilling Cyber Security Data With Apache Drill

  • 1. Drilling Security Data Charles S. Givre, CISSP charles.givre@gtkcyber.com www.thedataist.com
  • 2. • Lead Data Scientist at Deutsche Bank • PMC Member of Apache Drill Project • Co-founder GTK Cyber • Senior Lead Data Scientist @ Booz Allen • 5 Years @ CIA • Working on Apache Drill Book • Masters Degree in Middle Eastern Studies • Undergraduate in Comp.Sci & Music Charles Givre
  • 4. gtkcyber.com Security Data Analysis is Hard…Really Hard
  • 6. Security Data comes in Many Forms
  • 7. Security Data Comes in Many Forms • Standard Types such as JSON/CSV/XML • Log Files: Event Logs, Database, Web Server etc. • Syslog • PCAP / PCAP-NG: Binary, raw network traffic
  • 8. Security Data Lives In Many Places • File systems • Cloud Storage: Azure / S3 • Databases • Real Time Event Streams • Other sources
  • 9. Few tools can effectively analyze these data types effectively
  • 10. Even fewer tools can effectively analyze ALL these data types effectively
  • 11. Splunk and ELK are solid SIEM platforms
  • 12. Wireshark is good for PCAP analysis
  • 13. gtkcyber.com Security data is not arranged in an optimal way for ad-hoc analysis
  • 14. gtkcyber.com Security data is not arranged in an optimal way for ad-hoc analysis ETL Data Warehouse
  • 15. ETL is expensive and wasteful
  • 16. gtkcyber.com Analytics teams spend between 50%-90% of their time preparing their data.
  • 17. gtkcyber.com 76% of Data Scientists say this is the least enjoyable part of their job. http://visit.crowdflower.com/rs/416-ZBE-142/images/CrowdFlower_DataScienceReport_2016.pdf
  • 18. gtkcyber.com The ETL Process consumes the most time and contributes almost no value to the end product.
  • 20. So where does Drill fit in?
  • 21. Apache Drill is an SQL Engine for self-describing data.
  • 22. Drill lets you query anything*, wherever it is*, no matter its size** using standard SQL. * well.. almost anything ** within reason
  • 23. Drill acts as a universal translator for data.
  • 26. gtkcyber.com SELECT src_ip, COUNT(*)
 FROM dfs.test.`bad.pcap` GROUP BY src_ip Table Whoa! I can query PCAP as if it was a database!!
  • 28. gtkcyber.com For instructions on how to use Apache Drill with Apache Superset (Incubating) http://thedataist.com/visualize-anything-with-superset-and-drill/
  • 29. gtkcyber.com 158.222.5.157 - - [25/Oct/2015:04:24:37 +0100] "GET /acl_users/ credentials_cookie_auth/require_login?came_from=http%3A// howto.basjes.nl/join_form HTTP/1.1" 200 10716 "http:// howto.basjes.nl/join_form" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0 AlexaToolbar/alxf-2.21" 158.222.5.157 - - [25/Oct/2015:04:24:39 +0100] "GET /login_form HTTP/1.1" 200 10543 "http://howto.basjes.nl/" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0 AlexaToolbar/alxf-2.21"
  • 32. Who is Trying to Hack This Site? • The url /join_form is not public so anyone attempting to access this site, so almost anyone trying to access this probably a hacker... • Let's see who is looking... https://github.com/nielsbasjes/logparser/tree/master/examples/demolog
  • 33. Who is Trying to Hack This Site? SELECT request_receive_time, connection_client_host, request_useragent FROM dfs.test.`hackers-access.httpd` WHERE request_firstline_uri = '/join_form' https://github.com/nielsbasjes/logparser/tree/master/examples/demolog
  • 34. Who is Trying to Hack This Site? SELECT request_receive_time, connection_client_host, request_useragent FROM dfs.test.`hackers-access.httpd` WHERE request_firstline_uri = '/join_form'
  • 35. Who is Trying to Hack This Site?
  • 36. Let's take a look at those IP addresses shall we?
  • 37. Who is Trying to Hack This Site? SELECT request_receive_time, connection_client_host, request_useragent FROM dfs.test.`hackers-access.httpd` WHERE request_firstline_uri = '/join_form'
  • 38. Drill's Flexible UDF Interface Allows you to write your own functions.
  • 39. A collection of GeoIP Functions is available on GitHub. https://github.com/cgivre/drill-geoip-functions
  • 40. Drill GeoIP Functions • getCountryName( <ip> ): This function returns the country name of the IP address, "Unknown" if the IP is unknown or invalid. • getCountryConfidence( <ip> ): This function returns the confidence score of the country ISO code of the IP address. • getCountryISOCode( <ip> ): This function returns the country ISO code of the IP address, "Unknown" if the IP is unknown or invalid. • getCityName( <ip> ): This function returns the city name of the IP address, "Unknown" if the IP is unknown or invalid. • getCityConfidence( <ip> ): This function returns confidence score of the city name of the IP address. • getLatitude( <ip> ): This function returns the latitude associated with the IP address. • getLongitude( <ip> ): This function returns the longitude associated with the IP address. • getTimezone( <ip> ): This function returns the timezone associated with the IP address. • getAccuracyRadius( <ip> ): This function returns the accuracy radius associated with the IP address, 0 if unknown. • getAverageIncome( <ip> ): This function returns the average income of the region associated with the IP address, 0 if unknown. • getMetroCode( <ip> ): This function returns the metro code of the region associated with the IP address, 0 if unknown. • getPopulationDensity( <ip> ): This function returns the population density associated with the IP address. • getPostalCode( <ip> ): This function returns the postal code associated with the IP address. • getCoordPoint( <ip> ): This function returns a point for use in GIS functions of the lat/long of associated with the IP address. • getASN( <ip> ): This function returns the autonomous system of the IP address, "Unknown" if the IP is unknown or invalid. • getASNOrganization( <ip> ): This function returns the autonomous system organization of the IP address, "Unknown" if the IP is unknown or invalid. • isEU( <ip> ), isEuropeanUnion( <ip> ): This function returns true if the ip address is located in the European Union, false if not. • isAnonymous( <ip> ): This function returns true if the ip address is anonymous, false if not. • isAnonymousVPN( <ip> ): This function returns true if the ip address is an anonymous virtual private network (VPN), false if not. • isHostingProvider( <ip> ): This function returns true if the ip address is a hosting provider, false if not. • isPublciProxy( <ip> ): This function returns true if the ip address is a public proxy, false if not. • isTORExitNode( <ip> ): This function returns true if the ip address is a known TOR exit node, false if not. https://github.com/cgivre/drill-geoip-functions
  • 41. Who is Trying to Hack This Site? SELECT request_receive_time, connection_client_host, request_useragent, getCountryName(connection_client_host ) as countryName, getCityName(connection_client_host ) as cityName FROM dfs.test.`hackers-access.httpd` WHERE request_firstline_uri = '/join_form'
  • 42. Who is Trying to Hack This Site?
  • 43. Who is Trying to Hack This Site? SELECT request_receive_time, connection_client_host, request_useragent, getCountryName(connection_client_host ) as countryName, getCityName(connection_client_host ) as cityName, getLatitude(connection_client_host ) as latitude, getLongitude(connection_client_host ) as longitude FROM dfs.test.`hackers-access.httpd` WHERE request_firstline_uri = '/join_form'
  • 44. Who is Trying to Hack This Site?
  • 45. What equipment are they using?
  • 46. What equipment are they using? • Drill has support for multidimensional data structures including KV Pairs and Lists. • There is a pre-existing UDF (https://github.com/nielsbasjes/yauaa) for Apache Drill which can parse User Agent Strings and get you a lot of useful information from the UA string.
  • 47. Who is Trying to Hack This Site? SELECT parse_user_agent(request_useragent) AS ua FROM dfs.test.`hackers-access.httpd` WHERE request_firstline_uri = '/join_form'
  • 48. Who is Trying to Hack This Site? SELECT table1.ua.OperatingSystemNameVersion as os, table1.ua.AgentNameVersionMajor as browser FROM ( SELECT parse_user_agent(request_useragent) AS ua FROM dfs.test.`hackers-access.httpd` WHERE request_firstline_uri = '/join_form' ) AS table1
  • 50. Let's have some fun with PCAP
  • 51. Let's have some fun with PCAP • PCAP is short for Packet Capture and represent raw network traffic. • Files are encoded in binary format, but various tools exist to analyze PCAP files or convert them into more easily accessible formats, such as JSON.
  • 53. PCAP Analysis • Drill has a comprehensive collection of networking functions to facilitate network analysis • is_private_ip(), is_valid_ip(), in_network() and many others. • inet_aton() and inet_ntoa() exist to facilitate sorting IP ranges.
  • 55. Let's look for who is talking to whom...
  • 56. PCAP Analysis SELECT DISTINCT src_ip, dst_ip FROM dfs.test.`slowdownload.pcap` WHERE src_ip is not null
  • 59. Drill can automate SYN Flood detection with a custom UDF. https://github.com/cgivre/drill-synflood-udf
  • 60. SYN Flood Detection SELECT tcp_session FROM dfs.test.`synscan.pcap` GROUP BY tcp_session HAVING is_syn_flood(tcp_session, tcp_flags_syn, tcp_flags_ack)
  • 61. SYN Flood Detection SELECT * FROM dfs.test.`synscan.pcap` WHERE tcp_session IN ( SELECT tcp_session FROM dfs.test.`synscan.pcap` GROUP BY tcp_session HAVING is_syn_flood(tcp_session, tcp_flags_syn, tcp_flags_ack) )
  • 63. How about log files?
  • 64. Dec 12 2018 06:50:25 sshd[36669]: Failed password for root from 61.160.251.136 port 1313 ssh2 Dec 12 2018 06:50:25 sshd[36669]: Failed password for root from 61.160.251.136 port 1313 ssh2 Dec 12 2018 06:50:24 sshd[36669]: Failed password for root from 61.160.251.136 port 1313 ssh2 Dec 12 2018 03:36:23 sshd[41875]: Failed password for root from 222.189.239.10 port 1350 ssh2 Dec 12 2018 03:36:22 sshd[41875]: Failed password for root from 222.189.239.10 port 1350 ssh2 Dec 12 2018 03:36:22 sshlockout[15383]: Locking out 222.189.239.10 after 15 invalid attempts Dec 12 2018 03:36:22 sshd[41875]: Failed password for root from 222.189.239.10 port 1350 ssh2 Dec 12 2018 03:36:22 sshlockout[15383]: Locking out 222.189.239.10 after 15 invalid attempts Dec 12 2018 03:36:22 sshd[42419]: Failed password for root from 222.189.239.10 port 2646 ssh2
  • 65.
  • 66. Drill can natively query Syslog formatted data* * This feature will be available in Drill version 1.16
  • 69. Thank You!! Charles S. Givre, CISSP charles.givre@gtkcyber.com www.thedataist.com