©2019 FireEye

About Us
• Michael Sikorski
• Philip Tully
• Jay Gibble
• Matthew Haigh

"HTTP/1.1 200 OK "

One String can Make a Difference
Cobalt Strike team server detection
• The NanoHTTPD web server produces extra whitespace in the HTTP status line
• The behavior continued for 7 years
• Used as a detection signature to track threat actors and identify C2 addresses
https://blog.fox-it.com/2019/02/26/identifying-cobalt-strike-team-servers-in-the-wild/

Running Strings on larger binaries produces tens of thousands of strings.

Strings produces a ton of noise mixed in with important information.

What is a String
• N characters + NULL
• No file format or context is considered
• 0x31 0x33 0x33 0x37 0x00
– ‘1337’, right?
• Not necessarily; the same bytes could be:
– a memory address
– CPU instructions
– data used by the program
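
To make the ambiguity above concrete, here is a minimal sketch (not from the original deck) that reads the same five bytes two ways: as a NUL-terminated ASCII string, and, taking the first four bytes, as a little-endian 32-bit integer of the kind that could be an address or an operand. The variable names are illustrative only.

```python
import struct

raw = bytes([0x31, 0x33, 0x33, 0x37, 0x00])

# Interpretation 1: a NUL-terminated ASCII string.
as_text = raw.split(b"\x00", 1)[0].decode("ascii")

# Interpretation 2: the first four bytes as a little-endian 32-bit integer,
# for example a memory address or an immediate operand.
(as_int,) = struct.unpack("<I", raw[:4])

print(repr(as_text))  # '1337'
print(hex(as_int))    # 0x37333331
```

Nothing in the bytes themselves says which reading is the intended one, which is why a strings tool reports both real text and accidental matches.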

Wide Strings
• Also referred to as Unicode strings (in Windows terminology)
• The Windows OS uses wide strings internally
– Microsoft’s encoding standard is UTF-16 LE
• Each wide character is two bytes
• C-style wide character strings are terminated with a double NULL (0x00, 0x00)
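
As a quick illustration of the layout described above, the following sketch encodes the same text as a narrow (ASCII) string and as a wide (UTF-16 LE) string and prints the raw bytes. The sample text "Derby" is borrowed from the compilation slide; everything else is illustrative.

```python
text = "Derby"

# Narrow string: one byte per character plus a single NULL terminator.
narrow = text.encode("ascii") + b"\x00"

# Wide string: two bytes per character (UTF-16 LE) plus a two-byte terminator.
wide = text.encode("utf-16-le") + b"\x00\x00"

print(narrow.hex(" "))  # 44 65 72 62 79 00
print(wide.hex(" "))    # 44 00 65 00 72 00 62 00 79 00 00 00

# Round trip: drop the two-byte terminator and decode as UTF-16 LE.
decoded = wide[:-2].decode("utf-16-le")
print(decoded)  # Derby
```

For characters in the ASCII range, every second byte is 0x00, which is the pattern that wide-string extractors look for.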

Compilation
Source code:
    #include <stdio.h>
    int main() {
        printf("Derby");
        return 0;
    }
Object file: "Derby"
.EXE binary: .data section at 0x56000: "Derby"
Strings persist on disk throughout the compilation process.

The Strings Program
!This program cannot be run in DOS mode.
??3@YAXPAX@Z
??2@YAPAXI@Z
__CxxFrameHandler
_except_handler3
WSAStartup() error: %d
User-Agent: Mozilla/4.0 (compatible; MSIE 6.00; Windows NT 5.1)
GetLastInputInfo
SeShutdownPrivilege
%sIEXPLORE.EXE
SOFTWARE\Microsoft\Windows\CurrentVersion\App Paths\IEXPLORE.EXE
[Machine IdleTime:] %d days + %.2d:%.2d:%.2d
[Machine UpTime:] %-.2d Days %-.2d Hours %-.2d Minutes %-.2d Seconds
ServiceDll
SYSTEM\CurrentControlSet\Services\%s\Parameters
if exist "%s" goto selfkill
del "%s"
attrib -a -r -s -h "%s"
Inject '%s' to PID '%d' Successfully!
cmd.exe /c
Hi,Master [%d/%d/%d %d:%d:%d]

Malware Triage
• Customer: suspected compromise
• Incident Response: forensic analysis, identify malware sample
• Reverse Engineer: binary triage, malware analysis
reverse engineers, SOC analysts, red teamers, incident responders, malware researchers

Knowing which strings are relevant often requires highly experienced analysts.

Strings Tells a Story
Relevance
domain names
IP addresses
URLs
filenames
registry paths
registry keys
HTTP user-agent strings
service configuration info
keylogger indicators (e.g. “[DELETE]”, “[BS]”)
third party libraries
PDB strings
function names
debugging messages
command line help/usage options
OSINT
runtime artifacts
compiler artifacts
Windows APIs
library code
localizations
locations
languages
error messages
random byte sequences
format specifiers
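
Several of the categories above can be approximated with simple pattern matching. The sketch below is an assumption-laden illustration, not StringSifter's actual rules: the pattern names and regular expressions are hypothetical and deliberately loose.

```python
import re

# Hypothetical, deliberately loose patterns for a few of the categories above.
PATTERNS = {
    "url": re.compile(r"https?://[^\s\"']+", re.I),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "registry_path": re.compile(r"\b(?:SOFTWARE|SYSTEM)\\[\w\\ ]+", re.I),
    "format_specifier": re.compile(r"%[-+0 #]*\d*(?:\.\d+)?[diouxXsScCfFeEgGp]"),
}


def categorize(s):
    """Return the names of all patterns that match the string."""
    return [name for name, rx in PATTERNS.items() if rx.search(s)]


for s in ["http://evil.com", "WSAStartup() error: %d", "GetLastInputInfo"]:
    print(s, "->", categorize(s))
```

A string such as an API name matches none of these patterns, which hints at why hand-written rules alone do not capture relevance and why a learned ranking is attractive.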

Relevance is subjective and its definition can vary significantly across analysts.

Hypothesis and Goals
• Develop a tool that can:
– efficiently identify and prioritize strings based on relevance for malware analysis
• StringSifter should:
– be easy to use
– generalize across personas, use cases, and downstream apps
– save time and money
• How does it work?

Rankings are Everywhere

Our Favorite Products Serve Up Rankings
• Search engines
– web
– e-commerce
• News feeds
– social networks
• Recommender systems
– ads
– movies
– music

Learning to Rank (LTR)
• Create an optimal ordering of a list of items
• Precise individual item scores are less important than their relative ordering
• In classification, regression, and clustering, we predict a class or a single score
• LTR is rarely applied in security applications

LTR as Supervised Learning
• Rank items within unseen lists in a similar way to rankings within training lists
• Each item is associated with a set of features and an ordinal integer label
• The ordinal label is the teaching signal that encodes relevance level
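
One way to picture the training data described above is as one list per binary, where each item carries a feature vector and an ordinal label. The sketch below uses invented feature values and labels and a stand-in `score` function (all hypothetical); it checks how often the scores put differently labeled pairs in the right order, which is the property an LTR model is trained to improve.

```python
from itertools import combinations

# One training "list" per binary: (feature vector, ordinal relevance label).
# Feature values and labels here are invented for illustration.
training_lists = {
    "sample_a": [([0.9, 1.0], 3), ([0.4, 0.0], 1), ([0.1, 0.0], 0)],
    "sample_b": [([0.8, 1.0], 2), ([0.2, 0.0], 0)],
}


def score(features):
    """Stand-in scoring function; a trained ranker would replace this."""
    return sum(features)


def pairwise_accuracy(items):
    """Fraction of differently labeled pairs that the scores order correctly."""
    pairs = [(a, b) for a, b in combinations(items, 2) if a[1] != b[1]]
    correct = sum(
        (score(a[0]) > score(b[0])) == (a[1] > b[1]) for a, b in pairs
    )
    return correct / len(pairs)


for name, items in training_lists.items():
    print(name, pairwise_accuracy(items))
```

Only the relative order within each list matters here; the absolute score values never enter the comparison.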

Gradient Boosted Decision Trees
• Decision Trees
– greedily choose splits by Gini impurity
• Gradient Boosted Decision Trees (GBDTs)
– combine outputs from multiple Decision Trees
– reduce loss using gradient descent
– use the weighted sum of the trees’ predictions as the ensemble
• LightGBM
– GBDTs with an LTR objective function
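
The greedy split step mentioned above can be sketched in a few lines. This is a simplified, single-feature illustration of choosing one split by Gini impurity, not LightGBM's implementation; the feature values and labels are invented.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(values, labels):
    """Greedily pick the threshold with the lowest weighted Gini impurity."""
    best = (None, float("inf"))
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        w = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if w < best[1]:
            best = (t, w)
    return best


# Invented single feature with binary labels.
values = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 1, 1]
print(best_split(values, labels))  # (0.3, 0.0)
```

A full tree repeats this choice recursively on each side of the split, and boosting then adds further trees that correct the errors of the ensemble so far.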

EMBER Training Dataset
• Endgame Malware BEnchmark for Research
– v1 (1.1 million PE files scanned on or before 2017)
– https://arxiv.org/abs/1804.04637
– https://github.com/endgameinc/ember
– 400k train + test malware binaries from v1
– malware defined as files that more than 40 VirusTotal (VT) vendors flag as malicious
• Ran Strings on 400k malware binaries
– produced 3+ billion individual strings (24 GB)
– performed sampling
– labeled according to heuristics and FLARE hand-labeling

Representing Strings as Features
• Natural Language Processing
– Markov model
– Entropy rate, English KL divergence
– Scrabble scores
• Host, Network IoCs
• Malware Regexes
– encodings (base64)
– format specifiers
– user agents
[Figure: Markov model example with transition probabilities over the characters t, %, F; threshold = 0.01; example strings http://evil.com, SOFTWAREincludeevil.pdb, t%Ft, Vr}Y and scores 0.018, 0.014, 0.007, 0.001]
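
A few of the feature families listed above are easy to sketch with the standard library. The code below is an illustrative simplification, not StringSifter's feature extractor: it uses plain Shannon entropy of the character distribution in place of the entropy rate named on the slide, omits the Markov model, and the regular expressions and the `featurize` helper are hypothetical.

```python
import math
import re
from collections import Counter

# Standard English Scrabble tile values.
SCRABBLE = {
    **dict.fromkeys("aeilnorstu", 1), **dict.fromkeys("dg", 2),
    **dict.fromkeys("bcmp", 3), **dict.fromkeys("fhvwy", 4),
    "k": 5, **dict.fromkeys("jx", 8), **dict.fromkeys("qz", 10),
}

# Loose, assumed patterns: a base64-shaped string and a printf-style specifier.
BASE64_RE = re.compile(
    r"^(?:[A-Za-z0-9+/]{4})+(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?$"
)
FORMAT_RE = re.compile(r"%[-+0 #]*\d*(?:\.\d+)?[diouxXsfc]")


def shannon_entropy(s):
    """Entropy in bits per character of the string's character distribution."""
    counts = Counter(s)
    return -sum(c / len(s) * math.log2(c / len(s)) for c in counts.values())


def scrabble_score(s):
    """Sum of Scrabble tile values; non-letters score zero."""
    return sum(SCRABBLE.get(ch, 0) for ch in s.lower())


def featurize(s):
    return {
        "length": len(s),
        "entropy": shannon_entropy(s),
        "scrabble": scrabble_score(s),
        "looks_base64": bool(BASE64_RE.match(s)),
        "has_format_specifier": bool(FORMAT_RE.search(s)),
    }


print(featurize("quixotry"))
```

Note that "quixotry" also satisfies the loose base64 pattern, a reminder that individual features are noisy and are meant to be weighed together by the ranking model.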

quixotry ˈkwik-sə-trē (n.): behavior inspired by idealistic beliefs without regard to reality.

Example

Evaluation
• Normalized Discounted Cumulative Gain (NDCG)
– Normalized: divide DCG by the ideal DCG on a ground truth holdout dataset
– Discounted: divide each string’s predicted relevance by a monotonically increasing function (log of its ranked position)
– Cumulative: the cumulative gain, or summed total, of every string’s relevance
– Gain: the magnitude of each string’s relevance
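
The four parts of the metric above map directly onto a short function. This sketch uses the common (2^rel - 1) / log2(rank + 1) form of DCG, which is an assumption; the deck does not state which gain or discount variant was used. The label lists are invented.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: sum of (2^rel - 1) / log2(rank + 1)."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances_in_predicted_order):
    """DCG of the predicted order divided by DCG of the ideal (sorted) order."""
    ideal = dcg(sorted(relevances_in_predicted_order, reverse=True))
    return dcg(relevances_in_predicted_order) / ideal if ideal > 0 else 0.0


# Invented ground-truth labels, listed in the order a ranker returned them.
print(ndcg([3, 2, 1, 0]))  # 1.0, the order is already ideal
print(ndcg([0, 1, 2, 3]))  # below 1.0, the most relevant string is last
```

Because of the logarithmic discount, mistakes near the top of the ranking cost more than mistakes further down, which matches how an analyst reads Strings output.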

Results
StringSifter performs well on a holdout set of 7+ years of FLARE malware reports.

Putting it All Together

Open Sourcing StringSifter
• The tool is now live:
– https://github.com/fireeye/stringsifter
– pip install stringsifter
– Command line and Docker tools
– flarestrings <my_sample> | rank_strings
• Versatility
– FLOSS outputs
– live memory dumps
Tools demo

Install and Use
• Git + local pip install
– Easy access to source code
    git clone https://github.com/fireeye/stringsifter.git
    cd stringsifter
    pip install -e .
    flarestrings <my_sample> | rank_strings
• Pip install from PyPI
– If you just want to use the tool
    pip install stringsifter
    flarestrings <my_sample> | rank_strings
• Docker container
– Minimum impact to host
    git clone https://github.com/fireeye/stringsifter.git
    cd stringsifter
    docker build -t stringsifter -f docker/Dockerfile .
    docker run -v <malware_dir>:/samples -it stringsifter
    flarestrings /samples/<my_sample> | rank_strings

flarestrings *
• There are many versions of "strings"
– GNU binutils, BSD, various Windows implementations
– Inconsistent features
• flarestrings
– Pure Python implementation of "strings"
– Consistent across platforms
– Prints both ASCII and wide strings
* FLARE => FireEye Labs Advanced Reverse Engineering
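
To show what a pure Python "strings" can look like, here is a minimal sketch that extracts both ASCII and wide strings from raw bytes. It is not the flarestrings source; the function name `extract_strings`, the minimum length of 4, and the regular expressions are assumptions made for illustration.

```python
import re

MIN_LEN = 4  # assumed minimum length; classic strings tools commonly use 4

# Runs of printable ASCII bytes, and runs of printable ASCII bytes that are
# each followed by 0x00 (UTF-16 LE text in the ASCII range).
ASCII_RE = re.compile(rb"[\x20-\x7e]{%d,}" % MIN_LEN)
WIDE_RE = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % MIN_LEN)


def extract_strings(data):
    """Yield printable ASCII runs, then UTF-16 LE runs, found in raw bytes."""
    for m in ASCII_RE.finditer(data):
        yield m.group().decode("ascii")
    for m in WIDE_RE.finditer(data):
        yield m.group().decode("utf-16-le")


blob = b"\x00\x01Derby\x00\x90" + "cmd.exe /c".encode("utf-16-le") + b"\x00\x00"
print(list(extract_strings(blob)))  # ['Derby', 'cmd.exe /c']
```

Reading a file with `open(path, "rb").read()` and passing the bytes to `extract_strings` would give output that can then be fed to a ranking step.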

flarestrings Demo
StringSifter rank_strings Demo
rank_strings Options
rank_strings with --scores
rank_strings with --min-score

Other Use Cases and Future Work
• Rapid screening for potential capabilities
• Detect and handle packed / obfuscated binaries
– Tipoff for automated unpacker tooling
• Leverage feature vectors to focus triage
• Improve NLP
• Improve ranking performance on Mach-O, ELF

Community Support
• Plug into your malware analysis stack
• Seeking critical feedback
– improve accuracy and utility
– pertinent edge cases, non-PE files
– contribute via GitHub Issues
• Beginners and experts alike
• Thank you for your attention!
https://github.com/fireeye/stringsifter
pip install stringsifter

StringSifter: Learning to Rank Strings Output for Speedier Malware Analysis

  • 2. ©2019 FireEye. About Us: Michael Sikorski, Philip Tully, Jay Gibble, Matthew Haigh
  • 4. One String Can Make a Difference: NanoHTTPD web server produces extra whitespace; Cobalt Strike server detection; continued for 7 years; detection signature; track threat actors, identify C2 addresses. https://blog.fox-it.com/2019/02/26/identifying-cobalt-strike-team-servers-in-the-wild/
  • 5. Running Strings on larger binaries produces tens of thousands of strings.
  • 6. Strings produces a ton of noise mixed in with important information.
  • 7. What Is a String: N characters + NULL, with no file format or context. 0x31 0x33 0x33 0x37 0x00 is ‘1337’, right? Not necessarily: it could be a memory address, CPU instructions, or data used by the program.
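A minimal sketch of the ambiguity on the slide above: the same five bytes read once as a NULL-terminated ASCII string and once as a little-endian 32-bit value. The variable names are illustrative only, and the choice of a 32-bit little-endian integer is an assumption made for the example.

```python
import struct

# The byte sequence from the slide.
raw = bytes([0x31, 0x33, 0x33, 0x37, 0x00])

# Interpretation 1: a NULL-terminated ASCII string.
as_text = raw.split(b"\x00", 1)[0].decode("ascii")

# Interpretation 2: the first four bytes as a little-endian unsigned
# 32-bit integer, e.g. an address or a constant (assumed layout).
(as_uint32,) = struct.unpack("<I", raw[:4])

print(as_text)         # 1337
print(hex(as_uint32))  # 0x37333331
```

Without file format or context, a strings tool cannot tell which of these readings the program intended.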
  • 8. Wide Strings: also referred to as wide-character strings. The Windows OS uses wide strings internally (Microsoft’s encoding standard is UTF-16 LE). Each wide character (UTF-16 code unit) is two bytes. C-style wide-character strings are terminated with a double NULL (0x00, 0x00).
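As a small illustration of the wide-string layout described above, the following sketch encodes a string as UTF-16 LE and appends the double-NULL terminator by hand. The sample text "Derby" is borrowed from the compilation slide; nothing here is taken from the StringSifter code.

```python
text = "Derby"

# Each ASCII-range character becomes two bytes in UTF-16 LE: the
# character value followed by 0x00. The terminator is added explicitly.
wide = text.encode("utf-16-le") + b"\x00\x00"

print(wide.hex(" "))  # 44 00 65 00 72 00 62 00 79 00 00 00
```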
  • 9. Compilation: source code (int main() { printf("Derby"); return 0; }) → object file ("Derby") → .EXE binary (.data, 0x56000: "Derby"). Strings persist on disk throughout the compilation process.
  • 10. The Strings Program (example output):
        !This program cannot be run in DOS mode.
        ??3@YAXPAX@Z
        ??2@YAPAXI@Z
        __CxxFrameHandler
        _except_handler3
        WSAStartup() error: %d
        User-Agent: Mozilla/4.0 (compatible; MSIE 6.00; Windows NT 5.1)
        GetLastInputInfo
        SeShutdownPrivilege
        %sIEXPLORE.EXE
        SOFTWARE\Microsoft\Windows\CurrentVersion\App Paths\IEXPLORE.EXE
        [Machine IdleTime:] %d days + %.2d:%.2d:%.2d
        [Machine UpTime:] %-.2d Days %-.2d Hours %-.2d Minutes %-.2d Seconds
        ServiceDll
        SYSTEM\CurrentControlSet\Services\%s\Parameters
        if exist "%s" goto selfkill
        del "%s"
        attrib -a -r -s -h "%s"
        Inject '%s' to PID '%d' Successfully!
        cmd.exe /c
        Hi,Master [%d/%d/%d %d:%d:%d]
  • 11. Malware Triage: Customer (suspected compromise) → Incident Response (forensic analysis, identify malware sample) → Reverse Engineer (binary triage, malware analysis). Reverse engineers, SOC analysts, red teamers, incident responders, malware researchers.
  • 12. Knowing which strings are relevant often requires highly experienced analysts.
  • 13. Strings Tells a Story. Relevance: domain names, IP addresses, URLs, filenames, registry paths, registry keys, HTTP user-agent strings, service configuration info, keylogger indicators (e.g. “[DELETE]”, “[BS]”), third-party libraries, PDB strings, function names, debugging messages, command-line help/usage options, OSINT, runtime artifacts, compiler artifacts, Windows APIs, library code, localizations, locations, languages, error messages, random byte sequences, format specifiers.
  • 14. Relevance is subjective and its definition can vary significantly across analysts.
  • 15. Hypothesis and Goals: develop a tool that can efficiently identify and prioritize strings based on relevance for malware analysis. StringSifter should be easy to use; generalize across personas, use cases, and downstream apps; and save time and money. How does it work?
  • 17. Our Favorite Products Serve Up Rankings: search engines (web, e-commerce); news feeds (social networks); recommender systems (ads, movies, music).
  • 18. Learning to Rank (LTR): create an optimal ordering of a list of items. Precise individual item scores are less important than their relative ordering. In classification, regression, and clustering we predict a class or a single score. LTR is rarely applied in security applications.
  • 19. LTR as Supervised Learning: rank items within unseen lists in a similar way to rankings within training lists. Each item is associated with a set of features and an ordinal integer label. The ordinal label is the teaching signal that encodes relevance level.
  • 20. Gradient Boosted Decision Trees. Decision Trees: greedily choose splits by Gini impurity. Gradient Boosted Decision Trees (GBDTs): combine outputs from multiple Decision Trees, reduce loss using gradient descent, and use a weighted sum of the trees’ predictions as the ensemble. LightGBM: GBDTs with an LTR objective function.
  • 21. EMBER Training Dataset: Endgame Malware BEnchmark for Research, v1 (1.1 million PE files scanned on or before 2017); https://arxiv.org/abs/1804.04637; https://github.com/endgameinc/ember. 400k train + test malware binaries from v1, with malware defined as more than 40 VirusTotal (VT) vendors saying malicious. Ran Strings on the 400k malware binaries: produced 3+ billion individual strings (24 GB), performed sampling, and labeled according to heuristics and FLARE hand-labeling.
  • 22. Representing Strings as Features. Natural Language Processing: Markov model; entropy rate, English KL divergence; Scrabble scores. Host and network IoCs. Malware regexes: encodings (base64), format specifiers, user agents. [Figure: table of Markov-model character probabilities with threshold = 0.01; example strings http://evil.com, SOFTWAREincludeevil.pdb, t%Ft, Vr}Y shown alongside the values 0.018, 0.014, 0.007, 0.001.]
  • 23. quixotry ˈkwik-sə-trē (n.): behavior inspired by idealistic beliefs without regard to reality.
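To make the feature slides above more concrete, here is a hedged sketch of three per-string features of the kinds named there: a character-level Shannon entropy (a simpler stand-in for the entropy rate on the slide), a Scrabble score using the standard English tile values, and a flag for printf-style format specifiers. The function names and the regular expression are assumptions for illustration and are not taken from the StringSifter source.

```python
import math
import re
from collections import Counter

# Standard English Scrabble tile values.
SCRABBLE = {
    **dict.fromkeys("aeilnorstu", 1),
    **dict.fromkeys("dg", 2),
    **dict.fromkeys("bcmp", 3),
    **dict.fromkeys("fhvwy", 4),
    "k": 5,
    **dict.fromkeys("jx", 8),
    **dict.fromkeys("qz", 10),
}

# Illustrative pattern for printf-style specifiers such as %d or %-.2d.
FORMAT_SPEC_RE = re.compile(r"%[-+ 0#]*\d*(?:\.\d+)?[diouxXscf]")


def shannon_entropy(s):
    """Bits per character of the string's own character distribution."""
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def scrabble_score(s):
    """Sum of tile values; non-letters score zero."""
    return sum(SCRABBLE.get(ch, 0) for ch in s.lower())


def featurize(s):
    """Bundle the three illustrative features into one dictionary."""
    return {
        "entropy": shannon_entropy(s),
        "scrabble": scrabble_score(s),
        "has_format_spec": bool(FORMAT_SPEC_RE.search(s)),
    }


print(scrabble_score("quixotry"))  # 27
print(featurize("[Machine UpTime:] %-.2d Days"))
```

A word such as "quixotry" scores highly on the Scrabble feature because of its rare letters, while random byte runs such as "Vr}Y" tend to score low on language-like features.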
  • 25. Evaluation: Normalized Discounted Cumulative Gain (NDCG). Normalized: divide DCG by the ideal DCG on a ground-truth holdout dataset. Discounted: divide each string’s predicted relevance by a monotonically increasing function (log of its ranked position). Cumulative: the cumulative gain, or summed total, of every string’s relevance. Gain: the magnitude of each string’s relevance.
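The four parts of NDCG described above can be sketched in a few lines. This version assumes the common log2(position + 1) discount and a linear gain (the relevance label itself); the talk does not state which exact variant was used, so treat both choices as assumptions.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance labels in ranked order."""
    return sum(
        rel / math.log2(pos + 1)
        for pos, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Labels listed in the order the model ranked the strings.
print(ndcg([3, 2, 1, 0]))  # 1.0 (already ideal)
print(ndcg([0, 1, 2, 3]))  # below 1.0 (most relevant string ranked last)
```

Because the score is normalized by the ideal ordering, results stay comparable between a sample with a few hundred strings and one with tens of thousands.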
  • 26. Results: StringSifter performs well on a holdout set of 7+ years of FLARE malware reports.
  • 28. Open Sourcing StringSifter. The tool is now live: https://github.com/fireeye/stringsifter; pip install stringsifter; command-line and Docker tools (flarestrings <my_sample> | rank_strings). Versatility: FLOSS outputs, live memory dumps.
  • 30. Install and Use:
      – Git + local pip install (easy access to source code):
          git clone https://github.com/fireeye/stringsifter.git
          cd stringsifter
          pip install -e .
          flarestrings <my_sample> | rank_strings
      – Pip install from PyPI (if you just want to use the tool):
          pip install stringsifter
          flarestrings <my_sample> | rank_strings
      – Docker container (minimum impact to host):
          git clone https://github.com/fireeye/stringsifter.git
          cd stringsifter
          docker build -t stringsifter -f docker/Dockerfile .
          docker run -v <malware_dir>:/samples -it stringsifter
          flarestrings /samples/<my_sample> | rank_strings
  • 31. flarestrings (FLARE = FireEye Labs Advanced Reverse Engineering). There are many versions of "strings" (GNU binutils, BSD, various Windows implementations) with inconsistent features. flarestrings is a pure Python implementation of "strings" that is consistent across platforms and prints both ASCII and wide strings.
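For readers unfamiliar with how a "strings"-style tool finds both ASCII and wide strings, the following is a minimal sketch of the general technique using regular expressions over raw bytes. It is not the flarestrings implementation; the minimum length of 4 and the names `extract_strings`, `ASCII_RE`, and `WIDE_RE` are assumptions for illustration.

```python
import re

MIN_LEN = 4  # assumed minimum run length

# Runs of printable ASCII bytes (0x20-0x7e).
ASCII_RE = re.compile(rb"[\x20-\x7e]{%d,}" % MIN_LEN)
# Runs of printable ASCII characters each followed by 0x00 (UTF-16 LE).
WIDE_RE = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % MIN_LEN)


def extract_strings(data):
    """Return ASCII strings, then wide strings, found in a byte buffer."""
    found = [m.group().decode("ascii") for m in ASCII_RE.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in WIDE_RE.finditer(data)]
    return found


sample = (
    b"\x01\x02Derby\x00\xffab\xff"
    + "evil.com".encode("utf-16-le")
    + b"\x00\x00"
)
print(extract_strings(sample))  # ['Derby', 'evil.com']
```

The short run "ab" is dropped by the length threshold, which is the same mechanism that keeps very short byte coincidences out of real strings output.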
  • 37. Other Use Cases and Future Work: rapid screening for potential capabilities; detect and handle packed / obfuscated binaries (tipoff for automated unpacker tooling); leverage feature vectors to focus triage; improve NLP; improve ranking performance on Mach-O and ELF.
  • 38. Community Support: plug into your malware analysis stack. Seeking critical feedback: improve accuracy and utility; pertinent edge cases, non-PE files; contribute via GitHub Issues. Beginners and experts alike. Thank you for your attention! https://github.com/fireeye/stringsifter; pip install stringsifter

Editor's Notes

  1. Introduce what binary triage is and how it relates to malware analysis. Add a slide about other users (incident responders, SOC analysts, researchers); move triage before incident response.
  2. Printable ASCII starts at hex 21; there are 94 printable characters.
  3. Reverse inference
  4. Traditional ML solves a prediction problem (classification or regression) on a single instance at a time. For example, in spam detection on email, you look at all the features associated with one email and classify it as spam or not. The aim of traditional ML is to come up with a class (spam or not spam) or a single numerical score for that instance. LTR solves a ranking problem on a list of items. The aim of LTR is to come up with an optimal ordering of those items. As such, LTR doesn’t care much about the exact score that each item gets, but cares more about the relative ordering among all the items.
  5. Learning to rank learns to directly rank items by training a model to predict the probability of a certain item ranking over another item. This is done by learning a scoring function where items ranked higher should have higher scores. The model can be trained via gradient descent on a loss function defined over these scores. For each item, gradient descent pushes its score up for every item that ranks below it and pushes its score down for every item that ranks above it. The “strength” of the push is determined by the difference in scores. To ensure that the model focuses on getting the higher ranks (which are generally more important) correct, we can weight the “strength” of the push by a factor that accounts for how important the ranking is.
  6. Discounting reflects the goal of having the most relevant strings ranked towards the top of our predictions. Normalization makes it possible to compare scores across samples, since the number of strings within different Strings outputs can vary widely. The ground truth is obtained from FLARE-identified relevant strings contained within historical malware reports.
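
The "push up / push down" description in the learning-to-rank note above can be sketched as a RankNet-style pairwise gradient. This is an illustrative assumption about the general technique, not the exact objective used by LightGBM or StringSifter; in particular, the rank-importance weighting mentioned in the note is left out, and the function name `pairwise_pushes` is hypothetical.

```python
import math


def pairwise_pushes(scores, labels):
    """Return one push per item; positive means raise that item's score."""
    pushes = [0.0] * len(scores)
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:
                # Item i should outrank item j. The push is strong when
                # the model scores i at or below j, and fades as the
                # score of i pulls ahead of the score of j.
                strength = 1.0 / (1.0 + math.exp(scores[i] - scores[j]))
                pushes[i] += strength
                pushes[j] -= strength
    return pushes


# Three strings with equal scores but different relevance labels.
print(pairwise_pushes([0.0, 0.0, 0.0], [2, 1, 0]))  # [1.0, 0.0, -1.0]
```

The most relevant item is pushed up, the least relevant is pushed down, and the middle item receives equal and opposite pushes that cancel. A rank-aware variant would multiply each `strength` by a factor reflecting how much swapping the pair would change the ranking metric.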