SlideShare uma empresa Scribd logo
1 de 58
Baixar para ler offline
Triton: Concolic Execution Framework
Florent Saudel
Jonathan Salwan
SSTIC
Rennes – France
Slides: detailed version
June 3 2015
Keywords: program analysis, DBI, DBA, Pin, concrete execution,
symbolic execution, concolic execution, DSE, taint analysis,
context snapshot, Z3 theorem prover and behavior analysis.
2
● Jonathan Salwan is a student at Bordeaux University
(CSI Master) and also an employee at Quarkslab
● Florent Saudel is a student at the Bordeaux
University (CSI Master) and applying to an
Internship at Amossys
● Both like playing with low-level computing, program
analysis and software verification methods
Who are we?
3
● Triton is a project started on January 2015 for our
Master final project at Bordeaux University (CSI)
supervised by Emmanuel Fleury from laBRI
● Triton is also sponsored by Quarkslab from the
beginning
Where does Triton come from?
4
● Triton is a concolic execution framework as Pintool
● It provides advanced classes to improve dynamic
binary analysis (DBA) using Pin
– Symbolic execution engine
– SMT semantics representation
– Interface with SMT Solver
– Taint analysis engine
– Snapshot engine
– API and Python bindings
What is Triton?
5
What is Triton?
Plug what you want which supports
the SMT2-LIB format
Pin
Taint
Engine
Symbolic
Execution
Engine
Snapshot
Engine
SMT
Semantics
IR
SMT
Solver
Interface
Python
Bindings
Triton internal components
Triton.so pintool
user
script.py
Z3 *
Front-endBack-end User side
6
● Well-known projects
– SAGE
– Mayhem
– Bitblaze
– S2E
● The difference?
– Triton works online* through a higher level
languages using the Pin engine
Relative projects
online*: Analysis is performed at runtime and data can be modified directly
in memory to go through specific branches.
7
● You can build tools which:
– Analyze a trace with concrete information
● Registers and memory values at each program point
– Perform a symbolic execution
● To know the symbolic expression of registers and memory at each program point
– Perform a symbolic fuzzing session
– Generate and solve path constraints
– Gather code coverage
– Runtime registers and memory modification
– Replay traces directly in memory
– Scriptable debugging
– Access to Pin functions through a higher level languages (Python bindings)
– And probably lots of others things
What kind of things you can
build with Triton?
8
● Triton's Internal Components
9
● Symbolic Engine
10
● Symbolic Engine
● Symbolic execution is the execution of a program using
symbolic variables instead of concrete values
● Symbolic execution translates the program's semantics into a
logical formula
● Symbolic execution can build and keep a path formula
– By solving the formula and its negation we can take all paths
and “cover” a code
● Instead of concrete execution which takes only one path
● Then a symbolic expression is given to a SMT solver to
generate a concrete value
11
● Symbolic Engine inside Triton
●
A trace is a sequence of instructions
T = (Ins1 Ins∧ 2 Ins∧ 3 Ins∧ 4 … Ins∧ ∧ i)
●
Instructions are represented with symbolic expressions
●
A symbolic trace is a sequence of symbolic expressions
● Each symbolic expression is translated like this:
REFout = semantic
– Where :
●
REFout := unique ID
●
Semantic := REFin | <<smt expression>>
●
Each register or byte of memory points to its last reference →
Single Static Assignment Form (SSA)
12
● Register References
movzx eax, byte ptr [mem]
add eax, 2
mov ebx, eax
Example:
// All refs initialized to -1
Register Reference Table {
EAX : -1,
EBX : -1,
ECX : -1,
...
}
// Empty set
Symbolic Expression Set {
}
13
● Register references
movzx eax, byte ptr [mem] #0 = symvar_1
add eax, 2
mov ebx, eax
Example:
// All refs initialized to -1
Register Reference Table {
EAX : #0,
EBX : -1,
ECX : -1,
...
}
// Empty set
Symbolic Expression Set {
<#0, symvar_1>
}
14
● Register references
movzx eax, byte ptr [mem] #0 = symvar_1
add eax, 2 #1 = (bvadd #0 2)
mov ebx, eax
Example:
// All refs initialized to -1
Register Reference Table {
EAX : #1,
EBX : -1,
ECX : -1,
...
}
// Empty set
Symbolic Expression Set {
<#1, (bvadd #0 2)>,
<#0, symvar_1>
}
15
● Register references
movzx eax, byte ptr [mem] #0 = symvar_1
add eax, 2 #1 = (bvadd #0 2)
mov ebx, eax #2 = #1
Example:
// All refs initialized to -1
Register Reference Table {
EAX : #1,
EBX : #2,
ECX : -1,
...
}
// Empty set
Symbolic Expression Set {
<#2, #1>,
<#1, (bvadd #0 2)>,
<#0, symvar_1>
}
16
● Rebuild the trace with
backward analysis
movzx eax, byte ptr [mem]
add eax, 2
mov ebx, eax
Example:
// All refs initialized to -1
Register Reference Table {
EAX : #1,
EBX : #2,
ECX : -1,
...
}
// Empty set
Symbolic Expression Set {
<#2, #1>,
<#1, (bvadd #0 2)>,
<#0, symvar_1>
}
What is the semantic trace of EBX ?
17
● Rebuild the trace with
backward analysis
movzx eax, byte ptr [mem]
add eax, 2
mov ebx, eax
Example:
// All refs initialized to -1
Register Reference Table {
EAX : #1,
EBX : #2,
ECX : -1,
...
}
// Empty set
Symbolic Expression Set {
<#2, #1>,
<#1, (bvadd #0 2)>,
<#0, symvar_1>
}
What is the semantic trace of EBX ?
EBX holds the reference #2
18
● Rebuild the trace with
backward analysis
movzx eax, byte ptr [mem]
add eax, 2
mov ebx, eax
Example:
// All refs initialized to -1
Register Reference Table {
EAX : #1,
EBX : #2,
ECX : -1,
...
}
// Empty set
Symbolic Expression Set {
<#2, #1>,
<#1, (bvadd #0 2)>,
<#0, symvar_1>
}
What is the semantic trace of EBX ?
EBX holds the reference #2
What is #2 ?
19
● Rebuild the trace with
backward analysis
movzx eax, byte ptr [mem]
add eax, 2
mov ebx, eax
Example:
// All refs initialized to -1
Register Reference Table {
EAX : #1,
EBX : #2,
ECX : -1,
...
}
// Empty set
Symbolic Expression Set {
<#2, #1>,
<#1, (bvadd #0 2)>,
<#0, symvar_1>
}
What is the semantic trace of EBX ?
EBX holds the reference #2
What is #2 ? Reconstruction: EBX = #2
20
● Rebuild the trace with
backward analysis
movzx eax, byte ptr [mem]
add eax, 2
mov ebx, eax
Example:
// All refs initialized to -1
Register Reference Table {
EAX : #1,
EBX : #2,
ECX : -1,
...
}
// Empty set
Symbolic Expression Set {
<#2, #1>,
<#1, (bvadd #0 2)>,
<#0, symvar_1>
}
What is the semantic trace of EBX ?
EBX holds the reference #2
What is #2 ? Reconstruction: EBX = #1
21
● Rebuild the trace with
backward analysis
movzx eax, byte ptr [mem]
add eax, 2
mov ebx, eax
Example:
// All refs initialized to -1
Register Reference Table {
EAX : #1,
EBX : #2,
ECX : -1,
...
}
// Empty set
Symbolic Expression Set {
<#2, #1>,
<#1, (bvadd #0 2)>,
<#0, symvar_1>
}
What is the semantic trace of EBX ?
EBX holds the reference #2
What is #2 ? Reconstruction: EBX = (bvadd #0 2)
22
● Rebuild the trace with
backward analysis
movzx eax, byte ptr [mem]
add eax, 2
mov ebx, eax
Example:
// All refs initialized to -1
Register Reference Table {
EAX : #1,
EBX : #2,
ECX : -1,
...
}
// Empty set
Symbolic Expression Set {
<#2, #1>,
<#1, add(#0, 2)>,
<#0, symvar_1>
}
What is the semantic trace of EBX ?
EBX holds the reference #2
What is #2 ? Reconstruction: EBX = (bvadd symvar_1 2)
23
● Assigning a reference for each register is not enough, we must also add
references on memory
● Follow references over memory
mov dword ptr [rbp-0x4], 0x0
...
mov eax, dword ptr [rbp-0x4]
push eax
...
pop ebx
What do we want to know?
Eax = 0 from somewhere ebx = eax
References
#1 = 0x0
...
#x = #1
#2 = #1
...
#x = #2
24
● All registers, flags and each byte of memory are references
● A reference assignment is in SSA form during the execution
● The registers, flags and bytes of memory are assigned in the same
way
● A memory reference can be assigned from a register reference
(mov [mem], reg)
● A register reference can be assigned from a memory reference
(mov reg, [mem])
● If a reference doesn't exist yet, we concretize the value and we
affect a new reference
● References conclusion
25
● SMT Semantics Representation
with SSA Form
26
● SMT Semantics Representation
with SSA Form
● All instructions semantics are represented via SMT2-LIB representation
● This SMT2-LIB representation is on SSA form
add rax, rdx
rax = (bvadd ((_ extract 63 0) rax) ((_ extract 63 0) rdx))
... (af, cf, of, pf) ...
sf = (ite (= ((_ extract 63 63) rax) (_ bv1 1)) (_ bv1 1) (_ bv0 1))
zf = (ite (= rax (_ bv0 64)) (_ bv1 1) (_ bv0 1))
#60 = (bvadd ((_ extract 63 0) #58) ((_ extract 63 0) #54))
... (af, cf, of, pf) ...
#64 = (ite (= ((_ extract 63 63) #60) (_ bv1 1)) (_ bv1 1) (_ bv0 1))
#65 = (ite (= #60 (_ bv0 64)) (_ bv1 1) (_ bv0 1))
Assembly
SMT
SSA SMT
New rax reference Old rax reference Old rdx reference
27
● SMT Semantics Representation
with SSA Form
● Why use SMT2-LIB representation?
– SMT-LIB is an international initiative aimed at facilitating research and
development in Satisfiability Modulo Theories (SMT)
– As all Triton's expressions are in the SMT2-LIB representation, you
can plug all solvers which supports this representation
● Currently Triton has an interface with Z3 but feel free to plug what
you want
28
● Symbolic Execution Guided By The
Taint Analysis
29
● Symbolic Execution Guided By The
Taint Analysis
● Taint analysis provides information about which registers and memory
addresses are controllable by the user at each program point:
– Assists the symbolic engine to setup the symbolic variables (a
symbolic variable is a memory area that the user can control)
– Limit the symbolic engine to the relevant part of the program
– At each branch instruction, we directly know if the user can go through
both branches (this is mainly used for code coverage)
30
● Symbolic Execution Guided By The
Taint Analysis
● Transform a tainted area into a symbolic variable
0x40058b: movzx eax, byte ptr [rax]
-> #33 = SymVar_0 ; Controllable by the user
-> #34 = (_ bv4195726 64) ; RIP
0x40058e: movsx eax, al
-> #35 = ((_ sign_extend 24) ((_ extract 7 0) #33))
-> #36 = (_ bv4195729 64) ; RIP
0x40058b: movzx eax, byte ptr [rax]
-> #33 = ((_ zero_extend 24) (_ bv97 8))
-> #34 = (_ bv4195726 64) ; RIP
0x40058e: movsx eax, al
-> #35 = ((_ sign_extend 24) ((_ extract 7 0) #33))
-> #36 = (_ bv4195729 64) ; RIP
rax points on a tainted area
Use symbolic variable instead of concrete value
31
● Symbolic Execution Guided By The
Taint Analysis
● Can I go through this branch?
– Check if flags are tainted
0x4005ae: cmp ecx, eax
-> #72 = (bvsub ((_ extract 31 0) #52) ((_ extract 31 0) #70))
...CF, OF, SF, AF, and PF skipped...
-> #78 = (ite (= #72 (_ bv0 32)) (_ bv1 1) (_ bv0 1)) ; ZF
-> #79 = (_ bv4195760 64) ; RIP
0x4005b0: jz 0x4005b9
-> #80 = (ite (= #78 (_ bv1 1)) (_ bv4195769 64) (_ bv4195762 64)) ; RIP
eax is tainted
tainted
32
● Taint Analysis guided by the Symbolic
Engine and the Solver Engine
●
As the symbolic execution may be guided by the taint analysis, the taint analysis may
also be guided by the symbolic execution and the solver engine
●
What to choose between an over-approximation and under-approximation?
– Over-approximation: We can generate inputs for infeasible concrete paths.
– Under-approximation: We can miss some feasible paths.
● The goal of the taint engine is to say YES or NO if a register and memory is probably
tainted (byte-level over approximation)
● The goal of the symbolic engine is to build symbolic expressions based on
instructions semantics
● The goal of the solver engine is to generate a model of an expression (path condition)
– If your target is not tainted, don't ask a model → gain time
– If the solver engine returns UNSAT → the tainted inputs can't influence the control
flow to go through this path.
– If the solver engine returns SAT → the path can be triggered with the actual tainted
inputs. The model give us the set of concrete inputs for this path.
33
● Snapshot Engine – Replay your trace
34
● Snapshot Engine – Replay your trace
●
The snapshot engine offers the possibility to take and restore snapshot
– Mainly used to apply code coverage in memory. Useful when you fuzz the binary
– In future versions, it will be possible to take different snapshots at several program point
● The snapshot engine only restores registers and memory states
– If there is some disk, network,... I/O, Triton won't be able to restore the files modification
Restore context
Snapshot
Restore
SnapshotTarget
function / basic block
Dynamic Symbolic Execution Possible paths in the target
35
● Stop talking about back-end!
Let's see how I can use Triton
36
● How to install Triton?
● Easy is easy
● You just need:
– Pin v2.14-71313
– Z3 v4.3.1
– Python v2.7
$ cd pin-2.14-71313-gcc.4.4.7-linux/source/tools/
$ git clone git@github.com:JonathanSalwan/Triton.git
$ cd Triton
$ make
$ ../../../pin -t ./triton.so -script your_script.py -- ./your_target_binary.elf64
Shell 1: Installation
37
● Start an analysis
import triton
if __name__ == '__main__':
# Start the symbolic analysis from the 'check' function
triton.startAnalysisFromSymbol('check')
# Run the instrumentation - Never returns
triton.runProgram()
import triton
if __name__ == '__main__':
# Start the symbolic analysis from address
triton.startAnalysisFromAddr(0x40056d)
triton.stopAnalysisFromAddr(0x4005c9)
# Run the instrumentation - Never returns
triton.runProgram()
Code 1: Start analysis from symbols
Code 2: Start analysis from address
38
● Predicate taint and untaint
import triton
if __name__ == '__main__':
# Start the symbolic analysis from the 'check' function
triton.startAnalysisFromSymbol('check')
# Taint the RAX and RBX registers when the address 0x40058e is executed
triton.taintRegFromAddr(0x40058e, [IDREF.REG.RAX, IDREF.REG.RBX])
# Untaint the RCX register when the address 0x40058e is executed
triton.untaintRegFromAddr(0x40058e, [IDREF.REG.RCX])
# Run the instrumentation - Never returns
triton.runProgram()
Code 3: Predicate taint and untaint at specific addresses
39
● Callbacks
● Triton supports 8 kinds of callbacks
– AFTER
● Defines a callback after the instruction processing
– BEFORE
● Defines a callback before the instruction processing
– BEFORE_SYMPROC
● Defines a callback before the symbolic processing
– FINI
● Define a callback at the end of the execution
– ROUTINE_ENTRY
● Define a callback at the entry of a specified routine.
– ROUTINE_EXIT
● Define a callback at the exit of a specified routine.
– SYSCALL_ENTRY
● Define a callback before each syscall processing
– SYSCALL_EXIT
● Define a callback after each syscall processing
40
● Callback on SYSCALL
def my_callback_syscall_entry(threadId, std):
print '-> Syscall Entry: %s' %(syscallToString(std, getSyscallNumber(std)))
if getSyscallNumber(std) == IDREF.SYSCALL.LINUX_64.WRITE:
arg0 = getSyscallArgument(std, 0)
arg1 = getSyscallArgument(std, 1)
arg2 = getSyscallArgument(std, 2)
print ' sys_write(%x, %x, %x)' %(arg0, arg1, arg2)
def my_callback_syscall_exit(threadId, std):
print '<- Syscall return %x' %(getSyscallReturn(std))
if __name__ == '__main__':
startAnalysisFromSymbol('main')
addCallback(my_callback_syscall_entry, IDREF.CALLBACK.SYSCALL_ENTRY)
addCallback(my_callback_syscall_exit, IDREF.CALLBACK.SYSCALL_EXIT)
runProgram()
Code 4: Callback before and after syscalls processing
-> Syscall Entry: fstat
<- Syscall return 0
-> Syscall Entry: mmap
<- Syscall return 7fb7f06e1000
-> Syscall Entry: write
sys_write(1, 7fb7f06e1000, 6)
Code 4 result
41
● Callback on ROUTINE
def mallocEntry(threadId):
sizeAllocated = getRegValue(IDREF.REG.RDI)
print '-> malloc(%#x)' %(sizeAllocated)
def mallocExit(threadId):
ptrAllocated = getRegValue(IDREF.REG.RAX)
print '<- %#x' %(ptrAllocated)
if __name__ == '__main__':
startAnalysisFromSymbol('main')
addCallback(mallocEntry, IDREF.CALLBACK.ROUTINE_ENTRY, "malloc")
addCallback(mallocExit, IDREF.CALLBACK.ROUTINE_EXIT, "malloc")
runProgram()
Code 5: Callback before and after routine processing
-> malloc(0x20)
<- 0x8fc010
-> malloc(0x20)
<- 0x8fc040
-> malloc(0x20)
<- 0x8fc010
Code 5 result
42
● Callback BEFORE and AFTER
instruction processing
def my_callback_before(instruction):
print 'TID (%d) %#x %s' %(instruction.threadId,
instruction.address,
instruction.assembly)
if __name__ == '__main__':
# Start the symbolic analysis from the 'check' function
startAnalysisFromSymbol('check')
# Add a callback.
addCallback(my_callback_before, IDREF.CALLBACK.BEFORE)
# Run the instrumentation - Never returns
runProgram()
Code 6: Callback before instruction processing
TID (0) 0x40056d push rbp
TID (0) 0x40056e mov rbp, rsp
TID (0) 0x400571 mov qword ptr [rbp-0x18], rdi
TID (0) 0x400575 mov dword ptr [rbp-0x4], 0x0
...
TID (0) 0x4005b2 mov eax, 0x1
TID (0) 0x4005b7 jmp 0x4005c8
TID (0) 0x4005c8 pop rbp
Code 6 result
43
● Instruction class
def my_callback(instruction):
...
● instruction.address
● instruction.assembly
● instruction.imageName - e.g: libc.so
● instruction.isBranch
● instruction.opcode
● instruction.opcodeCategory
● instruction.operands
● instruction.symbolicElements – List of SymbolicElement class
● instruction.routineName - e.g: main
● instruction.sectionName - e.g: .text
● instruction.threadId
44
● SymbolicElement class
instruction.symbolicElements[0]
● symbolicElement.comment → blah
● symbolicElement.destination → #41
● symbolicElement.expression → #41 = (bvadd ((_ extract 63 0) #40) ((_ extract 63 0) #39))
● symbolicElement.id → 41
● symbolicElement.isTainted → True or False
● symbolicElement.source → (bvadd ((_ extract 63 0) #40) ((_ extract 63 0) #39))
Instruction: add rax, rdx
SymbolicElement: #41 = (bvadd ((_ extract 63 0) #40) ((_ extract 63 0) #39)) ; blah
45
● Dump the symbolic expressions trace
def my_callback_after(instruction):
print '%#x: %s' %(instruction.address, instruction.assembly)
for se in instruction.symbolicElements:
print 't -> ', se.expression
print
if __name__ == '__main__':
startAnalysisFromSymbol('check')
addCallback(my_callback_after, IDREF.CALLBACK.AFTER)
runProgram()
Code 7: Dump a symbolic expression trace
0x4005ab: movsx eax, al
-> #70 = ((_ sign_extend 24) ((_ extract 7 0) #68))
-> #71 = (_ bv4195758 64)
0x4005ae: cmp ecx, eax
-> #72 = (bvsub ((_ extract 31 0) #52) ((_ extract 31 0) #70))
...
-> #77 = (ite (= ((_ extract 31 31) #72) (_ bv1 1)) (_ bv1 1) (_ bv0 1))
-> #78 = (ite (= #72 (_ bv0 32)) (_ bv1 1) (_ bv0 1))
-> #79 = (_ bv4195760 64)
0x4005b0: jz 0x4005b9
-> #80 = (ite (= #78 (_ bv1 1)) (_ bv4195769 64) (_ bv4195762 64))
Code 7 result
46
● Play with the Taint engine at runtime
# 0x40058b: movzx eax, byte ptr [rax]
def cbeforeSymProc(instruction):
if instruction.address == 0x40058b:
rax = getRegValue(IDREF.REG.RAX)
taintMem(rax)
if __name__ == '__main__':
startAnalysisFromSymbol('check')
addCallback(cbeforeSymProc, IDREF.CALLBACK.BEFORE_SYMPROC)
runProgram()
Code 8: Taint memory at runtime
0x40058b: movzx eax, byte ptr [rax]
-> #33 = SymVar_0
-> #34 = (_ bv4195726 64)
0x40058e: movsx eax, al
-> #35 = ((_ sign_extend 24) ((_ extract 7 0) #33))
-> #36 = (_ bv4195729 64)
Code 8 result
Modifications must be done before
the symbolic processing
47
● Taint argv[x][x] at the main function
def mainAnalysis(threadId):
rdi = getRegValue(IDREF.REG.RDI) # argc
rsi = getRegValue(IDREF.REG.RSI) # argv
while rdi != 0:
argv = getMemValue(rsi + ((rdi-1) * 8), 8)
offset = 0
while getMemValue(argv + offset, 1) != 0x00:
taintMem(argv + offset)
offset += 1
print '[+] %03d bytes tainted from the argv[%d] (%#x) pointer'
%(offset, rdi-1, argv)
rdi -= 1
return
Code 9: Taint all arguments when the main function occurs
$ pin -t ./triton.so -script taint_main.py -- ./example.bin64 12 123456 123456789
[+] 009 bytes tainted from the argv[3] (0x7fff802ad116) pointer
[+] 006 bytes tainted from the argv[2] (0x7fff802ad10f) pointer
[+] 002 bytes tainted from the argv[1] (0x7fff802ad10c) pointer
[+] 015 bytes tainted from the argv[0] (0x7fff802ad0ef) pointer
Code 9 result
48
● Play with the Symbolic engine
0x40058b: movzx eax, byte ptr [rax]
...
...
0x4005ae: cmp ecx, eax
Example 10: Assembly code
def callback_beforeSymProc(instruction):
if instruction.address == 0x40058b:
rax = getRegValue(IDREF.REG.RAX)
taintMem(rax)
def callback_after(instruction):
if instruction.address == 0x4005ae:
# Get the symbolic expression ID of ZF
zfId = getRegSymbolicID(IDREF.FLAG.ZF)
# Backtrack the symbolic expression ZF
zfExpr = getBacktrackedSymExpr(zfId)
# Craft a new expression over the ZF expression : (assert (= zfExpr True))
expr = smt2lib.smtAssert(smt2lib.equal(zfExpr, smt2lib.bvtrue()))
print expr
Code 10: Backtrack symbolic expression
We know that rax points on a tainted area
(assert (= (ite (= (bvsub ((_ extract 31 0) ((_ extract 31 0) (bvxor ((_ extract 31 0)
(bvsub ((_ extract 31 0) ((_ sign_extend 24) ((_ extract 7 0) SymVar_0))) (_ bv1 32)))
(_ bv85 32)))) ((_ extract 31 0) ((_ sign_extend 24) ((_ extract 7 0) ((_ zero_extend 24)
(_ bv49 8)))))) (_ bv0 32)) (_ bv1 1) (_ bv0 1)) (_ bv1 1)))
Example 10 result
Symbolic Variable
49
● Play with the Symbolic engine
...
zfExpr = getBacktrackedSymExpr(zfId)
# Craft a new expression over the ZF expression : (assert (= zfExpr True))
expr = smt2lib.smtAssert(smt2lib.equal(zfExpr, smt2lib.bvtrue()))
...
Extract of the Code 10
● What does it really mean?
– Triton builds symbolic formulas based on the instructions
semantics
– Triton also exports smt2lib functions which allows you to create
your own formula
– In this example, we want that the ZF expression is equal to 1
50
● Play with the Solver engine
...
zfExpr = getBacktrackedSymExpr(zfId)
# Craft a new expression over the ZF expression : (assert (= zfExpr True))
expr = smt2lib.smtAssert(smt2lib.equal(zfExpr, smt2lib.bvtrue()))
...
model = getModel(expr)
print model
Extract of the Code 10
{'SymVar_0': 0x65}
Result
● getModel() returns a dictionary of valid model for each
symbolic variable
0x40058b: movzx eax, byte ptr [rax]
...
...
0x4005ae: cmp ecx, eax
Example 10: Assembly code
We know now that the first character must be 0x65 to
set the ZF at the compare instruction
51
● Play with the Solver engine and inject
values directly in memory
...
model = getModel(expr)
print model
Extract of the Code 10 Result
● Each symbolic variable is assigned to a memory address (SymVar ↔ Address)
– Possible to get the symbolic variable from a memory address
● getSymVarFromMemory(addr)
– Possible to get the memory address from a symbolic variable
● getMemoryFromSymVar(symVar)
{'SymVar_0': 0x65}
for k, v in model.items():
setMemValue(getMemoryFromSymVar(k), getSymVarSize(k), v)
Inject values given by the solver in memory
52
● Inject values in memory is not enough
Play with the snapshot engine
● Inject values in memory after instructions processing is useless
● That's why Triton offers a snapshot engine
φ1 φ2 φ3 φ4 φ5 φ6
0x40058b: movzx eax, byte ptr [rax]
0x4005ae: cmp ecx, eax
def callback_after(instruction):
if instruction.address == 0x40058b and isSnapshotEnable() == False:
takeSnapshot()
if instruction.address == 0x4005ae:
if getFlagValue(IDREF.FLAG.ZF) == 0:
zfExpr = getBacktrackedSymExpr(...) # Described on slide 45
expr = smt2lib.smtAssert(...zfExpr...) # Described on slide 45
for k, v in getModel(expr).items(): # Described on slide 48
saveValue(...)
restoreSnapshot()
restore snapshottake snapshot
53
● Stop pasting fucking code
Show me a global vision
54
● Stop pasting fucking code
Show me a global vision
● Full API and Python bindings describes here
– https://github.com/JonathanSalwan/Triton/wiki/Python-Bindings
– ~80 functions exported over the Python bindings
● Basically we can:
– Taint and untaint memory and registers
– Inject value in memory and registers
– Add callbacks at each program point, syscalls, routine
– Assign symbolic expression on registers and bytes of memory
– Build and customize symbolic expressions
– Solve symbolic expressions
– Take and restore snapshots
– Do all this in Python!
55
● Conclusion
56
● Conclusion
● Triton:
– is a Pintool which provides others classes for DBA
– is designed as a concolic execution framework
– provides an API and Python bindings
– supports only x86-64 binaries
– currently supports ~100 semantics but we are working hard on it to
increase the semantics support
● An awesome thanks to Kevin `wisk` Szkudlapski and Francis `gg` Gabriel for
the x86.yaml from the Medusa project :)
– is free and open-source :)
– is available here : github.com/JonathanSalwan/Triton
57
● Contacts
– fsaudel@gmail.com
– jsalwan@quarkslab.com
● Thanks
– We would like to thank the SSTIC's staff and especially Rolf Rolles, Sean Heelan,
Sébastien Bardin, Fred Raynal and Serge Guelton for their proofreading and awesome
feedbacks! Then, a big thanks to Quarkslab for the sponsor.
Thanks For Your Attention
Question(s)?
58
● Q&A - Performances
● Host machine configuration
– Tested with an Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
– 16 Go DDR3
– 415 Go SSD Swap
● The targeted binary analyzed was /usr/bin/z3
– 6,789,610 symbolic expressions created for 1 trace
– The binary has been analyzed in 180 seconds
● One trace with SMT2-LIB translation and the taint spread
– 19 Go of RAM consumed
● Due to the SMT2-LIB strings manipulation

Mais conteúdo relacionado

Mais procurados

Delays in verilog
Delays in verilogDelays in verilog
Delays in verilogJITU MISTRY
 
Verilog 語法教學
Verilog 語法教學 Verilog 語法教學
Verilog 語法教學 艾鍗科技
 
C++ process new
C++ process newC++ process new
C++ process new敬倫 林
 
Verilog Lecture2 thhts
Verilog Lecture2 thhtsVerilog Lecture2 thhts
Verilog Lecture2 thhtsBéo Tú
 
Semaphores OS Basics
Semaphores OS BasicsSemaphores OS Basics
Semaphores OS BasicsShijin Raj P
 
SystemVerilog Assertions verification with SVAUnit - DVCon US 2016 Tutorial
SystemVerilog Assertions verification with SVAUnit - DVCon US 2016 TutorialSystemVerilog Assertions verification with SVAUnit - DVCon US 2016 Tutorial
SystemVerilog Assertions verification with SVAUnit - DVCon US 2016 TutorialAmiq Consulting
 
Verilog Lecture3 hust 2014
Verilog Lecture3 hust 2014Verilog Lecture3 hust 2014
Verilog Lecture3 hust 2014Béo Tú
 
Verilog overview
Verilog overviewVerilog overview
Verilog overviewposdege
 
Verilog HDL Training Course
Verilog HDL Training CourseVerilog HDL Training Course
Verilog HDL Training CoursePaul Laskowski
 
VHDL- gate level modelling
VHDL- gate level modellingVHDL- gate level modelling
VHDL- gate level modellingVandanaPagar1
 
Experiment write-vhdl-code-for-realize-all-logic-gates
Experiment write-vhdl-code-for-realize-all-logic-gatesExperiment write-vhdl-code-for-realize-all-logic-gates
Experiment write-vhdl-code-for-realize-all-logic-gatesRicardo Castro
 
Digital Circuit Verification Hardware Descriptive Language Verilog
Digital Circuit Verification Hardware Descriptive Language VerilogDigital Circuit Verification Hardware Descriptive Language Verilog
Digital Circuit Verification Hardware Descriptive Language VerilogAbhiraj Bohra
 

Mais procurados (20)

Lecture03
Lecture03Lecture03
Lecture03
 
Delays in verilog
Delays in verilogDelays in verilog
Delays in verilog
 
Switch level modeling
Switch level modelingSwitch level modeling
Switch level modeling
 
Coding verilog
Coding verilogCoding verilog
Coding verilog
 
Verilog 語法教學
Verilog 語法教學 Verilog 語法教學
Verilog 語法教學
 
C++ process new
C++ process newC++ process new
C++ process new
 
Quiz 9
Quiz 9Quiz 9
Quiz 9
 
Verilog Lecture2 thhts
Verilog Lecture2 thhtsVerilog Lecture2 thhts
Verilog Lecture2 thhts
 
Semaphores OS Basics
Semaphores OS BasicsSemaphores OS Basics
Semaphores OS Basics
 
SystemVerilog Assertions verification with SVAUnit - DVCon US 2016 Tutorial
SystemVerilog Assertions verification with SVAUnit - DVCon US 2016 TutorialSystemVerilog Assertions verification with SVAUnit - DVCon US 2016 Tutorial
SystemVerilog Assertions verification with SVAUnit - DVCon US 2016 Tutorial
 
Verilog Lecture3 hust 2014
Verilog Lecture3 hust 2014Verilog Lecture3 hust 2014
Verilog Lecture3 hust 2014
 
Verilogforlab
VerilogforlabVerilogforlab
Verilogforlab
 
Verilog overview
Verilog overviewVerilog overview
Verilog overview
 
Verilog HDL Training Course
Verilog HDL Training CourseVerilog HDL Training Course
Verilog HDL Training Course
 
Verilog tutorial
Verilog tutorialVerilog tutorial
Verilog tutorial
 
VHDL- gate level modelling
VHDL- gate level modellingVHDL- gate level modelling
VHDL- gate level modelling
 
Verilog tutorial
Verilog tutorialVerilog tutorial
Verilog tutorial
 
C programming session10
C programming  session10C programming  session10
C programming session10
 
Experiment write-vhdl-code-for-realize-all-logic-gates
Experiment write-vhdl-code-for-realize-all-logic-gatesExperiment write-vhdl-code-for-realize-all-logic-gates
Experiment write-vhdl-code-for-realize-all-logic-gates
 
Digital Circuit Verification Hardware Descriptive Language Verilog
Digital Circuit Verification Hardware Descriptive Language VerilogDigital Circuit Verification Hardware Descriptive Language Verilog
Digital Circuit Verification Hardware Descriptive Language Verilog
 

Destaque

Execution - The Discipline of getting things done
Execution - The Discipline of getting things done Execution - The Discipline of getting things done
Execution - The Discipline of getting things done GMR Group
 
Consulting - Stanford Strategic Framework for Matrix Structures
Consulting - Stanford Strategic Framework for Matrix Structures Consulting - Stanford Strategic Framework for Matrix Structures
Consulting - Stanford Strategic Framework for Matrix Structures Francisco Gonzalez
 
Strategic Business Execution Introduction
Strategic Business Execution IntroductionStrategic Business Execution Introduction
Strategic Business Execution IntroductionPMO Advisory LLC
 
CRM 2.0 - Frameworks for Program Strategy
CRM 2.0 - Frameworks for Program StrategyCRM 2.0 - Frameworks for Program Strategy
CRM 2.0 - Frameworks for Program StrategyMichael Moir
 
The Visioneering Operating System - A Framework For Execution
The Visioneering Operating System - A Framework For ExecutionThe Visioneering Operating System - A Framework For Execution
The Visioneering Operating System - A Framework For ExecutionDavender Gupta
 
Execution - The Discipline of getting things done
Execution - The Discipline of getting things doneExecution - The Discipline of getting things done
Execution - The Discipline of getting things doneSathish Kumar P
 
Strategic Planning, Execution Frameworks & Organizational Health
Strategic Planning, Execution Frameworks & Organizational HealthStrategic Planning, Execution Frameworks & Organizational Health
Strategic Planning, Execution Frameworks & Organizational HealthRichard Swartzbaugh
 
Content Strategy Frameworks (from KBS)
Content Strategy Frameworks (from KBS)Content Strategy Frameworks (from KBS)
Content Strategy Frameworks (from KBS)Paul Potenzone
 
Best Practices Frameworks 101
Best Practices Frameworks 101Best Practices Frameworks 101
Best Practices Frameworks 101shailsood
 
Porter's strategies (generic strategies, five forces, diamond model) with ref...
Porter's strategies (generic strategies, five forces, diamond model) with ref...Porter's strategies (generic strategies, five forces, diamond model) with ref...
Porter's strategies (generic strategies, five forces, diamond model) with ref...Sigortam.net
 
Performance Execution Framework (PEF).
Performance Execution Framework (PEF).Performance Execution Framework (PEF).
Performance Execution Framework (PEF).Mindtree Ltd.
 
Strategy frameworks-and-models
Strategy frameworks-and-modelsStrategy frameworks-and-models
Strategy frameworks-and-modelsTaposh Roy
 

Destaque (14)

Blue Ocean Strategy
Blue Ocean StrategyBlue Ocean Strategy
Blue Ocean Strategy
 
Execution - The Discipline of getting things done
Execution - The Discipline of getting things done Execution - The Discipline of getting things done
Execution - The Discipline of getting things done
 
Consulting - Stanford Strategic Framework for Matrix Structures
Consulting - Stanford Strategic Framework for Matrix Structures Consulting - Stanford Strategic Framework for Matrix Structures
Consulting - Stanford Strategic Framework for Matrix Structures
 
Strategic Business Execution Introduction
Strategic Business Execution IntroductionStrategic Business Execution Introduction
Strategic Business Execution Introduction
 
CRM 2.0 - Frameworks for Program Strategy
CRM 2.0 - Frameworks for Program StrategyCRM 2.0 - Frameworks for Program Strategy
CRM 2.0 - Frameworks for Program Strategy
 
The Visioneering Operating System - A Framework For Execution
The Visioneering Operating System - A Framework For ExecutionThe Visioneering Operating System - A Framework For Execution
The Visioneering Operating System - A Framework For Execution
 
IT Strategy
IT StrategyIT Strategy
IT Strategy
 
Execution - The Discipline of getting things done
Execution - The Discipline of getting things doneExecution - The Discipline of getting things done
Execution - The Discipline of getting things done
 
Strategic Planning, Execution Frameworks & Organizational Health
Strategic Planning, Execution Frameworks & Organizational HealthStrategic Planning, Execution Frameworks & Organizational Health
Strategic Planning, Execution Frameworks & Organizational Health
 
Content Strategy Frameworks (from KBS)
Content Strategy Frameworks (from KBS)Content Strategy Frameworks (from KBS)
Content Strategy Frameworks (from KBS)
 
Best Practices Frameworks 101
Best Practices Frameworks 101Best Practices Frameworks 101
Best Practices Frameworks 101
 
Porter's strategies (generic strategies, five forces, diamond model) with ref...
Porter's strategies (generic strategies, five forces, diamond model) with ref...Porter's strategies (generic strategies, five forces, diamond model) with ref...
Porter's strategies (generic strategies, five forces, diamond model) with ref...
 
Performance Execution Framework (PEF).
Performance Execution Framework (PEF).Performance Execution Framework (PEF).
Performance Execution Framework (PEF).
 
Strategy frameworks-and-models
Strategy frameworks-and-modelsStrategy frameworks-and-models
Strategy frameworks-and-models
 

Semelhante a Sstic 2015 detailed_version_triton_concolic_execution_frame_work_f_saudel_jsalwan

ElixirでFPGAを設計する
ElixirでFPGAを設計するElixirでFPGAを設計する
ElixirでFPGAを設計するHideki Takase
 
Lecture2 general structure of a compiler
Lecture2 general structure of a compilerLecture2 general structure of a compiler
Lecture2 general structure of a compilerMahesh Kumar Chelimilla
 
Introduction to Assembly Language
Introduction to Assembly Language Introduction to Assembly Language
Introduction to Assembly Language ApekshaShinde6
 
cs241-f06-final-overview
cs241-f06-final-overviewcs241-f06-final-overview
cs241-f06-final-overviewColin Bell
 
Attention mechanisms with tensorflow
Attention mechanisms with tensorflowAttention mechanisms with tensorflow
Attention mechanisms with tensorflowKeon Kim
 
Reversing & Malware Analysis Training Part 4 - Assembly Programming Basics
Reversing & Malware Analysis Training Part 4 - Assembly Programming BasicsReversing & Malware Analysis Training Part 4 - Assembly Programming Basics
Reversing & Malware Analysis Training Part 4 - Assembly Programming Basicssecurityxploded
 
Reversing malware analysis training part4 assembly programming basics
Reversing malware analysis training part4 assembly programming basicsReversing malware analysis training part4 assembly programming basics
Reversing malware analysis training part4 assembly programming basicsCysinfo Cyber Security Community
 
system software 16 marks
system software 16 markssystem software 16 marks
system software 16 marksvvcetit
 
W8_1: Intro to UoS Educational Processor
W8_1: Intro to UoS Educational ProcessorW8_1: Intro to UoS Educational Processor
W8_1: Intro to UoS Educational ProcessorDaniel Roggen
 
Assembly language programming_fundamentals 8086
Assembly language programming_fundamentals 8086Assembly language programming_fundamentals 8086
Assembly language programming_fundamentals 8086Shehrevar Davierwala
 

Semelhante a Sstic 2015 detailed_version_triton_concolic_execution_frame_work_f_saudel_jsalwan (20)

ElixirでFPGAを設計する
ElixirでFPGAを設計するElixirでFPGAを設計する
ElixirでFPGAを設計する
 
Lecture2 general structure of a compiler
Lecture2 general structure of a compilerLecture2 general structure of a compiler
Lecture2 general structure of a compiler
 
Introduction to Assembly Language
Introduction to Assembly Language Introduction to Assembly Language
Introduction to Assembly Language
 
The Phases of a Compiler
The Phases of a CompilerThe Phases of a Compiler
The Phases of a Compiler
 
cs241-f06-final-overview
cs241-f06-final-overviewcs241-f06-final-overview
cs241-f06-final-overview
 
Attention mechanisms with tensorflow
Attention mechanisms with tensorflowAttention mechanisms with tensorflow
Attention mechanisms with tensorflow
 
Assembler
AssemblerAssembler
Assembler
 
Compiler Design
Compiler DesignCompiler Design
Compiler Design
 
Reversing & Malware Analysis Training Part 4 - Assembly Programming Basics
Reversing & Malware Analysis Training Part 4 - Assembly Programming BasicsReversing & Malware Analysis Training Part 4 - Assembly Programming Basics
Reversing & Malware Analysis Training Part 4 - Assembly Programming Basics
 
Coal (1)
Coal (1)Coal (1)
Coal (1)
 
Reversing malware analysis training part4 assembly programming basics
Reversing malware analysis training part4 assembly programming basicsReversing malware analysis training part4 assembly programming basics
Reversing malware analysis training part4 assembly programming basics
 
ISA.pptx
ISA.pptxISA.pptx
ISA.pptx
 
SS-assemblers 1.pptx
SS-assemblers 1.pptxSS-assemblers 1.pptx
SS-assemblers 1.pptx
 
CD U1-5.pptx
CD U1-5.pptxCD U1-5.pptx
CD U1-5.pptx
 
Interm codegen
Interm codegenInterm codegen
Interm codegen
 
system software 16 marks
system software 16 markssystem software 16 marks
system software 16 marks
 
W8_1: Intro to UoS Educational Processor
W8_1: Intro to UoS Educational ProcessorW8_1: Intro to UoS Educational Processor
W8_1: Intro to UoS Educational Processor
 
CODch3Slides.ppt
CODch3Slides.pptCODch3Slides.ppt
CODch3Slides.ppt
 
Assembly language programming_fundamentals 8086
Assembly language programming_fundamentals 8086Assembly language programming_fundamentals 8086
Assembly language programming_fundamentals 8086
 
Lecture7.pdf
Lecture7.pdfLecture7.pdf
Lecture7.pdf
 

Último

High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑Damini Dixit
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfSumit Kumar yadav
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxCOST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxFarihaAbdulRasheed
 
Dopamine neurotransmitter determination using graphite sheet- graphene nano-s...
Dopamine neurotransmitter determination using graphite sheet- graphene nano-s...Dopamine neurotransmitter determination using graphite sheet- graphene nano-s...
Dopamine neurotransmitter determination using graphite sheet- graphene nano-s...Mohammad Khajehpour
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and ClassificationsAreesha Ahmad
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryAlex Henderson
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)Areesha Ahmad
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.Nitya salvi
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bSérgio Sacani
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptxAlMamun560346
 
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Silpa
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxRizalinePalanog2
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsSérgio Sacani
 
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts ServiceJustdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Servicemonikaservice1
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfrohankumarsinghrore1
 
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxPSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxSuji236384
 
biology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGYbiology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGY1301aanya
 

Último (20)

High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdf
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxCOST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
 
Dopamine neurotransmitter determination using graphite sheet- graphene nano-s...
Dopamine neurotransmitter determination using graphite sheet- graphene nano-s...Dopamine neurotransmitter determination using graphite sheet- graphene nano-s...
Dopamine neurotransmitter determination using graphite sheet- graphene nano-s...
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptx
 
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
 
Site Acceptance Test .
Site Acceptance Test                    .Site Acceptance Test                    .
Site Acceptance Test .
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts ServiceJustdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxPSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
 
biology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGYbiology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGY
 

Sstic 2015 detailed_version_triton_concolic_execution_frame_work_f_saudel_jsalwan

  • 1. Triton: Concolic Execution Framework Florent Saudel Jonathan Salwan SSTIC Rennes – France Slides: detailed version June 3 2015 Keywords: program analysis, DBI, DBA, Pin, concrete execution, symbolic execution, concolic execution, DSE, taint analysis, context snapshot, Z3 theorem prover and behavior analysis.
  • 2. 2 ● Jonathan Salwan is a student at Bordeaux University (CSI Master) and also an employee at Quarkslab ● Florent Saudel is a student at the Bordeaux University (CSI Master) and applying to an Internship at Amossys ● Both like playing with low-level computing, program analysis and software verification methods Who are we?
  • 3. 3 ● Triton is a project started on January 2015 for our Master final project at Bordeaux University (CSI) supervised by Emmanuel Fleury from laBRI ● Triton is also sponsored by Quarkslab from the beginning Where does Triton come from?
  • 4. 4 ● Triton is a concolic execution framework as Pintool ● It provides advanced classes to improve dynamic binary analysis (DBA) using Pin – Symbolic execution engine – SMT semantics representation – Interface with SMT Solver – Taint analysis engine – Snapshot engine – API and Python bindings What is Triton?
  • 5. 5 What is Triton? Plug what you want which supports the SMT2-LIB format Pin Taint Engine Symbolic Execution Engine Snapshot Engine SMT Semantics IR SMT Solver Interface Python Bindings Triton internal components Triton.so pintool user script.py Z3 * Front-endBack-end User side
  • 6. 6 ● Well-known projects – SAGE – Mayhem – Bitblaze – S2E ● The difference? – Triton works online* through a higher level languages using the Pin engine Relative projects online*: Analysis is performed at runtime and data can be modified directly in memory to go through specific branches.
  • 7. 7 ● You can build tools which: – Analyze a trace with concrete information ● Registers and memory values at each program point – Perform a symbolic execution ● To know the symbolic expression of registers and memory at each program point – Perform a symbolic fuzzing session – Generate and solve path constraints – Gather code coverage – Runtime registers and memory modification – Replay traces directly in memory – Scriptable debugging – Access to Pin functions through a higher level languages (Python bindings) – And probably lots of others things What kind of things you can build with Triton?
  • 10. 10 ● Symbolic Engine ● Symbolic execution is the execution of a program using symbolic variables instead of concrete values ● Symbolic execution translates the program's semantics into a logical formula ● Symbolic execution can build and keep a path formula – By solving the formula and its negation we can take all paths and “cover” a code ● Instead of concrete execution which takes only one path ● Then a symbolic expression is given to a SMT solver to generate a concrete value
  • 11. 11 ● Symbolic Engine inside Triton ● A trace is a sequence of instructions T = (Ins1 Ins∧ 2 Ins∧ 3 Ins∧ 4 … Ins∧ ∧ i) ● Instructions are represented with symbolic expressions ● A symbolic trace is a sequence of symbolic expressions ● Each symbolic expression is translated like this: REFout = semantic – Where : ● REFout := unique ID ● Semantic := REFin | <<smt expression>> ● Each register or byte of memory points to its last reference → Single Static Assignment Form (SSA)
  • 12. 12 ● Register References movzx eax, byte ptr [mem] add eax, 2 mov ebx, eax Example: // All refs initialized to -1 Register Reference Table { EAX : -1, EBX : -1, ECX : -1, ... } // Empty set Symbolic Expression Set { }
  • 13. 13 ● Register references movzx eax, byte ptr [mem] #0 = symvar_1 add eax, 2 mov ebx, eax Example: // All refs initialized to -1 Register Reference Table { EAX : #0, EBX : -1, ECX : -1, ... } // Empty set Symbolic Expression Set { <#0, symvar_1> }
  • 14. 14 ● Register references movzx eax, byte ptr [mem] #0 = symvar_1 add eax, 2 #1 = (bvadd #0 2) mov ebx, eax Example: // All refs initialized to -1 Register Reference Table { EAX : #1, EBX : -1, ECX : -1, ... } // Empty set Symbolic Expression Set { <#1, (bvadd #0 2)>, <#0, symvar_1> }
  • 15. 15 ● Register references movzx eax, byte ptr [mem] #0 = symvar_1 add eax, 2 #1 = (bvadd #0 2) mov ebx, eax #2 = #1 Example: // All refs initialized to -1 Register Reference Table { EAX : #1, EBX : #2, ECX : -1, ... } // Empty set Symbolic Expression Set { <#2, #1>, <#1, (bvadd #0 2)>, <#0, symvar_1> }
  • 16. 16 ● Rebuild the trace with backward analysis movzx eax, byte ptr [mem] add eax, 2 mov ebx, eax Example: // All refs initialized to -1 Register Reference Table { EAX : #1, EBX : #2, ECX : -1, ... } // Empty set Symbolic Expression Set { <#2, #1>, <#1, (bvadd #0 2)>, <#0, symvar_1> } What is the semantic trace of EBX ?
  • 17. 17 ● Rebuild the trace with backward analysis movzx eax, byte ptr [mem] add eax, 2 mov ebx, eax Example: // All refs initialized to -1 Register Reference Table { EAX : #1, EBX : #2, ECX : -1, ... } // Empty set Symbolic Expression Set { <#2, #1>, <#1, (bvadd #0 2)>, <#0, symvar_1> } What is the semantic trace of EBX ? EBX holds the reference #2
  • 18. 18 ● Rebuild the trace with backward analysis movzx eax, byte ptr [mem] add eax, 2 mov ebx, eax Example: // All refs initialized to -1 Register Reference Table { EAX : #1, EBX : #2, ECX : -1, ... } // Empty set Symbolic Expression Set { <#2, #1>, <#1, (bvadd #0 2)>, <#0, symvar_1> } What is the semantic trace of EBX ? EBX holds the reference #2 What is #2 ?
  • 19. 19 ● Rebuild the trace with backward analysis movzx eax, byte ptr [mem] add eax, 2 mov ebx, eax Example: // All refs initialized to -1 Register Reference Table { EAX : #1, EBX : #2, ECX : -1, ... } // Empty set Symbolic Expression Set { <#2, #1>, <#1, (bvadd #0 2)>, <#0, symvar_1> } What is the semantic trace of EBX ? EBX holds the reference #2 What is #2 ? Reconstruction: EBX = #2
  • 20. 20 ● Rebuild the trace with backward analysis movzx eax, byte ptr [mem] add eax, 2 mov ebx, eax Example: // All refs initialized to -1 Register Reference Table { EAX : #1, EBX : #2, ECX : -1, ... } // Empty set Symbolic Expression Set { <#2, #1>, <#1, (bvadd #0 2)>, <#0, symvar_1> } What is the semantic trace of EBX ? EBX holds the reference #2 What is #2 ? Reconstruction: EBX = #1
  • 21. 21 ● Rebuild the trace with backward analysis movzx eax, byte ptr [mem] add eax, 2 mov ebx, eax Example: // All refs initialized to -1 Register Reference Table { EAX : #1, EBX : #2, ECX : -1, ... } // Empty set Symbolic Expression Set { <#2, #1>, <#1, (bvadd #0 2)>, <#0, symvar_1> } What is the semantic trace of EBX ? EBX holds the reference #2 What is #2 ? Reconstruction: EBX = (bvadd #0 2)
  • 22. 22 ● Rebuild the trace with backward analysis movzx eax, byte ptr [mem] add eax, 2 mov ebx, eax Example: // All refs initialized to -1 Register Reference Table { EAX : #1, EBX : #2, ECX : -1, ... } // Empty set Symbolic Expression Set { <#2, #1>, <#1, add(#0, 2)>, <#0, symvar_1> } What is the semantic trace of EBX ? EBX holds the reference #2 What is #2 ? Reconstruction: EBX = (bvadd symvar_1 2)
  • 23. 23 ● Assigning a reference for each register is not enough, we must also add references on memory ● Follow references over memory mov dword ptr [rbp-0x4], 0x0 ... mov eax, dword ptr [rbp-0x4] push eax ... pop ebx What do we want to know? Eax = 0 from somewhere ebx = eax References #1 = 0x0 ... #x = #1 #2 = #1 ... #x = #2
  • 24. 24 ● All registers, flags and each byte of memory are references ● A reference assignment is in SSA form during the execution ● The registers, flags and bytes of memory are assigned in the same way ● A memory reference can be assigned from a register reference (mov [mem], reg) ● A register reference can be assigned from a memory reference (mov reg, [mem]) ● If a reference doesn't exist yet, we concretize the value and we affect a new reference ● References conclusion
  • 25. 25 ● SMT Semantics Representation with SSA Form
  • 26. 26 ● SMT Semantics Representation with SSA Form ● All instructions semantics are represented via SMT2-LIB representation ● This SMT2-LIB representation is on SSA form add rax, rdx rax = (bvadd ((_ extract 63 0) rax) ((_ extract 63 0) rdx)) ... (af, cf, of, pf) ... sf = (ite (= ((_ extract 63 63) rax) (_ bv1 1)) (_ bv1 1) (_ bv0 1)) zf = (ite (= rax (_ bv0 64)) (_ bv1 1) (_ bv0 1)) #60 = (bvadd ((_ extract 63 0) #58) ((_ extract 63 0) #54)) ... (af, cf, of, pf) ... #64 = (ite (= ((_ extract 63 63) #60) (_ bv1 1)) (_ bv1 1) (_ bv0 1)) #65 = (ite (= #60 (_ bv0 64)) (_ bv1 1) (_ bv0 1)) Assembly SMT SSA SMT New rax reference Old rax reference Old rdx reference
  • 27. 27 ● SMT Semantics Representation with SSA Form ● Why use SMT2-LIB representation? – SMT-LIB is an international initiative aimed at facilitating research and development in Satisfiability Modulo Theories (SMT) – As all Triton's expressions are in the SMT2-LIB representation, you can plug all solvers which supports this representation ● Currently Triton has an interface with Z3 but feel free to plug what you want
  • 28. 28 ● Symbolic Execution Guided By The Taint Analysis
  • 29. 29 ● Symbolic Execution Guided By The Taint Analysis ● Taint analysis provides information about which registers and memory addresses are controllable by the user at each program point: – Assists the symbolic engine to setup the symbolic variables (a symbolic variable is a memory area that the user can control) – Limit the symbolic engine to the relevant part of the program – At each branch instruction, we directly know if the user can go through both branches (this is mainly used for code coverage)
  • 30. 30 ● Symbolic Execution Guided By The Taint Analysis ● Transform a tainted area into a symbolic variable 0x40058b: movzx eax, byte ptr [rax] -> #33 = SymVar_0 ; Controllable by the user -> #34 = (_ bv4195726 64) ; RIP 0x40058e: movsx eax, al -> #35 = ((_ sign_extend 24) ((_ extract 7 0) #33)) -> #36 = (_ bv4195729 64) ; RIP 0x40058b: movzx eax, byte ptr [rax] -> #33 = ((_ zero_extend 24) (_ bv97 8)) -> #34 = (_ bv4195726 64) ; RIP 0x40058e: movsx eax, al -> #35 = ((_ sign_extend 24) ((_ extract 7 0) #33)) -> #36 = (_ bv4195729 64) ; RIP rax points on a tainted area Use symbolic variable instead of concrete value
  • 31. 31 ● Symbolic Execution Guided By The Taint Analysis ● Can I go through this branch? – Check if flags are tainted 0x4005ae: cmp ecx, eax -> #72 = (bvsub ((_ extract 31 0) #52) ((_ extract 31 0) #70)) ...CF, OF, SF, AF, and PF skipped... -> #78 = (ite (= #72 (_ bv0 32)) (_ bv1 1) (_ bv0 1)) ; ZF -> #79 = (_ bv4195760 64) ; RIP 0x4005b0: jz 0x4005b9 -> #80 = (ite (= #78 (_ bv1 1)) (_ bv4195769 64) (_ bv4195762 64)) ; RIP eax is tainted tainted
  • 32. 32 ● Taint Analysis guided by the Symbolic Engine and the Solver Engine ● As the symbolic execution may be guided by the taint analysis, the taint analysis may also be guided by the symbolic execution and the solver engine ● What to choose between an over-approximation and under-approximation? – Over-approximation: We can generate inputs for infeasible concrete paths. – Under-approximation: We can miss some feasible paths. ● The goal of the taint engine is to say YES or NO if a register and memory is probably tainted (byte-level over approximation) ● The goal of the symbolic engine is to build symbolic expressions based on instructions semantics ● The goal of the solver engine is to generate a model of an expression (path condition) – If your target is not tainted, don't ask a model → gain time – If the solver engine returns UNSAT → the tainted inputs can't influence the control flow to go through this path. – If the solver engine returns SAT → the path can be triggered with the actual tainted inputs. The model give us the set of concrete inputs for this path.
  • 33. 33 ● Snapshot Engine – Replay your trace
  • 34. 34 ● Snapshot Engine – Replay your trace ● The snapshot engine offers the possibility to take and restore snapshot – Mainly used to apply code coverage in memory. Useful when you fuzz the binary – In future versions, it will be possible to take different snapshots at several program point ● The snapshot engine only restores registers and memory states – If there is some disk, network,... I/O, Triton won't be able to restore the files modification Restore context Snapshot Restore SnapshotTarget function / basic block Dynamic Symbolic Execution Possible paths in the target
  • 35. 35 ● Stop talking about back-end! Let's see how I can use Triton
  • 36. 36 ● How to install Triton? ● Easy is easy ● You just need: – Pin v2.14-71313 – Z3 v4.3.1 – Python v2.7 $ cd pin-2.14-71313-gcc.4.4.7-linux/source/tools/ $ git clone git@github.com:JonathanSalwan/Triton.git $ cd Triton $ make $ ../../../pin -t ./triton.so -script your_script.py -- ./your_target_binary.elf64 Shell 1: Installation
  • 37. 37 ● Start an analysis import triton if __name__ == '__main__': # Start the symbolic analysis from the 'check' function triton.startAnalysisFromSymbol('check') # Run the instrumentation - Never returns triton.runProgram() import triton if __name__ == '__main__': # Start the symbolic analysis from address triton.startAnalysisFromAddr(0x40056d) triton.stopAnalysisFromAddr(0x4005c9) # Run the instrumentation - Never returns triton.runProgram() Code 1: Start analysis from symbols Code 2: Start analysis from address
  • 38. 38 ● Predicate taint and untaint import triton if __name__ == '__main__': # Start the symbolic analysis from the 'check' function triton.startAnalysisFromSymbol('check') # Taint the RAX and RBX registers when the address 0x40058e is executed triton.taintRegFromAddr(0x40058e, [IDREF.REG.RAX, IDREF.REG.RBX]) # Untaint the RCX register when the address 0x40058e is executed triton.untaintRegFromAddr(0x40058e, [IDREF.REG.RCX]) # Run the instrumentation - Never returns triton.runProgram() Code 3: Predicate taint and untaint at specific addresses
  • 39. 39 ● Callbacks ● Triton supports 8 kinds of callbacks – AFTER ● Defines a callback after the instruction processing – BEFORE ● Defines a callback before the instruction processing – BEFORE_SYMPROC ● Defines a callback before the symbolic processing – FINI ● Define a callback at the end of the execution – ROUTINE_ENTRY ● Define a callback at the entry of a specified routine. – ROUTINE_EXIT ● Define a callback at the exit of a specified routine. – SYSCALL_ENTRY ● Define a callback before each syscall processing – SYSCALL_EXIT ● Define a callback after each syscall processing
  • 40. 40 ● Callback on SYSCALL def my_callback_syscall_entry(threadId, std): print '-> Syscall Entry: %s' %(syscallToString(std, getSyscallNumber(std))) if getSyscallNumber(std) == IDREF.SYSCALL.LINUX_64.WRITE: arg0 = getSyscallArgument(std, 0) arg1 = getSyscallArgument(std, 1) arg2 = getSyscallArgument(std, 2) print ' sys_write(%x, %x, %x)' %(arg0, arg1, arg2) def my_callback_syscall_exit(threadId, std): print '<- Syscall return %x' %(getSyscallReturn(std)) if __name__ == '__main__': startAnalysisFromSymbol('main') addCallback(my_callback_syscall_entry, IDREF.CALLBACK.SYSCALL_ENTRY) addCallback(my_callback_syscall_exit, IDREF.CALLBACK.SYSCALL_EXIT) runProgram() Code 4: Callback before and after syscalls processing -> Syscall Entry: fstat <- Syscall return 0 -> Syscall Entry: mmap <- Syscall return 7fb7f06e1000 -> Syscall Entry: write sys_write(1, 7fb7f06e1000, 6) Code 4 result
  • 41. 41 ● Callback on ROUTINE def mallocEntry(threadId): sizeAllocated = getRegValue(IDREF.REG.RDI) print '-> malloc(%#x)' %(sizeAllocated) def mallocExit(threadId): ptrAllocated = getRegValue(IDREF.REG.RAX) print '<- %#x' %(ptrAllocated) if __name__ == '__main__': startAnalysisFromSymbol('main') addCallback(mallocEntry, IDREF.CALLBACK.ROUTINE_ENTRY, "malloc") addCallback(mallocExit, IDREF.CALLBACK.ROUTINE_EXIT, "malloc") runProgram() Code 5: Callback before and after routine processing -> malloc(0x20) <- 0x8fc010 -> malloc(0x20) <- 0x8fc040 -> malloc(0x20) <- 0x8fc010 Code 5 result
  • 42. 42 ● Callback BEFORE and AFTER instruction processing def my_callback_before(instruction): print 'TID (%d) %#x %s' %(instruction.threadId, instruction.address, instruction.assembly) if __name__ == '__main__': # Start the symbolic analysis from the 'check' function startAnalysisFromSymbol('check') # Add a callback. addCallback(my_callback_before, IDREF.CALLBACK.BEFORE) # Run the instrumentation - Never returns runProgram() Code 6: Callback before instruction processing TID (0) 0x40056d push rbp TID (0) 0x40056e mov rbp, rsp TID (0) 0x400571 mov qword ptr [rbp-0x18], rdi TID (0) 0x400575 mov dword ptr [rbp-0x4], 0x0 ... TID (0) 0x4005b2 mov eax, 0x1 TID (0) 0x4005b7 jmp 0x4005c8 TID (0) 0x4005c8 pop rbp Code 6 result
  • 43. 43 ● Instruction class def my_callback(instruction): ... ● instruction.address ● instruction.assembly ● instruction.imageName - e.g: libc.so ● instruction.isBranch ● instruction.opcode ● instruction.opcodeCategory ● instruction.operands ● instruction.symbolicElements – List of SymbolicElement class ● instruction.routineName - e.g: main ● instruction.sectionName - e.g: .text ● instruction.threadId
  • 44. 44 ● SymbolicElement class instruction.symbolicElements[0] ● symbolicElement.comment → blah ● symbolicElement.destination → #41 ● symbolicElement.expression → #41 = (bvadd ((_ extract 63 0) #40) ((_ extract 63 0) #39)) ● symbolicElement.id → 41 ● symbolicElement.isTainted → True or False ● symbolicElement.source → (bvadd ((_ extract 63 0) #40) ((_ extract 63 0) #39)) Instruction: add rax, rdx SymbolicElement: #41 = (bvadd ((_ extract 63 0) #40) ((_ extract 63 0) #39)) ; blah
  • 45. 45 ● Dump the symbolic expressions trace def my_callback_after(instruction): print '%#x: %s' %(instruction.address, instruction.assembly) for se in instruction.symbolicElements: print 't -> ', se.expression print if __name__ == '__main__': startAnalysisFromSymbol('check') addCallback(my_callback_after, IDREF.CALLBACK.AFTER) runProgram() Code 7: Dump a symbolic expression trace 0x4005ab: movsx eax, al -> #70 = ((_ sign_extend 24) ((_ extract 7 0) #68)) -> #71 = (_ bv4195758 64) 0x4005ae: cmp ecx, eax -> #72 = (bvsub ((_ extract 31 0) #52) ((_ extract 31 0) #70)) ... -> #77 = (ite (= ((_ extract 31 31) #72) (_ bv1 1)) (_ bv1 1) (_ bv0 1)) -> #78 = (ite (= #72 (_ bv0 32)) (_ bv1 1) (_ bv0 1)) -> #79 = (_ bv4195760 64) 0x4005b0: jz 0x4005b9 -> #80 = (ite (= #78 (_ bv1 1)) (_ bv4195769 64) (_ bv4195762 64)) Code 7 result
  • 46. 46 ● Play with the Taint engine at runtime # 0x40058b: movzx eax, byte ptr [rax] def cbeforeSymProc(instruction): if instruction.address == 0x40058b: rax = getRegValue(IDREF.REG.RAX) taintMem(rax) if __name__ == '__main__': startAnalysisFromSymbol('check') addCallback(cbeforeSymProc, IDREF.CALLBACK.BEFORE_SYMPROC) runProgram() Code 8: Taint memory at runtime 0x40058b: movzx eax, byte ptr [rax] -> #33 = SymVar_0 -> #34 = (_ bv4195726 64) 0x40058e: movsx eax, al -> #35 = ((_ sign_extend 24) ((_ extract 7 0) #33)) -> #36 = (_ bv4195729 64) Code 8 result Modifications must be done before the symbolic processing
  • 47. 47 ● Taint argv[x][x] at the main function def mainAnalysis(threadId): rdi = getRegValue(IDREF.REG.RDI) # argc rsi = getRegValue(IDREF.REG.RSI) # argv while rdi != 0: argv = getMemValue(rsi + ((rdi-1) * 8), 8) offset = 0 while getMemValue(argv + offset, 1) != 0x00: taintMem(argv + offset) offset += 1 print '[+] %03d bytes tainted from the argv[%d] (%#x) pointer' %(offset, rdi-1, argv) rdi -= 1 return Code 9: Taint all arguments when the main function occurs $ pin -t ./triton.so -script taint_main.py -- ./example.bin64 12 123456 123456789 [+] 009 bytes tainted from the argv[3] (0x7fff802ad116) pointer [+] 006 bytes tainted from the argv[2] (0x7fff802ad10f) pointer [+] 002 bytes tainted from the argv[1] (0x7fff802ad10c) pointer [+] 015 bytes tainted from the argv[0] (0x7fff802ad0ef) pointer Code 9 result
  • 48. 48 ● Play with the Symbolic engine 0x40058b: movzx eax, byte ptr [rax] ... ... 0x4005ae: cmp ecx, eax Example 10: Assembly code def callback_beforeSymProc(instruction): if instruction.address == 0x40058b: rax = getRegValue(IDREF.REG.RAX) taintMem(rax) def callback_after(instruction): if instruction.address == 0x4005ae: # Get the symbolic expression ID of ZF zfId = getRegSymbolicID(IDREF.FLAG.ZF) # Backtrack the symbolic expression ZF zfExpr = getBacktrackedSymExpr(zfId) # Craft a new expression over the ZF expression : (assert (= zfExpr True)) expr = smt2lib.smtAssert(smt2lib.equal(zfExpr, smt2lib.bvtrue())) print expr Code 10: Backtrack symbolic expression We know that rax points on a tainted area (assert (= (ite (= (bvsub ((_ extract 31 0) ((_ extract 31 0) (bvxor ((_ extract 31 0) (bvsub ((_ extract 31 0) ((_ sign_extend 24) ((_ extract 7 0) SymVar_0))) (_ bv1 32))) (_ bv85 32)))) ((_ extract 31 0) ((_ sign_extend 24) ((_ extract 7 0) ((_ zero_extend 24) (_ bv49 8)))))) (_ bv0 32)) (_ bv1 1) (_ bv0 1)) (_ bv1 1))) Example 10 result Symbolic Variable
  • 49. 49 ● Play with the Symbolic engine ... zfExpr = getBacktrackedSymExpr(zfId) # Craft a new expression over the ZF expression : (assert (= zfExpr True)) expr = smt2lib.smtAssert(smt2lib.equal(zfExpr, smt2lib.bvtrue())) ... Extract of the Code 10 ● What does it really mean? – Triton builds symbolic formulas based on the instructions semantics – Triton also exports smt2lib functions which allows you to create your own formula – In this example, we want that the ZF expression is equal to 1
  • 50. 50 ● Play with the Solver engine ... zfExpr = getBacktrackedSymExpr(zfId) # Craft a new expression over the ZF expression : (assert (= zfExpr True)) expr = smt2lib.smtAssert(smt2lib.equal(zfExpr, smt2lib.bvtrue())) ... model = getModel(expr) print model Extract of the Code 10 {'SymVar_0': 0x65} Result ● getModel() returns a dictionary of valid model for each symbolic variable 0x40058b: movzx eax, byte ptr [rax] ... ... 0x4005ae: cmp ecx, eax Example 10: Assembly code We know now that the first character must be 0x65 to set the ZF at the compare instruction
  • 51. 51 ● Play with the Solver engine and inject values directly in memory ... model = getModel(expr) print model Extract of the Code 10 Result ● Each symbolic variable is assigned to a memory address (SymVar ↔ Address) – Possible to get the symbolic variable from a memory address ● getSymVarFromMemory(addr) – Possible to get the memory address from a symbolic variable ● getMemoryFromSymVar(symVar) {'SymVar_0': 0x65} for k, v in model.items(): setMemValue(getMemoryFromSymVar(k), getSymVarSize(k), v) Inject values given by the solver in memory
  • 52. 52 ● Inject values in memory is not enough Play with the snapshot engine ● Inject values in memory after instructions processing is useless ● That's why Triton offers a snapshot engine φ1 φ2 φ3 φ4 φ5 φ6 0x40058b: movzx eax, byte ptr [rax] 0x4005ae: cmp ecx, eax def callback_after(instruction): if instruction.address == 0x40058b and isSnapshotEnable() == False: takeSnapshot() if instruction.address == 0x4005ae: if getFlagValue(IDREF.FLAG.ZF) == 0: zfExpr = getBacktrackedSymExpr(...) # Described on slide 45 expr = smt2lib.smtAssert(...zfExpr...) # Described on slide 45 for k, v in getModel(expr).items(): # Described on slide 48 saveValue(...) restoreSnapshot() restore snapshottake snapshot
  • 53. 53 ● Stop pasting fucking code Show me a global vision
  • 54. 54 ● Stop pasting fucking code Show me a global vision ● Full API and Python bindings describes here – https://github.com/JonathanSalwan/Triton/wiki/Python-Bindings – ~80 functions exported over the Python bindings ● Basically we can: – Taint and untaint memory and registers – Inject value in memory and registers – Add callbacks at each program point, syscalls, routine – Assign symbolic expression on registers and bytes of memory – Build and customize symbolic expressions – Solve symbolic expressions – Take and restore snapshots – Do all this in Python!
  • 56. 56 ● Conclusion ● Triton: – is a Pintool which provides others classes for DBA – is designed as a concolic execution framework – provides an API and Python bindings – supports only x86-64 binaries – currently supports ~100 semantics but we are working hard on it to increase the semantics support ● An awesome thanks to Kevin `wisk` Szkudlapski and Francis `gg` Gabriel for the x86.yaml from the Medusa project :) – is free and open-source :) – is available here : github.com/JonathanSalwan/Triton
  • 57. 57 ● Contacts – fsaudel@gmail.com – jsalwan@quarkslab.com ● Thanks – We would like to thank the SSTIC's staff and especially Rolf Rolles, Sean Heelan, Sébastien Bardin, Fred Raynal and Serge Guelton for their proofreading and awesome feedbacks! Then, a big thanks to Quarkslab for the sponsor. Thanks For Your Attention Question(s)?
  • 58. 58 ● Q&A - Performances ● Host machine configuration – Tested with an Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz – 16 Go DDR3 – 415 Go SSD Swap ● The targeted binary analyzed was /usr/bin/z3 – 6,789,610 symbolic expressions created for 1 trace – The binary has been analyzed in 180 seconds ● One trace with SMT2-LIB translation and the taint spread – 19 Go of RAM consumed ● Due to the SMT2-LIB strings manipulation