7. Basic Interpreter
[Figure: application code (foo(), bar(); basic blocks A-F) executed by an interpreter loop: fetch, decode, execute]
Slowdown: ~300x
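
The fetch, decode, execute loop named on this slide can be sketched as a toy interpreter. This is a minimal illustration, not DynamoRIO code: the stack-machine opcodes and the names `PROGRAM` and `interpret` are hypothetical. It shows why interpretation is slow: every instruction pays the full loop overhead each time it runs.

```python
# Toy stack-machine program; the opcodes are hypothetical, for illustration only.
PROGRAM = [
    ("push", 5),
    ("push", 7),
    ("add", None),
    ("halt", None),
]


def interpret(program):
    """Run a program one instruction at a time and return the final stack."""
    stack = []
    pc = 0
    while True:
        opcode, operand = program[pc]  # fetch
        pc += 1
        # decode the opcode, then execute it
        if opcode == "push":
            stack.append(operand)
        elif opcode == "add":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "halt":
            return stack
        else:
            raise ValueError(f"unknown opcode: {opcode}")


print(interpret(PROGRAM))  # prints [12]
```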
8. Improvement #1: Basic Block Cache
[Figure: DynamoRIO copies the executed application basic blocks (A, C, D, E, F of foo() and bar()) into a software code cache and runs them from there]
Slowdown: 300x → 25x
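
A basic block cache can be sketched as follows. This is a simplified model under stated assumptions, not DynamoRIO's implementation: the application is a hypothetical dictionary of labeled blocks, and "translation" is just a counted dictionary lookup standing in for decoding and copying a block. The point is that each block is translated once and then reused, although control still returns to the dispatcher after every block.

```python
def decrement(state):
    state["n"] -= 1


def no_work(state):
    pass


# Hypothetical application: label -> (block body, function choosing the next label).
APP = {
    "A": (decrement, lambda s: "A" if s["n"] > 0 else "F"),
    "F": (no_work, lambda s: None),  # None means the program exits
}


class BlockCache:
    def __init__(self, app):
        self.app = app
        self.cache = {}         # label -> cached copy of the block
        self.translations = 0   # how many times a block had to be built

    def _translate(self, label):
        # Stands in for decoding the block and copying it into the cache.
        self.translations += 1
        return self.app[label]

    def run(self, entry, state):
        executed = 0
        label = entry
        while label is not None:
            if label not in self.cache:          # translate only on a miss
                self.cache[label] = self._translate(label)
            body, choose_next = self.cache[label]
            body(state)                          # run the cached block
            executed += 1
            label = choose_next(state)           # back to the dispatcher
        return executed


engine = BlockCache(APP)
blocks_run = engine.run("A", {"n": 1000})
print(blocks_run, engine.translations)  # prints 1001 2
```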
9. Improvement #2: Linking Direct Branches
[Figure: the same code cache (A, C, D, E, F), with direct branches between cached blocks linked to each other]
Slowdown: 300x → 25x → 3x
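
Linking direct branches can be sketched by letting each cached fragment remember where its exits lead, so that execution stays in the cache instead of returning to the dispatcher. The `Fragment` class, `run_linked`, and the toy `APP` below are hypothetical names for illustration; patching an exit is modeled as adding an entry to a dictionary.

```python
def decrement(state):
    state["n"] -= 1


def no_work(state):
    pass


# Hypothetical application: label -> (block body, function choosing the next label).
APP = {
    "A": (decrement, lambda s: "A" if s["n"] > 0 else "F"),
    "F": (no_work, lambda s: None),  # None means the program exits
}


class Fragment:
    def __init__(self, body, choose_next):
        self.body = body
        self.choose_next = choose_next
        self.links = {}  # target label -> Fragment, filled in as exits are linked


def run_linked(app, entry, state):
    """Run the app and return how many times the dispatcher was entered."""
    cache = {}
    dispatches = 0
    label, came_from = entry, None
    while label is not None:
        dispatches += 1                       # context switch out of the cache
        if label not in cache:
            cache[label] = Fragment(*app[label])
        fragment = cache[label]
        if came_from is not None:
            came_from.links[label] = fragment  # link the exit that led here
        while True:                           # stay in the cache while linked
            fragment.body(state)
            target = fragment.choose_next(state)
            if target not in fragment.links:
                break
            fragment = fragment.links[target]
        label, came_from = target, fragment
    return dispatches


print(run_linked(APP, "A", {"n": 1000}))  # prints 3
```

The loop body runs 1000 times, but the dispatcher is entered only three times: once to build A, once to link A to itself, and once to build F.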
10. Improvement #3: Linking Indirect Branches
[Figure: the same code cache (A, C, D, E, F), with indirect branches resolved through an indirect branch lookup inside the cache]
Slowdown: 300x → 25x → 3x → 1.2x
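
An indirect branch target is an application address known only at run time, so it cannot be linked ahead of time. A minimal sketch of the lookup, assuming a plain hash table from application addresses to code cache addresses (the class name and all addresses below are hypothetical):

```python
class IndirectBranchTable:
    """Maps application addresses to code cache addresses."""

    def __init__(self):
        self._targets = {}
        self.hits = 0
        self.misses = 0

    def add(self, app_address, cache_address):
        self._targets[app_address] = cache_address

    def lookup(self, app_address):
        # A hit lets execution continue inside the cache; a miss (None)
        # means control must go back to the dispatcher.
        cache_address = self._targets.get(app_address)
        if cache_address is None:
            self.misses += 1
        else:
            self.hits += 1
        return cache_address


table = IndirectBranchTable()
table.add(0x401000, 0x7F000040)  # hypothetical addresses
```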
11. Improvement #4: Building Traces
[Figure: code cache with blocks A, C, D, E, F stitched into a trace; a cmp checks the indirect branch target inline before falling back to the indirect branch lookup]
Slowdown: 300x → 25x → 3x → 1.2x → 1.1x
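
The cmp in the figure can be read as an inlined check of an indirect branch target inside a trace. The sketch below is one interpretation of the diagram, not a description of DynamoRIO internals: if the run-time target equals the target seen when the trace was built, execution stays on the trace; otherwise it falls back to the full lookup. All names and addresses are hypothetical.

```python
def make_trace_check(expected_target, on_trace, lookup):
    """Build the handler for an indirect branch inlined into a trace."""

    def on_indirect_branch(actual_target):
        if actual_target == expected_target:  # the inlined cmp
            return on_trace                   # stay on the trace
        return lookup(actual_target)          # leave the trace: full lookup

    return on_indirect_branch


# Hypothetical lookup table used when the inlined check fails.
fallback = {0x402000: "fragment G"}
check = make_trace_check(0x401000, "rest of trace", fallback.get)
```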
12. Tool Platform
[Figure: the same code cache and trace, with tool code (X) inserted among the application blocks]
13. Transparency
Do not want to interfere with the semantics of the program
Dangerous to make any assumptions about:
• Register usage
• Calling conventions
• Stack layout
• Memory/heap usage
• I/O and other system call use
14. Painful, But Necessary
Difficult and costly to handle corner cases
Many applications will not notice…
…but some will!
• Microsoft Office: Visual Basic generated code, stack convention violations
• COM, Star Office, MMC: trampolines
• Adobe Premiere: self-modifying code
• VirtualDub: UPX-packed executable
• etc.
18. Anatomy of an Attack
[Figure: attack stages, from the network into system and application memory and the kernel: ENTER, CORRUPT DATA, HIJACK PROGRAM COUNTER, COMPROMISE]
19. Critical Data: Control Flow Indirection
Subroutine calls
• Return address and activation records on visible stack
Dynamic library linking
• Function exports and imports
Object oriented polymorphism: dynamic dispatch
• Vtables
Callbacks – registered function pointers
• Event dispatch, atexit
Exception handling
Any problem in computer science can be solved with another layer
of indirection.
- David Wheeler
20. Critical Data: Control Flow Exploits
Return address overwrite
• Classic buffer overflow
GOT overwrite
Object pointer overwrite or uninitialized use
Function pointer overwrite
• Heap, stack, data, PEB
Exception handler overwrites
• SEH exploits
Any problem in computer science can be solved with another layer
of indirection. But that usually will create another problem.
- David Wheeler
21. Preventing Data Corruption Is Difficult
Stored program addresses legitimately manipulated by
many different entities
• Dynamic linker, language runtime
Intermingled with regular data
• Return addresses on stack
• Vtables in heap
Even if we could distinguish a good write from a bad write, it is too
expensive to monitor all data writes
22. Insight: Hijack Violates Execution Model
[Figure: diagram with the labels Hardware Interface, Execution Model, Typical Application, Security Attack]
24. Program Shepherding
Monitor all control-flow transfers during program execution
• DynamoRIO is in perfect position to do this
Validate that each transfer satisfies security policy based
on execution model
• Application Binary Interface (ABI): calling convention, library
invocation
The application may be damaged by data corruption, but
the system will not be compromised by hijacking control
flow
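
Validating control transfers against a policy can be sketched as below. The two rules are illustrative assumptions in the spirit of the calling convention bullet, not the actual Program Shepherding policies: an indirect call must target a known function entry, and a return must target the instruction after a call that was executed. The class names and addresses are hypothetical.

```python
class PolicyViolation(Exception):
    """Raised when a control transfer breaks the security policy."""


class Shepherd:
    def __init__(self, function_entries):
        self.function_entries = set(function_entries)
        self.return_sites = set()  # addresses that follow an executed call

    def on_call(self, call_site, call_length, target, indirect=False):
        # Assumed rule: indirect calls may only reach known function entries.
        if indirect and target not in self.function_entries:
            raise PolicyViolation(f"indirect call to non-entry {target:#x}")
        self.return_sites.add(call_site + call_length)

    def on_return(self, target):
        # Assumed rule: returns may only go to the instruction after a call.
        if target not in self.return_sites:
            raise PolicyViolation(f"return to non-call-site {target:#x}")


shepherd = Shepherd(function_entries={0x401000})
shepherd.on_call(0x400100, 5, 0x401000, indirect=True)
shepherd.on_return(0x400105)  # allowed: follows the call above
```

A return address overwritten to point into a stack buffer would fail `on_return`, so the corrupted data never gains control.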
26. Memory Bugs
Memory bugs are challenging to detect and fix
• Memory corruption, reading uninitialized memory, memory leaks
Observable symptoms resulting from memory bugs are
often delayed and non-deterministic
• Errors are difficult to discover during regular testing
• Testing usually relies on randomly happening to hit visible symptoms
• The sources of these bugs are painful and time-consuming to track
down from observed crashes
Memory bugs often remain in shipped products and can
show up in customer usage
27. Dr. Memory
Detects unaddressable memory
accesses
• Wild access to invalid address
• Use-after-free
• Buffer and array overflow and underflow
• Read beyond top of stack
• Invalid free, double free
Detects uninitialized memory reads
Detects memory leaks
28. Implementation Strategy
Track the state of application memory using shadow
memory
• Track whether allocated and whether defined
Monitor every memory-related action by the application:
• System call
• Malloc, realloc, calloc, free, mmap, munmap, mremap
• Memory read or write
• Stack adjustment
At exit or on request, scan memory to check for leaks
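
The slide does not say how the leak scan works. One common approach, assumed here for illustration, is a reachability scan: start from root pointers (globals, stack, registers), follow pointers stored in live heap blocks, and report blocks that are never reached. The function name and the data layout below are hypothetical.

```python
def find_leaks(allocations, root_pointers, heap_words):
    """Return the start addresses of live blocks that no pointer reaches.

    allocations:   block start -> size, for every block not yet freed
    root_pointers: pointer-sized values found in globals, stack, registers
    heap_words:    block start -> pointer-sized values stored in that block
    """

    def owner(pointer):
        # Find the block that contains this address, if any.
        for start, size in allocations.items():
            if start <= pointer < start + size:
                return start
        return None

    reachable = set()
    worklist = [owner(p) for p in root_pointers]
    while worklist:
        block = worklist.pop()
        if block is None or block in reachable:
            continue
        reachable.add(block)
        # Follow pointers stored inside the block we just reached.
        worklist.extend(owner(v) for v in heap_words.get(block, []))
    return sorted(set(allocations) - reachable)


leaks = find_leaks(
    allocations={0x1000: 16, 0x2000: 16, 0x3000: 16},
    root_pointers=[0x1000],
    heap_words={0x1000: [0x2008]},  # block 0x1000 points into block 0x2000
)
print([hex(addr) for addr in leaks])  # prints ['0x3000']
```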
29. Shadow Metadata
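Shadow each byte of memory with one of 3 states: unaddressable, uninitialized, defined
• unaddressable → uninitialized on allocate: malloc, stack
• unaddressable → defined on allocate: mmap, calloc
• uninitialized → defined on write
• uninitialized or defined → unaddressable on deallocate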
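
The three-state model can be sketched as a small state machine over a flat address space. This is a simplified illustration, not Dr. Memory's implementation: the class and method names are hypothetical, memory is a short Python list, and every read of an uninitialized byte is reported, which is a simplification.

```python
from enum import Enum


class Shadow(Enum):
    UNADDRESSABLE = 0
    UNINITIALIZED = 1
    DEFINED = 2


class ShadowMemory:
    def __init__(self, size):
        # Everything starts unaddressable until the application allocates it.
        self.state = [Shadow.UNADDRESSABLE] * size
        self.errors = []

    def _mark(self, start, size, new_state):
        for addr in range(start, start + size):
            self.state[addr] = new_state

    def malloc(self, start, size):
        self._mark(start, size, Shadow.UNINITIALIZED)

    def calloc(self, start, size):
        self._mark(start, size, Shadow.DEFINED)  # zero-filled, so defined

    def free(self, start, size):
        self._mark(start, size, Shadow.UNADDRESSABLE)

    def write(self, addr):
        if self.state[addr] is Shadow.UNADDRESSABLE:
            self.errors.append(("unaddressable write", addr))
        else:
            self.state[addr] = Shadow.DEFINED

    def read(self, addr):
        if self.state[addr] is Shadow.UNADDRESSABLE:
            self.errors.append(("unaddressable read", addr))
        elif self.state[addr] is Shadow.UNINITIALIZED:
            self.errors.append(("uninitialized read", addr))


memory = ShadowMemory(16)
memory.malloc(4, 4)
memory.read(4)     # uninitialized read
memory.write(4)
memory.read(4)     # fine: byte is now defined
memory.free(4, 4)
memory.read(4)     # use-after-free shows up as an unaddressable read
print(memory.errors)
```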
30. Shadow Memory
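[Figure: stack and heap shown side by side with their shadow memory. Shadow stack: in-use slots are defined or uninit; memory beyond the top of the stack is unaddr. Shadow heap: malloc headers and padding are unaddr, malloc'ed payload is defined or uninit, freed blocks are unaddr]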
33. DynamoRIO History
• late 1990s: Dynamo @HP Labs on PA-RISC
• 1999: RIO (Runtime Introspection and Optimization) @MIT
• 2000: Dynamo @HP Labs on x86
• 2001: Dynamo + RIO → DynamoRIO
34. DynamoRIO History Cont’d
• 2001: DynamoRIO @MIT
• 2002: binary releases
• 2003: Determina security startup
• 2007: VMware acquires Determina
• 2009: open-sourced, BSD license
• 2010: Google sponsors Dr. Memory
35. DynamoRIO Team
[Figure: team across DynamoRIO @MIT, Determina (security startup), VMware, and Google (sponsors Dr. Memory)]
36. DynamoRIO Open Source Project
Google Code
• BSD license
• Subversion repository
• Issue tracker
Google Groups
• User discussion forum/mailing list
• Developer mailing list
300 KLOC: mostly C, some assembly
http://dynamorio.org
37. Dr. Memory Open Source Project
Google Code
• http://code.google.com/p/drmemory
• LGPL 2.1 license
• Subversion repository
• Issue tracker
Google Groups
• User discussion forum/mailing list
• Developer mailing list
67 KLOC: mostly C
38. Potential Projects
Build a New Tool
• Code coverage
• Fuzzer
• Profiler: basic block, edge, function, etc.
• Malware sandbox
• Reverse engineering
Contribute to an Existing Tool
• Dr. Memory or Dr. Heapstat
• Revive PiPA or UMI
39. Potential Projects Cont’d
Build a Tool Library
• Control flow, call graph, data dependence analysis
• Symbol table access
Contribute to Platform
• Buffer filling API
• Probe API
• Port to MacOS
• Port to ARM
• Debugger integration