SlideShare uma empresa Scribd logo
1 de 42
Baixar para ler offline
The Reverse Engineering Language REIL
         and its applications


         Sebastian Porst (sebastian.porst@zynamics.com)
                          zynamics GmbH
I write code for




                   ... sometimes I debug other people‘s code
We love graphs
                     We love static analysis




    We want to improve binary code RE
Reverse Engineering Intermediate Language
Three Good Reasons for REIL


        Simple

        Free of side-effects

        Platform-Independent
Fewer Instructions
1000+
                988




                             17


PowerPC         x86          REIL
X86 Code         PowerPC Code         ARM Code


X86 Translator   PowerPC Translator   ARM Translator



                 REIL Code

             MonoREIL       Other Analysis
                             Algorithms
REIL VM
       Architecture
Register-based Virtual Machine

  Infinite number of registers

  Infinite amount of memory
Close to the original architecture
add
               and
               bisz
               bsh
               div
               jcc
               ldm
               mod
               mul
Instructions   nop
               or
               stm
               str
               sub
               undef
               unkn
               xor
40131A.07 add (t0, b4), (t1, b4), (t2, b8), [(foo, bar)]


 1 REIL Address
         Native Part
         REIL Part
 1 REIL Mnemonic          1 REIL Operand
 3 Operands               2 (really 3) Pieces of Information
 ∞ Metadata               3 Operand Types
                          7 Operand Sizes
                          ∞ Operand Values
add
Six Arithmetic Instructions   and
                              bisz
  Addition                    bsh
                              div
  Logical Shift               jcc
  Unsigned Division           ldm
                              mod
  Unsigned Modulo             mul
  Unsigned Multiplication     nop
  Subtraction                 or
                              stm
                              str
                              sub
                              undef
                              unkn
                              xor
add
Three Bitwise Instructions   and
                             bisz
  Bitwise AND                bsh
                             div
  Bitwise OR                 jcc
  Bitwise XOR                ldm
                             mod
                             mul
                             nop
                             or
                             stm
                             str
                             sub
                             undef
                             unkn
                             xor
add
Three Data Transfer Instructions   and
                                   bisz
  Load Memory                      bsh
                                   div
  Store Memory                     jcc
  Transfer Register Content        ldm
                                   mod
                                   mul
                                   nop
                                   or
                                   stm
                                   str
                                   sub
                                   undef
                                   unkn
                                   xor
add
Two Logical Instructions   and
                           bisz
  Compare to Zero          bsh
                           div
  Jump Conditional         jcc
                           ldm
                           mod
                           mul
                           nop
                           or
                           stm
                           str
                           sub
                           undef
                           unkn
                           xor
add
Three Other Instructions   and
                           bisz
  No Operation             bsh
                           div
  Undefine Register        jcc
  Unknown Mnemonic         ldm
                           mod
                           mul
                           nop
                           or
                           stm
                           str
                           sub
                           undef
                           unkn
                           xor
All Instructions are equal ...
   add   t0,   t1,   t2
   and   t0,   t1,   t2
   bsh   t0,   t1,   t2
   div   t0,   t1,   t2
   mod   t0,   t1,   t2
                        ... but some instructions are
   mul   t0,   t1,   t2
   or    t0,   t1,   t2     more equal than others
   sub   t0,   t1,   t2         bisz t0, , t2
   xor   t0,   t1,   t2         jcc    t0, , t2
                                ldm    t0, , t2
                                nop       , ,
                                stm    t0, , t2
                                str    t0, , t2
                                undef     , , t2
                                unkn      , ,
What REIL can‘t do
   FPU Instructions   fadd


                      ...


System Instructions   int


                      ...


 Weird Instructions   setend

                      ...
More Architectures (x64, MIPS, ...)

    More Operand Information

         More Instructions
So ...


         ... what is it all good for?
Register Tracking

Follow the effects
      of a register
    on the program state
Demo
Register Tracking
Negative Array Access



     MS08-67
Demo
MS08-67
... but aren‘t those really difficult to implement?




                       NO!
MonoREIL
Create an Instruction Graph

  Define a Lattice

     Define Transformations

        Define a Graph Walker

          Define a Start Vector
Create an Instruction Graph

                       1400: add t0, 15, t1


                        1401: bisz t1, , t2


                       1402: jcc t2, , 1405


  1403: str 8, , t3                           1405: str 16, , t3


1404: jcc t2, , 1406


                       1406: add t3, t3, t4
                                                   int t4 = t0 == -15 ? 32 : 16

                       1407: jcc 1, , 1420
Define Lattice Elements

             T




             B
Define Transformations
                       1400: add t0, 15, t1


                        1401: bisz t1, , t2


                       1402: jcc t2, , 1405


  1403: str 8, , t3                           1405: str 16, , t3


1404: jcc t2, , 1406


                       1406: add t3, t3, t4


                       1407: jcc 1, , 1420
Define a Join function
                       1400: add t0, 15, t1


                        1401: bisz t1, , t2


                       1402: jcc t2, , 1405


  1403: str 8, , t3                           1405: str 16, , t3


1404: jcc t2, , 1406


                       1406: add t3, t3, t4


                       1407: jcc 1, , 1420
Define a Graph Walker
                       1400: add t0, 15, t1


                        1401: bisz t1, , t2


                       1402: jcc t2, , 1405


  1403: str 8, , t3                           1405: str 16, , t3


1404: jcc t2, , 1406


                       1406: add t3, t3, t4


                       1407: jcc 1, , 1420
Define a Start Vector



         node1 → initialState1
         node2 → initialState2
         node3 → initialState3
         node4 → initialState4
         node5 → initialState5
                 ...
         noden → initialStaten
Running MonoREIL
                                     1400: add t0, 15, t1   initialState1400 → finalState1400


                                      1401: bisz t1, , t2   initialState1401 → finalState1401


                                     1402: jcc t2, , 1405   initialState1402 → finalState1402


                1403: str 8, , t3                            1405: str 16, , t3
initialState1403 → finalState1403
              1404: jcc t2, , 1406                          initialState1405 → finalState1405

initialState1404 → finalState1404
                                     1406: add t3, t3, t4   initialState1406 → finalState1406


                                     1407: jcc 1, , 1420    initialState1407 → finalState1407
The Result



         node1 → finalState1
         node2 → finalState2
         node3 → finalState3
         node4 → finalState4
         node5 → finalState5
                ...
         noden → finalStaten
LOC?                                        14 + n


                                     12 + n


                      8+n




1+n        1

Walker   Graph   Transformations   Start Vector   Lattice
Register Tracking
                      120                          120
                                       90




  1        1

Walker   Graph   Transformations   Start Vector   Lattice
Demo
Register Tracking Code
1700



      MS08-67
                                                   500



                                       150

  1        1

Walker   Graph   Transformations   Start Vector   Lattice
Demo
MS08-67
Deobfuscating Optimizer

    Type Reconstruction
Related Work
ERESI
  Vine (BitBlaze)
   Silvio Cesare
        Hex-Rays
Thank You!




             Questions?

Mais conteúdo relacionado

Semelhante a Hitb

Stale pointers are the new black
Stale pointers are the new blackStale pointers are the new black
Stale pointers are the new black
Vincenzo Iozzo
 
Archi Modelling
Archi ModellingArchi Modelling
Archi Modelling
dilane007
 

Semelhante a Hitb (8)

Kernel Recipes 2014 - x86 instruction encoding and the nasty hacks we do in t...
Kernel Recipes 2014 - x86 instruction encoding and the nasty hacks we do in t...Kernel Recipes 2014 - x86 instruction encoding and the nasty hacks we do in t...
Kernel Recipes 2014 - x86 instruction encoding and the nasty hacks we do in t...
 
Stale pointers are the new black
Stale pointers are the new blackStale pointers are the new black
Stale pointers are the new black
 
Archi Modelling
Archi ModellingArchi Modelling
Archi Modelling
 
Arm Cortex material Arm Cortex material3222886.ppt
Arm Cortex material Arm Cortex material3222886.pptArm Cortex material Arm Cortex material3222886.ppt
Arm Cortex material Arm Cortex material3222886.ppt
 
Armsim (simualtor)
Armsim (simualtor)Armsim (simualtor)
Armsim (simualtor)
 
ARM Fundamentals
ARM FundamentalsARM Fundamentals
ARM Fundamentals
 
N_Asm Assembly arithmetic instructions (sol)
N_Asm Assembly arithmetic instructions (sol)N_Asm Assembly arithmetic instructions (sol)
N_Asm Assembly arithmetic instructions (sol)
 
OptimizingARM
OptimizingARMOptimizingARM
OptimizingARM
 

Mais de zynamics GmbH

Everybody be cool, this is a roppery!
Everybody be cool, this is a roppery!Everybody be cool, this is a roppery!
Everybody be cool, this is a roppery!
zynamics GmbH
 
Automated Mobile Malware Classification
Automated Mobile Malware ClassificationAutomated Mobile Malware Classification
Automated Mobile Malware Classification
zynamics GmbH
 
VxClass for Incident Response
VxClass for Incident ResponseVxClass for Incident Response
VxClass for Incident Response
zynamics GmbH
 
Malware classification
Malware classificationMalware classification
Malware classification
zynamics GmbH
 
Platform-independent static binary code analysis using a meta-assembly language
Platform-independent static binary code analysis using a meta-assembly languagePlatform-independent static binary code analysis using a meta-assembly language
Platform-independent static binary code analysis using a meta-assembly language
zynamics GmbH
 
Automated static deobfuscation in the context of Reverse Engineering
Automated static deobfuscation in the context of Reverse EngineeringAutomated static deobfuscation in the context of Reverse Engineering
Automated static deobfuscation in the context of Reverse Engineering
zynamics GmbH
 

Mais de zynamics GmbH (8)

Everybody be cool, this is a roppery!
Everybody be cool, this is a roppery!Everybody be cool, this is a roppery!
Everybody be cool, this is a roppery!
 
Automated Mobile Malware Classification
Automated Mobile Malware ClassificationAutomated Mobile Malware Classification
Automated Mobile Malware Classification
 
VxClass for Incident Response
VxClass for Incident ResponseVxClass for Incident Response
VxClass for Incident Response
 
Malware classification
Malware classificationMalware classification
Malware classification
 
Eusecwest
EusecwestEusecwest
Eusecwest
 
Platform-independent static binary code analysis using a meta-assembly language
Platform-independent static binary code analysis using a meta-assembly languagePlatform-independent static binary code analysis using a meta-assembly language
Platform-independent static binary code analysis using a meta-assembly language
 
Bh dc09
Bh dc09Bh dc09
Bh dc09
 
Automated static deobfuscation in the context of Reverse Engineering
Automated static deobfuscation in the context of Reverse EngineeringAutomated static deobfuscation in the context of Reverse Engineering
Automated static deobfuscation in the context of Reverse Engineering
 

Hitb

  • 1. The Reverse Engineering Language REIL and its applications Sebastian Porst (sebastian.porst@zynamics.com) zynamics GmbH
  • 2. I write code for ... sometimes I debug other people‘s code
  • 3. We love graphs We love static analysis We want to improve binary code RE
  • 5. Three Good Reasons for REIL Simple Free of side-effects Platform-Independent
  • 6. Fewer Instructions 1000+ 988 17 PowerPC x86 REIL
  • 7. X86 Code PowerPC Code ARM Code X86 Translator PowerPC Translator ARM Translator REIL Code MonoREIL Other Analysis Algorithms
  • 8. REIL VM Architecture Register-based Virtual Machine Infinite number of registers Infinite amount of memory Close to the original architecture
  • 9. add and bisz bsh div jcc ldm mod mul Instructions nop or stm str sub undef unkn xor
  • 10. 40131A.07 add (t0, b4), (t1, b4), (t2, b8), [(foo, bar)] 1 REIL Address Native Part REIL Part 1 REIL Mnemonic 1 REIL Operand 3 Operands 2 (really 3) Pieces of Information ∞ Metadata 3 Operand Types 7 Operand Sizes ∞ Operand Values
  • 11. add Six Arithmetic Instructions and bisz Addition bsh div Logical Shift jcc Unsigned Division ldm mod Unsigned Modulo mul Unsigned Multiplication nop Subtraction or stm str sub undef unkn xor
  • 12. add Three Bitwise Instructions and bisz Bitwise AND bsh div Bitwise OR jcc Bitwise XOR ldm mod mul nop or stm str sub undef unkn xor
  • 13. add Three Data Transfer Instructions and bisz Load Memory bsh div Store Memory jcc Transfer Register Content ldm mod mul nop or stm str sub undef unkn xor
  • 14. add Two Logical Instructions and bisz Compare to Zero bsh div Jump Conditional jcc ldm mod mul nop or stm str sub undef unkn xor
  • 15. add Three Other Instructions and bisz No Operation bsh div Undefine Register jcc Unknown Mnemonic ldm mod mul nop or stm str sub undef unkn xor
  • 16. All Instructions are equal ... add t0, t1, t2 and t0, t1, t2 bsh t0, t1, t2 div t0, t1, t2 mod t0, t1, t2 ... but some instructions are mul t0, t1, t2 or t0, t1, t2 more equal than others sub t0, t1, t2 bisz t0, , t2 xor t0, t1, t2 jcc t0, , t2 ldm t0, , t2 nop , , stm t0, , t2 str t0, , t2 undef , , t2 unkn , ,
  • 17. What REIL can‘t do FPU Instructions fadd ... System Instructions int ... Weird Instructions setend ...
  • 18. More Architectures (x64, MIPS, ...) More Operand Information More Instructions
  • 19. So ... ... what is it all good for?
  • 20. Register Tracking Follow the effects of a register on the program state
  • 24. ... but aren‘t those really difficult to implement? NO!
  • 26. Create an Instruction Graph Define a Lattice Define Transformations Define a Graph Walker Define a Start Vector
  • 27. Create an Instruction Graph 1400: add t0, 15, t1 1401: bisz t1, , t2 1402: jcc t2, , 1405 1403: str 8, , t3 1405: str 16, , t3 1404: jcc t2, , 1406 1406: add t3, t3, t4 int t4 = t0 == -15 ? 32 : 16 1407: jcc 1, , 1420
  • 29. Define Transformations 1400: add t0, 15, t1 1401: bisz t1, , t2 1402: jcc t2, , 1405 1403: str 8, , t3 1405: str 16, , t3 1404: jcc t2, , 1406 1406: add t3, t3, t4 1407: jcc 1, , 1420
  • 30. Define a Join function 1400: add t0, 15, t1 1401: bisz t1, , t2 1402: jcc t2, , 1405 1403: str 8, , t3 1405: str 16, , t3 1404: jcc t2, , 1406 1406: add t3, t3, t4 1407: jcc 1, , 1420
  • 31. Define a Graph Walker 1400: add t0, 15, t1 1401: bisz t1, , t2 1402: jcc t2, , 1405 1403: str 8, , t3 1405: str 16, , t3 1404: jcc t2, , 1406 1406: add t3, t3, t4 1407: jcc 1, , 1420
  • 32. Define a Start Vector node1 → initialState1 node2 → initialState2 node3 → initialState3 node4 → initialState4 node5 → initialState5 ... noden → initialStaten
  • 33. Running MonoREIL 1400: add t0, 15, t1 initialState1400 → finalState1400 1401: bisz t1, , t2 initialState1401 → finalState1401 1402: jcc t2, , 1405 initialState1402 → finalState1402 1403: str 8, , t3 1405: str 16, , t3 initialState1403 → finalState1403 1404: jcc t2, , 1406 initialState1405 → finalState1405 initialState1404 → finalState1404 1406: add t3, t3, t4 initialState1406 → finalState1406 1407: jcc 1, , 1420 initialState1407 → finalState1407
  • 34. The Result node1 → finalState1 node2 → finalState2 node3 → finalState3 node4 → finalState4 node5 → finalState5 ... noden → finalStaten
  • 35. LOC? 14 + n 12 + n 8+n 1+n 1 Walker Graph Transformations Start Vector Lattice
  • 36. Register Tracking 120 120 90 1 1 Walker Graph Transformations Start Vector Lattice
  • 38. 1700 MS08-67 500 150 1 1 Walker Graph Transformations Start Vector Lattice
  • 40. Deobfuscating Optimizer Type Reconstruction
  • 41. Related Work ERESI Vine (BitBlaze) Silvio Cesare Hex-Rays
  • 42. Thank You! Questions?