[cb22] Under the hood of Wslink’s multilayered virtual machine en by Vladislav Hrčka

CODE BLUE
Jan 2, 2023

  1. Under the hood of Wslink’s multilayered virtual machine Vladislav Hrčka | Malware Researcher
  2. Vladislav Hrčka Malware Researcher at ESET @HrckaVladislav
  3. Agenda • Intro to VMs in general and symbolic execution • Internals of the VM used in Wslink • Our approach to dealing with the obfuscation • Demonstration of the approach
  4. Process VMs basics • Interpreter executes the bytecode • Bytecode contains instructions with: • Opcodes • Operands • Handlers define individual opcodes
  5. VM based obfuscation
  6. VM based obfuscation
  7. Symbolic execution in Miasm • Expresses the code in mathematical formulas • Registers and memory are treated as symbolic values • Summarizes the code’s effects on the symbolic values • We will frequently use it
     Original ASM:
       MOV EAX, EBX
       MOV ECX, DWORD PTR [EDX]
       XOR ECX, 0x123
       MOV AX, WORD PTR [ESI]
       JMP ECX
     Performed symbolic execution:
       EAX = {@16[ESI] 0 16, EBX[16:32] 16 32}
       ECX = @32[EDX] ^ 0x123
       zf = @32[EDX] == 0x123
       nf = (@32[EDX] ^ 0x123)[31:32]
       ...
       EIP = @32[EDX] ^ 0x123
       IRDst = @32[EDX] ^ 0x123
  8. Symbolic execution in Miasm • Allows us to simply apply known concrete values • Concrete values can simplify the expressions
     Performed symbolic execution:
       EAX = {@16[ESI] 0 16, EBX[16:3…
       ECX = @32[EDX] ^ 0x123
       zf = @32[EDX] == 0x123
       nf = (@32[EDX] ^ 0x123)[31:32]
       ...
       EIP = @32[EDX] ^ 0x123
       IRDst = @32[EDX] ^ 0x123
     Applied EDX = 0x96 and @32[0x96] = 0xFEED:
       EAX = {@16[ESI] 0 16, EBX[16:3…
       ECX = 0xFFCE
       zf = 0x0
       nf = 0x0
       ...
       EIP = 0xFFCE
       IRDst = 0xFFCE
       EDX = 0x96
       @32[0x96] = 0xFEED
  9. End of introduction
  10. Wslink: First contact • A virtualized function • Virtualized functions contain: • Prologue • Call to VM entry • Gibberish code
  11. Junk code
  12. Overview of the VM’s structure
  13. Overview of the VM’s structure
  14. VM2: The first executed virtual instruction • Prepares the next virtual instruction • Increases VPC (virtual program counter) • Zeroes out a register • RBP_init is the context pointer • Instruction table at offset 0xA4 • VPC at offset 0x28
  15. VM2: Virtual instructions 2 and 3 • Instruction 2 • Zeroes out several virtual registers • Instruction 3 • Stores RSP in a virtual register
  16. VM2: Virtual instruction 4 • Multiple basic blocks:
  17. VM2: Virtual instruction 4 • Multiple basic blocks: • Summary of the first block – memory assignments and IRDst:
  18. VM2: Virtual instruction 4 • Multiple basic blocks: • Summary of the first block – memory assignments and IRDst:
  19. VM2: Virtual instruction 4 • Multiple basic blocks: • Summary of the first block – memory assignments and IRDst: • Rolling decryption – encryption of operands • Values of virtual registers at offsets 0x0B and 0x70 were set earlier
  20. VM2: Virtual instruction 4 - deobfuscation • The first original block
  21. VM2: Virtual instruction 4 - deobfuscation • The first original block
  22. VM2: Virtual instruction 4 - deobfuscation • The first original block • Application of known values of the rolling decryption registers • Application of known bytecode values reveals POP
       IRDst = (-@16[@64[RBP_init + 0x28] + 0x4] ^ 0x3038 == @16[@64[RBP_init + 0x28] + 0x6])?(0x7FEC91ABD1C,0x7FEC91ABCF6)
       @64[RBP_init + {-@16[@64[RBP_init + 0x28] + 0x4] ^ 0x3038, 0, 16, 0x0, 16, 64}] = @64[RSP_init]
       IRDst = @64[@64[RBP_init + 0xA4] + 0x5A8]
       @64[RBP_init + 0x28] = @64[RBP_init + 0x28] + 0x8
       @64[RBP_init + 0x141] = @64[RBP_init + 0x141] + 0x8
       @64[RBP_init + 0x12A] = @64[RSP_init]
  23. VM2: Deobfuscating bytecode chunks • Virtual instruction 4 summary:
       IRDst = @64[@64[RBP_init + 0xA4] + 0x5A8]
       @64[RBP_init + 0x28] = @64[RBP_init + 0x28] + 0x8
       @64[RBP_init + 0x141] = @64[RBP_init + 0x141] + 0x8
       @64[RBP_init + 0x12A] = @64[RSP_init]
  24. VM2: Deobfuscating bytecode chunks • Build a graph from summaries • Treat some values as concrete: • Rolling decryption registers • Memory accesses relative to the bytecode pointer • Preserve only decryption registers’ values between blocks • Virtual instruction 4 summary:
       IRDst = @64[@64[RBP_init + 0xA4] + 0x5A8]
       @64[RBP_init + 0x28] = @64[RBP_init + 0x28] + 0x8
       @64[RBP_init + 0x141] = @64[RBP_init + 0x141] + 0x8
       @64[RBP_init + 0x12A] = @64[RSP_init]
     (graph nodes in the figure: virtual_instruction2_1(vpc), virtual_instruction2_2(vpc), virtual_instruction2_1(vpc), virtual_instruction2_3(vpc), virtual_instruction2_4(vpc))
  25. VM1: Deobfuscated virtual instruction structure CFG
  26. VM1: Deobfuscated virtual instruction structure (figure labels: INTRO, CFG, OUTRO)
  27. (figure labels: INTRO, CFG, OUTRO)
  28. (figure labels: INTRO, CFG, OUTRO) • Legend: • Yellow – Push virtual registers • Red – Pop virtual registers; switch context • Green – Jump to register
  29. VM1: Initially executed virtual instructions • Virtual instructions 1, 2, 3 • The same behavior as in VM2
  30. VM1: Virtual instruction 4
  31. VM1: Virtual instruction 4 • Instruction merging – contains, e.g., POP and PUSH operations • Operands decide which instruction is executed • PUSH:
  32. VM1: Virtual instruction 4 • Instruction merging – contains, e.g., POP and PUSH operations • Operands decide which instruction is executed • POP:
  33. VM1: Virtual instruction 4 • Instruction merging – contains, e.g., POP and PUSH operations • Operands decide which instruction is executed • PUSH/POP • Can be simplified by making the operands concrete
  34. VM1: Deobfuscating bytecode chunks • Use the same approach as in VM2: • Build a graph from summaries • Treat certain values as concrete • Preserve certain values between blocks • Process both VMs at once: • Additionally make the entire VM2 concrete • Ignore assignments to the VM2 context (graph nodes in the figure: virtual_instruction2_1(vpc), virtual_instruction2_2(vpc), virtual_instruction2_1(vpc), virtual_instruction2_3(vpc), virtual_instruction2_4(vpc))
  35. VM1: Issue with opaque predicates during deobfuscation • Resulting graph contained unexpected branches instead of a series of POPs
  36. VM1: Issue with opaque predicates during deobfuscation • The branches check a known value • We can apply the value and simplify it • These are a kind of opaque predicate
  37. Does the approach work?
  38. Analyzing results (bytecode block: VM1 / VM2) • Size in bytes: 695 / 1,145 • Total number of processed virtual instructions: 62 / 109 • Total number of underlying native instructions: 3,536,427 / 17,406 • Total number of resulting IR instructions (including IRDsts): 192 / 307 • Execution time in seconds: 382 / 10
  39. NON-OBFUSCATED SAMPLE* Processed ServiceMain bytecode OBFUSCATED
  40. DEOBFUSCATED
  41. ORIGINAL SAMPLE DEOBFUSCATED SIMPLIFIED mov R0, [R0] mov R1, 28
  42. lea R2, BASE+0xE3808 ORIGINAL SAMPLE DEOBFUSCATED SIMPLIFIED
  43. ORIGINAL SAMPLE DEOBFUSCATED
  44. ORIGINAL SAMPLE DEOBFUSCATED
  45. ORIGINAL SAMPLE DEOBFUSCATED lea R2, BASE+0x2FB0 mov R3, R0 SIMPLIFIED
  46. ORIGINAL SAMPLE DEOBFUSCATED SIMPLIFIED lea Rdest, BASE+0x8C038
  47. ORIGINAL SAMPLE DEOBFUSCATED lea Rdest, BASE+0x8C038 ... jmp Rdest RETaddr = &vm_pre_initX()
  48. Limitations • Symbolic execution cannot process unbounded loops • Instructions using such loops need to be addressed by other means
  49. Takeaway • Symbolic execution can help devirtualize advanced unknown VMs in a reasonable time if we treat the right values as concrete
  50. Links • You can read the whitepaper at WeLiveSecurity.com • Full source code is available in the ESET git repository
  51. Questions? • Vladislav Hrčka, Malware Researcher • vladislav.hrcka@eset.com | @HrckaVladislav • www.eset.com | www.welivesecurity.com | @ESETresearch
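
The idea from slides 7, 8 and 36 (summarize code as symbolic expressions, then apply known concrete values so the expressions and branches collapse) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it does not use Miasm's API, the classes `Sym`, `Mem`, `Xor`, `Eq`, `Cond` and the function `simplify` are hypothetical names invented here, and the branch targets `0x1000` and `0x2000` are made up. Only the ECX and zf expressions and the values EDX = 0x96, @32[0x96] = 0xFEED come from the slides.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF


@dataclass(frozen=True)
class Sym:   # an unknown register value, e.g. Sym("EDX")
    name: str


@dataclass(frozen=True)
class Mem:   # 32-bit load, written @32[addr] on the slides
    addr: object


@dataclass(frozen=True)
class Xor:
    a: object
    b: object


@dataclass(frozen=True)
class Eq:    # becomes 1 or 0 once both sides are concrete
    a: object
    b: object


@dataclass(frozen=True)
class Cond:  # test ? if_true : if_false, like a two-way IRDst
    test: object
    if_true: object
    if_false: object


def simplify(expr, regs, mem):
    """Substitute known registers/memory and fold constants.

    Anything that is still unknown is returned in symbolic form.
    """
    if isinstance(expr, int):
        return expr & MASK32
    if isinstance(expr, Sym):
        value = regs.get(expr.name)
        return expr if value is None else value & MASK32
    if isinstance(expr, Mem):
        addr = simplify(expr.addr, regs, mem)
        if isinstance(addr, int) and addr in mem:
            return mem[addr] & MASK32
        return Mem(addr)
    if isinstance(expr, (Xor, Eq)):
        a = simplify(expr.a, regs, mem)
        b = simplify(expr.b, regs, mem)
        if isinstance(a, int) and isinstance(b, int):
            return a ^ b if isinstance(expr, Xor) else int(a == b)
        return type(expr)(a, b)
    if isinstance(expr, Cond):
        test = simplify(expr.test, regs, mem)
        if isinstance(test, int):
            # The predicate is decided: keep only the taken side.
            taken = expr.if_true if test else expr.if_false
            return simplify(taken, regs, mem)
        return Cond(test,
                    simplify(expr.if_true, regs, mem),
                    simplify(expr.if_false, regs, mem))
    raise TypeError(f"unsupported expression: {expr!r}")


# Summary of the slide's snippet (only ECX and zf are modelled here).
summary = {
    "ECX": Xor(Mem(Sym("EDX")), 0x123),
    "zf": Eq(Mem(Sym("EDX")), 0x123),
}

# Known values from slide 8.
regs = {"EDX": 0x96}
mem = {0x96: 0xFEED}

# A made-up branch that tests a value we already know (opaque predicate).
branch = Cond(Eq(Mem(Sym("EDX")), 0xFEED), 0x1000, 0x2000)

print(hex(simplify(summary["ECX"], regs, mem)))  # 0xffce
print(simplify(summary["zf"], regs, mem))        # 0
print(hex(simplify(branch, regs, mem)))          # 0x1000
```

Without any known values, `simplify(summary["ECX"], {}, {})` returns the expression unchanged, which mirrors the fully symbolic summary on slide 7.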

Editor's Notes

  1. After starting our analysis, we found a non-obfuscated sample, so we weren’t motivated to fully deobfuscate the code
  2. Has been focusing on reverse engineering challenging malware samples since 2017. Computer Science student at Comenius University, in the first years of a master’s degree. BlackHat Arsenal, AVAR
  3. Process VMs work like standard machines; the machine code is bytecode here. The PC is an offset into the bytecode here, though that is not necessary. Transfer of control from one virtual instruction to the next during interpretation needs to be performed by every VM. This process is generally known as dispatching. A centralized interpreter (dispatcher) is called direct call threading; it can also be a simple switch case. Dispatching can also happen inside the handlers, which is direct threading
  4. Bytecode is stored in a separate section. Bytecode is generated from a function. The original function starts the interpreter. The interpreter is embedded in the executable. A context switch needs to be performed
  5. The strength of this obfuscation technique resides in the fact that the ISA of the VM is unknown to any prospective reverse engineer, so a thorough analysis of the VM, which can be very time-consuming, is required. Additionally, obfuscating VMs usually virtualize only certain “interesting” functions
  6. SE works on the IR. IRDst is the IR instruction pointer; e.g. cmov is split into multiple blocks. @X[...] is a dereference of an X-bit value; the rest of the notation is self-explanatory
  7. The SE engine in Miasm automatically performs common optimization techniques
  8. The VM used in Wslink is a variant of CV (Themida) that supports multiple VMs and polymorphic layers of obfuscation techniques. We found this out after publishing our research
  9. The right side contains a part of the function
  10. Why 1071? They are duplicated! vm_init; sync: shared virtual context in memory; virtual program counter (VPC) register; base address register; instruction table register
  11. Emphasize the nested VM; we start with VM2
  12. Rolling decryption is also described in a blog about VMProtect’s structure
  13. Make values relative to the bytecode ptr concrete
  14. We’ve just deobfuscated one of the VM1 instructions
  15. Let’s start with CFG
  16. Reminder – atomic operations in blocks
  17. Laser pointer: these IR blocks are the effects of VM2 virtual instructions; together they represent one VM1 virtual instruction
  18. Zeroes out a register, stores the RSP, initializes rolling decryption registers
  19. A part of the instruction. When we inspected the code, we realized what the problem was
  20. Memory range of the VMs is known – separate section
  21. Let’s take a closer look at one of the branches <CLICK>
  22. The branches check a known value. We can apply the value and simplify it. These are a kind of opaque predicate
  23. We will show the results on the first executed bytecode, but before that we show statistics
  24. Before we look at the devirtualized bytecode, let’s see approximately what the execution time is like. Let’s look at the deobfuscated bytecode block of VM1 <CLICK>
  25. *We discovered the sample after starting our analysis
  26. R0 is not the ARM register; this is pseudocode close to assembly
  27. “Advanced” means the VM uses junk code, duplicates and merges handlers, and encrypts operands. The “right values” are the operands and the registers used for encryption
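
Notes 3, 12 and 27 mention dispatching through handlers and rolling decryption of operands. The toy interpreter below is a minimal sketch of those two ideas only. It is not Wslink's VM: the opcodes, the 16-bit XOR-and-add key schedule, and the names `encrypt_program`, `run`, `PUSH`, `POP`, `ADD` are all assumptions made up for illustration (the initial key 0x3038 is borrowed from a constant on slide 22 purely as a sample value).

```python
PUSH, POP, ADD = 0, 1, 2  # hypothetical opcodes


def encrypt_program(program, key):
    """Encode (opcode, operand) pairs with a rolling 16-bit key."""
    bytecode = []
    for opcode, operand in program:
        enc = (operand ^ key) & 0xFFFF
        bytecode.append((opcode, enc))
        key = (key + enc) & 0xFFFF  # key depends on all earlier operands
    return bytecode


def h_push(ctx, operand):
    ctx["stack"].append(operand)


def h_pop(ctx, operand):
    # The operand selects the destination virtual register.
    ctx["regs"][operand] = ctx["stack"].pop()


def h_add(ctx, operand):
    b, a = ctx["stack"].pop(), ctx["stack"].pop()
    ctx["stack"].append((a + b) & 0xFFFF)


HANDLERS = {PUSH: h_push, POP: h_pop, ADD: h_add}


def run(bytecode, key):
    """Centralized dispatcher: fetch, decrypt operand, call the handler."""
    ctx = {"vpc": 0, "key": key, "stack": [], "regs": {}}
    while ctx["vpc"] < len(bytecode):
        opcode, enc = bytecode[ctx["vpc"]]
        operand = (enc ^ ctx["key"]) & 0xFFFF
        ctx["key"] = (ctx["key"] + enc) & 0xFFFF
        ctx["vpc"] += 1
        HANDLERS[opcode](ctx, operand)
    return ctx


program = [(PUSH, 2), (PUSH, 40), (ADD, 0), (POP, 7)]
bytecode = encrypt_program(program, 0x3038)

print(bytecode[0])                          # (0, 12346), i.e. operand 0x303A
print(run(bytecode, 0x3038)["regs"][7])     # 42
```

Because each operand can only be recovered with the key state left by the previous instructions, an analysis that keeps the key symbolic produces ever-growing expressions. Treating the key registers and the bytecode as concrete, as the slides propose, lets every operand fold to a constant.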