grsecurity and PaX

In this talk, Gil Yankovitch discusses the PaX patch for the Linux kernel, focusing on memory manager changes and security mechanisms for memory allocations, reads, writes from user/kernel space and ASLR.

  1. grsecurity and PaX - Gili Yankovitch, Nyx Software Security Solutions
  2. Prerequisites ● Familiarity with kernel internals (duh!) ● Knowledge of the memory management model ● Slab memory allocation ... ○ But I'll cover everything anyway...
  3. Key Points ● What grsecurity and PaX are ● How to apply them ○ Configuration variations ● PaX Memory Manager features ○ PAX_USERCOPY ○ PAX_MEMORY_SANITIZE ○ PAX_ASLR ■ PAX_RANDUSTACK
  4. Linus Torvalds ● Linus is not a big fan of security... "One reason I refuse to bother with the whole security circus is that I think it glorifies - and thus encourages - the wrong behavior. It makes "heroes" out of security people, as if the people who don't just fix normal bugs aren't as important."
  5. grsecurity - Technical details ● PaX is a set of features within the grsecurity (grsecurity.net) patch ● A Linux kernel patch ● Stable for kernel versions 3.2.72 and 3.14.54 ● Test patch for 4.3.5 ● Was free until 08/2015 ○ Stopped being free because of copyright infringement ● Supports all major architectures ○ x86/x64, ARM, PPC, MIPS
  6. Applying the patch ● Very hard. ● Doesn't always work. ● You need to use specific commands depending on your machine. ● But some say this works: $ patch -p1 < grsecurity-3.1-4.3.5-201602032209.patch ● So now, well, you don't have any excuse not to use it. ● Now grsecurity is applied to your kernel. ● Even just by doing this we improve our system's security. You'll see...
  7. Activating specific features ● grsecurity supports automatic configuration ● Automatic configuration depends on choosing between two factors: ○ Performance ○ Security ● They usually don't go hand in hand... $ make menuconfig Security options ---> Grsecurity --->
  8. Memory Manager Refresher ● The classical memory model within the Linux kernel is one of zones ● There are 3 major zones in the classic kernel architecture: ○ ZONE_HIGHMEM ○ ZONE_NORMAL ○ ZONE_DMA [Diagram: physical memory split from 0x00000000 upward into ZONE_DMA, ZONE_NORMAL and ZONE_HIGHMEM, beside the virtual address space: USER at 0x00000000-0xC0000000, KERNEL at 0xC0000000-0xFFFFFFFF]
  9. Memory abstraction [Diagram: kmalloc sits on top of the SLUB allocator, which sits on top of the Buddy System, which backs virtual memory with physical memory (ZONE_NORMAL, ZONE_HIGHMEM)]
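To make that layering concrete, here is a minimal, hypothetical kernel-module sketch (not from the patch; all names are illustrative): kmalloc() is served from generic per-size slab caches, kmem_cache_create() builds a dedicated cache for one object type, and both draw their pages from the Buddy System underneath.

      #include <linux/module.h>
      #include <linux/slab.h>

      struct session {
          unsigned long id;
          char name[32];
      };

      static struct kmem_cache *session_cache;

      static int __init layering_init(void)
      {
          void *buf;
          struct session *s;

          /* Generic allocation: routed to the matching kmalloc-<size> slab cache. */
          buf = kmalloc(128, GFP_KERNEL);
          if (!buf)
              return -ENOMEM;

          /* Dedicated cache: a slab cache holding only struct session objects. */
          session_cache = kmem_cache_create("session_cache",
                                            sizeof(struct session), 0, 0, NULL);
          if (!session_cache) {
              kfree(buf);
              return -ENOMEM;
          }

          s = kmem_cache_alloc(session_cache, GFP_KERNEL);
          if (s)
              kmem_cache_free(session_cache, s);

          kfree(buf);
          return 0;
      }

      static void __exit layering_exit(void)
      {
          kmem_cache_destroy(session_cache);
      }

      module_init(layering_init);
      module_exit(layering_exit);
      MODULE_LICENSE("GPL");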
  10. PAX_USERCOPY
  11. copy_from_user ● Not many use this function ● Eventually it leads to this piece of code:

      static __always_inline __must_check unsigned long
      __copy_from_user_nocheck(void *dst, const void __user *src, unsigned long size)
      {
          size_t sz = __compiletime_object_size(dst);
          unsigned ret = 0;

          if (size > INT_MAX)
              return size;

          check_object_size(dst, size, false);
          ...

      arch/x86/include/asm/uaccess_64.h

  ● Let's focus on check_object_size()
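For orientation, this is roughly what a call site looks like: a hypothetical character-device write handler (demo_write and its buffer are illustrative, not from the patch) copying into a fixed-size stack buffer, which is exactly the kind of destination object check_object_size() is asked to vet.

      #include <linux/fs.h>
      #include <linux/uaccess.h>

      /* Hypothetical .write handler; buf points into user space. */
      static ssize_t demo_write(struct file *file, const char __user *buf,
                                size_t count, loff_t *ppos)
      {
          char kbuf[64];  /* kernel stack object with a known, fixed size */

          if (count > sizeof(kbuf))
              return -EINVAL;

          /*
           * With PAX_USERCOPY enabled, this call reaches check_object_size(),
           * which verifies that kbuf really has room for count bytes
           * (heap object bounds, stack bounds, frame walk) before copying.
           */
          if (copy_from_user(kbuf, buf, count))
              return -EFAULT;

          return count;
      }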
  12. check_object_size()

      void __check_object_size(const void *ptr, unsigned long n, bool to_user, bool const_size)
      {
      #ifdef CONFIG_PAX_USERCOPY
          const char *type;
      #endif
          ...
      #ifdef CONFIG_PAX_USERCOPY
          if (!n)
              return;

          type = check_heap_object(ptr, n);
          if (!type) {
              int ret = check_stack_object(ptr, n);
              if (ret == 1 || ret == 2)
                  return;
              if (ret == 0) {
                  if (check_kernel_text_object((unsigned long)ptr, (unsigned long)ptr + n))
                      type = "<kernel text>";
                  else
                      return;
              } else
                  type = "<process stack>";
          }

          pax_report_usercopy(ptr, n, to_user, type);
      #endif
      }

      fs/exec.c
  13. check_heap_object()

      const char *check_heap_object(const void *ptr, unsigned long n)
      {
          struct page *page;
          struct kmem_cache *s;
          unsigned long offset;

          if (ZERO_OR_NULL_PTR(ptr))
              return "<null>";

          if (!virt_addr_valid(ptr))
              return NULL;

          page = virt_to_head_page(ptr);
          if (!PageSlab(page))
              return NULL;

          s = page->slab_cache;
          if (!(s->flags & SLAB_USERCOPY))
              return s->name;

          offset = (ptr - page_address(page)) % s->size;
          if (offset <= s->object_size && n <= s->object_size - offset)
              return NULL;

          return s->name;
      }

      mm/slub.c
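The practical upshot: only caches created with the PaX-added SLAB_USERCOPY flag may be copied to or from user space at all, and even then only within the bounds of a single object (the offset check above). A hedged sketch of opting a cache in, assuming a PaX/grsecurity kernel where SLAB_USERCOPY exists (struct msg and msg_cache are illustrative):

      #include <linux/module.h>
      #include <linux/slab.h>

      struct msg {
          char payload[256];
      };

      static struct kmem_cache *msg_cache;

      static int __init msg_init(void)
      {
          /*
           * Without SLAB_USERCOPY, check_heap_object() would return this
           * cache's name for any user copy touching its objects, and the
           * copy would be reported via pax_report_usercopy().
           */
          msg_cache = kmem_cache_create("msg_cache", sizeof(struct msg),
                                        0, SLAB_USERCOPY, NULL);
          return msg_cache ? 0 : -ENOMEM;
      }

      static void __exit msg_exit(void)
      {
          kmem_cache_destroy(msg_cache);
      }

      module_init(msg_init);
      module_exit(msg_exit);
      MODULE_LICENSE("GPL");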
  14. check_stack_object() ● First, a few basic checks

      #ifdef CONFIG_PAX_USERCOPY
      /* 0: not at all, 1: fully, 2: fully inside frame, -1: partially (implies an error) */
      static noinline int check_stack_object(const void *obj, unsigned long len)
      {
          const void * const stack = task_stack_page(current);
          const void * const stackend = stack + THREAD_SIZE;

      #if defined(CONFIG_FRAME_POINTER) && defined(CONFIG_X86)
          const void *frame = NULL;
          const void *oldframe;
      #endif

          if (obj + len < obj)
              return -1;

          if (obj + len <= stack || stackend <= obj)
              return 0;

          if (obj < stack || stackend < obj + len)
              return -1;

      fs/exec.c
  15. All hail gcc features! ● gcc has many wonderful features ○ Some of them documented, some of them undocumented. ○ One of the documented ones: "Built-in Function: void * __builtin_frame_address (unsigned int level) This function is similar to __builtin_return_address, but it returns the address of the function frame rather than the return address of the function. Calling __builtin_frame_address with a value of 0 yields the frame address of the current function, a value of 1 yields the frame address of the caller of the current function, and so forth. ..."
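A quick user-space illustration of the builtin (hypothetical demo; build with gcc -O0 -fno-omit-frame-pointer, since nonzero levels are only reliable when frame pointers are kept):

      #include <stdio.h>

      /* Prevent inlining so each call really gets its own stack frame. */
      __attribute__((noinline)) static void inner(void)
      {
          printf("inner frame:  %p\n", __builtin_frame_address(0));
          printf("caller frame: %p\n", __builtin_frame_address(1));
      }

      int main(void)
      {
          printf("main frame:   %p\n", __builtin_frame_address(0));
          inner();  /* inner's "caller frame" should match main's frame */
          return 0;
      }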
  16. check_stack_object() cont.

      #if defined(CONFIG_FRAME_POINTER) && defined(CONFIG_X86)
          oldframe = __builtin_frame_address(1);
          if (oldframe)
              frame = __builtin_frame_address(2);
          ...
          while (stack <= frame && frame < stackend) {
              ...
              if (obj + len <= frame)
                  return obj >= oldframe + 2 * sizeof(void *) ? 2 : -1;
              oldframe = frame;
              frame = *(const void * const *)frame;
          }
          return -1;
      #else
          return 1;
      #endif
      }

      fs/exec.c

  [Diagram: the stack growing down from 0xFF... toward 0x00..., with successive oldframe/frame pairs bracketing the object obj..obj+len]
  17. check_kernel_text_object() ● All that's left is to check that we are not reading/writing kernel .text

      #ifdef CONFIG_PAX_USERCOPY
      static inline bool check_kernel_text_object(unsigned long low, unsigned long high)
      {
          ...
          unsigned long textlow = (unsigned long)_stext;
          unsigned long texthigh = (unsigned long)_etext;

          /* check against linear mapping as well */
          if (high > (unsigned long)__va(__pa(textlow)) && low < (unsigned long)__va(__pa(texthigh)))
              return true;

          if (high <= textlow || low >= texthigh)
              return false;
          else
              return true;
      }
      #endif

      fs/exec.c
  18. check_object_size() - Again, now from the other side ● The same __check_object_size() shown on slide 12, revisited: all three paths it dispatches to (check_heap_object(), check_stack_object(), check_kernel_text_object()) have now been covered. ✔ ✔ ✔ (fs/exec.c)
  19. PAX_MEMORY_SANITIZE
  20. How you'd imagine it ● Generally, what this means is that on every deallocation, we'd like to sanitize the memory. ● If you ask me, it should look something like this:

      void kfree(const void *block)
      {
          /* Some kfree() logic */
          ...
          memset(block, 0x42, len);
      }

      kernel/lets_hope_its_that_way.c
  21. Security is configurable! (not really...) ● PaX is configured for fast sanitization by default ○ Configurable via the kernel command line

      #ifdef CONFIG_PAX_MEMORY_SANITIZE
      enum pax_sanitize_mode pax_sanitize_slab __read_only = PAX_SANITIZE_SLAB_FAST;

      static int __init pax_sanitize_slab_setup(char *str)
      {
          if (!str)
              return 0;

          if (!strcmp(str, "0") || !strcmp(str, "off")) {
              pax_sanitize_slab = PAX_SANITIZE_SLAB_OFF;
          } else if (!strcmp(str, "1") || !strcmp(str, "fast")) {
              pax_sanitize_slab = PAX_SANITIZE_SLAB_FAST;
          } else if (!strcmp(str, "full")) {
              pax_sanitize_slab = PAX_SANITIZE_SLAB_FULL;
          }
          ...
      }
      early_param("pax_sanitize_slab", pax_sanitize_slab_setup);
      #endif

      mm/slab_common.c

  ○ But actually PAX_SANITIZE_SLAB_FAST doesn't do anything. (X_X)
  22. Using the security powder ● To use it, pass pax_sanitize_slab=full on the kernel command line ● Creating a SLAB that is sanitizable:

      struct kmem_cache *
      kmem_cache_create(const char *name, size_t size, size_t align,
                        unsigned long flags, void (*ctor)(void *))
      {
          ...
      #ifdef CONFIG_PAX_MEMORY_SANITIZE
          if (pax_sanitize_slab == PAX_SANITIZE_SLAB_OFF || (flags & SLAB_DESTROY_BY_RCU))
              flags |= SLAB_NO_SANITIZE;
          else if (pax_sanitize_slab == PAX_SANITIZE_SLAB_FULL)
              flags &= ~SLAB_NO_SANITIZE;
      #endif

      mm/slab_common.c

  ● I told you PAX_SANITIZE_SLAB_FAST does nothing...
  23. Before anything else (!)

      static int calculate_sizes(struct kmem_cache *s, int forced_order)
      {
          ...
          if (((flags & (SLAB_DESTROY_BY_RCU | SLAB_POISON)) ||
      #ifdef CONFIG_PAX_MEMORY_SANITIZE
              (!(flags & SLAB_NO_SANITIZE)) ||
      #endif
              s->ctor)) {
              /*
               * Relocate free pointer after the object if it is not
               * permitted to overwrite the first word of the object on
               * kmem_cache_free.
               *
               * This is the case if we do RCU, have a constructor or
               * destructor or are poisoning the objects.
               */
              s->offset = size;
              size += sizeof(void *);
          }
          ...

      static inline void set_freepointer(struct kmem_cache *s, void *object, void *fp)
      {
          *(void **)(object + s->offset) = fp;
      }

      mm/slub.c
  24. SL(U)B internals ● To keep track of free objects, Linux threads a freelist through the slab [Diagram: a slab cache's freelist. Usual case: the free pointer lives at offset = 0, inside each free object. With PAX_MEMORY_SANITIZE: the free pointer is relocated to offset = object_size, after the object]
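A user-space toy model of the difference (all names illustrative): poisoning the whole object on free is only safe because the freelist link was relocated past the payload, mimicking the s->offset = size relocation in calculate_sizes().

      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>

      #define OBJ_SIZE 32
      #define POISON   0x42

      /* Toy object: OBJ_SIZE bytes of payload followed by the freelist pointer. */
      static void toy_slab_free(unsigned char *obj, size_t fp_offset, void *next_free)
      {
          /* PAX_MEMORY_SANITIZE: poison the full payload... */
          memset(obj, POISON, OBJ_SIZE);
          /* ...then thread the object onto the freelist, past the payload. */
          *(void **)(obj + fp_offset) = next_free;
      }

      int main(void)
      {
          unsigned char *obj = malloc(OBJ_SIZE + sizeof(void *));

          if (!obj)
              return 1;
          /* fp_offset = OBJ_SIZE mimics the relocated free pointer. */
          toy_slab_free(obj, OBJ_SIZE, NULL);
          printf("payload[0] = 0x%02x, freelist link intact at offset %d\n",
                 obj[0], OBJ_SIZE);
          free(obj);
          return 0;
      }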
  25. Finally ● Yeah, it's pretty straightforward...

      static __always_inline void slab_free(struct kmem_cache *s,
                                            struct page *page, void *x,
                                            unsigned long addr)
      {
          ...
      #ifdef CONFIG_PAX_MEMORY_SANITIZE
          if (!(s->flags & SLAB_NO_SANITIZE)) {
              memset(x, PAX_MEMORY_SANITIZE_VALUE, s->object_size);
              if (s->ctor)
                  s->ctor(x);
          }
      #endif

      mm/slub.c

  ● Wait, what is this guy doing here?!
  26. Gotta Clean Them All! ● Also, let's clear pages anyway upon new allocation

      static inline void clear_highpage(struct page *page)
      {
          void *kaddr = kmap_atomic(page);
          clear_page(kaddr);
          kunmap_atomic(kaddr);
      }

      static inline void sanitize_highpage(struct page *page)
      {
          void *kaddr;
          unsigned long flags;

          local_irq_save(flags);
          kaddr = kmap_atomic(page);
          clear_page(kaddr);
          kunmap_atomic(kaddr);
          local_irq_restore(flags);
      }

      include/linux/highmem.h
  27. PAX_ASLR + PAX_RANDMMAP
  28. What is ASLR? ● ASLR is a CONCEPT ● Acronym for: Address Space Layout Randomization ● What it basically does is randomize the address space on each execution ● Does it constrain which binaries we are allowed to run? ● Linux comes with a built-in ASLR ○ But it's not that good :( ● Activate it with: $ echo 2 > /proc/sys/kernel/randomize_va_space
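To watch the concept in action, here is a small user-space probe (illustrative, not part of PaX). Run it a few times with randomize_va_space set to 2: the stack, heap and libc addresses should change on every run, while the code address of a non-PIE binary stays put.

      #include <stdio.h>
      #include <stdlib.h>

      int main(void)
      {
          int stack_var;
          void *heap = malloc(16);

          printf("stack: %p\n", (void *)&stack_var);
          printf("heap:  %p\n", heap);
          printf("code:  %p\n", (void *)main);    /* fixed unless built as PIE */
          printf("libc:  %p\n", (void *)printf);  /* mmap()ed, so randomized */

          free(heap);
          return 0;
      }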
  29. Example ● In theory, on every execution the memory map should be different: [Diagram: Executions #1, #2 and #3, each with Code, Heap and Stack at different addresses]
  30. Configuring PaX on a per-ELF basis ● Three (3) types of configuration options (depending on .config): ○ Default configuration (activates all mechanisms all the time) ○ Annotation in the ELF header (old method) ○ File system extended attributes ● Check out chpax(1) and paxctl(1) ○ Requires a recompilation with: ■ PAX_PT_PAX_FLAGS / PAX_XATTR_PAX_FLAGS
  31. So what do we REALLY want to randomize? ● Each process consists of several parts: ○ brk (i.e. for malloc) ○ The ELF itself ○ mmap()s ■ For data ■ For dynamic libraries ○ Stack ● Let's find out what happens to each of them
  32. brk ● Know load_elf_binary()? Well, you should!

      static int load_elf_binary(struct linux_binprm *bprm)
      {
          ...
      #ifdef CONFIG_PAX_RANDMMAP
          if (current->mm->pax_flags & MF_PAX_RANDMMAP) {
              unsigned long start, size, flags;
              vm_flags_t vm_flags;

              start = ELF_PAGEALIGN(elf_brk);
              /* BTW, pax_get_random_long() is basically a call to prandom_u32() */
              size = PAGE_SIZE + ((pax_get_random_long() & ((1UL << 22) - 1UL)) << 4);
              flags = MAP_FIXED | MAP_PRIVATE;
              vm_flags = VM_DONTEXPAND | VM_DONTDUMP;

              down_write(current->mm->mmap_sem);
              start = get_unmapped_area(NULL, start, PAGE_ALIGN(size), 0, flags);
              ...
              if (retval == 0)
                  retval = set_brk(start + size, start + size + PAGE_SIZE);
              ...
          }
      #endif

      fs/binfmt_elf.c
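The entropy is easy to read off that one line (assuming pax_get_random_long() is uniform):

      \[
      \text{size} = \text{PAGE\_SIZE} + \big(r \bmod 2^{22}\big) \cdot 2^{4},
      \qquad 0 \le \big(r \bmod 2^{22}\big) \cdot 2^{4} < 2^{26}\,\text{B} = 64\,\text{MiB}
      \]

so the gap reserved in front of the brk base floats in a roughly 64 MiB window at 16-byte granularity: about 22 bits of heap-base entropy.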
  33. ELF

      static int load_elf_binary(struct linux_binprm *bprm)
      {
          ...
          /* Regular Linux ASLR */
          load_bias = ELF_ET_DYN_BASE - vaddr;
          if (current->flags & PF_RANDOMIZE)
              load_bias += arch_mmap_rnd();
          load_bias = ELF_PAGESTART(load_bias);

      #ifdef CONFIG_PAX_RANDMMAP
          /* PaX: randomize base address at the default exe base if requested */
          /* PaX ASLR: a brutal overwrite of Linux's previous work */
          if ((current->mm->pax_flags & MF_PAX_RANDMMAP) && elf_interpreter) {
              load_bias = (pax_get_random_long() & ((1UL << PAX_DELTA_MMAP_LEN) - 1)) << PAGE_SHIFT;
              load_bias = ELF_PAGESTART(PAX_ELF_ET_DYN_BASE - vaddr + load_bias);
              elf_flags |= MAP_FIXED;  /* make sure we map it exactly there */
          }
      #endif
          ...
          error = elf_map(bprm->file, load_bias + vaddr, elf_ppnt, elf_prot, elf_flags, total_size);
          ...
      }

      fs/binfmt_elf.c
  34. Randomizing the Stack ● This decides where the stack top will be placed:

      #define STACK_TOP ((current->mm->pax_flags & MF_PAX_SEGMEXEC) ? SEGMEXEC_TASK_SIZE : TASK_SIZE)

      arch/x86/include/asm/processor.h

  ● And now randomize!

      static unsigned long randomize_stack_top(unsigned long stack_top)
      {
      #ifdef CONFIG_PAX_RANDUSTACK
          if (current->mm->pax_flags & MF_PAX_RANDMMAP)
              return stack_top - current->mm->delta_stack;
      #endif

      /* ... in load_elf_binary ... */
      retval = setup_arg_pages(bprm, randomize_stack_top(STACK_TOP), executable_stack);

      fs/binfmt_elf.c

  ● Another brutal hijack by PaX before any original Linux logic
  35. Finally, map the stack VMA

      int setup_arg_pages(struct linux_binprm *bprm, unsigned long stack_top,
                          int executable_stack)
      {
          struct vm_area_struct *vma = bprm->vma;
          ...
          stack_top = arch_align_stack(stack_top);
          stack_top = PAGE_ALIGN(stack_top);
          stack_shift = vma->vm_end - stack_top;

          bprm->p -= stack_shift;
          ...
          stack_expand = 131072UL; /* randomly 32*4k (or 2*64k) pages */
          stack_size = vma->vm_end - vma->vm_start;
          rlim_stack = rlimit(RLIMIT_STACK) & PAGE_MASK;

          if (stack_size + stack_expand > rlim_stack)
              stack_base = vma->vm_end - rlim_stack;
          else
              stack_base = vma->vm_start - stack_expand;

          current->mm->start_stack = bprm->p;
          ret = expand_stack(vma, stack_base);

      fs/exec.c

  [Diagram: the stack VMA near the top of memory, growing down toward 0x00...; the extended VMA bottoms out at stack_base, computed either as vma->vm_end - rlim_stack or vma->vm_start - stack_expand. Callout next to the "randomly 32*4k pages" comment: WTF random?]
  36. Reminder for mmap(2) ● We want to randomize mmap for: ○ Random data locations (e.g. the GOT) ○ Random dynamic library locations (dynamic linking)

      NAME
          mmap, munmap - map or unmap files or devices into memory

      SYNOPSIS
          #include <sys/mman.h>

          void *mmap(void *start, size_t length, int prot, int flags, int fd, off_t offset);
          int munmap(void *start, size_t length);
  37. Allegory ● What you'd expect: [Diagram: three contiguous regions A, B, C] ● How it is in reality: [Diagram: regions A through J scattered out of order across the address space]
  38. Randomizing mmap()

      void arch_pick_mmap_layout(struct mm_struct *mm)
      {
          unsigned long random_factor = 0UL;

      #ifdef CONFIG_PAX_RANDMMAP
          if (!(mm->pax_flags & MF_PAX_RANDMMAP))
      #endif
              if (current->flags & PF_RANDOMIZE)
                  random_factor = arch_mmap_rnd();

          mm->mmap_legacy_base = mmap_legacy_base(mm, random_factor);

          if (mmap_is_legacy()) {
              mm->mmap_base = mm->mmap_legacy_base;
              mm->get_unmapped_area = arch_get_unmapped_area;
          } else {
              mm->mmap_base = mmap_base(mm, random_factor);
              mm->get_unmapped_area = arch_get_unmapped_area_topdown;
          }

      #ifdef CONFIG_PAX_RANDMMAP
          /* Here we use the randomness we generated before */
          if (mm->pax_flags & MF_PAX_RANDMMAP) {
              mm->mmap_legacy_base += mm->delta_mmap;
              mm->mmap_base -= mm->delta_mmap + mm->delta_stack;
          }
      #endif
      }

      arch/x86/mm/mmap.c

  ● Remember these get_unmapped_area fellas. We will use them to allocate new pages.
  39. Calling the actual logic

      unsigned long
      arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
                                     const unsigned long len, const unsigned long pgoff,
                                     const unsigned long flags)
      {
          ...
          info.flags = VM_UNMAPPED_AREA_TOPDOWN;
          info.length = len;
          info.low_limit = PAGE_SIZE;
          info.high_limit = mm->mmap_base;
          info.align_mask = 0;
          info.align_offset = pgoff << PAGE_SHIFT;
          if (filp) {
              info.align_mask = get_align_mask();
              info.align_offset += get_align_bits();
          }
          info.threadstack_offset = offset;
          addr = vm_unmapped_area(&info);

      arch/x86/kernel/sys_x86_64.c
  40. A brief on VMAs ● VMA stands for Virtual Memory Area ● Represents a chunk of virtually contiguous pages

      struct vm_area_struct {
          /* The first cache line has the info for VMA tree walking. */

          unsigned long vm_start; /* Our start address within vm_mm. */
          unsigned long vm_end;   /* The first byte after our end address within vm_mm. */

          /* linked list of VM areas per task, sorted by address */
          struct vm_area_struct *vm_next, *vm_prev;

          struct rb_node vm_rb;

          /*
           * Largest free memory gap in bytes to the left of this VMA.
           * Either between this VMA and vma->vm_prev, or between one of the
           * VMAs below us in the VMA rbtree and its ->vm_prev. This helps
           * get_unmapped_area find a free area of the right size.
           */
          unsigned long rb_subtree_gap;
          ...
      } __randomize_layout;

      include/linux/mm_types.h
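Each line of /proc/<pid>/maps is the user-visible face of one such vm_area_struct (vm_start-vm_end plus protections and backing file). A quick self-inspection sketch:

      #include <stdio.h>

      int main(void)
      {
          /* Every output line corresponds to one VMA of this process. */
          FILE *f = fopen("/proc/self/maps", "r");
          char line[512];

          if (!f)
              return 1;
          while (fgets(line, sizeof(line), f))
              fputs(line, stdout);
          fclose(f);
          return 0;
      }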
  41. So, let's allocate some memory! [Diagram: the mm_rb red-black tree of VMAs layered over the doubly-linked list sorted by virtual address; vm_unmapped_area() walks the tree using the rb_subtree_gap hints to find a gap of the requested length between low_limit and info->high_limit (mmap_base), above which allocation is not allowed]
  42. Questions? :)

Editor's Notes

  • ZONE_DMA is for DMA controllers
    ZONE_NORMAL implements the Buddy System
    ZONE_HIGHMEM is for everything else: temporary mappings, usermode memory etc…

    Focus on ZONE_NORMAL.
    It implements the Slab Allocator (kmalloc uses it!)
    The Slab Allocator is implemented on top of the Buddy System.
    It is used for fast access to memory.
  • Note that copy_from/to_user is ARCHITECTURE DEPENDENT.
    Why? Every architecture has a different way of invalidating caches, loading/unloading pages to swap etc...
  • Eventually, we want this function to return early, so that pax_report_usercopy is never reached.
  • The purpose of this function is mainly looking up the pointer’s slab.
    First make some basic sanity checks (not null, is mapped (virt_addr_valid))
    Then get the first page of the page chain for this pointer
    Then check if this pointer is even backed by any slab cache
    This protection adds a flag with which SLABs can be marked as safe for communication with usermode.
    If this SLAB is not one of these, then we shouldn’t allow a copy.
    Finally, check that the copy is within the size of a single object of the SLAB
    Otherwise, return the name of the SLAB so the previous function will fail.
  • A few basic checks:
    Len does not overflow
    The object is not inside the (kernel!!) stack
    The object does not overrun the stack partially.
  • We traverse the entire stack, frame by frame
    For each frame, we make sure the object ends before the next frame...
    ...and starts after the current frame + 2 pointers
    Know what they are?
    Psst: return address and frame pointer
    For those who attended my previous lecture, does anyone see a semi-problem here?
    Do note that this DOES NOT comply with CONFIG_CC_STACKPROTECTOR
  • Very basic. Checks against start of text and end of text symbols.
    Double check for any case of double mapping (i.e. vector table)
  • Eventually, we want this function to return early, so that pax_report_usercopy is never reached.
  • PAX_SANITIZE_SLAB_FAST doesn’t do anything.
    Use PAX_SANITIZE_SLAB_FULL
  • This shows PAX_SANITIZE_SLAB_FAST doesn’t do anything.
    Doesn’t really show anything else… :P
  • Why is this? Does anyone know why?
    Well, the answer is written in the comment...
  • Why is this? Does anyone know why?
    Well, the answer is written in the comment…
    The ctor is called here because we don't want to waste time initializing a SLAB object on allocation.
    It's important to know when your functions are called!
  • Why does it suspend the interrupts?
    Well, zeroing the entire page might take a while and we don’t want anyone using it in the meantime...
  • Yes. ASLR is a concept.
    Constraint: (generally!) only position independent code.
    In practice: position dependent code has a much lower entropy.
  • Yes. ASLR is a concept.
  • Stuff to randomize:
    Brk
    ELF
    mmap()s
    Data
    Libraries
    Stack
  • Basically, this reserves a random size for the BRK area
    Then sets the brk as start+size.
  • This is pretty straightforward
    Note Linux’s basic ASLR
    Note that PaX brutally overwrites Linux’s work for load_bias
  • The first part of the function defines the highest address of the stack VMA
    [p] is not a very indicative name. It holds the highest address in memory.
    So there are two possible actions here:
    Either we exceed the rlimit, so we will shrink the stack
    Or we will expand the stack according to this weird “random” const
    All in all, the randomness of the stack is conveyed by the vma->end address we use as stack_top
  • We are going to execute arch_get_unmapped_area_topdown() to allocate new memory.
    Using mmap_base as the boundary for our allocation.
  • Just highlight the info.high_limit
    Notice that this is the only thing that is being randomized.
    So actually, mmap randomness is based on disallowing processes to allocate any higher than a certain address.
  • A VMA consists of a start address and an end address.
    It is linked to the other VMAs of the process via two data structures:
    A doubly linked list
    A red-black tree.
    It also has a "gap" member that represents the largest unmapped area between VMAs of the current subtree (of the red-black tree).
