Pseudo-PaX-in-userland

(started 4/5/2008, added 5/5/2008)

This document describes how it could be feasible to build a pseudo-PaX implementation entirely in userland. The idea is far more of a plaything than anything serious, for reasons described later. It's mostly random thoughts and experiments.

I don't recall how I got started along this track, except that it was something I've been meaning to look at for a while.

Introduction

Firstly, we should review how PaX's segmexec operates.

   While Linux effectively does not use segmentation by creating 0 based and
   4 GB limited segments for both code and data accesses (therefore logical
   addresses are the same as linear addresses), it is possible to set up
   segments that allow to implement non-executable pages.

   The basic idea is that we divide the 3 GB userland linear address space
   into two equal halves and use one to store mappings meant for data access
   (that is, we define a data segment descriptor to cover the 0-1.5 GB linear
   address range) and the other for storing mappings for execution (that is,
   we define a code segment descriptor to cover the 1.5-3 GB linear address
   range). Since an executable mapping can be used for data accesses as well,
   we will have to ensure that such mappings are visible in both segments
   and mirror each other. This setup will then separate data accesses from
   instruction fetches in the sense that they will hit different linear
   addresses and therefore allow for control/intervention based on the access
   type. In particular, if a data-only (and therefore non-executable) mapping
   is present only in the 0-1.5 GB linear address range, then instruction
   fetches to the same logical addresses will end up in the 1.5-3 GB linear
   address range and will raise a page fault hence allow detecting such
   execution attempts.

PaX's segmexec works by modifying the Global Descriptor Table so that code and data accesses to the same logical address resolve to different linear addresses.
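
To make the split concrete, here is a small sketch of the address arithmetic described in the quoted text. It is only a model and not PaX code: the constant and function names are made up for illustration, and it assumes the 3 GB userland layout from the quote (data segment based at 0, code segment based at 1.5 GB).

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constant: size of each half of the 3 GB userland space. */
#define HALF_SIZE 0x60000000u   /* 1.5 GB */

/* Linear address touched by a data access to a logical address.
 * The data segment is based at 0, so nothing changes. */
uint32_t data_linear(uint32_t logical)
{
        assert(logical < HALF_SIZE);    /* must fit inside the segment */
        return logical;
}

/* Linear address touched by an instruction fetch from the same
 * logical address. The code segment is based at 1.5 GB. */
uint32_t code_linear(uint32_t logical)
{
        assert(logical < HALF_SIZE);    /* must fit inside the segment */
        return logical + HALF_SIZE;
}
```

For a logical address of 0x08048000, a data access lands on linear 0x08048000 while an instruction fetch lands on linear 0x68048000. If nothing is mapped at the second address, the fetch faults, which is what makes the execution attempt detectable.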

Userspace, as far as I know, can't modify the Global Descriptor Table, but it can influence its own Local Descriptor Table via the modify_ldt() system call. modify_ldt() can create code and data descriptors easily enough, and we can use a far call or far return (amongst other techniques) to switch to the new selector.
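
A far call or far return needs a segment selector rather than a bare LDT entry number. On x86 a selector packs the table index into bits 3 and up, a table indicator into bit 2 (set for the LDT) and the requested privilege level into bits 0 and 1. The helper below is a sketch of that encoding; the function and macro names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SEL_TI_LDT 0x4u         /* table indicator bit: 1 = LDT, 0 = GDT */

/* Build an x86 segment selector from a descriptor table index,
 * a table choice and a requested privilege level (0-3). */
uint16_t make_selector(unsigned int index, int use_ldt, unsigned int rpl)
{
        assert(index < 8192);   /* the index field is 13 bits wide */
        assert(rpl < 4);        /* the RPL field is 2 bits wide */
        return (uint16_t)((index << 3) | (use_ldt ? SEL_TI_LDT : 0u) | rpl);
}
```

Entry 0 of the LDT at ring 3 therefore gives make_selector(0, 1, 3), which is 7, and entry 1 would give 15.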

Proof of concept

As a sample, let's try to execute an int3 instruction. We'll create a new LDT entry with a base address of 4096, which means that once CS refers to it, every code address has to have 4096 subtracted from it. And on with the show:

#include <stdlib.h>
#include <unistd.h>
#include <strings.h>
#include <stdio.h>
#include <errno.h>
#include <sys/types.h>
#include <fcntl.h>
#include <asm/ldt.h>

asm(".globl debug;\
        .type   debug, @function;\
debug:;\
        int3;\
.size   debug, .-debug;\
        "
        );

extern void debug();

/*
<asm/ldt.h>
struct modify_ldt_ldt_s {
        unsigned int  entry_number;
        unsigned long base_addr;
        unsigned int  limit;
        unsigned int  seg_32bit:1;
        unsigned int  contents:2;
        unsigned int  read_exec_only:1;
        unsigned int  limit_in_pages:1;
        unsigned int  seg_not_present:1;
        unsigned int  useable:1;
};

#define MODIFY_LDT_CONTENTS_DATA        0
#define MODIFY_LDT_CONTENTS_STACK       1
#define MODIFY_LDT_CONTENTS_CODE        2
*/

int do_ldt(int num, unsigned long base, int type)
{
        struct modify_ldt_ldt_s ldt_entry = {
                num, // entry_number
                (unsigned long int) (base), // base_address
                0xfffff, // limit, 4G or so :p
                1, // seg_32bit
                type, // contents
                1, // read_exec_only
                1, // limit_in_pages
                0, // seg_not_present
                1 // usable
        };
        return modify_ldt(1, &ldt_entry, sizeof(struct modify_ldt_ldt_s)) == 0;
}

int main(int argc, char **argv)
{
        short int seg;

        if(do_ldt(0, 0x1000, MODIFY_LDT_CONTENTS_CODE) == 0) {
                printf("Failed to modify ldt\n");
                exit(EXIT_FAILURE);
        }
        seg = 7; // (0 * 8) + 7
        //printf("new segment: %d|%02x\n", seg, seg);

        __asm__ volatile("pushw %0;\
                        pushl %1;\
                        lret"
                        :
                        : "r" (seg), "r" ((unsigned int)(debug) - 0x1000)
                        );

}

Running the above code under a debugger:

Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
(no debugging symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) r
Starting program: /root/drifter/ldt/ldt_wot
(no debugging symbols found)
(no debugging symbols found)

Program received signal SIGTRAP, Trace/breakpoint trap.
0x08047559 in ?? ()
(gdb) x/4i $eip -1
0x8047558:      Cannot access memory at address 0x8047558
(gdb) x/4i $eip - 1 + 4096
0x8048558 <debug>:      int3
0x8048559 <do_ldt>:     push   %ebp
0x804855a <do_ldt+1>:   mov    %esp,%ebp
0x804855c <do_ldt+3>:   sub    $0x48,%esp
(gdb) i r cs
cs             0x7      7

As the debugger output shows, it's possible to install a custom CS descriptor and execute code through it. Because the address space is no longer flat, debugging gets a little awkward: gdb reports the CS-relative $eip, so 4096 has to be added back to find the instruction in memory.
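
The address juggling in the gdb session can be written out as a few helper functions. This is just a sketch of the arithmetic with hypothetical names; it also shows how a 20-bit limit field with limit_in_pages set translates into a segment size in bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Offset to load into EIP so that a segment with the given base
 * reaches the wanted linear address. */
uint32_t segment_offset(uint32_t linear, uint32_t base)
{
        return linear - base;
}

/* Linear address the CPU actually accesses for an offset in a
 * segment with the given base. */
uint32_t linear_from_offset(uint32_t offset, uint32_t base)
{
        return offset + base;
}

/* Number of bytes covered by a descriptor's limit field.
 * With limit_in_pages set, the limit counts 4096-byte pages. */
uint64_t segment_size(uint32_t limit, int limit_in_pages)
{
        assert(limit <= 0xfffffu);      /* the limit field is 20 bits wide */
        uint64_t units = (uint64_t)limit + 1;
        return limit_in_pages ? units * 4096u : units;
}
```

With a base of 0x1000, debug() at linear 0x08048558 is reached through offset 0x08047558, which is why gdb stops at 0x08047559 (one byte past the int3). A limit of 0xfffff pages covers 4 GB, and a limit of 0x5ffff pages would cover 1.5 GB.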

Implementing the Pseudo-PaX

Quickly reviewing what we need to do:

Doing the above completely correctly would be difficult from userland, but possible with a bit of effort.

For the purposes of this article, we'll write it against dietlibc, which keeps the example small but also means it isn't really practical to use.

#include <stdlib.h>
#include <unistd.h>
#include <strings.h>
#include <stdio.h>
#include <errno.h>
#include <sys/types.h>
#include <fcntl.h>
#include <asm/ldt.h>
#include <sys/mman.h>
#include <asm/unistd.h>


/*
<asm/ldt.h>
struct modify_ldt_ldt_s {
        unsigned int  entry_number;
        unsigned long base_addr;
        unsigned int  limit;
        unsigned int  seg_32bit:1;
        unsigned int  contents:2;
        unsigned int  read_exec_only:1;
        unsigned int  limit_in_pages:1;
        unsigned int  seg_not_present:1;
        unsigned int  useable:1;
};

#define MODIFY_LDT_CONTENTS_DATA        0
#define MODIFY_LDT_CONTENTS_STACK       1
#define MODIFY_LDT_CONTENTS_CODE        2
*/

_syscall3(int,modify_ldt,int,op,void*,what,int,len);

/*int modify_ldt(int op, void *what, int len)
{
        return syscall(__NR_modify_ldt, op, what, len);
}*/


int do_ldt(int num, unsigned long base, int type)
{
        struct modify_ldt_ldt_s ldt_entry = {
                num, // entry_number
                (unsigned long int) (base), // base_address
                0x5ffff, // limit, 1.5G or so :p
                1, // seg_32bit
                type, // contents
                1, // read_exec_only
                1, // limit_in_pages
                0, // seg_not_present
                1 // usable
        };
        return modify_ldt(1, &ldt_entry, sizeof(struct modify_ldt_ldt_s)) == 0;
}

int do_exit()
{
        exit(EXIT_SUCCESS);
}

void vulnerable()
{
        int j[1];
        int i;
        char code[] = "\xcc\xcc\xcc\xcc";
#ifdef HEAP
        int addr = strdup(code);
#else
        int addr = &code;
#endif
        //char *where;
        //__asm__("int3;");
        for(i = 0; i < 10; i++) j[i] = (unsigned int)(addr);
}

unsigned char *old_stack;
int old_stack_len;

void duplicate_code_mappings()
{
        // taken from drifter level 11 code, but modified.
        FILE *f;
        int hi, low;
        char flags[5];
        int wot;
        int major, minor;
        int size;
        char remainder[1024];
        int ret;
        unsigned char *new;

        f = fopen("/proc/self/maps", "r");

        while(8 == (ret = fscanf(f, "%08x-%08x %[^ \n] %08x %02x:%02x %08x%[^\n]", &low, &hi, flags, &wot, &major, &minor, &size, remainder))) {
                if((low & 0x60000000) == 0x60000000) continue;

                size = hi - low;
                /*
                 * size = hi - low;
                 * printf("--> %08x-%d\n", low, size);
                printf("--> %08x-%08x %s %d %d:%d %d %s\n", low, hi, flags, wot, major, minor, size, remainder);
                 */

                // r-xp
                if(flags[1] == '-' && flags[2] == 'x') {
                        printf("--> Duplicating 0x%08x, %d bytes long\n", low, size);

                        new = mmap(low+0x60000000, size, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
                        if(new == MAP_FAILED) {
                                printf("Unable to map code duplicate: %s\n", strerror(errno));
                                exit(EXIT_FAILURE);
                        }
                        memcpy(new, low, size);
                        mprotect(new, size, PROT_READ|PROT_EXEC);
                }
                old_stack = low;
                old_stack_len = size;
        }
        fclose(f);

        //exit(EXIT_FAILURE);
}


#define STKSIZ (4096 * 32)
// more code from drifter level11.. so I'm lazy :P

unsigned char *allocate_stack()
{
        int found;
        int address;
        short int shift;
        unsigned char *stack_ptr;
        int urand_fd;

        urand_fd = open("/dev/urandom", O_RDONLY);

        found = 0;
        while(!found) {
                if(read(urand_fd, &address, 4) != 4) {
                        printf("Read failure on /dev/urandom: %s\n", strerror(errno));
                        exit(EXIT_FAILURE);
                }

                if(read(urand_fd, &shift, 2) != 2) {
                        printf("Read failure on /dev/urandom: %s\n", strerror(errno));
                        exit(EXIT_FAILURE);
                }

                shift &= 4088; // (page_size - 1) with the low 3 bits cleared, to keep the stack 8-byte aligned

#if 1
                address &= 0x5f7fffff; // keep the mapping (address + STKSIZ) below the 0x60000000 cutoff
                //address |= TOOBIG;
#endif
                address &= ~4095;       // clear the page offset (page-align)

                stack_ptr = mmap(address, STKSIZ, PROT_READ|PROT_WRITE, MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
                if(stack_ptr != MAP_FAILED) found = 1;

        }

        close(urand_fd);

        stack_ptr = (unsigned int)(stack_ptr) + (STKSIZ) - shift;
        memset(stack_ptr, 0xcc, shift);

        stack_ptr--;
        return stack_ptr;
}

void do_vulnerable()
{
        munmap(old_stack, old_stack_len);
        printf("Hello from do_vulnerable\n");
        vulnerable();
        do_exit();
}

int main(int argc, char **argv)
{
        short int seg;
        unsigned char *stack;

        if(do_ldt(0, 0x60000000, MODIFY_LDT_CONTENTS_CODE) == 0) {
                printf("Failed to modify ldt\n");
                exit(EXIT_FAILURE);
        }
        seg = 7;

        duplicate_code_mappings();
        stack = allocate_stack();

        printf("--> Will unmap old stack starting @ 0x%08x\n", old_stack);
        system("cat /proc/$PPID/maps");

        printf("Returning to do_vulnerable\n");

        __asm__ volatile("movl %0, %%esp;\
                        movl %%esp, %%ebp;\
                        pushl %1;\
                        pushw %2;\
                        pushl %3;\
                        lret"
                        :
                        : "m"(stack), "m" (do_exit), "r" (seg), "r" ((unsigned int)(do_vulnerable))
                        );

}

The above was compiled with:

diet gcc -fno-pie -fno-stack-protector ldt_test.c -o ldt_test

The above code creates a new stack mapping underneath the cutoff point, because the original stack is mapped somewhere around 0xbfff0000, which is above it. In addition, it duplicates the code layout by scanning /proc/self/maps for mappings marked executable and NOT writable, and copying each one to an address 0x60000000 higher.
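
The selection step can be isolated into a small, testable sketch. The struct and function names below are hypothetical, only the address range and permission field are parsed, and the "fits below the cutoff" check is a simplification of the bitmask test used in duplicate_code_mappings().

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MIRROR_OFFSET 0x60000000u       /* base of the code segment */

struct map_range {
        uint32_t low;           /* start of the mapping */
        uint32_t high;          /* end of the mapping (exclusive) */
        char perms[5];          /* permission field, e.g. "r-xp" */
};

/* Parse the range and permissions from one /proc/<pid>/maps line.
 * Returns 1 on success, 0 if the line does not match. */
int parse_maps_line(const char *line, struct map_range *out)
{
        unsigned int low, high;

        assert(line != NULL && out != NULL);
        if (sscanf(line, "%x-%x %4s", &low, &high, out->perms) != 3)
                return 0;
        out->low = low;
        out->high = high;
        return 1;
}

/* Mirror only mappings that are executable, not writable, and that
 * lie entirely below the cutoff. */
int needs_mirror(const struct map_range *m)
{
        return m->perms[1] == '-' && m->perms[2] == 'x' &&
               m->high <= MIRROR_OFFSET;
}

/* Address at which the executable copy has to appear. */
uint32_t mirror_address(const struct map_range *m)
{
        return m->low + MIRROR_OFFSET;
}
```

Fed a line such as "08048000-0804e000 r-xp ...", this reports a 24576 byte mapping that needs a mirror at 0x68048000, while an rwxp stack line is rejected because it is writable.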

The vulnerable() function simulates a stack overflow, overwriting the return address so that it points at the int3 instructions. Optionally, it can be compiled with -DHEAP, in which case the return address points into the heap rather than the stack.

Watching it catch heap execute attempts:

GNU gdb 6.7.1
Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
(no debugging symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) r
Starting program: /root/drifter/ldt/ldt_wot
--> Duplicating 0x08048000, 24576 bytes long
--> Will unmap old stack starting @ 0xbfffc000
00110000-00111000 rw-p 00000000 00:00 0
08048000-0804e000 r-xp 00000000 08:01 50639      /root/drifter/ldt/ldt_wot
0804e000-08050000 rw-p 00005000 08:01 50639      /root/drifter/ldt/ldt_wot
08050000-08051000 rwxp 00000000 00:00 0
0d2a4000-0d2c4000 rw-p 00000000 00:00 0
68048000-6804e000 r-xp 00000000 00:00 0
bfffc000-c0000000 rwxp ffffd000 00:00 0
Returning to do_vulnerable
Hello from do_vulnerable

Program received signal SIGSEGV, Segmentation fault.
0x00111008 in ?? ()
(gdb) x/4i $eip
0x111008:       int3
0x111009:       int3
0x11100a:       int3
0x11100b:       int3

And stack execution attempts:

GNU gdb 6.7.1
Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
(no debugging symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) r
Starting program: /root/drifter/ldt/ldt_wot
--> Duplicating 0x08048000, 24576 bytes long
--> Will unmap old stack starting @ 0xbfff4000
00110000-00111000 rw-p 00000000 00:00 0
063c3000-063e3000 rw-p 00000000 00:00 0
08048000-0804e000 r-xp 00000000 08:01 50635      /root/drifter/ldt/ldt_wot
0804e000-08050000 rw-p 00005000 08:01 50635      /root/drifter/ldt/ldt_wot
08050000-08051000 rwxp 00000000 00:00 0
68048000-6804e000 r-xp 00000000 00:00 0
bfff4000-c0000000 rwxp ffff5000 00:00 0
Returning to do_vulnerable
Hello from do_vulnerable

Program received signal SIGSEGV, Segmentation fault.
0x063e25b1 in ?? ()
(gdb) x/4i $eip
0x63e25b1:      int3
0x63e25b2:      int3
0x63e25b3:      int3
0x63e25b4:      int3

Implementing it properly

This idea would be easy to implement in the ELF loader (ld.so, not the kernel) if it laid out memory correctly, and probably remapped the stack to a lower address. It would also have to hook functions such as mmap() and duplicate their mappings where possible/applicable.

Anonymous memory could possibly be handled by making it disk-backed and mapping it twice from that file.
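
A minimal sketch of the double-mapping idea follows, assuming a temporary file from tmpfile() as the backing store. The function name is made up, and both views are placed wherever the kernel chooses; a real implementation would instead put the second view at the data address plus 0x60000000 with MAP_FIXED and map it PROT_READ|PROT_EXEC.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Map one temporary file twice and check that data written through
 * the first view is visible through the second.
 * Returns 1 if it is, 0 on any failure. */
int double_map_demo(void)
{
        const size_t len = 4096;
        static const char payload[] = "\xcc\xcc\xcc\xcc";
        int ok = 0;

        FILE *backing = tmpfile();      /* already-unlinked temporary file */
        if (backing == NULL)
                return 0;
        int fd = fileno(backing);
        assert(fd >= 0);

        if (ftruncate(fd, (off_t)len) == 0) {
                /* Writable "data" view and read-only stand-in for the code view. */
                char *data_view = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                       MAP_SHARED, fd, 0);
                char *code_view = mmap(NULL, len, PROT_READ,
                                       MAP_SHARED, fd, 0);

                if (data_view != MAP_FAILED && code_view != MAP_FAILED) {
                        memcpy(data_view, payload, sizeof(payload));
                        ok = data_view != code_view &&
                             memcmp(code_view, payload, sizeof(payload)) == 0;
                }
                if (data_view != MAP_FAILED)
                        munmap(data_view, len);
                if (code_view != MAP_FAILED)
                        munmap(code_view, len);
        }
        fclose(backing);
        return ok;
}
```

Because both views are MAP_SHARED mappings of the same file, no memcpy is needed to keep them in sync, which is the main attraction over copying pages by hand.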

Rather than copying with memcpy, it should correctly parse /proc/<pid>/maps and mmap() shared libraries from their backing files.

Dynamically generated code could be handled by marking a region non-writable when an attempt is made to execute it. On a later attempt to write there, it would be unmapped from the executable region and marked writable again. Self-modifying code would be handled by the same mechanism, albeit slowly.

Weaknesses

Other uses for modify_ldt() syscall