(started 4/5/2008, added 5/5/2008)
This document describes how it could be feasible to build a pseudo-PaX
implementation completely in userland. The idea is far more of a plaything
than anything serious, for reasons described later. It's really just random
thoughts and experiments.
I don't recall how I got started along this track, except that it was
something I've been meaning to look at for a while.
Introduction
Firstly, we should review how PaX's segmexec operates.
While Linux effectively does not use segmentation by creating 0 based and
4 GB limited segments for both code and data accesses (therefore logical
addresses are the same as linear addresses), it is possible to set up
segments that allow to implement non-executable pages.
The basic idea is that we divide the 3 GB userland linear address space
into two equal halves and use one to store mappings meant for data access
(that is, we define a data segment descriptor to cover the 0-1.5 GB linear
address range) and the other for storing mappings for execution (that is,
we define a code segment descriptor to cover the 1.5-3 GB linear address
range). Since an executable mapping can be used for data accesses as well,
we will have to ensure that such mappings are visible in both segments
and mirror each other. This setup will then separate data accesses from
instruction fetches in the sense that they will hit different linear
addresses and therefore allow for control/intervention based on the access
type. In particular, if a data-only (and therefore non-executable) mapping
is present only in the 0-1.5 GB linear address range, then instruction
fetches to the same logical addresses will end up in the 1.5-3 GB linear
address range and will raise a page fault hence allow detecting such
execution attempts.
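To make the arithmetic concrete, here is a minimal sketch of the logical-to-linear translation that the text above relies on. It is a model only: the names (struct segment, translate, data_seg, code_seg) are made up for illustration, and 0x60000000 is simply the 1.5 GB midpoint of a 3 GB userland.

```c
#include <stdint.h>

/* Hypothetical model of a segment: where it starts and how many bytes it spans. */
struct segment {
    uint32_t base;
    uint32_t size;
};

/* Data accesses use the lower half, instruction fetches the upper half. */
static const struct segment data_seg = { 0x00000000u, 0x60000000u };
static const struct segment code_seg = { 0x60000000u, 0x60000000u };

/*
 * Translate a logical address into a linear one: linear = base + logical.
 * Returns 0 on success, -1 if the logical address lies beyond the segment
 * limit (the CPU would fault before paging is even consulted).
 */
static int translate(const struct segment *seg, uint32_t logical, uint32_t *linear)
{
    if (logical >= seg->size)
        return -1;
    *linear = seg->base + logical;
    return 0;
}
```

With these values, logical address 0x08048000 is read as data at linear 0x08048000 but fetched as code from linear 0x68048000, which is where an executable mirror has to live.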
PaX's segmexec works by modifying the Global Descriptor Table (GDT) so that
code and data accesses resolve to different linear addresses. Userspace, as
far as I know, can't modify the GDT, but it can influence its own Local
Descriptor Table (LDT) via the modify_ldt() system call. The modify_ldt()
syscall can create code and data descriptors easily enough, and we can use a
far call or far return
(amongst other techniques) to change into that selector.
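Two bits of arithmetic come up repeatedly in the programs in this document: the selector value that gets loaded into CS, and how much memory a descriptor limit actually covers. A small sketch of both, with helper names (make_selector, segment_size_bytes) that are mine and not part of any kernel or libc API:

```c
#include <stdint.h>

/*
 * Build an x86 segment selector.
 * Bits 15..3: descriptor index, bit 2: table indicator (0 = GDT, 1 = LDT),
 * bits 1..0: requested privilege level (3 for userland).
 */
static uint16_t make_selector(unsigned index, int use_ldt, unsigned rpl)
{
    return (uint16_t)((index << 3) | (use_ldt ? 4u : 0u) | (rpl & 3u));
}

/*
 * Bytes covered by a descriptor limit. With page granularity the limit
 * counts 4 KB pages; either way the limit names the last valid unit, hence +1.
 */
static uint64_t segment_size_bytes(uint32_t limit, int limit_in_pages)
{
    uint64_t units = (uint64_t)limit + 1;
    return limit_in_pages ? units * 4096u : units;
}
```

So LDT entry 0 at ring 3 gives selector 7, a page-granular limit of 0xfffff covers the full 4 GB, and 0x5ffff covers exactly 0x60000000 bytes (1.5 GB).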
Proof of concept
As a sample, let's try to execute an int3 instruction. We'll create a new LDT
entry with a base address of 4096, which means that once the new CS is loaded,
every code address has to have 4096 subtracted from it. And on with the show:
#include <stdlib.h>
#include <unistd.h>
#include <strings.h>
#include <stdio.h>
#include <errno.h>
#include <sys/types.h>
#include <fcntl.h>
#include <asm/ldt.h>
asm(".globl debug;\
.type debug, @function;\
debug:;\
int3;\
.size debug, .-debug;\
"
);
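extern void debug(void);
/* Assumption: the libc in use has no modify_ldt() prototype (glibc ships none),
so wrap the raw system call. SYS_modify_ldt comes from <sys/syscall.h>. */
#include <sys/syscall.h>
static int modify_ldt(int func, void *ptr, unsigned long bytecount)
{
return syscall(SYS_modify_ldt, func, ptr, bytecount);
}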
/*
<asm/ldt.h>
struct modify_ldt_ldt_s {
unsigned int entry_number;
unsigned long base_addr;
unsigned int limit;
unsigned int seg_32bit:1;
unsigned int contents:2;
unsigned int read_exec_only:1;
unsigned int limit_in_pages:1;
unsigned int seg_not_present:1;
unsigned int useable:1;
};
#define MODIFY_LDT_CONTENTS_DATA 0
#define MODIFY_LDT_CONTENTS_STACK 1
#define MODIFY_LDT_CONTENTS_CODE 2
*/
int do_ldt(int num, unsigned long base, int type)
{
struct modify_ldt_ldt_s ldt_entry = {
num, // entry_number
(unsigned long int) (base), // base_address
0xfffff, // limit, 4G or so :p
1, // seg_32bit
type, // contents
1, // read_exec_only
1, // limit_in_pages
0, // seg_not_present
1 // usable
};
return modify_ldt(1, &ldt_entry, sizeof(struct modify_ldt_ldt_s)) == 0;
}
int main(int argc, char **argv)
{
short int seg;
if(do_ldt(0, 0x1000, MODIFY_LDT_CONTENTS_CODE) == 0) {
printf("Failed to modify ldt\n");
exit(EXIT_FAILURE);
}
seg = 7; // selector: (index 0 * 8) + 4 (table = LDT) + 3 (RPL, ring 3)
//printf("new segment: %d|%02x\n", seg, seg);
__asm__ volatile("pushw %0;\
pushl %1;\
lret"
:
: "r" (seg), "r" ((unsigned int)(debug) - 0x1000)
);
}
Running the above code under a debugger:
Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
(no debugging symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) r
Starting program: /root/drifter/ldt/ldt_wot
(no debugging symbols found)
(no debugging symbols found)
Program received signal SIGTRAP, Trace/breakpoint trap.
0x08047559 in ?? ()
(gdb) x/4i $eip -1
0x8047558: Cannot access memory at address 0x8047558
(gdb) x/4i $eip - 1 + 4096
0x8048558 <debug>: int3
0x8048559 <do_ldt>: push %ebp
0x804855a <do_ldt+1>: mov %esp,%ebp
0x804855c <do_ldt+3>: sub $0x48,%esp
(gdb) i r cs
cs 0x7 7
As we can see in the debugger output, it's possible to set a custom CS
descriptor, and execute code. Due to the now non-flat memory address space,
it also messes with debugging a little bit.
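The debugging wrinkle is just the segment base showing up in every code address: gdb reports EIP relative to the new CS base, so 4096 has to be added back before it can examine the memory. A minimal sketch of the two conversions (the function names are invented for this example):

```c
#include <stdint.h>

/* Offset to load into EIP so that CS base + offset lands on `linear`. */
static uint32_t linear_to_cs_offset(uint32_t linear, uint32_t cs_base)
{
    return linear - cs_base;
}

/* Linear address behind a CS-relative EIP value, as the debugger reports it. */
static uint32_t cs_offset_to_linear(uint32_t eip, uint32_t cs_base)
{
    return eip + cs_base;
}
```

For the session above, debug lives at linear 0x08048558, so the far return targets 0x08047558, and the reported EIP of 0x08047559 corresponds to linear 0x08048559.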
Implementing the Pseudo-PaX
Quickly reviewing what we need to do:
- Choose where to split our memory (let's stick with 0x60000000)
- Duplicate code to the code section
- Handle segmentation violations
- Unmap the old stack, because it's at an inconvenient location
Doing the above completely correctly would be difficult from userland, but
possible with a bit of effort.
For the purposes of this article, we'll write it against dietlibc, and not
worry about making it practical to use.
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
#include <sys/types.h>
#include <fcntl.h>
#include <asm/ldt.h>
#include <sys/mman.h>
#include <asm/unistd.h>
/*
<asm/ldt.h>
struct modify_ldt_ldt_s {
unsigned int entry_number;
unsigned long base_addr;
unsigned int limit;
unsigned int seg_32bit:1;
unsigned int contents:2;
unsigned int read_exec_only:1;
unsigned int limit_in_pages:1;
unsigned int seg_not_present:1;
unsigned int useable:1;
};
#define MODIFY_LDT_CONTENTS_DATA 0
#define MODIFY_LDT_CONTENTS_STACK 1
#define MODIFY_LDT_CONTENTS_CODE 2
*/
_syscall3(int,modify_ldt,int,op,void*,what,int,len);
/*int modify_ldt(int op, void *what, int len)
{
return syscall(__NR_modify_ldt, op, what, len);
}*/
int do_ldt(int num, unsigned long base, int type)
{
struct modify_ldt_ldt_s ldt_entry = {
num, // entry_number
(unsigned long int) (base), // base_address
0x5ffff, // limit, 1.5G or so :p
1, // seg_32bit
type, // contents
1, // read_exec_only
1, // limit_in_pages
0, // seg_not_present
1 // usable
};
return modify_ldt(1, &ldt_entry, sizeof(struct modify_ldt_ldt_s)) == 0;
}
int do_exit()
{
exit(EXIT_SUCCESS);
}
void vulnerable()
{
int j[1];
int i;
char code[] = "\xcc\xcc\xcc\xcc";
#ifdef HEAP
int addr = strdup(code);
#else
int addr = &code;
#endif
//char *where;
//__asm__("int3;");
for(i = 0; i < 10; i++) j[i] = (unsigned int)(addr);
}
unsigned char *old_stack;
int old_stack_len;
void duplicate_code_mappings()
{
// taken from drifter level 11 code, but modified.
FILE *f;
int hi, low;
char flags[5];
int wot;
int major, minor;
int size;
char remainder[1024];
int ret;
unsigned char *new;
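f = fopen("/proc/self/maps", "r");
if(f == NULL) {
printf("Unable to open /proc/self/maps: %s\n", strerror(errno));
exit(EXIT_FAILURE);
}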
while(8 == (ret = fscanf(f, "%08x-%08x %[^ \n] %08x %02x:%02x %08x%[^\n]", &low, &hi, flags, &wot, &major, &minor, &size, remainder))) {
// skip anything already in the 0x60000000-0x7fffffff mirror range
if((low & 0x60000000) == 0x60000000) continue;
size = hi - low;
/*
* size = hi - low;
* printf("--> %08x-%d\n", low, size);
printf("--> %08x-%08x %s %d %d:%d %d %s\n", low, hi, flags, wot, major, minor, size, remainder);
*/
// r-xp
if(flags[1] == '-' && flags[2] == 'x') {
printf("--> Duplicating 0x%08x, %d bytes long\n", low, size);
new = mmap(low+0x60000000, size, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
if(new == MAP_FAILED) {
printf("Unable to map code duplicate: %s\n", strerror(errno));
exit(EXIT_FAILURE);
}
memcpy(new, low, size);
mprotect(new, size, PROT_READ|PROT_EXEC);
}
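// remember the most recent mapping; once the loop ends this is the last
// line of the maps file, which here is the original stack
old_stack = low;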
old_stack_len = size;
}
fclose(f);
//exit(EXIT_FAILURE);
}
#define STKSIZ (4096 * 32)
// more code from drifter level11.. so I'm lazy :P
unsigned char *allocate_stack()
{
int found;
int address;
short int shift;
unsigned char *stack_ptr;
int urand_fd;
urand_fd = open("/dev/urandom", O_RDONLY);
found = 0;
while(!found) {
if(read(urand_fd, &address, 4) != 4) {
printf("Read failure on /dev/urandom: %s\n", strerror(errno));
exit(EXIT_FAILURE);
}
if(read(urand_fd, &shift, 2) != 2) {
printf("Read failure on /dev/urandom: %s\n", strerror(errno));
exit(EXIT_FAILURE);
}
shift &= 4088; // 0xff8: keep the offset inside one page, 8-byte aligned
#if 1
address &= 0x5f7fffff; // keep the address at least 8M below the 0x60000000 split
//address |= TOOBIG;
#endif
address &= ~4095; // clear page addr
stack_ptr = mmap(address, STKSIZ, PROT_READ|PROT_WRITE, MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
if(stack_ptr != MAP_FAILED) found = 1;
}
close(urand_fd);
stack_ptr = (unsigned int)(stack_ptr) + (STKSIZ) - shift;
memset(stack_ptr, 0xcc, shift);
stack_ptr--;
return stack_ptr;
}
void do_vulnerable()
{
munmap(old_stack, old_stack_len);
printf("Hello from do_vulnerable\n");
vulnerable();
do_exit();
}
int main(int argc, char **argv)
{
short int seg;
unsigned char *stack;
if(do_ldt(0, 0x60000000, MODIFY_LDT_CONTENTS_CODE) == 0) {
printf("Failed to modify ldt\n");
exit(EXIT_FAILURE);
}
seg = 7;
duplicate_code_mappings();
stack = allocate_stack();
printf("--> Will unmap old stack starting @ 0x%08x\n", old_stack);
system("cat /proc/$PPID/maps");
printf("Returning to do_vulnerable\n");
__asm__ volatile("movl %0, %%esp;\
movl %%esp, %%ebp;\
pushl %1;\
pushw %2;\
pushl %3;\
lret"
:
: "m"(stack), "m" (do_exit), "r" (seg), "r" ((unsigned int)(do_vulnerable))
);
}
The above was compiled with:
diet gcc -fno-pie -fno-stack-protector ldt_test.c -o ldt_test
The above code creates a new stack mapping underneath the cutoff point
(because the original is mapped somewhere around 0xbfff0000). In addition,
it duplicates the code layout by scanning /proc/self/maps for
mappings marked executable, and NOT writable.
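The selection rule is easy to pull out on its own. Below is a minimal sketch that parses only the fields the decision needs from one maps line; parse_maps_line and wants_mirror are hypothetical helpers, not the functions used in the program above, and only the address range and permission fields are looked at.

```c
#include <stdio.h>

/*
 * Parse the start, end and permission fields of one /proc/<pid>/maps line.
 * Returns 1 on success, 0 if the line does not match.
 */
static int parse_maps_line(const char *line, unsigned long *start,
                           unsigned long *end, char perms[5])
{
    return sscanf(line, "%lx-%lx %4s", start, end, perms) == 3;
}

/* Executable and not writable (for example "r-xp"): a candidate for mirroring. */
static int wants_mirror(const char perms[5])
{
    return perms[1] == '-' && perms[2] == 'x';
}
```

The %4s width keeps the permission string inside its five-byte buffer, and ignoring the trailing fields avoids depending on how anonymous mappings are padded.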
The vulnerable() function simulates a stack overflow whose overwritten return
address points at int3 instructions. Optionally, it can be compiled with
-DHEAP, and it will simulate a return into the heap rather than into the stack.
Watching it catch heap execute attempts:
GNU gdb 6.7.1
Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
(no debugging symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) r
Starting program: /root/drifter/ldt/ldt_wot
--> Duplicating 0x08048000, 24576 bytes long
--> Will unmap old stack starting @ 0xbfffc000
00110000-00111000 rw-p 00000000 00:00 0
08048000-0804e000 r-xp 00000000 08:01 50639 /root/drifter/ldt/ldt_wot
0804e000-08050000 rw-p 00005000 08:01 50639 /root/drifter/ldt/ldt_wot
08050000-08051000 rwxp 00000000 00:00 0
0d2a4000-0d2c4000 rw-p 00000000 00:00 0
68048000-6804e000 r-xp 00000000 00:00 0
bfffc000-c0000000 rwxp ffffd000 00:00 0
Returning to do_vulnerable
Hello from do_vulnerable
Program received signal SIGSEGV, Segmentation fault.
0x00111008 in ?? ()
(gdb) x/4i $eip
0x111008: int3
0x111009: int3
0x11100a: int3
0x11100b: int3
And stack execution attempts:
GNU gdb 6.7.1
Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
(no debugging symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) r
Starting program: /root/drifter/ldt/ldt_wot
--> Duplicating 0x08048000, 24576 bytes long
--> Will unmap old stack starting @ 0xbfff4000
00110000-00111000 rw-p 00000000 00:00 0
063c3000-063e3000 rw-p 00000000 00:00 0
08048000-0804e000 r-xp 00000000 08:01 50635 /root/drifter/ldt/ldt_wot
0804e000-08050000 rw-p 00005000 08:01 50635 /root/drifter/ldt/ldt_wot
08050000-08051000 rwxp 00000000 00:00 0
68048000-6804e000 r-xp 00000000 00:00 0
bfff4000-c0000000 rwxp ffff5000 00:00 0
Returning to do_vulnerable
Hello from do_vulnerable
Program received signal SIGSEGV, Segmentation fault.
0x063e25b1 in ?? ()
(gdb) x/4i $eip
0x63e25b1: int3
0x63e25b2: int3
0x63e25b3: int3
0x63e25b4: int3
Implementing it properly
This idea could be implemented fairly easily in the ELF loader (ld.so, not the
kernel) if it laid out the memory correctly, and probably remapped the stack to
a lower address. It would also have to hook things like mmap() and duplicate
the mappings where possible/applicable.
Anonymous memory could possibly be handled via making it disk backed, and
mapping twice from that.
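A rough sketch of the "map it twice" part, under a few assumptions of mine: the backing file is a throwaway in /tmp, and the second view is mapped read-only at a kernel-chosen address. A real implementation would map it PROT_READ|PROT_EXEC with MAP_FIXED at the data address plus 0x60000000; that is left out so the sketch also runs where /tmp is mounted noexec. map_twice is a made-up name.

```c
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

/*
 * Map the same file-backed pages twice: once writable (the "data" view) and
 * once read-only (standing in for the "code" view). Returns 0 on success.
 */
static int map_twice(size_t len, void **data_view, void **code_view)
{
    char path[] = "/tmp/mirrorXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                       /* the open fd keeps the file alive */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }
    *data_view = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    *code_view = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                          /* mappings stay valid after close */
    if (*data_view == MAP_FAILED || *code_view == MAP_FAILED) {
        if (*data_view != MAP_FAILED)
            munmap(*data_view, len);
        if (*code_view != MAP_FAILED)
            munmap(*code_view, len);
        return -1;
    }
    return 0;
}
```

Because both views are MAP_SHARED mappings of the same file, anything written through the data view is immediately visible through the other one, which is what memcpy into an anonymous mirror can't give us.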
Rather than using memcpy, it should properly parse /proc/<pid>/maps and mmap()
shared libraries from their backing files.
Dynamically generated code could be handled by marking regions non-writable
when something attempts to execute them. On an attempt to write there again,
the region would be unmapped from the executable half and marked writable
again. Self modifying
code would be handled via this mechanism, albeit slowly.
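The policy for generated code boils down to a two-state machine per region. The sketch below models only the state transitions; it does not install fault handlers or call mprotect(), and all the names in it are hypothetical. The transition counter is there to show why self modifying code would be slow: every flip stands for a remap.

```c
/* Hypothetical per-region state for dynamically generated code. */
enum region_state {
    REGION_WRITABLE,    /* present in the data half only, writes allowed */
    REGION_EXECUTABLE   /* mirrored into the code half, writes trapped   */
};

struct region {
    enum region_state state;
    unsigned transitions;   /* each one stands for a remap round trip */
};

/* An attempt to execute: freeze the region and expose it to the code half. */
static void on_exec_attempt(struct region *r)
{
    if (r->state != REGION_EXECUTABLE) {
        r->state = REGION_EXECUTABLE;
        r->transitions++;
    }
}

/* An attempt to write: drop the mirror and allow writes again. */
static void on_write_attempt(struct region *r)
{
    if (r->state != REGION_WRITABLE) {
        r->state = REGION_WRITABLE;
        r->transitions++;
    }
}
```

Code that alternates between writing and executing the same region pays for one transition on every switch, while repeated executes or repeated writes cost nothing extra.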
Weaknesses
- The duplicated code sections can't be completely the same, as we can't
  duplicate anonymous mappings, etc., if they were to become executable. A
  workaround exists by making them disk backed, however.
- You could bypass it by returning to a retf instruction with a GDT code
  selector on the stack, which would load the CS descriptor from the GDT,
  which is probably marked 0-4G executable :p
- Probably a bunch of other weaknesses.
- Doing it from userland is pretty lame :P
Other uses for modify_ldt() syscall
- Userland process scheduling (another article for another day, maybe an
  outline soon).
- Since debuggers on 32-bit platforms probably assume a flat memory layout,
  it could be feasible to use different CS/DS/SS/whatever selectors as a
  minor obfuscation mechanism.