May 07 21:12:04 nnp	 The focus of these lectures will be on finding and
May 07 21:12:04 nnp	exploiting vulnerabilities which can cause a security risk
May 07 21:12:33 nnp	 The aim of the lectures is to bring a software exploitation newbie to the point where they can confidently discover and exploit new vulnerabilities and
May 07 21:12:39 nnp	also have the foundation of knowledge to continue with more advanced
May 07 21:12:48 nnp	techniques.
May 07 21:13:07 nnp	The first few lectures are going to be linux and open source specific. Once everyone is confident in these areas we'll then move onto exploiting closed source and MS Windows based software.
May 07 21:13:21 nnp	The aim of this first lecture is to get everyone on a common level of knowledge. We will be covering ....
May 07 21:13:26 nnp	1) Linux memory structures
May 07 21:13:30 nnp	2) Classic buffer overflows
May 07 21:13:35 nnp	3) Basics of ShellCoding
May 07 21:13:40 nnp	4) An introduction to debugging with GDB
May 07 21:13:47 nnp	1.1 Linux Memory Structures
May 07 21:14:21 nnp	Before we can begin exploiting applications we need a basic knowledge of how they appear in memory. As many of you may know, programs have several distinct sections. These are the stack, heap, bss, data and code. Each of these sections has a specific use and all have different properties.
May 07 21:14:33 nnp	Code: Contains the code of the program (duh! ;) ). It is read only
May 07 21:14:35 *	Cyph3r (opera@bu-97721158.karoo.kcom.com) has joined #lecture
May 07 21:14:37 nnp	Data: The data section contains initialised global variables
May 07 21:14:43 nnp	BSS: The bss section contains uninitialised global variables
May 07 21:14:49 nnp	Heap: Used to store dynamically allocated variables (e.g via malloc() in
May 07 21:14:55 nnp	C or new in C++)
May 07 21:15:14 nnp	Stack: Used to store local variables, function arguments and data used in directing the flow of the application (it is here we will be focusing our attention for now)
May 07 21:15:38 nnp	The addresses available to your program start at 0x00000000 and grow towards a maximum e.g 0xbfffffff.
May 07 21:15:58 nnp	The heap grows from lower addresses towards higher ones and the stack starts at high memory addresses and grows downwards. It's very important to remember this and it can be quite confusing at first
May 07 21:16:18 nnp	 For example. To make space for 8 bytes of local variables (i.e make the stack bigger) the instruction would look like
May 07 21:16:25 nnp	sub    $0x8,%esp
May 07 21:16:30 nnp	not add $0x8, %esp as you might expect.
May 07 21:16:51 nnp	it looks something like this ......
May 07 21:17:07 nnp	0xffffffff ---stack -----> <----Unused memory -----> <-----Heap---- 
May 07 21:17:31 nnp	One other source of confusion can be the direction in which variables are written on the stack. e.g if you have space reserved for a string of size 0x4 at 0xbffff484 it would start at 0xbffff484 and finish at 0xbffff487, not 0xbffff481.
May 07 21:17:50 nnp	 i.e they still start at low and go to high even though the stack is growing in the opposite direction. Why is this good for us?
May 07 21:18:18 nnp	Well if we can make the string grow beyond the space we have reserved for it we will be overwriting and controlling previously stored variables as opposed to unused memory.
May 07 21:18:26 *	nnp sets mode -m #lecture
May 07 21:18:30 nnp	any questions so far?
May 07 21:18:33 nnp	am i going too slow?
May 07 21:18:36 nnp	or too fast?
May 07 21:18:48 Cyph3r	im getting you :)
May 07 21:19:01 nnp	k...
May 07 21:19:08 *	nnp sets mode +m #lecture
May 07 21:19:22 nnp	So what do I mean by 'direct the flow of the application'? First of all let me refresh your memories on some basic asm.
May 07 21:19:45 nnp	To keep track of what instruction the cpu should execute next a register called the EIP(Extended Instruction Pointer) is used....i.e if you can control this EIP then you can basically control what the cpu does
May 07 21:19:52 nnp	This will be our
May 07 21:19:52 nnp	aim.
May 07 21:20:21 nnp	when you make a function call (eg, strcpy(buf1, buf2)) the flow of execution jumps into the strcpy function
May 07 21:20:34 nnp	When this completes you must know where to continue
May 07 21:20:44 nnp	 i.e the instruction directly after the call to strcpy
May 07 21:20:52 nnp	 How does this magic occur? What happens is this:
May 07 21:21:09 nnp	The arguments to the function are pushed onto the stack. Just before the execution jumps to the strcpy function the current value of the EIP register is push'ed onto the stack
May 07 21:21:18 nnp	 Then execution jumps to the strcpy function
May 07 21:21:30 nnp	The stack then looks somewhat like this.....
May 07 21:21:35 nnp	|src                            |
May 07 21:21:40 nnp	|dest                           |
May 07 21:21:44 nnp	|stored eip(return address)     | <----ESP points here
May 07 21:22:12 nnp	In strcpy the EBP is push'ed onto the stack. (The EBP (Extended Base Pointer) stores a reference point used to address both the arguments and the local variables of the function in question, and every function has its own base pointer.)
May 07 21:22:52 nnp	The parameters are the start of the stack frame for that function. The stack frame contains the parameters, return information and the local variables.
May 07 21:23:14 nnp	Every invocation of a C function creates a new stack frame on the stack.
May 07 21:23:17 *	R4d30N (admin@31F8B797.6C483A91.B71335E7.IP) has joined #lecture
May 07 21:23:31 nnp	so once the function is entered the stack looks like this
May 07 21:23:40 nnp	|src            |
May 07 21:23:41 nnp	|dest           |
May 07 21:23:45 nnp	|stored eip     |
May 07 21:23:51 nnp	|stored ebp     | <----ESP points here
May 07 21:24:14 nnp	When the strcpy function is done it pops the stored values of EBP and EIP back into the EBP and EIP registers respectively, then execution continues after the call.
May 07 21:24:23 *	nnp sets mode -m #lecture
May 07 21:24:30 nnp	everyone still with me?
May 07 21:24:37 nnp	(sup r4d30n)
May 07 21:24:53 *	ellipsis has quit (Quit: Art + Programming = Insanity)
May 07 21:24:57 R4d30N	Aight man,said id stop by
May 07 21:24:58 nnp	i'll take that as a yes
May 07 21:25:04 *	nnp sets mode +m #lecture
May 07 21:25:17 nnp	This could happen many times e.g the strcpy function could call some other function and so on etc etc.
May 07 21:25:28 nnp	It's important to know how _exactly_ this happens and what the stack looks like so take a look at this code.
May 07 21:25:50 nnp	http://silenthack.co.uk/lectures/lecture1/stackex.txt
May 07 21:26:15 nnp	you can compile the above with
May 07 21:26:24 nnp	gcc -g -static -o lec11 lec11.c
May 07 21:26:49 nnp	Now to get a complete understanding we must look at the asm this C code compiles to. To do this we need a debugger and on linux the best available is GDB. It is freely available and is included on most distros by default.
May 07 21:26:57 *	nnp sets mode -m #lecture
May 07 21:27:05 nnp	in case anyone has any immediate questions
May 07 21:27:08 d03boy	Question: Does each program get its own stack and heap and stuff? I'm used to working on motorola chips which dont have separate programs so you only have one stack.
May 07 21:27:13 nnp	yes
May 07 21:27:20 nnp	every program has its own sections
May 07 21:27:28 nnp	completely independent of each other
May 07 21:27:45 nnp	here is the disassembly of the above code
May 07 21:28:04 nnp	http://silenthack.co.uk/lectures/lecture1/disasstackex.txt
May 07 21:28:27 nnp	it also contains some comments to explain whats going on if your asm isnt so good
May 07 21:29:01 nnp	i should also mention now, gdb uses at&t syntax by default. to make it use intel syntax issue the command
May 07 21:29:14 nnp	 set disassembly-flavor intel
May 07 21:29:40 nnp	any questions on that code?
May 07 21:29:57 elmo_	Question: ``set disassembly-flavor intel`` do I do that as an argument while starting GDB or in a configuration file?
May 07 21:30:17 nnp	you can do it in a config file or just issue it from the gdb command line
May 07 21:30:57 nnp	ok...so on to shellcode
May 07 21:31:04 nnp	or at least the basics anyway
May 07 21:31:28 nnp	firstly if anyone is actually following this code and attempting it can i get you to check /proc/sys/kernel
May 07 21:31:38 nnp	for anything that looks like randomize_va_space
May 07 21:31:51 nnp	if you find it issue the following command: echo 0 > /proc/sys/kernel/randomize_va_space
May 07 21:32:00 nnp	that will turn off stack randomization for now
May 07 21:32:10 nnp	after the lecture you can echo 1 back into that file
May 07 21:32:32 nnp	So, this brings us to the first of our many attacks.
May 07 21:32:48 nnp	You should see that if we can somehow change the stored eip to point to code that we control we can effectively do what we want
May 07 21:33:12 nnp	e.g if we could store the instructions system("/bin/sh") and then change the stored eip to point to it, instead of returning back into the main function a shell would be spawned. Cool, no?
May 07 21:33:31 nnp	So for our first real example we will just redirect the eip to some arbitrary address just so you know I'm telling the truth ;)
May 07 21:33:41 nnp	look at the following code.....
May 07 21:33:55 nnp	http://silenthack.co.uk/lectures/lecture1/changeeip1.txt
May 07 21:34:32 nnp	as you can see from the comments if you attempt to run the above program it seg faults
May 07 21:34:52 nnp	why? well what the above code does is very simple. The int * i is a local variable
May 07 21:34:58 nnp	and therefore stored on the stack
May 07 21:35:09 nnp	So if you remember from earlier the stack would look like this
May 07 21:35:14 nnp	|stored eip|
May 07 21:35:19 nnp	|stored ebp|
May 07 21:35:23 nnp	|int * i   |
May 07 21:35:24 *	Ch4r (Ch4r@bu-BBA3FD18.hsd1.ca.comcast.net) has joined #lecture
May 07 21:35:24 *	ChanServ sets mode +q #lecture Ch4r
May 07 21:35:24 *	ChanServ gives channel operator status to Ch4r
May 07 21:35:41 Ch4r	whoa, lots of p eople here? The lecture isn;t for another twenty minutes, right?
May 07 21:35:49 nnp	so we give i its own address and then add 2 to it. Since i is a pointer to an int adding two has the effect of increasing the address stored in it by 8 bytes
May 07 21:35:56 nnp	um, 9pm gmt?
May 07 21:35:57 nnp	no?
May 07 21:36:02 Ch4r	9 PM GDT
May 07 21:36:04 Ch4r	err
May 07 21:36:07 Ch4r	10 PM GDT
May 07 21:36:10 nnp	?
May 07 21:36:20 Ch4r	9 PM GMT, 10 PM GDT
May 07 21:36:36 nnp	yea....its 21:36 gmt now
May 07 21:36:37 d03boy	it started 37 minutes ago
May 07 21:36:54 nnp	isnt it? fuck, is it?
May 07 21:37:09 elmo_	the time is right, go on
May 07 21:37:16 nnp	right oh....
May 07 21:37:37 nnp	 What's 8 bytes above i in memory? You got it, the stored eip. We put 0 in the address pointed to by i (the stored eip) and when
May 07 21:37:47 nnp	the program tries to use the stored eip to continue execution the value 0x0 is put into the eip
May 07 21:38:02 nnp	This address isn't mapped, let alone executable, and so the program segmentation faults.
May 07 21:38:11 nnp	Lets make this a little more useful. Lets spawn a shell.
May 07 21:38:20 nnp	One way of doing this is using the execve system call.
May 07 21:38:30 nnp	The problem with this is the cpu has no idea how to interpret C code
May 07 21:38:38 Ch4r	nnp: sorry, to interrupt, but.. http://odin.gi.alaska.edu/lumm/GMTclock/ ;/
May 07 21:38:58 nnp	 The cpu needs machine instructions. To get these instructions we write the code we need executed in some asm variant and then get the opcodes using a program such as objdump or hexedit. 
May 07 21:39:30 nnp	ch4r: sorry about that 
May 07 21:39:49 nnp	 So....we need code to execute the system call. Pretty simple really. Shellcode is basically a sequence of asm
May 07 21:39:59 nnp	commands which usually spawn some form of shell(hence the name).
May 07 21:39:59 Cyph3r	well im in UK and its 9:39...
May 07 21:40:21 nnp	To do this we have to use system calls (as in calls to the system not the function system() from whatever C library you're using).
May 07 21:40:32 nnp	You can get the numbers of the various system calls from <asm/unistd.h>
May 07 21:40:45 nnp	execve() happens to be sys call 11. From the man page we can see that the execve call looks like this
May 07 21:41:15 nnp	(oh, forgot to say, in case you don't know, the linux kernel and most others identify sys calls via numbers, not names)
May 07 21:41:21 nnp	int  execve  (const  char  *filename, char *const argv [], char *const envp[]);
May 07 21:41:34 nnp	argv is a null terminated list of the program's args (includes the program name) and envp can just be null for our purposes
May 07 21:41:41 nnp	the filename is the program we want to execute
May 07 21:41:48 *	elmo_ has quit (Ping timeout)
May 07 21:42:00 *	elmo_ (elmo@bu-3E871720.res.east.verizon.net) has joined #lecture
May 07 21:42:01 nnp	So we need to code a call to this in asm (nasm) and then get the instructions from this. Firstly, it usually helps to code up our shellcode in C so here is the 'classic' example
May 07 21:42:22 nnp	http://silenthack.co.uk/lectures/lecture1/shell1.txt
May 07 21:42:50 nnp	the nasm code follows it
May 07 21:43:07 nnp	there are a few interesting things in there
May 07 21:43:30 nnp	firstly the jmp at the start which goes to a call....
May 07 21:44:05 nnp	the reason for this is that we need to be able to get the address of that buffer despite the fact we will have no idea (well, pretty much) where we will be on the stack
May 07 21:44:11 *	nivek (Ze@EB549BD8.2202F3B8.D10BF7C7.IP) has joined #lecture
May 07 21:44:21 nnp	the call instruction puts the return address (i.e where buf is) onto the stack
May 07 21:44:30 nnp	this is then pop'ed off into esi
May 07 21:44:47 nnp	this is used then as a base for the buffer
May 07 21:45:12 nnp	We can compile the previous code with the line
May 07 21:45:15 nnp	nasm -f elf shell.asm
May 07 21:45:32 nnp	To then obtain the shellcode from this we disassemble the object file using objdump -d shell.o
May 07 21:45:53 nnp	this gives the output http://silenthack.co.uk/lectures/lecture1/objdump1.txt
May 07 21:46:16 elmo_	not found
May 07 21:46:17 nnp	hmm, broken link...
May 07 21:46:18 nnp	one sec
May 07 21:47:55 nnp	k, try now
May 07 21:48:39 nnp	The first column is the offset from the start in bytes. The second is the machine code (the raw instruction bytes) and the final column is what objdump thinks the corresponding asm instructions are
May 07 21:49:11 nnp	What we will do is use the overflow to write the machine instructions onto the stack. By altering the stored eip to point to the start of these instructions we get them executed
May 07 21:49:22 nnp	There are problems with the above instructions though,
May 07 21:49:44 nnp	you'll notice there are several sequences of 00 i.e null bytes. The problem with this is that the majority of basic buffer overflow vulnerabilities exist in string manipulation functions e.g strcpy.
May 07 21:50:03 nnp	When strcpy encounters a null byte it will stop copying, as it should, because it thinks it's encountering the end of the string. We will first need to remove these by substituting the
May 07 21:50:08 nnp	offending instructions with ones that don't generate null bytes
May 07 21:50:15 nnp	this results in the following code....
May 07 21:50:43 nnp	http://silenthack.co.uk/lectures/lecture1/shell2.txt
May 07 21:51:17 nnp	hmm, there should be the following line at the start of that
May 07 21:51:20 nnp	segment .text
May 07 21:51:32 nnp	any questions regarding the optimisations in that?
May 07 21:52:15 nnp	We use objdump -d on the object file from the above code and then just copy out the machine instructions
May 07 21:52:22 nnp	that results in the following shellcode
May 07 21:52:41 nnp	\xeb\x1e\x5e\x31\xc0\x88\x46\x07\x89\x76\x08\x89\x46\x0c\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xb0\x0b\xcd\x80\x31\xdb\xb0\x01\xcd\x80\xe8\xdd\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68\x46\x55\x55\x55\x55\x4b\x4b\x4b\x4b
May 07 21:52:53 nnp	I have added '\x' before each of the bytes so that C will interpret them as hex.
May 07 21:53:12 nnp	Now we need to execute the same C program as before but this time instead of having the stored eip point to 0x0 it will point to our shellcode
May 07 21:53:35 nnp	http://silenthack.co.uk/lectures/lecture1/shell3.txt
May 07 21:53:56 nnp	and the result of running the above program is surprise, surprise...a shell :P
May 07 21:53:57 d03boy	can you convert that to ascii for us so we know what it says?
May 07 21:54:28 nnp	all that is is the result of me compiling http://silenthack.co.uk/lectures/lecture1/shell2.txt
May 07 21:54:31 nnp	that code there
May 07 21:54:38 d03boy	oh, gotcha
May 07 21:54:39 nnp	with nasm -f elf shell.asm
May 07 21:54:49 nnp	then running objdump -d shell.o
May 07 21:54:56 nnp	and from that parsing out the machine instructions
May 07 21:55:11 nnp	(the second column)
May 07 21:55:27 nnp	any other questions?
May 07 21:55:50 nnp	ok...so....onto the final section for tonight.....basic buffer overflows
May 07 21:56:12 nnp	We'll now explore how to gain control of the eip in an actual program
May 07 21:56:19 nnp	Consider the following code
May 07 21:56:29 nnp	char buf1[16];
May 07 21:56:31 nnp	strcpy(buf1, argv[1]);
May 07 21:56:53 nnp	the aim of this code is to copy the first user provided argument into buf1
May 07 21:57:05 nnp	But what happens if the user inputs more than 16 bytes?
May 07 21:57:19 nnp	Their input will overflow the end of buf1 and continue overwriting data previously stored on the stack.
May 07 21:57:28 nnp	 Try it out
May 07 21:57:34 nnp	./vuln1 `perl -e 'print "A"x16;'`
May 07 21:57:38 nnp	That should run just fine
May 07 21:57:44 nnp	whereas....
May 07 21:57:48 nnp	./vuln1 `perl -e 'print "A"x32;'`
May 07 21:57:54 nnp	should seg fault
May 07 21:58:01 nnp	The reason? We just overwrote the stored EIP with A's.
May 07 21:58:10 nnp	The char A is 0x41 and so our program tried to return to 0x41414141
May 07 21:58:21 nnp	again not a valid address for execution
May 07 21:58:51 nnp	Now, we need to get a shell from this. We're going to do it first of all just using the command line.
May 07 21:59:08 nnp	So we store the shellcode from earlier as an environment variable
May 07 21:59:16 nnp	on linux this is done via the following command
May 07 21:59:22 nnp	export SHELLCODE="SHELLCODE"
May 07 21:59:33 nnp	where the quoted "SHELLCODE" stands for our shellcode bytes from earlier
May 07 21:59:39 nnp	We can then find the address of the shellcode using the following program.
May 07 22:00:21 nnp	k...one sec....my connection just went nuts
May 07 22:00:36 nnp	http://www.nostarch.com/extras/hacking/chap2/getenvaddr.c
May 07 22:00:59 nnp	the above code is a simple C program that uses getenv to find the address of an environment variable
May 07 22:01:23 nnp	There's something you should know about the address returned by this code though. It is specific to the environment of ./getenvaddr.
May 07 22:01:39 nnp	Where the shellcode is in memory will change with respect to the length of the program name you are running
May 07 22:01:51 nnp	 For every character extra the program name has the shellcode will be located 2 bytes lower in memory and vice versa.
May 07 22:02:18 nnp	For example, call our program vuln1, which is 5 characters shorter than getenvaddr.
May 07 22:02:29 nnp	The shellcode will therefore be located 0xA bytes higher in memory than the address returned by getenvaddr
May 07 22:02:44 nnp	If you think about it this makes sense. The program's filename is also stored on the stack alongside the environment variables and will therefore have an effect on where they are located.
May 07 22:02:53 nnp	Using this info we can exploit the program as follows
May 07 22:03:16 nnp	if getenvaddr returns the address 0xAABBCCDD
May 07 22:03:28 nnp	./vuln1 `perl -e 'print "AAAAAAAAAAAAAAAAAAAA\xDD\xCC\xBB\xAA";'`
May 07 22:03:42 nnp	The 4 bytes of the address must be reversed because the intel architecture uses little endian byte ordering
May 07 22:03:55 nnp	This is normally taken care of for us by the compiler and the cpu but since we are writing the bytes directly onto the stack it's up to us
May 07 22:04:10 nnp	Any questions?
May 07 22:04:17 nnp	(anyone even there :P )
May 07 22:04:28 d03boy	ya
May 07 22:04:31 l3thal	ya
May 07 22:04:36 qwertydawom	no questions here
May 07 22:04:39 nnp	k, onwards then
May 07 22:04:48 nnp	how do we automate this kind of exploit?
May 07 22:05:03 nnp	 Well we can use the execl() call to execute this program and pass in a crafted buffer as the argument.
May 07 22:05:12 nnp	For example, suppose the vulnerable call is as follows
May 07 22:05:17 nnp	char buffer[512];
May 07 22:05:21 nnp	strcpy(buffer, argv[1]);
May 07 22:05:26 nnp	our crafted buffer would have to look something similar to the following
May 07 22:05:40 nnp	[516 - strlen(shellcode) of padding][shellcode][return address (start of shellcode)]
May 07 22:05:58 *	R4d30N has quit (Connection reset by peer)
May 07 22:06:16 nnp	As you can see, the above buffer would require us to get the return address exactly right, this could be quite difficult due to differences in compilers and what not.
May 07 22:06:28 nnp	We could brute force the address but on a remote system this would be less than ideal.
May 07 22:06:39 nnp	We have methods to increase our chances of success though.
May 07 22:06:56 nnp	One method is by using a NOP sled. A NOP is an asm instruction (\x90) that does nothing.
May 07 22:07:28 nnp	It is primarily used to waste cpu cycles for timing purposes but for us we can use it as a sort of fudge factor.
May 07 22:07:53 nnp	We put as many of these before the shellcode as we can fit (we also need other stuff so we can't fill it completely) and if we return anywhere within this sled the cpu will simply cycle
May 07 22:07:59 nnp	through them till it encounters the start of our shellcode. 
May 07 22:08:19 nnp	We will also repeat the return address at the end of the buffer to ensure we overwrite the saved eip.
May 07 22:08:35 nnp	Sometimes it is hard to predict where exactly the stored eip will be due to compiler optimizations and padding. So our final buffer will look something like this
May 07 22:08:41 nnp	[200 or so NOPS][Shellcode][Repeated return address]
May 07 22:09:18 nnp	We will craft this using the following C code(lifted from The Art of Exploitation by Jon Erickson which was in turn lifted from Smashing the stack for fun and profit by aleph1)
May 07 22:09:23 nnp	;)
May 07 22:09:53 nnp	http://silenthack.co.uk/lectures/lecture1/buffer.txt
May 07 22:10:21 nnp	k, i'll give ye a minute or two to look at that
May 07 22:11:19 nnp	k
May 07 22:11:56 nnp	As a side note, it is vital that your return address be aligned on a 4 byte boundary (i.e ensure the offset into the array where you start writing is divisible by 4), otherwise the copies of the return address will be misaligned and the stored eip will be overwritten with a scrambled value
May 07 22:12:35 nnp	One question many people have initially is regarding the sp() function above that returns the stack pointer for the current program and how could this address be used as the return address in the vulnerable program
May 07 22:12:54 nnp	The answer is that most programs have a very similar stack structure. This combined with the fudge factor provided by the NOPs is usually enough to land us somewhere in the NOP sled.
May 07 22:13:26 nnp	If it doesn't though, you can simply pass an offset to the above program, which will subtract this value from the address returned. This would be useful if the buffer we were overflowing was located further up the stack (lower in memory addresses) than our exploit buffer provided for.
May 07 22:13:55 nnp	Well that pretty much concludes this lecture. The aim was to lay the ground work for more advanced and more interesting material. In the next lecture I will cover more advanced techniques.
May 07 22:14:10 nnp	I would strongly recommend everyone have a read of aleph1's paper 'Smashing the Stack for Fun and Profit'.