Stack buffer overflow
What is a stack buffer overflow?
A stack buffer overflow is a bug where too many bytes are written to a very specific part of memory: the stack.
I'll give you an example from everyday life.
Imagine yourself in front of a coffee machine that offers several choices, including Ristretto and Latte macchiato.
It is possible to have a Ristretto in a Latte macchiato cup (because a 400 ml cup can hold 22 ml of coffee).
But if you want a Latte macchiato in a Ristretto cup, it will overflow (because the 50 ml cup cannot hold 360 ml of coffee).
What does cybersecurity have to do with coffee? Let's continue the analogy and identify the objects and actors:
Coffee | Hacking |
---|---|
You take a coffee | You are an attacker |
A cup is used | The stack is used |
You use the coffee machine | You use a vulnerable function |
You pour coffee inside the cup using the machine | You inject bytes in the stack using the vulnerable function |
It overflows everywhere because the cup is not big enough | It overflows because the buffer on the stack is not big enough |
Unlike coffee, the bytes do not spill everywhere: they are still stored on the stack, because there is always memory beyond the end of the buffer.
The excess bytes then overwrite important data on the stack.
Is it dangerous?
Oh yes, it is!
It can easily crash a program or a service, leading to a denial of service, and in some cases it allows the attacker to execute arbitrary code.
If the stack holds addresses that point to executable code, it is possible to overwrite these addresses to hijack control flow: assembly instructions pop data from the stack into registers, and if the RIP/EIP register is loaded with stack data, it is then possible to redirect execution.
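To make the overwrite concrete, here is a minimal Python sketch that models a stack frame as a byte array. It is only a simulation: the buffer size, the saved RBP value and both addresses are made-up values for illustration, not taken from a real binary.

```python
import struct

BUF_SIZE = 16  # hypothetical buffer size, chosen for illustration


def make_frame(saved_rbp, return_address):
    """Model a frame as [buffer | saved RBP | return address], lowest address first."""
    return bytearray(BUF_SIZE) + struct.pack("<QQ", saved_rbp, return_address)


def unbounded_copy(frame, data):
    """Copy data from the start of the buffer without checking BUF_SIZE, like gets()."""
    frame[:len(data)] = data


def read_return_address(frame):
    """Return the 8-byte little-endian value stored in the return-address slot."""
    return struct.unpack_from("<Q", frame, BUF_SIZE + 8)[0]


# Hypothetical values for the saved RBP and the legitimate return address.
frame = make_frame(saved_rbp=0x7FFFFFFFE000, return_address=0x401136)

unbounded_copy(frame, b"A" * 8)  # fits in the buffer: nothing else is touched
intact = read_return_address(frame)

# Too long: fill the buffer, run over saved RBP, then replace the return address.
unbounded_copy(frame, b"A" * (BUF_SIZE + 8) + struct.pack("<Q", 0x4011AA))
hijacked = read_return_address(frame)

print(hex(intact), hex(hijacked))
```

The first copy leaves the return address alone; the second one replaces it with a value chosen by whoever supplied the input, which is exactly what a RET would then load into RIP.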
Mitigations
There are several ways to prevent stack buffer overflows or to limit their impact:
- Secure vulnerable functions or do not use them at all: this is the case of the gets() function from LIBC, whose manual explicitly states: "Never use gets(). [...] Use fgets() instead."
- Protect the binary from stack smashing: there is a protection against stack smashing called a canary, a random value stored on the stack before the saved base pointer value. Before leaving a function, the program checks that the canary value is the same as the one set during the initialization phase. If it is not, a buffer overflow has been detected and the program stops, preventing the attacker from doing more damage. To protect all functions in the program, you can use the -fstack-protector-all argument of the GCC compiler.
- Randomize the address space: ASLR (Address Space Layout Randomization) randomizes the addresses at which memory regions are mapped. This makes finding valid addresses more difficult and control flow hijacking harder. To enable full randomization, you need to be root and write the value 2 to the /proc/sys/kernel/randomize_va_space file. The value 1 randomizes the stack, the mmap base (and therefore shared libraries), the VDSO page and position-independent executables; the value 2 additionally randomizes the heap.
- Forbid the stack to be executable: the NX (No eXecute) bit forbids the area of memory where the stack is located from being executable. It is then harder for an attacker to find an executable memory area. To enable NX for the stack, you can pass the -z noexecstack argument to the GCC compiler, which forwards it to the linker.
- Use SAST (static analysis) and DAST (dynamic analysis) tools: cppcheck [SAST], Splint [SAST], ASAN (AddressSanitizer) [DAST], ...
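The canary check can be sketched in a few lines of Python. This is a simulation under stated assumptions, not what GCC actually emits: the frame layout, the 16-byte buffer and the return address are hypothetical, and the canary is just 8 random bytes placed between the buffer and the saved RBP.

```python
import secrets
import struct

BUF_SIZE = 16  # hypothetical buffer size, chosen for illustration


def make_protected_frame(return_address):
    """Lay out [buffer | canary | saved RBP | return address]; return (frame, canary)."""
    canary = secrets.token_bytes(8)  # random value picked when the frame is set up
    frame = bytearray(BUF_SIZE) + canary + struct.pack("<QQ", 0, return_address)
    return frame, canary


def canary_intact(frame, canary):
    """Check done before returning: does the canary slot still hold its original value?"""
    return bytes(frame[BUF_SIZE:BUF_SIZE + 8]) == canary


frame, canary = make_protected_frame(return_address=0x401136)  # hypothetical address

frame[:8] = b"A" * 8  # write that stays inside the buffer
ok_after_safe_write = canary_intact(frame, canary)

# To reach the return address, an overflow has to run over the canary first.
payload = b"A" * (BUF_SIZE + 8 + 8) + struct.pack("<Q", 0x4011AA)
frame[:len(payload)] = payload
ok_after_overflow = canary_intact(frame, canary)

print(ok_after_safe_write, ok_after_overflow)
```

Because the canary sits between the buffer and the saved values, a linear overflow cannot reach the return address without changing it, and the attacker would have to guess the random value to go unnoticed.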
Give me an example of a stack buffer overflow
Here is a basic piece of C source code that calls the gets() function from LIBC.
It stores bytes from the user's standard input in the stack frame of main():
I compiled the 64-bit binary with GCC (LIBC linked statically) and I disabled the ASLR, NX bit and canary protections:
[Listing: script.sh]
As seen previously, the gets() function is vulnerable to buffer overflows. It should crash if I inject too many bytes:
I used pwndbg to analyze the crash:
The RET instruction at address 0x555555555168 <main+31> pops the value pointed to by RSP into RIP.
It is pretty straightforward: the RSP register points to "will crash", which is not a valid executable address, so the program crashes.
What is really happening here? Let's disassemble main() and draw the stack to understand better:
0x10 bytes are allocated (at 0x0000555555555151 <+8>) for the stack frame of main().
The argument of the gets() function, the address of location [RBP-0xc], is loaded into the RAX register (at 0x0000555555555155 <+12>).
I drew the stack for you, before and after the gets() call:
As you can see, gets() can theoretically write an unlimited number of bytes. Thus, it is possible to write whatever we want over the saved RBP and RIP values on the stack.
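The offsets involved can be worked out from the disassembly alone. In the sketch below, the only values taken from the analysis are the buffer location (RBP-0xc) and the 8-byte size of the saved RBP and RIP on a 64-bit binary; the filler bytes are my assumption, since any 20 bytes would do.

```python
BUFFER_OFFSET_FROM_RBP = 0xC  # gets() writes starting at RBP-0xc (from the disassembly)
SAVED_RBP_SIZE = 8            # 64-bit saved base pointer, stored at [RBP]
RETURN_ADDRESS_SIZE = 8       # 64-bit saved RIP, stored at [RBP+8]

# Distance, in bytes, from the start of the buffer to each saved value.
offset_to_saved_rbp = BUFFER_OFFSET_FROM_RBP
offset_to_return_address = offset_to_saved_rbp + SAVED_RBP_SIZE

# Assumed input: filler up to the saved RIP slot, then the marker text.
user_input = b"A" * offset_to_return_address + b"will crash"

lands_in_saved_rbp = user_input[offset_to_saved_rbp:offset_to_return_address]
lands_in_return_slot = user_input[
    offset_to_return_address:offset_to_return_address + RETURN_ADDRESS_SIZE
]

print(offset_to_saved_rbp, offset_to_return_address)
print(lands_in_saved_rbp, lands_in_return_slot)
```

With this layout, the first 12 bytes fill the buffer, the next 8 replace the saved RBP, and only the first 8 bytes of the marker end up in the saved RIP slot.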
Note that "will crash\0" has the hexadecimal value 0x006873617263206c6c6977 (in little-endian) and that RET tries to load RIP with "will cra", whose hexadecimal value is 0x617263206c6c6977: because this is not a valid address (an address that points to an executable memory area), a SIGSEGV occurs, leading to a crash.
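These little-endian values are easy to double-check with the Python standard library. The sketch below only converts between the bytes and the integers quoted above; the variable names are mine.

```python
import struct

# Bytes as they sit in memory, lowest address first.
return_slot_bytes = b"will cra"
whole_string_bytes = b"will crash\x00"

# Little-endian: the byte at the lowest address is the least significant one.
return_slot_value = int.from_bytes(return_slot_bytes, "little")
whole_string_value = int.from_bytes(whole_string_bytes, "little")

# Going the other way: pack a 64-bit value into the 8 bytes a RET would read.
packed = struct.pack("<Q", 0x617263206C6C6977)

print(hex(return_slot_value))
print(hex(whole_string_value))
print(packed)
```

The same struct.pack("<Q", ...) call is how a chosen 64-bit address would be turned into the 8 bytes placed over the saved RIP.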
Cool exploit
Building on the previous example, it is even possible to execute arbitrary code using the system() function from LIBC: its base address can be read directly from the process's virtual memory mapping because ASLR is off.
In the example below, I launched a YouTube video, using the pwntools Python module to craft my payload:
And here is the beautiful result: