Thilan Dissanayaka Low level Development Mar 23

Debugging Binaries with GDB

GDB, the GNU Debugger, ships with the GNU toolchain and is a widely used debugging tool in Linux environments.

In our previous protostar stack0 walkthrough tutorial, we used GDB many times.

So in this post, I'm going to explain how to use GDB to debug and analyze a binary file. If you are planning to learn reverse engineering, malware analysis, or exploit development, you must be familiar with debuggers.

To understand disassembly, the stack, and related concepts, I suggest you read the following tutorials:

Introduction to assembly language
Stack architecture theory

Starting the debugging

We can open a binary inside GDB with the command gdb ./[binary_file], where binary_file is the name of the file we want to debug. You should see the following screen after running this command.

[Image: GDB main interface on Kali]

However, we usually don't need this banner. If it gets in your way, start GDB in quiet mode with the -q option, which suppresses the welcome banner.

gdb -q ./stack0

So guys, our next step is to disassemble the binary and understand the architecture of the program.

Disassemble a binary

There are two main Assembly syntax styles called Intel syntax and AT&T syntax. In the following image, you can see both of them.

[Image: Intel and AT&T assembly syntax]

You can select one of them as your preference. I think Intel syntax is clear and easy to understand. So in my disassembly, I prefer to use Intel Assembly syntax.

By default, GDB uses AT&T assembly syntax. We can switch to Intel assembly syntax by entering the following command.

set disassembly-flavor intel

If you want to switch back, use the command set disassembly-flavor att.

If you find it tedious to switch syntax every time you start GDB, you can make Intel syntax the default by adding the command to the .gdbinit file in your home folder. The following command appends the setting with >>, so any existing content of the file is preserved:

echo 'set disassembly-flavor intel' >> ~/.gdbinit

After you load a binary in GDB, you can disassemble a function and see its assembly code with disassemble function_name. GDB accepts any unambiguous abbreviation of a command, so disas function_name and disass function_name work as well. For example, to disassemble the main function, you can use disassemble main or disas main.

    (gdb) disass main
    Dump of assembler code for function main:
    0x080483f4 <main+0>: push ebp
    0x080483f5 <main+1>: mov ebp,esp
    0x080483f7 <main+3>: and esp,0xfffffff0
    0x080483fa <main+6>: sub esp,0x60
    0x080483fd <main+9>: mov DWORD PTR [esp+0x5c],0x0
    0x08048405 <main+17>: lea eax,[esp+0x1c]
    0x08048409 <main+21>: mov DWORD PTR [esp],eax
    0x0804840c <main+24>: call 0x804830c
    0x08048411 <main+29>: mov eax,DWORD PTR [esp+0x5c]
    0x08048415 <main+33>: test eax,eax
    0x08048417 <main+35>: je 0x8048427
    0x08048419 <main+37>: mov DWORD PTR [esp],0x8048500
    0x08048420 <main+44>: call 0x804832c
    0x08048425 <main+49>: jmp 0x8048433
    0x08048427 <main+51>: mov DWORD PTR [esp],0x8048529
    0x0804842e <main+58>: call 0x804832c
    0x08048433 <main+63>: leave
    0x08048434 <main+64>: ret
    End of assembler dump.

On the left side, we can see the memory addresses where the CPU instructions are loaded. On the right side, there are the assembly instructions themselves, like push ebp, mov ebp,esp, etc. These instructions operate on the CPU registers and on memory.
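
A quick way to get a feel for the address column: x86 instructions have variable length, so the gap between two consecutive addresses is the size of the first instruction in bytes. The following Python sketch is only an illustration, not a GDB feature. It does plain arithmetic on the addresses copied from the listing above, and the names ADDRESSES and instruction_sizes are made up for this example.

```python
# Addresses copied from the `disas main` listing above.
ADDRESSES = [
    0x080483f4, 0x080483f5, 0x080483f7, 0x080483fa, 0x080483fd,
    0x08048405, 0x08048409, 0x0804840c, 0x08048411, 0x08048415,
    0x08048417, 0x08048419, 0x08048420, 0x08048425, 0x08048427,
    0x0804842e, 0x08048433, 0x08048434,
]


def instruction_sizes(addresses):
    """Return (address, size_in_bytes) for every instruction except the last.

    The size of the last instruction cannot be derived this way because
    there is no following address to subtract from.
    """
    return [(cur, nxt - cur) for cur, nxt in zip(addresses, addresses[1:])]


if __name__ == "__main__":
    base = ADDRESSES[0]  # address of the first instruction of main
    for address, size in instruction_sizes(ADDRESSES):
        # Print the address, its offset from the start of main, and the size.
        print(f"{address:#010x}  main+{address - base:<3} {size} byte(s)")
```

For instance, push ebp at 0x080483f4 takes 1 byte, while the mov at 0x080483fd takes 8 bytes because it carries a displacement and a 4-byte immediate value.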

Breakpoints

Breakpoints are essential in debugging. They let us stop the execution of the program at a chosen point and examine the memory and registers. You can set a breakpoint on a function with the command break. For example, if you want to break execution at the main function, you can use break main or the shorthand command b main.

Let's set a breakpoint on the above binary and run it to see what happens.

    (gdb) b main
    Breakpoint 1 at 0x80483fd: file stack0/stack0.c, line 10.
    (gdb) run
    Starting program: /opt/protostar/bin/stack0
    Breakpoint 1, main (argc=1, argv=0xbffff864) at stack0

I used the b main command, and GDB created a breakpoint at the memory address 0x80483fd. Look back at the disassembly above and find the instruction at that address.

The instruction at this address is mov DWORD PTR [esp+0x5c],0x0. So GDB has skipped the following instructions:

    0x080483f4 <main+0>: push ebp
    0x080483f5 <main+1>: mov ebp,esp
    0x080483f7 <main+3>: and esp,0xfffffff0
    0x080483fa <main+6>: sub esp,0x60

This is because these instructions form the function prologue generated by the compiler. The function prologue builds the stack frame of the function.

When you set a breakpoint with the function name, GDB automatically skips the function prologue. So if you want to see how the stack frame is built, you can use a memory address instead of the function name.
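
To see what the prologue does to the stack, here is a small Python sketch that replays its four instructions as plain arithmetic. It is an illustration, not something GDB runs. The starting esp value 0xbffff7bc is an assumption, back-calculated from the ebp value that info registers prints later in this post, and the caller's ebp value is a made-up placeholder. The function name run_prologue is also hypothetical.

```python
MASK32 = 0xFFFFFFFF  # keep every result inside 32 bits, like a real register


def run_prologue(esp, ebp):
    """Replay main's prologue on 32-bit esp/ebp values.

    Returns the new esp, the new ebp, and a dict with the one stack
    slot written by `push ebp`.
    """
    stack = {}
    esp = (esp - 4) & MASK32     # push ebp: reserve a 4-byte slot...
    stack[esp] = ebp             # ...and save the caller's ebp in it
    ebp = esp                    # mov ebp,esp
    esp &= 0xFFFFFFF0            # and esp,0xfffffff0 (align to 16 bytes)
    esp = (esp - 0x60) & MASK32  # sub esp,0x60 (room for local variables)
    return esp, ebp, stack


if __name__ == "__main__":
    entry_esp = 0xBFFFF7BC   # assumption: derived from the later register dump
    caller_ebp = 0xBFFFF838  # hypothetical placeholder value
    esp, ebp, stack = run_prologue(entry_esp, caller_ebp)
    print(f"esp = {esp:#x}")
    print(f"ebp = {ebp:#x}")
```

Under these assumptions the sketch ends with esp at 0xbffff750 and ebp at 0xbffff7b8, the same values GDB reports once it stops at the b main breakpoint.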

Let's set a breakpoint at the top of the assembly instructions.

    (gdb) b *0x080483f4
    Breakpoint 2 at 0x80483f4: file stack0/stack0.c, line 6.

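Notice the asterisk (*) before the memory address. It tells GDB to treat what follows as an address rather than as a function name or a line number.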

Examine the memory and the registers

This is the most important part of our reverse engineering task. We can use various ways to examine the memory and the registers to see what is inside them.

Examine registers

To examine registers, the program must be running. So we set a breakpoint at the point we are interested in and run the program. After GDB stops the execution there, we can examine the registers.

I have created a breakpoint at the main function using b main and started the program. So at the moment, GDB has paused the execution inside main, right after the function prologue.

We can use the info registers command or the shorthand command i r to examine all registers. See the following example:

    (gdb) i r
    eax 0xbffff864 -1073743772
    ecx 0xa9493a07 -1454818809
    edx 0x1 1
    ebx 0xb7fd7ff4 -1208123404
    esp 0xbffff750 0xbffff750
    ebp 0xbffff7b8 0xbffff7b8
    esi 0x0 0
    edi 0x0 0
    eip 0x80483fd 0x80483fd
    eflags 0x200286 [ PF SF IF ID ]
    cs 0x73 115
    ss 0x7b 123
    ds 0x7b 123
    es 0x7b 123
    fs 0x0 0
    gs 0x33 51

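So, guys, GDB listed the general-purpose, flag, and segment registers with their current values. The second column is the raw value in hexadecimal, and the third column shows the same value in the register's natural format, such as signed decimal for eax or the names of the set flags for eflags. To include the floating-point and vector registers as well, use info all-registers.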
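
If the third column looks mysterious, the following Python sketch shows how the same 32-bit value can be read as a signed number, and how the eflags value breaks down into individual flag bits. This is my own illustration of the arithmetic, not GDB's code. The function names to_signed32 and decode_eflags are made up for this example; the bit positions are the standard x86 EFLAGS layout.

```python
# Bit position -> flag name, following the x86 EFLAGS register layout.
EFLAGS_BITS = {
    0: "CF", 2: "PF", 4: "AF", 6: "ZF", 7: "SF", 8: "TF", 9: "IF",
    10: "DF", 11: "OF", 14: "NT", 16: "RF", 17: "VM", 18: "AC",
    19: "VIF", 20: "VIP", 21: "ID",
}


def to_signed32(value):
    """Interpret a 32-bit value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def decode_eflags(value):
    """Return the names of the flags that are set, lowest bit first."""
    return [name for bit, name in sorted(EFLAGS_BITS.items())
            if value & (1 << bit)]


if __name__ == "__main__":
    print(to_signed32(0xBFFFF864))          # eax from the dump above
    print(" ".join(decode_eflags(0x200286)))  # eflags from the dump above
```

With the values from the dump, eax 0xbffff864 reads as -1073743772 and eflags 0x200286 decodes to PF SF IF ID, which is what GDB printed.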

We can also examine a specific register by its name with the command i r [register_name]. Let's see what is inside the esp register.

    (gdb) i r esp
    esp 0xbffff750 0xbffff750

The esp register holds the address of the top of the stack. The ebp register holds the base address of the current stack frame. And the eip register holds the address of the next instruction to execute.

Examine memory addresses

Let's examine the memory at the address held by the eip register.

    (gdb) x $eip
    0x80483fd <main+9>: 0x60ec8b55

The x command (short for examine) shows the contents of memory at the given address. That content might be an instruction or data. With no format specified, GDB initially shows one 4-byte word in hexadecimal; after that, it reuses whichever format and unit size you last specified.
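
One thing to keep in mind when reading such a word: x86 is little-endian, so the byte at the lowest address is the least significant byte of the printed value. The following Python sketch shows the conversion in both directions for the word printed above. It is just an illustration; the function names are made up for this example.

```python
def word_to_memory_bytes(word):
    """Return the 4 bytes of a 32-bit word in the order they sit in memory."""
    return word.to_bytes(4, "little")


def memory_bytes_to_word(raw):
    """Combine 4 bytes, lowest address first, into the word GDB would print."""
    return int.from_bytes(raw, "little")


if __name__ == "__main__":
    word = 0x60EC8B55  # value from the x $eip output above
    raw = word_to_memory_bytes(word)
    # Show each byte in increasing address order.
    print(" ".join(f"{b:02x}" for b in raw))
    print(hex(memory_bytes_to_word(raw)))
```

So a word printed as 0x60ec8b55 is stored in memory as the bytes 55 8b ec 60, in that order.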

The x command has various formats to show the memory contents. Some of them are:

x: hexadecimal
o: octal
u: unsigned decimal
t: binary
d: signed decimal
i: machine instruction
s: null-terminated string

So, to view a value in binary, you can use the t format.

x/t $eip
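
The format letters only change how the same bits are displayed. Here is a minimal Python sketch that renders one 32-bit word in the five numeric formats. It is an approximation for illustration: details such as zero-padding and prefixes may differ from GDB's real output, and the function name render is made up for this example.

```python
def render(value, fmt):
    """Render a 32-bit word according to one of x, o, u, t, d."""
    value &= 0xFFFFFFFF
    if fmt == "x":
        return format(value, "08x")   # hexadecimal
    if fmt == "o":
        return format(value, "o")     # octal
    if fmt == "u":
        return str(value)             # unsigned decimal
    if fmt == "t":
        return format(value, "032b")  # binary, all 32 bits
    if fmt == "d":
        # signed decimal: values with the top bit set are negative
        signed = value - (1 << 32) if value & 0x80000000 else value
        return str(signed)
    raise ValueError(f"unsupported format letter: {fmt}")


if __name__ == "__main__":
    word = 0xBFFFF864  # the eax value from the register dump
    for letter in "xoutd":
        print(letter, render(word, letter))
```

Note how u and d disagree only for values with the highest bit set, which is exactly the case for stack addresses like 0xbffff864.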

To examine several consecutive units of memory starting from a base address, use the form x/[number][format][unit] [address]. Here number is how many units to show, format is one of the letters above, and unit is the size of each unit: b (1 byte), h (2 bytes), w (4 bytes), or g (8 bytes).

For example, to check 20 words from the top of the stack, use:

x/20x $esp

The above command prints 20 four-byte words (80 bytes) starting at the top of the stack, in hexadecimal format. GDB labels each output row with the address of the first word in that row.
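
To make the layout of that output easier to read, the following Python sketch computes which address each word of an x/20x dump belongs to. It assumes 4-byte words printed four per row, which is how GDB usually lays out hexadecimal words, and it uses the esp value from the register dump above. The function name examine_rows is made up for this example.

```python
def examine_rows(start, count, unit_size=4, per_row=4):
    """Return a list of (row_address, [address of each unit in the row])."""
    addresses = [start + i * unit_size for i in range(count)]
    rows = []
    for i in range(0, count, per_row):
        chunk = addresses[i:i + per_row]
        rows.append((chunk[0], chunk))  # the row label is its first address
    return rows


if __name__ == "__main__":
    esp = 0xBFFFF750  # value of esp from the register dump
    for row_address, units in examine_rows(esp, 20):
        print(f"{row_address:#x}: " + " ".join(f"{a:#x}" for a in units))
```

With these assumptions, 20 words give five rows labelled 0xbffff750, 0xbffff760, and so on, and the last word shown starts at 0xbffff79c.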

Run, Continue and Step

The run command starts the execution of the program. Sometimes, we may want to test a program with various arguments. We can provide arguments with the run command.

(gdb) run AAAAAA

The above command passes AAAAAA as a command-line argument to the program and starts its execution. This is a useful technique to check buffer overflows.

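If a breakpoint is hit, we can resume the execution with the continue command or the shorthand command c.

The nexti command (shorthand ni) executes one CPU instruction and then pauses the execution again. It steps over call instructions, letting the called function run to completion. To follow a call into the function instead, use stepi (shorthand si).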