Debugging Binaries with GDB
GDB, short for GNU Debugger, is the debugger that ships with the GNU toolchain. It is widely used in Linux environments.
In our previous protostar stack0 walkthrough tutorial, we used GDB many times.
So in this post, I'm going to explain how to use this Linux debugger to debug and analyze a binary file. If you are planning to learn reverse engineering, malware analysis, or exploit development, you must be familiar with debuggers.
To understand disassembly, the stack, and so on, I suggest you read the following tutorials.
Starting the debugging.
We can open a binary inside GDB with the command gdb ./[binary_file], where binary_file is the name of the file we want to debug. After this command, GDB prints a welcome banner with its version and license information.
In general we don't need this banner. If it bothers you, start GDB in quiet mode with the -q option, which stops GDB from showing the welcome banner.
gdb -q ./stack0
So guys, our next step is to disassemble the binary and understand how the program is structured.
Disassemble a binary.
There are two main Assembly syntax styles called Intel syntax and AT&T syntax. In the following image, you can see both of them.
You can pick whichever you prefer. I think Intel syntax is clearer and easier to read, so I use Intel syntax in my disassembly listings.
By default, gdb uses AT&T assembly syntax. We can switch to Intel assembly syntax by entering the following command.
set disassembly-flavor intel
If you want to switch back to AT&T syntax, use the command
set disassembly-flavor att
If you find it boring to switch the syntax every time you start GDB, you can make Intel syntax permanent by adding the setting to the .gdbinit file in your home folder. The following command appends it to that file (>> keeps any settings that are already there, while a single > would overwrite them).
echo 'set disassembly-flavor intel' >> ~/.gdbinit
After you load a binary in GDB, you can disassemble a function to see its assembly code. To do that, use disassemble function_name or the shorthand disas function_name. For example, to disassemble the main function, use disassemble main or disas main.
(gdb) disass main
Dump of assembler code for function main:
0x080483f4 <main+0>: push ebp
0x080483f5 <main+1>: mov ebp,esp
0x080483f7 <main+3>: and esp,0xfffffff0
0x080483fa <main+6>: sub esp,0x60
0x080483fd <main+9>: mov DWORD PTR [esp+0x5c],0x0
0x08048405 <main+17>: lea eax,[esp+0x1c]
0x08048409 <main+21>: mov DWORD PTR [esp],eax
0x0804840c <main+24>: call 0x804830c <gets@plt>
0x08048411 <main+29>: mov eax,DWORD PTR [esp+0x5c]
0x08048415 <main+33>: test eax,eax
0x08048417 <main+35>: je 0x8048427 <main+51>
0x08048419 <main+37>: mov DWORD PTR [esp],0x8048500
0x08048420 <main+44>: call 0x804832c <puts@plt>
0x08048425 <main+49>: jmp 0x8048433 <main+63>
0x08048427 <main+51>: mov DWORD PTR [esp],0x8048529
0x0804842e <main+58>: call 0x804832c <puts@plt>
0x08048433 <main+63>: leave
0x08048434 <main+64>: ret
End of assembler dump.
On the left side, we can see the memory addresses where the CPU instructions are loaded, together with each instruction's offset from the start of main. On the right side are the assembly instructions, such as push ebp and mov ebp,esp. These instructions operate on the CPU registers and memory.
Breakpoints are essential in debugging. A breakpoint stops the execution of the program at a chosen point so that we can examine the memory and registers. You can set a breakpoint on a function with the command break. For example, if you want to break execution at the main function, you can use break main or the shorthand command b main.
Let's set a breakpoint on the above binary and run it to see what happens.
(gdb) b main
Breakpoint 1 at 0x80483fd: file stack0/stack0.c, line 10.
(gdb) run
Starting program: /opt/protostar/bin/stack0
Breakpoint 1, main (argc=1, argv=0xbffff864) at stack0
I used the b main command, so GDB created a breakpoint at the memory address 0x80483fd. Go back to the disassembled code above and find out what is at that address.
The instruction at this address is mov DWORD PTR [esp+0x5c],0x0. So GDB has skipped the following instructions.
0x080483f4 <main+0>: push ebp
0x080483f5 <main+1>: mov ebp,esp
0x080483f7 <main+3>: and esp,0xfffffff0
0x080483fa <main+6>: sub esp,0x60
This is because these instructions belong to the function prologue generated by the compiler. The function prologue builds the stack frame of the function.
When you set a breakpoint with the function name, GDB automatically skips the function prologue. So if you want to see how the stack frame is built, you can set the breakpoint on a memory address instead of the function name.
Let's set a breakpoint at the top of the assembly instructions.
(gdb) b *0x080483f4
Breakpoint 2 at 0x80483f4: file stack0/stack0.c, line 6.
Notice the asterisk (*) before the memory address. It tells GDB to treat what follows as an address rather than a function name.
Examine the memory and the registers
This is the most important part of our reverse engineering task. We can use various ways to examine the memory and the registers to see what is inside them.
To examine registers, we must run the program. What we do is set a breakpoint at the point we are interested in and run the program. After GDB stops the execution, we can examine the registers.
Now I have created a breakpoint at the main function using b main and started the program. So at the moment, GDB has paused the execution at the main function.
We can use the info registers command or the shorthand command i r to examine all registers. See the following example.
(gdb) i r
eax    0xbffff864 -1073743772
ecx    0xa9493a07 -1454818809
edx    0x1        1
ebx    0xb7fd7ff4 -1208123404
esp    0xbffff750 0xbffff750
ebp    0xbffff7b8 0xbffff7b8
esi    0x0        0
edi    0x0        0
eip    0x80483fd  0x80483fd <main+9>
eflags 0x200286   [ PF SF IF ID ]
cs     0x73       115
ss     0x7b       123
ds     0x7b       123
es     0x7b       123
fs     0x0        0
gs     0x33       51
So, guys, GDB listed all registers and their current values.
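For general-purpose registers such as eax, the second column is the same 32-bit value read as a signed integer. Here is a minimal Python sketch of that conversion, assuming 32-bit two's complement. The helper name to_signed32 is my own and is not part of GDB.

```python
def to_signed32(value):
    """Interpret a 32-bit unsigned value as a two's complement signed integer."""
    value &= 0xFFFFFFFF            # keep only the low 32 bits
    if value & 0x80000000:         # sign bit set, so the value is negative
        return value - 0x100000000
    return value


# Register values copied from the `i r` output above.
for name, raw in [("eax", 0xBFFFF864), ("ecx", 0xA9493A07), ("edx", 0x1)]:
    print(name, hex(raw), to_signed32(raw))
```

The printed numbers should match the right-hand column of the register listing for those three registers.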
Also, we can examine a specific register by name with the i r [register_name] command. Let's see what is inside the esp register.
(gdb) i r esp
esp 0xbffff750 0xbffff750
We can examine multiple registers at once as follows.
(gdb) i r esp eip
esp 0xbffff750 0xbffff750
eip 0x80483fd 0x80483fd <main+9>
We know the esp register is pointing to the top of the stack. So by examining the esp register we can find the address of the top of the stack.
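The esp and ebp values in the register listing fit the function prologue that GDB skipped. The following Python sketch replays those four instructions on plain integers. It is only an illustration: the value of esp at main+0 is an assumption that I inferred from ebp (ebp + 4), not something shown in the session, and run_prologue is a made-up helper name.

```python
MASK32 = 0xFFFFFFFF


def run_prologue(esp_at_entry):
    """Return (esp, ebp) after the four prologue instructions of main."""
    esp = (esp_at_entry - 4) & MASK32   # push ebp: esp drops by 4 bytes
    ebp = esp                           # mov ebp,esp
    esp &= 0xFFFFFFF0                   # and esp,0xfffffff0: align down to 16 bytes
    esp = (esp - 0x60) & MASK32         # sub esp,0x60: reserve 96 bytes
    return esp, ebp


# ASSUMPTION: esp at main+0, inferred from the ebp value in the listing.
ESP_AT_ENTRY = 0xBFFFF7BC

esp, ebp = run_prologue(ESP_AT_ENTRY)
print(hex(esp), hex(ebp))   # compare with esp and ebp in the `i r` output
```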
Examine memory addresses
Here we are going to see how we examine a memory address. The command we use is x [memory_address].
Above, we saw that the eip register contains the value 0x80483fd. This should be the memory address of the next instruction that is waiting to be executed by the CPU. Let's see what is in that location.
(gdb) x 0x80483fd
0x80483fd <main+9>: 0x5c2444c7
We can do both of the above steps at once. Inside GDB, $eip stands for the value the eip register holds. So we can examine the memory address that eip points to with the command x $eip.
(gdb) x $eip
0x80483fd <main+9>: 0x5c2444c7
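The word 0x5c2444c7 looks odd next to the disassembly because x86 is little-endian, so GDB prints the four bytes in reverse of their order in memory. This small Python sketch splits the word back into memory order. It also uses the addresses of main+9 and main+17 from the disassembly to get the length of the instruction. The variable names are my own.

```python
word = 0x5C2444C7                          # value printed by `x $eip`
raw = word.to_bytes(4, "little")           # byte order as stored in memory
print(" ".join(f"{b:02x}" for b in raw))   # first four bytes of the mov instruction

# Addresses of main+9 and main+17 taken from the disassembly listing.
length = 0x08048405 - 0x080483FD
print(length)                              # size of the instruction at main+9 in bytes
```

So the word GDB showed covers only the first half of the mov instruction at main+9.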
The examine command can be customized to suit our needs. For example, we can specify the format in which GDB prints the data. By default, GDB prints values in hexadecimal format. The syntax to choose a format is x/[format] [memory_address].
The following are some data type formats.
- x : Hexadecimal format
- o : Octal format
- u : Unsigned decimal format
- t : Binary format
- d : Signed decimal format
Most of the time we use the hexadecimal and decimal formats.
(gdb) x/x $eax
0xbffff864: 0xbffff975
(gdb) x/o $eax
0xbffff864: 027777774565
(gdb) x/u $eax
0xbffff864: 3221223797
(gdb) x/t $eax
0xbffff864: 10111111111111111111100101110101
(gdb) x/d $eax
0xbffff864: -1073743499
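All five outputs above are the same 32-bit word, 0xbffff975, written in different notations. As a rough cross-check, this Python sketch renders that word in each format. The dictionary keys mirror the GDB format letters, and everything else is my own naming.

```python
word = 0xBFFFF975   # the word stored at the address held in eax

# Signed view of the same 32 bits (two's complement).
signed = word - 0x100000000 if word & 0x80000000 else word

formats = {
    "x": format(word, "#x"),        # hexadecimal
    "o": "0" + format(word, "o"),   # octal, with the leading 0 that GDB prints
    "u": str(word),                 # unsigned decimal
    "t": format(word, "b"),         # binary
    "d": str(signed),               # signed decimal
}

for letter, text in formats.items():
    print(f"x/{letter}: {text}")
```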
There are also some special formats. If we think there is a character string at a memory address, we can use the string format s. GDB then reads the bytes starting at that address up to the terminating null byte and prints them as a string.
In the above disassembly, we can see that the program stores the memory address 0x8048529 at the top of the stack and then calls puts@plt. So we can guess there is a string at this memory address. Here I examined that address.
(gdb) x/s 0x8048529
0x8048529: "Try again?"
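To make the idea behind x/s concrete, here is a small Python sketch that reads a null-terminated string out of a byte buffer. The buffer is a hypothetical stand-in for process memory: only the string at 0x8048529 comes from the session above, and the bytes after it are placeholder data I made up. The function name read_c_string is also my own.

```python
def read_c_string(memory, base, address):
    """Read bytes from `address` up to the first null byte, like x/s does."""
    start = address - base
    end = memory.index(b"\x00", start)   # position of the terminating null byte
    return memory[start:end].decode("ascii")


# HYPOTHETICAL memory image starting at 0x8048529.
BASE = 0x8048529
MEMORY = b"Try again?\x00placeholder bytes\x00"

text = read_c_string(MEMORY, BASE, 0x8048529)
print(text)
```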
Next, we can print a CPU instruction by specifying the format i. We know the eip register points to the next CPU instruction. So we can check what that instruction is with the following command.
(gdb) x/i $eip
0x80483fd <main+9>: mov DWORD PTR [esp+0x5c],0x0
We can also specify the number of units to show. By default, GDB shows one unit, and the default unit size is a word, which is 4 bytes. The syntax to specify the number of units is x/[unit_number][format] [memory_address].
Let's examine 20 words from the top of the stack.
(gdb) x/20x $esp
0xbffff750: 0x00000000 0x00000001 0xb7fff8f8 0xb7f0186e
0xbffff760: 0xb7fd7ff4 0xb7ec6165 0xbffff778 0xb7eada75
0xbffff770: 0xb7fd7ff4 0x08049620 0xbffff788 0x080482e8
0xbffff780: 0xb7ff1040 0x08049620 0xbffff7b8 0x08048469
0xbffff790: 0xb7fd8304 0xb7fd7ff4 0x08048450 0xbffff7b8
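Each row of the dump holds four 4-byte words, so the address on the left grows by 16 per row. The Python sketch below maps stack offsets from the disassembly, such as esp+0x1c and esp+0x5c, to positions in this dump. The word list is copied from the output above, while word_at is a helper name I made up for illustration.

```python
ESP = 0xBFFFF750

# The 20 words printed by `x/20x $esp`, in order.
WORDS = [
    0x00000000, 0x00000001, 0xB7FFF8F8, 0xB7F0186E,
    0xB7FD7FF4, 0xB7EC6165, 0xBFFFF778, 0xB7EADA75,
    0xB7FD7FF4, 0x08049620, 0xBFFFF788, 0x080482E8,
    0xB7FF1040, 0x08049620, 0xBFFFF7B8, 0x08048469,
    0xB7FD8304, 0xB7FD7FF4, 0x08048450, 0xBFFFF7B8,
]


def word_at(address):
    """Return the dumped word at `address`, or None if the dump does not cover it."""
    offset = address - ESP
    if offset < 0 or offset % 4 != 0 or offset // 4 >= len(WORDS):
        return None
    return WORDS[offset // 4]


for offset in (0x1C, 0x5C):
    address = ESP + offset
    word = word_at(address)
    shown = hex(word) if word is not None else "outside the 20-word dump"
    print(f"esp+{offset:#x} = {address:#x}: {shown}")
```

The slot at esp+0x5c is word number 23 counting from zero, so a larger count such as x/24x $esp would be needed to include it.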
I think you got an idea about examining the memory.
Running, Continuing, and stepping the execution.
We can start the execution of the program with the command run. If we want to pass command line arguments, we can supply them after the run command as follows.
(gdb) run AAAAAA
If there is a breakpoint, GDB stops the execution at the specified line. So if we want to continue the execution, we can use the command continue or the shorthand command c.
We can also execute a single CPU instruction at a time using the ni command, which is short for nexti (next instruction).
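If you repeat the same steps often, you can pass the commands to GDB on the command line instead of typing them. The Python sketch below only builds such a command line and prints it. It relies on the standard GDB options -q, -batch, -ex, and --args. The helper build_gdb_command and the chosen command list are my own illustration, and ./stack0 is assumed to be in the current folder.

```python
import shlex


def build_gdb_command(binary, gdb_commands, program_args=()):
    """Build an argv list that runs the given GDB commands non-interactively."""
    argv = ["gdb", "-q", "-batch"]        # quiet mode, exit after the commands
    for command in gdb_commands:
        argv += ["-ex", command]          # each -ex runs one GDB command
    if program_args:
        argv += ["--args", binary, *program_args]   # arguments for the program
    else:
        argv.append(binary)
    return argv


commands = ["break main", "run", "info registers eip", "nexti", "x/i $eip"]
argv = build_gdb_command("./stack0", commands, ["AAAAAA"])
print(shlex.join(argv))   # paste into a shell, or pass argv to subprocess.run
```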
So guys that's all for this tutorial. I think you enjoyed it. Feel free to leave a comment. Thank you for reading.