
Debugging Binaries with GDB
GDB, the GNU Debugger, ships with the GNU toolchain and is the standard debugging tool in Linux environments. In our previous protostar stack0 walkthrough tutorial, we used GDB many times. So in this post, I'm going to explain how to use it to debug and analyze a binary file. If you are planning to learn reverse engineering, malware analysis, or exploit development, you must be familiar with debuggers. To understand disassembly, the stack, and so on, I suggest you read the related tutorials first.
Starting the debugging.
We can open a binary inside GDB with the command gdb ./[binary_file], where binary_file is the name of the file we want to debug. The -q option used below suppresses GDB's startup banner.
gdb -q ./stack0
So guys, our next step is to disassemble the binary and understand the structure of the program.
Disassemble a binary.
There are two main assembly syntax styles: Intel syntax and AT&T syntax. GDB uses AT&T syntax by default. To switch to Intel syntax, use the following command.
set disassembly-flavor intel
If you want to switch back to AT&T syntax, use the command set disassembly-flavor att
If you find it tedious to switch syntax every time you start GDB, you can make Intel syntax permanent by adding the setting to the .gdbinit file in your home folder. The following command appends the line to that file (it uses >> rather than > so that any settings already in the file are not overwritten).
echo 'set disassembly-flavor intel' >> ~/.gdbinit
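If you run the echo command more than once, the file can end up in a state you did not intend. As a rough sketch, the same setup can be done from a small Python helper that only adds the line when it is missing. The helper name ensure_line and the pre-existing "set pagination off" line are just illustrations, and the demo works on a temporary file rather than your real ~/.gdbinit.

```python
import tempfile
from pathlib import Path

SETTING = "set disassembly-flavor intel"


def ensure_line(path, line):
    """Append `line` to the file at `path` unless it is already there.

    Returns True if the file was changed, False otherwise.
    (Hypothetical helper, not part of GDB.)
    """
    p = Path(path)
    text = p.read_text() if p.exists() else ""
    if line in text.splitlines():
        return False
    # Start on a fresh line if the file does not end with a newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with p.open("a") as fh:
        fh.write(prefix + line + "\n")
    return True


# Demo on a throwaway file so the real ~/.gdbinit is not touched.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / ".gdbinit"
    demo.write_text("set pagination off\n")  # example of an existing setting
    print(ensure_line(demo, SETTING))  # first call adds the line
    print(ensure_line(demo, SETTING))  # second call changes nothing
    print(demo.read_text())
```

To apply it for real, you would call ensure_line(Path.home() / ".gdbinit", SETTING).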
After you load a binary in GDB, you can disassemble a function to see its assembly code. To do that, use disassemble function_name or the shorthand disas function_name. For example, to disassemble the main function, use disassemble main or disas main.
(gdb) disass main
Dump of assembler code for function main:
0x080483f4 <main+0>: push ebp
0x080483f5 <main+1>: mov ebp,esp
0x080483f7 <main+3>: and esp,0xfffffff0
0x080483fa <main+6>: sub esp,0x60
0x080483fd <main+9>: mov DWORD PTR [esp+0x5c],0x0
0x08048405 <main+17>: lea eax,[esp+0x1c]
0x08048409 <main+21>: mov DWORD PTR [esp],eax
0x0804840c <main+24>: call 0x804830c <gets@plt>
0x08048411 <main+29>: mov eax,DWORD PTR [esp+0x5c]
0x08048415 <main+33>: test eax,eax
0x08048417 <main+35>: je 0x8048427 <main+51>
0x08048419 <main+37>: mov DWORD PTR [esp],0x8048500
0x08048420 <main+44>: call 0x804832c <puts@plt>
0x08048425 <main+49>: jmp 0x8048433 <main+63>
0x08048427 <main+51>: mov DWORD PTR [esp],0x8048529
0x0804842e <main+58>: call 0x804832c <puts@plt>
0x08048433 <main+63>: leave
0x08048434 <main+64>: ret
End of assembler dump.
On the left side, we can see the memory addresses where the CPU instructions are loaded, each followed by its offset from the start of main in angle brackets. On the right side, there are assembly instructions like push ebp, mov ebp,esp, etc. These assembly instructions do various tasks on the CPU, memory, and registers.
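The <main+N> labels in the listing are simply the distance of each address from the start of main, and the gap between two consecutive addresses is the length of the instruction in bytes. Here is a small sketch that redoes that arithmetic for the first few addresses copied from the listing above. The function names are made up for illustration.

```python
# Addresses copied from the first lines of the disassembly listing.
MAIN_START = 0x080483F4
ADDRESSES = [0x080483F4, 0x080483F5, 0x080483F7,
             0x080483FA, 0x080483FD, 0x08048405]


def offset_label(address, base=MAIN_START):
    """Rebuild the <main+N> label shown next to an address."""
    return f"<main+{address - base}>"


def instruction_lengths(addresses):
    """Byte length of each instruction, from consecutive addresses."""
    return [nxt - cur for cur, nxt in zip(addresses, addresses[1:])]


for addr in ADDRESSES:
    print(f"0x{addr:08x} {offset_label(addr)}")

# One length per instruction except the last address in the list.
print(instruction_lengths(ADDRESSES))
```

So push ebp takes 1 byte, while the mov DWORD PTR [esp+0x5c],0x0 instruction at main+9 takes 8 bytes, which is why the next label jumps to main+17.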
Breakpoints
Breakpoints are essential in debugging. They let us stop the execution of the program at a chosen point and examine the memory and registers. You can set a breakpoint on a function with the command break. For example, if you want to break execution at the main function, you may use break main or the shorthand command b main. Let's set a breakpoint on the above binary and run it to see what happens.
(gdb) b main
Breakpoint 1 at 0x80483fd: file stack0/stack0.c, line 10.
(gdb) run
Starting program: /opt/protostar/bin/stack0
Breakpoint 1, main (argc=1, argv=0xbffff864) at stack0
I used the b main command, so GDB created a breakpoint at the memory address 0x80483fd. Go back to the disassembled code above and find out what is at that address.
The instruction at this address is mov DWORD PTR [esp+0x5c],0x0. So GDB has skipped the following instructions.
0x080483f4 <main+0>: push ebp
0x080483f5 <main+1>: mov ebp,esp
0x080483f7 <main+3>: and esp,0xfffffff0
0x080483fa <main+6>: sub esp,0x60
This is because these instructions belong to the function prologue generated by the compiler. The function prologue builds the stack frame of the function.
When you set a breakpoint with the function name, GDB automatically skips the function prologue. So if you want to see how the stack frame is built, you can use a memory address instead of the function name.
Let's set a breakpoint at the top of the assembly instructions.
(gdb) b *0x080483f4
Breakpoint 2 at 0x80483f4: file stack0/stack0.c, line 6.
Notice the asterisk (*) before the memory address. It tells GDB that what follows is an address rather than a function name or line number.
Examine the memory and the registers
This is the most important part of our reverse engineering task. We can use various ways to examine the memory and the registers to see what is inside them.
Examine registers
To examine registers, the program must be running. So we set a breakpoint at the point we are interested in and run the program. After GDB stops the execution, we can examine the registers. Here I have created a breakpoint at the main function using b main and started the program, so GDB has paused the execution at main. We can use the info registers command or the shorthand command i r to examine all registers. See the following example.
(gdb) i r
eax 0xbffff864 -1073743772
ecx 0xa9493a07 -1454818809
edx 0x1 1
ebx 0xb7fd7ff4 -1208123404
esp 0xbffff750 0xbffff750
ebp 0xbffff7b8 0xbffff7b8
esi 0x0 0
edi 0x0 0
eip 0x80483fd 0x80483fd <main+9>
eflags 0x200286 [ PF SF IF ID ]
cs 0x73 115
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x33 51
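So, guys, GDB listed all registers and their current values. Each row shows the register name, its value in hexadecimal, and the same value in a more natural form: a decimal number for general-purpose registers such as eax, a pointer for esp, ebp, and eip, and the list of set flags for eflags.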
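The flag names next to eflags come straight from the bits of the value 0x200286. As a rough sketch, we can decode them ourselves. The bit positions below are the standard x86 EFLAGS positions, and the function name decode_eflags is only for illustration.

```python
# Standard x86 EFLAGS bit positions (bit 1 is reserved and always set).
FLAG_BITS = {
    0: "CF", 2: "PF", 4: "AF", 6: "ZF", 7: "SF", 8: "TF",
    9: "IF", 10: "DF", 11: "OF", 14: "NT", 16: "RF",
    17: "VM", 18: "AC", 19: "VIF", 20: "VIP", 21: "ID",
}


def decode_eflags(value):
    """Return the names of the flags that are set, lowest bit first."""
    return [name for bit, name in sorted(FLAG_BITS.items())
            if value & (1 << bit)]


# Value taken from the register dump above.
print(decode_eflags(0x200286))
```

The result lists PF, SF, IF and ID, the same flags GDB printed in the brackets.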
Also, we can examine a specific register by name using the i r [register_name] command. Let's see what is inside the esp register.
(gdb) i r esp
esp 0xbffff750 0xbffff750
We can examine multiple registers at once in the following way.
(gdb) i r esp eip
esp 0xbffff750 0xbffff750
eip 0x80483fd 0x80483fd <main+9>
We know the esp register is pointing to the top of the stack. So by examining the esp register we can find the address at the top of the stack.
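The esp and ebp values in the register dump also let us double-check the function prologue we saw earlier. Here is a minimal sketch that replays the three prologue instructions after push ebp as plain arithmetic. It assumes that esp right after push ebp was 0xbffff7b8, which follows from mov ebp,esp copying that value into ebp. The function name is made up for illustration.

```python
MASK32 = 0xFFFFFFFF  # keep results inside 32 bits, like the real register


def simulate_prologue(esp_after_push):
    """Replay the prologue instructions that follow `push ebp`."""
    ebp = esp_after_push               # mov ebp,esp
    esp = esp_after_push & 0xFFFFFFF0  # and esp,0xfffffff0 (16-byte align)
    esp = (esp - 0x60) & MASK32        # sub esp,0x60 (room for locals)
    return esp, ebp


# Assumed starting value: the ebp shown in the register dump.
esp, ebp = simulate_prologue(0xBFFFF7B8)
print(f"esp = 0x{esp:08x}")
print(f"ebp = 0x{ebp:08x}")
```

The computed esp is 0xbffff750, which matches the value GDB showed at the breakpoint.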
Examine memory addresses
Here we are going to see how to examine a memory address. The command we use is x [memory_address]. Above, we saw that the EIP register contains the value 0x80483fd.
This should be the memory address of the next instruction that is waiting to be executed by the CPU. Let's see what is in that location.
(gdb) x 0x80483fd
0x80483fd <main+9>: 0x5c2444c7
We can do both of the above steps at once. GDB lets us refer to the value a register holds by prefixing its name with $, for example $eip. So we can examine the memory that eip points to with the command x $eip.
(gdb) x $eip
0x80483fd <main+9>: 0x5c2444c7
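The value 0x5c2444c7 is four bytes of memory read as one little-endian word, so the bytes actually sit in memory in the reverse order. A small sketch with Python's struct module shows the in-memory order. These should be the first four bytes of the machine code for the mov instruction at main+9.

```python
import struct

# Word printed by `x $eip` in the session above.
word = 0x5C2444C7

# "<I" packs an unsigned 32-bit integer in little-endian byte order,
# which is how x86 stores it in memory.
raw = struct.pack("<I", word)
in_memory = " ".join(f"{b:02x}" for b in raw)
print(in_memory)

# Reading the same bytes back as a little-endian word gives the value
# that GDB displayed.
(roundtrip,) = struct.unpack("<I", raw)
print(hex(roundtrip))
```

Notice that the last byte in memory is 5c, the same 0x5c offset that appears in [esp+0x5c].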
The examine command can be customized to suit our needs. For example, we can specify the format in which GDB prints the data. By default, GDB prints values in hexadecimal format. The command to switch format is x/[format] [memory_address].
The following are some data type formats.
- x : Hexadecimal format
- o : Octal format
- u : Unsigned decimal format
- t : Binary format
- d : Signed decimal format
(gdb) x/x $eax
0xbffff864: 0xbffff975
(gdb) x/o $eax
0xbffff864: 027777774565
(gdb) x/u $eax
0xbffff864: 3221223797
(gdb) x/t $eax
0xbffff864: 10111111111111111111100101110101
(gdb) x/d $eax
0xbffff864: -1073743499
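All five outputs above are the same 32-bit value written in different ways. As a quick sketch, we can reproduce them in Python. The only tricky one is d, which treats the word as a signed two's complement number. The helper name as_signed32 is just for illustration.

```python
# The word stored at 0xbffff864 in the session above.
value = 0xBFFFF975


def as_signed32(v):
    """Interpret a 32-bit word as a signed two's complement number."""
    return v - (1 << 32) if v & 0x80000000 else v


formats = {
    "x": f"0x{value:08x}",        # hexadecimal
    "o": f"0{value:o}",           # octal, with the leading 0 GDB prints
    "u": str(value),              # unsigned decimal
    "t": f"{value:032b}",         # binary, 32 digits
    "d": str(as_signed32(value))  # signed decimal
}

for letter, text in formats.items():
    print(f"x/{letter} -> {text}")
```

Because the top bit of the word is 1, the signed result is the unsigned value minus 2^32, which is why x/d prints a negative number.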
Also, there are some special formats. If we think there is a character string at a memory address, we can use the s format, which makes GDB print the bytes starting at that address as a null-terminated string.
In the above disassembly, we can see the program places the memory address 0x8048529 at the top of the stack and calls puts@plt. So we can guess there should be a string at this memory address. Here I examined that address.
(gdb) x/s 0x8048529
0x8048529: "Try again?"
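What x/s does can be sketched in a few lines: start at the address, keep reading bytes until a zero byte, and show everything before it as text. In the sketch below, fake_memory is made-up data that stands in for the bytes at 0x8048529. Only the "Try again?" part comes from the session above, and the bytes after the terminator are invented.

```python
def read_c_string(memory, offset=0):
    """Read a null-terminated ASCII string starting at `offset`."""
    end = memory.index(b"\x00", offset)  # position of the terminator
    return memory[offset:end].decode("ascii")


# Hypothetical memory contents: the string, its terminator, then
# unrelated bytes that must not be printed.
fake_memory = b"Try again?\x00\x01\x1b\x03"

print(read_c_string(fake_memory))
# Starting in the middle works the same way, like x/s on a later address.
print(read_c_string(fake_memory, 4))
```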
Next, we can print a CPU instruction by specifying the format i. We know the eip register points to a CPU instruction, so we can check what that instruction is with the following command.
(gdb) x/i $eip
0x80483fd <main+9>: mov DWORD PTR [esp+0x5c],0x0
We can also specify the number of units to show. By default, gdb shows one unit, and the default unit size is a word, which in GDB's x command means 4 bytes. The syntax to specify the number of units is x/[unit_number][format] [memory_address].
Let's examine 20 words from the top of the stack.
(gdb) x/20x $esp
0xbffff750: 0x00000000 0x00000001 0xb7fff8f8 0xb7f0186e
0xbffff760: 0xb7fd7ff4 0xb7ec6165 0xbffff778 0xb7eada75
0xbffff770: 0xb7fd7ff4 0x08049620 0xbffff788 0x080482e8
0xbffff780: 0xb7ff1040 0x08049620 0xbffff7b8 0x08048469
0xbffff790: 0xb7fd8304 0xb7fd7ff4 0x08048450 0xbffff7b8
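The dump above prints four 4-byte words per row, so each row starts 0x10 bytes after the previous one. A short sketch of the address arithmetic shows where any word in the dump lives, and how many words we would have to ask for to reach the variable at [esp+0x5c] from the disassembly. The helper names are made up for illustration.

```python
WORD_SIZE = 4              # bytes per unit in the dump above
ESP = 0xBFFFF750           # esp value from the session above


def word_address(base, index):
    """Address of the word at position `index` (0-based) in the dump."""
    return base + index * WORD_SIZE


def words_needed(offset):
    """How many words x/Nx must print to include the word at `offset`."""
    return offset // WORD_SIZE + 1


# Start address of each of the five rows printed by x/20x $esp.
row_starts = [hex(word_address(ESP, i)) for i in range(0, 20, 4)]
print(row_starts)

# The variable at [esp+0x5c] lies past the 20 words shown.
print(hex(ESP + 0x5C), words_needed(0x5C))
```

Since 20 words only cover 80 bytes, the word at esp+0x5c (address 0xbffff7ac) is not in the dump. Asking for 24 words, as in x/24x $esp, would be enough to include it.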
I think you now have an idea of how to examine the memory.
Running, Continuing, and stepping the execution.
We can start the execution of the program with the command run. If we want to give command line arguments, we can supply them after the run command as follows.
(gdb) run AAAAAA
If there is a breakpoint, GDB stops the execution at that point. To continue the execution, we can use the command continue or the shorthand command c.
We can also execute a single CPU instruction at a time using the ni command. ni is short for nexti, which stands for next instruction.
So guys, that's all for this tutorial. I hope you enjoyed it. Feel free to leave a comment. Thank you for reading.