Debugging Binaries with GDB
GDB, the GNU Debugger, ships with the GNU toolchain and is the standard debugging tool in Linux environments. We used GDB many times in the previous protostar stack0 walkthrough, so in this post I'm going to explain how to use it to debug and analyze a binary file. If you are planning to learn reverse engineering, malware analysis, or exploit development, you must be familiar with debuggers. To understand disassembly, the stack, and related concepts, I suggest reading the tutorials on those topics first.
Starting the debugging
We can open a binary inside GDB with the command gdb ./[binary_file], where binary_file is the name of the file we want to debug. When it starts, GDB prints a welcome banner with version and license information. In general we don't need this banner, so if it bothers you, start GDB in quiet mode with the -q option, which suppresses it.
gdb -q ./stack0
Our next step is to disassemble the binary and understand the structure of the program.
Disassemble a binary
There are two main assembly syntax styles: Intel syntax and AT&T syntax. You can use whichever you prefer. I find Intel syntax clearer and easier to read, so I use it in my disassembly listings. By default, GDB uses AT&T syntax. We can switch to Intel syntax by entering the following command.
set disassembly-flavor intel
If you want to switch back, use the command
set disassembly-flavor att
If you find it tedious to switch syntax every time you start GDB, you can make Intel syntax permanent by adding the setting to the .gdbinit file in your home folder. Enter the following command in your shell. Note that > overwrites an existing ~/.gdbinit, so use >> instead to append if you already have one.
echo 'set disassembly-flavor intel' > ~/.gdbinit
After you load a binary in GDB, you can disassemble a function to see its assembly code with disassemble function_name or the shorthand disas function_name. For example, to disassemble the main function, use disassemble main or disas main.
(gdb) disass main
Dump of assembler code for function main:
0x080483f4 <main+0>: push ebp
0x080483f5 <main+1>: mov ebp,esp
0x080483f7 <main+3>: and esp,0xfffffff0
0x080483fa <main+6>: sub esp,0x60
0x080483fd <main+9>: mov DWORD PTR [esp+0x5c],0x0
0x08048405 <main+17>: lea eax,[esp+0x1c]
0x08048409 <main+21>: mov DWORD PTR [esp],eax
0x0804840c <main+24>: call 0x804830c <gets@plt>
0x08048411 <main+29>: mov eax,DWORD PTR [esp+0x5c]
0x08048415 <main+33>: test eax,eax
0x08048417 <main+35>: je 0x8048427 <main+51>
0x08048419 <main+37>: mov DWORD PTR [esp],0x8048500
0x08048420 <main+44>: call 0x804832c <puts@plt>
0x08048425 <main+49>: jmp 0x8048433 <main+63>
0x08048427 <main+51>: mov DWORD PTR [esp],0x8048529
0x0804842e <main+58>: call 0x804832c <puts@plt>
0x08048433 <main+63>: leave
0x08048434 <main+64>: ret
End of assembler dump.
On the left side are the memory addresses where the CPU instructions are loaded, each followed by its offset from the start of main. On the right side are the assembly instructions, such as push ebp and mov ebp,esp. These instructions operate on the CPU registers and memory.
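Because x86 instructions vary in length, the gap between consecutive addresses in the listing tells us how many bytes each instruction occupies. The following is a minimal Python sketch, not part of GDB, that recomputes the <main+N> offsets and the instruction sizes from the first six addresses of the dump for main. The variable and function names are made up for illustration.

```python
# Addresses copied from the first six lines of the `disass main` dump.
MAIN_START = 0x080483F4
addresses = [0x080483F4, 0x080483F5, 0x080483F7, 0x080483FA, 0x080483FD, 0x08048405]

def offsets_from_start(addrs, start):
    """Return the <main+N> offset for each address."""
    return [a - start for a in addrs]

def instruction_sizes(addrs):
    """Return the byte length of every instruction except the last,
    computed as the distance to the next instruction's address."""
    return [nxt - cur for cur, nxt in zip(addrs, addrs[1:])]

offsets = offsets_from_start(addresses, MAIN_START)
sizes = instruction_sizes(addresses)

for addr, off in zip(addresses, offsets):
    print(f"0x{addr:08x} <main+{off}>")
print("sizes in bytes:", sizes)
```

The 8-byte gap after 0x080483fd belongs to mov DWORD PTR [esp+0x5c],0x0, which is long because it carries both a displacement and a 4-byte immediate value.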
Breakpoints
Breakpoints are essential in debugging. They let us stop the execution of the program at a chosen point and examine the memory and registers. You can set a breakpoint on a function with the command break. For example, to break at the main function, use break main or the shorthand b main. Let's set a breakpoint on the above binary and run it to see what happens.
(gdb) b main
Breakpoint 1 at 0x80483fd: file stack0/stack0.c, line 10.
(gdb) run
Starting program: /opt/protostar/bin/stack0
Breakpoint 1, main (argc=1, argv=0xbffff864) at stack0
I used the b main command, and GDB created a breakpoint at the memory address 0x80483fd. Look at the disassembled code above and find what is at that address: the instruction is mov DWORD PTR [esp+0x5c],0x0. So GDB has skipped the following instructions.
0x080483f4 <main+0>: push ebp
0x080483f5 <main+1>: mov ebp,esp
0x080483f7 <main+3>: and esp,0xfffffff0
0x080483fa <main+6>: sub esp,0x60
This is because these instructions form the function prologue generated by the compiler. The function prologue builds the stack frame of the function. When you set a breakpoint by function name, GDB automatically skips the function prologue. So if you want to watch the stack frame being built, use a memory address instead of the function name. Let's set a breakpoint at the first instruction of main.
(gdb) b *0x080483f4
Breakpoint 2 at 0x80483f4: file stack0/stack0.c, line 6.
Notice the asterisk before the memory address. It tells GDB to treat the argument as an address rather than as a function name or line number.
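To make the prologue concrete, the effect of its last three instructions on esp can be replayed with plain integer arithmetic. This is a rough Python sketch, not GDB output. It assumes esp equals 0xbffff7b8 right after push ebp, which is the ebp value shown in the register listing later in this post (mov ebp,esp copies esp into ebp). The function name is hypothetical.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits, as the CPU would

def replay_prologue(esp_after_push):
    """Replay mov ebp,esp / and esp,0xfffffff0 / sub esp,0x60."""
    ebp = esp_after_push               # mov ebp,esp
    esp = esp_after_push & 0xFFFFFFF0  # and esp,0xfffffff0 (align down to 16 bytes)
    esp = (esp - 0x60) & MASK32        # sub esp,0x60 (reserve 96 bytes for locals)
    return ebp, esp

# Assumed starting value: esp just after `push ebp`.
ebp, esp = replay_prologue(0xBFFFF7B8)
print(f"ebp = 0x{ebp:08x}")
print(f"esp = 0x{esp:08x}")
```

The result, 0xbffff750, agrees with the esp value GDB reports once execution stops at main+9.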
Examine the memory and the registers
This is the most important part of our reverse engineering task. GDB gives us several ways to examine memory and registers and see what is inside them.
Examine registers
To examine registers, the program must be running. We set a breakpoint at the point of interest and run the program. After GDB stops the execution, we can examine the registers. I have created a breakpoint at the main function using b main and started the program, so GDB has paused the execution at main. We can use the info registers command or the shorthand i r to examine all registers. See the following example.
(gdb) i r
eax    0xbffff864  -1073743772
ecx    0xa9493a07  -1454818809
edx    0x1         1
ebx    0xb7fd7ff4  -1208123404
esp    0xbffff750  0xbffff750
ebp    0xbffff7b8  0xbffff7b8
esi    0x0         0
edi    0x0         0
eip    0x80483fd   0x80483fd <main+9>
eflags 0x200286    [ PF SF IF ID ]
cs     0x73        115
ss     0x7b        123
ds     0x7b        123
es     0x7b        123
fs     0x0         0
gs     0x33        51
GDB listed the registers and their current values. We can also examine a specific register by name with the i r [register_name] command. Let's see what is inside the esp register.
(gdb) i r esp
esp    0xbffff750  0xbffff750
We know the esp register points to the top of the stack, so examining esp gives us the address of the top of the stack. We can also examine multiple registers at once by listing their names.
(gdb) i r esp eip
esp    0xbffff750  0xbffff750
eip    0x80483fd   0x80483fd <main+9>
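The eflags line in the full register listing shows a raw value, 0x200286, next to the flag names GDB decoded from it. As a rough illustration of that decoding (not GDB's own code), the Python sketch below tests individual bits using the standard x86 EFLAGS bit positions. The table covers only the common flags, and the function name is made up for this example.

```python
# Bit positions of common x86 EFLAGS flags (a subset, in ascending bit order).
EFLAGS_BITS = [
    ("CF", 0), ("PF", 2), ("AF", 4), ("ZF", 6), ("SF", 7),
    ("TF", 8), ("IF", 9), ("DF", 10), ("OF", 11), ("ID", 21),
]

def decode_eflags(value):
    """Return the names of the flags whose bit is set in value."""
    return [name for name, bit in EFLAGS_BITS if value & (1 << bit)]

flags = decode_eflags(0x200286)
print(flags)
```

For 0x200286 this yields PF, SF, IF, and ID, the same set GDB printed. Bit 1 is also set, but it is a reserved bit with no flag name.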
Examine memory addresses
Here we are going to see how to examine a memory address. The command we use is x [memory_address]. Above, we saw that the eip register contains the value 0x80483fd. This is the memory address of the next instruction waiting to be executed by the CPU. Let's see what is at that location.
(gdb) x 0x80483fd
0x80483fd <main+9>: 0x5c2444c7
We can do both of the above steps at once. GDB exposes the value a register holds as $register_name, so $eip is the value of eip. We can therefore examine the memory that eip points to with the command x $eip.
(gdb) x $eip
0x80483fd <main+9>: 0x5c2444c7
The examine command can be customized to suit our needs. For example, we can specify the format in which GDB prints the value. Initially GDB prints values in hexadecimal, and afterwards it reuses the format you last specified. The syntax to choose a format is x/[format] [memory_address]. The following are some of the formats.
- x : Hexadecimal format
- o : Octal format
- u : Unsigned decimal format
- t : Binary format
- d : Signed decimal format
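These format letters only change how the same 32-bit word is displayed. A minimal Python sketch of the equivalent conversions follows, using the sample word 0xbffff975. The helper name is hypothetical, and the output strings only imitate GDB's notation.

```python
def render_word(word):
    """Render one 32-bit word in the x, o, u, t and d formats."""
    word &= 0xFFFFFFFF
    # Two's complement: a word with the top bit set is negative when signed.
    signed = word - (1 << 32) if word & 0x80000000 else word
    return {
        "x": f"0x{word:08x}",  # hexadecimal
        "o": f"0{word:o}",     # octal, with a leading 0
        "u": str(word),        # unsigned decimal
        "t": f"{word:032b}",   # binary, padded to 32 bits
        "d": str(signed),      # signed decimal
    }

formats = render_word(0xBFFFF975)
for letter, text in formats.items():
    print(f"x/{letter}: {text}")
```

The u and d results differ only because the top bit of the word is set, so the signed interpretation is negative.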
Here each of these formats is applied to the word stored at the address held in eax.
(gdb) x/x $eax
0xbffff864: 0xbffff975
(gdb) x/o $eax
0xbffff864: 027777774565
(gdb) x/u $eax
0xbffff864: 3221223797
(gdb) x/t $eax
0xbffff864: 10111111111111111111100101110101
(gdb) x/d $eax
0xbffff864: -1073743499
There are also some special formats. If we think a memory address holds a character string, we can specify the s format, and GDB prints the bytes at that address as a null-terminated string. In the disassembly above, we can see that the program stores the address 0x8048529 at the top of the stack and then calls puts@plt. So we can guess there is a string at this memory address. Here I examined that address.
(gdb) x/s 0x8048529
0x8048529: "Try again?"
Next, we can print a CPU instruction by specifying the format i. We know the eip register points to a CPU instruction, so we can check what that instruction is with the following command.
(gdb) x/i $eip
0x80483fd <main+9>: mov DWORD PTR [esp+0x5c],0x0
We can also specify the number of units to show. By default, GDB shows one unit. Here a unit is a word, which is 4 bytes on this 32-bit architecture. The syntax to specify the unit count is x/[unit_count][format] [memory_address]. Let's examine 20 words from the top of the stack.
(gdb) x/20x $esp
0xbffff750: 0x00000000 0x00000001 0xb7fff8f8 0xb7f0186e
0xbffff760: 0xb7fd7ff4 0xb7ec6165 0xbffff778 0xb7eada75
0xbffff770: 0xb7fd7ff4 0x08049620 0xbffff788 0x080482e8
0xbffff780: 0xb7ff1040 0x08049620 0xbffff7b8 0x08048469
0xbffff790: 0xb7fd8304 0xb7fd7ff4 0x08048450 0xbffff7b8
That should give you a good idea of how to examine memory.
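One detail is worth a closer look: x $eip printed the word 0x5c2444c7, while x/i $eip decoded the same location as a mov instruction. x86 is little-endian, so the least significant byte of a word sits at the lowest address. The Python sketch below is only an illustration using the standard struct module. It shows how that word maps back to the bytes stored at 0x80483fd, and the variable names are arbitrary.

```python
import struct

word = 0x5C2444C7  # value printed by `x $eip`

# Pack as a little-endian unsigned 32-bit integer to recover memory order.
raw = struct.pack("<I", word)
byte_list = [f"{b:02x}" for b in raw]
print("bytes in memory order:", " ".join(byte_list))

# Reading the same four bytes back as a little-endian word gives the original.
(roundtrip,) = struct.unpack("<I", raw)
print(f"as a 32-bit word: 0x{roundtrip:08x}")
```

The last of the four bytes, 0x5c, is the same 0x5c displacement that appears in [esp+0x5c] in the decoded instruction.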
Running, continuing, and stepping the execution
We can start the execution of the program with the command run. If we want to pass command-line arguments, we can supply them after the run command as follows.
(gdb) run AAAAAA
If there is a breakpoint, GDB stops the execution at that location. To continue the execution, use the command continue or the shorthand c. We can also execute a single CPU instruction at a time using the ni command, which is short for nexti (next instruction). When the instruction is a call, ni runs the called function to completion before stopping again. That's all for this tutorial. I hope you enjoyed it. Feel free to leave a comment. Thank you for reading.