Aug 27, 2024

GDB reverse engineering tutorial

Today's topic is reverse engineering: we are going to disassemble a binary file and work out what it does. Let's run the program first and see how it behaves.

user@protostar:~$ ./rev 
HacksLand
user@protostar:~$

It prints the string "HacksLand" and exits. Can you guess what the source code looks like? We can't know for sure yet, but a reasonable first guess is:

#include <stdio.h>

int main(){
    printf("HacksLand\n");
    return 0;
}

Now let's start our actual reversing process. We can use GDB for this. If we are in a Windows environment, we can use IDA. I posted a tutorial on how to use GDB, so please take a look at it if you're not familiar with GDB.

user@protostar:~$ gdb -q ./rev
Reading symbols from /home/user/rev...(no debugging symbols found)...done.

(gdb) set disassembly-flavor intel

(gdb) disass main

Dump of assembler code for function main:
0x080483c4 : push    ebp
0x080483c5 : mov     ebp,esp
0x080483c7 : sub     esp,0x10
0x080483ca : mov     DWORD PTR [ebp-0xc],0x2
0x080483d1 : mov     DWORD PTR [ebp-0x8],0x3
0x080483d8 : mov     eax,DWORD PTR [ebp-0x8]
0x080483db : mov     edx,DWORD PTR [ebp-0xc]
0x080483de : lea     eax,[edx+eax*1]
0x080483e1 : mov     DWORD PTR [ebp-0x4],eax
0x080483e4 : cmp     DWORD PTR [ebp-0x4],0x7
0x080483e8 : jg      0x80483f8
0x080483ea : mov     DWORD PTR [esp],0x80484d0
0x080483f1 : call    0x80482f8
0x080483f6 : jmp     0x8048404
0x080483f8 : mov     DWORD PTR [esp],0x80484da
0x080483ff : call    0x80482f8
0x08048404 : mov     eax,0x0
0x08048409 : leave
0x0804840a : ret
End of assembler dump.

First, I switched to Intel syntax and disassembled the main function. You know that the first two assembly instructions are responsible for setting up the stack frame. If you missed our tutorial on Stack and Functions, please read it to get a clear idea about these instructions.

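Next, there is a sub esp,0x10 instruction. It reserves 16 bytes on the stack for local variables. This three-instruction prologue (push ebp, mov ebp,esp, sub esp,N) is very common, and you will see it at the start of most functions in unoptimized 32-bit x86 code.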

Next, there are two interesting assembly lines:

0x080483ca : mov DWORD PTR [ebp-0xc],0x2
0x080483d1 : mov DWORD PTR [ebp-0x8],0x3

What do they do? The first instruction stores the value 0x2 in the four bytes at address ebp-0xc (DWORD PTR means a 32-bit operand). So the stack slot at ebp-0xc now contains 2 (0x2 in hexadecimal equals 2 in decimal). Likewise, the second instruction stores 0x3 at ebp-0x8.

Do you know what happened here? I explained this in the stack tutorial: main is writing initial values into the space it reserved for local variables. So there should be at least two integer variables in main. To confirm, I set a breakpoint at the first of these two instructions, so we can examine the stack before and after each one runs.

(gdb) b *0x080483ca
Breakpoint 1 at 0x80483ca
(gdb) run
Starting program: /home/user/rev
Breakpoint 1, 0x080483ca in main ()
(gdb) x/x $ebp-0xc
0xbffff7dc: 0xb7fd7ff4
(gdb) ni
0x080483d1 in main ()
(gdb) x/x $ebp-0xc
0xbffff7dc: 0x00000002
(gdb) x/x $ebp-0x8
0xbffff7e0: 0x08048420
(gdb) ni
0x080483d8 in main ()
(gdb) x/x $ebp-0x8
0xbffff7e0: 0x00000003
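
The two addresses GDB printed are consistent with a single frame base: if 0xbffff7dc is ebp-0xc and 0xbffff7e0 is ebp-0x8, then ebp is 0xbffff7e8 in this run. As a minimal sketch, the address arithmetic can be written in C. The helper name slot_address is hypothetical, and the ebp value is inferred from the session above rather than printed by it.

```c
#include <stdint.h>

/* Hypothetical helper: address of the stack slot [ebp - offset]
   on 32-bit x86, where local variables live below the frame base. */
uint32_t slot_address(uint32_t ebp, uint32_t offset)
{
    return ebp - offset;
}
```

With ebp = 0xbffff7e8, slot_address(ebp, 0xc) gives 0xbffff7dc and slot_address(ebp, 0x8) gives 0xbffff7e0, matching the x/x output. By the same arithmetic, the third slot, ebp-0x4, would sit at 0xbffff7e4.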

Now it is crystal clear that 2 and 3 were copied to the stack. Let's see what happens next. Here we have another couple of instructions:

0x080483d8 : mov eax,DWORD PTR [ebp-0x8]
0x080483db : mov edx,DWORD PTR [ebp-0xc]

These two instructions load the values back from the stack into registers: 3 from [ebp-0x8] into eax, and 2 from [ebp-0xc] into edx. We can watch this happen in GDB:

(gdb) i r eax edx
eax            0xbffff894    -1073743724
edx            0x1           1
(gdb) ni
0x080483db in main ()
(gdb) i r eax edx
eax            0x3           3
edx            0x1           1
(gdb) ni
0x080483de in main ()
(gdb) i r eax edx
eax            0x3           3
edx            0x2           2

Ok. At this moment, eax is filled with 3, and edx contains 2. After this, we see the assembly instruction:

0x080483de : lea eax,[edx+eax*1]

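lea stands for "load effective address." It computes the address expression inside the brackets, edx+eax*1, and stores the result in the destination register without reading memory. Compilers often use it as a plain addition, and that is what happens here: eax becomes edx + eax. Let's see if this is true: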

(gdb) i r eax
eax            0x3           3
(gdb) ni
0x080483e1 in main ()
(gdb) i r eax
eax            0x5           5
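
To make the lea behaviour concrete, here is a small C model of the address computation. It is only a sketch: the function name lea and its parameters are chosen for illustration, and it models the arithmetic only, not the instruction encoding.

```c
#include <stdint.h>

/* Models the computation of `lea dst,[base+index*scale]`.
   lea only evaluates the expression; it never reads memory.
   Unsigned arithmetic wraps modulo 2^32, like a 32-bit register. */
uint32_t lea(uint32_t base, uint32_t index, uint32_t scale)
{
    return base + index * scale;
}
```

For the instruction above, base is edx (2), index is eax (3), and scale is 1, so lea(2, 3, 1) returns 5, the value GDB shows in eax.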

3 + 2 = 5, so eax now holds 5. Did you notice something odd here? The program stored 2 and 3 on the stack, then immediately loaded them back into registers. Why not move the constants into eax and edx directly? Because they are local variables: in a build without optimization (which this appears to be), the compiler gives each local variable its own slot on the stack and reads or writes that slot on every use. An optimizing build would likely skip the stack, or even compute the result at compile time.

Now we can imagine the source code for these steps will be something like the following:

#include <stdio.h>

int main(){
    int x=2;
    int y=3;
    int z;
    z = x + y;
    return 0;
}

Yes, at this moment, we skipped the string printing part. We only focused on the disassembly we understood so far.

0x080483e1 : mov DWORD PTR [ebp-0x4],eax
0x080483e4 : cmp DWORD PTR [ebp-0x4],0x7

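The first instruction stores the sum, 5, in the stack slot at ebp-0x4, which is our third local variable. After that, there is an interesting instruction called cmp. It compares that value with 0x7: internally it subtracts 7 from 5, throws the result away, and sets the CPU flags that the following conditional jump reads. Here, 5 is less than 7.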

Finally, let's see the instructions:

0x080483e8 : jg      0x80483f8
0x080483ea : mov     DWORD PTR [esp],0x80484d0
0x080483f1 : call    0x80482f8

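In the first instruction, jg means "jump if greater." The jump is taken only if, in the preceding cmp, the first operand was greater than the second (compared as signed integers). Since 5 is not greater than 7, the jump is not taken and execution continues with the next instruction.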
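
The cmp/jg pair can be sketched in C by computing the flags explicitly. This is an illustrative model under the usual x86 definitions (jg is taken when ZF = 0 and SF = OF); the function name jg_taken is hypothetical.

```c
#include <stdint.h>

/* Sketch of `cmp a,b` followed by `jg`: cmp computes a - b, discards
   the result, and keeps only the flags. jg is taken when ZF == 0 and
   SF == OF, which is the signed "a > b" test. */
int jg_taken(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a;
    uint32_t ub = (uint32_t)b;
    uint32_t res = ua - ub;                           /* what cmp computes */

    int zf = (res == 0);                              /* zero flag */
    int sf = (int)(res >> 31);                        /* sign flag */
    int of = (int)(((ua ^ ub) & (ua ^ res)) >> 31);   /* signed overflow */

    return !zf && (sf == of);
}
```

For our values, jg_taken(5, 7) returns 0, so the jump is skipped. It would also return 0 for jg_taken(7, 7), because equal operands set ZF; only a value of 8 or more would take the jump.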

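The next two instructions store the address 0x80484d0 at the top of the stack ([esp], not esp itself), where it becomes the argument of the following call. That address presumably points to the "HacksLand" string, which x/s 0x80484d0 in GDB would confirm. The call target 0x80482f8 is the output routine. From this listing alone we can't tell whether it is printf or puts, since compilers commonly replace a printf of a constant string ending in a newline with puts. Notice also the other branch: if the jump had been taken, the code at 0x80483f8 would make the same call with a different string, the one at 0x80484da. So the original has an else branch too, and because jg skips the first block only when z is greater than 7, the condition is z <= 7. Yes, finally, we reached our goal. Our reconstructed program is:

#include <stdio.h>

int main(){
    int x=2;
    int y=3;
    int z;
    z = x + y;
    if (z <= 7){
        printf("HacksLand\n");
    } else {
        /* placeholder: the string at 0x80484da was not examined here */
        printf("...\n");
    }
    return 0;
}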
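
As a cross-check, the control flow of the disassembly can be mirrored one instruction group at a time in C. This is a sketch, not the author's source: the function name call_argument is hypothetical, and it returns the address that the listing stores at [esp] before the call instead of printing anything. The two constants are the addresses from the listing. The string behind 0x80484da was not examined in this walkthrough.

```c
#include <stdint.h>

/* Hypothetical helper mirroring main's control flow: returns the
   address that the listing stores at [esp] before the call. */
uint32_t call_argument(int32_t x, int32_t y)
{
    int32_t z = x + y;          /* lea eax,[edx+eax*1]; mov [ebp-0x4],eax */

    if (z > 7)                  /* cmp [ebp-0x4],0x7; jg 0x80483f8 */
        return 0x080484dau;     /* mov DWORD PTR [esp],0x80484da */

    return 0x080484d0u;         /* mov DWORD PTR [esp],0x80484d0 */
}
```

With the values from the binary, call_argument(2, 3) returns 0x080484d0, the branch that runs when z is 5. Any sum above 7 would select 0x080484da instead.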
