Thilan Dissanayaka Web App Hacking Mar 23

GDB reverse engineering tutorial

Hi, I've picked an interesting topic to discuss. Here, we are going to disassemble a binary file and work out what it does. This process is called reverse engineering. Let's first run the program and observe its behavior.

    user@protostar:~$ ./rev 
    HacksLand
    user@protostar:~$

It just prints the string "HacksLand" and exits. Can you guess what kind of code produces this? We can assume it looks something like the following. We don't know for sure yet, but let's imagine:

    #include <stdio.h>

    int main(){
        printf("HacksLand\n");
        return 0;
    }

Now let's start the actual reversing process. We can use GDB for this. A disassembler such as IDA is another option, for example if you are working on Windows. I posted a tutorial on how to use GDB, so please take a look at it if you're not familiar with GDB.

    user@protostar:~$ gdb -q ./rev
    Reading symbols from /home/user/rev...(no debugging symbols found)...done.

    (gdb) set disassembly-flavor intel

    (gdb) disass main

    Dump of assembler code for function main:
    0x080483c4 : push    ebp
    0x080483c5 : mov     ebp,esp
    0x080483c7 : sub     esp,0x10
    0x080483ca : mov     DWORD PTR [ebp-0xc],0x2
    0x080483d1 : mov     DWORD PTR [ebp-0x8],0x3
    0x080483d8 : mov     eax,DWORD PTR [ebp-0x8]
    0x080483db : mov     edx,DWORD PTR [ebp-0xc]
    0x080483de : lea     eax,[edx+eax*1]
    0x080483e1 : mov     DWORD PTR [ebp-0x4],eax
    0x080483e4 : cmp     DWORD PTR [ebp-0x4],0x7
    0x080483e8 : jg      0x80483f8
    0x080483ea : mov     DWORD PTR [esp],0x80484d0
    0x080483f1 : call    0x80482f8
    0x080483f6 : jmp     0x8048404
    0x080483f8 : mov     DWORD PTR [esp],0x80484da
    0x080483ff : call    0x80482f8
    0x08048404 : mov     eax,0x0
    0x08048409 : leave
    0x0804840a : ret
    End of assembler dump.

First, I switched to Intel syntax and disassembled the main function. You know that the first two assembly instructions are responsible for setting up the stack frame. If you missed our tutorial on Stack and Functions, please read it to get a clear idea about these instructions.

Next, there is a sub esp instruction. It allocates space for local variables, 0x10 (16) bytes in this case. These few instructions are very common, and you will see them at the start of most functions in unoptimized code.

Next, there are two interesting assembly lines:

    0x080483ca : mov DWORD PTR [ebp-0xc],0x2
    0x080483d1 : mov DWORD PTR [ebp-0x8],0x3

What do they do? The first instruction stores 0x2 at the address ebp-0xc. So now the stack location ebp-0xc contains 2 (0x2 in hexadecimal is equal to 2 in decimal). In the same way, the second instruction stores 0x3 at ebp-0x8.

Do you know what happened here? In the stack tutorial, I explained this. The main function copies some data into the space allocated for local variables on the stack. So there should be at least two integer variables in main. Here, I have set a breakpoint just before these two instructions are executed, so we can examine the stack before and after each of them runs.

    (gdb) b *0x080483ca
    Breakpoint 1 at 0x80483ca
    (gdb) run
    Starting program: /home/user/rev
    Breakpoint 1, 0x080483ca in main ()
    (gdb) x/x $ebp-0xc
    0xbffff7dc: 0xb7fd7ff4
    (gdb) ni
    0x080483d1 in main ()
    (gdb) x/x $ebp-0xc
    0xbffff7dc: 0x00000002
    (gdb) x/x $ebp-0x8
    0xbffff7e0: 0x08048420
    (gdb) ni
    0x080483d8 in main ()
    (gdb) x/x $ebp-0x8
    0xbffff7e0: 0x00000003
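
To make the offsets easier to picture, here is a minimal C sketch of the local area that "sub esp,0x10" reserves. It is only a model under my own assumptions: the names frame_t, slot_index, store_dword and load_dword are hypothetical helpers, not anything found in the binary, and the real stack is plain memory rather than a struct.

```c
#include <assert.h>
#include <stdint.h>

#define FRAME_BYTES 0x10   /* from "sub esp,0x10" in the disassembly */

/* Hypothetical model of main's local area: four 4-byte slots below ebp. */
typedef struct {
    uint32_t slot[FRAME_BYTES / 4];
} frame_t;

/* Map an offset below ebp (0x4, 0x8, 0xc, 0x10) to a slot index.
   slot[0] is the lowest address (ebp-0x10), slot[3] the highest (ebp-0x4). */
unsigned slot_index(unsigned offset_below_ebp) {
    assert(offset_below_ebp % 4 == 0);
    assert(offset_below_ebp >= 4 && offset_below_ebp <= FRAME_BYTES);
    return (FRAME_BYTES - offset_below_ebp) / 4;
}

/* Model of "mov DWORD PTR [ebp-offset],value". */
void store_dword(frame_t *f, unsigned offset_below_ebp, uint32_t value) {
    f->slot[slot_index(offset_below_ebp)] = value;
}

/* Model of "mov reg,DWORD PTR [ebp-offset]". */
uint32_t load_dword(const frame_t *f, unsigned offset_below_ebp) {
    return f->slot[slot_index(offset_below_ebp)];
}
```

Calling store_dword(&f, 0xc, 0x2) and then load_dword(&f, 0xc) gives back 2, which mirrors the two x/x commands around the first ni in the session above.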

Now it is crystal clear that 2 and 3 were copied to the stack. Let's see what happens next. Here we have another couple of instructions:

    0x080483d8 : mov eax,DWORD PTR [ebp-0x8]
    0x080483db : mov edx,DWORD PTR [ebp-0xc]

These two instructions will copy 2 and 3 to registers. Yes, 2 into edx and 3 into eax. [ebp-0x8] contains 3, and [ebp-0xc] contains 2. We can see this in GDB:

    (gdb) i r eax edx
    eax            0xbffff894    -1073743724
    edx            0x1           1
    (gdb) ni
    0x080483db in main ()
    (gdb) i r eax edx
    eax            0x3           3
    edx            0x1           1
    (gdb) ni
    0x080483de in main ()
    (gdb) i r eax edx
    eax            0x3           3
    edx            0x2           2

Ok. At this moment, eax is filled with 3, and edx contains 2. After this, we see the assembly instruction:

    0x080483de : lea eax,[edx+eax*1]

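lea stands for "load effective address." It computes the address expression inside the brackets, here edx + eax*1, and stores the result in the destination register without accessing memory. Compilers often use it as a cheap way to do arithmetic, so in this case it simply adds eax and edx and saves the result in eax. Let's see if this is true: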

    (gdb) i r eax
    eax            0x3           3
    (gdb) ni
    0x080483e1 in main ()
    (gdb) i r eax
    eax            0x5           5
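
If the bracket syntax looks confusing, the following small C sketch may help. It models what lea computes for a [base+index*scale] operand; the function name lea_model is my own, and the sketch assumes 32-bit registers as in this binary.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "lea dst,[base+index*scale]": compute the address expression
   and return it, without reading memory at that address. */
uint32_t lea_model(uint32_t base, uint32_t index, uint32_t scale) {
    /* x86 only encodes scale factors of 1, 2, 4, or 8 */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale;   /* wraps modulo 2^32, like the register */
}
```

With the values from the session, lea_model(2, 3, 1) corresponds to edx = 2, eax = 3 and a scale of 1, and returns 5.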

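3 + 2 = 5, so eax now holds 5. Did you notice something odd here? We copied 2 and 3 onto the stack, then copied them from the stack into registers. Why not load the two values straight into eax and edx? Because they are local variables, and this binary appears to have been compiled without optimization. In that case the compiler gives every local variable its own slot on the stack and reads it back whenever the variable is used. With optimization turned on, a compiler would typically keep the values in registers or even compute the sum at compile time.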

Now we can imagine the source code for these steps will be something like the following:

    #include <stdio.h>

    int main(){
        int x=2;
        int y=3;
        int z;
        z = x + y;
        return 0;
    }

Yes, at this moment, we skipped the string printing part. We only focused on the disassembly we understood so far.

    0x080483e1 : mov DWORD PTR [ebp-0x4],eax
    0x080483e4 : cmp DWORD PTR [ebp-0x4],0x7

The first instruction saves our calculated value of 2 + 3 into ebp-0x4. After that comes an interesting instruction called cmp. It compares that value with 0x7 by subtracting internally and setting the CPU flags, without changing either operand. So here it compares 5 with 7, and we know that 5 is less than 7.

Finally, let's see the instructions:

    0x080483e8 : jg      0x80483f8
    0x080483ea : mov     DWORD PTR [esp],0x80484d0
    0x080483f1 : call    0x80482f8

In the first instruction, jg means "jump if greater." It looks at the flags set by the preceding cmp and jumps only if the first operand was greater than the second, treating both as signed numbers. We know that 5 is less than 7, so it won't jump.
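
For readers who want to see how cmp and jg fit together, here is a rough C sketch of the flag logic. It is a model, not code from the binary: flags_t, cmp_flags and jg_taken are hypothetical names, and only the three flags that jg looks at (ZF, SF, OF) are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The three flags that jg inspects after a cmp. */
typedef struct { bool zf, sf, of; } flags_t;

/* Model of "cmp a,b": compute a - b on 32 bits and keep only the flags. */
flags_t cmp_flags(int32_t a, int32_t b) {
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t diff = ua - ub;
    flags_t f;
    f.zf = (diff == 0);                 /* result is zero */
    f.sf = (diff >> 31) & 1;            /* sign bit of the result */
    /* signed overflow: a and b differ in sign, and the result's sign differs from a's */
    f.of = (((ua ^ ub) & (ua ^ diff)) >> 31) & 1;
    return f;
}

/* Model of "cmp a,b ; jg target": the jump is taken when ZF = 0 and SF = OF. */
bool jg_taken(int32_t a, int32_t b) {
    flags_t f = cmp_flags(a, b);
    bool taken = !f.zf && (f.sf == f.of);
    assert(taken == (a > b));           /* the flag rule agrees with signed > */
    return taken;
}
```

For our program, jg_taken(5, 7) is false, so execution falls through to the instructions after the jg. A value such as 8 would make it true and send execution to 0x80483f8 instead.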

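Because the jump is not taken, execution falls through to the next two instructions. The mov stores the address 0x80484d0 at the top of the stack (the location esp points to), which is how the first argument is passed to the function being called. The call then transfers control to 0x80482f8. Since the binary has no debugging symbols, we cannot be certain from the address alone, but this is most likely the stub for printf (or for puts, which GCC often substitutes when the format string is a plain string ending in a newline), and 0x80484d0 should be the address of our "HacksLand" string. Running x/s 0x80484d0 in GDB would confirm that. Note that the disassembly also contains a second mov/call pair at 0x080483f8 with a different string address, 0x80484da. That is the branch taken when the value is greater than 7, and it never runs here. Yes, finally, we reached our goal. Our original program most likely looks like this: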

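    #include <stdio.h>

    int main(){
        int x=2;
        int y=3;
        int z;
        z = x + y;
        if (z <= 7){    /* "cmp ...,0x7" + "jg" skips this branch only when z > 7 */
            printf("HacksLand\n");
        } else {
            /* second call at 0x080483ff: prints the string at 0x80484da,
               whose contents we have not examined */
        }
        return 0;
    }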
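
The comparison operator in a reconstruction like this is easy to get off by one, so here is a small sketch for checking a candidate condition against the cmp/jg pair. The function names are hypothetical, and the sketch assumes the variable is a signed int, which is what jg suggests.

```c
#include <assert.h>
#include <stdbool.h>

/* What the disassembly does: "cmp z,0x7 ; jg else_branch".
   The first printing call runs only when the jump is NOT taken. */
bool first_call_runs_asm(int z) {
    bool jump_taken = (z > 7);   /* jg: signed greater-than */
    return !jump_taken;
}

/* Two C conditions that would both match that behaviour. */
bool cond_le_7(int z) { return z <= 7; }
bool cond_lt_8(int z) { return z < 8; }

/* A stricter condition that would NOT match: it differs at z == 7. */
bool cond_lt_7(int z) { return z < 7; }
```

Comparing these functions over a range of z values shows that z <= 7 and z < 8 agree with the assembly everywhere, while z < 7 disagrees at z == 7. The disassembly alone cannot tell us which of the two matching forms the author actually wrote.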