Thilan Dissanayaka Low level Development Mar 23

GDB reverse engineering tutorial

Hi, I picked an interesting topic for this post. We are going to disassemble a binary file and work out what it does. This process is called reverse engineering. Let's first run the program and observe its behavior.

    user@protostar:~$ ./rev 
    HacksLand
    user@protostar:~$

It prints the string "HacksLand" and exits. Can you guess what kind of source code produces this? A reasonable first guess, which we cannot confirm yet, is the following:

    #include <stdio.h>

    int main(){
        printf("HacksLand\n");
        return 0;
    }

Now let's start the actual reversing process. We will use GDB here; IDA is a common alternative, particularly on Windows. I posted a tutorial on how to use GDB, so please take a look at it if you're not familiar with GDB.

    user@protostar:~$ gdb -q ./rev
    Reading symbols from /home/user/rev...(no debugging symbols found)...done.

    (gdb) set disassembly-flavor intel

    (gdb) disass main

    Dump of assembler code for function main:
    0x080483c4 : push    ebp
    0x080483c5 : mov     ebp,esp
    0x080483c7 : sub     esp,0x10
    0x080483ca : mov     DWORD PTR [ebp-0xc],0x2
    0x080483d1 : mov     DWORD PTR [ebp-0x8],0x3
    0x080483d8 : mov     eax,DWORD PTR [ebp-0x8]
    0x080483db : mov     edx,DWORD PTR [ebp-0xc]
    0x080483de : lea     eax,[edx+eax*1]
    0x080483e1 : mov     DWORD PTR [ebp-0x4],eax
    0x080483e4 : cmp     DWORD PTR [ebp-0x4],0x7
    0x080483e8 : jg      0x80483f8
    0x080483ea : mov     DWORD PTR [esp],0x80484d0
    0x080483f1 : call    0x80482f8
    0x080483f6 : jmp     0x8048404
    0x080483f8 : mov     DWORD PTR [esp],0x80484da
    0x080483ff : call    0x80482f8
    0x08048404 : mov     eax,0x0
    0x08048409 : leave
    0x0804840a : ret
    End of assembler dump.

First, I switched to Intel syntax and disassembled the main function. You know that the first two assembly instructions are responsible for setting up the stack frame. If you missed our tutorial on Stack and Functions, please read it to get a clear idea about these instructions.

Next, there is a sub esp,0x10 instruction. It reserves 0x10 (16) bytes on the stack for local variables. This prologue sequence is very common, and you will see it at the start of most functions in unoptimized code.

Next, there are two interesting assembly lines:

    0x080483ca : mov DWORD PTR [ebp-0xc],0x2
    0x080483d1 : mov DWORD PTR [ebp-0x8],0x3

What do they do? The first instruction stores the 32-bit value 0x2 at the address ebp-0xc. So now the stack slot at ebp-0xc contains 2 (0x2 in hexadecimal is equal to 2 in decimal). The second instruction stores 0x3 at ebp-0x8 in the same way.

Do you know what happened here? In the stack tutorial, I explained this. The main function writes initial values into the space it reserved for local variables on the stack. There should be at least two integer variables in main. Here, I have set a breakpoint just before these two instructions execute, so we can examine the stack before and after each one runs.

    (gdb) b *0x080483ca
    Breakpoint 1 at 0x80483ca
    (gdb) run
    Starting program: /home/user/rev
    Breakpoint 1, 0x080483ca in main ()
    (gdb) x/x $ebp-0xc
    0xbffff7dc: 0xb7fd7ff4
    (gdb) ni
    0x080483d1 in main ()
    (gdb) x/x $ebp-0xc
    0xbffff7dc: 0x00000002
    (gdb) x/x $ebp-0x8
    0xbffff7e0: 0x08048420
    (gdb) ni
    0x080483d8 in main ()
    (gdb) x/x $ebp-0x8
    0xbffff7e0: 0x00000003
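
To make the ebp-relative addressing concrete, here is a minimal C sketch that models the 0x10 bytes reserved by sub esp,0x10 as a byte array. The names FRAME_BYTES, store_dword, and load_dword are hypothetical helpers for illustration only; they are not part of the binary or of GDB.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the locals area reserved by `sub esp,0x10`. */
#define FRAME_BYTES 0x10

/* Model of `mov DWORD PTR [ebp-offset], value`.
 * frame[0] stands for the byte at ebp-0x10, so [ebp-offset] starts at
 * index FRAME_BYTES - offset. */
static void store_dword(uint8_t frame[FRAME_BYTES], unsigned offset, uint32_t value)
{
    assert(offset >= 4 && offset <= FRAME_BYTES);
    memcpy(&frame[FRAME_BYTES - offset], &value, sizeof value);
}

/* Model of `mov reg, DWORD PTR [ebp-offset]`. */
static uint32_t load_dword(const uint8_t frame[FRAME_BYTES], unsigned offset)
{
    uint32_t value;
    assert(offset >= 4 && offset <= FRAME_BYTES);
    memcpy(&value, &frame[FRAME_BYTES - offset], sizeof value);
    return value;
}
```

Calling store_dword(frame, 0xc, 2) and then store_dword(frame, 0x8, 3) mirrors the two mov instructions. The two slots sit 4 bytes apart, just as 0xbffff7dc and 0xbffff7e0 do in the GDB session.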

Now it is crystal clear that 2 and 3 were copied to the stack. Let's see what happens next. Here we have another couple of instructions:

    0x080483d8 : mov eax,DWORD PTR [ebp-0x8]
    0x080483db : mov edx,DWORD PTR [ebp-0xc]

These two instructions will copy 2 and 3 to registers. Yes, 2 into edx and 3 into eax. [ebp-0x8] contains 3, and [ebp-0xc] contains 2. We can see this in GDB:

    (gdb) i r eax edx
    eax            0xbffff894    -1073743724
    edx            0x1           1
    (gdb) ni
    0x080483db in main ()
    (gdb) i r eax edx
    eax            0x3           3
    edx            0x1           1
    (gdb) ni
    0x080483de in main ()
    (gdb) i r eax edx
    eax            0x3           3
    edx            0x2           2

Ok. At this moment, eax is filled with 3, and edx contains 2. After this, we see the assembly instruction:

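    0x080483de : lea eax,[edx+eax*1]

lea stands for "load effective address." It computes the address expression inside the brackets, edx+eax*1, without reading memory, and stores the result in the destination register. Compilers often use it as a plain addition, so here it simply adds eax and edx and saves the result in eax. Let's see if this is true: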

    (gdb) i r eax
    eax            0x3           3
    (gdb) ni
    0x080483e1 in main ()
    (gdb) i r eax
    eax            0x5           5
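
The arithmetic that lea performs can be sketched in C as follows. This is only a model of the base + index*scale form used here (the optional displacement is left out), and the function name lea is a hypothetical helper, not a library call.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address: base + index * scale, computed without touching memory.
 * x86 only encodes a scale of 1, 2, 4, or 8. */
static uint32_t lea(uint32_t base, uint32_t index, uint32_t scale)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale;   /* wraps modulo 2^32, like a 32-bit register */
}
```

With edx = 2 as the base and eax = 3 as the index, lea(2, 3, 1) gives 5, which matches the register dump.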

3 + 2 = 5, so eax now holds 5. Did you notice something odd here? The program stored 2 and 3 on the stack, then immediately loaded them back from the stack into registers. Why not load the two constants into eax and edx directly? Because they are local variables, and a compiler that is not optimizing (which is what this listing appears to come from) gives every local variable a slot on the stack, writes each value to its slot, and reads it back whenever it is used. An optimizing build would likely skip this round trip.

Now we can imagine the source code for these steps will be something like the following:

    #include <stdio.h>

    int main(){
        int x=2;
        int y=3;
        int z;
        z = x + y;
        return 0;
    }

Yes, at this moment, we skipped the string printing part. We only focused on the disassembly we understood so far.

    0x080483e1 : mov DWORD PTR [ebp-0x4],eax
    0x080483e4 : cmp DWORD PTR [ebp-0x4],0x7

The first instruction saves our calculated value of 2 + 3 into the local variable at ebp-0x4. After that, there is an interesting instruction called cmp. It compares that value with 0x7 by subtracting 7 from 5 internally and setting the CPU flags according to the result; the result itself is discarded. We know 7 is greater than 5.

Finally, let's see the instructions:

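    0x080483e8 : jg      0x80483f8
    0x080483ea : mov     DWORD PTR [esp],0x80484d0
    0x080483f1 : call    0x80482f8

In the first instruction, jg means "jump if greater." It is a signed conditional jump that reads the flags set by the preceding cmp, and it is taken only when the first cmp operand (here 5) is greater than the second (7). Since 5 is not greater than 7, the jump is not taken and execution falls through to the next instruction.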
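
How cmp and jg cooperate through the flags can be sketched in C. This is a simplified model for 32-bit signed operands; struct flags, cmp32, jg_taken, and cmp_then_jg are hypothetical names chosen for illustration, and only the three flags that jg reads are modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The flags that jg reads after `cmp a, b` (which computes a - b). */
struct flags { bool zf, sf, of; };

static struct flags cmp32(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t diff = ua - ub;              /* the result is discarded by cmp */
    struct flags f;
    f.zf = (diff == 0);                   /* zero flag: operands are equal */
    f.sf = (diff >> 31) != 0;             /* sign flag: top bit of the result */
    /* overflow flag: operands differ in sign and the result's sign differs from a */
    f.of = (((ua ^ ub) & (ua ^ diff)) >> 31) != 0;
    return f;
}

/* jg is taken when ZF == 0 and SF == OF. */
static bool jg_taken(struct flags f)
{
    return !f.zf && (f.sf == f.of);
}

/* Sanity check: the flag-based rule agrees with C's signed `a > b`. */
static bool cmp_then_jg(int32_t a, int32_t b)
{
    bool taken = jg_taken(cmp32(a, b));
    assert(taken == (a > b));
    return taken;
}
```

For our values, cmp_then_jg(5, 7) is false, so the jump is not taken. Note that it would also be false for 7 compared with 7, which matters when turning the branch back into a C condition.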

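The mov stores the address 0x80484d0 at the top of the stack ([esp]), which is how the first argument is passed to a function, and call then transfers control to 0x80482f8. Because the program printed "HacksLand" on this path, 0x80484d0 should be the address of that string (x/s 0x80484d0 in GDB would show it), and 0x80482f8 is the entry for the print routine, which we assumed to be printf.

Look at the rest of the listing too. If the jump had been taken (z > 7), the code at 0x80483f8 would pass a different string, stored at 0x80484da, to the same function. We did not examine that string, so it is left as a placeholder below. Since jg skips the first branch only when z > 7, the condition for printing "HacksLand" is z <= 7. Yes, finally, we reached our goal. A reconstruction of the original program is:

    #include <stdio.h>

    int main(){
        int x=2;
        int y=3;
        int z;
        z = x + y;
        if (z <= 7){
            printf("HacksLand\n");
        } else {
            /* placeholder: the string at 0x80484da was not examined here */
            printf("...");
        }
        return 0;
    }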
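
One small consistency check on the two string addresses in the disassembly, 0x80484d0 and 0x80484da: they are 10 bytes apart. The sketch below assumes the two strings are stored back to back, which the listing does not prove, and gap_between is a hypothetical helper. Under that assumption the first string occupies 10 bytes, which fits "HacksLand" plus its terminating NUL but leaves no room for a trailing newline. That would be consistent with the compiler having replaced printf("HacksLand\n") with a puts-style call, a substitution GCC is known to make, but we have not confirmed it for this binary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Distance in bytes between two addresses taken from the disassembly. */
static size_t gap_between(uint32_t first, uint32_t second)
{
    assert(second >= first);
    return (size_t)(second - first);
}

/* sizeof on a string literal counts the terminating NUL byte as well. */
static const size_t size_without_newline = sizeof("HacksLand");    /* 9 chars + NUL */
static const size_t size_with_newline    = sizeof("HacksLand\n");  /* 10 chars + NUL */
```

Here gap_between(0x80484d0, 0x80484da) equals size_without_newline, not size_with_newline.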