Jan 07, 2024

Moving data with assembly

If you have used Windows or Linux, you have moved data or files before: the data goes from a source location to a destination. The assembly MOV instruction looks similar, but it is really a copy/paste. When we move a file, the data is removed from the source. MOV leaves the source untouched and only overwrites the destination.

Even in a simple assembly program, we use the mov instruction many times. The source can be a register, a memory address, or a constant value, and the destination can be a register or a memory address. Sometimes we copy a value from the stack to a register. At other times we copy the value of esp to ebp. For all such tasks, we can use the mov instruction.

Let's see the syntax of the mov instruction. Since we use Intel's assembly syntax, the format of the instruction is as follows.

mov destination, source

(If you prefer AT&T assembly syntax, you put the source first and the destination second.)

Ok, let's see some examples of moving data from one place to another place.

mov eax, 0x1

This Assembly instruction will copy the value 0x1 (0x1 is the hexadecimal representation of one) into the eax register.

The above instruction copies a constant value straight into a place. So we call it immediate mode.

mov eax, ebx

Can you imagine what the above code does? It will copy whatever value is found in ebx to the eax register.

Here we can see a practical example of this kind of register-to-register copy. I used the GDB debugger to demonstrate it.

(gdb) i r ebp esp
ebp            0xbffffd38       0xbffffd38
esp            0xbffffcb8       0xbffffcb8

First I examined esp and ebp using the info registers command in GDB (i r for short) to see what those registers hold. The esp register holds the value 0xbffffcb8 and ebp holds the value 0xbffffd38.

If you're not familiar with GDB please refer to the tutorial debugging binaries with GDB.

Next, I used x/i $eip to see the assembly instruction that is about to be executed. You can see that the instruction is mov ebp, esp. Well, you know what it is going to do. After that, I used the ni command (ni stands for next instruction) to execute that instruction on the CPU.

(gdb) x/i $eip
0x80483f5 :     mov    ebp,esp
(gdb) ni

Now, as we know, the CPU will copy whatever is found in esp to the ebp register. Let's see if that is true.

(gdb) i r ebp esp
ebp            0xbffffcb8       0xbffffcb8
esp            0xbffffcb8       0xbffffcb8

I used the info registers command again to see what's in esp and ebp. You can see the esp register holds the value 0xbffffcb8. That's fine. It is the old value of esp. So the value of esp has not changed.

What about ebp? It also holds the value 0xbffffcb8. So we can clearly see that the value of esp was copied to ebp. Great. Now we have seen it in practice.
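To make the copy behaviour concrete, here is a small Python sketch that replays the GDB session above. It is only a toy model: the regs dictionary and the mov helper are names I made up for illustration, not part of GDB or any real tool.

```python
# Toy model of CPU registers: a plain dictionary (hypothetical, for illustration only).
# The starting values are the ones GDB printed before the instruction ran.
regs = {"ebp": 0xbffffd38, "esp": 0xbffffcb8}


def mov(dst, src):
    """Model of `mov dst, src` between two registers: copy, do not move."""
    regs[dst] = regs[src]


mov("ebp", "esp")  # same idea as: mov ebp, esp

print(hex(regs["ebp"]))  # 0xbffffcb8 (copied from esp)
print(hex(regs["esp"]))  # 0xbffffcb8 (the source is unchanged)
```

The point of the sketch is the last line: after the mov, the source still holds its old value.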

In the following image, you can see the layout of a 32-bit register. A register is a small storage location inside the CPU. It can hold some data. In the 32-bit x86 architecture, the general-purpose registers are 32 bits in size. In the x86-64 architecture they are 64 bits.

Before the 32-bit architecture came into play, there was the 16-bit architecture, where a register was 16 bits in size. Then the 32-bit architecture came, and the registers were extended by another 16 bits (32 bits in total). Since the size was extended, the 32-bit registers are called extended registers. That's why there is an "E" at the beginning of register names such as EAX, EBX, ECX, etc.

Okay, now you know how we can copy data from one register to another register. Now focus on the following code line.

mov eax, [ebx]

If you look closely, you will see that this code is not the same as the previous one. Here the source (EBX) is wrapped in brackets. What does that mean? Here we are not copying the value of EBX into eax. Instead, the EBX register acts as a pointer. In C programming, you may have heard about pointers.

Here there is a memory address in the EBX register. We get that address and copy whatever is found at that memory address into the EAX register.

The following instruction is very similar to the above one.

mov [ebx], eax

It takes the value from eax. Then it takes the value from EBX and treats it as a memory address. Then it goes to that address and stores the value of eax there.
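The difference between reading through a pointer and writing through a pointer can be sketched in Python. This is a toy model under my own assumptions: regs, memory, load and store are hypothetical names, and the address 0x1000 and the value 0x2a are made up for the example.

```python
# Toy model (hypothetical names): registers and memory are plain dictionaries.
regs = {"eax": 0x0, "ebx": 0x1000}
memory = {0x1000: 0x2a}  # address -> value stored at that address


def load(dst, ptr):
    """Model of `mov dst, [ptr]`: read memory at the address held in ptr."""
    regs[dst] = memory[regs[ptr]]


def store(ptr, src):
    """Model of `mov [ptr], src`: write src's value to the address held in ptr."""
    memory[regs[ptr]] = regs[src]


load("eax", "ebx")   # mov eax, [ebx]: eax becomes 0x2a, not 0x1000
regs["eax"] += 1     # change eax so the store below is visible
store("ebx", "eax")  # mov [ebx], eax: memory at 0x1000 becomes 0x2b

print(hex(regs["eax"]), hex(memory[0x1000]), hex(regs["ebx"]))
```

In both directions ebx itself never changes. It only tells the CPU where to look.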

It's also possible to copy a direct value to a pointer location like the following.

mov [esp],0x1

This will copy the value 0x1 into the location pointed to by esp. Here you can see it in practice.

(gdb) i r esp
esp            0xbffffc50       0xbffffc50

(gdb) x/x 0xbffffc50
0xbffffc50:     0x080485e0

First I examined the value of esp. It is 0xbffffc50. Now we treat this as a memory address and check what is in that location. We can examine that by entering the command x/x 0xbffffc50. You can see there is a value of 0x080485e0.

Now we examine eip to check the next instruction and use the ni command to execute it on the CPU.

(gdb) x/i $eip
0x80484bc :    mov [esp],0x1
(gdb) ni

We saw that the next instruction is mov [esp],0x1. Now, according to the theory, 0x1 should be copied to the location pointed to by esp. Let's examine that location again to check it.

(gdb) i r esp
esp            0xbffffc50       0xbffffc50
(gdb) x/x 0xbffffc50
0xbffffc50:     0x00000001

Yes. Everything went as expected. I think you now understand the concept of pointed-to locations.

Now, what about the following example?

(gdb) i r esp eax
esp            0xbffffc50       0xbffffc50
eax            0xbffffc6c       -1073742740

(gdb) x/x 0xbffffc50
0xbffffc50:     0x00000000

(gdb) x/i $eip
0x804844d :    mov [esp],eax
(gdb) ni

(gdb) x/x 0xbffffc50
0xbffffc50:     0xbffffc6c

I won't explain this one in depth. If you understood the previous example, you can work out what's going on here. It takes the value of eax and copies it to the location pointed to by the esp register.

Can you understand the following assembly instruction? What does it do?

mov    eax, [esp+0x5c]

The concept is the same as mov eax, [esp]. But this time we add a hexadecimal value to the value of the esp register. What does it mean? Read the following GDB session and try to understand.

(gdb) i r esp eax
esp            0xbffffc50       0xbffffc50
eax            0x30     48

(gdb) x/x 0xbffffc50 + 0x5c
0xbffffcac:     0x66666666

(gdb) x/i $eip
0x8048471 :    mov    eax, [esp+0x5c]
(gdb) ni

(gdb) i r eax
eax            0x66666666       1717986918

When we execute mov eax, [esp+0x5c], the following happens.

First, we get the value of esp. It is 0xbffffc50.

Next, we add 0x5c to it. The answer is 0xbffffcac.

After that, we treat this answer as a memory address and go to that location. Finally, we copy whatever is found at that location into the eax register.
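The three steps above can be checked with a short Python sketch. It is a toy model, not a real emulator: regs, memory and load_with_offset are hypothetical names of my own, while the numbers come from the GDB session above.

```python
# Toy model (hypothetical names); the values are taken from the GDB session above.
regs = {"esp": 0xbffffc50, "eax": 0x30}
memory = {0xbffffcac: 0x66666666}  # the one location GDB showed us


def load_with_offset(dst, base, offset):
    """Model of `mov dst, [base+offset]`; returns the address that was read."""
    address = (regs[base] + offset) & 0xFFFFFFFF  # 32-bit addresses wrap around
    regs[dst] = memory[address]
    return address


address = load_with_offset("eax", "esp", 0x5c)  # mov eax, [esp+0x5c]

print(hex(address))      # 0xbffffcac
print(hex(regs["eax"]))  # 0x66666666
```

Note that esp keeps its value. The addition happens only while computing the address and is not written back to the register.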

Let's take another example.

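mov [eax], [ebx]

What happens here?

First, we get the value of the EBX register. We treat it as a memory address. Let's call this address A.

Next, we take the value from eax and treat that as a memory address. Let's call this address B.

The idea is to copy whatever is found at address A into address B. Be careful, though: x86 does not allow a mov with two memory operands, so an assembler will reject this exact line. In practice we go through a register: mov ecx, [ebx] followed by mov [eax], ecx. Got it?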
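One caveat worth knowing: x86 has no form of mov that takes two memory operands, so a memory-to-memory copy is normally written as two instructions that go through a scratch register. The Python sketch below models that two-step version. The names regs and memory, the choice of ecx as the scratch register, and all the addresses and values are my own assumptions for illustration.

```python
# Toy model (hypothetical names and made-up addresses).
regs = {"eax": 0x2000, "ebx": 0x1000, "ecx": 0x0}
memory = {0x1000: 0xdeadbeef, 0x2000: 0x0}

# Step 1, like `mov ecx, [ebx]`: read the value at address A (held in ebx).
regs["ecx"] = memory[regs["ebx"]]

# Step 2, like `mov [eax], ecx`: write that value to address B (held in eax).
memory[regs["eax"]] = regs["ecx"]

print(hex(memory[0x1000]))  # 0xdeadbeef (address A is unchanged)
print(hex(memory[0x2000]))  # 0xdeadbeef (address B now holds the copy)
```

The cost of the two-step version is that the scratch register gets overwritten, so pick one whose value you no longer need.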

Finally, it's possible to do something like the one below.

mov al, 0x1

This is a cool trick we often use when writing shellcode. Here al is not a separate register. It is a section of the eax register. To understand this, refer to the following image.

Since we are talking about the 32-bit architecture, a CPU register is 32 bits in length. The least significant two bytes of eax are called the ax register, which is 16 bits long. The ax part can be divided into two 8-bit parts: al (the low byte) and ah (the high byte). You can learn more about this in our CPU registers tutorial.
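How the smaller names overlap with eax can be sketched with bit masks in Python. This is a toy model: the helper functions are hypothetical, the starting value 0x11223344 is made up, and the high byte of ax goes by the name ah.

```python
# Toy model: eax as a 32-bit integer (the starting value is made up).
eax = 0x11223344


def ax(reg):
    """Low 16 bits of the register."""
    return reg & 0xFFFF


def al(reg):
    """Low 8 bits of ax."""
    return reg & 0xFF


def ah(reg):
    """High 8 bits of ax."""
    return (reg >> 8) & 0xFF


def mov_al(reg, value):
    """Model of `mov al, value`: replace only the lowest byte of the register."""
    return (reg & 0xFFFFFF00) | (value & 0xFF)


eax = mov_al(eax, 0x1)  # mov al, 0x1

print(hex(eax))      # 0x11223301
print(hex(ax(eax)))  # 0x3301
print(hex(ah(eax)))  # 0x33
```

Only the lowest byte changed. The upper three bytes of eax kept their old contents, which is worth remembering if you expected eax to become exactly 1.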

So guys, that's all for this document. I hope you learned something new. Thanks for reading.

