1.
Instead of using movl $constant, %register, use xor, mul and lea. Moving a 32-bit constant into a register costs five bytes, while xor, mul and lea cost only 2 to 3 bytes each. This can shrink the shellcode considerably.
The following is a quick example using the exit system call.
This is the original version that I wrote in the previous articles.
int main() {
    __asm__("movl $1, %eax;\
             movl $0, %ebx;\
             int $0x80;");
    return 0;
}
The size of each instruction is:
movl $1, %eax => 5 bytes
movl $0, %ebx => 5 bytes
int $0x80 => 2 bytes
------------------------------------
total size 12 bytes
However, if we rewrite the shellcode as follows:
int main() {
    __asm__("xorl %ebx, %ebx;\
             leal 0x1(%ebx), %eax;\
             int $0x80;");
    return 0;
}
The size of the shellcode becomes:
xorl %ebx, %ebx => 2 bytes
leal 0x1(%ebx), %eax => 3 bytes
int $0x80 => 2 bytes
--------------------------------------------
total size 7 bytes
Yes, that saves 5 bytes of shellcode. :D
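The byte counts above can be checked by writing out the machine code by hand. The following is a minimal sketch, assuming the usual 32-bit x86 encodings of these instructions; the array and function names are made up for illustration and do not come from the earlier articles.

```c
#include <assert.h>
#include <stddef.h>

/* exit(0) using immediate moves */
static const unsigned char exit_with_mov[] = {
    0xb8, 0x01, 0x00, 0x00, 0x00, /* movl $1, %eax        (5 bytes) */
    0xbb, 0x00, 0x00, 0x00, 0x00, /* movl $0, %ebx        (5 bytes) */
    0xcd, 0x80                    /* int $0x80            (2 bytes) */
};

/* exit(0) using xor and lea */
static const unsigned char exit_with_xor_lea[] = {
    0x31, 0xdb,                   /* xorl %ebx, %ebx      (2 bytes) */
    0x8d, 0x43, 0x01,             /* leal 0x1(%ebx), %eax (3 bytes) */
    0xcd, 0x80                    /* int $0x80            (2 bytes) */
};

/* Number of bytes saved by the xor/lea version. */
size_t exit_bytes_saved(void) {
    return sizeof exit_with_mov - sizeof exit_with_xor_lea;
}

/* Checks the totals quoted in the text. */
void check_exit_sizes(void) {
    assert(sizeof exit_with_mov == 12);
    assert(sizeof exit_with_xor_lea == 7);
}
```

Calling check_exit_sizes() from a small test program confirms the 12 and 7 byte totals without needing a 32-bit toolchain.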
Another example of shrinking the shellcode:
int main() {
    __asm__("jmp 0x20;"             /* 2 bytes */
            "popl %esi;"            /* 1 byte  */
            "movl $4, %eax;"        /* 5 bytes */
            "movl $1, %ebx;"        /* 5 bytes */
            "movl $0x7, %edx;"      /* 5 bytes */
            "movl %esi, %ecx;"      /* 2 bytes */
            "int $0x80;"            /* 2 bytes */
            "movl $1, %eax;"        /* 5 bytes */
            "movl $0, %ebx;"        /* 5 bytes */
            "int $0x80;"            /* 2 bytes */
            "call -0x37;"           /* 5 bytes */
            ".string \"Run Han\""   /* 7 bytes */
            );
    return 0;
}
This code is the write system call that I wrote in the previous article.
The size of this shellcode is 46 bytes, counting the string as its 7 characters.
Rewriting the shellcode:
__asm__("jmp 0x20;"               /* 2 bytes */
        "popl %ecx;"              /* 1 byte  */
        "xorl %ebx, %ebx;"        /* 2 bytes */
        "mul %ebx;"               /* 2 bytes */
        "leal 0x4(%eax), %eax;"   /* 3 bytes */
        "leal 0x7(%edx), %edx;"   /* 3 bytes */
        "int $0x80;"              /* 2 bytes */
        "xorl %ebx, %ebx;"        /* 2 bytes */
        "leal 0x1(%ebx), %eax;"   /* 3 bytes */
        "int $0x80;"              /* 2 bytes */
        "call -0x37;"             /* 5 bytes */
        ".string \"Run Han\""     /* 7 bytes */
        );
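The code size is reduced to 34 bytes.
P.S. The mul instruction stores its result in %edx:%eax, so multiplying by the zeroed %ebx sets both %eax and %edx to zero. Note that %ebx stays 0 in this version, so the write goes to file descriptor 0 rather than 1 as in the original; adding incl %ebx (1 byte) after the mul restores descriptor 1, for 35 bytes in total.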
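The effect of the xor/mul/lea sequence on the registers can be modelled in plain C. This is only a sketch of the arithmetic, not real shellcode; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* The four registers the write call cares about. */
struct regs {
    uint32_t eax, ebx, ecx, edx;
};

/* mul %ebx: unsigned multiply, %edx:%eax = %eax * %ebx */
static void mul_ebx(struct regs *r) {
    uint64_t product = (uint64_t)r->eax * r->ebx;
    r->eax = (uint32_t)product;         /* low 32 bits  */
    r->edx = (uint32_t)(product >> 32); /* high 32 bits */
}

/* Register setup of the shortened write call, from arbitrary start values. */
struct regs setup_write_regs(struct regs r) {
    r.ebx = 0;                          /* xorl %ebx, %ebx       */
    mul_ebx(&r);                        /* mul %ebx              */
    assert(r.eax == 0 && r.edx == 0);   /* anything times 0 is 0 */
    r.eax += 4;                         /* leal 0x4(%eax), %eax  */
    r.edx += 7;                         /* leal 0x7(%edx), %edx  */
    return r;
}
```

Whatever garbage %eax and %edx held beforehand, three short instructions leave them at 4 and 7, which is why the two 5-byte movl instructions are no longer needed.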
2.
The push trick and the relative jmp/call trick can both obtain the address of the data, but under the right conditions the push trick saves some bytes. Consider the following example from shell code 3:
int main(){
    /* relative jmp/call trick */
    __asm__("jmp 2f;\n\
             1:;\n\
             pop %esi;\n\
             movl %esi, %ebx;\n\
             movl $0, %ecx;\n\
             movl $162, %eax;\n\
             int $0x80;\n\
             movl $1,%eax;\n\
             movl $0,%ebx;\n\
             int $0x80;\n\
             2:;\n\
             call 1b;\n\
             .long 0x00000002,0x0;\n\
             ");
    return 0;
}
The above code is 42 bytes.
int main(){
    /* push trick */
    __asm__("push $0;\n\
             push $2;\n\
             movl %esp, %ebx;\n\
             movl $0, %ecx;\n\
             movl $162, %eax;\n\
             int $0x80;\n\
             movl $1,%eax;\n\
             movl $0,%ebx;\n\
             int $0x80;\n\
             ");
    return 0;
}
The code size is 30 bytes.
By using the push trick, we save 12 bytes of shellcode. Nice!!!
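Why the two pushes can replace the .long data is easier to see with a toy model of the stack. The sketch below is plain C, not shellcode, and all names in it are hypothetical; it only assumes that the stack grows downward and that each push stores one 32-bit word.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 8

/* Toy model of a descending 32-bit stack; esp points at the top element. */
struct stack {
    uint32_t mem[STACK_WORDS];
    uint32_t *esp;
};

void stack_init(struct stack *s) {
    s->esp = s->mem + STACK_WORDS;  /* empty stack: esp just past the end */
}

/* push: move esp down one word, then store the value there. */
void push32(struct stack *s, uint32_t value) {
    assert(s->esp > s->mem);        /* room left in the toy stack */
    *--s->esp = value;
}

/* push $0; push $2; movl %esp, %ebx  -> returns what ends up in %ebx */
const uint32_t *build_data_block(struct stack *s) {
    push32(s, 0);
    push32(s, 2);
    return s->esp;
}
```

After the two pushes, the returned pointer sees the words 2 and 0 in ascending memory order, the same layout as .long 0x00000002,0x0. The values go on in reverse order because the stack grows toward lower addresses.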
3.
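The 0x66 prefix, or the 16-bit/8-bit mov instructions.
If the constant fits in 16 bits (at most 0xffff), use the 0x66 operand-size prefix, which is what the movw instruction emits in the GNU assembler. This saves one more byte compared with movl.
If the constant fits in 8 bits (at most 0xff), use the movb instruction, since it costs only two bytes. Both forms leave the upper bits of the register unchanged, so they load the intended value only when those bits are already zero.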
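The size difference and the partial-register behaviour can both be sketched in C. The encodings below assume 32-bit mode, and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Three ways to load 162 into (part of) %eax */
static const unsigned char movl_162[] = { 0xb8, 0xa2, 0x00, 0x00, 0x00 }; /* movl $162, %eax */
static const unsigned char movw_162[] = { 0x66, 0xb8, 0xa2, 0x00 };       /* movw $162, %ax  */
static const unsigned char movb_162[] = { 0xb0, 0xa2 };                   /* movb $162, %al  */

/* movb $imm8, %al: only the low 8 bits of %eax change. */
uint32_t mov_al(uint32_t eax, uint8_t imm) {
    return (eax & 0xffffff00u) | imm;
}

/* movw $imm16, %ax: only the low 16 bits of %eax change. */
uint32_t mov_ax(uint32_t eax, uint16_t imm) {
    return (eax & 0xffff0000u) | imm;
}

/* Checks the instruction sizes: 5, 4 and 2 bytes. */
void check_mov_sizes(void) {
    assert(sizeof movl_162 == 5);
    assert(sizeof movw_162 == 4);
    assert(sizeof movb_162 == 2);
}
```

For example, mov_al(0, 162) gives 162, but mov_al(0x12345600, 162) gives 0x123456a2, which is not a valid system call number. The short forms are only safe when the rest of the register is known to be zero.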
Rewrite the previous example:
int main(){
    /* push trick */
    __asm__("push $0;\n\
             push $2;\n\
             movl %esp, %ebx;\n\
             xorl %ecx, %ecx;\n\
             mov $162, %al;\n\
             int $0x80;\n\
             xorl %ebx, %ebx;\n\
             leal 0x1(%ebx), %eax;\n\
             int $0x80;\n\
             ");
    return 0;
}
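The size of the above code is reduced to 19 bytes.
That saves another 11 bytes. Note that mov $162, %al only works if the upper 24 bits of %eax are already zero.
Now the wait system call (number 162, nanosleep on i386) and the exit system call cost only 19 bytes instead of 42 bytes.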
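As a cross-check of the 30 and 19 byte totals, the two push-trick versions can be written out as byte arrays. This is a sketch assuming the usual 32-bit encodings (including the 2-byte push imm8 form); the array and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Push trick with 32-bit immediate moves */
static const unsigned char push_version[] = {
    0x6a, 0x00,                   /* push $0          */
    0x6a, 0x02,                   /* push $2          */
    0x89, 0xe3,                   /* movl %esp, %ebx  */
    0xb9, 0x00, 0x00, 0x00, 0x00, /* movl $0, %ecx    */
    0xb8, 0xa2, 0x00, 0x00, 0x00, /* movl $162, %eax  */
    0xcd, 0x80,                   /* int $0x80        */
    0xb8, 0x01, 0x00, 0x00, 0x00, /* movl $1, %eax    */
    0xbb, 0x00, 0x00, 0x00, 0x00, /* movl $0, %ebx    */
    0xcd, 0x80                    /* int $0x80        */
};

/* Push trick combined with xor, lea and the 8-bit mov */
static const unsigned char short_version[] = {
    0x6a, 0x00,                   /* push $0               */
    0x6a, 0x02,                   /* push $2               */
    0x89, 0xe3,                   /* movl %esp, %ebx       */
    0x31, 0xc9,                   /* xorl %ecx, %ecx       */
    0xb0, 0xa2,                   /* mov $162, %al         */
    0xcd, 0x80,                   /* int $0x80             */
    0x31, 0xdb,                   /* xorl %ebx, %ebx       */
    0x8d, 0x43, 0x01,             /* leal 0x1(%ebx), %eax  */
    0xcd, 0x80                    /* int $0x80             */
};

/* Bytes saved by the short version over the plain push version. */
size_t push_bytes_saved(void) {
    return sizeof push_version - sizeof short_version;
}

/* Checks the totals quoted in the text. */
void check_push_sizes(void) {
    assert(sizeof push_version == 30);
    assert(sizeof short_version == 19);
}
```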
These tricks are very useful in some conditions. Enjoy. :D