Sunday, March 28, 2010

How to safely handle user passwords

When you create a website or other service, you take on a responsibility to properly store users' credentials; someone who gets access to a user's account can easily see personal information, potentially even financial information. Even if you think you don't have anything of importance, 73% of users[1] have the same password for everything, including online banking accounts.

When dealing with passwords, you should generally assume that the attacker has a copy of your database. Many websites still have SQL injection vulnerabilities[2], and most websites are vulnerable to XSS/CSRF attacks[3]. For this reason it is a bad idea to store passwords as clear text; instead one should use a hash[4].

Even with properly stored passwords, users still insist on choosing low-security passwords[5]. As a result, malicious hackers have compiled lists of commonly used passwords. Many techniques have been developed to mitigate these brute force attacks. For example, some sites require you to enter a CAPTCHA, but most of these could easily be defeated[6]. Another simple technique is to limit login failures (once an incorrect password has been entered, don't allow another attempt for X period of time). I prefer an exponential delay, where each failed attempt causes the delay to become longer than the previous one. Other possibilities could take into account more sophisticated criteria such as the geographical region and time of day.
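As a rough illustration of the exponential delay, here is a minimal Python sketch. The function name, the one-second base, and the one-hour cap are all assumptions made for the example; a real site would also need to store the failure count per account or per address.

```python
def lockout_delay(failed_attempts, base_seconds=1.0, max_seconds=3600.0):
    """Return how many seconds to wait before the next login attempt.

    The delay doubles with every consecutive failure and is capped at
    max_seconds so a legitimate user is never locked out forever.
    All parameter values here are illustrative assumptions.
    """
    if failed_attempts <= 0:
        return 0.0
    # Cap the exponent so very large counts cannot overflow a float.
    exponent = min(failed_attempts - 1, 30)
    return min(base_seconds * 2 ** exponent, max_seconds)


if __name__ == "__main__":
    for failures in range(0, 8):
        print(failures, lockout_delay(failures))
```

With these assumed defaults the first failure costs one second and each later failure doubles the wait until the cap is reached.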

Although the above solutions are important to know, they won't help if the attacker has a copy of the user database. In the past, hashing the password was sufficient to prevent an attacker from accessing user accounts. Nowadays computers are fast enough that, even though the passwords are hashed, compromise is still possible. This is done by taking the list of commonly used passwords and hashing them to see if any match the database. In order to limit the potential for this attack, a new defense called "salting" was created. This entails hashing a random value (called a "salt") combined with the password. When the user submits a password, the system hashes it combined with the stored salt and compares the result to the hash already stored in the database. The security benefit is that the attacker needs to calculate the hashes for common passwords separately for each user.
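The salting scheme described above can be sketched in Python roughly as follows. The function names, the 16-byte salt, and the choice of a single round of SHA-256 are assumptions made only to show the mechanics of salting; this is not a recommended way to store passwords on its own.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None):
    """Return (salt, hex digest) for a password.

    A fresh random salt is generated when none is supplied. The salt is
    not secret; it is stored next to the digest in the database.
    """
    if salt is None:
        salt = os.urandom(16)  # assumed salt length for this sketch
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return salt, digest


def verify_password(password, salt, stored_digest):
    """Re-hash the submitted password with the stored salt and compare."""
    _, candidate = hash_password(password, salt)
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(candidate, stored_digest)


if __name__ == "__main__":
    salt_a, digest_a = hash_password("letmein")
    salt_b, digest_b = hash_password("letmein")
    # Same password, different salts, different digests.
    print(digest_a != digest_b)
    print(verify_password("letmein", salt_a, digest_a))
```

Because two users with the same password get different digests, a list of pre-hashed common passwords has to be recomputed for every salt.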

As technology improved, even this became insecure. Nowadays attackers can just hash every conceivable password. Furthermore, attackers can work together and use "rainbow tables", which contain pre-computed hashes of millions of passwords and salts. These tables are often generated using distributed computing, so each attacker does not have to develop one on their own. This reduces the amount of security that salts can offer.

Now that salting is not good enough, we need to explore other options. The main factor when exploring these options is time, because it is impossible to create an uncrackable password. What we do instead is increase the time required to discover the passwords so as to discourage the attacker. The issue is that many hash functions were not designed for password security, but rather for speedy verification. Hash functions like SHA-256 (despite currently being unbroken) lend themselves to quickly hashing lists of passwords. Modern computers can MD5-hash every conceivable alphanumeric 7-character password in less than an hour[7]. Despite their widespread use, old hash functions like MD5[8] and SHA-1[9] were recently discovered to be insecure.
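To get a feel for the numbers behind the 7-character claim, here is a small back-of-the-envelope calculation in Python. It assumes "alphanumeric" means upper-case letters, lower-case letters, and digits (62 symbols); the function names are made up for this sketch.

```python
def keyspace(alphabet_size, length):
    """Number of distinct passwords of exactly this length."""
    return alphabet_size ** length


def required_rate(total_hashes, seconds):
    """Hashes per second needed to try every candidate in the given time."""
    return total_hashes / seconds


if __name__ == "__main__":
    # Assumption: a-z, A-Z and 0-9 give 62 possible characters.
    total = keyspace(26 + 26 + 10, 7)
    print(total)  # 3521614606208
    print(required_rate(total, 3600))
```

Under that assumption there are about 3.5 trillion candidates, so exhausting them in an hour takes a little under a billion hashes per second.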

Today there is bcrypt [pdf], a special hash function created specifically for password security. It is slow by design, and it can keep up with the constantly increasing speeds of computers because it uses a "work factor" that can be raised to make each hash more expensive to compute. Although you might be thinking that this will bog down your server, those that use it don't find this to be an issue. For these reasons I strongly advise the use of bcrypt along with the previously stated techniques such as salting.

2011-6-16: update: At the time I wrote this article I was unaware of scrypt, a slower (and therefore better) function to use[10].
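Python's standard library exposes scrypt as hashlib.scrypt (when Python is built against a recent enough OpenSSL), so a deliberately slow, tunable hash can be sketched as below. The helper names and the parameter values (n=2**14, r=8, p=1, 16-byte salt, 32-byte output) are assumptions chosen for illustration; the cost parameter n plays a role comparable to the "work factor" mentioned above and should be tuned for your own hardware.

```python
import hashlib
import hmac
import os


def scrypt_hash(password, salt=None, n=2 ** 14, r=8, p=1):
    """Return (salt, derived key) using scrypt.

    Raising n makes every guess more expensive in both time and memory.
    The defaults here are illustrative, not a recommendation.
    """
    if salt is None:
        salt = os.urandom(16)
    key = hashlib.scrypt(
        password.encode("utf-8"), salt=salt, n=n, r=r, p=p, dklen=32
    )
    return salt, key


def scrypt_verify(password, salt, stored_key, n=2 ** 14, r=8, p=1):
    """Recompute the key with the stored salt and parameters, then compare."""
    _, key = scrypt_hash(password, salt, n, r, p)
    return hmac.compare_digest(key, stored_key)


if __name__ == "__main__":
    salt, key = scrypt_hash("correct horse")
    print(scrypt_verify("correct horse", salt, key))
    print(scrypt_verify("wrong guess", salt, key))
```

The salt and the cost parameters have to be stored alongside the derived key so that the same computation can be repeated at login.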

[4] A hash is a function that takes some input and produces a (sufficiently) unique output such that the original input cannot be recovered. One simplistic (and highly insecure) hash function would be to count the number of times each letter occurs and store that instead. For example, "One very bad zany apple" would become "31013000000102120100010021". It is not possible to know whether this hash came from "One very bad zany apple", "Noe vrey adb nzya pplea", or "Npdyoazaevebrnpyela". A cryptographic hash function is designed so that multiple inputs with the same hash, like these, are hard to find.
[5] [pdf]
[6] Aska: A viable alternative to CAPTCHA? - Eitan Adler (2008)
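The toy letter-counting hash from footnote [4] can be sketched in Python as follows. The function name is made up, and the sketch assumes only the letters a to z are counted, ignoring case, spaces, and everything else.

```python
import string


def letter_count_hash(text):
    """Toy, insecure hash: one count per letter from a to z, in order.

    Counts above 9 would produce more than one digit per letter, which
    is one more reason this is only a teaching example.
    """
    lowered = text.lower()
    return "".join(str(lowered.count(letter)) for letter in string.ascii_lowercase)


if __name__ == "__main__":
    for phrase in (
        "One very bad zany apple",
        "Noe vrey adb nzya pplea",
        "Npdyoazaevebrnpyela",
    ):
        print(letter_count_hash(phrase))
```

All three phrases print the same value, which is exactly the kind of collision a cryptographic hash function is designed to make hard to find.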

Edit 2010-3-31: fix footnote numbers; subtle grammar errors fixed.
Edit 2011-2-11: grammar errors fixed - thanks JT.

Edit 2011-6-16: change dates to use ISO format

tsocks with svn - failure

tsocks is a program that will "sockify" any application. It does so by watching the application and rewriting any networking requests to go through the SOCKS server you specify in tsocks.conf.
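For reference, a minimal tsocks.conf might look something like the fragment below. The addresses and port are placeholder assumptions; substitute the details of your own SOCKS server and local network.

```
# tsocks.conf (placeholder values, adjust for your network)

# Addresses reached directly, without going through the SOCKS server
local = 192.168.0.0/255.255.255.0

# The SOCKS server to send everything else through
server = 192.168.0.1
server_type = 5
server_port = 1080
```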

You can use tsocks either by running "tsocks command [arguments]" or by running "tsocks on" followed by the commands you want to run. You turn it off by running "tsocks off".

I ran into an annoying error when using tsocks to sockify Subversion. I got the error:
svn: OPTIONS of '...': could not connect to server (...)
It turns out that using "tsocks svn up" failed this way, but when I tried the second possibility (tsocks on; svn up; tsocks off) it worked.
I have no clue why this might be.

Sunday, March 21, 2010

Mercurial Extensions

Mercurial has the ability to load extensions dynamically. All that is required is to modify ~/.hgrc.

In order to add an extension, open ~/.hgrc in your preferred text editor. Then look for a line that looks like [extensions]. If you don't find one, add it to the file. Under this line you can add a large number of extensions. Stock Mercurial comes with a number of useful extensions (although none of them are turned on by default).

The following are the ones I found useful.
color - displays output from some commands in color
fetch - does hg pull; hg update; hg merge in one command
graphlog - command to view revision graphs from a shell
progress (hgext.progress) - shows progress of some commands
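As a sketch, the [extensions] section of ~/.hgrc with the four extensions above enabled could look like this. Leaving the right-hand side empty tells Mercurial to load the bundled copy of the extension; whether each one is still needed or available depends on your Mercurial version.

```
[extensions]
color =
fetch =
graphlog =
progress =
```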

To learn more about extensions, type hg help extensions into the command line.

Monday, March 1, 2010

Prediction: Yahoo won't matter

Within the next 5 years Yahoo!, like AOL, will still be around but won't matter at all.