Common passwords blacklist

Any system that implements password authentication must check whether the passwords are not too common. Every system faces the brute-force attacks that try one or another list of most common password (and usually succeed, by the way). The system must have a capability to slow down an attacker by any means available: slowing down system response every time an unsuccessful authentication is detected, blocking an account for a short time after a number of unsuccessful authentication attempts or throwing up captchas.

Your password is not long enoughHowever, even the most sophisticated system fails if the user’s password is the most common word: “password”. The attacker simply succeeds then at once because that is likely to be the first word tried. So we need a system for blacklisting passwords that are thought of as most likely to be tried in a dictionary brute-force attack. This may be annoying for users of the system who may prefer to use a simple word as a password but this is the reality – any simple word used as a password is likely to be a security hole and must be banned.

While implementing the user login plugin for CakePHP I came across this simple question. Where do we get the password lists to check the newly entered passwords against? And here is a resource I can recommend: 62K Common Passwords by InfoSec Daily. Depending on your system’s speed you could use a smaller file of 6 MB, a 1.5 GB file that should take care of most common passwords or fuse the files into your own list.

Password storage in summary

We discussed the password storage in the article Speaking of passwords…and concluded that password implementation requires a cryptographically strong, contemporary (as in “very, very slow”) one-way hash function with a randomly generated salt for every password.

This is pretty much all you need to take care of. Salting is fairly straight-forward but it is essential to make sure it always works. Achieving a good balance between the slowness of the hashing algorithm for the attacker and an acceptable user performance is just a bit more involved but the things like key stretching techniques have been around for literally ages now too.

It is rumored that Thomas Ptacek once said:

What have we learned? We learned that if it’s 1975, you can set the ARPANet on fire with rainbow table attacks. If it’s 2007, and rainbow table attacks set you on fire, we learned that you should go back to 1975 and wait 30 years before trying to design a password hashing scheme.

We learned that if we had learned anything from this blog post, we should be consulting our friends and neighbors in the security field for help with our password schemes, because nobody is going to find the game-over bugs in our MD5 schemes until after my Mom’s credit card number is being traded out of a curbside stall in Tallinn, Estonia.

We learned that in a password hashing scheme, speed is the enemy. We learned that MD5 was designed for speed. So, we learned that MD5 is the enemy. Also Jeff Atwood and Richard Skrenta.

Finally, we learned that if we want to store passwords securely we have three reasonable options: PHK’s MD5 scheme, Provos-Maziere’s Bcrypt scheme, and SRP. We learned that the correct choice is Bcrypt.

And I think that is a great summary.

Speaking of passwords…

Wouldn’t it be quite logical to talk about passwords after user names? Most certainly. Trouble is, the subject is very, very large. Creating, storing, transmitting, verifying, updating, recovering, wiping… Did I get all of it? It is going to take a while to get through all of that, do you reckon? Let’s split the subject and talk about password storage now, as the subject that comes most often in the security discussions and in the news.

Speaking of which, some recent break-ins if you were not keeping track:

"Enter Password"LinkedIn  – 6.5 million passwords stolen, Yahoo – 450 thousand passwords stolen, Android Forums – 1 million, Last.fm – 8 million, Nvidia – 400 thousand, eHarmony – 1.5 million, Billabong – 21 thousand, TechRadar … the list is going on and on.

Out of 8 million passwords in LinkedIn and Last.fm breach, “It took a user on the forum less than 2½ hours to crack 1.2 million of the hashed passwords, Ars Technica reported.”

Oops. Is that supposed to be so easy? Actually… no.

There are few easy rules for storing the passwords. First of all, never store passwords in clear, unencrypted, like Billabong did. You remember that any and every system was or will eventually be broken into. You have to assume that your password database will fall into wrong hands sooner or later. Your password database has to be prepared for that eventuality to look good in the eyes of the press.

So, when your password database is in the hands of the attackers, it has to defend itself. A database full of unencrypted passwords does not provide any defense of course. What about an encrypted database?

Well, since you have to be able to use the database, you have to decrypt it when you need it. So the system will have the key to the database somewhere. Since the attacker got hands onto the database, there is no reason why the attacker should not get the encryption keys at the same time. So this is definitely not improving the situation.

Secure hashes (as in the name of this blog) are the ultimate answer. The important thing about the hashes is that they do not require a use of a key and they can be easily computed only one way: from the clear piece of information into the hash. They cannot be reversed, one cannot easily compute the original piece of information from the hash. That’s why they are called one-way hashes.

The hashes were invented a long time ago and they were improving over the years. The old hashes are not secure anymore with the increases in the computing power. That’s what they talked about when they referred to recovering the plain text passwords – they computed passwords that will result in the hash that is in the database.

Finding the passwords then given a database of password hashes boils down to taking a password, computing its hash according to the algorithm used, and comparing it to the hashes stored in the database. When a match is found – we have a good password. This is where the cost of computing the hashes comes in. Older hashes are much faster, newer hashes are much slower. With the advent of rental cloud computing services this is becoming a small distinction though. All SHA-1 passwords of up to 6 characters in length could be brute forced in 49 minutes with the help of Amazon EC2 for a cost of $2 two years ago. And it’s getting cheaper and faster. So here is where the speed matters but it has the opposite effect. The hash, to be secure, must be a very, very slow one. Almost too slow to be useful at all would be a good start.

Even if the computer systems weren’t getting blistering fast compared to the blistering fast of five years ago all the time, a workaround was figured a long time ago. If you are prepared to invest in some large storage, you can compute slowly but surely an enormous amount of hashes and keep them somewhere. When the time comes, you just have to go and compare the hashes you computed in advance to the given hashes in the password database. This is called using rainbow tables. And it’s bloody effective.

Ok, ok, it is not all that gloomy. This fight is an old one and we have defenses. A very effective measure against the rainbow tables is to use a cryptographic salt. A salt is an additional piece of data supplied to the hash function together with the password. Since the attacker did not know the salt in advance, precomputed rainbow tables suddenly become useless. Great. Unfortunately, many sites use a fixed salt that is generated once and set in stone. This effectively makes rainbow tables useful again. One just has to compute them once with that salt again for the whole database. So the salt, to be useful, must be generated new for every password and stored together with the password.

So, finally, the answer is simple: a cryptographically strong, contemporary (as in “very, very slow”) one-way hash function with a randomly generated salt for every password. And anything deviating from that is just plain tomfoolery.