SHA-3 is there!

NIST announced the end of the Secure Hash Algorithm competition the day before yesterday, naming Keccak as the winner and making it the SHA-3 algorithm. The complete announcement from NIST is here.

One thing of note is that since the algorithm was developed by people from STMicroelectronics and NXP Semiconductors, it is heavily optimized for use in smart cards. According to the announcement, it is both compact and fast when implemented in hardware. That makes it, once again, very well suited to some applications and a poor fit for others (like password hashing, where speed works against you).

Random or not? That is the question!

Oftentimes, the first cryptography-related question you come across while designing a system is the question of random numbers. We need random numbers in many places when developing web applications: identifiers, tokens, passwords and so on all need to be somewhat unpredictable. The question is, how unpredictable should they be? In other words, what quality of randomness do those purposes require?

It is very tempting (and is usually done this way) to make a single random number generator that is “sufficiently good” for everything and use it all over the place. That is a flawed approach. Yes, given a true random number generator, this is feasible. But true random number generators are scarce and are usually unavailable in a web application. So you end up with a pseudo-random number generator of unknown quality that you re-use for multiple purposes. This last part matters because by observing random numbers used in one part of the system an attacker can deduce the random numbers used in another, totally hidden part of the system. What should one do?

The important question is: what is the value generated through a (pseudo-)random number generator going to be used for? Will the security of the whole system hang on this one value being totally unpredictable?

Generally, the system should not fall apart when a random value is suddenly not so random. The principles of layered security should prevent attacks even when attackers succeed in guessing your random ids, tokens and so on.

Random number generator by xkcd

Case in point: at Citibank they relied on the account numbers being “hidden”. The numbers were not random to begin with and there was no other security in place, so anyone who knew the account number structure could go and retrieve information about other accounts in the bank. That kind of thing should not happen. When you use random values for the ids of the information in your database, that’s good. But the security must not rely on just the ids. There must be some controls that prevent someone from seeing what they should not, even when the ids are known.
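
To make the point concrete, here is a minimal Python sketch of a lookup that checks ownership before returning a record, so that knowing or guessing an id is not enough. The `Account` type, the `ACCOUNTS` store and the `get_account` function are hypothetical names made up for illustration; they do not describe any real banking system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    account_id: str
    owner: str
    balance: int


# Hypothetical in-memory store standing in for a database table.
ACCOUNTS = {
    "a-1001": Account("a-1001", "alice", 250),
    "a-1002": Account("a-1002", "bob", 990),
}


class AccessDenied(Exception):
    """Raised when the caller may not see the requested record."""


def get_account(session_user: str, account_id: str) -> Account:
    """Return the account only if it belongs to the logged-in user."""
    account = ACCOUNTS.get(account_id)
    # Same error for "no such id" and "not yours", so the response
    # does not confirm which ids exist.
    if account is None or account.owner != session_user:
        raise AccessDenied(account_id)
    return account
```

With a check like this, a guessed id such as `"a-1002"` gets Alice nothing but an `AccessDenied`, however predictable the numbering is.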

Another case, on the opposite side, is the credit card “chip and PIN” system created by EMV. They decided for some reason that it would be okay to have a random number that is not, well, so random. The manufacturers, of course, took advantage of that and did not provide real random number generators. The criminal organizations, of course, took notice and promptly implemented card copying that targets the weak ATM terminals. All of this is because of weak randomness. It should not have happened, because the designers were supposed to know that their security depends on that random number and should have taken care. This is a case where you really need a proper random number generator, and I am surprised MasterCard CAST actually let it be. But enough of the scary stories.

Most of the time when you think “I need random numbers”, you don’t. What you need are reasonably unpredictable numbers of fairly low cryptographic quality. This is so because your system design must not depend on the quality of those random numbers. If it does, it is a bad design decision. Typically, whenever you encounter such a dependency you must consider a redesign. There are very few exceptions, like key generation and similar things, which you should not do yourself in the application code anyway. For the rest, the system must be fully operational and disallow access properly even if your random numbers are suddenly all equal to five. The system may fail to operate, but it must not become wide open because of a failure of the random number generator.

There are situations where you could say that the security of the whole system relies on the random properties of a value. For example, the password reset systems of most websites send a random token by e-mail to the account holder. When this token is entered into the password reset page field, the system allows changing the password. If this token can be guessed (or forced to a particular value), the system’s security is easily compromised – just request the password reset and go enter the value, you do not need to see the e-mail. In this case, yes, that value must be truly random or at the very least impossible to predict within a reasonable period of time (your tokens do limit the “reasonable time” with a validity period assigned, right? Right??). Interestingly, in this case the delivery channel does not matter at all. Even if you had the so-called “two-factor” authentication system where you get this code sent by a short message to your mobile, it won’t matter. If an attacker can guess the token – the rest of the system is of no consequence in this design.
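
A reset token with a validity period could look roughly like the following Python sketch. It assumes the standard `secrets` module as the cryptographically strong source; the 15-minute lifetime, the in-memory `_pending_resets` store and the function names are assumptions chosen for illustration only.

```python
import hmac
import secrets
import time
from typing import Dict, Optional, Tuple

TOKEN_LIFETIME_SECONDS = 15 * 60  # assumed validity period

# Hypothetical store: user name -> (token, expiry timestamp).
_pending_resets: Dict[str, Tuple[str, float]] = {}


def issue_reset_token(user: str, now: Optional[float] = None) -> str:
    """Create a fresh, unguessable token and remember when it expires."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)  # 32 bytes from the OS CSPRNG
    _pending_resets[user] = (token, now + TOKEN_LIFETIME_SECONDS)
    return token


def redeem_reset_token(user: str, token: str, now: Optional[float] = None) -> bool:
    """Accept the token once, and only before it expires."""
    now = time.time() if now is None else now
    # The pending entry is removed on every attempt, right or wrong,
    # so a token cannot be replayed or guessed at repeatedly.
    entry = _pending_resets.pop(user, None)
    if entry is None:
        return False
    expected, expires_at = entry
    if now > expires_at:
        return False
    # Constant-time comparison avoids leaking how much of the token matched.
    return hmac.compare_digest(expected.encode("utf-8"), token.encode("utf-8"))
```

Dropping the pending entry after a wrong guess is a deliberate choice in this sketch: it costs the legitimate user a new reset request but gives an attacker exactly one try per token.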

So, a typical system should have at least two random number generators. One used for internal purposes and one used to generate tokens sent to users. They should be good, both of them, but the one for tokens should be cryptographically strong while the one for internal use may be just fairly unpredictable because your security would not rely solely on those numbers. The generators should be written by people with some knowledge of cryptography, publicly reviewed and tested.

And here is some random reading for more on the subject:

  1. Randomness attacks against PHP applications.
  2. Chip and pin ‘weakness’ exposed by Cambridge researchers.
  3. Citigroup hack exploited easy-to-detect web flaw.
  4. How to test a random number generator.
  5. NIST on random number generators.

NFC, ain’t that funny

N-Mark Logo for certified devices

When we invented NFC (Near Field Communication) we never intended it for some of the uses that it was put to afterwards. And when we started discussing those unconventional (for us) uses, we immediately pointed out all the security problems and proposed methods to protect the NFC devices from various attacks. That was… probably 2004. Do you think anyone listened? Nope. After that, we put a few years’ worth of work into some (ok, granted, fairly fuzzy for political reasons) guidance, standards and white papers in Ecma International and NFC Forum. Did anyone take notice? I don’t think so.

At the recent Black Hat security conference, security researcher Charlie Miller detailed and demonstrated attacks on NFC devices and showed how he can pwn a mobile phone through a combination of NFC and browser attacks.

The reason? NFC is a new attack surface and it has to be protected, both by itself and in combination with all the other things that are operating in the same device. However, the usual thing has happened. People paid attention only to the hype about the usefulness and ease of use of the technology but never paid attention to its security. Now the security will have to be added, again, as an afterthought.

Duh, the humanity.

Password storage in summary

We discussed password storage in the article “Speaking of passwords…” and concluded that password storage requires a cryptographically strong, contemporary (as in “very, very slow”) one-way hash function with a randomly generated salt for every password.

This is pretty much all you need to take care of. Salting is fairly straightforward but it is essential to make sure it always works. Achieving a good balance between the slowness of the hashing algorithm for the attacker and acceptable performance for the user is just a bit more involved, but things like key stretching techniques have been around for literally ages now too.
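
The balance between attacker cost and user-visible delay can be found by measurement. Below is a minimal sketch, assuming PBKDF2-HMAC-SHA256 from Python’s `hashlib` as the key stretching function and a target of about 100 ms per hash. Both the algorithm and the target are assumptions made for the example, and `calibrate_iterations` is a hypothetical helper name.

```python
import hashlib
import os
import time


def calibrate_iterations(target_seconds: float = 0.1, start: int = 10_000) -> int:
    """Double the PBKDF2 iteration count until one hash takes target_seconds."""
    salt = os.urandom(16)
    iterations = start
    while True:
        began = time.perf_counter()
        hashlib.pbkdf2_hmac("sha256", b"benchmark password", salt, iterations)
        elapsed = time.perf_counter() - began
        if elapsed >= target_seconds:
            return iterations
        iterations *= 2


if __name__ == "__main__":
    print("iterations for ~100 ms per hash:", calibrate_iterations())
```

The result depends on the machine it runs on, so the measurement belongs on the production hardware and should be repeated as that hardware gets faster.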

It is rumored that Thomas Ptacek once said:

What have we learned? We learned that if it’s 1975, you can set the ARPANet on fire with rainbow table attacks. If it’s 2007, and rainbow table attacks set you on fire, we learned that you should go back to 1975 and wait 30 years before trying to design a password hashing scheme.

We learned that if we had learned anything from this blog post, we should be consulting our friends and neighbors in the security field for help with our password schemes, because nobody is going to find the game-over bugs in our MD5 schemes until after my Mom’s credit card number is being traded out of a curbside stall in Tallinn, Estonia.

We learned that in a password hashing scheme, speed is the enemy. We learned that MD5 was designed for speed. So, we learned that MD5 is the enemy. Also Jeff Atwood and Richard Skrenta.

Finally, we learned that if we want to store passwords securely we have three reasonable options: PHK’s MD5 scheme, Provos-Maziere’s Bcrypt scheme, and SRP. We learned that the correct choice is Bcrypt.

And I think that is a great summary.

News: Website and app security tips

TechRepublic has an interesting article “Website and app security tips for software developers” that talks in a very short space about a whole bunch of things, from the “shelf life of software developers” to the advice on security for the website developer.

It provides in particular an interesting insight into why a person thoroughly familiar with security made security mistakes again and again.

I know why I made those mistakes — it was either the hubris of “I can roll my own better than off-the-shelf,” or the idea that slapping something together quickly would be fine “for now” and I would pay the technical debt off later. I was wrong on both counts, every single time.

How often do we get trapped like that?

Speaking of passwords…

Wouldn’t it be quite logical to talk about passwords after user names? Most certainly. Trouble is, the subject is very, very large. Creating, storing, transmitting, verifying, updating, recovering, wiping… Did I get all of it? It is going to take a while to get through all of that, do you reckon? Let’s split the subject and talk about password storage now, as the subject that comes up most often in security discussions and in the news.

Speaking of which, some recent break-ins if you were not keeping track:

LinkedIn – 6.5 million passwords stolen, Yahoo – 450 thousand passwords stolen, Android Forums – 1 million, Last.fm – 8 million, Nvidia – 400 thousand, eHarmony – 1.5 million, Billabong – 21 thousand, TechRadar … the list is going on and on.

Out of the 8 million passwords in the LinkedIn and Last.fm breaches, “It took a user on the forum less than 2½ hours to crack 1.2 million of the hashed passwords, Ars Technica reported.”

Oops. Is that supposed to be so easy? Actually… no.

There are a few easy rules for storing passwords. First of all, never store passwords in the clear, unencrypted, like Billabong did. Remember that any and every system was or eventually will be broken into. You have to assume that your password database will fall into the wrong hands sooner or later. Your password database has to be prepared for that eventuality to look good in the eyes of the press.

So, when your password database is in the hands of the attackers, it has to defend itself. A database full of unencrypted passwords does not provide any defense of course. What about an encrypted database?

Well, since you have to be able to use the database, you have to decrypt it when you need it. So the system will have the key to the database somewhere. Since the attacker got their hands on the database, there is no reason why the attacker should not get the encryption keys at the same time. So this is definitely not improving the situation.

Secure hashes (as in the name of this blog) are the ultimate answer. The important thing about hashes is that they do not require the use of a key and they can be easily computed only one way: from the clear piece of information into the hash. They cannot be reversed: one cannot easily compute the original piece of information from the hash. That’s why they are called one-way hashes.

Hashes were invented a long time ago and they have been improving over the years. With the increases in computing power, the old hashes are not secure anymore. That’s what the news reports mean when they talk about recovering the plain text passwords: the attackers computed passwords that result in the hashes stored in the database.

Given a database of password hashes, finding the passwords boils down to taking a candidate password, computing its hash according to the algorithm used, and comparing it to the hashes stored in the database. When a match is found – we have a good password. This is where the cost of computing the hashes comes in. Older hashes are much faster, newer hashes are much slower. With the advent of rental cloud computing services this is becoming a small distinction though. All SHA-1 passwords of up to 6 characters in length could be brute forced in 49 minutes with the help of Amazon EC2 for a cost of $2 two years ago. And it’s getting cheaper and faster. So here is where the speed matters, but it has the opposite effect. The hash, to be secure, must be a very, very slow one. Almost too slow to be useful at all would be a good start.
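
The guess-hash-compare loop is short enough to show in full. This is a toy Python sketch against unsalted SHA-1, with a made-up set of “stolen” hashes and a search space limited to lowercase letters up to three characters so that it finishes instantly; the names `sha1_hex` and `brute_force` are just for this example.

```python
import hashlib
import itertools
import string
from typing import Dict, Set


def sha1_hex(password: str) -> str:
    """Unsalted SHA-1 of a password, as a hex string."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def brute_force(stolen_hashes: Set[str], alphabet: str, max_length: int) -> Dict[str, str]:
    """Try every candidate up to max_length and report the hashes that match."""
    cracked = {}
    for length in range(1, max_length + 1):
        for letters in itertools.product(alphabet, repeat=length):
            candidate = "".join(letters)
            digest = sha1_hex(candidate)
            if digest in stolen_hashes:
                cracked[digest] = candidate
    return cracked


# A toy "stolen database" of unsalted SHA-1 hashes.
stolen = {sha1_hex("abc"), sha1_hex("zz"), sha1_hex("longer-password")}
found = brute_force(stolen, string.ascii_lowercase, max_length=3)
print(sorted(found.values()))
```

The two short passwords fall out immediately, while the long one stays safe only because it lies outside the searched space. The faster the hash, the larger the space an attacker can afford to search.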

Even if computer systems weren’t getting blisteringly fast compared to the blisteringly fast of five years ago all the time, a workaround was figured out a long time ago. If you are prepared to invest in some large storage, you can compute, slowly but surely, an enormous number of hashes and keep them somewhere. When the time comes, you just have to go and compare the hashes you computed in advance to the given hashes in the password database. This is called using rainbow tables. And it’s bloody effective.

Ok, ok, it is not all that gloomy. This fight is an old one and we have defenses. A very effective measure against rainbow tables is to use a cryptographic salt. A salt is an additional piece of data supplied to the hash function together with the password. Since the attacker did not know the salt in advance, precomputed rainbow tables suddenly become useless. Great. Unfortunately, many sites use a fixed salt that is generated once and set in stone. This effectively makes rainbow tables useful again. One just has to compute them once more, with that salt, for the whole database. So the salt, to be useful, must be generated anew for every password and stored together with the password.
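
The effect of a per-password salt on precomputed hashes can be seen in a few lines of Python. The sketch uses a plain lookup table rather than a real rainbow table (which stores hash chains to save space), and it uses fast SHA-256 only to isolate the effect of the salt; the tiny password list and the function names are assumptions made for the example.

```python
import hashlib
import os

# Assumed tiny dictionary of candidate passwords.
COMMON_PASSWORDS = ["password", "123456", "qwerty", "letmein"]


def unsalted(password: str) -> str:
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def salted(password: str, salt: bytes) -> str:
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


# Computed once, in advance, and reusable against any unsalted database.
PRECOMPUTED = {unsalted(p): p for p in COMMON_PASSWORDS}

# Without a salt the stolen hash is found by a single lookup.
stolen_unsalted = unsalted("letmein")
lookup_unsalted = PRECOMPUTED.get(stolen_unsalted)

# With a fresh random salt per password the same table no longer matches.
salt = os.urandom(16)
stolen_salted = salted("letmein", salt)
lookup_salted = PRECOMPUTED.get(stolen_salted)

print(lookup_unsalted, lookup_salted)
```

The salt is stored next to the hash, so the attacker can still attack each password individually, but the work can no longer be done once in advance and shared across all users.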

So, finally, the answer is simple: a cryptographically strong, contemporary (as in “very, very slow”) one-way hash function with a randomly generated salt for every password. And anything deviating from that is just plain tomfoolery.
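
Put together, the recipe might look like the sketch below. It assumes scrypt, chosen here only because it ships with Python’s standard `hashlib`; the article does not name a specific function. The cost parameters are assumed starting values that should be tuned, and `hash_password` and `verify_password` are illustrative names.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Assumed cost parameters; tune them for your own hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest); store both next to the user record."""
    salt = os.urandom(16)  # new random salt for every password
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(digest, expected)
```

Two users with the same password end up with different salts and therefore different digests, which is exactly what makes precomputation pointless.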

Biometrics – any good?

I think I already talked about this subject previously but not here. Anyhow, the subject bears repeating.

Many go “yippee!” at the mention of biometrics and start to think their user authentication problem is solved. Do not pay attention, they will end up in the newspaper headlines fairly soon, either for massive security failures or being bankrupt, or both.

The problem is not a huge false negative rate, and it is not the huge false positive rate either. The problem is the immutability of the characteristics. Biometric characteristics change slowly over time as you age and can be influenced by the environment, but they cannot be changed at will (well, at least not easily). The problem is that whatever this thing is, it is a part of your body and is most of the time something you do not want to change even if you had the possibility to do so.

And that’s also why it is dangerous – you may end up losing a limb or two.

The first question that should be asked, then, is “what’s it good for, anyway?” A characteristic that is fairly stable and cannot easily be changed at will is a fairly reasonable user name, i.e. a means of user identification. Even then, it is a questionable approach because it is a good idea to let users change names.

Biometrics is definitely not any good for authentication, that is, proving that you are who you say you are. If you compare it with the familiar authentication with passwords, you will notice that the means of authentication are supposed to be:

  • secret or concealable
  • non-degradable
  • easily changeable in a controlled manner
  • transferrable in a controlled manner

And biometrics is none of that.

But why then? Oh, I do not expect to find a definitive answer to that but one thing could be that it looks cool in movies. The other is that biometrics were historically good for tracking people that are not actively resisting such tracking. But then we talk about politics and power and that is not the subject of this discussion at all. One thing is certain: whatever the reason to use biometrics is, it has absolutely nothing to do with security.

So when you see claims like “Biometric is the most secure and convenient authentication tool”, now you know that’s just utter nonsense and you should stay away from people (and companies) making such claims. Unless there isn’t enough nonsense in your life, of course.

When it’s your responsibility to implement a security system, try to stay away from biometrics, you’ll live a happier life. Leave it to Hollywood.

What’s in a name?

Here is something quite interesting. Nobody ever considers the user names. They are just sort of “given”. Well, are they? Most of the time, they are not. They are assumed and designed into the system one way or another. And they can have an impact on security.

An old saying goes that a secure Windows computer is the one that is switched off, locked in a safe and dumped into the ocean. That goes for any other system and still leaves a tiny chance of a deep water diver break-in. We cannot make a computer system impenetrable. What we can do is make it harder for an attacker. One of the ways to make it just that bit harder is to choose good user names.

What’s a good user name? What is a user name? What is a user id? What is user identification? What on Earth is the difference? Hold it there, it’ll be all right in a minute.

I promise I will try to be as unscientific about it as possible.

Let’s start by making a distinction between what you think of as your user name and what the system thinks of as your user identification. In a normal traditional system design those are different things. And in security, conservatism is a good trait.

Historically, the users were identified with numbers, account numbers. Still, in most systems internally the users are identified with account references, usually numbers. We will call these “user id”. Originally, users used the account numbers, the user ids, to log in to the system. Nowadays they usually don’t. Typically users use some kind of aliases to log in, those being user names, e-mail addresses and so on. We will call these “user names”.

Functionally, only one property is important for both the user id and the user name. They must be unique within the boundaries of the system. This is quite normal, no? The system must be able to distinguish between users, so the ids must be unique. The login procedure must map the user name to a single user id, so the user name must also be unique. Simple.

What are the security requirements for the user ids and user names?

The user names have to be unpredictable to help fight brute-force attacks. We know very well that the most popular password is “password”. If an attacker knows the user names on the system, he can try all of them with the password “password” and he may get lucky. Another important issue is the lockout of users, a form of denial-of-service attack. If the system locks users out after three unsuccessful attempts and the user names are known, an attacker can go through all user names and perform those three attempts, locking everyone out of the system. So unpredictable user names make an attacker’s life difficult. Mind you, unpredictable user names, not necessarily random user names.

We must distinguish between two very different cases now.

In one case the user base is naturally diverse and the user names, selected for some semi-obscure reason, are sufficiently varied. For example, you have a website where people register with their email addresses from all over the world. Those are hard to predict. They are not random, but making an exhaustive search for user names is technically hard. So you are lucky and you can leave things as they are.

On the other hand, if your user base is fairly uniform, selecting something else as an identifier would work better. Suppose you run a company and all registered users belong to your company. If you select the email address as an identifier, the attacker’s life suddenly becomes much easier. Get the list of people and there you go, you know all user identifiers. In this case letting users select their own identifiers, for example, would work much better. Do not be tempted into assigning the employee numbers, which are usually sequential, as user names. Letting people come up with aliases or nicknames is a really good idea then.

Let’s come back to the user ids now. For the user ids, unpredictability also matters. Suppose that there are methods in your application that take a user id as a parameter to perform some operations, perhaps viewing the user record. Then knowing the user id makes it easy to view user records and find information about users and the system. And don’t count too much on your access control: it may fail some time, and then the only thing that will stand between your users and the attacker will be the unpredictability of those numbers. Remember, that’s what went wrong at Citibank. And, really, you have nothing to lose – the computer does not particularly care what kind of numbers it is crunching.

So, what’s a good strategy for the user ids? There is only one: the user ids must be random. They do not really need to be generated from cryptographically strong random numbers but they must be sufficiently random as to be unpredictable for an external observer a.k.a. the attacker. Oh, and UUIDs are not unpredictable, just in case you were wondering.
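
Generating such ids takes only a few lines. The Python sketch below assumes 64-bit ids drawn with the standard `secrets` module, which is stronger than the text strictly requires but costs nothing extra. The in-memory registry and the function names are hypothetical and stand in for whatever user table the system really has.

```python
import secrets
from typing import Dict, Set

# Hypothetical registry mapping user names to internal user ids.
_ids_by_name: Dict[str, int] = {}
_used_ids: Set[int] = set()


def new_user_id(bits: int = 64) -> int:
    """Draw random ids until one is unused, so ids stay unique."""
    while True:
        candidate = secrets.randbits(bits)
        if candidate not in _used_ids:
            _used_ids.add(candidate)
            return candidate


def register(user_name: str) -> int:
    """Reject duplicate names and assign a random, non-sequential id."""
    if user_name in _ids_by_name:
        raise ValueError("user name already taken")
    user_id = new_user_id()
    _ids_by_name[user_name] = user_id
    return user_id
```

Both uniqueness requirements are kept: the name check guards the user names and the retry loop guards the ids, while an observer who sees one id learns nothing about the next.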

What’s a good strategy for user names? They must be unpredictable. Either they are already because you have a diverse user base and all you have to do is to force no additional scheme on them. Or your user base is uniform and then you must come up with a scheme to introduce unpredictability, the easiest being to use nicknames.

And that’s it for names, I think. Wasn’t that easy? ;)

For an extra bit of fun: Default Admin Username and Password list. May be useful, you know.
