Random or not? That is the question!

Often, the first cryptography-related question you come across while designing a system is the question of random numbers. We need random numbers in many places when developing web applications: identifiers, tokens, passwords and so on all need to be somewhat unpredictable. The question is, how unpredictable should they be? In other words, what quality of randomness do those purposes require?

It is very tempting (and it is usually done this way) to make a single random number generator that is “sufficiently good” for everything and use it all over the place. That is a flawed approach. Yes, given a true random number generator, this is feasible. But true random number generators are scarce and are usually unavailable in a web application. So you end up with a pseudo-random number generator of unknown quality that you reuse for multiple purposes. This last part matters because, by observing random numbers used in one part of the system, an attacker may be able to deduce the random numbers used in another, totally hidden part of the system. What should one do?
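To make the risk concrete, here is a minimal sketch, assuming a deliberately weak linear congruential generator shared between a public and a hidden use. The class `WeakLcg`, its constants and the seed are illustrative assumptions, not anything from a real system; real generators are usually harder to predict, but the principle is the same.

```python
# Deliberately weak generator, for illustration only.
# The modulus, multiplier and increment are illustrative assumptions.
M = 2**31
A = 1103515245
C = 12345


class WeakLcg:
    """Linear congruential generator whose output is its whole state."""

    def __init__(self, seed: int) -> None:
        self.state = seed % M

    def next(self) -> int:
        self.state = (A * self.state + C) % M
        return self.state


# One generator re-used for two purposes.
shared = WeakLcg(seed=20240101)
public_id = shared.next()      # shown to the user, e.g. in a URL
secret_token = shared.next()   # supposed to stay hidden

# The attacker sees only public_id and knows the algorithm.
predicted = (A * public_id + C) % M
print(predicted == secret_token)
```

Because the visible value is the generator's entire state, one observation is enough to compute the next "secret" value.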

The important question is: what is the value generated through a (pseudo-)random number generator going to be used for? Will the security of the whole system hang on this one value being totally unpredictable?

Generally, the system should not fall apart when a random value is suddenly not so random. The principles of layered security should prevent attacks even when attackers succeed in guessing your random ids, tokens and so on.

Random number generator by xkcd

Case in point: at Citibank they relied on the account numbers being “hidden”. The numbers were not random to begin with and there was no other security in place, so anyone who knew the account number structure could retrieve information about other accounts in the bank. That kind of thing should not happen. When you use random values for the ids of the information in your database, that’s good. But the security must not rely on the ids alone. There must be controls that prevent someone from seeing what they should not, even when the ids are known.
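A minimal sketch of such a control, assuming a hypothetical in-memory store `ACCOUNTS` and a hypothetical `get_account` function: the id only locates the record, and a separate ownership check decides whether the caller may see it.

```python
# Hypothetical in-memory store: account id -> record with an owner.
ACCOUNTS = {
    "1001": {"owner": "alice", "balance": 250},
    "1002": {"owner": "bob", "balance": 980},
}


class AccessDenied(Exception):
    """Raised when the caller may not see the requested account."""


def get_account(authenticated_user: str, account_id: str) -> dict:
    # The id only locates the record; it grants nothing by itself.
    record = ACCOUNTS.get(account_id)
    # Same error for "missing" and "not yours", so ids cannot be probed.
    if record is None or record["owner"] != authenticated_user:
        raise AccessDenied(account_id)
    return record
```

With a check like this, knowing or guessing `"1002"` does not help a user logged in as `alice`: the call raises `AccessDenied` whether the ids are random or sequential.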

A case on the opposite side is the credit card “chip and pin” system, defined by the EMV standard. Its designers decided for some reason that it would be okay to have a random number that is not, well, so random. The manufacturers, of course, took advantage of that and did not provide real random number generators. The criminal organizations, of course, took notice and promptly implemented card copying that targets the weak ATM terminals. All of this is because of weak randomness. It should not have happened: the designers were supposed to know that their security depends on that random number, and they should have taken care. This is a case where you really need a proper random number generator, and I am surprised MasterCard CAST actually let it be. But enough of the scary stories.

Generally, most of the time when you think “I need random numbers”, you don’t. What you need are reasonably unpredictable numbers of fairly low cryptographic quality. This is so because your system design must not depend on the quality of those random numbers. If it does, that is a bad design decision. Typically, whenever you encounter such a dependency you must consider a redesign. There are very few exceptions, like key generation and similar things, which you should not do yourself in the application code anyway. For the rest, the system must disallow access properly even if your random numbers are suddenly all equal to five. The system may fail to operate, but it must not become wide open because of a failure of the random number generator.

There are situations where you could say that the security of the whole system relies on the random properties of a value. For example, the password reset systems of most websites send a random token by e-mail to the account holder. When this token is entered into the password reset page field, the system allows changing the password. If this token can be guessed (or forced to a particular value), the system’s security is easily compromised: just request the password reset and enter the value, you do not need to see the e-mail. In this case, yes, that value must be truly random or at the very least impossible to predict within a reasonable period of time (your tokens do limit the “reasonable time” with a validity period assigned, right? Right??). Interestingly, in this case the delivery channel does not matter at all. Even if you had a so-called “two-factor” authentication system where you get this code sent by a short message to your mobile, it would not matter. If an attacker can guess the token, the rest of the system is of no consequence in this design.
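The token-with-validity-period idea can be sketched as follows. This is only an illustration under assumptions: the names `issue_reset_token`, `redeem_reset_token`, the in-memory `_pending` store and the 15-minute lifetime are all hypothetical, and the token itself comes from Python's standard `secrets` module rather than a home-made generator.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

TOKEN_TTL_SECONDS = 15 * 60  # assumed validity period

# Hypothetical store: user -> (SHA-256 of the token, expiry timestamp).
_pending = {}


def issue_reset_token(user: str, now: Optional[float] = None) -> str:
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)  # 32 bytes from the OS CSPRNG
    digest = hashlib.sha256(token.encode()).hexdigest()
    _pending[user] = (digest, now + TOKEN_TTL_SECONDS)
    return token  # this is what gets sent to the account holder


def redeem_reset_token(user: str, token: str,
                       now: Optional[float] = None) -> bool:
    now = time.time() if now is None else now
    # Single use: the entry is removed on any attempt, right or wrong.
    entry = _pending.pop(user, None)
    if entry is None:
        return False
    digest, expires_at = entry
    if now > expires_at:
        return False
    candidate = hashlib.sha256(token.encode()).hexdigest()
    # Constant-time comparison of the two digests.
    return hmac.compare_digest(candidate, digest)
```

Removing the entry on every attempt gives an attacker one guess per issued token, at the cost of letting a wrong guess cancel a legitimate reset; whether that trade-off is acceptable depends on the system.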

So, a typical system should have at least two random number generators: one used for internal purposes and one used to generate tokens sent to users. They should be good, both of them, but the one for tokens should be cryptographically strong, while the one for internal use may be just fairly unpredictable because your security would not rely solely on those numbers. The generators should be written by people with some knowledge of cryptography, publicly reviewed and tested.
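A minimal sketch of that split, assuming Python's standard library is an acceptable source for both generators. The function names `internal_id` and `user_token` are hypothetical; `random.Random` is a Mersenne Twister and is not cryptographically strong, while `secrets` draws from the operating system's generator.

```python
import random
import secrets

# Internal generator: seeded once from the OS, fast, fairly unpredictable.
# It has its own state, separate from anything that produces tokens.
_internal_rng = random.Random(secrets.randbits(128))


def internal_id() -> int:
    """64-bit id for internal records; security must not depend on it."""
    return _internal_rng.getrandbits(64)


def user_token(nbytes: int = 32) -> str:
    """Hex token for values sent to users, from the OS CSPRNG."""
    return secrets.token_hex(nbytes)
```

Because the two functions share no state, observing any number of internal ids tells an attacker nothing about the tokens.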

And here is some random reading for more on the subject:

  1. Randomness attacks against PHP applications.
  2. Chip and pin ‘weakness’ exposed by Cambridge researchers.
  3. Citigroup hack exploited easy-to-detect web flaw.
  4. How to test a random number generator.
  5. NIST on random number generators.
