Over-engineering

Causes for security problems are legion. One of the most pertinent problems in software development is “over-engineering”: the creation of an over-complicated design or over-complicated code that is not justified by the complexity of the task at hand.

Often it comes as a result of the designer’s desire to show off, to demonstrate knowledge of all possible methods, patterns and tricks. Most of the time it impresses only people who are not deeply involved with the subject. Usually it annoys people who know software design and the subject matter. Slightly more often than always it results in overly complicated code full of security holes.

XKCD on over-engineering

Of course, over-engineering is nothing new; even in the old times it was a popular thing. The difference is that the old mechanical contraptions were ridiculous and did not really work in real life, while contemporary computer software, although built in about the same way, somehow manages to struggle through its tasks. The electronics and software design industries are the most affected by this illness, which has become an epidemic. Interestingly, this is one aspect where open source software is no different from commercial software – open source designers also try to impress their peers with design and code. That is how things are.

Over-engineering in software is not just big, it is omnipresent and omnipotent. The disease has captured the software industry even more completely than advertising has captured broadcasting. The results are predictably sad and can be seen even with an untrained eye. Morbidly obese code is generated by morbidly obese development tools, leading to an astonishing number of bugs in software that fails to perform its most obvious task and that took so much effort to write that writing the original in assembler would have been faster.

The implications are, of course, clear: the complexity is bad for security and the bugs are bad for security. Mind you, the two interplay in ways that exacerbate the problem. To top it off, the software becomes unmaintainable in a number of ways, including the total impossibility of predicting the impact of even small changes to the overall system.

The correctness of a software implementation is hard to judge on non-trivial amounts of code. Software is one of the most complicated things ever invented by mankind. So when the software gets more complex, we lose oversight and understanding of its inner workings. On large systems, it is nowadays usual for no one to have a complete overview of how the system works. Over-engineered software has much more functionality than required, which complicates the analysis and increases the attack surface. It tends to have functions that allow complex manipulation of parameters and tons of side effects. Compare making a reaping hook and a combine-harvester from the point of view of not harming any gophers.

Bugs proliferate in over-engineered software because of both the complexity and the sheer size of the code, which go hand in hand. We know by now that there are more bugs in higher-complexity software. There is a direct correlation between the size of the code, the complexity of the code, and the number of bugs potentially present there. Some of those bugs are going to be relevant for security. The bad news is that quite often what an attacker cannot achieve with a single bug can be achieved with a combination of bugs. A bug in one place could bring a system to an inconsistent state that allows an attack somewhere else. And the more complicated the software becomes, the more “interesting” combinations of bugs there are. Over-engineered software in particular allows a much wider range of functionality to be accessed through security bugs.

As for maintainability, an over-engineered system becomes harder and harder to maintain because it is not a straightforward implementation of a single concept. Anyone trying to update such software has a hard time understanding the design and all the implications of the necessary changes. I came across a serious problem once, where a developer had to inherit a class (in Java) to add additional security checks to one of the methods. By doing so, he actually ruined the complete access control system. That was unintended, of course, but the system was so complicated that nobody noticed until penetration testing showed that one could now send any administrative command to the server without authenticating and the server would happily comply. Once the problem was discovered, it was obvious in hindsight, of course. However, the system was so complex that it required a non-trivial amount of effort to analyze the impact of changes. Any small change could easily turn into a disaster again.

Complex, unmaintainable code riddled with bugs becomes a security nightmare. The analysis of over-engineered software is not only expensive; sometimes it becomes simply impossible. Who would want to spend the next hundred years analyzing and fixing a piece of software? But that is what I had to estimate for some of the systems that I came across. For such a system, there is no cure.

So, the next time you develop a system, follow the good old KISS principle: Keep It Simple, Stupid. If it worked for Kelly Johnson, it will work for you, too. When you maintain the code and make changes, try to bring the code size down and reduce functionality. Code size and complexity are directly correlated, so a decreasing KLOC count is a good indicator that you are on the right track.

Camera and microphone attack on smartphones

Researchers at the University of Cambridge have published a paper titled “PIN Skimmer: Inferring PINs Through The Camera and Microphone” describing a new approach to recovering PIN codes entered on a mobile on-screen keyboard. We had seen applications use the accelerometer and gyroscope before to infer the buttons pressed. This time, they use the camera to figure out where the fingers are touching after the microphone has signalled the start of a PIN entry. The success rate varies between 30% and 60% depending on configuration and number of samples. And that is a lot.

This attack falls into the category of side-channel attacks and it is rather hard to prevent. The paper explains in detail how the attack works and gives developers recommendations for mitigation. The paper also refers to several other works on side-channel attacks against smartphones. For mobile application developers, it would be wise to read through this and the referenced publications to find out what the state of the art now is.

Google bots subversion

There is a lot of truth in saying that every tool can be used for good and for evil. There is no point in blocking the tools themselves, as the attacker will turn to new tools and subvert the most familiar tools in unexpected ways. Now Google crawler bots have been turned into such a weapon to execute SQL injection attacks against websites chosen by attackers.

The discussion of whether Google should or should not do anything about that is interesting but we are not going to talk about that. Instead, think of this as a prime case of a familiar tool, one that comes back to your website regularly, being subverted into doing something evil. You did not expect that to happen and you cannot just block Google from your website. This is a perfect example of a security attack where your application security is the only way to stop the attacker.

The application must be written in such a way that it does not matter whether it is protected by a firewall – you will not always be able to block the attacks with the firewall. The application must also be written so that it withstands an unanticipated attack, something that you were not able to predict in advance would happen. The application must be prepared to ward off things that are not there yet at the time of writing. Secure design and coding cannot be replaced with firewalls and add-on filtering.

Only such securely designed and implemented applications withstand unexpected attacks.
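For SQL injection in particular, the usual way to make the application itself immune, no matter who delivers the malicious request, is to bind parameters instead of building the query text from input. Here is a minimal sketch in Python with the standard `sqlite3` module; the `users` table and the function names are made up for illustration and are not taken from any real system.

```python
import sqlite3

def make_demo_db():
    # Hypothetical in-memory table, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)",
                     [("alice",), ("bob",)])
    return conn

def find_user(conn, name):
    # The driver binds `name` as data; it is never spliced into the SQL text,
    # so quotes or SQL keywords in the input cannot change the statement.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()
```

With this shape, a request carrying `x' OR '1'='1` as the name is simply a search for a user with that odd name, whether it arrives from an attacker directly or by way of a crawler.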

Cloud security

Let’s talk a little about a subject that is very popular nowadays – the so-called ‘cloud security’. Let’s determine what it is that we are actually talking about and see what may be special about it.

Magnificent cumulonimbus clouds

‘Cloud’ – what is it? Basically, the mainframes have been doing ‘cloud’ all along, for decades now. Cloud is simply a remotely accessible computer with shared resources. Typically, what most people ‘get from the cloud’ is either a file server with a certain amount of redundancy and fault tolerance, a web service with some database resources attached, or a virtual machine (VM) to do with as they please. Yes, it is all that simple. Even the most hyped-up services, like Amazon, boil down to these things.

So when you determine the basics, you are then talking about three distinctly different types of operation: a file server, a web/database application and a VM. The three have different security models and can be attacked in completely different ways. So it does not really make sense to speak about ‘cloud security’ as such. It only makes sense to speak about the security of a particular service type. And all three of them have been studied in depth and have the defenses worked out in detail.

Mind you, there is also another type of ‘cloud security’ – the security of the data center itself, where the physical computers accessible through the physical network are. And the security of data centers is an important subject of interest to the operators of those data centers. At the same time, consumers of the services rarely are concerned with that type of security, assuming (sometimes without a good cause) that the data center took good care of its security.

So, from the point of view of the developer or consumer of services it makes sense to talk about three types of security in three different security models that are fairly well understood and were analyzed many times over the decades. Not using that experience of, first, the mainframe developers, and then, of the open systems developers, is at the very least irresponsible.

Exodus from Java

Finally, the news that I was subconsciously waiting for: the exodus of companies from Java has started. It does not come as a surprise at all. Java never fulfilled the promises it made at the beginning. It delivered none of the promised portability, security or ease of programming. I am only surprised it took so long, although, knowing full well that companies’ managers routinely optimize for their own careers and bonuses, that does not come as a shock either.

For those not in the know, the gist of the problem is that Java always promised to provide some sort of “inherent security”. That is, no matter how badly you write the code, it will still be more secure than code written in C or other advanced high-level algorithmic languages. Java claimed the absence of buffer overflows, null pointer dereferences and similar problems, which all turned out not to be true after all. And that had a very important consequence.

The consequence is that anyone writing in Java or learning it is subconsciously aware of that promise. So people tend to relax and allow themselves to be sloppy. So the code written in C ends up being tighter, more organized, and more secure than the code written in Java. And the developers in C tend to be on average better educated in the intricacies of software development and more aware of potential pitfalls. And that makes a huge difference.

So, the punch line is, if you want something done well, forget Java.

Common passwords blacklist

Any system that implements password authentication must check that passwords are not too common. Every system faces brute-force attacks that try one or another list of the most common passwords (and usually succeed, by the way). The system must be able to slow down an attacker by any means available: delaying the system response every time an unsuccessful authentication is detected, blocking an account for a short time after a number of unsuccessful authentication attempts, or throwing up captchas.
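One of the slow-down options, blocking an account for a short time after a number of failed attempts, could look roughly like the sketch below. The class name, the limits (five failures, five minutes) and the in-memory dictionaries are my assumptions for illustration; a real system would keep this state somewhere that survives restarts and is shared between servers.

```python
import time

class LoginThrottle:
    """Temporary account lockout after repeated failures (illustrative sketch)."""

    def __init__(self, max_failures=5, lockout_seconds=300, clock=time.monotonic):
        self.max_failures = max_failures
        self.lockout_seconds = lockout_seconds
        self.clock = clock            # injectable so the logic can be tested
        self._failures = {}           # account -> consecutive failure count
        self._locked_until = {}       # account -> time when the lock expires

    def is_locked(self, account):
        until = self._locked_until.get(account)
        if until is None:
            return False
        if self.clock() >= until:
            # The lock has expired; forget it.
            del self._locked_until[account]
            return False
        return True

    def record_failure(self, account):
        count = self._failures.get(account, 0) + 1
        self._failures[account] = count
        if count >= self.max_failures:
            # Too many failures in a row: lock and start counting afresh.
            self._locked_until[account] = self.clock() + self.lockout_seconds
            self._failures[account] = 0

    def record_success(self, account):
        self._failures.pop(account, None)
        self._locked_until.pop(account, None)
```

The login handler would call `is_locked()` before even looking at the password, and `record_failure()` or `record_success()` afterwards. Keep in mind that a lockout like this can itself be abused to keep a legitimate user out, which is one reason to keep the blocking period short.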

However, even the most sophisticated system fails if the user’s password is the most common word: “password”. The attacker then succeeds at once because that is likely to be the first word tried. So we need a system for blacklisting passwords that are most likely to be tried in a dictionary brute-force attack. This may annoy users who would prefer a simple word as a password, but this is the reality – any simple word used as a password is likely to be a security hole and must be banned.

While implementing the user login plugin for CakePHP I came across this simple question. Where do we get the password lists to check the newly entered passwords against? And here is a resource I can recommend: 62K Common Passwords by InfoSec Daily. Depending on your system’s speed you could use a smaller file of 6 MB, a 1.5 GB file that should take care of most common passwords or fuse the files into your own list.
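Whatever list you settle on, the check itself is small. Below is a minimal sketch in Python (not the CakePHP plugin code), assuming the list is a plain text file with one password per line; the function names and the decision to compare case-insensitively are my assumptions.

```python
def load_blacklist(lines):
    # `lines` is any iterable of strings, e.g. an open text file.
    # Blank lines are skipped; entries are normalized to lower case.
    return {line.strip().lower() for line in lines if line.strip()}

def is_password_allowed(password, blacklist):
    # Reject the password if its normalized form is on the list.
    return password.strip().lower() not in blacklist
```

Loading the list into a set keeps each lookup at constant time, so the cost is paid in memory and start-up time, which is what the choice between the smaller and the larger file comes down to.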

Cryptography: just do not!

Software developers regularly attempt to create new encryption and hashing algorithms, usually to speed up things. There is only one answer one can give in this respect:

What part of "NO" don't you understand?

Here is a short summary of reasons why you should never meddle in cryptography.

  1. Cryptography is mathematics, very advanced mathematics
  2. There are only a few good cryptographers and cryptanalysts and even they get it wrong most of the time
  3. If you are not one of them, never, ever, ever try to write your own cryptographic routines
  4. Cryptography is a very delicate matter, worse than bomb defusing
  5. Consequently you must know that most of the usual “cryptographic” functions are not cryptographic at all
  6. Even when it is good, cryptography is too easy to abuse without knowing it
  7. Bad cryptography looks the same as good cryptography. You will not know whether cryptography is broken until it is too late

So, I hope you are sufficiently convinced not to create your own cryptographic algorithms and functions. But we still have to use cryptographic functions and that is no picnic either. What can mere mortals do to keep themselves on the safe side?

Random or not? That is the question!

Oftentimes, the first cryptography-related question you come across while designing a system is the question of random numbers. We need random numbers in many places when developing web applications: identifiers, tokens, passwords etc. all need to be somewhat unpredictable. The question is, how unpredictable should they be? In other words, what should the quality of the randomness be for those purposes?

It is very tempting (and it is usually done this way) to make a single random number generator that is “sufficiently good” for everything and use it all over the place. That is a flawed approach. Yes, given a true random number generator, this is feasible. But true random number generators are scarce and are usually unavailable in a web application. So you end up with a pseudo-random number generator of unknown quality that you re-use for multiple purposes. This last part matters because, by observing the random numbers used in one part of the system, an attacker can deduce the random numbers used in another, totally hidden part of the system. What should one do?

Importantly, what is the value generated through a (pseudo-)random number generator going to be used for? Will the security of the whole system hang on this one value being totally unpredictable?

Generally, the system should not fall apart when a random value is suddenly not-so-random. The principles of layered security should prevent attacks even when the attackers succeed in guessing your random ids, tokens and so on.

Random number generator by xkcd

Case in point: at Citibank they relied on the account numbers being “hidden”. They were not random to begin with and there was no other security in place, so anyone knowing the account number structure could go and receive information of other accounts in the bank. That kind of thing should not happen. When you use random values for ids of the information in your database, that’s good. But the security must not rely on just the ids. There must be some controls that prevent someone from seeing what they should not even when the ids are known.
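A control of that kind can be as simple as checking ownership on every lookup. The sketch below is purely illustrative: the account data, the `owner` field and the function name are invented, and it says nothing about how any real bank does it.

```python
class AccessDenied(Exception):
    pass

# Hypothetical data store: account id -> record with an owner.
ACCOUNTS = {
    "1001": {"owner": "alice", "balance": 50},
    "1002": {"owner": "bob", "balance": 75},
}

def get_account(session_user, account_id, accounts=ACCOUNTS):
    account = accounts.get(account_id)
    # Knowing (or guessing) an id is not enough: the record is returned only
    # to its owner. Missing and foreign accounts fail in the same way, so the
    # response does not reveal which ids exist.
    if account is None or account["owner"] != session_user:
        raise AccessDenied(account_id)
    return account
```

Here `session_user` is assumed to come from the authenticated session on the server side, never from the request parameters.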

Another case, on the opposite side, is the credit card “chip and pin” system created by EMV. They decided for some reason that it would be okay to have a random number that is not, well, so random. The manufacturers, of course, decided to take advantage and did not provide real random number generators. The criminal organizations, of course, took notice and promptly implemented card copying targeting the weak ATM terminals. All of this is because of a weak random number. It should not have happened, because they were supposed to know that their security depends on that random number and they should have taken care. This is a case where you really need a proper random number generator and I am surprised MasterCard CAST actually let it be. But enough of the scary stories.

Generally, most of the time when you think “I need random numbers” you don’t. What you need are reasonably unpredictable numbers of fairly low cryptographic quality. This is so because your system design must not depend on the quality of those random numbers. If it does, it is a bad design decision. Typically, whenever you encounter such a dependency you must consider a redesign. There are very few exceptions, like key generation and similar things, which you should not do yourself in the application code anyway. For the rest, the system must be fully operational and disallow access properly even if your random numbers are suddenly all equal to five. The system may fail to operate but it must not become wide open because of a failure of the random number generator.

There are situations where you could say that the security of the whole system relies on the random properties of a value. For example, the password reset systems of most websites send a random token by e-mail to the account holder. When this token is entered into the password reset page field, the system allows changing the password. If this token can be guessed (or forced to a particular value), the system’s security is easily compromised – just request the password reset and go enter the value, you do not need to see the e-mail. In this case, yes, that value must be truly random or at the very least impossible to predict within a reasonable period of time (your tokens do limit the “reasonable time” with a validity period assigned, right? Right??). Interestingly, in this case the delivery channel does not matter at all. Even if you had a so-called “two-factor” authentication system where you get this code sent by a short message to your mobile, it won’t matter. If an attacker can guess the token, the rest of the system is of no consequence in this design.
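To make the “unpredictable and short-lived” part concrete, here is a minimal sketch using Python’s standard `secrets` module, which draws from the operating system’s cryptographic generator. The 15-minute validity period, the function names and the plain dictionary used as a token store are my assumptions for illustration.

```python
import hmac
import secrets
import time

TOKEN_TTL_SECONDS = 15 * 60   # assumed validity period

def issue_reset_token(store, account, now=None):
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)          # 32 random bytes, URL-safe text
    store[account] = (token, now + TOKEN_TTL_SECONDS)
    return token

def redeem_reset_token(store, account, candidate, now=None):
    now = time.time() if now is None else now
    entry = store.get(account)
    if entry is None:
        return False
    token, expires_at = entry
    if now >= expires_at:
        del store[account]                     # expired tokens are discarded
        return False
    # Constant-time comparison; bytes are used so that odd input cannot
    # make the comparison raise.
    if hmac.compare_digest(token.encode(), candidate.encode()):
        del store[account]                     # a token works only once
        return True
    return False
```

A real system would also limit the number of redemption attempts per account, otherwise the validity period is the only thing standing between the attacker and a long guessing session.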

So, a typical system should have at least two random number generators. One used for internal purposes and one used to generate tokens sent to users. They should be good, both of them, but the one for tokens should be cryptographically strong while the one for internal use may be just fairly unpredictable because your security would not rely solely on those numbers. The generators should be written by people with some knowledge of cryptography, publicly reviewed and tested.
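In Python terms, the separation could be sketched as two independent generator objects. The names below are mine, and the choice of the default Mersenne Twister for the internal generator is only an assumption to illustrate “fairly unpredictable but not cryptographic”; on a platform where the operating system generator is cheap enough, using it for both is a perfectly reasonable simplification.

```python
import random
import secrets

# Internal use only (e.g. ids that are protected by other controls anyway).
# random.Random is a Mersenne Twister: fast, but NOT cryptographically strong.
internal_rng = random.Random()

# Anything handed to users. secrets.SystemRandom draws from the operating
# system's cryptographic generator and shares no state with internal_rng.
token_rng = secrets.SystemRandom()

def new_internal_id():
    return internal_rng.getrandbits(64)

def new_user_token():
    # 128 random bits rendered as 32 hexadecimal characters.
    return "%032x" % token_rng.getrandbits(128)
```

Because the two objects share no state, values observed from `new_internal_id()` tell an attacker nothing about the output of `new_user_token()`, which is the property the reuse of a single generator gives away.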

And here is some random reading for more on the subject:

  1. Randomness attacks against PHP applications.
  2. Chip and pin ‘weakness’ exposed by Cambridge researchers.
  3. Citigroup hack exploited easy-to-detect web flaw.
  4. How to test a random number generator.
  5. NIST on random number generators.

NFC, ain’t that funny

N-Mark Logo for certified devices

When we invented NFC (Near Field Communication) we never intended it for some of the uses that it was put to afterwards. And when we started discussing those unconventional (for us) uses, we immediately pointed out all the security problems and proposed methods to protect NFC devices from various attacks. That was… probably 2004. Do you think anyone listened? Nope. After that, we put a few years’ worth of work into some (ok, granted, fairly fuzzy for political reasons) guidance, standards and white papers in Ecma International and NFC Forum. Did anyone take notice? I don’t think so.

At the recent Black Hat security conference, security researcher Charlie Miller detailed and demonstrated attacks on NFC devices and showed how he can pwn a mobile phone through a combination of NFC and browser attacks.

The reason? NFC is a new attack surface and it has to be protected, both by itself and in combination with all the other things that are operating in the same device. However, the usual thing has happened. People paid attention only to the hype about the usefulness and ease of use of the technology but never paid attention to its security. Now the security will have to be added, again, as an afterthought.

Duh, the humanity.

Biometrics – any good?

I think I already talked about this subject previously but not here. Anyhow, the subject bears repeating.

Many go “yippee!” at the mention of biometrics and start to think their user authentication problem is solved. Pay them no attention: they will end up in the newspaper headlines fairly soon, either for massive security failures or for going bankrupt, or both.

Cartoon of a man being checked on biometric fe...

The problem is not a huge false negative rate, and it is not the huge false positive rate either. The problem is immutability of the characteristics. The biometric characteristics change slowly over time as you age and can be influenced by the environment but they cannot be changed at will (well, at least not easily). The problem is that whatever this thing is, it is a part of your body and is most of the time something you do not want to change even if you had a possibility to do so.

And that’s also why it is dangerous – you may end up losing a limb or two.

The first question that should be asked, then, is “what’s it good for, anyway?” A characteristic that is fairly stable and cannot easily be changed at will – that’s a fairly reasonable user name, i.e. user identification. Even then, it is a questionable approach because it is a good idea to let users change names.

Biometrics is definitely not any good for authentication, that is, for proving that you are who you say you are. If you compare it to the familiar authentication with passwords, you will notice that the means of authentication are supposed to be:

  • secret or concealable
  • non-degradable
  • easily changeable in a controlled manner
  • transferrable in a controlled manner

And biometrics is none of that.

But why then? Oh, I do not expect to find a definitive answer to that but one thing could be that it looks cool in movies. The other is that biometrics were historically good for tracking people that are not actively resisting such tracking. But then we talk about politics and power and that is not the subject of this discussion at all. One thing is certain: whatever the reason to use biometrics is, it has absolutely nothing to do with security.

So when you see claims like “Biometric is the most secure and convenient authentication tool”, now you know that’s just utter nonsense and you should stay away from people (and companies) making such claims. Unless there isn’t enough nonsense in your life, of course.

When it’s your responsibility to implement a security system, try to stay away from biometrics; you’ll live a happier life. Leave it to Hollywood.