Passwords remain the main means of authentication on the internet. People often forget their passwords and then they have to recover their access to the website services through some kind of mechanism. We try to make that so-called “password recovery” simple and automated, of course. There are several ways to do it, all of them but one are wrong. Let’s see how it is done.
If you did not read Part 1 – Secret questions and Part 2 – Secondary channel, I recommend you do so before reading on.
Part 3 – Example procedure: put it all together
Let’s assume we are putting together a website and we will have passwords stored in a salted hash form and we identify the users with their e-mail address. I will describe what I think a good strategy for password recovery then is and you are welcome to comment and improve upon.
Since we have the users’ e-mail addresses, that is the natural secondary authentication channel. So if a user needs password recovery, we will use their e-mail to authenticate them. Here is how.
The user will come to a login page and clicks the link for “forgot password” or similar. They have to provide then an e-mail address. The form for e-mail address submission has to have means of countering automated exhaustive searches to both lower the load onto the server in case of an attack and provide some small level of discouragement against such attacks. There are two ways that come to mind: using a CAPTCHA and slowing down the form submission with a random large (an order of seconds) delay. Let’s not go into the holy war on CAPTCHA, you are welcome to use any other means you can think of and, please, suggest them so that others can benefit from your thoughts here. You should also provide an additional hidden field that will be automatically filled in by an automated form scanning robot, so you can detect that too and discard the request. Anyway, the important part is: slow down the potential attacker. The person going through recovery will not mind if it takes a while.
As the next step, we will look up the user e-mail address in the database, create a random token, mail it out and provide the feedback to the user. The feedback should be done in constant time, so that an attacker does not use your recovery mechanism to collect valid e-mail addresses from your website. The process thus should take the same time whether you found the user or not. This is difficult to get right and the best solution is to store the request for off-line processing and return immediately. Another way is to use the user names instead and look up the e-mail address but a user is more likely to know their own e-mail address than remember their user name, so there is a caveat. If you cannot (or would not) do off-line processing of requests, you should at least measure your server and try to get the timing similar with delays. The timing of the server can be measured fairly precisely and this is difficult to get right, especially under fluctuating load but you must give it a try. Still, it’s best if you keep the submitted information and trigger an off-line processing while reporting to user something along the lines of “if your e-mail is correct, you will receive an automated e-mail with instructions within an hour”. The feedback should never say whether the e-mail is correct or not.
Now we generate a long, random, cryptographically strong token. It must be cryptographically strong because the user may actually be an attacker and if he can guess how we generate tokens and can do the same, he will be able to reset passwords for arbitrary users. We generate the token, format it in a way that can be e-mailed (base64 encoding, hex, whatever) and store it in a database together with a timestamp and the origin (e-mail address). The same token is then e-mailed to the e-mail address of the user.
The user receives the token, comes to the website and goes to the form for token verification. Here he has to enter his e-mail address again, of course, the token, and the new password. In principle, some measure against the automated searches is in order here too, to lower the load on the server in case of an attack. The tokens are verified against our database and then the e-mail is checked too. If we see a token, we remove it from the database anyway, then we check if the e-mail matches and we continue only if it does. This way, tokens are single use: once we see a token, it is removed from the database and cannot be used again.
Tokens also expire. We must have a policy at our server that sets the expiration period. Let’s say, that is 24 hours. Before we do any look up in our token database, we perform a query that removes all tokens with a creation timestamp older than 24 hours ago. That way, any token that expires is gone from the database when we start looking.
Well, now, if the token matches and e-mail is correct, we can look up the user in our passwords database and update the password hash to the new value. Then, flush the authentication tokens and session identifiers for the user, forcing logout of all preexisting sessions. Simple.