Recently I had to talk about this subject during a lecture, and I realized that I had never published anything about it, simply assuming that it was common sense. Googling the subject, though, I see that many people still don't have a clear understanding of the problem, so I decided to post this. I hope it helps the community 🙂
The most common mistake is storing passwords in clear text, accompanied by the equally dangerous mistake of sending them in clear text over the network. The latter rests on the naïve assumption that an SSL connection provides enough security.
The fact of the matter is that SSL alone cannot guarantee security: new vulnerabilities are identified and fixed on a regular basis, which shows how fragile that assumption is. It also implies that all users and all server admins always install the latest security fixes, another quite wrong assumption, I'm afraid.
The password process in detail
It follows that password hashing must always happen on the client side, with only the resulting hash sent over to the server. But is this safe enough? Not entirely: we need to add “salt” to ensure security.
What does that mean? Let’s look at it from an attacker’s perspective. The common way to hack an account is to use lookup tables of common passwords: the attacker takes the hashed password and runs it against a database of password hashes like the following:
The above are simple test resources. In reality any serious attacker has a far more complete database, compiled from full dictionaries and thesauruses in multiple languages: that’s what we are up against.
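To make the lookup-table attack concrete, here is a minimal Python sketch. The word list and the function names (`md5_hex`, `crack`) are made up for illustration, and the sketch assumes the stolen values are plain, unsalted MD5 hashes.

```python
import hashlib
from typing import Optional

# Tiny stand-in for an attacker's dictionary (hypothetical list);
# real tables are compiled from full dictionaries in many languages.
COMMON_PASSWORDS = ["123456", "password", "qwerty", "letmein", "my password"]


def md5_hex(text: str) -> str:
    """Return the MD5 digest of text as a lowercase hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# The table is built once: hash -> original password.
LOOKUP_TABLE = {md5_hex(p): p for p in COMMON_PASSWORDS}


def crack(stolen_hash: str) -> Optional[str]:
    """Return the password behind an unsalted hash, or None if unknown."""
    return LOOKUP_TABLE.get(stolen_hash.lower())


if __name__ == "__main__":
    print(crack(md5_hex("password")))        # found in the table
    print(crack(md5_hex("Tr0ub4dor&3 xyz")))  # not in the table -> None
```

Once the table exists, each lookup is a single dictionary hit, which is why unsalted hashes of common passwords offer so little protection.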
Because of this we have to be very careful, starting with establishing a strong password policy.
Let’s see this in a step-by-step practical example, taken from a real-world algorithm in use in our games here at DFT Games Ltd. (well, a simplified version of it 🙂).
The user types his user name. Let’s assume it’s the email address, as that’s a common scenario (we force it to lower case to ensure determinism in the next steps):
Then he types his password, which in this example is “my password”.
Such weak passwords are painfully common, so we have to correct this to protect the client and reduce our liability. To do that, let’s compute our salt from the user name. Here I use one of many possible approaches; any other is fine as long as it’s deterministic.
Because the first letter is “j”, whose ASCII value is 106 (152 in octal), an even number, we pick all the odd characters from the user name, so that
becomes for our purpose the following string:
Now we hash this new string, and for this step a simple MD5 will be enough, giving us the following:
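The salt derivation just described can be sketched in Python as follows. Several things here are assumptions rather than part of the original algorithm: “odd characters” is read as the characters in odd positions counting from 1, a user name whose first letter has an odd code is assumed to select the even positions instead, and the function names and the sample address `jane.doe@example.com` are purely illustrative.

```python
import hashlib


def pick_characters(user_name: str) -> str:
    """Select every other character of the lower-cased user name.

    Assumption: positions are counted from 1, so "odd characters" are
    string indices 0, 2, 4, ... An even first-letter code selects the
    odd positions; an odd code is assumed to select the even positions.
    """
    name = user_name.lower()
    if not name:
        raise ValueError("user name must not be empty")
    first_is_even = ord(name[0]) % 2 == 0  # ord("j") == 106, which is even
    start = 0 if first_is_even else 1
    return name[start::2]


def derive_salt(user_name: str) -> str:
    """Hash the selected characters with MD5, as in the walkthrough."""
    picked = pick_characters(user_name)
    return hashlib.md5(picked.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Hypothetical address, not the one used in the original example.
    print(pick_characters("Jane.Doe@example.com"))  # jn.o@xml.o
    print(derive_salt("Jane.Doe@example.com"))      # 32 hex characters
```

Because the salt depends only on the user name, the same function can be run again at every login and never needs its output stored.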
Now we have all we need to fix that weak password, so we chain everything together with a star character in between, getting the following string:
my password* d9e2feaea42f0f4b6891f8030f357041
Now this looks much better, and it is quite hard to crack using the common tools. This is what we are going to hash using SHA512, getting the value that will be stored in the database:
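A minimal Python sketch of this chaining step, using the password and the MD5 value from the example above. The function name `strengthen` is hypothetical, and the sketch assumes the star is placed directly between the two parts with no surrounding whitespace.

```python
import hashlib


def strengthen(password: str, salt_hash: str) -> str:
    """Join password and salt hash with a star, then hash with SHA-512.

    Assumption: no whitespace is added around the star separator.
    """
    combined = f"{password}*{salt_hash}"
    return hashlib.sha512(combined.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    salt_hash = "d9e2feaea42f0f4b6891f8030f357041"  # MD5 value from the example
    digest = strengthen("my password", salt_hash)
    print(len(digest))  # 128 hex characters
    print(digest)
```

The input to SHA512 is now far longer than the original password and is different for each user name, so a precomputed table of common password hashes no longer matches it.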
Now, if we want to improve this even more, we can apply the same odd/even rule to this hash: because “j” is even, we make every non-numeric character in an odd position upper case, getting this final string:
This final string is not just the SHA512 of a salted password: it has also been processed with a letter-casing rule derived from the user name, further reducing the already dramatically low odds of cracking it via a brute force attack, with or without a hash database.
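The casing rule can be sketched in Python like this. As before, positions are assumed to be counted from 1, and the behaviour for a first letter with an odd code (upper-casing the even positions instead) is an assumption, as is the name `apply_casing`.

```python
def apply_casing(hex_digest: str, user_name: str) -> str:
    """Upper-case the non-numeric characters on alternating positions.

    Assumption: positions are 1-based. An even first-letter code targets
    the odd positions (indices 0, 2, 4, ...); an odd code is assumed to
    target the even positions (indices 1, 3, 5, ...).
    """
    name = user_name.lower()
    if not name:
        raise ValueError("user name must not be empty")
    target_parity = 0 if ord(name[0]) % 2 == 0 else 1
    return "".join(
        ch.upper() if i % 2 == target_parity and not ch.isdigit() else ch
        for i, ch in enumerate(hex_digest)
    )


if __name__ == "__main__":
    # Short stand-in string instead of a full 128-character digest.
    print(apply_casing("abcdef12ab", "jane"))  # AbCdEf12Ab
    print(apply_casing("abcdef12ab", "anna"))  # aBcDeF12aB
```

Digits are left untouched, so only the letters a to f of the hex digest can change case.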
Why not store the salt?
Storing the salt in the database is a common mistake. Why? Because doing so assumes that the attacker is an entity outside the company, and that is a wrong assumption. What about a disgruntled employee dumping the clients’ database on a torrent network? This is an issue we encounter more and more frequently all over the world because of current employment practices. We need to remember that threats do not come only from outside the company: they may just as well hit us from within.
Because of this we never store the salt in the database. Instead, we code the algorithm that computes it deterministically, and we make sure that the algorithm is known only to a very restricted number of people. The actual full algorithm in the above real-life example is known only to DFT Games Ltd. family members and is provided to the development teams as a scrambled DLL for in-game implementation.