In the previous post I talked about symmetric key cryptography, and focused on block cryptography for sending messages with a shared key.
Sending secret messages is the use case that comes to mind when we think about cryptography. But cryptography is a wider field, and it deals with other kinds of problems. In this post I will talk about several topics, but they are all connected by a theme: Authentication.
Authentication is one of the most common uses of cryptography. It is defined as:
Authentication (from Greek: αὐθεντικός authentikos, "real, genuine", from αὐθέντης authentes, "author") is the act of proving an assertion, such as the identity of a computer system user.
For the purpose of this post we will cover two kinds of authentication, with the purpose of authenticating different kinds of assertions. One is for authenticating the integrity and source of a message, and the other will be for authentication of users. But first we need to introduce hashing, a cryptographical tool that will be used for both cases.
Hashes
A hash function in the most generic form is a function that is able to transform arbitrarily long data into a predefined amount of bits. Hash functions are used in many computer science applications, but for the purpose of cryptography, we are looking for a specific set of properties that hash functions should have to be considered secure:
- it is deterministic, meaning that the same message always results in the same hash
- it is quick to compute the hash value for any given message
- it is infeasible to generate a message that yields a given hash value (i.e. to reverse the process that generated the given hash value)
- it is infeasible to find two different messages with the same hash value
- a small change to a message should change the hash value so extensively that the new hash value appears uncorrelated with the old hash value (avalanche effect).
Basically what this means is that you should not be able to reverse a hash function (you are not able to come up with an input that hashes to a given hash in reasonable time).
Hashes are often used to validate integrity. For example in the case of downloading a big file, we would receive a hash along it, we would hash the file that we received, and compare it with the received hash. If the received hash and the generated hash are different then it means that the received file changed during transit, or that the received hash is incorrect. We call this kind of hash a checksum.
Hash collisions
In the example above, we are sending a file, along with a hash. as we said a hash is always the same size, depending on the used algorithm. For example for sha256 it is 256 bits. So if we are transforming files of any size into 256 bits, it is easy to see that it is impossible to develop a hash function where two different files don't hash into the same bits. In fact, since the number of combinations of bits that we can input into the hash function is infinite, and the number of possible hashes is finite, there is an infinite amount of combinations that will produce the same hash.
The objective when developing a cryptographic hash function then, is not to make collisions impossible, but to make it impractical for current technologies to find one. As technology advances however, hash functions that were considered secure, are not anymore. For example sha1 has had many attacks, and is not considered secure.
The first rule presented in the first episode of the series is about not using such outdated hash functions.
Which hash function should I use?
For the purpose of this series, we can say that any hash function in the sha2 and sha3 family are safe, sha256 or sha512 for example would both be safe and are widely used and implemented in many libraries. There are many safe hash functions out there, but we should first check that it is considered safe nowadays before using one.
It is important also to know when a hash function is needed, and when we need something more specific. In the case of password hashing for example, sha2 is not the best tool for the job.
Password hashing: Key derivation functions
There are many articles talking about how it is important to never store passwords in plaintext. It is nowadays common knowledge that passwords should be stored as hashes. But some hashes are better than others. A cryptographic hash function is not designed for this use case in mind, and so it has some issues:
Hashes are fast: Hash functions are designed to be efficient in computation time, meaning that given a hashed password, you can try many passwords by brute force in little time. It would be better to use a slower function that would make it more costly, even for bad passwords, to be brute-forced.
Same passwords have same hashes: Given two users with the same password, we would store the same hash, and that is a leak of information. Also this property makes it easy for attackers to store rainbow tables, containing precomputed password hashes. The solution for this is adding a random salt to passwords before hashing them, and storing the salt along the hash. This is similar to the usage of IVs in block cryptography.
For this reason, there are specific functions, called key derivation functions (KDF), which are made for the specific purpose of password hashing. They use salts, and they are arbitrarily slow to calculate, usually by applying hash functions many times recursively. When dealing with passwords we should always be using KDFs. Examples of good KDFs to use would be bcrypt, scrypt, and argon2.
Message Authentication
In the previous episode of the series we talked about block cryptography. In block cryptography we share a key with the recipient, and we are able to encrypt messages and decrypt them with the same key. It turns out that a lot of the attacks on block cryptography deal with an attacker intercepting an encrypted message, modifying it and then sending it to the recipient. The recipient then receives the message thinking that it comes from the original sender and has no knowledge of the modification.
Some sort of method of authentication needs to be applied to guarantee the integrity of the message sent by the original sender.
HMAC
HMAC stands for Hash based Message Authentication Code.
Sending a hash of the message in itself to authenticate the integrity would not be useful, since a middleman attacker could also generate the hash of the modified message.
So what an HMAC will do is it will include the secret shared key somewhere in the hash. The key is unknown to the attacker, and is impossible to recover from the hash.
The naive way to implement such a construction would be to send along the encrypted message a MAC such as: H(key || message), where H is a secure hash function, and || denotes concatenation. It turns out this is actually not a good solution, since length extension attacks allow an attacker to break that particular HMAC. The standard HMAC construction will be a little more complex, but it will avoid this kinds of attacks:
Where opad and ipad are padding, and m is the message to authenticate.
The HMAC construction can use any secure hash function, and will output a hash at the end. A middleman attacker could not generate a valid HMAC without knowledge of the key. So when we send encrypted data, it is a good idea to send an HMAC along with it, so that if the message is changed, the HMAC will be invalid.
Sometimes HMACs are used as signatures. Meaning that the message is sent unencrypted along an HMAC, so that any middleman could read it, but it can not be modified without knowledge of the key without invalidating the HMAC.
Authenticated Encryption
There exist encryption algorithms that already include the message authentication. The most important of those is GCM, which is a block cipher mode of operation, that is very efficient, and includes an authentication code. GCM is the mode of operation that is most used for transmitting data, especially in client/server environments where CBC is not the best choice. For example it is very commonly used in HTTPS communication, which we will talk about in the following episode. It is very important when using GCM and other stream cipher modes that the iv is never reused. It is recommended to make sure that the iv will not repeat, by adding a counter part in the iv that increments for each message encrypted.
JWT
JWT (JSON Web Token), is a protocol for generating signed tokens containing claims. Claims is any information that you can represent in JSON-like format. For example one claim could be [name = "Sergi"], where I am claiming that Sergi is my name. If I am sharing a secret key with somebody else, then I could send them a Signed token containing this claim, and they could verify that it is correctly signed, and that the claim comes from me. It is important to note that JWT data is not encrypted. Meaning that everyone can read the data on a token, but they can not modify it without breaking the signature. The way JWT implements this for symmetric keys is with HMACs.
a JWT is organized in 3 parts, each one separated by a period: xxxx.yyyy.zzzz. Each of the parts is encoded into a string by base64url. The three parts are:
Header
The header contains information about the algorithm that is used. In the example below, it is using HMAC with the SHA-256 hash.
{
"alg": "HS256",
"typ": "JWT"
}
Payload
Here is where the claims are, in JSON format. There are a few special standarized claims, that some libraries treat differently, such as iss (issuer), exp (expiration time) or sub (subject).
In particular exp is interesting because even if the signature is correct, if the token contains an exp claim, the token will not be valid after the expiration.
Signature
This part contains the actual signature of the first two encoded parts, using the specified algorithm in the header, and a secret:
HMACSHA256(
base64UrlEncode(header) + "." +
base64UrlEncode(payload),
secret)
If you want to quickly play around with JWT, at jwt.io there is a very useful tool for encoding and decoding jwt tokens.
User Authentication
User authentication is a problem that requires all of the techniques we talked about in this post, and is the most common use-case for JWT tokens. The problem in User Authentication, is not only to verify their credentials, but to provide them a way to keep an authenticated session open without having to verify their identity at every operation.
User authentication usually goes like this:
- The user signs up creating a user and a password.
- The auth server stores the user and a secure password hash of the password (created with a secure KDF).
- The user signs in to the server, providing the user and password.
- The auth server generates the hash from the provided password and compares it with the stored hash. If it matches, a JWT is provided with the sub (user) and exp (expiration) claims.
- The user provides the JWT along with every operation they realize, and the server verifies the signature and expiration every time. If the JWT signature is correct, then the operation is allowed for the user in the sub claim.
In this case, the auth server has a secret key for the JWT signing, so that it is the only one able to verify and sign tokens, and the user is not capable of altering the claims provided by the server.
I provide below a simplified example of the code that an auth server would have for user authentication. Usually though, implementing authentication is not the best solution, and it might be a better choice to use an authentication system such as firebase auth or auth0, which offer many interesting features such as SSO, and social logins.
And that's it for today! I hope I was able to give a good summary of the topic of authentication in cryptography. In the next episode I will briefly talk about asymmetric public key cryptography, and we will see how all that we learned in the series comes together in cryptographic protocols such as ssh and TLS that we use every day without noticing.