Join Sean Colins for an in-depth discussion in this video Understanding hashing, part of Learning Secure Sockets Layer.
- In the last movie of the last chapter, we talked briefly about HASHes and how they're utilized in the creation of a signature that will verify the authenticity of a certificate. But let's try to understand what a HASH is. Cryptographic HASH functions are what are used to change a message of any length into a string of data with exactly the same length every time. That doesn't make a whole lot of sense if you really try to put it together in your head. I mean, if I HASH "Sean likes ice cream," and I HASH "The Gettysburg Address," and I end up with two equally long HASHes, sure, the HASHes are different, but how is that even possible? There's an enormous difference in the amount of data being passed into the HASH, so how can everything that comes out of it be equally long? The important thing to understand is that the HASH value that comes out the other end is created specifically to be uniquely tied to the original message and, in most cases, to be a lot shorter than that original message.
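To make the fixed-length idea concrete, here is a minimal sketch using Python's standard hashlib module. SHA-256 is an assumed choice (the talk doesn't name an algorithm at this point), and the long message is just a repeated phrase standing in for a much longer document.

```python
import hashlib

short_message = b"Sean likes ice cream"
# Stand-in for a long document: one phrase repeated many times (illustrative only).
long_message = b"Four score and seven years ago " * 1000

short_digest = hashlib.sha256(short_message).hexdigest()
long_digest = hashlib.sha256(long_message).hexdigest()

# The inputs differ enormously in size, but both digests are 64 hex characters (256 bits).
print(len(short_message), "bytes in,", len(short_digest), "hex characters out")
print(len(long_message), "bytes in,", len(long_digest), "hex characters out")
```

The two digests differ from each other, but their length depends only on the algorithm, never on the input.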
HASHes can be used for lots of different things, not necessarily always in cryptography. And those HASHes can be used to validate the data that they came from. So, as an example, you make a HASH of a message and store the HASH. You can check the message against the stored HASH value any time later. This is frequently used with password databases. They'll HASH the passwords in the database, and the stored HASHes are what's used to check whether or not a password is what it should be whenever it's passed to the system by a user. That way the password's not really exposed on the server side, and it protects the data.
But it also makes it much, much faster to compare the data, because you're only comparing short digests. So this is a pretty cool thing. It's used in cryptography, but it's not the same thing as the encryption we've talked about previously, even though there are a lot of similarities. As I said before, if you use the digest that results from the HASH function to validate data, you can do that without exposing the data, and that's very convenient if you are, as in the example I gave before, managing a password database.
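The password example can be sketched roughly like this. The talk only says that the password is hashed and compared; the per-user salt, the PBKDF2 function, and the iteration count below are assumptions added here because they are common practice, and the function names are hypothetical.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 100_000  # assumed work factor, chosen only for illustration

def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest); only these are stored, never the password itself."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def verify_password(candidate: str, salt: bytes, stored_digest: bytes) -> bool:
    """Re-hash the candidate with the stored salt and compare the digests."""
    candidate_digest = hashlib.pbkdf2_hmac("sha256", candidate.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate_digest, stored_digest)

salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

The server can answer "is this the right password?" while holding only the salt and the digest.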
But it could also be used, and in fact it is used, for tracking DNA and changes in DNA strands, because that's an awful lot of data, and creating a HASH of that data makes it possible for you to know when you're looking at different versions of a string, because any small change will result in a change in the HASH. One important rule for HASHes used in cryptography is that the HASH should, for all practical purposes, always result in a digest that is unique to the message that was used to create it.
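Here is a small sketch of the "any small change" point, again assuming SHA-256 from Python's hashlib. The two sequences are made-up strings that differ in a single character, not real genetic data.

```python
import hashlib

# Two made-up sequences that differ only in the final character.
original = b"GATTACAGATTACAGATTACA"
mutated = b"GATTACAGATTACAGATTACC"

original_digest = hashlib.sha256(original).hexdigest()
mutated_digest = hashlib.sha256(mutated).hexdigest()

print(original_digest)
print(mutated_digest)
print("changed:", original_digest != mutated_digest)
```

Comparing the two short digests is enough to tell that the underlying data changed, without comparing the data itself.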
If you could have two messages create the same HASH, you'd end up with what's called a "HASH collision." That is one of the known failure points of HASHing technology, and it's something that the different HASH algorithms try to avoid. So let's talk about that. There are a lot of different kinds of HASH algorithms out there, and the more recently developed ones avoid these failures very, very well. In fact, HASH algorithms like SHA-2 and SHA-3 do very well both at avoiding collisions and at resisting easily computed reverse engineering of the result of the HASH.
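To see what a collision looks like, the sketch below deliberately weakens a hash by keeping only the first byte of a SHA-256 digest. That truncation, and the name tiny_hash, are assumptions made purely for illustration; with only 256 possible outputs, two different messages are guaranteed to share one within 257 tries.

```python
import hashlib
from typing import Dict, Tuple

def tiny_hash(message: bytes) -> bytes:
    """Deliberately weak 8-bit hash: the first byte of SHA-256 (illustration only)."""
    return hashlib.sha256(message).digest()[:1]

def find_collision() -> Tuple[bytes, bytes]:
    """Hash "0", "1", "2", ... until two different messages share a tiny_hash value."""
    seen: Dict[bytes, bytes] = {}
    counter = 0
    while True:
        message = str(counter).encode()
        digest = tiny_hash(message)
        if digest in seen:
            return seen[digest], message
        seen[digest] = message
        counter += 1

first, second = find_collision()
print(first, second, tiny_hash(first).hex(), tiny_hash(second).hex())
```

The full 256-bit digests of those same two messages still differ, which is why longer, well-designed digests make collisions impractical to find.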
So when we're a server, how do we even know what the client we're sending stuff to is going to support? We don't really know that up front. When a secure connection is initially requested by a client, and I've told you this in previous chapters, the server sends a whole bunch of information to that client about itself along with its own public certificate, right? The information that the client and server exchange at that point includes which HASH functions are supported and which encryption technologies are supported.
And so if both the client and the server support SHA-2, for example, they'll choose SHA-2. And the server will just say, "These are the lists of things that I support, this is what I think we should do," right? So to review, in SSL, a HASH is going to be used to calculate that small, fixed-length message. And it doesn't matter what length the message was that was passed into it, because it's always going to end up being that one smaller, fixed length that's easier and more efficient to transfer across the network and to work with in calculations.
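Going back to the negotiation step for a moment, the idea of picking something both sides support can be sketched as below. This is not the real SSL/TLS handshake code; the function name and the algorithm labels are hypothetical, and the sketch simply assumes each side has a list and the first mutually supported entry in preference order wins.

```python
from typing import List, Optional

def choose_algorithm(preferred_order: List[str], peer_supported: List[str]) -> Optional[str]:
    """Return the first entry in preferred_order that the peer also supports, if any."""
    peer_set = set(peer_supported)
    for name in preferred_order:
        if name in peer_set:
            return name
    return None  # no overlap: the connection cannot be negotiated

# Hypothetical labels, not real cipher suite identifiers.
server_list = ["SHA-2", "SHA-1", "MD5"]
client_list = ["SHA-1", "SHA-2"]

print(choose_algorithm(server_list, client_list))
```

If there is no overlap at all, the function returns None, which corresponds to the two sides failing to agree on a secure connection.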
The server is going to send a digest that was encrypted using the server's private key, and that's then going to be used on the client side to validate the authenticity of the certificate that was received by the client's system. This is what HASHes are used for within the SSL system. The HASH has safeguards built into it to prevent duplication or undetected modification of the digest in transit. This is important because if the digest could be modified without detection, then someone could get in the middle, replace the HASHed data, and the client would be open to an attack.
When the client has both the decrypted digest and the digest it received in clear text, the fact that the two are identical proves that the matching private key was used to encrypt the HASH, because the data the client decrypted with the public key matches the clear-text version of the HASH.
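The compare-the-two-digests check can be sketched with a toy example. Everything about the key here is an assumption for illustration: it is a textbook-sized RSA key with no padding, far too small for real use, and the digest is squeezed into that tiny modulus only so the arithmetic fits. Real certificates use much larger keys and a standardized signature scheme.

```python
import hashlib

# Textbook-sized RSA key, assumed for illustration only (insecure by design).
P, Q = 61, 53
N = P * Q   # public modulus (3233)
E = 17      # public exponent
D = 2753    # private exponent: (E * D) % ((P - 1) * (Q - 1)) == 1

def toy_digest(message: bytes) -> int:
    """SHA-256 digest reduced modulo N so it fits the toy key (not done in real schemes)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """'Encrypt' the digest with the private exponent."""
    return pow(toy_digest(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """'Decrypt' with the public exponent and compare to a freshly computed digest."""
    return pow(signature, E, N) == toy_digest(message)

certificate = b"example certificate contents"
signature = sign(certificate)
print(verify(certificate, signature))
```

Only the holder of the private exponent could have produced a value that decrypts to the matching digest, which is the property the client relies on.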
- SSL communications
- Certificate authorities
- Public key infrastructures
- Symmetric and asymmetric key pairs
- Cryptographic hash functions
- Encryption algorithms
Start now, and by the end of this course you'll have the knowledge to create SSL certificates, as well as revoke and renew them, from the command line.