In this post, I will go over encryption, SSL, hashing, and signatures. Additionally, this post will have a short segment about RPO & RTO - because it is too short to make its own article.
Encryption at Rest is a type of encryption where we are concerned about protecting stored data. This type of encryption is employed when we encrypt hard drives of data or encrypt data uploaded to cloud storage to be stored in an encrypted format. The primary purpose of this kind of encryption is to prevent the data from being used by anyone other than the person authorized to use it. This type of encryption might even include preventing the owner of the storage medium, such as AWS, from being able to use or read the data.
Encryption in Transit is a type of encryption that protects data as it moves. For example, it is what your bank and browser transparently employ when you access your financial information. It keeps your data safe from any rogue system operators on the internet who might intercept the traffic.
Some encryption concepts and terms to be aware of are:
Plaintext - This is unencrypted data. Plain text doesn’t have to be text. It could be binary files, text files, media, or anything else. It just means it isn’t encrypted.
Algorithm - The code or math that takes plaintext and a key and generates the ciphertext (or, in the opposite direction, takes ciphertext and a key and recovers the plaintext).
Key - This is the secret or the password that is used by the algorithm.
Ciphertext - The output of an algorithm after it processes the plaintext with the key. This output is the encrypted data. Again it does not need to be text.
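To make the four terms concrete, here is a minimal sketch using only the Python standard library. The `keystream` and `xor_cipher` functions are hypothetical names of my own, and the construction (XOR with a SHA-256-derived stream) is a toy chosen for readability. It is not a vetted cipher and should not be used to protect real data.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """The 'algorithm': XOR the data with a key-derived stream.

    XOR is its own inverse, so the same function encrypts and decrypts.
    """
    stream = keystream(key, len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


key = secrets.token_bytes(32)               # the key (the secret)
plaintext = b"Account balance: 1,024.00"    # plaintext: any bytes, not only text
ciphertext = xor_cipher(plaintext, key)     # ciphertext: unreadable without the key
recovered = xor_cipher(ciphertext, key)     # same algorithm and key, run in reverse
```

Because the same key both encrypts and decrypts, this is also an example of the shared-key style of encryption.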
- Symmetric (Shared Key)
- This type of key requires a shared secret.
- Difficult to use for multiple parties because the key first has to reach the other party, and there is often no secure way to deliver it.
- Disk encryption or encryption before uploading data to the cloud are good examples of places where symmetric keys are common.
- Asymmetric (Public/Private Key)
- These types of keys are generated as pairs of keys. The first is the public, and the second is the private key.
- The private key is known only to the party that generated the key pair. However, the public key can be published and made available widely.
- Algorithms that use Asymmetric Keys to encrypt data use the public key. The ciphertext produced from this algorithm can only be decrypted with the same algorithm using the private key. The public key, which encrypts the data, cannot decrypt a message that it was used to encrypt.
- With this type of key, no secret ever has to be exchanged, only the public key. That makes it ideal for situations like encryption in transit, where the user of the encrypted data is not the party that initially has the data. (e.g., your bank has your balance; it encrypts the balance and sends it to you using an asymmetric key process.)
- Using asymmetric keys is more computationally expensive than using shared keys. Often, as in the bank example above (which uses SSL), the communication process starts with asymmetric keys. Once a secure method of communication is established, the parties exchange a shared key for ongoing communication to keep the process efficient.
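The public/private relationship and the "start asymmetric, then switch to a shared key" pattern can be sketched with textbook RSA. This is a toy under loud assumptions: the primes are two small, well-known Mersenne primes, there is no padding, and the function names are hypothetical. Real systems use a vetted library and much larger keys.

```python
import secrets

# Toy key pair built from two known Mersenne primes (illustrative only).
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                            # modulus, part of both keys
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)

PUBLIC_KEY = (N, E)     # safe to publish widely
PRIVATE_KEY = (N, D)    # known only to the party that generated the pair


def encrypt_with_public(message: bytes, public_key) -> int:
    """Anyone holding the public key can encrypt."""
    n, e = public_key
    m = int.from_bytes(message, "big")
    if m >= n:
        raise ValueError("message too long for this toy modulus")
    return pow(m, e, n)


def decrypt_with_private(ciphertext: int, private_key, length: int) -> bytes:
    """Only the private key holder can decrypt."""
    n, d = private_key
    return pow(ciphertext, d, n).to_bytes(length, "big")


# Hybrid pattern: wrap a random shared (symmetric) key with the public key.
session_key = secrets.token_bytes(16)
wrapped = encrypt_with_public(session_key, PUBLIC_KEY)
unwrapped = decrypt_with_private(wrapped, PRIVATE_KEY, len(session_key))
```

After this step both sides hold `session_key`, and the cheaper symmetric algorithm can take over for the rest of the conversation.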
Hashing is a way to turn an arbitrarily long piece of data into a fixed-length representation of that data. Every hash starts with data and some hash function.
A good hash function should have the following attributes:
- The same data and the same hash function always produce the same hash.
- The hashed value is a fixed length.
- The similarity of input data has no impact on the similarity of the resulting hash.
- A hash cannot be used to produce the original data.
- It should be practically impossible to find two different pieces of data that produce the same hash. Furthermore, it should not be feasible to manipulate the data in such a way as to create the same hash. If two different inputs do produce the same hash, it is known as a hash collision.
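The fixed-length and dissimilarity properties are easy to observe with Python's standard `hashlib` module. This is a small sketch using SHA-256 as one example of a hash function; the helper name `sha256_hex` and the sample messages are mine.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hash of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Fixed length: one byte and one million bytes give hashes of the same size.
short = sha256_hex(b"a")
long_ = sha256_hex(b"a" * 1_000_000)
print(len(short), len(long_))

# Similar inputs, dissimilar hashes: only one character differs in the input.
first = sha256_hex(b"Pay Bob $100")
second = sha256_hex(b"Pay Bob $900")
differing = sum(x != y for x, y in zip(first, second))
print(first)
print(second)
print(f"{differing} of {len(first)} hex characters differ")
```

Running it repeatedly prints the same hashes every time, which is the determinism property: same data, same function, same hash.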
We need a way to authenticate messages. This authentication includes being confident of the sender and that the message hasn’t been altered in transit. Signatures solve these problems.
Steps to sign a document:
- Hash the document
- Use a private key from an asymmetric key pair to sign the hash.
- Distribute the document along with the signed hash to the receiver
- The receiver can use the sender's public key on the signed hash to recover the hashed value of the document.
- The receiver can hash the document using the same algorithm and compare the hashes.
- If the hashes match, we know that the document was signed by the party holding the private key matching the public key we used to get the hash. We also can be confident that the hash matches the document in its current state - meaning it hasn’t been altered.
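The signing steps above can be sketched with textbook RSA and SHA-256. Again, this is a toy under stated assumptions: the key pair uses two small, well-known Mersenne primes, the hash is reduced to fit the toy modulus, there is no padding scheme, and the names `sign` and `verify` are hypothetical. Real signatures should come from a vetted cryptography library.

```python
import hashlib

# Toy key pair (illustrative only, far too small for real use).
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def digest_as_int(document: bytes) -> int:
    """Hash the document, then reduce the hash to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document: bytes) -> int:
    """Sender: hash the document and sign the hash with the private key."""
    return pow(digest_as_int(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    """Receiver: recover the hash with the public key and compare."""
    recovered_hash = pow(signature, E, N)
    return recovered_hash == digest_as_int(document)


document = b"Transfer 100 to account 42"
signature = sign(document)

print(verify(document, signature))                       # unaltered document
print(verify(b"Transfer 900 to account 42", signature))  # altered in transit
```

The second check fails because the altered document hashes to a different value than the one recovered from the signature.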
SSL (Secure Sockets Layer) and TLS (Transport Layer Security), the successor to SSL, are responsible for encrypted and secure communications on the internet. They provide:
- Identity Verification (Server is what it says it is)
- Integrity Validation (Messages are not altered between client and server)
- Privacy - Data is encrypted between the client and server.
Three Phases of SSL/TLS conversation:
- Negotiate a cipher suite: The client sends the server a list of the cipher suites it supports. The server selects one of them and responds to the client with that suite information.
- Authentication: Certificate authorities sign the server’s certificate. Your browser or operating system keeps a list of trusted CA certificates (which contain the CA public key). The browser will use the CA public key to check the signature of the server’s certificate. If the signature is valid, it will use the server certificate public key to encrypt data for the server. If the server can decrypt the information, we can be confident that we are communicating with the entity that controls that certificate.
- Key Exchange: Once we have established that we are communicating with the correct party and that the certificate can be trusted, the client will generate a pre-master key and encrypt it with the server’s public key. The server decrypts the pre-master key with its private key, and both sides convert it to a master secret. Keys derived from this secret are then used with symmetric algorithms for the rest of the conversation. (This is the classic RSA-based exchange; TLS 1.3 uses an ephemeral Diffie-Hellman exchange instead, but the outcome is the same: a shared symmetric key.)
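In practice a client rarely implements these phases by hand. As a rough sketch, Python's standard `ssl` module performs the negotiation, certificate check, and key exchange during `wrap_socket`. The function names `make_client_context` and `describe_connection` are hypothetical, and the minimum TLS version is my own choice for the example.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client settings: trust the system CA list and verify the server."""
    context = ssl.create_default_context()            # loads trusted CA certificates
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # assumption: refuse older protocols
    return context


def describe_connection(host: str, port: int = 443):
    """Run the handshake and report the negotiated protocol and cipher suite."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=10) as raw_sock:
        # wrap_socket performs the whole handshake: cipher suite negotiation,
        # certificate (identity) verification, and key exchange.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            return tls_sock.version(), tls_sock.cipher()


context = make_client_context()
print(context.verify_mode)      # the server certificate must validate
print(context.check_hostname)   # the certificate must match the host name
```

With network access, calling `describe_connection` with a real host name returns the protocol version and the cipher suite the server selected. If the certificate cannot be validated against a trusted CA, the handshake raises an `ssl.SSLCertVerificationError` instead.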
This topic doesn’t belong in an article about encryption - but I couldn’t find a better place to put it, so I include it here.
RPO stands for recovery point objective. RPO is a concept that states that when a failure occurs, we need to have a design that recovers the data up to X minutes before the failure. A simple way to think about it is to imagine a backup that runs every two hours. If we have a disaster where we need to recover from backup, the maximum data loss would be two hours. It might be less if we just finished a backup, but the worst-case scenario would be two hours of data loss. Things like backups and synchronous replication impact RPO.
RTO stands for recovery time objective. RTO states that when we have a failure, we must be back up and running, with the application in the business’s hands, within X hours or X minutes. For example, if a system fails at 9:00 pm and we have a 12-hour RTO, we must have the system back up and running (including testing and client validation) by 9:00 am the following day.
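The arithmetic behind both objectives is simple enough to sketch in a few lines. The function names and the calendar date below are hypothetical; only the two-hour backup interval, the 9:00 pm failure, and the 12-hour RTO come from the examples above.

```python
from datetime import datetime, timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """With periodic backups, the worst case is a failure just before the next one."""
    return backup_interval


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """A backup schedule satisfies the RPO if its worst-case loss fits within it."""
    return worst_case_data_loss(backup_interval) <= rpo


def recovery_deadline(failure_time: datetime, rto: timedelta) -> datetime:
    """The system must be back in the business's hands by this time."""
    return failure_time + rto


# Backups every two hours: up to two hours of data can be lost.
print(worst_case_data_loss(timedelta(hours=2)))
print(meets_rpo(timedelta(hours=2), rpo=timedelta(hours=1)))   # too infrequent

# Failure at 9:00 pm with a 12-hour RTO (the date itself is arbitrary).
failure = datetime(2024, 1, 1, 21, 0)
deadline = recovery_deadline(failure, timedelta(hours=12))
print(deadline)
```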