Welcome back to the 6th post in this blog series. In this series we’re growing our cybersecurity knowledge from the very basics, using the overthewire.org challenges as a guide. First, I’d like to thank everyone for their feedback on the last post! I’ll do my best to implement it and, as always, more feedback is welcome. Now let’s start!
We’re still working with the “data.txt” file and we’re told that the file contents are base64 encoded. Let’s take a look at the contents of the file first:
Well, it’s definitely not something we can understand, hence the “encoded” part of the hint. Fortunately, we know that it’s encoded in base64, and a look at the man page of base64 shows that decoding a message is as simple as adding -d:
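Here’s a minimal sketch of that round trip. The real file contents aren’t reproduced here, so it uses a made-up message (“hello world”) and a temporary file standing in for data.txt; both are assumptions for illustration only, and the -d flag is the one from the GNU base64 man page.

```shell
# Temporary file standing in for data.txt (hypothetical contents).
demo_file=$(mktemp)

# Encode a made-up message and store it, like the challenge file.
printf 'hello world\n' | base64 > "$demo_file"

encoded=$(cat "$demo_file")          # what we would see with cat
decoded=$(base64 -d "$demo_file")    # -d reverses the encoding

echo "encoded: $encoded"             # encoded: aGVsbG8gd29ybGQK
echo "decoded: $decoded"             # decoded: hello world

rm -f "$demo_file"
```

Notice that no key or password was needed to decode: knowing that the data is base64 is enough.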
Let’s try just running base64 without any options, for science!
The message got even more garbled. Let’s do one more:
There’s an interesting pattern developing here though: every encoding started with the same “V” and the “==” returned. In fact, the end of the triple encoded message (“Cg==”) is exactly the same as the end of the message that was encoded only once. This presents an interesting question: does encoding the same message always result in the same output?
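We can test that question directly. This is a small sketch with a made-up message (“hello”), not the challenge data: it encodes the same input twice, then encodes the result again and peels the layers back off.

```shell
# Encode the same made-up message twice and compare the results.
first=$(printf 'hello\n' | base64)
second=$(printf 'hello\n' | base64)
echo "$first"     # aGVsbG8K
echo "$second"    # aGVsbG8K

# Encode the encoded text again, as we did "for science".
twice=$(printf '%s\n' "$first" | base64)

# Decoding once removes exactly one layer of encoding.
once_back=$(printf '%s\n' "$twice" | base64 -d)
original=$(printf '%s\n' "$once_back" | base64 -d)
echo "$original"  # hello
```

The two encodings match because base64 uses no key and no randomness: the same bytes in always give the same characters out.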
This is a great time to introduce the idea of cryptography, ciphers, and hashing. Cryptography is the science of creating and breaking ciphers, or “secrets”. Hashing, on the other hand, is the processing of data of any size to produce a calculated result of fixed length. The most important difference between ciphers and hashes is that while ciphers are reversible, hashes are not.
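To see the “fixed length” part in action, here’s a quick sketch. It assumes the sha256sum command from GNU coreutils is installed, and SHA-256 is just one hash chosen for illustration; the inputs are made up.

```shell
# Hash a one-character input and a much longer one with SHA-256.
# sha256sum prints "<hash>  -", so cut keeps only the hash itself.
short_hash=$(printf 'a' | sha256sum | cut -d ' ' -f 1)
long_hash=$(printf 'a much longer message than the first one' | sha256sum | cut -d ' ' -f 1)

echo "$short_hash"
echo "$long_hash"

# Both results are 64 hexadecimal characters, whatever the input size.
echo "${#short_hash} ${#long_hash}"   # 64 64
```

Unlike base64, there is no option to turn either hash back into its input.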
A cipher is a calculated way of hiding “plain text” information so that it’s difficult to understand or “de-cipher”. We pass the plain text secret through a mathematical formula that produces our cipher text. For a cipher text to be effective, it must completely obscure the original plain text. Ciphers have been and are still being used to transport data in a way that only the intended recipient should be able to understand, at least theoretically. This is why a cipher must be reversible. The easiest method to understand, and a very common one, is called “symmetric” encryption.
Symmetric encryption is when both the sender and the recipient have agreed on a single shared key that both enciphers and deciphers the data, so that they can communicate secretly. On the flip side, asymmetric encryption is when different keys are used to encipher and decipher a message. We’ll tackle this in more detail later, but for now it’s worth noting that base64 is neither: it’s an encoding with no key at all, so anyone can reverse it. It can obscure data, but it can’t protect it.
This next challenge introduces a simple kind of symmetric encryption called the Caesar Cipher. The Caesar Cipher is one of the oldest methods of encrypting messages, but it offers very little protection: there are only 25 possible rotations, so every one of them can be tried by hand, let alone by a modern computer. Let’s take a look at the hint: the message in the “data.txt” file has been encrypted by rotating the alphabet by 13 positions. This is the essence of a Caesar Cipher: using a shifted alphabet to write a secret.
The Caesar Cipher is also known as “rot” followed by the number of positions, which would be “rot13” in this case. So if we wanted to write “Hello” and encrypt it with rot13 it would be “Uryyb”. What we do is take the first letter in our secret, “H”, and instead write the letter 13 positions to the right of it, “U”. We continue to do this for the rest of our message until we get our complete cipher. The knowledge needed to decipher the message, here the number of positions, is considered the “key”.
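The counting can be sketched in a few lines of shell. The helper names rotate_letter and rotate_word are made up for this example, and to keep it short the sketch assumes lowercase letters only. The modulo step is what sends us back to “a” when we count past “z”.

```shell
alphabet=abcdefghijklmnopqrstuvwxyz

# rotate_letter LETTER SHIFT: move one lowercase letter SHIFT positions
# to the right, wrapping from z back to a.
rotate_letter() {
  prefix=${alphabet%%"$1"*}            # everything before the letter
  index=${#prefix}                     # 0 for a, 25 for z
  new_index=$(( (index + $2) % 26 ))   # modulo 26 handles the wrap
  printf '%s' "$alphabet" | cut -c "$(( new_index + 1 ))"
}

# rotate_word WORD SHIFT: rotate each letter of WORD in turn.
rotate_word() {
  word=$1
  out=
  while [ -n "$word" ]; do
    rest=${word#?}                     # word without its first letter
    letter=${word%"$rest"}             # the first letter on its own
    out=$out$(rotate_letter "$letter" "$2")
    word=$rest
  done
  printf '%s\n' "$out"
}

rotate_word hello 13   # uryyb
rotate_word uryyb 13   # hello
```

Because 13 is exactly half of 26, rotating by 13 a second time lands back on the original message. Other rotations need the opposite shift to decode: a message rotated by 3 is restored by rotating it another 23.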
Let’s take a look at the “data.txt”:
One of the pitfalls of “rot” ciphers is that the spaces and the number of letters in each word remain the same as in the original text. So in this example we can guess that the first few words are very likely to be “The password is”. Knowing that the positions are rotated by 13, we can rewrite the message by simply taking it letter by letter and counting back 13 positions. We could also just copy this message and drop it into a website that will decode it, such as rot13.com.
We should learn how to do this in the terminal though, and luckily there’s a command for it: tr. tr is the “translate” command: it reads its input and replaces each character found in a first set with the character at the same position in a second set, which acts as our “key”. Since computers in general are case sensitive, the uppercase and lowercase alphabets are considered to be two distinct alphabets. One last thing to keep in mind with Caesar Ciphers is that the alphabet “wraps” around when it reaches the last letter on either end. So if we’re still counting and reach the letter “z”, we need to go back to the letter “a” and keep counting.
To use tr we need to specify the “original” set, then the key. In this case, our command would be cat data.txt | tr 'a-zA-Z' 'n-za-mN-ZA-M'. The sets are quoted and written without square brackets, because tr treats brackets as ordinary characters and the shell may try to expand unquoted ones as file name patterns. We’re essentially telling tr that for the sets “a to z” and “A to Z”, it should reorder the letters to start at “n” and continue until they wrap back around to “m”, for both uppercase and lowercase letters. Given that we can tell tr to use different “rot” numbers for uppercase and lowercase letters, we can further complicate our cipher by giving each a different rotation. This is what the file really says:
Sure enough, we guessed the first part correctly and we now have the password.
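Coming back to the idea of giving uppercase and lowercase letters different rotations, here’s a sketch of how that could look with tr. The function names and the choice of 5 for uppercase are made up for illustration, and the sample text is not from the challenge.

```shell
# Standard rot13: both cases rotated by 13 positions.
rot13() { tr 'A-Za-z' 'N-ZA-Mn-za-m'; }

# Hypothetical variant: uppercase rotated by 5, lowercase by 13.
mixed_encode() { tr 'A-Za-z' 'F-ZA-En-za-m'; }

# Decoding swaps the two sets so each letter maps back to its original.
mixed_decode() { tr 'F-ZA-En-za-m' 'A-Za-z'; }

printf 'Hello World\n' | rot13                        # Uryyb Jbeyq
printf 'Hello World\n' | mixed_encode                 # Mryyb Bbeyq
printf 'Hello World\n' | mixed_encode | mixed_decode  # Hello World
```

The spaces and word lengths still survive untouched, so the guessing trick from earlier works just as well against this variant.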
In the “old world”, before long distance communications were so common, it wasn’t difficult to agree on a key for deciphering messages; this is what’s now called a pre-shared key or PSK. Ciphers were commonly used for wartime communications, and as such the parties involved would be physically present and agree on a key before separating. In today’s environment, where long distance communications are commonplace and expected to be private, it’s not feasible to physically meet every time to decide on a key. This was resolved with the advent of public key cryptography, the foundation of Public Key Infrastructure or PKI: keys are now generated in a way that makes it possible to exchange information privately without ever having to agree on a PSK first.
PKI works by producing two distinct but mathematically related keys, both very long. One of the keys is designated the “private” key and the other the “public” key. The public key is known to everyone and is in some cases even broadcast, while the private key is kept secret. We’ll call the person who created this first key pair Bob. Another individual, Alice, also creates her own pair of public and private keys. She also broadcasts her public key and keeps her private one a secret. If Alice wants to talk to Bob, she creates a message and encrypts it using Bob’s public key (which is known to everyone). The “magic” of PKI is that this message, encrypted with Bob’s public key, is now only decipherable with Bob’s private key. Not even the original sender of the message, Alice, can decrypt the message she encrypted.
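To make the “magic” a little less magical, here’s a toy sketch with tiny numbers. It uses textbook RSA-style arithmetic, which is our own choice for illustration since no specific algorithm has been named yet. The values of n, e and d are a small classroom example, the modpow helper is made up, and the message is just the number 65. Real keys are hundreds of digits long, so this is in no way secure.

```shell
# Bob's toy key pair. n = 61 * 53 is shared by both keys.
n=3233
e=17     # Bob's public exponent, known to everyone
d=2753   # Bob's private exponent, kept secret

# modpow BASE EXP MOD: compute BASE^EXP mod MOD by square-and-multiply.
modpow() {
  base=$(( $1 % $3 ))
  exp=$2
  mod=$3
  result=1
  while [ "$exp" -gt 0 ]; do
    if [ $(( exp % 2 )) -eq 1 ]; then
      result=$(( result * base % mod ))
    fi
    base=$(( base * base % mod ))
    exp=$(( exp / 2 ))
  done
  echo "$result"
}

message=65

# Alice encrypts with Bob's public key (e, n).
ciphertext=$(modpow "$message" "$e" "$n")

# Bob decrypts with his private key (d, n).
decrypted=$(modpow "$ciphertext" "$d" "$n")

# Alice only has the public key; applying it again doesn't undo anything.
alice_attempt=$(modpow "$ciphertext" "$e" "$n")

echo "ciphertext: $ciphertext"   # ciphertext: 2790
echo "decrypted:  $decrypted"    # decrypted:  65
echo "alice gets: $alice_attempt"
```

With numbers this small anyone could recover d by factoring n into 61 and 53. The scheme only holds up because factoring a real-sized n is impractical.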
We’ve now covered two methods of encryption: PSK and PKI. These two methods are examples of symmetric and asymmetric encryption, respectively. PSK is symmetric because both parties use the same key to encrypt and decrypt the message. PKI is asymmetric because encryption and decryption are split between two distinct keys. These are the two principles behind most encryption in the world today, and the many different algorithms essentially rely on one or both of these “methods”.
In fact, we’ve essentially been using PSK to log into our bandit levels and we’ll eventually use PKI as well. Later we’ll get into more detail about the pros and cons of both encryption methods, the different uses, and even cracking them.