"The urge to discover secrets is deeply ingrained in human nature; even the least curious mind is roused by the promise of sharing knowledge withheld from others. Some are fortunate enough to find a job which consists of the solution of mysteries, but most of us are driven to sublimate this urge by the solving of artificial puzzles designed for our entertainment. Detective stories or crossword puzzles cater for the majority; the solution of secret codes may be the pursuit of the few."
John Chadwick from The Decipherment of Linear B
As long as humans have been alive, they have had secrets and they have developed ways to hide those secrets. Beginning with stenography, the art of hiding messages, and then moving to cryptography, the art of writing in codes, this field is continually evolving. In this webpage we will go through a brief history of cryptography, beginning with Caesar and ending with quantum computing.
One of the first ciphers used was the Caesar Cipher. Julius Caesar aparently sent all of his messages with an alphabetacal shift of 3. This meant that A became D, B became E and so on. This cipher wasn't very secure because there are only 25 different shifts, so one only needs to try 25 different codes to find the right one.
To make up for the lack of security of the Caesar Cipher people started using any letter to replace any other letter in the alphabet. This meant that A could be any of the 26 letters of the alphabet, B could be any of the 25 remaining letters, C had 24 options and so on. In all there are 26! different ways that one rearrange the alphabet which meant the brute force method of trying every option was out of the question. The substitution cipher was secure for the time being.
Monks interested in finding the secrets of the Koran began keeping track of how many times each letter was used. This information was not used for breaking codes until it was realized that whichever letter or symbol was used most frequently in the cipher text would likely be the letter that was used most often in whatever language the plaintext was written in. By analyzing texts in different languages, frequencies tables of letters were created. Once they guessed a few of the common letters, codebreakers were able to find entire words and then break the code. The security of the substitution cipher was lost.
Frequency analysis is not perfect. The shorter the text, the less likely the letters are to show up in the expected frequencies. In fact, two books have been written without using the letter E. The first one was in 1939 written by Ernest Vincent Wright. It then inspired a French author named Georges Perec to write a novel without using E as well. Arguably more impressive, Perec's book has been translated into English, German, Italian, Dutch, Swedish, Spanish, Romanian and other languages, also avoiding the letter e. Frequency analysis can also be made more difficult. People began using nulls, or symbols that didn't represent any letter. Codemakers also created multiple symbols to represent common letters in order to lower their frequency and make their true value less obvious.
One of the most famous cases of a cipher was the case of Mary Queen of Scots. Mary was imprisoned by her cousin Elizabeth because of her threat to the throne. Elizabeth was Protestant and Mary was Catholic. Mary's Catholic supporters wanted a Catholic on the throne to ease their persecution. One of these supporters, Anthony Babington, had been planning an assasination of Queen Elizabeth, but they wanted Mary's blessing. He contacted her with an enciphered message explaining the plot. The enciphered message had nulls and characters representing different letters. He believed that the cipher was unbreakable. Unfortunately, unbeknonst to both Mary and Anthony their letters were being intercepted and decrypted. Once they had enough evidence they were able to bring them in and convict them. Mary and the rest of the conspirators were sentanced to death.
The next big step in cryptography came with the development of polyalphabetic ciphers. Everthing used up to this point was a monoalphabetic cipher meaning each letter could only be represented by one or more symbols or letters and those symbols and letters could only map back to one letter. In 1404 an Italian man named Leon Battista Alberti thought of the concept of encrypting a message by alternating between two ciphers. This gave the advantage that a single letter in cipher text could represent two different plaintext letters. This idea was advanced and developed further by several men until Blaise de Vigenère created the Vigenère cipher.
The Vigenère cipher is powerful because it has 26 possible alphabets. By establishing keyword, sender and receiver can agree on how to switch between alphabets. The chart shown above is universal for anyone using the Vigenère cipher. Let's work through an example to explain how to use this cipher.
Suppose we wanted to send the word "Cryptography" and we had previously decided on the key word "dogs." We start by writing DOGS repeatedly over our plaintext.
D O G S D O G S D O G S
C R Y P T O G R A P H Y
The letter that is above the plaintext is the cipher alphabet we are going to use to encipher that letter. The alphabets that we are going to use in our example are shown below. The plain text is highlighted red and the cipher alphabets are highlighted in yellow.
To encipher the letter C we will use the alphabet in the D column. In that column we go down to the row with C in the plaintext alphabet. Where the D column and the C row intersect is F, so C is encyphered as F. The next enciphered letter is F because that is where O and R intersect. Here we can see the main advantage of the Vigenère cipher because two different letters can be enciphered as the same letter. In the same manner, the same letter can be enciphered as different letters. This makes frequency analysis an ineffective strategy. Our final encrypted message would be FFEHWCMJDDNQ. Decrypting the message is the same process. You start by writing the word "dogs" repeatedly over the cipher text.
D O G S D O G S D O G S
F F E H W C M J D D N Q
Then one by one you use the alphabets indicated to decrypt the message, this time finding the letter in the appropriate column and writing down which row it is in.
The Vigenère cipher was thought to be unbreakable for many years, then Charles Babbage, the man who made plans for the first computer, cracked the code just to prove a point to a friend. The weakness of the Vigenère cipher was its cyclical nature. He found that if he could find a repeated group of letters in a text it was likely because the same word lined up with the same letters from the keyword, thus having them be enciphered the same way. If one counted the number of letters in between repeated segments you could guess the length of the key word or phrase. For example, if there were 30 letters in between the key phrase could only be the length of factors of 30, meaning 1, 2, 3, 5, 6, 15 or 30. 1 can be tossed out because if the keyphrase was only one letter long you would have a monoalphabetic cipher that can be cracked with frequency analysis. You then search the document for more repeated segments and count the number of letters in between. If the next one you found was 12 apart, you could then say that the key phrase was 2 or 6 letters long because those are the only shared factors between 30 and 12. At that point you could start guessing what the possible key phrase is by guessing common repeated words. A common word in any correspondence is the word "the." If reversing the cipher with "the" creates meaningful text for a key, you are likely on to something and can start to decrypt the rest of the message by guessing the rest of the keyword.
Some keywords were not intelligible words or phrases and were instead random letters. If this was the case, then it would be difficult to guess what the keyword would be. In order to crack this form of a Vigenère cipher you need to decide on the length of the keyword using the method described above. You can then split the message up into the different alphabets and run a frequency analysis on each individual alphabet. Even though you are pulling from a smaller number letters the frequency of the letters should follow the same rough pattern.
The longer the keyword, the more difficult the Vigenère cipher is to break. An unbreakable form of cryptography is found by using a one time pad. If the length of the keyword is exactly equal to the length of the message then every possible combination of letters is a possible outcome. This means that using brute force you can certainly come up with the right decryption, but you will also come up with many messages that make sense, but are not the correct decryption. If NAIFHEPWMASOFBT was the cipher text, both "meet me at the park" and "I need a new target" could be the possible decryption because they have the same number of letters. The keywords for both of the are different, but without knowledge of the keyword we don't know what the actual message is supposed to say.
Although Charles Babbage broke the Vigenère cipher, he had to keep it secret because it was still the main cipher used by militaries throughout the world. The British government wanted other governments to believe that their communication was still secure. This was the case for many cipher breakers throughout history. Because of the nature of what they are doing, successes are often kept secret until they are no longer needed.
One of the most famous types of enciphering that came after the era of the Vigenère cipher was the Enigma machine. Since there are many movies, books and websites dedicated to the Enigma and those that broke the code, I won't spend much time here talking about it. One of my favorite sources is here because it tells the story of the Polish codebreakers that initially broke the Enigma cipher.
As computing moved forward, people saw a need for privacy for the common person, not just for militaries and governments. One of the main problems that needed to be solved was that of key distribution. If I wanted to send my friend a message I would first have to meet with them or send a trusted person to exchange a key with them. Only then could I communicate securely. If I wanted to talk with several people that were in very different places, this would become difficult and costly. Governments, militaries and big businesses could afford this, but the commmon man couldn't. Many people tried to solve the problem of key exchange. One of them was named Whitfield Diffie. He got connected with two other people named Martin Hellman and Ralph Merkle who were alo interested in solving the problem. They spent a lot of time trying to figure out how to solve the problem until one night Ralph Merkle had a mathematical revelation. He shared it with Diffie and Hellman and together the men implemented the system.
Another major advancement in cryptography came when Ron Rivest, Adi Shamir and Leonard Adleman came up with an implementation for public key cryptography, which eliminated the need for key exchange. Below is a video explaining how RSA cryptography works.
Both the Diffie-Hellman key exchange and RSA rely on what are called one way functions or trap door functions. These are easy to do, but hard to undo. One example of a trapdoor function is one that is taken advantage of in the RSA system, multiplying primes. How long does it take you to find the factors of 629? What process did you use to find them? A computer has a few algorithms that optimize the time it takes to find factors, but it still is essentially trying one prime number after another until a remainder of zero is found. In cryptography numbers larger than 17 and 39 are used. The standard for encryption right now is 2048 bit encryption, which is numbers that are 617 digits long. If you were trying to factor that with a standard desktop computer it would take 6.4 quadrillion years.
People aren't trying to decrypt things with a single, standard desktop computer however. Advancements in quantum computing are cutting down on the time it takes to decrypt messages using brute force. This has lead to some major advancements in cryptography. One of these advancements is in a method of storing data called hashing. Hashing is a way of storing data using an algorithm. When the data is stored it gets a unique key that you can use to access the data stored. If the hashtable only goes one way, meaning you can store the data, but you can't access it, it becomes a useful way of storing passwords. For example, when someone creates a new account with a new password, the password gets hashed and is connected with the person who created the account. Then in the future when someone tries to log into the account the password entered is hashed again, then compared to the previously hashed password. If they match, the person is allowed into the account. The cool thing about hashing is that not even the developers have access to what the actual password is, only the hashed one. This function is truly one way because once a password is hashed, it can't be unhashed., Someone trying to break in wouldn't be able to guess what the password is based on the hashed version, and the hashed version won't get them into the account.