Cryptography: Cracking the Code Part I

“HIQGPX OLQ KLBFO VYIR CRQN IOWVAKLBFK CII T YVGRX TFRQPEGUI VF XAGWG FJ NK AJF HH” – AWCRG TKMOFZ


How could you even start cracking this code (or writing one of your own)? To find out, let’s start with the basics.

Codes and code breaking have been around for centuries. Indeed, the Romans used codes to hide communication from their enemies. One of these, named after Julius Caesar, is the Caesar Cipher, a relatively simple cipher.



The Caesar Cipher

In this cipher, we first encode each letter as a number (say A-Z as 0-25, leaving spaces and punctuation alone). Then we shift the values. For example, we could add 5 to each letter, so that an A would become an F, a B a G, and so on, wrapping around at the end of the alphabet so that a Z becomes an E.
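The shift described above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `caesar_shift` is made up for this post, and it assumes that anything other than the letters A-Z is passed through unchanged.

```python
import string

ALPHABET = string.ascii_uppercase  # A-Z stand for the values 0-25

def caesar_shift(text, offset):
    """Shift every letter by `offset` places, wrapping around after Z."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + offset) % 26])
        else:
            result.append(ch)  # spaces and punctuation pass through unchanged
    return "".join(result)

print(caesar_shift("ABC XYZ", 5))   # FGH CDE
print(caesar_shift("FGH CDE", -5))  # ABC XYZ
```

Decoding is just encoding with the negative offset, which is why both sides only need to share one number.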

Provided both the sender and the recipient know the offset, they can easily encode and decode messages. The problem is that this cipher is really quite simple to break.

Here is a message encoded using the Caesar Cipher:

“PA DHZ AOL ILZA VM APTLZ PA DHZ AOL DVYZA VM APTLZ…” – JOHYSLZ KPJRLUZ

See hint 1 below if you are having trouble.



The Substitution Cipher

However, the fact that the Caesar Cipher is easily breakable leads us to use a more complicated cipher. The Substitution Cipher is the cipher most school kids probably devise when sending notes. In this cipher, we permute the letters of the alphabet, so that each letter is swapped for another. For example, our scheme could be as follows: 

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Q W E R T Y U I O P A S D F G H J K L Z X C V B N M
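As a rough sketch, the scheme in the table above can be written with Python's built-in `str.maketrans` and `str.translate`. The names `ENCODE`, `DECODE` and `substitute` are invented for this example, and the sketch assumes that characters outside A-Z are left untouched.

```python
import string

PLAIN = string.ascii_uppercase            # top row of the table
SCRAMBLED = "QWERTYUIOPASDFGHJKLZXCVBNM"  # bottom row of the table

ENCODE = str.maketrans(PLAIN, SCRAMBLED)
DECODE = str.maketrans(SCRAMBLED, PLAIN)  # the same table read bottom to top

def substitute(text, table):
    """Swap each letter according to `table`; other characters are kept."""
    return text.upper().translate(table)

secret = substitute("HELLO WORLD", ENCODE)
print(secret)                      # ITSSG VGKSR
print(substitute(secret, DECODE))  # HELLO WORLD
```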

In this cipher, there are 26! (roughly 4 × 10^26) possible schemes, far too many to try one by one as we could for the Caesar Cipher. However, there are some techniques that can help us and make this cipher easy to solve anyway.

See if you can determine how and decipher the following message (see hint 2 if stuck):

“EI QIPF IMFCFQA OC EVF DFWOMMOMW IA T QOAFQIMW JIZTMHF” – ICHTJ BOQRF



The Vigenere Cipher

The tricks to break the Substitution Cipher make it unsafe for any longish message. Thus, we need a more complicated cipher. This leads to the use of the Vigenere Cipher. 

In this cipher, we encode the letters as we did for the Caesar Cipher: as numbers between 0 and 25. Then, we use a secret to obfuscate the message. We choose a short secret word, say CIPHER, repeat it until it is as long as the message, and add it to the message letter by letter (wrapping around after 25) to create our cipher text.

For example:

   THE TREASURE IS LOCATED AT : the plain text

+ CIP HERCIPHE RC IPHERCI PH  : the secret

= VPT AVVCAJYI ZU TDJEKGL PA  : the cipher text
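The letter-by-letter addition can be sketched as follows. This is one possible reading of the scheme rather than a definitive implementation: the function name `vigenere` and its `decode` flag are made up here, and the sketch assumes that spaces and punctuation are copied across without using up a letter of the secret, which is what reproduces the cipher text above.

```python
from itertools import cycle

def vigenere(text, secret, decode=False):
    """Add (or, when decoding, subtract) the repeated secret, letter by letter."""
    key = cycle(secret.upper())   # the secret, repeated for as long as needed
    sign = -1 if decode else 1
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            shift = ord(next(key)) - ord("A")  # next secret letter as 0-25
            out.append(chr((ord(ch) - ord("A") + sign * shift) % 26 + ord("A")))
        else:
            out.append(ch)  # assumption: non-letters do not consume the secret
    return "".join(out)

cipher_text = vigenere("THE TREASURE IS LOCATED AT", "CIPHER")
print(cipher_text)                                  # VPT AVVCAJYI ZU TDJEKGL PA
print(vigenere(cipher_text, "CIPHER", decode=True)) # THE TREASURE IS LOCATED AT
```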

See if you can decode:

“YM PYI RNT XU XYG OJAXVT ADTI FH CH HVV NWDRMEI II ALV UBPYW” – FUKPY AZNLT



Hint 1: There are only 26 possible shifts, so we can try each one and see which gives readable text. As true Computer Scientists, we could write a program that takes the code as input and outputs all 26 possibilities, from which the true message is easy to spot.
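Such a program might look like the sketch below. The function name `all_shifts` is invented for this post, and the sample input is a separate toy message rather than the puzzle above, so the puzzle is not spoiled.

```python
def all_shifts(ciphertext):
    """Return every (offset, candidate plain text) pair for a Caesar cipher."""
    candidates = []
    for offset in range(26):
        decoded = "".join(
            # undo a shift of `offset` for letters, keep everything else
            chr((ord(ch) - ord("A") - offset) % 26 + ord("A"))
            if "A" <= ch <= "Z" else ch
            for ch in ciphertext.upper()
        )
        candidates.append((offset, decoded))
    return candidates

# Print all 26 candidates; for this toy input, the line for offset 3
# reads HELLO WORLD.
for offset, decoded in all_shifts("KHOOR ZRUOG"):
    print(f"{offset:2d}: {decoded}")
```

Only one of the 26 lines will look like English, and its offset is the key.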

Hint 2: The letters of the English alphabet are not equally common: E is far more common than, say, Z. By finding the letters that occur most frequently in the code text, we can make a reasonable guess as to which letters they stand for. From there, we can make deductions to determine the rest of the original text.

The nine most common letters in English, in order, are: E T A O I N S H R.
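Counting letters by hand gets tedious, so here is a small sketch that does it with Python's `collections.Counter`. The function name `letter_frequencies` is made up for this example, and the input is a toy word; paste in the code text you want to analyse instead.

```python
from collections import Counter

def letter_frequencies(ciphertext):
    """Count how often each letter A-Z occurs, most frequent first."""
    letters = [ch for ch in ciphertext.upper() if "A" <= ch <= "Z"]
    return Counter(letters).most_common()

print(letter_frequencies("BANANA"))  # [('A', 3), ('N', 2), ('B', 1)]
```

The most frequent letters in a long code text are good candidates for E, T, A and so on, although short messages will not follow the English ordering exactly.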
