The Vigenere Cipher

  • The Vigenere Cipher
    
    The Vigenere Cipher is a mutation on the basic Caeser Cipher. To recap, the Caeser Cipher involves the shifting of the alphabet a specific number of letters left or right to allow a phrase to be re-written using the new alphabet. It is called the "Caeser" Cipher because it was he who used it during the war so that his enemies would not be able to decipher his messages.
    
    An example of the Caeser Cipher:
    
    Old Message:  This is a caeser example.
    Alphabet:     abcdefghijklmnopqrstuvwxyz
    shift(3):     defghijklmnopqrstuvwxyzabc
    New Message:  Wklv lv d fdhvhu hadpsoh
    
    To decipher this, frequency analysis is used.
    
    *Note: I have written a previous text explaining this method of encryption in full detail. Read it.
    
    Now, onto the Vigenere Cipher after that brief introduction. The Vigenere Cipher was developed by Giovan Batista Belaso in 1467, but was only recognized after Blaise de Vigenere mentioned it at the Court of Hentry III in 1586 (something like the calculus story - look it up). It is a ployalphabetic cipher meaning many alphabets are used. 
    
    The Vigenere table is as follows:
    
        	    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    
        	A   A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
        	B   B C D E F G H I J K L M N O P Q R S T U V W X Y Z A 
        	C   C D E F G H I J K L M N O P Q R S T U V W X Y Z A B
        	D   D E F G H I J K L M N O P Q R S T U V W X Y Z A B C 
        	E   E F G H I J K L M N O P Q R S T U V W X Y Z A B C D 
        	F   F G H I J K L M N O P Q R S T U V W X Y Z A B C D E 
        	G   G H I J K L M N O P Q R S T U V W X Y Z A B C D E F 
        	H   H I J K L M N O P Q R S T U V W X Y Z A B C D E F G 
        	I   I J K L M N O P Q R S T U V W X Y Z A B C D E F G H 
        	J   J K L M N O P Q R S T U V W X Y Z A B C D E F G H I 
        	K   K L M N O P Q R S T U V W X Y Z A B C D E F G H I J 
        	L   L M N O P Q R S T U V W X Y Z A B C D E F G H I J K 
        	M   M N O P Q R S T U V W X Y Z A B C D E F G H I J K L 
        	N   N O P Q R S T U V W X Y Z A B C D E F G H I J K L M 
        	O   O P Q R S T U V W X Y Z A B C D E F G H I J K L M N 
        	P   P Q R S T U V W X Y Z A B C D E F G H I J K L M N O 
        	Q   Q R S T U V W X Y Z A B C D E F G H I J K L M N O P 
        	R   R S T U V W X Y Z A B C D E F G H I J K L M N O P Q 
        	S   S T U V W X Y Z A B C D E F G H I J K L M N O P Q R  
        	T   T U V W X Y Z A B C D E F G H I J K L M N O P Q R S 
        	U   U V W X Y Z A B C D E F G H I J K L M N O P Q R S T 
        	V   V W X Y Z A B C D E F G H I J K L M N O P Q R S T U
        	W   W X Y Z A B C D E F G H I J K L M N O P Q R S T U V 
        	X   X Y Z A B C D E F G H I J K L M N O P Q R S T U V W 
        	Y   Y Z A B C D E F G H I J K L M N O P Q R S T U V W X 
        	Z   Z A B C D E F G H I J K L M N O P Q R S T U V W X Y 
    
    This table can be viewed as 26 different Caeser Shifts incrementing by one each time. Instead of choosing just one column to encrypt your message with, you choose a keyword that chooses the letter combination for you. Each letter of the keyword is linked to the top row of independant letters. The vertical row of letters on the left relates to the letter in the old message that is busy being encrypted. Where those two vertical and horizontal lines meet, the new letter is selected. Lets see this is practice...
    
    Eg. (keyword is : "paper")
    Position:    0 123456 789 012345 6789 012 345 678 90123456 78 90 123456 7890
    Keyword:     p aperpa per paperp aper pap erp ape rpaperpa pe rp aperpa perp
    Old Message: I praise the monkey head and ask its guidance on my hacker life
    New Message: x pgezhe ilv bocovn hteu pns ejz iiw xjiseere dr dn hpgbtr amwt
    
    *note: The format of my message has kept the format the same, i.e if the word is 2 letters long, the encrypted text is 2 letters long. What is quite common to see is a message spaced evenly with a series of 4 or 5 letters. For example, my message re-organized would appear as:
    
    Encrypted Message: xpge zhei lvbo covn hteu pnse jzii wxji seer edrd nhpg btra mwt
    
    The Kasiski Method of Decryption
    
    Wow you might say, thats a complicated cipher and should be unbreakable. And you would be right, until about the 19th century. The man who managed to crack this cipher was Major Friedrich Wilhelm Kasiski. He was a Prussian infantry officer, cryptographer and archeologist. He published the first book on breaking ciphers, but it wasnt so well received so Kasiski focused more on archeology. Only after his death was is work truly recongnized (like so many others we all know).
    Kasiski noticed patterns (bigrams) in a Vigenere encrypted cipher. From these he was able to deduct the length of the keyword then using frequency analysis, he could decrypt the message. Sounds easier than it is...Lets give it ago with my example.
    
    First, to find matching bigrams (two letters):
    
    Bigram	Location	Distance	Factors
    pg	43		39		1,2,3,6
    nh	41		26		1,2,3
    
    Hmmm...this is very interesting. It appears my message does not have any repeating bigrams which produce the correct prime number representing the length of the keyword. Being a relatively short message, the chance of not finding the correct keyword is much much higher. Using the method above, I will demonstrate an example in which the repeating bigrams do equal the keyword length.
    
    The New Message Example:
    
    The keyword used to encode this message is "relations" and the message being encoded is the is the famous line "To be or not to be, that is the question". As can be seen below, the location on each letter of the message has been written down to help with calculating the bigram distances. Below is the new example using conventional spacing of five characters:
    
    Location	01234 56789 01234 56789 01234 56789
    Keyword:	relat ionsr elati onsre latio nsrel
    Plaintext: 	tobeo rnott obeth atist heque stion
    Ciphertext:	ksmeh zbblk smemp ogajx sejcs flzsy
    
    
    Repeated Bigram	Location	Distance	Factors
    ks		9		9		3, 9
    sm		10		9		3, 9
    me		11		9		3, 9
    
    As can be seen from this example, the distance between the keywords is 9. Now, it wont always be the same number, but it will always be a number that has a prime number the same as the length of the keywords. The keyword is 9 characters long, and as you can see, 9 is a common factor of the bigram distances. Normally, the distances will give you a variety of
    
    After the length has been deduced, it is understandable that letters at every nth position is from the same letter in the keyword, and the longer the message, the more letters made with the same character from the keyword. With that information, frequency analysis can be used to work out the most probably letter and with patience, the original message will be deciphered... Thankfully today, you do not need to spend hours deciphering a Vigenere Cipher. Computer programs have been written that do it all for you within seconds. One such application can be experimented on this website: http://sharkysoft.com/misc/vigenere/
    
    The Frequency Analysis Table
    
    A  B  C  D   E   F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
    73 9  30 44 130  28 16 35 74 2	3  35 25 78 74 27 3  77	63 93 27 13 16 5  19 1
    
    This table represents the number of times a letter appears in a common text of 1000 letters. Using this, we can determine which letters that occur the most are.
    
    The Kerckhoff Method of Decryption
    
    Although the Kasiski method is the easiest and most well know method used for an encrypted text using the Vigenere Cipher, another method was developed by a French cryptographer called Kerckhoff. Kerckhoff's method involves oganizing the text into columns and then performing frequency analysis on those columns with the aim of determining the keyword before the text is deciphered, unlike the Kasiski method which deciphers the text first. The Kerckhoff's method is much more time consuming, keeping in mind that there are 26 different possible Caeser Shift that can be performed on each letter, and therefore not regulary used, however, thank you to technology, it has become usable.
    
    Conclusion
    
    And there you have it, the Vigenere Cipher deciphered :) I hope this text has been easy to follow and understand and that it will motivate you in your continuing study of ciphers and how they work. Look out for my next article coming soon about the Gronsfeld Cipher, a variation on the Vigenere Cipher.
    
    Peace. Invas10n
    [Ask the Experienced, Not the Learned]