Lab 20: Cracking the Vigenère Cipher

To refresh your memory of the Vigenère Cipher, go back to Lab 8: The Vigenère Cipher and read about it. You will need your key_length function from that lab today. Your ultimate goal is to determine the plaintext message for the following ciphertext:

CZVEILTUPRIMCGKBTVSVXVJRTCIBTSQQJHBVHVPKQRJSEUMHMGFBPXWXDOAMECWQTUCZXGIBBVLXGIHBFSDGESBTXLBESFJIWTFDRGTTWKBGQXWXDCBMLTKUWFVIGPCGEGESBPUOWXXMJOQPSEBEHHTIHHTQBPZTKUOGKSCLKBVVECWYVNVMHMJSHUIDYCPBQOIAQITJXPEKQRYMIAQIGRMRMWFRUSGVQBIGVHTVWBPWHHUVRYEHVQBFKHTKKBTKRWXTCJPQXGFOFYIAECGFJIRHWZQHSGMJSUQXSTAANFIWXTTRGPKXTMFNITIAOAFWINRWQYLTMJSEVLTINSNUYGXQTZCOXGIOQCMHREVNKRLHWZQDILHTHUVLTMTCHDPTHHURVXXGIICCRSIKQXKRVMJSQCMHBGGJJICLWRQGRARCKUKXTKCPOKXLBVVCKRZXASFTECVNCFGFNAGFGJIGXYOFPSIAKBTUSKXTMEGQPKMOONIXGVVNVRDKFWQCPXVGHUKRZBVGBXIGROIPJSJMQTGJILTAHBJIPKVVRTEQUKHFCCIHKHFGPUHJRRCVDAFSNTMHACZYDIATVSJJICLJSGJSJZJHVVSKXTOSVIGPCFQUMIHEQHTVTWVCUGVIACHFJIDNIVGVSWTXSJQRSXTSQCXIAKGOWXPMVVRVMBXKHNNPHXGARFUJBVSACXJKCZOWXLAGBGJIGTDPVVERMWOYNCIHQYNYEIVJCHVSUBVGJCMHMECNVTDVMSGCRSEQCXGHPMKHNPHIAGBUWVGBGRBPEABESFVEGMGRGQLTKHSRVJDKKHSNEHAGRNEVDLUVRTQXGFHUCXHAGVNFRTOGFOGJDKGGRGRPKCPOKXLBVVRKXWXTOJCMHMECNVTDVMSGQVPPCHPJXDMCYRQYIHHWGCRSUWFAKRVPKHUEYGBQGVVCHAGFNPERKQGFVLTYKSYFEUMGFVVECWHCEVYCTVSYAAPLLIFVMCMKARVSHXGWGRSEWQKACPPKISECFQBVVBNIJGFSEVLTAGRTGMCTPCGJIGFQARPXSHYBJGRITNWPGEUMGFVVRTOGFBPGTVQBFKHTKKBTJSLBPHUGADKNRFJILTUHBIIIHWHNIEXGVVRTEQUKHUQPTPGBGUXGTKUUVSCEKYRCXJGPSYHSGLQARYENTPRGJICWKDCGHHNFRRPPNWQKAUSHNFRRPPNMJOGCPXVGVNFRDMCABOICMVCGJMCDCPBWXHMQDCKRVAGFFGPUUGTBTIHAGTBWRSAGFFGPUYCZYKRVWQKACZTKARRGTLXNZRKXWXTHUGATENKNUZTKARRGTDKUVRHIAEXSEAWAHYZLHSGLJSUCHEEGBGASUMKARCWHAGKRPXSHYBGQPDHMOOQYIAGFNPHIHYCAFIGPJOGYEHZQWAIXDACDCGRCXZHSKVHMUVRVVXXFHBNSDDFCJPECWOOXGSJMYVNVWWXYOFESBBPUGQFJMKHJCWIHQRNTOIHUSRCRNMJWAIXWXPGUGPDHMSQCXIAGGVFIHHHHUGATENOAFRDMKQRFXWTVHUGCLXTSSKPAXFKVVLRNRPBCVSLCBQDSDDUVRNZTLJSEGECWVVRTIHAGGNYQPIUOAFTXVVIEGWWNPUHRSCIGUFULTMQCXFSLGCXNTJGHOCAGSUMJSFJIAOGGNUWWXROFUISBVKNUPPUGZYGHDKCBTGQPKOOYCHTUWHGQLTKIFRCXSBUOCRSXGVARPXXMYOFGQEMAGUGHXWPCGNMZXVCQTSEMJSWCVUHTTRCVDYMWYNMCZUCZGFDWAGBOECTISQVSENVWGKRIHQBRQJIAGQHRFDTTRFCWHAGTRNPETUHVVATENHUQYVAVOYKGTMQVRTWTEHOSVIGLWQUCJPENOFVLXLKGUCPAMJWAMRDMJWAISUMWAONMCZFCJPWITKFFJSLUTOIGXWXAZYCPAMJWAMQTTVVBOILAAWJQYAWPHFCCPGAHUKRVTDCHVMIXXSAKJXYGZYQJUMJSGQTDYVVRJSJLGKUKGWPCGIGVNEKYRNCIKWSQQACWQKAFSLGYCHNHIAGTNNPCXXSEESBXVCNPICWKKBPHTKJCJOECROWYGWXOGTNNPTGDMGJMHMKARULTLCWQCPDNFWZWWIUGURVXXGIGBOILAGFRPIPKVVREICMTSBHXWXGOEVLAXVARUITMJOGYSJEFPRHSJKVVBWWPGFAVNIHWQKAKXWBPYSQVNHWGRGEABESUCHAXCFAVWTOGFNNXWBPUFQJIAKGFQVIBPVRTPTLUCAUMCMJSFELDHNFBQQPGFHUQYVAVVVUAPLPCGCZTKAUBQHDIRCEVYCBVMSQVHAQKVPKDYHVRTOCHYZRFKTTUHUGVTPCGAQSCXVCYKWIXPHBJIGLVWYNMIPCGTQSSITOPVMRXVCFCCXMQJRTCTLVVNVWPUQIGVLTKKUUVHXLVOAEIQNVHUGRXPQBQGVLACHYCXXMWRRQVAHPUVVYSXKJRISIMQOYKGTACRAQMSXCKUCXATVWGWHTPCGBTPDGIWGWHTXKHUGVQNVHUQYVAVHUGCLXTSAKGTZTOAFADKFGGQWPRRFRUICMNMFJIQXIOACKPBPWJQRSXTWSKWWTNZSCPAKKUUVXWKQITJXWXGOEVLWHYTHPRNBVZYUITFVCPQQTHWHNOSCZVVRRIDINSGJEIPCZXYMIAVVRKVWXCRFFSLGYOEFXWXCBGKTPMJWRUMIAKBXULTPCGECXWXTUYCHIAGFRYEHGQCAGPXLVSAKRVMJWFVMBXCGVVHXWPHFQYCWCHNNPIAGFVILIPQFQDYIBUVNNPWTXSGQEHDVVROAWTVHUGRPFGCSVLTVQIAVVNBUMBWOCHYDYGEHXOONOMHMJWFPILSGOYCRSHTOHUXGTNWNCRSLJSGTMTWVCPWVILGMNUWWXUDBMIUTPQLEYGMUSLKRVTUMBWVTYCZYKRVMJFBWKWMJSNKVSHACHVLXGMMBWGDNNRZCRPZGWGCRSPJOGCRXZPCECRIEKHGNIVBTZFJIAEVVVPOBXHCECWZBPUAQMIENBRXIGWQHBCWZIGFUCTHBUVNNPHXGWGYVXMVSAWTHHOSJJIGXFCJPHDPPRBYRIAGFRYEHGQHUKRVXNGRVSSHUCNNMRXUCBPFTZCBGCPZBPUNIEXGFWACLAEOWFUQTOGFLOYRAVCAKKWMKGUQYAWVVVPOSBPOUYEHMJSPCXXAQDRVLTRNZEGQTFDSEJIGLCIPGVDYOWYMEIMGOGKQTWKBNJQNWGOEKAXLJMBWATKGRBYRWXTSJKXWFGHUGVTTTSAQQXVGWAVLTTKFVOEUKCWQDYIRQIZKKWMEOGELPUCHNPHIACHFXIGRNWXGEBHWGRASJDPCJDYIWQQNVWTTVPNVWXPQBQGVPGFVRTIPEKQRDIVTPHBIIIKCHUGVHEGSCAECWYSAVSCLCMVPKIHJSEUIAYKBNFVTTOMFQVIHHKNAHDVCHFGEIUCHFFSRTVGRCXQTVGNPHHHOSGKQTLFCOCXHXCHPCXHYQFLQYHXGOFULTVQIYFRITPGJGVTBVVRTUJXUHVQRXMFWQPXBNEVZCXIXTKUKGWPCMFJIENVWGULTYGZGVLPMUVRYEHWQNVPKDYHOAFLPWLIFVFTZWBGQHGXCAGJEILJSJCWLTNYVPKWTPRVPLPGFKVVLSBPOUCRSLCMVPKIHJSEXIGRGOEPIHMNMAQASBPOUVIAEOSGJIIKWHUFMSRQIRXIGXCHNDEIPJSAUYSWGBYAXWNODGJYBIFCJPWWXEOZGYEHPOUGEEHHGGKGZLCBQFVNEGOIGWPGFHUGJPENKNUSKXTOYKGTPCGAQXPUKHUWVITPRFJIYNODRFYEHPHBJIGYGSGKRPFQARPXHAGZBQOTWWDOWXXMYOFCPAWCFXQZTKJSNFFTYQFRJIGPCGNPSIAGFYQRVICGFCKTTPRGJILAKHRTEQUKHJCWHMKZYKRHBIVGJYGKAWAIHDPPWGVLTKGKNURDMCABOICMVCOGPDLVOJCCLXPHNNMRXNWXGXWXYWAFECWYOFLYHMKBGKQTMQVRCVXMUOLCWXMVIEPISTECEPIGHJALGEGLCBQYLXLMSEULDPNOGGMILISGVMCZUVRYEHVNCFGFTAKBQKXLAGBFJIINTBRFXWXECEPIGUWHGJIGTDPVVAPLPCYQRVXTHBDIHXGBFJIUHWBQJIGLGZSKRPEQBTNSLACZYYLXVJKNUPXMWDOAEGHYCSNEBIUVNPKXGITEQQIAGFBQJIAGFRYIGXFCBTWPENFBWRSMJSUCPAUWHGJINPGFRCPAEQQXGHPGFKUGRPEKQRJESUGSACPAMJSJCCSHYBBPIHBFSNPHJIVVRQXWXTHEAMCZGJRTCSHQFFJILTNYRFWPWNMQQACMJSZKHSEGKBPHTKKBTJSLLJSJCWTOGFGQKTMQIGCKPBPGHFHTGNMFJIRTOSHRSCTNWGVPTMJFRGPTZISQVEQEGOYNQPWGCSUSABFUYCWHMJSEGAPLPCGJMCZQBVVIMVGDGCXXGAUBNHTGMSLCRSTNWPGWUBTGGVLDNIVGYEHMJOGKXBBIVGDIAHPUGQSCXQTGJISHQFFQJIAGVNNPQNVOYCWTBVVRTXWXNCPMWLXTSGQSATTURQVIAGYRAAPLVCBUQPENPHVEITPMECXTBVKBWPSGQHBRICTPMBHXWXOVBYIKXTCAVLTLGQBPHIBOSEQYCWUVREEBXWDBPEAHYQHTXPBPGUGLPWPCGPSIBESQDIUHTSNPHQXJWAFMIPCGNNMIMNSQQSGTDCHVJXYVSRPMCVJSFJMVAUVRVVXXFHUGPXMVZRISAWGBXGCXGVVRNSRDCBQVSWXTUEGEIWGZVILIBVTVVXTWCZVEIDIGBRFXWXFCBTECWHCHPHIACHVVPTWKBGQEHFCZYREHLCURPSIFWQUNEGZGFGJECTTOGJSAXUVRMRTEVRBYRPGFZBQOTWCZBPKIAGDNUWPZGWAVSIAGZBXIABGGGIEGWGBLQYTOGFFCAWHYGUGPDGISQVSVXVCHVSUMJOGFEGDJOYNECWYOAFIGTDCHVEBHPUGJSHXDSQUSUUTWTJXUEQKRTWPGFHUQWTVQCYHSJGVOVPWQNVGUGGDNNRAQXTOGBTGXWXTVRCHIATCHILIAGRBQVLTAOAFIKXPWSOCWXCRJQYAWICGJVDNIVGJSJZJHCQSGTNWPGMIPQIYFFTHHJRTCABVHYGYHXYWGJSJMOMFJSJEFSEUSWAQKVYMHAKQBWPSLJIGWTABMSNVIAXUQBRIXMJWAMMRHWZQKJXHPZLMRTPJCJVSQXIWAHSGRQIFGIHHOOAASJMQTGJILTAHUKRVLJOQJEEIGBRFPPMGZLVLPMCZVEIWTFPRIYCMQHUKRZMJOGXIGRHSJVLXGIGVPHTXFKRTIGXCZYAMBIQGFKFAXVVRTIHXGARFXDUGBBWWTBPKNKXXGIPLVLTEKHGNISHQFFQWWXYSAVFPVMHBVLTMCPYGLPEHVBRMCZUVROMVAVTVPHPGQHUGVZXACAKXDKCHNPCGTVSNDSDDQTEWPTLHCEULJMVWAITTHRZRWTABMSGGPTLECCGWIAKGGKQTLJSSQYCWCZVVXAXDCGVPTHPWGYLXVJQRTXPBPZLYEHGQHUGVTUGTBTIHTKRNNMRXCBQTSJGFHUGRTVMCSVLTUQHGNILTUOCCTTKNOOGPLBVVGJILHTRFFVXGMARDIPNVWSWPARRFVPXTWQBVVMCECFTGPTMVSEUMIPCGNNPKXTMJGPAMQGNAHGBPYZGFJMVVRYMHXNWGVPTTNWPGAPLPCGISXGIHBFSIACHVPEWNTFLPSXENZBQOUBTGGULTLCWQCRSLGSJJIIAGFVVWBTTYRFTDBUCAQVCHVTBTWWXJOQTIPWUSIGVPEPWPGPXMVZRJMHMQFVGWPUQIGELXEFFRPAWHJOQISIUWFAVECWGOGGRJIDMJKPSUGOFVWPGFCGJIGNPDYGEHTPHGJMCZUOYNFTVCIFGXWXAKBWPSGQHEGQTFDSEVLTLKACNIGNNSFVLTBTTEKICWUVNFXPNIVGVLTFUIPJEHMJOGCVTWJCGRSZXTKVNPQNTBLQYXYACHJSAWKHGQSAHPUNPHIACHVHCDNEIGASJKHWAIIGOGFLFITINMJKXWTMBVHIXMWGHCPARDZRGHHTPRFJIWTFBRXIGYQFTQXIXPHUCXXYACHFVXGMAHELUKQANDSIMNSZCVZXFDBKWDGKHVUEAFQGGEIGMCWAVSSBUOTTITPKHUASJLQCAGVDKNOGGVWHYSIGVIAKGOQXIEGKNURDMOOEMISIQWFQRHHCZVEIKXPHHTISMQHNUXTBVOAFJXGFWAIMIOGFLPMRXKHUCHXGHOPVEHHTHBHQXQGRSNEKHWFBHGWXTFLVEGMEIFVEGWRWAGEEINSEQEHMVIEMINMQTSGIPGFVBVFJMVSEGHIHCGGULTOGFLUSDGHWAKWWXFWGQJU

The main fact we will use to break the cipher is that in most English texts the frequencies of letters are not equal (see the table below). For example, the letters RSTLNE, which you may recognize from watching Wheel of Fortune, are among the most frequent.

Frequencies of letters in English

a

b

c

d

e

f

g

h

i

.082

.015

.028

.043

.127

.022

.020

.061

.070

j

k

l

m

n

o

p

q

r

.002

.008

.040

.024

.067

.075

.019

.001

.060

s

t

u

v

w

x

y

z

.063

.091

.028

.010

.023

.001

.020

.001

The following is an outline for how to decrypt a ciphertext encoded with the Vigenère Cipher with an unknown key. (Read until the end of the lab before you start working on any individual part.)

  1. Compute the key length K. (We already did this.)

  2. For each j from 0 to K-1: build a string S[j] comprising the letters of the ciphertext at indices congruent to j modulo K. For example, if ciphertext='ABCDEFGHIJK' and the key length is K=3, then

    S[0] = 'ADGJ'
    S[1] = 'BEHK'
    S[2] = 'CFI'
    

    For each j, every letter of S[j] has been encrypted with the same letter of the key.

  3. For each j from 0 to K-1:

    1. For each m from 0 to 25, decrypt the j-th string using the m-th letter of the alphabet.

    2. Measure the frequency of each letter in each of the strings in (a), and record these frequencies in a table (list of 26 lists) called freq_list.

    3. Find the value of m for which the dot product freq_list[m]·F is maximized, where F is the vector of letter frequencies

      F=[.082, .015, .028, .043, .127, .022, .020, .061, .070, .002, .008, .040, .024, .067, .075, .019, .001, .060, .063, .091, .028, .010, .023, .001, .020, .001]

      from the table at the top of the lab. Then the j-th letter of the key is the m-th letter of the alphabet.

  4. Decrypt the ciphertext using the key you found in part 5.

Task 1: Split the ciphertext

Write a function str_split(s,j,k) that takes in a string s and two integers j,k, where k >= 1, and returns a string consisting of each letter of s whose index is congruent to j modulo k.

>>> str_split('ABCDEFGHIJK',1,3)
'BEHK'

Task 2: Measure letter frequencies

Write a function letter_freq(s) that takes as input a string s and outputs a vector of length 26 whose i-th element is the frequency of the i-th letter of the alphabet in the string s. The built-in string function count may be helpful here.

>>> s = 'AAAAAAABZZ'
>>> s.count('A')
7
>>> letter_freq('AAAAAAABZZ')
[0.7,0.1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.2]

Task 3: Maximize the dot product

Write a function maximize_dot(string) that takes as input a string string and outputs the integer m for which the dot product letter_freq(string-decrypted-by-mth-letter)·F is maximized. Here is an outline for this function.

def maximize_dot(string):
   F = [.082, .015, .028, .043, .127, .022, .020, .061, .070, .002, .008, .040, .024, .067, .075, .019, .001, .060, .063, .091, .028, .010, .023, .001, .020, .001]
   alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

   freq_list = #list of 26 lists, where the m-th list is letter_freq(string after it has been decrypted using the m-th letter of the alphabet)
               #(hint: use a list comprehension)

   for m in range(26):
      #compute the dot product of freq_list[m] with F
      #keep the largest dot product you compute

   return #the m for which the dot product was largest
>>> maximize_dot('KYVHLZTBSIFNEWFOALDGJFMVIKYVCRQPUFX')
17

Task 4: Decrypt the ciphertext

Write a function vigenere_crack(message) that takes in a string message and outputs a list of two strings: the most likely key and the most likely plaintext.

As a test input, use the ciphertext at the top of this page. It will be very clear if you have computed the correct key and plaintext.

You can get other test input strings at https://mathdept.byu.edu/~doud/Vigenere/