Lab 20: Cracking the Vigenère Cipher¶
To refresh your memory of the Vigenère Cipher, go back to Lab 8: The Vigenère Cipher and read about it.
You will need your key_length
function from that lab today.
Your ultimate goal is to determine the plaintext message for the following ciphertext:
CZVEILTUPRIMCGKBTVSVXVJRTCIBTSQQJHBVHVPKQRJSEUMHMGFBPXWXDOAMECWQTUCZXGIBBVLXGIHBFSDGESBTXLBESFJIWTFDRGTTWKBGQXWXDCBMLTKUWFVIGPCGEGESBPUOWXXMJOQPSEBEHHTIHHTQBPZTKUOGKSCLKBVVECWYVNVMHMJSHUIDYCPBQOIAQITJXPEKQRYMIAQIGRMRMWFRUSGVQBIGVHTVWBPWHHUVRYEHVQBFKHTKKBTKRWXTCJPQXGFOFYIAECGFJIRHWZQHSGMJSUQXSTAANFIWXTTRGPKXTMFNITIAOAFWINRWQYLTMJSEVLTINSNUYGXQTZCOXGIOQCMHREVNKRLHWZQDILHTHUVLTMTCHDPTHHURVXXGIICCRSIKQXKRVMJSQCMHBGGJJICLWRQGRARCKUKXTKCPOKXLBVVCKRZXASFTECVNCFGFNAGFGJIGXYOFPSIAKBTUSKXTMEGQPKMOONIXGVVNVRDKFWQCPXVGHUKRZBVGBXIGROIPJSJMQTGJILTAHBJIPKVVRTEQUKHFCCIHKHFGPUHJRRCVDAFSNTMHACZYDIATVSJJICLJSGJSJZJHVVSKXTOSVIGPCFQUMIHEQHTVTWVCUGVIACHFJIDNIVGVSWTXSJQRSXTSQCXIAKGOWXPMVVRVMBXKHNNPHXGARFUJBVSACXJKCZOWXLAGBGJIGTDPVVERMWOYNCIHQYNYEIVJCHVSUBVGJCMHMECNVTDVMSGCRSEQCXGHPMKHNPHIAGBUWVGBGRBPEABESFVEGMGRGQLTKHSRVJDKKHSNEHAGRNEVDLUVRTQXGFHUCXHAGVNFRTOGFOGJDKGGRGRPKCPOKXLBVVRKXWXTOJCMHMECNVTDVMSGQVPPCHPJXDMCYRQYIHHWGCRSUWFAKRVPKHUEYGBQGVVCHAGFNPERKQGFVLTYKSYFEUMGFVVECWHCEVYCTVSYAAPLLIFVMCMKARVSHXGWGRSEWQKACPPKISECFQBVVBNIJGFSEVLTAGRTGMCTPCGJIGFQARPXSHYBJGRITNWPGEUMGFVVRTOGFBPGTVQBFKHTKKBTJSLBPHUGADKNRFJILTUHBIIIHWHNIEXGVVRTEQUKHUQPTPGBGUXGTKUUVSCEKYRCXJGPSYHSGLQARYENTPRGJICWKDCGHHNFRRPPNWQKAUSHNFRRPPNMJOGCPXVGVNFRDMCABOICMVCGJMCDCPBWXHMQDCKRVAGFFGPUUGTBTIHAGTBWRSAGFFGPUYCZYKRVWQKACZTKARRGTLXNZRKXWXTHUGATENKNUZTKARRGTDKUVRHIAEXSEAWAHYZLHSGLJSUCHEEGBGASUMKARCWHAGKRPXSHYBGQPDHMOOQYIAGFNPHIHYCAFIGPJOGYEHZQWAIXDACDCGRCXZHSKVHMUVRVVXXFHBNSDDFCJPECWOOXGSJMYVNVWWXYOFESBBPUGQFJMKHJCWIHQRNTOIHUSRCRNMJWAIXWXPGUGPDHMSQCXIAGGVFIHHHHUGATENOAFRDMKQRFXWTVHUGCLXTSSKPAXFKVVLRNRPBCVSLCBQDSDDUVRNZTLJSEGECWVVRTIHAGGNYQPIUOAFTXVVIEGWWNPUHRSCIGUFULTMQCXFSLGCXNTJGHOCAGSUMJSFJIAOGGNUWWXROFUISBVKNUPPUGZYGHDKCBTGQPKOOYCHTUWHGQLTKIFRCXSBUOCRSXGVARPXXMYOFGQEMAGUGHXWPCGNMZXVCQTSEMJSWCVUHTTRCVDYMWYNMCZUCZGFDWAGBOECTISQVSENVWGKRIHQBRQJIAGQHRFDTTRFCWHAGTRNPETUHVVATENHUQYVAVOYKGTMQVRTWTEHOSVIGLWQUCJPENOFVLXLKGUCPAMJWAMRDMJWAISUMWAONMCZFCJPWITKFFJSLUTOIGXWXAZYCPAMJWAMQTTVVBOILAAWJQYAWPHFCCPGAHUKRVTDCHVMIXXSAKJXYGZYQJUMJSGQTDYVVRJSJLGKUKGWPCGIGVNEKYRNCIKWSQQACWQKAFSLGYCHNHIAGTNNPCXXSEESBXVCNPICWKKBPHTKJCJOECROWYGWXOGTNNPTGDMGJMHMKARULTLCWQCPDNFWZWWIUGURVXXGIGBOILAGFRPIPKVVREICMTSBHXWXGOEVLAXVARUITMJOGYSJEFPRHSJKVVBWWPGFAVNIHWQKAKXWBPYSQVNHWGRGEABESUCHAXCFAVWTOGFNNXWBPUFQJIAKGFQVIBPVRTPTLUCAUMCMJSFELDHNFBQQPGFHUQYVAVVVUAPLPCGCZTKAUBQHDIRCEVYCBVMSQVHAQKVPKDYHVRTOCHYZRFKTTUHUGVTPCGAQSCXVCYKWIXPHBJIGLVWYNMIPCGTQSSITOPVMRXVCFCCXMQJRTCTLVVNVWPUQIGVLTKKUUVHXLVOAEIQNVHUGRXPQBQGVLACHYCXXMWRRQVAHPUVVYSXKJRISIMQOYKGTACRAQMSXCKUCXATVWGWHTPCGBTPDGIWGWHTXKHUGVQNVHUQYVAVHUGCLXTSAKGTZTOAFADKFGGQWPRRFRUICMNMFJIQXIOACKPBPWJQRSXTWSKWWTNZSCPAKKUUVXWKQITJXWXGOEVLWHYTHPRNBVZYUITFVCPQQTHWHNOSCZVVRRIDINSGJEIPCZXYMIAVVRKVWXCRFFSLGYOEFXWXCBGKTPMJWRUMIAKBXULTPCGECXWXTUYCHIAGFRYEHGQCAGPXLVSAKRVMJWFVMBXCGVVHXWPHFQYCWCHNNPIAGFVILIPQFQDYIBUVNNPWTXSGQEHDVVROAWTVHUGRPFGCSVLTVQIAVVNBUMBWOCHYDYGEHXOONOMHMJWFPILSGOYCRSHTOHUXGTNWNCRSLJSGTMTWVCPWVILGMNUWWXUDBMIUTPQLEYGMUSLKRVTUMBWVTYCZYKRVMJFBWKWMJSNKVSHACHVLXGMMBWGDNNRZCRPZGWGCRSPJOGCRXZPCECRIEKHGNIVBTZFJIAEVVVPOBXHCECWZBPUAQMIENBRXIGWQHBCWZIGFUCTHBUVNNPHXGWGYVXMVSAWTHHOSJJIGXFCJPHDPPRBYRIAGFRYEHGQHUKRVXNGRVSSHUCNNMRXUCBPFTZCBGCPZBPUNIEXGFWACLAEOWFUQTOGFLOYRAVCAKKWMKGUQYAWVVVPOSBPOUYEHMJSPCXXAQDRVLTRNZEGQTFDSEJIGLCIPGVDYOWYMEIMGOGKQTWKBNJQNWGOEKAXLJMBWATKGRBYRWXTSJKXWFGHUGVTTTSAQQXVGWAVLTTKFVOEUKCWQDYIRQIZKKWMEOGELPUCHNPHIACHFXIGRNWXGEBHWGRASJDPCJDYIWQQNVWTTVPNVWXPQBQGVPGFVRTIPEKQRDIVTPHBIIIKCHUGVHEGSCAECWYSAVSCLCMVPKIHJSEUIAYKBNFVTTOMFQVIHHKNAHDVCHFGEIUCHFFSRTVGRCXQTVGNPHHHOSGKQTLFCOCXHXCHPCXHYQFLQYHXGOFULTVQIYFRITPGJGVTBVVRTUJXUHVQRXMFWQPXBNEVZCXIXTKUKGWPCMFJIENVWGULTYGZGVLPMUVRYEHWQNVPKDYHOAFLPWLIFVFTZWBGQHGXCAGJEILJSJCWLTNYVPKWTPRVPLPGFKVVLSBPOUCRSLCMVPKIHJSEXIGRGOEPIHMNMAQASBPOUVIAEOSGJIIKWHUFMSRQIRXIGXCHNDEIPJSAUYSWGBYAXWNODGJYBIFCJPWWXEOZGYEHPOUGEEHHGGKGZLCBQFVNEGOIGWPGFHUGJPENKNUSKXTOYKGTPCGAQXPUKHUWVITPRFJIYNODRFYEHPHBJIGYGSGKRPFQARPXHAGZBQOTWWDOWXXMYOFCPAWCFXQZTKJSNFFTYQFRJIGPCGNPSIAGFYQRVICGFCKTTPRGJILAKHRTEQUKHJCWHMKZYKRHBIVGJYGKAWAIHDPPWGVLTKGKNURDMCABOICMVCOGPDLVOJCCLXPHNNMRXNWXGXWXYWAFECWYOFLYHMKBGKQTMQVRCVXMUOLCWXMVIEPISTECEPIGHJALGEGLCBQYLXLMSEULDPNOGGMILISGVMCZUVRYEHVNCFGFTAKBQKXLAGBFJIINTBRFXWXECEPIGUWHGJIGTDPVVAPLPCYQRVXTHBDIHXGBFJIUHWBQJIGLGZSKRPEQBTNSLACZYYLXVJKNUPXMWDOAEGHYCSNEBIUVNPKXGITEQQIAGFBQJIAGFRYIGXFCBTWPENFBWRSMJSUCPAUWHGJINPGFRCPAEQQXGHPGFKUGRPEKQRJESUGSACPAMJSJCCSHYBBPIHBFSNPHJIVVRQXWXTHEAMCZGJRTCSHQFFJILTNYRFWPWNMQQACMJSZKHSEGKBPHTKKBTJSLLJSJCWTOGFGQKTMQIGCKPBPGHFHTGNMFJIRTOSHRSCTNWGVPTMJFRGPTZISQVEQEGOYNQPWGCSUSABFUYCWHMJSEGAPLPCGJMCZQBVVIMVGDGCXXGAUBNHTGMSLCRSTNWPGWUBTGGVLDNIVGYEHMJOGKXBBIVGDIAHPUGQSCXQTGJISHQFFQJIAGVNNPQNVOYCWTBVVRTXWXNCPMWLXTSGQSATTURQVIAGYRAAPLVCBUQPENPHVEITPMECXTBVKBWPSGQHBRICTPMBHXWXOVBYIKXTCAVLTLGQBPHIBOSEQYCWUVREEBXWDBPEAHYQHTXPBPGUGLPWPCGPSIBESQDIUHTSNPHQXJWAFMIPCGNNMIMNSQQSGTDCHVJXYVSRPMCVJSFJMVAUVRVVXXFHUGPXMVZRISAWGBXGCXGVVRNSRDCBQVSWXTUEGEIWGZVILIBVTVVXTWCZVEIDIGBRFXWXFCBTECWHCHPHIACHVVPTWKBGQEHFCZYREHLCURPSIFWQUNEGZGFGJECTTOGJSAXUVRMRTEVRBYRPGFZBQOTWCZBPKIAGDNUWPZGWAVSIAGZBXIABGGGIEGWGBLQYTOGFFCAWHYGUGPDGISQVSVXVCHVSUMJOGFEGDJOYNECWYOAFIGTDCHVEBHPUGJSHXDSQUSUUTWTJXUEQKRTWPGFHUQWTVQCYHSJGVOVPWQNVGUGGDNNRAQXTOGBTGXWXTVRCHIATCHILIAGRBQVLTAOAFIKXPWSOCWXCRJQYAWICGJVDNIVGJSJZJHCQSGTNWPGMIPQIYFFTHHJRTCABVHYGYHXYWGJSJMOMFJSJEFSEUSWAQKVYMHAKQBWPSLJIGWTABMSNVIAXUQBRIXMJWAMMRHWZQKJXHPZLMRTPJCJVSQXIWAHSGRQIFGIHHOOAASJMQTGJILTAHUKRVLJOQJEEIGBRFPPMGZLVLPMCZVEIWTFPRIYCMQHUKRZMJOGXIGRHSJVLXGIGVPHTXFKRTIGXCZYAMBIQGFKFAXVVRTIHXGARFXDUGBBWWTBPKNKXXGIPLVLTEKHGNISHQFFQWWXYSAVFPVMHBVLTMCPYGLPEHVBRMCZUVROMVAVTVPHPGQHUGVZXACAKXDKCHNPCGTVSNDSDDQTEWPTLHCEULJMVWAITTHRZRWTABMSGGPTLECCGWIAKGGKQTLJSSQYCWCZVVXAXDCGVPTHPWGYLXVJQRTXPBPZLYEHGQHUGVTUGTBTIHTKRNNMRXCBQTSJGFHUGRTVMCSVLTUQHGNILTUOCCTTKNOOGPLBVVGJILHTRFFVXGMARDIPNVWSWPARRFVPXTWQBVVMCECFTGPTMVSEUMIPCGNNPKXTMJGPAMQGNAHGBPYZGFJMVVRYMHXNWGVPTTNWPGAPLPCGISXGIHBFSIACHVPEWNTFLPSXENZBQOUBTGGULTLCWQCRSLGSJJIIAGFVVWBTTYRFTDBUCAQVCHVTBTWWXJOQTIPWUSIGVPEPWPGPXMVZRJMHMQFVGWPUQIGELXEFFRPAWHJOQISIUWFAVECWGOGGRJIDMJKPSUGOFVWPGFCGJIGNPDYGEHTPHGJMCZUOYNFTVCIFGXWXAKBWPSGQHEGQTFDSEVLTLKACNIGNNSFVLTBTTEKICWUVNFXPNIVGVLTFUIPJEHMJOGCVTWJCGRSZXTKVNPQNTBLQYXYACHJSAWKHGQSAHPUNPHIACHVHCDNEIGASJKHWAIIGOGFLFITINMJKXWTMBVHIXMWGHCPARDZRGHHTPRFJIWTFBRXIGYQFTQXIXPHUCXXYACHFVXGMAHELUKQANDSIMNSZCVZXFDBKWDGKHVUEAFQGGEIGMCWAVSSBUOTTITPKHUASJLQCAGVDKNOGGVWHYSIGVIAKGOQXIEGKNURDMOOEMISIQWFQRHHCZVEIKXPHHTISMQHNUXTBVOAFJXGFWAIMIOGFLPMRXKHUCHXGHOPVEHHTHBHQXQGRSNEKHWFBHGWXTFLVEGMEIFVEGWRWAGEEINSEQEHMVIEMINMQTSGIPGFVBVFJMVSEGHIHCGGULTOGFLUSDGHWAKWWXFWGQJU
The main fact we will use to break the cipher is that in most English texts the frequencies of letters are not equal (see the table below).
For example, the letters RSTLNE
, which you may recognize from watching Wheel of Fortune, are among the most frequent.
a |
b |
c |
d |
e |
f |
g |
h |
i |
---|---|---|---|---|---|---|---|---|
.082 |
.015 |
.028 |
.043 |
.127 |
.022 |
.020 |
.061 |
.070 |
j |
k |
l |
m |
n |
o |
p |
q |
r |
---|---|---|---|---|---|---|---|---|
.002 |
.008 |
.040 |
.024 |
.067 |
.075 |
.019 |
.001 |
.060 |
s |
t |
u |
v |
w |
x |
y |
z |
---|---|---|---|---|---|---|---|
.063 |
.091 |
.028 |
.010 |
.023 |
.001 |
.020 |
.001 |
The following is an outline for how to decrypt a ciphertext encoded with the Vigenère Cipher with an unknown key. (Read until the end of the lab before you start working on any individual part.)
Compute the key length K. (We already did this.)
For each
j
from0
toK-1
: build a stringS[j]
comprising the letters of the ciphertext at indices congruent toj
moduloK
. For example, ifciphertext='ABCDEFGHIJK'
and the key length isK=3
, thenS[0] = 'ADGJ' S[1] = 'BEHK' S[2] = 'CFI'
For each
j
, every letter ofS[j]
has been encrypted with the same letter of the key.For each
j
from0
toK-1
:For each
m
from0
to25
, decrypt thej
-th string using them
-th letter of the alphabet.Measure the frequency of each letter in each of the strings in (a), and record these frequencies in a table (list of 26 lists) called
freq_list
.Find the value of
m
for which the dot productfreq_list[m]·F
is maximized, whereF
is the vector of letter frequenciesF=[.082, .015, .028, .043, .127, .022, .020, .061, .070, .002, .008, .040, .024, .067, .075, .019, .001, .060, .063, .091, .028, .010, .023, .001, .020, .001]
from the table at the top of the lab. Then the
j
-th letter of the key is them
-th letter of the alphabet.
Decrypt the ciphertext using the key you found in part 5.
Task 1: Split the ciphertext¶
Write a function str_split(s,j,k)
that takes in a string s
and two integers j,k
, where k >= 1
, and returns a string consisting of each letter of s
whose index is congruent to j
modulo k
.
>>> str_split('ABCDEFGHIJK',1,3)
'BEHK'
Task 2: Measure letter frequencies¶
Write a function letter_freq(s)
that takes as input a string s
and outputs a vector of length 26 whose i
-th element is the frequency of the i
-th letter of the alphabet in the string s
.
The built-in string function count
may be helpful here.
>>> s = 'AAAAAAABZZ'
>>> s.count('A')
7
>>> letter_freq('AAAAAAABZZ')
[0.7,0.1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.2]
Task 3: Maximize the dot product¶
Write a function maximize_dot(string)
that takes as input a string string
and outputs the integer m
for which the dot product letter_freq(string-decrypted-by-mth-letter)·F
is maximized.
Here is an outline for this function.
def maximize_dot(string):
F = [.082, .015, .028, .043, .127, .022, .020, .061, .070, .002, .008, .040, .024, .067, .075, .019, .001, .060, .063, .091, .028, .010, .023, .001, .020, .001]
alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
freq_list = #list of 26 lists, where the m-th list is letter_freq(string after it has been decrypted using the m-th letter of the alphabet)
#(hint: use a list comprehension)
for m in range(26):
#compute the dot product of freq_list[m] with F
#keep the largest dot product you compute
return #the m for which the dot product was largest
>>> maximize_dot('KYVHLZTBSIFNEWFOALDGJFMVIKYVCRQPUFX')
17
Task 4: Decrypt the ciphertext¶
Write a function vigenere_crack(message)
that takes in a string message
and outputs a list of two strings: the most likely key and the most likely plaintext.
As a test input, use the ciphertext at the top of this page. It will be very clear if you have computed the correct key and plaintext.
You can get other test input strings at https://mathdept.byu.edu/~doud/Vigenere/