|
|
by
James J. Gillogly and Larry Harnisch
This paper originally appeared in Cryptologia, October 1996; Volume XX, Number 4.
ABSTRACT: A Cambridge don and psychic researcher left two encrypted messages to test whether the dead can communicate with the living. The authors describe successful cryptanalysis of one of these messages, a double Playfair cipher with overlapped pairs. In summary: Computerized medium channels dead spirit to recover cryptographic keys. KEYWORDS: Cryptanalysis, Playfair, Life After Death.In the late 1940s Cambridge don Robert H. Thouless (1894-1984) proposed a way to test his hypothesis that the dead can communicate with the living [3]. He encrypted two messages with (he believed) indecipherable ciphers, and intended to reveal the cipher keys to any live person he could reach after his death. Before that melancholy event transpired, he encouraged mediums and clairvoyants to attempt to extract the keys from his brain, to demonstrate that the transfer of knowledge actually took place after his death. These trusted psychics would guard against the possibility that less ethical ones would pick their brains and save the results until their death, then take advantage of this illicitly gained knowledge.
The first encrypted message, which we will call cipher A, he gave as:
CBFTM HGRIO TSTAU FSBDN WGNIS BRVEF BQTAB
QRPEF BKSDG MNRPS RFBSU TTDMF EMA BIM
He said cipher A uses one of the well-known methods of encipherment with a key-word which I hope to be able to remember in the after life... The passage I have given is neither enciphered by a simple process of substitution or transposition, nor, I think, long enough for it to be possible to use the methods adapted for breaking the less simple systems.However, he did recognize the possibility that it could be broken, and offered in the Appendix a more rigorous test (cipher B):
INXPH CJKGM JIRPR FBCVY WYWES NOECN SCVHE GYRJQ
TEBJM TGXAT TWPNH CNYBC FNXPF LFXRV QWQL
The system as he described it is a Vigenére cipher with an incoherent running
key derived from each word of a continuous text from some published work.
Specifically, the numbers in a word are numbered, from A = 1 to Z = 26, added
together, reduced modulo 26, and converted back into letters. For example
the word "NOT" becomes 14 + 15 + 20 = 49 -----> 23 = W. This key letter
is used to encrypt the first letter of the plaintext using the standard Vigenére
algorithm: plaintext = ciphertext - key. Only the first occurrence of each word
is used to produce a key letter, Thouless thinking that repeated words might
give the cryptanalyst some leverage. The present authors are inclined to believe
that solving this cipher will require finding the right text; we have tried several
hundred books with no success, including all of Shakespeare's works and the
King James Bible.Thouless invited cipher experts to attempt to break these ciphers, and within a few weeks one had solved cipher A, discovering it was a Playfair cipher with keyword SURPRISE. Playfair uses digraphic substitution in a 5 by 5 square alphabetic array omitting one letter, Thouless omitted J.
S U R P I
E A B C D
F G H K L
M N O Q T
V W X Y Z
The two letters in each digraph are replaced by letters at the opposite corners of the
rectangle they form. Letters in the same row or column are replaced by the letters to the
right or below respectively. If the two letters are the same they are separated by an X before
encryption.For example, challenge cryptogram A starts CB=ba, FT=lm, MH=of and so on.
This solution of a 66-letter Playfair with no crib was not unprecedented: a decade earlier Private Alf Monge had cracked a 30-letter Playfair, using elaborate deductions built on assumptions about the key structure [2].
On the advice of his anonymous cryptanalyst, Thouless replaced the test of the cracked Playfair A with a more difficult 58-letter C:
BTYRR OOFLH KCDXK FWPCZ KTADR GFHKA
HTYXO ALZUP PYPVF AYMMF SDLR UVUB
He enciphered the plaintext with two different keys. After the first Playfair
encryption he chose one letter "at random" to place at the beginning and end
of the cryptogram, then encrypted the intermediate ciphertext using the second
key. This resulted in breaking up and overlapping the digraphs from the first
encryption. Since Playfair cannot encrypt a digraph consisting of a repeated
letter he specified a method for breaking up such a pair if one should come
about after the first encryption. This did not happen, since the method he chose
would have been obvious in the final ciphertext. Playfair ciphers without a crib can be challenging, and the overlapped double Playfair obscures the interesting statistical properties that make longer Playfairs tractable. However, Thouless' key selection habits and certain peculiarities of the double encryption suggest an attack. Each of Thouless' Playfair examples used a single English word for the key written horizontally into the key frame. If he used the keys the same way for this one, a dictionary attack becomes possible. Thouless anticipated and discounted this attack, since there are billions of possibilities -- enough to keep an army of hopeful psychics and mediums busy for their whole lives.
Mounting a dictionary attack requires selecting a suitable dictionary: if it is too small it may not include the right word; if too large, the search may take too long. A medium-sized English dictionary may have 60,000 words, and an unabridged dictionary 300,000 or more. In the smaller case the number of two-word key pairs is about 4 billion possibilities: tractable but not pleasant to try exhaustively. If either word is not in this smaller dictionary, the cost of trying the unabridged might be more than one hundred billion tests -- even less appealing. A straightforward Playfair keysearch program in C on an 80486 40 MHz laptop running Linux can try about 3,000 words per second and score the results, suggesting that a complete two-word search on the smaller dictionary would take over two weeks of compute time; the unabridged, if necessary, would take nearly a year.
The search was reduced significantly by taking advantage of two implications of Thouless' rules for the second encryption step.
First, he wrote that he placed the same letter at the beginning and end of the intermediate ciphertext before breaking it up into digraphs for the second encipherment. This means that if we decrypt the final ciphertext with the correct second keyword, it will have the same letter on the front and back; if those letters are different, then that keyword is wrong and we do not need to continue the search for the corresponding first keyword. For example, decrypting cipher C with the keyword AARDVARK gives intermediate ciphertext:
GOWVAPSBINRGXQVGPIGWZNKRKCBMVKNOXWHUNXWOSWSRBVSFFVQVIDYAOA
which begins with G and ends with A -- not acceptable. Each of the 63,747 words
in the "medium" dictionary was put into a key square and used to decrypt cipher
C, resulting in 6,900 possible second keywords that satisfied this constraint.The second observation is that no pair in the intermediate ciphertext may contain a repeated letter -- since Playfair rules require two different letters in each plaintext pair, each resulting ciphertext pair must also have two different letters For example, decrypting cipher C with keyword ABDUCT gives intermediate ciphertext:
AERMQSQTNFNBXQLEPKZSIECB MUEGIBGHXWIVNXBRRWOWTDRGLGQCMOAYDA
The first and last letters match, but when we remove those letters and break
it up into digraphs, we see:
ER MQ SQ TN FN BX QL EP KZ SI EC BM UE GI
BG HX WI VN XB RR WO WT DR GL GQ CM QA YD
Since digraph RR has a repeated letter, this intermediate ciphertext could not
have resulted from an initial Playfair encipherment. Each of the 6,900 possible
second keywords was placed in a key square and used to decrypt Playfair C,
resulting in 1,385 possible second keywords that passed both of the tests. Since
testing each second keyword against all possible first keywords takes about 22
seconds on the laptop, it would take about 8.5 hours to run through all 63,747
of them -- good enough. These 1,385 potential Playfairs were saved in a file for
the next step. Each of the 1,385 assumed Playfairs resulting from decrypting cipher C with the possible second keywords was tried with all possible first keywords, and the result was scored by finding matches with a reference table of logarithms of English trigraph frequencies derived from several dozen novels. Each trigraph in the ciphertext that matched an English trigraph rewarded the cryptogram's score with that log frequency.
For example, with the second keyword ABERRANTLY and the first keyword AARDVARK, the "plaintext" to be scored is:
IECGSLFZCNREWNRTLAYEKLOGNRFEMGWFYIEARYLNBQZPZOSCNZIGGUYA
The trigraph IEC scores 2, RTL is 2, TLA is 3, and so on, with the
scores for EAR and ARY at 6 each. This plaintext scores 40 total. The closer
the solution comes to English, the higher the score... or so we hope. All solutions with an "English-like" value above an arbitrary heuristic cutoff were printed, and the solution was found solving the cryptogram for the second keyword BEAUTY:
TU GN QN QC KM ID FW LD PK GV MA WA MT YL DW
MB FS PE MX AQ ND OW DU GH LG WY MQ EX AT
The keyword BLACK produced the highest score (166), and gave this plaintext:
THIS IS A CIPHER WHICH WILXL NOT BE READ
UNLESXS I GIVE THE KEYWORDS X
The two passwords for cipher C, then, are BLACK and BEAUTY. Needless to say,
we tried the text of Sewell's "Black Beauty" on cipher B without success. The Survival Research Foundation offered a $1000 reward to the first person to send the key to either cipher B or C to their offices, but unfortunately the Toshiba laptop was not ectoplasmically receptive -- or even built -- when the offer expired in 1987. Our successful computational seance must be its own reward.
1. Leighton, A. C. "Has Dr. Thouless Survived Death?" Cryptologia. 10(2): 108-109.
2. Monge, A. 1936. "Solution of a Playfair Cipher." Signal Corps Bulletin. #93, Nov-Dec. Reprinted in Cryptography and Cryptanalysis Articles Vol. 1. Ed. W. F. Friedman. Laguna Hills CA: Aegean Park Press.
3. Thouless, R. H. 1946-49. "A Test of Survival" and "Additional Notes on a Test of Survival." Proc. of the Society for Psychical Research . 63: 253-263 and 342-3.
James J. Gillogly is a computer scientist at RAND, specializing in system design and development, computer security, and C hacking. He edits the Cipher Exchange column for The Cryptogram (organ of the American Cryptogram Association), coordinates a mailing list dealing with the Voynich Manuscript, wrote a prize-winning chess program, and has sung with a chamber music group in Carnegie Hall.
Larry Harnisch is a copy editor on the Metro Desk of the Los Angeles Times. He received a bachelor's degree in music from the University of Arizona, where he did graduate work in musicology. He was classical music critic for the Arizona Daily Star in Tucson and has written about music for the Los Angeles Times. He has a longstanding interest in cryptology.
Copyright October 1996 by Cryptologia and James J. Gillogly and Larry Harnisch
Mail Jim Gillogly
Converted to hypertext by Joe Peschel October, 2000.