My Name Is… Sarah The Horse?

The 13-character cryptogram mailed by Zodiac on April 20, 1970, to the San Francisco Chronicle remains unsolved, despite many attempts to find solutions that fit the cipher text.

Did Zodiac really encipher a name in this letter? If we assume that the cipher is a simple substitution cipher, read from left to right, then it is difficult to find real names that fit. This is because only 8 of the 13 symbols are unique. The repeated symbols are:

  • : Occurs 3 times.
  • : Occurs 2 times.
  • : Occurs 2 times.
  • : Occurs 2 times.

Assuming that each of those symbols can only represent one plaintext letter, many names we try to plug in will fail, because they will violate the cipher’s constraints.
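The constraint check can be sketched in a few lines of Python. This is a minimal illustration, not the program described later in this post. The transcription string is an assumption: plain letters stand for themselves, and the characters 1, 2, and 3 are placeholders chosen here for the three non-letter symbols, with 2 marking the symbol that occurs three times. Only the repeat pattern matters for the test, and that pattern agrees with the sample solutions listed in this post. Note that two different symbols are allowed to stand for the same letter; the only rule is that one symbol never stands for two letters.

```python
# Assumed transcription of the 13 symbols. Letters stand for themselves;
# "1", "2", "3" are placeholder names for the three non-letter symbols.
CIPHER = "AEN12K2M23NAM"

def fits(plaintext: str, cipher: str = CIPHER) -> bool:
    """Return True if every cipher symbol maps to exactly one plaintext letter."""
    # Ignore spaces, hyphens, and case so "Door Non-Encode" can be tested as-is.
    letters = [c for c in plaintext.upper() if c.isalpha()]
    if len(letters) != len(cipher):
        return False
    mapping = {}
    for symbol, letter in zip(cipher, letters):
        # setdefault stores the first letter seen for a symbol and returns
        # the stored letter on every later occurrence of that symbol.
        if mapping.setdefault(symbol, letter) != letter:
            return False
    return True

print(fits("Sarah The Horse"))   # True
print(fits("Hello World Now"))   # False: 13 letters, but the repeats do not line up
```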

We can roughly estimate the probability of 13 characters of plain text fitting into the cipher text. First, we start with these event probabilities:

  • Probability that two randomly chosen letters of English text are the same (p2): About 0.0655.
  • Probability that three randomly chosen letters of English text are all the same (p3): About 0.00535.

Since we must have three identical letters AND three sets of identical pairs, we must multiply the probabilities of each event together:

p3 × (p2)^3 ≈ 1.5 × 10^-6, or about 1 in 670,000
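The arithmetic can be checked with a short Python snippet. It uses only the two probabilities quoted above and treats the four repeat events as independent, which is itself a simplifying assumption of this rough estimate. The unrounded product comes out slightly above 1.5 × 10^-6, so the odds land close to the 1 in 670,000 quoted above.

```python
p2 = 0.0655    # chance that two random English letters match (value from the text)
p3 = 0.00535   # chance that three random English letters match (value from the text)

# One triple and three pairs, multiplied as if they were independent events.
estimate = p3 * p2 ** 3

print(f"{estimate:.2e}")             # 1.50e-06
print(f"1 in {1 / estimate:,.0f}")
```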

Another way to look at this is to consider all possible 13-letter plain texts. Each position can be one of 26 letters, so for 13 positions there are 26^13 possible plain texts (about 2.5 quintillion). But there are only 8 different symbols in the cipher, so once a letter is chosen for each symbol the whole plain text is fixed, leaving only 26^8 plain texts that fit (about 200 billion). Dividing the two, 26^13 / 26^8 = 26^5, so only about 1 in 12,000,000 plain texts will fit into the cipher without violating its constraints.
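The counting argument is easy to verify with exact integer arithmetic in Python. Nothing here goes beyond the numbers in the paragraph above; the variable names are only for illustration.

```python
total = 26 ** 13     # every possible 13-letter string
fitting = 26 ** 8    # one free letter choice per distinct cipher symbol

print(f"{total:,}")
print(f"{fitting:,}")        # 208,827,064,576
print(total // fitting)      # 11881376, which is 26 ** 5
```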

I found a database of a hundred million personal names and wrote a program that scans them for names that fit the cipher. When examining a name, the program allows the first, middle, and last names to appear in different orders, and allows any of the name parts to be abbreviated to an initial. It discards anything that isn’t exactly 13 letters long. This results in almost half a billion tests of name combinations. The algorithm found only 213 plain texts that fit the cipher text, which works out to about 1 hit for every 2,000,000 attempts. Here is a sampling of the more interesting names that fit:

  • Chastity Tracy
  • Erica Natalie T
  • Islami M Amelia
  • Eddie Resendes
  • Ernie Zerenner
  • Aileen Enerlan
  • Chitumu Eunice
  • Ariyana Mariam
  • Sripada Harish
  • Ali Raja Mariam
  • Arriaga Laura L
  • Duitama David D
  • Steve Menezes N
  • Arriaza Laura L
  • Arizala Maria M
  • Adriana Laura L
  • Aziz A Zahaniah

Would we even know if we somehow stumbled on a name that was the true plaintext of this cipher? And why would Zodiac have used a real name there in the first place?

Another way to generate solutions for this cipher is to do an exhaustive search for words and phrases that fit. I wrote another program that does this search using a dictionary of about 150,000 words, and it found numerous solutions. Here are some of the interesting ones:

  • Sarah The Horse
  • Uncle Peter Cut
  • Achieve Lethal
  • Obscene Nelson
  • Extreme Reuter
  • Emperors Ropes
  • Leigh The Haile
  • Kane Pops Punks
  • Gareth Tot Ergo
  • Niece Gene Penn
  • Drop Rarer Code
  • Door Non-Encode
  • Iraq Alan Again
  • All Banana Alan
  • Alice Venetian
  • Laura Catapult
  • Afghan Amalgam
  • Reassess Scars
  • Clara Cataract
  • Shannon Encase
  • Okinawa Nation
  • Indiana Saudis
  • Regan Adalard
  • Drippy Peptide
  • Outlet Erector
  • Akira Canadian
  • Adriana Marram
  • Mary Nina Norma
  • Robert Rory Bro
  • Smile Pete List
  • Mary Tate Terme
  • Mary Gags Germs
  • Ryan Amaya Gary
  • Daniel Eye Andy
  • Eats Bobs Bates
  • Kill Baby Bulky
  • Shit Pipe Poise
  • Eric Alan Alien
  • Slave Pete Last
  • Extra War After
  • Murder Eye Army
  • Clara Fatal Act
  • Encoded Odd CEO


The program has a lot of freedom to combine words in any arrangement, which produces a great many matches, especially when more than two words are allowed in a solution. And that’s without allowing the solutions to be anagrammed, which would generate an astronomical number of matches.
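A word-and-phrase search of this kind can be sketched as a small backtracking routine in Python. This is only an illustration under stated assumptions, not the program used for the results above: the transcription uses 1, 2, 3 as placeholders for the non-letter symbols, the helper names extend and search are made up, and the eight-word list stands in for a real dictionary of about 150,000 words.

```python
CIPHER = "AEN12K2M23NAM"  # assumed transcription; 1, 2, 3 are placeholders

def extend(mapping, word, start, cipher=CIPHER):
    """Try to place word at cipher[start:]; return the updated mapping or None."""
    if start + len(word) > len(cipher):
        return None
    new = dict(mapping)
    for symbol, letter in zip(cipher[start:], word):
        if new.setdefault(symbol, letter) != letter:
            return None     # a symbol would need two different letters
    return new

def search(words, max_words=3, cipher=CIPHER):
    """Yield word sequences whose letters exactly fill the cipher and fit it."""
    def walk(phrase, mapping, start):
        if start == len(cipher):
            yield " ".join(phrase)
            return
        if len(phrase) == max_words:
            return
        for word in words:
            new = extend(mapping, word, start, cipher)
            if new is not None:     # prune as soon as a word conflicts
                yield from walk(phrase + [word], new, start + len(word))
    yield from walk([], {}, 0)

# Tiny stand-in word list, chosen to include two of the solutions listed above.
WORDS = ["SARAH", "THE", "HORSE", "UNCLE", "PETER", "CUT", "HOUSE", "CAT"]
print(sorted(search(WORDS)))   # ['SARAH THE HORSE', 'UNCLE PETER CUT']
```

Raising max_words widens the search quickly, which matches the observation that allowing more than two words produces many more matches. A full-size dictionary would need the words indexed by length or letter pattern rather than scanned in a plain loop.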

Even with so few matches on real names, there are still numerous solutions to this cryptogram. Such a short cryptogram may have no definitive solution without further compelling evidence, such as a discovery of the killer’s worksheet.

But is the cryptogram something else entirely, instead of a simple substitution cipher? Many mysterious and interesting questions remain:

  • In the letter, the killer refers to his unsolved 340-character cryptogram, sent five months prior: “By the way, have you cracked the last cipher I sent you? My name is _____”, followed by the 13 mysterious symbols. Are the 13 symbols intended to be some kind of clue to the construction of the 340?
  • Is the appearance of “NAM” and “E” among the symbols an intentional reference to “NAME”? (Other words you can form out of the cipher’s letters include MEANT, TAKEN, TEAK, KENT, TANK, MANE, TAME, AMEN, TEAM, TAKE, MATE, NEAT, NAME, MEAN, MAKE, ANTE, MEAT, TEN, MEN, ATE, NET, NAE, TAN, MAN, MET, ANT, TEA, KEN, ETA, MAT, TAM, and EAT)
  • Do the symbols represent something other than a letter of the alphabet?
  • Why are the three symbols perfectly centered in the cryptogram, giving it a strong symmetry?
  • Look at the sequence of letters and symbols:

    The pattern reads the same forward and backward – is this an intentional palindrome?
  • Is there any significance to the orderly repetition of “N” and “M” in the sequence “N, M, N, M”?
  • Why did Zodiac include new symbols when he already had enough symbols to use from his previous cryptograms?
  • Did Zodiac intend for us to flip the cipher upside down, revealing the word “WANT”?
  • If you removed the symbols, would the remaining pieces signify his initials and a date?
  • Is the cipher supposed to line up with a section of the 340-character cipher?
  • Is there a relationship between the 13-character cipher and the filler in the 408 cipher, and possible filler in the 340 cipher?
  • Is the 13-character cipher Zodiac’s response to Dr. Marsh’s challenge to him to reveal his real name in a cipher?

  • In the 340, “K” and “M” are alternated in a strong, orderly pattern, suggesting sequential assignments were used for homophonic encryption. “K” and “M” also appear together in the 13-character cipher. Is there a connection?
  • Are the symbols meant to line up with letters in the 408’s section of filler?
  • Why does the 13-character cipher have a remarkable resemblance to the word “anetheke” found on ancient Greek temples and artifacts?

    (The above inscription concerns the liberation of slaves)

    (Even the three E’s follow the same pattern as the symbols)

As usual, we are left with more questions than answers.

Updated 9/18 to include this additional curiosity: If you apply the 408’s key to the 13, you get something like this: WEED_S_HREWH (WEED SHREW H). Note that you have to assume that the 408’s symbol is the same as the 13’s symbol. If you interpret it as instead, you can get WEED SHOE H.