Everybody likes a good story. Well, once again, the Corey Starliper story, which is over a year old, is enjoying yet another round of attention:

The evidence is quite clear that anyone could use Corey’s decryption technique to invent their own creepy hidden messages and claim that they, too, have uncovered something left for them by the Zodiac killer among the mysterious symbols. Unfortunately, this little detail doesn’t deliver the same rush of excitement as believing the story at face value. The allure of unsolved mysteries is too great, and popular, unskeptical thinking wins out.

Francis Bacon understood this weakness of ours almost 400 years ago:

The human understanding is no dry light, but receives infusion from the will and affections; whence proceed sciences which may be called ‘sciences as one would’. For what a man had rather were true he more readily believes. Therefore he rejects difficult things from impatience of research; sober things, because they narrow hope; the deeper things of nature, from superstition; the light of experience, from arrogance and pride; things not commonly believed, out of deference to the opinion of the vulgar. Numberless in short are the ways, and sometimes imperceptible, in which the affections colour and infect the understanding.
– Francis Bacon, Novum Organum (1620)

And numberless are the ways phantoms appear, deliberately and otherwise, within the strange cipher symbols, leading many to a ruinous path of conviction. Here is a more recent example:


Once again, someone who has found a handful of interesting words and phrases in the plain text has reached the conclusion that their decryption attempt is correct. Unfortunately, anyone can produce decryptions of the 340 cipher with a handful of interesting words and phrases. I’ve seen very many such solutions over the years. They all have some readable text, scattered in large swaths of gibberish. These solutions are easy to produce because if you allow a solution to contain a lot of gibberish, you can plug whatever you want into other parts of the cipher. From the thousands of such decryptions, how do you figure out which one is right?

Corey Starliper went a step beyond this, and decided to eliminate the constraints of the cipher text altogether, freeing him to squeeze in his invented plain text.

A frequent objection to this kind of analysis goes something like this: We can’t assume that Zodiac was a rational person, who would use a methodical encryption technique that could be easily understood or accepted. Wouldn’t he use some kind of crazy codemaking scheme that doesn’t make sense?

This is an acceptable objection. Yes, he very well could have done something insane to produce the sequence of symbols we see in the 340 cipher. But you still have to figure out which insane method is the correct one, because there are millions of them to select from!

Just because the Zodiac killer may have abandoned reason, doesn’t mean we should.

Programmer Dan Umanovskis has been busy running many experiments on the unsolved 340-character cipher. During his experiments, he used the zkdecrypto software to look for solutions for test ciphers and variations of the 340 cipher. But the current version of zkdecrypto can only work on one cipher at a time, and you have to click around in the user interface to kick off its search for solutions.

Dan needed a way to simplify and speed up the process, so he hacked together a command-line version called zkdecrypto-lite. Visit the project page, or go straight to the downloads page where you can find binaries for Windows, Linux, and Mac OS X. And read about how to use the program.

This version of zkdecrypto is really great for exploring ideas about the 340 cipher, because you can set up a bunch of test ciphers, and then run the command-line program on them all at once, instead of manually loading the ciphers one at a time into the user interface. (Side note: zkdecrypto was originally a command-line program written by Brax Sisco, who worked with other programmers to add the user interface to help make the program more user friendly. So, zkdecrypto is revisiting its roots!)

Here’s some info from Dan on how to use the program:

To invoke ZKDlite, call it with a parameter giving the relative path to the cipher that you wish to solve, such as:

./zkdecrypto-lite cipher/408.zodiac.solved.txt

Invoked in that way, the program will run the solver for 2 minutes before outputting the result. It’s also possible to specify a stopping condition:

-t n will stop after n seconds.
-i n will stop after n solver iterations.
-s n will stop as soon as the score reaches n.

So for example,

./zkdecrypto-lite cipher/408.zodiac.solved.txt -s 44000

will work on the 408 until the score exceeds 44000. That’s a good value for testing, by the way, as the 408 cipher becomes comfortably readable at a score of 44000.

When I run the above example on the solved 408 cipher, the program takes only a second and a half to find the (mostly) correct solution!
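To run the solver over a whole folder of test ciphers in one go, a small wrapper script is enough. Here is a minimal sketch in Python, not part of zkdecrypto-lite itself. It uses only the `-t`, `-i`, and `-s` options described above. The `cipher` folder name, the `.txt` extension, and the idea that results arrive on standard output are assumptions, as is combining more than one stopping condition in a single call.

```python
import subprocess
from pathlib import Path

# Binary name as used in the examples above; adjust for your platform.
ZKD_LITE = "./zkdecrypto-lite"


def build_command(cipher_path, seconds=None, iterations=None, score=None):
    """Build the argument list for one zkdecrypto-lite run."""
    cmd = [ZKD_LITE, str(cipher_path)]
    if seconds is not None:
        cmd += ["-t", str(seconds)]      # stop after n seconds
    if iterations is not None:
        cmd += ["-i", str(iterations)]   # stop after n solver iterations
    if score is not None:
        cmd += ["-s", str(score)]        # stop once the score reaches n
    return cmd


def run_all(cipher_dir="cipher", **stop):
    """Run the solver on every .txt file in cipher_dir (hypothetical layout).

    Returns a dict mapping file name to whatever the program printed
    (assumes the result is written to standard output).
    """
    results = {}
    for path in sorted(Path(cipher_dir).glob("*.txt")):
        done = subprocess.run(build_command(path, **stop),
                              capture_output=True, text=True)
        results[path.name] = done.stdout
    return results
```

Calling `run_all(seconds=30)` would give each cipher in the folder thirty seconds of solver time and return the collected output, keyed by file name.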

Thanks, Dan, for such a useful hack of zkdecrypto!

While Zamantha was digging up Thomas Dougherty’s letters at the San Francisco Public Library, she also made copies of their collection of newspaper article clippings.

The collection contains an assortment of Zodiac-related articles from 1978 to 2009. Zamantha generously sent me a copy of the articles, and now you can view the entire collection online in Google Docs by clicking on this link.

Here is a summary of the articles contained in the collection:

Tip: When you open one of the articles in Google Docs, it will extract the text from the article. This makes the article searchable. Here’s an example:

Many thanks again to Zamantha and Traveller1st!

Thirty-seven years ago, a man named Thomas Dougherty came up with a Zodiac “code theory”, and mailed dozens of bizarre letters from the Hotel Warfield in San Francisco to United States federal judge Oliver Carter.

The letters were recently unearthed again from the San Francisco Public Library by researchers Mark (traveller1st) and Di (Zamantha). Di paid a visit to the library, obtained copies of all the files, and also discovered that they had been previously found by “Goldcatcher / Blaine Blaine”, the person who originally promoted suspicions that Richard Gaikowski was the Zodiac killer (read more here and here). Goldcatcher refers to Dougherty in his report:

Some of the most pathetic Zodiac suspects included " my Uncle Bubba the Zodiac" – this Zodiac suspect turned out, like most of the others, a joke foisted on an ignorant mass media; and someone living in a tenderloin hotel room with a bottle of wine who got drunk and began believing the son of Howard Hughes was the Zodiac according to his written decoding illusions.

Di generously gave me a copy of the Dougherty files, which I’ve scanned and put online. You can read the letters in their entirety here.

Was Thomas Dougherty on to something with the Zodiac ciphers? Let’s examine his approach. Here’s how he describes his method:

Updated Oct 17, 2014: Fixed the Google Drive link to the FBI files.

Diligent researcher morf13’s persistence recently paid off: He received over 800 pages of never-before-seen material from FBI files on the Zodiac case. Click here to view all of the new documents (thanks, Mark, for uploading them). The files contain a wide variety of material related to persons of interest investigated for the Zodiac crimes. Inside the files you’ll see handwriting samples, letters, envelopes, crime reports, interview transcripts, emergency call logs, evidence analysis, and even a suspect’s day planner. The files also contain some unconfirmed Zodiac letters, and a lot of material related to Gareth Penn’s oft-debunked theories about the killer and his codes.

This cryptanalyst’s conclusion about Gareth Penn’s strange code theories says it all:

Alas, that conclusion is reached all too often when analyzing the many claims that have emerged over the years.

And speaking of codes, here is one that appears in the files:

The files don’t seem to mention this code in any way. At first glance, it resembles a book cipher like the Beale ciphers. But Quicktrader on Morf’s forums was quick to point out the code’s resemblance to a Vigenère cipher matrix, due to the way the numbers repeat in an aligned pattern.

The matrix appears with a lot of other material, including handwriting samples, and this crossword puzzle:

Perhaps someone’s interest in puzzles aroused the suspicions of investigators.

Have a look at the new documents. Can you find anything interesting?

Forum user 4on4off had an interesting idea: Why not run the Gutenberg crib search program again, but against the Zodiac’s own writings instead of the massive collection of books in the Gutenberg collection?

I ran the search, and it found a few matches. Here are the files containing the results:

Results for the 408 cipher
Results for the 340 cipher

In each file, the matches resulting in the highest Zkdecrypto scores are displayed first. Here is a sample line from one of the files:

37, 237, +yBX1*:49CE>VUZ5-+|c.3zBK(Op^.fMqG2Rc, RSEENSIGNEDYOURSTRULEYHEPLUNGEDHIMSEL, 15, 1.8402534E-5, {0 17} {2 23} {20 29} {19 36} , 2952.125, 79.78716

And here’s an explanation of the data format:

  • 37: Length of chunk
  • 237: Position of chunk. Positions start at 0.
  • +yBX1*:49CE>VUZ5-+|c.3zBK(Op^.fMqG2Rc: Transcription of cipher text chunk
  • RSEENSIGNEDYOURSTRULEYHEPLUNGEDHIMSEL: Matching plain text that fits into the chunk
  • 15: Number of unique letters in the plain text
  • 1.8402534E-5: Constraint difficulty. Lower values reflect higher difficulty (due to larger numbers of repeated symbols)
  • {0 17} {2 23} {20 29} {19 36}: Positions of repeated symbols, grouped into pairs
  • In the data for the 408, another value appears before the Zkdecrypto score, representing the proportion of characters in the solution that match the real known solution.
  • 2952.125: Zkdecrypto score
  • 79.78716: Zkdecrypto score divided by chunk length
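If you want to work with these files programmatically, here is a rough Python sketch that splits the sample line into its fields and recomputes the values that can be checked from the line alone. It is not the program that produced the results. The field names are made up for illustration, and the parser assumes the nine-field layout of the sample above (the 408 files carry one extra value, which this sketch does not handle).

```python
import re

SAMPLE = ("37, 237, +yBX1*:49CE>VUZ5-+|c.3zBK(Op^.fMqG2Rc, "
          "RSEENSIGNEDYOURSTRULEYHEPLUNGEDHIMSEL, 15, 1.8402534E-5, "
          "{0 17} {2 23} {20 29} {19 36} , 2952.125, 79.78716")


def repeated_pairs(chunk):
    """Return (earlier, later) index pairs for each repeated symbol."""
    last_seen = {}
    pairs = []
    for i, symbol in enumerate(chunk):
        if symbol in last_seen:
            pairs.append((last_seen[symbol], i))
        last_seen[symbol] = i
    return pairs


def parse_record(line):
    """Parse one nine-field result line into a dict (field names are hypothetical)."""
    length, position, rest = line.split(", ", 2)
    # Split the remaining fields from the right, so a comma inside the
    # cipher text transcription would not break the parse.
    cipher, plain, unique, difficulty, pairs, score, per_char = rest.rsplit(", ", 6)
    return {
        "length": int(length),
        "position": int(position),
        "cipher": cipher,
        "plain": plain,
        "unique_letters": int(unique),
        "difficulty": float(difficulty),
        "pairs": [(int(a), int(b)) for a, b in re.findall(r"\{(\d+) (\d+)\}", pairs)],
        "score": float(score),
        "score_per_char": float(per_char),
    }


record = parse_record(SAMPLE)
print(repeated_pairs(record["cipher"]) == record["pairs"])    # repeats agree
print(len(set(record["plain"])) == record["unique_letters"])  # unique letters agree
```

Both checks hold for the sample line: the repeated symbols sit at exactly the listed positions, and the plain text uses 15 distinct letters.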

What do you think? See anything interesting?

Harold Kravcik created a stir a few years back when he produced a solution to the 340-character cipher. Some people believed it to be the correct solution, so it was submitted to the FBI, and Harold required others to sign non-disclosure agreements to view his solution. This led to a lot of hype that the decades-old mystery of the 340 cipher had finally been solved. But eventually, confidence in his solution was lost. The drama resulting from all this caused Harold, and others who believed in the solution, to receive ridicule and derision from folks in the Zodiac community who are exhausted by the parade of discredited cipher solutions that have emerged since the Zodiac killer committed his awful crimes.

But what’s wrong with Harold’s solution?

The program I wrote for the Project Gutenberg crib experiments also looked for pieces of text that fit into the entire 13-character cipher.

This cipher has many repeated symbols: One symbol repeats three times, and three other symbols repeat twice. Nevertheless, the program found over 2,700 unique bits of text that still fit “as is” into the cipher text. Here are some interesting examples:

Back in Part 1 we talked about the idea of using a large collection of books as a source of cribs to plug into cipher texts. Can we use a large collection of books, such as Project Gutenberg, to find pieces of real solutions to the ciphers?

I created an experiment to explore this idea. First, the 408 and 340 ciphers are broken down into chunks. Each chunk of cipher text must have some minimum number of repeated symbols. Chunks that have many repeated symbols are difficult to find solutions for, since the solutions must have repeated letters in the exact same locations. If we pick chunks that have too few repeated symbols, then there are way too many solutions that will fit.
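The chunking step can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual program: the fixed chunk length, the sliding window, and the way repeats are counted (every occurrence of a symbol beyond its first) are all assumptions.

```python
from collections import Counter


def count_repeats(chunk):
    """Count symbol occurrences beyond the first appearance of each symbol."""
    return sum(n - 1 for n in Counter(chunk).values())


def make_chunks(cipher, length, min_repeats):
    """Slide a window of the given length over the cipher text.

    Keeps (position, chunk) for every window with at least min_repeats
    repeated symbols. Positions start at 0.
    """
    chunks = []
    for pos in range(len(cipher) - length + 1):
        chunk = cipher[pos:pos + length]
        if count_repeats(chunk) >= min_repeats:
            chunks.append((pos, chunk))
    return chunks


# Toy cipher text, with letters standing in for cipher symbols.
print(make_chunks("ABCADEAFB", length=4, min_repeats=1))
# [(0, 'ABCA'), (3, 'ADEA')]
```

Raising `min_repeats` leaves fewer chunks, but each surviving chunk is a tighter constraint on the text that can fit into it.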

Then, a program processes all of Project Gutenberg’s books. Each book is converted into a stream of uppercase text, with all punctuation and numbers removed. For example, here is what the beginning of A Tale of Two Cities looks like when it is converted:
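A minimal Python sketch of that conversion, using the novel's famous opening line as input. Dropping spaces and any non-English letters along with the punctuation is an assumption here, and the real program may have treated them differently.

```python
def to_stream(text):
    """Uppercase the text and keep only the letters A through Z."""
    return "".join(ch for ch in text.upper() if "A" <= ch <= "Z")


print(to_stream("It was the best of times, it was the worst of times,"))
# ITWASTHEBESTOFTIMESITWASTHEWORSTOFTIMES
```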


The program then looks through all of the text from the books to find pieces that fit into the chunks we created from the cipher texts. In total, the program examined over eleven billion characters of text.
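To make "fit" concrete, here is a simplified Python sketch of the search, not the program itself. A piece of text fits a chunk when every repeated cipher symbol lines up with the same letter each time. Different symbols are allowed to stand for the same letter, as happens in the solved 408. Whether the real search applied further rules is an assumption left out of this sketch.

```python
def fits(chunk, text):
    """True if equal cipher symbols always line up with equal letters."""
    mapping = {}
    for symbol, letter in zip(chunk, text):
        # setdefault records the first letter seen for a symbol and
        # returns the recorded letter on later occurrences.
        if mapping.setdefault(symbol, letter) != letter:
            return False
    return True


def find_fits(chunk, stream):
    """Yield (offset, candidate) for every window of stream that fits chunk."""
    n = len(chunk)
    for offset in range(len(stream) - n + 1):
        candidate = stream[offset:offset + n]
        if fits(chunk, candidate):
            yield offset, candidate


# Toy example: the first and last symbols of the chunk are the same.
print(list(find_fits("ABCA", "XTHATZ")))
# [(1, 'THAT')]
```

A plain loop like this would be far too slow for billions of characters, but it shows why so many needles turn up: any window whose letters repeat in the right places counts as a match.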

It is a bit like finding a needle in a haystack, since the chances are low of finding a long piece of text that exactly matches the real solution. Actually, it’s a bit worse than finding a needle in a haystack: It is more like finding a needle in a needle stack, because very many pieces of text can fit into a chunk of cipher text. You have to come up with a way to figure out which needle is the one you’re really looking for.

For example, take a look at this 46-character chunk of cipher text from the 408-character cipher:

Have you ever wanted to do cryptanalysis on someone’s back?