Geocaching Cryptocaches – An Introduction by tenebrus

Upload: lila-vergin

Post on 29-Mar-2015



Lesson 1: Using Letter Frequencies

It is important to know which letters are used most frequently. The exact ranking varies slightly from one reference to another, but the lists are essentially the same.

The most common letters appear near 9-12%, median letters appear near 3-5%, and uncommon letters often appear near or under 2%.

In order (from Linotype machines), we have
E T A O I N S H R D L U C M F W Y P V B G K Q J X Z

Note: The words “geocache” and “geocaching”, along with number-words, are very popular in this context. As such, C, G, F, and Y probably appear slightly more often than the order above suggests.

Let’s say that your code has quite a lot of the letters X, L, and F (in that order). Consider the possibility that these are (in some order) E, T, and A (or possibly O).
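Counting letters by hand gets tedious, so here is a minimal sketch of a frequency counter. The function name letter_frequencies and the sample string are only illustrations, not part of the original lesson.

```python
from collections import Counter

def letter_frequencies(text):
    """Return (letter, count, percent) tuples, most common letter first."""
    letters = [ch for ch in text.upper() if ch.isalpha()]
    total = len(letters)
    return [(ch, n, round(100 * n / total, 1))
            for ch, n in Counter(letters).most_common()]

# Made-up sample text in which X, L, and F dominate.
for letter, count, percent in letter_frequencies("XLFXLXFXQLX"):
    print(letter, count, percent)
```

The letters at the top of the printed list are the first candidates for E, T, and A.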


Consider the following cryptarithm:
DROMKMROSCKDXYBDRPYBDIDGYNOQBOOCDGOXDIDGYZYSXDCOFOXREXNBONOSQRDIYXO…

There are a lot of Ds (16%) and Os (14%). The next most common letters are Xs (8%) and Ys (8%). Try D=E and O=T first. (In the lines below, an X marks a letter that has not been guessed yet.)

EXTXXXXTXXXEXXXEXXXXEXEXXXTXXTTXEXTXEXEXXXXXXEXTXTXXXXXXTXTXXXEXXXT…

E?T… is not a very good start. Perhaps trying it the other way around would be better.

Then, try D=T and O=E (as if the first word is THE; in fact, also try R=H).

THEXXXHEXXXTXXXTHXXXTXTXXXEXXEEXTXEXTXTXXXXXXTXEXEXHXXXXEXEXXHTXXXE…

T?E… could be THE. This looks very good.
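Guess-and-check goes faster when a program fills in the guesses. The sketch below applies a partial key and marks every unguessed letter with X, as in the lines above. The name apply_key is an assumption made for this illustration.

```python
def apply_key(ciphertext, key, unknown="X"):
    """Replace each cipher letter that has a guess; mark the rest as unknown."""
    return "".join(key.get(ch, unknown) for ch in ciphertext)

cipher = "DROMKMROSCKDXYBDRPYBDIDGYNOQBOOCDGOXDIDGYZYSXDCOFOXREXNBONOSQRDIYXO"

# First attempt: D=E and O=T.
print(apply_key(cipher, {"D": "E", "O": "T"}))
# Second attempt: D=T, O=E, and R=H.
print(apply_key(cipher, {"D": "T", "O": "E", "R": "H"}))
```

Adding more entries to the dictionary (M=C, K=A, and so on) fills in the message one guess at a time.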


It seems very likely now that the beginning of this message reads “the cache is at …” As such, let’s assume a few more substitutions: M=C, K=A, S=I, and C=S.

THECACHEISATXXXTHXXXTXTXXXEXXEESTXEXTXTXXXXIXTSEXEXHXXXXEXEIXHTXXXE…

The next words seem like “north ???ty-two”. Recall that the false (posted) coordinates are usually close to the real ones, which makes that “north forty-two.” So, assume further that X=N, Y=O, B=R, P=F, I=Y, and G=W.

Look for likely words as you guess and check:
“Cache?” “Degrees?” “Point?” “Hundred?” “North?” “West?” “Minutes?” et cetera…


THECACHEISATNORTHFORTYTWOXEXREESTWENTYTWOXOINTSEXENHXNXREXEIXHTYONE…

Whoa. This is almost fully translated. The end of the text looks like “seven hundred eighty-one” (so F=V, E=U, N=D, Q=G). This also matches an earlier sequence that looks like “DEGREES”. This, by the way, leaves only Z=P to complete the message.

THE CACHE IS AT NORTH FORTY-TWO DEGREES TWENTY-TWO POINT SEVEN HUNDRED EIGHTY-ONE…

Don’t forget: not everything is a simple cryptarithm. But, because of its sheer simplicity, it is worth your time to check for it.


Lesson 2A: Cryptarithmic Ciphers

Now that you have seen how a cryptarithm works, it is time to see a diverse selection of cryptarithmic ciphers.

Let’s look at the Dancing Men Cipher. It is derived from the Sherlock Holmes story of The Dancing Men, but since the story uses only 17 letters (with some mild inconsistencies), we have Aage Rieck Sørensen to thank for the complete cipher.

But what if we didn’t have Sørensen’s key?

[Chart: Sørensen’s Dancing Men key, one symbol for each letter A through Z]


Well, for starters, you could just fake the key.

That is, assign the first symbol A, the second symbol B, and so on. Wherever a symbol is reused, use the already-assigned letter. Notice that we have a lot of As and Cs. Letter frequency analysis suggests that these are very likely cryptarithmic substitutions for Es and Ts. We should try it.

A B C D E D B C F G E A H I J A B K I J A L A M I N C
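Faking the key is mechanical enough to hand to a short program. This is a minimal sketch; the name fake_key is hypothetical, and the input can be any sequence of symbols (names of pictures, numbers, or letters). It assumes no more than 26 distinct symbols.

```python
import string

def fake_key(symbols):
    """Label symbols A, B, C, ... in order of first appearance."""
    assigned = {}
    labels = []
    for symbol in symbols:
        if symbol not in assigned:
            # A new symbol gets the next unused letter.
            assigned[symbol] = string.ascii_uppercase[len(assigned)]
        labels.append(assigned[symbol])
    return "".join(labels)

# Each plaintext letter here stands in for one dancing-men symbol.
print(fake_key("THECACHEISATNORTHFORTYTWODE"))
```

The result is an ordinary letter cryptarithm, ready for letter frequency analysis.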


As it turns out, this is the beginning of the very same secret message from our previous example.

A B C D E D B C F G E A H I J A B K I J A L A M I N C
T H E C A C H E I S A T N O R T H F O R T Y T W O D E

It should also be noted that Sørensen’s Dancing Men Cipher has symbols for numerals too:

[Chart: Dancing Men symbols for the numerals 0 through 9]


So pay attention to whether you think that you have 26 alphabetic symbols or 36 alphanumeric symbols.

Also, some puzzlemakers will intentionally layer their ciphers such that one code is cracked only to reveal another code. However, if all of the layers are cryptarithmic, it will never be more difficult than the sample we just did by faking the key and then fixing the key using letter frequency analysis.


Lesson 2B: Cryptarithmic Ciphers

Another very commonly known cryptarithm is Braille (Level 1). Each symbol (a cell of either 6 or 8 dot positions in two columns) represents an alphanumeric character, punctuation, or an escape character.

(Note: escape characters change the normal interpretation of neighboring characters. One example is the escape character that says “the next character is a number”; another says “the next character is uppercase”.)

It may be worth noting how the top two rows of dots repeat from A-J to K-T to U-Z (skipping over W).

Also notice that even Braille (Level 1) has some frequently used word substitutions (such as ‘and’, ‘the’, and the endings ‘-ing’ and ‘-ed’), thus making it not entirely cryptarithmic. Braille (Level 2) makes additional substitutions to the point that the code is not even close to cryptarithmic (ex: ‘not’ is the letter N, ‘according’ is the digraph AC, and ‘because’ is a special sign (dots 2 & 3) for BE followed by the letter C).

The letters A-J are used to represent numbers 1 through 9, followed by 0, but only when preceded by the number sign.


Lesson 2C: Cryptarithmic Ciphers

There is another pictorial cryptarithmic cipher technique called the Pigpen Cipher or the Mason’s Cipher.

The letters are entered from left to right, then from top to bottom, in each of the 4 shapes. Dots are used to differentiate the second use of a shape from the first use.

Caution: there are 2 main versions of the Pigpen Cipher: (A) ##XX and (B) #X#X. Furthermore, letters can be entered in alternating fashion into each pair of shapes (that is, A,C,E,… in one and B,D,F,… in the other).

Each letter is then shown by drawing only its own part of the grid.

That is, for example: [Figure: sample letters drawn as Pigpen symbols]

Remember: regardless of which version is used, using letter frequency analysis will guide you.

Lesson 2D: Cryptarithmic Ciphers

Sometimes, a cryptarithm is given as a standardized set of numbers. For example, ASCII/UTF-8 has set the value 65 to represent ‘A’. Similarly, 66 represents ‘B’, and so on. The lowercase letters begin with 97 for ‘a’ and continue on to 122 for ‘z’. It is also important to know that punctuation, spaces, and accented letters do not necessarily have intuitive values.

However, as letter frequency analysis would suggest, 101 = ‘e’ and 116 = ‘t’ would be popular values, as would 32 for spaces.

Note: while the letter frequency analysis and just faking a key would work, with practice, one learns to recognize certain ASCII/UTF-8 characters in their numeric form.

Sometimes, the ASCII/UTF-8 is given as binary or as hexadecimal code. Let’s briefly discuss the conversions from decimal to these bases just in case you should ever need to do them by hand.

Decimal to binary:
STEP 1: make a list of powers of 2: 1, 2, 4, 8, 16, 32, 64, …
STEP 2: determine the largest power of 2 that you can ‘afford’ with the number that you’ve got. That binary digit (or bit) becomes a 1.
STEP 3: repeat step 2 with smaller powers of 2 until you have checked the zeroth power (= 1). If ever you cannot ‘afford’ a power of 2, that binary digit becomes a 0.

For example: ‘t’ = 116 in decimal. You can afford a 64 (leaving 52). You can afford a 32 (leaving 20). You can afford a 16 (leaving 4). You cannot afford an 8! You can afford the 4 (leaving 0), but keep in mind that the 2 and the 1 will become 0 bits. So, it’s 1110100 (or, because it is typical to use 8 bits = 1 byte, 01110100).
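The ‘afford’ steps translate almost word for word into code. This is only a sketch; the name to_binary and the default of 8 bits are choices made for this illustration.

```python
def to_binary(number, bits=8):
    """Convert a non-negative integer to a bit string by the 'afford' method."""
    if not 0 <= number < 2 ** bits:
        raise ValueError("number does not fit in the requested number of bits")
    digits = []
    for power in range(bits - 1, -1, -1):   # 128, 64, ..., 2, 1 when bits=8
        value = 2 ** power
        if number >= value:                  # we can 'afford' this power of 2
            digits.append("1")
            number -= value
        else:                                # we cannot, so the bit is 0
            digits.append("0")
    return "".join(digits)

print(to_binary(116))   # the value for 't'
```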

For example:
  0    1    1    1    0    1    0    0
128   64   32   16    8    4    2    1
  0 + 64 + 32 + 16 +  0 +  4 +  0 +  0 = 116 = ‘t’

If you had a long string of bits, it is advisable to break them down into blocks of 8 bits:

011101000110100001100101 should be treated as [01110100][01101000][01100101] which then becomes ‘the’.

Binary to decimal:
STEP 1: starting on the right-most bit, make a list of powers of 2: …, 64, 32, 16, 8, 4, 2, 1.
STEP 2: multiply each bit (whether it is a 0 or a 1) by its respective power of 2; then, add them up.
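Here is a short sketch of the same two steps applied to a long string of bits, 8 at a time. The name bits_to_text is hypothetical, and the sketch assumes each block of 8 bits is one ASCII character.

```python
def bits_to_text(bits):
    """Read a string of 0s and 1s as 8-bit blocks and return the characters."""
    bits = "".join(bits.split())            # ignore any spaces between blocks
    if len(bits) % 8 != 0:
        raise ValueError("expected a multiple of 8 bits")
    characters = []
    for start in range(0, len(bits), 8):
        block = bits[start:start + 8]
        # Right-most bit is worth 1, the next is worth 2, then 4, and so on.
        value = sum(int(bit) * 2 ** power
                    for power, bit in enumerate(reversed(block)))
        characters.append(chr(value))
    return "".join(characters)

print(bits_to_text("011101000110100001100101"))
```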

Binary to hexadecimal:
STEP 1: make blocks of 4 bits at a time.
STEP 2: convert each block into decimal; for values greater than 9, continue counting with ‘a’ for 10, ‘b’ for 11, ‘c’ for 12, ‘d’ for 13, ‘e’ for 14, and ‘f’ for the maximum value 15.

Hexadecimal to binary: Just reverse this process.

For example: 01101101 should be treated as [0110][1101] which becomes 6-13, or rather ‘6d’. Note: this ‘d’ is the hexadecimal digit ‘d’, not the letter ‘d’.

For example: 011101000110100001100101 should be treated as [0111][0100][0110][1000][0110][0101] which then becomes 7 4 6 8 6 5.
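The block-of-4 conversion can be sketched in a few lines. The function names bits_to_hex and hex_to_bits are made up for this example.

```python
HEX_DIGITS = "0123456789abcdef"

def bits_to_hex(bits):
    """Convert a bit string to hexadecimal, one 4-bit block per hex digit."""
    if len(bits) % 4 != 0:
        raise ValueError("expected a multiple of 4 bits")
    return "".join(HEX_DIGITS[int(bits[i:i + 4], 2)]
                   for i in range(0, len(bits), 4))

def hex_to_bits(hex_text):
    """Reverse the process: each hex digit becomes a 4-bit block."""
    return "".join(format(HEX_DIGITS.index(digit), "04b")
                   for digit in hex_text.lower())

print(bits_to_hex("01101101"))
print(hex_to_bits("746865"))
```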


Lesson 2E: Cryptarithmic Ciphers

A specific type of simple cryptarithm is the Caesarean Rotation in which every letter is rotated a certain number of positions in the alphabet, looping around if necessary.

This is the same message given in plaintext, in rotation-1, and in rotation-2.

Notice: If we rotate far enough, then we would eventually return to the original message. (That is, rotation-26 is the same as no rotation at all.)

T H E C A C H E I S U N D E R T H E L O G

U I F D B D I F J T V O E F S U I F M P H

V J G E C E J G K U W P F G T V J G N Q I
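A rotation is simple to sketch in code, which also makes it easy to try all 26 of them on a suspicious message. The name rotate is an assumption for this illustration; anything that is not a letter A-Z is passed through unchanged.

```python
import string

ALPHABET = string.ascii_uppercase

def rotate(text, shift):
    """Rotate each letter forward by `shift` places, looping around past Z."""
    rotated = []
    for ch in text.upper():
        if ch in ALPHABET:
            rotated.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            rotated.append(ch)   # spaces and punctuation are left alone
    return "".join(rotated)

message = "THE CACHE IS UNDER THE LOG"
for shift in (0, 1, 2, 26):
    print(shift, rotate(message, shift))
```

To brute-force an unknown rotation, loop the shift over range(26) and look for the one line that reads as English.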


Rotation 13 is a special case of enciphering and deciphering. The reason for this is that one has gone exactly halfway around the alphabet. Doing that twice goes exactly all the way around the alphabet, thus returning every letter to its initial value.

Some ciphers, such as the Rotation 13, are their own deciphering technique. This often makes for a very compact ciphering tool:

A B C D E F G H I J K L M
N O P Q R S T U V W X Y Z

So, under “Rot-13”, the word ‘geocache’ would become ‘trbpnpur’. The rule for use is to replace each letter with the one either immediately above it or immediately below it.


Lesson 2F: Cryptarithmic Ciphers

The Atbash cipher is very much like the Rot-13 in that it is a self-decoding cryptarithm; however, it differs in that it folds the alphabet over instead of rotating each letter.

Rotation-13:
A B C D E F G H I J K L M
N O P Q R S T U V W X Y Z

Atbash:
A B C D E F G H I J K L M
Z Y X W V U T S R Q P O N

So, under “Atbash”, the word ‘geocache’ would become ‘tvlxzxsv’. The rule for use is still to replace each letter with the one either immediately above it or immediately below it.

The name of the Atbash comes from its original use with the Hebrew alphabet, in which the cryptarithmic pairs begin with Aleph (א) and Tav (ת), then Beth (ב) and Shin (ש).
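The fold-over rule amounts to pairing the alphabet with its own reverse. This is a minimal sketch (the name atbash is just a label chosen here), built on Python's standard str.maketrans and str.translate.

```python
import string

UPPER = string.ascii_uppercase
LOWER = string.ascii_lowercase

# Pair A with Z, B with Y, and so on, for both cases.
ATBASH_TABLE = str.maketrans(UPPER + LOWER, UPPER[::-1] + LOWER[::-1])

def atbash(text):
    """Fold the alphabet over; applying it twice restores the text."""
    return text.translate(ATBASH_TABLE)

print(atbash("geocache"))
print(atbash(atbash("geocache")))
```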


Lesson 2G: Cryptarithmic Ciphers

Another famous cryptarithm is Morse code which uses dots and dashes, sometimes in auditory form, to convey alphanumeric characters. To accommodate at least 36 such characters, each character is translated into multiple dots and dashes. We’ve seen this before with ASCII/UTF-8 conversions into binary and hexadecimal.

Example (here = is a dash and o is a dot):
= = o / o / = = = / = o = o / o = / = o = o / o o o o / o
G / E / O / C / A / C / H / E

A . –         N – .
B – . . .     O – – –
C – . – .     P . – – .
D – . .       Q – – . –
E .           R . – .
F . . – .     S . . .
G – – .       T –
H . . . .     U . . –
I . .         V . . . –
J . – – –     W . – –
K – . –       X – . . –
L . – . .     Y – . – –
M – –         Z – – . .

1 . – – – –   6 – . . . .
2 . . – – –   7 – – . . .
3 . . . – –   8 – – – . .
4 . . . . –   9 – – – – .
5 . . . . .   0 – – – – –
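For longer messages, a lookup table saves a lot of squinting. The sketch below assumes the message is typed with plain periods and hyphens and that a space separates one character from the next; the name decode_morse is hypothetical.

```python
LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890"
CODES = (".- -... -.-. -.. . ..-. --. .... .. .--- -.- .-.. -- "
         "-. --- .--. --.- .-. ... - ..- ...- .-- -..- -.-- --.. "
         ".---- ..--- ...-- ....- ..... -.... --... ---.. ----. -----").split()

# Map each dot-dash pattern back to its character.
MORSE_TO_CHAR = dict(zip(CODES, LETTERS))

def decode_morse(message):
    """Decode space-separated Morse patterns into text."""
    return "".join(MORSE_TO_CHAR[pattern] for pattern in message.split())

print(decode_morse("--. . --- -.-. .- -.-. .... ."))
```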


Lesson 2H: Cryptarithmic Ciphers

An historical literary substitution cipher worth presenting is the one from Edgar Allan Poe’s Gold-Bug. It was defeated using none other than our good friend Letter Frequency Analysis. If you are seeing a lot of the characters {8 ; 5 ‡ 6 *} then you are probably encountering a Gold-Bug cipher in use.

53‡‡†305))6*;4826)4‡.)4‡);806*;48†8¶60))85;1‡(;:‡*8†83(88 )5*†;46(;88*96 *?;8)*‡(;485);5*†2:*‡(;4956*2(5*—4)8¶8*;40 69285);)6†8)4‡‡;1(‡9;48081;8:8‡1;48†85;4)485†528806*81( ‡9;48;(88;4 (‡?34;48)4‡;161;:188;‡?;

This message decodes as:
A good glass in the bishop's hostel in the devil's seat
forty-one degrees and thirteen minutes northeast and by north
main branch seventh limb east side shoot from the left eye of the death’s-head
a bee line from the tree through the shot fifty feet out.

A 5     N *
B 2     O ‡
C –     P .
D †     Q $
E 8     R (
F 1     S )
G 3     T ;
H 4     U ?
I 6     V ¶
J ,     W ]
K 7     X ¢
L 0     Y :
M 9     Z [


Lesson 2I: Cryptarithmic Ciphers

A more modern cryptarithm is Cell Phone Text Code. It is very straightforward to use. Naturally, there will be a lot of 33, 8, 2, 666, 444, and 66 according to letter frequency analysis.

There is also a variant that, instead of replicating a digit, will follow that digit with \ | or / to indicate first, second, or third letter on that key. In this variant Q is usually 0\ and Z is usually 0/ (so that there is no need of a fourth symbol).

Example:
2 3 3 7777 33 888 33 66 8 33 33 66 8 666 66 2 66 3 8 44 444 777 8 999 8 666 9

ADD SEVENTEEN TO N AND THIRTY TO W

A 2       N 66
B 22      O 666
C 222     P 7
D 3       Q 77
E 33      R 777
F 333     S 7777
G 4       T 8
H 44      U 88
I 444     V 888
J 5       W 9
K 55      X 99
L 555     Y 999
M 6       Z 9999

* Sometimes Q=0 and Z=1 instead of being a part of the 7-block & 9-block respectively.
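The table above is just the phone keypad read by counting repeats, which a short sketch can do directly. It assumes the standard layout with Q on the 7 key and Z on the 9 key, and groups separated by spaces; the name decode_multitap is made up for this example.

```python
KEYPAD = {"2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
          "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ"}

def decode_multitap(message):
    """Decode groups such as '33' (E) or '7777' (S) into letters."""
    letters = []
    for group in message.split():
        digit = group[0]
        valid = (digit in KEYPAD
                 and group == digit * len(group)
                 and len(group) <= len(KEYPAD[digit]))
        if not valid:
            raise ValueError("not a valid key group: " + group)
        # One press is the first letter on the key, two presses the second...
        letters.append(KEYPAD[digit][len(group) - 1])
    return "".join(letters)

print(decode_multitap("2 3 3 7777 33 888 33 66 8 33 33 66"))
```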


Lesson 2J: Cryptarithmic Ciphers

One last cryptarithm presented here today is from the future (okay, from Futurama).

Example: [message written in the Futurama alien alphabet]

Think: There are a lot of E T A O in this message.

Translated: Do not forget that you could just fake the cryptarithm using letter frequency analysis.

[Chart: Futurama alien alphabet key, one symbol for each letter A through Z]


Lesson 3A: Vigenère Ciphers

Blaise de Vigenère thought about the weaknesses of substitution ciphers with regards to Letter Frequency Analysis. As such, it occurred to him to take measures against it. With that, Vigenère decided to add a ‘keyword’ into the cipher so that without the knowledge of the keyword, the code would be, in theory, more difficult to break. Here’s how it works:

It is a multiple-rotation cipher. That is, the rotation ciphers that we have already seen are applied selectively, based on a keyword. For example: if the keyword is ‘PARIS’, we would translate this into numbers (counting from 0 for A) to get 15-0-17-8-18. So, the first letter would be rotated by 15, the second by 0, the third by 17, the fourth by 8, and the fifth by 18. Once the keyword is used up, it cycles back to the start. So, the sixth letter would be rotated by 15.

Remember: A=0, B=1, C=2, …, Z=25.


Let’s practice this, still using PARIS (15-0-17-8-18) as the keyword.

Plaintext:        G   O   T   O   T   H   E   C   O   R   N   E   R
                  6  14  19  14  19   7   4   2  14  17  13   4  17
Keyword:          P   A   R   I   S   P   A   R   I   S   P   A   R
                 15   0  17   8  18  15   0  17   8  18  15   0  17
(add above)      21  14  36  22  37  22   4  19  22  35  28   4  34
(subtract 26s)   21  14  10  22  11  22   4  19  22   9   2   4   8
Ciphertext:       V   O   K   W   L   W   E   T   W   J   C   E   I

So, we added the numerical values of the plaintext and the keyword (repeated as necessary) to get sums. If the sums were too large, we subtracted one alphabet (or 26) to correct for this. Then, we translated the new numerical values back into letters. Voilà.

To decipher, we simply reverse the steps: subtract the keyword values from the ciphertext values, adding 26 wherever the result would be negative.
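The add-and-subtract routine can be sketched as one small function. It assumes the text and keyword are already uppercase A-Z with no spaces; the name vigenere and its decipher flag are choices made for this illustration.

```python
def vigenere(text, keyword, decipher=False):
    """Add (or, when deciphering, subtract) the repeating keyword, mod 26."""
    sign = -1 if decipher else 1
    result = []
    for position, ch in enumerate(text):
        # A=0, B=1, ..., Z=25; the keyword cycles back when it runs out.
        shift = ord(keyword[position % len(keyword)]) - ord("A")
        value = (ord(ch) - ord("A") + sign * shift) % 26
        result.append(chr(value + ord("A")))
    return "".join(result)

secret = vigenere("GOTOTHECORNER", "PARIS")
print(secret)
print(vigenere(secret, "PARIS", decipher=True))
```

The % 26 takes care of both “subtract 26s” when enciphering and “add 26s” when deciphering.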


As clever as this seems, it only obscures the letter frequencies; it does not truly eliminate them. For instance, with a keyword like ‘PARIS’ in which one-fifth of the letters do not have a rotation at all, the ciphertext will still have one-fifth of the original letter frequencies! Furthermore, the other fifths will have modified letter frequencies. So, while there could be less pronounced letter frequencies, they will still be present. More importantly, it gives us a major clue for breaking a Vigenère ciphertext.

The repetitive nature of the rotations used is itself a weakness!


To crack a Vigenère WITHOUT the keyword:
Assume a possible length of a keyword, N. Then start putting the letters of the ciphertext into sets: the first letter goes into set #1, the second letter goes into set #2, and so on, until the Nth letter goes into set #N. Then, start repeating sets. So, the (N+1)th letter goes into set #1. (See below for an example using our previous ciphertext and assuming a 3-letter keyword.)

V1 O2 K3 W1 L2 W3 E1 T2 W3 J1 C2 E3 I1
Set #1 = {VWEJI}, Set #2 = {OLTC}, Set #3 = {KWWE}

Set #3 contains a high frequency of Ws. This could be verification of a 3-letter keyword. Then again, it’s no guarantee.

Note: These sample sets are too small to effectively run letter frequency analyses.
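Dealing the ciphertext into sets is a one-liner with slicing. A minimal sketch follows; the names split_into_sets and describe_sets are hypothetical.

```python
from collections import Counter

def split_into_sets(ciphertext, keyword_length):
    """Set #1 gets letters 1, N+1, 2N+1, ...; set #2 gets 2, N+2, ...; etc."""
    return [ciphertext[start::keyword_length]
            for start in range(keyword_length)]

def describe_sets(ciphertext, keyword_length):
    """Print each set with its three most common letters."""
    sets = split_into_sets(ciphertext, keyword_length)
    for number, letters in enumerate(sets, 1):
        print("Set #%d" % number, letters, Counter(letters).most_common(3))

describe_sets("VOKWLWETWJCEI", 3)
```

Run describe_sets with a few different keyword lengths and compare how lopsided the sets look.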


New sample ciphertext:
IHVOWDCRKZTIJTGRAKMVJNUMJPLFOSIAGZGYETBADNFNXDRKGQPRUAXGODBZTFRSWROFZVXNRBWHAKITTAIQFVOWBJJEKPJTEFVWHIO

The translation (using ‘PARIS’ as keyword) is as follows, but know that we will work at getting there as if we didn’t already know this:
THEGEOCACHEISLOCATEDUNDERALOGATAPROJECTIONOFFORTYYARDSFROMTHEFAKECOORDINATESATABEARINGOFTRUETHREEONESIX

Note: Now the sample sets will be rich enough for proper letter frequency analyses.


Ciphertext broken down into sets:
IHVOWDCRKZTIJTGRAKMVJNUMJPLFOSI … (all)
I  O  C  Z  J  R  M  N  J  F  I … (set #1)
 H  W  R  T  T  A  V  U  P  O   … (set #2)
  V  D  K  I  G  K  J  M  L  S  … (set #3)

Note: This is how the subsets are formed. At the 4th letter, there is no set #4 (assuming a 3-letter keyword) so the cycle begins again with set #1.


Note: The plaintext has a very pronounced letter frequency: A, E, O, R, & T are each 9 or more with no other letter exceeding 5.

The ciphertext, on the other hand, has a flatter frequency: R & T have 7; A, F, I, J, & O each have 6; and four others have 5. It may be meaningful that the ciphertext only has 1 C, 1 L, and 1 Y. But what happens when we split the ciphertext into 3 sets?

Set #1 has at most 3 of a kind: F, I, J, O, R.
Set #2 has 4 Ts and 3 As, Bs, and Fs.
Set #3 has 5 Ks, no 4s or 3s, and 9 different letters have frequency 2.

The Ks might just be a coincidence. It would be more likely to have a few high-frequency letters than just the one extreme outlier across the 3 sets.

Note: You should always try at least 2 different keyword lengths for comparison.


Assuming a 5-letter keyword yields different subsets and different results.

Set #1 has 4 Ts and 3 Ds; and has 15 missing letters.
Set #2 has 4 As; 3 Es, Ns, and Os; and has 16 missing letters.
Set #3 has 4 Fs & Ks; and has 14 missing letters.
Set #4 has 4 Bs; and has 12 missing letters.
Set #5 has 4 Ws; and has 15 missing letters.

Considering that every letter is used in the whole ciphertext, these missing letters represent an extreme likelihood that the keyword is 5 letters long (or at least a multiple of 5 letters long).

Next up: Use the most frequent letters (ETAOINSHRDLU) in each subset to guess at the keyword. For example: Maybe the T in set #1 is really the E, in which case it could be Rotation-15 (because E+15=T). If not, maybe T=T and it isn’t rotated at all.


Best first guesses yield that Set #1 is a Rotation 15; Set #2 might not be rotated at all; Sets #3, #4, and #5 need closer investigation at this point.

A Closer Look: While it might be reasonable to suggest at first that Set #3 is Rotation-1 (to make E rotate to F), this also would mean that we had a lot of Js to start with (to make J rotate to K). As such, let’s assign the E elsewhere, such as making E rotate to the K (Rotation-6). This would have a lot of Zs rotating to Fs. So, let’s look at the distance between F and K. They are 5 apart. As such, one should look for 2 normally frequent letters that are 5 apart (such as D & I, I & N, N & S, or O & T). So, it would be most likely that Set #3 is Rotation-2 (C), Rotation-23 (W), Rotation-18 (S), or Rotation-17 (R).

Set #1 is very likely Rotation-15 (P)
Set #2 is very likely not rotated (A)
Set #3 is very likely Rotation-18 or 17 (S or R)
Set #4 could be Rotation-8 (I): T→B and missing letters are normal
Set #5 could be Rotation-18 (S): odd that nothing maps to S or T here.

As such, the likeliest keyword to try is ‘PARIS’.
Lo and behold:
THE GEOCACHE IS LOCATED UNDER A LOG AT A PROJECTION OF FORTY YARDS FROM THE FAKE COORDINATES AT A BEARING OF TRUE THREE ONE SIX.

Keep in mind: If the keyword is actually a word, then it too suffers from the weaknesses of letter frequency analysis. Furthermore, don’t forget to not just play around with single letters, but to also play around with letter differences which are maintained under rotations.


There are actually some faster ways to find some keywords. One is to check the ciphertext against a shifted copy of itself, looking for abnormally high levels of coincidence (much more than 1/26 = 3.85%). When this happens, chances are that you’ve aligned the same sums from the same frequent plaintext letters added to the same keyword letters.

CIPHERTEXT
   CIPHERTEXT
   ----*-*      2/7 = 28.6% > 3.85%

When this sample word is shifted by 3 positions, the likelihood of coincidence skyrockets. As such, this would imply a 3-letter keyword.

Note: when both enciphering steps have clearly repetitive structures, the result will still have repetitive structure that can be exploited too.
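The slide-and-compare check is easy to automate for every shift at once. This is only a sketch; the names coincidences and report are assumptions for this illustration.

```python
def coincidences(text, shift):
    """Count positions where the text matches a copy of itself slid by `shift`."""
    pairs = list(zip(text, text[shift:]))
    matches = sum(1 for a, b in pairs if a == b)
    return matches, len(pairs)

def report(text, max_shift=10):
    """Print the match rate for each shift; random chance is about 3.85%."""
    for shift in range(1, max_shift + 1):
        matches, total = coincidences(text, shift)
        print(shift, matches, total, round(100 * matches / total, 1))

cipher = ("IHVOWDCRKZTIJTGRAKMVJNUMJPLFOSIAGZGYETBADNFNXDRKGQPRUAXGODBZ"
          "TFRSWROFZVXNRBWHAKITTAIQFVOWBJJEKPJTEFVWHIO")
report(cipher)
```

Shifts whose percentage stands well above the rest are candidates for the keyword length (or a multiple of it). Short texts give noisy results.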


As it turns out, for our ciphertext, the slide-and-compare method just shown yields only modest fruit for a 5-letter slide: 6 matches over 98 paired letters, or about 6.1%. That is above the 3.85% of random chance, but with so few letters it is hardly a skyrocket.

Either way, let’s say that we are now looking for a 5-letter keyword. Another tool that would be useful is to assign a weight to each rotation of each subset.

That is: Set #1 begins IDTRJPIY… Its rotations would be 1: JEUSKQJZ…, 2: KFVTLRKA…, 3: LGWUMSLB…, and so on. Then, assign to each rotation a value equal to the sum of its weighted letters (using the chart below). Much higher totals are very likely rotations for that subset. (Example: ‘IDTRJPIY’ itself equals ‘8+7+9+8+1+6+8+6’ = ‘53’)

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
8 4 7 7 9 6 5 7 8 1 2 7 6 8 8 6 2 8 8 9 6 5 5 3 6 0
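Scoring all 26 rotations of a subset by hand is slow, so here is a minimal sketch using the weight chart above. The names score, rotate, and rank_rotations are made up for this example.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
WEIGHTS = dict(zip(ALPHABET, [8, 4, 7, 7, 9, 6, 5, 7, 8, 1, 2, 7, 6,
                              8, 8, 6, 2, 8, 8, 9, 6, 5, 5, 3, 6, 0]))

def rotate(text, shift):
    """Rotate every letter forward by `shift` places."""
    return "".join(ALPHABET[(ALPHABET.index(ch) + shift) % 26] for ch in text)

def score(text):
    """Sum of the chart weights of the letters."""
    return sum(WEIGHTS[ch] for ch in text)

def rank_rotations(subset):
    """Return (score, rotation) pairs, best score first."""
    return sorted(((score(rotate(subset, r)), r) for r in range(26)),
                  reverse=True)

for total, rotation in rank_rotations("IDTRJPIY")[:3]:
    print(rotation, total, rotate("IDTRJPIY", rotation))
```

A high-scoring rotation r turns the subset back into likely plaintext, so the matching keyword letter is the one worth (26 - r) % 26.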


One more possible tactic to apply to Vigenère-like ciphers is to keep subtracting likely plaintext words from various starting positions. For example, the word ‘the’ is very likely to appear (and reappear). The result of all such subtractions will produce an array consisting of some very unlikely trigraphs and some very likely trigraphs. You should look at the starting positions of the most likely trigraphs to see if the positions all share the same remainder when divided by a particular number. That particular number would be the likely length of your keyword and those likely trigraphs would yield strong clues for the actual rotations used in the keyword.

Note: these last few tactics are really just variations on the same exploitation. You can subtract likely words to get likely keyword fragments. You can do a shift-analysis to fish out the length of the keyword. You can do a letter frequency analysis to eliminate many rotation possibilities. A keyword that is itself a word is subject to letter frequencies too.


Lesson 3B: Vigenère Ciphers

One of the improvements upon the Vigenère cipher is to make it an autokey cipher. That is, a keyword will get the enciphering started, but then rather than repeating the keyword, a shift-copy of the ciphertext itself will be used as the keyword.

Plaintext:        G   O   T   O   T   H   E   C   O   R   N   E   R
                  6  14  19  14  19   7   4   2  14  17  13   4  17
Keyword:          P   A   R   I   S   V   O   K   W   L   C   S   M
                 15   0  17   8  18  21  14  10  22  11   2  18  12
(add above)      21  14  36  22  37  28  18  12  36  28  15  22  29
(subtract 26s)   21  14  10  22  11   2  18  12  10   2  15  22   3
Ciphertext:       V   O   K   W   L   C   S   M   K   C   P   W   D
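Here is a minimal sketch of the autokey idea as described above, where the key continues with the ciphertext itself (keyword row P A R I S V O K W L C S M). It assumes uppercase A-Z only; the function names are hypothetical.

```python
def autokey_encipher(plaintext, keyword):
    """Vigenère-style addition where the key continues with the ciphertext."""
    key = list(keyword)
    ciphertext = []
    for position, ch in enumerate(plaintext):
        shift = ord(key[position]) - ord("A")
        enciphered = chr((ord(ch) - ord("A") + shift) % 26 + ord("A"))
        ciphertext.append(enciphered)
        key.append(enciphered)      # each new cipher letter extends the key
    return "".join(ciphertext)

def autokey_decipher(ciphertext, keyword):
    """The whole key is known up front: the keyword, then the ciphertext."""
    key = keyword + ciphertext
    return "".join(chr((ord(c) - ord(k)) % 26 + ord("A"))
                   for c, k in zip(ciphertext, key))

secret = autokey_encipher("GOTOTHECORNER", "PARIS")
print(secret)
print(autokey_decipher(secret, "PARIS"))
```

Notice that the decipherer needs only the keyword to get started, since the rest of the key is sitting in plain sight in the ciphertext.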

Good news for codebreakers: Many of the same tactics used for a regular Vigenère cipher also work for an autokey Vigenère cipher. Here, subtracting likely words is a generally stronger tactic for codebreaking while letter frequencies play more of a supportive role.


Lesson 3C: Vigenère Ciphers

One of the other related ciphers is the Alberti cipher. This behaves a lot like a Vigenère cipher (perhaps in reverse). Where it varies from the traditional Vigenère cipher techniques is that the alphabet is permuted according to an additional keyword. That is, the keyword is written before the alphabet and any repeated letters are removed.

For example: if the keyword is ‘ALBERTICIPHER’ then the alphabet becomes ALBERTICiPHerabcDeFGhiJKlMNOpQrStUVWXYZ, or just simply ALBERTICPHDFGJKMNOQSUVWXYZ.

Here, A=0, L=1, B=2, E=3, R=4, …, Z=25. This is a permutation of the alphabet not present in the straightforward Vigenère cipher.
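Building the permuted alphabet from a keyword can be sketched in a few lines. The name keyed_alphabet is an assumption for this illustration, and the sketch assumes the keyword uses only the letters A-Z.

```python
import string

def keyed_alphabet(keyword):
    """Write the keyword, then the rest of the alphabet, dropping repeats."""
    ordered = []
    for ch in keyword.upper() + string.ascii_uppercase:
        if ch in string.ascii_uppercase and ch not in ordered:
            ordered.append(ch)
    return "".join(ordered)

alphabet = keyed_alphabet("ALBERTICIPHER")
print(alphabet)
# Number the letters by position in the permuted alphabet: A=0, L=1, B=2, ...
values = {letter: number for number, letter in enumerate(alphabet)}
print(values["R"], values["Z"])
```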

Note: The Alberti cipher requires two keywords: an alphabet permutation keyword and a repetition keyword.


Lesson 4: Railfence Cipher

The Railfence cipher is little more than a calculated permutation of the plaintext. Once you see how to encipher using it, you should by this point see how you can create subsets to crack this one.

Sample Plaintext: LOOK UNDER THE THIRD BENCH

Write these letters, including spaces (shown as * in the rails), in a down-and-up pattern for some number of rows (let’s use three to start). Then, keeping the spaces, squish together the rows in order.

L   *   E   H   H   *   C
 O K U D R T E T I D B N H
  O   N   *   *   R   E

So the ciphertext becomes L EHH COKUDRTETIDBNHON  RE (note the double space before RE).

Caution: It might not be advisable to use a double space as we have here. Perhaps this would be a good place to replace any additional spaces with a Q to get L EHH COKUDRTETIDBNHON QRE
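The down-and-up writing can be sketched by first working out which row each position lands on. The names rail_pattern and railfence_encipher are made up for this example.

```python
def rail_pattern(length, rails):
    """Row visited at each position of the down-and-up zigzag."""
    # One full cycle goes down every row, then back up through the middle rows.
    cycle = list(range(rails)) + list(range(rails - 2, 0, -1))
    return [cycle[i % len(cycle)] for i in range(length)]

def railfence_encipher(text, rails):
    """Write the text along the zigzag, then read off row by row."""
    rows = rail_pattern(len(text), rails)
    return "".join(ch
                   for rail in range(rails)
                   for ch, row in zip(text, rows)
                   if row == rail)

print(railfence_encipher("LOOK UNDER THE THIRD BENCH", 3))
print(railfence_encipher("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 3))
```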


We should probably see what happens to the plain alphabet under Railfence before getting into the nitty-gritty of cracking it.

Sample Plaintext: ABCDEFGHIJKLMNOPQRSTUVWXYZ

A   E   I   M   Q   U   Y
 B D F H J L N P R T V X Z
  C   G   K   O   S   W

So the ciphertext becomes AEIMQUYBDFHJLNPRTVXZCGKOSW.

The middle half of the ciphertext is every other letter of the plaintext. Then, the first and last quarters are every fourth letter of the plaintext. So, the real trick is to split the Railfence into manageable parts, and then line them up into a plaintext message.

Caution: Keep in mind that you do have to assume a certain number of rails in order to begin, and that your first choice could be incorrect.


Let’s see that alphabet one more time, but this time under 5 rails.

Sample Plaintext: ABCDEFGHIJKLMNOPQRSTUVWXYZ

A       I       Q       Y
 B     H J     P R     X Z
  C   G   K   O   S   W
   D F     L N     T V
    E       M       U

So the ciphertext becomes AIQYBHJPRXZCGKOSWDFLNTVEMU.

Now, we must be a little more clever. The total number of characters determines how to split up the ciphertext because of the geometry of the process (ex: row 2 will have 1 more letter than rows 3 and 4).

Keep in mind: If you can reverse engineer the alphabet under Railfence, then you can reverse engineer anything under Railfence.


Let’s try to crack one now

Sample Ciphertext: AIQYBHJPRXZCGKOSWDFLNTVEMU

Because there are exactly 26 letters (and if we assume 5 rails), then we already know that the first 4 letters are the top row of the rails. Likewise, we already know that the last 3 letters are the bottoms row of the rails. (Why? 26 / (5-1) = 6 with 2 left over. That 6 is 3 full downs and 3 full ups and the 2 left over are the beginning of a 4th down.)

From there, you should be able to piece it together (7 letters in row 2, 6 letters in both rows 3 and 4). So, we get the follow slices of the ciphertext: AIQY|BHJPRXZ|CGKOSW|DFLNTV|EMU.

Another way to read it: Once you have settled on the number of rails, read the first unused letter of each block, starting with the leftmost block and reflecting back at each end. (From the fourth block, you would read D on the first forward pass, F on the first return, L on the second forward pass, etc.)
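The slicing and the back-and-forth reading can be sketched as follows. The names row_lengths and railfence_decrypt are hypothetical, and the number of rails (at least 2) is still a guess you supply.

```python
def row_lengths(length, rails):
    """How many ciphertext letters belong to each rail, top to bottom."""
    cycle = list(range(rails)) + list(range(rails - 2, 0, -1))
    pattern = [cycle[i % len(cycle)] for i in range(length)]
    return [pattern.count(rail) for rail in range(rails)]


def railfence_decrypt(ciphertext, rails):
    """Slice the ciphertext into rail blocks, then read them along the zigzag."""
    cycle = list(range(rails)) + list(range(rails - 2, 0, -1))
    pattern = [cycle[i % len(cycle)] for i in range(len(ciphertext))]

    # Cut the ciphertext into one block per rail.
    blocks, start = [], 0
    for size in row_lengths(len(ciphertext), rails):
        blocks.append(list(ciphertext[start:start + size]))
        start += size

    # Walk the zigzag, always taking the first unused letter of the block.
    return "".join(blocks[rail].pop(0) for rail in pattern)


if __name__ == "__main__":
    print(row_lengths(26, 5))
    print(railfence_decrypt("AIQYBHJPRXZCGKOSWDFLNTVEMU", 5))
```

For 26 letters and 5 rails, the block sizes come out as 4, 7, 6, 6 and 3, matching the slices above. If the output is gibberish, try a different number of rails.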

Page 45: Geocaching Cryptocaches – An Introduction by tenebrus

Lesson 5A: Miscellaneous Ciphers

A skip cipher is quite simple. One finds a number, N, that is coprime* with the number of letters in the plaintext and then reads every Nth letter starting at the beginning, wrapping around to the front whenever the end is reached.

Sample Plaintext: ABCDEFGHIJKLMNOPQRSTUVWXYZ
Read with a Skip 3: ADGJMPSVY BEHKNQTWZ CFILORUX

Every Skip Cipher can be read using another Skip Cipher (whatever the inverse is). With our example, the inverse skip would be Skip 9 (since 3 × 9 = 27, which is 1 more than 26).

Sample Ciphertext: ADGJMPSVYBEHKNQTWZCFILORUX
Read with a Skip 9: ABC DEF GHI JKL MNO PQR STU VWX YZ

Use the same shift-comparison technique from Vigenère to crack these.

*Definition: Two numbers are coprime when they do not share any prime factors.
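A skip cipher and its inverse can be sketched in a few lines. The names skip_read and inverse_skip are illustrative only, and the three-argument pow with a negative exponent assumes Python 3.8 or later.

```python
from math import gcd


def skip_read(text, skip):
    """Read every `skip`-th letter, wrapping around until all letters are used."""
    n = len(text)
    if gcd(skip, n) != 1:
        raise ValueError("skip must be coprime with the length of the text")
    return "".join(text[(i * skip) % n] for i in range(n))


def inverse_skip(skip, length):
    """The skip that undoes `skip`: the number whose product with it leaves remainder 1."""
    return pow(skip, -1, length)


if __name__ == "__main__":
    alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    scrambled = skip_read(alphabet, 3)
    undo = inverse_skip(3, len(alphabet))
    print(scrambled, undo, skip_read(scrambled, undo))
```

Reading the scrambled text with the inverse skip returns the original text, which is why a skip cipher can always be read with another skip cipher.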

Page 46: Geocaching Cryptocaches – An Introduction by tenebrus

Lesson 5B: Miscellaneous Ciphers

The affine cipher is a family of ciphers that includes the Caesarean rotations (when the multiplier = 1). The premise here is that letters will be given numerical values.

Then, those values will be multiplied by a number (coprime to the size of the alphabet), and then added to (or rotated by) a second number. Whatever the multiplier, each plaintext letter always maps to the same ciphertext letter. As such, it is a straight cryptarithm and is subject to the same codebreaking tactics. (Note how O→V, T→K, E→R, R→E are always true in the example.)

Example: if the multiplier is 3 and the addend is 5, it looks like this:

Plaintext:       G  O  T  O  T  H  E  C  O  R  N  E  R
(as numbers)     6 14 19 14 19  7  4  2 14 17 13  4 17
(tripled)       18 42 57 42 57 21 12  6 42 51 39 12 51
(add 5s)        23 47 62 47 62 26 17 11 47 56 44 17 56
(subtract 26s)  23 21 10 21 10  0 17 11 21  4 18 17  4
Ciphertext:      X  V  K  V  K  A  R  L  V  E  S  R  E

Remember: A=0, B=1, C=2, …, Z=25.
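The multiply-then-add recipe can be sketched as below, using the A=0 numbering. The function names are made up for illustration, and decryption relies on the multiplier being coprime to 26 (the modular inverse via pow assumes Python 3.8 or later).

```python
def affine_encrypt(plaintext, multiplier, addend, modulus=26):
    """Map each letter x to (multiplier * x + addend) mod 26, with A=0 ... Z=25."""
    out = []
    for ch in plaintext.upper():
        if "A" <= ch <= "Z":
            x = ord(ch) - ord("A")
            out.append(chr((multiplier * x + addend) % modulus + ord("A")))
        else:
            out.append(ch)  # leave spaces and punctuation untouched
    return "".join(out)


def affine_decrypt(ciphertext, multiplier, addend, modulus=26):
    """Undo the addend first, then multiply by the inverse of the multiplier."""
    inverse = pow(multiplier, -1, modulus)  # only exists when coprime to the modulus
    out = []
    for ch in ciphertext.upper():
        if "A" <= ch <= "Z":
            y = ord(ch) - ord("A")
            out.append(chr((inverse * (y - addend)) % modulus + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    secret = affine_encrypt("GOTOTHECORNER", 3, 5)
    print(secret, affine_decrypt(secret, 3, 5))
```

With a multiplier of 1 the same function is a plain Caesarean rotation by the addend.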

Page 47: Geocaching Cryptocaches – An Introduction by tenebrus

Lesson 5C: Miscellaneous Ciphers

A Baconian cipher uses two slightly-different fonts. Each block of five ciphertext letters corresponds to one plaintext letter. (Quite inefficient.)

One font will be the ‘A’ font and the other will be the ‘B’ font. Use the table below to decipher from the fonts.

Example: “There are no secrets.”
‘There’ translates to AAABB = ‘D’,
‘are no’ translates to ABBBA = ‘O’, and
‘secre’ translates to AABBA = ‘G’.

Note: some versions assign only one codon for I & J and for U & V.

A  AAAAA     N  ABBAB
B  AAAAB     O  ABBBA
C  AAABA     P  ABBBB
D  AAABB     Q  BAAAA
E  AABAA     R  BAAAB
F  AABAB     S  BAABA
G  AABBA     T  BAABB
H  AABBB     U  BABAA
I  ABAAA     V  BABAB
J  ABAAB     W  BABBA
K  ABABA     X  BABBB
L  ABABB     Y  BBAAA
M  ABBAA     Z  BBAAB
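The table is just the numbers 0 to 25 written in five binary digits with A for 0 and B for 1, so a decoder is short. As a stand-in for the two fonts (an assumption made purely for this sketch), lowercase letters play the ‘A’ font and uppercase letters play the ‘B’ font. The names BACON, font_pattern and bacon_decode are hypothetical.

```python
import string

# 26-letter version of the table: A=AAAAA (0), B=AAAAB (1), ..., Z=BBAAB (25).
BACON = {
    format(i, "05b").replace("0", "A").replace("1", "B"): letter
    for i, letter in enumerate(string.ascii_uppercase)
}


def font_pattern(text):
    """Stand-in for spotting fonts: lowercase counts as 'A', uppercase as 'B'."""
    return "".join("B" if ch.isupper() else "A" for ch in text if ch.isalpha())


def bacon_decode(pattern):
    """Decode a string of As and Bs, five at a time; leftover letters are ignored."""
    pattern = "".join(ch for ch in pattern.upper() if ch in "AB")
    usable = len(pattern) - len(pattern) % 5
    return "".join(BACON[pattern[i:i + 5]] for i in range(0, usable, 5))


if __name__ == "__main__":
    print(bacon_decode("AAABB ABBBA AABBA"))
    print(bacon_decode(font_pattern("theRE aRENo seCRe ts")))
```

Both calls spell the same three-letter word as the example above. Versions that merge I & J and U & V would need a different table.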

Page 48: Geocaching Cryptocaches – An Introduction by tenebrus

Lesson 6A: Square Ciphers

A Polybius square is simple to construct. It is usually a 5x5 with I & J sharing a spot, but it could be a 6x6 in which all 36 alphanumeric characters have their own spot.

In its simplest use, every character is translated into two numerals: the row and the column. For example: ‘BAD’ would become 12 11 14 (for either square).

Of course, this might be too easy to crack.

Page 49: Geocaching Cryptocaches – An Introduction by tenebrus

Lesson 6A: Square Ciphers

To bring Polybius squares to the next level, one can create a permutation of the alphabet before filling in the boxes of the square.

It is fairly standard to select a keyword or two, write them out with the alphabet at the end, and then cross out any letter if it has already appeared in the strand. For example: Keyword = ‘EIFFEL TOWER’.

EIFFELTOWERABCDEFGHIJKLMNOPQRSTUVWXYZ
EIF LTOW RABCD GH K MN PQ S UV XYZ (with I=J)

Now, ‘BAD’ would be 25 24 32.

It should be pointed out, however, that this is still nothing more than a cryptarithm. The 44 (Q) would be very unlikely to show up for this square, but 11, 15, 24, 21, 12, and 42 would be very common (as E, T, A, O, I, N).

    1  2  3  4  5
1   E  IJ F  L  T
2   O  W  R  A  B
3   C  D  G  H  K
4   M  N  P  Q  S
5   U  V  X  Y  Z
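Building the permuted alphabet and reading off row/column numbers can be sketched like this. The names keyed_alphabet and polybius_encode are illustrative, and the sketch covers only the 5x5 square with I and J sharing a spot.

```python
import string


def keyed_alphabet(keyword):
    """Keyword letters first, then the rest of the alphabet, repeats crossed out (I=J)."""
    seen = []
    for ch in keyword.upper() + string.ascii_uppercase:
        if ch == "J":
            ch = "I"  # I and J share one box
        if ch in string.ascii_uppercase and ch not in seen:
            seen.append(ch)
    return "".join(seen)  # 25 letters, filled into the square row by row


def polybius_encode(text, keyword=""):
    """Translate each letter into its row number followed by its column number."""
    alphabet = keyed_alphabet(keyword)
    codes = []
    for ch in text.upper():
        ch = "I" if ch == "J" else ch
        if ch in alphabet:
            row, col = divmod(alphabet.index(ch), 5)
            codes.append(f"{row + 1}{col + 1}")
    return " ".join(codes)


if __name__ == "__main__":
    print(keyed_alphabet("EIFFEL TOWER"))
    print(polybius_encode("BAD"))
    print(polybius_encode("BAD", "EIFFEL TOWER"))
```

An empty keyword gives the unpermuted square, so the same function covers both the simple and the keyed versions.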

Page 50: Geocaching Cryptocaches – An Introduction by tenebrus

Lesson 6B: Square Ciphers

The Playfair cipher uses Polybius squares, but with a twist. Instead of using the numerical rows and columns, it uses digraphs (or letter pairs) to determine other digraphs.

STEP 1: split up the plaintext into digraphs. However, if ever a digraph were to be 2 of the same letter, use X as the second letter and save the replaced letter for the next digraph.

STEP 2: If the plaintext digraph makes two corners of a rectangle, the ciphertext digraph is the other two corners, with the moves being done horizontally (each letter stays in its own row). Otherwise, if they are in the same row or column, both letters are moved one square to the right (same row) or down (same column), wrapping around as necessary.

Page 51: Geocaching Cryptocaches – An Introduction by tenebrus

Lesson 6B: Square Ciphers

Example of Playfair (using the EIFFEL TOWER Polybius square):

Plaintext:      THIS IS THE SECRET MESSAGE
(as digraphs)   TH IS IS TH ES EC RE TM ES SA GE
(converted)     LK TN TN LK TM OM OF ES TM QB CF
Ciphertext:     LKTNT NLKTM OMOFE STMQB CF

First, notice that EC→OM was the special case of conversion. Second, notice that common digraphs (such as TH→LK, IS→TN, and ES→TM) appeared more than once. This means that while a single letter frequency analysis might not be of benefit, a digraph frequency analysis will be.

For example, we could glean that TH→LK will be either a rectangle or in a row (as opposed to all other possible arrangements) in the Polybius square. (Note: H, IJ, K, L could be a likely unpermuted row?)
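The two Playfair steps can be sketched as follows. The function names are hypothetical, and padding a leftover single letter at the very end with X is an assumption of this sketch rather than something stated in the steps.

```python
import string


def keyed_alphabet(keyword):
    """Keyword letters first, then the rest of the alphabet, repeats removed (I=J)."""
    seen = []
    for ch in keyword.upper() + string.ascii_uppercase:
        ch = "I" if ch == "J" else ch
        if ch in string.ascii_uppercase and ch not in seen:
            seen.append(ch)
    return "".join(seen)


def playfair_encrypt(plaintext, keyword):
    square = keyed_alphabet(keyword)  # 5x5 square stored row by row
    letters = [("I" if c == "J" else c)
               for c in plaintext.upper() if c in string.ascii_uppercase]

    # STEP 1: split into digraphs; a doubled letter gets X as its partner
    # and the replaced letter is saved for the next digraph.
    pairs, i = [], 0
    while i < len(letters):
        first = letters[i]
        if i + 1 < len(letters) and letters[i + 1] != first:
            pairs.append((first, letters[i + 1]))
            i += 2
        else:
            pairs.append((first, "X"))  # also pads a lone final letter (assumption)
            i += 1

    # STEP 2: convert each digraph.
    out = []
    for a, b in pairs:
        row_a, col_a = divmod(square.index(a), 5)
        row_b, col_b = divmod(square.index(b), 5)
        if row_a == row_b:        # same row: one square to the right
            col_a, col_b = (col_a + 1) % 5, (col_b + 1) % 5
        elif col_a == col_b:      # same column: one square down
            row_a, row_b = (row_a + 1) % 5, (row_b + 1) % 5
        else:                     # rectangle: swap columns, keep rows
            col_a, col_b = col_b, col_a
        out.append(square[row_a * 5 + col_a] + square[row_b * 5 + col_b])
    return " ".join(out)


if __name__ == "__main__":
    print(playfair_encrypt("THIS IS THE SECRET MESSAGE", "EIFFEL TOWER"))
```

With the EIFFEL TOWER square, every TH in the message comes out as LK, which is exactly the repetition that digraph frequency analysis looks for.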

Page 52: Geocaching Cryptocaches – An Introduction by tenebrus

Lesson 6C: Square Ciphers

The ADFGX cipher also uses a 5x5 Polybius square. The labels A D F G X are applied to the rows and columns, and the digraphs are determined (from their labels). However, this cipher is not done yet. A second keyword (that does not repeat a letter) is used as labels of new columns.

The digraphs are now split into single letters and are entered into the new columns, dropping down to the next row every time one row gets filled. The new columns are then alphabetized by label and the ciphertext will be read down each column in this order.

    A  D  F  G  X
A   E  IJ F  L  T
D   O  W  R  A  B
F   C  D  G  H  K
G   M  N  P  Q  S
X   U  V  X  Y  Z

Page 53: Geocaching Cryptocaches – An Introduction by tenebrus

Lesson 6C: Square Ciphers

Let’s continue with the “EIFFEL TOWER” Polybius square. Let’s also use the keyword “PRIZE” for the new column generation.

If the plaintext were ‘GOTOTHECORNER’, we would get FF DA AX DA AX FG AA FA DA DF GD AA DF.

From there, we use PRIZE as labels for 5 columns.

P R I Z E      E I P R Z
F F D A A      A D F F A
X D A A X      X A X D A
F G A A F      F A F G A
A D A D F      F A A D D
G D A A D      D A G D A
F                  F

Ciphertext: AXFFD DAAAA FXFAGF FDGDD AAADA
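The whole ADFGX process (square, digraphs, keyword columns, alphabetizing) can be sketched in one function. The names are made up for illustration, and the column keyword is assumed to have no repeated letters.

```python
import string


def keyed_alphabet(keyword):
    """Keyword letters first, then the rest of the alphabet, repeats removed (I=J)."""
    seen = []
    for ch in keyword.upper() + string.ascii_uppercase:
        ch = "I" if ch == "J" else ch
        if ch in string.ascii_uppercase and ch not in seen:
            seen.append(ch)
    return "".join(seen)


def adfgx_encrypt(plaintext, square_keyword, column_keyword):
    labels = "ADFGX"
    square = keyed_alphabet(square_keyword)

    # Stage 1: each letter becomes its row label followed by its column label.
    stream = ""
    for ch in plaintext.upper():
        ch = "I" if ch == "J" else ch
        if ch in square:
            row, col = divmod(square.index(ch), 5)
            stream += labels[row] + labels[col]

    # Stage 2: deal the single letters into columns under the second keyword,
    # row by row, so column i receives every len(keyword)-th letter from i.
    key = column_keyword.upper()
    columns = {label: stream[i::len(key)] for i, label in enumerate(key)}

    # Stage 3: alphabetize the columns by label and read down each one.
    return " ".join(columns[label] for label in sorted(columns))


if __name__ == "__main__":
    print(adfgx_encrypt("GOTOTHECORNER", "EIFFEL TOWER", "PRIZE"))
```

The output is grouped by column, so the one column that is a letter taller than the others is easy to spot.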

Page 54: Geocaching Cryptocaches – An Introduction by tenebrus

Lesson 6C: Square Ciphers

One of the easier things to realize is that, by counting the number of letters, one can figure out how tall each column is (possibly confusing which columns have one extra letter). This means that it is possible to speculate on the unalphabetized columns and their digraphs (which are susceptible to digraph frequency analysis).

For a 5-letter keyword, there would be 5! = 5(4)(3)(2)(1) = 120 possible arrangements of columns. For a 9-letter keyword, this number jumps up to 9! = 362,880 possible arrangements of columns (which is no longer convenient for playing around with best hopes).
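The counting argument is simple enough to sketch. The helper name column_heights is hypothetical, and the keyword length is a guess you supply.

```python
from math import factorial


def column_heights(cipher_length, keyword_length):
    """Return (short height, number of columns that are one letter taller)."""
    short, taller = divmod(cipher_length, keyword_length)
    return short, taller


if __name__ == "__main__":
    # 26 ciphertext letters under a guessed 5-letter keyword.
    print(column_heights(26, 5))
    # Number of column orders to consider for 5- and 9-letter keywords.
    print(factorial(5), factorial(9))
```

For 26 letters and a 5-letter keyword, this reports columns of height 5 with one column holding an extra letter. Which column is the tall one is still unknown until the column order is guessed.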

Despite this, if the keyword is N letters long, then every Nth letter (of a column) is part of a digraph and can be studied for frequency. Tedious as it could be, it is very possible to crack an ADFGX code with a 9-letter keyword.

Page 55: Geocaching Cryptocaches – An Introduction by tenebrus

Lesson 6C: Square Ciphers

The ADFGX cipher was augmented to the 6x6 ADFGVX cipher so that numerical data could be sent in numerical form. This functions exactly the same way as the previous version does. It should be noted that most keywords do not include numerals, so the numerals will tend to appear in the V and X rows. Most coded messages do not include numerals either, so the messages will tend not to use the V and X rows. This has implications for the frequency of the digraphs and, more importantly, means that Vs and Xs are more likely to be second letters than first letters in digraphs. This helps slightly with the determination of the alphabetizing keyword.

    A  D  F  G  V  X
A   A  B  C  D  E  F
D   G  H  I  J  K  L
F   M  N  O  P  Q  R
G   S  T  U  V  W  X
V   Y  Z  0  1  2  3
X   4  5  6  7  8  9

Page 56: Geocaching Cryptocaches – An Introduction by tenebrus

Lesson 7: Base64 Ciphers

A base64 cipher is essentially a computer-based cipher, but you actually have all of the tools necessary to do it by hand. First, the alphanumeric characters are translated into binary using the 8-bits-in-a-byte ASCII/UTF-8 encoding. Then, instead of keeping those 8-bit bytes, every 3 bytes (24 bits) are split into 4 blocks of 6 bits each. Finally, each block is converted back to decimal and then translated using the standard base64 cipher encoding.

A=0, B=1, C=2, …, Z=25, a=26, b=27, c=28, …, z=51, ‘0’=52, ‘1’=53, ‘2’=54, …, ‘9’=61, ‘+’=62, ‘/’=63

Example: ‘Dogs’ (first three letters)
Plaintext:     D        o        g
ASCII/UTF-8    068      111      103
Binary         01000100 01101111 01100111
Regrouped      010001 000110 111101 100111
Base64 table   17     6      61     39
Ciphertext:    R      G      9      n

Note: every step is easily reversible.
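The by-hand procedure can be sketched and checked against Python's built-in base64 module. The name base64_by_hand is made up, and the zero-bit fill and trailing ‘=’ padding for inputs that are not a multiple of 3 bytes follow the standard base64 convention rather than anything described above.

```python
import base64
import string

# A-Z = 0-25, a-z = 26-51, 0-9 = 52-61, '+' = 62, '/' = 63
B64_TABLE = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def base64_by_hand(text):
    # Step 1: characters to 8-bit binary (ASCII).
    bits = "".join(format(byte, "08b") for byte in text.encode("ascii"))
    # Standard convention: fill the last block with zero bits if it is short.
    bits += "0" * (-len(bits) % 6)
    # Step 2: regroup into 6-bit blocks.
    blocks = [bits[i:i + 6] for i in range(0, len(bits), 6)]
    # Step 3: each block back to decimal, then through the table.
    encoded = "".join(B64_TABLE[int(block, 2)] for block in blocks)
    # Standard convention: pad the output with '=' to a multiple of 4 characters.
    return encoded + "=" * (-len(encoded) % 4)


if __name__ == "__main__":
    for word in ("Dog", "Dogs"):
        by_hand = base64_by_hand(word)
        library = base64.b64encode(word.encode("ascii")).decode("ascii")
        print(word, by_hand, by_hand == library)
```

Going the other way (table lookup, 6-bit blocks, regroup into bytes) is the same steps in reverse, or base64.b64decode if a computer is at hand.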

Page 57: Geocaching Cryptocaches – An Introduction by tenebrus

In conclusion, cryptocaches are varied but at the same time not. Many ciphers have their roots in cryptarithms. As such, letter frequencies, digraph frequencies, and shift-comparisons are regularly-employable tools.

Just taking a good guess at a keyword or plaintext word and then working backward is also a strong tactic for some ciphers.

Don’t forget your standard computer encodings such as ASCII/UTF-8 and the base64 tables.

And ultimately: practice, practice, practice.

Page 58: Geocaching Cryptocaches – An Introduction by tenebrus

Metadata

This document was authored by tenebrus for the geocaching community at large.

It is available for reproduction as is. It is not available for profitable use. It is reproducible for both individual use and free instructional use.

Any questions regarding this document or requests for alternate use should be made through http://geocaching.com/ to tenebrus’ profile.

by tenebrus (Jason Mutford)
version 1.01