an elegant hoax - a possible solution to the voynich manuscript



ptologia: AN ELEGANT HOAX? A POSSIBLE SOLUTION TO... http://findarticles.com/p/articles/mi_qa3926/is_200401/ai_n9362025/print

5 19/09/2007 19:36

FindArticles > Cryptologia > Jan 2004 > Article > Print friendly 

AN ELEGANT HOAX? A POSSIBLE SOLUTION TO THE VOYNICH MANUSCRIPT

Rugg, Gordon

ABSTRACT: The Voynich manuscript is a substantial document in what appears to be ciphertext, which has resisted decipherment since its appearance around 1600. It has long been suspected that the Voynich manuscript is a hoax; however, the linguistic complexity of the manuscript has previously been considered good reason for rejecting the hoax hypothesis. The manuscript also contains many unusual linguistic features, and previous research has failed to produce a plausible mechanism for generating substantial bodies of text with these features.

This article describes how sixteenth century cryptographic techniques can be adapted to generate text similar to that in the Voynich manuscript. This method can be used either to generate gibberish for a hoax, or to encode plaintext in a decodable cipher. Preliminary results suggest that a document the size of the Voynich manuscript could be produced by a single hoaxer in two or three months. It is concluded that the hoax hypothesis is now a plausible explanation for the Voynich manuscript.

KEYWORDS: Voynich manuscript, tables, steganography, Cardan grille, hoax.

INTRODUCTION

The Voynich manuscript is well described in various places, including Mary D'Imperio's classic book [3], and the Web sites of Jorge Stolfi [7], Philip Neal [5] and Rene Zandbergen [9]. The manuscript is some two hundred and fifty pages long and is extensively illustrated. It is written in a script not found in any other document; the manuscript appears from the illustrations to contain sections on topics such as herbs, astronomy and biology. Many of the illustrations are without precedent - for instance, strange plants, or complicated systems of what looks like plumbing or intestinal tracts populated by bathing women. There is no punctuation in the manuscript, and no indication of sentences, though paragraphs do occur. The text is left justified, but where it encounters illustrations in the middle of a page, the text is normally also right-justified against the illustration.

The unique illustrations and script make it difficult to establish a provenance for the manuscript. The few indicators suggest a European origin, around the last quarter of the fifteenth century. It is likely that the manuscript was purchased by Rudolph II around 1586; the first reasonably firm dating evidence is from 1608 or shortly after. The manuscript disappeared soon after its appearance, until its rediscovery by Voynich in 1912. There is circumstantial evidence that the manuscript was sold to Rudolph by the Elizabethan scholar John Dee and his disreputable associate Edward Kelley, who has always been considered a prime suspect for hoaxing the manuscript.

The main theories for the manuscript's origins all encounter serious problems. It is implausible that a scholar several centuries ago could have devised a cryptographic system so good that it has resisted the best modern cryptographers for almost a century, when no other early coding system has taken modern researchers more than a few days to crack. It is unlikely that the manuscript is a plaintext in an unidentified language, since the manuscript contains linguistic features unlike those in any known human language, such as the most common words often being repeated two or three times (roughly equivalent to "the the" in English). Similar arguments can be made against the "artificial language" explanation.

There are obvious attractions in the "hoax" theory, but the manuscript exhibits so much linguistic structure that a hoax appears to require almost as much sophistication as an unbreakable code. It has been generally assumed that a hoax showing these features would take an enormous amount of time to generate, and would not be economically viable for a hoax perpetrated for financial gain, even if it was technically possible. Most researchers have therefore abandoned the hoax hypothesis.

Various purported decipherments have been proposed, but none has been generally accepted (a good summary is provided in Zandbergen [9]). It has also been speculated that the manuscript may be the result of bad, rather than good, encryption - for instance, that the creator made so many mistakes in coding that the manuscript is indecipherable, or that the creator used a one-way encoding system. This argument is inconsistent with the linguistic regularities in the manuscript, described below.

The main distinctive features of the language of the manuscript, usually known as Voynichese, are as follows. The manuscript contains at least two different "dialects", known as Voynich A and B; there are clear differences in handwriting between the herbal sections written in A and B dialects [2]. The two "dialects" differ markedly in the relative frequencies of various syllables and characters and also in word lengths. The other sections show features both of A and B to varying degrees. The language is highly repetitive, with identical or very similar words often occurring next to or near each other. Character and word frequencies are non-random, and there are combinations of characters which never occur, even though these characters may be individually common. There are constraints on where characters can occur within a word and within a line (e.g., [5, 7]). Words show considerable internal structure, as described in more detail below (e.g., [5, 7]). The line forms a distinct linguistic unit: some characters tend to occur only at the start or at the end of a line. Words towards the end of a line tend to be shorter.

Previous statistical analyses of the text include entropy measures (e.g., [1]), spectral analysis [4], word frequency counts (used by most Voynich manuscript researchers) and letter serial correlation [6]. Word lengths form a binomial distribution, which is extremely unusual among human languages [7]. On all of these, the manuscript's language is at or beyond the extreme values for the human languages used as comparisons, and also consistently different from randomly generated text, from poetry and highly repetitive magical incantations, and from various pathological speech forms such as schizophrenic rants.

 An example of text in the EVA (European Voynich Alphabet) transliteration is as follows. It is from folio 78r, which has particularly clear writing.

From f78r, para 1, lines 1-5:

tshedor shedy qopchedy qokedydy qokoloky
qokeedy qokedy shedy tchedy otarol kedy dam
qckhedy cheky dol chedy qokedy qokain olkedy
yteedy qotal dol shedy qokedar chcthey otordoror
qokal otedy qokedy qokedy dal qokedy qokedy skam

The sample shows "m" in its usual position at the end of a line, and a rare example of "q" followed by "ckh" instead of the usual "o". The repetitive nature of the text is apparent. This repetitiveness derives partly from similarity between words, and partly from structure within words. So, for instance, "qokedy" is common throughout the manuscript, but "dykeqo" and "kedyqo" do not occur. A detailed account is given by Neal [5], with particular reference to the restrictions on which characters and syllables can and cannot occur in sequence.

It is possible to produce something superficially similar to Voynichese by turning each word of plaintext into an alphabetically ordered anagram. However, this does not result in sets of characters which may be substituted for each other throughout the word. Both the ordering of characters, and the sets of possible substitutions, show a degree of structure in Voynichese which is also difficult to reconcile with the explanation that the manuscript is the product of sloppy encoding in an otherwise unexceptional system. Although Neal has produced some interesting work in this area [5], attempts to find an encoding system (whether one-way or two-way) which produces both these features appear to have been unsuccessful to date.

In summary, Voynichese appears not to be a normal human language or a plausible artificial language, but it appears too complex to be a hoax. Some features, such as the presence of Voynich A and B, are difficult to reconcile with any previous theory. This article describes an attempt to produce text similar to that in the manuscript, using techniques available to Kelley, the prime suspect for producing the manuscript.

METHOD

The starting point for this study was the set of techniques available to Dee and Kelley, which are documented in Dee's diary [8]. These techniques were also available to other scholars of the time. One unique outcome of Dee and Kelley's work, however, is that Dee and Kelley used tables of characters (about 40 rows by 40 columns) to generate a language called Enochian. This had obvious implications for any attempt to produce a hoax using sixteenth century techniques: although Enochian is very different in structure from Voynichese, it was possible that the two might have been produced using similar methods.

The number of rows in Dee and Kelley's tables is near the maximum number of lines of text in a page of the manuscript. This led to the suspicion that a similar table could have been used to generate a page of text at a time, if a syllable was written in each cell. A table of 40 rows and 39 columns would then generate 40 lines, each consisting of 13 three-syllable words - about the length of a full line of text in the manuscript.
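The arithmetic behind these figures can be checked with a short sketch. Only the 40-by-39 dimensions and the three-syllable word structure come from the text; the function name and the P/M/S column labels (prefix, midfix, suffix of word n) are illustrative assumptions.

```python
def table_layout(n_rows=40, n_cols=39, syllables_per_word=3):
    """Return (lines per table, words per line, column labels)."""
    if n_cols % syllables_per_word:
        raise ValueError("columns must divide evenly into words")
    words_per_line = n_cols // syllables_per_word
    # Hypothetical labels: P1, M1, S1, P2, M2, S2, ...
    labels = [f"{kind}{word}"
              for word in range(1, words_per_line + 1)
              for kind in "PMS"]
    return n_rows, words_per_line, labels


lines, words, labels = table_layout()
print(lines, words, labels[:6])
```

One row of the table yields one line of text, so a single pass over the table gives 40 lines of 13 words each.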

One table of this sort would produce enough text for about eight pages of the herbal section, if it was only used once. There would be obvious advantages in re-using the table. One way of doing this is to use a modification of the Cardan grille, introduced by Girolamo Cardano in 1550. This involves using a piece of card containing slots which reveal cells of the table. Different permutations of slots will generate different text from the same table.

This study used a table populated with syllables derived from Stolfi's analysis of one section of the manuscript [7]. It is possible to populate a table with individual characters, rather than syllables, but syllable-based tables are simpler to describe, so the rest of this article refers to syllable-based tables unless otherwise specified. A sixteenth century hoaxer could have generated syllables for the table in various ways, including a character-based table.

Figure 3 below shows a section of this table in EVA transliteration. The table also contains labels to help explanation. The top row contains column labels. The second row indicates whether the column is for a prefix ("p"), midfix ("m") or suffix ("s"), and gives the number of the word which the three columns together would generate. The first column contains row numbers.

This table section would generate lines consisting of three words. Row 2, for instance, could be read horizontally to give the line "qochedy qoky otdy". The blank cells are intentional; in Stolfi's analysis, Voynichese words consist of three infix slots, of which one or two may be empty; these "empty" infixes can be replicated using blank cells.

The grille was a small piece of card, containing three slots corresponding to sequential columns, and with each slot on a different row, as described above. So, for instance, the grille might have one slot which would show the cell in column P1, row 1; the next slot might show the cell in column M2, row 2; the third might show the cell in column S3, row 1, as in Figure 5. Another grille would have the slots in a different permutation, as illustrated in Figure 6.

The grille is moved across the table from left to right, three columns at a time; the three slots show a prefix, midfix and suffix respectively. The resulting word is transcribed, and the grille is moved. Doing this systematically leads to regular sequences of syllables - for instance, each row of the table always generates the same sequence of prefixes, though in combination with different midfixes and suffixes. It is therefore necessary to introduce variance (for instance, by moving up or down a row at intervals) to break these sequences. For clarity, the worked example below does not include this, and describes simple left-to-right text generation; in addition, only grilles with three slots are discussed, though other arrangements are possible (e.g., six slots, to generate two words at a time).

We start by placing grille 1 at the top left corner of the table. This shows cells A1, B2 and C1, i.e. "y", "che" and an empty cell. Moving the grille from left to right across the table, three columns at a time, we produce the following text:

 yche lkaiin qotaiin

 We now move the grille back to the left of the table, down one row, and repeat the process. The next line we produce is:

qotdy qoshey ochedy 

Using the second grille in the same way, however, we produce the following text from the same rows of the table:

cheekdy ochey qokdy 

qochedy qokody ty 

The two grilles produce different, but similar-looking text from the same table. Once the table and grilles have been set up, which can be done in two or three hours, this method produces text as fast as it can be transcribed. The total time taken to illustrate a replica page, generate text and transcribe it was consistently between one and two hours for a normal page.
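The left-to-right procedure with grille 1 can be sketched in a few lines of Python. The table here is a hypothetical reconstruction, not a copy of Figure 3: its cells were chosen so that the output matches the lines quoted in the text, and cells the text does not determine are filled with a "?" placeholder. The names TABLE, GRILLE_1, ROW_READ and generate_line are illustrative.

```python
# Hypothetical table section: 3 rows by 9 columns (prefix, midfix, suffix for
# each of three words). "?" marks cells that the text does not determine.
TABLE = [
    ["y",  "?",   "",   "l",  "?",   "aiin", "qo", "?",   "aiin"],
    ["qo", "che", "dy", "qo", "k",   "y",    "o",  "t",   "dy"],
    ["?",  "t",   "?",  "?",  "she", "?",    "?",  "che", "?"],
]

# A grille is modelled as one row offset per slot (prefix, midfix, suffix).
GRILLE_1 = (0, 1, 0)   # prefix on the top row, midfix one row down, suffix on top
ROW_READ = (0, 0, 0)   # plain horizontal reading, for comparison


def generate_line(table, grille, top_row):
    """Slide the grille left to right, three columns at a time."""
    words = []
    for first_col in range(0, len(table[top_row]), 3):
        syllables = (table[top_row + row_offset][first_col + slot]
                     for slot, row_offset in enumerate(grille))
        words.append("".join(syllables))
    return " ".join(words)


print(generate_line(TABLE, GRILLE_1, 0))  # yche lkaiin qotaiin
print(generate_line(TABLE, GRILLE_1, 1))  # qotdy qoshey ochedy
print(generate_line(TABLE, ROW_READ, 1))  # qochedy qoky otdy
```

A second grille is simply a different tuple of row offsets applied to the same table. The shape of grille 2 is not recoverable from the text, so it is not modelled here.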

Syllable Co-occurrences


A key feature of this method is that restricting the grille format allows constraints on co-occurrences of syllables in the table. If grilles never have two successive slots in the same row, then two syllables in adjacent cells of the table will never co-occur as generated text. This can be used deliberately, to prevent a chosen combination from occurring. It can also happen as an accidental side-effect. For instance, the tables might be populated in a fairly systematic way (e.g., a "qo" in every fourth prefix cell, a "che" in every eighth midfix cell, etc). In this example, every "che" would be next to a "qo", so the two would never co-occur in text generated using this table. The partially completed table below shows several forms of possible constraint.

With this table format, the syllables "che" and "she" are equally common. However, if used with grilles where no two successive slots are on the same row, then this table would never produce a word beginning "qoche", whereas words beginning "qoshe" would be common. An intermediate effect is visible with the syllables "tee" and "kee" which occur in every sixth cell in the table; half of the "tee" occurrences are next to a "qo" and half are not, so words beginning "qotee" would be only half as common as words beginning "qokee", even though the syllables "kee" and "tee" would be equally frequent in other contexts.
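The "qoche" versus "qoshe" effect can be reproduced with a minimal sketch. It assumes an eight-row table with one prefix and one midfix column, filler syllables "o" and "k", and a small grille whose slots span at most three rows; none of these details comes from the source table, and the function name is illustrative.

```python
from itertools import product

ROWS = 8
# "qo" in every fourth prefix cell; "o" is an arbitrary filler syllable.
prefix = ["qo" if row % 4 == 0 else "o" for row in range(ROWS)]
# "che" and "she" each once per eight midfix cells; "k" is arbitrary filler.
midfix = ["k"] * ROWS
midfix[0] = "che"   # always in the same row as a "qo"
midfix[5] = "she"   # never in the same row as a "qo"


def word_starts(max_offset=2, allow_same_row=False):
    """Collect every prefix+midfix pair that some grille position can show."""
    starts = set()
    for p_off, m_off in product(range(max_offset + 1), repeat=2):
        if p_off == m_off and not allow_same_row:
            continue  # successive slots must sit on different rows
        for top in range(ROWS - max_offset):
            starts.add(prefix[top + p_off] + midfix[top + m_off])
    return starts


print("qoche" in word_starts())                     # False
print("qoshe" in word_starts())                     # True
print("qoche" in word_starts(allow_same_row=True))  # True
```

In this toy table the restriction depends on the grille being small: a grille tall enough to pair the "qo" in row 5 with the "che" in row 1 would reintroduce "qoche".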

Restrictions can also be imposed in other ways. One is to use colour coding. This can be done by (for instance) populating the table with all the "qo" prefixes and all the "dy" suffixes with blue ink, then adding a different set of prefixes and suffixes in red ink, and finally adding the remaining syllables in black ink. The syllables in black are then treated as "neutral" and can co-occur with blue or with red, but blue and red syllables in the same word are forbidden. If the grille generates a word which contains a "blue" prefix, a "black" midfix and a "red" suffix, the user transcribes the prefix and the midfix, but then moves the grille further across the table until a blue or black suffix is encountered, and then transcribes that suffix. The final manuscript will obviously be written in one colour only.
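The colour rule amounts to a small filter applied while transcribing. The sketch below covers only the case described in the text (a clash detected at the suffix). The colours given to "ol" and "ain", and the function names, are assumptions made for illustration.

```python
BLUE, RED, BLACK = "blue", "red", "black"


def clashes(colours):
    """True if a word would mix blue and red syllables."""
    return BLUE in colours and RED in colours


def transcribe(prefix, midfix, suffixes):
    """prefix and midfix are (text, colour) pairs; suffixes lists the suffix
    cells met, in order, as the grille moves further across the table."""
    for text, colour in suffixes:
        if not clashes({prefix[1], midfix[1], colour}):
            return prefix[0] + midfix[0] + text
    return None  # reached the table edge without a usable suffix


# "qo" and "dy" are blue as in the text; the other colours are hypothetical.
word = transcribe(("qo", BLUE), ("k", BLACK),
                  [("ol", RED), ("ain", BLACK), ("dy", BLUE)])
print(word)  # qokain
```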

Tabular Representation of the Manuscript

The "table and grille" explanation predicts that, other things being equal, words produced by the same column of a grille will be more likely to resemble each other than words produced by different columns. For example, the third words of each line will probably be more similar to each other than to other words. Two confounding factors are the first lines of paragraphs, which often contain statistically anomalous words, and the common words "dain" and "daiin", which also show statistical anomalies. If we remove these from text in the Voynich manuscript and then align the remainder into tabular form, with each word divided into four syllables (including "empty" syllables), we find the following. For reasons of space, only the first four words are shown, but the others show similar patterns. The two examples below come from different parts of the manuscript.

Although there is a risk of seeing patterns where none exist, the regularity in the examples above is striking, particularly for the second words of lines in paragraph 2. The table below shows the first four words for lines from folio 78r; a similar regularity is visible, but with some noticeable differences in which syllables are represented (e.g., the third syllables, where "ol" and "e" have very different frequencies between the two folios), reflecting the two main "dialects" of Voynichese.

DISCUSSION

The same table used with different grilles produces different sets of text, similar to Voynichese in terms of significant linguistic features. In particular, it shows restrictions on where each character can occur in a word, and restrictions on which characters can co-occur.

The main linguistic features of text produced using this method are discussed below.

Edge effects

This method produces various edge effects. A single occurrence of a character in a table can only happen at one place in the table; the implication is that rare characters will have strict restrictions on where they occur in a line. The obvious example is the character "m" which in the Voynich manuscript almost always occurs at the end of a line. (The full story of "m" is too complex to discuss in detail here, but the "edge of table" effect could explain a significant proportion of its occurrences.) Subtler distributions will also occur, such as a tendency for some syllables to be more frequent at the beginning of a line or in the middle of a line.

A less obvious edge effect involves the tops and bottoms of tables. The slots in the grilles are at different heights, so for the top row of a table, the grille has to be aligned with the highest slot on the top row (otherwise it would protrude off the top of the table). If, for instance, the highest slot is the midfix slot, then that will be aligned with the top row. The problem is that in this example, the prefixes and suffixes in the top row will not be used by this grille - it will start with rows further down the table. This means that cells in the top one or two rows will not be used by some grilles, which will lead to skewed distributions for any rare characters which occur in those cells.
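The top-of-table effect is easy to see by listing which cells a given grille can ever reveal. In this sketch a grille is modelled as three row offsets measured from its highest slot, and the four-row table height is arbitrary; both are assumptions made for illustration.

```python
def cells_used(grille, n_rows):
    """Return the (row, slot) cells a grille reveals over a whole table.

    Slots are 0 = prefix, 1 = midfix, 2 = suffix; the grille must stay
    entirely on the table, so its highest slot is aligned with the top row.
    """
    height = max(grille)
    used = set()
    for top in range(n_rows - height):
        for slot, offset in enumerate(grille):
            used.add((top + offset, slot))
    return used


# Midfix slot highest; prefix and suffix slots one row lower.
used = cells_used((1, 0, 1), n_rows=4)
print((0, 0) in used)  # False: top-row prefixes are never read
print((0, 1) in used)  # True: top-row midfixes are read
print((3, 1) in used)  # False: bottom-row midfixes are never read
```

The same grille also skips midfixes in the bottom row, which is the mirror-image effect at the bottom of the table.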

There are various ways of handling this, but each means that text generated at the top or bottom of the table will be generated differently from text produced elsewhere in the table. This effect may be obscured on manuscript pages which only include a few lines of text, if a forty-line output is divided across several pages - only one of these pages would then show "top of table" effects, and a different page would show "bottom of table" effects, while others which used text from the middle of the table would show no such effects. It would be possible to mask this effect on longer pages by producing lines in some form of staggered sequence, as speculated by Neal [5] - for instance, generating and writing lines 1, 3, 5, etc. first, then generating lines 2, 4, 6 etc. and writing them in the gaps left between the previous lines on the manuscript page. In addition, it is possible to alternate between using two or more tables when producing text, to mask regularities. For instance, the text from Folio 36r above hints at the use of two tables, one mainly containing "qo" as an initial syllable, and another containing rarer initial syllables such as "yt" and "ot" (though it would be unwise to read too much into this, since a single table could contain very different syllables).

 Word lengths and distributions

The table can be populated with shorter syllables in the rightmost columns, and also with blank cells (as per Stolfi's analysis) more frequent towards the right of the table. This produces shorter words towards the end of a line, which can be useful for producing neat right justification when the text abuts an illustration, as often happens in the herbal section.
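A toy table shows how blank cells concentrated on the right shorten the later words of a line. The syllables "qo", "ke" and "dy", the table size and the blank-cell pattern are all arbitrary assumptions, and rows are read straight across (no grille) to keep the sketch short.

```python
def build_table(n_rows=8, n_words=4):
    """Table in which the midfix of word w is blank in w out of n_words rows."""
    table = []
    for row in range(n_rows):
        cells = []
        for word in range(n_words):
            midfix = "" if row % n_words < word else "ke"
            cells += ["qo", midfix, "dy"]
        table.append(cells)
    return table


def mean_word_lengths(table, n_words=4):
    """Mean length in characters of the word at each position in the line."""
    means = []
    for word in range(n_words):
        lengths = [len("".join(row[3 * word: 3 * word + 3])) for row in table]
        means.append(sum(lengths) / len(lengths))
    return means


print(mean_word_lengths(build_table()))  # [6.0, 5.5, 5.0, 4.5]
```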

Different ways of filling in the table (e.g., systematic versus semi-random) will produce different distributions for the words. This has particular implications for word length distributions, and for statistical features of the text such as letter serial correlation values. I am grateful to Mark Perakh for help with my preliminary work in this area. The statistical properties of generated text in relation to entropy values, etc., will be affected by the structure of the tables, the structure of the grilles and the algorithm for moving the grille around the table. All of these have numerous permutations, so investigating this is likely to take time. The limited preliminary results suggest that random and weakly structured tables produce text which does not have the same letter serial correlation scores as Voynichese. Perakh's original work [6] suggested that one possible explanation for the manuscript was a highly structured form of artificial gibberish, so the next area of investigation will be highly structured tables.


The repetition of common words is another feature of this method. The likelihood of a word being repeated is a product of its frequency - the commonest words are more likely to be repeated than rarer words. Words containing common syllables are similarly more likely to co-occur than words containing rarer ones.

Labels and names

There are reasons for believing that each page in the herbal section begins with a different plant name or label [7]. Unique names and statistically anomalous words are identifiable elsewhere throughout the manuscript. These could be simply generated using a separate table and/or grille. This would be extra effort for a hoaxer, but is explicable if a hoaxer expected the manuscript to be scrutinised by an expert cryptographer, who would treat the apparent proper nouns as a potential way into the document (and, by the same reasoning, might become suspicious if apparent proper nouns were no different from the body text, or were absent).

 Voynich A and B

More speculatively, this method also produces a possible explanation for the much-discussed two "languages" in the VMS. Although the distinction between Voynich A and B at their most extreme is fairly clear, it is significant that Currier's original article and subsequent work are not able to establish this distinction for all sections - Currier refers to some sections as showing features both of A and B, for instance [2]. This sort of spectrum is not consistent with natural language, unless the manuscript was produced by several people, all of whom spoke slightly different dialects. It is, however, consistent with text generation from a series of tables, each slightly modified from a predecessor - Table A or Table B, for instance, followed by Table A2 and Table B2. A hoaxer would be starting to run short of combinations for grilles after some tens of pages. An obvious solution would be to generate a new table for the next section of the manuscript.
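The idea of a series of tables, each derived from a predecessor, can be sketched in a few lines. The code below is an illustration under stated assumptions rather than a reconstruction of any actual table: it assumes a table whose rows hold prefix, middle and suffix cells, and it derives a new table by replacing a chosen number of cells. The syllables, the pool of spare syllables and the function names are all hypothetical.

```python
import random

# Hypothetical syllable table: each row holds (prefix, middle, suffix) cells.
# The syllables are invented placeholders, not taken from the manuscript.
TABLE_A = [
    ["qo", "ke", "dy"],
    ["o", "te", "y"],
    ["ch", "e", "ol"],
    ["", "ai", "in"],
    ["sh", "ee", "dy"],
    ["d", "a", "r"],
]
# Hypothetical pool of syllables used when revising a table.
SPARE_SYLLABLES = ["ok", "al", "or", "cth", "am", "lk"]


def derive_table(table, n_changes, rng):
    """Return a copy of `table` with exactly n_changes cells replaced."""
    new_table = [row[:] for row in table]
    cells = [(r, c) for r in range(len(table)) for c in range(len(table[0]))]
    for r, c in rng.sample(cells, n_changes):
        # Never replace a cell with its current content.
        options = [s for s in SPARE_SYLLABLES if s != new_table[r][c]]
        new_table[r][c] = rng.choice(options)
    return new_table


def cells_changed(a, b):
    """Count the cells in which two equally sized tables differ."""
    return sum(x != y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


rng = random.Random(1)
table_a2 = derive_table(TABLE_A, 3, rng)   # small revision: output stays close to A
table_b = derive_table(TABLE_A, 12, rng)   # large revision: a visibly different "language"
```

On this sketch, text produced from table_a2 would share most of its syllables with text from TABLE_A, while text from table_b would share far fewer, which is the kind of graded spectrum described above.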

The clear distinction between A and B in the herbal section is consistent with a hoaxer making quite large changes when producing a second table for the herbal section (perhaps so that text preparation could be carried out in parallel by a hoaxer and an accomplice, which would explain the differences in handwriting).

Encoding plaintext with tables and grilles

Although the discussion above is framed in relation to text generation, this method can also be adapted in various ways to encode meaningful text. Of the two possible adaptations so far identified, one is inconsistent with the degree of repetition found in successive words in the manuscript, and the other appears to be inconsistent with the degree of similarity between words in successive lines of the manuscript, as shown in Figures 8 and 9. Readers will doubtless be able to think of other ways of encoding plaintext using tables and grilles, but at present there is no evidence to suggest that the manuscript includes meaningful material encoded in this way.

CONCLUSION

The method described here is fast and simple to use, and produces text with most of the distinctive features of Voynichese. There are grounds for believing that the remaining features, such as entropy levels and letter serial correlation values, can also be replicated using this method.

This method could be used either to produce a hoax, or to encode meaningful material - there are various ways in which it can be adapted for encoding. It is perfectly possible that the manuscript does contain an encoded form of a meaningful plaintext, such as a demonstration of concept by Cardano, analogous to Trithemius' Steganographia, and this possibility merits detailed investigation. Preliminary results for the encoding mechanisms which I have investigated so far are not encouraging, however - the degrees of repetition between consecutive words and between vertically aligned words are inconsistent with the most obvious mechanisms. It is possible that some version of the table and grille method was used either with a further layer of encoding, or with a substantial amount of meaningless text used as padding around the meaningful text. There is, however, a difference between possibility and probability. For the time being, there is no evidence that the manuscript contains meaningful material encoded in this way, and the hoax hypothesis now appears to be a plausible explanation for the manuscript.

One thing which came through strongly in this study was that the design features behind this method consistently involved the fastest, simplest, cheapest and also the least obvious solution. If the manuscript is a hoax, then it has been not only a remarkably successful hoax, but also a remarkably elegant one.

ACKNOWLEDGMENTS

I am indebted to Thomas Cooper, Stephanie Dale, Jim Gillogly, Jonathan Knight, Gabriel Landini, Philip Neal, Nick Pelling, Jim Reeds, Jane Sherratt, Jorge Stolfi and Rene Zandbergen for their help. Like other recent researchers in this field, I also owe thanks to the Voynich transcription groups for their work.

REFERENCES

1. Bennett, W. R. 1976. Scientific and Engineering Problem Solving with the Computer. Englewood Cliffs, NJ: Prentice-Hall. Cited in: Stallings, D. J. Understanding the Second-Order Entropies of Voynich Text. Available at: http://www.geocities.com/ctesibos/voynich/mbpaper.htm.

2. Currier, P. H. 1976. Some Important New Statistical Findings. In D'Imperio, M. E., editor, Proceedings of a Seminar held on 30 November 1976 in Washington DC. Privately printed pamphlet, 30 November 1976. Available at: ftp://ftp.funet.fi/pub/doc/religion/occult/necromnomicon/voynich/currier.paper.

3. D'Imperio, M. E. 1978. The Voynich Manuscript - an elegant enigma. Laguna Hills CA: Aegean Park Press.

4. Landini, G. 2001. Evidence of linguistic structure in the Voynich Manuscript using spectral analysis. Cryptologia. 25(4): 275-295.

5. Neal, P. 2003. http://mysite.freeserve.com/philipneal_vms.

6. Perakh, M. 2003. Application of the Letter Serial Correlation test to the Voynich manuscript. http://www.nctimes.net/~mark/Texts/voynichl.htm.

7. Stolfi, J. 2003. http://www.dcc.unicamp.br/~stolfi/voynich/.

8. Woolley, B. 2002. The Queen's Conjuror: the life and magic of Dr Dee. London: Flamingo/HarperCollins.

9. Zandbergen, R. 2003. http://www.voynich.nu/.

Gordon Rugg

ADDRESS: Department of Computer Science, Keele University, Keele, Staffordshire ST5 5BG, UK.

BIOGRAPHICAL SKETCH

Dr. Gordon Rugg is Senior Lecturer in the School of Computing and Mathematics, Keele University, and Senior Visiting Research Fellow in the Department of Computer Science, at the Open University. His first degree was in French and Linguistics, at Reading University, UK, followed by a PhD in Experimental Psychology, also at Reading University. His postdoctoral experience includes work with the Artificial Intelligence Group, Nottingham University, UK. Dr. Rugg is editor in chief of Expert Systems: the International Journal of Knowledge Engineering and Neural Nets.

Copyright Cryptologia Jan 2004

Provided by ProQuest Information and Learning Company. All rights Reserved