You do the math, you do the monkey math…

Thanks to RodeoClown, in the comments of that last monkey theorem quote, I now know the magnitude of improbability involved in a monkey creating the works of Shakespeare (or, even just one line from Hamlet).

The balance of probability is so heavily weighted against a monkey even getting the right sequence of letters in order (every time the monkey strikes a key there are 31 other keys he might press instead of the right one) that it is only the constraints of logic that mean we can’t call the situation impossible.

Which kind of makes you think. One of the arguments against God breaks down into two similar questions of probability (which seem a bit like a paradox to me) – those suggesting that the God of the Bible existing is so improbable that it’s impossible are, at the same time, suggesting that the improbable universe must, by definition, have occurred given infinite time and space. To me, both seem equally improbable. In any moment prior to the world (as we know it) existing, it was much more likely not to start existing than it was to start existing. That little conundrum seems pretty easy to resolve to me – if one is true, then both can be true, but if one is false then both must be false. Wouldn’t an infinite universe over infinite time inevitably produce each possible permutation of God until it produced one able to control the parameters? Namely, the infinite space part? I think at this point it’s more logical that something pre-existed the nothing. That skews the probability pretty dramatically. Am I getting something wrong with my logic here?

Now that I’ve read the math guy’s answer right to the end I can see that he agrees with me. He also wrote a follow-up piece in which he answers my conundrum from the previous post.

“So what happens if you have an infinite number of monkeys typing away? Do we get a script for Hamlet as Mr Adams suggests? Yes, we do! In point of fact, we get every combination of letters possible with the given typewriter, and that in infinite quantities. So not only do we get Hamlet, we get Shakespeare’s complete works, The Hitchhiker’s Guide to the Galaxy, this document, and incomprehensibly vast quantities of random garbage. (Note that this document may also qualify as garbage, but I object to it being described as “random”.) An infinite number of monkeys typing randomly will rapidly produce every possible written work.”

The Math

Each time the monkey presses a key, there is a one in 32 chance that it will be the correct one. To get our little snippet of Hamlet, it will need a total of 41 consecutive “correct” keystrokes. This means that the chances are one in 32 to the power of 41. Let’s look at a table of values.

Keys   Chances (one in…)
------------------------------------
1      32
2      32*32 = 1024
3      32*32*32 = 32768
4      32*32*32*32 = 1048576
5      32^5 = 33554432
6      32^6 = 1073741824
7      32^7 = 34359738368
8      32^8 = 1099511627776
9      32^9 = 3.518437208883e+013
10     32^10 = 1.125899906843e+015

20     32^20 = 1.267650600228e+030

30     32^30 = 1.427247692706e+045

41     32^41 = 5.142201741629e+061

204    32^204 = 1.123558209289e+307
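As a rough cross-check of the table, here is a small Python sketch that recomputes the same odds with exact integer arithmetic. It assumes the same 32-key typewriter; the name `one_in` is made up for this illustration and is not from the original author.

```python
# Assumption: 32 equally likely keys, as in the table above.
KEYS = 32

def one_in(keystrokes: int) -> int:
    """Return N such that typing `keystrokes` correct keys in a row is a one-in-N chance."""
    return KEYS ** keystrokes

# Print the same rows as the table, in scientific notation.
for n in (1, 2, 3, 4, 5, 10, 20, 30, 41, 204):
    print(f"{n:>3}  {one_in(n):.12e}")
```

Python integers have no upper size limit, so there is no equivalent of the calculator’s ceiling at 32^204; only the formatting step converts to a floating-point value.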

The last figure is included only because it is the largest value that the MS Windows calculator can handle — it’s doing better than my hand-held Casio (old faithful!) which only goes up to 1e+99. Okay, so these figures are pretty vast, but we have a lot of monkeys and they can type fast. So how long will it take, on average, for one of my monkeys to type a line matching that sentence? Hard question. Let’s get an idea of how long we are talking here. How many lines can my monkey type in a year, given that it types at a rate of one line per second?

1 line per second
* 60 seconds per minute = 60 lines per minute
* 60 minutes per hour = 3600 lines per hour
* 24 hours per day = 86400 lines per day
* 365.24 days per year = 31556736 lines per year
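To put a number on the “on average” question, one hedged way to look at it: if every line is an independent attempt with a one in 32^41 chance of matching, the expected number of lines before the first match is 32^41 (the mean of a geometric distribution). The sketch below simply divides that by the lines-per-year figure; the variable names are illustrative only.

```python
# One line per second, as tallied above (365.24 days per year).
LINES_PER_YEAR = 60 * 60 * 24 * 365.24

# Assumption: each line is an independent try, so the average number
# of lines needed for a first match is 1 / (1 / 32**41) = 32**41.
ATTEMPTS_NEEDED = 32 ** 41

expected_years = ATTEMPTS_NEEDED / LINES_PER_YEAR
print(f"{LINES_PER_YEAR:.0f} lines per year")
print(f"about {expected_years:.3e} years on average")
```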

If you have access to Unix, you can calculate this with the dc command, but be warned that it may take quite a while to calculate and annoy other users because the computer is so slow. Use of the nice command is suggested. The syntax, should you care to try, is as follows. Type the dc command, then type the following lines.

99k
1 1 32 41 ^ / - 60 ^ 60 ^ 24 ^ 365 ^
p

The figure that is eventually printed will be the probability (expressed as a value between zero and one) of our monkey not typing our little phrase from Hamlet in the space of one year’s worth of continuous attempts. The answer that it prints looks like this:

0.99999999999999999999999999999999999999999999999999999938
6721844366784484760952487499968756116464000

Notice all the nines? Even to fifty or more significant figures, this reads 100%. Okay, so realistically, there is no way that our monkey can do its job in a year. Maybe we should start talking centuries? Millennia? As I understand it, common scientific wisdom suggests that the universe is about 15 billion years old (although they may have revised their dating since I last heard about it). We can easily extend our current figure of one year to count many years. Our calculator will be much faster if we break the calculation down to powers of two and just use the “square” operation, so let’s choose a nice even power of two like 2^34, which is about 17 billion (17,179,869,184 to be precise). The new figure is:

0.99999999999999999999999999999999999999999998946
3961512816564762914005246488858434168051444149065728
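For anyone without dc to hand, the same two figures can be approximated with Python’s standard `decimal` module. This is only a sketch under the same assumptions as the dc input (one line per second, 365 days per year, a one in 32^41 chance per line); the function name `prob_no_match` is invented here, and the trailing digits may differ from dc’s output because the two tools round differently.

```python
from decimal import Decimal, getcontext

# Carry more digits than dc's 99 so the leading digits can be trusted.
getcontext().prec = 120

P_MATCH = Decimal(1) / Decimal(32) ** 41   # chance that a single line matches
LINES_PER_YEAR = 60 * 60 * 24 * 365        # the dc input uses 365 days

def prob_no_match(years: int) -> Decimal:
    """Probability that every line typed over `years` years misses."""
    return (1 - P_MATCH) ** (LINES_PER_YEAR * years)

print(prob_no_match(1))         # one year of typing
print(prob_no_match(2 ** 34))   # about 17 billion years of typing
```

Raising to an integer power lets the library do the repeated squaring itself, which is the same trick as breaking the calculation into powers of two.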

Infinite monkey theorem meets infinite sentence theorem

You’ve no doubt heard the theory that if you gave a monkey a typewriter and infinite time he would eventually compose, in order, the complete works of Shakespeare. This may take him a very, very long time but, if this other theory is correct, it possibly would not require infinite time.

The number of possible sentences in the English language is apparently finite. Indeed, Macquarie University has calculated that there are 10^570 possible sentences in English. They used these suppositions:

  • that English has about 500,000 words (there are about 450,000 in the 20 volume Oxford English Dictionary, but this excludes many colloquial forms – although it does include many obsolete forms),
  • that English sentences can be up to 100 words in length (a fairly reasonable working assumption)
  • that any individual word can occur 0 to 100 times in a single sentence (an unrealistic assumption)
  • that words can be combined in any order (a false assumption)
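The figure of roughly 10^570 is what you get if you read those suppositions as “pick any one of 500,000 words for each of 100 positions”. That reading is my assumption about how the number arises, not something the researchers spell out here, and the constant names are illustrative. A quick Python check:

```python
import math

VOCABULARY = 500_000   # words assumed to exist in English
MAX_WORDS = 100        # assumed maximum sentence length

# Upper bound: any word may fill any of the 100 positions, in any order.
sentences = VOCABULARY ** MAX_WORDS
print(f"about 10^{math.log10(sentences):.1f} possible sentences")
```

Shorter sentences barely change the total, since each extra word multiplies the count by 500,000 and the 100-word sentences swamp everything else.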

Now, they even acknowledge that some of their assumptions are unrealistic and false – but using these assumptions they’ve given us a pretty reasonable guess at the number of permutations available in English. I guess you could go a step further and introduce “texts” to the equation – if each text is assumed to be 100,000 sentences long, or something unreasonably high, we’re still in the realm of calculable numbers.

Then we could figure out how fast a monkey produces a sentence, and get an upper limit for how long a monkey would take to produce every possible combination of words within those parameters to figure out the maximum amount of time required for a monkey to produce the works of Shakespeare. Simple.
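As a back-of-the-envelope version of that upper limit, the sketch below assumes a very methodical monkey that produces one sentence per second, never stops, and never repeats itself. Those assumptions are mine, borrowed from the one-line-per-second rate used in the earlier calculation, and the sentence count is the 500,000-words-in-100-positions bound rather than anything more refined.

```python
# Upper bound on sentences: 500,000 words in each of 100 positions.
SENTENCES = 500_000 ** 100

# Assumption: one sentence per second, around the clock, 365 days a year.
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

years = SENTENCES // SECONDS_PER_YEAR
print(f"roughly 10^{len(str(years)) - 1} years")
```

So “simple” to calculate, yes, but the answer is still a number of years with hundreds of digits.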

Macquarie University used the calculations above to demonstrate that English is an open language – full of unique sentences that have never been written before. My aim when I write is to produce as many unique sentences as possible. To claim my place in the lexicon of life. I am pretty sure that at the point of writing, those two sentences are unique and have not been read by imaginary dinosaurs before. This process of trying to achieve the maximum number of unique sentences may lead me to introduce an odd adjective, and a MacGuffin, in every tremendously Jurassic sentence. Then I could rewrite the same sentence over and over again in order to claim maximum mileage from the one creative work (i.e. the writing of a unique sentence).

Let’s try.

  • As I wrote this, sitting next to a coffee pot plant, with my hot wife, I produced a uniquely blue sentence.
  • As I wrote this, sitting next to a ceramic pot plant, with my hot wife, I produced a uniquely green sentence.
  • As I wrote this, sitting next to a yellow dinosaur, with my hot wife, I produced a uniquely green sentence…

Producing unique sentences in an open language is relatively easy. Here’s what the Macquarie researcher had to say:

Grammatical rules would greatly reduce this number of sentences, as would the requirement that all sentences be meaningful, but the resulting number of possibilities would still be extremely large (more than could ever be spoken in the entire history of human languages let alone during the much shorter life span of an individual language). So for all practical, non-mathematical, purposes we can say that the English language, or any other living language (1), is an open system. It’s actually quite easy to come up with a unique, never before produced, sentence. To do so, for example, combine an unlikely (or impossible, or meaningless) event with a particular named person on a particular date. For example: “On 31st October 1999, whilst writing a lecture on animal communication, Robert had a colourless green idea.” (2) Once this sentence has been written or spoken, subsequent productions of this same sentence are not unique, but unique sentences may potentially be generated from it by making slight changes to it (eg. change “green” to “red”).

Got any unique sentences? Care to claim your place in history?

And perhaps a mathematically minded commenter might like to resolve this infinite monkey conundrum for me – if I have an infinite number of monkeys with typewriters will one monkey come up with the works of Shakespeare on his first go?