Dilip posted an interesting problem the other day: if you were a professor teaching probability theory, and asked your students to toss a coin 100 times and write down the sequence of heads and tails that they obtained, and some of them cheated and simply made up a sequence of heads and tails, how could you tell?
It is interesting because, generically, very few sequences are truly random; and the human mind is certainly incapable of randomness. But what signs of non-randomness could you look for? The answer that Dilip, apparently, had in mind is that true random sequences will usually contain long “runs” of heads or tails (say, 6 or more heads, or 6 or more tails, in succession in 100 coin tosses). However, individuals generating random sequences will perceive such short “runs” as non-random and correct for them. But this is not a very reliable answer by itself: the probability of a run of 6 or more (heads or tails) in 100 tosses is about 80%, so a truly random run will fail this test about a fifth of a time, while a smart student will probably throw in such runs. I argued that if one combined this with various other tests, one should be able to tell quite reliably. But at the moment I am unsure whether 100 tosses are enough for this. Certainly nobody could say for sure whether a sequence of 5 tosses was generated by a coin or a human. Meanwhile, I strongly doubt that any human could generate a sequence of 1000 random symbols (coin-tosses, numbers, whatever) that would fool the statistical tests. But can one reliably tell which of the following two is “random” and which isn’t?
Neither of these was generated by tossing a coin. One was made by me, by pressing “h” or “t” “randomly” on a keyboard (ie, I, a human being aware of the usual “pitfalls”, was trying to generate a random sequence, fairly rapidly without thinking much about it). The other was made using the pseudorandom number generator in Python, which is based on the Mersenne twister. I would guess that the Mersenne twister is more random than I am: what I would be interested in knowing, from any experts reading this, is whether one of the above sequences can be demonstrated, statistically, to be so non-random that the chances are very high it was generated by me and not by the program. I am, moreover, interested in the method and not the answer (which you have a 50% chance of getting right randomly). If you confidently identify the Mersenne twister-generated sequence, it is safe to say that the problem is with your test and not with the Mersenne twister.
The “bonus question” that came up in Dilip’s blog is, what is the probability of observing a run of 6 or more heads or tails (let’s call them 6-runs) in 100 coin tosses?
Kovendhan gave an approximate answer which seemed to work well in practice, but it turns out that he made two errors and a corrected calculation does poorly. The probability of a particular choice of 6 tosses (let’s call it a 6-mer) being all heads or all tails is (1/2)5 (it is (1/2)6 for all heads, and the same for all tails). The probability then that it is not all-heads or all-tails is 1-(1/2)5. There are 95 ways to choose 6 successive tosses in 100. The probability of none of these 95 is all-heads or all-tails is ( 1-(1/2)5)95 = about 0.049 approximately. The probability of at least one stretch of 6 identical tosses (all heads or all tails) existing would then seem to be 0.951 — pretty near certain. The approximation consists of neglecting the fact that adjacent 6-mers here are not independent: eg, if your chain of tosses is HHTHTT, not only does this fail to give a 6-run, but it will also fail to do so on any of the next 3 tosses at least.
Meanwhile, Kovendhan used (1/2)6 instead of (1/2)5 for the individual 6-run, and 94 for the number of 6-mers, which yields 0.772, but he reported 0.781 — I’m not sure how he got that. My numerical experiments suggested the true number is a little above 0.8, which is close to Kovendhan’s fortuitously incorrect calculation of his approximate method, but quite far from what his approximation should really give.
The moral is, be aware of approximations and, if possible, have an estimate of their effect. Kovendhan’s approximation is in fact very similar to Linus Pauling’s when he calculated the zero-temperature entropy of ice. In the crystal structure of ice, each oxygen atom has four neighbouring oxygen atom, in a locally tetrahedral arrangement. Along each of these O-O bonds is a hydrogen atom, but not centrally located: two H atoms must be closer to the central O atom, and two must be closer to the neighbouring O atoms. Globally, there are many ways to satisfy this; to count the ways, Pauling essentially assumed that the configurations of the local “tetrahedra” could be counted independently. This is like Kovendhan’s assumption about 6-mers; unfortunately, while two tetrahedra in ice share at most one corner, two 6-mers in the toss sequence can share up to 5 tosses, which makes the 6-mers much less independent than the tetrahedra in ice.
I attempted an answer which I give below. A commenter observed that a formula exists for the probability of a run of 6 heads, and the same formula gives the probability of a run of 6 tails. However, the probability of six heads or tails is trickier.
My approach (which will be recognisable by physicists, computer scientists and others) was this: supposing we can calculate the probability P(N) that there are no 6-runs in N tosses, how do we calculate P(N+5), the probability that there are no 6-runs in N+5 tosses? If we can do this, we can start from P(5) = 1 (there are no 6-runs in 5 tosses, obviously) and build it up from there.
Naively, a 6-run can be built up at any of the five tosses between N and N+5: for example, if the previous five tosses up until N were all heads, then tossing heads again will give a 6-run. So we must consider all possibilities for both the five tosses from N-4 to N, and the five tosses for N+1 to N+5: ten tosses in total. There are 210 = 1024 possibilities for these 10 tosses, so it looks like a counting problem. The number of possibilities of “successful” runs in these 10 tosses can be enumerated as follows (where “N” stands for “any”, and one can replace H with T in all these examples):
HHHHH HNNNN (32 possibilities: 24 for the last 4 tosses, times 2 for replacing H with T)
THHHH HHNNN (16 possibilities)
NTHHH HHHNN (16 possibilities)
NNTHH HHHHN (16 possibilities)
NNNTH HHHHH (16 possibilities)
for a total of 96 possibilities. There are then 1024-96 = 928 cases where there is no run of 6 heads or tails. So the “naive” answer is P(N+5) = P(N)*928/1024. If we want P(100), we can use this to go all the way back to P(5) = 1:
P(100) = P(95)(928/1024) = P(90) (98/1024)2 …
P(100) = P(5) (928/1024)19 = 0.154 roughly
so the probability of at least one 6-run in 100 tosses is about 0.846.
Unfortunately, this is still not correct, because the 1024 possibilities — and the 96 possibilities with runs — are not all equally probable: they are conditional on the premise that there is no 6-run up until the N-5th toss. So, for example, the case “HHHHH TNNNN” should be weighted by the fact that it would be disallowed for half the possible sequences prior to this (the ones that ended with H); the case THHHH HNNNN would disallow far fewer sequences (the ones that end in TTTTT, which is only 1 in 32 sequences).
Therefore, I considered as prior possibilities, the five tosses numbered N-14 to N-10 (that is, the five tosses preceding the current ten-mer). There are the following ten possibilities that should be distinguished:
NNNHT, NNNTH, NNHTT, NNTHH, NHTTT, NTHHH, HTTTT, THHHH, TTTTT, HHHHH
and each of these has a “prior probability” (1/4 for NNNHT, 1/32 for HTTTT, etc), and each disallows a fraction of the 1024 10-mers we are considering, as well as a fraction of the 96 10-mers that contain 6-runs. If you do the calculation separately for each of these 10 prior possibilities, and then weight the average by their prior probabilities, you end up with
P(no 6-run in 100 tosses) = (1481219/1614480)19 = 0.195 roughly,
and P(at least 1 6-run) = 0.805 roughly.
This, as it turns out, is in excellent agreement with what I had previously obtained numerically.
Final question for anyone who has read this far: is this the exact answer? (I posted on Dilip’s blog that I think it is, but don’t go by that.) It is, however, “good enough” in my opinion, by two measures: the remaining error is small; and the issues have (I think) been adequately illustrated that the answer can be (laboriously) improved, if need be.
I’d be fascinated, however, if there is an exact answer on the lines of the “run of heads” answer linked above.