Suppose you end up in a foreign country which, though beautiful and with great cafes, has incredibly odd currency. Whenever you flip their coins, they don’t land heads 50% of the time and tails 50% of the time; diabolically, they land heads 90% of the time and tails 10% of the time. You’ve found yourself in the country of “Determinish”, and they do not like randomness.
But let’s say you really wanted to flip a 50-50 coin. Could you “make” a 50-50 coin flip using a single one of their 90-10 coins?
You can!
Thinking about symmetry
We’ll solve this for any biased coin with probability of heads 𝑝, with which we want to construct a fair, 50-50 coin. By construct, I mean design a process that, once we start it, says “Heads” 50% of the time, and “Tails” the other 50% of the time. That’s our abstracted version of “making” or “constructing” a 50-50 coin. Of course, there are a lot of ways to construct a fair coin like this, and there’s active research on functions called “randomness extractors” that generalize this idea of “constructing unbiased randomness from biased randomness”. Our solution is one that the mathematician and computer scientist John von Neumann discovered.
A key issue with our biased coin is that Heads and Tails are asymmetric. If we swapped the sides of the coin, like swapping the stickers on a Rubik’s cube, we’d have a different coin: instead of being Heads with probability 𝑝, it would be Heads with probability 1−𝑝. The only way for the coin to be the same when we swap the sides, that is, to be symmetric, is if 𝑝 = 1−𝑝, which happens exactly when 𝑝 = (1/2).
If symmetry has some essential relationship with the 50-50 coin, maybe we can make lemonade out of our biased-coin lemons, and create something symmetric from our biased coin.
Here’s the clever idea: if we flip the biased coin twice, nothing about either coin flip influences the other (they are independent), so we have symmetry across time. We could say “Heads then Tails” or “Tails then Heads” and no one would, statistically, know the difference. We can see this mathematically with the following probability table:

𝑃(𝐻𝑒𝑎𝑑𝑠, 𝐻𝑒𝑎𝑑𝑠) = 𝑝 ⋅ 𝑝 = 𝑝²
𝑃(𝐻𝑒𝑎𝑑𝑠, 𝑇𝑎𝑖𝑙𝑠) = 𝑝(1−𝑝)
𝑃(𝑇𝑎𝑖𝑙𝑠, 𝐻𝑒𝑎𝑑𝑠) = (1−𝑝)𝑝
𝑃(𝑇𝑎𝑖𝑙𝑠, 𝑇𝑎𝑖𝑙𝑠) = (1−𝑝)²
Writing it out, we have that 𝑃(𝐻𝑒𝑎𝑑𝑠, 𝑇𝑎𝑖𝑙𝑠) = 𝑝(1−𝑝) = (1−𝑝)𝑝 = 𝑃(𝑇𝑎𝑖𝑙𝑠, 𝐻𝑒𝑎𝑑𝑠). So we have something symmetric! We can even see it visually, looking at the expression. (For the 90-10 Determinish coin, for example, both orders have probability 0.9 × 0.1 = 0.09.) How do we use it to construct the fair coin?
First, we have to ask: is flipping twice completely symmetric?
Not quite. When 𝑝 ≠ (1/2), 𝑃(𝐻𝑒𝑎𝑑𝑠, 𝐻𝑒𝑎𝑑𝑠) = 𝑝² ≠ (1−𝑝)² = 𝑃(𝑇𝑎𝑖𝑙𝑠, 𝑇𝑎𝑖𝑙𝑠), so no symmetry there.
Let’s just forge ahead and imagine what an algorithm/process might look like for making our fair coin out of two flips, and then see if we can fix it up, knowing that some problems will arise.
I’m thinking about our algorithm like this: flip the coin twice; if the flips come up (𝐻𝑒𝑎𝑑𝑠, 𝑇𝑎𝑖𝑙𝑠), say “Heads”; if they come up (𝑇𝑎𝑖𝑙𝑠, 𝐻𝑒𝑎𝑑𝑠), say “Tails”; and if they come up (𝐻𝑒𝑎𝑑𝑠, 𝐻𝑒𝑎𝑑𝑠) or (𝑇𝑎𝑖𝑙𝑠, 𝑇𝑎𝑖𝑙𝑠), say nothing.
So right now, we have that 𝑃(𝐴𝑙𝑔𝑜𝑟𝑖𝑡ℎ𝑚 𝑠𝑎𝑦𝑠 𝐻𝑒𝑎𝑑𝑠) = 𝑝(1−𝑝) = 𝑃(𝐴𝑙𝑔𝑜𝑟𝑖𝑡ℎ𝑚 𝑠𝑎𝑦𝑠 𝑇𝑎𝑖𝑙𝑠), which sort of sounds like our algorithm is a fair, 50-50 coin! But this relies on our algorithm saying nothing when it sees (𝐻𝑒𝑎𝑑𝑠, 𝐻𝑒𝑎𝑑𝑠) or (𝑇𝑎𝑖𝑙𝑠, 𝑇𝑎𝑖𝑙𝑠). Our algorithm is a bit of a weird coin: sometimes it lands Heads, sometimes it lands Tails, and sometimes it just refuses to say either.
Our solution is to just keep flipping until it says something! We can write this out like a procedure that a person or a computer could follow.
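Here’s a minimal sketch of that procedure in Python. This is just one way to write it down: the names biased_flip and fair_flip are my own choices, and random.random() stands in for a flip of the Determinish coin.

```python
import random

def biased_flip(p, rng):
    """One flip of the biased coin: 'H' with probability p, else 'T'."""
    return 'H' if rng.random() < p else 'T'

def fair_flip(p, rng):
    """Von Neumann's trick: build a fair flip out of a p-biased coin.

    Flip twice: (H, T) means Heads, (T, H) means Tails, and a matching
    pair means we say nothing and flip twice again.
    """
    while True:
        first, second = biased_flip(p, rng), biased_flip(p, rng)
        if first == 'H' and second == 'T':
            return 'Heads'
        if first == 'T' and second == 'H':
            return 'Tails'
        # (H, H) or (T, T): refuse to answer, start over

rng = random.Random(42)
results = [fair_flip(0.9, rng) for _ in range(100_000)]
print(results.count('Heads') / len(results))  # ≈ 0.5, despite p = 0.9
```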
Flipping forever?
Notice that our algorithm might continue forever! We might keep getting (𝐻𝑒𝑎𝑑𝑠, 𝐻𝑒𝑎𝑑𝑠) or (𝑇𝑎𝑖𝑙𝑠, 𝑇𝑎𝑖𝑙𝑠) forever, but of course, that would be incredibly unlikely. Though we can’t say beforehand exactly how many flips it will take, we can find the average (expected value) of how many flips it will take. We’ll use linearity of expectation and the law of total probability to break this expectation, 𝐸[# 𝑜𝑓 𝑓𝑙𝑖𝑝𝑠], into two terms:

𝐸[# 𝑜𝑓 𝑓𝑙𝑖𝑝𝑠] = 𝑃(𝑓𝑖𝑟𝑠𝑡 (𝐻,𝑇) 𝑜𝑟 (𝑇,𝐻)) ⋅ 𝐸[# 𝑜𝑓 𝑓𝑙𝑖𝑝𝑠 𝑖𝑓 𝑓𝑖𝑟𝑠𝑡 (𝐻,𝑇), (𝑇,𝐻)] + 𝑃(𝑓𝑖𝑟𝑠𝑡 (𝐻,𝐻) 𝑜𝑟 (𝑇,𝑇)) ⋅ 𝐸[# 𝑜𝑓 𝑓𝑙𝑖𝑝𝑠 𝑖𝑓 𝑓𝑖𝑟𝑠𝑡 (𝐻,𝐻), (𝑇,𝑇)]
But of course, 𝐸[# 𝑜𝑓 𝑓𝑙𝑖𝑝𝑠 𝑖𝑓 𝑓𝑖𝑟𝑠𝑡 (𝐻,𝑇), (𝑇,𝐻)] is straightforward, because our algorithm would say something and stop! So we know that 𝐸[# 𝑜𝑓 𝑓𝑙𝑖𝑝𝑠 𝑖𝑓 𝑓𝑖𝑟𝑠𝑡 (𝐻,𝑇), (𝑇,𝐻)] = 2.
If we see (𝐻𝑒𝑎𝑑𝑠, 𝐻𝑒𝑎𝑑𝑠) or (𝑇𝑎𝑖𝑙𝑠, 𝑇𝑎𝑖𝑙𝑠), then we start fresh, and there’s nothing about our initial flips that affects the result of subsequent flips because they are independent. So we can write:

𝐸[# 𝑜𝑓 𝑓𝑙𝑖𝑝𝑠 𝑖𝑓 𝑓𝑖𝑟𝑠𝑡 (𝐻,𝐻), (𝑇,𝑇)] = 2 + 𝐸[# 𝑜𝑓 𝑓𝑙𝑖𝑝𝑠]
Statisticians call processes like this memoryless. Note that it’s easy to tie yourself into knots thinking about independence here. The knot is this question: if I see (𝐻𝑒𝑎𝑑𝑠, 𝑇𝑎𝑖𝑙𝑠), then I won’t flip again, so doesn’t that mean the flips aren’t independent? The trouble is that this question combines two different events into one: whether we flip again, and the results of the next flips. The first two coin flips are not independent of whether we flip again, but they are independent of the results of any subsequent flips.
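A quick simulation can make the memorylessness concrete. Here’s a sketch (with illustrative names; rounds_until_decision counts pairs of flips until the algorithm speaks): among runs that have already survived two rejected rounds, the chance of surviving two more matches the unconditional chance of surviving two rounds.

```python
import random

def rounds_until_decision(p, rng):
    """Count rounds of two flips until a mismatched pair lets the algorithm speak."""
    rounds = 0
    while True:
        rounds += 1
        first, second = rng.random() < p, rng.random() < p
        if first != second:  # (H, T) or (T, H): the algorithm decides
            return rounds

rng = random.Random(0)
samples = [rounds_until_decision(0.9, rng) for _ in range(200_000)]

# Unconditional: P(survive 2 rounds). Conditional: P(survive 4 | survived 2).
past_two = sum(r > 2 for r in samples)
past_four = sum(r > 4 for r in samples)
print(past_two / len(samples), past_four / past_two)  # both ≈ 0.82² ≈ 0.672
```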
Going back to our equation and using our probability table from earlier, the equation simplifies to

𝐸[# 𝑜𝑓 𝑓𝑙𝑖𝑝𝑠] = 2𝑝(1−𝑝) ⋅ 2 + (𝑝² + (1−𝑝)²) ⋅ (2 + 𝐸[# 𝑜𝑓 𝑓𝑙𝑖𝑝𝑠])
We can solve this like any other algebraic equation; 𝐸[# 𝑜𝑓 𝑓𝑙𝑖𝑝𝑠] is just a number, after all. Expanding the right hand side and collecting the 𝐸[# 𝑜𝑓 𝑓𝑙𝑖𝑝𝑠] terms on the left gives

𝐸[# 𝑜𝑓 𝑓𝑙𝑖𝑝𝑠] ⋅ (1 − 𝑝² − (1−𝑝)²) = 4𝑝(1−𝑝) + 2(𝑝² + (1−𝑝)²)
We can do some algebra to simplify the right hand side, or look closer and notice that it is just twice the total probability of all four outcomes in our table:

4𝑝(1−𝑝) + 2(𝑝² + (1−𝑝)²) = 2(𝑝² + 2𝑝(1−𝑝) + (1−𝑝)²) = 2(𝑝 + (1−𝑝))² = 2
Likewise, on the left hand side we see that 1 − 𝑝² − (1−𝑝)² = 1 − 𝑃(𝐻𝑒𝑎𝑑𝑠, 𝐻𝑒𝑎𝑑𝑠) − 𝑃(𝑇𝑎𝑖𝑙𝑠, 𝑇𝑎𝑖𝑙𝑠) = 𝑃(𝐻𝑒𝑎𝑑𝑠, 𝑇𝑎𝑖𝑙𝑠) + 𝑃(𝑇𝑎𝑖𝑙𝑠, 𝐻𝑒𝑎𝑑𝑠) = 2𝑝(1−𝑝), leaving us with the nice expression:

𝐸[# 𝑜𝑓 𝑓𝑙𝑖𝑝𝑠] = 2 / (2𝑝(1−𝑝)) = 1 / (𝑝(1−𝑝))
Following this expression, the number of flips is minimized when 𝑝 = (1/2), which makes sense: making a fair coin out of fair coins would be easiest. In our example from the strange country “Determinish”, where 𝑝 = (9/10), we have

𝐸[# 𝑜𝑓 𝑓𝑙𝑖𝑝𝑠] = 1 / ((9/10)(1/10)) = 100/9 ≈ 11.1
It takes a lot of not-so-random coin flips to produce one “maximally random”, 50-50 coin flip!
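As a sanity check on that ≈ 11.1, here’s a small Monte Carlo sketch (same illustrative setup as before) that counts the flips our algorithm actually consumes:

```python
import random

def flips_until_answer(p, rng):
    """Count biased flips used before the algorithm says Heads or Tails."""
    flips = 0
    while True:
        first, second = rng.random() < p, rng.random() < p
        flips += 2
        if first != second:  # mismatched pair: the algorithm answers
            return flips

rng = random.Random(0)
p = 0.9  # the Determinish coin
trials = [flips_until_answer(p, rng) for _ in range(200_000)]
print(sum(trials) / len(trials))  # ≈ 1 / (p * (1 - p)) = 100/9 ≈ 11.1
```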
Conclusion
Making randomness extractors like this more efficient is a broad topic in computer science, but our simple algorithm for the biased coin demonstrates an intuition which is fundamentally true: we can add together multiple not-so-random pieces to make a very-random whole, and the more random the pieces, the fewer of them we need. If you’re interested in how this works mathematically, the concept of “entropy” in information theory and statistics is a great place to start.