## Transmuting Dice, Conserving Entropy

posted by Craig Gidney on April 23, 2013

Suppose you want to play a game of backgammon. Unfortunately, horror of horrors, you have lots of pocket change but no dice!

You realize you can generate random values by flipping coins, but a coin flip has two possible outcomes instead of six. How do you simulate a fair six sided die using only fair coin flips? What about the more general problem, of simulating an -sided die with an -sided die?

If you haven’t seen this puzzle before today, and want to solve it yourself, stop now. I will be spoiling it.

### Partial Credit

There’s a well known sub-optimal solution to this problem.

First, given a die with sides, it’s trivial to simulate a die with sides by grouping rolls. A sided die can be simulated by arranging coin flips into groups of , because each group has equally likely possibilities. The reverse direction, from to , is also easy because each roll can be split evenly into sub-rolls.

Second, given a die with more sides than necessary, you can just filter out rolls that have invalid results with respect to the smaller die. Given an -sided die, we can simulate a sided die by rolling until we almost surely get a result that’s not or .

Putting these two parts together allows us to simulate a -sided die using only our -sided coin. Just flip the coin three times, pairing each of the possible outcomes with the numbers to , and try again whenever the result is or .

There. Solved. Well… except it feels wasteful to throw away those sevens and eights.

### Wasted Entropy

The above solution wastes information entropy. A coin flip generates 1 bit of entropy. Three coin flips generates 3 bits of entropy. The result of the roll of a six sided die, on the other hand, gives you only bits of entropy.

Even if we get lucky, and don’t have to try again, we’re consuming bits of entropy while only producing bits when converting our coin flips into six sided die rolls. If we’re average, instead of lucky, then we’ll need an expected attempts for an expected cost of bits. We’re wasting bits of precious, delicious entropy per generated roll!

This raises the question: can we do better? Can we use that wasted entropy?

This is your second chance to stop and solve.

### Entropy Optimizations

There are several optimizations that we can make, to improve our entropy efficiency.

For example, consider what happens if we map [heads,heads,heads] to and [heads,heads,tails] to . Since they both start with [heads, heads], the third coin flip has no effect on whether or not we retry. This allows us to save one coin flip when retrying, reducing our expected cost from to bits.

Another useful optimization to notice is that factors into and , and that the product of two uniformly random numbers is also uniformly random. A -sided die roll is interchangeable with a -sided die roll combined with a -sided die roll. Since we have a coin with sides, -sided die rolls are trivial to do with perfect efficiency. Then, for each -sided roll we generate, we can generate an accompanying -sided roll by using groups of coin flips (and retrying when we get a instead of , , or ). Combined with the previous optimization, this lowers our cost to bits.

A more flexible optimization is to find a power of that comes proportionally sooner after a power of . For example, we can use our coin to simulate rolling a die with sides with perfect efficiency, discard when we exceed , and generate three rolls of a -sided die per success. It takes more coin flips to get our first result, but the probability of discarding decreases significantly. Combined with our previous two optimizations, we’ve lowered our expected cost to bits.

We can find even better powers of and . The next few good powers are , and . Each gets us tantalizingly closer to perfect efficiency, and it’s not even that hard to find good pairs. Just run the following program for sufficiently long:

BigRational r = 3;
int n2 = 0;
int n3 = 1;
BigRational best = 0;
while (true) {
while (r > 1) {
r /= 2;
n2 += 1;
}
if (r > best) {
best = r;
Console.WriteLine("3^{0},2^{1}", n3, n2);
}
while (r < 1) {
r *= 3;
n3 += 1;
}
}

(Note that the BigRational type is not standard. I referenced it from the BigRationalLibrary NuGet package.)

Now for some disappointing news: you'll never find the perfect pair of powers. We can achieve % efficiency with and . We can achieve % efficiency with and . We can get arbitrarily close, but we can't achieve % efficiency. Doing so would require an impossibility: a power of three that's also a power of two. That's not going to happen. Powers of three aren't even ever even! No matter how big we make the constants, there will always be some unlikely cases where we have to discard and try again.

Give that achieving % efficiency is impossible, and given that we can get arbitrarily close to % by choosing the right constants, a person would be tempted to think we'd hit the limit. That we can't do any better. That person would be wrong.

Last chance to solve on your own.

### Conservation of Entropy

Beating 'choose how close you want to be to 100% efficiency' is just a matter of not settling for a fixed amount of efficiency. One way to do this would be to compute better constants as we generated die rolls. As the algorithm ran longer and longer, it would waste less and less entropy.

However, it turns out we can do something slightly simpler: arithmetic coding. The idea is to treat the coin flips as the binary digits after the decimal point (binary point?) of a number between and . As we add more binary digits, the space where the 'complete' number can lie becomes smaller and smaller. Eventually, that space will almost certainly fall entirely within the domain covered by a single digit in base 6. We can then safely output that digit, and start waiting for the next base 6 digit to become fixed.

This idea easily generalizes to dies with arbitrary numbers of sides. The following code implements it:

///<summary>Generates rolls of a uniformly random die, based on rolls of a uniformly random die with a different number of sides.</summary>
public static IEnumerable<int> ChangeDiceSideCountFromTo(this IEnumerable<int> rollResults,
int oldSideCount,
int newSideCount) {
if (rollResults == null) throw new ArgumentNullException("rollResults");
if (oldSideCount < 2) throw new ArgumentOutOfRangeException("oldSideCount", "oldSideCount < 2");
if (newSideCount < 2) throw new ArgumentOutOfRangeException("newSideCount", "newSideCount < 2");

BigRational min = 0;
BigRational len = 1;
foreach (var roll in rollResults) {
// the roll represents the next digit in base-[number of old sides], reducing the valid range
len /= oldSideCount;
min += roll * len;

// check for digits in base-[number of new sides] becoming fixed
while (true) {
var minDigit = (min * newSideCount).GetWholePart();
var maxExclusive = (min + len) * newSideCount;
var epsilon = new BigRational(-1, maxExclusive.Denominator * newSideCount);
var maxDigit = (maxExclusive - epsilon).GetWholePart();
if (minDigit != maxDigit) break;

// another digit has become fixed and can be output
yield return (int)minDigit;

// normalize so the range of the next digit also ranges from 0 to 1
len *= newSideCount;
min *= newSideCount;
min -= minDigit;
}
}
}

The above code achieves 100% efficiency, with a small caveat: delay. If you convert from an sided die to an sided die and then back to an sided die, you'll get the exact same rolls out that you put in. That proves no entropy is lost. However, the output rolls will lag the input rolls by an expected constant amount. This delay occurs because different bases have different borders between digits.

For example, binary numbers starting with 0.1010101010 may start with 0.1 or 0.2 when represented in ternary. Until you almost certainly roll outside of the repeating 101010 pattern, the next ternary digit can't be determined. On the bright side, when you do finally roll outside of a repeating pattern that delays results, you get a large number of results all at once.

When converting from sides to sides and back, I measured average delays between and flips. The effect of the delay on our percentage of efficiency limits to as the number of rolls increases.

This is as far as we can go, I think. We never lose any rolls, so entropy is never permanently lost, but there's a small delay. Oh, and the performance is awful. The rational numbers used in the computation don't stay small (unless one of the dice has a number of sides that divides the other die's number of sides). They get bigger and bigger and BIGGER as you generate more rolls. It doesn't take long to convert a thousand rolls, but ten thousand is going to take a minute. Literally.

### Summary

You can convert an -sided die to an -sided die without losing any entropy, with the caveat that you expect a constant small delay between having generated enough entropy and yielding the next roll.

This solution is interesting theoretically, but note that in practice you'd be far better off doing naive discarding (the constants work well for coins-to-dice). Entropy is not particularly expensive. You can afford to waste 4% of it.

---

---