Expected Length of a Gamma-Coded Codeword

Expectation · Medium · Free problem

Gamma coding encodes a positive integer $x \geq 1$ as a binary string using three steps:

  1. Compute $N = \lfloor \log_2(x) \rfloor$.
  2. Write $N$ zero bits.
  3. Write $x$ in binary, which takes $N + 1$ bits.

The total codeword length is

N + 1$ bits. This encoding is prefix-free, so you can concatenate codewords and decode unambiguously by reading left to right: count the leading zeros to get $N$, then read the next $N + 1$ bits as the encoded integer.

For example, $x = 7$ has $N = 2$, so its gamma code is $00\,111$. And $x = 21$ has $N = 4$, giving $0000\,10101$.

Now generate an infinite string of random bits, where each bit is independently

$ with probability $p$ and $0$ with probability
- p$. Interpret this string as a sequence of gamma-coded codewords. Let $F$ be the length (in bits) of the first codeword.

Find $E[F]$ when $p = \frac{1}{3}$.

Hints

  1. The codeword length is entirely determined by the number of leading zeros. What familiar distribution describes the wait for the first
    $?
  2. If the first
    $ appears at position $O$, then $N = O - 1$ leading zeros, and the total codeword length is
    N + 1$. Express $F$ in terms of $O$.
  3. Use $O \sim \text{Geom}(p)$ with $E[O] = 1/p$ and apply linearity: $E[F] = 2E[O] - 1$.

Worked Solution

How to Think About It: The first codeword starts at the beginning of the random bit string. Its length is completely determined by how many leading zeros appear before the first

$. Once you see that first
$, you know $N$ (the number of leading zeros), and you just have to read $N$ more bits to finish the codeword. So the total length is $N$ zeros $+$ $(N+1)$ bits of the encoded integer $= 2N + 1$. The problem reduces to: what is the expected number of leading zeros in a random binary string with bit probability $p$?

Quick Estimate: Each bit is $0$ with probability

/3$, so the number of leading zeros before the first
$ is geometric-ish. The expected position of the first
$ is
/p = 3$, so $N$ (the number of zeros before it) is $3 - 1 = 2$ on average. The codeword length is
N + 1 = 5$. Done -- the estimate gives the exact answer here because the relationship is linear.

Approach: Let $Z$ be the number of leading zeros (bits before the first

$). Relate $Z$ to a geometric random variable and use linearity of expectation.

Formal Solution:

Define $Z$ as the number of $0$-bits that appear before the first

$ in the string. Each bit is
$ with probability $p$, independently. Let $O$ be the position of the first
$ (i.e., the first-success trial number), so $O \sim \text{Geom}(p)$ with $E[O] = 1/p$. Then $Z = O - 1$, since the $O$-th bit is the first
$ and the preceding $O - 1$ bits are all $0$.

The gamma codeword structure is: $Z$ zero-bits followed by $(Z + 1)$ bits encoding the integer. So the codeword length is:

$F = Z + (Z + 1) = 2Z + 1$

By linearity of expectation:

$E[F] = 2E[Z] + 1 = 2(E[O] - 1) + 1 = \frac{2}{p} - 1$

Substituting $p = 1/3$:

$E[F] = \frac{2}{1/3} - 1 = 6 - 1 = 5$

Answer: $E[F] = \dfrac{2}{p} - 1 = 5$ when $p = \dfrac{1}{3}$.

Intuition

The key insight is that in gamma coding, the "header" (the leading zeros) and the "payload" (the binary representation) have the same length up to one extra bit. Specifically, if there are $N$ leading zeros, the payload is $N + 1$ bits, making the total

N + 1$. So the codeword length is a simple linear function of the number of leading zeros, which in a random bit string is just a shifted geometric random variable. Once you see that, the whole problem collapses to one line of algebra via linearity of expectation.

This is a nice example of a problem that looks like it needs information theory but is really just geometric distributions in disguise. In quant interviews, encoding and compression problems often boil down to "how long until something happens?" -- which is almost always geometric or exponential. The general formula $E[F] = 2/p - 1$ also gives you quick intuition: as $p$ decreases (more zeros in the stream), codewords get longer because you are more likely to see long runs of zeros that inflate both the header and the payload.

Open the full interactive solver →