
General Solution to Gambler’s Ruin

Until now, we have been using fixed values for the initial condition and the stopping constraints. But what if we want to solve a similar problem with a different constraint? We would have to solve it all over again (set up the linear system, etc.).

Instead of that, let’s solve the general version of the gambler’s ruin problem, which can be stated as such:

Suppose the gambler starts with $0 < k < N$ dollars in hand. For each round, the gambler will win with probability $p$ and lose with probability $q = 1 - p$. The gambler will stop playing with $N$ dollars in hand, or when he goes broke.

Then, we’re interested to know:

  1. What is the probability that he will go broke? (and, as a consequence, what is the probability that he has $N$ dollars at the end?)
  2. How many rounds will he play, on average, before the game ends?
note

The gambler’s ruin “process” is also more generally referred to as a “random walk” (where there is a probability of walking 1 step left/right at any state, and we stop walking when we reach either of the 2 points on the left/right end). A random walk can also be considered in 2-D (which becomes slightly more complicated but the idea still holds) or even $n$-D 😲

Probability of Going Broke

Let’s deal with the first question first.

We can write it as $P(X_T=0|X_0=k)$.

Then, we can define, for every $0 \leq k \leq N$, $u_k = P(X_T=0|X_0=k)$. We have already seen that it can be written as:

$$u_k = E\left[\sum_{n=0}^T I\{X_n=0\} \,\middle|\, X_0=k\right]$$

According to first-step analysis, we have:

$$u_k = I\{k=0\} + \sum_{j=0}^N u_j P(X_1=j|X_0=k)$$

So, when $k=0$, this expression simplifies to $u_0 = 1$ (since the indicator function is “on” and the probability of escaping the “broke” state is $0$). Also, for $k=N$, it becomes $u_N = 0$ (the indicator is “off” and the game is over).

For all other values of $k$, we have:

$$u_k = \sum_{j=0}^N u_j P(X_1=j|X_0=k) = p u_{k+1} + q u_{k-1}$$

Put in words, this is because we have probability $p$ of going from state $k \to k+1$ and probability $q$ of going from $k \to k-1$. Remember that this is only true for non-absorbing states.

We can write out the system of equations for all values of $k$ now.

$$\begin{equation*} \begin{split} u_0 &= 1 \\ u_1 &= pu_2 + qu_0 \\ u_2 &= pu_3 + qu_1 \\ & \dots \\ u_N &= 0 \end{split} \end{equation*}$$
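Before pushing on with the algebra, it can be reassuring to solve this system numerically. Below is a minimal sketch (not part of the original derivation; the function name and parameter values are my own) that builds the system with NumPy and solves for all $u_k$ at once:

```python
import numpy as np

def broke_probabilities(p: float, N: int) -> np.ndarray:
    """Solve the FSA linear system for u_k = P(go broke | X_0 = k), k = 0..N."""
    q = 1 - p
    A = np.zeros((N + 1, N + 1))
    b = np.zeros(N + 1)

    # Boundary conditions: u_0 = 1 (already broke), u_N = 0 (already won).
    A[0, 0], b[0] = 1.0, 1.0
    A[N, N], b[N] = 1.0, 0.0

    # Interior states: u_k - p*u_{k+1} - q*u_{k-1} = 0.
    for k in range(1, N):
        A[k, k], A[k, k + 1], A[k, k - 1] = 1.0, -p, -q

    return np.linalg.solve(A, b)

print(broke_probabilities(p=0.5, N=10))  # fair game: 1.0, 0.9, ..., 0.0
```

Solving the same system for a different $p$ or $N$ is now just a function call, which is exactly the convenience the general closed-form solution below gives us.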

How do we solve this? We can use some neat algebraic manipulation here. Remember that $p + q = 1$. So, we can rewrite our expression for $u_k$ in this form:

$$\begin{equation*} \begin{split} (p+q)u_k &= pu_{k+1} + qu_{k-1} \\ u_{k+1} - u_k &= \frac{q}{p}(u_k-u_{k-1}) \end{split} \end{equation*}$$

Notice that we can repeatedly apply this result to smaller values of $k$ as follows:

$$\begin{equation*} \begin{split} u_{k+1} - u_k &= \frac{q}{p}(u_k-u_{k-1}) \\ &= \left(\frac{q}{p}\right)^2(u_{k-1}-u_{k-2}) \\ & \dots \\ &= \left(\frac{q}{p}\right)^k(u_1 -u_0) \end{split} \end{equation*}$$

Then, we have:

$$\begin{equation*} \begin{split} u_k - u_0 &= (u_k-u_{k-1}) + (u_{k-1} - u_{k-2}) + \dots + (u_1-u_0) \\ &= \left[\left(\frac{q}{p}\right)^{k-1} + \left(\frac{q}{p}\right)^{k-2} + \dots + 1 \right](u_1 - u_0) \end{split} \end{equation*}$$

We now need to evaluate the sum of the finite series $\left(\frac{q}{p}\right)^{k-1} + \left(\frac{q}{p}\right)^{k-2} + \dots + 1$.

To do so, we need 2 separate cases: 1) $p = q$, and 2) $p \neq q$.

Case 1: When $p = q$ (that is, the game is “fair”), we have $q/p = 1$. So, the result is simply:

$$u_k - u_0 = k(u_1 - u_0)$$

This needs to be true for $k = N$ as well. But we already know that $u_N = 0$. Using this, we can find the value of $u_1 - u_0$:

$$u_N - u_0 = -1 = N(u_1 - u_0) \implies u_1 - u_0 = -\frac{1}{N}$$

Therefore, for $k = 1, 2, \dots, N-1$ we have:

$$u_k - u_0 = k(u_1 - u_0) = -\frac{k}{N}$$

Using the fact that $u_0 = 1$, we get:

$$u_k = 1 - \frac{k}{N}$$

How can we interpret this result?

  1. When $k$ increases (that is, when we have more money to start with), the probability of going broke is smaller (for a fixed value of $N$)
  2. When $N \to \infty$ (the gambler sets no bound on when he will stop, i.e., he will only stop when he goes broke), the probability of going broke converges to $1$ even if the game is fair. This means that if we never stop, we will go broke eventually (this is why it’s called the gambler’s “ruin” - because it’s very difficult to quit when you’re winning, you are doomed to go broke. This is also one of the reasons the casinos win → they have virtually unlimited cash compared to you, so it’s more likely that you go broke before the casino does, even if the game is fair)
Insight

💡 From a gambler’s perspective, it makes sense to “set” (in his mind) $N$ to be smaller than $2 \times k$ so that the probability of going broke is less than $0.5$ (which means he is more likely than not to reach $N$ dollars and quit, hence making a profit).
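To see where the $N < 2k$ rule of thumb comes from, note that for the fair game

$$u_k = 1 - \frac{k}{N} < \frac{1}{2} \iff \frac{k}{N} > \frac{1}{2} \iff N < 2k.$$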

Case 2: Okay, now we need to solve the case where the game is not fair, i.e., $p \neq q$.

Then, using the geometric series formula, we have:

$$\left(\frac{q}{p}\right)^{k-1} + \left(\frac{q}{p}\right)^{k-2} + \dots + 1 = \frac{1-(q/p)^k}{1-(q/p)}$$

Then,

$$u_k - u_0 = \frac{1-(q/p)^k}{1-(q/p)} (u_1 - u_0)$$

This must be true for $k = N$ (we use $N$ because we already know the “real” value $u_N = 0$, and using this, we can find the value of $u_1 - u_0$):

$$u_N - u_0 = -1 = \frac{1-(q/p)^N}{1-(q/p)} (u_1 - u_0) \implies u_1 - u_0 = -\left(\frac{1 - (q/p)}{1 - (q/p)^N}\right)$$

Therefore, for $k = 1, 2, \dots, N-1$,

$$u_k - u_0 = \left(\frac{1 - (q/p)^k}{1 - (q/p)}\right)(u_1 - u_0) = -\left(\frac{1 - (q/p)^k}{1 - (q/p)}\right)\left(\frac{1 - (q/p)}{1 - (q/p)^N}\right)$$

So, the solution (using $u_0 = 1$) is:

$$u_k = 1 - \left(\frac{1 - (q/p)^k}{1 - (q/p)^N}\right)$$
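As a sanity check, here is a short sketch (my own, not from the notes; the parameter values are arbitrary) that implements this closed form for both the fair and unfair cases and compares it against a simple Monte Carlo simulation of the game:

```python
import random

def u_closed_form(p: float, k: int, N: int) -> float:
    """Closed-form probability of going broke, starting with k dollars."""
    q = 1 - p
    if abs(p - q) < 1e-12:        # fair game: u_k = 1 - k/N
        return 1 - k / N
    r = q / p
    return 1 - (1 - r**k) / (1 - r**N)

def u_simulated(p: float, k: int, N: int, n_trials: int = 100_000) -> float:
    """Estimate the same probability by playing the game many times."""
    broke = 0
    for _ in range(n_trials):
        x = k
        while 0 < x < N:
            x += 1 if random.random() < p else -1
        broke += (x == 0)
    return broke / n_trials

print(u_closed_form(0.45, k=5, N=10), u_simulated(0.45, k=5, N=10))
```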

How do we interpret this result?

  • When $q > p$ (the game is NOT in our favor), the probability of going broke is approximately $1 - \dfrac{1}{(q/p)^{N-k}}$ (this approximation, and the one in the next bullet, are sketched right after this list). So, when $N \to \infty$, the probability of going broke goes to $1$, i.e., if we do not set a stopping rule for ourselves, we will go broke eventually.
  • When $p > q$ (the game is in our favor), the probability of going broke is approximately $(q/p)^k - (q/p)^N$. This means that when $N \to \infty$, the probability of going broke converges to $(q/p)^k$. Since $p > 1/2$, it’s likely that we will not go broke. But notice that even in this case, increasing $N$ increases the probability of going broke (since we end up subtracting a smaller quantity $(q/p)^N$).
  • When $k$ increases (for a fixed $N$), i.e., we start out with more money, then the probability of going broke is smaller.
  • It should come as no surprise that for a fixed $k$ and $N$, if the probability of winning a round $p$ increases, our chances of going broke decrease (why else would we be wishing for “good” luck 😲)
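A quick sketch of why the two approximations above hold: writing $r = q/p$, the solution can be rearranged as

$$u_k = 1 - \frac{1 - r^k}{1 - r^N} = \frac{r^k - r^N}{1 - r^N}.$$

When $r > 1$ (i.e., $q > p$) and $N$ is large, $1 - r^N \approx -r^N$, so $u_k \approx 1 - r^{k-N} = 1 - \dfrac{1}{(q/p)^{N-k}}$. When $r < 1$ (i.e., $p > q$) and $N$ is large, $1 - r^N \approx 1$, so $u_k \approx r^k - r^N \to (q/p)^k$ as $N \to \infty$.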

Expected Number of Games Played

Recall that we’re trying to find out the average number of games the gambler will play before he reaches $N$ or $0$ (i.e., before he “gets absorbed”).

That is, we are trying to find: $v_k = E[T|X_0=k]$.

We have already seen that it can be rewritten as $v_k = E[\sum_{i=0}^{T-1} 1|X_0=k]$, where $g(X_i) = 1$ for any $X_i$.

think 🤔

Why is it $\sum_0^{T-1}$ and not $\sum_0^T$ here? Because we are concerned with the number of transitions here. In particular, we don’t want to count the “state” where the game is over (because in that state, he doesn’t play any game) → think of it as the difference between counting the number of nodes on a path and the number of edges on a path. Here, we care about the edges. If he plays 2 games, he will go from $X_0 \to X_1 \to X_2$ (absorbed here), which is a total of $3$ “states”, but the “expected” answer is $2$. It’s just a small technicality - don’t worry too much about it. (Even if we do count the state at time $T$, we can just set $g(x) = 0$ for absorbing states and it all works out okay.)

The system of linear equations is quite simple:

$$\begin{equation*} \begin{split} v_0 &= 0 \\ v_N &= 0 \\ v_k &= 1 + pv_{k+1} + qv_{k-1}, \quad k = 1, 2, \dots, N-1 \end{split} \end{equation*}$$
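Just like the system for $u_k$, this one can be solved numerically in a few lines; here is a sketch (again my own, mirroring the earlier `broke_probabilities` function, with arbitrary parameter values):

```python
import numpy as np

def expected_rounds(p: float, N: int) -> np.ndarray:
    """Solve the FSA linear system for v_k = E[number of rounds | X_0 = k], k = 0..N."""
    q = 1 - p
    A = np.zeros((N + 1, N + 1))
    b = np.zeros(N + 1)

    # Boundary conditions: v_0 = v_N = 0 (the game is already over there).
    A[0, 0] = 1.0
    A[N, N] = 1.0

    # Interior states: v_k - p*v_{k+1} - q*v_{k-1} = 1.
    for k in range(1, N):
        A[k, k], A[k, k + 1], A[k, k - 1] = 1.0, -p, -q
        b[k] = 1.0

    return np.linalg.solve(A, b)

print(expected_rounds(p=0.5, N=10))  # for the fair game this matches k*(N-k), derived below
```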

Similar to what we had done for the previous question, we can use the fact that $p + q = 1$ and analyze $v_k$ as follows:

$$\begin{equation*} \begin{split} (p+q)v_k &= 1 + pv_{k+1} + qv_{k-1} \\ v_{k+1} - v_k &= \frac{q}{p}(v_k - v_{k-1}) - \frac{1}{p} \end{split} \end{equation*}$$

Similar to what we had done before, we can consider 2 cases: 1) $p = q$, and 2) $p \neq q$.

Case 1: $p = q = 1/2$. The game is fair.

Then, using $v_0 = 0$, the formula reduces to:

$$v_{k+1} - v_k = (v_k - v_{k-1}) - 2 = \dots = (v_1 - v_0) - 2k = v_1 - 2k$$

Noting that $v_0 = 0$, we have (think: telescoping series):

$$\begin{equation*} \begin{split} v_k = v_k - v_0 &= (v_k - v_{k-1}) + (v_{k-1} - v_{k-2}) + \dots + (v_1 - v_0) \\ &= (v_1 - 2(k-1)) + (v_1 - 2(k-2)) + \dots + v_1 \\ &= kv_1 - k(k-1) \end{split} \end{equation*}$$

For $k = N$, the above equation must also be true, and we know that $v_N = 0$, and so,

$$v_N = 0 = Nv_1 - N(N-1) \implies v_1 = v_N/N + (N-1) = N-1$$

Therefore (substituting the above value of $v_1$ in the previous equation), for $k = 1, 2, \dots, N-1$, we have:

$$v_k = kv_1 - k(k-1) = k(N-k)$$
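For example, a fair game with $N = 10$ and a starting bankroll of $k = 5$ dollars lasts, on average,

$$v_5 = 5 \times (10 - 5) = 25 \text{ rounds.}$$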

How do we interpret this?

  • When $N \to \infty$, the above equation tells us that the expected number of games the gambler plays also tends to infinity. But recall that we have already shown that as $N \to \infty$, the probability of going broke tends to $1$. Moreover, we know that the probability of stopping (either through going broke or reaching $N$) must be at least the probability of going broke, i.e., $P(\text{stop}) \geq P(\text{broke}) = 1$. This may seem counter-intuitive: for $N \to \infty$ (the gambler (virtually) “never” stops unless he goes broke), he is (virtually) destined to go broke, and at the same time, the expected number of games he plays is infinite. This means that, in expectation, “at some point in the very distant future (which refers to the ‘nearly infinite’ number of games), he will go broke”. Note that this equation (actually, all our equations so far 😲) is only true for a finite $N$, so when we say “$N \to \infty$”, we actually mean “a very large value of $N$”.

Case 2: $p \neq q$

Repeatedly applying the recurrence, we get:

$$v_{k+1} - v_k = \frac{q}{p}(v_k - v_{k-1}) - \frac{1}{p} = \dots = \left(\frac{q}{p}\right)^k(v_1 - v_0) - \frac{\left(\frac{q}{p}\right)^{k-1} + \dots + 1}{p}$$

We can simplify the second term on the RHS using the geometric series formula:

$$\frac{\left(\frac{q}{p}\right)^{k-1} + \dots + 1}{p} = \left[\frac{1 - (q/p)^k}{1 - (q/p)}\right] / p = \frac{1 - (q/p)^k}{p - q}$$

So, we get:

$$v_{k+1} - v_k = \left(\frac{q}{p}\right)^k(v_1 - v_0) - \frac{1 - (q/p)^k}{p - q}$$

We can repeatedly apply this equation for smaller values of $k$ (just as we had done in the first question) and use the formula for geometric series to obtain a simplified equation for $v_k$ in terms of $v_1$ and $v_0$ (details omitted for the sake of brevity). Moreover, we can use the fact that $v_N = 0$ and find the value of $(v_1 - v_0)$. Substituting this in the general equation for $v_k$, we get:

$$v_k = \frac{1}{(p-q)}\left[\frac{N(1-(q/p)^k)}{1 - (q/p)^N} - k \right]$$
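As before, here is a short sketch (my own; parameter values arbitrary) implementing this closed form, which can be compared against the `expected_rounds` linear-system solver above:

```python
def v_closed_form(p: float, k: int, N: int) -> float:
    """Closed-form expected number of rounds, starting with k dollars."""
    q = 1 - p
    if abs(p - q) < 1e-12:               # fair game: v_k = k(N - k)
        return k * (N - k)
    r = q / p
    return (N * (1 - r**k) / (1 - r**N) - k) / (p - q)

# Compare with the numerical solution of the FSA linear system:
# expected_rounds(0.45, 10)[5] should agree with v_closed_form(0.45, k=5, N=10)
print(v_closed_form(0.45, k=5, N=10))
```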

How can we interpret this?

  • When $p > q$ (the game is biased in our favor) and we start off with more money (i.e., higher $k$), we will play fewer games (for a fixed $N$) → because we are more likely to reach the “winning state” (i.e., obtain $N$ dollars) quickly and we’ll just quit earlier.
  • Similarly, when $q > p$ (the game is biased against us) and we start off with less money, we are likely to go broke quickly (and so, we will play fewer games).
  • Suppose we get greedy and abandon our stopping rule (i.e., set $N$ to be very large):
    • If $p > q$, the random walk drifts off to infinity, i.e., it takes a very long time for the game to stop
    • If $p < q$, $v_k \to \frac{k}{q-p}$, and so when the winning probability is less than $0.5$, the expected duration of the game is still finite (see the limit computation right below)
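The last bullet can be read off directly from the general formula: when $p < q$, we have $r = q/p > 1$, so $\dfrac{N(1 - (q/p)^k)}{1 - (q/p)^N} \to 0$ as $N \to \infty$ (the $r^N$ in the denominator grows much faster than the $N$ in the numerator), and therefore

$$v_k \to \frac{1}{p - q}(0 - k) = \frac{k}{q - p}.$$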
note

💡 It is important to keep in mind that all our equations and interpretations are based on a finite, albeit arbitrarily large or small, value of $N$ (because in our derivations, we used the fact that $u_N = 0$ and $v_N = 0$).


Remarks on FSA (First Step Analysis)

We’ve seen that FSA can be applied when we’re interested in a quantity that can be expressed as:

$$a_i = E\left[\sum_{n=0}^T g(X_n) \,\middle|\, X_0=i\right]$$

Let’s answer the following questions:

  1. How do we find $g(X_n)$?

    Intuitively, we want to find out what each step contributes to the final quantity of interest, and how. We need to have a deep understanding of the formula and what it really means in order to apply it.

    For example, if we want to find $E[T|X_0=i]$, then we set $g(X_n) = 1$, which means that every step we take contributes one “time unit” to the final answer.

    When we’re interested in $P(X_T=0|X_0=i)$, we’re only concerned with the last state (i.e., the state at the stopping time), so we can write it as $g(X_n) = I\{X_T=0\}$ (it’s not a typo → for any $n < T$, $g(X_n) = 0$, and only for $n = T$ is $g(X_T)$ equal to $1$ if $X_T = 0$, and $0$ otherwise). But since $0$ is an absorbing state, we are sure that for any $n < T$, $X_n \neq 0$, and so we can simplify it to $g(X_n) = I\{X_n=0\}$.

  2. Is $g(X_n)$ unique for one problem?

    The functional form of $g(X_n)$ can be different, but the final quantity of interest needs to be the same → so the contribution of each term must be the same, but different people may have different ways of expressing the same contribution (using different functions that evaluate to the same numerical value for each step). So, technically, the form of $g$ is not unique, but practically speaking, the values of $g(X_n)$ are “fixed” for a given problem (otherwise we would get different answers).

  3. Can FSA be applied only to quantities of the form $a_i$? How about other formats?

    Essentially, yes. One common “other” use case is when we want to reason about a specific path/sub-path (e.g., the expected number of times we observe $X_n = i, X_{n+1} = j$ in the stochastic process). In such a case, we cannot directly capture information about both states $X_n$ and $X_{n+1}$ in a single equation for $a_i$. However, we can define a new Markov process $\{Y_n\}$ such that $Y_n = (X_n, X_{n+1})$ and then perform first-step analysis on the process $Y$ (a small sketch of this construction is given after this list). If we had $4$ possible values of $X_n$, we would have $4^2 = 16$ possible values for $Y_n$. So, we can write out the 16 equations for $Y_{ij}$ and solve all of them.

    Intuitively, there’s no reason to stop at “first-step” analysis → so this is identical to performing “two-step” analysis. We look at how “two” steps behave together (and we need to “store”/capture the previous state’s information in some way → the cleanest way to do this is to define a new Markov process that does exactly this).

    But it is not always possible to apply FSA. For example, consider $S =$ the number of times we visit state $2$ in the Markov process. Obviously, we can find $E[S|X_0=i]$ using FSA. But we cannot find $Var(S)$ using FSA. Why? Because we don’t have a convenient way of representing $S^2$ as a sum of per-step contributions, which makes it difficult to perform the analysis.
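To make the lifted process $Y_n = (X_n, X_{n+1})$ from point 3 concrete, here is a minimal sketch (a hypothetical 4-state chain and my own variable names, not from the notes). The pair $(i, j)$ can only move to a pair of the form $(j, k)$, and it does so with probability $P_{jk}$, so the lifted transition matrix is built directly from the original one:

```python
import numpy as np

# Transition matrix of a hypothetical 4-state chain X (rows sum to 1).
P = np.array([
    [1.0, 0.0, 0.0, 0.0],   # state 0 is absorbing
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 1.0],   # state 3 is absorbing
])
n = P.shape[0]

# The lifted chain Y_n = (X_n, X_{n+1}) has n^2 = 16 states, indexed by pairs (i, j).
PY = np.zeros((n * n, n * n))
for i in range(n):
    for j in range(n):
        for k in range(n):
            # (i, j) -> (j, k) happens exactly when X moves from j to k.
            PY[i * n + j, j * n + k] = P[j, k]

# Each row of PY is still a probability distribution, so first-step analysis
# can be applied to Y exactly as it was applied to X.
assert np.allclose(PY.sum(axis=1), 1.0)
```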