General Solution to Gambler’s Ruin
Until now, we have been using fixed values for the initial condition and the stopping constraints. But what if we want to solve a similar problem with a different constraint? We would have to solve it all over again (set up the linear system, etc.)
Instead of that, let’s solve the general version of the gambler’s ruin problem, which can be stated as follows:
Suppose the gambler starts with $k$ dollars in hand. In each round, the gambler wins 1 dollar with probability $p$ and loses 1 dollar with probability $1-p$. The gambler stops playing once he has $n$ dollars in hand, or when he goes broke.
Then, we’re interested to know:
- What is the probability that he will go broke? (and, as a consequence, what is the probability that he has $n$ dollars at the end?)
- How many rounds will he play before the game ends?
The gambler’s ruin “process” is also more generally referred to as a “random walk” (where there is a probability $p$ of walking 1 step right and $1-p$ of walking 1 step left at any state, and we stop walking when we reach either of the 2 points on the left/right end). A random walk can also be considered in 2-D (which becomes slightly more complicated but the idea still holds) or even $n$-D 😲
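Before any algebra, here’s a minimal Python sketch of the process (my own addition, not part of the original derivation; the function name is made up). It’s handy for sanity-checking the closed-form answers we derive below:

```python
import random

def simulate_gamblers_ruin(k, n, p, rng=random):
    """Play one game: start with k dollars, win $1 w.p. p, lose $1 w.p. 1-p,
    and stop upon reaching 0 (broke) or n (target)."""
    x, rounds = k, 0
    while 0 < x < n:
        x += 1 if rng.random() < p else -1
        rounds += 1
    return x == 0, rounds

# Monte Carlo estimates for k = 5, n = 10, p = 0.5 (a fair game).
trials = 100_000
games = [simulate_gamblers_ruin(5, 10, 0.5) for _ in range(trials)]
print("P(broke)  ≈", sum(b for b, _ in games) / trials)   # theory (derived below): 1 - 5/10 = 0.5
print("E[rounds] ≈", sum(r for _, r in games) / trials)   # theory (derived below): 5 * (10 - 5) = 25
```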
Probability of Going Broke
Let’s deal with the first question first.
We can write it as $\Pr(X_T = 0 \mid X_0 = k)$, where $T$ is the time at which the game ends (i.e., the first time the walk hits $0$ or $n$).
Then, we can define, for every $0 \leq k \leq n$, $u_k = \Pr(X_T = 0 \mid X_0 = k)$. We have already seen that it can be written as:

$$u_k = E\left[\sum_{t=0}^{T} \mathbb{1}[X_t = 0] \,\middle|\, X_0 = k\right]$$
According to first-step analysis, we have:

$$u_k = \mathbb{1}[k = 0] + \mathbb{1}[0 < k < n]\,\big(p\,u_{k+1} + (1-p)\,u_{k-1}\big)$$
So, when $k = 0$, this expression can be simplified to $u_0 = 1$ (since the indicator function is “on” and the probability of escaping the “broke” state is $0$). Also, for $k = n$, this becomes $u_n = 0$ (the indicator is “off” and the game is over).
For all other values of $k$ (i.e., $0 < k < n$), we have:

$$u_k = p\,u_{k+1} + (1-p)\,u_{k-1}$$
Put in words, this is because we have probability $p$ of going from state $k$ to $k+1$ and probability $1-p$ of going from $k$ to $k-1$. Remember that this is only true for non-absorbing states.
We can write out the system of equations for all values of $k$ now.
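As a sanity check, here’s a short sketch (my own, assuming `numpy`; the function name is made up) that sets up and solves this tridiagonal system directly:

```python
import numpy as np

def broke_probabilities(n, p):
    """Solve u_0 = 1, u_n = 0, u_k = p*u_{k+1} + (1-p)*u_{k-1} for 0 < k < n."""
    A = np.zeros((n + 1, n + 1))
    b = np.zeros(n + 1)
    A[0, 0], b[0] = 1.0, 1.0              # u_0 = 1
    A[n, n], b[n] = 1.0, 0.0              # u_n = 0
    for k in range(1, n):                 # u_k - p*u_{k+1} - (1-p)*u_{k-1} = 0
        A[k, k], A[k, k + 1], A[k, k - 1] = 1.0, -p, -(1 - p)
    return np.linalg.solve(A, b)

print(broke_probabilities(10, 0.5))   # [1.0, 0.9, 0.8, ..., 0.1, 0.0]
```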
How do we solve this? We can use some neat algebraic manipulation here. Remember that $p + (1-p) = 1$. So, we can rewrite our expression for $u_k$ in this form:

$$p\,u_k + (1-p)\,u_k = p\,u_{k+1} + (1-p)\,u_{k-1} \implies u_{k+1} - u_k = \frac{1-p}{p}\,(u_k - u_{k-1})$$
Notice that we can repeatedly apply this result to smaller values of $k$ as such:

$$u_{k+1} - u_k = \frac{1-p}{p}\,(u_k - u_{k-1}) = \left(\frac{1-p}{p}\right)^2 (u_{k-1} - u_{k-2}) = \dots = \left(\frac{1-p}{p}\right)^k (u_1 - u_0)$$
Then, we have:

$$u_k - u_0 = \sum_{i=0}^{k-1}(u_{i+1} - u_i) = (u_1 - u_0) \sum_{i=0}^{k-1}\left(\frac{1-p}{p}\right)^i$$
We now need to evaluate the sum of the finite series $\sum_{i=0}^{k-1}\left(\frac{1-p}{p}\right)^i$.
To do so, we need 2 separate cases: 1) $p = \frac{1}{2}$, and 2) $p \neq \frac{1}{2}$.
Case 1: When $p = \frac{1}{2}$ (that is, the game is “fair”), we have $\frac{1-p}{p} = 1$. So, the result is simply:

$$u_k = u_0 + k\,(u_1 - u_0)$$
This needs to be true for $k = n$ as well. But we already know that $u_n = 0$. Using this, we can find the value of $u_1 - u_0$:

$$0 = u_0 + n\,(u_1 - u_0) \implies u_1 - u_0 = -\frac{u_0}{n}$$
Therefore, for $0 \leq k \leq n$, we have:

$$u_k = u_0\left(1 - \frac{k}{n}\right)$$
Using the fact that $u_0 = 1$, we get:

$$u_k = 1 - \frac{k}{n}$$
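A quick numerical check of this closed form (for $n = 10$; it matches the `broke_probabilities(10, 0.5)` output from the sketch above):

```python
# u_k = 1 - k/n for the fair game, n = 10.
n = 10
print([1 - k / n for k in range(n + 1)])
# [1.0, 0.9, 0.8, ..., 0.1, 0.0] — same as the linear-system solution
```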
How can we interpret this result?
- When $k$ increases (that is, when we have more money to start with), the probability of going broke is smaller (for a fixed value of $n$)
- When $n \to \infty$ (the gambler sets no bound on when he will stop, i.e., he will only stop when he goes broke), the probability of going broke converges to $1$ even if the game is fair. This means that if we never stop, we will go broke eventually (this is why it’s called the gambler’s “ruin”: since it’s very difficult to quit when you’re winning, you are doomed to go broke. This is also one of the reasons the casinos win → they have virtually unlimited cash compared to you, so it’s more likely that you go broke before the casino does, even if the game is fair)
💡 From a gambler’s perspective, it makes sense to “set” $n$ (in his mind) to be smaller than $2k$ so that the probability of going broke is less than $\frac{1}{2}$ (which means he is more likely than not to reach $n$ dollars and quit, hence making a profit).
Case 2: Okay, now we need to solve the case when the game is not fair, i.e., $p \neq \frac{1}{2}$.
Then, using the geometric series formula, we have:

$$\sum_{i=0}^{k-1}\left(\frac{1-p}{p}\right)^i = \frac{1 - \left(\frac{1-p}{p}\right)^k}{1 - \frac{1-p}{p}}$$
Then,

$$u_k = u_0 + (u_1 - u_0) \cdot \frac{1 - \left(\frac{1-p}{p}\right)^k}{1 - \frac{1-p}{p}}$$
This must be true for $k = n$ (we use $k = n$ because we already know the “real” value $u_n = 0$, and using this, we can find the value of $u_1 - u_0$):

$$0 = u_0 + (u_1 - u_0) \cdot \frac{1 - \left(\frac{1-p}{p}\right)^n}{1 - \frac{1-p}{p}} \implies u_1 - u_0 = -\,u_0 \cdot \frac{1 - \frac{1-p}{p}}{1 - \left(\frac{1-p}{p}\right)^n}$$
Therefore, for $0 \leq k \leq n$,

$$u_k = u_0\left(1 - \frac{1 - \left(\frac{1-p}{p}\right)^k}{1 - \left(\frac{1-p}{p}\right)^n}\right)$$
So, the solution (using $u_0 = 1$) is:

$$u_k = \frac{\left(\frac{1-p}{p}\right)^k - \left(\frac{1-p}{p}\right)^n}{1 - \left(\frac{1-p}{p}\right)^n}$$
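Putting both cases together, here’s a small sketch of the closed form (function name mine; the value for $p = 0.45$ agrees with the linear-system solver above):

```python
def broke_probability(k, n, p):
    """Closed-form u_k: probability of going broke starting from k dollars."""
    if p == 0.5:
        return 1 - k / n                    # Case 1: fair game
    r = (1 - p) / p
    return (r**k - r**n) / (1 - r**n)       # Case 2: biased game

print(broke_probability(5, 10, 0.5))    # 0.5
print(broke_probability(5, 10, 0.45))   # ≈ 0.73 (game slightly against us)
```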
How do we interpret this result:
- When $p < \frac{1}{2}$ (game is NOT in our favor), we have $\frac{1-p}{p} > 1$, and the probability of going broke is approximately $1 - \left(\frac{1-p}{p}\right)^{k-n}$ (divide the numerator and denominator of the solution above by $\left(\frac{1-p}{p}\right)^n$). So, when $n \to \infty$, the probability of going broke goes to $1$, i.e., if we do not set a stopping rule for ourselves, we will go broke eventually.
- When $p > \frac{1}{2}$ (the game is in our favor), we have $\frac{1-p}{p} < 1$, and the probability of going broke is approximately $\left(\frac{1-p}{p}\right)^k - \left(\frac{1-p}{p}\right)^n$. This means that when $n \to \infty$, the probability of going broke converges to $\left(\frac{1-p}{p}\right)^k$. Since $\frac{1-p}{p} < 1$, it’s likely that we will not go broke. But notice that even in this case, increasing $n$ increases the probability of going broke (since we will end up subtracting a smaller quantity $\left(\frac{1-p}{p}\right)^n$).
- When $k$ increases (for a fixed $n$), i.e., we start out with more money, the probability of going broke is smaller.
- It should come as no surprise that for a fixed $k$ and $n$, if the probability $p$ of winning a round increases, our chances of going broke decrease (why else would we be wishing for “good” luck 😲) → see the quick tabulation below.
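To see these effects numerically, here’s a quick tabulation (my own addition) of the closed form for $k = 5$, $n = 10$ as $p$ varies:

```python
# P(broke) for k = 5, n = 10, as the win probability p varies.
for p in (0.40, 0.45, 0.50, 0.55, 0.60):
    r = (1 - p) / p
    u = 1 - 5 / 10 if p == 0.5 else (r**5 - r**10) / (1 - r**10)
    print(f"p = {p:.2f}  ->  P(broke) ≈ {u:.3f}")
# p = 0.40 -> 0.884, 0.45 -> 0.732, 0.50 -> 0.500, 0.55 -> 0.268, 0.60 -> 0.116
```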
Expected Number of Games Played
Recall that we’re trying to find out the average number of games the gambler will play before he reaches $0$ or $n$ dollars (i.e., before he “gets absorbed”).
That is, we are trying to find: $E[T \mid X_0 = k]$.
We have already seen that it can be rewritten as:

$$m_k := E[T \mid X_0 = k] = E\left[\sum_{t=0}^{T-1} g(X_t) \,\middle|\, X_0 = k\right], \quad \text{where } g(x) = 1 \text{ for any } x$$
Why is it $T-1$ and not $T$ here? Because we are concerned with the number of transitions here. In particular, we don’t want to count the “state” where the game is over (because in that state, he doesn’t play any game) → think of it as the difference between counting the number of nodes on a path and the number of edges on a path. Here, we care about the edges. If he plays 2 games, he will go from $X_0 \to X_1 \to X_2$ (absorbed here), which is a total of $3$ “states”, but the “expected” answer is $2$. It’s just a small technicality - don’t worry too much about it. (Even if we do count the state at time $T$, we can just set $g(x) = 0$ for absorbing states $x$ and it all works out okay.)
The system of linear equations is quite simple:

$$m_0 = 0, \qquad m_n = 0, \qquad m_k = 1 + p\,m_{k+1} + (1-p)\,m_{k-1} \quad (0 < k < n)$$
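As before, a short sketch (assuming `numpy`; the function name is my own) that solves this system directly:

```python
import numpy as np

def expected_durations(n, p):
    """Solve m_0 = m_n = 0, m_k = 1 + p*m_{k+1} + (1-p)*m_{k-1} for 0 < k < n."""
    A = np.zeros((n + 1, n + 1))
    b = np.zeros(n + 1)
    A[0, 0] = A[n, n] = 1.0               # m_0 = 0, m_n = 0 (b is already 0 there)
    for k in range(1, n):                 # m_k - p*m_{k+1} - (1-p)*m_{k-1} = 1
        A[k, k], A[k, k + 1], A[k, k - 1] = 1.0, -p, -(1 - p)
        b[k] = 1.0
    return np.linalg.solve(A, b)

print(expected_durations(10, 0.5))    # m_k = k*(10-k): [0, 9, 16, 21, 24, 25, ...]
```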
Similar to what we had done for the previous question, we can use the fact that $p + (1-p) = 1$ and analyze $m_{k+1} - m_k$ as such:

$$p\,m_k + (1-p)\,m_k = 1 + p\,m_{k+1} + (1-p)\,m_{k-1} \implies m_{k+1} - m_k = \frac{1-p}{p}\,(m_k - m_{k-1}) - \frac{1}{p}$$
Similar to what we had done before, we can consider 2 cases: 1) $p = \frac{1}{2}$, and 2) $p \neq \frac{1}{2}$
Case 1: $p = \frac{1}{2}$. The game is fair.
Then, using $\frac{1-p}{p} = 1$ and $\frac{1}{p} = 2$, the formula reduces to:

$$m_{k+1} - m_k = (m_k - m_{k-1}) - 2 = \dots = (m_1 - m_0) - 2k$$
Noting that $m_0 = 0$, we have (think: telescoping series):

$$m_k = \sum_{i=0}^{k-1}(m_{i+1} - m_i) = \sum_{i=0}^{k-1}(m_1 - 2i) = k\,m_1 - k(k-1)$$
For $k = n$, the above equation must also be true, and we know that $m_n = 0$, and so,

$$0 = n\,m_1 - n(n-1) \implies m_1 = n - 1$$
Therefore (substituting the above value of $m_1$ in the previous equation), for $0 \leq k \leq n$, we have:

$$m_k = k(n-1) - k(k-1) = k(n-k)$$
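A quick check of this closed form for $n = 10$ (it matches the `expected_durations(10, 0.5)` output above):

```python
# m_k = k*(n-k) for the fair game, n = 10.
n = 10
print([k * (n - k) for k in range(n + 1)])
# [0, 9, 16, 21, 24, 25, 24, 21, 16, 9, 0] — same as the linear-system solution
```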
How do we interpret this?
- When $n \to \infty$, the above equation tells us that the expected number of games the gambler plays also tends to infinity. But recall that we have already shown that as $n \to \infty$, the probability of going broke tends to $1$. Moreover, we know that the probability of stopping (either through going broke or reaching $n$) must be at least the probability of going broke, i.e., it also tends to $1$. This may seem counter-intuitive: for $n \to \infty$ (the gambler (virtually) “never” stops unless he goes broke), he is (virtually) destined to go broke, and at the same time, the expected number of games he plays is infinite. This means that in expectation, “at some point in the very distant future (which refers to the ‘nearly infinite games’), he will go broke”. Note that this equation (actually, all our equations so far 😲) is only true for a finite $n$, so when we say “$n \to \infty$”, we actually mean “a very large value of $n$”
Case 2: $p \neq \frac{1}{2}$
Repeatedly applying the recurrence (just as before, with $m_0 = 0$) and then using the formula for geometric series, we get:

$$m_{k+1} - m_k = \left(\frac{1-p}{p}\right)^k m_1 - \frac{1}{p} \cdot \frac{1 - \left(\frac{1-p}{p}\right)^k}{1 - \frac{1-p}{p}}$$
We can simplify the second term on the RHS to be:

$$\frac{1}{p} \cdot \frac{1 - \left(\frac{1-p}{p}\right)^k}{1 - \frac{1-p}{p}} = \frac{1 - \left(\frac{1-p}{p}\right)^k}{2p - 1}$$

(since $p\left(1 - \frac{1-p}{p}\right) = p - (1-p) = 2p - 1$)
So, we get:

$$m_{k+1} - m_k = \left(\frac{1-p}{p}\right)^k m_1 - \frac{1 - \left(\frac{1-p}{p}\right)^k}{2p - 1}$$
We can repeatedly apply this equation for smaller values of $k$ (just as we had done in the first question) and use the formula for geometric series to obtain a simplified equation for $m_k$ in terms of $m_1$, $p$, and $k$ (details omitted for the sake of brevity). Moreover, we can use the fact that $m_n = 0$ and find the value of $m_1$. Substituting this in the general equation for $m_k$, we get:

$$m_k = \frac{k}{1 - 2p} - \frac{n}{1 - 2p} \cdot \frac{1 - \left(\frac{1-p}{p}\right)^k}{1 - \left(\frac{1-p}{p}\right)^n}$$
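As a sketch (the function name is mine), the closed form in code, evaluated at $k = 5$, $n = 10$; the values agree with the `expected_durations` solver above:

```python
def expected_duration(k, n, p):
    """Closed-form m_k for p != 1/2, with r = (1-p)/p."""
    r = (1 - p) / p
    return k / (1 - 2 * p) - (n / (1 - 2 * p)) * (1 - r**k) / (1 - r**n)

print(expected_duration(5, 10, 0.45))  # ≈ 23.2 (vs. 25 rounds for the fair game)
print(expected_duration(5, 10, 0.55))  # ≈ 23.2 as well — symmetric since k = n - k here
```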
How can we interpret this?
- When $p > \frac{1}{2}$ (game is biased in our favor) and we start off with more money (i.e., higher $k$), we will play fewer games (for a fixed $n$) → because we are more likely to reach the “winning state” (i.e., obtain $n$ dollars) quickly, and we’ll just quit earlier.
- Similarly, when $p < \frac{1}{2}$ (game is biased against us) and we start off with less money, we are likely to go broke quickly (and so, we will play fewer games).
- Suppose we get greedy and abandon our stopping rule (i.e., set $n$ to be very large):
    - If $p > \frac{1}{2}$, $m_k$ diverges to infinity as $n \to \infty$, i.e., it takes a very long time to stop the game
    - If $p < \frac{1}{2}$, $m_k \to \frac{k}{1 - 2p}$ (a finite quantity), and so when the winning probability is less than 0.5, the expected duration of the game is still finite.
💡 It is important to keep in mind that all our equations and interpretations are based on a finite, albeit arbitrarily large or small, value of $n$ (because in our derivations, we used the facts that $u_n = 0$ and $m_n = 0$)
Remarks on FSA (First Step Analysis)
We’ve seen that FSA can be applied when we’re interested in a quantity that can be expressed as:

$$E\left[\sum_{t=0}^{T} g(X_t) \,\middle|\, X_0 = k\right]$$
Let’s answer the following questions:
How do we find $g$?
Intuitively, we want to figure out what each step contributes to the final quantity of interest. We need to have a deep understanding of the formula and what it really means to be able to apply it.
For example, if we want to find $E[T]$, then we set $g(x) = 1$ for every non-absorbing state $x$ (and $g(x) = 0$ for the absorbing ones, as noted earlier), which means that every step we take contributes one “time unit” to the final answer.
When we’re interested in $\Pr(X_T = 0)$, we’re only concerned with the last state (i.e., the state at the stopping time), so we can write it as: $E\left[\sum_{t=0}^{T} \mathbb{1}[X_t = 0]\right]$ (it’s not a typo → we really do sum over every $t \leq T$; the indicator $\mathbb{1}[X_t = 0]$ is $1$ if $X_t = 0$, and $0$ otherwise). But since $0$ is an absorbing state, we are sure that $X_t \neq 0$ for any $t < T$, and so, we can simplify it to be: $E\left[\mathbb{1}[X_T = 0]\right] = \Pr(X_T = 0)$.
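To make this concrete, here’s a generic FSA solver sketch (my own construction, assuming `numpy`): it solves $h_k = g(k) + \sum_j P_{kj} h_j$ for non-absorbing $k$, with $h_k = g(k)$ at absorbing states. Plugging in the two choices of $g$ above recovers both quantities we computed for the gambler’s ruin:

```python
import numpy as np

def first_step_analysis(P, g, absorbing):
    """Solve h_k = g(k) + sum_j P[k, j] * h_j for non-absorbing k,
    with h_k = g(k) at absorbing states (the sum runs up to time T)."""
    A = np.eye(len(P))
    b = np.asarray(g, dtype=float)
    for k in range(len(P)):
        if k not in absorbing:
            A[k] -= P[k]          # h_k - sum_j P[k, j] * h_j = g(k)
    return np.linalg.solve(A, b)

# Gambler's ruin with n = 3, p = 0.5; states 0..3, where 0 and 3 are absorbing.
p = 0.5
P = np.array([[1.0, 0.0, 0.0, 0.0],
              [1 - p, 0.0, p, 0.0],
              [0.0, 1 - p, 0.0, p],
              [0.0, 0.0, 0.0, 1.0]])

# g = 1 on non-absorbing states -> h_k = E[T]. Expect m_k = k*(3-k): [0, 2, 2, 0].
print(first_step_analysis(P, [0, 1, 1, 0], {0, 3}))
# g = indicator of state 0 -> h_k = P(broke). Expect u_k = 1 - k/3: [1, 2/3, 1/3, 0].
print(first_step_analysis(P, [1, 0, 0, 0], {0, 3}))
```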
Is $g$ unique for one problem?
The functional form of $g$ can be different, but the final quantity of interest needs to be the same → so the contribution of each term must be the same, but different people may have different ways of expressing the same contribution (using different functions that evaluate to the same numerical value for each step). So, technically, the form of $g$ is not unique, but practically speaking, the values of $g$ are “fixed” for a given problem (otherwise we would get different answers).
Can FSA only be applied to quantities of the form $E\left[\sum_{t=0}^{T} g(X_t)\right]$? How about other formats?
Yes, FSA itself needs the quantity to be in that form, but we can often massage other problems into it. One common “other” use case is when we want to find the probability of a specific path/subpath (e.g., the expected number of times we observe the transition $i \to j$ in the stochastic process). In such a case, we cannot directly capture information about both the states $X_t$ and $X_{t+1}$ in a single equation of $g(X_t)$. However, we can define a new Markov process $Y_t = (X_t, X_{t+1})$ and then perform first-step analysis on the process $Y_t$ (see the sketch below). If we had $4$ possible values of $X_t$, we would have $4^2 = 16$ possible values for $Y_t$. So, we can write out the 16 equations of $Y_t$ and solve all of them.
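Here’s a sketch (names mine, assuming `numpy`) of how the lifted chain $Y_t$ could be constructed from the transition matrix of $X_t$:

```python
import numpy as np
from itertools import product

def pair_chain(P):
    """Lift a chain X_t with transition matrix P to Y_t = (X_t, X_{t+1}).
    Y moves from (a, b) to (b, c) with probability P[b, c]; every other
    transition has probability 0."""
    n = len(P)
    states = list(product(range(n), repeat=2))     # n^2 pair-states
    index = {s: i for i, s in enumerate(states)}
    Q = np.zeros((n * n, n * n))
    for (a, b) in states:
        for c in range(n):
            Q[index[(a, b)], index[(b, c)]] = P[b, c]
    return states, Q

# A 4-state chain lifts to 16 pair-states, hence 16 FSA equations on Y.
P = np.full((4, 4), 0.25)
states, Q = pair_chain(P)
print(len(states), Q.shape)    # 16 (16, 16)
```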
Intuitively, there’s no reason to stop at “first-step” analysis → so this is identical to performing “two-step” analysis. We look at how “two” steps behave together (and we need to “store”/capture the previous state’s information in some way → the cleanest way to do this is to define a new Markov process that does exactly this).
But, it is not always possible to apply FSA. For example, consider $N_j$, the number of times we visit state $j$ in the Markov process. Obviously, we can find $E[N_j]$ using FSA (take $g(x) = \mathbb{1}[x = j]$). But we cannot find $E[N_j^2]$ (and hence $\mathrm{Var}(N_j)$) using FSA. Why? Because we don’t have a convenient way of representing $N_j^2$ as a sum of per-step contributions $g(X_t)$, which makes it difficult to perform the analysis.