
Generalizing Long Run Performance Beyond Finite MCs

Recall this from the previous page:

(figure from the previous page: summary of long-run behavior for each type of irreducible MC)

Note that we’re only talking about irreducible MCs for now. We already know what happens in all the cases except when we have a recurrent aperiodic MC with infinite states.

We will now try to generalize the results we’ve obtained for a regular MC to a recurrent aperiodic MC with infinitely many states. That is, we want to answer the question: when there are infinitely many states, what is the analogue of regularity?

Is it possible that we have $P(\text{return}) = 1$ and $E[\text{Return Time}] = \infty$? Then it still seems “impossible” to return (even though the probability is $1$). We want to be able to interpret this correctly (and gain intuition about it).

Generalizing the Main Theorem

Consider a recurrent MC.

definition

First Return Time: $R_i = \min\{n \geq 1, X_n = i\}$

In words, it is the first time that the process $X_n$ returns to $i$.

Recall that we defined $f_{ii}^{(n)}$ (first return probability) to be “starting from $i$, the probability that the first return to $i$ happens at the $n$-th step”.

So, what is the relationship between $R_i$ and $f_{ii}^{(n)}$?

Clearly, we can express it as: $f_{ii}^{(n)} = P(R_i = n \mid X_0 = i)$

In other words, we can say that $f_{ii}^{(n)}$ is the “probability mass” of $R_i$ at $n$, i.e., $f_{ii}^{(n)}$ gives us the probability that $R_i = n$.

For recurrent states, we know that $f_{ii} = 1$, i.e., $\sum_{n} f_{ii}^{(n)} = 1$. Therefore, we can calculate $E[R_i \mid X_0 = i]$, which can be interpreted as the average first return time or (more commonly known as) the mean duration between visits.

By definition of “mean” (expectation), we can write it as such:

$$m_i = E[R_i \mid X_0 = i] = \sum_{n=1}^\infty n P(R_i = n \mid X_0 = i) = \sum_{n=1}^\infty n f_{ii}^{(n)}$$

Note that we can only define $m_i$ when $f_{ii} = 1$. When we have $f_{ii} < 1$, the probability that there are infinitely many steps between 2 visits (i.e., the process never returns) is non-zero, and equal to $1 - f_{ii}$, so the expectation is infinite (which is not very meaningful).
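To make this concrete, here is a minimal numerical sketch (the 2-state chain and the helper `first_return_dist` are illustrative choices, not from the notes): we extract $f_{00}^{(n)}$ exactly by propagating probability mass and removing whatever mass returns to state $0$, then sum $n f_{00}^{(n)}$. For this chain, one can check by hand that $m_0 = (a+b)/b$.

```python
import numpy as np

def first_return_dist(P, i, N):
    """Exact first-return probabilities f_ii^(n) for n = 1..N.

    Take one step from state i; afterwards, any mass that lands
    back on i is recorded as a first return and removed."""
    f = np.zeros(N + 1)
    dist = P[i].copy()          # distribution after the first step
    f[1] = dist[i]
    dist[i] = 0.0               # remove mass that already returned
    for n in range(2, N + 1):
        dist = dist @ P
        f[n] = dist[i]
        dist[i] = 0.0
    return f

# Hypothetical 2-state chain (values chosen for illustration)
a, b = 0.3, 0.5
P = np.array([[1 - a, a], [b, 1 - b]])
f = first_return_dist(P, 0, 200)
m0 = sum(n * f[n] for n in range(1, 201))
print(m0)        # ≈ (a + b) / b = 1.6
```

The truncation at $N = 200$ is harmless here because the tail of $f_{00}^{(n)}$ decays geometrically.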

Limit Theorem

theorem

For any recurrent irreducible MC, define:

$$m_i = E[R_i \mid X_0 = i] = \sum_{n=1}^\infty n f_{ii}^{(n)}$$

Then,

  1. For any $i, j \in S$,

    $$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^n P_{ij}^{(k)} = \frac{1}{m_j}$$
  2. If $d = 1$, then

    $$\lim_{n \to \infty} P_{ij}^{(n)} = \frac{1}{m_j}$$
  3. If $d > 1$, then

    $$\lim_{n \to \infty} P_{jj}^{(nd)} = \frac{d}{m_j}$$

Note that the theorem applies to MCs with infinitely many states too! It also applies to periodic MCs.
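We can verify parts (1) and (2) numerically on a small chain. In this sketch (the 3-state matrix is an arbitrary illustrative choice), the Cesàro average $\frac{1}{n}\sum_{k=1}^n P^{(k)}$ and the rows of $P^{(n)}$ both approach the same vector $\pi$, and the reciprocals of its entries are the mean return times $m_j$:

```python
import numpy as np

# Hypothetical aperiodic irreducible 3-state chain (illustrative values)
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

n = 2000
Pk, avg = np.eye(3), np.zeros((3, 3))
for _ in range(n):
    Pk = Pk @ P              # Pk = P^k after k iterations
    avg += Pk
avg /= n                     # (1/n) * sum_{k=1}^n P^k  (part 1)

limit_row = Pk[0]            # rows of P^n all converge to pi (part 2)
print(1 / limit_row)         # the mean return times m_j
```

Note the Cesàro average converges slowly (error of order $1/n$), while $P^{(n)}$ itself converges geometrically.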

Interpretation

Let’s try to interpret the theorem:

  1. Consider a fixed $j$ and observe that for all $i$, the “average transition probability” (in the long run) is the same, and equal to $1/m_j$. This seems intuitive because we can think of a transition to $j$ as a “success” and a transition to any other state as a “failure” and model it as a geometric distribution. Then, we know that if the probability of success is $p$, the mean time until success is $1/p$. In this case, transitioning to any other state preserves the probability of success (i.e., it does not change $p$, because every state has the same long-run probability $1/m_j$ of transitioning to $j$), and so it does indeed fit well as a geometric distribution. For an aperiodic MC, in the long run, the probability of transitioning from $i \to j$ converges to $1/m_j$.

    Note that this theorem applies to periodic MCs as well, but the analogy of a geometric distribution only holds for an aperiodic MC.

  2. In the first part of the theorem, we’re considering the limit of the average transition probability from $i \to j$. But when the MC is aperiodic, we’re saying that the $n$-step probability $P_{ij}^{(n)}$ itself (not the average) converges to a fixed value, equal to $1/m_j$.

    This looks very similar to the regular MC case - we know that in the long run, $P_{ij}^{(n)} \to \pi_j$ when we have finitely many states (recall that for a regular MC with finite states, $P^{(n)}$ converges to a “fixed” matrix where every row is equal to the limiting distribution). This should give a hint that $\pi_j = 1/m_j$ 😲

    Note that the average of a convergent sequence is determined by the “long-run terms”: since all but finitely many terms are arbitrarily close to the limit (they dominate the initial few terms, which may be far from it), the average also converges to the same limit (this is Cesàro convergence). This is how we can see that (2) and (1) are compatible/consistent with each other.

  3. In the third part, notice that we’re only considering the probability of returning to $j$ when we start from $j$ itself. If we start from any other state, we can’t make any claims since we don’t know how many steps it takes to reach $j$. For any $k$ that is NOT a multiple of the period $d$, obviously $P_{jj}^{(k)} = 0$. So, we only consider the limit of the non-zero entries, i.e., entries of $P^{(k)}$ where $k$ is a multiple of $d$. In this case, we claim that the limit exists, and is equal to $d/m_j$.

    It is natural to expect $d$ to appear on the RHS because if we look at the first result, the “average” is brought down (by a factor of $d$) by all the $d-1$ zero entries between every 2 multiples of $d$. But in this case, we’re removing all these zero entries and only considering multiples of $d$, and so we expect the “average over non-zero entries” to be $d$ times higher.

    (figure: the sequence $P_{jj}^{(k)}$, with $d-1$ zero entries between consecutive multiples of $d$)

    Basically, we expect the non-zero entries, $P_{jj}^{(nd)}$, to converge to $d/m_j$ as $n \to \infty$.

    Until now, we didn’t know what would happen for a periodic MC in the long run, since it doesn’t converge. While that is still true, we’ve shown that the subsequence formed by the non-zero entries does indeed converge (and so, we can make more sense out of the process).
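To see part (3) in action, consider the reflecting random walk on $\{0, 1, 2\}$ (my choice of example, not from the notes). Here $d = 2$, and solving $\pi = \pi P$ gives $\pi_0 = 1/4$, so $m_0 = 4$. The subsequence $P_{00}^{(nd)}$ converges to $d/m_0 = 1/2$, while the average over all steps, zeros included, converges to $1/m_0 = 1/4$:

```python
import numpy as np

# Reflecting random walk on {0, 1, 2}: period d = 2
P = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])

# Subsequence P_00^(nd): only even powers are non-zero at (0, 0)
P2 = np.linalg.matrix_power(P, 2)
sub = [np.linalg.matrix_power(P2, n)[0, 0] for n in range(1, 51)]

# Cesàro average over ALL steps 1..n (the zero entries included)
n = 1000
Pk, avg00 = np.eye(3), 0.0
for _ in range(n):
    Pk = Pk @ P
    avg00 += Pk[0, 0]
avg00 /= n

print(sub[-1], avg00)    # → d/m_0 = 0.5 and 1/m_0 = 0.25
```

This illustrates exactly the factor-of-$d$ relationship discussed above: the average over non-zero entries is $d$ times the average over all entries.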

Remarks

The mean duration between visits, $m_i$, can be finite or infinite:

  • Infinite: When $m_j = \infty$, the limiting probability at each state is $0$, even though the chain is recurrent. We call such a MC null recurrent. For example, consider the symmetric random walk with $p = 1/2$ and no absorbing state. Note that it is still recurrent (there’s only one class, so it must be recurrent).
  • Finite: When $m_j < \infty$, the limiting probability at each state is $1/m_j$. In such a case, we call it a positive recurrent MC, e.g., a random walk with $p < 1/2$ (the process eventually reaches $0$) and “reflection” at $0$, i.e., $P(X_n = 1 \mid X_{n-1} = 0) = 1$. When $d > 1$, we can only consider the steps $nd$. When $d = 1$, the limiting probability is positive, which means that it is a regular MC.
info

The notion of “null recurrence” only applies to MCs with infinitely many states. For a MC with finitely many states, a state can never be null recurrent.

Basic Limit Theorem

We can generalize the results from the previous theorem by only considering those MCs that have $m_j < \infty$. This gives us the Basic Limit Theorem.

theorem

Basic Limit Theorem

For a positive recurrent ($m_j < \infty$), irreducible, and aperiodic MC,

  • $\lim_{n \to \infty} P_{ij}^{(n)}$ exists for any $i, j$ and is given by:
    $$\lim_{n \to \infty} P_{ij}^{(n)} = \lim_{n \to \infty} P_{jj}^{(n)} = \frac{1}{m_j}$$
  • If $\pi$ is the solution to the equation $\pi P = \pi$, then we have:
    $$\pi_j = \frac{1}{m_j}$$

Note:

  • A positive recurrent, irreducible, aperiodic MC is called an ergodic MC. Hence, the basic limit theorem applies to all ergodic MCs.
  • All regular MCs are also ergodic MCs. Why? Because if there are finite states (and it is irreducible so all of them must be recurrent), the limiting probability on each state is non-zero, hence it is positive recurrent.
  • The theorem holds regardless of whether the MC has finitely or infinitely many states.

In general, it’s quite difficult to prove the “positive recurrent” property of a MC with infinitely many states, so we normally deal with MCs with finite states (for which we can apply the theorem from the previous page for regular MCs).

How do we calculate $m_j$? Using the second part of the theorem, we can first solve $\pi = \pi P$ to find the limiting distribution, and then take its reciprocal to obtain all the $m_j$’s. This gives us another interpretation for $\pi_j$ ⇒ knowing $\pi_j$ (and hence, $1/\pi_j$) also tells us the mean duration between visits.
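For instance (using a hypothetical 3-state chain, chosen doubly stochastic so the answer is easy to eyeball), we can solve $\pi = \pi P$ together with the normalization $\sum_j \pi_j = 1$ and read off $m_j = 1/\pi_j$:

```python
import numpy as np

# Hypothetical 3-state chain (doubly stochastic, so pi is uniform)
P = np.array([[0.2, 0.5, 0.3],
              [0.3, 0.2, 0.5],
              [0.5, 0.3, 0.2]])

# Solve pi = pi P together with sum(pi) = 1 as a linear system
A = np.vstack([P.T - np.eye(3), np.ones(3)])
b = np.array([0.0, 0.0, 0.0, 1.0])
pi = np.linalg.lstsq(A, b, rcond=None)[0]

m = 1 / pi           # mean return times m_j = 1 / pi_j
print(pi, m)         # → [1/3, 1/3, 1/3] and [3, 3, 3]
```

Since every column of this $P$ sums to $1$, the uniform distribution is stationary, so every mean return time is $3$.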

Sometimes, it’s easier to find $m_i$ rather than $\pi_i$ (generally when we have infinitely many states). Example: a MC representing the number of consecutive successes in a sequence of Bernoulli trials (with probability of success $= p$). To calculate $m_i$ in such a case, we need to find $f_{ii}^{(n)}$ for all values of $n$ and then use the definition $m_i = \sum_{n=1}^\infty n f_{ii}^{(n)}$.
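Sketching the success-run example (the transition structure below is the standard setup I would assume; the notes only name the example): from run length $i$, a success (prob. $p$) moves to $i+1$ and a failure (prob. $1-p$) resets to $0$. So the first return to $0$ takes $n$ steps exactly when $n-1$ successes are followed by a failure, giving $f_{00}^{(n)} = p^{n-1}(1-p)$ and hence $m_0 = \sum_n n\,p^{n-1}(1-p) = 1/(1-p)$:

```python
# Success-run chain: state = current number of consecutive successes.
# From state i: success (prob p) -> i + 1, failure (prob 1 - p) -> 0.
# First return to 0 at step n = (n - 1 successes, then a failure):
#     f_00^(n) = p**(n - 1) * (1 - p)
p = 0.6
m0 = sum(n * p ** (n - 1) * (1 - p) for n in range(1, 2000))
print(m0)        # → 1 / (1 - p) = 2.5
```

Consistent with the theorem, this also gives $\pi_0 = 1/m_0 = 1 - p$ without ever solving $\pi = \pi P$ on the infinite state space.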

Reducible Chains

Until now, we’ve been talking only about irreducible chains (and all the theorems only apply to irreducible chains). So, what do we do for reducible chains?

Intuitively we know that we can “reduce” (i.e., decompose) the reducible chain into irreducible sub-chains (each subchain consists of states in the same communication class).

Then, in the long run, the limiting probability of the transient classes is zero. Moreover, we can find the probability of entering any recurrent class using FSA (by treating every recurrent class as an absorbing state, since the process can never leave the recurrent class, i.e., there are no outgoing arrows from the recurrent class).

Then, within each recurrent class of the reducible MC, we can check the period $d$.

  • If $d > 1$, then the limit of the subsequence is $d/m_j$ for every state $j$ in the class (only true if the process starts at state $j$).
  • If $d = 1$
    • If the class has finitely many states, then we can find out the limiting probability (since it is a regular markov “subchain”) using $\pi = \pi P$
    • If the class has infinitely many states, then the limiting probability at each state is either zero (if it is null recurrent) or can be obtained by $\pi = \pi P$ if it is positive recurrent.

Now we know what happens whenever we have a recurrent class. Next, we’ll try to incorporate these results into a general MC so that we can understand the long-run probability of every state (note: not every class).

Long-Run Performance in General Markov Chains

As a quick recap, currently we know how to:

  • Find the probability of a process entering any recurrent class (using FSA)
  • Find the limiting probability of each state within the recurrent class (using $\pi = \pi P$) if the period $d = 1$. (of course, if $d \neq 1$, then there is no limiting distribution since the limit does not converge)

If we can combine both, we can get the results for a general markov chain. Basically, we’re trying to find $P^{(n)}$ as $n \to \infty$ for a general MC now (for a regular MC, we already know that $P^{(n)}$ will consist of all rows equal to $\pi$, but we don’t expect this to be the case for a general MC).

Intuitively (using conditional probability), we know that the limiting probability (if it exists) at a state $j$ is equal to the probability of entering $j$’s recurrent class times the limiting probability $\pi_j$ within its own class.

Formally, consider $j \in C_k$ where $C_k$ is a recurrent class. Then, we can set up a new MC and find the entering probability:

$$u_{k|\pi_0} = P(\text{entering } C_k \mid X_0 \sim \pi_0)$$

Once the process enters $C_k$, it never leaves. Hence, we can restrict the process (i.e., consider the sub-chain) on $C_k$ only. Clearly, the new MC restricted to $C_k$ is irreducible and recurrent (since it is formed by states in the same class).

Then, we can find $m_j$ for state $j$ by solving $\pi = \pi P$.

Hence, if we repeatedly generate $X_0$ and make $n$ jumps for sufficiently large $n$ (i.e., $n \to \infty$), then:

$$\pi_j = P(\text{entering } C_k \mid X_0 \sim \pi_0) \times P(\text{staying at } j \mid \text{entered } C_k) = u_{k|\pi_0} \, \pi_{j|k} = u_{k|\pi_0}/m_j$$

Procedure

For the sake of clarity and completeness, we outline the entire procedure to analyze the long-run performance of ANY general MC.

  1. Find all the classes $C_k$

  2. Set up a new MC where every recurrent class is denoted by one state. Then, find $P(\text{absorbed in recurrent class } C_k \mid X_0 = i)$, denoted by $u_{k|i}$ → this gives the probability of entering any recurrent class, given the initial state.

  3. We can ignore all transient classes 😲 because the process will eventually leave them in the long-run, i.e., their long-term probability is zero.

  4. For every recurrent class $C_k$, we find the period $d$.

    1. Aperiodic ($d = 1$): find the corresponding limiting distribution of state $j$ in this class, denoted by $\pi_{j|k}$, by considering the sub-MC restricted on $C_k$
    2. Periodic ($d > 1$): there is NO limiting distribution, but we can still check the long-run proportion of time in each state by finding $m_j$ (i.e., we can still find $\pi$ but the interpretation is different in this case)
  5. Consider the initial state $X_0 = i$

    1. If $j$ is transient, then $\pi_j = 0$

    2. If $j \in C_k$ is recurrent, then:

      $$\pi_{j|i} = u_{k|i} \, \pi_{j|k}$$
  6. Finally, given the initial distribution X0π0X_0 \sim \pi_0, then:

    πjπ0=iSπjiπ0(i)\pi_{j|\pi_0} = \sum_{i \in S} \pi_{j|i} \pi_0(i)
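The whole procedure can be sketched on a small hypothetical example (the 4-state matrix below, with transient state $0$, recurrent class $C_1 = \{1, 2\}$, and absorbing class $C_2 = \{3\}$, is my own illustration): FSA gives the entering probabilities, solving $\pi = \pi P$ on the sub-chain gives $\pi_{j|k}$, and their product matches row $0$ of $P^{(n)}$ for large $n$:

```python
import numpy as np

# Hypothetical reducible chain: state 0 transient,
# C1 = {1, 2} (recurrent, aperiodic), C2 = {3} (absorbing).
P = np.array([[0.2, 0.3, 0.1, 0.4],
              [0.0, 0.6, 0.4, 0.0],
              [0.0, 0.2, 0.8, 0.0],
              [0.0, 0.0, 0.0, 1.0]])

# Step 2 (FSA): since 0 is the only transient state, condition on
# leaving it: u_{1|0} = P(0 -> C1) / (1 - P(0 -> 0)), same for C2.
u1 = (P[0, 1] + P[0, 2]) / (1 - P[0, 0])
u2 = P[0, 3] / (1 - P[0, 0])

# Step 4a: stationary distribution of the sub-chain restricted to C1
Q = P[1:3, 1:3]
A = np.vstack([Q.T - np.eye(2), np.ones(2)])
pi_C1 = np.linalg.lstsq(A, np.array([0.0, 0.0, 1.0]), rcond=None)[0]

# Step 5: combine -> long-run distribution starting from X_0 = 0
pi = np.array([0.0, u1 * pi_C1[0], u1 * pi_C1[1], u2])

# Sanity check against the n-step transition matrix
row = np.linalg.matrix_power(P, 2000)[0]
print(pi, row)
```

Note the one-line FSA shortcut works only because there is a single transient state; with several transient states, step 2 requires solving the usual system of absorption equations.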