Examples of Markov chains

This article contains examples of Markov chains and Markov processes in action.

All examples are in the countable state space. For an overview of Markov chains in general state space, see Markov chains on a measurable state space.

Discrete-time

Board games played with dice

A game of snakes and ladders or any other game whose moves are determined entirely by dice is a Markov chain, indeed, an absorbing Markov chain. This is in contrast to card games such as blackjack, where the cards represent a 'memory' of the past moves. To see the difference, consider the probability for a certain event in the game. In the above-mentioned dice games, the only thing that matters is the current state of the board. The next state of the board depends on the current state, and the next roll of the dice. It does not depend on how things got to their current state. In a game such as blackjack, a player can gain an advantage by remembering which cards have already been shown (and hence which cards are no longer in the deck), so the next state (or hand) of the game is not independent of the past states.

Random walk Markov chains

A center-biased random walk

Consider a random walk on the number line where, at each step, the position (call it x) may change by +1 (to the right) or −1 (to the left) with probabilities:

P_{\mathrm {move~left} }={\dfrac {1}{2}}+{\dfrac {1}{2}}\left({\dfrac {x}{c+|x|}}\right)

P_{\mathrm {move~right} }=1-P_{\mathrm {move~left} }

(where c is a constant greater than 0)

For example, if the constant, c, equals 1, the probabilities of a move to the left at positions x = −2,−1,0,1,2 are given by ${\dfrac {1}{6}},{\dfrac {1}{4}},{\dfrac {1}{2}},{\dfrac {3}{4}},{\dfrac {5}{6}}$ respectively. The random walk has a centering effect that weakens as c increases.

Since the probabilities depend only on the current position (value of x) and not on any prior positions, this biased random walk satisfies the definition of a Markov chain.

Gambling

Suppose that one starts with $10, and one wagers $1 on an unending, fair, coin toss indefinitely, or until all of the money is lost. If $X_{n}$ represents the number of dollars one has after n tosses, with $X_{0}=10$ , then the sequence $\{X_{n}:n\in \mathbb {N} \}$ is a Markov process. If one knows that one has $12 now, then it would be expected that with even odds, one will either have $11 or $13 after the next toss. This guess is not improved by the added knowledge that one started with $10, then went up to $11, down to $10, up to $11, and then to $12. The fact that the guess is not improved by the knowledge of earlier tosses showcases the Markov property, the memoryless property of a stochastic process.^[1]

A model of language

This example came from Markov himself.^[2] Markov chose 20,000 letters from Pushkin’s Eugene Onegin, classified them into vowels and consonants, and counted the transition probabilities. ${\begin{array}{lll}&{\text{vowel}}&{\text{consonant}}\\{\text{vowel}}&.128&.872\\{\text{consonant}}&.663&.337\end{array}}$ The stationary distribution is 43.2 percent vowels and 56.8 percent consonants, which is close to the actual count in the book.^[3]

A simple weather model

The probabilities of weather conditions (modeled as either rainy or sunny), given the weather on the preceding day, can be represented by a transition matrix:

P={\begin{bmatrix}0.9&0.1\\0.5&0.5\end{bmatrix}}

The matrix P represents the weather model in which a sunny day is 90% likely to be followed by another sunny day, and a rainy day is 50% likely to be followed by another rainy day. The columns can be labelled "sunny" and "rainy", and the rows can be labelled in the same order.

(P)_{i j} is the probability that, if a given day is of type i, it will be followed by a day of type j.

Notice that the rows of P sum to 1: this is because P is a stochastic matrix.

Predicting the weather

The weather on day 0 (today) is known to be sunny. This is represented by an initial state vector in which the "sunny" entry is 100%, and the "rainy" entry is 0%:

\mathbf {x} ^{(0)}={\begin{bmatrix}1&0\end{bmatrix}}

The weather on day 1 (tomorrow) can be predicted by multiplying the state vector from day 0 by the transition matrix:

\mathbf {x} ^{(1)}=\mathbf {x} ^{(0)}P={\begin{bmatrix}1&0\end{bmatrix}}{\begin{bmatrix}0.9&0.1\\0.5&0.5\end{bmatrix}}={\begin{bmatrix}0.9&0.1\end{bmatrix}}

Thus, there is a 90% chance that day 1 will also be sunny.

The weather on day 2 (the day after tomorrow) can be predicted in the same way, from the state vector we computed for day 1:

\mathbf {x} ^{(2)}=\mathbf {x} ^{(1)}P=\mathbf {x} ^{(0)}P^{2}={\begin{bmatrix}1&0\end{bmatrix}}{\begin{bmatrix}0.9&0.1\\0.5&0.5\end{bmatrix}}^{2}={\begin{bmatrix}0.86&0.14\end{bmatrix}}

or

\mathbf {x} ^{(2)}=\mathbf {x} ^{(1)}P={\begin{bmatrix}0.9&0.1\end{bmatrix}}{\begin{bmatrix}0.9&0.1\\0.5&0.5\end{bmatrix}}={\begin{bmatrix}0.86&0.14\end{bmatrix}}

General rules for day n are:

\mathbf {x} ^{(n)}=\mathbf {x} ^{(n-1)}P

\mathbf {x} ^{(n)}=\mathbf {x} ^{(0)}P^{n}

Steady state of the weather

In this example, predictions for the weather on more distant days change less and less on each subsequent day and tend towards a steady state vector. This vector represents the probabilities of sunny and rainy weather on all days, and is independent of the initial weather.

The steady state vector is defined as:

\mathbf {q} =\lim _{n\to \infty }\mathbf {x} ^{(n)}

but converges to a strictly positive vector only if P is a regular transition matrix (that is, there is at least one Pⁿ with all non-zero entries).

Since q is independent from initial conditions, it must be unchanged when transformed by P.^[4] This makes it an eigenvector (with eigenvalue 1), and means it can be derived from P.

In layman's terms, the steady-state vector is the vector that, when we multiply it by P, we get the exact same vector back.^[5] For the weather example, we can use this to set up a matrix equation:

{\begin{aligned}P&={\begin{bmatrix}0.9&0.1\\0.5&0.5\end{bmatrix}}\\\mathbf {q} P&=\mathbf {q} &&{\text{(}}\mathbf {q} {\text{ is unchanged by }}P{\text{.)}}\\&=\mathbf {q} I\\\mathbf {q} (P-I)&=\mathbf {0} \\\mathbf {q} \left({\begin{bmatrix}0.9&0.1\\0.5&0.5\end{bmatrix}}-{\begin{bmatrix}1&0\\0&1\end{bmatrix}}\right)&=\mathbf {0} \\\mathbf {q} {\begin{bmatrix}-0.1&0.1\\0.5&-0.5\end{bmatrix}}&=\mathbf {0} \\{\begin{bmatrix}q_{1}&q_{2}\end{bmatrix}}{\begin{bmatrix}-0.1&0.1\\0.5&-0.5\end{bmatrix}}&={\begin{bmatrix}0&0\end{bmatrix}}\\-0.1q_{1}+0.5q_{2}&=0\end{aligned}}

and since they are a probability vector we know that

q_{1}+q_{2}=1.

Solving this pair of simultaneous equations gives the steady state vector:

{\begin{bmatrix}q_{1}&q_{2}\end{bmatrix}}={\begin{bmatrix}0.833&0.167\end{bmatrix}}

In conclusion, in the long term about 83.3% of days are sunny. Not all Markov processes have a steady state vector. In particular, the transition matrix must be regular. Otherwise, the state vectors will oscillate over time without converging.

Stock market

A state diagram for a simple example is shown in the figure on the right, using a directed graph to picture the state transitions. The states represent whether a hypothetical stock market is exhibiting a bull market, bear market, or stagnant market trend during a given week. According to the figure, a bull week is followed by another bull week 90% of the time, a bear week 7.5% of the time, and a stagnant week the other 2.5% of the time. Labeling the state space {1 = bull, 2 = bear, 3 = stagnant} the transition matrix for this example is

P={\begin{bmatrix}0.9&0.075&0.025\\0.15&0.8&0.05\\0.25&0.25&0.5\end{bmatrix}}.

The distribution over states can be written as a stochastic row vector $x$ with the relation $x (n + 1) = x (n) P$ . So if at time $n$ the system is in state $x (n)$ , then three time periods later, at time $n + 3$ the distribution is

{\begin{aligned}x^{(n+3)}&=x^{(n+2)}P=\left(x^{(n+1)}P\right)P\\\\&=x^{(n+1)}P^{2}=\left(x^{(n)}P\right)P^{2}\\&=x^{(n)}P^{3}\\\end{aligned}}

In particular, if at time $n$ the system is in state 2 (bear), then at time $n + 3$ the distribution is

{\begin{aligned}x^{(n+3)}&={\begin{bmatrix}0&1&0\end{bmatrix}}{\begin{bmatrix}0.9&0.075&0.025\\0.15&0.8&0.05\\0.25&0.25&0.5\end{bmatrix}}^{3}\\[5pt]&={\begin{bmatrix}0&1&0\end{bmatrix}}{\begin{bmatrix}0.7745&0.17875&0.04675\\0.3575&0.56825&0.07425\\0.4675&0.37125&0.16125\\\end{bmatrix}}\\[5pt]&={\begin{bmatrix}0.3575&0.56825&0.07425\end{bmatrix}}.\end{aligned}}

Using the transition matrix it is possible to calculate, for example, the long-term fraction of weeks during which the market is stagnant, or the average number of weeks it will take to go from a stagnant to a bull market. Using the transition probabilities, the steady-state probabilities indicate that 62.5% of weeks will be in a bull market, 31.25% of weeks will be in a bear market and 6.25% of weeks will be stagnant, since:

\lim _{N\to \infty }\,P^{N}={\begin{bmatrix}0.625&0.3125&0.0625\\0.625&0.3125&0.0625\\0.625&0.3125&0.0625\end{bmatrix}}

A thorough development and many examples can be found in the on-line monograph Meyn & Tweedie 2005.^[6]

A finite-state machine can be used as a representation of a Markov chain. Assuming a sequence of independent and identically distributed input signals (for example, symbols from a binary alphabet chosen by coin tosses), if the machine is in state y at time n, then the probability that it moves to state x at time n + 1 depends only on the current state.

PageRank algorithm (Random surfer model)

The PageRank algorithm, originally used by Google to rank websites in their search engine results, is based on a discrete-time Markov chain. The state space consists of all the web pages indexed by the search engine. The model imagines a "random surfer" who starts on a randomly chosen page and begins clicking on links.

At each discrete time step, the surfer either:

Clicks on a randomly chosen outbound link from the current page with probability $d$ (the "damping factor", historically set around 0.85).
Teleports to a completely random page in the entire internet with probability $1-d$ .

Because the surfer's next page depends only on the current page's links and the fixed teleportation probability, the process satisfies the Markov property. The PageRank of a specific website is simply its probability value in the steady-state vector of this massive Markov chain, representing the long-term proportion of time the random surfer spends on that page.

Tennis match scoring

The scoring progression of a single game of tennis can be modeled as an absorbing Markov chain. The state space consists of the current score combinations (e.g., "0–0", "15–0", "30–15", "Deuce", "Advantage Server", "Advantage Receiver", "Game Server", "Game Receiver").

Assuming the serving player wins a given point with a constant probability $p$ and the receiving player wins with probability $q=1-p$ , the transition from one state to the next depends strictly on the current score, not on the sequence of points that led to it.

For example, from the state "Deuce", the chain transitions to "Advantage Server" with probability $p$ and to "Advantage Receiver" with probability $q$ . The states "Game Server" and "Game Receiver" act as absorbing states; once reached, the game is over and the chain remains in that state with probability 1. This model can be used to calculate the exact probability of a player winning a game from any intermediate score.

Algorithmic music composition

Markov chains are frequently used in algorithmic music generation to create new melodies that mimic the style of a specific composer, genre, or training corpus.

The state space consists of musical pitches, chords, or rhythmic values. By computationally analyzing a dataset of existing music (for example, the melodies of Johann Sebastian Bach), a transition matrix is constructed that dictates the probability of a specific note following the current note. If the current state is a "C quarter note", the matrix provides the calculated probabilities that the next state is an "E eighth note", a "G quarter note", and so on.

Because the choice of the next note depends only on the current note, the generation process is a random walk through the transition matrix. Higher-order Markov chains are also commonly used in this field, where the next state depends on a sequence of the previous $k$ notes, achieved by redefining the state space to consist of sequence tuples of length $k$ .

Continuous-time

The Poisson process

The Poisson process is one of the simplest and most fundamental examples of a continuous-time Markov chain. It models a sequence of events occurring randomly over time, such as radioactive decay, the arrival of emails in an inbox, or clicks on a website. If $X_{t}$ represents the total number of events that have occurred up to time $t$ , the process has a state space of non-negative integers $\{0,1,2,\dots \}$ . The chain only moves upwards (a "pure birth" process), transitioning from state $i$ to $i+1$ at a constant rate $\lambda$ . The time spent in each state is an independent exponentially distributed random variable with parameter $\lambda$ .

Pure birth process (The popcorn example)

If one pops one hundred kernels of popcorn in an oven, each kernel popping at an independent exponentially distributed time, then this would be a continuous-time Markov process. If $X_{t}$ denotes the number of kernels which have popped up to time $t$ , the problem can be defined as finding the number of kernels that will pop in some later time. The only thing one needs to know is the number of kernels that have popped prior to time $t$ . It is not necessary to know when they popped, so knowing the exact historical sequence of $X_{t}$ for previous times is not relevant to predicting the future.

Mathematically, this is a pure birth process with a finite state space $\{0,1,\dots ,100\}$ . Unlike the standard Poisson process, the rate of popping changes depending on the current state. If $i$ kernels have already popped, there are $100-i$ unpopped kernels remaining. If each kernel pops at a rate $\lambda$ , the overall transition rate from state $i$ to state $i+1$ is $(100-i)\lambda$ .

Two-state machine failure model

One of the most practical continuous-time Markov chains is the two-state model, often used in reliability engineering to model a machine that alternates between being operational and being broken. The state space is $\{0,1\}$ , where 0 represents "working" and 1 represents "failed".

When the machine is working, it fails after an exponentially distributed time with failure rate $\lambda$ .
When the machine is failed, it takes an exponentially distributed time to repair it with repair rate $\mu$ .

This system is completely memoryless: the probability of the machine breaking in the next minute depends only on the fact that it is currently working, not on how long it has been working since its last repair. The transition rate matrix (or generator matrix) $Q$ for this system is:

Q={\begin{pmatrix}-\lambda &\lambda \\\mu &-\mu \end{pmatrix}}

M/M/1 Queue (Birth–death process)

A classic example of a birth–death process is the M/M/1 queue, which models a single server handling a line of customers, such as a cashier at a grocery store or a router processing data packets.

Births (Arrivals): Customers arrive according to a Poisson process with rate $\lambda$ . This increases the state (number of people in the system) from $i$ to $i+1$ .
Deaths (Departures): The server processes customers one at a time. The service times are exponentially distributed with rate $\mu$ . A finished service decreases the state from $i$ to $i-1$ (for $i>0$ ).

The state space is the set of non-negative integers $\{0,1,2,\dots \}$ representing the number of customers in the system. Because both the arrival and service times are exponentially distributed, the system's future state depends only on the current number of customers in line, qualifying it as a continuous-time Markov chain.

Epidemic models (The SIS model)

Continuous-time Markov chains are heavily used to model the spread of infectious diseases. A classic example is the stochastic SIS (Susceptible-Infected-Susceptible) model, which tracks the number of infected individuals in a fixed, thoroughly mixed population of size $N$ . The state space is $\{0,1,\dots ,N\}$ , representing the current number of infected individuals, $I$ .

Infection: A susceptible individual comes into contact with an infected individual and catches the disease. The transition rate from state $I$ to $I+1$ is proportional to the number of infected individuals and the number of susceptible individuals: $\beta I(N-I)$ , where $\beta$ is the transmission rate.
Recovery: An infected individual recovers and immediately becomes susceptible again. The transition rate from state $I$ to $I-1$ is $\gamma I$ , where $\gamma$ is the recovery rate.

Because the rates of infection and recovery depend solely on the current number of infected people at any exact moment, the memoryless property holds. Notice that state 0 is an absorbing state: once the infection completely dies out, it cannot return.

The Moran model (Population genetics)

In genetics, the Moran model describes the stochastic evolution of two competing alleles (e.g., allele A and allele B) in a population of a strictly fixed size $N$ . The state of the Markov chain is the number of individuals carrying allele A, which can range from $0$ to $N$ .

In continuous time, reproduction events occur at a constant rate. When an event occurs:

One individual is chosen at random to reproduce (creating a copy of its allele).
One individual is chosen at random to die (making room for the new copy to maintain population size).

The transition rates from state $i$ (where there are $i$ individuals with allele A and $N-i$ with allele B) to $i+1$ or $i-1$ depend only on the current proportions of the alleles in the population. Both state 0 (allele A is permanently lost) and state $N$ (allele A is permanently fixed) act as absorbing boundaries.

Stochastic chemical kinetics

In physical chemistry and systems biology, the interactions of molecules in a well-mixed microscopic volume are modeled as a continuous-time Markov chain. The state of the system is a vector representing the exact discrete integer count of each type of molecule present.

Reactions are treated as random jumps between these discrete states. For example, a reaction where molecule A and molecule B collide to form molecule C ( $A+B\to C$ ) occurs at a transition rate proportional to the number of A molecules multiplied by the number of B molecules. The memoryless property is maintained because the probability of a collision resulting in a successful reaction in the next infinitesimally small time interval depends only on the instantaneous concentration of the reactants, not on how long they have been bouncing around in the container.

References

^ Øksendal, B. K. (Bernt Karsten), 1945- (2003). Stochastic differential equations : an introduction with applications (6th ed.). Berlin: Springer. ISBN 3540047581. OCLC 52203046.{{cite book}}: CS1 maint: multiple names: authors list (link) CS1 maint: numeric names: authors list (link)
^ Markov, A. A. "An example of statistical analysis of the text of eugene onegin illustrating the association of trials into a Chain." Bulletin de lAcadamie Imperiale des Sciences de St. Petersburg, ser 6 (1913): 153162.
^ Grinstead and Snell’s Introduction to Probability, page 465
^ Van Kampen, N.G. (2007). Stochastic Processes in Physics and Chemistry. NL: North Holland Elsevier. pp. 73–95. ISBN 978-0-444-52965-7.
^ "Going steady (state) with Markov processes". Bloomington Tutors.
^ S. P. Meyn and R.L. Tweedie, 2005. Markov Chains and Stochastic Stability Archived 2013-09-03 at the Wayback Machine

External links

Monopoly as a Markov chain

[:3-1] Øksendal, B. K. (Bernt Karsten), 1945- (2003). Stochastic differential equations : an introduction with applications (6th ed.). Berlin: Springer. ISBN 3540047581. OCLC 52203046.{{cite book}}: CS1 maint: multiple names: authors list (link) CS1 maint: numeric names: authors list (link)

[2] Markov, A. A. "An example of statistical analysis of the text of eugene onegin illustrating the association of trials into a Chain." Bulletin de lAcadamie Imperiale des Sciences de St. Petersburg, ser 6 (1913): 153162.

[3] Grinstead and Snell’s Introduction to Probability, page 465

[:1-4] Van Kampen, N.G. (2007). Stochastic Processes in Physics and Chemistry. NL: North Holland Elsevier. pp. 73–95. ISBN 978-0-444-52965-7.

[5] "Going steady (state) with Markov processes". Bloomington Tutors.

[MCSS-6] S. P. Meyn and R.L. Tweedie, 2005. Markov Chains and Stochastic Stability Archived 2013-09-03 at the Wayback Machine

[1]