# Dadadodo: Markov sentence generator

Posted by Jason Polak on 29. March 2018 · Write a comment · Categories: probability · Tags:

It is said that Markov originally invented Markov processes to understand how some letters follow other letters in poetry.

Recall that a Markov process is a probability random process that models moving from one state to another state, where the possible states is some set. There is a fixed probability from moving from each state to each other state.

Writing sentences could be modeled with a Markov process. To do so, we can take in some text, and compute the experimental frequency that one word will follow another. Then using a random number generator, we can output new sentences generated from this Markov process.

This doesn't use any deep theory but nevertheless is fun to try. It's not hard to write a program like this, and it would be interesting to do so, but for demonstration purposes I will now use a premade one: dadadodo. It's available for Linux. Basically, you run it like this:

This command will generate N sentences in a Markov process, using computed transition probabilities from the file inputfile.txt. According to dadadodo's documentation,

Sometimes these sentences are nonsense, but sometimes they cut right through to the heart of the matter and reveal hidden meanings.

Here is one amusing one generated from my PhD thesis:

I am grateful to do not actually integral.

Sounds an awful lot like "I am grateful to not do the actual integral," which is funny because I computed a rather tricky p-adic integral in my thesis. Here's one feeding in a cover letter I wrote in my current job application process:

I believe I have the next generation of my a number theory.

Sounds a lot like "I am the next generation of number theory." Here is a pretty funny one I got from feeding in a job posting:

Traditionally, enterprises have spent countless hours over the latest technology.

Now that sounds about right! The University of Melbourne sends out an email with news of interest to staff. Here is a random sentence that appeared from dadadodo using this news text:

The Research Fellowships on experiences of the bike racks.

I should apply for one of those.

Most of the outputted sentences are nonsense but are not too far away from meaningful, if you think about it.

# Cereal box prizes and transition matrices

Posted by Jason Polak on 04. March 2018 · Write a comment · Categories: probability · Tags: ,

If you don't know what a transition matrix is, you might want to read the transition matrix post before reading this one.

Transition matrices can be used to solve some classic probability problems. For example, consider the following problem:

Suppose in each cereal box you buy there is one number in the set $\{1,2,3,4,5\}$. You get a prize if you collect all five numbers. What is the expected number of boxes you have to buy (or steal) before you get all five numbers?

I found this problem in Frederick Mosteller's book 'Fifty Challenging Problems in Probability with Solutions'. I had to think about this problem for a few minutes, and you should too before going on.
More »

# Transition matrices

Posted by Jason Polak on 01. March 2018 · Write a comment · Categories: probability · Tags: ,

Imagine $n$ states of a system in a discrete-time stochastic system. For each pair of states $i$ and $j$, there is a probability $p_{ij}$ of moving to state $j$ in the next time step, given that the system is in state $i$. Each of these probabilities can be put in a matrix, known as the transition matrix.

Let's take an example. Such a discrete time system can be easily represented by a graph (in the sense of graph theory). Here is one:

(Incidentally, this is not how I would draw this graph by hand, but not having to do so by using Graphviz is worth not drawing it exactly the way I want it.) In this system, there are two states: in each of these states, there is a 1/2 probability of leaving the state and a 1/2 probability of remaining in it. It is a system with a lot of symmetry and whose transition matrix is
$$T_1 = \begin{pmatrix}1/2 & 1/2\\1/2 & 1/2\end{pmatrix}$$
The cool thing about a transition matrix is that you can raise it to some power. What does this mean? If $T$ is a transition matrix, then it follows from the definition of matrix multiplication and induction that the $i,j$ entry is the probabiltiy of moving to state $j$ given that the system is in state $i$. But it follows pretty much from the definition of matrix multiplication that the $i,j$-entry of $T^2$ is the probability from moving to state $j$ in two steps from state $i$. In general, the $i,j$ entry of $T^n$ is the probability of moving to state $j$ in $n$ steps starting from state $i$.
More »

# Expected iterations for a finite random walk

Posted by Jason Polak on 25. February 2018 · Write a comment · Categories: probability · Tags:

Consider three cells as so:

A player (the blue disc) starts out in the left-most cell, and discrete time starts. At each step in time, the player has a 1/2 probability of moving left and a 1/2 probability of moving right. If the player chooses to move left but cannot because it is in the left-most cell, then it does nothing, though that still counts as a move. The game ends when the player reaches the right-most cell.

What is the expected number of moves in this game?
More »