Posted by Jason Polak on 01. March 2018 · Write a comment · Categories: probability · Tags: ,

Imagine $n$ states of a system in a discrete-time stochastic system. For each pair of states $i$ and $j$, there is a probability $p_{ij}$ of moving to state $j$ in the next time step, given that the system is in state $i$. Each of these probabilities can be put in a matrix, known as the transition matrix.

Let’s take an example. Such a discrete time system can be easily represented by a graph (in the sense of graph theory). Here is one:

(Incidentally, this is not how I would draw this graph by hand, but not having to do so by using Graphviz is worth not drawing it exactly the way I want it.) In this system, there are two states: in each of these states, there is a 1/2 probability of leaving the state and a 1/2 probability of remaining in it. It is a system with a lot of symmetry and whose transition matrix is
$$T_1 = \begin{pmatrix}1/2 & 1/2\\1/2 & 1/2\end{pmatrix}$$
The cool thing about a transition matrix is that you can raise it to some power. What does this mean? If $T$ is a transition matrix, then it follows from the definition of matrix multiplication and induction that the $i,j$ entry is the probabiltiy of moving to state $j$ given that the system is in state $i$. But it follows pretty much from the definition of matrix multiplication that the $i,j$-entry of $T^2$ is the probability from moving to state $j$ in two steps from state $i$. In general, the $i,j$ entry of $T^n$ is the probability of moving to state $j$ in $n$ steps starting from state $i$.
More »

Posted by Jason Polak on 01. March 2018 · Write a comment · Categories: book · Tags: ,

Today, public-key cryptography is everywhere, offering some measure of security for virtually all internet commerce transactions and secure shell connections. It’s hard to imagine life without it, even though most people aren’t aware of it.

Steven Levy’s Crypto is a great book to explain it and how it all came about. In fact, I first read it almost twenty years ago when I was in high school, and just read it again the other day.

How did encryption in the digital age start out? How did public-key cryptography get invented? And how did public-key crypto get into the hands of pretty much everyone with a web browser all over the world despite the attempts of the U.S. government to control it? These questions are answered in Levy’s book.
More »

Posted by Jason Polak on 25. February 2018 · Write a comment · Categories: probability · Tags:

Consider three cells as so:

A player (the blue disc) starts out in the left-most cell, and discrete time starts. At each step in time, the player has a 1/2 probability of moving left and a 1/2 probability of moving right. If the player chooses to move left but cannot because it is in the left-most cell, then it does nothing, though that still counts as a move. The game ends when the player reaches the right-most cell.

What is the expected number of moves in this game?
More »

Posted by Jason Polak on 18. February 2018 · Write a comment · Categories: computer-science · Tags: , ,

Let’s look at Python’s map function. What does this function do? Here is the syntax:

It takes a function called function and applies it to each element of iterable, returning the result as an iterable. For example:

The output of this program is:

More »

Posted by Jason Polak on 18. February 2018 · 2 comments · Categories: computer-science · Tags: , ,

Let’s assume you have a polynomial over a finite field $\F_q$, defined in Sage. How can you tell whether it’s a permutation polynomial? That is, when is the corresponding function $\F_q\to\F_q$ bijective?

This is how you might have a polynomial over $\F_q$ defined in Sage:

Here, the variable $x$ refers the element $x$ in the isomorphism $\F_q \cong \F_p[x]/\alpha(x)$ and $t$ is the variable in the polynomial ring $\F_q[t]$. Is $f$ a permutation polynomial? That of course depends on what $q$ is.
More »

Let $\F_q$ be a finite field. For any function $f:\F_q\to \F_q$, there exists a polynomial $p\in \F_q[x]$ such that $f(a) = p(a)$ for all $a\in \F_q$. In other words, every function from a finite field to itself can be represented by a polynomial.

In particular, every permutation of $\F_q$ can be represented by a polynomial. This isn’t true for other finite rings, a fact which is the topic of one of my papers. At any rate, polynomials that represent permutations are called permutation polynomials.

It’s not easy to determine which polynomials are permutation polynomials. The exception is for the finite field $\F_2$, and more generally $\Z/2^w$. Then this case Ronald Rivest gave a straightforward criterion for whether a polynomial is a permutation polynomial.

However, there is a classical result that says any permutation may be represented uniquely by a polynomial whose degree is at most $q-2$. Here is an example noted by Charles Wells. Let $a,b\in\F_q$ be two distinct elements. The polynomial
$$f(x) = x + (a-b)(x-a)^{q-1} + (b-a)(x-b)^{q-1}$$
represents the transposition of $a$ and $b$. I suggest that you verify this by direct substitution, using the identity $c^{q-1} =1$ whenever $c\not=0$, which in turn follows from Fermat’s little theorem.
More »

Posted by Jason Polak on 14. February 2018 · Write a comment · Categories: homological-algebra, modules · Tags: , ,

Over a field $k$, an arbitrary product of copies of $k$ is a free module. In other words, every vector space has a basis. In particular, this means that arbitrary products of projective $k$-modules are projective.

Over the ring of integers, an arbitrary product of projective modules is not necessarily projective. In fact, a product of countably infinitely many copies of $\Z$ is not projective!

So while arbitrary direct sums of projective modules are projective, the same is not true of arbitrary products for some rings.

Besides fields, which other rings have the property that arbitrary direct products of their projective modules are projective?
More »

Posted by Jason Polak on 14. February 2018 · Write a comment · Categories: book · Tags: , , , ,

I first heard of Edward Snowden when I was still a PhD student. He became world-famous for leaking huge numbers of NSA documents on their surveillance and data collection program, whose primary aim is to indiscriminately collect as much private data from as many people as they can, regardless of whether they were suspected in any wrongdoing.

How this leak happened, and what the leak contained, is detailed in No Place to Hide by Glenn Greenwald. As Snowden entrusted Greenwald with the leaked documents, Greenwald is in a unique position to offer a detailed and accurate account of Snowden’s leaked files.

The bulk of Snowden’s documents describe the huge amounts of data are collected from pretty much anyone the NSA can get their hands on, regardless of who they are or where they live in the world. This data comes from various sources: intercepted internet transmissions, agreements with phone companies like AT&T, agreements with tech companies like Google, Facebook, and Microsoft, and other spy agencies. The data collected include the exact times and durations of phone calls, and the contents and metadata of emails. The data accumulated by the NSA is so huge that they have to build massive petabyte data centers to store it.

Greenwald’s book is frightening, as he describes his first interactions with Snowden.

I also found it alarming that so many people, including journalists, denounced Greenwald and his associates for the leaks. A journalist for the New York Times went so far as to lie in an article, and that was just one of many fabrications Greenwald endured to bring the leaked documents to the world. Of course, Greenwald and Snowden also had great support as well. And lest anyone think that the leaked documents compromised national security, they should actually read this book in which it clearly shows that Snowden carefully selected his documents so that they did not reveven so, I think the battle has only begun for basic human rights and existence in the face of massive data analytics in the hands of the power-hungry. Data-driven technology has only begun to mature and in the future we will see something that we’ve never seen before: a future where our lives are manipulated so quickly that most of us will be in danger of losing balance and losing control of some of the best things that make human life liveable. With the onslaught of data analytics and machine learning techniques, I fear that proponents of indiscriminate spying will only redouble their efforts to make the lives of the rest of us worse in their quest for their own power and money.al sensitive information that could compromise any security whatsoever.

Greenwald has done a major service in writing this book and his articles, and reporting on the Snowden leaks. Everyone should read this book. Highly recommended.

By the way, if you’re interested in this book (and you should be), read Bruce Schneier’s Data and Goliath. Schneier, an expert in computer security, worked with Greenwald on the Snowden files. Data and Goliath is more focused and detailed on the technical aspects of indiscriminate data collection and espionage, whereas Greenwald’s book is more about the initial event and the effort it took to actually get this crucial information to the public.

Posted by Jason Polak on 13. February 2018 · Write a comment · Categories: statistics · Tags: , ,

If we observe data coming from a distribution with known form but unknown parameters, estimating those parameters is our primary aim. If the distribution is uniform on $[0,\theta]$ with $\theta$ unknown, we already looked at two methods to estimate $\theta$ given $n$ i.i.d. observations $x_1,\dots,x_n$:

  1. Maximum likelihood, which maximizes the likelihood function and gives $\max\{ x_i\}$
  2. Moment estimator $2\sum x_i/n$, or twice the mean

The uniform distribution was an interesting example because maximum likelihood and moments gave two different estimates. But what about the Poisson distribution? It is supported on the natural numbers, depends on a single parameter $\mu$, and has density function
$$f(n) = \frac{e^{-\mu}\mu^n}{n!}$$
What about the two methods of parameter estimation here? Let’s start with the method of moments. It is easy to compute the moments of the Poisson distribution directly, but let’s write down the moment generating function of the Poisson distribution.
More »

Posted by Jason Polak on 13. February 2018 · Write a comment · Categories: statistics · Tags: ,

This post is a quick introduction to the R. I learnt R when I was an undergrad and I still use it from time to time. It was one of the first major programs I compiled from source as well.

What is R? It is simply the best statistical computing environment in use today. Better yet, it’s open source. If you’re working with data, you need to learn R. This post won’t teach you how to use R. Instead, it will give a whirlwind tour so you can get appreciate R’s flavour. The only thing I don’t like much about R is searching for material on it. Its one letter name makes that hard.

I will assume that you have R installed, and have opened the interactive console. It should look something like this:


More »