Posted by Jason Polak on 13. May 2018 · Write a comment · Categories: number-theory

We've been talking about the Miller-Rabin randomized primality test, which is one of the easiest to implement and most effective tests that, given a number, will either prove it to be composite or state that it is most likely prime. As good as it is for practical applications, the Miller-Rabin test leaves something to be desired.

If you need primes for cryptographic applications, Miller-Rabin practically perfect. But what if you want to generate super large primes and state for sure that the numbers you are generating really are primes?

The Lucas test is a fast test that can prove a number prime. Here it is:

Theorem (Lucas Test). Let $a$ and $n\gt 1$ be integers. If $a^{n-1}\equiv 1\pmod{n}$ and $a^{(n-1)/q}\not\equiv 1\pmod{n}$ for every prime factor $q\mid n-1$ then $n$ is prime.

The proof is quite simple: the congruences that $a$ satisfy show that $a$ generate the multiplicative group $\mathbb{Z}/n^\times$. Therefore, $n-1\mid |\mathbb{Z}/n^\times| = \varphi(n)$. However, if $n$ were composite then $\varphi(n) \lt n-1$, which contradicts $n-1\mid\varphi(n)$. Therefore, $n$ is prime.
More »

Posted by Jason Polak on 11. May 2018 · Write a comment · Categories: number-theory · Tags:

Last time, I explained the Miller-Rabin probabilistic primality test. Let's recall it:

Theorem. Let $p$ be an odd prime and write $p-1 = 2^kq$ where $q$ is an odd number. If $a$ is relatively prime to $p$ then at least one of the following statements is true:

  1. $a^q\equiv 1\pmod{p}$, or
  2. One of $a^q,a^{2q},a^{4q},\dots,a^{2^{k-1}q}$ is congruent to $-1$ modulo $p$.

The Miller-Rabin randomized test looks for values of $a$ that don't satisfy the conclusions of this theorem. Such an $a$ is called a witness. If such an $a$ can be found, then $n$ is composite. If no such $a$ is found after a bunch of iterations, then we consider $n$ to be probably prime.

For example, a hundred iterations of Miller-Rabin selected

as a number that is most likely prime. A brute-force factor testing to make sure that this number is prime would take many times the age of the universe. That's pretty crazy! The Miller-Rabin test only takes a couple seconds, but how good is it really?

Now what about the percentage of witnesses for different numbers? Is this percentage close to 75%? In fact, although at least 75% of all possible bases are witnesses for any given composite number, often there are far more bases. In fact, only 0.6% of composite integers below 2000 have fewer than 90% of the total possible bases as witnesses. One such number is 1891 = 31*61. About 76% of possible bases are Miller-Rabin witnesses for 1891, which shows that the estimate of 75% is pretty good.

But for cryptographic applications, we're not interested in finding composite numbers. Given a randomly selected odd composite number, we can say that the probability of the Miller-Rabin test failing to find it composite after $k$ iterations is at most $4^{-k}$, which we can make very small very quickly. However, this conditional probability isn't the same thing as the conditional probability that a randomly selected odd number is composite given that $k$-iterations of Miller-Rabin thinks it is a prime!

A priori, it may be possible that the probability a number is composite given that $k$-iterations of Miller-Rabin thinks it's a prime is higher than $4^{-k}$.

But this isn't the case. Beauchemin, Brassard, Crépeau, Goutier, and Pomerance showed in a paper that this probability (that a randomly selected odd number is composite given $k$-iterations of Miller-Rabin thinks it is a prime) is also at most $4^{-k}$, assuming that the randomly selected odd numbers of sufficiently large (which thankfully is less than the typical modern-day cryptographic application).

Posted by Jason Polak on 09. May 2018 · Write a comment · Categories: number-theory · Tags: , ,

Fermat's little theorem states that for a prime number $p$, any $a\in \Z/p^\times$ satisfies $a^{p-1} = 1$. If $p$ is not prime, this may not necessarily be true. For example:
$$2^{402} = 376 \in \Z/403^\times.$$
Therefore, we can conclude that 403 is not a prime number. In fact, $403 = 13\cdot 31$ Fermat's little theorem can be used as a test for compositeness this way: if you can find a number $a$ relatively prime to $p$ such that $a^{p-1} \not\equiv 1\pmod{p}$, then $p$ is actually composite.

If there exists an $a\in\{1,\dots,p-1\}$ with $a^{p-1}\not\equiv 1\pmod{p}$, then $a$ is called a Fermat witness to the compositeness of $p$. That is, it proves that $p$ is composite. Notice that we do not require $a$ to be relatively prime to $p$ in this definition of a Fermat witness: if any number $a\lt p$ exists with $a^{p-1}\not\equiv 1\pmod{p}$, then $p$ cannot be prime, because if it were, $a$ would actually be relatively prime to $p$ and this would contradict Fermat's little theorem.

Fermat's little theorem is a pretty good test for compositeness: if $p$ is some odd composite integer with at least one relatively prime Fermat witness, then at least half the numbers in the range $2,3,\dots,p-2$ will also be witnesses. Note: we are using this range because $p-1$ and $1$ aren't witnesses. Therefore, you can use Fermat's little theorem as a randomized prime-testing algorithm: randomly select elements $a$ from $\{2,\dots,p-2\}$ and check if they satisfy $a^{p-1} = 1\pmod{p}$. Therefore, if you have some large number $N$ that is composite and has at least one witness, a randomised Fermat's little theorem algorithm randomly testing $K$ different bases in $\{2,\dots,N-1\}$ will have probability of less than $1/2^K$ at failing to detect that $N$ is composite.
More »

Posted by Jason Polak on 06. May 2018 · Write a comment · Categories: number-theory · Tags:

Alice wants her friends to send her stuff only she can read. RSA public-key encryption allows her to do that: she chooses huge primes $p$ and $q$ and releases $N = pq$ along with an encryption exponent $e$ such that ${\rm gcd}(e,(p-1)(q-1)) = 1$. If Bob wants to send Alice a message $m$, he sends $c = m^e$ modulo $N$ to Alice.

Computing $m$ from $m^e$ is hard if you only know $N$, but becomes easy with the prime factors $p$ and $q$. And, only Alice knows $p$ and $q$. The word "hard" is relative, and maybe someone will find an easy way to do it in the future, perhaps with quantum computers.

However, if you know $p$ and $q$, then you can compute $d$ such that $de = 1$ modulo $(p-1)(q-1)$. That is, you're computing the inverse of $e$ in the multiplicative group $(\mathbb{Z}/N)^\times$. You can use Fermat's little theorem: $x^{p-1} = 1$ in $\mathbb{Z}/p^\times$ for an $x\in (\mathbb{Z}/p)^\times$ to compute that $c^d = m^{ed} = m$ modulo $N$. Therefore, $c^d$ in $\mathbb{Z}/N$ is the decrypted message.
More »

Posted by Jason Polak on 03. May 2018 · Write a comment · Categories: number-theory

In a group $G$, the discrete logarithm problem is to solve for $x$ in the equation $g^x =h$ where $g,h\in G$. In Part 1, we saw that solving the discrete log problem for finite fields would imply that we could solve the Diffie-Hellman problem and crack the ElGamal encryption scheme.

Obviously, the main question at this point is: how hard is the discrete logarithm problem (DLP)? In this post, we will look at various ways to approach the problem. We'll focus on solving the DLP in the cyclic group $\F_p^\times$, though some of the ideas generalise to more general groups as well.

The Brute Force Approach

So, we've got this equation: $g^x = h$ in $\F_p^\times$. Of course, we're assuming that there exists an $x$ that solves this equation, because in practice a solution exists.
More »

Posted by Jason Polak on 30. April 2018 · Write a comment · Categories: number-theory · Tags:

Given a group $G$, and an element $g\in G$ (the "base"), the discrete logarithm $\log_g(h)$ of an $h\in G$ is an element $x\in G$ such that $g^x = h$ if it exists. Its name "discrete logarithm" essentially means that we are only allowed to use integer powers in the group, rather than extending the definition of exponentiation as with the usual logarithm for the real numbers.

Because of cryptographic applications, usually the discrete logarithm is seen in the world of finite fields or elliptic curves. Here, given a finite field $\F_q$, the discrete logarithm is considered in the multiplicative group $\F_q^\times$. It so happens that this group is cyclic, and hence it must be isomorphic to $\Z/(q-1)$. So, given a $g\in \F_q$ that generates $\F_q^\times$ as a multiplicative group, the discrete logarithm always exists and thus defines a function
$$\log_g: \F_q^\times\longrightarrow \Z/(q-1).$$
More »

Posted by Jason Polak on 07. April 2018 · Write a comment · Categories: number-theory · Tags: ,

There is a cool way to express 1 as a sum of unit fractions using partitions of a fixed positive integer. What do we mean by partition? If $n$ is such an integer then a partition is just a sum $e_1d_1 + \cdots + e_kd_k = n$ where $d_i$ are positive integers. For example,

7 = 4 + 1 + 1 = 4 + 2(1)

The order of the partition is not very interesting and so we identify partitions up to order. Thus, here are all the 15 partitions of 7:

7
6+1
5+2
5+1+1
4+3
4+2+1
4+1+1+1
3+3+1
3+2+2
3+2+1+1
3+1+1+1+1
2+2+2+1
2+2+1+1+1
2+1+1+1+1+1
1+1+1+1+1+1+1

Group the same numbers together so that each partition is written as $n = \sum e_id_i$ where there are $e_i$ appearances of $d_i$ (or vice-versa, it's symmetric). Then it's a theorem that:
$$
1 = \sum (e_1!\cdots e_k!\cdot d_1^{e_1}\cdots d_k^{e_k})^{-1}.$$
This partition identity has a bunch of proofs. A neat one appears in the paper "Using Factorizations to Prove a Partition Identity" by David Dobbs and Timothy Kilbourn. In their proof, they used an asympotitc expression for the number of irreducible polynomials over a finite field of a given degree $n$ (the same $n$ that appears in the partition).

Here are some examples of this identity. For n=5, we have:

1 = 1/5 + 1/4 + 1/6 + 1/6 + 1/8 + 1/12 + 1/120

For n=7:

1 = 1/7 + 1/6 + 1/10 + 1/10 + 1/12 + 1/8 + 1/24 + 1/18
+ 1/24 + 1/12 + 1/72 + 1/48 + 1/48 + 1/240 + 1/5040

And for n=11

1 = 1/11 + 1/10 + 1/18 + 1/18 + 1/24 + 1/16 + 1/48 + 1/28 + 1/21
+ 1/56 + 1/28 + 1/168 + 1/30 + 1/24 + 1/36
+ 1/36 + 1/48 + 1/72 + 1/720 + 1/50 + 1/40 + 1/40
+ 1/90 + 1/30 + 1/90 + 1/240 + 1/80 + 1/240
+ 1/3600 + 1/96 + 1/64 + 1/192 + 1/72 + 1/96 + 1/48
+ 1/288 + 1/192 + 1/192 + 1/960 + 1/20160 + 1/324
+ 1/324 + 1/144 + 1/216 + 1/2160 + 1/1152 + 1/288 + 1/576
+ 1/4320 + 1/120960 + 1/3840 + 1/2304
+ 1/5760 + 1/40320 + 1/725760 + 1/39916800

More »

Posted by Jason Polak on 30. January 2018 · Write a comment · Categories: number-theory · Tags: ,
A real number is simply normal in base $b$ if the frequency of each base $b$ digit in the first $n$ digits tends to a limit as $n$ goes to infinity, and each of these limits is the same.

In other words, a real number is simply normal in base $b$ if each digit appears with equal frequency. It may not be the case of course that the frequency is well-defined: as you keep computing more and more decimal digits of a number, the frequency of any given digit could keep oscillating.

A real number is normal in base $b$ if the frequency of each length $k$ string of numbers exists and is equal to $1/b^k$.

A rational number can be simply normal in some base. For example, 0.12345678901234567890…. is simply normal in base ten. Rational numbers can't be normal, since their decimals repeat. Irrational numbers can be normal, and in fact almost every number is normal in the sense of the Lebesgue measure. Champernowne proved that the number

0.123456789101121314…

is normal to every base. Mathematicians believe that $e$ and $\pi$ are normal but no one has ever proved it, or found a proof that they are not normal. It's tempting to look at some numerical evidence. For example, here are the frequencies of the digits in the first 301029 decimal digits of $e$:

0 30057
1 30283
2 29785
3 30316
4 30206
5 30044
6 30375
7 30065
8 30055
9 29843

Counting these took about 2.4 seconds. Looks like the digit '6' occurs the most often, whereas '2' occurs the least often. Is this about what we'd expect? Here's the result of a $\chi^2$-test for goodness of fit, the null hypothesis being that the probability of each digit is the same:

X-squared = 11.309, df = 9, p-value = 0.2551

Looks like there's no reason to be alarmed. Of course, since the normality and simple normality of a number depends only on the "tail" of its expansion, just testing a finite number of digits can be misleading. The rational number defined by the first 301029 digits of $e$ is not normal or simply normal.

The frequencies for all two digit combinations starts off like this:

00 2957
01 3035
02 3040
03 2978

You probably don't want to look at the other 96 frequencies. But a $\chi^2$-test reveals the following:

X-squared = 108.06, df = 99, p-value = 0.2506

Although this is rather convincing that $e$ is normal, none of this proves anything. If it is, you'd expect 12345 to occur about three times in the first 301029 digits. In fact it does, the first time starting at the 166412th digit. The sequence 99999 only occurs once, starting at the 290471st digit. Even though the odds are against it, the starting six digit sequence 271828 actually does occur once, at the 252474th digit. Maybe that will help you memorise $e$?

Posted by Jason Polak on 29. November 2017 · Write a comment · Categories: number-theory · Tags: , ,

Lately I've been thinking about primes, and I've plotted a few graphs to illustrate some beautiful ideas involving primes. Even though you might not always associate with primes, they are always haunting quietly in the background.

Abundance of primes in an arithmetic progression

Let's start out with the oddest prime of all: 2. Get it? But after that, all the odd primes are either of the form $4k + 1$ or $4k + 3$. For fixed $x$, are there more primes less than $x$ of the form $4k + 1$ or the second form $4k + 3$? Let's write $\pi(4k + r,x)$ for the number of primes less than or equal to $x$ of the form $4k + r$. Here is a graph of the difference $\pi(4k+3,x) – \pi(4k+1,x)$:

Pretty neat right? It looks like this difference is wildly erratic, reaching zero after a short while with a bit of a fight and then for a really good long while the primes of the form $4k +3$ win out. So you might be tempeted to think that primes of the form $4k + 3$ become more and more abundant as $x$ increases. That would be wrong. In fact, John E. Littlewood proved that $\pi(4k +3,x)-\pi(4k + 1,x)$ switches sign infinitely often!

Of course, that must mean there are infinitely many primes of both types, and that's true and a special case of Dirichlet's theorem: there are infinitely many primes in any arithmetic progression $ax + b$ whenever $a$ and $b$ are relatively prime.
More »

Posted by Jason Polak on 26. November 2017 · Write a comment · Categories: number-theory · Tags: ,

Since the days of antiquity, we've always been looking for ways to determine whether a natural number is prime. Trial division up to the square root of a number quickly becomes tedious, thought it is worth noting that even on my fairly old laptop a slightly optimised trial-division algorithm will list all the primes under 10^7 in just under seven seconds.

Another method that sometimes works is Fermat's little theorem:

Theorem. If $n$ is a prime number then for every $a$,
$$a^n\equiv_n a.$$

Here, I'm using the notation to mean $a^n – n$ is divisible by $n$. Let's do some examples: 4^11 – 4 = 4194300, which is divisible by 11 (you can see this directly by using the alternating sum digit test: 4 – 1 + 9 – 4 + 3 = 11). On the other hand, $2^{69} – 2 = 590295810358705651710 \equiv_{69} 6$. Of course, from the nature of the theorem, if $a^n\equiv_n a$ is not true then $n$ is composite, so it can tell you for sure if a number is composite but not if a number is prime.
More »