Binomial logistic regression models in R

In this post, we will look at a simple logistic binomial regression model in R. First, let's take a look at the following hypothetical data taken from [1]:

                  C = Yes            C = No
                  Disease            Disease
Exposure          Yes      No        Yes      No
Yes               1200     600       300      400
No                400      100       600      400

It is a contingency table. With this data we want to predict the probability of getting a disease based on exposure. We assume that some people are exposed to something (not necessarily the disease itself), and we count the number of people that get the disease. The exposure could be something good like an experimental treatment of "exercises at least twenty minutes a week on a treadmill" and the disease could be "heart disease". Notice that there is a second variable, "C", that we could also incorporate into our model. To get this data into R it is easiest to have it in a comma-separated file like this:
…read the rest of this post!
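To make the goal concrete, here is a minimal sketch in Python (not the R code of the full post) of what a logistic model with exposure as its only predictor would estimate from the table. It assumes the row labels Yes/No denote exposure and pools the counts over C. With a single binary predictor, the fitted probabilities are just the observed proportions and the slope is the log odds ratio, so no fitting routine is needed. The names `counts`, `logit` and `proportion` are made up for this illustration.

```python
import math

# Counts read off the contingency table, pooled over the variable C.
# Assumption: the row labels Yes/No are the exposure status.
counts = {
    # exposure: (with disease, without disease)
    "exposed": (1200 + 300, 600 + 400),
    "unexposed": (400 + 600, 100 + 400),
}


def logit(p):
    """Log-odds of a probability p."""
    return math.log(p / (1 - p))


def proportion(diseased, healthy):
    """Observed proportion of individuals with the disease."""
    return diseased / (diseased + healthy)


p_exposed = proportion(*counts["exposed"])
p_unexposed = proportion(*counts["unexposed"])

# For one binary predictor, the maximum-likelihood logistic fit has
# intercept = logit(p_unexposed) and slope = log odds ratio.
intercept = logit(p_unexposed)
slope = logit(p_exposed) - intercept
odds_ratio = math.exp(slope)

print(f"P(disease | exposed)   = {p_exposed:.3f}")
print(f"P(disease | unexposed) = {p_unexposed:.3f}")
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}, odds ratio = {odds_ratio:.3f}")
```

Pooling over C ignores the second variable entirely. A model that includes C would treat the two halves of the table separately.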

Chytrid fungus and logistic regression against temperature

Chytrid fungus refers to the fungus Batrachochytrium dendrobatidis (Bd). In amphibians, it causes a disease known as chytridiomycosis. It is one of the worst diseases to strike multiple species of animals on the planet. This horrible disease degrades the skin, which in amphibians is a sensitive, permeable organ that is part of the respiratory system. Chytridiomycosis eventually leads to death and is a serious threat to many amphibian species in areas such as North America, Central America, Asia, Africa, and even Australia.

While not necessarily threatened with extinction by Chytrid fungus, even Northern Leopard Frogs may be affected by the widespread organism. Photo by Jason Polak.

As humans, we have contributed to the spread of Chytrid. For example, the bullfrog Lithobates catesbeianus is an excellent reservoir of Bd and has been associated with increased presence of the fungus in areas where it has become an invasive species due to imports for food. Chytrid fungus has also been responsible for huge losses and declines of frog species in Panama.

Bd grows between 4 and 25 °C. Therefore, we expect to see a decrease in the prevalence of Bd infections in species in the wild as the temperature of the water rises. Matthew Forrest and Martin Schlaepfer tested this and modeled the presence or absence of a Bd infection against water temperature using binomial logistic regression. In their sample of 201 Lowland Leopard frogs, they grouped 10-12 individuals and plotted the proportion of individuals with Bd to illustrate their model:

Logistic equation model reproduced from Forrest and Schlaepfer.

The curve is their logistic regression equation, which is
$$p = \frac{1}{1+e^{-(4.56-0.226t)}}$$ where $t$ is the temperature in Celsius, and $p$ is the proportion of individuals with Bd, so that $p$ decreases as $t$ rises. This is just one example of the steps we need to take towards better understanding the ecology of the Bd organism so that we can better prevent it from eradicating beautiful species of amphibians that are integral parts of our biosphere.
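To get a feel for the curve, here is a small Python sketch that evaluates it at a few water temperatures. It assumes the linear predictor $4.56 - 0.226t$ is the log-odds of infection, that is, $p = 1/(1+e^{-(4.56-0.226t)})$, which is the form under which predicted prevalence falls as the water warms. The function name `bd_proportion` is made up for this illustration.

```python
import math


def bd_proportion(t):
    """Predicted proportion of frogs with Bd at water temperature t (Celsius).

    Assumption: 4.56 - 0.226 * t is the log-odds of infection.
    """
    return 1.0 / (1.0 + math.exp(-(4.56 - 0.226 * t)))


for t in (5, 10, 15, 20, 25, 30, 35):
    print(f"{t:2d} C: predicted proportion infected = {bd_proportion(t):.2f}")

# Temperature at which the predicted proportion is exactly one half
# (the log-odds are zero there).
t_half = 4.56 / 0.226
print(f"p = 0.5 at about {t_half:.1f} C")
```

Under this assumption the predicted proportion crosses one half at around 20 °C and keeps dropping in warmer water.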

Would you like to learn more about Chytrid fungus? Check out Frogs: The Thin Green Line PBS documentary on YouTube:

Book review: Richard Fortey's "Life"

As far as the telling of history goes, there can be little more ambitious than the entire history of life from the dawn of the Precambrian to the present day. That is just what paleontologist Richard Fortey attempts in his book Life: A Natural History of the First Four Billion Years of Life on Earth published by Vintage.

Life is a tour of the development of our planet, and puts our current state of the world in its humble place. The land biodiversity that we are so accustomed to, as well as the existence of anything remotely resembling ourselves, is a relatively recent occurrence, and Fortey shows us the huge changes that the earth went through to get to the present day.

Many readers will be excited to read the chapters on dinosaurs and their extinction. The author did a great job here of portraying the controversies and ideas surrounding the exact nature of this cataclysm. Personally, I really enjoyed the earlier chapters on the little creatures that inhabited the ancient oceans. They were small, often delicate things that give the impression of a gentler time with gentle creatures. What I wouldn't give to view the planet at that time, though through much of it the atmosphere would not even be suitable for human breathing!

From plants to reptiles to birds to mammals, this book touches upon all aspects of life and its evolution. The narratives are imbued with the personal journey of the author to discover many of the facts in this book. Although at times I found his tangents into literature and human history a bit distracting, I found that his choice of style and witty language gave the perfect grandeur to such an epic story. In fact, the story is so epic that I suggest to readers of this book that they create their own timeline of geologic history based on this chart, and have it handy when reading the book.

Although the story in Life was partially presented to me in minute and dim fragments in school, until now I had never been exposed to such cohesiveness, and I believe everyone should know this story. Highly recommended!

Book review: Dunn's "The Wild Life of Our Bodies"

For the past few years, I have been increasingly aware of the unusual aspects of modern society with regard to technological and cultural development. Compared to thousands of years ago, these developments have led to vast changes in the food we eat and how we spend our time. More of our lives are indoors, away from nature. The more I think about it, the more alarmed I get.

When sentiments like this were echoed in the introduction of the book The Wild Life of Our Bodies by Rob Dunn, I jumped into reading until the end!

Dunn's book is a look at some aspects of our species' evolution through interactions with parasites, predators, and diet. More specifically, through the lens of evolution, he examines some of our current problems and what we lack compared to our past that may actually be causing problems for us. The opening part is about parasites such as intestinal worms. He argues that certain intestinal worms may actually have been beneficial and that the chances of getting digestive problems such as inflammatory bowel disease may actually be higher in those who do not have intestinal whipworms.

The author takes us through other topics such as obesity and anxiety, and how traits that led to these were once adaptive, and colour vision, which is still useful to us (but not to the snakes which possibly led to colour vision). This idiosyncratic tour through our evolution makes the point that to understand ourselves and our modern problems, we should look in the past and discover what we have changed. The book finishes by examining our move away from living where we farm our food, and the growing needs of our increasing population.

Although each of the topics examined by Dunn could be the subject of a book by itself, Dunn manages to continually entertain the reader through clear scientific exposition combined with the witty writing of a seasoned biologist. The book is consistently enjoyable and I highly recommend it to anyone interested in science or the natural history of humanity.

Forthcoming book review: Algorithms by Louridas

An algorithm is a set of well-defined instructions for data transformation. An algorithm typically is given input, and then gives some sort of output. It is the computing version of a function in mathematics. Examples of algorithms range from the very simple such as long division or multiplication that we are taught in grade school to S. Landau's polynomial time algorithm to factor polynomials over algebraic number fields.

For most people, the influence of algorithms on their lives comes from more practical examples, such as recommendations on Amazon or optimizing the amount of food delivered to the local grocery store. For this reason, it might be useful to have an introductory book on algorithms aimed at a general audience. Panos Louridas has written just such a book, entitled Algorithms, set to be released by MIT Press on August 18, 2020. The publisher was kind enough to send me an advance copy, which I will now review.

Algorithms has the difficult task of introducing the concept of an algorithm in just 244 pages. To do this, the author has chosen to start with the greatest common divisor (GCD) algorithm, which is also aptly illustrated by a flowchart on the front cover of the book. This algorithm, known as Euclid's algorithm, is introduced in the first chapter through a similar problem of distributing two types of objects arranged in a row as evenly as possible. By using Euclid's algorithm graphically, the author shows us how to create one possible distribution of the two types of objects. It is also quite interesting how he relates these patterns to drumming rhythms in ancient music.
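For readers who have not seen it, Euclid's algorithm fits in a few lines. The following Python sketch is the standard remainder-based version, not necessarily the exact formulation or flowchart used in the book.

```python
def gcd(a, b):
    """Greatest common divisor of two non-negative integers (Euclid's algorithm)."""
    while b != 0:
        # Replace the pair (a, b) by (b, a mod b); the GCD is unchanged.
        a, b = b, a % b
    return a


# 252 = 2*105 + 42, 105 = 2*42 + 21, 42 = 2*21 + 0, so the GCD is 21.
print(gcd(252, 105))
```

Each step replaces the larger number with a remainder, so the numbers shrink until the remainder is zero and the last nonzero value is the answer.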
…read the rest of this post!

Check out my Sudoku solver

A couple days ago I started solving a few Sudokus. I would say I am average at solving them. I certainly don't use any advanced techniques and I don't really find the difficult ones very entertaining. By difficult here I mean where you have to think several moves ahead to rule out possibilities. So I thought I would write a Sudoku solver. You can try it out. Yeah, I named it 'Jasudoku' after myself. There are some test cases in it, and you can actually play them by entering some numbers and clicking 'Solve' as the program won't let you go to solve mode unless the puzzle is correct at the end.
…read the rest of this post!

Preprint: group theory and chords

A while ago I wrote a paper called A Group-theoretical Classification of Three-tone and Four-tone Harmonic Chords (arXiv link). The main contribution is an analysis of four-tone harmonic chords in music theory. These are: major-major, minor-major, augmented-major, major-minor, diminished-minor, minor-minor, and diminished-diminished. X-Y in this scheme means X is the basic triad (so major, minor, augmented, or diminished), and Y is the quality of the seventh factor. These four-tone chords can be represented as a quadruple $(0,a,b,c)$ where the numbers $a,b,c$ are the distance from the root $0$ in semitones. I define three fundamental operations on these chords: inversion (usual inversion on a keyboard instrument) denoted $i$, major-minor duality denoted $d$, and augmented-diminished duality denoted $a$. These three operators are elements of the symmetric group on four letters.
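As a rough illustration of the quadruple representation, here is a Python sketch that writes the seven chords in semitones and applies keyboard inversion. The interval values are the standard ones for these seventh chords. The function `invert` is my guess at how inversion acts on a quadruple (move the root up an octave and measure from the new bass note) and may not match the exact definition in the paper. The dualities $d$ and $a$ are not sketched because they are not defined here.

```python
# Four-tone chords as quadruples (0, a, b, c) of semitone distances from the root.
# Naming follows the X-Y scheme: X is the triad, Y the quality of the seventh.
CHORDS = {
    "major-major": (0, 4, 7, 11),
    "minor-major": (0, 3, 7, 11),
    "augmented-major": (0, 4, 8, 11),
    "major-minor": (0, 4, 7, 10),
    "diminished-minor": (0, 3, 6, 10),
    "minor-minor": (0, 3, 7, 10),
    "diminished-diminished": (0, 3, 6, 9),
}


def invert(chord):
    """Keyboard inversion (an assumption, see the text above).

    The root moves up an octave (12 semitones) and all distances are
    measured from the new lowest note.
    """
    _, a, b, c = chord
    return (0, b - a, c - a, 12 - a)


for name, chord in CHORDS.items():
    print(f"{name:22s} {chord} -> first inversion {invert(chord)}")
```

Under this definition, inverting four times returns the original chord, and the diminished-diminished chord $(0,3,6,9)$ is fixed by inversion.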

I show how all the harmonic chords are related under these operators, and each of these relations reveals the harmonic relations between these chords as we hear them in music. The highlight of the paper is this diagram, which shows all these relations (M=major, m=minor, A=augmented, d=diminished):

Unfortunately, I recently found out that it was rejected :( It was suggested that I could resubmit on the condition that I significantly expand its relation to the rest of the literature and make the notation more consistent with some other parts of the literature. I am going to have to figure out what to do about that because I think too much notational change might actually make this paper more confusing and hard to read. So, I am sharing it with you so you can see this cool diagram. I am not sure if I can really satisfy the requirements of the field of mathematics of music so I might have to leave this one as a preprint, especially since I would like to keep working on ecology.

Some misuses of science

Ever since I was a child, I was fascinated by science. Back then, science was reading about astronomy, physics, chemistry, and biology and the wonderful facts and theories proposed to explain those facts. The method that is used by scientists in their journey is the scientific method. Although the exact nature of the scientific method has been examined by philosophers for hundreds of years, its core nature of falsifiable hypotheses and experimentation holds a great beauty of thought.

However, science is carried out by individuals and societies and in turn that knowledge affects those individuals and societies. Thus, the actual discoveries and processes of science in the short term can be far more tumultuous than the overarching successes presented in history. As intelligent beings, we need to be aware of the dangers in science, so that we can more quickly discover the truth. We need to be wary of the misuse of science for means other than discovering the truth via genuine curiosity about the world.
…read the rest of this post!

Is math art? Part 1

Is mathematics an art? In this series of posts I will attempt to explore and answer this question. I encourage readers to also contribute their ideas in the comments.

The term art itself has undergone a metamorphosis over the centuries, so we should start by examining what humanity has meant by art and see if mathematics might fall under any of the definitions used over time. But beyond these definitions, we should examine whether we feel that mathematics is an art. Whether it is or not, what definition should art have in order to be consistent with the existence of mathematics? Only by examining these two angles will we gain insight into both art and mathematics, and how we as humans fit into both.

One of the earliest descriptions of art is the Platonic one, which says that art is imitation [1]. This view is echoed by Leon Battista Alberti (1404-1472), who thought that a painting should be as faithful a reproduction or imitation as possible of the real scene being depicted. Obviously this is a very limited definition by today's standards, but nonetheless worth looking at, as the germ of creating a reproduction of a scene using any media is still alive today, at least as a motivating factor to create art.
…read the rest of this post!

Summing the first $n$ powers and generating functions

One of the classic and most used sums in all of mathematics is the sum of the first $n$ natural numbers
$$1 + 2 + \cdots + n = \frac{n(n+1)}{2}.$$ Gauss's classic derivation of this formula involves observing that if we duplicate this sum, write it backwards under the first sum, and add termwise, we get $(n+1) + (n+1) + \cdots + (n+1)$ with $n$ terms, whence the original sum is half of $n(n+1)$.

Similar formulas work for higher powers. For example,
$$1^2 + 2^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}.$$ But is it a priori clear that the sum of the $a$th powers of the first $n$ natural numbers is a polynomial in $n$ of degree $a+1$? The pattern at least holds in the two cases above: the sum of the first $n$ natural numbers is a quadratic polynomial in $n$, and the sum of the first $n$ squares is a cubic polynomial in $n$.
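Both closed forms are easy to sanity-check by brute force. The following Python sketch compares them with direct summation for small $n$. It is a check, not a proof, and the helper name `power_sum` is made up for this illustration.

```python
def power_sum(n, a):
    """Directly compute 1**a + 2**a + ... + n**a."""
    return sum(k**a for k in range(1, n + 1))


for n in range(0, 100):
    # Sum of the first n natural numbers.
    assert power_sum(n, 1) == n * (n + 1) // 2
    # Sum of the first n squares.
    assert power_sum(n, 2) == n * (n + 1) * (2 * n + 1) // 6

print("Both formulas agree with direct summation for n = 0, ..., 99.")
```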

It is actually true that for any natural number $a$,
$$\sum_{k=0}^n k^a$$ can be written as a polynomial function of $n$ of degree $a+1$. Of course, one way to see this is to derive in a brute force manner a formula that actually works for all powers $a$. That train of thought was actually carried out by Johann Faulhaber (1580-1635) and completed by Jacobi. The resulting formula is now known as Faulhaber's formula.
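One quick way to see the degree numerically, without Faulhaber's formula or generating functions, is through finite differences: a sequence given by a polynomial of degree $d$ has constant $d$-th forward differences. The Python sketch below is only a numerical check on finitely many values, and the helper names are made up for this illustration. It takes repeated differences of the power sums until they become constant and counts the number of steps.

```python
def power_sum(n, a):
    """Directly compute 1**a + 2**a + ... + n**a (zero for n = 0)."""
    return sum(k**a for k in range(1, n + 1))


def degree_from_differences(values):
    """Number of forward-difference steps needed until the sequence is constant.

    For values of a polynomial sampled at consecutive integers (with enough
    samples), this equals the degree of the polynomial.
    """
    degree = 0
    while len(set(values)) > 1:
        values = [y - x for x, y in zip(values, values[1:])]
        degree += 1
    return degree


for a in range(1, 7):
    samples = [power_sum(n, a) for n in range(20)]
    print(f"a = {a}: differences become constant after {degree_from_differences(samples)} steps")
```

The reason this works is that the first difference of the power sum is $(n+1)^a$, a polynomial of degree $a$, so one more than $a$ differences are needed in total.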
…read the rest of this post!