## Visualizing statistical models – it’s child’s play

Before you ask a mathematician if they can visualize the fourth dimension, ask them if they can truly visualize a three-dimensional object, like the boundary of a four-dimensional football. If they tell you it’s easy, and their name isn’t Maryna Viazovska, they’re probably lying.

Making an accurate picture of an object from a high dimensional space is very challenging. In this blog post we’ll see a surprising case where it turns out to be possible. We’ll visualize an interesting seven-dimensional object, which comes from a question in statistics.

Let’s consider the probability that each of the teams in the quarter-finals of the Men’s FIFA 2018 World Cup would win. The teams were (Uruguay, France, Brazil, Belgium, Russia, Croatia, Sweden, England). Today we know the probabilities of the teams winning, in that order, are $(0,1,0,0,0,0,0,0)$, because France has already won. Back on 3rd July the probabilities (according to FiveThirtyEight) were $(0.06, 0.15, 0.3, 0.11, 0.05, 0.12, 0.07, 0.14)$, and on 7th July the probabilities were $(0,0.29,0,0.26,0,0.18,0,0.27)$.

In a recent project we were studying which probability distributions lie in a particular statistical model. We found out that our statistical model is given by inequalities that the eight probabilities need to satisfy. If we call the probabilities $(a,b,c,d,e,f,g,h)$, the inequalities are:

$(ad-bc)(eh-fg) \geq 0, \quad (af-be)(ch - dg) \geq 0, \quad (ag-ce)(bh-df) \geq 0 .$

The probabilities have to sum to 1, so $a + b + c + d + e + f + g + h = 1$. We want to visualize the part of seven-dimensional space in which the inequalities hold. How can we do it?
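To get a feel for the inequalities before visualizing them, here is a minimal Python sketch that tests whether a given vector of eight probabilities satisfies all three. The function name `in_model` is made up for this illustration; the three test vectors are the World Cup probabilities quoted above.

```python
def in_model(p):
    """Return True if the eight probabilities satisfy all three inequalities."""
    a, b, c, d, e, f, g, h = p
    return (
        (a * d - b * c) * (e * h - f * g) >= 0
        and (a * f - b * e) * (c * h - d * g) >= 0
        and (a * g - c * e) * (b * h - d * f) >= 0
    )

# Order: Uruguay, France, Brazil, Belgium, Russia, Croatia, Sweden, England
july_3 = (0.06, 0.15, 0.3, 0.11, 0.05, 0.12, 0.07, 0.14)
july_7 = (0, 0.29, 0, 0.26, 0, 0.18, 0, 0.27)
final = (0, 1, 0, 0, 0, 0, 0, 0)

for name, p in [("3rd July", july_3), ("7th July", july_7), ("final", final)]:
    print(name, in_model(p))
```

With these numbers the 3rd July vector fails the second inequality (since $af - be < 0$ while $ch - dg > 0$), whereas the other two vectors contain so many zeros that every product is zero and the inequalities hold with equality.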

The first step is to notice that some combinations of letters do not affect whether the inequalities hold or not. They are:

$(a + b + c + d) - (e + f + g + h) , \quad (a + c + e + g) - (b + d + f + h) , \quad (a + b + e + f) - (c + d + g + h)$

So we can apply a change of coordinates that removes these three directions, leaving something four-dimensional. Finally, to get something three-dimensional we can assume that the four remaining coordinates lie on the sphere.

We end up with a picture that looks like this:

The part of space that lies inside the statistical model consists of the points outside either the blue blob, the green blob, or the yellow blob.

These days, we have an even better way to visualize the statistical model, truly in 3D. It even doubles up as a handmade toy for children.

We can’t help but wonder – which other children’s toys are really statistical models in disguise?

## A duality of pictures

Duality relates objects that seem different at first but turn out to be similar. The concept of duality occurs almost everywhere in maths. If two objects seem different but are actually the same, we can view each object in a “usual” way, and in a “dual” way – the new vantage point helps us understand the object in a new way. In this blog post we’ll see a pictorial example of a mathematical duality.

How are these two graphs related?

In the first graph, we have five vertices, the five black dots, and six green edges which connect them. For example, the five vertices could represent cities (San Francisco, Oakland, Sausalito, etc.) and the edges could be bridges between them.

In the second graph, the roles of the cities and the bridges have swapped. Now the bridges are the vertices, and the edges (or hyperedges) are the cities. For example, we can imagine that the cities are large metropolises and the green vertices are the bridge tolls between one city and the next.

Apart from swapping the role of the vertices and the edges, the information in the two graphs is the same. If we shrink each city down to a dot in the second graph, and grow each bridge toll into a full bridge, we get the first graph. We will see that the graphs are dual to each other.

We represent each graph by a labeled matrix: we label the rows by the vertices and the columns by the edges, and we put a $1$ in the matrix whenever the vertex is in the edge. For example, the entry for vertex $1$ and edge $a$ is $1$, because edge $a$ contains vertex $1$. The matrix on the left is for the first graph, and the one on the right is for the second graph.

We can see that the information in the two graphs is the same from looking at the two matrices – they are the same matrix, transposed (or flipped). The matrix of a hypergraph is the transpose of the matrix of the dual hypergraph.
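The transpose relationship can be checked mechanically. Below is a small Python sketch, assuming a hypothetical graph with five vertices and six edges (the edge list is invented for illustration and is not necessarily the graph in the picture). It builds the labeled matrix of a hypergraph, builds the dual hypergraph by swapping the roles of vertices and edges, and compares the two matrices.

```python
def incidence_matrix(vertices, edges):
    """Rows are vertices, columns are edges; an entry is 1 when the vertex is in the edge."""
    return [[1 if v in members else 0 for members in edges.values()] for v in vertices]

def dual(vertices, edges):
    """Swap roles: each old edge becomes a vertex, each old vertex becomes a hyperedge."""
    dual_vertices = list(edges)
    dual_edges = {
        v: {name for name, members in edges.items() if v in members} for v in vertices
    }
    return dual_vertices, dual_edges

def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

# Hypothetical example: five cities (vertices) and six bridges (edges).
vertices = [1, 2, 3, 4, 5]
edges = {"a": {1, 2}, "b": {2, 3}, "c": {3, 4}, "d": {4, 5}, "e": {1, 5}, "f": {2, 5}}

M = incidence_matrix(vertices, edges)
dual_vertices, dual_edges = dual(vertices, edges)
M_dual = incidence_matrix(dual_vertices, dual_edges)

print(M_dual == transpose(M))  # the dual's matrix is the transpose of the original
```

In the dual, vertex 2 of the original graph becomes a hyperedge containing the three bridges `a`, `b` and `f`, which is why hyperedges (rather than ordinary two-vertex edges) are needed.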

Mathematicians are always on the look-out for hidden dualities between seemingly different objects, and we are happy when we find them. For example, in a recent project we studied the connection between graphical models, from statistics, and tensor networks, from physics. We showed that the two constructions are the duals of each other, using the hypergraph duality we saw in this example.

## Understanding the brain using topology: the Blue Brain project

ALERT ALERT! Applied topology has taken the world by storm once more. This time techniques from algebraic topology are being applied to model networks of neurons in the brain, in particular how the brain processes information when exposed to a stimulus. Ran Levi, one of the ‘co-senior authors’ of the recent paper published in Frontiers in Computational Neuroscience, is based in Aberdeen, and he was kind enough to let me show off their pictures in this post. The paper can be found here.

So what are they studying?

When a brain is exposed to a stimulus, neurons fire seemingly at random. We can detect this firing and create a ‘movie’ to study. The firing rate increases towards peak activity, after which it rapidly decreases. In the case of chemical synapses, synaptic communication flows from one neuron to another and you can view this information by drawing a picture with neurons as dots and possible flows between neurons as lines, as shown below. In this image more recent flows show up as brighter.

Numerous studies have been conducted to better understand the pattern of this build-up and rapid decrease in neuron spikes, and this study contains significant new findings as to how neural networks are built up and decay throughout the process, both at a local and a global scale. This new approach could provide substantial insights into how the brain processes and transfers information. The brain is one of the main mysteries of medical science, so this is huge! For me the most exciting part of this is that the researchers build their theory through the lens of algebraic topology, and I will try to explain the main players in their game here.

Topological players: cliques and cavities

The study used a digitally constructed model of a rat’s brain, which reproduced neuron activity from experiments in which rats were exposed to stimuli. From this model ‘movies’ of neural activity could be extracted and analysed. The researchers then compared their findings to real data and found that the same phenomenon occurred.

Neural networks have previously been studied using graphs, in which the neurons are represented by vertices and possible synaptic connections between neurons by edges. This throws away quite a lot of information, since during chemical synapses the synaptic communication flows, over a minuscule time period, from one neuron to another. The study takes this into account and uses directed graphs, in which an edge has a direction emulating the synaptic flow. This is the structural graph of the network that they study. They also study functional graphs, which are subgraphs of the structural graph. These contain only the connections that fire within a certain ‘time bin’. You can think of these as synaptic connections that occur in a ‘scene’ of the whole ‘movie’. There is one graph for each scene and this research studies how these graphs change throughout the movie.

The main structural objects discovered and consequently studied in these movies are subgraphs called directed cliques. These are graphs in which every vertex is connected to every other vertex. There is a source neuron, from which all edges are directed away, and a sink neuron, towards which all edges are directed. In this sense the flow of information has a natural direction. Directed cliques consisting of n neurons are called simplices of dimension (n-1). Certain sub-simplices of a directed clique form their own directed cliques, called sub-cliques, when the vertices in the sub-simplex contain their own source and sink neuron. Below are some examples of the directed clique simplices.
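As a rough illustration of the definition, here is a brute-force Python sketch for finding directed cliques in a tiny directed graph. It assumes the reading that a set of neurons forms a directed clique when the neurons can be put in an order such that every earlier neuron has an edge to every later one (so the first is the source and the last is the sink). The toy network and the function names are invented for this example, and trying all orderings is only feasible for very small sets.

```python
from itertools import combinations, permutations

def is_directed_clique(nodes, edges):
    """True if the nodes can be ordered so each earlier node has an edge to each later one."""
    return any(
        all(
            (order[i], order[j]) in edges
            for i in range(len(order))
            for j in range(i + 1, len(order))
        )
        for order in permutations(nodes)
    )

def directed_cliques(nodes, edges, size):
    """All directed cliques on `size` nodes; each is a simplex of dimension size - 1."""
    return [c for c in combinations(nodes, size) if is_directed_clique(c, edges)]

# Hypothetical toy network: (u, v) means a connection from neuron u to neuron v.
edges = {(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 3), (3, 4), (4, 0)}
nodes = range(5)

for size in (2, 3, 4):
    print(size - 1, directed_cliques(nodes, edges, size))
```

In this toy network neurons 0, 1, 2, 3 form a 3-dimensional simplex with source 0 and sink 3, while neurons 0, 3, 4 are pairwise connected but form a cycle, so they have no source or sink and are not a directed clique.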

And the images below show these simplices occurring naturally in the neural network.

The researchers found that over time, simplices of higher and higher dimension were born in abundance, as synaptic communication increased and information flowed between neurons. Then suddenly all cliques vanished: the brain had finished processing the new information. This relates the neural activity to an underlying structure which we can now study in more detail. It is a very local structure: simplices of up to 7 dimensions were detected, that is, cliques of 8 neurons in a microcircuit containing tens of thousands. It was the sheer abundance of this local structure that made it significant, where in this setting local means concerning a small number of vertices in the structural graph.

As well as considering this local structure, the researchers also identified a global structure in the form of cavities. Cavities are formed when cliques share neurons, but not enough neurons to form a larger clique. An example of this sharing is shown below, though please note that this is not yet an example of a cavity. When many cliques together bound a hollow space, this forms a cavity. Cavities represent homology classes, and you can read my post on introducing homology here. An example of a 2 dimensional cavity is also shown below.

The graph below shows the formation of cavities over time. The x-axis corresponds to the first Betti number, which gives an indication of the number of 1 dimensional cavities, and the y-axis similarly gives an indication of the number of 3 dimensional cavities, via the third Betti number. The spiral is drawn out over time as indicated by the text specifying milliseconds on the curve. We see that at the beginning there is an increase in the first Betti number, before an increase in the third alongside a decrease in the first, and finally a sharp decrease to no cavities at all. Considering the neural movie, we view this as an initial appearance of many 1 dimensional simplices, creating 1 dimensional cavities. Over time, the number of 2 and 3 dimensional simplices increases, by filling in extra connections between 1 dimensional simplices, so the lower dimensional cavities are replaced with higher dimensional ones. When the number of higher dimensional cavities is maximal, the whole thing collapses. The brain has finished processing the information!
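To make the idea of counting cavities with Betti numbers a little more concrete, here is a small pure-Python sketch. It is not the researchers' method: it works with ordinary (undirected) simplicial complexes rather than directed cliques, and it computes Betti numbers over the two-element field, both of which are simplifying assumptions made for this illustration. The $d$-th Betti number is the number of $d$-simplices, minus the rank of the boundary map out of dimension $d$, minus the rank of the boundary map into dimension $d$.

```python
from itertools import combinations

def all_faces(top_simplices):
    """Every non-empty face of the given simplices, as sorted tuples of vertices."""
    faces = set()
    for simplex in top_simplices:
        simplex = tuple(sorted(simplex))
        for size in range(1, len(simplex) + 1):
            faces.update(combinations(simplex, size))
    return faces

def rank_gf2(rows):
    """Rank over the two-element field of a matrix whose rows are integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low_bit = pivot & -pivot
        rows = [r ^ pivot if r & low_bit else r for r in rows]
    return rank

def betti_numbers(top_simplices):
    faces = all_faces(top_simplices)
    max_dim = max(len(f) for f in faces) - 1
    by_dim = [sorted(f for f in faces if len(f) == d + 1) for d in range(max_dim + 1)]

    def boundary_rank(d):
        # Rank of the map sending each d-simplex to the sum of its (d-1)-faces.
        if d == 0 or d > max_dim:
            return 0
        index = {f: i for i, f in enumerate(by_dim[d - 1])}
        rows = []
        for s in by_dim[d]:
            mask = 0
            for i in range(len(s)):
                mask |= 1 << index[s[:i] + s[i + 1:]]
            rows.append(mask)
        return rank_gf2(rows)

    return [
        len(by_dim[d]) - boundary_rank(d) - boundary_rank(d + 1)
        for d in range(max_dim + 1)
    ]

hollow_triangle = [(0, 1), (1, 2), (0, 2)]
filled_triangle = [(0, 1, 2)]
hollow_tetrahedron = list(combinations(range(4), 3))

print(betti_numbers(hollow_triangle))     # one 1-dimensional cavity
print(betti_numbers(filled_triangle))     # filling in the triangle removes it
print(betti_numbers(hollow_tetrahedron))  # one 2-dimensional cavity
```

The three examples mirror the story in the movie: three edges around a triangle bound a 1-dimensional cavity, filling in the triangle destroys it, and four triangles glued into a hollow tetrahedron bound a 2-dimensional cavity.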

The time dependent formation of the cliques and cavities in this model was interpreted to try and measure both local information flow, influenced by the cliques, and global flow across the whole network, influenced by cavities.

So why is topology important?

These topological players provide a strong mathematical framework for measuring the activity of a neural network, and the process a brain undergoes when exposed to stimuli. The framework works without parameters (for example there is no measurement of distance between neurons in the model) and one can study the local structure by considering cliques, or how they bind together to form a global structure with cavities. By continuing to study the topological properties of these emerging and disappearing structures alongside neuroscientists we could come closer to understanding our own brains! I will leave you with a beautiful artistic impression of what is happening.

There is a great video of Kathryn Hess (EPFL) speaking about the project; watch it here.

For those of you who want to read more, check out the following blog and news articles (I’m sure there will be more to come and I will try to update the list)

Frontiers blog

Wired article

Newsweek article

## Tea with (Almond) Milk

Making a cup of tea in a hurry is a challenge. I want the tea to be as drinkable (cold) as possible after a short amount of time. Say, 5 minutes. What should I do: should I add milk to the tea at the beginning of the 5 minutes or at the end?

The rule we will use to work this out is Newton’s Law of Cooling. It says “the rate of heat loss of the tea is proportional to the difference in temperature between the tea and its surroundings”.

This means the temperature of the tea follows the differential equation $T' = -k (T - T_s)$, where the constant $k$ is a positive constant of proportionality. The minus sign is there because the tea is warmer than the room – so it is losing heat. Solving this differential equation, we get $T = T_s + (A - T_s) e^{-kt}$, where $A$ is the initial temperature of the tea.

We’ll start by defining some variables, to set the question up mathematically. Most of them we won’t end up needing. Let’s say the tea, straight from the kettle, has temperature $T_0$. The cold milk has temperature $m$. We want to mix tea and milk in the ratio $L:l$. The temperature of the surrounding room is $T_s$.

Option 1: Add the milk at the start

We begin by immediately mixing the tea with the milk. This leaves us with a mixture whose temperature is $\frac{T_0 L + m l }{L + l}$. Now we leave the tea to cool. Its cooling follows the equation $T = T_s +\left( \frac{T_0 L + m l }{L + l} - T_s \right) e^{-kt}$. After five minutes, the temperature is

Option 1 $= T_s +\left( \frac{T_0 L + m l }{L + l}- T_s \right) e^{-5k} .$

Option 2: Add the milk at the end

For this option, we first leave the tea to cool. Its cooling follows the equation $T = T_s + (T_0 - T_s) e^{-kt}$. After five minutes, it has temperature $T = T_s + (T_0 - T_s) e^{-5k}$. Then, we add the milk in the specified ratio. The final concoction has temperature

Option 2 $= \frac{(T_s + (T_0 - T_s) e^{-5k}) L + m l }{L + l}.$

So which temperature is lower: the “Option 1” temperature or the “Option 2” temperature?

It turns out that most of the terms in the two expressions cancel out. After multiplying both expressions by $L + l$ and removing the terms they share, the inequality boils down to a comparison of $e^{-5k} (T_s - m) l$ with $(T_s - m) l$; in full, Option 1 $-$ Option 2 $= \frac{l (T_s - m)(1 - e^{-5k})}{L + l}$. The answer depends on whether $(T_s - m) l > 0$. For our cup of tea, it will be: the milk is colder than the surroundings ($m < T_s$), and how much tea there is turns out not to matter. [What does this quantity represent?] Hence, since $k$ is positive, we have $e^{-5k} < 1$, so Option 1 is the warmer of the two and Option 2 wins: add the milk at the end.

But, does it really make a difference? (What’s the point of calculus?)

Well, we could plug in reasonable values for all the letters ($T_0 = 95^\circ\mathrm{C}$, etc.) and see how different the two expressions are.
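Here is a short Python sketch of that plug-in experiment. Apart from $T_0 = 95$, every number below is an assumption chosen for illustration (fridge-cold milk at 5 degrees, a 20 degree room, four parts tea to one part milk, and a cooling constant of 0.1 per minute), so the size of the gap should be taken with a pinch of salt.

```python
import math

def option_1(T0, m, Ts, L, l, k, t=5.0):
    """Add the milk first, then let the mixture cool for t minutes."""
    mixed = (T0 * L + m * l) / (L + l)
    return Ts + (mixed - Ts) * math.exp(-k * t)

def option_2(T0, m, Ts, L, l, k, t=5.0):
    """Let the tea cool for t minutes, then add the milk."""
    cooled = Ts + (T0 - Ts) * math.exp(-k * t)
    return (cooled * L + m * l) / (L + l)

# Assumed values (degrees Celsius, parts by volume, k per minute).
T0, m, Ts, L, l, k = 95.0, 5.0, 20.0, 4.0, 1.0, 0.1

first = option_1(T0, m, Ts, L, l, k)
second = option_2(T0, m, Ts, L, l, k)
print(f"Option 1: {first:.2f}, Option 2: {second:.2f}, gap: {first - second:.2f}")
```

With these assumed values the two answers differ by a little over one degree, so adding the milk at the end does give the cooler cup, but not dramatically so.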

So, why tea with Almond milk?

My co-blogger Rachael is vegan. She inspires me to make my tea each morning with Almond milk.

Finally, here’s a picture of an empirical experiment from other people (thenakedscientists) tackling this important question:

## Planes, trains and Kummer Surfaces

Here’s a short blog post for the holiday season, inspired by this article from Wolfram MathWorld. The topic is Kummer Surfaces, which are a particular family of algebraic varieties in 3-dimensional space. They make beautiful mathematical pictures, like these from their Wikipedia page:

A Kummer surface is the set of points in space where a particular equation is satisfied. One way to describe such surfaces is as the zero-sets of equations like:

${(x^2 + y^2 + z^2 - \mu^2 )}^2 - \lambda (1 - z - \sqrt{2} x) ( 1 - z + \sqrt{2} x) ( 1 + z + \sqrt{2} y ) ( 1 + z - \sqrt{2} y )$.

The variables $x, y , z$ are coordinates in 3-dimensional space, and $\lambda$ and $\mu$ are two parameters, related by the equation $\lambda ( 3 - \mu^2) = 3 \mu^2 - 1$. As we change the value of the parameter, the equation changes, and its zero set changes too.

What does the Kummer Surface look like as the parameter $\mu$ changes?

When the parameter $\mu^2 = 3$, the non-linearity of the Kummer surface disappears: the surface degenerates to a union of four planes.
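One way to see why: multiplying the equation through by $3 - \mu^2$ and using the relation between $\lambda$ and $\mu$ turns it into $(3 - \mu^2)(x^2 + y^2 + z^2 - \mu^2)^2 - (3\mu^2 - 1) \times (\text{product of the four linear factors}) = 0$. At $\mu^2 = 3$ the first term vanishes and only the product of linear factors is left. The Python sketch below checks this numerically, taking the four linear factors as they appear in the Mathematica code at the end of this post; the function name and the test points are made up for the illustration.

```python
import math

SQ2 = math.sqrt(2)

def cleared_kummer(x, y, z, musq):
    """The Kummer equation multiplied through by (3 - musq), where musq stands for mu^2."""
    quadric = x * x + y * y + z * z - musq
    planes = (
        (1 - z - SQ2 * x) * (1 - z + SQ2 * x) * (1 + z + SQ2 * y) * (1 + z - SQ2 * y)
    )
    return (3 - musq) * quadric ** 2 - (3 * musq - 1) * planes

# A point chosen to lie on the plane 1 - z - sqrt(2) x = 0.
x, y = 0.3, -1.7
z = 1 - SQ2 * x

print(cleared_kummer(x, y, z, 3.0))        # approximately 0: the point is on the surface
print(cleared_kummer(0.0, 0.0, 0.0, 3.0))  # nonzero: the origin is on none of the planes
```

Any point on one of the four planes gives zero at $\mu^2 = 3$ regardless of the quadratic term, which is what "a union of four planes" means here.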

When the parameter is close to 3, we’re between planes and Kummer surfaces:

And for $\mu^2 = 1.5$, we see the 16 singular points surrounding five almost-tetrahedra, in the center. A zoomed in version is in my other blog post that featured Kummer Surfaces.

Ok, I can see “planes” and “Kummer surface”, but what about “trains”? Well, I guess you could say that when a parameter is changing, often something is being trained. Though, er, not here.

This equation is not for a Kummer surface, but it’s not so dissimilar either. It came up recently in one of my research projects:

${\left( x^2 + y^2 + z^2 - 2( x y + x z + y z ) \right)}^2 - 2(x + y - z )( x - y + z ) ( - x + y + z )$

P.S. The code (language=Mathematica) that I used to make the video is here:

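(* sq2 is used below but was not defined in the snippet as posted *)
sq2 = Sqrt[2];
(* Animate the surface as musq (= mu^2) moves from just above 3 down to 1 *)
anim = Animate[
  ContourPlot3D[
    {(x^2 + y^2 + z^2 - musq)^2 -
      ((3*musq - 1)/(3 - musq))*(1 - z - sq2*x)*(1 - z + sq2*x)*
      (1 + z + sq2*y)*(1 + z - sq2*y) == 0},
    {x, -5, 5}, {y, -5, 5}, {z, -5, 5},
    PerformanceGoal -> "Quality", BoxRatios -> 1, PlotRange -> 1],
  {musq, 3.001, 1, 0.0002}];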

## Mapping class groups and curves in surfaces

Firstly, thanks to Rachael for inviting me to write this post after meeting me at the ECSTATIC conference at Imperial College London, and to her and Anna for creating such a great blog!

My research is all about surfaces. One of the simplest examples of a surface is a sphere. We are all familiar with this – think of a globe or a beach ball. Really we should think of this beach ball as having no thickness at all, in other words it is 2-dimensional. We are allowed to stretch and squeeze it so that it doesn’t look round, but we can’t make every surface in this way. The next distinct surface we come to is the torus. Instead of a beach ball, this is like an inflatable ring (see this post by Rachael). We say that the genus of the torus is 1 because it has one “hole” in it. If we have $g$ of these holes then the surface has genus $g$. The sphere doesn’t have any holes so has genus 0. We can also alter a surface by cutting out a disc. This creates an edge called a boundary component. If we were to try to pass the edge on the surface, we would fall off. Here are a few examples of surfaces.

As with the sphere, topology allows us to deform these surfaces in certain ways without them being considered to be different. The classification of surfaces tells us that if two surfaces have the same genus and the same number of boundary components then they are topologically the same, or homeomorphic.

Now that we have a surface, we can start to think about its properties. A recurring theme across mathematics is the idea of symmetries. In topology, the symmetries we have are called self-homeomorphisms. Strictly speaking, all of the self-homeomorphisms we will consider will be orientation-preserving.

Let’s think about some symmetries of the genus 3 surface.

Here is a rotation which has order 2, that is, if we apply it twice, we get back to what we started with.

Here is another order 2 rotation.

And here is a rotation of order 3. Remember that we are allowed to deform the surface so that it looks a bit different to the pictures above but still has genus 3.

However, not all symmetries of a surface have finite order. Let’s look at a Dehn twist. The picture (for the genus 2 surface) shows the three stages – first we cut along a loop in the surface, then we rotate the part of the surface on just one side of this loop by one full turn, then we stick it back together.

A Dehn twist has infinite order, that is, if we keep on applying it again and again, we never get back to what we started with.

If we compose two homeomorphisms (that is, apply one after the other) then we get another homeomorphism. The self-homeomorphisms also satisfy some other properties which mean that they form a group under composition. However, this group is very big and quite nasty to study, so we usually consider two homeomorphisms to be the same if they are isotopic. This is quite a natural relationship between two homeomorphisms and roughly means that there is a nice continuous way of deforming one into the other. Now we have the set of all isotopy classes of orientation-preserving self-homeomorphisms of the surface, which we call mapping classes. These still form a group under composition – the mapping class group. This group is much nicer. It still (usually) has infinitely many elements, but now we can find a finite list of elements which form a generating set for the group. This means that every element of the group can be made by composing elements from this list. Groups with finite generating sets are often easier to study than groups which don’t have one.

An example of a mapping class group appears in Rachael’s post below. The braid group on $n$ strands is the mapping class group of the disc with $n$ punctures (where all homeomorphisms fix the boundary pointwise). Punctures are places where a point is removed from the surface. In some ways punctures are similar to boundary components, where an open disc is removed, but a mapping class can exchange punctures with other punctures.

So how can we study what a mapping class does? Rachael described in her post how we can study the braid group by looking at arcs on the punctured disc. Similarly, in the pictures above of examples of self-homeomorphisms the effect of the homeomorphism is indicated by a few coloured curves. More precisely, these are simple closed curves, which means they are loops which join up without any self-intersections. Suppose we are given a mapping class for a surface but not told which one it is. If we are told that it takes a certain curve to a certain other curve then we can start to narrow it down. If we get information about other curves we can narrow it down even more until eventually we know exactly what the mapping class is.

Now I can tell you a little about what I mainly think about in my research: the curve graph. In topology, a graph consists of a set of points – the vertices – with some pairs of vertices joined by edges.

Each vertex in the curve graph represents an isotopy class of curves. As in the case of homeomorphisms, isotopy is a natural relationship between two curves, which more or less corresponds to pushing and pulling a curve into another curve without cutting it open. For example, the two green curves in the picture are isotopic, as are the two blue curves, but green and blue are not isotopic to each other.

Also, we don’t quite want to use every isotopy class of curves. Curves that can be squashed down to a point (inessential) or into a boundary component (peripheral) don’t tell us very much, so we will ignore them. Here are a few examples of inessential and peripheral curves.

We now have infinitely many vertices, one for every isotopy class of essential, non-peripheral curves, and it is time to add edges. We put an edge between two vertices if they have representative curves which do not intersect. So if two curves from these isotopy classes cross each other we can pull one off the other by an isotopy. Here’s an example of some edges in the curve graph of the genus 2 surface. In the picture, all of the curves are intersecting minimally, so if they intersect here they cannot be isotoped to be disjoint.

I should emphasise that this is only a small subgraph of the curve graph of the genus 2 surface. Not only does the curve graph have infinitely many vertices, but it is also locally infinite – at each vertex, there are infinitely many edges going out! This isn’t too hard to see – if we take any vertex, this represents some curve (up to isotopy). If we cut along this curve we get either one or two smaller surfaces. These contain infinitely many isotopy classes of curves, none of which intersects the original curve.

So why is this graph useful? Well, as we noted above, we can record the effect of a mapping class by what it does to curves. Importantly, the property of whether two curves are disjoint is preserved by a mapping class. So not only does a mapping class take vertices of the curve graph (curves) to vertices, but it preserves whether or not two vertices are connected by an edge. Thus a mapping class gives us a map from the curve graph back to itself, where the vertices may be moved around but, if we ignore the labels, the graph is left looking the same. We say that the mapping class group has an isometric action on the curve graph, so to every element of the group we associate an isometry of the graph, which is a map which preserves distances between elements. The distance between two points in the graph is just the smallest number of edges we need to pass along to get from one to the other. When we have an isometric action of a group on a space, this is really useful for studying the geometry of the group, but that would be another story.
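The notions of distance and isometry used here are easy to experiment with on a finite graph. The Python sketch below uses a hypothetical six-vertex cycle as a stand-in (the real curve graph is infinite, so it cannot be stored like this), computes distances by breadth-first search, and tests whether a relabelling of the vertices preserves all distances.

```python
from collections import deque

def graph_distance(adjacency, start, goal):
    """Smallest number of edges on a path from start to goal (None if there is no path)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if v == goal:
            return dist[v]
        for w in adjacency[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return None

def is_isometry(adjacency, f):
    """True if the vertex map f preserves the distance between every pair of vertices."""
    vertices = list(adjacency)
    return all(
        graph_distance(adjacency, u, v) == graph_distance(adjacency, f[u], f[v])
        for u in vertices
        for v in vertices
    )

# Hypothetical finite stand-in: a cycle on six vertices.
adjacency = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

rotation = {i: (i + 1) % 6 for i in range(6)}         # moves every vertex one step round
swap = {0: 1, 1: 0, 2: 2, 3: 3, 4: 4, 5: 5}           # exchanges two vertices only

print(graph_distance(adjacency, 0, 3))
print(is_isometry(adjacency, rotation), is_isometry(adjacency, swap))
```

The rotation moves the vertices around but leaves the graph looking the same, just as a mapping class does to the curve graph. The swap does not: vertices 0 and 5 are neighbours, but their images 1 and 5 are two edges apart.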

## Defining topology through interviews. Interview seven with Jeremy Mann.

The final interview (*cry*) in the Defining topology through interviews series is with Jeremy Mann, who is a PhD student in mathematics at the University of Notre Dame, studying geometry and topology.

1. What would your own personal description of “topology” be?

Topology studies features we call “qualitative”: ones that don’t change if the system is gently* disturbed. In some sense, we created topology in order to give precise answers to qualitative questions. In my day to day life, I reason qualitatively. I rarely wonder “Will the temperature outside be greater than 23 degrees?” I ask: “Is it warm outside?” I would call the first question quantitative, and my second one topological. In other words, topology is created to give precise answers to the types of questions we, as humans, are naturally interested in.

* What one means by “gently” depends enormously on the context, and one has a lot of freedom in choosing what that means. For these reasons, despite being wonderfully vivid, topology is at times unavoidably abstract.

2. What do you say when trying to explain your work to non-mathematicians?

I fudge the details and I lie. If a careful mathematician were listening, they might interject with a few “well, actually—”s. But the details can obscure content, and people enjoy fiction, so I try not to lose sleep over it. That being said, I might tell a story like this:
By the age of three, I could pick two peaches out of a bag without knowing the first thing about the symbol “2.” A number was something like a bunch of stuff contained within a box. A number could bounce around and bruise. I could hold it in my hands.
If I had a sack of plums and a sack of peaches, I could add them together by pouring them both into a bigger sack.
But these terms didn’t help me add the grains of sand in a bucket, or the stars spread before my eyes. So I dropped this way of adding, in favor of an algebra with lots of symbols like “2,” and “376,” and eventually “x.”
Since then, I’ve made another shift. These days, my conceptualization of arithmetic is a lot closer to a child’s. This approach has many names, but my favorite is Factorization Algebras. I see a number as a collection of objects contained within a region of space.
But now, my numbers can interact. Symbols are no longer rich enough to capture their structure. Sometimes my numbers feel like exotic creatures. They can circle each other suspiciously.

Symbols see this as “2=2=2=2,” but this picture shows us there’s a lot more going on.

“1 + 1 = 2”.
Two numbers can be enemies. When I add them together, they remove each other from existence. “1 + (-1) = 0”. Sometimes, I play this in reverse, watching two enemies spontaneously born from empty tranquility.
I guess I’m interested in more than just writing down the final answer. I want to see their costumes. I want to know how they come together. I want to feel the content in their choreography. My work helps me do this.

3. How does your work relate, if at all, to the Nobel prize work?

The Nobel Prize was awarded for insights into the behavior of certain forms of matter at very low energy, where their behavior becomes “topological.” Strictly speaking, the structures I consider are not “topological”: despite “being a topologist,” my work does in fact know the difference between a coffee cup and a donut. It’s much higher-energy.**
Many physicists are interested in a material’s low energy behavior because these conditions contain a huge amount of information about a material’s possible phases. This even includes more “exotic” phases of matter, some with potential applications to quantum computers.
I’d like to point out the following: often, “exotic” means “outside of one’s comfort zone.” So, when physicists say “exotic topological phases of matter,” I suspect they are expressing how the low energy behaviors of certain materials are outside the comfort zone of many members of the physics (and mathematics) community. This “exotic” behavior defies common intuition. However, when a material behaves in this manner, to a topologist, it enters very familiar territory. The topological is not exotic to a topologist.

** hotter, but certainly not sexier.