ANNA’s notebook -> AMS notices

Here at “Picture this Maths”, we were very lucky last month to be featured by the American Mathematical Society (AMS) on their Blog on Math Blogs! It is wonderful to have people reading and sharing our blog in the mathematical community and beyond.

This blog post also tells an exciting AMS story. It is on the topic of tensors (like this post, and this one, and this one too). It’s about a mathematical picture which started out as a cartoon in my notebook.


Version 1: ANNA’s notebook

It ended up — much souped up, with help from computer graphics experts — on the front cover of the June/July issue of the Notices of the AMS. The full issue is available here.

Version 2: AMS Notices

So, what’s going on in this picture?

The story begins, as with many stories (ok, many of my stories)  with singular vectors and singular values of matrices. To understand mathematical concepts, it’s useful to have a picture in mind. Luckily, singular vectors and singular values of matrices lend themselves extremely well to visual description. Just take a look at this wikipedia gif.

A matrix can be thought of in complementary ways, either as a two-dimensional grid of data, or as the information that encodes a linear transformation of a space. The gif is about matrices as linear maps. Below are a couple of still images from it. They show how a linear transformation of space


can be decomposed as the combination of three “building block” transformations, each of which is far easier to understand. A rotation V^*, a coordinate scaling \Sigma and then another rotation U^*


What about visualizing the singular vectors and singular values of tensors?

Here, the story is more complicated, not least because the greater number of dimensions makes visualizing things harder. Usually, matrices have a finite number of singular vectors, and the same is true of tensors. But, like for the matrix case, some tensors have infinitely many singular vectors, and the singular vectors themselves form an interesting structure.

The picture shows the structure of the singular vectors of a four-dimensional orthogonally decomposable tensor of size 2 \times 2 \times 3 \times 3. For more on the ‘maths behind the picture’, see this About The Cover article from the AMS.


Hall of mirrors: Coxeter Groups and the Davis Complex

I’ve spent a lot of time this summer thinking about the best way to present maths. When someone gives a talk or lecture they normally write on the board in chalk or present via Beamer slides. Occasionally though, someone comes along with some great hand drawn slides that make listening to a talk that wee bit more exciting. So the images in this blog are part of my tribute to this new idea.

I’ve talked about Coxeter groups before (here), but I’ll start afresh for this post. It is worth mentioning now that Coxeter groups arise across maths, in areas such as combinatorics, geometric group theory, Lie theory etc. as well as topology.

A Coxeter group is a group generated by reflections, and “braid type” relations. Informally, you can imagine yourself standing in the middle of a room with lots of mirrors around you, angled differently. Your reflection in a single mirror can be viewed as a generator of the group, and any other reflection through a series of mirrors can be viewed as a word in the group. Here is a silly picture to show this:

silly picture

Formally, a Coxeter group is defined by a Coxeter matrix on the generating set S. This is an S by S matrix, with one entry for each ordered pair in the generating set. This entry has to be 1 on the diagonal of the matrix or a whole number that is either bigger than 2 or infinity (\infty ) off the diagonal. It also has to be symmetric, having the same entry for (t,s) as (s,t) . See below:

coxeter groups1

Given this matrix you can then define the Coxeter group to be the group generated by S with relations given by the corresponding entry in the matrix.

coxeter groups2

In particular notice that the diagonal of the matrix being 1 gives that each generator squares to the identity, i.e. it is an involution. It can be hard to see what is going on in the matrix so there is a nicer way to display this information: a Coxeter diagram. This is a graph with a vertex for every generator, and edges which tell you the relations, as described below:

coxeter groups3

The relation (st)^{m_{st}} can also be rewritten as ststs...=tstst... where there are m_{st} elements on each side. This is reminiscent of the braid relations, hence why I called them “braid like” before. In the mirror analogy, this says the mirrors are angled towards each other in such a way that you get the same reflection by bouncing between them m times, independent of which of the two mirrors you turn towards to begin with.

There exist both finite and infinite Coxeter groups. Here is an example of a finite Coxeter group, with two generators s and t. If you view them as reflections on a hexagon (as drawn) then doing first s and then t gives a rotation of 120 degrees, and so doing st 3 times gives the identity, as required.

coxeter example

On the other hand, if you add another generator u with a braid-3 relation with both s and t , then the group is infinite. You can imagine tiling the infinite plane with triangles. If you take s , t and u to be reflections in the 3 sides of one of these triangles then they satisfy the relations they need to, and you can use these three reflections to transport the central triangle to any other one. If you think about this for a while, this shows the group is infinite. A (somewhat truncated) picture is shown below.

coxeter example 1

Examples of Coxeter groups don’t just live in 2-D Euclidean space. There is another finite group which acts as reflections on the permutahedron:


And other Coxeter groups which act as reflections on the hyperbolic plane.

The mathematical object I am working with at the moment is called the Davis Complex. You can build it out of finite subgroups of a Coxeter group (side note for the mathematicians: taking cosets of finite subgroups and forming a poset which you can then realise). Even infinite Coxeter groups have lots of finite subgroups. The great thing about the Davis complex being built out of finite things is that there is a classification of finite Coxeter groups! What this means is that when you have a finite Coxeter group its diagram either looks like one of the diagrams below, or a disjoint collection of them.

classification of finite coxeter

So because we only have a few diagrams to look at in the finite case, we can prove some things! Right now I am working on some formulas for staring at the Coxeter diagrams and working out the homology of the group. I’m using the Davis complex and all its nice properties to do this. I’ll leave you with a picture of the Davis complex for our first example.

davis example 2

davis example

Deleting edges to save cows – guest post by Dr Kitty Meeks


I met Rachael at the LMS Women in Mathematics Day in Edinburgh in April, where I gave a talk about using graph theory to understand the spread of disease in livestock, and she invited me to write a blog on this topic.  Everything I’m going to say here is based on joint work with Dr Jess Enright from the University of Stirling (supported by Scottish Government as part of EPIC: Scotland’s Centre of Expertise on Animal Disease Outbreaks), who knows a lot more about cattle than I do!

It seems obvious that, when we want to understand how a disease spreads through a population, we should consider which members of the population have contact with each other – this simple observation has given rise to the field of Network Epidemiology.  If we’re trying to understand the spread of disease in humans, one of the key challenges is to guess the underlying contact network – who has contact with whom – and a lot of interesting work has been done on this.  Looking at the spread of disease in livestock, however, we don’t have to worry about this: all movements of farm animals have to be reported to the relevant government agencies, so we (well, people I collaborate with, who provide advice to the Scottish government on controlling animal disease outbreaks) know exactly which animals have contact with each other.

Depending on the specific disease we are interested in, there are several different kinds of contact we might need to consider.   If animals need to be in close contact to transmit the disease, then there are two main ways the disease could spread from one farm to another: it could spread between farms that are geographically adjacent (if cows on opposite sides of the fence decide to have a chat) or by trade when an animal is sold from an infected farm to another farm that is not yet infected.  Some other diseases are spread by biting insects, in which case any farm within a certain radius of an infected farm would be at risk.

Mathematically, we can represent these potentially infectious contacts by means of a graph: a vertex is used to represent each farm, and two farms are connected by an edge if there is the potential for the disease to spread from one to the other (in the case of infection due to trade, this would normally be one-way, so we could capture more information by using a directed edge in the direction of the trade).  My first picture illustrates what part of this graph might look like if we are just considering the spread of disease across shared fence lines.  The graph we get in this situation will normally be planar, that is it can be drawn in the plane without any edges crossing (although there are some circumstances where this is not necessarily true, for example if one farmer owns several fields that are not connected to each other).


Now we can add to this graph the contacts that come from trade.  It’s not so obvious what we would expect the graph of trade links to look like, but this is exactly what Jess and I have been trying to understand.


First, though, I should explain why we want to know about the structural properties of these contact graphs.  We’re interested in restricting the spread of disease, and Jess and her colleagues are asked by policy makers to investigate how it might be possible, in theory, to modify the network to make it less vulnerable to an epidemic.  When we talk about modifying the network, we’re considering actions that correspond roughly to deleting edges and/or vertices in the graph: to “delete” a vertex, we might vaccinate all animals at the particular farm (so that it can no longer play any role in the transmission of disease), whereas “deleting” an edge might correspond to putting up a double fence-line between two farms (keeping animals on adjacent farms just a little bit further apart) or introducing extra monitoring on a specific trade route.  What do we want to achieve by modifying the network?  Well, there are lots of different parameters we could look at which relate to how vulnerable a particular graph is to an epidemic, but the simplest one we could think of to start with is the maximum component size (the largest number of vertices that can be reached from any single starting vertex in the graph by traveling along edges).  In the picture below, if the farm circled in red is infected, then all of the others are potentially at risk.


However, if we delete one edge from this network, we can be sure that the disease will not spread to the three farms on the right hand side (now highlighted in green).


Of course, all of these possible ways to modify the network are costly, so we want to do this in the most efficient way possible, i.e. make the fewest modifications required to achieve our goal.  This motivates the following question: given a graph, what is the minimum number of edges (or vertices) we need to delete so that the maximum components size of the resulting graph is at most h?  This is an example of a graph modification problem, a kind of problem which comes up a lot on theoretical computer science.  It turns out that, like many graph modification problems, this particular problem is NP-complete: unless most mathematicians and computer scientists are wrong, and P=NP, there is no efficient algorithm to answer this question for all possible inputs.  But we don’t have to give up there!  Just because a problem is intractable on arbitrary inputs doesn’t mean we can’t find a fast way to solve it on input graphs with specific structural properties – most of my research is based on exactly this idea – so in order to solve the problem efficiently on the real data we want to understand the structure of the livestock contact networks.

Jess and I haven’t yet considered the general problem as I described it above: instead, we’ve been looking only at the trade network, and only at edge deletion.  This special case is relevant in the management of a particular kind of persistent cattle trade link that exists in Scotland, so it seemed as good a place to start as any.  Thinking about this problem, we did what any graph theorist does when faced with a computationally hard problem: we tried to solve it on trees (graphs that don’t have any cycles, so there’s exactly one way to get between any two vertices).  And, after dealing with a few slightly awkward details, it worked in exactly the way any graph theorist would expect: we can solve the problem recursively, starting with the leaves (vertices with only one neighbour) and working our way up the tree – the fact there are no cycles means we never have to rethink the answers we’ve already calculated.


But cattle trade networks aren’t trees, so is this any use in the real world?  Somewhat surprisingly, the answer seems to be yes!  Although the trade networks aren’t exactly trees, many of them seem to have small treewidth – this is a hard-to-define graph parameter that essentially captures how “tree-like” a particular graph is.  And the great thing about graphs of small treewidth is that algorithms that work on trees can usually be adapted (taking just a bit more time) to work on graphs of small treewidth in exactly the same way.  The next picture shows how the treewidth of the Scottish cattle trade network grows as we add in more trades, based on real data from 2009.  The trades that took place that year are added to the graph one day at a time, and the treewidth (in fact, for technical reasons, an upper bound on the treewidth) of the resulting graph is shown.


We can see from this graph that, even if we look at trades taking place in the whole year (much longer than the timescale we would need to consider for many disease outbreaks) the treewidth of the trade network is still less than 20, for a graph that has nearly 3000 vertices and 8000 edges – this is much lower than we would expect for a random graph with the same number of vertices and edges.

Initial experiments suggest that our algorithm that exploits the treewidth performs significantly better on some real datasets than a generic approach based on constraint programming (which doesn’t take into account any structural properties of the data): there were examples where our algorithm found the optimal solution in under a minute, whereas the alternative couldn’t solve the problem in 6 hours.

We’ve really only scratched the surface of this problem, though.  The planar graphs that arise from geographical proximity definitely don’t have small treewidth, so we will need a different approach here – but they certainly have other distinctive structural properties that we hope to exploit.  What interests me most to investigate next, though, is why many cattle trade networks have surprisingly low treewidth: farmers aren’t looking at the horrible definition of this parameter when they decide where to sell their animals, so there’s something else going on here – and if we could find a good way to model this “something else” mathematically, would it let us devise even better algorithms?

Tensor approximation of a colour photo

This post is about using low-rank tensor approximations to roughly capture the information in a colour photograph. As far as I know, it is not used in practice as part of a method for compressing colour photos. However, it gives a nice illustration of what’s happening when you find a low-rank tensor approximation, such as is done for compressing tensor data of other kinds.

First, we’ll talk about the simpler case of a greyscale (black and white) photograph. Such a photo has one number for each pixel which describes the brightness of that pixel. The numbers range from 0=black to 1=white to indicate how bright that part of the photo should be.

A standard size for a photo is 1000 x 667 pixels. This means the height of the photo is 1000 pixels and the width is 667 pixels. The information of the photo is a grid of numbers (a matrix) of size 1000 x 667, where each number in the matrix is the brightness of the corresponding pixel.

For a colour photograph, each pixel is described by three numbers, ranging from 0 to 1, that describe the redness, blueness and greenness of that pixel. The photo is described by a 3D array (a tensor) of size 1000 x 667 x 3. The height and width dimensions of the image have stayed the same, but we now have an extra “RGB” coordinate that tells us if we are referring to red intensity, blue intensity or green intensity.

A low-rank approximation of an array of data compresses the information it contains. For the matrix (greyscale) case, the best low-rank approximation is found using the Singular Value Decomposition (SVD). People learn this method in an undergraduate linear algebra class. For biological matrix data, using the SVD to view the most variance in the data (Principal Component Analysis) is widely used.

If we want a low-rank approximation of a colour photo (a tensor), we can no longer in general use the SVD (although sometimes we can). We use tensor methods instead. The numerics, and theory behind these method is much less well-understood and less widely taught. People tend to focus on the 2D case. This might explain why some people find linear algebra to be a little grey.

Using the software package Tensorlab, I took an original color photograph and found its best rank 10 tensor approximation.


The tensor approximation has captured quite a few features of the original image. Rather than the 1000 x 667 x 3 = 2,001,000 numbers needed to store the original image, the new one requires only (1000 + 667 + 3 – 2) x 10 = 16,680 numbers. Of course, storage concerns are not likely to be a problem for a single photo but for large data this data compression becomes important. If the original ‘photo’ had been a billion by a billion pixels, we’d get a compression factor of one hundred million.

Turning the tensor into three matrices. For comparison, I will talk about the option of considering the color photo (tensor) as three matrices: a red matrix, a blue matrix and a green matrix, each of size 1000 x 667. We divide up the cat photo into three pieces, by extracting the parts of each colour:


We can find a low rank approximation of each slice, one by one, using the SVD. If we wanted to store the image in 16,000 numbers overall, as we did in the tensor case, we take a rank 3 approximation in each slice: this requires (1000 + 667 – 1) x 3 x 3 numbers. To get back the original picture, we stack the three slices in their original order. A comparison of the two methods is here:


We see that the tensor method has recovered more features of the original image for the same amount of data storage! This post resulted from a fun conversation with Daniel Lowengrub yesterday, and the cat photo is from Piotr Achinger.

Here is the code I used:

[code language="matlab"]
% best rank r tensor approximation
r = 10;
A = imread('fig2.jpg');
A = single(A); m = max(max(max(A))); A = A./m;
U = cpd(A,r);
Tensor_approx = cpdgen(U);

% best rank 3 * (r/3) matrix approximation
k = floor(r/3);
for i = 1:3;
    CC{i} = A(:,:,i);
    [U,S,V] = svd(CC{i});
    Ck{i} = U(:,1:k)*S(1:k,1:k)*V(:,1:k)';
sz = size(Ck{1});
D = zeros(sz(1),sz(2),3);
for i = 1:3;
    D(:,:,i) = Ck{i};
Matrix_approx = D;

Introducing homology

A lot of things have happened since my last post, and I’ve been waiting for a great way to follow Anna’s fantastic series of SIAGA posts.

On February 16th Professor Robert Ghrist from the University of Pennsylvania gave the annual Potter lecture at the University of Aberdeen. The Potter lecture is to be aimed at a general audience, and his title was “Putting Topology to Work”. He discussed applications of topology to various areas of engineering and science and his talk included a great introduction to a topological invariant called homology.

Topologists work with homology a LOT. It has appeared in the title of my mathematics dissertation at undergrad, essay at masters level and I am pretty sure my PhD thesis title (touch wood) will also contain it. However I have never been good at explaining what homology is in layman’s terms (despite many attempts), so Professor Ghrist’s lecture was particularly inspirational.

A couple of weeks ago I gave a short talk at a London Mathematical Society Women in Mathematics day and tried to give a better description of homology than I have done before. There are some pictures involved so I thought I would recreate that section of my talk here. I’ll screenshot some of my slides and also add text and some extra sketches.

top spaces to sequence of groups

Homology is a process where we start with a topological space X and associate to it a sequence of abelian groups called homology groups, and denoted H_*(X) where * is a natural number (0, 1, 2, 3, …). Some examples of topological spaces are spheres, surfaces and manifolds (which are higher dimensional analogues of surfaces i.e. I can’t draw them).

top spaces




So what do these groups tell us? H_0(X) tells us about the connected components of our space. If the space is one point, the rank of H_0(X) will be 1, if it is a circle the rank of H_0(X) will still be 1 but if it is two disjoint points or circles the rank of H_0(X) will be 2 and so on.




H_1(X) tells us in some sense about ‘holes which look like a circle’. So it will let us know that a circle has one ‘hole that looks like a circle’, a figure of 8 has two, etcetera.

Similarly H_2(X) tells us about ‘holes which look like a 2- sphere’, in the sense that you can blow up a beach ball and what you get is a 2- sphere, so H_2(X) will tell you there is a hole in your beach ball which ‘looks like a 2-sphere hole’. You can also blow up a rubber ring or inner tube and in the same sense, H_2(X) will tell us these torus surfaces have ‘holes which look like 2-spheres’ or ‘holes which look like beach ball holes’. We can’t really visualise what the homology tells us after H_2(X), since it tells us about holes in higher dimensions than 2.


beachball rubber ring


So why do we do this? We might want to know something about a topological space, but maybe we can’t simply draw the space as it lives in a very high dimension. But the homology of a space is a sequence of groups which tells us about holes of all dimensions: and we know lots about groups! We can try to work out what the homology groups of a space are, we can do things such as study maps between these groups, and there is generally a lot more structure in the sequence of groups for us to take advantage of. So by looking at homology we can learn things about a space that we cannot draw or visualise.

Homology is also functorial in the sense that if we have two spaces X and Y and a map between them (the downwards black arrow in the diagram below), we can look at the homology of X and the homology of Y (horizontal wiggly arrows) and the map between X and Y will induce  a map between the homologies (the dotted green arrow). So because we know a lot about maps between groups this can tell us something about the possible maps between X and Y.


In my talk I was focusing on the homology of a group rather than that of a space, so how do we do that? Well we start off with a group and we associate to it something called a classifying space (see my previous post for an example). Calculating the homology of this space is then the same as calculating the homology of the group.

group 2

I also used homology with different coefficients, such as \mathbb{Z}_2 instead of the usual integer coefficients \mathbb{Z}. This allows us to manipulate what sort of abelian groups we get when taking homology, for instance using \mathbb{Q} coefficients will give us a \mathbb{Q}-vector space. Sometimes we do this to make our problem easier to solve, or sometimes the problem itself prescribes that we use different coefficients.


So now I have told you about homology, next time I will follow up with a post on the hot topic of homological stability!

And to reward you for reading to the end, here is a great comic drawn by my friend Tom!

SIAGA: Visualization of Algebraic Varieties

Seven pictures from Applied Algebra and Geometry: Picture #7

The Society for Industrial and Applied Mathematics, SIAM, has recently released a journal of Applied Algebra and Geometry called SIAGA. See here for more information on the new journal, which starts taking online submissions on March 23rd – in six days time.

The poster for the journal features seven pictures. In this final blog post I will talk about the seventh picture, on the subject of ‘Visualization of Algebraic Varieties’. This concluding blog post is short, and the picture is especially nice.

Thank you for reading this series of seven ‘SIAGA’ blog posts, and for following Rachael and my blog “Picture this Maths”. Check back here soon for future mathematical posts centered around pictures!


The Context

There is a vast mathematical toolbox of techniques that can be used to understand algebraic varieties. We have encountered some such tools in this series of blog posts, for example polyhedral geometry. It’s great when we are actually able to draw the algebraic variety in question, using visualization software. When possible, this facilitates the most direct of observations to be made.

Although it poses an obvious restriction on the number of dimensions we can work in, even visualizing particular slices through our variety of interest is structurally revealing. Large polynomials with many terms can be very hard to get a handle on, and it makes sense to use modern-day computer tools to convert these equations into helpful pictures.

The Picture

This picture shows a Kummer Surface. It was made by Oliver Labs using the visualization software Surfex. Many beautiful pictures have been created in this way: for more, see the picture galleries from the Imaginary: Open Mathematics website.

It is an example of an irreducible surface in three-dimensional space of degree four. In general, these have at most 16 singular points. Kummer surfaces are those that attain this upper bound. The 16 singular points represent the 2-torsion points on the Jacobian of the underlying genus 2 curve.

This picture also represents the problem-solving areas of coding theory and cryptography, in which there can be found a broad range of applied algebra and geometry. The group law on an elliptic curve is fundamental for cryptography. Similarly, the group law on the Jacobian of hyperelliptic curves has been used for cryptographic purposes, see “Fast Cryptography in Genus 2” by Bos, Costello, Hisil and Lauter (2013), and “Applied Cryptography and Network Security” by Bao, Samarati and Zhou (2014). One of the authors of the first article is Kristin Lauter from Microsoft Research who is president of the Association for Women in Mathematics (AWM).


SIAGA: Tensors

Seven pictures from Applied Algebra and Geometry: Picture #6

The Society for Industrial and Applied Mathematics, SIAM, has recently released a journal of Applied Algebra and Geometry called SIAGA. See here for more information on the new journal. They will start taking online submissions on March 23rd.

The poster for the journal features seven pictures. In this penultimate blog post I will talk about the sixth picture, on the subject of Tensors. In the first section of this post, “The Context”, I’ll set the mathematical scene. In the second section, “The Picture”, I’ll talk about this particular image.


The Context

Tensors are the higher-dimensional analogues of matrices. They are data arrays with three or more dimensions, and are represented by an array of size n_1 \times \cdots \times n_d, where n_k is the number of ‘rows’ in the kth direction of the array. The entries of the tensor A are denoted by A_{i_1 \ldots i_d} where i_k \in \{ 1, \ldots, n_k \} tells you which row in the kth direction you are looking at. Just as for a matrix, the entries of a tensor are elements in some field, for example real or complex numbers.

Tensors occur naturally when it makes sense to organize data by more than two indices. For example, if we have a function that depends on three or more discretized inputs f(x,y,z) where x \in \{ x_1, \ldots, x_{n_1} \}, y \in \{ y_1, \ldots, y_{n_2} \} and z \in \{ z_1, \ldots, z_{n_3} \}, then we can organize the values A_{ijk} = f(x_i,y_j,z_k) into a tensor of size n_1 \times n_2 \times n_3. Tensors are increasingly widely used in many applications, especially signal processing, where the uniqueness of a tensor’s decomposition allows the different signals comprising a mixture to be found. They have also been used in machine learning, genomics, geometric complexity theory and statistics.

Our data analysis techniques are currently limited to a matrix-centric perspective. To overcome this, there has been tremendous effort to extend the well-understood properties of matrices to the higher-dimensional world of tensors. A greater understanding of tensors paves the way for very exciting new developments that can cater to the natural structure of tensor-based data, for example in experimental design or confounding factor analysis. This analysis and understanding uses interesting and complicated geometry.

One requirement for computability of a tensor is to have a good low rank approximation. Tensors of size n_1 \times \cdots \times n_d have n_1 \ldots n_d entries and, for applications, this quickly becomes unreasonably large. Matrices are analyzable via their singular value decomposition, and the best low rank approximation is obtainable directly from this by truncating at the rth largest singular value. We can extend many of the useful notions from linear algebra to tensors: we have eigenvectors and singular vectors of tensors, and a higher order version of the singular value decomposition.


The Picture

As well as being a picture of the well-known Rubik’s cube, this picture describes a cartoon of a tensor of size 3 \times 3 \times 3. Such a tensor consists of 27 values.

To understand the structure contained in a tensor, we use its natural symmetry group to find a presentation of it that is simple and structurally transparent. This motivation also underlies the Rubik’s puzzle although the symmetries can be quite different: a change of basis transformation for the tensor case, and a permutation of pieces in the case of the puzzle.

Despite being small, a 3 \times 3 \times 3 tensor has interesting geometry. It is known that a generic tensor of size 3 \times 3 \times 3 has seven eigenvectors in \mathbb{P}^2. In the paper “Eigenconfigurations of Tensors” by Abo, Seigal and Sturmfels, we show that any configuration of seven eigenvectors can arise, provided no six of the seven points lie on a conic.