## Understanding the brain using topology: the Blue Brain project

ALERT ALERT! Applied topology has taken the world by storm once more. This time techniques from algebraic topology are being applied to model networks of neurons in the brain, in particular how the brain processes information when exposed to a stimulus. Ran Levi, one of the ‘co-senior authors’ of the recent paper published in Frontiers in Computational Neuroscience, is based in Aberdeen and he was kind enough to let me show off their pictures in this post. The paper can be found here.

So what are they studying?

When a brain is exposed to a stimulus, neurons fire seemingly at random. We can detect this firing and create a ‘movie’ to study. The firing rate increases towards peak activity, after which it rapidly decreases. In the case of chemical synapses, synaptic communication flows from one neuron to another and you can view this information by drawing a picture with neurons as dots and possible flows between neurons as lines, as shown below. In this image more recent flows show up as brighter.

Numerous studies have been conducted to better understand the pattern of this build-up and rapid decrease in neuron spikes, and this study contains significant new findings as to how neural networks are built up and decay throughout the process, both at a local and a global scale. This new approach could provide substantial insights into how the brain processes and transfers information. The brain is one of the main mysteries of medical science, so this is huge! For me the most exciting part is that the researchers build their theory through the lens of algebraic topology, and I will try to explain the main players in their game here.

Topological players: cliques and cavities

The study used a digitally constructed model of a rat's brain, which reproduced neuron activity from experiments in which the rats were exposed to stimuli. From this model ‘movies’ of neural activity could be extracted and analysed. The researchers then compared their findings to real data and found that the same phenomenon occurred.

Neural networks have previously been studied using graphs, in which the neurons are represented by vertices and possible synaptic connections between neurons by edges. This throws away quite a lot of information, since during chemical synapses the synaptic communication flows, over a minuscule time period, from one neuron to another. The study takes this into account and uses directed graphs, in which each edge has a direction emulating the synaptic flow. This is the structural graph of the network that they study. They also study functional graphs, which are subgraphs of the structural graph. These contain only the connections that fire within a certain ‘time bin’. You can think of these as the synaptic connections that occur in a ‘scene’ of the whole ‘movie’. There is one graph for each scene, and this research studies how these graphs change throughout the movie.
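As a small illustration of the ‘scenes’ idea, here is a Python sketch (with made-up spike data and an arbitrary bin width, not the ones used in the paper) of how directed firing events could be grouped into one functional graph per time bin:

```python
# Sketch: group directed synaptic firing events into one functional
# graph per time bin. Each event is (pre_neuron, post_neuron, time_ms);
# the neuron names and bin width are hypothetical.
from collections import defaultdict

def functional_graphs(events, bin_width):
    """Return {bin_index: set of directed edges active in that bin}."""
    scenes = defaultdict(set)
    for pre, post, t in events:
        scenes[int(t // bin_width)].add((pre, post))
    return dict(scenes)

events = [("a", "b", 1.0), ("b", "c", 2.5), ("a", "b", 7.2), ("c", "a", 8.9)]
scenes = functional_graphs(events, bin_width=5.0)
# scene 0 contains the edges a->b and b->c; scene 1 contains a->b and c->a
```

Each value of `scenes` is one functional subgraph of the structural graph, and the movie is the sequence of these subgraphs.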

The main structural objects discovered and consequently studied in these movies are subgraphs called directed cliques. These are subgraphs in which every vertex is connected to every other vertex, and the edge directions are compatible: there is a source neuron from which all edges are directed away, and a sink neuron towards which all edges are directed. In this sense the flow of information has a natural direction. Directed cliques consisting of n neurons are called simplices of dimension (n-1). Certain sub-simplices of a directed clique form their own directed cliques, when the vertices of the sub-simplex contain their own source and sink neurons; these are called sub-cliques. Below are some examples of the directed clique simplices.
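To make the definition concrete, here is a minimal Python sketch (my own illustration, not code from the paper) that checks whether a small directed graph is a directed clique, i.e. whether its edges are consistent with a single ordering of the vertices running from a source to a sink:

```python
from itertools import permutations

def is_directed_clique(vertices, edges):
    """True if every pair of vertices is joined by a directed edge and
    the directions admit a single source-to-sink ordering. Assumes at
    most one edge between each pair of vertices."""
    vs = list(vertices)
    # look for an ordering v0 < v1 < ... with every edge pointing forward
    for order in permutations(vs):
        rank = {v: i for i, v in enumerate(order)}
        if all((u, v) in edges for u in vs for v in vs if rank[u] < rank[v]):
            return True
    return False

# a 3-clique (a 2-simplex): source "a", sink "c"
assert is_directed_clique({"a", "b", "c"}, {("a", "b"), ("a", "c"), ("b", "c")})
# a directed 3-cycle is NOT a directed clique: it has no source or sink
assert not is_directed_clique({"a", "b", "c"}, {("a", "b"), ("b", "c"), ("c", "a")})
```

The brute-force search over orderings is only sensible for tiny examples, but it makes the "source and sink" condition explicit.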

And the images below show these simplices occurring naturally in the neural network.

The researchers found that over time, simplices of higher and higher dimension were born in abundance, as synaptic communication increased and information flowed between neurons. Then suddenly all the cliques vanished: the brain had finished processing the new information. This relates the neural activity to an underlying structure which we can now study in more detail. It is a very local structure: simplices of up to dimension 7 were detected, that is, cliques of 8 neurons in a microcircuit containing tens of thousands. It was the sheer abundance of this local structure that made it significant, where in this setting local means concerning a small number of vertices in the structural graph.

As well as considering this local structure, the researchers also identified a global structure in the form of cavities. Cavities are formed when cliques share neurons, but not enough neurons to form a larger clique. An example of this sharing is shown below, though please note that this is not yet an example of a cavity. When many cliques together bound a hollow space, this forms a cavity. Cavities represent homology classes, and you can read my post on introducing homology here. An example of a 2 dimensional cavity is also shown below.

The graph below shows the formation of cavities over time. The x-axis corresponds to the first Betti number, which gives an indication of the number of 1 dimensional cavities, and the y-axis similarly gives an indication of the number of 3 dimensional cavities, via the third Betti number. The spiral is drawn out over time as indicated by the text specifying milliseconds on the curve. We see that at the beginning there is an increase in the first Betti number, before an increase in the third alongside a decrease in the first, and finally a sharp decrease to no cavities at all. Considering the neural movie, we view this as an initial appearance of many 1 dimensional simplices, creating 1 dimensional cavities. Over time, the number of 2 and 3 dimensional simplices increases, by filling in extra connections between 1 dimensional simplices, so the lower dimensional cavities are replaced with higher dimensional ones. When the number of higher dimensional cavities is maximal, the whole thing collapses. The brain has finished processing the information!

The time-dependent formation of the cliques and cavities in this model was interpreted in an attempt to measure both local information flow, influenced by the cliques, and global flow across the whole network, influenced by the cavities.

So why is topology important?

These topological players provide a strong mathematical framework for measuring the activity of a neural network, and the process a brain undergoes when exposed to stimuli. The framework works without parameters (for example there is no measurement of distance between neurons in the model) and one can study the local structure by considering cliques, or how they bind together to form a global structure with cavities. By continuing to study the topological properties of these emerging and disappearing structures alongside neuroscientists we could come closer to understanding our own brains! I will leave you with a beautiful artistic impression of what is happening.

There is a great video of Kathryn Hess (EPFL) speaking about the project, watch it here.

For those of you who want to read more, check out the following blog and news articles (I’m sure there will be more to come and I will try to update the list).

Frontiers blog

Wired article

Newsweek article

## SIAGA: Topology of Data

### Seven pictures from Applied Algebra and Geometry: Picture #4

The Society for Industrial and Applied Mathematics, SIAM, has recently released a journal of Applied Algebra and Geometry called SIAGA. The poster for the journal features seven pictures. In this blog post I will talk about the fourth picture. See here for my blog posts on pictures one, two and three. And see here for more information on the new journal.

In the first section of this post “The Context”, I’ll set the mathematical scene and in the second section “The Picture” I’ll talk about this particular image, representing Topology of Data.

# The Context

Topology offers a set of tools that can be used to understand the shape of data. The techniques detect intrinsic geometric structures that are robust to many common sources of error including noise and arbitrary choice of metric. For an introduction, see the book “Elementary Applied Topology” by Robert Ghrist (2014), or the article “Topology and Data” by Gunnar Carlsson (2009).

Say we have noisy data points coming from some unknown space $X$ which we believe possesses an interesting shape. We are interested in using the data to capture the topological invariants of the unknown space. These are its holes of different dimensions, unchanged by continuous squeezing and stretching.

The holes of different dimensions are the homology groups of the space $X$. They are denoted by $H_k(X)$ for $k$ some non-negative integer. The zeroth homology group tells us the number of zero-dimensional holes or, more intuitively, the connectedness of the space. For a space $X$ with $n$ connected components, it is
$H_0(X) = \mathbb{Z}^n$,
the free abelian group with $n$ generators. One-dimensional holes are counted by $H_1(X)$. For example, a circle $X = S^1$ has a single one-dimensional hole, so $H_1(S^1) = \mathbb{Z}$.
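As a toy illustration of $H_0$ counting connected components, here is a short Python sketch (my own, using a union-find) that computes the rank of $H_0$ for a small graph:

```python
def betti_0(vertices, edges):
    """Rank of H_0: the number of connected components, via union-find."""
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v
    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in vertices})

# two components (a triangle and an isolated edge), so H_0 = Z^2
assert betti_0({1, 2, 3, 4, 5}, {(1, 2), (2, 3), (1, 3), (4, 5)}) == 2
```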

The connectedness properties of sampled data tell us a lot about the underlying space from which they are sampled. In some situations, such as for structural biological information, it is indispensable to know the structure of the holes too. These features are unchanged no matter which metric we use, or which space we embed the points into. The higher homology groups $H_k(X)$ for $k \geq 2$ similarly give us such summarizing features.

But there’s a problem: sampling $N$ points from a space gives us a collection of zero-dimensional pieces, which – unless two points land in exactly the same place – are all unconnected. Let us call this data space $D_0$. The space $D_0$ has homology groups
$H_k(D_0) = \begin{cases} \mathbb{Z}^N & k = 0 \\ 0 & \text{otherwise.} \end{cases}$

It is usually the case that many points are very close together, and ought to be considered to come from the same connected component. To measure this we use persistent homology. We take balls of increasing size centered at the original data points, and measure the homology groups of the space consisting of the union of these balls. We call this space $D_\epsilon$, where $\epsilon$ is the radius of the balls. The important structural features are those that persist for large ranges of values of $\epsilon$. For some great illustrations of persistent homology, see this post Rachael wrote for our blog in July 2015.
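Here is a sketch of the simplest case of this idea: tracking only connected components ($H_0$) as $\epsilon$ grows. This amounts to single-linkage merging, written in plain Python; real persistent homology software computes higher-dimensional bars too.

```python
from math import dist  # Euclidean distance, Python 3.8+

def h0_barcode(points):
    """Birth/death intervals for connected components as the ball radius
    epsilon grows. Two balls of radius eps meet exactly when their
    centres are within 2*eps, so we process pairs by half-distance.
    Every component is born at eps = 0; one bar is infinite."""
    edges = sorted((dist(p, q) / 2, i, j)
                   for i, p in enumerate(points)
                   for j, q in enumerate(points) if i < j)
    parent = list(range(len(points)))
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    bars = []
    for eps, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            bars.append((0.0, eps))   # a component dies at this radius
    bars.append((0.0, float("inf")))  # the final component never dies
    return bars

bars = h0_barcode([(0, 0), (1, 0), (10, 0)])
# the close pair merges at eps = 0.5; the far point joins at eps = 4.5
```

The long-lived bars (here the infinite one) are the "important structural features" that persist over large ranges of $\epsilon$.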

# The Picture

This picture shows data points sampled from a torus, which we imagine to live in three-dimensional space. It was made by Dmitriy Morozov, who works at the Lawrence Berkeley National Lab. He applies topological methods in cosmology, climate modeling and materials science.

The sampled points in the picture lie on the torus, and furthermore in a more specialized slinky-shaped zone of the torus. This is an important feature of the shape which topological methods will capture.

The original data consists of 5000 points, and our persistent homology approach involves taking three-dimensional balls $B_\epsilon(d_i)$ of radius $\epsilon$ centered at each data point $d_i$. When the radius $\epsilon$ is extremely small, none of the balls will be connected, and the shape of our data is indistinguishable from any other collection of 5000 points in space.
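For illustration, here is one way such data could be generated in Python; the torus radii and the number of windings of the ‘slinky’ are my own guesses, not the values behind the actual picture:

```python
# Sample 5000 points on a curve winding many times around a torus
# (the "slinky"). R is the big radius, r the tube radius; both are
# illustrative choices.
from math import cos, sin, pi
import random

R, r, windings, N = 3.0, 1.0, 20, 5000
points = []
for _ in range(N):
    t = random.uniform(0, 2 * pi)   # position around the big circle
    s = windings * t                # the tube direction winds faster
    points.append(((R + r * cos(s)) * cos(t),
                   (R + r * cos(s)) * sin(t),
                   r * sin(s)))
```

Every sampled point satisfies the torus equation $(\sqrt{x^2+y^2}-R)^2 + z^2 = r^2$, while the winding confines them to the slinky-shaped zone.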

Before long, the radius will exceed half the distance from each point to its nearest neighbours. The 5000 balls join together to form a curled-up circular piece of string. Topological invariants do not notice the curling, so topologically the shape obtained is a thickened circle with a one-dimensional hole: $H_1(D_{\epsilon}) = \mathbb{Z}$. When the radius is large enough for the adjacent curls of the slinky to meet, but not to reach the opposite side of each curl, we get a hollow torus with $H_1(D_{\epsilon}) = \mathbb{Z}^2$ and $H_2(D_\epsilon) = \mathbb{Z}$. Finally, the opposite sides of each curl of the slinky will meet, and they will meet the slinky-curls on the opposite side of the torus. Our shape then becomes a solid three-dimensional shape with no holes, and $H_1(D_{\epsilon}) = 0$.

In this example, the data points can be visualized and we are able to confirm that our intuition for the important structure of the shape agrees with the homological computations. For higher-dimensional examples it is these persistent features that will guide our understanding of the shape of the data.

## Moduli Spaces

What do you think about when you see a circle?

Depending on the context, a mathematician might think of a circle in many ways. One option is to think of a circle as a collection of infinitely many points, rather than as a line. It is the collection of points that lie a fixed distance away from a central point.

This perspective is important because we can think of each point on the circle as representing something. For example, in the picture below the blue point represents the red arrow (or ray) that starts at the centre of the circle and passes through that blue point:

We can do this for all points and arrows: each point on the circle corresponds to the arrow starting at the centre of the circle and passing through that point, and each arrow corresponds to the point where it meets the circle.

This gives a bijection between points on the circle and arrows from the centre. But our correspondence between points and arrows is better than a bijection: it seems quite natural. The reason for this is that if two arrows are very close together then the corresponding points on the circle are also close together, and vice versa – our bijection has captured information about the topology of the arrows.

In fact, we can find how similar two arrows are by finding the distance between their associated points on the circle.

Why is this useful? It’s much easier to think about a circle than it is to think about a whole collection of arrows. We can see from the fact that a circle is 1-dimensional that all such arrows can be described using one parameter – the angle of the arrow is enough information to define which arrow we’re talking about. We can stratify the space of arrows by subdividing the circle; for example, in the picture below the green region corresponds to arrows that point “upwards”:
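A tiny Python sketch of this parametrization (my own illustration): an arrow is just an angle, the distance between two arrows is the shortest arc between their points on the circle, and the “upwards” stratum is an interval of angles:

```python
from math import pi

def arrow_distance(a, b):
    """Distance between two arrows = length of the shortest arc between
    their points on the unit circle (angles in radians)."""
    d = abs(a - b) % (2 * pi)
    return min(d, 2 * pi - d)

def points_upward(a):
    """The stratum of 'upwards' arrows: angle strictly between 0 and pi,
    measured anticlockwise from the positive x-axis."""
    return 0 < a % (2 * pi) < pi

# two arrows just either side of the positive x-axis are close:
assert arrow_distance(0.1, 2 * pi - 0.1) < 0.21
assert points_upward(pi / 2) and not points_upward(3 * pi / 2)
```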

We say that the circle is a moduli space for our arrows – each point on the circle represents an arrow in the right kind of “natural” way.

What if, instead of arrows starting at a given point, we were interested in lines passing through a point? What shape could we use to parametrize these – i.e. what could our new moduli space be? We have a problem because each line passes through two points on the circle instead of one so we no longer have a bijection:

Can we still use a circle as the moduli space? (Answer: yes!)
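One way to see why the answer is yes: a line through the centre meets the circle in two opposite points, so we can describe a line by an angle considered modulo $\pi$ instead of modulo $2\pi$. A small Python sketch (my own illustration):

```python
from math import pi

def line_distance(a, b):
    """Distance between two lines through the centre: the angles a and
    a + pi describe the same line, so work modulo pi."""
    d = abs(a - b) % pi
    return min(d, pi - d)

# an angle and its opposite describe the same line:
assert line_distance(0.3, 0.3 + pi) < 1e-9
# two nearly-horizontal lines on either side of the x-axis are close:
assert line_distance(0.05, pi - 0.05) < 0.11
```

Gluing opposite points of a circle gives back a circle, which is why a circle still works as the moduli space of lines.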

In the above examples we are looking for ways to classify arrows and lines. It is an important problem in maths to classify more complicated objects, for example algebraic curves, or fly wings!

Ezra Miller is interested in understanding the moduli space of fly wings – a space where each point corresponds to a different kind of fly wing, where we say two fly wings are different if the veins make different shapes.

Clearly a much more complicated space than a circle will be needed to describe the different kinds of vein shapes of fly wings. In fact, there are lots of different spaces that we could use. Once we have selected a shape, there are interesting and useful questions we can ask:

1. Distance: in our previous example, we had a clear notion of how far apart two arrows were – we could measure the distance travelled to get from one point to another by travelling along the circle. It’s harder to say how far apart two fly wings are, and we want a measure of distance that agrees with our biological intuition.
2. Stratifying the space: before, we could stratify our space of arrows by selecting some region of the circle (for example, the upper half). What about for the fly wing – are there areas of our moduli space that correspond to biologically significant subsets of flies?

There are also many areas of pure maths where people study moduli spaces, which I will talk about in a future blog post. They are a (fairly) simple but important intuitive concept that plays a role in both pure and applied maths.

## Persistent homology applied to evolution and Twitter

In this post I’ll let you know about an application and a variation of persistent homology I learnt about at the Young Topologists Meeting 2015. You might want to read my post on persistent homology first!

In his talks Gunnar Carlsson (Stanford) gave lots of examples of applications of persistent homology. A really interesting one for me was applying persistent homology to evolution trees. Remember that homology tells us about the shape of the data, and in particular if there are any holes or loops in it. We tend to think of evolution as a tree:

but in reality the reason why all our models for evolution are trees is that we take the data and try to fit it to the best tree we can. We don’t even think that it might have a different shape!

In reality, as well as vertical evolution, where one species becomes another, or two other, distinct species over time, we have something called horizontal or reticulate evolution. This is where two species combine to form a hybrid species. In their paper Topology of viral evolution, Chan, Carlsson and Rabadan show how the homology (think of this as something describing the shape of the data, specifically the holes or loops that appear) of trees may be different if we take into account species merging together:

They go on to show how persistent homology can detect such loops caused by horizontal evolution, in the example of viral evolution. This is a brand new approach and really exciting as we now have a way of finding out how many loops are in a given evolutionary dataset, and which data points they correspond to. This can tell us about horizontal evolution, as well as vertical!
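A toy version of the loop-counting idea (my own illustration, not the paper's persistent homology method): for an evolutionary network viewed as an undirected graph, the number of independent loops is the first Betti number, $b_1 = \#\text{edges} - \#\text{vertices} + \#\text{components}$. A tree has $b_1 = 0$, and each hybridisation event adds a loop. The species names below are invented:

```python
def betti_1(vertices, edges):
    """Number of independent loops: b1 = |E| - |V| + #components."""
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    components = len(vertices)
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return len(edges) - len(vertices) + components

tree = {("root", "A"), ("root", "B"), ("B", "C"), ("B", "D")}
# a pure tree has no loops...
assert betti_1({"root", "A", "B", "C", "D"}, tree) == 0
# ...but C and D hybridising into a new species H creates one
assert betti_1({"root", "A", "B", "C", "D", "H"}, tree | {("C", "H"), ("D", "H")}) == 1
```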

Up next is work from Balchin and Pillin (University of Leicester) on a variation of persistent homology inspired by directed graphs. The images in this section are from the slides of Scott Balchin’s talk at the Young Topologists Meeting! The motivation for their variation is: what if you don’t simply have data points, but some other information as well? Take this example of people following people on Twitter: draw an arrow from person A to person B if person A follows person B.

We see that Andy follows Cara but Cara does not reciprocate! If you just had Andy and Cara connected by an edge then this information would be lost. Balchin and Pillin looked at a way of encoding this extra information into the complex, taking into account the number of arrows you would need to move along to get from Andy to Cara (1) and also from Cara to Andy (2, via Bill). I will post a link to their paper here as soon as it is released. When the data is considered without this extra information, persistent homology gives a (crazy) barcode that looks like this:

but when you include the directions you get a slightly less mysterious barcode:

which is in a lot of ways more accurate and easy to interpret.
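The asymmetric distances in the Andy/Bill/Cara example can be computed with a directed breadth-first search; this sketch is my own illustration of that step, not Balchin and Pillin’s construction itself:

```python
from collections import deque

# The follower graph from the example: an arrow from A to B means
# A follows B. Andy follows Cara, Cara follows Bill, Bill follows Andy.
follows = {"Andy": ["Cara"], "Cara": ["Bill"], "Bill": ["Andy"]}

def directed_distance(graph, start, goal):
    """Fewest arrows needed to travel from start to goal."""
    queue, seen = deque([(start, 0)]), {start}
    while queue:
        node, d = queue.popleft()
        if node == goal:
            return d
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, d + 1))
    return None  # unreachable

assert directed_distance(follows, "Andy", "Cara") == 1
assert directed_distance(follows, "Cara", "Andy") == 2  # via Bill
```

The asymmetry (1 one way, 2 the other) is exactly the information that a plain undirected edge would throw away.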

Balchin gave another example of a system where direction mattered: non-transitive dice. If you have a few 6 sided dice, you can represent each one by a circle with 6 numbers in it: the numbers on the sides of the dice! Then put an arrow from dice A to dice B if dice A beats dice B on average.

‘Non-transitive’ means that sometimes there are loops: dice A beats dice B, which beats dice C, but then dice C beats dice A! You can actually buy non-transitive dice and play with them in real life. As you can probably tell, the arrows in this picture are important, and so we want to make sure we don’t lose the directions when considering the homology!
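Here is a quick Python check of such a loop, using a classic non-transitive set of dice (my own choice of example, not necessarily the dice from the talk):

```python
from itertools import product

def beats(a, b):
    """True if die a beats die b on average, i.e. wins more than half
    of all equally likely face pairings."""
    wins = sum(x > y for x, y in product(a, b))
    return wins > len(a) * len(b) / 2

# each die here beats the next with probability 20/36 = 5/9
A = [2, 2, 4, 4, 9, 9]
B = [1, 1, 6, 6, 8, 8]
C = [3, 3, 5, 5, 7, 7]

assert beats(A, B) and beats(B, C) and beats(C, A)  # the loop!
```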

There are a few more applications of persistent homology I would like to share with you and hopefully I will get the chance some other week!

## Persistent Homology

A sample of the pictures we will look at this week:

Next week I’m going to the Young Topologists Meeting 2015, at EPFL in Lausanne, Switzerland. Over 180 young topologists are going and many of them will give short talks on their research. Alongside this, there are two invited speakers who will give mini lecture series:

• Gunnar Carlsson of Stanford University, lecturing about Methods of applied topology
• Emily Riehl of Harvard University, lecturing about Infinity category theory from scratch

I’ll try to write something about these courses, and this post will be a wee introduction to a tool introduced by Gunnar Carlsson which considers topology of data clouds: persistent homology. The pictures in this post were drawn by Paul Horrocks during our joint dissertation at undergrad: points to Paul!

The idea of persistent homology is to use a tool of topology – homology – to understand something about the structure or shape of a set of data points. But topology is to do with spaces, for example manifolds or surfaces. Therefore we want to make a space out of our data before we can work out the homology.

We do this by plotting our set of points, and around each point we draw a ball. This ball has a radius and we can vary the size of this radius:

Once we have drawn these balls, we join two of the points by a line if their corresponding balls intersect, and colour in triangles formed by three lines if the balls corresponding to the three points of the triangle have a patch where they all intersect. For different radii we get different structures.

Here only two of the balls intersected, so there is only one line.
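The construction described above can be sketched in Python. For the edges this matches the pictures exactly (two balls of radius $r$ meet exactly when their centres are within $2r$); for the triangles I use the simpler Vietoris-Rips rule, where all three pairs of balls must meet, which is a standard approximation of the ‘patch where they all intersect’ condition:

```python
from math import dist  # Euclidean distance, Python 3.8+

def rips_complex(points, radius):
    """Edges and triangles built from balls of the given radius.
    Edge rule: balls meet iff centres are within 2*radius.
    Triangle rule (Vietoris-Rips): all three pairs of balls meet."""
    n = len(points)
    edges = [(i, j) for i in range(n) for j in range(i + 1, n)
             if dist(points[i], points[j]) <= 2 * radius]
    es = set(edges)
    triangles = [(i, j, k) for i, j in edges for k in range(j + 1, n)
                 if (i, k) in es and (j, k) in es]
    return edges, triangles

pts = [(0, 0), (2, 0), (1, 1.5)]
assert rips_complex(pts, 0.9) == ([], [])  # no balls touch yet
edges, tris = rips_complex(pts, 1.1)       # now all three balls meet
```

Increasing the radius from 0.9 to 1.1 makes all three edges appear and colours in the triangle, exactly the kind of change in structure that persistent homology tracks.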