Last updated 11 April 2025
Welcome to Mathematical Biology III.
Throughout the year, we will be developing a series of models for populations of biological agents. At first, we will have a population that simply interacts with itself, reproducing and dying (such is life!). We’ll quickly make the population compete with itself, and then we’ll develop systems of two different populations: a predator and a prey.
In the second half of Michaelmas, we will let the populations interact meaningfully in space. This will lead to pursuit solutions: predators chasing prey across space... and time!
A key question throughout our journey will be investigating the long-term behaviour of these models. Given a set of starting conditions, where will we end up? To this end we will spend a considerable amount of time looking at the equilibria of these models. Then we will assess the stability of these equilibria. Are the predator and prey locked in a delicate balance? Will small changes cause massive upsets? Or can we perturb the system without too much worry? To answer these questions, we will repeatedly perform a linear stability analysis on our increasingly complex models. To help us in this task, we will bring in a series of more advanced mathematical tools.
This branch of mathematical modelling has never been more relevant. The Covid-19 pandemic has brought the largest societal change I have ever experienced. You may feel the same. At the beginning of 2020, governments the world over looked – and are still looking – to scientists and mathematicians to understand how this invisible threat spreads. Science was asked to inform policy, and to do so quickly. How quickly will coronavirus infect the population? Will it burn out or will it continue until everyone is infected? Is there an equilibrium of this system? Much talk was made of \(R_0\): the basic reproduction number of the virus. The general public were reminded of the power of exponential growth.
The focus of this course is not on epidemiology, but instead on the development of population models, and more generally quantitative models capturing the dynamics of biological systems. We will spend most of our time learning from instructive classical models in population ecology. But disease modelling works in a very similar way: the population is divided into smaller populations, typically those susceptible to, infected with, and recovered from the disease. These three smaller populations then all interact with each other, and we can again ask the same questions of stability and long-term behaviour.
Therefore, as part of the problems classes, we will see some simple disease models to understand how the population modelling techniques we are developing were used (and are being used) to predict the course of the coronavirus spread and inform national and global policy.
I think you will enjoy this course.
These notes are designed to be sufficient for the course, but sources and references will be given at the end of every chapter. The main reference for the course is Murray, J. D., 2002, Mathematical Biology, 3rd edn, volumes I and II.
Michaelmas term
A fundamental aspect of a living or biological system is its growth. In this course, we will consider time-dependent growth of a species (say greenflies!) whose population size or density will be represented by a function \(x(t)\). Note that we will model populations as continuous, rather than discrete numbers. For \(x(t)\) to be a sensible model we, naturally, need \(x(t)\in[0,\infty)\).
If we assume the food supply of this species is unlimited, it seems reasonable that the rate of growth of this population would be proportional to the current population size, as there are plenty of potential pairings, i.e., \[\begin{equation} \label{exp-growth-model} \mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} = a x \implies x(t) = A\mathrm{e}^{a t}. \end{equation}\] Here, \(a>0\) is the growth rate (births per greenfly per unit time) and \(A = x(0)\) is the initial population size. It tells us that the population grows without bound over time – see 1.1.
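If you would like to see this growth for yourself, here is a minimal Python sketch which plots the exponential solution; the values of \(a\) and \(A\) are purely illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

a = 0.5   # growth rate (illustrative value)
A = 10.0  # initial population, x(0)

t = np.linspace(0, 10, 200)
x = A * np.exp(a * t)  # the solution x(t) = A e^{at}

plt.plot(t, x)
plt.xlabel("t")
plt.ylabel("x(t)")
plt.title("Exponential (Malthusian) growth")
plt.show()
```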
This is a pretty simple model of population growth, but it became influential in 1798, when it was presented by the cleric and economist Thomas Robert Malthus in his book An Essay on the Principle of Population. In it, Malthus warned that while human population growth was exponential, food production growth at the time was only arithmetic, and that this would lead to famine in the future. Its publication led to the first British census in 1801, which has been repeated every ten years since.
But populations (of humans, or greenflies) don’t actually grow like this long-term. Hans Rosling’s 2018 book Factfulness warns us of the assumption that exponential growth never slows. So what could change?
One improvement would be to demand that we include a notion of self-competition within the population. This could, for example, model competition for food or territory. Mathematically, we need a decay term which is small for small \(x\), allowing the population to grow, but dominates the growth term when \(x\) gets larger, thus restricting its growth.
The simplest example of such a model is the logistic equation. Originally introduced by Pierre François Verhulst in 1838, the equation is nonlinear and takes the form \[\begin{equation} \label{logistic} \mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} = a x\left(1-\frac{x}{K}\right), \end{equation}\] where \(a>0\) is the usual growth term and, as we shall see, \(K\) is the limiting population or carrying capacity.
Can we guess what this model looks like? See that there is an equilibrium (\(\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} = 0\)) at \(x=K\). The term \((1-x/K)\) is negative if \(x>K\) and positive if \(x<K\), so we might expect the population to decay towards \(K\) from above and grow towards \(K\) from below.
This sort of analysis turns out to be quite common for population models, because it is not always possible to solve them analytically. This time, thankfully, we can.
Let’s now work out the solutions of our model. We can separate [logistic]: \[\int\frac{1}{x(1 - x/K)}\,\mathrm{d}x= at + C.\] We can evaluate the integral on the left using partial fractions (remember those!), \[\frac{1}{x(1 - x/K)} = \frac{1}{x} + \frac{1}{K\left(1-x/K\right)},\] and so \[\log(x) - \log(1-x/K) =a t+C \implies \frac{x}{1-x/K} = A\mathrm{e}^{at},\] so that finally, \[\begin{equation} \label{logsol} x(t) = \frac{A \mathrm{e}^{a t}}{1+ \frac{A}{K}\mathrm{e}^{at}}. \end{equation}\]
1.2 demonstrates the behaviour of this equation.
What about that limiting population we promised? See that as \(t \to \infty\), \(x(t) \to K\): a fact independent of the initial condition.
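A quick way to convince yourself of this is to plot [logsol] for several values of \(A\). Here is a minimal Python sketch with illustrative values of \(a\) and \(K\); note from [logsol] that \(x(0) = A/(1+A/K)\), so a value \(A < -K\) corresponds to starting above the carrying capacity.

```python
import numpy as np
import matplotlib.pyplot as plt

a, K = 1.0, 100.0  # illustrative growth rate and carrying capacity
t = np.linspace(0, 10, 400)

for A in [5.0, 50.0, -300.0]:  # A = -300 gives x(0) = 150, above K
    x = A * np.exp(a * t) / (1 + (A / K) * np.exp(a * t))
    plt.plot(t, x, label=f"A = {A:g}")

plt.axhline(K, linestyle="--", color="k", label="K")
plt.xlabel("t")
plt.ylabel("x(t)")
plt.legend()
plt.show()
```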
Rather than worry about how the population changes, we might only really care where it will end up, given sufficient time. We have already spotted that the system tends to a state where \(\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}=0\) at \(x=K\). Is this the only possibility?
No! If we set \(\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}=0\) then we have \[a x\left(1-\frac{x}{K}\right) = 0 \implies x=0, K.\] So there is also an unchanging state of \(x=0\) where there is no population. This makes sense, of course: no population means no reproduction. At this stage we make our first definition:
An equilibrium of a system is one in which all time derivatives are zero. For example, consider the system \[\mathchoice{\frac{{\mathrm d}^2 u}{{\mathrm d}t^2}}{{\mathrm d}^2 u/{\mathrm d}t^2}{{\mathrm d}^2 u/{\mathrm d}t^2}{{\mathrm d}^2 u/{\mathrm d}t^2} + \mathchoice{\frac{{\mathrm d}u}{{\mathrm d}t}}{{\mathrm d}u/{\mathrm d}t}{{\mathrm d}u/{\mathrm d}t}{{\mathrm d}u/{\mathrm d}t} = u^2+v^2,\qquad \mathchoice{\frac{{\mathrm d}^{3} v}{{\mathrm d}t^{3}}}{{\mathrm d}^{3} v/{\mathrm d}t^{3}}{{\mathrm d}^{3} v/{\mathrm d}t^{3}}{{\mathrm d}^{3} v/{\mathrm d}t^{3}} = uv.\] The equations of equilibrium are \[0 = u^2+v^2,\qquad 0 = uv\] (the only solution to which is \(u=v=0\)). An equilibrium is often (in a dynamical systems context) referred to as a steady state or fixed point.
We earlier demanded that the population is \(\geq 0\). We thus define a permissible or feasible equilibrium as one which satisfies this criterion. It will be important throughout the course that our models have permissible equilibria in order to be valid. Another idea we will come to discuss is that a good model should not allow a positive initial population to become negative.
We note that for any positive \(A\), [logsol] will tend to \(K\). We say that the \(x=K\) equilibrium is stable because any small change from \(x=K\) (say \(x= K-\varepsilon\)) will tend back to \(x=K\) if we go forward in time (convince yourself of this by looking again at 1.2). However, if we are at \(x=0\) and there is a sudden small change to \(x=\varepsilon\), perhaps representing a small population migration, if we go forward in time it will grow inexorably towards \(K\). Thus we say that the \(x=0\) equilibrium is unstable.
But did we need to solve the logistic equation, [logistic], to find this? Actually no! Because our differential equation is of the form \(\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} = f(x)\), we can use a common technique where we simply plot \(\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}\) against \(x\) (known as plotting the phase space) and make some observations.
Look at 1.3. Our two equilibria are marked. If you start at a value of \(x\) where \(\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}\) is positive, we know that \(x\) increases in time, so you move to the right over time (indicated by the forward-pointing red arrow). And where \(\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}\) is negative, you move to the left (the backward-pointing red arrow). This instinctively tells us that you will, at \(t=\infty\), always end up at \(x=K\) unless you start exactly at \(x=0\). Do you agree?
What can you say about the stability of the equilibria of \[\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} = x(x-1)(x-2)(x-3) \, ?\]
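If you want to check your answer, here is a rough Python sketch of the same phase-space technique: it plots \(\mathrm{d}x/\mathrm{d}t\) against \(x\) and inspects the sign of the right-hand side either side of each equilibrium (the offset 0.1 is an arbitrary small choice).

```python
import numpy as np
import matplotlib.pyplot as plt

f = lambda x: x * (x - 1) * (x - 2) * (x - 3)
equilibria = [0, 1, 2, 3]

# The sign of dx/dt just either side of each equilibrium gives the flow direction.
for x0 in equilibria:
    left, right = f(x0 - 0.1), f(x0 + 0.1)
    kind = "stable" if (left > 0 and right < 0) else "unstable"
    print(f"x = {x0} is {kind}")

xs = np.linspace(-0.5, 3.5, 400)
plt.plot(xs, f(xs))
plt.axhline(0, color="k", lw=0.5)
plt.plot(equilibria, [0] * 4, "ro")
plt.xlabel("x")
plt.ylabel("dx/dt")
plt.show()
```

Notice the alternating pattern of stability, which we will remark on shortly.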
We will make a more mathematically precise notion of stability/instability of equilibria in 2 and 2.2.
Let’s use this graphical phase space technique on another model. In the 1930s, American ecologist Warder Clyde Allee performed some experiments on goldfish swimming in polluted water, and saw the fish had a greater survival rate when there were more fish in the tank. The implication was that individuals within a species require the assistance of others for more than just reproductive reasons. You can see this in animals which hunt in packs, or defend against predators as a group.
A simple variation on logistic growth which exhibits this is \[\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} = -ax\left(1-\frac{x}{K}\right)\left(1-\frac{x}{A}\right)\] where \(0 < A < K\). The phase space is shown in 1.4. This model, the Allee model, is a really nice example of multistability: there are three equilibria, at \(x=0, A, K\), and we can see from the graph that \(x=0,K\) are stable, whereas \(A\) is unstable. This model predicts that if the population drops below \(A\), the species will become extinct. This is also a warning of how models can be sensitive to initial conditions. Slight fluctuations around \(A\) can really change long-term behaviour dramatically.
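To see the threshold behaviour in action, here is a minimal SciPy sketch which integrates the Allee model from initial conditions just below and just above \(A\); all parameter values are illustrative.

```python
import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

a, A, K = 1.0, 20.0, 100.0  # illustrative values with 0 < A < K
rhs = lambda t, x: -a * x * (1 - x / K) * (1 - x / A)

t_eval = np.linspace(0, 15, 300)
for x0 in [19.0, 21.0]:  # just below and just above the threshold A
    sol = solve_ivp(rhs, (0, 15), [x0], t_eval=t_eval)
    plt.plot(sol.t, sol.y[0], label=f"x(0) = {x0:g}")

plt.axhline(A, linestyle=":", color="gray", label="A")
plt.axhline(K, linestyle="--", color="k", label="K")
plt.xlabel("t")
plt.ylabel("x(t)")
plt.legend()
plt.show()
```

Despite starting only two individuals apart, one population dies out while the other tends to \(K\).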
What is the role of \(a\) here? Actually very little: it scales the phase space diagram vertically, and therefore controls the timescale of movement towards/away from equilibria, but it has no effect on the qualitative properties (e.g. stability) of them.
Many one-dimensional population models are of the form \(\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} = xf(x)\). These have the nice property that \(x=0\) is automatically an equilibrium. Models of this class are sometimes called Kolmogorov models.
There is another property of one-dimensional models on display in 1.4: the stability of equilibria alternates as you increase \(x\). This is a consequence of the continuity of \(\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}\), or in other words, if you cross the line \(\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}=0\) in one direction and want to cross it again, you have to cross it in the opposite direction.
One key observation related to this: oscillations are impossible in 1D models. By this, we mean that we can’t have \(x\) increasing, then decreasing, then increasing, then decreasing, as we go forward in time. For any value of \(x\) in 1.4, the population is either growing or decaying towards an equilibrium, and its derivative must go through zero (meaning \(x\) tending to an equilibrium) for this to change.
But many populations in nature do oscillate in size. How might that come about? Time to bring in another species...
Instead of introducing self-competition, a second possibility to avoid unbounded growth is to model a second population, \(y(t)\), which represents a second species. In the case of our greenfly population, we have our antagonists in ladybirds (1.5). Since ladybirds prey on greenflies, the greenfly population, \(x\), will decrease proportionally to \[\text{[the number of ladybirds, $y$]} \times \text{[the number of greenflies, $x$],}\] i.e., the number of interactions of the two species which may lead to a sad little greenfly funeral. This law will therefore be in the form \[\begin{aligned} \mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} = a x - bxy, \end{aligned}\] with \(b\) the rate at which fatal interactions occur.
But we must then also model the changing ladybird population, \(y(t)\). We assume in the absence of greenflies it will decrease as its food supply has vanished: \[\mathchoice{\frac{{\mathrm d}y}{{\mathrm d}t}}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t} = -cy.\] (Ask yourself: Why is this proportional to \(y\)?) However, it will also grow proportionally to the availability of food, i.e., interactions of the two species (at some rate \(d\)), so \[\mathchoice{\frac{{\mathrm d}y}{{\mathrm d}t}}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t} = -cy + d x y.\] So we have a coupled set of ordinary differential equations, \[\begin{align} \label{lotvol} \mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} &= ax - bxy,\\ \mathchoice{\frac{{\mathrm d}y}{{\mathrm d}t}}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t} &= -cy + dxy, \end{align}\] where, once again, \(ax\) represents prey reproduction with unlimited food, \(-cy\) is predator natural death or emigration, and \(bxy\) and \(dxy\) are interaction terms. Note that \(b\) and \(d\) are not necessarily the same: many prey may have to be eaten for the predator population to grow by one.
This system represents both the individual growth/decay of the species (self-interactions) as well as their mutual interaction. This is the so-called Lotka–Volterra (predator–prey) system developed by the Italian mathematician Vito Volterra (1860–1940) in 1926 to explain the fluctuation of fish populations in the Adriatic Sea (more on Lotka in 4). In more modern theories there are multiple species, each with their own interactions, but we will limit ourselves to this simpler but highly instructive classical system.
An example solution is shown for the parameters \((a,b,c,d) = (\frac23,\frac43,1,1)\), \(x(0)=1\), \(y(0)=1\) in 1.6(a). See how the peaks in the greenfly population naturally increase the ladybird food supply: the ladybird population then increases. In turn this leads to the greenfly population dropping as they get eaten, and this decrease in food supply leads to the ladybird population dropping as food becomes competitive.
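If you would like to reproduce something like 1.6 yourself, here is a minimal sketch using SciPy (in the spirit of the SciPy Cookbook tutorial mentioned later in this chapter), with the parameters quoted above.

```python
import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

a, b, c, d = 2/3, 4/3, 1.0, 1.0  # the parameters from the example

def lotka_volterra(t, z):
    x, y = z
    return [a * x - b * x * y, -c * y + d * x * y]

sol = solve_ivp(lotka_volterra, (0, 40), [1.0, 1.0],
                t_eval=np.linspace(0, 40, 2000))

# (a) the two populations against time
plt.plot(sol.t, sol.y[0], label="greenflies, x")
plt.plot(sol.t, sol.y[1], label="ladybirds, y")
plt.xlabel("t")
plt.legend()
plt.show()

# (b) the phase plot: a closed curve indicates periodic behaviour
plt.plot(sol.y[0], sol.y[1])
plt.xlabel("x")
plt.ylabel("y")
plt.show()
```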
This periodic behaviour is made clear using a phase plot, as shown in 1.6(b). In this case we have a parametrised plot \((x(t),y(t))\): a geometric plot of the variables of the system (2D here because we have two variables). Closed curves in phase space indicate a periodic relationship between the two parameters.
The parameters \((a,b,c,d)\) play a key role in determining the system’s behaviour. However, they are not all independent. We can work out how many we actually need by doing a nondimensionalisation.
Thus far we’ve dealt directly with the dimensional form of the differential equations, [lotvol], meaning that the parameters in the equations have relevant dimensions (or units) associated with them. While this is perhaps convenient for making direct predictions from measured values of the parameters, working with dimensional parameters at best keeps the equations looking ‘untidy’, and at worst obscures the mathematical structure of the equations or relevant approximations that can be made.
The process of nondimensionalisation is to separate our variables into a nondimensional bit and a dimensional bit: \[x = \widehat{x}X, \quad y = \widehat{y}Y, \quad t = \widehat{t}T.\] Here, \(x\), \(X\), \(y\) and \(Y\) have units of ‘individuals’, and \(t\) and \(T\) have units of time. Variables with hats indicate they are nondimensional.
If we use this substitution, our equations are \[\begin{aligned} \mathchoice{\frac{{\mathrm d}\widehat x}{{\mathrm d}\widehat t}}{{\mathrm d}\widehat x/{\mathrm d}\widehat t}{{\mathrm d}\widehat x/{\mathrm d}\widehat t}{{\mathrm d}\widehat x/{\mathrm d}\widehat t}\frac{X}{T} &= a\widehat{x}X - b\widehat{x}\widehat{y}XY,\\ \mathchoice{\frac{{\mathrm d}\widehat y}{{\mathrm d}\widehat t}}{{\mathrm d}\widehat y/{\mathrm d}\widehat t}{{\mathrm d}\widehat y/{\mathrm d}\widehat t}{{\mathrm d}\widehat y/{\mathrm d}\widehat t}\frac{Y}{T} &= -c\widehat{y}Y + d\widehat{x}\widehat{y}XY. \end{aligned}\] Rearranging so that there are no dimensions on the left hand sides, we have \[\begin{aligned} \mathchoice{\frac{{\mathrm d}\widehat x}{{\mathrm d}\widehat t}}{{\mathrm d}\widehat x/{\mathrm d}\widehat t}{{\mathrm d}\widehat x/{\mathrm d}\widehat t}{{\mathrm d}\widehat x/{\mathrm d}\widehat t} &= a\widehat{x}T - b\widehat{x}\widehat{y}YT,\\ \mathchoice{\frac{{\mathrm d}\widehat y}{{\mathrm d}\widehat t}}{{\mathrm d}\widehat y/{\mathrm d}\widehat t}{{\mathrm d}\widehat y/{\mathrm d}\widehat t}{{\mathrm d}\widehat y/{\mathrm d}\widehat t} &= -c\widehat{y}T + d\widehat{x}\widehat{y}XT. \end{aligned}\] So now we pick \(T\), \(X\) and \(Y\) to remove as many of our parameters as possible. If we choose \[\begin{equation} \label{TYX-scaling} T = \frac{1}{a}, \quad Y = \frac{a}{b}, \quad X = \frac{c}{d}, \end{equation}\] then the system can be written as \[\begin{align} \label{reducedlotvol} \mathchoice{\frac{{\mathrm d}\widehat{x}}{{\mathrm d}\widehat{t}}}{{\mathrm d}\widehat{x}/{\mathrm d}\widehat{t}}{{\mathrm d}\widehat{x}/{\mathrm d}\widehat{t}}{{\mathrm d}\widehat{x}/{\mathrm d}\widehat{t}} &= \widehat{x} - \widehat{x}\widehat{y},\\ \mathchoice{\frac{{\mathrm d}\widehat{y}}{{\mathrm d}\widehat{t}}}{{\mathrm d}\widehat{y}/{\mathrm d}\widehat{t}}{{\mathrm d}\widehat{y}/{\mathrm d}\widehat{t}}{{\mathrm d}\widehat{y}/{\mathrm d}\widehat{t}} &= \gamma(-\widehat{y} +\widehat{x}\widehat{y}), \end{align}\] where \[\gamma = c/a.\] We see that nondimensionalisation has ‘tidied up’ our equation – we went from four parameters to one. Why might this matter? Well, each choice of parameters gives its own solution (for given initial conditions), so there is a distinct solution for every point in parameter space. If we have decided that the four parameters are positive real numbers, then there is a four-dimensional space of solutions to explore: quite a lot if we want to map out all the system’s behaviour! In fact, we have shown that the parameters are not independent, and that there is only a one-dimensional space of essentially different solutions, a much simpler search.
The scalings show that solutions for different parameter values are related by constant stretching factors. For example, if I choose \(\gamma=1\) and set \(c=2\), then \(a=2\). In fact, the parameters \(b\) and \(d\) are redundant and we can choose them freely (but this is rare; Lotka–Volterra is a slightly odd system). The ratio \(c/d\) just stretches any \(x\) solution (stretches its range/amplitude); \(a/b\) stretches \(y\); and the scaling along \(t\) changes the period of the solutions. That is to say, for a given \(\gamma\) we have a main solution which can be simply scaled to get other solutions, without solving the system again.
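Here is a quick numerical sanity check of this claim, solving the dimensional system [lotvol] and the reduced system [reducedlotvol] and confirming that they agree under the scalings [TYX-scaling]; the parameter values are illustrative.

```python
import numpy as np
from scipy.integrate import solve_ivp

a, b, c, d = 2/3, 4/3, 1.0, 1.0
T, Y, X = 1/a, a/b, c/d  # the scalings [TYX-scaling]
gamma = c / a

dim = lambda t, z: [a*z[0] - b*z[0]*z[1], -c*z[1] + d*z[0]*z[1]]
red = lambda t, z: [z[0] - z[0]*z[1], gamma*(-z[1] + z[0]*z[1])]

t = np.linspace(0, 30, 500)
x0, y0 = 1.0, 1.0
sol_dim = solve_ivp(dim, (0, 30), [x0, y0], t_eval=t,
                    rtol=1e-10, atol=1e-10)
# Nondimensional initial data and times: xhat(0) = x(0)/X, that = t/T.
sol_red = solve_ivp(red, (0, 30/T), [x0/X, y0/Y], t_eval=t/T,
                    rtol=1e-10, atol=1e-10)

# x(t) should equal X * xhat(t/T), up to integration error.
print(np.max(np.abs(sol_dim.y[0] - X * sol_red.y[0])))
```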
You might have found that the choice of \(X\) in [TYX-scaling] was a bit of a surprise. The truth is, nondimensionalisation is more of an art than a science, and there is not normally a unique choice. Instead you will find there are different competing values to maximise when doing it. For example, typically you want your nondimensionalised parameters to have some meaning (what is the biological interpretation of \(\gamma\) here?), but this might require not actually nondimensionalising to the fewest possible parameters.
Furthermore, one important caveat to the idea that nondimensionalisation is just a scaling of the axes is that this is valid only as long as no parameter goes through 0 or changes sign – if this happens, the solutions are no longer equivalent between the dimensional and nondimensional systems.
Essentially: here be dragons – but you shouldn’t worry about them at this stage (they are young and tame).
There are some example questions on this topic on Additional Problem Sheet 1.
We can solve this system using separation of variables. Dividing the two equations (and dropping hats) we obtain \[\begin{equation} \label{sepvareq} \mathchoice{\frac{{\mathrm d}y}{{\mathrm d}x}}{{\mathrm d}y/{\mathrm d}x}{{\mathrm d}y/{\mathrm d}x}{{\mathrm d}y/{\mathrm d}x}= \frac{\gamma y}{x}\left(\frac{-1+x}{1- y}\right)\implies \int \frac{1-y}{y} \,\mathrm{d}y= -\gamma \int \frac{1-x}{x} \,\mathrm{d}x. \end{equation}\] Integrating both sides of [sepvareq] we obtain \[\begin{equation} \label{dynsol} \log y-y =- \gamma(\log x -x) + C, \end{equation}\] where the constant \(C\) can be set by some initial condition, \((x(0),y(0))\). Unfortunately it is not possible to write this relationship in explicit form. This gives us the phase curves determined by the value of the constant \(C\). Parametrising this curve then gives the solutions \(x(t)\) and \(y(t)\), i.e. we could choose some behaviour for \(x(t)\) and [dynsol] will then determine the behaviour of \(y\). We will find that this kind of solution is common to such systems.
In Additional Problem Sheet 1 we will use this relationship to show that the phase curves must be closed curves.
The fact that we can’t write [dynsol] explicitly makes the solution difficult to plot…or does it? A tutorial for how to plot this numerically in Python, as you saw in 1.6, can be found in the SciPy Cookbook.
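In the meantime, here is one minimal approach: [dynsol] says that \(\log y - y + \gamma(\log x - x)\) is constant along any solution, so the phase curves are simply level sets of this quantity, which matplotlib can draw as a contour plot (the value of \(\gamma\) is illustrative).

```python
import numpy as np
import matplotlib.pyplot as plt

gamma = 1.5  # illustrative value of the single remaining parameter

x = np.linspace(0.05, 4, 400)
y = np.linspace(0.05, 4, 400)
Xg, Yg = np.meshgrid(x, y)

# The conserved quantity from [dynsol]: C = log(y) - y + gamma*(log(x) - x).
C = np.log(Yg) - Yg + gamma * (np.log(Xg) - Xg)

plt.contour(Xg, Yg, C, levels=15)  # each level set is one phase curve
plt.plot(1, 1, "ko")               # the equilibrium at (1, 1)
plt.xlabel("x")
plt.ylabel("y")
plt.show()
```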
Looking at [reducedlotvol] we can see there are two possible equilibria where \(\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} = \mathchoice{\frac{{\mathrm d}y}{{\mathrm d}t}}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t}=0\) \(\forall t\): \[\begin{equation} \label{equlibria} (x,y) = (0,0) \quad \mbox{and} \quad (x,y) = (1,1). \end{equation}\] (Dynamical systems fans would call these fixed points or steady state solutions). The \((0,0)\) solution corresponds to both populations being extinct! The second corresponds to the nonzero population densities at which the population sizes will remain fixed.
In 1.7(a), we see the varying behaviour of the closed phase curves of the system. All curves encircle the equilibrium at \((1,1)\), and as the initial conditions get closer to the equilibrium value, the curves shrink around it. In 1.7(b), we see the dramatic variety of morphology the parametric curves can exhibit. When the pair \((x(0),y(0))\) are initially close to the equilibrium, the curves have a low-amplitude sinusoidal shape, while if \(y(0)\) is initially small, the curves have extremely sharp gradients and dramatic rates of change at the maxima.
We have our dynamic solutions, [dynsol], and the fixed point equilibria, [equlibria]. A number of questions beg to be asked at this point:
(i) Can one or both of the species die out if they are both nonzero at some time \(t\)?
(ii) Can an oscillating pair of populations relax to nonzero fixed values, i.e., do the populations ever settle?
An immediate observation regarding (i) is that [dynsol] only allows \(x=0\) when \(y=0\) and vice versa, so they would have to become extinct simultaneously. The existence of periodic solutions as shown in 1.7 seems to suggest neither (i) nor (ii) can occur, because the system repeats itself cyclically. A solution which decays into equilibrium would have to have a phase space diagram which spiralled inwards. In fact, we have a precise means of determining the answer to such questions which we will discuss in 2.
Exponential growth is the most basic model of a population’s replication, but it is flawed as the population grows without bound.
The logistic model is a better model which includes intra- (within) species competition. It leads to a population which settles (asymptotically) to a fixed value.
We have derived a simple model for a predator–prey relationship between two species based on simple interaction and growth models. This represents inter- (between) species competition. It leads to periodic variations in the two populations.
We have covered various standard tools for analysing such systems: dynamic solutions, equilibrium solutions and phase curves.
In addition, we have raised the notion of stability and reachability of the equilibrium solutions. The phase curve behaviour we have observed appears to forbid reaching the equilibria from out-of-equilibrium states.
You can find the Lotka–Volterra system in Murray, vol. I, chap. 3.1.
Shlomo Sternberg at Harvard presents a nice resource on the system, now only available on the internet archive.
In 1, we asked if the Lotka–Volterra solutions could relax so that the population values become constant, given that they varied at some initial time. In order to answer this question, we look at the behaviour of the system in the neighbourhood of the equilibria. This is called a linear stability analysis. To give a clear picture of this technique, which we shall use continually in this course, we start by taking a side step to look at a simpler system from mechanics.
Let’s consider a rigid pendulum: a bead of mass \(m\) attached to a rigid rod of length \(\ell\), depicted in 2.1(a). Three forces act on the bead: its weight, \(mg\); tension in the rod, \(T\); and friction, which opposes the motion of the bead.
The distance travelled by the bead is the arclength, \(s=\ell \theta\). The velocity and acceleration are therefore \[\mathchoice{\frac{{\mathrm d}s}{{\mathrm d}t}}{{\mathrm d}s/{\mathrm d}t}{{\mathrm d}s/{\mathrm d}t}{{\mathrm d}s/{\mathrm d}t} = \ell\mathchoice{\frac{{\mathrm d}\theta}{{\mathrm d}t}}{{\mathrm d}\theta/{\mathrm d}t}{{\mathrm d}\theta/{\mathrm d}t}{{\mathrm d}\theta/{\mathrm d}t}, \quad \mathchoice{\frac{{\mathrm d}^2 s}{{\mathrm d}t^2}}{{\mathrm d}^2 s/{\mathrm d}t^2}{{\mathrm d}^2 s/{\mathrm d}t^2}{{\mathrm d}^2 s/{\mathrm d}t^2} = \ell\mathchoice{\frac{{\mathrm d}^2 \theta}{{\mathrm d}t^2}}{{\mathrm d}^2 \theta/{\mathrm d}t^2}{{\mathrm d}^2 \theta/{\mathrm d}t^2}{{\mathrm d}^2 \theta/{\mathrm d}t^2}.\] We say friction is proportional to velocity, \(\nu\ell \, \mathchoice{\frac{{\mathrm d}\theta}{{\mathrm d}t}}{{\mathrm d}\theta/{\mathrm d}t}{{\mathrm d}\theta/{\mathrm d}t}{{\mathrm d}\theta/{\mathrm d}t}\), and points in the direction opposing motion.
If we resolve parallel and perpendicular to the rod, given the rod makes an angle \(\theta\) to the vertical, then: \[\begin{aligned} \text{[parallel]} & & 0 &= T - mg\cos\theta, \\ \text{[perpendicular]} & & m \ell\mathchoice{\frac{{\mathrm d}^2 \theta}{{\mathrm d}t^2}}{{\mathrm d}^2 \theta/{\mathrm d}t^2}{{\mathrm d}^2 \theta/{\mathrm d}t^2}{{\mathrm d}^2 \theta/{\mathrm d}t^2} &= - \nu\ell \mathchoice{\frac{{\mathrm d}\theta}{{\mathrm d}t}}{{\mathrm d}\theta/{\mathrm d}t}{{\mathrm d}\theta/{\mathrm d}t}{{\mathrm d}\theta/{\mathrm d}t} - mg\sin\theta, \end{aligned}\] the second equation of which can be rearranged to give \[\begin{equation} \label{peneq} \mathchoice{\frac{{\mathrm d}^2 \theta}{{\mathrm d}t^2}}{{\mathrm d}^2 \theta/{\mathrm d}t^2}{{\mathrm d}^2 \theta/{\mathrm d}t^2}{{\mathrm d}^2 \theta/{\mathrm d}t^2} + \frac{\nu}{m} \mathchoice{\frac{{\mathrm d}\theta}{{\mathrm d}t}}{{\mathrm d}\theta/{\mathrm d}t}{{\mathrm d}\theta/{\mathrm d}t}{{\mathrm d}\theta/{\mathrm d}t}+ \frac{g}{\ell}\sin\theta = 0. \end{equation}\] This is a nonlinear ODE; our physical intuition tells us that the pendulum will swing with a decreasing amplitude until it relaxes to \(\theta=0\).
The damped pendulum system is not an integrable system, i.e., there are no general closed-form solutions. This is generally the case for mechanical systems with friction. On the other hand, numerical solutions are simple to obtain.
An example is shown in 2.2(a): the solution is indeed oscillatory with decreasing amplitude. In (b), we see a pendulum with a very high coefficient of friction (a pendulum in fluid, say) where the damping is so strong that the angle never becomes negative: its motion is killed off on the first swing.
See if you can generate these solutions for yourself by following the instructions in the SciPy documentation.
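If you would rather see a sketch straight away, something along the following lines works; the parameter values are illustrative, and the args keyword assumes a reasonably recent version of SciPy.

```python
import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

g, ell, m = 9.81, 1.0, 1.0  # illustrative values

def pendulum(t, z, nu):
    theta, omega = z  # omega is dtheta/dt
    return [omega, -(nu / m) * omega - (g / ell) * np.sin(theta)]

t = np.linspace(0, 10, 1000)
for nu in [0.5, 8.0]:  # light damping vs very heavy damping
    sol = solve_ivp(pendulum, (0, 10), [np.pi / 4, 0.0],
                    t_eval=t, args=(nu,))
    plt.plot(sol.t, sol.y[0], label=f"nu = {nu:g}")

plt.axhline(0, color="k", lw=0.5)
plt.xlabel("t")
plt.ylabel("theta(t)")
plt.legend()
plt.show()
```

The lightly damped pendulum oscillates with decreasing amplitude, as in 2.2(a); the heavily damped one never crosses \(\theta=0\), as in (b).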
Accounting for periodicity, there are two equilibria to [peneq]: \[\frac{g}{\ell}\sin\theta = 0\] gives \(\theta(t) = 0\) and \(\theta(t) = \pi\) for all \(t\). The first \(\theta=0\) solution corresponds to a pendulum starting at the bottom of its cycle and not moving.
The second solution is far more interesting. This is when the pendulum rod is vertically upwards as in 2.1(b). The forces in the system are balanced as the rod tension balances gravity and the rotating moments are equal in either direction. But have you tried this? No matter how hard you try, it will eventually fail and the pendulum will start to rotate back to its \(\theta=0\) equilibrium. Why?
Any small variation in the pendulum – a gentle breeze or vibration in the rod – no matter how small, always grows over time. In practice, no system is perfect and such variations always exist. Mathematically we represent small variations by a linear stability analysis.
In asymptotic analysis, we say a function \(g(t,\varepsilon)\) is \(\mathcal{O}(f(t,\varepsilon))\) if \[\lim_{\varepsilon\to 0} \frac{g(t,\varepsilon)}{f(t,\varepsilon)} = C\] for some finite constant \(C\).
So, for example, let’s have \(g(t,\varepsilon) = C \varepsilon\) and \(f(t,\varepsilon) = \varepsilon\). Well, \[\lim_{\varepsilon\to 0} \frac{g(t,\varepsilon)}{f(t,\varepsilon)} = C,\] so \(C \varepsilon\) is \(\mathcal{O}(\varepsilon)\).
But if \(g(t,\varepsilon) = C \varepsilon^2\) and \(f(t,\varepsilon) = \varepsilon\), \[\lim_{\varepsilon\to 0} \frac{g(t,\varepsilon)}{f(t,\varepsilon)} = 0.\] So \(C \varepsilon^2\) is also \(\mathcal{O}(\varepsilon)\).
On the other hand, if \(g(t,\varepsilon) = C\) and \(f(t,\varepsilon) = \varepsilon\), then \[\lim_{\varepsilon\to 0} \frac{g(t,\varepsilon)}{f(t,\varepsilon)} = \infty.\] So \(C\) is in some way much bigger than \(\varepsilon\) when \(\varepsilon\to 0\). We would say \(C\) is \(\mathcal{O}(1)\) but not \(\mathcal{O}(\varepsilon)\).
For non-polynomial functions, the trick is to use a Taylor expansion: what if \(g(t,\varepsilon) = \sin\varepsilon\) and \(f(t,\varepsilon) = \varepsilon\)? \[\lim_{\varepsilon\to 0} \frac{g(t,\varepsilon)}{f(t,\varepsilon)} = \lim_{\varepsilon\to 0}\frac{\sin\varepsilon}{\varepsilon} = \lim_{\varepsilon\to 0}\frac{\varepsilon-\varepsilon^3/3!+\cdots}{\varepsilon} = \lim_{\varepsilon\to 0}\left( 1 - \frac{\varepsilon^2}{3!} + \cdots \right) = 1.\] So \(\sin\varepsilon\) is also \(\mathcal{O}(\varepsilon)\).
Long story short: if something is \(\mathcal{O}(\varepsilon^n)\), it is the same size as, or smaller than, \(\varepsilon^n\) when \(\varepsilon\to 0\).
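You can check these limits symbolically if you like; here is a small SymPy sketch reproducing the three examples above.

```python
import sympy as sp

eps = sp.symbols("epsilon", positive=True)

# sin(eps)/eps -> 1, so sin(eps) is O(eps).
print(sp.limit(sp.sin(eps) / eps, eps, 0))  # 1

# eps**2/eps -> 0, so eps**2 is also O(eps) (same size or smaller).
print(sp.limit(eps**2 / eps, eps, 0))       # 0

# 1/eps blows up, so a constant is O(1) but not O(eps).
print(sp.limit(1 / eps, eps, 0))            # oo
```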
‘Big O’ notation is part of a class of notations called Landau notation. In asymptotic analysis, we look at the limit as \(\varepsilon\to 0\), but in many applications, especially in computer science, we look at the limit going to infinity. For example, how many operations does it take to multiply two \(N\times N\) matrices together by hand? As \(N \to \infty\), it’s \(\mathcal{O}(N^3)\). The correct limit is normally clear from context.
The basic steps of a linear stability analysis are as follows:
(i) Find the system’s equilibria, \(\theta_0\) (we will commonly use the subscript \(0\) to indicate the equilibrium solution).
(ii) Assume a value which is changed from this equilibrium value by a very small amount, \(\theta = \theta_0 +\varepsilon\theta_1\), with \(\varepsilon\ll 1\). This is supposed to mimic the small vibration in the system. Substitute this into the equation and ignore terms of order \(\varepsilon^2\) and higher. We are left with the behaviour of the system where only small vibrations matter.
(iii) Solve this system to find out if our small vibration, \(\theta_1\), grows (like it would for \(\theta_0=\pi\) in the pendulum) or decays (as it would for \(\theta_0=0\)).
(iv) Conclude that the equilibrium is stable if we have decay (small vibrations would disappear) and unstable if they grow.
Let’s apply this to our pendulum.
We already have these: \(\theta_0 = 0\) and \(\theta_0 = \pi\).
We assume the equilibrium solution \(\theta_0\) is changed to \[\begin{equation} \label{expsol} \theta(t) = \theta_0 + \varepsilon\theta_1(t), \end{equation}\] with \(\theta_1\) the changing behaviour and \(\varepsilon\ll 1,\) such that this is a vanishingly small change.
Ignoring all \(\mathcal{O}(\varepsilon^2)\) terms we substitute [expsol] into [peneq], \[\begin{equation} \label{penexp} \mathchoice{\frac{{\mathrm d}^2 }{{\mathrm d}t^2}}{{\mathrm d}^2 /{\mathrm d}t^2}{{\mathrm d}^2 /{\mathrm d}t^2}{{\mathrm d}^2 /{\mathrm d}t^2}[\theta_0 + \varepsilon\theta_1(t)] + \frac{\nu}{m} \mathchoice{\frac{{\mathrm d}}{{\mathrm d}t}}{{\mathrm d}/{\mathrm d}t}{{\mathrm d}/{\mathrm d}t}{{\mathrm d}/{\mathrm d}t}[\theta_0 + \varepsilon\theta_1(t)]+ \frac{g}{\ell}\sin[\theta_0 + \varepsilon\theta_1(t)] = 0. \end{equation}\]
Let’s go term by term. Remembering that \(\theta_0\) is a constant, \[\begin{aligned} \mathchoice{\frac{{\mathrm d}^2 }{{\mathrm d}t^2}}{{\mathrm d}^2 /{\mathrm d}t^2}{{\mathrm d}^2 /{\mathrm d}t^2}{{\mathrm d}^2 /{\mathrm d}t^2}[\theta_0 + \varepsilon\theta_1(t)] &= \varepsilon\mathchoice{\frac{{\mathrm d}^2 \theta_1}{{\mathrm d}t^2}}{{\mathrm d}^2 \theta_1/{\mathrm d}t^2}{{\mathrm d}^2 \theta_1/{\mathrm d}t^2}{{\mathrm d}^2 \theta_1/{\mathrm d}t^2} \\ \frac{\nu}{m} \mathchoice{\frac{{\mathrm d}}{{\mathrm d}t}}{{\mathrm d}/{\mathrm d}t}{{\mathrm d}/{\mathrm d}t}{{\mathrm d}/{\mathrm d}t}[\theta_0 + \varepsilon\theta_1(t)] &= \varepsilon\frac{\nu}{m} \mathchoice{\frac{{\mathrm d}\theta_1}{{\mathrm d}t}}{{\mathrm d}\theta_1/{\mathrm d}t}{{\mathrm d}\theta_1/{\mathrm d}t}{{\mathrm d}\theta_1/{\mathrm d}t} \\ \frac{g}{\ell}\sin[\theta_0 + \varepsilon\theta_1(t)] &= \; ? \end{aligned}\] For the \(\sin\) term we need another strategy…Taylor series!
Recall that, so long as a real function \(f(x)\) is sufficiently differentiable, we can expand it in a Taylor series, \[f({\color{C0}x}+{\color{C1}\delta}) = f({\color{C0}x}) + {\color{C1}\delta} f'({\color{C0}x}) + \frac{{\color{C1}\delta}^2}{2}f''({\color{C0}x}) + \cdots.\] We can think about this as a succession of approximations to the value of \(f\) at \(x+\delta\). In our case then, \[\sin[{\color{C0}\theta_0} + {\color{C1}\varepsilon\theta_1(t)}] = \sin({\color{C0}\theta_0}) + {\color{C1}\varepsilon\theta_1(t)} \cos({\color{C0}\theta_0}) + \mathcal{O}(\varepsilon^2).\] Since \(\theta_0=0\) and \(\pi\), we have \(\sin(\theta_0) = 0\) and \(\cos\theta_0 = 1\) and \(-1\) respectively.
So, putting this all together and dividing through by \(\varepsilon\), [penexp] becomes \[\begin{equation} \label{peneqlin} \mathchoice{\frac{{\mathrm d}^2 \theta_1}{{\mathrm d}t^2}}{{\mathrm d}^2 \theta_1/{\mathrm d}t^2}{{\mathrm d}^2 \theta_1/{\mathrm d}t^2}{{\mathrm d}^2 \theta_1/{\mathrm d}t^2} + \frac{\nu}{m} \mathchoice{\frac{{\mathrm d}\theta_1}{{\mathrm d}t}}{{\mathrm d}\theta_1/{\mathrm d}t}{{\mathrm d}\theta_1/{\mathrm d}t}{{\mathrm d}\theta_1/{\mathrm d}t} \pm \frac{g}{\ell}\theta_1(t) =0. \end{equation}\]
The big idea is that if \(\varepsilon\ll 1\) then [peneqlin] will basically give us the solution to the full system (if \(\theta\) is very close to \(\theta_0\)).
The linear equation, [peneqlin], is a constant-coefficient, linear, ordinary differential equation. The auxiliary equation (in \(\lambda\)) is \[\lambda^2 + \frac{\nu}{m} \lambda \pm \frac{g}{\ell} = 0,\] and the general solutions are therefore \[\begin{align} \label{linpensol}\theta_1(t) = A \mathrm{e}^{\lambda_1 t} + B\mathrm{e}^{\lambda_2 t}, \end{align}\] where \[\begin{aligned} \lambda_1 &= \frac{1}{2}\left[-\frac{\nu}{m} + \sqrt{\left(\frac{\nu}{m}\right)^2 \mp 4\frac{g}{\ell}} \right], \\ \lambda_2 &= \frac{1}{2}\left[-\frac{\nu}{m} - \sqrt{\left(\frac{\nu}{m}\right)^2 \mp 4\frac{g}{\ell}} \right], \end{aligned}\] and where \(A\) and \(B\) are set by initial conditions. For stability analysis, we need to assume that \(A\) and \(B\) could be any bounded values: that is to say, the equilibrium should be stable/unstable under any type of small change to the system.
The question of stability is what happens to \(\theta(t)\) as \(t \to \infty\). Looking at our solution, [linpensol], this clearly depends on \(\lambda_1\) and \(\lambda_2\).
Case \(\theta_0 = 0\). In this case, \[\lambda_1 = \frac{1}{2}\left[-\frac{\nu}{m} + \sqrt{\left(\frac{\nu}{m}\right)^2 - 4\frac{g}{\ell}} \right], \quad \lambda_2 = \frac{1}{2}\left[-\frac{\nu}{m} - \sqrt{\left(\frac{\nu}{m}\right)^2 - 4\frac{g}{\ell}} \right].\] If \((\nu/m)^2 - 4g/\ell\) (the bit under the square root) is negative, then the square root is imaginary and \(\operatorname{Re}(\lambda_1)\) and \(\operatorname{Re}(\lambda_2)\) are both negative. If this same term is positive, then note that \[\sqrt{\left(\frac{\nu}{m}\right)^2 - 4 \frac{g}{\ell}} < \frac{\nu}{m},\] and so \(\operatorname{Re}(\lambda_1)\) and \(\operatorname{Re}(\lambda_2)\) are still both negative. Thus we see that the solutions, [linpensol], must always decay exponentially. The physical interpretation of this is what we expected: around the bottom of the pendulum cycle (\(\theta=0\)) all small oscillations will decay such that \(\theta(t) \to \theta_0=0\).
Case \(\theta_0 = \pi\). For this case, \[\lambda_1 = \frac{1}{2}\left[-\frac{\nu}{m} + \sqrt{\left(\frac{\nu}{m}\right)^2 + 4\frac{g}{\ell}} \right], \quad \lambda_2 = \frac{1}{2}\left[-\frac{\nu}{m} - \sqrt{\left(\frac{\nu}{m}\right)^2 + 4\frac{g}{\ell}} \right].\] The bit under the square root is positive, and so \(\lambda_1\) and \(\lambda_2\) are both guaranteed to be real. Additionally, \[\sqrt{\left(\frac{\nu}{m}\right)^2 + 4 \frac{g}{\ell}}>\frac{\nu}{m},\] so \(\lambda_1\) is positive and \(\lambda_2\) is negative. Thus the term \(\exp(\lambda_1 t)\) will grow exponentially. Again this matches our physical intuition: small oscillations about \(\theta=\pi\) will grow such that the pendulum moves away from the top of its arc.
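A quick numerical check of both cases, computing the roots of the auxiliary equation for illustrative parameter values:

```python
import numpy as np

g, ell, m, nu = 9.81, 1.0, 1.0, 0.5  # illustrative values

# Auxiliary equation: lambda^2 + (nu/m) lambda +- g/ell = 0,
# with '+' for theta_0 = 0 (bottom) and '-' for theta_0 = pi (top).
for name, sign in [("theta_0 = 0", +1), ("theta_0 = pi", -1)]:
    lam = np.roots([1, nu / m, sign * g / ell])
    print(name, "-> Re(lambda) =", lam.real)
```

For \(\theta_0=0\) both real parts are negative (decay); for \(\theta_0=\pi\) one is positive (growth).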
If there is no friction, \(\nu =0\), steps (i)–(ii) will be the same, except that our linearised solution is \[\theta_1(t) = A\mathrm{e}^{\lambda_1 t} + B \mathrm{e}^{\lambda_2 t},\] where \[\lambda_{1} = \sqrt{\mp \frac{g}{\ell}}, \quad \lambda_{2} = - \sqrt{\mp \frac{g}{\ell}}.\]
Case \(\theta_0 = 0\). \(\lambda\) is pure imaginary, \(\lambda_{1,2} = \pm \mathrm{i}\sqrt{g/\ell}\), and the solutions are just sinusoidal oscillations which do not decay in time, \[\theta_1(t) = C \sin\left(\sqrt{\frac{g}{\ell}} t\right)+ D \cos\left(\sqrt{\frac{g}{\ell}} t \right).\]
This case is interesting because it tells us that any small oscillation will be maintained: this is neither stable nor unstable! In fact, in the case \(\nu=0\), [peneq] is integrable: that is to say, we can solve it analytically. Its solutions can be written in terms of elliptic integrals and the solutions are oscillatory with constant amplitude. The physical interpretation is that if there is no friction there is no reason for the swings of the pendulum to decay.
Case \(\theta_0 = \pi\). \(\lambda_1\) and \(\lambda_2\) are real and the first exponential will grow.
The growth instability of this solution and the periodic nature of the system’s solutions (it is possible to prove the periodicity) highlight a second issue. The linearised solutions – which exhibit exponential growth – often tell us little about the full nonlinear behaviour of the system – which is periodic. It just so happens in this case that the divergence between the full and linear solutions is rapid. We will not pursue this issue much further here as it is not important in what follows.
To test the stability of equilibrium solutions, we linearise the equation about the equilibrium state.
We analyse the solutions for exponential decay or growth. If there is only decay, then the solution is stable in that it is resistant to small changes. If there is any growth it is unstable as small changes destroy the solution.
Some special systems will have neither growth nor decay. In this case we know, from the nonlinear behaviour of the frictionless \(\theta_0=0\) pendulum equation, that this is because neighbouring solutions are periodic.
We can make these conclusions far more general…
The pendulum equation and the Lotka–Volterra equations are both examples of ODEs which depend on a variable and its derivatives with respect to time, but not explicitly on time itself.
Many population models have this property, which is called autonomy:
An autonomous ordinary differential equation is one which has no explicit dependence on time, \(t\).
For example, this is autonomous: \[\begin{equation} \label{example} x^3\mathchoice{\frac{{\mathrm d}^3 x}{{\mathrm d}t^3}}{{\mathrm d}^3 x/{\mathrm d}t^3}{{\mathrm d}^3 x/{\mathrm d}t^3}{{\mathrm d}^3 x/{\mathrm d}t^3} + \sqrt{1-x^2}\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} + x^3 = 0, \end{equation}\] but this is not: \[x^3\mathchoice{\frac{{\mathrm d}^3 x}{{\mathrm d}t^3}}{{\mathrm d}^3 x/{\mathrm d}t^3}{{\mathrm d}^3 x/{\mathrm d}t^3}{{\mathrm d}^3 x/{\mathrm d}t^3} + t \mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} + x^3 = 0,\] because of the explicit \(t\) dependence.
We now extend our gaze to linear stability analysis of this general class of ODEs, first when our system of interest is governed by a single ODE (as in the pendulum) rather than by a coupled system (as with Lotka–Volterra).
An autonomous ODE is an equation of the form \[F\left(x(t),\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t},\dots,\mathchoice{\frac{{\mathrm d}^{n} x}{{\mathrm d}t^{n}}}{{\mathrm d}^{n} x/{\mathrm d}t^{n}}{{\mathrm d}^{n} x/{\mathrm d}t^{n}}{{\mathrm d}^{n} x/{\mathrm d}t^{n}}\right) = 0.\] At equilibrium, \(x = x_0\), all \(t\) derivatives must vanish, so we are left with some function of \(x_0\) alone, \[F(x_0,0,\dots,0) = 0.\] In our example equation, [example], this equation is \[x_0^3=0.\]
Given an equilibrium solution, \[F({\color{C0}x_0},{\color{C0}0},\dots,{\color{C0}0}) = 0,\] assume a solution in the form \(x = x_0 + \varepsilon x_1\), \[\begin{equation} \label{F-to-taylor-expand} F\left({\color{C0}x_0} + {\color{C1}\varepsilon x_1(t)}, \; {\color{C0}0}+{\color{C1}\varepsilon\mathchoice{\frac{{\mathrm d}x_1}{{\mathrm d}t}}{{\mathrm d}x_1/{\mathrm d}t}{{\mathrm d}x_1/{\mathrm d}t}{{\mathrm d}x_1/{\mathrm d}t}}, \; \dots, \; {\color{C0}0}+ {\color{C1}\varepsilon\mathchoice{\frac{{\mathrm d}^{n} x_1}{{\mathrm d}t^{n}}}{{\mathrm d}^{n} x_1/{\mathrm d}t^{n}}{{\mathrm d}^{n} x_1/{\mathrm d}t^{n}}{{\mathrm d}^{n} x_1/{\mathrm d}t^{n}}}\right) = 0. \end{equation}\]
Now recall that a two-dimensional Taylor series looks like \[f({\color{C0}x}+{\color{C1}\delta},{\color{C0}y}+{\color{C1}\zeta}) = f({\color{C0}x},{\color{C0}y}) + {\color{C1}\delta} \mathchoice{\frac{\partial f}{\partial x}}{\partial f/\partial x}{\partial f/\partial x}{\partial f/\partial x}({\color{C0}x},{\color{C0}y}) + {\color{C1}\zeta} \mathchoice{\frac{\partial f}{\partial y}}{\partial f/\partial y}{\partial f/\partial y}{\partial f/\partial y}({\color{C0}x},{\color{C0}y}) + \cdots,\] and with that in mind, let’s do an \(n\)-dimensional Taylor expansion of [F-to-taylor-expand] in \(\varepsilon\) about \(\varepsilon=0\).
This expansion is algebraically awkward, but denoting \(\mathchoice{\frac{{\mathrm d}^{n} x}{{\mathrm d}t^{n}}}{{\mathrm d}^{n} x/{\mathrm d}t^{n}}{{\mathrm d}^{n} x/{\mathrm d}t^{n}}{{\mathrm d}^{n} x/{\mathrm d}t^{n}} = x^{(n)}\), we get \[F({\color{C0}x_0},{\color{C0}0},\dots,{\color{C0}0}) +{\color{C1}\varepsilon}\left(\mathchoice{\frac{\partial F}{\partial x}}{\partial F/\partial x}{\partial F/\partial x}{\partial F/\partial x}{\color{C1}x_1} + \mathchoice{\frac{\partial F}{\partial x^{(1)}}}{\partial F/\partial x^{(1)}}{\partial F/\partial x^{(1)}}{\partial F/\partial x^{(1)}}{\color{C1}x_1^{(1)}} + \dots + \mathchoice{\frac{\partial F}{\partial x^{(n)}}}{\partial F/\partial x^{(n)}}{\partial F/\partial x^{(n)}}{\partial F/\partial x^{(n)}}{\color{C1}x_1^{(n)}}\right) = 0\] to \(\mathcal{O}(\varepsilon^2)\).
Remember for a Taylor expansion we evaluate the partial derivatives at \(\varepsilon=0\), i.e., \(x=x_0\). Thus the partial derivatives, \(\mathchoice{\frac{\partial F}{\partial x^{(n)}}}{\partial F/\partial x^{(n)}}{\partial F/\partial x^{(n)}}{\partial F/\partial x^{(n)}}\), are constant values. Also we know \(F(x_0,0,\dots,0)=0\) by our assumption of expanding around equilibrium. Thus, if we ignore terms of \(\mathcal{O}(\varepsilon^2)\), we have \[a_0x_1+ a_1\mathchoice{\frac{{\mathrm d}x_1}{{\mathrm d}t}}{{\mathrm d}x_1/{\mathrm d}t}{{\mathrm d}x_1/{\mathrm d}t}{{\mathrm d}x_1/{\mathrm d}t} + \dots + a_n \mathchoice{\frac{{\mathrm d}^{n} x_1}{{\mathrm d}t^{n}}}{{\mathrm d}^{n} x_1/{\mathrm d}t^{n}}{{\mathrm d}^{n} x_1/{\mathrm d}t^{n}}{{\mathrm d}^{n} x_1/{\mathrm d}t^{n}} =0,\qquad a_n = \left.\mathchoice{\frac{\partial F}{\partial x^{(n)}}}{\partial F/\partial x^{(n)}}{\partial F/\partial x^{(n)}}{\partial F/\partial x^{(n)}}\right\vert_{x=x_0},\] which is a constant coefficient ODE.
As we saw with the pendulum, such equations have solutions in the form \(x_1 = A\mathrm{e}^{\lambda t}\). Substituting this into our linearised equation will give a polynomial in the form \[a_0 + a_1\lambda +\dots + a_n\lambda^n,\] and hence the full solution of the correction \(x_1\) takes the general form \[\begin{equation} \label{lincorr} x_1 = c_1\mathrm{e}^{\lambda_1 t}+\dots + c_n\mathrm{e}^{\lambda_n t}. \end{equation}\] Of course, it is entirely possible that some of the \(\lambda\) are complex, \(\lambda_i = \mu_i + \mathrm{i}\nu_i\). If so, recall that the corresponding part of the solution is of the form \[\mathrm{e}^{\mu_i t} [ A\cos(\nu_i t) + B \sin(\nu_i t)].\]
For the sake of stability analysis we know that the imaginary part of \(\lambda\) does not control growth – only the real part does.
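Putting these observations together, here is a rough sketch of how one might check this numerically from the coefficients of the polynomial above; the tolerance is an ad hoc allowance for floating-point error.

```python
import numpy as np

def classify(coeffs, tol=1e-9):
    # coeffs = [a_n, ..., a_1, a_0], highest power first (np.roots convention).
    re = np.roots(coeffs).real
    if np.all(re < -tol):
        return "all perturbations decay"
    if np.any(re > tol):
        return "some perturbation grows"
    return "neither: some Re(lambda) = 0"

# The linearised pendulum about theta_0 = 0: lambda^2 + (nu/m)lambda + g/ell.
print(classify([1.0, 0.5, 9.81]))  # decay
print(classify([1.0, 0.0, 9.81]))  # neither: the frictionless case
```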
We now give more rigorous definitions of three classes for our equilibria: they will be asymptotically stable, unstable, or the linearised system will be degenerate.
Consider an equilibrium solution, \(x_0\), to an autonomous ODE, \[\begin{equation} \label{nonlin} F\left(x(t),\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t},\dots,\mathchoice{\frac{{\mathrm d}^{n} x}{{\mathrm d}t^{n}}}{{\mathrm d}^{n} x/{\mathrm d}t^{n}}{{\mathrm d}^{n} x/{\mathrm d}t^{n}}{{\mathrm d}^{n} x/{\mathrm d}t^{n}}\right) = 0. \end{equation}\] The solution is Lyapunov stable if for every \(\varepsilon> 0\), there exists a \(\delta > 0\) such that if \(\vert x(0)-x_0\vert<\delta\), then for all \(t \geq 0\) we have \(\vert x(t)-x_0\vert<\varepsilon\).
Two ways to think about this definition of Lyapunov stability:
You can draw an \(\varepsilon\)-sized fence around \(x_0\), and then find a smaller enclosed area of size \(\delta\) where if you start in the smaller area, you never leave the larger area.
When you start close to \(x_0\) (within a \(\delta\)-distance), you remain close to \(x_0\) (within an \(\varepsilon\)-distance). And this must be true for any \(\varepsilon\).
The solution is asymptotically stable if it is Lyapunov stable and there exists \(\varepsilon\) such that if \(\vert x(0)-x_0\vert<\varepsilon\), then \(\lim_{t\to \infty}\vert x(t)-x_0\vert=0\). In other words, it is asymptotically stable if in addition to being Lyapunov stable, when you start arbitrarily close to \(x_0\), you end up at \(x_0\) at \(t=\infty\).
It is possible for an equilibrium to be Lyapunov stable and not asymptotically stable: periodic solutions fulfil this criterion, for example, and we will talk about periodic solutions in our discussion of the degenerate case below.
It is also possible to satisfy the second condition for asymptotic stability without satisfying the first condition (i.e. Lyapunov stability). For example, consider a system in two functions \(x\) and \(y\) which can be written in polar coordinates as \[\begin{align} \label{not-lyap-stab-theta-full} \mathchoice{\frac{{\mathrm d}r}{{\mathrm d}t}}{{\mathrm d}r/{\mathrm d}t}{{\mathrm d}r/{\mathrm d}t}{{\mathrm d}r/{\mathrm d}t} &= r(1-r), \\ \mathchoice{\frac{{\mathrm d}\theta}{{\mathrm d}t}}{{\mathrm d}\theta/{\mathrm d}t}{{\mathrm d}\theta/{\mathrm d}t}{{\mathrm d}\theta/{\mathrm d}t} &= \sin^2\left(\frac{\theta}{2}\right). \label{not-lyap-stab-theta} \end{align}\] You can see fairly easily that the equilibria are at \((r,\theta) = (0,0)\) and \((1,0)\), which conveniently correspond to \((x,y) = (0,0)\) and \((1,0)\). It turns out that although \((1,0)\) satisfies the second condition (if you start arbitrarily close to it, you end up at it), it is not Lyapunov stable because the route the solution takes in getting from the arbitrarily-close starting point to the final equilibrium is very indirect.
The solutions are plotted in 2.3: if you look back at [not-lyap-stab-theta], you can see that this indirectness comes from the fact that \(\mathchoice{\frac{{\mathrm d}\theta}{{\mathrm d}t}}{{\mathrm d}\theta/{\mathrm d}t}{{\mathrm d}\theta/{\mathrm d}t}{{\mathrm d}\theta/{\mathrm d}t} \geq 0\).
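A short simulation makes this indirectness vivid: starting on the unit circle just ahead of the equilibrium, the solution travels all the way around before approaching \((1,0)\) from the other side. A minimal sketch (the starting angle and integration time are illustrative choices):

```python
import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

def polar(t, z):
    r, theta = z
    return [r * (1 - r), np.sin(theta / 2) ** 2]

# Start very close to the equilibrium (r, theta) = (1, 0), just 'ahead' of it.
sol = solve_ivp(polar, (0, 200), [1.0, 0.1],
                t_eval=np.linspace(0, 200, 4000), rtol=1e-8)

r, theta = sol.y
plt.plot(r * np.cos(theta), r * np.sin(theta))
plt.plot(1, 0, "ko")  # the equilibrium (x, y) = (1, 0)
plt.xlabel("x")
plt.ylabel("y")
plt.axis("equal")
plt.show()
```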
The key observation is this: if for our linear correction \(x_1\), [lincorr], all \(\operatorname{Re}(\lambda_i)<0\), then \(x_1 \to 0\) and hence any small perturbation \(x=x_0 +\varepsilon x_1 \to x_0\). That is to say, the equilibrium is asymptotically stable.
If any of the \(\operatorname{Re}(\lambda_i)\) of [lincorr] are \(>0\) then \(x_1\) will grow exponentially. We class such equilibria \(x_0\) (for which this \(x_1\) is our small correction, \(x_0+\varepsilon x_1\)) as unstable. As this growth continues, the linear approximation becomes invalid and the nonlinear dynamics take over. All that matters for stability is that the solution cannot approach \(x_0\) if the initial condition \(x(0)\) is not the equilibrium (it is essentially ‘repelled’ from \(x_0\)).
So far we’ve only considered cases where none of the \(\operatorname{Re}(\lambda_i) = 0\). So long as this is the case, the Hartman–Grobman theorem makes rigorous the idea that the behaviour of the linearised problem around an equilibrium is qualitatively the same as the behaviour of the full problem around that point.
But sometimes we encounter the case where one of the \(\operatorname{Re}(\lambda_i)=0\). If all the other \(\operatorname{Re}(\lambda_i)\) are \(\leq 0\), this fits neither of our previous definitions (if one is positive, it doesn’t matter that one of the \(\operatorname{Re}(\lambda_i)=0\) as the exponential growth takes over). It is not unstable as there is no exponential growth, but the solution will not decay away, so it is not asymptotically stable.
We have already encountered this for the pendulum equation in the previous section. Unfortunately there is no simple answer to the question of what this implies, but in this course it will be one of three possibilities:
(i) The solutions close to equilibrium are periodic limit cycles, as for the frictionless pendulum (stay tuned for more on limit cycles in the next chapter).
(ii) The equilibrium itself is degenerate, i.e., imagine that \[\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} =0.\] The solution is any permissible \(x_0\in\mathbb{R}\) (\(>0\) to be realistic). Thus any equilibrium is arbitrarily close to another equilibrium and cannot be stable (an equilibrium doesn’t decay!).
(iii) The linearisation itself vanishes. Consider equations such as \[\begin{equation} \label{cubiceq} \mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} + x^3 =0, \end{equation}\] whose equilibrium is \(x_0=0\): the expansion of \((x_0+\varepsilon x_1)^3\) is just \(\varepsilon^3 x_1^3\) (all other terms involve \(x_0\), which is zero). This means the linear equation is simply \[\mathchoice{\frac{{\mathrm d}x_1}{{\mathrm d}t}}{{\mathrm d}x_1/{\mathrm d}t}{{\mathrm d}x_1/{\mathrm d}t}{{\mathrm d}x_1/{\mathrm d}t}=0.\] Such cases are more complex and suggest that the terms that we ignored in our linearisation have turned out to be more important than we thought! Sometimes one can go back to the full equation: in fact, you should be able to solve [cubiceq] directly.
We will encounter specific examples of (ii) and (iii) in the problems class. In an exam you will generally be asked questions about non-degenerate cases, and if the question concerns a degenerate case it will be of a type you have seen before.
The degenerate case indicated by (iii) occurs when the equilibrium is the solution to a nonlinear equation (e.g. \(x^3=0\)) with repeated roots. An example such as \[(x-a)(x-b)(x-c)=0,\quad a>b>c,\] will not lead to a degeneracy. Check this by substituting in \(x = x_0 + \varepsilon x_1\) and confirming that for any of the equilibria \(x_0 = a, b, c\) the \(\mathcal{O}(\varepsilon)\) term is not zero.
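Here is a small SymPy sketch of that check, expanding about the equilibrium \(x_0 = a\) (the other two cases work the same way by symmetry).

```python
import sympy as sp

x, eps, x1, a, b, c = sp.symbols("x epsilon x_1 a b c")
F = (x - a) * (x - b) * (x - c)

# Substitute x = x_0 + eps*x_1 about x_0 = a and expand in powers of eps.
expansion = sp.expand(F.subs(x, a + eps * x1))
order_eps = expansion.coeff(eps, 1)

print(sp.factor(order_eps))  # x_1*(a - b)*(a - c): nonzero for distinct a, b, c
```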
Population models including more than one species will require systems of ODEs, like we saw with Lotka–Volterra. So more generally we could consider systems of \(m\) ODEs in \(m\) functions \(x^1(t),\dots, x^m(t)\) (note these are not powers, we’re just putting the numbers in the superscript slot to avoid notation clash shortly), \[\begin{align} \nonumber F^1\left(x^1(t),\mathchoice{\frac{{\mathrm d}x^1}{{\mathrm d}t}}{{\mathrm d}x^1/{\mathrm d}t}{{\mathrm d}x^1/{\mathrm d}t}{{\mathrm d}x^1/{\mathrm d}t},\dots,\mathchoice{\frac{{\mathrm d}^{n} x^1}{{\mathrm d}t^{n}}}{{\mathrm d}^{n} x^1/{\mathrm d}t^{n}}{{\mathrm d}^{n} x^1/{\mathrm d}t^{n}}{{\mathrm d}^{n} x^1/{\mathrm d}t^{n}},\dots\dots, x^m(t),\mathchoice{\frac{{\mathrm d}x^m}{{\mathrm d}t}}{{\mathrm d}x^m/{\mathrm d}t}{{\mathrm d}x^m/{\mathrm d}t}{{\mathrm d}x^m/{\mathrm d}t},\dots,\mathchoice{\frac{{\mathrm d}^{n} x^m}{{\mathrm d}t^{n}}}{{\mathrm d}^{n} x^m/{\mathrm d}t^{n}}{{\mathrm d}^{n} x^m/{\mathrm d}t^{n}}{{\mathrm d}^{n} x^m/{\mathrm d}t^{n}}\right) &= 0,\\ \label{nonlinsys} \vdots \hspace{4.93cm} &\\ \nonumber F^m\left(x^1(t),\mathchoice{\frac{{\mathrm d}x^1}{{\mathrm d}t}}{{\mathrm d}x^1/{\mathrm d}t}{{\mathrm d}x^1/{\mathrm d}t}{{\mathrm d}x^1/{\mathrm d}t},\dots,\mathchoice{\frac{{\mathrm d}^{n} x^1}{{\mathrm d}t^{n}}}{{\mathrm d}^{n} x^1/{\mathrm d}t^{n}}{{\mathrm d}^{n} x^1/{\mathrm d}t^{n}}{{\mathrm d}^{n} x^1/{\mathrm d}t^{n}},\dots\dots, x^m(t),\mathchoice{\frac{{\mathrm d}x^m}{{\mathrm d}t}}{{\mathrm d}x^m/{\mathrm d}t}{{\mathrm d}x^m/{\mathrm d}t}{{\mathrm d}x^m/{\mathrm d}t},\dots,\mathchoice{\frac{{\mathrm d}^{n} x^m}{{\mathrm d}t^{n}}}{{\mathrm d}^{n} x^m/{\mathrm d}t^{n}}{{\mathrm d}^{n} x^m/{\mathrm d}t^{n}}{{\mathrm d}^{n} x^m/{\mathrm d}t^{n}}\right) &= 0. \end{align}\]
But nearly all the models we see in this course will be first order systems. Indeed, (nondimensionalised) Lotka–Volterra can be written in this way, setting \(x^1 =x\) and \(x^2=y\): \[\begin{align} \label{lotvol2} \mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} - x + xy &= 0,\\ \mathchoice{\frac{{\mathrm d}y}{{\mathrm d}t}}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t} -\gamma(-y + xy)&=0. \end{align}\] So let’s just look at these first order systems and get our hands dirty with Lotka–Volterra: linear stability analysis is a technique best learnt by practice!
For Lotka–Volterra, we know from 1.4.4 that the equilibria are \((x_0,y_0) = (0,0)\) and \((1,1)\).
Put simply, by expanding to linear order we will always get a linear system of equations with constant coefficients. For the Lotka–Volterra system we could write it as \[\begin{aligned} \mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} - F(x,y) &=0,\\ \mathchoice{\frac{{\mathrm d}y}{{\mathrm d}t}}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t} - G(x,y) &=0, \end{aligned}\] so substituting in \(x=x_0 +\varepsilon x_1\) and \(y=y_0 +\varepsilon y_1\) to linear order (\(\mathcal{O}(\varepsilon)\)) we would find \[\begin{aligned} \mathchoice{\frac{{\mathrm d}x_1}{{\mathrm d}t}}{{\mathrm d}x_1/{\mathrm d}t}{{\mathrm d}x_1/{\mathrm d}t}{{\mathrm d}x_1/{\mathrm d}t} - \mathchoice{\frac{\partial F}{\partial x}}{\partial F/\partial x}{\partial F/\partial x}{\partial F/\partial x}(x_0,y_0) \, x_1 -\mathchoice{\frac{\partial F}{\partial y}}{\partial F/\partial y}{\partial F/\partial y}{\partial F/\partial y}(x_0,y_0) \, y_1 & =0 ,\\ \mathchoice{\frac{{\mathrm d}y_1}{{\mathrm d}t}}{{\mathrm d}y_1/{\mathrm d}t}{{\mathrm d}y_1/{\mathrm d}t}{{\mathrm d}y_1/{\mathrm d}t} - \mathchoice{\frac{\partial G}{\partial x}}{\partial G/\partial x}{\partial G/\partial x}{\partial G/\partial x}(x_0,y_0) \, x_1 -\mathchoice{\frac{\partial G}{\partial y}}{\partial G/\partial y}{\partial G/\partial y}{\partial G/\partial y}(x_0,y_0) \, y_1 & =0. \end{aligned}\] Using the notation \[F_x= \mathchoice{\frac{\partial F}{\partial x}}{\partial F/\partial x}{\partial F/\partial x}{\partial F/\partial x}(x_0,y_0), \quad F_y = \mathchoice{\frac{\partial F}{\partial y}}{\partial F/\partial y}{\partial F/\partial y}{\partial F/\partial y}(x_0,y_0), \quad \text{etc.},\] one can write this as a matrix equation, \[\mathchoice{\frac{{\mathrm d}}{{\mathrm d}t}}{{\mathrm d}/{\mathrm d}t}{{\mathrm d}/{\mathrm d}t}{{\mathrm d}/{\mathrm d}t} \begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \mathsfbfit{J} \begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix}F_x & F_y \\ G_x & G_y \end{pmatrix} \begin{pmatrix} x_1 \\ y_1 \end{pmatrix} ,\] or even \[\begin{equation} \mathchoice{\frac{{\mathrm d}\mathbfit{x}_1}{{\mathrm d}t}}{{\mathrm d}\mathbfit{x}_1/{\mathrm d}t}{{\mathrm d}\mathbfit{x}_1/{\mathrm d}t}{{\mathrm d}\mathbfit{x}_1/{\mathrm d}t} = \mathsfbfit{J}\mathbfit{x}_1, \label{dxdt-EQ-Ax} \end{equation}\] where \(\mathbfit{x}_1 = (x_1,y_1)\). The matrix \(\mathsfbfit{J}\) is commonly referred to as the Jacobian matrix. For Lotka–Volterra, the Jacobian is \[\begin{equation} \label{a1lotvol} \mathsfbfit{J} = \begin{pmatrix}1-y_0 & -x_0 \\ \gamma y_0 & \gamma(x_0-1) \end{pmatrix}. \end{equation}\]
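If you’d like to check a Jacobian like [a1lotvol] without doing the partial derivatives by hand, a few lines of sympy will do it. A minimal sketch (the variable names are my own):

```python
import sympy as sp

x, y, gamma = sp.symbols('x y gamma', positive=True)

# Right-hand sides of nondimensionalised Lotka-Volterra: dx/dt = F, dy/dt = G
F = x - x*y
G = gamma*(-y + x*y)

# Jacobian of (F, G) with respect to (x, y)
J = sp.Matrix([F, G]).jacobian([x, y])
print(J)  # Matrix([[1 - y, -x], [gamma*y, gamma*(x - 1)]])

# At the coexistence equilibrium (1, 1): eigenvalues are +/- i*sqrt(gamma),
# matching what we find for this equilibrium shortly
print(J.subs({x: 1, y: 1}).eigenvals())
```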
For a linear stability analysis like this in an exam, I would only need you to correctly quote the Jacobian. You do not need to show all the steps of the linearisation.
Intuitively, solutions to [dxdt-EQ-Ax] will be a linear combination of exponentials. To calculate them, we compute the eigenvalues of \(\mathsfbfit{J}\). Suppose \(\mathsfbfit{J}\) is diagonalisable – if not, we need slightly heavier machinery but the outcome is the same. Diagonalisable just means we can write \[\mathsfbfit{J} = \mathsfbfit{P} \mathsfbfit{D} \mathsfbfit{P}^{-1},\] where \(\mathsfbfit{D} = \operatorname{diag}(\lambda_1, \dots, \lambda_n)\) is a diagonal matrix of the eigenvalues of \(\mathsfbfit{J}\), and \(\mathsfbfit{P}\) is a matrix with the eigenvectors of \(\mathsfbfit{J}\) along its columns: if the eigenvectors are notated by \(\mathbfit{v}_i\) then \[\mathsfbfit{P} = \begin{pmatrix}\mathbfit{v}_1 & \cdots & \mathbfit{v}_n \end{pmatrix}.\] We therefore have \[\mathchoice{\frac{{\mathrm d}\mathbfit{x}_1}{{\mathrm d}t}}{{\mathrm d}\mathbfit{x}_1/{\mathrm d}t}{{\mathrm d}\mathbfit{x}_1/{\mathrm d}t}{{\mathrm d}\mathbfit{x}_1/{\mathrm d}t} = \mathsfbfit{P} \mathsfbfit{D} \mathsfbfit{P}^{-1} \mathbfit{x}_1 \implies \mathsfbfit{P}^{-1} \mathchoice{\frac{{\mathrm d}\mathbfit{x}_1}{{\mathrm d}t}}{{\mathrm d}\mathbfit{x}_1/{\mathrm d}t}{{\mathrm d}\mathbfit{x}_1/{\mathrm d}t}{{\mathrm d}\mathbfit{x}_1/{\mathrm d}t} = \mathsfbfit{D} \mathsfbfit{P}^{-1} \mathbfit{x}_1 \implies \mathchoice{\frac{{\mathrm d}}{{\mathrm d}t}}{{\mathrm d}/{\mathrm d}t}{{\mathrm d}/{\mathrm d}t}{{\mathrm d}/{\mathrm d}t}[\mathsfbfit{P}^{-1}\mathbfit{x}_1] = \mathsfbfit{D} (\mathsfbfit{P}^{-1} \mathbfit{x}_1),\] where that last step is allowed because \(\mathsfbfit{P}^{-1}\) is just a matrix of constants. Now, if we let \(\mathbfit{z} = \mathsfbfit{P}^{-1} \mathbfit{x}_1\) then \[\begin{equation} \label{diagonalizable-US-lin-US-ode} \mathchoice{\frac{{\mathrm d}\mathbfit{z}}{{\mathrm d}t}}{{\mathrm d}\mathbfit{z}/{\mathrm d}t}{{\mathrm d}\mathbfit{z}/{\mathrm d}t}{{\mathrm d}\mathbfit{z}/{\mathrm d}t} = \mathsfbfit{D} \mathbfit{z} = \begin{pmatrix}\lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix} \mathbfit{z}, \quad \text{i.e.} \quad \mathchoice{\frac{{\mathrm d}z_i}{{\mathrm d}t}}{{\mathrm d}z_i/{\mathrm d}t}{{\mathrm d}z_i/{\mathrm d}t}{{\mathrm d}z_i/{\mathrm d}t} = \lambda_i z_i, \end{equation}\] and because these are just scalar equations we can say \[z_i = C_i \mathrm{e}^{\lambda_i t}.\] But we were looking to solve for \(\mathbfit{x}_1\), right? Indeed, \[\mathbfit{x}_1 = \mathsfbfit{P} \mathbfit{z},\] i.e. \(\mathbfit{x}_1\) is just a linear combination of the exponentials.
In a 2D system like Lotka–Volterra, this looks like \[\begin{equation} \label{xpv-eq-2d} \mathbfit{x}_1 = \begin{pmatrix}\mathbfit{v}_1 & \mathbfit{v}_2\end{pmatrix} \begin{pmatrix}C_1 \mathrm{e}^{\lambda_1 t} \\ C_2 \mathrm{e}^{\lambda_2 t}\end{pmatrix} = C_1 \mathbfit{v}_1 \mathrm{e}^{\lambda_1 t} + C_2 \mathbfit{v}_2 \mathrm{e}^{\lambda_2 t}, \end{equation}\] as promised.
You will remember that eigenvectors are defined as the specific vectors \(\mathbfit{v}\) for which \[\begin{equation} \mathsfbfit{J} \mathbfit{v} = \lambda \mathbfit{v} \label{linmat} \end{equation}\] for some given matrix \(\mathsfbfit{J}\). The eigenvalues, \(\lambda\), are the scaling factors that make this work. To find eigenvalues, we therefore have to solve [linmat], and we do so by requiring \(\det(\mathsfbfit{J}-\lambda \mathsfbfit{I})=0\).
Do you remember why? The logic goes: given \(\mathbfit{v} \neq \mathbf{0}\), the kernel of the map \((\mathsfbfit{J}-\lambda\mathsfbfit{I})\), which always includes \(\mathbf{0}\), must also include the \(\mathbfit{v}\) which solves [linmat]. Therefore the map isn’t a bijection, so it isn’t invertible and the determinant must be zero.
You will remember this leads us to solve a polynomial in \(\lambda\), the so-called characteristic polynomial. In Lotka–Volterra, this will be a quadratic, so we get two pairs of eigenvalues and eigenvectors, \((\lambda_1,\mathbfit{v}_1)\) and \((\lambda_2,\mathbfit{v}_2)\). The solutions \(x_1\) and \(y_1\) then take the general form in [xpv-eq-2d], using initial conditions (if any) to define the constants.
The key observation is that the time dependence of \(\mathbfit{x}_1\) is entirely determined by the eigenvalues of \(\mathsfbfit{J}\). I will refer to the values of \(\lambda\) as eigenvalues hereafter; the characteristic polynomial is also known as the eigen-polynomial.
It’s useful to think about this result in light of what we have already learned about one-dimensional systems, \(\frac{\mathrm{d}x}{\mathrm{d}t} = f(x)\). There, the Jacobian is just a real number, namely \(f'(x_0)\). Since this is the only eigenvalue, we only needed to examine whether \(f'(x_0) > 0\) or \(f'(x_0)<0\) to test stability.
In [a1lotvol], we found the linear matrix \(\mathsfbfit{J}\) for the Lotka–Volterra system.
When evaluated at the first equilibrium, \((x_0,y_0) = (0,0)\), the characteristic equation \(\det(\mathsfbfit{J}-\lambda \mathsfbfit{I})=0\) becomes \[(\lambda-1)(\lambda+\gamma) = 0.\] So one of the eigenvalues is positive. The solution therefore has an exponential term with a positive exponent, and so the positive eigenvalue tells us straight away that the system is unstable at \((0,0)\).
What does this mean for Lotka–Volterra? Well, this tells us that neither population of the system ever dies out! We should not be happy with this conclusion, as real-life populations can be made extinct. We can perhaps note that this equilibrium is similar to the \(\theta = \pi\) case for the frictionless pendulum; moreover, the fact that the full solutions are periodic – a point we made in 1.4.4 – is not evident in this linear analysis.
The second equilibrium at \((x_0,y_0) = (1,1)\) leads to the characteristic equation \[\lambda^2 + \gamma = 0,\] so \(\lambda = \pm \sqrt{\gamma}\,\mathrm{i}\) and the eigenvalues are purely imaginary. This tells us that the system is neither asymptotically stable nor unstable. But we knew this already, as the solutions to Lotka–Volterra are periodic (recall 1.7(a)).
We have developed a general theory for the linear stability analysis of nonlinear autonomous ODEs and extended it to systems of autonomous ODEs.
The basic idea in both cases is to assume solutions \(y_i\) in the form \(y_i= y_{i0} + \varepsilon y_{i1}\) and to then expand the equation(s) to \(\mathcal{O}(\varepsilon)\). This will lead to either a single constant coefficient linear ODE or a system of such equations. The solutions are obtained by substituting in a solution in the form \(y_{i1}=a_{i1}\mathrm{e}^{\lambda_i t}\) and solving for \(\lambda\) (in the 1D case) or finding the eigenvalues of the Jacobian (for systems).
If \(\operatorname{Re}(\lambda_i)>0\) for any \(i\), then the system is unstable.
If \(\operatorname{Re}(\lambda_i)<0\) for all \(i\), then it is asymptotically stable.
Otherwise, if \(\operatorname{Re}(\lambda_i)=0\) for some \(i\) (and all other \(\operatorname{Re}(\lambda_j)\leq0\)), the system does not decay, and we are in one of the degenerate cases covered above, which must be considered separately to complete the analysis. In future lectures and problems sheets we will encounter examples of the degenerate case.
There are more examples of linear stability analysis on systems of ODEs in Additional Problem Sheet 1.
The pendulum is well-covered on Wikipedia.
Dominique Bicout, from Grenoble Alpes University, has a nice set of slides on linear stability analysis.
Ask yourself – intuitively, what does \((0,0)\) being unstable mean?
If you said something like ‘if you start at \((0,0)\) and perturb the system, you head away from the equilibrium’, you’re pretty much right. But notice that for systems of \(m>1\) functions, there are different ways of perturbing the system: we could go to \((0,\varepsilon)\), \((\varepsilon,0)\) or \((\varepsilon,\delta)\) for \(\varepsilon,\delta>0\).
Look again at the Lotka–Volterra system, \[\begin{align} \label{lotvol3} \mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} &= x - xy,\\ \mathchoice{\frac{{\mathrm d}y}{{\mathrm d}t}}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t} &= \gamma(-y + xy), \end{align}\] and ask yourself: what happens if you start at \((0,0)\) and perturb it like \((0,\varepsilon)\)? In 3.1 we see the phase plot we’ve seen already, with two additional (purple) paths. If you perturb it like \((0,\varepsilon)\)... you end up back at \((0,0)\)! So how can this ‘unstable’ equilibrium be ‘stable’ in one special direction?
We say an equilibrium is unstable if there exists at least one direction of perturbation in which trajectories move away from it. What we see with Lotka–Volterra is a \((0,0)\) equilibrium which is ‘stable’ in one direction but ‘unstable’ in another. In the language of dynamical systems, this is known as a saddle point. A saddle point is a type of unstable equilibrium.
In fact, the eigenvector associated with each eigenvalue gives you the direction associated with the ‘stability’ indicated by the sign of the eigenvalue. Let’s formalise this idea.
Consider the 2D system whose stability matrix \(\mathsfbfit{J}\) is \[\mathsfbfit{J} = \begin{pmatrix}a & b \\ c & d \end{pmatrix}.\] Solving for the eigenvalues, we have \[\begin{aligned} \det(\mathsfbfit{J}-\lambda \mathsfbfit{I}) & = \lambda^2 - (a+d)\lambda + ad-bc \\ & = \lambda^2 - \operatorname{tr}(\mathsfbfit{J})\lambda + \det(\mathsfbfit{J}) = 0, \end{aligned}\] and hence \[\begin{equation} \label{lambda-as-tr-det} \lambda = \frac{1}{2}\left[\operatorname{tr}(\mathsfbfit{J}) \pm \sqrt{ \operatorname{tr}(\mathsfbfit{J})^2 -4\det(\mathsfbfit{J})}\right]. \end{equation}\]
The general solution, as we saw in [xpv-eq-2d], is then \[\mathbfit{x}_1 = C_1 \mathbfit{v}_1 \mathrm{e}^{\lambda_1 t} + C_2 \mathbfit{v}_2 \mathrm{e}^{\lambda_2 t}.\]
Considering [lambda-as-tr-det], we have the cases listed in 3.1, named after their graphical representation.
| | \(\operatorname{tr}(\mathsfbfit{J})^2 - 4 \det(\mathsfbfit{J})\) | \(\lambda_{1,2}\) | \(\operatorname{Re}(\lambda_1)\) | \(\operatorname{Re}(\lambda_2)\) |
|---|---|---|---|---|
| Stable node | \(+\) | real | \(-\) | \(-\) |
| Unstable node | \(+\) | real | \(+\) | \(+\) |
| Saddle point | \(+\) | real | \(+\) | \(-\) |
| Stable star | \(0\) | real, equal | \(-\) | \(-\) |
| Unstable star | \(0\) | real, equal | \(+\) | \(+\) |
| Stable spiral | \(-\) | complex | \(-\) | \(-\) |
| Unstable spiral | \(-\) | complex | \(+\) | \(+\) |
| Centre | \(-\) | imaginary | \(0\) | \(0\) |
In fact, we often don’t have to explicitly work out \(\lambda\) in order to develop stability criteria: instead we can just look at the trace and determinant. Note that \[\lambda_1 + \lambda_2 = \operatorname{tr}(\mathsfbfit{J}) \quad \text{and} \quad \lambda_1\lambda_2 = \det(\mathsfbfit{J}).\] Then our table becomes
| | \(\operatorname{tr}(\mathsfbfit{J})^2 - 4 \det(\mathsfbfit{J})\) | \(\operatorname{tr}(\mathsfbfit{J})\) | \(\det(\mathsfbfit{J})\) |
|---|---|---|---|
| Stable node | \(+\) | \(-\) | \(+\) |
| Unstable node | \(+\) | \(+\) | \(+\) |
| Saddle point | \(+\) | ? | \(-\) |
| Stable star | \(0\) | \(-\) | \(+\) |
| Unstable star | \(0\) | \(+\) | \(+\) |
| Stable spiral | \(-\) | \(-\) | \(+\) |
| Unstable spiral | \(-\) | \(+\) | \(+\) |
| Centre | \(-\) | \(0\) | \(+\) |
This gives us a useful shortcut for determining stability if calculating the trace and determinant is easier than explicitly calculating the eigenvalues.
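To show just how mechanical this shortcut is, here it is as a little Python function. This is a sketch with my own naming; the degenerate \(\det(\mathsfbfit{J})=0\) cases are left unclassified:

```python
def classify_equilibrium(tr, det, tol=1e-12):
    """Classify the equilibrium of a 2x2 linearised system from
    tr(J) and det(J), following the table above. Degenerate cases
    (det = 0) are left unclassified."""
    if det < -tol:
        return "saddle point"
    if det > tol:
        if abs(tr) <= tol:
            return "centre"
        kind = "stable" if tr < 0 else "unstable"
        disc = tr**2 - 4*det
        if disc > tol:
            return kind + " node"
        if abs(disc) <= tol:
            return kind + " star"
        return kind + " spiral"
    return "degenerate (det = 0)"

# Lotka-Volterra at (0,0) with gamma = 0.5: tr(J) = 1 - gamma, det(J) = -gamma
print(classify_equilibrium(0.5, -0.5))  # -> saddle point
```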
For stability (of any sort) we therefore require \[\det(\mathsfbfit{J})>0 \quad \text{and} \quad \operatorname{tr}(\mathsfbfit{J})<0,\] but we can also use this observation to search for more complex behaviour. In the case of the centre, the trajectories move around the fixed point, never getting closer or further away: each trajectory is a closed orbit. We will see that in the context of fully nonlinear systems, isolated closed orbits – so-called limit cycles – can exist, and that they themselves can be stable or unstable.
As practice, you should try these determinant–trace criteria on the examples we have covered so far.
If \(b\) or \(c\) is zero then \[\det(\mathsfbfit{J} -\lambda \mathsfbfit{I}) = (\lambda -a)(\lambda-d),\] so the eigenvalues can be read off directly and there is no need to find the trace or determinant (which often complicates matters).
A last word: these conditions on the determinant and trace are useful but are only valid for \(2\times2\) matrices.
It’s important to keep in mind that the linear analysis about equilibria only gives a local description of the dynamics. To get a better picture of the global dynamics, we’d like to understand how the equilibria are connected, or if closed orbits exist. In one-dimensional systems, \(\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} = f(x)\), this was easy as we only needed to examine the sign of \(f\) between the equilibria and the phase portrait was clear.
In two dimensions, the phase portrait will typically include:
The equilibria and some indication of their stability,
The trajectories near equilibria and periodic orbits that show the flow pattern.
While uniqueness of the solution dictates that trajectories cannot cross, compiling the phase portrait in 2D can be challenging. We’re going to construct the example phase portrait in 3.2. A good recipe looks like this:
Draw on the equilibria, coloured according to their stability from a linear stability analysis.
Draw the nullclines. These are the curves where either \[\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} = f(x,y) = 0 \quad\text{or}\quad \mathchoice{\frac{{\mathrm d}y}{{\mathrm d}t}}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t} = g(x,y) = 0.\] Since the equilibria satisfy both \(\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}=0\) and \(\mathchoice{\frac{{\mathrm d}y}{{\mathrm d}t}}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t} = 0\), the nullclines will intersect at the equilibria.
Particularly helpful is that when trajectories intersect the nullcline, they must be either vertical (\(\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} = 0\)) or horizontal (\(\mathchoice{\frac{{\mathrm d}y}{{\mathrm d}t}}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t} = 0\)).
The nullclines provide the boundary between regions of phase space where \(f>0\) or \(f<0\), and \(g>0\) or \(g<0\). Thus, the nullclines give us some sense of the direction of the trajectories in different regions of the phase plane. However, there is no way to see which sign \(f,g\) have in each region a priori, so we have to test at least one point in each region.
Infer the directions from this information. For example, in a region where \(f<0\) and \(g<0\), we’d expect trajectories heading down and to the left.
See how we’ve done this in 3.2 for a made-up scenario. At this point we can try to plot the full trajectories (although I haven’t done so here).
With this in mind, let’s look at Lotka–Volterra and its variants in more detail.
We learnt in 2 that nondimensionalised Lotka–Volterra has two equilibria:
\((0,0)\) with eigenvalues \(1\) and \(-\gamma < 0\), and
\((1,1)\) with eigenvalues \(\pm\sqrt{\gamma}\mathrm{i}\).
Looking at our stability table, we see \((0,0)\) is therefore officially a saddle point, and \((1,1)\) is a centre. This matches what we’ve seen in our phase diagrams so far.
Looking at [lotvol3], we can see the nullclines are:
\(\displaystyle\frac{\mathrm{d}x}{\mathrm{d}t} = 0\) \(\implies\) \(x=0\) or \(y=1\),
\(\displaystyle\frac{\mathrm{d}y}{\mathrm{d}t} = 0\) \(\implies\) \(x=1\) or \(y=0\).
Putting these and the equilibria on the phase diagram in 3.3, we can see the quadrants where \(\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}\) and \(\mathchoice{\frac{{\mathrm d}y}{{\mathrm d}t}}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t}\) are positive and negative. This tells us the direction that the centre at \((1,1)\) takes... it must be anticlockwise.
Furthermore, what about the saddle point at \((0,0)\)? Well, working out the associated eigenvectors, the positive (corresponding to unstable) eigenvalue \(1\) corresponds to the vector \((1,0)\), and the negative (stable) eigenvalue \(-\gamma\) corresponds to \((0,1)\). This tells us that the saddle is unstable in the direction \((1,0)\) but stable in the direction \((0,1)\). Just like we’ve seen! These are also part of the picture in 3.3.
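If you’d like to check this picture numerically, a minimal matplotlib sketch of the flow, nullclines and equilibria (with \(\gamma\) chosen arbitrarily) might look like this:

```python
import numpy as np
import matplotlib.pyplot as plt

gamma = 0.5  # arbitrary; any gamma > 0 gives the same qualitative picture

# Flow on a grid over the (biologically relevant) positive quadrant
x, y = np.meshgrid(np.linspace(0.0, 3, 30), np.linspace(0.0, 3, 30))
plt.streamplot(x, y, x - x*y, gamma*(-y + x*y), color='grey')

# Nullclines: dx/dt = 0 on x = 0 and y = 1; dy/dt = 0 on y = 0 and x = 1
plt.axvline(0, ls='--'); plt.axhline(1, ls='--')
plt.axhline(0, ls='--'); plt.axvline(1, ls='--')

plt.plot([0, 1], [0, 1], 'ko')  # equilibria (0,0) and (1,1)
plt.xlabel('x (prey)'); plt.ylabel('y (predator)'); plt.show()
```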
So we have the Lotka–Volterra equations and we know about the stability of their equilibria. Is this a good model?
The original Lotka–Volterra model was intended to model periodic behaviour. One well-studied example of periodic predator–prey behaviour is the relationship between snowshoe hares and the Canada lynx. These animals, which coexist throughout Canada and Alaska, have pelts that were collected in the 19th and 20th centuries. The pelt counts are shown in 3.4.
We saw in 2.3 that extinction is impossible in the classic Lotka–Volterra system. This results from the fact that if population \(y\) drops close to zero, then \(x\) is subject to unconstrained growth which eventually leads to an increase in \(y\), stopping it from dropping to zero. We only have one parameter in the nondimensionalised system (\(\gamma\)) and that just scales the solution.
If we remember the periodic orbits of Lotka–Volterra in 1.7(a), changing the initial condition is equivalent to varying which orbit the solution lies on. In terms of the biology, this implies a couple of things. Firstly, there is no natural oscillation in the population levels – different initial population sizes yield different oscillations. And secondly, this causes strange, unnatural things to occur, such as a lower initial predator population resulting in larger peaks in the predator population size.
We would expect in reality that it is possible for one species to win regardless of initial conditions. Let’s have a look at some variations.
In classic Lotka–Volterra, one species is the predator, while the other is the prey. But often in nature, multiple species compete for the same resources and as a result, the presence of one population impedes the growth of another. A classic example is American grey squirrels (with their bad spellings and perfect teeth) outcompeting native British red squirrels since their introduction to England in 1876 (3.5).
Let’s take the case of two species whose population sizes are \(x\) and \(y\), respectively. In isolation, each population will evolve according to logistic growth which, you will recall from 1.2, looks like \[\frac{\mathrm{d}x}{\mathrm{d}t} = a x\left(1-\frac{x}{K}\right),\] with \(K\) the limiting population. When the other population is present, each species gains an additional death rate proportional to the population size of the other species.
This leads us to the system: \[\begin{align} \label{lotvolcomp2} \mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} &= a x\left(1-\frac{x}{\eta_1}\right) - b x y,\\ \mathchoice{\frac{{\mathrm d}y}{{\mathrm d}t}}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t} &= c y\left(1-\frac{y}{\eta_2}\right) - d x y. \end{align}\] This has self-interaction for both species \(x\) and \(y\) through the logistic terms, as well as mutual interaction through the \(xy\) terms.
We nondimensionalise by setting \(x = \widehat{x}\eta_1\), \(y = \widehat{y}\eta_2\), \(t = \widehat{t}/a\), \(\gamma_1 = b\eta_2/a\), \(\gamma_2 = d\eta_1/c\) and \(\beta = c/a\). On substituting and dropping hats we obtain the nondimensionalised competitive Lotka–Volterra equations, \[\begin{align} \label{lotvolcomp2scaled} \mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} &= x(1-x - \gamma_1 y),\\ \mathchoice{\frac{{\mathrm d}y}{{\mathrm d}t}}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t} &= \beta y(1-y - \gamma_2 x), \end{align}\] where \(\beta,\gamma_1,\gamma_2>0\).
A worked example of a nondimensionalisation can be found in Additional Problem Sheet 2. You should try to recreate the nondimensionalisation (eqs. ([lotvolcomp2]) to ([lotvolcomp2scaled])) using the same method.
An example solution is shown in 3.6: we see both populations eventually settle on fixed values, although the mutual competition terms \(\gamma_1 xy\) and \(\gamma_2 xy\) ensure these fixed values are not those of the logistic behaviour alone.
See if you can reproduce 3.6 yourself by adapting your Python code from the last Lotka–Volterra plot.
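If you don’t have your code to hand, a minimal sketch with scipy is below. The parameter values are my own, chosen only so that both populations settle (\(\gamma_1,\gamma_2<1\)):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import solve_ivp

beta, gamma1, gamma2 = 1.0, 0.5, 0.7  # gamma1, gamma2 < 1: coexistence

def competitive_lv(t, z):
    x, y = z
    return [x*(1 - x - gamma1*y), beta*y*(1 - y - gamma2*x)]

sol = solve_ivp(competitive_lv, (0, 40), [0.1, 0.1], dense_output=True)
t = np.linspace(0, 40, 400)
x, y = sol.sol(t)

plt.plot(t, x, label='x')
plt.plot(t, y, label='y')
plt.xlabel('t'); plt.legend(); plt.show()
```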
Let’s do a linear stability analysis as well as drawing a phase portrait.
[lotvolcomp2scaled] allows for the trivial null population solution, \((x,y)=(0,0)\), but interestingly it also allows for solutions where one or the other population is extinct, \[(x,y) = (0,1)\mbox{ and } (x,y) = (1,0).\] In both cases the non-extinct population takes on its logistic value (remembering that we have nondimensionalised by this point). The final equilibrium takes the form \[\begin{equation} \label{nonzeroeq} (x,y) = \left( \frac{1-\gamma_1}{1-\gamma_1\gamma_2}, \frac{1-\gamma_2}{1-\gamma_1\gamma_2} \right), \end{equation}\] and is permissible (both components non-negative) only if \(\gamma_1,\gamma_2<1\) or \(\gamma_1,\gamma_2>1\) (consider the signs of the numerators and the denominator).
The nullclines are given by the same equations, but considered separately: \[\begin{aligned} 0 &= x(1-x-\gamma_1y) &&\implies x=0, \quad x=1-\gamma_1 y,\\ 0 &= \beta y(1-y-\gamma_2x) &&\implies y=0, \quad y=1-\gamma_2 x. \end{aligned}\] The important observation is that the slopes of the lines, and whether they intersect or not, depend on the values of \(\gamma_1\) and \(\gamma_2\). As the intersections are the equilibria, we see that varying \(\gamma_1\) and \(\gamma_2\) results in bifurcations. Four cases can arise, depicted in 3.7.
The Jacobian matrix \(\mathsfbfit{J}\) takes the form \[\begin{equation} \label{A1lotcomp} \mathsfbfit{J} = \begin{pmatrix} 1 - 2x_0 - \gamma_1 y_0 & - \gamma_1 x_0 \\ -\beta \gamma_2 y_0 & \beta(1 - 2y_0 - \gamma_2 x_0) \end{pmatrix}. \end{equation}\]
Solutions will be linear combinations of exponential terms, \(C\mathbfit{v}\mathrm{e}^{\lambda t}\), so once again we have to solve \(\det(\mathsfbfit{J} - \lambda \mathsfbfit{I}) = 0\).
The \((x_0,y_0)=(0,0)\) case leads to the polynomial \[(1-\lambda)(\beta-\lambda)= 0,\] so the eigenvalues are \(\lambda_1 = 1\) and \(\lambda_2 = \beta\). Both are real and positive, so the equilibrium is an unstable node. That is to say, the two populations cannot die out simultaneously.
For the case \((x_0,y_0) = (0,1)\), the characteristic polynomial is \[(1-\gamma_1-\lambda)(-\beta -\lambda) = 0.\] So the eigenvalues are \(\lambda_1 = 1- \gamma_1\) and \(\lambda_2 = -\beta\). This is a stable node if \(\gamma_1>1\) (we assume \(\beta>0\)). If not, it is a saddle point.
3.8(a) shows the dynamic relaxation to this stable equilibrium when \(\gamma_1>1\). 3.8(b) demonstrates the instability of the equilibrium when \(\gamma_1<1\): a small perturbation around the equilibrium, \((0+\varepsilon,1)\), leads to the next equilibrium...
Using the same arguments for the case \((x_0,y_0) = (1,0)\), the eigenvalues are \(\lambda_1 = -1\) and \(\lambda_2 = \beta(1-\gamma_2)\). This is a stable node if \(\gamma_2>1\) (we assume \(\beta>0\)). If not, it is a saddle point.
The Jacobian at the coexistence equilibrium [nonzeroeq] is \[\mathsfbfit{J} = \frac{1}{1-\gamma_1\gamma_2}\begin{pmatrix} \gamma_1 - 1 & \gamma_1(\gamma_1-1) \\ \beta\gamma_2(\gamma_2-1) & \beta(\gamma_2-1) \end{pmatrix},\] and the characteristic equation is therefore (after some algebra) \[\lambda^2(1-\gamma_1\gamma_2)+\lambda(1-\gamma_1+\beta[1-\gamma_2])+\beta(1-\gamma_1)(1-\gamma_2) = 0.\] This is a quadratic of the form \[a\lambda^2 + b\lambda + c = 0,\] so let’s look at the signs of \(a,b,c\) in the only two cases where this equilibrium exists:
Case 1: \(\gamma_1<1\), \(\gamma_2<1\). Then \(a>0\), \(b>0\), \(c>0\) and by the quadratic formula (or Descartes’ rule of signs), \(\operatorname{Re}(\lambda_1),\operatorname{Re}(\lambda_2)<0\). The equilibrium is therefore stable. Some more algebra can tell us that it is in fact a stable node, but it is tedious so we won’t bother here.
Descartes’ rule of signs says that the number of positive real roots of a polynomial is at most the number of sign changes in the sequence of the polynomial’s coefficients (omitting the zero coefficients), and that the difference between these two numbers is always even.
This implies something very useful:
0 sign changes \(\implies\) 0 positive roots,
1 sign change \(\implies\) 1 positive root.
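As a quick worked instance, in Case 1 the coefficient signs are \((+,+,+)\): no sign changes, so no positive real roots. And if the roots are instead a complex conjugate pair, then \(\operatorname{Re}(\lambda) = -b/(2a) < 0\) anyway. Either way, both roots have negative real part, as claimed.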
This trick (in context) can be generalised for systems beyond two species using the Routh–Hurwitz criterion... but to do so is beyond the scope of our course.
Case 2: \(\gamma_1>1\), \(\gamma_2>1\). Then \(a<0\), \(b<0\), \(c>0\) and by the quadratic formula (or again Descartes), \(\operatorname{Re}(\lambda_1)<0<\operatorname{Re}(\lambda_2)\). The equilibrium is therefore a saddle point.
We can mark the stability on the plots, as shown in 3.9. We can work from the nullclines and equilibria to add in some sample trajectories. Witness the power of linear stability analysis! We have analysed this system in considerable detail without having to solve it.
There is an important biological interpretation to this model. A rather striking feature is that three of the four cases described above result in extinction of one of the species! The extinction of one population due to competition with another is known as the principle of competitive exclusion. In our model, we see that whether this occurs or not depends on the groups \(\gamma_1 = b\eta_2/a\) and \(\gamma_2 = d\eta_1/c\), and hence the competition coefficients (interspecies competition) and carrying capacities (intraspecies competition).
Consider a population of large animals, \(x\), and a population of smaller animals, \(y\), that compete for the same food source, such as grass in a fixed area. Suppose that \(b/a = d/c\). As the carrying capacity of the land is lower for the larger animals, we have that \(\eta_1 < \eta_2\), and as a result \(\gamma_1 > \gamma_2\). We can imagine then that we may encounter \(\gamma_1 > 1\) and \(\gamma_2 < 1\), which will result in the extinction of the larger animals leaving the smaller animal population to flourish.
We found that the only case where coexistence was permitted was \(\gamma_1<1\), \(\gamma_2<1\). What is the biological interpretation of this?
The principle of competitive exclusion is an important concept in ecology, but there is a famous counterexample. The paradox of the plankton asks why well-mixed lakes or oceans maintain dozens or hundreds of phytoplankton species, despite the fact they all consume the same resources. This is a good example of modelling failing to capture observations. Look it up!
In Problem Sheet 2, you will analyse all the equilibria of a similar system but for predator–prey dynamics, \[\begin{aligned} \mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} &= a x\left(1-\frac{x}{\eta_1}\right) - b x y,\\ \mathchoice{\frac{{\mathrm d}y}{{\mathrm d}t}}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t}{{\mathrm d}y/{\mathrm d}t} &= c y\left(1-\frac{y}{\eta_2}\right) + d x y. \end{aligned}\] You will be asked to assess whether this variation on Lotka–Volterra is a physically reasonable system. As part of this analysis, you should consider the physical interpretation of each equilibrium and whether it would be desirable for it to be stable or not.
Similar examples can be found in Additional Problem Sheet 2.
We saw in the original Lotka–Volterra model, [reducedlotvol], that it was impossible for any population to become extinct. In the competitive Lotka–Volterra model, we introduced the notion of independent growth limitation on the individual populations to our system through the logistic model.
This has reintroduced extinction: we now have a model which allows one species to ‘win’, in addition to cases where both populations settle on fixed values. But in doing so we have found limitations. The main one is that we appear to have lost the periodic/cyclic nature of the original Lotka–Volterra model. (This is in fact true on account of the so-called Dulac criterion, but we won’t prove it in this course.)
There is an analogy to be made here with our pendulum in 2. When the pendulum was frictionless, we had eigenvalues with zero real part (the degenerate case) and it would swing forever: periodic behaviour. Energy was conserved in time: if you take the kinetic energy, \(T\), and the potential energy of the system, \(V\), and let \(\mathcal{H} = T+V\), then \[\frac{\mathrm{d}\mathcal{H}}{\mathrm{d}t} = 0.\] The conserved quantity \(\mathcal{H}\) is called the Hamiltonian and manipulating it forms the basis of Hamiltonian mechanics. Systems where there is such a conserved quantity are also called Hamiltonian.
When we included the friction term, energy was no longer conserved: such systems are called dissipative. Our eigenvalues had nonzero real part and we lost periodicity.
Compare this with Lotka–Volterra. The coexistence equilibrium of Lotka–Volterra similarly has eigenvalues with zero real part. This is also a Hamiltonian system, and we have actually already seen the quantity which is conserved: it’s \(C\) in [dynsol]! When we add the logistic term, we once again lose periodic behaviour as we create a dissipative system.
This is a weakness of the Lotka–Volterra model. For a model to be robust, small modifications to it should produce qualitatively similar results. But because these modifications make the system dissipative, we lose important properties like periodicity.
In the original model, [reducedlotvol], the control of the greenfly and ladybird populations was mediated entirely by the \(xy\) interaction term. In our competitive model, the populations are also self-controlled by the logistic part of [lotvolcomp2scaled], which is independent of the interacting population, i.e., \(\eta_1\) has no dependence on the size of \(y\). A popular variation of the competition model is to make \(\eta_1\) and \(\eta_2\) depend on \(y\) and \(x\) respectively. This re-introduces the periodicity to the model, but still allows for extinction. In 4 we will encounter an example of a model which can do just this.
There has been a significant body of work regarding extensions to the predator–prey model, including the above behaviour as well as extensions to \(n\) populations. For example, for three or more populations, the system can show chaotic behaviour: a phenomenon not possible for two populations on account of (for example) the Poincaré–Bendixson theorem. Mathematicians such as Stephen Smale and Morris Hirsch have proved some deep results regarding the asymptotic (limiting) behaviour of more general systems. However, these results are somewhat beyond the scope of this course.
You can find the competitive Lotka–Volterra system in Murray, vol. I, chap. 3.5.
So far, we have looked at two variants of the Lotka–Volterra system. The original, in 1, had periodic solutions. The competitive adaptation, in 3, had no periodic solutions, but it allowed for the possibility of either species’ extinction or relaxation to a fixed value; which of these behaviours we got depended on the model’s parameters.
The values of parameters where the behaviour switches are known as bifurcation points. The business of looking for and analysing them is a large area of applied mathematics, so we will take a brief look at some simpler, one-dimensional biological models which demonstrate the technique. This is all an example-driven prelude to next term, where we will treat bifurcation theory more systematically.
As mentioned above, a bifurcation is a qualitative change in the behaviour of a model. A bifurcation of an equilibrium, for example, is a point where the stability and/or number of equilibria change, and hence the topology of the phase space will change. While solving nonlinear differential equation models can be hard, using the technique of linear stability that we have developed, we can explore how these qualitative changes in behaviour matter. Importantly, we can also relate the idea of ‘changing parameters’ to real biological parameters, such as harvesting, climate change, and many other inputs to a given biological system. We will first do this via a few examples of single-species models.
We saw in 1.2 that the logistic equation has two equilibria, one which is stable and one which is unstable. Let’s now make a small modification to the logistic equation which accounts for constant-rate harvesting, \[\begin{equation} \label{logistic-harvest} \mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} = ax\left(1-\frac{x}{K}\right) - H. \end{equation}\] Let’s look at the phase space in 4.1(a).
As \(H\) increases, we can see graphically that we go from two equilibria, to one, to none! This corresponds to the roots of \(\frac{\mathrm{d}x}{\mathrm{d}t} = 0\) in [logistic-harvest] becoming equal and then complex: the quadratic formula tells us \[\begin{equation} \label{bifurcation-equilibrium-harvesting} x_0 = \frac{K \pm \sqrt{K^2 - 4HK/a}}{2}. \end{equation}\] So if we increase \(H\), as soon as we go over \(H=aK/4\), \(\frac{\mathrm{d}x}{\mathrm{d}t}\) becomes always negative and we reach extinction in finite time. An interesting result about the dangers of overharvesting! This discontinuous jump in the long-time solution is known as a catastrophe.
In general, when changes in parameter values in our models lead to qualitative changes in the equilibria and their stability, we say a bifurcation has occurred. The parameter value at which the bifurcation occurs is called the bifurcation point.
To emphasise this point: bifurcation points are about parameters. In this harvesting model, we could ask this question more directly: ‘what are the equilibria, \(x_0\), for any given parameter, \(H\)?’ And we’ve actually answered this already, in [bifurcation-equilibrium-harvesting].
So why don’t we plot \(x_0\) as a function of \(H\)? That’s what we’ve done in 4.1(b): you’ll notice that we’ve plotted both of the ‘\(\pm\)’ branches of the solution. Plotting equilibria against parameters is what’s known as a bifurcation diagram and when people draw these, they note which branches refer to stable equilibria, and which refer to unstable equilibria. The stable equilibrium branch gets a solid line and the unstable equilibrium branch gets a dotted line. They can be a bit hard to get your head around to start with, so I will always draw them in purple to remind you that it’s not a phase space.
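A sketch of how you might draw a diagram like 4.1(b) yourself, plotting both branches of [bifurcation-equilibrium-harvesting] (the values of \(a\) and \(K\) are arbitrary):

```python
import numpy as np
import matplotlib.pyplot as plt

a, K = 1.0, 1.0  # arbitrary values; the bifurcation then sits at H = aK/4
H = np.linspace(0, a*K/4, 200)
disc = np.sqrt(K**2 - 4*H*K/a)

plt.plot(H, (K + disc)/2, '-', label='stable branch')    # f'(x0) < 0 here
plt.plot(H, (K - disc)/2, ':', label='unstable branch')  # f'(x0) > 0 here
plt.xlabel('H'); plt.ylabel(r'$x_0$'); plt.legend(); plt.show()
```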
Can you spot where the bifurcation point is on the diagram? It’s where the unstable and stable lines meet. We’ll see a few more of these diagrams shortly.
One cannot take a mathematical biology course without encountering the spruce budworm. It shows how a simple, one-dimensional system can yield rich dynamics with relevant predictions of a biological phenomenon. The spruce budworm, shown in 4.2, is a moth that infests spruce trees (a type of conifer) in eastern Canada and the US. The moths produce larvae which feed on the needles of the conifers: a serious infestation can lead to complete defoliation of a forest in about four years, and so the budworm is considered one of the most destructive pests in North America.
The budworm is preyed upon by birds, which largely ignore the budworm at low population sizes and only begin to feed on it in earnest once the population has reached a certain level. Assuming logistic growth for the budworm population in the absence of birds, the budworm population size, \(x\), is governed by the following variation on the logistic equation, \[\frac{\mathrm{d}x}{\mathrm{d}t} = rx\left(1-\frac{x}{K}\right) - \frac{Bx^2}{A^2 + x^2}.\] What does the new term tell us? As \(x \to \infty\), the predation rate saturates at \(B > 0\), similar to our harvesting model. Meanwhile, \(A > 0\) provides a measure of the threshold population size at which predation suddenly increases.
Nondimensionalising with \[x = A \widehat{x}, \quad t = \frac{A}{B}\widehat{t},\] and dropping hats, we get \[\begin{equation} \label{budworm-nondim} \mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} = Rx\left(1-\frac{x}{k}\right) - \frac{x^2}{1+x^2}, \end{equation}\] where \(R=rA/B\) and \(k=K/A\). Once again nondimensionalisation has reduced the parameter space, from four to two!
The equilibria are given by \(x=0\) and \[\begin{equation} R\left(1-\frac{x}{k}\right) = \frac{x}{1+x^2}, \label{spruce-rk} \end{equation}\] where the logistic growth is balanced by the predation. Thus, the equilibria occur when the line given by the left-hand side of [spruce-rk] intersects the curve given by the right-hand side. Naturally, we’d like to explore how these equilibria vary with \(R\) and \(k\) and, in doing so, we see the benefit of our choice of nondimensionalisation.
The curve given by \(g(x) = x/(1 + x^2)\) does not depend on either parameter, and as for the line \(h(x) = R(1-x/k)\), \(R\) is the \(y\)-intercept, while \(k\) is where it crosses the \(x\)-axis. This is plotted in 4.3, and as you can see, this makes graphical evaluation quite easy!
Just like in the harvesting model (where we changed the parameter \(H\)), we see that changing \(R\) and \(k\) can lead to us going from three additional equilibria (we already have \(x=0\)), to two, to one: more bifurcations.
We can go further and classify the equilibria in these regions: as we saw in 1.3, for one dimension it is quite easy. Just to the right of \(x=0\), let’s say \(x=\varepsilon\), you can see from [budworm-nondim] that \(\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} > 0\). So \(x=0\) is unstable. Then, thinking about the phase portrait of \(\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}\) against \(x\), by the continuity of \(\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}\), the stability must alternate.
Which equilibria \(a,b,c\) (in addition to the \(x=0\) one) do we have in each region?
| region | equilibria | | | |
|---|---|---|---|---|
| 1 extra equilibrium (high \(R\), high \(k\)) | \(x=0\) | | | \(x=c\) |
| 3 extra equilibria | \(x=0\) | \(x=a\) | \(x=b\) | \(x=c\) |
| 1 extra equilibrium (low \(R\), low \(k\)) | \(x=0\) | \(x=a\) | | |
| stability must be | unstable | stable | unstable | stable |
| name | extinction | refuge | threshold | outbreak |
What are the biological interpretations of these equilibria? The equilibrium \(a\) corresponds to the refuge population level, and \(c\) to the outbreak population level. The unstable equilibrium \(b\) is referred to as the threshold as initial populations \(x > b\) increase to the outbreak level, while for \(x < b\) the population will decrease to the refuge size.
Although we have two parameters we can vary, we could still draw a bifurcation diagram by holding one of the parameters fixed and varying the other. So we could ask ‘fixing \(k\), what are the equilibria, \(x_0\), for any given parameter, \(R\)?’
In order to do this, we need to solve for \(x\) in [spruce-rk]. This is an ugly cubic but you can trust me that it’s possible to do analytically (I don’t expect you to work this out). We can plot all the branches where the solutions are real on one diagram: see 4.4(a). Once again, when the equilibrium \(x_0\) is unstable, the line is dotted; when it’s stable, the line is solid.
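In practice you can also skip the analytic solution: rearranging [spruce-rk] gives the cubic \(x^3 - kx^2 + (1 + k/R)x - k = 0\), and numpy will happily find its real roots for each \(R\). A sketch (with \(k\) fixed at an arbitrary value large enough to show the fold):

```python
import numpy as np
import matplotlib.pyplot as plt

k = 10.0  # fixed; an arbitrary illustrative value
for R in np.linspace(0.05, 0.8, 300):
    # [spruce-rk] rearranged: x^3 - k x^2 + (1 + k/R) x - k = 0
    roots = np.roots([1, -k, 1 + k/R, -k])
    real = roots[np.abs(roots.imag) < 1e-9].real
    plt.plot([R]*len(real), real, 'k.', ms=2)

plt.xlabel('R'); plt.ylabel(r'$x_0$'); plt.show()
```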
Convince yourself that this corresponds to 4.3 where \(k\) is fixed and \(R\) is varied.
One important feature of this system is that it can exhibit hysteresis: varying the parameters and then returning them to their initial values does not necessarily return the system to its original state.
Go back to 4.3. Suppose that the parameters \(R\) and \(k\) are such that we have four equilibria (including \(x = 0\)). Let’s suppose that the population is at the refuge size, \(x = a\). Now, imagine that \(R\) is increased and the bifurcation occurs making \(x=c\) the only stable equilibrium. The system will move to the outbreak size.
Even if we reduce \(R\) to its original value, we will be above the threshold and continue to move to \(c\) rather than \(a\). Thus, once the outbreak occurs, reducing \(R\) to its original value won’t solve the problem.
We can see how this plays out on the bifurcation diagram in 4.4(b). If we want four equilibria we should set \(R\) so that it’s in the middle of the ‘S’ shape. If we start at the equilibrium \(x_0=a\), we must be on the lower branch of the ‘S’. Increasing \(R\), we move to the right until we have to jump! The only stable equilibrium is on the top branch, so we jump up to the top. Now reducing \(R\) again, we slide down this top branch, but you see that when we return to the original value of \(R\) we are at a different equilibrium.
How might you go about returning the population to the refuge value?
The bifurcation diagrams we have seen here are a little too complicated to expect you to draw in an exam, although you are expected to understand them. But there are simpler equations than the two biological models here which also produce bifurcations, and in fact these simpler equations capture the generic scenarios which appear in many, more complicated biological models. So let’s look at them now.
In general, a single-species model (that is, a single ODE), can only do so many different things as parameters are varied. In particular, the only long-time behaviour that can occur for a finite population is that it tends to a steady state value.
Can you explain why the solution \(x(t)\) to a single first-order ODE must go to an equilibrium (or blow-up to \(\pm \infty\)) as \(t \to \infty\)? Try explaining this to a friend who has seen ODEs but has not done any dynamics.
But what sorts of different things can happen as parameters vary? Well it turns out that in general you can have a number of different scenarios, depending on how many parameters vary, and how ‘generic’ or commonplace we expect these behaviours to be. Rather than formally and systematically listing all of these bifurcations, let’s just consider the two most common examples, and one example of a ‘complex’ bifurcation diagram formed by combining these basic bifurcations together. Importantly, these are not just examples for single-species models, but they can and do occur commonly in larger models of many species.
A saddle node bifurcation (sometimes called a fold bifurcation or annihilation point) occurs when an unstable equilibrium and a stable equilibrium ‘collide’ at the same value, and then both cease to exist afterwards. This is what happened in 4.1.1 as \(H\) increased beyond \(aK/4\), our bifurcation point. Confirm you agree with this by looking at 4.1.
The model there was a little awkward to draw the bifurcation diagram for, but the prototypical model of the saddle node bifurcation is the much simpler equation \[\begin{equation} \label{prototypical-saddle-node} \frac{\mathrm{d}x}{\mathrm{d}t} = p+x^2, \end{equation}\] where \(p\) is a parameter.
Draw the phase space of [prototypical-saddle-node] and check how many equilibria it has, and what their stabilities are, as \(p\) varies. Don’t worry about negative \(x\) not being feasible for representing biological populations: here we are looking at the broader mathematical principle.
By plotting the equilibria against the parameter, we create a bifurcation diagram, and you can see it in 4.5(a).
Another type of bifurcation is the pitchfork bifurcation, and the prototypical example is \[\begin{equation} \label{prototypical-pitchfork} \mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} = x(p - x^2), \end{equation}\] for a parameter \(p\). Think about the phase space of this equation: this is a negative cubic with roots at \(x=0\) and \(x=\pm\sqrt{p}\) if \(p>0\). If \(p<0\), then the only real root is \(x=0\). The two possible phase spaces are drawn in 4.6, where you can infer the stability.
Calling our equilibria \(x_0\) again, consider the stability of the \(x_0=0\) equilibrium. If \(p<0\), it’s stable, and is the only equilibrium. As we increase \(p\) so it becomes positive, this equilibrium becomes unstable, and at the same time we get two new symmetric stable equilibria. The corresponding bifurcation diagram is shown in 4.5(b), where you can see from where the name ‘pitchfork’ comes!
We can combine these two simple models to find more complex behaviour, the type of which we have already seen in the spruce budworm model.
Let’s take the pitchfork prototype and add the shift from the saddle node prototype: in other words, let’s consider the two-parameter model \[\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} = p + x(q-x^2),\] for two parameters \(p\) and \(q\). For this discussion, we’ll fix \(q=1\) but we could do the same analysis as what follows by fixing \(p\). So we have \[\begin{equation} \label{prototypical-hysteresis} \mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} = p + x(1-x^2). \end{equation}\] What does the phase space look like and how many equilibria do we have for any \(p\)?
The phase space is drawn in 4.7 and we can see that equilibria appear and disappear as we change \(p\). We can find the equilibria, \(x_0\), in terms of \(p\) by once again solving this equation when \(\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}=0\). Plotting these \(x_0\) when they are real as a function of \(p\) gives us the bifurcation diagram in 4.5(c).
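A cheap trick for drawing 4.5(c) yourself: rather than solving the cubic for \(x_0\) given \(p\), note that the equilibria satisfy \(p = x_0^3 - x_0\), so we can sweep \(x_0\) and plot the corresponding \(p\). A sketch:

```python
import numpy as np
import matplotlib.pyplot as plt

# Equilibria of dx/dt = p + x(1 - x^2) satisfy p = x0^3 - x0, so sweep x0
x0 = np.linspace(-1.6, 1.6, 400)
p = x0**3 - x0
stable = 1 - 3*x0**2 < 0  # stable where f'(x0) = 1 - 3 x0^2 < 0

plt.plot(p[stable], x0[stable], 'b.', ms=2, label='stable')
plt.plot(p[~stable], x0[~stable], 'r.', ms=2, label='unstable')
plt.xlabel('p'); plt.ylabel(r'$x_0$'); plt.legend(); plt.show()
```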
Just like in the spruce budworm, we see we can vary \(p\) such that returning \(p\) to its initial value doesn’t return the system to its original state: hysteresis. If you don’t see why, read again the discussion of the spruce budworm above and see that it applies exactly to this model.
These three bifurcation phenomena – saddle nodes, pitchforks and hysteresis – will reappear next term: for now, we’re learning about them because with these bifurcation diagrams you can read off essentially the entire system’s behaviour for any given parameter. Pretty cool!
We promised that we would look at a two-dimensional model which also experiences bifurcations, displaying both periodic behaviour as well as decay and relaxation. For this we step into another important area of mathematical biology: biochemical reactions.
The reason that the Lotka–Volterra model has Lotka’s name attached is because the American biophysicist Alfred J. Lotka (1880–1949) stumbled upon the same system independently between 1910 and 1925, initially in the context of how chemicals interact.
Biochemical reactions are extremely important for biological function. They are involved in metabolism and its control, immunological responses, and cell-signalling processes. Biochemical processes are often controlled by enzymes. Enzymes are proteins that catalyse biochemical reactions by lowering the activation energy. Let’s have a look at such a reaction.
Consider the chemical reaction mechanism, \[U \xrightleftharpoons[k_{-1}]{k_1} A, \qquad B \xrightarrow{k_2} V, \qquad 2U+V \xrightarrow{k_3} 3U.\]
This system can be represented in nondimensional form by the system of equations, \[\begin{align} \label{limcyc} \mathchoice{\frac{{\mathrm d}u}{{\mathrm d}t}}{{\mathrm d}u/{\mathrm d}t}{{\mathrm d}u/{\mathrm d}t}{{\mathrm d}u/{\mathrm d}t} &= a - u +u^2v,\\ \mathchoice{\frac{{\mathrm d}v}{{\mathrm d}t}}{{\mathrm d}v/{\mathrm d}t}{{\mathrm d}v/{\mathrm d}t}{{\mathrm d}v/{\mathrm d}t} &= b - u^2v, \end{align}\] with constants \(a,b>0\).
We can translate between chemical (stoichiometric) equations and ODEs using the law of mass action. The law says that the rate of a reaction is proportional to the product of the concentrations of the reactants.
Suppose there are \(M\) elementary reactions of \(N\) molecules, or chemical species, \(A_i\), with concentrations \(a_i\). We write the \(j\)th reaction as \[\alpha_{1j}A_1 + \alpha_{2j}A_2 + \dots + \alpha_{Nj}A_N \xrightleftharpoons[k_j^-]{k_j^+} \beta_{1j}A_1 + \beta_{2j}A_2 + \dots + \beta_{Nj} A_N,\] with stoichiometric parameters \(\alpha_{ij}, \beta_{ij} \geq 0\), and rate constants \(k_j^{+,-} \geq 0\).
The law of mass action gives the differential equation for the concentration \(a_i\) as \[\frac{\mathrm{d}a_i}{\mathrm{d}t} = \sum_{j=1}^M(\beta_{ij}-\alpha_{ij})\left(k_j^+\prod_{\ell=1}^N a_\ell^{\alpha_{\ell j}} - k_j^-\prod_{\ell=1}^N a_\ell^{\beta_{\ell j}} \right).\]
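As a worked example, let’s apply this to the reaction mechanism above, making the usual Schnakenberg assumption that the concentrations of \(A\) and \(B\) are held effectively constant (say by a large reservoir) at \(\bar{a}\) and \(\bar{b}\). The law of mass action then gives \[\frac{\mathrm{d}u}{\mathrm{d}t} = k_{-1}\bar{a} - k_1 u + k_3 u^2 v, \qquad \frac{\mathrm{d}v}{\mathrm{d}t} = k_2\bar{b} - k_3 u^2 v,\] since the autocatalytic step \(2U + V \xrightarrow{k_3} 3U\) creates one net \(U\) and destroys one \(V\) at rate \(k_3 u^2 v\). A suitable rescaling of \(u\), \(v\) and \(t\) then produces [limcyc].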
In Epiphany term you will analyse a variant of this system which includes spatial variables; this system will explain the tree-like growth of one of the largest single cell organisms, known as Acetabularia (see 4.8).
The system, [limcyc], was developed by Jürgen Schnakenberg in 1979 as a ‘simple’ chemical reaction which can exhibit very different behaviour, depending on the value of the parameters \(a\) and \(b\). In 4.9 we see solutions to this system which show:
a stable node (\(a=1\), \(b=2\)),
a stable spiral (\(a= 0.19\), \(b=0.55\)), and
(convergence to) fixed periodic behaviour (\(a=0.1\), \(b=0.5\)).
The question we seek to answer here is how we can identify which parameters \((a,b)\) lead to the differing behaviour, and hence – critically – where does the behaviour switch?
Have a go at reproducing the plots in 4.9 in Python by modifying your existing code. How sensitive is the system to small changes in \(a\) and \(b\)?
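If you want a starting point, a minimal sketch with scipy is below; the initial condition is my arbitrary choice:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import solve_ivp

def schnakenberg(t, z, a, b):
    u, v = z
    return [a - u + u**2 * v, b - u**2 * v]

# The three parameter pairs used in 4.9; (1, 1) is an arbitrary start point
for a, b in [(1.0, 2.0), (0.19, 0.55), (0.1, 0.5)]:
    sol = solve_ivp(schnakenberg, (0, 100), [1.0, 1.0], args=(a, b),
                    dense_output=True)
    t = np.linspace(0, 100, 2000)
    u, v = sol.sol(t)
    plt.plot(u, v, label=f'a = {a}, b = {b}')

plt.xlabel('u'); plt.ylabel('v'); plt.legend(); plt.show()
```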
This is an important aspect of mathematical biology (and indeed other areas of applied mathematics). The idea is that if we know our model can accurately represent a given real life system (as in the Acetabularia case) then we can use the kind of analysis we are about to introduce in order to control the system’s behaviour.
We haven’t spoken much about periodic, or oscillatory, behaviour so far, other than saying that periodic behaviour is seen in the original Lotka–Volterra model and in the lynx/hare population it models. In bioscience, where we are often modelling biochemical reactions like we have here, periodic behaviour is extremely common: think heartbeats, breathing and nerve impulses.
So if we wanted our enzyme system to display periodic behaviour, we should influence the system to achieve (for example) the parameters \(a=0.1\), \(b=0.5\), which have been identified using the analysis we will see here.
It is straightforward to see that the equilibrium takes the form \[\begin{equation} \label{enzyme-equil} u_0 = a+b,\quad v_0 = \frac{b}{(a+b)^2}. \end{equation}\] See how this is true in 4.9(a) and (b). Because \(a,b>0\), this equilibrium is always permissible: if we are to find bifurcations, they will be of the ‘change in behaviour’ type, rather than the ‘change in number of equilibria’ type.
The Jacobian of the system is \[\mathsfbfit{J} = \begin{pmatrix}-1+2u_0 v_0 & u_0^2 \\ -2u_0v_0 & -u_0^2 \end{pmatrix}.\]
We once again look for eigenvalues by looking to solve \[\det(\mathsfbfit{J}-\lambda\mathsfbfit{I}) = 0.\]
Let’s go hunting for the values of \(a\) and \(b\) which produce fixed-amplitude oscillatory behaviour in the linearised system. Recalling 3.1, equilibria about which we find oscillatory behaviour that doesn’t decay or grow are known as centres. These require purely imaginary eigenvalues, i.e. \[\begin{equation} \label{req-for-cyclic-2} \operatorname{tr}(\mathsfbfit{J})^2 - 4 \det(\mathsfbfit{J}) < 0, \qquad \det(\mathsfbfit{J}) > 0, \quad \text{and} \quad \operatorname{tr}(\mathsfbfit{J}) = 0. \end{equation}\] The trace and determinant of our matrix, evaluated at the equilibrium, are \[\operatorname{tr}(\mathsfbfit{J}) = \frac{b-a}{a+b} -(a+b)^2,\quad \det(\mathsfbfit{J}) = (a+b)^2.\] Since \(a,b>0\), we have \(\det(\mathsfbfit{J})>0\): the determinant automatically satisfies the second required condition for periodic behaviour.
Zero trace requires \[\begin{equation} b-a = (a+b)^3. \label{zero-trace-enzyme} \end{equation}\] The condition \(\operatorname{tr}(\mathsfbfit{J}) = 0\) automatically satisfies the first condition in [req-for-cyclic-2] and so, as long as we pick \(a\) and \(b\) such that \(b-a = (a+b)^3\), we will have periodic solutions. This relation was used to get the curve shown in 4.9(c).
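These manipulations are easy to check symbolically; a quick sanity check (again assuming the Schnakenberg reaction terms mentioned above):

```python
# Symbolic check of the equilibrium, trace, determinant and the
# zero-trace (Hopf) relation, assuming f = a - u + u^2 v, g = b - u^2 v.
import sympy as sp

a, b, u, v = sp.symbols('a b u v', positive=True)
f = a - u + u**2 * v        # du/dt
g = b - u**2 * v            # dv/dt

J = sp.Matrix([[f.diff(u), f.diff(v)],
               [g.diff(u), g.diff(v)]])
eq = {u: a + b, v: b / (a + b)**2}      # the equilibrium [enzyme-equil]

tr = sp.simplify(J.trace().subs(eq))    # should match (b-a)/(a+b) - (a+b)**2
det = sp.simplify(J.det().subs(eq))     # should match (a+b)**2
print(tr, det)

# tr = 0 is exactly the curve b - a = (a + b)^3:
print(sp.simplify(tr - ((b - a) - (a + b)**3) / (a + b)))   # 0
```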
Something we could do is plot [zero-trace-enzyme] in \((b,a)\) parameter space. Solving this cubic is messy, but the line is drawn in 4.10(a), and it splits the plane into two regions. But what are these regions?
Look back at the trace/determinant stability table. A centre (what we’ve found) happens when you go from a stable spiral to an unstable spiral, or vice versa: the trace changes sign, and so the centre corresponds to the parameter values where the equilibrium of the system changes stability. So this line also represents a bifurcation! On one side of the line the parameters are such that our equilibrium is a stable spiral, and on the other side it is an unstable spiral. This scenario – a transition between an unstable and a stable spiral – is a special type of bifurcation called a Hopf bifurcation, and you will see it again next term.
Zooming out, you can do this type of analysis for all possible changes between types of equilibrium: you can see the regions you’d get on 4.10(b). We have only uncovered one small part of this complicated system, but it turns out to be the most important, because only this line represents the change from a stable to an unstable equilibrium: it represents the only bifurcation.
What we see in 4.9 – convergence to an oscillatory solution – is known as a limit cycle. You can prove that these must exist using the Poincaré–Bendixson theorem. Contrast this to Lotka–Volterra, where the oscillations (in particular their amplitudes) are highly dependent on their initial conditions. For this set of parameter values, all initial conditions approach a single closed orbit in phase space – the limit cycle. This indicates that there is a natural oscillatory behaviour of the system that is independent of the initial conditions. Perhaps this is what Volterra was after!
The main ingredient missing from all our models so far is spatial dependence. For example, if our ladybirds and greenflies are originally in different parts of their ecosystem, they cannot interact. But the interaction terms (the ones proportional to \(xy\)) have no dependence on the position of the species; the model just assumes the species interact at a given frequency regardless. Our next aim must be to model the movement in space of the populations \(x\) and \(y\). With this in mind, we turn to the notion of diffusion.
The spruce budworm is covered in Murray, vol. I, chap. 1.2.
Enzyme kinetics are covered in Murray, vol. I, chap. 6.
Thus far, we’ve used ODEs to generate mathematical models of population dynamics, biochemical reactions, and epidemiological phenomena (in the problems classes). In modelling these biological systems in this way, we have implicitly assumed that the phenomena that can occur can be described by functions of time that evolve deterministically. It’s not very hard to envision situations, including those we’ve already discussed, that may also have a spatial dependence. For example, one can easily imagine a population of bacteria or another kind of microorganism that may initially be localised in space, but owing to motility, it may spread over time. As a result, we would not only observe growth in the population size due to cell division, but also in the region of space which the population occupies.
From this point on, we’ll discuss how ODE models can be augmented to become partial differential equations (PDEs) that allow for both temporal and spatial evolution. In particular, we’ll be focusing on a class of PDEs known as reaction–diffusion equations where the motion of the species from one location to another is modelled by diffusion. We’ll examine cases where these equations permit travelling wave solutions (this term) as well as stationary patterns (next term), and discuss how these arise in biological models.
Modelling the behaviour of each individual member of a population is a significantly difficult task. How do we represent the motivation of each individual to move? The answer is ‘lots of intensive computational power’, but even this is limited. This is what drives us to concentrate on the bulk statistical behaviour of a large population.
Random movement is one of the simplest kinds of motion, and it is very common in nature. Briefly, the history of diffusion (and mathematical models of it) goes as follows. In 1822, Jean-Baptiste Joseph Fourier published the first example of what we know as the heat equation, as he was interested in describing how heat flows through different materials with different thermal conductivities (think of metal vs wood). Fourier derived this equation by an argument regarding the conservation of heat. We will see an analogue of this derivation using conservation of mass in 5.1.2 below. He also developed a method to find solutions to this equation given an initial heat distribution using sines and cosines, which we will do in 5.3.
Unrelated to this, in 1827 the botanist Robert Brown was looking through a microscope at pollen, and noticed that the particles of pollen seemed to jiggle around randomly. For many years these random movements were interesting, but difficult to study or understand. Finally, in 1905 Albert Einstein published a paper connecting this so-called ‘Brownian motion’ to the heat equation, providing a concrete connection between microscopic random movement and macroscopic diffusion. We will see a version of this connection below in terms of random movement, but the key idea is that the diffusion equation (aka the heat equation) is intimately tied to the small-scale random movement of whatever is being modelled – either chemical species in water, cells or animals moving around randomly in an ecosystem, or even human migration in a new territory. Of course animals, people, and even molecules have other modes of transportation – advection in a fluid flow, or directed movement towards where we might want to be for instance – but diffusion is one of the simplest kinds of motion that can be found in nature, and it plays a crucial role in many kinds of biological systems.
As a concrete example, let us consider an individual on a line, initially at a position \(x=0\). The individual moves left or right in integer steps, with a probability \(p\) of moving right and hence \(1-p\) of moving left. After \(n\) steps a path can be encoded as LRLLRRR…. If we repeat this process a large number of times we get a set of paths – a set of random walks. In 5.1, we show five random walks of length \(100\) for each of \(p=0.5\), \(0.6\) and \(0.9\). Here you see the individual paths; in 5.2 we see the same for 200 walks! For \(p=0.5\), the end positions are reasonably evenly spread about \(x=0\). For \(p=0.6\), the story is similar, except that the spread is about the mean position, \(x=2np-n\), and the spread is not equal either side of this mean. This nonzero mean, indicated by the dotted line, can be interpreted as a natural drift of the set of individuals, all starting at the same point, due to the probability bias.
In 5.3, we see bar charts of the final displacement as a function of the total number of steps \(n\) for \(p=0.8\), a reasonably biased walk. As \(n\) increases, the distribution becomes gradually more symmetric. Indeed, since the number of right-steps is binomially distributed, we should have expected this: we know that the binomial distribution \(B(n,p)\) is well approximated by a normal distribution with mean \(np\) and variance \(np(1-p)\) as \(n\) gets large. We shall shortly see the importance of this example.
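These experiments are easy to reproduce yourself; here is a minimal numpy sketch in the spirit of 5.1–5.3 (the particular values of \(p\), the seed, and the plot styling are illustrative choices):

```python
# Minimal sketch of biased random walks and their final displacements.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n, walks, p = 100, 200, 0.8

# Each step is +1 with probability p and -1 otherwise.
steps = np.where(rng.random((walks, n)) < p, 1, -1)
paths = np.cumsum(steps, axis=1)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(paths[:5].T, lw=0.8)                 # five individual paths
ax1.axhline(n * (2*p - 1), ls=':', c='k')     # the mean drift 2np - n
ax1.set_xlabel("step"); ax1.set_ylabel("displacement")
ax2.hist(paths[:, -1], bins=20)               # spread of the end positions
ax2.set_xlabel("final displacement")
plt.show()
```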
In our course, we will focus on the continuum limit of such a model. Rather than making the steps of size \(1\), we specify them to be \({\mathrm d}x\), which will be vanishingly small, and each step is taken in a time \({\mathrm d}t\). We let \(c(x,t)\) be the continuous probability that a particle (population member) has displacement \(x\) at time \(t\). This implies that at a time \(t-{\mathrm d}t\), the particle must have been at either \(x-{\mathrm d}x\) or \(x+{\mathrm d}x\). If \(p\) is the probability that the particle moves right, we therefore have \[c(x,t)= p c(x-{\mathrm d}x,t-{\mathrm d}t) + (1-p) c(x+{\mathrm d}x,t-{\mathrm d}t).\] If we Taylor expand this to \(\mathcal{O}(\varepsilon^2)\), where \(\varepsilon= \max({\mathrm d}t,{\mathrm d}x)\), we obtain \[\begin{aligned} c(x,t) &\approx p\left[c(x,t) - {\mathrm d}x \mathchoice{\frac{\partial c}{\partial x}}{\partial c/\partial x}{\partial c/\partial x}{\partial c/\partial x} - {\mathrm d}t \mathchoice{\frac{\partial c}{\partial t}}{\partial c/\partial t}{\partial c/\partial t}{\partial c/\partial t} + \frac{1}{2}\left( {\mathrm d}x^2 \mathchoice{\frac{\partial^2 c}{\partial x^2}}{\partial^2 c/\partial x^2}{\partial^2 c/\partial x^2}{\partial^2 c/\partial x^2} + 2 \, {\mathrm d}x \, {\mathrm d}t \mathchoice{\frac{\partial^2 c}{\partial x \partial t}}{\partial^2 c /\partial x \partial t}{\partial^2 c /\partial x \partial t}{\partial^2 c /\partial x \partial t} + {\mathrm d}t^2 \mathchoice{\frac{\partial^2 c}{\partial t^2}}{\partial^2 c/\partial t^2}{\partial^2 c/\partial t^2}{\partial^2 c/\partial t^2} \right)\right] \\ & \quad + (1-p)\left[c(x,t) + {\mathrm d}x \mathchoice{\frac{\partial c}{\partial x}}{\partial c/\partial x}{\partial c/\partial x}{\partial c/\partial x} - {\mathrm d}t \mathchoice{\frac{\partial c}{\partial t}}{\partial c/\partial t}{\partial c/\partial t}{\partial c/\partial t} + \frac{1}{2}\left( {\mathrm d}x^2 \mathchoice{\frac{\partial^2 c}{\partial x^2}}{\partial^2 c/\partial x^2}{\partial^2 c/\partial x^2}{\partial^2 c/\partial x^2} - 2 \, {\mathrm d}x \, {\mathrm d}t \mathchoice{\frac{\partial^2 c}{\partial x \partial t}}{\partial^2 c /\partial x \partial t}{\partial^2 c /\partial x \partial t}{\partial^2 c /\partial x \partial t} + {\mathrm d}t^2 \mathchoice{\frac{\partial^2 c}{\partial t^2}}{\partial^2 c/\partial t^2}{\partial^2 c/\partial t^2}{\partial^2 c/\partial t^2} \right)\right] \nonumber\\ &= c(x,t) + (1-2p){\mathrm d}x \mathchoice{\frac{\partial c}{\partial x}}{\partial c/\partial x}{\partial c/\partial x}{\partial c/\partial x} - {\mathrm d}t \mathchoice{\frac{\partial c}{\partial t}}{\partial c/\partial t}{\partial c/\partial t}{\partial c/\partial t} + \frac{1}{2}{\mathrm d}x^2 \mathchoice{\frac{\partial^2 c}{\partial x^2}}{\partial^2 c/\partial x^2}{\partial^2 c/\partial x^2}{\partial^2 c/\partial x^2} - (1-2p)\, {\mathrm d}x \, {\mathrm d}t \mathchoice{\frac{\partial^2 c}{\partial x \partial t}}{\partial^2 c /\partial x \partial t}{\partial^2 c /\partial x \partial t}{\partial^2 c /\partial x \partial t} + \frac12 {\mathrm d}t^2 \mathchoice{\frac{\partial^2 c}{\partial t^2}}{\partial^2 c/\partial t^2}{\partial^2 c/\partial t^2}{\partial^2 c/\partial t^2}, \end{aligned}\] so that to order \(\mathcal{O}(\varepsilon^2)\), \[\mathchoice{\frac{\partial c}{\partial t}}{\partial c/\partial t}{\partial c/\partial t}{\partial c/\partial t} = (1-2 p)\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm
d}t}\mathchoice{\frac{\partial c}{\partial x}}{\partial c/\partial x}{\partial c/\partial x}{\partial c/\partial x} + \frac12 \frac{({\mathrm d}x)^2}{{\mathrm d}{t}} \mathchoice{\frac{\partial^2 c}{\partial x^2}}{\partial^2 c/\partial x^2}{\partial^2 c/\partial x^2}{\partial^2 c/\partial x^2} - (1-2p)\,\mathrm{d}x\mathchoice{\frac{\partial^2 c}{\partial x \partial t}}{\partial^2 c /\partial x \partial t}{\partial^2 c /\partial x \partial t}{\partial^2 c /\partial x \partial t} + \frac12 {\mathrm d}t \mathchoice{\frac{\partial^2 c}{\partial t^2}}{\partial^2 c/\partial t^2}{\partial^2 c/\partial t^2}{\partial^2 c/\partial t^2}.\] Taking the limit \({\mathrm d}x\to 0\) and \({\mathrm d}t\to 0\), the final two terms will vanish. The coefficient \((1-2p)\,\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}\) has the dimensions of a velocity, so we set \((2p-1)\,\mathchoice{\frac{{\mathrm d}x}{{\mathrm d}t}}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t}{{\mathrm d}x/{\mathrm d}t} = v\) (with the flipped sign so that it points in the positive direction for high \(p\) biases) and we define a constant \(D\), \[D = \frac12 \frac{({\mathrm d}x)^2}{{\mathrm d}t},\] which we call the diffusion constant for reasons which will follow. One should note that the ratio \(v/D\) will diverge as \({\mathrm d}x \to 0\) unless \(p\) is very close to \(1/2\), so as the step size gets smaller, the velocity term (the so-called advective term) dominates. Then we have \[\begin{equation} \label{advectiondiffusion} \mathchoice{\frac{\partial c}{\partial t}}{\partial c/\partial t}{\partial c/\partial t}{\partial c/\partial t} = -v\mathchoice{\frac{\partial c}{\partial x}}{\partial c/\partial x}{\partial c/\partial x}{\partial c/\partial x} + D \mathchoice{\frac{\partial^2 c}{\partial x^2}}{\partial^2 c/\partial x^2}{\partial^2 c/\partial x^2}{\partial^2 c/\partial x^2}, \end{equation}\] a partial differential equation determining the behaviour of \(c(x,t)\). Note that if \(p=1/2\), there is no velocity: we shall come to associate this term with a drift. The physical description of the term, \(D\), can be seen by deriving [advectiondiffusion] in a continuum setting, as we’ll do now.
Let \(S\) be a surface, surrounding a volume \(V\), containing a density \(c(\mathbfit{x},t)\). This could be a population density or maybe a chemical density. Basic conservation of mass (see 5.4) says that \[\mathchoice{\frac{{\mathrm d}}{{\mathrm d}t}}{{\mathrm d}/{\mathrm d}t}{{\mathrm d}/{\mathrm d}t}{{\mathrm d}/{\mathrm d}t}\int_{V}c(\mathbfit{x},t) \, {\mathrm d}V = -\int_{S}\mathbfit{J}\cdot {\mathrm d}\mathbfit{S} + \int_{V} f \, {\mathrm d}V,\] where \(\mathbfit{J}\) is the flow of \(c(\mathbfit{x},t)\) through a surface element \({\mathrm d}\mathbfit{S}\), and \(f\) is some source of \(c(\mathbfit{x},t)\) in the body \(V\) (ants coming up through the ground or a chemical reaction!). Differentiating under the integral on the left-hand side (a process also known as Leibniz’ integral rule or the Reynolds transport theorem), we obtain \[\mathchoice{\frac{{\mathrm d}}{{\mathrm d}t}}{{\mathrm d}/{\mathrm d}t}{{\mathrm d}/{\mathrm d}t}{{\mathrm d}/{\mathrm d}t}\int_{V}c(\mathbfit{x},t) \, {\mathrm d}V = \int_{V}\mathchoice{\frac{\partial c}{\partial t}}{\partial c/\partial t}{\partial c/\partial t}{\partial c/\partial t}\,{\mathrm d}V + \int_{S}c(x,t)\mathbfit{v}\cdot {\mathrm d}{\mathbfit{S}}\] where \(\mathbfit{v}\) is the local velocity field of the changing shape of the surface \(S\). Using the divergence theorem on the surface integrals we have, altogether, \[\int_{V}\left[\mathchoice{\frac{\partial c}{\partial t}}{\partial c/\partial t}{\partial c/\partial t}{\partial c/\partial t}+ \boldsymbol{\nabla}\cdot (c \mathbfit{v}) + \boldsymbol{\nabla}\cdot \mathbfit{J} - f \right]{\mathrm d}{V} =0.\] If we assume \(c(\mathbfit{x},t)\) is sufficiently differentiable, we can shrink \(V\) infinitesimally to obtain the general advection–diffusion law, \[\begin{equation} \label{adv-diff-law} \mathchoice{\frac{\partial c}{\partial t}}{\partial c/\partial t}{\partial c/\partial t}{\partial c/\partial t} = - \boldsymbol{\nabla}\cdot (c \mathbfit{v}) - \boldsymbol{\nabla}\cdot \mathbfit{J} + f. \end{equation}\] In this case, \(\mathbfit{v}\) is the velocity of motion of the concentration \(c(\mathbfit{x},t)\) at the point \(\mathbfit{x}\) and time \(t\). The term \(- \boldsymbol{\nabla}\cdot (c \mathbfit{v})\) is the advection term, which is associated with motion of the concentration; and the term \(\boldsymbol{\nabla}\cdot \mathbfit{J}\) is the diffusion (or diffusive flux), which is associated with the spreading out of the density.
In this form the equation is not complete (too many unknowns \(f\), \(\mathbfit{v}\), \(c\), \(\mathbfit{J}\) for one equation). So we have to make some further assumptions; these are called constitutive laws. For example, let’s say \(\mathbfit{v}\) is constant – that is to say the concentration’s centre of mass is moving with a constant velocity – and we assume Fick’s law: \[\begin{equation} \label{fick} \mathbfit{J} = -D\boldsymbol{\nabla}c. \end{equation}\] Here, \(D\) is a constant, so putting it all together, we have the advection–diffusion equation, \[\begin{equation} \mathchoice{\frac{\partial c}{\partial t}}{\partial c/\partial t}{\partial c/\partial t}{\partial c/\partial t} = -\boldsymbol{\nabla}\cdot(c\mathbfit{v}) + D\nabla^2 c + f, \label{adv-diff-ndim} \end{equation}\] where \(\nabla^2 c = \boldsymbol{\nabla}\cdot \boldsymbol{\nabla}c\) is the Laplacian operator. The one-dimensional version of this equation (with \(f=0\)) is just [advectiondiffusion], so we come to relate the velocity, \(\mathbfit{v}\), with the probability drift. Now we have an equation for \(c\) alone which is complete.
Fick’s law, [fick], assumes that the density, \(c\), moves from a region of high concentration to low. For example, if we have a small (continuous) source of heat at the centre of a cold room, this heat will gradually diffuse radially outwards to fill the room until the room is at a constant temperature and the gradient vanishes. The constant, \(D\), which has units of area per unit time, is a measure of this expansion rate.
The law was originally proposed in 1855 by Adolf Fick, whose experiments concerned the diffusion of salt through tubes of water. A more modern application is the diffusion of drugs in the vitreous body of the human eye. In all cases, the diffusive material – be it a gas, salt, chemical or population – is composed of microscopic bodies which are in randomised motion, often colliding. Thus, when confined to a small space, the bodies will tend to move apart.
We must stress that while diffusion does lead to motion – the spreading out of the body – it is different from advection: advection alone transports the concentration around while the shape of the concentration profile is held fixed. When \(\mathbfit{v}=\mathbf{0}\), [adv-diff-ndim] becomes the diffusion equation, or heat equation, with the extra source term \(f\), \[\mathchoice{\frac{\partial c}{\partial t}}{\partial c/\partial t}{\partial c/\partial t}{\partial c/\partial t} = D\nabla^2 c + f.\]
The source term, \(f\), can be exactly the sort of term we have already seen in our ODE models. We can have two populations, \(u\) and \(v\), both individually satisfying the diffusion equation, but with Lotka–Volterra source terms: \[\begin{align} \label{spatial-lotka-volterra} \mathchoice{\frac{\partial u}{\partial t}}{\partial u/\partial t}{\partial u/\partial t}{\partial u/\partial t} &= D_1 \nabla^2 u + u- uv,\\ \mathchoice{\frac{\partial v}{\partial t}}{\partial v/\partial t}{\partial v/\partial t}{\partial v/\partial t} &= D_2 \nabla^2 v +\gamma(-v+ uv). \end{align}\] If we replaced the intraspecies growth/decay terms with logistic growth terms, and made the interspecies term negative in both equations, then we would have the spatial competitive Lotka–Volterra model. We will analyse this system in [chap-CL-pursuit-evasion], but first we have to ask ourselves: what do solutions to these equations look like?
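If you are impatient, you can already peek at such solutions numerically. Below is a minimal explicit finite-difference sketch of the one-dimensional version of [spatial-lotka-volterra] with no-flux boundaries; all parameter values and the initial bump are illustrative choices, not values from the notes.

```python
# Minimal explicit finite-difference sketch of the 1D version of
# [spatial-lotka-volterra] with no-flux boundaries; illustrative values.
import numpy as np
import matplotlib.pyplot as plt

L, N = 50.0, 500
dx = L / N
D1, D2, gamma = 1.0, 0.5, 1.0
dt = 0.2 * dx**2 / max(D1, D2)   # well inside the explicit stability limit

x = np.linspace(0, L, N)
u = 1.0 + 0.5 * np.exp(-(x - L/2)**2)   # a local bump of prey
v = np.ones(N)                          # uniform predators

def lap(w):
    """Second difference with no-flux (Neumann) boundaries."""
    out = np.empty_like(w)
    out[1:-1] = (w[2:] - 2*w[1:-1] + w[:-2]) / dx**2
    out[0] = 2*(w[1] - w[0]) / dx**2
    out[-1] = 2*(w[-2] - w[-1]) / dx**2
    return out

for _ in range(int(40.0 / dt)):         # integrate up to t = 40
    u, v = (u + dt*(D1*lap(u) + u - u*v),
            v + dt*(D2*lap(v) + gamma*(-v + u*v)))

plt.plot(x, u, label="u (prey)")
plt.plot(x, v, label="v (predator)")
plt.xlabel("x"); plt.legend(); plt.show()
```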
In our thought experiment at the start of this chapter, represented by the random walks in 5.2, we started with a large number of particles at a single point. Over time their random motion led them to spread out: this is the diffusive element of the system. In cases (b) and (c), the probability bias meant there was a net motion of the paths in addition to the diffusive spreading. This is the equivalent of advective motion with the probability bias dictating its ‘velocity’.
We try to replicate this as a continuous system. Let’s start by taking a look at a one-dimensional version of the diffusion part of our problem, with no extra source term, but on a bounded domain.
Consider the function \(u(x,t)\) which satisfies \[\begin{equation} \label{heat-US-eqn} \mathchoice{\frac{\partial u}{\partial t}}{\partial u/\partial t}{\partial u/\partial t}{\partial u/\partial t} = \mathchoice{\frac{\partial^2 u}{\partial x^2}}{\partial^2 u/\partial x^2}{\partial^2 u/\partial x^2}{\partial^2 u/\partial x^2}, \quad x \in [0,L], \quad t > 0, \end{equation}\] with Neumann (no-flux) boundary conditions, \[\begin{equation} \label{BCs} \mathchoice{\frac{\partial u}{\partial x}}{\partial u/\partial x}{\partial u/\partial x}{\partial u/\partial x}(0,t) = \mathchoice{\frac{\partial u}{\partial x}}{\partial u/\partial x}{\partial u/\partial x}{\partial u/\partial x}(L,t) = 0, \end{equation}\] and the initial condition, \[\begin{equation} \label{ICs} u(x,0) = u_0(x). \end{equation}\]
PDEs are hard, and ODEs are easier, so a common technique for solving PDEs is to reduce them to ODEs. We can compute \(u\) by using a technique known as separation of variables which does exactly this. We proceed by writing \(u = X(x)T(t)\) so that our function is the product of two single-variable functions. Substituting this in, we find \[\mathchoice{\frac{\partial}{\partial t}}{\partial/\partial t}{\partial/\partial t}{\partial/\partial t}[X(x)T(t)] = X(x)T'(t), \qquad \mathchoice{\frac{\partial^2 }{\partial x^2}}{\partial^2 /\partial x^2}{\partial^2 /\partial x^2}{\partial^2 /\partial x^2}[X(x)T(t)] = X''(x)T(t),\] and hence \[X(x)T'(t) = X''(x)T(t).\] We now rearrange this equation to get \[\frac{T'(t)}{T(t)} = \frac{X''(x)}{X(x)}.\] This doesn’t look helpful yet, but the key idea is that everything on the left-hand side is a function of \(t\) only, and everything on the right-hand side is a function of \(x\) only. So if we fix some \(t\) and vary \(x\), the right-hand side can’t change its value (because \(t\) is fixed), and vice-versa. This implies the equation is equal to a constant: \[\frac{T'(t)}{T(t)} = \frac{X''(x)}{X(x)} = \lambda.\] Now if we look at the boundary conditions, [BCs], we see that \(X(x)\) ‘inherits’ these boundary conditions. To see this, substitute in \(u = X(x)T(t)\) and note that these conditions must be true for all \(x,t\), and we generally want \(X\) and \(T\) to be nonzero. The initial conditions are messier, as these take the form \(T(0)X(x) = u_0(x)\), so they do not clearly separate into a time and a space part, so we will deal with them later.
So we have the following two ODE problems: \[\begin{equation} \label{T-US-eq} T'(t) = \lambda T(t), \end{equation}\] and \[\begin{equation} \label{X-US-eq} X''(x) = \lambda X(x), \quad X'(0)=X'(L)=0. \end{equation}\] While we don’t know an initial condition yet for \(T\), it is an initial value problem, and we know its solution: \(T = C\mathrm{e}^{\lambda t}\), where \(C\) will depend on the initial condition.
The problem for \(X\) is a second order ODE with constant coefficients, and we can find the solutions by using the auxiliary equation and then using the boundary conditions to constrain \(\lambda\). This turns out to have three cases depending on whether \(\lambda\) is positive, zero, or negative.
In the case \(\lambda > 0\), we can write \(\lambda = \alpha^2\) with \(\alpha \neq 0\); you can write down the auxiliary equation and solve the ODE to find that \[\begin{equation} X(x) = A\sinh(\alpha x) + B\cosh(\alpha x). \label{cosh-sinh-form} \end{equation}\] Does this satisfy the boundary conditions? Well, \[X'(x) = A\alpha\cosh(\alpha x) + B\alpha\sinh(\alpha x),\] and trying to apply the boundary conditions, given that \(\alpha \neq 0\), \[\begin{aligned} X'(0) = 0 &\implies A\alpha = 0 &\implies A = 0,\\ X'(L) = 0 &\implies B\alpha\sinh(\alpha L) = 0 \; &\implies B = 0. \end{aligned}\] So we don’t have solutions unless \(A=B=0\), which is useless and so we throw away this case.
The \(\sinh\) and \(\cosh\) form of [cosh-sinh-form] works out nicely for us here because one of the boundary conditions is at \(0\). If instead we say \(X(x) = A\mathrm{e}^{\alpha x} + B\mathrm{e}^{-\alpha x}\), then \(X'(x) = A\alpha\mathrm{e}^{\alpha x} - B\alpha\mathrm{e}^{-\alpha x}\), which we can evaluate at \(x=0\) and \(L\). We find the linear system of equations, \[\begin{aligned} A \alpha - B \alpha & = 0,\\ A \alpha \mathrm{e}^{\alpha L} - B \alpha \mathrm{e}^{-\alpha L} & = 0. \end{aligned}\] Writing this as a matrix equation we have \[\underbrace{\begin{pmatrix} 1 &-1\\ \mathrm{e}^{\alpha L} &-\mathrm{e}^{-\alpha L} \end{pmatrix}}_{\mathsfbfit{M}} \begin{pmatrix} A \\ B \end{pmatrix} = \begin{pmatrix} 0 \\0 \end{pmatrix}.\] If you remember your favourite class, linear algebra, you’ll recall that this homogeneous system only has nontrivial solutions if \(\det(\mathsfbfit{M})=0\). Complete the analysis to show that \(\det(\mathsfbfit{M})>0\), and so we can throw away this case too.
In the case \(\lambda = 0\), we can just integrate [X-US-eq] twice to get \(X = Ax + B\). Applying the boundary conditions, we require \(A = 0\), so that \(X = B\), an arbitrary constant, is a solution.
In the case \(\lambda < 0\), we can write \(\lambda = -\alpha^2\) with \(\alpha \neq 0\), and see that \(X\) satisfies \[X''(x) + \alpha^2 X(x) = 0,\] which is the equation of the harmonic oscillator (or simple harmonic motion). This can again be solved using the auxiliary equation, and we find complex roots which lead to solutions of the form \[X(x) = A \sin(\alpha x) + B \cos(\alpha x).\] By applying the first boundary condition, \(X'(0) = 0\), we get \(A=0\). By applying the second we get \(X'(L) = 0 = -\alpha B \sin(\alpha L)\). If we want nontrivial solutions (that is, neither \(B\) nor \(\alpha\) being zero), we must have \(\sin(\alpha L) = 0\). We know that \(\sin(y)\) has zeros for \(y = 0, \pm\pi, \pm2\pi, \dots\), and in general for \(y = n\pi\) for all integers \(n\). So we must have \[\alpha L = n \pi \implies \alpha = \frac{n \pi}{L} \implies \lambda = -\left(\frac{n \pi}{L}\right)^2.\] Finally we note that this case can subsume the previous one if we allow \(n=0\) (so \(\alpha=0=\lambda\)), as \(B\cos(0) = B\).
By combining the solutions for \(T\) and \(X\) above (and combining the constants in front into one constant \(C\)), we have candidate solutions given by \[\begin{equation} \label{sol} u(x,t) = C\mathrm{e}^{\lambda t}\cos(x\sqrt{-\lambda}) = C\mathrm{e}^{-(n \pi/L)^2t}\cos\left(\frac{n \pi x}{L}\right), \quad n=0,1,2,\dots \end{equation}\] This solution satisfies the original PDE (check this by plugging it in!) and the boundary conditions, but it does not satisfy the initial conditions unless \(u_0(x)\) is exactly equal to a scalar times one of these cosine functions. So what do we do? Panic!
OK, it turns out we can use the linearity of the PDE, and the fact that we have ‘many’ candidate solutions, to come up with a solution for any \(u_0(x)\) (at least, any satisfying some abstract integrability constraints). First, check that if \(u = U_1(x,t)\) and \(u = U_2(x,t)\) satisfy [heat-US-eqn], then \(u = C_1U_1(x,t) + C_2U_2(x,t)\) also satisfies this equation (and similarly for the boundary conditions, [BCs]). This property, probably quite familiar to you by this stage, is the principle of superposition, which will allow us to add together many solutions to construct one which fits our initial data. This principle also plays an important role in quantum mechanics, and will be seen again next term in a more abstract setting.
Since we have that \(u\) given by [sol] is a solution for any \(n\), we can take a linear combination of these for all \(n\) as \[\begin{equation} \label{full-US-sol} u(x,t) = \sum_{n=0}^\infty C_n \mathrm{e}^{-(n\pi/L)^2t}\cos\left(\frac{n \pi x}{L}\right). \end{equation}\] We have to assume that the \(C_n\) decay quickly enough for large \(n\) for this infinite sum to make sense (i.e. converge), but we will assume this can be shown. So how do we compute the \(C_n\) given an initial condition, \(u_0\)? So far we have \[\begin{equation} \label{IC-US-eqn} u(x,0) = \sum_{n=0}^\infty C_n \cos\left(\frac{n \pi x}{L}\right) = u_0(x). \end{equation}\] Now, we will use the following orthogonality of this cosine series: \[\begin{equation} \label{cos-orthog} \int_0^L \cos\left(\frac{n \pi x}{L}\right)\cos\left(\frac{m \pi x}{L}\right) {\mathrm d}x = \begin{cases} L \quad &\text{for } m = n = 0,\\ L/2 \quad &\text{for } m=n>0,\\ 0 \quad &\text{for } m\neq n. \end{cases} \end{equation}\]
You might remember seeing this first in the context of deriving the coefficients in a Fourier series. The intuition for the \(n\neq m\) case is that when you multiply two functions together, graphically you have a drawing of one function trapped within the envelope of the other: here, with \(n=2\) and \(m=3\), the \(m=3\) cosine sits inside the envelope of the \(n=2\) one.
The shaded area under the curve, representing the integral – actually an inner product in a Hilbert space – cancels out. To confirm [cos-orthog] properly, you should compute the integrals. The way in is to use either your favourite \(\cos(A+B)\) formulae, or to write the \(\cos\) terms as complex exponentials.
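No substitute for doing those integrals, but a quick numerical check of [cos-orthog] is reassuring:

```python
# Quick numerical check of the orthogonality relation [cos-orthog].
import numpy as np
from scipy.integrate import quad

L = 2.0
inner = lambda n, m: quad(
    lambda x: np.cos(n*np.pi*x/L) * np.cos(m*np.pi*x/L), 0, L)[0]

print(inner(2, 3))   # ~0,   since m != n
print(inner(2, 2))   # ~L/2, since m = n > 0
print(inner(0, 0))   # ~L,   since m = n = 0
```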
How does this property help us? Well, take equation [IC-US-eqn] and multiply it by \(\cos(m \pi x/L)\) for some \(m=0,1,2,\dots\), and then integrate the equation from \(x=0\) to \(x=L\). By this orthogonality property (and cheekily interchanging the sum and the integral, which you can justify in an analysis class), you get that every term in the sum vanishes except one. In equations: \[\begin{aligned} \int_0^L\cos\left(\frac{m \pi x}{L}\right)\sum_{n=0}^\infty C_n \cos\left(\frac{ n \pi x}{L}\right){\mathrm d}x & \\ = \sum_{n=0}^\infty C_n \int_0^L\cos\left(\frac{m \pi x}{L}\right)\cos\left(\frac{n \pi x}{L}\right) {\mathrm d}x & = \int_0^L\cos\left(\frac{m \pi x}{L}\right)u_0(x) \, {\mathrm d}x. \end{aligned}\] So you see that each term in the infinite sum with \(m\neq n\) drops out because of this orthogonality property. So we’re left with \[\begin{aligned} C_m \frac{L}{2} & = \int_0^L\cos\left(\frac{m \pi x}{L}\right)u_0(x) \, {\mathrm d}x, \quad (m>0) \\ \text{or} \quad C_0 L &= \int_0^L u_0(x) \, {\mathrm d}x \end{aligned}\] which we can rearrange to compute \(C_m\) for each \(m\). Finally, with these values of \(C_m\), we have a complete solution, [full-US-sol], to the original PDE and the boundary conditions.
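Here is a minimal sketch assembling the whole recipe numerically: compute the \(C_m\) by quadrature, then sum a truncated version of [full-US-sol]. The initial bump \(u_0\) and the truncation at 40 modes are illustrative choices.

```python
# Sketch of the full recipe: coefficients C_m by quadrature, then a
# truncated version of the cosine series [full-US-sol].
import numpy as np
from scipy.integrate import quad

L, modes = 1.0, 40
u0 = lambda x: np.exp(-50 * (x - 0.3)**2)   # illustrative initial bump

C = np.empty(modes)
C[0] = quad(u0, 0, L)[0] / L
for m in range(1, modes):
    C[m] = (2/L) * quad(lambda x: u0(x) * np.cos(m*np.pi*x/L), 0, L)[0]

def u(x, t):
    n = np.arange(modes)[:, None]           # one row per mode
    return np.sum(C[:, None] * np.exp(-(n*np.pi/L)**2 * t)
                  * np.cos(n*np.pi*x/L), axis=0)

x = np.linspace(0, L, 200)
print(u(x, 0.0).max(), u(x, 0.05).max())    # the bump flattens over time
```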
An example solution is shown in 5.5 for \(u_0(x) = \delta(x-L/2)\), where \(\delta\) is the Dirac delta function. We can see the rapid spreading out of \(u\), until \(u\) has ‘settled’.
You will remember the Dirac delta from AMV. The Dirac delta is a function which can be loosely thought of as being zero everywhere, except at the origin, where it is infinite. It is constrained by the property that (over the real line, say), \[\int_{-\infty}^\infty \delta(x) \, {\mathrm d}x = 1.\] This is technically a distribution rather than a function, but that’s not important right now. We can define \(\delta(\mathbfit{x})\) for any \(V\subset \mathbb{R}^n\) as the function such that, for any function \(f(\mathbfit{x})\), \[\begin{equation} \label{deltafunc} \int_{V}\delta(\mathbfit{x} - \mathbfit{s})f(\mathbfit{s})\,{\mathrm d}{\mathbfit{s}} = \left\{ \begin{array}{cl} f(\mathbfit{x}) & \mbox{if } \mathbfit{x} \in V,\\ 0 & \mbox{if } \mathbfit{x} \not\in V. \end{array} \right. \end{equation}\] That is to say, so long as you are integrating over a region that contains your point \(\mathbfit{x}\), the integral of the Dirac delta multiplied by \(f(\mathbfit{s})\) with respect to \(\mathbfit{s}\) is only the value of \(f\) at \(\mathbfit{x}\): \(f(\mathbfit{x})\).
If we replace [BCs] with \[u(0,t) = u(L,t) = 0,\] then most of the analysis above goes through, except that \(\lambda=0\) is no longer admissible (because \(X(x) = Ax + B\) only has the trivial solution \(A=B=0\) with these boundary conditions), and instead of \(\cos\) we use \(\sin\) throughout. Hence the sum in [full-US-sol] starts at \(n=1\) rather than \(n=0\). One can also consider periodic and Robin conditions and proceed exactly as above to find how these spatial functions change – we will do some of this next term.
In Additional Problem Sheet 2 you will solve a diffusion problem with Dirichlet boundary conditions.
The boundary value problem we had to solve, [X-US-eq], can be called an eigenvalue problem, with eigenvalues \(\lambda_n = n^2 \pi^2/L^2\) (the negatives of the values of \(\lambda\) we found) and associated eigenfunctions \(X_n(x) = \cos(n\pi x /L)\). The expansion \(\sum_n C_n X_n(x)\) is an example of a generalised Fourier series.
If we write the linear operator \(\mathchoice{\frac{{\mathrm d}^2 }{{\mathrm d}x^2}}{{\mathrm d}^2 /{\mathrm d}x^2}{{\mathrm d}^2 /{\mathrm d}x^2}{{\mathrm d}^2 /{\mathrm d}x^2}\) as \(\mathcal{L}\), then [X-US-eq] becomes \(\mathcal{L}X = -\lambda X\). Furthermore, we know that the eigenfunctions are orthogonal with respect to the inner product on a Hilbert space, \((f,g) = \int_0^L f(x) g(x) {\mathrm d}x\).
This parallels the matrix eigenvalue problem, \(\mathsfbfit{A}\mathbfit{x} = \lambda\mathbfit{x}\). In an inner product (Hilbert) space, a real symmetric matrix \(\mathsfbfit{A}\) has real eigenvalues, and eigenvectors associated with distinct eigenvalues are orthogonal. The equivalent of a real symmetric matrix for differential operators is a self-adjoint linear operator, i.e. one that satisfies \((f,\mathcal{L}g) = (\mathcal{L}f, g)\) for all \(f,g\).
Asking whether eigenvalue problems in general can be written in terms of self-adjoint linear operators leads us down the road to Sturm–Liouville theory.
Now let’s take a look at our problem on an unbounded domain and with a diffusion constant, \[\begin{equation} \mathchoice{\frac{\partial u}{\partial t}}{\partial u/\partial t}{\partial u/\partial t}{\partial u/\partial t} = D \mathchoice{\frac{\partial^2 u}{\partial x^2}}{\partial^2 u/\partial x^2}{\partial^2 u/\partial x^2}{\partial^2 u/\partial x^2}, \quad x\in(-\infty,\infty), \quad t>0, \label{diffprob} \end{equation}\] with boundary conditions \[u(\pm\infty,t) = 0,\] and initial condition \[u(x,0) = \delta(x),\] the return once again of the delta function, representing a point density at the origin.
On an unbounded domain, the normal way to solve this equation is to transform it using the Fourier transform.
The Fourier transform of a function \(f(x)\), denoted \(\mathcal{F}(k)[f(x)]\), is given by \[\mathcal{F}(k)[f(x)] = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\mathrm{e}^{-\mathrm{i}k x}f(x)\,\mathrm{d}x.\] Its inverse, \(\mathcal{F}^{-1}(x)[g(k)]\), is \[\mathcal{F}^{-1}(x)[g(k)] = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\mathrm{e}^{\mathrm{i}k x}g(k)\,{\mathrm d}{k}.\] You might have seen a slightly different version of these where only one of the terms has a \(1/2\pi\) factor, but either way, crucially it can be shown that \[\mathcal{F}^{-1}[\mathcal{F}[f(x)]] = f(x).\] You’re not expected to derive this result, merely to be aware of it. The critical property of the Fourier transform which makes it of use in solving PDEs is its effect on derivatives. The following result can be fairly easily demonstrated: \[\begin{equation} \label{fderiv} \mathcal{F}(k)\left[\mathchoice{\frac{\partial^{n} f}{\partial x^{n}}}{\partial^{n} f/\partial x^{n}}{\partial^{n} f/\partial x^{n}}{\partial^{n} f/\partial x^{n}}\right] = (\mathrm{i}k)^{n}\mathcal{F}(k)[f(x)]. \end{equation}\] Again, you don’t need to derive this result, you just need to know how to apply it.
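A quick numerical illustration of [fderiv] using the discrete Fourier transform (numpy’s FFT normalisation differs from the symmetric convention above, but the derivative property is the same):

```python
# Differentiation in real space is multiplication by ik in Fourier space.
import numpy as np

N, L = 256, 2*np.pi
x = np.linspace(0, L, N, endpoint=False)
f = np.exp(np.sin(x))                      # a smooth periodic test function

k = 2*np.pi * np.fft.fftfreq(N, d=L/N)     # angular wavenumbers
df_spectral = np.fft.ifft(1j * k * np.fft.fft(f)).real
df_exact = np.cos(x) * np.exp(np.sin(x))

print(np.max(np.abs(df_spectral - df_exact)))   # tiny: spectral accuracy
```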
Taking the Fourier transform of [diffprob] and applying [fderiv] to the spatial derivative, we obtain \[\mathchoice{\frac{\partial\mathcal{F}(k)[u]}{\partial t}}{\partial\mathcal{F}(k)[u]/\partial t}{\partial\mathcal{F}(k)[u]/\partial t}{\partial\mathcal{F}(k)[u]/\partial t} = -Dk^2\mathcal{F}(k)[u].\] Just like that, we have transformed a second order PDE into a first order ODE. Its solution is clearly an exponential, \[\mathcal{F}(k)[u(x,t)] = \mathcal{F}(k)[u(x,0)]\mathrm{e}^{-D k^2 t}.\]
To get the real space solution, we must apply the inverse operator. This is the price we pay for using the transform to simplify the system. This is not always straightforward, but in this case it is not so problematic. First we must find the initial condition \(\mathcal{F}(k)[u(x,0)]\), which, from [deltafunc], is the Fourier transform of the Dirac delta function, \[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\delta(x)\mathrm{e}^{-\mathrm{i}kx}\,\mathrm{d}x= \frac{1}{\sqrt{2\pi}}.\] Thus our final solution is \[u(x,t) = \mathcal{F}^{-1}[\mathcal{F}(k)[u(x,t)]] = \frac{1}{2\pi}\int_{-\infty}^{\infty}\mathrm{e}^{\mathrm{i}kx}\mathrm{e}^{-D k^2 t}\,{\mathrm d}{k}.\] In order to perform this integral we note the following known result: \[\mathcal{F}(k)[\mathrm{e}^{-\alpha x^2}] = \frac{1}{\sqrt{2\alpha}}\mathrm{e}^{-k^2/4\alpha}.\] We therefore have \[\begin{equation} \label{fundamentalsol} u(x,t) = \frac{1}{2 \sqrt{\pi Dt}} \mathrm{e}^{-x^2/4Dt}. \end{equation}\]
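If you want reassurance, [fundamentalsol] can be checked symbolically against the diffusion equation:

```python
# Symbolic check that [fundamentalsol] satisfies u_t = D u_xx.
import sympy as sp

x, t, D = sp.symbols('x t D', positive=True)
u = sp.exp(-x**2 / (4*D*t)) / (2*sp.sqrt(sp.pi*D*t))
print(sp.simplify(u.diff(t) - D*u.diff(x, 2)))   # 0
```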
Ask yourself: why was I able to use a forward transform \(\mathcal{F}\) result for an inverse \(\mathcal{F}^{-1}\) operation?
This tells us the density spreads out over time following a Gaussian distribution, as depicted in 5.6.
How is this solution different from the solution in the bounded case?
There are questions on Additional Problem Sheet 2 on the Fourier transform. In addition, the extra reading at the end of this chapter has a link to a number of example solutions to a couple of partial differential equations solved using Fourier transform methods (the wave equation and the telegraph equation).
As we remarked at the start of the chapter, the limit of the random walk as the step size reduces to zero is a Gaussian distribution. We then showed, using a slightly hand-wavy argument, that this limiting process is governed by the advection–diffusion equation. We have now squared the circle, so to speak, in the sense that we see that the diffusion equation gives Gaussian-like spreading of a point source. In fact it was Einstein (and Marian Smoluchowski) who first made this link precise at the turn of the previous century.
This solution, [fundamentalsol], is known as the fundamental solution of the diffusion equation. Why is it so fundamental? Because it allows us to answer the question ‘what about different initial conditions?’.
Why is a delta function which spreads out over time called the fundamental solution? In general we might have some linear differential operator \(\mathcal{L}\). When acting on a function \(u(\mathbfit{x},t)\) we want to solve the following initial value problem on a domain \(V\) with boundary \(\partial V\): \[\begin{equation} \label{fullprob} \mathcal{L}u(\mathbfit{x},t) = 0,\quad u(\mathbfit{x},0) = g(\mathbfit{x}),\quad u(\partial{V},t) = 0, \end{equation}\] that is to say, \(g(\mathbfit{x})\) is our initial condition.
For example, the diffusion equation operator is \(\mathcal{L}= \mathchoice{\frac{\partial}{\partial t}}{\partial/\partial t}{\partial/\partial t}{\partial/\partial t}- D\nabla^2\), so that \[\mathcal{L}u(\mathbfit{x},t) = \mathchoice{\frac{\partial u}{\partial t}}{\partial u/\partial t}{\partial u/\partial t}{\partial u/\partial t} -D\nabla^2 u = 0.\] \(\mathcal{L}\) is linear so for \(a,b\) scalar constants and \(u,v\) scalar densities, \[\mathcal{L}(au+bv) = a\mathcal{L}u + b\mathcal{L}v.\] The fundamental solution or Green’s function of \(\mathcal{L}\) is defined as the solution, \(u_f\), to \[\begin{equation} \label{fundprob} \mathcal{L}u_f(\mathbfit{x},t) = 0, \quad u_f(\mathbfit{x},0) = \delta(\mathbfit{x}), \quad u_f(\partial{V},t) = 0, \end{equation}\] where specifically the initial condition is the delta function. See how our setup for the diffusion equation, [diffprob], is in this form.
We can relate the two initial conditions by the delta function identity, \[\begin{equation} \label{deltaid} \int_{V}\delta(\mathbfit{x}- \mathbfit{s})g(\mathbfit{s})\,{\mathrm d}\mathbfit{s} = g(\mathbfit{x}). \end{equation}\] This combination of the two functions \(\delta\) and \(g\) on the left-hand side is known as a convolution. We now propose that the following relationship – another convolution – holds: \[\begin{equation} \label{gensol} \int_{V}u_f(\mathbfit{x}-\mathbfit{s},t) g(\mathbfit{s})\,{\mathrm d}\mathbfit{s} = u(\mathbfit{x},t). \end{equation}\] To see this, we note that, as the domain of integration is fixed, we can take the operator inside (the Reynolds transport theorem again), \[\begin{equation} \label{lingreen} \int_{V}\mathcal{L}u_f(\mathbfit{x}-\mathbfit{s},t)g(\mathbfit{s})\,{\mathrm d}\mathbfit{s} = \mathcal{L}u(\mathbfit{x},t). \end{equation}\] Remembering that all derivatives in \(\mathcal{L}\) will be with respect to \(\mathbfit{x}\), [fundprob] and [fullprob] ensure that both sides vanish. In addition, as \(\lim_{t \to 0} u_f = \delta(\mathbfit{x})\), we will obtain the right initial condition through [deltaid]. Thus, solving [fullprob] can be reduced to solving the fundamental problem, [fundprob]: generally an easier task! Integrating [gensol] then leaves us with the full solution. This can often be of benefit for analytic solutions and asymptotic approximations.
The terms fundamental solution and Green’s function are broadly used interchangeably. Some authors prefer using ‘fundamental solution’ for the special case where the domain \(V = \mathbb{R}^n\). This is the only case we will be interested in, so we will use the term ‘fundamental solution’ without confusion.
Let’s say we wish to solve the following problem: \[\begin{equation} \label{rectheat} \mathchoice{\frac{\partial c}{\partial t}}{\partial c/\partial t}{\partial c/\partial t}{\partial c/\partial t} = D \mathchoice{\frac{\partial^2 c}{\partial x^2}}{\partial^2 c/\partial x^2}{\partial^2 c/\partial x^2}{\partial^2 c/\partial x^2},\quad c(x,0) = \left\{\begin{array}{ll} 0 & x< -a\\ 1 & -a \leq x \leq a\\ 0 & x>a \end{array} \right. , \quad c(\pm \infty ,t) = 0. \end{equation}\] The initial condition is a ‘population’ spread evenly over a domain \(x\in[-a,a]\). Using the fundamental solution in [fundamentalsol], and [gensol], the solution to the problem in [rectheat] is \[c(x,t) = \frac{1}{ 2\sqrt{\pi Dt}}\int_{-a}^{a} \mathrm{e}^{-(x-s)^2/4Dt}\,{\mathrm d}{s} = \frac{1}{2}\left[\operatorname{erf}\left(\frac{a-x}{\sqrt{4Dt}}\right) + \operatorname{erf}\left(\frac{a+x}{\sqrt{4Dt}}\right)\right],\] where \[\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_{0}^{x}\mathrm{e}^{-s^2}{\mathrm d}{s}\] is the error function, a function with well-known properties. The solution is shown to vary with \(t\) in 5.7.
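Since scipy provides the error function, this solution is easy to plot at several times; a minimal sketch (the values of \(D\), \(a\) and the times are illustrative):

```python
# Evaluating the erf solution of [rectheat] at several times.
import numpy as np
import matplotlib.pyplot as plt
from scipy.special import erf

D, a = 1.0, 1.0
x = np.linspace(-6, 6, 400)
for t in [0.01, 0.1, 1.0, 5.0]:
    c = 0.5 * (erf((a - x)/np.sqrt(4*D*t)) + erf((a + x)/np.sqrt(4*D*t)))
    plt.plot(x, c, label=f"t = {t}")
plt.xlabel("x"); plt.ylabel("c"); plt.legend(); plt.show()
```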
The step in [lingreen] does not generally work if the operator is nonlinear, for example, \[\mathcal{L}[u] = \left(\mathchoice{\frac{{\mathrm d}^2 u}{{\mathrm d}x^2}}{{\mathrm d}^2 u/{\mathrm d}x^2}{{\mathrm d}^2 u/{\mathrm d}x^2}{{\mathrm d}^2 u/{\mathrm d}x^2}\right)^2 + u.\] In that case, \[\mathcal{L}\left[\int_{-\infty}^{\infty} u_f(x-s,t)g(s)\,{\mathrm d}s\right] = \left(\int_{-\infty}^{\infty} \mathchoice{\frac{{\mathrm d}^2 u_f(x-s,t)}{{\mathrm d}x^2}}{{\mathrm d}^2 u_f(x-s,t)/{\mathrm d}x^2}{{\mathrm d}^2 u_f(x-s,t)/{\mathrm d}x^2}{{\mathrm d}^2 u_f(x-s,t)/{\mathrm d}x^2} g(s)\,{\mathrm d}s\right)^2 +\int_{-\infty}^{\infty} u_f(x-s,t)g(s)\,{\mathrm d}s,\] which is not necessarily zero.
There is one critical problem with the fundamental solution: the Gaussian distribution is always positive, so in theory the prediction is that there is always some small concentration infinitely far away from the source, for all \(t>0\). This cannot be physically realistic (think of this as a population). On the other hand, the density drops away exponentially, so the actual density at large distances from the source will be negligible.
We have commented already that a problem with Gaussian distributions is that they are always positive, even out towards infinity where you would expect no population at all. In fact, as we saw in [rectheat], as the population diffuses you end up with infinitely fast travelling waves which head out towards \(x=\pm\infty\).
To counter this, we can make the diffusion constant, \(D\), dependent on the density. A popular assumption is that \[D(c) = D_0\left(\frac{c}{c_0}\right)^m,\] for some \(D_0>0\), \(m>0\). In this case, as the density expands and its value drops (compared to its initial value) the diffusion rate also decreases. As we shall see, this leads to a diffusive distribution with finite width.
Rather than carrying on with the standard one-dimensional case we now turn to a radially symmetric two-dimensional case. We will again ignore source terms (\(f\equiv 0\)) and we shall assume no velocity \(\mathbfit{v}\), so only diffusive behaviour is present. We seek a fundamental solution of total population \(Q\), i.e., we are seeking a solution to the problem \[\begin{equation} \mathchoice{\frac{\partial c}{\partial t}}{\partial c/\partial t}{\partial c/\partial t}{\partial c/\partial t} = D_0\boldsymbol{\nabla}\cdot\left[\left(\frac{c}{c_0}\right)^m \boldsymbol{\nabla}c\right],\quad c(\mathbfit{x},0) = Q\delta(\mathbfit{x}), \quad c(\mathbfit{x} \to \infty,t)=0. \label{radial-diff-eqn} \end{equation}\] By \(c(\mathbfit{x} \to \infty,t)\) we mean that the distribution must go to zero in all directions far from the origin.
In polar coordinates, \((r,\theta)\), you will remember that \[\boldsymbol{\nabla}c = \mathchoice{\frac{\partial c}{\partial r}}{\partial c/\partial r}{\partial c/\partial r}{\partial c/\partial r}\widehat{\mathbfit{r}} + \frac{1}{r}\mathchoice{\frac{\partial c}{\partial\theta}}{\partial c/\partial\theta}{\partial c/\partial\theta}{\partial c/\partial\theta}\widehat{\mathbfit{\theta}},\] and for some vector \(\mathbfit{A} = A_r\widehat{\mathbfit{r}} + A_{\theta}\widehat{\mathbfit{\theta}}\), \[\boldsymbol{\nabla}\cdot\mathbfit{A} = \frac{1}{r}\left(\mathchoice{\frac{\partial}{\partial r}}{\partial/\partial r}{\partial/\partial r}{\partial/\partial r}(rA_r) + \mathchoice{\frac{\partial A_{\theta}}{\partial\theta}}{\partial A_{\theta}/\partial\theta}{\partial A_{\theta}/\partial\theta}{\partial A_{\theta}/\partial\theta}\right).\] If we assume radial symmetry, \(c(r,\theta,t) \equiv c(r,t)\), then the right-hand side of [radial-diff-eqn] yields \[\begin{equation} \label{porusradial} D_0\boldsymbol{\nabla}\cdot\left[\left(\frac{c}{c_0}\right)^m\mathchoice{\frac{\partial c}{\partial r}}{\partial c/\partial r}{\partial c/\partial r}{\partial c/\partial r} \widehat{\mathbfit{r}}\right] = \frac{D_0}{r}\mathchoice{\frac{\partial}{\partial r}}{\partial/\partial r}{\partial/\partial r}{\partial/\partial r}\left[r\left(\frac{c}{c_0}\right)^m\mathchoice{\frac{\partial c}{\partial r}}{\partial c/\partial r}{\partial c/\partial r}{\partial c/\partial r} \right], \end{equation}\] and our equation becomes \[\begin{equation} \label{porous-media-eqn} \mathchoice{\frac{\partial c}{\partial t}}{\partial c/\partial t}{\partial c/\partial t}{\partial c/\partial t} = \frac{D_0}{r}\mathchoice{\frac{\partial}{\partial r}}{\partial/\partial r}{\partial/\partial r}{\partial/\partial r}\left[r\left(\frac{c}{c_0}\right)^m\mathchoice{\frac{\partial c}{\partial r}}{\partial c/\partial r}{\partial c/\partial r}{\partial c/\partial r} \right]. \end{equation}\] This is an example of the porous media equation. We will shortly show that the solution is \[\begin{equation} \label{porussol} c(r,t) = \left\{ \begin{array}{ll} \displaystyle\frac{c_0}{\lambda(t)}\left[1 - \left(\frac{r}{r_0 \lambda(t)}\right)^2\right]^{1/m} & r\leq r_0 \lambda(t), \\[6pt] 0 & r> r_0 \lambda(t), \end{array} \right. \end{equation}\] where \[\lambda(t) = \left(\frac{t}{t_0}\right)^{1/(2+m)},\quad t_0 = \frac{r_0^2 m}{2 D_0 (m+2)}, \quad r_0 =\frac{Q\mathit{\Gamma}(1/m + 3/2)}{\pi^{1/2}c_0 \mathit{\Gamma}(1/m + 1)},\] and where \(\mathit{\Gamma}(z)\) is the gamma function. The solution has compact support – it is zero outside of a finite radius – and this is a common property of hyperbolic PDEs. The total population, \(Q\), is conserved, and we see in 5.8 the solution spreading out over time.
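Before deriving it, you can get a feel for [porussol] by plotting it; a minimal sketch with illustrative parameter values:

```python
# Plotting the compact-support solution [porussol]; illustrative values.
import numpy as np
import matplotlib.pyplot as plt
from scipy.special import gamma as Gamma

m, D0, c0, Q = 1.0, 1.0, 1.0, 1.0
r0 = Q * Gamma(1/m + 3/2) / (np.sqrt(np.pi) * c0 * Gamma(1/m + 1))
t0 = r0**2 * m / (2 * D0 * (m + 2))

r = np.linspace(0, 3, 400)
for t in [0.5, 1.0, 2.0, 4.0]:
    lam = (t / t0)**(1/(2 + m))
    # the clip keeps c identically zero outside the front r = r0*lam
    c = (c0/lam) * np.clip(1 - (r/(r0*lam))**2, 0.0, None)**(1/m)
    plt.plot(r, c, label=f"t = {t}")
plt.xlabel("r"); plt.ylabel("c"); plt.legend(); plt.show()
```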
Deriving this solution involves using another technique for solving partial differential equations: a symmetry reduction technique known as looking for similarity solutions. The basic idea is that if we can find some transformation of our solution which changes its shape but still solves the basic equations, then we should be able to use this transformation to eliminate one of the independent variables. This is a bit like with Fourier transforms, where this elimination turns a PDE into an ODE. Unfortunately, of course, first you must find such a scaling, and this is not always easy! But for a significant class of PDEs with one spatial dimension the generic transformation of \(u(x,t)\) given by \[\begin{equation} \label{dilation} U = \lambda^{-\alpha} u, \quad X = \lambda^\beta x, \quad T = \lambda t \end{equation}\] for some suitable constants \(\alpha\) and \(\beta\) will give the original PDE for \(U(X,T)\) (a dilation).
For example, consider the 1D diffusion equation from the previous chapter, \[\mathchoice{\frac{\partial u}{\partial t}}{\partial u/\partial t}{\partial u/\partial t}{\partial u/\partial t} = D\mathchoice{\frac{\partial^2 u}{\partial x^2}}{\partial^2 u/\partial x^2}{\partial^2 u/\partial x^2}{\partial^2 u/\partial x^2}.\] Applying [dilation] we get \[\lambda^{\alpha+1}\mathchoice{\frac{\partial U}{\partial T}}{\partial U/\partial T}{\partial U/\partial T}{\partial U/\partial T} = D\lambda^{\alpha+2\beta}\mathchoice{\frac{\partial^2 U}{\partial X^2}}{\partial^2 U/\partial X^2}{\partial^2 U/\partial X^2}{\partial^2 U/\partial X^2}.\] We can spot that if \(\beta=1/2\) then \[\mathchoice{\frac{\partial U}{\partial T}}{\partial U/\partial T}{\partial U/\partial T}{\partial U/\partial T} = D \mathchoice{\frac{\partial^2 U}{\partial X^2}}{\partial^2 U/\partial X^2}{\partial^2 U/\partial X^2}{\partial^2 U/\partial X^2},\] i.e., \(U\) satisfies the diffusion equation! So we see a transformation in the form [dilation] can lead to the same differential equation, just in \(U\) instead of \(u\).
Note that \(\alpha\) turned out to be arbitrary here: as we will see, we can pick it to suit the initial conditions.
But why do we care that under this transformation, \(U\) satisfies the same PDE? Well, note that the groupings \[\begin{equation} \eta = \frac{X}{T^\beta} = \frac{x}{t^\beta} \quad \text{and} \quad \xi = UT^\alpha = ut^{\alpha} \label{sim-sol-sub} \end{equation}\] are invariant (i.e. unchanged) under the transformation. By a theorem sometimes known as Morgan’s theorem, we can write one invariant as a function of the other, \[\xi = v(\eta),\] for some function \(v\). Substituting in the definitions of \(\eta\) and \(\xi\), we get \[u(x,t) = \frac{1}{t^\alpha} v \left( \frac{x}{t^\beta} \right),\] and this is the key result! This implies that the solution \(u(x,t)\) has the same shape as the ‘master shape’ \(v(\eta)\), just stretched by a factor \(t^{\beta}\) along the \(x\)-axis and squashed by a factor \(t^{\alpha}\) along the \(y\)-axis. This shape can then be appropriately scaled to give the time-varying behaviour of the system (determined by the parameters \(\alpha\) and \(\beta\), depending on the actual equation).
We have seen this behaviour already. Recall the fundamental solution of the diffusion equation, [fundamentalsol] (with \(u\) taking the \(c\) role), \[u(x,t) =\frac{1}{2 \sqrt{\pi Dt}} \mathrm{e}^{-x^2/4Dt}.\] We have seen above that for the diffusion equation, \(\beta=1/2\), so if we write \(\eta = x/t^{1/2}\) and \(v = ut^{\alpha}\) (from [sim-sol-sub]) then \[v(\eta,t) =\frac{t^{\alpha}}{2\sqrt{\pi D t}} \mathrm{e}^{-\eta^2/4D}.\] We said earlier that \(\alpha\) can be set to fit the initial conditions. Well, the fundamental solution has a very specific initial condition which gives form to this solution, and here if we set \(\alpha=1/2\) then \[v(\eta) =\frac{1}{2 \sqrt{\pi D}}\mathrm{e}^{-\eta^2/4D}.\]
So we see our solution \(u(x,t)\) can be written in the form \[u(x,t) = \frac{1}{t^\alpha}v\left(\frac{x}{t^{\beta}}\right) = \frac{1}{t^{1/2}}v\left(\frac{x}{t^{1/2}}\right),\] and the solution will still satisfy the diffusion equation.
As we saw in 5.6, this confirms that the solution \(u(x,t)\) has the same shape as the master shape \(v(\eta)\), stretched by a factor \(t^{1/2}\) along the \(x\)-axis and squashed by a factor \(t^{1/2}\) along the \(y\)-axis.
Let’s go back to [porous-media-eqn], \[\begin{equation} \label{radialpde} \mathchoice{\frac{\partial c}{\partial t}}{\partial c/\partial t}{\partial c/\partial t}{\partial c/\partial t} = \frac{D_0}{r}\mathchoice{\frac{\partial}{\partial r}}{\partial/\partial r}{\partial/\partial r}{\partial/\partial r}\left[r \left(\frac{c}{c_0}\right)^m \mathchoice{\frac{\partial c}{\partial r}}{\partial c/\partial r}{\partial c/\partial r}{\partial c/\partial r}\right], \end{equation}\] and, using the observed similarity solution, make the substitution \[c(r,t) = \frac{1}{t^{\alpha}}v\left(\frac{r}{t^{\beta}}\right).\] Denoting \(\eta=r/t^{\beta}\), so that \(v=v(\eta)\), we first change the left-hand side: \[\begin{align} \mathchoice{\frac{\partial c}{\partial t}}{\partial c/\partial t}{\partial c/\partial t}{\partial c/\partial t} &= -\alpha\frac{1}{t^{\alpha+1}}v + \frac{1}{t^{\alpha}}\mathchoice{\frac{{\mathrm d}v}{{\mathrm d}\eta}}{{\mathrm d}v/{\mathrm d}\eta}{{\mathrm d}v/{\mathrm d}\eta}{{\mathrm d}v/{\mathrm d}\eta}\mathchoice{\frac{{\mathrm d}\eta}{{\mathrm d}t}}{{\mathrm d}\eta/{\mathrm d}t}{{\mathrm d}\eta/{\mathrm d}t}{{\mathrm d}\eta/{\mathrm d}t} \\ &= -\alpha\frac{1}{t^{\alpha+1}}v - \frac{r \beta}{t^{\alpha}t^{\beta+1}}\mathchoice{\frac{{\mathrm d}v}{{\mathrm d}\eta}}{{\mathrm d}v/{\mathrm d}\eta}{{\mathrm d}v/{\mathrm d}\eta}{{\mathrm d}v/{\mathrm d}\eta}\\ \label{fincaldcdt}& = -\alpha\frac{1}{t^{\alpha+1}}v - \frac{\eta \beta}{t^{\alpha+1}}\mathchoice{\frac{{\mathrm d}v}{{\mathrm d}\eta}}{{\mathrm d}v/{\mathrm d}\eta}{{\mathrm d}v/{\mathrm d}\eta}{{\mathrm d}v/{\mathrm d}\eta}. \end{align}\] On the right-hand side, \[\mathchoice{\frac{\partial}{\partial r}}{\partial/\partial r}{\partial/\partial r}{\partial/\partial r} = \frac{1}{t^\beta}\mathchoice{\frac{\partial}{\partial\eta}}{\partial/\partial\eta}{\partial/\partial\eta}{\partial/\partial\eta} \implies \mathchoice{\frac{\partial c}{\partial r}}{\partial c/\partial r}{\partial c/\partial r}{\partial c/\partial r}= \frac{1}{t^{\alpha+\beta}}\mathchoice{\frac{{\mathrm d}v}{{\mathrm d}\eta}}{{\mathrm d}v/{\mathrm d}\eta}{{\mathrm d}v/{\mathrm d}\eta}{{\mathrm d}v/{\mathrm d}\eta},\] and so we have \[\begin{align} \frac{D_0}{r}\mathchoice{\frac{\partial}{\partial r}}{\partial/\partial r}{\partial/\partial r}{\partial/\partial r}\left[r \left(\frac{c}{c_0}\right)^m \mathchoice{\frac{\partial c}{\partial r}}{\partial c/\partial r}{\partial c/\partial r}{\partial c/\partial r}\right]&=\frac{D_0}{\eta t^{2\beta}}\mathchoice{\frac{{\mathrm d}}{{\mathrm d}\eta}}{{\mathrm d}/{\mathrm d}\eta}{{\mathrm d}/{\mathrm d}\eta}{{\mathrm d}/{\mathrm d}\eta}\left[\frac{t^\beta \eta}{t^{\alpha m}}\left(\frac{v}{c_0}\right)^m \frac{1}{t^{\alpha+\beta}}\mathchoice{\frac{{\mathrm d}v}{{\mathrm d}\eta}}{{\mathrm d}v/{\mathrm d}\eta}{{\mathrm d}v/{\mathrm d}\eta}{{\mathrm d}v/{\mathrm d}\eta}\right]\\ \label{finalrhs}&= \frac{1}{t^{\alpha(m+1)+2\beta}}\frac{D_0}{\eta}\mathchoice{\frac{{\mathrm d}}{{\mathrm d}\eta}}{{\mathrm d}/{\mathrm d}\eta}{{\mathrm d}/{\mathrm d}\eta}{{\mathrm d}/{\mathrm d}\eta}\left[\eta\left(\frac{v}{c_0}\right)^m \mathchoice{\frac{{\mathrm d}v}{{\mathrm d}\eta}}{{\mathrm d}v/{\mathrm d}\eta}{{\mathrm d}v/{\mathrm d}\eta}{{\mathrm d}v/{\mathrm d}\eta}\right]. 
\end{align}\] Comparing [fincaldcdt] and [finalrhs] we see we want to set the \(t\)-exponents equal to each other, \[\alpha + 1 = \alpha(m+1) + 2\beta \implies 1 = \alpha m + 2\beta.\] We can multiply through by \(t^{\alpha+1}\) to obtain \[\begin{equation} \alpha v + \beta \mathchoice{\frac{{\mathrm d}v}{{\mathrm d}\eta}}{{\mathrm d}v/{\mathrm d}\eta}{{\mathrm d}v/{\mathrm d}\eta}{{\mathrm d}v/{\mathrm d}\eta}\eta +\frac{D_0}{\eta}\mathchoice{\frac{{\mathrm d}}{{\mathrm d}\eta}}{{\mathrm d}/{\mathrm d}\eta}{{\mathrm d}/{\mathrm d}\eta}{{\mathrm d}/{\mathrm d}\eta}\left[\eta\left(\frac{v}{c_0}\right)^m\mathchoice{\frac{{\mathrm d}v}{{\mathrm d}\eta}}{{\mathrm d}v/{\mathrm d}\eta}{{\mathrm d}v/{\mathrm d}\eta}{{\mathrm d}v/{\mathrm d}\eta}\right] =0. \label{rel-between-alpha-and-beta} \end{equation}\] So now we have a relationship between \(\alpha\) and \(\beta\), but still some freedom in choosing one of them.
We now manipulate [rel-between-alpha-and-beta] to simplify it with an appropriate choice of \(\alpha\). In particular, we want to write our equation in the form \(\mathchoice{\frac{{\mathrm d}[\cdots]}{{\mathrm d}\eta}}{{\mathrm d}[\cdots]/{\mathrm d}\eta}{{\mathrm d}[\cdots]/{\mathrm d}\eta}{{\mathrm d}[\cdots]/{\mathrm d}\eta}\) so that we can then integrate and create a first-order equation.
Multiply our equation through by \(\eta\), \[\begin{equation} \label{newode} \alpha v \eta + \beta \frac{{\mathrm d}v}{{\mathrm d}\eta} \eta^2 + D_0\frac{{\mathrm d}}{{\mathrm d}\eta}\left[\eta\left(\frac{v}{c_0}\right)^m \frac{{\mathrm d}v}{{\mathrm d}\eta}\right]=0. \end{equation}\] The idea is to turn the first two terms into a single derivative so we can integrate. You might notice that for some \(\gamma\), \[\gamma\frac{{\mathrm d}}{{\mathrm d}\eta}[v\eta^2] = \gamma \frac{{\mathrm d}v}{{\mathrm d}\eta} \eta^2 + 2\gamma v\eta,\] which is almost the form we want. Matching the first two terms of [newode] requires \(\beta=\gamma\) (for the \(\eta^2\,{\mathrm d}v/{\mathrm d}\eta\) term) and \(\alpha = 2\gamma\) (for the \(v\eta\) term); combined with the exponent relation, we have two simultaneous equations, \[\begin{aligned} 1 & = \alpha m + 2 \gamma,\\ \alpha & = 2\gamma. \end{aligned}\] This gives us \(\alpha = 1/(m+1)\) and \(\gamma=1/[2(m+1)]\), so [newode] becomes \[\frac{1}{2(m+1)}\frac{{\mathrm d}}{{\mathrm d}\eta}[v\eta^2] + D_0\frac{{\mathrm d}}{{\mathrm d}\eta}\left[\eta\left(\frac{v}{c_0}\right)^m \frac{{\mathrm d}v}{{\mathrm d}\eta}\right] = 0.\] Integrating with respect to \(\eta\) gives \[\frac{1}{2(m+1)}v\eta^2 + D_0\eta\left(\frac{v}{c_0}\right)^m \frac{{\mathrm d}v}{{\mathrm d}\eta} = C,\] with \(C\) a constant of integration.
We impose the boundary condition that \(v(\eta) \to 0\) sufficiently fast as \(\eta \to \infty\), which forces the constant of integration to be \(C=0\). For any finite \(t\), this is equivalent to saying the population vanishes as \(r\to \infty\).
We solve the differential equation using separation of variables, \[\begin{aligned} \int \frac{\eta}{2(m+1)}{\mathrm d}{\eta} &= - \int \frac{D_0}{c_0^m}v^{m-1}{\mathrm d}{v},\\ \implies \frac{\eta^2}{4(m+1)} &=- \frac{D_0}{m c_0^m}v^{m} + A,\\ \implies v &= \frac{m^{1/m} c_0}{D_0^{1/m}}\left(A - \frac{\eta^2}{4(m+1)}\right)^{1/m}. \end{aligned}\] Finally, using \(c= v/t^\alpha\) and \(\eta= r/t^{\beta}\), we get \[c(r,t)= \frac{m^{1/m} c_0}{t^{\alpha}D_0^{1/m}}\left(A - \frac{r^2}{t^{2\beta}4(m+1)}\right)^{1/m},\] where \(\alpha = 1/(m+1)\) and \(\beta=1/[2(m+1)]\).
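If you want to see what this solution actually looks like, here is a quick sketch in Python (a sketch only: the values of \(m\), \(A\), \(D_0\) and \(c_0\) are illustrative choices of mine, and setting the profile to zero where the bracket goes negative is the usual compact-support reading of the formula):

```python
# A sketch of the similarity profile, with illustrative values
# m = 1, A = 1, D0 = c0 = 1 (all assumptions, not from the notes).
import numpy as np
import matplotlib.pyplot as plt

m, A, D0, c0 = 1, 1.0, 1.0, 1.0
alpha = 1 / (m + 1)
beta = 1 / (2 * (m + 1))

r = np.linspace(0, 4, 400)
for t in [0.5, 1.0, 2.0]:
    bracket = A - r**2 / (t**(2 * beta) * 4 * (m + 1))
    # Outside the spreading front the bracket is negative; we take the
    # population there to be zero (compact support).
    c = (m**(1/m) * c0 / (t**alpha * D0**(1/m))) * np.clip(bracket, 0, None)**(1/m)
    plt.plot(r, c, label=f"t = {t}")

plt.xlabel("r"); plt.ylabel("c(r, t)"); plt.legend(); plt.show()
```

The support of the population spreads outwards at finite speed while the peak decays – quite different from linear diffusion, which spreads infinitely fast.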
To get \(A\) and hence the final solution, [porussol], we enforce the fact that the total population – the integral of the density over space – should remain constant for all time (the density itself isn’t constant in this more complex system, but its integral is conserved). This is a little tedious and I don’t really want you to spend your time doing this. The main concept we’re interested in here is the method of similarity solutions.
There is a link in this chapter’s extra reading to a bunch of worked problems using this method of symmetry reduction/similarity scaling.
Diffusion of populations is covered in Murray, vol. I, chap. 11. Note that we don’t do long-range diffusion, and that our derivation is more general than the derivation in the book since we include advection.
Solution of PDEs by Fourier transform: The Wikipedia page for the Fourier transform is very comprehensive on the topic. Really what you need to do is find examples to test yourself with. Joel Feldman at UBC provides a lot of good examples on his website.
Green’s method: What we cover here is essentially the basics, which you can find on Wikipedia. It can be used in a much wider context than given here, but that steps outside the scope of this course, so I recommend you stick to what is here.
Finite diffusion and spherically symmetric densities are covered in Murray, vol. I, chap. 11.3. The solution method is taken from Evans, L. C., 1998 Partial Differential Equations, page 188, ‘Similarity under scaling’. In that case it is done in Cartesian coordinates; we have adapted the method for the polar case. In that book there are other examples.
Helen Wilson at UCL also provides relevant worked examples.
In 1, 2, 3 and 4 we looked at time-dependent models of population growth and interaction. In 5, we looked at models where the time-dependence was balanced by diffusion, a spatial dependence. We’re now going to bring these two ideas together.
We’re going to start from the advection–diffusion setting of the last chapter, in one spatial dimension and for one species only, with the advection term dropped and a growth term \(f(u)\) included, \[\frac{\partial u}{\partial t} = D \frac{\partial^2 u}{\partial x^2} + f(u).\] Systems of this form, i.e. with reaction and diffusion but no advection, are known as reaction–diffusion systems. We will be studying them exclusively hereafter. They are capable of, frankly, a staggering variety of behaviour!
The first model we will examine is attributed to Ronald Fisher for his work, published in 1937, modelling the spread of an advantageous gene mutation in a population. We’ll get to Kolmogorov’s contribution shortly.
The Fisher–Kolmogorov (FK) equation provides a natural entry point into our discussion of reaction–diffusion equations and spatial dynamics since in this equation, we have \[f(u) = ku (1-u).\] Remember this? Of course you do! It’s logistic growth with a limiting population of \(u=1\). In the context of Fisher’s model, \(u\) is the proportion of the population carrying the advantageous gene.
We nondimensionalise using \(t = \widehat{t}/k\) and \(x = (D/k)^{1/2}\widehat{x}\); dropping hats gives us \[\begin{equation} \label{fk-nondim} \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + u(1-u). \end{equation}\] Recall that the logistic equation has two equilibria: \(u = 0\), which is unstable, and \(u = 1\), which is stable. In the context of the reaction–diffusion PDE, these constant values of \(u\) are homogeneous equilibria. What exactly do we mean by that?
Equilibrium: A function \(u\) is in equilibrium when all temporal derivatives \(\mathchoice{\frac{\partial^{n} u}{\partial t^{n}}}{\partial^{n} u/\partial t^{n}}{\partial^{n} u/\partial t^{n}}{\partial^{n} u/\partial t^{n}}\) vanish.
Homogeneity: A function is homogeneous when all its spatial derivatives \(\mathchoice{\frac{\partial^{n} u}{\partial x_i^{n}}}{\partial^{n} u/\partial x_i^{n}}{\partial^{n} u/\partial x_i^{n}}{\partial^{n} u/\partial x_i^{n}}\) vanish.
Let’s now consider the FK equation with the boundary conditions \[\begin{equation} \label{fk-bc} u \to 1 \text{ as } x \to -\infty \quad \text{and} \quad u \to 0 \text{ as } x \to \infty. \end{equation}\] This will require the solution to transition between the two equilibria, from \(u = 1\) to \(u = 0\), over some region of space, as illustrated in 6.1. Knowing what we know about the stabilities of \(u = 1\) and \(u = 0\), we expect that the transition region will not be stationary but will move to the right over time, allowing the stable \(u = 1\) region to expand.
Based on this intuition, we’ll search for a travelling wave solution that moves to the right, i.e. a solution whose \((x,t)\) dependence is in the form \(x-ct\). In other words, we look for a function \(u(z)\) where \(z = x-ct\). The constant \(c \geq 0\) is the wave speed, which is unknown and must be determined as part of the problem. The solution, therefore, can be viewed as having a fixed shape that is shifted to the right by the amount \(ct\) at time \(t\).
You might remember that travelling waves have this form because the wave equation, \[\mathchoice{\frac{\partial^2 u}{\partial t^2}}{\partial^2 u/\partial t^2}{\partial^2 u/\partial t^2}{\partial^2 u/\partial t^2} = c^2 \mathchoice{\frac{\partial^2 u}{\partial x^2}}{\partial^2 u/\partial x^2}{\partial^2 u/\partial x^2}{\partial^2 u/\partial x^2},\] which governs how mechanical waves behave, has the general solution \[u(x,t) = F(x-ct) + G(x+ct),\] i.e., a solution composed of a right-travelling function \(F\) and a left-travelling function \(G\).
Convince yourself that the wave really does move to the right. The best way to think about this is to fix \(t\) and draw some function \(u(x,t)=u(z)\) against \(x\). Then increase \(t\)... this is equivalent to what transformation in \(x\)?
The boundary conditions, and the stability of the equilibria at the boundaries, dictate which direction we expect the travelling wave to move in. What form of travelling wave solution would you look for (i.e. what would \(z\) equal) if we had the boundary conditions \[u \to 0 \text{ as } x \to -\infty \quad \text{and} \quad u \to 1 \text{ as } x \to \infty?\]
With this assumption, \[\mathchoice{\frac{\partial u}{\partial x}}{\partial u/\partial x}{\partial u/\partial x}{\partial u/\partial x} = \mathchoice{\frac{{\mathrm d}u}{{\mathrm d}z}}{{\mathrm d}u/{\mathrm d}z}{{\mathrm d}u/{\mathrm d}z}{{\mathrm d}u/{\mathrm d}z}, \qquad \mathchoice{\frac{\partial u}{\partial t}}{\partial u/\partial t}{\partial u/\partial t}{\partial u/\partial t} = -c\mathchoice{\frac{{\mathrm d}u}{{\mathrm d}z}}{{\mathrm d}u/{\mathrm d}z}{{\mathrm d}u/{\mathrm d}z}{{\mathrm d}u/{\mathrm d}z},\] and the FK equation becomes the ODE \[\mathchoice{\frac{{\mathrm d}^2 u}{{\mathrm d}z^2}}{{\mathrm d}^2 u/{\mathrm d}z^2}{{\mathrm d}^2 u/{\mathrm d}z^2}{{\mathrm d}^2 u/{\mathrm d}z^2} = -c \mathchoice{\frac{{\mathrm d}u}{{\mathrm d}z}}{{\mathrm d}u/{\mathrm d}z}{{\mathrm d}u/{\mathrm d}z}{{\mathrm d}u/{\mathrm d}z} -u(1-u)\] with the boundary conditions (now in \(z\)) \[u \to 1 \text{ as } z \to -\infty \quad \text{and} \quad u \to 0 \text{ as } z \to \infty.\]
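If you’d like to see this reduction done mechanically, here is a short symbolic sketch (the function name \(U\) and the use of sympy are my choices; the printed output is just \(U\) and its derivatives evaluated at \(x-ct\)):

```python
import sympy as sp

x, t, c = sp.symbols('x t c')
U = sp.Function('U')
u = U(x - c*t)  # the travelling-wave ansatz, z = x - c*t

# Residual of the nondimensional FK equation, u_t - u_xx - u(1 - u):
residual = sp.diff(u, t) - sp.diff(u, x, 2) - u*(1 - u)
print(sp.simplify(residual))
# Every term depends on x and t only through z = x - c*t; setting the
# residual to zero is exactly the ODE  u'' + c u' + u(1 - u) = 0  in z.
```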
The last time we looked at a second-order differential equation (2), we threw in the form \(u=A\mathrm{e}^{\lambda t}\) and looked at the sign of \(\lambda\) for stability. We can’t do the same here (with \(z\) in place of \(t\)) because of the nonlinear logistic term.
Thirty-second sanity test: Try substituting in \(u=A\mathrm{e}^{\lambda z}\) and see what happens.
In fact, this nonlinearity scuppers many of our plans. Because of it, we will not be able to find an explicit form for \(u(z)\); it’s just too hard. But numerically this equation is easy to solve, and then we know that whatever shape this solution \(u(z)\) takes, in the full solution \(u(x,t)\) this shape will simply move to the right with speed \(c\). What we can do analytically, however, is find the value of \(c\). We do this by – funnily enough – looking at the stability of this second-order ODE.
We can learn more about the stability by rewriting the second-order ODE as two first-order ODEs: introducing \(v = \frac{{\mathrm d}u}{{\mathrm d}z}\), we have \[\begin{aligned} \frac{{\mathrm d}u}{{\mathrm d}z} &= v,\\ \frac{{\mathrm d}v}{{\mathrm d}z} &= -u(1-u) - cv. \end{aligned}\] We’re now in a position to apply all of the tools we are familiar with for analysing systems of ODEs. Let’s proceed by performing a linear stability analysis about the equilibria of the system.
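Before we do the analysis by hand, here is a small numerical experiment (a sketch only: the trial speed \(c = 2.5\), the perturbation size and the \(z\)-range are arbitrary choices of mine; the departure direction is the unstable eigendirection of the linearisation at \((1,0)\), which we justify in the analysis below):

```python
# Start just off the equilibrium (1, 0) and integrate the travelling-wave
# system forward in z; the trajectory runs into (0, 0).
import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

c = 2.5  # an arbitrary trial wave speed

def rhs(z, y):
    u, v = y
    return [v, -u*(1 - u) - c*v]

# Unstable eigendirection at (1, 0): lambda = (-c + sqrt(c^2 + 4))/2.
lam = 0.5*(-c + np.sqrt(c**2 + 4))
y0 = np.array([1.0, 0.0]) - 1e-3*np.array([1.0, lam])  # u < 1, v < 0

sol = solve_ivp(rhs, [0, 60], y0, max_step=0.01)
plt.plot(sol.y[0], sol.y[1])          # the (u, v) phase-plane trajectory
plt.xlabel("u"); plt.ylabel("v = du/dz"); plt.show()
```

The trajectory leaves \((1,0)\) with \(v<0\) and runs into \((0,0)\): exactly the connecting solution we are after.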
But a quick warning before we do so: when we change coordinate systems, we have to consider ‘stability’ differently. We now have an ODE in \(z\), not in \(t\), so we will not necessarily expect the same stability results as for the homogeneous system.
The equilibria are \((0, 0)\) and \((1, 0)\).
The Jacobian is given by \[\mathsfbfit{J} = \begin{pmatrix} 0 & 1 \\ 2u-1 & -c \end{pmatrix}.\]
We once again look for eigenvalues by looking to solve \(\det(\mathsfbfit{J}-\lambda\mathsfbfit{I}) = 0\).
The Jacobian at \((0,0)\) is \[\mathsfbfit{J} = \begin{pmatrix} 0 & 1 \\ -1 & -c \end{pmatrix},\] and the eigenvalues are \[\lambda = \frac12(-c\pm\sqrt{c^2-4}).\] This is a stable node if \(c>2\) (degenerate at \(c=2\)), or a stable spiral if \(c<2\).
Let’s have a think about what a stable spiral means here. It means that the trajectory in phase space will spiral about \((0,0)\) and for some values of \(z\) we will therefore have \(u < 0\): see 6.2(b). This is unrealistic in the sense that it doesn’t describe what can occur in the real system: \(u\) can’t be negative. Therefore, we’ll only interest ourselves in the case where \(c \geq 2\).
The Jacobian at \((1,0)\) is \[\mathsfbfit{J} = \begin{pmatrix} 0 & 1 \\ 1 & -c \end{pmatrix},\] giving us eigenvalues \[\lambda = \frac12(-c\pm\sqrt{c^2+4}),\] making \((1,0)\) a saddle point.
Quick thought: in the ODE case (logistic growth only), which of \(u = 0\) and \(u = 1\) was stable? If you think about this, it may seem that our new stabilities of \((0, 0)\) and \((1, 0)\) seem off. But think of our boundary conditions, [fk-bc]: we need the solution to move away from \((1, 0)\) as \(z\) increases. Hence, we need \((1, 0)\) to be unstable. Further, these boundary conditions mean that the solution we seek ‘starts’ at the saddle point \((1, 0)\) and moves along the trajectory that connects to the node at \((0,0)\).
What does this look like in phase space? Firstly, as the trajectory leaves \((1,0)\), \(u\) is decreasing, so \(v=\mathchoice{\frac{{\mathrm d}u}{{\mathrm d}z}}{{\mathrm d}u/{\mathrm d}z}{{\mathrm d}u/{\mathrm d}z}{{\mathrm d}u/{\mathrm d}z}\) must be negative. Because \((1,0)\) is a saddle point (think about the sketch in 3.1), only one such trajectory can exist. Part of Kolmogorov’s contribution was to show that this trajectory must hit \((0,0)\) directly; given \((0,0)\) is a stable node, hopefully this is intuitively true, albeit unproven here. We can draw this as 6.2(a).
The final part of Kolmogorov’s contribution was effectively to couple this with a selection rule: physically sensible initial conditions evolve towards the wave with the slowest permissible speed. We can therefore say that our travelling wave solution has \(c=2\).
How does this translate back into the graph of \(u(z)\)? We know it must go from \(u=1\) to \(u=0\), and that \(\mathchoice{\frac{{\mathrm d}u}{{\mathrm d}z}}{{\mathrm d}u/{\mathrm d}z}{{\mathrm d}u/{\mathrm d}z}{{\mathrm d}u/{\mathrm d}z}\) (\(=v\), remember) is negative throughout. Furthermore, we know the magnitude of \(\mathchoice{\frac{{\mathrm d}u}{{\mathrm d}z}}{{\mathrm d}u/{\mathrm d}z}{{\mathrm d}u/{\mathrm d}z}{{\mathrm d}u/{\mathrm d}z}\) increases then decreases. The result... 6.3(a). This plot is from a numerical solution, but you can see we were able to qualitatively describe it analytically. Furthermore, see what happens when \(c<2\): in 6.3(b) you see that \(u<0\) at some point, as expected. Red card!
In conclusion: the full solution, \(u(x,t)\), looks like the plot in 6.3(a) (for \(u\) against \(x\)), and moves rightwards over time with speed \(c=2\).
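Here is a rough finite-difference check of that claim (explicit Euler; the domain size, grid, time step and initial step profile are all my choices, and the front position is measured crudely at the \(u = 1/2\) level):

```python
# Explicit finite-difference sketch of u_t = u_xx + u(1 - u).
import numpy as np

L, N, dt, T = 100.0, 1000, 0.002, 30.0
x = np.linspace(0, L, N)
dx = x[1] - x[0]
u = np.where(x < 10, 1.0, 0.0)  # step initial condition

def front_position(u, x):
    return x[np.argmin(np.abs(u - 0.5))]   # crude: where u crosses 1/2

pos0, t = None, 0.0
while t < T:
    lap = np.zeros_like(u)
    lap[1:-1] = (u[2:] - 2*u[1:-1] + u[:-2]) / dx**2
    u += dt * (lap + u*(1 - u))
    u[0], u[-1] = 1.0, 0.0                 # pin the far-field values
    t += dt
    if pos0 is None and t >= 10:           # start measuring after a transient
        pos0, t0 = front_position(u, x), t

speed = (front_position(u, x) - pos0) / (t - t0)
print(f"measured front speed = {speed:.2f} (prediction: c = 2)")
```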
Let’s make good on our promise from earlier that we would return to our squirrels: the invasion of Britain by grey squirrels overpowering the red squirrels (6.4). Remember that the squirrels appeared in the context of two populations competing for the same resources. This led us to the competitive Lotka–Volterra equations, [lotvolcomp2]. If we add in diffusion to both populations, we can write this system as \[\begin{aligned} \frac{\partial u}{\partial t} & = D_1\nabla^2 u + a_1u(1-b_1 u-c_1v),\\ \frac{\partial v}{\partial t} & = D_2\nabla^2 v + a_2v(1-b_2v-c_2 u), \end{aligned}\] where the functions \(u(\mathbfit{x},t)\) and \(v(\mathbfit{x},t)\) have both spatial and temporal dependence. Once again, we have \(a_i,b_i,c_i > 0\).
We nondimensionalise by setting \(\mathbfit{x} = (D_1/a_1)^{1/2}\widehat{\mathbfit{x}}\), \(t = \widehat{t}/a_1\), \(\kappa = D_2/D_1\), \(\beta = a_2/a_1\), \(\gamma_1= c_1/b_2\), \(\gamma_2 = c_2/b_1\), \(u= \widehat{u}/b_1\), \(v = \widehat{v}/b_2\) (and dropping hats) to get \[\begin{align} \label{scaledlotvolpursuit} \mathchoice{\frac{\partial u}{\partial t}}{\partial u/\partial t}{\partial u/\partial t}{\partial u/\partial t} &= \nabla^2 u +u(1-u-\gamma_1v),\\ \mathchoice{\frac{\partial v}{\partial t}}{\partial v/\partial t}{\partial v/\partial t}{\partial v/\partial t} &= \kappa\nabla^2 v + \beta v(1-v-\gamma_2u). \end{align}\]
At this point, you may notice that we haven’t picked a winner from \(u\) and \(v\). But if you recall from 3.9, reprinted in 6.5, in the homogeneous case, choosing \(\gamma_1>1\) and \(\gamma_2<1\) makes the stable equilibrium the one where \(v\) dominates over \(u\). This choice allows us to continue to think of \(v\) as the grey squirrels.
In the homogeneous case, as we can see in 6.5, there are three equilibria:
\((0,0)\), an unstable node,
\((0,1)\), a stable node,
\((1,0)\), a saddle point.
From looking at the phase plane, or by thinking about trajectories leaving unstable equilibria, we can see that there are trajectories between our equilibria:
from \((0,0)\) to \((0,1)\) (a vertical line going upwards),
from \((1,0)\) to \((0,1)\).
The first case is quite boring: it’s just the growth of the grey population in the absence of reds. The second case is more interesting, as it corresponds to the shift from red domination to grey domination in the homogeneous system. Let’s assume a 1D Britain (Durham will be at about \(x=L/2\)... yes, I know, you’d think larger, but Scotland is just really big) and look for travelling wave solutions corresponding to this more interesting case.
Let’s consider the boundary conditions. If we, once again, choose to place the stable state at \(x\to-\infty\), then we expect this region to grow, and the travelling wave therefore to move rightwards. (This is the usual choice in the community.) We therefore look for functions \(u(z),v(z)\) where \(z= x-ct\).
With this assumption, [scaledlotvolpursuit] takes the form \[\begin{align} \mathchoice{\frac{{\mathrm d}^2 u}{{\mathrm d}z^2}}{{\mathrm d}^2 u/{\mathrm d}z^2}{{\mathrm d}^2 u/{\mathrm d}z^2}{{\mathrm d}^2 u/{\mathrm d}z^2}+c\mathchoice{\frac{{\mathrm d}u}{{\mathrm d}z}}{{\mathrm d}u/{\mathrm d}z}{{\mathrm d}u/{\mathrm d}z}{{\mathrm d}u/{\mathrm d}z} + u(1 - u - \gamma_1v) &= 0,\\ \kappa\mathchoice{\frac{{\mathrm d}^2 v}{{\mathrm d}z^2}}{{\mathrm d}^2 v/{\mathrm d}z^2}{{\mathrm d}^2 v/{\mathrm d}z^2}{{\mathrm d}^2 v/{\mathrm d}z^2}+c\mathchoice{\frac{{\mathrm d}v}{{\mathrm d}z}}{{\mathrm d}v/{\mathrm d}z}{{\mathrm d}v/{\mathrm d}z}{{\mathrm d}v/{\mathrm d}z} + \beta v(1 - v -\gamma_2u) &= 0.\label{fk-grey-squirrel} \end{align}\] with boundary conditions \[(u,v)\to(0,1)\text{ as }z\to-\infty, \qquad (u,v)\to(1,0)\text{ as }z\to\infty,\] implying that a travelling wave acts between these two extremes, and travels rightwards with a speed \(c\), to be determined: this idea is illustrated in 6.6(a).
Convince yourself that you are happy with how the boundary condition has changed from being in \(x\) to being in \(z\).
In general, it can be shown this system is not integrable; however, in the case \(\beta = \kappa = 1\) and \(\gamma_1+\gamma_2 = 2\), we can add the two equations to get \[\mathchoice{\frac{{\mathrm d}^2 w}{{\mathrm d}z^2}}{{\mathrm d}^2 w/{\mathrm d}z^2}{{\mathrm d}^2 w/{\mathrm d}z^2}{{\mathrm d}^2 w/{\mathrm d}z^2} + c\mathchoice{\frac{{\mathrm d}w}{{\mathrm d}z}}{{\mathrm d}w/{\mathrm d}z}{{\mathrm d}w/{\mathrm d}z}{{\mathrm d}w/{\mathrm d}z} + w(1-w)=0,\] where \(w = u+v\), subject to \(w(-\infty)=1\) and \(w(\infty)=1\).
This is (hurrah!) the FK equation again, so we can expect travelling wave solutions between the two conditions at \(\pm\infty\). But lo!, actually these conditions are the same, \(w(\pm\infty)=1\). So although we have a travelling wave, it goes from \(w=1\) to \(w=1\). As you can see in 6.6(b), this actually just suggests that \(w(z)=1\) for all \(z\)... a little underwhelming.
In the language of dynamical systems, a trajectory that goes from one equilibrium to a different one is known as a heteroclinic orbit, or heteroclinic connection. That was the type we saw in the FK equation example. A trajectory that goes from one equilibrium back to itself is a homoclinic orbit, and that is the type we have here.
You might be unconvinced that the boundary conditions really imply \(w\) has to be a constant: after all, could you not have a solution which is \(w=1\) everywhere apart from one travelling bit which is a single wave? (Or more technically, a soliton?) In general, yes, this can happen... just not for this equation! You can prove this using Sturm–Liouville theory but that is outside the scope of our study here.
Nonetheless this allows us to write \(u=1-v\), and [fk-grey-squirrel] becomes \[\mathchoice{\frac{{\mathrm d}^2 v}{{\mathrm d}z^2}}{{\mathrm d}^2 v/{\mathrm d}z^2}{{\mathrm d}^2 v/{\mathrm d}z^2}{{\mathrm d}^2 v/{\mathrm d}z^2} + c \mathchoice{\frac{{\mathrm d}v}{{\mathrm d}z}}{{\mathrm d}v/{\mathrm d}z}{{\mathrm d}v/{\mathrm d}z}{{\mathrm d}v/{\mathrm d}z} + (1-\gamma_2)v(1-v) = 0,\] with \(v(-\infty) = 1\), \(v(\infty) = 0\).
Once again, this is the FK equation (just with different constants), and we can solve it numerically to find the shape of the profile: see 6.7. More importantly for us, we can also use the same linear stability analysis as before to find that the associated wave speed must be \[c = 2\sqrt{1-\gamma_2},\] giving us a prediction for the speed of the approaching grey squirrel army.
Convince yourself! Perform the linear stability analysis on our altered FK equation and confirm that we do indeed find \(c = 2\sqrt{1-\gamma_2}\).
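If you want to check your working, here is a symbolic sketch of that computation (sympy and the variable names are my choices):

```python
import sympy as sp

c, g2 = sp.symbols('c gamma_2', positive=True)
V, W = sp.symbols('V W')

# Travelling-wave system: V' = W, W' = -c W - (1 - gamma_2) V (1 - V).
f = sp.Matrix([W, -c*W - (1 - g2)*V*(1 - V)])
J = f.jacobian([V, W]).subs({V: 0, W: 0})   # linearise at (0, 0)
print(J.eigenvals())
# lambda = (-c +- sqrt(c**2 - 4*(1 - gamma_2)))/2: real (no spiral, so v
# stays non-negative) exactly when c >= 2*sqrt(1 - gamma_2).
```

Real eigenvalues at \((0,0)\) require \(c^2 \geq 4(1-\gamma_2)\), and the minimal such speed is \(c = 2\sqrt{1-\gamma_2}\).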
As a model of squirrel invasion through Britain, this one is surprisingly good, even if here we are modelling Britain as a one-dimensional line (maybe the squirrels take the A1). Murray, vol. II, chap. 1, has a nice discussion on fitting real-world parameters to this model.
We’ve just seen the spatial extension of the competitive Lotka–Volterra equations. Now let’s take a look at a spatial extension of a predator–prey system that is closer to the original Lotka–Volterra model.
Consider the system \[\begin{aligned} \frac{\partial u}{\partial t} &= D_1 \frac{\partial^2 u}{\partial x^2} + au\left(1-\frac{u}{K}\right) - buv,\\ \frac{\partial v}{\partial t} &= D_2 \frac{\partial^2 v}{\partial x^2} - cv + duv, \end{aligned}\] where \(u\) is the prey population size, and \(v\) is the predator population size. In the absence of predators, the prey population obeys the FK equation with diffusion coefficient \(D_1\), per-capita growth rate \(a\), and carrying capacity \(K\). With predators, the prey are consumed at a rate \(buv\) (the \(-buv\) term). The predator population grows in proportion to its rate of prey consumption, \(duv\), dies off at a rate \(cv\), and spreads with a diffusion coefficient \(D_2\).
We nondimensionalise the system by introducing \[u = K\widehat{u}, \quad v = \frac{a}{b}\widehat{v}, \quad t = \frac{\widehat{t}}{a}, \quad\text{and}\quad x = \sqrt{\frac{D_2}{a}}\widehat{x}\] and dropping hats to obtain \[\begin{aligned} \frac{\partial u}{\partial t} &= D\frac{\partial^2 u}{\partial x^2} + u(1-u-v),\\ \frac{\partial v}{\partial t} &= \frac{\partial^2 v}{\partial x^2} + \alpha v(-\beta+u), \end{aligned}\] where \(\alpha= dK/a\), \(\beta = c/(dK)\) and \(D=D_1/D_2\).
We will consider the situation where the prey is not able to move, i.e. \(D_1=0\) and so \(D=0\). This would apply, for example, to non-motile microorganisms that can be consumed by a motile predator, such as nematodes or another simple animal (6.8). This leaves us with \[\begin{align} \label{sclv-nondim} \mathchoice{\frac{\partial u}{\partial t}}{\partial u/\partial t}{\partial u/\partial t}{\partial u/\partial t} &= u(1-u-v),\\ \mathchoice{\frac{\partial v}{\partial t}}{\partial v/\partial t}{\partial v/\partial t}{\partial v/\partial t} &= \mathchoice{\frac{\partial^2 v}{\partial x^2}}{\partial^2 v/\partial x^2}{\partial^2 v/\partial x^2}{\partial^2 v/\partial x^2}+\alpha v(u-\beta). \end{align}\] As we have done with the FK equation, let’s take a look at the homogeneous equilibria: \[\begin{aligned} 0 &= u(1-u-v),\\ 0 &= \alpha v(u-\beta). \end{aligned}\] The homogeneous system has three equilibria: \[(0,0), \quad (1,0), \quad \text{and} \quad (\beta, 1-\beta).\] For that last coexistence equilibrium to be permissible, we need \(\beta < 1\), so let’s take this to be the case in the following discussion.
A linear stability analysis of the equilibria of the homogeneous system (conveniently left as an exercise; a symbolic sketch follows the list below) reveals that:
\((0,0)\) is a saddle point,
\((1,0)\) is a saddle point,
\((\beta, 1-\beta)\) is a stable node when \(4\alpha < \beta/(1-\beta)\), and a stable spiral for \(4\alpha > \beta/(1-\beta)\).
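For the exercise above, here is a sketch of the symbolic check (sympy and the variable names are my choices; the trace and determinant are enough to classify each point):

```python
import sympy as sp

u, v, a, b = sp.symbols('u v alpha beta', positive=True)
f = sp.Matrix([u*(1 - u - v), a*v*(u - b)])
J = f.jacobian([u, v])

for eq in [(0, 0), (1, 0), (b, 1 - b)]:
    Je = J.subs({u: eq[0], v: eq[1]})
    print(eq, 'trace:', sp.simplify(Je.trace()), 'det:', sp.simplify(Je.det()))
# (0,0) and (1,0) have det < 0 (saddles, using beta < 1). (beta, 1-beta)
# has trace = -beta < 0 and det = alpha*beta*(1-beta) > 0, so it is
# stable; node vs spiral is decided by the sign of trace**2 - 4*det,
# i.e. by beta/(1-beta) versus 4*alpha.
```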
Once again, this suggests possible trajectories between our equilibria:
from \((0,0)\) to \((\beta,1-\beta)\),
from \((1,0)\) to \((\beta,1-\beta)\).
The beginning of a phase portrait for the homogeneous system is shown in 6.9, with the equilibria marked on it. The position of the equilibria is really all we know at this stage; to find the trajectories we could do a proper phase plane analysis. Without this computation, we can still guess that there might be trajectories from the saddle points to the stable spiral.
Returning to the PDE system, we’ll proceed as we did with the FK equation and search for travelling wave solutions. We’ll place the stable equilibrium at \(x=-\infty\) and therefore look for right-travelling waves of the form \[u(x,t) = u(z), \quad v(x,t) = v(z), \quad z = x-ct.\]
The analysis of this problem concludes in Problem Sheet 4.
Let’s have a look at some numerical simulations of the full 2D system for our squirrels, [scaledlotvolpursuit]. In 6.10 we see snapshots of a simulation whose initial conditions are two Gaussian populations which diffuse and interact. The values of \(\gamma_1\) and \(\gamma_2\) are chosen so that \(v\) represents the grey squirrels (inconveniently in red). The boundary conditions are that both populations are zero on the boundary. Do we get travelling wave solutions forever, like in our 1D model?
From (a)–(b) the populations diffuse and begin to interact. In (c)–(e) we see the greys gradually outcompeting the reds. This ends with the reds becoming extinct in (f). This case does lead to a finite-time equilibrium. The critical observation here is that boundary conditions play a fundamental role in determining what type of solutions are permitted. The travelling wave solution would eventually violate the zero boundary conditions, so it cannot persist for all \(t\).
We are going to round off this term by starting to look at PDEs with boundary conditions which don’t admit travelling wave solutions. We will do this by looking at a physical problem, and then a fun biological example.
The FK equation is covered in Murray, vol. I, chap. 13.
Invading squirrels and spatial predator–prey are covered in Murray, vol. II, chap. 1. The questions at the end of the chapter contain a number of example travelling wave type problems which might be helpful.
Jacek Banasiak at Łódź University of Technology provides some worked examples of travelling wave solutions to various problems.
The question of stability for equilibrium solutions to partial differential equations is far more complex than for ordinary differential equation systems. This is primarily for two reasons:
We have boundary conditions, not just initial conditions as is the case for ODEs.
The equilibria can be spatially dependent. As you’ll see next term, this generally means we end up with linearised sets of equations for which the coefficients aren’t constant. This means we can’t always assume a general form for the solutions, which was possible with the exponential form of the ODE case.
We have seen already that functions can be in equilibrium (\(\mathchoice{\frac{\partial^{n} u}{\partial t^{n}}}{\partial^{n} u/\partial t^{n}}{\partial^{n} u/\partial t^{n}}{\partial^{n} u/\partial t^{n}} = 0\)) and be homogeneous (\(\mathchoice{\frac{\partial^{n} u}{\partial x_i^{n}}}{\partial^{n} u/\partial x_i^{n}}{\partial^{n} u/\partial x_i^{n}}{\partial^{n} u/\partial x_i^{n}} = 0\)). These two things don’t necessarily come together: equilibria can have nontrivial spatial derivatives, as we saw in the last chapter. And we can have homogeneous solutions which grow in time out of equilibrium (if the boundary conditions allow).
But there exists a special class of equilibria – homogeneous equilibria – where you get both of these things, and these represent non-changing uniform densities or populations. Of course, this is only permissible for a specific set of boundary conditions: namely that all derivatives vanish on the boundary of the domain. But there are a good number of scenarios where this is relevant. Physically this really just says populations are required to stay within the given domain. Such boundary conditions are commonly referred to as no-flux. For these types of equilibria, the stability analysis is relatively straightforward. We start with a (relatively) simple example which has an interesting biological relevance.
Take a look at 7.1. We consider a thin tubular elastic body which is initially lined up along the \(z\)-axis. It is subjected to a load, \(N\), at one end, and it is clamped at the other end. We assume its ends stay lined up along \(z\) as it is (possibly) deformed under the force, and we monitor the deflection from the \(z\)-axis, \(w(s,t)\), as a function of arclength along the tube, \(s\), and time, \(t\). The equation of motion of this system can be shown to be \[\begin{align} \label{beameq} \frac{\partial^{4} w}{\partial s^{4}} + \frac{N}{EI}\frac{\partial^2 w}{\partial s^2} +\rho A \frac{\partial^2 w}{\partial t^2}=0, \end{align}\] with \(\rho\) the tube density, \(A\) its cross-sectional area, \(E\) its Young’s modulus (resistance to stretching), and \(I\) a moment of inertia. (You’re not expected to derive this equation!) The boundary conditions are \[w(0,t) = 0,\quad w(L,t) = 0,\quad \frac{\partial w}{\partial s}(0,t)= 0,\quad \frac{\partial w}{\partial s}(L,t)=0 \quad \forall t.\] These boundary conditions are clamped conditions: the first two mean no deflection at either end, and the last two mean no slope at either end, so the tube stays aligned with the \(z\)-axis there.
There is a trivial equilibrium solution \(w_0(s,t)=0\) for all \(s,t\), which corresponds to the body remaining straight. Note this is true whatever the applied load, \(N\). However, our experience tells us the tube should give way under enough force. To ascertain when this happens, we perform a linear stability analysis using the same steps as in the ODE case.
In this case we are only interested in the homogeneous equilibrium \(w_0(s,t) = 0\) for all \(s,t\). From experience, there are whole classes of inhomogeneous equilibria, but we will not pursue them here.
We expand out the solution as \(w(s,t) \approx w_0 + \varepsilon w_1(s,t)\). [beameq] is already linear with constant coefficients so the linearised equation is simply \[\begin{align} \label{beameqlin} \mathchoice{\frac{\partial^{4} w_1}{\partial s^{4}}}{\partial^{4} w_1/\partial s^{4}}{\partial^{4} w_1/\partial s^{4}}{\partial^{4} w_1/\partial s^{4}} + \frac{N}{EI}\mathchoice{\frac{\partial^2 w_1}{\partial s^2}}{\partial^2 w_1/\partial s^2}{\partial^2 w_1/\partial s^2}{\partial^2 w_1/\partial s^2} +\rho A \mathchoice{\frac{\partial^2 w_1}{\partial t^2}}{\partial^2 w_1/\partial t^2}{\partial^2 w_1/\partial t^2}{\partial^2 w_1/\partial t^2}=0, \end{align}\] (it will not always be this easy!).
We seek solutions which will either grow or decay in time, i.e. solutions of the form \[\begin{equation} \label{form-of-w1} w_1(s,t) = f(s)\mathrm{e}^{\lambda t}, \end{equation}\] with \(\lambda\) the growth constant. [beameqlin] reduces to \[\begin{equation} \label{linbeam1} \frac{{\mathrm d}^{4} f}{{\mathrm d}s^{4}} + \frac{N}{EI}\frac{{\mathrm d}^2 f}{{\mathrm d}s^2} +\rho A f(s)\lambda^2=0. \end{equation}\] In essence this is an eigenvalue problem for the operator \[\frac{{\mathrm d}^{4} }{{\mathrm d}s^{4}} + \frac{N}{EI}\frac{{\mathrm d}^2 }{{\mathrm d}s^2}.\] Since the equation has constant coefficients, we seek sinusoidally varying solutions of the form \[\begin{equation} \label{complexsol} f(s) = B\mathrm{e}^{\mathrm{i}k s}, \end{equation}\] with \(k\) to be determined by our boundary conditions. Note this is just a lazy way of representing both the cos and sin solution behaviour of linear constant-coefficient ODEs. We assume we only want to satisfy the real part of the equation obtained by substituting this in. Choosing \(B\) as real or imaginary then selects the required behaviour.
This solution is assumed to satisfy \(w_1(0) = w_1(L) = 0\) (no further end deflection) but we allow for arbitrary (but small) changes in the derivatives of \(w\) at the boundary. Therefore \(w_1(0)=w_1(L) =0\) are the only boundary conditions we impose for our small variations around equilibrium. These boundary conditions can only be satisfied for the \(\sin\) part of \(f(s)\): recalling that we only need to satisfy the real part of the equation, we have from Euler’s formula, \[B\mathrm{e}^{\mathrm{i}k s} = B(\cos ks + \mathrm{i}\sin ks),\] and so we set \(B =-C\mathrm{i}\), for some real \(C\), and require \[\begin{equation} \label{kbcon} kL = n \pi. \end{equation}\] The derivatives are not required to be zero for these small perturbations and the change in value of the derivatives sets the magnitude of \(B\), the size of the mode of vibration (solutions of the form [complexsol] are ‘vibrations’ of the system, see 7.2), but this will not be important to us here.
It might seem odd to use the complex wave function rather than \(\cos\) or \(\sin\) (the equation contains only even-order derivatives, so either would do), but this is a more general solution, and \(B\) can just be made imaginary to choose the \(\sin\) solution. It is also handy where there are odd derivatives and a growing/decaying exponential is required for the solution (\(k\) can have an imaginary part). It is standard practice to use this complex waveform, and since we never actually need to fully set the value of \(B\) (we are just interested in \(\lambda\)), it is easier to do so.
With this assumption for \(f(s)\) we have \[k^4 - \frac{N}{E I}k^2 + \rho A \lambda^2 = 0 \implies \lambda^2 = \frac{1}{\rho A}k^2\left(\frac{N}{E I}-k^2\right).\]
If \[\begin{equation} \label{kvcon} k^2 < \frac{N}{EI}, \end{equation}\] we have real eigenvalues, one of which is positive, and so the system will be unstable as it grows in time. It would appear that if \[k^2 > \frac{N}{EI},\] then the solutions are purely imaginary, leaving us in the degenerate case. But actually, satisfying the boundary conditions in this final case requires that the solution is zero. This is because the solutions in this case are in the form \[w_1(s,t) = f(s)\mathrm{e}^{\lambda t} = B\mathrm{e}^{\mathrm{i}(k s + \nu t)},\] where \[\nu = \pm\frac{k}{\sqrt{\rho A}}\sqrt{k^2-\frac{N}{EI}},\] and remembering we only have to look at the real part of the equation, \[\operatorname{Re}(w_1(s,t)) = C\sin(k s + \nu t).\] But \(C\sin(k s+\nu t)\) cannot satisfy the boundary conditions for all time, \(t\), unless \(C=0\): by having only imaginary \(\lambda\) we have killed the nontrivial solution’s ability to satisfy our constraints. Physically this is equivalent to saying the vibrations do not exist.
And finally, if \(k^2=N/EI\), we have zero eigenvalues, leaving us in the degenerate case where there are neighbouring equilibrium solutions: under the very specific load, \(N\), the perturbations are also equilibria.
Here we see the first example of the complexity of PDE stability in comparison to ODE stability: the difference between static and dynamic equilibrium analysis!
Using [kvcon] and [kbcon] we see that if the force \(N\) satisfies \[\begin{equation} \label{eulerbuck} N > \frac{n^2 \pi^2 E I}{L^2}, \end{equation}\] then the initial \(w=0\) \(\forall s\) solution is unstable and the beam will give way under the applied force, \(N\). The lowest force, \(N\), for which this will be true is the \(n=1\) mode. So the minimum critical force \(N_c\) after which the beam is unstable is \[\begin{equation} \label{criticalforce} N_c=\frac{\pi^2 E I}{L^2}. \end{equation}\] This is a famous result due to (surprise!) Euler, although the way he derived it is a little different. It has actually been shown to have some biological relevance...
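To put some numbers on [eulerbuck] and [criticalforce], here is a quick sketch; every parameter value below is an entirely made-up illustrative assumption:

```python
# Critical loads N_n = n^2 pi^2 E I / L^2 for a cylindrical elastic rod;
# E, R and L are illustrative assumptions, not values from the notes.
import numpy as np

E = 1e9            # Young's modulus, Pa
R, L = 0.01, 1.0   # rod radius and length, m
I = np.pi * R**4 / 4   # moment of inertia of a circular cross-section

for n in (1, 2, 3):
    N_n = n**2 * np.pi**2 * E * I / L**2
    print(f"mode n = {n}: unstable for N > {N_n:.1f} N")
```

The \(n=1\) mode gives the smallest threshold, which is why \(N_c\) in [criticalforce] is the one that matters.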
What happens if we include the derivative boundary conditions for \(w_1\)? First note that \(f(s) = B\mathrm{e}^{\mathrm{i}ks}\), with \(kL=n\pi\), as we had in [complexsol], can’t satisfy both the condition that \(w_1 = 0\) and \(\frac{\partial w_1}{\partial s} = 0\) at \(s=0\) and \(s=L\), unless \(B=0\). What could satisfy both boundary conditions is \(f(s) = B(1-\mathrm{e}^{\mathrm{i}k s})\) with \(kL=2n\pi\), or more intuitively, \[f(s) = \widetilde{C} \left [1 - \cos\left(\frac{2 n \pi s}{L}\right)\right],\] but this only satisfies [linbeam1] if \(\lambda=0\), i.e. there is no time dependence in \(w_1(s,t)\) (if you look at [form-of-w1]). So we would be in the degenerate case with neighbouring equilibrium solutions. But in that case, and following the same method as above, we find that the critical force after which the beam is unstable corresponds to the second, \(n=2\), mode.
Mammals tend to use their legs to support the weight of their bodies. Let us imagine a cow with four cylindrical legs (7.3): for a cylinder with radius \(R\), the moment of inertia is \(I = \pi R^4/4\). Its main body (the meaty bit!) has a volume \(V\) and density \(\rho_{c}\), so that its weight is \(\rho_c V g\). If we assume this weight (a load) is split equally amongst the four legs, then \[N= \frac{\rho_cV g}{4}.\] Now, if we increase the dimensions of the animal by a factor \(a\), we have \[V_{\text{new}} = a^3V,\] so the load on each leg grows cubically in the scaling \(a\).
Next, we look at the critical force equation, [criticalforce]. The Young’s modulus, \(E\), is a material constant so the minimum critical force scales like \(N_c \sim R^4/L^2\). Well, under our increased dimensions, \[N_c^{\text{new}} \sim \frac{R_{\text{new}}^4}{L_{\text{new}}^2} = \frac{a^4 R^4}{a^2 L^2} = a^2\frac{R^4}{L^2}.\] Therefore the critical load under which the legs (beams) will give way scales as \(a^2\). Under this change in dimensions, the size of the imposed force, \(N\), which scales with \(a^3\), grows faster than the critical load \(N_c\), which scales as \(a^2\). That is to say, scaling up the animal increases the possibility that the animal’s weight would cause its legs to give way, providing a possible explanation for why land mammals are limited in size, unlike sea mammals like whales.
In fact, we can go a bit further from a biological context. Imagine that the cow had dimensions of width, height and length. Imagine the length and width scaled (as the animal grows) at a rate \(b\) whilst the height scales as \(a\). Then the volume would scale as \[V_{\text{new}} = ab^2V\] and \[N_c^{\text{new}} \sim \frac{R_{\text{new}}^4}{L_{\text{new}}^2} = \frac{b^4}{a^2}\frac{R^4}{L^2}.\] One can see that if \(b=a^{3/2}\) then both the local \(N\) \((=\rho_c V g/4)\) and the critical load \(N_c\) will scale at the same rate, \[N_\text{new} \sim a^4 N, \qquad N_c^{\text{new}} \sim a^4 N_c.\] This would imply that in order for the legs not to buckle as we scale the cow up, the animal/cow would need to get wider (\(b\)) at a quicker rate than it gets taller (\(a\))!
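A two-line symbolic check of that claim (sympy is my choice of tool; the point is just the exponent arithmetic):

```python
# With b = a**(3/2), both the load N ~ V ~ a*b**2 and the critical
# load N_c ~ b**4 / a**2 scale in the same way.
import sympy as sp

a = sp.symbols('a', positive=True)
b = a**sp.Rational(3, 2)
print(sp.simplify(a * b**2))     # load N ~ a*b**2       -> a**4
print(sp.simplify(b**4 / a**2))  # limit N_c ~ b**4/a**2 -> a**4
```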
So do cows (and other living things) actually exhibit this \(b=a^{3/2}\) relationship? Tree measurements from McMahon (1973, fig. 1) show that plotting the log of tree heights (\(\sim\) our \(a\)) against the log of tree diameter (\(\sim b\)) gives data which fits well to a slope of \(2/3\), i.e. \(a=b^{2/3}\) as hoped.
Given the weight scales as volume, \(W\sim ab^2 = a^4 = b^{8/3}\) under our \(3/2\) relationship. McMahon (fig. 3a) also demonstrates on primates that chest circumference (\(\sim b\)) plotted against body weight is a power law with exponent \(0.37 \approx 0.375 = 3/8\), as hoped. As for cows, the wonderful 1000-page tome of Brody & Lardy (1946), with every conceivable piece of data on cows you could wish for, shows that cow weight vs height (\(\sim a\)) matches a power law with exponent \(4.3 \approx 4\), and weight vs chest girth (\(\sim b\)) matches a power law with exponent \(2.8 \approx 8/3\).
Not bad for a little bit of scaling analysis. This is our first example of a biomechanical principle dictating the way the natural world has developed.
The Euler buckling problem is not something you will find in Murray. I would advise you stick to what’s here. The buckling result is usually presented in a different context (static not dynamic analysis) in engineering or elasticity texts.
This has been a fun way to finish this term, but scaling laws in general are extremely powerful, and we are only touching on them here.
This term we have looked at systems of differential equations that represent the growth and spread of biological agents: populations, chemical concentrations, and diseases. We have seen how phase portraits and linear stability analysis allow us to model the long-term behaviour of the sorts of systems we encounter all the time in nature. Latterly, we have seen how spatial variation makes everything a bit harder. These equations, reaction–diffusion systems, will appear next term and as you study their instabilities, you’ll learn how to answer another important question... how did the leopard get its spots?
But for now, it’s time to crack open the mince pies, and slowly soak into the Christmas break.
Epiphany term
Last term you spent time studying a variety of ordinary and partial differential equation models in mathematical biology, and ideally developed some mathematical tools for studying these models. A key theme is that these models often have nonlinearities, which means that, unlike linear differential equations, they typically cannot be solved analytically. This term we will further explore several different topics extending these ideas, both in terms of mathematical analysis, such as linear stability and qualitative properties of differential equation models, and in terms of a range of applications from developmental biology to ecology and epidemiology. Ideally you will come away from this term having reinforced everything you learned last term, and with the confidence to develop and analyse mathematical models of a range of biological phenomena.
We will start by generalising the basic elements of linear stability analysis (and local bifurcation theory) to arbitrary systems of ODEs, difference equations, and partial differential equations governing spatial populations. All of these kinds of models can be written abstractly as \[\begin{equation} \label{abstract-US-evol} \frac{{\mathrm d}u}{{\mathrm d}t} = \mathcal{F}(u), \quad \textrm{or} \quad u(t+1)= \mathcal{F} (u(t)), \end{equation}\] where \(\mathcal{F}\) is an abstract operator – it can represent a vector-valued function for ODEs, things like the Laplacian for PDEs, etc. The basic ideas of linear stability seen last term apply to these cases just as readily, once one takes care of the details specific to a given model. Below we recall how this works for ODEs, hopefully jogging your memory of things like linear stability, permissibility of equilibria, etc.
We now revisit the linear stability analysis of systems of ordinary differential equations which you saw last term, but in a different way, to help you remember the key ideas. Consider the system of \(n\) first-order ODEs: \[\begin{equation} \label{gen-US-ODE} \frac{{\mathrm d}\mathbfit{u}}{{\mathrm d}t} = \mathbfit{f}(\mathbfit{u}), \end{equation}\] where \(\mathbfit{u}(t) = (u^1(t), u^2(t),\dots, u^n(t))\) is a vector of unknown functions, and \(\mathbfit{f} = (f_1(\mathbfit{u}), f_2(\mathbfit{u}), \dots, f_n(\mathbfit{u}))\) is a vector of given functions of \(\mathbfit{u}\). An equilibrium point or steady state solution is a constant vector \(\mathbfit{u}_0\) such that \(\mathbfit{f}(\mathbfit{u}_0) = \bm{0}\). We can then proceed exactly as we did last term to linearise [gen-US-ODE] around this equilibrium point by writing \(\mathbfit{u} = \mathbfit{u}_0 + \varepsilon \mathbfit{u}_1(t)\). Substituting this into our governing equation, and dropping terms of \(O(\varepsilon^2)\), we have the linear system \[\begin{equation} \label{gen-US-lin-US-ODE} \frac{{\mathrm d}\mathbfit{u}_1}{{\mathrm d}t} = \mathsfbfit{J}(\mathbfit{u}_0)\mathbfit{u}_1, \quad \mathsfbfit{J} = \begin{bmatrix} \frac{\partial f_1}{\partial u^1} & \dots & \frac{\partial f_1}{\partial u^n} \\ \vdots & \ddots & \\ \frac{\partial f_n}{\partial u^1} & & \frac{\partial f_n}{\partial u^n} \end{bmatrix}, \end{equation}\] where \(\mathsfbfit{J}\) is the constant matrix corresponding to the Jacobian of the function \(\mathbfit{f}\) evaluated at the steady state \(\mathbfit{u}_0\) (henceforth we will not write the dependence on \(\mathbfit{u}_0\), but keep this in mind!). We assume that \(\mathbfit{f}\) is a real-valued function of real variables, though the eigenvalues of \(\mathsfbfit{J}\) may be complex in general.
We next let \(\mathbfit{w}\) be an eigenvector of \(\mathsfbfit{J}\) with corresponding eigenvalue \(\lambda\). By substitution, we see that \(\mathbfit{u}(t) = \exp(\lambda t)\mathbfit{w}\) is a solution of [gen-US-lin-US-ODE] (check this step!) Of course, this is just ‘guessing’ a single solution, and it isn’t obvious that one gets all solutions this way. Nevertheless, one has the following classification of the stability of an equilibrium \(\mathbfit{u}_0\) based on the eigenvalues of the matrix \(\mathsfbfit{J}\):
Reminder: if all eigenvalues \(\lambda\) of \(\mathsfbfit{J}\) have \(\operatorname{Re}(\lambda)<0\), then the equilibrium \(\mathbfit{u}_0\) is said to be linearly stable.
Instead, if at least one eigenvalue has \(\operatorname{Re}(\lambda)>0\), then \(\mathbfit{u}_0\) is said to be linearly unstable.
Finally, if the eigenvalue of \(\mathsfbfit{J}\) with the largest real part has \(\operatorname{Re}(\lambda)=0\), then linear stability analysis fails, and one has to do something else to see what happens to perturbations of the equilibrium.
To get some intuition for why this classification is true, let’s consider a special class of Jacobian matrices where we can calculate the general solution of [gen-US-lin-US-ODE] explicitly. If we assume that \(\mathsfbfit{J}\) is symmetric (that is, \(\mathsfbfit{J} = \mathsfbfit{J}^T\)), then by the Spectral Theorem all of its eigenvalues are real, and its eigenvectors form an orthonormal basis of \(\mathbb{R}^n\). So this lets us take some initial perturbation and expand it as \(\mathbfit{u}(t) = \sum_{i=1}^n A_i(t) \mathbfit{w}_i\) where \(\mathbfit{w}_i\) is the \(i\)th eigenvector of \(\mathsfbfit{J}\) with eigenvalue \(\lambda_i\). Substituting this into [gen-US-lin-US-ODE] we have, \[\begin{equation} \label{gen-US-lin-US-ODE-US-expanded} \mathchoice{\frac{{\mathrm d}}{{\mathrm d}t}}{{\mathrm d}/{\mathrm d}t}{{\mathrm d}/{\mathrm d}t}{{\mathrm d}/{\mathrm d}t} \left(\sum_{i=1}^n A_i(t) \mathbfit{w}_i\right) = \sum_{i=1}^n \mathchoice{\frac{{\mathrm d}A_i}{{\mathrm d}t}}{{\mathrm d}A_i/{\mathrm d}t}{{\mathrm d}A_i/{\mathrm d}t}{{\mathrm d}A_i/{\mathrm d}t}\mathbfit{w}_i = \mathsfbfit{J} \sum_{i=1}^n A_i(t) \mathbfit{w}_i = \sum_{i=1}^n A_i(t) \mathsfbfit{J}\mathbfit{w}_i = \sum_{i=1}^n A_i(t) \lambda_i\mathbfit{w}_i , \end{equation}\] where we have used linearity of the derivative and the matrix \(\mathsfbfit{J}\). Now, to solve for a specific \(A_k\), we can take the inner product of both sides of [gen-US-lin-US-ODE-US-expanded] with \(\mathbfit{w}_k\) to get, \[\begin{equation} \begin{gathered} \label{lin-US-ODE-US-expansion} \left \langle\sum_{i=1}^n \mathchoice{\frac{{\mathrm d}A_i}{{\mathrm d}t}}{{\mathrm d}A_i/{\mathrm d}t}{{\mathrm d}A_i/{\mathrm d}t}{{\mathrm d}A_i/{\mathrm d}t}\mathbfit{w}_i,\mathbfit{w}_k\right \rangle= \sum_{i=1}^n \mathchoice{\frac{{\mathrm d}A_i}{{\mathrm d}t}}{{\mathrm d}A_i/{\mathrm d}t}{{\mathrm d}A_i/{\mathrm d}t}{{\mathrm d}A_i/{\mathrm d}t}\left \langle\mathbfit{w}_i,\mathbfit{w}_k\right \rangle= \left \langle\sum_{i=1}^n A_i(t) \lambda_i\mathbfit{w}_i,\mathbfit{w}_k\right \rangle= \sum_{i=1}^n A_i(t) \lambda_i\left \langle\mathbfit{w}_i,\mathbfit{w}_k\right \rangle\\ \implies \mathchoice{\frac{{\mathrm d}A_k}{{\mathrm d}t}}{{\mathrm d}A_k/{\mathrm d}t}{{\mathrm d}A_k/{\mathrm d}t}{{\mathrm d}A_k/{\mathrm d}t} = A_k(t) \lambda_k \implies A_k(t) = C_k \mathrm{e}^{\lambda_k t}, \end{gathered} \end{equation}\] where we have used linearity of the first argument of the inner product, and orthogonality, to solve the system. The \(C_k\) are constants which can be found by writing the initial condition vector, \(\mathbfit{u}(0)\), in terms of the basis of eigenvectors \(\mathbfit{w}_k\).
The important take-home message is this – for the special case of a symmetric matrix, we can think of solutions of the linear system of equations in terms of the eigenvectors of \(\mathsfbfit{J}\). As this matrix is diagonal when written in this basis, the components along these eigenvectors evolve independently. This idea will come up again when we study linear stability of PDEs, so take some time to make sure you understand this kind of system.
If \(\mathsfbfit{J}\) is diagonalisable, then the solutions \(\exp(\lambda_k t)\mathbfit{w}_k\) for \(k=1,\dots,n\), form a complete set of solutions for the linear system [gen-US-lin-US-ODE], so that linear combinations of these functions form all possible solutions. If \(\mathsfbfit{J}\) is not diagonalisable, then one has to consider generalised eigenvectors with time-dependent coefficients to find the ‘general solution’ of [gen-US-lin-US-ODE]. Still, linear stability will be governed by the real part of the eigenvalues of \(\mathsfbfit{J}\) as above – what changes is that perturbations of the equilibrium may grow or decay with a non-exponential transient.
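As a quick illustration of the classification in practice, here is a small numerical sketch (the matrix is an arbitrary example of mine, not taken from any model in these notes):

```python
# Classify an equilibrium from the eigenvalues of its Jacobian.
import numpy as np

J = np.array([[-1.0,  3.0],
              [ 0.5, -2.0]])   # an arbitrary example Jacobian

lams = np.linalg.eigvals(J)
if np.all(lams.real < 0):
    verdict = "linearly stable"
elif np.any(lams.real > 0):
    verdict = "linearly unstable"
else:
    verdict = "marginal: linear analysis is inconclusive"
print(lams, "->", verdict)
```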
The classification of equilibria above in terms of (linearly) stable or unstable is especially important when we think of ODE models which depend on parameters, and think of how solutions change as parameters are varied. If an equilibrium becomes stable or unstable as a parameter is varied, we refer to this as a bifurcation, and these can come in many different flavours. One can develop a fairly sophisticated theory of bifurcations of systems of the form, \[\mathchoice{\frac{{\mathrm d}\mathbfit{u}}{{\mathrm d}t}}{{\mathrm d}\mathbfit{u}/{\mathrm d}t}{{\mathrm d}\mathbfit{u}/{\mathrm d}t}{{\mathrm d}\mathbfit{u}/{\mathrm d}t} = \mathbfit{f}(\mathbfit{u};\bm{p}),\] where now \(\bm{p} = (p_1,p_2,\dots,p_m)\) is a vector of parameters. You saw several examples of these bifurcations last term, and we will see more this term (and extend the theory to bifurcations of PDEs). Rather than pursue a general treatment, let’s review one particularly important simple example which will play a role later.
Consider the ODE \[\begin{equation} \label{pitchfork-US-ODE} \mathchoice{\frac{{\mathrm d}u}{{\mathrm d}t}}{{\mathrm d}u/{\mathrm d}t}{{\mathrm d}u/{\mathrm d}t}{{\mathrm d}u/{\mathrm d}t} = u(p-u^2), \end{equation}\] where \(p\) is a real parameter. If \(p \leq 0\), then \(u_0=0\) is the only steady state (as we are only working with real functions), and this solution is stable for \(p < 0\). You should check these linear stability calculations yourself. If instead \(p>0\) then (in addition to the zero steady state), \(u_0=\pm \sqrt{p}\) are two new steady states. You can also show that these steady states are always stable (when they exist), and that \(u_0=0\) is unstable for \(p>0\). Hence this zero solution has lost stability by generating two symmetric branches of solutions. The corresponding bifurcation diagram is shown in 1.1, where the name ‘pitchfork bifurcation’ comes from.
This particular bifurcation will come up again this term, and ideally seeing this diagram will help remind you how to read these kinds of plots. Having drawn something like 1.1, we can read off essentially all of the local behaviour of the system (as this is a 1-D system, which is always governed by its equilibria). If \(p<0\), then the only possibility is that the system approaches \(u_0=0\). If instead \(p>0\), then the steady state \(u_0=0\) is a separatrix, so that if \(u(0)>0\), the solution of the ODE approaches \(u_0=\sqrt{p}\), and if \(u(0)<0\) then as \(t \to \infty\), \(u(t) \to -\sqrt{p}\). [pitchfork-US-ODE] can be solved exactly, but the explicit solution itself is less insightful than the diagram. More importantly, this kind of bifurcation arises in a number of analytically intractable models, but linear stability analysis (and hence sketching the full bifurcation diagram) allows us to get important insights into these kinds of systems.
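You can see this behaviour directly by integrating the ODE (a sketch; the values of \(p\), the initial conditions and the end time are my choices):

```python
# Integrate du/dt = u(p - u^2) for p < 0 and p > 0 from a few
# initial conditions, and report where each run ends up.
import numpy as np
from scipy.integrate import solve_ivp

for p in (-1.0, 1.0):
    for u0 in (-0.5, 0.1, 1.5):
        sol = solve_ivp(lambda t, u: u*(p - u**2), [0, 20], [u0])
        print(f"p = {p:+.0f}, u(0) = {u0:+.1f} -> u(20) = {sol.y[0, -1]:+.3f}")
# For p = -1 every run tends to 0; for p = +1 the sign of u(0)
# selects the branch u -> +1 or u -> -1.
```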
While this basic approach is powerful, allowing the analysis of complex nonlinear models which cannot be analytically solved, it also has important limitations. This kind of stability analysis is ‘local’ so that it is strictly only valid when \(u\approx u_0\). For an unstable equilibrium (where the solution moves away from \(u_0\)) or for an initial condition not very close, linear stability often does not tell us what happens. Beyond one dimension of state space (a scalar ODE), there are also periodic solutions which can be stable or unstable, and they can undergo bifurcations as parameters vary, though the analysis is harder than for steady state solutions. In three or more dimensions, chaotic solutions can be found, and this kind of local analysis tells us essentially nothing in these cases.
Still, this is a general tool which is widely applicable. In particular, the presentation above only considers first-order systems, but last term you considered ODEs of the form, \[\begin{equation} \label{higher-US-order-US-ODE} \frac{{\mathrm d}^{n} u}{{\mathrm d}t^{n}}+F\left (u, \frac{{\mathrm d}u}{{\mathrm d}t}, \dots, \frac{{\mathrm d}^{n-1} u}{{\mathrm d}t^{n-1}} \right)=0. \end{equation}\] We can rewrite such equations as first-order systems by creating new variables for each time derivative; that is, \[\begin{equation} \label{higher-US-oder-US-system} \frac{{\mathrm d}u}{{\mathrm d}t} = u^1, \quad \frac{{\mathrm d}u^1}{{\mathrm d}t} = u^2,\,\, \dots, \,\, \frac{{\mathrm d}u^{n-2}}{{\mathrm d}t} = u^{n-1}, \quad \frac{{\mathrm d}u^{n-1}}{{\mathrm d}t} = -F\left (u, u^1, \dots, u^{n-1} \right). \end{equation}\] If you recall from last term, we can linearise [higher-US-order-US-ODE] by first finding a steady state where all time derivatives are zero; this is a constant \(u_0\) satisfying \(F(u_0,0,\dots,0)=0\). Substituting in \(u = u_0 + \varepsilon u_1(t)\), we get a linear ODE in \(u_1\) and its first \(n\) derivatives. Solving this ODE via the ansatz \(u_1 = \exp(\lambda t)\) leads to an \(n\)th degree polynomial for the growth rates \(\lambda\), which determine the behaviour of the perturbation. This characteristic polynomial for \(\lambda\) turns out to be exactly what we would get by considering the eigenvalues of the Jacobian of the system [higher-US-oder-US-system].
You should check the above claim for \(n=2\). Linearising the equation, \[\frac{{\mathrm d}^2 u}{{\mathrm d}t^2} + F\left (u, \frac{{\mathrm d}u}{{\mathrm d}t}\right)=0,\] and writing \(F=F(u,v)\) with \(v={\mathrm d}u/{\mathrm d}t\), you should find that the growth rates satisfy the polynomial, \[\lambda^2 + \lambda \frac{\partial F}{\partial v}(u_0,0) + \frac{\partial F}{\partial u}(u_0,0) = 0.\] Now write down the corresponding first-order system, and show that its eigenvalues must satisfy the same characteristic equation. Finally, do the same for general \(n\).
In the remainder of these notes, we will develop increasingly sophisticated models of biological phenomena in time and space, building on the simpler models you saw last term. We will make use of slightly more advanced mathematical tools, though of a similar flavour to what you have seen before. An important caveat is that even quite sophisticated-looking models can be only rough caricatures of living things – biology is exquisitely complex, and the kind of theoretical models we are building in this course are very crude representations. To quote a paper we will spend significant time on this term:
“This model will be a simplification and an idealisation, and consequently a falsification. It is to be hoped that the features retained for discussion are those of greatest importance in the present state of knowledge.”
—Alan Turing, 1952, The Chemical Basis of Morphogenesis
In addition to being careful (and humble) about what our models say about biology, we should also be aware of how limited our analytical tools are. Nowadays, it is routine to study models of biological systems numerically, so that we can visualise and interact with these models. We will only examine you on aspects of developing models and on analytical (that is, pen-and-paper) analysis of models, but we would strongly encourage you to play with computer simulations of the equations covered in this course.
Here we will briefly mention some examples of large systems of ODEs, sketching how the dynamical systems tools we’ve discussed play important roles in analysing such systems.
Last term you studied different kinds of populations and their interactions. These included single-species phenomena such as different kinds of growth and Allee effects, as well as multi-species interactions such as predation and competition. In general, ecological systems can be extremely complex, with hundreds or thousands of different populations interacting. Quite a lot of mathematical modelling is about how to think of all of these complex interactions, and what sensible assumptions we might use to simplify such models to their essential components in order to learn something about how these ecosystems behave. We often organise these interactions into networks (also known as graphs) of pairwise effects, such as competition or predation. Such networks are known as food webs, though it is typical that such webs have a layered structure divided into trophic levels; see 1.2 for an example. This is a way of thinking of where energy flows in an ecosystem, although as can be seen, these levels are not always cleanly separated.
Let’s consider an example of how to model such interactions between \(n\) species given by \(\mathbfit{u}=(u^1,u^2,\dots,u^n)\). We write a generalised Lotka–Volterra system as \[\begin{equation} \label{gen-US-lv} \frac{{\mathrm d}u^i}{{\mathrm d}t}= u^if_i(\mathbfit{u}) = u^i\left(r_i+\sum_{j=1}^n A_{ij}u^j\right), \quad i=1,\dots,n, \end{equation}\] where the population interactions are given by the vector \[\mathbfit{f}(\mathbfit{u}) = \mathbfit{r}+\mathsfbfit{A}\mathbfit{u}.\] Here \(\mathbfit{r}\) is a constant vector representing the growth rate of each population independent of interactions with itself or other populations, and the constant matrix \(\mathsfbfit{A}\) (called the community matrix) represents the interactions between different populations. We take \(A_{ii} \leq 0\), as a population at high densities competes with itself for resources. We typically have \(r_i\geq 0\) for populations which can persist without any of the other modelled species. We note that \(A_{ij}<0, A_{ji}<0\) represents populations \(i\) and \(j\) competing with one another, and that \(A_{ij}>0\), \(A_{ji}<0\) represents a predator–prey interaction, with population \(i\) being the predator. Note that for \(r_i>0\), what is really being represented is some input to the system which is itself not being modelled. This might be energy from the sun for plants, or it may be plants or bacteria if we are not modelling lower trophic levels.
In the competitive case, this model is exactly the \(n\)-species generalisation of the two-species model studied in 3.3. In general, there can be many equilibria, where some subset of the populations have gone extinct. However, if none of the populations are extinct, then we can set [gen-US-lv] to \(\mathbf{0}\) and divide each component by \(u^i_0\) to find that if a coexistence equilibrium exists and is permissible, it must satisfy, \[\mathbf{0} = \mathbfit{f}(\mathbfit{u}_0) = \mathbfit{r}+\mathsfbfit{A}\mathbfit{u}_0 \implies \mathbfit{u}_0 = -\mathsfbfit{A}^{-1}\mathbfit{r}.\] Computing conditions for such an equilibrium to be stable can be done, though it becomes quite a heavy computation in linear algebra and properties of matrix eigenvalues. We can derive at least part of these conditions by first linearising [gen-US-lv] to find that the Jacobian is given by, \[\mathsfbfit{J}_{ij} = \begin{cases} \mathsfbfit{A}_{ij}u^i_0, \quad &i \neq j,\\ r_i +\sum_{j=1}^n\mathsfbfit{A}_{ij}u^j_0+ \mathsfbfit{A}_{ii}u^i_0, \quad &i=j. \end{cases}\] We note that this simplifies, as \(r_i +\sum_{j=1}^n\mathsfbfit{A}_{ij}u^j_0=0\) by the definition of the coexistence equilibrium. Hence our Jacobian is given by \(\mathsfbfit{J} = \mathsfbfit{D}\mathsfbfit{A}\), where \(\mathsfbfit{D}=\operatorname{diag}(\mathbfit{u}_0)\) is a diagonal matrix with \(\mathbfit{u}_0\) along the main diagonal. We can then relate the eigenvalues of \(\mathsfbfit{J}\) to those of \(\mathsfbfit{A}\) by multiplying by \(\mathsfbfit{D}^{-1}\), i.e. \[\mathsfbfit{J}\mathbfit{w} = \lambda\mathbfit{w} \implies \mathsfbfit{A}\mathbfit{w} = \lambda\mathsfbfit{D}^{-1}\mathbfit{w}.\] We are assuming that \(\mathbfit{u}_0\) is permissible, so each diagonal element of \(\mathsfbfit{D}^{-1}\) is positive. Unfortunately, the eigenvalues of \(\mathsfbfit{J}\) are not simply scaled eigenvalues of \(\mathsfbfit{A}\) in general, so studying the stability of the coexistence equilibrium is typically more involved than studying the eigenvalues of the community matrix \(\mathsfbfit{A}\) alone. Nevertheless, the two are closely related: for instance, if \(\mathsfbfit{D}\mathsfbfit{A}\) has all eigenvalues with negative real part, then the coexistence equilibrium is stable, and under somewhat stronger assumptions on \(\mathsfbfit{A}\) one can show that it is globally stable (all permissible initial conditions are attracted to it). See https://stefanoallesina.github.io/Sao_Paulo_School/intro.html#multi-species-dynamics for proofs and further discussion, though you do not need to memorise these details.
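To make this concrete, here is a small numerical sketch (assuming numpy is available; the three-species growth rates and community matrix below are made-up illustrative values, chosen so that the coexistence equilibrium happens to be permissible) which computes \(\mathbfit{u}_0 = -\mathsfbfit{A}^{-1}\mathbfit{r}\) and the eigenvalues of \(\mathsfbfit{J} = \mathsfbfit{D}\mathsfbfit{A}\):

```python
import numpy as np

# Illustrative community: 1 is a prey, 2 a predator of 1, 3 a super-predator of 2.
r = np.array([1.0, -0.2, -0.05])          # growth/death rates independent of interactions
A = np.array([[-1.0, -0.5,  0.0],         # community matrix, with A_ii <= 0
              [ 0.4, -0.2, -0.3],
              [ 0.0,  0.2, -0.1]])

u0 = -np.linalg.solve(A, r)               # coexistence equilibrium u_0 = -A^{-1} r
J = np.diag(u0) @ A                       # Jacobian J = D A at the coexistence state
print("u_0 =", u0)                        # permissible if every component is positive
print("eigenvalues of J:", np.linalg.eigvals(J))  # stable if all real parts are negative
```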
We won’t do any further analysis here, but hopefully you can see that this kind of model encompasses a huge range of specific interactions and kinds of ecosystems. There are many specific results for community matrices \(\mathsfbfit{A}\) with particular structures representing trophic interactions, for random community matrices, and for matrices which account for evolutionary selection. There are also other dynamical behaviours that can arise from these models, such as periodic limit cycles, quasiperiodic motion, and chaotic dynamics.
Reminder: unless you are asked to justify a linear stability analysis on an exam or problem sheet, you can simply compute the Jacobian and analyse its eigenvalues. You do not need to show all the steps of the linearisation.
Recall the two-species system from 3.3, \[\begin{aligned} \frac{{\mathrm d}u}{{\mathrm d}t} &= u(1-u-\gamma_1 v),\\ \frac{{\mathrm d}v}{{\mathrm d}t} &= v(1-v-\gamma_2 u), \end{aligned}\] where \(\gamma_1,\gamma_2>0\). Recall that direct linearisation showed that the coexistence equilibrium is permissible and stable if and only if \(\gamma_1<1\) and \(\gamma_2 < 1\). Assuming that the coexistence state is permissible, show that requiring all eigenvalues of \(\mathsfbfit{A}\) to be negative gives exactly the same condition for stable coexistence.
The majority of ODE population models studied so far in this course have taken the form, \[\frac{{\mathrm d}u^i}{{\mathrm d}t} = u^if_i(\mathbfit{u}),\] for some nonlinear function \(\mathbfit{f}\). Such models are sometimes called Kolmogorov models. These models have the nice property that any permissible initial condition (that is, an initial set of populations \(u^i(0) \geq 0\)) will lead to permissible solutions for all time, i.e. \(u^i(t)\geq 0\) for all \(t\geq0\). This can be shown by looking at what happens to the derivative of one component of this equation as the corresponding population tends to \(0\) (try showing this yourself). Ecologically, the nonlinearities \(\mathbfit{f}\) represent a functional response, essentially generalising the idea of a growth rate of a population to depend on interactions with other populations.
Last term on Problem Sheet 3 you considered the Rosenzweig–MacArthur predator–prey model, which had a functional response between predator and prey that was not the quadratic form of the Lotka–Volterra model. This kind of model has been explored in a variety of different ways. One well-studied example is the tritrophic Rosenzweig–MacArthur model, which involves a prey \(u\), a predator \(v\), and a super-predator \(w\). One variant of this model is given by, \[\begin{align} \frac{{\mathrm d}u}{{\mathrm d}t} &= ru\left(1-\frac{u}{K}\right) - \frac{\theta_u u v}{u+h_u},\label{tri-US-RM-US-u}\\ \frac{{\mathrm d}v}{{\mathrm d}t} &= \frac{\varepsilon_u uv }{u+h_u}-\frac{\theta_v vw}{v+h_v}-\delta_v v ,\label{tri-US-RM-US-v}\\ \frac{{\mathrm d}w}{{\mathrm d}t} &= \frac{\varepsilon_v vw }{v+h_v}-\delta_w w,\label{tri-US-RM-US-w} \end{align}\] where \(r,K\) are the prey growth rate and carrying capacity, \(h_u,h_v\) are predator and super-predator satiation parameters, \(\theta_u,\theta_v,\varepsilon_u,\varepsilon_v\) are parameters related to consumption and conversion of prey biomass into predator biomass, and \(\delta_v,\delta_w\) are the death rates of the predator and super-predator.
Such a model has several equilibria, such as a complete extinction state given by \((u_0,v_0,w_0)=(0,0,0)\), a predator-free equilibrium \((K,0,0)\), and a super-predator-free equilibrium, which will correspond (modulo nondimensionalisation) to the coexistence equilibrium you found on Problem Sheet 3. Finally, there is a coexistence equilibrium which can be explicitly computed. Stability around such an equilibrium is a bit tedious to compute, as it requires knowing how to find the eigenvalues of a \(3\times3\) matrix, but again this can be done analytically with some effort.
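This model is also easy to explore numerically. The following is a minimal sketch (assuming scipy is available; all parameter values are arbitrary illustrative choices, not taken from any particular ecosystem) integrating [tri-US-RM-US-u]–[tri-US-RM-US-w]:

```python
from scipy.integrate import solve_ivp

# Illustrative parameter values only.
r, K = 1.0, 1.0
theta_u, theta_v, eps_u, eps_v = 1.0, 1.0, 0.8, 0.8
h_u, h_v = 0.3, 0.3
delta_v, delta_w = 0.2, 0.1

def rhs(t, y):
    u, v, w = y
    du = r * u * (1 - u / K) - theta_u * u * v / (u + h_u)
    dv = eps_u * u * v / (u + h_u) - theta_v * v * w / (v + h_v) - delta_v * v
    dw = eps_v * v * w / (v + h_v) - delta_w * w
    return [du, dv, dw]

sol = solve_ivp(rhs, (0.0, 500.0), [0.5, 0.2, 0.1], rtol=1e-8)
print("state at t = 500:", sol.y[:, -1])  # may settle to an equilibrium or keep oscillating
```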
Compute the coexistence equilibrium of [tri-US-RM-US-u], [tri-US-RM-US-v] and [tri-US-RM-US-w] and determine any conditions on its permissibility. Harder: Find the Jacobian, and determine conditions for the stability of this equilibrium. Can this equilibrium and any other equilibrium be simultaneously stable? How do things change if the super-predator can also prey on \(u\)?
The basic ideas of this chapter can be found in several different places, such as Murray, vol. I, chap. 3, or in the book by Strogatz, Nonlinear Dynamics and Chaos, which includes substantially more material on other possible behaviours of these systems, classified by their size (number of equations, \(n\)).
There are many good online resources for visualising the phase planes/trajectories of these systems, such as this tool for \(n=1\) and \(n=2\). Links to particular examples of bifurcations and some other ideas can be found in these slides.
So far in this course we have studied population models involving continuous states (i.e. densities) and continuous time. These assumptions are valid for populations which are sufficiently large, and which have overlapping generations. However, many organisms have complex life-histories, where they undergo quite different behaviours as they age (e.g. transitioning from infertile juveniles to reproductive adults). Such age-structure can be modelled in different ways, depending on the assumptions made about the population. Here we will consider models in discrete time, which are appropriate for organisms with non-overlapping generations, such as annual plants, many kinds of flies, and even certain kinds of mammalian cells (particularly in development). We will call these models difference equations, though some authors call them maps or iterated function systems.
Let’s consider simple models of single-species populations evolving in discrete time. In dimensional variables, these take the form of \[u(t) = f(u(t-\tau)), \quad u(0) = u_0,\] where \(\tau\) is the time between generations and \(u_0\) is an initial condition. For simplicity, we often want to nondimensionalise time in these models by writing \(t = \tau n\). If we then substitute this in, we can rewrite our model as \[u_n = f(u_{n-1}),\] where we are setting \(u_n = u(\tau n)\). This notation becomes very convenient for representing iterations of our system. In particular, we have \[u_n = f(u_{n-1}) = f(f(u_{n-2})) = \dots = f^{n}(u_0).\] This should remind you of sequences from Analysis, and it is really the same idea in a slightly different setting. In contrast to differential equation models, the behaviour of this kind of iterated equation is in some sense easy to explore: we can simply compute a few iterations by hand (or by computer, as in the sketch below). On the other hand, these systems can exhibit quite exotic behaviours, which we will explore through a few examples.
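In code, ‘doing a few iterations’ is just a short loop. Here is a minimal sketch in plain Python (the particular map passed in is an arbitrary example):

```python
def iterate(f, u0, n):
    """Return the orbit [u_0, u_1, ..., u_n] of the difference equation u_k = f(u_{k-1})."""
    orbit = [u0]
    for _ in range(n):
        orbit.append(f(orbit[-1]))
    return orbit

# Example: five generations of u_n = 0.5 * u_{n-1}, starting from u_0 = 1.
print(iterate(lambda u: 0.5 * u, 1.0, 5))  # [1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125]
```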
Equilibria for these equations have the same interpretation as in differential equation models, where they do not change in time. Mathematically this corresponds to dropping the dependence on \(n\), i.e. equilibria are just solutions to \[u^* = f(u^*).\] We will explore concepts of stability and bifurcation for such equilibria by way of examples.
Let’s consider one of the simplest population models given by \(f(u) = ru\) for some growth rate \(r>0\). This should remind you of the very first model of growth covered in the course. Our population density \(u\) then evolves according to \[\begin{equation} \label{exp-US-diff-US-eqn} u_n = ru_{n-1} \implies u_n = r^{n}u_0, \end{equation}\] where we have iterated the system to write down the dependence of \(u_n\) on \(r\) and the initial population, \(u_0\). We see that this model behaves differently depending on \(r\): for \(r<1\), the population decays towards \(u_n=0\) as \(n \to \infty\), whereas for \(r>1\) the population grows exponentially in time. In the case of \(r=1\), the model gives that every real number \(u_0\geq 0\) is an equilibrium point. We have seen such a situation before in ODE models when the Jacobian matrix had a zero eigenvalue. Biologically, we would not expect a parameter to have an exact value, so we can discount the case \(r=1\) and instead just view it as the bifurcation point between growth and extinction.
Phase portraits and other kinds of graphs were valuable for studying ODE models, so we introduce a similar way of understanding scalar difference equations called the cobweb plot. See 2.1 for an example involving the exponential growth model. One iteration of the map given in [exp-US-diff-US-eqn] corresponds to moving vertically to the green line given by \(f(u)=ru\) from the initial condition on the \(x\)-axis. To iterate again, one moves horizontally to the line of slope one, i.e. to the blue line, before again moving vertically to the function \(f(u)\). These plots can take some time to understand, but they can help immediately visualise the change in the stability of \(u_0=0\) as \(r\) crosses the threshold of \(1\).
While it is not physically sensible for modelling biological populations, it will become mathematically useful for us to consider [exp-US-diff-US-eqn] for negative values of \(r\) as well. What happens to an initial condition for \(-1 < r < 0\)? What about for \(r=-1\) or \(r<-1\)? Draw a cobweb plot like those in 2.1 for each of these cases.
Finally, note that this model is linear, and so it is perhaps not surprising that we could solve it directly. This will not be true of generic models, but through linear stability analysis, we can study properties of equilibria by reducing our models to this one.
As we did at the start of Michaelmas, we will now consider a simple population model involving logistic growth, given by \[\begin{equation} \label{logistic-US-map} u_n = ru_{n-1}(1-u_{n-1}), \end{equation}\] where we have already nondimensionalised the carrying capacity to be \(1\). This model is often called the logistic map, due to its similarity with the logistic model of population growth. Despite this similarity to the model studied previously, we will see that it has both biological and mathematical properties which are drastically different from its continuous-time counterpart.
Immediately we see that if \(u_{n-1}>1\) at any iteration \(n\), then \(u_n<0\). This means that we only consider initial conditions \(u_0 \in [0,1]\). For all \(r\) with \(0\leq r \leq 4\), this map takes points in \([0,1]\) to points in \([0,1]\) at each iteration (you should be able to show this), while for \(r\) outside of this range, some iterations will take negative values. Therefore, we will restrict our range of \(r\) values to the interval \([0,4]\), as we are looking for biologically meaningful models.
We can start to get an idea of how this model behaves by looking at equilibria and their stability. These satisfy \[\begin{equation} \label{log-US-map-US-equilibria} u^* = ru^*(1-u^*) \implies u^* = 0 \,\,\textrm{ or }\,\, u^* = \frac{r-1}{r}. \end{equation}\] We see that \(u^*=0\) is always a permissible equilibrium, whereas the nonzero equilibrium is only distinct and permissible for \(r>1\).
We cannot solve [logistic-US-map] in general, but we can perform a linear stability analysis as we did for differential equations, by thinking of small perturbations of initial conditions and asking if they grow or decay. We use the ansatz \(u_n = u^* + \varepsilon u_{1,n}\), noting that \(u^*\) is a constant but \(u_{1,n}\) is a sequence in \(n\) (so depends on time). Substituting this ansatz into [logistic-US-map] we find \[u^*+\varepsilon u_{1,n} = r(u^*+\varepsilon u_{1,n-1})(1-u^*-\varepsilon u_{1,n-1}) = ru^*(1-u^*) + \varepsilon r(1-2u^*)u_{1,n-1} - \varepsilon^2 r(u_{1,n-1})^2.\] We notice that the \(\mathcal{O}(1)\) equation, given by \(u^* = ru^*(1-u^*)\), is exactly the equilibrium equation, and so cancels identically. We can then divide by \(\varepsilon\) and neglect the highest-order terms in \(\varepsilon\) to arrive at the linearised equation, \[\begin{equation} \label{lin-US-log-US-map} u_{1,n} = r(1-2u^*)u_{1,n-1}. \end{equation}\] This equation is exactly [exp-US-diff-US-eqn], with the constant \(r\) in that model replaced by \(r(1-2u^*)\), and so from our solution of the linear model we know that an equilibrium is stable if \(|r(1-2u^*)|<1\), and unstable if \(|r(1-2u^*)|>1\). For the extinction equilibrium \(u^*=0\), we then have stability when \(r<1\); recall that the nonzero equilibrium exists precisely when \(r>1\).
Plugging the nonzero equilibrium into our stability condition, we find that it is stable given the condition, \[\left|r\left(1-2\frac{r-1}{r}\right)\right|< 1 \implies -1 < -r+2 < 1,\] where we have simply written \(|-r+2|< 1\) as two separate inequalities. Recall that there were two different routes for [exp-US-diff-US-eqn] to become unstable, corresponding to one side or the other of these inequalities being violated; once either fails, \(u^*\) is unstable. Since we are only considering stability where this equilibrium is permissible (\(r>1\)), the inequalities can only break when \(-r+2=-1\), i.e. when \(r=3\). We can get some insight into this by drawing cobweb diagrams for \(r\) in these various ranges, which we do in 2.2. I would strongly encourage you to use the links in the caption to explore this model with different initial conditions and values of \(r\).
As expected from our linear stability analysis, we see convergence to \(u^*=0\) for \(r<1\), and convergence to the nonzero equilibrium for \(r \in (1,3)\), corresponding exactly to the intersection of the blue line and the green curve. One qualitative change occurs at \(r=2\), where the convergence to the equilibrium goes from being monotonic to oscillatory. This can be predicted from the linear theory, as \(r=2\) is exactly where the coefficient of \(u_{1,n-1}\) in [lin-US-log-US-map] changes from positive to negative.
What happens at \(r=3\) and beyond? This is a different kind of bifurcation from the one that occurred at \(r=1\) for the extinction state, as there are no other equilibria for the system to go to. The cobweb plot in 2.2(d) is also not too informative, though direct evaluations can be helpful for getting an idea of what happens. Plugging in \(u_0=0.5\) for \(r=3.2\) we find the sequence, \[u_1=0.8,\, u_2 = 0.512,\, u_3 =0.7995392,\, u_4 = 0.51288405652,\, u_5 \approx 0.79946880348,\, \dots,\] where already by \(u_2\) we see that the sequence has converged rapidly to a 2-periodic solution.
We can even find the values of this periodic oscillation directly. What would such a solution look like? It would be some value \(u\) such that, after two iterations, it was unchanged. That is, we are looking for equilibria of the iteration \(u_n = f^2(u_{n-2}) = f(f(u_{n-2}))\). Any equilibrium of the model, i.e. those given in [log-US-map-US-equilibria], is also an equilibrium of this equation, but there may be new ones. In fact we can find them without too much trouble, as they satisfy the quartic equation \[u^* = f^2(u^*) = r^2u^*(1-u^*)\left(1-ru^*(1-u^*)\right),\] where we can find the remaining roots (e.g. by dividing out the two roots we already know) to find \[u^* = \frac{1 + r + \sqrt{r^2 - 2 r - 3}}{2 r}, \quad \frac{1 + r - \sqrt{r^2 - 2 r - 3}}{2 r}.\] Plugging in \(r=3.2\), these give \(u^* \approx 0.51304\dots\) and \(u^* \approx 0.79945\dots\), so our numerical intuition above was not wrong in this case.
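You can verify both the direct iteration and the closed-form 2-cycle values with a few lines of Python (a quick sketch; the transient length is an arbitrary choice):

```python
import math

r = 3.2
f = lambda u: r * u * (1 - u)    # the logistic map

u = 0.5
for _ in range(1000):            # discard the transient
    u = f(u)
print("orbit values:", u, f(u))  # the two points visited by the 2-cycle

s = math.sqrt(r**2 - 2*r - 3)
print("formula:", (1 + r - s) / (2*r), (1 + r + s) / (2*r))
```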
We could in principle go further and show that these two equilibria are linearly stable for \(r \in (3, 1+\sqrt{6})\), but they undergo another bifurcation at \(r=1+\sqrt{6} \approx 3.44949\) where now 4-periodic solutions emerge. These are stable and undergo another bifurcation around \(r \approx 3.54409\), leading to 8-periodic solutions. This period-doubling increases until \(r \approx 3.56995\), at which point most initial conditions and most values of \(r\) lead to chaotic solutions, with extremely complex behaviour. These period-doubling bifurcations, beyond the first one, will not be examined or further studied in this course, but they form a fascinating introduction to a large part of modern dynamical systems theory which explores such complicated dynamics.
Rather than dwell too long on this example, let’s go back to questions of biology. Do we expect populations to exhibit complicated period-doubling oscillations, which may be somewhat sensitively dependent on their growth rate? Do we expect large populations to lead to negative populations? Really these are more artefacts of the particular form of [logistic-US-map], which is, unlike its continuous counterpart, often not considered a good model of real populations. In the problem sheets you will explore a few examples of possibly realistic population models of this kind, though that isn’t to say that complex oscillations and chaos are not possible kinds of biological dynamics.
Now let’s consider examples of populations which interact across more than one generation, as well as models of multiple interacting populations. The former can be written as, \[\begin{equation} \label{multistep-US-diff-US-eqs} u_n = f(u_{n-1}, u_{n-2},\dots, u_{n-k}), \end{equation}\] where \(k\leq n\) and we now need to specify a set of initial values \(u_0, u_1, \dots, u_{k-1}\). As with ODEs, we would call this a \(k\)th-order or \(k\)-step difference equation. In contrast, if we want to model several populations using single-generation interactions, we can write \[\begin{equation} \label{systems-US-diff-US-eqs} \mathbfit{u}_n = \mathbfit{f}(\mathbfit{u}_{n-1}), \end{equation}\] where \(\mathbfit{u}_n\) is the vector of populations at time \(n\), and \(\mathbfit{f}\) is the vector-valued function representing their interactions.
We can reduce the study of [multistep-US-diff-US-eqs] to that of [systems-US-diff-US-eqs] by substituting in new variables for the intermediate steps. This is exactly what we did in 1.1.2 to reduce higher-order ODEs to systems of first-order equations. Of course, we can also linearise and analyse the stability of both kinds of models directly. Unfortunately we cannot use cobwebbing here, so we must instead resort to computation. We’ll now explore how to compute stability using these different approaches through a classical example.
Leonardo of Pisa (given the more well-known name Fibonacci in the 1800s) wrote a book on arithmetic in 1202 which contained the following exercise: after one year, how many rabbits can a single breeding pair produce in total? The following assumptions about rabbit biology are made: 1. Rabbits take one month to mature to reproductive age. 2. Once they reach reproductive age, a pair of rabbits will produce a new pair (one male and one female) each month. One can then graphically (see 2.3(a)) or mentally depict this process over a 12-month period to deduce the following sequence of pairs of rabbits: \[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144,\] which are the first twelve terms of the Fibonacci sequence, where each subsequent term is the sum of the previous two. In essence this captures the two simplistic biological assumptions about the population.
Now, if we consider a difference-equation formulation of this model, we have: \[\begin{equation} \label{fib-US-eqn} F_n = F_{n-1}+F_{n-2}, \quad F_0=0, \quad F_1=1. \end{equation}\] Importantly, this is a linear difference equation, and so we might expect solutions like those seen for the exponential model, [exp-US-diff-US-eqn], to work. That is, we can look for solutions of the form \(F_n = \lambda^n\), and see what values of \(\lambda\) would solve [fib-US-eqn].
Miles | Kilometres
--- | ---
1 | 1.61
2 | 3.22
3 | 4.83
5 | 8.05
8 | 12.87
13 | 20.92
21 | 33.80
34 | 54.72
55 | 88.51
89 | 143.23
Plugging this ansatz in, we can reduce the resulting equation to a quadratic in \(\lambda\) by dividing through by \(\lambda^{n-2}\) to find: \[\begin{equation} \label{fib-US-lambdas} \lambda^n = \lambda^{n-1}+\lambda^{n-2} \implies \lambda^2 = \lambda+1 \implies \lambda = \frac{1\pm \sqrt{5}}{2}. \end{equation}\] As [fib-US-eqn] is a linear equation, any linear combination of the two solutions \(\lambda^n\) is also a solution. That is, the general solution takes the form, \[F_n = C_1\left(\frac{1+ \sqrt{5}}{2} \right)^n+C_2\left(\frac{1- \sqrt{5}}{2} \right)^n.\] Finally, to determine the values of \(C_1\) and \(C_2\) we use the initial data \(F_0=0\), \(F_1=1\) to find, \[C_1+C_2=0, \, \, C_1\left(\frac{1+ \sqrt{5}}{2} \right)+C_2\left(\frac{1- \sqrt{5}}{2} \right) =1 \implies C_1=\frac{1}{\sqrt{5}}, \,\, C_2 = \frac{-1}{\sqrt{5}}.\] This gives the following formula for the \(n\)th Fibonacci number: \[F_n = \frac{1}{\sqrt{5}}\left[\left(\frac{1+\sqrt{5}}{2}\right)^n-\left(\frac{1-\sqrt{5}}{2}\right)^n\right] = \frac{1}{2^n\sqrt{5}}\left[(1+\sqrt{5})^n-(1-\sqrt{5})^n\right].\]
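As a quick sanity check, the closed-form expression can be compared against direct iteration of [fib-US-eqn] (plain Python; the rounding simply removes floating-point error):

```python
import math

def fib_binet(n):
    """Closed-form (Binet) expression for the n-th Fibonacci number."""
    s5 = math.sqrt(5.0)
    return ((1 + s5)**n - (1 - s5)**n) / (2**n * s5)

F = [0, 1]
for n in range(2, 13):
    F.append(F[-1] + F[-2])      # F_n = F_{n-1} + F_{n-2}

print(F)
print([round(fib_binet(n)) for n in range(13)])  # identical to F
```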
The numbers \(F_n\), as well as their ratio \(F_n/F_{n-1}\), which rapidly approaches the golden ratio \((1+\sqrt{5})/2\), appear in a huge variety of natural and cultural phenomena. These range from the use of the golden ratio in art, to the appearance of Fibonacci numbers in the number of seeds on the top of many flowers (see 2.3(b) or search for ‘Fibonacci phyllotaxis’), and even in converting miles to kilometres (see 2.1). However, from a biological modelling point of view, it should be clear that linear models will only be very rough caricatures of real populations. Still, they can be useful in making rough predictions, such as estimating the growth rate of populations in the absence of predation or resource limitations.
Let’s now state the general procedure for linear stability analysis of systems of the form, \[\begin{equation} \label{gen-US-diff-US-sys} \mathbfit{u}_n = \mathbfit{f}(\mathbfit{u}_{n-1}), \end{equation}\] where we are thinking of \(\mathbfit{u}_n\) as having \(N\) components. The first step is to compute any equilibria. These are constant vectors \(\mathbfit{u}^*\) which satisfy \(\mathbfit{u}^* = \mathbfit{f}(\mathbfit{u}^*)\). Once such an equilibrium is found, we can perturb it in the usual way by writing \(\mathbfit{u}_n = \mathbfit{u}^*+\varepsilon \mathbfit{u}_{1,n}\) to find, \[\mathbfit{u}^*+\varepsilon \mathbfit{u}_{1,n} = \mathbfit{f}(\mathbfit{u}^*+\varepsilon \mathbfit{u}_{1,n-1}) = \mathbfit{f}(\mathbfit{u}^*)+\varepsilon \mathsfbfit{J}\mathbfit{u}_{1,n-1} + O(\varepsilon^2),\] where \(\mathsfbfit{J}\) is the usual Jacobian matrix of \(\mathbfit{f}\) evaluated at the steady state \(\mathbfit{u}^*\). Cancelling the \(O(1)\) terms (that is, using the steady state equation \(\mathbfit{u}^* = \mathbfit{f}(\mathbfit{u}^*)\)), dividing through by \(\varepsilon\), and then setting \(\varepsilon=0\), we arrive at the linear system, \[\mathbfit{u}_{1,n} = \mathsfbfit{J}\mathbfit{u}_{1,n-1}.\] This system is in many ways similar to the linearised ODE system [gen-US-lin-US-ODE], in that its solutions are essentially determined by the eigenvalues of \(\mathsfbfit{J}\).
For simplicity, let’s consider the case when \(\mathsfbfit{J}\) is diagonalisable as \(\mathsfbfit{J}=\mathsfbfit{P}\mathsfbfit{D}\mathsfbfit{P}^{-1}\), where \(\mathsfbfit{P}\) is the matrix of eigenvectors and \(\mathsfbfit{D}\) is a diagonal matrix of the eigenvalues. We then have, \[\begin{equation} \label{lin-US-stab-US-diff-US-sys} \mathbfit{u}_{1,n} = \mathsfbfit{J}\mathbfit{u}_{1,n-1} = \mathsfbfit{J}^n\mathbfit{u}_{1,0} = \mathsfbfit{P}\mathsfbfit{D}^n\mathsfbfit{P}^{-1}\mathbfit{u}_{1,0}, \end{equation}\] where we have used the factorisation to cancel the inner products \(\mathsfbfit{P}^{-1}\mathsfbfit{P}\), so that only the diagonal matrix \(\mathsfbfit{D}\) is raised to the power \(n\). Finally, we make a change of basis of the form \(\mathbfit{v}_n = \mathsfbfit{P}^{-1}\mathbfit{u}_{1,n}\) and multiply [lin-US-stab-US-diff-US-sys] by \(\mathsfbfit{P}^{-1}\) to find \[\mathbfit{v}_{n} = \mathsfbfit{D}\mathbfit{v}_{n-1}=\mathsfbfit{D}^n\mathbfit{v}_{0}.\] This is essentially the discrete analogue of the ODE system [diagonalizable-US-lin-US-ode], and the manipulations exploiting the diagonalisability of \(\mathsfbfit{J}\) are the same. So if we write the components of this vector as \(\mathbfit{v}_n = (v_{1,n},v_{2,n},\dots,v_{N,n})\), these equations are of the form, \[v_{i,n} = \lambda_iv_{i,n-1} = \lambda_i^nv_{i,0},\] where \(\lambda_i\) are the eigenvalues of \(\mathsfbfit{J}\).
The perturbations \(\mathbfit{u}_{1,n}\) are just linear combinations of these \(\mathbfit{v}_n\), so they will grow or decay depending on the size of the eigenvalues \(\lambda_i\). In particular we have the following result:
Consider an equilibrium \(\mathbfit{u}^*\) of [gen-US-diff-US-sys], and the associated Jacobian matrix \(\mathsfbfit{J}(\mathbfit{u}^*)\) corresponding to \(\mathbfit{f}\). This equilibrium is said to be linearly stable if all eigenvalues \(\lambda\) of \(\mathsfbfit{J}(\mathbfit{u}^*)\) satisfy \(|\lambda|<1\). If instead there is at least one eigenvalue with \(|\lambda|>1\), then the equilibrium is said to be linearly unstable. If the eigenvalue of largest modulus has \(|\lambda|=1\), then the linearisation does not determine the stability of the equilibrium.
As in the ODE case, if the matrix \(\mathsfbfit{J}\) is non-diagonalisable, then the solutions of the linear system will not just be vectors multiplying the eigenvalues to the power \(n\). But away from the degenerate case of the eigenvalue with largest modulus having \(|\lambda|=1\), the stability of an equilibrium is determined by the eigenvalues \(\lambda\), and so the above result still holds.
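In practice, checking this condition numerically is immediate once the Jacobian is known; a brief sketch assuming numpy is available:

```python
import numpy as np

def linearly_stable(J):
    """Linear stability of an equilibrium of u_n = f(u_{n-1}): all |lambda| < 1."""
    return max(abs(np.linalg.eigvals(J))) < 1.0  # spectral radius below one

# Example: Jacobians of the extinction state of the logistic map for r = 0.5 and r = 1.5.
print(linearly_stable(np.array([[0.5]])), linearly_stable(np.array([[1.5]])))  # True False
```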
Consider the Fibonacci system [fib-US-eqn]. Introducing a new variable \(v_n = F_{n-1}\), write this equation as a coupled system of first-order difference equations. Show that the eigenvalues of this system give exactly the same values of \(\lambda\) given in [fib-US-lambdas].
Let’s consider a two-species version of [logistic-US-map] corresponding to two competing species which have the same growth rate \(r\) and carrying capacity of \(1\) in the absence of competition. This takes the form, \[\begin{align} \label{LV-US-Diff} u_n = u_{n-1}(r - u_{n-1} - \gamma_1v_{n-1}), \\ v_n = v_{n-1}(r - v_{n-1} - \gamma_2u_{n-1}), \end{align}\] where all parameters are positive. There are four equilibria to this model, which you should be able to find as: \[\begin{equation} \label{diff-US-sys-US-equil} (u^*,v^*) = (0,0),\, (r-1,0),\, (0,r-1),\, \textrm{ or } \,\left(\frac{(\gamma_1-1)(r-1)}{\gamma_1 \gamma_2-1},\frac{(\gamma_2-1)(r-1)}{\gamma_1 \gamma_2-1} \right). \end{equation}\] We compute the Jacobian as, \[\mathsfbfit{J}(u^*,v^*) = \begin{pmatrix} r-2u^*-\gamma_1v^* & -\gamma_1u^*\\ -\gamma_2v^* & r-2v^*-\gamma_2u^* \end{pmatrix}.\]
We consider the extinction steady state \((0,0)\) first, which is always permissible. The Jacobian is \[\mathsfbfit{J}(0,0) = \begin{pmatrix} r & 0\\ 0 & r \end{pmatrix},\] which has the repeated eigenvalue \(\lambda = r\), so this equilibrium is stable only if \(r<1\).
Next we consider the case of a single species surviving. Without loss of generality, we take \((u^*,v^*)=(r-1,0)\), which is only permissible if \(r>1\). The Jacobian is \[\mathsfbfit{J}(r-1,0) = \begin{pmatrix} 2-r & \gamma_1(1-r)\\ 0 & r-\gamma_2(r-1) \end{pmatrix},\] which has eigenvalues \(\lambda=2-r\) and \(\lambda=r-\gamma_2(r-1)\). Analysing just the first eigenvalue, we see that this state is unstable for \(0\leq r <1\) and for \(r>3\); so if this state is stable, we must have \(1 < r < 3\). Note that this is exactly the instability we identified in [logistic-US-map], corresponding to logistic growth leading to period-doubling, and hence setting \(r>3\) would only lead to time-dependent solutions for the single species \(u_n\), rather than invasion or coexistence with the species \(v_n\), at least via this particular bifurcation.
We can manipulate the conditions for the second eigenvalue as follows. If \(\lambda > -1\), this says that \[r+\gamma_2(1-r) > -1 \implies \frac{1+r}{r-1} > \gamma_2,\] where all terms are positive for permissible values \(r>1\). If instead we have \(\lambda < 1\), then \[r+\gamma_2(1-r) < 1 \implies \gamma_2>1,\] so that for this state to be stable we must have, \[\frac{1+r}{r-1} > \gamma_2 > 1,\] giving rise to two possible instabilities as \(\gamma_2\) varies. The bifurcation at \(\gamma_2=1\) has an immediate interpretation: a single species can prevent invasion by a new species as long as it competes sufficiently strongly against it. That is, if the population \(u\) is at its equilibrium value of \(r-1\), the introduction of some population \(v\) will just lead to the extinction of \(v\) as long as \(\gamma_2>1\). Something qualitatively similar happens in the ODE model, and is related to competitive exclusion. The other route to instability, when \(\gamma_2>(r+1)/(r-1)\), is different, and corresponds to an oscillatory instability (as here \(\lambda < -1\)) in the perturbation of \(v_n\). However, this is problematic, as \(v^*=0\) at this steady state, so this would lead to dynamics which are not permissible.
In principle we can consider the coexistence state given by the fourth entry of [diff-US-sys-US-equil]. Using the equilibrium relations \(r - u^* - \gamma_1 v^* = 1\) and \(r - v^* - \gamma_2 u^* = 1\), the Jacobian simplifies to \[\mathsfbfit{J}(u^*,v^*) = \begin{pmatrix} 1-u^* & -\gamma_1 u^*\\ -\gamma_2 v^* & 1-v^* \end{pmatrix}, \quad u^* = \frac{(\gamma_1-1)(r-1)}{\gamma_1 \gamma_2-1}, \quad v^* = \frac{(\gamma_2-1)(r-1)}{\gamma_1 \gamma_2-1}.\] One can make some progress studying this system, but it is extremely algebraically tedious, so we will not pursue this here (or expect you to be able to do a calculation this messy).
There is an analogue of the Routh–Hurwitz stability criteria for determining whether a matrix has all eigenvalues satisfying \(|\lambda|<1\). Namely, for a \(2\times2\) Jacobian the eigenvalues satisfy, \[\lambda^2 - \operatorname{tr}(\mathsfbfit{J})\lambda + \det(\mathsfbfit{J})=0,\] and both roots have \(|\lambda|<1\) if and only if the following inequalities hold: \[2 > 1+\det(\mathsfbfit{J}) > |\operatorname{tr}(\mathsfbfit{J})|.\] Such conditions are known as the Jury stability conditions.
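For a \(2\times2\) Jacobian this gives a convenient programmatic test; a small sketch (numpy, with an arbitrary example matrix):

```python
import numpy as np

def jury_stable(J):
    """Jury conditions for a 2x2 matrix: both eigenvalues satisfy |lambda| < 1
    if and only if |tr J| < 1 + det J < 2."""
    tr, det = np.trace(J), np.linalg.det(J)
    return abs(tr) < 1 + det < 2

J = np.array([[0.5, 0.1],
              [0.2, 0.3]])
print(jury_stable(J), np.abs(np.linalg.eigvals(J)))  # True, and both moduli below one
```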
One advantage of difference equations compared to differential equations is how easy they are to simulate. In 2.4, we consider solutions near \(r=3\), validating the previous comment that this instability really is the same single-species period-doubling bifurcation as before; we can essentially ignore the behaviour of \(v_n\) in this case. On the other hand, 2.5 shows different values of \(\gamma_1\) and \(\gamma_2\) for which coexistence, competitive exclusion, and (unphysical) oscillations occur, corresponding to the two bifurcations identified above. Although we have not analysed the coexistence equilibrium, we observe that it appears to be stable for \(\gamma_1<1\) and \(\gamma_2<1\), as in the ODE case. Note that in panel (d), the equations for \(u_n\) and \(v_n\) are the same, and hence both \((r-1,0)\) and \((0,r-1)\) are stable. Finally, we note that although panel (e) is still within the stable parameter regime of the equilibrium \((r-1,0)\), it exhibits unphysical negative oscillations due to the negative eigenvalue, even though these oscillations eventually decay towards zero.
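For concreteness, simulations like those in 2.4 and 2.5 can be produced with a few lines of code; a minimal sketch (numpy; the parameter values are illustrative, and here lie in the coexistence regime):

```python
import numpy as np

def simulate(r, g1, g2, u0=0.1, v0=0.1, n=100):
    """Iterate u_n = u(r - u - g1 v), v_n = v(r - v - g2 u)."""
    u, v = np.empty(n + 1), np.empty(n + 1)
    u[0], v[0] = u0, v0
    for k in range(n):
        u[k + 1] = u[k] * (r - u[k] - g1 * v[k])
        v[k + 1] = v[k] * (r - v[k] - g2 * u[k])
    return u, v

u, v = simulate(r=2.0, g1=0.5, g2=0.5)  # gamma_1, gamma_2 < 1
print(u[-1], v[-1])                     # both approach the coexistence equilibrium 2/3
```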
There are a number of differences in the analysis of this system compared to the ODE version of a competitive Lotka–Volterra system. Here it is possible for the extinction equilibrium to be stable, for instance if there are insufficient resources for either population to persist (\(r<1\)). The ability of one population at equilibrium to suppress the other population is similar to the ODE model, with only minor algebraic differences in the conditions. However, the coexistence equilibrium is substantially more involved to analyse, and in fact some of its instabilities lead to complex periodic and chaotic solutions which are beyond our scope here.
As we have seen, difference equation models are similar to ODE models, but with some important mathematical and biological differences. The concept of equilibria is similar in both kinds of models, but has a different mathematical form in difference equations. The linearisation of a system of difference equations is very similar to the linearised ODE system, essentially involving the same Jacobian matrix in both cases. However, the condition for an equilibrium to be stable is now that the modulus of all eigenvalues of the Jacobian is less than one, rather than having to do with their real parts. Finally we also developed a graphical way of studying solutions of these models known as cobwebbing, though it is only strictly valid for scalar equations.
We also saw that even simple models, such as the logistic map [logistic-US-map], can give rise to complex behaviour where populations suddenly become negative, or undergo periodic oscillations, or even chaotic behaviour. It is important to keep these kinds of behaviours in mind when we choose what models to use to represent a real population, as well as what ranges of parameters are appropriate. We will explore some of these issues through examples in the assignments and problems classes.
The basic ideas of this chapter can be found in several different places, such as Murray, vol. I, chap. 2, or in these online notes, which includes an extended discussion of physiological applications of difference equations.
You can use this GeoGebra page to explore scalar difference equations. Note that the notation here is \(x_{n+1} = g(x_n)\). You should type in one of the systems we have been studying, then drag the slider for \(n\) on the bottom to the right, and move the slider which changes the initial point \(x_0\).
As mentioned last term, PDE stability analysis is more involved than linear stability analysis for ODEs, though the overall spirit of the procedure is the same. Here we will explore this in terms of scalar reaction–diffusion equations, before sketching a broader perspective.
Consider the equation, \[\begin{equation} \label{rd-US-eqn} \frac{\partial u}{\partial t} = D\frac{\partial^2 u}{\partial x^2} + f(u), \quad x \in [0,L], \,\, t > 0, \end{equation}\] where \(u(x,t)\) is a concentration or density of some chemical or population, \(D\) is a diffusion coefficient, and \(f\) is the reaction term. To completely define the problem, we must impose initial and boundary conditions. We assume \(u(x,t)\) is known at \(t=0\), say \(u(x,0) = u_0(x)\). A reasonably general set of homogeneous boundary conditions is given by, \[\begin{equation} \label{rd-US-bcs} a_0 u(0,t) + b_0 \frac{\partial u}{\partial x}(0,t) = 0, \quad \textrm{and}\quad a_L u(L,t) + b_L \frac{\partial u}{\partial x}(L,t) = 0. \end{equation}\] Now if \(a_0=a_L=0\) and \(b_0=b_L\neq 0\), then the space derivatives are zero at the boundaries, which is a no-flux or Neumann boundary condition. If instead \(b_0=b_L=0\) and \(a_0=a_L\neq 0\), then we have a set of fixed or Dirichlet conditions. Other boundary conditions, such as inhomogeneous ones where the right-hand sides above are nonzero, are possible too. From a modelling perspective, it is important to think about what the boundary conditions mean, and often to return to the fundamental physics of the equations to interpret them.
Reaction–diffusion models can be understood in terms of conservation of mass. The no-flux condition means that the only change in the total mass \(M\) of the population is due to the reactions \(f\). To see this, note that the change in \(M\) satisfies \[\begin{aligned} \frac{{\mathrm d}M}{{\mathrm d}t} &= \frac{{\mathrm d}}{{\mathrm d}t} \int_0^L u(s,t) \, {\mathrm d}s = \int_0^L \frac{\partial u}{\partial t}(s,t) \, {\mathrm d}s \\ &= \int_0^L \left[D\frac{\partial^2 u}{\partial x^2} + f(u)\right] {\mathrm d}s = D\frac{\partial u}{\partial x}(L,t) - D\frac{\partial u}{\partial x}(0,t) + \int_0^L f \, {\mathrm d}s = \int_0^L f \, {\mathrm d}s, \end{aligned}\] where the flux terms vanish due to the no-flux boundary conditions. So if \(f=0\) (e.g. our model is the heat equation) and we use Neumann conditions, then the above equation shows that \(M\) is constant in time (hence this PDE being related to ‘conservation of mass’).
Steady state solutions of [rd-US-eqn] are functions \(u_0(x)\) satisfying, \[\begin{equation} \label{spatial-US-steady-US-state} 0 = D\mathchoice{\frac{\partial^2 u_0}{\partial x^2}}{\partial^2 u_0/\partial x^2}{\partial^2 u_0/\partial x^2}{\partial^2 u_0/\partial x^2} + f(u_0), \quad x \in [0,L], \end{equation}\] and satisfying the boundary conditions [rd-US-bcs]. We can linearise [rd-US-eqn] by writing \(u=u_0(x)+\varepsilon u_1(x,t)\), and truncating to \(O(\varepsilon^2)\) as before. We arrive at the linear PDE for the perturbation, \[\begin{equation} \label{rd-US-eqn-US-linear} \mathchoice{\frac{\partial u_1}{\partial t}}{\partial u_1/\partial t}{\partial u_1/\partial t}{\partial u_1/\partial t} = D\mathchoice{\frac{\partial^2 u_1}{\partial x^2}}{\partial^2 u_1/\partial x^2}{\partial^2 u_1/\partial x^2}{\partial^2 u_1/\partial x^2} + \mathchoice{\frac{{\mathrm d}f}{{\mathrm d}u}}{{\mathrm d}f/{\mathrm d}u}{{\mathrm d}f/{\mathrm d}u}{{\mathrm d}f/{\mathrm d}u}(u_0)u_1, \quad x \in [0,L], \,\, t > 0. \end{equation}\] As this equation is linear, we can solve it via separation of variables. Leaving the details to a problem sheet, this leads to an eigenvalue problem of the form, \[\begin{equation} \label{rd-US-eigenproblem} 0 = D\mathchoice{\frac{\partial^2 w_k}{\partial x^2}}{\partial^2 w_k/\partial x^2}{\partial^2 w_k/\partial x^2}{\partial^2 w_k/\partial x^2} + f'(u_0)w_k -\lambda_k w_k, \quad x \in [0,L], \end{equation}\] where \(w_k(x)\) must satisfy the boundary conditions, [rd-US-bcs], and the constant \(\lambda_k\) must be determined as part of the problem. In fact, the subscript \(k\) comes from the fact that we will find a set of functions and constants that satisfy this problem, so we index them by \(k\). This is a two-point boundary value problem, and there is a classical theory of these kinds of equations and their solutions.
A Sturm–Liouville eigenvalue problem takes the form, \[\mathcal{L}w_k = \frac{{\mathrm d}}{{\mathrm d}x}\left(p(x)\frac{{\mathrm d}w_k}{{\mathrm d}x} \right) + q(x)w_k = \lambda_k r(x) w_k.\] For such an equation, with important technical assumptions on the functions \(p\), \(q\), and the weight \(r\), there is abstract machinery guaranteeing a countable sequence of solutions \((w_k, \lambda_k)\), \(k=0,1,2,\dots\), with \(\lambda_0> \lambda_1 > \dots > \lambda_k> \dots\) tending towards \(-\infty\).
Depending on the explicit \(x\) dependence of the equation (due to \(u_0\)), we may or may not be able to make analytical progress. Even for relatively simple \(x\) dependence, the linear problem is typically only solvable using special functions. Instead of pursuing these here, we will explore below an important class of equilibria: those which are spatially homogeneous.
Finally, in the cases where we can solve [rd-US-eqn-US-linear], we find that the solution depends exponentially on \(\lambda_k t\). That is, a single mode grows as \(u_1 \propto \exp(\lambda_k t) w_k(x)\), and so we can again classify stability in terms of \(\operatorname{Re}(\lambda_k)\): if the largest \(\operatorname{Re}(\lambda_k)>0\), then the solution to the linear problem tends to \(\infty\) (at least in some spatial region), and so we call the equilibrium unstable, as perturbations grow. If instead \(\operatorname{Re}(\lambda_k)<0\) for all modes, then we call the equilibrium linearly stable. As in the ODE case, we do not dwell on the case \(\operatorname{Re}(\lambda_k)=0\), and we note that not all PDEs will have exponential-in-time dynamics (though this assumption can be justified for PDEs which can be solved via separation of variables).
If \(u_0\) is spatially homogeneous (constant in \(x\)), then \(f(u_0)\) will be a constant. Study this case by solving the following PDE via separation of variables: \[\mathchoice{\frac{\partial u_1}{\partial t}}{\partial u_1/\partial t}{\partial u_1/\partial t}{\partial u_1/\partial t} = D\mathchoice{\frac{\partial^2 u_1}{\partial x^2}}{\partial^2 u_1/\partial x^2}{\partial^2 u_1/\partial x^2}{\partial^2 u_1/\partial x^2} + au_1, \quad x \in [0,L], \,\,t>0,\] where \(L>0\), \(a \in \mathbb{R}\), with the boundary and initial conditions, \[u_1(0,t)=u_1(L,t)=0, \quad u_1(x,0) = \sin \left (\frac{3\pi x}{L} \right )+\sin \left (\frac{7\pi x}{L} \right ).\] For what values of the constants \(a\) and \(L\) do solutions to this linear equation grow or decay? How does this compare to the growth or decay of solutions to the ODE \(\dot{u_1} = au_1\)? Now, what if you had a general initial condition such that all of the constants in front of the different modes are nonzero?
For no-flux boundary conditions (\(a_0=a_L=0\)), any constant solution of \(f(u_0)=0\) satisfies the PDE [rd-US-eqn]. In this case, we can solve [rd-US-eqn-US-linear] by expanding the solution in eigenfunctions of the Laplacian; here, these will be the cosine eigenfunctions you should be familiar with from last term.
To explore this in detail, let’s consider a model of logistic growth in a spatially-extended environment: \[\begin{equation} \label{rd-US-eqn-US-logistic} \frac{\partial u}{\partial t} = D\frac{\partial^2 u}{\partial x^2} + ru\left (1-\frac{u}{K} \right), \quad x \in [0,L], \,\, t > 0, \end{equation}\] again with the no-flux boundary conditions. We could nondimensionalise \(x\), \(t\), and \(u\) to set three of these constants to \(1\) (or to set the domain to be \(x \in [0,1]\)), but for now we will study the dimensional model. The two spatially homogeneous equilibria are \(u_0=0,K\). The linearisation about either equilibrium reads, \[\begin{equation} \label{rd-US-eqn-US-linear-US-logistic} \frac{\partial u_1}{\partial t} = D\frac{\partial^2 u_1}{\partial x^2} + r\left (1-\frac{2u_0}{K} \right )u_1, \quad x \in [0,L], \,\, t > 0. \end{equation}\] Now the key insight is that [rd-US-eqn-US-linear-US-logistic] is a constant-coefficient linear PDE, and it can be solved exactly as we did for the diffusion equation last term, with some additional bookkeeping for the extra constant-coefficient term at the end. Alternatively, we can use the ansatz \[\begin{equation} \label{ansatz} u_1 = A\exp(\lambda_k t)\cos\left (\frac{k\pi x}{L}\right ),\quad k=0,1,2,\dots , \end{equation}\] with \(A\) an (unimportant but nonzero) constant. This ansatz is exactly like expanding solutions of the ODEs [gen-US-lin-US-ODE] in terms of \(\mathbfit{u} = \exp(\lambda_k t)\mathbfit{w}_k\).
Substituting this ansatz into [rd-US-eqn-US-linear-US-logistic] and cancelling the common factors, we can directly read off the growth rates: \[\begin{equation} \label{rd-US-logistic-US-linstab} \lambda_k = -D\left (\frac{k\pi}{L} \right)^2 + r \left (1-\frac{2u_0}{K} \right ). \end{equation}\] The spatially homogeneous model is straightforward to analyse via linear stability of the same two equilibria (and in fact we recover its growth rates for free by setting \(D=0\) above). For this reason, in the case of no-flux boundary conditions, we see that \(\lambda_0\) is the growth rate associated with the spatially homogeneous model. We also have that \(\lambda_0 > \lambda_1 > \lambda_2 > \dots\), so that the effect of diffusion is to stabilise both equilibria against perturbations. The steady state \(u_0=K\) is therefore stable to all perturbations, whereas \(u_0=0\) is unstable for \(k=0\), but becomes stable for sufficiently large \(k\). That is, for \(k>k_0\), where \(k_0 = \sqrt{\frac{r}{D}}\frac{L}{\pi}\), these perturbations decay. In practice, stability to such perturbations does not change the fact that the steady state is unstable to longer-wavelength ones – a small population in a spatially extended environment will still grow away from the extinction steady state.
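It can be instructive to tabulate a few of these growth rates directly; a short sketch (numpy, with arbitrary illustrative parameter values):

```python
import numpy as np

D, r, K, L = 0.05, 1.0, 1.0, 1.0
k = np.arange(6)

for u0 in (0.0, K):
    lam = -D * (k * np.pi / L)**2 + r * (1 - 2 * u0 / K)
    print(f"u0 = {u0}: lambda_k =", np.round(lam, 3))

# For u0 = 0, only the k = 0 and k = 1 modes grow here (k_0 = sqrt(r/D) L/pi ~ 1.42);
# for u0 = K, every mode decays, so the carrying-capacity state is stable.
```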
The fact that the results of a linear stability analysis for the spatially homogeneous equation, and a linear stability analysis for a reaction–diffusion equation with no-flux boundaries gave the same results is not especially surprising. Diffusion tends to make things spatially homogeneous, and the ODE model is exactly the case where everything is well-mixed to the point that we can ignore spatial organisation of a population.
Of course, our ansatz above is a single term of the solution (which we will call a mode). As in the ODE case, we can justify considering a single mode at a time by expanding \(u_1(x,t) = \sum_{k=0}^\infty A_k\exp(\lambda_k t)\cos(xk\pi/L)\) with constant coefficients \(A_k\). The key is that these cosine functions are orthogonal to one another for different \(k\), so that the manipulations done in [lin-US-ODE-US-expansion] effectively follow through (subject to details about defining an appropriate inner product, and technical details about convergence of infinite sums). The take-home message is that considering a single mode at a time is enough to determine the growth or decay of perturbations to a steady state, just as in the ODE case considering a single eigenvector at a time is sufficient. Note that we do not pose initial conditions for this linear equation, but just assume that all modes have some nonzero perturbation – this will be true in general for an arbitrary small perturbation of a spatially constant equilibrium, but is worth spending some time thinking about.
What if we consider the same PDE [rd-US-eqn-US-logistic] but with homogeneous Dirichlet conditions ([rd-US-bcs] with \(b_0=b_L=0\))? Now the steady state \(u_0=K\) of the ODE model does not satisfy these boundary conditions, and so is not a steady state of the PDE, though the steady state \(u_0=0\) is a solution to the PDE (and a steady state, as it will not evolve in time). What changes with the linear stability of \(u_0=0\)?
The linear equation [rd-US-eqn-US-linear-US-logistic] will be the same, although we cannot use the ansatz involving cosines (they will not satisfy the boundary conditions). We can either use separation of variables to solve this linear PDE, or we can guess what will happen to the spatial part of our solutions if we have solved the diffusion equation with Dirichlet conditions before. Essentially we change our cosines to sines to get, \[u_1 = C\exp(\lambda_k t)\sin\left (\frac{xk\pi}{L}\right ),\quad k=1,2,\dots,\] and our growth rates are hence the same as in [rd-US-logistic-US-linstab], except that we no longer have a mode for \(k=0\) (as in this case our ansatz is trivially zero, and cannot grow or decay).
What does this tell us? As before, we have that the growth rates decrease with increasing \(k\), that is \(\lambda_1 > \lambda_2 > \dots\). In the spatially homogeneous model, the growth rate of a perturbation to the extinction steady state was \(\lambda_0=r>0\) (remember this can be seen from [rd-US-logistic-US-linstab] with \(D=0\) and \(u_0=0\)). But now our largest growth rate, \(\lambda_1\), need not be positive. In particular \[\begin{equation} \label{harsh-US-environment-US-growth}\lambda_1 = -D\left (\frac{\pi}{L} \right)^2 + r \implies \lambda_1<0 \quad \textrm{if} \quad L < L_c = \pi\sqrt{\frac{D}{r}}. \end{equation}\] Hence, on a sufficiently small domain, diffusion actually stabilises the extinction steady state, and a small population will die off. Effectively the Dirichlet boundary conditions here are modelling a harsh environment, and random movement of individuals is sufficient to effectively reduce the carrying capacity of the population to zero. If the movement is faster (larger \(D\)), then this occurs on a larger domain \(L\).
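A short sketch of this critical length computation, again with assumed illustrative values of \(D\) and \(r\):

```python
# Sketch: the critical domain length L_c = pi sqrt(D/r) for the Dirichlet
# ('harsh environment') problem; D and r are assumed illustrative values.
import numpy as np

D, r = 0.5, 2.0
L_c = np.pi * np.sqrt(D / r)
print("L_c =", round(L_c, 3))

for L in (0.5 * L_c, 2.0 * L_c):
    lam1 = -D * (np.pi / L) ** 2 + r
    print(f"L = {L:.3f}: lambda_1 = {lam1:+.3f}")
# lambda_1 < 0 for L < L_c (extinction) and > 0 for L > L_c (persistence).
```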
What happens when the parameters are such that extinction is no longer possible? Unlike in the ODE model of logistic growth, there is no stable spatially homogeneous steady state for the population to approach, and so the population density must go to some other state. In principle this could be a time-dependent state, but it turns out this is not possible for this kind of scalar reaction–diffusion equation. Instead, the population will approach another steady state (which can only be found approximately, e.g. numerically), corresponding to a function \(u_0(x)\) solving [spatial-US-steady-US-state]. We will not go into the details of how to approximate this steady state, but give a sketch of how it might depend on the domain length \(L\) in 3.1.
A substantial amount of mathematical theory has been developed around properties of the Laplacian operator. One key piece of this theory which we will make use of is the study of the eigenvalues and eigenfunctions of this operator. These satisfy \[-\nabla^2 w_\mathbfit{k}(\mathbfit{x}) = \rho_\mathbfit{k} w_\mathbfit{k}(\mathbfit{x}),\] on a ‘nice’ domain (typically smooth boundary and simply connected) \(\Omega \subset \mathbb{R}^n\), \(n=1,2\), or \(3\), with \(\mathbfit{x} \in \Omega\). The minus sign is conventional so that, with suitable boundary conditions, the eigenvalues \(\rho_\mathbfit{k}\) can be ordered in an increasing sequence (with repetitions possible) tending to \(+\infty\). Be aware that different authors use different conventions here, both for this sign and for the notation of the eigenvalues \(\rho_\mathbfit{k}\) and the indexing variable \(\mathbfit{k}\).
For \(n=1\), in the case of no-flux boundary conditions on a domain of length \(L\), we have \[w_k = \cos \left( \frac{x k \pi}{L} \right), \quad \rho_k = \left( \frac{k \pi}{L} \right)^2, \quad k=0,1,2,\dots,\] and for homogeneous Dirichlet conditions we have, \[w_k = \sin \left( \frac{x k \pi}{L} \right), \quad \rho_k = \left( \frac{k \pi}{L} \right)^2, \quad k=1,2,\dots,\] as discussed in the last subsection.
If we consider now \(n=2\), say for homogeneous Dirichlet conditions on the box domain \(\mathbfit{x} = (x,y) \in [0,L_1]\times[0,L_2]\), we can find via separation of variables, \[w_\mathbfit{k} = \sin \left( \frac{n x \pi}{L_1} \right)\sin \left( \frac{m y \pi}{L_2} \right), \quad \rho_\mathbfit{k} = \left( \frac{n \pi}{L_1} \right)^2+\left( \frac{m \pi}{L_2} \right)^2, \quad n=1,2,\dots, \,\, m=1,2,\dots,\] where \(\mathbfit{k}=(n,m)\) is our indexing set. Similarly we can find solutions when our box has Neumann conditions, or even when the vertical and horizontal boundaries have different conditions. We will review other cases of these kinds of functions as needed, but hopefully you will have already seen examples such as Bessel functions.
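If you want to check such a formula symbolically, the following sketch verifies that the separable sine product is an eigenfunction of \(-\nabla^2\) on the box with the stated eigenvalue; the use of sympy and the symbol names are my own illustrative choices.

```python
# Sketch: a symbolic check that the separable sine product is an
# eigenfunction of -Laplacian on [0, L1] x [0, L2] with eigenvalue
# (n pi / L1)^2 + (m pi / L2)^2.
import sympy as sp

x, y = sp.symbols('x y')
n, m = sp.symbols('n m', positive=True, integer=True)
L1, L2 = sp.symbols('L1 L2', positive=True)

w = sp.sin(n * sp.pi * x / L1) * sp.sin(m * sp.pi * y / L2)
rho = (n * sp.pi / L1) ** 2 + (m * sp.pi / L2) ** 2

residual = -(sp.diff(w, x, 2) + sp.diff(w, y, 2)) - rho * w
print(sp.simplify(residual))   # 0, confirming the eigenpair
```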
We will make use of these solutions, noting that we can immediately construct ansatzes like [ansatz] to solve linear PDEs involving the Laplacian. In some sense these eigenfunctions immediately diagonalise this operator, in the same way that the eigenvectors of a symmetric matrix allow us to solve one component at a time (hence justifying the exponential in \(\lambda\) time dependence).
We will spend a large portion of the remainder of the course on some related theories of pattern formation in developmental biology, ecology, and beyond. Pattern formation is an area of science that studies the fundamental mechanisms which give rise to the rich diversity of shapes and structures observed in nature. For example, 3.2 shows a variety of natural structures which exhibit spatial symmetries – human fingers, zebrafish skin, leopard spots, etc. Many other examples (not pictured) include coherent structures in water waves, forests, and sand dunes, geological and astrophysical formations, and spatial organisations of human settlements. This is a huge area with a very wide literature, with numerous major contributions in our understanding from biologists and chemists, as well as physicists and mathematicians. We will necessarily be narrow in our investigations, but hopefully deep enough for you to come away appreciating the power of simple mathematical models (in concert with experimental corroboration) in explaining these kinds of phenomena.
In 1952, Alan Turing (known mostly for world-changing work in computer science and cryptography) wrote a seminal paper entitled The chemical basis of morphogenesis16. The idea was to explain how patterns such as stripes, spots and spirals can develop spontaneously from homogeneous states. Such a transition is a simplistic way to think of, e.g., an embryo which will eventually develop into an enormously complex organism. The basic idea is that two chemicals/populations/organisms are originally uniformly mixed, but that under some slight change in conditions the two species separate out, forming regions of high and low concentrations. For example, milk is a suspension composed essentially uniformly of proteins and fat; the protein supplies the white colour. Under the action of either a change in pH, or through excess heating, the protein molecules can spontaneously form bunches separating from the (translucent) rest of the mixture, which gives the nasty (curdled custard) or nice (cheese) formation of blobs of solid curd!
Turing’s idea was that a number of processes, such as the formation of patterns on animal skins/furs, arise due to reaction–diffusion separation processes at the cellular level, usually when the animal is in its embryonic phase. He called the chemical which produces the pattern a morphogen. This is not necessarily a specific chemical, but rather a more abstract entity, such as a cell or set of genes17. This is to say that the morphogen is simply an agent of change which eventually leads to a spatially structured state. In our milk example, one morphogen would be the protein and the rest of the milk a fellow reactant. As we shall see, this theory has in some cases been validated experimentally, especially when the relevant morphogen has been found experimentally (or in the case of chemical systems, when it has been designed to be a morphogen). In some cases the morphogen is not known, but the theory can capture important qualitative attributes of the range of patterns observed in nature so well that there is strong circumstantial evidence of its validity. We will return to the question of the validity of a theory briefly at the end of this part of the course, but for now look at a preview of how the ideas are built up.
The remainder of this Section uses an alternate notation to what we will adopt in the next Chapter. Do not worry about the mathematical details here, as this Section is largely just a preview of the next Chapter.
Before jumping into Turing’s theory in detail, we first look at some aspects of linear PDEs which will be relevant. We can generalise the linear stability theory above for scalar PDEs by exploiting the ansatz given in [ansatz]. In some sense, using the eigenfunctions of the Laplacian (cosines for no-flux boundaries, sines for homogeneous Dirichlet conditions) diagonalises the partial differential operator, in the same way that using eigenvectors of a symmetric Jacobian ‘trivially’ diagonalises it. That means that partial differential equations of the form \[\begin{align} \label{linsys} \mathchoice{\frac{\partial u}{\partial t}}{\partial u/\partial t}{\partial u/\partial t}{\partial u/\partial t} = D_1\nabla^2u + B_1 u + B_2v,\\ \mathchoice{\frac{\partial v}{\partial t}}{\partial v/\partial t}{\partial v/\partial t}{\partial v/\partial t} = D_2\nabla^2v + B_3 u + B_4 v, \end{align}\] i.e. linear equations, have solutions of the form \[u = A\mathrm{e}^{\mathrm{i}\mathbfit{k} \cdot \mathbfit{x} + \lambda t}, \quad v = B\mathrm{e}^{\mathrm{i}\mathbfit{k} \cdot \mathbfit{x} + \lambda t},\] where again we are only expanding a single mode and implicitly using orthogonality to derive an equation for one mode at a time. We have also replaced the cosine by a complex exponential, to account for other possible boundary conditions18.
Substituting these solutions in we have the algebraic equations: \[\lambda\left(\begin{array}{c}u\\v\end{array}\right)= \begin{pmatrix}-D_1\mathbfit{k}\cdot\mathbfit{k}+ B_1 & B_2 \\ B_3 & -D_2\mathbfit{k}\cdot\mathbfit{k}+ B_4 \end{pmatrix}\left(\begin{array}{c}u\\v\end{array}\right) = \mathsfbfit{M}\left(\begin{array}{c}u\\v\end{array}\right).\] Standard linear algebra tells us that we can only have non-zero \((u,v)\) if \(\det(\mathsfbfit{M}-\lambda \mathsfbfit{I})=0\). Another way to say this is that \(\lambda\) must be an eigenvalue of \(\mathsfbfit{M}\). This determinant condition will give us a quadratic equation in \(\lambda\), which we solve to obtain \(\lambda\) as a function of \(k\) (the so-called dispersion relation). We will derive this again in the next chapter, and spend a lot of time understanding it. As in the scalar case, these growth rates depend on the eigenvalues of the Laplacian, which can depend heavily on the boundary conditions and geometry of the spatial domain.
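As an illustration, here is a small Python sketch that assembles \(\mathsfbfit{M}\) for a few values of \(|\mathbfit{k}|^2\) and computes its eigenvalues numerically. The coefficients \(B_1,\dots,B_4\) and the diffusivities are assumed values, chosen so that a band of intermediate wavenumbers is unstable (a preview of the Turing mechanism to come).

```python
# Sketch: the dispersion relation for the linear system [linsys], evaluated
# numerically. Coefficients and diffusivities are assumed for illustration.
import numpy as np

D1, D2 = 1.0, 20.0
B = np.array([[1.0, -3.0],
              [2.0, -2.0]])   # [[B1, B2], [B3, B4]], assumed values

def growth_rates(k2):
    # Eigenvalues of M = -diag(D1, D2) |k|^2 + B for a single mode.
    M = -np.diag([D1, D2]) * k2 + B
    return np.linalg.eigvals(M)

for k2 in (0.0, 0.2, 0.45, 1.0):
    lam = growth_rates(k2)
    print(f"|k|^2 = {k2:4.2f}: Re(lambda) =", np.round(lam.real, 3))
# Only the intermediate mode |k|^2 = 0.45 has a positive growth rate here.
```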
For example we might have considered a 2D Cartesian domain with coordinates (\(x_1,x_2\)) and domain lengths \(L_1\) and \(L_2\) and homogeneous Dirichlet boundary conditions \(u(0,x_2)=u(L_1,x_2)= u(x_1,0)=u(x_1,L_2)=0\) and similar for \(v\), so that we would require \[k_1L_1 = n\pi,\quad k_2 L_2 = m\pi,\] so only the part of the solution \(u = A\mathrm{e}^{\mathrm{i}\mathbfit{k} \cdot\mathbfit{x} + \lambda t}\) which can satisfy the boundary conditions is \[u = A_c\mathrm{e}^{\lambda t}\sin\left(\frac{n\pi}{L_1}x_1\right)\sin\left(\frac{m\pi}{L_2} x_2\right).\] Note that the notation here is different, partly to show you alternative ways of viewing these spatial eigenfunctions (and because, e.g., Murray uses this notation with \(\mathbfit{k}\)). We can note that \(\mathbfit{k}\cdot \mathbfit{k} = |\mathbfit{k}|^2=\rho_k\) is how the eigenvalues in this notation relate to what we had before.
This allows for a countable infinity of possible solutions based on choices \(n\) and \(m\). A general solution to this system for the morphogen \(u\) would then take the form \[u(\mathbfit{x},t) = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}A_{nm}\mathrm{e}^{\lambda(n,m)t}\sin\left(\frac{n\pi }{L_1}x_1\right)\sin\left(\frac{m\pi }{L_2}x_2\right),\] where we recognise \(\lambda\) would now be a function of \(n\) and \(m\). The initial condition \(u(x_1,x_2,0) = u_0\) can be used to define the coefficients \(A_{nm}\) through19 \[A_{nm} = \frac{\displaystyle \int_{0}^{L_1}\int_{0}^{L_2}u(\mathbfit{x},0)\sin\left(\frac{n\pi }{L_1}x_1\right)\sin\left(\frac{m\pi }{L_2}x_2\right){\mathrm d}{x_1}{\mathrm d}{x_2}} {\displaystyle \int_{0}^{L_1}\int_{0}^{L_2}\sin\left(\frac{n\pi }{L_1}x_1\right)^2\sin\left(\frac{m\pi }{L_2}x_2\right)^2{\mathrm d}{x_1}{\mathrm d}{x_2}}.\] As before, we assume ‘generic’ perturbations such that the \(A_{nm}\) are all nonzero, though some may be very small for any given initial perturbation. We also know that if \(\lambda(n,m)<0\) the mode \((n,m)\) will decay to zero exponentially in time. So pretty quickly for some \(t>0\) only terms for which \(\lambda(n,m)\geq0\) will remain visible, and these will grow rapidly as time evolves.
Let’s say for argument’s sake that only the \(n=10\) and \(m=10\) mode has \(\lambda(n,m)\geq0\). Then at some time \(t\gg 0\) the solution can be approximated by \[u(\mathbfit{x},t) \approx A_{10\,10}\mathrm{e}^{\lambda(10,10)t}\sin\left(\frac{10\pi }{L_1}x_1\right)\sin\left(\frac{10\pi }{L_2}x_2 \right).\] Let’s say further that if \(u>0\) the morphogen triggers cells which radiate white to grow (if \(u<0\) the cells remain dark). The pattern produced by this function is a checkerboard pattern, shown in 3.3(a). If we are a little more demanding and require \(u>0.5\) then this becomes a regular spotted pattern in 3.3(b) (assuming \(t\) is not too large). If instead we consider only the mode \(n=1\), \(m=10\), we get stripes for \(u>0\) – see 3.3(c) – and if \(u >0.5\) these stripes become restricted ‘fin’ type patterns. Of course many of these details depend on exactly when we decide to ‘threshold’ the morphogen, and at what level. The main point is that since solutions to linear PDE grow exponentially in time, we can make decent approximations by considering only those modes which grow fastest.
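The following Python sketch reproduces these thresholded single-mode patterns; the domain size, mode numbers and threshold values are assumed for illustration, and matplotlib is my choice of plotting tool.

```python
# Sketch: thresholded single-mode patterns as described above.
import numpy as np
import matplotlib.pyplot as plt

L1 = L2 = 1.0
x1, x2 = np.meshgrid(np.linspace(0, L1, 300), np.linspace(0, L2, 300))

def mode(n, m):
    return np.sin(n * np.pi * x1 / L1) * np.sin(m * np.pi * x2 / L2)

cases = [(10, 10, 0.0, 'checkerboard'),   # cf. 3.3(a)
         (10, 10, 0.5, 'spots'),          # cf. 3.3(b)
         (1, 10, 0.0, 'stripes')]         # cf. 3.3(c)

fig, axes = plt.subplots(1, 3, figsize=(9, 3))
for ax, (n, m, thresh, title) in zip(axes, cases):
    # White where the morphogen exceeds the threshold, dark elsewhere.
    ax.imshow(mode(n, m) > thresh, cmap='gray', origin='lower')
    ax.set_title(title)
    ax.axis('off')
plt.show()
```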
In general a Fourier series can represent any reasonable scalar function \(f(x_1,x_2)\), and yet these spotted and striped patterns are strikingly common in nature. Turing asked why. As we shall see, it is typical of reaction–diffusion equations that only a small number of modes \((n,m)\) are unstable and grow.
Of course, if \(\lambda(n,m)>0\) then in theory \(u\) can grow without bound. However, most realistic models are nonlinear and permit spatially varying equilibria with bounded values; as we shall see, these include spotted and striped type patterns. Of course we can always linearise nonlinear systems and look for which modes might grow. This is the essence of Turing’s analysis, which tells us something about the set of patterns which might form in reaction–diffusion systems. We expect a restricted space of unstable modes \((n,m)\) which can form spotted/striped patterns; the nonlinear effects then tend to take over and stabilise the growth of the pattern until it relaxes. The question of how much the pattern changes is one which we will come back to throughout the term.
The above discussion assumed we were in a nice rectangular or square domain. I have yet to come across a square animal! The shape of the domain, as well as the boundary conditions, can play a vital role in the allowed patterns the system forms. For example, one issue we cover this term is pattern formation in a circular domain. We shall see later that the solutions to a system like [linsys], when our domain is circular (in a polar coordinate system \((r,\theta)\)), take the form \[u(r, \theta,t) = \sum_{n=1}^{\infty}\sum_{k_m^n}A_{mn}J_n(k_m^n r)\cos(n\theta) + B_{mn}J_n(k_m^n r)\sin(n\theta),\] where \(J_n(k_m^n r)\) is the Bessel function of order \(n\). Bessel functions vary sinusoidally, but unlike sine and cosine the amplitude of variation decays away with \(r\). We see in 3.4(c) that this tends to produce patterns which form on rings surrounding the centre of the domain (the Bessel function amplitudes decide where), with stripes appearing as circles and spots lying on rings around a fixed central spot. In (d) we show an example of a pattern formed on an annular domain. It is an almost periodic set of ‘stripes’. We will see later that this pattern can plausibly explain branching type behaviour; in fact reaction–diffusion mechanisms have been used to explain the splitting of branches we see in trees, or the bifurcations of blood vessels throughout our bodies.
Relevant reading: Murray book II, chapter 2–2.4
Turing’s paper proposed that two chemical species, reacting with one another and diffusing, are sufficient to form spontaneous spatial patterns. The key idea is to use the linear stability theory for PDEs developed in the previous chapter, to find cases where the spatially homogeneous system (that is, just dropping the diffusion terms) has a stable equilibrium which is no longer stable when we put the diffusion back in. This leads to a ‘pitchfork-like’ bifurcation, where spatial steady states emerge from the spatially homogeneous equilibrium. Additionally, and very importantly, Turing’s analysis provides key biological insight into the requirements needed to have these kinds of patterns. This insight is somewhat abstract, phrased in terms of inequalities, but I hope you see by the end of this chapter how nice it is to get out general biological principles from a bit of maths – this is rare in the messy world of biology!
Turing’s result was, and is, considered surprising or non-intuitive. Why? Well, diffusion tends to push everything towards spatial homogeneity. Think about putting dye (such as food colouring) in water, as in 4.1. Diffusion in almost all contexts (heat conduction, fluid flows, mass transport, image analysis...) tends to ‘smear out’ any sharp spatial structure, and if given enough time, homogenises everything. You can also get this intuition mathematically by noting that just the diffusion (heat) equation itself, without reactions, always pushes everything to be as spatially homogeneous as possible. With Neumann boundary conditions, mass is conserved but everything flattens out over time, and with Dirichlet conditions (with the same value on all boundaries), diffusion makes the value inside the domain the same as it is on the fixed boundaries.
Mathematically we can make sense of this by noting that, if we follow the linear stability analysis of the previous chapter, the growth rates of perturbations to homogeneous equilibria satisfy: \[\lambda_k = -D\left(\frac{\pi k}{L}\right )^2 +f'(u_0) \leq \lambda_0 = f'(u_0).\] So if the homogeneous equilibrium \(u_0\) is stable without diffusion (i.e. with \(D=0\)), all of the growth rates of a linear perturbation to this equilibrium must be negative. Note that this is assuming Neumann boundary conditions on the domain \(x \in [0,L]\).
We saw some spatial structures emerging from scalar reaction–diffusion systems when we considered Dirichlet conditions, such as in 3.1.2 (see 3.1). But these require boundary conditions that force the system away from a homogeneous state. In 1-D, you can prove that such solutions can never change sign, so in particular they will not oscillate around the homogeneous steady state. You can also show that if the domain is sufficiently large, away from the boundaries, solutions will always approach a homogeneous spatial state. For Neumann conditions and \(x \in [0,L]\), you can show that only homogeneous equilibria are stable; for a proof of this result and related ideas, click this link. You do not need to understand these details in this course.
Finally, there is one case where inhomogeneous equilibria are stable in scalar reaction–diffusion equations. In two or more dimensions, the homogenising effect of diffusion can be limited by exploiting non-convex geometry. That means, if we take a system with at least two stable equilibria, a solution can be essentially equal to these spatially homogeneous equilibria away from a small connecting channel. This is often referred to as a ‘dumbbell-shaped domain’. See 4.2, where \(u_0\) and \(u_1\) are stable equilibria (that is, \(f'(u_0)<0\) and \(f'(u_1)<0\) with \(f(u_0)=f(u_1)=0\)). The key result is that these domains are almost separated, and the nonlinearity within each ‘large’ part of the domain can overcome the homogenising effect of diffusion. In convex domains, all stable equilibria (for Neumann boundary conditions) are spatially homogeneous.
We now consider the following general reaction–diffusion system,20 \[\begin{aligned} \mathchoice{\frac{\partial u}{\partial t}}{\partial u/\partial t}{\partial u/\partial t}{\partial u/\partial t} &= D_1\nabla^2u + F(u,v),\\ \mathchoice{\frac{\partial v}{\partial t}}{\partial v/\partial t}{\partial v/\partial t}{\partial v/\partial t} &= D_2\nabla^2v + G(u,v), \end{aligned}\] and make the scalings21 \(\widehat{\mathbfit{x}} =\mathbfit{x}/\sqrt{D_1}\), and \(D = D_2/D_1\), so that upon substituting and dropping the hat notation we have: \[\begin{align} \label{RD-US-System} \mathchoice{\frac{\partial u}{\partial t}}{\partial u/\partial t}{\partial u/\partial t}{\partial u/\partial t} &= \nabla^2u + F(u,v),\\ \mathchoice{\frac{\partial v}{\partial t}}{\partial v/\partial t}{\partial v/\partial t}{\partial v/\partial t} &= D\nabla^2v + G(u,v). \end{align}\] The parameter \(D>0\) is the nondimensionalised ratio of the diffusion rate of \(v\) to that of \(u\): e.g. if \(D<1\) then \(u\) diffuses faster than \(v\), and vice-versa. The term ‘reaction’ is not always literal, as \(F\) and \(G\) can represent the interaction of the two densities \(u\) and \(v\), which could be populations (predator–prey, competition, etc), biochemical reactions, or electrochemical effects, such as in neural signal propagation. They can also represent independent growth/decay or self-competition terms. For simplicity in what follows, we will refer to \(F\) and \(G\) as reactions or sometimes the ‘reaction kinetics’. This is the language Turing used, as he was really thinking of chemicals as a model for how genetic information determined spatial structure in an embryo.
Unless otherwise stated, we will assume no-flux boundary conditions; if our domain is \(\Omega\) with \(\widehat{\mathbfit{n}}\) the normal to the boundary \(\partial \Omega\), then \(\boldsymbol{\nabla}u\cdot \widehat{\mathbfit{n}}=0\) and \(\boldsymbol{\nabla}v\cdot \widehat{\mathbfit{n}}=0\). This implies nothing can enter or leave the system, so any pattern formation is due to internal changes in the system. Note that a spatially homogeneous equilibrium \(u=u_0\), \(v=v_0\) automatically satisfies these boundary conditions. We will firstly consider the cuboidal domain \(\Omega = [0,L_1]\times[0,L_2]\times\dots [0,L_n]\).
The Turing analysis is essentially a linear stability analysis around a spatially homogeneous equilibrium, though we will want to look specifically for conditions when a spatial instability occurs which would not occur in the absence of diffusion. Using the notation from last chapter, this means we want \(\operatorname{Re}(\lambda_0)<0\) so that the equilibrium is stable in the absence of diffusion, but we are aiming for \(\operatorname{Re}(\lambda_k)>0\) for some \(k>0\) so that some inhomogeneous mode destabilises our spatially homogeneous equilibrium.
We are interested in the stability of homogeneous equilibria. As in the scalar case in the previous chapter, this means we look for constant solutions which satisfy the system of PDEs given by [RD-US-System]. As before, homogeneous implies that \(u_0\) and \(v_0\) have no variation in \(x\), i.e. \(\mathchoice{\frac{\partial^{n} u_0}{\partial x_i^{n}}}{\partial^{n} u_0/\partial x_i^{n}}{\partial^{n} u_0/\partial x_i^{n}}{\partial^{n} u_0/\partial x_i^{n}}=0=\mathchoice{\frac{\partial^{n} v_0}{\partial x_i^{n}}}{\partial^{n} v_0/\partial x_i^{n}}{\partial^{n} v_0/\partial x_i^{n}}{\partial^{n} v_0/\partial x_i^{n}}\) for all \(n\) and in all spatial directions \(i\). Such equilibria occur when \[\begin{aligned} F(u_0,v_0) = 0,\quad G(u_0,v_0)= 0. \end{aligned}\]
The linearisation of the derivative terms should be obvious by now; the reaction functions are expanded using a Taylor series \[\begin{aligned} F(u,v) = \varepsilon \left( u_1\mathchoice{\frac{\partial F}{\partial u}}{\partial F/\partial u}{\partial F/\partial u}{\partial F/\partial u}\bigg\vert_{u=u_0,v=v_0} + v_1\mathchoice{\frac{\partial F}{\partial v}}{\partial F/\partial v}{\partial F/\partial v}{\partial F/\partial v}\bigg\vert_{u=u_0,v=v_0} \right) + \mathcal{O}(\varepsilon^2),\\ G(u,v) = \varepsilon\left(u_1\mathchoice{\frac{\partial G}{\partial u}}{\partial G/\partial u}{\partial G/\partial u}{\partial G/\partial u}\bigg\vert_{u=u_0,v=v_0} + v_1\mathchoice{\frac{\partial G}{\partial v}}{\partial G/\partial v}{\partial G/\partial v}{\partial G/\partial v}\bigg\vert_{u=u_0,v=v_0} \right) + \mathcal{O}(\varepsilon^2), \end{aligned}\] where the \(\mathcal{O}(1)\) contribution has vanished because we expand about a homogeneous equilibrium. For notational brevity in what follows we denote partial derivatives evaluated at equilibrium with a subscript, e.g. \[\mathchoice{\frac{\partial F}{\partial u}}{\partial F/\partial u}{\partial F/\partial u}{\partial F/\partial u}\bigg\vert_{u=u_0,v=v_0} = F_u,\quad \mathchoice{\frac{\partial G}{\partial v}}{\partial G/\partial v}{\partial G/\partial v}{\partial G/\partial v}\bigg\vert_{u=u_0,v=v_0} = G_v,\] so that (again remembering that \(F(u_0,v_0)=G(u_0,v_0)=0)\) \[\begin{aligned} F(u,v) = \varepsilon \left(F_u u_1 + F_v v_1 \right) + \mathcal{O}(\varepsilon^2),\\ G(u,v) = \varepsilon\left(G_u u_1 + G_v v_1 \right) + \mathcal{O}(\varepsilon^2). \end{aligned}\]
So perturbations of the form \(u = u_0 +\varepsilon u_1(x,t)\), \(v = v_0 + \varepsilon v_1(x,t)\) satisfy: \[\begin{align} \label{RD-US-System-US-Lin} \mathchoice{\frac{\partial u_1}{\partial t}}{\partial u_1/\partial t}{\partial u_1/\partial t}{\partial u_1/\partial t} &= \nabla^2 u_1 + F_u u_1 + F_v v_1, \\ \mathchoice{\frac{\partial v_1}{\partial t}}{\partial v_1/\partial t}{\partial v_1/\partial t}{\partial v_1/\partial t} &= D\nabla^2 v_1 + G_u u_1 + G_v v_1. \end{align}\] Or, equivalently, using the vector \(\mathbfit{u}_1 = (u_1,v_1)^T\), \[\begin{equation} \label{linear-US-turing} \mathchoice{\frac{\partial\mathbfit{u}_1}{\partial t}}{\partial\mathbfit{u}_1/\partial t}{\partial\mathbfit{u}_1/\partial t}{\partial\mathbfit{u}_1/\partial t} = \mathsfbfit{D}\nabla^2\mathbfit{u}_1 + \mathsfbfit{J}\mathbfit{u}_1, \quad \mathsfbfit{D} = \begin{pmatrix}1 & 0 \\ 0 & D \end{pmatrix}, \quad \mathsfbfit{J} = \begin{pmatrix}F_u & F_v \\ G_u & G_v \end{pmatrix}. \end{equation}\] We can solve this system exactly as we did the scalar case – by exploiting eigenfunctions of the Laplacian and using the same exponential ansatz in time as before.
As we are in a cuboidal domain \(\Omega = [0,L_1]\times[0,L_2]\times\dots [0,L_n]\), we can assume solutions in the form \(\mathrm{e}^{\mathrm{i}\mathbfit{k}\cdot \mathbfit{x} + \lambda_\mathbfit{k} t}\), and the linearised system can be written as \[\begin{equation} \label{turinglin} \lambda_\mathbfit{k}\left(\begin{array}{c}u_1\\v_1\end{array}\right) =\underbrace{\begin{pmatrix}-\rho_k+ F_u & F_v \\ G_u & -D\rho_k+ G_v \end{pmatrix}}_{\mathsfbfit{M}}\left(\begin{array}{c}u_1\\v_1\end{array}\right) \end{equation}\] or \(\lambda_\mathbfit{k} \mathbfit{u}_1 = \mathsfbfit{M}\mathbfit{u}_1 = (-\rho_k\mathsfbfit{D} + \mathsfbfit{J})\mathbfit{u}_1\). For notational brevity in what follows we write \(\rho_k = |\mathbfit{k}|^2 = \mathbfit{k}\cdot\mathbfit{k} = k_1^2 + k_2^2 + \dots k_n^2\). We call \(\rho_k\) the wave number or pattern number. The numerical value of \(\rho_k\) determines an average frequency of oscillations of perturbations. Functions with small \(\rho_k\) have long wavelength oscillations, whereas those with very large \(\rho_k\) will have very fast, short wavelength oscillations.
Solving the eigenequation \(\det(\mathsfbfit{M}-\lambda_\mathbfit{k} \mathsfbfit{I}) = 0\) we have a quadratic equation for \(\lambda_\mathbfit{k}\) given by: \[\lambda^2_\mathbfit{k} -\operatorname{tr}(\mathsfbfit{M})\lambda_\mathbfit{k} + \det(\mathsfbfit{M})=0.\] This has the two solutions, \[\begin{equation} \label{turinglambda} \lambda_\mathbfit{k} = \frac{1}{2}\left[\operatorname{tr}(\mathsfbfit{M}) \pm \sqrt{\operatorname{tr}(\mathsfbfit{M})^2 -4\det(\mathsfbfit{M})}\right] \end{equation}\] where \[\begin{align} \nonumber\operatorname{tr}(\mathsfbfit{M}) &= (F_u +G_v)-\rho_k(D+1),\\ \label{fulldet}\det(\mathsfbfit{M}) &= D\rho_k^2 - \rho_k(G_v+ D F_u)+ (F_uG_v -G_uF_v). \end{align}\] So given a particular wavenumber \(\mathbfit{k}\), the equilibrium is stable to a perturbation of that form if \(\operatorname{tr}(\mathsfbfit{M})<0\) and \(\det(\mathsfbfit{M})>0\). In contrast, we have instability if either \(\operatorname{tr}(\mathsfbfit{M})>0\) or \(\det(\mathsfbfit{M})< 0\). Of course, we have also expanded around a single mode, and there is a countable infinity of modes, indexed by the wave vector \(\mathbfit{k}\).
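A numerical sketch of this dispersion relation, evaluating [turinglambda] as a function of \(\rho_k\); the Jacobian entries and diffusion ratio are the same assumed illustrative values as in the earlier sketch.

```python
# Sketch: the growth rates [turinglambda] as a function of rho_k, for an
# assumed Jacobian and diffusion ratio.
import numpy as np

Fu, Fv, Gu, Gv = 1.0, -3.0, 2.0, -2.0
D = 20.0

def lambdas(rho):
    tr = (Fu + Gv) - rho * (D + 1.0)
    det = D * rho**2 - rho * (Gv + D * Fu) + (Fu * Gv - Gu * Fv)
    disc = np.sqrt(tr**2 - 4.0 * det + 0j)   # allow complex roots
    return 0.5 * (tr + disc), 0.5 * (tr - disc)

for rho in (0.0, 0.2, 0.45, 1.0):
    lp, lm = lambdas(rho)
    print(f"rho_k = {rho:4.2f}: Re(lambda) = {lp.real:+.3f}, {lm.real:+.3f}")
# The homogeneous mode rho_k = 0 is stable, but rho_k = 0.45 is unstable.
```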
Next we consider, separately, the cases where \(\operatorname{Re}(\lambda_0)<0\) (so that the spatially homogeneous equilibrium is stable in the absence of diffusion), and \(\operatorname{Re}(\lambda_\mathbfit{k})>0\) for some wave number \(\mathbfit{k}\). This will lead to a set of four conditions on the parameters and the geometry in order to ensure this kind of instability occurs.
The notations in this section are non-examinable, and different from the current version of the course. This material is here in part to demonstrate an alternative way of thinking about eigenfunction expansions and boundary conditions, and in part because these notations are used in some of the additional problem sheets and related material.
Since we consider a Cartesian domain the boundary conditions take the explicit form \[\mathchoice{\frac{\partial u}{\partial x_i}}{\partial u/\partial x_i}{\partial u/\partial x_i}{\partial u/\partial x_i}(x_1,\dots 0,\dots x_n,t) = \mathchoice{\frac{\partial u}{\partial x_i}}{\partial u/\partial x_i}{\partial u/\partial x_i}{\partial u/\partial x_i}(x_1,\dots L_i,\dots x_n,t)=0,\] for all \(i\), where the \(i^{\text{th}}\) coordinate is set to either \(0\) or \(L_i\). The conditions are the same for \(v\). We use the notation \(\widehat{\mathbfit{k}\cdot\mathbfit{x}_i}=k_1x_1 + \dots + k_i 0 + \dots + k_n x_n\) (the dot product with the \(i^{\text{th}}\) component of the sum removed). With this the \(i^{\text{th}}\) boundary condition will be22 \[\begin{equation} \label{bcs} \mathchoice{\frac{\partial u_1}{\partial x_i}}{\partial u_1/\partial x_i}{\partial u_1/\partial x_i}{\partial u_1/\partial x_i} = \operatorname{Re}\left (A_u \mathrm{i}k_i \mathrm{e}^{\mathrm{i}\widehat{\mathbfit{k}\cdot\mathbfit{x}_i}+\lambda t}\right)= \operatorname{Re}\left( A_u\mathrm{i}k_i \mathrm{e}^{\mathrm{i}\widehat{\mathbfit{k}\cdot\mathbfit{x}_i}+\mathrm{i}k_iL_i+\lambda t} \right)= 0. \end{equation}\] This can only be satisfied if the solutions take the form \[u_1(\mathbfit{x},t) = A_u\mathrm{e}^{\lambda t}\cos(k_1x_1)\cos(k_2 x_2)\dots \cos(k_nx_n),\] as the conditions in [bcs] can only be satisfied by the cosine solution (which differentiates to sine). This fixes all \(k_i\) to take on values \[\begin{equation} \label{kcon} \mathbfit{k} = \left(n_1\pi/L_1,n_2\pi/L_2 , \dots, n_n\pi/L_n\right), \end{equation}\] thus \[\begin{equation} \label{mode-US-numbers} \rho_k= \pi^2\left(\frac{n_1^2}{L_1^2}+\frac{n_2^2}{L_2^2} + \dots + \frac{n_n^2}{L_n^2} \right). \end{equation}\] The same is true for \(v_1\). By contrast, if we had chosen Dirichlet boundary conditions \[u(x_1,\dots 0 ,\dots x_n,t) = u(x_1,\dots L_i,\dots x_n,t) = 0,\quad \forall i,\] then the solution would be \[\begin{equation} \label{bcs2} u_1(\mathbfit{x},t) = A_u\mathrm{e}^{\lambda t}\sin(k_1x_1)\sin(k_2 x_2)\dots \sin(k_nx_n), \end{equation}\] with the same conditions on \(\mathbfit{k}\) ([kcon]). It would seem this is little different from the no-flux boundary condition case, since \(\cos\) and \(\sin\) are just the same function shifted, so the potential patterns are basically the same. However, there is one critical difference: the existence of a zero pattern number \(\rho_k=0\). This was seen in the previous chapter in the scalar case, and this is just the higher spatial-dimensional analogue for systems of reaction–diffusion equations.
The zero mode \(\mathbfit{k}=\mathbf{0}\) is a homogeneous perturbation to the system. However, for the Dirichlet boundary conditions, which give the solutions [bcs2], these functions are automatically zero, and so \(\rho_k=0\) is not permissible for these kinds of boundary conditions. However with no-flux boundary conditions \(\mathbfit{k}=\mathbf{0}\) will satisfy [bcs] with \(A_u\neq 0\). So with no-flux boundary conditions, \(u_1\) can be homogeneous (\(\boldsymbol{\nabla}u=\mathbf{0}\)), corresponding to a uniform perturbation of the system, which can be associated with the spatially homogeneous ODE system. For a Turing instability we require that this \(\rho_k=0\) uniform mode is asymptotically stable, i.e. that it decays. For stability of the \(\rho_k=0\) mode we need \[\begin{aligned} \operatorname{tr}(\mathsfbfit{J}) &= (F_u +G_v)<0,\\ \det(\mathsfbfit{J}) &=(F_uG_v -G_uF_v)>0. \end{aligned}\] So we need both \[\begin{equation} \label{homogcons} F_u +G_v < 0\mbox{ and } F_uG_v>G_uF_v. \end{equation}\] These are exactly the conditions we would get for a stable equilibrium of the spatially homogeneous ODE system: \[\begin{align} \label{RD-US-System-US-ODE} \mathchoice{\frac{{\mathrm d}u}{{\mathrm d}t}}{{\mathrm d}u/{\mathrm d}t}{{\mathrm d}u/{\mathrm d}t}{{\mathrm d}u/{\mathrm d}t} &= F(u,v),\\ \mathchoice{\frac{{\mathrm d}v}{{\mathrm d}t}}{{\mathrm d}v/{\mathrm d}t}{{\mathrm d}v/{\mathrm d}t}{{\mathrm d}v/{\mathrm d}t} &= G(u,v). \end{align}\]
Next we seek inhomogeneous modes \(\mathbfit{k} = \left(n_1\pi/L_1,n_2\pi/L_2,\dots n_n\pi/L_n\right)\) which are unstable, so that they do not decay but instead grow (at least one \(\lambda_\mathbfit{k}\) has positive real part). Since \((F_u+G_v)<0\), the trace \((F_u +G_v)-\rho_k(D+1)\) is less than zero. Thus we can only have positive \(\lambda_\mathbfit{k}\) if \(\det(\mathsfbfit{M})<0\), so that one of the roots must be positive, that is to say \[\begin{equation} \label{h-US-eq} h(\rho_k) = D\rho_k^2 - \rho_k(G_v+ D F_u)+ (F_uG_v -G_uF_v) < 0. \end{equation}\] Since \(D\rho_k^2>0\) and \((F_uG_v -G_uF_v)>0\) (by the stability of the \(\rho_k=0\) mode), this can only occur if \[G_v+ D F_u>0.\]
This condition is necessary but not sufficient to guarantee that \(\det(\mathsfbfit{M})<0\) for some \((\rho_k\neq 0)\). To help understand this, look at panel (a) of 4.3, ignoring for the moment the parameters \(d\), \(d_c\) etc (and noting that this author has used \(k^2\) in place of our \(\rho_k\)). We need the minimum value of \(h\) to be negative to ensure an instability. To compute this condition, we will make an important approximation – namely that we can treat \(h(\rho_k)\) as a function of a continuous variable, \(\rho_k\), despite the fact that really \(\rho_k\) can only take a discrete set of values. For domains which are sufficiently large, this approximation is OK, as the modes will scale like \(1/L^2\) (see, for example [mode-US-numbers]).
With this approximation in mind, and since we know that \(h\) has a single global minimum, we compute the wave number at this minimum as: \[h'(\rho_k^*) = 2D\rho_k^* - (G_v+ D F_u)=0 \implies \rho_k^* = \frac{G_v+ D F_u}{2D},\] and so we have that the minimum value of \(h\) is: \[h(\rho_k^*) = \frac{(G_v+ D F_u)^2}{4D} - \frac{(G_v+ D F_u)^2}{2D}+ (F_uG_v -G_uF_v) = -\frac{(G_v+ D F_u)^2}{4D} + (F_uG_v -G_uF_v).\] Rearranging this, our condition for instability (that \(h(\rho_k^*)<0\)) is equivalent to: \[(G_v+ D F_u)^2 - 4D(F_uG_v -G_uF_v) > 0.\]
The summary of the previous analysis is as follows.
All of the following inequalities are necessary conditions for a diffusion-driven instability (that is, the growth rates of perturbations to a spatially homogeneous equilibrium satisfy \(\operatorname{Re}(\lambda_0)<0\) but \(\operatorname{Re}(\lambda_\mathbfit{k})>0\) for some wave vector \(\mathbfit{k}\)): \[\begin{align} \label{turingconditions} F_u +G_v <0, \quad F_uG_v -G_uF_v> 0,\\ \nonumber G_v + DF_u>0,\quad (G_v+ D F_u)^2 - 4 D (F_uG_v -G_uF_v) >0. \end{align}\] These four conditions are the so-called Turing conditions.
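A direct translation of [turingconditions] into code can serve as a sanity check; a minimal sketch, with assumed Jacobian entries and diffusion ratio.

```python
# Sketch: a direct check of the four Turing conditions for assumed values.
def turing_conditions(Fu, Fv, Gu, Gv, D):
    detJ = Fu * Gv - Gu * Fv
    return [Fu + Gv < 0,                              # stable trace
            detJ > 0,                                 # stable determinant
            Gv + D * Fu > 0,                          # necessary for det(M) < 0
            (Gv + D * Fu) ** 2 - 4 * D * detJ > 0]    # minimum of h is negative

print(turing_conditions(1.0, -3.0, 2.0, -2.0, D=20.0))  # all True
print(turing_conditions(1.0, -3.0, 2.0, -2.0, D=1.0))   # equal diffusion fails
```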
These conditions are necessary, and they are also sufficient in the case of very large domains; we have made one assumption in deriving them. Namely, for a fixed bounded domain, \(\rho_k\) takes values in a discrete set, and none of these values may fall within the range of unstable wave numbers.
It is also worth mentioning that, while this analysis is a bit technical, the only thing we have learned is that a spatially homogeneous equilibrium is unstable to spatial perturbations. What happens after such an instability? Well in general this situation can be quite complex, but most of the time in practice these instabilities lead to pitchfork-like inhomogeneous steady states emerging from the spatially homogeneous equilibrium. Look at 1.1, but imagine that the blue branches emerging from the now-unstable homogeneous steady state (the red line) are now spatially varying functions. Such a ‘pitchfork’ structure is indeed found in this kind of Turing instability, though the full nonlinear analysis of the bifurcation past the linear instability is beyond the scope of this course. Still, even the linear analysis provides enormous insight into the kinds of biological systems which can exhibit this sort of diffusion-driven patterning.
We now use the four conditions given in [turingconditions] to make some important observations on the kinds of systems for which this instability is possible. This is a crucial step, and really allows us to say something meaningful biologically using these relatively simple inequalities.
The first and third conditions imply that \(F_u\) and \(G_v\) must be of opposing sign. To see this, note that if they had the same sign then \(F_u + G_v<0\) tells us they would both have to be negative, but then the third condition would say \[D \vert F_u \vert + \vert G_v\vert < 0 \implies D <-\frac{\vert G_v\vert}{\vert F_u\vert}.\] But physically we need \(D>0\) (negative diffusion does not make sense), so this is impossible. Therefore, if \(F_u>0\) we must have \(G_v<0\) and vice versa.
One can see that this condition (of opposing \(F_u\) and \(G_v\)) makes physical sense. The ‘reaction’ \(F\) promotes \(u\) and \(G\) promotes \(v\). Consider for example the case \(F_u>0\) and \(G_v<0\). To induce a separation in concentration peaks, we want small increases in \(u\) to accelerate the production of \(u\), while small increases in \(v\) lead to a drop in \(v\), so that (in this case) \(u\) can become dominant over \(v\). Decreases in \(u\) and \(v\) should lead to \(v\) being favoured. If \(F_u\) and \(G_v\) have the same sign, then \(u\) and \(v\) will both tend to grow and decay simultaneously; this will not promote separation and hence pattern formation.
The biochemical terminology here is that the \(u\) species is a self-activator (assuming without loss of generality that \(F_u>0\)), and that \(v\) is a self-inhibitor. Often this is shortened to ‘activator and inhibitor’ dynamics, though it is important to note that we have only determined how these species interact with themselves (and only at the steady state \((u_0,v_0)\)).
What does this requirement say about the permissible values of \(D\)? Let’s fix \(F_u>0\) so that \(u\) is the activator, so we must have \(G_v<0\). We then have from the first and the third inequalities in [turingconditions]: \[F_u + G_v<0 \iff F_u < -G_v \implies \vert F_u \vert < \vert G_v \vert \implies \frac{\vert G_v\vert }{\vert F_u \vert} >1,\] and \[DF_u+G_v > 0 \implies D\vert F_u\vert -\vert G_v\vert >0 \implies D>\frac{\vert G_v\vert }{\vert F_u \vert} >1.\] A similar argument shows that if \(G_v>0\), we must have \(F_u<0\) and \(D<1\). Note that these inequalities are strict, i.e. \(D\) cannot equal \(1\). This is why the Turing instability is often referred to as a diffusion-driven instability, as it requires that the diffusion rates of the two species are different. We can also see that the inhibitor in either case must always diffuse more quickly than the activator. This is referred to as short-range activation, long-range inhibition, or sometimes as LALI: local activation, lateral inhibition.
What about the other terms in the Jacobian, \(F_v\) and \(G_u\)? Well since \(F_u\) and \(G_v\) have opposite signs, we must have \(F_uG_v<0\). But the second condition means that we must have \(F_uG_v -G_uF_v> 0\), so this inequality can never be satisfied unless \(F_v\) and \(G_u\) have opposite signs. Therefore, the Jacobian matrix must take one of the following forms: \[\mathsfbfit{J} = \begin{pmatrix} + & -\\ + & - \end{pmatrix},\quad \begin{pmatrix} + & +\\ - & - \end{pmatrix},\quad \begin{pmatrix} - & -\\ + & + \end{pmatrix},\quad \begin{pmatrix} - & +\\ - & + \end{pmatrix}.\]
Of course, one can always just consider the first two, as the latter two are just a relabelling of \(u\) and \(v\) in a sense, whereas the first two have qualitatively different behaviours. In particular, you can show that the perturbations \(u_1\) and \(v_1\) will be in-phase or out-of-phase, depending on which of these two forms the Jacobian takes. While \(u_1\) and \(v_1\) will have different amplitudes (and especially the functions satisfying the nonlinear reaction–diffusion equations, \(u\) and \(v\), will generally appear very different from one another), the maxima and minima of their concentrations will either coincide or be exactly out-of-phase with one another.
To see this in and out of phase behaviour, consider the second equation in [turinglin]: \[\lambda_k v_1 = G_u u_1 + G_v v_1 - \rho_k D v_1.\] Let’s also assume we have \(F_u>0\) and \(G_v<0\) (so that \(D>1\) for a pattern-forming instability). Moving the \(v_1\) terms to the left side, we have: \[(\lambda_\mathbfit{k} - G_v + D \rho_k)v_1 = G_u u_1.\] For an instability, we must have \(\lambda_\mathbfit{k}>0\), and by assumption the other two terms on the left are positive, so \(v_1\) and \(u_1\) are related in sign by \(G_u\). That is, if \(G_u>0\), then \(u_1\) and \(v_1\) must be of the same sign, and hence their spatial oscillations must be in-phase. If \(G_u<0\), they must be out-of-phase, as there is now a sign difference between them, so anywhere \(u_1\) is positive, \(v_1\) must be negative and vice-versa. See 4.4.
This has useful biological applications, as often an experimentalist has identified a candidate activator and inhibitor, and knows if they have peaks (that is, regions of high concentration) in the same place with one another, or in different places. This can help determine the correct form for a model of their system, as often we do not understand the molecular biology well enough to write down what forms \(F\) and \(G\) should take. Similar remarks can be made about ecological applications, particularly predator–prey interactions.
See chapters 2 and 3 of the second volume of Murray’s Mathematical Biology for a much more thorough discussion of the biological interpretation of these Turing conditions.
Just for a concrete implementation of the above, let’s consider the case of one and two spatial dimensions, given below as 1-D and 2-D respectively. Consider a system \[\begin{align} \label{tsys} \mathchoice{\frac{\partial u}{\partial t}}{\partial u/\partial t}{\partial u/\partial t}{\partial u/\partial t} &= \nabla^2u + F(u,v),\\ \mathchoice{\frac{\partial v}{\partial t}}{\partial v/\partial t}{\partial v/\partial t}{\partial v/\partial t} &= D\nabla^2v + G(u,v). \end{align}\] on a domain in 1-D \([0,L]\), or in 2-D \([0,L_1]\times[0,L_2]\) with coordinate systems (1-D) \(x\) or (2-D) (\(x_1,x_2\)).
We consider this system subject to no-flux boundary conditions in 1-D: \[\begin{aligned} \mathchoice{\frac{\partial u}{\partial x}}{\partial u/\partial x}{\partial u/\partial x}{\partial u/\partial x}(0,t) = \mathchoice{\frac{\partial u}{\partial x}}{\partial u/\partial x}{\partial u/\partial x}{\partial u/\partial x}(L,t) = 0,\\ \mathchoice{\frac{\partial v}{\partial x}}{\partial v/\partial x}{\partial v/\partial x}{\partial v/\partial x}(0,t) = \mathchoice{\frac{\partial v}{\partial x}}{\partial v/\partial x}{\partial v/\partial x}{\partial v/\partial x}(L,t) = 0; \end{aligned}\] or in 2-D: \[\begin{aligned} \mathchoice{\frac{\partial u}{\partial x_1}}{\partial u/\partial x_1}{\partial u/\partial x_1}{\partial u/\partial x_1}(0,x_2,t) = \mathchoice{\frac{\partial u}{\partial x_1}}{\partial u/\partial x_1}{\partial u/\partial x_1}{\partial u/\partial x_1}(L_1,x_2,t) =\mathchoice{\frac{\partial u}{\partial x_2}}{\partial u/\partial x_2}{\partial u/\partial x_2}{\partial u/\partial x_2}(x_1,0,t)=\mathchoice{\frac{\partial u}{\partial x_2}}{\partial u/\partial x_2}{\partial u/\partial x_2}{\partial u/\partial x_2}(x_1,L_2,t) = 0,\\ \mathchoice{\frac{\partial v}{\partial x_1}}{\partial v/\partial x_1}{\partial v/\partial x_1}{\partial v/\partial x_1}(0,x_2,t) = \mathchoice{\frac{\partial v}{\partial x_1}}{\partial v/\partial x_1}{\partial v/\partial x_1}{\partial v/\partial x_1}(L_1,x_2,t) =\mathchoice{\frac{\partial v}{\partial x_2}}{\partial v/\partial x_2}{\partial v/\partial x_2}{\partial v/\partial x_2}(x_1,0,t)=\mathchoice{\frac{\partial v}{\partial x_2}}{\partial v/\partial x_2}{\partial v/\partial x_2}{\partial v/\partial x_2}(x_1,L_2,t) = 0. \end{aligned}\] In all cases we assume the system has a homogeneous equilibrium \[F(u_0,v_0) =0,\quad G(u_0,v_0) =0.\] If the system satisfies the following set of inequalities (these involve the elements of the Jacobian evaluated at the homogeneous equilibrium): \[\begin{aligned} F_u +G_v <0, \quad F_uG_v -G_uF_v> 0,\\ \nonumber G_v + DF_u>0,\quad (G_v+ D F_u)^2 - 4 D (F_uG_v -G_uF_v) >0, \end{aligned}\] then the homogeneous steady state is unstable, allowing inhomogeneous patterns to form, as long as the domain is sufficiently large. The inhomogeneities of the morphogen, \(u_1(\mathbfit{x},t)\), which are solutions to the linearised version of [tsys], will take the form (in 1-D):23 \[u_1(x,t) =\sum_{n=0}^{\infty}A_{n}\mathrm{e}^{\lambda_n t}\cos\left(\frac{n\pi x}{L}\right),\] or (in 2-D): \[u_1(x_1,x_2,t) =\sum_{n_1=0}^{\infty}\sum_{n_2=0}^{\infty}A_{(n_1,n_2)}\mathrm{e}^{\lambda_{(n_1,n_2)} t}\cos\left(\frac{n_1 \pi x_1}{L_1}\right)\cos\left(\frac{n_2\pi x_2}{L_2} \right).\]
We write the spatial eigenvalues in 1-D as: \[\begin{equation} \label{1-D-US-eigs} \rho_n = \pi^2\left(\frac{n^2}{L^2}\right), \end{equation}\] and in 2-D as, \[\begin{equation} \label{2-D-US-eigs} \rho_{(n_1,n_2)} = \pi^2\left(\frac{n_1^2}{L_1^2} + \frac{n_2^2}{L_2^2}\right). \end{equation}\]
The conditions \[F_u +G_v <0, \quad F_uG_v -G_uF_v> 0,\] ensure the homogeneous \(n_1=n_2=0\) part of the solution decays exponentially.
The conditions \[G_v + DF_u>0,\quad (G_v+ D F_u)^2 - 4 D (F_uG_v -G_uF_v) >0\] ensure that we can solve (in 1-D) \[D\rho_n^2 - \rho_n(G_v+ D F_u)+ (F_uG_v -G_uF_v)= 0,\] in order to find an interval \(\rho_n \in[\rho_n^{min},\rho_n^{max}]\) which defines the set of possible patterns which will grow in time, that is, those with \(\lambda_{n}>0\). In 2-D, this is exactly the same except that we replace the index \(n\) with \((n_1, n_2)\), and for any given value of \(\rho_{(n_1, n_2)}\), there can be different values of \(n_1\) and \(n_2\) that give rise to this value.
In either case, the pattern is assumed to be observed when the morphogen rises above some concentration value \(a\) (\(u_1>a\)) to stimulate some chemical/physical process. In the context of a developing embryo, such a region of high concentration will lead some cells to differentiate (turning on or off some genes to express a certain ‘phenotype’ or set of behaviours), which will lead to subsequent restructuring of the tissue locally, etc. When this happens for some periodic pattern, this can lead to things like hair follicles expressing different colours, or the digits forming in one’s hand – at least this is the theory!
As mentioned, these algebraic Turing conditions which are independent of \(\rho_k\) only apply for domains which are sufficiently large. The derivation of the growth rates, with solutions given by [turinglambda], is instead a general result that depends on the spatial eigenvalues of the Laplacian \(\rho_k\). It is these spatial eigenvalues which encode both the geometry of the domain (its lengths etc), as well as the boundary conditions, as we saw in the case of Neumann and Dirichlet conditions in the last chapter. Here we discuss the case of finite mode selection briefly, just to get across some intuition of how this works on a fixed bounded domain. We will also continue to restrict to the case of Neumann boundary conditions.
We saw that all growth rates will have negative real part unless \(\det(\mathsfbfit{M})=h(\rho_k)\), given by [h-US-eq], is negative. Finding the zeroes of this function, assuming parameters where the minimum is negative, allows us to find an interval of unstable wavenumbers. This interval is exactly where we expect a positive growth rate, as \(h(\rho_k)<0\) there. See the curve given by \(d>d_c\) in 4.3, and note that the band indicated in (b) is exactly the same region for which \(h\) is negative in (a).
The parameters of the system dictate this range of wavenumbers \(\rho_k\in[\rho_k^{-},\rho_k^{+}]\). In one dimension, the spacing of these discrete wavenumbers, given in [1-D-US-eigs], can be directly interpreted as points on the \(x\) axis in 4.3, with a spacing between them given by \(\pi/L\). So as \(L\) increases, these points cluster more closely. By finding the zeroes of \(h(\rho_k)\), we can compute the wavenumber bounds as: \[\begin{align} \label{ksbounds} \rho_k^{\mp}= \frac{1}{2D}\left[(DF_u + G_v) \pm \sqrt{(D F_u +G_v)^2 -4D(F_uG_v - F_vG_u)}\right]. \end{align}\]
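The following sketch computes this band from [ksbounds] for assumed parameter values and then lists which discrete 1-D Neumann modes fall inside it; note that for a small domain no discrete mode may land in the band, echoing the caveat about finite domains above.

```python
# Sketch: the unstable band [rho-, rho+] from [ksbounds], and the discrete
# 1-D Neumann modes rho_n = (n pi / L)^2 inside it; parameters are assumed.
import numpy as np

Fu, Fv, Gu, Gv, D = 1.0, -3.0, 2.0, -2.0, 20.0
detJ = Fu * Gv - Gu * Fv
s = D * Fu + Gv
root = np.sqrt(s**2 - 4.0 * D * detJ)
rho_minus, rho_plus = (s - root) / (2 * D), (s + root) / (2 * D)
print(f"unstable band: [{rho_minus:.3f}, {rho_plus:.3f}]")

for L in (5.0, 50.0):
    n = np.arange(1, 200)
    rho_n = (n * np.pi / L) ** 2
    unstable = n[(rho_n > rho_minus) & (rho_n < rho_plus)]
    print(f"L = {L}: unstable modes n =", unstable)
# For the small domain no discrete mode lands in the band, so no pattern
# forms even though the 'large domain' Turing conditions hold.
```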
In higher dimensions, the size \(L_i\) of each individual domain direction relates a given \(k_i\) to the corresponding mode number \(n_i\), i.e. \[k_i = \frac{n_i\pi}{L_i}.\] For a fixed \(n_i\), the value of \(k_i\) decreases as the domain size \(L_i\) increases. We see that the total spatial eigenvalue, in two dimensions given by [2-D-US-eigs] and in the general case by [mode-US-numbers], thus scales with each \(L_i\) individually. Thus, the number of modes in \(\rho_k\in[\rho_k^{-},\rho_k^{+}]\) will increase as the length of the domain increases. The effect of the parameter \(D\) is a little more complex.
If we send \(D \to \infty\) then the permissible bounds of \(\rho_k\) will tend to \(\rho_k\in(0, F_u)\) (consider \(\rho_k^\mp\) and look at where \(D\) is). This is sometimes a useful limit to consider.
For a fixed wavenumber, Equation [turinglambda] gives the growth rate \(\lambda(\rho_k)\). If we assume that the domain is sufficiently large, we can compute a value of \(\rho_k\) for which \(\lambda(\rho_k)\) is maximal. This will then be the mode which grows the fastest, at least as long as the linear approximation is valid. To find this we differentiate \(\lambda(\rho_k)\) with respect to \(\rho_k\) and solve \(\mathchoice{\frac{{\mathrm d}\lambda}{{\mathrm d}\rho_k}}{{\mathrm d}\lambda/{\mathrm d}\rho_k}{{\mathrm d}\lambda/{\mathrm d}\rho_k}{{\mathrm d}\lambda/{\mathrm d}\rho_k}=0\). Recall that \(\lambda(\rho_k)\) takes the form \[\begin{aligned} &\frac{1}{2}\left[(F_u +G_v) -\rho_k(D+1) + \left\{\left((F_u +G_v) -\rho_k(D+1) \right)^2- 4\det(\mathsfbfit{M})\right\}^{1/2}\right],\\ &\det(\mathsfbfit{M}) = D\rho_k^2 - \rho_k(DF_u +G_v) + (F_uG_v -F_vG_u), \end{aligned}\] so this is not a trivial task! With no little effort, solving \(\mathchoice{\frac{{\mathrm d}\lambda}{{\mathrm d}\rho_k}}{{\mathrm d}\lambda/{\mathrm d}\rho_k}{{\mathrm d}\lambda/{\mathrm d}\rho_k}{{\mathrm d}\lambda/{\mathrm d}\rho_k}=0\) gives \[\begin{equation} \label{kmaximal} \rho_k^* = \frac{1}{D-1}\left[(D+1)\left(\frac{-F_v G_u}{D}\right)^{1/2}-F_u+G_v\right], \end{equation}\] as the fastest growing mode. Alongside bounding the range of unstable modes, this computation often matches observed patterns from full numerical simulations reasonably well, and so despite the algebraic difficulty, it can be a useful heuristic for predicting the wavelength between pattern elements (e.g. the spacing between spots) given model parameters.
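As a sanity check, one can locate the maximiser of \(\operatorname{Re}(\lambda(\rho_k))\) numerically and compare with [kmaximal]; a sketch, using the same assumed Jacobian as in the earlier sketches.

```python
# Sketch: numerically locating the maximiser of Re(lambda(rho_k)) and
# comparing with the closed form [kmaximal]; Jacobian values are assumed.
import numpy as np

Fu, Fv, Gu, Gv, D = 1.0, -3.0, 2.0, -2.0, 20.0
detJ = Fu * Gv - Gu * Fv

def lam_plus(rho):
    # The larger root of the dispersion quadratic, vectorised over rho.
    tr = (Fu + Gv) - rho * (D + 1.0)
    det = D * rho**2 - rho * (D * Fu + Gv) + detJ
    return (0.5 * (tr + np.sqrt(tr**2 - 4.0 * det + 0j))).real

rho = np.linspace(0.0, 1.0, 200001)
numeric = rho[np.argmax(lam_plus(rho))]
closed = ((D + 1.0) * np.sqrt(-Fv * Gu / D) - Fu + Gv) / (D - 1.0)
print(f"numeric argmax = {numeric:.4f}, closed form = {closed:.4f}")
```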
I do not expect you to derive this result; it’s quite (very) tedious to obtain. If, however, you feel like stretching your algebra muscles... In deriving this result you should show that the equation \(\mathchoice{\frac{{\mathrm d}\lambda}{{\mathrm d}\rho_k}}{{\mathrm d}\lambda/{\mathrm d}\rho_k}{{\mathrm d}\lambda/{\mathrm d}\rho_k}{{\mathrm d}\lambda/{\mathrm d}\rho_k}=0\) can be written as the following quadratic, \[\nonumber D (D-1)^2\rho_k^2 + 2D(D-1)(F_u-G_v)\rho_k + D \left((D+2) F_v G_u+(F_u-G_v)^2\right)+F_v G_u=0.\]
Consider the reaction–diffusion system \[\begin{align} \label{RD-US-System-US-Schnack} \mathchoice{\frac{\partial u}{\partial t}}{\partial u/\partial t}{\partial u/\partial t}{\partial u/\partial t} &= \nabla^2u + F(u,v),\\ \mathchoice{\frac{\partial v}{\partial t}}{\partial v/\partial t}{\partial v/\partial t}{\partial v/\partial t} &= D\nabla^2v + G(u,v). \end{align}\] with the reaction kinetics, \[F = a+u^2v- u,\quad G = b- u^2v,\] with \(a\), \(b\) both positive parameters. These are known as the Schnakenberg kinetics, or alternatively as the substrate-depleted kinetics. These reaction terms are those we used when looking at switching oscillatory behaviour in last term’s notes, there called the enzyme-reaction system. Here we will go through the steps of finding parameter regimes for which this system exhibits diffusion-driven instability.
The equilibrium can be found readily by adding \(F+G\) to find \(u_0\), and then substituting this into \(G=0\) (or \(F=0\)) to find \(v_0\). We obtain, \[u_0 = a+b,\quad v_0 = \frac{b}{(a+b)^2}.\] so \(a+b> 0\) and \(b\geq 0\) for physical solutions (and in realistic biological situations, we also tend to require \(a>0\)).
The beauty of the Turing analysis is that we have already done this in the general case (see [turinglin]). We simply need to find the components of the Jacobian evaluated at the steady state. These are given by: \[\begin{align} \label{enzymepds} F_u &= 2u_0v_0-1=\frac{b-a}{a+b}, \quad F_v =(u_0)^2= (a+b)^2,\\ \nonumber G_u &=-2u_0 v_0= \frac{-2b}{a+b},\quad G_v = -(u_0)^2= -(a+b)^2. \end{align}\]
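For those who like to automate such computations, a short sympy sketch reproduces the equilibrium and the Jacobian entries [enzymepds]; the use of sympy here is my own illustrative choice, not part of the course.

```python
# Sketch: the Schnakenberg equilibrium and Jacobian, computed symbolically.
import sympy as sp

u, v, a, b = sp.symbols('u v a b', positive=True)
F = a + u**2 * v - u
G = b - u**2 * v

sol = sp.solve([F, G], [u, v], dict=True)[0]
print(sol)                      # {u: a + b, v: b/(a + b)**2}

J = sp.Matrix([[F.diff(u), F.diff(v)],
               [G.diff(u), G.diff(v)]]).subs(sol)
print(sp.simplify(J))           # entries equivalent to [enzymepds]
```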
As we have already solved the linear reaction–diffusion system, we can just make use of the Turing conditions. We see that \(G_v<0\) always, so \(v\) is the (self-)inhibitor. The cross terms also have the right sign structure for out-of-phase patterns, as \(F_v>0\) and \(G_u<0\). Since we must then have \(F_u>0\) (so that \(u\) is the activator), we require \(b>a\).
To recap: as \(F_u\) and \(G_v\) must be of opposing sign, we need \(F_u>0\), and so \(b>a\). Further, \(D>1\), as \(v\) is the inhibitor.
Firstly we require \(\operatorname{tr}(\mathsfbfit{J})=F_u+G_v<0\) which says, \[\frac{b-a}{a+b}-(a+b)^2 <0 \implies (a+b)^3>b-a.\] We will also need \(b-a>0\) from requiring \(F_u>0\), though the following two conditions below will implicitly require this (and we do not need this for the homogeneous equilibrium to be stable in the absence of diffusion).
Next we need \(\det(\mathsfbfit{J})>0\) which implies \[-\frac{b-a}{a+b}(a+b)^2+2b(a+b) = (a+b)^2>0,\] which is trivially satisfied.
Now we need \(DF_u+G_v>0\) which says, \[D\frac{b-a}{a+b}-(a+b)^2 >0 \implies (a+b)^3<D(b-a).\]
Finally we have to check the last condition, which is often much more algebraically involved than the other three. We need \((DF_u+G_v)^2-4D(F_uG_v-F_vG_u)>0\) which says, \[\left(D\frac{b-a}{a+b}-(a+b)^2\right)^2-4D(a+b)^2 = \frac{1}{(a+b)^2}\left (D(b-a)-(a+b)^3\right)^2-4D(a+b)^2>0,\] which can be simplified as, \[(D(b-a)-(a+b)^3)^2>4D(a+b)^4.\]
We can summarise the conditions by saying that for the homogeneous steady state to be stable, we simply require, \[\begin{equation} \label{hom-US-ss-US-stab-US-schnack} (a+b)^3>b-a, \end{equation}\] and that for a diffusion-driven instability, we also need, \[(a+b)^3<D(b-a), \quad \textrm{and} \quad (D(b-a)-(a+b)^3)^2>4D(a+b)^4.\] As noted, the requirement that \(b>a\) for \(F_u\) to be positive can be seen in the first of these two conditions. We also see that, as long as \(b-a>0\), both of these conditions will be automatically satisfied in the limit of \(D \to \infty\). We can confirm this by plotting the conditions numerically.
4.5: The Turing space of the Schnakenberg kinetics in the \((a,b)\) plane for (a) \(D=5\), (b) \(D=50\), (c) \(D=500\), (d) \(D=5000\).
In 4.5, we plot a region in \((a,b)\) parameter space for different values of \(D\). The black region corresponds to values of \(a\) and \(b\) where the Turing conditions are satisfied, and so for sufficiently large domains we suspect those parameters to give rise to patterned solutions. The grey regions are where the homogeneous equilibrium remains stable, and the white region is where the homogeneous equilibrium is unstable to homogeneous (\(\rho_0=0\)) perturbations, or equivalently where condition [hom-US-ss-US-stab-US-schnack] is violated. The blue dashed line is where \(a=b\). With some work, you can actually draw these boundary curves by hand, though this is much simpler to do using modern numerical software as I have done here.
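For those who want to reproduce such plots, a minimal sketch follows; the grid resolution, parameter ranges, and the choice of \(D\) are arbitrary, and the three conditions are exactly those summarised above.

```python
# Sketch: shading the Turing space of the Schnakenberg kinetics in the
# (a, b) plane. White: homogeneous state unstable; grey: stable; black:
# Turing conditions satisfied. All ranges are arbitrary choices.
import numpy as np
import matplotlib.pyplot as plt

D = 50.0
a, b = np.meshgrid(np.linspace(0.001, 1.0, 400), np.linspace(0.001, 2.0, 400))

stable = (a + b)**3 > (b - a)                               # stability without diffusion
cond1 = (a + b)**3 < D * (b - a)                            # D F_u + G_v > 0
cond2 = (D * (b - a) - (a + b)**3)**2 > 4 * D * (a + b)**4  # real marginal mode
turing = stable & cond1 & cond2

region = np.where(turing, 2, np.where(stable, 1, 0))
plt.contourf(a, b, region, levels=[-0.5, 0.5, 1.5, 2.5],
             colors=['white', 'grey', 'black'])
plt.plot([0, 1], [0, 1], 'b--')  # the line a = b
plt.xlabel('$a$'); plt.ylabel('$b$'); plt.title(f'$D = {D}$')
plt.show()
```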
As predicted, as \(D\) increases, the entire region with a stable homogeneous equilibrium in the absence of diffusion (the grey region) becomes Turing unstable as long as \(b>a\), as indicated by the blue line. On the other hand, for \(D=5\) we see a relatively small Turing-unstable region, which shrinks rapidly and disappears for \(D<2.5\) or so. Again this has important implications: we require a massive difference in the diffusion rates of our two morphogens in order to realise a ‘large’ Turing space. This is a major obstacle to the robustness of this theory, as we would really want a large, robust parameter space where we see these instabilities, one which is not very sensitive to small random fluctuations.
We can also consider a means by which we exert control on the patterns formed by this system, as long as we can control the kinetic parameters (and thereby the Jacobian). Such control approaches have been used in chemical applications of Turing systems, as well as in more modern synthetic biology applications.
The Turing conditions tell us that when \[(DF_u +G_v)^2 = 4D(F_uG_v-F_vG_u),\] the system is on the border of allowing Turing instabilities. This occurs when \[\rho_k= \frac{D F_u + G_v}{2D}.\] So only one mode can potentially be unstable if we are just past this threshold. One could for example demand this occur for an \(n_1=5,n_2=5\) mode in a 2-D system (creating a checkerboard pattern like in 3.3(a)). In the example we are looking at, \[\rho_k= \frac{D(b-a)-(a+b)^3}{2D(a+b)} = 25\pi^2\left(\frac{1}{L_1^2} +\frac{1}{L_2^2} \right),\] which would, in principle, allow us to pick parameters such that the system is unstable exactly at that mode. The amplitude of the mode can, to some degree, also be controlled, though this is beyond our scope here. In fact recent efforts have allowed for the design of elaborate Turing patterns controlled via spatial heterogeneity and nonlinear spot and stripe selection mechanisms, though again you will not be expected to be familiar with these. See this paper for further discussion. An example is shown in 4.6.
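As a rough illustration of this idea in code: at the instability border the final Turing condition holds with equality, \(D(b-a)-(a+b)^3 = 2\sqrt{D}(a+b)^2\) (taking the positive root), so the marginal mode simplifies to \(\rho_k = (a+b)/\sqrt{D}\). The sketch below, with illustrative values of \(a\) and \(b\), finds the critical \(D\) and the square domain size that places the \(n_1=n_2=5\) mode exactly at this threshold.

```python
# Sketch: tuning D and the domain so the (5,5) mode sits at the Turing
# threshold. The values of a and b are illustrative assumptions.
import numpy as np
from scipy.optimize import brentq

a, b = 0.05, 1.1

def threshold(D):
    # equality case of the final Turing condition (positive root)
    return D * (b - a) - (a + b)**3 - 2.0 * np.sqrt(D) * (a + b)**2

D_crit = brentq(threshold, 1.0, 1e4)     # critical diffusion ratio
rho_crit = (a + b) / np.sqrt(D_crit)     # the single marginal mode
L = np.pi * np.sqrt(50.0 / rho_crit)     # square domain: 50 pi^2 / L^2 = rho_crit
print(D_crit, rho_crit, L)
```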
The critical point to make regarding the Turing analysis is that it is a linear analysis. Essentially all of the interesting systems we shall consider (and indeed not consider) are nonlinear; nature is cruel like that! However this linear analysis tells us something about the permissible patterns (the unstable modes) which the system can take. In actual fact, as relayed in section 2.4 of Murray, this constraint tends to be quite accurate. That is to say, when the full nonlinear system is solved numerically, the resulting pattern tends to belong to the permissible space indicated by the Turing analysis. This can be formally justified near the onset of instability (that is, near the boundaries of the Turing space), though nonlinearity can play an important role in modifying the exact nature of the patterns presented. Of course the Turing analysis only presents us with a range of possible patterns, and especially in two and three dimensions, there can often be many. Indeed one can obtain many different patterned states from full numerical simulations of reaction–diffusion systems, and this kind of ‘multistability’ becomes exponentially difficult to understand in larger domains.
Of course Turing’s idea was that his analysis captures the system’s behaviour just as it develops the pattern. In this case some parameter of the system is increased, pushing the system just past the point of stability. We can choose the \(\varepsilon\) in the linearisation to be the distance of this parameter from the critical value at which stability is lost. In this case the linear approximation will likely be a much more reliable indicator of the system’s behaviour. In this way we can relate a Turing instability at onset to the pitchfork bifurcation studied in 1; see 4.7. Still, one can greatly improve an analytical prediction by using more sophisticated analytical tools, such as weakly nonlinear analysis, or the more rigorous study of ‘shadow-limit’ systems which essentially formalise the limit \(D \to \infty\) more carefully.
The fundamental components of a Turing analysis are
The loss of stability of a homogeneously mixed system of biochemicals/populations/cells leading to the formation of inhomogeneous patterns.
From the point of view of the stability analysis, the crucial assumptions were the stability of the zero mode (stability of the homogeneous equilibrium in the absence of diffusion) and the instability of some inhomogeneous \(\rho_k\neq 0\) modes which formed the pattern.
The critical result of these assumptions was that only a finite range of \(\rho_k\) values (specific patterns) can grow. This is a plausible explanation of why we often see spots and stripes: patterns which would be the consequence of only a subset of modes being promoted.
A number of the aspects of the analysis are not fundamental and can be relaxed:
The boundary conditions do not necessarily need to be no-flux, which forbids anything leaving or entering the system. Similarly periodic boundary conditions, which are more realistic for some animal skin patterns, can give a similar result and comparable patterns. However, as in the case of a harsh environment in the previous chapter, the pattern formation paradigm is not the same for Dirichlet boundary conditions. This is because the zero mode is not permissible so does not need to be suppressed. This can mean the set of growing modes is more difficult to analyse.
The reaction–diffusion system is built from linear diffusion terms, representing spreading, together with reaction terms. Pattern formation can and does occur for nonlinear diffusion, but the analysis is often more complex.
The domain will not generally be Cartesian. In fact, whilst it is an excellent mathematical starting point, it is often the case that the domain should have a more complex geometry (think animal appendages). As we shall see in the next chapter, different domain geometries do not typically affect the Turing conditions themselves, but they change the mathematical form of the patterns formed and, for small domains, influence \(\rho_k\).
We assumed the instability occurred for a homogeneous equilibrium. In fact there is a lot of interest in pattern formation around heterogeneous states in space and time, though the analysis of such systems requires more sophisticated mathematical tools than those we have developed here.
Problems 1–7 of the Epiphany additional problem sheet 1 are all based on the Turing analysis, and are part A and part B style. Note that some of the notation and style has changed from previous terms – in particular, we have essentially considered the case of \(\gamma=1\) here, which is just a particular scaling of the reaction kinetic timescale.
Relevant reading: Murray book II, chapter 3, especially 3.4.
In the last chapter, we developed a theory of pattern formation due to linear instability of a homogeneous steady state. Importantly, this led to the idea of a diffusion-driven instability, where diffusion played a key part in inducing the instability. The patterns which formed, at least within a linear approximation, had the form of eigenfunctions of the Laplacian. These satisfy the auxiliary equation, \[\nabla^2 w_k(\mathbfit{x}) + \rho_k w_k(\mathbfit{x}) = 0,\] which is otherwise known as the Helmholtz equation or eigen-equation.
In this chapter, we will explore this equation in more complicated domains, in order to build some intuition for how it can be solved in contexts beyond 1-D intervals and 2-D squares. As an example of such domains, we will consider the whorl patterns on Acetabularia, and mention other cases of non-Cartesian pattern formation. We will end with a (non-examinable) discussion of very general domains where the ideas still apply, even if we can no longer analytically solve this equation.
It’s worth remembering that whether or not a reaction–diffusion system has a Turing instability is a function of its parameters and \(\rho_k\) only – the exact form of \(w_k\) is only needed to get a sense for what kinds of linear patterns we might expect. It’s also worth noting that solutions to this equation have applications far beyond reaction–diffusion equations. Acoustics, electrodynamics, fluid dynamics, and solid mechanics are all examples of fields where linear stability and the eigenfunctions of the Laplacian play a role.
While a lot of the patterns we immediately think of in biology are two-dimensional (spots and stripes on animal coats, for example), many biological structures are inherently three-dimensional. Even animal coat patterns are induced by hair follicles, which are the buds in a developing embryo that become the roots of hairs, and these inherently have a three-dimensional structure. Other examples are the differentiation of different cell types to form different organs and tissues inside a body, leading to a wide array of complex spatial structures that develop in different ways.
In this section we will consider the simplest case of three spatial dimensions by looking at eigenfunctions of the Laplacian in a cuboid or ‘box-like’ domain. In the first row of 5.1, we give examples of patterns in such a domain corresponding to simulations of the Schnakenberg system studied in 4.6. The second row shows patterns in a cylindrical domain which we will discuss in subsequent sections. Unlike two-dimensional spatial patterns, which tend towards either spot-like structures or stripe-like structures, three-dimensional patterns have a variety of different forms, and there is often no simple classification of them (though there are analogues of ‘spots’ and ‘stripes’ in the form of sphere-like solutions and lamellae-like solutions). Note, however, that these three-dimensional structures are at least qualitatively similar independent of the geometry. We will come back to this point at the end of the chapter.
Let’s proceed with the linear analysis (that is, the study of eigenfunctions of the Laplacian). Consider a cuboidal spatial domain \((x,y,z) \in \Omega = [0,L_x]\times[0,L_y]\times[0,L_z]\). We will solve the eigen-equation by going through the usual separation of variables process. We have that \(w_k(x,y,z)\) satisfies \[\begin{equation} \label{Helm3D} \nabla^2 w_k + \rho_k w_k = \mathchoice{\frac{\partial^2 w_k}{\partial x^2}}{\partial^2 w_k/\partial x^2}{\partial^2 w_k/\partial x^2}{\partial^2 w_k/\partial x^2}+\mathchoice{\frac{\partial^2 w_k}{\partial y^2}}{\partial^2 w_k/\partial y^2}{\partial^2 w_k/\partial y^2}{\partial^2 w_k/\partial y^2}+\mathchoice{\frac{\partial^2 w_k}{\partial z^2}}{\partial^2 w_k/\partial z^2}{\partial^2 w_k/\partial z^2}{\partial^2 w_k/\partial z^2}+\rho_k w_k = 0. \end{equation}\] Note that for now the \(k\) subscript is something like a placeholder for an indexed set of solutions we have to find. We will also assume that this cuboid domain is closed, so that \(w_k\) satisfies Neumann boundary conditions. That is, \[\begin{aligned} \mathchoice{\frac{\partial w_k}{\partial x}}{\partial w_k/\partial x}{\partial w_k/\partial x}{\partial w_k/\partial x}(0,y,z)=\mathchoice{\frac{\partial w_k}{\partial x}}{\partial w_k/\partial x}{\partial w_k/\partial x}{\partial w_k/\partial x}(L_x,y,z)&\nonumber\\ =\mathchoice{\frac{\partial w_k}{\partial y}}{\partial w_k/\partial y}{\partial w_k/\partial y}{\partial w_k/\partial y}(x,0,z)=\mathchoice{\frac{\partial w_k}{\partial y}}{\partial w_k/\partial y}{\partial w_k/\partial y}{\partial w_k/\partial y}(x,L_y,z)&\nonumber\\ =\mathchoice{\frac{\partial w_k}{\partial z}}{\partial w_k/\partial z}{\partial w_k/\partial z}{\partial w_k/\partial z}(x,y,0)=\mathchoice{\frac{\partial w_k}{\partial z}}{\partial w_k/\partial z}{\partial w_k/\partial z}{\partial w_k/\partial z}(x,y,L_z)&=0. \end{aligned}\] These six boundary conditions correspond exactly to the normal derivative vanishing on the six faces of a cuboid.
We can solve this equation via separation of variables. Assuming \[w_k(x,y,z) = X(x)Y(y)Z(z)\] and substituting this in, we have: \[\left(\mathchoice{\frac{\partial^2 }{\partial x^2}}{\partial^2 /\partial x^2}{\partial^2 /\partial x^2}{\partial^2 /\partial x^2}+\mathchoice{\frac{\partial^2 }{\partial y^2}}{\partial^2 /\partial y^2}{\partial^2 /\partial y^2}{\partial^2 /\partial y^2}+\mathchoice{\frac{\partial^2 }{\partial z^2}}{\partial^2 /\partial z^2}{\partial^2 /\partial z^2}{\partial^2 /\partial z^2}\right)(XYZ)+\rho_k XYZ = YZ\mathchoice{\frac{{\mathrm d}^2 X}{{\mathrm d}x^2}}{{\mathrm d}^2 X/{\mathrm d}x^2}{{\mathrm d}^2 X/{\mathrm d}x^2}{{\mathrm d}^2 X/{\mathrm d}x^2}+XZ\mathchoice{\frac{{\mathrm d}^2 Y}{{\mathrm d}y^2}}{{\mathrm d}^2 Y/{\mathrm d}y^2}{{\mathrm d}^2 Y/{\mathrm d}y^2}{{\mathrm d}^2 Y/{\mathrm d}y^2}+XY\mathchoice{\frac{{\mathrm d}^2 Z}{{\mathrm d}z^2}}{{\mathrm d}^2 Z/{\mathrm d}z^2}{{\mathrm d}^2 Z/{\mathrm d}z^2}{{\mathrm d}^2 Z/{\mathrm d}z^2}+\rho_k XYZ=0,\] where two of the three functions can pass through each derivative term, as they are constant with respect to that derivative. We have also changed the derivatives to total derivatives, as \(X\), \(Y\), and \(Z\) are each functions of a single variable.
Dividing this equation by \(XYZ\) we have, \[\frac{1}{X}\mathchoice{\frac{{\mathrm d}^2 X}{{\mathrm d}x^2}}{{\mathrm d}^2 X/{\mathrm d}x^2}{{\mathrm d}^2 X/{\mathrm d}x^2}{{\mathrm d}^2 X/{\mathrm d}x^2}+\frac{1}{Y}\mathchoice{\frac{{\mathrm d}^2 Y}{{\mathrm d}y^2}}{{\mathrm d}^2 Y/{\mathrm d}y^2}{{\mathrm d}^2 Y/{\mathrm d}y^2}{{\mathrm d}^2 Y/{\mathrm d}y^2}+\frac{1}{Z}\mathchoice{\frac{{\mathrm d}^2 Z}{{\mathrm d}z^2}}{{\mathrm d}^2 Z/{\mathrm d}z^2}{{\mathrm d}^2 Z/{\mathrm d}z^2}{{\mathrm d}^2 Z/{\mathrm d}z^2}+\rho_k =0.\] Now, we can proceed by moving any of the three derivative terms to the right-hand side. Which one you choose will not change the final answers we get, but it might change the intermediate symbols we introduce. Let’s subtract the term involving \(X\) from both sides to get, \[\frac{1}{Y}\mathchoice{\frac{{\mathrm d}^2 Y}{{\mathrm d}y^2}}{{\mathrm d}^2 Y/{\mathrm d}y^2}{{\mathrm d}^2 Y/{\mathrm d}y^2}{{\mathrm d}^2 Y/{\mathrm d}y^2}+\frac{1}{Z}\mathchoice{\frac{{\mathrm d}^2 Z}{{\mathrm d}z^2}}{{\mathrm d}^2 Z/{\mathrm d}z^2}{{\mathrm d}^2 Z/{\mathrm d}z^2}{{\mathrm d}^2 Z/{\mathrm d}z^2}+\rho_k =-\frac{1}{X}\mathchoice{\frac{{\mathrm d}^2 X}{{\mathrm d}x^2}}{{\mathrm d}^2 X/{\mathrm d}x^2}{{\mathrm d}^2 X/{\mathrm d}x^2}{{\mathrm d}^2 X/{\mathrm d}x^2} = C_x,\] where this constant \(C_x\) appears because the left-hand side of the equation is a function of \(y\) and \(z\) only, and the right a function of \(x\) only. Hence, both sides of the equation must actually be constant (as you can vary \(x\) but leave \(y\) and \(z\) fixed, for example). We now have the system, \[\begin{align} \label{XY-US-eq} \frac{1}{Y}\mathchoice{\frac{{\mathrm d}^2 Y}{{\mathrm d}y^2}}{{\mathrm d}^2 Y/{\mathrm d}y^2}{{\mathrm d}^2 Y/{\mathrm d}y^2}{{\mathrm d}^2 Y/{\mathrm d}y^2}+\frac{1}{Z}\mathchoice{\frac{{\mathrm d}^2 Z}{{\mathrm d}z^2}}{{\mathrm d}^2 Z/{\mathrm d}z^2}{{\mathrm d}^2 Z/{\mathrm d}z^2}{{\mathrm d}^2 Z/{\mathrm d}z^2}+\rho_k = C_x\\ -\frac{1}{X}\mathchoice{\frac{{\mathrm d}^2 X}{{\mathrm d}x^2}}{{\mathrm d}^2 X/{\mathrm d}x^2}{{\mathrm d}^2 X/{\mathrm d}x^2}{{\mathrm d}^2 X/{\mathrm d}x^2} = C_x. \end{align}\] But now we can do the same trick again, and subtract the \(Y\) terms from both sides of [XY-US-eq], as well as the \(C_x\), to get \[\begin{equation} \label{Z-US-eq} \frac{1}{Z}\mathchoice{\frac{{\mathrm d}^2 Z}{{\mathrm d}z^2}}{{\mathrm d}^2 Z/{\mathrm d}z^2}{{\mathrm d}^2 Z/{\mathrm d}z^2}{{\mathrm d}^2 Z/{\mathrm d}z^2}+\rho_k-C_x = -\frac{1}{Y}\mathchoice{\frac{{\mathrm d}^2 Y}{{\mathrm d}y^2}}{{\mathrm d}^2 Y/{\mathrm d}y^2}{{\mathrm d}^2 Y/{\mathrm d}y^2}{{\mathrm d}^2 Y/{\mathrm d}y^2} = C_y, \end{equation}\] where again the \(C_y\) comes from the left-hand side being a function of \(z\) only, and the right a function of \(y\) only.
Taken together, we can write a system of three equations as, \[\begin{aligned} -\frac{1}{X}\mathchoice{\frac{{\mathrm d}^2 X}{{\mathrm d}x^2}}{{\mathrm d}^2 X/{\mathrm d}x^2}{{\mathrm d}^2 X/{\mathrm d}x^2}{{\mathrm d}^2 X/{\mathrm d}x^2} &= C_x,\\ -\frac{1}{Y}\mathchoice{\frac{{\mathrm d}^2 Y}{{\mathrm d}y^2}}{{\mathrm d}^2 Y/{\mathrm d}y^2}{{\mathrm d}^2 Y/{\mathrm d}y^2}{{\mathrm d}^2 Y/{\mathrm d}y^2} &= C_y,\\ -\frac{1}{Z}\mathchoice{\frac{{\mathrm d}^2 Z}{{\mathrm d}z^2}}{{\mathrm d}^2 Z/{\mathrm d}z^2}{{\mathrm d}^2 Z/{\mathrm d}z^2}{{\mathrm d}^2 Z/{\mathrm d}z^2}&=\rho_k-C_x-C_y=C_z, \end{aligned}\] where we have rearranged [Z-US-eq] to be of the same form as the others, and renamed the constant \(\rho_k-C_x-C_y\) as \(C_z\), just to preserve the notational symmetry. As these equations are essentially all identical, we will discuss solving for \(X\) in detail and then explain what this means for \(Y\) and \(Z\). We note that the problem for \(X\) is exactly the 1-D case discussed in chapters 2 and 3, and in section 5.3 of the Michaelmas notes.
The equation for \(X\) takes the form, after multiplying by \(X\), \[\begin{equation} \label{X-US-eq-US-sep} \mathchoice{\frac{{\mathrm d}^2 X}{{\mathrm d}x^2}}{{\mathrm d}^2 X/{\mathrm d}x^2}{{\mathrm d}^2 X/{\mathrm d}x^2}{{\mathrm d}^2 X/{\mathrm d}x^2} + C_xX = 0. \end{equation}\] We can also impose the boundary conditions on \(w_k\) to see that we need boundary conditions on \(X\) of the form, \[\begin{equation} \label{X-US-BCs} \mathchoice{\frac{{\mathrm d}X}{{\mathrm d}x}}{{\mathrm d}X/{\mathrm d}x}{{\mathrm d}X/{\mathrm d}x}{{\mathrm d}X/{\mathrm d}x}(0) = \mathchoice{\frac{{\mathrm d}X}{{\mathrm d}x}}{{\mathrm d}X/{\mathrm d}x}{{\mathrm d}X/{\mathrm d}x}{{\mathrm d}X/{\mathrm d}x}(L_x)=0. \end{equation}\] As [X-US-eq-US-sep] is a constant coefficient ODE, we can solve it by assuming \(X(x) = \exp(\mu x)\). Substituting this in and dividing by \(X\), we then have an equation for \(\mu\) of the form, \[\mu^2 + C_x =0 \implies \mu = \pm\sqrt{-C_x}.\] Hence the sign of \(C_x\) determines the kinds of solutions we obtain. If \(C_x <0\), then we get exponential solutions, if \(C_x=0\) we obtain linear functions, and if \(C_x > 0\) we find solutions in the form of sines and cosines. The first two cases you should work through, and show that these kinds of solutions cannot satisfy the boundary conditions.
On Problem Sheet 6, question 2(c) has you find the solutions for this kind of constant-coefficient ODE in each case. You are then asked to use the boundary conditions to show that only the periodic solutions, that is \(C_x>0\), satisfy the boundary conditions.
So consider \(C_x>0\). We should then recognise that this is the equation of a simple harmonic oscillator (linearised pendulum equation), and so we expect sinusoidal solutions. But we can derive this just for practice. Our solutions take the form, \[X(x) = A\mathrm{e}^{\sqrt{-C_x}x}+B\mathrm{e}^{-\sqrt{-C_x}x} = A\mathrm{e}^{\mathrm{i}\sqrt{C_x}x}+B\mathrm{e}^{-\mathrm{i}\sqrt{C_x}x}.\] We can then use Euler’s identity on the complex exponentials to find, \[X(x) = A\left(\cos\left(x\sqrt{C_x}\right)+\mathrm{i}\sin\left(x\sqrt{C_x}\right)\right)+B\left(\cos\left(x\sqrt{C_x}\right)-\mathrm{i}\sin\left(x\sqrt{C_x}\right)\right).\] So \[X(x) = (A+B)\cos\left(x\sqrt{C_x}\right)+(A-B)\mathrm{i}\sin\left(x\sqrt{C_x}\right) = \widetilde{A}\cos\left(x\sqrt{C_x}\right) + \widetilde{B}\sin\left(x\sqrt{C_x}\right),\] where we have replaced the (arbitrary complex) constants \(A\) and \(B\) by real constants \(\widetilde{A}\) and \(\widetilde{B}\) by setting \(A=(\widetilde{A}-\mathrm{i}\widetilde{B})/2\), \(B=(\widetilde{A}+\mathrm{i}\widetilde{B})/2\).
By the first boundary condition that \(X'(0)=0\), we have that \(\widetilde{B}=0\). Applying the second we have that, \[\widetilde{A}\sqrt{C_x}\sin\left(L_x\sqrt{C_x}\right) = 0 \implies L_x\sqrt{C_x} = n_x \pi \implies C_x = \left(\frac{n_x \pi}{L_x} \right)^2, \quad n_x = 0,1,2,\dots.\] So our eigenfunctions in the \(x\) direction are given by, \[X(x) = \cos\left(\frac{n_x \pi x}{L_x} \right), \quad n_x = 0,1,2,\dots.\] where we are setting the constant to \(1\). As with eigenvectors of matrices, it is only the direction that matters, and we can scale the eigenfunctions as we’d like. In solving a time-dependent reaction–diffusion problem, the initial condition will determine these constants.
All of this is very standard/textbook, and in an exam you are generally allowed to skip any parts of these ‘derivations’ that you want, as long as you understand where the form of the solution comes from/explain when directed to. In particular, if a function \(F\) satisfies, \[\mathchoice{\frac{{\mathrm d}^2 F}{{\mathrm d}x^2}}{{\mathrm d}^2 F/{\mathrm d}x^2}{{\mathrm d}^2 F/{\mathrm d}x^2}{{\mathrm d}^2 F/{\mathrm d}x^2}(x) + CF(x) = 0, \quad \mathchoice{\frac{{\mathrm d}F}{{\mathrm d}x}}{{\mathrm d}F/{\mathrm d}x}{{\mathrm d}F/{\mathrm d}x}{{\mathrm d}F/{\mathrm d}x}(0)=\mathchoice{\frac{{\mathrm d}F}{{\mathrm d}x}}{{\mathrm d}F/{\mathrm d}x}{{\mathrm d}F/{\mathrm d}x}{{\mathrm d}F/{\mathrm d}x}(L_x)=0,\] then you can just quote that \(F\) must take the form, \[F = A\cos\left(\frac{k\pi x}{L_x} \right), \quad k=0,1,2,\dots\] Similarly if \(F\) satisfies, \[\mathchoice{\frac{{\mathrm d}^2 F}{{\mathrm d}x^2}}{{\mathrm d}^2 F/{\mathrm d}x^2}{{\mathrm d}^2 F/{\mathrm d}x^2}{{\mathrm d}^2 F/{\mathrm d}x^2}(x) + CF(x) = 0, \quad F(0)=F(L_x)=0,\] then you can just quote that \(F\) must take the form, \[F = A\sin\left(\frac{k\pi x}{L_x} \right), \quad k=1,2,\dots\] Just remember that the constant \(C\) in the above will also appear in the other equations from separation of variables, so its value is important.
As the equation we have solved for \(X\) is exactly the same equation, with the same boundary conditions, as for \(Y\) and \(Z\), we know that the solutions must take the same form. However the mode numbers in each case can vary independently from one another. So we have, \[Y(y) = \cos\left(\frac{n_y \pi y}{L_y} \right), \quad n_y = 0,1,2,\dots, \quad Z(z) = \cos\left(\frac{n_z \pi z}{L_z} \right), \quad n_z = 0,1,2,\dots.\] Putting these all together, we have that the solutions to the Helmholtz or eigenfunction equation given by [Helm3D] are: \[w_k(x,y,z) = \cos\left(\frac{n_x \pi x}{L_x} \right)\cos\left(\frac{n_y \pi y}{L_y} \right)\cos\left(\frac{n_z \pi z}{L_z} \right),\] where the numbers \(n_x, n_y\) and \(n_z\) can take any non-negative integer value independently of one another. How do we compute the eigenvalue of the Laplacian, \(\rho_k\)? Well we can either substitute in the above solution, or just add the equations for \(X''(x)+Y''(y)+Z''(z)\) to see that \(\rho_k = C_x+C_y+C_z\). We find, \[\rho_k = \left(\frac{\pi n_x}{L_x} \right )^2+\left(\frac{\pi n_y}{L_y} \right )^2+\left(\frac{\pi n_z}{L_z} \right )^2,\] consistent with [mode-US-numbers].
What about the subscript \(k\)? This is really just notation for an index, carried over somewhat as a placeholder from the 1-D case, as here we have three independent mode numbers. However, we can put an order on the spatial eigenvalues such that \(\rho_0 = 0 < \rho_1 \leq \rho_2 \leq \rho_3 \leq \dots\) if we specify a way of ‘ordering’ the three-dimensional lattice given by \((n_x, n_y, n_z)\). For example, using \(\rho_{n_x,n_y,n_z}\) to denote the Laplace eigenvalue corresponding to the mode numbers \(n_x, n_y\) and \(n_z\) respectively, and assuming for the moment that \(L_x=L_y=L_z\), we have that \(\rho_{0,0,0} = 0 < \rho_{1,0,0} = \rho_{0,1,0} = \rho_{0,0,1} < \rho_{1,1,0} = \rho_{1,0,1} = \rho_{0,1,1} < \rho_{1,1,1} < \rho_{2,0,0} = \dots\) etc.
This is not crucial information, but I hope it gives you some idea of how different spatial eigenmodes (eigenfunctions) can have the same spatial eigenvalue. Our linear instability analysis in the previous chapter only discriminated instability based on the scalar value \(\rho_k\), so it gives us no indication of which of the mode numbers grow in time. To determine this requires a weakly nonlinear analysis, which is beyond our scope.
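To make the ordering concrete, here is a short sketch enumerating the cuboid eigenvalues; the equal side lengths and the mode cut-off are arbitrary choices, and the printed output exhibits exactly the degeneracies listed above.

```python
# Sketch: enumerate and order the eigenvalues rho_{n_x,n_y,n_z} of the
# Laplacian on a cube with Neumann conditions. Side lengths are illustrative.
import numpy as np
from itertools import product

Lx = Ly = Lz = 1.0
N = 3  # consider mode numbers 0..N in each direction

modes = sorted(
    ((np.pi * nx / Lx)**2 + (np.pi * ny / Ly)**2 + (np.pi * nz / Lz)**2,
     (nx, ny, nz))
    for nx, ny, nz in product(range(N + 1), repeat=3)
)
for rho, n in modes[:8]:
    print(f"rho = {rho:8.4f}  from mode {n}")
# e.g. (1,0,0), (0,1,0), (0,0,1) all share the eigenvalue pi^2.
```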
We next want to consider patterns in circular domains. As a warm-up to this, we will consider an example of a scalar reaction–diffusion equation in a disc. The first half of this problem will be essentially identical to question 2 on the sheet for problems class 8. I would highly recommend trying each part of that question before reading the rest of this section.
Consider the growth of a colony of Escherichia coli (often called E. coli) in a circular container known as a Petri dish. See 5.2. We assume that there is plentiful food throughout the dish, but an antibiotic at the walls, which immediately kills any bacteria which touches the edge of the dish.
We model this using a two-dimensional reaction–diffusion equation, \[\mathchoice{\frac{\partial u}{\partial t}}{\partial u/\partial t}{\partial u/\partial t}{\partial u/\partial t} = D\nabla^2 u + u(1-u),\] where \(D\) is a scaled diffusion coefficient. We have nondimensionalised the bacterial density \(u\) and timescale \(t\) such that the carrying capacity of the culture medium (the bacteria’s food source) is \(1\), and the growth rate is also \(1\). Our domain is the two-dimensional disc \((r,\theta)\) with \(r\in [0, L_r]\) the radial coordinate, and \(\theta\in [0, 2\pi)\) the angular coordinate. In this polar coordinate system the Laplacian takes the form \[\nabla^2u = \frac{1}{r}\mathchoice{\frac{\partial}{\partial r}}{\partial/\partial r}{\partial/\partial r}{\partial/\partial r}\left(r \mathchoice{\frac{\partial u}{\partial r}}{\partial u/\partial r}{\partial u/\partial r}{\partial u/\partial r}\right) + \frac{1}{r^2}\mathchoice{\frac{\partial^2 u}{\partial\theta^2}}{\partial^2 u/\partial\theta^2}{\partial^2 u/\partial\theta^2}{\partial^2 u/\partial\theta^2}.\] We have the boundary conditions, \[{u}(t,L_r,\theta) = 0,\,\, {u}(t,r,0) ={u}(t,r,2\pi),\,\, \mathchoice{\frac{\partial u}{\partial\theta}}{\partial u/\partial\theta}{\partial u/\partial\theta}{\partial u/\partial\theta}(t,r,0) = \mathchoice{\frac{\partial u}{\partial\theta}}{\partial u/\partial\theta}{\partial u/\partial\theta}{\partial u/\partial\theta}(t,r,2\pi).\] The first of these represents bacterial death at the walls, and the second two are periodic conditions in \(\theta\).
Despite the geometry being different, this problem is otherwise the same as that in 3.1.2. While there are two spatially homogeneous equilibria in the absence of the diffusion term, given by \(u_0=0\) or \(1\), the latter does not satisfy the boundary condition at \(r=L_r\). So we can only consider instabilities around \(u_0=0\). In the absence of diffusion this steady state is unstable, though this may not be the case when diffusion is considered.
The linearisation of the equation is the same as we’ve seen before. Writing \(u = u_0 + \varepsilon u_1 = \varepsilon u_1\) we find, \[\begin{equation} \label{Petri-US-Linear} \mathchoice{\frac{\partial u_1}{\partial t}}{\partial u_1/\partial t}{\partial u_1/\partial t}{\partial u_1/\partial t} = D\nabla^2 u_1 + u_1(1-2u_0) = D\nabla^2 u_1 + u_1. \end{equation}\] As before, we substitute in \(u_1 = \exp(\lambda_k t)w_k(r,\theta)\) to find that the perturbation’s growth rates are given by, \[\begin{equation} \label{Disp-US-Bessel-US-Scalar} \lambda_k u_1 = -\rho_k D u_1 + u_1\implies \lambda_k = -\rho_k D + 1, \end{equation}\] where the spatial eigenvalue \(\rho_k\) is determined from the equation, \[\begin{equation} \label{eig-US-petri} \nabla^2w_k + \rho_k w_k = \frac{1}{r}\mathchoice{\frac{\partial}{\partial r}}{\partial/\partial r}{\partial/\partial r}{\partial/\partial r}\left(r \mathchoice{\frac{\partial w_k}{\partial r}}{\partial w_k/\partial r}{\partial w_k/\partial r}{\partial w_k/\partial r}\right) + \frac{1}{r^2}\mathchoice{\frac{\partial^2 w_k}{\partial\theta^2}}{\partial^2 w_k/\partial\theta^2}{\partial^2 w_k/\partial\theta^2}{\partial^2 w_k/\partial\theta^2} + \rho_k w_k = 0. \end{equation}\]
To determine the growth rates, we have to solve the Laplacian eigenvalue problem to find the spatial eigenvalues \(\rho_k\). We assume a separable solution of the form \(w_k(r,\theta) = R(r)P(\theta)\). Substituting this into [eig-US-petri] we find, \[\frac{1}{r}\mathchoice{\frac{\partial}{\partial r}}{\partial/\partial r}{\partial/\partial r}{\partial/\partial r}\left(r \mathchoice{\frac{\partial(RP)}{\partial r}}{\partial(RP)/\partial r}{\partial(RP)/\partial r}{\partial(RP)/\partial r}\right) + \frac{1}{r^2}\mathchoice{\frac{\partial^2 (RP)}{\partial\theta^2}}{\partial^2 (RP)/\partial\theta^2}{\partial^2 (RP)/\partial\theta^2}{\partial^2 (RP)/\partial\theta^2} + \rho_k RP = P\frac{1}{r}\mathchoice{\frac{{\mathrm d}}{{\mathrm d}r}}{{\mathrm d}/{\mathrm d}r}{{\mathrm d}/{\mathrm d}r}{{\mathrm d}/{\mathrm d}r}\left(r \mathchoice{\frac{{\mathrm d}R}{{\mathrm d}r}}{{\mathrm d}R/{\mathrm d}r}{{\mathrm d}R/{\mathrm d}r}{{\mathrm d}R/{\mathrm d}r}\right) + R\frac{1}{r^2}\mathchoice{\frac{{\mathrm d}^2 P}{{\mathrm d}\theta^2}}{{\mathrm d}^2 P/{\mathrm d}\theta^2}{{\mathrm d}^2 P/{\mathrm d}\theta^2}{{\mathrm d}^2 P/{\mathrm d}\theta^2} + \rho_k RP= 0.\] Multiplying both sides of this equation by \(r^2/RP\) we have, \[\frac{r}{R}\mathchoice{\frac{{\mathrm d}}{{\mathrm d}r}}{{\mathrm d}/{\mathrm d}r}{{\mathrm d}/{\mathrm d}r}{{\mathrm d}/{\mathrm d}r}\left(r \mathchoice{\frac{{\mathrm d}R}{{\mathrm d}r}}{{\mathrm d}R/{\mathrm d}r}{{\mathrm d}R/{\mathrm d}r}{{\mathrm d}R/{\mathrm d}r}\right) + \frac{1}{P}\mathchoice{\frac{{\mathrm d}^2 P}{{\mathrm d}\theta^2}}{{\mathrm d}^2 P/{\mathrm d}\theta^2}{{\mathrm d}^2 P/{\mathrm d}\theta^2}{{\mathrm d}^2 P/{\mathrm d}\theta^2} + \rho_kr^2 = 0 \implies \frac{r}{R}\mathchoice{\frac{{\mathrm d}}{{\mathrm d}r}}{{\mathrm d}/{\mathrm d}r}{{\mathrm d}/{\mathrm d}r}{{\mathrm d}/{\mathrm d}r}\left(r \mathchoice{\frac{{\mathrm d}R}{{\mathrm d}r}}{{\mathrm d}R/{\mathrm d}r}{{\mathrm d}R/{\mathrm d}r}{{\mathrm d}R/{\mathrm d}r}\right) + \rho_kr^2 = - \frac{1}{P}\mathchoice{\frac{{\mathrm d}^2 P}{{\mathrm d}\theta^2}}{{\mathrm d}^2 P/{\mathrm d}\theta^2}{{\mathrm d}^2 P/{\mathrm d}\theta^2}{{\mathrm d}^2 P/{\mathrm d}\theta^2} = C_\theta,\] where we have separated the \(r\) and \(\theta\) functions, and hence must have that both sides of this equation are constant, which we set to be equal to \(C_\theta\).
We then have the equations, \[\begin{align} \label{R-US-eq} -\frac{1}{Rr}\mathchoice{\frac{{\mathrm d}}{{\mathrm d}r}}{{\mathrm d}/{\mathrm d}r}{{\mathrm d}/{\mathrm d}r}{{\mathrm d}/{\mathrm d}r}\left(r \mathchoice{\frac{{\mathrm d}R}{{\mathrm d}r}}{{\mathrm d}R/{\mathrm d}r}{{\mathrm d}R/{\mathrm d}r}{{\mathrm d}R/{\mathrm d}r}\right) & = \rho_k-\frac{C_\theta}{r^2},\\ - \frac{1}{P}\mathchoice{\frac{{\mathrm d}^2 P}{{\mathrm d}\theta^2}}{{\mathrm d}^2 P/{\mathrm d}\theta^2}{{\mathrm d}^2 P/{\mathrm d}\theta^2}{{\mathrm d}^2 P/{\mathrm d}\theta^2} = C_\theta, \end{align}\] and the boundary conditions, \[R(L_r)=0, \quad P(0) = P(2\pi), \quad \mathchoice{\frac{{\mathrm d}P}{{\mathrm d}\theta}}{{\mathrm d}P/{\mathrm d}\theta}{{\mathrm d}P/{\mathrm d}\theta}{{\mathrm d}P/{\mathrm d}\theta}(0) = \mathchoice{\frac{{\mathrm d}P}{{\mathrm d}\theta}}{{\mathrm d}P/{\mathrm d}\theta}{{\mathrm d}P/{\mathrm d}\theta}{{\mathrm d}P/{\mathrm d}\theta}(2\pi).\] We now proceed to solve these two ODEs.
We write the equation for \(P\) as, \[\mathchoice{\frac{{\mathrm d}^2 P}{{\mathrm d}\theta^2}}{{\mathrm d}^2 P/{\mathrm d}\theta^2}{{\mathrm d}^2 P/{\mathrm d}\theta^2}{{\mathrm d}^2 P/{\mathrm d}\theta^2} +C_\theta P= 0.\] We can go through the cases of \(C_\theta\) taking different signs, but again should note that exponentials and linear functions will fail to satisfy the periodic boundary conditions. This can be shown in detail by applying the boundary conditions as in the Neumann case in 5.1. So we must have \(C_\theta \geq 0\), with the solution for \(C_\theta=0\) being a constant.
We then write our solutions as, \[P(\theta) = A\cos\left(\sqrt{C_\theta} \theta\right) + B \sin\left(\sqrt{C_\theta} \theta\right).\] Using the first boundary condition we have: \[A\cos(0) + B \sin(0) = A = A\cos\left(2\sqrt{C_\theta} \pi\right) + B \sin\left(2\sqrt{C_\theta} \pi\right),\] where we need the \(\cos\) term to equal \(1\) and the \(B\sin\) term to go to \(0\). For the former we require \(2\sqrt{C_\theta} \pi=2 n \pi,\) so \(\sqrt{C_\theta}\) must be an integer, that is \(\sqrt{C_\theta} = n\), \(n=0,1,2,3,\dots\).
The second boundary condition reads, \[-A\sqrt{C_\theta}\sin(0) + B \sqrt{C_\theta}\cos(0) = B\sqrt{C_\theta} = A\sqrt{C_\theta}\sin\left(2\sqrt{C_\theta} \pi\right) + B \sqrt{C_\theta}\cos\left(2\sqrt{C_\theta} \pi\right).\] As before, this requires that \(\sqrt{C_\theta} = n\), an integer. So the solution for \(P\) is, \[P(\theta) = A\cos(n \theta) + B \sin(n \theta),\quad n=0,1,2,\dots,\] where we note that \(C_\theta = n^2\). The inclusion of \(n=0\) covers the case that \(C_\theta=0\), which has the constant solution \(P =A\).
The equation for \(R\) is given by, after expanding the derivative terms and multiplying by \(r^2\), \[\begin{equation} \label{R-US-eq-US-Bessel} r^2\mathchoice{\frac{{\mathrm d}^2 R}{{\mathrm d}r^2}}{{\mathrm d}^2 R/{\mathrm d}r^2}{{\mathrm d}^2 R/{\mathrm d}r^2}{{\mathrm d}^2 R/{\mathrm d}r^2} + r \mathchoice{\frac{{\mathrm d}R}{{\mathrm d}r}}{{\mathrm d}R/{\mathrm d}r}{{\mathrm d}R/{\mathrm d}r}{{\mathrm d}R/{\mathrm d}r} +( \rho_kr^2-n^2)R = 0, \end{equation}\] where the \(n^2\) comes from \(C_\theta\) in [R-US-eq] and hence represents a coupling of the two boundary-value-problems. It turns out that this equation does not admit closed-form solutions involving the usual algebraic and trigonometric functions, and instead has solutions in terms of Bessel functions. These functions are complicated, but we give a quick overview here – do not overly worry about the details as you will not be expected to be an expert on these special functions, but just have some sense of how they look.
If we scale the spatial variable as \(x = \sqrt{\rho_k}r\), we find that [R-US-eq-US-Bessel] transforms as, \[x^2\mathchoice{\frac{{\mathrm d}^2 R}{{\mathrm d}x^2}}{{\mathrm d}^2 R/{\mathrm d}x^2}{{\mathrm d}^2 R/{\mathrm d}x^2}{{\mathrm d}^2 R/{\mathrm d}x^2} + x \mathchoice{\frac{{\mathrm d}R}{{\mathrm d}x}}{{\mathrm d}R/{\mathrm d}x}{{\mathrm d}R/{\mathrm d}x}{{\mathrm d}R/{\mathrm d}x} +( x^2-n^2)R = 0,\] which is Bessel’s equation. It has two linearly independent sets of solutions given by Bessel functions of the first kind \(J_n(x)\) and Bessel functions of the second kind \(Y_n(x)\). For integer \(n\), we can define these functions via integrals, \[\begin{align} J_n(x) &= \frac{1}{\pi}\int_0^\pi \cos(n\tau-x\sin(\tau))\,{\mathrm d}\tau = \frac{1}{2\pi}\int_{-\pi}^\pi \mathrm{e}^{\mathrm{i}(x\sin(\tau)-n\tau)} {\mathrm d}\tau,\\ Y_n(x) &= \frac{1}{\pi}\int_0^\pi\sin(x\sin(\tau)-n\tau)\,{\mathrm d}\tau-\frac{1}{\pi}\int_0^\infty(\mathrm{e}^{n\tau}+(-1)^n \mathrm{e}^{-n\tau})\mathrm{e}^{-x\sinh(\tau)}{\mathrm d}\tau\label{Y-US-eq-US-1} \end{align}\] There are many other ways to write these functions. For example, they can also be written as, \[\begin{align} \label{J-US-Series} J_n(x) &= \sum_{m=0}^\infty \frac{(-1)^m}{m!(m+n)!}\left(\frac{x}{2} \right)^{2m+n}\\ Y_n(x) &= \lim_{\alpha \to n} \frac{J_\alpha(x)\cos(\alpha \pi)-J_{-\alpha}(x)}{\sin(\alpha \pi)}. \end{align}\] We will not go through these functions in any more detail except to note that \(Y_n(x)\) becomes infinite at \(x=0\). You can see this by looking at [Y-US-eq-US-1] for \(x \ll 1\): \[Y_n(x) \approx \frac{1}{\pi}\int_0^\pi\sin(-n\tau)\,{\mathrm d}\tau-\frac{1}{\pi}\int_0^\infty(\mathrm{e}^{n\tau}+(-1)^n \mathrm{e}^{-n\tau})\,{\mathrm d}\tau \approx -\frac{1}{\pi}\int_0^\infty \mathrm{e}^{n\tau}{\mathrm d}\tau \to -\infty,\] where the \(\exp(n\tau)\) term in the second integral diverges. Even for \(n=0\), this term will diverge as the integral is itself unbounded. We show plots of these two functions in 5.3.
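In practice, one never evaluates these integral or series representations by hand; standard numerical libraries provide the Bessel functions directly. Here is a minimal sketch reproducing plots in the style of 5.3 (the plotting range is an arbitrary choice):

```python
# Sketch: plotting the first few Bessel functions with scipy.
import numpy as np
import matplotlib.pyplot as plt
from scipy.special import jv, yv

x = np.linspace(0.01, 20.0, 1000)  # start just above 0, since Y_n -> -inf there
for n in range(3):
    plt.plot(x, jv(n, x), label=f'$J_{n}(x)$')
    plt.plot(x, yv(n, x), '--', label=f'$Y_{n}(x)$')
plt.ylim(-2, 1.1); plt.legend(); plt.xlabel('$x$')
plt.show()
```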
Returning to \(R\), and remembering that we have scaled \(x = \sqrt{\rho_k}r\), we have that for each \(n\), solutions to [R-US-eq-US-Bessel] take the form, \[R_n(r) = A_nJ_n(\sqrt{\rho_k}r) + B_nY_n(\sqrt{\rho_k}r).\] As we have mentioned, \(Y_n\) becomes unbounded at \(r=0\). As this point is inside our domain, we must therefore take \(B_n=0\) to have bounded solutions. Finally, we do not yet have the values for \(\rho_k\). These can be obtained by using the \(R\) boundary condition, \[\begin{equation} \label{R-US-Bessel-US-BC} R_n(L_r)=0 \implies J_n(\sqrt{\rho_k}L_r) = 0. \end{equation}\] We see that, for each \(n\) coming from the \(\theta\) part of the problem, the Bessel function will oscillate around zero. In fact it will have infinitely many zeroes, so that for each fixed \(n\), there will be infinitely many values of \(\rho_k\) that satisfy [R-US-Bessel-US-BC]. As these values may be different for each \(n\), we modify the subscript of our spatial eigenvalue to include this dependence and write \(\rho_{k,n}\). Hence the solution to [R-US-eq-US-Bessel] in full generality can be written as, \[R_n(r) = \sum_{k=1}^\infty A_{k,n}J_n(\sqrt{\rho_{k,n}}r).\] We have chosen to index these zeroes starting at \(k=1\), but this choice is somewhat arbitrary, and just indicates that the first zero of \(J_n(x)\) is not in general \(x=0\) itself (though it will be for \(n>0\)). These values of \(\rho_{k,n}\) do not have closed-form expressions, and must be solved for numerically or, in special cases such as \(k \to \infty\), using analytic approximations. We will not pursue this for this course.
Finally we can write our solution to the linear problem governing perturbations [Petri-US-Linear] in general as, \[u_1(t,r,\theta) = \sum_{n=0}^\infty \sum_{k=1}^\infty\exp\left(\lambda_{k,n} t \right)J_n\left(\sqrt{\rho_{k,n}}r \right)\left(A_{k,n}\cos(n\theta) + B_{k,n}\sin(n\theta) \right),\] where we can update the growth rate relationship [Disp-US-Bessel-US-Scalar] to incorporate these two indices as, \[\lambda_{k,n} = -\rho_{k,n} D + 1.\] So what can we say about stability? Recalling the results from 3.1.2, we found that there was a domain-size dependent switch between growth and extinction of the population with these harsh (Dirichlet) boundary conditions. Do we observe the same thing here?
Yes, in fact we do. Consider the smallest zero of the Bessel function \(J_n(x)\) shown in 5.3(a). This zero occurs for \(n=0\). If we let \(x_0\) be the first such zero, that is \(J_0(x_0)=0\), then we have that \(\sqrt{\rho_{1,0}}L_r = x_0\) determines the value of the smallest spatial eigenvalue. So \(\rho_{1,0} = (x_0/L_r)^2\). We then have by the growth rate given above that if, \[L_r < \sqrt{D}x_0 \implies \lambda_{1,0}<0,\] then the extinction state will be stable, as all other spatial eigenvalues are larger (and hence all other growth rates are smaller). While the notation is messier, this is exactly what happened in 3.1.2, and qualitatively the result is identical – compare this to [harsh-US-environment-US-growth] noting that here we have taken \(r=1\). The main difference is that the critical domain size no longer has a nice analytic expression, and instead corresponds to the first zero of the Bessel function \(J_0(x)\).
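Since the zeroes of \(J_0\) are tabulated in standard libraries, the critical domain size is easy to compute. A minimal sketch, assuming an illustrative value of the scaled diffusion coefficient \(D\):

```python
# Sketch: critical Petri-dish radius below which extinction is stable.
from scipy.special import jn_zeros
import numpy as np

D = 0.01                   # illustrative scaled diffusion coefficient
x0 = jn_zeros(0, 1)[0]     # first positive zero of J_0, about 2.4048
L_crit = np.sqrt(D) * x0   # lambda_{1,0} < 0 whenever L_r < L_crit
print(x0, L_crit)
```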
Now that we have developed some ideas for how to deal with eigenfunctions of the Laplacian in polar coordinates, we take a look at a specific example of pattern formation motivated by the growth of hairs in a ‘whorl’ of Acetabularia. We will follow the ideas in Murray’s Mathematical Biology, volume 2, section 3.4. Take a look there for further details and references to the mathematical and biological literature.
Acetabularia is a genus of green marine algae, wherein all species are unicellular organisms, most of which are able to regenerate (see 5.4(a) for an example of the species Acetabularia ryukyuensis). These organisms have been the subject of numerous lab experiments designed to quantitatively assess this growth process. Of particular interest in this chapter is the fact that the growth of the whorl hairs, which eventually lead to the formation of the cap (see 5.4(b)), can be shown to be mediated by the presence of calcium in the surrounding fluid. The stalk is hollow and has an annular cross-section (5.5). The calcium enters through the outer wall; the inner wall is impermeable. By amputating the stalk and then charting its re-growth in the presence of various levels of calcium, the following experimental findings were established:
The number of hairs produced is proportional to the radius of the (hollow) stalk (assuming all other experimental conditions are fixed). Hence, there is a fixed wavelength between successive hairs, consistent with a Turing-type mechanism.
The growth of the tip is seen to only occur for a finite range of calcium concentrations in the surrounding fluid, i.e. growth is halted if the levels are too high or too low (see 5.7(a)).
So we have the spontaneous formation of a pattern (the hairs are evenly spaced around the axis of the stalk). Further, it only occurs in limited scenarios, in the presence of some tuneable parameter (the calcium concentration). This suggests we could model it via the Turing mechanism, with calcium playing the role of \(v\), and \(u\) representing a morphogen triggering hair growth.24 This proposed model was originally developed in 1985 at a meeting on the subject, and can explain the two key phenomena listed above.
We consider our domain to be the annulus, with polar coordinates \(r \in [R_i, R_o]\) and \(\theta \in [0, 2\pi)\). We assume the interaction equations take the following form \[\begin{align} \label{Acetabularia-US-Model} \mathchoice{\frac{\partial u}{\partial t}}{\partial u/\partial t}{\partial u/\partial t}{\partial u/\partial t} &= \nabla^2 u+ (a-u+u^2v),\\ \nonumber\mathchoice{\frac{\partial v}{\partial t}}{\partial v/\partial t}{\partial v/\partial t}{\partial v/\partial t} &= D\nabla^2v+(b-u^2v). \end{align}\] This is the Schnakenberg reaction system we considered in 4.6, but now posed on an annular domain with inner and outer radii \(R_i\) and \(R_o\), respectively. The inner boundary’s impermeability is modelled by applying no-flux (Neumann) conditions there, i.e. \[\left.\mathchoice{\frac{\partial u}{\partial r}}{\partial u/\partial r}{\partial u/\partial r}{\partial u/\partial r}\right|_{r= R_i} = \left.\mathchoice{\frac{\partial v}{\partial r}}{\partial v/\partial r}{\partial v/\partial r}{\partial v/\partial r}\right|_{r= R_i} = 0.\] Realistically we should allow flux of the calcium \(v\) on the outer boundary. As an approximation (as inhomogeneous boundary conditions make the problem substantially harder), we note that the annular width is small compared to the radius of cross-section, so we model the calcium source through the term \(b\) and use zero-flux conditions on the outer boundary as well. These are given by \[\left.\mathchoice{\frac{\partial u}{\partial r}}{\partial u/\partial r}{\partial u/\partial r}{\partial u/\partial r}\right|_{r= R_o} = \left.\mathchoice{\frac{\partial v}{\partial r}}{\partial v/\partial r}{\partial v/\partial r}{\partial v/\partial r}\right|_{r= R_o} = 0.\] The parameter \(a\) will represent a basal (that is, in the background and not part of our model) production of the morphogen \(u\). As in the last chapter, we have nondimensionalised the timescale such that the parameter \(D\) represents the ratio of the two diffusion coefficients.
We will now pursue the Turing analysis developed in the previous chapter, with the main novel difficulty being the non-Cartesian domain.
The homogeneous equilibrium for this system is \[u_0 = a+b,\quad v_0 = \frac{b}{(a+b)^2},\] as in 4.6. Since we are using Neumann boundary conditions, there are no difficulties using this as an equilibrium for the full PDE system.
We assume linearised solutions in the form \(u \approx u_0 + \varepsilon u_1\) and \(v \approx v_0 + \varepsilon v_1\). To order \(\varepsilon\) we have \[\begin{aligned} \mathchoice{\frac{\partial u_1}{\partial t}}{\partial u_1/\partial t}{\partial u_1/\partial t}{\partial u_1/\partial t}= \nabla^2 u_1+ (F_u u_1+ F_v v_1),\\ \mathchoice{\frac{\partial v_1}{\partial t}}{\partial v_1/\partial t}{\partial v_1/\partial t}{\partial v_1/\partial t} =D\nabla^2 v_1+ (G_u u_1+ G_v v_1) \end{aligned}\] with \(F_u = 2 u_0 v_0-1\), \(F_v=(u_0)^2\), \(G_u=-2u_0v_0\), \(G_v= -(u_0)^2\) as the elements of the Jacobian matrix evaluated at the steady state.
Here is where the effect of changing the domain shape from a Cartesian system manifests. If we assume the solutions are in the form \(\mathrm{e}^{\lambda_k t}w_k(r,\theta)\), then we can write this in the form \[\begin{equation} \label{linearised} \lambda\left(\begin{array}{c}u_1\\v_1\end{array}\right)= \mathsfbfit{M}\left(\begin{array}{c}u_1\\v_1\end{array}\right),\quad \mathsfbfit{M} = \begin{pmatrix}-\rho_k+ (2u_0v_0-1) & (u_0)^2 \\ -2u_0v_0 & -D \rho_k-(u_0)^2 \end{pmatrix}, \end{equation}\] if \(w_k\) satisfies the following eigenvalue problem, \[\begin{equation} \label{eigeneq} \nabla^2w_k + \rho_k w_k = 0. \end{equation}\] This is exactly the same problem as in the Cartesian Turing analysis we performed in 4, with the eigenvalue problem for \(\rho_k\) being the same as in 5.2, with two important differences. Firstly, the boundary condition in the radial direction is now Neumann, rather than Dirichlet. Secondly, as we are working on an annulus and \(r \geq R_i > 0\), we must retain the Bessel function of the second kind, \(Y_n(x)\).
The radial component of the eigenfunction will again be of the form, \[R_n(r) = A_{k,n}J_n(\sqrt{\rho_k}r)+B_{k,n}Y_n(\sqrt{\rho_k}r),\] though we no longer have \(B_{k,n} = 0\), and the values for \(\rho_{k,n}\) will be different as these must satisfy new boundary conditions. The equations determining these eigenvalues are, \[\begin{align} \label{R-US-Sys-US-BCs} R_n'(R_i)&=\sqrt{\rho_{k,n}}(A_{k,n}J_n'(\sqrt{\rho_{k,n}}R_i)+B_{k,n}Y_n'(\sqrt{\rho_{k,n}}R_i)) = 0,\\ R_n'(R_o)&=\sqrt{\rho_{k,n}}(A_{k,n}J_n'(\sqrt{\rho_{k,n}}R_o)+B_{k,n}Y_n'(\sqrt{\rho_{k,n}}R_o)) = 0. \end{align}\] We will not discuss solving these equations for \(\rho_{k,n}\) analytically, as they typically must be solved numerically. As in the simpler case before, we will note that infinitely many solutions to these equations exist, and they can be ordered. This motivates our subscript notation including both \(n\) and \(k\), although we should now consider what values \(k\) can take, and in particular whether a homogeneous (zero) mode is permitted.
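Numerically, finding the \(\rho_{k,n}\) is straightforward: a non-trivial pair \((A_{k,n},B_{k,n})\) exists exactly when the determinant of the two conditions in [R-US-Sys-US-BCs] vanishes. A sketch, with illustrative radii and a single fixed \(n\):

```python
# Sketch: locating the annular Neumann eigenvalues rho_{k,n}. A non-trivial
# (A, B) exists when the determinant of the boundary conditions vanishes.
# The radii and the value of n are illustrative assumptions.
import numpy as np
from scipy.special import jvp, yvp
from scipy.optimize import brentq

Ri, Ro, n = 1.0, 1.5, 5

def det_bc(s):
    # s = sqrt(rho); jvp/yvp are the derivatives J_n' and Y_n'
    return jvp(n, s * Ri) * yvp(n, s * Ro) - jvp(n, s * Ro) * yvp(n, s * Ri)

# bracket sign changes on a coarse grid, then refine each root
s = np.linspace(0.1, 40.0, 4000)
vals = det_bc(s)
roots = [brentq(det_bc, s[i], s[i + 1])
         for i in range(len(s) - 1) if vals[i] * vals[i + 1] < 0]
print([round(r**2, 3) for r in roots[:4]])  # the first few rho_{k,n}
```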
In the case of a Dirichlet condition in the radial direction, we argued that \(\rho_{1,0}=(x_0/L_r)^2\) was the smallest spatial eigenvalue, with \(x_0\) the first zero of \(J_0(x)\). Now that we have a Neumann condition, we can consider the possibility of a zero spatial eigenvalue, that is, \(\rho_{0,0}=0\). In this case we must take \(B_{0,0}=0\), as otherwise the function \(Y_0\) is unbounded, but we can satisfy the boundary conditions [R-US-Sys-US-BCs] as \(J_0'(0)=0\). This can be seen from taking the derivative of the series representation of \(J_0\) given in [J-US-Series], and evaluating it at \(x=0\).
Using this analysis, it can be shown that the range of permissible wavenumbers, as well as the mode with the largest growth rate, will scale with \((R_o-R_i)^2\) (see section 3.4 of volume 2 of Murray’s book for details). Hence, we can loosely think of the number of unstable modes as proportional to the size of the domain, as in the 1-D case. In other words, this is the idea that Turing instabilities will give rise to patterns with a fixed wavelength between the pattern elements. In 5.6 we give example simulations of pattern formation for annuli of different inner and outer radii to clearly demonstrate that this system gives a fixed-wavelength pattern. Parameters for this simulation satisfy the bounds given in 4.6.
Experimental data shown in 5.7(a) indicate that there is a finite range of calcium concentrations (modelled by the parameter \(b\) in our model) for which full hair growth (regeneration) can occur. We also see that, within this range, the hair spacing decreases as the calcium concentration increases, quickly at first and then more gradually. Further still, the amplitude of the pattern decreases as the concentration approaches either end of this range.
In order for patterns to form we must enforce all the Turing conditions. We have already established for this system that, for fixed \(D\), there is a range of values of \(b\) which satisfy the Turing conditions. We can, with some effort, parametrise the boundaries given in 4.5, which correspond to values of \((a,b)\) precisely where some growth rate \(\lambda_k=0\), and hence on one side we observe Turing instability, and on the other we observe stability of the spatially homogeneous state (or homogeneous instability, for the white region). We show an example of how this prediction compares for fixed \(a\) and \(D\) values in 5.7.
So, in summary, the model predicts that, for the right parameters:
There is some finite range \(b\in [b_\text{min},b_\text{max}]\) for which growth can occur.25
The spacing of individual pattern elements, which become the hairs, scales with the size of the domain.
This model is an early example of a reaction–diffusion pattern mechanism actually being able to explain measurable phenomena, a first step towards the goal of giving biologists confidence in the mechanism.
Choosing our linear perturbations \(u_1,v_1\) to take the form \(\mathrm{e}^{\lambda_k t}w_k\), where \(w_k\) is an eigenfunction of the Laplacian, is the general means by which the Turing analysis carries over to non-Cartesian domains. This approach is not specific to this annular case, but can be generalised to a variety of different spatial domains. It means the linear matrix \(\mathsfbfit{J}\) will always take the form \[\mathsfbfit{J} = \begin{pmatrix}-\rho_k + F_u & F_v \\ G_u & -D\rho_k + G_v \end{pmatrix}.\] Then the change in the analysis takes two forms. First, the pattern will not look the same. In a Cartesian coordinate system \([0,L_1]\times[0,L_2]\), with coordinates \((x_1,x_2)\), we would have to solve the problem \[\mathchoice{\frac{\partial^2 \psi}{\partial x_1^2}}{\partial^2 \psi/\partial x_1^2}{\partial^2 \psi/\partial x_1^2}{\partial^2 \psi/\partial x_1^2} + \mathchoice{\frac{\partial^2 \psi}{\partial x_2^2}}{\partial^2 \psi/\partial x_2^2}{\partial^2 \psi/\partial x_2^2}{\partial^2 \psi/\partial x_2^2} + k^2\psi =0,\] whose solutions are complex exponentials. In this annular coordinate system we must solve \[\mathchoice{\frac{\partial^2 w_k}{\partial r^2}}{\partial^2 w_k/\partial r^2}{\partial^2 w_k/\partial r^2}{\partial^2 w_k/\partial r^2} +\frac{1}{r}\mathchoice{\frac{\partial w_k}{\partial r}}{\partial w_k/\partial r}{\partial w_k/\partial r}{\partial w_k/\partial r} + \frac{1}{r^2}\mathchoice{\frac{\partial^2 w_k}{\partial\theta^2}}{\partial^2 w_k/\partial\theta^2}{\partial^2 w_k/\partial\theta^2}{\partial^2 w_k/\partial\theta^2} +\rho_kw_k =0,\] which as we have seen has much more complicated analytical solutions.
This means the mathematical form of the patterns will not in general be the usual Fourier series, but instead a generalised Fourier series, sometimes called a Galerkin expansion. In your assignment you will consider a 3D cylindrical domain. In any case you will be given the form of the Laplacian; your aim will be to solve it, or to describe what kinds of solutions it admits. The basic mechanism for doing so will always be the same.
Assume a separable solution, i.e. if there are coordinates \(x_1,x_2,\dots x_n\) (in our annular case \((x_1=r, x_2=\theta)\) for example), then assume \[w_k(x_1,\dots x_n) = f_1(x_1)f_2(x_2)\dots f_n(x_n).\]
Substitute this into the Laplacian; you will get a form which, when divided by \(w_k\), separates out into individual ODEs for each function \(f_i\). You must then solve, or state the solution to, these ODEs. Importantly, the boundary conditions will impact both the kinds of solutions obtained and the spatial eigenvalue \(\rho_k\).
For any class of boundary-value problem (that is, ODE with boundary conditions) which you have not seen before, you will not be expected to solve for the eigenfunctions explicitly. Rather, you will need to be able to put together a full solution by quoting, e.g., properties of the Bessel functions described above. Question 2 on Problem Sheet 6 gives an example of this.
The methods described here apply to any geometry where the boundary conditions are separable. This means that you can write them essentially in terms of one of the separated functions \(f_i(x_i)\), as we have done in Cartesian and now polar geometries. However, the ansatz actually works more generally, and can be extended to even quite complicated domains (such as the surface of manifolds, or to discrete geometries such as networks). The difficulty then is estimating eigenvalues of the Laplacian in these domains, though often there are certain tricks, analytical and numerical, that can be employed.
How common are ‘non-separable’ geometries? Really, almost all shapes will be non-separable, and even for geometries where we can do this analysis in principle, solving the boundary value problems can be very difficult. It is an open question, for example, how to compute the eigenfunctions \(w_k\) of the Laplacian posed on the regular hexagon – it can be done on the square and the triangle, but the hexagon is exceedingly complicated by comparison!
In the last chapter we discussed how in principle the Turing analysis in one spatial dimension captures more complex geometries in higher dimensions, subject to solving the Helmholtz equation for the spatial eigenvalues \(\rho_k\). In this chapter we will generalise the analysis to systems of equations which have terms that correspond to spatial transport beyond purely Fickian diffusion. We will separate this into two sub-topics: cross-diffusion (including models of chemotaxis and other directed motion), and nonlocal spatial transport. The first will involve essentially the Laplacian and its eigenfunctions, but we will no longer consider the diffusion matrix \(\mathsfbfit{D}\) to be diagonal. In the second we will discuss spatial operators more general than the Laplacian, particularly those of order higher than two.
As in the previous chapter, the mathematical ideas here build on all of the linear instability analysis we have done. So while things will get more technical, I hope it is at least helpful to see the same ideas once again in a slightly different context. The process is always the same since the first chapter of this term: find equilibria, linearise the model, and then solve the linear model to understand how perturbations of the equilibria grow or decay, depending on the parameters. The rest are details, which are important, but just remember the big picture of the forest if you get lost in the trees.
Fickian diffusion is based around the idea of small random movements. For chemicals, this represents random molecular movement essentially due to collisions of individual molecules26. For larger things we have modelled, such as predators and prey, this kind of Fickian diffusion also models a kind of random walking movement. But, despite how you might describe fellow commuters on a busy street, most organisms do not move primarily via random movements in arbitrary directions. Instead, they move according to signals internal or external to themselves, directing them in particular directions.
The word taxis describes the movement of an organism in response to a stimulus. One of the most important and well-studied examples is chemotaxis, where an organism follows a chemical trail by going in the direction of increasing gradients of this chemical. But many other kinds of directed motion exist, such as plants moving for optimal light absorption via phototaxis, or bacteria moving along magnetic fields via magnetotaxis. Other examples of directed motion include predators chasing prey (and the prey fleeing), and birds flying annually to more suitable climates.
We will first explore a model of snake skin pattern formation where cell movement has been theorised to play a role (and where some experimental evidence indicates that chemotactic movement of cells forms the skin patterns). From this analysis we will show how the same kind of approach applies to general transport systems.
Relevant reading: Murray book I, chapter 11.4; and book II, chapter 4.11.
Snake skins can exhibit a wide variety of spotted and striped patterns, both along the length of the snake and around its cross-section. While some species exhibit almost identical markings, snakes within the same family can also exhibit both lengthwise and widthwise stripes, or mixtures of skin patterns; see 6.1 for some examples.
Murray talks at some length about the biology of alligator stripe patterns in chapter 4, noting that there has been much experimental work on this mechanism. As Murray relates in section 4.11, there is some evidence that a similar mechanism could be at work in snake skin patterning. The basic assumptions of such a mechanism are:
The pattern is fixed in the dermis, below the outer epidermis, which is what we see. This is important as epidermal cells do not move much, whereas those in the dermis are much more motile.
Pigment producing cells in this region exhibit phenotypes leading to different coloured regions, and this pigment production is dependent on some chemical (the ‘morphogen’) reaching a critical concentration.
The motion of these cells, and hence their local concentration, proceeds by a chemotactic mechanism.
Given these assumptions, we now write down and analyse a model of chemotaxis-driven pattern formation.
We let \(\widetilde{n}\) denote the density of pigment producing cells and \(\widetilde{c}\) the concentration of a chemoattractant. We can model such a scenario by the following reaction–diffusion-type system: \[\begin{align} \label{unscaledsnake} \mathchoice{\frac{\partial\widetilde{n}}{\partial\widetilde{t}}}{\partial\widetilde{n}/\partial\widetilde{t}}{\partial\widetilde{n}/\partial\widetilde{t}}{\partial\widetilde{n}/\partial\widetilde{t}} &= D_{\widetilde{n}}\nabla^2\widetilde{n} -\bm{\nabla}\cdot\left(g(\widetilde{n},\widetilde{c})\bm{\nabla}\widetilde{c}\right) + f(\widetilde{n}),\\ \mathchoice{\frac{\partial\widetilde{c}}{\partial\widetilde{t}}}{\partial\widetilde{c}/\partial\widetilde{t}}{\partial\widetilde{c}/\partial\widetilde{t}}{\partial\widetilde{c}/\partial\widetilde{t}} &= D_{\widetilde{c}}\nabla^2\widetilde{c} + h(\widetilde{n},\widetilde{c}) - \gamma \widetilde{c}. \end{align}\] The diffusion terms and coefficients have their usual meaning, whereas the new term involving the divergence of \(g(\widetilde{n},\widetilde{c})\bm{\nabla}\widetilde{c}\) represents a directed motion towards increasing values of \(\widetilde{c}\) at a rate proportional to \(g(\widetilde{n},\widetilde{c})\). The function \(f\) represents cell population dynamics (i.e. growth and death of the cells), \(h\) models the production of the chemoattractant \(\widetilde{c}\), and the term \(-\gamma \widetilde{c}\) is the natural degradation of the chemoattractant. All parameters are considered positive.
The new advection term can be interpreted as a bias in the flux of cells. Recalling the idea of a flux from last term, we can write the equation for \(\widetilde{n}\) as, \[\frac{\partial\widetilde{n}}{\partial\widetilde{t}} = -\nabla \cdot \mathbfit{J}_n + f(\widetilde{n}), \quad \mathbfit{J}_n = -D_{\widetilde{n}}\bm{\nabla}\widetilde{n}+ g(\widetilde{n},\widetilde{c})\bm{\nabla}\widetilde{c},\] where the first term in \(\mathbfit{J}_n\) is the usual Fickian flux, and the second can be thought of as a bias along gradients of the chemoattractant \(\widetilde{c}\). Assuming \(g>0\), we see that these two flux terms have opposite signs: the Fickian term moves cells away from regions of high cell density, whereas the chemotactic flux moves cells towards regions of high chemoattractant concentration. This description also allows us to see that if we specify Neumann conditions on both \(\widetilde{n}\) and \(\widetilde{c}\), we automatically satisfy no-flux conditions for both species, even though the flux in the cell density is not just proportional to gradients of cell density.
While we can perform a linear instability analysis of this general model, we instead consider specific forms of these functions by taking, \[f(\widetilde{n}) = r \widetilde{n}\left (1-\frac{\widetilde{n}}{K}\right ), \quad g(\widetilde{n},\widetilde{c}) = A \widetilde{n}, \quad h(\widetilde{n},\widetilde{c}) = S\widetilde{n}.\] Here \(A\) is a parameter which alters the strength of the chemotaxis effect, \(r\) the cell growth rate, \(K\) the carrying capacity of the cells, and \(S\) the secretion rate of the chemoattractant by the cells.
We use the following scalings to nondimensionalise the system \[\widetilde{n} = Kn, \quad \widetilde{c} = \frac{SKc}{r}, \quad \widetilde{t} = \frac{t}{r}.\] Substituting the above functional forms of \(f\), \(g\) and \(h\) into [unscaledsnake] and then applying these scalings, we obtain \[\begin{align} \label{scaledsnake} \frac{\partial{n}}{\partial{t}} &= D_n\nabla^2{n} -\alpha \bm{\nabla}\cdot\left(n\bm{\nabla}c\right) + n(1-n),\\ \frac{\partial c}{\partial t} &= D_c\nabla^2c + n - Gc, \end{align}\] where \(D_n=D_{\widetilde{n}}/r\), \(D_c=D_{\widetilde{c}}/r\), \(\alpha = SKA/r^2\), and \(G = \gamma/r\) are all positive parameters.
The developing embryo of a snake is a coiled cylindrical shape. We will perform the Turing-type linear instability analysis thinking of a general domain, and not worry about the particular form of the eigenfunctions. Rather, we only assume that the boundary conditions are such that the spatial eigenvalues \(\rho_k\), which satisfy, \[\begin{equation} \label{eigen-US-snake} \nabla^2 w_k + \rho_k w_k = 0, \end{equation}\] are non-negative. That is, we assume \(\rho_0=0\) and \(\rho_k>0\) for \(k>0\). You might ask: why are eigenfunctions of the Laplacian important, when the spatial operators in [scaledsnake] are more complicated? As we will see, after linearising, the model will always reduce to a linear system involving only Laplacians as the spatial derivatives.
The system admits two homogeneous equilibria. We can find these by setting all time and space derivatives to zero to find, \[0 = n_0(1-n_0), \quad 0=n_0 - Gc_0,\] which has the solutions \((n_0, c_0) = (0,0)\) or \((1,1/G)\). As we have seen on Problem Sheet 3, the first of these would lead to non-physical perturbations, and so we only consider perturbations around the second of these.
Following the usual procedure, we can substitute \(n = n_0 + \varepsilon N\), \(c = c_0 + \varepsilon C\) into [scaledsnake] to find a linear system of equations. The first of these involves a nonlinear advection term so we will go through this part in detail. We have, \[\begin{equation} \label{chemo-US-manip-US-1} \mathchoice{\frac{\partial}{\partial t}}{\partial/\partial t}{\partial/\partial t}{\partial/\partial t}(n_0 + \varepsilon N) = D_n\nabla^2(n_0 + \varepsilon N) -\alpha \bm{\nabla}\cdot\left((n_0 + \varepsilon N)\bm{\nabla}(c_0 + \varepsilon C)\right) + (n_0 + \varepsilon N)(1-(n_0 + \varepsilon N)). \end{equation}\] Derivatives of \(n_0\) and \(c_0\) are zero, so we can simplify this equation. We also apply the product rule on the term involving \(\alpha\) to find: \[\begin{equation} \label{chemo-US-manip-US-2} \varepsilon \mathchoice{\frac{\partial N}{\partial t}}{\partial N/\partial t}{\partial N/\partial t}{\partial N/\partial t} = \varepsilon D_n\nabla^2 N -\varepsilon \alpha n_0 \nabla^2 C-\varepsilon^2 \alpha \bm{\nabla}N \cdot \bm{\nabla}C + n_0(1-n_0) + \varepsilon (1-2n_0)N - \varepsilon^2 N^2. \end{equation}\] The terms of order \(\varepsilon^2\) will be negligible for a linear analysis, so we will drop these. We also have that \(n_0(1-n_0)=0\) at a homogeneous steady state. Dividing by \(\varepsilon\) and writing the equation for \(C\) (which we note is the same as the equation for \(c\), as it was already a linear equation), we have the linearised reaction–diffusion system, \[\begin{aligned} \mathchoice{\frac{\partial N}{\partial t}}{\partial N/\partial t}{\partial N/\partial t}{\partial N/\partial t} &= D_n\nabla^2 N - \alpha n_0 \nabla^2 C + (1-2n_0)N, \\ \mathchoice{\frac{\partial C}{\partial t}}{\partial C/\partial t}{\partial C/\partial t}{\partial C/\partial t} &= D_c\nabla^2C + N - GC. \end{aligned}\] As in chapter 3, we can write this in vector notation as, \[\begin{equation} \label{linear-US-snake} \mathchoice{\frac{\partial}{\partial t}}{\partial/\partial t}{\partial/\partial t}{\partial/\partial t}\begin{pmatrix} N\\C \end{pmatrix} = \underbrace{\begin{pmatrix} D_n & -\alpha n_0 \\ 0 & D_c \end{pmatrix}}_{\mathsfbfit{D}}\nabla^2\begin{pmatrix} N\\C \end{pmatrix} + \underbrace{\begin{pmatrix} 1-2n_0 & 0 \\ 1 & -G \end{pmatrix}}_{\mathsfbfit{J}}\begin{pmatrix} N\\C \end{pmatrix}. \end{equation}\]
We note that [linear-US-snake] is similar to [linear-US-turing], except that here \(\mathsfbfit{J}\) does not have the sign structure required of a Turing-unstable Jacobian. However, we also see that \(\mathsfbfit{D}\) is not diagonal, and it is this feature which will allow the system to admit pattern-forming instabilities. We also note that the steady state \((1,1/G)\) is stable in the absence of diffusion, as the Jacobian evaluated at this state satisfies \(\det(\mathsfbfit{J}) = G>0\) and \(\operatorname{tr}(\mathsfbfit{J}) = -(1+G)<0\). Henceforth we will take \(n_0=1\) and only consider this steady state.
To solve the linear system given by [linear-US-snake], we again assume the usual ansatz that \((N,C) \propto \mathrm{e}^{\lambda_k t}w_k(\mathbfit{x})\). Substituting this in, we find that the growth rates \(\lambda_k\) are eigenvalues of the matrix, \[\mathsfbfit{M} = -\rho_k \mathsfbfit{D} + \mathsfbfit{J} = \begin{pmatrix} -(1+\rho_k D_n) & \rho_k \alpha \\ 1 & -(G+\rho_k D_c) \end{pmatrix}.\] In general, for a fixed geometry and hence fixed sequence of \(\rho_k\), we must compute the growth rates explicitly to see if any give rise to instability. However, as in chapter 3, we can think of a sufficiently large domain so that \(\rho_k\) approximates any continuous value, and then derive conditions on the parameters which are necessary for instability. As in the case of reaction–diffusion systems, we see that \(\operatorname{tr}(\mathsfbfit{M}) < \operatorname{tr}(\mathsfbfit{J}) < 0\), and hence stability cannot be lost this way.
Next let’s consider the determinant condition, which we write as, \[h(\rho_k) = \det(\mathsfbfit{M}) = (1+\rho_k D_n)(G+\rho_k D_c) - \rho_k \alpha = \rho_k^2 D_c D_n + \rho_k(D_c + G D_n - \alpha) + G.\] Compare this equation to the reaction–diffusion case given in [h-US-eq]. As in that case, we have that \(\alpha > D_c + GD_n\) is a necessary condition for instability, but not a sufficient one. We can find the minimum value of \(h\) by taking the derivative with respect to \(\rho_k\) to find, \[\mathchoice{\frac{{\mathrm d}h}{{\mathrm d}\rho_k}}{{\mathrm d}h/{\mathrm d}\rho_k}{{\mathrm d}h/{\mathrm d}\rho_k}{{\mathrm d}h/{\mathrm d}\rho_k} = 2\rho_k^* D_c D_n + (D_c + G D_n - \alpha) = 0 \implies \rho_k^* = \frac{\alpha -D_c - GD_n}{2D_c D_n}.\] Substituting this into \(h\) and rearranging, we find a second condition for instability to be, \[h(\rho_k^*)<0 \implies 4G D_c D_n < (\alpha -D_c - GD_n)^2.\] As we have taken especially simple functional forms, these two conditions seem fairly straightforward. In particular, we see that for any fixed positive values of \(D_c\), \(D_n\), and \(G\), we can take \(\alpha\) sufficiently large to satisfy these conditions.
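To see these conditions in action numerically, the following short sketch (all parameter values are illustrative assumptions, not from the notes) verifies both conditions and confirms a positive growth rate for the linearised chemotaxis model:

```python
import numpy as np

# Illustrative parameter values, not taken from the notes
Dn, Dc, G, alpha = 0.25, 1.0, 2.0, 6.0

print("condition 1:", alpha > Dc + G*Dn)                    # True
print("condition 2:", 4*G*Dc*Dn < (alpha - Dc - G*Dn)**2)   # True

# Minimiser of h and its value; h(rho*) < 0 signals instability
h = lambda rho: Dn*Dc*rho**2 + (Dc + G*Dn - alpha)*rho + G
rho_star = (alpha - Dc - G*Dn) / (2*Dc*Dn)
print("rho* =", rho_star, " h(rho*) =", h(rho_star))        # 9.0, -18.25

# Largest growth rate Re(lambda_k), treating rho as a continuum
def growth(rho):
    M = np.array([[-(1 + rho*Dn), rho*alpha],
                  [1.0, -(G + rho*Dc)]])
    return np.linalg.eigvals(M).real.max()

print("max Re(lambda):", max(growth(r) for r in np.linspace(0, 18, 400)))
```

For these values \(\rho_k^* = 9\) with \(h(\rho_k^*) < 0\), so a band of spatial modes grows while the homogeneous mode remains stable.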
We give example simulations of [scaledsnake] in 6.2, showing how the domain size can influence the structure of the patterns. In particular we note that, as we have seen for reaction–diffusion simulations, the domain size scales the pattern in a reasonably obvious way (a larger domain fits more repetitions of the same intrinsic wavelength, so the pattern appears finer-scale relative to the domain).
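If you would like to reproduce simulations like these yourself, the following rough one-dimensional explicit finite-difference sketch of [scaledsnake] is a possible starting point. The grid, time step, and parameter values are illustrative assumptions only, and an implicit or stiff solver would be preferable for anything quantitative.

```python
import numpy as np

# Illustrative parameters satisfying the instability conditions above
Dn, Dc, G, alpha = 0.25, 1.0, 2.0, 6.0
L, M = 20.0, 200                    # domain length, number of grid intervals
dx = L / M
dt, T = 1e-4, 5.0                   # rough explicit time step and final time

rng = np.random.default_rng(0)
n = 1.0 + 1e-2*rng.standard_normal(M + 1)   # noisy perturbation of (1, 1/G)
c = np.full(M + 1, 1.0/G)

def lap(u):
    """Neumann (zero-flux) Laplacian via reflecting ghost points."""
    out = np.empty_like(u)
    out[1:-1] = (u[2:] - 2.0*u[1:-1] + u[:-2]) / dx**2
    out[0] = 2.0*(u[1] - u[0]) / dx**2
    out[-1] = 2.0*(u[-2] - u[-1]) / dx**2
    return out

def chemo(n, c):
    """Rough centred discretisation of d/dx( n dc/dx ) with no-flux ends."""
    flux = n * np.gradient(c, dx)
    flux[0] = flux[-1] = 0.0
    return np.gradient(flux, dx)

for _ in range(int(T/dt)):
    n, c = (n + dt*(Dn*lap(n) - alpha*chemo(n, c) + n*(1.0 - n)),
            c + dt*(Dc*lap(c) + n - G*c))

# A spread of n far beyond the initial 1% noise indicates a pattern
print("n range:", float(n.min()), float(n.max()))
```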
Briefly, we note that the above analysis generalises substantially to other kinds of directed transport. We consider the nonlinear cross-diffusion system, \[\begin{align} \frac{\partial u}{\partial t} &= \bm{\nabla}\cdot \left (d_1(u,v)\bm{\nabla}u + d_2(u,v) \bm{\nabla}v \right ) + F(u,v), \\ \frac{\partial v}{\partial t} &= \bm{\nabla}\cdot \left (d_3(u,v)\bm{\nabla}u + d_4(u,v) \bm{\nabla}v \right ) + G(u,v), \end{align}\] where the \(d_i\) and \(F\), \(G\) are given sufficiently smooth functions. (For systems like these, PDE-theory questions of existence and regularity of solutions become quite a bit harder, and blow-up phenomena are much more common. Here we will assume the functions are such that we can neglect these issues, but it is worth noting that one has to choose these nonlinear functions with some care.) We note that [scaledsnake] is a special case of this model with \(u\) playing the role of \(n\) and \(v\) the role of \(c\), and with \(d_1=D_n\), \(d_2=-\alpha u\), \(d_3=0\), and \(d_4=D_c\). As in the case of chemotaxis, if we impose Neumann conditions on \(u\) and \(v\), then this automatically gives no-flux conditions. This can be seen because the flux of \(u\), for example, is \(\mathbfit{J}_u = -\left(d_1(u,v)\bm{\nabla}u + d_2(u,v) \bm{\nabla}v\right)\). The converse, that no-flux implies Neumann conditions, is not in general true. However, it turns out that for all permissible values of \(u\) and \(v\), we need the matrix, \[\mathsfbfit{D} = \begin{pmatrix} d_1(u,v) & d_2(u,v) \\ d_3(u,v) & d_4(u,v) \end{pmatrix},\] to be positive-definite in order to guarantee existence of sufficiently smooth solutions27. One can then show that imposing no-flux conditions must imply Neumann conditions.
Homogeneous equilibria of this system satisfy \(F(u_0,v_0)=G(u_0,v_0)=0\). We can perform the usual perturbation analysis for linear stability by setting \(u = u_0 + \varepsilon u_1\), \(v = v_0 + \varepsilon v_1\). After some careful algebra, we find that the linear perturbations satisfy, \[\begin{align} \frac{\partial u_1}{\partial t} &= d_1(u_0,v_0)\nabla^2 u_1 + d_2(u_0,v_0) \nabla^2 v_1 + F_u(u_0,v_0)u_1+F_v(u_0,v_0)v_1, \\ \frac{\partial v_1}{\partial t} &= d_3(u_0,v_0)\nabla^2 u_1 + d_4(u_0,v_0) \nabla^2 v_1 + G_u(u_0,v_0)u_1+G_v(u_0,v_0)v_1. \end{align}\] As before, we can write this in vector form by putting \(\mathbfit{u}_1 = (u_1,v_1)\). We then have, \[\frac{\partial\mathbfit{u}_1}{\partial t} = \mathsfbfit{D}\nabla^2\mathbfit{u}_1 + \mathsfbfit{J}\mathbfit{u}_1, \quad \mathsfbfit{J} = \begin{pmatrix} F_u & F_v \\ G_u & G_v \end{pmatrix},\] where the functions in the matrices are evaluated at the homogeneous steady state, \((u_0,v_0)\). For simplicity we will omit this dependence in all of the partial derivatives of \(F\) and \(G\), and in the \(d_i\).
We can expand these perturbations as \(\mathbfit{u}_1 \propto \mathrm{e}^{\lambda_k t}w_k(\mathbfit{x})\), with the \(w_k\) again satisfying \(\nabla^2 w_k + \rho_k w_k = 0\). Substituting this in, we find that the growth rates \(\lambda_k\) are eigenvalues of the matrix, \[\mathsfbfit{M} = -\rho_k \mathsfbfit{D} + \mathsfbfit{J} = \begin{pmatrix} F_u-\rho_k d_1 & F_v-\rho_k d_2 \\ G_u-\rho_k d_3 & G_v-\rho_k d_4 \end{pmatrix}.\] We then have that the growth rates satisfy, \[\lambda_k^2 - \operatorname{tr}(\mathsfbfit{M})\lambda_k + \det(\mathsfbfit{M}) = 0, \quad \operatorname{tr}(\mathsfbfit{M}) = F_u+G_v-\rho_k(d_1+d_4),\] and we have, \[h(\rho_k) = \det(\mathsfbfit{M}) = \rho_k^2(d_1d_4-d_2d_3) + \rho_k(d_2G_u+d_3F_v - d_1G_v - d_4F_u) + F_uG_v - F_vG_u.\]
We will now derive conditions for pattern formation. We first assume that \(\mathsfbfit{J}\) has eigenvalues with negative real part, and hence we have, \[\operatorname{tr}(\mathsfbfit{J}) = F_u + G_v < 0,\quad \det(\mathsfbfit{J}) = F_uG_v - F_vG_u > 0.\] By assumption \(\mathsfbfit{D}\) is positive-definite, and hence it must have a positive trace and determinant. So we have, \[d_1+d_4 > 0, \quad d_1d_4-d_2d_3 > 0.\] From this last condition, the coefficient of \(\rho_k^2\) in \(h(\rho_k)\) is positive, and the constant term in this function is precisely \(\det(\mathsfbfit{J})\), which is positive. We also have that \(\operatorname{tr}(\mathsfbfit{M}) < \operatorname{tr}(\mathsfbfit{J}) < 0\), and hence must again focus our attention on values of \(\rho_k\) where \(h(\rho_k)<0\).
To find the minimum value of \(h\), we compute, \[\frac{{\mathrm d}h}{{\mathrm d}\rho_k} = 2(d_1d_4-d_2d_3)\rho_k^* + d_2G_u+d_3F_v - d_1G_v - d_4F_u = 0,\] and hence, \[\rho_k^* = \frac{d_1G_v + d_4F_u-d_2G_u-d_3F_v}{2(d_1d_4-d_2d_3)}.\] For this to be positive we must have, \[d_1G_v + d_4F_u > d_2G_u+d_3F_v,\] which is a generalisation of the third Turing condition given in [turingconditions] (in this notation, set \(d_1=1\), \(d_4=D\), and \(d_2=d_3=0\) to exactly recover the result in the reaction–diffusion case).
Substituting this critical value in and simplifying, we find \[h(\rho_k^*) = -\frac{(d_1G_v + d_4F_u-d_2G_u-d_3F_v)^2}{4(d_1d_4-d_2d_3)} + F_uG_v - F_vG_u.\] Hence for \(h(\rho_k^*)<0\) we must have, \[(d_1G_v + d_4F_u-d_2G_u-d_3F_v)^2-4(d_1d_4-d_2d_3)(F_uG_v - F_vG_u) > 0.\] This is precisely a generalisation of the fourth Turing condition given in [turingconditions]; to see this, again set \(d_1=1\), \(d_4=D\), and \(d_2=d_3=0\) to exactly recover the result in the reaction–diffusion case.
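These general conditions are easy to check programmatically. The following minimal sketch (the function name and parameter values are my own illustrative choices, not from the notes) tests all of the requirements above for a given diffusion matrix and Jacobian, and sanity-checks them against the chemotaxis model with \(n_0=1\):

```python
def cross_diffusion_instability(D, J):
    """Check the conditions derived above for a 2x2 cross-diffusion system.

    D = [[d1, d2], [d3, d4]] and J = [[Fu, Fv], [Gu, Gv]], both evaluated
    at the homogeneous steady state (u0, v0).
    """
    (d1, d2), (d3, d4) = D
    (Fu, Fv), (Gu, Gv) = J
    detD, detJ = d1*d4 - d2*d3, Fu*Gv - Fv*Gu
    stable_ode = (Fu + Gv < 0) and (detJ > 0)   # stability without transport
    pos_def = (d1 + d4 > 0) and (detD > 0)      # positive trace and determinant
    P = d1*Gv + d4*Fu - d2*Gu - d3*Fv           # -P is the rho coefficient in h
    return stable_ode and pos_def and P > 0 and P**2 > 4*detD*detJ

# Sanity check against the chemotaxis model above, with n0 = 1
Dn, Dc, G, alpha = 0.25, 1.0, 2.0, 6.0
D = [[Dn, -alpha], [0.0, Dc]]       # d2 = -alpha*n0
J = [[-1.0, 0.0], [1.0, -G]]        # Fu = 1 - 2*n0, Gv = -G
print(cross_diffusion_instability(D, J))   # True for these parameters
```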
On Problem Sheet 8 there is an example where you will use these conditions to find conditions for pattern formation in a predator–prey model with prey moving towards the predator. Examples of these kinds of ambush predators are given in 6.3.
You will not be examined on a detailed derivation of these general conditions, nor will you be required to memorise these. But you may be asked a question on a specific example, such as chemotaxis, which in principle you should be able to work through.
The following derivation of the higher-order PDE given by [biharmonic-US-PDE-US-pre] will itself be non-examinable. Everything from that equation onward is important for the exam, and is more similar to things we have seen before.
Relevant reading: Murray book I, chapter 11.5; and book II, chapters 6 and 12 (only a cursory look!).
So far in this course we have covered models which were either spatially homogeneous (ODEs), or which involved only the Laplacian or relatively simple nonlinear generalisations of it. There are many other model formulations one can use, especially to consider interactions in space. In this section, we will briefly look at models involving integrals of the unknown functions, in addition to partial derivatives. These are sometimes called integro-differential equations, and their general theory and behaviour can be quite a lot more exotic than the corresponding case of PDEs or ODEs. Nevertheless, in many useful contexts we can transform these equations into systems of PDEs, and this is the approach we will take in this section. For simplicity we will only consider models of a single species.
Consider the following equation posed on the spatial domain \(x \in \mathbb{R}\) for a population \(u(x,t)\), \[\begin{equation} \label{nonlocal-US-PDE} \mathchoice{\frac{\partial u}{\partial t}}{\partial u/\partial t}{\partial u/\partial t}{\partial u/\partial t} = \int_{-\infty}^{\infty} K(x,y) F(u(y,t)) \, {\mathrm d}y + G(u(x,t)), \end{equation}\] where \(F\) and \(G\) are suitably nice nonlinear functions, and \(K\) is known as a kernel function. You may have seen this terminology used for Green’s functions, which are the kernels of certain integral operators. The integral term here represents a general form of interaction between the population at a spatial point \(y\), and the population at a point \(x\), with the integration being a sum over all of the points which interact with \(x\). Generally we choose a kernel \(K\) to represent a model for how the population at different spatial points interact, and the function \(F\) for the form of the interaction.
We note that such equations can be posed on finite domains, or alternatively the kernel \(K\) can have compact support which also makes the integral reduce to one over a bounded domain. Compared to PDEs, such integral equations do not typically require any additional boundary conditions, although we still do require an initial condition due to the derivative in \(t\). One can develop a theory of steady states and linear stability analysis for these kinds of equations, though it becomes more technical in general. Instead, we will look at a particular class of kernels which can be reduced to PDEs with higher-order derivatives.
We can relate these nonlocal models to PDEs, like the reaction–diffusion equations we have seen, by choosing a particular form of the kernel \(K\) and interaction term \(F\) which models a kind of ‘local’ interaction. We set \(K(x,y) = K(x-y)=K(y-x)=K(|x-y|)\), so that the kernel is only a function of the distance between two points. We will also assume that this influence decays rapidly as the distance increases; that is, \(K(x-y) \to 0\) for \(|x-y| \gg 1\). An example of such a kernel is, \[\begin{equation} \label{example-US-kernel} K(|x-y|) = \mathrm{e}^{-s(x-y)^2}. \end{equation}\] In this case, especially for \(s \gg 1\), this function approaches a very sharp and narrow limit around the point \(x=y\) (increasingly approximating a \(\delta\)-function for large \(s\)), so that only nearby regions interact. For the interaction, we simply take the linear function \(F(u) = u\).
We will now use our assumptions on this kernel to perform a Taylor expansion of the integral term in [nonlocal-US-PDE]. Before we do this, we note some properties of this kind of kernel. By the assumed symmetry of the argument, we see that \(K(y)\) is an even function of \(y\). Hence we have, \[\begin{equation} \label{K-US-symmetry} \int_{-\infty}^{\infty}y^{2m+1}K(y)\,{\mathrm d}y = 0, \quad \textrm{for } m=0,1,2,3,\dots \end{equation}\] We also define the moments of the kernel (which are just numbers) by, \[K_{2m} = \frac{1}{(2m)!}\int_{-\infty}^{\infty}y^{2m}K(y)\, {\mathrm d}y, \quad \textrm{for } m=0,1,2,3,\dots\]
Now we are ready to expand the integral term in [nonlocal-US-PDE], under the assumption that the kernel decays sufficiently quickly (for example, \(s \gg 1\) in [example-US-kernel]). In this case, we can think of the quantity \(z = x-y\) as being ‘small’ and use a Taylor expansion. We have that the integral term looks like, \[\begin{align} \label{horrific-US-expansion} &\int_{-\infty}^{\infty} K(x-y) u(y,t) \, {\mathrm d}y = \int_{-\infty}^{\infty} K(z) u(x-z,t) \, {\mathrm d}z \\ &=\int_{-\infty}^{\infty} K(z) \left [u(x,t)-z\mathchoice{\frac{\partial u}{\partial x}}{\partial u/\partial x}{\partial u/\partial x}{\partial u/\partial x}(x,t)+\frac{z^2}{2}\mathchoice{\frac{\partial^2 u}{\partial x^2}}{\partial^2 u/\partial x^2}{\partial^2 u/\partial x^2}{\partial^2 u/\partial x^2}(x,t)-\frac{z^3}{6}\mathchoice{\frac{\partial^{3} u}{\partial x^{3}}}{\partial^{3} u/\partial x^{3}}{\partial^{3} u/\partial x^{3}}{\partial^{3} u/\partial x^{3}}(x,t) + \frac{z^4}{24}\mathchoice{\frac{\partial^{4} u}{\partial x^{4}}}{\partial^{4} u/\partial x^{4}}{\partial^{4} u/\partial x^{4}}{\partial^{4} u/\partial x^{4}}(x,t) +\cdots \right] {\mathrm d}z\\ &=u(x,t)\int_{-\infty}^{\infty} K(z) \, {\mathrm d}z-\mathchoice{\frac{\partial u}{\partial x}}{\partial u/\partial x}{\partial u/\partial x}{\partial u/\partial x}(x,t)\int_{-\infty}^{\infty} zK(z) \, {\mathrm d}z +\mathchoice{\frac{\partial^2 u}{\partial x^2}}{\partial^2 u/\partial x^2}{\partial^2 u/\partial x^2}{\partial^2 u/\partial x^2}(x,t)\int_{-\infty}^{\infty} \frac{z^2}{2}K(z) \, {\mathrm d}z+\cdots\\ &=u(x,t)K_0 +\mathchoice{\frac{\partial^2 u}{\partial x^2}}{\partial^2 u/\partial x^2}{\partial^2 u/\partial x^2}{\partial^2 u/\partial x^2}(x,t)K_2+\mathchoice{\frac{\partial^{4} u}{\partial x^{4}}}{\partial^{4} u/\partial x^{4}}{\partial^{4} u/\partial x^{4}}{\partial^{4} u/\partial x^{4}}(x,t)K_4+\mathchoice{\frac{\partial^{6} u}{\partial x^{6}}}{\partial^{6} u/\partial x^{6}}{\partial^{6} u/\partial x^{6}}{\partial^{6} u/\partial x^{6}}(x,t)K_6+\cdots, \end{align}\] where all of the odd terms vanish due to the symmetry of \(K\) (see [K-US-symmetry]). This expansion looks scary, but the point is just that if the integral kernel \(K\) is sufficiently smooth and decays quickly enough, then you can approximate the integral term by spatial derivatives.
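To convince yourself the truncation is sensible, the following numerical sketch (the test profile, kernel width, and use of scipy are my own illustrative choices) compares the full integral term with the three-term truncation for the smooth profile \(u = \sin x\), whose even derivatives are explicit:

```python
import numpy as np
from math import factorial
from scipy.integrate import quad

s = 4.0                                  # a fairly narrow Gaussian kernel
K = lambda z: np.exp(-s*z**2)

# Moments K_0, K_2, K_4 straight from the definition above
K2m = [quad(lambda z: z**(2*m)*K(z)/factorial(2*m), -np.inf, np.inf)[0]
       for m in range(3)]

x0 = 0.7                                 # sample evaluation point
u = np.sin                               # u'' = -u and u'''' = u

exact = quad(lambda y: K(x0 - y)*u(y), -np.inf, np.inf)[0]
approx = np.sin(x0)*(K2m[0] - K2m[1] + K2m[2])
print(exact, approx)                     # agree to about 4 decimal places
```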
We note this expansion only makes sense if the \(K_{2m}\) get smaller sufficiently quickly. For [example-US-kernel], we can compute (not trivial but not immensely hard): \[K_{2m} = \frac{\sqrt{\pi}}{\sqrt{s}(4s)^m m!}.\] These terms tend towards \(0\), and generally quite rapidly, though there are some technicalities in ensuring a convergent expansion for [horrific-US-expansion].28 We will not dwell on these, and instead study what happens when we truncate the series. To see how quickly these terms converge we consider \(s=1\), and compute that \(K_2 \approx 0.443\), \(K_4 \approx 0.055\), \(K_6 \approx 0.005\), \(K_8 \approx 0.0003\), etc. We note that the signs of these moments can be different, especially when negative interactions are modelled.
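As a quick numerical check of the closed-form moments (a sketch; the use of scipy quadrature here is my own choice), the following reproduces \(K_2 \approx 0.443\), \(K_4 \approx 0.055\), and so on for \(s=1\):

```python
import numpy as np
from math import factorial, sqrt, pi
from scipy.integrate import quad

s = 1.0   # width parameter of the Gaussian kernel exp(-s y^2)

for m in range(1, 5):
    integrand = lambda y, m=m: y**(2*m) * np.exp(-s*y**2) / factorial(2*m)
    val, _ = quad(integrand, -np.inf, np.inf)
    closed = sqrt(pi) / (sqrt(s) * (4*s)**m * factorial(m))
    print(f"K_{2*m}: quadrature {val:.6f}, closed form {closed:.6f}")
```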
Given the numerical evidence above, and for simplicity, we will truncate this expansion at the fourth order to demonstrate the impact of this kind of nonlocality on instabilities. Our equation then looks like, \[\begin{equation} \label{biharmonic-US-PDE-US-pre} \frac{\partial u}{\partial t} = uK_0 +K_2\frac{\partial^2 u}{\partial x^2}+K_4\frac{\partial^{4} u}{\partial x^{4}} + G(u). \end{equation}\] We can simplify this by writing \(H(u) = uK_0+G(u)\) to find, \[\begin{equation} \label{biharmonic-US-PDE} \frac{\partial u}{\partial t} = K_2\frac{\partial^2 u}{\partial x^2}+K_4\frac{\partial^{4} u}{\partial x^{4}} + H(u). \end{equation}\] We will study this equation assuming that it is posed on a finite spatial domain, \(x \in [0,L]\), in order to avoid some technicalities about eigenfunctions of the Laplacian defined on \(\mathbb{R}\). This can be made sense of by carrying out the above derivation on a bounded domain and taking the limit more carefully near the boundaries. It also helps explain why we think of higher-order derivatives as being less ‘local’ than lower-order ones.
We can impose any boundary conditions that make biological sense, but as the equation is now higher-order in space, we must have two boundary conditions at each endpoint. We will discuss different choices of these boundary conditions in the next subsection.
You have in fact seen an equation with higher-order spatial derivatives before, namely, the beam equation from Michaelmas. This equation had the form, \[\mathchoice{\frac{\partial^{4} w}{\partial s^{4}}}{\partial^{4} w/\partial s^{4}}{\partial^{4} w/\partial s^{4}}{\partial^{4} w/\partial s^{4}}+\frac{N}{EI}\mathchoice{\frac{\partial^2 w}{\partial s^2}}{\partial^2 w/\partial s^2}{\partial^2 w/\partial s^2}{\partial^2 w/\partial s^2}+\rho A\mathchoice{\frac{\partial^2 w}{\partial t^2}}{\partial^2 w/\partial t^2}{\partial^2 w/\partial t^2}{\partial^2 w/\partial t^2} = 0,\] (where \(w\) was the unknown displacement and \(s\) was the spatial variable). This is not quite the same form as [biharmonic-US-PDE], as the beam equation has a second order derivative in time, but you can in fact study it using the methods of this section by reducing it to a first-order system.
We can again study homogeneous equilibria of [biharmonic-US-PDE] by finding numbers \(u_0\) such that \(H(u_0)=0\), as these make the right-hand side of the equation vanish identically. Following the usual procedure of writing \(u = u_0 + \varepsilon u_1\) and substituting this into [biharmonic-US-PDE], we find that perturbations satisfy, \[\frac{\partial u_1}{\partial t} = K_2\frac{\partial^2 u_1}{\partial x^2}+K_4\frac{\partial^{4} u_1}{\partial x^{4}} + H'(u_0)u_1.\] If we perform a separation-of-variables solution, we can use an ansatz of the form \(u_1 = \mathrm{e}^{\lambda_k t}f(x)\) to find an eigenvalue equation for the function \(f\) of the form, \[\begin{equation} \label{eigen-US-biharmonic} \lambda_k f = K_2\frac{{\mathrm d}^2 f}{{\mathrm d}x^2}+K_4\frac{{\mathrm d}^{4} f}{{\mathrm d}x^{4}} + H'(u_0)f. \end{equation}\] This is a fourth-order ODE which can be solved to find four linearly independent solutions. Using the boundary conditions, as we did in the reaction–diffusion case, we can compute the value of the constant \(\lambda_k\). But this quickly becomes very tedious!
Alternatively, the following boundary conditions allow the use of a clever trick: \[\begin{equation} \label{biharmonic-US-BCs} \mathchoice{\frac{\partial u}{\partial x}}{\partial u/\partial x}{\partial u/\partial x}{\partial u/\partial x}(0,t) = 0, \quad \mathchoice{\frac{\partial u}{\partial x}}{\partial u/\partial x}{\partial u/\partial x}{\partial u/\partial x}(L,t) = 0, \quad \mathchoice{\frac{\partial^{3} u}{\partial x^{3}}}{\partial^{3} u/\partial x^{3}}{\partial^{3} u/\partial x^{3}}{\partial^{3} u/\partial x^{3}}(0,t) = 0, \quad \mathchoice{\frac{\partial^{3} u}{\partial x^{3}}}{\partial^{3} u/\partial x^{3}}{\partial^{3} u/\partial x^{3}}{\partial^{3} u/\partial x^{3}}(L,t) = 0. \end{equation}\] We let \(w_k(x) = \cos(k \pi x/L)\) be the usual Laplacian eigenfunctions satisfying Neumann boundary conditions, and recall \[\mathchoice{\frac{\partial^2 w_k}{\partial x^2}}{\partial^2 w_k/\partial x^2}{\partial^2 w_k/\partial x^2}{\partial^2 w_k/\partial x^2} + \rho_k w_k = 0,\] with \(\rho_k = (k\pi/L)^2\). We then note that these functions \(w_k(x)\) satisfy all of the boundary conditions given in [biharmonic-US-BCs], and hence we can use the ansatz29 \(u_1 = e^{\lambda_k t}w_k(x)\) to find, \[\begin{equation} \label{biharmonic-US-growth-US-rates} \lambda_k = -\rho_kK_2+\rho_k^2K_4 + H'(u_0). \end{equation}\]
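A quick symbolic check of this trick (the mode number here is an illustrative choice) confirms that \(w_k\) satisfies all four boundary conditions in [biharmonic-US-BCs] and reproduces the growth rate [biharmonic-US-growth-US-rates]:

```python
import sympy as sp

x, L, K2, K4, Hp = sp.symbols('x L K_2 K_4 Hprime')
k = sp.Integer(3)                   # sample mode number
w = sp.cos(k*sp.pi*x/L)             # Laplacian eigenfunction w_k
rho = (k*sp.pi/L)**2

# All four boundary conditions: w'(0) = w'(L) = w'''(0) = w'''(L) = 0
print([sp.diff(w, x, m).subs(x, pt) for m in (1, 3) for pt in (0, L)])

# Substituting u1 = exp(lambda_k t) w_k into the linearised PDE gives lambda_k
lam = sp.simplify((K2*sp.diff(w, x, 2) + K4*sp.diff(w, x, 4) + Hp*w) / w)
print(sp.simplify(lam - (-rho*K2 + rho**2*K4 + Hp)))   # 0
```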
If we assume, as in the Turing analysis from preceding chapters, that the system is stable in the absence of transport, then we must have \(\lambda_0 = H'(u_0)<0\) at the steady state. We also require that \(\operatorname{Re}(\lambda_k)<0\) as \(k \to \infty\), as otherwise we obtain instability of arbitrarily fine-scale patterns, which effectively break our assumptions of a spatial continuum. This implies that \(K_4<0\). Hence the only possibility for some finite range of \(k\) to lead to \(\operatorname{Re}(\lambda_k)>0\) is if \(K_2<0\).
Let’s consider an example interaction scheme where we observe spatial pattern formation in a relatively simple system. From the above analysis, we need \(K_2<0\) and \(K_4<0\). One kind of model which can, for the right parameters, admit these signs is a kernel with oscillations between activation and inhibition in space. That is, we choose a \(K\) such that nearby populations benefit one another, but ones farther away inhibit one another. See 6.4 for a sketch of such an interaction kernel. An example is given by the function, \[\begin{equation} K(x-y) = \mathrm{e}^{-(x-y)^2}(\cos(x-y)-\alpha). \end{equation}\]
This kernel has the moments, \[K_2 = \frac{\sqrt{\pi}}{8}\left(\mathrm{e}^{-\frac{1}{4}}-2\alpha\right), \quad K_4 = \frac{\sqrt{\pi}}{384}\left(\mathrm{e}^{-\frac{1}{4}}-12\alpha\right),\] which are both negative for \(2\alpha>\mathrm{e}^{-\frac{1}{4}}\).
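Again we can sanity-check these moments numerically; in the following sketch, \(\alpha = 0.8\) is an illustrative value satisfying \(2\alpha > \mathrm{e}^{-1/4}\):

```python
import numpy as np
from math import sqrt, pi, exp, factorial
from scipy.integrate import quad

alpha = 0.8   # illustrative value with 2*alpha > exp(-1/4), so K2, K4 < 0
K = lambda y: np.exp(-y**2) * (np.cos(y) - alpha)

K2 = quad(lambda y: y**2 * K(y), -np.inf, np.inf)[0] / factorial(2)
K4 = quad(lambda y: y**4 * K(y), -np.inf, np.inf)[0] / factorial(4)

print(K2, sqrt(pi)/8*(exp(-0.25) - 2*alpha))       # both ~ -0.182
print(K4, sqrt(pi)/384*(exp(-0.25) - 12*alpha))    # both ~ -0.041
```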
An important interpretation of such a structure is that the population \(u\) acts as its own activator nearby, but its own inhibitor far away. Hence it can exhibit the same kind of structures we observed in chapter 3 for reaction–diffusion systems, but in a single species. Such interactions arise in mechanical and neural systems, where cells interact over long distances via filopodia or mechanical signals, as well as in flocking and aggregation phenomena.
We consider a simple form of a mechanical interaction as an example of this kind of instability analysis. If we let \(u\) denote the deformation of a sheet of cells, such as the epithelium or outermost layer of your skin, we can consider such an interaction kernel with \(K_2 = -a <0\) and \(K_4 = -b<0\) (with \(a,b\) positive) to represent the influence of curvature on the cell layer itself. In the absence of curvature, we assume the cells want to relax to an undeformed state (that is, \(u_0=0\) is the only homogeneous equilibrium). A simple model of this would take \(H(u) = -u-u^3\). Our equation would then be, \[\begin{equation} \label{mechanical} \mathchoice{\frac{\partial u}{\partial t}}{\partial u/\partial t}{\partial u/\partial t}{\partial u/\partial t} = -a\mathchoice{\frac{\partial^2 u}{\partial x^2}}{\partial^2 u/\partial x^2}{\partial^2 u/\partial x^2}{\partial^2 u/\partial x^2}-b\mathchoice{\frac{\partial^{4} u}{\partial x^{4}}}{\partial^{4} u/\partial x^{4}}{\partial^{4} u/\partial x^{4}}{\partial^{4} u/\partial x^{4}} -u-u^3, \end{equation}\] and we see that linearising it is equivalent to just dropping the \(-u^3\) term. Hence by [biharmonic-US-growth-US-rates], we have \[\lambda_k = \rho_ka-\rho_k^2b -1.\]
This dispersion relation is quite simple, and it is easy to calculate, e.g., the range of unstable wavenumbers, and the fastest growing mode. We compute the range of unstable wavenumbers by solving \(\lambda_k=0\) as a function of \(\rho_k\) to find, \[\rho_k^\pm = \frac{a\pm \sqrt{a^2-4b}}{2b},\] from which we see that \(a^2 > 4b\) is necessary for instability (as otherwise the quadratic given by \(\lambda_k\) would never become positive). If this condition is satisfied, then the values of \(\rho_k\) for which \(\lambda_k=0\) are real and positive.
If we maximise \(\lambda_k\) as a function of \(\rho_k\) we find the fastest growing mode as, \[\lambda_k(\rho_k^*) =\lambda_k\left(\frac{a}{2b}\right) = \left(\frac{a^2}{2b}\right)-\left(\frac{a^2}{4b}\right) -1 = \frac{a^2}{4b}-1,\] which is positive as long as \(a^2 > 4b\). You can see simulations of this model in 6.5 for two different domain lengths, in order to see how the instability regimes scale with size. Importantly, as with reaction–diffusion systems, the linear analysis can approximately tell you what wavelength the pattern will be, but it cannot say whether spots or stripes form.
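Because the dispersion relation is explicit, it is easy to enumerate the unstable modes on a finite domain with the boundary conditions above. A minimal sketch, with illustrative values of \(a\), \(b\), and \(L\):

```python
import numpy as np

a, b, L = 3.0, 1.0, 20.0          # illustrative values with a**2 > 4*b

k = np.arange(0, 60)
rho = (k*np.pi/L)**2
lam = a*rho - b*rho**2 - 1.0      # the dispersion relation derived above

print("unstable modes k:", k[lam > 0])                  # k = 4, ..., 10 here
print("fastest mode k:", k[np.argmax(lam)], "lambda:", lam.max())

# Analytic unstable band in rho, and the continuum maximiser rho* = a/(2b)
disc = np.sqrt(a**2 - 4*b)
print("rho band:", (a - disc)/(2*b), (a + disc)/(2*b), " rho* =", a/(2*b))
```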
We’ve covered a lot of material in this course, from the biology of spruce budworms and chemotactic bacteria, to similarity solutions and instabilities of chemical and mechanical models. In this second half of the course we have narrowed our scope, focusing deeply on extending the tool of linear stability analysis, especially in the context of looking for pattern formation in our biological models. The core content of this term is covered in chapter 3, with chapters 1–2 being more preparatory, and chapters 4–5 extending the basic ideas.
As you revisit this material to revise for the exam, I would encourage you to step back and think about the big-picture perspective. What does a linear instability analysis tell you? What does it mean in terms of a given biological scenario, and what are the limitations of these techniques (both in terms of the simplifications made in our models, and in the tools we use to only partially understand models we cannot solve analytically)?
Despite seeing many different topics, we have only scratched the surface of the field. Mathematical biology is now a vast and thriving research area, with a range of applications in oncology, physiology, cell biology, and even modelling of human interactions (among numerous other topics). Ideally you’ve gotten an idea of how to think about simple models of biological phenomena, and how to use some analytical tools for their investigation. I would highly recommend skimming other chapters of Murray’s book for some of the classical literature, or searching around online to really get a sense of the breadth of the field. Even the Wikipedia page shows an enormous range of topics, most of which were not mentioned in this course. Still, we hope you’ve gotten a flavour of the area, and are now able to look into many of these topics on your own. While it is now clear that mathematical theory has played a key role in many modern advances in biology, a huge number of biological phenomena remain elusive to experimental and theoretical analysis.
“We can only see a short distance ahead, but we can see plenty there that needs to be done.” —Alan Turing
Allee, W. C. & Bowen, E. S. 1932. Studies in animal aggregations: Mass protection against colloidal silver among goldfishes. Journal of Experimental Zoology 61(2), 185–207.↩︎
This was also the year that Einstein proposed special relativity and a solution to the ultraviolet catastrophe – he was a busy guy in 1905!↩︎
For the interested: Morgan, A. J. A. 1952. The reduction by one of the number of independent variables in some systems of partial differential equations. Quart. J. Math. 3(1), 250–259.↩︎
See Timoshenko, S. P., Gere, J. M. 1963. Theory of Elastic Stability 2nd edition, pp. 54–55.↩︎
McMahon, T. 1973. Size and shape in biology. Science 179(4079), 1201–1204.↩︎
Brody, S., Lardy, H. A. 1946. Bioenergetics and growth. J. Phys. Chem. 50(2), 615–617.↩︎
To keep consistent with last term’s notation for linear stability, we are using superscripts to index these vector-valued functions. Be careful not to confuse these indices with exponents!↩︎
There are important differences between different kinds of stability, and in general one has to be somewhat precise about this. Here I just mean the same thing as you did last term by ‘asymptotic stability’. These differences don’t matter except in the case of eigenvalues with zero real part, which we will not emphasise in this course.↩︎
We will not make use of the graph/network formalism in this course, but it is a common way of studying these models.↩︎
Other ways of modelling such structured populations include integro-partial differential equations, delay differential equations, and cellular automata.↩︎
Note that there is a clash with our previous notation of equilibria as \(u_0\), so in this chapter only we will use \(u^*\) to refer to equilibria, which are just constant numbers.↩︎
The sequence itself likely predates Leonardo of Pisa, as there is evidence it existed in ancient Sanskrit texts. See Finding Fibonacci for more.↩︎
The choice of \(u_0\) here is just to be consistent with the sequence starting at \(0\) with most authors. We could also pick \(u_0=1\) and get the same sequence, just shifted by \(1\).↩︎
In general the number of conditions needed for a system to be fully specified (sometimes ‘well-posed’) depends on the highest derivatives. A good rule of thumb is that each population in the system (each equation) needs one initial condition per time derivative, and one boundary condition on all boundaries for every \(2\) space derivatives.↩︎
For problems involving just linear diffusion, no-flux and Neumann boundary conditions are the same. However, these terms mean different things for other kinds of PDE.↩︎
This paper was written with chemists, biologists, and mathematicians in mind, and so is extremely easy to read. I would highly encourage you to read through it here – the next few chapters are a modern formulation of the ideas, but Turing was extremely prescient in developing the theory, so predicted an enormous amount of what would come.↩︎
While DNA was in some sense discovered in 1869, much of its function and the genetic basis for biology was only just beginning to be understood in the 1950s. Turing was therefore prescient in not considering specific entities as morphogens, but rather arguing in terms of abstract things which he hoped further experimental work could demonstrate with concrete entities in mind.↩︎
There may be boundary conditions or domain geometries where these ‘plane wave’ ansatzes do not work, but generally they will for a Cartesian domain with homogeneous Neumann or Dirichlet conditions.↩︎
You can derive this expression via the orthogonality of the eigenfunctions – try this as an exercise!↩︎
The, generally nonlinear, functions \(F\) and \(G\) are quite arbitrary, though one needs some technical assumptions on their growth rates to ensure that solutions exist for all time etc. We will not dwell on these technicalities, but always assume these functions are sufficiently ‘nice’ to guarantee what we need. Generally systems like the functions we’ve seen in the ODE examples from last term work well enough.↩︎
In Murray’s book, and in previous versions of these notes, the timescale was nondimensionalised as \(\widehat{t} = t/\gamma\) and the \(\mathbfit{x}\) scale was also scaled by this timescale. For notational simplicity, we do not choose a timescale, so if you look at past exams or additional questions you can remember that here we’ve taken \(\gamma=1\).↩︎
Note that this index \(i\) has nothing to do with the imaginary unit \(\mathrm{i}= \sqrt{-1}\)! Hopefully written in the capital vs subscript notation this is clear.↩︎
The summation index is not important, but I have changed it to \(n\) in this subsection from \(k\) to be consistent with the 2-D results.↩︎
Of course this is a vast simplification of a complex process, and mechanical aspects play key roles as well. See this paper for an overview of the evidence for different mechanisms at different stages of development.↩︎
You may ask: what about the ‘half-infinite’ Turing space shown in 4.5 for \(D \gg 1\)? For any finite value of \(D\), the range of permissible \(b\) values will be bounded, and in all likelihood \(D\) is smaller than \(10^2\) or at most \(10^4\), depending on exactly what the morphogen \(u\) represents.↩︎
At the macroscopic scale, we can think of ‘heat’ as essentially the number of collisions per unit of time, and hence a diffusion constant will increase with the temperature of a medium.↩︎
This can in principle be relaxed, but we will not discuss this case as it becomes extremely technical.↩︎
Using Stirling’s approximation, you can show that the series converges depending on how large the derivative terms become. This is highly technical and not too important for our purposes, but kind of an interesting application of analysis-y methods!↩︎
There are other solutions to the eigenfunction equation [eigen-US-biharmonic] in general, but for these boundary conditions the \(w_k\), coming from just the Laplacian, are sufficient to represent any spatial perturbation of the equation.↩︎