The heat equation by Luis A. Caffarelli

(Department of Mathematics, UT at Austin).

On September 22, 2003, Luis A. Caffarelli delivered the inaugural lecture of the FME 2003-04 term. The image reproduces the heading of the chapter devoted to that lecture in the booklet distributed by the FME on that occasion. Now, twenty years later, he has been awarded the 2023 Abel Prize, and this NL rejoices in this great honor by offering an English translation of the Spanish original of that lecture.

 

Fourier

The heat equation was proposed by Fourier in 1807, in his memoir on the propagation of heat in solid bodies.

In it, he also proposed the germ of what would become the theory of Fourier series.

So controversial was the latter that it took fifteen years, until 1822, for the Academy of Sciences to decide to publish it.

Mathematical models

The heat equation is a mathematical model (perhaps the simplest) that tries to describe the evolution of temperature in a solid body.

Let us consider, to simplify the presentation, an isolated metallic bar of length one (0 ⩽ x ⩽ 1), initially at zero temperature, which after a certain time, t_{0}, we have heated to a temperature T (x, t_{0}) keeping its ends, x = 0 and x = 1, at zero temperature.

From that instant, t_{0}, we let the temperature T (x, t_{0}) evolve freely and we are interested in a mathematical model that allows us to predict the temperature T (x, t) for all x in the interval [0, 1], for any future time (that is, for all t > t_{0}), from our knowledge of T (x, t_{0}) = T_{0}(x) and from the fact that for x = 0 or x = 1 the temperature remains equal to zero.

Naturally there is no “one model”. There are infinitely many, depending on the precision and the range of values in which we intend it to be valid (high or low temperatures will change the behavior of the material, impurities could be relevant, etc.).

The model proposed by Fourier can be summarized as follows:

1. The (caloric) energy required to bring a piece of the bar of length ∆ℓ from zero temperature to temperature T is proportional to ∆ℓ × T (i.e., the energy density, e = kT , is proportional to the temperature, with k a characteristic constant of the material).

2. Energy flows from areas of higher temperature to those of lower temperature. More precisely, the energy flux density is f (x) = −θD_{x}T (or f (x) = −θ∇T in various dimensions), where θ is again a characteristic constant of the material.

3. The energy is conserved. If we take a piece of the bar, ∆ℓ = (x_{1}, x_{2}), the energy contained in ∆ℓ at the instant t_{2} is equal to the energy that was in ∆ℓ at the instant t_{1} plus the “energy flux” that penetrated through the extremes x_{1}, x_{2} in the time interval from t_{1} to t_{2}. Mathematically:

∫_{x_{1}}^{x_{2}} e(x, t_{2}) dx = ∫_{x_{1}}^{x_{2}} e(x, t_{1}) dx + ∫_{t_{1}}^{t_{2}} (f (x_{1}, t) − f (x_{2}, t)) dt.

4. If we draw the rectangle ∆ℓ × ∆t,

the first integral lives on the top and bottom edges, while the second lives on the sides. In order to compare them, we need to be able to write them on a common domain, namely the rectangle itself. We do this by taking derivatives (the fundamental theorem of calculus, applied in t to the energies and in x to the fluxes):

∫_{t_{1}}^{t_{2}} ∫_{x_{1}}^{x_{2}} (D_{t}e + D_{x}f ) dx dt = 0.

Since this relationship must be verified for any rectangle, no matter how small, the integrand must necessarily vanish: D_{t}e + D_{x}f = 0. Remembering the expressions for e and f as functions of T, we obtain the equation kD_{t}T = θD_{xx}T.

In present-day terms, relationships 1) and 2) are called constitutive laws: they establish specific relationships between the state variables e, f, T and their derivatives, which depend on the characteristics of the material, etc. Relationship 3), on the other hand, is of a different nature: it is a conservation law, and it establishes that certain quantities (mass, energy, etc.) are conserved through a process. That does not mean that they are point-wise constant. In a gas, for example, mass flows from one part to another. What a conservation law does is postulate the existence of a conserved variable, e, and a flow, f, that satisfy

D_{t}e + D_{x}f = 0.

(or D_{t}e + div f = 0).

In short, writing a mathematical model consists of choosing those state variables that are relevant to the phenomenon we want to describe, finding (usually experimentally) their constitutive laws, and how they are conserved.
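The conservation law 3) can be checked directly on a discrete model. Below is a minimal Python sketch (the grid size, time step, and initial profile are arbitrary choices, not taken from the lecture) in which, at every step of an explicit finite-difference scheme for kD_{t}T = θD_{xx}T with k = θ = 1, the change in total energy equals exactly the net flux through the endpoints:

```python
import numpy as np

# Explicit finite differences for T_t = T_xx on [0, 1], T(0) = T(1) = 0.
# All parameters are hypothetical choices for illustration only.
N = 100
dx = 1.0 / N
dt = 0.4 * dx**2                       # stable explicit time step (dt/dx^2 <= 1/2)
x = np.linspace(0.0, 1.0, N + 1)
T = np.sin(np.pi * x)                  # initial temperature, zero at both ends

for _ in range(200):
    f_left = -(T[1] - T[0]) / dx       # flux f = -D_x T at x = 0
    f_right = -(T[-1] - T[-2]) / dx    # flux f at x = 1
    E_before = T.sum() * dx            # total energy (e = T, since k = 1)
    T[1:-1] += dt / dx**2 * (T[2:] - 2.0 * T[1:-1] + T[:-2])
    E_after = T.sum() * dx
    # Conservation: energy gained = flux in at x_1 minus flux out at x_2.
    assert abs((E_after - E_before) - dt * (f_left - f_right)) < 1e-12
```

The assertion is an exact discrete analogue of the identity d/dt ∫ e dx = f (x_{1}) − f (x_{2}): the interior differences telescope, leaving only the boundary fluxes.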

Existence and uniqueness

Perhaps the most important way in which these ideas have evolved today is in taking random effects into account. Regardless of how good a mathematical model is at representing reality, it must have a minimum of internal consistency. If the relationships we specify are excessive, they will generally be contradictory and our problem may have no solution.

If they are too few, we may have many different solutions, when in reality we expect to have a single solution. So Fourier’s next step was to try to find a solution to the problem. Given the initial temperature, T_{0}(x), and the condition

T (1, t) = T (0, t) = 0

for all t > t_{0}, it is a matter of proving that there is a unique function T (x, t) that satisfies the equation T_{t} = T_{xx} (we set k = θ = 1). Let’s first try to find some solutions for particular T_{0} of the form

T (x, t) = T_{0}(x)g(t).

This requires that

g′(t)T_{0}(x) = g(t)T_{0}′′(x)

or that

g′(t)/g(t) = T_{0}′′(x)/T_{0}(x) = λ constant

(the only possible way for a function of t alone and a function of x alone to be equal is for both to be constant, since we can fix t and vary x over all possible values). Since T_{0}(0) = T_{0}(1) = 0, the only possible pairs are, with λ = −(nπ)^{2} for n = 1, 2, . . . ,

T_{0}(x) = sin(nπx)
g(t) = e^{−(nπ)^{2}t}.

But the problem we were considering is linear. Therefore, any combination of solutions

T (x, t) = Σ_{n} c_{n} sin(nπx) e^{−(nπ)^{2}t}

is a new solution, with initial data

T_{0}(x) = Σ_{n} c_{n} sin(nπx).

Fourier then proves that any function T_{0} (say continuously differentiable, with T_{0}(0) = T_{0}(1) = 0) can be expressed that way, and gives a formula for the coefficients: c_{n} = 2 ∫_{0}^{1} T_{0}(x) sin(nπx) dx.
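Fourier's recipe can be carried out numerically. The Python sketch below (the initial profile T_{0}(x) = x(1 − x), the quadrature grid, and the truncation at 50 modes are arbitrary choices) computes the coefficients by a simple Riemann sum and evaluates the resulting series:

```python
import numpy as np

# Fourier series solution of T_t = T_xx on [0, 1] with T(0, t) = T(1, t) = 0,
# for the hypothetical initial data T_0(x) = x(1 - x) (an arc of a parabola).
x = np.linspace(0.0, 1.0, 1001)
dx = x[1] - x[0]
T0 = x * (1.0 - x)

def solution(t, n_modes=50):
    """Partial sum  sum_n c_n sin(n pi x) exp(-(n pi)^2 t)."""
    T = np.zeros_like(x)
    for n in range(1, n_modes + 1):
        phi = np.sin(n * np.pi * x)
        c_n = 2.0 * np.sum(T0 * phi) * dx   # c_n = 2 * integral of T0 sin(n pi x)
        T += c_n * phi * np.exp(-((n * np.pi) ** 2) * t)
    return T

# At t = 0 the series reproduces the initial data,
assert np.max(np.abs(solution(0.0) - T0)) < 1e-4
# and for t > 0 essentially only the slowest mode survives.
t = 0.5
c1 = 8.0 / np.pi ** 3                       # exact first coefficient for x(1 - x)
assert np.max(np.abs(solution(t) - c1 * np.sin(np.pi * x)
                     * np.exp(-np.pi ** 2 * t))) < 1e-6
```

Note how the factors e^{−(nπ)^{2}t} kill the high frequencies almost instantly: after a short time the solution is, for all practical purposes, a single sine arch.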

This is how Fourier analysis was born, so revolutionary that it took fifteen years for mathematicians of the time to accept that a series of highly oscillating functions such as sin(nπx) could represent, for example, an arc of a parabola or a polygonal.

Harmonic analysis

Fourier analysis has come to be called harmonic analysis. We can say that it consists of describing a function not by its special characteristics (where it is large, where it is small), but by the influence that each frequency (sin λx) has on its composition. As such, it occupies a fundamental place in everything related to wave theory, transmission of all kinds of signals, ultrasound image reconstruction, spectral analysis, etc.

In recent years, a way of decomposing functions into “elementary chunks” that describe the oscillatory properties of the function simultaneously in physical space (the variable x) and in frequency space (the variable n or λ) has acquired great prominence.

These elementary chunks are called wavelets and have revolutionized image compression, data transmission, etc.

The Gaussian kernel and random walks

There is actually a more convincing way of representing the solution T (x, t), one that more clearly exhibits the qualitative properties of heat propagation. It consists of initially putting “point masses”.

Suppose now that the bar is infinite, that it is at zero temperature, and that we manage to place a “point mass” of one heat unit at the origin at the instant t_{0}. In other words, we were able to concentrate a quantity c = 1 of heat energy at the origin so quickly that it is instantaneous on our time scale.

How does the temperature evolve next?

A little self-similarity analysis: if T (x, t) is a solution of the equation, so is aT (bx, b^{2}t), which allows us to calculate that in this case

T (x, t) = G (x, t) = (4πt)^{−1/2} e^{−x^{2}/4t}.

This is the Gaussian kernel (“the bell”), or error dispersion formula.
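A quick numerical sanity check of the kernel (a sketch; the grid, the test point, and the finite-difference steps are arbitrary choices): G has unit heat content for every t > 0 and satisfies G_{t} = G_{xx}:

```python
import numpy as np

# G(x, t) = (4 pi t)^(-1/2) exp(-x^2 / 4t); grid and step sizes are
# hypothetical choices for this check.
def G(x, t):
    return np.exp(-x**2 / (4.0 * t)) / np.sqrt(4.0 * np.pi * t)

x = np.linspace(-10.0, 10.0, 2001)
dx = x[1] - x[0]
for t in (0.1, 0.5, 1.0):
    assert abs(G(x, t).sum() * dx - 1.0) < 1e-6   # unit heat content at all times

# G_t = G_xx, checked by centered finite differences at one (arbitrary) point.
h = 1e-4
x0, t0 = 0.7, 1.0
G_t = (G(x0, t0 + h) - G(x0, t0 - h)) / (2.0 * h)
G_xx = (G(x0 + h, t0) - 2.0 * G(x0, t0) + G(x0 - h, t0)) / h**2
assert abs(G_t - G_xx) < 1e-5
```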

If we translate the point mass to x_{0}, the new formula is

T (x, t) = G (x − x_{0}, t),

since the equation is translation invariant.

If we superimpose point masses of intensities c_{i} at the points x_{i},

T (x, t) = Σ_{i} c_{i} G (x − x_{i}, t),

and finally, for an energy density e = T_{0}(x),

T (x, t) = ∫ G (x − y, t) T_{0}(y) dy.

This representation immediately tells us, among other things, that:

a) If the original temperature is positive, it remains positive.

b) The effect of any change in temperature is felt instantly throughout the bar.

c) The temperature T_{0} can be highly discontinuous and an instant later it becomes regular.
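Properties a)–c) can be observed directly from the representation formula. The Python sketch below (the step-shaped T_{0}, the grid, and the time t = 0.01 are arbitrary choices) discretizes T (x, t) = ∫ G(x − y, t) T_{0}(y) dy for a discontinuous initial temperature:

```python
import numpy as np

# Convolution with the Gaussian kernel for a hypothetical step-shaped T_0:
# height one on (-1, 1), zero elsewhere.
def G(x, t):
    return np.exp(-x ** 2 / (4.0 * t)) / np.sqrt(4.0 * np.pi * t)

y = np.linspace(-20.0, 20.0, 8001)
dy = y[1] - y[0]
T0 = (np.abs(y) < 1.0).astype(float)          # discontinuous initial temperature

def T(x, t):
    return np.sum(G(x - y, t) * T0) * dy      # discrete convolution at one point

t = 0.01
assert T(5.0, t) > 0.0                # (b) the change is felt along the whole bar
assert T(0.9, t) > T(1.0, t) > T(1.1, t)   # (c) the jump is now a smooth ramp
assert abs(T(1.0, t) - 0.5) < 0.02    # at the old jump, the two sides average out
```

Even at x = 5, far from the support of T_{0}, the temperature is strictly positive an instant later, although astronomically small: the infinite propagation speed of b) is quantitatively very mild.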

But what is the relationship between the heat equation and error propagation?

Suppose that at the instant t_{0} we are standing at the origin. We flip a coin; if heads, we take one step, ∆x, to the right. If tails, to the left. Every interval ∆t, we repeat the operation.

What is the probability u(x, t) that at time t we find ourselves in position x?

It seems complex to calculate, but we can see that at the instant t − ∆t we were either at x + ∆x or at x − ∆x and that from there we moved with probability ½ to (x, t), that is,

u (x, t) = 1/2 (u (x + ∆x, t − ∆t) + u (x − ∆x, t − ∆t))

or

u (x, t) − u (x, t − ∆t) = 1/2 [u (x + ∆x, t − ∆t) + u (x − ∆x, t − ∆t) − 2u (x, t − ∆t)].

Everything now depends on the balance between ∆t and (∆x)^{2}. If we choose (∆x)^{2}/∆t = 2, we can divide both sides by ∆t and we get:

∆u/∆t = ∆^{2}u/(∆x)^{2}

(here ∆u is the time increment of u and ∆^{2}u the second difference in x),

which is a discrete form of the heat equation.
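The recursion can be iterated directly. In the Python sketch below (the number of steps and ∆x = 1 are arbitrary choices), a point mass at the origin is evolved by averaging each site with its two neighbors; after n steps the probabilities match the Gaussian of variance n(∆x)^{2} (the walk occupies every other site, hence the factor 2∆x in the comparison):

```python
import numpy as np

# Iterate u(x, t) = (u(x + dx, t - dt) + u(x - dx, t - dt)) / 2 from a point
# mass. n and dx are hypothetical choices for this illustration.
n = 400
u = np.zeros(2 * n + 1)
u[n] = 1.0                             # point mass at the origin (index n)
for _ in range(n):
    u = 0.5 * (np.roll(u, 1) + np.roll(u, -1))

dx = 1.0
sites = np.arange(-n, n + 1) * dx
var = n * dx**2                        # variance after n steps of size dx
gauss = np.exp(-sites**2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)
occupied = (np.arange(2 * n + 1) + n) % 2 == 0   # sites reachable after n steps
err = np.max(np.abs(u[occupied] - 2.0 * dx * gauss[occupied]))
assert err < 1e-3                      # the profile is already Gaussian
assert abs(u.sum() - 1.0) < 1e-9       # probability is conserved at every step
```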

That is, in the limit ∆t → 0, u converges to the solution of the heat equation. But at the initial instant, we are standing at the origin with probability 1, that is,

u (x, t) = G (x, t).

This is a version of the central limit theorem, which says that if we independently repeat n times the same zero-expectation experiment X_{i}, then the probability distribution of

(X_{1} + X_{2} + · · · + X_{n})/√n

converges to a Gaussian.
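A Monte Carlo illustration of that statement (sample sizes and the seed are arbitrary choices for this sketch): the normalized sums S_{n}/√n of n independent ±1 flips have approximately standard Gaussian statistics:

```python
import numpy as np

# S_n = sum of n fair +-1 flips, simulated via a binomial count.
# n, the number of trials, and the seed are hypothetical choices.
rng = np.random.default_rng(0)
n, trials = 1000, 200_000
S_n = 2.0 * rng.binomial(n, 0.5, size=trials) - n   # each flip is +1 or -1
S = S_n / np.sqrt(n)

assert abs(S.mean()) < 0.02                 # zero expectation
assert abs(S.var() - 1.0) < 0.02            # unit variance
# The distribution function approaches the Gaussian one: P(S <= 1) ~ 0.8413.
assert abs((S <= 1.0).mean() - 0.8413) < 0.02
```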

Nonlinear diffusions

That is why a heat-type equation is often called a diffusion equation. Diffusion equations appear in various fields. For example, in population dynamics the energy density e is replaced by the population density σ, and one of the many reasons why a population migrates is to go to areas of lower density, that is, the population flow has the form

f = −∇σ + · · · (other reasons)

and therefore the corresponding equations will be of the form

D_{t}σ = ∆σ + · · ·

Or, in an epidemic, the probability of infection at a place x, at an instant of time t, depends monotonically on the probabilities of adjacent points a few hours earlier. This gives rise, infinitesimally, to an equation of the form

D_{t}e = F (D_{x}^{2}e, ∇e)

where e is the expectation of infection at x, t.

In a viscous fluid, the particles adjacent to a given one try to “drag” it or “slow it down” if it is slower or quicker, respectively, than they are.

The point I want to emphasize is that, in all these phenomena, the “diffusion” or “viscosity” term induces a process of “flattening” or “averaging” of the state variables that characterizes diffusive or viscous problems.

The influence of the theory of “parabolic equations” is today immense, in fluid equations (Navier–Stokes, flow in porous media, phase change equations), in optimal control theory and game theory (totally non-linear equations), modeling of population dynamics, epidemiology, mathematics of finance, etc.
