10. Partial Differential Equations#

10.1. Introduction#

Partial differential equations (PDEs) are involved in the description of virtually every physical situation where quantities vary in space, or in space and time.

Examples include diffusion, electromagnetic waves, hydrodynamics, and quantum mechanics.

In all but the simplest cases, these equations cannot be solved analytically, and so numerical methods must be employed to obtain quantitative results.

The typical numerical approach proceeds as follows: the dependent variables (e.g. temperature, electrical potential) are described by their values at discrete points (a lattice) of the independent variables (e.g. space and time).

Therefore, by appropriate discretization, the PDE is reduced to a large set of difference equations.

Although difference equations can be solved by matrix methods, the large size of the matrices involved (dimension \(\sim\) number of lattice points) makes this approach impractical.

However, the locality of the original equations (i.e. they involve only low-order derivatives of the dependent variables) makes the resulting difference equations “sparse”: most of the elements of the matrices involved vanish.

For such matrices, iterative methods of inversion and diagonalization can be very efficient.

Most of the physically important PDEs are of second order and can be classified into three types:

  1. Parabolic: Roughly speaking, parabolic equations involve only a first-order derivative in one variable, but have second-order derivatives in the remaining variables. Examples are the diffusion equation and the time-dependent Schrödinger equation: they are first order in time, but second order in space.

  2. Elliptic: They have second order derivatives in each of the independent variables, each derivative having the same sign when all terms of the equation are grouped on one side. Examples are the Poisson equation for the electrical potential and the time-independent Schrödinger equation, in two or more spatial variables.

  3. Hyperbolic: They involve second derivatives of opposite sign, e.g. the wave equation describing the vibrations of a stretched string.

In this chapter, we will discuss some numerical methods appropriate for elliptic equations and then focus on parabolic equations. Hyperbolic equations can often be treated by similar methods, with some unique differences (which we won’t discuss here).

10.2. Elliptic Partial Differential Equations#

For concreteness, we will consider particular forms of elliptic boundary value and eigenvalue problems for a field \(\phi\) in two spatial dimensions, \((x,y)\).

Specifically, we will tackle the boundary value problem:

\(-\left[ \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right] \phi = S(x,y)\).

Although this is not the most general elliptic form, it covers a wide variety of situations. For example, in electrostatics, \(\phi\) is the potential and \(S\) is related to the charge density (i.e., it is a source term). In steady-state heat diffusion, \(\phi\) is the temperature, \(S\) is the local rate of heat generation or loss.

The discussion can be generalized straightforwardly to other elliptic cases, e.g. in three dimensions.

Of course, the equation by itself is not complete. The boundary conditions are required. We will take the boundary conditions to be of the “Dirichlet” type, i.e. \(\phi\) is specified on some closed curve on the \((x,y)\) plane. Conveniently, we will take this to be the unit square, and perhaps some additional curves within it.

The boundary value problem is then to use the PDE to find \(\phi\) everywhere within the square.

Other classes of boundary conditions are the “Neumann” type, where the normal derivative of \(\phi\) is specified on the surfaces, or the “mixed” type, where a linear combination of \(\phi\) and its normal derivatives is specified. Both can be handled by very similar methods.

The eigenvalue problems we will be interested in might involve an equation of the form,

\(-\left[ \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right] \phi + V(x,y)\phi = \varepsilon \phi\),

together with a set of Dirichlet boundary conditions.

As an example, the above could be the time-independent Schrödinger equation, with \(\phi\) being the wave function, \(V(x,y)\) is related to the potential and \(\varepsilon\) is related to the energy eigenvalue.

The eigenvalue problem is then to find the values \(\varepsilon_\lambda\) and the associated eigenfunctions \(\phi_\lambda\), for which the equation and the b.c.’s are satisfied.

10.2.1. Discretization#

Let’s first cast the equation:

\(-\left[ \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right] \phi = S(x,y)\).

in a form suitable for numerical treatment.

We define a set of lattice points covering the region of interest in the \((x,y)\) plane.

For convenience, we take the lattice spacing \(h\) to be uniform and equal in both directions. The unit square is then covered by \(N\times N\) lattice squares, with points labeled by \((i,j)\), where each index runs from \(0\) to \(N\). The coordinates of point \((i,j)\) are \(x_i = ih\) and \(y_j = jh\).

We then define \(\phi_{ij} = \phi(x_i, y_j)\) and \(S_{ij} = S(x_i, y_j)\).

It is straightforward to apply a three-point difference for the second derivative (see, e.g. Chapter 9, section 2 for a derivation using a Taylor series expansion in one dimension):

\(\frac{\partial^2 \phi}{\partial x^2} \simeq \frac{ \phi_{i+1j} + \phi_{i-1j} - 2 \phi_{ij}}{h^2}\),

to get:

\(-\left[\frac{ \phi_{i+1j} + \phi_{i-1j} - 2 \phi_{ij}}{h^2} + \frac{ \phi_{ij+1} + \phi_{ij-1} - 2 \phi_{ij}}{h^2}\right] = S_{ij}\).

10.2.2. Variational Principle Approach#

We will be solving this equation shortly, but first let’s derive it in an alternate way, based on a variational principle.

This approach is handy in cases where the coordinates are not Cartesian, or when more accurate formulas are needed.

The variational principle also provides some insight into how the solution algorithm works.

Consider a quantity \(\mathcal{E}\), defined to be a functional of the field \(\phi\), of the form:

\(\mathcal{E} = \int_0^1 \mathrm{d} x \int_0^1 \mathrm{d} y \left[ \frac{1}{2} (\nabla \phi)^2 - S \phi \right]\).

In some situations, \(\mathcal{E}\) has a physical interpretation. For example, in electrostatics, \(E = -\nabla \phi\) is the electric field and \(S\) is the charge density, making \(\mathcal{E}\) the total energy of the system.

In other situations, such as steady-state diffusion, \(\mathcal{E}\) should be viewed simply as a useful quantity.

At a solution of \(-\left[ \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right] \phi = S(x,y)\), \(\mathcal{E}\) is stationary under all variations \(\delta \phi\) of the solution \(\phi\) that respect the Dirichlet boundary conditions imposed.

\(\mathcal{E}\) being stationary implies that:

\(\delta \mathcal{E} = 0\).

Let’s find an expression for \(\delta \mathcal{E}\). The “\(\delta\)” simply acts as any derivative would, and therefore:

\(\delta \mathcal{E} = \int_0^1 \mathrm{d} x \int_0^1 \mathrm{d} y \left[ \mathbf{\nabla} \phi \cdot \mathbf{\nabla} \delta \phi - S \delta \phi\right]\).

To perform the integral of the first term by parts, we need to derive an appropriate integration by parts identity in two dimensions.

Starting with the divergence theorem in two dimensions for a vector field \(\mathbf{V}\):

\(\int_C \mathbf{V}\cdot \mathbf{\hat{n}} \mathrm{d}\ell = \int_S \mathbf{\nabla} \cdot \mathbf{V} \mathrm{d}A\),

where the curve \(C\) is the boundary of the surface \(S\), and \(\mathbf{\hat{n}}\) is the outward unit vector normal to \(C\).

Changing \(\mathbf{V} \rightarrow u\mathbf{V}\), where \(u\) is a scalar function:

\(\int_C u\mathbf{V}\cdot \mathbf{\hat{n}} \mathrm{d}\ell = \int_S \mathbf{\nabla} \cdot (u\mathbf{V}) \mathrm{d}A\).

Consider the RHS and expand:

\(\int_S \mathbf{\nabla} \cdot (u\mathbf{V}) \mathrm{d}A = \int_S u\mathbf{\nabla} \cdot \mathbf{V} \mathrm{d}A + \int_S \mathbf{\nabla }u \cdot \mathbf{V} \mathrm{d}A\).

And rearranging:

\(\int_S \mathbf{\nabla}u \cdot \mathbf{V} \mathrm{d}A = \int_S \mathbf{\nabla} \cdot (u\mathbf{V}) \mathrm{d}A - \int_S u\mathbf{\nabla} \cdot \mathbf{V} \mathrm{d}A\).

By the divergence theorem as written above:

\(\int_S \mathbf{\nabla}u \cdot \mathbf{V} \mathrm{d}A = \int_C u\mathbf{V}\cdot \mathbf{\hat{n}} \mathrm{d}\ell - \int_S u\mathbf{\nabla} \cdot \mathbf{V} \mathrm{d}A\).

If we now choose \(\mathbf{V} = \mathbf{\nabla}\phi\) and \(u = \delta \phi\), we have:

\(\int_S \mathbf{\nabla }\delta \phi \cdot\mathbf{\nabla}\phi \mathrm{d}A = \int_C \delta \phi \mathbf{\nabla} \phi\cdot\mathbf{\hat{n}} \mathrm{d}\ell - \int_S \delta \phi \mathbf{\nabla} \cdot \mathbf{\nabla} \phi \mathrm{d}A \),

an integration by parts formula in the necessary form.

Substituting the above into:

\(\delta \mathcal{E} = \int_0^1 \mathrm{d} x \int_0^1 \mathrm{d} y \left[ \nabla \phi \cdot \nabla \delta \phi - S \delta \phi\right]\),

we get:

\(\delta \mathcal{E} = \int_C \delta \phi \mathbf{\nabla} \phi\cdot \mathbf{\hat{n}} \mathrm{d}\ell + \int_0^1 \mathrm{d} x \int_0^1 \mathrm{d}y\delta \phi [-\nabla^2 \phi - S]\),

where the line integral is over the boundary of the region of interest (\(C\)). Since we consider only variations that respect the boundary conditions, \(\delta \phi\) must vanish on \(C\), so that the line integral does as well.

Therefore:

\(\delta \mathcal{E} = \int_0^1 \mathrm{d} x \int_0^1 \mathrm{d}y\delta \phi [-\nabla^2 \phi - S]\),

and demanding that \(\delta \mathcal{E} =0\) for any \(\delta \phi\) implies that:

\(-\nabla^2 \phi - S = 0\), i.e. that \(\mathcal{E}\) is stationary when \(\phi\) is a solution to our differential equation.

Moreover, it can be shown that \(\mathcal{E}\) is not only stationary when \(\phi\) is a solution to the PDE, but that it is a minimum as well (left as an exercise!).

Let’s also derive a discrete approximation to the \(\mathcal{E}\) functional. We employ a two-point difference formula to approximate each first derivative in \((\mathbf{\nabla}\phi)^2\), at the points halfway between the lattice points, and use the trapezoid rule for the integrals.

As a reminder, the trapezoid rule for integration in one dimension is given by:

\(\int_a^b \mathrm{d} x f(x) \simeq \frac{h}{2} \left[ f(a) + f(b) + 2 \sum_{i=1}^{N-1} f(x_i)\right]\), with \(h = (b-a)/N\) and \(x_i = a + ih\).

In two dimensions, it can be shown that the trapezoid rule has the form:

\begin{align} \int_a^b \mathrm{d} x \int_c^d \mathrm{d} y f(x,y) \simeq \frac{h^2}{4} \Big[ & f(a,c) + f(b,c) + f(a,d) + f(b,d) \\ & + 2 \sum_{i=1}^{N-1} f(x_i, c) + 2 \sum_{i=1}^{N-1} f(x_i, d) \\ & + 2 \sum_{j=1}^{N-1} f(a, y_j) + 2 \sum_{j=1}^{N-1} f(b, y_j) \\ & + 4 \sum_{i=1}^{N-1} \sum_{j=1}^{N-1} f(x_i, y_j) \Big] \end{align}

(see, e.g., https://math.stackexchange.com/questions/2891298/derivation-of-2d-trapezoid-rule for a derivation)

This leads to:

\(\mathcal{E} = \frac{1}{2} \sum_{i=1}^N \sum_{j=1}^N [(\phi_{ij} - \phi_{i-1j})^2 + (\phi_{ij} - \phi_{ij-1})^2 ] - h^2 \sum_{i=1}^{N-1} \sum_{j=1}^{N-1} S_{ij} \phi_{ij}\).

Note that setting:

\(\frac{ \partial \mathcal{E}}{\partial \phi_{ij}} = 0 \quad \forall \, (i,j)\),

leads to the difference equation derived in the previous section.

Let’s now discuss where the boundary conditions enter the set of the linear equations:

\(-\left[\frac{ \phi_{i+1j} + \phi_{i-1j} - 2 \phi_{ij}}{h^2} + \frac{ \phi_{ij+1} + \phi_{ij-1} - 2 \phi_{ij}}{h^2}\right] = S_{ij}\).

Unless the coordinate system is well adapted to the geometry of the surfaces on which the boundary conditions are imposed, the lattice points will only roughly describe the geometry. One can always improve the accuracy by using a non-uniform lattice spacing, and place more points in the regions near the surfaces, or by transforming to a coordinate system in which the boundary conditions are expressed more naturally.

In any event, the boundary conditions will then provide the values of the \(\phi_{ij}\) at some subset of lattice points.

At a point far away from the boundaries, the boundary conditions do not enter directly. However, consider the equation at a point just next to the boundary, say \((i, N-1)\). Since \(\phi_{iN}\) is specified as a part of the boundary conditions, we can rewrite the equation as:

\(4 \phi_{iN-1} - \phi_{i+1N-1} - \phi_{i-1N-1} - \phi_{iN-2} = h^2 S_{iN-1} + \phi_{iN}\).

Therefore, \(\phi_{iN}\) enters not as an unknown, but rather as an inhomogeneous, known term.

These considerations show that the discrete approximation to the PDE is equivalent to a system of linear equations for the unknown values of \(\phi\) at the interior points.

In matrix notation:

\(\mathbf{M} \mathbf{\phi} = \mathbf{s}\),

where \(\mathbf{M}\) is the matrix appearing in the linear system of equations, and the inhomogeneous term \(\mathbf{s}\) is proportional to \(S\) at the interior points, and linearly related to the boundary conditions on \(\phi\).

In any sort of practical situation, there are a very large number of these equations, e.g. \(N^2 = 2500\) for, say, \(N=50\), so the solution by direct matrix inversion can be impractical.

Fortunately, since the discrete approximation to the Laplacian involves only neighboring points, most of the elements of \(\mathbf{M}\) vanish (i.e. \(\mathbf{M}\) is sparse), and there are efficient iterative techniques for solving the matrix equation.
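To make the sparsity concrete, here is a minimal sketch (assuming SciPy is available; the variable names are ours) that assembles \(\mathbf{M}\) for the interior points of the unit square as a sparse matrix:

```python
import scipy.sparse as sp

N = 50                      # number of lattice intervals; h = 1/N
h = 1.0 / N
m = N - 1                   # number of interior points per direction

# 1D operator -d^2/dx^2 on the interior points, as a tri-diagonal matrix
T = sp.diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(m, m)) / h**2

# 2D operator via Kronecker products; phi is flattened row by row
I = sp.identity(m)
M = sp.kron(I, T) + sp.kron(T, I)

print(M.shape)              # (2401, 2401) for N = 50
```

For \(N = 50\) the matrix has \(49^2 = 2401\) rows, but only about five nonzero entries per row, which is what the iterative methods below exploit.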

We begin their discussion by considering an analogous, but simpler one-dimensional boundary value problem, and then return to the two-dimensional case.

10.2.3. An Iterative Method for Boundary Value Problems#

The one-dimensional boundary value problem analogous to the two-dimensional problem we have been discussing can be written as:

\( -\frac{ \mathrm{d}^2 \phi }{\mathrm{d} x^2} = S(x)\),

with \(\phi(0)\) and \(\phi(1)\) specified.

The related variational principle involves the quantity:

\(\mathcal{E} = \int_0^1 \mathrm{d} x \left[ \frac{1}{2} \left(\frac{\mathrm{d}\phi}{\mathrm{d}x}\right)^2 - S \phi \right]\).

This can be discretized on a uniform lattice with spacing \(h=1/N\) as:

\(\mathcal{E} = \frac{1}{2h} \sum_{i=1}^{N} (\phi_i - \phi_{i-1})^2 - h \sum_{i=1}^{N-1} S_i \phi_i\).

Considering variations with respect to \(\phi_i\) yields the difference equation:

\(2 \phi_i - \phi_{i+1} - \phi_{i-1} = h^2 S_i\),

which is nothing but the naive discretization of the differential equation.

We already discussed methods for solving the boundary value problem in one dimension in the previous chapter (using the forward/backward Numerov method), but we can also consider the equation together with the known values \(\phi_0\) and \(\phi_N\), as a set of linear equations.

For a modest number of points, say \(N \lesssim 100\), the system can be solved by direct matrix methods.

However, to illustrate the iterative methods appropriate for large sparse matrices of elliptic PDEs in two or more dimensions, let’s begin by solving for \(\phi\):

\(\phi_i = \frac{1}{2} [ \phi_{i+1} + \phi_{i-1} + h^2 S_i]\).

This equation is not manifestly useful, since we don’t know the \(\phi\)’s on the RHS. However, it can be interpreted as giving an “improved” value for \(\phi_i\), based on the values of \(\phi\) at the neighboring points.

The strategy, known as Gauss-Seidel iteration, is then to guess some initial solution, and then to “sweep” systematically through the lattice (e.g. from left to right), successively replacing \(\phi\) at each point by an improved value.

Note that the most “current” values of the \(\phi_{i\pm 1}\) are to be used in the RHS of the equation. By repeating this sweep many times, an initial guess for \(\phi\) can be “relaxed” to the correct solution.

To investigate the convergence of this procedure, we generalize the equation, so that at each step of the relaxation, \(\phi_i\) is replaced by a linear mixture of its old value and the new “improved” one, given by:

\(\phi_i \rightarrow \phi_i' = (1-\omega) \phi_i + \frac{\omega}{2} [\phi_{i+1} + \phi_{i-1} + h^2 S_i]\).

Here, \(\omega\) is a parameter that can be adjusted to control the rate of relaxation.

To see that the above procedure leads to an “improvement” in the solution, we calculate the change in the energy functional (in one dimension), remembering that all the \(\phi\)’s except \(\phi_i\) are to be held fixed. After some algebra, one finds:

\(E'-E = -\frac{ \omega(2-\omega)}{h} \left[ \frac{1}{2} (\phi_{i+1} + \phi_{i-1} + h^2 S_i) - \phi_i \right]^2 \leq 0\),

so that, as long as \(0 < \omega < 2\), the energy never increases, and should thus converge to the required minimum value as the sweeps proceed.

10.2.4. Example 10.1: The relaxation method for partial differential equations in one dimension.#

(a) Use the relaxation method in one dimension to solve the boundary value problem defined by the differential equation:

\(-\frac{ \mathrm{d}^2 \phi }{\mathrm{d} x^2} = S(x)\),

with source term \(S(x) = 12 x^2\) and boundary conditions \(\phi(0) = \phi(1) = 0\). Start with \(\phi_i=0\) as your initial guess. Use \(N=40\) lattice intervals and perform enough iterations (e.g. 1000) to reach a converged solution. Perform all calculations for the values of the relaxation parameter \(\omega = 0.5, 1.0, 1.5\).

The relaxation method: Discretize the one-dimensional space and use the discretized “solved” iterative form:

\(\phi_i \rightarrow \phi'_i = (1-\omega)\phi_i + \frac{\omega}{2} [ \phi_{i+1} + \phi_{i-1} + h^2 S_i]\),

where \(\phi_i'\) is the updated value of \(\phi_i\) at lattice site \(i\), \(h\) is the lattice spacing, and \(\omega\) is a “relaxation” parameter.

Compare to the exact solution: \(\phi(x) = x(1-x^3)\).

(b) Write a function that calculates the energy functional, defined by:

\(\mathcal{E} = \int_0^1 \mathrm{d}x \left[ \frac{1}{2} \left( \frac{\mathrm{d}\phi }{\mathrm{d} x} \right)^2 - S \phi \right]\).

Use the discretized form:

\(\mathcal{E}= \frac{1}{2h} \sum_{i=1}^N (\phi_i - \phi_{i-1})^2 - h \sum_{i=1}^{N-1} S_i \phi_i\).

Calculate the value of the energy at the end of each iteration and plot the energy as a function of iteration.

Compare to the exact solution energy given by: \(\mathcal{E} = -9/14 \simeq -0.64286\).
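A minimal NumPy sketch of parts (a) and (b) might look as follows (variable names are ours; plotting is omitted):

```python
import numpy as np

N = 40                          # lattice intervals; h = 1/N
h = 1.0 / N
x = np.linspace(0.0, 1.0, N + 1)
S = 12.0 * x**2                 # source term S(x) = 12 x^2

def energy(phi):
    """Discretized energy functional."""
    return (np.sum((phi[1:] - phi[:-1])**2) / (2.0 * h)
            - h * np.sum(S[1:-1] * phi[1:-1]))

for omega in (0.5, 1.0, 1.5):
    phi = np.zeros(N + 1)       # initial guess; phi[0] = phi[N] = 0 stay fixed
    energies = []
    for sweep in range(1000):
        for i in range(1, N):   # left-to-right sweep, using updated neighbors
            phi[i] = ((1.0 - omega) * phi[i]
                      + 0.5 * omega * (phi[i+1] + phi[i-1] + h**2 * S[i]))
        energies.append(energy(phi))
    exact = x * (1.0 - x**3)
    print(omega, energies[-1], np.max(np.abs(phi - exact)))  # E -> -9/14
```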

Despite the rather poor initial guess for \(\phi\) in Example 10.1, the iterations converge, and the converged energy is independent of the relaxation parameter \(\omega\). The rate of convergence, however, clearly depends on \(\omega\). A general analysis shows that the optimal choice of relaxation parameter depends upon the lattice size and the geometry of the problem, and usually satisfies \(\omega > 1\). The optimal value can be determined empirically by examining the convergence of the solution over a few iterations, before choosing a value to be used for many iterations.

10.2.5. The Relaxation Method in Higher Dimensions#

The generalization of the relaxation method to two-dimensional problems is straightforward.

For the differential equation:

\(-\left[ \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right] \phi = S(x,y)\),

we get, for the updated value of the field at a point \((x_i, y_j)\):

\(\phi_{ij} \rightarrow \phi_{ij}' = (1-\omega) \phi_{ij} + \frac{\omega}{4} [\phi_{i+1j} + \phi_{i-1j} + \phi_{ij+1} + \phi_{ij-1} + h^2 S_{ij}]\).

This algorithm can be applied successively to each point on the lattice, say sweeping the rows in order from top to bottom and each row from left to right. One can again show that the energy functional,

\(\mathcal{E} = \frac{1}{2} \sum_{i=1}^N \sum_{j=1}^N [(\phi_{ij} - \phi_{i-1j})^2 + (\phi_{ij} - \phi_{ij-1})^2 ] - h^2 \sum_{i=1}^{N-1} \sum_{j=1}^{N-1} S_{ij} \phi_{ij}\),

always decreases, and there will be convergence to the required solution.

Several considerations can serve to enhance the convergence in practice:

  1. Starting from a good guess at the solution, e.g. perhaps one with similar, but simpler, boundary conditions, will reduce the number of iterations required.

  2. An optimal value of the relaxation parameter should be used, either estimated analytically or determined empirically (as described above).

  3. It may sometimes be more efficient to concentrate the relaxation process, for several iterations, in some sub-area of the lattice, where the trial solution is known to be particularly poor, thus not wasting effort on already-relaxed parts of the solution.

  4. One can always do a calculation on a relatively coarse lattice that relaxes with a small amount of numerical work, and then interpolate the solution found onto a finer lattice, to be used as the starting guess for further iterations.

10.3. Parabolic Partial Differential Equations#

Typical parabolic PDEs one encounters in physical situations are the diffusion equation:

\(\frac{\partial \phi}{\partial t} = \mathbf{\nabla} \cdot ( D \mathbf{\nabla} \phi ) + S\),

where \(D\) is the diffusion constant (possibly space-dependent) and \(S\) is a source function.

Another example is the Schrödinger equation:

\(i\frac{\partial \phi}{\partial t} = -\frac{ \hbar^2 } { 2 m} \nabla^2 \phi + V\phi\).

In contrast to the boundary value problems, these are generally of the initial value type: we are given the field \(\phi\) at an initial time, and seek to find it at a later time. The evolution is subject to certain spatial boundary conditions, e.g. the Schrödinger wave function vanishes at very large distances, or the temperature or heat flux is specified on some surfaces.

Methods by which such problems are solved on a computer are straightforward, although a few subtleties are involved.

10.3.1. Naive Discretization and Instabilities#

We begin by treating diffusion in one dimension, with a uniform diffusion constant \(D=1\).

We take \(x \in [0,1]\) and assume Dirichlet boundary conditions that specify the value of the field at the end points of the interval.

We will focus on the rescaled equation:

\(\frac{\partial \phi}{\partial t} = \frac{\partial^2 \phi}{\partial x^2} + S(x,t)\).

As before, we will approximate the spatial derivatives by finite differences on a uniform lattice of \(N+1\) points with spacing \(h=1/N\). The time derivative will be approximated by the simplest first-order difference formula, assuming a time step \(\Delta t\).

We will use a superscript \(n\) to label the time step:

\(\phi^n \equiv \phi(t_n)\), with \(t_n = n \Delta t\).

As our first approximation to the equation, we approximate the second derivative on the RHS as follows:

\(\frac{\partial^2 \phi}{\partial x^2} \equiv -\hat{H}\phi \approx \frac{\phi^n_{i+1} + \phi^n_{i-1} - 2 \phi^n_i }{h^2}\),

where we have also defined an operator which corresponds to minus the second spatial derivative, with the discretized form:

\(\hat{H} \phi^n_i = - \frac{\phi^n_{i+1} + \phi^n_{i-1} - 2 \phi^n_i }{h^2}\).

An “explicit” discretization, or differencing scheme, of the differential equation is then given by:

\(\frac{ \phi_i^{n+1} - \phi_i^n }{\Delta t} = \frac{\phi^n_{i+1} + \phi^n_{i-1} - 2 \phi^n_i }{h^2} + S_i^n\).

If we then solve the differential equation for \(\phi\) at the next time step, we obtain:

\(\phi^{n+1} = (1-\hat{H} \Delta t) \phi^n + S^n \Delta t\),

where we have suppressed the index \(i\), essentially turning the above into a matrix equation.

10.3.2. Example 10.2: Parabolic PDEs: a first example#

Solve the differential equation (i.e. find \(\phi\) for later times):

\(\frac{\partial \phi}{\partial t} = \frac{ \partial^2 \phi}{\partial x^2} + S(x,t)\),

for \(S=0\) and \(\phi(0,t) = \phi(1,t) = 0\), satisfied by the initial condition of a Gaussian centered about \(x=1/2\):

\(\phi(x,t=0) = e^{-20(x-1/2)^2} - e^{-20(x-3/2)^2} - e^{-20(x+1/2)^2}\),

where the latter two “image” Gaussians approximately ensure the boundary conditions at \(x=1\) and \(x=0\), respectively.

Use the naive discretization formula:

\(\phi^{n+1} = (1 - \hat{H} \Delta t) \phi^n + S^n \Delta t\),

where \((\hat{H}\phi)_i\equiv -\frac{1}{h^2} ( \phi_{i+1} + \phi_{i-1} - 2 \phi_i)\).

Integrate up to \(t_\mathrm{max} = 0.045\) with time steps \(\Delta t = 0.00075\) and \(\Delta t = 0.0080\), on a lattice with \(N=25\) intervals.

Compare to the analytical solution:

\(\phi(x,t) = \tau^{-1/2} \left[ e^{-20(x-1/2)^2/\tau} - e^{-20(x-3/2)^2/\tau} - e^{-20(x+1/2)^2/\tau} \right]\), where \(\tau = 1 + 80t\).
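A minimal sketch of the explicit scheme for this example (NumPy assumed; names ours):

```python
import numpy as np

N = 25
h = 1.0 / N
x = np.linspace(0.0, 1.0, N + 1)

def profile(x, tau=1.0):
    """Analytic solution; tau = 1 gives the initial condition."""
    return (np.exp(-20.0 * (x - 0.5)**2 / tau)
            - np.exp(-20.0 * (x - 1.5)**2 / tau)
            - np.exp(-20.0 * (x + 0.5)**2 / tau)) / np.sqrt(tau)

for dt in (0.00075, 0.0080):
    phi = profile(x)
    t = 0.0
    while t < 0.045:
        # explicit step: phi^{n+1} = (1 - H dt) phi^n, with S = 0
        phi[1:-1] += dt * (phi[2:] + phi[:-2] - 2.0 * phi[1:-1]) / h**2
        t += dt
    print(dt, np.max(np.abs(phi - profile(x, tau=1.0 + 80.0 * t))))
```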

Both time steps chosen are quite small compared to the natural time scale of the solution, \(t\approx 0.01\). Nevertheless, with the larger time step things go very wrong: an unphysical instability develops in the numerical solution, which quickly acquires violent oscillations from one lattice point to the next.

Let’s try to understand what is happening here. Consider the differential equation with \(S=0\), as we are trying to solve in the above example:

\(\frac{\partial \phi}{\partial t} = \frac{\partial^2 \phi}{\partial x^2}\).

The RHS can be replaced by the operator \(-\hat{H}\), to obtain:

\(\frac{\partial \phi(x,t)}{\partial t} = -\hat{H} \phi(x,t)\).

This can be formally solved to give:

\(\phi(x,t) = e^{-t\hat{H}} \phi(x,0)\).

If we now discretize time: \(t_n = n \Delta t\), \(\phi(t_n,x) = \phi^n\), as before, we get:

\(\phi^n = e^{-n \Delta t \hat{H}} \phi^0\).

Now let the set of states \(\psi_\lambda\) be the eigenfunctions of the operator \(\hat{H}\) with eigenvalues \(\epsilon_\lambda\).

Since \(\hat{H}\) is Hermitian, the eigenvalues \(\epsilon_\lambda\) are real and the eigenvectors can be chosen to be orthonormal.

What this implies is that we can expand the solution at any time in terms of this basis:

\(\phi^n = \sum_\lambda \phi^n_\lambda \psi_\lambda\),

where \(\phi^n_\lambda\) are effectively the coefficients of the expansion.

We also expand the initial condition in terms of the eigenvectors:

\(\phi^0 = \sum_{\lambda} \phi^0_\lambda \psi_\lambda\),

then, substituting into the formal discretized solution, \(\phi^n = e^{-n \Delta t \hat{H}} \phi^0\), we obtain:

\(\sum_\lambda \phi^n_\lambda \psi_\lambda = e^{-n \hat{H} \Delta t} \sum_{\lambda} \phi_\lambda^0 \psi_\lambda\),

giving us the evolution of each component of the solution:

\(\phi_\lambda^n = e^{-n \epsilon_\lambda \Delta t} \phi^0_\lambda\).

This corresponds to the correct behavior of the diffusion equation: short-wavelength components (i.e. those with large eigenvalues \(\epsilon_\lambda\)) disappear more rapidly as the solution “smooths out”.

However, the naive discretization (i.e. the “explicit” scheme):

\(\phi^{n+1} = (1 - \hat{H} \Delta t) \phi^n\),

would result in:

\(\phi^n = (1 - \hat{H} \Delta t)^n \phi^0\),

or:

\(\phi^n_\lambda = (1- \epsilon_\lambda \Delta t)^n \phi^0_\lambda\).

Recall one of the definitions of the exponential function:

\(e^{\alpha t} = \lim_{n\rightarrow \infty} (1 + \alpha \Delta t)^n\), with \(\Delta t = t/n\).

So the naive discretization in the limit \(\Delta t \rightarrow 0\) would yield the correct evolution:

\(\phi^n_\lambda = e^{-n \epsilon_\lambda \Delta t} \phi^0_\lambda\).

Therefore, as long as \(\Delta t\) is chosen to be small, \((1-\epsilon_\lambda \Delta t)^n\) approximates the exponential, and the short-wavelength (large \(\epsilon_\lambda\)) components dampen with time.

However, if \(\Delta t\) is too large, one or more of the quantities \(1 - \epsilon_\lambda \Delta t\) has an absolute value greater than unity. The corresponding components, even if present only due to very small numerical round-off errors, are then amplified with each time step, and soon grow to dominate.

To quantify the limit on \(\Delta t\), we have some guidance in that the eigenvalues of \(\hat{H}\) are known analytically in this simple model problem.

We have: \(\hat{H} = - \frac{ \partial^2 }{ \partial x^2}\). The eigenvalue equation is simply then:

\(\hat{H} \psi_\lambda = \epsilon_\lambda \psi_\lambda\)

\(\Rightarrow \frac{ \partial^2 \psi_\lambda }{ \partial x^2} = - \epsilon_\lambda \psi_\lambda\),

with \(\psi_\lambda(0) = \psi_\lambda(1) = 0\).

The eigenfunctions are then:

\(\psi_\lambda = A \sin \sqrt{\epsilon_\lambda}\, x\), with \(\sqrt{\epsilon_\lambda} = \lambda \pi\), i.e. eigenvalues \(\epsilon_\lambda = \lambda^2 \pi^2\), for \(\lambda = 1,2,\ldots\)

On a lattice with \(x_i = ih\), the discretized solutions are then:

\((\psi_\lambda)_i = A \sin \lambda \pi i h\),

and since \(h = 1/N\):

\((\psi_\lambda)_i = A \sin \frac{\lambda \pi i}{N}\).

The eigenvalues have to change when we move to the lattice. To find them, consider the action of the discretized form of \(\hat{H}\) on the discretized eigenfunctions:

\((\hat{H}\psi_\lambda)_i = -\frac{1}{h^2} \left[ (\psi_\lambda)_{i+1} + (\psi_\lambda)_{i-1} - 2 (\psi_\lambda)_{i} \right]\).

or:

\((\hat{H}\psi_\lambda)_i = -\frac{1}{h^2} \left[ \sin \frac{ \lambda \pi (i+1) } {N} + \sin \frac{ \lambda \pi (i-1) } {N} - 2 \sin \frac{ \lambda \pi i} {N} \right]\).

Using the trigonometric identities \(\sin(A+B) = \sin A \cos B + \sin B \cos A\) and \(1-\cos \frac{ \lambda \pi } {N} = 2 \sin^2 \frac{ \lambda \pi } { 2N }\), it is easy to verify that the eigenvalues after discretization are:

\(\epsilon_\lambda = \frac{4}{h^2} \sin^2 \frac{ \lambda \pi }{2 N}\).

The largest eigenvalue corresponds to: \(\lambda = N-1\), i.e.:

\(\epsilon_{N-1} = \frac{4}{h^2} \sin^2\left[\frac{(N-1)\pi}{2N}\right] = \frac{4}{h^2} \sin^2\left(\frac{\pi}{2} - \frac{\pi}{2N}\right) = \frac{4}{h^2} \cos^2\left(\frac{\pi}{2N}\right) \approx \frac{4}{h^2}\),

and to an eigenvector that changes sign from one lattice point to the next:

\((\psi_{N-1})_i = A \sin\left(\frac{(N-1) \pi i}{N}\right) = A\left[ \sin(\pi i) \cos\left(\frac{\pi i}{N}\right) - \cos(\pi i) \sin\left(\frac{\pi i}{N}\right)\right] = -A (-1)^i \sin\left(\frac{\pi i}{N}\right) \propto (-1)^i\).

Requiring that \(| 1 - \epsilon_\lambda \Delta t | < 1\), then yields: \(\Delta t \lesssim h^2/2\), which in the case of our example corresponds to \(\Delta t \lesssim 0.0008\).

The question of stability is quite distinct from that of accuracy, as the limit imposed on the time step is set by the spatial step used, and not by the characteristic time scale of the solution, which is much larger.

The explicit scheme which we have discussed is unsatisfactory, as the instability forces us to use a much smaller time step than is required to describe the evolution adequately. Indeed, the situation gets even worse if we try to use a finer spatial lattice to obtain a more accurate solution.

Although the restriction on \(\Delta t\) that we have derived is rigorous only for the simple case we have considered, it does provide a useful guide for more complicated situations, as the eigenvector of \(\hat{H}\) with the largest eigenvalue will always oscillate from one lattice point to the next. Its eigenvalue is therefore quite insensitive to the global features of the problem.

10.3.3. Implicit Schemes#

One way around the instability of the explicit algorithm discussed above, is to retain the general form:

\(\frac{ \phi_i^{n+1} - \phi_i^n }{\Delta t} = \frac{\phi^n_{i+1} + \phi^n_{i-1} - 2 \phi^n_i }{h^2} + S_i^n\),

but to replace the second space derivative by that of the solution, at the new time, i.e.:

\(\frac{ \phi_i^{n+1} - \phi_i^n }{\Delta t} = \frac{\phi^{n+1}_{i+1} + \phi^{n+1}_{i-1} - 2 \phi^{n+1}_i }{h^2} + S_i^n\).

This defines an implicit scheme, since the unknown \(\phi^{n+1}\) appears on both sides of the equation.

In terms of the operator \((\hat{H}\phi^{n+1})_i = -\frac{\phi^{n+1}_{i+1} + \phi^{n+1}_{i-1} - 2 \phi^{n+1}_i}{h^2}\), the above implicit equation can be written as:

\(\frac{ \phi_i^{n+1} - \phi_i^n }{\Delta t} = -\hat{H}\phi^{n+1}_i + S_i^n\).

Solving for \(\phi^{n+1}\), we get:

\(\phi^{n+1} = \frac{1}{1 + \hat{H} \Delta t} \left[ \phi^n + S^n \Delta t \right]\).

This scheme is equivalent to the explicit scheme to lowest order in \(\Delta t\) (as seen by expanding the denominator). However, it is much better in that larger time steps can be used: the operator \((1 + \hat{H} \Delta t)^{-1}\) has eigenvalues \((1+\epsilon_\lambda \Delta t)^{-1}\), all of whose moduli are less than one for any value of \(\Delta t\). Although the decay of the most rapidly oscillating components is inaccurate (i.e. not exponential, as expected from the “formal” solution of the diffusion equation), such components should not be large in the initial conditions if the spatial discretization is accurate.

In any event, there’s no amplification of the large-eigenvalue components, which caused the instability found in the explicit scheme. For the slowly-varying components of the solution, corresponding to small eigenvalues, the evolution closely approximates the exponential behavior at each step.

A potential drawback is that it requires the solution of a set of linear equations (albeit tri-diagonal) at each time step to find \(\phi^{n+1}\). This is equivalent to the application of the inverse of the matrix \((1+\hat{H} \Delta t)\) to the vector appearing in the brackets. Since the inverse itself is time-independent, it might be found only once at the beginning of the calculation and then used for all times, but application still requires \(N^2\) operations if done directly.

Fortunately, there exists an algorithm that provides a very efficient solution (\(\mathcal{O}(N)\) operations) of a tri-diagonal system of equations, known as “Gaussian elimination and back-substitution”.

The algorithm proceeds as follows: let’s consider trying to solve the tri-diagonal linear system of equations:

\(\mathbf{A} \mathbf{\phi} = \mathbf{b}\) for the unknowns \(\phi_i\):

\(A_i^- \phi_{i-1}^{n+1} + A^0_i \phi_i^{n+1} + A^+_i \phi_{i+1}^{n+1} = b_i\).

Here, the \(A^{\pm,0}_i\) are the only non-vanishing elements of \(\mathbf{A}\), and the \(b_i\) are the known quantities. This is exactly the form of the problem posed by our implicit evolution equation:

\(\frac{ \phi_i^{n+1} - \phi_i^n }{\Delta t} = \frac{\phi^{n+1}_{i+1} + \phi^{n+1}_{i-1} - 2 \phi^{n+1}_i }{h^2} + S_i^n\).

Rearranging:

\(\phi_i^{n+1} - \frac{\Delta t}{h^2} \left[\phi^{n+1}_{i+1} + \phi^{n+1}_{i-1} - 2 \phi^{n+1}_i\right] = \phi_i^n + S_i^n \Delta t\),

to get:

\( - \frac{\Delta t}{h^2} \phi^{n+1}_{i-1} + \left( 1 + 2\frac{\Delta t} {h^2}\right) \phi^{n+1}_i - \frac{\Delta t}{h^2} \phi^{n+1}_{i+1} = \phi_i^n + S_i^n \Delta t\),

which corresponds to:

\(b_i = \phi_i^n + S_i^n \Delta t\),

\(A_i^0 = 1 + 2\frac{\Delta t} {h^2}\),

\(A_i^\pm = - \frac{\Delta t} {h^2}\).

To solve this system of equations, we assume that the solution satisfies a one-term forward recursion relation of the form:

\(\phi^{n+1}_{i+1} = \alpha_i \phi^{n+1}_i + \beta_i\),

where the coefficients \(\alpha_i\) and \(\beta_i\) are to be determined.

Substituting the above relation into the linear system of equations:

\(A^-_i \phi_{i-1}^{n+1} + A_i^0 \phi_i^{n+1} + A_i^+ (\alpha_i \phi^{n+1}_i + \beta_i) = b_i\),

and solving for \(\phi_i^{n+1}\), we get:

\(\phi^{n+1}_i = \gamma_i A_i^- \phi^{n+1}_{i-1} + \gamma_i (A^+_i \beta_i - b_i)\),

with \(\gamma_i = -\frac{ 1 }{A_i^0 + A_i^+ \alpha_i}\).

Comparing the above with \(\phi^{n+1}_{i} = \alpha_{i-1} \phi^{n+1}_{i-1} + \beta_{i-1}\), i.e. the hypothesis with \(i\rightarrow i-1\), we can identify the following recursion relations for the \(\alpha\)’s and \(\beta\)’s:

\(\alpha_{i-1} = \gamma_i A^-_i\),

\(\beta_{i-1} = \gamma_i (A_i^+ \beta_i - b_i)\).

Note that these are backward recursion relations.

The strategy to solve the problem then proceeds as follows:

  1. We use the relations for \(\alpha_{i-1}\), \(\beta_{i-1}\), and \(\gamma_i\) to sweep the lattice backwards, determining the \(\alpha_i\) and \(\beta_i\) for \(i\) running from \(N-2\) down to \(0\). The starting values to be used are \(\alpha_{N-1} = 0\) and \(\beta_{N-1} = \phi_N^{n+1}\) (the known boundary value), which guarantee the correct value of \(\phi\) at the last lattice point.

  2. Having determined these coefficients, we can then use the recursion relation: \(\phi^{n+1}_{i+1} = \alpha_i \phi^{n+1}_i + \beta_i\) in a forward sweep from \(i=0\) to \(i=N-1\) to determine the solution, with the starting value of \(\phi_0^{n+1}\) known from the boundary conditions.

We have then determined the solution in only two sweeps of the lattice, involving \(\mathcal{O}(N)\) arithmetic operations. The increase in numerical effort per time step is about a factor of two.

Note that the \(\alpha_i\) and \(\gamma_i\) are independent of \(\mathbf{b}\), so it is more efficient to compute these coefficients only once, at the beginning of the calculation, and store them. Only the \(\beta_i\) then need to be computed anew at each time step.
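A sketch of one implicit time step implemented with the two-sweep algorithm just described might look as follows (NumPy assumed; the function name tridiag_step is ours):

```python
import numpy as np

def tridiag_step(phi, S, h, dt):
    """One implicit time step by the two-sweep (Gaussian elimination and
    back-substitution) algorithm described above.

    phi: (N+1,) array; phi[0] and phi[-1] hold the Dirichlet boundary values.
    The recursion used is phi[i+1] = alpha[i] * phi[i] + beta[i].
    """
    N = len(phi) - 1
    Am = Ap = -dt / h**2             # A_i^-, A_i^+ (constant on our lattice)
    A0 = 1.0 + 2.0 * dt / h**2       # A_i^0
    b = phi + S * dt                 # b_i = phi_i^n + S_i^n * dt

    alpha = np.zeros(N)
    beta = np.zeros(N)
    beta[N-1] = phi[N]               # guarantees the boundary value at i = N
    for i in range(N - 1, 0, -1):    # backward sweep for the alpha's, beta's
        gamma = -1.0 / (A0 + Ap * alpha[i])
        alpha[i-1] = gamma * Am
        beta[i-1] = gamma * (Ap * beta[i] - b[i])
    for i in range(N):               # forward sweep; phi[0] is known
        phi[i+1] = alpha[i] * phi[i] + beta[i]
    return phi
```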

10.3.4. Example 10.3: Application of the Implicit Scheme for Parabolic PDEs#

Use the implicit scheme to solve the problem of Example 10.2. Use a lattice with \(N=25\) intervals. Try time steps of \(\Delta t = 0.00075\) and \(\Delta t=0.005\).

Compare to the exact solution by graphing the results at \(t=0.045\).
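Reusing the hypothetical tridiag_step sketch from the previous section, the calculation might be set up as follows (plotting omitted):

```python
import numpy as np

N = 25
h = 1.0 / N
x = np.linspace(0.0, 1.0, N + 1)
S = np.zeros(N + 1)                  # no source in this example

for dt in (0.00075, 0.005):
    phi = (np.exp(-20.0 * (x - 0.5)**2)
           - np.exp(-20.0 * (x - 1.5)**2)
           - np.exp(-20.0 * (x + 0.5)**2))
    for step in range(int(round(0.045 / dt))):
        phi = tridiag_step(phi, S, h, dt)
    print(dt, phi.max())             # remains stable for both time steps
```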

10.3.5. Solution by Direct Matrix Multiplication#

Modern computers are efficient at matrix manipulations. The number of operations will be higher than in the approaches discussed in the previous sections, but for small problems this is nevertheless efficient enough.

Let’s write down the form of the \(\hat{H}\) operator in the matrix representation. This is easy, since we know its action (up to a factor of \(-1/h^2\)):

\((\hat{H}\phi)_i \propto \phi_{i+1} + \phi_{i-1} - 2\phi_i\).

E.g. for \(i=1\):

\((\hat{H}\phi)_1 \propto \phi_{2} + \phi_{0} - 2\phi_1\),

for \(i=2\):

\((\hat{H}\phi)_2 \propto \phi_{3} + \phi_{1} - 2\phi_2\),

for \(i=N-1\):

\((\hat{H}\phi)_{N-1} \propto \phi_{N} + \phi_{N-2} - 2\phi_{N-1}\),

or in matrix form:

\(\hat{H} \propto \left(\begin{array}{cccccc} 1 & 0 & ... & 0 & 0 & 0\\ 1 & -2 & 1 & 0 & 0 & ... \\ 0 & 1 & -2 & 1 & 0 & ... \\ \ddots & \ddots & \ddots & \ddots & \ddots & \ddots \\ 0 & 0 & ... & 1 & -2 & 1 \\ 0 & 0 & ... & 0 & 0 & 1\\ \end{array}\right)\).

Therefore, to obtain a solution in the implicit scheme:

\(\phi^{n+1} = [1 + \hat{H} \Delta t]^{-1} \left[ \phi^n + S^n \Delta t \right]\),

with:

\((\hat{H}\phi)_i = -\frac{\phi_{i+1} + \phi_{i-1} - 2 \phi_i}{h^2}\),

all we need to do is find the inverse of the matrix operator:

\([1 + \hat{H} \Delta t]\).

Note that the inverse is calculated in NumPy via:

np.linalg.inv()

On the other hand, we can also use the full time evolution formula derived from the formal solution, via the matrix exponential.

Starting from the PDE:

\(\frac{\partial \phi(x,t)}{\partial t} = -\hat{H} \phi(x,t)\).

This can be formally solved to give:

\(\phi(x,t) = e^{-t\hat{H}} \phi(x,0)\),

which implies, in the discretized form, that:

\(\phi^n = e^{-n \Delta t \hat{H}} \phi^0\).

A single time step therefore corresponds to:

\(\phi^{n+1} = e^{-\Delta t \hat{H}} \phi^n\).

The matrix exponential can be calculated in SciPy (via a Padé approximant, see https://epubs.siam.org/doi/10.1137/09074721X):

scipy.linalg.expm(),

allowing us to perform the direct time step evolution in the simple case we are considering.

10.3.6. Example 10.4: Direct inversion of the matrix and the full evolution#

Use direct inversion of the tri-diagonal matrix to calculate the evolution in the problem of Examples 10.2 and 10.3.

(a) Use the implicit formula as the starting point:

\(\phi^{n+1} = \frac{1}{1 + \hat{H} \Delta t} \left[ \phi^n + S^n \Delta t \right]\),

with:

\((\hat{H}\phi)_i = -\frac{\phi_{i+1} + \phi_{i-1} - 2 \phi_i}{h^2}\).

(b) Use the full evolution formula:

\(\phi^n = e^{-n \Delta t \hat{H}} \phi^0\),

in the form:

\(\phi^{n+1} = e^{-\Delta t \hat{H}} \phi^n\).

You may use the matrix SciPy exponential:

scipy.linalg.expm
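A minimal sketch of both parts (NumPy and SciPy assumed; names ours; we reuse \(\Delta t = 0.00075\)), restricted to the interior points so that the zero boundary values drop out:

```python
import numpy as np
from scipy.linalg import expm

N = 25
h = 1.0 / N
dt = 0.00075
x = np.linspace(0.0, 1.0, N + 1)
m = N - 1                                 # number of interior points

# H on the interior points (tri-diagonal)
H = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h**2

step_implicit = np.linalg.inv(np.eye(m) + H * dt)   # (1 + H dt)^-1, computed once
step_full = expm(-H * dt)                           # e^{-H dt}, computed once

phi0 = (np.exp(-20.0 * (x - 0.5)**2) - np.exp(-20.0 * (x - 1.5)**2)
        - np.exp(-20.0 * (x + 0.5)**2))[1:-1]
phi_a, phi_b = phi0.copy(), phi0.copy()
for n in range(60):                       # 60 steps of 0.00075 -> t = 0.045
    phi_a = step_implicit @ phi_a         # (a) implicit scheme
    phi_b = step_full @ phi_b             # (b) full evolution operator
print(np.max(np.abs(phi_a - phi_b)))
```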

10.3.7. Diffusion and Boundary Value Problems in Higher Dimensions#

We saw that diffusion in one dimension is best handled either by an implicit method or by direct inversion of a tri-diagonal matrix.

It is therefore natural to extend this approach to two or more spatial dimensions.

For the two-dimensional diffusion equation:

\(\frac{ \partial \phi }{\partial t} = \nabla^2 \phi\),

the discretization is straightforward, and following our discussion of the one-dimensional problem, the time evolution should be effected by:

\(\phi^{n+1} = \frac{1}{ 1 + \hat{H} \Delta t} \phi^n\),

where now:

\((\hat{H}\phi)_{ij} = -\frac{1}{h^2} \left[ (\phi_{i+1j} + \phi_{i-1j} - 2\phi_{ij}) + (\phi_{ij+1} + \phi_{ij-1} - 2\phi_{ij}) \right]\).

Despite the fact that \(\hat{H}\) is very sparse, it is not tri-diagonal, and therefore the one-dimensional algorithm does not apply directly.

However, \(\hat{H}\) can be written as a sum of operators that separately involve differences only in the \(i\) or \(j\) indices:

\(\hat{H} = \hat{H}_i + \hat{H}_j\),

where \((\hat{H}_i\phi)_{ij} = -\frac{1}{h^2} (\phi_{i+1j} + \phi_{i-1j} - 2\phi_{ij})\) and \((\hat{H}_j\phi)_{ij} = -\frac{1}{h^2} (\phi_{ij+1} + \phi_{ij-1} - 2\phi_{ij})\).

This means that we can write down an expression that agrees with the original one through \(\mathcal{O}(\Delta t)\):

\(\phi^{n+1} = \frac{1}{1 + \hat{H}_i \Delta t} \frac{1}{1 + \hat{H}_j \Delta t} \phi^n\).

This now can be evaluated as before, as each of the required inversions involves a tri-diagonal matrix.

If we define the auxiliary function \(\phi^{n+1/2}\), then we can write:

\(\phi^{n+1/2} = \frac{1}{1 + \hat{H}_j \Delta t} \phi^n\),

and:

\(\phi^{n+1} = \frac{1}{1 + \hat{H}_i \Delta t} \phi^{n+1/2}\).

Thus, \((1 + \hat{H}_j \Delta t)^{-1}\) is applied by forward and backward sweeps of the lattice in the \(j\) direction, independently for each value of \(i\). The application of \((1+\hat{H}_i \Delta t)^{-1}\) is then carried out by forward and backward sweeps in the \(i\) direction, independently for each value of \(j\). This “alternating-direction” scheme is stable for all values of the time step and is generalized straightforwardly to three dimensions.
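A sketch of one alternating-direction step (NumPy assumed; names ours; homogeneous Dirichlet boundaries so that the boundary terms drop out); for clarity, each tri-diagonal inversion is done with a dense solve rather than the \(\mathcal{O}(N)\) two-sweep algorithm:

```python
import numpy as np

def adi_step(phi, h, dt):
    """One alternating-direction implicit step for 2D diffusion.

    phi: (N+1, N+1) array whose edges hold (zero) Dirichlet boundary values.
    Applies (1 + H_j dt)^-1 along each row, then (1 + H_i dt)^-1 along each
    column; each application is a tri-diagonal solve.
    """
    N = phi.shape[0] - 1
    m = N - 1
    T = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h**2
    A = np.eye(m) + T * dt               # 1 + H dt along a single direction

    for i in range(1, N):                # j-direction, independently for each i
        phi[i, 1:N] = np.linalg.solve(A, phi[i, 1:N])
    for j in range(1, N):                # i-direction, independently for each j
        phi[1:N, j] = np.linalg.solve(A, phi[1:N, j])
    return phi
```

In practice one would factor \(A\) once (or use the two-sweep algorithm) rather than calling a dense solver at every step.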

10.4. Iterative Methods for Eigenvalue Problems#

Our analysis of the diffusion equation shows that the net result of time evolution is to enhance those components of the solution with smaller eigenvalues of \(\hat{H}\) relative to those with larger eigenvalues.

Indeed, at very long times only the component with the lowest eigenvalue remains significant, albeit with a very small amplitude.

This suggests a scheme for finding the lowest eigenvalue of an elliptic operator, solving the problem:

\(-\left( \frac{ \partial^2 } { \partial x^2 } + \frac{ \partial^2 } { \partial y^2 } \right) \phi + V(x,y) \phi = \epsilon \phi\),

by guessing a trial eigenvector and subjecting it to a fictitious time evolution that will “filter” it to the eigenvector having lowest eigenvalue.

Since we are dealing with a linear problem, the relentlessly decreasing or increasing magnitude of the solution can be avoided by renormalizing continuously as time proceeds.

Concretely, consider the time-independent Schrödinger equation in one dimension with \(\hbar = 2m = 1\). The eigenvalue problem is then:

\(\left[ -\frac{ \mathrm{d}^2 } { \mathrm{d} x^2 } + V(x) \right] \phi = \epsilon \phi\),

with the normalization condition:

\(\int \mathrm{d} x \phi^2 = 1\).

Note that \(\phi\) can always be chosen to be real if \(V\) is.

The corresponding energy functional is:

\(\mathcal{E} = \int \mathrm{d}x \left[ \left( \frac{ \mathrm{d} \phi } { \mathrm{d} x} \right)^2 + V(x) \phi^2(x) \right]\).

We know that \(\mathcal{E}\) is stationary at an eigenfunction with respect to variations of \(\phi\) that respect the normalization condition, and that the value of \(\mathcal{E}\) at this eigenfunction is the associated eigenvalue.

To derive a discrete approximation to this problem, we first discretize \(\mathcal{E}\) as:

\(\mathcal{E} = \sum_i h \left[ \frac{ (\phi_i - \phi_{i-1})^2 } { h^2 } + V_i \phi_i^2 \right]\),

and the normalization constraint takes the form:

\(\sum_i h \phi^2_i = 1\).

Variation with respect to \(\phi\) gives the eigenvalue problem:

\((\hat{H} \phi)_i \equiv -\frac{1}{h^2} ( \phi_{i+1} + \phi_{i-1} - 2 \phi_i ) + V_i \phi_i = \epsilon \phi_i\),

(Note that \(\epsilon\) enters as a Lagrange multiplier that ensures the normalization).

The above defines the problem of finding the real eigenvalues and eigenvectors of a (large) symmetric tri-diagonal matrix representing the operator \(\hat{H}\).

Although we have already discussed direct methods for solving this problem (e.g. the Numerov algorithm), they cannot be applied to the very large matrices that arise in two- and three-dimensional problems.

In such cases, one is usually interested in the few highest or lowest eigenvalues of the problem, and for these the diffusion picture is appropriate.

We thus consider the problem:

\(\frac{ \partial \phi } { \partial \tau } = -\hat{H} \phi\),

where \(\tau\) is a “fake” time.

For convenience, we can arrange the problem such that the lowest eigenvalue of \(\hat{H}\) is positive definite. This can be guaranteed by shifting \(\hat{H}\) by a spatially-independent constant, chosen so that the resultant \(V_i\) is positive \(\forall i\).

To solve this “fake” diffusion problem, any of the algorithms discussed previously in this chapter can be employed.

The simplest is the “explicit” scheme:

\(\phi^{n+1} \sim (1 - \hat{H} \Delta \tau) \phi^n\),

where \(\Delta \tau\) is a small, positive parameter. The symbol \(\sim\) is used to indicate that \(\phi^{n+1}\) is to be normalized to unity according to the discrete normalization condition stated above. For an initial guess, we can choose \(\phi^0\) to be anything not orthogonal to the exact eigenfunction, although guessing something close to the solution will speed the convergence.

At each step in this refining process, computation of the energy furnishes an estimate of the true eigenvalue.

As a first example, let’s consider the infinite square well.

10.4.1. Example 10.5: The Infinite Square Well Using the Fake Diffusion Method#

(a) Use the fake time diffusion method on a lattice with 21 points to find the lowest eigenvalue and the corresponding eigenstate, of the infinite square well.

Start with a trial eigenfunction \(\phi(x) = x(1-x)\) and perform 100 time evolutions with a time step \(\Delta \tau = 0.00075\).

\(V(x) = 0\) for \(0 \leq x \leq 1\), \(V(x)= \infty\) otherwise, with \(\phi(0) = \phi(1) = 0\).

You may set \(\hbar = 2m = 1\) and solve the problem:

\(\left[ -\frac{ \mathrm{d}^2 } { \mathrm{d} x^2 } + V(x) \right] \phi = \epsilon \phi\).

(b) Plot the lowest eigenfunction and compare to the exact one.

(c) Plot the evolution of the eigenvalue for each time step and show that it converges to the correct lowest one.

(d) Use the explicit method to find the largest eigenvalue. Use a time step of \(\Delta \tau = 0.0015\). Compare to your expectation from the largest eigenvalue for the discrete case with \(N=21\).

Recall that the exact eigenfunctions and eigenvalues are given by:

\(\psi_\lambda = \sqrt{2} \sin \lambda \pi x\) with \(\epsilon_\lambda = \lambda^2 \pi^2\), where \(\lambda\) is a positive integer.

Also recall that in the discretized form, \(\epsilon_\lambda = \frac{4}{h^2} \sin^2 \frac{ \lambda \pi } { 2N }\).
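A minimal sketch of part (a) (NumPy assumed; names ours; we interpret “21 points” as \(N = 20\) intervals):

```python
import numpy as np

N = 20                                   # 21 lattice points, h = 1/N
h = 1.0 / N
x = np.linspace(0.0, 1.0, N + 1)
dtau = 0.00075

def apply_H(phi):
    """Discretized H = -d^2/dx^2 (V = 0 inside the well)."""
    Hphi = np.zeros_like(phi)
    Hphi[1:-1] = -(phi[2:] + phi[:-2] - 2.0 * phi[1:-1]) / h**2
    return Hphi

phi = x * (1.0 - x)                      # trial function, zero at the walls
phi /= np.sqrt(h * np.sum(phi**2))       # impose h * sum(phi_i^2) = 1
for n in range(100):
    phi -= dtau * apply_H(phi)           # explicit fake-time step
    phi /= np.sqrt(h * np.sum(phi**2))   # renormalize to unity
    eps = h * np.sum(phi * apply_H(phi)) # energy estimate for this iteration
print(eps, np.pi**2)                     # approaches the lowest eigenvalue
```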

Although the procedure works, it is unsatisfactory in that the limitation on the size of \(\Delta \tau\) caused by the lattice spacing often results in having to iterate many times to refine a poor trial function. This can be alleviated to some extent by choosing a good trial function. Even better is to use an implicit scheme that does not amplify the large-eigenvalue components for any value of \(\Delta \tau\).

Another possibility is to use \(\exp(-\hat{H} \Delta \tau)\) to refine the trial function, as we did before.

The method can be used to find other eigenvalues, e.g. the second-lowest eigenvalue:

  • First find the lowest eigenvalue and eigenfunction by the method described above.

  • A trial function for the second eigenfunction is then guessed and refined in the same way, taking care that at each stage of the refinement the solution remains orthogonal to the lowest eigenfunction already found. This can be done by projecting out the component of the solution along the lowest eigenfunction, as in the sketch below.

Having found the second-lowest eigenvalue, the third-lowest can be found similarly, taking care that during its refinement it remains orthogonal to both of the eigenfunctions with lower eigenvalues.
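The projection step might be sketched as follows (NumPy assumed; the function name is ours):

```python
import numpy as np

def project_out(phi, lower_states, h):
    """Remove from phi its components along previously found eigenfunctions.

    lower_states: list of normalized eigenfunctions, h * sum(psi_i^2) = 1 each.
    Called after every refinement step to keep the trial function orthogonal.
    """
    for psi in lower_states:
        phi = phi - psi * (h * np.sum(psi * phi))   # subtract the overlap
    return phi
```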

This process cannot be applied to find more than the few lowest (or highest) eigenvalues, as numerical round-off errors in the orthogonalizations to the many previously-found eigenvectors soon grow to dominate.

Although the methods described have been illustrated by a one-dimensional example, it should be clear that they can be applied directly to find the eigenvalues and eigenfunctions of elliptic operators in two or more dimensions.