Numerical Integration of ODEs

This article describes numerical methods for solving the initial-value problem, which is a standard type of problem appearing in many fields of physics. Suppose we have a system whose state at time t is described by a vector \vec{y}(t), which obeys a first-order ordinary differential equation (ODE) of the form:

\frac{d\vec{y}}{dt} = \vec{F}\Big(\vec{y}(t), t\Big).

Here, \vec{F} is some given vector-valued function, whose inputs are (i) the instantaneous state \vec{y}(t) and (ii) the current time t. Then, given an initial time t_0 and an initial state \vec{y}(t_0), the goal is to find \vec{y}(t) for subsequent times.

Conceptually, the initial value problem is distinct from the problem of solving an ODE discussed in the article on finite-difference equations. There, we were given a pair of boundaries with certain boundary conditions, and the goal was to find the solution between the two boundaries. In this case, we are given the state at an initial time t_0, and our goal is to find \vec{y}(t) for some set of future times t > t_0. This is sometimes referred to as "integrating" the ODE, because the solution has the form

\vec{y}(t) = \vec{y}(t_0)\, +\, \int^t_{t_0} dt' \;\vec{F}\Big(\vec{y}(t'), t'\Big).

However, unlike ordinary numerical integration (i.e., the computing of a definite integral), the value of the integrand is not known in advance, because of the dependence of \vec{F} on the unknown \vec{y}(t).

Example: equations of motion in classical mechanics

The above standard formulation of the initial-value problem can be used to describe a very large class of time-dependent ODEs found in physics. For example, suppose we have a classical mechanical particle with position \vec{r}, subject to an arbitrary external space-and-time-dependent force \vec{f}(\vec{r},t) and a friction force -\lambda d\vec{r}/dt (where \lambda is a damping coefficient). Newton's second law gives the following equation of motion:

m \frac{d^2 \vec{r}}{dt^2} = - \lambda \frac{d\vec{r}}{dt} + \vec{f}(\vec{r}, t).

This is a second-order ODE, whereas the standard initial-value problem involves a first-order ODE. However, we can turn it into a first-order ODE with the following trick. Define the velocity vector

\vec{v} = \frac{d\vec{r}}{dt},

and define the state vector by combining the position and velocity vectors:

\vec{y} = \begin{bmatrix}\vec{r} \\ \vec{v}\end{bmatrix}.

Then the equation of motion takes the form

\frac{d\vec{y}}{dt} = \frac{d}{dt}\begin{bmatrix}\vec{r} \\ \vec{v}\end{bmatrix} = \begin{bmatrix}\vec{v} \\ - (\lambda/m) \vec{v} + \vec{f}(\vec{r}, t)/m\end{bmatrix},

which is a first-order ODE, as desired. The quantity on the right-hand side is the derivative function \vec{F}(\vec{y},t) for the initial-value problem. Its dependence on \vec{r} and \vec{v} is simply regarded as a dependence on the upper and lower portions of the state vector \vec{y}. In particular, note that the derivative function does not need to be linear, since \vec{f} can have any arbitrary nonlinear dependence on \vec{r}, e.g. it could depend on the quantity |\vec{r}|.

The "initial state", \vec{y}(t_0), requires us to specify both the initial position and velocity of the particle, which is consistent with the fact that the original equation of motion was a second-order equation, requiring two sets of initial values to fully specify a solution. In a similar manner, ODEs of higher order can be converted into first-order form, by defining the higher derivatives as state variables and increasing the size of the state vector.

Forward Euler Method

The Forward Euler Method is the conceptually simplest method for solving the initial-value problem. For simplicity, let us discretize time, with equal spacings:

t_0, t_1, t_2, \dots \;\; \mathrm{where}\;\; h \equiv t_{n+1} - t_n.

Let us denote \vec{y}_n \equiv \vec{y}(t_n). The Forward Euler Method consists of the approximation

\vec{y}_{n+1} = \vec{y}_n + h \vec{F}(\vec{y}_n, t_n).

Starting from the initial state \vec{y}_0 and initial time t_0, we apply this formula repeatedly to compute \vec{y}_1, \vec{y}_2, and so forth. The Forward Euler Method is called an explicit method, because, at each step n, all the information that you need to calculate the state at the next time step, \vec{y}_{n+1}, is already explicitly known—i.e., you just need to plug \vec{y}_n and t_n into the right-hand side of the above formula.
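
As a minimal sketch (assuming a derivative function F(y, t) implemented in Python, like the one in the previous section), the whole method is a single loop:

import numpy as np

def forward_euler(F, y0, t0, h, nsteps):
    ## Repeatedly apply y_{n+1} = y_n + h * F(y_n, t_n).
    y, t = np.asarray(y0, dtype=float), t0
    ys = [y]
    for n in range(nsteps):
        y = y + h * F(y, t)
        t = t + h
        ys.append(y)
    return np.array(ys)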

The Forward Euler Method formula follows from the usual definition of the derivative, and becomes exactly correct as h \rightarrow 0. We can deduce the numerical error, which is called the local truncation error in this context, by Taylor expanding the left-hand side around t = t_n:

\vec{y}_{n+1} = \vec{y}_n + h \left.\frac{d\vec{y}}{dt}\right|_{t_n} + \frac{h^2}{2}\, \left.\frac{d^2\vec{y}}{dt^2}\right|_{t_n} + \cdots

The first two terms are precisely equal to the right-hand side of the Forward Euler Method formula. The local truncation error is the magnitude of the remaining terms, and hence it scales as O(h^2).

Instability

Fig. 1: The exact solution (blue) and Forward Euler solution (green) for dy/dt = -\kappa y(t), for y(0) = 1, \kappa = 1, and h = 2.1. The numerical solution is unstable; it blows up at large times, even though the exact solution is decaying to zero.

For the Forward Euler Method, the local truncation error leads to a profound problem known as instability. Because the method involves repeatedly applying a formula with a local truncation error at each step, it is possible for the errors on successive steps to progressively accumulate, until the solution itself blows up. To see this, consider the differential equation

\frac{dy}{dt} = - \kappa y.

Given an initial state y_0 at time t_0 = 0, the solution is y(t) = y_0 \, e^{-\kappa t}. For \kappa > 0, this decays exponentially to zero with increasing time. However, consider the solutions produced by the Forward Euler Method:

\begin{align}y_1 &= y_0 + h \cdot (-\kappa y_0) = (1-h\kappa)\, y_0 \\
y_2 &= y_1 + h \cdot (-\kappa y_1) = (1-h\kappa)^2 \, y_0\\
\vdots\; &= \;\;\;\vdots\\
y_n &= (1-h\kappa)^n\, y_0.\end{align}

If h > 2/\kappa, then 1-h\kappa < -1, and as a result y_n \rightarrow \pm\infty as n\rightarrow \infty. Even though the actual solution decays exponentially to zero, the numerical solution blows up, as shown in Fig. 1. Roughly speaking, the local truncation error causes the numerical solution to "overshoot" the true solution; if the step size is too large, the magnitude of the overshoot keeps growing with each step, destabilizing the numerical solution.
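
This blow-up is easy to reproduce numerically; the following short check uses the parameters of Fig. 1 (\kappa = 1, h = 2.1, y(0) = 1):

kappa, h, y = 1.0, 2.1, 1.0
for n in range(10):
    y = y + h * (-kappa * y)   # Forward Euler step for dy/dt = -kappa*y
print(y)   # about 2.59, i.e. (1 - h*kappa)**10, whereas the exact y(21) = e^{-21} is ~7.6e-10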

Stability is a fundamental problem for the integration of ODEs. The equations that tend to destabilize numerical ODE solvers are those whose rate or spring constants (like \kappa above) are "large" compared to the inverse of the time step; such equations are called stiff equations. At first, you might think this is no big deal: just make the step size h sufficiently small, and the blow-up can be avoided. The trouble is that it is often unclear how small is sufficiently small, particularly for complicated (e.g. nonlinear) ODEs, where \vec{F}(\vec{y},t) is effectively a "black box". Unlike the simple example above, we typically don't have the exact answer to compare with, so it can be hard to tell whether a numerical solution blows up because that's how the true solution behaves, or because the numerical method is unstable.

Backward Euler Method

In the Backward Euler Method, we take

\vec{y}_{n+1} = \vec{y}_n + h \vec{F}(\vec{y}_{n+1}, t_{n+1}).

Comparing this to the formula for the Forward Euler Method, we see that the inputs to the derivative function involve the solution at step n+1, rather than the solution at step n. As h \rightarrow 0, both methods clearly reach the same limit. Similar to the Forward Euler Method, the local truncation error is O(h^2).

Fig. 2: The exact solution (blue) and Backward Euler solution (green) for the same problem as Fig. 1. The numerical solution is stable.

Because the quantity \vec{y}_{n+1} appears in both the left- and right-hand sides of the above equation, the Backward Euler Method is said to be an implicit method (as opposed to the Forward Euler Method, which is an explicit method). For general derivative functions F, the solution for \vec{y}_{n+1} cannot be found directly, but has to be obtained iteratively, using a numerical approximation technique such as Newton's method. This makes the Backward Euler Method substantially more complicated to implement, and slower to run.

However, implicit methods like the Backward Euler Method have a powerful advantage: it turns out that they are generally stable regardless of step size. By contrast, explicit methods—even explicit methods that are much more sophisticated than the Forward Euler Method, like the Runge-Kutta methods discussed below—are unstable when applied to stiff problems, if the step size is too large. To illustrate this, let us apply the Backward Euler Method to the same ODE, dy/dt = -\kappa y(t), discussed previously. For this particular ODE, the implicit equation can be solved exactly, without having to use an iterative solver:

y_{n+1} = y_n - h \kappa y_{n+1} \quad \Rightarrow \quad y_{n+1} = \frac{y_n}{1+h\kappa} = \frac{y_0}{(1+h\kappa)^{n+1}}.

From this result, we can see that the numerical solution does not blow up for large values of h, as shown for example in Fig. 2. Even though the numerical solution in this example isn't accurate (because of the large value of h), the key point is that the error does not accumulate and cause a blow-up at large times.
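
For a general derivative function, each Backward Euler step requires an iterative solve. Here is a minimal sketch using simple fixed-point iteration (Newton's method is the more robust choice in practice):

def backward_euler_step(F, y, t, h, niter=50):
    ## Solve y_new = y + h * F(y_new, t + h) by fixed-point iteration,
    ## starting from the Forward Euler estimate as the initial guess.
    y_new = y + h * F(y, t)
    for _ in range(niter):
        y_new = y + h * F(y_new, t + h)
    return y_new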

Adams-Moulton Method

Fig. 3: The exact solution (blue) and Second-Order Adams-Moulton (AM2) solution (green) for the same problem as Fig. 1.

In the Second-Order Adams-Moulton (AM2) method, we take

\vec{y}_{n+1} = \vec{y}_n + \frac{h}{2}\left[ \vec{F}(\vec{y}_{n}, t_{n}) + \vec{F}(\vec{y}_{n+1}, t_{n+1})\right].

Conceptually, the derivative term here is the average of the Forward Euler and Backward Euler derivative terms. Because \vec{y}_{n+1} appears on the right-hand side, this is an implicit method. Thus, like the Backward Euler Method, it typically has to be solved iteratively, but is numerically stable. The advantage of the AM2 method is that its local truncation error is substantially lower. To see this, let us integrate both sides of the ODE over one time step:

\int_{t_n}^{t_{n+1}} \frac{d\vec{y}}{dt} \; dt = \int_{t_n}^{t_{n+1}} \vec{F}(\vec{y}(t), t) \; dt

The integral on the left-hand side reduces to \vec{y}_{n+1} - \vec{y}_n. If we approximate the integral on the right-hand side numerically using the trapezium rule, the result is precisely the derivative term in the AM2 formula. The local truncation error is therefore the numerical error of the trapezium rule, which is O(h^3), an improvement of one order over the Euler methods. (Based on this argument, we can also see that the Forward Euler and Backward Euler methods amount to approximating the right-hand integral by a rectangular area, with height given by the value of the integrand at t_n and t_{n+1} respectively. From this, it's clear why the AM2 scheme gives better results.)

There are also higher-order Adams-Moulton methods, which generate even more accurate results by also sampling the derivative function at previous steps: F(\vec{y}_{n-1},t_{n-1}), F(\vec{y}_{n-2},t_{n-2}), etc.

In Fig. 3, we plot the AM2 solution for the problem dy/dt = -\kappa y(t), using the same parameters (including the same step size h) as in Fig. 1 (Forward Euler Method) and Fig. 2 (Backward Euler Method). It is clear that the AM2 results are significantly more accurate.
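
As a consistency check, applying the AM2 formula to this problem and solving for y_{n+1} gives

y_{n+1} = y_n - \frac{h\kappa}{2}\left(y_n + y_{n+1}\right) \quad\Rightarrow\quad y_{n+1} = \frac{1 - h\kappa/2}{1 + h\kappa/2}\; y_n.

For \kappa > 0, the magnitude of this factor is less than 1 for every h > 0, so the numerical solution decays rather than blowing up, no matter how large the step size.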

Runge-Kutta methods

The three methods that we have surveyed thus far (Forward Euler, Backward Euler, and Adams-Moulton) have all involved sampling the derivative function F(y,t) at one of the discrete time steps \{t_n\}, and the solutions at those time steps \{\vec{y}_n\}. It is natural to ask whether we can improve the accuracy by sampling the derivative function at "intermediate" values of t and \vec{y}. This is the basic idea behind a family of numerical methods known as Runge-Kutta methods.

Here is a simple version of the method, known as second-order Runge-Kutta (RK2). Our goal is to replace the derivative term with a pair of terms, of the form

\vec{y}_{n+1} = \vec{y}_n + A h \vec{F}_A + B h \vec{F}_B,

where

\begin{align}\vec{F}_A &= \vec{F}(\vec{y}_n, t_n)\\
\vec{F}_B &= \vec{F}(\vec{y}_n + \beta \vec{F}_A, t_n + \alpha).\end{align}

The coefficients \{A, B, \alpha, \beta\} are adjustable parameters whose values we'll shortly choose, so as to minimize the local truncation error.

During each time step, we start out knowing the solution \vec{y}_n at time t_n. We first calculate \vec{F}_A (which is the derivative term that goes into the Forward Euler method); then we use it to calculate an "intermediate" derivative term \vec{F}_B. Finally, we use a weighted average of \vec{F}_A and \vec{F}_B as the derivative term for calculating \vec{y}_{n+1}. From this, it is evident that this is an explicit method: in each of the sub-equations, the right-hand side contains only known quantities.

We now have to determine the appropriate values of the parameters \{A, B, \alpha, \beta\}. First, we Taylor expand \vec{y}_{n+1} around t_n, using the chain rule:

\begin{align} \vec{y}_{n+1} &= \vec{y}_n + h \left.\frac{d\vec{y}}{dt}\right|_{t_n} + \frac{h^2}{2} \left.\frac{d^2\vec{y}}{dt^2}\right|_{t_n} + O(h^3)\\
&= \vec{y}_n + h \vec{F}(\vec{y}_n, t_n) + \frac{h^2}{2} \left[ \frac{d}{dt}\vec{F}(\vec{y}(t), t)\right]_{t_n} + O(h^3) \\
&= \vec{y}_n + h \vec{F}(\vec{y}_n, t_n) + \frac{h^2}{2} \left[ \sum_j \frac{\partial \vec{F}}{\partial y_j}\, \frac{dy_j }{dt} + \frac{\partial \vec{F}}{\partial t}\right]_{t_n} + O(h^3) \\
&= \vec{y}_n + h \vec{F}_A + \frac{h^2}{2} \left\{ \sum_j \left[\frac{\partial \vec{F}}{\partial y_j} \right]_{t_n} \!\! F_{Aj} \; +\;  \left[\frac{\partial \vec{F}}{\partial t}\right]_{t_n}\right\} + O(h^3) \end{align}

In the same way, we Taylor expand the intermediate derivative term \vec{F}_B, whose formula was given above, keeping terms up to first order in \alpha and \beta:

\vec{F}_B = \vec{F}_A + \beta \sum_j F_{Aj} \left[\frac{\partial \vec{F}}{\partial y_j}\right]_{t_n} + \alpha \left[\frac{\partial \vec{F}}{\partial t}\right]_{t_n} + \cdots

If we compare these Taylor expansions to the RK2 formula, then it can be seen that the terms can be made to match up to (and including) O(h^2), if the parameters are chosen to obey the equations

A + B = 1, \quad \alpha = \beta = \frac{h}{2B}.

One possible set of solutions is A = B = 1/2 and \alpha = \beta = h. With these conditions met, the RK2 method has local truncation error of O(h^3), one order better than the Forward Euler Method (which is likewise an explicit method), and comparable to the Adams-Moulton Method (which is an implicit method).

The local truncation error can be further reduced by taking more intermediate samples of the derivative function. The most commonly-used Runge-Kutta method is the fourth-order Runge-Kutta method (RK4), which is given by

\begin{align} \vec{y}_{n+1} &= \vec{y}_n + \frac{h}{6}\left(\vec{F}_A + 2\vec{F}_B + 2\vec{F}_C + \vec{F}_D \right)\\ \vec{F}_A &= \vec{F}(\vec{y}_n,\, t_n), \\ \vec{F}_B &= \vec{F}(\vec{y}_n + \tfrac{h}{2} \vec{F}_A,\, t_n + \tfrac{h}{2}), \\ \vec{F}_C &= \vec{F}(\vec{y}_n + \tfrac{h}{2} \vec{F}_B,\, t_n + \tfrac{h}{2}), \\ \vec{F}_D &= \vec{F}(\vec{y}_n + h\vec{F}_C,\, t_n + h). \end{align}

This has local truncation error of O(h^5). It is an explicit method, and therefore has the disadvantage of being unstable if the problem is stiff and h is sufficiently large.
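
For reference, here is a minimal sketch of a single RK4 step in Python (educational only; in practice, use a library solver as discussed below):

def rk4_step(F, y, t, h):
    ## One step of the classical fourth-order Runge-Kutta method.
    FA = F(y, t)
    FB = F(y + 0.5*h*FA, t + 0.5*h)
    FC = F(y + 0.5*h*FB, t + 0.5*h)
    FD = F(y + h*FC, t + h)
    return y + (h/6.0) * (FA + 2*FB + 2*FC + FD)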

Integrating ODEs with Scipy

Except for educational purposes, it is almost always a bad idea to implement your own ODE solver; instead, you should use a pre-written solver.

The scipy.integrate.odeint solver

In Scipy, the simplest ODE solver to use is the scipy.integrate.odeint function, which is in the scipy.integrate module. This is actually a wrapper around LSODA (the "Livermore Solver for Ordinary Differential equations with Automatic method switching"), a low-level numerical routine from the widely-used ODE solver library ODEPACK. The most important feature of this solver is that it is "adaptive": it can automatically figure out (i) which integration scheme to use (choosing between a high-order Adams-Moulton method and another implicit method known as the Backward Differentiation Formula, which we haven't described), and (ii) the size of the discrete time steps, based on the behavior of the solution as it is being worked out. In other words, the user only needs to specify the derivative function, the initial state, and the desired output times, without having to worry about the internal details of the solution method.

The function takes several inputs, of which the most important ones are:

  1. func, a function corresponding to the derivative function \vec{F}(\vec{y}, t).
  2. y0, either a number or 1D array, corresponding to the initial state \vec{y}(t_0).
  3. t, an array of times at which to output the ODE solution. The first element corresponds to the initial time t_0. Note that these are the "output" times only—they do not specify the actual time steps which the solver uses for finding the solutions; those are automatically determined by the solver.
  4. (optional) args, a tuple of extra inputs to pass to the derivative function func. For example, if args=(2,3), then func should accept four inputs, and it will be passed 2 and 3 as the last two inputs.

The function then returns an array y, where y[n] contains the solution at time t[n]. Note that y[0] will be exactly the same as the input y0, the initial state which you specified.

Here is an example of using odeint to solve the damped harmonic oscillator problem m \ddot{x} = - \lambda \dot{x} - k x(t), using the previously-mentioned trick of combining the position and velocity into a state vector to cast it into first-order form:

import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint

def ydot(y, t, m, lambd, k):
    ## Derivative function: y[0] is the position x, y[1] is the velocity v.
    x, v = y[0], y[1]
    return np.array([v, -(lambd/m) * v - (k/m) * x])

m, lambd, k = 1.0, 0.1, 1.0      # Oscillator parameters
y0 = np.array([1.0, 5.0])        # Initial conditions [x, v]
t  = np.linspace(0.0, 50.0, 100) # Output times

y = odeint(ydot, y0, t, args=(m, lambd, k))

## Plot x versus t
plt.plot(t, y[:,0], 'b-')
plt.xlabel('t')
plt.ylabel('x')
plt.show()

There is an important limitation of odeint: it does not handle complex ODEs, and always assumes that \vec{y} and \vec{F} are real. However, this is not a problem in practice, because you can always convert a complex first-order ODE into a real one, by replacing the complex vectors \vec{y} and \vec{F} with double-length real vectors:

\vec{y}' \equiv \begin{bmatrix}\mathrm{Re}(\vec{y})\\ \mathrm{Im}(\vec{y})\end{bmatrix}, \;\; \vec{F}' \equiv \begin{bmatrix}\mathrm{Re}(\vec{F})\\ \mathrm{Im}(\vec{F})\end{bmatrix}.
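
A minimal sketch of how such a conversion might be wrapped around a complex-valued derivative function (the helper below is illustrative, not part of Scipy):

import numpy as np

def realify(F_complex):
    ## Wrap a complex derivative function so it can be passed to odeint.
    ## The real state vector stacks [Re(y), Im(y)].
    def F_real(y_real, t, *args):
        n = len(y_real) // 2
        y = y_real[:n] + 1j * y_real[n:]
        dydt = F_complex(y, t, *args)
        return np.concatenate([dydt.real, dydt.imag])
    return F_real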

The scipy.integrate.ode solvers

Apart from odeint, Scipy provides a more general interface to a variety of ODE solvers, in the form of the scipy.integrate.ode class. This is a lower-level interface: instead of calling a single function, you create an ODE "object", use its methods to specify the type of ODE solver, the initial conditions, and so forth, and then repeatedly call the object's integrate method to advance the solution to each desired output time.

There is an extremely aggravating inconsistency between the odeint function and this ode class: the expected order of inputs for the derivative function is reversed! The odeint function assumes the derivative function has the form F(y,t), but the ode class assumes it has the form F(t,y). Watch out for this!

Here is an example of using the ode class to solve the damped harmonic oscillator problem m \ddot{x} = - \lambda \dot{x} - k x(t), using a Runge-Kutta solver:

import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import ode

## Note the order of inputs (different from odeint)!
def ydot(t, y, m, lambd, k):
    x, v = y[0], y[1]
    return np.array([v, -(lambd/m) * v - (k/m) * x])

m, lambd, k = 1.0, 0.1, 1.0      # Oscillator parameters
y0 = np.array([1.0, 5.0])        # Initial conditions [x, v]
t  = np.linspace(0.0, 50.0, 100) # Output times

## Set up the ODE object
r = ode(ydot)
r.set_integrator('dopri5')       # A Runge-Kutta (Dormand-Prince) solver
r.set_initial_value(y0, t[0])
r.set_f_params(m, lambd, k)

## Perform the integration.  Note that the "integrate" method only integrates
## up to one single final time point, rather than an array of times.
x    = np.zeros(len(t))
x[0] = y0[0]
for n in range(1, len(t)):
    r.integrate(t[n])
    assert r.successful()
    x[n] = r.y[0]

## Plot x versus t
plt.plot(t, x, 'b-')
plt.xlabel('t')
plt.ylabel('x')
plt.show()

See the documentation for a more detailed list of options, including the list of ODE solvers that you can choose from.

