Article · Wikipedia archive · Last revised Jun 4, 2026

Stiff equation

In computational mathematics, a stiff equation is an initial value problem

Last revised
Jun 4, 2026
Read time
≈ 43 min
Length
10.0k w
Citations
28
Source

In computational mathematics, a stiff equation is an initial value problem

u ˙ = f ( u ) , u ( 0 ) = u 0 , t [ 0 , T ] , {\displaystyle {\dot {u}}=f(u)\,,\qquad u(0)=u_{0}\,,\qquad t\in [0,T]\,,}

where f : R d R d {\displaystyle f:{\mathbb {R} }^{d}\rightarrow {\mathbb {R} }^{d}} , requiring dedicated implicit time stepping methods for its efficient numerical integration. The simplest mathematical characterization of a stiff equation is the necessary condition

T d ( d i v u f ) ( u ) 1 . {\displaystyle {\frac {T}{d}}{\big (}{\mathrm {div} }_{u}\,f{\big )}(u)\ll -1\,.}

Since ( d i v u f ) ( u ) = t r a c e ( g r a d u f ) ( u ) = t r a c e f ( u ) {\displaystyle {\big (}{\mathrm {div} }_{u}\,f{\big )}(u)={\mathrm {trace} }({\mathrm {grad} }_{u}\,f)(u)={\mathrm {trace} }\,f'(u)} , where f ( u ) R d × d {\displaystyle f'(u)\in {\mathbb {R} }^{d\times d}} is the Jacobian matrix of f {\displaystyle f} at the point u {\displaystyle u} , the criterion above is easily evaluated and quantifies stiffness. The criterion is derived, explained and illustrated for nonlinear stiff equations below.

For a linear system with constant coefficients u ˙ = A u {\displaystyle {\dot {u}}=Au} , the divergence is constant, making stiffness a global characteristic whose magnitude is related to the time scale T {\displaystyle T} .

For a nonlinear system, stiffness usually varies in space and time along the solution trajectory u ( t ) {\displaystyle u(t)} , where the criterion quantifies stiffness locally. In practical computations, stiff equations are invariably solved using adaptive methods. 12

Background

There is a rich literature on stiff differential equations, but intuitive descriptions and heuristics are far more common than attempts at a rigorous definition of the concept. Hairer and Wanner3 concisely describe the most obvious feature:

Stiff equations are problems for which explicit methods don't work.

This refers to the observation that explicit integration methods are forced to use exceedingly small time steps h {\displaystyle h} to maintain numerical stability, preventing such methods from being competitive. Although each step is inexpensive, the total number of steps N = T / h {\displaystyle N=T/h} becomes prohibitively large, and the integration over [ 0 , T ] {\displaystyle [0,T]} effectively stalls.

By contrast, implicit methods for stiff equations require costly "algebraic" equation solving on each step. The extra work per step is offset by superior stability properties, allowing the use of large time steps. Without crippling stability restrictions, the total computational effort is manageable, and the requested accuracy can be achieved without loss of efficiency. For some stiff equations, the efficiency may be several orders of magnitude higher than that of even the best explicit methods.

Analogous efficiency issues in other fields of scientific computing

The phenomenon of stiffness is analogous to well-known performance issues in other areas of numerical analysis. For example, in optimization, gradient methods have similar limitations. Using the steepest descent method to minimize a convex functional F ( x ) {\displaystyle F(x)} leads to the iteration

x k + 1 = x k h g r a d x F ( x k ) , {\displaystyle x^{k+1}=x^{k}-h\,{\mathrm {grad} }_{x}F(x^{k})\,,}

where h {\displaystyle h} is the line search step size. This is merely the explicit Euler method for the differential equation x ˙ = g r a d x F ( x k ) {\displaystyle {\dot {x}}=-{\mathrm {grad} }_{x}F(x^{k})} . Problems with steep gradients are known for their slow convergence, due to restrictions on the choice of h {\displaystyle h} . The typical remedy is to replace the gradient method by some Newton-type method, which essentially corresponds to using an "implicit" method for differential equations.

Another example is in solving nonlinear equations of the form x = g ( x ) {\displaystyle x=g(x)} . Simple fixed point iteration, x k + 1 = g ( x k ) {\displaystyle x^{k+1}=g(x^{k})} , does not converge for problems where the Lipschitz constant L [ g ] {\displaystyle L[g]} is large, but only for contractions. Yet the equation may have a unique solution under much weaker but common conditions, e.g. when the map g I {\displaystyle g-I} is monotone and the logarithmic Lipschitz constant of g {\displaystyle g} satisfies M [ g ] < 1 {\displaystyle M[g]<1} . Since fixed point iterations diverge, it is necessary to use a Newton-type iteration.

A third example is iterative solvers for the (discretized) Poisson equation Δ u = f {\displaystyle -\Delta u=f} . Here, the Jacobi method (which attempts to avoid matrix algebra) is equivalent to using the explicit Euler method (in pseudo time) for the corresponding method-of-lines discretization of the diffusion equation u t = Δ u + f {\displaystyle u_{t}=\Delta u+f} , seeking its stationary solution. The diffusion equation is a prototypical example of a stiff equation: the time step of the explicit recursion is severely restricted by a CFL condition and required to be proportional to the square of the mesh width in space. This makes the convergence of the Jacobi iteration painfully slow. As before the remedy is to use some form of algebraic equation solving, e.g. by using a preconditioner or a multigrid method.

Thus there are many different mathematical problems, where computational efficiency necessitates the use of dedicated numerical methods, typically of a considerable complexity. Stiff equations is the particular instance for initial value problems in ordinary differential equations. Nevertheless, with a well designed adaptive stiff solver, the efficient numerical integration of most stiff equations is now a routine task.

History and origins of a mathematical characterization

The first mention of stiff equations is found in Curtiss and Hirschfelder in 1952 4. The authors discuss a scalar model equation equivalent to

x ˙ = 1 a ( t , x ) ( x g ( t ) ) {\displaystyle {\dot {x}}={\frac {1}{a(t,x)}}{\big (}x-g(t){\big )}}

and characterize stiffness as follows:

If Δ t {\displaystyle \Delta t} is the desired resolution of t {\displaystyle t} or the interval which will be used in the numerical integration, the equation is "stiff" if

| a ( t , x ) Δ t | 1 {\displaystyle \left|{\frac {a(t,x)}{\Delta t}}\right|\ll 1}

and g {\displaystyle g} is well behaved.

This characterization relates a time scale Δ t {\displaystyle \Delta t} to the decay rate of transients, governed by the coefficient a ( t , x ) {\displaystyle a(t,x)} , which is assumed to be negative. As a ( t , x ) {\displaystyle a(t,x)} varies in time and space, it allows stiffness to vary accordingly. Prothero and Robinson introduced a similar but more instructive model problem,5 essentially taking a ( t , x ) = 1 / λ < 0 {\displaystyle a(t,x)=1/\lambda <0} constant and studying the equation

x ˙ = λ ( x g ( t ) ) + g ˙ ( t ) , x ( 0 ) = x 0 g ( 0 ) , t [ 0 , T ] . {\displaystyle {\dot {x}}=\lambda {\big (}x-g(t){\big )}+{\dot {g}}(t)\,,\qquad x(0)=x_{0}\neq g(0)\,,\qquad t\in [0,T].}

The solution consists of a decaying transient ( x 0 g ( 0 ) ) e t λ {\displaystyle {\big (}x_{0}-g(0){\big )}\,{\mathrm {e} }^{t\lambda }} , together with the particular solution g ( t ) {\displaystyle g(t)} , which is supposed to be bounded and smooth. The mathematical solution x ( t ) {\displaystyle x(t)} is then obviously bounded. Contrasting the explicit and implicit Euler methods, one can investigate under what circumstances the numerical methods replicate the properties of the mathematical solution. The Curtiss-Hirschfelder criterion will be considered for Δ t = T {\displaystyle \Delta t=T} and Δ t = h {\displaystyle \Delta t=h} , corresponding to the conditions T λ 1 {\displaystyle T\lambda \ll -1} and h λ 1 {\displaystyle h\lambda \ll -1} , respectively.

The explicit Euler method for a differential equation u ˙ = f ( u ) {\displaystyle {\dot {u}}=f(u)} is defined by

u n + 1 = u n + h f ( u n ) , {\displaystyle u_{n+1}=u_{n}+hf(u_{n})\,,}

while the implicit Euler method is defined by the recursion

u n + 1 = u n + h f ( u n + 1 ) . {\displaystyle u_{n+1}=u_{n}+hf(u_{n+1})\,.}

Here h {\displaystyle h} is the (constant) time step, and u n {\displaystyle u_{n}} approximates the exact solution u ( t n ) {\displaystyle u(t_{n})} at time t n = n h {\displaystyle t_{n}=nh} . In the explicit method u n + 1 {\displaystyle u_{n+1}} is computed directly by an evaluation of the vector field f {\displaystyle f} . In the implicit method, however, one must solve the "algebraic" equation u n + 1 h f ( u n + 1 ) = u n {\displaystyle u_{n+1}-hf(u_{n+1})=u_{n}} for u n + 1 {\displaystyle u_{n+1}} . This improves the stability of the recursion.

For the Prothero-Robinson problem the explicit Euler method produces the numerical solution (on integrated form)

x n = ( 1 + h λ ) n x 0 + k = 0 n 1 ( 1 + h λ ) n k 1 h ( g ˙ ( t k ) λ g ( t k ) ) . {\displaystyle x_{n}=\left(1+h\lambda \right)^{n}x_{0}+\sum _{k=0}^{n-1}\left(1+h\lambda \right)^{n-k-1}h{\big (}{\dot {g}}(t_{k})-\lambda g(t_{k}){\big )}\,.}

The impediment is clearly visible: positive powers of the factor 1 + h λ {\displaystyle 1+h\lambda } produce exponential growth (numerical instability) unless we impose the stability condition | 1 + h λ | 1 {\displaystyle |1+h\lambda |\leq 1} . Since λ {\displaystyle \lambda } is real and negative, this implies 2 h λ 0 {\displaystyle -2\leq h\lambda \leq 0} . Thus the explicit Euler method cannot use step sizes adapted to the regularity of g ( t ) {\displaystyle g(t)} . Instead the total number of steps to complete the integration, N = T / h T | λ | {\displaystyle N=T/h\sim T|\lambda |} , becomes extremely large. Efficiency is inversely proportional to stiffness as quantified by the product T | λ | 1 {\displaystyle T|\lambda |\gg 1} .

As a consequence the explicit Euler method may take "forever" to solve a scalar problem with a perfectly smooth mathematical solution g ( t ) {\displaystyle \sim g(t)} once the (fast) transient has decayed. This is also the case even if the initial value is x ( 0 ) = g ( 0 ) {\displaystyle x(0)=g(0)} . Thus, the step size restriction is not a matter of whether the transient is present. It is due to the fact that any step size h {\displaystyle h} such that | 1 + h λ | > 1 {\displaystyle |1+h\lambda |>1} inevitably excites numerical instability.

If instead the implicit Euler method is used, the numerical solution becomes

x n = ( 1 h λ ) n x 0 + k = 0 n 1 ( 1 h λ ) k n h ( g ˙ ( t k + 1 ) λ g ( t k + 1 ) ) . {\displaystyle x_{n}=\left(1-h\lambda \right)^{-n}x_{0}+\sum _{k=0}^{n-1}\left(1-h\lambda \right)^{k-n}h{\big (}{\dot {g}}(t_{k+1})-\lambda g(t_{k+1}){\big )}\,.}

Here, the stability requirement is | 1 h λ | 1 1 {\displaystyle |1-h\lambda |^{-1}\leq 1} , which is satisfied for all h > 0 {\displaystyle h>0} when R e λ < 0 {\displaystyle {\mathrm {Re} }\,\lambda <0} , no matter how large h | λ | {\displaystyle h|\lambda |} is. A minor issue is that in practice one needs to resolve the transient (this may initially call for a short step size), but once it has decayed, one can use "large" step sizes, in the sense that h {\displaystyle h} is adapted to the regularity of g ( t ) {\displaystyle g(t)} , irrespective of the magnitude of | h λ | {\displaystyle |h\lambda |} . Of the two methods above, only the implicit Euler method can handle stiff equations efficiently.

Example Consider the Prothero-Robinson test problem

x ˙ = λ ( x sin ω t ) + ω cos ω t ; x ( 0 ) = x 0 , {\displaystyle {\dot {x}}\,=\,\lambda (x-\sin \omega t)+\omega \cos \omega t\,;\qquad x(0)=x_{0}\,,}

with exact solution x ( t ) = x 0 e λ t + sin ω t {\displaystyle x(t)=x_{0}\,{\mathrm {e} }^{\lambda t}+\sin \omega t} . Taking x 0 = 0 {\displaystyle x_{0}=0} , the homogeneous solution is not present in the exact solution x ( t ) = sin ω t {\displaystyle x(t)=\sin \omega t} , but the exponential term will show up in the local solutions passing through the points x n {\displaystyle x_{n}} generated by the numerical methods. Taking ω = π {\displaystyle \omega =\pi } the problem is solved on [ 0 , 1 ] {\displaystyle [0,1]} , using N = 5 {\displaystyle N=5} and 10 {\displaystyle 10} steps, respectively ( h = 0.2 {\displaystyle h=0.2} and h = 0.1 {\displaystyle h=0.1} ). The experiment demonstrates the impact of varying h λ {\displaystyle h\lambda } , which affects the stability/damping of the numerical methods. We take λ = 20 {\displaystyle \lambda =-20} , using the same computational setup for the explicit and implicit Euler methods.

Euler methods applied to Prothero-Robinson problem source ↗

The results are shown in Figure 1. With T λ = 20 {\displaystyle T\lambda =-20} , there is a moderately strong exponential damping, as is evident from the fast contracting local solutions. The setup is deliberately understated for visualization purposes; in real stiff equations much larger values would be typical. Nevertheless, λ {\displaystyle \lambda } is sufficiently large and negative to demonstrate the step size limitations of the explicit Euler method. For N = 5 {\displaystyle N=5} , its computed solution becomes oscillatory and diverges from the exact solution, even though the initial value was taken on the exact solution. Repeating the same exercise for T λ = 2000 {\displaystyle T\lambda =-2000} , say, shows how the difference between the two methods grows with increasing stiffness.

In more realistic problems, the instability is often more dramatic and referred to as the solution "blowing up". Here, the maximum stable step size ( h = 0.1 {\displaystyle h=0.1} ) has only been exceeded by a small factor. Meanwhile, the implicit Euler method proceeds without loss of stability, and is able to produce accurate results also for the larger step size.

In real computations, stiff equations are solved using adaptive methods. These select the step size automatically to keep the computational process in check, usually controlling both accuracy and stability. Thus, if an adaptive explicit method is used, the solution may not go unstable, as the adaptivity takes measures not to exceed the maximum stable step size. Instead, the method proceeds at the (prohibitive) cost of low efficiency.

The essence of the analysis is that a problem parameter, λ {\displaystyle \lambda } , which is characteristic of the vector field and related to the transient behavior of solutions, is linked to a time scale, either the step size h {\displaystyle h} or the range of integration T {\displaystyle T} directly. The products h λ {\displaystyle h\lambda } or T λ {\displaystyle T\lambda } are both dimensionless, scaling invariant, and negative. A general stiffness concept must reflect these properties, and also be able to address stiffness in nonlinear differential equations.

Systems of stiff equations

The scalar Prothero-Robinson problem can be extended to linear systems of equations,

x ˙ = A ( x g ( t ) ) + g ˙ ( t ) , A R d × d {\displaystyle {\dot {x}}=A{\big (}x-g(t){\big )}+{\dot {g}}(t)\,,\quad A\in {\mathbb {R} }^{d\times d}}

as well as nonlinear systems,

x ˙ = f ( x g ( t ) ) + g ˙ ( t ) , f : R d R d {\displaystyle {\dot {x}}=f{\big (}x-g(t){\big )}+{\dot {g}}(t)\,,\quad f:{\mathbb {R} }^{d}\rightarrow {\mathbb {R} }^{d}}

provided that f ( 0 ) = 0 {\displaystyle f(0)=0} . The general stiffness concept needs to specify what is meant by a vector field being "large and negative". The analysis above also shows that the smooth function g {\displaystyle g} at most plays a minor role. Thus, by substituting u {\displaystyle u} for x g ( t ) {\displaystyle x-g(t)} , we have three types of problems,

u ˙ = λ u u ˙ = A u u ˙ = f ( u ) . {\displaystyle {\begin{aligned}{\dot {u}}\,&=\,\lambda u\\{\dot {u}}\,&=\,Au\\{\dot {u}}\,&=\,f(u)\,.\end{aligned}}}

Here stiffness needs to be characterized for matrices A {\displaystyle A} as well as for nonlinear maps f {\displaystyle f} . The first equation above is the well known Dahlquist linear test equation, primarily used to determine the stability region of a time stepping method. The stability region S {\displaystyle S} of a method is the set of all h λ C {\displaystyle h\lambda \in \mathbb {C} } such that the numerical method produces bounded solutions. The previous discussion of stiffness is effectively based on the Dahlquist test equation.

In order to consider the second, linear system of equations, it is common to make use of the eigenvalues λ k {\displaystyle \lambda _{k}} of the matrix A {\displaystyle A} . After all, for a numerical method to produce bounded (stable) solutions to u ˙ = A u {\displaystyle {\dot {u}}=Au} , it is necessary and sufficient that all eigenvalues satisfy h λ k S {\displaystyle h\lambda _{k}\in S} . The previous discussion carries over by considering one eigenvalue at a time. The linear problem is then stiff if all eigenvalues have negative real parts, and at least one of them satisfies T λ k 1 {\displaystyle T\lambda _{k}\ll -1} ; the latter eigenvalue will then cause the typical step size restrictions experienced by an explicit method, due to the fact that all explicit methods have bounded stability regions, and require h λ k S {\displaystyle h\lambda _{k}\in S} . The step size has to be selected small enough to fulfill this requirement, see Figure 2.

Stability regions of Euler methods source ↗

Yet, it is common to find a different characterization of stiffness, in terms of the "stiffness ratio", defined by

max k | R e λ k | min k | R e λ k | , {\displaystyle {\frac {\max _{k}|{\mathrm {Re} }\,\lambda _{k}|}{\min _{k}|{\mathrm {Re} }\,\lambda _{k}|}}\,,}

assuming that λ k C {\displaystyle \lambda _{k}\in {\mathbb {C} }^{-}} .6 Although stiff systems often do have a large stiffness ratio, Byrne and Hindmarsh emphasize that this is neither necessary nor sufficient for u ˙ = A u {\displaystyle {\dot {u}}=Au} to be a stiff equation.7

First, there is no comparison with the time scale T {\displaystyle T} , but only an intrinsic comparison of "time constants" 1 / λ k {\displaystyle 1/\lambda _{k}} , where a large stiffness ratio merely implies that there are "faster" and "slower" transient components, often referred to as widely different time scales. Second, as noted above, there are scalar stiff equations; these have stiffness ratio 1 {\displaystyle 1} . And third, the stiffness ratio breaks down for problems with a singular matrix A {\displaystyle A} , which is common in some applications. (An example is chemical reaction kinetics, where the singularity corresponds to conservation of mass.)

Thus the stiffness ratio is misleading: it fails to characterize the most important aspects of stiff equations.

In a similar vein, it has been suggested that the nonlinear problem is stiff if the Jacobian matrix f ( u ) {\displaystyle f'(u)} has a large stiffness ratio.8 Naturally, this too fails, as noted by Artemiev and Averina:9

For example, a nonlinear autonomous ODE system with f ( 0 ) = 0 {\displaystyle f(0)=0} can be called stiff if real parts [sic] of the eigenvalues of the Jacobi matrix [ f ( 0 ) {\displaystyle f'(0)} ] satisfy the above conditions. However, the famous van der Pol equation, which is frequently used as an example of stiff ODEs, does not satisfy this definition ... it is impossible to determine stiffness only by means of eigenvalues of the Jacobi matrix for nonlinear systems

Although in part valid, this argument too is off the mark, as the zero solution of the van der Pol equation is an unstable equilibrium. Thus stiffness does not occur near the origin, but only along the limit cycle as will be seen below, where the van der Pol equation is analyzed in full detail. A precise stiffness characterization must therefore be local, capturing the behavior of (small) perturbations near the actual solution.

Modern characterization: The stiffness indicator

Replacing the stiffness ratio, the modern approach is to construct a functional s 2 [ ] {\displaystyle s_{2}[\cdot ]} , called the stiffness indicator, such that the criteria s 2 [ A ] T 1 {\displaystyle s_{2}[A]\!\cdot \!T\ll -1} and s 2 [ f ] T 1 {\displaystyle s_{2}[f]\!\cdot \!T\ll -1} , respectively, quantify stiffness by relating a distinct vector field property to the time scale T {\displaystyle T} . This characterization is based on additional observations. Thus Dekker and Verwer10 write (original emphasis):

The essence of stiffness is that the solution to be computed is slowly varying but that perturbations exist which are rapidly damped.

Shampine11 expresses a similar view, but is conceptually more specific, relating stiffness to dissipation, and what in practice is irreversibility:

A way we prefer for describing the latter condition is that [the solution] is very unstable in the opposite direction [of time].

The reverse instability can easily be characterized using logarithmic norms. Thus, for u ˙ = A u {\displaystyle {\dot {u}}=Au} , the norm of the solution can be bounded using differential inequalities. Let u 2 2 = u u {\displaystyle \|u\|_{2}^{2}=u^{*}u} be the Euclidean norm, possibly with u C d {\displaystyle u\in {\mathbb {C} }^{d}} . Then

m 2 [ A ] u 2 D t u 2 M 2 [ A ] u 2 , {\displaystyle m_{2}[A]\cdot \|u\|_{2}\,\leq \,{\mathrm {D} }_{t}\,\|u\|_{2}\,\leq \,M_{2}[A]\cdot \|u\|_{2}\,,}

where D t {\displaystyle {\mathrm {D} }_{t}} denotes the time derivative and m 2 [ A ] {\displaystyle m_{2}[A]} and M 2 [ A ] {\displaystyle M_{2}[A]} are the lower and upper logarithmic norms of A {\displaystyle A} , respectively. These are the extrema of a symmetric quadratic form,

m 2 [ A ] u H e ( A ) u u u M 2 [ A ] , {\displaystyle m_{2}[A]\,\leq \,{\frac {u^{*}{\mathrm {He} }(A)u}{u^{*}u}}\,\leq \,M_{2}[A]\,,}

where H e ( A ) = ( A + A ) / 2 {\displaystyle {\mathrm {He} }(A)=(A+A^{*})/2} denotes the Hermitian part of A {\displaystyle A} . Its stationary points are found from the symmetric eigenvalue problem

H e ( A ) v = μ v , {\displaystyle {\mathrm {He} }(A)v=\mu v\,,}

whose d {\displaystyle d} real eigenvalues μ 1 μ 2 μ d {\displaystyle \mu _{1}\geq \mu _{2}\geq \dots \geq \mu _{d}} are referred to as the logarithmic values of A {\displaystyle A} . In relation to the eigenvalues λ k {\displaystyle \lambda _{k}} of A {\displaystyle A} , the logarithmic values satisfy

m 2 [ A ] μ d min k R e λ k ; max k R e λ k μ 1 M 2 [ A ] , {\displaystyle m_{2}[A]\,\equiv \,\mu _{d}\,\leq \,\min _{k}{\mathrm {Re} }\,\lambda _{k}\,;\qquad \max _{k}{\mathrm {Re} }\,\lambda _{k}\,\leq \,\mu _{1}\,\equiv \,M_{2}[A]\,,}

together with the trace identity

R e t r a c e A = k R e λ k = k μ k = t r a c e H e ( A ) . {\displaystyle {\mathrm {Re} }\,{\mathrm {trace} }\,A\,=\,\sum _{k}{\mathrm {Re} }\,\lambda _{k}\,=\,\sum _{k}\mu _{k}\,=\,{\mathrm {trace} }\,{\mathrm {He} }(A)\,.}

Since u ( t ) = e t A u 0 {\displaystyle u(t)={\mathrm {e} }^{tA}u_{0}} it follows from the differential inequality above that

e t m 2 [ A ] e t A 2 e t M 2 [ A ] t 0 , {\displaystyle {\mathrm {e} }^{tm_{2}[A]}\,\leq \,\|{\mathrm {e} }^{tA}\|_{2}\,\leq \,{\mathrm {e} }^{tM_{2}[A]}\,\qquad t\geq 0\,,}

bounding the maximum growth and decay rates in the system. Thus, over a time interval of length T > 0 {\displaystyle T>0} , the maximum possible forward time growth is e T M 2 [ A ] {\displaystyle {\mathrm {e} }^{TM_{2}[A]}} , while the maximum reverse time growth is e T m 2 [ A ] {\displaystyle {\mathrm {e} }^{-Tm_{2}[A]}} .

Söderlind et al.12 now quantify Shampine's observation. Thus, in a stiff system,

e T m 2 [ A ] e T M 2 [ A ] , {\displaystyle {\mathrm {e} }^{-Tm_{2}[A]}\,\gg \,{\mathrm {e} }^{TM_{2}[A]}\,,}

from which it follows that e T ( m 2 [ A ] + M 2 [ A ] ) 1 {\displaystyle {\mathrm {e} }^{T{\big (}m_{2}[A]+M_{2}[A]{\big )}}\ll 1} , i.e., T ( m 2 [ A ] + M 2 [ A ] ) 1 {\displaystyle T{\big (}m_{2}[A]+M_{2}[A]{\big )}\ll -1} . The stiffness indicator is defined13

s 2 [ A ] = m 2 [ A ] + M 2 [ A ] 2 , {\displaystyle s_{2}[A]\,=\,{\frac {m_{2}[A]+M_{2}[A]}{2}}\,,}

and a stiff equation is characterized by the condition

s 2 [ A ] T 1 . {\displaystyle s_{2}[A]\cdot T\ll -1\,.}

The reason for including M 2 [ A ] {\displaystyle M_{2}[A]} in the definition of the stiffness indicator is that it may often happen in nonlinear systems that M 2 [ A ] {\displaystyle M_{2}[A]} is positive, offsetting stiffness. This is then accounted for by defining the stiffness indicator so as to have odd parity, i.e., s 2 [ A ] = s 2 [ A ] {\displaystyle s_{2}[-A]=-s_{2}[A]} .

Shampine's criterion implies that m 2 [ A ] {\displaystyle m_{2}[A]} is large and negative, i.e., the lower bound on e t A 2 {\displaystyle \|{\mathrm {e} }^{tA}\|_{2}} is extremely small. In other words, the flow is nearly a semigroup, and e t A {\displaystyle {\mathrm {e} }^{tA}} is "close to singular". Unlike in a Lipschitz system, where one can solve (theoretically) the differential equation also in reverse time, this is effectively impossible in a stiff system. The reverse time stiff equation corresponds to an extremely ill conditioned "inverse problem".

Using logarithmic norms for nonlinear maps14 the same arguments apply to a nonlinear vector field f {\displaystyle f} , but since s 2 [ f ] {\displaystyle s_{2}[f]} is defined globally, the stiffness indicator is instead defined locally as s 2 [ f ( u ) ] {\displaystyle s_{2}[f'(u)]} along the trajectory.

In addition to the stiffness indicator, Söderlind et al.15 introduce a local reference time scale Δ t ( u ) {\displaystyle \Delta t(u)} defined by

Δ t ( u ) = 1 / max ( 1 / T , s 2 [ f ( u ) ] ) . {\displaystyle \Delta t(u)\,=\,1/\max {\big (}1/T,-s_{2}[f'(u)]{\big )}\,.}

Stiffness is then characterized by the method independent condition T / Δ t ( u ) 1 {\displaystyle T/\Delta t(u)\gg 1} , referred to as the (local) stiffness factor. A large stiffness factor is a necessary condition for stiffness. But method dependent criteria can also be assessed. In numerical computations, Δ t ( u ) {\displaystyle \Delta t(u)} can be compared to the actual step size h {\displaystyle h} , whose local step size stiffness factor h / Δ t ( u ) {\displaystyle h/\Delta t(u)} depends on the choice of integration method and the accuracy requirement. Only implicit methods can use step sizes with h / Δ t ( u ) 1 {\displaystyle h/\Delta t(u)\gg 1} , but there may also be parts of the integration where the step size stiffness factor is only moderate.

An important aspect of implicit time stepping methods is the need to solve equations of the form

u = γ h f ( u ) + ψ {\displaystyle u=\gamma hf(u)+\psi }

for u {\displaystyle u} on every step. Here γ > 0 {\displaystyle \gamma >0} is a method characteristic constant of moderate size, and ψ {\displaystyle \psi } is a known vector. The question arises whether this equation has a (unique) solution when h f {\displaystyle hf} is "large", in the sense that h s 2 [ f ] 1 {\displaystyle hs_{2}[f']\ll -1} , which represents stiff computation. Rewriting the equation as

( γ h f I ) ( u ) = ψ , {\displaystyle {\big (}\gamma hf-I{\big )}(u)=-\psi \,,}

the uniform monotonicity theorem applies: the equation has a unique solution if the map γ h f I {\displaystyle \gamma hf-I} is (negative) monotone, i.e., if M 2 [ γ h f I ] < 0 {\displaystyle M_{2}[\gamma hf-I]<0} . Since

M 2 [ γ h f I ] < 0 M 2 [ γ h f ] < 1 , {\displaystyle M_{2}[\gamma hf-I]<0\quad \Leftrightarrow \quad M_{2}[\gamma hf]<1\,,}

this condition is usually satisfied by a fair margin in stiff computation. If M 2 [ f ] < 0 {\displaystyle M_{2}[f]<0} , it is satisfied for all h > 0 {\displaystyle h>0} , but if M 2 [ f ] > 0 {\displaystyle M_{2}[f]>0} , it may require a minor step size restriction. In the Shampine criterion, as well as in the stiffness indicator, it holds that

m 2 [ γ h f ] + M 2 [ γ h f ] 2 = γ h s 2 [ f ] 1 , {\displaystyle {\frac {m_{2}[\gamma hf]+M_{2}[\gamma hf]}{2}}\,=\,\gamma h\!\cdot \!s_{2}[f]\ll -1\,,}

where m 2 [ γ h f ] 1 {\displaystyle m_{2}[\gamma hf]\ll -1} , and where M 2 [ γ h f ] {\displaystyle M_{2}[\gamma hf]} is moderate if it is positive. Therefore, although the equation typically requires Newton iteration in stiff computation ( γ h f {\displaystyle \gamma hf} is not a contraction), existence and uniqueness are usually guaranteed also when the step size stiffness factor is large.

A simplified stiffness indicator

Because it is relatively expensive to compute the stiffness indicator by solving the symmetric eigenvalue problem H e ( f ( u ) ) v = μ v {\displaystyle \,{\mathrm {He} }{\big (}f'(u){\big )}v=\mu v\,} for μ 1 {\displaystyle \mu _{1}} and μ d {\displaystyle \mu _{d}} , an inexpensive alternative is of interest. Noting that the stiffness indicator is the arithmetic average of the largest and the smallest logarithmic value, an option is to replace this average by the average of all logarithmic values. With the aforementioned trace identity (see above), we define the scaled (arithmetic average) real trace functional16

τ [ A ] = 1 d R e t r a c e ( A ) = 1 d k = 1 d R e a k k = 1 d k = 1 d R e λ k = 1 d k = 1 d μ k , {\displaystyle \tau [A]\,=\,{\frac {1}{d}}\,{\mathrm {Re} }\,{\mathrm {trace} }(A)\,=\,{\frac {1}{d}}\,\sum _{k=1}^{d}{\mathrm {Re} }\,a_{kk}\,=\,{\frac {1}{d}}\,\sum _{k=1}^{d}{\mathrm {Re} }\,\lambda _{k}\,=\,{\frac {1}{d}}\,\sum _{k=1}^{d}\mu _{k}\,,}

noting that the computation of τ [ A ] {\displaystyle \tau [A]} is both inexpensive and trivial: no eigenvalue computations are required. The scaled trace (as well as the divergence) is a linear functional, hence τ [ A ] = τ [ A ] {\displaystyle \tau [-A]=-\tau [A]} , sharing the odd parity of s 2 [ A ] {\displaystyle s_{2}[A]} . Moreover, for a nonlinear map,

τ [ f ( u ) ] = 1 d t r a c e f ( u ) = 1 d t r a c e ( g r a d u f ) ( u ) = 1 d ( d i v u f ) ( u ) . {\displaystyle \tau [f'(u)]\,=\,{\frac {1}{d}}\,{\mathrm {trace} }\,f'(u)\,=\,{\frac {1}{d}}\,{\mathrm {trace} }({\mathrm {grad} }_{u}\,f)(u)\,=\,{\frac {1}{d}}\,({\mathrm {div} }_{u}\,f)(u)\,.}

The functional τ [ f ( u ) ] {\displaystyle \,\tau [f'(u)]\,} serves as an alternative stiffness indicator. Stiffness can therefore also be characterized by the divergence condition

T d ( d i v u f ) ( u ) 1 , {\displaystyle {\frac {T}{d}}\,{\big (}{\mathrm {div} }_{u}\,f{\big )}(u)\,\ll \,-1\,,}

which is readily evaluated for most problems.

For a linear system u ˙ = A u {\displaystyle {\dot {u}}=Au} , it holds that

D t d e t e t A = t r a c e A d e t e t A . {\displaystyle {\mathrm {D} }_{t}\,{\mathrm {det} }\,{\mathrm {e} }^{tA}\,=\,{\mathrm {trace} }\,A\cdot {\mathrm {det} }\,{\mathrm {e} }^{tA}\,.}

Defining a scaled (geometric average) absolute determinant δ [ A ] = | d e t A | 1 / d {\displaystyle \,\delta [A]=|{\mathrm {det} }\,A|^{1/d}\,} we have δ [ I ] = 1 {\displaystyle \,\delta [I]=1\,} and δ [ α A ] = | α | δ [ A ] {\displaystyle \,\delta [\alpha A]=|\alpha |\!\cdot \!\delta [A]\,} , avoiding the standard determinant's direct dependence on the dimension d {\displaystyle d} . (This is of special importance when the differential equation is derived from a method-of-lines discretization of a partial differential equation.) Here δ [ A ] {\displaystyle \,\delta [A]\,} is the linear "size" of the phase space volume spanned by the column vectors of A {\displaystyle A} , such as in δ [ 2 I ] = 2 {\displaystyle \,\delta [2I]=2\,} , independent of d {\displaystyle d} . The differential equation for the determinant of the flow can now be rewritten

D t δ [ e t A ] = τ [ A ] δ [ e t A ] . {\displaystyle {\mathrm {D} }_{t}\,\delta {\big [}{\mathrm {e} }^{tA}{\big ]}\,=\,\tau [A]\cdot \delta {\big [}{\mathrm {e} }^{tA}{\big ]}\,.}

Hence

δ [ e t A ] = e t τ [ A ] , {\displaystyle \delta {\big [}{\mathrm {e} }^{tA}{\big ]}\,=\,{\mathrm {e} }^{t\,\tau [A]}\,,}

expressing that an initial (linear) phase volume changes by a factor δ [ e t A ] {\displaystyle \,\delta {\big [}{\mathrm {e} }^{tA}{\big ]}\,} over a time t {\displaystyle t} . If τ [ A ] < 0 {\displaystyle \,\tau [A]<0\,} , phase volume is compressed by the flow e t A {\displaystyle \,{\mathrm {e} }^{tA}\,} for t 0 {\displaystyle t\geq 0} .

It follows that τ [ A ] 0 {\displaystyle \,\tau [A]\leq 0\,} is a necessary condition for stability. (Compare the sufficient criterion M 2 [ A ] 0 {\displaystyle \,M_{2}[A]\leq 0\,} for the stability of the zero solution.) Shampine's characterization of stiffness, in terms of strongly unstable solutions in reverse time, can be written as T τ [ A ] 1 {\displaystyle \,T\!\cdot \tau [-A]\gg 1\,} , which, due to the odd parity of the trace, is equivalent to the stiffness condition T τ [ A ] 1 {\displaystyle T\!\cdot \tau [A]\ll -1} . These observations, concepts and constructions provide the key to characterizing and quantifying stiffness, with the divergence criterion T τ [ f ] 1 {\displaystyle \,T\!\cdot \tau [f']\ll -1\,} being the simplest.

Examples of nonlinear stiff equations and their properties

A large number of nontrivial test problems have been collected in the Bari Test Set for Initial Value Problem Solvers.17 This contains detailed examples of problems, solvers, and results achieved under specified conditions by well-known codes.

Here three stiff nonlinear problems are selected to illustrate how the (simplified) stiffness indicator is used to identify and characterize stiffness correctly. This includes the ability to distinguish the complex and rapidly changing behavior at (locally unstable) turning points, such as those observed in the van der Pol and Oregonator equations. Further, the stiffness indicator also distinguishes between subintervals where stiffness is prominent and where the problem is nonstiff. Finally in some cases it is even possible to identify which of the dependent variables contributes most to stiffness.

Operational criteria, such as how the adaptive time step depends on the accuracy requirements and varies along the solution, are also illustrated while quantifying the step size stiffness factor accordingly. Taken together, a detailed insight can be derived from the stiffness indicator, some of it even before the actual computation begins.

Problem 1. (Flame propagation) Consider the scalar flame propagation model18 for t [ 0 , 1 ] {\displaystyle t\in [0,1]} ,

ε u ˙ = 2 ( u 2 u 3 ) ; u ( 0 ) = ε 1 . {\displaystyle \varepsilon \,{\dot {u}}\,=\,2\,(u^{2}-u^{3})\,;\qquad u(0)=\varepsilon \ll 1\,.}

This model is commonly used as a test problem to challenge the time step adaptivity of software for stiff equations.

For u ( 0 , 1 ) {\displaystyle u\in (0,1)} , the right hand side is positive and the solution u ( t ) {\displaystyle u(t)} is monotonically increasing. Starting at u ( 0 ) = ε {\displaystyle u(0)=\varepsilon } , it moves away from the unstable equilibrium at u = 0 {\displaystyle u=0} . The divergence of the vector field (stiffness indicator) is simply f ( u ) = 2 ( 2 u 3 u 2 ) / ε {\displaystyle f'(u)=2\,(2u-3u^{2})/\varepsilon } . Since

f ( ϵ ) 4 ; f ( 1 ) = 2 ε 1 , {\displaystyle f'(\epsilon )\approx 4\,;\qquad f'(1)=-\,{\frac {2}{\varepsilon }}\ll -1\,,}

the solution starts out nonstiff, when u 1 {\displaystyle u\ll 1} . At t 0.5 {\displaystyle t\approx 0.5} when u 1 / 2 {\displaystyle u\sim 1/2} , instability has become catastrophic, with an extremely rapid transition towards the second, stable equilibrium u = 1 {\displaystyle u=1} .

When u > 2 / 3 {\displaystyle u>2/3} , stability is regained, and the solution enters the stiff regime, see Figure 3, where we have used ε = 10 3 {\displaystyle \varepsilon =10^{-3}} . The equation is stiffer the smaller ε {\displaystyle \,\varepsilon \,} is chosen, and the closer to u = 1 {\displaystyle \,u=1\,} the solution moves.

Flame propagation test problem source ↗

The demonstration compares an explicit third order Runge-Kutta method with a second order error estimator, with a third order A-stable implicit Runge-Kutta method, also with a second order error estimator. The implicit method has no advantage in the nonstiff regime, but in the stiff regime it can use much larger steps. The difference between the two methods increases for smaller ε {\displaystyle \,\varepsilon } .

Problem 2. (van der Pol equation) The van der Pol equation is usually presented in a couple of alternate forms, with different scalings. It is a nonlinear problem whose solutions approach a limit cycle. Here we scale time to write the system as

x ˙ = 2 κ y y ˙ = 2 κ 2 ( 1 x 2 ) y 2 κ x , {\displaystyle {\begin{aligned}{\dot {x}}&=2\kappa \cdot y\\{\dot {y}}&=2\kappa ^{2}\cdot (1-x^{2})\,y-2\kappa \cdot x\,,\end{aligned}}}

with initial conditions x ( 0 ) = 2 {\displaystyle x(0)=2} , y ( 0 ) = 0 {\displaystyle y(0)=0} chosen on the limit cycle. The time scaling has been chosen so as to make the period of the limit cycle O ( 1 ) {\displaystyle \mathrm {O} (1)} nearly independent of the parameter κ {\displaystyle \kappa } . The problem can then be solved on [ 0 , 1 ] {\displaystyle [0,1]} , accounting for a full period. The problem originates in the study of nonlinear oscillations in electric circuits.

For large values of κ {\displaystyle \kappa } (see Figure 4, where κ = 200 {\displaystyle \kappa =200} ), the limit cycle consists of two stable branches, connected by nearly discontinuous transitions showing a catastrophic loss of stability. Due to the trace identity, the stiffness indicator s 2 [ f ( u ) ] {\displaystyle s_{2}[f'(u)]} coincides with the scaled trace τ [ f ( u ) ] {\displaystyle \tau [f'(u)]} , given by

s 2 [ f ( u ) ] = τ [ f ( u ) ] = κ 2 ( 1 x 2 ) , {\displaystyle s_{2}[f'(u)]\,=\,\tau [f'(u)]\,=\,\kappa ^{2}\,(1-x^{2})\,,}

where u {\displaystyle u} is the vector ( x , y ) T {\displaystyle (x,y)^{\mathrm {T} }} . The stiffness indicator is readily identified in the second equation of the van der Pol system, hiding in plain sight.

van der Pol test problem source ↗

The scaled trace has a constant term, κ 2 {\displaystyle \kappa ^{2}} , which represents the contribution from the linear part of the system. As this term is positive it is destabilizing. The equilibrium x = y = 0 {\displaystyle x=y=0} is unstable, since τ [ f ( 0 ) ] = κ 2 > 0 {\displaystyle \tau [f'(0)]=\kappa ^{2}>0} . The scaled trace also has a negative term κ 2 x 2 {\displaystyle -\kappa ^{2}x^{2}} representing the nonlinear contribution to stiffness. As the scaled trace is independent of y {\displaystyle y} , the latter variable has no particular influence on stiffness.

The solution enters the stiff regime when τ [ f ( u ) ] 1 {\displaystyle \tau [f'(u)]\ll -1} . Since τ [ f ( u ) ] = κ 2 ( 1 x 2 ) {\displaystyle \tau [f'(u)]=\kappa ^{2}\,(1-x^{2})} it follows that stiffness only occurs for | x | > 1 {\displaystyle \,|x|>1} , i.e., along the two stable branches, where stiffness is proportional to κ 2 {\displaystyle \,\kappa ^{2}} , and can be arbitrarily large. In particular, it follows that if an explicit method were to be used to solve the problem along the stable branch, the computational effort is O ( κ 2 ) {\displaystyle \mathrm {O} (\kappa ^{2})} . By contrast, the effort is O ( 1 ) {\displaystyle \mathrm {O} (1)} for a stiff solver.19

Note that all this information is available a priori. It is confirmed by the numerical solution of the problem.

Problem 3. (Oregonator equation) Chemical reaction kinetics is a rich source of stiff differential equations. The Oregonator equation models an autocatalytic reaction involving three reactions. For suitable initial conditions, it has a limit cycle of period T 305 {\displaystyle T\approx 305} . The equations are

x ˙ = s ( x x y + y q x 2 ) y ˙ = ( z y x y ) / s z ˙ = w ( x z ) . {\displaystyle {\begin{aligned}{\dot {x}}\,&=\,s\cdot (x-xy+y-q\,x^{2})\\{\dot {y}}\,&=\,(z-y-xy)/s\\{\dot {z}}\,&=\,w\cdot (x-z)\,.\end{aligned}}}

The test problem uses the parameters s = 77.27 {\displaystyle s=77.27} , q = 8.375 10 6 {\displaystyle q=8.375\cdot 10^{-6}} , w = 0.161 {\displaystyle w=0.161} , together with initial conditions x ( 0 ) = 3.2 , y ( 0 ) = 1.45 , z ( 0 ) = 2.5 {\displaystyle x(0)=3.2,\,y(0)=1.45,\,z(0)=2.5} .20

Because the dimension of the system is d = 3 {\displaystyle d=3} , the stiffness indicator s 2 [ f ] {\displaystyle s_{2}[f']} and the scaled trace (divergence) τ [ f ( u ) {\displaystyle \tau [f'(u)} do not coincide, although the differences are small. More importantly, because the equation is a more intricate example of stiffness, we need to gain preliminary insight by keeping the parameters as long as possible in the analysis. We therefore choose the scaled divergence, which offers this advantage. Thus we find

T τ [ f ( u ) ] = α + β x + γ y , {\displaystyle T\cdot \tau [f'(u)]\,=\,\alpha +\beta \,x+\gamma \,y,}

where

α = T ( s 2 s w 1 ) / ( 3 s ) 7.838 10 3 β = T ( 2 q s 2 + 1 ) / ( 3 s ) 1.447 γ = T s / 3 7.856 10 3 . {\displaystyle {\begin{aligned}\alpha \,&=\,T\cdot (s^{2}-sw-1)/(3s)\,\approx \,7.838\cdot 10^{3}\\\beta \,&=\,-T\cdot (2qs^{2}+1)/(3s)\,\approx \,-1.447\\\gamma \,&=\,-T\cdot s/3\,\approx \,-7.856\cdot 10^{3}.\end{aligned}}}

Here the constant term α {\displaystyle \alpha } is the linear contribution to the divergence. It is positive, therefore destabilizing. The two other terms represent nonlinear contributions, as they multiply x {\displaystyle x} and y {\displaystyle y} . Their coefficients β {\displaystyle \beta } and γ {\displaystyle \gamma } are negative, and will contribute to stiffness. But as the divergence is independent of z {\displaystyle z} , the latter variable does not contribute to stiffness other than to the constant term α {\displaystyle \alpha } . The largest coefficient is γ {\displaystyle \gamma } , indicating that stiffness will be particularly strong when y {\displaystyle y} is large and positive. Whether x {\displaystyle x} will contribute depends on what magnitude it will reach. In Figure 5, the actual data are collected during one full period [ 0 , 305 ] {\displaystyle [0,305]} plotted on the normalized interval [ 0 , 1 ] {\displaystyle [0,1]} .

Oregonator test equation source ↗

During a brief interval when x {\displaystyle x} is large, the scaled divergence is of order T τ [ f ( u ) ] 10 5 {\displaystyle T\cdot \tau [f'(u)]\approx -10^{5}} (just barely visible as a notch in the upper left corner of the high resolution scaled trace graph), making the problem stiff there. But the worse is yet to come; when y {\displaystyle y} becomes large stiffness becomes severe, with scaled divergence at T τ [ f ( u ) ] 10 7 {\displaystyle T\cdot \tau [f'(u)]\approx -10^{7}} .

When an adaptive third order A-stable Runge-Kutta method is used, the step sizes exceed the reference time scale by three to four orders of magnitude. This corresponds to the efficiency gain of the stiff solver over the potential use of an explicit method.

Methods and software for stiff equations

As noted above, stiff equations require implicit time stepping methods, either Runge-Kutta methods or linear multistep methods. There are also alternatives based on extrapolation techniques. But not all implicit methods are suitable. A good method must have a stability region S {\displaystyle S} covering all or most of the negative half-plane C {\displaystyle {\mathbb {C} }^{-}} .

The stability region is determined by applying the method to the Dahlquist test equation21 u ˙ = λ u {\displaystyle \,{\dot {u}}=\lambda u} , and the stability region S {\displaystyle S} is the set of h λ C {\displaystyle h\lambda \in {\mathbb {C} }} for which the method produces bounded solutions. A method with S C {\displaystyle S\supset {\mathbb {C} }^{-}} , i.e., where S {\displaystyle S} contains the entire left half-plane, is called A-stable.

There are many A-stable Runge-Kutta methods of high order to choose from, but unfortunately, for A-stable linear multistep methods the convergence order is limited to p = 2 {\displaystyle p=2} . Therefore one usually has to settle for less. Among multistep methods the best choice for stiff equations are the backward differentiation (BDF) methods, to which the implicit Euler method belongs. There are BDF methods of convergence orders p = 1 : 6 {\displaystyle p=1:6} , but only orders p 2 {\displaystyle p\leq 2} are A-stable. While the stability region remains large for higher orders, it eventually deteriorates, and only methods up to order p = 5 {\displaystyle p=5} are regularly used.

Software based on BDF-type methods (or similar) are Matlab's ode15s, and C or Fortran codes such as CVODE, LSODE, MEBDF, DASSL. The software is elaborate, robust and of high complexity, uses variable order and variable time step adaptivity, and usually offers many options for problems with special structure, or for special handling of Jacobian matrices. Accuracy criteria can be specified both for absolute and relative error estimates. Some codes include options for nonstiff equations, or for differential-algebraic equations.

For implicit Runge-Kutta methods applied to the test equation, the differential equation is replaced by a recursion u n + 1 = R ( h λ ) u n {\displaystyle u_{n+1}=R(h\lambda )\,u_{n}} , where the rational function R {\displaystyle R} is referred to as the stability function. If

R e z 0 | R ( z ) | 1 , {\displaystyle {\mathrm {Re} }\,z\leq 0\,\Rightarrow \,|R(z)|\leq 1\,,}

the method is A-stable. There are A-stable methods of high order, but few implementations. An advantage of A-stable Runge-Kutta methods is that

M 2 [ h A ] 0 R ( h A ) 2 1 , {\displaystyle M_{2}[hA]\leq 0\,\Rightarrow \,\|R(hA)\|_{2}\leq 1\,,}

i.e., if A {\displaystyle A} is negative definite, then the method produces a contractive recursion. (This is a variant of the von Neumann inequality.) Unfortunately, this only carries over to nonlinear problems under additional conditions. B-stable Runge-Kutta methods are a subset of A-stable methods, such that if M 2 [ h f ] 0 {\displaystyle \,M_{2}[hf]\leq 0\,} in a nonlinear problem, then the method produces a contractive recursion.222324 Other special stability requirements are also common, such as L-stability.25 This is another subset of A-stable method, with the additional requirement R ( ) = 0 {\displaystyle \,R(\infty )=0\,} for improved damping of eigenmodes for which R e h λ 1 {\displaystyle \,{\mathrm {Re} }\,h\lambda \ll -1} .

Because implicit Runge-Kutta methods have high computational complexity, special efficiency requirements are also considered when software is developed. Among efficient Runge-Kutta software for stiff problems are Matlab's ode23s of orders 2 {\displaystyle 2} and 3 {\displaystyle 3} , and the Fortran code RADAU526, implementing the B- and L-stable 5 t h {\displaystyle 5^{\mathrm {th} }} order Radau IIa method. For a general survey of diagonally implicit Runge-Kutta methods used for stiff equations, see Kennedy and Carpenter.27

In all cases, the practical numerical solution of stiff equations needs dedicated, professional software. Even though it is a routine task, success may often require a careful study of the suitable settings of the codes and a proper understanding of the problem to be solved.

Notes and remarks

1. Is stiffness fully understood? In the literature it is sometimes suggested that no precise definition of stiffness exists. This appears to be due to the widespread mention or use of the stiffness ratio, whose obvious shortcomings are long since recognized. Sometimes operational criteria such as the actual discretization method and accuracy requirements are brought in to resolve the issues. But this over-complicates what is basically a simple issue. Thus, to the practitioner it is obvious that there is a distinction between nonstiff and stiff equations; it must therefore be possible to describe this distinction mathematically. The stiffness indicator provides a simple necessary criterion characterizing and quantifying this distinction. Today it is safe to say that stiffness is a complex but fully understood phenomenon, with excellent, efficient, and reliable dedicated software available.

2. Stiffness indicator. The stiffness indicator s 2 [ f ] {\displaystyle s_{2}[f']} is more robust than the scaled divergence τ [ f ] {\displaystyle \tau [f']} . While the former only involves the two extreme logarithmic values, the latter makes use of the intermediate logarithmic values as well, in spite of the fact that they do not have a similar impact on stability. The advantage of the scaled divergence is that it bypasses eigenvalue computations and supports an a priori analytical understanding of the dynamics, as illustrated in the examples above. An approximate quantification is usually sufficient, as stiffness typically is a matter of orders of magnitude.

3. Dissipative vs. conservative systems. The notion of stiffness as discussed above is a measure of dissipativity, i.e., of damping or energy loss. Some sources suggest that there are also stiff conservative systems, in particular if derived from method-of-lines discretizations of hyperbolic PDEs. But such systems belong in a different category, of (potentially) highly oscillatory systems. With eigenvalues located far out on the imaginary axis, the corresponding high frequencies can be suppressed by choosing an implicit method using large steps. However, in accordance with the sampling theorem, high frequency phenomena are then no longer resolved and wave forms may be distorted. The distinction between dissipative and conservative problems is significant, and the computational approach is different. While stiffness is closely related to the irreversibility of parabolic problems, hyperbolic problems have no intrinsic damping and feature invariants, such as conserving energy. Similarly, separable Hamiltonian systems have a divergence free vector field and hence are "area preserving" (conserve phase volume). Due to this structure, the stiffness indicator is zero.

4. Etymology. The term "stiff" was introduced by Curtiss and Hirschfelder. According to Hirschfelder,28 the term was chosen because the first examples were associated with servo systems, in which there was a "tight coupling" between the servo and the controlled system. A similar effect can also be seen in control systems with a high gain negative feedback, where higher gain increases stiffness. Another suggested connection is the (mechanical) notion of stiffness, as represented by the second order equation

m x ¨ + c x ˙ + k x = F , {\displaystyle m{\ddot {x}}+c{\dot {x}}+kx=F\,,}

where m {\displaystyle m} represents mass, c {\displaystyle c} the damping coefficient, k {\displaystyle k} the spring constant, and F {\displaystyle F} an external applied force. The idea is that equations with "stiff springs" (large k {\displaystyle k} ) cause stiffness in the mathematical sense. Taking m = 1 {\displaystyle m=1} , the equation is rewritten as

x ˙ = y y ˙ = k x c y + F , {\displaystyle {\begin{aligned}{\dot {x}}&=y\\{\dot {y}}&=-kx-cy+F\,,\end{aligned}}}

which is a system of the form u ˙ = A u + g {\displaystyle {\dot {u}}=Au+g} . The eigenvalues of A {\displaystyle A} are

λ 1 , 2 = c ± c 2 4 k 2 ; {\displaystyle \lambda _{1,2}\,=\,{\frac {-c\pm {\sqrt {c^{2}-4k}}}{2}}\,;}

these are claimed to be large if k {\displaystyle k} is large. For critical damping, the parameters must satisfy c = 2 m k {\displaystyle c=2{\sqrt {mk}}} , which with m = 1 {\displaystyle m=1} leads to

λ 1 , 2 = c / 2 . {\displaystyle \lambda _{1,2}\,=\,-c/2\,.}

Interestingly, computing the stiffness indicator (here identical to the scaled divergence) of the vector field one obtains

s 2 [ A ] = τ [ A ] = c / 2 , {\displaystyle s_{2}[A]\,=\,\tau [A]\,=\,-c/2\,,}

which is large if and only if the damping constant c {\displaystyle c} is large, independent of the spring constant. Thus the single cause of stiffness is the damper, which dissipates energy. It is difficult to avoid the conclusion that the term "stiff" is a misnomer, which quickly got established in a new context with a different meaning. While in mechanical engineering a large spring constant typically goes with a large damping constant in order to have (near) critical damping, the confusion can be understood.

See also

See also


Notes

Notes

  1. G Söderlind, LO Jay, M Calvo (2015). "Stiffness 1952--2012: Sixty years in search of a definition." BIT 55, pp 531-558.
  2. G Söderlind (2024). "Logarithmic Norms." Berlin-Heidelberg-New York: Springer Series in Computational Mathematics SCM vol 63.
  3. E. Hairer, G. Wanner (1996). "Solving orindary differential equations II. Stiff and differential-algebraic problems", 2nd ed. Berlin-Heidelberg-New York: Springer
  4. C.F. Curtiss, J.O. Hirschfelder (1952). "Integration of stiff equations". Proc. Nat. Acad. Sci. 38, pp 235-243.
  5. A. Prothero, A. Robinson (1974). "On the stability and accuracy of one-step methods for solving stiff systems of ordinary differential equations". Math. Comp. 28, 145-162.
  6. J.D. Lambert (1992). "Numerical Methods for Ordinary Differential Systems", pp 216-217. New York: Wiley, ISBN 978-0-471-92990-1.
  7. G.D. Byrne and A.C. Hindmarsh (1987). "Stiff ODE Solvers: A review of current and coming attractions", p. 3ff. J. Comp. Phys. 70, 1-62.
  8. J.D. Lambert (1973). "Cmmputational methods in ordinary differential equations", p. 232. London: John Wiley & Sons.
  9. S. Artemiev, T. Averina (1997). "Numerical analysis of systems of ordinary and stochastic differential equations", p. 6. Utrecht: VSP.
  10. K. Dekker, J.G. Verwer (1984). "Stability of Runge-Kutta methods for stiff nonlinear differential equations", p. 5. New York: North Holland.
  11. L.F. Shampine (1985). "What is Stiffness?" In: R.C. Aiken (ed.), "Stiff Computation", p. 4. New York: Oxford University Press
  12. G. Söderlind, L.O. Jay, M. Calvo (2015). "Stiffness 1952-2012. Sixty years in search of a definition. BIT Numerical Mathematics 55, 531-558.
  13. ibid.
  14. G. Söderlind (2024). "Logarithmic Norms". Springer Series in Computational Mathematics vol 63.
  15. G. Söderlind, L.O. Jay, M. Calvo (2015). "Stiffness 1952-2012. Sixty years in search of a definition. BIT Numerical Mathematics 55, 531-558.
  16. G. Söderlind (2024). "Logarithmic Norms", Ch. 15. Springer Series in Computational Mathematics, vol 63.
  17. F. Mazzia, F. Iavernaro (2003). "Test Set for Initial Value Problem Solvers". Department of Mathematics, University of Bari, ITALY. http://archimede.dm.uniba.it/~testset/CWI_reports/testset2003r22.pdf
  18. L.F. Shampine. https://es.mathworks.com/company/newsletters/articles/stiff-differential-equations.html
  19. G Söderlind, LO Jay, M Calvo (2015). "Stiffness 1952--2012: Sixty years in search of a definition." BIT 55, pp 531-558.
  20. G Söderlind (2024). Logarithmic Norms, Ch. 21. Springer Series in Computational Mathematics vol 63.
  21. G. Dahlquist (1963). "A special stability problem for linear multistep methods". BIT 3, 27–43, doi:10.1007/BF01963532, hdl:10338.dmlcz/103497, S2CID 120241743
  22. J.C. Butcher (1975). "A stability property of implicit Runge–Kutta methods". BIT 15, 358–361
  23. J.C. Butcher (2008). "Numerical Methods for Ordinary Differential Equations", 2nd ed. New York: Wiley
  24. E. Hairer, E., G. Wanner (1991). "Solving Ordinary Differential Equations II. Stiff and Differential– Algebraic Problems". Berlin–Heidelberg–New York: Springer.
  25. Ehle (1969).
  26. E. Hairer, E., G. Wanner (1991). "Solving Ordinary Differential Equations II. Stiff and Differential– Algebraic Problems". Berlin–Heidelberg–New York: Springer.
  27. C.A. Kennedy, M.H. Carpenter (2016). "Diagonally Implicit Runge–Kutta Methods for Ordinary Differential Equations. A Review". NASA/TM-2016-219173, https://ntrs.nasa.gov/citations/20160005923
  28. J.O. Hirshfelder (1963). "Applied Mathematics as used in Theoretical Chemistry". American Mathematical Society Symposium: 367–376.
References

References

External links