Geophysical Fluid Dynamics, Nonautonomous Dynamical Systems, and the Climate Sciences

This contribution introduces the dynamics of shallow and rotating flows that characterizes large-scale motions of the atmosphere and oceans. It then focuses on an important aspect of climate dynamics on interannual and interdecadal scales, namely the wind-driven ocean circulation. Studying the variability of this circulation and slow changes therein is treated as an application of the theory of nonautonomous dynamical systems. The contribution concludes by discussing the relevance of these mathematical concepts and methods for the highly topical issues of climate change and climate sensitivity. Michael Ghil Ecole Normale Supérieure and PSL Research University, Paris, FRANCE, and University of California, Los Angeles, USA, e-mail: ghil@lmd.ens.fr Eric Simonnet Institut de Physique de Nice, CNRS & Université Côte d’Azur, Nice Sophia-Antipolis, FRANCE, e-mail: eric.simonnet@inphyni.cnrs.fr


The Rossby Number
Newtonian mechanics recognizes a special class of coordinate systems, called inertial frames, and by rotation we always mean rotation relative to an inertial frame. As we shall see, Newtonian fluid dynamics relative to a rotating frame have a number of unsual features.
For example, an observer mowing with constant velocity in an inertial frame feels no force. If the observer, however, has a constant velocity with respect to a frame attached to a rotating body, a Coriolis force will generally be felt, acting in a direction perpendicular to the velocity. Such "apparent" forces lead to dynamical results which are unexpected and peculiar when compared with conventional dynamcics in an inertial frame. We should therefore begin by verifying that this shift of viewpoint to a noninertial frame is really called for by the physical phenomena to be studied, as well as by our individual, direct perception of these flows, as rotating observers.
To examine this question for the dynamics of fluids on the surface of the Earth, we define Ω to be the Earth's angular speed, so 2π/Ω ∼ = 24 hr. Therefore, at a radius of 6380 km a point fixed to the equator and rotating with the surface moves with speed 1670 km/hr as seen by an (inertial) observer at the center of the Earth. On the other hand, the large-scale motion of the fluid relative to the solide surface never exceeds a speed of about 300 km/hr, which occurs in the atmospheric jet streams. Thus the speeds relative to the surface are but a fraction of inertial speeds, and it will be useful to remove the fixed component of motion due to the Earth's rotation, by introducing equations of motion relative to rotating axis.
In this rotating frame, we define U to be a characteristic speed relative to the surface, and define L as a length over which variations of relative speed of order U occur. From these parameters we then obtain a characteristic time T = L/U, which is a time scale for the evolution of the fluid structure in question. Since Ω is an inverse time, the product T Ω , or more customarily the Rossby number is a dimensionless measure of the importance of rotation, when assessed over the lifetime or period of the structure. The nature of this rotational effect will be given a dynamical basis forthwith. For the moment, it is important only to realize that small ε means that the effects of rotation are important. With this in mind, we show in Table 1 values of the Rossby number for a few large-scale flows on the Earth and Jupiter.
We conclude that rotation is important in all of these systems and that a rotating coordinate system is probably useful for their study. Minor changes in these estimates are needed to account for the fact that the proper measure of rotation is the local angular velocity of the tangent plane to the planet, as shown in Sec. 2 below.
Also, these estimates only mean that Coriolis forces, proportional to 2ΩU, are typically larger than the forces needed to produce the acceleration measured relative to the rotating frame, these being of magnitude U 2 /L (or U/T ) times the fluid Gulf Stream 100 km 1 m/sec 0.07 Ω = 7.3 · 10 −5 s −1 Weather system 1000 km 20 m/sec 0.14 Core 3000 km 0.1 cm/sec 2 · 10 −7 Jupiter Ω = 1.7 · 10 −4 s −1 Bands, Red Spot 10 4 km 50 m/sec 0.015 Table 1 density. There may be other forces, such as viscous or magnetic forces, which enter into the dynamical balance. In the Earth's fluid core, for example, magnetic and Coriolis forces are probably comparable, while in the free atmosphere and ocean it is predominantly the "pressure" forces which compete with the Coriolis force in the dominant dynamical balance.

Equations of Motion in an Inertial Frame
We consider a fluid of density ρ, moving under the influence of a pressure field p(r,t) and a body force F(r,t), where r = (x, y, z) = (x 1 , x 2 , x 3 ) is the position vector and t is time. The equations describing the local conservation of momentum and mass are ∂ ρ ∂t where u(r,t) is the velocity field and ∇ = (∂ x , ∂ y , ∂ z ) is the gradient operator, with bold type used for vectors. Equation (2b) is often called either the "continuity equation" or the "mass conservation equation." For a given F , (2) comprises four scalar equations for the three velocity components of u, the density ρ and the pressure p. Another relation is therefore needed to close the system, e.g., an equation of state connecting p and ρ.
In many problems of geophysical interest, such as ocean dynamics, the approximate incompressibility of the fluid plays a prominent role. In these cases, u is prescribed to be a solenoidal, i.e., divergence-free, vector field, and thus it closes the system. For the moment, we leave the closure unspecified and deal with the incomplete system (2). We also take F at first to be a conservative field, such as gravitation, defined in terms of a potential φ by We are thus excluding here the possibility of the important contribution of the nonconservative viscous stresses. The operator D Dt occurring in (2) is called the material derivative or the Lagrangian derivative, since it is the time derivative calculated by an observer moving with the local or material velocity. In the following, we will use the notation D/Dt rather than d/dt when appropriate to avoid any confusion. To verify this last claim let G(r,t) be a field and let r = R(t) be the position of a fluid particle at time t. Then u = (u, v, w) = (u 1 , u 2 , u 3 ) = dR dt is the velocity of the particle, and we have where the summation convention of repeated indices is used. A quantity Q (scalar, vector or tensor) which satisfies DQ/Dt = 0 is said to be a material invariant because the quantity Q(R(t),t) just stays constant along the trajectory defined by R.

Thermodynamics
Very often, the equation of state does not involve the pressure and density alone. Typically, one often has ρ ocean = ρ(S, T, p), ρ atmos = ρ(T, p), where S is the salinity and T the temperature. One needs, therefore, two additional equations for T (and S). In the following, we consider the oceanic case, although the cases of both a dry and a humid atmosphere can be treated in an analogous way. We will suppose later on that the fluid is in a local thermodynamic equilibrium. This assumption implies that the variables entering Eq. (6) below are locally related or that they can be simply expressed in mathematical form: where E is the specific internal energy, α ≡ 1/ρ is called the specific volume, η the specific entropy and S the salinity. Taking the total differential of (6), enables one to interpret it in a illuminating way: The quantity −∂ E(α, η, S)/∂ α is interpreted as the pressure p, while the term (∂ E/∂ α)dα is the change in energy due to mechanical compression. Note that dα < 0 and p > 0 give dE > 0, i.e. energy must be supplied to reduce volume if fluid resists compression. Similarly, ∂ E(α, η, S)/∂ η is interpreted as the temperature T , and the term (∂ E/∂ η)dη is the energy change due to heating. Finally, ∂ E(α, η, S)/∂ S is called the chemical potential µ of seawater. One has therefore It appears that the salinity and specific entropy are conserved along particle motion when there are no heat sources and no molecular diffusion, i.e.
In order to have an equation involving T instand of η, one expresses the entropy η as a function of p, S and T only. It is possible to eliminate α between p(α, η, S) and T (α, η, S). Using (9), one has It appears that the partial derivatives are known quantities: where C p is the specific heat at constant pressure and salinity and C T the coefficient of thermal expansion. Therefore, one ends up with For the oceans, the right-hand side of the first equation is often neglected, unless pressure variations are very large compared to temperature fluctuations, as could be the case in the deep ocean. Note that alternative formulations to close the system (2) do exist, depending on which variables one considers, instead of the temperature T addressed here. The overall procedure, however, is always the same and it makes use of the internal energy (6) and of the identities (7) and (9).
Example. In the case of the atmosphere, the ideal gas law is a good approximation of the equation of state for dry air.
The specific entropy η is η = C p log T − R log p. (14) One obtains C T = −ρ ∂ η ∂ p T = 1 T 1 . Plugging this expression in (12a) and using (13), one obtains DT or with straightforward algebraic manipulations The quantity θ is called the potential temperature and p 0 a constant pressure reference.

Equations in a Rotating Frame
We would like to use the form taken by (2) relative to a rotating coordinate system. That is, we would like to work with the velocity as perceived by an observer fixed in the rotating frame, and with time differentiation as would be performed by such an observer. Spatial differentiation, involving a limit process at fixed time, is an operation which is invariant under transformation to a rotating frame. The essential problem is thus to deal properly with time derivatives.

Absolute and relative derivative
Since differentiation involves a limit process, it is convenient to begin with an infinitesimal rotation of the axis. Consider a constant vector P fixed rigidly relative to the rotating frame. Let the frame rotates with respect to the inertial frame through a small angle dθ about the z or x 3 -axis. If P has coordinates (P 1 = a cos ϕ, P 2 = a sin ϕ, P 3 ) in the rotated frame, an inertial observer will measure in his frame, after rotation, some coordinates (P 1 = a cos(ϕ + dθ ), P 2 = a sin(ϕ + dθ ), P 3 = P 3 ). Using the smallness of dθ one easily obtains (P 1 = P 1 − P 2 dθ , P 2 = P 2 + P 1 dθ , P 3 = P 3 ) for P. Thus, under this rotation, we have the invariant vector relation where the vector dθ = (0, 0, dθ ). Introducing time t, we write (17) as 1 Note that, since only the variables T and p are involved, we could simply derive (14) by integrating (11); in this case, one obtains TC T = const.
where we have defined Ω (t) = (0, 0, dθ /dt) as the instantaneous angular velocity of the rotating frame. Equation (18) involves the subscript "0", which will stand for "as determined by an observer in the inertial frame", sometimes called absolute.
Since P is a constant vector in the rotating frame, (18) simply states that, in this case, the inertial observer sees a vector attached to the origin rotating (about the z-axis. Examples of such "rotating, fixed vectors" are a set of orthonormal basis vectors in the rotating frame, (i 1 , i 2 , i 3 ), i k being the unit vector along the x k axis. Thus, we can recompute (18) as follows. Define P = P 1 i 1 + P 2 i 2 + P 3 i 3 (19) as the representation of P in the rotating basis. Since P 1 , P 2 , and P 3 are independent of time we have, using (18) and the summation convention, Thus we see that the basis vectors carry the effect of rotation in their time dependence. Suppose now that the coordinates P k in (19) depend on time, P k = P k (t). Repeating (20) we obtain dP dt 0 = dP dt r + Ω × P, is the time derivative of P relative to the rotating observer, with P k = dP k (t)/dt. The relation (21) is the basic result that must be used to transform the equations of motion to a rotating frame of reference. We remark that Ω is allowed to depend on time t in (21). This can be of interest in geophysical fluid dynamics when modeling the precession of the Earth's axis of rotation However, this time dependence has negligible effects for most phenomena in the dynamics of the atmosphere, oceans, or fluid core of the earth and will be largely ignored below. We also note that, for the purpose of interpreting a formula such as (18) or (21), it is sometimes convenient to think of the relation as applying at an instant when the two frames of reference coincide.
Actually, we do not need (21) for the continuity equation (2b), since it is obvious that the material derivative of a scalar field is independent of the coordinate frame. Indeed, its value is determined by the time dependence observed for a given particle of fluid. To verify the invariance explicitely, let G(r,t) be a scalar field as observed in the inertial frame.
Then, for partial time derivatives we have P = r in (18) and thus Now, applying (21) to the vector r = R(t), which represents the particle position at time t, yields where is the velocity vector relative to the rotating frame. Using (22a, 22b) one obtains which confirms the invariance of the material derivative.

Relative acceleration and Coriolis acceleration
The right-hand side of (22b) is another vector function of t, and we may repeat the process to obtain the acceleration: where u 0 is the velocity perceived by the inertial observer. Using (23) and (2a) and recalling the invariance of spatial derivatives we thus obtain the desired system of equations relative to the rotating frame: If Ω is taken to be constant, ieΩ = 0 and if we use the identity then Eq. (24a) takes the form where φ c now absorbs the centripetal acceleration. The latter is only about 1/300 of the gravitational acceleration at the surface of the Earth and so is negligible for most atmospheric and oceanographic purposes. The remaining term which distinguishes Eq. (26) for the relative acceleration from Eq. (2a) for the absolute acceleration is the Coriolis acceleration 2Ω × u r .
If U and L are the characteristic scales of Section 1 defined now in the rotating system, and if L/U is taken as the characteristic time, then in order of magniture and so the Rossby number (1) has the dynamical meaning of a characteristic ratio of the relative acceleration of a fluid element to the Coriolis acceleration.
For future reference, we now drop the subscript "r" and rewrite the basic equations (24) as:

Vorticity
For any velocity field u we define the associated vorticity field by where we recall the explicit calculation: Remark: The vorticity is a divergence-free vector field: ∇ · ω = ∇ · ∇ × u = 0. In particular, thanks to Stoke's theorem, for any surface S enclosing a volume V in R 3 (for instance a sphere), one has Define now a vortex line as the integral curve of the vorticity field ω meaning that any points on the vortex line is tangent to the vorticity vector field. A vortex tube is composed of vortex lines going through a closed curve (see Fig. 1). Now applying Fig. 1 The circulations C 1 and C 2 around a vortex tube are the same. The strength of a vortex tube is constant along the length of the tube.
(31) to a portion of this vortex tube determined by two different closed curves C 1 and C 2 , one obtains the results that the circulation is the same around C 1 and C 2 (see Fig. 1). Indeed the lateral surface between C 1 and C 2 does not contribute to the circulation since it is tangent to the vector field. One can thus write Note that this is merely a "static" property since the vortex tube is frozen (t is fixed).
A dynamical view will be provided by Kelvin's theorem. For solid body rotation with angular velocity Ω , the velocity field is obtained by setting u r = 0, so that (cf. (22b)) u = u 0 = Ω × r. The associated vorticity is Applied locally in an arbitrary velocity field, vorticity has the same meaning: it is equal to twice the angular velocity of a fluid element. This interpretation of vorticity in terms of a local angular velocity is misleading in some respects, since the term "fluid element" is not really appropriate for describing what is basically a point property. More precisely, the rate-of-strain tensor given by the matrix of partial derivatives of the components of u at any point has a symmetric and an antisymmetric part. Vorticity is associated with the antisymmetric part, while the symmetric part gives rise to a straining field. The effect of the latter is to distort "fluid elements", by differentially stretching and compressing them. With this caveat, we return to the discussion of vorticity.
Since global solid body rotation can be characterized, according to (33), as a state of uniform vorticity, the mechanics of the fluid relative to a rotating coordinate system can alternatively be regarded as a mechanics of deviations from a state of uniform vorticity. As we have already noted, the practical usefulness of reference to a state of uniform rotation will depend on the magnitude of the deviations, and the Rossby number can aso be regarded as a measure of relative vorticity as a fraction of 2Ω . The properties of vorticity relative to the rotating frame are thus clearly "inherited" from the mechanics in the inertial frame, so for the present discussion we return to (2) and omit subscripts.

Kelvin's theorem
We now use the vector identity (please check by using formula (30)) so that, (2a) can be reformulated as The particularly simple case in which u × ω = 0 is called a Beltrami flow. Taking the curl of (35) we obtain an equation for the vorticity, Using (2b) to eliminate (∇ · u) from (36a) we have Multiplying (36b) by 1/ρ and rearranging we obtain finally a simpler equation for ω/ρ, It shows that the two terms on the right-hand-side will tend to change ω/ρ as one moves with the fluid. In particular, the vector ω/ρ is not in general a material invariant. It would appear that the vorticity vector itself does not provide one with conservation laws having an intuitive appeal, basically because of the already mentioned non-local nature of physical reasoning based upon "fluid elements." It is advanta-geous, therefore, to recognize the divergence-free (solenoidal) property of the vorticity vector and to exploit the non-local concept of flux of a divergence-free vector field. This non-local approach is the basis for the fundamental theorem discovered by Lord Kelvin. Kelvin's theorem studies the evolution of the flux of ω through a surface S bounded by a simple closed contour C, when the latter consists of material points, i.e. of points which move with the fluid (see Fig. 2). t+dt t+dt t t t+dt t Fig. 2 Kelvin's theorem: the circulation Γ t at time t is the same than the circulation Γ t+dt at time t + dt provided the fluid is barotropic and the corresponding closed contours consists of material points. Note that the vortex tube is displaced by the flow but might not be a vortex tube anymore ( the vortex tube at time t + dt is shown in thin dashed contours). By Stoke's theorem, the flux we study is given by where is called the circulation on C. Let C be given by parametric equations {r = R(α,t), 0 ≤ α < 1}; then a given material point in the curve corresponds to a given value of α. The parameter α therefore is a material of Lagrangian parameter. One may think of C as a smoke ring for instance. One obtains where the term vanishes on a closed contour and the material derivative of r = R(α,t) is just u, i.e. u = Dr Dt . Note that the time derivative is a material derivative here and since C is moving with the fluid it is possible to commute integration and time derivative as done in the second line of this calculation letting C "unchanged".
From the last form of (39), we see that if p and ρ are functionally related, i.e.
p 0 dp ρ(p) = 0 so that the circulation is conserved.
Another way to see it, is to apply Stokes theorem on the right-hand-side which becomes − which is zero if ρ and p are functionally dependent. We now state Kelvin's theorem The proof of this theorem, like Eq. (37), emphasizes the role of the pressure and density fieds in the evolution of vorticity. One calls the quantity the baroclinic vector. Thus, Kelvin's theorem asserts the conservation of circulation in barotropic, inviscid flows. Flow situations in which B = 0 will be considered in the next chapters.

Ertel's theorem
To probe deeper into the connection between Kelvin's theorem and local variations of vorticity, it is useful to adopt a fully Lagrangian description of the motion, by defining r(t; r 0 ) to be the position (x 1 , x 2 , x 3 ) at time t of a fluid particle initially at Since r 0 labels a particle, it is fixed following the motion and the For any differential dl separating two nearby points r 0 and r 0 + dl on a material curve at time t = 0, the quantity Jdl (with components J i j dl j ) can be regarded as the vector connecting the two nearby points on the material curve at all subsequent times. By repeating the calculation (43a) for this latter quantity, we obtain These calculations remarkably show that ω/ρ in (37) and dr ≡ Jdl in (43b) satisfy the same equation provided the baroclinic vector B is zero. Now dr is a differential along a material curve. The instantaneous "trajectories" of the vector field ω/ρ = (ω/ρ)(r,t) are defined analogously as the integral curves of the system To understand the meaning of this equation, the vector field ω/ρ is "frozen" (t is fixed), then one starts from an initial position at say r(σ = σ 0 ) and follows the trajectory so that it always stays tangent to the vector field. This procedure defines an instantaneous line or trajectory. We can, at any time, select such an instantaneous trajectory and thereafter follow the evolution of this curve as a material curve. A priori, such a curve as it evolves has no reason to be a solution of (44) anymore. However, since ω/ρ and dr satisfy the same equation, it follows, assuming uniqueness of the solution, that this material curve remains a trajectory of ω/ρ. Therefore, the trajectories of ω/ρ are material curves in a barotropic flow. One usually refers to these trajectories as vortex lines, and they can be visualized as being carried about by the fluid although here we consider the vector field ω/ρ instead of the vorticity field alone.
The first term on the right of (37), which might have the appearance of a "source" of vorticity, is in reality part of the statement of the material nature of vortex lines, although ω/ρ itself is not a material invariant. The only true source of vorticity in (37) is the barotropic vector B. The material changes of ω/ρ which occur when B = 0 are due to changes in the geometry of the vortex lines, i.e., to the twisting and stretching of small elements threaded by material lines. As a small pencil of vortex lines, commonly called a vortex tube is stretched out by the flow, the cross section diminishes and the local vorticity grows in order to maintain, in accordance with Kelvin's theorem, a fixed circulation about the tube.
The most concise local statement concerning vorticity is an immediate consequence of the material nature of vortex lines. If B = 0, the vector (ω/ρ)J −1 = (ω/ρ) · ∇r 02 is a material invariant.
To see this note first, by differentiating JJ −1 = I, that Thus, using (37) and (43a), D Dt which establishes the invariance. Since J is initially the identity tensor, we see that the invariant in (45) is just the initial value of ω/ρ, or ω ρ (r,t) = ω ρ (r 0 , 0)J(r,t).
Now, suppose that λ is any materially invariant scalar, i.e. Dλ /Dt = 0. Then, we may regard λ as a function of the Lagrangian labels r 0 , i.e. λ (r 0 ) = λ (r) = cst. Then, by remarking that ∇λ = ∇ 0 λ J −1 and D(∇ 0 λ )/Dt = 0, we have D Dt We may now state Theorem 2 (Ertel's theorem). Let λ be a scalar field which is invariant by the flow (Dλ /Dt = 0), then provided the fluid is barotropic or λ is a function of ρ and p only, one has D Dt The quantity ω ρ · ∇λ is called (Ertel) potential vorticity. The case where λ is a function of ρ and p only should be clear since ∇λ = λ ρ ∇ρ + λ p ∇p is orthogonal to the baroclinic vector B. Therefore, in contradistinction to Kelvin's theorem, Ertel's theorem can also be generalized to baroclinic flows. In barotropic flows, B = 0, and λ is unrestricted by being function of ρ and p.
Let us now apply Kelvin's theorem to a small contour C lying on a surface of constant λ , where λ is again a scalar invariant (see Fig. 3). Fig. 3 Kelvin's theorem and Ertel's theorem for a material fluid element of cylindrical shape. Note that the closed contour C always stays on the surface λ = λ 0 .
Kelvin's theorem asserts the constancy of ω · dS where dS = dSn is the area enclosed by C, while Ertel's theorem asserts the constancy of (ω/ρ) · (dλ /dl)n. Since the mass ρdSdl is conserved, the two statements become identical in this local setting.
Intuitively, we see that as a tube of vortex lines is laterally compressed, the intensity of the field will increase, by Kelvin's theorem applied to the cross-section of the tube. The presence of ρ in Ertel's local invariant can be traced to the need to simultaneously satisfy Kelvin's theorem and conservation of mass in a small element undergoing deformation. In this sense, Kelvin's theorem is the more fundamental characterization of vorticity. Nevertheless, Ertel's theorem will be useful to us, since in certain cases it is possible to derive an invariant λ and to cast the dynamical problem into a form equivalent to Ertel's equation. This is the case, for instance, in the next chapter.
Our vorticity computations so far have been in the inertial frame, but in view of the correspondence ω 0 = 2Ω + ω r that follows from (22) and (33), one can write Ertel's equation relative to the rotating frame as

The Geostrophic Approximation
The geostrophic approximation governs flows at small Rossby number, i.e. at relatively rapid rotation. From now on, unless otherwise specified, the equations of motion will be assumed to be relative to a rotating frame, and given by (28).
To study motion at small Rossby number, we shall neglect the inertial terms ∂ t u + u · ∇u in (28) relative to the Coriolis force. Generally, this requires that both U/2Ω L and 1/2Ω T be small, where U, L, and T are scales of the velocity field.

The Taylor-Proudman theorem
The geostrophic approximation to (28) is then given by The curl of (50) yields If the baroclinic vector (42) vanishes, then with Ω = (0, 0, Ω ) and u = (u, v, w) the components of (51) reduce to If, in addition, the fluid can be regarded as incompressible, the constraint ∇·u = 0 allows us to conclude from (52) that This is the Taylor-Proudman theorem, stating that the velocity field in the geostropic approximation is invariant along the axis of rotation for a barotropic, inviscid flow. It highlights a striking feature of rotating flows, which was demonstrated experimentally by G. I. Taylor (Fig. 4).
A cylindrical tank of water was rotated with constant speed, and on the bottom of the tank a small cylindrical obstacle was moved horizontally with very small relative speed. In ordinary inviscid fluid dynamics, we would expect the fluid to move over and around the obstacle, the velocity vector being roughly tangent of the Ω Fig. 4 The Taylor column, illustrating the Taylor-Proudman theorem in a rapidly rotating fluid.
boundary. But, on the top of the tank w must vanish everywhere. If fluid particles can only move horizontally, with a velocity independent of z, then the flow field is determined by its structure in a horizontal plane through the cylinder. Indeed, in the experiment, the obstacle was observed to carry an otherwise stagnant column of fluid with it. This "Taylor colums" is an example of the nonlocal effect which boundary conditions can produce in a rapidly rotating fluid.

Geostrophic motion
Suppose now that we deal with a rotating layer of fluid of constant density with φ = gz, and with w = 0 rather than B = 0. Then the components of (50) give This is an example of a hydrostatic as well as geostrophic flow, wherein the Coriolis force is in equilibrium with the horizontal pressure gradient. Looking down on the layer, the flow in the vicinity of a pressure minimum rotates in the same direction as the large scale rotation. In meteorology, this would roughly approximate the wind field associated with a low in the pressure field, and is called a cyclonic circulation. Anticyclonic circulations correspond to local highs. Figure 5 is correct as drawn in the Northern Hemisphere, with the cyclonic wind blowing counterclockwise around the low. In the Southern Hemisphere, the cyclonic wind around a low blows clockwise around it. This simple rule is stated by meteorologists as the Buys-Ballot law: in the Northern Hemisphere, if a person stands with the back to the wind, atmospheric pressure is low to the left and high to the right. A more entertaining way of stating it is that, if lost, stand with the back to wind -if the high pressure is to your left, you're in the Southern Hemisphere. The geostrophic approximation highlights the unusual features of the dynamics of a fluid viewed in a rotating coordinate system. To an observer in the inertial frame, geostrophic flows are predominantly solid body rotation. In order to understand the physical meaning of the dynamical balance represented in Figure 5, this background solid body rotation must be kept in mind. However, it must also be remembered that a spatial pattern evolving on a moderate time scale relative to a rapidly rotating frame will be seen by the inertial observer as a slow modulation of the basic time dependence due to rotation of the pattern.
It is therefore helpful to deal directly with the relative motion in developing one's intuition concerning the effects of rotation. For example, in interpreting Fig. 5, the Coriolis acceleration associated with the eddy is seen to be directed towards its center, and this acceleration is in equilibrium with minus the pressure gradient.

Effects of Shallowness
The most important geometrical approximation in geophysical fluid dynamics stems from the effective shallowness of the fluid layers relative to global horizontal scales of atmospheric and oceanic flow. In the present chapter, we shall study this approximation in the simplest setting of an incompressible inviscid fluid of constant density, with uniform body force, namely gravitation, acting vertically downwards. The resulting approximate description of motion in a frame which rotates about a vertical axis is ususally referred to as rotating shallow-water theory.
The equations of motion for a constant rotation rate Ω are subjected to a systematic scale analysis, where H is the vertical scale, L is the horizontal scale and the aspect ratio δ ≡ H/L is small. The shallow-water equations with rotation are thus derived, and shown to satisfy a particular form of Ertel's theorem. To wit, potential vorticity ( f 0 + ζ )/(h − h 0 ) is conserved, where ζ is the fluid's relative vorticity, f 0 = 2Ω the background vorticity, and h − h 0 the height of a fluid column.
Small-amplitude solutions of these equations are studied in a basin with prescribed depth, using linear theory. Gravity waves are obtained as natural modes when f 0 = 0, and their modification when f 0 = 0 is investigated, yielding Poincaré's inertia-gravity waves. The "edge waves" that are related to the Poincaré waves, and called Kelvin waves, are also derived.
Finally, we study low-frequency modes that satisfy an approximate geostrophic balance between Coriolis force and pressure-gradient force. Stationary geostrophic modes, geostrophic contours of a basin and the infinite multiplicity of solutions they determine are analyzed. Rossby waves over sloping bottom topography are shown to remove this geostrophic degeneracy, and we examine their relationship with the high-frequency Poincaré and Kelvin waves, along with their physical structure.

Derivation of the Equations for Shallow Water
For constant Ω = (0, 0, Ω ) and ρ, we write the governing equations (28) in the form where 2Ω = (0, 0, f 0 ), f 0 is the Coriolis parameter, and the centripetal acceleration is neglected. We shall think of the fluid layer of constant density ρ, which might represent an ocean basin, as occupying the region h 0 (x, y) ≤ z ≤ h(x, y,t). The upper surface z = h(x, y,t) will be assumed to be a constant pressure p 0 (see Fig. 6). If H is a typical value of h and h 0 , and L is a typical horizontal scale of changes in h in p and in the velocity components (u, v, w), the layer is considered shallow if Since there is no preferred horizontal direction, we take u and v to be of possibly comparable size and of order U.
It is convenient to introduce the horizontal velocity u h = (u, v, 0) as well as the horizontal gradient ∇ h = (∂ x , ∂ y , 0) and write (55b) in the form The first term on the left is then of order U/L. If w varies by an amount of order W over the layer, the second term is of order W /H. These two terms must be comparable in order to account for conservation of matter in the presense of a divergence of horizontal velocity. Thus we take W to be of order δU. This small vertical component of velocity is consistent with small inclination of both surfaces in Fig. 6. With this scalling, the material derivative is O(U 2 /L) and contains no negligible terms, provided that the time scale is taken as L/U. The horizontal pressure forces, which drive the horizontal motions, must be of order comparable to the acceleration. Taking the Coriolis and inertial accelerations to be also comparable to each other, so that the Rossby number U/2Ω L is O(1), we have p ∼ ρU 2 . Then the horizontal momentum balance is given by On the other hand, the vertical momentum balance is yielding the hydrostatic approximation For a shallow layer this simplifies therefore to where P(x, y,t) is an arbitrary function. The latter is fixed by the dynamic condition p(x, y, h,t) = p 0 = cst on the free surface. Thus We find that the pressure is also of order ρgH ∼ ρU 2 , yielding the characteristic horizontal velocity U ∼ √ gH. According to (61), ∇ h p is independent of z. Hence we expect from (58) that u h would remain independent of z if initially so. Assuming that this is the case, Eq. (58) simplifies to where, for simplicity, the material derivative in a horizontal plane is denoted by involving a second arbitrary function. To determine it, one imposes the kinematic conditions that the upper and lower fluid boundaries be material surfaces. If a boundary is defined implicitely by S(x, y, z,t) = 0, the latter condition is that dS/dt = 0 on S = 0. This gives using (63) applied to z = h and z = h 0 Thus (64) determines w 0 and provides one equation to supplement (62a). If we substract one equation from the other, the supplemental equation is obtained in the form Note that (65) has, in horizontal coordinates, the form of a continuity equation for a "density" h − h 0 (cf. Eq. (2b)). Physically, such an equation describes how a vertical fluid column changes in height in response to changes in horizontal divergence of the velocity field. Note, in this connection, that the absence of any vertical variation of u h implies that vertical columns remain vertical. The linear variation of w with z in fact implies that the fluid column is stretched uniformly and gives rise to an interesting material invariant of the shallow-water equations. Taking the exact material derivative of z − h 0 and using (63) and (64b), we obtain we may thus combine (65) and (66) to obtain Thus, a particle of fluid retains its position in a column as a given fraction of the height of the column.
We are now in a position to apply Ertel's theorem (Section 4.2) to the material invariant λ . To do this, we need the shallow-water approximation to the inertial or absolute vorticity ω 0 . Taking into account that u h is independent of z and where ζ = ∂ x v − ∂ y u is the relative vorticity of the flow. Actually, to justify (68) it must be shown that there is no term of order δ in the equation for u h , which could contribute O(1) terms through the z-derivatives. But this follows from the estimate (59), which indicates that the shallow-water equations determine the leading terms in an expansion in δ 2 . Thus, the contributions of ∂ z u and ∂ z v to (68) are in fact of order δ as stated.
Consequently, in the shallow-water approximation, Ertel's theorem, cf. Eq. (49), yields a potential vorticity equation in the form D Dt ) and the term ∇ h λ does not contribute to the scaled gra- We can interpret this equation quite easily in physical terms: since h − h 0 is inversely proportional to the horizontal cross-sectional area of a thin column, (69a) is in fact the shallow-water version of a local form of Kelvin's theorem in a rotating frame.
One obtains some simple consequences of (69a) by expanding it as where it is assumed that f 0 does not depend on the horizontal coordinates, i.e. D f 0 /Dt = 0. Since planetary vorticity f 0 is positive, f 0 > 0 in the Northern Hemisphere, it follows from (69b) that relative vorticity ζ and layer depth h − h 0 increase and decrease together, at least when ζ is cyclonic, ζ > 0. This monotonic dependence of ζ on h − h 0 will also hold for anticyclonic ζ , provided f 0 |ζ |. The latter is typically the case for large-scale geophysical flows. The same conclusions apply in the Southern Hemisphere, where f 0 < 0, for cyclonic flows of arbitrary strength and for weak anticyclonic flows; notice that cyclonic flows in the Southern Hemisphere are characterized by ζ < 0 , cf. Fig. 1.3 and accompanying remarks. Of course, we do not know what the changes in ζ are until we find h(x, y,t). In Chapter 10 we shall see that, for small Rossby number, relations connecting u h (and therefore ζ ) to h become quite simple. In this case, (69) becomes a nonlinear partial differential equation in h, which completely determines the shallow-water dynamics.
The shallow-water theory described by Eqs. (62a, 65) has a number of interesting features, both with and without the Coriolis force term. We study here some aspects of the relevant linear theory, and take up in the next chapter some aspects of the nonlinear equations. In other words, the next two sections will address the smallamplitude limit of shallow-water theory. Results about the theory's limit for small Rossby number can be found in several of the books cited at the beginning of Chapter 1, and they are applied here in Chapter 3.
Remarks. It is possible to derive Eqs. (62a, 65) directly from (55), given the hydrostatic approximation (59b) and the fact that the horizontal velocities are independant of z. It suffices to integrate (55) with respect to z, from h 0 to h. The vertically Integrating the divergence-free equation (55b), one has which yields directly Eq. (65). Plugging this expression into the expression above gives Hence, the horizontal pressure terms, when integrated over the depth of the layer and using (61), yield multiplied everywhere by the factor h − h 0 . Equation (69a) can also be obtained directly from (62a) and (65) by taking the curl to yield 7 Reduced-gravity shallow-water models In this section, we would like to obtain the so-called (N + 1/2)-layer shallow-water model that extends the result above in the direction of a fully 3-D model. The terminology involving a '1/2-layer' is shorthand for the assumption of a deep, motionless layer above which the active flow can be approximated by N layers with nonzero velocities. In this situation, the effective gravity felt in the active layers is smaller than g, and it is referred to as reduced gravity. As we shall see at the end of this section, the atmospheric case with N + 1 fully active layers is quite similar, except for the need of using a reduced gravity. More precisely, in the (N + 1/2)-layer setting, the oceans are modeled by N active layers of finite depth and constant density ρ n and another quiescent layer of infinite depth and of larger density ρ N+1 . Roughly speaking, the active layers model the ocean above the thermocline and the last layer represents the fluid below it.
The term "reduced gravity" is used because the interfaces below the surface are in slow motion due to the fact that the force acting on them is essentially the gravity field multiplied by a factor that is proportional to density differences between the layers. This factor is in general small, so that the gravity is reduced by an order of magnitude. A simple experiment with a layer of oil above one of water in a large cup can help illustrate this concept: the interface between the two layers, when moving the cup a bit initially, is slowly damped compared to the fast gravity waves at the surface.
In the shallow-water context, the main hypothesis is the validity of the hydrostatic approximation, with constant pressure P 0 at the surface. Since the horizontal gradient of the pressure becomes independent of z, the velocities are also independent of z at all times, if they are initially so; see Sec. 6. We assume that the depth associated with each layer is h k > 0, and the vertical coordinate z is taken to be equal to h 0 at the surface, cf. Fig. 7.
Note that the derivatives in the vertical direction must be kept although it is assumed that the velocities u k = (u k , v k ) within each layer k are independent of z. This is due to the fact that the complete velocity field (u 1 , · · · , u N ) obviously depends on z, although it does so in a discontinuous way, from layer to layer. In particular, when interpreting the derivatives in discretized form, terms like ∂ u k /∂ z at the interface between layer k and k + 1, say, would be proportional to the velocity difference in a first-order linear approximation: One must be aware that estimating these derivatives is more a matter of parametrization of subgrid-scale processes, and it is difficult to assess what they should be in reality. For oceanic applications, these vertical derivatives are either neglected, except at the surface, or take the form of Rayleigh damping between the layers, in view of (70).
We thus start from N 2-D coupled Navier-Stokes equations in dimensional form, that is Using the hydrostatic hypothesis, and the continuity of the pressure at each interface, one obtains the pressure within each layer: In order to proceed, we reformulate the nonlinear terms in (73) in a more convenient conservative form using the divergence-free equation, to obtain Fig. 7 Reduced-gravity shallow-water model with N active layers of density ρ k and thickness h k > 0, while the bottom layer N + 1 is of infinite depth and at rest. The surface deviation is z = h 0 and serves as the reference for the vertical coordinate. The pressure P 0 at the surface is constant.
In the following, we will extensively use the formula We now integrate (73) over the layer depth, that is between z − = h 0 − ∑ k j=1 h j and z + = z − + h k , where z − and z + of course depends on (x, y,t). We proceed with each term and, again assuming that u k does not depend on z, obtain: using the relation (72).
To complete the derivation, one integrates the vertical dissipative terms: The terms on the right-hand side of the last equation above correspond to the horizontal stresses acting on the top and bottom of the layer k. In particular, the stress at the surface is the wind stress, see Fig. 15 and (101) in Chapter 3. In order to conclude, one must relate h 0 to the layer thicknesses. The requisite relation is readily provided by the motionless N + 1-st layer. Absence of motion implies that there are no pressure variations, that is In the following sections, we will use the notation We thus find, using (72) and (74), that We are now ready to write the reduced-gravity shallow-water model with N active and one passive layer, using the mass flux vector notation U k := u k h k and the fact Here 0 0 · · · · · · · · · · · · 0 g 1 2 0 · · · · · · · · · · · · 0 g 1 3 g 2 3 0 · · · · · · · · · 0 . . .
and the surface deviation in the densities that corresponds to h 0 in vector form is . The interfacial stress T k for 2 ≤ k ≤ N − 1 is often assumed to be zero to retain only the surface and bottom stress. Since we assume the velocity vanishes in the N + 1-st layer, one expects to get a boundary layer of Ekman type.
Herewith we just obtained the shallow-water equations in a conservative massflux form that is quite convenient for numerical discretization. There are two other ways to express (77), one is the momentum form and the other is the vorticity form. In momentum form, the equations are indeed straightforward to obtain since it suffices to divide by h k and use the mass-conservation equation: The vorticity formulation is found by cross-differentiating (79) and defining the vorticities, layer-by-layer, as Dt h k and the cross-differentiation of the left-hand side of (79) yields the Equation (80) neatly generalizes the expression (69). Note that the potential vorticity (ζ k + f )/h k in each layer is a conservative quantity in the absence of lateral friction and vertical stresses. Of course, one needs to relate h k with ζ k to obtain a complete description of the flow. It therefore amounts to solve the shallow-water equations in their primitive form, that is (77) or (79). In the quasi-geostrophic approximation, one can nevertheless find a closed equation for the vorticity-streamfunction variables. Again, we refer, for instance, to Chapter 3 in [60] for the derivation and to Chapter 3 for an application to the wind-driven ocean circulation.
Remark. The case of a genuine N + 1-layer model is slightly different. In such a model, the relation (72) for the pressure in each layer is the same but instead of having (76) due to ∇p N+1 = 0, one has g∇h 0 = g∇h 1 + · · · + g∇h N+1 , which involves an additional layer N + 1.

Small-Amplitude Motions in a Basin
From now on, we will omit the subscript 'h', namely ∇ and u are understood as ∇ h and u h . Suppose that the fluid lies in a basin occupying a closed set D in the (x, y)-plane, cf. Fig. 8, and that when at rest, i.e. with h = const., the depth of the fluid is positive, H(x, y) > 0, over D. We then set where η and hence u = (u, v) are small perturbations. Since nonlinear terms in the perturbations are neglected, (62a) and (65) reduce to ∂ v ∂t When f 0 = 0, we obtain the equations for small-amplitude oscillations in a nonrotating basin. In this case (82) can be combined to give If the contour ∂ D is an impenetrable barrier, the appropriate boundary condition for system (82) is u · n = 0, where n is the normal to ∂ D. In this case, (82a,82b) with as the corresponding boundary condition for (83a).
The deviation η is assumed to be small compared with H.
If we look for eigenfunctions η(x, y,t) = e iωt N(x, y) of (83), where N has zero integral over D, then the resulting equation for N is equivalent to the Euler-Lagrange equation for the following variational problem: Here the minimum is to be taken over all N which satisfy ∂ N/∂ n = 0 on the boundary and have zero integral over D. The Lagrange multiplier in the problem may then be identified with ω 2 /g. In this way, the minimum frequency ω 0 is determined variationally by We note in passing that, for given H(x, y), ω 0 is strictly positive and the Rayleigh-Ritz procedure provides an effective means for constructing approximate eigenfunctions. These eigenfunctions represent standing gravity waves in a shallow basin.
The relative simplicity of this theory is a consequence of the self-adjoint property of the boundary-value problem (83). Writing (83a) as it is easily seen that L is formally self-adjoint, for all functions φ , ψ satisfying the boundary condition (83b).
We turn now to the effect of rotation on small-amplitude motions. It can be shown from (82) that, if f 0 = 0, the associated linear operator is no longer self-adjoint. To exibit this operator, it is convenient to introduce the quantities σ = ∇ · (Hu), and s = ∇ × (Hu) (84) which may be thought of as representing the divergence and vorticity of the integrated mass flux across the layer. From (82a,82b), we obtain by differenciation Eliminating s and using σ = − ∂ η ∂t from (84) and (82c), the equation for η becomes ∂ ∂t We consider the simplest case, namely H = constant, so that the last two terms of the above equation vanish. One substitutes η = e i(ωt+k·x) into (86) to obtain a dispersion equation for traveling plane waves in an unbounded domain: where the wavenumber k has two components (k x , k y ) and |k| 2 = k 2 x + k 2 y . Disregarding for the moment the wave corresponding to ω = 0, these waves are known as Poincaré or inertia-gravity waves, and represent the generalization to the rotating case of familiar gravity waves. For every angular frequence ω = 0, there are two waves with with phase speed along the direction k One propagates in the positive, the other in the negative direction relative to the wave vector k. They are observed in the atmosphere as well as in large bodies of water. Seen from above, particle paths in the nonrotating case appear as line segments parallel to the direction of propagation of the wave; see panel (a) in Fig. 9. With rotation, cf. panel (b), the segments become ellipses whose eccentricity decreases with rotation. These facts can be most easily derived from a Lagrangian description.
For Poincaré waves in a finite basin, one finds that, in certain simple basin shapes, the boundary condition selects a discrete set of eigenfunctions. Each of the latter exhibits a dependence of frequency upon a discrete ordering parameter that is similar to (87).
Using (87), we write the dispersion relation for inertia-gravity waves in the form Wave fronts, which correspond to η = const., and particle paths for gravity waves: (a) without rotation; and (b) with rotation. where has the dimensions of a length, and is called the external Rossby radius of deformation. The physical inportance of this length will become clearer in studying the near-surface ocean circulation in Ch. 3.
In the present context, 2πL R is a horizontal scale below which the Poincaré waves become essentially gravity waves with small Coriolis effect. To take a typical case for the Earth's atmosphere, let H = 10 km, giving L R ∼ = 3 × 10 3 km. Thus, for horizontal scales of 2π/k < 3 × 10 3 km, we note that the gravity contribution to (89) dominates.
Notice that the frequency of these waves increases with wavenumber k. It is always larger than f 0 , i.e., their period is smaller than the period of planetary rotation. A typical period of the inertia-gravity waves at length scales L R = O 10 3 km is about 2 hours. For most geophysical problems on such horizontal scales, this determines a short time scale. An important feature, therefore, of models for large-scale atmospheric and oceanic dynamics is an appropriate mechanism for filtering out the possible Poincaré-wave components of small period. Inertia-gravity waves are observed to have, in fact, small amplitudes over these length scales.

Kelvin waves
Another possible component of oscillations of a basin is the family of Kelvin waves. These waves can be generated near a bounding contour and oscillate at a frequency which can be close to that associated with gravity waves. To take the simplest case, suppose that the basin is in the upper half-plane y > 0.
Eliminating u from (82a,82b) we obtain if v vanishes on the boundary we must also have A new result can be obtained if we assume f 2 0 − ω 2 + gHk 2 x > 0, since then the y-dependence involves real exponentials rather than real trigonometric functions.
To obtain a solution bounded in the upper half plane we must select the decaying exponential. Then (93b) requires N = N 0 e ( f 0 k x /ω)y , ω < 0, in which case (93a) gives Taking ω = −k x √ gH, one obtains a wave with positive phase velocity identical to that of a pure gravity wave.
It can be checked from (82) that v vanishes identically for this wave, and that wave crests are parallel to the y-axis. On each line y = cst the wave propagates as an ordinary gravity wave, but the x-motion is in geostrophic balance with the wave height η; the latter decreases in concert with u as y increases, at an exponential rate determined by the Rossby radius L R .
Because of their exponential decrease away from the boundary, Kelvin waves can be regarded as "edge waves". They have vanishing frequency as k x → 0, a property which separates them from inertial-gravity waves; see Fig. 12 in Sec. below. But for moderate length scales in the x-direction they fall into the category of "fast" waves. Kelvin waves are particularly important in tropical meteorology and oceanography, where the equator plays the role of the "wall".

Geostrophic Degeneracy and Rossby waves
The time scales associated with large-scale atmospheric phenomena are of the order of days, while those for the ocean are of the order of months. If these time scales are to be reflected in the small-amplitude solutions of the shallow-water equations, the relevant physics must be associated with the modes corresponding to the root ω = 0 of the dispersion relation (87) in the constant-depth case.

Stationary geostrophic modes
Let us consider then the stationary modes in the small-amplitude theory. Eqs. (82), reduce for steady solutions to This is a special kind of geostrophic flow (Section 1), and we refer to all solutions of (95) compatible with the condition on the boundary of the basin as stationary geostrophic modes. Eq. (95a) implies that (g/ f 0 ) η,is a stream function for the flow's velocity field and that lines η = cst are streamlines of the flow. In particular, the boundary has to be an isoline of η.
For the constant-depth problem we see that arbitrary divergence-free horizontal motions compatible with the condition that the boundary be a streamline are allowed. Eqs. (95) simply tells us what the fluid level must be to bring the system into geostrophic balance. From (86) and (95) we conclude that in the constant-depth case, with η proportional to e iωt in separated variables, the eigenvalue ω = 0 is infinetely degenerate. Given any boundary contour, there are an infinite number of functions η which satisfy the boundary conditions.
If H depends upon x and y, the geostrophic modes may still be infinetely degenerate, since (95b) then implies that the streamfunction η may be an arbitrary function of H alone. Thus lines of constant depth and streamlines coincide. Although contours of constant H that intersect the boundary ∂ D must carry zero velocity, closed contours that do not touch the boundary -or nested islands of such contours -do allow the construction of geostrophic modes, as shown in Fig. 10). The latter type of contours, along which stationary geostrophic flow may occur, are often called geostrophic contours.

Geostrophic contour
H=cst D Fig. 10 Geostrophic contours for small-amplitude motion in a basin D.

Sloping bottom topography
The question now arises as to what happens to the "disappearing" geostrophic modes when the depth function is perturbed, in such a way that the geostrophic degeneracy is partially or perhaps completely removed. To take a particularly simple case, consider a large basin whose depth function is locally linear It is helpful to consider the parameter β as small, β = O(ε), to retain (96) over a sizeable region, and to disregard boundary effects. It is customary in dealing with large-scale flow of the atmosphere and oceans to choose the x-coordinate pointing eastward and the y-coordinate pointing northward. Such a geometry is able to simulate the important effect of latitudinal variation in the effective local rate of rotation, as shown, for instance by Weeks and colleagues [169, and references therein].
With (96), the geostrophic contours are the lines y = const., so the degeneracy of the constant-depth case has been significantly altered. To force recovery of the missing modes in (82) as a perturbation, we suppress completely the remaining geostrophic modes by requiring that η be of the form with similar expressions for u and v, and with small ω/ f 0 .
The smallness of ω guarantees slow changes in all the variables; in particular, we shall have small ∂ u/∂t and ∂ η/∂t. Equations (82a,82c) then imply an approximate geostrophic balance. Since ∂ η/∂ y = 0, (82b) also states that u itself is small. The solutions we consider have therefore predominantly y-directed fluid velocity. That is, particle motion of the slowly-varying flow is essentially perpendicular to the geostrophic contours of the stationaty flow discussed before, while the slow wave propagation is parallel to these contours. Mowever, the free surface η and velocity (u, v) of the flow (97a) are still in a nearly geostrophic, slowly shifting balance.
Substituting η from (97a) and H from (96) into (86) yields which involves y explicitly in the first term on the right. This appears to be inconsistent with the assumed form of η. However, we are dealing only with small ω, and one can thus neglect the term β y in order to obtain a dispersion relation that is still correct to O(ε). Rearranging terms, the latter takes the form The solutions of (98) are plotted in Fig. 11, which shows that, for small β , there are two "perturbed" inertia-gravity modes, together with a third root of order β . As a matter of fact, letting the left-hand side (LHS) vanish gives directly (89). Since this third root is small, an approximate dispersion relation for it is The waves associated with (99) are called Rossby waves and, in the present case of positive β , their phase velocity c = −ω/k x is westward. The frequency of these waves is always smaller than f 0 , provided β is not too large. Rossby waves are further distinguished from both Poincaré and Kelvin waves by the fact that, for short waves, ω decreases with k x , as seen in Fig. 12. We might think of these waves as replacing, among the infinity of geostrophic modes in a basin of constant depth, the family of modes with y = const. as streamlines. In fact, to O(ε), the flow is still nondivergent and in geostrophic balance, where ψ = (g/ f 0 )η. This balance also implies that, to O(ε), the relative vorticity in the wave, cf. (68), is given by But it is precisely the small deviation from exact geostrophy -apparent to O(1) only in the vorticity form of the equations, cf. (86,97) -that eliminates the geostrophic degeneracy and gives rise to the waves.
To understand the physical structure of Rossby waves, it is useful to consider their propagation in terms of vorticity balance. According to (69), vorticity and depth along a material trajectory increase and decrease together. Examining the structure of the Rossby wave (97a) at any given time, we observe that ζ and v have the appropriate variations shown in Fig. 13, namely if ζ (x) = sin k x x then v(x) = k x cos k x x.
Since v advects ζ , wherever v is positive the relative vorticity at that point must be instantaneously decreasing with time, since advection would carry the fluid element from a deeper region into a shallower region (β > 0). In order for this to happen, the wave pattern must drift westward. Fig. 13 Linear Rossby waves must propagate westward when H = H 0 (1−β y), with β > 0, in order to conserve potential vorticity according to (69a). Thus, an initial vorticity wave given by (97a), plotted in the figure as the heavy blue undulating contour, has its vorticity decreasing -as given by the vertical red arrow that indicates the vorticity contribution −ζ in this case -if the particle moves northward. To the contrary, southward particle motion makes a positive contribution +ζ to the wave's vorticity.

Chapter 3
The Wind-driven Ocean Circulation

Introduction and motivation
Having introduced the reader to some of the fundamentals of planetary-scale flows in Chapters 1 and 2, we are ready to turn to current topics of the climate sciences and of the applications of dynamical systems theory to them. Since the early 1970s, the importance of the climate sciences has been rapidly growing, as a field of scholarship as well as a source of information for stewardship of planet Earth [116,73,74,75,76]. Likewise, the usefulness of dynamical systems and of their successive bifurcations in navigating across the hierarchy of climate models [37, 55, 56, 60, 136] -while far from universally accepted -has found a wide community of users and produced a substantial literature.
An important recent development in this literature was caused by the realization that the theory of differentiable dynamical systems (DDS) -in its classical form dealing with autonomous DDSs -was fairly well adapted to applications in which neither the forcing nor the coefficients were time dependent. To a good approximation, this could be said to be the case in numerical weather prediction [81,159], but much less so in climate problems in which the seasonal forcing [20,63,68,79,80,135,163,162] or anthropogenic effects [18,116,73,74, 75] play a major role.
To cope with the latter type of problems, applications of the theory of nonautonomous and random dynamical systems (NDS and RDS) have started to appear in the theoretical climate dynamics literature [59,27,10,43]. The purpose of the present paper is to give some flavor of the systematic application of DDS, NDS and RDS theory across the hierarchy of climate models, from the simplest, conceptual ones all the way to the most elaborate ones [39,55,56,63].
In mathematical terms, this means considering, at the lower end of the hierarchy, small systems of ordinary differential equations (ODEs) and at the higher end full-blown systems of nonlinear partial differential equations (PDEs). In the climate literature, the former are sometimes called "toy models," while the latter are re-ferred to as general circulation models (GCMs) or, more recently, as global climate models, which preserves the GCM acronym.
An interesting problem to which these ideas and results will be applied herein is that of the wind-driven circulation in mid-latitude ocean basins [37,39], subject at first to time-constant and then to purely periodic and more general forms of timedependent wind stress. The wind-driven circulation is obviously strongest near the surface of the oceans, and its variability lies mostly in the subannual and interannual range. In many simplified, theoretical treatments of the oceans, this circulation is treated separately from the so-called buoyancy-driven, thermohaline or overturning circulation. The latter is driven by mass fluxes at the ocean-atmosphere interface; it involves the density distribution of water masses -which depends on both temperature and salinity, hence its characterization as thermohaline -and it reaches all the way to the bottom of the oceans. The variability of this overturning circulation lies largely in the interdecadal and centennial range [156,152,54,117,121,37,39,56]. It will only be mentioned here in passing in Section 13.
Key features of the wind-driven circulation are described in Section 11 below. The influence of strong thermal fronts -like the Gulf Stream in the North Atlantic or the Kuroshio in the North Pacific -on the mid-latitude atmosphere above is severely underestimated, thus contributing to the current range of uncertainties in climate simulations over the 21st century. Typical spatial resolutions in recent century-scale GCM simulations [1,73,74,75,128,46] are still of the order of 100 km at best, whereas resolutions of 50 km and less would be needed to really capture the strong mid-latitude ocean-atmosphere coupling just above the oceanic fronts [51,52,110,148].
An important additional source of uncertainty comes from the difficulty to correctly parametrize global and regional effects of clouds and their highly complex small-scale physics. This difficulty is particularly critical in the tropics, where largescale features such as the El-Niño/Southern Oscillation (ENSO) and the Madden-Julian Oscillation are strongly coupled with convective phenomena [103,114,63].
The outline of this chapter is the following. First, we describe in Section 11 the most recent theoretical results regarding the internal variability of the mid-latitude wind-driven circulation, viewed as a problem in nonlinear fluid mechanics. These results rely to a large extent on autonomous DDS theory [5,70]. Next, we summarize in Section 12 the key concepts and methods of NDS [137] and RDS [4] theory.
With the results of Sections 11 and 12 in hand, we address in Section 13 the changes in this purely oceanic variability as a result of time-dependent forcing, on the one hand, and truly coupled ocean-atmosphere variability, on the other. Much of the material in this section is new [50, 125,167]. A summary and an outlook on future work follow in Section 14. Rigorous mathematical results appear in Appendices A, B and C, which follow closely [59] (Appendix A) and [27] (Appendices B and C).

Observations
To a first approximation, the main near-surface currents in the oceans are driven by the mean effect of the winds. The trade winds near the equator blow mainly from east to west and are also called the tropical easterlies. In mid-latitudes, the dominant winds are the prevailing westerlies, and towards the poles the winds are easterly again. Thus an idealized version of wind effects on the ocean must involve zonally oriented winds that blow westward at low and high latitudes, and eastward in mid-latitudes.
Five of the strongest near-surface, mid-and-high-latitude currents are the Antarctic Circumpolar Current, the Gulf Stream and the Labrador Current in the North Atlantic, and the Kuroshio and Oyashio off Japan. These currents are all clearly visible in Fig. 14. The Gulf Stream [153] is an oceanic jet with a strong influence on the climate of eastern North America and of western Europe. From Mexico's Yucatan Peninsula, the Gulf Stream flows north through the Florida Straits and along the East Coast of the United States. Near Cape Hatteras, it detaches from the coast and begins to drift off into the North Atlantic towards the Grand Banks near Newfoundland. Actually, the Gulf Stream is part of a larger, gyre-like current system, which includes the North Atlantic Drift, the Canary Current and the North Equatorial Current. It is also coupled with the pole-to-pole overturning circulation.
The Coriolis force is responsible for the so-called Ekman transport, which deflects water masses orthogonally to the near-surface wind direction and to the right [69,60,120]. In the North Atlantic, this Ekman transport creates a divergence and a convergence of near-surface water masses, respectively, resulting in the formation of two oceanic gyres: a smaller, cyclonic one in subpolar latitudes, the other larger and anticyclonic in the subtropics. This type of double-gyre circulation characterizes all mid-latitude ocean basins, including the South Atlantic, as well as the North and South Pacific.
The double-gyre circulation is intensified as the currents approach the East Coast of North America due to the β -effect. This effect arises primarily from the variation of the Coriolis force with latitude, while the oceans' bottom topography also contributes to it. The former, planetary β -effect is of crucial importance in geophysical flows and induces free Rossby waves propagating westward [69,60,120].
The currents along the western shores of the North Atlantic and of the other mid-latitude ocean basins exhibit boundary-layer characteristics and are commonly called western boundary currents. The northward-flowing Gulf Stream and the southward-flowing Labrador Current extension meet near Cape Hatteras and give rise to a strong, mainly eastward jet that eventually reaches Scandinavia. In all four mid-latitude ocean basins -i.e., the North and South Atlantic, as well as the North and South Pacific -the confluence of a warm, poleward current with a cold, equatorward one produces an intense eastward jet. The formation of this jet and of the associated recirculation vortices near the western boundary, to either side of the jet, is mostly driven by internal, nonlinear effects. Figure 15 illustrates how these large-scale wind-driven oceanic flows self-organize, as well as the resulting eastward jet. Different spatial and time scales contribute to this self-organization: so-called mesoscales eddies in the ocean play the role of the synoptic-scale weather systems in the atmosphere.
Two cold rings and a warm one are clearly visible in Fig. 15. Such warm and cold eddies last for several months up to a year and have a diameter of about 100 km vs. the much larger horizontal dimensions of atmospheric weather systems, of the order of 1000 km. In fact, the scale of roughly 100 km that characterizes these eddies in the ocean is given by the Rossby radius of deformation that was introduced in Ch. 2. This scale is much smaller than that of atmospheric weather systems, although the basic physics is the same. The difference in scale of the oceanic and atmospheric Rossby radius is due to the different stratification of the two fluid media.
The characteristic scale of the eastward oceanic jet that detaches from Cape Hatteras and of the two opposite gyres on either side of it -sketched as the recurving white arrows in the figure -is of several thousand kilometers. These larger-scale oceanic features exhibit their own intrinsic dynamics on time scales of several years to possibly one or two decades.
A striking feature of the wind-driven circulation is the existence of two wellknown North Atlantic Oscillations, with a period of about 7-8 and 14-15 years, respectively. Data analysis of various climatic variables, such as sea surface temperature (SST) over the North Atlantic or sea level pressure (SLP) over western Europe [112,171,30] and local surface air temperatures in Central England [126], as well as of proxy records, such as tree rings in Britain, travertine concretions in southeastern France [44], and Nile floods over the last millennium or so [88], all Curl of the wind stress Fig. 15 A satellite image of the sea surface temperature (SST) field over the northwestern North Atlantic (in false color, from the U.S. National Oceanic and Atmospheric Administration), together with a sketch of the associated double-gyre circulation (white arrows). A highly simplified and smoothed view of the amount of potential vorticity injected into the ocean circulation by the wind stress is shown in the sketch to the right. Reproduced from [59], with permission from Elsevier. exhibit strikingly robust oscillatory behavior with a 7-8-yr period and, to a lesser extent, with a 14-yr period. Variations in the path and intensity of the Gulf Stream are most likely to exert a major influence on the climate in this part of the world [153]. These climatic implications add to the intrinsic interest of theoretical studies of the low-frequency variability of the oceans' double-gyre circulation.
Given the complexity of the processes involved, climate studies have been most successful when using not just a single model but a full hierarchy of models, from the simplest to models to the most detailed GCMs [55,63]. In the following, we describe one of the simplest models of the hierarchy used in studying this problem, which still retains many of its essential features.

A simple model of the double-gyre circulation
The simplest model that includes many of the mechanisms described above is governed by the barotropic quasi-geostrophic (QG) equations. The term geostrophic refers to the fact that large-scale rotating flows tend to run parallel to, rather than perpendicular to constant-pressure contours; in the oceans, these contours are associated with the deviation from rest of the surfaces of equal water mass, due to Ekman pumping. Geostrophic balance implies in particular that the flow is divergence-free. The term barotropic, as opposed to baroclinic, has a slightly different meaning in geophysical fluid dynamics than in engineering fluid mechanics: it means that the model describes a single fluid layer of constant density and therefore the solutions do not depend on depth [69,60,120].
We consider an idealized, rectangular basin geometry and simplified forcing that mimics the distribution of vorticity due to the wind stress, as sketched to the right of Fig. 15. In our idealized model, the amounts of subpolar and subtropical vorticity injected into the basin are equal and the rectangular domain Ω = (0, L x ) × (0, L y ) is symmetric about the axis of zero wind stress curl.
The barotropic two-dimensional (2-D) QG equations in this idealized setting are: Here q and ψ are the potential vorticity and the streamfunction, respectively, and the Jacobian J gives the advection of potential vorticity by the flow, J(ψ, q) = ψ x q y − ψ y q x = u · ∇q, where u = (−ψ y , ψ x ), x points east and y points north. The physical parameters are the strength of the planetary vorticity gradient β , the Rossby radius of deformation λ −2 R , the eddy-viscosity coefficient ν, the bottom friction coefficient µ, and the wind-stress intensity τ. We use here free-slip boundary conditions ψ = ∆ 2 ψ = 0; the qualitative results described below do not depend on the particular choice of homogeneous boundary conditions [39,78].
We consider the nonlinear PDE system (101) as an infinite-dimensional dynamical system and study its bifurcations as the parameters change. Two key parameters are the wind stress intensity τ and the eddy viscosity ν. An important property of (101) is its mirror symmetry in the y = L y /2 axis. This symmetry can be expressed as invariance with respect to the discrete Z 2 group S , given by any solution of (101) is thus accompanied by its mirror-conjugated solution. Hence the prevailing bifurcations are of either the symmetry-breaking or the Hopf type.

Bifurcations in the double-gyre problem
The development of a comprehensive nonlinear theory of the double-gyre circulation over the last two decades has gone through four main steps. These four steps can be followed through the bifurcation tree in Fig. 16.

Symmetry-breaking bifurcation
The "trunk" of the bifurcation tree is plotted as the solid blue line in the lower part of the figure. When the forcing τ is weak or the dissipation ν is large, there is only one steady solution, which is antisymmetric with respect to the mid-axis of the basin. This solution exhibits two large gyres, along with their β -induced western boundary currents. Away from the western boundary, such a near-linear solution (not shown) is dominated by so-called Sverdrup balance between wind stress curl and the meridional mass transport [69,155]. The first generic bifurcation of this QG model was found to be a genuine pitchfork bifurcation that breaks the system's symmetry as the nonlinearity becomes large enough with increasing wind stress intensity τ [77,78,17]. As the wind stress increases, the near-linear Sverdrup solution that lies along the solid blue line in the figure develops an eastward jet along the mid-axis, which penetrates farther into the domain and also forms two intense recirculation vortices, on either side of the jet and near the western boundary of the domain.
The resulting more intense, and hence more nonlinear solution is still antisymmetric about the mid-axis, but loses its stability for some critical value of the windstress intensity, τ = τ P . This value is indicated by the filled square on the symmetry axis of Fig. 16 and is labeled "Pitchfork" in the figure.
A pair of mirror-symmetric solutions emerges and it is plotted as the two red solid lines in the figure's lower part. The streamfunction fields associated with the two stable steady-state branches have a rather different vorticity distribution and they are plotted in the two small panels to the upper-left and upper-right of Fig. 16. In particular, the jet in such a solution exhibits a large meander, reminiscent of the one seen in Fig. 15 just downstream of Cape Hatteras; note that the colors in Fig.  16 were chosen to facilitate the comparison with Fig. 15. These asymmetric flows are characterized by one recirculation vortex being stronger in intensity than the other, which deflects the jet accordingly either to the southeast, as is the case in the observations for the North Atlantic, or to the northeast.

Gyre modes
The next step in the theoretical treatment of the problem was taken in part concurrently with the first one above [77,78] and in part shortly thereafter [150,40,138]. It involved the study of time-periodic instabilities through Hopf bifurcation from either an antisymmetric or an asymmetric steady flow. Some of these studies treated wind-driven circulation models limited to a stand-alone, single gyre [138,121]; such a model concentrates on the larger, subtropical gyre while neglecting the smaller, subpolar one.
The overall idea was to develop a full, generic picture of the time-dependent behavior of the solutions in more turbulent regimes, by classifying the various instabilities in a comprehensive way. However, it quickly appeared that a particular kind of instability leads to so-called gyre modes [78,150], and was prevalent across the full hierarchy of models of the double-gyre circulation; furthermore, this instability triggers the lowest nonzero frequency present in all such models [37,39].
These gyre modes always appear after the first pitchfork bifurcation, and it took several years to really understand their genesis: gyre modes arise as two eigenvalues merge -one is associated with a symmetric eigenfunction and responsible for the pitchfork bifurcation, the other is associated with an antisymmetric eigenfunction [139]; this merging is marked by a filled circle on the left branch of antisymmetric stationary solutions and labeled M in Fig. 16.
Such a phenomenon is not a bifurcation stricto sensu: one has topological C 0equivalence before and after the eigenvalue merging, but not from the C 1 point of view. Still, this phenomenon is quite common in small-dimensional dynamical systems with symmetry, as exemplified by the unfolding of codimension-2 bifurcations of Bogdanov-Takens type [70]. In particular, the fact that gyre modes trigger the lowest-frequency of the model is due to the frequency of these modes growing quadratically from zero until nonlinear saturation. Of course these modes, in turn, become unstable shortly after the merging, through a Hopf bifurcation indicated in Fig. 16 by a heavy dot marked "Hopf," from which a stylized limit cycle emerges. A mirror-symmetric M and Hopf bifurcation also occur on the right branch of stationary solutions, but have been omitted in the figure for visual clarity.
More generally, Hopf bifurcations of various types give rise to features that recur more-or-less periodically in fully turbulent planetary-scale flows, atmospheric, oceanic and coupled [60,37,39,56]. In the climate sciences, one commonly refers to this type of near-periodic recurrence as low-frequency variability (LFV).

Global bifurcations
The importance of the gyre modes was further confirmed through an even more puzzling discovery. Several authors realized, independently of each other, that the low-frequency dynamics of their respective double-gyre models was driven by intense relaxation oscillations of the jet [144,108,19,113,141,142,143]. These relaxation oscillations, already described in [78,150], were now attributed to a homoclinic bifurcation, with a global character in phase space [70,60]. In effect, the QG model reviewed here undergoes a genuine homoclinic bifurcation that is generic across the full hierarchy of double-gyre models.
This bifurcation is due to the growth and eventual merging of the limit cycles that arise from the two mutually symmetric Hopf bifurcations. It is marked in the figure by a filled circle and a stylized lemniscate, and labeled "Homoclinic." This global bifurcation is associated with chaotic behavior of the flow due to the Shilnikov phenomenon [113,143], which induces horseshoes in phase space.
The connection between such homoclinic bifurcations and gyre modes was not immediately obvious, but Simonnet et al. [143] emphasized that the two were part of a single, global dynamical phenomenon. The homoclinic bifurcation indeed results from the unfolding of the gyre modes' limit cycles. This familiar dynamical scenario is again well illustrated by the unfolding of a codimension-2 Bogdanov-Takens bifurcation, where the homoclinic orbits emerge naturally.
Since homoclinic orbits have an infinite period, it was natural to hypothesize that the gyre-mode mechanism, in this broader, global-bifurcation context, gave rise to the observed 7-yr and 14-yr North Atlantic oscillations. Although this hypothesis may appear a little farfetched -given the simplicity of the double-gyre models analyzed so far -it is reinforced by results with much more detailed model in the hierarchy, cf. [37,39] and Section 13 herein.
The successive-bifurcation theory appears therewith to be fairly complete for barotropic, single-layer models of the double-gyre circulation. This theory also provides a self-consistent, plausible explanation for the climatically important 7-year and 14-year oscillations of the oceanic circulation and the related atmospheric phenomena in and around the North Atlantic basin [39,37,112,171,30,126,51,52,49,50,88,142,143]. The dominant 7-and 14-year modes of this theory survive, moreover, perturbation by seasonal-cycle changes in the intensity and meridional position of the westerly winds [154].
In baroclinic models, with two or more active layers of different density, baroclinic instabilities [39,52,69,60,120,153,121,142,8,90] surely play a fundamental role, as they do in the observed dynamics of the oceans. However, it is not known to what extent baroclinic instabilities can destroy gyre-mode dynamics. The difficulty lies in a deeper understanding of the so-called rectification process [83], which arises from the nonzero mean effect of the baroclinic component of the flow.
Roughly speaking, rectification drives the dynamics far away from any stationary solutions. In this situation, dynamical systems theory by itself cannot be used as a full explanation of complex, observed behavior resulting from successive bifurcations that are rooted in simple stationary or periodic solutions. Other tools from statistical mechanics and nonequilibrium thermodynamics should, therefore, be considered [13,48,102,100,104,131,160]. Combining these tools with those of the successive-bifurcation approach may eventually lead to a more general and complete physical characterization of gyre modes in realistic models.

Non-autonomous and random dynamical systems
An additional way of improving upon the simple results so far is to include timedependent forcing and stochasticity. As discussed in Section 10, the appropriate general framework for doing this is the one provided by NDS and RDS theory [59,27,10,43]. In the present section, we summarize this mathematical framework in as straightforward a way as possible, and start with some simple ideas about deterministic vs. stochastic modeling.

Background and motivation
We are interested in behavior that is robust across a full hierarchy of climate models in general, and of oceanic double-gyre models in particular. Moreover, we would like this robust behavior of the models to be recognizable in the very large and everincreasing mass of observational data. Finally, the features in whose robustness we are most interested are those that obtain over long time intervals.
Classical DDS theory, going back to H. Poincaré [127], is essentially a geometric approach to studying the asymptotic, long-term properties of solutions to autonomous ODE systems in phase space. Extensions to nonlinear PDE systems are covered, for instance, in [158]. To apply the theory in a reliable manner to a set of complex physical phenomena, one needs a criterion for the robustness of a given model within a class of dynamical systems. Such a criterion should help one deal with the inescapable uncertainties in model formulation, whether due to incomplete knowledge of the governing laws or inaccuracies in determining model parameters.
In this context, A. A. Andronov and L. S. Pontryagin [2] introduced the concept of structural stability for classifying dynamical systems. Structural stability means that a small, continuous perturbation of a given system preserves its dynamics up to a homeomorphism, i.e., up to a one-to-one continuous change of variables that transforms the phase portrait of our system into that of a nearby system; thus fixed points go into fixed points, limit cycles into limit cycles, etc. Closely related is the notion of hyperbolicity formulated by S. Smale [147]. A system is hyperbolic if, (very) loosely speaking, its limit set can be continuously decomposed into invariant sets that are either contracting or expanding; see [82] for more rigorous definitions.
A very simple example is the phase portrait in the neighborhood of a fixed point of saddle type. The Hartman-Grobman theorem states that the dynamics in this neighborhood is structurally stable. The converse statement, i.e. whether structural stability implies hyperbolicity, is still an open question; the equivalence between structural stability and hyperbolicity has only been shown in the C 1 case, under certain technical conditions [130,132,106,118]. Bifurcation theory is quite complete and satisfying in the setting of hyperbolic dynamics. Problems with hyperbolicity and bifurcations arise, however, when one deals with more complicated limit sets.
Hyperbolicity was introduced initially to help pursue the "dynamicist's dream" of finding -in an abstract space X of all possible dynamical systems with given dimension and regularity -an open and dense set S consisting of structurally stable ones. For this set to be open and dense means, roughly speaking, that any dynamical system in X can be approximated by structurally stable ones in S , while systems in the complement of S are negligible in a suitable sense.
Smale conjectured that hyperbolic systems form an open and dense set in the space of all C 1 dynamical systems. If this conjecture were true then hyperbolicity would be typical of all dynamics. Unfortunately, though, this conjecture is only true for one-dimensional dynamics and flows on disks and surfaces [122]. Smale [146] himself found several counterexamples to his conjecture. Newhouse [115] was able to generate open sets of nonhyperbolic diffeomorphisms using homoclinic tangencies. For the physicist, it is even more striking that the famous Lorenz attractor [98] is structurally unstable. Families of Lorenz attractors, classified by topological type, are not even countable [71,170]. In each of these examples, we observe chaotic behavior in a nonhyperbolic situation, i.e., nonhyperbolic chaos.
Nonhyperbolic chaos appears, therefore, to be a severe obstacle to any easy classification of dynamic behavior. As mentioned by J. Palis [118], A. N. Kolmogorov already suggested at the end of the sixties that "the global study of dynamical systems could not go very far without the use of new additional mathematical tools, like probabilistic ones." Once more, Kolmogorov showed prophetic insight, and nowadays the concept of stochastic stability is an important tool in the study of genericity and robustness for dynamical systems. A system is stochastically stable if its Sinai-Ruelle-Bowen (SRB) measure [145] is stable with respect to stochastic perturbations, and the SRB measure is given by lim n→∞ 1 n ∑ i δ z i , with z i being the successive iterates of the dynamics. This measure is obtained intuitively by allowing the entire phase space to flow onto the attractor [45].
To replace the failed program of classifying dynamical systems based on structural stability and hyperbolicity, J. Palis [118] formulated his so-called global conjecture: The set S of systems satisfying the following conditions is dense in the C r -topology for r ≥ 1. The three conditions are: (i) the system has only finitely many attractors, i.e. periodic or chaotic sinks; (ii) the union of the corresponding basins of attraction has full Lebesgue measure in X ; and (iii) each attractor is stochastically stable in its basin of attraction.
Stochastic stability is thus based on ergodic theory. We would like to consider a more geometric approach, which can provide a coarser, more robust classification of climate models across a full hierarchy of such models. In this section, we propose such an approach, based on RDS theory [4,25,34,93, and references therein].
RDS theory describes the behavior of dynamical systems subject to external stochastic forcing; its tools have been developed to help study the geometric properties of stochastic differential equations (SDEs). RDS theory is thus the stochastic, SDE counterpart of the geometric theory of ODEs, as presented for instance in [5]. This approach provides a rigorous mathematical framework for a stochastic form of robustness, while the more traditional, topological concepts do not seem to be appropriate for the climate sciences, given the prevalence of nonhyperbolic chaos.

Pullback and random attractors
Stochastic parametrizations for GCMs aim at compensating for our lack of detailed knowledge on small spatial scales in the best way possible [94,95,96,119,157,151]. The underlying assumption is that the associated time scales are also much shorter than the scales of interest and, therefore, the lag correlation of the phenomena being parametrized is negligibly small. Such assumptions clearly hold, for instance, for the spatial and temporal scales of cloud processes, whose parametrization poses a notorious obstacle to realistic climate simulation and prediction [73,74,75,76]. Stochastic parametrizations thus essentially transform a deterministic autonomous system into a nonautonomous one, subject to random forcing.
Explicit time dependence, whether deterministic or stochastic, in a dynamical system immediately raises a technical difficulty. Indeed, the classical notion of attractor is not always relevant, since any object in phase space is moving with time and the concept of forward asymptotics, as considered in autonomous DDS theory, is meaningless. One needs therefore another notion of attractor. In the deterministic nonautonomous framework, the appropriate notion is that of a pullback attractor [129,85,15], which we present below. The closely related notion of random attractor in the stochastic framework is also explained briefly below, with further details given in Appendix A.

Pullback attraction
Before defining the notion of pullback attractor, let us recall some basic facts about nonautonomous dynamical systems. Consider the ODĖ on a vector space X; this space could even be infinite-dimensional, if we were dealing with partial or functional differential equations, as is often the case in fluid-flow and climate problems. Rigorously speaking, we cannot associate a dynamical sys-tem acting on X with a nonautonomous ODE; nevertheless, in the case of unique solvability of the initial-value problem, we can introduce a two-parameter family of operators {S(t, s)} t≥s acting on X, with s and t real, such that S(t, s)x(s) = x(t) for t ≥ s, where x(t) is the solution of the Cauchy problem with initial data x(s). This family of operators satisfies S(s, s) = Id X and S(t, τ) • S(τ, s) = S(t, s) for all t ≥ τ ≥ s, and all real s. Such a family of operators is called a "process" by Sell [137].It extends the classical notion of the resolvent of a nonautonomous linear ODE to the nonlinear setting.
We can now define the pullback attractor as the family of invariant sets {A (t) : −∞ < t < +∞} that satisfy, for every real t and all x 0 in X: "Pullback" attraction does not involve running time backwards; it corresponds instead to the idea of measurements being performed at present time t in an experiment that was started at some time s < t in the past: the experiment has been running for long enough, and we are thus looking at the present moment for an "attracting state." Note that there exist several ways of defining a pullback attractor -the one retained here is a local one, cf. [91, and references therein]; see [7] for further information on nonautonomous dynamical systems in general.
A simple example will be helpful for the general reader. Consider the scalar linear with both α and σ positive. Here α > 0 implies the system is dissipative, a crucial property both mathematically and physically. In the climate context, its importance has been emphasized by [98,60], among others. The situation is depicted in Fig. 17. The PBA is easily computed in this case, and it is given by The figure clearly shows that the approximation of the PBA at a given time t = t 1 or t = t 2 improves as we "pull back" further, from s = s 1 to s = s 2 (red vs. blue orbits in the figure); it also shows that the accuracy of the approximation depends essentially on the time interval |t − s|, rather than on t and s separately. In practice, the characteristic time for obtaining an approximation with given accuracy will depend largely on the rate of dissipation α.
In the stochastic context, noise forcing is modeled by a stationary stochastic process. If the deterministic dynamical system of interest is coupled to this stochastic process in a reasonable way -defined below by the "cocycle property" -then random pullback attractors may appear. These pullback attractors will exist for almost each sample path of the driving stochastic process, so that the same probability distribution governs both sample paths and their corresponding pullback attractors. A more detailed and rigorous explanation is given in Appendix A. Roughly speaking, the concept of a random attractor provides a geometric framework for the description of asymptotic regimes in the context of stochastic dynamics. Figure 17 showed how additive and linear deterministic forcing modifies a fixed point. We illustrate next, following [27,56], the way that multiplicative stochastic forcing modifies a strange attractor.

The stochastically perturbed Lorenz model
We consider in this section the Lorenz convection model [98] and its randomly forced version studied in [27]. The model is governed by the well-known three nonlinear, coupled ODEs that are now forced by linearly multiplicative white noise [4,25]. [SLM] In the deterministic context, purely geometric map models were proposed in the 1970s [70] to interpret the dynamics observed numerically by E. N. Lorenz in [98].
These geometric models attracted considerable attention and it was shown that they possess a unique SRB measure [84,28], i.e., a time-independent measure that is invariant under the flow and has conditional measures on unstable manifolds that are absolutely continuous with respect to Lebesgue measure [45]. This result has been extended recently to the Lorenz flow [98] itself, in which the SRB measure is supported by a strange attractor of vanishing volume [161,3].
Even though this result was only proven recently, the existence of such an SRB measure was suspected for a long time and has motivated several numerical studies to compute a probability density function (PDF) associated with the Lorenz model, by filtering out the stable manifolds [42, 111, and references therein]. The Lorenz attractor is then approximated by a two-dimensional manifold, called the branched manifold [70], which supports this PDF. Based on such a strategy, Dorfle & Graham [42] showed that the stationary solution of the Fokker-Planck equation for the Lorenz model perturbed by additive white noise possesses a density with two components: the PDF of the deterministic system supported by the branched manifold plus a narrow Gaussian distribution transversal to that manifold.  [98] model's random attractor A (ω) and of the corresponding sample measure µ ω , for a given, fixed realization ω. The figure corresponds to projection onto the (y, z) plane, i.e. µ ω (x, y, z)dx. One billion initial points have been used in both panels and the pullback attractor is computed for t = 40. The parameter values are the classical ones -r = 28, s = 10, and b = 8/3, while the time step is ∆t = 5 · 10 −3 . The color bar to the right of each panel is on a log-scale and quantifies the probability to end up in a particular region of phase space. Both panels use the same noise realization ω but with noise intensity (a) σ = 0.3 and (b) σ = 0.5. Notice the interlaced filamentary structures between highly (yellow) and moderately (red) populated regions; these structures are much more complex in panel (b), where the noise is stronger. Weakly populated regions cover an important part of the random attractor and are, in turn, entangled with (almost) zero-probability regions (black). After [27].
It follows that, in the presence of additive noise, the resulting PDF looks very much like that of the unperturbed system, only slightly fuzzier: the noise smoothes the small-scale structures of the attractor. More generally, this smoothing appears in the classical, forward approach that only considers forward asymptotics, with t → +∞ -as opposed to the pullback approach, introduced here in Section 12.2.1, in which one considers also the pullback asymptotics of s → −∞ -and it does so for a broad class of additive as well as multiplicative noises, in the sense of [4]. More precisely, this smoothing occurs provided that the diffusion terms due to the stochastic components in the Fokker-Planck equation associated with the SDE system under study are sufficiently non-degenerate; see [27, Appendix C and references therein].
Hörmander's theorem guarantees that this is indeed the case for hypoelliptic SDEs [6]. The corresponding non-degeneracy conditions allow one to regularize the stationary solutions of the transport equation which is the counterpart of the Fokker-Planck equation in the absence of noise; a measure-theoretic justification for it can be found, for instance in [92, p. 210]. This transport equation is also known as the Liouville equation and it provides the probability density at time t of S(t)x when the initial state x is sampled from a probability measure that is absolutely continuous with respect to Lebesgue measure; here {S(t)} t∈R is the flow ofẋ = f(x), for some sufficiently smooth vector field f on R d . As a matter of fact, when f is dissipative and the dynamics associated with it is chaotic, the stationary solutions of (108) are very often singular with respect to Lebesgue measure; these solutions are therefore expected to be SRB measures. For a broad class of noises -such as those that obey a hypoellipticity condition -the forward approach leads us to suspect that noise effects tend to remove the singular aspects with respect to Lebesgue measure. This smoothing aspect of random perturbations is often useful in the theoretical understanding of any stochastic system, in particular in the analysis of the low-and higher-order moments, which have been thoroughly studied in various contexts.
For chaotic systems subject to noise, however, this noise-induced smoothing observed in the forward approach compresses a lot of crucial information about the dynamics itself; quite to the contrary, the pullback approach brings this information into sharp focus. A quick look at Figs. 18(a,b) and 19 is already enlightening in this respect. All three figures refer to the invariant measure µ ω supported by the random attractor of our stochastic Lorenz model [SLM]. This model obeys the following three SDEs: In system (107), each of the three equations of the classical, deterministic model [98] is perturbed by linearly multiplicative noise in the Itô sense, with W t a Wiener process and σ > 0 the noise intensity. The other parameter values are the standard ones for chaotic behavior [60], and are given in the caption of Fig. 18. Figures 18(a,b) show two snapshots of the sample measure µ ω supported by the random attractor of [SLM] -for the same realization ω but for two different noise intensities, σ = 0.3 and 0.5, while Fig. 19 provides four successive snapshots of µ θ t ω , for the same noise intensity σ = 0.5 as in Fig. 18(b), but with t = t 0 + k∆t and k = 0, 1, 2, 3 for some t 0 .
The sample measures in these three figures, and in the associated short video given in the SM, exhibit amazing complexity, with fine, very intense filamentation; note logarithmic scale on color bars in the three figures. There is no fuzziness whatsoever in the topological structure of this filamentation, which evokes the Cantor-set foliation of the deterministic attractor [70]. Such a fine structure strongly suggests that these measures are supported by an object of vanishing volume. Fig. 19 Four snapshots of the random attractor and sample measure supported on it, for the same parameter values as in Fig. 18. The time interval ∆t between two successive snapshots -moving from left to right and top to bottom -is ∆t = 0.0875. Note that the support of the sample measure may change quite abruptly, from time to time, cf. short video in [27,Supplementary Material] for details. Reproduced from [27], with permission from Elsevier.
Much more can be said, in fact, about these objects. RDS theory offers a rigorous way to define random versions of stable and unstable manifolds, via the Lyapunov spectrum, the Oseledec multiplicative theorem, and a random version of the Hartman-Grobman theorem [4]. These random invariant manifolds can support measures, like in the deterministic context. When the sample measures µ ω of an RDS have absolutely continuous conditional measures on the random unstable manifolds, then µ ω is called a random SRB measure.
One can prove rigorously, by relying on Theorem B of [93], that the sample measures of the discretized stochastic system obtained from the [SLM] model share the SRB property. Indeed, it can be shown that a Hörmander hypoellipticity condition is satisfied for our discretized [SLM] model, thus ensuring that the random process generated by this model has a smooth density p(t, x) [86]; see [27,Appendix C1] for the details. Standard arguments [149] can then be used to prove that the stationary solution ρ of our model's Fokker-Planck equation is in fact absolutely continuous with respect to Lebesgue measure.
Since these simulations exhibit exactly one positive Lyapunov exponent, the absolute continuity of ρ implies that the sample measures seen in Figs. 18 and 19 are, actually, good numerical approximations of a genuine random SRB measure for our discretized [SLM], whenever δt is sufficiently small; see also the next section. In fact, Ledrappier & Young's [93] Theorem B is a powerful result, which clearly shows that -in noisy systems, and subject to fairly general conditions -chaos can lead to invariant sample measures with the SRB property; see also [27,Appendix C2]. It is striking that the same noise-induced smoothing that was "hiding" the dynamics in the forward approach allows one here to exhibit the existence of an SRB measure from a pullback point of view, and thus to compute explicitly the unstable manifold supporting this invariant measure.
Note that, since the sample measures associated with the discrete [SLM] system are SRB here, they are so-called physical measures [4, 27, and references therein] and can thus be computed at any time t by simply flowing a large set of initial data from the remote past s t up till t, for a fixed realization ω; this is exactly how Figs. 18 and 19 were obtained. Given the SRB property, the nonzero density supported on the model's unstable manifolds delineates numerically these manifolds; Figs. 18 and 19 provide therefore an approximation of the global random attractor of our stochastic Lorenz system.
Finally, these random measures are Markovian, in the sense that they are measurable with respect to the past σ -algebra of the noise [4]. The latter statement results directly from the fact that these measures are physical, and thus satisfy the required measurability conditions in the pullback limit. The information about the moments that is available in the classical Fokker-Planck approach is complemented here by information about the pathwise moments. These pathwise statistics are naturally associated with the sample measures -when the latter are SRB -by taking appropriate averages.
The evolution of the sample measures µ θ t ω -as apparent from the short video in [27, Supplementary Material] -is quite complex, and two types of motion are present. First, a pervasive "jiggling" of the overall structure can be traced back to the roughness of the Wiener process W t and to the multiplicative way it enters into the [SLM] model. Second, there is a smooth, regular low-frequency motion present in the evolution of the sample measures, which seems to be driven by the deterministic system's unstable limit cycles and is thus related to the well-known lobe dynamics. The latter motion is clearly illustrated in Fig. 19.
More generally, it is worth noting that this type of low-frequency motion seems to occur quite often in the evolution of the samples measures of chaotic systems perturbed by noise; it appears to be related to the recurrence properties of the unperturbed deterministic flow, especially when energetic oscillatory modes characterize the latter. To the best of our knowledge, there are no rigorous results on this type of phenomenon in RDS theory.
Besides this low-frequency motion, abrupt changes in the global structure occur from time to time, with the support of the sample measure either shrinking or expanding suddenly. These abrupt changes recur frequently in the video associated with Fig. 19, which reproduces a relatively short sequence out of a very long stochastic model integration; see SM.
As the noise intensity σ tends to zero, the sample-measure evolution slows down, and one recovers numerically the measure of the deterministic Lorenz system (not shown). This convergence as σ → 0 may be related to the concept of stochastic stability [172,84]. Such a continuity property of the sample measures in the zeronoise limit does not, however, hold in general; it depends on properties of the noise, as well as of the unperturbed attractor [31,35,16].
As stated in the theoretical section, the forward approach is recovered by taking the expectation, E[µ • ] := Ω µ ω P(dω), of these invariant sample measures. In practice, E[µ • ] is closely related to ensemble or time averages that typically yield the previously mentioned PDFs. In addition, when the random invariant measures are Markovian and the Fokker-Planck equation possesses stationary solutions, where ρ is such a solution. Subject to these conditions, there is even a one-to-one correspondence between Markovian invariant measures and stationary measures of the Markov semi-group [4,32]. The inverse operation of µ → ρ = E[µ • ] is then given by ρ → µ ω = lim t→∞ Φ(−t, ω) −1 ρ; the latter is in fact the pullback limit of ρ due to the cocycle property [32]. It follows readily from this result that RDS theory "sees" many more invariant measures than those given by the Markov semi-group approach: non-Markovian measures appear to play an important role in stochastic bifurcation theory [4], for instance.
To summarize, one might say that the classical forward approach considers only expectations and PDFs, whereas the RDS approach "slices" the statistics very finely: the former takes a hammer to the problem, while the latter takes a scalpel. Clearly, distinct physical processes may lead to the same observed PDF: the RDS approach and, in particular, the pullback limit are able to discriminate between these processes and thus provide further insight into them.
13 Time-dependent forcing and ocean-atmosphere coupling Given the NDS and RDS concepts of pullback attraction and time-dependent invariant measures, as introduced in the previous section, we are ready now to return to the wind-driven ocean circulation of Section 11 and consider two additional steps in its understanding, simulation and prediction, namely time-dependent forcing and coupling with the atmosphere above. We start by applying a still prescribed but now time-dependent wind-stress to a double-gyre model.

Time-dependent forcing
The most obvious way in which wind stress varies in mid-latitudes is from summer to winter: the so-called atmospheric jet stream is both stronger and closer to the equator in winter. Such periodic changes in the forcing were already considered in [154], albeit without the theoretical underpinnings discussed herein. The first application of pullback attractors to the double-gyre problem appears in [124], for the case of periodic forcing, and we rely here on [125] for a systematic presentation of this novel application of NDS theory to the wind-driven ocean circulation.

Model formulation
The model, in its autonomous form [123], differs somewhat from that of Eq. (101) by the absence of the Laplacian eddy viscosity -given by the bilaplacian ∆ 2 ψ of the streamfunction ψ -and by a slightly different form of the wind stress. Its discretized form is obtained by projection of Eqs (101) onto a set of cartesian basis functions, and low-order truncation of the expansion thus obtained. The discrete, low-order model is thus governed by four nonlinear coupled ODEs for the variables {Ψ i (t), i = 1, · · · , 4} that are the coefficients of the streamfunction ψ(x, y,t) retained in this expansion. The choice of basis functions follows up on such idealized double-gyre models introduced in [78, 143], in particular on this basis including the exponential form of a current that decays away from the domain's western boundary.
In the presence of nonautonomous forcing, the model takes the forṁ Here Ψ denotes the vector of expansion coefficients (Ψ 1 ,Ψ 2 ,Ψ 3 ,Ψ 4 ) of the streamfunction, W is the vector of forcing terms (W 1 ,W 2 ,W 3 ,W 4 ), and the bilinear terms B i are given by In [125], the forcing W(x,t) is defined as where x = (x, y) and w(x) is the time-constant double-gyre wind stress curl used in [123]; furthermore, γ is the dimensionless intensity of this wind stress, while f (t) is an aperiodic time dependence weighed by the dimensionless parameter ε. Clearly, w(x) is projected onto the same four leading-order basis functions as Ψ and W.

Model behavior
The aperiodic forcing f (t) in Fig. 20a corresponds to an idealized but aperiodic choice with a substantial decadal component. Such decadal climate signals are associated with the so-called Pacific Decadal Oscillation [105,21], for instance, and climate prediction on such time scales is a matter of great recent interest for the climate sciences [18,65,126,75,76].
The evolution of the model solutions subject to this forcing is shown in Figs. 20b and 20c for two reference cases, γ = 0.96 and γ = 1.1, respectively; the solutions of the corresponding autonomous system, i.e. with ε = 0, are plotted in [125, Figs. 1(b,c)]. It is clear from the two panels that a regime change occurs as the forcing is increased. This regime change occurs in the autonomous sytem at γ = 1.0 and corresponds to a global bifurcation associated with a homoclinic orbit; see [143,123] and Section 11.3.3 here. The persistence of the regime shift in the nonautonomous case, for ε as large as 0.2, is interesting but none too surprising.
In Figs. 20(b, c), N trajectories Ψ k (t) emanate from N initial states uniformly distributed at time t 0 = 0 on a reference subset Γ of the (Ψ 1 ,Ψ 3 )-plane, while the (Ψ 2 ,Ψ 4 ) coordinates of the initial states (not shown) are chosen the same way as in [124]. The panels (b,c) here provide a first representation of the sets that approximate the corresponding PBAs; the small number of trajectories is in fact chosen for the sake of graphical clarity. The correct identification and characterization of the PBAs requires, however, an analysis of the PDFs that evolve along the trajectories and of the convergence to the appropriate invariant time-dependent sets; see Figs. 20(b',c'). This convergence was shown in [125] to take no longer than about 15 years of model simulation.

Multiplicity of PBAs
The existence of a global PBA is rigorously demonstrated for the weakly dissipative, nonlinear model governed by Eqs. (109)- (111) in [125,Appendix], based on general, NDS-theoretical results [85,15]. Numerically, though, this unique global attractor seems to possess two separate local PBAs, as apparent from Fig. 21. Panels (a) and (b) in the figure refer again to the two types of regimes apparent in Figs. 20(b,c), for γ = 0.96 and γ = 1.1, respectively.
The mean normalized distance plotted in the figure is defined as follows: Here δ t is the distance, at time t, between two trajectories of the model that were a distance δ 0 apart at time t = t 0 , and the normalized distance δ n = δ t /δ 0 is averaged over the whole forward time integration T tot = t fin − t init of the available trajectories. The maps of σ in Fig. 21 reveal large chaotic regions where δ n 1 on average (warm colors) but also non-chaotic regions, in which σ ≤ 1 (blue) and thus initially close trajectories do remain close on average. The rectangular regions in the two panels that are labeled by letters A and B and by numbers 1 − 4 correspond to subdomains of the initial set Γ and are discussed further in [125,Sec. 5]. The numerical evidence in Fig. 21 suggests, furthermore that the boundary between the two types of local attractors has fractal properties.
In the autonomous context, the coexistence of topologically distinct local attractors is well known in the climate sciences [60,37,39,143,140, and references therein]. The coexistence of local PBAs with chaotic vs. non-chaotic characteris- tics, within a unique global PBA, as illustrated by Fig. 21 here, seems to be novel, at least in the climate literature.

Coupled ocean-atmosphere variability
Forcing of an ocean model by time-dependent wind stress is but one step in the direction of a fully coupled ocean-atmosphere model. Simple models of this type have been considered, for instance, by S. Vannitsem and associates [165,166,167], who also included heat fluxes as an important ingredient in the coupling.
A key conclusion of this group's work is that, in such a simple coupled model, it is possible to find a fairly broad parameter regime within which slow modes, with interdecadal near-periodicities exist. Moreover, these modes are truly coupled, in the sense that their atmospheric components share the long near-periodicities. We merely refer here to [57] for a summary of these results and to [168] for an observational verification.

Concluding remarks
Over the last four decades, the climate sciences have become of paramount importance in understanding, predicting and attempting to improve the relations between humanity and the planet it inhabits [116,73,74,75,76]. This development could not have occurred without a strong contribution from the mathematical sciences.
Several books [37,38,60,99,104,164,107] have documented the mathematics that have been used and that have in turn benefited from this area of applications.
To conclude this contribution, it seems appropriate to formulate a few ideas for possibly fruitful future applications of NDS and RDS theory to the climate sciences. It would appear to the authors that -in spite of increasing efforts in this direction, e.g. [109, and further articles in the same volume] -insufficient attention has been paid to understanding the strong interaction between natural or intrinsic climate variability [117,18], on the one hand, and the anthropogenic forcing upon which a huge fraction of the recent climate literature has focused [75,76]. We would like, therefore, to suggest an avenue for overcoming this gap, and to summon the necessary mathematics for doing so.

A generalized definition of climate sensitivity
The usual way of defining climate sensitivity, going back to [116], relies on the conceptual picture of Fig. 22a: climate is in a steady state characterized by a global average temperature T = T 0 ; a parameter that is affected by human activities, say CO 2 concentration, is suddenly changed; and T (t) follows -according to the solution of a scalar, linear ODE like Eq. (105), in which T replaces x and a Heaviside function replaces σt on the right-hand side -so that T (t) → T 1 , and T 1 > T 0 when the CO 2 jump is positive.
This cartoon has been, of course, considerably enriched in successive assessment reports of the the Intergovernmental Panel on Climate Change [73,74,75, 76] from a scalar, linear ODE to the coupled, nonlinear PDE systems that govern GCMs. But no mathematically consistent picture has emerged for reconciling the intrinsic variability that those PDE systems include, as illustrated in Sections 11 and 13 herein, with the complex nature of the forcing, both natural and anthropogenic. Thus climate sensitivity is still essentially thought of as ∆ T /∆ CO 2 , where ∆ T = T 1 − T 0 , the difference between a final and an initial "equilibrium" temperature, and ∆ CO 2 is likewise the difference between a final and an initial forcing level. Consideration has been given to changes in other fields than temperature, such as precipitation [97,133], to variances and energetics [102,100], or to regional differences [14,101].
To include these considerations and follow up on ideas formulated in [56], we would like to suggest a definition of climate sensitivity γ 3 that takes full advantage of the NDS and RDS framework and that encompasses changes in the natural varibility, in its most general meaning, as the forcing level or some other parameter changes. This definition replaces the more-or-less standard definition of γ = ∂ T /∂ µ [60, p. 320], with µ an arbitrary parameter generalizing CO 2 -a definition that corresponds to the equilibrium situation in Fig. 22a -by a more appropriate and self-consistent one that corresponds to the chaotic and random situation in Fig. 22c.

PBAs, invariant measures and Wasserstein distance
The mathematical framework of PBAs, in the presence of time-dependent forcing or coefficients, suggests the definition where d W is the Wasserstein distance [41,56] between the time-dependent invariant measures supported on the system's PBA at two distinct parameter values. An idea of the large differences that may exist between two such PBAs is given by the three snapshots of the invariant measure on the PBA of the infinite-dimensional, but still relatively simple ENSO model of [53] in Fig. 23 below. The model's two dependent variables are sea surface temperature T in the eastern Tropical Pacific and thermocline depth h there, as a function of time t: T = f (T (t), h(t)), h(t) = g(T, h, F)(t, τ 1 , τ 2 ), F(t) = 1 + εcos(ωt + φ ). (114) In Eqs. (114), F stands for the seasonal forcing, with period 2π/ω = 12 months, and all three variables -T, h and F -depend on the time t and the delays τ 1 and τ 2 ; these delays characterize the traveling times along the Equator of eastward Kelvin and westward Rossby waves. Several authors have studied delay-differential models of ENSO; see [37] for a review and [67, 23, and references therein] for further mathematical details on such models.
The solutions of Eqs. (114) exhibit periodic, quasi-periodic and chaotic behavior, as well as frequency locking to the time-dependent, seasonal cycle. Thus, in principle, an infinite number of scalars are required to define the dependence of these solutions on the parameters τ 1 and τ 2 ; these scalars need to include not just the means of temperature T and depth h, but also their variance and higher-order moments. In [56, Fig. 7] this dependence was shown for the zeroth, second and fourth moments of h(t); more precisely the plotted quantities were the mean, standard deviation and fourth root of the kurtosis of the PDF.  While the figure illustrates successive snapshots of the invariant measure supported on the PBA for the same value of the model parameters, there are similarly large differences between such measures for different parameter values [56,67]. Closer attention to the definition of Eq. (113) can provide better insights into the changes in time, as well as with respect to a pararameter, of higher moments of a model's PDF but also into the delicate and important matter of changes in the distribution of extreme events [22,66].
Steps in the application of the NDS and RDS framework to climate change and climate sensitivity are being taken by a number of research groups [10,11,38,43,101]; see especially [62] and references therein. This area of research is only opening up and should provide many opportunities for contributions by mathematicians and climate scientists, as well as for fruitful interactions among them.
If ϕ is measurable, it is called a measurable RDS over θ . If, in addition, X is a topological space (respectively a Banach space), and ϕ satisfies (t, ω) → ϕ(t, ω)x continuous (resp. C k , 1 ≤ k ≤ ∞) for all (t, ω) ∈ T × Ω , then ϕ is called a continuous (resp. C k ) RDS over the flow θ . If so, then (ω, x) → Θ (t)(x, ω) := (θ (t)ω, ϕ(t, ω)x), is a (measurable) flow on Ω × X, and is called the skew-product of θ and ϕ. In the sequel, we shall use the terms "RDS" or "cocycle" synonymously. The choice of the so-called driving system θ is a crucial step in this set-up; it is mostly dictated by the fact that the coupling between the stationary driving and the deterministic dynamics should respect the time invariance of the former, as illustrated in Fig. 24. The driving system θ also plays an important role in establishing stochastic conjugacy [29] and hence the kind of classification we aim at.
The concept of random attractor is a natural and straightforward extension of the definition of pullback attractor (104), in which Sell's [137] process is replaced by a cocycle, cf. Fig. 24, and the attractor A now depends on the realization ω of the noise, so that we have a family of random attractors A (ω), cf. Fig. 25. Roughly speaking, for a fixed realization of the noise, one "rewinds" the noise back to t → −∞ and lets the experiment evolve (forward in time) towards a possibly attracting set A (ω); the driving system θ enables one to do this rewinding without changing the statistics, cf. Figs. 24 and 25.
Other notions of attractor can be defined in the stochastic context, in particular based on the original SDE; see [34,33] for a discussion on this topic. The present definition, though, will serve us well.
Stochastic equivalence extends classic topological conjugacy to the bundle space X × Ω , stating that there exists a one-to-one, stochastic change of variables that continuously transforms the phase portrait of one sample system in X into that of any other such system.
In this appendix, we clarify the notion of LFV from a mathematical perspective. Let us reconsider the deterministic Lorenz system [98]. It is known that the power spectral density, or power spectrum, of this system is exponentially decaying [47,61]. At the same time, one can check numerically that the decay of the autocorrelation function is exponentially decaying, too. Other types of power-spectrum behavior may be encountered for chaotic dynamical systems, though. Aside from pure power-law decay, it may also happen that the power spectrum contains one or several broad peaks that stand out above the continuous background, whether the latter has a power-law [9] or exponential decay. If the central frequencies of these peaks are located in a frequency band that lies close to the lower end of the frequency range being studied, the system is said to exhibit LFV [58,61].
This climatically motivated, but vague notion of LFV can be formalized mathematically through the mixing concept introduced in Appendix B. Indeed, for a general flow {φ t } on a topological space X, which possesses an invariant (Borel) probability measure µ, let us define the correlation function by using the same notations as above. If the system (φ t , µ) is mixing, the rate of approach to zero of C t (F, G) is called the rate of decay of correlations for its observables F and G. A system exhibits a slow decay rate of correlations at short lags if the rate is slower than exponential over some characteristic time interval [0, T ].