CHAPTER 5
Data Assimilation by Models
ICHIRO FUKUMORI
Jet Propulsion Laboratory
California Institute of Technology
Pasadena CA 91109
1. INTRODUCTION
Data assimilation is a procedure that combines observa-
tions with models. The combination aims to better estimate
and describe the state of a dynamic system, the ocean in
the context of this book. The present article provides an
overview of data assimilation with an emphasis on applica-
tions to analyzing satellite altimeter data. Various issues are
discussed and examples are described, but presentation of
results from the non-altimetric literature will be limited for
reasons of space and scope of this book.
The problem of data assimilation belongs to the wider
field of estimation and control theories. Estimates of the dy-
namic system are improved by correcting model errors with
the observations on the one hand and synthesizing observa-
tions by the models on the other. Much of the original math-
ematical theory of data assimilation was developed in the
context of ballistics applications. In earth science, data as-
similation was first applied in numerical weather forecast-
ing.
Data assimilation is an emerging area in oceanography,
stimulated by recent improvements in computational and
modeling capabilities and the increase in the amount of
available oceanographic observations. The continuing increase
in computational capabilities has made numerical
ocean modeling commonplace. A number of new ocean
general circulation models have been constructed with dif-
ferent grid structures and numerical algorithms, and incorpo-
rating various innovations in modeling ocean physics (e.g.,
Gent and McWilliams, 1990; Holloway, 1992; Large et al.,
1994). The fidelity of ocean modeling has advanced to a
stage where models are utilized beyond idealized process
studies and are now employed to simulate and study the
actual circulation of the ocean. For instance, model results
are operationally produced to analyze the state of the ocean
(e.g., Leetmaa and Ji, 1989), and modeling the global ocean
circulation at eddy resolution is nearing a reality (e.g., Fu
and Smith, 1996).
Recent oceanographic experiments, such as the World
Ocean Circulation Experiment (WOCE) and the Tropical
Ocean and Global Atmosphere Program (TOGA), have generated
unprecedented amounts of in situ observations. Moreover,
satellite observations, in particular satellite altimetry
such as TOPEX/POSEIDON, have provided continuous syn-
optic measurements of the dynamic state of the global ocean.
Such extensive observations, for the first time, provide a suf-
ficient basis to describe the coherent state of the ocean and
to stringently test and further improve ocean models.
However, although comprehensive, the available in situ
measurements and those in the foreseeable future are and
will remain sparse in space and time compared with the
energy-containing scales of ocean circulation. An effective
means of synthesizing such observations then becomes es-
sential in utilizing the maximum information content of such
observing systems. Although global in coverage, the na-
ture of satellite altimetry also requires innovative approaches
to effectively analyze its measurements. For instance, even
though sea level is a dynamic variable that reflects circula-
tion at depth, the vertical dependency of the circulation is not
immediately obvious from sea-level measurements alone.
The nadir-pointing property of altimeters also limits sam-
pling in the direction across satellite ground tracks, making
analyses of meso-scale features problematic, especially with
a single satellite. Furthermore, the complex space-time sam-
pling pattern of satellites caused by orbital dynamics makes
analyses of even large horizontal scales nontrivial, especially
for analyzing high-frequency variability such as tides and
wind-forced barotropic motions.
Satellite Altimetry and Earth Sciences
Copyright © 2001 by Academic Press
All rights of reproduction in any form reserved
Data assimilation provides a systematic means to untan-
gle such degeneracy and complexity, and to compensate for
the incompleteness and inaccuracies of individual observing
systems in describing the state of the ocean as a whole. The
process is effected by the models' theoretical relationship
among variables. Data information is interpolated and ex-
trapolated by model equations in space, time, and into other
variables including those that are not directly measured. In
the process, the information is further combined with other
data, which further improves the description of the oceanic
state. In essence, assimilation is a dynamic extrapolation as
well as a synthesis and averaging process.
In terms of volume, data generated by a satellite altimeter
far exceed those of any other observing system. Partly for this
reason, satellite altimetry is currently the most common data
type explored in studies of ocean data assimilation. (Other
reasons include, for example, the near real-time data avail-
ability and the nontrivial nature of altimetric measurements
in relation to ocean circulation described above.) This chap-
ter introduces the subject matter by describing the issues,
particularly those that are often overlooked or ignored. By
so doing, the discussion aims to provide the reader with a
perspective on the present status of altimetric assimilation
and on what it promises to accomplish.
An emphasis is placed on describing what exactly data
assimilation solves. In particular, assimilation improves the
oceanic state consistent with both models and observations.
This also means, for instance, that data assimilation does not
and cannot correct every model error, and the results are
not altogether more accurate than what the raw data mea-
sure. This is because, from a pragmatic standpoint, mod-
els are always incomplete owing to unresolved scales and
physics, which in effect are inconsistent with models. Overfitting
models to data beyond the model's capability can lead
to inaccurate estimates. These issues will be clarified in the
subsequent discussion.
We begin in Section 2 by reviewing some examples of
data assimilation, which illustrate its merits and motivations.
Reflecting the infancy of the subject, many published studies
are of relatively simple demonstration exercises. However,
the examples describe the diversity and potential of data as-
similation's applications.
The underlying mathematical problem of assimilation is
identified and described in Section 3. Many of the issues,
such as how best to perform assimilation, what it achieves,
and how it differs from improving numerical models and/or
data analyses per se, are best understood by first recognizing
the fundamental problem of combining data and models.
Many of the early studies on ocean data assimilation cen-
ter on methodologies, whose complexities and theoretical
nature have often muddied the topic. A series of different
assimilation methods are heuristically reviewed in Section 4
with references to specific applications. Mathematical de-
tails are minimized for brevity and the emphasis is placed in-
stead on describing the nature of the approaches. In essence,
most methods are equivalent to each other so long as the as-
sumptions are the same. A summary and recommendation of
methods is also presented at the end of Section 4.
Practical Issues of Assimilation are discussed in Sec-
tion 5. Identification of what the model-data combination
resolves is clarified, in particular, how assimilation differs
from model improvement per se. Other topics include prior
error specifications, observability, and treatment of the time-
mean sea level. We end this chapter in Section 6 with con-
cluding remarks and a discussion on future directions and
prospects of altimetric data assimilation.
The present pace of advancement in assimilation is rapid.
For other reviews of recent studies in ocean data assimilation,
the reader is referred to articles by Ghil and Malanotte-Rizzoli
(1991), Anderson et al. (1996), and Robinson et al. (1998).
The books by Anderson and Willebrand (1989)
and Malanotte-Rizzoli (1996) contain a range of articles
from theories and applications to reviews of specific problems.
A number of assimilation studies have also been collected
in special issues of Dynamics of Atmospheres and
Oceans (1989, vol 13, No 3-4), Journal of Marine Systems
(1995, vol 6, No 1-2), Journal of the Meteorological Society
of Japan (1997, vol 75, No 1B), and Journal of Atmospheric
and Oceanic Technology (1997, vol 14, No 6). Several papers
focusing on altimetric assimilation are also collected in
a special issue of Oceanologica Acta (1992, vol 5).
2. EXAMPLES AND MERITS OF DATA ASSIMILATION
This section reviews some of the applications of data as-
similation with an emphasis on analyzing satellite altimetry
observations. The examples here are limited for reasons of
space, but are chosen to illustrate the diversity
of applications to date and to point to further possibilities in
the future.
One of the central merits of data assimilation is its ex-
traction of oceanographic signals from incomplete and noisy
observations. Most oceanographic measurements, including
altimetry, are characterized by their sparseness in space and
time compared to the inherent scales of ocean variability;
this translates into noisy and gappy measurements. Figure 1
(see color insert) illustrates an example of the noise-removal
aspect of altimetric assimilation. Sea-level anomalies mea-
sured by TOPEX (left) and its model equivalent estimates
(center and right) are compared as a function of space and
time (Fukumori, 1995). The altimetric measurements (left
panel) are characterized by noisy estimates caused by mea-
surement errors and gaps in the sampling, whereas the as-
similated estimate (center) is more complete, interpolating
over the data dropouts and removing the short-scale temporal
and spatial variabilities measured by the altimeter. In the
process, the assimilation corrects inaccuracies in model simulation
(right panel), elucidating the stronger seasonal cycle
and westward propagating signals of sea-level variability.
FIGURE 2 A time sequence of sea-level anomaly maps based on Geosat data; (Left) model assimilation, (Right)
statistical interpolation of the altimetric data. Contour interval is 2 cm. Shaded (unshaded) regions indicate negative
(positive) values. The model is a 7-layer quasi-geostrophic (QG) model of the California Current, into which the altimetric
data are assimilated by nudging. (Adapted from White et al. (1990a), Fig. 13, p. 3142.)
The issue of dynamically interpolating sea level informa-
tion is particularly critical in studying meso-scale dynam-
ics, as satellites cannot adequately measure eddies because
the satellite's ground-track spacing is typically wider than
the size of the eddy features. Figure 2 compares a time se-
quence of dynamically (i.e., assimilation; left column) and
statistically (right column) interpolated synoptic maps of sea
level by White et al. (1990a). The statistical interpolation is
based solely on spatial distances between the analysis point
and the data point (e.g., Bretherton et al., 1976), whereas
the dynamical interpolation is based on assimilation with
an ocean model. While the statistically interpolated maps
tend to have maxima and minima associated with meso-scale
eddies along the satellite ground-tracks, the assimilated es-
timates do not, allowing the eddies to propagate without
significant distortion of amplitude, even between satellite
ground tracks. An altimeter's resolving power of meso-scale
variability can also significantly improve variabilities simu-
lated by models. For instance, Figure 3 shows distribution
of sea-surface height variability by Oschlies and Willebrand
(1996), comparing measurements of Geosat (middle) and an
eddy-resolving primitive equation model. The bottom and
top panels show model results with and without assimilation,
respectively. The altimetric assimilation corrects the spatial
distribution of variability, especially north of 30°N, reducing
the model's variability in the Irminger Sea but enhancing it
in the North Atlantic Current and the Azores Current.
The virtue of data assimilation in dynamically interpo-
lating and extrapolating data information extends beyond
the variables that are observed to properties not directly
measured. Such an estimate is possible owing to the dy-
namic relationship among different model properties. For in-
stance, Figure 4 shows estimates of subsurface temperature
(left) and velocity (right) anomalies of an altimetric assimilation
(gray curve) compared against independent (i.e., nonassimilated)
in situ measurements (solid curve) (Fukumori et al.,
1999). In spite of the assimilated data being limited
to sea-level measurements, the assimilated estimate (gray) is
found to resolve the amplitude and timing of many of the
subsurface temperature and velocity "events" better than the
model simulation (dashed curve). The skill of the model results
is also consistent with formal uncertainty estimates
(dashed and solid gray bars) that reflect inaccuracies in data
and model. Such error estimates are by-products of assimi-
lation that, in effect, quantify what has been resolved by the
model (see Section 5.3 for further discussion).
Although uncertainties in our present knowledge of the
marine geoid (cf. Chapter 10) limit the direct use of altimetric
sea-level measurements to mostly that of temporal
variabilities, the nonlinear nature of ocean circulation allows
estimates of the mean circulation to be made from measurements
of variabilities alone. Figure 5 compares such an estimate
by Greiner and Perigaud (1996) of the time-mean depth
of the thermocline in the Indian Ocean, based solely on assimilation
of temporal variabilities of sea level measured by
Geosat. The thermocline depth of the altimetric assimilation
(chain-dash) is found to be significantly deeper between
10°S and 18°S than without assimilation (dash) and is in
closer agreement with in situ observations based on XBT
measurements (solid).
FIGURE 3 Root-mean-square variability of sea surface height; (a) model without
assimilation, (b) Geosat data, (c) model with assimilation. Contour interval is 5 cm.
The model is based on the Community Modeling Effort (CME; Bryan and Holland,
1989). Assimilation is based on optimal interpolation. (Adapted from Oschlies and
Willebrand (1996), Fig. 7, p. 14184.)
FIGURE 4 Comparison of model estimates and in situ data; (A) temperature anomaly at 200 m, 8°, 180°;
(B) zonal velocity anomaly at 120 m, 0°, 110°. The different curves are data (black), model simulation (gray dashed),
and model estimate by TOPEX/POSEIDON assimilation (gray solid). Bars denote formal uncertainty estimates of the
model. The model is based on the GFDL Modular Ocean Model, and the assimilation scheme is an approximate Kalman
filter and smoother. This model and assimilation are further discussed in Sections 5.1.2, 5.1.4, and 5.2. (Adapted from
Fukumori et al. (1999), Plates 4 and 5.)
[Figure 5 plot: thermocline depth (100 to 240 m) versus latitude (-20 to -5); curves labeled D20, NOASS, and ASS2.]
FIGURE 5 Time-mean thermocline depth (in m) along 95°E; 20°C isotherm depth (plain), model
simulation (dashed), and model with assimilating Geosat data (chain-dashed). The model is a nonlinear
1.5-layer reduced gravity model of the Indian Ocean. Geosat data are assimilated over 1 year
(November 1986 to October 1987) employing the adjoint method. The 20°C isotherm is deduced from
an XBT analysis (Smith, 1995). (Adapted from Greiner and Perigaud (1996), Fig. 10, p. 1744.)
Data assimilation's ability to estimate unmeasured properties
provides a powerful tool and framework to analyze
data and to combine information systematically from multiple
observing systems simultaneously, making better estimates
that are otherwise difficult to obtain from measurements
alone. Stammer et al. (1997) have begun the process
of synthesizing a wide suite of observations with a gen-
eral circulation model, so as to improve estimates of the
complete state of the global ocean. Figure 6 illustrates improvements
made in the time-mean meridional heat transport
estimate from assimilating altimetric measurements from
TOPEX/POSEIDON, along with a geoid estimate and a hydrographic
climatology. For instance, in the North Atlantic,
the observations require a larger northward heat transport
(solid curve) than that of an unconstrained model (dashed curve),
and the constrained estimate is in better agreement with
independent estimates (circles). Differences in heat flux with
and without assimilation are equally significant in other basins.
FIGURE 6 Mean meridional heat transport (in 10^15 W) estimate of
a constrained (solid) and unconstrained (dashed lines) model of the Atlantic,
the Pacific, and the Indian Oceans, respectively. The model (Marshall
et al., 1997) is constrained using the adjoint method by assimilating
TOPEX/POSEIDON data in addition to a hydrographic climatology and a
geoid model. Bars on the solid lines show root-mean-square variability over
individual 10-day periods. Open circles and bars show similar estimates and
their uncertainties of Macdonald and Wunsch (1996). (Adapted from Stammer
et al. (1997), Fig. 13, p. 28.)
One of the legacies of TOPEX/POSEIDON is its im-
provement in our understanding of ocean tides. Refer to
Chapter 6 for a comprehensive discussion on tidal research
using satellite altimetry. In the context of this chapter, a sig-
nificant development in the last few years is the emergence
of altimetric assimilation as an integral part of developing
accurate tidal models. The two models chosen for reprocessing
TOPEX/POSEIDON data are both based on combining
observations and models (Shum et al., 1997). In particular,
Le Provost et al. (1998) give an example of the benefit
of assimilation, in which the data-assimilated tidal solution
(FES95.2) is shown to be more accurate than the pure hydrodynamic
model (FES94.1) or the empirical tidal estimate
(CSR2.0) used in the assimilation. That is, assimilated esti-
mates are more accurate than analyses based either on data
or model alone.
FIGURE 7 Hindcasts of Niño3 index of sea surface temperature (SST)
anomaly with (a) and without (b) assimilation. The gray and solid curves are
observed and modeled SSTs, respectively. The model is a simple coupled
ocean-atmosphere model, and the assimilation is of altimetry, winds, and
sea surface temperatures, conducted by the adjoint method. (Adapted from
Lee et al. (2000), Fig. 10.)
Data assimilation also provides a means to improve pre-
diction of a dynamic system's future evolution, by provid-
ing optimal initial conditions and other model parameters
from which forecasts are issued. In fact, such applications
of data assimilation are the central focus in ballistics ap-
plications and in numerical weather forecasting. In recent
years, forecasting has also become an important application
of data assimilation in oceanography. For example, oceanographic
forecasts in the tropical Pacific are routinely produced
by the National Centers for Environmental Prediction
(NCEP) (Behringer et al., 1998; Ji et al., 1998), with particular
applications to forecasting the El Niño-Southern Oscillation
(ENSO). Of late, altimetric observations have also
been utilized in the NCEP system (Ji et al., 2000). Lee
et al. (2000) have explored the impact of assimilating altimetry
data into a simple coupled ocean-atmosphere model
of the tropical Pacific. For example, Figure 7 shows improvements
in their model's skill in predicting the so-called Niño3
sea-surface temperature anomaly as a result of assimilating
TOPEX/POSEIDON altimeter data. The model predictions
(solid curves) are in better agreement with the observed in-
dex (gray curve) in the assimilated estimate (left panel) than
without data constraints (right panel).
Apart from sea level, satellite altimetry also measures
significant wave height (SWH), which is another oceanographic
variable of interest. In particular, the European Centre
for Medium-Range Weather Forecasts (ECMWF) has
been assimilating altimetric wave height (ERS-1) in producing
global operational wave forecasts (Janssen et al., 1997).
Figure 8 shows an example of the impact of assimilating altimetric
SWH in improving predictions made by this wave
model up to 5 days into the future (Lionello et al., 1995).
The figure shows the assimilation (dotted bars) resulting
in a smaller bias (left panel) and higher correlation (i.e.,
smaller scatter) (right panel) with respect to actual wave-height
measurements than those without assimilation (full
bars). Further discussions on wave forecasting can be found
in Chapter 7.
[Figure 8 plot: two bar-chart panels (left, bias; right, scatter index) versus forecast period in hours (A, 24, 48, 72, 96, 120).]
FIGURE 8 Bias and scatter index of significant wave height (SWH) analysis (denoted A on the abscissa)
and various forecasts. Comparisons are between model and altimeter. Full (dotted) bars denote the reference
experiment without (with) assimilating ERS-1 significant wave height data. The scatter index measures
the lack of correlation between model and data. The model is the third generation wave model WAM.
Assimilation is performed by optimal interpolation. (Adapted from Lionello et al. (1995), Fig. 12, p. 105.)
In addition to the state of the ocean, data assimilation
also provides a framework to estimate and improve model
parameters, external forcing, and open boundary conditions.
For instance, Smedstad and O'Brien (1991) estimated the
phase speed in a reduced-gravity model of the tropical Pa-
cific Ocean using sea-level measurements from tide gauges.
Fu et al. (1993) and Stammer et al. (1997) estimated uncertainties
in winds, in addition to the model state, from assimilating
altimetry data. (The latter study also estimated errors
in atmospheric heat fluxes.) Lee and Marotzke (1998) estimated
open boundary conditions of an Indian Ocean model.
Data assimilation in effect fits models to observations.
Then, the extent to which models can or cannot be fit to
data gives a quantitative measure of the model's consistency
with measurements, thus providing a formal means of hypothesis
testing that can also help identify specific deficiencies
of models. For example, Bennett et al. (1998) identified
inconsistencies between moored temperature measurements
and a coupled ocean-atmosphere model of the tropical Pa-
cific Ocean, resulting from the model's lack of momentum
advection. Marotzke and Wunsch (1993) found inconsisten-
cies between a time-invariant general circulation model and
a climatological hydrography, indicating the inherent nonlin-
earity of ocean circulation. Alternatively, excessive model-
data discrepancies found by data assimilation can also point
to inaccuracies in observations. Examples of such analysis
at present can be best found in meteorological applications
(e.g., Hollingsworth, 1989).
Lastly, data assimilation has also been employed in evaluating
merits of different observing systems by analyzing
model results with and without assimilating particular
observations. For instance, Carton et al. (1996) found
that TOPEX/POSEIDON altimeter data have a larger impact in
resolving intra-seasonal variability of the tropical Pacific
Ocean than data from a mooring array or a network of
expendable bathythermographs (XBTs). Verron (1990) and
Verron et al. (1996) conducted a series of numerical experiments
(observing system simulation experiments, OSSEs, or
twin experiments) to evaluate different scenarios of single-
and dual-altimetric satellites. OSSEs and twin experiments
are numerical experiments in which a set of pseudo obser-
vations are extracted from a particular numerical simula-
tion and are assimilated into another (e.g., with different ini-
tial conditions and/or forcing, etc.) to examine the degree
to which the former results can be reconstructed. The rela-
tive skill of the estimate among different observing scenarios
provides a measure of the observation's effectiveness. From
such an analysis, Verron et al. (1996) conclude that a 10- to
20-day repeat period is satisfactory for the spatial sampling
of mid-latitude meso-scale eddies but that any further gain
would come from increased temporal, rather than spatial,
sampling provided by a second satellite that is offset in time.
Twin experiments are also employed in testing and evaluat-
ing different data assimilation methods (Section 4).
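The twin-experiment procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the setup of Verron et al.: the two-variable rotation model, the nudging gain, the initial conditions, and all names (`step`, `gain`, `truth`, `control`, `assim`) are hypothetical. Pseudo observations of the first state variable are extracted from a "truth" run and nudged into a second run started from different initial conditions; a third run with no assimilation serves as the reference.

```python
import numpy as np

def step(x, theta=0.3):
    """Toy model (assumption): rotate the two-variable state by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([c * x[0] - s * x[1], s * x[0] + c * x[1]])

n_steps, gain = 50, 0.5           # hypothetical run length and nudging gain

truth = np.array([1.0, 0.0])      # simulation that supplies pseudo observations
control = np.array([2.0, 1.0])    # different initial condition, no assimilation
assim = control.copy()            # same wrong start, but assimilates the data

for _ in range(n_steps):
    truth, control, assim = step(truth), step(control), step(assim)
    pseudo_obs = truth[0]                        # observe first variable only
    assim[0] += gain * (pseudo_obs - assim[0])   # nudge toward the observation

err_control = np.linalg.norm(control - truth)
err_assim = np.linalg.norm(assim - truth)
print(f"error without assimilation: {err_control:.3f}")
print(f"error with assimilation:    {err_assim:.3e}")
```

Comparing `err_assim` with `err_control` measures how well the truth run is reconstructed; repeating the experiment with a different sampling of `pseudo_obs` would compare observing scenarios in the same spirit.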
3. DATA ASSIMILATION AS AN INVERSE PROBLEM
Recognizing the mathematical problem of data assimila-
tion is essential in understanding what assimilation could
achieve, where the difficulties exist, and where the issues
arise from. For example, there are theoretical and practi-
cal difficulties involved in solving the problem, and various
assumptions and approximations are necessarily made, of-
tentimes implicitly. A clear understanding of the problem is
critical in interpreting the results of assimilation as well as
in identifying sources of inconsistencies.
Mathematically, as will be shown, data assimilation is
simply an inverse problem, such as,
$$\mathcal{A}(\mathbf{x}) \approx \mathbf{y} \tag{1}$$
in which the unknowns, vector $\mathbf{x}$, are estimated by inverting
some functional $\mathcal{A}$ relating the unknowns on the left-hand
side to the knowns, $\mathbf{y}$, on the right-hand side. Equation (1)
is understood to hold only approximately (thus $\approx$ instead of
$=$), as there are uncertainties on both sides of the equation.
Throughout this chapter, bold lowercase letters will denote
column vectors.
The unknowns $\mathbf{x}$, in the context of assimilation, are independent
variables of the model that may include the state of
the model, such as temperature, salinity, and velocity over
the entire model domain, and various model parameters as
well as unknown external forcing and boundary conditions.
The knowns, $\mathbf{y}$, include all observations as well as known
elements of the forcing and boundary conditions. The functional
$\mathcal{A}$ describes the relationships between the knowns
and unknowns, and includes the model equations that dictate
the temporal evolution of the model state. All variables
and functions will be assumed discretized in space and time
as is the case in most practical numerical model implementations.
The data assimilation problem can be identified in the
form of Eq. (1) by explicitly noting the available relationships.
Observations of the ocean at some particular instant
(subscript $i$), $\mathbf{y}_i$, can be related to the state of the model (including
all uncertain model parameters), $\mathbf{x}_i$, by some functional
$\mathcal{H}_i$:
$$\mathcal{H}_i(\mathbf{x}_i) \approx \mathbf{y}_i. \tag{2}$$
(The functional $\mathcal{H}_i$ is also dependent on $i$ because the particular
set of observations may change with time $i$.) In case
of a direct measurement of one of the model unknowns, $\mathcal{H}_i$
is simply a functional that returns the corresponding element
of $\mathbf{x}_i$. For instance, if $y_i$ were a scalar measurement of the $j$th
element of $\mathbf{x}_i$, $\mathcal{H}_i$ would be a row vector with zeroes except
for its $j$th element being one:
$$\mathcal{H}_i = (0, \ldots, 0, 1, 0, \ldots, 0). \tag{3}$$
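As a concrete illustration of Eq. (3), the following minimal Python sketch builds the row vector for a direct measurement of a single state element and applies it to a state vector. The helper name `selection_operator`, the five-element state, and its values are hypothetical and chosen only for illustration; indices are 0-based, as is conventional in Python.

```python
import numpy as np

def selection_operator(n, j):
    """Row vector of length n: zeros except for a one at (0-based) index j."""
    h = np.zeros(n)
    h[j] = 1.0
    return h

# Hypothetical five-element model state x_i (values are illustrative only).
x_i = np.array([10.0, 11.5, 9.8, 12.1, 10.7])

# Observation operator for a direct measurement of the element at index 2.
h_i = selection_operator(n=x_i.size, j=2)

# Model equivalent of the observation: the measured element of x_i.
y_model = h_i @ x_i
print(y_model)
```

For several simultaneous direct measurements, stacking one such row per observation would give a matrix form of the same operator.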
Functional $\mathcal{H}_i$ would be nontrivial for diagnostic quantities
of the model state, such as sea level in a primitive equation
model with a rigid-lid approximation (e.g., Pinardi et al.,
1995). However, even for such situations, a model equivalent
of the observation can be expressed by some functional
$\mathcal{H}_i$ as in Eq. (2), be it explicit or implicit.
In addition to the observation equations (Eq. [2]), the
model algorithm provides a constraint on the temporal evolution
of the model state, which could be brought to bear upon
the problem of determining the unknown model states $\mathbf{x}$:
$$\mathbf{x}_{i+1} \approx \mathcal{F}_i(\mathbf{x}_i). \tag{4}$$
Equation (4) includes the initialization constraint,
$$\mathbf{x}_0 \approx \mathbf{x}_{\text{first guess}}. \tag{5}$$
Function $\mathcal{F}_i$ is, in practice, a discretization of the continuous
equations of the ocean physics and embodies the model
algorithm of integrating the model state in time from one ob-
served instant i to another i + 1. The function generally de-
pends on the state at i as well as any external forcing and/or
boundary condition. (For multi-stage algorithms that involve
multiple time-steps in the integration, such as the leap-frog
or Adams-Bashforth schemes, the state at i could be defined
as concatenated states at corresponding multiple time-steps.)
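The remark on multi-stage schemes can be made concrete with a short Python sketch. It assumes a leap-frog discretization of dx/dt = f(x), namely x(n+1) = x(n-1) + 2 dt f(x(n)), and a hypothetical linear-decay tendency; the names `leapfrog_step` and `tendency` and all numerical values are illustrative only. Concatenating the two time levels into one vector turns the two-level scheme into a one-step map of the form of Eq. (4).

```python
import numpy as np

def leapfrog_step(z, tendency, dt):
    """Advance the concatenated state z = [x_n, x_{n-1}] by one leap-frog step.

    Returns [x_{n+1}, x_n], so the map takes one concatenated state to the next.
    """
    n = z.size // 2
    x_now, x_prev = z[:n], z[n:]
    x_next = x_prev + 2.0 * dt * tendency(x_now)
    return np.concatenate([x_next, x_now])

def tendency(x):
    """Hypothetical model physics: linear decay, dx/dt = -0.1 x."""
    return -0.1 * x

dt = 0.5
x0 = np.array([1.0, 2.0])
x1 = x0 + dt * tendency(x0)        # forward-Euler start-up step
z = np.concatenate([x1, x0])       # concatenated state [x_1, x_0]

z_next = leapfrog_step(z, tendency, dt)   # concatenated state [x_2, x_1]
print(z_next)
```

The price of this reformulation is a doubling of the state dimension, which matters for the problem sizes discussed later in this section.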
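Combining observation Eq. (2) and model evolution
Eq. (4), the assimilation problem as a whole can be written
as,
$$\begin{pmatrix} \vdots \\ \mathcal{H}_i(\mathbf{x}_i) \\ \mathbf{x}_{i+1} - \mathcal{F}_i(\mathbf{x}_i) \\ \vdots \end{pmatrix} \approx \begin{pmatrix} \vdots \\ \mathbf{y}_i \\ \mathbf{0} \\ \vdots \end{pmatrix}. \tag{6}$$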
By solving the data and model equations simultaneously, as-
similation seeks a solution (model state) that is consistent
with both data and model equation.
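To make the simultaneous solution of data and model equations tangible, the sketch below stacks the initialization, observation, and model equations for a scalar linear model into one matrix system and solves it by ordinary least squares. Everything here is an assumption made for illustration: the model x(i+1) = a x(i), the value of `a`, the two observations, the first guess, and the equal weighting of all equations. The toy system is also overdetermined, unlike realistic assimilation problems.

```python
import numpy as np

a = 0.9                       # hypothetical linear model: x_{i+1} = a * x_i
n_steps = 4                   # unknowns are x_0, ..., x_4
obs = {1: 0.95, 3: 0.70}      # hypothetical direct observations y_i of x_i
x_first_guess = 1.2           # hypothetical first guess for x_0

n = n_steps + 1
rows, rhs = [], []

# Initialization constraint: x_0 ~ first guess.
r = np.zeros(n); r[0] = 1.0
rows.append(r); rhs.append(x_first_guess)

# Observation equations: x_i ~ y_i at the observed instants.
for i, y_i in obs.items():
    r = np.zeros(n); r[i] = 1.0
    rows.append(r); rhs.append(y_i)

# Model equations: x_{i+1} - a * x_i ~ 0.
for i in range(n_steps):
    r = np.zeros(n); r[i + 1] = 1.0; r[i] = -a
    rows.append(r); rhs.append(0.0)

A = np.vstack(rows)           # stacked operator (7 equations, 5 unknowns)
y = np.array(rhs)             # stacked right-hand side

# Least-squares estimate of all states at once (equal weights assumed).
x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
print(x_hat)
```

Because all time levels are solved together, each observation influences the estimated states both before and after the observed instant, which is the behavior described in the text.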
Eq. (6) defines the assimilation problem and can be recognized
as a problem of the form Eq. (1), where the states
in Eq. (6) at different time steps, $(\cdots\ \mathbf{x}_i^T\ \ \mathbf{x}_{i+1}^T\ \cdots)^T$, define
the unknown $\mathbf{x}$ on the left-hand side of Eq. (1). Typically, the
number of unknowns far exceeds the number of independent
equations and the problem is ill-posed. Thus, data assimilation
is mathematically equivalent to other inverse problems
such as the classic box model geostrophic inversion
(Wunsch, 1977) and the beta spiral (Stommel and Schott,
1977). However, what distinguishes assimilation problems
from other oceanic inverse problems is the temporal evolu-
tion and the sophistication of the models involved. Instead
of simple constraints such as geostrophy and mass conserva-
tion, data assimilation employs more general physical prin-
ciples applied at much higher resolution and spatial extent.
The intervariable relationship provided by the model equa-
tions solved together with the observation equations allows
data information to affect the model solution in space and
time, both with respect to times that formally lie in the future
and past of the observed instance, as well as among different
properties.
From a practical standpoint, the distinguishing property
of data assimilation is its enormous dimensionality. Typical
ocean models contain on the order of several million inde-
pendent variables at any particular instant. For example, a
global model with 1° horizontal resolution and 20 vertical
levels is a fairly coarse model by present standards, yet it
would have 1.3 million grid points (360 x 180 x 20) over
the globe. With four independent variables per grid node
(the two components of horizontal velocity, temperature,
and salinity), such as in a primitive equation model with
the rigid-lid approximation, the number of unknowns would
equal 5 million globally or approximately 3 million when
counting points only within the ocean.
The amount of data is also large for an altimeter. For
TOPEX/POSEIDON, the Geophysical Data Record provides
a datum every second, which over its 10-day repeat cycle
amounts to approximately 500,000 points over the ocean,
an order of magnitude larger than the number of
horizontal grid points of the 1° model considered above. In
light of the redundancy the data would provide for such a
coarse model, the altimeter could be thought of as providing
sea level measurements at the rate of one measurement at
every grid point per repeat cycle. Then, assuming for
simplicity that all observations within a repeat cycle are
coincident in time, each set of observation equations of the
form of Eq. (2) would comprise approximately 50,000 equations,
and there would be 180 such sets (time-levels or different i's)
over the course of a 5-year mission, amounting to 9 million
individual observation equations. The model equations in
Eq. (6) would have to span at least as many time-levels,
amounting to 540 million (180 × 3 million) individual model
equations.
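The counts quoted above can be checked with a few lines of arithmetic. The following is only a sketch in Python (the choice of language and all variable names are assumptions of this example). The figures of 3 million ocean unknowns and 50,000 observations per cycle are the rounded values from the text and are entered directly, not derived.

```python
# Back-of-the-envelope problem size for the 1-degree example in the text.
nx, ny, nz = 360, 180, 20            # longitude, latitude, vertical levels
grid_points = nx * ny * nz           # about 1.3 million
vars_per_node = 4                    # two velocity components, temperature, salinity
unknowns_global = grid_points * vars_per_node   # about 5 million

# Rounded figures quoted in the text (entered, not computed).
ocean_unknowns = 3_000_000           # unknowns at ocean points only
obs_per_cycle = 50_000               # one sea level value per grid point per cycle
repeat_cycles = 180                  # 10-day cycles in roughly 5 years

observation_equations = repeat_cycles * obs_per_cycle
model_equations = repeat_cycles * ocean_unknowns

print(grid_points, unknowns_global, observation_equations, model_equations)
```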
The size of such a problem precludes any direct approach
in solving Eq. (6), such as deriving the inverse of the opera-
tor on the left-hand side even if it existed. In practice, there is
generally no solution that exactly satisfies Eq. (6), because of
inaccuracies of models and uncertainties in observations.
Instead, an approximate solution is sought that satisfies the
equations as "closely" as possible in some suitably defined manner.
Several ingenious inverse methods are known and/or have
been developed, and are briefly reviewed in the section be-
low.
4. ASSIMILATION METHODOLOGIES
Because of the problem's large computational task, de-
vising methods of assimilation has been one of the central
issues in data assimilation. Many assimilation methods have
been put forth and explored, and they are heuristically
reviewed in this section. The aim of this discussion is to
elucidate the nature of the different methods and thereby
familiarize the reader with how the problems are approached.
Rigorous descriptions of the methods are deferred to the
references herein.
Assimilation problems are in practice ill-posed, in the
sense that no unique solution satisfies the problem Eq. (6).
Consequently, many assimilation methodologies are based
on "classic" inverse methods. Therefore, for reference, we
will begin the discussion with a simple review of the na-
ture of inverse methods. Different assimilation methodolo-
gies are then individually described, preceded by a brief
overview so as to place the approaches into a broad per-
spective. A Summary and Recommendation is given in Sec-
tion 4.11.
4.1. Inverse Methods
Comprehensive mathematical expositions of oceano-
graphic inverse problems and inverse methods can be found,
for example, in the textbooks of Bennett (1992) and Wunsch
(1996). Here we will briefly review their nature for refer-
ence.
Inverse methods are mathematical techniques that solve
ill-posed problems that do not have solutions in the strict
mathematical sense. The methods seek solutions that
approximately satisfy constraints, such as Eq. (6), under
suitable "optimality" criteria. These criteria include various
least-squares, maximum likelihood, and minimum error
variance (Bayesian) estimates. Differences among the criteria
lie in what is explicitly assumed.
Least-squares methods seek solutions that minimize the
weighted sum of differences between the left- and right-hand
sides of an inverse problem (Eq. [1]):
$$\mathcal{J} = (y - \mathcal{A}(x))^T W^{-1} (y - \mathcal{A}(x)) \qquad (7)$$
where W is a matrix defining weights.
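As a minimal illustration of Eq. (7), consider a single scalar unknown x that is measured directly several times, so that A(x) simply repeats x and W is diagonal. The sketch below (Python, with hypothetical data values and function names chosen for this example only) evaluates the cost and its minimizer, which in this special case is the weighted mean of the data.

```python
def cost(x, y, w):
    """Eq. (7) for a scalar x observed directly: A(x) = (x, x, ...), W = diag(w)."""
    return sum((yk - x) ** 2 / wk for yk, wk in zip(y, w))

def weighted_least_squares(y, w):
    """Minimizer of cost(): the data averaged with weights 1/w."""
    num = sum(yk / wk for yk, wk in zip(y, w))
    den = sum(1.0 / wk for wk in w)
    return num / den

y = [1.0, 3.0]   # two hypothetical measurements of the same quantity
w = [1.0, 4.0]   # diagonal of W; the second measurement is trusted less
x_hat = weighted_least_squares(y, w)
print(x_hat, cost(x_hat, y, w))
```

The estimate lies closer to the first measurement because its entry in W is smaller, which is the sense in which W "defines weights."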
Least-squares methods do not have explicit statistical
or probabilistic assumptions. In comparison, the maximum
likelihood estimate seeks a solution that maximizes the a
posteriori probability of the right-hand side of Eq. (6) by
invoking particular probability distribution functions for y.
The minimum variance estimate solves for solutions x with
minimum a posteriori error variance by assuming the error
covariance of the solution's prior expectation as well as that
of the right-hand side.
Although seemingly different, the methods lead to iden-
tical results so long as the assumptions are the same (see for
example the Introduction to Chapter 4 of Gelb [1974] and
Section 3.6 of Wunsch [1996]). In particular, a lack of an explicit
assumption can be recognized as being equivalent to a par-
ticular implicit assumption. For instance, a maximum likeli-
hood estimate with no prior assumptions about the solution
is equivalent to assuming an infinite prior error covariance
for a minimum variance estimate. For such an estimate, any
solution is acceptable as long as it maximizes the a posteriori
probability of the right-hand side (Eq. [6]).
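The role of an implicit assumption can be illustrated with a scalar sketch (Python; the function name and numbers are hypothetical). A minimum variance estimate combines a prior guess x0, with error variance p, and one direct measurement y, with error variance r. As p grows without bound the prior stops influencing the result and the estimate tends to the datum itself, which is what an estimate with no prior assumption would return in this simple case.

```python
def minimum_variance(x0, p, y, r):
    """Scalar minimum variance estimate from a prior (x0, variance p) and a datum (y, variance r)."""
    gain = p / (p + r)
    return x0 + gain * (y - x0)

x0, y, r = 0.0, 2.0, 1.0
for p in (1.0, 1e3, 1e9):
    # The estimate moves from the midpoint toward y as the prior variance grows.
    print(p, minimum_variance(x0, p, y, r))
```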
Based on the equivalence among "optimal methods,"
Eq. (7) can be regarded as a practical definition of what
various inverse methods solve (and therefore assimilation).
Furthermore, the equivalence provides a statistical basis for
prescribing weights used in Eq. (7). In particular, W can be
identified as the error covariance among individual equations
of the inverse problem Eq. (6).
When the weights of each separate relation are uncorre-
lated in time, Eq. (7) may be expanded as,
$$\mathcal{J} = \sum_{i=0}^{M} (y_i - \mathcal{H}_i(x_i))^T R_i^{-1} (y_i - \mathcal{H}_i(x_i)) + \sum_{i=0}^{M} (x_{i+1} - \mathcal{F}_i(x_i))^T Q_i^{-1} (x_{i+1} - \mathcal{F}_i(x_i)) \qquad (8)$$
where R and Q denote weighting matrices of data and model
equations, respectively, and M is the total number of obser-
vations of form Eq. (2). Most assimilation problems are for-
mulated as in Eq. (8), i.e., uncertainties are implicitly as-
sumed to be uncorrelated in time.
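The structure of Eq. (8) can be made concrete with a scalar sketch (Python). The model F, the observation operator H, and the numbers below are hypothetical, and constant scalar weights R and Q are assumed for brevity.

```python
def cost_eq8(x, y, H, F, R, Q):
    """Evaluate the two sums of Eq. (8) for scalar states.

    x holds one more element than y so that x[i + 1] exists for every i.
    """
    data_term = sum((y[i] - H(x[i])) ** 2 / R for i in range(len(y)))
    model_term = sum((x[i + 1] - F(x[i])) ** 2 / Q for i in range(len(y)))
    return data_term + model_term

H = lambda s: s          # the state is observed directly
F = lambda s: 0.5 * s    # hypothetical linear decay model
y = [4.0, 2.0, 1.0]

x_exact = [4.0, 2.0, 1.0, 0.5]   # satisfies both data and model exactly
x_off = [4.0, 2.0, 2.0, 0.5]     # violates one datum and two model steps

print(cost_eq8(x_exact, y, H, F, R=1.0, Q=0.25))
print(cost_eq8(x_off, y, H, F, R=1.0, Q=0.25))
```

A trajectory that matches the data and the model exactly has zero cost; any departure is penalized by the data term, the model term, or both, in proportion to the inverse weights.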
The statistical basis of optimal inverse methods allows
explicit a posteriori uncertainty estimates to be derived. Such
estimates quantify what has been resolved and are an integral
part of an inverse solution. The errors identify what is
accurately determined and what remains indeterminate, and
thereby provide a basis for interpreting the solution and a
means to ascertain necessary improvements in models and
observing systems.
4.2. Overview of Assimilation Methods
Many of the so-called "advanced" assimilation methods
originate in estimation and control theories (e.g., Bryson and
Ho, 1975; Gelb, 1974), which in turn are based on "clas-
sic" inverse methods. These include the adjoint, represen-
ter, Kalman filter and related smoothers, and Green's func-
tion methods. These techniques are characterized by their
explicit assumptions under which the inverse problem of
Eq. (6) is consistently solved. The assumptions include, for
example, the weights W used in the problem identification
(Eq. [7]) and specific criteria in choosing particular "opti-
mal" solutions, such as least-squares, minimum error vari-
ance, and maximum likelihood. As with "classic" inverse
methods, these assimilation schemes are equivalent to each
other and result in the same solution as long as the assump-
tions are the same. Using specific weights allows for explic-
itly accounting for uncertainties in models and data, as well
as evaluation of a posteriori errors. However, because of sig-
nificant algorithmic and computational requirements in im-
plementing these optimal methods, many studies have ex-
plored developing and testing alternate, simpler approaches
of combining model and data.
The simpler approaches include optimal interpolation,
"3D-var," "direct insertion," "feature models," and "nudg-
ing." Many of these approaches originate in atmospheric
weather forecasting and are largely motivated by the need to make
practical forecasts by sequentially modifying model fields
with observations. The methods are characterized by various
ad hoc assumptions (e.g., vertical extrapolation of altimeter
data) to effect the simplification, but the results are at times
obscured by the nature of the choices made without a clear
understanding of the dynamical and statistical implications.
Although the methods aim to adjust model fields towards ob-
servations, it is not entirely clear how the solution relates to
the problem identified by Eq. (6). Many of the simpler ap-
proaches do not account for uncertainties, potentially allow-
ing the models to be forced towards noise, and data that are
formally in the future are generally not used in the estimate
except locally to yield a temporally smooth result. However,
in spite of these shortcomings, these methods are still widely
employed because of their simplicity, and, therefore, warrant
examination.
4.3. Adjoint Method
Iterative gradient descent methods provide an effec-
tive means of solving minimization problems of form
Eq. (7), and a particularly powerful method of obtaining
such gradients is the so-called adjoint method. The adjoint
method transforms the unconstrained minimization problem
of Eq. (7) into a constrained one, which allows the gradient
of the "cost function" (Eq. [7]), $\partial\mathcal{J}/\partial x$, to be evaluated
by the model's adjoint (i.e., the conjugate transpose [Her-
mitian] of the model derivative with respect to the model
state variables [Jacobian]). Namely, without loss of general-
ity, uncertainties of the model equations (Eq. [4]) are treated
as part of the unknowns and moved to the left-hand side
of Eq. (6). The resulting model equations are then satisfied
identically by the solution that also explicitly includes er-
rors of the model as part of the unknowns. As a standard
method for solving constrained optimization problems, La-
grange multipliers are introduced to formally transform the
constrained problem back to an unconstrained one. The La-
grange multipliers are solutions to the model adjoint, and
in turn give the gradient information of $\mathcal{J}$ with respect to
the unknowns. The computational efficiency of solving the
adjoint equations is what makes the adjoint method partic-
ularly useful. Detailed derivation of the adjoint method can
be found, for example, in Thacker and Long (1988).
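The way an adjoint sweep delivers the gradient can be sketched for a deliberately simple case (Python). The assumptions here are a hypothetical scalar linear model x[i+1] = a x[i], direct observations at every step with a constant weight r, and the initial condition x0 as the only control (that is, the model is taken as exact). The backward recursion plays the role of the Lagrange multipliers, and its final value is the gradient of the cost with respect to x0.

```python
def forward(x0, a, n):
    """Integrate the hypothetical model x[i+1] = a * x[i] for n steps."""
    xs = [x0]
    for _ in range(n):
        xs.append(a * xs[-1])
    return xs

def cost(x0, a, y, r):
    """Data misfit of the trajectory started from x0."""
    xs = forward(x0, a, len(y) - 1)
    return sum((yi - xi) ** 2 / r for yi, xi in zip(y, xs))

def adjoint_gradient(x0, a, y, r):
    """dJ/dx0 from one forward run and one backward (adjoint) sweep."""
    xs = forward(x0, a, len(y) - 1)
    lam = 0.0
    for i in reversed(range(len(y))):
        forcing = -2.0 * (y[i] - xs[i]) / r   # partial derivative of J with respect to x[i]
        lam = a * lam + forcing               # adjoint of the step x[i+1] = a * x[i]
    return lam

x0, a, r = 0.5, 0.9, 0.1
y = [1.0, 0.9, 0.8, 0.7]
grad = adjoint_gradient(x0, a, y, r)
print(grad)
```

One forward and one backward integration yield the gradient regardless of the number of time-levels, whereas a finite-difference estimate would need an extra forward run for every control variable. A descent algorithm would then use this gradient to update x0 iteratively.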
Methods that directly solve the minimization problem of Eq. (7)
are sometimes called variational methods or 4D-var
(four-dimensional variational method): four-dimensional
because the minimization is over space and time, and variational
because the theory is based on functional variations. However,
strictly speaking, this terminology is a misnomer. For example,
Kalman filtering/smoothing is also a solution to the four-
dimensional optimization problem, and to the extent that as-
similation problems are always rendered discrete, the adjoint
method is no longer variational but is algebraic.
Many applications of the adjoint are of the so-called
"strong constraint" variety (Sasaki, 1970), in which model
equations are assumed to hold exactly without errors, making
initial and boundary conditions the only model unknowns.
As a consequence, many such studies are of short duration
because of finite errors in $\mathcal{F}$ in Eq. (4) (e.g., Greiner
et al., 1998a, b). However, contrary to common misconceptions,
the adjoint method is not restricted to solving only
"strong constraint" problems. As described above, by ex-
plicitly incorporating model errors as part of the unknowns
(so-called controls), the adjoint method can be applied to
solve Eq. (7) with nonzero model uncertainties Q. Examples
of such "weak constraint" adjoint may be found in Stammer
et al. (1997) and Lee and Marotzke (1998). (See also Griffith
and Nichols, 1996.)
Adjoint methods have been used to assimilate altimetry
data into regional quasi-geostrophic models (Moore, 1991;
Schröter et al., 1993; Vogeler and Schröter, 1995; Morrow
and De Mey, 1995; Weaver and Anderson, 1997), shallow
water models (Greiner and Perigaud, 1994, 1996; Cong
et al., 1998), primitive equation models (Stammer et al.,
1997; Lee and Marotzke, 1998), and a simple coupled
ocean-atmosphere model (Lee et al., 2000). De las Heras et al.
(1994) explored the method in wave data assimilation.
One of the particular difficulties of employing adjoint
methods has been in generating the model's adjoint. Algo-
rithms of typical general circulation models are complex and
entail on the order of tens of thousands of lines of code, mak-
ing the construction of the adjoint technically challenging.
Moreover, the adjoint code depends on the particular set of
control variables that varies with particular applications. The
adjoint compiler of Giering and Kaminski (1998) greatly
alleviates the difficulty associated with generating the ad-
joint code by automatically transforming a forward model
into its tangent linear approximation and adjoint. Stammer
et al. (1997) employed the adjoint of the MITGCM (Marshall
et al., 1997) constructed by such a compiler.
The adjoint method achieves its computational efficiency
by its efficient evaluation of the gradient of the cost func-
tion. Yet, typical application of the adjoint method requires
several tens of iterations until the cost function converges,
which still requires a significant amount of computations
relative to a simulation. Moreover, for nonlinear models,
integration of the Lagrange multipliers requires the for-
ward model trajectory which must be stored or recomputed
during each iteration. Approximations have been made by
saving such trajectories at coarser time levels than actual
model time-steps ("checkpointing"), recomputing interme-
diate time-levels as necessary or simply approximating them
with those that are saved (e.g., Lee and Marotzke, 1997). In
the "weak constraint" formalism, the unknown model errors
are estimated at fixed intervals as opposed to every time-step,
so as to limit the size of the control. Although efficient, such
computational overhead still makes the adjoint method too
costly to apply directly to global models at state-of-the-art
resolution (e.g., Fu and Smith, 1996).
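The storage trade-off behind checkpointing can be sketched as follows (Python). The logistic-type model step, the function names, and the checkpoint interval are hypothetical choices for this illustration. The forward state is saved only every few steps, and an intermediate state needed by the backward sweep is recovered by re-integrating from the nearest earlier checkpoint.

```python
def step(x, dt=0.1):
    """One time-step of a hypothetical nonlinear model."""
    return x + dt * x * (1.0 - x)

def run_with_checkpoints(x0, n_steps, every):
    """Integrate forward, saving the state only every `every` steps."""
    checkpoints = {0: x0}
    x = x0
    for i in range(1, n_steps + 1):
        x = step(x)
        if i % every == 0:
            checkpoints[i] = x
    return checkpoints

def state_at(i, checkpoints, every):
    """Recover the state at step i by recomputing from the nearest earlier checkpoint."""
    base = (i // every) * every
    x = checkpoints[base]
    for _ in range(i - base):
        x = step(x)
    return x

checkpoints = run_with_checkpoints(0.1, n_steps=20, every=5)
print(len(checkpoints), state_at(13, checkpoints, every=5))
```

Here 5 states are stored in place of 21, at the price of up to 4 extra model steps per requested state. Approximating intermediate states by the saved ones, as also mentioned above, would avoid the recomputation but no longer reproduce the trajectory exactly.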
To alleviate some of the computational cost associated
with convergence, Luong et al. (1998) employ an iterative
scheme in which the minimization iterations are conducted
over time periods of increasing length. This progressive
strategy allows the initial decrease in the cost function to be
achieved with smaller computational requirements
than otherwise. In comparison, D. Stammer (personal
communication, 1998) employs an iterative scheme in space.
Namely, assimilation is first performed by a coarse resolu-
tion model. A finer-resolution model is used in assimilation
next, using the previous coarser solution interpolated to the
fine grid as the initial estimate of the adjoint iteration. It is
anticipated that the resulting distance of the fine-resolution
model to the optimal minimum of the cost function $\mathcal{J}$ is
smaller than otherwise and that convergence can therefore
be achieved faster.
Courtier et al. (1994) instead put forth an incremental
approach to reducing the computational requirements of the
adjoint method. The approach consists of estimating modifi-
cations of the model state (increments) based on a simplified
model and its adjoint. The simplifications include the tangent
linear approximation, reduced resolution, and approximated
physics (e.g., adiabatic instead of diabatic). Motivated in part
by the desire to simplify coding the adjoint model, Schiller and
Willebrand (1995) employed an approximate adjoint in which the
adjoint of only the heat and salinity equations was used in
conjunction with a full primitive equation ocean general
circulation model.
The adjoint method is based on accurate evaluations of
the local gradient of the cost function (Eq. [7]). The estima-
tion is rigorous and consistent with the model, but could po-
tentially lead to suboptimal results should the minimization
converge to a local minimum instead of a global minimum as
could occur with strongly nonlinear models and observations
(e.g., convection). Such situations are typically assessed by
perturbation analyses of the system near the optimized solu-
tion.
A posteriori uncertainty estimates are an integral part of
the solution of inverse problems. The a posteriori error co-
variance matrix of the adjoint method is given by the inverse
of the Hessian matrix (second derivative of the cost function
J with respect to the control vector) (Thacker, 1989). How-
ever, computational requirements associated with evaluating
the Hessian render such calculations infeasible for most
practical applications. Yet, some aspects of the error and
sensitivity may be evaluated by computing the dominant
structures of the Hessian matrix (Anderson et al., 1996).
Practical evaluations of such error estimates require further
investigation.
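For a linear problem the relation between the Hessian and the a posteriori error covariance can be written out explicitly. The following is a sketch under the assumptions that the operator in Eq. (7) is a matrix A and that W is the error covariance of the equations; the symbol P for the a posteriori error covariance is introduced here for illustration only.

$$\mathcal{J} = (y - Ax)^T W^{-1} (y - Ax), \qquad \frac{\partial^2 \mathcal{J}}{\partial x^2} = 2 A^T W^{-1} A, \qquad P = (A^T W^{-1} A)^{-1} = 2 \left( \frac{\partial^2 \mathcal{J}}{\partial x^2} \right)^{-1}$$

The factor of 2 arises only because Eq. (7) is written without a leading 1/2; when the cost function is defined with that factor, as is common, the covariance equals the inverse Hessian directly. The difficulty noted above is that the Hessian has as many rows and columns as there are control variables.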
4.4. Representer Method
The representer method (Bennett, 1992) solves the op-
timization problem Eq. (6) by seeking a solution linearly
expanded into data influence functions, called representers,
that correspond to each separate measurement. The assimi-
lation problem then becomes one of determining the optimal
coefficients of the representers. Because the number of
observations is typically much smaller than the number of
elements of the model state (two orders of magnitude in the
example above), the resulting optimization problem becomes
much smaller in size
than the original problem (Eq. [6]) and is therefore easier to
solve.
Representers are functionals corresponding to the effects
of particular measurements on the estimated solution, viz.,
Green's functions to the data assimilation problem (Eq. [6]).
Egbert et al. (1994) and Le Provost et al. (1998) employed
the representer method in assimilating T/P data into a model
of tidal constituents. Although much reduced, representer
methods still require a significant amount of computational
resources. The largest computational difficulty lies in deriv-
ing and storing the representer functions; the computation
requires running the model and its adjoint N times, spanning
the duration of the observations, where N is the number of
individual measurements. Although much smaller than the
size of the original inverse problem (Eq. [6]), the number of
representer coefficients to be solved, N, is also still fairly
large.
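The reduction in problem size can be illustrated with a static, finite-dimensional analog (Python). This is a simplification and not the time-dependent formulation of Bennett (1992): a prescribed prior covariance P stands in for the model and adjoint integrations that generate a representer, and all values and names are hypothetical. With a single measurement there is one representer and one coefficient, however large the state is.

```python
def matvec(M, v):
    """Matrix-vector product for lists of rows."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# Hypothetical 3-element state with prior covariance P.
x_prior = [0.0, 0.0, 0.0]
P = [[1.0, 0.5, 0.25],
     [0.5, 1.0, 0.5],
     [0.25, 0.5, 1.0]]
h = [1.0, 0.0, 0.0]    # measurement functional: samples the first element
y, r = 2.0, 1.0        # datum and its error variance

representer = matvec(P, h)   # influence of this datum on every state element
beta = (y - dot(h, x_prior)) / (dot(h, representer) + r)   # 1 x 1 coefficient problem
x_hat = [xp + beta * rk for xp, rk in zip(x_prior, representer)]
print(representer, beta, x_hat)
```

With N measurements the scalar division becomes an N × N linear system for the N coefficients, which is the system referred to above as still being fairly large.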
Approximations are therefore necessary to reduce the
computational requirements for practical applications. Egbert
et al. (1994) employed a restricted subset of representers,
noting that representers are similar for nearby measurement
functionals. Alternatively, Egbert and Bennett (1996) formu-
late the representer method without explicitly computing the
representers.
Theoretically, the representer expansion is only applica-
ble to linear models and linear measurement functionals,
because otherwise a sum of solutions (representers) is not
necessarily a solution of the original problem. Bennett and
Thorburn (1992) describe how the method can be extended
to nonlinear models by iteration, linearizing nonlinear terms
about the previous solution.
4.5. Kalman Filter and Optimal Smoother
The Kalman filter, and related smoothers, are minimum
variance estimators of Eq. (6). That is, given the right-hand
side and the relationship in Eq. (6), the Kalman filter and
smoothers provide estimates of the unknowns that are opti-
mal, defined as having the minimum expected error variance,
$$\langle (x - \bar{x})^T (x - \bar{x}) \rangle. \qquad (9)$$
In Eq. (9), $\bar{x}$ is the true solution and the angle brackets denote
statistical expectation. Although not immediately obvious,
minimum variance estimates are equivalent to least-squares
solutions (e.g., Wunsch, 1996, p. 184). In particular, the two
are the same when the weights used in Eq. (7) are prior error
covariances of the model and data constraints. That is, the
Kalman filter assumes no more (statistics) than what is as-
sumed (i.e., choice of weights) in solving the least-squares
problem (e.g., adjoint and representers). When the statistics
are Gaussian, the solution is also the maximum likelihood
estimate.
The Kalman filter achieves its computational efficiency
by its time recursive algorithm. Specifically, the filter com-
bines data at each instant (when available) and the state pre-
dicted by the model from the previous time step. The result
is then integrated in time and the procedure is repeated for
the next time-step. Operationally, the Kalman filter is in ef-
fect a statistical average of model state prior to assimilation
and data, weighted according to their respective
uncertainties (error covariance). The algorithm guarantees that
information from past measurements is all contained within the
predicted model state and therefore past data need not be
used again. The savings in storage (that past data need not
be saved) and computation (that optimal estimates need not
be recomputed from the beginning of the measurements) are
an important consideration in real-time estimation and
prediction.
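The recursion can be sketched for a scalar state (Python). The linear model x[i+1] = a x[i], its error variance q, and the data values are hypothetical and serve only to show the forecast and update steps.

```python
def kalman_step(x, p, a, q, y=None, r=None):
    """One forecast/update cycle for a scalar linear model x[i+1] = a * x[i] + error."""
    # Forecast: integrate the state and its error variance.
    x_f = a * x
    p_f = a * p * a + q
    if y is None:                      # no data at this instant
        return x_f, p_f
    # Update: average forecast and datum, weighted by their error variances.
    gain = p_f / (p_f + r)
    x_a = x_f + gain * (y - x_f)
    p_a = (1.0 - gain) * p_f
    return x_a, p_a

x, p = 0.0, 1.0                        # initial estimate and its error variance
for y in [1.0, None, 1.2]:
    x, p = kalman_step(x, p, a=1.0, q=0.0, y=y, r=1.0)
print(x, p)
```

With a = 1 and q = 0 the final estimate equals the equal-weight average of the initial guess and the two data (all three have unit error variance), even though the first datum is never revisited. Its information is carried forward entirely by the state and its error variance.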
The filtered state is optimal with respect to measure-
ments of the past. The smoother additionally utilizes data
that lie formally in the future; as future observations con-
tain information of the past, the smoothed estimates have
smaller expected uncertainties (Eq. [9]) than filtered results.
In particular, the smoother literally "smoothes" the filtered
results by reducing the temporal discontinuities present in
the estimate due to the filter's intermittent data updates.
Various forms and algorithms exist for smoothers depending
on the time window of observations used relative to the
estimate. In general, the smoother is applied to the filtered
results (which contain the data information) backwards
in time. The occasional references to "Kalman smoothers"
or "Kalman smoothing" are misnomers. They are simply
smoothers and smoothing.
The computational difficulty of Kalman filtering, and
subsequent smoothing, lies in evaluating the error covari-
ances that make up the filter and smoother. The state error
evolves in time according to model dynamics and the
information gained from the observations. In particular, the
dynamic evolution of the error covariance, which assures the
estimate's optimality, requires the equivalent of integrating the
model twice as many times as there are elements in the model
state, in addition to integrating the state itself, and is the
most computationally demanding step of Kalman filtering.
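The operation count can be made explicit with a small sketch (Python). It assumes a linear model available only as a routine that steps a state-sized vector, so that the forecast error covariance A P A^T + Q has to be built by applying that routine to the columns of P and then to the rows of the result. The 4-element smoothing model and all names are hypothetical.

```python
calls = 0   # counts how many times the model is stepped

def model(v):
    """Hypothetical linear model step (nearest-neighbour smoothing, periodic domain)."""
    global calls
    calls += 1
    n = len(v)
    return [0.5 * v[i] + 0.25 * v[i - 1] + 0.25 * v[(i + 1) % n] for i in range(n)]

def transpose(M):
    return [list(col) for col in zip(*M)]

def propagate_covariance(P, Q):
    """Return A P A^T + Q using only calls to model(); A itself is never formed."""
    n = len(P)
    AP = transpose([model(col) for col in transpose(P)])   # n model steps give A P
    APAt = [model(row) for row in AP]                       # n more steps give (A P) A^T
    return [[APAt[i][j] + Q[i][j] for j in range(n)] for i in range(n)]

n = 4
P = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
Q = [[0.0] * n for _ in range(n)]
P_next = propagate_covariance(P, Q)
print(calls)
```

A single covariance forecast costs 2n model steps for a state of n elements, compared with one step for the state itself. For n of order a million this is what places the unapproximated filter out of reach.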
Although the availability of a posteriori error estimates
is fundamental in estimation, the large computational
requirement associated with the error evaluation makes
Kalman filtering impractical for models with on the order of a
million variables or more. For this reason, direct applications of
Kalman filtering to oceanographic problems have been lim-
ited to simple models. For instance, Gaspar and Wunsch
(1989) analyzed Geosat altimeter data in the Gulf Stream
region using a spectral barotropic free Rossby wave model. Fu
et al. (1991) detected free equatorial waves in Geosat
measurements using a similar model.
More recently, a number of approximations have been put
forth aimed directly at reducing the computational require-
ments of Kalman filtering and smoothing, and thereby mak-
ing it practical for applications with large general circula-
tion models. For example, errors of the model state often
achieve near-steady or cyclic values for time-invariant
observing systems or cyclic measurements (exact repeat
missions of satellites are such), respectively. Exploiting such
a property, Fukumori et al. (1993) explored approximating
the model state error covariance by its time-asymptotic
limit, thereby eliminating the need for the error's continuous
time-integration and storage. Fu et al. (1993), assimilating
Geosat data with a wind-driven spectral equatorial
wave model, demonstrated that estimates made by such a
time-asymptotic filter are indistinguishable from those
obtained by the unapproximated Kalman filter. Gourdeau et al.
(1997) employed a time-invariant model state error covariance
in assimilating Geosat data with a second baroclinic
mode model of the equatorial Atlantic.
A number of studies have explored approximating the er-
rors of the model state with fewer degrees of freedom than
the model itself, thereby reducing the computational size
of Kalman filtering while still retaining the original model
for the assimilation. Fukumori and Malanotte-Rizzoli (1995)
approximated the model-state error with only its large-scale
structure, noting the information content of many observing
systems in comparison to the number of degrees of freedom
in typical models. Fukumori (1995) and Hirose et al. (1999)
used such a reduced state filter and smoother in assimilating
TOPEX/POSEIDON data into shallow water models of the
tropical Pacific Ocean and the Japan Sea, respectively. Cane
et al. (1996) employed a limited set of empirical orthogonal
functions (EOFs), arguing that model errors are insufficiently
known to warrant estimating the full error covariance matrix.
Parish and Cohn (1985) proposed approximating the
model-error covariance with only its local structure by imposing a
banded approximation of the covariance matrix. Based on a
similar notion that model errors are dominantly local, Chin
et al. (1999) explored state reductions using wavelet
transformation and low-order spatial regression.
In comparison, Menemenlis and Wunsch (1997) approxi-
mated the model itself (and consequently its error) by a state
reduction method based on large-scale perturbations.
Menemenlis et al. (1997) used such a reduced-state filter to
assimilate TOPEX/POSEIDON data in conjunction with acoustic
tomography measurements in the Mediterranean Sea.
For nonlinear models, the Kalman filter approximates the
error evolution by linearizing the model about its present
state, i.e., the so-called extended Kalman filter. (Error co-
variance evolution is otherwise dependent on higher order
statistical moments.) For example, Fukumori and Malanotte-
Rizzoli (1995) employed an extended Kalman filter with
both time-asymptotic and reduced-state approximations. In
many situations, such linearization is found to be adequate.
However, in strongly nonlinear systems, inaccuracies of the
linearized error estimates can be detrimental to the
estimate's optimality (e.g., Miller et al., 1994). Evensen (1994)
proposed approximating the error evaluation by integrating
an ensemble of model states. The covariance among ele-
ments of the ensemble is then used in assimilating observa-
tions into each member of the ensemble, thus circumventing
the problems associated with explicitly integrating the error
covariance. Evensen and van Leeuwen (1996) used such an
ensemble Kalman filter in assimilating Geosat altimeter data
into a quasi-geostrophic model of the Agulhas current.
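The ensemble idea can be sketched for a scalar state (Python). This is a simplified illustration and not the algorithm of Evensen (1994) as published: the nonlinear model, the four-member ensemble, and the data values are hypothetical, and refinements used in practical ensemble filters (for example, perturbing the observations for each member) are omitted.

```python
def mean(v):
    return sum(v) / len(v)

def sample_variance(v):
    m = mean(v)
    return sum((vi - m) ** 2 for vi in v) / (len(v) - 1)

def model(x):
    """Hypothetical nonlinear model step."""
    return x + 0.1 * x * (1.0 - x)

def ensemble_update(ensemble, y, r):
    """Update every member with a gain built from the ensemble's own spread."""
    p_f = sample_variance(ensemble)    # stands in for an explicitly integrated error variance
    gain = p_f / (p_f + r)
    return [x + gain * (y - x) for x in ensemble]

ensemble = [0.2, 0.4, 0.6, 0.8]
forecast = [model(x) for x in ensemble]     # each member is integrated by the full model
analysis = ensemble_update(forecast, y=1.0, r=0.05)
print(mean(forecast), mean(analysis))
```

No linearization of the model is involved: the forecast error variance is estimated from the spread of the nonlinearly integrated members, at the cost of one model integration per member.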
Pham et al. (1998) proposed a reduced-state filter based
on a time-evolving set of EOFs (Singular Evolutive Extended
Kalman Filter, SEEK) with the aim of reducing the
dimension of the estimate while taking into account
the time-evolving direction of a model's most unstable
mode. Verron et al. (1999) applied the method to analyze
TOPEX/POSEIDON data in the tropical Pacific Ocean.
4.6. Model Green's Function
Stammer and Wunsch (1996) utilized model Green's
functions to analyze TOPEX/POSEIDON data in the North
Pacific. The approach consists of reducing the dimension
of the least-squares problem (Eq. [6]) into one that is solv-
able by expanding the unknowns in terms of a limited set
of model Green's functions, corresponding to the model's
response to impulse perturbations. The amplitudes of the
functions then become the unknowns. Stammer and Wunsch
(1996) restricted the Green's functions to those correspond-
ing to large-scale perturbations so as to limit the size of the
problem. Bauer et al. (1996) employed a similar technique
in assimilating altimetric significant wave height data into a
wave model.
The expansion of solutions into a set of limited functions
is similar to the approach taken in the representer method, al-
beit with different basis functions, while the method's iden-
tification of the large-scale corrections is closely related to
the approach taken in the reduced-state Kalman filters (e.g.,
Menemenlis and Wunsch [1997]).
4.7. Optimal Interpolation
Optimal interpolation (OI) is a minimum variance se-
quential estimator that is algorithmically similar to Kalman
filtering, except OI employs prescribed weights (error co-
variances) instead of ones that are theoretically evaluated
by the model over the extent of the observations. Sequential
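methods solve the assimilation problem separately at
different instants, i,
$$\begin{pmatrix} \mathcal{H}_i(x_i) \\ x_i \end{pmatrix} \approx \begin{pmatrix} y_i \\ \mathcal{F}_{i-1}(x_{i-1}) \end{pmatrix} \qquad (10)$$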
given the observations $y_i$ and the estimate at the previous
instant, $x_{i-1}$. The main distinction between Eqs. (10) and (6) is
the lack of time dimension in the former. Observed temporal
evolution provides an explicit constraint in Eq. (6), whereas
it is implicit in Eq. (10), contained supposedly within the
past state and its uncertainties (weights). Although optimal
interpolation provides "optimal" instantaneous estimates un-
der the particular weights used, the solution is in fact subop-
timal over the entire measurement period due to lack of the
time dimension from the problem it solves.
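A scalar sketch of such a sequential scheme is given below (Python). The persistence model, the prescribed forecast error variance, and the data are hypothetical. The point is only that the weight given to the data is fixed in advance and is not evolved by the model.

```python
def optimal_interpolation(x0, observations, model, p_prescribed, r):
    """Sequential estimate using a fixed, prescribed forecast error variance."""
    gain = p_prescribed / (p_prescribed + r)    # the same weight at every instant
    x = x0
    estimates = []
    for y in observations:
        x = model(x)                    # forecast from the previous estimate
        if y is not None:
            x = x + gain * (y - x)      # instantaneous minimum variance update
        estimates.append(x)
    return estimates

est = optimal_interpolation(0.0, [1.0, None, 1.2], lambda s: s,
                            p_prescribed=1.0, r=1.0)
print(est)
```

Each update is optimal for the prescribed weights, but the second datum is weighted as if the first had never reduced the uncertainty of the state. For these inputs, an estimate that accounts for that reduction would be the equal-weight average of the initial guess and the two data, about 0.733, in place of the 0.85 obtained here.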
OI is presently one of the most widely employed
assimilation methods. Marshall (1985) examined the problem of
separating ocean circulation and geoid from altimetry
using OI with a barotropic quasi-geostrophic (QG) model.
Berry and Marshall (1989) and White et al. (1990b)
explored altimetric assimilation with an OI scheme using a
multilevel QG model, but assumed zero vertical correlation
in the stream function, modifying sea surface stream func-
tion alone. A three-dimensional OI method was explored by
Dombrowsky and De Mey (1992) who assimilated Geosat
data into an open domain QG model of the Azores region.
Ezer and Mellor (1994) assimilated Geosat data into a prim-
itive equation (PE) model of the Gulf Stream using an OI
scheme described by Mellor and Ezer (1991), employing
vertical correlation as well as horizontal statistical interpo-
lation. Oschlies and Willebrand (1996) specified the vertical
correlations so as to maintain deep temperature-salinity re-
lations, and applied the method in assimilating Geosat data
into an eddy-resolving PE model of the North Atlantic.
The empirical sequential methods that include OI and
others discussed in the following sections are distinctly dif-
ferent from the Kalman filter (Section 4.5), which is also
a sequential method. The Kalman filter and smoother algo-
rithm allows for computing the time-evolving weights ac-
cording to model dynamics and uncertainties of model and
data, so that the sequential solution is the same as that of
the whole time domain problem, Eq. (6). The weights in
the empirical methods are specified rather than computed,
often neglecting the potentially complex cross covariance
among variables that reflects the information's propagation
by the model (see Section 5.1.4). Some applications of OI,
however, allow for the error variance of the model state
to evolve in time as dictated by the model-data combination
and intrinsic growth, but still retain the correlation
unchanged (e.g., Ezer and Mellor, 1994). The Physical-Space
Statistical Analysis System (PSAS) (Cohn et al., 1998) is a
particular implementation of OI that solves Eq. (10) without
explicit formulation of the inverse operator.
4.8. Three-Dimensional Variational Method
The so-called three-dimensional variational method
(3D-var) solves Eq. (10) as a least-squares problem,
minimizing the residuals:
$$\mathcal{J} = (y_i - \mathcal{H}_i(x_i))^T R_i^{-1} (y_i - \mathcal{H}_i(x_i)) + (x_i - \mathcal{F}_{i-1}(x_{i-1}))^T Q_{i-1}^{-1} (x_i - \mathcal{F}_{i-1}(x_{i-1})). \qquad (11)$$
This is similar to the whole domain problem (Eq. [8]) except
without the time dimension; thus the name "three-dimensional"
as opposed to "four-dimensional" (Section 4.3).
However, as with 4D-var, 3D-var is a misnomer, and the
method is merely least-squares. Because there is no model
integration of the unknowns involved, the gradient of $\mathcal{J}$ is
readily computed, and is used in finding the minimum of $\mathcal{J}$.
Bourles et al. (1992) employed such an approach in
assimilating Geosat data in the tropical Atlantic using a linear
model with three vertical modes. The approach described by
Derber and Rosati (1989) is a similar scheme, except the in-
version is performed at each model time-step, reusing ob-
servations within a certain time window, which makes the
method a hybrid of 3D-var and nudging (Section 4.9).
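Because no model integration enters the gradient, the minimization of Eq. (11) is simple to sketch. The example below (Python) assumes a scalar state observed directly, scalar weights r and q, and plain fixed-step gradient descent; the function names, step size, and numbers are hypothetical.

```python
def cost(x, y, x_forecast, r, q):
    """Eq. (11) for a scalar state observed directly (H is the identity)."""
    return (y - x) ** 2 / r + (x - x_forecast) ** 2 / q

def gradient(x, y, x_forecast, r, q):
    """Derivative of cost() with respect to x; no model integration is needed."""
    return -2.0 * (y - x) / r + 2.0 * (x - x_forecast) / q

def minimize(y, x_forecast, r, q, rate=0.1, n_iter=200):
    """Fixed-step gradient descent starting from the forecast."""
    x = x_forecast
    for _ in range(n_iter):
        x -= rate * gradient(x, y, x_forecast, r, q)
    return x

x_a = minimize(y=2.0, x_forecast=0.0, r=1.0, q=1.0)
print(x_a, cost(x_a, 2.0, 0.0, 1.0, 1.0))
```

With equal weights the minimum lies midway between the forecast and the datum, as expected for a least-squares problem of this form.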
4.9. Direct Insertion
Direct insertion replaces model variables with observa-
tions, or measurements mapped onto model fields, so as to
initialize the model for time-integration. Direct insertion can
be thought of as a variation of OI in which prior model state
uncertainties are assumed to be infinitely larger than errors in
observations. Hurlburt (1986), Thompson (1986), and Kin-
dle (1986) explored periodic direct insertions of altimetric
sea level using one- and two-layer models of the Gulf of
Mexico. Using the same model, Hurlburt et al. (1990)
extended the studies by statistically initializing deeper
pressure fields from sea level measurements. De Mey and
Robinson (1987) initialized a QG model by statistically projecting
sea surface height into the three-dimensional stream
function. Gangopadhyay et al. (1997) and Gangopadhyay and
Robinson (1997) performed similar initializations by the
so-called "feature model." Instead of using correlation in the
data-mapping procedure, which tends to smear out short-
scale gradients, feature models effect the mapping by assum-
ing analytic horizontal and vertical structures for coherent
dynamical features such as the Gulf Stream and its rings.
"Rubber sheeting" (Carnes
et al.,
1996) is another approach
aimed at preserving "features" by directly moving model
fields towards observations in spatially correlated displace-
ments. Haines (1991) formulated the vertical mapping of sea
level based on QG dynamics, keeping the subsurface poten-
tial vorticity unchanged while still directly inserting sea level
data into the surface stream function. Cooper and Haines
(1996) examined a similar vertical extension method pre-
serving subsurface potential vorticity in a primitive equation
model.
4.10. Nudging
Nudging blends data with models by adding a Newtonian
relaxation term to the model prognostic equations (Eq. [4])
aimed at continuously forcing the model state towards ob-
servations (Eq. [2]),
$$x_{i+1} = \mathcal{F}_i(x_i) - \gamma\,(\mathcal{H}_j(x_j) - y_j). \qquad (12)$$
The nudging coefficient, $\gamma$, is a relaxation coefficient that is
typically a function of distance in space and time (i - j)
between model variables and observations. Nudging is
equivalent to the so-called robust diagnostic modeling in-
troduced by Sarmiento and Bryan (1982) in constraining
model hydrographic structures. While other sequential methods
modify model variables intermittently, at the times of the
observations, nudging modifies the model field continuously in
time. Data are reused at every model time-step, formally both in
the future and in the past, so that the model state changes
gradually and "undesirable" discontinuities due to the
assimilation are avoided. The smoothing aspect
of nudging is distinct from optimal smoothers of estimation
theory (Section 4.5); whereas the optimal smoother propa-
gates data information into the past by the model dynamics
(model adjoint), nudging effects a smooth estimate by using
data interpolated backwards in time based solely on tempo-
ral separation.
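A scalar sketch of Eq. (12) may help make the relaxation term concrete. It assumes, for simplicity, that an observation is available at every time-step (j = i), that the observation operator is the identity, and that the nudging coefficient is constant; the model, its equilibrium value, and all numbers are hypothetical.

```python
# Scalar nudging sketch for Eq. (12), assuming observations are available
# at every time-step (j = i), an identity observation operator, and a
# constant nudging coefficient gamma. All values are illustrative.

def model_step(x):
    """Hypothetical model F: relaxes toward its own equilibrium of 10."""
    return 0.9 * x + 1.0


def run(x0, observations, gamma):
    """Integrate x_{i+1} = F(x_i) - gamma * (x_i - y_i)."""
    x = x0
    history = [x]
    for y in observations:
        x = model_step(x) - gamma * (x - y)
        history.append(x)
    return history


observations = [12.0] * 100   # data persistently above the model equilibrium
free_run = run(10.0, observations, gamma=0.0)
nudged_run = run(10.0, observations, gamma=0.3)
```

In this sketch the nudged state settles at 11.5, between the model's own equilibrium (10) and the data (12). The relaxation term therefore acts as a persistent artificial source in the model equation rather than as a correction of the model state alone.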
Verron and Holland (1989) and Holland and Malanotte-
Rizzoli (1989) explored altimetric assimilation by nudging
surface vorticity in a multi-layer QG model. Verron (1992)
further explored other methods of nudging surface circula-
tion including surface stream function. These studies were
followed by several investigations assimilating actual Geosat
altimeter data using similar models and approaches in vari-
ous regions; examples include White
et al. (1990a)
in the
California Current, Blayo
et al.
(1994, 1996) in the North At-
lantic, Capotondi
et al.
(1995a, b) in the Gulf Stream region,
Stammer (1997) in the eastern North Atlantic, and Seiss
et al. (1997)
in the Antarctic Circumpolar Current. In par-
ticular, Capotondi
et al.
(1995a) theoretically examined the
physical consequences of nudging surface vorticity in terms
of potential vorticity conservation. Most recently, Florenchie
and Verron (1998) nudged TOPEX/POSEIDON and ERS-1
data into a QG model of the South Atlantic Ocean.
Other studies explored directly nudging subsurface fields
in addition to surface circulation by extrapolating sea level
data prior to assimilation. For instance, Smedstad and Fox
(1994) used the statistical inference technique of Hurlburt
et al.
(1990) to infer subsurface pressure in a two-layer
model of the Gulf Stream, adjusting velocities geostroph-
ically. Forbes and Brown (1996) nudged Geosat data into
an isopycnal model of the Brazil-Malvinas confluence re-
gion by adjusting subsurface layer thicknesses as well as
surface geostrophic velocity. The monitoring and forecasting
system developed for the Fleet Numerical Meteorology and
Oceanography Center (FNMOC) nudges three-dimensional
fields generated by "rubber sheeting" and OI (Carnes
et al.,
1996).
4.11. Summary and Recommendation
Innovations in estimation theory, such as the development
of adjoint compilers and various approximate Kalman filters,
combined with improvements in computational capabilities,
have made applications of optimal estimation methods
feasible for many ocean data assimilation problems.
Such developments were largely regarded as impractical
and/or unlikely to succeed until recently. The virtues of
these "advanced" methods, described in Sections 4.3 to 4.6
above, are their clear identification of the underlying "four-
dimensional" optimization problem (Eq. [6]) and their ob-
jective and quantitative formalism. In comparison, the re-
lation between the "four-dimensional" problem and the ap-
proach taken by other ad hoc schemes (Sections 4.7 to 4.10)
is not obvious, and the nature and consequence of their par-
ticular assumptions are difficult to ascertain. Arbitrary as-
sumptions can lead to physically inconsistent results, and
therefore analyses resulting from ad hoc schemes must be in-
terpreted cautiously. For instance, nudging subsurface tem-
perature can amount to assuming heating and/or cooling
sources within the water column.
As a result of the advancements, ad hoc schemes used
in earlier studies of assimilation are gradually being superseded
by methods based on estimation theory. For example,
even though operational requirements often call for efficient
methods, thus favoring simpler ad hoc
schemes, the European Centre for Medium-Range Weather
Forecasts has recently upgraded its operational meteorological
forecasting system from "3D-var" to the adjoint
method.
Differences among the "advanced" methods are largely of
convenience. As in "classic" inverse methods, solutions by
optimal estimation are identical so long as the assumptions,
explicit and implicit, are the same. Some approaches may be
more effective in solving nonlinear optimization problems
than others. Others may be more computationally efficient.
However, published studies to date are inconclusive on either
issue.
Given the equivalence, the accuracy of the assumptions
is a more important issue for estimation than the
choice of assimilation method. In particular, the form and
weights (prior covariance) of the least-squares "cost func-
tion" (Eq. [8]) require careful selection. Different assimila-
tions often make different assumptions, and the adequacy
and implication of their particular suppositions must prop-
erly be assessed. These and other practical issues of assimi-
lation are reviewed in the following section.
5. PRACTICAL ISSUES OF ASSIMILATION
As described in the previous section, assimilation tech-
niques are equivalent as long as assumptions are the same,
although very often those assumptions are not explicitly rec-
ognized. Identifying the assumptions and assessing their ap-
propriateness are important issues in assuring the reliabil-
ity of assimilated estimates. Several other issues of practical
importance exist that warrant careful attention when assim-
ilating data, including some that are particular to altimetric
data. These issues are discussed in turn below, and include:
the weights used in defining and solving the assimilation
problem in Eq. (7), methods of vertical extrapolation, deter-
mination of subsurface circulation (observability), prior data
treatment such as horizontal mapping and conversion of sea
level to geostrophic velocities, and the treatment of the un-
known geoid and reference sea level.
5.1. Weights, A Priori Uncertainties, and Extrapolation
The weights W in Eq. (7) define the mathematical prob-
lem of data assimilation. As such, suitable specification of
weights is essential to obtaining sensible solutions, and is
the most fundamental issue in data assimilation. While ad-
vancements in computational capabilities will directly solve
many of the technical issues of assimilation (Section 4), they
will not resolve the weight identification. Different weights
amount to different problems, thereby leading to different
solutions. Misspecification of weights can lead to overfitting
or underfitting of data, and/or the failure of the assimilation
altogether.
On the one hand, least-squares problems are deterministic
in the sense that, mathematically, weights could be chosen
arbitrarily, for example to yield minimum length solutions and/or
solutions with minimum energy (e.g., Weaver and Anderson,
1997). On the other hand, the equivalence of least-squares
solutions with minimum error variance and maximum likelihood
estimates suggests a particularly suitable choice of
weights: the a priori uncertainties of the data and model
constraints, Eqs. (2) and (4). Specifically, the weights can
be identified as the inverse of the respective error covariance
matrices.
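The role of the weights can be illustrated with a scalar least-squares problem combining one model estimate and one datum. The following Python sketch uses hypothetical values; it assumes the two errors are uncorrelated, so that the weights are simply the inverses of the two error variances.

```python
# Scalar weighted least squares: minimize
#   J(x) = (y - x)^2 / r + (x - m)^2 / q
# where m is a model estimate and y a datum. Values are illustrative.

def weighted_estimate(m, q, y, r):
    """Minimizer of J, and its error variance if q and r are the actual
    (mutually uncorrelated) error variances of m and y."""
    weight_model = 1.0 / q
    weight_data = 1.0 / r
    x_hat = (weight_model * m + weight_data * y) / (weight_model + weight_data)
    variance = 1.0 / (weight_model + weight_data)
    return x_hat, variance


m, q = 10.0, 4.0   # model estimate and its prior error variance
y, r = 12.0, 1.0   # datum and its prior error variance

x_hat, variance = weighted_estimate(m, q, y, r)

# Misspecified weights define a different problem and a different solution:
# claiming the datum is 100 times more accurate pulls the estimate onto y.
x_overfit, _ = weighted_estimate(m, q, y, r / 100.0)
```

With these numbers the estimate is 11.6 with an error variance of 0.8, smaller than either prior variance. Shrinking the assumed data error moves the solution toward the datum, which is the scalar analogue of overfitting.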
5.1.1. Nature of Model and Data Errors
Apart from the problem of specifying values of a priori
errors (Section 5.1.2), it is important first to clarify what the
errors correspond to, as there are subtleties in their identi-
fication. In particular, the a priori errors in Eqs. (7) and (8)
should be regarded as errors in model and data constraints
rather than merely model and data errors. A case in point is
the so-called representation error (e.g., Lorenc, 1986), which
corresponds to real processes that affect measurements but
are not represented or resolvable by the models. Representation
"errors" concern the null space of the model, as opposed
to errors within the model range space. For instance, iner-
tial oscillations and tides are not included in the physics of
quasi-geostrophic models and are therefore within the mod-
els' null space. To the extent that representation errors are
inconsistent with models but contribute to measurements, er-
rors of representativeness should be considered part of the
uncertainties of the data constraint (Eq. [2]) instead of the
model constraint (Eq. [4]). Cohn (1997) provides a particu-
larly lucid explanation of this distinction, which is summa-
rized in the discussion below.
Several components of what may be regarded as "model
error" exist, and a careful distinction is required to define
the optimal solution. In particular, three types of model error
can be distinguished; these could be called model state error,
model equation error, and model representation error. First,
it is essential to recognize the fundamental difference between
the ocean and the models. Models have finite dimensions
whereas the real ocean has infinite degrees of freedom.
The model's true state ($\bar{x}$) can be mathematically defined by
a functional relationship with the real ocean (w):
$$\bar{x} = \mathcal{P}(w). \qquad (13)$$
Functional $\mathcal{P}$ relates the complete and exact state of the
ocean to its representation in the finite and approximate
space of the model. Such an operator includes both spatial
averaging as well as truncation and/or approximation of the
physics. For instance, finite dimensional models lack scales
smaller than their grid resolution. Quasi-geostrophic models
resolve neither inertial waves nor tides as mentioned above
and reduced gravity shallow water models (e.g., 1.5-layer
models) ignore high-order baroclinic modes. The difference
between a given model state and the true state defined by
Eq. (13),
$$x - \bar{x} = x - \mathcal{P}(w) \qquad (14)$$
is the model state error, and its expected covariance, P,
forms the basis of Kalman filtering and smoothing (Section 4.5).
The errors of the model constraint (Eq. 4), or model equation
error, q, can be identified as
$$q_i = x_{i+1} - \mathcal{F}_i(x_i). \qquad (15)$$
The covariance of $q_i$, $Q_i$, is the inverse of the weights for
the model constraint in Eq. (8) in the maximum likelihood estimate.
Model equation error (Eq. 15) is also often referred to
as system error or process noise. Apart from its dependence
on errors of the initial condition and assimilated data, the
model state error P is a time-integral, by the model equation
$\mathcal{F}$, of process noise (model equation error) Q. Process noise
includes inaccuracies in numerical algorithms (e.g., integra-
tion errors caused by finite differencing) as well as errors in
external forcing and boundary conditions.
The third component of model error is model representation
error and arises in the context of comparing the model
with observations (reality). Observations y measure properties
of the real ocean and can be described symbolically as:
$$y = \mathcal{E}(w) + e \qquad (16)$$
where $\mathcal{E}$ represents the measurements' sampling operation
of the real ocean w, and e denotes the measuring instruments'
errors. Functional $\mathcal{E}$ is generally different from the
model's equivalent, $\mathcal{H}$ in Eq. (2), owing to differences between
x and w (Eq. [13]). Measuring instrumentation er-
rors are strictly errors of the observing system and repre-
sent quantities unrelated to either the model or the ocean.
For satellite altimetry, e includes, for example, errors in the
satellite's orbit and ionospheric corrections (cf. Chapter 1).
In terms of quantities in model space, Eq. (16) can be
rewritten as:
$$y = \mathcal{H}(\bar{x}) + \{\mathcal{E}(w) - \mathcal{H}(\mathcal{P}(w))\} + e. \qquad (17)$$
Assimilation is the inversion of Eq. (2), which can be iden-
tified as the first term in Eq. (17) that relates model state to
observations rather than a solution of Eq. (16). The second
term in { } on the right-hand side of Eq. (17) describes differ-
ences between the observing system and the finite dimension
of the model, and is the representation error.
Representation errors arise from inaccuracies or incompleteness
in both model and observations. Model representation
errors arise largely from the spatial and physical truncation
inherent in the model's approximation $\mathcal{P}$ (Eq. [13]). For ex-
ample, coarse-resolution models lack sea level variabilities
associated with meso-scale eddies, and reduced gravity shal-
low water models are incapable of simulating the barotropic
mode. Such inaccuracies constitute model representation er-
ror when assimilating altimetric data to the extent that an
altimeter measures sea level associated with such missing
processes of the model.
Data representation error is primarily caused by the ob-
serving system not exactly measuring the intended property.
For instance, errors in altimetric sea state bias correction
may be considered data representation errors. Sea-state bias
arises because altimetric measurements do not exactly rep-
resent a uniformly averaged mean sea level, but an average
depending on wave height (sea state) and the reflecting char-
acteristics of the altimetric radar, a process that is not ex-
actly known. Some island tide gauge stations, because of
their geographic location (e.g., inlet), do not represent sea
levels of the open ocean and thus can also be considered as
contributing to data representation error. (Alternatively, such
geographic variations can be ascribed to the model's lack of
spatial resolution and thus identified as model representation
error, but such distinctions are moot.)
Representation errors are inconsistent with model phys-
ics, and therefore are not correctable by assimilation. As far
as the model inversion is concerned, representation error,
whether of data or model origin, is indistinguishable from in-
strument error e. Representation error and instrument noise
together constitute uncertainties relating data and the model
state, viz., data constraint error, whose covariance is R in
Eq. (8). Data constraint error is often referred to merely as
data error, which can be misleading as there are components
in R that are unrelated to observations y. The data constraint
error covariance R is identified as the inverse of the weights
for the data constraint in the maximum likelihood estimate
(Eq. [8]) as well as the data uncertainty used in sequential
inversions. In effect, representation errors downweight the
data constraint (Eq. [2]) and prevent a model from being
forced too close to observations that it cannot represent, thus
guarding against model overfitting and/or "indigestion," i.e.,
a degradation of model estimate by insisting models obey
something they are not meant to.
The fact that part of the model's inaccuracies should con-
tribute to downweighting the data constraint is not immedi-
ately obvious and even downright upsetting for some (espe-
cially for those who are closest to making the observations).
However, as it should be clear from discussions above, the
error of the data constraint is in the accuracy of the relation-
ship in Eq. (2) and not about deficiencies of the observations
y per se.
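The downweighting effect can be seen in a scalar update, in which the weight given to the model-data misfit is K = P / (P + R). The following Python sketch uses hypothetical variances (in cm^2) and assumes that instrument and representation errors are uncorrelated, so that their variances add to form R.

```python
# Scalar gain K = P / (P + R), with R the data constraint error variance.
# Illustrative numbers (cm^2); instrument and representation errors are
# assumed uncorrelated so that their variances add.

def gain(p, r):
    """Weight given to the model-data misfit in a scalar update."""
    return p / (p + r)


p_model = 5.0 ** 2            # model state error variance
r_instrument = 3.0 ** 2       # instrument error variance
r_representation = 10.0 ** 2  # e.g., unresolved meso-scale variability

k_instrument_only = gain(p_model, r_instrument)
k_full = gain(p_model, r_instrument + r_representation)

misfit = 8.0                  # data minus model equivalent, in cm
correction_instrument_only = k_instrument_only * misfit
correction_full = k_full * misfit
```

With instrument error alone the model would be drawn about three quarters of the way to the datum; including the representation error reduces the gain to less than one fifth, keeping the model from fitting variability it cannot represent.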
On the one hand, most error sources can readily be iden-
tified as one of the three error types of the assimilation
problem: measurement instrumentation error, model process
noise (or equivalently model equation error), and represen-
tation error. Specific examples of instrument and represen-
tation errors were given above. Process noise includes er-
rors in external forcing and boundary conditions, inaccura-
cies of numerical algorithms (finite differencing), and errors
in model parameterizations. These and other examples are
summarized in Table 1.
On the other hand, representation errors are sometimes
also sources of process noise. For example, while meso-scale
variabilities themselves are representation error for non-
eddy resolving models, the
effects of meso-scale eddies on
the large-scale circulation that are not accurately modeled,
contribute to process noise (e.g., uncertainties in eddy pa-
rameterization such as that of Gent and McWilliams, 1990).
Furthermore, some model errors (but not all) can be categorized
either as process noise or representation error depending
on the definition of the true model state, viz., operator $\mathcal{P}$
in Eq. (13).(1) For instance, $\mathcal{P}$ may be defined alternatively
as including or excluding certain forced responses of the
ocean.

(1) What strictly constitutes $\mathcal{P}$ is in fact ambiguous for many models. For
instance, variables in finite difference models are loosely understood to
represent averages in the vicinity of model grid points. However, the exact
averaging operator is rarely stated.

TABLE 1. Examples of Error Sources in Altimetric Assimilation

Error type: Source
Instrument error: Orbit determination, ionospheric correction, dry and wet tropospheric corrections, instrument noise (bias and random noise), earth tide
Representation error: Sea-state bias, atmospheric pressure loading (a), ocean tides (a), sub-grid scale variability, diabatic effects for adiabatic models (such as QG models), barotropic modes in 1.5-layer models (or any other model lacking the barotropic mode), baroclinic modes higher than what a particular model can resolve
Model equation error (process noise): Numerical truncation (inaccuracies in numerical algorithm, e.g., finite differencing), parameterization error including effects of subgridscale processes, errors in external forcing and boundary conditions

(a) Could be regarded as either process noise or representation error, depending on definition of model state. See text for discussion.

A particular example is tides (and residual tidal er-
rors) in altimetric measurements. While typically treated as
representation error, for free-surface models, the lack of tidal
forcing (or inaccuracies thereof) could equally be regarded
as process noise as well. (External tides are always repre-
sentation errors for rigid lid models which lack the physics
of external gravity waves.) Other examples of similar nature
include effects of baroclinic instability in 1.5-layer models
(e.g., Hirose
et al.,
1999), and external variability propa-
gating in through open boundaries (e.g., Lee and Marotzke,
1998). Although both types of model lack the physics of the
respective "forcing," the resulting variability, such as propagating
waves within the model domain, could be resolved as
being a result of process noise.
5.1.2. Prescribing Weights
Instrument errors (e in Eq. [16]) and data representation
errors are relatively well known from comparisons among
different observing systems. Discussions of errors in altimet-
ric measurements can be found in Chapter 1. Model errors,
including errors of the initial condition, process noise, and
model representation error, are far less accurately known.
In practice, prior uncertainties of data and model are often
simply guesses, whose consistency must be examined based
on results of the assimilation (cf. Section 5.2). In particular,
error covariances (i.e., off-diagonal elements of the covari-
ance matrix including temporal correlations and biases) are
often assumed to be nil for simplicity or for lack of sufficient
knowledge that suggests alternatives.
One of the largest sources of model error is considered to
be forcing error. While some knowledge exists of the accu-
racy of meteorological forcing fields, estimates are far from
complete; geographic variations are not well known and es-
timates particularly lack measures of error
covariances. In fact, an accurate assessment of atmospheric forcing errors
has been identified as one of the most urgent needs for ocean
state estimation (WOCE International Project Office, 1998).
The problem of estimating a priori error covariances is
generally known in estimation theory as adaptive filtering.
Many of these methods are based on statistics of the so-
called innovation sequence, i.e., the difference between data
and model estimates based on past observations. Prior er-
rors are chosen and/or estimated so as to optimize certain
properties of the innovation sequence. For instance, Gas-
par and Wunsch (1989) adjusted the model process noise
so as to minimize the innovation sequence. Blanchet
et al.
(1997) compared several adaptive Kalman filtering methods
in a tropical Pacific Ocean model using maximum likelihood
estimates for the error. Hoang
et al.
(1998) put forth an al-
ternate adaptive approach, whereby the Kalman gain matrix
(the filter) itself is estimated parametrically as opposed to
the errors. Such an approach is effective because the filter is
in effect only dependent on the ratio of data and model con-
straint errors and not on the absolute error magnitude, but
the resulting state lacks associated error estimates.
Fu
et al.
(1993) introduced an "off-line" approach in
which a priori errors are estimated prior to assimilation
based on comparing observations with a model
simulation,
i.e., a model run without assimilation. The method is sim-
ilar to a class of adaptive filtering methods termed "covari-
ance matching" (e.g., Moghaddamjoo and Kirlin, 1993). The
particular estimate assumes stationarity and independence
among different errors and the signal, and is described below
with simplifications suggested by R. Ponte (personal com-
munication, 1997). First we identify data y and its model
equivalent $m = \mathcal{H}(x)$ (simulation) as being the sum of the
true signal $s = \mathcal{H}(\bar{x})$ plus their respective errors r and p:
$$y = s + r \qquad (18)$$
$$m = s + p. \qquad (19)$$
Then, assuming the true signal and the two errors are mu-
tually uncorrelated with zero means, the covariance among
data and its model equivalent can be written as:
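$$\langle y y^T \rangle = \langle s s^T \rangle + \langle r r^T \rangle \qquad (20)$$
$$\langle m m^T \rangle = \langle s s^T \rangle + \langle p p^T \rangle \qquad (21)$$
$$\langle y m^T \rangle = \langle s s^T \rangle \qquad (22)$$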
where angle brackets denote statistical expectation. By sub-
stituting the brackets with temporal and/or spatial averages
(assuming ergodicity), one can estimate the left-hand sides
of Eqs. (20) to (22) and solve for the individual terms on
the right-hand sides. In particular, the error covariances of
the data constraint and the simulated model state can be es-
timated as,
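$$\langle r r^T \rangle = \langle y y^T \rangle - \langle y m^T \rangle \qquad (23)$$
$$\langle p p^T \rangle = \langle m m^T \rangle - \langle y m^T \rangle. \qquad (24)$$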
Equation (24) implicitly provides an estimate of model pro-
cess noise Q (Eq. [15]) since the model state error of the
simulation p is a function of the former. (The state error can
be regarded as independent of initial error for sufficiently
long simulations.) Therefore, Eq. (24) can be used to cali-
brate process noise Q.
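The estimates of Eqs. (23) and (24) can be checked on synthetic data. The Python sketch below generates a scalar "signal" and two mutually independent, zero-mean errors with known variances (all hypothetical), forms the data and the simulation as in Eqs. (18) and (19), and replaces the expectations with time averages.

```python
# Covariance matching sketch for Eqs. (20)-(24) with scalar time series.
# Synthetic signal and errors with known variances stand in for real data.
import random

random.seed(0)
n = 50000
var_s, var_r, var_p = 4.0, 1.0, 2.25   # "true" variances (illustrative)

signal = [random.gauss(0.0, var_s ** 0.5) for _ in range(n)]
y = [s + random.gauss(0.0, var_r ** 0.5) for s in signal]   # Eq. (18)
m = [s + random.gauss(0.0, var_p ** 0.5) for s in signal]   # Eq. (19)


def mean_product(a, b):
    """Time average standing in for the expectation <a b>."""
    return sum(ak * bk for ak, bk in zip(a, b)) / len(a)


yy = mean_product(y, y)
mm = mean_product(m, m)
ym = mean_product(y, m)

var_r_estimate = yy - ym   # Eq. (23): data constraint error variance
var_p_estimate = mm - ym   # Eq. (24): simulation error variance
var_s_estimate = ym        # Eq. (22): signal variance
```

The recovered variances agree with the prescribed ones only to within sampling error, and only because the synthetic signal and errors satisfy the assumptions of stationarity and mutual independence by construction.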
An example of error estimates based on Eqs. (23)
and (24) is shown in Figure 9 (see color insert) (Fukumori et al.,
1999). The data are altimetric sea level from
TOPEX/POSEIDON (T/P), and the model is a coarse-resolution
(2° × 1° × 12 vertical levels) global general circulation
model based on the NOAA Geophysical Fluid Dynamics
Laboratory's Modular Ocean Model (Pacanowski et al.,
1991), forced by National Centers for Environmental Prediction
winds and climatological heat fluxes (Comprehensive
Ocean-Atmosphere Data Set, COADS).
Errors of the data constraint (Figure 9a, Eq. [23]) and
those of the simulated model state (Figure 9b, Eq. [24]) are
both spatially varying, reflecting the inhomogeneities in the
physics of the ocean. In particular, the data constraint er-
ror (Figure 9a) is dominated by meso-scale variability (e.g.,
western boundary currents) that constitutes representation
error for the particular model, and is much larger than the
corresponding model state error estimate (Figure 9b) and the
instrumental accuracy of T/P (2 to 3 cm). Process noise was
modeled in the form of wind error (Figure 9c) and calibrated
such that the resulting simulation error (Figure 9d) (solu-
tion of the Lyapunov Equation, which is the time-asymptotic
limit of the Riccati Equation with no observations; see for
example, Gelb, 1974) is comparable to the estimate based on
Eq. (24), i.e., Figure 9b. Similar methods of calibrating er-
rors were employed in assimilating Geosat data by Fu
et al.
(1993) and TOPEX measurements by Fukumori (1995).
Menemenlis and Chechelnitsky (2000) extended the ap-
proach of Fu
et al.
(1993) by using only model-data differ-
ences (residuals),
$$\langle (y - m)(y - m)^T \rangle = \langle r r^T \rangle + \langle p p^T \rangle, \qquad (25)$$
and not assuming uncorrelated signal and model errors. (The
two errors, r and p, are assumed to be uncorrelated.) To sep-
arately estimate R and Q (equivalently P) in Eq. (25), the
time-lagged covariance of the residuals is further employed,
$$\langle (y(t) - m(t))\,(y(t + \Delta t) - m(t + \Delta t))^T \rangle = \langle p(t)\,p(t + \Delta t)^T \rangle \qquad (26)$$
where data constraint error, r, is assumed to be uncorre-
lated in time. Menemenlis and Chechelnitsky (2000) esti-
mate the a priori errors by matching the empirical estimates
of Eqs. (25) and (26) with those based on theoretical esti-
mates using the model and a parametrically defined set of
error covariances.
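A much simplified scalar version of this matching is sketched below in Python. Instead of theoretical estimates derived from a model, it assumes that the simulation error p is a first-order autoregressive process with a known lag-one correlation, which plays the role of the parametric error model; the data constraint error r is white in time. These choices and all numbers are assumptions made for illustration only.

```python
# Sketch of Eqs. (25)-(26) for scalar residuals d = y - m = r - p.
# Assumptions (illustrative): r is white in time, p is a first-order
# autoregressive process with known lag-one correlation phi.
import random

random.seed(1)
n = 100000
phi = 0.8                    # assumed lag-one correlation of model error p
var_r, var_p = 1.0, 2.25     # "true" variances to be recovered

p = random.gauss(0.0, var_p ** 0.5)
residuals = []
for _ in range(n):
    r = random.gauss(0.0, var_r ** 0.5)
    residuals.append(r - p)
    # Advance p so that its variance stays at var_p.
    p = phi * p + random.gauss(0.0, (var_p * (1.0 - phi ** 2)) ** 0.5)

# Eq. (25): zero-lag covariance of the residuals, <rr> + <pp>.
lag0 = sum(d * d for d in residuals) / n
# Eq. (26): lag-one covariance of the residuals, <p(t) p(t + dt)>.
lag1 = sum(a * b for a, b in zip(residuals, residuals[1:])) / (n - 1)

var_p_estimate = lag1 / phi          # since <p(t) p(t + dt)> = phi * var_p
var_r_estimate = lag0 - var_p_estimate
```

The lagged covariance isolates the model error because the white data constraint error contributes nothing at nonzero lag; the zero-lag covariance then yields the remaining term.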
Temporally correlated data errors and/or model process
noise require augmenting the problem that is solved. For
instance, the expansion in Eq. (8) assumes temporal inde-
pendence among the constraints in the assimilation prob-
lem, Eq. (7). Time correlated errors include biases, caused
for example, by uncertainties in model parameters and errors
associated with closed passageways in the ocean. The aug-
mentation is typically achieved by including the temporally
correlated error as part of the estimated state and by explic-
itly modeling the temporal dependence of the noise, for in-
stance, by persistence or by a low-order Gauss-Markov pro-
cess (e.g., Gelb, 1974). The modification amounts to trans-
forming the problem (Eq. [7]) into one with temporally un-
correlated errors at the cost of increasing the size of the es-
timated state. Dee and da Silva (1998) describe a reformula-
tion allowing estimation of model biases separately from the
model state in the context of sequential estimation. Derber
(1989) and Griffith and Nichols (1996) examine the prob-
lem of model bias and correlated model process noise in the
framework of the adjoint method.
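The augmentation can be sketched for a scalar model in Python. The example assumes a linear model driven by a first-order Gauss-Markov noise; the coefficients and function names are hypothetical. The augmented two-element system is driven only by white noise, yet reproduces the original evolution exactly.

```python
# State augmentation sketch: a scalar model driven by time-correlated
# noise b is rewritten as a two-element system driven by white noise w.
# Assumptions (illustrative): linear model x_{i+1} = a x_i + b_i and a
# first-order Gauss-Markov process b_{i+1} = phi b_i + w_i.
import random

a, phi = 0.95, 0.7


def step_original(x, b, w):
    """Original form: the noise b carries memory from step to step."""
    return a * x + b, phi * b + w


def step_augmented(z, w):
    """Augmented form z = [x, b]: z_{i+1} = A z_i + [0, w_i]."""
    transition = [[a, 1.0],
                  [0.0, phi]]
    x_next = transition[0][0] * z[0] + transition[0][1] * z[1]
    b_next = transition[1][0] * z[0] + transition[1][1] * z[1] + w
    return [x_next, b_next]


random.seed(2)
white_noise = [random.gauss(0.0, 1.0) for _ in range(50)]

x, b = 1.0, 0.0
z = [1.0, 0.0]
for w in white_noise:
    x, b = step_original(x, b, w)
    z = step_augmented(z, w)
```

The cost of the reformulation is visible in the size of the state: the estimation problem now carries the noise b as an additional unknown.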
Finally, it should be noted that the significance of differ-
ent weights depends entirely on whether or not those dif-
ferences are resolvable by models and available observa-
tions. To the extent that different error estimates are indis-
tinguishable from each other, further improvement in mod-
eling a priori uncertainties is a moot point. The methods
described above provide a simple means of estimating the
errors, but their adequacy must be assessed through exami-
nation of individual results. Issues of verifying prior errors
and the goodness of resulting estimates are discussed in Sec-
tion 5.2.
5.1.3. Regularization and the Significance of Covariances
The data assimilation problem, being a rank-deficient in-
verse problem (see, for example, Wunsch, 1996), requires
a criterion for choosing a particular solution. To assure the
solution's regularity (e.g., spatial smoothness), specific regu-
larization or background constraints are sometimes imposed
in addition to the minimization of Eq. (8). For instance,
Sheinbaum and Anderson (1990), in investigating assimila-
tion of XBT data, used a smoothness constraint of the form,
$$(\nabla_H x)^2 + (\nabla_H^2 x)^2 \qquad (27)$$
where $\nabla_H$ is a horizontal gradient operator. The gradient
and Laplacian operators are linear operators and can be
expressed by matrices G and L, respectively. Then
Eq. (27) can be written
$$x^T G^T G x + x^T L^T L x = x^T (G^T G + L^T L) x. \qquad (28)$$
The weighting matrix $G^T G + L^T L$ is a symmetric nondi-
agonal matrix, and Eq. (27) can be recognized as a partic-
ular weighting of x. Namely, regularization constraints can
be specified in the weights already used in Eq. (8) by ap-
propriately prescribing their elements, particularly their off-
diagonal values. Alternatively, regularization may be viewed
as correcting inadequacies in the explicit weighting factors,
i.e., the prior covariance weighting, used in defining the as-
similation problem, Eq. (8).
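The equivalence between a smoothness constraint and a nondiagonal weighting can be verified numerically. The Python sketch below is one-dimensional and assumes a unit-spaced grid with simple first and second differences standing in for the gradient and Laplacian; the helper functions and the test vector are hypothetical.

```python
# One-dimensional sketch of Eq. (28) on a unit-spaced grid (an assumption).
# G is a first-difference (gradient) matrix, L a second-difference
# (Laplacian) matrix; both are built for a five-point state vector.

def first_difference(n):
    return [[(-1.0 if j == i else 1.0 if j == i + 1 else 0.0)
             for j in range(n)] for i in range(n - 1)]


def second_difference(n):
    return [[(1.0 if j in (i, i + 2) else -2.0 if j == i + 1 else 0.0)
             for j in range(n)] for i in range(n - 2)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]


n = 5
G, L = first_difference(n), second_difference(n)
GtG, LtL = matmul(transpose(G), G), matmul(transpose(L), L)
W = [[g + l for g, l in zip(row_g, row_l)] for row_g, row_l in zip(GtG, LtL)]

x = [0.0, 1.0, 4.0, 9.0, 16.0]
# Left-hand side of Eq. (28): squared gradient plus squared Laplacian.
penalty_direct = (sum(d * d for d in matvec(G, x))
                  + sum(d * d for d in matvec(L, x)))
# Right-hand side of Eq. (28): a single weighted quadratic form in x.
penalty_weighted = sum(xi * wi for xi, wi in zip(x, matvec(W, x)))
```

Both expressions give the same penalty, and the combined matrix W is symmetric with nonzero off-diagonal elements, which is how the smoothness requirement enters as a weight.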
Other physical constraints also render certain diagonal
weighting matrices unphysical. For instance, mass conser-
vation in the form of velocity nondivergence requires model
velocity errors to be nondivergent as well,
$$D(x - \bar{x}) = 0 \qquad (29)$$
where D is the divergence operator for the velocity compo-
nents of model state x. Then the covariance of the initial
model state error $P_0$ as well as the process noise Q should
be in the null space of D, e.g.,
$$D Q D^T = 0 \qquad (30)$$
which a diagonal Q will not satisfy.
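A two-element example makes the point explicit. The Python sketch below assumes two velocity components on either side of a grid cell, with a discrete divergence equal to their difference; the covariance values are hypothetical.

```python
# Two velocity components u1, u2 on either side of a grid cell; the
# discrete divergence is D x = u2 - u1 (unit spacing assumed).
# All matrices are illustrative.

def d_q_dt(d, q):
    """Scalar D Q D^T for a single-row divergence operator d."""
    qd = [sum(q_row[k] * d[k] for k in range(len(d))) for q_row in q]
    return sum(d[k] * qd[k] for k in range(len(d)))


D = [-1.0, 1.0]

Q_diagonal = [[0.04, 0.0],
              [0.0, 0.04]]       # independent errors in u1 and u2

Q_nondivergent = [[0.04, 0.04],
                  [0.04, 0.04]]  # errors identical in u1 and u2

divergence_variance_diagonal = d_q_dt(D, Q_diagonal)
divergence_variance_nondivergent = d_q_dt(D, Q_nondivergent)
```

The diagonal Q implies a nonzero variance of the divergence, i.e., errors that violate mass conservation, whereas the fully correlated Q satisfies Eq. (30).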
Data constraint covariances, in particular the off-diagonal
elements of the weighting matrices, are as important
in determining the optimality of the solution as are the error
variances, i.e., the diagonal elements. For example, Fu
and Fukumori (1996) examined effects of the differences in
covariances of orbit and residual tidal errors in altimetry. Or-
bit error is a slowly decaying function of time following the
satellite ground track, and is characterized by a dominating
period of once per satellite revolution around the globe. Ge-
ographically, errors are positively correlated along satellite
ground tracks, and weakly so across-track. While precision
orbit determination has dramatically decreased the magni-
tude of orbit errors, it is still the dominating measurement
uncertainty of altimetry (Table 1). Tidal error covariance is
characterized by large positive as well as negative values
about the altimetric data points, because of the narrow band
nature of tides and the sampling pattern of satellites. Conse-
quently, tidal errors have less effect on the accuracy of esti-
mating large-scale circulation than orbit errors of compara-
ble variance, because of the canceling effect of neighboring
positive and negative covariances.
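The canceling effect can be illustrated with a deliberately crude Python sketch. It takes the large-scale estimate to be a simple mean of a few neighboring data points and uses an idealized covariance, variance times rho to the power |i - j|, with a positive rho standing in for orbit-like errors and a negative rho for tide-like errors. This covariance model and its values are assumptions, not the actual error covariances of altimetry.

```python
# Error variance of a large-scale (here: simple mean) estimate formed
# from n neighboring altimetric data points with error covariance R.
# Illustrative covariance model: R[i][j] = variance * rho ** |i - j|.

def covariance(n, variance, rho):
    return [[variance * rho ** abs(i - j) for j in range(n)] for i in range(n)]


def variance_of_mean(r):
    """(1/n^2) times the sum of all elements of R."""
    n = len(r)
    return sum(sum(row) for row in r) / n ** 2


n, variance = 4, 1.0
R_orbit_like = covariance(n, variance, rho=0.5)    # positive correlations
R_tide_like = covariance(n, variance, rho=-0.5)    # alternating signs

mean_error_orbit_like = variance_of_mean(R_orbit_like)
mean_error_tide_like = variance_of_mean(R_tide_like)
```

Although both covariances have the same diagonal elements, the error variance of the mean is several times smaller in the alternating-sign case, because the positive and negative off-diagonal elements largely cancel in the sum.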
5.1.4. Extrapolation and Mapping of Altimeter Data
How best to process or employ altimeter data in data as-
similation has been a long-standing issue. The problems in-
clude, for example, vertical extrapolation (Hurlburt
et al.,
1990; Haines, 1991), horizontal mapping (Schröter
et al.,
1993), and data conversion such as sea level to geostrophic
velocity (Oschlies and Willebrand, 1996). (Issues concern-
ing reference sea level are discussed in Section 5.4.) Many of
these problems originate in utilizing simple ad hoc assimila-
tion methods and in altimetric measurements not directly be-
ing a prognostic variable of the models. For instance, many
primitive equation models utilize the rigid lid approximation
for computational efficiencies. For such models, sea level is
not a prognostic variable but is diagnosed instead from pres-
sure gradients against the sea surface, which is dependent
on stratification (dynamic height) and barotropic circulation
(e.g., Pinardi et al., 1995). Altimeters also measure significant
wave height whereas the prognostic variable in wave
models is spectral density of the waves (e.g., Bauer et al.,
1992).
From the standpoint of estimation theory, there is no fun-
damental distinction between assimilating prognostic or di-
agnostic quantities, as both variables can be defined and
utilized through explicit forward relationships of similar
form, Eq. (2). That is, no explicit mapping of data to
model grid is required, and free surface models provide
no more ease in altimetric assimilation than do rigid-lid
models. What enables estimation theory to translate obser-
vations into unique modifications of model state in effect
are the weights in Eq. (8). For instance, specifying data
and model uncertainties uniquely defines the Kalman filter
which sequentially maps data to the entire model state (Sec-
tion 4.5). The Kalman filter determines the optimal extrapo-
lation/interpolation by time-integration of the model state er-
ror covariance. The covariance defines the statistical relation
between uncertainties of an arbitrary model variable and that
of another variable, either being prognostic or diagnostic.
The covariance computed in Kalman filtering, by virtue of
model integration, is dynamically consistent and reflects the
propagation of information in space, time, and among dif-
ferent properties. Least-squares methods achieve the equiv-
alent implicitly through direct optimization of Eq. (8). To
the extent that model state errors are correlated, as they real-
istically would be by the continuous dynamics, the optimal
weights necessarily extrapolate surface information instan-
taneously in space (vertically and horizontally) and among
different properties.
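A minimal sketch of how the covariance carries an observed difference to an unobserved variable is given below, using the standard form of the Kalman gain, K = P H^T (H P H^T + R)^-1. The two-component state (sea level and a subsurface temperature), and all numerical values, are hypothetical and chosen only to make the arithmetic transparent.

```python
import numpy as np

def kalman_gain(P, H, R):
    """Standard Kalman gain, K = P H^T (H P H^T + R)^-1."""
    S = H @ P @ H.T + R
    return P @ H.T @ np.linalg.inv(S)

# Illustrative two-component state: [sea level, subsurface temperature].
P = np.array([[4.0, 1.5],
              [1.5, 1.0]])               # model state error covariance (assumed)
H = np.array([[1.0, 0.0]])               # only sea level is observed
R = np.array([[1.0]])                    # data error covariance (assumed)

K = kalman_gain(P, H, R)
innovation = np.array([1.0])             # unit model-data sea level difference
increment = K @ innovation               # correction applied to the full state

# Without the off-diagonal covariance, the unobserved component is not corrected.
K_uncorrelated = kalman_gain(np.diag(np.diag(P)), H, R)
```

Here the sea level difference changes the unobserved temperature only because the two errors are correlated; in an actual filter that correlation results from integrating the error covariance with the model rather than from direct specification.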
Figures 10 and 11 show some structures of the Kalman
gain based on the model and errors of Figure 9. Reflecting
the inhomogeneous nature of wind-driven large-scale sea
level changes (Fukumori
et al.,
1998), Figure 10 shows sea-level differences between
model and data largely being mapped to baroclinic changes
(model state increments) (black curve) in the tropics and
barotropic changes (gray curve) at higher latitudes. Hori-
zontally, the modifications reflect the dynamics of the back-
ground state (Figure 11). For instance, the effect of a sea-
level difference at the equator (Figure 11B) is similar to the
effects of local wind-forcing (the assumed error source), that
is a Kelvin wave with temperature and zonal velocity anoma-
lies centered on the equator and an associated Rossby wave
of opposite phase to the west of the Kelvin wave with off-
equatorial maxima. The Antarctic Circumpolar Current and
the presence of the mid-ocean ridge elongates stream func-
tion changes in the Southern Ocean in the east-west direc-
tion (Figure 11A). The ocean physics render structures of the
model error covariance, and thus the optimal filter, spatially
inhomogeneous and anisotropic. Such complexity makes it
difficult to directly specify an extrapolation scheme for al-
timetry data, as done in ad hoc schemes of data assimilation
(Section 4).
Because mapping is merely a combination of data and
statistical information (e.g., Bretherton
et al.,
1976), the
information content of a mapped sea level should be no
more than what is already available from data along satel-
lite ground tracks and the weights used in mapping the data.
However, mapping procedures can potentially filter out or
alias oceanographic signals if the assumed statistics are inaccurate.
In particular, sea level at high latitudes contains
variability with periods of a few days, which is shorter than
the Nyquist period of most altimetric satellites (Fukumori
et al.,
1998). Therefore a mapping of altimetric measure-
ments must be carefully performed to avoid possible aliasing
of high frequency variability. The simplest and most prudent
approach would be to assimilate along-track data directly.
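The aliasing concern can be made concrete with a short calculation. The sketch below assumes a hypothetical satellite with a 10-day repeat interval and a sea level signal with a 3-day period; both numbers, and the helper `alias_period`, are illustrative assumptions rather than values from the text.

```python
import numpy as np

def alias_period(signal_period, sampling_interval):
    """Apparent period of a sinusoid when sampled at a fixed interval."""
    f = 1.0 / signal_period
    fs = 1.0 / sampling_interval
    f_alias = abs(f - round(f / fs) * fs)   # fold the frequency into [0, fs/2]
    return np.inf if f_alias == 0 else 1.0 / f_alias

repeat = 10.0                               # days between samples (assumed)
nyquist_period = 2.0 * repeat               # shortest resolvable period
apparent = alias_period(3.0, repeat)        # 3-day signal appears much slower

# At the sampling times the fast and the aliased signals are indistinguishable.
t = repeat * np.arange(12)
fast = np.sin(2.0 * np.pi * t / 3.0)
slow = np.sin(2.0 * np.pi * t / apparent)
```

Under these assumptions the 3-day variability is indistinguishable, at the sampling times, from a signal with a period of about 30 days, which a mapping procedure with inaccurate statistics could mistake for low-frequency ocean variability.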
FIGURE 10 Property of a Kalman gain. The figure shows zonally averaged sea level change (cm) as a function of latitude associated with Kalman filter changes in model state (baroclinic displacement [black], barotropic circulation [gray]) corresponding to an instantaneous 1 cm model-data difference. The estimates are strictly local reflecting sea-level differences at each separate grid point. The model is a global model based on the GFDL MOM. The Kalman filter assumes process noise in the form of wind error (Figure 9). (Adapted from Fukumori et al. (1999), Plate 2.)
FIGURE 11 Examples of a Kalman gain's horizontal structure. The figures describe changes in a model corresponding to assimilating a 1 cm sea level difference between data and model at the asterisks. The model and errors are those in Figure 9. The figures are (A) barotropic mass transport stream function (c.i. 2 × 10⁻¹⁰ cm³/sec) and (B) temperature at 175 m (c.i. 4 × 10⁻⁴ °C). Positive (negative) values are shown in solid (dashed) contours. Arrows are barotropic (A) and baroclinic (B) velocities. To reduce clutter, only a subset of vectors is shown where values are relatively large. The assumed data locations are (A) 60°, 170° and (B) 0°, 170°. Corresponding effects of the changes on sea level are small due to relatively large magnitudes of data error with respect to model error; changes are 0.02 and 0.03 cm at the respective data locations for (A) and (B). (Adapted from Fukumori et al. (1999), Fig. 4.)
5.2. Verification and the Goodness of Estimates
Improvements achieved by data assimilation not only require
accurate solution of the assimilation problem (Section 4),
but also depend on the accuracy of the assumptions
underlying the definition of the problem itself (Eq. [7]), in
particular the a priori errors of the model and data constraints
(Section 5.1). The validity of the assumptions must be carefully
assessed to assure the quality and integrity of the estimates.
At the same time, the nature of the assumptions must
be fully appreciated to properly interpret the estimates.
If a priori covariances are correct and the problem is
solved consistently, results of the assimilation should necessarily
be an improvement over prior estimates. In particular,
the minimum variance estimate by definition should become
more accurate than prior estimates, including simulations
without assimilation or the assimilated observations them-
selves. Mathematically, the improvement is demonstrated,
for example, by the minimum variance estimate's accuracy
(inverse of error covariance matrix, P) being the sum of the
prior model and data accuracies (e.g., Gelb, 1974),
$$\mathbf{P}^{-1} = \mathbf{P}(-)^{-1} + \mathbf{H}^{T}\mathbf{R}^{-1}\mathbf{H} \qquad (31)$$
where the minus sign in the argument denotes the model
state error prior to assimilation. Consequently, the trace of
the model state error covariance matrix is a nonincreasing
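function of the amount of assimilated observations. More
generally, because $\mathbf{H}^{T}\mathbf{R}^{-1}\mathbf{H}$ is nonnegative definite, Eq. (31)
implies that $\mathbf{P}(-) - \mathbf{P}$ is nonnegative definite as well, so that
for arbitrary vectors, h,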
$$\mathbf{h}^{T}\mathbf{P}\mathbf{h} \le \mathbf{h}^{T}\mathbf{P}(-)\mathbf{h}. \qquad (32)$$
Equation (32) implies that not only diagonal elements of
P but errors of any linear function of the minimum vari-
ance estimate are smaller than those of non-assimilated esti-
mates. Therefore, for linear models at least, assimilated esti-
mates will not only have smaller errors for the model equiv-
alent of the observations but will also have smaller errors
for model state variables not directly measured as well as
the model's future evolution. In the case of altimetric as-
similation, unless incorrect a priori covariances are used,
the model's entire three-dimensional circulation will be im-
proved from, or should be no worse than, prior estimates.
(For nonlinear models, such improvement cannot be proven
in general, but a linear approximation is a good approxi-
mation in many practical circumstances.) Given the equiva-
lence of minimum variance solutions with other assimilation
methods (Section 4), these improvements apply equally
well to other estimations, provided the assumptions are the
same.
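Equations (31) and (32) can be checked numerically for a small linear problem. In the sketch below the dimensions, the randomly generated prior covariance and observation matrix, and the data error covariance are all arbitrary assumptions; the final lines compare the result with the familiar gain form of the update, P = (I - K H) P(-), which is algebraically equivalent.

```python
import numpy as np

rng = np.random.default_rng(0)

n, m = 4, 2                              # state and observation dimensions (assumed)
A = rng.standard_normal((n, n))
P_prior = A @ A.T + n * np.eye(n)        # symmetric positive definite P(-)
H = rng.standard_normal((m, n))
R = np.diag([0.5, 2.0])

# Eq. (31): accuracies (inverse covariances) add.
P_post = np.linalg.inv(np.linalg.inv(P_prior) + H.T @ np.linalg.inv(R) @ H)

# Eq. (32): evaluate h^T P h for many arbitrary vectors h (columns of h).
h = rng.standard_normal((n, 1000))
quad_prior = np.einsum("ik,ij,jk->k", h, P_prior, h)
quad_post = np.einsum("ik,ij,jk->k", h, P_post, h)
eq32_holds = bool(np.all(quad_post <= quad_prior + 1e-12))

# Equivalent gain form of the covariance update.
K = P_prior @ H.T @ np.linalg.inv(H @ P_prior @ H.T + R)
P_gain_form = (np.eye(n) - K @ H) @ P_prior
```

The inequality holds for every sampled h, including those that pick out state components not touched by H, which is the sense in which unmeasured variables are also improved when the a priori covariances are correct.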
Various measures are used to assess the adequacy of a pri-
ori assumptions. For instance, the particular form of Eq. (8),
as in most applications, assumes a priori errors being uncor-
related in time. Then, if a priori errors are chosen correctly,
the optimal estimate will extract all the information content
from the observations except for noise, making the innovation
sequence uncorrelated in time. Blanchet et al. (1997)
used such a measure to assess the adequacy of adaptively
estimated uncertainties. However, in practice, representation
errors (Section 5.1.1) often dominate model and data
differences, such that strict whiteness in residuals cannot al-
ways be anticipated. As in the definition of the assimilation
problem, the distinction of signal and representation error is
once again crucial in assessing the goodness of the solution.
The improvement that is expected of the model estimate is
that of the signal as defined in Section 5.1.1, and not of the
complete state of the ocean.
Another quantitative measure of assessing adequacies of
prior assumptions is the relative magnitude of a posteriori
model-data differences with respect to their a priori expecta-
tions. For instance, the Kalman filter provides formal uncer-
tainty estimates with which to measure magnitudes of actual
model-data differences. Figure 12 (see color insert) shows an
example comparing residuals (i.e., model-data differences;
Figure 12A) and their expectations (Figure 12B) from as-
similating TOPEX/POSEIDON data using the Kalman filter
described by Figure 9. The comparable spatial structures and
magnitudes over most regions demonstrate the consistency
of the a priori assumptions with respect to model and data.
For least-squares estimates, the equivalent would be for each
term in Eq. (8) to be of order one (or of comparable magnitude)
after assimilation (e.g., Lee and Marotzke, 1998).
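Both consistency measures discussed above, the whiteness of the innovation sequence and the magnitude of model-data differences relative to their expectation, can be illustrated with a scalar twin experiment. The sketch assumes a random-walk "truth", a persistence model, and error variances `q` and `r` that are known exactly; none of these choices comes from the text, and real applications are complicated by representation error as noted.

```python
import numpy as np

rng = np.random.default_rng(1)
q, r, n_steps = 0.5, 1.0, 5000           # process and data error variances (assumed)

# Synthetic truth (random walk starting from zero) and noisy observations.
truth = np.cumsum(rng.normal(0.0, np.sqrt(q), n_steps))
data = truth + rng.normal(0.0, np.sqrt(r), n_steps)

x, p = 0.0, 0.0                          # initial state is known exactly here
normalized = np.empty(n_steps)
for k in range(n_steps):
    p = p + q                            # forecast error variance (persistence model)
    s = p + r                            # expected innovation variance
    innov = data[k] - x                  # innovation: data minus forecast
    normalized[k] = innov / np.sqrt(s)
    gain = p / s
    x = x + gain * innov                 # analysis
    p = (1.0 - gain) * p                 # analysis error variance

mean_square = np.mean(normalized**2)     # near 1 when a priori variances are consistent
lag1 = np.corrcoef(normalized[:-1], normalized[1:])[0, 1]   # near 0 when white
```

Repeating the experiment with a deliberately wrong `q` or `r` in the filter (but not in the synthetic data) moves `mean_square` away from one and introduces correlation at lag one, which is how such diagnostics flag inconsistent a priori assumptions.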
The model-data misfit should necessarily become smaller
following an assimilation because assimilation forces mod-
els towards observations. What is less obvious, however,
is what becomes of model properties not directly con-
strained. If solved correctly, assimilated estimates are nec-
essarily more accurate regardless of property. Then, com-
parisons of model estimates with independent observations
withheld from assimilation provide another, and possibly
the strongest, direct measure of the goodness of the partic-
ular assimilation and are one of the common means utilized
in assessing the quality of the estimates. For instance, Figure 4
in Section 2 compared an altimetric assimilation with
in situ measurements of subsurface temperature and velocity;
it showed not only improvements made by assimilation
but also their quantitative consistency with formal error estimates.
Others have compared results of an altimetric assimilation
with measurements from drifters (e.g., Schröter et al.,
1993; Morrow and De Mey, 1995; Blayo et al., 1997),
current meters (e.g., Capotondi et al., 1995b; Fukumori,
1995; Stammer, 1997; Blayo et al., 1997), hydrography (e.g.,
White et al., 1990a; Dombrowsky and De Mey, 1992; Oschlies
and Willebrand, 1996; Greiner and Perigaud, 1996;
Stammer, 1997), and tomography (Menemenlis et al., 1997).
To the extent that future observations contain information in-
dependent of past measurements, forecasting skills also pro-
vide similar measures of the assimilation's reliability (e.g.,
Figure 7, see also Lionello
et al.,
1995; Morrow and De
Mey, 1995). The so-called innovation vector in sequential
estimation, i.e., the difference of model and data immedi-
ately prior to assimilation (Section 5.1.2), provides a similar
measure of forecasting skill, albeit generally over a short period
(e.g., Figure 12; see also Gaspar and Wunsch, 1989; Fu
et al., 1993).
The comparative smallness of model-data differences, on
the one hand, does not by itself verify or validate the esti-
mation, but it does demonstrate a lack of any outright inade-
quacies in the calculation. On the other hand, an excessively
large difference can indicate an inconsistency in the calcula-
tion, but the presence of representation error precludes im-
mediate judgment and requires a careful analysis as to the
cause of the discrepancy. For instance, Figure 13 shows an
altimetric assimilation (gray curve) failing to resolve subsurface
temperature variability (solid curve) at two depths
with error estimates being much smaller than actual differences.
However, the lack of vertical coherence in the in situ
measurements suggests that the data are dominated by variations
with a vertical scale much smaller than the model's
resolution (150 m). Namely, the comparison suggests that
the model-data discrepancy is caused by model representation
error instead of a failure of assimilation. The formal error
estimates are much smaller than actual differences as the
estimate only pertains to the signal consistent with model
and data, and excludes effects of representation error (Section 5.1).
FIGURE 13 An example of model representation error. The example compares temperature anomalies (°C) at 2°, 165°; (A) 125 m, (B) 500 m. Different curves are in situ measurements (black; Tropical Atmosphere and Ocean array) and altimetric assimilation (gray solid). The simulation is hardly different from the assimilation and is not shown to reduce clutter. Bars denote formal error estimates. Model and assimilation are based on those described in Figure 9. (Adapted from Fukumori et al. (1999), Plate 5.)
Withholding observations is not necessarily required to
test consistencies of an assimilation. In fact, the optimal es-
timate by its very nature requires that all available obser-
vations be assimilated simultaneously. Equivalent tests of
model-data differences can be performed with respect to
properties of a posteriori differences of the estimate. How-
ever, from a practical standpoint, when inconsistencies are
found it may be easier to identify the source of the inaccu-
racy by assimilating fewer data and therefore having fewer
assumptions at a time.
5.3. Observability
Observability, as defined in estimation theory, is the abil-
ity to determine the state of the model from observations
in the absence of both model process noise and data con-
straint errors. Weaver and Anderson (1997) empirically ex-
amined the issue of observability from altimetry using twin
experiments. Mathematically, the degree of observability is
measured by the rank of the inverse problem, Eq. (6). In the
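absence of errors, the state of the model is uniquely determined
by the initial condition, $\mathbf{x}_0$, in terms of which the
left-hand side of Eq. (6) may be rewritten,
$$\begin{pmatrix} \mathcal{H}_i(\mathbf{x}_i) \\ \vdots \\ \mathbf{x}_{j+1} - \mathcal{F}_j(\mathbf{x}_j) \\ \vdots \end{pmatrix} = \begin{pmatrix} \mathcal{H}_i \mathcal{F}_0^{i} \\ \vdots \\ \mathcal{F}_0^{j+1} - \mathcal{F}_0^{j+1} \\ \vdots \end{pmatrix} \mathbf{x}_0 \qquad (33)$$
where the model $\mathcal{F}$ was assumed to be linear, and $\mathcal{F}_i^{j}$
denotes integration from time i to j. The process noise being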
zero, the model equations are identically satisfied, and there-
fore the rank of Eq. (33) is equivalent to that of the equations
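regarding observations alone; viz.,
$$\begin{pmatrix} \mathcal{H}_M \mathcal{F}_0^{M} \\ \vdots \\ \mathcal{H}_i \mathcal{F}_0^{i} \\ \vdots \\ \mathcal{H}_0 \end{pmatrix} \mathbf{x}_0 \qquad (34)$$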
where M denotes the total incidences of observations. The
rank and the range space of the coefficient matrix respec-
tively determine how many and what degrees of freedom are
uniquely determined by the observations. In particular, when
the rank of the coefficient matrix equals the dimension of x
(i.e., full rank), all components of the model can be uniquely
determined and the model state is said to be completely ob-
servable.
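The rank test can be sketched for a toy problem in which sea level is the sum of two modal amplitudes. The time-invariant matrices `F` and `H`, the modal evolution factors, and the helper `observability_matrix` are hypothetical simplifications of the coefficient matrix in Eq. (34), introduced only for illustration.

```python
import numpy as np

def observability_matrix(F, H, n_times):
    """Stack H F^i for i = 0..n_times-1 (time-invariant analogue of Eq. 34)."""
    rows, Fi = [], np.eye(F.shape[0])
    for _ in range(n_times):
        rows.append(H @ Fi)
        Fi = F @ Fi
    return np.vstack(rows)

H = np.array([[1.0, 1.0]])               # sea level = sum of two modal amplitudes
F_distinct = np.diag([0.9, 0.5])         # modes evolve at different rates (assumed)
F_same = np.diag([0.9, 0.9])             # modes evolve identically (assumed)

# A single observation time cannot separate the two modes.
rank_one_time = np.linalg.matrix_rank(observability_matrix(F_distinct, H, 1))
# Several times separate them only if the modes evolve differently.
rank_distinct = np.linalg.matrix_rank(observability_matrix(F_distinct, H, 3))
rank_same = np.linalg.matrix_rank(observability_matrix(F_same, H, 3))
```

With distinct evolution the stacked matrix reaches full rank (two) once more than one observation time is included, whereas identical evolution leaves the rank at one regardless of how many times are observed.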
Hurlburt (1986) and Berry and Marshall (1989), among
others, have explored the propagation of surface data into
subsurface information. While on one hand, sequential as-
similation transfers surface information into the interior of
the ocean, on the other hand, future observations also contain
information of the past state. That is, the entire temporal
evolution of the measured property, viz., indices i = 0, ..., M
in Eq. (34), provides information in determining the model
state and thus the observability of the assimilation problem.
Webb and Moore (1986) provide a physical illustration of the
significance of the measured temporal evolution in the con-
text of altimetric observability. Namely, as baroclinic waves
of different vertical modes propagate at different speeds, the
phase among different modes will become distinct over time
and thus distinguishable, by measuring the temporal evo-
lution of sea level. Thus dynamics allows different model
states that cannot be distinguished from each other by ob-
servations alone to be differentiated (Miller, 1989). Math-
ematically, the "distinguishability" corresponds to the rows
of Eq. (34) being independent from each other. In fact, most
components of a model are theoretically observable from
altimetry, as any perturbation in model state will eventu-
ally lead to some numerical difference in sea level, even
though perhaps with a significant time-lag and/or with in-
finitesimal amplitude. Miller (1989) demonstrated observ-
ability of model states from measurements of temporal dif-
ferences, such as those provided by an altimeter (see also
Section 5.4). Fukumori
et al. (1993)
demonstrated the com-
plete observability (i.e., observability of the entire state) of
a primitive equation model from altimetric measurements
alone.
Observability, as defined in estimation theory, is a deter-
ministic property as opposed to a stochastic property of the
assimilation problem. In reality, however, data and model er-
rors cannot be ignored and these errors restrict the degree
to which model states can be improved even when they are
mathematically observable, and thus limit the usefulness of
the strict definition and measure of observability. What is
of more practical significance in characterizing the ability
to determine the model state is the estimated error of the
model state, in particular the difference of the model state
error with and without assimilation. For example, Fukumori
et al. (1993) show that the relative improvement by
altimetric assimilation of the depth-dependent (internal or
baroclinic mode) circulation is larger than that of the
depth-averaged (external or barotropic mode) component, owing
to differences in the relative spin-up time-scales. Actual
improvements of unmeasured quantities are also often used to
measure the fidelity in assimilating real observations (e.g.,
Figure 4 and the examples in Section 5.2).
5.4. Mean Sea Level
Because of our inadequate knowledge of the marine
geoid, altimetric sea level data are often referenced to their
time-mean, that is, the sum of the mean dynamic sea surface
topography and the geoid. The unknown reference surface
makes identifying the model equivalent of such "altimetric
residuals" (Eq. [2]) somewhat awkward, necessitating con-
sideration as to the appropriate use of altimetric measure-
ments. One of several approaches has been taken in practice,
including direct assimilation of temporal differences, using
mean model sea level in place of the unknown reference, and
estimating the mean from separate observations.
The temporal difference of model sea level is a direct
equivalent of altimetric variability. Miller (1989), therefore,
formulated the altimetric assimilation problem by directly
assimilating temporal differences of sea level at successive
instances by expanding the definition of the model state vec-
tor to include model states at corresponding times. Alterna-
tively, Verron (1992), modeling the effect of assimilation as
stretching of the surface layer, reformulated the assimila-
tion problem into assimilating the tendency (i.e., temporal
change) of model-data sea level differences, thereby elim-
inating the unknown time-invariant reference surface from
the problem.
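That temporal differences and anomalies about the time-mean are free of the unknown reference surface can be verified with synthetic numbers. In the sketch below the "dynamic" sea level and the time-invariant reference (labeled `geoid`) are random arrays invented for the example; only the algebra is of interest.

```python
import numpy as np

rng = np.random.default_rng(2)
n_times, n_points = 50, 6                # synthetic record size (assumed)

dynamic = rng.standard_normal((n_times, n_points))   # dynamic sea level (synthetic)
geoid = 100.0 * rng.standard_normal(n_points)        # unknown time-invariant reference
altimeter = dynamic + geoid                          # measured sea surface height

# Anomalies about the time-mean do not depend on the reference surface.
anomaly_data = altimeter - altimeter.mean(axis=0)
anomaly_dynamic = dynamic - dynamic.mean(axis=0)

# Differences between successive times do not depend on it either.
diff_data = np.diff(altimeter, axis=0)
diff_dynamic = np.diff(dynamic, axis=0)
```

The price of either operation is that the time-mean dynamic topography is removed together with the geoid, which is why the mean must be supplied or estimated by other means.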
The mean sea level of a model simulation is used in many
studies to reference altimetric variability (e.g., Oschlies and
Willebrand, 1996), which asserts that the model sea level
anomaly is equivalent to the altimetric anomaly. Using the
model mean to reference model sea level affirms that there
is no direct information of the mean in the altimetric residu-
als. In fact, for linear models, the model mean is unchanged
when assimilating altimetric variabilities (Fukumori
et al.,
1993). Yet for nonlinear physics, the model mean can be
changed by such an approach. Using a nonlinear QG model,
Blayo
et al.
(1994) employed the model mean sea level but
iterated the assimilation process until the resulting mean
converges between different iterations.
Alternatively, a reference sea level can also be obtained
from in situ measurements. For instance, Capotondi et al.
(1995b) and Stammer (1997) used dynamic height estimates
based on climatological hydrography in place of the un-
known time-mean altimetric reference surface. Morrow and
De Mey (1995) and Ishikawa
et al.
(1996) utilized drifter
trajectories as a means to constrain the absolute state of the
ocean.
In spite of their inaccuracies, geoid models have skill,
especially at large spatial scales, and this information may be
exploited in the estimation. For instance, Marshall (1985)
theoretically examined the possibility of determining mean
sea level and the geoid simultaneously from assimilating
altimetric measurements, taking advantage of differences
in spatial scales of the respective uncertainties. Thompson
(1986) and Stammer
et al. (1997)
further combined inde-
pendent geoid estimates in conjunction with hydrographic
observations.
Finally, Greiner and Perigaud (1994, 1996), noting non-
linear dependencies of the oceanic variability and the tem-
poral mean, estimated the time-mean sea level of the Indian
Ocean by assimilating sea level variabilities alone measured
by Geosat, and verified their results by comparisons with hy-
drographic observations (Figure 5).
6. SUMMARY AND OUTLOOK
The last decade has witnessed an unprecedented series
of altimetric missions that includes Geosat (1985-1989),
ERS-1 (1991-1996), TOPEX/POSEIDON (1992-present),
ERS-2 (1995-present), and Geosat Follow-On (1998-
present), whose legacy is anticipated to continue with
Jason-1 (to be launched in 2001) and beyond. At the same
time, advances in computational capabilities have prompted
increasingly realistic ocean circulation models to be devel-
oped and used in studies of ocean general circulation. These
developments have led to the recognition of the possibilities
of combining observations with models so as to synthesize
the diverse measurements into coherent descriptions of the
ocean; i.e., data assimilation.
Many advances in data assimilation have been accom-
plished in recent years. Assimilation techniques first devel-
oped in numerical weather forecasting have been explored
in the context of oceanography. Other assimilation schemes
have been developed or modified, reflecting properties of
ocean circulation. Methods based on estimation and control
theories have also been advanced, including various approx-
imations that make the techniques amenable to practical ap-
plications. Studies in ocean data assimilation are now evolv-
ing from demonstrations of methodologies to applications.
Examples can be found in practical operations, such as in
studies of weather and climate (e.g., Behringer et al., 1998),
tidal modeling (e.g., Le Provost et al., 1998), and wave
forecasting (e.g., Janssen et al., 1997).
Data assimilation provides an optimal estimate of the
ocean consistent with both model physics and observations.
By doing so, assimilation improves on what either a given
model or a set of observations alone can achieve. For in-
stance, although useful for theoretical investigations, mod-
eling alone is inaccurate in quantifying actual ocean circu-
lation, and observations by themselves are incomplete and
limited in scope.
Yet, data assimilation is not a panacea for compensating
all deficiencies of models and observing systems. A case in
point is representation error (Section 5.2). Mathematically,
data assimilation is an estimation problem (Eq. [7]) in which
the oceanic state is sought that satisfies a set of simultaneous
constraints (i.e., model and data). Consequently, the estimate
is limited in what it can resolve (or improve) by what observations
and models represent in common. While errors
caused by measuring instruments and numerical schemes
can be reduced by data assimilation, model and data repre-
sentation errors cannot be corrected or compensated by the
process. Overfitting models to data beyond what the mod-
els represent can have detrimental consequences leading the
assimilation to degrade rather than to improve model esti-
mates.
Recognizing such limits and properly accounting for the
different types of errors are imperative for making accurate
estimates and for interpreting the results. The a priori er-
rors of model and data in effect define the assimilation prob-
lem (Eq. [7]), and a misspecification amounts to solving the
wrong problem (Section 5.1.1). However, in spite of adap-
tive methods (Section 5.1.2), in practice, weights used in as-
similation are often chosen more or less subjectively, and a
systematic effort is required to better characterize and un-
derstand the a priori uncertainties and thereby the weights.
In particular, the significance of representation error is often
under-appreciated. Quantifying what models and observing
systems respectively do and do not represent is arguably the
most urgent and important issue in estimation.
In fact, identifying representation error is a fundamen-
tal problem in modeling and observing system assessment
and is the foundation to improving our understanding of the
ocean. Moreover, improving model and data representation
can only be achieved by advancing the physics in numerical
models and conducting comprehensive observations. Such
limitations and requirements of estimation exemplify the rel-
ative merits of modeling, observations, and data assimila-
tion. Although assimilation provides a new dimension to
ocean state estimation, the results are ultimately limited to
what models and observations resolve and our understand-
ing of their nature.
A wide spectrum of assimilation efforts presently ex-
ist. For example, on the one hand, there are fine-resolution
state-of-the-art models using relatively simple assimilation
schemes, and on the other there are near optimal assimi-
lation methods using simpler models. The former places a
premium on minimizing representation error while the lat-
ter minimizes the error of the resolved state. The differ-
ences in part reflect the significant computational require-
ments of modeling and assimilation and the practical choices
that need to be made. Such diversity will likely remain for
some time. Yet, differences between these opposite ends of
the spectrum are narrowing and should eventually become
indistinguishable as we gain further experience in applica-
tions.
In spite of formal observability, satellite altimetry, as
with other observing systems, cannot by itself accurately
determine the complete state of the ocean because of fi-
nite model errors, and to a lesser extent data uncertainties.
Various other data types must be analyzed and brought to-
gether in order to better constrain the estimates. Several ef-
forts have already begun in such an endeavor of simultane-
ously assimilating
in situ
observations with satellite altime-
try. Field experiments such as the World Ocean Circulation
Experiment (WOCE) and the Tropical Ocean Global Atmo-
sphere Program (TOGA) have collected an unprecedented
suite of
in situ
observations. In particular, the analysis phase