ESA-ESRIN, Frascati, Rome, Italy
Introduction to Data Assimilation
Alan O’Neill Data Assimilation Research Centre University of Reading
DARC
What is data assimilation?
Data assimilation is the technique whereby observational data are combined with output from a numerical model to produce an optimal estimate of the evolving state of the system.
DARCDARC
1 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
Why We Need Data Assimilation
• range of observations • range of techniques • different errors • data gaps • quantities not measured • quantities linked
DARC
DARCDARC
2 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
DARCDARC
Some Uses of Data Assimilation
• Operational weather and ocean forecasting • Seasonal weather forecasting • Land-surface process • Global climate datasets • Planning satellite measurements • Evaluation of models and observations DARCDARC
3 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
What We Want To Know
x(t) atmos. state vector
s(t) surface fluxes c model parameters
X(t) = (x(t),s(t),c) DARC
What We Also Want To Know
Errors in models Errors in observations What observations to make
DARC
4 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
DATA ASSIMILATION SYSTEM
Error Statistics Data model Cache observations A
O A Numerical F DAS Model
B
DARC
The Data Assimilation Process
observations forecasts compare reject adjust
estimates of state & parameters
errors in obs. & forecasts DARC
5 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
Retrievals and Assimilation
• Problems with retrievals – a priori state and poorly known errors • Problems with assimilation of radiance – systematic errors and cloud clearing – expensive (multi-channels) • Need effective interface – Information content with error analysis
DARCDARC
Statistical Approach to Data Assimilation
DARC
6 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
DARCDARC
Minimum Variance Combination of Data Unbiased, Uncorrelated Errors
~ α + β = 1 x = α x1 + β x2
2 2 Var(x1) = σ 1 Var(x2 ) = σ 2
DARC
7 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
Minimum Variance Combination of Data Unbiased, Uncorrelated Errors
~ 2 2 2 2 Var(x) = α σ 1 + (1 - α ) σ 2
¶ ~ 2 2 Var( x) = 2ασ 1 - 2(1 - α )σ 2 = 0 ¶α Þ α , β
DARC
Minimum Variance Combination of Data Unbiased, Uncorrelated Errors
- 1 æ x1 x 2 öæ 1 1 ö x~ = ç + ÷ç + ÷ ç 2 2 ÷ç 2 2 ÷ è σ 1 σ 2 øè σ 1 σ 2 ø
~ 2 2 var( x ) £ min(σ 1 ,σ 2 )
Best Linear Unbiased Estimate DARC
8 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
Variational Method
(x - x )2 (x - x )2 = 1 + 2 J (x) 2 2 σ 1 σ 2 J (x)
~x at min J (x) ~ x x DARC
Maximum Likelihood Estimate
• Obtain or assume probability distributions for the errors • The best estimate of the state is chosen to have the greatest probability, or maximum likelihood • If errors normally distributed,unbiased and uncorrelated, then states estimated by minimum variance and maximum likelihood are the same
DARC
9 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
Multivariate Case
æ x1 ö ç ÷ ç x2 ÷ state vector x(t) = ç ÷ ç ÷ ç x ÷ æ y ö è n ø ç 1 ÷ observation vector ç y2 ÷ ç ÷ è ym ø
DARC
Variance becomes Covariance Matrix
• Errors in xi are often correlated – spatial structure in flow – dynamical or chemical relationships • Variance for scalar case becomes Covariance Matrix for vector case COV
• Diagonal elements are the variances of xi • Off-diagonal elements are covariances between xi and xj
• Observation of xi affects estimate of xj DARC
10 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
Estimating Covariance Matrix for Observations, O
• O usually quite simple: – diagonal or – for nadir-sounding satellites, non-zero values between points in vertical only • Calibration against independent measurements
DARC
Estimating Covariance Matrix for Model, B
• Use simple analytical functions constructed experimentally, e.g.
Bij = σ iσ j exp(- dij )
where dij is the horizontal
distance between xi and x j • Run ensemble of forecasts from slightly different conditions and analyse error
growth. DARC
11 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
Methods of Data Assimilation • Optimal interpolation (or approx. to it)
• 3D variational method (3DVar)
• 4D variational method (4DVar)
• Kalman filter (with approximations)
DARCDARC
Optimal Interpolation
observation observation operator
xa = xb + K(y - H (xb ))
“analysis” • linearity H H “background” (forecast) • matrix inverse • limited area K = BHT HBHT + O - 1 ( ) DARC
12 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
D = y - H (x b ) at obs. point
D data void
D D D D D
xb D D
DARC
X observation model trajectory
t
DARC
13 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
Data Assimilation: an analogy
Driving with your eyes closed: open eyes every 10 seconds and correct trajectory
DARCDARC
Variational Data Assimilation
J(x) vary x to minimise J(x)
x x a DARC
14 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
Variational Data Assimilation J (x) =
T -1 (x - xb ) B (x - xb ) + (y - H (x))T O-1(y - H (x))
nonlinear operator assimilate y directly global analysis DARC
Choice of State Variables and Preconditioning
• Free to choose which variables to use to define state vector, x(t) • We’d like to make B diagonal – may not know covariances very well – want to make the minimization of J more efficient by “preconditioning”: transforming variables to make surfaces of constant J nearly spherical in state space DARC
15 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
Cost Function for Correlated Errors
x2
x1 DARC
Cost Function for
x2 Uncorrelated Errors
x1 DARC
16 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
Cost Function for Uncorrelated Errors x2 Scaled Variables
x1 DARC
4D Variational Data Assimilation obs. & errors given X(to), the forecast is deterministic
to t vary X(to) for best fit to data DARC
17 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
4D Variational Data Assimilation
• Advantages – consistent with the governing eqs. – implicit links between variables • Disadvantages – very expensive – model is strong constraint
DARC
Problem
model error covariances
DARC
18 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
Kalman Filter (expensive) Use model equations to propagate B forward in time. B B(t) KAL Analysis step as in OI
DARC
What are the benefits of data assimilation? • Quality control • Combination of data • Errors in data and in model • Filling in data poor regions • Designing observational systems • Maintaining consistency • Estimating unobserved quantities
DARCDARC
19 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
Some Applications of Data Assimilation
DARC
Skill Measures: Observation Increment, (O-F) • The difference between the forecast from the first guess, F, and the observations, O, also known as observed-minus- background differences or the innovation vector. • This is probably the best measure of forecast skill.
DARC
20 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
_ ECMWF Ozone monthly- + mean analysis increment. Sept. 1986 T DARC
DARC
21 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
Best observations to make to characterize the chemical system
DARC
Impact on NWP at the Met Office
Mar 99. 3D-Var and ATOVS Oct 99. ATOVS as radiances, Feb/Apr 01. 2nd satellites, SSM/I winds ATOVS + SSM/I Jul 99. ATOVS over Siberia, May 00. Retune sea-ice from SSM/I 3D-Var DARC
22 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
Impact of satellite data in NWP
DARC
Future Advanced Infrared Sounders
DARCDARC
23 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
IASI vs HIRS
DARCDARC
DARCDARC
24 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
DARCDARC
DARCDARC
25 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
DARCDARC
DARCDARC
26 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
MERIS ocean colour
DARCDARC
Regional Scale: Walnut Gulch (Monsoon 90)
Model Observation
Tombstone, AZ
0% 20%
Model with 4DDA
Houser et al., 1998 DARC
27 18th – 29th August 2003 ESA-ESRIN, Frascati, Rome, Italy
Land Initialization: Motivation
• Knowledge of soil moisture has a greater impact on the predictability of summertime precipitation over land at mid-latitudes than Sea Surface Temperature (SST).
DARC
Conclusions
• Data assimilation should be essential part of “ground-segment” of all satellite missions. • Aim should be to provide all data in “near real time”. • Need to find optimal ways to assimilate data – efficient interfaces. • Challenges: errors, resolution, data.
DARCDARC
28 18th – 29th August 2003