Local Ensemble data assimilation in OOPS
The Local Ensemble DA application is a generic application for running data assimilation with Local Ensemble Kalman filters. It can be extended to use any Local EnKF that updates state gridpoints independently from each other by using observations within a localization distance from a gridpoint.
The configuration file (e.g. letkf.yaml) for running the LocalEnsembleDA application has to contain the following sections:
geometry - geometry for the background and analysis files
background - ensemble background members (currently used both for computing H(x) and as backgrounds)
observations - describes observations, observation errors, observation operators used in the assimilation, and the horizontal localization
driver - describes optional modifications to the behavior of the LocalEnsembleDA driver
local ensemble DA - configuration of the local ensemble DA solver package
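The sections listed above can be laid out as in the following minimal skeleton. It assumes the five sections are top-level keys of the file; the contents of each section are model- and experiment-specific, so they are left as placeholder comments rather than working settings.

```yaml
# Skeleton of a LocalEnsembleDA configuration (e.g. letkf.yaml).
# Only the section names come from the list above; contents are placeholders.
geometry:
  # geometry for the background and analysis files (model-specific)
background:
  # ensemble background members
observations:
  # observations, observation errors, observation operators, localization
driver:
  # optional modifications to the driver behavior
local ensemble DA:
  # configuration of the local ensemble DA solver
```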
Supported modifications to the driver
Read HX from disk instead of computing it at run-time.
driver:
  read HX from disk: false  # default value
Compute the posterior observer and output test prints for the OmA (observation minus analysis) statistics. One might choose to set this flag to false in order to speed up completion of the LocalEnsembleDA solver.
driver:
  do posterior observer: true  # default value
Run LocalEnsembleDA in observer mode to compute HX offline. This works hand-in-hand with read HX from disk. One might choose to separate the run into these two steps because the more efficient round-robin distribution can be used when run as observer only: true.
driver:
  run as observer only: false  # default value
Save posterior mean. Requires “output” section in the yaml file.
driver:
  save posterior mean: false  # default value
Save posterior ensemble. Requires “output” section in the yaml file.
driver:
  save posterior ensemble: true  # default value
Save prior mean. Requires “output mean prior” section in the yaml file.
driver:
  save prior mean: false  # default value
Save posterior mean increment. Requires “output increment” section in the yaml file.
driver:
  save posterior mean increment: false  # default value
Save prior variance. Requires “output variance prior” section in the yaml file.
driver:
  save prior variance: false  # default value
Save posterior variance. Requires “output variance posterior” section in the yaml file.
driver:
  save posterior variance: false  # default value
By default, LocalEnsembleDA updates the obs config with the geometry info relevant to this PE. This is needed for the Halo distribution to work properly. If not using the Halo distribution, or if using models that do not implement grid decomposition (e.g. l95), one might choose not to update the obs config by setting update obs config with geometry info: false.
driver:
  update obs config with geometry info: true  # default value
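Several of these flags can be set together in one driver section. The fragment below is a sketch that uses only the keys described above; the chosen values are illustrative, and the output sections that some flags require are not shown because their contents are model-specific.

```yaml
driver:
  do posterior observer: false                  # skip the OmA statistics to finish sooner
  save posterior mean: true                     # requires an "output" section
  save posterior ensemble: true                 # default; requires an "output" section
  save posterior mean increment: true           # requires an "output increment" section
  update obs config with geometry info: true    # default; needed for the Halo distribution
```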
Supported Local Ensemble Kalman filters
LETKF
Two Local Ensemble Transform Kalman Filter (Hunt et al 2007) implementations are supported:
C++ implementation using Eigen (double precision).
This implementation is used when the LETKF keyword is used in the solver section of the configuration file:
local ensemble DA:
  solver: LETKF
GSI-LETKF Fortran implementation using LAPACK (single precision).
This implementation is used when the GSI LETKF keywords are used in the solver section of the configuration file:
local ensemble DA:
  solver: GSI LETKF
LGETKF
Another available solver is Local GETKF (Gain form of the Ensemble Transform Kalman Filter, Bishop et al 2017), which uses modulated ensembles to emulate model-space localization in the vertical. The implementation calls the GSI-GETKF Fortran implementation and follows Lei et al 2018.
To use LGETKF, specify GETKF in the solver section. Using LGETKF also requires specifying parameters for the modulation product that emulates model-space localization in the vertical:
fraction of retained variance - fraction of the variance retained after the eigenspectrum of the vertical localization function is truncated (1 – retain all eigenvectors, 0 – retain the first eigenvector)
lengthscale units - name of variable for vertical localization. FV3 implementation currently supports two types of units: logp – logarithm of pressure at mid level of the vertical column with surface pressure set to 1e5 at all points, and levels – indices of vertical levels.
lengthscale - localization distance in the above units, at which Gaspari-Cohn localization function is zero.
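As a sketch of what these parameters control, the modulated-ensemble construction of Bishop et al 2017 can be written as follows. The symbols are introduced here for illustration and are not configuration names, and the exact normalization used by the GSI-GETKF code may differ. Let \(\lambda_1 \ge \dots \ge \lambda_n\) and \(\mathbf{v}_1, \dots, \mathbf{v}_n\) be the eigenvalues and eigenvectors of the vertical localization matrix, \(f\) the fraction of retained variance, and \(\mathbf{x}'_k\) the perturbation of member \(k\) in an ensemble of size \(K\). The number of retained eigenvectors \(m\) and the \(mK\) modulated perturbations are then:

\[
m = \min\Big\{ j \ge 1 : \frac{\sum_{i=1}^{j} \lambda_i}{\sum_{i=1}^{n} \lambda_i} \ge f \Big\},
\qquad
\mathbf{z}_{(j-1)K + k} = \sqrt{\frac{mK - 1}{K - 1}}\, \sqrt{\lambda_j}\, \mathbf{v}_j \circ \mathbf{x}'_k ,
\quad j = 1,\dots,m,\; k = 1,\dots,K .
\]

Here \(\circ\) is the element-wise product, with \(\mathbf{v}_j\) applied along the vertical dimension of each column. With \(f = 1\) all eigenvectors are kept, and with \(f = 0\) only the first one is, matching the description above.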
An example of using LGETKF solver in FV3:
local ensemble DA:
  solver: GETKF
  vertical localization:
    fraction of retained variance: .95
    lengthscale: 1.5
    lengthscale units: logP
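For reference, the Gaspari-Cohn function (Gaspari and Cohn 1999) is a compactly supported, piecewise fifth-order function. The notation below is introduced here for illustration: \(d\) is the separation and \(L\) the lengthscale. Assuming, as stated above, that the function reaches zero at \(d = L\), the scaled distance is \(r = 2d/L\):

\[
\rho(r) =
\begin{cases}
-\tfrac{1}{4} r^5 + \tfrac{1}{2} r^4 + \tfrac{5}{8} r^3 - \tfrac{5}{3} r^2 + 1, & 0 \le r \le 1, \\
\tfrac{1}{12} r^5 - \tfrac{1}{2} r^4 + \tfrac{5}{8} r^3 + \tfrac{5}{3} r^2 - 5 r + 4 - \tfrac{2}{3} r^{-1}, & 1 < r \le 2, \\
0, & r > 2 .
\end{cases}
\]

Under that assumption, lengthscale: 1.5 with logarithm-of-pressure units gives zero weight to any pair of levels whose log-pressure values differ by 1.5 or more.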
Localization supported in the ensemble solvers
Observation-space \(R\)-localization is used in all local solvers, following Frolov et al, 2024. The obs localizations syntax specifies a sequence of obs localizations for each obs space. Localization is initialized to all ones internally and is refined (multiplied) with each subsequent localization in the list. In other words, we assume that localizations are separable.
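In symbols, using the standard R-localization form (Hunt et al 2007) rather than a transcription of the OOPS code: let \(\rho_k(i)\) be the value of the \(k\)-th localization in the list for observation \(i\) relative to the grid point being updated, and \(\sigma_i^2\) the observation error variance. These symbols are introduced here for illustration. The combined weight and the localized error variance are then:

\[
\rho(i) = \prod_{k} \rho_k(i), \qquad \tilde{\sigma}_i^2 = \frac{\sigma_i^2}{\rho(i)} .
\]

An observation with \(\rho(i) = 1\) keeps its original error, while one with \(\rho(i) \to 0\) has its error inflated without bound and so has no influence on the update.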
The horizontal localization sequence is specified as follows for each obs space:
observations:
  observers:
  - obs space:
      name: radiosonde
      ...
    obs localizations:
    - localization method: Horizontal Gaspari-Cohn  # inflate errors with Gaspari-Cohn function, based on the
                                                    # horizontal distance from the updated grid point
      lengthscale: 1000e3                           # horizontal localization distance in meters
Other options for obs-space localization are available outside of OOPS. Specifically, UFO supports Gaspari-Cohn, SOAR, and Box Car localizations with kd-tree distance search (e.g., localization method: Horizontal SOAR). Additional localizations are supported in SOCA (Rossby radius based) and FV3-JEDI (soil-specific localization).
Similarly, the vertical localization sequence is specified as:
observations:
  observers:
  - obs space:
      name: radiosonde
      ...
    obs localizations:
    - localization method: Vertical localization  # As above but for vertical localization
      localization function: Gaspari Cohn         # Function for vertical localization
      ioda vertical coordinate group: MetaData    # Group containing the below vertical coordinate
      ioda vertical coordinate: height            # Name of UFO variable storing the vertical coordinate
                                                  # of the observation locations
      vertical lengthscale: 6e3                   # vertical localization distance in units of given coord
The Gaspari-Cohn, SOAR, and Box Car methods are also supported for vertical localization (e.g., localization function: SOAR). If using vertical localization for LETKF (or GSI LETKF), the 3D Geometry Iterator must be enabled to carry model height information into the vertical localization routines, as follows:
geometry:
  iterator dimension: 3
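Because the localizations in the list are multiplied together, a horizontal and a vertical entry can be placed in the same obs localizations list. The fragment below is a sketch that combines the pieces shown above for the LETKF solver; it assumes the two entries can simply be listed one after the other, and it leaves out all model-specific geometry and obs space settings.

```yaml
geometry:
  iterator dimension: 3        # 3D iterator, needed for vertical localization with LETKF
  # ... model-specific geometry settings
observations:
  observers:
  - obs space:
      name: radiosonde
      # ... remaining obs space settings
    obs localizations:
    # applied in order; the resulting weights are multiplied
    - localization method: Horizontal Gaspari-Cohn
      lengthscale: 1000e3      # meters
    - localization method: Vertical localization
      localization function: Gaspari Cohn
      ioda vertical coordinate group: MetaData
      ioda vertical coordinate: height
      vertical lengthscale: 6e3   # units of the chosen coordinate
local ensemble DA:
  solver: LETKF
```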
Finally, the LGETKF implementation uses ensemble modulation to approximate model-space vertical localization (see the LGETKF section above for details).
Inflation supported in the ensemble solvers
Several covariance inflation methods are supported:
Multiplicative prior inflation
The parameter of multiplicative inflation is controlled by the inflation.mult configuration value, for example:
local ensemble DA:
  inflation:
    mult: 1.1
RTPP (relaxation to prior perturbation), Zhang et al, 2004
The parameter of RTPP inflation is controlled by the inflation.rtpp configuration value, for example:
local ensemble DA:
  inflation:
    rtpp: 0.5
RTPS (relaxation to prior spread), Whitaker and Hamill, 2012
The parameter of RTPS inflation is controlled by the inflation.rtps configuration value, for example:
local ensemble DA:
  inflation:
    rtps: 0.6
Solver | Inflation options |
---|---|
LETKF | Multiplicative inflation, RTPP, RTPS |
GSI LETKF | RTPP, RTPS |
GETKF | RTPP, RTPS |
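As a reminder of what the two relaxation parameters mean in the cited papers, the updates can be written as below. The notation is introduced here and is not taken from the OOPS code: \(\mathbf{x}'^{b}_k\) and \(\mathbf{x}'^{a}_k\) are the prior and posterior perturbations of member \(k\), \(\sigma^b\) and \(\sigma^a\) are the prior and posterior ensemble spread at a grid point, and \(\alpha\) is the rtpp or rtps value.

\[
\text{RTPP:}\quad \mathbf{x}'^{a}_k \leftarrow (1-\alpha)\,\mathbf{x}'^{a}_k + \alpha\,\mathbf{x}'^{b}_k ,
\qquad
\text{RTPS:}\quad \mathbf{x}'^{a}_k \leftarrow \mathbf{x}'^{a}_k \left( 1 + \alpha\,\frac{\sigma^b - \sigma^a}{\sigma^a} \right) .
\]

With rtpp: 0.5, each posterior perturbation becomes the average of the posterior and prior perturbations. With rtps: 0.6, the posterior spread is relaxed 60% of the way back toward the prior spread.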
NOTE about obs distributions
Currently Local Ensemble DA supports the InefficientDistribution and Halo obs distributions. When InefficientDistribution is used, all observations and H(x) are replicated on all PEs. When the Halo distribution is used, each PE stores only the observations it needs. The Halo distribution allows for more efficient memory management than distribution.name: InefficientDistribution, but at the expense of potentially poorer load balancing than distribution.name: RoundRobin. For the best combination of memory use and load balancing, Local Ensemble DA can be run in observer-only mode with distribution.name: RoundRobin. One can then read the ensemble of H(x) from disk using driver.read HX from disk == true, the distribution.name: Halo obs distribution, and driver.do posterior observer == false.
The type of the obs. distribution is specified for each ObsSpace. For example:
observations:
  observers:
  - obs space:
      distribution:
        name: Halo
        halo size: 5000e3
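The two-step workflow described in this note can be sketched as two separate configuration files, shown here as two YAML documents separated by ---. Only the keys discussed in this page are included. How the H(x) files are written in the first step and located in the second is not covered here and is omitted, and all other required sections are left out.

```yaml
# Step 1: observer-only run that computes the ensemble of H(x)
driver:
  run as observer only: true
observations:
  observers:
  - obs space:
      distribution:
        name: RoundRobin
---
# Step 2: solver run that reads the ensemble of H(x) from disk
driver:
  read HX from disk: true
  do posterior observer: false
observations:
  observers:
  - obs space:
      distribution:
        name: Halo
        halo size: 5000e3
```

Splitting the run this way lets the expensive H(x) computation use round-robin load balancing, while the solver step keeps the lower memory footprint of the Halo distribution.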