Tutorial: Running the SOCA tutorial¶
- Learning Goals:
Perform all the necessary steps to run a low resolution variational data assimilation system with SOCA
- Prerequisites:
Overview¶
In this tutorial we will be running several different applications that are necessary to run one cycle of a 3DVar system and 3 cycles of a 3DVar FGAT.
The scenario of the tutorial is as follows, generate synthetic observations from a perturbed MOM6 forecast and subsequently assimilate the synthetic observations using the unperturbed background as initial conditions.
After succesfully compiling SOCA, in the build directory, you should see a tutorial
folder. cd
into it.
cd soca/tutorial
along with cmake
related files, it should contain the folowing scripts and yaml configuration files
needed to run the tutorial
├── bin # symbolic link to the JEDI executables
├── config # static yaml configurations
│ ├── 3dvar.yaml # 3DVAR configuration example
│ ├── gridgen.yaml # `soca` grid generation
│ ├── pert_ic.yaml # B-matrix randomization
│ ├── staticb.yaml # horizontal convolution
│ └── synthetic_obs.yaml # generate synthetic obs
├── Data # folder containing static MOM6 and SOCA files
├── tutorial_3dvarfgat.sh # 3DVAR FGAT driver for multiple cycles
├── tutorial_3dvar.sh # 3DVAR driver, analysis only
├── tutorial_bump_op.sh # initialize the horizontal convolution
├── tutorial_gridgen.sh # generate the grid and save it to disk
├── tutorial_make_observations.sh # generate synthetic observations
├── tutorial_perturb_ic.sh # perturb IC by randomizing the B-matrix
├── tutorial_plot.py # observation and state space plotting tools
├── tutorial_synthetic_observations.py # generate random locations for synthetic observations
└── tutorial_tools.sh # generate dynamic yaml configurations
Step 1: Grid Generation¶
This step is required in order to save the MOM6 geometry in a file for subsequent use. To generate the grid, run
./tutorial_gridgen.sh
if you open the tutorial_gridgen.sh
script, you will see that it executes the soca_gridgen.x
application on two cores
OMP_NUM_THREADS=1 mpirun -np 2 ../bin/soca_gridgen.x ../config/gridgen.yaml
with the yaml file gridgen.yaml
as argument:
geometry:
geom_grid_file: soca_gridspec.nc # name of the output file that contains the geometry information
save_local_domain: true # save the local FMS domains if true
full_init: true # directive to the MOM6 initialization. The grid generation process
# needs a complete initialization of MOM6
mom6_input_nml: ./inputnml/input.nml # location of the FMS Fortran namelist
fields metadata: ./fields_metadata.yml # location of the field metatdata yaml file
The tutorial_gridgen.sh
script will create a scratch_gridgen
folder in which the application is executed, as well as create
a ./static/soca_gridspec.nc
NetCDF grid file. Using ncdump
to investigate the metadata will
show that the file contains a subset of the usual MOM6 geometry definition.
Under scratch_gridgen
you will also find two NetCDF files, geom_output_00000.nc
and
geom_output_00001.nc
which contain the local FMS geometry, one file per MPI worker. This output is controlled
by the yaml key save_local_domain
and can be turned off by setting it to false
or simply commenting it out.
While running the tutorial applications, you will see this type of warnings from MOM6
:
WARNING from PE 0: ...
They can be ignored.
Step 2: Initialize the correlation operator using the NICAS method¶
This step is a core part of the background error covariance model that will be used in the next steps of this tutorial. To generate the horizontal correlation operator, run
./tutorial_bump_op.sh
The files necessary to subsequently initialize this operator are saved under static/bump/
.
These files are currently layout dependent. If you were to modify the number of processors used to run this
applications, all subsequent applications making use of this operator will have to be run on the same number
of cores.
The tutorial_bump_op.sh
scripts executes the following JEDI applications on two cores
OMP_NUM_THREADS=1 mpirun -np 2 ../bin/soca_staticbinit.x ../config/staticb.yaml
with the yaml file staticb.yaml
as argument. A relevant snippet of that yaml file
that controls the horizontal correlation operator is shown below:
analysis variables: &ana_vars [socn, tocn, ssh] # yaml anchor defining the control variables
background error:
covariance model: SocaError # name of the covariance factory in soca
analysis variables: *ana_vars # variables for which a correlation operator will be implemented
date: *date # date of the background
[...]
correlation:
- name: ocn # horizontal correlation for the ocean
base value: 300.0e3 # minimum decorrelation length scale
rossby mult: 1.0 # sets the decorrelation scale to base value + rossby mult * Rossby radius
min grid mult: 2.0 # impose the minimun decorrelation to span at least 2 grid boxes
min value: 200.0e3 # minimum value for the decorrelation (not used in this case)
variables: [tocn, socn, ssh] # variables on which to apply the correlation operator
the yaml key base value
, rossby mult
and min grid mult
can be adjusted to modify the
horizontal decorrelation length scale.
Step 3: Randomize a B-matrix to generate a perturbed initial condition¶
To generate a perturbation we will randomize a static B-matrix. This is done by running
./tutorial_perturb_ic.sh
it will generate an unrealistically large perturbed ocean state (the goal of the tutorial is not science!) that will be used in the next step to initialize a forecast and generate synthetic observations.
The JEDI application soca_enspert.x
is used inside of the tutorial_perturb_ic.sh
script in the following way
OMP_NUM_THREADS=1 mpirun -np 2 ../bin/soca_enspert.x ../config/pert_ic.yaml
taking the pert_ic.yaml
file as argument.
One can vary the amplitude of the perturbation by editing the background error
section
of the yaml file. The relevant snippet of yaml blocks is shown below.
background error:
covariance model: SocaError
date: *date
analysis variables: &soca_vars [socn, tocn, ssh, hocn]
bump:
verbosity: main
datadir: ./bump
strategy: specific_univariate
load_nicas_local: 1
perturbation scales:
tocn: 10.0
socn: 10.0
ssh: 0.0
Under the perturbation scales
section of the yaml file above, the tocn
and socn
keys represent the
scaling of the perturbation. Choose a number on the order of 1 if you wish to generate a realistic perturbed
ocean state and re-run the tutorial_perturb_ic.sh
script.
Step 4: Generate synthetic observations¶
In this step of the tutorial, we will generate synthetic observations by driving the
MOM6-solo
model using the soca_hofx.x
executable. ioda observation files are
created at the time and locations specified
in tutorial_synthetic_observations.py
. This application uses generic
observation operators from the UFO repository. To generate the synthetic observations, run the
tutorial_make_observations.sh
script:
./tutorial_make_observations.sh
A obs
directory is created and should contain the following ioda observation files:
├── adt.nc4 # absolute dynamic topography
├── insitu.S.nc4 # salinity profiles
├── insitu.T.nc4 # insitu temperature profiles
├── sss.nc4 # sea surface salinity
└── sst.nc4 # sea surface temperature
Step 5: 3DVar example¶
To run the 3DVAR example, execute the tutorial_3dvar.sh
script:
./tutorial_3dvar.sh
this script perform a 3D variational minimization using the soca_var.x
executable for a 24 hour window using the observations generated above.
The executable takes config/3dvar.yaml
yaml configuration file as an
argument.
The JEDI application soca_var.x
is used inside of the tutorial_3dvar.sh
script in the following way
OMP_NUM_THREADS=1 mpirun -np 2 ../bin/soca_var.x ../config/3dvar.yaml
taking the 3dvar.yaml
file as argument. This file controls, among other things, the
data assimilation window length which can be adjusted by changing the value of the
window length
key. The relevant yaml snippet is shown below:
[...]
cost function:
cost type: 3D-Var # cost function type
window begin: &date_begin 2018-04-14T12:00:00Z # starting date of the DA window
window length: P1D # length of the DA window (1 day)
# to adjust to a 6 hour window for example,
# replace with PT6H
[...]
A few figures of surface increments are plotted at the end of the script after the 3DVAR step is done:
$ ls scratch_3dvar/*.png
scratch_3dvar/incr.ssh.png scratch_3dvar/incr.sss.png scratch_3dvar/incr.sst.png
They represent increments for sea surface height, sea surface salinity and sea surface temperature, respectively.
Step 5: 3DVar FGAT example¶
This part of the tutorial is used to show an example of configuration of a
data assimilation experiment cycling through 3 days.
The data assimilation window is 24 hours and the synthetic observations assimilated are
sea surface temperature, sea surface salinity, insitu temperature and salinity and
absolute dynamic topography.
To run the 3DVar FGAT tutorial, execute the tutorial_3dvarfgat.sh
script:
./tutorial_3dvarfgat.sh
The student is encouraged to have a look inside of the tutorial_3dvarfgat.sh
script to follow
the steps that enable a cycling system.
Similarly to the 3DVAR example, figures of surface increments for outer iterations 1 and 2
can be found in ./scratch_3dvarfgat/incr.[1-2].ssh.png
,
./scratch_3dvarfgat/incr.[1-2].sss.png
and ./scratch_3dvarfgat/incr.[1-2].sst.png
.
Statistics of global mean absolute error of each observation space assimilated can be found
in ./scratch_3dvarfgat/*global_mae.png
.