Profile Specific QC Filters

Profile Background Check

This filter calculates the RMS difference between the observations and the background for a profile. If that RMS is above the given threshold, then all the observations in the profile are rejected. Each variable is checked independently, so the rejection of the profile for one variable will not affect the other variables.

The user can specify two options in the yaml: absolute threshold and relative threshold. Only one of these two options may be set. If absolute threshold is set, then the RMS is calculated without normalisation. If relative threshold is used, then the differences are normalised by the observation error for each observation-background difference. Both absolute threshold and relative threshold can be either a single number or a string. If they are a string, then this is the name of a variable which will be used to give the threshold for each profile. The RMS value will be compared against the first value of the given variable in the profile. The sorting of observations within each profile can be arranged using other options in the yaml file.

time window:
  begin: 2020-12-31T23:59:00Z
  end: 2021-01-01T00:01:00Z

observations:
  observers:
  - obs space:
      name: test data
      obsdatain:
        engine:
          type: H5File
          obsfile: Data/ufo/testinput_tier_1/profile_filter_testdata.nc4
        obsgrouping:
          group variables: [ "sequenceNumber" ]
          sort variable: "latitude"
          sort order: "descending"
      simulated variables: [variable]
    HofX: HofX
    obs filters:
    - filter: Profile Background Check
      filter variables:
      - name: variable
      absolute threshold: 2.5

Note: The obsgrouping: group variables option is necessary to identify which observations belong to a given profile. The sort variable and sort order options are optional.

Note: This is separate from the background check in conventional profile processing.
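
The RMS test described above can be sketched in a few lines of Python. This is an illustration only, not the UFO implementation: the function name `profile_rms_rejected` and its arguments are hypothetical, and missing values are assumed to have been removed beforehand.

```python
import math


def profile_rms_rejected(obs, bkg, threshold, obs_err=None):
    """Return True if every observation in the profile should be rejected.

    obs_err=None mimics 'absolute threshold' (no normalisation);
    passing obs_err mimics 'relative threshold'.
    """
    diffs = [o - b for o, b in zip(obs, bkg)]
    if obs_err is not None:
        # Normalise each observation-background difference by its observation error.
        diffs = [d / e for d, e in zip(diffs, obs_err)]
    rms = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return rms > threshold


obs = [10.0, 12.0, 16.0]
bkg = [9.0, 15.0, 12.0]
absolute_reject = profile_rms_rejected(obs, bkg, threshold=2.5)
relative_reject = profile_rms_rejected(obs, bkg, threshold=2.5, obs_err=[2.0, 2.0, 2.0])
```

With these made-up numbers the unnormalised RMS is above 2.5, so the profile is rejected, whereas normalising by an observation error of 2.0 brings it below the threshold.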

Profile Few Observations Check

This filter finds the number of valid observations within a profile. If this number is less than the filter parameter threshold, then all observations in the profile are rejected. If the optional fraction parameter is set, then the check will instead flag the entire profile when more than this decimal fraction of the observations have been flagged. Either the threshold or the fraction option must be set.

time window:
  begin: 2020-12-31T23:59:00Z
  end: 2021-01-01T00:01:00Z

observations:
  observers:
  - obs space:
      name: test data
      obsdatain:
        engine:
          type: H5File
          obsfile: Data/ufo/testinput_tier_1/profile_filter_testdata.nc4
        obsgrouping:
          group variables: ["sequenceNumber"]
      simulated variables: [variable]
    obs filters:
    - filter: Profile Few Observations Check
      filter variables:
      - name: variable
      threshold: 10
    - filter: Profile Few Observations Check
      filter variables:
      - name: variable
      fraction: 0.5

Note: The obsgrouping: group variables option is necessary to identify which observations belong to a given profile.
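
As a rough sketch of the two modes described above (not the UFO code), the decision for a single profile could look as follows. The function name `few_obs_rejected` and the boolean input list are hypothetical.

```python
def few_obs_rejected(passed, threshold=None, fraction=None):
    """passed: list of booleans, True where an observation currently passes QC."""
    n_valid = sum(passed)
    if fraction is not None:
        # Reject when more than this fraction of the profile is already flagged.
        n_flagged = len(passed) - n_valid
        return n_flagged > fraction * len(passed)
    if threshold is not None:
        # Reject when fewer than `threshold` valid observations remain.
        return n_valid < threshold
    raise ValueError("either threshold or fraction must be set")


mostly_bad = [True] * 4 + [False] * 6
mostly_good = [True] * 6 + [False] * 4
```

Here `few_obs_rejected(mostly_bad, threshold=10)` and `few_obs_rejected(mostly_bad, fraction=0.5)` both reject the profile, while `few_obs_rejected(mostly_good, fraction=0.5)` keeps it.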

Profile Unflag Observations Check

This filter unflags isolated QC-failing observations if their neighbours pass QC and match to within an absolute tolerance. This tolerance is set by the absolute tolerance parameter and can optionally be scaled with a given vertical coordinate using a piece-wise linear scaling function defined by pairs of points given in the vertical tolerance scale option.

The filter uses the record number functionality defined by the obsgrouping to identify which observations belong to a given profile (all members of a profile must share the same record number). Each observation in a profile is compared to those above and below. If both of these are unflagged and match the observation to within the tolerance, then the observation is marked. If the observation is the first or the last in the profile, then a match with the single adjacent observation is sufficient for unflagging. The marked observations can then be accepted using a “Filter Action” (see the Filter Actions section for more detail). Observations can be included in or excluded from this filter in the usual way using a “where” clause (see “where” clauses for more detail).

time window:
  begin: 2019-06-14T20:30:00Z
  end: 2019-06-15T03:30:00Z

observations:
  observers:
  - obs space:
      name: Unflag obs check unflags based on piecewise absolute tolerance
      obsdatain:
        engine:
          type: H5File
          obsfile: "Data/ufo/testinput_tier_1/oceanprofile_fake_obsdata.nc4"
        obsgrouping:
          group variables: [ "stationIdentification" ]
      simulated variables: [ "waterTemperature", "depthBelowWaterSurface" ]
      observed variables: [ "waterTemperature", "depthBelowWaterSurface" ]
    obs filters:
    - filter: Profile Unflag Observations Check
      filter variables:
      - name: ObsValue/waterTemperature
      absolute tolerance: 10
      vertical tolerance scale: { "0": 1, "3": 1, "8": 0.00001}
      vertical coordinate: "ObsValue/depthBelowWaterSurface"
      actions:
        - name: accept

Note: The optional scaling function vertical coordinate and scale points should be specified as keys and values of a JSON-style map. Owing to a limitation in the eckit YAML parser, the keys must be enclosed in quotes.
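
The neighbour comparison and the piecewise-linear tolerance scaling can be illustrated with the sketch below. It is not the UFO implementation: the function names are hypothetical, and two behaviours are assumptions made for the example, namely that the scale multiplies the absolute tolerance and that it is held constant outside the range of the given points.

```python
def piecewise_scale(scale_points, x):
    """Linearly interpolate a {"coordinate": scale} map at x (constant outside the range)."""
    pts = sorted((float(k), v) for k, v in scale_points.items())
    if x <= pts[0][0]:
        return pts[0][1]
    if x >= pts[-1][0]:
        return pts[-1][1]
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def unflag_candidates(values, flagged, coords, abs_tol, scale_points):
    """Return indices of flagged observations whose neighbours pass QC and agree with them."""
    marked = []
    n = len(values)
    for i in range(n):
        if not flagged[i]:
            continue
        tol = abs_tol * piecewise_scale(scale_points, coords[i])
        # First and last observations have a single neighbour.
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < n]
        if neighbours and all(
            not flagged[j] and abs(values[j] - values[i]) <= tol for j in neighbours
        ):
            marked.append(i)
    return marked


scale = {"0": 1, "3": 1, "8": 0.00001}
temperature = [20.0, 19.5, 19.0, 5.0, 4.9]
depth = [0.0, 1.0, 2.0, 8.0, 9.0]
flagged = [False, True, False, True, False]
marked = unflag_candidates(temperature, flagged, depth, abs_tol=10, scale_points=scale)
```

In this made-up profile only the observation at index 1 is marked: the flagged value at a depth of 8 has a scaled tolerance of 0.0001 and differs from its upper neighbour by far more than that.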

Impact Height Check

This filter is specific to GNSS-RO. It is based on the impact height, which is calculated from the model as \(x = 10^{-6} N (r_0 + z) + z\), where \(N\) is the refractivity, \(r_0\) is the radius of curvature of the earth at the observation tangent point and \(z\) is the geopotential height of the model layer.

For each observation it calculates the impact height of the lowest and highest model level. If the observation is outside this range (plus surface offset:) then the observation is rejected.

The filter also looks for regions where the vertical gradient of refractivity is large (i.e. less than gradient threshold:, which is normally negative). Any observations lower in the atmosphere than a large vertical gradient (plus sharp gradient offset:) are rejected. The algorithm starts looking from the top of the profile. Therefore a large gradient which is highest in the atmosphere will be the one which is considered. Large refractivity gradients are often associated with temperature inversions, and the radio-occultation retrieval can become ill-posed below such layers.

The following are the optional flags which may be used with this routine:

surface offset: Reject data which is within this height (in m) of the surface. Default: 600.
gradient threshold: The threshold used to define a sharp gradient in refractivity. Units: N-units / m. Default: -0.08.
sharp gradient offset: The height (in m) of a buffer-zone for rejecting data above sharp gradients. Default: 500.

This filter relies on the refractivity and model geopotential heights being saved as ObsDiagnostics. If these are not saved by the observation operator, then the code will fail. More details on saving diagnostics are given below. GnssroBendMetOffice is an example of an observation operator which saves these data.

time window:
  begin: 2020-05-01T03:00:00Z
  end: 2020-05-01T09:00:00Z

observations:
  observers:
  - obs operator:
      name: GnssroBendMetOffice
      obs options:
        vert_interp_ops: true
        pseudo_ops: true
    obs space:
      name: GnssroBnd
      obsdatain:
        engine:
          type: H5File
          obsfile: Data/ioda/testinput_tier_1/gnssro_obs_2020050106_1dvar.nc4
      simulated variables: [bendingAngle]
    geovals:
      filename: Data/ufo/testinput_tier_1/gnssro_geoval_2020050106_1dvar.nc4
    obs filters:
    - filter: GNSSRO Impact Height Check
      filter variables:
      - name: bendingAngle
      gradient threshold: -0.08
      sharp gradient offset: 600
      surface offset: 500
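
The impact height formula and the range test can be written out as a short sketch. The function names are hypothetical, the numbers are invented, and applying the surface offset to the lower bound only is an assumption based on the description of that option; the sharp gradient test is not shown.

```python
def impact_height(refractivity, radius_curv, geop_height):
    """x = 1e-6 * N * (r0 + z) + z, all lengths in metres."""
    return 1.0e-6 * refractivity * (radius_curv + geop_height) + geop_height


def outside_model_range(obs_impact, model_impacts, surface_offset=600.0):
    """True if the observation lies outside the impact heights spanned by the model levels."""
    lowest = min(model_impacts) + surface_offset
    highest = max(model_impacts)
    return obs_impact < lowest or obs_impact > highest


r0 = 6.37e6  # radius of curvature at the tangent point (illustrative value)
model_impacts = [
    impact_height(300.0, r0, 0.0),     # lowest model level
    impact_height(1.0, r0, 60000.0),   # highest model level
]
```

With these values the lowest model level has an impact height of about 1911 m, so an observation at 2000 m falls inside the 600 m surface offset and is rejected, while one at 3000 m is kept.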

Conventional Profile Processing

Overview

This filter comprises several QC checks that can be applied to conventional atmospheric profile data (e.g. as measured by radiosondes) whose observations lie at particular pressure levels. These checks have been ported from the UK Met Office Observation Processing System (OPS). The following checks are available:

Basic: These checks ensure the profile pressures lie in a reasonable range and are in the correct order.
SamePDiffT: If two levels have the same pressure, but their temperature difference is larger than a threshold, reject one of the levels.
Sign: This check determines whether an observed temperature may have had its sign (in degrees Celsius) recorded incorrectly. To do this the temperature is compared to the model background value. If the check is failed a temperature correction is calculated.
UnstableLayer: The temperature in a particular level is used to compute the expected temperature in the level above given the dry adiabatic lapse rate. If the measured temperature in the level above is lower than its expected value by a certain threshold then both levels are flagged.
Interpolation: The temperature between adjacent significant pressure levels is interpolated onto any encompassed standard pressure levels. If the interpolated temperature differs from the observed value by more than a particular threshold then the relevant standard and significant levels are flagged. (Further information can be found in the “Standard and significant levels” section.)
Hydrostatic: This is a check of the consistency between the observed values of temperature and geopotential height at each pressure level. The check relies on the hydrostatic equation and has a complicated decision-making algorithm. If a particular level fails this check then a height correction is (sometimes) computed.
UInterp: The wind speed between adjacent significant pressure levels is interpolated onto any encompassed standard pressure levels. If the vector difference of the interpolated and measured wind speeds is larger than a certain threshold then the relevant standard and significant levels are flagged.
RH: This check detects relative humidity errors at the top of cloud layers and at high altitudes.
Time: This check flags any observations whose time of measurement lies outside the assimilation window. It also optionally rejects wind values for a certain period after launch.
BackgroundX: These checks use a Bayesian approach to modify the probability of gross error for several variables (X can be GeopotentialHeight, RelativeHumidity, Temperature or WindSpeed). The use of such an approach distinguishes these checks from the Background Check filter introduced above.
PermanentReject: This check permanently rejects observations that have previously been flagged as failing by another check.
SondeFlags: This check accounts for any QC flags that were assigned to the sonde data prior to UFO being run.
WindProfilerFlags: This check accounts for any QC flags that were assigned to the wind profiler data prior to UFO being run.
Pressure: This routine calculates profile pressures if they have not been measured (or were measured but are potentially inaccurate). This is achieved by vertical interpolation and extrapolation using the observed height and model values of height and pressure.
AverageX: These routines average observed variables onto model levels (X can be Pressure, RelativeHumidity, Temperature or WindSpeed).

The Conventional Profile Processing filter can apply more than one check in turn. Please note the following:

The total number of errors that have occurred is recorded as the filter proceeds through each check. If this number reaches the threshold set by the parameter nErrorsFail then the entire profile is rejected.
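The basic checks are always performed unless they are specifically disabled (by setting the parameter BChecks_Skip to true). Setting the parameter flagBasicChecksFail to false keeps the checks running but stops a failure from rejecting the profile.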

Other filters that deal with atmospheric profiles include the Profile Background Check and the Profile Few Observations Check. Note that the Profile Background Check is different to the Bayesian background check which is described in the BackgroundX section below.

Filter variables

The QC checks rely on a variety of physical observables. The value of filter variables for each check should be:

Basic, SamePDiffT, Sign, UnstableLayer, Interpolation, Hydrostatic: airTemperature, geopotentialHeight.
UInterp: windEastward, windNorthward.
RH: airTemperature, relativeHumidity.
BackgroundX: airTemperature, relativeHumidity, windEastward, windNorthward, geopotentialHeight depending on the value of X.
Pressure: geopotentialHeight.
Time, PermanentReject, SondeFlags, WindProfilerFlags: these routines act on QC flags so must be supplied with a dummy filter variable. Any variable that exists in the data set is acceptable; windEastward would be a good choice.

The obsgrouping category should be set up in one of two ways. The first applies a descending sort to the air pressures:

obsgrouping:
  group variables: [ "stationIdentification" ]
  sort variable: "pressure"
  sort order: "descending"

The second does not sort the air pressures:

obsgrouping:
  group variables: [ "stationIdentification" ]

The second formulation could be used if the pressures have been sorted prior to applying this filter. An ascending sort order is not valid; if this is selected the checks will throw an error. In both cases the station ID is used to discriminate between different sonde profiles.

Filter configuration

The following yaml parameters can be used to configure the filter itself:

Checks: List of checks to perform. The checks will be performed in the specified order. Examples: ["Basic"], ["Basic", "Hydrostatic", "UInterp"].
nErrorsFail: Total number of errors at which an entire profile is rejected (default 1).
flagBasicChecksFail: Reject a profile if it fails the basic checks (default true). This should only be set to false for testing purposes.
compareWithOPS: Compare values obtained in these checks with the equivalent values produced in the OPS code (default false). This is set to true for certain unit tests (named *OPScomparison*) for which the relevant quantities are present in the input files.
Comparison_Tol: Tolerance for comparisons with OPS, enabling rounding errors to be accommodated (default 0.1).

Standard and significant levels

Definitions

Standard, or mandatory, levels are values of pressure at which it has been internationally agreed that complete measurements of the physical observables should ideally be recorded. Significant levels correspond to other pressure values at which the physical observables should be recorded to get an accurate picture of the sonde ascent.

Each profile is checked for the presence of both standard and significant levels.

Summary of yaml parameters:

FS_MinP: Minimum pressure for including a level in standard level finding routine (default 0.0 Pa).
StandardLevels: list of standard levels (default [1000, 925, 850, 700, 500, 400, 300, 250, 200, 150, 100, 70, 50, 30, 20, 10, 7, 3, 2, 1] hPa). These are internationally-agreed values and should usually not be changed.

Basic check

Operation

The following basic checks are applied to each profile:

There is at least one pressure level present,
The pressures lie between minimum and maximum values (BChecks_minValidP and BChecks_maxValidP),
The pressures are in descending order.

Any profiles that do not meet these criteria are rejected.

Summary of yaml parameters

BChecks_minValidP: Minimum pressure in profile (default 0.0 Pa).
BChecks_maxValidP: Maximum pressure in profile (default 110.0e3 Pa).
BChecks_Skip: Do not perform the basic checks (default false). Only set to true for unit tests in which the input sample consists of pressures that should not be sorted.
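
The three criteria can be expressed compactly, as in the sketch below. It is illustrative only: the function name is hypothetical, the bounds are treated as inclusive, and equal adjacent pressures are allowed (both are assumptions, the latter because the SamePDiffT check expects repeated pressures to be possible).

```python
def passes_basic_checks(pressures, min_valid_p=0.0, max_valid_p=110.0e3):
    """Return True if a profile of pressures (Pa) meets the three basic criteria."""
    if not pressures:
        return False  # no pressure levels present
    if any(p < min_valid_p or p > max_valid_p for p in pressures):
        return False  # a pressure lies outside the valid range
    # Pressures must not increase from one level to the next.
    return all(a >= b for a, b in zip(pressures, pressures[1:]))
```

For example, `passes_basic_checks([100000.0, 85000.0, 50000.0])` succeeds, whereas an empty profile, an ascending profile or a pressure of 120000 Pa fails.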

SamePDiffT check

Operation

This check searches for pairs of levels that have identical pressures but for which the absolute difference between their temperatures is larger than a particular threshold (SPDTCheck_TThresh). The level with the larger absolute difference between the observed and model background temperature is rejected.

Summary of yaml parameters

SPDTCheck_TThresh: Absolute temperature difference threshold (default 0.0 K).
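
A minimal sketch of this logic is given below, assuming the profile is already sorted so that levels with identical pressures are adjacent. The function name is hypothetical, and the tie-breaking rule (reject the second level when both departures are equal) is an assumption rather than documented behaviour.

```python
def same_p_diff_t_rejects(pressures, t_obs, t_bkg, t_thresh=0.0):
    """Return sorted indices of levels rejected by a SamePDiffT-style test."""
    rejected = set()
    for i in range(len(pressures) - 1):
        j = i + 1
        if pressures[i] == pressures[j] and abs(t_obs[i] - t_obs[j]) > t_thresh:
            # Reject whichever level lies further from the background temperature.
            dep_i = abs(t_obs[i] - t_bkg[i])
            dep_j = abs(t_obs[j] - t_bkg[j])
            rejected.add(i if dep_i > dep_j else j)
    return sorted(rejected)


rejected = same_p_diff_t_rejects(
    pressures=[85000.0, 85000.0, 70000.0],
    t_obs=[280.0, 290.0, 270.0],
    t_bkg=[281.0, 281.5, 271.0],
    t_thresh=2.0,
)
```

In this invented example the two levels at 85000 Pa differ by 10 K, and the second one is rejected because it is 8.5 K from the background compared with 1 K for the first.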

Sign check

Operation

The sign check for a particular level is failed if all of the following conditions are met:

The absolute difference between the observed and model background temperature is larger than a threshold (SCheck_tObstBkgThresh),
Changing the sign (in degrees C) of the observed temperature causes its absolute difference relative to the model background temperature (also in degrees C) to be smaller than a threshold (SCheck_ProfileSignTol),
The level pressure is lower by more than a certain amount (SCheck_PstarThresh) than the model surface pressure.

Summary of yaml parameters

SCheck_tObstBkgThresh: Threshold for absolute temperature difference between observation and background (default 5.0 K).
SCheck_ProfileSignTol: Threshold for absolute temperature difference between observation and background after the observation sign has been changed (default 100.0 degrees C).
SCheck_PstarThresh: Threshold for difference between observed pressure and model surface pressure (default 1000.0 Pa).
SCheck_PrintLargeTThresh: Pressure threshold above which large temperature differences are printed (default 1000.0 Pa).
SCheck_CorrectT: Compute correction to temperature (default true).
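
The three conditions can be combined as in the following sketch. It is an illustration under stated assumptions: the function name and argument names are hypothetical, temperatures are supplied in kelvin and converted to degrees Celsius internally, and the temperature correction itself is not shown.

```python
ZERO_C = 273.15  # 0 degrees Celsius in kelvin


def fails_sign_check(t_obs_k, t_bkg_k, p_level, p_surface,
                     t_obs_bkg_thresh=5.0, sign_tol=100.0, pstar_thresh=1000.0):
    """Return True if the level meets all three sign-check conditions."""
    # 1. Observation and background differ by more than the threshold.
    large_diff = abs(t_obs_k - t_bkg_k) > t_obs_bkg_thresh
    # 2. Flipping the sign of the Celsius temperature brings it close to the background.
    flipped_c = -(t_obs_k - ZERO_C)
    flipped_close = abs(flipped_c - (t_bkg_k - ZERO_C)) < sign_tol
    # 3. The level is sufficiently far above the model surface.
    above_surface = (p_surface - p_level) > pstar_thresh
    return large_diff and flipped_close and above_surface
```

As a usage example with an invented tolerance of 2 degrees C, an observation of +15 degrees C (288.15 K) against a background of -14.5 degrees C (258.65 K) at 50000 Pa fails the check when the surface pressure is 100000 Pa, because changing the sign gives -15 degrees C.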

UnstableLayer check

Operation

The temperature at a particular level is used to compute the temperature at the adjacent level (upwards) in the profile. The calculation assumes that the temperature-pressure relationship follows the dry adiabatic lapse rate. If the observed temperature at the adjacent level is lower than the calculated temperature by more than a particular amount (ULCheck_SuperadiabatTol) the level is flagged. This check is only applied to levels whose pressure is larger than a minimum threshold (ULCheck_MinP) and lower by a certain amount (ULCheck_PBThresh) than the surface pressure.

Summary of yaml parameters

ULCheck_SuperadiabatTol: Temperature difference threshold between observed temperature and temperature computed assuming dry adiabatic lapse rate (default -1.0 K).
ULCheck_PBThresh: Threshold on difference between level pressure and ‘bottom’ pressure (which can change during the routine) (default 10000.0 Pa).
ULCheck_MinP: Minimum pressure at which the checks are performed (default 0.0 Pa).
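
The dry adiabatic calculation can be illustrated with Poisson's equation, T2 = T1 (p2 / p1)^(R_d / c_p). The sketch below is not the UFO code: the function names are hypothetical, the constants are textbook values, and the sign convention (a negative tolerance compared against observed minus calculated temperature) is an assumption suggested by the negative default of ULCheck_SuperadiabatTol. The pressure restrictions on where the check applies are omitted.

```python
KAPPA = 287.05 / 1005.0  # R_d / c_p for dry air (textbook values, assumed)


def adiabatic_temperature(t_lower, p_lower, p_upper):
    """Temperature (K) a parcel would have at p_upper after a dry adiabatic ascent."""
    return t_lower * (p_upper / p_lower) ** KAPPA


def is_superadiabatic(t_lower, p_lower, t_upper_obs, p_upper, tol=-1.0):
    """True if the observed upper-level temperature is colder than the adiabat by more than |tol|."""
    return t_upper_obs - adiabatic_temperature(t_lower, p_lower, p_upper) < tol
```

Starting from 290 K at 90000 Pa, the adiabatic temperature at 85000 Pa is about 285.3 K, so an observed 283.0 K would be flagged with a tolerance of -1.0 K while 285.0 K would not.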

Interpolation check

Operation

The temperature is interpolated from significant levels onto any encompassed standard levels. If the absolute difference between the standard level temperature and the interpolated value is more than a particular threshold (ICheck_TInterpTol) then the level in question, together with the relevant significant levels, are all flagged. Below a particular pressure (ICheck_TolRelaxPThresh) the threshold is relaxed by multiplying it by the factor ICheck_TolRelax.

This check is only performed if the pressure difference between the standard and significant levels is not too large. The difference, known loosely as a ‘big gap’, depends upon the pressure of the standard level. As the standard level pressure decreases, the big gaps also decrease in size according to the list in ICheck_BigGaps; the smallest big gap is defined as ICheck_BigGapInit.

Summary of yaml parameters

ICheck_TInterpTol: Threshold for temperature difference between observed and interpolated value (default 1.0 K).
ICheck_TolRelaxPThresh: Pressure below which temperature difference threshold is relaxed (default 50000.0 Pa).
ICheck_TolRelax: Multiplicative factor for temperature difference threshold, used if pressure is lower than ICheck_TolRelaxPThresh (default 1.0).
ICheck_BigGaps: ‘Big gaps’ for use in this check (default [500, 500, 500, 500, 100, 100, 100, 100, 50, 50, 50, 50, 10, 10, 10, 10, 10, 10, 10, 10] hPa).
ICheck_BigGapInit: Smallest value of ‘big gap’ (default 1000.0 Pa).
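
The comparison at a standard level can be sketched as below. The text above does not state how the interpolation is carried out, so interpolating linearly in the logarithm of pressure is an assumption, as are the function names; the 'big gap' test is left out for brevity.

```python
import math


def interp_log_p(p, p_below, t_below, p_above, t_above):
    """Interpolate temperature to pressure p, linearly in log(pressure)."""
    w = (math.log(p_below) - math.log(p)) / (math.log(p_below) - math.log(p_above))
    return t_below + w * (t_above - t_below)


def fails_interpolation_check(p_std, t_std, sig_below, sig_above,
                              tol=1.0, relax_p_thresh=50000.0, relax=1.0):
    """sig_below and sig_above are (pressure, temperature) pairs for the significant levels."""
    if p_std < relax_p_thresh:
        tol *= relax  # relaxed threshold at low pressure
    t_interp = interp_log_p(p_std, sig_below[0], sig_below[1], sig_above[0], sig_above[1])
    return abs(t_std - t_interp) > tol


t_interp = interp_log_p(85000.0, 90000.0, 285.0, 80000.0, 279.0)
```

With significant levels at 90000 Pa (285 K) and 80000 Pa (279 K), the interpolated temperature at 85000 Pa is roughly 282.1 K, so an observed 283.9 K fails a 1 K tolerance and an observed 282.5 K passes.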

Hydrostatic check

Operation

The hydrostatic check is used to check the consistency of the standard levels. The thickness between two standard levels is computed according to the hydrostatic equation.

If this thickness differs from the measured value by more than a particular amount then the associated levels may be flagged. A decision-making algorithm is used to classify the levels as having height or temperature errors.

Summary of yaml parameters

HCheck_CorrectZ: Compute correction to Z (default true).
HydDesc: Text description of hydrostatic errors.
There are a large number of thresholds used in the decision-making algorithm. Their default values are listed here:
- HCheck_SurfacePThresh: 10000.0 Pa
- HCheck_ETolMult: 0.5
- HCheck_ETolMax: 1.0 m
- HCheck_ETolMaxPThresh: 50000.0 Pa
- HCheck_ETolMaxLarger: 1.0 m
- HCheck_ETolMin: 1.0 m
- HCheck_EThresh: 100.0 m
- HCheck_EThreshB: 100.0 m
- HCheck_ESumThresh: 50.0 m
- HCheck_MinAbsEThresh: 10.0 m
- HCheck_ESumThreshLarger: 100.0 m
- HCheck_MinAbsEThreshLarger: 100.0 m
- HCheck_CorrThresh: 5.0 m
- HCheck_ESumNextThresh: 50.0 m
- HCheck_MinAbsEThreshT: 10.0 m
- HCheck_CorrDiffThresh: 10.0
- HCheck_CorrMinThresh: 1.0
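
The thickness calculation that underlies this check follows from the hydrostatic (hypsometric) equation, Δz = (R_d / g) T_mean ln(p_lower / p_upper). The sketch below shows only this step, with textbook constants and a simple arithmetic mean of the two level temperatures (both assumptions); the decision-making algorithm and its thresholds are not reproduced.

```python
import math

R_D = 287.05   # gas constant for dry air, J kg^-1 K^-1 (textbook value)
G = 9.80665    # standard gravity, m s^-2


def hydrostatic_thickness(p_lower, t_lower, p_upper, t_upper):
    """Thickness (m) between two pressure levels from the hypsometric equation."""
    t_mean = 0.5 * (t_lower + t_upper)
    return R_D / G * t_mean * math.log(p_lower / p_upper)


computed = hydrostatic_thickness(100000.0, 288.0, 85000.0, 280.0)
observed = 1500.0 - 110.0  # difference of two reported heights (invented values)
discrepancy = observed - computed
```

For these invented values the computed thickness is about 1351 m, giving a discrepancy of roughly 39 m that a decision algorithm would then compare against its thresholds.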

UInterp check

Operation

This check is used to detect two types of error in the observed wind speed. The first occurs when two levels have identical pressures but a large vector difference between their measured wind speeds. If the squared difference between the measured wind speeds is larger than a threshold (UICheck_TInterpIdenticalPTolSq) then both levels are flagged.

The second type of error is detected by interpolating the significant level wind speeds onto any encompassed standard levels, as is done for temperature in the Interpolation check. If the squared difference between the interpolated and measured wind speeds is larger than a certain amount (UICheck_TInterpTolSq) then both levels are flagged.

Similarly to the interpolation check, the second type of error is only searched for if the pressure difference between the adjacent standard levels is not too large. The maximum permitted difference is referred to as a ‘big gap’. The value of the big gap depends on the pressure of the standard level in question; as this pressure reduces (and passes thresholds defined in UICheck_BigGapsPThresh), the value of the big gap also reduces (according to the values in UICheck_BigGaps), down to a minimum value given by the value of UICheck_BigGapLowP.

There is an alternative implementation of this check called UInterpAlternative. The UInterpAlternative check uses a different data handling method but otherwise behaves identically to the UInterp check. As such the UInterpAlternative check does not need to be used operationally (but should be kept to aid regression testing).

Summary of yaml parameters

UICheck_TInterpIdenticalPTolSq: threshold for squared difference between observed wind speeds for levels with identical pressures (default 0.0 m² s⁻²).
UICheck_TInterpTolSq: threshold for squared difference between observed and interpolated wind speeds (default 0.0 m² s⁻²).
UICheck_BigGapsPThresh: Maximum pressure thresholds corresponding to the big gaps as defined in UICheck_BigGaps (default [50000.0, 10000.0, 5000.0, 1000.0] Pa).
UICheck_BigGaps: Big gaps corresponding to the pressure thresholds defined in UICheck_BigGapsPThresh (default [100000.0, 50000.0, 10000.0, 5000.0] Pa).
UICheck_BigGapLowP: Minimum ‘big gap’ in pressure (default 500.0 Pa).
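
Both tests compare a squared vector difference of the wind components against a squared tolerance. A minimal sketch of that comparison is shown below; the function name and the tolerance of 16 m² s⁻² are invented for the example, and the big gap logic is not included.

```python
def wind_mismatch(u1, v1, u2, v2, tol_sq):
    """True if the squared vector difference between two winds exceeds tol_sq."""
    diff_sq = (u1 - u2) ** 2 + (v1 - v2) ** 2
    return diff_sq > tol_sq


# Two levels with identical pressure but different winds (m/s).
large = wind_mismatch(10.0, 0.0, 7.0, 4.0, tol_sq=16.0)   # squared difference 25
small = wind_mismatch(10.0, 0.0, 9.0, 1.0, tol_sq=16.0)   # squared difference 2
```

Working with squared quantities avoids a square root and treats the u and v components together as a single vector.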

RH check

Operation

The RH check is designed to detect errors in relative humidity that may be caused by ascents through clouds. Two checks are employed:

Transient humidity error at the cloud top,
Persistent humidity error at high altitude (low pressure) levels after passing through a cloud.

The following conditions must be met in order for a level to fail the cloud top check:

The level pressure must be larger than a particular value (RHCheck_PressThresh),
The pressure difference between the present level and the lowest level must be larger than a particular threshold (RHCheck_PressDiff0Thresh),
The dew point temperature difference between the present level and the level below must be larger than the threshold RHCheck_tdDiffThresh,
The level relative humidity must be larger than the threshold RHCheck_RHThresh,
The minimum relative humidity of all levels above the present level must be less than a certain threshold (RHCheck_MinRHThresh). Only levels whose pressure is close to that of the current level (with a difference threshold of RHCheck_PressDiffAdjThresh) are considered.

The following conditions must be met in order for a level to fail the high-altitude check:

The minimum observed temperature in the profile must be less than a particular threshold (RHCheck_TminThresh),
At least one of the following is true:
- The difference between the observed and model background (O-B) relative humidity in the present level must be larger than a particular threshold (RHCheck_SondeRHHiTol),
- The present level has a pressure lower than RHCheck_PressInitThresh and the mean RH O-B, computed over all levels with a pressure lower than RHCheck_PressInitThresh, is larger than RHCheck_SondeRHHiTol.

Summary of yaml parameters

The following parameters are used in the cloud top check:

RHCheck_PressThresh: Pressure threshold for check at top of cloud layers (default 500.0 Pa).
RHCheck_PressDiff0Thresh: Threshold for difference between pressure at the present level and pressure at the lowest level (default 50.0 Pa).
RHCheck_tdDiffThresh: Threshold for difference in dew point temperature between the present level and the level below (default 5.0 K).
RHCheck_RHThresh: Threshold for relative humidity check to be applied (default 75.0%).
RHCheck_MinRHThresh: Threshold for minimum relative humidity at top of cloud layers (default 75.0%).
RHCheck_PressDiffAdjThresh: Pressure threshold for determining cloud layer minimum RH (default 50.0 Pa).

The following parameters are used in the high-altitude check:

RHCheck_TminThresh: Threshold value of minimum observed temperature in the profile (default 200.0 K).
RHCheck_TminInit: Initial value used in the algorithm that determines the minimum observed temperature (default 400.0 K).
RHCheck_SondeRHHiTol: Threshold for relative humidity O-B difference in sonde ascent check (default 0.0%).
RHCheck_PressInitThresh: Pressure below which O-B mean is calculated (default 500.0 Pa).
RHCheck_TempThresh: Minimum temperature threshold for accumulating an error counter (default 250.0 K).

Time check

Operation

This check flags any observations whose time of measurement lies outside the assimilation window. The time check also optionally rejects wind values whose observation pressure is within TimeCheck_SondeLaunchWindRej of the surface pressure.

Summary of yaml parameters

ModelLevels: Governs whether the observations have been averaged onto model levels.
TimeCheck_SondeLaunchWindRej: Wind observations are rejected if their pressure differs from the surface pressure by less than this value. Assuming an ascent rate of 5 m/s, 10 hPa corresponds to around 20 s of flight time. Using a pressure difference enables all sonde reports to be dealt with. (Default: 0.0 hPa, i.e. no rejection is performed).

BackgroundX checks

Operation

The BackgroundX checks, where X is GeopotentialHeight, RelativeHumidity, Temperature or WindSpeed, use a Bayesian method to update the probability of gross error (PGE) for the relevant set of observations. Each observation must have previously been assigned a value of PGE in order for these checks to be used; this value could, for example, be taken from a stationlist. This PGE is updated with the method detailed below and is used in further filters such as the Buddy check. In addition to updating the PGE, various QC flags are set by each check.

The Bayesian background checks all operate in a similar manner. Firstly, the probability density of ‘bad’ observations is set. Such observations are in gross error, and are assumed to have a uniform probability of taking any climatologically reasonable value. Secondly, for some variables, the observation and background errors are increased to reflect additional sources of error which may be present. Finally the PGE calculation routine is called. Some of the modifications to the errors, and to the PGE within the Bayesian calculation, are only performed if the values in a profile have been averaged onto model levels. This is signified by the filter parameter ModelLevels being equal to true.

The errors and PGEs are modified as follows for each variable:

Geopotential height: the background errors and probability density of bad observations are initialised from the arrays BkCheck_zBkgErrs and BkCheck_zBadPGEs respectively. The value taken from each array depends on where the observed pressure lies in the array BkCheck_PlevelThresholds.
Relative humidity: the probability density of bad observations is set to BkCheck_PdBad_rh. The background and observation error values are multiplied by the square root of two in order to account for long-tailed error distributions. The maximum combined observation and background error variance passed to the Bayesian PGE update is set to the value BkCheck_ErrVarMax_rh.
Temperature: the probability density of bad observations is set to BkCheck_PdBad_t. The observation errors above a certain pressure threshold (‘Psplit’) are scaled in order to account for extra representivity error. The value of Psplit depends on whether the observation is in the tropics, defined as the region with absolute latitude less than BkCheck_Psplit_latitude_tropics degrees. If the observation is in the tropics, Psplit is set to BkCheck_Psplit_tropics; otherwise it is BkCheck_Psplit_extratropics. The error inflation is set to BkCheck_ErrorInflationBelowPsplit for pressures less than or equal to Psplit, and to BkCheck_ErrorInflationAbovePsplit otherwise. The observation PGE is modified if the observation was previously flagged in the UnstableLayer, Interpolation or Hydrostatic checks.
Wind speed: the probability density of bad observations is set to BkCheck_PdBad_uv. The observation PGE is modified if the observation was previously flagged in the Interpolation check.

The PGE update then proceeds as follows. Firstly the probability of the difference between the observed and background values is calculated, assuming the difference follows a normal distribution with variance equal to the combined observation and background error variance. The wind speed components (u and v) are treated together, so a two-dimensional probability density is formed in that case. The PGE is then weighted by this calculated probability and also by the probability that the observation is bad. The updated PGE can be passed to the Buddy check if desired.

The PGE update code is located in a UFO utility function, enabling it to be used by multiple UFO filters. All of the configurable parameters used in the utility function are prefixed with PGE_ and are defined in the section below. Further details of the Bayesian update method can be found in Ingleby, N.B. and Lorenc, A.C. (1993), Bayesian quality control using multivariate normal distributions. Q.J.R. Meteorol. Soc., 119: 1195-1225. https://doi.org/10.1002/qj.49711951316
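
For a single scalar variable, the update described above can be sketched with the standard Bayesian form used in the cited paper. This is a simplified illustration rather than the UFO utility function: the function name and argument names are hypothetical, and the way the exponent argument is capped at PGE_ExpArgMax is an assumption about how that parameter is applied.

```python
import math


def update_pge(pge, obs, bkg, obs_err, bkg_err, pd_bad, exp_arg_max=80.0):
    """Return the updated probability of gross error for one scalar observation."""
    # Combined observation and background error variance.
    var = obs_err ** 2 + bkg_err ** 2
    # Probability density of the departure if the observation is good (normal distribution).
    arg = min((obs - bkg) ** 2 / (2.0 * var), exp_arg_max)
    p_good = math.exp(-arg) / math.sqrt(2.0 * math.pi * var)
    # Weight the prior PGE by the density of bad observations and normalise.
    bad = pge * pd_bad
    return bad / (bad + (1.0 - pge) * p_good)


close = update_pge(0.05, obs=280.0, bkg=280.0, obs_err=1.0, bkg_err=1.0, pd_bad=0.05)
far = update_pge(0.05, obs=288.0, bkg=280.0, obs_err=1.0, bkg_err=1.0, pd_bad=0.05)
```

An observation that agrees with the background ends up with a PGE below its prior value of 0.05, whereas a departure of 8 K with unit errors pushes the PGE close to 1, well above a PGE_PGECrit of 0.1.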

Summary of yaml parameters

ModelLevels: Governs whether the observations have been averaged onto model levels.
BkCheck_PdBad_t: Probability density of bad observations for T (default: 0.05).
BkCheck_PdBad_rh: Probability density of bad observations for RH (default: 0.05).
BkCheck_PdBad_uv: Probability density of bad observations for u and v (default: 0.001).
BkCheck_Psplit_latitude_tropics: Observations with a latitude smaller than this value (both N and S) are taken to be in the tropics (default: 30 degrees).
BkCheck_Psplit_extratropics: Pressure threshold above which extra representivity error occurs in extratropics (default: 50000 Pa).
BkCheck_Psplit_tropics: Pressure threshold above which extra representivity error occurs in tropics (default: 10000 Pa).
BkCheck_ErrorInflationBelowPsplit: Error inflation factor below Psplit (default: 1.0).
BkCheck_ErrorInflationAbovePsplit: Error inflation factor above Psplit (default: 1.0).
BkCheck_ErrVarMax_rh: Maximum combined observation and background error variance for RH (default: 500.0 per 10000).
BkCheck_PlevelThresholds: Pressure thresholds for setting geopotential height background errors and bad observation PGE. This vector must be the same length as BkCheck_zBkgErrs and BkCheck_zBadPGEs (default: [1000.0, 500.0, 100.0, 50.0, 10.0, 5.0, 1.0, 0.0] hPa).
BkCheck_zBkgErrs: List of geopotential height background errors that are assigned based on pressure. This vector must be the same length as BkCheck_PlevelThresholds and BkCheck_zBadPGEs (default: [10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0] m).
BkCheck_zBadPGEs: List of geopotential height PGEs for bad observations that are assigned based on pressure. This vector must be the same length as BkCheck_PlevelThresholds and BkCheck_zBkgErrs (default: [0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01]).
PGE_ExpArgMax: Maximum value of exponent in background QC (default: 80.0). This could be changed depending upon the machine precision.
PGE_PGECrit: PGE rejection limit (default: 0.1). Observations with values of PGE above this threshold are flagged.
PGE_ObErrMult: Multiplication factor for observation errors (default: 1.0).
PGE_BkgErrMult: Multiplication factor for background errors (default: 1.0).
PGE_SDiffCrit: Threshold for (squared observation minus background difference) / (error variance) (default: 100.0). Observations with values larger than this threshold are flagged. This check is only performed if the observations have been averaged onto model levels.

Back to overview of conventional profile processing

PermanentReject check¶

Operation

This check permanently rejects observations that have previously been flagged as failing by another check.

Summary of yaml parameters

ModelLevels: Governs whether the observations have been averaged onto model levels.

Back to overview of conventional profile processing

SondeFlags check¶

Operation

This check accounts for any QC flags that were assigned to the sonde data prior to UFO being run. These QC flags may be, for example, standard WMO designations.

Summary of yaml parameters

There are no configurable parameters for this check.

Back to overview of conventional profile processing

WindProfilerFlags check¶

Operation

This check accounts for any QC flags that were assigned to the wind profiler data prior to UFO being run.

Summary of yaml parameters

There are no configurable parameters for this check.

Back to overview of conventional profile processing

Pressure calculation¶

Operation

This routine calculates profile pressures if they have not been measured (or were measured but are potentially inaccurate). Firstly, the model heights are computed from the orography and the terrain-following height coordinate. The model heights are then used together with the observation heights and model pressures to interpolate (or extrapolate) values of the observed pressures.

Summary of yaml parameters

The default values of these parameters are suitable for the UM.

zModelTop: Height of the upper boundary of the highest model layer.
firstConstantRhoLevel: First model rho level at which there is no geographical variation in the height.
etaTheta: Values of terrain-following height coordinate (eta) on theta levels.
etaRho: Values of terrain-following height coordinate (eta) on rho levels.
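
To show how these four parameters could combine with the orography to give model heights, here is a minimal sketch. It assumes the quadratic terrain-following formulation commonly described for the UM, which is not stated in this documentation; the function name and the argument eta_rho_first_constant (the etaRho value at firstConstantRhoLevel) are hypothetical.

```python
def model_heights(orography, eta, eta_rho_first_constant, z_model_top):
    """Heights (m) of model levels with the given eta values (illustrative)."""
    heights = []
    for e in eta:
        if e < eta_rho_first_constant:
            # Below the first constant rho level the surfaces follow the
            # terrain, with an influence that decays quadratically (assumed).
            z = e * z_model_top + orography * (1.0 - e / eta_rho_first_constant) ** 2
        else:
            # At and above that level the heights no longer depend on orography.
            z = e * z_model_top
        heights.append(z)
    return heights


# Hypothetical four-level column over 1000 m of orography.
heights = model_heights(1000.0, [0.0, 0.1, 0.5, 1.0], 0.5, 80000.0)
```

In this sketch the lowest level sits on the orography and the uppermost level sits at zModelTop, whatever the orography.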

Back to overview of conventional profile processing

AverageX¶

Operation

The AverageX routines, where X is Pressure, RelativeHumidity, Temperature or WindSpeed, are used to average observed values of X onto model levels.

In order for these routines to work correctly the ObsSpace must have been extended as in the following yaml snippet:

- obs space:
   extension:
     average profiles onto model levels: 71

(where 71 can be replaced by the length of the air_pressure_levels GeoVaL).

Furthermore, the AveragePressure routine must be run prior to any of the other AverageX routines; this calculates various transformed values of pressure which are used in the averaging.

The vertical processing of temperature is based on calculating the thickness of the model layers (rather than just averaging the temperatures). The potential temperature in each layer is converted to temperature by multiplying by the Exner pressure. When the model layer is not completely covered by observations, a potential temperature observation-minus-background increment is computed using linear interpolation of temperature between the layer boundaries. This increment is added to the background value to produce the averaged observation value.
The eastward and northward wind components are averaged separately over model layers defined by adjacent pressure levels, including the surface pressure.
By default, relative humidities are interpolated onto model layer boundaries rather than averaged across layers in order to avoid unwanted smoothing. This behaviour can be controlled with the AvgRH_Interp option. The interpolated/averaged relative humidity values are rejected at any layer where the averaged temperature value is less than or equal to the threshold AvgRH_AvgTThreshold. This threshold can be modified to an instrument-dependent value with the parameter AvgRH_InstrTThresholds, which is a map between WMO sonde instrument codes and the associated temperature thresholds.
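
The temperature-based rejection of relative humidity described above can be sketched as a simple lookup. The function name is hypothetical, and the instrument code and threshold in the example map are invented for illustration; they are not defaults taken from the yaml file.

```python
def rh_is_rejected(avg_temperature_c, instrument_code,
                   default_threshold_c=-40.0, instr_thresholds_c=None):
    """Return True if the averaged RH at a layer should be rejected."""
    thresholds = instr_thresholds_c or {}
    # Use the instrument-specific threshold if one is defined for this
    # WMO instrument code, otherwise fall back to AvgRH_AvgTThreshold.
    threshold = thresholds.get(instrument_code, default_threshold_c)
    # Rejection applies when the averaged temperature is at or below the threshold.
    return avg_temperature_c <= threshold


# Hypothetical map standing in for AvgRH_InstrTThresholds.
custom = {999: -60.0}
default_case = rh_is_rejected(-45.0, 1, instr_thresholds_c=custom)
custom_case = rh_is_rejected(-45.0, 999, instr_thresholds_c=custom)
```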

The H(x) equivalents of the averaged observations are computed with the ProfileAverage observation operator.

Summary of yaml parameters

AvgP_SondeGapFactor: Factor used to determine big gaps for sondes (dimensionless; multiplied by log(10)) (default: 1.0).
AvgP_WinProGapFactor: Factor used to determine big gaps for wind profilers (dimensionless; multiplied by log(10)) (default: 1.0).
AvgP_GapLogPDiffMin: Minimum value of denominator used when computing big gaps (dimensionless; equal to log (pressure threshold / hPa)) (default: log(5.0)).
AvgT_SondeDZFraction: Minimum fraction of a model layer that must have been covered (in the vertical coordinate) by observed values in order for temperature to be averaged onto that layer (default: 0.5).
AvgT_PGEskip: Probability of gross error threshold above which rejection flags are set in the temperature averaging routine (default: 0.9).
AvgU_SondeDZFraction: Minimum fraction of a model layer that must have been covered (in the vertical coordinate) by observed values in order for wind speed to be averaged onto that layer (default: 0.5).
AvgU_PGEskip: Probability of gross error threshold above which rejection flags are set in the wind speed averaging routine (default: 0.9).
AvgRH_PGEskip: Probability of gross error threshold above which rejection flags are set in the relative humidity averaging routine (default: 0.9).
AvgRH_SondeDZFraction: Minimum fraction of a model layer that must have been covered (in the vertical coordinate) by observed values in order for relative humidity to be averaged onto that layer (default: 0.5).
AvgRH_Interp: Perform interpolation or averaging of relative humidity observations (default: true = interpolation).
AvgRH_AvgTThreshold: Default average temperature threshold below which average relative humidity observations are rejected (degrees C) (default: -40.0).
AvgRH_InstrTThresholds: Custom average temperature thresholds below which average relative humidity observations are rejected (degrees C). These thresholds are stored in a map with keys equal to the WMO codes for radiosonde instrument types and values equal to the custom thresholds. The full list of codes can be found in “WMO Manual on Codes - International Codes, Volume I.2, Annex II to the WMO Technical Regulations: Part C - Common Features to Binary and Alphanumeric Codes” (available at https://library.wmo.int/?lvl=notice_display&id=10684). See yaml file for defaults.

Back to overview of conventional profile processing

Examples¶

This example runs the basic checks on the input data:

- filter: Conventional Profile Processing
  filter variables:
  - name: airTemperature
  - name: geopotentialHeight
  Checks: ["Basic"]

This example runs the basic and SamePDiffT checks on the input data, using separate instances of the filter to do so:

- filter: Conventional Profile Processing
  filter variables:
  - name: airTemperature
  - name: geopotentialHeight
  Checks: ["Basic"]
- filter: Conventional Profile Processing
  filter variables:
  - name: airTemperature
  - name: geopotentialHeight
  Checks: ["SamePDiffT"]
  SPDTCheck_TThresh: 30.0 # This is an example modification of a check parameter

This example runs the basic and SamePDiffT checks on the input data, using the same filter instance:

- filter: Conventional Profile Processing
  filter variables:
  - name: airTemperature
  - name: geopotentialHeight
  Checks: ["Basic", "SamePDiffT"]
  SPDTCheck_TThresh: 30.0 # This is an example modification of a check parameter

Ocean Vertical Stability Check¶

This filter calculates the density (kg/m^3) from given temperature, salinity and pressure, and then checks for locations where the density spikes (is different from the densities above and below by more than the specified tolerance) or steps (decreases with depth by more than the specified tolerance). Such spikes or steps in density indicate a vertical instability, as we expect density to increase monotonically with depth.

Summary of yaml parameters

filter variables: the ObsValue variable(s) that will be associated with the DensitySpike and DensityStep flags. The choice of filter variables does not affect the functioning of the filter as long as the variables’ ObsError values are not missing everywhere.
variables.temperature: in situ temperature values (degrees C) used in computation of density (required).
variables.salinity: absolute salinity values (g/kg) used in computation of density (required).
variables.pressure: pressure values (dbar) used in computation of density (required).
count spikes: count the number of spikes in density (default: true).
count steps: count the number of steps in decreasing density (default: true).
nominal tolerance: if a density difference from one level to the next deeper one is less than this (more negative), then this is counted as a step (default: -0.05 kg/m^3).
threshold: the smaller the threshold, the more symmetrical a density spike must be to count as a spike (default: 0.25).

Note that a call to the Ocean Vertical Stability Check filter MUST be preceded by creation of Diagnostic Flags called DensitySpike and DensityStep, for every filter variable listed (see example below). An error will be thrown if a filter variable is listed but does not have both DensitySpike and DensityStep flags associated with it. The flags need to be present because the code itself sets them: DensitySpike at the location of the spike, and DensityStep at the ‘foot’ of the step (the deeper level only, so as not to double-count steps).
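
The step part of the check can be illustrated with a short sketch. It assumes that the densities have already been computed, that the profile is sorted from surface to depth, and that there are no missing values; the function name is hypothetical, and the spike test (which also involves the threshold parameter) is not reproduced here.

```python
def flag_density_steps(density, nominal_tolerance=-0.05):
    """Return a DensityStep-like flag for each level of one profile."""
    flags = [False] * len(density)
    for i in range(1, len(density)):
        # Density difference from one level to the next deeper one.
        difference = density[i] - density[i - 1]
        if difference < nominal_tolerance:
            # Flag only the deeper level (the 'foot' of the step).
            flags[i] = True
    return flags


# Densities in kg/m^3, ordered from the surface downwards.
density = [1025.0, 1025.2, 1025.1, 1025.3, 1025.0]
step_flags = flag_density_steps(density)
```

Here the third and fifth levels are flagged, because the density drops by more than 0.05 kg/m^3 relative to the level immediately above each of them.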

Example yaml

time window:
  begin: 2020-12-31T23:59:00Z
  end: 2021-01-01T00:01:00Z

observations:
  observers:
  - obs space:
      name: test data
      obsdatain:
        engine:
          type: H5File
          obsfile: Data/ufo/testinput_tier_1/profile_filter_testdata.nc4
        obsgrouping:
          group variables: [ "stationIdentification", "dateTime", "latitude", "longitude" ]
          sort variable: "waterPressure"
          sort group: "DerivedObsValue"
          sort order: "ascending"
      simulated variables: [waterTemperature, salinity, depthBelowWaterSurface, waterPressure]
      observed variables: [waterTemperature, salinity]
      derived variables: [depthBelowWaterSurface, waterPressure]
    HofX: HofX
    obs filters:
    - filter: Create Diagnostic Flags
      filter variables:
        - name: DerivedObsValue/depthBelowWaterSurface
      flags:
      - name: DensitySpike
        initial value: false
      - name: DensityStep
        initial value: false
      - name: Superadiabat
        initial value: false
    - filter: Ocean Vertical Stability Check
      filter variables:
        - name: DerivedObsValue/depthBelowWaterSurface
      variables:
        temperature: ObsValue/waterTemperature
        salinity: ObsValue/salinity
        pressure: DerivedObsValue/waterPressure
      count spikes: true
      count steps: true
      nominal tolerance: -0.05
      threshold: 0.25
      actions:
      - name: set
        flag: Superadiabat
      - name: reject

In this example, the Diagnostic Flags are associated with the filter variable DerivedObsValue/depthBelowWaterSurface. This sets DiagnosticFlags/DensitySpike/depthBelowWaterSurface and DiagnosticFlags/DensityStep/depthBelowWaterSurface. Additionally, because a filter action is specified to set DiagnosticFlags/Superadiabat, this flag is set (for depthBelowWaterSurface only) at every location that is flagged as a density spike or step (both levels of each step). These locations are rejected because that filter action has also been specified.

This filter has only been tested for observations that have been grouped into records (profiles) by setting the obsgrouping.group variables option. The sort variable, sort group and sort order options are optional, though incorrect results will be obtained if the profiles are not sorted surface to depth.

Example of subsequent flagging of whole profiles

Once the density spikes and steps have been flagged, it is possible to subsequently reject whole profiles that exceed specified conditions:

# create derived metadata counting levels:
  - filter: Variable Assignment
    assignments:
    - name: DerivedMetaData/numberOfLevels
      type: int
      function:
        name: IntObsFunction/ProfileLevelCount
        options:
          where:
            - variable:
                name: ObsValue/waterTemperature
              value: is_valid
# create derived metadata counting spikes only:
  - filter: Variable Assignment
    assignments:
    - name: DerivedMetaData/ocean_density_spikes
      type: int
      function:
        name: IntObsFunction/ProfileLevelCount
        options:
          where:
            - variable:
                name: DiagnosticFlags/DensitySpike/depthBelowWaterSurface
              value: is_true
# reject whole profile if num spikes >= numlev/4, so compute
#  4*( num spikes ) minus numlev in order to check it against 0:
  - filter: Variable Assignment
    assignments:
    - name: DerivedMetaData/ocean_density_rejections
      type: int
      function:
        name: IntObsFunction/LinearCombination
        options:
          variables: [DerivedMetaData/ocean_density_spikes, DerivedMetaData/numberOfLevels]
          coefs: [4, -1]
# reject whole profile if num spikes >= numlev/4 AND >= 2:
  - filter: Perform Action
    where:
    - variable:
        name: DerivedMetaData/ocean_density_rejections
      minvalue: 0
    - variable:
        name: DerivedMetaData/ocean_density_spikes
      minvalue: 2
    where operator: and
    action:
      name: reject

This example rejects whole profiles in which there are at least 2 density spikes AND the number of spikes is at least one quarter of the number of non-missing levels in the profile. It makes use of the ProfileLevelCount and LinearCombination obsFunctions, and Perform Action: reject based on where statements. With spikes and steps separated like this, they can be counted and used separately in conditional flagging, if required.
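
The arithmetic behind the two where conditions in the yaml above can be checked with a few lines of Python. This is only a restatement of the yaml logic; the function name is hypothetical.

```python
def reject_whole_profile(num_spikes, num_levels):
    """Mirror the LinearCombination and Perform Action steps of the example."""
    # LinearCombination with coefs [4, -1]: 4 * spikes - levels.
    rejections = 4 * num_spikes - num_levels
    # Both where conditions must hold (where operator: and).
    return rejections >= 0 and num_spikes >= 2


decisions = {
    (2, 8): reject_whole_profile(2, 8),    # 2 spikes in 8 levels
    (2, 9): reject_whole_profile(2, 9),    # 2 spikes in 9 levels
    (1, 4): reject_whole_profile(1, 4),    # 1 spike in 4 levels
    (3, 10): reject_whole_profile(3, 10),  # 3 spikes in 10 levels
}
```

A profile with 1 spike in 4 levels satisfies the one-quarter condition but is kept, because it has fewer than 2 spikes.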

Average Observations to Model Levels¶

For each of the filter variables given, this filter computes the model-level average increment (where \(j\) indexes observation levels):

\[inc_{m} = \frac{ \sum_{j = j_{0_m}}^{j_{N_m}} { (y_j - H(x)_j) } }{j_{N_m} - j_{0_m}}\]

It is the mean of all where-included, QC-passing, non-missing observation-minus-background values \((y_j - H(x)_j)\) that fall within the range \(j_{0_m}\) to \(j_{N_m}\) of that model level \(m\); that is, the sum is divided by the number of values it contains. The range is bounded by the mid-points between model level \(m\) and the adjacent model levels above and below it.

Each average increment is added to the background value at that model level:

\[\langle y \rangle_m = H(x)_m + inc_m\]

The resulting observation values averaged on to model levels, \(\langle y \rangle_m\), are written to the extended space of the DerivedObsValue. The original space of the DerivedObsValue is the same as that of the ObsValue.

The QC flags on model levels are set by this filter to be equal to those of the nearest observation level that is only just deeper in the ocean, or only just higher in the atmosphere, than that model level. It is the user’s responsibility to set the model-level (extended-space) ObsError as required, using a separate Perform Action: assign error, as there is no agreed method for this filter to assign observation errors.
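
The averaging described above can be sketched for a single profile as follows. This is a minimal illustration under stated assumptions, not the filter's implementation: the function name is hypothetical, the vertical coordinate is assumed to increase, missing or excluded observations are represented by None, the outermost model levels are assumed to be unbounded on their outer side, and model levels with no usable observations are returned as None.

```python
def average_to_model_levels(obs_z, obs_y, obs_hofx, model_z, model_hofx):
    """Average observation increments on to model levels for one profile."""
    n_levels = len(model_z)
    averaged = []
    for m in range(n_levels):
        # Bounds are the mid-points to the adjacent model levels (assumed
        # unbounded beyond the first and last model levels).
        lower = float("-inf") if m == 0 else 0.5 * (model_z[m - 1] + model_z[m])
        upper = float("inf") if m == n_levels - 1 else 0.5 * (model_z[m] + model_z[m + 1])
        increments = [y - h for z, y, h in zip(obs_z, obs_y, obs_hofx)
                      if y is not None and lower <= z < upper]
        if increments:
            # Mean increment added to the background value at this level.
            averaged.append(model_hofx[m] + sum(increments) / len(increments))
        else:
            averaged.append(None)
    return averaged


# Hypothetical salinity profile: depths in m, one missing observation.
averaged = average_to_model_levels(
    obs_z=[5.0, 12.0, 18.0, 29.0],
    obs_y=[35.2, 35.4, None, 36.5],
    obs_hofx=[35.0, 35.0, 35.4, 36.0],
    model_z=[10.0, 20.0, 30.0],
    model_hofx=[35.0, 35.5, 36.0],
)
```

In this example the first model level receives the mean of two increments, the second has no usable observations, and the third receives a single increment.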

Summary of yaml parameters

filter variables: the (Derived)ObsValue(s) whose observation-level values are to be averaged on to model levels.
observation vertical coordinate: variable containing the observation levels (e.g. air pressure, ocean depth) in its original space (required).
model vertical coordinate: variable containing the model levels (e.g. air pressure, ocean depth) in its extended space (required).

Example

time window:
  begin: 2020-12-31T23:59:00Z
  end: 2021-01-01T00:01:00Z

observations:
- obs space:
    name: Average Obs to Model Levels
    obsdatain:
      engine:
        type: H5File
        obsfile: Data/ufo/testinput_tier_1/profile_testdata.nc4  # not real
    obsgrouping:
      group variables: [ "stationIdentification" ]
      sort variable: "depthBelowWaterSurface"
      sort order: "ascending"
    simulated variables: ["depthBelowWaterSurface", "salinity"]

    extension:
      allocate companion records with length: &num_levels 75
      variables filled with non-missing values:
      - "latitude"
      - "longitude"
      - "dateTime"
      - "stationIdentification"
      - "observationTypeNum"

  geovals: Data/ufo/testinput_tier_1/profile_geovalsdata.nc4  # not real

  obs operator:
    name: Categorical
    categorical variable: extendedObsSpace
    fallback operator: "CompositeOriginal"
    categorised operators: {"1": "CompositeAverage"}
    operator labels: ["CompositeOriginal", "CompositeAverage"]
    operator configurations:
    - name: Composite
      components:
      - name: VertInterp
        variables:
        - name: salinity
        - name: depthBelowWaterSurface
        observation vertical coordinate: depthBelowWaterSurface
        observation vertical coordinate group: DerivedObsValue
        vertical coordinate: depthBelowWaterSurface
    - name: Composite
      components:
      - name: ProfileAverage
        variables:
        - name: salinity
        - name: depthBelowWaterSurface
        model vertical coordinate: "ocean_depth"
        pressure coordinate: depthBelowWaterSurface
        pressure group: DerivedObsValue
        require descending pressure sort: false
        number of intersection iterations: 0

  obs post filters:
  - filter: Average Observations To Model Levels
    filter variables:
    - name: salinity
    observation vertical coordinate: DerivedObsValue/depthBelowWaterSurface
    model vertical coordinate: HofX/depthBelowWaterSurface

In order for this filter to work correctly, the observations must be grouped into records (profiles) using the obsgrouping.group variables option. The filter works whether the vertical coordinate is in increasing or decreasing order, but the model and observation vertical coordinates must both increase or both decrease, otherwise an error is thrown.

The ObsSpace must also have been extended with obs space.extension as in the example above, to accommodate the averaged observation values on model levels, in the extended space.

It is expected that the model vertical coordinate should contain values in its extended space - one way to achieve this is with the ProfileAverage obsOperator (see example above). ProfileAverage fills the extended space of an HofX variable (created by the VertInterp obsOperator in the above example), with the GeoVaLs values appropriate to each profile’s location. If the extended space of model vertical coordinate is all zeroes (as would be the case if ProfileAverage had not been performed), an error is thrown when applying this filter. (The filter does not stop if the extended space of model vertical coordinate is all missing for a profile, as some profiles may be missing all their data.)

In the example above, a variable called DerivedObsValue/salinity is created. It contains the same values as ObsValue/salinity in its original space, while its extended space is filled with the values of ObsValue/salinity averaged on to the model levels specified by model vertical coordinate.

This filter supports use of “where” statements: any where-excluded observation locations are excluded from the calculation of the average increments.

Average Observations To GeoVals Model Levels¶

For each of the filter variables given, this filter averages the observations within each given model layer and assigns them to that layer. The resulting observation values, averaged onto model levels, are written to the DerivedObsValue’s extended space. The original space of DerivedObsValue remains the same as that of ObsValue.

A new variable, modelLayer, is added in MetaData, with values ranging from 1 to the number of model layers. It is the user’s responsibility to extend the ObsSpace based on the number of model levels.

If there are observations within a model layer, they are averaged; otherwise the averaged value for that layer is set to missing. This filter also adds a new QC flag variable in MetaData named actObsAvgQC. The QC flag is set to 1 when there is at least one observation to be assigned to the model layer.
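
The behaviour described above can be sketched for a single profile as follows. This is an illustration only: the function name is hypothetical, layer k is assumed to be bounded by consecutive entries of a list of layer edges, missing values are represented by None, and a QC value of 0 for empty layers is an assumption (the text above only states when the flag is set to 1).

```python
def average_in_model_layers(obs_z, obs_y, layer_edges):
    """Average observations within each model layer for one profile."""
    n_layers = len(layer_edges) - 1
    model_layer = list(range(1, n_layers + 1))  # values 1..number of layers
    averaged, qc = [], []
    for k in range(n_layers):
        # Accept edges given in either increasing or decreasing order.
        low, high = sorted((layer_edges[k], layer_edges[k + 1]))
        values = [y for z, y in zip(obs_z, obs_y)
                  if y is not None and low <= z < high]
        # Average if the layer holds any observations, otherwise missing.
        averaged.append(sum(values) / len(values) if values else None)
        qc.append(1 if values else 0)  # 0 for empty layers is assumed
    return model_layer, averaged, qc


# Hypothetical heights (m) and reflectivity values.
model_layer, averaged_obs, qc_flags = average_in_model_layers(
    obs_z=[100.0, 900.0, 2500.0],
    obs_y=[10.0, 20.0, 30.0],
    layer_edges=[0.0, 1000.0, 2000.0, 3000.0],
)
```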

Summary of yaml parameters

filter variables: the (Derived)ObsValue(s) whose observation-level values are to be averaged on to model levels.
observation vertical coordinate: variable containing the observation levels (e.g. height) in its original space (required).
model vertical coordinate: variable containing the model levels (e.g. geopotential height) in its extended space (required).

Example

time window:
  begin: 2020-12-31T23:59:00Z
  end: 2021-01-01T00:01:00Z
observations:
- obs space:
    name: dpr_gpm
    obsdatain:
      engine:
        type: H5File
      obsgrouping:
        group variables: ["sequenceNumber"]
        sort variable: "Layer"
        sort order: "descending"
    extension:
      allocate companion records with length: &num_levels 127
    obsdataout:
      engine:
        type: H5File
        allow overwrite: true
    _source: testing
    simulated variables: [ReflectivityAttenuated]
    channels: 1-2
  obs operator:
    name: CRTM
    Absorbers: [H2O,O3]
    Clouds: [Water, Rain, Snow]
    Cloud_Fraction: 1.0
    obs options:
      Sensor_ID: dpr_gpm
      EndianType: little_endian
  obs filters:  # assumed key: the parent of the filter list is not shown in the source listing
  - filter: Average Observations To GeoVals Model Levels
    filter variables:
    - name: ObsValue/ReflectivityAttenuated
      channels: 1-2
    observation vertical coordinate: MetaData/height
    model vertical coordinate: GeoVaLs/geopotential_height_levels

In order for this filter to work correctly, the observations must be grouped into records (profiles) using the obsgrouping.group variables option. The filter works whether the observation vertical coordinate is in increasing or decreasing order.

The ObsSpace must also have been extended with obs space.extension as in the example above, to accommodate the averaged observation values on model levels, in the extended space.

In the example above, a variable called DerivedObsValue/ReflectivityAttenuated is created. It contains the same values as ObsValue/ReflectivityAttenuated in its original space, while its extended space is filled with the values of ObsValue/ReflectivityAttenuated averaged on to the model levels specified by model vertical coordinate.