Parameter Classes

Introduction

Traditionally, code retrieving the values of options controlling the behavior of JEDI components from a configuration file has been written in an imperative style, making a series of calls to methods of the eckit::Configuration class. For example, the geometry of a model defined on a (very coarse) lat-lon grid could be configured with the following section of a YAML file:

geometry:
  num lats: 5
  num lons: 10
  level altitudes in km: [0.5, 1, 2, 4, 8, 16]

and the implementation of the OOPS Geometry interface for that model could retrieve the values of these options as follows:

MyGeometry::MyGeometry(const eckit::Configuration &config,
                       const eckit::mpi::Comm & comm) {
  int numLats = config.getInt("num lats");
  int numLons = config.getInt("num lons");
  std::vector<float> levels = config.getFloatVector("level altitudes in km");
  // ...
}

An alternative, more declarative approach is now possible, in which the supported options are listed as member variables of a subclass of the Parameters class. The values of all these options are then loaded from a Configuration object into an instance of that subclass in a single function call. This has the following advantages:

  • It makes it easy to find the full list of parameters accepted by a given JEDI component, including their name, type, default value (if any) and accompanying documentation.

  • It reduces code duplication (code using the eckit::Configuration interface to build instances of specific complex types, such as util::DateTime or std::map, needs to be written only once).

  • Most importantly, it makes it possible for errors in the configuration file (for example misspelled names of optional parameters, parameter values lying outside a specified range, and incorrect parameter value types) to be detected early (potentially even before a JEDI application is run, as described below) and their location pinpointed accurately.

For example, the options recognized by MyGeometry could be encapsulated in the following subclass of Parameters:

#include <vector>

#include "oops/util/parameters/Parameters.h"
#include "oops/util/parameters/RequiredParameter.h"

/// \brief Parameters controlling my model's grid geometry.
class MyGeometryParameters : public oops::Parameters {
  OOPS_CONCRETE_PARAMETERS(MyGeometryParameters, Parameters)
 public:
  /// \brief Grid size in the north-south direction.
  oops::RequiredParameter<int> numLats{"num lats", this};

  /// \brief Grid size in the east-west direction.
  oops::RequiredParameter<int> numLons{"num lons", this};

  /// \brief List of model level altitudes (in km).
  oops::RequiredParameter<std::vector<float>> levelAltitudesInKm{"level altitudes in km", this};
};

Note that:

  • In the above example, all member variables are initialized with the C++11 default member initializer syntax (this is not strictly necessary, but very convenient).

  • The first argument passed to the constructor of each RequiredParameter object is the name of the key in the YAML key/value pair from which that parameter’s value will be extracted.

  • The second argument is the address of the Parameters object holding all the individual parameters.

  • The parameter of the RequiredParameter template indicates the type of the values that can be assigned to that parameter.

  • The class definition begins with an invocation of the OOPS_CONCRETE_PARAMETERS macro, which defines the move and copy constructors and assignment operators and the clone() method in an appropriate way. The first argument to the macro must be the name of the surrounding class and the second, the name of its immediate base class. This macro should be invoked in each concrete subclass of Parameters (otherwise a compilation error will occur).

    Abstract subclasses of Parameters (those that don’t need to be instantiated directly, but only serve as base classes for other classes) should invoke the OOPS_ABSTRACT_PARAMETERS macro instead of OOPS_CONCRETE_PARAMETERS.

The validateAndDeserialize() method loads parameter values from a Configuration object into a Parameters object:

MyGeometry::MyGeometry(const eckit::Configuration &config,
                       const eckit::mpi::Comm & comm) {
  MyGeometryParameters params;
  params.validateAndDeserialize(config);
  // ...
}

Since all parameters have been declared as required, this method will throw an exception if any of them cannot be found in the Configuration object. It is also possible to treat parameters as optional; this is discussed below.

The loaded values can be accessed by calling the value() method of the RequiredParameter object. In most circumstances you can also use a RequiredParameter object as if it were the parameter value itself (omitting the call to value()), since the RequiredParameter<T> class template defines a conversion operator to const T&. So the following two snippets are equivalent:

for (int i = 0; i < params.numLats.value(); ++i) {
  processZonalBand(i);
}

and

for (int i = 0; i < params.numLats; ++i) {
  processZonalBand(i);
}
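
The mechanism behind this equivalence can be illustrated with a minimal, self-contained sketch. SketchParameter and the two counting functions below are hypothetical stand-ins written for this illustration; they are not the OOPS classes and say nothing about how RequiredParameter is actually implemented.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for oops::RequiredParameter<T>; not the OOPS class.
template <typename T>
class SketchParameter {
 public:
  SketchParameter(std::string name, T value)
      : name_(std::move(name)), value_(std::move(value)) {}

  // Explicit access to the stored value.
  const T &value() const { return value_; }

  // Implicit conversion, so the object can be used where a const T& is expected.
  operator const T &() const { return value_; }

 private:
  std::string name_;
  T value_;
};

// Counts loop iterations using the explicit value() call.
inline int countWithValue(const SketchParameter<int> &numLats) {
  int n = 0;
  for (int i = 0; i < numLats.value(); ++i) ++n;
  return n;
}

// Counts loop iterations relying on the implicit conversion to const int&.
inline int countWithConversion(const SketchParameter<int> &numLats) {
  int n = 0;
  for (int i = 0; i < numLats; ++i) ++n;
  return n;
}
```

In this sketch, as in C++ generally, a conversion operator is not considered for member access, so calling a method of the stored object (for example size() on a vector) still requires an explicit call to value().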

More Complex Types

In the preceding example, we have already seen that parameters can store not only values of “primitive” types (e.g. int), but also more complex objects, such as vectors. Other supported types include:

  • Strings. For instance, the parameter

    oops::RequiredParameter<std::string> filename{"filename", this};
    

    can store the value of the filename key from the following YAML file:

    filename: testinput/sondes.nc4
    
  • Dates and durations. For instance, the parameters

    oops::RequiredParameter<util::DateTime> windowBegin{"start date", this};
    oops::RequiredParameter<util::Duration> windowLength{"duration", this};
    

    can store the date and duration loaded from the following YAML file:

    start date: 2010-01-01T21:00:00Z
    duration: PT6H
    

    If the start date or duration YAML keys are not set to valid ISO 8601 dates or durations, validateAndDeserialize() will throw an exception.

  • Maps. For instance, the parameter

    oops::RequiredParameter<std::map<std::string, double>> constants{"constants", this};
    

    can store the key-value pairs loaded from the constants section of the following YAML file:

    constants:
      pi: 3.14
      e:  2.72
    
  • Pairs. For instance, the parameter

    oops::RequiredParameter<std::pair<util::DateTime, util::Duration>> window{"window", this};
    

    can store the value of the window key in the following YAML file:

    window: [2000-01-01T03:00:00Z, PT6H]
    
  • Lists of variable names represented with oops::Variables objects. For instance, the parameter

    oops::RequiredParameter<oops::Variables> simulatedVariables{"simulated variables", this};
    

    can store the list of variables loaded from the following YAML file:

    simulated variables: [air_temperature, relative_humidity]
    

    or

    simulated variables: [brightness_temperature]
    channels: 7-12, 15-19
    

    (with the list of channels also stored in the oops::Variables object).

    Note

    Each file declaring a parameter storing an oops::Variables object needs to include the oops/base/ParameterTraitsVariables.h header.

  • Variable names represented with ufo::Variable objects. For instance, the parameter

    oops::RequiredParameter<ufo::Variable> reference{"reference", this};
    

    can store the variable specified in the reference section of the following YAML file:

    reference:
      name: HofX/brightness_temperature
      channels: 7-12, 15-19
    

    A more complex example is

    reference:
      name: ObsErrorModelRamp@ObsFunction
      options:
        xvar:
          name: air_temperature@ObsValue
        x0: [10]
        x1: [20]
        err0: [20]
        err1: [10]
    

    where the variable is in fact an obs function taking a number of options. The following shorthand YAML syntax (useful for single-channel variables) is also supported:

    reference: MetaData/air_pressure
    

    Note

    Each file declaring a parameter storing a ufo::Variable object needs to include the ufo/utils/parameters/ParameterTraitsVariable.h header.

  • Sets of non-negative integers specified with a shorthand syntax supporting ranges. For instance, the parameter

    oops::RequiredParameter<std::set<int>> channels{"channels", this};
    

    will store the single-element set {5} when loaded from the following YAML file:

    channels: 5
    

    the three-element set {5, 6, 7} when loaded from the following YAML file:

    channels: 5-7
    

    and the seven-element set {5, 6, 7, 10, 15, 16, 17} when loaded from the following YAML file:

    channels: 5-7, 15-17, 10
    

    Spaces after commas are optional.

    Note

    When a channels key appears at the same level as a key set to a variable name or a list of variable names, the values of both keys can be loaded into a single parameter storing a ufo::Variable or oops::Variables object; it is not necessary to declare a separate parameter for the channel list.
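
To make the range shorthand concrete, here is a rough sketch of how a string such as "5-7, 15-17, 10" could be expanded into a set of integers. The parseChannels function is a hypothetical helper written for this illustration; it is not the parser used by OOPS or UFO, and its error handling is deliberately minimal.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of OOPS or UFO: expands a comma-separated list
// of non-negative integers and ranges ("first-last") into a set of integers.
inline std::set<int> parseChannels(const std::string &text) {
  std::set<int> channels;
  std::istringstream stream(text);
  std::string token;
  while (std::getline(stream, token, ',')) {
    const std::size_t dash = token.find('-');
    if (dash == std::string::npos) {
      // A single channel number; std::stoi skips any leading spaces.
      channels.insert(std::stoi(token));
    } else {
      // A range: insert every channel from first to last inclusive.
      const int first = std::stoi(token.substr(0, dash));
      const int last = std::stoi(token.substr(dash + 1));
      if (first > last)
        throw std::invalid_argument("Descending range: " + token);
      for (int c = first; c <= last; ++c) channels.insert(c);
    }
  }
  return channels;
}
```

Because the result is a std::set, the order in which items are listed does not matter and duplicates are stored only once.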

Parameter Nesting

It is also possible to nest parameters, i.e. store a subclass of Parameters in a parameter object. For example, to load the following YAML snippet:

latitudes:
  min: 30
  max: 60
longitudes:
  min: 20
  max: 30

one could use the following code:

class RangeParameters : public oops::Parameters {
  OOPS_CONCRETE_PARAMETERS(RangeParameters, Parameters)
 public:
  oops::RequiredParameter<float> min{"min", this};
  oops::RequiredParameter<float> max{"max", this};
};

class LatLonRangeParameters : public oops::Parameters {
  OOPS_CONCRETE_PARAMETERS(LatLonRangeParameters, Parameters)
 public:
  oops::RequiredParameter<RangeParameters> latitudes{"latitudes", this};
  oops::RequiredParameter<RangeParameters> longitudes{"longitudes", this};
};

To load parameter values from an eckit::Configuration object, it would be enough to call the validateAndDeserialize() method of the top-level Parameters object, i.e. in this case an instance of LatLonRangeParameters.

Optional Parameters

Not all parameters are required; some are optional. There are two distinct scenarios:

  • If the parameter’s value is not specified in the configuration file, a default value is assumed. Such parameters are represented by instances of the Parameter class template, with the default value passed as the second argument of its constructor.

  • The parameter can be omitted from the configuration file, but its absence must be detected and handled specially. This is what the OptionalParameter<T> class template is for: instead of a value of type T it stores a value of type boost::optional<T>. This value is set to boost::none if no key matching the parameter’s name is found in the Configuration object provided to the validateAndDeserialize() function.

As an example, a thinning filter might allow the user to optionally specify a variable storing observation priorities (with observations of higher priority more likely to be retained than those of lower priority). To this end, the name of that variable could be stored in an OptionalParameter<ufo::Variable> object. On the other hand, the maximum number of observations to be retained could be stored in an instance of Parameter<int> if we wanted to provide a default:

#include "oops/util/parameters/OptionalParameter.h"
#include "oops/util/parameters/Parameters.h"
#include "oops/util/parameters/Parameter.h"
#include "ufo/utils/parameters/ParameterTraitsVariable.h"

class MyFilterParameters : public oops::Parameters {
  OOPS_CONCRETE_PARAMETERS(MyFilterParameters, Parameters)
 public:
  oops::OptionalParameter<ufo::Variable> priorityVariable{"priority variable", this};
  oops::Parameter<int> maxNumRetainedObs{"max num retained obs", 10000, this};
};

The priorityVariable parameter would be used like this (assuming that parameters_ is an instance of MyFilterParameters and obsdb_ an instance of ioda::ObsSpace):

// All observations have equal priorities...
std::vector<int> priorities(obsdb_.nlocs(), 0);
if (parameters_.priorityVariable.value() != boost::none) {
  // ... unless a priority variable has been specified.
  const ufo::Variable& var = *parameters_.priorityVariable.value();
  obsdb_.get_db(var.group(), var.variable(), priorities);
}
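
The difference between the two scenarios can be summarized with a small self-contained sketch. It uses a std::map as a stand-in for a configuration section and std::optional (C++17) in place of boost::optional; SketchConfig, loadWithDefault and loadOptional are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Stand-in for a configuration section: a flat map from key to integer value.
using SketchConfig = std::map<std::string, int>;

// Scenario 1: a missing key silently falls back to a default value.
inline int loadWithDefault(const SketchConfig &config, const std::string &key,
                           int defaultValue) {
  const auto it = config.find(key);
  return it != config.end() ? it->second : defaultValue;
}

// Scenario 2: a missing key is reported to the caller as an empty optional,
// so that its absence can be detected and handled specially.
inline std::optional<int> loadOptional(const SketchConfig &config,
                                       const std::string &key) {
  const auto it = config.find(key);
  if (it == config.end()) return std::nullopt;
  return it->second;
}
```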

Constraints

It is possible to restrict the allowed values of Parameter, OptionalParameter and RequiredParameter objects by passing a vector of one or more shared pointers to constant ParameterConstraint objects to their constructor. For convenience, functions returning shared pointers to new instances of subclasses of ParameterConstraint representing particular constraint types have been defined. For example, the code below constrains the iterations parameter to be positive, and the variables parameter to contain at least one element:

#include "oops/util/parameters/ArrayConstraints.h"
#include "oops/util/parameters/NumericConstraints.h"
#include "oops/util/parameters/RequiredParameter.h"

RequiredParameter<int> iterations{"iterations", this, {minConstraint(1)}};
RequiredParameter<std::vector<int>> variables{"variables", this, {nonEmptyConstraint<std::vector<int>>()}};

If a value loaded from the configuration file does not meet a constraint imposed on it, validateAndDeserialize() will throw an exception. At present, four types of constraints on numeric parameters are available:

  • greater than or equal to (minConstraint()),

  • less than or equal to (maxConstraint()),

  • greater than (exclusiveMinConstraint()),

  • less than (exclusiveMaxConstraint()),

and three types of constraints on vector-valued parameters:

  • lower bound on the number of items (minItemsConstraint()),

  • upper bound on the number of items (maxItemsConstraint()),

  • non-empty (nonEmptyConstraint()), technically a special case of the lower bound constraint.
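
The general pattern of a list of shared pointers to constant constraint objects, each able to reject a value, might look roughly as follows. All names in this sketch (SketchConstraint, SketchMinConstraint, sketchMinConstraint, checkAll, satisfies) are hypothetical and chosen for illustration; the real ParameterConstraint classes have their own interface.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a constraint base class; not the OOPS interface.
template <typename T>
class SketchConstraint {
 public:
  virtual ~SketchConstraint() = default;
  // Throws std::invalid_argument if the value violates the constraint.
  virtual void check(const T &value) const = 0;
};

// "Greater than or equal to" constraint.
template <typename T>
class SketchMinConstraint : public SketchConstraint<T> {
 public:
  explicit SketchMinConstraint(T min) : min_(min) {}
  void check(const T &value) const override {
    if (value < min_)
      throw std::invalid_argument("value is below the allowed minimum");
  }

 private:
  T min_;
};

// Convenience function returning a shared pointer to a new constant constraint.
template <typename T>
std::shared_ptr<const SketchConstraint<T>> sketchMinConstraint(T min) {
  return std::make_shared<SketchMinConstraint<T>>(min);
}

// Applies every constraint in turn; the first violation throws.
template <typename T>
void checkAll(const T &value,
              const std::vector<std::shared_ptr<const SketchConstraint<T>>> &constraints) {
  for (const auto &constraint : constraints) constraint->check(value);
}

// Returns true if the value satisfies all constraints, false otherwise.
template <typename T>
bool satisfies(const T &value,
               const std::vector<std::shared_ptr<const SketchConstraint<T>>> &constraints) {
  try {
    checkAll(value, constraints);
    return true;
  } catch (const std::invalid_argument &) {
    return false;
  }
}
```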

Polymorphic Parameters

Polymorphic parameters represent branches of the configuration tree whose structure depends on the value of a particular keyword. For example, here is a YAML file listing the properties of some computer peripherals:

peripherals:
  - type: mouse
    num buttons: 2
  - type: printer
    max page width (mm): 240
    max page height (mm): 320

Clearly, the list of options that make sense for each item in the peripherals list depends on the value of the type keyword. This means that a separate Parameters subclass is needed to represent the options supported by each peripheral type, and the decision which of these classes should be instantiated can only be taken at runtime, when a configuration file is loaded.

The structure of the above YAML file could be represented with the following subclasses of Parameters:

class PeripheralParameters : public Parameters {
  OOPS_ABSTRACT_PARAMETERS(PeripheralParameters, Parameters)
 public:
  RequiredParameter<std::string> type{"type", this};
};

class PrinterParameters : public PeripheralParameters {
  OOPS_CONCRETE_PARAMETERS(PrinterParameters, PeripheralParameters)
 public:
  RequiredParameter<int> maxPageWidth{"max page width (mm)", this};
  RequiredParameter<int> maxPageHeight{"max page height (mm)", this};
};

class MouseParameters : public PeripheralParameters {
  OOPS_CONCRETE_PARAMETERS(MouseParameters, PeripheralParameters)
 public:
  Parameter<int> numButtons{"num buttons", 3, this};
};

class PeripheralParametersWrapper : public Parameters {
  OOPS_CONCRETE_PARAMETERS(PeripheralParametersWrapper, Parameters)
 public:
  RequiredPolymorphicParameter<PeripheralParameters, PeripheralFactory>
    peripheral{"type", this};
};

class ComputerParameters : public Parameters {
  OOPS_CONCRETE_PARAMETERS(ComputerParameters, Parameters)
 public:
  Parameter<std::vector<PeripheralParametersWrapper>> peripherals{
    "peripherals", {}, this};
};

Each item in the peripherals list is represented by a RequiredPolymorphicParameter<PeripheralParameters, PeripheralFactory> object. This object holds a pointer to an instance of a subclass of the PeripheralParameters abstract base class; whether it is an instance of PrinterParameters or MouseParameters is determined at runtime depending on the value of the type key. This is done by the PeripheralFactory::createParameters() static function (not shown in the above code snippet), which is expected to take the string loaded from the type key and return a unique pointer to a new instance of the subclass of PeripheralParameters identified by that string. The PeripheralFactory class would typically be used also to create objects representing the peripherals themselves.

RequiredPolymorphicParameter has counterparts suitable for representing optional polymorphic parameters: OptionalPolymorphicParameter and PolymorphicParameter. These templates behave similarly to OptionalParameter and Parameter; in particular, PolymorphicParameter makes it possible to set a default value of the key (type in the above example) used to select the concrete Parameters subclass instantiated at runtime.

In JEDI, polymorphic parameters are used, for example, to handle options controlling models and variable changes.
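
Since the factory is not shown above, here is a minimal self-contained sketch of the role it plays: mapping the string loaded from the type key to a new instance of the matching subclass. The Sketch* types below are hypothetical stand-ins and do not derive from the OOPS Parameters class; a production factory would more likely look makers up in a registry than use a chain of comparisons.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the Parameters subclasses discussed above.
struct SketchPeripheral {
  virtual ~SketchPeripheral() = default;
  virtual std::string describe() const = 0;
};

struct SketchMouse : SketchPeripheral {
  std::string describe() const override { return "mouse options"; }
};

struct SketchPrinter : SketchPeripheral {
  std::string describe() const override { return "printer options"; }
};

// Plays the role described for PeripheralFactory::createParameters(): takes the
// string loaded from the "type" key and returns a unique pointer to a new
// instance of the subclass identified by that string.
struct SketchPeripheralFactory {
  static std::unique_ptr<SketchPeripheral> createParameters(const std::string &type) {
    if (type == "mouse") return std::make_unique<SketchMouse>();
    if (type == "printer") return std::make_unique<SketchPrinter>();
    throw std::runtime_error("Unknown peripheral type: " + type);
  }
};
```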

Parameter Composition

Sometimes it is convenient to group a subset of keys located at the same level of the YAML hierarchy into a separate Parameters subclass. For example, given the following YAML file

customer:
  name: Mary Brown
  street: 15 High Street
  city: London
  postal code: SW1W 0NY

the address components could be grouped in an AddressParameters class:

class AddressParameters : public Parameters {
  OOPS_CONCRETE_PARAMETERS(AddressParameters, Parameters)
 public:
  RequiredParameter<std::string> street{"street", this};
  RequiredParameter<std::string> city{"city", this};
  RequiredParameter<std::string> postalCode{"postal code", this};
};

The CustomerParameters class, representing the contents of the whole customer YAML section, could then be defined as

class CustomerParameters : public Parameters {
  OOPS_CONCRETE_PARAMETERS(CustomerParameters, Parameters)
 public:
  RequiredParameter<std::string> name{"name", this};
  AddressParameters address{this};
};

Note that it contains a member variable of type AddressParameters rather than e.g. RequiredParameter<AddressParameters>, and that its constructor does not receive the name of any key, but only the this pointer. This is because the street, city and postal code keys are located directly within the customer section rather than within a named subsection of customer.

Introduction of a separate Parameters subclass containing a subset of keys located at a particular level of the YAML hierarchy is especially convenient when the same group of keys appears multiple times in the configuration tree, each time accompanied by different sibling keys, or when the values of this group of keys (excluding any siblings) need to be passed to a function, typically a class constructor.

Conversion to Configuration Objects

The Parameters::toConfiguration() method can be called to convert a Parameters object to a LocalConfiguration object. A typical use case is passing options to Fortran code. As mentioned in Fortran Usage, JEDI defines a Fortran interface to Configuration objects, but there is currently no Fortran interface to Parameters objects, so conversion to a Configuration object is the easiest way to pass the values of multiple parameters to Fortran.

Copying Parameters Objects

Concrete subclasses of Parameters whose definition contains an invocation of the OOPS_CONCRETE_PARAMETERS() macro provide a copy constructor that can be used to copy instances of these objects. In addition, both the OOPS_CONCRETE_PARAMETERS(className, baseClassName) and OOPS_ABSTRACT_PARAMETERS(className, baseClassName) macros define a clone() method returning a unique_ptr<className> holding a deep copy of the object on which it is called. This method can be called to clone an instance of a subclass of Parameters accessed through a pointer to an abstract base class (e.g. PeripheralParameters from the example above).
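
The value of clone() is easiest to see in a small example in which only a reference to the abstract base class is available. The sketch below is a generic illustration of the pattern using hypothetical types (SketchBase, SketchMouseOptions); in OOPS the method is generated by the macros, and its implementation may differ.

```cpp
#include <cassert>
#include <memory>

// Hypothetical abstract base class with a polymorphic clone() method.
struct SketchBase {
  virtual ~SketchBase() = default;
  // Returns a deep copy whose dynamic type matches that of *this.
  virtual std::unique_ptr<SketchBase> clone() const = 0;
  virtual int numButtons() const = 0;
};

// Hypothetical concrete subclass; clone() delegates to the copy constructor.
struct SketchMouseOptions : SketchBase {
  explicit SketchMouseOptions(int n) : buttons(n) {}
  std::unique_ptr<SketchBase> clone() const override {
    return std::make_unique<SketchMouseOptions>(*this);
  }
  int numButtons() const override { return buttons; }
  int buttons;
};
```

A caller holding only a const SketchBase& can call clone() and obtain an independent copy of the correct concrete type, which a copy constructor alone cannot provide.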

Validation

We have referred multiple times to the validateAndDeserialize() function taking a reference to a Configuration object. As you may already have guessed, it wraps calls to two separate functions: validate() and deserialize(). The latter populates the member variables of a Parameters object with values loaded from the input Configuration object. The former checks if the contents of the Configuration object are correct: for example, if all the mandatory parameters are present, if there are any keys whose names do not match the names of any parameters (and thus potentially have been misspelled), and if the values of all keys have the expected types and meet all imposed constraints. Under the hood, this is done by constructing a JSON schema defining the expected structure of a JSON/YAML file section that can be loaded into the Parameters object, and checking if the contents of the Configuration object conform to that schema. This check is performed using an external library, so it is only enabled if this library was available when building JEDI.

Delegating the validity check to a JSON Schema validator has multiple advantages:

  • It makes it easier to detect certain types of errors (in particular misspelled names of optional keys).

  • If the JSON schema defining the expected structure of entire configuration files taken by a particular JEDI application is exported to a text file, an external validator can be used to check the input files even before the application is run (or before a batch job is submitted to an HPC machine).

  • The same text file can be used to enable JSON/YAML syntax checking and autocompletion in editors such as Visual Studio Code.

At this stage, Parameters subclasses representing the top-level options from the configuration files taken by JEDI applications have not yet been defined, so JSON schemas defining the structure of these files cannot be generated yet. This is an area of active development.
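
As an illustration of two of the checks mentioned in this section (unrecognized keys and missing mandatory parameters), the following sketch compares a list of configuration keys with a set of declared parameter names. It is a simplified, hypothetical stand-in: OOPS performs these checks through a JSON schema validator, not with functions like findUnknownKeys or findMissingKeys.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: lists configuration keys that match no declared
// parameter name (and thus may have been misspelled).
inline std::vector<std::string> findUnknownKeys(
    const std::set<std::string> &declaredNames,
    const std::vector<std::string> &configKeys) {
  std::vector<std::string> unknown;
  for (const std::string &key : configKeys) {
    if (declaredNames.count(key) == 0) unknown.push_back(key);
  }
  return unknown;
}

// Hypothetical helper: lists required parameters absent from the configuration.
inline std::vector<std::string> findMissingKeys(
    const std::set<std::string> &requiredNames,
    const std::vector<std::string> &configKeys) {
  const std::set<std::string> present(configKeys.begin(), configKeys.end());
  std::vector<std::string> missing;
  for (const std::string &name : requiredNames) {
    if (present.count(name) == 0) missing.push_back(name);
  }
  return missing;
}
```

With the geometry example from the Introduction, a configuration containing num lon instead of num lons would be reported both as having an unknown key and as lacking a required one.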

OOPS Interfaces Supporting Parameters

Implementations of some OOPS interfaces, such as Model, LinearModel, and Geometry, can opt to provide a constructor taking a const reference to a subclass of Parameters representing the collection of options recognized by the implementation, instead of a constructor taking a const reference to an eckit::Configuration object. Such implementations need to typedef Parameters_ to the name of the appropriate Parameters subclass. For instance, the declaration of the MyGeometry class discussed in the Introduction would originally have looked like this:

class MyGeometry {
 public:
  MyGeometry(const eckit::Configuration & config, const eckit::mpi::Comm & comm);
  // ...
};

But we could also declare it like this:

class MyGeometry {
 public:
  typedef MyGeometryParameters Parameters_;
  MyGeometry(const MyGeometryParameters & params, const eckit::mpi::Comm & comm);
  // ...
};

The constructor would then receive a MyGeometryParameters object already populated with values loaded from the configuration file, without a need to call validateAndDeserialize() separately.

OOPS interfaces that support implementations with such constructors are identified in their documentation. It is envisaged that in future such constructors will be supported by all OOPS interfaces.

Headers to Include; Adding Support for New Parameter Types

Inclusion of the Parameter.h, RequiredParameter.h and OptionalParameter.h header files suffices to use parameter objects storing values of type int, size_t, float, double, bool, std::string, std::vector, std::map, std::pair, util::DateTime, util::Duration, and eckit::LocalConfiguration. Support for some less frequently used types, such as ufo::Variable and oops::Variables, can be enabled by including an appropriate ParameterTraits*.h file, e.g. ufo/utils/parameters/ParameterTraitsVariable.h.

As you may have guessed from the name of this file, the class template ParameterTraits<T> is responsible for the loading of values of type T into parameter objects (as well as their storage in Configuration objects and JSON schema generation). This template has been specialized for frequently used types such as those listed above. If none of them fit your needs and you want to extract values into instances of a different type, you will need to specialize ParameterTraits<T> for that type. To do that, start from one of the existing specializations and adapt it to your requirements.
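
The general idea of a traits template specialized per value type can be sketched as follows. This is a greatly simplified, hypothetical illustration: SketchTraits, SketchStringConfig, Celsius and loadValue are names invented here, the configuration is modelled as a map of strings, and std::optional (C++17) is used. The real ParameterTraits<T> has a different interface, which, as noted above, also covers storage in Configuration objects and JSON schema generation.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Stand-in for a configuration section: every value is stored as a string.
using SketchStringConfig = std::map<std::string, std::string>;

// Primary template, left undefined: loading a type without a specialization
// results in a compilation error.
template <typename T>
struct SketchTraits;

// Specialization for a built-in type.
template <>
struct SketchTraits<int> {
  static std::optional<int> get(const SketchStringConfig &config,
                                const std::string &name) {
    const auto it = config.find(name);
    if (it == config.end()) return std::nullopt;
    return std::stoi(it->second);
  }
};

// A user-defined type that is not supported out of the box...
struct Celsius {
  double degrees;
};

// ...and the specialization that adds support for it.
template <>
struct SketchTraits<Celsius> {
  static std::optional<Celsius> get(const SketchStringConfig &config,
                                    const std::string &name) {
    const auto it = config.find(name);
    if (it == config.end()) return std::nullopt;
    return Celsius{std::stod(it->second)};
  }
};

// Generic loading code needs to know nothing about individual types.
template <typename T>
std::optional<T> loadValue(const SketchStringConfig &config, const std::string &name) {
  return SketchTraits<T>::get(config, name);
}
```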