Building and running SkyLab

Versions used: see the list of spack, software, and AMIs.

Note

It is necessary to use c6i.4xlarge or larger instances of this family (recommended: c6i.8xlarge when running the skylab-atm-land-small experiment).

For more information about using Amazon Web Services please see JEDI on AWS.

Developer Section

To follow this section, one needs read access to the JCSDA-internal GitHub organization.

Prerequisites

The following prerequisites only need to be completed once on each HPC or platform where you plan to run JEDI-Skylab.

  1. Set up your AWS credentials on the platform you are using.

You will need to create or edit your ~/.aws/config and ~/.aws/credentials to make sure they contain:

Listing 1 ~/.aws/config
[default]
region=us-east-1

# NOAA AWS acct config for the ``jcsda-noaa-aws-us-east-1`` R2D2 Data Hub
[jcsda-noaa-aws-us-east-1]
region=us-east-1

# USAF AWS acct config for the ``jcsda-usaf-aws-us-east-2`` R2D2 Data Hub
[jcsda-usaf-aws-us-east-2]
region=us-east-2
Listing 2 ~/.aws/credentials
# NOAA AWS acct credentials if default in config is us-east-1
[default]
aws_access_key_id=***
aws_secret_access_key=***

# NOAA AWS acct creds for the ``jcsda-noaa-aws-us-east-1`` R2D2 Data Hub
[jcsda-noaa-aws-us-east-1]
aws_access_key_id=***
aws_secret_access_key=***

# USAF AWS acct creds for the ``jcsda-usaf-aws-us-east-2`` R2D2 Data Hub
[jcsda-usaf-aws-us-east-2]
aws_access_key_id=***
aws_secret_access_key=***

Tip

Make sure to protect your AWS config and credentials via:

chmod 400 ~/.aws/config
chmod 400 ~/.aws/credentials
  2. Set up your GitHub credentials following the instructions at Step 0: System Configuration.

Quickstart Build Guide

This quickstart guide walks you through setting up JEDI Skylab using automation scripts developed and maintained by the JEDI team. (If you would prefer more manual control while building, skip to the Manual Build Guide.) Using these scripts is the best way to make sure you are using the most recent spack-stack modules, loading the correct environment variables, and building JEDI along with the required JEDI workflow applications. This section can be used on a supported HPC and on your localhost. If needed, more information can be found in the jedi-tools README.

1. Clone jedi-tools inside $JEDI_ROOT

$JEDI_ROOT is the directory in which you will clone the JEDI code and all the files needed to build, test, and run JEDI and SkyLab.

export JEDI_ROOT=/path/to/where/you/want/JEDI
mkdir $JEDI_ROOT
cd $JEDI_ROOT
git clone https://github.com/JCSDA-internal/jedi-tools.git
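If you rerun these steps, a slightly more forgiving variant avoids errors on an existing directory. This sketch uses only standard shell; the default path shown is just an example, not a required location:

```shell
# Variant of the steps above: supply a default if JEDI_ROOT is unset
# and use mkdir -p so rerunning in an existing directory is safe.
export JEDI_ROOT=${JEDI_ROOT:-$HOME/jedi}   # example default; adjust as needed
mkdir -p "$JEDI_ROOT"
cd "$JEDI_ROOT"
```

After this, run the `git clone` command shown above from inside $JEDI_ROOT as before.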

2. Copy, edit, and source the setup script

To set up the spack-stack modules and JEDI-Skylab environment variables and to initialize your virtual environment, you will use jedi-tools/buildscripts/setup.sh. This script is designed to be reused each time you need to set up your environment, and it is recommended to place a copy in $JEDI_ROOT. Edit the file header to set JEDI_ROOT, HOST, COMPILER, and WORKFLOW_ROOT. If you are running on localhost, uncomment the spack-stack module statements and fill in your local spack-stack location.

Note

WORKFLOW_ROOT defaults to JEDI_ROOT; you can update this location as needed for your JEDI-Skylab setup.

cp $JEDI_ROOT/jedi-tools/buildscripts/setup.sh $JEDI_ROOT/
vi $JEDI_ROOT/setup.sh
# Edit the header for JEDI_ROOT, HOST, COMPILER, and optional WORKFLOW_ROOT
# If on localhost, uncomment and fill out your spack-stack information
source $JEDI_ROOT/setup.sh
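As an illustration, an edited header might look like the following. The variable names follow the text above, but the values are hypothetical examples and the exact layout of your copy of setup.sh may differ:

```shell
# Hypothetical example header values; adjust for your platform.
export JEDI_ROOT=$HOME/jedi
export HOST=localhost            # or your HPC name
export COMPILER=gnu              # compiler family used for the build (example)
export WORKFLOW_ROOT=$JEDI_ROOT  # defaults to JEDI_ROOT
```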

3. Build JEDI-Skylab

Now that your environment is configured correctly, you can build JEDI-Skylab. The provided build script will build the develop version of jedi-bundle. Feel free to look inside that file and specify the desired repository branch names. Alternatively, if you only want to build the workflow applications and use Skylab's experiment option build_jedi: True, follow the Supplemental Instructions at the end of this section.

bash $JEDI_ROOT/jedi-tools/buildscripts/build_jedi_skylab.sh

Sit back, relax, and have a coffee while this script runs.

4. Run JEDI ctests

Once the build completes, you can run the JEDI ctests. This can be done manually from $JEDI_BUILD or you can use the Skylab experiment jedi-ctest.yaml. If using Skylab to run ctests, you will need to start the ECFLOW UI.

Listing 3 Manual ctest example
cd $JEDI_BUILD
ctest
Listing 4 Skylab ctest example
ecflow_ui &
create_experiment.py $JEDI_WORKFLOW/skylab/experiments/jedi-ctest.yaml

Congrats, you are ready to now run JEDI-Skylab experiments!

Supplemental Instructions

Build only JEDI-Skylab workflow applications

If you got to Step 3 above and decided to build JEDI during your Skylab experiment, you only need to build the workflow applications needed for the JEDI Skylab environment: simobs, solo, r2d2, ewok, and skylab, plus the related data repositories r2d2-data and static-data. This is automated in the jedi-tools/buildscripts/build_workflow_apps.sh script. The default branches are set to develop; if you need different branches, edit the branch names at the top of the file. Then just run the script. The repositories will be built and installed in the $JEDI_WORKFLOW directory.

bash $JEDI_ROOT/jedi-tools/buildscripts/build_workflow_apps.sh

Manual Build Guide

1 - Load modules

First, you need to load all the modules needed to build jedi-bundle and the jedi workflow applications, solo/r2d2/ewok/simobs/skylab. Loading modules only sets up the environment for you. You still need to build jedi-bundle, run ctests, install solo/r2d2/ewok/simobs and clone skylab.

Currently we only support Orion, Hercules, Derecho, Discover, S4, and AWS platforms. If you are working on a system not specified below please follow the instructions on JEDI Portability.

The commands for loading the modules to compile and run SkyLab are provided in separate sections for HPC platforms and AWS instances (AMIs). Users need to execute these commands before proceeding with the build of jedi-bundle below.

Warning

If you are using spack-stack 1.4.0 or 1.4.1, you need to unload the CRTM v2.4.1-jedi module after loading the spack-stack modules.

module unload crtm

Make sure you are building CRTMV3 within the jedi-bundle using the ecbuild_bundle command.

Warning

If you are using spack-stack 1.7.0, different versions of mapl with different variants are used, depending on the compiler version and whether the system is used for UFS or GEOS. For more information, please see the note and table under “3.1. Officially supported spack-stack installations” in the spack-stack 1.7.0 documentation.

2 - Build jedi-bundle

Once the stack is installed and the corresponding modules loaded, the next step is to get and build the JEDI executables.

The first step is to create your work directory. In this directory you will clone the JEDI code and all the files needed to build, test, and run JEDI and SkyLab. We call this directory JEDI_ROOT throughout this document.

The next step is to clone the code bundle to a local directory. To clone the publicly available repositories use:

mkdir $JEDI_ROOT
cd $JEDI_ROOT
git clone -b 8.0.0 https://github.com/JCSDA/jedi-bundle.git

Alternatively, developers with access to the internal repositories should instead clone the development branch. For that use:

mkdir $JEDI_ROOT
cd $JEDI_ROOT
git clone https://github.com/jcsda-internal/jedi-bundle

The example here is for jedi-bundle; the instructions apply to other bundles as well.

From this point, we will use three environment variables:

  • $JEDI_SRC which should point to the base of the bundle to be built (i.e. the directory that was cloned just above, where the main CMakeLists.txt is located or $JEDI_ROOT/jedi-bundle).

    export JEDI_SRC=$JEDI_ROOT/jedi-bundle
    
  • $JEDI_BUILD which should point to the build directory or $JEDI_ROOT/build. Create the directory if it does not exist.

    export JEDI_BUILD=$JEDI_ROOT/build
    
  • $JEDI_WORKFLOW which should point to the base directory containing the JEDI-Skylab workflow applications EWOK, R2D2, SIMOBS, Skylab, and SOLO (i.e. $JEDI_ROOT/jedi-workflow). Note that you can still place these repos inside $JEDI_SRC instead, but make sure $JEDI_WORKFLOW is set to that location.

    export JEDI_WORKFLOW=$JEDI_ROOT/jedi-workflow
    

Note:

It is recommended that $JEDI_SRC and $JEDI_BUILD are not nested one inside the other.

  • Orion: it’s recommended to use $JEDI_ROOT=/work2/noaa/jcsda/${USER}/jedi.

  • Discover: it’s recommended to use $JEDI_ROOT=/discover/nobackup/${USER}/jedi.

  • On AWS Parallel Cluster, use $JEDI_ROOT=/mnt/experiments-efs/USER.NAME/jedi.

  • On the preconfigured AWS AMIs, use $JEDI_ROOT=$HOME/jedi.

Before building JEDI, set up a python virtual environment based on the spack-stack python installation and activate it.

Note the use of the python_ROOT variable, which at this point should point to the python3 installation in your spack-stack environment. This is important since it ensures the python3 installation you use is in sync with the spack-stack python modules (e.g. py-numpy).

cd $JEDI_ROOT
$python_ROOT/bin/python3 -m venv --system-site-packages venv
source venv/bin/activate
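After activation, a quick sanity check with standard shell commands confirms that python3 now resolves inside the venv. For illustration this sketch uses plain python3 to create the venv; in the JEDI setup you would use $python_ROOT/bin/python3 as shown above:

```shell
# Create and activate a venv, then confirm python3 resolves inside it.
python3 -m venv --system-site-packages venv
source venv/bin/activate
command -v python3   # should end in venv/bin/python3
```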

Note

You need to activate this virtual environment every time you start a new session on your machine. Note that creating and sourcing the $JEDI_ROOT/setup.sh script, described below, covers this requirement.

Run the build of JEDI

mkdir $JEDI_BUILD
cd $JEDI_BUILD
ecbuild $JEDI_SRC
make -j8

Feel free to have a coffee while it builds. Once JEDI is built, you should check the build was successful by running the tests (still from $JEDI_BUILD):

ctest

If you are on an HPC you may need to provide additional flags to the ecbuild command, log in to a compute node, or submit a batch script to run the ctests. Please refer to the Skylab HPC users guide for more details. You can also run the ctests via a Skylab experiment; this can be done after Section 6 - Run SkyLab, using $JEDI_WORKFLOW/skylab/experiments/jedi-ctest.yaml.

Running the tests may take up to 2 hours depending on your system, so you might want to take another coffee break. If all the expected tests pass, congratulations! You have successfully built JEDI!

Warning

If you are running on your own machine you will also need to clone the static-data repo for some skylab experiments.

cd $JEDI_WORKFLOW
git clone https://github.com/jcsda-internal/static-data

Note

Run ctest --help for more information on the test options. For even more information, see section JEDI Testing.

3 - Clone and install solo/r2d2/ewok/simobs, clone skylab only

We recommend using a python3 virtual environment (venv) for building solo/r2d2/ewok/simobs. As indicated above in the note about the $JEDI_WORKFLOW environment variable, these repositories can be placed in a directory of your choosing and do not need to be in $JEDI_ROOT/jedi-bundle.

cd $JEDI_WORKFLOW
git clone https://github.com/jcsda-internal/solo
git clone https://github.com/jcsda-internal/r2d2
git clone https://github.com/jcsda-internal/ewok
git clone https://github.com/jcsda-internal/simobs
git clone https://github.com/jcsda-internal/skylab

Or for the latest release of Skylab v8, clone the corresponding workflow repository branches:

cd $JEDI_WORKFLOW
git clone --branch 1.3.0 https://github.com/jcsda-internal/solo
git clone --branch 2.4.0 https://github.com/jcsda-internal/r2d2
git clone --branch 0.8.0 https://github.com/jcsda-internal/ewok
git clone --branch 1.6.0 https://github.com/jcsda-internal/simobs
git clone --branch 8.0.0 https://github.com/jcsda-internal/skylab

You can then proceed with

cd $JEDI_WORKFLOW/solo
python3 -m pip install -e .
cd $JEDI_WORKFLOW/r2d2
python3 -m pip install -e .
cd $JEDI_WORKFLOW/ewok
python3 -m pip install -e .
cd $JEDI_WORKFLOW/simobs
python3 -m pip install -e .

4 - Set up SkyLab

Create and source $JEDI_ROOT/setup.sh

We recommend creating this bash script and sourcing it before running the experiment. This bash script sets environment variables such as JEDI_BUILD, JEDI_SRC, JEDI_WORKFLOW, EWOK_WORKDIR and EWOK_FLOWDIR required by ewok. A reference setup script that reflects the latest developmental code is available at https://github.com/JCSDA-internal/jedi-tools/blob/develop/buildscripts/setup.sh.

The script contains logic for loading the required spack-stack modules on configurable platforms (i.e. where R2D2_HOST=LOCALHOST, see below), and it pulls in spack-stack configurations for supported platforms. These are located in https://github.com/JCSDA-internal/jedi-tools/blob/develop/buildscripts/setup/ for the latest developmental code.

Users may set JEDI_ROOT, JEDI_SRC, JEDI_BUILD, JEDI_WORKFLOW, EWOK_WORKDIR and EWOK_FLOWDIR to point to relevant directories on their systems or use the default template in the sample script. Note that these locations are experiment specific, i.e. you can run several experiments at the same time, each having their own definition for these variables.

The user further has to set two environment variables R2D2_HOST and R2D2_COMPILER in the script. R2D2_HOST and R2D2_COMPILER are required by r2d2 and ewok. They are used to initialize the location EWOK_STATIC_DATA of the static data used by skylab and bind r2d2 to your current environment. EWOK_STATIC_DATA is staged on the preconfigured platforms. On generic platforms, the script sets EWOK_STATIC_DATA to ${JEDI_WORKFLOW}/static-data/static.
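As an illustration, the settings for a generic (localhost) platform might look like the following. R2D2_HOST=LOCALHOST and the EWOK_STATIC_DATA path come from the text above; the JEDI_WORKFLOW path and compiler value here are hypothetical examples, so check your setup.sh template for the exact accepted values:

```shell
# Illustrative values for a generic platform; adjust to your layout.
export JEDI_WORKFLOW=$HOME/jedi/jedi-workflow   # example path
export R2D2_HOST=LOCALHOST
export R2D2_COMPILER=gnu                        # example compiler value
export EWOK_STATIC_DATA=${JEDI_WORKFLOW}/static-data/static
```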

Please don’t forget to source this script after creating it: source $JEDI_ROOT/setup.sh

Please see Skylab HPC users guide for more information on specifics for editing this setup.sh script and other general instructions and notes for running skylab on supported HPC systems.

The script also sets the variable ECF_PORT to a constant value that depends on your user ID on the system. Please make sure that the resulting value for ECF_PORT is somewhere between 5000 and 20000. On some systems (e.g. your own macOS laptop), the user ID is a large integer well outside the allowed port range. Note that changing your ECF_PORT will require you to reconnect the ecflow server, so keeping it constant will keep your ecflow server connected.
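For illustration, a user-ID-derived port can be mapped into the allowed range like this. The exact formula used by setup.sh may differ; this is only a sketch of the idea:

```shell
# Map the numeric user ID into the 5000-19999 range so large UIDs
# (e.g. on macOS) still yield a valid port.
uid=$(id -u)
export ECF_PORT=$(( 5000 + uid % 15000 ))
echo "$ECF_PORT"
```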

5 - Set up R2D2 (for macOS and AWS Single Nodes)

If you are running skylab locally on macOS or an AWS single-node instance, you will also have to set up R2D2. Skip this step if you are on any other supported platform. As with the previous step, it is recommended to complete these steps inside the python virtual environment that was activated above.

Clone the r2d2-data Repo

As with the other repositories, clone this inside your $JEDI_WORKFLOW directory.

cd $JEDI_WORKFLOW
git clone https://github.com/jcsda-internal/r2d2-data

Create a local copy of the R2D2 data store:

mkdir $HOME/r2d2-experiments-localhost
cp -R $JEDI_WORKFLOW/r2d2-data/r2d2-experiments-tutorial/* $HOME/r2d2-experiments-localhost

Install, Start, and Configure the MySQL Server

Running R2D2 on macOS and AWS single nodes requires that MySQL is installed, started, and configured properly. For new site configurations, see the spack-stack instructions for the prerequisites for macOS, Ubuntu, and Red Hat. Note that if you are reading these instructions, you have likely already set up the spack-stack environment.

You should have installed MySQL when setting up the spack-stack environment. To check, run brew list in the terminal and look for mysql in the output.

Follow the directions for setting up the MySQL server found in the R2D2 tutorial, starting at the Prerequisites for MacOS and AWS Single Nodes Only section. (If the link doesn't work, the directions can be found in the TUTORIAL.md file in the r2d2 repository.)

Note: The command used to set up the local database should be run from the $JEDI_WORKFLOW/r2d2 directory, and the r2d2-experiments-tutorial.sql file is in $JEDI_WORKFLOW/r2d2-data.

6 - Run SkyLab

Now you are ready to start an ecflow server and run an experiment. Make sure you are in your python virtual environment (venv).

First, start the ecflow server. Note that this may already be done by your setup.sh script if you are using the reference script mentioned in the previous sections from jedi-tools.

ecflow_start.sh -p $ECF_PORT

Note: On Discover, users need to set ECF_PORT manually:

export ECF_PORT=2500
ecflow_start.sh -p $ECF_PORT

Please note the “Host” and “Port Number” printed here. Also note that each user must use a unique port number (we recommend a random number between 2500 and 9999).

To view the ecflow GUI:

ecflow_ui &

When opening the ecflow GUI for the first time, you will need to add your server to the GUI. In the GUI, click on “Servers” and then “Manage servers”. A new window will appear; click on “Add server”. Here you need to add the Name, Host, and Port of your server. For “Host” and “Port”, please refer to the last section of output from the previous step.

To stop the ecflow server:

ecflow_stop.sh -p $ECF_PORT

To start your ewok experiment:

create_experiment.py $JEDI_WORKFLOW/skylab/experiments/your-experiment.yaml

Note for MacOS Users:

If attempting to start the ecflow server on macOS gives you an error message like this:

Failed to connect to <machineName>:<PortNumber>. After 2 attempts. Is the server running ?

...

restart of server failed

You will need to edit your /etc/hosts file (which requires sudo access). Add the name of your machine to the localhost line. For example, if your local machine is named SATURN, edit /etc/hosts to read:

##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting. Do not change this entry.
##
127.0.0.1     localhost SATURN
255.255.255.255       broadcasthost
::1       localhost

7 - Existing SkyLab experiments

At the moment there are four SkyLab flagship experiments:

  • skylab-aero.yaml

  • skylab-atm-land.yaml

  • skylab-marine.yaml

  • skylab-trace-gas.yaml

To read a more in depth description of the parameters available and the setup for these experiments, please read our page on the SkyLab experiments: Parameters and description.