Installing the Required Python Tools

After you have gained access to the JCSDA AWS resources, the next step is to install and configure the tools you will need to launch one or more compute nodes from the command line.

The first tool you’ll need is the AWS Command Line Interface (CLI). This will allow you to launch either a single compute node or a multi-node cluster from your computer. After you have created a compute instance or a cluster, you can then log into it and proceed to build and run JEDI.

The easiest way to install the AWS CLI is through a package installer. For example, you can use Homebrew on a Mac:

brew install awscli

or the apt installer on a Debian-based Linux OS such as Ubuntu:

sudo apt-get install awscli

Or, since the AWS CLI is a Python package, you can also install it with pip or conda, for example:

pip3 install -U awscli --user

For further details see the AWS documentation.
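
Whichever installer you use, it can be worth confirming that the aws executable is actually on your PATH before moving on. The following is a minimal sketch of such a check using only the Python standard library; the helper name cli_version is hypothetical and is not part of the AWS CLI or of JEDI.

```python
import shutil
import subprocess


def cli_version(executable="aws"):
    """Return the version banner of `executable`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Check both streams in case the banner is written to stderr.
    return (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    version = cli_version()
    if version is None:
        print("aws was not found on PATH; check your installation")
    else:
        print("Found:", version)
```

If the script reports that aws was not found after a pip install with the --user option, the user-level script directory may simply be missing from your PATH.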

The next step is to configure the AWS CLI to use your AWS login credentials. When you were granted access to JCSDA AWS resources, a JEDI master should have given you an AWS secret access key and associated ID in addition to your username and password. Have this secret access key and ID handy before running this command to configure your AWS CLI:

aws configure

Enter your access key ID and then the secret access key at the prompts. When prompted for your default region, enter us-east-1, which is where most of the JEDI AMIs are currently housed. At the other prompts, including the default output format, you can just press Enter to accept the default (None).
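
To double-check the result, you can read back the region that was saved. The sketch below assumes that aws configure wrote the default profile to the usual location, ~/.aws/config, in INI format; if you have set the AWS_CONFIG_FILE environment variable, the file will be elsewhere. The function name read_default_region is hypothetical.

```python
import configparser
from pathlib import Path


def read_default_region(config_path=None):
    """Return the region of the [default] profile, or None if it is not set."""
    # Assumption: the AWS CLI stored its settings in ~/.aws/config.
    path = Path(config_path) if config_path else Path.home() / ".aws" / "config"
    parser = configparser.ConfigParser()
    parser.read(path)  # a missing file is silently skipped
    if not parser.has_section("default"):
        return None
    return parser.get("default", "region", fallback=None)


if __name__ == "__main__":
    region = read_default_region()
    if region == "us-east-1":
        print("Default region is us-east-1")
    else:
        print(f"Default region is {region!r}; expected 'us-east-1'")
```

The script only reads the region and never touches the credentials file, so it is safe to run on a shared machine.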

The single-node launch script described in the next section relies on the Python packages listed below. Of these, os and time are part of the Python standard library and need no installation; click and boto3 must be installed using pip, pip3, or conda:

  • os

  • time

  • click

  • boto3
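
A quick way to see whether these packages are importable from the Python interpreter you plan to use is sketched below. It is only an illustrative check, not part of the launch script itself, and the names REQUIRED and missing_modules are hypothetical.

```python
import importlib.util

# Modules listed above for the single-node launch script.
REQUIRED = ["os", "time", "click", "boto3"]


def missing_modules(names):
    """Return the entries of `names` that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required modules are available")
```

Run it with the same interpreter (for example python3) that you will use for the launch script, since packages installed for one Python installation are not visible to another.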

If you only wish to run JEDI on a single node, you can proceed to the next section.

Alternatively, if you wish to have the capability to run JEDI across multiple AWS nodes, you will also have to install AWS ParallelCluster. ParallelCluster is another Python application that provides a user-friendly interface to AWS CloudFormation. CloudFormation is ultimately responsible for creating and coordinating a cluster of collocated, interconnected compute nodes, which AWS calls EC2 instances.

AWS maintains the most thorough, up-to-date instructions on how to install ParallelCluster, so we recommend that you follow those. In particular, we recommend installing the pcluster tool within a Python virtual environment, as advised by AWS.
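
As a rough illustration of what a virtual environment involves, the sketch below creates one with the standard-library venv module and returns the path to the interpreter inside it. The AWS instructions remain the authoritative procedure; the function name create_env and the directory name used in the note that follows are hypothetical.

```python
import os
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment in `env_dir` and return its Python path."""
    env_path = Path(env_dir)
    venv.create(env_path, with_pip=with_pip)
    # The layout differs by platform: Scripts\ on Windows, bin/ elsewhere.
    if os.name == "nt":
        return env_path / "Scripts" / "python.exe"
    return env_path / "bin" / "python"
```

Calling create_env("pcluster-env") would give you an isolated interpreter; running that interpreter with -m pip install aws-parallelcluster should then install the pcluster tool into the environment without affecting your system Python packages.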