Getting Started Guide

Overview

SlideRule is a C++/Lua framework for on-demand science data processing. It is used to build scalable systems that process datasets and return customized results based on user-provided parameters. Its goal is to enable investigations that require processing science data in ways that were either unintended or inefficient given the original structure of the data.

The University of Washington has partnered with the ICESat-2 program to develop and deploy an instantiation of SlideRule to AWS us-west-2 in order to support science investigations using ICESat-2’s ATL03, ATL06, and ATL08 datasets. The ICESat-2 SlideRule deployment is accessed at icesat2sliderule.org.

For a top-down explanation of the deployment’s architecture and its components, see the Overview.

How Do You Use SlideRule

While it is possible to directly access all of ICESat-2 SlideRule’s services using any client that communicates over HTTP, it is more practical to use the supplied Python client, which hides much of the complexity of interacting with each of the services and provides a high-level Python interface for most use-cases.

For this reason, “using” SlideRule is, in practice, the same as writing a Python script that uses the SlideRule-Python client package. The Python client is used to issue science processing requests to icesat2sliderule.org and then analyze the responses that come back.

These processing requests typically specify a geospatial region of interest (e.g. defined by a GeoJSON file) and tell SlideRule which algorithms to run on the ICESat-2 data collected within that region. When SlideRule receives the request, it reads the appropriate source datasets, executes the requested algorithms on that data, and returns the results to the requesting application.

Setting Up Your System

This walk-through assumes you are comfortable using git and the conda Python packaging system. See the installation instructions in the reference documentation for details on other methods of installation.

Step 1: Clone the Python client repository

$ git clone https://github.com/ICESat2-SlideRule/sliderule-python.git

Step 2: Create a conda environment with all the necessary dependencies

$ cd sliderule-python
$ conda env create -f environment.yml

Step 3: Activate the sliderule conda environment

$ conda activate sliderule

Your First Processing Request

Now that you have an environment set up and ready to use SlideRule, this section walks you through a very simple example that calculates gridded elevations in the Grand Mesa region of Colorado at a 20m along-track resolution.

Step 1: Import the SlideRule Python package for ICESat-2.

>>> from sliderule import icesat2

Step 2: Initialize the icesat2 package.

>>> icesat2.init("icesat2sliderule.org", verbose=True)

In general, it is only necessary to provide the URL to the init function, but for this example we also turn on verbose log messages to get more insight into what is happening. For a full description of the options available when initializing the icesat2 package, see the init documentation.

Step 3: Create a list of coordinates that represent the Grand Mesa region of interest.

>>> grand_mesa = icesat2.toregion('grandmesa.geojson')

The grandmesa.geojson file used in this example can be downloaded by clicking on the link; alternatively, you can create your own GeoJSON file at geojson.io.

The toregion function creates a representation of the geospatial region that is understood by SlideRule. It accepts both GeoJSON files and Shapefiles. For a full description of the function, see the toregion documentation.
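If you would rather generate a region file programmatically than draw one at geojson.io, the following is a minimal sketch using only the Python standard library. The bounding-box corners and the output filename my_region.geojson are hypothetical values chosen for illustration; they are not the contents of grandmesa.geojson.

```python
import json

# Hypothetical bounding box roughly covering Grand Mesa, Colorado.
# These corner values are illustrative assumptions only.
min_lon, max_lon = -108.3, -107.8
min_lat, max_lat = 38.85, 39.15

# GeoJSON positions are [longitude, latitude]. A polygon ring must be closed,
# so the first corner is repeated as the last one.
ring = [
    [min_lon, min_lat],
    [max_lon, min_lat],
    [max_lon, max_lat],
    [min_lon, max_lat],
    [min_lon, min_lat],
]

feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {},
            "geometry": {"type": "Polygon", "coordinates": [ring]},
        }
    ],
}

# Write the region to disk (hypothetical filename)
with open("my_region.geojson", "w") as f:
    json.dump(feature_collection, f, indent=2)
```

The resulting file could then be passed to toregion in place of grandmesa.geojson, assuming toregion accepts a FeatureCollection like the ones geojson.io produces.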

Step 4: Create a dictionary of processing parameters specifying how the elevations for the region should be calculated.

>>> parms = {
    "poly": grand_mesa[0],
    "srt": icesat2.SRT_LAND,
    "cnf": icesat2.CNF_SURFACE_HIGH,
    "len": 40.0,
    "res": 20.0,
    "maxi": 1
}

For a full description of the different processing parameters that are accepted by SlideRule, see parameters. The parameters of interest here are len which specifies the total along-track length of the segment used to calculate an elevation, and res which specifies the along-track posting interval of the calculation.
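To see how len and res interact, the sketch below lays out the first few segments for the values used above. It assumes that segment i simply starts at i * res and spans len meters; the exact placement SlideRule uses may differ, and the helper segment_extents is hypothetical, written only for this illustration.

```python
def segment_extents(length, resolution, count):
    """Return (start, end) along-track distances in meters for `count` segments.

    Assumption: segment i starts at i * resolution and spans `length` meters.
    """
    return [(i * resolution, i * resolution + length) for i in range(count)]

# Values taken from the parms dictionary in this step
extents = segment_extents(length=40.0, resolution=20.0, count=3)
# extents == [(0.0, 40.0), (20.0, 60.0), (40.0, 80.0)]

# Distance shared by two consecutive segments (zero if res >= len)
overlap = max(0.0, 40.0 - 20.0)
```

Under this assumption, a 40m segment posted every 20m means each segment shares half of its length with the next one.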

Also take note that the first element of the grand_mesa list is passed to the poly parameter. This is because the toregion function always returns a list of closed polygons representing the region of interest. It is the user’s responsibility to sequence through those polygons when making requests to SlideRule. In our case here, the Grand Mesa GeoJSON file contained only one polygon and so we can safely just use that polygon in our request.
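For regions that contain more than one polygon, one way to sequence through them is to build a separate parameter dictionary per polygon. The sketch below is an illustration under assumptions: the helper parms_per_polygon is hypothetical, and the two polygons are made-up stand-ins for the list that toregion returns (each polygon being a list of "lon"/"lat" dictionaries, as used in the plotting step of this guide).

```python
def parms_per_polygon(region, base_parms):
    """Yield one parameter dictionary per polygon in `region`."""
    for polygon in region:
        parms = dict(base_parms)  # shallow copy so base_parms is left unchanged
        parms["poly"] = polygon
        yield parms

# Made-up stand-in for a toregion result containing two closed polygons
region = [
    [{"lon": -108.3, "lat": 38.9}, {"lon": -108.1, "lat": 38.9},
     {"lon": -108.1, "lat": 39.1}, {"lon": -108.3, "lat": 38.9}],
    [{"lon": -108.0, "lat": 38.9}, {"lon": -107.8, "lat": 38.9},
     {"lon": -107.8, "lat": 39.1}, {"lon": -108.0, "lat": 38.9}],
]

base = {"len": 40.0, "res": 20.0}
requests = list(parms_per_polygon(region, base))

# Each entry could then be sent as its own request, for example:
#   results = [icesat2.atl06p(p) for p in requests]
```

Copying the base dictionary for each polygon keeps the requests independent, so a later change to one does not affect the others.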

Step 5: Issue the processing request to SlideRule.

>>> rsps = icesat2.atl06p(parms)

When you hit enter, you should see a scrolling list of log messages saying “atl06 processing initiated on…”. These messages are normal and expected (and displayed only because of the verbose setting used when we initialized the icesat2 package).

Some of the log messages may report that a resource returned no elements. There are many valid reasons for this, but most often it is because the resource was identified by NASA’s CMR system as crossing the region of interest, yet did not actually intersect it when SlideRule processed the resource. This happens because the CMR system adds an off-pointing margin to all ground tracks when calculating intersections and therefore overestimates which resources cross any given region.

When the request completes (~30 seconds), the rsps variable should contain all of the elevations calculated by SlideRule.

Step 6: Analyze the results using Pandas.

>>> rsps.describe()
                 lat  n_fit_photons            lon      dh_fit_dx  ...         h_mean     rms_misfit            rgt    segment_id
count  100277.000000  100277.000000  100277.000000  100277.000000  ...  100277.000000  100240.000000  100277.000000  100277.00000
mean       39.028478     132.139414    -108.030490      -0.001975  ...    2707.639564       2.512702     841.096513  500659.18788
std         0.082476     131.598809       0.121329       0.306945  ...     440.133777       3.124903     384.650923  284372.50888
min        38.827507       0.000000    -108.315698     -26.344217  ...    1396.383336       0.042654     211.000000  215376.00000
25%        38.969076      37.000000    -108.112334      -0.096951  ...    2371.550240       0.569138     737.000000  216597.00000
50%        39.034390      87.000000    -108.043394      -0.001515  ...    2846.206073       1.387819    1156.000000  217400.00000
75%        39.097182     166.000000    -107.936355       0.086064  ...    3068.137673       3.363937    1156.000000  785282.00000
max        39.194233    1767.000000    -107.735253      24.634544  ...    3737.048479     164.336963    1179.000000  786420.00000

[8 rows x 13 columns]

For a full description of all of the fields returned from the atl06p function, see the elevations documentation.
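In the summary above, rms_misfit has a lower count (100240) than the other columns (100277), which suggests that some rows have no misfit value. A minimal sketch of one way to filter and summarize such results with Pandas follows. The sample frame is a tiny synthetic stand-in with made-up values (it is not SlideRule output), and the photon threshold of 10 is an arbitrary choice for illustration.

```python
import pandas as pd

# Tiny synthetic stand-in for rsps; all values are made up for illustration.
sample = pd.DataFrame({
    "rgt": [211, 211, 737, 1156],
    "h_mean": [2700.0, 2710.0, 3000.0, 1500.0],
    "n_fit_photons": [120, 5, 80, 200],
    "rms_misfit": [0.5, float("nan"), 1.2, 3.0],
})

# Drop rows without a misfit value, then keep rows with at least
# 10 photons in the fit (arbitrary threshold).
good = sample.dropna(subset=["rms_misfit"])
good = good[good["n_fit_photons"] >= 10]

# Mean elevation for each value of the rgt column
mean_by_rgt = good.groupby("rgt")["h_mean"].mean()
```

The same calls could be applied to rsps directly, assuming it contains the column names shown in the describe output.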

Step 7: Plot the gridded elevations returned by SlideRule using measurements collected by ICESat-2 in the Grand Mesa region.

>>> import cartopy
>>> import matplotlib.pyplot as plt
>>> fig = plt.figure(num=None, figsize=(24, 12))
>>> border_lon = [e["lon"] for e in grand_mesa[0]]
>>> border_lat = [e["lat"] for e in grand_mesa[0]]
>>> ax = plt.subplot(111,projection=cartopy.crs.PlateCarree())
>>> ax.scatter(rsps["lon"].values, rsps["lat"].values, s=2.5, c=rsps["h_mean"], cmap='winter_r', zorder=3, transform=cartopy.crs.PlateCarree())
>>> ax.plot(border_lon, border_lat, linewidth=1.5, color='r', zorder=2, transform=cartopy.crs.Geodetic())
>>> plt.show()

The resulting plot should look something like the following.

[Figure: grand mesa]

Next Steps

Once you’ve completed this walk-through and are comfortable issuing processing requests to SlideRule, you should take a look at the Documentation and the example Jupyter Notebooks.