SlideRule Python API

The SlideRule Python API sliderule.py is used to access the services provided by the base SlideRule server. From Python, the module can be imported via:

import sliderule

If you want to begin using the sliderule.py API right away, feel free to skip down to the Functions section to read about what each function does. The sections that immediately follow provide an overview of the SlideRule service.

Service Architecture

A typical SlideRule deployment includes three components:

  1. Access to the NASA Common Metadata Repository (CMR) system

  2. A service discovery server

  3. A set of processing nodes

When a client makes a processing request to SlideRule, all three components are typically involved, as follows:

  1. The client first makes a spatial and temporal query to the CMR system to retrieve a set of resources (i.e. H5 files) that correspond to the region and time period of interest.

  2. The client then makes a request to the service discovery server to retrieve a list of IP addresses of the available SlideRule processing nodes.

  3. Finally, the client creates a thread pool of workers and fans out the processing of each resource over the available SlideRule processing nodes.

Of course, these steps do not have to be taken in this order, nor is it a problem to use just one component of the system without the others. But these components are designed to complement each other and provide all of the services needed to perform large processing requests.
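The three steps above can be sketched as follows. This is only an illustration of the fan-out pattern, not the client's actual implementation: query_cmr, discover_nodes, process, and fan_out are hypothetical stand-ins, and the resource names and addresses are made up.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle

def query_cmr(region, time_range):
    """Hypothetical stand-in for step 1: return resource names from CMR."""
    return ["granule_a.h5", "granule_b.h5", "granule_c.h5"]

def discover_nodes():
    """Hypothetical stand-in for step 2: return processing node addresses."""
    return ["10.0.0.1", "10.0.0.2"]

def process(node, resource):
    """Hypothetical stand-in for a processing request sent to one node."""
    return {"node": node, "resource": resource}

def fan_out(resources, nodes, max_workers=4):
    """Step 3: spread the resources over the nodes using a thread pool."""
    assignments = list(zip(cycle(nodes), resources))  # round-robin pairing
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as its inputs
        return list(pool.map(lambda pair: process(*pair), assignments))

resources = query_cmr(region=None, time_range=None)
results = fan_out(resources, discover_nodes())
```

Round-robin assignment is just one simple choice here; a real client may balance the work differently.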

De-serialization

There are two types of SlideRule services, distinguished by the type of response they provide: normal services and stream services.

Normal

Normal services are accessed via the GET HTTP method and return a discrete block of ASCII text, typically formatted as JSON.

These services can easily be accessed via curl or other HTTP-based tools, and contain self-describing data. When using the SlideRule Python client, they are accessed via the sliderule.source(..., stream=False) call.

Stream

Stream services are accessed via the POST HTTP method and return a serialized stream of binary records containing the results of the processing request.

These services are more difficult to work with using third-party tools, since the returned data must be parsed and is not self-describing. When using the SlideRule Python client, they are accessed via the sliderule.source(..., stream=True) call. Inside this call, the SlideRule Python client takes care of any additional service calls needed to parse the results and returns a self-describing Python data structure (i.e. the elements of the data structure are named in such a way as to indicate the type and content of the returned data).

If you want to process streamed results outside of the SlideRule Python client, then a brief description of the format of the data follows. For additional guidance, the hidden __parse function inside the sliderule.py source code provides the code which performs the stream processing for the SlideRule Python client.

Each response record is formatted as: <record length><record type><record data> where,

record length

32-bit little endian integer providing the total length of the record that follows (type and data)

record type

null-terminated ASCII string containing the name of the record type

record data

binary contents of data
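A minimal sketch of splitting a response stream into records, based only on the layout described above. It assumes the length field is unsigned; split_records and make_record are hypothetical helper names, and the record type names in the demo are made up.

```python
import struct

def split_records(stream: bytes):
    """Split a byte stream into (record type, record data) pairs."""
    records = []
    pos = 0
    while pos + 4 <= len(stream):
        # 32-bit little-endian length of everything after the length field
        (length,) = struct.unpack_from("<I", stream, pos)
        pos += 4
        body = stream[pos:pos + length]
        if len(body) < length:
            raise ValueError("truncated record")
        # The type name runs up to the first null byte; the rest is data
        rectype, _, data = body.partition(b"\x00")
        records.append((rectype.decode("ascii"), data))
        pos += length
    return records

def make_record(rectype: str, data: bytes) -> bytes:
    """Build one record in the same layout (used here only for the demo)."""
    body = rectype.encode("ascii") + b"\x00" + data
    return struct.pack("<I", len(body)) + body

stream = make_record("demorec", b"\x01\x02") + make_record("otherrec", b"")
records = split_records(stream)
```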

In order to know how to process the contents of the record data, the user must perform an additional query to the SlideRule definition service, providing the record type. The definition service returns a JSON response object that provides a format definition of the record type that can be used by the client to decode the binary record data. The format of the definition response object is:

{
    "__datasize": # minimum size of record
    "field_1":
    {
        "type": # data type (see sliderule.basictypes for full definition), or record type if a nested structure
        "elements": # number of elements, 1 if not an array
        "offset": # starting bit offset into record data
        "flags": # processing flags - LE: little endian, BE: big endian, PTR: pointer
    },
    ...
    "field_n":
    {
        ...
    }
}
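As a rough sketch of how a definition in this format could drive decoding, the function below unpacks the primitive fields of one record. Several things are assumptions made for illustration: the type names and their struct format characters (the real table is sliderule.basictypes), the use of the substrings "BE" and "PTR" in the flags value, and the field names in the demo definition. Nested record types and pointer fields are skipped.

```python
import struct

# Assumed mapping from type names to struct format characters; the real
# table lives in sliderule.basictypes and may differ.
ASSUMED_FORMATS = {"UINT8": "B", "UINT16": "H", "INT32": "i", "DOUBLE": "d"}

def decode_record(definition: dict, data: bytes) -> dict:
    """Decode the primitive fields of one record using its definition."""
    if len(data) < definition["__datasize"]:
        raise ValueError("record data shorter than __datasize")
    decoded = {}
    for name, field in definition.items():
        if name.startswith("__"):
            continue  # skip metadata entries such as __datasize
        flags = field.get("flags", "")
        fmt = ASSUMED_FORMATS.get(field["type"])
        if fmt is None or "PTR" in flags:
            continue  # nested records and pointers are out of scope here
        order = ">" if "BE" in flags else "<"
        offset = field["offset"] // 8  # definition offsets are in bits
        count = field["elements"]
        values = struct.unpack_from(f"{order}{count}{fmt}", data, offset)
        decoded[name] = values[0] if count == 1 else list(values)
    return decoded

# Hypothetical definition and matching 8-byte record
definition = {
    "__datasize": 8,
    "cycle": {"type": "UINT16", "elements": 1, "offset": 0, "flags": "LE"},
    "counts": {"type": "UINT16", "elements": 2, "offset": 16, "flags": "LE"},
    "flag": {"type": "UINT8", "elements": 1, "offset": 48, "flags": "LE"},
}
data = struct.pack("<HHHBx", 3, 10, 20, 1)
record = decode_record(definition, data)
```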

Functions

source


sliderule.source(api, parm={}, stream=False, callbacks={'eventrec': __logeventrec})

Perform API call to SlideRule service

Parameters
  • api (str) – name of the SlideRule endpoint

  • parm (dict) – dictionary of request parameters

  • stream (bool) – whether the request is a normal service or a stream service (see De-serialization for more details)

  • callbacks (dict) – record type callbacks (advanced use)

Returns

response data

Example:

>>> import sliderule
>>> sliderule.set_url("icesat2sliderule.org")
>>> rqst = {
...     "time": "NOW",
...     "input": "NOW",
...     "output": "GPS"
... }
>>> rsps = sliderule.source("time", rqst)
>>> print(rsps)
{'time': 1300556199523.0, 'format': 'GPS'}

set_url


sliderule.set_url(urls):

Configure sliderule package with URL of service

Parameters

urls (str) – IP address or hostname of the SlideRule service. As a special case, the url can be provided as a list of strings instead of a single string; the client then hardcodes the set of servers used to process requests to exactly the set provided. This is used for testing and for local installations and can be ignored by most users.

Example:

>>> import sliderule
>>> sliderule.set_url("service.my-sliderule-server.org")

update_available_servers


sliderule.update_available_servers():

Causes the SlideRule Python client to refresh the list of available processing nodes. This is useful when performing large processing requests where there is time for auto-scaling to change the number of nodes running.

This function does nothing if the client has been initialized with a hardcoded list of servers.

Returns

the number of available processing nodes

Example:

>>> import sliderule
>>> sliderule.update_available_servers()

set_verbose


sliderule.set_verbose(enable):

Configure sliderule package for verbose logging

Parameters

enable (bool) – whether or not user level log messages received from SlideRule generate a Python log message

Example:

>>> import sliderule
>>> sliderule.set_verbose(True)

The default behavior of Python log messages is for them to be displayed to standard output. If you want more control over how the log messages are displayed, create and configure a Python log handler as shown below:

# import packages
import logging
from sliderule import sliderule

# Configure Logging
sliderule_logger = logging.getLogger("sliderule.sliderule")
sliderule_logger.setLevel(logging.INFO)

# Create Console Output
ch = logging.StreamHandler()
ch.setLevel(logging.INFO)
sliderule_logger.addHandler(ch)

set_max_errors


sliderule.set_max_errors(max_errors):

Configure the sliderule package’s maximum number of errors per node. When a request to a processing node results in an error, the client retries the request on a different processing node (if available), keeps the original processing node in the list of available nodes, and increments the number of errors associated with it. Once a processing node accumulates max_errors errors, it is removed from the list of available nodes and will not be used in future processing requests.

A call to update_available_servers or set_url is needed to restore a removed node to the list of available servers.
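The error-counting behavior described above can be modeled with a small toy class. NodePool is a hypothetical name used only for this illustration; it is not how the client is implemented internally.

```python
class NodePool:
    """Toy model of the per-node error counting described above."""

    def __init__(self, nodes, max_errors):
        self.max_errors = max_errors
        self.errors = {node: 0 for node in nodes}  # available nodes only

    def available(self):
        """Return the nodes that can still receive requests."""
        return list(self.errors)

    def report_error(self, node):
        """Count one failed request; drop the node once it hits the limit."""
        if node not in self.errors:
            return
        self.errors[node] += 1
        if self.errors[node] >= self.max_errors:
            del self.errors[node]

pool = NodePool(["node-a", "node-b"], max_errors=3)
for _ in range(2):
    pool.report_error("node-a")
still_listed = pool.available()   # node-a has 2 errors, so it stays
pool.report_error("node-a")
after_limit = pool.available()    # the third error removes node-a
```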

Parameters

max_errors (int) – sets the maximum number of errors per node

Example:

>>> import sliderule
>>> sliderule.set_max_errors(3)

set_rqst_timeout


sliderule.set_rqst_timeout(timeout):

Sets the TCP/IP connection and read timeouts for future requests made to SlideRule servers. Setting it lower means the client will fail over more quickly, but may generate false positives if a processing request stalls or takes a long time to return data. Setting it higher means the client will wait longer before designating a request as failed, which in the presence of a persistent failure means it will take longer for the client to remove the node from its list of available servers.

Parameters

timeout (tuple) – (<connection timeout in seconds>, <read timeout in seconds>)

Example:

>>> import sliderule
>>> sliderule.set_rqst_timeout((10, 60))

gps2utc


sliderule.gps2utc(gps_time, as_str=True, epoch=gps_epoch):

Convert a GPS based time returned from SlideRule into a UTC time.

Parameters
  • gps_time (int) – number of seconds since GPS epoch (January 6, 1980)

  • as_str (bool) – if True, returns the time as a string; if False, returns the time as a datetime object

  • epoch (datetime) – the epoch used in the conversion, defaults to GPS epoch (Jan 6, 1980)

Returns

UTC time (i.e. GMT, or Zulu time)

Example:

>>> import sliderule
>>> sliderule.gps2utc(1235331234)
'2019-02-27 19:34:03'

get_definition


sliderule.get_definition(rectype, fieldname):

Get the underlying format specification of a field in a return record.

Parameters
  • rectype (str) – the name of the type of the record (e.g. “atl03rec”)

  • fieldname (str) – the name of the record field (e.g. “cycle”)

Returns

dictionary describing field; entry in the sliderule.basictypes variable

Example:

>>> import sliderule
>>> sliderule.set_url("icesat2sliderule.org")
>>> sliderule.get_definition("atl03rec", "cycle")
{'fmt': 'H', 'size': 2, 'nptype': <class 'numpy.uint16'>}

Endpoints

definition


GET /source/definition <request payload>

Gets the record definition of a record type; used to parse binary record data

Request Payload (application/json)

parameter   default    description
rectype     required   the name of the record type

HTTP Example

GET /source/definition HTTP/1.1
Host: my-sliderule-server:9081
Content-Length: 23


{"rectype": "atl03rec"}
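The Content-Length header in the request above is the byte length of the JSON body. A small sketch of assembling such a request by hand is shown here; build_request is a hypothetical helper, not part of the sliderule package, and it only formats the request bytes without sending them.

```python
import json

def build_request(method: str, endpoint: str, payload: dict, host: str) -> bytes:
    """Format a raw HTTP/1.1 request whose body is the JSON payload."""
    body = json.dumps(payload).encode("utf-8")
    head = (
        f"{method} /source/{endpoint} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body

request = build_request("GET", "definition", {"rectype": "atl03rec"},
                        "my-sliderule-server:9081")
```

With the default json.dumps separators, the body is 23 bytes long, which matches the header in the HTTP example.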

Python Example

# Request Record Definition
rsps = sliderule.source("definition", {"rectype": "atl03rec"}, stream=False)

Response Payload (application/json)

JSON object defining the on-the-wire binary format of the record data contained in the specified record type.

See De-serialization for a description of how to use the record definitions.

event


POST /source/event <request payload>

Returns event messages (logs, traces, and metrics) in real time as they occur while the request is active

Request Payload (application/json)

parameter   default    description
type        “LOG”      type of event message to monitor: “LOG”, “TRACE”, “METRIC”
level       “INFO”     minimum event level to monitor: “DEBUG”, “INFO”, “WARNING”, “ERROR”, “CRITICAL”
format      optional   the format of the event message: “FMT_TEXT”, “FMT_JSON”; empty for binary record representation
duration    0          seconds to hold connection open

HTTP Example

POST /source/event HTTP/1.1
Host: my-sliderule-server:9081
Content-Length: 48

{"type": "LOG", "level": "INFO", "duration": 30}

Python Example

# Build Logging Request
rqst = {
    "type": "LOG",
    "level" : "INFO",
    "duration": 30
}

# Retrieve logs
rsps = sliderule.source("event", rqst, stream=True)

Response Payload (application/octet-stream)

Serialized stream of event records of the type eventrec. See De-serialization for a description of how to process binary response records.

geo


GET /source/geo <request payload>

Perform geospatial operations on spherical and polar coordinates

Request Payload (application/json)

parameter   default    description
asset       required   data source (see Assets)
pole        “north”    polar orientation of indexing operations: “north”, “south”
lat         optional   spherical latitude coordinate to project onto a polar coordinate system, -90.0 to 90.0
lon         optional   spherical longitude coordinate to project onto a polar coordinate system, -180.0 to 180.0
x           optional   polar x coordinate to project onto a spherical coordinate system
y           optional   polar y coordinate to project onto a spherical coordinate system
span        optional   a box defined by a lower left latitude/longitude pair, and an upper right latitude/longitude pair
span1       optional   a span used for intersection with span2
span2       optional   a span used for intersection with span1

span definition

parameter   default    description
lat0        required   smallest latitude (starting at -90.0)
lon0        required   smallest longitude (starting at -180.0)
lat1        required   largest latitude (ending at 90.0)
lon1        required   largest longitude (ending at 180.0)
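To make the span fields concrete, here is a sketch of what an intersection test and a combine operation on two spans could look like. The server's actual algorithms are not described in this document, so this is only one plausible reading: simple axis-aligned boxes, with no handling of spans that wrap across the ±180 longitude line. The function names and sample values are hypothetical.

```python
def spans_intersect(a: dict, b: dict) -> bool:
    """True if two spans overlap (boxes that only touch count as overlapping)."""
    return (a["lat0"] <= b["lat1"] and b["lat0"] <= a["lat1"] and
            a["lon0"] <= b["lon1"] and b["lon0"] <= a["lon1"])

def combine_spans(a: dict, b: dict) -> dict:
    """Smallest span that contains both input spans."""
    return {
        "lat0": min(a["lat0"], b["lat0"]),
        "lon0": min(a["lon0"], b["lon0"]),
        "lat1": max(a["lat1"], b["lat1"]),
        "lon1": max(a["lon1"], b["lon1"]),
    }

span1 = {"lat0": 10.0, "lon0": 20.0, "lat1": 30.0, "lon1": 40.0}
span2 = {"lat0": 25.0, "lon0": 35.0, "lat1": 50.0, "lon1": 60.0}
span3 = {"lat0": -20.0, "lon0": -60.0, "lat1": -10.0, "lon1": -50.0}

overlapping = spans_intersect(span1, span2)
disjoint = spans_intersect(span1, span3)
combined = combine_spans(span1, span2)
```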

HTTP Example

GET /source/geo HTTP/1.1
Host: my-sliderule-server:9081
Content-Length: 115


{"asset": "atlas-local", "pole": "north", "lat": 30.0, "lon": 100.0, "x": -0.20051164424058, "y": -1.1371580426033}

Python Example

rqst = {
    "asset": "atlas-local",
    "pole": "north",
    "lat": 30.0,
    "lon": 100.0,
    "x": -0.20051164424058,
    "y": -1.1371580426033,
}

rsps = sliderule.source("geo", rqst)

Response Payload (application/json)

JSON object with elements populated by the inferred operations being requested

parameter   default    description
intersect   optional   true if span1 and span2 intersect, false otherwise
combine     optional   the combined span of span1 and span2
split       optional   the split of span
lat         optional   spherical latitude coordinate projected from the polar coordinate system, -90.0 to 90.0
lon         optional   spherical longitude coordinate projected from the polar coordinate system, -180.0 to 180.0
x           optional   polar x coordinate projected from the spherical coordinate system
y           optional   polar y coordinate projected from the spherical coordinate system

HTTP Example

HTTP/1.1 200 OK
Server: sliderule/0.5.0
Content-Type: text/plain
Content-Length: 76


{"y":1.1371580426033,"x":-0.20051164424058,"lat":29.999999999998,"lon":-100}

h5


POST /source/h5 <request payload>

Reads a dataset from an HDF5 file and returns the values of the dataset in a list.

See icesat2.h5 function for a convenient method for accessing HDF5 datasets.

Request Payload (application/json)

parameter   default         description
asset       required        data source asset (see Assets)
resource    required        HDF5 filename
dataset     required        full path to dataset variable
datatype    “DYNAMIC”       the type of data the returned dataset values should be in
col         0               the column to read from the dataset for a multi-dimensional dataset
startrow    0               the first row to start reading from in a multi-dimensional dataset
numrows     -1 (all rows)   the number of rows to read when reading from a multi-dimensional dataset
id          0               value to echo back in the records being returned

HTTP Example

POST /source/h5 HTTP/1.1
Host: my-sliderule-server:9081
Content-Length: 189


{"asset": "atlas-local", "resource": "ATL03_20181019065445_03150111_003_01.h5", "dataset": "/gt1r/geolocation/segment_ph_cnt", "datatype": 2, "col": 0, "startrow": 0, "numrows": 5, "id": 0}

Python Example

>>> import sliderule
>>> sliderule.set_url("icesat2sliderule.org")
>>> asset = "nsidc-s3"
>>> resource = "ATL03_20181019065445_03150111_003_01.h5"
>>> dataset = "/gt1r/geolocation/segment_ph_cnt"
>>> rqst = {
...     "asset": asset,
...     "resource": resource,
...     "dataset": dataset,
...     "datatype": sliderule.datatypes["INTEGER"],
...     "col": 0,
...     "startrow": 0,
...     "numrows": 5,
...     "id": 0
... }
>>> rsps = sliderule.source("h5", rqst, stream=True)
>>> print(rsps)
[{'__rectype': 'h5dataset', 'datatype': 2, 'data': (245, 0, 0, 0, 7, 1, 0, 0, 17, 1, 0, 0, 1, 1, 0, 0, 4, 1, 0, 0), 'size': 20, 'offset': 0, 'id': 0}]

Response Payload (application/octet-stream)

Serialized stream of H5 dataset records of the type h5dataset. See De-serialization for a description of how to process binary response records.
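The data element in the Python example output above is a tuple of raw bytes. As a sketch, assuming the five requested rows are stored as little-endian 32-bit integers (20 bytes for 5 values), the bytes can be turned into numbers with the standard struct module. unpack_int32 is a hypothetical helper name.

```python
import struct

# Record copied from the Python example output above
record = {
    "__rectype": "h5dataset",
    "datatype": 2,
    "data": (245, 0, 0, 0, 7, 1, 0, 0, 17, 1, 0, 0, 1, 1, 0, 0, 4, 1, 0, 0),
    "size": 20,
    "offset": 0,
    "id": 0,
}

def unpack_int32(record: dict) -> list:
    """Interpret the raw bytes as little-endian signed 32-bit integers."""
    raw = bytes(record["data"][:record["size"]])
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}i", raw[:count * 4]))

values = unpack_int32(record)
print(values)  # [245, 263, 273, 257, 260]
```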

h5p


POST /source/h5p <request payload>

Reads a list of datasets from an HDF5 file and returns the values of the datasets in a dictionary of lists.

See icesat2.h5p function for a convenient method for accessing HDF5 datasets.

Request Payload (application/json)

parameter   default    description
asset       required   data source asset (see Assets)
resource    required   HDF5 filename
datasets    required   list of datasets (see h5 for a list of parameters for each dataset)

Python Example

>>> import sliderule
>>> sliderule.set_url("icesat2sliderule.org")
>>> asset = "nsidc-s3"
>>> resource = "ATL03_20181019065445_03150111_003_01.h5"
>>> dataset = "/gt1r/geolocation/segment_ph_cnt"
>>> datasets = [ {"dataset": dataset, "col": 0, "startrow": 0, "numrows": 5} ]
>>> rqst = {
...     "asset": asset,
...     "resource": resource,
...     "datasets": datasets,
... }
>>> rsps = sliderule.source("h5p", rqst, stream=True)
>>> print(rsps)
[{'__rectype': 'h5file', 'dataset': '/gt1r/geolocation/segment_ph_cnt', 'elements': 5, 'size': 20, 'datatype': 2, 'data': (245, 0, 0, 0, 7, 1, 0, 0, 17, 1, 0, 0, 1, 1, 0, 0, 4, 1, 0, 0)}]

Response Payload (application/octet-stream)

Serialized stream of H5 file data records of the type h5file. See De-serialization for a description of how to process binary response records.

health

GET /source/health

Provides status on the health of the node.

Response Payload (application/json)

JSON object containing a true|false indicator of the health of the node.

{
    "healthy": true|false
}

index


GET /source/index <request payload>

Return list of resources (i.e. H5 files) that match the query criteria.

Since the way resources are indexed (e.g. which metadata to use) depends heavily on the actual resources available, this endpoint is not necessarily useful in and of itself. It is expected that data-specific indexes will be built per SlideRule deployment, and that higher-level routines will be constructed that take advantage of this endpoint and provide a more meaningful interface.

Request Payload (application/json)

{
    "or"|"and":
    {
        "<index name>": { <index parameters>... }
        ...
    }
}

parameter          default    description
index name         required   name of server-side index to use (deployment specific)
index parameters   required   an index span represented in the format native to the index selected

Response Payload (application/json)

JSON object containing a list of the resources available to the SlideRule deployment that match the query criteria.

{
    "resources": ["<resource name>", ...]
}

metric


GET /source/metric <request payload>

Return a list of metric values associated with a provided system attribute.

Each SlideRule server node maintains internal metrics on a variety of things. Each metric is associated with an attribute that identifies a set of metrics.

When querying metrics, you provide the metric attribute, and the server will respond with the set of metrics associated with that attribute.

Request Payload (application/json)

{
  "attr": <metric attribute>
}

parameter          default    description
metric attribute   required   name of the attribute that is being queried

Response Payload (application/json)

JSON object containing a set of the metric names and values.

{
    "<metric name>": <metric value>,
    ...
}

tail


GET /source/tail <request payload>

Retrieve the most recent log messages generated by the server.

The number of log messages saved by the server is configured at startup. This endpoint will return up to the maximum number of log messages that are saved.

Request Payload (application/json)

{
  "monitor": "<monitor name>"
}

parameter   default    description
monitor     required   name of the monitor to tail, should almost always be “EventMonitor”

Response Payload (application/json)

JSON array containing the log messages.

[
    "<log message 1>",
    "<log message 2>",
    ...
    "<log message N>"
]

time


GET /source/time <request payload>

Converts times from one format to another

Request Payload (application/json)

parameter   default    description
time        required   time value
input       required   format of above time value: “NOW”, “CDS”, “GMT”, “GPS”
output      required   desired format of return value: same as above

SlideRule supports the following time specifications:

NOW

If supplied for either input or time, the current time is used

CDS

CCSDS 6-byte packet timestamp represented as [<day>, <ms>]

day = 2 bytes of days since GPS epoch

ms = 4 bytes of milliseconds in the current day

GMT

UTC time represented as one of two date strings

“<year>:<month>:<day of month>:<hour in day>:<minute in hour>:<second in minute>”

“<year>:<day of year>:<hour in day>:<minute in hour>:<second in minute>”

GPS

seconds since GPS epoch “January 6, 1980”
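The relationship between these formats can be sketched in plain Python. This is an illustration under stated assumptions, not the server's conversion code: it ignores leap seconds, the helper names are hypothetical, and the zero-padding of the GMT string fields is a guess.

```python
from datetime import datetime, timedelta

GPS_EPOCH = datetime(1980, 1, 6)

def cds_to_gps_seconds(days: int, ms: int) -> float:
    """Convert a CDS [<day>, <ms>] pair to seconds since the GPS epoch."""
    return days * 86400 + ms / 1000.0

def gmt_string(moment: datetime, day_of_year: bool = False) -> str:
    """Format a datetime as one of the two GMT strings listed above."""
    fmt = "%Y:%j:%H:%M:%S" if day_of_year else "%Y:%m:%d:%H:%M:%S"
    return moment.strftime(fmt)

seconds = cds_to_gps_seconds(days=1, ms=1500)
moment = GPS_EPOCH + timedelta(seconds=seconds)  # no leap-second correction
calendar_form = gmt_string(moment)
day_of_year_form = gmt_string(moment, day_of_year=True)
```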

HTTP Example

GET /source/time HTTP/1.1
Host: my-sliderule-server:9081
Content-Length: 48


{"time": "NOW", "input": "NOW", "output": "GPS"}

Python Example

rqst = {
    "time": "NOW",
    "input": "NOW",
    "output": "GPS"
}

rsps = sliderule.source("time", rqst)

Response Payload (application/json)

JSON object describing the results of the time conversion

{
    "time":     <time value>,
    "format":   "<format of time value>"
}

version


GET /source/version

Get the version information of the server.

Response Payload (application/json)

JSON object containing the version information.

{
    "server": {
        "packages": [
            "<package 1>",
            "<package 2>",
            ...
            "<package n>"
        ],
        "version": "<version string>",
        "launch": "<date of launch>",
        "commit": "<commit id of code>",
        "duration": <seconds since launch>
    },
    "<package 1>": {
        "version": "<version string>",
        "commit": "<commit id of code>"
    },
    "<package 2>": {
        "version": "<version string>",
        "commit": "<commit id of code>"
    },
    ...
    "<package n>": {
        "version": "<version string>",
        "commit": "<commit id of code>"
    }
}
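As a small usage sketch, the per-package versions can be collected from a response of this shape by looking up each name listed under "server". The package_versions helper and every value in the sample response (package names, version strings, commit ids, launch date) are hypothetical.

```python
def package_versions(response: dict) -> dict:
    """Map each package listed under "server" to its version string."""
    packages = response["server"]["packages"]
    return {name: response[name]["version"]
            for name in packages if name in response}

# Hypothetical response with made-up package names and values
response = {
    "server": {
        "packages": ["core", "demo"],
        "version": "1.0.0",
        "launch": "2021-01-01T00:00:00Z",
        "commit": "abc1234",
        "duration": 120,
    },
    "core": {"version": "1.0.0", "commit": "abc1234"},
    "demo": {"version": "0.9.0", "commit": "def5678"},
}
versions = package_versions(response)
```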