Creating global parameters in AWS Step Functions state machines

When working with AWS Step Functions, it's sometimes useful to have values that you set when starting a state machine execution and that are available to all states, but that are not part of the JSON data passed between states.

Some common use cases I've encountered:

  • Specifying an input parameter that is only relevant for some Task states in the state machine, such as a table or bucket name, an ARN, or a URL.

  • Controlling the state machine's flow through a global flag processed by Choice states. For example, enabling and disabling branches, or controlling whether an intermediate output is persisted.

You can create such a global value with the context object, which gives every state definition access to the input of the state machine execution. While straightforward to implement, it's not obvious from the documentation.

So here's a complete example for creating a global parameter in an AWS Step Functions state machine.

Let's say we have a state machine that processes a batch of data, calculates some metrics, and finally writes the results to a database table:

[Figure: Graph of the example state machine. The graph starts with a sequence of two Pass states that act as placeholders for a "Load Data" and a "Compute Metrics" state. This is followed by a Choice state that selects between two branches. The default branch contains just a Pass state, the other an invocation of the "Write Metrics" Lambda function.]

(You can find the full example state machine definition in this Gist.)

The input to the state machine looks as follows:

{
  "items": ["id1", "id2", "id3"],
  "writeMetrics": true,
  "tableName": "name of the table to write to"
}

In our example, the first two states exclusively deal with the data items. The writeMetrics flag and tableName only become relevant when writing the computed metrics to the database.

Passing writeMetrics and tableName through several (potentially nested) processing steps is not only cumbersome but also leaks implementation details of the state machine into the intermediate states.
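To make that cost concrete, here is a rough sketch of what forwarding the two fields could look like without a global parameter. It assumes "Compute Metrics" is a Lambda invocation (in the example it is only a Pass placeholder), the "computed" key is a name I made up, and the function name is left elided. ResultPath merges the task result into the state's input instead of replacing it, so writeMetrics and tableName survive:

```json
"Compute Metrics": {
  "Type": "Task",
  "Resource": "arn:aws:states:::lambda:invoke",
  "Parameters": {
    "FunctionName": "...",
    "Payload": {
      "items.$": "$.items"
    }
  },
  "ResultPath": "$.computed",
  "Next": "Choice"
}
```

Every state that produces output would need a similar ResultPath, and downstream states would have to read the Lambda result from $.computed.Payload.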

Instead, we can access the boolean flag and table name through the Execution.Input key of the context object. Paths that start with $$ refer to the context object rather than to the state's input.
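For orientation, this is roughly what the context object contains while our example runs. The structure follows the AWS documentation. The ARNs, names, and timestamps are made-up placeholders, and I've left out the StateMachine and Task keys:

```json
{
  "Execution": {
    "Id": "arn:aws:states:us-east-1:123456789012:execution:MetricsExample:run-1",
    "Input": {
      "items": ["id1", "id2", "id3"],
      "writeMetrics": true,
      "tableName": "name of the table to write to"
    },
    "Name": "run-1",
    "RoleArn": "arn:aws:iam::123456789012:role/ExampleRole",
    "StartTime": "2024-01-01T12:00:00Z"
  },
  "State": {
    "EnteredTime": "2024-01-01T12:00:01Z",
    "Name": "Choice"
  }
}
```

Execution.Input stays the same for the whole execution, regardless of what the individual states do to their own input and output.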

A Choice state can use a context object path as the Variable of a choice rule:

"Choice": {
  "Type": "Choice",
  "Choices": [
    {
      "Variable": "$$.Execution.Input.writeMetrics",
      "BooleanEquals": true,
      "Next": "Write Metrics"
    }
  ],
  "Default": "Pass"
}

In a Task state definition, we can refer to the context object in the Parameters block. Keys that end in .$ take their value from a path, and that path may point into the context object:

"Write Metrics": {
  "Type": "Task",
  "Resource": "arn:aws:states:::lambda:invoke",
  "OutputPath": "$.Payload",
  "Parameters": {
    "FunctionName": "...",
    "Payload": {
      "_table.$": "$$.Execution.Input.tableName",
      "metrics.$": "$.metrics"
    }
  },
  "Retry": [],
  "End": true
}
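
With the example input from above, the Lambda function would receive a payload along these lines. The shape of metrics depends on what "Compute Metrics" produces, so itemCount is a made-up field for illustration:

```json
{
  "_table": "name of the table to write to",
  "metrics": {
    "itemCount": 3
  }
}
```

The .$ suffix is dropped from the key names once the paths are resolved, so the function sees plain _table and metrics keys.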