Redrive AWS Step Functions executions
This page explains how to redrive executions directly from Datadog, so that a failed AWS Step Functions execution resumes from the point of failure instead of restarting the state machine from the beginning.
To enable redrive within Datadog, configure an AWS Connection with Datadog App Builder. Ensure that your IAM roles include the permission to execute a Step Function for the retry action (StartExecution) or to redrive a Step Function for the redrive action (RedriveExecution).
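As an illustration of the permissions described above, the following sketch builds an IAM policy document that grants both actions for a single state machine. The function name, the example ARN, and the Sid values are hypothetical; the sketch also assumes that RedriveExecution is scoped to execution ARNs while StartExecution is scoped to the state machine ARN, so check the resulting policy against your own account setup before attaching it to a role.

```python
import json


def build_redrive_policy(state_machine_arn: str) -> dict:
    """Return an IAM policy document allowing retry and redrive for one state machine."""
    # A state machine ARN looks like
    #   arn:aws:states:REGION:ACCOUNT:stateMachine:NAME
    # and its executions look like
    #   arn:aws:states:REGION:ACCOUNT:execution:NAME:EXECUTION_NAME
    prefix, name = state_machine_arn.rsplit(":stateMachine:", 1)
    execution_arn_pattern = f"{prefix}:execution:{name}:*"

    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Needed for the retry action (starts a new execution).
                "Sid": "AllowRetry",
                "Effect": "Allow",
                "Action": "states:StartExecution",
                "Resource": state_machine_arn,
            },
            {
                # Needed for the redrive action (resumes a failed execution).
                "Sid": "AllowRedrive",
                "Effect": "Allow",
                "Action": "states:RedriveExecution",
                "Resource": execution_arn_pattern,
            },
        ],
    }


# Hypothetical ARN, used only to show the shape of the output.
EXAMPLE_ARN = "arn:aws:states:us-east-1:123456789012:stateMachine:example-machine"
print(json.dumps(build_redrive_policy(EXAMPLE_ARN), indent=2))
```

The printed JSON can be pasted into the policy attached to the role used by the AWS Connection.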
To take action on a Step Function in Datadog:
- Go to the Step Functions page.
- Find the Step Function you wish to redrive.
- Open this Step Function’s side panel. On the Executions tab, locate the failed execution you wish to redrive.
- Click the Failed pill to open the redrive modal.
- Click the Redrive button.
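For comparison, the same kind of redrive can be requested directly against AWS. The sketch below is not how Datadog implements the Redrive button; it only illustrates the RedriveExecution action using the boto3 Step Functions client methods list_executions and redrive_execution. The function name and the limit parameter are hypothetical, and the client is passed in as an argument so the function can be exercised without AWS credentials.

```python
from typing import Any, List


def redrive_failed_executions(
    sfn_client: Any, state_machine_arn: str, limit: int = 10
) -> List[str]:
    """Redrive up to `limit` FAILED executions and return their ARNs.

    `sfn_client` is expected to behave like boto3.client("stepfunctions").
    """
    # Fetch one page of failed executions for this state machine.
    response = sfn_client.list_executions(
        stateMachineArn=state_machine_arn,
        statusFilter="FAILED",
        maxResults=limit,
    )

    redriven: List[str] = []
    for execution in response["executions"]:
        # Resumes the execution from the point of failure. AWS rejects
        # executions that are not redrivable, and that error is left to
        # propagate to the caller in this sketch.
        sfn_client.redrive_execution(executionArn=execution["executionArn"])
        redriven.append(execution["executionArn"])
    return redriven
```

To run this against a real account, create the client with boto3.client("stepfunctions") and pass it as sfn_client along with the ARN of your state machine.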
When monitoring redriven executions, use the Waterfall view. The large time gap between the original execution and the redrive can make the Flame Graph view difficult to read.
A redrive may not always share the same sampling decision as the original execution. To ensure that the redriven execution is also sampled, reference the @redrive:true span tag in a retention query.
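As a rough sketch of what such a retention query could look like when a retention filter is defined programmatically, the following builds a request body that uses @redrive:true as the filter query. The function name, filter name, and default rate are hypothetical, and the body layout (type, filter_type, filter.query, rate) is an assumption about the APM retention filters API, so verify the field names against the API reference before sending it.

```python
import json


def build_redrive_retention_filter(
    name: str = "Retain redriven Step Functions executions",
    rate: float = 1.0,
) -> dict:
    """Build a retention filter payload that matches redriven executions.

    The field layout is an assumption and should be checked against the
    APM retention filters API reference.
    """
    return {
        "data": {
            "type": "apm_retention_filter",
            "attributes": {
                "name": name,
                "enabled": True,
                "filter_type": "spans-sampling-processor",
                # The span tag from the text above selects redriven executions.
                "filter": {"query": "@redrive:true"},
                # Fraction of matching spans to keep (1.0 keeps all of them).
                "rate": rate,
            },
        }
    }


print(json.dumps(build_redrive_retention_filter(), indent=2))
```

The same @redrive:true query can be entered directly when creating a retention filter in the Datadog UI.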