Retry steps

You can retry steps that return a specific error code; for example, a particular HTTP status code. The retry syntax allows you to do the following:

  • Define the maximum number of retry attempts
  • Define a backoff model to increase the likelihood of success

Workflows has default retry policies available for both idempotent and non-idempotent steps. If you use a default retry policy, you don't need to specify a predicate or define the retry configuration values. If the existing policies don't work for your use case, you can customize a default retry policy or create a custom retry policy.

Note that retrying a step counts as an additional step execution for pricing purposes. For more information, see Pricing.

Use a try/retry structure

A retry policy consists of a predicate, which defines which errors are retried, and a set of retry configuration values.

YAML

  - step_name:
      try:
          steps:
              ...
      retry: RETRY_POLICY
          predicate: RETRY_PREDICATE
          max_retries: NUMBER_OF_RETRIES
          backoff:
              initial_delay: DELAY_SECONDS
              max_delay: MAX_DELAY_SECONDS
              multiplier: DELAY_MULTIPLIER

JSON

  [
    {
      "step_name": {
        "try": {
          "steps": [
            ...
          ]
        },
        "retry": "RETRY_POLICY"
          "predicate": "RETRY_PREDICATE",
          "max_retries": NUMBER_OF_RETRIES,
          "backoff": {
            "initial_delay": DELAY_SECONDS,
            "max_delay": MAX_DELAY_SECONDS,
            "multiplier": DELAY_MULTIPLIER
          }
        }
      }
  ]

Replace the following:

  • RETRY_POLICY: optional. Lets you specify a default retry policy to use. Options include:

    • ${http.default_retry}
    • ${http.default_retry_non_idempotent}

    If omitted, all other fields are required. If specified, omit all other fields in the retry block.

  • RETRY_PREDICATE: If you don't specify a default retry policy, a retry predicate is required. Defines which error codes will be retried. Options include:

    • ${http.default_retry_predicate}
    • ${http.default_retry_predicate_non_idempotent}
    • A custom predicate defined as a subworkflow that accepts a single argument (a map that holds the exception definition) and returns true if the step should be retried; otherwise, false.

  • If you don't specify a default retry policy, the following retry configuration values are required:

    • NUMBER_OF_RETRIES: Maximum number of times a step will be retried, not counting the initial step execution attempt.
    • The backoff block controls how retries occur and has the following parameters:

      • DELAY_SECONDS: delay in seconds between the initial failure and the first retry.
      • MAX_DELAY_SECONDS: maximum delay in seconds between retries.
      • DELAY_MULTIPLIER: multiplier applied to the previous delay to calculate the delay for the subsequent retry.

Note that the steps block is optional. It can contain the following:

  • assign
  • call
  • for
  • parallel
  • raise
  • return
  • steps
  • switch
  • try

For example, given the following retry configuration values:

YAML

  max_retries: 8
  backoff:
      initial_delay: 1
      max_delay: 60
      multiplier: 2

JSON

  {
    "max_retries": 8,
    "backoff": {
      "initial_delay": 1,
      "max_delay": 60,
      "multiplier": 2
    }
  }

The step will be retried a total of eight times. The initial delay is 1 second, and the delay is doubled on each attempt, with a maximum delay of 60 seconds. In this case, the delays between subsequent attempts are: 1, 2, 4, 8, 16, 32, 60, and 60 (time given in seconds). After eight retry attempts, the step is considered failed and an exception is raised. Counting the initial execution, the step was executed nine times.
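The arithmetic in this example can be checked with a short Python model. This is only a sketch of the behavior described above, not Workflows code; the function name `backoff_delays` is hypothetical, and it assumes the delay is multiplied after each retry and then capped at the maximum delay.

```python
def backoff_delays(max_retries, initial_delay, max_delay, multiplier):
    """Return the wait, in seconds, before each retry attempt."""
    delays = []
    delay = initial_delay
    for _ in range(max_retries):
        # Each wait is capped at max_delay.
        delays.append(min(delay, max_delay))
        # The next wait grows by the multiplier.
        delay *= multiplier
    return delays


delays = backoff_delays(max_retries=8, initial_delay=1, max_delay=60, multiplier=2)
print(delays)           # [1, 2, 4, 8, 16, 32, 60, 60]
print(sum(delays))      # 183 seconds spent waiting if every retry fails
print(1 + len(delays))  # 9 executions, counting the initial attempt
```

Under these assumptions, a step that never succeeds spends about three minutes waiting between attempts before the exception is raised.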

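You can configure the retry block in one of three ways: use a default retry policy, use a default retry predicate with custom retry configuration, or create a custom retry policy. The following sections describe each option.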

Use a default retry policy

There are two default retry policies available: one for idempotent steps, and one for non-idempotent steps.

The default retry policies are composed of a default retry predicate and a set of default retry configuration values.

Default retry policy for idempotent steps

Note: Use this retry policy only for idempotent steps (steps that can be safely repeated).

The default retry policy for idempotent steps has the following configuration:

  • predicate: ${http.default_retry_predicate}. Retries on HTTP status codes 429, 502, 503, and 504, and on a connection error, connection failed error, or timeout error
  • max_retries: 5
  • initial_delay: 1.0
  • max_delay: 60
  • multiplier: 1.25

For example, to use the default retry policy for an idempotent step:

YAML

  - idempotent_step:
      try:
          call: http.get
          args:
              url: https://host.com/api
      retry: ${http.default_retry}

JSON

  [
    {
      "idempotent_step": {
        "try": {
          "call": "http.get",
          "args": {
            "url": "https://host.com/api"
          }
        },
        "retry": "${http.default_retry}"
      }
    }
  ]

Default retry policy for non-idempotent steps

Note: Use this retry policy only for non-idempotent steps (steps that can't be safely repeated).

The default retry policy for non-idempotent steps has the following configuration:

  • predicate: ${http.default_retry_predicate_non_idempotent}. Retries on HTTP status codes 429 and 503, and on a connection failed error
  • max_retries: 5
  • initial_delay: 1.0
  • max_delay: 60
  • multiplier: 1.25
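
The two default policies share the same configuration values and differ only in which failures they retry. The following Python sketch models that difference using the status codes and error names listed in this document. The `would_retry` helper and the plain-language error labels are hypothetical and only illustrate the comparison; they are not Workflows fields or functions.

```python
# Retry rules copied from the two default policy descriptions in this document.
DEFAULT_POLICIES = {
    "http.default_retry": {
        "codes": {429, 502, 503, 504},
        "errors": {"connection error", "connection failed error", "timeout error"},
    },
    "http.default_retry_non_idempotent": {
        "codes": {429, 503},
        "errors": {"connection failed error"},
    },
}


def would_retry(policy, code=None, error=None):
    """Return True if the named default policy would retry this failure.

    `code` is an HTTP status code. `error` is one of the plain-language
    error names used above (a hypothetical label for this sketch).
    """
    rules = DEFAULT_POLICIES[policy]
    return code in rules["codes"] or error in rules["errors"]


# Compare how each policy treats the same failure.
for failure in ({"code": 429}, {"code": 502}, {"error": "timeout error"}):
    decisions = {name: would_retry(name, **failure) for name in DEFAULT_POLICIES}
    print(failure, decisions)
```

In this model, a 502 response or a timeout is retried only by the idempotent policy. A plausible reading is that those failures can occur after the server has already processed the request, so repeating a non-idempotent call could apply the same change twice.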

For example, to use the default retry policy for a non-idempotent step:

YAML

  - non_idempotent_step:
      try:
          call: http.get
          args:
              url: https://host.com/api
      retry: ${http.default_retry_non_idempotent}

JSON

  [
    {
      "non_idempotent_step": {
        "try": {
          "call": "http.get",
          "args": {
            "url": "https://host.com/api"
          }
        },
        "retry": "${http.default_retry_non_idempotent}"
      }
    }
  ]

Use a default retry predicate with custom retry configuration

Select the appropriate predicate for the type of step (idempotent or non-idempotent), then define the retry configuration values. For example, the following retry policy uses the default predicate for non-idempotent steps and defines custom configuration values:

YAML

  - step_name:
      try:
          steps:
              ...
      retry:
          predicate: ${http.default_retry_predicate_non_idempotent}
          max_retries: 10
          backoff:
              initial_delay: 1
              max_delay: 90
              multiplier: 3

JSON

  [
    {
      "step_name": {
        "try": {
          "steps":
          ...
        },
        "retry": {
          "predicate": "${http.default_retry_predicate_non_idempotent}",
          "max_retries": 10,
          "backoff": {
            "initial_delay": 1,
            "max_delay": 90,
            "multiplier": 3
          }
        }
      }
    }
  ]

Create a custom retry policy

To create a custom retry policy, use a subworkflow to define your predicate. Then define your retry configuration values, including your predicate, in the retry block in the main workflow.

The predicate is checked against any errors that are raised, and determines if the error triggers a retry. To trigger a retry, return true; otherwise, return false. If an error isn't retried, or the retries are exhausted, the error is propagated and can cause the execution to fail. You can handle this by using an except block.

YAML

  main:
      - step_name:
          try:
              steps:
                  ...
          retry:
              predicate: ${retry_predicate}
              max_retries: number_of_retries
              backoff:
                  initial_delay: delay_seconds
                  max_delay: max_delay_seconds
                  multiplier: delay_multiplier

  retry_predicate:
      params: [e]
      steps:
          ...

JSON

  {
    "main": [
      {
        "step_name": {
          "try": {
            "steps":
            ...
          },
          "retry": {
            "predicate": "${retry_predicate}",
            "max_retries": "number_of_retries",
            "backoff": {
              "initial_delay": "delay_seconds",
              "max_delay": "max_delay_seconds",
              "multiplier": "delay_multiplier"
            }
          }
        }
      }
    ],
    "retry_predicate": {
      "params": [
        "e"
      ],
      "steps":
      ...
    }
  }

Note that when using a custom predicate, errors are only raised for the following HTTP status codes:

  • Client error responses (400-499)
  • Server error responses (500-599)

To trigger the predicate for other HTTP status codes, you must explicitly raise the exception. For an example, see Retry steps using a custom retry policy for other HTTP status codes.
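
The way a custom predicate interacts with the retry configuration can be modeled in a few lines of Python. This is a sketch of the semantics described in this section, not the Workflows implementation: `WorkflowError`, `run_with_retry`, and `make_step` are hypothetical names, and the error map is represented as a plain dictionary.

```python
import time


class WorkflowError(Exception):
    """Hypothetical stand-in for an error raised with a Workflows error map."""

    def __init__(self, error_map):
        super().__init__(error_map.get("message", ""))
        self.error_map = error_map


def run_with_retry(step, predicate, max_retries, initial_delay, max_delay,
                   multiplier, sleep=time.sleep):
    """Run `step`; on error, retry while `predicate` returns True."""
    retries = 0
    while True:
        try:
            return step()
        except WorkflowError as err:
            # Not retried, or retries exhausted: the error propagates.
            if retries >= max_retries or not predicate(err.error_map):
                raise
            sleep(min(initial_delay * multiplier ** retries, max_delay))
            retries += 1


def custom_predicate(e):
    """Retry only HTTP 500 errors."""
    return e.get("code") == 500


def make_step(codes):
    """Build a step that fails once per code in `codes`, then succeeds."""
    remaining = list(codes)

    def step():
        if remaining:
            raise WorkflowError({"code": remaining.pop(0)})
        return "OK"

    return step


waits = []  # Record the waits instead of actually sleeping.
result = run_with_retry(make_step([500, 500]), custom_predicate,
                        max_retries=5, initial_delay=2, max_delay=60,
                        multiplier=2, sleep=waits.append)
print(result, waits)  # OK [2, 4]
```

In this model, a step that fails with a 404 propagates the error on the first attempt because the predicate returns false, and a step that keeps failing with a 500 propagates the error after the fifth retry. A response such as 202 never reaches the predicate unless the step raises it explicitly.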

Samples

These samples demonstrate the syntax.

Retry steps using a default retry policy

Workflows comes with built-in retry policies. This sample uses a built-in retry policy for HTTP requests.

YAML

- read_item:
    try:
      call: http.get
      args:
        url: https://example.com/someapi
      result: apiResponse
    retry: ${http.default_retry}

JSON

[
  {
    "read_item": {
      "try": {
        "call": "http.get",
        "args": {
          "url": "https://example.com/someapi"
        },
        "result": "apiResponse"
      },
      "retry": "${http.default_retry}"
    }
  }
]

Retry steps using a custom retry policy

This sample implements a custom retry policy that only retries HTTP requests that return HTTP status code 500.

YAML

main:
  steps:
    - read_item:
        try:
          call: http.get
          args:
            url: https://host.com/api
          result: api_response
        retry:
          predicate: ${custom_predicate}
          max_retries: 5
          backoff:
            initial_delay: 2
            max_delay: 60
            multiplier: 2
    - last_step:
        return: "OK"

custom_predicate:
  params: [e]
  steps:
    - what_to_repeat:
        switch:
          - condition: ${e.code == 500}
            return: true
    - otherwise:
        return: false

JSON

{
  "main": {
    "steps": [
      {
        "read_item": {
          "try": {
            "call": "http.get",
            "args": {
              "url": "https://host.com/api"
            },
            "result": "api_response"
          },
          "retry": {
            "predicate": "${custom_predicate}",
            "max_retries": 5,
            "backoff": {
              "initial_delay": 2,
              "max_delay": 60,
              "multiplier": 2
            }
          }
        }
      },
      {
        "last_step": {
          "return": "OK"
        }
      }
    ]
  },
  "custom_predicate": {
    "params": [
      "e"
    ],
    "steps": [
      {
        "what_to_repeat": {
          "switch": [
            {
              "condition": "${e.code == 500}",
              "return": true
            }
          ]
        }
      },
      {
        "otherwise": {
          "return": false
        }
      }
    ]
  }
}

Retry steps using a custom retry policy for other HTTP status codes

This sample implements a custom retry policy that retries HTTP requests that return an HTTP status code 202.

YAML

main:
  steps:
    - read_item:
        try:
          steps:
            - callStep:
                call: http.get
                args:
                  url: https://host.com/api
                result: api_response
            - checkNotOK:
                switch:
                  - condition: ${api_response.code == 202}
                    raise: ${api_response}
        retry:
          predicate: ${custom_predicate}
          max_retries: 5
          backoff:
            initial_delay: 2
            max_delay: 60
            multiplier: 2

custom_predicate:
  params: [e]
  steps:
    - what_to_repeat:
        switch:
          - condition: ${e.code == 202}
            return: true
    - otherwise:
        return: false

JSON

{
  "main": {
    "steps": [
      {
        "read_item": {
          "try": {
            "steps": [
              {
                "callStep": {
                  "call": "http.get",
                  "args": {
                    "url": "https://host.com/api"
                  },
                  "result": "api_response"
                }
              },
              {
                "checkNotOK": {
                  "switch": [
                    {
                      "condition": "${api_response.code == 202}",
                      "raise": "${api_response}"
                    }
                  ]
                }
              }
            ]
          },
          "retry": {
            "predicate": "${custom_predicate}",
            "max_retries": 5,
            "backoff": {
              "initial_delay": 2,
              "max_delay": 60,
              "multiplier": 2
            }
          }
        }
      }
    ]
  },
  "custom_predicate": {
    "params": [
      "e"
    ],
    "steps": [
      {
        "what_to_repeat": {
          "switch": [
            {
              "condition": "${e.code == 202}",
              "return": true
            }
          ]
        }
      },
      {
        "otherwise": {
          "return": false
        }
      }
    ]
  }
}

Retry steps with custom configuration

This sample makes an HTTP request using a custom retry configuration. It uses the default retry predicate for idempotent steps to determine when to retry, together with a custom maximum number of retries and custom backoff parameters.

YAML

- read_item:
    try:
      call: http.get
      args:
        url: https://example.com/someapi
      result: api_response
    retry:
      predicate: ${http.default_retry_predicate}
      max_retries: 5
      backoff:
        initial_delay: 2
        max_delay: 60
        multiplier: 2

JSON

[
  {
    "read_item": {
      "try": {
        "call": "http.get",
        "args": {
          "url": "https://example.com/someapi"
        },
        "result": "api_response"
      },
      "retry": {
        "predicate": "${http.default_retry_predicate}",
        "max_retries": 5,
        "backoff": {
          "initial_delay": 2,
          "max_delay": 60,
          "multiplier": 2
        }
      }
    }
  }
]

Error handling with custom predicate

This sample defines a custom error handler, including a custom predicate and custom backoff parameters. A predicate is defined as a subworkflow that accepts a single argument, a map that holds the exception definition. The predicate returns true if the step should be retried; otherwise, false.

YAML

# Define a custom error handler, custom predicate, and custom backoff parameters
# The `my_own_predicate` subworkflow accepts a map that holds the exception
# definition; it returns true if the step should be retried; otherwise, false
# Expected outcome: the execution fails and returns an HTTP 404 Not Found error
main:
  steps:
    - read_item:
        try:
          call: http.get
          args:
            url: https://example.com/someapi
          result: api_response
        retry:
          predicate: ${my_own_predicate}
          max_retries: 5
          backoff:
            initial_delay: 2
            max_delay: 60
            multiplier: 2
    - last_step:
        return: "OK"

my_own_predicate:
  params: [e]
  steps:
    - log_error_tags:
        call: sys.log
        args:
          data: ${e.tags}
          severity: "INFO"
    - log_error_message:
        call: sys.log
        args:
          data: ${e.message}
          severity: "INFO"
    - log_error_code:
        call: sys.log
        args:
          data: ${e.code}
          severity: "INFO"
    - what_to_repeat:
        switch:
          - condition: ${e.code == 202}
            return: true
    - otherwise:
        return: false

JSON

{
  "main": {
    "steps": [
      {
        "read_item": {
          "try": {
            "call": "http.get",
            "args": {
              "url": "https://example.com/someapi"
            },
            "result": "api_response"
          },
          "retry": {
            "predicate": "${my_own_predicate}",
            "max_retries": 5,
            "backoff": {
              "initial_delay": 2,
              "max_delay": 60,
              "multiplier": 2
            }
          }
        }
      },
      {
        "last_step": {
          "return": "OK"
        }
      }
    ]
  },
  "my_own_predicate": {
    "params": [
      "e"
    ],
    "steps": [
      {
        "log_error_tags": {
          "call": "sys.log",
          "args": {
            "data": "${e.tags}",
            "severity": "INFO"
          }
        }
      },
      {
        "log_error_message": {
          "call": "sys.log",
          "args": {
            "data": "${e.message}",
            "severity": "INFO"
          }
        }
      },
      {
        "log_error_code": {
          "call": "sys.log",
          "args": {
            "data": "${e.code}",
            "severity": "INFO"
          }
        }
      },
      {
        "what_to_repeat": {
          "switch": [
            {
              "condition": "${e.code == 202}",
              "return": true
            }
          ]
        }
      },
      {
        "otherwise": {
          "return": false
        }
      }
    ]
  }
}

When you run the preceding workflow, the execution fails and returns an HTTP 404 Not Found error. Because the predicate returns true only for status code 202, the 404 error isn't retried. To show what the predicate receives, it uses the standard library sys.log function to write elements from the error map to the log. The log output should be similar to the following:

{
  "textPayload": "[\"HttpError\"]",
  ...
  "severity": "INFO",
  ...
}
{
  "textPayload": "HTTP server responded with error code 404",
  ...
  "severity": "INFO",
  ...
}
{
  "textPayload": "404",
  ...
  "severity": "INFO",
  ...
}
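
To see why these three entries appear and why no retry follows, the predicate can be mirrored in Python and applied to an error map built from the logged values. This is an illustrative sketch: the `log` callback is a hypothetical stand-in for sys.log, and the dictionary holds only the three fields shown in the log output, whereas the error map that Workflows passes may contain additional keys.

```python
def my_own_predicate(e, log):
    """Mirror the sample predicate: log three fields, retry only on 202."""
    for key in ("tags", "message", "code"):
        log(e[key])
    return e["code"] == 202


# Error map assembled from the log output shown above.
error_map = {
    "tags": ["HttpError"],
    "message": "HTTP server responded with error code 404",
    "code": 404,
}

logged = []
should_retry = my_own_predicate(error_map, logged.append)
print(logged)        # The three values written to the log, in order.
print(should_retry)  # False, so the 404 error propagates without a retry.
```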

What's next