Using an unsupported SDK

Submitting jobs from an SDK version past its supported date results in interruptions and reduced throughput for long-running batch or streaming jobs. To mitigate potential issues, follow the guidance in the sections below.

Starting jobs with unsupported SDK versions

When you submit a Dataflow job from an SDK version past its supported date, you receive an error message directing you either to upgrade your SDK version or to use a temporary token when you submit your job.

The token value in the error message contains the datetime when the token expires as well as the token itself. The expiration date is set for two weeks in the future.

If you want to use the token, resubmit your job with the unsupported_sdk_temporary_override_token experiment flag, replacing TOKEN with the token value from the error message:

--experiments=unsupported_sdk_temporary_override_token=TOKEN
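When jobs are launched from a script, the flag above can be appended to the pipeline arguments programmatically. The following is a minimal sketch under stated assumptions: the helper name with_override_token and the way arguments are held in a list are hypothetical, and only the experiment flag name comes from this page.

```python
# Hypothetical helper: the function name and the list-of-arguments layout are
# illustrative. Only the experiment flag name is taken from the documentation.
OVERRIDE_EXPERIMENT = "unsupported_sdk_temporary_override_token"


def with_override_token(pipeline_args, token):
    """Return a copy of pipeline_args with the temporary override experiment added."""
    if not token:
        raise ValueError("token must be a non-empty string")
    prefix = f"--experiments={OVERRIDE_EXPERIMENT}="
    # Drop any earlier override token so that only the new one is submitted.
    kept = [arg for arg in pipeline_args if not arg.startswith(prefix)]
    return kept + [prefix + token]


if __name__ == "__main__":
    # REGION and TOKEN are placeholders, as in the flag shown above.
    args = ["--runner=DataflowRunner", "--region=REGION"]
    print(with_override_token(args, "TOKEN"))
```

Filtering out an earlier token first means a retry after expiry does not submit both the stale and the fresh token.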

If you submit a job using the token after it expires, you receive a different error message informing you that the token has expired. You can either upgrade to a supported SDK version or resubmit your job without the token to get a new token.

Tokens should not be used as a permanent way to keep using an unsupported SDK. At some point after an SDK version becomes unsupported, all tokens are revoked and all jobs that use that SDK version are rejected. For more details on the support status of each SDK, see SDK version support status.

Resuming disrupted jobs

Long-running Dataflow jobs that use unsupported SDKs, such as streaming jobs, will also be disrupted and have their throughput significantly reduced. These disrupted jobs can be identified through the following error message in the job logs:

The workflow was automatically disrupted by the service because it uses an unsupported SDK Apache Beam SDK for Python 2.3.0. Please upgrade to the latest SDK version. To resume the disrupted job temporarily, please use gcloud alpha dataflow jobs resume-unsupported-sdk --token=TOKEN --region=REGION JOB_ID. Note that the resumed job by this token will be disrupted again on 2020-08-28T11:21:58-07:00. For a list of supported SDK versions, see: https://cloud.google.com/dataflow/support#support-status-for-dataflow-sdk-releases.

As the error suggests, the disruption can be mitigated using the resume-unsupported-sdk command. Resuming a disrupted job is a temporary solution. Instead, you should upgrade your SDK version to avoid further deprecation actions in the future.
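For monitoring scripts, it can help to extract the next disruption time from the log message and to assemble the resume command from its parts. The sketch below assumes the message wording matches the sample shown on this page, which may differ in practice. The function names are hypothetical, and the command is only built as an argument list, not executed.

```python
import re
import shlex
from datetime import datetime

# Pattern derived only from the sample log message on this page (assumption:
# the real message keeps the phrase "disrupted again on <timestamp>.").
_DISRUPTED_AGAIN = re.compile(r"disrupted again on (\S+?)\.(?:\s|$)")


def parse_redisruption_time(message):
    """Return the datetime at which a resumed job is disrupted again, or None."""
    match = _DISRUPTED_AGAIN.search(message)
    if match is None:
        return None
    # The sample timestamp is ISO 8601 with a UTC offset, e.g. 2020-08-28T11:21:58-07:00.
    return datetime.fromisoformat(match.group(1))


def build_resume_command(job_id, region, token):
    """Build the resume command quoted in the log message as an argument list."""
    return [
        "gcloud", "alpha", "dataflow", "jobs", "resume-unsupported-sdk",
        f"--token={token}",
        f"--region={region}",
        job_id,
    ]


if __name__ == "__main__":
    sample = (
        "Note that the resumed job by this token will be disrupted again on "
        "2020-08-28T11:21:58-07:00. For a list of supported SDK versions, see: ..."
    )
    print(parse_redisruption_time(sample))
    # JOB_ID, REGION, and TOKEN are placeholders, as in the log message.
    print(shlex.join(build_resume_command("JOB_ID", "REGION", "TOKEN")))
```

Keeping the command as a list avoids shell-quoting problems if it is later passed to a process runner.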