Submitting jobs from an SDK version past its supported date results in interruptions and reduced throughput for long-running batch or streaming jobs. To mitigate these issues, do one of the following:
- Upgrade the SDK version. (Preferred)
- Resubmit the job using a temporary token.
- Resume the disrupted job.
Starting jobs with unsupported SDK versions
When you submit a Dataflow job from an SDK version past its supported date, you receive an error message that directs you either to upgrade your SDK version or to use a temporary token when you submit your job.
The token value in the error message contains the datetime when the token expires as well as the token itself. The expiration date is set for two weeks in the future.
If you want to use the token, resubmit your job using the unsupported_sdk_temporary_override_token experiment flag and the token value:
--experiments=unsupported_sdk_temporary_override_token=TOKEN
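For jobs launched from a Python script, the flag can be appended to the pipeline arguments programmatically. The following is a minimal sketch, not an official API: the helper name with_override_token and the example values (--runner=DataflowRunner, --region=us-central1, the literal TOKEN) are assumptions chosen for illustration. Only the experiment name comes from the text above.

```python
from typing import List

# Experiment name taken from the documentation above.
OVERRIDE_EXPERIMENT = "unsupported_sdk_temporary_override_token"


def with_override_token(pipeline_args: List[str], token: str) -> List[str]:
    """Return a copy of pipeline_args with the temporary-token experiment appended.

    Hypothetical helper for illustration; it does not contact Dataflow.
    """
    if not token:
        raise ValueError("token must be a non-empty string")
    prefix = f"--experiments={OVERRIDE_EXPERIMENT}="
    # Drop any earlier override so an expired token is not sent with the new one.
    kept = [arg for arg in pipeline_args if not arg.startswith(prefix)]
    return kept + [prefix + token]


# Placeholder arguments; replace TOKEN with the value from the error message.
args = with_override_token(
    ["--runner=DataflowRunner", "--region=us-central1"],
    token="TOKEN",
)
print(" ".join(args))
```

The resulting list can then be handed to whatever the script uses to parse pipeline options. Removing an earlier override first is a design choice that keeps a stale token from lingering in stored launch arguments.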
If you submit a job using the token after it expires, you receive a different error message informing you that the token has expired. You can either upgrade to a supported SDK version or resubmit your job without the token to get a new token.
Tokens should not be used as a permanent way to keep using an unsupported SDK. At some point after the unsupported date of an SDK, all tokens will be revoked and all jobs using that unsupported SDK version will be rejected. For more details on the support status of each SDK, see SDK version support status.
Resuming disrupted jobs
Long-running Dataflow jobs that use unsupported SDKs, such as streaming jobs, will also be disrupted and have their throughput significantly reduced. These disrupted jobs can be identified through the following error message in the job logs:
The workflow was automatically disrupted by the service because it uses an unsupported SDK Apache Beam SDK for Python 2.3.0. Please upgrade to the latest SDK version. To resume the disrupted job temporarily, please use gcloud alpha dataflow jobs resume-unsupported-sdk --token=TOKEN --region=REGION JOB_ID. Note that the resumed job by this token will be disrupted again on 2020-08-28T11:21:58-07:00. For a list of supported SDK versions, see: https://cloud.google.com/dataflow/support#support-status-for-dataflow-sdk-releases.
As the error message suggests, you can mitigate the disruption by using the resume-unsupported-sdk command. However, resuming a disrupted job is only a temporary solution, because the resumed job is disrupted again at the time stated in the message. To avoid further deprecation actions, upgrade your SDK version.
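If you monitor job logs with a script, it can help to extract the next disruption time from the message and to assemble the resume command from its parts. The sketch below assumes that the log text matches the sample message shown above; the real wording may differ. The function names next_disruption and resume_command are hypothetical, and the code only builds the argument list without running gcloud.

```python
import re
from datetime import datetime, timezone
from typing import List, Optional

# Assumed pattern: an ISO 8601 timestamp with a numeric UTC offset that
# follows "disrupted again on", as in the sample message.
_DEADLINE = re.compile(
    r"disrupted again on (\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2})"
)


def next_disruption(log_message: str) -> Optional[datetime]:
    """Return when a resumed job will be disrupted again, or None if not stated."""
    match = _DEADLINE.search(log_message)
    if match is None:
        return None
    return datetime.fromisoformat(match.group(1))


def resume_command(token: str, region: str, job_id: str) -> List[str]:
    """Build the argument list for the resume command quoted in the message."""
    return [
        "gcloud", "alpha", "dataflow", "jobs", "resume-unsupported-sdk",
        f"--token={token}", f"--region={region}", job_id,
    ]


sample = (
    "Note that the resumed job by this token will be disrupted again on "
    "2020-08-28T11:21:58-07:00."
)
deadline = next_disruption(sample)
print(deadline.astimezone(timezone.utc).isoformat())  # 2020-08-28T18:21:58+00:00
print(" ".join(resume_command("TOKEN", "REGION", "JOB_ID")))
```

Knowing the deadline lets you schedule the SDK upgrade before the job is disrupted again, rather than resuming repeatedly.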