BigQuery Engine for Apache Flink Preview will be discontinued on February 28, 2025.

For more information on the deprecation and shutdown, see the deprecation page and release notes.

BigQuery Engine for Apache Flink autoscaling

Autoscaling enables BigQuery Engine for Apache Flink to choose the appropriate number of task slots for your job, adding and removing slots as needed. The autoscaler calculates the required number of task slots for each Flink job, with the goal of keeping up with the input load while using the minimum number of resources necessary.

Parallelism is an estimate of the number of task slots needed by a job to most efficiently process data at any given time. To determine the number of task slots needed for a job, BigQuery Engine for Apache Flink uses an algorithm that analyses the parallelism per task in a job.

The results are constrained by resource limitations per job, per deployment, and per project. You can configure some of the limits, such as quotas. Other limits are based on the project quota limits.

Jobs use autoscaling by default. You can set the minimum and maximum slots for a job, or you can disable autoscaling for a job and use a fixed number of slots.

Benefits

Autoscaling has the following potential benefits.

Allows BigQuery Engine for Apache Flink jobs to process data more efficiently.
Improves parallel processing by adjusting the number of slots available to run tasks in parallel.
Promotes efficient resource usage, which might reduce your costs.

Support and limitations

Supports streaming pipelines.
Scaling up is constrained by your slot quota limit. Quotas are enforced at the deployment level, not the job level.
You can't assign more task slots to your deployment than are available in the Google Cloud project.
The BigQuery Engine for Apache Flink autoscaler works best with Apache Kafka sources. For other I/Os, implement the new Apache Flink source interface and the Apache Flink standardized connector metrics.

How it works

Autotuning dynamically scales the task slot count used by a job up or down depending on resource requirements. Increasing the number of slots allows more tasks to run in parallel. As tasks complete and the slots are no longer needed, the slot count scales down. An algorithm determines how many task slots each job needs.

To determine the appropriate slot count, BigQuery Engine for Apache Flink uses the following metrics:

The average number of milliseconds in a second spent processing data, excluding backpressure
The average number of records coming in per second
The average number of records going out per second
The size of the backlog, that is, the number of pending records
The rate of growth of the backlog
The total number of splits that the source can process in parallel

Configure autoscaling

To configure autoscaling, set the following parameters on your deployments and jobs.

Deployments

When you create a deployment, set the maximum number of slots. This value limits the total number of slots that are available for all jobs in the deployment.

gcloud

To set the maximum number of slots when you create a deployment, use the gcloud alpha managed-flink deployments create gcloud CLI command with the max-slots parameter.

gcloud alpha managed-flink deployments create ... \
  --max-slots=SLOT_NUMBER

Replace SLOT_NUMBER with the number of slots assigned to the deployment. If the value is greater than the number of task slots available in the Google Cloud project, the request is rejected.

Jobs

When you create a new job, autoscaling is enabled by default. Set the minimum and maximum number of slots for the job.

gcloud

When you create a job, use the gcloud alpha managed-flink jobs create gcloud CLI command with the following parameters:

gcloud alpha managed-flink jobs create ... \
--min-parallelism=MINIMUM_SLOTS \
--max-parallelism=MAXIMUM_SLOTS

Replace the following variables:

MINIMUM_SLOTS: the minimum number of task slots available to your job
MAXIMUM_SLOTS: the maximum number of task slots available to your job

If you create a job in an existing deployment, these values cannot exceed the maximum number of slots assigned to the deployment.

Disable autoscaling

You can disable autoscaling when you create a job, and instead assign a fixed number of task slots to the job.

gcloud

To create a job with autoscaling disabled, use the gcloud alpha managed-flink jobs create gcloud CLI command with the following parameters:

gcloud alpha managed-flink jobs create ... \
  --autotuning-mode=fixed \
  --parallelism=SLOTS \

Replace SLOTS with the number of task slots available to your job.

Update autoscaling

gcloud

To update the autoscaling setting for a job by using the gcloud CLI, use the gcloud alpha managed-flink jobs update command.

To change the minimum and maximum number of task slots available to your job:

gcloud alpha managed-flink jobs update JOB_ID \
  --project=PROJECT_ID \
  --location=REGION \
  --min-parallelism=MINIMUM_SLOTS \
  --max-parallelism=MAXIMUM_SLOTS

To disable autoscaling:

gcloud alpha managed-flink jobs update JOB_ID \
  --project=PROJECT_ID \
  --location=REGION \
  --autotuning-mode=fixed \
  --parallelism=SLOTS

Replace the following variables:

JOB_ID: the ID of your BigQuery Engine for Apache Flink job
PROJECT_ID: your BigQuery Engine for Apache Flink project ID
REGION: the region that your BigQuery Engine for Apache Flink job is in
MINIMUM_SLOTS: the minimum number of task slots available to your job
MAXIMUM_SLOTS: the maximum number of task slots available to your job
SLOTS: the number of task slots available to your job

Verify that autoscaling is enabled

console

Use the Google Cloud console to see whether autoscaling is enabled.

In the Google Cloud console, open the job list.

Go to Jobs

To open the Job details page, click the name of your job.
When autoscaling is enabled, in the Job info panel, Horizontal autoscaling policy is set to Throughput-based.

gcloud

To use the gcloud CLI to see whether autoscaling is enabled, use the gcloud alpha managed-flink jobs describe command.

gcloud alpha managed-flink jobs describe \
  JOB_ID \
  --location=REGION

Replace the following:

JOB_ID: the ID of your BigQuery Engine for Apache Flink job
REGION: the region that the BigQuery Engine for Apache Flink job is in

When autoscaling is enabled, the response includes lines similar to the following example:

autotuningConfig:
  throughputBased:
    maxParallelism: NUMBER
    minParallelism: NUMBER
    parallelism: NUMBER

Monitor

Use the Google Cloud console to monitor autoscaling. You can view the autoscaling monitoring chart in the BigQuery Engine for Apache Flink monitoring interface in the Metrics tab on the Job details page. The charts display metrics over the duration of a job and includes the following information:

The current parallelism
The recommended parallelism
The minimum and maximum allowed task slots

For more information, see Autoscaling metrics.

BigQuery Engine for Apache Flink job-level autoscaling metrics are also exported to Cloud Monitoring. Use Metrics Explorer to create charts.