This page shows you how to create Dataproc Serverless interactive sessions and session templates. A session template can be used to create multiple interactive sessions based on the session template configuration.
Create a Dataproc Serverless session
You can use the Google Cloud console, the Google Cloud CLI, or the Dataproc API to create a Dataproc Serverless interactive session.
Console
To create a Dataproc Serverless session using the Google Cloud console, complete the following steps:
In the Google Cloud console, go to the Interactive Sessions page.
Click Create.
On the Add an interactive session (Preview) page, enter or confirm the session configuration settings. Note the following:
- Interactive session name: Required. Accept the default name or specify a session name.
- Region: Required. Accept the default region or specify an available region for your session.
- Runtime configuration: Optional. Selectable session runtimes correspond to available Dataproc Serverless for Spark runtime versions. You can specify a custom container image to use for your session.
- Properties: Optional. Click Add Item for each property to set for your session. For more information, see Spark properties.
- Spark UI (Preview): Optional. You can use the Spark UI to collect and monitor session execution details.
- Service account: Optional. The service account to use for the session. If not specified, the Compute Engine default service account is used.
- Network configuration: Required. The session subnetwork must have Private Google Access (PGA) enabled and must allow subnet communication on all ports. Only networks with subnetworks in the specified session region with PGA enabled are listed in this section. For more information, see Dataproc Serverless for Spark network configuration.
Click Submit to create the session.
gcloud
You can use the gcloud beta dataproc sessions create command with a SESSION_NAME argument to create a Dataproc Serverless interactive session.
Command flag notes:
- --region: Required. An available region for your session.
- --version: Optional. A supported Spark runtime version. If you don't use this flag to specify a version, the current default Spark runtime version is used.
- --container-image: Optional. A custom container image to use for your session.
- --property: Optional. One or more comma-separated Spark properties for your session.
- --service-account: Optional. The service account to use for your session. If not specified, the Compute Engine default service account is used.
- --subnet: Optional. A VPC subnet in the following format: projects/PROJECT_ID/regions/REGION/subnetworks/SUBNET_NAME
  - REGION: The --region you selected for your session.
  - SUBNET_NAME: The subnet must have Private Google Access (PGA) enabled and allow subnet communication on all ports. For more information, see Dataproc Serverless for Spark network configuration.
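The flag notes above can be sketched as a small Python helper that assembles the command line. This is a minimal sketch, not an official client: the function name, the example project, session, and subnet names are hypothetical, and the key=value form of each --property entry is an assumption. The command form mirrors the one given on this page; check gcloud beta dataproc sessions create --help for the exact syntax your gcloud version expects.

```python
import shlex


def build_session_create_argv(session_name, region, project_id=None,
                              version=None, container_image=None,
                              properties=None, service_account=None,
                              subnet_name=None):
    """Assemble an argument list from the flags described on this page.

    Hypothetical helper: it only formats strings and does not call gcloud.
    """
    argv = ["gcloud", "beta", "dataproc", "sessions", "create",
            session_name, f"--region={region}"]
    if version:
        argv.append(f"--version={version}")
    if container_image:
        argv.append(f"--container-image={container_image}")
    if properties:
        # Assumption: each Spark property is passed as key=value,
        # joined with commas into a single --property value.
        pairs = ",".join(f"{key}={value}" for key, value in properties.items())
        argv.append(f"--property={pairs}")
    if service_account:
        argv.append(f"--service-account={service_account}")
    if subnet_name:
        if not project_id:
            raise ValueError("project_id is required to build the subnet path")
        # REGION in the subnet path is the same region passed to --region.
        argv.append(
            f"--subnet=projects/{project_id}/regions/{region}"
            f"/subnetworks/{subnet_name}"
        )
    return argv


# Example with placeholder values; prints a shell-quoted command line.
example = build_session_create_argv(
    "my-session", "us-central1", project_id="my-project",
    properties={"spark.executor.cores": "4"}, subnet_name="default",
)
print(shlex.join(example))
```

Building the subnet path from the same region value that feeds --region keeps the two in sync, which is the constraint the REGION note describes.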
REST
You can use the Dataproc sessions.create API to create a Dataproc Serverless interactive session.
Notes:
- name: Required. Session name.
- version: Optional. Any of the supported Spark runtime versions for your session. If you don't specify a version, the current default version is used.
- containerImage: Optional. A custom container image to use for your session.
- properties: Optional. A mapping of session property names to values. See Spark properties.
- serviceAccount: Optional. The service account to use to run your session. If not specified, the Compute Engine default service account is used.
- subnetworkUri: Optional. A VPC subnet for your session in the following format: projects/PROJECT_ID/regions/REGION/subnetworks/SUBNET_NAME
  The subnet must have Private Google Access (PGA) enabled and allow subnet communication on all ports. For more information, see Dataproc Serverless for Spark network configuration.
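As an illustration of how the fields in these notes might fit into a sessions.create request body, the following Python sketch builds the JSON payload. The field names come from the notes above. Placing version, containerImage, and properties under runtimeConfig, and serviceAccount and subnetworkUri under environmentConfig.executionConfig, is an assumption about the Session resource layout that this page does not state, so verify it against the API reference. The function name and example values are hypothetical.

```python
import json


def build_session_body(name, version=None, container_image=None,
                       properties=None, service_account=None,
                       subnetwork_uri=None):
    """Build a request body dict from the fields listed in the notes.

    Hypothetical helper; the nesting is an assumption (see lead-in).
    """
    runtime_config = {}
    if version:
        runtime_config["version"] = version
    if container_image:
        runtime_config["containerImage"] = container_image
    if properties:
        # A mapping of session property names to values.
        runtime_config["properties"] = dict(properties)

    execution_config = {}
    if service_account:
        execution_config["serviceAccount"] = service_account
    if subnetwork_uri:
        execution_config["subnetworkUri"] = subnetwork_uri

    body = {"name": name}  # name is the only required field in the notes
    if runtime_config:
        body["runtimeConfig"] = runtime_config
    if execution_config:
        body["environmentConfig"] = {"executionConfig": execution_config}
    return body


# Example with placeholder values.
body = build_session_body(
    "my-session",
    properties={"spark.executor.cores": "4"},
    subnetwork_uri="projects/my-project/regions/us-central1/subnetworks/default",
)
print(json.dumps(body, indent=2))
```

Optional fields are omitted entirely when unset, so the service applies its defaults (for example, the default runtime version and the Compute Engine default service account).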
Create a Dataproc Serverless session template
A Dataproc Serverless session template defines the configuration settings for creating one or more Dataproc Serverless interactive sessions.
You can use the Google Cloud console, the gcloud CLI, or the Dataproc API to create a Dataproc Serverless session template.
Console
To create a Dataproc Serverless session template using the Google Cloud console, complete the following steps:
In the Google Cloud console, go to the Interactive Session Templates page.
Click Create.
On the Create session template page, enter or confirm the template configuration settings. Note the following:
- Template runtime ID: Required. Accept the default ID (name) or specify a template runtime name.
- Region: Required. Accept the default region or specify an available region for template sessions.
- Runtime version: Optional. Selectable session runtimes correspond to Dataproc Serverless for Spark runtime versions.
- Template configuration type: Required. Select a type. If you select Jupyter, specify the Display name and select the Jupyter kernel type. For more information, see Launch a Jupyter notebook on Dataproc Serverless.
- Service account: Optional. The service account to use to run templated sessions. If not specified, the Compute Engine default service account is used.
- Custom container image: Optional. A custom container image to use for your templated sessions.
- Properties: Optional. Click Add Item for each property to set for your templated sessions. For more information, see Spark properties.
- Network configuration: Required. The session subnetwork must have Private Google Access (PGA) enabled and must allow subnet communication on all ports. Only networks with subnetworks in the session region with PGA enabled are listed in this section. For more information, see Dataproc Serverless for Spark network configuration.
Click Submit to create the session template.
gcloud
You can't directly create a Dataproc Serverless session template using the
gcloud CLI, but you can use the gcloud beta dataproc session-templates import
command to import an existing session template. You can edit the imported template,
and then export it using the gcloud beta dataproc session-templates export
command.
REST
You can use the Dataproc sessionTemplates.create API to create a Dataproc Serverless session template.
Notes:
- name: Required. Session template name.
- version: Optional. Any of the supported Spark runtime versions for your templated sessions. If you don't specify a version, the default version is used.
- containerImage: Optional. A custom container image to use for your templated sessions.
- properties: Optional. A mapping of session property names to values. See Spark properties.
- serviceAccount: Optional. A service account to use to run your templated sessions. If not specified, the Compute Engine default service account is used.
- subnetworkUri: Optional. A VPC subnet for your templated sessions in the following format: projects/PROJECT_ID/regions/REGION/subnetworks/SUBNET_NAME
  The subnet must have Private Google Access (PGA) enabled and allow subnet communication on all ports. For more information, see Dataproc Serverless for Spark network configuration.
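Because the subnetworkUri value has a fixed shape, it can be checked locally before a request is sent. The following Python sketch parses the projects/PROJECT_ID/regions/REGION/subnetworks/SUBNET_NAME format and, optionally, compares the region segment against the region you intend to use. The function name and example values are hypothetical, and the region comparison assumes that templated sessions follow the same rule given for the gcloud --subnet flag. The check is purely syntactic: it can't confirm that Private Google Access is enabled or that the subnet allows communication on all ports.

```python
import re

# Pattern for projects/PROJECT_ID/regions/REGION/subnetworks/SUBNET_NAME.
_SUBNET_URI = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/regions/(?P<region>[^/]+)"
    r"/subnetworks/(?P<subnet>[^/]+)$"
)


def parse_subnetwork_uri(uri, expected_region=None):
    """Split a subnetwork URI into its parts, raising ValueError if invalid.

    Hypothetical helper: validates format only, not network settings.
    """
    match = _SUBNET_URI.match(uri)
    if match is None:
        raise ValueError(f"not a valid subnetwork URI: {uri!r}")
    parts = match.groupdict()
    if expected_region is not None and parts["region"] != expected_region:
        raise ValueError(
            f"subnet region {parts['region']!r} does not match "
            f"session region {expected_region!r}"
        )
    return parts


# Example with placeholder values.
parts = parse_subnetwork_uri(
    "projects/my-project/regions/us-central1/subnetworks/default",
    expected_region="us-central1",
)
print(parts["subnet"])
```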