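Use Dataproc Serverless Spark with managed notebooks
Vertex AI Workbench managed notebooks is deprecated. On April 14, 2025, support for
managed notebooks will end and the ability to create managed notebooks instances
will be removed. Existing instances will continue to function, but patches, updates,
and upgrades won't be available. This feature is in Preview and is subject to the
"Pre-GA Offerings Terms"; Pre-GA features are available "as is" and might have limited support.
This page shows you how to run a notebook file on serverless Spark
in a Vertex AI Workbench managed notebooks instance by using
Dataproc Serverless.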
Your managed notebooks instance
can submit a notebook file's code to run on
the Dataproc Serverless service. The service runs
the code on a managed compute infrastructure that automatically
scales resources as needed, so
you don't need to provision and manage your own cluster.
Dataproc Serverless charges apply only to the time when the workload is executing.
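Requirements
To run a notebook file on Dataproc Serverless Spark,
see the following requirements.
Your Dataproc Serverless session must run in the same
region as your managed notebooks instance.
The Require OS Login (constraints/compute.requireOsLogin) constraint
must not be enabled for your project. See Manage OS Login in an organization.
To run a notebook file on Dataproc Serverless,
you must provide a service account
that has specific permissions. You can grant these permissions
to the default service account or provide a custom service account.
See the Permissions section of this page.
Your Dataproc Serverless Spark session uses
a Virtual Private Cloud (VPC) network to execute workloads.
The VPC subnetwork must meet specific requirements.
See the requirements in Dataproc Serverless for
Spark network configuration.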
Permissions
To ensure that the service account has the necessary
permissions to run a notebook file on Dataproc Serverless,
ask your administrator to grant the service account the
Dataproc Editor (roles/dataproc.editor) IAM role on your project.
Important: You must grant this role to the service account, not to your user account.
Granting the role to the wrong principal might result in permission errors.
For more information about granting roles, see Manage access to projects, folders, and organizations.
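As a rough illustration of what that grant could look like from the command line, the following Python sketch only assembles a `gcloud projects add-iam-policy-binding` command and prints it; it does not run anything. The project ID and service account email are placeholders, and `build_grant_command` is a hypothetical helper written for this page, not part of any Google library.

```python
import shlex

DATAPROC_EDITOR_ROLE = "roles/dataproc.editor"


def build_grant_command(project_id: str, service_account_email: str) -> list:
    """Return the gcloud arguments that bind the Dataproc Editor role
    to a service account (not a user account) on a project."""
    # Service account emails end in ".gserviceaccount.com"; reject anything
    # else so the role is not bound to a user account by mistake.
    if not service_account_email.endswith(".gserviceaccount.com"):
        raise ValueError("expected a service account email, not a user account")
    return [
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        f"--member=serviceAccount:{service_account_email}",
        f"--role={DATAPROC_EDITOR_ROLE}",
    ]


# Placeholder values; replace them with your own project and service account.
command = build_grant_command(
    "my-project", "notebook-runner@my-project.iam.gserviceaccount.com"
)
print(shlex.join(command))
```

Printing the command instead of executing it lets an administrator review the principal and role before running it.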
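This predefined role contains
the permissions required to run a notebook file on Dataproc Serverless. To see the exact permissions that are
required, expand the Required permissions section:
Required permissions
The following permissions are required to run a notebook file on Dataproc Serverless:
dataproc.agents.create
dataproc.agents.delete
dataproc.agents.get
dataproc.agents.update
dataproc.sessions.create
dataproc.sessions.get
dataproc.sessions.list
dataproc.sessions.terminate
dataproc.sessions.delete
dataproc.tasks.lease
dataproc.tasks.listInvalidatedLeases
dataproc.tasks.reportStatus
Your administrator might also be able to give the service account
these permissions with custom roles or other predefined roles.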
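If your administrator uses a custom role instead of Dataproc Editor, a quick comparison can show which of the permissions this page requires are still missing. The sketch below is an illustration only: `missing_permissions` is a hypothetical helper, and the granted list is assumed to be something you collected yourself, for example from the output of `gcloud iam roles describe`.

```python
# Permissions that this page lists as required for running a notebook
# file on Dataproc Serverless.
REQUIRED_PERMISSIONS = frozenset({
    "dataproc.agents.create",
    "dataproc.agents.delete",
    "dataproc.agents.get",
    "dataproc.agents.update",
    "dataproc.sessions.create",
    "dataproc.sessions.get",
    "dataproc.sessions.list",
    "dataproc.sessions.terminate",
    "dataproc.sessions.delete",
    "dataproc.tasks.lease",
    "dataproc.tasks.listInvalidatedLeases",
    "dataproc.tasks.reportStatus",
})


def missing_permissions(granted):
    """Return the required permissions absent from `granted`, sorted by name."""
    return sorted(REQUIRED_PERMISSIONS - set(granted))


# Example: a custom role that forgot the terminate permission.
granted_example = REQUIRED_PERMISSIONS - {"dataproc.sessions.terminate"}
print(missing_permissions(granted_example))  # ['dataproc.sessions.terminate']
```

An empty list means the role covers everything listed above.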
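Before you begin
Sign in to your Google Cloud account. If you're new to Google Cloud,
create an account to evaluate how our products perform in
real-world scenarios. New customers also get $300 in free credits to
run, test, and deploy workloads.
In the Google Cloud console, on the project selector page,
select or create a Google Cloud project.
Verify that billing is enabled for your Google Cloud project.
Enable the Notebooks, Vertex AI, and Dataproc APIs.
If you haven't already, create a managed notebooks instance.
If you haven't already, configure a VPC network that meets the requirements
listed in Dataproc Serverless for Spark network configuration.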
Open JupyterLab
In the Google Cloud console, go to the Managed notebooks page.
Next to your managed notebooks instance's name,
click Open JupyterLab.
Start a Dataproc Serverless Spark session
To start a Dataproc Serverless Spark session,
complete the following steps.
In your managed notebooks instance's JupyterLab interface,
select the Launcher tab, and then select Serverless Spark.
If the Launcher tab is not open,
select File > New Launcher to open it.
The Create Serverless Spark session dialog appears.
In the Session name field, enter a name for your session.
In the Execution configuration section, enter the
Service account that you want to use. If you don't enter a
service account, your session uses the Compute Engine default
service account.
In the Network configuration section, select the
Network and Subnetwork of a network that meets the requirements
listed in Dataproc Serverless for Spark network configuration.
Click Create.
A new notebook file opens.
The Dataproc Serverless Spark session that you created is
the kernel that runs your notebook file's code.
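The service account fallback in the Execution configuration step can be sketched as follows. This is an illustration of the rule described above, not the dialog's actual implementation: `effective_service_account` is a hypothetical helper, and it assumes the usual Compute Engine default service account email format, `PROJECT_NUMBER-compute@developer.gserviceaccount.com`.

```python
from typing import Optional


def effective_service_account(entered: Optional[str], project_number: str) -> str:
    """Return the service account a session would run as.

    If the Service account field was left empty, fall back to the
    Compute Engine default service account of the project.
    """
    if entered and entered.strip():
        return entered.strip()
    # Assumed default email format for the Compute Engine default service account.
    return f"{project_number}-compute@developer.gserviceaccount.com"


# Placeholder project number and service account for illustration.
print(effective_service_account(None, "123456789012"))
print(effective_service_account("spark-runner@my-project.iam.gserviceaccount.com", "123456789012"))
```

Whichever account is used, it needs the permissions described in the Permissions section.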
Run your code on Dataproc Serverless Spark and other kernels
Add code to your new notebook file, and run the code.
To run code on a different kernel,
change the kernel.
When you want to run the code on your Dataproc Serverless Spark session again,
change the kernel back to
the Dataproc Serverless Spark kernel.
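Terminate your Dataproc Serverless Spark session
You can terminate a Dataproc Serverless Spark session
in the JupyterLab interface or in the Google Cloud console.
The code in your notebook file is preserved.
JupyterLab
In JupyterLab, close the notebook file that was created when
you created your Dataproc Serverless Spark session.
In the dialog that appears, click Terminate session.
Google Cloud console
In the Google Cloud console, go to the Dataproc sessions page.
Select the session that you want to terminate, and then click Terminate.
Delete your Dataproc Serverless Spark session
You can delete a Dataproc Serverless Spark session
by using the Google Cloud console.
The code in your notebook file is preserved.
In the Google Cloud console, go to the Dataproc sessions page.
Select the session that you want to delete, and then click Delete.
What's next
Learn more about Dataproc Serverless.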
Last updated 2025-09-08 UTC.