Dataproc Hub lets you take advantage of Vertex AI Workbench and Dataproc to run interactive ML and data processing tasks at scale using Jupyter notebooks and the Hadoop and Spark ecosystem.
Dataproc Hub notebooks are administrator-curated, single-user notebooks that run on a Dataproc JupyterLab cluster created and running in the user's project.
Dataproc Hub uses JupyterHub to:
- Bring consistency across the organization by letting administrators create a curated list of notebook templates for different groups of data and ML users.
- Accelerate notebook creation by giving data and ML users preconfigured environments that match their software and hardware requirements.
Dataproc Hub provides separate interfaces for administrators and users:
- Administrators use the Dataproc→Workbench→User-Managed Notebooks page in the Google Cloud console to create Dataproc Hub instances. Each hub instance contains a predefined set of notebook environments defined by YAML cluster configuration files.
- Data and ML users use the Notebooks→Instances UI in the Google Cloud console to select a predefined notebook environment and spawn a notebook server on their Dataproc cluster.
- Users without console access can open the Dataproc Hub instance from their web browser, using a Dataproc Hub instance URL provided by the administrator, to spawn a Dataproc cluster.
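
To make "notebook environments defined by YAML cluster configuration files" more concrete, the following Python sketch builds one hypothetical environment as a nested mapping and prints it as block-style YAML. It is not taken from the Dataproc Hub documentation: the field names are intended to mirror the Dataproc cluster resource and should be verified against the API reference, every value (machine types, instance counts, image version) is an illustrative assumption, and `to_yaml` is a small helper written for this example rather than a library call.

```python
# Hypothetical notebook environment, expressed as the nested mapping that a
# YAML cluster configuration file would hold. All values are illustrative.
SPARK_SMALL = {
    "config": {
        "endpointConfig": {"enableHttpPortAccess": True},
        "masterConfig": {"machineTypeUri": "n1-standard-4", "numInstances": 1},
        "workerConfig": {"machineTypeUri": "n1-standard-4", "numInstances": 2},
        "softwareConfig": {
            "imageVersion": "2.1-debian11",
            "optionalComponents": ["JUPYTER"],
        },
    }
}


def _scalar(value):
    """Format a leaf value; YAML spells booleans in lowercase."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def to_yaml(node, indent=0):
    """Render nested dicts (and lists of scalars) as block-style YAML lines."""
    pad = " " * indent
    lines = []
    if isinstance(node, dict):
        for key, value in node.items():
            if isinstance(value, (dict, list)):
                # Nested structure: emit the key, then the indented children.
                lines.append(f"{pad}{key}:")
                lines.extend(to_yaml(value, indent + 2))
            else:
                lines.append(f"{pad}{key}: {_scalar(value)}")
    else:
        # This sketch only supports lists whose items are scalars.
        for item in node:
            lines.append(f"{pad}- {_scalar(item)}")
    return lines


print("\n".join(to_yaml(SPARK_SMALL)))
```

An administrator would typically keep one such file per environment, so that each entry users see when they spawn a notebook server corresponds to one cluster shape.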
Dataproc Hub use cases:
- Data and ML users are organized in groups with common software and hardware requirements (users can be placed in multiple groups).
- Restricted Dataproc console access: users do not have access to Dataproc in the Google Cloud console.
Dataproc Hub features:
- Predefined user environments
- Cluster and notebook isolation: members of a group are not given easy access to the clusters and notebooks of members in other groups
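
The relationship between groups and curated environments can be pictured with a small model. The Python sketch below is only a conceptual illustration, not how Dataproc Hub is implemented: the group names, user names, environment names, and the `environments_for` helper are all hypothetical, and the assumption that a user in several groups sees the combined list of those groups' environments is mine.

```python
# Hypothetical curated environments per group (names are illustrative).
GROUP_ENVIRONMENTS = {
    "data-engineering": ["spark-small", "spark-large"],
    "ml-research": ["spark-large", "gpu-notebook"],
}

# Hypothetical group membership; a user can belong to several groups.
USER_GROUPS = {
    "alice": ["data-engineering"],
    "bob": ["data-engineering", "ml-research"],
}


def environments_for(user):
    """Return the environments a user may pick from, without duplicates."""
    available = []
    for group in USER_GROUPS.get(user, []):
        for env in GROUP_ENVIRONMENTS.get(group, []):
            if env not in available:
                available.append(env)
    return available


print(environments_for("alice"))
print(environments_for("bob"))
```

In this model, a user who belongs to no group gets an empty list, which loosely mirrors the isolation idea: environments are reachable only through group membership.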
Note: If the organization does not define and implement separate notebook administrative and user roles, users can [install the Jupyter component on their cluster](/dataproc/docs/concepts/components/jupyter) instead of using Dataproc Hub to configure and spawn a Jupyter notebook cluster.
Deprecation notice: Dataproc Hub and Vertex AI Workbench user-managed notebooks are deprecated. On January 30, 2025, support for user-managed notebooks will end and the ability to create user-managed notebooks instances will be removed. For alternative notebook solutions on Google Cloud, see:
- [Install the Jupyter component on your Dataproc cluster](/dataproc/docs/concepts/components/jupyter#install_jupyter)
- [Create a Dataproc-enabled Vertex AI Workbench instance](/vertex-ai/docs/workbench/instances/create-dataproc-enabled)
For more information:
- Admins: [Configure Dataproc Hub](/dataproc/docs/tutorials/dataproc-hub-admins)
- Users: [Use Dataproc Hub](/dataproc/docs/tutorials/dataproc-hub-users)
Last updated 2025-09-04 UTC.