Dataproc Hub lets you use Vertex AI Workbench and Dataproc to run interactive ML and data processing tasks at scale using Jupyter notebooks and the Hadoop and Spark ecosystem.
Dataproc Hub notebooks are administrator-curated, single-user notebooks that run on a Dataproc JupyterLab cluster created in the user's project.
Dataproc Hub uses JupyterHub to:
Bring consistency across the organization by letting administrators create a curated list of notebook templates for different groups of data and ML users.
Accelerate notebook creation by giving data and ML users preconfigured environments that match their software and hardware requirements.
Dataproc Hub provides separate interfaces for administrators and users:
Administrators use the Dataproc→Workbench→User-Managed Notebooks page in the Google Cloud console to create Dataproc Hub instances. Each hub instance contains a predefined set of notebook environments defined by YAML cluster configuration files.
Data and ML users use the Notebooks→Instances UI in the Google Cloud console to select a predefined notebook environment and spawn a notebook server on their Dataproc cluster.
Users without console access can open the Dataproc Hub instance in their web browser, using the instance URL provided by the administrator, to spawn a Dataproc cluster.
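As a rough illustration of what one administrator-defined notebook environment might contain, the sketch below models a cluster configuration as a Python dictionary and checks it for two settings a notebook cluster would plausibly need. The field names follow the Dataproc ClusterConfig resource. The template name, the machine types, the instance counts, and the `check_notebook_ready` helper are hypothetical, and treating the Jupyter component and HTTP port access as requirements is an assumption, not a statement of what Dataproc Hub enforces.

```python
import json

# Hypothetical environment template. Field names mirror the Dataproc
# ClusterConfig resource; the values are placeholders, not recommendations.
SMALL_SPARK_ENV = {
    "config": {
        "endpointConfig": {"enableHttpPortAccess": True},
        "softwareConfig": {"optionalComponents": ["JUPYTER"]},
        "masterConfig": {"numInstances": 1, "machineTypeUri": "n1-standard-4"},
        "workerConfig": {"numInstances": 2, "machineTypeUri": "n1-standard-4"},
    }
}


def check_notebook_ready(template: dict) -> list:
    """Return a list of problems found in a template (empty if none).

    Assumption: a notebook environment needs the Jupyter optional
    component and HTTP port access to its web interfaces.
    """
    problems = []
    config = template.get("config", {})
    components = config.get("softwareConfig", {}).get("optionalComponents", [])
    if "JUPYTER" not in components:
        problems.append("softwareConfig.optionalComponents does not include JUPYTER")
    if not config.get("endpointConfig", {}).get("enableHttpPortAccess", False):
        problems.append("endpointConfig.enableHttpPortAccess is not true")
    return problems


if __name__ == "__main__":
    # Print the template and any problems the check finds.
    print(json.dumps(SMALL_SPARK_ENV, indent=2))
    print(check_notebook_ready(SMALL_SPARK_ENV))
```

An administrator could run a check like this over each configuration file before adding it to a hub instance. The real files are YAML, so they would need to be parsed into a dictionary first.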
Dataproc Hub use cases:
Data and ML users are organized in groups with common software and hardware requirements (a user can belong to multiple groups).
Restricted Dataproc console access: users do not have access to Dataproc in the Google Cloud console.
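The grouping described in the use cases can be pictured with a small conceptual model. This is only a sketch of the idea, not how Dataproc Hub stores or resolves groups. The group names, user names, environment names, and the `environments_for` helper are all hypothetical.

```python
# Hypothetical mapping from user group to the environment templates an
# administrator has curated for that group.
GROUP_ENVIRONMENTS = {
    "data-engineering": ["spark-large", "spark-small"],
    "ml-research": ["gpu-notebook", "spark-small"],
}

# A user can belong to more than one group.
USER_GROUPS = {
    "alice": ["data-engineering"],
    "bob": ["data-engineering", "ml-research"],
}


def environments_for(user: str) -> list:
    """Return the sorted, de-duplicated environments a user may choose from."""
    names = set()
    for group in USER_GROUPS.get(user, []):
        names.update(GROUP_ENVIRONMENTS.get(group, []))
    return sorted(names)


if __name__ == "__main__":
    for user in ("alice", "bob", "carol"):
        print(user, environments_for(user))
```

A user in several groups sees the union of those groups' environments, while a user in no group sees none.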
Dataproc Hub features:
Predefined user environments
Cluster and notebook isolation: members of one group are not given easy access to the clusters and notebooks of members in other groups
Note: Dataproc Hub and Vertex AI Workbench user-managed notebooks are deprecated. Support for user-managed notebooks ends on January 30, 2025, when the ability to create user-managed notebooks instances is also removed. For alternative notebook solutions on Google Cloud, see:
Install the Jupyter component on your Dataproc cluster: /dataproc/docs/concepts/components/jupyter#install_jupyter
Create a Dataproc-enabled Vertex AI Workbench instance: /vertex-ai/docs/workbench/instances/create-dataproc-enabled
If the organization does not define and implement separate notebook administrator and user roles, users can install the Jupyter component on their cluster (/dataproc/docs/concepts/components/jupyter) instead of using Dataproc Hub to configure and spawn a Jupyter notebook cluster.
For more information
Admins: Configure Dataproc Hub (/dataproc/docs/tutorials/dataproc-hub-admins)
Users: Use Dataproc Hub (/dataproc/docs/tutorials/dataproc-hub-users)
Last updated 2025-09-04 UTC.