# Create a backup and restore notebook data
Google Distributed Cloud (GDC) air-gapped lets you create backups and restore data from the home directory of your JupyterLab instances.

This page describes how to create and restore backups of Vertex AI Workbench notebook data. If you are new to Vertex AI, [learn more about Vertex AI Workbench](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-workbench-intro).
Before you begin
----------------

To get the permissions that you need to copy restored data, ask your Organization IAM Admin to grant you the User Cluster Developer (`user-cluster-developer`) role.
Create a backup and restore JupyterLab instance data
----------------------------------------------------

Define protected applications to create a backup of the home directory of an individual JupyterLab instance, or of the home directories of all JupyterLab instances in a project at once.

Create a `ProtectedApplication` custom resource in the cluster where you want to schedule backups. Backup and restore plans use protected applications to select resources. For information about creating protected applications, see [Protected application strategies](/distributed-cloud/hosted/docs/latest/gdch/platform-application/pa-ao-operations/protected-application-strategies).

The `ProtectedApplication` custom resource contains the following fields:
| Field | Description |
| --- | --- |
| `resourceSelection` | How the `ProtectedApplication` object selects resources for backups or restores. |
| `resourceSelection.type` | The resource selection method. A `Selector` type indicates that resources with matching labels are selected. |
| `resourceSelection.selector` | The selection rules. Contains the `matchLabels` subfield. |
| `resourceSelection.selector.matchLabels` | The labels that the `ProtectedApplication` object uses to match resources. Contains the three labels listed in the following rows. |
| `app.kubernetes.io/part-of` | The name of a higher-level application that this one is part of. Select Vertex AI Workbench (`vtxwb` in the examples on this page) as the high-level application for JupyterLab instances. |
| `app.kubernetes.io/component` | The component within the architecture. Select the Vertex AI Workbench resources that provide storage for JupyterLab instances (`storage` in the examples on this page). |
| `app.kubernetes.io/instance` | A unique name that identifies the instance of an application. Set this label to narrow the scope to one JupyterLab instance. The value is the same as the JupyterLab instance name in the GDC console. |
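The `matchLabels` fields described above can also be written as a `kubectl` label selector, which gives a quick way to preview which storage a `ProtectedApplication` resource would pick up. The following is a minimal sketch, not part of the documented procedure: the function names `vtxwb_storage_selector` and `list_vtxwb_storage` are hypothetical, and the label values `vtxwb` and `storage` are taken from the examples on this page.

```shell
# Build the label selector for Vertex AI Workbench storage.
# With an instance name as $1 the selector targets one JupyterLab instance;
# without it, the selector matches the storage of every instance.
vtxwb_storage_selector() {
  selector="app.kubernetes.io/part-of=vtxwb,app.kubernetes.io/component=storage"
  if [ -n "${1:-}" ]; then
    selector="${selector},app.kubernetes.io/instance=$1"
  fi
  printf '%s\n' "$selector"
}

# Hypothetical helper: list the PersistentVolumeClaim resources that carry
# those labels. $1 is the project namespace, $2 an optional instance name.
list_vtxwb_storage() {
  kubectl get pvc -l "$(vtxwb_storage_selector "${2:-}")" -n "$1"
}
```

For example, `list_vtxwb_storage my-project my-instance-name` previews the storage of one instance, and `list_vtxwb_storage my-project` previews the storage of all instances in the namespace.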
Use the `ProtectedApplication` custom resource to select the storage of a single JupyterLab instance or of all JupyterLab instances in a project, as in the following examples:

- **Select the storage of a single JupyterLab instance**:

  The following example shows a `ProtectedApplication` custom resource that selects the storage for a JupyterLab instance named `my-instance-name` in the `my-project` namespace:

      apiVersion: gkebackup.gke.io/v1
      kind: ProtectedApplication
      metadata:
        name: my-protected-application
        namespace: my-project
      spec:
        resourceSelection:
          type: Selector
          selector:
            matchLabels:
              app.kubernetes.io/part-of: vtxwb
              app.kubernetes.io/component: storage
              app.kubernetes.io/instance: my-instance-name

- **Select the storage of all JupyterLab instances**:

  The following example shows a `ProtectedApplication` custom resource that selects the storage for all JupyterLab instances in the `my-project` namespace:

      apiVersion: gkebackup.gke.io/v1
      kind: ProtectedApplication
      metadata:
        name: my-protected-application
        namespace: my-project
      spec:
        resourceSelection:
          type: Selector
          selector:
            matchLabels:
              app.kubernetes.io/part-of: vtxwb
              app.kubernetes.io/component: storage

  This example doesn't contain the `app.kubernetes.io/instance` label because it selects all JupyterLab instances.

To create a backup and restore data from a JupyterLab instance, [plan a set of backups](/distributed-cloud/hosted/docs/latest/gdch/platform-application/pa-ao-operations/plan-backups) and [plan a set of restores](/distributed-cloud/hosted/docs/latest/gdch/platform-application/pa-ao-operations/plan-restores) using the `ProtectedApplication` custom resource that you defined.

Copy restored data to a new JupyterLab instance
-----------------------------------------------

**Note:** Vertex AI Workbench doesn't support backing up the settings on the `Notebook` custom resource of the JupyterLab instance. You must create a new JupyterLab instance and transfer the restored data into it.

Follow these steps to copy restored data from the `PersistentVolumeClaim` resource of a JupyterLab instance to a new JupyterLab instance:

1. [Meet the prerequisites](#before-you-begin).
2. [Create a JupyterLab notebook](/distributed-cloud/hosted/docs/latest/gdch/application/ao-user/vertex-ai-workbench#create-notebook) associated with a JupyterLab instance to receive the restored data.
3. Get the pod name of the JupyterLab instance where you created the notebook:

       kubectl get pods -l notebook-name=INSTANCE_NAME -n PROJECT_NAMESPACE

   Replace the following:
   - INSTANCE_NAME: the name of the JupyterLab instance you configured.
   - PROJECT_NAMESPACE: the project namespace where you created the JupyterLab instance.
4. Get the name of the image that the JupyterLab instance is running:

       kubectl get pods POD_NAME -n PROJECT_NAMESPACE -o jsonpath="{.spec.containers[0].image}"

   Replace the following:
   - POD_NAME: the pod name of the JupyterLab instance.
   - PROJECT_NAMESPACE: the project namespace where you created the JupyterLab instance.
5. Find the name of the `PersistentVolumeClaim` resource that was restored:

       kubectl get pvc -l app.kubernetes.io/part-of=vtxwb,app.kubernetes.io/component=storage,app.kubernetes.io/instance=RESTORED_INSTANCE_NAME -n PROJECT_NAMESPACE

   Replace the following:
   - RESTORED_INSTANCE_NAME: the name of the JupyterLab instance that you restored.
   - PROJECT_NAMESPACE: the project namespace where you created the JupyterLab instance.
6. Create a YAML file named `vtxwb-data.yaml` with the following content:

       apiVersion: v1
       kind: Pod
       metadata:
         name: vtxwb-data
         namespace: PROJECT_NAMESPACE
         labels:
           aiplatform.gdc.goog/service-type: workbench
       spec:
         containers:
         - args:
           - sleep infinity
           command:
           - bash
           - -c
           image: IMAGE_NAME
           imagePullPolicy: IfNotPresent
           name: vtxwb-data
           resources:
             limits:
               cpu: "1"
               memory: 1Gi
             requests:
               cpu: "1"
               memory: 1Gi
           terminationMessagePath: /dev/termination-log
           terminationMessagePolicy: File
           volumeMounts:
           - mountPath: /home/jovyan
             name: restore-data
           workingDir: /home/jovyan
         volumes:
         - name: restore-data
           persistentVolumeClaim:
             claimName: RESTORED_PVC_NAME

   Replace the following:
   - PROJECT_NAMESPACE: the project namespace where you created the JupyterLab instance.
   - IMAGE_NAME: the name of the container image that the JupyterLab instance is running.
   - RESTORED_PVC_NAME: the name of the restored `PersistentVolumeClaim` resource.

   **Note:** The home directory of your JupyterLab instances is `/home/jovyan`.
7. Create a new pod for your restored `PersistentVolumeClaim` resource:

       kubectl apply -f ./vtxwb-data.yaml --kubeconfig KUBECONFIG_PATH

   Replace KUBECONFIG_PATH with the path of the kubeconfig file in the cluster.
8. Wait for the `vtxwb-data` pod to reach the `Running` state.
9. Copy your restored data to the new JupyterLab instance:

       kubectl cp PROJECT_NAMESPACE/vtxwb-data:/home/jovyan ./restore --kubeconfig KUBECONFIG_PATH

       kubectl cp ./restore PROJECT_NAMESPACE/POD_NAME:/home/jovyan/restore --kubeconfig KUBECONFIG_PATH

       rm -r ./restore

   Replace the following:
   - PROJECT_NAMESPACE: the project namespace where you created the JupyterLab instance.
   - KUBECONFIG_PATH: the path of the kubeconfig file in the cluster.
   - POD_NAME: the pod name of the JupyterLab instance.

   After the copy completes, your restored data is available in the `/home/jovyan/restore` directory of the new JupyterLab instance.
10. Delete the pod that you created to access your restored data:

        kubectl delete pod vtxwb-data -n PROJECT_NAMESPACE --kubeconfig KUBECONFIG_PATH

    Replace the following:
    - PROJECT_NAMESPACE: the project namespace where you created the pod.
    - KUBECONFIG_PATH: the path of the kubeconfig file in the cluster.

Last updated 2025/09/04 (UTC).
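The waiting step (step 8) and a check of the copied data after step 9 can be scripted rather than done by hand. The sketch below is one possible approach under stated assumptions, not part of the documented procedure: the function names are hypothetical, the 300-second default timeout is an arbitrary illustrative value, and it waits on the pod's `Ready` condition, which for a pod without readiness probes (as in the `vtxwb-data` manifest) is expected to follow once its container is running.

```shell
# Block until the helper pod named vtxwb-data reports the Ready condition.
# $1: project namespace, $2: kubeconfig path, $3: optional timeout (default 300s).
wait_for_restore_pod() {
  kubectl wait pod/vtxwb-data \
    --for=condition=Ready \
    --namespace "$1" \
    --timeout="${3:-300s}" \
    --kubeconfig "$2"
}

# List what landed in /home/jovyan/restore on the new JupyterLab pod.
# $1: pod name of the new instance, $2: project namespace, $3: kubeconfig path.
list_restored_files() {
  kubectl exec "$1" \
    --namespace "$2" \
    --kubeconfig "$3" \
    -- ls -A /home/jovyan/restore
}
```

For example, run `wait_for_restore_pod PROJECT_NAMESPACE KUBECONFIG_PATH` before the copy commands, and `list_restored_files POD_NAME PROJECT_NAMESPACE KUBECONFIG_PATH` after them, before deleting the `vtxwb-data` pod.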