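You can reboot any persistent resource that's in the RUNNING or ERROR state.
Rebooting a persistent resource lets you recover from errors that the persistent resource can't recover from on its own. You can also reboot a persistent resource to manually obtain more up-to-date clusters. This page shows you how to reboot a persistent resource by using the Google Cloud console, the Google Cloud CLI, and the REST API.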
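As a rough illustration of the REST call this page covers, the following Python sketch checks whether a resource is in a rebootable state and builds the URL for the `:reboot` method. The helper names `can_reboot` and `reboot_url` are hypothetical, and the `LOCATION-aiplatform.googleapis.com` host is an assumption based on the usual Vertex AI regional endpoint pattern (the examples on this page use `us-central1`). The sketch only builds the URL and doesn't send a request.

```python
from urllib.parse import quote

# States from which, per this page, a reboot can be requested.
REBOOTABLE_STATES = {"RUNNING", "ERROR"}


def can_reboot(state: str) -> bool:
    """Return True if a persistent resource in `state` may be rebooted."""
    return state in REBOOTABLE_STATES


def reboot_url(project_id: str, location: str, resource_id: str) -> str:
    """Build the URL for the `:reboot` REST method.

    Assumption: the API host is the regional endpoint
    LOCATION-aiplatform.googleapis.com.
    """
    return (
        f"https://{location}-aiplatform.googleapis.com/v1"
        f"/projects/{quote(project_id, safe='')}"
        f"/locations/{quote(location, safe='')}"
        f"/persistentResources/{quote(resource_id, safe='')}:reboot"
    )
```

For example, `reboot_url("sample-project", "us-central1", "test-persistent-resource")` produces a URL of the same shape as the one used in the REST request on this page.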
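Required roles
To get the permission that you need to reboot a persistent resource, ask your administrator to grant you the Vertex AI Administrator (roles/aiplatform.admin) IAM role on your project.
For more information about granting roles, see Manage access to projects, folders, and organizations.
This predefined role contains the aiplatform.persistentResources.update permission, which is required to reboot a persistent resource.
You might also be able to get this permission with custom roles or other predefined roles.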
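Rebooting a persistent resource is a long-running operation, and the persistent resource can't be deleted while the operation is in progress. The operation contains a progressMessage field that is populated with an error status if an error occurs. After the operation reports "done": true, check the state of the persistent resource. If the persistent resource is in the RUNNING state, the reboot succeeded and the resource is ready to run training jobs.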
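The check described above can be sketched in Python as follows. This is a minimal sketch, not an official client: `reboot_outcome` is a hypothetical helper that takes the operation and persistent resource as already-parsed JSON dictionaries. It assumes the operation follows the standard long-running operation shape (`done`, optional `error`, `metadata.progressMessage`) and that the persistent resource exposes its state in a `state` field.

```python
from typing import Any, Mapping


def reboot_outcome(operation: Mapping[str, Any], resource: Mapping[str, Any]) -> str:
    """Classify a reboot from the operation and persistent resource JSON."""
    if not operation.get("done", False):
        # Still in progress; progressMessage describes the current step.
        progress = operation.get("metadata", {}).get("progressMessage", "")
        return f"in progress: {progress}" if progress else "in progress"
    if "error" in operation:
        # A finished operation that failed carries an error object.
        message = operation["error"].get("message", "unknown error")
        return f"failed: {message}"
    # The operation finished; the resource state decides the outcome.
    state = resource.get("state")
    if state == "RUNNING":
        return "succeeded"
    return f"finished, but resource state is {state}"
```

Keeping the function free of network calls makes it easy to test; fetching the operation and the resource (for example with `gcloud ai persistent-resources describe`) is left to the caller.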
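Limitations
The following are limitations for rebooting a persistent resource:
In some cases, it's possible to lose capacity of scarce resources when rebooting a persistent resource. Full resource retention is not guaranteed.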
Reboot is not available on Ray on Vertex AI.
Persistent resources containing autoscaled worker pools reboot with the minimum replica count.

Reboot a persistent resource
----------------------------

Use one of the following methods to reboot a persistent resource. Make sure there are no training jobs running on the persistent resource.

### Console

To reboot a persistent resource in the Google Cloud console, do the following:

1. In the Google Cloud console, go to the **Persistent resources** page.

   [Go to Persistent resources](https://console.cloud.google.com/vertex-ai/training/persistent-resources)
2. Next to the name of the persistent resource that you want to reboot, click the vertical ellipsis.
3. Click **Reboot**.
4. Click **Confirm**.

### gcloud

Before using any of the command data below, make the following replacements:

- PROJECT_ID: The project ID of the persistent resource that you want to reboot.
- LOCATION: The region of the persistent resource that you want to reboot.
- PERSISTENT_RESOURCE_ID: The ID of the persistent resource that you want to reboot.

**Note:** Ensure you have initialized the Google Cloud CLI with authentication and a project by running either [gcloud init](/sdk/gcloud/reference/init); or [gcloud auth login](/sdk/gcloud/reference/auth/login) and [gcloud config set project](/sdk/gcloud/reference/config/set).

Execute the following command:

#### Linux, macOS, or Cloud Shell

```bash
gcloud ai persistent-resources reboot PERSISTENT_RESOURCE_ID \
  --project=PROJECT_ID \
  --region=LOCATION
```

#### Windows (PowerShell)

```powershell
gcloud ai persistent-resources reboot PERSISTENT_RESOURCE_ID `
  --project=PROJECT_ID `
  --region=LOCATION
```

#### Windows (cmd.exe)

```bat
gcloud ai persistent-resources reboot PERSISTENT_RESOURCE_ID ^
  --project=PROJECT_ID ^
  --region=LOCATION
```

You should receive a response similar to the following:

```
Using endpoint [https://us-central1-aiplatform.googleapis.com/]
Request to reboot the PersistentResource [projects/sample-project/locations/us-central1/persistentResources/test-persistent-resource] has been sent.

You may view the status of your persistent resource with the command

  $ gcloud ai persistent-resources describe projects/sample-project/locations/us-central1/persistentResources/test-persistent-resource
```

### REST

Before using any of the request data, make the following replacements:

- PROJECT_ID: The project ID of the persistent resource that you want to reboot.
- LOCATION: The region of the persistent resource that you want to reboot.
- PERSISTENT_RESOURCE_ID: The ID of the persistent resource that you want to reboot.

HTTP method and URL:

```
POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/persistentResources/PERSISTENT_RESOURCE_ID:reboot
```

To send your request, use one of these options:

#### curl (Linux, macOS, or Cloud Shell)

**Note:** The following command assumes that you have logged in to the `gcloud` CLI with your user account by running [`gcloud init`](/sdk/gcloud/reference/init) or [`gcloud auth login`](/sdk/gcloud/reference/auth/login), or by using [Cloud Shell](/shell/docs), which automatically logs you into the `gcloud` CLI. You can check the currently active account by running [`gcloud auth list`](/sdk/gcloud/reference/auth/list).

Execute the following command:

```
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d "" \
  "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/persistentResources/PERSISTENT_RESOURCE_ID:reboot"
```

#### PowerShell (Windows)

**Note:** The following command assumes that you have logged in to the `gcloud` CLI with your user account by running [`gcloud init`](/sdk/gcloud/reference/init) or [`gcloud auth login`](/sdk/gcloud/reference/auth/login). You can check the currently active account by running [`gcloud auth list`](/sdk/gcloud/reference/auth/list).

Execute the following command:

```
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
  -Method POST `
  -Headers $headers `
  -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/persistentResources/PERSISTENT_RESOURCE_ID:reboot" | Select-Object -Expand Content
```

You should receive a JSON response similar to the following:

```
{
  "name": "projects/123456789012/locations/us-central1/persistentResources/test-persistent-resource/operations/1234567890123456789",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.RebootPersistentResourceOperationMetadata",
    "genericMetadata": {
      "createTime": "2024-03-18T17:31:54.955004Z",
      "updateTime": "2024-03-18T17:31:55.204817Z",
      "state": "RUNNING",
      "worksOn": [
        "projects/123456789012/locations/us-central1/persistentResources/test-persistent-resource"
      ]
    },
    "progressMessage": "Waiting for persistent resource shut down."
  }
}
```

What's next
-----------

- [Learn about persistent resource](/vertex-ai/docs/training/persistent-resource-overview).
- [Create and use a persistent resource](/vertex-ai/docs/training/persistent-resource-create).
- [Run training jobs on a persistent resource](/vertex-ai/docs/training/persistent-resource-train).
- [Get information about a persistent resource](/vertex-ai/docs/training/persistent-resource-get).
- [Delete a persistent resource](/vertex-ai/docs/training/persistent-resource-delete).