# Vertex AI partner models for MaaS

Vertex AI supports a curated list of models developed by Google partners.
Partner models can be used with [Vertex AI](/vertex-ai) as a model as a
service (MaaS) and are offered as a managed API. When you use a partner model,
you continue to send your requests to Vertex AI endpoints. Partner models
are serverless, so there's no need to provision or manage infrastructure.
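As a sketch of what such a managed-API request looks like, the following builds the endpoint URL and an Anthropic-style request body. The project ID, region, and model version are illustrative placeholders, and the request schema should be checked against the model's own documentation:

```python
# Sketch: composing a request to a partner model served as a managed API on
# Vertex AI. The project, region, and model version below are placeholders.
PROJECT_ID = "my-project"                # hypothetical project ID
REGION = "us-east5"                      # a region where the model is offered
PUBLISHER = "anthropic"
MODEL = "claude-3-5-sonnet-v2@20241022"  # illustrative model version

# Partner models are reached through the same Vertex AI endpoint pattern
# as first-party models; no infrastructure is provisioned or managed.
url = (
    f"https://{REGION}-aiplatform.googleapis.com/v1"
    f"/projects/{PROJECT_ID}/locations/{REGION}"
    f"/publishers/{PUBLISHER}/models/{MODEL}:rawPredict"
)

# Anthropic-style request body; the exact schema is defined by the
# model publisher, not by Vertex AI.
body = {
    "anthropic_version": "vertex-2023-10-16",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello"}],
}
print(url)
```

Sending `body` to `url` with an authenticated POST returns the model response; only the URL construction is shown here.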
You can discover partner models using Model Garden, and you can also deploy
models with it. For more information, see [Explore AI models in
Model Garden](/vertex-ai/generative-ai/docs/model-garden/explore-models).
While information about each available partner model can be found on its model
card in Model Garden, this guide documents only the third-party models that
are offered as a MaaS with Vertex AI.

Anthropic's Claude and Mistral models are examples of third-party managed
models that are available to use on Vertex AI.
Vertex AI partner model pricing with capacity assurance
-------------------------------------------------------

Google offers provisioned throughput for some partner models, which reserves
throughput capacity for your models for a fixed fee. You decide on the
throughput capacity and in which regions to reserve that capacity. Because
provisioned throughput requests are prioritized over standard pay-as-you-go
requests, provisioned throughput provides increased availability. When the
system is overloaded, your requests can still be completed as long as the
throughput remains under your reserved throughput capacity. For more
information or to subscribe to the service, [contact sales](/contact).
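As a sketch of how a caller might opt a single request in or out of reserved capacity: the `X-Vertex-AI-LLM-Request-Type` header and its values are taken from the Vertex AI Provisioned Throughput documentation, and whether a given partner model honors it is an assumption to verify against that model's page:

```python
# Sketch: per-request throughput routing with Provisioned Throughput.
# The header name is from the Vertex AI Provisioned Throughput docs;
# support for it on any particular partner model is an assumption.
def throughput_headers(request_type: str) -> dict:
    # "dedicated": use only reserved (provisioned) capacity
    # "shared":    use only standard pay-as-you-go capacity
    allowed = {"dedicated", "shared"}
    if request_type not in allowed:
        raise ValueError(f"request_type must be one of {allowed}")
    return {"X-Vertex-AI-LLM-Request-Type": request_type}

print(throughput_headers("shared"))
```

Omitting the header lets the service use reserved capacity first and overflow to pay-as-you-go, which is the default behavior described for Provisioned Throughput.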
Last updated: 2025-09-04 (UTC)

Partner models
--------------

The following partner models are offered as managed APIs on Vertex AI
Model Garden (MaaS):
Regional and global endpoints
-----------------------------

For regional endpoints, requests are served from your specified region. In
cases where you have data residency requirements, or if a model doesn't
support the global endpoint, use the regional endpoints.

When you use the global endpoint, Google can process and serve your requests
from any region that is supported by the model that you are using, which might
result in higher latency in some cases. The global endpoint helps improve
overall availability and helps reduce errors.

There is no price difference with the regional endpoints when you use the
global endpoint. However, the global endpoint quotas and supported model
capabilities can differ from the regional endpoints. For more information,
view the related third-party model page.

### Specify the global endpoint

To use the global endpoint, set the region to `global`.

For example, the request URL for a curl command uses the following format:
`https://aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/global/publishers/PUBLISHER_NAME/models/MODEL_NAME`

For the Vertex AI SDK, a regional endpoint is the default. Set the region to
`GLOBAL` to use the global endpoint.
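The URL pattern above can be sketched as a small helper; the project and model names passed in are illustrative placeholders:

```python
# Sketch: building regional vs. global request URLs for a partner model.
def model_url(project: str, location: str, publisher: str, model: str) -> str:
    # The global endpoint drops the regional "<region>-" host prefix and
    # uses "global" as the location path segment.
    host = (
        "aiplatform.googleapis.com"
        if location == "global"
        else f"{location}-aiplatform.googleapis.com"
    )
    return (
        f"https://{host}/v1/projects/{project}/locations/{location}"
        f"/publishers/{publisher}/models/{model}"
    )

print(model_url("my-project", "global", "anthropic", "claude-sonnet-4"))
```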
### Supported models

The global endpoint is available for the following models:

- [Claude Opus 4.1](/vertex-ai/generative-ai/docs/partner-models/claude/use-claude#regions)
- [Claude Opus 4](/vertex-ai/generative-ai/docs/partner-models/claude/use-claude#regions)
- [Claude Sonnet 4](/vertex-ai/generative-ai/docs/partner-models/claude/use-claude#regions)
- [Claude 3.7 Sonnet](/vertex-ai/generative-ai/docs/partner-models/claude/use-claude#regions)
- [Claude 3.5 Sonnet v2](/vertex-ai/generative-ai/docs/partner-models/claude/use-claude#regions)

| **Note:** Prompt Caching is supported when using the global endpoint. Provisioned Throughput isn't supported when using the global endpoint.

### Restrict global API endpoint usage

To help enforce the use of regional endpoints, use the
`constraints/gcp.restrictEndpointUsage` organization policy constraint to
block requests to the global API endpoint. For more information, see
[Restricting endpoint usage](/assured-workloads/docs/restrict-endpoint-usage).

Grant user access to partner models
-----------------------------------

For you to enable partner models and make a prompt request, a Google Cloud
administrator must [set the required permissions](#set-permissions) and
[verify that the organization policy allows the use of required
APIs](#set-organization-policy).

### Set required permissions to use partner models

The following roles and permissions are required to use partner models:

- You must have the Consumer Procurement Entitlement Manager
  Identity and Access Management (IAM) role. Anyone who's been granted this
  role can enable partner models in Model Garden.

- You must have the `aiplatform.endpoints.predict` permission. This
  permission is included in the Vertex AI User IAM role.
  For more information, see [Vertex AI
  User](/vertex-ai/docs/general/access-control#aiplatform.user) and
  [Access control](/vertex-ai/generative-ai/docs/access-control#permissions).

### Console

1. To grant the Consumer Procurement Entitlement Manager IAM role to a user,
   go to the **IAM** page.

   [Go to IAM](https://console.cloud.google.com/projectselector/iam-admin/iam?supportedpurview=)

2. In the **Principal** column, find the user
   [principal](/iam/docs/overview#concepts_related_identity) for which you
   want to enable access to partner models, and then click
   **Edit principal** in that row.

3. In the **Edit access** pane, click **Add another role**.

4. In **Select a role**, select **Consumer Procurement Entitlement Manager**.

5. In the **Edit access** pane, click **Add another role**.

6. In **Select a role**, select **Vertex AI User**.

7. Click **Save**.

### gcloud

1. In the Google Cloud console, activate Cloud Shell.

   [Activate Cloud Shell](https://console.cloud.google.com/?cloudshell=true)

2. Grant the Consumer Procurement Entitlement Manager role that's required
   to enable partner models in Model Garden:

       gcloud projects add-iam-policy-binding PROJECT_ID \
           --member=PRINCIPAL --role=roles/consumerprocurement.entitlementManager

3. Grant the Vertex AI User role, which includes the
   `aiplatform.endpoints.predict` permission that's required to make
   prompt requests:

       gcloud projects add-iam-policy-binding PROJECT_ID \
           --member=PRINCIPAL --role=roles/aiplatform.user

   Replace PROJECT_ID with your project ID and PRINCIPAL with the
   identifier for the principal.
   The identifier takes the form `user|group|serviceAccount:email` or
   `domain:domain`. For example: `user:cloudysanfrancisco@gmail.com`,
   `group:admins@example.com`, `serviceAccount:test123@example.domain.com`,
   or `domain:example.domain.com`.

   The output is a list of policy bindings that includes the following:

       - members:
         - user:PRINCIPAL
         role: roles/consumerprocurement.entitlementManager

   For more information, see
   [Grant a single role](/iam/docs/granting-changing-revoking-access#grant-single-role)
   and
   [`gcloud projects add-iam-policy-binding`](/sdk/gcloud/reference/projects/add-iam-policy-binding).

### Set the organization policy for partner model access

To enable partner models, your organization policy must allow the following
API: Cloud Commerce Consumer Procurement API -
`cloudcommerceconsumerprocurement.googleapis.com`.

If your organization sets an organization policy to
[restrict service usage](/resource-manager/docs/organization-policy/restricting-resources),
then an organization administrator must verify that
`cloudcommerceconsumerprocurement.googleapis.com` is allowed by
[setting the organization policy](/resource-manager/docs/organization-policy/restricting-resources#setting_the_organization_policy).

Also, if you have an organization policy that restricts model usage in
Model Garden, the policy must allow access to partner models.
For more information, see [Control model
access](/vertex-ai/generative-ai/docs/control-model-access).

### Partner model regulatory compliance

The [certifications](/security/compliance/services-in-scope) for
[Generative AI on Vertex AI](/vertex-ai/generative-ai/docs/overview) continue
to apply when partner models are used as a managed API using Vertex AI. If you
need details about the models themselves, additional information can be found
in the respective model card, or you can contact the respective model
publisher.

Your data is stored at rest within the selected region or multi-region for
partner models on Vertex AI, but the regionalization of data processing may
vary. For a detailed list of partner models' data processing commitments, see
[Data residency for partner
models](/vertex-ai/generative-ai/docs/learn/locations#ml-processing-partner-models).

Customer prompts and model responses are not shared with third parties when
you use the Vertex AI API, including partner models. Google only processes
Customer Data as instructed by the Customer, as further described in our
[Cloud Data Processing Addendum](/terms/data-processing-addendum).