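Generate content with the Gemini API

Use generateContent or streamGenerateContent to generate content with Gemini.

The Gemini model family includes models that work with multimodal prompt requests. Multimodal means that you can use more than one modality, or type of input, in a prompt. Models that aren't multimodal accept text prompts only. Modalities can include text, audio, video, and more.

Create a Google Cloud account to get started

To use the Vertex AI API for Gemini, create a Google Cloud account.

After you create your account, use this document to review the Gemini model request body, model parameters, response body, and some sample requests.

When you're ready, see the Vertex AI API for Gemini quickstart to learn how to send a request to the Vertex AI Gemini API by using a programming language SDK or the REST API.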

Supported models

Model	Version
Gemini 1.5 Flash	`gemini-1.5-flash-001`
Gemini 1.5 Pro	`gemini-1.5-pro-001`
Gemini 1.0 Pro Vision	`gemini-1.0-pro-001` `gemini-1.0-pro-vision-001`
Gemini 1.0 Pro	`gemini-1.0-pro` `gemini-1.0-pro-001` `gemini-1.0-pro-002`

Example syntax

Syntax to generate a model response.

Non-streaming

curl

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/${MODEL_ID}:generateContent \
-d '{
  "contents": [{
    ...
  }],
  "generationConfig": {
    ...
  },
  "safetySettings": {
    ...
  }
  ...
}'

Python

gemini_model = GenerativeModel(MODEL_ID)
generation_config = GenerationConfig(...)

model_response = gemini_model.generate_content([...], generation_config=generation_config, safety_settings={...})

Streaming

curl

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/${MODEL_ID}:streamGenerateContent \
  -d '{
    "contents": [{
      ...
    }],
    "generationConfig": {
      ...
    },
    "safetySettings": {
      ...
    }
    ...
  }'

Python

gemini_model = GenerativeModel(MODEL_ID)
model_response = gemini_model.generate_content([...], generation_config=generation_config, safety_settings={...}, stream=True)

Parameter list

See examples for implementation details.

Request body

{
  "contents": [
    {
      "role": string,
      "parts": [
        {
          // Union field data can be only one of the following:
          "text": string,
          "inlineData": {
            "mimeType": string,
            "data": string
          },
          "fileData": {
            "mimeType": string,
            "fileUri": string
          },
          // End of list of possible types for union field data.

          "videoMetadata": {
            "startOffset": {
              "seconds": integer,
              "nanos": integer
            },
            "endOffset": {
              "seconds": integer,
              "nanos": integer
            }
          }
        }
      ]
    }
  ],
  "systemInstruction": {
    "role": string,
    "parts": [
      {
        "text": string
      }
    ]
  },
  "tools": [
    {
      "functionDeclarations": [
        {
          "name": string,
          "description": string,
          "parameters": {
            object (OpenAPI Object Schema)
          }
        }
      ]
    }
  ],
  "safetySettings": [
    {
      "category": enum (HarmCategory),
      "threshold": enum (HarmBlockThreshold)
    }
  ],
  "generationConfig": {
    "temperature": number,
    "topP": number,
    "topK": number,
    "candidateCount": integer,
    "maxOutputTokens": integer,
    "presencePenalty": float,
    "frequencyPenalty": float,
    "stopSequences": [
      string
    ],
    "responseMimeType": string,
    "responseSchema": schema,
    "seed": integer
  }
}

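The request body contains data with the following parameters:

Parameters
`contents`	Required: `Content` The content of the current conversation with the model. For single-turn queries, this is a single instance. For multi-turn queries, this is a repeated field that contains the conversation history and the latest request.
`systemInstruction`	Optional: `Content` Available for `gemini-1.5-flash`, `gemini-1.5-pro`, and `gemini-1.0-pro-002`. Instructions that steer the model toward better performance. For example, "Answer as concisely as possible" or "Don't use technical terms in your response". The `text` strings count toward the token limit. The `role` field of `systemInstruction` is ignored and doesn't affect the performance of the model. Note: Use only `text` in `parts`, and put the content of each `part` in a separate paragraph.
`tools`	Optional. A piece of code that enables the system to interact with external systems to perform an action, or set of actions, outside of the knowledge and scope of the model. See Function calling.
`toolConfig`	Optional. See Function calling.
`safetySettings`	Optional: `SafetySetting` Per-request settings for blocking unsafe content. Enforced on `GenerateContentResponse.candidates`.
`generationConfig`	Optional: `GenerationConfig` Generation configuration settings.
`cachedContent`	Optional: `CachedContent` Cached content. You can use cached content in requests that contain repeated content.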
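
As a minimal sketch of how these top-level fields fit together, the following Python assembles a request body as a plain dictionary. The helper name `build_request_body` and its arguments are hypothetical and chosen for illustration; only the JSON field names come from the request body shown above.

```python
import json


def build_request_body(prompt, system_instruction=None,
                       temperature=None, max_output_tokens=None):
    """Assemble a generateContent request body as a plain dict.

    Hypothetical helper: only the JSON keys mirror the documented schema.
    """
    body = {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}

    # systemInstruction takes text parts only; its role field is ignored.
    if system_instruction is not None:
        body["systemInstruction"] = {"parts": [{"text": system_instruction}]}

    # Add generationConfig only when at least one setting was supplied.
    generation_config = {}
    if temperature is not None:
        generation_config["temperature"] = temperature
    if max_output_tokens is not None:
        generation_config["maxOutputTokens"] = max_output_tokens
    if generation_config:
        body["generationConfig"] = generation_config

    return body


body = build_request_body(
    "Why is the sky blue?",
    system_instruction="Answer as concisely as possible.",
    temperature=0.2,
)
print(json.dumps(body, indent=2))
```

Optional fields are left out entirely when unset, so the service can apply its own defaults.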

`contents`

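The base structured data type that contains the multi-part content of a message.

This class consists of two main properties: role and parts. The role property denotes the entity that produces the content, and the parts property contains multiple elements, each of which represents a segment of data within a message.

Parameters
`role`	Optional: `string` The identity of the entity that creates the message. The following values are supported: `user`: Indicates that the message was sent by a real person, typically a user-generated message. `model`: Indicates that the message was generated by the model. The `model` value is used to insert messages from the model into the conversation during multi-turn conversations. For non-multi-turn conversations, this field can be left blank or unset.
`parts`	`Part` A list of ordered parts that make up a single message. Different parts may have different IANA MIME types. For limits on the inputs, such as the maximum number of tokens or the maximum number of images, see the model specifications on the Google models page. To compute the number of tokens in your request, see Get token count.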
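
To illustrate how `role` alternates in a multi-turn conversation, here is a small sketch that builds a `contents` list turn by turn. The function `append_turn` and the sample messages are hypothetical; the only assumption taken from this document is that `role` is either `user` or `model`.

```python
def append_turn(contents, role, text):
    """Append one text-only message to a contents list and return the list."""
    if role not in ("user", "model"):
        raise ValueError(f"unsupported role: {role!r}")
    contents.append({"role": role, "parts": [{"text": text}]})
    return contents


# Conversation history followed by the latest user request.
history = []
append_turn(history, "user", "Hello")
append_turn(history, "model", "Hi! How can I help?")
append_turn(history, "user", "Suggest a name for a flower shop.")

for turn in history:
    print(turn["role"], "->", turn["parts"][0]["text"])
```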

`parts`

A data type that contains media that is part of a multi-part Content message.

Parameters
`text`	Optional: `string` A text prompt or code snippet.
`inlineData`	Optional: `Blob` Inline data in raw bytes. For `gemini-1.0-pro-vision`, you can specify at most 1 image by using `inlineData`. To specify up to 16 images, use `fileData`.
`fileData`	Optional: `fileData` Data stored in a file.
`functionCall`	Optional: `FunctionCall` Contains a string that represents the `FunctionDeclaration.name` field and a structured JSON object with the parameters of the function call predicted by the model. See Function calling.
`functionResponse`	Optional: `FunctionResponse` The result output of a `FunctionCall`. It contains a string that represents the `FunctionDeclaration.name` field and a structured JSON object with the output of the function call. It is used as context for the model. See Function calling.
`videoMetadata`	Optional: `VideoMetadata` For video input, the start and end offset of the video in Duration format. For example, to specify a 10-second clip that starts at 1:00, set `"startOffset": { "seconds": 60 }` and `"endOffset": { "seconds": 70 }`. Specify the metadata only when the video data is presented in `inlineData` or `fileData`.
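
The clip example in the `videoMetadata` row can be sketched as follows. The helper `video_clip_part` is hypothetical; the field names follow the request body above, and the Cloud Storage URI is one of the sample videos used later in this document.

```python
def video_clip_part(file_uri, mime_type, start_seconds, end_seconds):
    """Build a part that points at a video file and selects a clip from it."""
    if start_seconds < 0 or end_seconds <= start_seconds:
        raise ValueError("end offset must be greater than a non-negative start offset")
    return {
        "fileData": {"mimeType": mime_type, "fileUri": file_uri},
        "videoMetadata": {
            "startOffset": {"seconds": start_seconds},
            "endOffset": {"seconds": end_seconds},
        },
    }


# A 10-second clip that starts at 1:00.
part = video_clip_part(
    "gs://cloud-samples-data/video/animals.mp4", "video/mp4", 60, 70
)
print(part["videoMetadata"])
```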

`blob`

Content blob. If possible, send the content as text rather than raw bytes.

Parameters
`mimeType`	`string` The media type of the file specified in the `data` or `fileUri` fields. Acceptable values include the following: `application/pdf` `audio/mpeg` `audio/mp3` `audio/wav` `image/png` `image/jpeg` `text/plain` `video/mov` `video/mpeg` `video/mp4` `video/mpg` `video/avi` `video/wmv` `video/mpegps` `video/flv` For `gemini-1.0-pro-vision`, the maximum video length is 2 minutes. For Gemini 1.5 Pro and Gemini 1.5 Flash, the maximum length of an audio file is 8.4 hours and the maximum length of a video file (without audio) is 1 hour. For more information, see Gemini 1.5 Pro media requirements. Text files must be UTF-8 encoded. The contents of a text file count toward the token limit. There is no limit on image resolution.
`data`	`bytes` The base64 encoding of the image, PDF, or video to include inline in the prompt. When you include media inline, you must also specify the media type (`mimeType`) of the data. Size limit: 20MB
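
The sketch below shows one way to turn raw bytes into an `inlineData` part with base64 encoding. The helper name and the size check are assumptions: this document states a 20MB limit but not whether it applies before or after encoding, so the check here measures the raw bytes and treats 20MB as 20 * 1024 * 1024 bytes.

```python
import base64

# Assumption: the documented 20MB limit is applied to the raw bytes.
MAX_INLINE_BYTES = 20 * 1024 * 1024


def inline_data_part(raw, mime_type):
    """Encode raw bytes as base64 and wrap them in an inlineData part."""
    if len(raw) > MAX_INLINE_BYTES:
        raise ValueError("payload exceeds the inline size limit; use fileData instead")
    return {
        "inlineData": {
            "mimeType": mime_type,
            "data": base64.b64encode(raw).decode("ascii"),
        }
    }


part = inline_data_part(b"hello", "text/plain")
print(part)
```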

CachedContent

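Used to update when a context cache expires. When you update CachedContent, you must specify either ttl or expireTime, but you can't specify both. For more information, see Use context caching.

Parameters
`ttl`	`TTL` Specifies the number of seconds and nanoseconds that a context cache lives after it's created or updated, before it expires.
`expireTime`	`Timestamp` A timestamp that specifies when a context cache expires.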

TTL

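The time to live (TTL), or duration, after a context cache is created or updated and before it expires.

Parameters
`seconds`	`float` The seconds component of the duration between the creation of a context cache and its expiration. The default value is 3,600 seconds.
`nano`	Optional: `float` The nanoseconds component of the duration between the creation of a context cache and its expiration.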
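
The rule that an update carries either `ttl` or `expireTime`, but never both, can be checked before a request is sent. The following is a sketch under stated assumptions: the function name is hypothetical, and the shape of the returned dictionary simply mirrors the `seconds` field from the table above rather than a confirmed wire format.

```python
def cache_expiration(ttl_seconds=None, expire_time=None):
    """Return the expiration field for a CachedContent update.

    Exactly one of ttl_seconds or expire_time must be provided.
    """
    if (ttl_seconds is None) == (expire_time is None):
        raise ValueError("specify exactly one of ttl or expireTime")
    if ttl_seconds is not None:
        # Assumed shape, based on the seconds field described above.
        return {"ttl": {"seconds": ttl_seconds}}
    return {"expireTime": expire_time}


print(cache_expiration(ttl_seconds=3600))
print(cache_expiration(expire_time="2030-01-01T00:00:00Z"))
```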

FileData

URI-based data.

Parameters
`mimeType`	`string` The IANA MIME type of the data.
`fileUri`	`string` The Cloud Storage URI of the file to include in the prompt. The bucket object must either be publicly readable or reside in the same Google Cloud project that sends the request. You must also specify the media type (`mimeType`) of the file. For `gemini-1.5-pro` and `gemini-1.5-flash`, the size limit is 2GB. For `gemini-1.0-pro-vision`, the size limit is 20MB.

`functionCall`

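A predicted functionCall returned from the model. It contains a string that represents the functionDeclaration.name and a structured JSON object with the parameters and their values.

Parameters
`name`	`string` The name of the function to call.
`args`	`Struct` The function parameters and values in JSON object format. For parameter details, see Function calling.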

`functionResponse`

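The result output of a FunctionCall. It contains a string that represents the FunctionDeclaration.name and a structured JSON object with the output of the function, which is used as context for the model. It should contain the result of a FunctionCall that was made based on the model's prediction.

Parameters
`name`	`string` The name of the function to call.
`response`	`Struct` The function response in JSON object format.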
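
The relationship between `functionCall` and `functionResponse` can be sketched as a small dispatch step: the model predicts a call, your code runs the matching function, and the result goes back as a part. Everything named here (`run_function_call`, `get_current_weather`, the registry dictionary, and the sample arguments) is hypothetical and only illustrates the two documented structures.

```python
def run_function_call(function_call, registry):
    """Execute a predicted functionCall and wrap the output as a functionResponse part."""
    name = function_call["name"]
    if name not in registry:
        raise KeyError(f"no local function registered for {name!r}")
    result = registry[name](**function_call.get("args", {}))
    return {"functionResponse": {"name": name, "response": result}}


def get_current_weather(location):
    """Hypothetical local function; a real one would query a weather service."""
    return {"location": location, "forecast": "sunny"}


predicted = {"name": "get_current_weather", "args": {"location": "Boston"}}
part = run_function_call(predicted, {"get_current_weather": get_current_weather})
print(part)
```

The returned part would be appended to `contents` in the next request so that the model can use the function output as context.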

`videoMetadata`

Metadata that describes the input video content.

Parameters
`startOffset`	Optional: `google.protobuf.Duration` The start offset of the video.
`endOffset`	Optional: `google.protobuf.Duration` The end offset of the video.

`safetySetting`

Safety settings.

Parameters
`category`	Optional: `HarmCategory` The safety category to configure a threshold for. Acceptable values include the following: `HARM_CATEGORY_SEXUALLY_EXPLICIT` `HARM_CATEGORY_HATE_SPEECH` `HARM_CATEGORY_HARASSMENT` `HARM_CATEGORY_DANGEROUS_CONTENT`
`threshold`	Optional: `HarmBlockThreshold` The threshold for blocking responses that could belong to the specified safety category, based on probability. `BLOCK_NONE` `BLOCK_LOW_AND_ABOVE` `BLOCK_MEDIUM_AND_ABOVE` `BLOCK_ONLY_HIGH`
`method`	Optional: `HarmBlockMethod` Specifies whether the threshold is used for the probability score or the severity score. If not specified, the threshold is used for the probability score.

`harmCategory`

Harm categories that block content.

Parameters
`HARM_CATEGORY_UNSPECIFIED`	The harm category is unspecified.
`HARM_CATEGORY_HATE_SPEECH`	The harm category is hate speech.
`HARM_CATEGORY_DANGEROUS_CONTENT`	The harm category is dangerous content.
`HARM_CATEGORY_HARASSMENT`	The harm category is harassment.
`HARM_CATEGORY_SEXUALLY_EXPLICIT`	The harm category is sexually explicit content.

`harmBlockThreshold`

Probability threshold levels used to block a response.

Parameters
`HARM_BLOCK_THRESHOLD_UNSPECIFIED`	Unspecified harm block threshold.
`BLOCK_LOW_AND_ABOVE`	Block the low threshold and higher (that is, block more).
`BLOCK_MEDIUM_AND_ABOVE`	Block the medium threshold and higher.
`BLOCK_ONLY_HIGH`	Block only the high threshold (that is, block less).
`BLOCK_NONE`	Block none.

`harmBlockMethod`

A probability threshold that blocks a response based on a combination of probability and severity.

Parameters
`HARM_BLOCK_METHOD_UNSPECIFIED`	The harm block method is unspecified.
`SEVERITY`	The harm block method uses both probability and severity scores.
`PROBABILITY`	The harm block method uses the probability score.
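
As a sketch of how the category and threshold enums combine into a `safetySettings` array, the following validates each pair against the values listed above before building the list. The helper name `safety_settings` and the chosen thresholds are illustrative assumptions, not recommended settings.

```python
HARM_CATEGORIES = {
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
}
HARM_BLOCK_THRESHOLDS = {
    "BLOCK_NONE",
    "BLOCK_LOW_AND_ABOVE",
    "BLOCK_MEDIUM_AND_ABOVE",
    "BLOCK_ONLY_HIGH",
}


def safety_settings(threshold_by_category):
    """Turn a {category: threshold} mapping into a safetySettings list."""
    settings = []
    for category, threshold in threshold_by_category.items():
        if category not in HARM_CATEGORIES:
            raise ValueError(f"unknown harm category: {category}")
        if threshold not in HARM_BLOCK_THRESHOLDS:
            raise ValueError(f"unknown block threshold: {threshold}")
        settings.append({"category": category, "threshold": threshold})
    return settings


settings = safety_settings({
    "HARM_CATEGORY_HATE_SPEECH": "BLOCK_LOW_AND_ABOVE",
    "HARM_CATEGORY_HARASSMENT": "BLOCK_ONLY_HIGH",
})
print(settings)
```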

`generationConfig`

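Configuration settings used when generating a response to the prompt.

Parameters
`temperature`	Optional: `float` The temperature is used for sampling during response generation, which occurs when `topP` and `topK` are applied. Temperature controls the degree of randomness in token selection. Lower temperatures are good for prompts that require a less open-ended or creative response, while higher temperatures can lead to more diverse or creative results. A temperature of `0` means that the highest-probability tokens are always selected. In this case, responses for a given prompt are mostly deterministic, but a small amount of variation is still possible. If the model returns a response that's too generic or too short, or if the model gives a fallback response, try increasing the temperature. Range for `gemini-1.5-flash`: `0.0 - 2.0` (default: `1.0`) Range for `gemini-1.5-pro`: `0.0 - 2.0` (default: `1.0`) Range for `gemini-1.0-pro-vision`: `0.0 - 1.0` (default: `0.4`) Range for `gemini-1.0-pro-002`: `0.0 - 2.0` (default: `1.0`) Range for `gemini-1.0-pro-001`: `0.0 - 1.0` (default: `0.9`)
`topP`	Optional: `float` If specified, nucleus sampling is used. Top-P changes how the model selects tokens for output. Tokens are selected from the most probable (see top-K) to the least probable until the sum of their probabilities equals the top-P value. For example, if tokens A, B, and C have probabilities of 0.3, 0.2, and 0.1 and the top-P value is `0.5`, then the model selects either A or B as the next token by using temperature and excludes C as a candidate. Specify a lower value for less random responses and a higher value for more random responses. Range: `0.0 - 1.0` Default for `gemini-1.5-flash`: `0.95` Default for `gemini-1.5-pro`: `0.95` Default for `gemini-1.0-pro`: `1.0` Default for `gemini-1.0-pro-vision`: `1.0`
`topK`	Optional: Top-K changes how the model selects tokens for output. A top-K of `1` means that the next selected token is the most probable among all tokens in the model's vocabulary (also called greedy decoding), while a top-K of `3` means that the next token is selected from among the three most probable tokens by using temperature. For each token selection step, the top-K tokens with the highest probabilities are sampled. Then tokens are further filtered based on top-P, and the final token is selected by using temperature sampling. Specify a lower value for less random responses and a higher value for more random responses. Range: `1-40` Supported by `gemini-1.0-pro-vision` only. Default for `gemini-1.0-pro-vision`: `32`
`candidateCount`	Optional: `int` The number of response variations to return. For each request, you're charged for the output tokens of all candidates, but you're charged only once for the input tokens. Specifying multiple candidates is a preview feature that works with `generateContent` (`streamGenerateContent` is not supported). The following models are supported: Gemini 1.5 Flash: `1`-`8`, default: `1` Gemini 1.5 Pro: `1`-`8`, default: `1`
`maxOutputTokens`	Optional: `int` The maximum number of tokens that can be generated in the response. A token is approximately four characters. 100 tokens correspond to roughly 60 to 80 words. Specify a lower value for shorter responses and a higher value for potentially longer responses.
`stopSequences`	Optional: `List[string]` Specifies a list of strings that tells the model to stop generating text if one of the strings is encountered in the response. If a string appears multiple times in the response, the response is truncated where the string is first encountered. The strings are case-sensitive. For example, if the following is the returned response when `stopSequences` isn't specified: `public static string reverse(string myString)` then the returned response with `stopSequences` set to `["Str", "reverse"]` is: `public static string` The list can contain a maximum of 5 items.
`presencePenalty`	Optional: `float` Positive penalties. Positive values penalize tokens that already appear in the generated text, which increases the probability of generating more diverse content. The maximum value for `presencePenalty` is up to, but not including, `2.0`. Its minimum value is `-2.0`. Supported by `gemini-1.5-pro` and `gemini-1.5-flash` only.
`frequencyPenalty`	Optional: `float` Positive values penalize tokens that repeatedly appear in the generated text, which decreases the probability of repeating content. The maximum value for `frequencyPenalty` is up to, but not including, `2.0`. Its minimum value is `-2.0`. Supported by `gemini-1.5-pro` and `gemini-1.5-flash` only.
`responseMimeType`	Optional: `string (enum)` Available for `gemini-1.5-pro`. The output response MIME type of the generated candidate text. The following MIME types are supported: `application/json`: JSON response in the candidates. `text/plain` (default): Plain text output. `text/x.enum`: For classification tasks, outputs an enum value as defined in the response schema. Specify the appropriate response type to avoid unintended behaviors. For example, if you require a JSON-formatted response, specify `application/json` and not `text/plain`. This is a preview feature.
`responseSchema`	Optional: schema Available for `gemini-1.5-pro`. The schema that the generated candidate text must follow. For more information, see Control generated output. To use this parameter, you must specify the `responseType` or `responseMimeType` field. This is a preview feature.
`seed`	Optional: `int` When the seed is fixed to a specific value, the model makes a best effort to provide the same response for repeated requests. Deterministic output isn't guaranteed. Also, changing the model or parameter settings, such as the temperature, can cause variations in the response even when you use the same seed value. By default, a random seed value is used. This is a preview feature.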
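
The filtering order described for `topK` and `topP`, and the truncation rule for `stopSequences`, can be illustrated with a toy sketch. This is not the service's implementation: the function names are hypothetical, the probabilities are the A/B/C example from the table, and the final temperature-based draw among the surviving tokens is left out.

```python
def filter_candidates(probs, top_k=None, top_p=None):
    """Return the tokens that survive top-K, then top-P filtering, most probable first."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]
    if top_p is None:
        return [token for token, _ in ranked]

    kept, cumulative = [], 0.0
    for token, probability in ranked:
        kept.append(token)
        cumulative += probability
        # Small tolerance so floating-point rounding doesn't admit an extra token.
        if cumulative >= top_p - 1e-9:
            break
    return kept


def truncate_at_stop(text, stop_sequences):
    """Cut text at the earliest case-sensitive occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


probs = {"A": 0.3, "B": 0.2, "C": 0.1}
print(filter_candidates(probs, top_p=0.5))   # A and B remain, C is excluded
print(filter_candidates(probs, top_k=1))     # greedy decoding keeps only A

response = "public static string reverse(string myString)"
print(repr(truncate_at_stop(response, ["Str", "reverse"])))
```

In this toy version the truncated text keeps the space that precedes the matched stop sequence.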

Response body

{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": string
          }
        ]
      },
      "finishReason": enum (FinishReason),
      "safetyRatings": [
        {
          "category": enum (HarmCategory),
          "probability": enum (HarmProbability),
          "blocked": boolean
        }
      ],
      "citationMetadata": {
        "citations": [
          {
            "startIndex": integer,
            "endIndex": integer,
            "uri": string,
            "title": string,
            "license": string,
            "publicationDate": {
              "year": integer,
              "month": integer,
              "day": integer
            }
          }
        ]
      },
      "avgLogprobs": double
    }
  ],
  "usageMetadata": {
    "promptTokenCount": integer,
    "candidatesTokenCount": integer,
    "totalTokenCount": integer
  }
}

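Response element	Description
`text`	The generated text.
`finishReason`	The reason why the model stopped generating tokens. If empty, the model has not stopped generating tokens. Because the response uses the prompt for context, it's not possible to change the behavior of how the model stops generating tokens. `FINISH_REASON_UNSPECIFIED`: The finish reason is unspecified. `FINISH_REASON_STOP`: Natural stop point of the model, or a provided stop sequence. `FINISH_REASON_MAX_TOKENS`: The maximum number of tokens specified in the request was reached. `FINISH_REASON_SAFETY`: Token generation was stopped because the response was flagged for safety reasons. `Candidate.content` is empty if content filters block the output. `FINISH_REASON_RECITATION`: Token generation was stopped because the response was flagged for unauthorized citations. `FINISH_REASON_OTHER`: All other reasons that stopped token generation.
`category`	The safety category to configure a threshold for. Acceptable values include the following: `HARM_CATEGORY_SEXUALLY_EXPLICIT` `HARM_CATEGORY_HATE_SPEECH` `HARM_CATEGORY_HARASSMENT` `HARM_CATEGORY_DANGEROUS_CONTENT`
`probability`	The probability level that the content is harmful. `HARM_PROBABILITY_UNSPECIFIED` `NEGLIGIBLE` `LOW` `MEDIUM` `HIGH`
`blocked`	A boolean flag associated with a safety attribute that indicates whether the model's input or output was blocked.
`startIndex`	An integer that specifies where a citation starts in the `content`.
`endIndex`	An integer that specifies where a citation ends in the `content`.
`uri`	The URL of a citation source. Examples of a URL source include a news website or a GitHub repository.
`title`	The title of a citation source. Examples of source titles include a news article or a book.
`license`	The license associated with a citation.
`publicationDate`	The date a citation was published. Its valid formats are `YYYY`, `YYYY-MM`, and `YYYY-MM-DD`.
`avgLogprobs`	The average log probability of the candidate.
`promptTokenCount`	The number of tokens in the request.
`candidatesTokenCount`	The number of tokens in the response.
`totalTokenCount`	The number of tokens in the request and response.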
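
A response that follows the structure above can be reduced to the fields most callers need. The sketch below assumes the response has already been parsed from JSON into a dictionary; the function name `summarize_response` and the sample values are made up for illustration and are not real model output.

```python
def summarize_response(response):
    """Extract text, finish reason, blocked categories, and token usage from the first candidate."""
    candidate = response["candidates"][0]

    # content can be empty when a content filter blocks the output.
    parts = candidate.get("content", {}).get("parts", [])
    text = "".join(part.get("text", "") for part in parts)

    blocked = [
        rating["category"]
        for rating in candidate.get("safetyRatings", [])
        if rating.get("blocked")
    ]
    usage = response.get("usageMetadata", {})
    return {
        "text": text,
        "finishReason": candidate.get("finishReason"),
        "blockedCategories": blocked,
        "totalTokenCount": usage.get("totalTokenCount"),
    }


# Illustrative response only; the values are invented.
sample = {
    "candidates": [{
        "content": {"parts": [{"text": "Petal "}, {"text": "& Stem"}]},
        "finishReason": "FINISH_REASON_STOP",
        "safetyRatings": [
            {"category": "HARM_CATEGORY_HARASSMENT", "probability": "NEGLIGIBLE", "blocked": False}
        ],
    }],
    "usageMetadata": {"promptTokenCount": 18, "candidatesTokenCount": 4, "totalTokenCount": 22},
}
summary = summarize_response(sample)
print(summary)
```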

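Examples

Non-streaming text response

Generate a non-streaming model response from a text input.

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region that processes the request.
MODEL_ID: The model ID of the model that you want to use (for example, gemini-1.5-flash-001). See the list of supported models.
TEXT: The text instructions to include in the prompt.

HTTP method and URL: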

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent

Request JSON body:

{
  "contents": [{
    "role": "user",
    "parts": [{
      "text": "TEXT"
    }]
  }]
}
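
For scripted use, the URL pattern and the placeholders above can be filled in programmatically. This is a small sketch: `endpoint_url` is a hypothetical helper, and the project ID in the usage line is a placeholder value.

```python
def endpoint_url(project_id, location, model_id, stream=False):
    """Build the generateContent or streamGenerateContent URL for a publisher model."""
    method = "streamGenerateContent" if stream else "generateContent"
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{location}/publishers/google/models/{model_id}:{method}"
    )


url = endpoint_url("my-project", "us-central1", "gemini-1.5-flash-001")
print(url)
```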

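To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you in to the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and run the following command: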

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent"

PowerShell

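Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and run the following command: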

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent" | Select-Object -Expand Content

Python

import vertexai
from vertexai.generative_models import GenerativeModel

# TODO(developer): Update project_id
# PROJECT_ID = "your-project-id"
vertexai.init(project=PROJECT_ID, location="us-central1")

model = GenerativeModel("gemini-1.5-flash-001")

response = model.generate_content(
    "What's a good name for a flower shop that specializes in selling bouquets of dried flowers?"
)

print(response.text)

NodeJS

const {VertexAI} = require('@google-cloud/vertexai');

/**
 * TODO(developer): Update these variables before running the sample.
 */
async function generate_from_text_input(projectId = 'PROJECT_ID') {
  const vertexAI = new VertexAI({project: projectId, location: 'us-central1'});

  const generativeModel = vertexAI.getGenerativeModel({
    model: 'gemini-1.5-flash-001',
  });

  const prompt =
    "What's a good name for a flower shop that specializes in selling bouquets of dried flowers?";

  const resp = await generativeModel.generateContent(prompt);
  const contentResponse = await resp.response;
  console.log(JSON.stringify(contentResponse));
}

Java

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import com.google.cloud.vertexai.generativeai.ResponseHandler;

public class QuestionAnswer {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-google-cloud-project-id";
    String location = "us-central1";
    String modelName = "gemini-1.5-flash-001";

    String output = simpleQuestion(projectId, location, modelName);
    System.out.println(output);
  }

  // Asks a question to the specified Vertex AI Gemini model and returns the generated answer.
  public static String simpleQuestion(String projectId, String location, String modelName)
      throws Exception {
    // Initialize client that will be used to send requests.
    // This client only needs to be created once, and can be reused for multiple requests.
    try (VertexAI vertexAI = new VertexAI(projectId, location)) {
      String output;
      GenerativeModel model = new GenerativeModel(modelName, vertexAI);
      // Send the question to the model for processing.
      GenerateContentResponse response = model.generateContent("Why is the sky blue?");
      // Extract the generated text from the model's response.
      output = ResponseHandler.getText(response);
      return output;
    }
  }
}

Go

import (
	"context"
	"encoding/json"
	"fmt"
	"io"

	"cloud.google.com/go/vertexai/genai"
)

func generateContentFromText(w io.Writer, projectID string) error {
	location := "us-central1"
	modelName := "gemini-1.5-flash-001"

	ctx := context.Background()
	client, err := genai.NewClient(ctx, projectID, location)
	if err != nil {
		return fmt.Errorf("error creating client: %w", err)
	}
	gemini := client.GenerativeModel(modelName)
	prompt := genai.Text(
		"What's a good name for a flower shop that specializes in selling bouquets of dried flowers?")

	resp, err := gemini.GenerateContent(ctx, prompt)
	if err != nil {
		return fmt.Errorf("error generating content: %w", err)
	}
	// See the JSON response in
	// https://pkg.go.dev/cloud.google.com/go/vertexai/genai#GenerateContentResponse.
	rb, err := json.MarshalIndent(resp, "", "  ")
	if err != nil {
		return fmt.Errorf("json.MarshalIndent: %w", err)
	}
	fmt.Fprintln(w, string(rb))
	return nil
}

C#


using Google.Cloud.AIPlatform.V1;
using System;
using System.Threading.Tasks;

public class TextInputSample
{
    public async Task<string> TextInput(
        string projectId = "your-project-id",
        string location = "us-central1",
        string publisher = "google",
        string model = "gemini-1.5-flash-001")
    {

        var predictionServiceClient = new PredictionServiceClientBuilder
        {
            Endpoint = $"{location}-aiplatform.googleapis.com"
        }.Build();
        string prompt = @"What's a good name for a flower shop that specializes in selling bouquets of dried flowers?";

        var generateContentRequest = new GenerateContentRequest
        {
            Model = $"projects/{projectId}/locations/{location}/publishers/{publisher}/models/{model}",
            Contents =
            {
                new Content
                {
                    Role = "USER",
                    Parts =
                    {
                        new Part { Text = prompt }
                    }
                }
            }
        };

        GenerateContentResponse response = await predictionServiceClient.GenerateContentAsync(generateContentRequest);

        string responseText = response.Candidates[0].Content.Parts[0].Text;
        Console.WriteLine(responseText);

        return responseText;
    }
}

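REST (OpenAI)

You can call the Inference API by using the OpenAI library. For more information, see Call Vertex AI models by using the OpenAI library.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region that processes the request.
MODEL_ID: The name of the model to use.

HTTP method and URL: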

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions

Request JSON body:

{
  "model": "google/MODEL_ID",
  "messages": [{
    "role": "user",
    "content": "Write a story about a magic backpack."
  }]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and run the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions"

PowerShell

Save the request body in a file named request.json, and run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions" | Select-Object -Expand Content

Python (OpenAI)

You can call the Inference API by using the OpenAI library. For more information, see Call Vertex AI models by using the OpenAI library.

import vertexai
import openai

from google.auth import default
import google.auth.transport.requests

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "us-central1"

vertexai.init(project=project_id, location=location)

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
auth_request = google.auth.transport.requests.Request()
credentials.refresh(auth_request)

# OpenAI client
client = openai.OpenAI(
    base_url=f"https://{location}-aiplatform.googleapis.com/v1beta1/projects/{project_id}/locations/{location}/endpoints/openapi",
    api_key=credentials.token,
)

response = client.chat.completions.create(
    model="google/gemini-1.5-flash-001",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)

print(response)

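Non-streaming multimodal response

Generate a non-streaming model response from a multimodal input, such as text and an image.

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region that processes the request.
MODEL_ID: The model ID of the model that you want to use (for example, gemini-1.5-flash-001). See the list of supported models.
TEXT: The text instructions to include in the prompt.
FILE_URI: The Cloud Storage URI of the file that stores the data.
MIME_TYPE: The IANA MIME type of the data.

HTTP method and URL: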

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent

Request JSON body:

{
  "contents": [{
    "role": "user",
    "parts": [
      {
        "text": "TEXT"
      },
      {
        "fileData": {
          "fileUri": "FILE_URI",
          "mimeType": "MIME_TYPE"
        }
      }
    ]
  }]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and run the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent"

PowerShell

Save the request body in a file named request.json, and run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent" | Select-Object -Expand Content

Python

import vertexai

from vertexai.generative_models import GenerativeModel, Part

# TODO(developer): Update project_id and location
vertexai.init(project=PROJECT_ID, location="us-central1")

model = GenerativeModel("gemini-1.5-flash-001")

response = model.generate_content(
    [
        Part.from_uri(
            "gs://cloud-samples-data/generative-ai/image/scones.jpg",
            mime_type="image/jpeg",
        ),
        "What is shown in this image?",
    ]
)

print(response.text)

NodeJS

const {VertexAI} = require('@google-cloud/vertexai');

/**
 * TODO(developer): Update these variables before running the sample.
 */
async function createNonStreamingMultipartContent(
  projectId = 'PROJECT_ID',
  location = 'us-central1',
  model = 'gemini-1.5-flash-001',
  image = 'gs://generativeai-downloads/images/scones.jpg',
  mimeType = 'image/jpeg'
) {
  // Initialize Vertex with your Cloud project and location
  const vertexAI = new VertexAI({project: projectId, location: location});

  // Instantiate the model
  const generativeVisionModel = vertexAI.getGenerativeModel({
    model: model,
  });

  // For images, the SDK supports both Google Cloud Storage URI and base64 strings
  const filePart = {
    fileData: {
      fileUri: image,
      mimeType: mimeType,
    },
  };

  const textPart = {
    text: 'what is shown in this image?',
  };

  const request = {
    contents: [{role: 'user', parts: [filePart, textPart]}],
  };

  console.log('Prompt Text:');
  console.log(request.contents[0].parts[1].text);

  console.log('Non-Streaming Response Text:');

  // Generate a response
  const response = await generativeVisionModel.generateContent(request);

  // Select the text from the response
  const fullTextResponse =
    response.response.candidates[0].content.parts[0].text;

  console.log(fullTextResponse);
}

Java

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.generativeai.ContentMaker;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import com.google.cloud.vertexai.generativeai.PartMaker;
import com.google.cloud.vertexai.generativeai.ResponseHandler;

public class Multimodal {
  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-google-cloud-project-id";
    String location = "us-central1";
    String modelName = "gemini-1.5-flash-001";

    String output = nonStreamingMultimodal(projectId, location, modelName);
    System.out.println(output);
  }

  // Ask a simple question and get the response.
  public static String nonStreamingMultimodal(String projectId, String location, String modelName)
      throws Exception {
    // Initialize client that will be used to send requests.
    // This client only needs to be created once, and can be reused for multiple requests.
    try (VertexAI vertexAI = new VertexAI(projectId, location)) {
      GenerativeModel model = new GenerativeModel(modelName, vertexAI);

      String videoUri = "gs://cloud-samples-data/video/animals.mp4";
      String imgUri = "gs://cloud-samples-data/generative-ai/image/character.jpg";

      // Get the response from the model.
      GenerateContentResponse response = model.generateContent(
          ContentMaker.fromMultiModalData(
              PartMaker.fromMimeTypeAndData("video/mp4", videoUri),
              PartMaker.fromMimeTypeAndData("image/jpeg", imgUri),
              "Are this video and image correlated?"
          ));

      // Extract the generated text from the model's response.
      String output = ResponseHandler.getText(response);
      return output;
    }
  }
}

Go

import (
	"context"
	"encoding/json"
	"fmt"
	"io"

	"cloud.google.com/go/vertexai/genai"
)

func tryGemini(w io.Writer, projectID string, location string, modelName string) error {
	// location := "us-central1"
	// modelName := "gemini-1.5-flash-001"

	ctx := context.Background()
	client, err := genai.NewClient(ctx, projectID, location)
	if err != nil {
		return fmt.Errorf("error creating client: %w", err)
	}
	// Release the client's resources when the function returns.
	defer client.Close()
	gemini := client.GenerativeModel(modelName)

	img := genai.FileData{
		MIMEType: "image/jpeg",
		FileURI:  "gs://generativeai-downloads/images/scones.jpg",
	}
	prompt := genai.Text("What is in this image?")

	resp, err := gemini.GenerateContent(ctx, img, prompt)
	if err != nil {
		return fmt.Errorf("error generating content: %w", err)
	}
	rb, err := json.MarshalIndent(resp, "", "  ")
	if err != nil {
		return fmt.Errorf("json.MarshalIndent: %w", err)
	}
	fmt.Fprintln(w, string(rb))
	return nil
}

C#


using Google.Api.Gax.Grpc;
using Google.Cloud.AIPlatform.V1;
using System.Text;
using System.Threading.Tasks;

public class GeminiQuickstart
{
    public async Task<string> GenerateContent(
        string projectId = "your-project-id",
        string location = "us-central1",
        string publisher = "google",
        string model = "gemini-1.5-flash-001"
    )
    {
        // Create client
        var predictionServiceClient = new PredictionServiceClientBuilder
        {
            Endpoint = $"{location}-aiplatform.googleapis.com"
        }.Build();

        // Initialize content request
        var generateContentRequest = new GenerateContentRequest
        {
            Model = $"projects/{projectId}/locations/{location}/publishers/{publisher}/models/{model}",
            GenerationConfig = new GenerationConfig
            {
                Temperature = 0.4f,
                TopP = 1,
                TopK = 32,
                MaxOutputTokens = 2048
            },
            Contents =
            {
                new Content
                {
                    Role = "USER",
                    Parts =
                    {
                        new Part { Text = "What's in this photo?" },
                        new Part { FileData = new() { MimeType = "image/jpeg", FileUri = "gs://generativeai-downloads/images/scones.jpg" } }
                    }
                }
            }
        };

        // Make the request, returning a streaming response
        using PredictionServiceClient.StreamGenerateContentStream response = predictionServiceClient.StreamGenerateContent(generateContentRequest);

        StringBuilder fullText = new();

        // Read streaming responses from server until complete
        AsyncResponseStream<GenerateContentResponse> responseStream = response.GetResponseStream();
        await foreach (GenerateContentResponse responseItem in responseStream)
        {
            fullText.Append(responseItem.Candidates[0].Content.Parts[0].Text);
        }

        return fullText.ToString();
    }
}

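REST (OpenAI)

You can call the Inference API by using the OpenAI library. For more information, see Call Vertex AI models by using the OpenAI library.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region that processes the request.
MODEL_ID: The name of the model to use.

HTTP method and URL: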

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions

Request JSON body:

{
  "model": "google/MODEL_ID",
  "messages": [{
    "role": "user",
    "content": [
       {
          "type": "text",
          "text": "Describe the following image:"
       },
       {
          "type": "image_url",
          "image_url": {
             "url": "gs://generativeai-downloads/images/character.jpg"
          }
       }
     ]
  }]
}
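
The URL template and request body above can also be assembled in code. The following is a minimal sketch that uses only the Python standard library and does not send anything; the helper names `chat_completions_url` and `image_chat_body` and the project ID `my-project` are hypothetical and chosen only for illustration.

```python
import json


def chat_completions_url(project_id: str, location: str) -> str:
    """Fill PROJECT_ID and LOCATION into the endpoint template shown above."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1beta1/"
        f"projects/{project_id}/locations/{location}/"
        "endpoints/openapi/chat/completions"
    )


def image_chat_body(model_id: str, text: str, image_uri: str) -> dict:
    """Build a body with one text part and one image_url part in a user turn."""
    return {
        "model": f"google/{model_id}",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": text},
                    {"type": "image_url", "image_url": {"url": image_uri}},
                ],
            }
        ],
    }


if __name__ == "__main__":
    # "my-project" is a placeholder project ID.
    print(chat_completions_url("my-project", "us-central1"))
    body = image_chat_body(
        "gemini-1.5-flash-001",
        "Describe the following image:",
        "gs://generativeai-downloads/images/character.jpg",
    )
    print(json.dumps(body, indent=2))
```

Keeping the URL and the body in separate functions makes it easy to reuse the same body with a different region or project.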

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and run the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions"

PowerShell

Save the request body in a file named request.json, and run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions" | Select-Object -Expand Content

Python (OpenAI)

You can call the Inference API by using the OpenAI library. For more information, see Call Vertex AI models by using the OpenAI library.

import vertexai
import openai

# Import the transport submodule explicitly; importing the package alone does not load it.
import google.auth.transport.requests
from google.auth import default

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "us-central1"

vertexai.init(project=project_id, location=location)

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
auth_request = google.auth.transport.requests.Request()
credentials.refresh(auth_request)

# OpenAI Client
client = openai.OpenAI(
    base_url=f"https://{location}-aiplatform.googleapis.com/v1beta1/projects/{project_id}/locations/{location}/endpoints/openapi",
    api_key=credentials.token,
)

response = client.chat.completions.create(
    model="google/gemini-1.5-flash-001",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe the following image:"},
                {
                    "type": "image_url",
                    "image_url": {"url": "gs://cloud-samples-data/generative-ai/image/scones.jpg"},
                },
            ],
        }
    ],
)

print(response)

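Streaming text response

Generate a streaming model response from a text input.

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region that processes the request.
MODEL_ID: The ID of the model to use (for example, gemini-1.5-flash-001). See the list of supported models.
TEXT: The text instructions to include in the prompt.

HTTP method and URL: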

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:streamGenerateContent

Request JSON body:

{
  "contents": [{
    "role": "user",
    "parts": [{
      "text": "TEXT"
    }]
  }]
}
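
The request file that the commands in this section read can be generated rather than typed by hand. The sketch below, which is an illustration and not part of the original samples, writes the body above to request.json as UTF-8; `text_request_body` and `write_request_file` are hypothetical helper names.

```python
import json
from pathlib import Path


def text_request_body(text: str) -> dict:
    """Return a body with a single user turn that holds one text part."""
    return {"contents": [{"role": "user", "parts": [{"text": text}]}]}


def write_request_file(text: str, path: str = "request.json") -> Path:
    """Serialize the body as UTF-8 JSON so a command can read it from disk."""
    target = Path(path)
    target.write_text(
        # ensure_ascii=False keeps non-ASCII prompt text readable in the file.
        json.dumps(text_request_body(text), ensure_ascii=False, indent=2),
        encoding="utf-8",
    )
    return target


if __name__ == "__main__":
    print(write_request_file("Write a story about a magic backpack."))
```

Writing the file with an explicit UTF-8 encoding matches the `charset=utf-8` content type used in the request headers.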

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and run the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:streamGenerateContent"

PowerShell

Save the request body in a file named request.json, and run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:streamGenerateContent" | Select-Object -Expand Content

Python

import vertexai

from vertexai.generative_models import GenerativeModel

# TODO(developer): Set the following variables and un-comment the lines below
# PROJECT_ID = "your-project-id"
# MODEL_ID = "gemini-1.5-flash-001"

vertexai.init(project=PROJECT_ID, location="us-central1")

model = GenerativeModel(MODEL_ID)
responses = model.generate_content(
    "Write a story about a magic backpack.", stream=True
)

for response in responses:
    print(response.text)

NodeJS

const {VertexAI} = require('@google-cloud/vertexai');

/**
 * TODO(developer): Update these variables before running the sample.
 */
const PROJECT_ID = process.env.CAIP_PROJECT_ID;
const LOCATION = process.env.LOCATION;
const MODEL = 'gemini-1.5-flash-001';

async function generateContent() {
  // Initialize Vertex with your Cloud project and location
  const vertexAI = new VertexAI({project: PROJECT_ID, location: LOCATION});

  // Instantiate the model
  const generativeModel = vertexAI.getGenerativeModel({
    model: MODEL,
  });

  const request = {
    contents: [
      {
        role: 'user',
        parts: [
          {
            text: 'Write a story about a magic backpack.',
          },
        ],
      },
    ],
  };

  console.log(JSON.stringify(request));

  const result = await generativeModel.generateContentStream(request);
  for await (const item of result.stream) {
    console.log(item.candidates[0].content.parts[0].text);
  }
}

Java

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.generativeai.GenerativeModel;

public class StreamingQuestionAnswer {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-google-cloud-project-id";
    String location = "us-central1";
    String modelName = "gemini-1.5-flash-001";

    streamingQuestion(projectId, location, modelName);
  }

  // Ask a simple question and get the response via streaming.
  public static void streamingQuestion(String projectId, String location, String modelName)
      throws Exception {
    // Initialize client that will be used to send requests.
    // This client only needs to be created once, and can be reused for multiple requests.
    try (VertexAI vertexAI = new VertexAI(projectId, location)) {
      GenerativeModel model = new GenerativeModel(modelName, vertexAI);

      // Stream the result.
      model.generateContentStream("Write a story about a magic backpack.")
          .stream()
          .forEach(System.out::println);

      System.out.println("Streaming complete.");
    }
  }
}

Go

import (
	"context"
	"errors"
	"fmt"
	"io"

	"cloud.google.com/go/vertexai/genai"
	"google.golang.org/api/iterator"
)

// generateContent shows how to send a basic streaming text prompt, writing
// the response to the provided io.Writer.
func generateContent(w io.Writer, projectID, modelName string) error {
	ctx := context.Background()

	client, err := genai.NewClient(ctx, projectID, "us-central1")
	if err != nil {
		return fmt.Errorf("unable to create client: %w", err)
	}
	defer client.Close()

	model := client.GenerativeModel(modelName)

	iter := model.GenerateContentStream(
		ctx,
		genai.Text("Write a story about a magic backpack."),
	)
	for {
		resp, err := iter.Next()
		if err == iterator.Done {
			return nil
		}
		// Check the error before touching resp, which is nil when Next fails.
		if err != nil {
			return err
		}
		if len(resp.Candidates) == 0 || len(resp.Candidates[0].Content.Parts) == 0 {
			return errors.New("empty response from model")
		}
		fmt.Fprint(w, "generated response: ")
		for _, c := range resp.Candidates {
			for _, p := range c.Content.Parts {
				fmt.Fprintf(w, "%s ", p)
			}
		}
	}
}

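REST (OpenAI)

You can call the Inference API by using the OpenAI library. For more information, see Call Vertex AI models by using the OpenAI library.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region that processes the request.
MODEL_ID: The name of the model to use.

HTTP method and URL: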

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions

Request JSON body:

{
  "model": "google/MODEL_ID",
  "stream": true,
  "messages": [{
    "role": "user",
    "content": "Write a story about a magic backpack."
  }]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and run the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions"

PowerShell

Save the request body in a file named request.json, and run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions" | Select-Object -Expand Content

Python (OpenAI)

You can call the Inference API by using the OpenAI library. For more information, see Call Vertex AI models by using the OpenAI library.

import vertexai
import openai

# Import the transport submodule explicitly; importing the package alone does not load it.
import google.auth.transport.requests
from google.auth import default

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "us-central1"

vertexai.init(project=project_id, location=location)

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
auth_request = google.auth.transport.requests.Request()
credentials.refresh(auth_request)

# OpenAI Client
client = openai.OpenAI(
    base_url=f"https://{location}-aiplatform.googleapis.com/v1beta1/projects/{project_id}/locations/{location}/endpoints/openapi",
    api_key=credentials.token,
)

response = client.chat.completions.create(
    model="google/gemini-1.5-flash-001",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    stream=True,
)
for chunk in response:
    print(chunk)

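Streaming multimodal response

Generate a streaming model response from a multimodal input, such as text and an image.

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region that processes the request.
MODEL_ID: The ID of the model to use (for example, gemini-1.5-flash-001). See the list of supported models.
TEXT: The text instructions to include in the prompt.
FILE_URI1: The Cloud Storage URI of the file that stores the data.
MIME_TYPE1: The IANA MIME type of the data.
FILE_URI2: The Cloud Storage URI of the file that stores the data.
MIME_TYPE2: The IANA MIME type of the data.

HTTP method and URL: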

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:streamGenerateContent

Request JSON body:

{
  "contents": [{
    "role": "user",
    "parts": [
      {
        "text": "TEXT"
      },
      {
        "fileData": {
          "fileUri": "FILE_URI1",
          "mimeType": "MIME_TYPE1"
        }
      },
      {
        "fileData": {
          "fileUri": "FILE_URI2",
          "mimeType": "MIME_TYPE2"
        }
      }
    ]
  }]
}
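
Each file part in the body above pairs a Cloud Storage URI with its MIME type. As a rough sketch, the body can be built from a list of URIs, with the MIME type guessed from the file extension through the standard `mimetypes` module. The helper names `file_part` and `multimodal_body` are hypothetical, and guessing from the extension is an assumption made for convenience; pass the type explicitly when the extension is missing or misleading.

```python
import mimetypes
from typing import Optional


def file_part(file_uri: str, mime_type: Optional[str] = None) -> dict:
    """Build one fileData part, guessing the MIME type if it is omitted."""
    if mime_type is None:
        # Guess from the object name only (for example "animals.mp4").
        mime_type, _ = mimetypes.guess_type(file_uri.rsplit("/", 1)[-1])
        if mime_type is None:
            raise ValueError(f"cannot infer MIME type for {file_uri!r}")
    return {"fileData": {"fileUri": file_uri, "mimeType": mime_type}}


def multimodal_body(text: str, *file_uris: str) -> dict:
    """Return a body with the text part first, then one part per file."""
    parts = [{"text": text}]
    parts.extend(file_part(uri) for uri in file_uris)
    return {"contents": [{"role": "user", "parts": parts}]}


if __name__ == "__main__":
    body = multimodal_body(
        "Are these video and image correlated?",
        "gs://cloud-samples-data/generative-ai/video/animals.mp4",
        "gs://cloud-samples-data/generative-ai/image/character.jpg",
    )
    for part in body["contents"][0]["parts"]:
        print(part)
```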

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and run the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:streamGenerateContent"

PowerShell

Save the request body in a file named request.json, and run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:streamGenerateContent" | Select-Object -Expand Content
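
Once the streamed response has been saved, the generated text still has to be stitched together from its chunks. The sketch below assumes that the response body is a JSON array of response objects, each following the `candidates[0].content.parts[].text` layout that the SDK samples in this document read. That wire format is an assumption here, so check it against a real response. `stream_text` is a hypothetical helper name.

```python
import json


def stream_text(response_body: str) -> str:
    """Join the text parts of the first candidate across all streamed chunks."""
    pieces = []
    for chunk in json.loads(response_body):
        candidates = chunk.get("candidates") or []
        if not candidates:
            continue  # Some chunks may carry metadata only.
        for part in candidates[0].get("content", {}).get("parts", []):
            pieces.append(part.get("text", ""))
    return "".join(pieces)


if __name__ == "__main__":
    # Hand-written sample data, not real model output.
    sample = json.dumps(
        [
            {"candidates": [{"content": {"parts": [{"text": "They are "}]}}]},
            {"candidates": [{"content": {"parts": [{"text": "related."}]}}]},
        ]
    )
    print(stream_text(sample))
```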

Python

import vertexai

from vertexai.generative_models import GenerativeModel, Part

# TODO(developer): Set the following variables and un-comment the lines below
# PROJECT_ID = "your-project-id"
# MODEL_ID = "gemini-1.5-flash-001"

vertexai.init(project=PROJECT_ID, location="us-central1")

model = GenerativeModel(MODEL_ID)
responses = model.generate_content(
    [
        Part.from_uri(
            "gs://cloud-samples-data/generative-ai/video/animals.mp4", "video/mp4"
        ),
        Part.from_uri(
            "gs://cloud-samples-data/generative-ai/image/character.jpg",
            "image/jpeg",
        ),
        "Are these video and image correlated?",
    ],
    stream=True,
)

for response in responses:
    print(response)

NodeJS

const {VertexAI} = require('@google-cloud/vertexai');

/**
 * TODO(developer): Update these variables before running the sample.
 */
const PROJECT_ID = process.env.CAIP_PROJECT_ID;
const LOCATION = process.env.LOCATION;
const MODEL = 'gemini-1.5-flash-001';

async function generateContent() {
  // Initialize Vertex AI
  const vertexAI = new VertexAI({project: PROJECT_ID, location: LOCATION});
  const generativeModel = vertexAI.getGenerativeModel({model: MODEL});

  const request = {
    contents: [
      {
        role: 'user',
        parts: [
          {
            file_data: {
              file_uri: 'gs://cloud-samples-data/video/animals.mp4',
              mime_type: 'video/mp4',
            },
          },
          {
            file_data: {
              file_uri:
                'gs://cloud-samples-data/generative-ai/image/character.jpg',
              mime_type: 'image/jpeg',
            },
          },
          {text: 'Are this video and this image correlated?'},
        ],
      },
    ],
  };

  const result = await generativeModel.generateContentStream(request);

  for await (const item of result.stream) {
    console.log(item.candidates[0].content.parts[0].text);
  }
}

Java

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.generativeai.ContentMaker;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import com.google.cloud.vertexai.generativeai.PartMaker;

public class StreamingMultimodal {
  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-google-cloud-project-id";
    String location = "us-central1";
    String modelName = "gemini-1.5-flash-001";

    streamingMultimodal(projectId, location, modelName);
  }

  // Ask a simple question and get the response via streaming.
  public static void streamingMultimodal(String projectId, String location, String modelName)
      throws Exception {
    // Initialize client that will be used to send requests.
    // This client only needs to be created once, and can be reused for multiple requests.
    try (VertexAI vertexAI = new VertexAI(projectId, location)) {
      GenerativeModel model = new GenerativeModel(modelName, vertexAI);

      String videoUri = "gs://cloud-samples-data/video/animals.mp4";
      String imgUri = "gs://cloud-samples-data/generative-ai/image/character.jpg";

      // Stream the result.
      model.generateContentStream(
          ContentMaker.fromMultiModalData(
              PartMaker.fromMimeTypeAndData("video/mp4", videoUri),
              PartMaker.fromMimeTypeAndData("image/jpeg", imgUri),
              "Are this video and this image correlated?"
          ))
          .stream()
          .forEach(System.out::println);
    }
  }
}

Go

import (
	"context"
	"errors"
	"fmt"
	"io"

	"cloud.google.com/go/vertexai/genai"
	"google.golang.org/api/iterator"
)

func generateContent(w io.Writer, projectID, modelName string) error {
	ctx := context.Background()

	client, err := genai.NewClient(ctx, projectID, "us-central1")
	if err != nil {
		return fmt.Errorf("unable to create client: %w", err)
	}
	defer client.Close()

	model := client.GenerativeModel(modelName)
	iter := model.GenerateContentStream(
		ctx,
		genai.FileData{
			MIMEType: "video/mp4",
			FileURI:  "gs://cloud-samples-data/generative-ai/video/animals.mp4",
		},
		genai.FileData{
			MIMEType: "image/jpeg",
			FileURI:  "gs://cloud-samples-data/generative-ai/image/character.jpg",
		},
		genai.Text("Are these video and image correlated?"),
	)
	for {
		resp, err := iter.Next()
		if err == iterator.Done {
			return nil
		}
		// Check the error before touching resp, which is nil when Next fails.
		if err != nil {
			return err
		}
		if len(resp.Candidates) == 0 || len(resp.Candidates[0].Content.Parts) == 0 {
			return errors.New("empty response from model")
		}

		fmt.Fprint(w, "generated response: ")
		for _, c := range resp.Candidates {
			for _, p := range c.Content.Parts {
				fmt.Fprintf(w, "%s ", p)
			}
		}
		fmt.Fprint(w, "\n")
	}
}

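REST (OpenAI)

You can call the Inference API by using the OpenAI library. For more information, see Call Vertex AI models by using the OpenAI library.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region that processes the request.
MODEL_ID: The name of the model to use.

HTTP method and URL: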

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions

Request JSON body:

{
  "model": "google/MODEL_ID",
  "stream": true,
  "messages": [{
    "role": "user",
    "content": [
       {
          "type": "text",
          "text": "Describe the following image:"
       },
       {
          "type": "image_url",
          "image_url": {
             "url": "gs://generativeai-downloads/images/character.jpg"
          }
       }
     ]
  }]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and run the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions"

PowerShell

Save the request body in a file named request.json, and run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions" | Select-Object -Expand Content

Python (OpenAI)

You can call the Inference API by using the OpenAI library. For more information, see Call Vertex AI models by using the OpenAI library.

import vertexai
import openai

# Import the transport submodule explicitly; importing the package alone does not load it.
import google.auth.transport.requests
from google.auth import default

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "us-central1"

vertexai.init(project=project_id, location=location)

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
auth_request = google.auth.transport.requests.Request()
credentials.refresh(auth_request)

# OpenAI Client
client = openai.OpenAI(
    base_url=f"https://{location}-aiplatform.googleapis.com/v1beta1/projects/{project_id}/locations/{location}/endpoints/openapi",
    api_key=credentials.token,
)

response = client.chat.completions.create(
    model="google/gemini-1.5-flash-001",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe the following image:"},
                {
                    "type": "image_url",
                    "image_url": {"url": "gs://cloud-samples-data/generative-ai/image/scones.jpg"},
                },
            ],
        }
    ],
    stream=True,
)
for chunk in response:
    print(chunk)

Model versions

To use the auto-updated version, specify the model name without the trailing version number, for example gemini-1.5-flash instead of gemini-1.5-flash-001.
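
As a small illustration of that naming rule, the following sketch derives the auto-updated name from a pinned model ID. It assumes that the version suffix is always a hyphen followed by three digits, as in the model IDs listed in this document; `auto_updated_name` is a hypothetical helper, not part of any SDK.

```python
import re

# Assumption: pinned versions end in "-" plus exactly three digits (e.g. "-001").
_VERSION_SUFFIX = re.compile(r"-\d{3}$")


def auto_updated_name(model_id: str) -> str:
    """Drop a trailing three-digit version suffix, if one is present."""
    return _VERSION_SUFFIX.sub("", model_id)


if __name__ == "__main__":
    print(auto_updated_name("gemini-1.5-flash-001"))  # gemini-1.5-flash
    print(auto_updated_name("gemini-1.5-flash"))  # gemini-1.5-flash
```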

For more information, see Gemini model versions and lifecycle.

What's next

Learn more about the Gemini API.
Learn more about function calling.
Learn more about grounding responses for Gemini models.