This quickstart shows you how to install the Google Gen AI SDK for your language of choice and then make your first API request. It covers the following topics:
- Install the SDK and set up your environment: Set up a development environment for Python, Go, Node.js, Java, or REST.
- Make your first request: Send a text prompt with the `generateContent` method and receive a response.
- Generate an image: Generate an image from a descriptive text prompt.
- Image understanding: Provide an image as input to the model and ask questions about it.
- Code execution: Let the model generate and run Python code to solve a problem.
Choose an authentication method:

| Authentication method | Description | When to use |
| --- | --- | --- |
| API key | A simple encrypted string that you can use to call the Gemini API in Vertex AI. | Best for quick prototyping and development when you don't need access to Google Cloud resources. |
| Application Default Credentials (ADC) | A strategy that automatically finds credentials based on the application environment, with no changes to application code. | Recommended for most production applications, especially those running on Google Cloud, because it provides more robust and secure authentication. |
Before you begin
Concepts
- API key: A simple encrypted string that identifies your project when calling the API; suitable for quick prototyping.
- Application Default Credentials (ADC): A method that automatically finds and uses service account credentials in Google Cloud environments; recommended for production applications.
Prerequisites
Configure Application Default Credentials, if you haven't already.
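For local development, a common way to configure ADC is through the gcloud CLI (this assumes the gcloud CLI is installed; on Google Cloud services such as Compute Engine, ADC typically discovers the attached service account automatically):

```shell
# Opens a browser window to authenticate your user account and stores
# local credentials that client libraries discover automatically.
gcloud auth application-default login
```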
The following diagram summarizes the overall workflow:
Install the SDK and set up your environment
On your local machine, click one of the following tabs to install the SDK for your programming language.
Gen AI SDK for Python
Run the following command to install and update the Gen AI SDK for Python.
pip install --upgrade google-genai
Set environment variables:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
Gen AI SDK for Go
Run the following command to install and update the Gen AI SDK for Go.
go get google.golang.org/genai
Set environment variables:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
Gen AI SDK for Node.js
Run the following command to install and update the Gen AI SDK for Node.js.
npm install @google/genai
Set environment variables:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
Gen AI SDK for Java
To install the Gen AI SDK for Java, add the following dependency to your Maven pom.xml file:
<dependencies>
  <dependency>
    <groupId>com.google.genai</groupId>
    <artifactId>google-genai</artifactId>
    <version>0.7.0</version>
  </dependency>
</dependencies>
Set environment variables:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
REST
Set environment variables:
GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
GOOGLE_CLOUD_LOCATION=global
API_ENDPOINT=YOUR_API_ENDPOINT
MODEL_ID="gemini-2.5-flash"
GENERATE_CONTENT_API="generateContent"
Make your first request
Use the `generateContent` method to send a request to the Gemini API in Vertex AI:
Python
from google import genai
from google.genai.types import HttpOptions

client = genai.Client(http_options=HttpOptions(api_version="v1"))
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="How does AI work?",
)
print(response.text)
# Example response:
# Okay, let's break down how AI works. It's a broad field, so I'll focus on the ...
#
# Here's a simplified overview:
# ...
Go
import (
	"context"
	"fmt"
	"io"

	"google.golang.org/genai"
)

// generateWithText shows how to generate text using a text prompt.
func generateWithText(w io.Writer) error {
	ctx := context.Background()
	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		HTTPOptions: genai.HTTPOptions{APIVersion: "v1"},
	})
	if err != nil {
		return fmt.Errorf("failed to create genai client: %w", err)
	}

	resp, err := client.Models.GenerateContent(ctx,
		"gemini-2.5-flash",
		genai.Text("How does AI work?"),
		nil,
	)
	if err != nil {
		return fmt.Errorf("failed to generate content: %w", err)
	}

	respText := resp.Text()
	fmt.Fprintln(w, respText)

	// Example response:
	// That's a great question! Understanding how AI works can feel like ...
	// ...
	// **1. The Foundation: Data and Algorithms**
	// ...

	return nil
}
Node.js
const {GoogleGenAI} = require('@google/genai');

const GOOGLE_CLOUD_PROJECT = process.env.GOOGLE_CLOUD_PROJECT;
const GOOGLE_CLOUD_LOCATION = process.env.GOOGLE_CLOUD_LOCATION || 'global';

async function generateContent(
  projectId = GOOGLE_CLOUD_PROJECT,
  location = GOOGLE_CLOUD_LOCATION
) {
  const ai = new GoogleGenAI({
    vertexai: true,
    project: projectId,
    location: location,
  });

  const response = await ai.models.generateContent({
    model: 'gemini-2.0-flash',
    contents: 'How does AI work?',
  });

  console.log(response.text);
  return response.text;
}
Java
import com.google.genai.Client;
import com.google.genai.types.Content;
import com.google.genai.types.GenerateContentResponse;
import com.google.genai.types.HttpOptions;
import com.google.genai.types.Part;

public class TextGenerationWithText {

  public static void main(String[] args) {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-2.5-flash";
    generateContent(modelId);
  }

  // Generates text with text input
  public static String generateContent(String modelId) {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (Client client =
        Client.builder()
            .location("global")
            .vertexAI(true)
            .httpOptions(HttpOptions.builder().apiVersion("v1").build())
            .build()) {

      GenerateContentResponse response =
          client.models.generateContent(modelId, "How does AI work?", null);

      System.out.print(response.text());
      // Example response:
      // Okay, let's break down how AI works. It's a broad field, so I'll focus on the ...
      //
      // Here's a simplified overview:
      // ...
      return response.text();
    }
  }
}
REST
To send this prompt request, run the curl command from the command line, or include the REST call in your application.
curl \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://${API_ENDPOINT}/v1/projects/${GOOGLE_CLOUD_PROJECT}/locations/${GOOGLE_CLOUD_LOCATION}/publishers/google/models/${MODEL_ID}:${GENERATE_CONTENT_API}" \
  -d $'{
    "contents": {
      "role": "user",
      "parts": {
        "text": "Explain how AI works in a few words"
      }
    }
  }'
The model returns a response. Note that the response is generated in multiple parts, and each part is evaluated separately for safety.
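The multi-part structure of a reply can be illustrated without a live API call. The following sketch walks a hand-constructed payload shaped like a generateContent REST response (the content below is invented for illustration, not real model output) and joins its text parts:

```python
# Offline sketch: walking the parts of a generateContent-style response.
# The payload is hand-written for illustration, not real model output.
response = {
    "candidates": [
        {
            "content": {
                "role": "model",
                "parts": [
                    {"text": "AI learns patterns from data"},
                    {"text": " and uses them to make predictions."},
                ],
            },
            "safetyRatings": [
                {"category": "HARM_CATEGORY_HARASSMENT", "probability": "NEGLIGIBLE"},
            ],
        }
    ]
}

# Each part carries its own safety evaluation; concatenate the text parts
# to recover the full reply.
parts = response["candidates"][0]["content"]["parts"]
full_text = "".join(p["text"] for p in parts)
print(full_text)
```

In the Python SDK, the `response.text` convenience accessor performs a similar concatenation for you.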
Generate an image
Gemini can generate and process images conversationally. You can prompt Gemini with text, images, or both to perform a variety of image-related tasks, such as image generation and editing. The following code demonstrates generating an image from a descriptive prompt:
You must include `responseModalities: ["TEXT", "IMAGE"]` in your configuration. These models don't support image-only output.
Python
from google import genai
from google.genai.types import GenerateContentConfig, Modality
from PIL import Image
from io import BytesIO

client = genai.Client()
response = client.models.generate_content(
    model="gemini-2.0-flash-preview-image-generation",
    contents=(
        "Generate an image of the Eiffel tower with fireworks in the background."
    ),
    config=GenerateContentConfig(response_modalities=[Modality.TEXT, Modality.IMAGE]),
)
for part in response.candidates[0].content.parts:
    if part.text:
        print(part.text)
    elif part.inline_data:
        image = Image.open(BytesIO(part.inline_data.data))
        image.save("example-image.png")
# Example response:
# A beautiful photograph captures the iconic Eiffel Tower in Paris, France,
# against a backdrop of a vibrant and dynamic fireworks display. The tower itself...
Node.js
const fs = require('fs');
const {GoogleGenAI, Modality} = require('@google/genai');

const GOOGLE_CLOUD_PROJECT = process.env.GOOGLE_CLOUD_PROJECT;
const GOOGLE_CLOUD_LOCATION =
  process.env.GOOGLE_CLOUD_LOCATION || 'us-central1';

async function generateContent(
  projectId = GOOGLE_CLOUD_PROJECT,
  location = GOOGLE_CLOUD_LOCATION
) {
  const ai = new GoogleGenAI({
    vertexai: true,
    project: projectId,
    location: location,
  });

  const response = await ai.models.generateContentStream({
    model: 'gemini-2.0-flash-exp',
    contents:
      'Generate an image of the Eiffel tower with fireworks in the background.',
    config: {
      responseModalities: [Modality.TEXT, Modality.IMAGE],
    },
  });

  const generatedFileNames = [];
  let imageIndex = 0;
  for await (const chunk of response) {
    const text = chunk.text;
    const data = chunk.data;
    if (text) {
      console.debug(text);
    } else if (data) {
      const fileName = `generate_content_streaming_image_${imageIndex++}.png`;
      console.debug(`Writing response image to file: ${fileName}.`);
      try {
        fs.writeFileSync(fileName, data);
        generatedFileNames.push(fileName);
      } catch (error) {
        console.error(`Failed to write image file ${fileName}:`, error);
      }
    }
  }

  return generatedFileNames;
}
Image understanding
Gemini can also understand images. The following code takes the image generated in the previous section and uses a different model to infer information about it:
Python
from google import genai
from google.genai.types import HttpOptions, Part

client = genai.Client(http_options=HttpOptions(api_version="v1"))
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=[
        "What is shown in this image?",
        Part.from_uri(
            file_uri="gs://cloud-samples-data/generative-ai/image/scones.jpg",
            mime_type="image/jpeg",
        ),
    ],
)
print(response.text)
# Example response:
# The image shows a flat lay of blueberry scones arranged on parchment paper. There are ...
Go
import (
	"context"
	"fmt"
	"io"

	genai "google.golang.org/genai"
)

// generateWithTextImage shows how to generate text using both text and image input
func generateWithTextImage(w io.Writer) error {
	ctx := context.Background()
	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		HTTPOptions: genai.HTTPOptions{APIVersion: "v1"},
	})
	if err != nil {
		return fmt.Errorf("failed to create genai client: %w", err)
	}

	modelName := "gemini-2.5-flash"
	contents := []*genai.Content{
		{
			Parts: []*genai.Part{
				{Text: "What is shown in this image?"},
				{FileData: &genai.FileData{
					// Image source: https://storage.googleapis.com/cloud-samples-data/generative-ai/image/scones.jpg
					FileURI:  "gs://cloud-samples-data/generative-ai/image/scones.jpg",
					MIMEType: "image/jpeg",
				}},
			},
			Role: "user",
		},
	}

	resp, err := client.Models.GenerateContent(ctx, modelName, contents, nil)
	if err != nil {
		return fmt.Errorf("failed to generate content: %w", err)
	}

	respText := resp.Text()
	fmt.Fprintln(w, respText)

	// Example response:
	// The image shows an overhead shot of a rustic, artistic arrangement on a surface that ...

	return nil
}
Node.js
const {GoogleGenAI} = require('@google/genai');

const GOOGLE_CLOUD_PROJECT = process.env.GOOGLE_CLOUD_PROJECT;
const GOOGLE_CLOUD_LOCATION = process.env.GOOGLE_CLOUD_LOCATION || 'global';

async function generateContent(
  projectId = GOOGLE_CLOUD_PROJECT,
  location = GOOGLE_CLOUD_LOCATION
) {
  const ai = new GoogleGenAI({
    vertexai: true,
    project: projectId,
    location: location,
  });

  const image = {
    fileData: {
      fileUri: 'gs://cloud-samples-data/generative-ai/image/scones.jpg',
      mimeType: 'image/jpeg',
    },
  };

  const response = await ai.models.generateContent({
    model: 'gemini-2.5-flash',
    contents: [image, 'What is shown in this image?'],
  });

  console.log(response.text);
  return response.text;
}
Java
import com.google.genai.Client;
import com.google.genai.types.Content;
import com.google.genai.types.GenerateContentResponse;
import com.google.genai.types.HttpOptions;
import com.google.genai.types.Part;

public class TextGenerationWithTextAndImage {

  public static void main(String[] args) {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-2.5-flash";
    generateContent(modelId);
  }

  // Generates text with text and image input
  public static String generateContent(String modelId) {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (Client client =
        Client.builder()
            .location("global")
            .vertexAI(true)
            .httpOptions(HttpOptions.builder().apiVersion("v1").build())
            .build()) {

      GenerateContentResponse response =
          client.models.generateContent(
              modelId,
              Content.fromParts(
                  Part.fromText("What is shown in this image?"),
                  Part.fromUri(
                      "gs://cloud-samples-data/generative-ai/image/scones.jpg", "image/jpeg")),
              null);

      System.out.print(response.text());
      // Example response:
      // The image shows a flat lay of blueberry scones arranged on parchment paper. There are ...
      return response.text();
    }
  }
}
Code execution
The code execution feature of the Gemini API in Vertex AI enables the model to generate and run Python code and to learn iteratively from the results until it arrives at a final output. Vertex AI provides code execution as a tool, similar to function calling. You can use this capability to build applications that benefit from code-based reasoning and produce text output. For example:
Python
from google import genai
from google.genai.types import (
    HttpOptions,
    Tool,
    ToolCodeExecution,
    GenerateContentConfig,
)

client = genai.Client(http_options=HttpOptions(api_version="v1"))
model_id = "gemini-2.5-flash"

code_execution_tool = Tool(code_execution=ToolCodeExecution())
response = client.models.generate_content(
    model=model_id,
    contents="Calculate 20th fibonacci number. Then find the nearest palindrome to it.",
    config=GenerateContentConfig(
        tools=[code_execution_tool],
        temperature=0,
    ),
)
print("# Code:")
print(response.executable_code)
print("# Outcome:")
print(response.code_execution_result)
# Example response:
# # Code:
# def fibonacci(n):
#     if n <= 0:
#         return 0
#     elif n == 1:
#         return 1
#     else:
#         a, b = 0, 1
#         for _ in range(2, n + 1):
#             a, b = b, a + b
#         return b
#
# fib_20 = fibonacci(20)
# print(f'{fib_20=}')
#
# # Outcome:
# fib_20=6765
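The prompt also asks the model for the nearest palindrome, which the example response truncates. As a local illustration of that step (not the model's actual generated code), a brute-force version might look like:

```python
def nearest_palindrome(n: int) -> int:
    """Search outward from n for the closest palindromic integer (ties go up)."""
    def is_pal(x: int) -> bool:
        s = str(x)
        return s == s[::-1]

    if is_pal(n):
        return n
    for delta in range(1, n + 1):
        if is_pal(n + delta):
            return n + delta
        if is_pal(n - delta):
            return n - delta
    return n

print(nearest_palindrome(6765))  # the 20th Fibonacci number from the example above
```

For 6765, the search finds 6776 (11 away), closer than the next palindrome below, 6666.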
Go
import (
	"context"
	"fmt"
	"io"

	genai "google.golang.org/genai"
)

// generateWithCodeExec shows how to generate text using the code execution tool.
func generateWithCodeExec(w io.Writer) error {
	ctx := context.Background()
	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		HTTPOptions: genai.HTTPOptions{APIVersion: "v1"},
	})
	if err != nil {
		return fmt.Errorf("failed to create genai client: %w", err)
	}

	prompt := "Calculate 20th fibonacci number. Then find the nearest palindrome to it."
	contents := []*genai.Content{
		{
			Parts: []*genai.Part{
				{Text: prompt},
			},
			Role: "user",
		},
	}
	config := &genai.GenerateContentConfig{
		Tools: []*genai.Tool{
			{CodeExecution: &genai.ToolCodeExecution{}},
		},
		Temperature: genai.Ptr(float32(0.0)),
	}
	modelName := "gemini-2.5-flash"

	resp, err := client.Models.GenerateContent(ctx, modelName, contents, config)
	if err != nil {
		return fmt.Errorf("failed to generate content: %w", err)
	}

	for _, p := range resp.Candidates[0].Content.Parts {
		if p.Text != "" {
			fmt.Fprintf(w, "Gemini: %s", p.Text)
		}
		if p.ExecutableCode != nil {
			fmt.Fprintf(w, "Language: %s\n%s\n", p.ExecutableCode.Language, p.ExecutableCode.Code)
		}
		if p.CodeExecutionResult != nil {
			fmt.Fprintf(w, "Outcome: %s\n%s\n", p.CodeExecutionResult.Outcome, p.CodeExecutionResult.Output)
		}
	}

	// Example response:
	// Gemini: Okay, I can do that. First, I'll calculate the 20th Fibonacci number. Then, I need ...
	//
	// Language: PYTHON
	//
	// def fibonacci(n):
	//     ...
	//
	// fib_20 = fibonacci(20)
	// print(f'{fib_20=}')
	//
	// Outcome: OUTCOME_OK
	// fib_20=6765
	//
	// Now that I have the 20th Fibonacci number (6765), I need to find the nearest palindrome. ...
	// ...

	return nil
}
Node.js
const {GoogleGenAI} = require('@google/genai');

const GOOGLE_CLOUD_PROJECT = process.env.GOOGLE_CLOUD_PROJECT;
const GOOGLE_CLOUD_LOCATION = process.env.GOOGLE_CLOUD_LOCATION || 'global';

async function generateContent(
  projectId = GOOGLE_CLOUD_PROJECT,
  location = GOOGLE_CLOUD_LOCATION
) {
  const ai = new GoogleGenAI({
    vertexai: true,
    project: projectId,
    location: location,
  });

  const response = await ai.models.generateContent({
    model: 'gemini-2.5-flash-preview-05-20',
    contents:
      'What is the sum of the first 50 prime numbers? Generate and run code for the calculation, and make sure you get all 50.',
    config: {
      tools: [{codeExecution: {}}],
      temperature: 0,
    },
  });

  console.debug(response.executableCode);
  console.debug(response.codeExecutionResult);
  return response.codeExecutionResult;
}
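The arithmetic the Node.js prompt asks for is easy to verify locally. A short, API-independent sketch that computes the sum of the first 50 primes by trial division:

```python
def first_n_primes(n: int) -> list[int]:
    """Return the first n prime numbers by trial division against known primes."""
    primes = []
    candidate = 2
    while len(primes) < n:
        # candidate is prime iff no smaller prime divides it
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

print(sum(first_n_primes(50)))  # 5117
```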
For more code execution samples, see the code execution documentation.
What's next
Now that you've made your first API request, explore the following guides, which show how to set up more advanced Vertex AI capabilities for production code: