Process images, video, audio, and text with Gemini 1.5 Pro
This sample shows you how to process images, video, audio, and text at the same time. This sample works with Gemini 1.5 Pro only.
Code sample
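As a rough sketch of the data shapes involved, the request used by the samples on this page is a single user turn whose `parts` mix file references and a text prompt, and the answer is read from the first candidate's text parts. The helper names `fileDataPart`, `buildMultimodalRequest`, and `extractText`, as well as the bucket paths, are hypothetical and not part of any Google library; the snake_case field names simply mirror the Node.js sample on this page. The block makes no network calls.

```javascript
// Hypothetical helper: wraps a Cloud Storage URI and MIME type in the
// file_data shape used by the Node.js sample on this page.
function fileDataPart(fileUri, mimeType) {
  return {file_data: {file_uri: fileUri, mime_type: mimeType}};
}

// Hypothetical helper: builds a single-turn request whose parts are the
// media references followed by the text prompt.
function buildMultimodalRequest(mediaParts, prompt) {
  return {
    contents: [{role: 'user', parts: [...mediaParts, {text: prompt}]}],
  };
}

// Hypothetical helper: concatenates the text parts of the first candidate.
// Returns an empty string when the response has no candidates or no text.
function extractText(response) {
  const parts = response?.candidates?.[0]?.content?.parts ?? [];
  return parts
    .filter(part => typeof part.text === 'string')
    .map(part => part.text)
    .join('');
}

// Placeholder URIs, for illustration only.
const request = buildMultimodalRequest(
  [
    fileDataPart('gs://example-bucket/clip.mp4', 'video/mp4'),
    fileDataPart('gs://example-bucket/frame.png', 'image/png'),
  ],
  'When does the moment in the image happen in the video?'
);
console.log(JSON.stringify(request, null, 2));
```

Keeping request assembly and response parsing in small pure functions like these makes them easy to unit-test separately from the actual model call.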
### C#

Before trying this sample, follow the C# setup instructions in the
[Vertex AI quickstart using client libraries](/vertex-ai/docs/start/client-libraries).

For more information, see the
[Vertex AI C# API reference documentation](/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest).

To authenticate to Vertex AI, set up Application Default Credentials.
For more information, see
[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    using Google.Cloud.AIPlatform.V1;
    using System;
    using System.Threading.Tasks;

    public class MultimodalAllInput
    {
        public async Task<string> AnswerFromMultimodalInput(
            string projectId = "your-project-id",
            string location = "us-central1",
            string publisher = "google",
            string model = "gemini-2.0-flash-001")
        {
            // Create a client that targets the regional Vertex AI endpoint.
            var predictionServiceClient = new PredictionServiceClientBuilder
            {
                Endpoint = $"{location}-aiplatform.googleapis.com"
            }.Build();

            string prompt = "Watch each frame in the video carefully and answer the questions.\n"
                + "Only base your answers strictly on what information is available in "
                + "the video attached. Do not make up any information that is not part "
                + "of the video and do not be too verbose, be to the point.\n\n"
                + "Questions:\n"
                + "- When is the moment in the image happening in the video? "
                + "Provide a timestamp.\n"
                + "- What is the context of the moment and what does the narrator say about it?";

            // One user turn containing the text prompt, a video, and an image.
            var generateContentRequest = new GenerateContentRequest
            {
                Model = $"projects/{projectId}/locations/{location}/publishers/{publisher}/models/{model}",
                Contents =
                {
                    new Content
                    {
                        Role = "USER",
                        Parts =
                        {
                            new Part { Text = prompt },
                            new Part { FileData = new() { MimeType = "video/mp4", FileUri = "gs://cloud-samples-data/generative-ai/video/behind_the_scenes_pixel.mp4" } },
                            new Part { FileData = new() { MimeType = "image/png", FileUri = "gs://cloud-samples-data/generative-ai/image/a-man-and-a-dog.png" } }
                        }
                    }
                }
            };

            GenerateContentResponse response = await predictionServiceClient.GenerateContentAsync(generateContentRequest);

            // Read the text of the first part of the first candidate.
            string responseText = response.Candidates[0].Content.Parts[0].Text;
            Console.WriteLine(responseText);

            return responseText;
        }
    }

### Node.js

Before trying this sample, follow the Node.js setup instructions in the
[Vertex AI quickstart using client libraries](/vertex-ai/docs/start/client-libraries).

For more information, see the
[Vertex AI Node.js API reference documentation](/nodejs/docs/reference/aiplatform/latest).

To authenticate to Vertex AI, set up Application Default Credentials.
For more information, see
[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    const {VertexAI} = require('@google-cloud/vertexai');

    /**
     * TODO(developer): Update these variables before running the sample.
     */
    async function analyze_all_modalities(projectId = 'PROJECT_ID') {
      const vertexAI = new VertexAI({project: projectId, location: 'us-central1'});

      const generativeModel = vertexAI.getGenerativeModel({
        model: 'gemini-2.0-flash-001',
      });

      // Media inputs are referenced by Cloud Storage URI and MIME type.
      const videoFilePart = {
        file_data: {
          file_uri:
            'gs://cloud-samples-data/generative-ai/video/behind_the_scenes_pixel.mp4',
          mime_type: 'video/mp4',
        },
      };
      const imageFilePart = {
        file_data: {
          file_uri:
            'gs://cloud-samples-data/generative-ai/image/a-man-and-a-dog.png',
          mime_type: 'image/png',
        },
      };

      const textPart = {
        text: `
        Watch each frame in the video carefully and answer the questions.
        Only base your answers strictly on what information is available in the video attached.
        Do not make up any information that is not part of the video and do not be too
        verbose, be to the point.

        Questions:
        - When is the moment in the image happening in the video? Provide a timestamp.
        - What is the context of the moment and what does the narrator say about it?`,
      };

      // One user turn containing the video, the image, and the text prompt.
      const request = {
        contents: [{role: 'user', parts: [videoFilePart, imageFilePart, textPart]}],
      };

      const resp = await generativeModel.generateContent(request);
      const contentResponse = await resp.response;
      console.log(JSON.stringify(contentResponse));
    }

What's next
-----------

To search and filter code samples for other Google Cloud products, see the
[Google Cloud sample browser](/docs/samples?product=generativeaionvertexai).