Generate text from a video

This sample shows how to use the Gemini API to generate text from a video.

Code sample

Before trying this sample, follow the Go setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Go API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import (
        "context"
        "errors"
        "fmt"
        "io"
        "mime"
        "path/filepath"

        "cloud.google.com/go/vertexai/genai"
)

// generateMultimodalContent generates a response into w, based upon the prompt
// and video provided.
// video is a Google Cloud Storage path starting with "gs://"
func generateMultimodalContent(w io.Writer, prompt, video, projectID, location, modelName string) error {
        // prompt := "What is in this video?"
        // video := "gs://cloud-samples-data/video/animals.mp4"
        // location := "us-central1"
        // modelName := "gemini-1.5-flash-001"
        ctx := context.Background()

        client, err := genai.NewClient(ctx, projectID, location)
        if err != nil {
                return fmt.Errorf("unable to create client: %w", err)
        }
        defer client.Close()

        model := client.GenerativeModel(modelName)
        model.SetTemperature(0.4)

        // Given a video file URL, prepare video file as genai.Part
        part := genai.FileData{
                MIMEType: mime.TypeByExtension(filepath.Ext(video)),
                FileURI:  video,
        }

        res, err := model.GenerateContent(ctx, part, genai.Text(prompt))
        if err != nil {
                return fmt.Errorf("unable to generate contents: %w", err)
        }

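        // Content is a pointer and can be nil (for example, when a candidate
        // is blocked), so check it before reading Parts.
        if len(res.Candidates) == 0 ||
                res.Candidates[0].Content == nil ||
                len(res.Candidates[0].Content.Parts) == 0 {
                return errors.New("empty response from model")
        }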

        fmt.Fprintf(w, "generated response: %s\n", res.Candidates[0].Content.Parts[0])
        return nil
}
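
The Go sample above derives the MIME type with mime.TypeByExtension, which depends on the MIME tables available on the host and returns an empty string when the extension is unknown (Go's built-in table does not list video extensions such as .mp4). The following is a minimal sketch, not part of the official sample, of one way to validate the URI and resolve the MIME type before building genai.FileData. The helper name videoMIMEType and the knownVideoTypes table are hypothetical and introduced only for illustration; the list of extensions is an assumption, not a statement of which formats the model accepts.

```go
package main

import (
	"fmt"
	"mime"
	"path/filepath"
	"strings"
)

// knownVideoTypes is a hypothetical lookup table used so that the result
// does not depend on the MIME database installed on the host.
var knownVideoTypes = map[string]string{
	".mp4":  "video/mp4",
	".mov":  "video/quicktime",
	".webm": "video/webm",
	".mpeg": "video/mpeg",
}

// videoMIMEType (hypothetical helper) checks that uri is a Cloud Storage
// path and returns a video MIME type inferred from its file extension.
func videoMIMEType(uri string) (string, error) {
	if !strings.HasPrefix(uri, "gs://") {
		return "", fmt.Errorf("expected a gs:// URI, got %q", uri)
	}
	ext := strings.ToLower(filepath.Ext(uri))

	// Prefer the local table, then fall back to the system MIME tables.
	if t, ok := knownVideoTypes[ext]; ok {
		return t, nil
	}
	if t := mime.TypeByExtension(ext); strings.HasPrefix(t, "video/") {
		return t, nil
	}
	return "", fmt.Errorf("cannot determine a video MIME type for extension %q", ext)
}

func main() {
	mimeType, err := videoMIMEType("gs://cloud-samples-data/video/animals.mp4")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(mimeType)
}
```

In generateMultimodalContent, the returned value could be assigned to the MIMEType field of genai.FileData, with the function returning early when an error is reported instead of sending a request with an empty MIME type.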

Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.generativeai.ContentMaker;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import com.google.cloud.vertexai.generativeai.PartMaker;
import com.google.cloud.vertexai.generativeai.ResponseHandler;
import java.io.IOException;

public class MultimodalVideoInput {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-google-cloud-project-id";
    String location = "us-central1";
    String modelName = "gemini-1.5-flash-001";

    multimodalVideoInput(projectId, location, modelName);
  }

  // Analyzes the given video input.
  public static void multimodalVideoInput(String projectId, String location, String modelName)
      throws IOException {
    // Initialize client that will be used to send requests.
    // This client only needs to be created once, and can be reused for multiple requests.
    try (VertexAI vertexAI = new VertexAI(projectId, location)) {
      String videoUri = "gs://cloud-samples-data/video/animals.mp4";

      GenerativeModel model = new GenerativeModel(modelName, vertexAI);
      GenerateContentResponse response = model.generateContent(
          ContentMaker.fromMultiModalData(
              "What is in the video?",
              PartMaker.fromMimeTypeAndData("video/mp4", videoUri)
          ));

      String output = ResponseHandler.getText(response);
      System.out.println(output);
    }
  }
}

What's next

To search and filter code samples for other Google Cloud products, see the Google Cloud sample browser.