# Transcribe an audio file with a multimodal AI model

This sample shows you how to use an audio file to generate a podcast transcript with timestamps.

Explore further
---------------

For detailed documentation that includes this code sample, see the following:

- [Audio understanding (speech only)](/vertex-ai/generative-ai/docs/multimodal/audio-understanding)

Code sample
-----------

### Go

Before trying this sample, follow the Go setup instructions in the [Vertex AI quickstart using client libraries](/vertex-ai/docs/start/client-libraries).

For more information, see the [Vertex AI Go API reference documentation](/go/docs/reference/cloud.google.com/go/aiplatform/latest/apiv1).

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see [Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

```go
import (
	"context"
	"fmt"
	"io"

	genai "google.golang.org/genai"
)

// generateAudioTranscript shows how to generate an audio transcript.
func generateAudioTranscript(w io.Writer) error {
	ctx := context.Background()

	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		HTTPOptions: genai.HTTPOptions{APIVersion: "v1"},
	})
	if err != nil {
		return fmt.Errorf("failed to create genai client: %w", err)
	}

	modelName := "gemini-2.5-flash"
	contents := []*genai.Content{
		{
			Parts: []*genai.Part{
				{Text: `Transcribe the interview, in the format of timecode, speaker, caption.
Use speaker A, speaker B, etc. to identify speakers.`},
				{FileData: &genai.FileData{
					FileURI:  "gs://cloud-samples-data/generative-ai/audio/pixel.mp3",
					MIMEType: "audio/mpeg",
				}},
			},
			Role: "user",
		},
	}

	resp, err := client.Models.GenerateContent(ctx, modelName, contents, nil)
	if err != nil {
		return fmt.Errorf("failed to generate content: %w", err)
	}

	respText := resp.Text()

	fmt.Fprintln(w, respText)

	// Example response:
	// 00:00:00, A: your devices are getting better over time.
	// 00:01:13, A: And so we think about it across the entire portfolio from phones to watch, ...
	// ...

	return nil
}
```

### Python

Before trying this sample, follow the Python setup instructions in the [Vertex AI quickstart using client libraries](/vertex-ai/docs/start/client-libraries).

For more information, see the [Vertex AI Python API reference documentation](/python/docs/reference/aiplatform/latest).

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see [Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

```python
from google import genai
from google.genai.types import GenerateContentConfig, HttpOptions, Part

client = genai.Client(http_options=HttpOptions(api_version="v1"))
prompt = """
Transcribe the interview, in the format of timecode, speaker, caption.
Use speaker A, speaker B, etc. to identify speakers.
"""
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=[
        prompt,
        Part.from_uri(
            file_uri="gs://cloud-samples-data/generative-ai/audio/pixel.mp3",
            mime_type="audio/mpeg",
        ),
    ],
    # Required to enable timestamp understanding for audio-only files
    config=GenerateContentConfig(audio_timestamp=True),
)
print(response.text)
# Example response:
# [00:00:00] **Speaker A:** your devices are getting better over time. And so ...
# [00:00:14] **Speaker B:** Welcome to the Made by Google podcast where we meet ...
# [00:00:20] **Speaker B:** Here's your host, Rasheed Finch.
# [00:00:23] **Speaker C:** Today we're talking to Aisha Sharif and DeCarlos Love. ...
# ...
```

What's next
-----------

To search and filter code samples for other Google Cloud products, see the [Google Cloud sample browser](/docs/samples?product=googlegenaisdk).
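The samples prompt the model for `timecode, speaker, caption` lines, and downstream code often needs those lines as structured rows rather than raw text. The helper below is a minimal sketch and not part of the official sample: the `parse_transcript` function and its regular expression are illustrative assumptions, and they only work when the model actually follows the prompted line format (e.g. `00:00:14, B: Welcome to the ...`); real model output can deviate, so validate before relying on it.

```python
import re

# Matches lines like "00:00:14, B: Welcome ..." or "00:00:14, speaker B: Welcome ..."
# (the format requested in the prompt; actual model output may vary).
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}),\s*(?:[Ss]peaker\s+)?(\w+):\s*(.*)$")

def parse_transcript(text: str) -> list[tuple[str, str, str]]:
    """Parse 'timecode, speaker, caption' lines into (timecode, speaker, caption) tuples."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:  # silently skip lines that don't fit the expected shape
            rows.append(match.groups())
    return rows

sample = "00:00:00, A: your devices are getting better over time.\n00:00:14, B: Welcome to the podcast."
print(parse_transcript(sample))
# [('00:00:00', 'A', 'your devices are getting better over time.'), ('00:00:14', 'B', 'Welcome to the podcast.')]
```

Skipping non-matching lines (instead of raising) keeps the parser tolerant of headers or trailing `...` markers like those in the example responses above; swap in stricter handling if you need to detect malformed output.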