Streaming text generation

This code sample demonstrates how to stream generated text from a Vertex AI text model, so that the response arrives in chunks as it is produced instead of all at once.

Code sample

Python

Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import vertexai
from vertexai import language_models

# TODO(developer): Update and uncomment the line below.
# PROJECT_ID = "your-project-id"
vertexai.init(project=PROJECT_ID, location="us-central1")

text_generation_model = language_models.TextGenerationModel.from_pretrained(
    "text-bison"
)
parameters = {
    # Temperature controls the degree of randomness in token selection.
    "temperature": 0.2,
    # Token limit determines the maximum amount of text output.
    "max_output_tokens": 256,
    # Tokens are selected from most probable to least until the
    # sum of their probabilities equals the top_p value.
    "top_p": 0.8,
    # top_k limits selection to the k most probable tokens. A top_k of 1
    # means the selected token is always the most probable among all tokens.
    "top_k": 40,
}

responses = text_generation_model.predict_streaming(
    prompt="Give me ten interview questions for the role of program manager.",
    **parameters,
)

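# Print each chunk as it arrives and collect its text.
results = []
for response in responses:
    print(response)
    results.append(str(response))
# The chunks are consecutive pieces of one response, so join them
# without a separator to rebuild the full text.
full_text = "".join(results)
print(full_text)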
# Example response:
# 1. **Tell me about your experience as a program manager.**
# 2. **What are your strengths and weaknesses as a program manager?**
# 3. **What do you think are the most important qualities for a successful program manager?**
# 4. **How do you manage
# ...
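
The sampling parameters described in the comments above can be illustrated with a small, self-contained sketch. This is not the Vertex AI implementation: the function name `filter_candidates`, the toy probability table, and the choice to apply top_k before top_p are all assumptions made for illustration. It only shows how top_k and top_p narrow the set of candidate tokens before one is sampled.

```python
def filter_candidates(probs, top_k, top_p):
    """Hypothetical helper: apply top_k, then top_p, and renormalize.

    probs maps each candidate token to its probability.
    """
    # Rank tokens from most probable to least probable.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)

    # top_k: keep only the k most probable tokens.
    ranked = ranked[:top_k]

    # top_p: keep tokens until their cumulative probability reaches top_p.
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Toy distribution over next tokens (made-up values).
toy_probs = {"the": 0.5, "a": 0.2, "an": 0.15, "this": 0.1, "that": 0.05}

# With the sample's settings, top_k=40 keeps every token here and
# top_p=0.8 then trims the list to the three most probable tokens.
candidates = filter_candidates(toy_probs, top_k=40, top_p=0.8)
print(list(candidates))

# With top_k=1, only the single most probable token survives.
greedy = filter_candidates(toy_probs, top_k=1, top_p=0.8)
print(greedy)
```

In this sketch a lower top_p or top_k leaves fewer candidates, which makes output less varied. Temperature, not modeled here, would then control how randomly a token is drawn from the remaining candidates.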

What's next

To search and filter code samples for other Google Cloud products, see the Google Cloud sample browser.