Generative AI on Vertex AI automatically collects and reports activity from Model as a Service (MaaS) models to help you troubleshoot latency issues and monitor capacity.

Why use the model observability dashboard
The model observability dashboard helps you understand the performance and usage of your models. As an application developer, you can use the dashboard for the following tasks:
- Monitor user interaction: View trends in model usage, such as requests per second and invocation latencies, to understand how users interact with your models.
- Estimate costs: Use model usage metrics to approximate the costs associated with running each model.
- Troubleshoot issues: Diagnose problems by monitoring API error rates, first token latencies, and token throughput to verify that models are responding reliably and efficiently.
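The cost estimate mentioned above comes down to simple arithmetic on token counts. The following is a minimal sketch, assuming you read input and output token totals off the dashboard. The `TokenPrice` type, the `estimate_cost` function, and the prices are hypothetical placeholders, not real list prices or part of any Vertex AI API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPrice:
    """Hypothetical per-model prices in USD per 1 million tokens."""
    input_per_million: float
    output_per_million: float


def estimate_cost(input_tokens: int, output_tokens: int, price: TokenPrice) -> float:
    """Approximate spend in USD from token totals for a period."""
    return (
        input_tokens / 1_000_000 * price.input_per_million
        + output_tokens / 1_000_000 * price.output_per_million
    )


# Placeholder prices for illustration only (assumption, not real pricing).
example_price = TokenPrice(input_per_million=0.50, output_per_million=1.50)

# Example: 40M input tokens and 10M output tokens over the period.
print(f"{estimate_cost(40_000_000, 10_000_000, example_price):.2f}")  # prints 35.00
```

Treat the result as an approximation only. Actual billing depends on the pricing that applies to your model and usage.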
Available monitoring metrics
The model observability dashboard displays a subset of the metrics that Cloud Monitoring collects. Key metrics include the following:
- Model requests per second (QPS)
- Token throughput
- First token latencies
- API error rates
For all available metrics and their descriptions, see the "aiplatform" section on the Google Cloud metrics page.
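To make two of these metrics concrete, the following sketch shows how requests per second and an API error rate can be derived from request counts grouped by response code over a time window. The function names and sample numbers are hypothetical, and treating every non-2xx code as an error is an assumption. The dashboard might group codes differently.

```python
from typing import Mapping


def requests_per_second(counts_by_code: Mapping[str, int], window_seconds: float) -> float:
    """Average request rate over the window, across all response codes."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return sum(counts_by_code.values()) / window_seconds


def error_rate(counts_by_code: Mapping[str, int]) -> float:
    """Fraction of requests whose response code is not 2xx (assumed definition)."""
    total = sum(counts_by_code.values())
    if total == 0:
        return 0.0
    errors = sum(n for code, n in counts_by_code.items() if not code.startswith("2"))
    return errors / total


# Made-up counts for a 60-second window.
sample = {"200": 1140, "429": 48, "500": 12}
print(f"QPS: {requests_per_second(sample, 60):.1f}")  # prints QPS: 20.0
print(f"Error rate: {error_rate(sample):.1%}")        # prints Error rate: 5.0%
```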
Limitations
Vertex AI captures dashboard metrics only for API calls to a model's endpoint. The dashboard doesn't include metrics from Google Cloud console usage, such as requests made from Vertex AI Studio.
View the dashboard
1. In the Vertex AI section of the Google Cloud console, go to the Dashboard page.
2. In the Model observability section, click Show all metrics to view the model observability dashboard in the Google Cloud Observability console.
3. To view metrics for a specific model or in a particular location, set one or more filters at the top of the dashboard page.
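If you query the same data outside the dashboard, the model and location filters roughly correspond to a Cloud Monitoring filter expression. The following sketch only builds the filter string. The `build_filter` helper is hypothetical, and the metric type, resource type, and label names shown are assumptions that you should confirm in the "aiplatform" section on the Google Cloud metrics page.

```python
def build_filter(metric_type: str, resource_type: str, **resource_labels: str) -> str:
    """Join metric, resource, and label clauses into a Monitoring filter string."""
    clauses = [
        f'metric.type = "{metric_type}"',
        f'resource.type = "{resource_type}"',
    ]
    # Sort labels so the output is stable regardless of argument order.
    for key, value in sorted(resource_labels.items()):
        clauses.append(f'resource.labels.{key} = "{value}"')
    return " AND ".join(clauses)


# Assumed names; verify them on the Google Cloud metrics page before use.
ASSUMED_METRIC = "aiplatform.googleapis.com/publisher/online_serving/model_invocation_count"
ASSUMED_RESOURCE = "aiplatform.googleapis.com/PublisherModel"

filter_text = build_filter(
    ASSUMED_METRIC,
    ASSUMED_RESOURCE,
    location="us-central1",          # example location
    model_user_id="example-model",   # placeholder model ID
)
print(filter_text)
```

You could paste a string like this into a Cloud Monitoring query or pass it to a Monitoring API client, once the names are verified for your project.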
Additional resources
- To create alerts for your dashboard, see the Alerting overview page in the Monitoring documentation.
- For information about metrics data retention, see the Monitoring quotas and limits page.
- For information about data at rest, see Protecting data at rest.
- To view a list of all metrics that Cloud Monitoring collects, see the "aiplatform" section on the Google Cloud metrics page.