About database observability


Database observability is a measure of how accurately you can infer the internal state of a database system based on the data, or telemetry, that it generates in logs, metrics, and traces.

Diagnosing and troubleshooting issues in an application can be particularly difficult and time-consuming when a database is involved, which makes telemetry collection essential. Telemetry that is enriched with application context makes database instances more understandable, more observable, and easier to maintain. You can identify issues and problematic trends and remedy them early, before they cause costly downtime. You can also use this data to configure newer database instances to collect the right kind of data from the moment they start.

You can use data effectively and proactively to prevent issues and focus on strategic innovation. Good telemetry collection is particularly useful in the DevOps model, where database generalists need to independently analyze telemetry to monitor, evaluate, and optimize the performance and health of their rapidly evolving applications.

Google Cloud offers several powerful features spanning the four iterative observability stages to help you maintain the health of your Cloud SQL database.

The iterative stages of implementing observability

Automated telemetry collection

To achieve observability goals, you start by collecting telemetry, preferably through an automated process. When collected over a period of time, telemetry helps establish a baseline for metrics under different load conditions.
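As a rough illustration of what a baseline can look like, the following Python sketch summarizes metric samples grouped by load condition and flags a new reading that falls well outside that summary. The function names, the load labels, and the 20% tolerance are assumptions made for this example; they are not part of any Google Cloud API.

```python
import statistics


def summarize_baseline(samples):
    """Summarize metric samples as a mean and an approximate 95th percentile."""
    # quantiles(n=20) returns 19 cut points; the last one approximates p95.
    p95 = statistics.quantiles(samples, n=20)[-1]
    return {"mean": statistics.mean(samples), "p95": p95}


def build_baselines(samples_by_load):
    """Build one baseline per load condition (labels are hypothetical)."""
    return {
        label: summarize_baseline(samples)
        for label, samples in samples_by_load.items()
    }


def exceeds_baseline(value, baseline, tolerance=1.2):
    """Return True if a reading is above the baseline p95 by the given factor."""
    return value > baseline["p95"] * tolerance


# Hypothetical query latencies in milliseconds, collected over time.
baselines = build_baselines({
    "off_peak": [float(ms) for ms in range(1, 101)],
    "peak": [float(ms) for ms in range(50, 250)],
})
print(exceeds_baseline(200.0, baselines["off_peak"]))
```

A reading that looks abnormal against the off-peak baseline might be ordinary under peak load, which is why the sketch keeps a separate summary per load condition.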

Google Cloud services automatically generate observability data, including metrics, logs, and traces, which can help provide a complete observability overview.

  • Cloud Monitoring collects measurements of your service and of the Google Cloud resources that you use. Cloud SQL uses built-in memory custom agents to collect query telemetry, resulting in a lower impact on performance and eliminating the need for agent maintenance or security overhead.

  • Cloud Logging collects logging data from common application components. For Cloud SQL, see also View instance logs.

  • Cloud Trace collects latency data and executed query plans from applications to help you track how requests propagate through your application. You can compare latency distributions over time or across versions. When your application is instrumented to use Cloud Trace, Cloud Trace alerts you if it detects a significant shift in the application's latency profile.

Sqlcommenter, an OpenTelemetry library for databases, helps you monitor your databases through the lens of an application. Sqlcommenter automatically instruments ORMs to augment SQL statements with tags, and it allows OpenTelemetry trace context information to be propagated to the database.

With tags and trace application context in databases, it's easy to correlate application code with database performance and troubleshoot microservices-based architectures.
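To make the idea of tagged SQL statements concrete, the following Python sketch appends a comment of key-value tags to a statement, in the general style that Sqlcommenter produces: sorted keys, URL-encoded values in single quotes, separated by commas. This is a hand-rolled illustration, not the Sqlcommenter library's API. The helper name, the tag names, and the sample trace context value are assumptions for the example.

```python
from urllib.parse import quote


def add_sql_comment(sql, tags):
    """Append a comment of sorted, URL-encoded key='value' tags to a statement.

    Hypothetical helper for illustration; real integrations do this inside
    the ORM or driver instrumentation.
    """
    if not tags:
        return sql
    pairs = ",".join(
        f"{quote(str(key), safe='')}='{quote(str(value), safe='')}'"
        for key, value in sorted(tags.items())
    )
    return f"{sql} /*{pairs}*/"


tagged = add_sql_comment(
    "SELECT * FROM orders WHERE id = 42",
    {
        # Hypothetical application context tags.
        "route": "/orders/:id",
        "controller": "orders",
        # Sample W3C trace context value, used here only as a placeholder.
        "traceparent": "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01",
    },
)
print(tagged)
```

Because the tags travel with the statement itself, a slow query that shows up in database logs or query insights can be traced back to the route and trace that issued it.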

Database monitoring

Proper monitoring helps you determine whether your application is working optimally. Implement monitoring early, such as before you initiate a migration or deploy a new application to a production environment. Monitoring also helps you distinguish application issues from underlying cloud issues.

The Cloud SQL Overview page shows graphs for some of the key metrics.

Cloud SQL also helps you compare metrics for selected instances.

You can use Cloud Monitoring to create custom dashboards that help you monitor metrics and to set up alert policies so that you can receive timely notifications.
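As a sketch of what an alert policy might contain, the following Python code builds a policy definition as a plain dictionary and prints it as JSON. The field names follow the general shape of the Cloud Monitoring alert policy resource as we understand it. The metric type, resource type, threshold, and durations are assumptions chosen for illustration, so verify them against the Cloud Monitoring reference before using them. The function name is hypothetical.

```python
import json


def build_disk_alert_policy(threshold=0.9, window_seconds=300):
    """Build an illustrative alert policy body for Cloud SQL disk utilization.

    Assumption: the filter below names the Cloud SQL disk utilization metric;
    confirm the exact metric and resource types in the metrics list.
    """
    metric_filter = (
        'resource.type = "cloudsql_database" AND '
        'metric.type = "cloudsql.googleapis.com/database/disk/utilization"'
    )
    return {
        "displayName": "Cloud SQL disk utilization is high",
        "combiner": "OR",
        "conditions": [
            {
                "displayName": f"Disk utilization above {threshold:.0%}",
                "conditionThreshold": {
                    "filter": metric_filter,
                    "comparison": "COMPARISON_GT",
                    "thresholdValue": threshold,
                    # The condition must hold for the whole window to fire.
                    "duration": f"{window_seconds}s",
                    "aggregations": [
                        {
                            "alignmentPeriod": f"{window_seconds}s",
                            "perSeriesAligner": "ALIGN_MEAN",
                        }
                    ],
                },
            }
        ],
    }


policy = build_disk_alert_policy(threshold=0.8)
print(json.dumps(policy, indent=2))
```

Keeping the policy as data makes it easier to review and version alongside application code. Notification channels are omitted from this sketch because they are specific to each project.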

Database tuning

You can iteratively troubleshoot and tune your database.

Cloud SQL recommenders help you analyze the current usage of your database and provide recommendations and insights based on heuristic methods and machine learning.

Cloud SQL recommenders are briefly described as follows:

Name | Description
Out-of-disk recommender | Reduce the risk of downtime that might be caused by your Cloud SQL instances running out of disk space.
Idle instance recommender | Reduce costs by shutting down Cloud SQL instances that are inadvertently idle.
Overprovisioned instance recommender | Reduce costs by resizing Cloud SQL instances that are unnecessarily large for a given workload.

What's next