Modern IT runs on numbers
A comprehensive, metrics-driven approach is now a baseline objective for most IT ops teams, and many businesses measure IT on service availability and performance. But for IT teams that depend on cloud services, it can be hard to get solid data on services that are run by an outside cloud provider. If there is a problem, is it in your stack or with the service provider? Transparent SLIs help you monitor Google Cloud services and their effects on your workloads, so you can get the complete picture.
Measure all the things
To help IT understand the performance of all your service components, Google provides detailed API-level metrics for over 130 Google Cloud services. These metrics show you the error counts and latency for your applications' requests to each Google service. This lets you see correlations and side effects between your applications and the services they depend on, which helps speed up root-cause analysis and shorten time to resolution.
SLIs go far beyond traditional notions of “service health.” You can see the specific interactions between services and correlate them with environmental data. You can cross-tabulate service metrics by attributes such as the location of the service, the credential of the app calling the service, the version, and the response code to explore relationships and determine causes and effects.
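As a rough illustration of cross-tabulating metrics by attribute, the sketch below counts requests per credential and per location, broken down by response code. It assumes you have already exported request samples as plain records; the field names (`location`, `credential_id`, `response_code`) and the sample values are hypothetical and do not represent any Google Cloud API schema.

```python
from collections import Counter, defaultdict

# Hypothetical request samples. Field names and values are illustrative
# assumptions, not an actual export format.
SAMPLES = [
    {"location": "us-central1", "credential_id": "app-a", "response_code": 200},
    {"location": "us-central1", "credential_id": "app-b", "response_code": 403},
    {"location": "us-central1", "credential_id": "app-b", "response_code": 403},
    {"location": "europe-west1", "credential_id": "app-a", "response_code": 200},
    {"location": "europe-west1", "credential_id": "app-b", "response_code": 403},
]


def cross_tab(records, row_key, col_key):
    """Count records for every (row_key value, col_key value) pair."""
    table = defaultdict(Counter)
    for record in records:
        table[record[row_key]][record[col_key]] += 1
    # Convert to plain dicts so the result is easy to print or compare.
    return {row: dict(cols) for row, cols in table.items()}


by_credential = cross_tab(SAMPLES, "credential_id", "response_code")
by_location = cross_tab(SAMPLES, "location", "response_code")
```

In this made-up data, the errors follow one credential across both locations rather than clustering in a single location, which is the kind of relationship a cross-tabulation is meant to surface.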
Using Transparent SLIs in practice
- If all calls to a service are failing for one user but for no others, chances are there is something wrong with that account that you can easily fix yourself.
- If you're troubleshooting a problem with your app and notice a correlation between your application's degraded performance and a sustained increase in latency for a critical Google Cloud service, that is a sign to contact us for help.
- If the reported latencies for a Google Cloud service look good and are unchanged from before, but your in-app metrics show abnormally high latency on calls to that service, there could be trouble in the network. Call your network provider (in some cases, Google) to get the debugging process started.
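The three scenarios above can be sketched as a small triage function. This is only a minimal illustration: the function name, the inputs (latency expressed as a ratio of current to baseline), and the 1.5x threshold are all assumptions chosen for the example, not values recommended by Google.

```python
def triage(failing_credentials, total_credentials,
           service_latency_ratio, app_latency_ratio, threshold=1.5):
    """Suggest a next step from a few summary signals.

    failing_credentials / total_credentials: how many callers see all
        of their calls to the service fail, out of how many callers.
    service_latency_ratio: provider-reported latency divided by baseline.
    app_latency_ratio: latency measured inside your app divided by baseline.
    threshold: hypothetical cutoff for "abnormally high".
    """
    service_slow = service_latency_ratio >= threshold
    app_slow = app_latency_ratio >= threshold

    # One caller failing while the others succeed points at that account.
    if failing_credentials == 1 and total_credentials > 1:
        return "check_account"
    # App degradation that tracks a service latency increase.
    if service_slow and app_slow:
        return "contact_provider"
    # Service looks normal, but calls are slow as seen from the app.
    if app_slow and not service_slow:
        return "investigate_network"
    return "no_clear_signal"


print(triage(1, 12, 1.0, 1.0))
print(triage(0, 12, 2.0, 2.1))
print(triage(0, 12, 1.0, 2.4))
```

A real runbook would look at sustained trends rather than a single ratio, but the branching mirrors the reasoning in the list: account first, then the service, then the network in between.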