Understanding the performance of production systems is notoriously difficult. Measurements taken in test environments usually fail to replicate the pressures on a production system. Microbenchmarking parts of the code is sometimes feasible, but it also typically fails to replicate the workload and behavior of a production system.
Continuous profiling of production systems is an effective way to discover where resources like CPU cycles and memory are consumed as a service operates in its working environment. But profiling adds load to the production system: to be an acceptable way to discover patterns of resource consumption, the load added by profiling must be small.
Stackdriver Profiler is a statistical, low-overhead profiler that continuously gathers CPU usage and memory-allocation information from your production applications. It attributes that information to the source code that generated it, helping you identify the parts of the code that are consuming the most resources, and otherwise illuminating the performance characteristics of the code.
Environment and languages
Stackdriver Profiler runs under Linux on the following Google Cloud Platform environments:
- Compute Engine
- Kubernetes Engine
- App Engine flexible environment
Stackdriver Profiler can profile code written in the following languages:
See Profiling agent for more information on running the profiling agent.
Profiler supports different types of profiling based on the language in which a program is written. The following table summarizes the supported profile types by language:
For more information about these profile types, see Profiling Concepts.
Stackdriver Profiler creates a single profile by collecting profiling data, usually for 10 seconds, once per minute from a single instance of the configured service in a single Compute Engine zone. If, for example, your Kubernetes Engine service runs 10 replicas of a pod, then in a 10-minute period roughly 10 profiles are created, and each pod is profiled approximately once. The profiling period is randomized, so there will be variation. See Profile collection for more information.
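The arithmetic behind the 10-replica example can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and the simulation assumes that the instance profiled in each minute is chosen uniformly at random, which is a simplification and not a description of how the service actually selects instances.

```python
import random
from collections import Counter


def expected_profiles(replicas, window_minutes, profiles_per_minute=1.0):
    """Return (total profiles, average profiles per replica) for a window.

    Assumes one profile per minute for the whole service in a zone,
    as described in the text above.
    """
    total = window_minutes * profiles_per_minute
    return total, total / replicas


def simulate(replicas, window_minutes, seed=0):
    """Count profiles per replica, ASSUMING a uniform random pick each minute."""
    rng = random.Random(seed)
    picks = Counter()
    for _ in range(window_minutes):
        picks[rng.randrange(replicas)] += 1
    return picks


total, per_replica = expected_profiles(replicas=10, window_minutes=10)
counts = simulate(replicas=10, window_minutes=10)

print(f"expected: {total:.0f} profiles, {per_replica:.1f} per replica")
print(f"simulated per-replica counts: {dict(counts)}")
```

In the simulated counts some replicas are typically profiled more than once in the window and others not at all, which is the kind of variation the text refers to; only the average works out to about one profile per replica.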
The overhead of CPU and heap-allocation profiling while data is being collected is less than 5 percent. Amortized over the execution time and across multiple replicas of a service, the overhead is commonly less than 0.5 percent, making it an affordable option for always-on profiling in production systems.
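To see how a 5 percent overhead during collection amortizes to a much smaller figure, the following rough sketch treats 5 percent as an upper bound, assumes overhead is incurred only during the roughly 10-second collection window, and assumes the once-per-minute profile rotates evenly across replicas. The function name and the even-rotation model are assumptions made for illustration.

```python
def amortized_overhead(peak=0.05, collect_seconds=10, interval_seconds=60, replicas=1):
    """Average per-replica overhead as a fraction of execution time.

    Each replica is profiled for `collect_seconds` out of every
    `interval_seconds * replicas` seconds under the even-rotation assumption.
    """
    profiled_fraction = collect_seconds / (interval_seconds * replicas)
    return peak * profiled_fraction


for n in (1, 2, 10):
    print(f"{n:>2} replica(s): {amortized_overhead(replicas=n):.3%}")
```

Under these assumptions a single replica sees a bit under 1 percent on average, and the figure drops below 0.5 percent once the service has two or more replicas.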
Stackdriver Profiler consists of the profiling agent, which collects the data, and a console interface on Google Cloud Platform, which lets you view and analyze the data collected by the agent.
You install the agent on the virtual machines where your application runs. The agent typically comes as a library that you attach to your code when you run it. The agent collects profiling data as the app runs.
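As an illustration of attaching the agent as a library, the sketch below starts the Python profiling agent at application startup. It assumes the `google-cloud-profiler` package is installed and that the application runs in a supported environment with valid credentials; the service name and version shown are hypothetical placeholders, and the helper functions are not part of the agent.

```python
def profiler_settings(service, version):
    """Build keyword arguments for the agent (names here are placeholders)."""
    return {"service": service, "service_version": version, "verbose": 1}


def start_profiler(settings):
    """Try to start the agent; return True only if it started."""
    try:
        import googlecloudprofiler  # provided by the google-cloud-profiler package
    except ImportError:
        print("profiling agent not installed; continuing without profiling")
        return False
    try:
        googlecloudprofiler.start(**settings)
    except (ValueError, NotImplementedError) as exc:
        # Raised for invalid arguments or an unsupported environment.
        print(f"profiling agent not started: {exc}")
        return False
    return True


settings = profiler_settings("my-service", "1.0.0")  # hypothetical values

if __name__ == "__main__":
    start_profiler(settings)
    # ... the application's normal work continues here ...
```

Treating a failed start as non-fatal keeps the profiler from ever blocking application startup, which fits its role as a low-overhead, always-on tool. Agents for other languages are attached differently; see the agent documentation for the language you use.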
For information on running the Stackdriver Profiler agent, see:
You can also run the profiling agent on non-Google Cloud Platform systems. See Profiling Outside Google Cloud Platform for more information.
After the agent has collected some profiling data, you can use the Profiler interface to see how the statistics for CPU and memory usage correlate with areas of the code.
Profile data is retained for 30 days, so you can analyze performance for any period within the last 30 days. Profiles can be downloaded for long-term storage.
For information on using the Profiler interface, see Using the Profiler Interface.