The Stackdriver Profiler interface lets you analyze the data collected by the profiling agent from your code, so you can explore its runtime characteristics.
This guide describes how to use this interface to analyze profile data collected from your program.
Before you start
Before you can use the Profiler to analyze data, you must have some profiling data collected to analyze. For information on using the Profiler agent to collect profiling data, see the profiling guides for Go, Java, or Node.js.
Opening the Profiler interface
To use the Profiler interface:
- In the Google Cloud Platform Console, select your project.
- Go to Stackdriver > Profiler.
The Profiler interface is divided into two primary areas:
A controls area, which lets you configure the data that the Profiler interface displays and analyzes.
A graph area, which contains a flame graph of the profiled data and reports statistics about the data.
Profiler UI controls
The Profiler interface provides a set of controls that let you determine the information displayed in the graph area. These controls fall into two types.
The first type of controls determines which subset of the collected profiles to use:
- Within a time range
- From a specific service
- Of a specific type
- From a geographical zone
- From a specific version of the service
- For comparison to another profile; see Comparing profiles for more information
The second type, below the first, is the filter bar. It lets you control which data in the selected subset of profiles is displayed, and it lets you download profile data.
Description of controls
Range of time
The Timespan pulldown, Now button, and End Time pulldown let you control the period of time for which profiling data is displayed. By default, the end time for a time span is “now”, and the Timespan is 7 days. This displays data from the last 7 days. By explicitly setting the End Time, you can use these controls to display, for example, data from noon to 4pm two Thursdays ago.
When the Now option is enabled, the End Time field is disabled for editing and shows the current date and time.
To set a specific end time, click the Now button to disable the default and enable editing in the End Time field. You can then specify a time and date.
The pulldown next to the time-zone indicator in End Time brings up a calendar, so you can specify dates easily.
The Timespan choices range from 10 minutes to 30 days, the limit of the retention period for profile data.
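The relationship between Timespan, End Time, and the Now option can be sketched as a small calculation. This is only an illustration of the arithmetic, not the Profiler's implementation; the function name `time_window` and the use of UTC are assumptions.

```python
from datetime import datetime, timedelta, timezone

# Limits taken from the text: Timespan choices run from 10 minutes to 30 days.
MIN_SPAN = timedelta(minutes=10)
MAX_SPAN = timedelta(days=30)


def time_window(span, end=None):
    """Return (start, end) for a timespan that ends at `end`.

    `end=None` models the Now option: the window ends at the current time.
    """
    if not MIN_SPAN <= span <= MAX_SPAN:
        raise ValueError("timespan must be between 10 minutes and 30 days")
    if end is None:
        end = datetime.now(timezone.utc)
    return end - span, end


# A 4-hour window ending at 4 PM on a chosen day covers noon to 4 PM.
start, end = time_window(
    timedelta(hours=4),
    datetime(2019, 3, 14, 16, 0, tzinfo=timezone.utc),
)
```

Calling `time_window(timedelta(days=7))` with no end time corresponds to the default view, the last 7 days.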
The Service pulldown lets you select the data to analyze based on the name of the service from which it was collected. This is one of the values you (or the runtime environment) specify when an application is run with profiling enabled. You might have profiling data from several different services, or you might have data from only one. For more information on service names, see the profiling guides for Go, Java, or Node.js.
Type of data
The Profile type pulldown lets you choose the type of profiling data to analyze. Different languages support a different set of profile types:
1 For App Engine standard environment, Go 1.11 or later is required.
2 Only available for App Engine standard environment.
3 Not available for App Engine standard environment.
Each profile type captures a different kind of information:
- CPU time: information about CPU usage
- Heap: information about memory consumption
- Contention: information about mutex usage
- Threads: information about thread usage
- Wall time: information about total time to run
The Zone name pulldown lets you restrict the analysis to instances of the service running in a specific Compute Engine zone.
The Version pulldown lets you restrict the analysis to a specific version of the named service. Service version is an optional value you (or the runtime environment) specify when an application is run with profiling enabled. For more information on service versions, see the profiling guides for Go, Java, or Node.js. The following screenshot shows the versions available for a particular service:
Focus is a mechanism to change the filters so that you focus on a single function, showing both the callers and callees of the function. See Focus for examples of using the focus capability.
Filters choose which data in the chosen set of profiles to display.
Click the Filters button to display a drop-down menu of options:
Metric: Choose a specific measure to graph. The choices here depend on the Profile type selected, and the available profiles vary with the language:
- For CPU time profiles (Go, Java), the only choice is CPU time
- For Heap profiles (Go, Node.js), the choices are:
  - Total alloc bytes
  - Total alloc objects
- For Wall time profiles (Java, Node.js), the choices are:
  - Wall time
- For Threads profiles (Go), the only choice is Goroutine
- For Contention profiles (Go), the choices are:
For more information about types of profiling metrics, see Profiling concepts.
Show stacks and Hide stacks: Include or exclude from the display those call stacks that contain a frame that matches a string. The stack includes the part of the stack that calls the matching function and the part called by the matching function.
Show from frame: Discard everything except the frames matching a string and everything called from them. This is like Show stacks, except that Show from frame also discards the part of the stack that calls the matching function.
Hide frames: Drop frames that match a string from the graph. This is very useful for suppressing data from standard libraries or utilities.
Highlight: Change the graph coloring to emphasize call sequences that match a string.
Weight: Include only profiles that are among the most expensive, based on the metric filter. For example, of the CPU-time profiles, chart only the most expensive 1 percent.
Focus: Changes the graph to focus on a single function, showing both the callers and callees of the function.
See Filtering the graph for examples of most of these filters. The Focus filter changes the graph sufficiently that it warrants a separate discussion; see Focus. The Profiler also has the ability to zoom in on a specific frame. See Zooming in on a frame for details.
The graph region of the interface shows you the results of the analysis requested by the configuration choices. The region consists of three report lines (Name, File, and Values) and the graph displaying the requested analysis, here for thread usage:
Understanding the graph
As a service runs, the number of collected profiles increases. The graph shows values averaged over the set of profiles, up to a maximum of 250 profiles. If there are more than 250 profiles available, 250 of them are selected randomly as a sample set.
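The sampling and averaging described above can be sketched as follows. This is a minimal illustration, assuming each profile is reduced to a single number (its root value); the function names are hypothetical and the actual selection logic used by the Profiler is not documented here.

```python
import random

MAX_PROFILES = 250  # cap stated in the text


def sample_profiles(profiles, rng=random):
    """Return all profiles, or a random sample of 250 if there are more."""
    if len(profiles) <= MAX_PROFILES:
        return list(profiles)
    return rng.sample(profiles, MAX_PROFILES)


def average_root_value(profiles, rng=random):
    """Average the whole-service value over the sampled profiles.

    Returns (average, number of profiles used).
    """
    sampled = sample_profiles(profiles, rng)
    if not sampled:
        return 0.0, 0
    return sum(sampled) / len(sampled), len(sampled)
```

Passing a seeded `random.Random` instance as `rng` makes the sample reproducible, which is convenient when experimenting.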
Each individual profile represents data collected one time per minute from a single instance of the configured service in a single Compute Engine zone. The collection period for a profile varies with the profile type. See Profile collection for more information.
What is shown on the graph depends on the requested analysis, but each block on the graph, a frame, represents a function or method in the service. The top frame represents the entire service, and that always consumes 100% of whatever metric is being analyzed.
The colors of the function blocks in the graph correspond, where possible, to the packages of the functions. The colors themselves are meaningless. Functions originating in the same package share a block color. If package information is unavailable, as in Node.js, the names of the source files are used to color the function blocks. In a call stack, a change in block color means a transition from one package to another.
Here is the graph for the same service, but showing consumed heap instead:
Under the “root” level is another frame or set of frames making up a second bar. Each of these frames is a top-level call made by the service. Each is color-coded and its width indicates its consumption of the specified metric. Under each of those colored function frames is another set of function frames, each of which is responsible for some part of the resource of the frame above it. The hierarchy of function frames in this graph represents the call sequence, and the width of a frame represents that function or method's contribution to the resource consumption.
For example, in the graph for consumed heap, the majority of the space is
consumed by the call stack involving the Go runtime's
main, the app-specific
allocImpl. This isn't surprising, as this example was
written to illustrate heap consumption. But you can see other call stacks in
the screenshot; these represent a much smaller part of the heap consumption
in this code.
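To make the relationship between call stacks and frame widths concrete, here is a small sketch that aggregates stack samples into per-frame totals. The sample data and function names are placeholders invented for illustration, and the code shows only the general flame-graph idea, not how the Profiler itself computes its graph.

```python
from collections import defaultdict


def build_flame_tree(samples):
    """Aggregate (stack, value) samples into per-call-path totals.

    Each stack is a tuple of function names, outermost caller first.
    Returns two dicts keyed by call path (a tuple of names):
      total[path]: value of the frame, including everything it calls
      self_[path]: value attributed to the frame itself
    The empty path () is the root frame for the whole service.
    """
    total = defaultdict(float)
    self_ = defaultdict(float)
    for stack, value in samples:
        for depth in range(len(stack) + 1):
            total[tuple(stack[:depth])] += value
        self_[tuple(stack)] += value
    return total, self_


def width(total, path):
    """Frame width as a fraction of the root frame."""
    return total[path] / total[()]


# Hypothetical heap samples in MiB; the names are placeholders.
samples = [
    (("main", "alloc"), 48.0),
    (("main", "log"), 2.0),
    (("background",), 10.0),
]
total, self_ = build_flame_tree(samples)
main_share = width(total, ("main",))            # 50 of 60 MiB
alloc_share = width(total, ("main", "alloc"))   # 48 of 60 MiB
```

The root frame always has width 1.0, which matches the statement that the top frame consumes 100% of the metric being analyzed.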
The report lines above the chart tell you the following:
- Name is the name of the function or method selected and its consumption of the selected metric.
- File is the name of the file in which the function or method is found.
- Values shows values of the selected metric for the function or method.
When the interface is first loaded, these lines report for the root entry, the service as a whole, indicated by the top frame in the graph.
- The Name field includes the number of profiles included in the analysis.
- The File field reports
- The Values field tells you the overall consumption for the entire service: how long it took to run, how much CPU time it used, or how much memory it used, depending on the profile type.
When you point to a different frame in the graph, you see more specific
information. For example, this screenshot shows pointing to the Go runtime's
- The Name field includes the complete name of the method or function.
- The File field indicates the source file of the method or function.
- The Values field reports the consumption of the measured resource by this method or function.
A summary bar also displays a summary of this information.
In this example, the runtime.main function accounts for a total of about 55 MiB of memory, or 97% of the total consumed by the program.
Zooming in on a frame
Zooming refines the graph view to show a frame, its callers, and any callees, and hides all other frames. It is particularly useful when a frame is too small to render text. Zooming doesn't modify the filter settings of your graph.
To zoom in on a specific frame, place your mouse pointer on the frame of
interest. In this example, the mouse pointer is on
a frame with the function named
Click the frame to zoom in:
To restore the graph to the original state, click on the root (top) frame.
Filtering the graph
You can use the profile-data filters to restrict the data shown to call sequences involving specific functions. This lets you investigate the contribution of specific parts of the code to the overall resource consumption.
The Metric filter determines the performance measure to display. For example, the following screenshot shows the CPU consumption of a program:
Here, you can see that the
busyloop routine calls
foo2, both of
which call various other routines. You can use filtering to restrict the graph
only to data you are interested in.
Use the Show stacks filter to restrict the view to call stacks that include some method or function. The graph shows the callers and callees of the function, that is, everything that calls the matching function, and everything it calls.
Suppose you own the code that implements the
foo1 feature. To restrict the
CPU-usage graph only to call stacks that involve the function
foo1, set a
Show stacks filter for
In this case, you can see that the call stacks involving
foo1 account for
52% of the CPU usage.
The Hide stacks filter is similar, but it removes all call stacks that involve the specified function.
The Show from frame filter is like Show stacks, except that it eliminates parts of the stacks that call into the specified function. It shows the call stacks from the named function down.
This is useful if your function is called from many places, and you want to see the total consumption attributable to it.
To restrict the graph to show calls originating from the
set a Show from frame filter for
The screenshot reveals that this function consumes 49.8% of the CPU time.
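The difference between Show stacks, Hide stacks, and Show from frame is easier to see on a toy data set. The sketch below assumes that a filter matches when its string is a substring of a frame name, and it represents each call stack as a tuple of names with the outermost caller first. The names `bar`, `baz`, and `idle` are made up for the example; none of this reflects the Profiler's internal code.

```python
def show_stacks(stacks, needle):
    """Keep whole stacks that contain a frame matching `needle`."""
    return [s for s in stacks if any(needle in f for f in s)]


def hide_stacks(stacks, needle):
    """Drop every stack that contains a frame matching `needle`."""
    return [s for s in stacks if not any(needle in f for f in s)]


def show_from_frame(stacks, needle):
    """Keep matching stacks, trimmed to start at the first matching frame."""
    result = []
    for s in stacks:
        for i, f in enumerate(s):
            if needle in f:
                result.append(s[i:])  # discard the callers above the match
                break
    return result


stacks = [
    ("main", "busyloop", "foo1", "bar"),
    ("main", "busyloop", "foo2", "baz"),
    ("main", "idle"),
]
only_foo1 = show_stacks(stacks, "foo1")       # callers and callees kept
without_foo1 = hide_stacks(stacks, "foo1")    # matching stacks removed
from_foo = show_from_frame(stacks, "foo")     # callers trimmed away
```

Show stacks keeps the full stack around a match, while Show from frame keeps only the part from the matching frame down, so the same function reached through different callers ends up in one place.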
The Hide frames filter suppresses the frame for any method or function that matches the specified string. The graph still shows the callers of the function, and any callees of the function are collected together. This is useful for removing irrelevant frames from the graph.
For example, to remove the frames for both
foo2, set a
Hide frames filter for
foo2 match, so both
are removed from the graph. Because both of them call the
routines, the data for each of those is aggregated together.
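The aggregation that follows from hiding frames can be sketched like this. The example assumes substring matching and uses invented sample values and a made-up callee name `bar`; it illustrates the idea only.

```python
from collections import Counter


def hide_frames(samples, needle):
    """Remove matching frames from each stack and merge identical results.

    `samples` is a list of (stack, value) pairs, where each stack is a
    tuple of function names with the outermost caller first.
    """
    merged = Counter()
    for stack, value in samples:
        kept = tuple(f for f in stack if needle not in f)
        merged[kept] += value
    return dict(merged)


samples = [
    (("busyloop", "foo1", "bar"), 30),
    (("busyloop", "foo2", "bar"), 20),
]
# With the foo frames hidden, both stacks collapse to busyloop -> bar.
collapsed = hide_frames(samples, "foo")
```

After filtering, the two stacks become identical, so their values are summed into a single entry.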
Use the Highlight filter to emphasize the call sequences ending in a specific function. Call sequences not including the function or originating from it are colored in subdued tones.
For example, here is a graph with no highlighting:
Here's the same graph with highlighting requested for the
Use the Weight filter to see the profiles that consume more of the profiling metric, for example, “the profiles consuming the top 5 percent of CPU time”. The Weight filter lets you look only at the most expensive profiles. When setting this filter, you choose from varying ranges of metric values:
The options menu indicates the range of metric values covered by the percentage
groups and how many profiles each group includes. The
All choice includes
the entire range of values, so all available profiles are in that group.
The number of available profiles varies with the service being profiled.
The percentage groups refer to the metric being measured, not the number of profiles included in the group.
Top 1% (x - y, n profiles) means that the range of values from x to y is the top 1 percent of values. The number of profiles indicates how many profiles fall into the range, not 1 percent of the available profiles.
If there are more than 250 profiles available for a particular choice, the graph averages a random sample of 250 of them. The root frame reports an average over at most 250 profiles.
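One way to read the percentage groups is as slices of the metric's value range rather than slices of the profile count. The sketch below follows that reading. It is an assumption made for illustration: the exact rule the Profiler uses to compute the groups is not stated in this guide, and the function name and sample values are hypothetical.

```python
def weight_group(values, top_percent):
    """Return (low, high, members) for the top `top_percent` of the value range.

    Assumption: the group covers the upper slice of the range between the
    smallest and largest metric value, not a fixed share of the profiles.
    """
    lo, hi = min(values), max(values)
    cutoff = hi - (hi - lo) * top_percent / 100.0
    members = [v for v in values if v >= cutoff]
    return cutoff, hi, members


# Six hypothetical profiles, identified by their metric value.
values = [10, 12, 11, 13, 100, 99.5]
low, high, members = weight_group(values, 1)
# Two of the six profiles fall into the top 1 percent of the value range.
```

Under this reading, the number of profiles in a group depends on how the values are distributed, which is why a "Top 1%" group can contain far more or far fewer than 1 percent of the profiles.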
The following screenshot shows a weight-filtered graph:
This graph summarizes the data from the selected set of profiles, but otherwise, it is equivalent in function to the graph without a weight filter.
The Focus filter provides a different way of looking at a specific function. The view itself is described in Focus.
Focusing the graph
Focus refines the analysis view by adding filters so that the view displays the callers and callees of a selected function. When a function can be called through different call stacks, each call stack is shown in the flame graph.
Setting the focus filter
Using the profiler graph
Place the mouse pointer on the frame of interest and click the Focus
button that is in the pop-up window.
In this example, the flame graph, which is expanded around the function
(*huffmanBitWriter).write, displays three different call stacks:
Using the focus button
Click the Focus button and then select a row from the Select focus function table:
You can sort the table rows in
(*compressor).deflate requires 1.46s to execute,
with 971ms spent in the function itself and the remainder of the time spent in
its call stack.
The percentages columns report that 62% of the total execution time is spent
in the function
(*compressor).deflate, and that 93% of the time, this function,
or a function in its call stack, is executing. Lastly, the count column reports
that there are 2 sequences that invoke the function
Using the profiler filter bar
In the filter bar, enter
Focus: and a string that identifies
the function to focus on. You can use a substring, including package prefixes,
or the full name. If you supply an ambiguous string, the function that
best matches is selected.
Focus filters and the graph
The standard flame graph is useful for getting an overall view of resource consumption within an application, and frame and stack filters let you refine that view. But the standard flame graph is less suitable for two common tasks:
- Analyzing the aggregate resource consumption of a given function that is called from multiple places.
- Analyzing the proportion of time spent in a function for different callers of the function.
For example, how do you analyze the resource consumption around the
function using the standard flame graph:
The graph built by the Focus filter effectively creates two flame graphs for the specified function and joins them together.
One half of the graph treats the function as the starting point of a standard flame graph and shows all the callees of the selected function. You can create this part with the standard flame graph:
The other half is a modified flame graph that also treats the specified function
as the starting point but instead shows all the stacks that call the function.
You can only approximate this with the standard flame graph. The following
screenshot shows such an approximation, a flame graph showing only call stacks
terminating with the
But this leaves the
Sort function represented by many frames in the graph.
For the second half of the two-sided flame graph, the
Sort frames are
aggregated into one frame, which becomes the root, and the callers become
These two graphs are then joined together along the frame for the specified
function, leaving you with a two-sided graph that can be followed from that
function in either direction. The following screenshot shows the result
of focusing on the
For analyzing your data, this graph supports the same interactions as the standard flame graph.
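The two halves of the focus graph can be sketched as two aggregations over the stacks that contain the focused function. This is a simplified illustration: the caller and callee names are invented, recursion is handled by splitting at the first occurrence only, and nothing here is taken from the Profiler's implementation.

```python
from collections import Counter


def focus(samples, func):
    """Split stacks at `func` into caller-side and callee-side aggregates.

    `samples` is a list of (stack, value) pairs, outermost caller first.
    Returns (callers, callees, total):
      callers: Counter keyed by the caller chain, nearest caller first
      callees: Counter keyed by the callee chain below `func`
      total:   summed value of all stacks that contain `func`
    """
    callers, callees, total = Counter(), Counter(), 0
    for stack, value in samples:
        if func not in stack:
            continue
        i = stack.index(func)  # simplification: first occurrence only
        callers[tuple(reversed(stack[:i]))] += value
        callees[tuple(stack[i + 1:])] += value
        total += value
    return callers, callees, total


samples = [
    (("main", "load", "Sort", "swap"), 3),
    (("main", "report", "Sort", "less"), 2),
    (("main", "report", "Sort", "swap"), 1),
    (("main", "idle"), 4),
]
callers, callees, total = focus(samples, "Sort")
```

Here `total` is the aggregate consumption of `Sort` across all of its call sites, `callers` shows how that consumption divides among the callers, and `callees` shows where it goes below `Sort`. Those are the two questions the standard flame graph answers poorly.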
Downloading profile data
Stackdriver Profiler keeps your profile data for 30 days, but you can download the data for longer-term storage or to share.
To download profile data:
- Use the time-range selector to specify the range of data to download.
- Use the pull-down interface controls to specify which data to download:
- Profile type
- Zone name
Click the download button on the far right of the filter bar:
The data is downloaded in a compressed file. The file names start with
profiler_, followed by values for any of the specifiers you set:
The data is written in the serialized protocol buffer (.pb) format. The open-source pprof tool can read this format, so you can analyze this data, not just store it. See the pprof documentation for more information.
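If you want to inspect a downloaded file programmatically before handing it to pprof, a first step is to get at the raw serialized bytes. The sketch below assumes the download is gzip-compressed, which is the usual convention for pprof profile files. Treat that as an assumption and check your own download. The function name is hypothetical.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def read_profile_bytes(path):
    """Return the raw serialized profile, decompressing gzip if needed."""
    with open(path, "rb") as fh:
        data = fh.read()
    if data[:2] == GZIP_MAGIC:
        data = gzip.decompress(data)
    return data
```

The function returns the bytes unchanged when the file is not gzip-compressed, so it is safe to call on either form.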
Comparing profiles
Stackdriver Profiler lets you visually compare two profiles of the same type, taken from the same service within a project. You can compare profiles that differ by:
- Ending times: Does the service run faster at certain times? Has there been a gradual improvement or decline over time?
- Zones: Does the service run faster in some places than in others?
- Service versions: Do newer versions differ markedly from older ones? How?
Selecting a comparison
To compare the most recent profile against another:
Use the Compare To pulldown menu to choose the comparison mode:
- None: default, select this to turn off comparison
- End date/time
Selecting an option other than None brings up fields for additional configuration.
Configure the comparison by selecting the desired date, zone, or version. The configuration is analogous to that done when setting up the current profile. See Range of time, Zone, or Version for more information.
The following screenshot shows the comparison of CPU-time profiles that differ by version:
The graph that results from a comparison is similar to the standard graph in many ways. Significant differences include:
- Meaning of colors
- Meaning of function-block size
- Values reported
Meaning of colors
In a standard profile graph, the colors themselves are meaningless.
The colors in the comparison graph are meaningful. The colors represent the size and direction of differences between values in the profiles:
- Gray indicates little or no difference.
- Red indicates that the primary profile exhibits greater values (of whatever the profile measures) than the secondary, compared-to profile.
- Blue indicates that the secondary profile exhibits greater values (of whatever the profile measures) than the primary profile.
The bigger the difference in values, the more saturated the color.
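The color rules above can be expressed as a small function. The tolerance for "little or no difference" and the linear mapping from difference to saturation are illustrative assumptions. The guide does not state the actual thresholds or scale.

```python
def comparison_color(primary, secondary, tolerance=0.01):
    """Return (hue, saturation) for a block in a comparison graph.

    hue is "gray", "red" (primary larger), or "blue" (secondary larger).
    saturation is in [0, 1] and grows with the relative difference.
    Values are assumed to be non-negative. The tolerance and the linear
    scaling are assumptions made for this sketch.
    """
    largest = max(primary, secondary)
    if largest == 0:
        return "gray", 0.0
    rel = (primary - secondary) / largest
    if abs(rel) <= tolerance:
        return "gray", 0.0
    return ("red" if rel > 0 else "blue"), min(abs(rel), 1.0)
```

For example, `comparison_color(10, 5)` yields a red block, and `comparison_color(0, 5)` yields a fully saturated blue block because the function exists only in the secondary profile.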
For example, the following comparison graph shows significant differences
foo2 callstacks. The
foo1 callstack is blue, indicating
that the version in the secondary profile consumes more CPU than the version
in the primary profile. The
foo2 callstack is red, indicating that the
version in the primary profile consumes more CPU than the version in the
Meaning of block size
The size of the function blocks in both standard and comparison graphs illustrates the relative consumption of whatever metric is being analyzed. The difference is that, in the comparison graph, the size of each function block is based on the average of the compared profiles.
If the size of the blocks reflected values from only one of the profiles, then any functions that existed only in the other profile, due perhaps to refactoring, would never appear in the graph.
By averaging the values for the two profiles, a block that exists only in one profile still appears in the comparison graph, at half its pre-comparison size.
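The averaging rule is simple enough to show directly. In this sketch each profile is reduced to a mapping from function name to value. The names and numbers are invented for the example.

```python
def comparison_sizes(primary, secondary):
    """Average per-function values from two profiles.

    A function missing from one profile counts as 0 there, so it still
    gets a block, at half its original size.
    """
    names = set(primary) | set(secondary)
    return {n: (primary.get(n, 0) + secondary.get(n, 0)) / 2 for n in names}


primary = {"foo1": 40, "old_helper": 10}
secondary = {"foo1": 60, "new_helper": 8}
sizes = comparison_sizes(primary, secondary)
```

Both `old_helper` and `new_helper` appear in the result even though each exists in only one profile, which is the behavior the averaging is meant to guarantee.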
When you hover over a function block, Stackdriver Profiler shows you both the details and a summary of the metric consumption. The details appear in the Values field, and the summary appears in a pop-up box:
In a comparison graph, the Values field and summary bar report the metric values and the differences between the two profiles.
The Values field reports up to 12 values for a comparison on a function block. For example:
The Values field shows value comparisons at one or two granularities:
- total: compares the values for the callstack beginning with the chosen function
- self: compares the values attributable to the chosen function itself; these values are omitted if both values are zero
For total and self, up to 6 values are shown:
- Absolute values:
- From primary profile: val1
- From secondary profile: val2
- Difference between primary and secondary: difference
- Percentage values:
- From primary profile: another val1
- From secondary profile: another val2
- Difference between primary and secondary: another difference
The 6 values appear as 2 triples. Each triple is reported using this format: val1 vs. val2 (difference)
If the difference is zero, then “no change” is reported.
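The triple format can be sketched as a formatting helper. The sign convention (primary minus secondary), the placement of "no change" inside the parentheses, and the number formatting are assumptions. The interface may render these details differently.

```python
def format_triple(val1, val2, unit=""):
    """Format a comparison as `val1 vs. val2 (difference)`.

    The difference is computed as val1 - val2 (an assumed convention).
    A zero difference is reported as "no change".
    """
    diff = val1 - val2
    shown = "no change" if diff == 0 else f"{diff:+g}{unit}"
    return f"{val1:g}{unit} vs. {val2:g}{unit} ({shown})"


changed = format_triple(1.5, 1.2, "s")
unchanged = format_triple(2, 2, "s")
```

Applying the helper once to the absolute values and once to the percentage values gives the two triples reported for each of total and self.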
The summary bar, shown as a tooltip when hovering over a function block, reports the function name and up to 6 of these values. For example:
Three values are shown for total and for self:
- Absolute value from primary profile: abval1
- Percentage value from primary profile: pct1
- Difference between primary and secondary absolute values: difference
Each triple of values is reported using this format: abval1 (pct1, difference)
The difference between the percentage values isn't reported in the summary.