What is application performance monitoring (APM)?

Application performance monitoring (APM) is the practice of gathering and analyzing telemetry data to help detect, diagnose, and resolve application performance issues before they impact end users. For enterprise teams, APM can be an essential practice that moves them from a reactive to a proactive operational posture. It can provide the insights needed to understand not just whether an application is working, but how well it's working and why it might be underperforming.

Application performance monitoring defined

Application performance monitoring (APM) is the process of using software tools and telemetry data to observe and manage the operational health of applications.

The goal of APM is to ensure that applications meet established performance expectations and to provide development and operations teams with the actionable data needed to troubleshoot issues quickly. It goes beyond simple infrastructure monitoring (like checking CPU usage) to provide a deep, code-level view of how an application is behaving, how it’s interacting with its dependencies, and how real users are experiencing its performance.

Key components of APM

A comprehensive APM solution is typically composed of several key functional components that work together to provide a holistic view of application health.

Log management

This component involves the collection, aggregation, and analysis of log files generated by the application and its infrastructure. Logs provide a detailed, time-stamped record of events, which is invaluable for debugging and security analysis.
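Logs are easier to aggregate and search when each entry is structured. As a minimal sketch using only Python's standard `logging` module, the formatter below writes one JSON object per log line. The field names (`timestamp`, `severity`, `logger`, `message`) and the `build_logger` helper are illustrative assumptions, not a schema required by any particular APM product.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object (field names are illustrative)."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "severity": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(entry)


def build_logger(name: str) -> logging.Logger:
    """Hypothetical helper: return a logger that emits JSON lines to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid attaching a second handler on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


if __name__ == "__main__":
    build_logger("checkout").info("order %s accepted", "A1")
```

Because every line is valid JSON with a timestamp, a log pipeline can filter by severity or logger name without parsing free-form text.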

Error tracking

Error tracking automatically captures and aggregates application errors and exceptions in real time. It groups similar errors, provides stack traces, and alerts development teams to new or recurring issues so they can be addressed quickly.
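To make the grouping idea concrete, here is a small sketch of one possible approach: derive a "fingerprint" from the exception type and the innermost stack frame, then count occurrences per fingerprint. The `fingerprint` function and `ErrorTracker` class are hypothetical names for illustration; real error trackers typically use more elaborate grouping rules.

```python
import traceback
from collections import Counter


def fingerprint(exc: BaseException) -> str:
    """Identify an error by its type and the innermost frame that raised it."""
    frames = traceback.extract_tb(exc.__traceback__)
    if not frames:
        return type(exc).__name__
    last = frames[-1]
    return f"{type(exc).__name__}@{last.name}:{last.lineno}"


class ErrorTracker:
    """Count errors per fingerprint so repeated failures collapse into one group."""

    def __init__(self) -> None:
        self.counts = Counter()

    def capture(self, exc: BaseException) -> bool:
        """Record the error; return True when this fingerprint is seen for the first time."""
        key = fingerprint(exc)
        self.counts[key] += 1
        return self.counts[key] == 1
```

The boolean returned by `capture` is the hook for alerting: a first occurrence can notify the team, while repeats only increment the group's count.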

User experience monitoring

This component focuses on the client side, measuring how real users experience the application's performance. Also known as Real User Monitoring (RUM), it captures metrics like page load times and frontend errors directly from the user's browser or mobile device.

Infrastructure monitoring

This component tracks the health and performance of the underlying infrastructure that the application runs on. It includes monitoring the performance of servers, containers, databases, and other backend services.

How application performance monitoring works

The process of application performance monitoring follows a continuous, cyclical workflow, moving from data collection in your live application to actionable insights for your development and operations teams.

Step 1: Data collection and instrumentation

The process begins by instrumenting your application to generate telemetry data. This is typically done by deploying lightweight software agents onto your servers or by including an SDK (Software Development Kit) in your application's code. These agents and SDKs automatically hook into your application's runtime to collect a rich stream of data, including:

Metrics: Performance indicators like response times and resource utilization
Traces: Detailed records of how a single request travels through all the different services and components of your application
Logs: Text-based records of specific events that occur within the application
Errors: Captured exceptions and stack traces when something goes wrong
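The following sketch shows, in simplified form, what "hooking into" application code can look like: a decorator that records a duration metric for every call and captures any exception that escapes. The `Telemetry` container and `instrument` decorator are hypothetical stand-ins for what an agent or SDK does automatically; they are not part of any real APM library.

```python
import functools
import time
from dataclasses import dataclass, field


@dataclass
class Telemetry:
    """In-memory stand-in for the data an agent or SDK would collect."""
    durations_ms: dict = field(default_factory=dict)  # operation name -> list of samples
    errors: list = field(default_factory=list)        # (operation name, exception repr)


def instrument(telemetry: Telemetry, operation: str):
    """Wrap a function so each call records its duration and any exception."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                telemetry.errors.append((operation, repr(exc)))
                raise  # re-raise so application behavior is unchanged
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000
                telemetry.durations_ms.setdefault(operation, []).append(elapsed_ms)
        return wrapper
    return decorator
```

Applying `@instrument(telemetry, "checkout")` to a function leaves its return value and exceptions untouched, which is the property that lets instrumentation be added without changing application logic.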

Step 2: Data transmission and aggregation

The agents and SDKs securely transmit this collected telemetry data from your application environment to a central APM platform. This platform is designed to ingest and aggregate massive volumes of data from all your application instances and infrastructure components.
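Agents generally avoid sending one network request per data point. A common pattern, sketched below under simplifying assumptions, is to buffer records and hand them off in batches. The `BatchExporter` class and its `send` callback are hypothetical; a real agent would also handle retries, timeouts, and encryption in transit.

```python
from typing import Callable


class BatchExporter:
    """Buffer telemetry records and hand them off in batches."""

    def __init__(self, send: Callable[[list], None], max_batch: int = 100) -> None:
        self._send = send            # assumed callback that transmits one batch
        self._max_batch = max_batch
        self._buffer: list = []

    def add(self, record: dict) -> None:
        """Queue a record and flush automatically once the batch is full."""
        self._buffer.append(record)
        if len(self._buffer) >= self._max_batch:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered; call this on shutdown to avoid losing data."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._send(batch)
```

Batching trades a small delay in delivery for far fewer requests, which keeps the overhead of monitoring low on the application being observed.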

Step 3: Data processing and correlation

Once the data arrives at the central platform, sophisticated processing begins. The platform correlates the different types of telemetry data to build a complete picture of each transaction. For example, it links a specific user's slow page load time (a metric) to the exact distributed trace that shows which backend service was slow, and then connects that trace to the specific log entries and error messages generated during that request.
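Correlation usually relies on a shared identifier that is attached to every piece of telemetry produced during a request. The sketch below assumes each record carries a `trace_id` and a `kind` of `"span"`, `"log"`, or `"error"`; those field names and the two helper functions are illustrative assumptions, not a real platform's data model.

```python
from collections import defaultdict


def correlate(records):
    """Group mixed telemetry records by the trace ID they share.

    Assumes every record has "trace_id" and a "kind" of "span", "log", or "error".
    """
    by_trace = defaultdict(lambda: {"spans": [], "logs": [], "errors": []})
    for record in records:
        by_trace[record["trace_id"]][record["kind"] + "s"].append(record)
    return dict(by_trace)


def slowest_span(transaction):
    """Return the span that took longest within one correlated transaction."""
    return max(transaction["spans"], key=lambda s: s["duration_ms"])
```

Given a slow request's trace ID, `correlate(records)[trace_id]` returns its spans, logs, and errors together, and `slowest_span` points at the service that contributed the most latency.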

Types of metrics tracked for APM

APM tools track a wide variety of metrics to create a comprehensive picture of performance. These include but are not limited to:

Traditional golden signals

Response time: This measures the total time it takes for an application to respond to a user request; it’s one of the most critical indicators of user-perceived performance
Throughput: This metric tracks the number of requests or transactions that an application handles over a specific period, often measured in requests per minute (RPM)
Error rates: This measures the percentage of requests that result in an error or failure; a sudden spike in the error rate is a clear indicator of a problem
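As a worked example, the three signals above can be computed from a window of request records. This sketch assumes each request is a `(duration_ms, status_code)` pair and treats HTTP status codes of 500 and above as failures; both are assumptions for illustration. Response time is summarized with a nearest-rank 95th percentile, since averages tend to hide slow outliers.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(requests, window_seconds):
    """Summarize a non-empty list of (duration_ms, status_code) pairs seen in the window."""
    durations = [duration for duration, _ in requests]
    failures = sum(1 for _, status in requests if status >= 500)  # assumed failure rule
    return {
        "p95_ms": percentile(durations, 95),
        "throughput_rpm": len(requests) * 60 / window_seconds,
        "error_rate_pct": 100 * failures / len(requests),
    }
```

For instance, 20 requests in a 60-second window, two of which failed, yield a throughput of 20 RPM and an error rate of 10%.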

CPU utilization

This tracks how much of the server's or container's CPU capacity is being consumed by the application. High CPU utilization can be a sign of inefficient code or insufficient resources.

Memory usage

This monitors the amount of memory (RAM) the application is using. Memory leaks or excessive usage can lead to poor performance and application crashes.
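For a rough feel of how CPU and memory figures can be derived, the sketch below measures a single piece of work inside the current Python process using only the standard library. The `sample_process` helper is hypothetical, and `tracemalloc` only sees memory allocated by Python itself, so this is an approximation of what an agent reads from the operating system or container runtime.

```python
import time
import tracemalloc


def sample_process(work):
    """Run work() and report CPU utilization and peak Python memory while it ran."""
    tracemalloc.start()
    wall_start, cpu_start = time.perf_counter(), time.process_time()
    work()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        # CPU seconds consumed divided by wall-clock seconds elapsed
        "cpu_utilization_pct": 100 * cpu / wall if wall > 0 else 0.0,
        "peak_python_memory_bytes": peak_bytes,
    }
```

A CPU-bound function reports utilization near 100%, while one that mostly waits on I/O reports a much lower figure for the same wall-clock time.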

Network I/O

This measures the amount of data being sent and received by the application over the network. It can help identify network bottlenecks or inefficient data transfer patterns.

Disk I/O

This tracks the read and write operations on the server's disk. High disk I/O can indicate a bottleneck in data-intensive applications.

Benefits of application performance monitoring

Implementing a robust APM strategy can provide numerous benefits that extend beyond simply fixing bugs.

Improved application performance

By providing deep insights into performance bottlenecks, APM tools help developers optimize code, database queries, and service interactions to create a faster and more efficient application.

Enhanced user experience

Fast, reliable applications lead to higher user satisfaction and engagement. APM helps ensure that performance issues are addressed before they can negatively impact a large number of users.

Faster troubleshooting and issue resolution

When an issue occurs, APM provides developers and operations teams with the correlated logs, traces, and metrics needed to quickly pinpoint the root cause, dramatically reducing the mean time to resolution (MTTR).
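MTTR itself is simple arithmetic: the average time between detecting an incident and resolving it. A minimal sketch, assuming incidents are recorded as `(detected_at, resolved_at)` pairs (the function name and record shape are illustrative):

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Average of (resolved_at - detected_at) over a non-empty list of incident pairs."""
    total = sum((resolved - detected for detected, resolved in incidents), timedelta())
    return total / len(incidents)
```

Two incidents that took 30 and 90 minutes to resolve give an MTTR of 60 minutes; tracking this figure over time shows whether faster diagnosis is actually shortening outages.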

Increased operational efficiency

APM automates the process of performance monitoring and can provide intelligent alerting to reduce alert fatigue. This allows operations teams to manage larger and more complex systems with greater efficiency.
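One simple way to cut alert noise, shown here as a sketch rather than as how any specific product works, is to fire only when a metric stays above its threshold for several consecutive samples. The `should_alert` function and its parameters are hypothetical.

```python
def should_alert(samples, threshold, sustained):
    """Return True only if the last `sustained` samples all exceed the threshold.

    Assumes sustained >= 1; a single spike followed by recovery does not fire.
    """
    if len(samples) < sustained:
        return False
    return all(value > threshold for value in samples[-sustained:])
```

Requiring a sustained breach filters out brief spikes that resolve on their own, at the cost of a slightly later notification for real incidents.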

Proactive problem identification

By analyzing performance trends over time, APM can help teams identify potential problems and capacity limitations before they result in a full-blown outage, enabling a more proactive approach to system health.
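Trend analysis can be as simple as fitting a straight line to recent samples and estimating when it will cross a limit. The sketch below uses an ordinary least-squares fit over evenly spaced samples; the function names are illustrative, and real capacity forecasting usually accounts for seasonality and noise that a straight line ignores.

```python
def fit_line(ys):
    """Least-squares slope and intercept for samples taken at x = 0, 1, 2, ... (needs 2+ samples)."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def intervals_until(ys, limit):
    """Estimate how many more sampling intervals until the trend reaches limit.

    Returns None when the trend is flat or falling.
    """
    slope, intercept = fit_line(ys)
    if slope <= 0:
        return None
    x_at_limit = (limit - intercept) / slope
    return max(0.0, x_at_limit - (len(ys) - 1))
```

With daily disk usage samples of 50, 55, 60, 65, and 70 percent, the fitted line rises 5 points per day, so a 90 percent limit is about four days away.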

How to set up application monitoring in Google Cloud

Think of your application as a team project. It has many different parts, like a frontend running on Cloud Run and a database. Application monitoring brings information from all those team members into one place, so you can see how the whole project is doing at a glance.

Here’s how you can set it up.

Step 1: Tell Google Cloud about your application

First, you need to create a "folder" for your application so Google Cloud knows it exists. You do this in a tool called App Hub.

This step is like giving your team project an official name. You aren't adding any of the parts yet—you're just creating the main idea of the application itself.

Step 2: Add your app's parts to the list

Now that you've named your project, it's time to assign your team members to it. In this step, you'll select the specific Google Cloud services that make up your application (like your Cloud Run service and your Firestore database) and add them to the application you created in App Hub.

This tells Google Cloud that all these separate services are actually working together as one team. This is the most important step, as it connects everything and allows Google to build your dashboard.

Step 3: Get deeper details from inside your code

The first two steps give you a great overview of your application's health. But to find the exact cause of a slowdown, you need to see what’s happening inside your code. This is called instrumentation.

Think of it like giving each team member a walkie-talkie. By adding a special tool (like OpenTelemetry) to your code, your application can send detailed reports about what it's doing and how long each task takes. This is a highly recommended step because it helps you find and fix problems much faster.

Step 4: Check your all-in-one dashboard

Once you've set everything up, you can go to Cloud Monitoring to see your new dashboard. It pulls together all the important information about your application's health onto a single screen.

You'll be able to see:

If your app is running smoothly and meeting its speed goals
Any error messages (logs) from all its different parts
How requests travel through your different services

Instead of checking on each team member one by one, you now have a project dashboard that gives you the full story in one simple view.
