Report client-side metrics
The Cloud Bigtable HBase client for Java can collect client-side metrics that enable you to monitor Bigtable's performance. Other Bigtable client libraries do not provide client-side metrics. This page explains how to enable client-side metrics in the HBase client for Java and lists the metrics that are available.
Enable metrics
The Cloud Bigtable HBase client for Java uses Dropwizard Metrics to collect and report client-side metrics. Because collecting metrics can add a very small amount of latency (single-digit microseconds) to each request, metrics are not enabled by default. The following sections explain how to enable client-side metrics.
Use the Log4j reporter
The simplest way to enable metrics is to add the following line to your Log4j configuration file:
log4j.category.com.google.cloud.bigtable.metrics=DEBUG
This configuration setting turns on metrics collection and logs Bigtable metrics with an SLF4J logger.
Use other reporters
You can use other types of reporters by updating your application's code. For example, here is a reporter that sends metrics to a Graphite server:
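import com.codahale.metrics.MetricFilter;
import com.codahale.metrics.graphite.GraphiteReporter;
import com.codahale.metrics.graphite.PickledGraphite;
import com.google.cloud.bigtable.metrics.BigtableClientMetrics;
import com.google.cloud.bigtable.metrics.DropwizardMetricRegistry;
import java.net.InetSocketAddress;
import java.util.concurrent.TimeUnit;

PickledGraphite pickledGraphite =
    new PickledGraphite(new InetSocketAddress("graphite.example.com", 2004));
DropwizardMetricRegistry registry = new DropwizardMetricRegistry();
GraphiteReporter reporter =
    GraphiteReporter.forRegistry(registry.getRegistry())
        .convertRatesTo(TimeUnit.SECONDS)
        .convertDurationsTo(TimeUnit.MILLISECONDS)
        .filter(MetricFilter.ALL)
        .build(pickledGraphite);
reporter.start(1, TimeUnit.MINUTES); // Report to Graphite once per minute.
// Register the registry so that the Bigtable client records its metrics in it.
BigtableClientMetrics.setMetricRegistry(registry);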
Dropwizard Metrics provides reporters for JMX, console logging, SLF4J, and CSV. There are also a variety of third-party reporters available.
Available metrics
This section describes the metrics that are available when client-side metrics are enabled. Each metric has one of the following types:
- Counter: A cumulative count per Java virtual machine (JVM).
- Meter: Count information plus throughput information (the rate over the last 1, 5, or 15 minutes).
- Timer: Meter information plus latency information (such as median, mean, and 95th percentile).
Only certain metrics are collected for each type of request. See "Example: Metrics for a Put request" for an example of the metrics that are collected during a Put request.
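The difference between the three metric types can be sketched in plain Java. The classes below are simplified, hypothetical stand-ins written for illustration; they are not the Dropwizard Metrics classes, and the meter reports only a mean rate over a caller-supplied interval rather than moving averages.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative only: simplified stand-ins, not the Dropwizard Metrics classes. */
final class MetricTypesSketch {
  private MetricTypesSketch() {}

  /** Counter: a single cumulative value, such as the number of active RPCs. */
  static final class SimpleCounter {
    private final AtomicLong value = new AtomicLong();

    void inc() { value.incrementAndGet(); }

    void dec() { value.decrementAndGet(); }

    long getCount() { return value.get(); }
  }

  /** Meter: a count plus throughput. This sketch reports only a mean rate. */
  static final class SimpleMeter {
    private final AtomicLong count = new AtomicLong();

    void mark() { count.incrementAndGet(); }

    long getCount() { return count.get(); }

    /** Events per second over the given elapsed time. */
    double meanRatePerSecond(double elapsedSeconds) {
      return elapsedSeconds <= 0 ? 0.0 : count.get() / elapsedSeconds;
    }
  }

  /** Timer: meter information plus a record of how long each event took. */
  static final class SimpleTimer {
    private final SimpleMeter meter = new SimpleMeter();
    private final List<Long> durationsMicros = new ArrayList<>();

    synchronized void update(long micros) {
      meter.mark();
      durationsMicros.add(micros);
    }

    long getCount() { return meter.getCount(); }

    /** Nearest-rank percentile; quantile is in (0, 1], for example 0.95. */
    synchronized long percentileMicros(double quantile) {
      if (durationsMicros.isEmpty()) {
        return 0;
      }
      List<Long> sorted = new ArrayList<>(durationsMicros);
      Collections.sort(sorted);
      int rank = (int) Math.ceil(quantile * sorted.size());
      return sorted.get(Math.max(rank, 1) - 1);
    }
  }
}
```

Each type builds on the previous one, which is why a Timer such as an operation latency metric also exposes a count and a rate.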
Channel-level metrics
Type | Name | Description |
---|---|---|
Counter | google-cloud-bigtable.sessions.active | How many BigtableSession objects are opened. Each HBase Connection has a single BigtableSession. |
Counter | google-cloud-bigtable.grpc.channel.active | How many lower-level gRPC/Netty channels are opened. Each BigtableSession has many gRPC channels. |
General RPC metrics
Type | Name | Description |
---|---|---|
Counter | google-cloud-bigtable.grpc.rpc.active | The number of remote procedure calls (RPCs) that are currently active. |
Meter | google-cloud-bigtable.grpc.rpc.performed | RPC throughput. |
Data method metrics
Data method metrics are collected for the following data methods:
- ReadRows: Implements gets and scans.
- MutateRow: Implements puts and deletes.
- MutateRows: Implements bulk writes.
- CheckAndMutateRow: Implements HBase's checkAnd* methods.
- ReadModifyWrite: Implements HBase's Append and Increment methods.
- SampleRowKeys: Retrieves region information that is used for MapReduce operations.
Type | Name | Description |
---|---|---|
Timer | google-cloud-bigtable.grpc.method.[METHOD_TYPE].operation.latency | The length of time that individual operations take. Operations include the total amount of latency of all RPCs that are performed. (Usually only one RPC is performed. Client-side retries can cause the same RPC to be performed more than once if there is a transient error.) |
Timer | google-cloud-bigtable.grpc.method.ReadRows.firstResponse.latency | The length of time that it takes to receive the first response to a scan request. |
Meter | google-cloud-bigtable.grpc.method.[METHOD_TYPE].retries.performed | The number of retries that were performed. |
Meter | google-cloud-bigtable.grpc.method.[METHOD_TYPE].failures | The number of non-retryable failures. |
Meter | google-cloud-bigtable.grpc.method.[METHOD_TYPE].retries.exhausted | The number of times that retrying was aborted because too many retries had failed. |
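
Because the data method metrics share one naming pattern, the full set of names can be generated by substituting each data method for [METHOD_TYPE], which can be handy when building a filter or a dashboard. The following is a minimal sketch: the class name is hypothetical, and it assumes that the data method names listed above are used verbatim as the [METHOD_TYPE] segment, which this page confirms only for ReadRows and MutateRow.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper; not part of the Bigtable HBase client. */
final class DataMethodMetricNames {
  private DataMethodMetricNames() {}

  static final String PREFIX = "google-cloud-bigtable.grpc.method.";

  // Assumption: these strings match the [METHOD_TYPE] segment exactly.
  static final List<String> METHOD_TYPES = Arrays.asList(
      "ReadRows", "MutateRow", "MutateRows",
      "CheckAndMutateRow", "ReadModifyWrite", "SampleRowKeys");

  // Suffixes that apply to every data method.
  static final List<String> COMMON_SUFFIXES = Arrays.asList(
      "operation.latency", "retries.performed", "failures", "retries.exhausted");

  /** Builds one metric name from a method type and a suffix. */
  static String name(String methodType, String suffix) {
    return PREFIX + methodType + "." + suffix;
  }

  /** Returns every data method metric name described in the table. */
  static List<String> allNames() {
    List<String> names = new ArrayList<>();
    for (String methodType : METHOD_TYPES) {
      for (String suffix : COMMON_SUFFIXES) {
        names.add(name(methodType, suffix));
      }
    }
    // The first-response timer exists only for ReadRows.
    names.add(name("ReadRows", "firstResponse.latency"));
    return names;
  }

  public static void main(String[] args) {
    for (String metricName : allNames()) {
      System.out.println(metricName);
    }
  }
}
```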
Bulk metrics
Bulk metrics are provided for methods that return more than one response, such as a bulk write.
Type | Name | Description |
---|---|---|
Meter | google-cloud-bigtable.scanner.results | The throughput of individual rows returned by a scan. |
Meter | google-cloud-bigtable.bulk-mutator.mutations.added | The throughput of individual mutations added for each MutateRows request. |
Meter | google-cloud-bigtable.bulk-mutator.mutations.retried | The number of individual mutations retried over time. |
Bigtable table metrics
Converting Bigtable objects to HBase objects can add to the latency of a request. The following timers can be correlated with the specified *.operation.latency timers to measure the cost of the conversion.
Type | Name | Description |
---|---|---|
Timer | google-cloud-bigtable.table.put.latency | The length of time that individual Put operations take. Correlates with google-cloud-bigtable.grpc.method.MutateRow.operation.latency. |
Timer | google-cloud-bigtable.table.get.latency | The length of time that individual Get operations take. Correlates with google-cloud-bigtable.grpc.method.ReadRows.operation.latency. |
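
As a rough sketch of that correlation, the conversion cost can be estimated by subtracting the RPC-level timer value from the table-level timer value. The class and method names are hypothetical and the sample values are made up. The sketch also assumes that the table-level timer spans the entire RPC operation, which this page implies but does not state outright.

```java
/** Hypothetical helper; not part of the Bigtable HBase client. */
final class ConversionOverhead {
  private ConversionOverhead() {}

  /**
   * Estimates the conversion cost as table-level latency minus RPC operation
   * latency. Returns 0 when sampling noise makes the difference negative.
   */
  static long overheadMicros(long tableLatencyMicros, long operationLatencyMicros) {
    return Math.max(0L, tableLatencyMicros - operationLatencyMicros);
  }

  public static void main(String[] args) {
    // Made-up mean values, for illustration only.
    long putMicros = 2_600;       // google-cloud-bigtable.table.put.latency
    long mutateRowMicros = 2_500; // ...grpc.method.MutateRow.operation.latency
    System.out.println(overheadMicros(putMicros, mutateRowMicros) + " microseconds");
  }
}
```

Compare like with like, for example mean to mean. Percentiles from two separate timers do not subtract exactly, so treat a difference of percentiles as an approximation.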
Example: Metrics for a Put request
When client-side metrics are enabled, the following metrics are collected for a successful Put request that is not retried:
- Counter: google-cloud-bigtable.grpc.rpc.active
- Meter: google-cloud-bigtable.grpc.rpc.performed
- Timer: google-cloud-bigtable.grpc.method.MutateRow.operation.latency
- Timer: google-cloud-bigtable.table.put.latency
Collecting these metrics adds approximately 1 microsecond (1/1000 of a millisecond) to the Put operation. The overall Put operation could be as fast as 2 to 3 milliseconds, assuming that the operation includes about 1 KB of data.
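To put those figures in proportion, the relative cost can be worked out with simple arithmetic: 1 microsecond on a 2 to 3 millisecond operation is roughly 0.03% to 0.05% of the total. The helper below is a hypothetical sketch of that calculation, using only the numbers stated above as inputs.

```java
/** Hypothetical helper that expresses metrics overhead as a percentage. */
final class MetricsOverhead {
  private MetricsOverhead() {}

  /** Returns the overhead as a percentage of the total operation time. */
  static double overheadPercent(double overheadMicros, double operationMillis) {
    if (operationMillis <= 0) {
      throw new IllegalArgumentException("operationMillis must be positive");
    }
    double operationMicros = operationMillis * 1000.0; // 1 ms = 1000 microseconds
    return 100.0 * overheadMicros / operationMicros;
  }

  public static void main(String[] args) {
    // Inputs from the text: about 1 microsecond of overhead on a 2 to 3 ms Put.
    System.out.printf("%.3f%% to %.3f%%%n",
        overheadPercent(1.0, 3.0), overheadPercent(1.0, 2.0));
  }
}
```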