Cassandra

Cassandra is a scalable and fault-tolerant NoSQL database system. For more information, visit http://cassandra.apache.org/.

You can configure a receiver for the Ops Agent to retrieve telemetry from a Cassandra node's Java Virtual Machine (JVM) through Java Management Extensions (JMX).

Prerequisites

To collect and ingest Cassandra logs and metrics, you must install Ops Agent version 2.6.0 or higher.

Configure your Cassandra instance

To expose a JMX endpoint, you must set the com.sun.management.jmxremote.port system property when starting the JVM. We also recommend setting the com.sun.management.jmxremote.rmi.port system property to the same port. To expose a JMX endpoint remotely, you must also set the java.rmi.server.hostname system property.

By default, these properties are set in a Cassandra deployment's cassandra-env.sh file.

To set system properties by using command-line arguments, prefix the property name with -D when starting the JVM. For example, to set com.sun.management.jmxremote.port to port 7199, specify the following when starting the JVM:

-Dcom.sun.management.jmxremote.port=7199
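
As a sketch of how the three properties might be set together, the following shell fragment appends them to JVM_OPTS in the style of cassandra-env.sh. The port (7199) and hostname (127.0.0.1) are example values, and the variable names JMX_PORT and JVM_OPTS are assumptions that can differ between Cassandra versions, so adapt the fragment to your own file rather than pasting it as-is.

```shell
# Illustrative fragment in the style of cassandra-env.sh (not the shipped file).
# JMX_PORT and JVM_OPTS are assumed variable names; check your own file.
JMX_PORT="7199"
JVM_OPTS="${JVM_OPTS:-}"

# Port on which the JMX endpoint listens.
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT"
# Use the same port for RMI, as recommended above.
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.rmi.port=$JMX_PORT"
# Needed only when the JMX endpoint is exposed remotely.
JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=127.0.0.1"

# Show the resulting options so they can be reviewed before restarting Cassandra.
echo "$JVM_OPTS"
```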

Configure the Ops Agent for Cassandra

Following the guide for Configuring the Ops Agent, add the required elements to collect logs and metrics from your Cassandra instances, and restart the agent.

Example configuration

The following commands create the configuration file to collect and ingest logs and metrics for Cassandra and restart the Ops Agent on Linux.

sudo tee /etc/google-cloud-ops-agent/config.yaml > /dev/null << EOF
logging:
  receivers:
    cassandra_default_system:
      type: cassandra_system
    cassandra_default_debug:
      type: cassandra_debug
    cassandra_default_gc:
      type: cassandra_gc
  service:
    pipelines:
      cassandra:
        receivers:
          - cassandra_default_system
          - cassandra_default_debug
          - cassandra_default_gc
metrics:
  receivers:
    cassandra_metrics:
      type: cassandra
      endpoint: localhost:7199
      collection_interval: 60s
  service:
    pipelines:
      cassandra_pipeline:
        receivers:
          - cassandra_metrics
EOF
sudo service google-cloud-ops-agent restart

In this example, the com.sun.management.jmxremote.port and com.sun.management.jmxremote.rmi.port system properties were set to 7199, and the java.rmi.server.hostname system property was set to 127.0.0.1. For more information, see Configure metrics collection.

Configure logs collection

To ingest logs from Cassandra, you must create receivers for the logs Cassandra produces and then create a pipeline for the new receivers.

To configure a receiver for your cassandra_system logs, specify the following fields:

Field | Default | Description
type |  | The value must be cassandra_system.
include_paths | [/var/log/cassandra/system*.log] | A list of filesystem paths to read by tailing each file. A wildcard (*) can be used in the paths; for example, /var/log/cassandra/system*.log.
exclude_paths | [] | A list of filesystem path patterns to exclude from the set matched by include_paths.


To configure a receiver for your cassandra_debug logs, specify the following fields:

Field | Default | Description
type |  | The value must be cassandra_debug.
include_paths | [/var/log/cassandra/debug*.log] | A list of filesystem paths to read by tailing each file. A wildcard (*) can be used in the paths; for example, /var/log/cassandra/debug*.log.
exclude_paths | [] | A list of filesystem path patterns to exclude from the set matched by include_paths.


To configure a receiver for your cassandra_gc logs, specify the following fields:

Field | Default | Description
type |  | The value must be cassandra_gc.
include_paths | [/var/log/cassandra/gc.log.*.current] | A list of filesystem paths to read by tailing each file. A wildcard (*) can be used in the paths; for example, /var/log/cassandra/gc.log.*.current.
exclude_paths | [] | A list of filesystem path patterns to exclude from the set matched by include_paths.
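
If your Cassandra deployment writes logs outside the default directory, a receiver can override include_paths and exclude_paths. The following shell sketch prints a logging fragment for that case. The log directory /opt/cassandra/logs, the excluded system-archive*.log pattern, the receiver ID cassandra_custom_system, and the function name print_logging_fragment are all hypothetical; only the field names and the cassandra_system type come from the tables above.

```shell
# Hypothetical log directory for a non-default Cassandra install.
CASSANDRA_LOG_DIR="/opt/cassandra/logs"

# Print a logging fragment that tails system logs from CASSANDRA_LOG_DIR
# and skips files matching an (assumed) archive naming pattern.
print_logging_fragment() {
  cat <<EOF
logging:
  receivers:
    cassandra_custom_system:
      type: cassandra_system
      include_paths:
        - ${CASSANDRA_LOG_DIR}/system*.log
      exclude_paths:
        - ${CASSANDRA_LOG_DIR}/system-archive*.log
  service:
    pipelines:
      cassandra:
        receivers:
          - cassandra_custom_system
EOF
}

print_logging_fragment
```

Review the printed fragment and merge it into the Ops Agent configuration file by hand, then restart the agent as shown in the example configuration.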

What is logged

The logName of each cassandra_system, cassandra_debug, and cassandra_gc log is derived from the receiver ID specified in the configuration. The fields inside the LogEntry are as follows.

cassandra_system, cassandra_debug
These logs contain the following fields in the LogEntry:

Field | Type | Description
jsonPayload.level | string | Log entry level
jsonPayload.module | string | Module of Cassandra where the log originated
jsonPayload.javaClass | string | Java class where the log originated
jsonPayload.lineNumber | number | Line number of the source file where the log originated
jsonPayload.message | string | Log message, including detailed stack trace where provided
severity | string (LogSeverity) | Log entry level (translated)
timestamp | string (Timestamp) | Time the entry was logged

Any fields that are blank or missing will not be present in the log entry.

cassandra_gc
These logs contain the following fields in the LogEntry:

Field | Type | Description
jsonPayload.uptime | number | Seconds the JVM has been active
jsonPayload.timeStopped | number | Seconds the JVM was stopped for garbage collection
jsonPayload.timeStopping | number | Seconds the JVM took to stop threads before garbage collection
jsonPayload.message | string | Log message
timestamp | string (Timestamp) | Time the entry was logged

Configure metrics collection

To collect metrics from a Cassandra node, you must create a receiver for Cassandra metrics and then create a pipeline for the new receiver. To configure a receiver for your Cassandra metrics, specify the following fields:

Field | Default | Description
type |  | The value must be cassandra.
endpoint | localhost:7199 | The JMX Service URL or host and port used to construct the service URL. This value must be in the form of service:jmx:<protocol>:<sap> or host:port. Values in host:port form are used to create a service URL of service:jmx:rmi:///jndi/rmi://<host>:<port>/jmxrmi.
collect_jvm_metrics | true | Configures the receiver to also collect the supported JVM metrics.
username |  | The configured username if JMX is configured to require authentication.
password |  | The configured password if JMX is configured to require authentication.
collection_interval | 60s | A time.Duration value, such as 30s or 5m.
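
The host:port rule for the endpoint field can be sketched as a small shell function. This only illustrates the documented URL format; it is not the Ops Agent's own implementation, and the function name jmx_service_url is hypothetical.

```shell
# Expand an endpoint value into a JMX service URL, following the rule in the
# endpoint description: values that already start with "service:jmx:" are
# passed through unchanged, and host:port values are wrapped in the RMI form.
jmx_service_url() {
  endpoint="$1"
  case "$endpoint" in
    service:jmx:*) printf '%s\n' "$endpoint" ;;
    *) printf 'service:jmx:rmi:///jndi/rmi://%s/jmxrmi\n' "$endpoint" ;;
  esac
}

# prints service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi
jmx_service_url "localhost:7199"
```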

What is monitored

The following table provides the list of metrics that the Ops Agent collects from the Cassandra instance.

Metric type | Kind, Type | Monitored resources | Labels
workload.googleapis.com/cassandra.client.request.count | CUMULATIVE, INT64 | gce_instance | operation
workload.googleapis.com/cassandra.client.request.error.count | CUMULATIVE, INT64 | gce_instance | status, operation
workload.googleapis.com/cassandra.client.request.range_slice.latency.50p | GAUGE, DOUBLE | gce_instance |
workload.googleapis.com/cassandra.client.request.range_slice.latency.99p | GAUGE, DOUBLE | gce_instance |
workload.googleapis.com/cassandra.client.request.range_slice.latency.count | CUMULATIVE, INT64 | gce_instance |
workload.googleapis.com/cassandra.client.request.range_slice.latency.max | GAUGE, DOUBLE | gce_instance |
workload.googleapis.com/cassandra.client.request.range_slice.timeout.count | CUMULATIVE, INT64 | gce_instance |
workload.googleapis.com/cassandra.client.request.range_slice.unavailable.count | CUMULATIVE, INT64 | gce_instance |
workload.googleapis.com/cassandra.client.request.read.latency.50p | GAUGE, DOUBLE | gce_instance |
workload.googleapis.com/cassandra.client.request.read.latency.99p | GAUGE, DOUBLE | gce_instance |
workload.googleapis.com/cassandra.client.request.read.latency.count | CUMULATIVE, INT64 | gce_instance |
workload.googleapis.com/cassandra.client.request.read.latency.max | GAUGE, DOUBLE | gce_instance |
workload.googleapis.com/cassandra.client.request.read.timeout.count | CUMULATIVE, INT64 | gce_instance |
workload.googleapis.com/cassandra.client.request.read.unavailable.count | CUMULATIVE, INT64 | gce_instance |
workload.googleapis.com/cassandra.client.request.write.latency.50p | GAUGE, DOUBLE | gce_instance |
workload.googleapis.com/cassandra.client.request.write.latency.99p | GAUGE, DOUBLE | gce_instance |
workload.googleapis.com/cassandra.client.request.write.latency.count | CUMULATIVE, INT64 | gce_instance |
workload.googleapis.com/cassandra.client.request.write.latency.max | GAUGE, DOUBLE | gce_instance |
workload.googleapis.com/cassandra.client.request.write.timeout.count | CUMULATIVE, INT64 | gce_instance |
workload.googleapis.com/cassandra.client.request.write.unavailable.count | CUMULATIVE, INT64 | gce_instance |
workload.googleapis.com/cassandra.compaction.tasks.completed | CUMULATIVE, INT64 | gce_instance |
workload.googleapis.com/cassandra.compaction.tasks.pending | GAUGE, INT64 | gce_instance |
workload.googleapis.com/cassandra.storage.load.count | CUMULATIVE, INT64 | gce_instance |
workload.googleapis.com/cassandra.storage.total_hints.count | CUMULATIVE, INT64 | gce_instance |
workload.googleapis.com/cassandra.storage.total_hints.in_progress.count | CUMULATIVE, INT64 | gce_instance |

Verify the configuration

You can use the Logs Explorer and Metrics Explorer to verify that you correctly configured the Cassandra receivers. It might take one or two minutes for the Ops Agent to begin collecting logs and metrics.

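To verify that the logs are ingested, go to the Logs Explorer and run the following query to view the Cassandra logs. The log names match the receiver IDs in the example configuration; if you used different receiver IDs, substitute them:

resource.type="gce_instance"
logName=("projects/PROJECT_ID/logs/cassandra_default_system" OR "projects/PROJECT_ID/logs/cassandra_default_debug" OR "projects/PROJECT_ID/logs/cassandra_default_gc")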
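
A filter of this shape can also be assembled for command-line use. The shell sketch below builds it from a project ID and a list of receiver IDs, then prints it. The project ID my-project is a placeholder, and the receiver IDs listed are the ones from the example configuration on this page, so substitute your own values.

```shell
# Placeholder project ID and the receiver IDs from the example configuration.
PROJECT_ID="my-project"
RECEIVER_IDS="cassandra_default_system cassandra_default_debug cassandra_default_gc"

# Join one quoted log name per receiver ID with OR.
names=""
for id in $RECEIVER_IDS; do
  entry="\"projects/${PROJECT_ID}/logs/${id}\""
  if [ -z "$names" ]; then
    names="$entry"
  else
    names="$names OR $entry"
  fi
done

FILTER="resource.type=\"gce_instance\" AND logName=($names)"
printf '%s\n' "$FILTER"
```

If the gcloud CLI is installed and authenticated, the printed filter can be passed to gcloud logging read "$FILTER" --limit=10 to list recent entries.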


To verify the metrics are ingested, go to Metrics Explorer and run the following query in the MQL tab.

fetch gce_instance
| metric 'workload.googleapis.com/cassandra.client.request.count'
| align rate(1m)
| every 1m