Troubleshoot read pool queries

This page describes techniques for investigating and debugging queries that AlloyDB for PostgreSQL sends to a read pool instance:

  • Seeing a detailed list of a read pool's constituent nodes, including their IP addresses.
  • Connecting directly to a node, for debugging purposes.
  • Examining AlloyDB logs to determine which particular node is handling any one query sent to a read pool.
  • Querying the logs for all recent activity from a given read pool node.
  • Viewing Google Cloud metrics associated with a read pool node.

Used together, these techniques give you diagnostic and debugging access to your read pools. For example, if one of your cluster's read pools is using an unusual amount of CPU while processing a long-running query, these techniques let you determine which node is handling that query, and then directly connect to that node to further examine or terminate the query.

List details of a read pool's nodes

In normal AlloyDB use, you don't need to know the identities or addresses of the nodes that make up your read pools. However, when necessary, you can see a list of a read pool instance's nodes. Each listed node includes the following information useful for subsequent diagnostic and debugging:

To see the internal ID strings and IP addresses of a read pool's nodes, follow the gcloud-specific instructions on View instance details, but add an additional --view=FULL command-line argument:

gcloud

gcloud alloydb instances describe READ_POOL_ID \
 --region=REGION_ID \
 --cluster=CLUSTER_ID \
 --project=PROJECT_ID \
 --view=FULL

Replace the following:

  • READ_POOL_ID: The ID of the read pool.
  • REGION_ID: The instance's region ID.
  • CLUSTER_ID: The ID of the instance's cluster.
  • PROJECT_ID: The ID of the instance's project.

The output includes a section labeled nodes, resembling this:

nodes:
- id: READ_POOL_INSTANCE_ID-edd4f6ed-hcfh
  ip: 10.90.80.57
  state: HEALTHY
  zoneId: us-central1-b
- id: READ_POOL_INSTANCE_ID-edd4f6ed-ldbm
  ip: 10.90.80.56
  state: HEALTHY
  zoneId: us-central1-c

The id and ip fields for each entry are especially relevant to the other techniques described on this page:

  • The ip field shows the node's IP address within the cluster's VPC.

  • The id field contains the node's full Google Cloud ID string. Only the final four characters of this string appear in logged entries for a node.

    For example, to find log entries involving the first of the two nodes in the previous sample output, query the logs using the ID string hcfh.

Connect directly to a node

Once you know the IP address of a node, you can connect directly to its PostgreSQL server. For example, to use psql to connect to a VM in your cluster's VPC, follow the instructions in Run the psql client. When you do, provide the node's IP address rather than that of its read pool instance:

psql -h NODE_IP_ADDRESS -U USERNAME

Find node activity in logs

AlloyDB includes node IDs in log entries about queries handled by read pools. Generally speaking, you can use these discovered IDs in two ways:

  • Determine that node's IP so you can connect to it.
  • Perform further log queries to learn more about the node's recent activity.

Determine which node is handling a known query

If you know that a certain read pool is processing a long-running query, then Log Explorer can help you determine the ID of the particular node handling that query.

Note that this technique works only on read pool instances that have enabled the pgAudit extension.

  1. Visit the Logs Explorer:

    Go to the Logs Explorer

  2. In the query builder, add resource.labels.instance_id="READ_POOL_ID" to the query-editor field, replacing READ_POOL_ID with the name of your read pool instance.

  3. Add the SQL statement you're investigating, in whole or in part, to the query editor field. For example: select id from MyTable. This input is not case-sensitive.

  4. Click Run query.

  5. Use Log Explorer's controls to tune and re-run the query as needed to filter the results down to the most relevant ones.

  6. Click a log entry in the results list to expand its display.

  7. Click the labels field in the entry's expanded display.

  8. Note the value of NODE_ID under labels.

The result is the four-character identifier of the node handling the query.

Connect to a node mentioned in a log entry

If you wish to connect directly to a particular node's PostgreSQL server based on its logged activity, follow these steps:

  1. Note the logged node's four-character ID string. You can find this ID in the log entry's NODE_ID field.

  2. List the nodes for the read pool.

  3. In that list, find a node with the ID string that ends wth the four characters you noted in the first step. It is possible that none of the listed nodes match.

  4. If you do find a matching node, use its matching IP address to connect to that node's PostgreSQL server.

    Otherwise, if none of the read pool's nodes listed in the previous step have IDs matching the logged node, the read pool has decommissioned that node in the time that has passed since the original log entry. This is normal behavior for AlloyDB read pools, as described in A note about node ephermerality. In this case, you can't establish a direct connection to that node.

Once you connect to a node's PostgreSQL server, you can use standard PostgreSQL monitoring techniques, such as pg_stat_activity, to further investigate the node's current processes, and adjust them as needed.

See more log entries about a node

To see more recent logged activity about a node with a specific ID:

  1. Visit the Logs Explorer:

    Go to the Logs Explorer

  2. Add labels.NODE_ID=NODE_ID to the Logs Explorer query builder, replacing NODE_ID with the node's four-character ID string.

  3. Click Run Query to see all activity from that node within the selected time window, or tune the query to filter it further.

  4. Repeat the previous step as needed to refine your search.

View node metrics

AlloyDB tracks three metrics regarding individual nodes:

  • Node CPU usage measures the CPU activity of a node.

  • Node postgres uptime measures the uptime and availability of the PostgreSQL-compatible database server run by an AlloyDB node.

  • Node replication lag measures the time required to replicate data from your cluster's primary instance to a node within a read pool instance.

To view one of these metrics for a single node in Metrics Explorer, follow these steps:

  1. In the Google Cloud console, go to the Metrics Explorer page.

    Go to Metrics Explorer

  2. Click the Metric selector.

  3. In the Select a metric pane, under Active resources, click AlloyDB node.

  4. Under Active metric categories, click Node.

  5. Under Active metrics, click one of the available metrics.

  6. Click Apply.

  7. Click Add filter.

  8. Click the Label 1 selector.

  9. Select node_id.

  10. Click the Value 1 selector.

  11. Select a node ID from the list.

To learn which node IDs are associated with a particular read pool instance, see List details of a read pool's nodes.

For full reference documentation of these and other AlloyDB metrics, see alloydb in "Google Cloud metrics".

A note about node ephemerality

While you can safely connect to a node for temporary investigation or debugging purposes, applications that use read pools should always connect to those pools at the instance level, using the IP address that its cluster displays in its instance list.

AlloyDB treats its read pools' nodes as ephemeral, interchangeable resources. The service changes a read pool's roster of nodes as often as necessary in order to keep that read pool instance load-balanced and responsive. An application that connects directly to a read pool node, rather than to a read pool instance, runs the risk of abruptly disconnecting from your database whenever AlloyDB updates the instance's node roster.

Always let your applications connect to your read pools at the instance level, and let AlloyDB do the work of efficiently routing your queries to the proper nodes.

What's next