Troubleshoot Kafka issues using audit logs

Managed Service for Apache Kafka generates audit logs that capture administrative activities on your Managed Service for Apache Kafka resources. Some examples of these activities include creating a cluster, updating a topic, or deleting an ACL. You can use these logs for troubleshooting issues and verifying the security of your Kafka environment.

Keep the following points in mind when working with audit logs in Managed Service for Apache Kafka:

  • Managed Service for Apache Kafka audit logs use the service name managedkafka.googleapis.com.

  • The type of log generated depends on the type of operation performed.

    • Admin Activity audit logs record operations that modify the configuration or metadata of a resource, such as creating, deleting, or updating clusters, topics, and ACLs. These logs are generated by methods that require the ADMIN_WRITE permission type.

    • Data Access audit logs record operations that read the configuration or metadata of a resource, such as getting a cluster's details or listing topics. These logs are generated by methods that require the ADMIN_READ or DATA_READ permission type. Data Access audit logs are disabled by default and are written only after you explicitly enable them.

  • Cloud Audit Logs does not record data plane operations for Managed Service for Apache Kafka, such as producing or consuming messages.
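
As a rough illustration of what enabling Data Access audit logs involves, the following Python sketch builds the auditConfigs stanza of an IAM policy for managedkafka.googleapis.com. The helper names and the sample policy are hypothetical. The sketch only prepares the policy document and assumes that you apply it yourself, for example by editing the output of gcloud projects get-iam-policy and passing it to gcloud projects set-iam-policy.

```python
import json

# Service name taken from the text above.
KAFKA_SERVICE = "managedkafka.googleapis.com"


def data_access_audit_config(log_types=("ADMIN_READ", "DATA_READ")):
    """Build an auditConfigs entry that enables the given log types."""
    return {
        "service": KAFKA_SERVICE,
        "auditLogConfigs": [{"logType": log_type} for log_type in log_types],
    }


def with_audit_config(policy, config):
    """Return a copy of an IAM policy dict in which `config` replaces
    any existing auditConfigs entry for the same service."""
    others = [
        entry
        for entry in policy.get("auditConfigs", [])
        if entry.get("service") != config["service"]
    ]
    return {**policy, "auditConfigs": others + [config]}


if __name__ == "__main__":
    # Hypothetical, minimal policy used only for illustration.
    policy = {"bindings": [], "etag": "hypothetical-etag"}
    updated = with_audit_config(policy, data_access_audit_config())
    print(json.dumps(updated, indent=2))
```

Entries for other services are left untouched, so the helper can be run against a policy that already carries audit configuration.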

For more information about Cloud Audit Logs, see the Cloud Audit Logs documentation.

View audit logs

You can use Logs Explorer to view Managed Service for Apache Kafka logs.

  1. Get the required permissions to view logs in Logs Explorer. For more information, see Before you begin.

  2. In the Google Cloud console, go to the Logs Explorer page.

  3. Select an existing Google Cloud project.

  4. To display all audit logs for Managed Service for Apache Kafka, enter the following query into the query-editor field and click Run query:

    protoPayload.serviceName="managedkafka.googleapis.com"
    
  5. To refine your search, do the following in the Query builder pane:

    • For All Resources, select Apache Kafka Cluster and then drill down to a specific cluster name to filter for that resource.

    • For Log name, select the audit log type that you want to see. For example, select activity for Admin Activity logs.
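
The refinements in the last step can also be typed directly into the query editor. The following Python sketch composes such a filter string. It is an illustration under stated assumptions: the function name audit_filter is hypothetical, and the sketch assumes that a cluster's protoPayload.resourceName contains a path segment of the form /clusters/CLUSTER_ID. The log IDs are the standard Cloud Audit Logs names, with the slash URL-encoded as %2F.

```python
KAFKA_SERVICE = "managedkafka.googleapis.com"

# Cloud Audit Logs log IDs as they appear inside a logName value.
LOG_IDS = {
    "activity": "cloudaudit.googleapis.com%2Factivity",
    "data_access": "cloudaudit.googleapis.com%2Fdata_access",
}


def audit_filter(project_id, log_type=None, cluster=None):
    """Compose a Logs Explorer filter for Kafka audit logs.

    log_type: "activity" or "data_access" (optional).
    cluster:  cluster ID to narrow the results to (optional).
    """
    clauses = [f'protoPayload.serviceName="{KAFKA_SERVICE}"']
    if log_type is not None:
        clauses.append(
            f'logName="projects/{project_id}/logs/{LOG_IDS[log_type]}"'
        )
    if cluster is not None:
        # Assumption: the resource name contains "/clusters/<id>".
        # ":" is a substring match, so "orders" also matches "orders-2".
        clauses.append(f'protoPayload.resourceName:"/clusters/{cluster}"')
    return " AND ".join(clauses)


if __name__ == "__main__":
    # "example-project" and "orders" are placeholder values.
    print(audit_filter("example-project", log_type="activity", cluster="orders"))
```

Paste the printed string into the query-editor field and click Run query.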

Use audit logs to troubleshoot

You can use audit logs for troubleshooting tasks such as the following:

  • Identify who created, deleted, or modified a cluster, topic, or ACL.

  • Track when a cluster's configuration was updated.

  • Verify the existence of a resource at a specific point in time.

You can't use audit logs to troubleshoot issues related to the data plane. For example, audit logs don't contain information about message production or consumption failures, message ordering problems, or client connection issues. For these issues, you must use other tools such as Cloud Monitoring for Managed Service for Apache Kafka.

The most direct way to find a specific event is to filter by the protoPayload.methodName field. The following table provides example filters for common troubleshooting scenarios:

Event                                 Filter to be used in Logs Explorer
------------------------------------  ---------------------------------------------------------------------------------
Creation of a new cluster             protoPayload.methodName="google.cloud.managedkafka.v1.ManagedKafka.CreateCluster"
Updates to a cluster's configuration  protoPayload.methodName="google.cloud.managedkafka.v1.ManagedKafka.UpdateCluster"
Deletion of a cluster                 protoPayload.methodName="google.cloud.managedkafka.v1.ManagedKafka.DeleteCluster"
Creation of a new topic               protoPayload.methodName="google.cloud.managedkafka.v1.ManagedKafka.CreateTopic"
Deletion of a topic                   protoPayload.methodName="google.cloud.managedkafka.v1.ManagedKafka.DeleteTopic"
Creation of an ACL                    protoPayload.methodName="google.cloud.managedkafka.v1.ManagedKafka.CreateAcl"
Deletion of an ACL                    protoPayload.methodName="google.cloud.managedkafka.v1.ManagedKafka.DeleteAcl"
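
To track when an event happened, it can help to combine a methodName filter with a time window. The following Python sketch shows one way to do that. The function names are hypothetical, and fetch_entries assumes that the google-cloud-logging client library is installed and that Application Default Credentials are configured for a project you can read logs from.

```python
from datetime import datetime, timedelta, timezone

# Common prefix of the method names listed in the table above.
METHOD_PREFIX = "google.cloud.managedkafka.v1.ManagedKafka."


def rfc3339(moment):
    """Format a timezone-aware datetime as an RFC 3339 UTC timestamp."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def method_filter(method, since, until=None):
    """Build a filter for one method (for example "DeleteTopic")
    restricted to entries at or after `since` and, optionally,
    at or before `until`."""
    clauses = [
        f'protoPayload.methodName="{METHOD_PREFIX}{method}"',
        f'timestamp>="{rfc3339(since)}"',
    ]
    if until is not None:
        clauses.append(f'timestamp<="{rfc3339(until)}"')
    return " AND ".join(clauses)


def fetch_entries(filter_text, limit=20):
    """Fetch matching entries, newest first.

    Assumption: google-cloud-logging is installed and credentials
    are available; this function is not needed to build the filter.
    """
    from google.cloud import logging as cloud_logging

    client = cloud_logging.Client()
    return list(
        client.list_entries(
            filter_=filter_text,
            order_by=cloud_logging.DESCENDING,
            max_results=limit,
        )
    )


if __name__ == "__main__":
    # Look for topic deletions during the last 24 hours.
    start = datetime.now(timezone.utc) - timedelta(hours=24)
    print(method_filter("DeleteTopic", start))
```

The printed filter can be pasted into Logs Explorer as is, or passed to fetch_entries to retrieve the entries programmatically.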

Examine the protoPayload field of each matching log entry. This field contains the details of the API call, including authenticationInfo.principalEmail (who performed the action) and the request metadata. Reviewing this information helps you reconstruct the sequence of events and identify any anomalies.
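
As a minimal sketch of that review, the following Python function pulls the fields that usually answer "who did what, and when" out of a log entry exported as JSON. The function name and the sample entry are hypothetical; the field paths follow the standard Cloud Audit Logs AuditLog structure, and the resource name format shown is an assumption.

```python
def summarize(entry):
    """Extract the key troubleshooting fields from one audit log entry
    (a dict, for example parsed from a JSON export)."""
    payload = entry.get("protoPayload", {})
    return {
        "when": entry.get("timestamp"),
        "who": payload.get("authenticationInfo", {}).get("principalEmail"),
        # Keep only the short method name, such as "DeleteTopic".
        "method": payload.get("methodName", "").rsplit(".", 1)[-1],
        "resource": payload.get("resourceName"),
        "caller_ip": payload.get("requestMetadata", {}).get("callerIp"),
    }


# Hypothetical entry for illustration only; all values are made up.
SAMPLE_ENTRY = {
    "timestamp": "2024-05-01T12:00:00Z",
    "protoPayload": {
        "serviceName": "managedkafka.googleapis.com",
        "methodName": "google.cloud.managedkafka.v1.ManagedKafka.DeleteTopic",
        "resourceName": (
            "projects/example-project/locations/us-central1"
            "/clusters/orders/topics/events"
        ),
        "authenticationInfo": {"principalEmail": "alice@example.com"},
        "requestMetadata": {"callerIp": "203.0.113.7"},
    },
}

if __name__ == "__main__":
    for field, value in summarize(SAMPLE_ENTRY).items():
        print(f"{field}: {value}")
```

Missing fields come back as None (or an empty string for the method) instead of raising an error, which keeps the function usable on entries that lack some of these fields.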