Auto-launching Packet Mirroring for application monitoring

This tutorial shows how to use Cloud Logging, Pub/Sub, and Cloud Functions to auto-enable Packet Mirroring so that you can monitor and troubleshoot traffic flows in your Virtual Private Cloud (VPC) network. This tutorial is intended for network, security, and DevOps teams. It assumes you are familiar with Cloud Logging, Pub/Sub, Cloud Functions, Packet Mirroring, Compute Engine, and Terraform.

Introduction

Packet Mirroring is a feature that lets you monitor VPC traffic flows in real time. Using Packet Mirroring, DevOps teams can troubleshoot degraded performance or traffic that generates error messages, or security-conscious enterprises can observe and react to traffic patterns that might be malicious. Organizations often use Packet Mirroring along with an intrusion detection system (IDS) to help detect and mitigate threats.

Packet Mirroring captures all ingress and egress traffic and packet data, such as payloads and headers, and then exports the traffic, providing full network visibility. You can send mirrored traffic out of band to security and monitoring appliances that help you detect threats, monitor network performance, and troubleshoot applications.

You can enable Packet Mirroring at the subnet level, by network tag, or for specific instances. You can mirror packets continuously, or you can enable and disable Packet Mirroring based on predefined triggers, as this tutorial does.
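
For example, the following sketch shows a policy that is scoped by network tag instead of by subnet. All of the resource names here are illustrative; the tutorial's actual policy, which is scoped by subnet, is created later.

    # Mirror only instances that carry the "web" network tag (names are examples).
    gcloud compute packet-mirrorings create pm-tagged-example \
        --region=us-west1 \
        --network=example-vpc \
        --collector-ilb=example-forwarding-rule \
        --mirrored-tags=web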

In this tutorial, you configure the architecture in the following diagram.

Internet traffic is routed through the global load balancer to the packet-mirroring VPC.

This architecture has the following workflow:

  1. An invalid request is sent to the load balancer, triggering an HTTP 500 Internal Server Error status code from the web servers.
  2. Cloud Logging logs an error message, and Cloud Monitoring records the event.
  3. A Pub/Sub topic, which is configured as a sink destination for Cloud Logging, receives the error message.
  4. Pub/Sub pushes the event to Cloud Functions, triggering the function that enables Packet Mirroring.
  5. Packet Mirroring is enabled, and traffic is mirrored to the collector VMs so that an appropriate person or team can further investigate the error messages. In this tutorial, you use the tcpdump utility to look at captured packets.

To complete this tutorial, you use HashiCorp's Terraform to create the VPC, subnets, global load balancer, web servers, and collector VM. You then manually configure a packet mirroring policy and its associated collector VMs to receive mirrored traffic. Finally, you configure Cloud Logging, Cloud Functions, and Pub/Sub to trigger Packet Mirroring.

Although this tutorial shows you what is possible when you use an error message code (HTTP 500) to trigger an event, you can customize this solution for other use cases and environments. For example, you might want to trigger logging so that you can look at patterns (such as a specific regex pattern) or application metrics (such as CPU and memory usage).
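
For example, to trigger on requests whose URL matches a regular expression instead of on a status code, you could give the Cloud Logging sink a filter that uses the =~ (regular expression) operator. In the following sketch, the sink name, PROJECT_ID, and the /admin.* pattern are placeholders for your own values:

    gcloud logging sinks create regex_trigger \
        pubsub.googleapis.com/projects/PROJECT_ID/topics/stackdriver_logging \
        --log-filter='resource.type="http_load_balancer" AND httpRequest.requestUrl=~"/admin.*"' \
        --project=PROJECT_ID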

For more information on how to configure Cloud Monitoring and Cloud Logging, see the documentation for Cloud Logging and Cloud Monitoring.

Objectives

  • Deploy a base environment using Terraform.
  • Configure the packet mirroring infrastructure.
  • Trigger an application error message.
  • Validate that Packet Mirroring is enabled.
  • Show a packet capture on the collector Compute Engine instances.

Costs

This tutorial uses the following billable components of Google Cloud:

  • Compute Engine
  • Cloud Load Balancing
  • Cloud NAT
  • Cloud Functions
  • Pub/Sub
  • Cloud Logging

To generate a cost estimate based on your projected usage, use the pricing calculator. New Google Cloud users might be eligible for a free trial.

When you finish this tutorial, you can avoid continued billing by deleting the resources you created. For more information, see Clean up.

Before you begin

  1. In the Cloud Console, activate Cloud Shell.


    You complete most of this tutorial from the Cloud Shell terminal using Terraform and the Cloud SDK.

  2. In Cloud Shell, change the local working directory and clone the GitHub repository:

    cd $HOME
    git clone https://github.com/GoogleCloudPlatform/terraform-gce-packetmirror.git packetMirror
    

    The repository contains all the files that you need to complete this tutorial. For a complete description of each file, see the README.md file in the repository.

  3. Make all shell scripts executable:

    cd $HOME/packetMirror
    chmod 755 *.sh
    
  4. Ensure that the user account you use for this tutorial has the Identity and Access Management (IAM) permissions required to complete the tutorial.
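
    To check which account Cloud Shell is currently using, you can run the following command:

    gcloud config list account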

Preparing your environment

In this section, you set up your environment variables and deploy the supporting infrastructure.

Set up Terraform

  1. Install Terraform by following the steps in the HashiCorp documentation.
  2. In Cloud Shell, initialize Terraform:

    terraform init
    

    The output is similar to the following:

    ...
    Initializing provider plugins...
    The following providers do not have any version constraints in configuration, so the latest version was installed.
    ...
    Terraform has been successfully initialized!
    ...
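
    To confirm that Terraform is installed and to check its version, run the following command:

    terraform version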
    

Set environment variables

  1. If your Google Cloud user account is part of a Google Cloud organization, run the following commands in Cloud Shell:

    1. Set the TF_VAR_org_id variable to the numeric ID of your Google Cloud organization:

      export TF_VAR_org_id=$(gcloud organizations list \
          --format="value(ID)" \
          --filter="DISPLAY_NAME:YOUR_ORGANIZATION_NAME")
      

      Replace YOUR_ORGANIZATION_NAME with the Google Cloud organization name that you want to use in this tutorial.

    2. Add the Terraform variable org_id to the project resource:

      sed -i "s/#org_id          = var.org_id/org_id          = var.org_id/" main.tf
      
    3. Verify that you set the environment variable correctly:

      echo $TF_VAR_org_id
      

      The output lists your numeric organization ID and looks similar to the following:

      ...
      123123123123
      ...
      
  2. Set the billing account name:

    1. List your active billing accounts:

      gcloud beta billing accounts list
      

      Copy the name of the billing account that you want to use for this tutorial. You need this name in the next step.

    2. Set the billing account environment variable:

      TF_VAR_billing_name="YOUR_BILLING_ACCOUNT_NAME"
      export TF_VAR_billing_name
      

      Replace YOUR_BILLING_ACCOUNT_NAME with the billing account name that you copied in the previous step.

  3. Set the remaining environment variables:

    source $HOME/packetMirror/set_variables.sh
    
  4. Verify that you set the environment variables correctly:

    env | grep TF_
    

    The output is similar to the following:

    ...
    TF_VAR_billing_account=YOUR_BILLING_ACCOUNT_NAME
    TF_VAR_org_id=YOUR_ORGANIZATION_NAME
    TF_VAR_region=YOUR_REGION
    TF_VAR_user_account=YOUR_USER_ACCOUNT
    TF_VAR_pid=YOUR_PROJECT_ID
    TF_VAR_zone=YOUR_ZONE
    ...
    

    In this output:

    • YOUR_REGION: the region of your Google Cloud project, for example, us-west1
    • YOUR_USER_ACCOUNT: your account ID, for example, user@example
    • YOUR_PROJECT_ID: your Cloud project ID
    • YOUR_ZONE: the zone of your Cloud project, for example, us-west1-b
  5. If the TF_VAR_billing_account variable is not set correctly, go to the Billing Account Overview page in the Cloud Console (from the Manage billing accounts page), copy the billing account ID, and then set the following environment variable:

    export TF_VAR_billing_account=BILLING_ACCOUNT_ID
    

    Replace BILLING_ACCOUNT_ID with the account ID number that you copied earlier in this step.

  6. Create an environment variable file:

    $HOME/packetMirror/saveVars.sh
    

    This command redirects the environment variables that you created into a file called TF_ENV_VARS. Each variable is prepended with the export command. If your Cloud Shell session is terminated, you can use this file to reset the variables. These variables are used by the Terraform scripts, the shell scripts, and the gcloud command-line tool.
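
    The file contents look similar to the following example (the values shown here are only illustrative):

    export TF_VAR_org_id=123123123123
    export TF_VAR_billing_name="My Billing Account"
    export TF_VAR_region=us-west1
    export TF_VAR_zone=us-west1-b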

    If you need to reinitialize the variables later, you can run the following command:

    source $HOME/packetMirror/TF_ENV_VARS
    

Deploy the base infrastructure

  1. In Cloud Shell, deploy the Terraform supporting infrastructure:

    cd $HOME/packetMirror
    terraform apply
    
  2. When you're prompted, enter yes to apply the configuration.

    The terraform apply command instructs Terraform to deploy the project, networks, subnetworks, global load balancer, and web servers. To understand how the infrastructure is declaratively defined, you can read through the Terraform manifests (files that have the .tf extension).

    Terraform can take several minutes to deploy the components. The following objects are created:

    • VPC
    • Firewall rules
    • Subnets
    • Instance template for web server
    • Managed instance group for web servers
    • Unmanaged instance group for collector VMs
    • Cloud NAT
    • Global load balancer and associated health checks

    You can browse your project to see these items either through the Cloud Console or by using gcloud commands.
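
    If you later change any of the Terraform manifests, you can preview the resulting changes before you apply them:

    terraform plan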

Creating packet mirroring resources

In the following steps, you create and verify the packet mirroring infrastructure.

Create collector internal load balancing resources

  • In Cloud Shell, create the backend service, add the collector instance group as a backend, and create the forwarding rule:

    gcloud compute backend-services create collector-ilb \
        --project="$TF_VAR_pid" \
        --region="$TF_VAR_region" \
        --load-balancing-scheme=internal \
        --protocol=tcp \
        --health-checks=http-basic-check
    
    gcloud compute backend-services add-backend collector-ilb \
        --project="$TF_VAR_pid" \
        --region="$TF_VAR_region" \
        --instance-group=collector-ig \
        --instance-group-zone="$TF_VAR_zone"
    
    gcloud compute forwarding-rules create fr-ilb-packet-mirroring \
        --project="$TF_VAR_pid" \
        --region="$TF_VAR_region" \
        --load-balancing-scheme=internal \
        --network=packet-mirror-vpc \
        --subnet=collectors \
        --ip-protocol=TCP \
        --ports=all \
        --backend-service=collector-ilb \
        --backend-service-region="$TF_VAR_region" \
        --is-mirroring-collector
    

    The output is similar to the following:

    ...
    Created [https://www.googleapis.com/compute/v1/projects/pm-pid-1357261223/regions/us-west1/backendServices/collector-ilb].
    NAME           BACKENDS  PROTOCOL
    collector-ilb            TCP
    ...
    Updated [https://www.googleapis.com/compute/v1/projects/pm-pid-1357261223/regions/us-west1/backendServices/collector-ilb].
    ...
    Created [https://www.googleapis.com/compute/v1/projects/pm-pid-1357261223/regions/us-west1/forwardingRules/fr-ilb-packet-mirroring].
    ...
    

Create and disable packet mirroring policy

  1. In Cloud Shell, create a packet mirroring policy in a disabled state, and then verify the policy's configuration:

    gcloud compute packet-mirrorings create pm-mirror-subnet1 \
        --project="$TF_VAR_pid" \
        --region="$TF_VAR_region" \
        --network=packet-mirror-vpc \
        --collector-ilb=fr-ilb-packet-mirroring \
        --mirrored-subnets=webservers \
        --no-enable
    
    gcloud compute packet-mirrorings describe pm-mirror-subnet1 \
        --project="$TF_VAR_pid" \
        --region="$TF_VAR_region"
    

    The output is similar to the following:

    ...
    Created [https://www.googleapis.com/compute/v1/projects/pm-pid-1357261223/regions/us-west1/packetMirrorings/pm-mirror-subnet1].
    ...
    collectorIlb:
    ...
    enable: 'FALSE'
    ...
    

    Packet mirroring policies are enabled by default, so the create command includes the --no-enable flag. You want packet mirroring to remain turned off until Cloud Logging detects an issue.
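
    Later in this tutorial, a Cloud Function enables this policy automatically. If you ever need to toggle the policy manually, you can use the update command, as in the following example, which assumes the policy name and environment variables used in this tutorial:

    gcloud compute packet-mirrorings update pm-mirror-subnet1 \
        --project="$TF_VAR_pid" \
        --region="$TF_VAR_region" \
        --enable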

Create automation to trigger packet mirroring

  1. In Cloud Shell, create the Pub/Sub topic that you use for the Cloud Logging sink:

    gcloud pubsub topics create stackdriver_logging \
        --project="$TF_VAR_pid"
    

    The output is similar to the following:

    ...
    Created topic [projects/pm-pid-1357261223/topics/stackdriver_logging].
    ...
    
  2. Create a Cloud Logging sink:

    gcloud logging sinks create stackdriver_logging \
        pubsub.googleapis.com/projects/"$TF_VAR_pid"/topics/stackdriver_logging \
        --log-filter='resource.type="http_load_balancer" AND http_request.status>=500' \
        --project="$TF_VAR_pid"
    

    The Cloud Logging sink matches load balancer log entries whose HTTP status codes are in the 500 range (such as 500, 501, or 502) and sends those events to the Pub/Sub topic.

    The output is similar to the following:

    Created [https://logging.googleapis.com/v2/projects/pm-pid-1357261223/sinks/stackdriver_logging].
    Please remember to grant `serviceAccount:p422429379846-984011@gcp-sa-logging.iam.gserviceaccount.com` the Pub/Sub Publisher role on the topic.
    More information about sinks can be found at https://cloud.google.com/logging/docs/export/configure_export
    

    Copy the serviceAccount value from the output. You need this value in the next step.
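
    If you need this value again later, you can retrieve the sink's writer identity at any time. The following command assumes the sink name that's used in this tutorial:

    gcloud logging sinks describe stackdriver_logging \
        --project="$TF_VAR_pid" \
        --format="value(writerIdentity)"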

  3. Grant the service account the Pub/Sub Publisher IAM role (roles/pubsub.publisher):

    gcloud pubsub topics add-iam-policy-binding stackdriver_logging \
        --project="$TF_VAR_pid" \
        --member serviceAccount:UPDATE_ACCOUNT \
        --role roles/pubsub.publisher
    

    Replace UPDATE_ACCOUNT with the value for serviceAccount from the previous step.

    The output is similar to the following:

    ...
    Updated IAM policy for topic [stackdriver_logging].
    bindings:
    - members:
      - serviceAccount:UPDATE_ACCOUNT
      role: roles/pubsub.publisher
    etag: notuCRmpoyI=
    version: 1
    ...
    
  4. Update the main.py file:

    net_id="$(gcloud compute networks describe packet-mirror-vpc --project="$TF_VAR_pid" --format='value(id)')"
    sed -i "s/PROJECT-ID/"$TF_VAR_pid"/g" main.py
    sed -i "s/REGION/"$TF_VAR_region"/g" main.py
    sed -i "s/NETWORK-ID/"$net_id"/g" main.py
    

    The main.py file in the repository contains the packet_mirror_pubsub function that you use to create the Cloud Function. When the function receives a Pub/Sub event, it calls the Compute Engine API to enable the packet mirroring policy. Before you create the function, the preceding commands update the Google Cloud project ID, region, and network information in the Python file.
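
    To confirm that the substitutions worked, you can check that none of the placeholder tokens remain in the file:

    grep -E "PROJECT-ID|REGION|NETWORK-ID" main.py || echo "All placeholders replaced"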

  5. Create the Cloud Function:

    gcloud functions deploy packet_mirror_pubsub \
        --project="$TF_VAR_pid" \
        --region="$TF_VAR_region" \
        --runtime python37 \
        --trigger-topic stackdriver_logging
    

    If you see the following prompt, enter N:

    Allow unauthenticated invocations of new function
    [packet_mirror_pubsub]? (y/N)?
    

    The output is similar to the following:

    ...
    availableMemoryMb: 256
    entryPoint: packet_mirror_pubsub
    eventTrigger:
      eventType: google.pubsub.topic.publish
      failurePolicy: {}
      resource: projects/pm-pid--1517226903/topics/stackdriver_logging
      service: pubsub.googleapis.com
    ingressSettings: ALLOW_ALL
    labels:
      deployment-tool: cli-gcloud
    name: projects/pm-pid--1517226903/locations/europe-west1/functions/packet_mirror_pubsub
    runtime: python37
    serviceAccountEmail: pm-pid--1517226903@appspot.gserviceaccount.com
    ...
    status: ACTIVE
    ...
    

    Deploying the Cloud Function can take several minutes to complete.

  6. If you receive an error, do the following:

    • If you receive an error about enabling the Cloud Build API, enable the API:

      gcloud services enable cloudbuild.googleapis.com
      

      Retry step 5.

    • If you receive an error message regarding access to Cloud Storage, retry step 5. This error can occur when you run the commands in quick succession.

Verifying the solution

In the following steps, you trigger and verify the solution.

  1. In Cloud Shell, validate the new Pub/Sub subscription:

    gcloud pubsub subscriptions list --project="$TF_VAR_pid"
    

    The output is similar to the following:

    ...
    ---
    ackDeadlineSeconds: 600
    expirationPolicy: {}
    messageRetentionDuration: 604800s
    name: projects/pm-pid--1517226903/subscriptions/gcf-packet_mirror_pubsub-europe-west1-stackdriver_logging
    pushConfig:
      attributes:
        x-goog-version: v1
      pushEndpoint: https://e147a3acbd9a5314f553d1710671be9c-dot-efdbf9529ce1147d5p-tp.appspot.com/_ah/push-handlers/pubsub/projects/pm-pid--1517226903/topics/stackdriver_logging?pubsub_trigger=true
    topic: projects/pm-pid--1517226903/topics/stackdriver_logging
    ...
    
  2. Log in to the collector VM:

    gcloud compute ssh collector \
        --tunnel-through-iap \
        --project="$TF_VAR_pid" \
        --zone="$TF_VAR_zone"
    
  3. After you log in to the collector VM, install tcpdump and start a capture that filters out the collector subnet's own traffic (172.16.21.0/24):

    sudo apt-get install tcpdump -y
    sudo tcpdump -n not net 172.16.21.0/24
    

    The output is similar to the following:

    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
    

    Leave this Cloud Shell session open.

  4. Open a new Cloud Shell terminal window, and then trigger the Cloud Function by generating an HTTP 500 error:

    cd $HOME/packetMirror
    source TF_ENV_VARS
    lb_ip=$(gcloud compute forwarding-rules describe packet-mirror-gfr --project=$TF_VAR_pid --global --format="value(IPAddress)")
    curl http://"$lb_ip"/error500
    

    The output is similar to the following:

    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <html><head>
    <title>500 Internal Server Error</title>
    </head><body>
    <h1>Internal Server Error</h1>
    <p>The server encountered an internal error or
    misconfiguration and was unable to complete
    your request.</p>
    <p>Please contact the server administrator at
     webmaster@localhost to inform them of the time this error occurred,
     and the actions you performed just before this error.</p>
    <p>More information about this error may be available
    in the server error log.</p>
    <hr>
    <address>Apache/2.4.25 (Debian) Server at 35.241.40.217 Port 80</address>
    
  5. Return to the Cloud Shell session for the collector VM, and observe the output of the tcpdump command. The collector VM is now receiving mirrored traffic; in this example output, the packets shown are health check probes sent to the web server instances.

    The output is similar to the following:

    ...
    07:33:41.131992 IP 130.211.2.146.53702 > 172.16.20.2.80: Flags [S], seq 4226031116, win 65535, options [mss 1420,sackOK,TS val 2961711820 ecr 0,nop,wscale 8], length 0
    07:33:41.132149 IP 130.211.2.146.53702 > 172.16.20.2.80: Flags [.], ack 3978158577, win 256, options [nop,nop,TS val 2961711821 ecr 4348156], length 0
    ...
    
  6. To stop the output from the tcpdump command, press Control+C.

  7. Type exit to exit the collector VM.

  8. In Cloud Shell, check the logs to validate that the Cloud Function ran:

    gcloud functions logs read \
        --limit 50 \
        --project="$TF_VAR_pid" \
        --region="$TF_VAR_region"
    

    The output is similar to the following:

    LEVEL  NAME                  EXECUTION_ID     TIME_UTC                 LOG
    D      packet_mirror_pubsub  999875368753102  2020-02-21 07:33:39.206  Function execution started
    I      packet_mirror_pubsub  999875368753102  2020-02-21 07:33:39.222  HTTP 500 Error Detected in: {"httpRequest":{"remoteIp":"136.27.39.107","requestMethod":"GET","requestSize":"85","requestUrl":"http://35.241.40.217/error500","responseSize":"801","serverIp":"172.16.20.2","status":500,"userAgent":"curl/7.52.1"},"insertId":"nb4g1sfdrpm04","jsonPayload":{"@type":"type.googleapis.com/google.cloud.loadbalancing.type.LoadBalancerLogEntry","enforcedSecurityPolicy":{"configuredAction":"ACCEPT","name":"policy","outcome":"ACCEPT","priority":2147483647},"statusDetails":"response_sent_by_backend"},".............
    I      packet_mirror_pubsub  999875368753102  2020-02-21 07:33:39.222  Activating Packet Mirroring For Analysis
    I      packet_mirror_pubsub  999875368753102  2020-02-21 07:33:39.329  ya29.c.KqUBvwfoZ5z88EmHKPXkgd1Gwqwwca88wWsyqjxrEFdhK8HjJDwmBWBIX_DAnC4wOO5W2B6EOQArgHQ03AIVwFnQMawXrB2tLGIkBYFuP3Go5Fylo6zZAvgtXF3LvrXiarwaASkfAM73lXfQiT20PYn4ML4E2Kli9WmhZDu6AdAe1aH-FK2MEoca84zgG65tirRGe04EJGY_hYHejlG_xrRWeaojVlc3
    I      packet_mirror_pubsub  999875368753102  2020-02-21 07:33:40.100  {
    I      packet_mirror_pubsub  999875368753102  2020-02-21 07:33:40.100    "id": "1924200601229605180",
    I      packet_mirror_pubsub  999875368753102  2020-02-21 07:33:40.100    "name": "operation-1582270419413-59f110a49a878-b68f2d26-c8f66a7b",
    I      packet_mirror_pubsub  999875368753102  2020-02-21 07:33:40.100    "operationType": "patch",
    I      packet_mirror_pubsub  999875368753102  2020-02-21 07:33:40.100    …..
     Function execution took 900 ms, finished with status: 'ok'
    

    The logs show that packet mirroring was triggered based on the HTTP 500 error code generated by the web server instances.

  9. Validate the state of the packet mirroring policy:

    gcloud compute packet-mirrorings describe pm-mirror-subnet1 \
        --project="$TF_VAR_pid" \
        --region="$TF_VAR_region"
    

    The output is similar to the following:

    ...
    collectorIlb:
    ...
    enable: 'TRUE'
    ...
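
    After you finish investigating, you can return the policy to its disabled state so that the automation can respond to the next incident. This is a manual reset; as deployed in this tutorial, the Cloud Function only enables mirroring:

    gcloud compute packet-mirrorings update pm-mirror-subnet1 \
        --project="$TF_VAR_pid" \
        --region="$TF_VAR_region" \
        --no-enable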
    

Clean up

To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, delete the infrastructure that you created.

Delete the infrastructure

  1. In Cloud Shell, remove the automation resources:

    gcloud functions delete packet_mirror_pubsub \
        --project="$TF_VAR_pid" \
        --region="$TF_VAR_region" \
        --quiet
    gcloud logging sinks delete stackdriver_logging \
        --project="$TF_VAR_pid" \
        --quiet
    gcloud pubsub topics delete stackdriver_logging \
        --project="$TF_VAR_pid" \
        --quiet
    gcloud compute packet-mirrorings delete pm-mirror-subnet1 \
        --project="$TF_VAR_pid" \
        --region="$TF_VAR_region" \
        --quiet
    gcloud compute forwarding-rules delete fr-ilb-packet-mirroring \
        --project="$TF_VAR_pid" \
        --region="$TF_VAR_region" \
        --quiet
    gcloud compute backend-services delete collector-ilb \
        --project="$TF_VAR_pid" \
        --region="$TF_VAR_region" \
        --quiet
    
  2. Destroy all of the tutorial's components:

    pushd $HOME/packetMirror
    terraform destroy
    popd
    

    When you're prompted, enter yes to destroy the configuration.

  3. Remove the Git repository:

    rm -rf $HOME/packetMirror
    

What's next