How to conduct live network forensics in GCP
Forensics is the application of science to criminal and civil laws. It is a proven approach for gathering and processing evidence at a crime scene. An integral step in the forensics process is isolating the scene so that possible evidence is not contaminated, modified, or tampered with. The same philosophy can be applied to the investigation of digital events.
In this post we will review methods, tactics and architecture designs to isolate an infected VM while still making it accessible to forensic tools. The goal is to allow access so that data and evidence can be captured while protecting other assets. There are many network forensic tools that can be used to analyze the captured traffic. This post does not cover these tools; it covers how to configure GCP to capture live traffic in the most efficient and secure way. Once traffic is captured, customers can use whatever tools they prefer to run the analysis. More details about these tools and required agents can be found here, and details about open source tooling that Google and others are developing are available here.
In a cloud security context, when a VM shows signs of compromise, the most common immediate reaction is to take a snapshot, shut down the instance and relocate the snapshot to an isolated environment, a method known as “dead analysis”. However, shutting down the instance impedes an important step in the investigation, because important information held in a buffer or in RAM may be lost.
The other forensic approach is “live analysis”, in which the VM is kept on and evidence is gathered from the VM directly. Live forensics enables the imaging of RAM, bypasses most hard drive and software encryption, helps determine the cause of abnormal traffic, and is extremely useful when dealing with active network intrusions. This process is usually performed by forensic analysts. For example, if there is a good chance the malware resides only in memory, then live forensics is, in some cases, the only way to capture and analyze the malware. With this method, in addition to disk and memory evidence, a forensic analyst can also capture the live network traffic sent over the compromised VM's network interfaces. The benefits of collecting live network traffic include reconstructing and visualizing traffic flows in real time, in particular during active network intrusions or attacks.
In the cloud, a VM must be isolated when it becomes apparent that an incident has happened, in order to protect other VMs from being infected. Our Cloud Forensics 101 session covers the process and required artifacts, such as logs, that need to be collected for cloud forensics.
What happens when your image is compromised
An incident response plan consists of 3 phases: preparation (actions taken before an attack), detection (actions taken during an attack) and response (actions taken after an attack). During the detection phase, the Computer Security Incident Response Team (CSIRT) or threat analysts decide whether live acquisition analysis is required. If live forensics is required, for example when it is vital to acquire a VM’s RAM, then one of the first courses of action is to isolate the VM from the rest of the world and connect the Forensics VPC to it for investigation. The Forensics VPC resides in a forensics GCP project and includes digital forensics tools to capture evidence from the VM, such as the SANS Investigative Forensics Toolkit (SIFT), The Sleuth Kit, Autopsy, EnCase, FTK and the like. These tools are already installed, configured, tested and ready to use. The forensics project will also save and preserve evidence such as disk and memory images for forensic review.
We’ll cover two scenarios in this post. In the first scenario, we isolate the VM and connect the Forensics VPC to it for live acquisition.
In the second scenario, we also capture live traffic from the isolated VM for live network digital forensics. To capture live traffic from the infected VM, we will leverage the GCP Packet Mirroring service to duplicate all traffic going in and out of the VM and send it to a Forensics VPC for analysis. Network forensics analysis tools such as Palo Alto VM-Series for IDS, ExtraHop Reveal(x), CheckPoint CloudGuard, Arkime (formerly Moloch) and Corelight are installed, configured and ready for deployment in the Forensics VPC. These tools will be used to analyze the duplicated network traffic.
Isolating the infected VM from other resources and connecting the forensics VPC
As part of the Incident Response plan preparation phase, the CSIRT creates a Google Cloud Forensics project. Since the Forensics project will be used only when needed, it’s better to automate the creation of the project and its resources with a tool such as Terraform. It is important to grant access to this project only to individuals and groups who deal with incident response and forensics, such as the CSIRT. As shown in figure 1, the Forensics project on the right includes its own VPC, a non-overlapping subnet and VM images with pre-installed and pre-configured forensics tools. An internal load balancer and instance groups are also configured. We will use these resources to capture live traffic, as described later in this post.
In order to contain the spread of any malware or network activity, such as data exfiltration, we’ll isolate the VM with VPC firewall rules. The GCP VPC firewall is a distributed firewall that always enforces its rules, protecting the instances regardless of their configuration and operating systems. In other words, the compromised VM cannot override firewall enforcement, provided that the permissions granted to it follow the principle of least privilege. Rules can be applied to all instances in the network, or targeted by network tags or service accounts.
Step 1 in the diagram above shows how an infected VM is isolated from the rest of the network by firewall rules that deny all ingress and egress traffic to or from any CIDR other than the forensics subnet CIDR. The infected VM is tagged with a unique network tag, for example “<image-name>-infected-vm” (network tags may contain only lowercase letters, numbers and hyphens), and the firewall rules are then applied to that network tag. This ensures that the infected VM is isolated from the project and the Internet while remaining reachable via the VPC peering that we’ll configure in step 2. You can learn more about VPC firewall rules here.
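As a rough illustration of step 1, the Python sketch below only assembles the gcloud argument lists for tagging the VM and creating the isolation rules; it does not run them. The instance, network and tag names, the forensics CIDR, and the priorities 100 and 200 are all assumptions chosen for the example. The allow rules get the lower priority number because VPC firewall rules with a lower number take precedence, and any existing rule with a priority below 100 would still win and needs to be reviewed separately.

```python
import ipaddress
import re
import shlex

# GCP network tags: lowercase letters, digits and hyphens, 1-63 characters.
TAG_PATTERN = re.compile(r"^[a-z]([-a-z0-9]{0,61}[a-z0-9])?$")


def isolation_commands(instance, zone, network, tag, forensics_cidr):
    """Return gcloud argument lists that tag a VM and fence it off."""
    if not TAG_PATTERN.match(tag):
        raise ValueError(f"invalid network tag: {tag!r}")
    ipaddress.ip_network(forensics_cidr)  # raises ValueError on a malformed CIDR

    def rule(name, direction, action, priority, range_flag, cidr):
        return [
            "gcloud", "compute", "firewall-rules", "create", name,
            f"--network={network}", f"--direction={direction}",
            f"--action={action}", "--rules=all",
            f"--priority={priority}", f"--target-tags={tag}",
            f"{range_flag}={cidr}",
        ]

    return [
        ["gcloud", "compute", "instances", "add-tags", instance,
         f"--zone={zone}", f"--tags={tag}"],
        # Allow only the forensics subnet (hypothetical priority 100).
        rule(f"{tag}-allow-in", "INGRESS", "ALLOW", 100,
             "--source-ranges", forensics_cidr),
        rule(f"{tag}-allow-out", "EGRESS", "ALLOW", 100,
             "--destination-ranges", forensics_cidr),
        # Deny everything else (hypothetical priority 200).
        rule(f"{tag}-deny-in", "INGRESS", "DENY", 200,
             "--source-ranges", "0.0.0.0/0"),
        rule(f"{tag}-deny-out", "EGRESS", "DENY", 200,
             "--destination-ranges", "0.0.0.0/0"),
    ]


if __name__ == "__main__":
    for cmd in isolation_commands("web-1", "us-central1-a", "prod-vpc",
                                  "web-1-infected-vm", "10.99.0.0/24"):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them lets the CSIRT review the exact rules before they are applied, which fits the recommendation to rehearse the procedure ahead of an incident.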
In step 2, the VPC from the forensics project is peered with the VPC in the production project. When VPC peering is established, routes are exchanged between the VPCs. By default, VPC peering exchanges all subnet routes; custom routes can also be filtered if required. At this point, the VM from the forensics project can communicate with the infected VM and start the live forensics analysis job using the pre-installed and pre-configured forensics tools.
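The peering in step 2 could be prepared along these lines. This is a minimal sketch, assuming hypothetical project and network names: it checks that the two sides' subnet ranges do not overlap (overlapping ranges prevent a peering from being established) and then builds one `gcloud compute networks peerings create` argument list per side, since a peering only becomes active once both networks have created it.

```python
import ipaddress


def peering_commands(prod_project, prod_network, prod_cidrs,
                     forensics_project, forensics_network, forensics_cidrs):
    """Return the two gcloud argument lists needed to peer both VPCs."""
    for a in prod_cidrs:
        for b in forensics_cidrs:
            if ipaddress.ip_network(a).overlaps(ipaddress.ip_network(b)):
                raise ValueError(f"subnet ranges overlap: {a} and {b}")

    def side(project, network, peer_project, peer_network):
        return [
            "gcloud", "compute", "networks", "peerings", "create",
            f"{network}-to-{peer_network}",  # hypothetical naming scheme
            f"--project={project}", f"--network={network}",
            f"--peer-project={peer_project}",
            f"--peer-network={peer_network}",
        ]

    return [
        side(forensics_project, forensics_network, prod_project, prod_network),
        side(prod_project, prod_network, forensics_project, forensics_network),
    ]
```

Running the overlap check first makes the failure visible during preparation rather than in the middle of an incident.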
Shared VPC is a network construct that allows you to connect resources from multiple projects, called service projects, to a common VPC in a host project. Resources in different projects can securely communicate with each other via the host project network while network administration stays centralized. Figure 2 depicts the Shared VPC topology. Rather than using VPC peering, in step 2 the Forensics project is simply attached to the host project. After the attachment, the Shared VPC allows the forensics tools to communicate with the infected VMs.
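For the Shared VPC variant, attaching the Forensics project might look like the sketch below. The project IDs are hypothetical, and the sketch covers only the attach and detach commands; it assumes the host project is already enabled for Shared VPC and that subnet-level access for the forensics team is granted separately.

```python
def shared_vpc_commands(host_project, forensics_project):
    """Return gcloud argument lists to attach and later detach a service project."""
    base = ["gcloud", "compute", "shared-vpc", "associated-projects"]
    host = f"--host-project={host_project}"
    return {
        # Attach the Forensics project as a service project for the investigation.
        "attach": base + ["add", forensics_project, host],
        # Detach it again once the evidence has been collected.
        "detach": base + ["remove", forensics_project, host],
    }
```

Keeping the detach command next to the attach command makes it harder to leave the Forensics project connected after the investigation ends.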
Capturing live network traffic with Google Packet Mirroring
If live network forensics is required, for example during active network intrusions, then the incoming and outgoing traffic needs to be duplicated and captured. VPC Flow Logs capture network metadata only, which is not enough for live network forensics analysis. GCP Packet Mirroring clones the traffic of a specified instance in a VPC and forwards it to a specified internal load balancer, which collects the mirrored traffic and sends it to an attached instance group. Packet Mirroring captures all the traffic from the specified subnets, network tags, or instance names.
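A packet mirroring policy for the tagged VM could be assembled roughly as follows. This is a sketch under several assumptions: all names are hypothetical, the collector is an internal load balancer forwarding rule in the Forensics project that was created as a mirroring collector, and the policy is created in the project that owns the mirrored VM. The helper also checks that the VM's region matches the collector's region, since Packet Mirroring requires the mirrored sources and the collector to be in the same region.

```python
def region_of(zone):
    """Derive the region from a zone name, e.g. 'us-central1-a' -> 'us-central1'."""
    return zone.rsplit("-", 1)[0]


def mirroring_command(policy, prod_project, prod_network, vm_zone, tag,
                      forensics_project, collector_rule, collector_region):
    """Return the gcloud argument list that mirrors a tagged VM to a collector."""
    region = region_of(vm_zone)
    if region != collector_region:
        raise ValueError(
            "mirrored VM and collector load balancer must share a region")
    # Full resource path, because the collector lives in another project.
    collector = (f"projects/{forensics_project}/regions/{collector_region}"
                 f"/forwardingRules/{collector_rule}")
    return [
        "gcloud", "compute", "packet-mirrorings", "create", policy,
        f"--project={prod_project}", f"--region={region}",
        f"--network={prod_network}", f"--collector-ilb={collector}",
        f"--mirrored-tags={tag}",
    ]
```

Selecting the source by network tag means the same tag that drives the isolation firewall rules also decides which VM is mirrored.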
Figure 3 depicts the steps that allow the compromised VM to communicate with the rest of the world (for example, beaconing to a C&C server) while capturing all traffic for investigation in a peered VPC deployment.
Figure 4 depicts the steps that allow the compromised VM to communicate with the rest of the world while capturing all traffic for investigation in a shared VPC deployment.
We will use the Forensics project’s internal load balancer and the instance group VMs, which include packet capture and analysis tools. Note that the mirrored VM and the collector load balancer in the forensics network must be in the same region. Detailed steps to configure packet mirroring are available on this page.
If you are using a Shared VPC, see the Packet Mirroring configuration for Shared VPC for details. Figure 4 depicts the packet mirroring flow in a shared VPC topology.
It is recommended to automate and periodically test the process so that, in case of an incident, the entire setup and Forensics toolchain can be deployed quickly. If the initial investigation identifies a suspicious destination, such as a Command and Control (C&C) server, the Packet Mirroring policy can be adjusted with a policy filter that mirrors only the traffic exchanged with that server’s IP address.
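To make the filter adjustment concrete, the sketch below builds the `filter` portion of a packet mirroring policy restricted to one or more suspected C&C addresses, and adds a small local check for reasoning about which remote endpoints the filter would cover. The IP addresses are documentation-range placeholders, and `would_mirror` is a hypothetical helper that models only the CIDR match, not protocol or direction handling.

```python
import ipaddress


def cc_filter(cc_ips):
    """Build a policy filter limited to the given C&C addresses or ranges."""
    # A bare IPv4 address is normalized to a /32 range.
    ranges = [str(ipaddress.ip_network(ip)) for ip in cc_ips]
    return {"cidrRanges": ranges, "direction": "BOTH"}


def would_mirror(remote_ip, policy_filter):
    """Return True if traffic with this remote endpoint falls inside the filter."""
    addr = ipaddress.ip_address(remote_ip)
    return any(addr in ipaddress.ip_network(cidr)
               for cidr in policy_filter["cidrRanges"])
```

Narrowing the filter reduces the volume the collectors have to process, at the cost of missing traffic to destinations that have not yet been identified.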
Companies using cloud services must have an incident management plan in place, and this plan should include the option of using live acquisition when necessary. Designing and preparing for forensic acquisition in advance allows the company to build infrastructure that can be deployed and connected to the appropriate VM automatically. The architectures described in this post can help collect and preserve vital evidence for the forensic process while the incident response team resolves the incident.