Empowering Gemini for Malware Analysis with Code Interpreter and Google Threat Intelligence
Bernardo Quintero
Andrés Ramírez
One of Google Cloud's major missions is to arm security professionals with modern tools to help them defend against the latest threats. Part of that mission involves moving closer to a more autonomous, adaptive approach in threat intelligence automation.
In our latest advancements in malware analysis, we’re equipping Gemini with new capabilities to address obfuscation techniques and obtain real-time insights on indicators of compromise (IOCs). By integrating the Code Interpreter extension, Gemini can now dynamically create and execute code to help deobfuscate specific strings or code sections, while Google Threat Intelligence (GTI) function calling enables it to query GTI for additional context on URLs, IPs, and domains found within malware samples. These tools are a step toward transforming Gemini into a more adaptive agent for malware analysis, enhancing its ability to interpret obfuscated elements and gather contextual information based on the unique characteristics of each sample.
Building on this foundation, we previously explored critical preparatory steps with Gemini 1.5 Pro, leveraging its expansive 2-million-token input window to process substantial sections of decompiled code in a single pass. To further enhance scalability, we introduced Gemini 1.5 Flash, incorporating automated binary unpacking through Mandiant Backscatter before the decompilation phase to tackle certain obfuscation techniques. Yet, as any seasoned malware analyst knows, the true challenge often begins once the code is exposed. Malware developers frequently employ obfuscation tactics to conceal critical IOCs and underlying logic. Malware may also download additional malicious code, making it challenging to fully understand the behavior of a given sample.
For large language models (LLMs), obfuscation techniques and additional payloads create unique challenges. When dealing with obfuscated strings such as URLs, IPs, domains, or file names, LLMs often “hallucinate” without explicit decoding methods. Additionally, LLMs cannot access, for example, URLs that host additional payloads, often resulting in speculative interpretations about the sample’s behavior.
To help with these challenges, Code Interpreter and GTI function calling tools provide targeted solutions. Code Interpreter enables Gemini to autonomously create and execute custom scripts, as needed, using its own judgment to decode obfuscated elements within a sample, such as strings encoded with XOR-based algorithms. This capability minimizes interpretation errors and enhances Gemini's ability to reveal hidden logic without requiring manual intervention.
Meanwhile, GTI function calling expands Gemini’s reach by retrieving contextualized information from Google Threat Intelligence on suspicious external resources such as URLs, IPs, or domains, providing verified insights without speculative guesses. Together, these tools equip Gemini to better handle obfuscated or externally hosted data, bringing it closer to the goal of functioning as an autonomous agent for malware analysis.
To illustrate how these enhancements boost Gemini's capabilities, let's look at a practical example. In this case, we’re analyzing a PowerShell script that contains an obfuscated URL that hosts a second-stage payload. This particular sample was previously analyzed with some of the most advanced publicly available LLM models, which incorporate code generation and execution as part of their reasoning process. Despite these capabilities, each model “hallucinated,” generating completely fabricated URLs instead of accurately revealing the correct one.
Obfuscated PowerShell code sample to be analyzed by Gemini
Utilizing Code Interpreter and GTI function calling as part of its reasoning process, Gemini autonomously generated the following report without any human intervention. When deemed necessary, it applies these tools to process and extract additional information from the sample.
Gemini identified that the script employs an XOR-based obfuscation algorithm that resembles RC4 to conceal the download URL. Recognizing this pattern, Gemini autonomously generates and executes a Python deobfuscation script within the Code Interpreter sandbox, successfully revealing the external resource.
With the URL in hand, Gemini then utilizes GTI function calling to query Google Threat Intelligence for further context. This analysis links the URL to UNC5687, a threat cluster known for using a remote access tool in phishing campaigns impersonating the Security Service of Ukraine.
As we’ve seen, the integration of these tools has strengthened Gemini’s ability to function as a malware analyst capable of adapting its approach to address obfuscation and gathering vital context on IOCs. By incorporating the Code Interpreter and GTI function calling, Gemini is better equipped to navigate complex samples by autonomously interpreting hidden elements and contextualizing external references.
While these are significant advancements, many challenges remain, especially given the vast diversity of malware and scenarios that exist in the threat landscape. We’re committed to making steady progress, and future updates will continue to enhance Gemini's capabilities, moving us closer to a more autonomous, adaptive approach in threat intelligence automation.