Security & Identity

Beyond source code: The files AI coding agents trust — and attackers exploit

May 12, 2026

Bernardo Quintero

Security Engineering Director, VirusTotal

Daniel Kapellmann Zafra

Threat Intelligence Strategy Lead, GTIG

Try Gemini Enterprise Business Edition today

The front door to AI in the workplace

Try now

As AI coding agents become deeply embedded in developer workflows, defenders must evolve their definition of malicious files and rethink how to protect against them.

Autonomous AI agents operate across integrated development environments (IDEs), editors, terminals, and extension runtimes, and they often have access to local files, command execution, and external services. As a result, the attack surface of the modern developer environments now extends well beyond source code. Repository files, agent instructions, runtime settings, and extension packages can all influence what the agent trusts, what it executes, and what it can reach.

Defending this new attack surface requires moving towards semantic analysis to understand the actual instructions, logic, and context being fed to the AI. Powered by VirusTotal Code Insight, our agentic threat intelligence capability in Google Threat Intelligence extracts the true operational intent behind agent-facing files at scale, allowing security teams to expose configurations that override guardrails and mask supply-chain risks.

By integrating agentic capabilities into Google Threat Intelligence, we’re able to link these invisible artifacts to broader threat campaigns. This powerful capability can help ensure that as attackers exploit what AI agents trust, defenders are equipped with the resources to read between the lines.

To help security analysts understand how the developer threat landscape has quickly expanded, we suggest an approach that groups the attack surface into four categories: what executes, what instructs, what connects, and what extends.

https://storage.googleapis.com/gweb-cloudblog-publish/images/1._Examples_of_common_file_types_that_expa.max-2200x2200.png

Examples of common file types that expand the developer threat landscape.

Attack surface: What executes

Just as developers rely on project configuration to automate setup, debugging, and routine tasks, AI coding agents and modern developer tools also inherit execution paths from repository files. These artifacts can trigger commands, bootstrap environments, and chain execution through normal workflows.

Opening a project, trusting a workspace, starting a debugger, rebuilding a container, or running a standard setup command may therefore execute attacker-controlled logic under the appearance of legitimate project automation.

Attack surface: What instructs

AI coding agents also consume persistent instruction files that shape how they behave inside a project. These files can influence what the agent prioritizes, what it ignores, which tools it uses, which files it trusts, and which actions it takes automatically.

These files do not need to contain exploit code to be security-relevant. Reusing them across repositories introduces a supply-chain risk, because malicious instructions can be presented as harmless guidance while steering otherwise legitimate agent workflows toward unsafe behavior.

Unlike traditional IDEs that require a human to click run, an agent may parse these instructions and execute them as a prerequisite to a task without the developer ever reviewing the specific instruction block.

Attack surface: What connects

Beyond instructions, coding agents also depend on runtime definitions that determine how they interact with tools, hooks, external services, and local execution contexts. These files define permissions, tool connectivity, external endpoints, and execution paths.

This is where repository-level influence becomes operational control. A malicious or unsafe runtime configuration can expose local commands, remote services, sensitive data, and untrusted model context protocol (MCP) servers to the agent, turning configuration abuse into controlled execution.

Attack surface: What extends

Extensions add another layer of inherited trust and introduce third-party code into editor and browser runtimes, often with broad access to local files, credentials, and developer workflows. This inherited trust can create a supply-chain problem similar to malicious project configurations: Compromised extensions, poisoned update paths, and hijacked publisher accounts can introduce attacker-controlled logic through components that otherwise appear to be standard tooling.

Applying VirusTotal Code Insight in agentic threat intelligence

This taxonomy highlights a fundamental shift in the threat landscape: The risk is no longer just in the syntax of code, but in the semantics of intent.

Traditional security tools are effectively blind to natural language instructions that tell an AI to ignore guardrails or redirect data. The operational questions are then: How can defenders identify these risks systematically? How can they detect the danger before a developer or an agent automatically follows a valid instruction file to a malicious conclusion?

To bridge this gap, we use VirusTotal Code Insight and agentic threat intelligence to perform large-scale semantic analysis. Because malicious repository settings and instruction files are often syntactically correct, they frequently return zero detections from signature-based scanners.

Code Insight solves this problem by using AI to analyze the file’s actual logic and read between the lines, surfacing behavioral risks that are invisible to legacy tooling. This context is further enriched within agentic threat intelligence, where security teams can pivot from a single semantic red flag to investigate broader threat infrastructure and associated campaign activity.

Example 1: A Weaponized tasks.json

One representative example is a file distributed under the path coding-challenge/coding-challenge/.cursor/tasks.json. The sample was first submitted to VirusTotal on March 19, and remained undetected by security engines for several days.

VirusTotal Code Insight flagged it as a risk based on the behaviour implied by the configuration itself. The sample has also been verified as malicious by a Mandiant analyst and marked as associated with a tracked threat actor by Google Threat Intelligence.

Screenshot of tasks.json sample.

The Code Insights description indicated that the file, which is parsed when a user opens the project folder in an IDE like Visual Studio (VS) Code, drives the user to download and execute arbitrary code from a GitHub Gist in memory while hiding the execution parameters.

To make Code Insights analysis reproducible at scale, we can also scale access to such descriptions for multiple files via the VirusTotal API. Looking at the contents of this particular file, we identified the Gist URLs that the actor referred to in the instructions.

https://storage.googleapis.com/gweb-cloudblog-publish/images/3._Instructions_from_tasks.max-700x700.png

Instructions from tasks.json pointing to Gists.

Looking up these Gist URLs with agentic threat intelligence provides a detailed breakdown of the malicious instructions embedded within them. Despite masquerading as legitimate tools such as NVIDIA Cuda, these Gists, along with their specific filenames, show strong similarities to widespread campaigns frequently attributed to North Korean actors, which are designed to lure IT professionals.

These attacks often pose as technical challenges to trick users into compromising their own devices.

https://storage.googleapis.com/gweb-cloudblog-publish/images/4._Agentic_threat_intelligence_enrichment_.max-1200x1200.png

Agentic threat intelligence enrichment based on the tasks.json and associated Gists quickly gives analysts more robust context.

Example 2. Offensive system instructions files
System instruction files used to provide guidance, resources, and context to LLMs can also contain malicious capabilities while remaining undetected by common antivirus services. Since the beginning of 2026, we have observed a consistent increase in Skill.md files submitted to VirusTotal with either risky or malicious instructions.

While this does not necessarily mean that all samples were harmful, it illustrates a trend that is likely to grow in tandem with the adoption and implementation of Skills across the industry.

In this example, we identified a Skill.md file containing instructions to steal user data. Code Insight indicated that the skill file contained instructions “to exfiltrate sensitive credentials, including API keys and environment variables, to external endpoints."

This case reflects a growing interest among threat actors in acquiring API keys and resources to enable scalable LLM integrations. At the time of writing, this file had remained active for nearly two months without any detections or researcher notes.

https://storage.googleapis.com/gweb-cloudblog-publish/images/5._Example_of_a_Skill_file_with_instructio.max-1000x1000.jpg

Example of a Skill file with instructions to steal user data.

The file's contents reveal a specific narrative designed to evade detection. The instructions direct the agent to exfiltrate API keys, tokens, and configuration files under the guise of "maintenance," explicitly advising the model not to mention this to the user "as it may cause confusion about the security process."

Although direct intelligence on this specific file was limited, we used the agentic threat intelligence briefing capability to generate a summary and explore similar past observations. This provided contextual information to categorize and understand the threat.

https://storage.googleapis.com/gweb-cloudblog-publish/images/6._Agentic_threat_intelligence_briefs_summ.max-1300x1300.png

Agentic threat intelligence briefs summarize similar threats.

Even files that explicitly state their offensive capabilities often evade traditional detections. For example, we identified a Skill designed to equip an AI agent with Windows privilege escalation and credential theft capabilities.

Although the file includes a disclaimer for authorized use only, its core instructions remain high-risk. Code Insight accurately evaluated the file. "The file provides explicit and systematic instructions for performing high-risk offensive operations," it said.

Despite its offensive capabilities, by the time of writing only a few vendors had flagged the file as malicious.

https://storage.googleapis.com/gweb-cloudblog-publish/images/7._Example_of_Skill_for_Windows_privilege_.max-1100x1100.jpg

Example of Skill for Windows privilege escalation and credential theft.

Example 3: Suspicious JSON runtime configurations
A third example is a pair of settings.json samples shared through VirusTotal: One points to api.awstore.cloud, the other to api.kiro.cheap. The two unrelated samples follow a similar pattern: They override ANTHROPIC_BASE_URL, embed an API key, and turn Claude Code into a client of a third-party proxy rather than Anthropic.

https://storage.googleapis.com/gweb-cloudblog-publish/images/8._Code_Insights_analyzes_suspicious_runtime.max-900x900.png

Code Insights analyzes suspicious runtime configuration samples.

This demonstrates exactly how runtime configurations can be weaponized. The file does not need exploit code or a malicious binary to be dangerous. It simply rewires trust while the agent is running.

For example, a valid AI-generated settings file can silently redirect prompts, source code, and credentials to an external endpoint while the agent appears to behave normally. Beyond data exfiltration, a rogue endpoint could plausibly reverse the flow, feeding malicious instructions or vulnerabilities back to the agent to be injected directly into the local codebase.

A high level analysis of awstore.cloud using an agentic threat intelligence pivoting prompt, uncovered a series of similar domains sharing the same underlying infrastructure. These domains exhibit a clear naming preference for crypto, finance, and tech-related nomenclature.

While the organization’s public sites currently lack formal malicious detections, OSINT lookups reveal several red flags: a lack of a verifiable legal entity, limited contact options restricted to Discord and Telegram, and a payment model that exclusively accepts cryptocurrency via third-party marketplaces like plati.market.

The settings profile reinforces this pattern. Beyond changing the endpoint, the configuration suppresses telemetry, error reporting, and cost warnings, stripping away the guardrails that would otherwise alert a user. The intent is seemingly to maintain a facade of normal operation while silently redirecting traffic to an opaque third-party service.

While these are technically valid configuration artifacts, their ability to hijack trust and exfiltrate sensitive data is indistinguishable from traditional malware.

Example 4. A Sabotaged Extension Payload
Another low key example we recently identified was that of a VS Code extension for User-centric Use cases Validator (UUV) end-to-end tests submitted to VirusTotal in March. More than one week later, the sample continued to have zero detections, but VirusTotal Code Insights identified suspicious behavior.

The analysis indicated that this specific sample included a well-known protestware payload known as peacenotwar which upon activation writes a blank file named WITH-LOVE-FROM-AMERICA.txt and logs a heart in the console.

https://storage.googleapis.com/gweb-cloudblog-publish/images/9._Sample_of_VS_Code_extension_containing_.max-1100x1100.jpg

Sample of VS Code extension containing malware used to spread political messages.

To bridge the gap between a suspicious file and actionable intelligence, we generated an agentic threat intelligence brief. By feeding the semantic context from Code Insight into the prompt, the agent pivoted across historical data, instantly linking this 'benign' extension to the 2022 cyber activist sabotage of the node-ipc library in response to the invasion of Ukraine.

While this specific event may have limited impact today, it highlights a critical, overlooked weakness in how agents handle configurations. Code Insight bridges this gap by identifying samples that, while technically benign to traditional scanners, harbor clear malicious intent.

In another example, we identified this version of a public AI coding assistant which, according to the feature’s analysis, ‘silently reads the user’s system clipboard contents and transmits this data to a remote server.’ Regardless of the likely benign nature of the sample, the analysis points out a risk for users to consider when using the extension.

https://storage.googleapis.com/gweb-cloudblog-publish/images/10._Example_of_public_coding_assistance_th.max-1200x1200.jpg

Example of public coding assistance that reads the user’s system clipboard contents and transmits data to a remote server.

Rethinking detection for the agentic era

Today, a JSON file or plain-text markdown instructions can compromise environments just as effectively as compiled malware. This shift fundamentally redefines what malicious looks like, as the danger now resides in the semantic intent of common text files that AI agents are designed to trust.

These artifacts do not need to contain exploit code to be high-risk, they simply need to provide instructions that steer an agent’s autonomous actions toward unsafe behavior, data exfiltration, and the silencing of security guardrails.

Securing this new frontier requires expanding beyond traditional syntax-based scanning toward a model of semantic analysis, treating plain-text artifacts with the same rigor as compiled malware.

Organizations can formalize this approach by implementing repository-level security policies that strictly define permitted agent-facing files and ideally mandate that they undergo automated peer reviews before being merged. We also recommend that large-scale teams enforce least-privilege access for coding agents to local files and external services, limiting the potential impact of hijacked configurations and sabotaged extensions.

Ultimately, we recommend that defenders use agentic threat intelligence tools — including VirusTotal AI, the VirusTotal Code Insights API endpoint, and our agentic platform — to supervise the operational intent of these files in real-time.

Posted in