Same same but also different: Google guidance on AI supply chain security

October 7, 2025
Anton Chuvakin

Security Advisor, Office of the CISO

Supply chain weaknesses are among the most commonly exploited security vulnerabilities, so it makes sense that AI supply chains — more intricate and opaque than traditional software supply chains — face even greater risk. For AI supply chain security to be effective, it needs to be realistically usable.

Data poisoning, training framework vulnerabilities, and model tampering are significant threats to AI models, and were documented as early as 2023 and early 2024. The compromised models in those incidents appeared safe but contained dangerous code: when unsuspecting users downloaded them, that code could steal data and install backdoors that let attackers take control of the users' machines.

When crafting AI supply chain risk-mitigation measures, it’s important to remember that security efforts rarely work when developers won’t or can’t use them. As the history of GPG key signing shows, a security measure must be usable before developers will adopt it.

How AI supply chain risks are similar to traditional software

At Google, we believe that AI development is similar to the traditional software development lifecycle, so existing security measures should readily adapt to AI. Our approach to securing the AI supply chain is built on the Secure AI Framework (SAIF), as detailed in research we’ve recently published.

The paper emphasizes that AI models, particularly large language models (LLMs), are inherently "opaque." Their behavior is heavily influenced by their weights, which are difficult to analyze at best, both because there are so many of them and because they are stored in binary formats. That opacity poses a unique challenge for security leaders, who can much more easily inspect and understand the inner workings of traditional software.

“As with traditional supply chains, it’s important to find and fix bugs that get introduced into AI artifacts and infrastructure. With AI, though, a new class of dependencies emerges: the datasets which have been used to train a model,” said the paper’s authors.

The ecosystem for storing, changing, and retrieving datasets is less mature than the ecosystem for code management, but at Google it is treated as a required part of AI development. Tamper-proof provenance can help developers verify the identity of the model producer and confirm the model’s authenticity.
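To make that concrete, here is a minimal sketch of what consumer-side verification could look like, using an Ed25519 signature over a model file's SHA-256 digest. The function, file handling, and key delivery are illustrative assumptions rather than anything prescribed by the paper; in practice the producer's public key would arrive through a trusted channel such as a transparency log.

```python
# Minimal sketch: verify a downloaded model against the producer's
# published Ed25519 signature before loading it. All names here are
# hypothetical; the producer is assumed to have signed the SHA-256
# digest of the model file.
import hashlib

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

def verify_model(model_path: str, signature: bytes, public_key_bytes: bytes) -> bool:
    """Return True only if `signature` covers this exact model file."""
    with open(model_path, "rb") as f:
        digest = hashlib.sha256(f.read()).digest()
    public_key = Ed25519PublicKey.from_public_bytes(public_key_bytes)
    try:
        public_key.verify(signature, digest)  # raises on any tampering
        return True
    except InvalidSignature:
        return False

# Only deserialize the weights after verify_model(...) returns True.
```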

Adapting Supply-chain Levels for Software Artifacts (SLSA) to the AI supply chain requires addressing challenges such as the long, resource-intensive nature of AI model training, but doing so can provide crucial insight into AI supply chains.

AI supply chain security can also benefit greatly from adopting the traditional supply chain security concept of provenance: a tamper-proof record of an artifact's origins and modifications. Provenance can help track dependencies, ensure integrity, and mitigate risks, such as data poisoning and model tampering.

AI data provenance raises hard design questions, including what level of detail to capture (dataset names versus individual data points) and how to perform cryptographic integrity checks on large datasets. Even with that complexity, tamper-proof provenance is indispensable for securing AI artifacts and data.
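As a rough illustration, a SLSA-style provenance statement for a trained model might look like the sketch below. The statement/predicate layout follows the published in-toto/SLSA provenance v1 format, but every value is a placeholder, and the dataset dependency shows where training data enters as a first-class, digest-pinned input.

```python
# Illustrative in-toto/SLSA-style provenance statement for a trained model.
# The URIs, digests, and builder ID are placeholders, not real endpoints.
provenance = {
    "_type": "https://in-toto.io/Statement/v1",
    "subject": [
        {"name": "my-model.safetensors",           # hypothetical artifact
         "digest": {"sha256": "<model-digest>"}},
    ],
    "predicateType": "https://slsa.dev/provenance/v1",
    "predicate": {
        "buildDefinition": {
            "buildType": "https://example.com/ml-training/v1",  # hypothetical
            "externalParameters": {"epochs": 3, "base_model": "<base-model-uri>"},
            # Datasets appear as ordinary dependencies, each pinned by digest.
            "resolvedDependencies": [
                {"uri": "https://example.com/datasets/train.jsonl",
                 "digest": {"sha256": "<dataset-digest>"}},
            ],
        },
        "runDetails": {"builder": {"id": "https://example.com/trainer"}},
    },
}
```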

Key differences in AI supply chain risk models

Even with those similarities, AI supply chain risk modeling shouldn’t be lifted and shifted from traditional software supply chain risk models. Three crucial differences between AI supply chains and traditional software supply chains include:

  • Data versus code: Traditional software primarily relies on code; AI relies heavily on data, which creates unique security challenges around data provenance, poisoning, and versioning. Version control for AI datasets is also far less mature than version control for code, which can make it difficult to track changes to datasets and manage their security. Pinning each dataset by a content digest, as sketched after this list, is one practical mitigation.
  • Opacity versus inspectability: Unlike source code, which can be readily inspected and analyzed, AI models are opaque to manual review. AI training often comprises a series of ad hoc, incremental steps that are not recorded in any central configuration.
  • Emphasis on provenance: Provenance matters even more for AI than for traditional software because of the risks of data poisoning and model tampering.
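One lightweight way to bring code-style version discipline to datasets is to pin each one by a cryptographic content digest and fail the training job if the bytes ever drift. Here is a minimal sketch; the manifest and file name are hypothetical.

```python
# Minimal sketch: pin a training dataset by content digest so that any
# silent modification (including poisoning) fails loudly at train time.
import hashlib

# Recorded once, when the dataset is vetted and frozen (placeholder digest).
MANIFEST = {"train.jsonl": "<expected-sha256-hex>"}

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file so large datasets don't need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def check_dataset(path: str) -> None:
    actual = sha256_of(path)
    expected = MANIFEST[path]
    if actual != expected:
        raise RuntimeError(f"{path}: digest {actual} != pinned {expected}")
```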

Provenance is key

Security teams should prioritize the implementation of robust provenance tracking mechanisms to ensure the integrity and traceability of AI models and datasets, and to help build trust with users that the AI model is accurate and reliable.

Software provenance details can play a critical role in mitigating AI cyberattacks. Documenting an AI model’s datasets, frameworks, and pretrained models can help identify potentially problematic models and establish a comprehensive record of the model's lineage. Ultimately, provenance information can serve as a valuable resource for identifying potential vulnerabilities and risks associated with specific AI models, and help organizations create more accurate risk models of their AI use.

For example, if a dataset used in training an AI model is known to contain biases or inaccuracies, the resulting model may exhibit similar flaws. By tracking the provenance of the dataset, it becomes possible to identify and address these issues before they can be exploited by attackers.

Similarly, if a pretrained model is found to have security vulnerabilities, any AI models that incorporate it may also be at risk. Provenance information allows for the identification and mitigation of these risks, ensuring that AI models are robust and secure.
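As a simple illustration of how that lookup might work, an organization that records the digest of every pretrained base model can sweep its inventory the moment one is reported as compromised. The records and advisory digests below are invented for the example.

```python
# Sketch: given lineage records for deployed models, find every model
# built on a pretrained base that is now known to be compromised.
DEPLOYED = [
    {"model": "support-bot-v2", "base_model_sha256": "aaa111"},  # placeholder
    {"model": "search-ranker",  "base_model_sha256": "bbb222"},  # placeholder
]
KNOWN_BAD = {"bbb222"}  # digests taken from a security advisory (invented)

at_risk = [m["model"] for m in DEPLOYED if m["base_model_sha256"] in KNOWN_BAD]
print(at_risk)  # -> ['search-ranker']
```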

Provenance information can also be used to track the evolution of AI models over time. As models are updated and refined, it is important to maintain a record of the changes that have been made. This information can be used to identify any unintended consequences of these changes, as well as to track the effectiveness of different approaches to AI development.

How to get your AI supply chain security started

It may seem daunting to stare down a poorly documented AI model that desperately needs a SLSA and provenance intervention.

We recommend beginning by capturing enough metadata to understand the lineage of each artifact. You want to be able to answer basic questions: where an artifact came from; who authored, changed, or trained it; what datasets were used to train it; and what source code was used to generate it.
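Sketched as a record type, one artifact's lineage metadata might look like the following; the field names simply mirror the questions above and are illustrative, not a standard schema.

```python
# Sketch of a lineage record that answers the basic provenance questions
# for one artifact. Field names are illustrative only.
from dataclasses import dataclass, field

@dataclass
class ArtifactLineage:
    artifact_digest: str     # sha256 of the model or dataset bytes
    origin: str              # where the artifact came from
    authors: list[str]       # who authored, changed, or trained it
    training_datasets: list[str] = field(default_factory=list)  # dataset URIs + digests
    source_repo: str = ""    # code used to generate the artifact
    source_commit: str = ""  # exact revision of that code
```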

Next, organize the information to support queries and controls. Ideally, the metadata should be captured during the artifact’s creation in a non-modifiable, tamper-evident way. Finally, as a best practice, share the metadata you capture in an SBOM, provenance document, model card, or some other vehicle that will assist other developers.
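Here is a minimal sketch of the tamper-evident step: signing the serialized metadata with an Ed25519 key at creation time, so any later modification is detectable. Key management and distribution are deliberately out of scope, and the metadata itself is a placeholder.

```python
# Sketch: sign captured metadata at creation time so later tampering is
# detectable. In practice the key would live in a managed signing service.
import json

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

metadata = {"artifact": "my-model.safetensors", "sha256": "<digest>"}  # placeholder
payload = json.dumps(metadata, sort_keys=True).encode()  # canonical-ish bytes

private_key = Ed25519PrivateKey.generate()  # stand-in for a managed key
signature = private_key.sign(payload)

# Ship payload + signature alongside the artifact (for example, inside an
# SBOM or provenance document); consumers verify with the public key.
bundle = {"payload": payload.hex(), "signature": signature.hex()}
```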

By capturing comprehensive metadata, strengthening integrity with cryptographic signing, organizing that metadata for effective querying and controls, and sharing it to foster trust and transparency, developers can gain a better understanding of their models, the risks they pose, and how to mitigate those risks.

Collective action is key to securing the AI software supply chain. No matter how self-reliant an organization is, there will always be dependencies, datasets, and other shared components involved. By diligently applying established software supply chain security practices and carefully tracking datasets, organizations can bolster their defenses against malicious attacks and recover more quickly from unintended vulnerabilities.

For more insights on AI supply chain security, we recommend reading the full Securing the AI Software Supply Chain paper, as well as additional AI security resources that compare AI and traditional security.
