Integration with Google Kubernetes Engine

Model Armor can be integrated with Google Kubernetes Engine (GKE) through Service Extensions, which let you add custom logic to the network traffic processing path. A traffic extension is a type of service extension that calls an external service to process traffic. Service extensions can be attached to various Google Cloud services, including load balancers. To screen traffic to and from a GKE cluster, configure a service extension on an application load balancer, such as a GKE inference gateway. Requests and responses that pass through that load balancer to the AI models are then screened by Model Armor. For more information, see Configure a traffic extension to call a Model Armor service.

How it works

You configure a service extension on a load balancer that routes traffic to an LLM hosted in your GKE cluster. This configuration specifies that Model Armor should be used to screen prompts and responses.
When a prompt or a response reaches the load balancer, the service extension calls the Model Armor service.
Model Armor then applies security policies to the prompts and responses, identifying and blocking any malicious or harmful content.
Only prompts that pass the Model Armor checks are forwarded to the GKE cluster, and only responses that pass are returned to the client.
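
The flow above can be pictured with a small local simulation. This is only a sketch of the control flow and does not call Model Armor or any Google Cloud API: the `screen` function, the `BLOCKED_PHRASES` list, the `Verdict` type, and the 403 status returned on a block are all hypothetical stand-ins chosen for illustration. A real deployment applies the policies configured in Model Armor, and the load balancer decides how a blocked request is reported.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical stand-in for a screening policy. Real Model Armor policies
# are configured in the service, not as a phrase list in application code.
BLOCKED_PHRASES = ("ignore previous instructions", "reveal the system prompt")


@dataclass
class Verdict:
    allowed: bool
    reason: str = ""


def screen(text: str) -> Verdict:
    """Toy check standing in for the service extension's call to Model Armor."""
    lowered = text.lower()
    for phrase in BLOCKED_PHRASES:
        if phrase in lowered:
            return Verdict(False, f"matched blocked phrase: {phrase!r}")
    return Verdict(True)


def handle_request(prompt: str, model: Callable[[str], str]) -> Tuple[int, str]:
    """Mimic the load balancer path: screen the prompt, call the model,
    then screen the response before returning it to the client."""
    if not screen(prompt).allowed:
        # The prompt never reaches the model in the cluster.
        return 403, "prompt blocked"
    response = model(prompt)
    if not screen(response).allowed:
        # The model ran, but its output is not returned to the client.
        return 403, "response blocked"
    return 200, response
```

Here `model` is any callable that takes a prompt string and returns a response string, for example `handle_request("Hello", lambda p: "echo: " + p)`. Screening in both directions is the point of the sketch: a prompt can be clean while the model's response is not, so the response gets its own check.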