Generative AI and Data Governance

Google was the first in the industry to publish an AI/ML Privacy Commitment, which outlines our belief that customers should have the highest level of security and control over their data that is stored in the cloud. That commitment extends to Google Cloud's generative AI products. Google ensures that its teams are following these commitments through robust data governance practices, which include reviews of the data that Google Cloud uses in the development of its products. More details about how Google processes data can also be found in Google's Cloud Data Processing Addendum (CDPA).

Definitions

Foundation Models: Large-scale machine learning (ML) models that are trained on a large amount of data and can be used for a broad range of tasks.
Adapter Models: Also known as adapter layers or adapter weights; ML models that work in conjunction with a Foundation Model to improve the performance of specialized tasks.
Customer Data: For a definition, see the Google Cloud Platform Terms of Service.
Training: The process of using data to train an ML model.
Prediction: Also known as inference; the process of using ML models to process inputs and generate outputs.
Safety Classifiers: Models used to identify certain categories of content, such as potentially violent material, during the Prediction process.

Foundation Model Training

By default, Google Cloud doesn't use Customer Data to train its Foundation Models. Customers can use Google Cloud's Foundation Models knowing that their prompts, responses, and any Adapter Model training data aren't used for the training of Foundation Models.

Adapter Model Training

Vertex AI offers a service that enables customers to train Adapter Models. Adapter Model training data is Customer Data; it isn't stored and isn't used to improve Google Cloud's Foundation Models. The resulting Adapter Model is available only to the customer who trained it, and Google doesn't claim ownership of Adapter Models except to the extent that they use pre-existing Google intellectual property. By default, Customer Data is stored in encrypted form and is encrypted in transit. Customers can also control the encryption of their Adapter Models by using customer-managed encryption keys (CMEK) and can delete their Adapter Models at any time.
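
For illustration only, the following Python sketch shows one way this could look: training an Adapter Model with supervised tuning through the Vertex AI SDK, applying a customer-managed encryption key, and deleting the tuned model afterwards. The project, dataset URI, base model name, and KMS key are placeholders, and the sft tuning interface is an assumption about the current SDK surface; check the Vertex AI tuning documentation for the exact names in your SDK version.

```python
# Sketch: train an Adapter Model on Vertex AI with a CMEK key, then delete it.
# Placeholder names throughout; the vertexai.tuning.sft interface is an
# assumption and may differ across SDK versions.
import time

import vertexai
from google.cloud import aiplatform
from vertexai.tuning import sft  # supervised (adapter) tuning interface

PROJECT_ID = "my-project"  # placeholder
CMEK_KEY = (               # placeholder customer-managed key resource name
    "projects/my-project/locations/us-central1/"
    "keyRings/my-ring/cryptoKeys/my-key"
)

# encryption_spec_key_name applies the customer-managed key to resources
# created in this session, including the tuned Adapter Model.
vertexai.init(
    project=PROJECT_ID,
    location="us-central1",
    encryption_spec_key_name=CMEK_KEY,
)

# Launch an adapter (supervised) tuning job against a Foundation Model.
# The training data is Customer Data; per the section above, it isn't used
# to improve Google Cloud's Foundation Models.
tuning_job = sft.train(
    source_model="gemini-1.0-pro-002",           # placeholder base model
    train_dataset="gs://my-bucket/train.jsonl",  # placeholder dataset URI
    tuned_model_display_name="my-adapter-model",
)

# Wait for the tuning job to finish.
while not tuning_job.has_ended:
    time.sleep(60)
    tuning_job.refresh()

# The tuned model is visible only to this project. The customer can delete
# the Adapter Model at any time:
model = aiplatform.Model(model_name=tuning_job.tuned_model_name)
model.delete()
```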

Prediction

Inputs and outputs processed by Foundation Models, Adapter Models, and Safety Classifiers during Prediction are Customer Data. Customer Data isn't logged by Google without explicit permission from the customer, which can be granted by opting in to allow Google to cache inputs and outputs.

During Prediction, Google doesn't log Customer Data to generate a customer's output or to train Foundation Models. By default, Google caches a customer's inputs and outputs for Gemini models to accelerate responses to subsequent prompts from the same customer. Cached content is stored for up to 24 hours, and project-level privacy is enforced for cached data. To learn how to use the API to check caching status, or to disable or re-enable caching for a Google Cloud project, see How do I enable or disable caching? Disabling caching might result in higher latency.
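
As a hedged example, the Python sketch below shows one way to check and update a project's cache configuration over the REST API. The v1beta1 cacheConfig resource and the disableCache field are assumptions based on the cache-control interface referenced above; confirm the exact endpoint and field names in How do I enable or disable caching? before relying on them.

```python
# Sketch: check and disable Gemini prompt/response caching for a project.
# Assumes a v1beta1 "cacheConfig" resource with a "disableCache" field, as
# referenced in "How do I enable or disable caching?"; verify before use.
import google.auth
import google.auth.transport.requests
import requests

PROJECT_ID = "my-project"  # placeholder project ID

# Obtain an access token with Application Default Credentials.
credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
credentials.refresh(google.auth.transport.requests.Request())
headers = {"Authorization": f"Bearer {credentials.token}"}

base = "https://us-central1-aiplatform.googleapis.com/v1beta1"  # placeholder region
config_name = f"projects/{PROJECT_ID}/cacheConfig"

# Read the current caching status for the project.
status = requests.get(f"{base}/{config_name}", headers=headers)
print(status.json())

# Disable caching; set "disableCache" to False to re-enable it later.
# Note: disabling caching might result in higher latency, as noted above.
update = requests.patch(
    f"{base}/{config_name}",
    headers=headers,
    json={"name": config_name, "disableCache": True},
)
print(update.json())
```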

Opt out of the Trusted Tester Program

If you previously opted in to permit Google to use your data to improve pre-GA AI/ML services as part of the Trusted Tester Program terms, you can use the Trusted Tester Program - Opt Out Request form to opt out.

What's next