Learn how to get started with the Gen AI evaluation service using the Google Cloud console.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
  Roles required to select or create a project:
  - Select a project: Selecting a project doesn't require a specific IAM role; you can select any project that you've been granted a role on.
  - Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
- Verify that billing is enabled for your Google Cloud project.
- Make sure that you have the following role or roles on the project: Storage Admin
  Check for the roles:
  - In the Google Cloud console, go to the IAM page.
  - Select the project.
  - In the Principal column, find all rows that identify you or a group that you're included in. To learn which groups you're included in, contact your administrator.
  - For all rows that specify or include you, check the Role column to see whether the list of roles includes the required roles.
  Grant the roles:
  - In the Google Cloud console, go to the IAM page.
  - Select the project.
  - Click Grant access.
  - In the New principals field, enter your user identifier. This is typically the email address for a Google Account.
  - In the Select a role list, select a role.
  - To grant additional roles, click Add another role and add each additional role.
  - Click Save.
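The role check described above can also be approximated outside the console. The following is a minimal sketch, assuming you have exported the project's IAM policy as JSON (for example with `gcloud projects get-iam-policy PROJECT_ID --format=json`) and that Storage Admin corresponds to the role ID `roles/storage.admin`. The sample policy and the member names are hypothetical. The sketch only looks at roles bound directly to the given member on the project; it does not resolve group membership or roles inherited from a folder or organization.

```python
import json

REQUIRED_ROLE = "roles/storage.admin"  # role ID for Storage Admin


def roles_for_member(policy: dict, member: str) -> set[str]:
    """Return the roles that the policy binds directly to `member`."""
    return {
        binding["role"]
        for binding in policy.get("bindings", [])
        if member in binding.get("members", [])
    }


def has_role(policy: dict, member: str, role: str = REQUIRED_ROLE) -> bool:
    """Check whether `member` is bound directly to `role` in the policy."""
    return role in roles_for_member(policy, member)


# Hypothetical policy, shaped like the JSON an IAM policy export produces.
sample_policy = json.loads("""
{
  "bindings": [
    {"role": "roles/storage.admin", "members": ["user:alex@example.com"]},
    {"role": "roles/viewer",
     "members": ["user:alex@example.com", "group:team@example.com"]}
  ]
}
""")

print(has_role(sample_policy, "user:alex@example.com"))
print(has_role(sample_policy, "group:team@example.com"))
```

If the check fails for your own identifier, follow the Grant the roles steps, or ask an administrator to grant the role.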
Evaluate your model
To evaluate your model:
- In the Google Cloud console, go to the Gen AI Evaluation page. 
- Click New evaluation to open the evaluation page. 
- For Define evaluation dataset, select an option:
  - Upload file: Click Upload to upload a CSV or JSONL file. The dataset should contain either prompts or records to use in a prompt template, and optionally model responses, with a maximum of 200 rows.
  - Generate data: Enter a Prompt template to guide the Gen AI evaluation service in generating a dataset. Variables you define in your prompt template are generated and populated in the dataset. For more information, see Use prompt templates.
    - Define variables to generate: Specify the variables to generate and a description of each variable to guide generation. If needed, click Add another variable description.
    - Enter a Number of samples to generate.
    - Click Generate and preview dataset to display a generated dataset based on your prompt template and variables. To adjust the dataset, you can add more details to the variable descriptions and click Re-generate.
  - Use model logs: Use the snapshot of prompts and responses from the logged traffic of the selected model. You can only use this option if you have request-response logs enabled on a deployed model in Vertex AI. If you just enabled logging, allow time for sufficient samples to accumulate.
    - Select the Model and the Region whose logged traffic you want to use. Logging must already be enabled for the selected model and region.
    - Enter a Sampling count.
    - (Optional) Enable Filter by prompt template to use only logs that match your Prompt template. This can be useful if you use your selected models for a variety of use cases and want to evaluate one specific use case.
- For Define model responses to evaluate, select an option:
  - From dataset (only available if you selected Upload file for Define evaluation dataset): If you want to use one of the fields in the uploaded dataset as your response, select a Response column.
  - From model (only available if you selected Use model logs for Define evaluation dataset): If you're using model logs as the evaluation dataset, the Gen AI evaluation service uses the model responses from the model logs.
  - Call model: Select a model. The Gen AI evaluation service runs prompts on the selected model and uses the responses for evaluation.
- (Optional) For Auto-generated evaluation metrics, you can Specify custom instructions to guide the rubrics generated from each prompt. For example:
  - Evaluate the dataset on cultural sensitivity to the countries {name}.
  For more information, see Define your evaluation metrics.
- For Name and output directory, enter the following:
  - Evaluation name: Enter a name for your evaluation.
  - Output private data path: Enter the name of a Cloud Storage bucket where you want to store your evaluation, or click Browse to choose the bucket.
- Click Evaluate. 
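If you choose Upload file, you need a CSV or JSONL file of at most 200 rows before you start. The following is a minimal sketch of one way to prepare such a file locally by filling a prompt template with variables. The field names `prompt` and `name`, the template text, and the output file name are assumptions for illustration, not a schema required by the Gen AI evaluation service; check the dataset requirements for the column names your evaluation expects.

```python
import json
import tempfile
from pathlib import Path

MAX_ROWS = 200  # upload limit stated in the steps above


def build_rows(template: str, variable_sets: list[dict]) -> list[dict]:
    """Fill the template once per variable set, keeping the variables in each row."""
    return [
        {**variables, "prompt": template.format(**variables)}
        for variables in variable_sets
    ]


def write_jsonl(rows: list[dict], path: Path) -> None:
    """Write one JSON object per line, refusing datasets over the row limit."""
    if len(rows) > MAX_ROWS:
        raise ValueError(f"dataset has {len(rows)} rows; the limit is {MAX_ROWS}")
    with path.open("w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")


# Hypothetical template and variables.
template = "Write a short travel tip for visitors to {name}."
rows = build_rows(template, [{"name": "Japan"}, {"name": "Brazil"}])

out_path = Path(tempfile.gettempdir()) / "eval_dataset.jsonl"
write_jsonl(rows, out_path)
print(out_path.read_text(encoding="utf-8"))
```

Keeping the variables next to the filled prompt in each row means the same file can serve either as a list of prompts or as records for a prompt template.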
View your evaluation results
To view an evaluation result:
- In the Google Cloud console, go to the Gen AI Evaluation page. 
- Click the evaluation name. 
- For each prompt in your evaluation dataset, the model's response displays along with the evaluation results.