Vertex AI glossary

  • agent
    • An AI agent is a software system that utilizes artificial intelligence (AI) to achieve goals and complete tasks for users. It demonstrates reasoning, planning, and memory capabilities, and possesses a level of autonomy to make decisions, learn, and adapt.
  • annotation set
    • An annotation set contains the labels associated with the uploaded source files within a dataset. An annotation set is associated with both a data type and an objective (for example, video/classification).
  • API endpoints
    • API Endpoints is a service config aspect that specifies the network addresses, also known as service endpoints (for example, aiplatform.googleapis.com).
  • Application Default Credentials (ADC)
    • The Application Default Credentials (ADC) provide a simple way to get authorization credentials for use in calling Google APIs. They are best suited for cases when the call needs to have the same identity and authorization level for the application independent of the user. This is the recommended approach to authorize calls to Google Cloud APIs, particularly when you're building an application that is deployed to Google App Engine (GAE) or Compute Engine virtual machines. For more information, see How Application Default Credentials works.
  • Approximate Nearest Neighbor (ANN)
    • The Approximate Nearest Neighbor (ANN) service is a high-scale, low-latency solution for finding similar vectors (or more specifically, "embeddings") in a large corpus. For more information, see How to use Vector Search for semantic matching.
  • artifact
    • An artifact is a discrete entity or piece of data produced and consumed by a machine learning workflow. Examples of artifacts include datasets, models, input files, and training logs.
  • Artifact Registry
    • Artifact Registry is a universal artifact management service. It is the recommended service for managing containers and other artifacts on Google Cloud. For more information, see Artifact Registry.
  • Artificial Intelligence (AI)
    • Artificial intelligence (AI) is the study and design of machines that appear to be "intelligent", meaning machines that mimic human or intellectual functions such as mechanical movement, reasoning, or problem solving. One of the most popular subfields of AI is machine learning, which uses a statistical and data-driven approach to create AI. However, some people use these two terms interchangeably.
  • authentication
    • The process of verifying the identity of a client (which might be a user or another process) for the purposes of gaining access to a secured system. A client that has proven its identity is said to be authenticated. For more information, see Authentication methods at Google.
  • Automatic side-by-side (AutoSxS)
    • Automatic side-by-side (AutoSxS) is a model-assisted evaluation tool that compares two large language models (LLMs) side by side. It can be used to evaluate the performance of either generative AI models in Vertex AI Model Registry or pre-generated predictions. AutoSxS uses an autorater to decide which model gives the better response to a prompt. AutoSxS is available on demand and evaluates language models with comparable performance to human raters.
  • AutoML
    • Machine learning algorithms that "learn to learn" through black-box optimization. For more information, see ML Glossary.
  • autorater
    • An autorater is a language model that evaluates the quality of model responses given an original inference prompt. It's used in the AutoSxS pipeline to compare the predictions of two models and determine which model performed the best. For more information, see The autorater.
  • baseline
    • A model used as a reference point for comparing how well another model (typically, a more complex one) is performing. For example, a logistic regression model might serve as a good baseline for a deep model. For a particular problem, the baseline helps model developers quantify the minimal expected performance that a new model must achieve for the new model to be useful. For more information, see Baseline and target datasets.
  • batch
    • The set of examples used in one training iteration. The batch size determines the number of examples in a batch.
  • batch size
    • The number of examples in a batch. For example, the batch size of SGD is 1, while the batch size of a mini-batch is usually between 10 and 1000. Batch size is usually fixed during training and inference; however, TensorFlow does permit dynamic batch sizes.
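As a minimal sketch of how batch size partitions a dataset, the following splits a list of examples into consecutive batches. The helper name make_batches is hypothetical and is not part of Vertex AI or TensorFlow.

```python
def make_batches(examples, batch_size):
    """Split a list of examples into consecutive batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [examples[i:i + batch_size] for i in range(0, len(examples), batch_size)]


examples = list(range(10))          # ten toy examples
batches = make_batches(examples, batch_size=4)
print(batches)                      # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

The last batch is smaller when the number of examples is not a multiple of the batch size. With batch_size=1, every example forms its own batch, which matches the SGD case described above.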
  • batch prediction
    • Batch prediction takes a group of prediction requests and outputs the results in one file. For more information, see Getting batch predictions.
  • bias
    • 1. Stereotyping, prejudice or favoritism towards some things, people, or groups over others. These biases can affect collection and interpretation of data, the design of a system, and how users interact with a system. 2. Systematic error introduced by a sampling or reporting procedure.
  • bidirectional
    • A term used to describe a system that evaluates the text that both precedes and follows a target section of text. In contrast, a unidirectional system only evaluates the text that precedes a target section of text.
  • Bidirectional Encoder Representations from Transformers (BERT)
    • BERT is a method of pre-training language representations, meaning that we train a general-purpose "language understanding" model on a large text corpus (like Wikipedia), and then use that model for downstream NLP tasks that we care about (like question answering). BERT outperforms previous methods because it is the first unsupervised, deeply bidirectional system for pre-training NLP.
  • Bilingual Evaluation Understudy (BLEU)
    • A popular measure for evaluating the quality of a machine-translation algorithm by comparing its output to that of one or more human translations.
  • bounding box
    • A bounding box for an object in the video frame can be specified in either of two ways: (i) Using 2 vertices, each a set of x,y coordinates, that are diagonally opposite points of the rectangle. For example: x_relative_min, y_relative_min,,,x_relative_max,y_relative_max,, (ii) Using all 4 vertices. For more information, see Prepare video data.
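The two forms are interchangeable for an axis-aligned rectangle. The sketch below expands two diagonally opposite vertices into all four. The function name and the vertex ordering (starting at the minimum corner and walking around the rectangle) are assumptions for illustration, not a format that Vertex AI requires.

```python
def four_vertices(x_min, y_min, x_max, y_max):
    """Expand two diagonally opposite corners into the four corners of the rectangle."""
    if x_min > x_max or y_min > y_max:
        raise ValueError("min coordinates must not exceed max coordinates")
    return [
        (x_min, y_min),
        (x_max, y_min),
        (x_max, y_max),
        (x_min, y_max),
    ]


# Relative coordinates in the range [0, 1]; the values are illustrative.
print(four_vertices(0.25, 0.5, 0.75, 1.0))
# [(0.25, 0.5), (0.75, 0.5), (0.75, 1.0), (0.25, 1.0)]
```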
  • bucket
    • Top-level folder for Cloud Storage. Bucket names must be unique across all users of Cloud Storage. Buckets contain files. For more information, see Product overview of Cloud Storage.
  • chat
    • The contents of a back-and-forth dialogue with an ML system, typically a large language model. The previous interaction in a chat (what you typed and how the large language model responded) becomes the context for subsequent parts of the chat. A chatbot is an application of a large language model.
  • checkpoint
    • Data that captures the state of a model's parameters either during training or after training is completed. For example, during training, you can: 1. Stop training, perhaps intentionally or perhaps as the result of certain errors. 2. Capture the checkpoint. 3. Later, reload the checkpoint, possibly on different hardware. 4. Restart training. Within Gemini, a checkpoint refers to a specific version of a Gemini model trained on a specific dataset.
  • classification model
    • A model whose prediction is a class. For example, the following are all classification models: A model that predicts an input sentence's language (French? Spanish? Italian?). A model that predicts tree species (Maple? Oak? Baobab?). A model that predicts the positive or negative class for a particular medical condition.
  • classification metrics
    • Supported classification metrics in the Vertex AI SDK for Python are confusion matrix and ROC curve.
  • Cloud TPU
    • A specialized hardware accelerator designed to speed up machine learning workloads on Google Cloud.
  • container image
    • A container image is a package that includes the component's executable code and a definition of the environment that the code runs in. For more information, see Custom training overview.
  • context
    • A context is used to group artifacts and executions together under a single, queryable, and typed category. Contexts can be used to represent sets of metadata. An example of a Context would be a run of a machine learning pipeline.
  • context cache
    • A context cache in Vertex AI is a large amount of data that can be used in multiple requests to a Gemini model. The cached content is stored in the region where the request to create the cache is made. It can be any MIME type supported by Gemini multimodal models, such as text, audio, or video. For more information, see Context caching overview.
  • context window
    • The number of tokens a model can process in a given prompt. The larger the context window, the more information the model can use to provide coherent and consistent responses to the prompt.
  • Customer-managed encryption keys (CMEK)
    • Customer-managed encryption keys (CMEK) are integrations that allow customers to encrypt data in existing Google services using a key they manage in Cloud KMS (also known as Storky). The key in Cloud KMS is the key encryption key protecting their data. For more information, see Customer-managed encryption keys (CMEK).
  • CustomJob
    • A CustomJob is one of three Vertex AI resources a user can create to train custom models on Vertex AI. Custom training jobs are the basic way to run custom machine learning (ML) training code in Vertex AI. For more information, see Create custom training jobs.
  • Dask
    • Dask is a distributed computing platform that is often used with TensorFlow, PyTorch, and other ML frameworks to manage distributed training jobs. For more information, see Wikipedia.
  • data analysis
    • Obtaining an understanding of data by considering samples, measurement, and visualization. Data analysis can be particularly useful when a dataset is first received, before one builds the first model. It is also crucial in understanding experiments and debugging problems with the system.
  • data augmentation
    • Artificially boosting the range and number of training examples by transforming existing examples to create additional examples. For example, suppose images are one of your features, but your dataset doesn't contain enough image examples for the model to learn useful associations. Ideally, you'd add enough labeled images to your dataset to enable your model to train properly. If that's not possible, data augmentation can rotate, stretch, and reflect each image to produce many variants of the original picture, possibly yielding enough labeled data to enable excellent training.
  • DataFrame
    • A popular pandas data type for representing datasets in memory. A DataFrame is analogous to a table or a spreadsheet. Each column of a DataFrame has a name (a header), and each row is identified by a unique number. Each column in a DataFrame is structured like a 2D array, except that each column can be assigned its own data type.
  • dataset (data set)
    • A dataset is broadly defined as a collection of structured or unstructured data records. A collection of raw data, commonly (but not exclusively) organized in one of the following formats: a spreadsheet or a file in CSV (comma-separated values) format. For more information, see Create a dataset.
  • decoder
    • In general, any ML system that converts from a processed, dense, or internal representation to a more raw, sparse, or external representation. Decoders are often a component of a larger model, where they are frequently paired with an encoder. In sequence-to-sequence tasks, a decoder starts with the internal state generated by the encoder to predict the next sequence.
  • deep neural network (DNN)
    • A neural network with multiple hidden layers, typically programmed through deep learning techniques.
  • depth
    • The sum of the following in a neural network: 1. the number of hidden layers 2. the number of output layers, which is typically one 3. the number of any embedding layers. For example, a neural network with five hidden layers and one output layer has a depth of 6. Notice that the input layer doesn't influence depth.
  • DevOps
    • DevOps is a suite of Google Cloud Platform products, for example, Artifact Registry, Cloud Deploy.
  • early stopping
    • A method for regularization that involves ending training before training loss finishes decreasing. In early stopping, you intentionally stop training the model when the loss on a validation dataset starts to increase; that is, when generalization performance worsens.
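A minimal sketch of the stopping rule, operating on a list of per-epoch validation losses. The patience parameter (how many consecutive non-improving epochs to tolerate) is an assumption added for illustration and is not part of the definition above.

```python
def early_stopping_epoch(validation_losses, patience=1):
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve on the best
    value seen so far for `patience` consecutive epochs. If that never happens,
    the index of the last epoch is returned.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(validation_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(validation_losses) - 1


# Illustrative losses: validation loss bottoms out at epoch 3, then rises.
losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.58, 0.66]
print(early_stopping_epoch(losses, patience=2))  # 5
```

In practice the checkpoint saved at the best epoch (epoch 3 here) would typically be the one kept, rather than the model state at the stopping epoch.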
  • embedding
    • Numerical representations of words or pieces of text. These numbers capture the semantic meaning and context of the text. Similar or related words or text tend to have similar embeddings, which means they are closer together in the high-dimensional vector space.
  • embedding space (latent space)
    • In Generative AI, embedding space refers to a numerical representation of text, images, or videos that captures relationships between inputs. Machine learning models, particularly generative AI models, are adept at creating these embeddings by identifying patterns within large datasets. Applications can utilize embeddings to process and generate language, recognizing complex meanings and semantic relationships specific to the content.
  • embedding vector
    • A dense, often low-dimensional, vector representation of an item such that, if two items are semantically similar, their respective embeddings are located near each other in the embedding vector space.
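One common way to measure how near two embedding vectors are is cosine similarity. The sketch below uses made-up three-dimensional vectors purely for illustration. Real embeddings come from a model and usually have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings; the numbers are invented for this example.
king = [0.90, 0.80, 0.10]
queen = [0.85, 0.82, 0.15]
banana = [0.10, 0.05, 0.95]

print(cosine_similarity(king, queen) > cosine_similarity(king, banana))  # True
```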
  • encoder
    • In general, any ML system that converts from a raw, sparse, or external representation into a more processed, denser, or more internal representation. Encoders are often a component of a larger model, where they are frequently paired with a decoder. Some transformers pair encoders with decoders, though other transformers use only the encoder or only the decoder. Some systems use the encoder's output as the input to a classification or regression network. In sequence-to-sequence tasks, an encoder takes an input sequence and returns an internal state (a vector). Then, the decoder uses that internal state to predict the next sequence.
  • ensemble
    • A collection of models trained independently whose predictions are averaged or aggregated. In many cases, an ensemble produces better predictions than a single model. For example, a random forest is an ensemble built from multiple decision trees. Note that not all decision forests are ensembles.
  • environment
    • In reinforcement learning, the world that contains the agent and allows the agent to observe that world's state. For example, the represented world can be a game like chess, or a physical world like a maze. When the agent applies an action to the environment, then the environment transitions between states.
  • evaluation (eval)
    • An eval, short for "evaluation", is a type of experiment in which logged or synthetic queries are sent through two Search stacks--an experimental stack that includes your change and a base stack without your change. Evals produce diffs and metrics that let you evaluate the impact, quality, and other effects of your change on search results and other parts of the Google user experience. Evals are used during tuning, or iterations, on your change. They are also used as part of launching a change to live user traffic.
  • event
    • An event describes the relationship between artifacts and executions. Each artifact can be produced by an execution and consumed by other executions. Events help you to determine the provenance of artifacts in their ML workflows by chaining together artifacts and executions.
  • execution
    • An execution is a record of an individual machine learning workflow step, typically annotated with its runtime parameters. Examples of executions include data ingestion, data validation, model training, model evaluation, and model deployment.
  • experiment
    • An experiment is a context that can contain a set of n experiment runs in addition to pipeline runs where a user can investigate, as a group, different configurations such as input artifacts or hyperparameters.
  • experiment run
    • An experiment run can contain user-defined metrics, parameters, executions, artifacts, and Vertex resources (for example, PipelineJob).
  • exploratory data analysis
    • In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. A statistical model can be used or not, but primarily EDA is for seeing what the data can tell us beyond the formal modeling or hypothesis testing task.
  • F1 Score
    • The F1 score is a metric used to evaluate the accuracy of a model's output. It's particularly useful for assessing the performance of models on tasks where both precision and recall are important, such as information extraction. For generative AI models, the F1 score can be used to compare the model's predictions with ground truth data to determine the model's accuracy. However, for generative tasks like summarization and text generation, other metrics like ROUGE-L score might be more appropriate.
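The F1 score is the harmonic mean of precision and recall. A minimal sketch that computes it from raw counts follows. The function name and the example counts are illustrative, and returning 0.0 when there are no true positives is a convention chosen here to avoid division by zero.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall, computed from raw counts."""
    if true_positives == 0:
        return 0.0  # convention used in this sketch
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# 8 correct extractions, 2 spurious ones, 4 missed ones.
print(round(f1_score(8, 2, 4), 4))  # 0.7273
```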
  • feature
    • In machine learning (ML), a feature is a characteristic or attribute of an instance or entity that's used as an input to train an ML model or to make predictions.
  • feature engineering
    • Feature engineering is the process of transforming raw machine learning (ML) data into features that can be used to train ML models or to make predictions.
  • feature group
    • A feature group is a feature registry resource that corresponds to a BigQuery source table or view containing feature data. A feature group might contain features and can be thought of as a logical grouping of feature columns in the data source.
  • feature record
    • A feature record is an aggregation of all feature values that describe the attributes of a unique entity at a specific point in time.
  • feature registry
    • A feature registry is a central interface for recording feature data sources that you want to serve for online predictions. For more information, see Feature Registry setup.
  • feature serving
    • Feature serving is the process of exporting or fetching feature values for training or inference. In Vertex AI, there are two types of feature serving—online serving and offline serving. Online serving retrieves the latest feature values of a subset of the feature data source for online predictions. Offline or batch serving exports high volumes of feature data—including historical data—for offline processing, such as ML model training.
  • feature timestamp
    • A feature timestamp indicates when the set of feature values in a specific feature record for an entity were generated.
  • feature value
    • A feature value corresponds to the actual and measurable value of a feature (attribute) of an instance or entity. A collection of feature values for the unique entity represent the feature record corresponding to the entity.
  • feature view
    • A feature view is a logical collection of features materialized from a BigQuery data source to an online store instance. A feature view stores the customer's feature data and periodically refreshes it from the BigQuery source. A feature view is associated with the feature data storage either directly or through associations to feature registry resources.
  • foundation model (FM)
    • Models trained on broad data such that they can be adapted (for example, fine-tuned) to a wide range of downstream tasks.
  • Foundation Model Operations (FMOps)
    • FMOps expands upon the capabilities of MLOps and focuses on the efficient productionization of pre-trained (trained from scratch) or customized (fine-tuned) FMs.
  • generation
    • In the context of generative AI, "generation" refers to the process of creating new data or content from existing data or information. Generative AI models are trained on large datasets and can learn patterns and relationships within the data. They can then use this knowledge to generate new and unique content that is similar to the training data but not an exact replica. For more information, see https://cloud.google.com/docs/ai-ml/generative-ai/generative-ai-or-traditional-ai.
  • Google Cloud pipeline components SDK
    • The Google Cloud pipeline components (GCPC) SDK provides a set of prebuilt Kubeflow Pipelines components that are production quality, performant, and easy to use. You can use Google Cloud Pipeline Components to define and run ML pipelines in Vertex AI Pipelines and other ML pipeline execution backends conformant with Kubeflow Pipelines. For more information, see Introduction to Google Cloud Pipeline Components.
  • Google Embedded Modem System (GEMS)
    • GEMS is an embedded software framework targeting modems, and an accompanying set of development workflows and infrastructure. The core vision of GEMS is to provide high quality modem system code with high reusability across many Google devices that contain modems. To achieve this broad vision, GEMS provides a comprehensive environment for developers, comprised of several major building blocks.
  • gradient
    • The vector of partial derivatives with respect to all of the independent variables. In machine learning, the gradient is the vector of partial derivatives of the model function. The gradient points in the direction of steepest ascent.
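To make the definition concrete, the sketch below approximates a gradient numerically with central differences. This is an illustration only. ML frameworks typically compute gradients with automatic differentiation instead. The function f is a made-up example.

```python
def numerical_gradient(f, point, eps=1e-6):
    """Approximate the gradient of f at `point` using central differences."""
    gradient = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += eps
        backward[i] -= eps
        gradient.append((f(forward) - f(backward)) / (2 * eps))
    return gradient


def f(v):
    """Example function f(x, y) = x^2 + 3y, whose exact gradient is (2x, 3)."""
    x, y = v
    return x ** 2 + 3 * y


print(numerical_gradient(f, [2.0, 1.0]))  # approximately [4.0, 3.0]
```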
  • graph
    • In the context of Vertex AI, a graph refers to a data structure that represents the relationships between entities and their attributes. It is used to model and analyze complex data, such as knowledge graphs, social networks, and business processes. For more information, see Introduction to Vertex ML Metadata.
  • ground truth (GT)
    • Ground truth is a term used in various fields to refer to the absolute truth of some decision or measurement problem, as opposed to some system's estimate. In machine learning, the term "ground truth" refers to the training set for supervised learning techniques.
  • hallucination
    • A hallucination in generative AI is a confident response by an AI that cannot be grounded by its training data. It may be factually incorrect. In the context of text generation, it's plausible-sounding random falsehoods within its generated text content.
  • heuristic
    • A simple and quickly implemented solution to a problem. For example, "With a heuristic, we achieved 86% accuracy. When we switched to a deep neural network, accuracy went up to 98%".
  • hidden layer
    • A layer in a neural network between the input layer (the features) and the output layer (the prediction). Each hidden layer consists of one or more neurons. A deep neural network contains more than one hidden layer.
  • histogram
    • A graphical display of the variation in a set of data using bars. A histogram visualizes patterns that are difficult to detect in a simple table of numbers.
  • hyperparameter
    • A hyperparameter refers to a variable that governs the training process of a machine learning model. These variables can include learning rates, momentum values in the optimizer, and the number of units in the last hidden layer of a model. Hyperparameter tuning in Vertex AI involves running multiple trials of a training application with different values for the chosen hyperparameters, set within specified limits. The goal is to optimize the hyperparameter settings to maximize the model's predictive accuracy. For more information, see Hyperparameter tuning overview.
  • Imagen
    • Imagen is a text-to-image generative AI service available through the Vertex AI platform. It allows users to generate novel images, edit images, fine-tune style or subject models, caption images, or get answers to questions about image content. For more information, see Imagen on Vertex AI overview.
  • image recognition
    • Image recognition is the process of classifying objects, patterns, or concepts in an image. It is also known as image classification. Image recognition is a subfield of machine learning and computer vision.
  • index
    • A collection of vectors deployed together for similarity search. Vectors can be added to an index or removed from an index. Similarity search queries are issued to a specific index and will search over the vectors in that index.
  • inference
    • In the context of the Vertex AI platform, inference refers to the process of running data points through a machine learning model to calculate an output, such as a single numerical score. This process is also known as "operationalizing a machine learning model" or "putting a machine learning model into production." Inference is an important step in the machine learning workflow, since it enables models to be used to make predictions on new data. In Vertex AI, inference can be performed in various ways, including batch prediction and online prediction. Batch prediction involves running a group of prediction requests and outputting the results in one file, while online prediction allows for real-time predictions on individual data points.
  • information retrieval (IR)
    • Information retrieval (IR) is a key component of Vertex AI Search. It is the process of finding and retrieving relevant information from a large collection of data. In the context of Vertex AI, IR is used to retrieve documents from a corpus based on a user's query. Vertex AI offers a suite of APIs to help you build your own Retrieval Augmented Generation (RAG) applications or to build your own Search engine. For more information, see Use Vertex AI Search as a retrieval backend using RAG Engine.
  • learning rate (step size)
    • Learning rate is a hyperparameter used to tune the optimization process of a machine learning model. It determines the step size at which the model updates its weights during training. A higher learning rate can lead to faster convergence but may result in instability or overfitting. Conversely, a lower learning rate may lead to slower convergence but can help prevent overfitting. For more information, see Overview of hyperparameter tuning.
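The effect of step size can be seen in a toy setting: plain gradient descent on f(x) = x^2, whose gradient is 2x and whose minimum is at x = 0. The function, starting point, and step count below are assumptions chosen for illustration.

```python
def gradient_descent(learning_rate, steps=20, start=5.0):
    """Minimize f(x) = x^2 with a fixed learning rate and return the final x."""
    x = start
    for _ in range(steps):
        gradient = 2 * x               # derivative of x^2
        x -= learning_rate * gradient  # step against the gradient
    return x


for lr in (0.01, 0.1, 1.1):
    print(lr, gradient_descent(lr))
```

With these settings, a rate of 0.1 ends much closer to the minimum than 0.01 after the same number of steps, while 1.1 overshoots on every step and moves further from the minimum, which illustrates the instability mentioned above.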
  • loss (cost)
    • During the training of a supervised model, a measure of how far a model's prediction is from its label. A loss function calculates the loss.
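As one example of a loss function, the sketch below computes mean squared error, a common choice for regression. The definition above does not prescribe a particular loss function; this one and the sample values are chosen only for illustration.

```python
def mean_squared_error(labels, predictions):
    """Average of the squared differences between labels and predictions."""
    if len(labels) != len(predictions) or not labels:
        raise ValueError("labels and predictions must be non-empty and equal in length")
    return sum((y - p) ** 2 for y, p in zip(labels, predictions)) / len(labels)


labels = [3.0, 5.0, 2.0]
predictions = [2.5, 5.0, 4.0]
# Squared errors are 0.25, 0.0, and 4.0, so the loss is 4.25 / 3.
print(mean_squared_error(labels, predictions))
```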
  • Machine Learning Metadata
    • ML Metadata (MLMD) is a library for recording and retrieving metadata associated with ML developer and data scientist workflows. MLMD is an integral part of TensorFlow Extended (TFX), but is designed so that it can be used independently. As part of the broader TFX platform, most users only interact with MLMD when examining the results of pipeline components, for example in notebooks or in TensorBoard.
  • managed dataset
    • A dataset object created in and hosted by Vertex AI.
  • metadata resources
    • Vertex ML Metadata exposes a graph-like data model for representing metadata produced and consumed from ML workflows. The primary concepts are artifacts, executions, events, and contexts.
  • MetadataSchema
    • A MetadataSchema describes the schema for particular types of artifacts, executions, or contexts. MetadataSchemas are used to validate the key-value pairs during creation of the corresponding Metadata resources. Schema validation is only performed on matching fields between the resource and the MetadataSchema. Type schemas are represented using OpenAPI Schema Objects, which should be described using YAML.
  • MetadataStore
    • A MetadataStore is the top-level container for metadata resources. MetadataStore is regionalized and associated with a specific Google Cloud project. Typically, an organization uses one shared MetadataStore for metadata resources within each project.
  • ML pipelines
    • ML pipelines are portable and scalable ML workflows that are based on containers.
  • model
    • Any model pre-trained or not. In general, any mathematical construct that processes input data and returns output. Phrased differently, a model is the set of parameters and structure needed for a system to make predictions.
  • model distillation (knowledge distillation, teacher-student models)
    • Model distillation is a technique that allows a smaller student model to learn from a larger teacher model. The student model is trained to mimic the output of the teacher model, and it can then be used to generate new data or make predictions. Model distillation is often used to make large models more efficient or to make them more accessible to devices with limited resources. It can also be used to improve the generalization of models by reducing overfitting.
  • model resource name
    • The resource name for a model as follows: projects/<PROJECT_ID>/locations/<LOCATION_ID>/models/<MODEL_ID>. You can find the model's ID in the Cloud console on the 'Model Registry' page.
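A small sketch of building and parsing a name in this format with plain string handling. The helper names and the example IDs are hypothetical placeholders, not values or functions from the Vertex AI SDK.

```python
def model_resource_name(project_id, location_id, model_id):
    """Build a model resource name in the documented format."""
    return f"projects/{project_id}/locations/{location_id}/models/{model_id}"


def parse_model_resource_name(name):
    """Split a model resource name back into its three IDs."""
    parts = name.split("/")
    if len(parts) != 6 or parts[0::2] != ["projects", "locations", "models"]:
        raise ValueError(f"not a model resource name: {name!r}")
    return {"project_id": parts[1], "location_id": parts[3], "model_id": parts[5]}


# Placeholder IDs for illustration only.
name = model_resource_name("my-project", "us-central1", "1234567890")
print(name)  # projects/my-project/locations/us-central1/models/1234567890
print(parse_model_resource_name(name)["model_id"])  # 1234567890
```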
  • Network File System (NFS)
    • A client/server system that lets users access files across a network and treat them as if they resided in a local file directory.
  • offline store
    • The offline store is a storage facility storing recent and historical feature data, which is typically used for training ML models. An offline store also contains the latest feature values, which you can serve for online predictions.
  • online store
    • In feature management, an online store is a storage facility for the latest feature values to be served for online predictions.
  • parameter
    • Parameters are keyed input values that configure a run, regulate the behavior of the run, and affect the results of the run. Examples include learning rate, dropout rate, and number of training steps.
  • pipeline
  • pipeline component
    • A self-contained set of code that performs one step in a pipeline's workflow, such as data preprocessing, data transformation, and training a model.
  • pipeline job
    • A pipeline job or a pipeline run corresponds to the PipelineJob resource in the Vertex AI API. It's an execution instance of your ML pipeline definition, which is defined as a set of ML tasks interconnected by input-output dependencies.
  • pipeline run
    • One or more Vertex PipelineJobs can be associated with an experiment where each PipelineJob is represented as a single run. In this context, the parameters of the run are inferred by the parameters of the PipelineJob. The metrics are inferred from the system.Metric artifacts produced by that PipelineJob. The artifacts of the run are inferred from artifacts produced by that PipelineJob.
  • pipeline template
    • An ML workflow definition that a single user or multiple users can reuse to create multiple pipeline runs.
  • positive class
    • "Positive class" refers to the outcome or category that a model is trained to predict. For example, if a model is predicting whether a customer will purchase a jacket, the positive class would be "customer purchases a jacket". Similarly, in a model predicting customer signup for a term deposit, the positive class would be "customer signed up". The opposite is the "negative class".
  • Private Service Connect (PSC)
    • Private Service Connect is a technology that allows Compute Engine customers to map private IPs in their network to either another VPC network or to Google APIs.
  • Private Service Connect interface (PSC-I)
    • Private Service Connect interface provides a way for producers to initiate connections to any network resources in consumer VPC privately.
  • quantization
    • Quantization is a model optimization technique used to reduce the precision of the numbers used to represent a model's parameters. This can lead to smaller models, lower power consumption, and reduced inference latency.
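A minimal sketch of one simple scheme, symmetric 8-bit quantization with a single scale factor for all weights. This is an illustrative assumption. Production toolchains use a variety of schemes (per-channel scales, zero points, calibration), and the weights below are invented.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.003, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
print(quantized)
print(max(abs(w - r) for w, r in zip(weights, restored)))  # rounding error, at most scale / 2
```

Each weight is stored in 8 bits instead of 32, at the cost of a rounding error bounded by half the scale.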
  • Random Forest
    • Random Forest is a machine learning algorithm used for both classification and regression. It's not a generative AI model itself, but it can be used as a component within a larger generative AI system. A random forest consists of multiple decision trees, and its prediction is an aggregation of the predictions from these individual trees. For example, in a classification task, each tree "votes" for a class, and the final prediction is the class with the most votes. For more information, see Decision forest.
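The aggregation step described above can be sketched as follows. The per-tree outputs are made-up values, the function names are hypothetical, and averaging is assumed as the aggregation for regression (a common choice, though not stated in this glossary).

```python
from collections import Counter
from statistics import mean


def aggregate_classification(tree_votes):
    """Return the class predicted by the most trees (ties go to the first class seen)."""
    if not tree_votes:
        raise ValueError("at least one tree prediction is required")
    return Counter(tree_votes).most_common(1)[0][0]


def aggregate_regression(tree_predictions):
    """Return the average of the per-tree numeric predictions."""
    return mean(tree_predictions)


label = aggregate_classification(["spam", "ham", "spam"])
value = aggregate_regression([1.0, 2.0, 3.0])
```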
  • Ray cluster on Vertex AI
    • Ray clusters on Vertex AI are built to ensure capacity availability for critical ML workloads or during peak seasons. Unlike custom jobs, where the training service releases the resource after job completion, Ray clusters remain available until deleted. For more information, see Ray on Vertex AI overview.
  • Ray on Vertex AI (RoV)
    • Ray on Vertex AI is designed so you can use the same open source Ray code to write programs and develop applications on Vertex AI with minimal changes. For more information, see Ray on Vertex AI overview.
  • Ray on Vertex AI SDK for Python
    • Ray on Vertex AI SDK for Python is a version of the Vertex AI SDK for Python that includes the functionality of the Ray Client, Ray BigQuery connector, Ray cluster management on Vertex AI, and predictions on Vertex AI. For more information, see Introduction to the Vertex AI SDK for Python.
  • recall
    • The percentage of true nearest neighbors returned by the index. For example, if a nearest neighbor query for 20 nearest neighbors returned 19 of the "ground truth" nearest neighbors, the recall is 19/20 × 100 = 95%.
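The recall calculation above can be written as a small function. This is a minimal sketch: the function name and the integer IDs are illustrative assumptions, not part of the Vector Search API.

```python
def recall_percent(returned_ids, ground_truth_ids):
    """Percentage of ground-truth nearest neighbors present in the returned results."""
    truth = set(ground_truth_ids)
    if not truth:
        raise ValueError("ground truth must not be empty")
    hits = len(truth & set(returned_ids))
    return 100.0 * hits / len(truth)


ground_truth = list(range(20))       # the 20 true nearest neighbors
returned = list(range(19)) + [999]   # 19 correct results plus one miss
result = recall_percent(returned, ground_truth)
```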
  • regularization
    • Regularization is a technique used to prevent overfitting in machine learning models. Overfitting occurs when a model learns the training data too well, resulting in poor performance on unseen data. One specific type of regularization is early stopping, where training is halted before the loss on a validation dataset begins to increase, which would indicate a decline in generalization performance. For more information, see Overfitting: L2 regularization.
  • restricts
    • Functionality to "restrict" searches to a subset of the index by using Boolean rules. Restrict is also referred to as "filtering". With Vector Search, you can use numeric filtering and text attribute filtering.
  • service account
    • In Google Cloud, a service account is a special kind of account used by an application or a virtual machine (VM) instance, not a person. Applications use service accounts to make authorized API calls.
  • summary metrics
    • Summary metrics are a single value for each metric key in an experiment run. For example, the test accuracy of an experiment is the accuracy calculated against a test dataset at the end of training that can be captured as a single value summary metric.
  • TensorBoard
    • TensorBoard is a suite of web applications for visualizing and understanding TensorFlow runs and models. For more information, see TensorBoard.
  • TensorBoard instance
    • A TensorBoard instance is a regionalized resource that stores Vertex AI TensorBoard Experiments associated with a project. You can create multiple TensorBoard instances in a project if, for example, you want multiple CMEK-enabled instances. This is the same as the TensorBoard resource in the API.
  • TensorBoard Resource name
    • A TensorBoard Resource name is used to fully identify a Vertex AI TensorBoard instance. The format is as follows: projects/PROJECT_ID_OR_NUMBER/locations/REGION/tensorboards/TENSORBOARD_INSTANCE_ID.
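The resource name format above can be assembled and taken apart with plain string handling. The helper names and the example project, region, and instance ID below are hypothetical, and the pattern assumes that no component contains a slash.

```python
import re

# Mirrors the format:
# projects/PROJECT_ID_OR_NUMBER/locations/REGION/tensorboards/TENSORBOARD_INSTANCE_ID
_NAME_PATTERN = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<region>[^/]+)"
    r"/tensorboards/(?P<tensorboard_id>[^/]+)$"
)


def build_tensorboard_name(project, region, tensorboard_id):
    """Assemble a fully qualified TensorBoard resource name."""
    return f"projects/{project}/locations/{region}/tensorboards/{tensorboard_id}"


def parse_tensorboard_name(name):
    """Split a resource name into its components, or raise ValueError if malformed."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a TensorBoard resource name: {name!r}")
    return match.groupdict()


name = build_tensorboard_name("my-project", "us-central1", "1234567890")
parts = parse_tensorboard_name(name)
```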
  • TensorFlow Extended (TFX)
    • TensorFlow Extended (TFX) is an end-to-end platform for deploying production machine learning pipelines based on the TensorFlow platform.
  • time offset
    • Time offset is relative to the beginning of a video.
  • time segment
    • A time segment is identified by beginning and ending time offsets.
  • time series metrics
    • Time series metrics are longitudinal metric values where each value represents a step in the training routine portion of a run. Time series metrics are stored in Vertex AI TensorBoard. Vertex AI Experiments stores a reference to the Vertex TensorBoard resource.
  • token
    • A token in a language model is the atomic unit that the model trains on and makes predictions on, such as a word, a morpheme, or a character. In domains outside of language models, tokens can represent other kinds of atomic units. For example, in computer vision, a token might be a subset of an image. For more information, see List and count tokens.
  • training set
    • In Vertex AI, the training set is the largest portion of your data (typically 80%) used to train a machine learning model. The model learns the patterns and relationships within this data to make predictions. The training set is distinct from the validation and test sets, which are used to evaluate the model's performance during and after training.
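A data split along these lines might look like the sketch below. The 80% training fraction follows the definition above; dividing the remainder equally between validation and test (10% each), the fixed random seed, and the function name are assumptions made for illustration.

```python
import random


def split_dataset(items, train_frac=0.8, validation_frac=0.1, seed=0):
    """Shuffle items and split them into training, validation, and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = int(len(shuffled) * train_frac)
    n_validation = int(len(shuffled) * validation_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    test = shuffled[n_train + n_validation:]  # whatever remains
    return train, validation, test


train, validation, test = split_dataset(range(100))
```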
  • trajectory
    • A "trajectory" refers to a sequence of steps or actions taken by an agent or model. It's often used in the evaluation of generative models, where the model's ability to generate text, code, or other content is assessed. There are several types of trajectory metrics that can be used to evaluate generative models, including trajectory exact match, trajectory in-order match, trajectory any order match, and trajectory precision. These metrics measure the similarity between the model's output and a set of human-generated reference outputs.
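To make the four metric names concrete, the sketch below treats a trajectory as a list of step names and compares a predicted trajectory against a single reference. The definitions used here are simplified assumptions (for example, extra predicted steps are tolerated by the in-order and any-order checks) and may differ in detail from the metrics computed by Vertex AI.

```python
from collections import Counter


def exact_match(predicted, reference):
    """True if both trajectories contain the same steps in the same order."""
    return list(predicted) == list(reference)


def in_order_match(predicted, reference):
    """True if the reference steps appear in the prediction in order (extras allowed)."""
    remaining = iter(predicted)
    # `step in remaining` consumes the iterator, which enforces ordering.
    return all(step in remaining for step in reference)


def any_order_match(predicted, reference):
    """True if every reference step appears in the prediction, in any order."""
    return not (Counter(reference) - Counter(predicted))


def precision(predicted, reference):
    """Fraction of predicted steps that also occur in the reference."""
    if not predicted:
        return 0.0
    allowed = set(reference)
    return sum(1 for step in predicted if step in allowed) / len(predicted)


reference = ["search", "fetch", "summarize"]
predicted = ["search", "translate", "fetch", "summarize"]
scores = (
    exact_match(predicted, reference),
    in_order_match(predicted, reference),
    any_order_match(predicted, reference),
    precision(predicted, reference),
)
```

In this example the extra `translate` step breaks the exact match but not the in-order or any-order match, and three of the four predicted steps count toward precision.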
  • Transformer
    • A "Transformer" is a neural network architecture that underlies most state-of-the-art generative models. It's used in various language model applications, including translation. Transformers consist of an encoder and a decoder; the encoder converts input text into an intermediate representation, and the decoder converts this into useful output. They utilize a self-attention mechanism to gather context from words surrounding the word being processed. While training a Transformer requires significant resources, fine-tuning a pre-trained Transformer for specific applications is more efficient.
  • true positive
    • A "true positive" refers to a prediction where the model correctly identifies a positive class. For example, if a model is trained to identify customers who will purchase a jacket, a true positive would be correctly predicting that a customer will make such a purchase.
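Counting true positives amounts to comparing each prediction with the actual outcome. In this sketch the labels `"purchase"` and `"no purchase"` and the function name are made up for illustration, following the jacket example above.

```python
def count_true_positives(actual, predicted, positive_label):
    """Count examples where both the actual and predicted labels are the positive class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(
        1
        for a, p in zip(actual, predicted)
        if a == positive_label and p == positive_label
    )


actual = ["purchase", "no purchase", "purchase", "no purchase"]
predicted = ["purchase", "purchase", "no purchase", "no purchase"]
true_positives = count_true_positives(actual, predicted, "purchase")
```

Only the first customer was both predicted to purchase and actually purchased, so that is the single true positive in this data.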
  • unmanaged artifacts
    • An artifact that exists outside of the Vertex AI context.
  • vector
    • A vector refers to a numerical representation of text, images, or videos that captures relationships between inputs. Machine learning models are suited for creating embeddings by identifying patterns within large datasets. Applications can use embeddings to process and produce language, recognizing complex meanings and semantic relationships specific to the content. For more information, see Embeddings APIs overview.
  • Vertex AI data type
    • Vertex AI data types are "image", "text", "tabular", and "video".
  • Vertex AI Experiments
    • Vertex AI Experiments lets users track the following: 1. Steps of an experiment run (for example, preprocessing and training). 2. Inputs (for example, algorithm, parameters, and datasets). 3. Outputs of those steps (for example, models, checkpoints, and metrics).
  • Vertex AI SDK for Python
    • Vertex AI SDK for Python provides functionality similar to that of the Vertex AI Python client library, except that the SDK is higher-level and less granular.
  • Vertex AI TensorBoard Experiment
    • The data associated with an Experiment (scalars, histograms, distributions, and so on) can be viewed in the TensorBoard web application. Time series scalars can be viewed in the Google Cloud Console. For more information, see Compare and analyze runs.
  • video segment
    • A video segment is identified by the beginning and ending time offsets of a video.
  • virtual private cloud (VPC)
    • Virtual private cloud is an on-demand, configurable pool of shared computing resources that's allocated in a public cloud environment and provides a level of isolation between different organizations using those resources.