You use the AI Platform Data Labeling Service to have human labelers label a collection of data that you plan to use to train a custom machine learning model.
To train a machine learning model, you provide representative samples of the type of content you want to classify or analyze, along with the "right answer" for how you want the model to handle each sample. For example, to train a model that classifies images of flowers, you provide a sample collection of images labeled with the type of flower (sunflower, daisy, rose, tulip); to train a model that identifies the names of diseases in medical documents, you provide sample documents with the diseases highlighted. The model learns to extrapolate from the samples.
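As a rough illustration of what "samples with the right answer" can look like in practice, the following Python sketch pairs each flower image with its label. The `LabeledSample` class, the `unknown_labels` helper, and the bucket URIs are hypothetical names invented for this example; they are not part of the Data Labeling Service.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class LabeledSample:
    """One training sample paired with its "right answer" (hypothetical type)."""
    content_uri: str  # placeholder location of the image
    label: str        # the answer the model should learn to produce


# Label names taken from the flower example in the text.
FLOWER_LABELS: Set[str] = {"sunflower", "daisy", "rose", "tulip"}

# Hypothetical sample collection; the URIs are placeholders, not real files.
samples: List[LabeledSample] = [
    LabeledSample("gs://example-bucket/flowers/img_001.jpg", "daisy"),
    LabeledSample("gs://example-bucket/flowers/img_002.jpg", "rose"),
    LabeledSample("gs://example-bucket/flowers/img_003.jpg", "tulip"),
]


def unknown_labels(items: List[LabeledSample], allowed: Set[str]) -> List[LabeledSample]:
    """Return the samples whose label is not in the allowed label set."""
    return [s for s in items if s.label not in allowed]
```

Checking labels against a fixed set before training is one simple way to catch typos or stray categories in a labeled collection.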
The Data Labeling Service enables you to submit the representative samples to human labelers who annotate them with the "right answers" and return the dataset in a format suitable for training a machine learning model. The type of sample data you provide and the type of annotations the human labelers add depend on the type of machine learning model you plan to train.
To request data labeling, you create three resources for the human labelers:
- A dataset containing the representative samples for the labelers to label
- An annotation specification set identifying the labels for the labelers to apply to the items in the dataset
- A set of instructions for the labelers about how to apply the labels to your data
Once you've created these resources, you submit them as part of a labeling request. The human labelers annotate the items in the dataset according to your instructions, and return an annotated dataset that you can export and use as training data for a custom machine learning model.
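The relationship between the three resources and the labeling request can be sketched as plain Python data classes. This is a conceptual model only, not the service's client library: every class, field, and URI below is a hypothetical name chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Dataset:
    """The representative samples to be labeled (hypothetical model)."""
    display_name: str
    item_uris: List[str]


@dataclass
class AnnotationSpecSet:
    """The labels that the labelers may apply (hypothetical model)."""
    display_name: str
    labels: List[str]


@dataclass
class Instruction:
    """Guidance for the labelers on how to apply the labels (hypothetical model)."""
    display_name: str
    document_uri: str  # placeholder location of the instruction document


@dataclass
class LabelingRequest:
    """Bundles the three resources that a labeling request refers to."""
    dataset: Dataset
    annotation_spec_set: AnnotationSpecSet
    instruction: Instruction

    def problems(self) -> List[str]:
        """List the reasons this request is incomplete; empty means ready."""
        found = []
        if not self.dataset.item_uris:
            found.append("dataset has no items")
        if not self.annotation_spec_set.labels:
            found.append("annotation specification set has no labels")
        if not self.instruction.document_uri:
            found.append("instruction has no document")
        return found


# Assemble a request from placeholder values.
request = LabelingRequest(
    dataset=Dataset("flowers", ["gs://example-bucket/flowers/img_001.jpg"]),
    annotation_spec_set=AnnotationSpecSet(
        "flower-types", ["sunflower", "daisy", "rose", "tulip"]
    ),
    instruction=Instruction(
        "flower-guide", "gs://example-bucket/instructions/flowers.pdf"
    ),
)
```

In this sketch, `request.problems()` returns an empty list when all three resources are populated, which mirrors the requirement that the resources exist before you submit the request.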