AI Platform Data Labeling Service

AI Platform Data Labeling Service lets you work with human labelers to generate highly accurate labels for a collection of data that you can use in machine learning models.

To train a machine learning model, you provide representative data samples that you want to classify or analyze, along with the machine learning algorithm to handle each sample. For example, to train a model that identifies flowers in images, the objects in the training dataset, such as sunflowers, roses, and tulips, must be labeled. To train a model that identifies the names of diseases in medical documents, the disease-related words in the document dataset must be highlighted. Labeling your training data is the first step in the machine learning development cycle.
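
To make the two examples above concrete, the following Python sketch shows one way labeled samples might be represented: an image paired with a class label, and a text span marked as a disease mention. The record layout, the field names, the bucket path, and the `span_for` helper are all hypothetical and chosen for illustration; they are not a format defined by AI Platform Data Labeling Service.

```python
# Hypothetical record shapes for illustration only; not the service's format.

# Image classification: each image is paired with one label.
image_example = {
    "image_uri": "gs://example-bucket/flowers/img_001.jpg",  # made-up path
    "label": "sunflower",
}


def span_for(text, phrase, label):
    """Return the character span of the first occurrence of phrase in text."""
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"{phrase!r} not found in text")
    return {"start": start, "end": start + len(phrase), "label": label}


# Text entity labeling: disease-related words are marked as spans.
document = "The patient was diagnosed with type 2 diabetes and asthma."
entity_examples = [
    span_for(document, "type 2 diabetes", "DISEASE"),
    span_for(document, "asthma", "DISEASE"),
]

for span in entity_examples:
    print(document[span["start"]:span["end"]], span["label"])
```

Storing spans as character offsets rather than copied strings keeps each label tied to an exact position in the document, which matters when the same word appears more than once.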

To start data labeling in AI Platform Data Labeling Service, create three resources for the human labelers:

  • A dataset containing the representative data samples to label
  • A label set listing all possible labels in the dataset
  • A set of instructions guiding human labelers through labeling tasks

Once you've created these resources, you submit them as part of a labeling request. The human labelers then annotate the items in the dataset according to your instructions. After the labelers finish, you can export the labeled dataset and use it in machine learning development.
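
The workflow described above (three resources, one labeling request, then export) can be sketched as a small in-memory model. This is a conceptual illustration only: the class names `Dataset`, `LabelSet`, `Instructions`, and `LabelingRequest`, along with their methods, are hypothetical and do not correspond to the service's client library or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


# All classes below are hypothetical stand-ins, not the service's API.
@dataclass
class Dataset:
    name: str
    items: List[str]  # the representative data samples to label


@dataclass
class LabelSet:
    name: str
    labels: List[str]  # every label a labeler is allowed to assign


@dataclass
class Instructions:
    name: str
    text: str  # guidance shown to the human labelers


@dataclass
class LabelingRequest:
    dataset: Dataset
    label_set: LabelSet
    instructions: Instructions
    annotations: Dict[str, str] = field(default_factory=dict)

    def annotate(self, item: str, label: str) -> None:
        """Record one labeler decision, rejecting unknown items or labels."""
        if item not in self.dataset.items:
            raise KeyError(f"unknown item: {item}")
        if label not in self.label_set.labels:
            raise ValueError(f"label not in label set: {label}")
        self.annotations[item] = label

    def is_complete(self) -> bool:
        return all(item in self.annotations for item in self.dataset.items)

    def export(self) -> List[Tuple[str, str]]:
        """Return (item, label) pairs once every item has been labeled."""
        if not self.is_complete():
            raise RuntimeError("labeling is not finished")
        return [(item, self.annotations[item]) for item in self.dataset.items]


request = LabelingRequest(
    dataset=Dataset("flowers", ["img_001.jpg", "img_002.jpg"]),
    label_set=LabelSet("flower-types", ["sunflower", "rose", "tulip"]),
    instructions=Instructions("flower-guide", "Pick the dominant flower."),
)
request.annotate("img_001.jpg", "sunflower")
request.annotate("img_002.jpg", "tulip")
labeled = request.export()
print(labeled)
```

Checking each annotation against the label set mirrors the role the label set plays in the real workflow: it bounds what labelers may assign, so the exported dataset contains only labels you defined up front.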