
REST Resource: projects.locations.dataLabelingJobs

Resource: DataLabelingJob

DataLabelingJob is used to trigger a human labeling job on unlabeled data from the Dataset specified in `datasets`.

JSON representation

{
  "name": string,
  "displayName": string,
  "datasets": [
    string
  ],
  "annotationLabels": {
    string: string,
    ...
  },
  "labelerCount": integer,
  "instructionUri": string,
  "inputsSchemaUri": string,
  "inputs": value,
  "state": enum (JobState),
  "labelingProgress": integer,
  "currentSpend": {
    object (Money)
  },
  "createTime": string,
  "updateTime": string,
  "error": {
    object (Status)
  },
  "labels": {
    string: string,
    ...
  },
  "specialistPools": [
    string
  ],
  "encryptionSpec": {
    object (EncryptionSpec)
  },
  "activeLearningConfig": {
    object (ActiveLearningConfig)
  }
}

Fields
`name`	`string` Output only. Resource name of the DataLabelingJob.
`displayName`	`string` Required. The user-defined display name of the DataLabelingJob. The name can be up to 128 characters long and can consist of any UTF-8 characters.
`datasets[]`	`string` Required. Dataset resource names. Currently, labeling is supported from a single Dataset only. Format: `projects/{project}/locations/{location}/datasets/{dataset}`
`annotationLabels`	`map (key: string, value: string)` Labels to assign to annotations generated by this DataLabelingJob. Label keys and values can be no longer than 64 characters (Unicode codepoints) and can contain only lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. See https://goo.gl/xmQnxf for more information and examples of labels. System-reserved label keys are prefixed with "aiplatform.googleapis.com/" and are immutable.
`labelerCount`	`integer` Required. Number of labelers to work on each DataItem.
`instructionUri`	`string` Required. The Google Cloud Storage location of the instruction PDF. This PDF is shared with labelers and provides a detailed description of how to label DataItems in Datasets.
`inputsSchemaUri`	`string` Required. Points to a YAML file stored on Google Cloud Storage describing the config for a specific type of DataLabelingJob. The schema files that can be used here are found in the https://storage.googleapis.com/google-cloud-aiplatform bucket in the /schema/datalabelingjob/inputs/ folder.
`inputs`	`value (Value format)` Required. Input config parameters for the DataLabelingJob.
`state`	`enum (JobState)` Output only. The detailed state of the job.
`labelingProgress`	`integer` Output only. Current labeling job progress as a percentage in the interval [0, 100], indicating the percentage of DataItems that have been finished.
`currentSpend`	`object (Money)` Output only. Estimated cost (in US dollars) that the DataLabelingJob has incurred to date.
`createTime`	`string (Timestamp format)` Output only. Timestamp when this DataLabelingJob was created. A timestamp in RFC 3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: `"2014-10-02T15:01:23Z"` and `"2014-10-02T15:01:23.045123456Z"`.
`updateTime`	`string (Timestamp format)` Output only. Timestamp when this DataLabelingJob was most recently updated. A timestamp in RFC 3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: `"2014-10-02T15:01:23Z"` and `"2014-10-02T15:01:23.045123456Z"`.
`error`	`object (Status)` Output only. DataLabelingJob errors. It is only populated when the job's state is `JOB_STATE_FAILED` or `JOB_STATE_CANCELLED`.
`labels`	`map (key: string, value: string)` The labels with user-defined metadata to organize your DataLabelingJobs. Label keys and values can be no longer than 64 characters (Unicode codepoints) and can contain only lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. See https://goo.gl/xmQnxf for more information and examples of labels. System-reserved label keys are prefixed with "aiplatform.googleapis.com/" and are immutable. The following system labels exist for each DataLabelingJob: "aiplatform.googleapis.com/schema": output only, its value is the `inputs_schema`'s title.
`specialistPools[]`	`string` The SpecialistPools' resource names associated with this job.
`encryptionSpec`	`object (EncryptionSpec)` Customer-managed encryption key spec for a DataLabelingJob. If set, this DataLabelingJob will be secured by this key. Note: Annotations created in the DataLabelingJob are associated with the EncryptionSpec of the Dataset they are exported to.
`activeLearningConfig`	`object (ActiveLearningConfig)` Parameters that configure the active learning pipeline. Active learning labels the data incrementally over several iterations. In every iteration, it selects a batch of data based on the sampling strategy.
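The label constraints described for `annotationLabels` and `labels` can be checked on the client before a request is sent. The following is a minimal sketch, not the service's actual validation: the helper names `validate_labels` and `_is_allowed_char` are hypothetical, and the interpretation of "international characters are allowed" (any lowercase or caseless Unicode letter) is an assumption.

```python
SYSTEM_PREFIX = "aiplatform.googleapis.com/"
MAX_LABEL_LENGTH = 64  # counted in Unicode codepoints, which is what len() returns


def _is_allowed_char(ch: str) -> bool:
    """Allow lowercase or caseless letters, digits, underscores, and dashes."""
    if ch in "_-" or ch.isdigit():
        return True
    # ch.lower() == ch rejects uppercase and titlecase letters while letting
    # caseless international letters (for example CJK) through. Assumption.
    return ch.isalpha() and ch.lower() == ch


def validate_labels(labels: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    for key, value in labels.items():
        if key.startswith(SYSTEM_PREFIX):
            # System-reserved keys are immutable, so a client should not set them.
            problems.append(f"{key!r}: system-reserved keys cannot be set")
            continue
        for kind, text in (("key", key), ("value", value)):
            if len(text) > MAX_LABEL_LENGTH:
                problems.append(f"{kind} {text!r}: longer than {MAX_LABEL_LENGTH} characters")
            if not all(_is_allowed_char(ch) for ch in text):
                problems.append(f"{kind} {text!r}: contains a disallowed character")
    return problems
```

Calling `validate_labels({"team": "vision"})` returns an empty list, while a key such as `"Team"` is reported because it contains an uppercase letter.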

ActiveLearningConfig

Parameters that configure the active learning pipeline. Active learning labels the data incrementally over several iterations. In every iteration, it selects a batch of data based on the sampling strategy.

JSON representation

{
  "sampleConfig": {
    object (SampleConfig)
  },
  "trainingConfig": {
    object (TrainingConfig)
  },

  // Union field human_labeling_budget can be only one of the following:
  "maxDataItemCount": string,
  "maxDataItemPercentage": integer
  // End of list of possible types for union field human_labeling_budget.
}

Fields
`sampleConfig`	`object (SampleConfig)` Active learning data sampling config. In every active learning labeling iteration, the system selects a batch of data based on the sampling strategy.
`trainingConfig`	`object (TrainingConfig)` CMLE training config. In every active learning labeling iteration, the system trains a machine learning model on CMLE. The trained model is used by the data sampling algorithm to select DataItems.
Union field `human_labeling_budget`. Required. The maximum number of DataItems to be labeled by humans. The remaining DataItems are labeled by machine. `human_labeling_budget` can be only one of the following:
`maxDataItemCount`	`string (int64 format)` Maximum number of human-labeled DataItems.
`maxDataItemPercentage`	`integer` Maximum percentage of total DataItems for human labeling.
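Because `human_labeling_budget` is a required union field, exactly one of `maxDataItemCount` and `maxDataItemPercentage` should appear in the JSON. The sketch below shows one way to enforce that on the client; the function name `build_active_learning_config` and its parameter names are hypothetical. Note that `maxDataItemCount` is an int64 and is therefore written as a JSON string.

```python
from typing import Optional


def build_active_learning_config(
    max_data_item_count: Optional[int] = None,
    max_data_item_percentage: Optional[int] = None,
    sample_config: Optional[dict] = None,
    training_config: Optional[dict] = None,
) -> dict:
    """Build an ActiveLearningConfig dict with exactly one budget field set."""
    budgets_set = sum(
        value is not None
        for value in (max_data_item_count, max_data_item_percentage)
    )
    if budgets_set != 1:
        raise ValueError(
            "Set exactly one of max_data_item_count or max_data_item_percentage"
        )

    config: dict = {}
    if sample_config is not None:
        config["sampleConfig"] = sample_config
    if training_config is not None:
        config["trainingConfig"] = training_config

    if max_data_item_count is not None:
        # int64 values are represented as strings in the JSON form.
        config["maxDataItemCount"] = str(max_data_item_count)
    else:
        config["maxDataItemPercentage"] = max_data_item_percentage
    return config
```

For example, `build_active_learning_config(max_data_item_count=500)` produces `{"maxDataItemCount": "500"}`, and passing both budget arguments raises `ValueError`.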

SampleConfig

Active learning data sampling config. In every active learning labeling iteration, the system selects a batch of data based on the sampling strategy.

JSON representation

{
  "sampleStrategy": enum (SampleStrategy),

  // Union field initial_batch_sample_size can be only one of the following:
  "initialBatchSamplePercentage": integer
  // End of list of possible types for union field initial_batch_sample_size.

  // Union field following_batch_sample_size can be only one of the following:
  "followingBatchSamplePercentage": integer
  // End of list of possible types for union field following_batch_sample_size.
}

Fields
`sampleStrategy`	`enum (SampleStrategy)` Selects the sampling strategy, which decides which data is selected for human labeling in every batch.
Union field `initial_batch_sample_size`. Decides the sample size for the initial batch. initial_batch_sample_percentage is used by default. `initial_batch_sample_size` can be only one of the following:
`initialBatchSamplePercentage`	`integer` The percentage of data to be labeled in the first batch.
Union field `following_batch_sample_size`. Decides the sample size for the following batches. following_batch_sample_percentage is used by default. `following_batch_sample_size` can be only one of the following:
`followingBatchSamplePercentage`	`integer` The percentage of data to be labeled in each batch after the first.

SampleStrategy

Sample strategy decides which subset of DataItems should be selected for human labeling in every batch.

Enums
`SAMPLE_STRATEGY_UNSPECIFIED`	Default value. Treated as UNCERTAINTY.
`UNCERTAINTY`	Sample the most uncertain data to label.
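The enum values above and the SampleConfig fields can be modeled on the client as follows. This is an illustrative sketch only: the class and function names are hypothetical, and `effective_strategy` simply mirrors the documented rule that the unspecified value is treated as `UNCERTAINTY`.

```python
from enum import Enum
from typing import Optional


class SampleStrategy(str, Enum):
    SAMPLE_STRATEGY_UNSPECIFIED = "SAMPLE_STRATEGY_UNSPECIFIED"
    UNCERTAINTY = "UNCERTAINTY"


def effective_strategy(strategy: SampleStrategy) -> SampleStrategy:
    """Resolve the unspecified default to the strategy it is treated as."""
    if strategy is SampleStrategy.SAMPLE_STRATEGY_UNSPECIFIED:
        return SampleStrategy.UNCERTAINTY
    return strategy


def build_sample_config(
    strategy: SampleStrategy = SampleStrategy.SAMPLE_STRATEGY_UNSPECIFIED,
    initial_batch_sample_percentage: Optional[int] = None,
    following_batch_sample_percentage: Optional[int] = None,
) -> dict:
    """Build a SampleConfig dict, omitting batch sizes that were not provided."""
    config: dict = {"sampleStrategy": strategy.value}
    if initial_batch_sample_percentage is not None:
        config["initialBatchSamplePercentage"] = initial_batch_sample_percentage
    if following_batch_sample_percentage is not None:
        config["followingBatchSamplePercentage"] = following_batch_sample_percentage
    return config
```

A call such as `build_sample_config(SampleStrategy.UNCERTAINTY, 10, 5)` yields a dict with all three fields, ready to be nested under `sampleConfig`.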

TrainingConfig

CMLE training config. In every active learning labeling iteration, the system trains a machine learning model on CMLE. The trained model is used by the data sampling algorithm to select DataItems.

JSON representation
{
  "timeoutTrainingMilliHours": string
}

Fields
`timeoutTrainingMilliHours`	`string (int64 format)` The timeout for the CMLE training job, expressed in milli-hours (thousandths of an hour). For example, a value of 1,000 in this field means 1 hour.
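Since milli-hours are an unusual unit, a small conversion helper can reduce mistakes. The sketch below is illustrative; the function names are hypothetical. The value is emitted as a string because the field uses the int64 format.

```python
from datetime import timedelta


def hours_to_milli_hours(hours: float) -> str:
    """Convert hours to the milli-hour string expected by the field."""
    return str(round(hours * 1000))


def milli_hours_to_timedelta(value: str) -> timedelta:
    """Convert a milli-hour string back into a timedelta."""
    return timedelta(hours=int(value) / 1000)
```

For instance, `hours_to_milli_hours(1)` gives `"1000"`, and `milli_hours_to_timedelta("500")` is a 30-minute timedelta.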

Methods
`cancel`	Cancels a DataLabelingJob.
`create`	Creates a DataLabelingJob.
`delete`	Deletes a DataLabelingJob.
`get`	Gets a DataLabelingJob.
`list`	Lists DataLabelingJobs in a Location.
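As an illustration of what a `create` call needs, the sketch below assembles a DataLabelingJob body from the fields marked Required and applies the documented constraints on `displayName` and `datasets[]`. It does not send any request. The function name, the project, bucket, and dataset identifiers, the `<schema-file>` placeholder, and the contents of `inputs` are all hypothetical; the real shape of `inputs` is defined by the schema file referenced in `inputsSchemaUri`.

```python
import json
import re

# Format taken from the `datasets[]` field description.
DATASET_NAME = re.compile(r"projects/[^/]+/locations/[^/]+/datasets/[^/]+")


def build_data_labeling_job(
    display_name: str,
    dataset: str,
    labeler_count: int,
    instruction_uri: str,
    inputs_schema_uri: str,
    inputs: dict,
) -> dict:
    """Assemble the required fields of a DataLabelingJob as a JSON-ready dict."""
    if not display_name or len(display_name) > 128:
        raise ValueError("displayName must be 1 to 128 characters long")
    if not DATASET_NAME.fullmatch(dataset):
        raise ValueError(
            "dataset must look like "
            "projects/{project}/locations/{location}/datasets/{dataset}"
        )
    return {
        "displayName": display_name,
        "datasets": [dataset],  # only a single Dataset is supported
        "labelerCount": labeler_count,
        "instructionUri": instruction_uri,
        "inputsSchemaUri": inputs_schema_uri,
        "inputs": inputs,
    }


# Hypothetical values for illustration only.
job = build_data_labeling_job(
    display_name="example-labeling-job",
    dataset="projects/my-project/locations/us-central1/datasets/1234567890",
    labeler_count=1,
    instruction_uri="gs://my-bucket/instructions.pdf",
    inputs_schema_uri="gs://google-cloud-aiplatform/schema/datalabelingjob/inputs/<schema-file>.yaml",
    inputs={"example_parameter": "schema-dependent"},
)
print(json.dumps(job, indent=2))
```

Output-only fields such as `name`, `state`, and `createTime` are deliberately left out, since the service populates them.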
