A label set is the set of labels you want the human labelers to use to label your images. For example, if you want to classify images based on whether they contain a dog or a cat, you create a label set with two labels: "Dog" and "Cat". (Actually, as noted below, you might also want labels for "Neither" and "Both".) Your label set can include up to 100 labels.
A project can have multiple label sets, each used for a different Data Labeling Service request. You can get a list of the available label sets and delete sets you no longer need; see the annotation specification set resource page for more information.
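If you prefer to manage label sets programmatically, the following is a minimal sketch of listing and deleting them with the google-cloud-datalabeling Python client library. The package import, method names, and the placeholder project ID are assumptions based on the v1beta1 DataLabelingService API and may vary between client library versions.

    # Minimal sketch, assuming the google-cloud-datalabeling package is installed
    # (pip install google-cloud-datalabeling).
    from google.cloud import datalabeling_v1beta1 as datalabeling

    client = datalabeling.DataLabelingServiceClient()
    parent = "projects/YOUR_PROJECT_ID"  # hypothetical project ID

    # List the label sets (annotation spec sets) in the project.
    for spec_set in client.list_annotation_spec_sets(request={"parent": parent}):
        print(spec_set.name, spec_set.display_name)

    # Delete a label set you no longer need, using the "name" from the listing above.
    # client.delete_annotation_spec_set(request={"name": spec_set.name})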
Design a good label set
Here are some guidelines for creating a high-quality label set.
- Make each label's display name a meaningful word, such as "dog", "cat", or "building". Do not use abstract names like "label1" and "label2" or unfamiliar acronyms. The more meaningful the label names, the easier it is for human labelers to apply them accurately and consistently.
- Make sure the labels are easily distinguishable from one another. For classification tasks where a single label is applied to each data item, try not to use labels whose meanings overlap.
- For classification tasks, it is usually a good idea to include a label named "other" or "none", to use for data that don't match the other labels. If the only available labels are "dog" and "cat", for example, labelers will have to label every image with one of those labels. Your custom model is typically more robust if you include images other than dogs or cats in its training data.
- Keep in mind that labelers are most efficient and accurate when you have at most 20 labels defined in the label set.
Create a label set resource
Web UI
1. Open the Data Labeling Service UI.
   The Label sets page shows the status of previously created label sets for the current project.
   To add a label set for a different project, select the project from the drop-down list in the upper right of the title bar.
2. Click the Create button in the title bar.
3. On the Create a label set page, enter a name and description for the set.
4. In the Labels section, enter names and descriptions for each label you want the human labelers to apply.
   After entering the name and description for a label, click Add label to add a row for an additional label. You can add up to 100 labels.
5. Click Create to create the annotation specification set.
   You're returned to the Label sets list page.
Command-line
To create the label set resource, define each label in JSON format and pass that definition to the Data Labeling Service.
The following example creates a label set named code_sample_label_set that has two labels. Save the "name" of the new label set (from the response) for use with other operations, such as sending the labeling request.
    curl -X POST \
      -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
      -H "Content-Type: application/json" \
      https://datalabeling.googleapis.com/v1beta1/projects/"${PROJECT_ID}"/annotationSpecSets \
      -d '{
        "annotationSpecSet": {
          "displayName": "code_sample_label_set",
          "description": "code sample general label set",
          "annotationSpecs": [
            {
              "displayName": "dog",
              "description": "label dog"
            },
            {
              "displayName": "cat",
              "description": "label cat"
            }
          ]
        }
      }'
You should see output similar to the following:
{ "name": "projects/data-labeling-codelab/annotationSpecSets/5c73db2d_0000_2f46_983d_001a114a5d7c", "displayName": "code_sample_label_set", "description": "code sample general label set", "annotationSpecs": [ { "displayName": "dog", "description": "label dog" }, { "displayName": "cat", "description": "label cat" } ] }
Python
Before you can run this code example, you must install the Python Client Libraries.
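As a reference point, the following is a minimal sketch of creating the same label set with the google-cloud-datalabeling client library. The import path, type names, and request shape are assumptions based on the v1beta1 client and may differ between client library versions.

    # Minimal sketch, assuming the google-cloud-datalabeling package is installed
    # (pip install google-cloud-datalabeling).
    from google.cloud import datalabeling_v1beta1 as datalabeling

    def create_label_set(project_id: str):
        client = datalabeling.DataLabelingServiceClient()

        # Define the label set: a display name, a description, and one spec per label.
        annotation_spec_set = datalabeling.AnnotationSpecSet(
            display_name="code_sample_label_set",
            description="code sample general label set",
            annotation_specs=[
                datalabeling.AnnotationSpec(display_name="dog", description="label dog"),
                datalabeling.AnnotationSpec(display_name="cat", description="label cat"),
            ],
        )

        response = client.create_annotation_spec_set(
            request={
                "parent": f"projects/{project_id}",
                "annotation_spec_set": annotation_spec_set,
            }
        )

        # Save response.name for later operations, such as sending the labeling request.
        print(f"Created label set: {response.name}")
        return response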
Java
Before you can run this code example, you must install the Java Client Libraries.
For continuous evaluation
When you create an evaluation job, you must specify a CSV file that defines your annotation specification set:
- The file must have one row for every possible label your model outputs during prediction.
- Each row should be a comma-separated pair containing the label and a description of the label:
  LABEL_NAME,DESCRIPTION
- When you create an evaluation job, Data Labeling Service uses the filename of the CSV file as the name of an annotation specification set that it creates in the background.
For example, if your model predicts what animal is in an image, you could write the following specification to a file named animals.csv:
    bird,any animal in the class Aves - see https://en.wikipedia.org/wiki/Bird
    cat,any animal in the species Felis catus (domestic cats, not wild cats) - see https://en.wikipedia.org/wiki/Cat
    dog,any animal in the genus Canis (domestic dogs and close relatives) - see https://en.wikipedia.org/wiki/Canis
    multiple,image contains more than one of the above
    none,image contains none of the above
Then, upload this file to a Cloud Storage bucket in the same project as your continuous evaluation job.
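The following is a minimal sketch of writing that file and uploading it with the google-cloud-storage client library; the bucket name and object path are hypothetical placeholders you would replace with your own.

    # Minimal sketch, assuming the google-cloud-storage package is installed
    # (pip install google-cloud-storage).
    from google.cloud import storage

    BUCKET_NAME = "your-evaluation-job-bucket"  # hypothetical bucket in the same project
    CSV_LOCAL_PATH = "animals.csv"

    rows = [
        ("bird", "any animal in the class Aves - see https://en.wikipedia.org/wiki/Bird"),
        ("cat", "any animal in the species Felis catus (domestic cats, not wild cats) - see https://en.wikipedia.org/wiki/Cat"),
        ("dog", "any animal in the genus Canis (domestic dogs and close relatives) - see https://en.wikipedia.org/wiki/Canis"),
        ("multiple", "image contains more than one of the above"),
        ("none", "image contains none of the above"),
    ]

    # Write one LABEL_NAME,DESCRIPTION pair per line, matching the format above.
    with open(CSV_LOCAL_PATH, "w") as f:
        for label, description in rows:
            f.write(f"{label},{description}\n")

    # Upload the file to Cloud Storage so the evaluation job can read it.
    client = storage.Client()
    blob = client.bucket(BUCKET_NAME).blob("annotation_specs/animals.csv")
    blob.upload_from_filename(CSV_LOCAL_PATH)
    print(f"Uploaded to gs://{BUCKET_NAME}/{blob.name}")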