InputConfig

Input configuration for the datasets.importData action.

See Preparing your training data for more information.

One or more CSV files with each line in the following format:

ML_USE,GCS_FILE_PATH,(LABEL,BOUNDING_BOX | ,,,,,,,)
  • ML_USE - Identifies the data set that the current row (file) applies to. This value can be one of the following:

    • TRAIN - Rows in this file are used to train the model.
    • TEST - Rows in this file are used to test the model during training.
    • UNASSIGNED - Rows in this file are not categorized. They are automatically divided into training and test data: 80% for training and 20% for testing.
  • GCS_FILE_PATH - A Google Cloud Storage URL for an image up to 30 MB in size. Supported extensions: .JPEG, .GIF, .PNG.

  • LABEL - A label that identifies the object in the specified bounding box.

  • BOUNDING_BOX - The edge coordinates of the object in the image. The minimum BOUNDING_BOX edge length is 0.01. You can specify from 0 to 500 bounding boxes per image. To specify an image with no objects, add a row to the CSV file with no LABEL and empty columns (",,,,,,,") as the BOUNDING_BOX.

Sample rows:

TRAIN,gs://folder/image1.png,car,0.1,0.1,,,0.3,0.3,,
TRAIN,gs://folder/image1.png,bike,.7,.6,,,.8,.9,,
UNASSIGNED,gs://folder/im2.png,car,0.1,0.1,0.2,0.1,0.2,0.3,0.1,0.3
TEST,gs://folder/im3.png,,,,,,,,,
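
The row layout above can be sketched in Python as follows. This is a minimal illustration, not part of the API: the helper names make_row and rows_to_csv are hypothetical, and the box argument is assumed to be given as two opposite corners (x_min, y_min, x_max, y_max), matching the first two sample rows, where the second and fourth vertex columns are left empty.

```python
import csv
import io

ML_USE_VALUES = {"TRAIN", "TEST", "UNASSIGNED"}
MIN_EDGE = 0.01  # minimum BOUNDING_BOX edge length stated in this section


def make_row(ml_use, gcs_path, label=None, box=None):
    """Return one 11-column CSV row as a list of strings.

    box is a hypothetical (x_min, y_min, x_max, y_max) tuple.
    box=None produces a row for an image with no objects.
    """
    if ml_use not in ML_USE_VALUES:
        raise ValueError(f"unknown ML_USE: {ml_use!r}")
    if not gcs_path.startswith("gs://"):
        raise ValueError(f"not a Google Cloud Storage URL: {gcs_path!r}")
    if box is None:
        # No LABEL and eight empty BOUNDING_BOX columns.
        return [ml_use, gcs_path] + [""] * 9
    if not label:
        raise ValueError("a bounding box needs a LABEL")
    x_min, y_min, x_max, y_max = box
    if x_max - x_min < MIN_EDGE or y_max - y_min < MIN_EDGE:
        raise ValueError("bounding box edge shorter than 0.01")
    # Two-vertex form: first and third vertex filled, second and fourth empty.
    return [ml_use, gcs_path, label,
            str(x_min), str(y_min), "", "",
            str(x_max), str(y_max), "", ""]


def rows_to_csv(rows):
    """Serialize rows to CSV text with one row per line."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


rows = [
    make_row("TRAIN", "gs://folder/image1.png", "car", (0.1, 0.1, 0.3, 0.3)),
    make_row("TEST", "gs://folder/im3.png"),
]
print(rows_to_csv(rows), end="")
```

The resulting text would still need to be uploaded to Google Cloud Storage before it can be referenced from gcsSource.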

Errors

If any of the provided CSV files can't be parsed, or if more than a certain percentage of CSV rows can't be processed, the operation fails and nothing is imported. Regardless of overall success or failure, the per-row failures, up to a certain count cap, are listed in Operation.metadata.partial_failures.
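
Because a failed import brings in nothing, it can help to check a CSV file locally before starting the operation. The sketch below is one possible pre-check that mirrors only the rules stated in this section. The function name precheck_csv is hypothetical, the 11-column count is inferred from the sample rows, and the service's own validation, failure threshold, and count cap are not specified here and may differ.

```python
import csv
import io


def precheck_csv(text, max_boxes_per_image=500):
    """Return (line_number, message) pairs for rows that break the documented rules."""
    problems = []
    boxes_per_image = {}
    for line_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if len(row) != 11:  # assumption: 3 leading columns + 8 coordinate columns
            problems.append((line_no, f"expected 11 columns, got {len(row)}"))
            continue
        ml_use, path, label, coords = row[0], row[1], row[2], row[3:]
        if ml_use not in {"TRAIN", "TEST", "UNASSIGNED"}:
            problems.append((line_no, f"unknown ML_USE {ml_use!r}"))
        if not path.startswith("gs://"):
            problems.append((line_no, "GCS_FILE_PATH must start with gs://"))
        has_coords = any(c != "" for c in coords)
        if has_coords != bool(label):
            problems.append(
                (line_no, "LABEL and BOUNDING_BOX must both be set or both be empty"))
        if has_coords:
            boxes_per_image[path] = boxes_per_image.get(path, 0) + 1
            # Report the limit once, on the first row that exceeds it.
            if boxes_per_image[path] == max_boxes_per_image + 1:
                problems.append(
                    (line_no, f"more than {max_boxes_per_image} boxes for {path}"))
    return problems


sample = (
    "TRAIN,gs://folder/image1.png,car,0.1,0.1,,,0.3,0.3,,\n"
    "TEST,gs://folder/im3.png,,,,,,,,,\n"
    "VALIDATE,gs://folder/im4.png,,,,,,,,,\n"
)
for line_no, message in precheck_csv(sample):
    print(f"line {line_no}: {message}")
```

A clean result from this pre-check does not guarantee that the import succeeds; per-row failures reported by the service remain the authoritative source.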

JSON representation
{
  "params": {
    string: string,
    ...
  },
  "gcsSource": {
    object(GcsSource)
  }
}
Fields
params

map (key: string, value: string)

Additional domain-specific parameters describing the semantics of the imported data. Each string can be up to 25000 characters long.

gcsSource

object(GcsSource)

The Google Cloud Storage location for the input content.
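
As an illustration, the JSON representation above could be assembled in Python like this. It is a sketch under stated assumptions: the helper name build_input_config is hypothetical, and the inputUris field inside gcsSource is assumed from the GcsSource reference, which is not reproduced in this section. The length check is applied to both keys and values, which is one reading of the 25000-character limit.

```python
import json

MAX_PARAM_LENGTH = 25000  # limit on params strings stated in this section


def build_input_config(csv_uris, params=None):
    """Return a dict shaped like the InputConfig JSON representation."""
    params = dict(params or {})
    for key, value in params.items():
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError("params keys and values must be strings")
        if len(key) > MAX_PARAM_LENGTH or len(value) > MAX_PARAM_LENGTH:
            raise ValueError("params strings can be up to 25000 characters long")
    return {
        "params": params,
        # Assumption: GcsSource lists its CSV files under "inputUris".
        "gcsSource": {"inputUris": list(csv_uris)},
    }


config = build_input_config(["gs://folder/import.csv"])
print(json.dumps(config, indent=2))
```

The path gs://folder/import.csv is a placeholder for one of the CSV files described earlier.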