Class BatchPredictOutputConfig (0.8.0)

Output configuration for BatchPredict Action.

As destination, the [gcs_destination][google.cloud.automl.v1.BatchPredictOutputConfig.gcs_destination] must be set unless specified otherwise for a domain. If gcs_destination is set, a new directory is created in the given directory. Its name will be "prediction-<model-display-name>-<timestamp-of-prediction-call>", where the timestamp is in YYYY-MM-DDThh:mm:ss.sssZ ISO-8601 format. Its contents depend on the ML problem the predictions are made for.
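The directory naming described above can be checked with a short sketch. The helper name `parse_prediction_dir` is hypothetical and not part of the client library; it only assumes the `prediction-` prefix and the trailing timestamp format stated in the text, and returns whatever sits between them untouched.

```python
import re
from datetime import datetime, timezone

# "prediction-", an arbitrary middle part, "-", then the ISO-8601 timestamp
# in the YYYY-MM-DDThh:mm:ss.sssZ form described above.
_DIR_NAME = re.compile(
    r"^prediction-(?P<middle>.*)-"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z)$"
)


def parse_prediction_dir(name):
    """Return (middle part, UTC datetime), or None if the name does not match.

    Hypothetical helper for illustration; not provided by the library.
    """
    match = _DIR_NAME.match(name)
    if match is None:
        return None
    stamp = datetime.strptime(match.group("timestamp"), "%Y-%m-%dT%H:%M:%S.%fZ")
    return match.group("middle"), stamp.replace(tzinfo=timezone.utc)
```

A helper like this may be useful for picking the most recent output directory when several batch predictions write to the same location.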

  • For Text Classification: In the created directory, files text_classification_1.jsonl, text_classification_2.jsonl, ..., text_classification_N.jsonl will be created, where N may be 1 and depends on the total number of inputs and annotations found.

    Each .JSONL file will contain, per line, a JSON representation of a
    proto that wraps the input text (or pdf) file in
    the text snippet (or document) proto and a list of
    zero or more AnnotationPayload protos (called annotations), which
    have classification detail populated. A single text (or pdf) file
    will be listed only once with all its annotations, and its
    annotations will never be split across files.

    If prediction for any text (or pdf) file failed (partially or
    completely), then additional `errors_1.jsonl`, `errors_2.jsonl`, ...,
    `errors_N.jsonl` files will be created (N depends on the total number of
    failed predictions). These files will have a JSON representation of a
    proto that wraps the input text (or pdf) file, followed by exactly one
    `google.rpc.Status <https://github.com/googleapis/googleapis/blob/master/google/rpc/status.proto>`__
    containing only `code` and `message`.

  • For Text Sentiment: In the created directory, files text_sentiment_1.jsonl, text_sentiment_2.jsonl, ..., text_sentiment_N.jsonl will be created, where N may be 1 and depends on the total number of inputs and annotations found.

    Each .JSONL file will contain, per line, a JSON representation of a
    proto that wraps the input text (or pdf) file in
    the text snippet (or document) proto and a list of
    zero or more AnnotationPayload protos (called annotations), which
    have text_sentiment detail populated. A single text (or pdf) file
    will be listed only once with all its annotations, and its
    annotations will never be split across files.

    If prediction for any text (or pdf) file failed (partially or
    completely), then additional `errors_1.jsonl`, `errors_2.jsonl`, ...,
    `errors_N.jsonl` files will be created (N depends on the total number of
    failed predictions). These files will have a JSON representation of a
    proto that wraps the input text (or pdf) file, followed by exactly one
    `google.rpc.Status <https://github.com/googleapis/googleapis/blob/master/google/rpc/status.proto>`__
    containing only `code` and `message`.

  • For Text Extraction: In the created directory, files text_extraction_1.jsonl, text_extraction_2.jsonl, ..., text_extraction_N.jsonl will be created, where N may be 1 and depends on the total number of inputs and annotations found. The contents of these .JSONL file(s) depend on whether the input used inline text or documents.

    If the input was inline, then each .JSONL file will contain, per line, a JSON representation of a proto that wraps the text snippet's "id" given in the request (if specified), followed by the input text snippet, and a list of zero or more AnnotationPayload protos (called annotations), which have text_extraction detail populated. A single text snippet will be listed only once with all its annotations, and its annotations will never be split across files.

    If the input used documents, then each .JSONL file will contain, per line, a JSON representation of a proto that wraps the document proto given in the request, followed by its OCR-ed representation in the form of a text snippet, and finally followed by a list of zero or more AnnotationPayload protos (called annotations), which have text_extraction detail populated and refer, via their indices, to the OCR-ed text snippet. A single document (and its text snippet) will be listed only once with all its annotations, and its annotations will never be split across files.

    If prediction for any text snippet failed (partially or completely), then additional errors_1.jsonl, errors_2.jsonl, ..., errors_N.jsonl files will be created (N depends on the total number of failed predictions). These files will have a JSON representation of a proto that wraps either the "id" : "<id_value>" (in case of inline) or the document proto (in case of document), but here followed by exactly one `google.rpc.Status <https://github.com/googleapis/googleapis/blob/master/google/rpc/status.proto>`__ containing only `code` and `message`.
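The file layout described above (numbered result shards plus optional numbered error shards, one JSON object per line) can be read with a small sketch like the following. It assumes the output directory has already been copied to a local path. The name `iter_records` is hypothetical, and the sketch makes no assumption about the field names inside each JSON object.

```python
import json
import re
from pathlib import Path

# Matches "<prefix>_<N>.jsonl", e.g. "text_extraction_3.jsonl" or "errors_1.jsonl".
_SHARD = re.compile(r"^(?P<prefix>.+)_(?P<index>\d+)\.jsonl$")


def iter_records(directory, prefix):
    """Yield (file name, parsed JSON object) for each line of each shard.

    Hypothetical helper: `directory` is a local copy of the output directory,
    `prefix` is e.g. "text_classification", "text_sentiment",
    "text_extraction" or "errors".
    """
    shards = []
    for path in Path(directory).iterdir():
        match = _SHARD.match(path.name)
        if match and match.group("prefix") == prefix:
            shards.append((int(match.group("index")), path))
    # Sort numerically so that shard 10 comes after shard 2.
    for _, path in sorted(shards, key=lambda item: item[0]):
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                if line.strip():  # skip blank lines
                    yield path.name, json.loads(line)
```

Calling `iter_records(local_dir, "errors")` on the same directory would walk the error shards, if any were written. Sorting by the numeric suffix is a design choice for predictable ordering only; the text does not state that shard order is meaningful.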

    Required. The Google Cloud Storage location of the directory where the output is to be written.
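As a rough illustration of setting this destination, the sketch below builds the output configuration as a plain dictionary. It assumes that the client accepts a dict in place of the message and that the destination's field is named `output_uri_prefix`; neither is stated in the text above, so verify both against the library version in use. The helper name `build_output_config` is hypothetical.

```python
def build_output_config(output_uri_prefix):
    """Return a dict shaped like BatchPredictOutputConfig.

    Hypothetical helper. The nested key "output_uri_prefix" is an assumption
    about the GcsDestination message, not something stated in this page.
    """
    if not output_uri_prefix.startswith("gs://"):
        raise ValueError("expected a Google Cloud Storage URI starting with gs://")
    return {"gcs_destination": {"output_uri_prefix": output_uri_prefix}}
```

For example, `build_output_config("gs://my-bucket/predictions/")` returns a dictionary with a single `gcs_destination` entry; the bucket name here is a placeholder.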