REST Resource: projects.datasets

Resource: DataSet

A collection of data sources over which queries are run. For example:

    - dataset named 'monitoring' with:
       - data named 'pressure' with:
            - timestamp 0, value '110psi'
            - timestamp 1, value '120psi'
            ...
            - timestamp 7, value '171psi'
       - data named 'temperature' with:
            - timestamp 0, value '17 C'
            - timestamp 1, value '7 C'
            ...
            - timestamp 7, value '1 C'
JSON representation
{
  "name": string,
  "dataNames": [
    string
  ],
  "dataSources": [
    {
      object(DataSource)
    }
  ],
  "state": enum(State),
  "stateDetail": string,
  "enableIngestMode": boolean,
  "ingestTtl": string
}
Fields
name

string

The dataset name, used in query, status, and unload requests. It must be unique within a client project.

dataNames[]

string

Data names allowed in this DataSet (e.g. 'pressure', 'temperature'). If not provided, all DataItems in all DataSources are used. If provided, only the DataItems whose 'dataName' is in the provided set are used; DataItems whose 'dataName' is not part of 'dataNames' are ignored by the system.
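The filtering rule above can be illustrated with a small Python sketch. It only mirrors the described semantics on the client side and is not the service's implementation; the function name select_data_items and the 'value' key are hypothetical, and only 'dataName' comes from the description above.

```python
from typing import Iterable, List, Optional


def select_data_items(items: Iterable[dict],
                      data_names: Optional[List[str]] = None) -> List[dict]:
    """Return the items a DataSet would use for the given dataNames.

    If data_names is empty or None, every item is kept. Otherwise only
    items whose 'dataName' is in data_names are kept.
    """
    if not data_names:
        return list(items)
    allowed = set(data_names)
    return [item for item in items if item.get("dataName") in allowed]


# Hypothetical items; only the 'dataName' key is taken from the reference.
items = [
    {"dataName": "pressure", "value": "110psi"},
    {"dataName": "temperature", "value": "17 C"},
    {"dataName": "humidity", "value": "40%"},
]

# Keeps the 'pressure' and 'temperature' items and ignores 'humidity'.
used = select_data_items(items, ["pressure", "temperature"])
```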

dataSources[]

object(DataSource)

The individual data sources to run inference over.

state

enum(State)

Dataset state in the system.

stateDetail

string

Dataset state detail (e.g. failure reason).

enableIngestMode

boolean

Whether the dataset is in ingest mode. If set, dataSources is ignored and the client sends data items in ingest mode instead.

ingestTtl

string (Duration format)

Time to live (TTL) of data items sent through datasets.ingestDataItems in ingest mode. Only needed in ingest mode.

A duration in seconds with up to nine fractional digits, terminated by 's'. Example: "3.5s".
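As a minimal sketch of how the fields above fit together, the following Python builds a DataSet body for ingest mode and formats ingestTtl in the Duration format. The helper names to_duration and build_ingest_dataset are hypothetical, as are the example values. Leaving out state and stateDetail rests on the assumption that the service reports them, and rejecting negative durations is a choice made here for a TTL.

```python
import json
from typing import List


def to_duration(seconds: float) -> str:
    """Format seconds as a Duration string, e.g. 3.5 -> "3.5s".

    Uses at most nine fractional digits and drops trailing zeros.
    """
    if seconds < 0:
        raise ValueError("a TTL cannot be negative")
    text = f"{seconds:.9f}".rstrip("0").rstrip(".")
    return text + "s"


def build_ingest_dataset(name: str, data_names: List[str],
                         ttl_seconds: float) -> dict:
    """Build a DataSet body for ingest mode.

    dataSources is left out because it is ignored in ingest mode.
    """
    return {
        "name": name,
        "dataNames": list(data_names),
        "enableIngestMode": True,
        "ingestTtl": to_duration(ttl_seconds),
    }


body = build_ingest_dataset("monitoring", ["pressure", "temperature"], 3600)
print(json.dumps(body, indent=2))
```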

DataSource

A data source is a file stored on Google Cloud Storage that contains multiple DataItems. Each DataItem should be in JSON format, with a single item per line.

JSON representation
{
  "uri": string
}
Fields
uri

string

Data source URI (e.g. the Google Cloud Storage path for a timeseries shard). A Google Cloud Storage URI must have the form 'gs://bucket_name/object_name'. For more details on Google Cloud Storage URIs, see: https://cloud.google.com/storage/docs/reference-uris.
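A client may want to check the URI form before sending a DataSource. The sketch below is one way to do that in Python; the function names parse_gcs_uri and make_data_source and the example bucket and object names are hypothetical. It checks only the 'gs://bucket_name/object_name' shape and does not apply Cloud Storage naming rules.

```python
from typing import Tuple


def parse_gcs_uri(uri: str) -> Tuple[str, str]:
    """Split 'gs://bucket_name/object_name' into (bucket, object).

    Raises ValueError if the scheme, bucket, or object name is missing.
    """
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a Google Cloud Storage URI: {uri!r}")
    bucket, _, object_name = uri[len(prefix):].partition("/")
    if not bucket or not object_name:
        raise ValueError(f"expected gs://bucket_name/object_name: {uri!r}")
    return bucket, object_name


def make_data_source(uri: str) -> dict:
    """Build a DataSource body after checking the URI shape."""
    parse_gcs_uri(uri)
    return {"uri": uri}


# Hypothetical bucket and object names.
source = make_data_source("gs://example-bucket/shards/pressure-00000.json")
```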

State

DataSet state in the Inference indexing and serving systems.

Enums
STATE_UNSPECIFIED Unspecified / undefined state.
UNKNOWN Data set is unknown to the system; either it has never been seen before, or it was seen and has since been fully garbage-collected.
PENDING Data set processing is pending.
LOADING Data set is loading.
LOADED Data set is loaded and can be queried.
UNLOADING Data set is unloading.
UNLOADED Data set is unloaded and is removed from the system.
FAILED Data set processing failed.
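
Client code that waits for a dataset to become queryable has to interpret these values. The following Python sketch models the enum and two hypothetical helpers, is_queryable and is_settled. Treating LOADED, UNLOADED, and FAILED as states that will not change on their own is an assumption made for illustration; the reference above only states that LOADED datasets can be queried.

```python
from enum import Enum


class State(Enum):
    STATE_UNSPECIFIED = "STATE_UNSPECIFIED"
    UNKNOWN = "UNKNOWN"
    PENDING = "PENDING"
    LOADING = "LOADING"
    LOADED = "LOADED"
    UNLOADING = "UNLOADING"
    UNLOADED = "UNLOADED"
    FAILED = "FAILED"


# Assumption: these states do not progress further without a new request.
SETTLED_STATES = frozenset({State.LOADED, State.UNLOADED, State.FAILED})


def is_queryable(state: State) -> bool:
    """Only a LOADED dataset can be queried."""
    return state is State.LOADED


def is_settled(state: State) -> bool:
    """True if polling for further progress is unlikely to help."""
    return state in SETTLED_STATES


# Parse the string found in a DataSet's "state" field.
state = State("LOADING")
```

A polling loop could stop once is_settled returns True, then read stateDetail if the state is FAILED.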

Methods

create

Load a dataset stored on Google Cloud Storage.

delete

Unload a dataset from the system once processing is done.

ingestDataItems

Ingest data items into a dataset.

list

List datasets.

query

Execute an Inference query over a loaded dataset.