REST Resource: projects.locations.datasets

Resource: DataSet

A collection of data sources sent for processing.

JSON representation
{
  "name": string,
  "dataNames": [
    string
  ],
  "dataSources": [
    {
      object (DataSource)
    }
  ],
  "state": enum (State),
  "status": {
    object (Status)
  },
  "ttl": string
}
Fields
name

string

The dataset name, used in query, status, and unload requests. It must be unique within a project.

dataNames[]

string

Data dimension names allowed for this DataSet.

If left empty, all dimension names are included. This field acts as a filter, so the source data does not need to be regenerated when it contains more dimensions than the ones to be used.

dataSources[]

object (DataSource)

Input data. An empty dataSources list is accepted, in which case the system must accumulate enough history through online updates before it can function.

state

enum (State)

Dataset state in the system.

status

object (Status)

Output only. Dataset processing status.

ttl

string (Duration format)

Event objects with timestamps older than ttl are periodically discarded from the dataset. If this field is omitted or set to zero, no events are discarded.

A duration in seconds with up to nine fractional digits, ending with 's'. Example: "3.5s".
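
As an illustration of the Duration format and the DataSet JSON shape described above, the following Python sketch renders a timedelta as a ttl string and assembles a request body. The helper names format_duration and build_dataset are hypothetical and are not part of the API; only the JSON field names come from this reference.

```python
from datetime import timedelta


def format_duration(td: timedelta) -> str:
    """Render a non-negative timedelta as seconds ending in 's', e.g. "3.5s".

    timedelta has microsecond resolution, so this produces at most six
    fractional digits, which is within the nine the format allows.
    """
    total_us = (td.days * 86400 + td.seconds) * 1_000_000 + td.microseconds
    if total_us < 0:
        raise ValueError("duration must not be negative")
    whole, frac = divmod(total_us, 1_000_000)
    if frac == 0:
        return f"{whole}s"
    # Zero-pad to six digits, then drop trailing zeros.
    return f"{whole}.{frac:06d}".rstrip("0") + "s"


def build_dataset(name, data_sources, ttl=None, data_names=()):
    """Assemble a DataSet body as a plain dict (hypothetical helper).

    Optional fields are left out when unset, so an omitted ttl keeps its
    documented meaning of "no events are discarded".
    """
    body = {"name": name, "dataSources": list(data_sources)}
    if data_names:
        body["dataNames"] = list(data_names)
    if ttl is not None:
        body["ttl"] = format_duration(ttl)
    return body


if __name__ == "__main__":
    import json

    dataset = build_dataset(
        "example-dataset",  # placeholder name
        [{"uri": "gs://bucket_name/object_name"}],
        ttl=timedelta(days=30),
    )
    print(json.dumps(dataset, indent=2))
```

The state and status fields are omitted from the body here on the assumption that they are reported by the service rather than set by the caller; status is documented as output only.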

DataSource

A data source consists of multiple Event objects.

JSON representation
{
  "uri": string,
  "bqMapping": {
    object (BigqueryMapping)
  }
}
Fields
uri

string

Data source URI.

  1. Google Cloud Storage files (JSON) are defined in the following form: gs://bucket_name/object_name. Each Event should be in JSON format, with one Event per line, also known as JSON Lines format. Gzipped files are allowed. For more information on Cloud Storage URIs, please see https://cloud.google.com/storage/docs/reference-uris.
  2. Google BigQuery tables are defined in the following form: bq://<PROJECT_ID>:<DATASET_ID>.<TABLE_ID>. Each row corresponds to an Event.
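
The two URI forms listed above can be told apart on the client side before a request is sent. The sketch below is one possible way to do that; the class names, the function name, and the regular expressions are assumptions made for illustration, and the patterns are deliberately loose rather than a full validation of Cloud Storage or BigQuery naming rules.

```python
import re
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class GcsUri:
    bucket: str
    object_name: str


@dataclass(frozen=True)
class BigQueryUri:
    project_id: str
    dataset_id: str
    table_id: str


# gs://bucket_name/object_name
_GCS_PATTERN = re.compile(r"^gs://([^/]+)/(.+)$")
# bq://PROJECT_ID:DATASET_ID.TABLE_ID
_BQ_PATTERN = re.compile(r"^bq://(.+):([^.:]+)\.([^.]+)$")


def parse_data_source_uri(uri: str) -> Union[GcsUri, BigQueryUri]:
    """Classify a DataSource uri as Cloud Storage or BigQuery (hypothetical helper)."""
    match = _GCS_PATTERN.match(uri)
    if match:
        return GcsUri(*match.groups())
    match = _BQ_PATTERN.match(uri)
    if match:
        return BigQueryUri(*match.groups())
    raise ValueError(f"unsupported data source URI: {uri!r}")


if __name__ == "__main__":
    print(parse_data_source_uri("gs://bucket_name/events.json.gz"))
    print(parse_data_source_uri("bq://my-project:my_dataset.events"))
```
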
bqMapping

object (BigqueryMapping)

For BigQuery inputs, defines the columns that should be used for dimensions (including timestamp and group ID).

BigqueryMapping

Mapping of BigQuery columns to timestamp, groupId and dimensions.

JSON representation
{
  "timestampColumn": string,
  "groupIdColumn": string,
  "dimensionColumn": [
    string
  ]
}
Fields
timestampColumn

string

The column to use for event timestamps. If not specified, 'Timestamp' is used by default. The column may have TIMESTAMP or INT64 type (the latter is interpreted as microseconds since the Unix epoch).

groupIdColumn

string

The column to use as the group ID (grouping events into sessions). If not specified, 'GroupId' is used by default. If the input table does not have such a column, random unique group IDs are generated automatically (a different group ID per input row).

dimensionColumn[]

string

The list of columns that should be translated to dimensions. If empty, all columns are translated to dimensions. The timestamp and groupId columns should not be listed here again. Columns are expected to have primitive types (STRING, INT64, FLOAT64 or NUMERIC).
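
The mapping rules above can be sketched in Python as follows. Both helper names are hypothetical. The check that rejects a timestamp or group ID column repeated in dimensionColumn is a client-side assumption based on the field description; the service's own handling of that case is not specified here.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_TIMESTAMP_COLUMN = "Timestamp"
DEFAULT_GROUP_ID_COLUMN = "GroupId"


def build_bq_mapping(timestamp_column=None, group_id_column=None,
                     dimension_columns=()):
    """Assemble a BigqueryMapping body as a plain dict (hypothetical helper)."""
    effective_ts = timestamp_column or DEFAULT_TIMESTAMP_COLUMN
    effective_gid = group_id_column or DEFAULT_GROUP_ID_COLUMN
    dims = list(dimension_columns)

    # The timestamp and group ID columns should not be listed again.
    repeated = {effective_ts, effective_gid} & set(dims)
    if repeated:
        raise ValueError(f"columns listed twice: {sorted(repeated)}")

    mapping = {}
    if timestamp_column:
        mapping["timestampColumn"] = timestamp_column
    if group_id_column:
        mapping["groupIdColumn"] = group_id_column
    if dims:
        # Left out when empty, so all columns are translated to dimensions.
        mapping["dimensionColumn"] = dims
    return mapping


def int64_to_datetime(micros: int) -> datetime:
    """Interpret an INT64 timestamp as microseconds since the Unix epoch."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(microseconds=micros)


if __name__ == "__main__":
    print(build_bq_mapping("event_time", "session_id", ["country", "device"]))
    print(int64_to_datetime(1_500_000).isoformat())
```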

State

DataSet state.

Enums
STATE_UNSPECIFIED Unspecified / undefined state.
UNKNOWN Dataset is unknown to the system; we have never seen this dataset before or we have seen this dataset but have fully garbage-collected it.
PENDING Dataset processing is pending.
LOADING Dataset is loading.
LOADED Dataset is loaded and can be queried.
UNLOADING Dataset is unloading.
UNLOADED Dataset is unloaded and is removed from the system.
FAILED Dataset processing failed. The name of a failed dataset cannot be reused until the dataset has been deleted. A failed dataset is automatically removed after 30 days.
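
A client typically needs to wait for a dataset to leave the in-progress states before querying it. The sketch below models the enum values listed above and polls until the state settles. Treating PENDING, LOADING, and UNLOADING as the in-progress states is an interpretation of the descriptions above, and get_state is a hypothetical caller-supplied function that returns the current state string; how it obtains that string is outside the scope of this example.

```python
import time
from enum import Enum


class State(Enum):
    STATE_UNSPECIFIED = "STATE_UNSPECIFIED"
    UNKNOWN = "UNKNOWN"
    PENDING = "PENDING"
    LOADING = "LOADING"
    LOADED = "LOADED"
    UNLOADING = "UNLOADING"
    UNLOADED = "UNLOADED"
    FAILED = "FAILED"


# States in which the dataset is still changing (an assumption, see above).
IN_PROGRESS = frozenset({State.PENDING, State.LOADING, State.UNLOADING})


def is_queryable(state: State) -> bool:
    """Only a LOADED dataset can be queried."""
    return state is State.LOADED


def wait_until_settled(get_state, poll_seconds=5.0, max_polls=120,
                       sleep=time.sleep) -> State:
    """Poll get_state() until the dataset is no longer in progress."""
    for _ in range(max_polls):
        state = State(get_state())
        if state not in IN_PROGRESS:
            return state
        sleep(poll_seconds)
    raise TimeoutError("dataset did not settle within the polling budget")


if __name__ == "__main__":
    # Simulated sequence of states instead of real API calls.
    simulated = iter(["PENDING", "LOADING", "LOADED"])
    final = wait_until_settled(lambda: next(simulated), poll_seconds=0)
    print(final.name, is_queryable(final))
```

The sleep parameter is injectable so the loop can be exercised without real delays.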

Status

The Status type defines a logical error model that is suitable for different programming environments, including REST APIs and RPC APIs. It is used by gRPC. Each Status message contains three pieces of data: error code, error message, and error details.

You can find out more about this error model and how to work with it in the API Design Guide.

JSON representation
{
  "code": integer,
  "message": string,
  "details": [
    {
      "@type": string,
      field1: ...,
      ...
    }
  ]
}
Fields
code

integer

The status code, which should be an enum value of google.rpc.Code.

message

string

A developer-facing error message, which should be in English. Any user-facing error message should be localized and sent in the google.rpc.Status.details field, or localized by the client.

details[]

object

A list of messages that carry the error details. There is a common set of message types for APIs to use.

An object containing fields of an arbitrary type. An additional field "@type" contains a URI identifying the type. Example: { "id": 1234, "@type": "types.example.com/standard/id" }.
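
To show how the three parts of a Status fit together, the following sketch turns a Status object (already decoded from JSON into a dict) into a one-line summary. The numeric values mirror the google.rpc.Code enum; summarize_status is a hypothetical helper, and the output format is an arbitrary choice for illustration.

```python
from enum import IntEnum


class Code(IntEnum):
    """Numeric values of google.rpc.Code."""
    OK = 0
    CANCELLED = 1
    UNKNOWN = 2
    INVALID_ARGUMENT = 3
    DEADLINE_EXCEEDED = 4
    NOT_FOUND = 5
    ALREADY_EXISTS = 6
    PERMISSION_DENIED = 7
    RESOURCE_EXHAUSTED = 8
    FAILED_PRECONDITION = 9
    ABORTED = 10
    OUT_OF_RANGE = 11
    UNIMPLEMENTED = 12
    INTERNAL = 13
    UNAVAILABLE = 14
    DATA_LOSS = 15
    UNAUTHENTICATED = 16


def summarize_status(status: dict) -> str:
    """Format a Status dict as 'CODE_NAME: message [detail types]'."""
    raw_code = status.get("code", 0)
    try:
        name = Code(raw_code).name
    except ValueError:
        # Keep unrecognized codes visible instead of failing.
        name = f"CODE_{raw_code}"
    message = status.get("message", "")
    detail_types = [d.get("@type", "?") for d in status.get("details", [])]
    suffix = f" [{', '.join(detail_types)}]" if detail_types else ""
    return f"{name}: {message}{suffix}"


if __name__ == "__main__":
    example = {
        "code": 3,
        "message": "example error message",  # placeholder text
        "details": [{"id": 1234, "@type": "types.example.com/standard/id"}],
    }
    print(summarize_status(example))
```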

Methods

appendEvents

Append events to a LOADED DataSet.

create

Create a DataSet from data stored on Cloud Storage or in BigQuery.

delete

Delete a DataSet from the system.

evaluateSlice

Evaluate an explicit slice from a loaded DataSet.

list

Lists DataSets under the project.

query

Execute a Timeseries Insights query over a loaded DataSet.