Metadata instances linking

This guide describes how to link records to metadata instances. Manufacturing Data Engine (MDE) links a record to a metadata instance by writing the instance_id of the metadata instance into the record. However, you don't set the instance_id directly in the proto record in the parser. Instead, MDE resolves the instance_id automatically from the inputs you provide in the proto record. MDE provides two ways to resolve an instance_id:

  1. By metadata instance natural key.
  2. By metadata instance value.

Resolving a metadata instance_id by natural key

Each metadata instance in a metadata bucket has a natural key, and a natural key can have more than one instance. You can associate a record with the latest metadata instance for a natural key by supplying the natural key in the proto record in the parser. MDE retrieves the latest instance for the natural key you supplied and inserts that instance's instance_id into the record.

For example, given the following metadata instances in a record metadata bucket called machine with bucket number 1:

[
  {
    "instance_id": "a614e25d-fded-41c3-9a56-3222cd30070d",
    "bucket_number": "1",
    "bucket_name": "machine",
    "bucket_version": "1",
    "natural_key": "m-234",
    "instance": {
      "machineName": "CNC Mill"
    },
    "created_timestamp": "2023-06-27 20:00:29.603000 UTC"
  },
  {
    "instance_id": "6cfdf894-2fb6-4951-82c6-c4eada587529",
    "bucket_number": "1",
    "bucket_name": "machine",
    "bucket_version": "1",
    "natural_key": "m-234",
    "instance": {
      "machineName": "5 Axis CNC Mill"
    },
    "created_timestamp": "2023-06-28 20:00:29.632000 UTC"
  }
]

And given the following sample source message:

{
  "sensor": "rotation-speed-sensor",
  "machine": "m-234",
  "timestamp": "1687973092857",
  "value": 1200
}

And the following Whistle script:

package mde

[
    {
        tagName: $root.machine + "-" + $root.sensor;
        data: {
            numeric: $root.value;
        };
        timestamps: {
            eventTimestamp: $root.timestamp;
        }
        cloudMetadata: [{
            bucketReference: {
                bucketName: "machine";
                number: 1;
            }
            naturalKey: $root.machine
        }]
    }
]

Applying the Whistle script produces the following proto record:

[
  {
    "tagName": "m-234-rotation-speed-sensor",
    "data": {
      "numeric": 1200
    },
    "timestamps": {
      "eventTimestamp": "1687973092857"
    },
    "cloudMetadata": [
      {
        "bucketReference": {
          "bucketName": "machine",
          "number": 1
        },
        "naturalKey": "m-234"
      }
    ]
  }
]

After MDE finishes processing the proto record, it produces the following record (row) in BigQuery (assuming the BigQuery sink and metadata materialization are turned on):

[
  {
    "id": "6e008960-7f25-4418-899c-75b05c3d3186",
    "tag_name": "m-234-rotation-speed-sensor",
    "type_version": "1",
    "event_timestamp": "2023-06-28 17:24:52.526000 UTC",
    "value": "1200",
    "embedded_metadata": {},
    "materialized_cloud_metadata": {
      "machine": {
        "machineName": "5 Axis CNC Mill"
      }
    },
    "cloud_metadata_ref": {
      "machine": {
        "instance_id": "6cfdf894-2fb6-4951-82c6-c4eada587529"
      }
    },
    "event_timestamp": "2023-06-28 17:24:53.526000 UTC",
    "source_message_id": "8511923697775002"
  }
]
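
The lookup in this example can be sketched in a few lines of Python. This is an illustration only, not MDE's implementation: the function names are hypothetical, the bucket is modeled as an in-memory list, "latest" is assumed to mean the greatest created_timestamp, and returning None for an unknown natural key is an assumption about behavior the guide does not describe.

```python
from datetime import datetime

# In-memory copy of the two "machine" instances from the example above.
INSTANCES = [
    {
        "instance_id": "a614e25d-fded-41c3-9a56-3222cd30070d",
        "natural_key": "m-234",
        "instance": {"machineName": "CNC Mill"},
        "created_timestamp": "2023-06-27 20:00:29.603000 UTC",
    },
    {
        "instance_id": "6cfdf894-2fb6-4951-82c6-c4eada587529",
        "natural_key": "m-234",
        "instance": {"machineName": "5 Axis CNC Mill"},
        "created_timestamp": "2023-06-28 20:00:29.632000 UTC",
    },
]


def parse_ts(text):
    """Parse a timestamp such as '2023-06-28 20:00:29.632000 UTC'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f UTC")


def resolve_by_natural_key(instances, natural_key):
    """Return the most recently created instance for natural_key.

    Assumption: returns None when no instance has that natural key.
    """
    matches = [i for i in instances if i["natural_key"] == natural_key]
    if not matches:
        return None
    return max(matches, key=lambda i: parse_ts(i["created_timestamp"]))


latest = resolve_by_natural_key(INSTANCES, "m-234")
print(latest["instance_id"])  # 6cfdf894-2fb6-4951-82c6-c4eada587529
```

The instance created on 2023-06-28 wins, which is why the BigQuery row above references 6cfdf894-2fb6-4951-82c6-c4eada587529 and materializes "5 Axis CNC Mill".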

Resolving a metadata instance_id by instance value

Alternatively, you can associate a record with the latest metadata instance for a natural key by supplying the entire metadata instance object in the proto record in the parser. Supplying a natural key is optional. If you omit it, MDE infers the natural key based on the bucket type:

  • If you are supplying a metadata instance value for a tag bucket and omit the natural key, MDE automatically uses the tag name as the natural key.
  • If you are supplying a metadata instance value for a record bucket and omit the natural key, MDE automatically uses the hash value of the instance object as the natural key.
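
The two inference rules above can be sketched as follows. This is a hedged illustration: the function name and the "tag" / "record" string labels are hypothetical, and the guide does not say which hash MDE uses, so SHA-256 over a key-sorted JSON encoding is purely an assumption chosen to make the example deterministic.

```python
import hashlib
import json


def infer_natural_key(bucket_type, tag_name, instance, natural_key=None):
    """Return the natural key to use for a metadata instance supplied by value."""
    if natural_key is not None:
        # An explicitly supplied natural key always wins.
        return natural_key
    if bucket_type == "tag":
        # Tag bucket: fall back to the record's tag name.
        return tag_name
    if bucket_type == "record":
        # Record bucket: fall back to a hash of the instance object.
        # Assumption: SHA-256 of canonical JSON; MDE's real hash may differ.
        canonical = json.dumps(instance, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    raise ValueError(f"unknown bucket type: {bucket_type}")


print(infer_natural_key("tag", "s-446-pressure", {"machineName": "Stamping Machine 446"}))
# s-446-pressure
```

Sorting the keys before hashing means two instance objects with the same content but different key order map to the same natural key in this sketch.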

After you have constructed a proto record in the parser, MDE compares the provided metadata instance object with the most recent metadata instance object for the supplied natural key.

If the instance objects are identical, MDE inserts into the record the instance_id of the most recent metadata instance for the natural key.

If the instance objects are not identical, MDE uses the metadata instance supplied by the proto record to create a new metadata instance for the given natural key. MDE then inserts the instance_id of newly created instance into the record.
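
The compare-then-reuse-or-create decision can be sketched like this. The sketch is not MDE's implementation: the function name is hypothetical, the bucket is an in-memory list assumed to be ordered oldest to newest, equality is plain dictionary equality, and schema validation and the bucket's instanceOverwriteMode setting are ignored.

```python
import uuid


def resolve_by_value(instances, natural_key, supplied):
    """Return the instance_id to write into the record.

    Reuses the most recent instance when it equals the supplied object;
    otherwise appends a new instance and returns its freshly generated id.
    """
    matches = [i for i in instances if i["natural_key"] == natural_key]
    latest = matches[-1] if matches else None  # assumes oldest-to-newest order
    if latest is not None and latest["instance"] == supplied:
        return latest["instance_id"]
    new_instance = {
        "instance_id": str(uuid.uuid4()),
        "natural_key": natural_key,
        "instance": dict(supplied),
    }
    instances.append(new_instance)
    return new_instance["instance_id"]


bucket = [
    {
        "instance_id": "a614e25d-fded-41c3-9a56-3222cd30070d",
        "natural_key": "s-446-pressure",
        "instance": {"machineName": "Stamping M. 446"},
    },
    {
        "instance_id": "6cfdf894-2fb6-4951-82c6-c4eada587529",
        "natural_key": "s-446-pressure",
        "instance": {"machineName": "Stamping Machine 446"},
    },
]

same = resolve_by_value(bucket, "s-446-pressure", {"machineName": "Stamping Machine 446"})
print(same)  # 6cfdf894-2fb6-4951-82c6-c4eada587529
```

Only the most recent instance is compared. Supplying {"machineName": "Stamping M. 446"} here would create a third instance rather than reuse the older one, because it differs from the latest.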

Resolving a metadata instance_id by instance value is particularly useful when your edge sources deliver fully qualified messages. This method lets you reference an existing instance as well as create new metadata instances dynamically.

Referencing an existing instance by value

For example, given the following metadata instances in a tag metadata bucket called asset with bucket number 1:

[
  {
    "instance_id": "a614e25d-fded-41c3-9a56-3222cd30070d",
    "bucket_number": "1",
    "bucket_name": "asset",
    "bucket_version": "1",
    "natural_key": "s-446-pressure",
    "instance": {
      "machineName": "Stamping M. 446"
    },
    "created_timestamp": "2023-06-27 20:00:29.603000 UTC"
  },
  {
    "instance_id": "6cfdf894-2fb6-4951-82c6-c4eada587529",
    "bucket_number": "1",
    "bucket_name": "asset",
    "bucket_version": "1",
    "natural_key": "s-446-pressure",
    "instance": {
      "machineName": "Stamping Machine 446"
    },
    "created_timestamp": "2023-06-28 20:00:29.632000 UTC"
  }
]

And given the following source message:

{
  "sensor": "pressure",
  "machine": "s-446",
  "timestamp": "1687973092857",
  "machineName": "Stamping Machine 446",
  "value": 24
}

And the following Whistle script:

package mde

var tagName: $root.machine + "-" + $root.sensor;

[
    {
        tagName: tagName;
        data: {
            numeric: $root.value;
        };
        timestamps: {
            eventTimestamp: $root.timestamp;
        }
        cloudMetadata: [{
            bucketReference: {
                bucketName: "asset";
                number: 1;
            }
            instance: {
              // Optional. Because bucketReference points to a tag bucket, the natural key is inferred to be tagName if omitted.
              naturalKey: tagName
              attributes: {
                machineName: $root.machineName;
              }
            }
        }]
    }
]

Applying the Whistle script produces the following proto record:

[
  {
    "tagName": "s-446-pressure",
    "data": {
      "numeric": 24
    },
    "timestamps": {
      "eventTimestamp": "1687973092857"
    },
    "cloudMetadata": [
      {
        "bucketReference": {
          "bucketName": "asset",
          "number": 1
        },
        "naturalKey": "s-446-pressure",
        "attributes": {
          "machineName": "Stamping Machine 446"
        }
      }
    ]
  }
]

After MDE finishes processing the proto record, it produces the following record (row) in BigQuery (assuming the BigQuery sink and metadata materialization are turned on):

[
  {
    "id": "6e008960-7f25-4418-899c-75b05c3d3186",
    "tag_name": "s-446-pressure",
    "type_version": "1",
    "event_timestamp": "2023-06-28 17:24:52.526000 UTC",
    "value": "24",
    "embedded_metadata": {},
    "materialized_cloud_metadata": {
      "asset": {
        "machineName": "Stamping Machine 446"
      }
    },
    "cloud_metadata_ref": {
      "asset": {
        "instance_id": "6cfdf894-2fb6-4951-82c6-c4eada587529"
      }
    },
    "event_timestamp": "2023-06-28 17:24:53.526000 UTC",
    "source_message_id": "8511923697775002"
  }
]

Creating a new instance by value

If the supplied metadata instance in the proto record is not identical to the most recent metadata instance for the provided natural key, MDE creates a new instance.

You can configure how the new instance is created using the instanceOverwriteMode setting on buckets:

  • If instanceOverwriteMode is set to true, the new instance is created from the instance object supplied in the proto record. The new instance is stored in the metadata bucket if it passes the bucket's schema validation.
  • If instanceOverwriteMode is set to false, the new instance is created by merging the most recent instance for the natural key with the instance object supplied in the proto record. The resulting object is stored as a new metadata instance in the metadata bucket if it passes the bucket's schema validation.
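
The difference between the two modes can be sketched as follows. This is an assumption-laden illustration: the function name is hypothetical, the merge is modeled as a shallow merge in which supplied attributes win on conflict (the guide does not specify merge depth or precedence), schema validation is left out, and the site attribute is invented for the example.

```python
def build_new_instance(latest, supplied, instance_overwrite_mode):
    """Return the instance object that would be stored as the new instance."""
    if instance_overwrite_mode or latest is None:
        # Overwrite mode: the supplied object replaces the previous one entirely.
        return dict(supplied)
    # Merge mode (assumed shallow): start from the most recent instance and
    # let the supplied attributes override matching keys.
    merged = dict(latest)
    merged.update(supplied)
    return merged


# "site" is a hypothetical attribute used only to make the difference visible.
latest = {"machineName": "Stamping M. 446", "site": "plant-1"}
supplied = {"machineName": "Stamping Machine 446"}

print(build_new_instance(latest, supplied, True))
# {'machineName': 'Stamping Machine 446'}
print(build_new_instance(latest, supplied, False))
# {'machineName': 'Stamping Machine 446', 'site': 'plant-1'}
```

With overwrite mode on, attributes that the proto record omits are dropped from the new instance. With it off, they carry over from the most recent instance.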

For example, given the following metadata instance in a tag metadata bucket called asset with bucket number 1:

[
  {
    "instance_id": "a614e25d-fded-41c3-9a56-3222cd30070d",
    "bucket_number": "1",
    "bucket_name": "asset",
    "bucket_version": "1",
    "natural_key": "s-446-pressure",
    "instance": {
      "machineName": "Stamping M. 446"
    },
    "created_timestamp": "2023-06-27 20:00:29.603000 UTC"
  }
]

And the following source message:

{
  "sensor": "pressure",
  "machine": "s-446",
  "timestamp": "1687973092857",
  "machineName": "Stamping Machine 446",
  "value": 24
}

With the following Whistle script:

package mde

var tagName: $root.machine + "-" + $root.sensor;

[
    {
        tagName: tagName;
        data: {
            numeric: $root.value;
        };
        timestamps: {
            eventTimestamp: $root.timestamp;
        }
        cloudMetadata: [{
            bucketReference: {
                bucketName: "asset";
                number: 1;
            }
            instance: {
              // Optional. Because bucketReference points to a tag bucket, the natural key is inferred to be tagName if omitted.
              naturalKey: tagName
              attributes: {
                machineName: $root.machineName;
              }
            }
        }]
    }
]

Applying the Whistle script produces the following proto record:

[
  {
    "tagName": "s-446-pressure",
    "data": {
      "numeric": 24
    },
    "timestamps": {
      "eventTimestamp": "1687973092857"
    },
    "cloudMetadata": [
      {
        "bucketReference": {
          "bucketName": "asset",
          "number": 1
        },
        "naturalKey": "s-446-pressure",
        "attributes": {
          "machineName": "Stamping Machine 446"
        }
      }
    ]
  }
]

Because the supplied metadata instance is not identical to the most recent metadata instance for the natural key, MDE creates a new metadata instance. The metadata bucket now has two instances:

[
  {
    "instance_id": "a614e25d-fded-41c3-9a56-3222cd30070d",
    "bucket_number": "1",
    "bucket_name": "asset",
    "bucket_version": "1",
    "natural_key": "s-446-pressure",
    "instance": {
      "machineName": "Stamping M. 446"
    },
    "created_timestamp": "2023-06-27 20:00:29.603000 UTC"
  },
  {
    "instance_id": "6cfdf894-2fb6-4951-82c6-c4eada587529",
    "bucket_number": "1",
    "bucket_name": "asset",
    "bucket_version": "1",
    "natural_key": "s-446-pressure",
    "instance": {
      "machineName": "Stamping Machine 446"
    },
    "created_timestamp": "2023-06-28 20:00:29.632000 UTC"
  }
]

After MDE finishes processing the proto record, it produces the following record (row) in BigQuery (assuming the BigQuery sink and metadata materialization are turned on):

[
  {
    "id": "6e008960-7f25-4418-899c-75b05c3d3186",
    "tag_name": "s-446-pressure",
    "type_version": "1",
    "event_timestamp": "2023-06-28 17:24:52.526000 UTC",
    "value": "24",
    "embedded_metadata": {},
    "materialized_cloud_metadata": {
      "asset": {
        "machineName": "Stamping Machine 446"
      }
    },
    "cloud_metadata_ref": {
      "asset": {
        "instance_id": "6cfdf894-2fb6-4951-82c6-c4eada587529"
      }
    },
    "event_timestamp": "2023-06-28 17:24:53.526000 UTC",
    "source_message_id": "8511923697775002"
  }
]