Working with JSON data

The JSON data type is a semi-structured data type used for holding JSON (JavaScript Object Notation) data. The specifications for the JSON format are described in RFC 7159.

JSON is useful to supplement a relational schema for data that is sparse or has a loosely-defined or changing structure. However, the query optimizer relies on the relational model to efficiently filter, join, aggregate, and sort at scale. Queries over JSON will have fewer built-in optimizations and fewer affordances to inspect and tune performance.

Specifications

Spanner's JSON type stores a normalized representation of the input JSON document.

  • JSON can be nested to a maximum of 100 levels.
  • Whitespace is not preserved.
  • Comments are not supported. Transactions or queries with comments will fail.
  • Members of a JSON object are sorted lexicographically.
  • JSON array elements have their order preserved.
  • If a JSON object has duplicate keys, only the first one is preserved.
  • Primitive types (string, boolean, number, and null) have their type and value preserved.
    • String type values are preserved exactly.
    • Number type values are preserved, but may have their textual representation changed as a result of the normalization process. For example, an input number of 10000 may have a normalized representation of 1e+4. Number value preservation semantics are as follows:
      • Signed integers in the range of [INT64_MIN, INT64_MAX] are preserved.
      • Unsigned integers in the range of [0, UINT64_MAX] are preserved.
      • Double values that can be roundtripped from string to double to string without precision loss are preserved. If a double value cannot round trip in this manner, the transaction or query fails.
        • For example, SELECT JSON '2.2412421353246235436' fails.
        • A functional workaround is PARSE_JSON('2.2412421353246235436', wide_number_mode='round'), which returns JSON '2.2412421353246237'.

The maximum permitted size of the normalized document is 2621440 of UTF-8 characters.

Nullability

JSON null values are treated as SQL non-NULL.

For example:

SELECT (JSON '{"a":null}').a IS NULL; -- Returns FALSE
SELECT (JSON '{"a":null}').b IS NULL; -- Returns TRUE

SELECT JSON_QUERY(JSON '{"a":null}', "$.a"); -- Returns a JSON 'null'
SELECT JSON_QUERY(JSON '{"a":null}', "$.b"); -- Returns a SQL NULL

Encoding

JSON documents must be encoded in UTF-8. Transactions or queries with JSON documents encoded in other formats return an error.

Create a table with a JSON column

A JSON column can be added to a table when the table is created. JSON columns can be nullable.

CREATE TABLE table_name(
  [key columns],
  column_name JSON,
  [other columns],
) PRIMARY KEY([key column names]);

Add and remove JSON columns from existing tables

A JSON column can also be added to and dropped from existing tables.

ALTER TABLE table_name ADD COLUMN column_name JSON;
ALTER TABLE table_name DROP COLUMN column_name;

The following sample shows how to add a JSON column called VenueDetails to the Venus table using the Cloud Spanner client libraries.

C#


using Google.Cloud.Spanner.Data;
using System;
using System.Threading.Tasks;

public class AddJsonColumnAsyncSample
{
    public async Task AddJsonColumnAsync(string projectId, string instanceId, string databaseId)
    {
        string connectionString = $"Data Source=projects/{projectId}/instances/{instanceId}/databases/{databaseId}";
        string alterStatement = "ALTER TABLE Venues ADD COLUMN VenueDetails JSON";

        using var connection = new SpannerConnection(connectionString);
        using var updateCmd = connection.CreateDdlCommand(alterStatement);
        await updateCmd.ExecuteNonQueryAsync();
        Console.WriteLine("Added the VenueDetails column.");
    }
}

Java


import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.spanner.DatabaseAdminClient;
import com.google.cloud.spanner.Spanner;
import com.google.cloud.spanner.SpannerOptions;
import com.google.common.collect.ImmutableList;
import com.google.spanner.admin.database.v1.UpdateDatabaseDdlMetadata;
import java.util.concurrent.ExecutionException;

class AddJsonColumnSample {

  static void addJsonColumn() throws InterruptedException, ExecutionException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "my-project";
    String instanceId = "my-instance";
    String databaseId = "my-database";

    try (Spanner spanner =
        SpannerOptions.newBuilder().setProjectId(projectId).build().getService()) {
      DatabaseAdminClient adminClient = spanner.getDatabaseAdminClient();
      addJsonColumn(adminClient, instanceId, databaseId);
    }
  }

  static void addJsonColumn(DatabaseAdminClient adminClient, String instanceId, String databaseId)
      throws InterruptedException, ExecutionException {
    OperationFuture<Void, UpdateDatabaseDdlMetadata> operation =
        adminClient.updateDatabaseDdl(
            instanceId,
            databaseId,
            ImmutableList.of("ALTER TABLE Venues ADD COLUMN VenueDetails JSON"),
            null);
    // Wait for the operation to finish.
    // This will throw an ExecutionException if the operation fails.
    operation.get();
    System.out.printf("Successfully added column `VenueDetails`%n");
  }
}

Node.js


// Imports the Google Cloud client library.
const {Spanner} = require('@google-cloud/spanner');

/**
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// const projectId = 'my-project-id';
// const instanceId = 'my-instance';
// const databaseId = 'my-database';

// Creates a client
const spanner = new Spanner({
  projectId: projectId,
});

// Gets a reference to a Cloud Spanner instance.
const instance = spanner.instance(instanceId);
const database = instance.database(databaseId);

const request = ['ALTER TABLE Venues ADD COLUMN VenueDetails JSON'];

// Alter existing table to add a column.
const [operation] = await database.updateSchema(request);

console.log(`Waiting for operation on ${databaseId} to complete...`);

await operation.promise();

console.log(
  `Added VenueDetails column to Venues table in database ${databaseId}.`
);

Modify JSON data

The following sample shows how update JSON data using the Cloud Spanner client libraries.

C#


using Google.Cloud.Spanner.Data;
using Newtonsoft.Json;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public class UpdateDataWithJsonAsyncSample
{
    public class Venue
    {
        public int VenueId { get; set; }
        public string VenueDetails { get; set; }
    }

    public async Task UpdateDataWithJsonAsync(string projectId, string instanceId, string databaseId)
    {
        List<Venue> venues = new List<Venue>
        {
            // If you are using .NET Core 3.1 or later, you can use System.Text.Json for serialization instead.
            new Venue
            {
                VenueId = 19,
                VenueDetails = JsonConvert.SerializeObject(new
                {
                    rating = 9,
                    open = true,
                })
            },
            new Venue
            {
                VenueId = 4,
                VenueDetails = JsonConvert.SerializeObject(new object[]
                {
                    new
                    {
                        name = "room 1",
                        open = true,
                    },
                    new
                    {
                        name = "room 2",
                        open = false,
                    },
                })
            },
            new Venue
            {
                VenueId = 42,
                VenueDetails = JsonConvert.SerializeObject(new
                {
                    name = "Central Park",
                    open = new
                    {
                        Monday = true,
                        Tuesday = false,
                    },
                    tags = new string[] {"large", "airy" },
                }),
            },
        };
        // Create connection to Cloud Spanner.
        string connectionString = $"Data Source=projects/{projectId}/instances/{instanceId}/databases/{databaseId}";
        using var connection = new SpannerConnection(connectionString);
        await connection.OpenAsync();

        await Task.WhenAll(venues.Select(venue =>
        {
            // Update rows in the Venues table.
            using var cmd = connection.CreateUpdateCommand("Venues", new SpannerParameterCollection
            {
                    { "VenueId", SpannerDbType.Int64, venue.VenueId },
                    { "VenueDetails", SpannerDbType.Json, venue.VenueDetails }
            });
            return cmd.ExecuteNonQueryAsync();
        }));

        Console.WriteLine("Data updated.");
    }
}

Java

import com.google.cloud.spanner.DatabaseClient;
import com.google.cloud.spanner.DatabaseId;
import com.google.cloud.spanner.Mutation;
import com.google.cloud.spanner.Spanner;
import com.google.cloud.spanner.SpannerOptions;
import com.google.cloud.spanner.Value;
import com.google.common.collect.ImmutableList;

class UpdateJsonDataSample {

  static void updateJsonData() {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "my-project";
    String instanceId = "my-instance";
    String databaseId = "my-database";

    try (Spanner spanner =
        SpannerOptions.newBuilder().setProjectId(projectId).build().getService()) {
      DatabaseClient client =
          spanner.getDatabaseClient(DatabaseId.of(projectId, instanceId, databaseId));
      updateJsonData(client);
    }
  }

  static void updateJsonData(DatabaseClient client) {
    client.write(
        ImmutableList.of(
            Mutation.newInsertOrUpdateBuilder("Venues")
                .set("VenueId")
                .to(4L)
                .set("VenueDetails")
                .to(
                    Value.json(
                        "[{\"name\":\"room 1\",\"open\":true},"
                            + "{\"name\":\"room 2\",\"open\":false}]"))
                .build(),
            Mutation.newInsertOrUpdateBuilder("Venues")
                .set("VenueId")
                .to(19L)
                .set("VenueDetails")
                .to(Value.json("{\"rating\":9,\"open\":true}"))
                .build(),
            Mutation.newInsertOrUpdateBuilder("Venues")
                .set("VenueId")
                .to(42L)
                .set("VenueDetails")
                .to(
                    Value.json(
                        "{\"name\":null,"
                            + "\"open\":{\"Monday\":true,\"Tuesday\":false},"
                            + "\"tags\":[\"large\",\"airy\"]}"))
                .build()));
    System.out.println("Venues successfully updated");
  }
}

Node.js


// Imports the Google Cloud client library.
const {Spanner} = require('@google-cloud/spanner');

/**
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// const projectId = 'my-project-id';
// const instanceId = 'my-instance';
// const databaseId = 'my-database';

// Creates a client.
const spanner = new Spanner({
  projectId: projectId,
});

// Gets a reference to a Cloud Spanner instance and database.
const instance = spanner.instance(instanceId);
const database = instance.database(databaseId);

// Instantiate Spanner table objects.
const venuesTable = database.table('Venues');

const data = [
  {
    VenueId: '19',
    VenueDetails: {rating: 9, open: true},
    LastUpdateTime: 'spanner.commit_timestamp()',
  },
  {
    VenueId: '4',
    // VenueDetails must be specified as a string, as it contains a top-level
    // array of objects that should be inserted into a JSON column. If we were
    // to specify this value as an array instead of a string, the client
    // library would encode this value as ARRAY<JSON> instead of JSON.
    VenueDetails: `[
      {
        "name": null,
        "open": true
      },
      {
        "name": "room 2",
        "open": false
      },
      {
        "main hall": {
          "description": "this is the biggest space",
          "size": 200
        }
      }
    ]`,
    LastUpdateTime: 'spanner.commit_timestamp()',
  },
  {
    VenueId: '42',
    VenueDetails: {
      name: null,
      open: {
        Monday: true,
        Tuesday: false,
      },
      tags: ['large', 'airy'],
    },
    LastUpdateTime: 'spanner.commit_timestamp()',
  },
];
// Updates rows in the Venues table.
try {
  await venuesTable.update(data);
  console.log('Updated data.');
} catch (err) {
  console.error('ERROR:', err);
} finally {
  // Close the database when finished.
  database.close();
}

Query JSON data

The following sample shows how to query JSON data using the Cloud Spanner client libraries.

C#


using Google.Cloud.Spanner.Data;
using Newtonsoft.Json;
using System.Collections.Generic;
using System.Threading.Tasks;

public class QueryDataWithJsonParameterAsyncSample
{
    public class Venue
    {
        public int VenueId { get; set; }
        public string VenueDetails { get; set; }
    }

    public async Task<List<Venue>> QueryDataWithJsonParameterAsync(string projectId, string instanceId, string databaseId)
    {
        string connectionString = $"Data Source=projects/{projectId}/instances/{instanceId}/databases/{databaseId}";

        using var connection = new SpannerConnection(connectionString);
        // If you are using .NET Core 3.1 or later, you can use System.Text.Json for serialization instead.
        var jsonValue = JsonConvert.SerializeObject(new { rating = 9 });
        // Get all venues with rating 9.
        using var cmd = connection.CreateSelectCommand(
            @"SELECT VenueId, VenueDetails
              FROM Venues
              WHERE JSON_VALUE(VenueDetails, '$.rating') = JSON_VALUE(@details, '$.rating')",
            new SpannerParameterCollection
            {
                { "details", SpannerDbType.Json, jsonValue }
            });

        var venues = new List<Venue>();
        using var reader = await cmd.ExecuteReaderAsync();
        while (await reader.ReadAsync())
        {
            venues.Add(new Venue
            {
                VenueId = reader.GetFieldValue<int>("VenueId"),
                VenueDetails = reader.GetFieldValue<string>("VenueDetails")
            });
        }
        return venues;
    }
}

Java


import com.google.cloud.spanner.DatabaseClient;
import com.google.cloud.spanner.DatabaseId;
import com.google.cloud.spanner.ResultSet;
import com.google.cloud.spanner.Spanner;
import com.google.cloud.spanner.SpannerOptions;
import com.google.cloud.spanner.Statement;
import com.google.cloud.spanner.Value;

class QueryWithJsonParameterSample {

  static void queryWithJsonParameter() {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "my-project";
    String instanceId = "my-instance";
    String databaseId = "my-database";

    try (Spanner spanner =
        SpannerOptions.newBuilder().setProjectId(projectId).build().getService()) {
      DatabaseClient client =
          spanner.getDatabaseClient(DatabaseId.of(projectId, instanceId, databaseId));
      queryWithJsonParameter(client);
    }
  }

  static void queryWithJsonParameter(DatabaseClient client) {
    String exampleJson = "{rating: 9}";
    Statement statement =
        Statement.newBuilder(
                "SELECT VenueId, VenueDetails\n"
                    + "FROM Venues\n"
                    + "WHERE JSON_VALUE(VenueDetails, '$.rating') = "
                    + "JSON_VALUE(@details, '$.rating')")
            .bind("details")
            .to(Value.json(exampleJson))
            .build();
    try (ResultSet resultSet = client.singleUse().executeQuery(statement)) {
      while (resultSet.next()) {
        System.out.printf(
            "VenueId: %s, VenueDetails: %s%n",
            resultSet.getLong("VenueId"), resultSet.getString("VenueDetails"));
      }
    }
  }
}

Node.js


// Imports the Google Cloud client library.
const {Spanner} = require('@google-cloud/spanner');

/**
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// const projectId = 'my-project-id';
// const instanceId = 'my-instance';
// const databaseId = 'my-database';

// Creates a client
const spanner = new Spanner({
  projectId: projectId,
});

// Gets a reference to a Cloud Spanner instance and database.
const instance = spanner.instance(instanceId);
const database = instance.database(databaseId);

const fieldType = {
  type: 'json',
};

const jsonValue = {rating: 9};

const query = {
  sql: `SELECT VenueId, VenueDetails FROM Venues
          WHERE JSON_VALUE(VenueDetails, '$.rating') = JSON_VALUE(@details, '$.rating')`,
  params: {
    details: jsonValue,
  },
  types: {
    details: fieldType,
  },
};

// Queries rows from the Venues table.
try {
  const [rows] = await database.run(query);

  rows.forEach(row => {
    const json = row.toJSON();
    console.log(
      `VenueId: ${json.VenueId}, Details: ${JSON.stringify(
        json.VenueDetails
      )}`
    );
  });
} catch (err) {˜
  console.error('ERROR:', err);
} finally {
  // Close the database when finished.
  database.close();
}

Indexing

JSON type columns do not support indexing. However, an index can be created on a generated column that extracts a scalar value from a JSON column. For example:

CREATE TABLE table_name(
  [key columns],
  json_column_name JSON,
  stored_gen_column_name STRING(MAX) AS (JSON_VALUE(json_column_name, json_path)) STORED
) PRIMARY KEY([key column names]);

CREATE INDEX index_name ON table_name(stored_gen_column_name);

Where:

  • json_path is a STRING value in JSONPath format.

In addition, JSON column types cannot be used in ORDER BY.