Jump to Content
Data Analytics

Introducing new SQL functions to manipulate your JSON data in BigQuery

August 8, 2023
https://storage.googleapis.com/gweb-cloudblog-publish/images/aiml2022_PO1vxqJ.max-2500x2500.jpg
Tristan Li

Principal Architect, Google Cloud

Diksha Gulati

Customer Engineer, Google Cloud

Enterprises are generating data at an exponential rate, spanning traditional structured transactional data, semi-structured like JSON and unstructured data like images and audio. Beyond the scale of the data, these divergent types present processing challenges for developers, at times requiring separate processing flows for each. With its initial release BigQuery’s support for semi-structured JSON eliminated the need for such complex preprocessing and providing schema flexibility, intuitive querying and the scalability benefits afforded to structured data. 

Today, we are excited to announce the release of new SQL functions for BigQuery JSON, extending the power and flexibility of our core JSON support. These functions make it even easier to extract and construct JSON data and perform complex data analysis.

With these new query functions, you can:

Let's review these new features and some examples of how to use them. First, we will create a table for demonstration.

Loading...

Loading...

https://storage.googleapis.com/gweb-cloudblog-publish/images/1_Create_TABLE.max-2000x2000.jpg

Great! Let’s say we want to get a list of all users over 30. Looking at the table, you will see that user.age contains a JSON number in the first record, a JSON string in the second, and a JSON null in the third. With the new powerful LAX function, LAX_INT64, all types are automatically inferred and processed correctly.

Loading...

https://storage.googleapis.com/gweb-cloudblog-publish/images/2_LAX_COMPARISON.max-2000x2000.jpg

Unlike the “strict” conversion functions, which require that the JSON type matches the primitive type exactly, the “lax” conversion functions will also handle conversions between mismatched data types. 

For example, the strict conversion function below would return an error:

Loading...

https://storage.googleapis.com/gweb-cloudblog-publish/images/3_ERROR.max-2000x2000.jpg

However, the lax conversion function below would return the desired result:

Loading...

https://storage.googleapis.com/gweb-cloudblog-publish/images/4_LAX_INT64_CONVERSION.max-2000x2000.jpg

Furthermore, you can quickly and easily remove a field in the JSON data by using JSON_REMOVE functions.

Loading...

After running the above function the table is updated with the following contents:

https://storage.googleapis.com/gweb-cloudblog-publish/images/5_JSON_REMOVE.max-2000x2000.jpg

JSON_STRIP_NULLS compresses the data by removing JSON nulls. Although BigQuery null values neither impact performance nor storage cost, it can be helpful for reducing data size during exports.

Loading...

After running the above function the table is updated with the following contents:

https://storage.googleapis.com/gweb-cloudblog-publish/images/6_JSON_STRIP_NULLS.max-2000x2000.jpg

Now, what if we want to modify or add a field to the JSON data? 

You can now update the data with the new JSON_SET function. And you have the ability to mix and match JSON functions together to achieve desired results. 

For example, the query below adds a new field, "region_code", to the table. The value of the field will be "America" if the value of the "country" field is "USA", and "Other" if it is not.

Loading...

After running the above function the table is updated with the following contents:

https://storage.googleapis.com/gweb-cloudblog-publish/images/7_JSON_SET.max-2000x2000.jpg

Last but not least, let’s say you have a table of property/value pairs you want to convert to a JSON object. With the new JSON_OBJECT constructor function, you can effortlessly create the new JSON object.

Loading...

https://storage.googleapis.com/gweb-cloudblog-publish/images/8_JSON_OBJECT.max-2000x2000.jpg

Complete list of functions

Lax conversion functions:

JSON constructor functions:

JSON mutator functions:

Try it out!

Google BigQuery is constantly adding new features to make it easier and more powerful to analyze your data. We encourage you to check them out and provide your feedback to us as we continue to develop additional features and capabilities to make working JSON easier and faster over time.

Posted in