Posted by Dan Delorey, Technical Lead and Bosco Zubiaga, Technical Product Marketing Manager, Google Cloud Platform
BigQuery is Google Cloud Platform’s serverless analytics data warehouse. It's used by thousands of companies — both big and small — to store, understand, and analyze large amounts of data. Today, we’re announcing a host of new features that make BigQuery more compatible with traditional big data workflows:
Standard SQL Beta
If you're familiar with the SQL standard or have used another standard-compliant SQL engine, you’ll feel right at home with the Beta of BigQuery’s revamped SQL dialect. Notable features that we’ve added with this new dialect include:
- More advanced query planning and optimization: BigQuery now provides very robust decorrelation, which allows you to write complex subqueries in any clause of your SQL statement (SELECT, FROM, WHERE, etc.)
- A richer type system with fully composable types: In addition to the existing data types BigQuery users are used to, we’ve added dates, times, arrays and structs, as well as additional support for timestamps
- Extended JOIN support: BigQuery now supports Theta JOIN, which offers the ability to use inequalities in your join key comparisons, as well as arbitrary expressions as JOIN conditions.
Identity & Access Management (IAM) Beta
Now that BigQuery supports Standard SQL, you’ll have more and more teams in your company requesting access to BigQuery projects. Earlier this year we announced Cloud IAM for Cloud Platform. We’re now making IAM available for BigQuery as well, in Beta. This feature is currently being rolled out, so if you don’t see it today you can expect it enabled on your projects this month — BigQuery roles will be made available in Google Cloud’s “IAM & Admin” control panel.
Cloud IAM fully automates managing permissions for BigQuery projects. Administrators can define fine-grained security policies they want for their users and resources. For example, let’s say your marketing team needs to query data, but doesn’t need to modify or upload data to get their job done. With Cloud IAM and Google accounts, this process is a breeze. Administrators can create several roles, such as viewer, editor, user or owner.
Time based table partitioning
Time based partitioning makes it easy and cost-effective for you to manage your data and write queries that span multiple days, months or years. You can now create tables with time-based partitions — you load the data, and BigQuery will automatically put it in the right partition.
Querying partitioned tables is easy — you simply provide a date or range of dates you want, and BigQuery will limit data processed to the time interval specified.
Imagine that you had a marketing campaign that spanned the length of 3 months. Here's an example of a query to analyze logs from the past 3 months. You no longer have to manually partition your tables beforehand — it just works!
BigQuery only scans the data in the specified partitions, rather than the whole table. Not only is this query more performant, it also costs you a fraction of the cost of querying the full table!
BigQuery and Kabam
Businesses come to Cloud Platform for our unique data analytics platform, and today we’ll show you one such example. Kabam is an international entertainment company that produces free-to-play video games for mobile devices. Kabam’s games include Marvel: Contest of Champions, Fast & Furious: Legacy, and Star Wars: Uprising. Their average gamer enjoys more than two hours of gameplay a day! Data analytics is critical to Kabam’s ability to keep gamers engaged in a highly competitive marketplace.
Marvel: Contest of Champions became wildly successful after its release in December of 2014. With big popularity comes big data — in-game events and logs were generating well over 1 TB of data per day, and their previous data warehouse was not able to handle this sudden scale. As a result, Kabam had no way to gather insights about the game for several weeks. Costas Piliotis, lead analyst on the game, had used BigQuery at a previous company and suggested the team to give it a try. In six days they were able to ingest hundreds of terabytes of past game telemetry data. More importantly, they could now query data over the lifetime of the entire game.
Thanks to BigQuery, Kabam is able to store and understand every single player action within their games, from game monetization to fraud, from production bugs to level-completion statistics. Now that the data lives in BigQuery, Costas and his team are most excited about applying Cloud Platform’s Machine Learning tools to the data to make their games even more entertaining.
Here in the BigQuery team, we often hear stories like this one and immediately the question that follows from people who've never tried BigQuery is, “How does this compare to my current data warehouse?”
Two weeks ago we shared a summary of a great series of experiments on cloud data warehouse solutions performed by Mark Litwintschik where BigQuery really stood out.
We’ve made this short video to explain why BigQuery is unique, which we hope will inspire you to give it a try.