Google Cloud Big Data and Machine Learning Blog

Innovation in data processing and machine learning technology

Pre-built Cloud Dataflow templates: KISS for data movement

Thursday, March 22, 2018

Get started with simple templates for Cloud Dataflow. Learn how to do simple per-element filters and transforms in JavaScript.

Public datasets: how nonprofits can drive social impact with planetary-scale data

Wednesday, March 21, 2018

BigQuery Public Datasets, plus Kaggle Datasets and Kernels, make massive datasets available to nonprofits, so that they can help solve global problems.

Joining and shuffling very large datasets using Cloud Dataflow

Tuesday, March 20, 2018

Learn how to use Cloud Dataflow on tera-scale datasets to shuffle and join efficiently. We also describe Dataflow pricing adjustments in greater detail.

Predicting community engagement on Reddit using TensorFlow, GDELT, and Cloud Dataflow: Part 2

Monday, March 19, 2018

In part 2 of a 3-part series, learn how to use NLP, TensorFlow, GDELT, and Cloud Dataflow to automatically predict subreddit categorization of news posts.

Predicting community engagement on Reddit using TensorFlow, GDELT, and Cloud Dataflow: Part 1

Monday, March 19, 2018

In part 1 of a 3-part series, learn how to use NLP, TensorFlow, GDELT, and Cloud Dataflow to automatically predict subreddit categorization of news posts.

Hyperparameter tuning on Google Cloud Platform is now faster and smarter

Wednesday, March 14, 2018

Learn how to save time and budget while tuning your TensorFlow model's hyperparameters. Eliminate retraced steps, and minimize your training time.

The switch to self-service marketing analytics at zulily: best practices for using Tableau with BigQuery

Monday, March 12, 2018

zulily explains best practices for building a marketing analytics workflow with Tableau and BigQuery.

Comparing regression and classification on US elections data with TensorFlow Estimators

Wednesday, March 7, 2018

Apply two fundamental ML techniques, regression and classification, to the Elections 2016 dataset from Kaggle. Discover demographic voting trends.

How Color uses the new Variant Transforms tool for breakthrough clinical data science with BigQuery

Monday, March 5, 2018

Google Cloud customer Color explains how Variant Transforms enables novel genomics and clinical conclusions from within BigQuery.

Cloud poetry: training and hyperparameter tuning custom text models on Cloud ML Engine

Wednesday, February 28, 2018

Learn how to train a TensorFlow model to suggest the next line of poetry using Tensor2Tensor on Cloud ML Engine.

Google Cloud and NCAA® team up for a unique March Madness® competition hosted on Kaggle

Tuesday, February 27, 2018

Google Cloud and NCAA® are announcing the annual March Madness Machine Learning Competition on Kaggle, which helps you predict a winning bracket with AI.

How to handle mutating JSON schemas in a streaming pipeline, with Square Enix

Monday, February 26, 2018

Learn to process mutating JSON with Cloud Pub/Sub, Cloud Dataflow, and BigQuery. Square Enix engineers explain how they handle a changing game dataflow.

Practice makes perfect: the Professional Data Engineer Practice Exam is now live

Wednesday, February 14, 2018

Google Cloud Certified now offers an online practice exam designed to help Professional Data Engineer exam takers check their test readiness.

Easy distributed training with TensorFlow using tf.estimator.train_and_evaluate on Cloud ML Engine

Friday, February 9, 2018

Learn how to quickly and simply distribute your training workload with TensorFlow 1.4 and Cloud Machine Learning Engine.

Bitcoin in BigQuery: blockchain analytics on public data

Thursday, February 8, 2018

Learn how to access the Bitcoin blockchain via a new public Google BigQuery dataset. Learn how to visualize transactions in Google Data Studio.

How to process weather satellite data in real-time in BigQuery

Wednesday, January 31, 2018

Learn how to analyze historical GOES-16 geostationary weather satellite data in BigQuery, and visualize a real-time feed, to understand weather events.

Updating Cloud Dataproc for faster speeds and more resiliency

Friday, January 26, 2018

Cloud Dataproc now allows you to designate 3 master nodes for resiliency, and you can now assign persistent SSDs for added speed.

Keys to faster sampling in Cloud Dataflow

Wednesday, January 24, 2018

Learn how to sample faster in Cloud Dataflow in a new white paper, which focuses on building a composite transform, while preserving specific attributes.

A guide to machine learning for the chronically curious: ML Explorer

Monday, January 22, 2018

Introducing the ML Explorer series, which provides a tour of the machine learning landscape for application developers who want to build ML using APIs.

Problem-solving with ML: automatic document classification

Wednesday, January 10, 2018

Learn how to build a machine learning-based document classifier by exploring this scikit-learn-based Colab notebook and the BBC news public dataset.

Improving the efficiency of your helpdesk with serverless machine learning

Thursday, December 28, 2017

Learn how to augment user-submitted data in helpdesk tickets to reduce completion time, using the Natural Language Processing API and Cloud ML Engine.

Busting 12 myths about BigQuery

Friday, December 22, 2017

Drawing on our experience with a variety of customers, enterprises and startups alike, we debunk 12 common BigQuery myths.

New in TensorFlow 1.4: converting a Keras model to a TensorFlow Estimator

Monday, December 18, 2017

Learn how to convert a Keras model into a TensorFlow Estimator, using a text classifier as an example. This conversion is newly possible in TensorFlow 1.4.

Bringing Cloud ML Engine to more developers with online prediction features and reduced prices

Thursday, December 14, 2017

We're announcing price reductions for Google Compute Engine GPU instances and Cloud ML Engine while enabling online prediction as a service.

Real-time forecasts in the cloud: from market feed capture to ML predictions

Thursday, December 14, 2017

Learn how to set up a price prediction engine using the Thomson Reuters FX live data feed, the TensorFlow Estimator object, and Cloud Datalab.
