Google Cloud Big Data and Machine Learning Blog

Innovation in data processing and machine learning technology

Now live in Tokyo: using TensorFlow to predict taxi demand

Wednesday, April 25, 2018

Learn how NTT DOCOMO predicts taxi demand in Tokyo with TensorFlow, Cloud ML Engine, and HyperTune.

BigQuery lazy data loading: SQL data languages (DDL and DML), partitions, and half a trillion Wikipedia pageviews

Wednesday, April 11, 2018

Learn how to use new BigQuery features to query federated tables, and work with DDL, DML, data partitions, and a massive Wikipedia data set.

Serving real-time scikit-learn and XGBoost predictions

Thursday, April 5, 2018

Cloud ML Engine now supports scikit-learn and XGBoost in beta. Learn how to start using online prediction with these two additional frameworks.

Stretching Elastic’s capabilities with historical analysis, backups, and cross-cloud monitoring on Google Cloud Platform

Wednesday, April 4, 2018

Elastic is partnering with Google Cloud Platform to enable Elasticsearch and X-Pack functionality, as well as BigQuery and Stackdriver integration.

Using BigDL for deep learning with Apache Spark and Google Cloud Dataproc

Tuesday, April 3, 2018

Learn how to use Intel's BigDL to scale machine learning workloads with Apache Spark across multiple nodes and try out this workflow on the MNIST dataset.

Architecting live NCAA predictions: from archives to insights

Friday, March 30, 2018

Learn how the NCAA uses Google Cloud to build a predictive data analytics workflow that helps it get real-time insights from college hoops game data.

Simplifying machine learning on open hybrid clouds with Kubeflow

Thursday, March 29, 2018

Cisco and Google Cloud are now collaborating to provide a hybrid architecture for Kubeflow, permitting flexible transfer of TensorFlow jobs to the cloud.

Predicting community engagement on Reddit using TensorFlow, GDELT, and Cloud Dataflow: Part 3

Thursday, March 29, 2018

In part 3 of a 3-part series, learn how to use NLP, TensorFlow, GDELT, and Cloud Dataflow to automatically predict subreddit categorization of news posts.

Testing future Apache Spark releases and changes on Google Kubernetes Engine and Cloud Dataproc

Wednesday, March 28, 2018

Learn how to test out upcoming changes and versions of Apache Spark in Google Kubernetes Engine, preferably on test rather than production data.

How Tokopedia modernized its data warehouse and analytics processes with BigQuery and Cloud Dataflow

Tuesday, March 27, 2018

Learn how Tokopedia, an Indonesian online marketplace, converted to a Google Cloud data warehouse and analytics platform with BigQuery and Cloud Dataflow.

AutoML Vision in action: from ramen to branded goods

Monday, March 26, 2018

Learn how AutoML Vision identifies which Tokyo ramen shop a bowl of ramen came from based on its photo, and how Mercari identifies the brands sold on its marketplace app.

Pre-built Cloud Dataflow templates: KISS for data movement

Thursday, March 22, 2018

Get started with simple templates for Cloud Dataflow. Learn how to do simple per-element filters and transforms in JavaScript.

Public datasets: how nonprofits can drive social impact with planetary-scale data

Wednesday, March 21, 2018

BigQuery Public Datasets, plus Kaggle Datasets and Kernels, make massive datasets available to nonprofits, so that they can help solve global problems.

Joining and shuffling very large datasets using Cloud Dataflow

Tuesday, March 20, 2018

Learn how to use Cloud Dataflow on tera-scale datasets to shuffle and join efficiently. We also describe Dataflow pricing adjustments in greater detail.

Predicting community engagement on Reddit using TensorFlow, GDELT, and Cloud Dataflow: Part 2

Monday, March 19, 2018

In part 2 of a 3-part series, learn how to use NLP, TensorFlow, GDELT, and Cloud Dataflow to automatically predict subreddit categorization of news posts.

Predicting community engagement on Reddit using TensorFlow, GDELT, and Cloud Dataflow: Part 1

Monday, March 19, 2018

In part 1 of a 3-part series, learn how to use NLP, TensorFlow, GDELT, and Cloud Dataflow to automatically predict subreddit categorization of news posts.

Hyperparameter tuning on Google Cloud Platform is now faster and smarter

Wednesday, March 14, 2018

Learn how to save time and budget while tuning your TensorFlow model's hyperparameters. Eliminate retraced steps, and minimize your training time.

The switch to self-service marketing analytics at zulily: best practices for using Tableau with BigQuery

Monday, March 12, 2018

zulily explains best practices for building a marketing analytics workflow with Tableau and BigQuery.

Comparing regression and classification on US elections data with TensorFlow Estimators

Wednesday, March 7, 2018

Apply two fundamental ML techniques, regression and classification, to the Elections 2016 dataset from Kaggle. Discover demographic voting trends.

How Color uses the new Variant Transforms tool for breakthrough clinical data science with BigQuery

Monday, March 5, 2018

Google Cloud customer Color explains how Variant Transforms enables novel genomics and clinical conclusions from within BigQuery.

Cloud poetry: training and hyperparameter tuning custom text models on Cloud ML Engine

Wednesday, February 28, 2018

Learn how to train a TensorFlow model to suggest the next line of poetry using Tensor2Tensor on Cloud ML Engine.

Google Cloud and NCAA® team up for a unique March Madness® competition hosted on Kaggle

Tuesday, February 27, 2018

Google Cloud and NCAA® are announcing the annual March Madness Machine Learning Competition on Kaggle, which helps you predict a winning bracket with AI.

How to handle mutating JSON schemas in a streaming pipeline, with Square Enix

Monday, February 26, 2018

Learn to process mutating JSON with Cloud Pub/Sub, Cloud Dataflow, and BigQuery. Square Enix engineers explain how they handle a changing game dataflow.

Practice makes perfect: the Professional Data Engineer Practice Exam is now live

Wednesday, February 14, 2018

Google Cloud Certified now offers an online practice exam designed to help Professional Data Engineer exam takers check their test readiness.

Easy distributed training with TensorFlow using tf.estimator.train_and_evaluate on Cloud ML Engine

Friday, February 9, 2018

Learn how to quickly and simply distribute your training workload with TensorFlow 1.4 and Cloud Machine Learning Engine.
