Analyze and Strategize More Intelligently

Google Cloud Platform (GCP) gives data scientists the key technology and tools to extract tangible business value from massive data assets. From managed Spark clusters and fast SQL analysis to the latest in machine learning, Google Cloud Platform empowers data scientists to spend more time finding value in data and less time worrying about infrastructure. Whether the task at hand is tactical optimization, predictive analytics, nuanced learning, recommendation engines, or automated decision engines, Google Cloud Platform helps data scientists work smarter.

From Bucket of Bits to Understanding

Google Cloud Platform makes it easy for you to analyze data regardless of how it’s stored. For structured data, Google BigQuery is a fully managed, low-cost data warehouse with full SQL compatibility as well as integration with Python, R, and a host of other languages. For more general storage, Google Cloud Storage provides powerful, simple storage accessible from any part of GCP. Either way, access your data through Apache Spark running on Google Cloud Dataproc, stream and analyze it through Google Cloud Dataflow, or use it as the foundation for model building with Google Cloud Machine Learning. Finally, manage it all through Google Cloud Datalab, our notebook-driven environment for data science and machine learning.
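As a rough illustration of the BigQuery and Python integration described above, the sketch below builds a small SQL query and loads the result into a pandas DataFrame. It assumes the `google-cloud-bigquery` client library (with its pandas extras) is installed and that application default credentials are configured. The table name `my-project.my_dataset.events` and the column `created_at` are hypothetical placeholders, not names from this page.

```python
from typing import Any


def build_daily_count_query(table: str, timestamp_column: str) -> str:
    """Return a SQL query that counts rows per calendar day."""
    return (
        f"SELECT DATE({timestamp_column}) AS day, COUNT(*) AS n_rows\n"
        f"FROM `{table}`\n"
        "GROUP BY day\n"
        "ORDER BY day"
    )


def run_query_to_dataframe(sql: str) -> Any:
    """Run the query in BigQuery and return the result as a pandas DataFrame."""
    # Imported lazily so the query builder above works without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # picks up application default credentials
    return client.query(sql).to_dataframe()


if __name__ == "__main__":
    # Hypothetical table and column, for illustration only.
    sql = build_daily_count_query("my-project.my_dataset.events", "created_at")
    print(sql)
    # df = run_query_to_dataframe(sql)  # requires a GCP project and credentials
```

Keeping the query builder separate from the client call makes the SQL easy to inspect in a notebook before any bytes are scanned.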

Data Science, Not DevOps

Google Cloud Platform enables data science teams to work without the burden of managing infrastructure. GCP tools like Google BigQuery, the rocket-fast data warehouse, are serverless: you always have the resources you need and pay only for what you use. Through a rich set of client APIs, BigQuery integrates with pandas, dplyr, and other popular data analysis libraries. Cloud Dataflow provides a serverless way to run batch and streaming data pipelines, which makes it well suited to cleaning data or scoring models on streaming data. Cloud Dataflow's Python support lets Python-based data scientists take advantage of the Apache Beam programming model while still using their tools of choice.
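To make the data-cleaning use case more concrete, here is a minimal sketch of an Apache Beam pipeline in Python. It assumes the `apache-beam` package is installed and that the input is text with one `user_id,amount` record per line. That record layout, the function names, and the file paths are all hypothetical. As written, the pipeline runs locally on Beam's default runner; running it on Cloud Dataflow would additionally require pipeline options such as the Dataflow runner, a project, and a staging location.

```python
from typing import Dict, Optional


def clean_record(line: str) -> Optional[Dict[str, object]]:
    """Parse a 'user_id,amount' line; return None if it is malformed."""
    parts = [p.strip() for p in line.split(",")]
    if len(parts) != 2 or not parts[0]:
        return None
    try:
        amount = float(parts[1])
    except ValueError:
        return None
    return {"user_id": parts[0], "amount": amount}


def run(input_path: str, output_path: str) -> None:
    """Read raw lines, drop malformed ones, and write cleaned JSON records."""
    # Imported lazily so clean_record can be tested without Beam installed.
    import json

    import apache_beam as beam

    with beam.Pipeline() as pipeline:
        (
            pipeline
            | "Read" >> beam.io.ReadFromText(input_path)
            | "Clean" >> beam.Map(clean_record)
            | "DropMalformed" >> beam.Filter(lambda rec: rec is not None)
            | "ToJson" >> beam.Map(json.dumps)
            | "Write" >> beam.io.WriteToText(output_path)
        )


if __name__ == "__main__":
    # Hypothetical local paths, for illustration only.
    run("raw_events.csv", "cleaned_events")
```

Because `clean_record` is a plain Python function, it can be unit tested or reused in a notebook independently of the pipeline that applies it.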

Open-Source Analysis Platform

Google Cloud Platform makes it simple for data scientists to benefit from the latest innovations in open-source software. Google Cloud Dataproc lets users create managed Apache Spark clusters in seconds, complete with Apache Zeppelin or Jupyter notebooks. Teams building the next generation of data processing tools with Apache Beam can run their pipelines on Cloud Dataflow, Apache Spark, or Apache Flink. GCP is committed to open source as a way to grow both the data science and machine learning communities. To this end, Google has open-sourced Cloud Datalab, its Jupyter-based notebook environment, and TensorFlow, its deep-learning library.

The Cloud Machine Learning Platform

Google Cloud Machine Learning Platform makes it easy for data science teams to pursue innovative machine learning. With Google Machine Learning APIs, teams can apply Google-built models to their data through simple APIs for cutting-edge image and speech recognition, natural language processing, and machine translation. Using CloudML, data scientists can train and operationalize their own deep learning models with TensorFlow and Cloud Datalab. CloudML integrates directly with other GCP products like Google Cloud Storage and Google BigQuery, so you can easily unlock valuable insight from your data.

Data Science Guides and Resources

These in-depth guides and resources show how to use Cloud Platform for data science.

Customer Lifetime Value

Use BigQuery, Python, and R to calculate customer lifetime value using data from Google Analytics.

Read the Development Guide

Generating Recommendations

Use SparkML, Cloud SQL, and Google App Engine to build a recommendation engine.

Read the Article

Demand Forecasting

Use BigQuery and TensorFlow to predict demand for NYC taxis.

Read the Article

Sentiment Analysis

Use BigQuery and Cloud Natural Language to predict sentiment, then visualize the results in Google Data Studio.

Read the Article

Real-time Analytics Pipeline

Use BigQuery and Cloud Dataflow to construct a real-time analytics architecture.

Read the Article

Time Series Analysis

Use BigQuery to analyze foreign exchange time series data.

Read the Tutorial

Time-Series Classification

Use BigQuery and TensorFlow to build a classifier for time series data.

Read the Tutorial

Prediction and Remarketing

Use BigQuery and R to create a remarketing list based on Google Analytics data.

Read the Tutorial