Tools for data scientists

Serverless infrastructure and easy-to-use services and tools for big data and machine learning.

Easily store, process, and prepare data to train and deploy machine learning models on any type of data, of any size. Our fully managed services and open source software help data scientists and data engineers focus on turning data into actionable intelligence instead of handling clusters.

Learn

Explore courses and resources to grow your data science and machine learning knowledge.

AI education by Google

Get information and exercises from Google ML experts to help you develop skills and advance your projects.

Kaggle Learn

Sign up for free courses in machine learning and data science that emphasize practical data skills over abstract theory.

Qwiklabs

Get hands-on practice working with cloud technologies and software.

Google Cloud training

Find courses designed for data professionals who are responsible for designing, building, analyzing, and optimizing big data solutions.

Coursera

Learn ML with Google Cloud through real-world, end-to-end experimentation.

Prototype

Explore tools and samples to help you prototype quickly on Google Cloud.

Colaboratory

Colaboratory is a Google research project created to help disseminate machine learning education and research. It's a Jupyter Notebook environment that requires no setup, is free to use, and runs entirely in the cloud. You can share Colaboratory notebooks just as you would Google Docs or Sheets.

Go to quickstart

Cloud Datalab

Explore, analyze, transform, and visualize data and build machine learning models on Google Cloud Platform. Cloud Datalab runs on Compute Engine and connects to multiple cloud services easily.

Learn more
View documentation

Public Datasets

Get a repository of open data curated by Google engineers and supported by domain experts from around the world. Use these data to build and test your algorithms before deployment, or join them with other datasets to unlock new insights. The data are hosted in BigQuery and Cloud Storage, making them simple to build on and use.

View documentation

Kaggle

Kaggle Kernels offers a browser-based Python and R coding environment at no charge. Get seamless access to thousands of public datasets, code samples from a community of data scientists, and features for collaboration.

Explore repository
Browse public datasets

Jupyter

Get a familiar data science experience without the tedious infrastructure setup by using Jupyter Notebooks with Google Cloud’s fully managed big data stack.

Read blog
Review tutorial

Cloud Deep Learning VM Image (Beta)

Deep Learning VM Image provides preconfigured Compute Engine images for popular machine learning frameworks such as TensorFlow, scikit-learn, and PyTorch.

View documentation

Build

Get tools to streamline the process from data ingestion to model training.

Ingest

Cloud Pub/Sub

Cloud Pub/Sub is a simple, reliable, scalable foundation for large-scale stream analytics and event-driven computing systems. As part of Google Cloud’s stream analytics solution, the service ingests event streams and delivers them to Cloud Dataflow for processing and to BigQuery, the data warehousing solution, for analysis.

View documentation
Go to GitHub

Process

Cloud Dataflow

Transform and enrich ingested data in streaming and batch modes with equal reliability and expressiveness.

View documentation
Go to Quickstarts
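
As a rough illustration of the batch-and-streaming idea described for Cloud Dataflow above, the sketch below applies one transform to a bounded list and to a generator that stands in for a stream. It is plain Python, not the Cloud Dataflow or Apache Beam API; the function names, the event fields (`user`, `amount`), and the `LARGE_AMOUNT` threshold are all hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator

# Hypothetical threshold, chosen only for this illustration.
LARGE_AMOUNT = 100


def enrich(events: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Drop events that have no amount and tag the rest with an is_large flag."""
    for event in events:
        if "amount" not in event:
            continue  # transform step: discard incomplete records
        # enrich step: copy the event and add a derived field
        yield {**event, "is_large": event["amount"] >= LARGE_AMOUNT}


# Batch mode: the whole input is available up front.
batch = [
    {"user": "a", "amount": 5},
    {"user": "b"},
    {"user": "c", "amount": 500},
]
batch_result = list(enrich(batch))


def event_stream() -> Iterator[Dict[str, Any]]:
    """Stand-in for an unbounded source; here it simply replays the batch."""
    yield from batch


# Streaming mode: results are pulled one at a time as events arrive.
stream_result = enrich(event_stream())
first_streamed = next(stream_result)
```

Because `enrich` only iterates over its input, the same logic serves both modes. A real pipeline would express this with the service's own SDK.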

Cloud Dataprep

Cloud Dataprep is an intelligent data service for visually exploring, cleaning, and preparing structured and unstructured data for analysis. Cloud Dataprep is serverless and works at any scale — there’s no infrastructure to deploy or manage.

View documentation
Go to Quickstarts

Warehouse

BigQuery

BigQuery is a fully managed data warehouse service that supports 100,000 streaming row inserts per second and allows ad hoc analysis on real-time data with standard SQL.

View tutorials
Go to Quickstarts
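
To put the 100,000 streaming row inserts per second quoted for BigQuery above into perspective, here is a back-of-the-envelope calculation in plain Python. It assumes the rate is sustained and ignores quotas, batching, and network overhead; the helper names are hypothetical and no BigQuery API is called.

```python
# Figure quoted in the text above: 100,000 streaming row inserts per second.
STREAMING_ROWS_PER_SECOND = 100_000


def seconds_to_stream(row_count: int,
                      rows_per_second: int = STREAMING_ROWS_PER_SECOND) -> int:
    """Return the minimum whole seconds needed to insert row_count rows."""
    if row_count < 0 or rows_per_second <= 0:
        raise ValueError("row_count must be >= 0 and rows_per_second must be > 0")
    # Integer ceiling division avoids floating-point rounding.
    return -(-row_count // rows_per_second)


def rows_per_day(rows_per_second: int = STREAMING_ROWS_PER_SECOND) -> int:
    """Return how many rows a sustained rate would insert in 24 hours."""
    return rows_per_second * 60 * 60 * 24


one_million = seconds_to_stream(1_000_000)  # 10 seconds at the quoted rate
daily_rows = rows_per_day()                 # 8,640,000,000 rows per day
```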

Cloud Storage

Use Cloud Storage to store your model trainer, training data, saved models, and prediction inputs and outputs.

View documentation
Go to Quickstarts

Explore

BigQuery

Get insights from your data faster without needing to copy or move it. BigQuery gives you a full view of all your data by querying data stored in BigQuery’s managed columnar storage, Cloud Storage, Cloud Bigtable, Google Sheets, and Google Drive.

Explore tutorials
Go to Quickstarts

Cloud Datalab

Cloud Datalab is an interactive tool built on Jupyter (formerly IPython) for exploring, analyzing, transforming, and visualizing data and for building machine learning models on Google Cloud Platform. It runs on Compute Engine and connects to multiple cloud services, so you can focus on your data science tasks.

Go to Quickstart
Launch tutorial

Cloud ML Engine

Add an extra layer of intelligence to your pipeline by running the event streams through custom TensorFlow, XGBoost, or scikit-learn machine learning models.

View training overview

TensorFlow

TensorFlow™ is an open source software library for high-performance numerical computation. Its flexible architecture allows easy deployment of computation across a variety of platforms (CPUs, GPUs, TPUs), and from desktops to clusters of servers to mobile and edge devices.

View documentation
Get ML crash course

Hardware accelerators

Hardware accelerators on Google Cloud offer the flexibility to choose the right accelerator for the best performance per dollar on ML workloads. Select from a portfolio of accelerators to run your workloads for training and predictions.

Cloud TPU
Cloud GPU
Cloud CPU

Facets

Facets contains two robust visualizations to aid in understanding and analyzing machine learning datasets. Get a sense of the shape of each feature of your dataset using Facets Overview, or explore individual observations using Facets Dive.

Explore Facets
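
The per-feature summaries that Facets Overview is described as providing above can be approximated by hand for a single numeric column. The sketch below uses only the Python standard library and is not the Facets API; the function name and the chosen statistics are assumptions made for illustration.

```python
import statistics
from typing import Dict, List, Optional


def summarize_feature(values: List[Optional[float]]) -> Dict[str, float]:
    """Summarize one numeric feature, treating None as a missing value."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("feature has no non-missing values")
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "std_dev": statistics.pstdev(present),  # population standard deviation
        "min": min(present),
        "median": statistics.median(present),
        "max": max(present),
    }


# Hypothetical feature column with one missing observation.
age_summary = summarize_feature([1.0, 2.0, None, 3.0])
```

Running the same function over every column gives a quick view of ranges and missing values before training.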

Deploy

Deploy your machine learning models anywhere.

Kubeflow

The Kubeflow project is dedicated to making deployments of ML workflows on Kubernetes simple, portable, and scalable. The goal is not to re-create other services, but to provide a straightforward way to deploy best-of-breed open-source systems for ML to diverse infrastructures. You should be able to run Kubeflow anywhere you’re running Kubernetes.

Read blog post
Kubeflow on GitHub

Cloud ML Engine

Cloud ML Engine offers online prediction and batch prediction services for different ML frameworks. Data scientists can easily deploy models that have been trained anywhere into production without Docker containers or any special stitch-and-fix mechanisms. Online prediction supports frameworks like scikit-learn, XGBoost, Keras, and TensorFlow to serve classification, regression, clustering, and dimensionality reduction models.

Get prediction overview

Partners

Find Google Cloud machine learning Partners that come with deep AI expertise and can help you incorporate machine learning for a wide range of needs and use cases. Depending on your industry and need, you can choose your preferred development paths. Our partners can help across every stage of model development and serving — getting your data ready for machine learning or providing the right tools and platforms for your work, including off-the-shelf AI solutions and custom model development.
Learn more

Data preparation or preprocessing

Find partners that specialize in making data ready for training.

Figure Eight, Alteryx, iMerit

Data science platforms

Find platforms and tools for machine learning and data science.

H2O, Anaconda, RStudio

Get started

Learn and build

New to GCP? Get started with any GCP product for free with a $300 credit.

Need help on a bigger project?

Our experts will help you build the right solution or find the right partner for your needs.