Integrations with Cloud Bigtable

This page describes integrations between Cloud Bigtable and other products and services.

Google Cloud services

This section describes the Google Cloud services that Cloud Bigtable integrates with.

BigQuery

BigQuery is Google's fully managed, petabyte-scale, low-cost analytics data warehouse. You can use BigQuery to query data stored in Cloud Bigtable.

To get started, see Querying Cloud Bigtable Data.
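
For illustration, here is a minimal sketch in Python (using the google-cloud-bigquery client) of querying Bigtable data through BigQuery. It assumes an external table has already been defined over a Bigtable table as described in that guide; the project, dataset, and table names are hypothetical.

```python
# Minimal sketch: querying Cloud Bigtable data through BigQuery.
# Assumes an external table has already been defined over the Bigtable
# table as described in "Querying Cloud Bigtable Data". The project,
# dataset, and table names below are hypothetical.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

query = """
    SELECT rowkey, cf1
    FROM `my-project.my_dataset.my_bigtable_external_table`
    LIMIT 10
"""

# Run the query and print each returned row.
for row in client.query(query).result():
    print(row)
```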

Dataflow

Dataflow is a cloud service and programming model for big data processing. Dataflow supports both batch and streaming processing. You can use Dataflow to process data that is stored in Cloud Bigtable or to store the output of your Dataflow pipeline. You can also use Dataflow templates to export and import your data as Avro files or SequenceFiles.

To get started, see Dataflow Connector for Cloud Bigtable.
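
For illustration, the following is a minimal sketch in Python (using the Apache Beam Bigtable connector) of a pipeline that writes its output to Cloud Bigtable. The project, instance, table, and column family names are hypothetical, and the table and column family are assumed to exist already; Java pipelines use the connector described in the page above.

```python
# Minimal sketch: writing pipeline output to Cloud Bigtable with the
# Apache Beam Python connector. The project, instance, table, and column
# family names are hypothetical; the table and column family must exist.
import datetime

import apache_beam as beam
from apache_beam.io.gcp.bigtableio import WriteToBigTable
from google.cloud.bigtable.row import DirectRow


def to_bigtable_row(element):
    """Turn a (row_key, value) pair into a Bigtable DirectRow mutation."""
    row_key, value = element
    row = DirectRow(row_key=row_key.encode("utf-8"))
    row.set_cell("cf1", b"value", value.encode("utf-8"),
                 timestamp=datetime.datetime.utcnow())
    return row


with beam.Pipeline() as pipeline:
    (
        pipeline
        | "Create" >> beam.Create([("greeting#1", "hello"),
                                   ("greeting#2", "world")])
        | "ToBigtableRows" >> beam.Map(to_bigtable_row)
        | "WriteToBigtable" >> WriteToBigTable(project_id="my-project",
                                               instance_id="my-instance",
                                               table_id="my-table")
    )
```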

Dataproc

Dataproc provides Apache Hadoop and related products as a managed service in the cloud. With Dataproc, you can run Hadoop jobs that read from and write to Cloud Bigtable.

For an example of a Hadoop MapReduce job that uses Cloud Bigtable, see the /java/dataproc-wordcount directory in the GitHub repository GoogleCloudPlatform/cloud-bigtable-examples.
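
For illustration, the following is a minimal sketch in Python (using the google-cloud-dataproc client) of submitting a Hadoop job to an existing Dataproc cluster. The region, cluster name, jar URI, and job arguments are hypothetical placeholders for a job such as the wordcount example above.

```python
# Minimal sketch: submitting a Hadoop job to an existing Dataproc cluster.
# The region, cluster name, jar URI, and job arguments are hypothetical;
# the jar is assumed to be a MapReduce job built against the Bigtable
# HBase client, such as the dataproc-wordcount example.
from google.cloud import dataproc_v1

project_id = "my-project"
region = "us-central1"

job_client = dataproc_v1.JobControllerClient(
    client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
)

job = {
    "placement": {"cluster_name": "my-dataproc-cluster"},
    "hadoop_job": {
        "main_jar_file_uri": "gs://my-bucket/wordcount-mapreduce-with-dependencies.jar",
        "args": ["wordcount-hbase", "gs://my-bucket/input.txt", "my-output-table"],
    },
}

submitted = job_client.submit_job(
    request={"project_id": project_id, "region": region, "job": job}
)
print("Submitted job:", submitted.reference.job_id)
```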

Cloud Deployment Manager

Deployment Manager is an infrastructure deployment service that automates the creation and management of Google Cloud resources. Deployment Manager makes API calls to create Cloud Bigtable instances, then adds them to your deployment.

Big data

This section describes big data products that Cloud Bigtable integrates with.

Apache Hadoop

Apache Hadoop is a framework that enables distributed processing of large data sets across clusters of computers. You can use Dataproc to create a Hadoop cluster, then run MapReduce jobs that read from and write to Cloud Bigtable.

For an example of a Hadoop MapReduce job that uses Cloud Bigtable, see the /java/dataproc-wordcount directory in the GitHub repository GoogleCloudPlatform/cloud-bigtable-examples.

StreamSets Data Collector

StreamSets Data Collector is a data-streaming application that you can configure to write data to Cloud Bigtable. StreamSets provides a Cloud Bigtable library in its GitHub repository at streamsets/datacollector.

Geospatial databases

This section describes geospatial databases that Cloud Bigtable integrates with.

GeoMesa

GeoMesa is a distributed spatio-temporal database that supports spatial querying and data manipulation. GeoMesa can use Cloud Bigtable to store its data.

For more information about running GeoMesa with Cloud Bigtable support, see the GeoMesa documentation.

Graph databases

This section describes graph databases that Cloud Bigtable integrates with.

HGraphDB

HGraphDB is a client layer for using Apache HBase or Cloud Bigtable as a graph database. It implements the Apache TinkerPop 3 interfaces.

For more information about running HGraphDB with Cloud Bigtable support, see the HGraphDB documentation.

JanusGraph

JanusGraph is a scalable graph database. It is optimized for storing and querying graphs containing hundreds of billions of vertices and edges.

For more information about running JanusGraph with Cloud Bigtable support, see Running JanusGraph with Cloud Bigtable or the JanusGraph documentation.

Infrastructure management

This section describes infrastructure management tools that Cloud Bigtable integrates with.

Pivotal Cloud Foundry

Pivotal Cloud Foundry is an application development and deployment platform that offers the ability to bind an application to Cloud Bigtable.

Terraform

Terraform is an open source tool that codifies APIs into declarative configuration files. These files can be shared among team members, treated as code, edited, reviewed, and versioned.

For more information about using Cloud Bigtable with Terraform, see Cloud Bigtable Instance and Cloud Bigtable Table in the Terraform documentation.

Machine learning

This section describes machine learning products that Cloud Bigtable integrates with.

Feast

Feast is an open-source feature store for machine learning, developed by Google Cloud and GO-JEK, that can use Cloud Bigtable as a serving store.

TensorFlow

TensorFlow, an open-source library for numerical computation, offers native support for using Cloud Bigtable to store and serve training data. A tutorial, Cloud Bigtable for Streaming Data, is available to help you learn to use this integration.

Time-series databases and monitoring

This section describes time-series databases and monitoring tools that Cloud Bigtable integrates with.

Heroic

Heroic is a monitoring system and time-series database. Heroic can use Cloud Bigtable to store its data.

For more information about Heroic, see the GitHub repository spotify/heroic, as well as the documentation for configuring Cloud Bigtable and configuring metrics.

OpenTSDB

OpenTSDB is a time-series database. With the AsyncBigtable library, OpenTSDB can use Cloud Bigtable to store its data.

For more information about running OpenTSDB with Cloud Bigtable support, see Pythian's blog post and the OpenTSDB documentation. Additionally, see Using OpenTSDB to Monitor Time-Series Data on Google Cloud to learn how to use OpenTSDB running on Google Kubernetes Engine along with Cloud Bigtable to collect, record, and monitor time-series data.
