Your Google Cloud database options, explained
Lead Developer Advocate, Google
Picking the right database for your application is not easy. The choice depends heavily on your use case — transactional processing, analytical processing, in-memory database, and so on — but it also depends on other factors. This post covers the different database options available within Google Cloud across relational (SQL) and non-relational (NoSQL) databases and explains which use cases are best suited for each database option.
Click to enlarge
In relational databases information is stored in tables, rows and columns, which typically works best for structured data. As a result they are used for applications in which the structure of the data does not change often. SQL (Structured Query Language) is used when interacting with most relational databases. They offer ACID consistency mode for the data, which means:
- Atomic: All operations in a transaction succeed or the operation is rolled back.
- Consistent: On the completion of a transaction, the database is structurally sound.
- Isolated: Transactions do not contend with one another. Contentious access to data is moderated by the database so that transactions appear to run sequentially.
- Durable: The results of applying a transaction are permanent, even in the presence of failures.
Because of these properties, relational databases are used in applications that require high accuracy and for transactional queries such as financial and retail transactions. For example: In banking when a customer makes a funds transfer request, you want to make sure the transaction is possible and it actually happens on the most up-to-date account balance, in this case an error or resubmit request is likely fine.
There are three relational database options in Google Cloud: Cloud SQL, Cloud Spanner, and Bare Metal Solution.
Cloud SQL: Provides managed MySQL, PostgreSQL and SQL Server databases on Google Cloud. It reduces maintenance cost and automates database provisioning, storage capacity management, back ups, and out-of-the-box high availability and disaster recovery/failover. For these reasons it is best for general-purpose web frameworks, CRM, ERP, SaaS and e-commerce applications.
Cloud Spanner: Cloud Spanner is an enterprise-grade, globally-distributed, and strongly-consistent database that offers up to 99.999% availability, built specifically to combine the benefits of relational database structure with non-relational horizontal scale. It is a unique database that combines ACID transactions, SQL queries, and relational structure with the scalability that you typically associate with non-relational or NoSQL databases. As a result, Spanner is best used for applications such as gaming, payment solutions, global financial ledgers, retail banking and inventory management that require ability to scale limitlessly with strong-consistency and high-availability.
Bare Metal Solution: Provides hardware to run specialized workloads with low latency on Google Cloud. This is specifically useful if there is an Oracle database that you want to lift and shift into Google Cloud. This enables data center retirements and paves a path to modernize legacy applications.
Non-relational databases (or NoSQL databases) store complex, unstructured data in a non-tabular form such as documents. Non-relational databases are often used when large quantities of complex and diverse data need to be organized, or where the structure of the data is regularly evolving to meet new business requirements. Unlike relational databases, they perform faster because a query doesn’t have to access several tables to deliver an answer, making them ideal for storing data that may change frequently or for applications that handle many different kinds of data. For example, an apparel store might have a database in which shirts have their own document containing all of their information, including size, brand, and color with room for adding more parameters later such as sleeve size, collars, and so on.
Qualities that make NoSQL databases fast:
- Typically, they are optimized for a specific workload pattern (i.e., key-value, graph, wide-column)
- Horizontal scaling, usually using range or hashed distributions
- Eventual consistency: many NoSQL stores usually exhibit consistency at some later point (e.g., lazily at read time). However, Firestore uniquely offers strong global consistency.
- Transactions: a majority of NoSQL stores don't support cross shard transactions or flexible isolation modes. However, Firestore uniquely offers ACID transactions across shards with serializable isolation.
Because of these properties, non-relational databases are used in applications that require large scale, reliability, availability, and frequent data changes.They can easily scale horizontally by adding more servers, unlike some relational databases, which scale vertically by increasing the machine size as the data grows. Although, some relational databases such as Cloud Spanner support scale-out and strict consistency.
Non-relational databases can store a variety of unstructured data such as documents, key-value, graphs, wide columns, and more. Here are your non-relational database options in Google Cloud:
- Document databases: Store information as documents (in formats such as JSON and XML). For example: Firestore
- Key-value stores: Group associated data in collections with records that are identified with unique keys for easy retrieval. Key-value stores have just enough structure to mirror the value of relational databases while still preserving the benefits of NoSQL. For example: Bigtable, Memorystore
- In-memory database: Purpose-built database that relies primarily on memory for data storage. These are designed to attain minimal response time by eliminating the need to access disks. They are ideal for applications that require microsecond response times and can have large spikes in traffic. For example: Memorystore
- Wide-column databases: Use the tabular format but allow a wide variance in how data is named and formatted in each row, even in the same table. They have some basic structure while preserving a lot of flexibility. For example: Bigtable
- Graph databases: Use graph structures to define the relationships between stored data points; useful for identifying patterns in unstructured and semi-structured information. For example: JanusGraph
There are three non-relational databases in Google Cloud:
Firestore: Is a serverless document database which scales on demand, is strongly-consistent, offers up to 99.999% availability and acts as a backend-as-a-service. It is DBaaS that is optimized for building applications. It is perfect for all general purpose uses cases such as ecommerce, gaming, IoT and real time dashboards. With Firestore, users can interact with and collaborate on live and offline data making it great for real-time application and mobile apps.
Cloud Bigtable: Cloud Bigtable is a sparsely populated table that can scale to billions of rows and thousands of columns, enabling you to store terabytes or even petabytes of data. It is ideal for storing very large amounts of single-keyed data with very low latency. It supports high read and write throughput at sub-millisecond latency, and it is an ideal data source for MapReduce operations. It also supports the open-source HBase API standard to easily integrate with the Apache ecosystem including HBase, Beam, Hadoop and Spark along with Google Cloud ecosystem.
Memorystore: Memorystore is a fully managed in-memory data store service for Redis and Memcached at Google Cloud. It is best for in-memory and transient data stores and automates the complex tasks of provisioning, replication, failover, and patching so you can spend more time coding. Because it offers extremely low latency and high performance, Memorystore is great for web and mobile, gaming, leaderboard, social, chat, and news feed applications.
Choosing a relational or a non-relational database largely depends on the use case. Broadly, if your data structure is not going to change much, select a relational database. In Google Cloud use Cloud SQL for any general-purpose SQL database and Cloud Spanner for large-scale globally scalable, strongly consistent use cases. In general, if your data structure may change later and if scale and availability is a bigger requirement then a non-relational database is a preferable choice. Google Cloud offers Firestore, Memorystore, and Cloud Bigtable to support a variety of use cases across the document, key-value, and wide column database spectrum. For more comparison resources on each database check out the overview. For more hands-on experience with Bigtable, check out our on-demand training here and learn about migrating databases to managed services check out this whitepaper.
For more #GCPSketchnote, follow the GitHub repo. For similar cloud content follow me on Twitter @pvergadia and keep an eye out on thecloudgirl.dev.