
Cloud Bigtable brings database stability and performance to Precognitive

January 28, 2019
https://storage.googleapis.com/gweb-cloudblog-publish/images/bigtable.max-2600x2600.png
Zac Rosenbauer

Precognitive

[Editor’s note: Today we’re hearing from Precognitive, which develops technology that interprets data to improve the accuracy of fraud detection and prevention, with the goals of reducing false positives and avoiding customer disruption. Their quest for the right database led them to Cloud Bigtable, and we’re bringing you their story here.]

At Precognitive, we were able to start with a blank technology slate to support our fraud detection software products. When we started building the initial version of our platform in 2017, we had some decisions to make: What coding language should we use? Which cloud infrastructure provider? Which database? Most of those decisions were straightforward, but the database was harder. We had plenty of collective experience with relational databases, but not with a wide-column database like Cloud Bigtable, which we knew we would need in order to scale our behavioral and device workloads. At launch, our products ran on a self-managed database, but we quickly migrated to Cloud Bigtable, and we love it.

To efficiently support our bursty, real-time fraud detection workloads, we needed a cloud database that could satisfy the following key requirements:

  • Stability to keep up with increased adoption of our products
  • Intelligent scaling that avoids bottlenecks
  • Native integrations with BigQuery and Cloud Dataproc
  • Managed services that free up our engineers’ time to work on our products

Adding Cloud Bigtable as our performance database

As we scaled our services and added customers, the data collection services for our Device Intelligence and Behavioral Analytics products were seeing thousands of events per second. Cloud Bigtable provided a stable, managed database that could handle the volume we were receiving during peak hours. We hadn't always been able to handle this scale: an early version of our product relied on a self-managed database.
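
To make the write path concrete, here is a minimal sketch of what an event write to Cloud Bigtable can look like with the Python client. The project, instance, table, column-family names, and key layout below are illustrative placeholders, not our production schema.

```python
# Minimal sketch: writing one device event to Cloud Bigtable with the Python client.
# Project, instance, table, and column-family names are hypothetical placeholders.
import time

from google.cloud import bigtable

client = bigtable.Client(project="my-project")
table = client.instance("fraud-detection").table("device-events")

def record_event(device_id: str, payload: bytes) -> None:
    # Reverse the timestamp so the newest events for a device sort first.
    reverse_ts = 2**63 - int(time.time() * 1000)
    row_key = f"{device_id}#{reverse_ts}".encode()

    row = table.direct_row(row_key)
    row.set_cell("event", "payload", payload)
    row.commit()  # single-row write; table.mutate_rows() batches writes at higher volume

record_event("device-1234", b'{"ip": "203.0.113.7", "ua": "Mozilla/5.0"}')
```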

Every month, two or three engineers spent hours managing the database instances. Whenever the instances crashed, it would cost at least one engineer a day or two of productivity to restore the instances and recover any data from our backup database. Managing this database internally was taking precious time away from product development.

We circled back to Cloud Bigtable. After two weeks of R&D, we decided to switch the Device Intelligence and Behavioral Analytics services to Cloud Bigtable.

Cloud Bigtable solved our scaling issues. It had been attractive to us from the start because it is fully managed and offers regional replication and other features our self-managed instances lacked. Cloud Bigtable scales horizontally and automatically rebalances data across nodes by row key (the equivalent of a shard key) over time to prevent "hot" nodes. In addition, Cloud Bigtable provides connectors to BigQuery and Cloud Dataproc that let us analyze the terabytes of data we are processing and feed that data into unsupervised machine learning.
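
Because the row key doubles as the sharding scheme, key design determines both how writes spread across nodes and how cheap reads are. As a rough illustration (again using the hypothetical device_id#reverse_timestamp layout from the sketch above, not our actual schema), fetching a device's most recent events becomes a bounded prefix scan:

```python
# Sketch: reading recent events for one device via a row-key prefix scan.
# Assumes the hypothetical "device_id#reverse_timestamp" key layout shown earlier.
from google.cloud import bigtable
from google.cloud.bigtable.row_set import RowSet

client = bigtable.Client(project="my-project")
table = client.instance("fraud-detection").table("device-events")

def recent_events(device_id: str, limit: int = 100) -> list:
    # All keys for a device share the "device_id#" prefix, so this range scan
    # only touches the tablets that hold that device's rows.
    row_set = RowSet()
    row_set.add_row_range_from_keys(
        start_key=f"{device_id}#".encode(),
        end_key=f"{device_id}$".encode(),  # '$' sorts immediately after '#'
    )
    return [
        row.cells["event"][b"payload"][0].value
        for row in table.read_rows(row_set=row_set, limit=limit)
    ]
```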

The perks of using Cloud Bigtable

After the migration to Cloud Bigtable, we noticed a number of additional benefits: improved I/O performance, a significant cost reduction, and a sizable decrease in hours spent on database maintenance.

We measured some of our typical metrics before and after implementing Cloud Bigtable. Our average request latency for API requests dropped by about 30 ms, to sub-10 ms; prior to the change, we were seeing average latencies of 40+ ms. This latency drop on our Behavioral Analytics and Device Intelligence products allowed us to trim an additional 10 to 15 ms off our average response time across all dependent services.

https://storage.googleapis.com/gweb-cloudblog-publish/images/cloud_bigtable_latency.max-1900x1900.png

Before we moved to Cloud Bigtable, we had to scale our database instances every time a new customer was onboarded. We were over-scaling in an attempt to avoid constantly resizing our database servers. By sunsetting our self-managed database and switching to Cloud Bigtable, we cut database infrastructure costs by approximately 35% and can now scale as needed, with a couple of clicks, during onboarding.
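
The same resize can also be scripted rather than clicked through in the console. A minimal sketch with the Python admin client (the instance and cluster IDs are placeholders, not our real names) looks roughly like this:

```python
# Sketch: adding Cloud Bigtable nodes programmatically ahead of onboarding a customer.
# Instance and cluster IDs are hypothetical placeholders.
from google.cloud import bigtable

client = bigtable.Client(project="my-project", admin=True)  # admin=True enables cluster operations
cluster = client.instance("fraud-detection").cluster("fraud-detection-c1")

cluster.reload()          # fetch the current node count
cluster.serve_nodes += 2  # add capacity for the new workload
cluster.update()          # apply; Bigtable rebalances data across nodes automatically
```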

We have spent zero hours managing a Cloud Bigtable database since launch, and we put the time we are saving every month toward product development.

Moving forward with Cloud Bigtable

As an engineering team, we love working with Cloud Bigtable. We are not only seeing an improved developer experience and reduced latency, which keeps the engineers happy, but also reduced costs, which keeps the business happy. We're able to build more product, too, with the time we've saved by switching to Cloud Bigtable. Stay tuned to our engineering blog for more on the lessons we've learned and our contributions to the wider Cloud Bigtable community.