SenseData’s journey with Google Cloud’s managed database services
Editor’s note: Learn how SenseData helps Brazilian companies embrace customer success, rapidly evolving its product on Google Cloud and Google managed database services. SenseData has grown from a handful of customers to 140, manages multiple terabytes of data, and is now moving to Google BigQuery for data warehousing.
SenseData is a customer success company, and we have one product. We gather information from our clients’ systems and aggregate it all in a single platform that they use to make smarter decisions about their business. Some clients want to increase sales, some want to reduce churn, and others want to see a more comprehensive picture of their customers.
Our customer base has grown very quickly, and so has the data our platform collects, manages, and makes consumable. Just 5 years ago, we had a minimum viable product (MVP) for customer success. Brazilian B2B customers don't have a "default" software stack; they use a mishmash of systems and software to manage the customer relationship. Our objective was to integrate the data from all these different systems. We were on the cloud for that reason, and our goal was to be cloud-agnostic, using open-source software like MySQL and other tools to manage data.
Over time, our outlook has changed. It all started when we participated in the first Google Campus Residency for Startups. The residency introduced us to Google Cloud Platform and Google managed database services. After burning through our credits, we really haven’t looked back. We see the value of being a “Google Cloud shop.”
As we have evolved, we’ve taken advantage of new managed database services from Google. We are really impressed with how Google also evolves with changes in data formats and storage. Best of all, with Cloud SQL for MySQL and PostgreSQL, and now BigQuery, we don’t have to worry about backups, restores, replicas, and everything else that database administrators must do. We can focus on using our talent to keep improving our platform.
Oh, MySQL, how we’ve outgrown you: Building an ecosystem with Google services
Our original architecture consisted of MySQL, an application server, and a cloud infrastructure from another vendor. During our campus residency, we moved to GCP and we started using Cloud SQL for MySQL because our clients’ data formats and sources were all over the map—Oracle, Microsoft SQL Server, Google Sheets, CSVs stored on other cloud infrastructure, and systems with only VPN access to data sources.
As our clients grew, our architecture, which keeps every customer's data completely separate, pushed us beyond the boundaries of MySQL, and we moved to PostgreSQL through Cloud SQL. Some indexes and queries were not performing well for us in MySQL. In PostgreSQL, with the same indexes and queries and nothing done differently, performance was consistently better. We have an ORM tool (SQLAlchemy) on top of the database layer of our app, so it was very easy to migrate from MySQL to PostgreSQL.
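A minimal, hypothetical sketch of why the switch was easy: with an ORM such as SQLAlchemy, the database dialect lives in the connection URL, so the models and queries on top did not need to change. The URLs and driver names below are illustrative placeholders, not our actual configuration.

```python
# Hypothetical sketch: with an ORM such as SQLAlchemy, moving from MySQL to
# PostgreSQL is largely a change of dialect in the connection URL; the models
# and queries built on top stay the same.

def to_postgres_url(mysql_url):
    """Rewrite a SQLAlchemy MySQL URL into its PostgreSQL equivalent."""
    # e.g. 'mysql+pymysql://user:pw@host/db' -> 'postgresql+psycopg2://user:pw@host/db'
    _, _, rest = mysql_url.partition("://")
    return "postgresql+psycopg2://" + rest

# The engine is then created exactly as before (not executed here):
# engine = sqlalchemy.create_engine(to_postgres_url(OLD_URL))
```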
At the same time, we moved to Kubernetes with Google Kubernetes Engine. The result of that combination was an ecosystem that could accommodate various technical needs. For example, instead of building a firewall ourselves, we could create an ingress rule in Kubernetes that also handled load balancing. Each customer has its own external address behind the same external IP, and inside Kubernetes, host rules route each request to the right service.
This ecosystem represented a big turning point for us. We had not planned on putting all our tech and data eggs in one basket, but our experience showed us the importance of having a first-rate database and managed database services—the lower latency, the support, and everything that comes with it. We quickly came to understand how Google can help companies during the growth process. We decided to take advantage of other GCP resources because Google makes it easy to access their first-rate services.
Like two peas in a pod: Offering clients peace of mind with Google Cloud SQL
Any client of a company like SenseData that offers cloud storage and data management is going to be concerned about data security and whether other customers might get access to its data. SenseData uses Google services to ensure that each client's data stays separate. On GCP with Cloud SQL for PostgreSQL, we have single tenancy per database and multiple customer databases per instance. In other words, the instance is shared, but the logical database is not.
We also store custom data as JSONB, PostgreSQL's binary JSON column type. If one of our clients is a SaaS company that sells consulting services, e-commerce storefronts, and a mobile app that calls a taxi service, we can easily join that custom data with all the various types of data the client collects. Then we can deliver metrics and calculations that meet their needs.
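To illustrate, here is a hedged sketch of the kind of query a JSONB column enables; the table and column names are hypothetical, not SenseData's actual schema. The helper function mimics PostgreSQL's `->>` text-extraction operator in plain Python.

```python
# Hypothetical schema: a "customers" table with a jsonb "custom" column
# holding per-client attributes, joined against collected order data.
REVENUE_BY_SEGMENT_SQL = """
SELECT c.custom->>'segment' AS segment, SUM(o.amount) AS revenue
FROM customers c
JOIN orders o ON o.customer_id = c.id
WHERE c.custom ? 'segment'   -- jsonb key-existence operator
GROUP BY segment
"""

def jsonb_text(custom, key, default=None):
    """Plain-Python stand-in for custom->>'key': the value as text, or default."""
    value = custom.get(key, default)
    return value if value is None else str(value)
```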
Separation of data is a piece of application layer cake
For logical separation at the application layer, we use a cluster of 13 nodes, each with 4 CPUs and 15 GB of RAM. Inside the cluster, the logical separation uses namespaces; onboarding and production each run in their own namespace. A Kubernetes ConfigMap then directs each customer's pods to that customer's specific database.
In other words, the pod is the unit of deployment. The ConfigMap tells the pod which database to answer to. Each service exposes a named port or a specific app selector, and the ingress host rules map each domain to the right service and port. This method allows us to have a different service for each customer.
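On the application side, that ConfigMap-driven routing can be sketched in a few lines of Python. This is an illustrative pattern, not SenseData's actual configuration; the `DATABASE_URL` variable name is an assumption.

```python
import os

# Sketch of the routing pattern described above (names are illustrative).
# Each pod's ConfigMap injects that tenant's connection string as an
# environment variable, so the application itself never needs a
# tenant-to-database lookup table.

def tenant_database_url(env=None):
    """Return the database URL the ConfigMap assigned to this pod."""
    env = os.environ if env is None else env
    url = env.get("DATABASE_URL")
    if not url:
        raise RuntimeError("DATABASE_URL not set; check the pod's ConfigMap")
    return url
```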
The BigQuery idea: Data warehousing in the cloud to help customers meet KPIs
Recently we started to work with BigQuery. We made the decision to deploy because we were migrating our analytics from another vendor to Looker and wanted to improve performance and address the KPI needs of our biggest clients. These clients have a million customers of their own, and the performance of some of their KPIs was not optimal. Many are ecommerce customers who want to track product sales against KPIs. For each product, they must look at historic sales data, hunt for specific SKUs, try to determine when the product was last purchased, and so on. Imagine doing that for multiple products concurrently. This gets very complicated in PostgreSQL.
BigQuery offers us a faster and easier way to address performance and increase scalability. All our calculations will migrate over to BigQuery. Once the data is aggregated, it can go back to PostgreSQL; we use a Python client to move it from BigQuery to Cloud SQL. This evolution of Cloud SQL and data warehousing is impressive. It gives us the freedom to try new configurations and data management techniques. Two years down the road, we're sure that if we have to change how we handle customer data, there will be an evolution of Cloud SQL or some other Google service that will help us make the switch.
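A hedged sketch of that flow: the dataset, table, and column names below are hypothetical, and in practice the query would run through the google-cloud-bigquery client before the shaped rows are inserted into Cloud SQL for PostgreSQL.

```python
# Hypothetical aggregation run in BigQuery; in practice it would execute via
# the google-cloud-bigquery client, e.g. bigquery.Client().query(AGG_SQL).result().
AGG_SQL = """
SELECT customer_id,
       SUM(amount)       AS total,
       MAX(purchased_at) AS last_purchase
FROM `project.dataset.sales`
GROUP BY customer_id
"""

def to_insert_rows(bq_rows):
    """Shape BigQuery result rows into tuples for a PostgreSQL bulk insert."""
    return [(r["customer_id"], r["total"], r["last_purchase"]) for r in bq_rows]
```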
One isn’t the loneliest number: Cloud SQL makes database management easy
Cloud SQL and Google Cloud Platform help us by providing all the complicated database management services, plus observability, monitoring, and more. As a result, the SenseData infrastructure has been managed by just one person for about 6 years. Even though we have grown to 140 customers with terabytes of data, it's still mostly a one-person job.
How is this possible? The answer is simple. We don’t have to deal with backups, maintenance downtime, and resolving replication issues. Cloud SQL serves up everything for us instead. We don't have to staff a team that includes a DBA, someone to manage networking, someone to administer VMs, and so on. That’s a big value for us. If we had stuck to our original plan to find cloud solutions no matter the vendor, we might not be able to stay so lean. The database services managed by Google, along with GCP and GKE, really make a difference.