
Meet our Data Champions: Credit Karma’s Scott Wong on doing 60 billion model predictions per day

May 3, 2023

Scott Wong

VP of Platform Engineering, Credit Karma

Anita Kibunguchy-Grant

Head of Product Marketing, Databases

Editor’s note: This post is part of Meet the Google Cloud Data Champions, a series celebrating the people behind data- and AI-driven transformations. Each post features a champion’s career journey, lessons learned, advice they would give other leaders, and more. This story features Scott Wong, VP of Platform Engineering at Credit Karma, a heavy user of Google Data Cloud offerings like Cloud Bigtable and BigQuery to store and analyze financial information for its 130 million members.

Tell us about yourself. Where did you grow up? What did your journey into tech look like?

I grew up in a small beach town called Del Mar in San Diego County. For college, I attended Cal Berkeley, at a time of significant growth in the technology sector. Out of school, I began my first internship at Sun Microsystems, and ended up working there for eight years. My role at Sun Microsystems was centered around manufacturing: we were literally fabricating motherboards and assembling servers. At the time, there was a major push to move a majority of manufacturing to Asia, so I traveled to Asia a lot early in my career. I then moved on to Google and Twitter, building out large-scale data centers.

What really stood out to me during my time at Sun Microsystems, and furthered my interest in tech, was the incredible growth I saw, not only within the company but also in the wider industry. The pace and culture of fast-growing tech companies were very appealing: there was a lot of experimentation and innovation happening around me, and there were always opportunities to grow my skill set.

What’s the coolest thing you and/or your team have accomplished by leveraging our Data Cloud solutions?

Google’s Data Cloud has enabled us to scale at high velocity. We run nearly 60 billion model predictions daily to power financial recommendations for our nearly 130 million members, and Google services like Cloud Bigtable and BigQuery have helped us reach this scale, both by scaling our infrastructure and by empowering our data scientists to work faster and more efficiently.

To quantify some of these gains:

Our data scientists deploy more than 300 models weekly, compared to 2018, when they were deploying fewer than 10 models per quarter.
In terms of experiment velocity, our data scientists are running 7x more experiments than in 2018.
Using Bigtable and BigQuery, we’ve been able to deploy 10x more features daily with batch data.
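
To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It takes the rounded numbers quoted above (60 billion predictions per day, 130 million members, 300 models per week versus 10 per quarter) and converts them to per-second, per-member, and per-quarter rates. The function names and the 13-weeks-per-quarter constant are assumptions made for illustration, and since the quoted figures are approximate ("nearly", "more than", "fewer than"), the results are only order-of-magnitude estimates.

```python
# Back-of-the-envelope arithmetic on the rounded figures quoted in the interview.
# All helper names are hypothetical; the inputs are approximations.

SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds
WEEKS_PER_QUARTER = 13           # assumption: 52 weeks / 4 quarters


def per_second(daily_total: float) -> float:
    """Average rate per second for a daily total, assuming an even spread."""
    return daily_total / SECONDS_PER_DAY


def per_member(daily_total: float, members: float) -> float:
    """Average daily count per member."""
    return daily_total / members


def quarterly_from_weekly(weekly: float) -> float:
    """Scale a weekly count up to a quarter."""
    return weekly * WEEKS_PER_QUARTER


if __name__ == "__main__":
    predictions_per_day = 60e9   # "nearly 60 billion"
    members = 130e6              # "nearly 130 million"

    print(f"predictions per second: {per_second(predictions_per_day):,.0f}")
    print(f"predictions per member per day: {per_member(predictions_per_day, members):,.1f}")

    # "more than 300 models weekly" vs. "fewer than 10" per quarter in 2018
    models_per_quarter_now = quarterly_from_weekly(300)
    print(f"models per quarter now: {models_per_quarter_now:,.0f}")
    print(f"increase over 2018: at least {models_per_quarter_now / 10:,.0f}x")
```

Under these assumptions, the daily total averages out to several hundred thousand predictions per second, and the deployment rate is a few hundred times the 2018 rate.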

Technology is one part of data-driven transformation. People and processes are others. How were you able to bring the three together? Were there adoption challenges within the organization, and if so, how did you overcome them?

While our journey to the cloud was a big undertaking for the company, there was always a clear directive from our CTO that we’d make this migration, and that it was imperative to truly scale our business.

It did pose interesting challenges, primarily around people. We had a staff of engineers who were well versed in datacenter-centric skill sets and weren’t necessarily used to cloud technologies. We knew early on that we needed to allocate resources to train engineers while, at the same time, hiring those who specialized in cloud and data engineering.

In terms of processes, it was clear early on how much velocity we’d gain from the cloud: there were a lot of processes we just didn’t have to do any longer. This still meant a lot of change was happening, so we needed to ensure there was a clear understanding among the engineering organization of how to work with these new technologies. What could and couldn’t we leverage, and why? This is why building a trusted partnership with the Google team was so important. We were moving fast, and we needed to ensure there was a high level of service available to us with regard to response time, quality, availability, and more.

What advice would you give people who want to start data initiatives in their company?

The first and most important step is really understanding how you want to use the data: what the business need is and how your data strategy will get you to that point. Once you have alignment on the business need and the data you need to reach that state, you get into the more nitty-gritty, but still important, details. What do the SLAs look like? How are you addressing data quality and data reliability? What data products do you need to leverage? What are your security guardrails to ensure protection of the data? What efficiencies are you putting in place to ensure the utilization of data is appropriate?

Unlocking velocity is great, but when you’re moving fast with data, there can be tradeoffs in security and reliability, and data leaders need to be hyper-aware of that. Especially when it comes to security, you need to retain full responsibility for those practices even if you’re leveraging management support for some of them elsewhere.

What’s an important lesson you learned along the way to becoming more data- and AI-driven? Were there challenges you had to overcome?

Credit Karma from the start has been a very data-driven company: data at scale drives our product, and personalization is integral to our product vision. Scaling over the years has really demonstrated how important data quality is. The integrity of that data is imperative to running a good product, especially when you have well over 100 million people using your product. Over the years, as we continued to make a lot of investments in our machine learning infrastructure and model building practices, we realized how important data grooming and data governance are.

When it comes to data from a scalability perspective, there’s this notion of wanting to keep all historical data, even as your data grows and grows. In order to use data responsibly, there need to be some efficiencies and guardrails in place. When you function at scale, you shouldn’t hold onto data just to have it; instead, be strategic about your data storage needs.

Which leaders and/or companies have inspired you along your journey?

One engineering leader who comes to mind is Urs Hölzle, who has played, and continues to play, an integral role in building Google infrastructure. Although I left Google more than a decade ago, his uncanny ability to reach a solution quickly and effectively has really stuck with me over the years. His leadership style and ability to ask the right questions have inspired my outlook on great engineering leadership.

Thinking ahead 5-10 years, what possibilities with data/AI are you most excited about?

I might be biased, but so much opportunity exists for data and AI to disrupt personal finance. So many aspects of finance remain confusing, antiquated, inefficient and, all in all, stacked against consumers. At Credit Karma, personalization is our north star, and we’ve made tremendous strides to provide each of our members a unique product experience, helpful to their individual situations, needs and goals. As we continue to prioritize investments in data and machine learning at scale, the hope is to automate financial decision making for millions of consumers.

To learn more about Google’s Data Cloud, please visit https://cloud.google.com/data-cloud
