Powering past limits with financial services in the cloud
Michael Onders
EVP Chief Data Officer, Divisional CIO, and Head of Enterprise Architecture, KeyBank
Editor’s note: We asked financial institution KeyBank to share their story of moving their data warehouse from Teradata to Google Cloud. Here are details on why they moved to cloud, how they did their research, and what benefits cloud can bring.
At KeyBank, we serve our 3.5 million customers online and in-person, and managing and analyzing data is essential to providing great service. We process more than four billion records every single day, and move that data to more than 40 downstream systems. Our teams use that data in many ways; we have about 400 SAS users and 4,000 Tableau users exploring analytics results and running reports.
We introduced Hadoop four or five years ago as our data lake architecture, using Teradata for high-performance analytics. We stored more than a petabyte of data in Hadoop on about 150 servers, and more than 30 petabytes in our Teradata environment. We decided to move operations to the cloud when we started hitting the limits of what an on-premises data warehouse could do to meet our business needs. We wanted to move to cloud quickly and open up new analytics capabilities for our teams.
Considering and testing cloud platforms
Teradata had worked well for us when we first deployed it. Back then, Teradata was a market leader in data warehousing, and many of the leading banks were invested in it. We chose it for its high-performance analytics capabilities, and our marketing and risk management teams used it heavily. It also worked well with the SAS tools we were using, and SAS remains a good tool for accessing our mainframe.
Ten years into using Teradata, we had a lot of product-specific data stores; it wasn’t a fully formed data lake architecture. We also maintained more than 200 SAS models. In 2019, our Teradata appliances were nearing capacity, and we knew they would need a refresh in 2021. We wanted to avoid that refresh, so we started proof-of-concept cloud testing with both Snowflake and Google Cloud.
When we ran those trials, we used comparative benchmarks for load time, ETL time, performance, and query time. Snowflake looked just like Teradata, but in the cloud. With Google, we also looked at all the surrounding technology of the platform. Choosing Snowflake would have meant we couldn’t consolidate on a single cloud platform. We picked Google Cloud because it would let us simplify and give us many more options to grow over time.
Adapting to a cloud platform
Along with the change in technology, our teams would have to learn some new skills during this cloud migration. Our primary goal in moving to a cloud architecture was getting the performance of Teradata at the cost of Hadoop, but on a single platform. Managing a Hadoop data lake alongside a Teradata warehouse is complicated—it really takes two different skill sets.
There are some big considerations that go into making these kinds of legacy vs. modern enterprise technology decisions. With an on-premises data warehouse like Teradata, you’re governed by fixed capacity, so performance varies with the load on the hardware at any given time. That meant analytics users hit those limits during month-end processing, for example. With Google Cloud, there are options for virtually unlimited capacity.
Cost savings were a big reason for our move to cloud. Pricing models are very different in the cloud, but ultimately we’re aiming not to pay for storage that’s just sitting there unused. Cloud gives us the opportunity to scale up for a month if needed, then back down after the peak, which lets us manage costs better. Figuring this out is a new skill we’ve had to learn. For example, running a bad query on Teradata or Hadoop wouldn’t change the on-premises cost of that query, but it would consume horsepower. Running that same query on Google Cloud won’t interfere with other users’ performance, but it would cost us money. So we’re running training to make sure people aren’t making those kinds of mistakes and are writing the right types of queries.
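One guardrail for that kind of training is to estimate a query’s cost before running it. Below is a minimal sketch, assuming the google-cloud-bigquery Python client; the dataset, table, and query are illustrative, not KeyBank’s actual workloads.

```python
# A sketch of query-cost guardrails in BigQuery, using the
# google-cloud-bigquery Python client. Names are illustrative.
from google.cloud import bigquery

client = bigquery.Client()
sql = """
    SELECT customer_id, SUM(amount) AS total
    FROM marts.transactions
    GROUP BY customer_id
"""

# Dry run: BigQuery reports the bytes the query would scan without
# executing it or charging for it, so analysts see the cost up front.
dry_run_job = client.query(
    sql,
    job_config=bigquery.QueryJobConfig(dry_run=True, use_query_cache=False),
)
print(f"Query would process {dry_run_job.total_bytes_processed / 1e9:.2f} GB")

# Guardrail: fail any query that would bill more than 10 GB,
# rather than quietly running up the bill.
guarded_config = bigquery.QueryJobConfig(maximum_bytes_billed=10 * 1024**3)
rows = client.query(sql, job_config=guarded_config).result()
```

The dry run costs nothing to execute, so it can be wrapped into tooling that analysts run by default before submitting a real query.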
Shifting to cloud computing
The actual cloud migration involved working closely with the security team to meet their requirements. We also needed to align data formats. For example, we had to make sure our ETL processing could talk to Google Cloud Storage buckets and BigQuery data sets. We’re finding that for the most part the queries do port over seamlessly to BigQuery. We’ve had to tweak just a handful of data types.
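As one illustration of that alignment work, here is a minimal sketch of landing an extract from a Cloud Storage bucket into BigQuery with the Python client. The bucket, table, and column names are hypothetical; the schema comments show the kind of Teradata-to-BigQuery data-type tweaks we mean, following Google’s published type mappings.

```python
# A sketch of loading an ETL extract from Cloud Storage into BigQuery.
# Bucket, table, and column names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client()

# Most data-type tweaks happen in the explicit schema, e.g.
# Teradata BYTEINT/SMALLINT -> INT64, DECIMAL -> NUMERIC, VARCHAR -> STRING.
schema = [
    bigquery.SchemaField("account_id", "INT64"),
    bigquery.SchemaField("balance", "NUMERIC"),     # was DECIMAL(18,2)
    bigquery.SchemaField("branch_code", "STRING"),  # was VARCHAR(8)
]

job_config = bigquery.LoadJobConfig(
    schema=schema,
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
)

load_job = client.load_table_from_uri(
    "gs://example-etl-landing/accounts_extract.csv",
    "example-project.marts.accounts",
    job_config=job_config,
)
load_job.result()  # Wait for the load job to complete.
```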
Since moving to cloud, the early results are very promising: we’re seeing 3 to 4x faster query performance, and we can easily turn capacity up or down. We have five data marts in testing, where we use real-world data volumes to run comparison queries.
We’re still making modifications to how we set up and configure services in the cloud. That’s all part of the change that comes with owning and operating data assets securely in the cloud. We had to make sure that any personally identifiable information (PII) was stored securely and tokenized. We’ll also continue to tune cost management over time as we onboard more production data.
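The article doesn’t name the tokenization tooling, so purely as an illustration: Cloud DLP (via the google-cloud-dlp Python client) can de-identify PII before it lands in the warehouse. This sketch replaces detected values with their info-type name; a production tokenization setup would instead use a keyed, format-preserving transform.

```python
# Illustrative only -- the article doesn't say which tokenization tooling
# is used. This de-identifies PII with Cloud DLP by replacing detected
# values with their info-type name; real tokenization would use a keyed
# transform such as CryptoReplaceFfxFpeConfig with a KMS-wrapped key.
from google.cloud import dlp_v2

client = dlp_v2.DlpServiceClient()
parent = "projects/example-project/locations/global"  # hypothetical project

item = {"value": "Call customer 123-45-6789 about the overdraft."}
inspect_config = {"info_types": [{"name": "US_SOCIAL_SECURITY_NUMBER"}]}
deidentify_config = {
    "info_type_transformations": {
        "transformations": [
            {"primitive_transformation": {"replace_with_info_type_config": {}}}
        ]
    }
}

response = client.deidentify_content(
    request={
        "parent": parent,
        "deidentify_config": deidentify_config,
        "inspect_config": inspect_config,
        "item": item,
    }
)
print(response.item.value)  # "Call customer [US_SOCIAL_SECURITY_NUMBER] ..."
```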
Managing change and planning for the future
The change management side of cloud is an important component of the migration process. Even with our modern data architecture, we’re still shifting established patterns and use cases as we move workloads to Google Cloud. It’s a big change to go from needing more hardware under our old Teradata model to an on-demand model, where we can change capacity as our needs change. Helping 400 users migrate to newer tools requires time and planning. We hosted training sessions with help from Google, and made sure business analysts were involved up front to give feedback. We also invested in training and certifications for our analysts.
We’re on our way to demonstrating that Google Cloud can give us better cost-per-query performance than Teradata did. And using BigQuery means we can do more analytics in place, rather than following the previous process of copying, storing, and manipulating data, then creating a report.
As we think through how to organize our analytics resources, we want to get the business focused on priorities and consumer relationships. For example, we want to know the top five or so areas where analytics can add value, so we can all focus there. To make sure we get the most out of these new analytics capabilities, we set up a charter and included cross-functional leaders, so we know we’re all keeping that focus and executing on it. We’re retraining people in these new skills, and new roles are even emerging. We built a dedicated cloud-native team—really an extension of our DevOps team—focused on setting up infrastructure and using infrastructure as code.
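The article doesn’t say which infrastructure-as-code tool the team settled on. As one hedged sketch, here is what declaring a data mart’s resources could look like in Pulumi’s Python SDK; every resource name here is hypothetical.

```python
# A hedged infrastructure-as-code sketch using Pulumi's Python SDK -- the
# article doesn't name the team's IaC tool, and all names are hypothetical.
import pulumi
import pulumi_gcp as gcp

# An ETL landing zone in Cloud Storage.
landing_bucket = gcp.storage.Bucket(
    "etl-landing",
    location="US",
    uniform_bucket_level_access=True,
)

# A BigQuery dataset backing one analytics data mart.
risk_mart = gcp.bigquery.Dataset(
    "risk-mart",
    dataset_id="risk_mart",
    location="US",
)

pulumi.export("landing_bucket", landing_bucket.name)
pulumi.export("risk_mart_dataset", risk_mart.dataset_id)
```

Keeping these definitions in version control means environments can be reviewed, reproduced, and torn down the same way application code is.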
The program we’ve built is ready for what comes next. With our people and technology working together, we’re well set up for a successful future.