Fast Restart: A powerful new tool to help improve SAP HANA uptime
Technical Program Manager, Google Cloud
There are plenty of things in life where “good enough” is a worthy goal. But if you’re an SAP administrator, you know that “good enough” simply isn’t when it comes to reducing system downtime and achieving faster restart times on your business-critical SAP HANA environments.
Of course, when you’re pursuing perfection, some tactics are better than others, especially when you’re dealing with tight budgets and an overworked IT staff. That’s why we’re spotlighting a powerful technique that uses existing SAP HANA capabilities to help slash your database restart times. It’s an approach that most SAP admins can implement in minutes—and it complements Google Cloud’s existing arsenal of tools and tactics for maximizing SAP system availability.
Using persistent memory to help reduce HANA restart times
Restart times have always been a concern for SAP HANA, which—like any in-memory database—can take a long time to load resident data back into memory from persistent storage. Whether you’re talking about process restarts, system crashes, or planned maintenance, it’s common for HANA restarts to take an hour or longer. The process of reading data from disk back into memory accounts for virtually all of this downtime.
Beginning with SAP HANA 2 SPS3, SAP has supported the use of persistent memory (PRAM) to help reduce restart times. This approach stores column store fragments in a filesystem backed by persistent memory, such as Intel Optane DC Persistent Memory. It’s a tempting option for any organization where SAP HANA plays a business-critical role, and where the idea of losing access to HANA for an hour or more is enough to give any SAP admin some sleepless nights.
There’s a lot of value in maximizing HANA uptime. But there’s also value in adopting technology that’s flexible, scalable, and engineered to support innovation. Let’s look at another option that can help maximize HANA system availability while still meeting those goals.
Fast Restart: A valuable new way to combat HANA downtime
Beginning with HANA 2.0 SPS4, SAP has supported a method, dubbed Fast Restart, that offers many of the same benefits as PRAM. Fast Restart is a more limited solution than one using persistent memory, but it also has a major advantage: Customers can implement it on virtually any current host system, without sacrificing performance or flexibility.
In a nutshell, Fast Restart uses TMPFS, a long-established Unix facility for creating memory-backed virtual filesystems, to store HANA database columns in DRAM. Because DRAM is volatile, that data won’t survive a full VM restart, but it stays intact and in memory when a process restart or planned maintenance takes down only the HANA instance. That still covers a lot of situations where Fast Restart can turn an hour-long ordeal into a hiccup that users are unlikely to notice.
Who should be using Fast Restart? The short answer is simple: Almost everybody who runs SAP HANA 2 SPS4+, whether on-premises or in the cloud, should seriously consider it. Most of the time, adopting Fast Restart is as simple as knowing it’s available; the implementation process is relatively straightforward and low-risk.
For most SAP administrators, the process of implementing Fast Restart includes just three steps:
- Map out and understand the host environment’s non-uniform memory access (NUMA) topology. This is a critical preparatory step: HANA self-optimizes its memory access and process allocation based on its own reading of the system’s NUMA topology, so setting up TMPFS for HANA requires a similar understanding of how HANA recognizes and uses system memory.
- Create and mount the TMPFS filesystem. This includes creating and naming the required number of directories, setting mount options, updating fstab, and checking the resulting filesystem to confirm that it will function properly.
- Configure HANA to use Fast Restart. This includes some fairly simple changes to the HANA global INI parameters, and then deciding whether to move specific column tables or partitions into the persistent memory space or to change the default for all new tables.
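To make the first two steps more concrete, here is a minimal Python sketch that lists the NUMA nodes Linux exposes under /sys/devices/system/node and prints one directory, mount command, and fstab line per node. The /hana/tmpfs<N>/<SID> layout, the tmpfs<SID><N> source names, the mpol=prefer:<N> mount option, and the SID "HDB" are assumptions chosen for illustration rather than values from this article, so check them against the SAP documentation before using them. The script only prints commands; it does not change the host.

```python
import re
from pathlib import Path


def numa_node_ids(sysfs_root="/sys/devices/system/node"):
    """Return the sorted NUMA node ids found as node<N> entries under sysfs."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return []
    ids = []
    for entry in root.iterdir():
        match = re.fullmatch(r"node(\d+)", entry.name)
        if match:
            ids.append(int(match.group(1)))
    return sorted(ids)


def tmpfs_plan(sid, node_ids, base="/hana"):
    """Build the mkdir, mount, and fstab text for one tmpfs per NUMA node.

    The directory layout and source names are illustrative assumptions.
    """
    plan = []
    for node in node_ids:
        mount_point = f"{base}/tmpfs{node}/{sid}"
        source = f"tmpfs{sid}{node}"
        plan.append({
            "mount_point": mount_point,
            "mkdir": f"mkdir -p {mount_point}",
            # mpol=prefer:<N> asks the kernel to prefer memory from NUMA node N
            "mount": f"mount -t tmpfs -o mpol=prefer:{node} {source} {mount_point}",
            "fstab": f"{source} {mount_point} tmpfs rw,relatime,mpol=prefer:{node} 0 0",
        })
    return plan


if __name__ == "__main__":
    # Fall back to a single node if the sysfs directory is not available.
    nodes = numa_node_ids() or [0]
    for step in tmpfs_plan("HDB", nodes):
        print(step["mkdir"])
        print(step["mount"])
        print("fstab:", step["fstab"])
```

Printing the plan instead of executing it leaves room to review each mount point and compare the node list with HANA’s own view of the topology before touching fstab.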
When you’re ready to implement Fast Restart on your own HANA systems, be sure to review the SAP documentation for Fast Restart for a deeper dive into the setup process and to understand the requirements for using Fast Restart.
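As a rough illustration of the HANA configuration step, the Python sketch below renders the kind of global INI fragment and SQL statement involved. The parameter names (basepath_persistent_memory_volumes in the persistence section, table_default in the persistent_memory section) and the ALTER TABLE ... PERSISTENT MEMORY syntax are our recollection of the SAP documentation, not something this article specifies, and the mount points and table name are hypothetical. Treat all of them as assumptions to verify for your HANA revision.

```python
import configparser
import io


def fast_restart_ini(mount_points, default_on=False):
    """Render an INI fragment pointing HANA at the tmpfs mount points.

    Section and key names are assumptions to verify against SAP's docs.
    """
    cfg = configparser.ConfigParser()
    cfg["persistence"] = {
        # One path per NUMA node, separated by semicolons
        "basepath_persistent_memory_volumes": ";".join(mount_points),
    }
    if default_on:
        # Optionally make the persistent memory space the default for new tables
        cfg["persistent_memory"] = {"table_default": "ON"}
    buffer = io.StringIO()
    cfg.write(buffer)
    return buffer.getvalue()


def enable_table_sql(table_name):
    """Return SQL that moves one existing column table (assumed syntax)."""
    return f"ALTER TABLE {table_name} PERSISTENT MEMORY ON IMMEDIATE CASCADE;"


if __name__ == "__main__":
    # Hypothetical mount points and table name, for illustration only.
    print(fast_restart_ini(["/hana/tmpfs0/HDB", "/hana/tmpfs1/HDB"], default_on=True))
    print(enable_table_sql("MYSCHEMA.SALES"))
```

Keeping the default switch separate from the per-table statement mirrors the choice described above: move selected tables or partitions, or change the default for all new tables.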
Fast Restart by the numbers: A night-and-day difference
You may be wondering just how much of a difference Fast Restart can make during an event such as a HANA process restart. We were curious, too, so we set up a simple comparison test to get some hard numbers.
First, we generated a fairly typical HANA environment, including data in 40 tables with a total volume of 2.74 TB, configured for preload. We then measured the time elapsed from HANA startup invocation until all preload tables were loaded in memory. The first run used a memory-optimized virtual server on Compute Engine, without Fast Restart:
Compute Engine M1 memory-optimized server
Startup invocation: 11:41:47
Finished preload: 12:22:05
IO speed: approx. 1.17 GB/s
Total time elapsed for startup: approx. 40 minutes (40 min 18 s)
We then repeated the same measurement on the same virtual server, this time using Fast Restart:
Startup invocation: 11:12:32
Finished preload: 11:13:09
IO speed: approx. 28 MB/s
Total time elapsed for startup: under 1 minute (37 s)
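For anyone who wants to check the arithmetic, the short Python sketch below recomputes both elapsed times from the timestamps listed above and derives the speedup. The function and variable names are ours, and it assumes each start and end timestamp falls on the same day.

```python
from datetime import datetime


def elapsed_seconds(start, end, fmt="%H:%M:%S"):
    """Seconds between two same-day wall-clock timestamps."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


# Timestamps taken from the two test runs above
baseline_s = elapsed_seconds("11:41:47", "12:22:05")      # without Fast Restart
fast_restart_s = elapsed_seconds("11:12:32", "11:13:09")  # with Fast Restart
speedup = baseline_s / fast_restart_s

print(f"Without Fast Restart: {baseline_s} s")    # 2418 s, i.e. 40 min 18 s
print(f"With Fast Restart:    {fast_restart_s} s")  # 37 s
print(f"Speedup:              about {speedup:.0f}x")  # about 65x
```

In this test, that works out to a restart roughly 65 times faster.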
It’s hard to envision better performance, given the time SAP HANA needs simply to load its own binaries and to read checksum information required to validate the in-memory data. And while the process might take as long as 4 to 5 minutes depending on your exact HANA configuration, the difference in startup times speaks for itself.
Fast Restart on Google Cloud: One tool among many to protect HANA uptime
Fast Restart is a great option for any organization that runs SAP HANA, whether you’re running HANA on Google Cloud, on legacy on-premises systems, or with another cloud provider. But keep in mind that using Fast Restart raises an important question: What can you do to minimize downtime in cases where Fast Restart isn’t capable of closing the gap on its own? One example is when it’s necessary to shut down a host VM for planned maintenance or due to unplanned issues.
This is where the value of a multi-faceted availability strategy comes into play. For organizations running SAP HANA on Google Cloud, that means interlocking high availability solutions such as:
- Live Migration, which moves a running HANA instance seamlessly to a new VM prior to beginning scheduled maintenance, without the need for administrator monitoring or intervention.
- Host Auto Restart, which allows Compute Engine to restart a VM instance automatically on a different host. This offers the ability to quickly restart an affected application, typically through the use of customer-supplied startup scripts.
- High-availability database support, most notably Google Cloud’s support for synchronous SAP HANA system replication and for SAP HANA host auto-failover.
- Google Cloud’s approach to high availability by design, which allows SAP HANA users to leverage a redundant, global infrastructure to deploy applications across multiple zones and regions—capabilities that can accommodate stringent availability targets.
As we said, “good enough” is rarely good enough when it comes to maintaining the availability of your business-critical SAP HANA systems. Fast Restart is an important, and sometimes overlooked, tool for helping to improve your system availability. But the best approach to availability is one that relies on many solutions working in unison, which is exactly why Fast Restart can be such a valuable tool for organizations already running SAP HANA on Google Cloud.
Learn more about SAP on Google Cloud.