Cloud migration: What you need to know (and where to find it)
Mohan Krishnan
Solutions Architect
Migrating to the cloud for an enterprise that has been running workloads on premises for years can be very daunting. To be successful, a migration plan needs to factor in many different aspects relating to people, process and technology. If you are designing the migration, you need guidance and best practices to help steer you through this process.
Building on our experience as solutions architects, we have put together a comprehensive set of documents for IT practitioners who are planning, designing, and implementing a migration to Google Cloud. At our Migration to Google Cloud page, you’ll find extensive technical information and advice you need to help plan and execute a successful migration. To help you get started faster, this blog post provides a high-level outline and links into the relevant part of the documentation where you can get more information.
Getting started with the migration
Before you start your migration, you should gather some foundational understanding about Google Cloud, your environment, and different migration approaches:
1. Understand the difference between Google Cloud and the current environment. The source environment could be on-premises or a private hosting environment. These environments have a different operational model compared to a public cloud, from a physical security, networking, power, hardware and virtualization standpoint.
2. Identify the type of workloads that need to be migrated. We recommend you start your migration by classifying workloads as either legacy, or cloud-native. Legacy workloads were developed without any consideration for cloud environments, with limited support for scaling resources such as disks and compute. As a result, these workloads can be difficult to modify and expensive to run and maintain. When designed following best practices, cloud-native workloads are natively scalable, portable, available, and secure. As a result, cloud-native workloads tend to increase developer productivity and agility, because developers can focus on the actual workloads, rather than spending effort to manage development and runtime environments.
3. Determine your organization’s maturity level for cloud technologies. When identified early, skill gaps can be addressed as part of the migration process through actions like self-study, training or peer mentorship. You can use Google Cloud’s Cloud Adoption Framework to measure your organization’s cloud adoption maturity.
4. Familiarize yourself with the different types of migration approaches and their tradeoffs, because different workloads might require different migration approaches. We define three types of migrations:
Lift and shift. You migrate the workload, applying the least amount of changes.
Improve and move. You modify parts of the workload to adopt cloud-native approaches as part of the migration.
Rip and replace. You decommission the workload, and write a new workload, adopting a cloud-native approach.
For more information on migration types refer to the migration guide’s section on Types of migration.
The four phases of migration
Broadly speaking, the migration journey can be captured as a four-phase process: Assess, Plan, Deploy and Optimize. It’s easier to show this linearly, but it’s rarely so straightforward, with these phases often happening in parallel for different workloads.
Phase 1: Assess the workloads to be migrated
This phase builds on any pre-work that you’ve done, with a focus on taking an inventory of the workloads that you plan to migrate and their respective dependencies. Things to think about include (but are not limited to) hardware and performance requirements, users, licensing, compliance needs and workload dependencies. Then, map this information into an app catalog that summarizes the information across some key axis questions—for example:
Whether the workload has dependencies, or is a dependency for other workloads
How critical the workload is to the business
How difficult it is to migrate the workload
The app catalog will provide you with a high-level view of the amount of effort to migrate all your different workloads. You can also use automated tools such as StratoZone that can scan your existing workloads and provide you with information based on the data gathered. StratoZone not only helps with discovery but can also help you map your instances to matching Google Compute Engine instances. Check out this blog post for an introduction to StratoZone. Additional information on how to conduct discovery is also available in the Categorizing your apps section.
To further get a sense of the size of risk or effort, you should conduct a proof of concept (POC) that tests the different use cases and requirements of the workload, with a focus on the more complicated workloads. This aids with getting more information early as well as reducing unknowns.
You should also perform a total cost of ownership (TCO) calculation at this phase, giving the business visibility into what their cloud expenditure will look like as a result of the migration, compared to your existing environment. When moving from an on-prem to a cloud environment, there are often hidden costs that are missed when calculating the costs in the old data center. We list out some of the things to look out for when building this TCO in the Calculating total cost of ownership section of our guide. Getting the business to understand the shift in cost models and all of the additional benefits gained will be crucial to migration success.
Lastly, you need to decide on which workloads to migrate first. The answer will vary from business to business depending on many different factors such as business value of the workload, complexity of migration, and the availability and requirements of the workload. To help guide this decision, it’s a good idea to call a meeting of the subject matter experts of the different workloads and go through a jointly agreed list of factors. Succeeding with the first workload is key to the overall success of your migration journey, as early success yields trust and goodwill, whereas early challenges can sometimes derail entire migration projects.
Phase 2: Plan the foundation
The next phase is to plan the foundational pieces of the new cloud environment, which consist of but are not limited to:
1. Establishing user and service identities. How will users and service accounts be created and managed? You can choose between G Suite or Cloud Identity domains, and optionally integrating with your existing Identity Provider (IdP). Read up on this in the Identity and Access management section.
2. Designing a resource organization hierarchy. How are the different Google Cloud resources structured hierarchically? Organization nodes, folders and projects provide the building blocks to set up a resource organization hierarchy. A properly designed resource organization simplifies access control and billing management. Examples of different types of designs are:
Environment oriented hierarchy - This design separates out your production, quality assurance and development environments.
Function orientated hierarchy - This design breaks different business functions into their own folders at the top level, and implements an environment-orientated hierarchy beneath it.
Granular orientated hierarchy - This design builds on top of the function-orientated hierarchy by adding a business unit organization at the top level.
You can dive deep on this topic in the resource hierarchy section.
3. Defining groups and roles for resource access. What are the different roles of users who will be accessing your cloud environment? What permissions should these different roles have? You need to create manager roles such as organizational admin, network admin and security admins to manage the cloud resources. It is also a best practice to create specific roles for the different classes of users who will be using the cloud environment, for example developers, testers and site reliability engineers (SREs). All these roles will have a minimum set of permissions associated with them to carry out their tasks. The Best practices for enterprise organizations document provides more details on this topic.
4. Designing your network topology and connectivity. Into which regions will you deploy your application? Will there be connectivity back into the source environment? How many separate networks will you need to set up? The answers to these questions will feed into how you design your Virtual Private Cloud (VPC), which is your private network within Google Cloud. One VPC maps to one standalone network within your cloud environment. A VPC has subnets, firewall rules and routes that allow you to mimic the characteristics of a physical network. It’s important to also ensure you are applying security best practices; you can read about those in the Security section, as well as in the Secure your apps and data section of our Best practice for enterprise organizations guide. Connectivity back to the source environment is also possible using options such as direct interconnect, peering or a VPN. For more information read the Connectivity and networking section.
Phase 3: Deploy the workloads
Once the foundation for your migration is in place, the next step is to determine the best approach to deploy your workloads to your cloud environment. You don’t need to take the same approach for all your workloads, however, the more standardized the process is, the more opportunity for cross-team learning and improvement of the deployment process. Example of different deployment approaches are:
1. Fully manual deployments. This approach is the simplest and quickest way to get your workload up and running, and can be performed from the Cloud Console or Cloud SDK directly. Although a manual deployment might be all right for some experimentation, we do not recommend this approach for production workload deployments because it is error prone, not repeatable and tends to be poorly documented. If you are currently using manual deployments, the Migration from manual deployments to automated, containerized deployments section will be able to help you improve your process.
For production environments, , a more practical option is to use a service that can automatically replicate the existing workloads in your current environment and deploy it to GCP. Google Cloud offers several such services:
Migrate for Compute Engine - This allows you to migrate VM-based applications from your existing environment (e.g. VMware, Azure, AWS) to GCP with minimal downtime and risk.
Migrate for Anthos - Instead of migrating VMs as-is, you can intelligently convert and workloads running in VMs and migrate those workloads into containers in GKE. This often results in a reduction of cost and management.
Database Migration Solutions - Whether through third parties such as Striim, or using native replication support in Google Cloud SQL, there are lots of different techniques to getting your data into Google Cloud.
VMware Engine - Migrate any existing VMware-based workloads from your on-prem infrastructure without any changes directly to Google Cloud VMware Engine. This allows you to reuse any existing VMware deployment tooling and get started immediately with your migration, and easily add new workloads with the VMware framework within Google Cloud.
2. Deploy using configuration management tools. Using configuration management (CM) tools such as Ansible, Chef or Puppet provides a repeatable, automated and controlled way to run your deployment. However, these tools are best suited for provisioning and configuration, and less suitable for workload deployments. This is because the tools require bespoke deployment logic to handle procedures such as zero-downtime deploys, blue-green deployments or rolling updates, and end up becoming more difficult to manage and maintain over the long run.
3. Deploy by using container orchestration tools. If your workloads are containerized you can use Google Kubernetes Engine (GKE) to handle the deployment process. The Kubernetes orchestrator supports many types of deployment logic such as zero-downtime deploys and rolling updates out of the box. Alternatively if your workloads are still on VMs running GCE, Azure or AWS Migrate for Anthos allows you to convert your VMs into containers automatically. This allows you to gain the benefit of running on containers quicker.
4. Deploy automatically. An automated deployment process is triggered based on some action that results in a change in the workload and can be built on top of any orchestration tool that can be scripted. Automated deployments allow you to streamline and standardize your deployment process reducing human error.
You can use tools such as Jenkins, SonarQube, Cloud Build or Spinnaker to build an end-to-end automated deployment pipeline on top of your existing orchestration tools. The key steps of an automated deployment process are:
Code review. Every change to your codebase should be reviewed by a peer to ensure the quality of the change before merging it into the codebase.
Continuous integration (CI). Once merged, the CI tool runs all existing tests against the new version for the codebase and ensures that no tests fail. Only then does it mark the build as successful.
Artifact production. For every successful build an artifact is produced. A container is an example of an artifact. Tests can also be run by using tools such as Serverspec to ensure that the artifacts are working well.
Continuous deployment (CD). A successful artifact is then deployed into your development or quality assurance cloud environment, after which another set of functional tests could be run against the deployment to ensure that its running well. Once those tests pass, the deployment can then be deployed to your production environment, either automatically, or after being manually triggered by an operator.
5. Deploy by applying the infrastructure-as-code pattern. The idea behind infrastructure as code is to treat configuring and provisioning your cloud resources in the same way you treat the source code for your workloads. Similar to how new versions of workloads are deployed by going through a series of automated steps and tests, any changes to the infrastructure configuration also go through a series of steps that involve testing before being deployed to the target cloud environment. This is our recommended best practice as it provides repeatability and traceability, which improve overall deployment velocity. This process can be implemented using tools such as Terraform and managed services such as a Deployment Manager.
Phase 4: Optimize your environment
Once a basic deployment of your workloads is running and tested in your new Google Cloud environment, you can start to improve on this foundation. This includes critical pieces that should be completed before cutting over live traffic, for example training your team on new cloud operational playbooks as well as ensuring that logging, monitoring and alerting for these workloads are in place.
Other aspects that you can optimize once the workload is serving production traffic include:
Cost optimization with autoscaling
Moving to managed workloads to reduce operational overhead
Automating the deployment process
Read up on how best to approach this in the Optimizing your environment section.
Read on to ensure a successful cloud migration
A large migration can be daunting for the most ambitious of teams. But with the right methodology, planning, and testing before deployment, you can break the problem down into smaller, more manageable steps. Our Migration to Google Cloud solution guide covers the above in more detail, and also provides additional resources, like our ‘Finding Help’ section, that you can use to help start migrating your workloads to the cloud.
If you require more assistance from professionals who have a track record of successful migrations, the Google Cloud Professional Services Organization offers consulting services directly or via a host of partners with a wide range of specialties. Just reach out and we can help you get on your way!