This document helps you plan and design a migration path from manual deployments to automated, containerized deployments in Google Cloud using cloud-native tools and Google Cloud managed services.
This document is part of a multi-part series about migrating to Google Cloud. If you're interested in an overview of the series, see Migration to Google Cloud: Choosing your migration path.
This document is part of a series:
- Migration to Google Cloud: Getting started
- Migration to Google Cloud: Assessing and discovering your workloads
- Migration to Google Cloud: Building your foundation
- Migration to Google Cloud: Transferring your large datasets
- Migration to Google Cloud: Deploying your workloads
- Migration to Google Cloud: Migrating from manual deployments to automated, containerized deployments (this document)
- Migration to Google Cloud: Optimizing your environment
This document is useful if you're planning to modernize your deployment processes, if you're migrating from manual and legacy deployment processes to automated and containerized deployments and infrastructure as code (IaC), or if you're evaluating the opportunity to migrate and want to explore what it might look like.
Before starting this migration, you should evaluate the scope of the migration and the status of your current deployment processes, and set your expectations and goals. You choose the starting point according to how you're currently deploying your workloads:
- You're deploying your workloads manually.
- You're deploying your workloads with configuration management (CM) tools.
It's hard to move from manual deployments directly to fully automated and containerized deployments. Instead, we recommend the following migration steps:
- Deploy by using container orchestration tools.
- Deploy automatically.
- Deploy by applying the IaC pattern.
The following diagram illustrates the path of this migration:
During each migration step, you follow the phases defined in Migration to Google Cloud: Getting started:
- Assessing and discovering your workloads.
- Planning and building a foundation.
- Deploying your workloads.
- Optimizing your environment and workloads.
The following diagram illustrates the migration phases of each step.
This migration path is an ideal one, but you can stop earlier in the migration process if the benefits of moving to the next step outweigh the costs for your particular case. For example, if you don't plan to automatically deploy your workloads, you can stop after you deploy by using container orchestration tools. You can revisit this document in the future, when you're ready to continue on the journey.
When you move from one step of the migration to another, there is a transition phase where you might be using different deployment processes at the same time. In fact, you don't need to choose only one deployment option for all of your workloads. For example, you might have a hybrid environment where you manage your infrastructure applying the IaC pattern, while still deploying your workloads with container orchestration tools.
Migrating to container orchestration tools
One of your first steps to move from manual deployments is to deploy your workloads with container orchestration tools. In this step, you design and implement a deployment process to handle containerized workloads by using container orchestration tools, like Kubernetes.
If your workloads aren't already containerized, containerizing them requires significant effort. Not all workloads are suitable for containerization. If a workload isn't cloud native or ready for containerization, it might not be worth containerizing it. Some workloads can't support containerization at all, for technical or licensing reasons.
Assess and discover your workloads
To scope your migration, you first need an inventory of the artifacts that you're currently producing and deploying along with their dependencies on other systems and artifacts. To build this inventory, you need to use the expertise of the teams that designed and implemented your current artifact production and deployment processes. The Migration to Google Cloud: Assessing and discovering your workloads document discusses how to assess your environment during a migration and how to build an inventory of apps.
For each artifact, you need to evaluate its current test coverage. You should have proper test coverage for all your artifacts before moving on to the next step. If you have to manually test and validate each artifact, you don't benefit from the automation. Adopt a methodology that highlights the importance of testing, like test-driven development.
When you evaluate your procedures, consider how many different versions of your artifacts you might have in production. For example, if the latest version of an artifact is several versions ahead of instances that you must support, you have to design a model that supports every version that's still in use.
Also consider the branching strategy that you use to manage your codebase. A branching strategy is only part of a collaboration model that you need to evaluate, and you need to assess the broader collaboration processes inside and outside your teams. For example, if you adopt a flexible branching strategy but don't adapt it to the communication process, the efficiency of those teams might be reduced.
In this assessment phase, you also determine how you can make the artifacts that you're producing more efficient and more suitable for containerization than they are with your current deployment processes. One way to improve them is to assess the following:
- Common parts: Assess what your artifacts have in common. For example, if you have common libraries and other runtime dependencies, consider consolidating them in one runtime environment.
- Runtime environment requirements: Assess whether you can streamline the runtime environments to reduce their variance. For example, if you're using different runtime environments to run all your workloads, consider starting from a common base to reduce the maintenance burden.
- Unnecessary components: Assess whether your artifacts contain unnecessary parts. For example, you might have utility tools, such as debugging and troubleshooting tools, that are not strictly needed.
- Configuration and secret injection: Assess how you're configuring your artifacts according to the requirements of your runtime environment. For example, your current configuration injection system might not support a containerized environment.
- Security requirements: Assess whether your container security model meets your requirements. For example, the security model of a containerized environment might clash with the requirement of a workload to have super user privileges, direct access to system resources, or sole tenancy.
- Deployment logic requirements: Assess whether you need to implement rich deployment logic. For example, if you need to implement a canary deployment process, determine whether the container orchestration tool supports it.
Plan and build a foundation
Next you provision and configure the Google Cloud infrastructure and services to support your deployment processes on Google Cloud. The Migration to Google Cloud: Building your foundation document contains guidance on how to build your foundation.
When you're creating Google Cloud organizations, folders, and projects, consider that the deployment processes are shared across multiple environments. We recommend a function-oriented hierarchy or a granular access-oriented hierarchy. These hierarchies give you the necessary flexibility to manage your resources and the possibility of having multiple environments for development and testing.
When you're establishing user and service identities, for the best isolation you need at least a service account for each deployment process step. For example, if your process executes steps to produce the artifact and to manage the storage of that artifact in a repository, you need at least two service accounts. If you want to provision and configure development and testing environments for your deployment processes, you might need to create more service accounts. If you have a distinct set of service accounts per environment, you make the environments independent from each other. Although this configuration increases the complexity of your infrastructure and puts more burden on your operations team, it gives you the flexibility to independently test and validate each change to the deployment processes.
You also need to provision and configure the services and infrastructure to support your containerized workloads:
- Set up a registry to store your container images, like Container Registry. To isolate this registry and the related maintenance tasks, set it up in a dedicated Google Cloud project.
- Provision and configure the Kubernetes clusters you need to support your workloads. Depending on your current environment and your goals, you can use services like Google Kubernetes Engine (GKE) and Anthos.
- Provision and configure persistent storage for your stateful workloads. For more information, see Google Kubernetes Engine storage overview.
By using container orchestration tools, you don't have to worry about provisioning your infrastructure when you deploy new workloads. For example, you can use cluster autoscaler to automatically resize your GKE cluster as needed.
Deploy your artifacts with container orchestration tools
Based on the requirements you gathered in the assessment phase and the foundation phase of this step, you do the following:
- Containerize your workloads.
- Implement deployment procedures to handle your containerized workloads.
Containerizing your workloads is a nontrivial task. What follows is a generalized list of activities that you need to adapt and extend to containerize your workloads. Your goal is to cover your own needs, such as networking and traffic management, persistent storage, secret and configuration injection, and fault tolerance requirements. This document covers two activities: building a set of container images to use as a base, and building a set of container images for your workloads.
First, you automate the artifact production, so you don't have to manually produce a new image for each new deployment. The artifact building process should be automatically triggered each time the source code is modified so that you have immediate feedback about each change.
You execute the following steps to produce each image:
- Build the image.
- Run the test suite.
- Store the image in a registry.
For example, you can use Cloud Build to build your artifacts, run the test suites against them, and, if the tests are successful, store the results in Container Registry.
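As a sketch, a Cloud Build configuration for these three steps might look like the following. The image name `my-app` and the test script `run_tests.sh` are hypothetical placeholders that you adapt to your workload:

```yaml
# cloudbuild.yaml -- a minimal sketch, assuming a hypothetical my-app image
# and a run_tests.sh script inside it.
steps:
# Build the container image, tagged with the commit hash.
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'gcr.io/$PROJECT_ID/my-app:$SHORT_SHA', '.']
# Run the test suite inside the image that was just built.
- name: 'gcr.io/$PROJECT_ID/my-app:$SHORT_SHA'
  entrypoint: 'sh'
  args: ['-c', './run_tests.sh']
# Push the image to Container Registry only if all the steps succeed.
images:
- 'gcr.io/$PROJECT_ID/my-app:$SHORT_SHA'
```

Because the `images` field pushes the image only after every step completes successfully, a failing test suite prevents the artifact from reaching the registry.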
You also need to establish rules and conventions for identifying your artifacts. When producing your images, label each one to make each execution of your processes repeatable. For example, a popular convention is to identify releases by using semantic versioning where you tag your container images when producing a release. When you produce images that still need work before release, you can use an identifier that ties them to the point in the codebase from which your process produced them. For example, if you're using Git repositories, you can use the commit hash as an identifier for the corresponding container image that you produced when you pushed a commit to the main branch of your repository.
During the assessment phase of this step, you gathered information about your artifacts, their common parts, and their runtime requirements. With this information, you can design and build a set of base container images and another set of images for your workloads. You use the base images as a starting point to build the images for your workloads. The set of base images should be tightly controlled and supported to avoid proliferating unsupported runtime environments.
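For example, a workload's Dockerfile can build on a centrally maintained base image rather than directly on a public one. The registry path, base image name, and application files in this sketch are hypothetical:

```dockerfile
# Hypothetical base image, maintained centrally in its own repository and
# kept under tight control by the team that supports it.
FROM gcr.io/my-project/base-images/python-runtime:1.2.0

# Add only the workload on top of the controlled base.
WORKDIR /app
COPY ./app /app
RUN pip install --no-cache-dir -r requirements.txt
CMD ["python", "main.py"]
```

Pinning the base image to an explicit version keeps builds repeatable and lets you roll out base image updates deliberately across all dependent workloads.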
When producing container images from base images, remember to extend your test suites to cover the images, not only the workloads inside each image. You can use tools like InSpec, ServerSpec, and RSpec to run compliance test suites against your runtime environments.
When you finish containerizing your workloads and implementing procedures to automatically produce container images, you implement the deployment procedures that use container orchestration tools. You use the information about deployment logic requirements that you gathered in the assessment phase to design rich deployment procedures. By using container orchestration tools, you can compose the deployment logic from the provided mechanisms, instead of having to manually implement them.
When designing and implementing your deployment procedures, consider how to inject configuration files and secrets in your workloads, and how to manage data for stateful workloads. Configuration files and secret injection are instrumental to produce immutable artifacts. By deploying immutable artifacts, you can do the following:
- You can promote the same artifact through your runtime environments. For example, you deploy your artifacts in your development environment; then, after testing and validating them, you move them to your quality assurance environment, and finally to the production environment.
- You lower the chances of issues in your production environments because the same artifact went through multiple testing and validation activities.
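As an illustrative sketch, a Kubernetes Deployment can inject configuration and secrets at deployment time, so the container image itself stays immutable across environments. All the names in this manifest are hypothetical:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        # The same immutable image is promoted across environments.
        image: gcr.io/my-project/my-app:1.4.0
        envFrom:
        # Environment-specific configuration and secrets live outside the
        # image and are injected when the workload is deployed.
        - configMapRef:
            name: my-app-config
        - secretRef:
            name: my-app-secrets
```

Each environment provides its own `my-app-config` ConfigMap and `my-app-secrets` Secret, so the manifest and image stay identical from development through production.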
If your workloads are stateful, we suggest you provision and configure the necessary persistent storage for your data. On Google Cloud, you have different options:
- Persistent disks managed with GKE
- Fully managed database services, like Cloud SQL, Firestore, and Cloud Spanner
- File storage services like Filestore
- Object store services like Cloud Storage
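For example, a workload running on GKE can request a persistent disk through a PersistentVolumeClaim. The claim name and size in this sketch are placeholders:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-app-data
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      # On GKE, the default StorageClass dynamically provisions a
      # persistent disk to satisfy this claim.
      storage: 10Gi
```

The stateful workload then mounts the claim as a volume, and the data outlives any individual Pod.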
When you're able to automatically produce the artifacts to deploy, you can set up the runtime environments for the tools that you use to deploy your workloads. To control the runtime environment for the deployment tools, you can set the environment up as a build in Cloud Build and use that build as the only means to deploy the artifacts in your environments. By using Cloud Build, you don't need each operator to set up a runtime environment on their machines. You can immediately audit the procedure that creates the runtime environment and its contents by inspecting the source code of the build configuration.
Optimize your environment
After implementing your deployment processes with container orchestration tools, you can start optimizing those processes. The Migration to Google Cloud: Optimizing your environment document contains guidance on how to optimize your environment.
The requirements of this optimization iteration are the following:
- Extend your monitoring system as needed.
- Extend the test coverage.
- Increase the security of your environment.
You extend your monitoring system to cover your new artifact production, your deployment procedures, and all of your new runtime environments.
If you want to effectively monitor, automate, and codify your processes as much as possible, we recommend that you increase the coverage of your tests. In the assessment phase, you ensured that you had at least minimum end-to-end test coverage. During the optimization phase, you can expand your test suites to cover more use cases.
Finally, if you want to increase the security of your environments, you can configure binary authorization to allow only a set of signed images to be deployed in your clusters. You can also enable Container Analysis to scan container images stored in Container Registry for vulnerabilities.
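As a sketch, a Binary Authorization policy that blocks images lacking a signed attestation might look like the following, where the project and attestor names are hypothetical:

```yaml
# Hypothetical Binary Authorization policy: deployments are blocked and
# logged unless the image carries an attestation from prod-attestor.
globalPolicyEvaluationMode: ENABLE
defaultAdmissionRule:
  evaluationMode: REQUIRE_ATTESTATION
  enforcementMode: ENFORCED_BLOCK_AND_AUDIT_LOG
  requireAttestationsBy:
  - projects/my-project/attestors/prod-attestor
```

In this model, your build pipeline signs images after they pass testing, and the cluster admits only those signed images.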
Migrating to deployment automation
After migrating to container orchestration tools, you can move to full deployment automation, and you can extend the artifact production and deployment procedures to automatically deploy your workloads.
Assess and discover your workloads
Building on the previous evaluation, you can now focus on the requirements of your deployment processes:
- Manual approval steps: Assess whether you need to support any manual steps in your deployment procedures.
- Deployment frequency: Assess how many deployments per unit of time you need to support.
- Factors that cause a new deployment: Assess which external systems interact with your deployment procedures.
Needing to support manual deployment steps doesn't mean that your procedure cannot be automated. In that case, you automate each step of the procedure and place the manual approval gates where appropriate.
Supporting multiple deployments per day or per hour is more complex than supporting a few deployments per month or per year. However, if you don't deploy often, your agility and your ability to react to issues and to ship new features in your workloads might be reduced. For this reason, before designing and implementing a fully automated deployment procedure, it's a good idea to set your expectations and goals.
Also evaluate which factors trigger a new deployment in your runtime environments. For example, you might deploy each new release in your development environment, but deploy the release in your quality assurance environment only if it meets certain quality criteria.
Plan and build a foundation
To extend the foundation you built in the previous step, you can provision and configure services to support your automated deployment procedures.
For each of your runtime environments, set up the infrastructure that's necessary to support your deployment procedures. For example, if you provision and configure the infrastructure for your deployment procedures in your development, quality assurance, pre-production, and production environments, you have the freedom and flexibility to test changes to your procedures. However, if you use a single infrastructure for all your runtime environments, your environments are simpler to manage, but less flexible when you need to change your procedures.
When provisioning the service accounts and roles, consider isolating your environments and your workloads from each other by creating dedicated service accounts that don't share responsibilities. For example, don't reuse the same service accounts for your different runtime environments.
Deploy your artifacts with fully automated procedures
In this phase, you configure your deployment procedures to deploy your artifacts with no manual interventions, other than approval steps.
For any given artifact, each deployment procedure should execute the following tasks:
- Deploy the artifact in the target runtime environment.
- Inject the configuration files and secrets in the deployed artifact.
- Run the compliance test suite against the newly deployed artifact.
- Promote the artifact to the production environment.
Make sure that your deployment procedures provide interfaces to trigger new deployments according to your requirements.
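As an illustrative sketch, the first tasks in that list might map to a Cloud Build configuration like the following. The cluster name, zone, manifest path, and compliance test script are hypothetical placeholders:

```yaml
# Hypothetical deployment build; adapt the cluster, zone, and paths to
# your own environment.
steps:
# Deploy the artifact to the target runtime environment.
- name: 'gcr.io/cloud-builders/kubectl'
  args: ['apply', '-f', 'k8s/deployment.yaml']
  env:
  - 'CLOUDSDK_COMPUTE_ZONE=us-central1-a'
  - 'CLOUDSDK_CONTAINER_CLUSTER=qa-cluster'
# Run the compliance test suite against the newly deployed artifact.
- name: 'gcr.io/cloud-builders/gcloud'
  entrypoint: 'sh'
  args: ['-c', './compliance_tests.sh https://qa.example.com']
```

A separate, similarly structured build can then promote the same artifact to the production environment once any required approval gate is cleared.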
Code review is a necessary step when implementing automated deployment procedures, because of the short feedback loop that's part of these procedures by design. For example, if you deploy changes to your production environment without any review, you impact the stability and reliability of your production environment. An unreviewed, malformed, or malicious change might cause a service outage.
Optimize your environment
After automating your deployment procedures, you can execute another optimization iteration. The requirements of this iteration are the following:
- Extend your monitoring system to cover the infrastructure supporting your automated deployment procedures.
- Implement more advanced deployment patterns.
- Implement a break glass procedure.
An effective monitoring system lets you plan further optimizations for your environment. When you measure the behavior of your environment, you can find any bottlenecks that are hindering your performance or other issues, like unauthorized or accidental access and exploits. For example, you can configure your environment so that you receive alerts when the consumption of certain resources reaches a threshold.
When you're able to efficiently orchestrate containers, you can implement advanced deployment patterns depending on your needs. For example, you can implement canary deployments and blue/green deployments to increase the reliability of your environment and reduce the impact of any issue for your users.
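For example, a basic canary pattern on Kubernetes runs a small number of canary replicas behind the same Service as the stable replicas, so only a fraction of traffic reaches the new version. All names and versions in this sketch are hypothetical:

```yaml
# The Service selects by the shared app label, so it load-balances across
# both the stable and the canary Pods.
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app        # matches both stable and canary Pods
  ports:
  - port: 80
    targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-canary
spec:
  replicas: 1          # for example, 1 canary replica next to 9 stable ones
  selector:
    matchLabels:
      app: my-app
      track: canary
  template:
    metadata:
      labels:
        app: my-app
        track: canary
    spec:
      containers:
      - name: my-app
        image: gcr.io/my-project/my-app:1.5.0-rc1
```

If the canary misbehaves, you delete the canary Deployment to restore the previous state; if it performs well, you roll the new image out to the stable Deployment.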
Given the fully automated nature of the deployment process, we recommend that you design and implement a break glass procedure that lets you interact with your runtime environments without using the normal deployment procedures. You use this procedure only under exceptional circumstances and when preapproved by senior members of your team. For example, if an issue with your deployment procedure locks you out of the environment, you use the break glass procedure to roll back the change that caused the issue.
Adopting infrastructure as code
Now that your teams know how to effectively use Google Cloud, you can apply the IaC pattern. IaC is a process where you treat the provisioning of resources in a runtime environment in the same way that you handle the source code of your workloads.
Assess and discover your infrastructure
In this assessment phase, you gain a deep understanding of the infrastructure you provisioned in the previous migration steps, including the following:
- Google Cloud resources that you configured as part of your infrastructure.
- Change-management processes that you currently have in place.
- Members of your organization that have the rights to modify your infrastructure.
Having an inventory of the resources you currently have in your infrastructure is crucial to adopt IaC, because in this migration step you have to describe them with code.
A change-management process is fundamental to manage the evolution of your infrastructure. If you already have such processes, you adapt them to handle IaC. If you don't have any change-management process for your infrastructure, this phase is a chance to design and implement one. A change-management process should at least include a review phase where you analyze the proposed changes. This analysis, along with an assessment of which team members can modify your infrastructure, is necessary to lower the chances that a change to your infrastructure causes downtime or unexpected billing.
Plan and build a foundation
Extending the foundation you built in the previous step, you need to provision and configure infrastructure to support the adoption of IaC.
First, you need to choose which tools you're going to adopt. Some commonly used tools on Google Cloud include the following:
- Deployment Manager, a managed service with full support for all Google Cloud resources.
- Terraform, an open source provisioning tool that supports Google Cloud and other cloud providers.
- Chef, Puppet, and Ansible, open source configuration tools that support Google Cloud.
After choosing IaC tools, you need to provision and configure all the necessary infrastructure to support them. You need at least the following:
- Source code repositories to manage and version the descriptors of your resources.
- A code review tool to analyze and approve each change before it goes live.
- A runtime environment to execute the IaC tool after your teams approve the changes during a review.
Some IaC tools need services to manage your infrastructure. For example, Terraform needs remote data storage if you want to collaborate with other members of your organization to manage your infrastructure.
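For example, a minimal Terraform configuration that stores its state in a Cloud Storage bucket might look like the following, assuming a hypothetical, pre-existing bucket:

```hcl
# A minimal sketch; the state bucket must exist before you run
# `terraform init`, and its name here is a hypothetical placeholder.
terraform {
  backend "gcs" {
    bucket = "my-project-terraform-state"
    prefix = "env/production"
  }
}

provider "google" {
  project = "my-project"
  region  = "us-central1"
}
```

With a remote backend, multiple team members share the same state, and Terraform can lock the state to prevent concurrent, conflicting changes.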
If you choose to manage this foundation with an IaC tool, we recommend that you implement safeguards to avoid disrupting the foundation when you apply changes. These disruptions can cause extended downtimes and unrecoverable data losses. For example, if you accidentally delete the source code repositories where you store your infrastructure descriptors, you can cause an irrecoverable data loss. You can use tools like liens to prevent projects from being deleted.
Provision Google Cloud resources with IaC
When the Google Cloud environment is ready, you can adopt IaC to manage the resources in your environment:
- Describe your existing resources with code.
- Provision new resources with IaC.
In the previous migration steps, you provisioned Google Cloud resources by using the Cloud Console, the Cloud SDK, or the Cloud APIs. In this phase, you describe those resources with code, following the syntax and conventions of the IaC tool that you chose.
You might need to import your current resources and instances in your tool's inventory to make the IaC tool aware of these resources and instances. The IaC tool you choose doesn't have state information about your current resources and instances. You must either import the resources, or manually destroy those instances and let the IaC tool recreate them. For example, Terraform can import your existing infrastructure.
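As a sketch, importing an existing Cloud Storage bucket into Terraform involves first describing it in code. The bucket name here is a hypothetical placeholder for a bucket that already exists:

```hcl
# Describe the already-existing bucket in code first, matching its
# real attributes.
resource "google_storage_bucket" "artifacts" {
  name     = "my-project-artifacts"
  location = "US"
}
```

Then a command like `terraform import google_storage_bucket.artifacts my-project-artifacts` records the existing bucket in Terraform's state, so the tool manages it without destroying and recreating it.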
If you have to define new components in your infrastructure, you now describe them with code directly, without going through any other provisioning procedures.
Optimize your environment
After you adopt IaC, you execute another optimization iteration. This iteration has the following requirements:
- Extend your monitoring system as needed.
- Extend your test suites to cover your infrastructure.
- Automate the provisioning and configuration of your infrastructure.
- Extend the break glass procedure to cover your infrastructure.
Building on what you did in the previous deployment phase and the optimization phase of the previous migration step, you extend monitoring to the infrastructure supporting your IaC adoption. For example, you can monitor all the runtime environments where your IaC tools run and accurately log each execution to build an audit trail that you can later inspect.
You can extend your test suites to cover the infrastructure, and not only your workloads and container images. You can use tools like InSpec, ServerSpec, and RSpec to run compliance test suites for your runtime environments. You can protect your infrastructure from manual changes, or from changes coming from outside your IaC pipeline. By continuously running the test suites against your infrastructure, you can detect unauthorized changes and correct the situation.
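For example, a small InSpec control can verify that a runtime environment doesn't contain components that your baseline forbids. The specific checks below are hypothetical examples of such a baseline:

```ruby
# A hypothetical InSpec control; run it with the inspec CLI against a
# target runtime environment.
control 'runtime-baseline' do
  impact 0.7
  title 'Runtime environments contain no forbidden components'

  # Debugging tools should not be installed in hardened environments.
  describe package('telnet') do
    it { should_not be_installed }
  end

  # Direct SSH access to the runtime environment should be disabled.
  describe port(22) do
    it { should_not be_listening }
  end
end
```

Running such controls on a schedule, rather than only at deployment time, helps you detect drift introduced by manual or out-of-band changes.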
When you're confident in your adoption of IaC, look at automating the provisioning of your infrastructure by designing and implementing new procedures. These provisioning procedures are significantly different from those you use to produce and deploy your artifacts: they handle changes to your infrastructure, not to your applications. For this reason, they solve different problems and have a different error blast radius, where the blast radius describes the extent of the damage when an element of your environment fails. For example, if you deploy a faulty artifact, you might cause a service disruption that impacts one or more use cases. If you provision a faulty infrastructure component, the potential service disruption might impact multiple, if not all, services in your environment.
Google Cloud provides the following support resources:
- Self-service resources. If you don't need dedicated support, you have various options that you can use at your own pace.
- Technology partners. Google Cloud has partnered with multiple companies to help you use our products and services.
- Google Cloud professional services. Our professional services can help you get the most out of your investment in Google Cloud.
There are more resources to help migrate workloads to Google Cloud in the Google Cloud Migration Center.
- Support your migration with Istio mesh expansion
- Manage Google Cloud projects with Terraform
- Read about implementing Continuous delivery pipelines with Spinnaker and Google Kubernetes Engine.
- Read about Continuous deployment to Google Kubernetes Engine using Jenkins.
- Migrate your VMs to Google Cloud with Migrate for Compute Engine.
- Read about how to structure Mass VM migrations to Google Cloud with Migrate for Compute Engine.
- Learn more about Anthos and Migrate for Anthos and GKE.
- Explore reference architectures, diagrams, tutorials, and best practices about Google Cloud. Take a look at our Cloud Architecture Center.