Implementing IAM access control as code with HashiCorp Terraform
Developer Relations Engineer
Today, digital transformation requires security transformation. Identity and Access Management (IAM) can be used as the first line of defense in your Google Cloud security strategy. IAM is a collection of tools that allows administrators to define who can do what on resources in a Google Cloud account. Understanding which users need access to which resources in your organization is one of the first steps in implementing a secure cloud experience.
IAM goes far beyond users and groups. Once we have identified our users and groups, how can we give them access? Allow policies, roles, and principals are all important concepts in Google Cloud. In addition to these concepts, service accounts allow a service (a non-human identity) to authenticate to another service. Got a workload running outside of Google Cloud? If so, workload identity federation is a great way to authenticate it to Google Cloud. Finally, organization policies let you set compliance guardrails.
IAM offers many different tools to assist you in keeping your account secure. So now, how can we implement and keep track of these tools and concepts? Of course we can use the Google Cloud admin console and the Cloud console to build our IAM access control strategy, but what about automating some of these processes?
Infrastructure as code (IaC) is pretty common among operations teams. Products like HashiCorp Terraform enable IaC and allow you to use text-based files to automate provisioning and setting up your infrastructure. The IAM concepts we talked about earlier might not be considered traditional infrastructure, but we can view them as a hybrid of infrastructure and policy. We can use Terraform for more than just infrastructure as code; we can also use it to implement account access controls.
Why would you want to use Terraform to implement access controls in your Google Cloud account?
- Speed. Terraform provides a level of automation. Describing your access controls in code lets Terraform apply them to your Google Cloud account through API calls, which can shorten development and implementation time.
- Integration. Because Terraform uses APIs on the backend, we can integrate building access controls into new or existing pipelines.
- Version control. Because we are using Terraform .tf configuration files, we can upload our code to a source code repository. We can use Git to keep track of all the changes and different versions of our code.
- Collaboration. Because we can store our code in a source code repository, our access controls can be shared across the team. Making use of pull requests allows your team to increase knowledge sharing.
- Consistency. Because we have our access controls defined in code, we can enforce best practices using modules in Terraform. Modules allow you to reuse code in various configurations, which further ensures consistency and speeds up development time.
Let’s briefly look at some basic components of IAM, which make up the foundation of any IAM strategy.
A role is a collection of individual permissions. Permissions can be looked at as “things I can do with a service”. For example, the Cloud Run Invoker role includes the run.jobs.run and run.routes.invoke permissions.
Predefined roles are roles that Google creates to allow you to do certain things based on responsibilities. Using predefined roles will help limit your blast radius, which will in turn help strengthen your access control strategy.
To increase security even more, you can create your own custom roles that give even more granular permissions to principals, so that they only have access to the permissions they need and nothing more. This is called the principle of least privilege, and it is an access control best practice.
A role binding is the association of a role (a set of permissions) with a principal. This gives the principal access to whatever permissions make up that role. We can take this a step further with allow policies. An allow policy is a collection of role bindings that bind one or more principals to individual roles.
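As a rough illustration, a custom role could be declared in Terraform along these lines. This is a minimal sketch, not a definitive implementation: the project ID, role ID, and title are placeholders, and it assumes the two Cloud Run permissions named earlier are available for use in custom roles.

```hcl
# Hypothetical custom role; every value below is a placeholder for illustration.
resource "google_project_iam_custom_role" "run_invoker_minimal" {
  project     = "your-project-id" # assumption: replace with a real project ID
  role_id     = "runInvokerMinimal"
  title       = "Minimal Cloud Run Invoker"
  description = "Only the permissions needed to run jobs and invoke routes"

  # List only the permissions the principal actually needs.
  permissions = [
    "run.jobs.run",
    "run.routes.invoke",
  ]
}
```

Keeping the permissions list short and explicit is what makes the role useful for least privilege: anything not listed is simply not granted.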
A principal can be thought of as an entity that would need access to resources. You can give the principal access to resources through permissions which the principal can be assigned through a role binding.
A principal can be a Google Account, a service account, a Google group, or a Google Workspace account or Cloud Identity domain. Each principal has its own identifier, typically an email address, which you use when you need to assign permissions to that principal.
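To make role bindings and principal identifiers concrete, here is a small sketch of a project-level binding in Terraform. The project ID and all member addresses are made-up placeholders; the prefixes (`user:`, `group:`, `serviceAccount:`, `domain:`) indicate the principal type.

```hcl
# Binds one role to several principals on a project.
# Note: google_project_iam_binding is authoritative for the given role,
# so members not listed here lose that role on the project.
resource "google_project_iam_binding" "run_invokers" {
  project = "your-project-id" # placeholder
  role    = "roles/run.invoker"

  members = [
    "user:jane@example.com",                                        # a Google Account
    "group:devops@example.com",                                     # a Google group
    "serviceAccount:my-sa@your-project-id.iam.gserviceaccount.com", # a service account
    "domain:example.com",                                           # a Workspace or Cloud Identity domain
  ]
}
```

If you only want to add a single principal without affecting others who hold the role, the non-authoritative `google_project_iam_member` resource is usually the safer choice.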
Let’s take a look at the resource hierarchy in Google Cloud. This hierarchical structure does two things:
- Provides a hierarchy of ownership
- Provides attach points and inheritance
What does this mean? It means that resources can be associated with a parent. For example, I can have a folder that represents the DevOps team. Under that folder I can have a project that will then have resources attached to it. You can see from this progression that the project’s parent is the DevOps folder (which represents the DevOps department). The resources, in turn, have the project as their parent. This means that if I granted permissions at the DevOps folder level, the projects and the resources under the DevOps folder would inherit these permissions because they are descendants of that folder. When implementing access controls with Terraform, we need to know at what level of the hierarchy access should be granted.
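As a sketch of granting access at the folder level, the following Terraform resource gives a group a role on a folder, which the projects and resources beneath it would then inherit. The folder ID, the group address, and the choice of role are assumptions made purely for illustration.

```hcl
# Grants a role on a folder; descendants of the folder inherit the grant.
resource "google_folder_iam_member" "devops_run_viewer" {
  folder = "folders/123456789012"     # hypothetical folder ID for the DevOps folder
  role   = "roles/run.viewer"         # example role; pick the narrowest one that fits
  member = "group:devops@example.com" # placeholder group
}
```

The same grant made with `google_project_iam_member` would apply to a single project only, so the choice of resource type is effectively the choice of attach point in the hierarchy.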
Organization policies help ensure your organization’s security and compliance by setting guardrails. Organization policies allow you to enforce constraints, which specify what resource configurations are allowed within an organization. Let’s see how constraints work.
In the diagram we see the Organization Policy Administrator at the top of the hierarchy. This role (a collection of permissions) has to be granted at the organization level. Because the Organization Policy Administrator has this specific set of permissions, they are able to define an organization policy. The policy is built on a constraint, which is a particular type of restriction and acts as the blueprint for your organization policy. Next, the policy is set on a resource hierarchy node. For the sake of argument, let’s say it’s set at the folder level. By default, the policy is enforced on a specific GCP service. The policy is then inherited by all resources under that folder.
Building with Terraform
Now let’s take a look at how we could build a policy with code:
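A minimal sketch of such a policy, using the Google provider's `google_project_organization_policy` resource, might look like the following. The project ID is a placeholder, and the constraint shown is an assumption chosen to match the resource name used in this walkthrough.

```hcl
# Project-level organization policy enforcing a boolean constraint.
resource "google_project_organization_policy" "auditlogging_policy" {
  project    = "your-project-id"                  # placeholder project ID
  constraint = "iam.disableAuditLoggingExemption" # assumed constraint for illustration

  boolean_policy {
    enforced = true # turn the constraint on for this project
  }
}
```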
Resource - Also known as a resource block, this tells Terraform what you want to build. In our case it’s an organization policy that is set at the project level. The name “auditlogging_policy” is the name Terraform knows this resource by (in some cases we can target specific resources or use interpolation).
Project - ID of the project to apply the policy to.
Constraint - The name of the constraint the policy is referencing. You can find a list of constraints here.
boolean_policy - A boolean value that determines whether the constraint is enforced.
A service account can be looked at as both a principal and a resource. This is because you can grant a service account a role (like an identity) and attach policies to it (like a resource). Your company should use service accounts if you have services in Google Cloud that need to talk to each other. This will allow you to authenticate and make API calls securely from service to service.
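Here is a hedged sketch of what that can look like in Terraform: it creates a service account, grants it a role on the project, and lets a user act as it. The account ID, project ID, user address, and the project-level role are all placeholders rather than values from a real environment.

```hcl
# The service account itself (acts as an identity for a workload).
resource "google_service_account" "sa" {
  account_id   = "my-service-account" # used to generate the email address
  display_name = "Example service account"
}

# Treat the service account as a principal: grant it a role on the project.
resource "google_project_iam_member" "sa_run_invoker" {
  project = "your-project-id"   # placeholder
  role    = "roles/run.invoker" # example role
  member  = "serviceAccount:${google_service_account.sa.email}"
}

# Treat the service account as a resource: let a user act as it.
resource "google_service_account_iam_member" "sa_user" {
  service_account_id = google_service_account.sa.name
  role               = "roles/iam.serviceAccountUser"
  member             = "user:jane@example.com" # placeholder user
}
```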
Resource google_service_account - Creates a service account. The account_id gives the service account a name that will be used to generate the service account email address. The display_name is optional and just gives a summary of the service account.
Resource google_project_iam_member - Grants a role on the project to the service account.
Resource google_service_account_iam_member - Grants access for a user (referenced as member) to assume a service account (service_account_id) by granting the user the roles/iam.serviceAccountUser role (referenced as role above).
Now that we have the basics down, let’s take a look at a practical use case.
Let’s imagine we work at Big Horn Inc. Big Horn Inc. is a SaaS company. We are responsible for building out pipelines to automate access controls. We’ve been tasked with solving two problems:
1. The team wants to modernize some stateless applications. They want to use containers to create microservices. They want different Cloud Run services to be able to talk to other services in Google Cloud. In this case we need to create some service accounts for Cloud Run. Ideally we would like this process to be automated.
2. Right now we have very broad permissions. Some principals have been assigned “basic” roles. After using the policy insights tool in Google Cloud, the team decides that some principals have too much access. We need a way to create “custom” roles with more granular permissions to make sure the organization is following the principle of least privilege.
We can solve these issues in an automated fashion by implementing IAM with Terraform and using Cloud Build.
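For the first problem, one possible approach is to generate a service account per Cloud Run service with `for_each`. The sketch below is illustrative only: the service names, the `project_id` variable, and the Pub/Sub role are hypothetical stand-ins for whatever Big Horn Inc. would actually need.

```hcl
variable "project_id" {
  type        = string
  description = "Project that hosts the Cloud Run services (hypothetical)"
}

locals {
  # Hypothetical microservice names; one service account is created for each.
  cloud_run_services = toset(["orders", "billing"])
}

resource "google_service_account" "cloud_run" {
  for_each     = local.cloud_run_services
  project      = var.project_id
  account_id   = "run-${each.key}"
  display_name = "Cloud Run service account for ${each.key}"
}

# Give each service account only the role its service needs (example role).
resource "google_project_iam_member" "cloud_run_publisher" {
  for_each = google_service_account.cloud_run
  project  = var.project_id
  role     = "roles/pubsub.publisher"
  member   = "serviceAccount:${each.value.email}"
}
```

Adding a new microservice then becomes a one-line change to the set of names, which is easy to review in a pull request.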
Wiring things up
Before we can start building access controls with Terraform, we need to make sure we have some things in place first.
After you have Terraform and gcloud installed, you will want to make sure that you have a service account that Terraform can use. Make sure that service account has all the permissions it needs. Depending on what you want to build, some permissions will have to be granted at the organization level in order for them to be inherited at the project level (where service accounts are created). Next, let’s make sure you are using the proper authentication method. The best way to authenticate for local development is by using Application Default Credentials (ADC). With a simple setup, Terraform will be able to authenticate automatically using the credentials from your gcloud configuration.
In the pipeline, Cloud Build will be granted permission to use the service account you create. This will allow Cloud Build to assume the permissions of that service account and in turn authenticate your Terraform configuration.
Now that we have the service account and all the proper tools in place, let’s build a pipeline. As you can see below, I am using a YAML file to automatically build a pipeline in Cloud Build. Each step in the pipeline runs in a Docker container. My pipeline does some standard things with Terraform.
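A minimal provider setup that relies on ADC could look like the sketch below. The project ID and region are placeholders, and it assumes you have already run `gcloud auth application-default login` on your machine.

```hcl
terraform {
  required_providers {
    google = {
      source = "hashicorp/google"
    }
  }
}

# No credentials argument is set, so the provider falls back to
# Application Default Credentials found in the local environment.
provider "google" {
  project = "your-project-id" # placeholder
  region  = "us-central1"     # placeholder
}
```

Leaving credentials out of the configuration also keeps key material out of version control, which fits the version control and collaboration goals described earlier.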
Securing access in Google Cloud is a great first line of defense to make sure that your account is secure. Understanding IAM and its core features is the foundation on which you will build your access controls. Automating access controls can save your company time, money, and give your organization the agility it needs to make changes in a structured way when the need arises. You can create a free account at cloud.google.com. Don’t know where to get started with IAM? We’ve got you covered. Try this IAM tutorial to hit the ground running.