Identity & Security

Protecting your GCP infrastructure at scale with Forseti Config Validator part three: Writing your own policy

No two Google Cloud environments are the same, and how you protect them isn’t either. In previous posts, we showed you how to use the Config Validator scanner in Forseti to look for violations in your GCP infrastructure by writing policy constraints and scanning for labels. These constraints are a good way for you to translate your security policies into code and can be configured to meet your granular requirements. And because policy constraints are based on Config Validator templates, it’s easy to reuse the same code base to implement similar, but distinct constraints.

In this post, you’ll learn how to write your own custom template (and test it with sample constraints) to get you started writing your own security policies as code.

A closer look at template constraints 

First, let’s examine a sample constraint that implements the GCPStorageLocationConstraintV1 template. This template lets you define where in your cloud environment your Cloud Storage buckets should live.

Let’s take a look at this constraint:

  apiVersion: constraints.gatekeeper.sh/v1alpha1
kind: GCPStorageLocationConstraintV1
metadata:
 name: allow_some_storage_location
spec:
 severity: high
 match:
   target: ["organization/*"]
 parameters:
   mode: "allowlist"
   locations:
   - asia-southeast1
   exemptions: []

As you can see, this constraint implements the GCPStorageLocationConstraintV1 template (kind). Here is what we can tell from this constraint file:

  • Its name is allow_some_storage_location. This is what will show up in your reports for each identified violation.

  • The violations it raises will be marked as high severity.

  • It applies to the entire organization (target).

  • It has three parameters (mode, locations and exemptions).

Another important point is the target object for the constraint. This lets you specify the resources in your organization hierarchy that should comply with this constraint. In this example, all resources should comply with the constraint, but in some cases you may want to limit the folder and/or project resources that it affects.

You can specify more than one target (it's an array), and by the same logic, you can use the exclude object to specifically prevent the constraint from targeting certain resources.
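For instance, a constraint could target one folder while excluding a project beneath it. The sketch below follows the match format shown above; the IDs and names are placeholders, and the exact resource-path syntax is documented in the policy-library:

```yaml
spec:
  severity: high
  match:
    # Everything under folder-1 is in scope...
    target: ["organization/123456789/folder/folder-1"]
    # ...except this one project.
    exclude: ["organization/123456789/folder/folder-1/project/project-2"]
  parameters:
    mode: "allowlist"
    locations:
    - asia-southeast1
    exemptions: []
```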

Now, what about templates?

Let’s keep digging into this example and look at the GCPStorageLocationConstraintV1 template. For simplicity, we’ll break it into two main parts.

GCPStorageLocationConstraintV1 Template (top part):

  apiVersion: templates.gatekeeper.sh/v1alpha1
kind: ConstraintTemplate
metadata:
 name: gcp-storage-location-v1
spec:
 crd:
   spec:
     names:
       kind: GCPStorageLocationConstraintV1
       plural: gcpstoragelocationconstraintsv1
     validation:
       openAPIV3Schema:
         properties:

Here you can see that there are several top-level keys that describe the template. The most important ones to focus on at this point are:

  • kind: identifies this file as a ConstraintTemplate.

  • metadata > name: the template’s common name.

  • spec: the specification for your template.

  • spec > names > kind and spec > names > plural: the template name to reference (as kind) in your constraint files.

  • spec > validation > openAPIV3Schema > properties: where your template’s parameters are defined. If no parameters are needed, use {} as the value.

Now let’s take a look at how to define your parameters in this template file (again, under the properties section):

  validation:
       openAPIV3Schema:
         properties:
           mode:
             type: string
             enum: [denylist, allowlist]
              description: "String identifying the operational mode, allowlist or denylist. In allowlist mode, datasets are only allowed in the locations specified in the 'locations' parameter. In denylist mode, resources are allowed in all locations except those listed in the 'locations' parameter."
           exemptions:
             type: array
             items: string
             description: "Array of storage buckets to exempt from location restriction. String values in the array should correspond to the full name values of exempted storage buckets."

You can use the OpenAPI v3 format to describe your parameters, meaning that you can be quite precise about them. Here, for instance, you can see that:

  • The first parameter is “mode”, which is a string and more specifically an enum (a fixed list of valid values). Its value can either be “denylist” or “allowlist”. Note that no default value is specified, so you should always pass a value when using this template, just to be safe.

  • The second parameter is “exemptions”, which is an array (list) of strings.

In all cases, the description field lets you know what values should be passed to these parameters when calling this template in your constraint.

Finally, the last part of the template is the rego rule. Rego is the language that lets you write custom policies for the Config Validator tools, including terraform-validator:

  targets:
   validation.gcp.forsetisecurity.org:
     rego: | #INLINE("validator/storage_location.rego")
       #ENDINLINE

For the template to be valid, you need to include the actual rego rule that Config Validator will call when you deploy your constraint (either in Forseti or with terraform-validator, as we discuss in the next article).

You can find the code for the rego rule in the validator/storage_location.rego file. Then, use the “make build” command to automatically copy this rule over to your template when the code is ready.

Now you have a clearer sense about what a template is: a YAML file that describes the template itself, the inputs it needs (if any) and finally the rego rule that should be applied whenever the template is called by Config Validator. 

Next, let’s go over rego (and OPA), and how to get started writing your own rules that will become the core of your template.

Introduction to rego and OPA

The Open Policy Agent (OPA) is a framework that lets you write policies that can be reused across tools. This is a good standard to use if you want to ensure that your policies only need to be written once, regardless of what will consume them in the end.

In our case, this is the main reason why the template rules discussed in this post (a.k.a. policies) can be interpreted the same way by both the Forseti config_validator scanner and the terraform-validator tool, as we will see in the next article.

The most challenging part about rego for most developers is that it’s a declarative language, but it looks/feels like an imperative one. This can lead to some confusion when debugging rules that you need to write.

There are a lot of good resources to help you get started writing rego, starting with the official OPA documentation.


One tool that I use often to collaborate with other developers is the online rego playground, which lets you write rego code and test the output based on your inputs. You can also easily share your examples with others.

So, how does this relate to our template? Well, if you look at all the other templates in the policy-library, you will notice that they all define a special rego rule in their definition:

  package templates.gcp.GCPStorageLocationConstraintV1
 
import data.validator.gcp.lib as lib
 
deny[{
    "msg": message,
    "details": metadata
}] {
    # some rego logic
}

This is where the magic happens. This deny rule is what lets Config Validator know whether or not a given asset (like a GCP resource) should be marked in violation. 

If the body of the rule (# some rego logic) evaluates to true given the input it receives (your parameters plus the asset to evaluate), then the resource is marked as a violation: the deny rule itself evaluates to true.

When that happens, the deny rule returns some metadata (msg and details). In our case, it passes along the values of the message and metadata variables if they have been set (or errors out if not).
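To make these semantics concrete, here is a rough Python analogy. This is not rego, and not how Config Validator is implemented; it's just a model of the evaluation, using the storage-location example with made-up values:

```python
def deny(asset, allowed_locations):
    """Rough model of a rego deny rule: if every condition in the body
    holds, emit a {"msg": ..., "details": ...} object; otherwise emit
    nothing at all."""
    # One condition from the body: the bucket is outside the allowed
    # locations. If it fails, the rule simply does not fire.
    if asset["location"] in allowed_locations:
        return []  # body is false: no violation
    message = f"{asset['name']} is in violation."
    metadata = {"resource": asset["name"]}
    return [{"msg": message, "details": metadata}]

bucket = {"name": "my-bucket", "location": "us-east1"}
print(deny(bucket, ["asia-southeast1"]))  # one violation
print(deny({"name": "ok", "location": "asia-southeast1"}, ["asia-southeast1"]))  # []
```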

Some key points to remember when writing rego:

  • This is not an imperative language. The rule will be evaluated in parallel as much as is feasible; the dependencies between the instructions are discovered and followed at runtime.

  • Some operators have special behaviors, like “=”, “:=” and “==”; make sure you understand the difference. When in doubt, use “:=” for assignments (unless you really mean to use “=”).

  • There is a limited number of built-in functions available, but do use them, as they will save you a lot of time.

  • The deny rule can be tricky because it works seemingly backward to how our brains usually process information (e.g., this deny rule will be true if its body evaluates to true). Most programmers find it easier to write a positive-logic function, and use the “not” operator when calling the function to reverse its outcome in the rule.

  • You can use the trace and sprintf functions to debug your rego logic, but the trace output only shows up if one of your tests fails. If there are errors in your code (such as syntax or runtime errors), your traces only show up if they were evaluated before the error, which can be tricky to predict.

I hope I did not scare you too much about rego, but my point is that it’s best to go slow when writing your rule and to validate that it behaves as expected as early as possible (do not write hundreds of lines at once and only start testing at the end).

Writing your own custom rule

For this section, I will use a template that I recently published to the policy-library repository. The goal of this template is to let a user specify which resource types are allowed (whitelist) or denied (blacklist) in their GCP infrastructure (for instance, within a folder). This kind of policy is quite in demand by financial or insurance companies that require additional guardrails.

Let’s get started. Start by writing your new rego rule within the validator folder, within a file named allowed_resource_types.rego. It should look like this:

validator/allowed_resource_types.rego:

  package templates.gcp.GCPAllowedResourceTypesConstraintV1
 
import data.validator.gcp.lib as lib
 
deny[{
    "msg": message,
    "details": metadata,
}] {
    constraint := input.constraint
    lib.get_constraint_params(constraint, params)
 
    asset := input.asset
 
    message := sprintf("%v is in violation.", [asset.name])
    metadata := {"resource": asset.name}
}

You can see we have some basic logic already in the rule. You retrieve the constraint object that was passed by Forseti or terraform-validator as an input to your rule, and you get the constraint parameters using the get_constraint_params function. 

This function is defined in the validator library, which is also in the policy-library repository (under the lib folder). The parameters that you retrieve from the constraint are accessible in the params variable.

At the same time, make sure that you have an input.asset object passed to your rule, which is the GCP resource that you need to evaluate in the rule. This asset object should reflect the GCP Cloud Asset Inventory export format, as mentioned in earlier articles.
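Concretely, the input document your rule is evaluated against looks roughly like this. The constraint fields follow the YAML shown in this article and the asset follows the Cloud Asset Inventory export format, but the specific values below are made up:

```python
# A sketch of the input document a rule receives (values are placeholders).
input_doc = {
    "constraint": {
        "kind": "GCPAllowedResourceTypesConstraintV1",
        "metadata": {"name": "allow_all_resource_types"},
        # get_constraint_params extracts this parameters object
        "spec": {"parameters": {}},
    },
    "asset": {
        # Cloud Asset Inventory style fields
        "name": "//storage.googleapis.com/my-bucket",
        "asset_type": "storage.googleapis.com/Bucket",
    },
}

params = input_doc["constraint"]["spec"]["parameters"]
print(input_doc["asset"]["asset_type"])
```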

Finally, set the message and metadata variables. These will be used only if the body of the deny rule evaluates to true.

Writing your first test for your template

Now it’s time to test your template with a brand new constraint, in a new folder: validator/test/fixtures/allowed_resource_types

Following the contributing guidelines, create two subfolders for your tests:

  • assets: this will contain all of your test data, which you get from a Cloud Asset Inventory export or from another template’s test data (both are JSON objects)

  • constraints: this will contain all of the test cases for your template. This is a way for you to test various inputs to your template and make sure it behaves as expected against your test data.

Now, create a new test constraint in the constraints folder:

validator/test/fixtures/allowed_resource_types/constraints/basic/data.yaml:

  apiVersion: constraints.gatekeeper.sh/v1alpha1
kind: GCPAllowedResourceTypesConstraintV1
metadata:
 name: allow_all_resource_types
spec:
 severity: high
 match:
   target: ["organization/*"]
 parameters: {}

This example simply uses the template you just wrote, with no special parameters. Now you can test your almost-empty rule by creating a test file in the validator folder (with _test.rego as a suffix):

validator/allowed_resource_types_test.rego:

  package templates.gcp.GCPAllowedResourceTypesConstraintV1
 
import data.validator.gcp.lib as lib
 
# Importing the test data
import data.test.fixtures.allowed_resource_types.assets as fixture_assets
 
# Importing the test constraints
import data.test.fixtures.allowed_resource_types.basic.constraints as fixture_constraint_basic
 
# Find all violations on our test cases
find_all_violations[violation] {
    resources := data.resources[_]
    constraint := data.test_constraints[_]
    issues := deny with input.asset as resources
        with input.constraint as constraint
 
    violation := issues[_]
}
 
basic_violations[violation] {
    constraints := [fixture_constraint_basic]
    violations := find_all_violations with data.resources as fixture_assets
        with data.test_constraints as constraints
 
    violation := violations[_]
}
 
test_allowed_resource_types_basic_violations {
    violations := basic_violations
    count(violations) == 5
}

The key points to remember for this test file are:

  • Only the functions with the “test_” prefix will be executed as tests

  • You can import as many data sets and test constraints as you want (see the import statements at the top)

  • When you import a constraint (YAML) or a test data set (JSON), you can retrieve it using the directory structure where it lives. For instance, for the above constraint, you can import data.test.fixtures.allowed_resource_types.basic.constraints, which maps to the data.yaml file location.
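In plain Python terms, the find_all_violations helper above behaves roughly like a nested loop over every (resource, constraint) pair, with the rego with clauses rebinding the rule's inputs for each pair. This is a sketch of the semantics only, not rego:

```python
def deny(asset, constraint):
    # Stand-in for the template's deny rule: at this stage it flags
    # every asset, exactly like the almost-empty rule written above.
    return [{"msg": f"{asset['name']} is in violation.",
             "details": {"resource": asset["name"]}}]

def find_all_violations(resources, constraints):
    violations = []
    # rego's `data.resources[_]` iterates over all resources, and the
    # `with` clauses swap in each (asset, constraint) pair as input.
    for asset in resources:
        for constraint in constraints:
            violations.extend(deny(asset, constraint))
    return violations

assets = [{"name": f"resource-{i}"} for i in range(5)]
print(len(find_all_violations(assets, [{"kind": "test"}])))  # 5
```

This is why the test below expects count == 5: five test resources times one constraint, with every resource flagged.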

Getting your mock data

As mentioned earlier, Config Validator only supports resources exported by Cloud Asset Inventory, which covers a specific set of resource types. So a good way to get mock/testing data for a new template is to run a Cloud Asset Inventory export on your existing infrastructure (assuming you already have resources against which you want to test your template). Another option is to reuse mock data from existing policies.

For my test data, I use only one data.json file, but feel free to have separate data sets for separate use cases. You can find my latest test data set in the policy-library repository.

For the context of this article, I have one resource of each of the following:

  • storage.googleapis.com/Bucket

  • compute.googleapis.com/Instance

  • compute.googleapis.com/Disk

  • google.bigtable.Instance

  • sqladmin.googleapis.com/Instance
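A single entry in the assets data.json file looks roughly like this. This is an abridged sketch of the Cloud Asset Inventory export shape (name, asset_type, and a resource envelope); the bucket name and location are placeholders:

```json
{
  "name": "//storage.googleapis.com/my-test-bucket",
  "asset_type": "storage.googleapis.com/Bucket",
  "resource": {
    "version": "v1",
    "data": {
      "location": "ASIA-SOUTHEAST1"
    }
  }
}
```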

Now you’re ready to test your new template. The test function verifies that you have five violations at this point (count == 5), since the data set currently contains five resources and the rule flags everything as a violation:

  awalko@awalko:~/Dev/policy-library$ make test | grep allowed_resource_types
data.templates.gcp.GCPAllowedResourceTypesConstraintV1.test_allowed_resource_types_basic_violations: PASS (623.377µs)

Adding logic to your rule

Ultimately, your goal is to allow (whitelist) or deny (blacklist) resources in your infrastructure, based on a list of resource types that you’ve passed to your template.

First, add two parameters to your template:

  • resource_types_list: the list of resource types to consider in the template (list of strings)

  • mode: whether the list is a whitelist or blacklist (enum: whitelist or blacklist)

Now you can add some more logic to your rule to use these parameters to evaluate the asset that was passed as an input to the template:

  deny[{
    "msg": message,
    "details": metadata,
}] {
    constraint := input.constraint
    lib.get_constraint_params(constraint, params)

    asset := input.asset

    # Retrieve the current mode if passed, use "whitelist" as a default
    mode := lower(lib.get_default(params, "mode", "whitelist"))

    # Retrieve the resource types list - default to empty set
    resource_types_list := cast_set(lib.get_default(params, "resource_types_list", {}))

    # The asset raises a violation if resource_type_is_valid evaluates
    # to false (i.e., neither of its clauses holds)
    not resource_type_is_valid(mode, asset, resource_types_list)

    message := sprintf("%v is in violation.", [asset.name])
    metadata := {
        "resource": asset.name,
        "mode": mode,
        "resource_types_list": resource_types_list
    }
}

###########################
# Rule Utilities
###########################

resource_type_is_valid(mode, asset, resource_types_list) {
    # Anything other than "blacklist" is treated as "whitelist"
    mode != "blacklist"

    # Retrieve the asset type
    asset_type := asset.asset_type

    # The asset is valid if its type is in the resource_types_list
    asset_type == resource_types_list[_]
}

resource_type_is_valid(mode, asset, resource_types_list) {
    # If we are in blacklist mode
    mode == "blacklist"

    # Retrieve the asset type
    asset_type := asset.asset_type

    # The asset is valid only if its type is not in the resource_types_list
    not resource_types_list[asset_type]
}

As discussed earlier, I used a positive-logic function (resource_type_is_valid) to make my life easier. If a resource is valid (i.e., its type is part of the list in whitelist mode, or absent from that list in blacklist mode), then my function returns true. This is why, in the main deny rule, I use the not operator on it to raise a violation only when the scanned asset is not valid.

Note: As you can see in the rego, you can define the same function multiple times, with the same prototype. At runtime, all of them are evaluated and OR’d together to determine the result. As a programmer, this is convenient, since you can write the same xyz_is_valid function for multiple use cases and call it once in your top-level rule (here deny), as long as each function tests for a different scenario.
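Here is the same whitelist/blacklist decision written as plain Python, as a sketch of the logic only. The two branches below correspond to the two rego clauses that get OR'd together, and is_violation models the not operator in the deny rule:

```python
def resource_type_is_valid(mode, asset_type, resource_types_list):
    types = set(resource_types_list)  # like cast_set in the rule
    # Clause 1: any mode other than "blacklist" behaves as a whitelist,
    # so the type is valid only when it appears in the list.
    if mode != "blacklist":
        return asset_type in types
    # Clause 2: in blacklist mode, the type is valid when it is absent.
    return asset_type not in types

def is_violation(mode, asset_type, resource_types_list):
    # The deny rule fires on `not resource_type_is_valid(...)`.
    return not resource_type_is_valid(mode, asset_type, resource_types_list)

print(is_violation("blacklist", "sqladmin.googleapis.com/Instance",
                   ["sqladmin.googleapis.com/Instance"]))  # True
print(is_violation("whitelist", "storage.googleapis.com/Bucket",
                   ["storage.googleapis.com/Bucket"]))     # False
```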

You can test your changes the same way you did earlier (i.e., run “make test”). In the current version of this template, I added more resource types and more test constraints, but the rule itself is unchanged (at least for now). Feel free to take a look at the current version in the policy-library repository.

Here is the updated test constraint (to pass values for parameters in our test function):

validator/test/fixtures/allowed_resource_types/constraints/basic/data.yaml:

  apiVersion: constraints.gatekeeper.sh/v1alpha1
kind: GCPAllowedResourceTypesConstraintV1
metadata:
 name: allow_all_resource_types
spec:
 severity: high
 match:
   target: ["organization/*"]
 parameters:
   mode: "blacklist"
   resource_types_list:
     - sqladmin.googleapis.com/Instance
     - compute.googleapis.com/Instance

Publishing your new template

Now that you have a stable (and tested) version of your template rule, you can generate the template using the “make build” command. This runs “make format” and “make build_templates”, which updates the template file to include the latest version of your rego rule automatically. 

Here is your brand new template file, before running the “make build” command to populate your rule:

policies/templates/gcp_allowed_resource_types_v1.yaml:

  # Copyright 2019 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
 
# This template is for policies restricting the resource types
# in your organization hierarchy. You can specify which resource
# types are allowed (mode: whitelist) or denied (mode: blacklist)
# using the "resource_types_list" and "mode" parameters.
 
apiVersion: templates.gatekeeper.sh/v1alpha1
kind: ConstraintTemplate
metadata:
 name: gcp-allowed-resource-types-v1
spec:
 crd:
   spec:
     names:
       kind: GCPAllowedResourceTypesConstraintV1
       plural: gcpallowedresourcetypesconstraintsv1
     validation:
       openAPIV3Schema:
         properties:
           mode:
             type: string
             enum: [whitelist, blacklist]
             description: "String identifying the operational mode,
             whitelist or blacklist. In the whitelist mode, only resource types
             from the resource_types_list will be allowed (all other types
             will raise a violation). In the blacklist mode, any resource
             type not in the resource_types_list will not raise a violation."
           resource_types_list:
             type: array
             items: string
             description: "Array of resource types that will be either
             authorized (mode: whitelist) or denied (mode: blacklist)."
 
 targets:
   validation.gcp.forsetisecurity.org:
     rego: | #INLINE("validator/allowed_resource_types.rego")
       #ENDINLINE

Note: There is a brief description of the whole template in the comments at the top, as well as a description of its parameters in the properties section.

Once you’re ready to finalize your template, run the “make build” command:

  awalko@awalko:~/Dev/policy-library$ make build | grep allowed_resource_types
Inlining policies/templates/gcp_allowed_resource_types_v1.yaml

Check your template file again. The command should have populated the rego section of your template for you, from the validator/allowed_resource_types.rego file.

I encourage you to double check that you have a valid YAML file, or you could encounter issues later on when you deploy your template with Forseti or terraform-validator.

Finally, it’s best practice to reference a sample usage of your template in the samples/ folder, so feel free to copy your test constraint in there before pushing your changes to your repositories.

Conclusion

In this article, we reviewed how you can write your own policies for Config Validator that you can use out-of-the-gate with both Forseti and terraform-validator. You can now commit new files into your repository. Feel free to make a pull request if you would like to publish your new policy to the community (take a look at the contributor guidelines for the Config Validator policy-library).

The policy we just created is quite useful when you want to use a multi-pipeline strategy for your deployments. For instance, you could have highly specific pipelines that deploy specialized terraform templates, with separate pipelines for network, application/GKE, or IAM resources. You could then use this policy to ensure each pipeline cannot deploy resources beyond its scope, and use different service accounts for each pipeline, each with only the minimal permissions it needs to do its job.

In a follow-up article, we’ll discuss how to use terraform-validator in your terraform deployments, so you can prevent bad resources from being deployed in your environment in the first place!
