Automatically Enforce Policies on Your Terraform Modules using OPA and Terratest
Many organizations have business and legal requirements that must be continuously enforced on the infrastructure they have. These requirements most often stem from compliance frameworks like HIPAA and PCI.
For example, in order to enforce HIPAA compliance in your organization, it is imperative that your infrastructure resources are tagged consistently and systematically so that you can trace which resources contain Protected Health Information (PHI).
Traditionally, these requirements are expressed as policies that are enforced by humans. However, humans are the weakest link when it comes to enforcing policies.
Suppose you have a database that contains PHI, and you want to ensure that the database always carries the PHI tags. Here is an example module call that deploys such an RDS database:
    module "rds_with_phi" {
      source = "git::git@github.com:gruntwork-io/terraform-aws-data-storage.git//modules/rds?ref=v0.17.5"

      name = "main"
      tags = {
        includes-phi = "true"
      }
    }
With a human-driven policy, you might enforce this at the code review phase, where the team collaborates to ensure the RDS database is tagged accordingly. The weakness of this approach is that the check happens only once. It is not enough to verify these invariants in your IaC a single time; you want them to keep holding as the code changes in the future.
For example, what if the business requirements change such that the tag needs to be renamed to has-phi? Or what if the module is refactored to source the tags from different locations, and as a result the tags are dropped accidentally? You want to ensure that your IaC stays compliant as requirements are updated. Is there any way to encode these policies as code, and automatically enforce them on the infrastructure code through automation?
With Open Policy Agent (aka OPA) and Terratest, you can continuously check and enforce your policies at all stages of the Software Development Lifecycle (SDLC). With a fully implemented pipeline, developers no longer need to continuously check if the policies are violated for every code review.
For example, with the RDS example above, if a developer accidentally makes an update that drops the tags, you can use OPA and Terratest to immediately get a failure in your CI build that looks like this:
    === RUN TestEnforceTaggingOnLiveModules
    TestEnforceTaggingOnLiveModules Running terraform files in ../modules/rds through `opa eval` on policy ../policies/enforce_tagging.rego
    TestEnforceTaggingOnLiveModules Failed opa eval on file main.json (policy enforce_tagging.rego; query data.enforce_tagging.allow)
    --- FAIL: TestEnforceTaggingOnLiveModules (0.05s)
You can use this pipeline to enforce all kinds of policies on your IaC:
- Resource tagging
- Ensuring modules come from an approved source
- Enforcing end-to-end encryption (in transit and at rest)
- Ensuring firewalls aren’t open to the public (e.g., 0.0.0.0/0)
- And more!
In this post, we will cover how we can use Open Policy Agent with Terratest to build out this kind of pipeline.
What is Open Policy Agent?
From the official website:
The Open Policy Agent (OPA, pronounced “oh-pa”) is an open source, general-purpose policy engine that unifies policy enforcement across the stack. OPA provides a high-level declarative language that lets you specify policy as code and simple APIs to offload policy decision-making from your software. You can use OPA to enforce policies in microservices, Kubernetes, CI/CD pipelines, API gateways, and more.
This makes OPA a useful tool for enforcing various governance policies on your Terraform code.
The main interface of OPA is the opa CLI. The OPA CLI can be run directly in the terminal, or as a webserver that serves the policy checks behind a REST-ful API. Policies are written as code using Rego, a purpose-built language for OPA that allows you to declaratively specify policies. The opa CLI takes input data as JSON and checks the Rego policies against it.
Writing an OPA Policy
Now let’s try to use OPA to enforce that we have the includes-phi tag defined on our RDS module calls. We need to first express this requirement in Rego:
    package enforce_tagging

    # Only allow this if all the RDS module blocks have the tags
    # attribute set, and the tags attribute contains the
    # "includes-phi" key.
    allow = true {
        count(rds_blocks) == count(rds_with_tags)
    }

    # The set of module blocks that call the Gruntwork RDS module.
    rds_blocks[module_label] {
        some module_label, i

        module := input.module[module_label][i]
        startswith(
            module.source,
            "git::git@github.com:gruntwork-io/terraform-aws-data-storage.git//modules/rds",
        )
    }

    # The set of Gruntwork RDS module blocks that have the right tags
    # set.
    rds_with_tags[module_label] {
        some module_label, i

        # Only select the modules that are in the rds_blocks set.
        module := input.module[module_label][i]
        rds_blocks[module_label]

        # Make sure the tags attribute is set and is not empty
        module.tags
        # And make sure the tags attribute has a key called includes-phi
        module.tags["includes-phi"]
    }
Rego is a declarative language. Each block in the source defines a new variable or set, and the contents of the block indicate what goes into that set. Here is a breakdown of what is happening in the enforce_tagging policy above:
- Define a variable called allow and assign it true if the number of RDS module blocks equals the number of RDS module blocks that have tags set and include the includes-phi key.
- Define a set called rds_blocks, which iterates over all the objects under the module key of the input and only includes those whose source attribute starts with the Gruntwork RDS module source. The iteration is handled by the some keyword, which defines iteration variables. With the expression input.module[module_label] and some module_label, OPA will evaluate the expression with every key of the module object in the input, binding the key to module_label for each iteration. This set is then further filtered by the startswith expression, taking advantage of the fact that OPA only adds an item to a set when every expression in the block evaluates to true.
- Define a set called rds_with_tags, which iterates over the set rds_blocks and, for each module block, includes those that have the tags attribute defined and whose tags attribute includes the includes-phi key.
The two sets (rds_blocks and rds_with_tags) could use a little more explanation, in particular how iteration works in OPA. The some keyword is used to define iterator variables. That is, when you index a list or object in Rego with a variable defined with some, OPA will automatically iterate over each element, binding the index or key to the variable at each step.
So the expression input.module[module_label][i] is equivalent to the following pseudocode:

    for module_label in input.module:
        for i in input.module[module_label]:
            input.module[module_label][i]
Given that, this Rego policy expects input of the following form:

    {
      "module": {
        "MODULE_LABEL": [{
          "source": "MODULE_SOURCE",
          // ... other module inputs ...
        }]
      }
    }
Combining all this together, this policy enforces that all module blocks in the Terraform code that call the Gruntwork RDS module have a tags attribute with the includes-phi key set.
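To make the semantics concrete, here is a minimal Go sketch that mirrors the same check imperatively. It is only an illustration, not how OPA evaluates Rego: the type and function names (moduleBlock, tfInput, allow) are hypothetical, it returns false where the Rego rule would leave allow undefined, and it tests only for the presence of the includes-phi key rather than the truthiness of its value.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// rdsSourcePrefix is the module source prefix the policy matches on.
const rdsSourcePrefix = "git::git@github.com:gruntwork-io/terraform-aws-data-storage.git//modules/rds"

// moduleBlock keeps only the attributes the policy inspects (hypothetical type).
type moduleBlock struct {
	Source string `json:"source"`
	Tags   any    `json:"tags"`
}

// tfInput mirrors the expected input shape:
// {"module": {"MODULE_LABEL": [{"source": "...", ...}]}}
type tfInput struct {
	Module map[string][]moduleBlock `json:"module"`
}

// allow reports whether every Gruntwork RDS module block has a tags
// attribute containing the includes-phi key.
func allow(rawJSON []byte) (bool, error) {
	var in tfInput
	if err := json.Unmarshal(rawJSON, &in); err != nil {
		return false, err
	}

	rdsBlocks := map[string]bool{}   // plays the role of the rds_blocks set
	rdsWithTags := map[string]bool{} // plays the role of the rds_with_tags set

	// The nested loops correspond to input.module[module_label][i].
	for label, blocks := range in.Module {
		for _, m := range blocks {
			if !strings.HasPrefix(m.Source, rdsSourcePrefix) {
				continue
			}
			rdsBlocks[label] = true

			// Tags may be missing or not an object; both count as untagged.
			tags, _ := m.Tags.(map[string]any)
			if _, ok := tags["includes-phi"]; ok {
				rdsWithTags[label] = true
			}
		}
	}
	return len(rdsBlocks) == len(rdsWithTags), nil
}

func main() {
	input := []byte(`{"module": {"rds_with_phi": [{
		"source": "git::git@github.com:gruntwork-io/terraform-aws-data-storage.git//modules/rds?ref=v0.17.5",
		"name": "main",
		"tags": {"includes-phi": "true"}
	}]}}`)

	ok, err := allow(input)
	fmt.Println(ok, err) // prints "true <nil>"
}
```

Writing the policy in Rego instead keeps it declarative and lets the same opa tooling evaluate it against any JSON input, which is why the rest of this post sticks with OPA.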
Now that we have a policy, let’s run opa to evaluate it.
Using OPA
We can use the opa CLI to evaluate policies written in Rego against JSON inputs using the eval command. Unfortunately, OPA currently doesn’t natively support parsing HCL, and thus we can’t use the CLI against the Terraform code directly. However, we can use a handy utility from the community, hcl2json, for this purpose.
Download the hcl2json utility and run it to convert the main.tf we wrote previously to JSON format:

    hcl2json main.tf > main.json
Now that we have the contents of the Terraform file in JSON format, we can evaluate our policy against it using the opa CLI:

    # Assuming the file enforce_tagging.rego contains the policy above:
    opa eval --fail \
        -i ./main.json \
        -d ./enforce_tagging.rego \
        'data.enforce_tagging.allow'
This command call means the following:
- Evaluate the policy specified in ./enforce_tagging.rego (specified with the -d flag).
- Evaluate the policy against the JSON data in ./main.json (specified with the -i flag).
- Query for the data data.enforce_tagging.allow after evaluating the policy (the positional arg passed to eval).
- Fail the command if the query data is undefined, or the result is empty (specified with the --fail flag).
This means that the command will only be successful if the allow variable is defined. With our policy, this is true if all RDS module blocks in the source have the tags attribute set with the includes-phi key.
When you run this command, it should exit with a zero exit code and the following output:
    {
      "result": [
        {
          "expressions": [
            {
              "value": true,
              "text": "data.enforce_tagging.allow",
              "location": {
                "row": 1,
                "col": 1
              }
            }
          ]
        }
      ]
    }
The expressions list in the result shows you the value of each of the elements that you queried. The above only contains the result of the allow variable, but you can also query for the rds_blocks set by passing in data.enforce_tagging.rds_blocks instead. Suppose you modify the main.tf to have another module block:
    module "vpc" {
      source = "git::git@github.com:gruntwork-io/terraform-aws-vpc.git//modules/vpc-app?ref=v0.17.5"
    }

    module "rds_with_phi" {
      source = "git::git@github.com:gruntwork-io/terraform-aws-data-storage.git//modules/rds?ref=v0.17.5"

      name = "main"
      tags = {
        includes-phi = "true"
      }
    }
When you run OPA, it should only select the module block that calls the RDS module:
    $ opa eval --fail \
        -i ./main.json \
        -d ./enforce_tagging.rego \
        'data.enforce_tagging.rds_blocks'
    {
      "result": [
        {
          "expressions": [
            {
              "value": [
                "rds_with_phi"
              ],
              "text": "data.enforce_tagging.rds_blocks",
              "location": {
                "row": 1,
                "col": 1
              }
            }
          ]
        }
      ]
    }
Note how it correctly selected the module block with the label rds_with_phi, and ignored the vpc module block.
What about when the source fails the check? Try updating the main.tf module to comment out the tags attribute and rerun the check. You should see the command exit with a non-zero exit code, with an empty result ({}). This is because the allow variable becomes undefined when it finds an RDS module block that doesn’t have the tags attribute.
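If you wanted to script this yourself, the two pieces you would need are the argument list for opa eval and a way to tell a defined result from an empty one. The following Go sketch shows both under stated assumptions: the helper names (opaEvalArgs, firstValue) and the evalOutput type are hypothetical, and the struct models only the output fields shown in this post, not the full opa output schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// evalOutput models only the parts of the `opa eval` JSON output shown above.
type evalOutput struct {
	Result []struct {
		Expressions []struct {
			Value any    `json:"value"`
			Text  string `json:"text"`
		} `json:"expressions"`
	} `json:"result"`
}

// opaEvalArgs builds the arguments for the `opa eval` call used in this post.
// They could be passed to exec.Command("opa", args...) from os/exec.
func opaEvalArgs(inputPath, policyPath, query string) []string {
	return []string{"eval", "--fail", "-i", inputPath, "-d", policyPath, query}
}

// firstValue returns the value of the first queried expression. defined is
// false when the result is empty, which is what a failing check produces.
func firstValue(output []byte) (value any, defined bool, err error) {
	var out evalOutput
	if err := json.Unmarshal(output, &out); err != nil {
		return nil, false, err
	}
	if len(out.Result) == 0 || len(out.Result[0].Expressions) == 0 {
		return nil, false, nil
	}
	return out.Result[0].Expressions[0].Value, true, nil
}

func main() {
	fmt.Println(opaEvalArgs("./main.json", "./enforce_tagging.rego", "data.enforce_tagging.allow"))

	// Output of a passing check, as shown earlier in this section.
	passing := []byte(`{"result": [{"expressions": [{
		"value": true,
		"text": "data.enforce_tagging.allow",
		"location": {"row": 1, "col": 1}
	}]}]}`)
	value, defined, err := firstValue(passing)
	fmt.Println(value, defined, err) // prints "true true <nil>"

	// Output of a failing check: an empty result.
	value, defined, err = firstValue([]byte(`{}`))
	fmt.Println(value, defined, err) // prints "<nil> false <nil>"
}
```

Repeating this for every policy and every Terraform file, along with the hcl2json conversion, is the plumbing that quickly becomes tedious to maintain by hand.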
At this point, we have the ingredients for automating these checks in a CI/CD pipeline, but it can be cumbersome to do these in a pipeline, especially if you have many policies, and many Terraform modules. We can use Terratest to further automate these checks in an efficient manner.
Enforce OPA using Terratest
Terratest is a Go library that provides patterns and helper functions for testing infrastructure defined as code, with first class support for Terraform, Packer, Docker, Kubernetes, and more. Recently, we added support for OPA to the library (in v0.38.1).
You can use Terratest to automatically run OPA policies against your Terraform modules. Normally, you can’t run OPA policies directly against Terraform code because OPA does not support HCL inputs. To check your Terraform code, you need to first convert it to JSON using a tool like hcl2json. This can be cumbersome when you want to check every module in your repos. Terratest makes that easier by reducing that logic to a single function (test_structure.OPAEvalAllTerraformModules) which will:
- Find all the Terraform modules in a folder. You can configure which folders to look in and exclude using the ValidationOptions struct.
- For each Terraform module, find all the tf files in that module.
- For each tf file, convert it to JSON using the same routine as hcl2json.
- For each converted JSON file, run through the OPA checks.
- Do all of this concurrently. That is, all the OPA checks for each file found will be done in parallel.
- Report results per Terraform module. That is, the Terraform module containing the failing tf file will be reported as failing.
Using this function, you can drop in a single Go test file to run your OPA policies against all the Terraform modules in a repo!
Suppose you have the following folder structure for your Terraform repositories:
    .
    ├── examples
    │   ├── bar-example
    │   │   └── main.tf
    │   ├── baz-example
    │   │   └── main.tf
    │   └── foo-example
    │       └── main.tf
    ├── modules
    │   ├── bar
    │   │   └── main.tf
    │   ├── baz
    │   │   └── main.tf
    │   └── foo
    │       └── main.tf
    └── policies
        └── enforce_tagging.rego
In this setup, you have Terraform modules in the modules folder, and each subfolder contains a Terraform module. Additionally, you have an examples folder that contains example usage of each of the Terraform modules, which are also Terraform modules themselves. You want to continuously check the OPA policy in the policies folder against all those modules.
To do that, add a new folder test and place a file called enforce_opa_test.go in there containing the following:
    package testvalidate

    import (
        "os"
        "path/filepath"
        "testing"

        test_structure "github.com/gruntwork-io/terratest/modules/test-structure"
        "github.com/gruntwork-io/terratest/modules/opa"
        "github.com/stretchr/testify/require"
    )

    func TestWithOPAEvalAllTerraformModules(t *testing.T) {
        t.Parallel()

        cwd, err := os.Getwd()
        require.NoError(t, err)

        // Look for Terraform modules starting at the directory
        // above the `test` folder.
        opts, err := test_structure.NewValidationOptions(
            filepath.Join(cwd, ".."), nil, nil)
        require.NoError(t, err)

        // Configure OPA to run the `enforce_tagging.rego` policy
        // in `FailUndefined` mode
        rulePath := filepath.Join(cwd, "../policies/enforce_tagging.rego")
        opaOpts := &opa.EvalOptions{
            FailMode: opa.FailUndefined,
            RulePath: rulePath,
        }
        test_structure.OPAEvalAllTerraformModules(
            t, opts, opaOpts, "data.enforce_tagging.allow")
    }
This sets up the OPAEvalAllTerraformModules function with the following settings:
- Look for Terraform modules starting at the directory right above where we are. This will be relative to the test folder, and thus will look at the repository root.
- Run OPA with the policy at the path ../policies/enforce_tagging.rego, again relative to the test folder.
- When running OPA, query for data.enforce_tagging.allow and run in FailUndefined mode so that the checks fail if the allow variable is undefined.
To finish the setup, initialize the test folder as a Go module so that it can pull the terratest dependency:

    cd test
    # Update this to your terraform repo
    go mod init github.com/gruntwork-io/infrastructure-modules/test
    go mod tidy
This will create two files, go.mod and go.sum, which track all the Go modules that are needed to run the test.
Once you have the Go module files, you can run the test by calling go test in the test folder:

    # You should already be in the test directory if you
    # ran the previous command, but if not, change to test dir.
    cd test
    go test -v .
This single command will now run the opa eval check on every single Terraform module in your repository!
Note that this will check a single policy file against a single query. To run multiple checks, you can implement that in the policy source.
For example, imagine you had three separate policies you wanted to enforce, each one written in the same manner as the enforce_tagging policy above. Each policy is encoded in its own source file, e.g., enforce_source.rego, enforce_tagging.rego, and enforce_no_allow_all.rego. You can combine these by importing the sources and creating a single allow directive that aggregates the three:
    package enforce_policies

    import data.enforce_source
    import data.enforce_tagging
    import data.enforce_no_allow_all_network

    # Only pass the policy if all the imported checks evaluate to true.
    allow = true {
        enforce_source.allow == true
        enforce_tagging.allow == true
        enforce_no_allow_all_network.allow == true
    }
When the check fails, Terratest will automatically rerun opa eval with the query set to data, which will allow you to see all the OPA variables that are defined. This way, you can debug which OPA check failed.
Check out the terraform-opa-example folder in the Terratest repository for live example usage of the OPA functionality.
Summary
In this post we covered:
- A canonical use case of policy checks on Terraform source code.
- How to write OPA policies in Rego.
- How to use the opa CLI to check policies against Terraform source code.
- How to use Terratest to automate opa calls.
In most cases, you identify non-compliance retroactively: you run checks against the live AWS environment after things have been deployed. With OPA and Terratest, you can perform the checks before any infrastructure goes live! This approach allows you to ensure your infrastructure stays compliant with company policies over time, as you make changes.
To get a fully implemented and tested collection of Terraform modules that meet compliance standards like CIS and HIPAA enforced with OPA, check out Gruntwork.io.