A guide to automating HashiCorp Vault #3: Authenticating with an IAM user or role
This is the 3rd part of the automating HashiCorp Vault series. In part 2, we talked about how we can authenticate to a Vault cluster using instance metadata, after spinning it up and auto-unsealing, which was addressed in the first post. In this third and final post, we’ll talk about an alternative way to authenticate to Vault that you can use with IAM users and roles.
One of the limitations of the ec2 method is that it does not work for many other types of AWS services, such as Lambda functions and ECS tasks. A similar limitation exists between the gce and iam methods on GCP. Although we’ve been walking through AWS examples in this series of blog posts (you can find the full code examples here), the methods in the two clouds are analogous, and you can find the GCP-specific code here.
While only EC2 Instances have an Instance Metadata endpoint, almost all AWS resources can call the AWS Security Token Service (STS) to look up their own identity. Vault’s AWS iam auth method takes advantage of this by allowing you to create a signed request to STS, but instead of sending the request yourself, you send that signed request data to Vault. Vault executes the request and finds out your real identity from AWS (again, our trusted 3rd party).
The Vault cluster needs IAM permissions to query the necessary information from AWS, especially if you use wildcards when configuring the IAM user or role. You can grant them with Terraform, using Gruntwork’s vault-cluster module:
resource "aws_iam_role_policy" "vault_iam" {
  name   = "vault_iam"
  role   = "${module.vault_cluster.iam_role_id}"
  policy = "${data.aws_iam_policy_document.vault_iam.json}"
}

data "aws_iam_policy_document" "vault_iam" {
  statement {
    effect  = "Allow"
    actions = ["iam:GetRole", "iam:GetUser"]

    # List of ARNs Vault machines can query
    # For more security, it could be set to specific roles or users:
    # resources = ["${aws_iam_role.example_instance_role.arn}"]
    resources = [
      "arn:aws:iam::*:user/*",
      "arn:aws:iam::*:role/*",
    ]
  }

  statement {
    effect    = "Allow"
    actions   = ["sts:GetCallerIdentity"]
    resources = ["*"]
  }
}

module "vault_cluster" {
  source = "github.com/hashicorp/terraform-aws-vault.git//modules/vault-cluster?ref=v0.11.3"

  # ... other Gruntwork's vault-cluster module vars
}
Normally, the iam method would be ignorant of the specifics of EC2 instances, but through the AssumeRole method of AWS STS, Vault can infer that the IAM principal is attached specifically to an EC2 instance.
data "aws_iam_policy_document" "example_instance_role" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ec2.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "example_client_instance_role" {
  name_prefix        = "auth-example-iam-role"
  assume_role_policy = "${data.aws_iam_policy_document.example_instance_role.json}"
}

resource "aws_iam_instance_profile" "example_instance_profile" {
  path = "/"
  role = "${aws_iam_role.example_client_instance_role.name}"
}

resource "aws_instance" "example_client_auth_to_vault" {
  iam_instance_profile = "${aws_iam_instance_profile.example_instance_profile.name}"

  # ... other instance vars
}
This means that you can also limit the Vault Role to some instance attributes even when using the iam auth method instead of the ec2 method. You can do that by setting an inferred entity type when configuring your Vault Role. If you need more granular controls for EC2 instances, however, you should still use the ec2 method. And if you’re authenticating a different type of resource, such as a Lambda function, the iam method is the only method you can use.
vault auth enable aws

vault policy write "example-policy" -<<EOF
path "secret/example_*" {
  capabilities = ["create", "read"]
}
EOF

vault write \
  auth/aws/role/example-role-name \
  auth_type=iam \
  policies=example-policy \
  max_ttl=500h \
  bound_iam_principal_arn=$client_instance_role_arn \
  inferred_entity_type="ec2_instance" \
  bound_ami_id=$ami_id # only when EC2 instance is inferred
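To make the inference idea a bit more concrete, here is a minimal Python sketch of the kind of parsing involved. It relies on the fact that when an EC2 instance uses its instance profile, STS reports the caller as an assumed-role ARN whose session name is the instance ID. The function names and the regular expressions are ours, purely for illustration; this is not Vault’s code, and a real server would still have to look the instance up through the EC2 API before trusting attributes such as the AMI ID.

```python
import re

# Matches STS assumed-role ARNs such as
# arn:aws:sts::123456789012:assumed-role/my-role/i-0abc123def4567890
ASSUMED_ROLE_ARN = re.compile(
    r"^arn:(?P<partition>[^:]+):sts::(?P<account>\d{12}):"
    r"assumed-role/(?P<role>[^/]+)/(?P<session>.+)$"
)


def parse_assumed_role(arn):
    """Split an assumed-role ARN into the IAM role ARN and the session name."""
    match = ASSUMED_ROLE_ARN.match(arn)
    if match is None:
        raise ValueError("not an assumed-role ARN: %s" % arn)
    role_arn = "arn:{partition}:iam::{account}:role/{role}".format(
        **match.groupdict()
    )
    return role_arn, match.group("session")


def looks_like_instance_id(session_name):
    """Heuristic only: EC2 instance IDs are 'i-' plus 8 or 17 hex digits."""
    return re.fullmatch(r"i-[0-9a-f]{8}([0-9a-f]{9})?", session_name) is not None


if __name__ == "__main__":
    # Hypothetical caller ARN, as GetCallerIdentity might report it.
    arn = "arn:aws:sts::123456789012:assumed-role/auth-example-iam-role/i-0abc123def4567890"
    role_arn, session = parse_assumed_role(arn)
    print(role_arn, session, looks_like_instance_id(session))
```

A Lambda function or ECS task would show up with a different kind of session name, which is why the extra EC2 bindings only make sense when an EC2 instance is inferred.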
The client trying to authenticate will create a request to the GetCallerIdentity method of the AWS STS API (but not yet send it). This method basically answers the question “Who am I?” This request is then signed with the AWS credentials of the client, and the signed result is sent with the login request to the Vault server. As we mentioned before, AWS has already done the hard work of distributing credentials to things running on it and, even better, it also has the notion of temporary credentials. EC2 instances can get their instance profile credentials via the metadata service, Lambda functions can get their credentials through environment variables, and ECS tasks have their ECS-specific metadata service.
Still, this is the most complex step of the iam authentication process. Creating the correct canonical request has many bits and pieces that can go wrong. Signing the correct parts to include in the authorization header can be a very time-consuming process, as the failed responses from Vault will often be unhelpful, which is probably intentional. For this reason, I strongly discourage trying to do this yourself from scratch. Instead, I’d recommend using either the Vault CLI tool (preferable), which already does a lot of the hard work for you, or the AWS SDK for your programming language.
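To give a feel for how many pieces are involved, here is a rough, standard-library-only Python sketch of the core of AWS Signature Version 4 as it would apply to a GetCallerIdentity request. It is for illustration only and is not what Vault or botocore run: the helper names are ours, the secret key and timestamp are dummies, the global sts.amazonaws.com endpoint is assumed, and it leaves out the session token header that temporary credentials require as well as the final Authorization header assembly.

```python
import hashlib
import hmac


def _hmac_sha256(key, message):
    """Return the raw HMAC-SHA256 of message (str) under key (bytes)."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key, date, region, service):
    """Chain HMACs over date, region and service to get the signing key."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def canonical_request(method, path, query, headers, body):
    """Build the canonical request string and the signed-headers list."""
    # Header names are lowercased and sorted; values are whitespace-normalised.
    canon = sorted(
        (name.lower(), " ".join(value.split())) for name, value in headers.items()
    )
    canonical_headers = "".join("%s:%s\n" % pair for pair in canon)
    signed_headers = ";".join(name for name, _ in canon)
    payload_hash = hashlib.sha256(body.encode("utf-8")).hexdigest()
    request = "\n".join(
        [method, path, query, canonical_headers, signed_headers, payload_hash]
    )
    return request, signed_headers


def sign(secret_key, amz_date, region, service, canon_request):
    """Return the hex signature that goes into the Authorization header."""
    date = amz_date[:8]  # YYYYMMDD prefix of the ISO 8601 basic timestamp
    scope = "/".join([date, region, service, "aws4_request"])
    digest = hashlib.sha256(canon_request.encode("utf-8")).hexdigest()
    string_to_sign = "\n".join(["AWS4-HMAC-SHA256", amz_date, scope, digest])
    key = derive_signing_key(secret_key, date, region, service)
    return hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_get_caller_identity(secret_key, amz_date, server_id):
    """Sign a GetCallerIdentity request that carries the Vault server ID header."""
    headers = {
        "Host": "sts.amazonaws.com",
        "Content-Type": "application/x-www-form-urlencoded; charset=utf-8",
        "X-Amz-Date": amz_date,
        "X-Vault-AWS-IAM-Server-ID": server_id,
    }
    body = "Action=GetCallerIdentity&Version=2011-06-15"
    request, _ = canonical_request("POST", "/", "", headers, body)
    return sign(secret_key, amz_date, "us-east-1", "sts", request)


if __name__ == "__main__":
    # Dummy values; never hard-code real credentials.
    print(sign_get_caller_identity("EXAMPLE-SECRET-KEY", "20180101T000000Z",
                                   "vault.service.consul"))
```

Every header in the canonical request feeds into the signature, so getting any single one slightly wrong (casing, ordering, whitespace, a missing header) yields a signature that AWS rejects. That is the main reason to lean on the CLI or an SDK instead.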
For a Go example, you can just look at Vault’s source code. Here is a Python 2 example using botocore, adapted from an example by J. Thompson posted on the Vault mailing list:
import botocore.session
from botocore.awsrequest import create_request_object
import json
import base64
import sys

def headers_to_go_style(headers):
    retval = {}
    for k, v in headers.iteritems():
        retval[k] = [v]
    return retval

def generate_vault_request(awsIamServerId):
    session = botocore.session.get_session()
    client = session.create_client('sts')
    endpoint = client._endpoint
    operation_model = client._service_model.operation_model('GetCallerIdentity')
    request_dict = client._convert_to_request_dict({}, operation_model)

    request_dict['headers']['X-Vault-AWS-IAM-Server-ID'] = awsIamServerId

    request = endpoint.create_request(request_dict, operation_model)

    # It's a CaseInsensitiveDict, which is not JSON-serializable
    headers = json.dumps(headers_to_go_style(dict(request.headers)))

    return {
        'iam_http_request_method': request.method,
        'iam_request_url': base64.b64encode(request.url),
        'iam_request_body': base64.b64encode(request.body),
        'iam_request_headers': base64.b64encode(headers),
    }

if __name__ == "__main__":
    awsIamServerId = sys.argv[1]
    print json.dumps(generate_vault_request(awsIamServerId))
You can pass the Vault server’s address as an argument to this script, and it will be included in the signed request as a server ID. This is useful as a security boundary. For example, if a signed login request for a dev Vault cluster gets compromised, it won’t be useful for breaching the prod Vault cluster.
signed_request=$(python /opt/vault/scripts/sign-request.py vault.service.consul)

iam_request_url=$(echo "$signed_request" | jq -r .iam_request_url)
iam_request_body=$(echo "$signed_request" | jq -r .iam_request_body)
iam_request_headers=$(echo "$signed_request" | jq -r .iam_request_headers)

# The role name necessary here is the Vault Role name
# not the AWS IAM Role name
data=$(cat <<EOF
{
  "role":"example-role-name",
  "iam_http_request_method": "POST",
  "iam_request_url": "$iam_request_url",
  "iam_request_body": "$iam_request_body",
  "iam_request_headers": "$iam_request_headers"
}
EOF
)

curl --fail \
  --request POST \
  --data "$data" \
  "https://vault.service.consul:8200/v1/auth/aws/login"
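The signing script above is Python 2, where strings are already bytes. If you assemble the login payload in Python 3 instead, the base64 step needs explicit encoding and decoding. The following sketch, using only the standard library, shows one way to build the same JSON body as the shell snippet. The function name build_login_payload and the sample header values are our own placeholders; in practice the URL, body and headers would come from a request signed by the AWS SDK.

```python
import base64
import json


def _b64(text):
    """Base64-encode a str and return the result as a str, ready for JSON."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def build_login_payload(role, method, url, body, headers):
    """Assemble the fields Vault's aws login endpoint expects for the iam method."""
    # Vault expects Go-style headers: each name maps to a list of values.
    go_style_headers = {name: [value] for name, value in headers.items()}
    return {
        "role": role,  # the Vault Role name, not the AWS IAM Role name
        "iam_http_request_method": method,
        "iam_request_url": _b64(url),
        "iam_request_body": _b64(body),
        "iam_request_headers": _b64(json.dumps(go_style_headers)),
    }


if __name__ == "__main__":
    # Placeholder headers; a real request also carries the signed
    # Authorization and X-Amz-Date headers produced by the SDK.
    payload = build_login_payload(
        role="example-role-name",
        method="POST",
        url="https://sts.amazonaws.com/",
        body="Action=GetCallerIdentity&Version=2011-06-15",
        headers={"X-Vault-AWS-IAM-Server-ID": "vault.service.consul"},
    )
    print(json.dumps(payload, indent=2))
```

The printed JSON can be posted to the same login endpoint that the curl command uses.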
When the Vault server receives a login request with the iam method, it can execute the STS request without actually knowing the contents of the signed part. Amazon identifies who signed it, which the Vault server can then check against the IAM principal bound to a previously created Vault Role. It is important to note that although the Vault Role is configured with the IAM principal ARN, what Vault actually checks against is a unique internal ID from AWS. So if you destroy and recreate your IAM Role, even under the same name, Vault will reject the login attempt.
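The difference between matching on the ARN and matching on the unique ID is easy to see in a toy example. The sketch below is our own simplification, not Vault’s implementation: it assumes the server stored the role’s unique ID when the Vault Role was created, and it uses the fact that for an assumed role, GetCallerIdentity returns a UserId of the form role ID, colon, session name. All IDs shown are made up.

```python
from collections import namedtuple

# What a server might remember about the principal bound to a Vault Role.
BoundPrincipal = namedtuple("BoundPrincipal", ["arn", "unique_id"])


def caller_role_id(user_id):
    """Extract the role's unique ID from an assumed-role UserId."""
    # For assumed roles the UserId looks like "<role-id>:<session-name>".
    return user_id.split(":", 1)[0]


def login_allowed(bound, caller_role_arn, caller_user_id):
    """Allow the login only if both the ARN and the unique ID still match."""
    return (
        bound.arn == caller_role_arn
        and bound.unique_id == caller_role_id(caller_user_id)
    )


if __name__ == "__main__":
    role_arn = "arn:aws:iam::123456789012:role/auth-example-iam-role"
    bound = BoundPrincipal(arn=role_arn, unique_id="AROAEXAMPLEOLDID")

    # Same role as when the Vault Role was configured: accepted.
    print(login_allowed(bound, role_arn, "AROAEXAMPLEOLDID:i-0abc123def4567890"))

    # Role destroyed and recreated: same ARN, but AWS issued a new unique ID.
    print(login_allowed(bound, role_arn, "AROAEXAMPLENEWID:i-0abc123def4567890"))
```

This is also why the Vault cluster needs the iam:GetRole and iam:GetUser permissions shown earlier: it has to ask AWS for the unique ID behind the ARN you configured.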
As mentioned previously, the Vault CLI tool makes this work much simpler, so the last example can be reduced to a single command. Internally, however, the workflow is exactly the same.
vault login \
  -address=https://vault.service.consul:8200 \
  -method=aws \
  header_value=vault.service.consul \
  role=vault-role-name
Next steps
You may have noticed that whenever we used the address of the Vault server in our examples, we used vault.service.consul. That’s because we are using HashiCorp Consul not only as a storage backend but also as a service discovery mechanism. For the complete code of Vault clusters running with Consul and multiple examples of common use cases, check our open source repositories for AWS and GCP.
Thanks
Many thanks to Joel Thompson, who contributed the iam authentication method to Vault and gave a fantastic and very detailed talk during HashiConf’17. His talk was immensely helpful in understanding the inner workings of Vault’s authentication workflows, and my work would have been much harder without it.