
Gruntwork Newsletter, September 2018

Yevgeniy Brikman
Co-Founder
Published August 7, 2018

Once a month, we send out a newsletter to all Gruntwork customers that describes all the updates we’ve made in the last month, news in the DevOps industry, and important security updates. Note that many of the links below go to private repos in the Gruntwork Infrastructure as Code Library and Reference Architecture that are only accessible to customers.

Hello Grunts,

This month, we made major updates to Terratest, our IaC testing library, including adding support for Google Cloud, RDS, SSH Agent, and log file gathering; updated our ECS modules to support Service Discovery (based on Route 53), daemon services (e.g., run exactly one DataDog container on each ECS node), and volumes; fixed some important issues with our cross-account IAM Roles and aws-auth script; fixed a bug in our SQS module that revealed an important Terraform issue; and many other fixes and tweaks. Read on for all the details.

As always, if you have any questions or need help, email us at support@gruntwork.io!

Gruntwork Updates

Terratest supports Google Cloud, RDS, SSH Agent, log file gathering, and more

We’re continuing to grow and improve Terratest, our open source Swiss Army knife for testing infrastructure code. With lots of help from the open source community, we’ve added a number of major new abilities to Terratest this month. Here are the highlights:

Google Cloud Support: We added support for GCP to Terratest! Terratest now supports a broad collection of GCP features including getting public IPs, adding labels, getting a random region or zone, reading/writing in GCS buckets, and much more. This is available in Terratest, v0.10.0.

Gather log files: Added methods that allow you to fetch files over SSH from servers, EC2 Instances, and ASGs. The main use case for this is to fetch the contents of log files (or any other files you want to see!) from EC2 Instances at the end of the test to make debugging a test failure 10x easier. This is available in Terratest, v0.9.17.

SSH Agent: You can now use ssh-agent authentication with Terratest’s SSH methods by simply setting the SshAgent field to true in the ssh.Host struct (Terratest, v0.9.16). You can run an in-process SSH agent using the ssh.NewSshAgent method and have Terratest use that SSH Agent for SSH connections via the new OverrideSshAgent parameter in ssh.Host (Terratest, v0.10.1). You can use the new SshAgent property in terraform.Options to specify an in-process SSH agent to use when running Terraform. This makes it easier at test time to use custom SSH keys with remote-exec, files, and other ssh-based provisioners (Terratest, v0.10.2).

RDS: You can now use Terratest to test your RDS databases! Check out the new methods in the aws package, including GetAddressOfRdsInstance, GetWhetherSchemaExistsInRdsMySqlInstance, GetParameterValueForParameterOfRdsInstance, and GetAllParametersOfRdsInstance. This is available in Terratest, v0.10.3. Support for other DBs and checks will be coming in the future.

ECS Service Discovery support on module-ecs

Motivation: AWS ECS makes it easy to deploy Docker containers as long-running ECS services, but as ECS selects which EC2 instance will run the container, and as EC2 Instances can be replaced or scaled up or down, ECS services have dynamically assigned IPs. If you want your services to be able to talk to each other, you need some form of service discovery so they can find out which service lives at which IPs. In the past, we used internal load balancers for this purpose, but a few months ago, Amazon announced integrated support for Service Discovery in ECS that allows you to reach your ECS services through hostnames managed by Route 53.

For example, service foo can talk to service bar at bar.internal-domain.com and service baz at baz.internal-domain.com. With integrated support for service discovery, ECS can now take care of automatically registering IPs when a new container is deployed and de-registering them when the container is stopped or crashes. Here are some of the advantages of using ECS Service Discovery over a load balancer:

  • Direct communication between your services (no extra hops)
  • Lower latency, if using AWS internal network and private namespaces
  • You can do service-to-service authentication
  • Not having a Load Balancer also means fewer resources to manage
  • You can make a logical group of services under one namespace
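To make the hostname-based lookup concrete, here is a minimal Go sketch of how a client such as service foo might find service bar. It assumes the placeholder namespace internal-domain.com from the example above; the serviceAddrs helper and the lookupFunc type are hypothetical names introduced only for illustration, and the resolver is injectable so the logic can be exercised without a real Route 53 namespace.

```go
package main

import (
	"fmt"
	"net"
)

// lookupFunc has the same signature as net.LookupHost, so a fake resolver
// can be swapped in when no real namespace is available.
type lookupFunc func(host string) ([]string, error)

// serviceAddrs resolves "<service>.<namespace>" to the IPs currently
// registered for that service. Both names are illustrative placeholders.
func serviceAddrs(lookup lookupFunc, service, namespace string) ([]string, error) {
	host := service + "." + namespace
	addrs, err := lookup(host)
	if err != nil {
		return nil, fmt.Errorf("resolving %s: %w", host, err)
	}
	return addrs, nil
}

func main() {
	// This only resolves from inside a network where the namespace exists.
	addrs, err := serviceAddrs(net.LookupHost, "bar", "internal-domain.com")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println("bar is reachable at:", addrs)
}
```

Because containers come and go, a client should resolve the hostname again when it reconnects rather than caching the IPs indefinitely.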

Solution: We created a Terraform module under our module-ecs package called ecs-service-with-discovery. It allows you to deploy an ECS Service with Service Discovery in AWS, taking care of registering the ECS service, configuring the network, and creating the necessary Route 53 alias for public hostnames. Currently, our module supports public or private hostnames (examples are provided for both scenarios) and tasks with the awsvpc network mode. The host and bridge network modes will be supported in future updates.

What to do about it: Upgrade to version v0.8.0 of module-ecs if this functionality would be useful to you.

Run daemon services on each Instance in your ECS cluster

Motivation: Many of our customers needed a way to run exactly one copy of a specific Docker container, such as a DataDog agent, on each EC2 Instance in their ECS cluster. This was hard to do with the ECS scheduler, so typically, some sort of hack was required.

Solution: We’ve added a new ecs-daemon-service module that you can use to deploy exactly one task on each active container instance that meets all of the task placement constraints specified in your cluster.

What to do about it: The ecs-daemon-service module is available in module-ecs, v0.8.2.

Other ECS Updates

We had several other updates to ECS this month:

  • module-ecs, v0.7.1: Fix the cidr_blocks parameter in the ecs-fargate module to properly handle lists of CIDR blocks.
  • module-ecs, v0.8.1: The ecs-fargate module now outputs the IAM Role ID and name via the fargate_task_execution_iam_role_id and fargate_task_execution_iam_role_name output variables, respectively.
  • module-ecs, v0.8.3: You can now configure volumes for the ecs-service-with-alb module using the new volumes parameter.

IAM Role + MFA fixes

Motivation: One of our customers wanted to use our aws-auth script to assume an IAM Role, use MFA, and set an expiration time longer than the 1h default. When he tried this, he hit the error “The requested DurationSeconds exceeds the 1 hour session limit for roles assumed by role chaining.”

Solution: This issue is fixed in module-security, v0.15.0. Note that you need to update two things:

  1. aws-auth: The new version of the script can assume an IAM Role and use MFA without “role chaining,” so you can use longer expiration times.
  2. cross-account-iam-roles: After fixing the aws-auth issue, we hit a new one: we were able to successfully assume IAM Roles in other accounts, but every API call would fail with the error “Access Denied.” It turns out that the IAM Roles we were creating in the cross-account-iam-roles module required an MFA token not only to assume the IAM Role, but also for every API call after that, which doesn’t work with aws sts assume-role. We’ve updated the cross-account-iam-roles module so it only requires MFA to assume the role in its Trust Policy, which is all that’s really necessary for MFA protection.
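For readers who haven't seen an MFA-protected Trust Policy, here is a minimal Go sketch that builds one and prints it as JSON. It uses standard IAM policy syntax (the sts:AssumeRole action and the aws:MultiFactorAuthPresent condition key), but it is an illustration only, not a copy of the policy the cross-account-iam-roles module generates. The account ARN is a placeholder, and the mfaTrustPolicy helper and its types are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// statement and policyDocument model just the fields this sketch needs.
type statement struct {
	Effect    string                       `json:"Effect"`
	Principal map[string]string            `json:"Principal"`
	Action    string                       `json:"Action"`
	Condition map[string]map[string]string `json:"Condition"`
}

type policyDocument struct {
	Version   string      `json:"Version"`
	Statement []statement `json:"Statement"`
}

// mfaTrustPolicy returns a trust policy that lets the given principal assume
// the role only when the caller authenticated with MFA.
func mfaTrustPolicy(trustedPrincipalArn string) policyDocument {
	return policyDocument{
		Version: "2012-10-17",
		Statement: []statement{{
			Effect:    "Allow",
			Principal: map[string]string{"AWS": trustedPrincipalArn},
			Action:    "sts:AssumeRole",
			Condition: map[string]map[string]string{
				"Bool": {"aws:MultiFactorAuthPresent": "true"},
			},
		}},
	}
}

func main() {
	// Placeholder account ID; substitute the account you want to trust.
	policy := mfaTrustPolicy("arn:aws:iam::111122223333:root")
	out, err := json.MarshalIndent(policy, "", "  ")
	if err != nil {
		fmt.Println("could not encode policy:", err)
		return
	}
	fmt.Println(string(out))
}
```

The MFA condition here sits on the trust relationship, so it is checked once at assume-role time. The permissions policies attached to the role would then carry no MFA condition of their own.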

What to do about it: Update your cross-account-iam-roles usage in each of your AWS accounts and update your local copy of the aws-auth script to module-security, v0.15.0.

module-data-storage resource naming

Motivation: A Gruntwork customer was importing an existing RDS instance whose resources (subnet name, description, etc.) did not match the naming conventions used within the module-data-storage package. This was causing Terraform to indicate that their RDS instance would be destroyed and recreated, an obviously unacceptable consequence.

Solution: We updated the module-data-storage package to expose several new variables to allow you to specify the names and descriptions of all the sub-resources, allowing them to match that of the already running instance that is being imported. If these new variables are not specified, the previous default naming schemes are used.

What to do about it: Upgrade to version v0.6.7 of module-data-storage if this functionality would be useful to you.

Terragrunt updates

We’ve made a number of updates to Terragrunt this month:

  • Terragrunt, v0.16.5: You can now override the Terragrunt download dir using the TERRAGRUNT_DOWNLOAD environment variable.
  • Terragrunt, v0.16.6: When Terragrunt calls terraform init -from-module=xxx to download the code for xxx, it now sets the -get=false, -get-plugins=false, and -backend=false params. These will all be handled in a later call to init instead. This should improve iteration speed when calling Terragrunt commands.
  • Terragrunt, v0.16.7: There are now two “init” commands you can use in hooks: init-from-module, which only executes when downloading remote Terraform configurations based on the source parameter, and init, which executes for all other init invocations (e.g., to configure backends, download providers, download modules).
  • Terragrunt, v0.16.8: You can now set environment variables in extra_arguments by specifying key value pairs in the env_vars parameter.
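As a rough illustration of the v0.16.6 change, the Go sketch below assembles the argument list for the download-only init call described above. The flag names come from the release note; the initFromModuleArgs helper is a hypothetical name, and this is not Terragrunt's actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// initFromModuleArgs builds the arguments for the first, download-only
// "terraform init" call. Modules, plugins, and backend configuration are
// skipped here and left to a later init call.
func initFromModuleArgs(source string) []string {
	return []string{
		"init",
		"-from-module=" + source,
		"-get=false",
		"-get-plugins=false",
		"-backend=false",
	}
}

func main() {
	// "xxx" is the placeholder module source used in the release note.
	args := initFromModuleArgs("xxx")
	fmt.Println("terraform " + strings.Join(args, " "))
}
```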

Open source updates

  • terraform-aws-consul, v0.3.6: You can now specify custom tags for the Security Group in the consul-cluster module by specifying the security_group_tags parameter.
  • terraform-aws-consul, v0.3.7: Allows passing an exact download url to the packer ami template and the install script (useful for installing Consul Enterprise) and bumps the default Consul install version.
  • terraform-aws-consul, v0.3.8: You can now attach additional security groups to the consul-cluster module using the new additional_security_group_ids parameter.
  • terraform-aws-vault, v0.9.2: You can now set custom tags for the Vault ELB using the optional parameter lb_tags.
  • terraform-aws-vault, v0.9.3: You can now specify a custom port number to use for Vault health checks in the vault-elb module using the optional param health_check_port.
  • terraform-aws-vault, v0.10.0: You can now set custom tags on security groups and S3 buckets using new parameters in the vault-cluster and vault-elb modules.
  • terraform-aws-vault, v0.10.2: You can now pass specific URLs for downloading Vault and Consul when building the AMI. This is useful for installing the enterprise or pro packages.
  • terraform-aws-nomad, v0.4.4: The root example now lets you specify a custom VPC using the vpc_id parameter.
  • fetch, v0.3.0: fetch should now work with GitHub Enterprise URLs! If the repo URL you specify is not GitHub.com, fetch will automatically assume it's GitHub Enterprise and use the proper API calls. This defaults to GitHub Enterprise version v3, but that can be overridden with the new --github-api-version option.
  • fetch, v0.3.1: Fetch should now work with two-digit versions (vX.Y).
  • bash-commons, v0.0.5: Two new functions aws_get_instances_with_tag and aws_wrapper_get_ips_with_tag were added to allow retrieving EC2 instances and their (public/private) IPs using the value of a specific tag.
  • bash-commons, v0.0.6: The aws_get_instances_with_tag and aws_wrapper_get_ips_with_tag functions were updated to return only pending and running EC2 instances.

Other updates

  • module-security, v0.15.1: The iam_user_self_mgmt policy in the iam-policies module now includes the iam:DeleteVirtualMFADevice permission, which seems to be required now to add an MFA device, but is also useful for deleting one.
  • module-load-balancer, v0.10.1: Added a helper variable, allow_inbound_from_security_group_ids_num, which is the number of elements in var.allow_inbound_from_security_group_ids. We used to compute this count automatically, but due to a Terraform limitation, that fails if var.allow_inbound_from_security_group_ids contains any dynamic resources. See: hashicorp/terraform#11482
  • module-ci, v0.13.1: The git-add-commit-push script will now retry on "cannot lock ref" errors that seem to come up if two git push calls happen simultaneously.
  • package-messaging, v0.1.1: The SQS module can now create queues with fifo_queue set to true without running into a naming error. Due to a terraform limitation, our conditions on resources weren’t taking effect and the appropriate .fifo suffix for FIFO queues wasn’t being appended as required to the supplied name. See: hashicorp/terraform#13389
  • module-load-balancer, v0.11.1: Updated the ALB module to accept multiple HTTP listeners and allowed security group ids. We had an incorrect approach to calculating what security group rules needed to be created and were running into an error because duplicate security group rules were being created.
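The FIFO naming rule behind the package-messaging fix is easy to state in code: SQS requires the name of a FIFO queue to end in .fifo. Here is a minimal Go sketch of that rule. The queueName helper is a hypothetical name, and skipping the suffix when the name already has it is an assumption of this sketch, not necessarily what the SQS module does.

```go
package main

import (
	"fmt"
	"strings"
)

// queueName returns the name to use when creating the queue. FIFO queues get
// a ".fifo" suffix unless the supplied name already ends with it.
func queueName(name string, fifo bool) string {
	if fifo && !strings.HasSuffix(name, ".fifo") {
		return name + ".fifo"
	}
	return name
}

func main() {
	fmt.Println(queueName("orders", false)) // standard queue keeps its name
	fmt.Println(queueName("orders", true))  // FIFO queue gets the suffix
}
```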

DevOps News

Aurora Serverless MySQL now available

What happened: AWS has announced that Aurora Serverless for MySQL is now generally available in several regions!

Why it matters: Aurora Serverless is an on-demand, auto-scaling, serverless relational database. That means you don’t need to deploy any servers or configure it with a certain amount of capacity ahead of time. You just provision a “cluster” in a few seconds and after that, it will scale CPU, memory, and storage capacity up and down in response to load. Moreover, if you don’t use the database for a while, you can configure Aurora Serverless to shut it down automatically (e.g., after 60 min of inactivity) so that you only pay for storage costs. This can be very useful for infrequently used databases (e.g., in a pre-prod environment).

What to do about it: Aurora Serverless for MySQL is available now in US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Ireland), and Asia Pacific (Tokyo). Note that while it has many advantages, there are also many limitations, so use with care! We’ll be updating module-data-storage with support for Aurora Serverless soon.

Vault 0.11 has been released

What happened: HashiCorp has released Vault 0.11.

Why it matters: A few of the key new features in Vault 0.11 are:

  • Namespaces (Enterprise): isolated, self-managed environments.
  • Performance Standby Nodes (Enterprise): Improve read performance via a new type of performance-focused standby node.
  • Vault Agent: Automatically manage the secure introduction and renewal of tokens for local applications.
  • ACL Templates: Support templating for identity groups, entities, and metadata within ACL policies.

What to do about it: Try out the new version and let us know how it works for you!

Security Updates

Below is a list of critical security updates that may impact your services. We notify Gruntwork customers of these vulnerabilities as soon as we know of them via the Gruntwork Security Alerts mailing list. It is up to you to scan this list and decide which of these apply and what to do about them, but most of these are severe vulnerabilities, and we recommend patching them ASAP.

MySQL

  • ALAS-2018-1070: A number of vulnerabilities have been found in several MySQL versions including 5.6 and 5.7. Some of the vulnerabilities allow attackers access to a subset of your data, and some are remotely exploitable, so we strongly recommend updating.

Spectre and Meltdown

  • USN-3756-1: New vulnerabilities of the Spectre and Meltdown variety are now discovered regularly. In a cloud environment, it is important to fix them by updating your OS on a regular basis.

Tomcat

  • DSA-4281-1: Several issues were discovered in the Tomcat servlet and JSP engine. Some of these allow attackers to access URLs they should not have access to.