Gruntwork Newsletter, August 2020

Yevgeniy Brikman
Co-Founder
Published August 28, 2020

Once a month, we send out a newsletter to all Gruntwork customers that describes all the updates we’ve made in the last month, news in the DevOps industry, and important security updates. Note that many of the links below go to private repos in the Gruntwork Infrastructure as Code Library and Reference Architecture that are only accessible to customers.

Hello Grunts,

In the last month, we revealed the new design of the Infrastructure as Code Library, which now consists of three layers: a Module Catalog, Service Catalog, and Architecture Catalog. This is a new standard for infrastructure code and we believe it’ll make it easier than ever to go to prod on AWS! We also updated Gruntwork Landing Zone to support a dedicated logs account and streamline the deployment process, updated Gruntwork Pipelines with support for executing arbitrary infrastructure code (e.g., building Docker images, building AMIs, committing to Git, etc.), added new modules for Redshift and EC2 backup using Data Lifecycle Manager, and wrote a guide to securely managing secrets with Terraform. In the wider DevOps world, there have been a bunch of new releases: AWS Provider 3.0, Terraform 0.13, CDK for Terraform, and Kubernetes 1.17 support in EKS.

As always, if you have any questions or need help, email us at support@gruntwork.io!

Gruntwork Updates

Introducing: The Gruntwork Module, Service, and Architecture Catalogs

Motivation: While most customers loved being able to directly use modules from the Gruntwork Infrastructure as Code Library (IaC Library) to create their own custom infrastructure, many of our customers asked for a higher-level API to work with—something that lets them go to prod with less work.

Solution: We’ve created a new design for the Gruntwork IaC Library! It now consists of three layers:

  • The Gruntwork Module Catalog: Build your infrastructure by mixing & matching hundreds of reusable, battle-tested modules. This is what you’ve been using the last few years!
  • The Gruntwork Service Catalog [NEW]: Deploy off-the-shelf services, without writing any code. Each service combines multiple modules into a highly configurable package that’s designed to be deployed directly to production.
  • The Gruntwork Architecture Catalog [NEW]: Deploy proven, end-to-end architectures that contain all the services you need to go to prod, already wired together and fully automated.

What to do about it: Read the introductory blog post to learn more, including code samples and diagrams that show you how each of these catalogs works. The Module Catalog is already available to everyone. The Service Catalog and Architecture Catalog are available now as part of a private, invite-only alpha. If you’re interested, contact us to learn how to get access!

Landing Zone Updates: Dedicated Logs Account and Better UX

Motivation: Since we released our Landing Zone solution, we’ve had a lot of requests from our customers to support aggregating logs in a dedicated logs account. We also uncovered some chicken-and-egg issues related to AWS Config and CloudTrail that required you to run apply multiple times across multiple modules, once with AWS Config and CloudTrail disabled, and once with them enabled.

Solution: We’ve updated our Landing Zone solution with support for a dedicated logs account which can be used to aggregate AWS Config and CloudTrail data from all your other accounts! Moreover, we’ve created workarounds for the chicken-and-egg problems related to AWS Config and CloudTrail, so now instead of having to run apply multiple times across multiple modules, you only need to run apply once per AWS account, as you’d expect.
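
To give a feel for what this looks like in practice, here is a minimal sketch of an application account's baseline pointing its audit data at the dedicated logs account. The input variable names below (config_central_account_id, cloudtrail_s3_bucket_name) are illustrative assumptions, not the module's confirmed API; follow the deployment guide and the v0.36.0 release notes for the real interface:

# Illustrative sketch only: the variable names below are assumptions, not the confirmed API.
module "account_baseline" {
  source = "git::git@github.com:gruntwork-io/module-security.git//modules/account-baseline-app?ref=v0.36.0"

  # Hypothetical inputs: send AWS Config and CloudTrail data to the dedicated
  # logs account instead of keeping it locally in this account.
  config_central_account_id = var.logs_account_id
  cloudtrail_s3_bucket_name = "all-accounts-cloudtrail-logs"
}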

What to do about it: If you’re upgrading an existing Landing Zone deployment, upgrade to the v0.36.0 release of module-security, making sure to follow the migration guide in those release notes and in the release notes for v0.34.0 too! If you’re deploying your Landing Zone from scratch, check out our dedicated deployment guide instead.

Gruntwork Pipelines Update: Deploy arbitrary infrastructure code

Motivation: Back in March, we announced the release of Gruntwork Pipelines, our solution for implementing a secure, automated CI/CD pipeline for infrastructure code. The initial release was limited to just Terraform and Terragrunt: if your deployment workflows depended on building Docker images, building AMIs with Packer, or committing changes to Git, you had to do those steps directly on the CI server. This meant that you still had to grant some potentially powerful permissions to the CI server directly (e.g., write access to your Git repos), and mixing a CI server—which, by design, is used by your entire team to execute arbitrary code—with powerful permissions is not the best idea from a security perspective.

Solution: To address this limitation, we’ve updated Gruntwork Pipelines to support invoking arbitrary scripts—so instead of solely letting you run Terraform/Terragrunt deployments via the built-in infrastructure-deploy-script, you can now also run scripts for building Docker images, building AMIs with Packer, committing to Git, and your own custom scripts—all in a secure, locked-down manner that does not require you to give your CI servers powerful permissions. The new version of Gruntwork Pipelines enhances the existing modules with the following feature set:

  • Invoke arbitrary scripts. You define the exact scripts that can be used and can limit the arguments passed to those scripts. All of this is enforced in the ECS Deploy Runner Docker container using a custom entrypoint.
  • Directly read from AWS Secrets Manager in the ECS Deploy Runner Tasks (as opposed to implicitly reading secrets via environment variable injection).
  • Take advantage of our “standard configuration,” which includes four containers for separation of concerns and least privilege: docker-image-builder, ami-builder, terraform-planner, and terraform-applier.
  • Build Docker images and push them to ECR using the build-docker-images script. Under the hood, we use a custom kaniko container for building Docker images in ECS Fargate.
  • Build AMIs with Packer using the build-packer-artifact script. This script now supports securely injecting SSH keys from AWS Secrets Manager.
  • Automatically update variables in your Terraform code using terraform-update-variable. This script now supports securely injecting SSH keys from AWS Secrets Manager, updating multiple name value pairs in one call, and specifying the commit message text.
  • Enforce which Git refs can run apply in the infrastructure-deploy-script.

We also updated our example pipeline so that you have an example of how to configure your apps to utilize all this new functionality.
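
As a rough sketch of the overall shape, the snippet below wires the four standard containers into the deploy runner. The input and output names here are hypothetical placeholders rather than the modules' confirmed APIs; treat the example pipeline and the v0.24.0 release notes as the source of truth:

# Rough sketch only: input and output names below are hypothetical placeholders.
module "standard_config" {
  source = "git::git@github.com:gruntwork-io/module-ci.git//modules/ecs-deploy-runner-standard-configuration?ref=v0.24.0"

  # One container per concern, for separation of concerns and least privilege.
  docker_image_builder = { container_image = var.kaniko_image }        # hypothetical
  ami_builder          = { container_image = var.deploy_runner_image } # hypothetical
  terraform_planner    = { container_image = var.deploy_runner_image } # hypothetical
  terraform_applier    = { container_image = var.deploy_runner_image } # hypothetical
}

module "ecs_deploy_runner" {
  source = "git::git@github.com:gruntwork-io/module-ci.git//modules/ecs-deploy-runner?ref=v0.24.0"

  container_images = module.standard_config.container_images # hypothetical wiring
  vpc_id           = var.vpc_id
  vpc_subnet_ids   = var.private_subnet_ids
}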

What to do about it: Check out v0.24.0 of the ECS deploy runner, including the migration guide in the release notes to update your existing pipelines and let us know what you think!

New module: Redshift Cluster

Motivation: Amazon Redshift gives you a managed, scalable data warehouse in the cloud.

Solution: We’ve added a new redshift module to module-data-storage! This makes it easy to use Redshift in just a few lines of code:

module "redshift_example" {
source = "git::git@github.com:gruntwork-io/module-data-storage.git//modules/redshift?ref=v0.15.0"
name = "example-cluster"
port = 5439
master_username = var.master_username
master_password = var.master_password
instance_type   = "dc2.large"
number_of_nodes = 3
vpc_id     = var.vpc_id
subnet_ids = var.subnet_ids
}

What to do about it: Try out the new redshift module and let us know what you think!

New module: EC2 Backup

Motivation: Amazon Data Lifecycle Manager allows you to configure automatic EBS volume backups on flexible schedules.

Solution: We’ve added a new ec2-backup module to module-server! This makes it a snap to configure your own EBS backup policies.
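
For illustration, usage might look something like the sketch below; the input names (target tags, schedule, retention) are assumptions rather than the module's confirmed interface, so check the ec2-backup docs before copying:

# Illustrative sketch: variable names are assumptions, not the confirmed ec2-backup API.
module "ec2_backup" {
  source = "git::git@github.com:gruntwork-io/module-server.git//modules/ec2-backup" # pin to a real release with ?ref=...

  name = "daily-ebs-backup"

  # Hypothetical inputs: snapshot all EBS volumes attached to instances with this
  # tag once every 24 hours, and keep the 14 most recent snapshots.
  target_tags         = { Backup = "true" }
  interval_hours      = 24
  snapshots_to_retain = 14
}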

What to do about it: Try out the new ec2-backup module and let us know what you think!

New blog post: A comprehensive guide to managing secrets in your Terraform code

Motivation: One of the most common questions we get about using Terraform to manage infrastructure as code is how to handle secrets such as passwords, API keys, and other sensitive data. The right approach to use is not obvious and there are many gotchas and stumbling blocks.

Solution: We wrote a blog post called A comprehensive guide to managing secrets in your Terraform code that goes over the most common techniques you can use—including environment variables, encrypted files (e.g., KMS, PGP, SOPS), and secret stores (e.g., Vault, AWS Secrets Manager)—the trade-offs between them, and important prerequisites, such as secure Terraform state storage.
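
As a quick taste of the simplest of those techniques, here is a minimal sketch of the environment variable approach: declare the secret as an input variable with no default, then supply the value via Terraform's TF_VAR_ mechanism so it never gets hard-coded or committed (the resource and variable names are just examples):

# Declare the secret as an input variable with no default value...
variable "db_password" {
  description = "The password for the database. Set via the TF_VAR_db_password environment variable."
  type        = string
}

# ...and reference it instead of hard-coding the secret in your code.
resource "aws_db_instance" "example" {
  engine            = "mysql"
  instance_class    = "db.t3.micro"
  allocated_storage = 10

  username = "admin"
  password = var.db_password # read from the environment, never stored in the code
}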

What to do about it: Check out the blog post and let us know which secrets management solution you like best!

Open Source Updates

Terragrunt

  • v0.23.32: You can now use dependency block references in sub-blocks of the terraform block with xxx-all commands. Also fixes a bug where dependency blocks ran irrelevant hooks when retrieving outputs.
  • v0.23.33: This release introduces a new CLI flag --terragrunt-debug which can be used to initiate debug mode for terragrunt. In debug mode, terragrunt will emit a terragrunt-generated.tfvars.json file into the terragrunt directory which you can use to inspect what TF variable inputs are being passed to terraform. See the docs to learn more.
  • v0.23.34: Starting with this release, you can use the provided Makefile to build the terragrunt binary with make build.
  • v0.23.35: There is now an optimization on dependency output fetching if certain conditions are met. See the updated docs for more information.
  • v0.23.36: This fixes a bug that was introduced in the dependency retrieval optimization, where it was not accounting for IAM role assume configurations.
  • v0.23.37: You can now benefit from dependency optimization even if you are not managing remote state in generate mode. You can also disable dependency optimization using a feature flag on the remote_state block: disable_dependency_optimization = true (see the sketch below).
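
Here is a minimal sketch of that flag in a terragrunt.hcl remote_state block (the S3 backend values are placeholders):

# terragrunt.hcl: minimal sketch; the bucket and region values are placeholders.
remote_state {
  backend = "s3"

  # Opt out of the dependency output fetching optimization introduced in v0.23.35.
  disable_dependency_optimization = true

  config = {
    bucket = "my-terraform-state"
    key    = "${path_relative_to_include()}/terraform.tfstate"
    region = "us-east-1"
  }
}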

Terratest

  • v0.28.8: Added a terraform.OutputStruct method that can parse the result of terraform output into a custom struct you provide.
  • v0.28.9: You can now configure Terraform's -parallelism flag by setting the corresponding field in terraform.Options.
  • v0.28.10: Make sure the -parallelism flag works for xxx-all commands in Terragrunt too.
  • v0.28.11: Added a new aws.GetRecommendedInstanceType function to help us solve the issue where t2.micro and t3.micro instances are each available in some AZs, but not available in others. This function takes in a list of instance types to pick from and returns the recommended one to use, which is one that's available in all AZs in the current region. Also added a new pick-instance-type CLI that can be used to execute the same function from the CLI and get a recommended instance type printed to stdout.
  • v0.28.12: Added several helper functions for AWS Secrets Manager.
  • v0.28.13: Fix a bug where the HTTPDoWithValidationRetry method, and most of the other http_helper methods that call it under the hood, showed the wrong HTTP method in log statements.
  • v0.28.14: The error message for when the expected instances could not be found with GetRandomInstance and GetRandomInstanceE now properly indicates that the issue could be zonal and not just regional.
  • v0.28.15: aws.GetRecommendedInstanceTypeWithClientE: a new function that can be used to call GetRecommendedInstanceTypeE with a preconfigured AWS SDK Go client.

terraform-aws-nomad

  • v0.6.4: The nomad-cluster now allows you to enable scale-in protection using the protect_from_scale_in input variable.
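
For example, turning this on for an existing cluster is a single extra input (a minimal sketch; your other inputs stay as they are):

module "nomad_cluster" {
  source = "git::git@github.com:gruntwork-io/terraform-aws-nomad.git//modules/nomad-cluster?ref=v0.6.4"

  # ... other configuration ...

  # New in v0.6.4: prevent the ASG from terminating these instances during scale-in.
  protect_from_scale_in = true
}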

terraform-aws-consul

  • v0.7.5: The consul-cluster now allows you to enable scale-in protection using the protect_from_scale_in input variable.
  • v0.7.6: Removed a duplicate required_version entry from the consul-security-group-rules module. This should have no impact on behavior.
  • v0.7.7: Set default values for availability_zones and subnet_ids to null in consul-cluster. As of AWS Provider 3.x.x, only one of these parameters may be set at a time on an Auto Scaling Group, so we now have to use null rather than empty list as our default.

terraform-aws-vault

  • v0.13.8: You can now override the data directory for Vault using the --data-dir argument in run-vault.

package-terraform-utilities

  • v0.2.1: Added a new instance-type module that can tell you which of a list of instance types are available in all AZs in the current AWS region.

Other updates

module-security

  • v0.32.4: Fix ssh_key param in one of the examples so that tests will pass. No modules were changed.
  • v0.32.5: Added a new logs IAM policy, IAM group, and IAM role that grants access to logs in CloudTrail, AWS Config, and CloudWatch.
  • v0.33.0: When creating a CMK using the kms-master-key module, you can now provide IAM conditions for the key users. Previously, the module only accepted a list of users, and did not accept any conditions.
  • v0.33.1: Fix a syntactic error in account-baseline-security that prevented the module from working. Also, fix some test failures that obscured this.
  • v0.33.2: Adds the sts:TagSession permission to the allow_access_to_other_accounts IAM policy. This will allow session tags. As an example, this is used with the “Configure AWS Credentials” GitHub action.
  • v0.34.0: Adds support for sending logs to a dedicated logging account as part of our Landing Zone solution, as discussed above. Be sure to follow the steps outlined in the migration guide.
  • v0.34.1: Bug fix release for a few issues introduced in v0.34.0.
  • v0.34.2: This release adds a role with permissions only to access support, as required by the CIS AWS Foundations Benchmark. Previously, this permission was available in iam-groups, but not as an IAM role.
  • v0.34.3: Allows an empty list of users and admins in cloudtrail-created KMS keys. Previously, the kms_key_user_iam_arns and kms_key_administrator_iam_arns variables were required. They are now optional and default to an empty list.
  • v0.34.4: This release adds read only permissions to the read_only IAM policy for the Performance Insights service.
  • v0.34.5: There appears to be a Terraform bug where, when you run destroy, you can get errors about (valid) references to resources that use count or for_each (e.g., foo.bar[0]). This release has a workaround for this issue, so hopefully, destroy works correctly now.
  • v0.35.0: Starting with this release, tests are run against the v3.x series of the AWS provider.
  • v0.36.0: Refactored the account-baseline-xxx modules to work around several chicken-and-egg problems related to AWS Config / CloudTrail.
  • v0.36.1: This release introduces a new module kms-grant-multi-region that allows you to manage KMS grants for KMS keys across multiple regions.
  • v0.36.2: You can now set the max session duration for human and machine cross account IAM roles managed in the account-baseline modules using the max_session_duration_human_users and max_session_duration_machine_users input vars.
  • v0.36.3: Resolve shellcheck issues in aws-auth.

module-ci

  • v0.23.2: The iam-policies modules will now output the policy JSON even when the policy is not created.
  • v0.23.3: infrastructure-deployer and infrastructure-deploy-script now support deploying the repo root path using "" for --deploy-path. This is now the default for --deploy-path when it is omitted from the CLI args.
  • v0.23.4: You can now set the backend-config option on the init call in the ecs-deploy-runner by passing in --backend-config to the infrastructure-deployer CLI.
  • v0.24.0: You can now run arbitrary scripts in a controlled fashion in the deploy runner containers.
  • v0.24.1: You can now disable specific containers in the standard configuration by setting the corresponding configuration option to null.
  • v0.24.2: Add the ability to set custom tags on all the resources managed by the ecs-deploy-runner module.
  • v0.24.3: The infrastructure-deploy-script now supports passing in -var-file to terraform and terragrunt.
  • v0.24.4: Update install-jenkins to use the new Linux Repository signing keys, as the old ones expired.
  • v0.25.0: The ecs-deploy-runner can now be provisioned with an EC2 worker pool to use as reserved workers to speed up the initial boot sequence for the ECS deploy runner tasks.
  • v0.25.1: ecs-deploy-runner now returns the ECS cluster EC2 worker pool IAM role and ASG name.
  • v0.26.0: Allows users to include environment variables in the ECS deploy-runner containers. To include an environment variable, use the environment_vars field of the container_images variable in the ecs-deploy-runner and ecs-deploy-runner-standard-configuration modules.
  • v0.27.0: Starting with this release, tests are run against the v3.x series of the AWS provider.
  • v0.27.1: You can now query the available containers and scripts in the ecs-deploy-runner using the --describe-containers command. Refer to the updated documentation for more info.
  • v0.27.2: Update install-jenkins to the latest Jenkins version (2.235.5), switch to https URLs for the APT sources, and add DEBIAN_FRONTEND=noninteractive to all apt-get calls to ensure the installs don't show interactive prompts.

package-static-assets

  • v0.6.5: We now accept new variables base_domain_name and base_domain_name_tags to look up the relevant hosted zone so that hosted_zone_id need not be provided.

module-data-storage

  • v0.14.0: A number of updates to the aurora module: it now sets aurora-mysql (MySQL 5.7-compatible) instead of aurora (MySQL 5.6-compatible) as the default engine; it no longer ignores the password param when snapshot_identifier is set; it now properly supports setting allow_connections_from_cidr_blocks to an empty list.
  • v0.15.0: Remove an unused is_primary parameter from the aurora module.

terraform-aws-eks

  • v0.20.4: Fix bug where eks-cluster-workers errors out on the aws_autoscaling_group resource in AWS provider versions >v2.63.0.
  • v0.21.0: The upgrade scripts for eks-cluster-control-plane now support upgrading to Kubernetes 1.17. Note that in the process, the AWS VPC CNI version was also updated for ALL kubernetes versions to match expectations with AWS.
  • v0.21.1: eks-cluster-managed-workers will now ignore changes to desired_size after the initial deployment, to be compatible with the cluster autoscaler.
  • v0.21.2: Fix bug where the control plane upgrade scripts fail on python3.
  • v0.22.0: The EKS cluster control plane upgrade script now uses the right image tags for the core components. Additionally, this release drops support for k8s 1.13 and 1.14 in the upgrade script.

module-asg

  • v0.9.1: Fix bug where asg-rolling-deploy errors out on the aws_autoscaling_group resource in AWS provider versions >v2.63.0.
  • v0.9.2: Adds the ARN of the ASG as an output.
  • v0.10.0: The availability_zones input has been dropped from the asg-rolling-deploy module, which is only used in EC2-Classic mode. To control availability zones, use the vpc_subnet_ids input variable instead.
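
In practice the v0.10.0 migration is a one-line swap (a minimal sketch; other inputs omitted):

module "asg" {
  source = "git::git@github.com:gruntwork-io/module-asg.git//modules/asg-rolling-deploy?ref=v0.10.0"

  # ... other configuration ...

  # Before (v0.9.x, EC2-Classic only):
  # availability_zones = ["us-east-1a", "us-east-1b"]

  # After (v0.10.0): control placement via VPC subnets instead.
  vpc_subnet_ids = var.private_subnet_ids
}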

module-ecs

  • v0.20.5: Fix bug where ecs-cluster errors out on the aws_autoscaling_group resource in AWS provider versions >v2.63.0.
  • v0.20.6: The roll-out-ecs-cluster-update.py script will now directly detach the old instances from the ASG during a rollout to ensure the old ones get removed.
  • v0.20.7: You can now set the permission boundary on the IAM roles created in the ecs-daemon-service module.
  • v0.20.8: You can now set the permissions boundary for the ECS service IAM role for ELBs.
  • v0.20.9: Add ECS capacity provider functionality to ECS clusters.
  • v0.20.10: You can now conditionally shut off the ecs-cluster module using the create_resources input flag. You can also provide a base64 user data parameter for cloud-init configurations.
  • v0.20.11: Fix an issue where the ecs-scripts module could exit with an error when editing crontab. Also fix a number of ShellCheck warnings.
  • v0.21.0: Starting with this release, tests are run against the v3.x series of the AWS provider.
  • v0.21.1: Add prefix to the ECS capacity providers to support ECS cluster names that begin with ecs or aws.

terraform-aws-vpc

This module was renamed from module-vpc to terraform-aws-vpc. GitHub automatically redirects traffic to the new name, so no action is required as a result of this change.

  • v0.9.0: Switch the vpc-app and vpc-mgmt modules from using the deprecated blacklisted_names and blacklisted_zone_ids parameters to the new exclude_names and exclude_zone_ids parameters.
  • v0.9.1: Adds subnet ARNs to the outputs for vpc-app and vpc-mgmt.
  • v0.9.2: Adds the create_resources variable to allow disabling the module by setting create_resources=false.
  • v0.9.3: In the vpc-peering-external module, it's now possible to disable the network ACL DENY rules by setting enable_blanket_deny=false. This can be useful when you need to add your own ACLs and you're bumping up against the 20 rule limit. Also, as outlined in the Terraform AWS provider v3 upgrade guide, CloudWatch Logs group ARNs no longer include the :* at the end, which caused a problem in the vpc-flow-logs module. This is now resolved.
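
A minimal sketch of the new toggle (other inputs omitted):

module "vpc_peering_external" {
  source = "git::git@github.com:gruntwork-io/terraform-aws-vpc.git//modules/vpc-peering-external?ref=v0.9.3"

  # ... other configuration ...

  # New in v0.9.3: skip the blanket network ACL DENY rules so you can define your
  # own ACLs without bumping into the 20-rule limit.
  enable_blanket_deny = false
}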

module-load-balancer

  • v0.20.2: Add a Load Balancer Listener Rules module, an alternative to creating lb_listener_rule resources directly in Terraform, which can be convenient when configuring listener rules in a Terragrunt configuration.
  • v0.20.3: The arn_suffix attribute is now available as an output from the alb module.
  • v0.20.4: The lb-listener-rules module now lets you use HTTP headers in conditions via the http_headers param.

terraform-aws-monitoring

  • v0.22.0: Update the Route53 Health Check Alarms module to accept a map with the alarm’s configuration, allowing you to create more than one resource.

package-openvpn

  • v0.10.0: var.subnet_id has been renamed to var.subnet_ids and takes a list of subnets. The OpenVPN AutoScaling group will include all subnets specified in the list.
  • v0.11.0: Use python to manage sleeps to delay resource creation for IAM propagation. This means that you must have python installed on your machine to use this module.

package-kafka

  • v0.6.3: You can now install generate-key-stores using gruntwork-install.

module-cache

  • v0.9.4: Fix the default parameter-group setting value in the redis module when using clustered mode.

cis-compliance-aws

  • v0.5.1: custom-iam-entity module now supports updating the max session duration of the IAM role.

DevOps News

Terraform AWS Provider v3.0 is out

What happened: HashiCorp and AWS have released v3.0 of the AWS Provider for Terraform.

Why it matters: This is a major new release of the AWS provider, which means that you get new features, such as improvements to the Amazon Certificate Manager (ACM) resources, removal of state hashing, and improved authentication ordering. However, major releases also bring breaking changes. In particular, this new version of the AWS Provider only works with Terraform 0.12 and newer and has breaking changes for many individual resources.

What to do about it: We have started work to upgrade the entire Gruntwork IaC Library to be compatible with AWS Provider v3.0. We will announce when this process is complete and how to update. Until then, if you haven’t already, we strongly recommend pinning your AWS Provider version to v2.x by adding code like this:

provider "aws" {
# ... other configuration ...
version = "~> 2.70"
}

Terraform 0.13 is out

What happened: HashiCorp has released Terraform version 0.13.

Why it matters: The new release brings some powerful new features, including the ability to use count and for_each on modules, automatic installation of third-party providers, and custom variable validation! However, the new release also brings some backwards incompatibilities.
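
For example, count on a module, which previously required copy/pasting module blocks, and custom validation rules on input variables now look like this (a minimal sketch; the module path is a hypothetical local module):

# Terraform 0.13: count (and for_each) now work on module blocks.
module "web_server" {
  source = "./modules/web-server" # hypothetical local module, for illustration only
  count  = 3

  name = "web-server-${count.index}"
}

# Terraform 0.13: custom validation rules on input variables.
variable "instance_type" {
  type    = string
  default = "t3.micro"

  validation {
    condition     = can(regex("^t3\\.", var.instance_type))
    error_message = "Only t3 instance types are allowed in this example."
  }
}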

What to do about it: As soon as we’re done with the AWS Provider 3.x upgrade, we will move on to upgrade all of our code to work with Terraform 0.13. We will announce when this process is complete and how to update. Until then, if you haven’t already, we strongly recommend pinning your Terraform version to 0.12.x by adding code like this:

terraform {
  required_version = "= 0.12.29"
}

CDK for Terraform: Python and TypeScript support for Terraform

What happened: HashiCorp has launched a developer preview of a Cloud Development Kit (CDK) for Terraform.

Why it matters: The CDK allows you to define your infrastructure using a general-purpose programming language, such as Python or TypeScript, instead of HCL, while still running Terraform under the hood. For example, you might create a VPC using the following TypeScript code:

class MyStack extends TerraformStack {
  constructor(scope: Construct, name: string) {
    super(scope, name);

    new AwsProvider(this, 'aws', {
      region: 'us-east-1'
    });

    const vpc = new Vpc(this, 'my-vpc', {
      cidrBlock: '10.0.0.0/16'
    });
    new Subnet(this, 'my-subnet', {
      vpcId: Token.asString(vpc.id),
      cidrBlock: '10.0.0.0/24'
    });
  }
}

const app = new App();
new MyStack(app, 'vpc-example');
app.synth();

And then deploy as follows:

$ cdktf deploy
Stack: vpcexample
Resources
+ AWS_SUBNET       mysubnet     aws_subnet.vpcexample_mysubnet_3769B309
+ AWS_VPC              myvpc        aws_vpc.vpcexample_myvpc_80A1790F

Diff: 2 to create, 0 to update, 0 to delete.
Do you want to continue (Y/n)?

What to do about it: Give the CDK a shot and let us know what you think! We will be trying it out too, especially as the solution matures more.

EKS / Kubernetes updates

What happened: Amazon has announced that EKS now supports Kubernetes version 1.17 and that managed node groups now support launch templates and custom AMIs.

Why it matters: The Kubernetes 1.17 release includes Cloud Provider Labels, ResourceQuotaScopeSelectors, TaintNodesByCondition, Finalizer protection, and CSI Topology graduating to generally available, and the Windows containers RunAsUsername feature is now in beta. Support for custom AMIs on managed node groups allows you to install custom software while still having EKS run and manage the worker nodes for you.

What to do about it: As of v0.21.0, our terraform-aws-eks repo supports Kubernetes 1.17, so feel free to upgrade. We also updated our Comprehensive Guide to EKS Worker Nodes blog post with the new information about managed worker nodes.