Gruntwork Newsletter, May 2019

Once a month, we send out a newsletter to all Gruntwork customers that describes all the updates we’ve made in the last month, news in the DevOps industry, and important security updates. Note that many of the links below go to private repos in the Gruntwork Infrastructure as Code Library and Reference Architecture that are only accessible to customers.
Hello Grunts,
In the last month, we’ve made a major upgrade to the Infrastructure as Code Library: in partnership with Google, we’ve added a collection of open source, production-grade, reusable modules for deploying your infrastructure on Google Cloud Platform (GCP)! We also launched a new documentation website, replaced our OracleJDK module with an OpenJDK module, added a module to automatically issue and validate TLS certs, made major updates to our Kubernetes/EKS code (including support for private endpoints, log shipping, ingress controllers, external DNS, etc), and fixed a number of critical bugs.
As always, if you have any questions or need help, email us at support@gruntwork.io!
Gruntwork Updates

Gruntwork for Google Cloud Platform (GCP)!
Motivation: Up until recently, we had been primarily focused on AWS, but this month, we’re excited to announce, in partnership with Google, that we’ve added first-class support for Google Cloud Platform (GCP)! And best of all, thanks to this partnership, all of our GCP modules are open source!
Solution: We worked directly with Google engineers to develop a set of reusable, production-grade infrastructure modules, including:
- terraform-google-gke: Launch a production-grade Google Kubernetes Engine (GKE) cluster.
- terraform-google-load-balancer: Deploy a Cloud Load Balancer.
- terraform-google-network: Deploy a best-practice VPC on GCP.
- terraform-google-static-assets: Manage static assets on GCP.
- terraform-google-sql: Easily deploy both MySQL & PostgreSQL using our Cloud SQL module.
We also now offer commercial support for both AWS and GCP. Check out our announcement blog post for the details.
What to do about it: To get started with these modules, check out our post on the Google Cloud Blog, Deploying a production-grade Helm release on GKE with Terraform. This blog post will walk you through setting up a Kubernetes cluster, configuring Helm, and using Helm to deploy a web service on Google Cloud in minutes. You can even try out the code samples from that blog post directly in your browser, without having to install anything or write a line of code, using Google Cloud Shell!
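If you'd like a feel for what the GCP modules look like in practice, here's a minimal, hedged sketch of launching a GKE cluster with terraform-google-gke. The input names and values below are illustrative assumptions, so check the module docs and the blog post's examples for the exact arguments the module expects:
# Sketch only: launching a GKE cluster with the open source terraform-google-gke
# modules. Input names and values are assumptions; see the module docs and the
# blog post's examples for working code.
module "gke_cluster" {
  source = "git::git@github.com:gruntwork-io/terraform-google-gke.git//modules/gke-cluster?ref=v0.1.0"

  name     = "example-cluster"   # hypothetical cluster name
  project  = "my-gcp-project"    # hypothetical GCP project ID
  location = "europe-west3"

  # In a real deployment these would typically reference outputs from the
  # terraform-google-network modules.
  network    = "default"
  subnetwork = "default"
}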

New documentation website (docs.gruntwork.io)
Motivation: DevOps is hard. There seem to be 1,001 little details to get right, and you never have the time to learn them all.
Solution: We’ve launched a new Gruntwork Docs site that helps you get up and running even faster! You can already find guides for Deploying a Dockerized App on GCP/GKE and Deploying a Production Grade EKS cluster.
What to do about it: Head to the Gruntwork Docs site at docs.gruntwork.io. We’ll be adding much more content in the future, so let us know in the comments and via support what DevOps issues you’re struggling with, and we’ll do our best to write up guides to answer your questions.
New OpenJDK installer module
Motivation: We discovered that Oracle has changed their policies to require authentication for all downloads of their JDK. This broke our install-oracle-jdk module, which in turn impacted all of our Java-based infrastructure packages: Kafka, Zookeeper, and ELK.
Solution: We created a new install-open-jdk module that installs OpenJDK instead of Oracle's JDK and is designed to be a drop-in replacement for the old module. In the past, the Oracle JDK was the best option, as OpenJDK was missing many features, had worse performance, and didn't offer commercial support. In recent years, however, the differences between the JDKs in terms of features and performance have become negligible, and Oracle no longer allows you to use their JDK for free (a paid license is required for production usage!). Therefore, most teams are now better off going with OpenJDK, which you can install using this module. Note that if you need commercial support for the JDK, you may wish to use Azul or AdoptOpenJDK instead. We're updating all of our own Java-based infrastructure packages to use this new module.
What to do about it: The new OpenJDK installer module is available as part of Zookeeper's v0.5.4 release. Check it out and use it instead of install-oracle-jdk, as we will be deprecating and removing the latter shortly.
acm-tls-certificate: new module to issue & validate TLS certificates
Motivation: AWS Certificate Manager (ACM) makes it easy to issue free, auto-renewing TLS certificates. So far, we’ve mostly been creating these certificates manually via the AWS Console, but we’ve always wanted to manage them as code.
Solution: We’ve created a new Terraform module called acm-tls-certificate
that can issue and automatically validate TLS certificates in ACM! Usage couldn’t be simpler:
# Create a TLS certificate for example.your-domain.com
module "cert" {
  source = "git::git@github.com:gruntwork-io/module-load-balancer.git//modules/acm-tls-certificate?ref=v0.13.2"

  domain_name    = "example.your-domain.com"
  hosted_zone_id = "ZABCDEF12345"
}
You pass in the domain name you want to use and the ID of the Route 53 Hosted Zone for that domain, and you get a free, auto-renewing TLS certificate that you can use with ELBs, CloudFront, API Gateway, etc! For example, here’s how you can use this certificate with an Application Load Balancer (ALB):
# Create a TLS certificate for example.your-domain.com
module "cert" {
  source = "git::git@github.com:gruntwork-io/module-load-balancer.git//modules/acm-tls-certificate?ref=v0.13.2"

  domain_name    = "example.your-domain.com"
  hosted_zone_id = "ZABCDEF12345"
}

# Attach the TLS certificate to an ALB
module "alb" {
  source = "git::git@github.com:gruntwork-io/module-load-balancer.git//modules/alb?ref=v0.13.2"

  alb_name = "example-alb"

  https_listener_ports_and_acm_ssl_certs = [
    {
      port            = 443
      tls_domain_name = "${module.cert.certificate_domain_name}"
    },
  ]

  # ... other params omitted ...
}
And now your load balancer is using the TLS certificate on its listener for port 443!
What to do about it: The acm-tls-certificate module is available in module-load-balancer, v0.13.2. Check out this example for fully-working sample code.
EKS Updates
Motivation: Since December of last year, we have been busy building a production-grade IaC module for EKS that makes it 10x easier to deploy and manage EKS. How production-grade your infrastructure is depends on how many items of our Production Grade Infrastructure Checklist it covers. This month, we shipped multiple new modules that enhance the security and monitoring capabilities of EKS clusters deployed with our modules.
Solution: Over the last month, we enhanced our EKS module with the following updates:
- We now support EKS private endpoints for clusters launched using the eks-cluster-control-plane module. Check out the module docs for more info. (v0.2.3)
- We now support directly accessing the tokens in Terraform code, as opposed to requiring setup of kubectl, to access the cluster using the kubernetes provider and kubergrunt. (v0.3.0)
- We enhanced support for managing multiple worker groups: we added support for taints, tolerations, and affinity rules for any infrastructure deployed using Helm, such as the fluentd-cloudwatch module, and introduced a module to create reciprocating security group rules (the eks-cluster-workers-cross-access module). (v0.3.1)
- We now support EKS control plane log shipping via the enabled_cluster_log_types variable. You can read more about this feature in the official AWS documentation. (v0.4.0)
- We added support for deploying the AWS ALB Ingress Controller via the eks-alb-ingress-controller and eks-alb-ingress-controller-iam-policy modules, which allow you to map Ingress resources to AWS ALBs. See the module documentation for more information. (v0.5.0)
- We added support for deploying the external-dns application via the eks-k8s-external-dns and eks-k8s-external-dns-iam-policy modules, which allow you to map Ingress resource host paths to Route 53 domain records so that hostname routes are automatically configured to hit the Ingress endpoints. See the module documentation for more information. (v0.5.1, v0.5.2, v0.5.3)
- We added support for linking ELBs to the worker ASG. (v0.5.4)
What to do about it: Upgrade to the latest version of terraform-aws-eks (v0.5.4) to start taking advantage of all the new features!
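For example, here's a minimal sketch of what enabling control plane log shipping might look like with the eks-cluster-control-plane module. The enabled_cluster_log_types variable comes from the release notes above; the other inputs are illustrative assumptions, so check the module docs for the exact arguments it requires:
# Sketch only: enabling control plane log shipping on the EKS control plane
# module. Input names other than enabled_cluster_log_types are assumptions.
module "eks_cluster" {
  source = "git::git@github.com:gruntwork-io/terraform-aws-eks.git//modules/eks-cluster-control-plane?ref=v0.5.4"

  cluster_name = "example-eks-cluster"   # hypothetical name

  # Ship control plane logs (API server, audit, authenticator) to CloudWatch
  enabled_cluster_log_types = ["api", "audit", "authenticator"]

  # ... VPC, subnet, and IAM settings omitted ...
}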
vpc-dns-forwarder: New module to create Route 53 Resolver Endpoints
Motivation: In Ben Whaley's VPC reference architecture, it is common to set up a management VPC that acts as a gateway to other application VPCs. In this setup, operators typically VPN into the management VPC and access the other VPCs in your infrastructure over a VPC peering connection. One challenge with this setup is that domain names in Route 53 private hosted zones are not available to the peering VPC.
Solution: To allow DNS lookups of private hosted zones over a peering connection, we can use Route 53 Resolvers to forward DNS queries for specific endpoints to the application VPCs. We created two new modules in module-vpc to support this use case: vpc-dns-forwarder and vpc-dns-forwarder-rules.
What to do about it: The vpc-dns-forwarder and vpc-dns-forwarder-rules modules are available in module-vpc, v0.5.7. Take a look at the updated vpc-peering example for fully working sample code.
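To give a rough idea of the shape of this, here's a hedged sketch of creating a DNS forwarder between a mgmt VPC and an app VPC. All of the input names shown are illustrative assumptions; the vpc-peering example linked above has the actual working code:
# Sketch only: forward DNS queries from the mgmt VPC to the app VPC so that
# Route 53 private hosted zone records resolve over the peering connection.
# Input names are assumptions; see the vpc-peering example for real usage.
module "dns_forwarder" {
  source = "git::git@github.com:gruntwork-io/module-vpc.git//modules/vpc-dns-forwarder?ref=v0.5.7"

  # Hypothetical inputs: the VPC that originates DNS queries (mgmt) and the
  # VPC that hosts the private hosted zone (app).
  origin_vpc_id      = "${module.mgmt_vpc.vpc_id}"
  destination_vpc_id = "${module.app_vpc.vpc_id}"
}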
Fixes for the server-group health-checker
Motivation: The Gruntwork server-group module includes a script called rolling_deployment.py, which can be used to hook up a load balancer to perform health checks on the server group. This script relied on an API call that recently started throwing an exception we were not handling, which meant the unhandled exception in the health-checker script could cause a deployment of the server group to fail erroneously.
Solution: We updated the rolling_deployment.py script to properly handle the exception. See this PR for more details.
What to do about it: Update to module-asg v0.6.26 to pick up the fix.
Fixes for the ECS zero-downtime rollout script
Motivation: The Gruntwork ecs-cluster module includes a script called roll-out-ecs-cluster-update.py, which can be used to roll out updates (e.g., a new AMI or instance type) to the Auto Scaling Group that underlies the ECS cluster. This script should work without downtime, but recently, one of our customers ran it and found that, when the script finished, it had left the cluster with some instances updated to the new AMI, others still running the old AMI, and the old ones stuck in the DRAINING state. Clearly, something was wrong!
Solution: It looks like AWS made backwards-incompatible changes to the default termination policy for Auto Scaling Groups, and as roll-out-ecs-cluster-update.py depended on the behavior of this termination policy as part of its roll-out procedure, the change ended up breaking the script. To fix this issue, we've updated the ecs-cluster module to expose the termination policy via a new termination_policies input variable, and we've set the default to OldestInstance (instead of Default) to fix the roll-out issues.
What to do about it: Update to module-ecs, v0.13.0 to pick up the fix. Update, 05.09.19: it’s possible this does not fix the issue fully. See #134 for ongoing investigation.
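As a rough illustration, the new input can be set explicitly on the ecs-cluster module; treat the exact argument names and value format below as assumptions and check the module docs before using them:
# Sketch only: pinning the ASG termination policy on the ecs-cluster module.
# The termination_policies variable is from the v0.13.0 release notes; the
# other arguments and the exact value format are assumptions.
module "ecs_cluster" {
  source = "git::git@github.com:gruntwork-io/module-ecs.git//modules/ecs-cluster?ref=v0.13.0"

  cluster_name = "example-ecs-cluster"   # hypothetical name

  # New in v0.13.0; defaults to OldestInstance, shown explicitly here
  termination_policies = ["OldestInstance"]

  # ... AMI, instance type, and VPC settings omitted ...
}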
Reference Architecture Mgmt VPC CIDR Block fix
Motivation: One of our customers was connected to VPN servers in two different accounts (stage and prod) and noticed connectivity wasn't working quite right. It turns out the cause was that the Gruntwork Reference Architecture was using conflicting CIDR blocks for the "mgmt VPCs" (where the VPN servers run) in those accounts.
Solution: We've updated the Reference Architecture to use different CIDR blocks for the mgmt VPCs in each account. The app VPCs were already using different CIDR blocks.
What to do about it: If you wish to connect to multiple VPN servers at once, or you need to peer the various mgmt VPCs together for some reason, you’ll want to ensure each one has a different CIDR block. The code change is easy: see this commit for an example. However, VPC CIDR blocks are considered immutable in AWS, so to roll this change out, you’ll need to undeploy everything in that mgmt VPC, undeploy the VPC, deploy the VPC with the new CIDR block, and then deploy everything back into the VPC.
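For illustration, the fix amounts to giving each account's mgmt VPC its own CIDR block, along these lines. The module path, input names, and CIDR values are assumptions for this sketch; the commit linked above shows the real change:
# Sketch only: a distinct mgmt VPC CIDR block per account. Module path, input
# names, and CIDR values are placeholders; see the linked commit for the
# actual change in the Reference Architecture.
module "mgmt_vpc" {
  source = "git::git@github.com:gruntwork-io/module-vpc.git//modules/vpc-mgmt?ref=v0.5.7"

  vpc_name = "mgmt"

  # Stage account might use 172.31.80.0/20, prod 172.31.96.0/20, and so on,
  # so the two never overlap.
  cidr_block = "172.31.80.0/20"

  # ... other params omitted ...
}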
Open source updates
- health-checker, v0.0.5: Added a single-flight mode that prevents long-running health checks from piling up.
- terratest, v0.14.6: This release introduces the SetStrValues argument for helm.Options, which corresponds to the --set-string argument. This can be used to force certain values to be cast to a string as opposed to another data type.
- terratest, v0.15.0: The GetAccountId and GetAccountIdE methods now use STS GetCallerIdentity instead of IAM GetUser under the hood, so they should now work whether you're using an IAM User, IAM Role, or another AWS authentication method while running Terratest.
- terratest, v0.15.1: This release extends AWS ECS support with GetEcsService and GetEcsTaskDefinition, which can be used to retrieve ECS Service and ECS Task Definition objects, respectively. Check out the new Terraform example and corresponding test to see it in action.
- terratest, v0.15.2: This release adds support for Terraform 0.12 by stripping surrounding quotes from values passed to the -var command line option.
- terratest, v0.15.3: This release introduces support for AWS SSM, providing functions to access parameters: GetParameter, GetParameterE, PutParameter, and PutParameterE.
- terratest, v0.15.4: This release adds support for checking S3 bucket versioning configuration: PutS3BucketVersioning, PutS3BucketVersioningE, GetS3BucketVersioning, GetS3BucketVersioningE, AssertS3BucketVersioningExists, and AssertS3BucketVersioningExistsE.
- terratest, v0.15.5: This release introduces support for S3 bucket policy assertions and access functions: PutS3BucketPolicy, PutS3BucketPolicyE, GetS3BucketPolicy, GetS3BucketPolicyE, AssertS3BucketPolicyExists, and AssertS3BucketPolicyExistsE.
- fetch, v0.3.5: GitHub Enterprise users can now download assets.
- Terragrunt, v0.18.4: You can now set skip = true in your Terragrunt configuration to tell Terragrunt to skip processing a terraform.tfvars file. This can be used to temporarily protect modules from changes or to skip over terraform.tfvars files that don't define infrastructure by themselves. See the sketch after this list for an example.
- Terragrunt, v0.18.5: Added a new terragrunt-info command you can run to get a JSON dump of Terragrunt settings, including the config path, download dir, working dir, IAM role, etc.
- terraform-aws-consul, v0.6.1: Fix a bug where we were not registering Consul properly in systemd, so it would not automatically start after a reboot.
- terraform-aws-vault, v0.12.1: You can now tell the run-vault script to run Vault in agent mode rather than server mode by passing the --agent argument, along with a set of new --agent-xxx configs (e.g., --agent-vault-address, --agent-vault-port, etc.). The Vault agent is a client daemon that provides auto-auth and caching features.
- terraform-aws-vault, v0.12.2: Fix a bug where we were not registering Vault properly in systemd, so it would not automatically start after a reboot.
Other updates
- kubergrunt, v0.3.7: This release introduces the k8s wait-for-ingress subcommand, which can be used to wait until an Ingress resource has an endpoint associated with it.
- kubergrunt, v0.3.8: This release updates the tls gen command to use the new way of authenticating to Kubernetes (specifically, passing in server and token info directly) and to use JSON to configure the TLS subject. This release also introduces a new command, helm wait-for-tiller, which can be used to wait for a Tiller deployment to roll out Pods and have at least one Pod that can be pinged. This enables chaining calls to helm after Tiller is deployed when using a different deployment process that doesn't rely on the helm client (e.g., creating the deployment resources manually).
- kubergrunt, v0.3.9: This release updates kubergrunt helm configure with a new option, --as-tf-data, which enables you to call it in an external data source. Passing this flag will cause the command to output the configured helm home directory as JSON on stdout at the end of the command.
- terraform-kubernetes-helm, v0.3.0: This release introduces a new module, k8s-tiller, which can be used to manage Tiller deployments using Terraform. The difference from the kubergrunt approach is that this supports using Terraform to apply updates to the Tiller Deployment resource: e.g., you can now upgrade Tiller using Terraform, or update the number of Tiller Pod replicas to deploy. Note that this still assumes the use of kubergrunt to manage the TLS certificates.
- terraform-kubernetes-helm, v0.3.1: The k8s-namespace and k8s-namespace-roles modules now support conditionally creating the namespace and roles via the create_resources input variable.
- package-terraform-utilities, v0.0.8: This release introduces a new module, list-remove, which can be used to remove items from a Terraform list. See the module docs for more info.
- module-ci, v0.13.13: You can now set the redirect_http_to_https variable to true on the jenkins-server module to automatically redirect all HTTP requests to HTTPS. See the sketch after this list for an example.
- module-load-balancer, v0.13.3: This release fixes an issue with multiple duplicate ACM certs (e.g., you're rotating to a new cert and still have systems using the old cert), where previously the module errored out if multiple ACM certs matched the domain. Instead, we will now pick the newer one.
- module-ecs, v0.13.1: This release adds and exposes a task execution IAM role so that ECS tasks can pull private images from ECR and read secrets from AWS Secrets Manager.
- module-ecs, v0.13.2: This release fixes a bug where the fargate_without_lb resource incorrectly set a health_check_grace_period_seconds. From the Terraform documentation: "Health check grace period is only valid for services configured to use load balancers."
- module-ecs, v0.13.3: You can now set a custom name prefix for the IAM roles created by the ecs-service module using the new task_execution_name_prefix input variable. The default is var.service_name, as before.
- module-security, v0.16.2: This release fixes #89, where fail2ban was not working correctly on non-Ubuntu instances. Specifically, fail2ban had a bug that prevented it from correctly banning brute force SSH attempts on CentOS and Amazon Linux 1. Check out the release notes for more details.
- module-aws-monitoring, v0.12.3: You can now (a) set tags on all the alarms modules via a new tags input variable and (b) configure the snapshot period and snapshot evaluation period for the elasticsearch-alarms module using the new snapshot_period and snapshot_evaluation_period input variables, respectively.
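For example, here's a hedged sketch of enabling the new HTTP-to-HTTPS redirect on the jenkins-server module; only the new variable is shown, and the remaining required inputs (AMI, instance type, DNS, etc.) are omitted:
# Sketch only: turning on the new HTTP-to-HTTPS redirect in module-ci v0.13.13.
# All other required jenkins-server inputs are omitted for brevity.
module "jenkins" {
  source = "git::git@github.com:gruntwork-io/module-ci.git//modules/jenkins-server?ref=v0.13.13"

  redirect_http_to_https = true

  # ... AMI, instance type, VPC, and DNS settings omitted ...
}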
DevOps News
Terraform 0.12 rc1
What happened: HashiCorp has released Terraform 0.12, release candidate 1 (rc1).
Why it matters: The final release of Terraform 0.12 draws closer and closer! Terraform 0.12 brings with it a number of powerful new features, but will also require a significant upgrade. We’ve already started updating our modules with support for 0.12, including updating Terratest to work with 0.11 and 0.12.
What to do about it: For now, continue to sit tight, and await 0.12 final, as well as our word that all of our modules have been updated. We’ll send upgrade instructions when everything is ready!
AWS S3 will no longer support path-style URLs
What happened: AWS has announced, rather quietly, that path-style S3 URLs will no longer be supported after September 30th, 2020. Update: AWS just released a new blog post that says path-style URLs will only be deprecated for new S3 buckets created after September 30th, 2020.
Why it matters: In the past, for an S3 bucket called my-bucket, you could build S3 URLs in one of two formats:
- Path-style URLs: s3.amazonaws.com/my-bucket/image.jpg
- Virtual-host style URLs: my-bucket.s3.amazonaws.com/image.jpg
The former supported both HTTP and HTTPS, whereas the latter used to support only HTTP. Now, both support HTTPS, but path-style URLs will no longer be supported after September 30th, 2020. Update: AWS just released a new blog post that says path-style URLs will continue to work for S3 buckets created before September 30th, 2020, but will not be available for S3 buckets created after that date.
What to do about it: If you’re using path-style S3 URLs, update your apps to use virtual-host style URLs instead. Note that if your bucket name contains dots, virtual-host style URLs will NOT work, so you’ll have to migrate to a new S3 bucket!
Security Updates
Below is a list of critical security updates that may impact your services. We notify Gruntwork customers of these vulnerabilities as soon as we know of them via the Gruntwork Security Alerts mailing list. It is up to you to scan this list and decide which of these apply and what to do about them, but most of these are severe vulnerabilities, and we recommend patching them ASAP.
Docker Hub Data Breach
- On Thursday, April 25th, 2019, it was discovered that Docker Hub had been breached, exposing user data for approximately 190,000 users. The exposed data includes Docker Hub usernames, hashed passwords, and GitHub and Bitbucket OAuth access tokens. If you are an affected user, you should have received an email from Docker Hub. Even if you weren't directly affected, you may still need to take action to protect your organization, as a compromise of any employee at your company may give an attacker access to ALL of your code! The most sensitive pieces of information leaked are the GitHub and Bitbucket access tokens, which typically grant read/write access to all repos of the org. These tokens are granted to Docker Hub for users who use the autobuild feature. The tokens of affected users have already been revoked by Docker, but if you have not received an email from Docker Hub notifying you of the revocation, we recommend revoking the tokens in GitHub or Bitbucket. Additionally, we recommend auditing your security logs to see if any unexpected actions have taken place. You can view security actions on your GitHub or Bitbucket accounts to verify whether any unexpected access has occurred (see this article for GitHub and this article for Bitbucket). Now would also be a good time to review your organization's OAuth App access and consider enabling access restrictions on your org. We notified the Security Alerts mailing list about this vulnerability on May 1st, 2019.