Gruntwork Newsletter, April 2019

Yevgeniy Brikman

Co-Founder

Published March 8, 2019

Once a month, we send out a newsletter to all Gruntwork customers that describes all the updates we’ve made in the last month, news in the DevOps industry, and important security updates. Note that many of the links below go to private repos in the Gruntwork Infrastructure as Code Library and Reference Architecture that are only accessible to customers.

Hello Grunts,

In the last month, we upgraded our modules to work with AWS Provider 2.x, launched a Gruntwork Helm Charts repo, open sourced a set of modules to run the full TICK stack on AWS, and fixed lots of bugs. In DevOps news, the ALB now supports routing via HTTP headers, methods, and query strings, AWS App Mesh is generally available, and Amazon has forked Elasticsearch.

As always, if you have any questions or need help, email us at support@gruntwork.io!

Gruntwork Updates
DevOps News
Security Updates

Gruntwork Updates

AWS Provider 2.x upgrade

Motivation: Last month, AWS Provider 2.0.0 came out for Terraform. This included some backwards incompatible changes (see the upgrade guide), so our customers could not upgrade to the 2.x line until we had upgraded the Infrastructure as Code Library.

Solution: We’ve gone through and tested all of our repos to ensure they work with AWS Provider 2.x. All of our tests are passing now, so it should be safe for you to upgrade. The main modules we needed to upgrade were:

module-ecs, v0.12.0: ecs-service and ecs-service-with-alb modules.
package-static-assets, v0.4.2: s3-cloudfront module.

What to do about it: Here’s how you can upgrade your code to use the latest AWS Provider 2.x:

Inside of your provider "aws" { ... } blocks, set version = "~> 2.0".
Update any usage of module-ecs and package-static-assets to the version numbers indicated above (or newer).
Read through the AWS Provider 2 Upgrade Guide and make the required updates to your own Terraform code.
See this commit for an example to follow.
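As a rough sketch of the first step above: the region below is a placeholder (keep whatever your provider block already sets), and only the version line is the actual change.

```hcl
# Pin the AWS provider to the 2.x line.
# "~> 2.0" accepts any 2.x release but not 3.0 or later.
provider "aws" {
  region  = "us-east-1" # placeholder; use your own region
  version = "~> 2.0"
}
```

After changing the constraint, re-run terraform init -upgrade so Terraform downloads a provider release that satisfies it.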

Terraform 0.12 upgrade

Motivation: Terraform 0.12 beta1 came out last month. It has many wonderful new features and a number of backwards incompatibilities, so customers cannot use it until we upgrade our Infrastructure as Code Library.

Solution: We’ve done some preliminary manual testing with 0.12 beta1 and it seems reasonably stable. The next step is to upgrade Terratest and Terragrunt to work with Terraform 0.12 and then begin upgrading our repos, one at a time. This will be a MUCH more significant change than AWS Provider 2; in fact, we expect almost every repo to require non-trivial changes. Moreover, once we do the upgrade, there’s no going back! We’ll be working on it and will update you with our progress.

What to do about it: In the meantime, sit tight. We suspect there’ll be at least one more beta release of Terraform 0.12 before the final, and even when the final comes out, you might want to wait for the first few bug fixes to roll in (i.e., wait until 0.12.1).

Gruntwork Helm Charts Repo Launched

Motivation: We wanted to make it 10x easier to deploy and manage your applications on a Kubernetes cluster. We wanted to provide a one-step solution to package your dockerized application for Kubernetes in a way that follows all the best practices when deploying on Kubernetes.

Solution: We built a Helm Chart for packaging your dockerized application for deployment on Kubernetes that provides support for rolling deployments, canary deployments, service discovery, and more out of the box. To make it easier to consume, we also open sourced the chart and released it with our own Helm Chart repository: helmcharts.gruntwork.io.

What to do about it: Follow the instructions in the helm-kubernetes-services repository to add the Gruntwork Helm Charts Repository (helmcharts.gruntwork.io). Then, check out the values.yaml file for all the configuration options. Once you have filled out your values.yaml file, run helm install gruntwork/k8s-service -f values.yaml to deploy it onto your Kubernetes cluster with Helm. Also check out the various examples for a more detailed walkthrough.

TICK Package

Motivation: Last month we announced an updated terraform-aws-influx package that included a set of reusable modules to deploy Telegraf, InfluxDB, and Chronograf on AWS. However, we were missing one piece: Kapacitor.

Solution: We’ve added support for Kapacitor to the terraform-aws-influx package, so you can now set up a complete “TICK stack” in AWS. These modules can be combined in a variety of ways to meet your specific needs.

There are modules to do the following:

Run Telegraf to forward server logs and metrics to InfluxDB.
Run InfluxDB Enterprise for storing and querying time-series data.
Run Chronograf to provide a web interface for the TICK stack.
Run Kapacitor to set up alerts and triggers based on your InfluxDB data.

What to do about it: All of this new code is in the terraform-aws-influx repo and fully open source under the Apache 2.0 License! We also have some examples to help get you started.

Reference Architecture default users fix

Motivation: A few customers reported that the custom CloudWatch metrics (namely, memory usage and disk space usage) were not working correctly in their Reference Architectures.

Solution: Upon digging in, we found out that we broke this functionality when we introduced ip-lockdown to lock the EC2 Instance Metadata endpoint down to solely the root user. What we forgot is that the “default” user—that is, ubuntu on Ubuntu AMIs and ec2-user on Amazon Linux AMIs—needs access to that Metadata endpoint too! Since the default user is the one that we use for sending custom CloudWatch metrics, we needed to whitelist that user.

What to do about it: Check the User Data scripts on all your EC2 Instances to see if ip-lockdown is used, and if it is, make sure the “default” user for your operating system (i.e., ubuntu or ec2-user) is whitelisted! See this commit for an example.

Open source updates

Terragrunt, v0.18.2: Added a new run_cmd() helper that you can use to run an arbitrary shell command and use whatever it writes to stdout in your Terragrunt config. See the run_cmd docs for details.
Terragrunt, v0.18.3: You can now use the --terragrunt-ignore-external-dependencies flag to tell Terragrunt to ignore any external dependencies (i.e., those outside the current folder path) when running xxx-all commands (e.g., apply-all).
bash-commons, v0.1.2: Fix syntax in the docs for the array_split function. Fix minor bug in the loop indices of the assert_exactly_one_of function.
terraform-aws-couchbase, v0.1.5: Fix the install script so that it not only installs a script to disable transparent huge pages, but also configures that script to run on boot so transparent huge pages are actually disabled when you run Couchbase!
terratest, v0.14.3: This release introduces helper functions for verifying ECS clusters: GetEcsCluster, GetDefaultEcsCluster, CreateEcsCluster, DeleteEcsCluster, and NewEcsClient. Additionally, a fix was introduced for dependency management such that new projects don’t break on compile errors with kubernetes. Finally, various typos were fixed in the documentation.
terraform-aws-consul, v0.6.0: We have switched this Consul repo to use systemd instead of supervisord as a process supervisor, as systemd is now available on most major Linux distributions by default.
terraform-aws-vault, v0.12.0: Updates run-vault and examples/vault-consul-ami, switching from supervisord to systemd and updating from Amazon Linux to Amazon Linux 2.
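
To illustrate the run_cmd() helper from the Terragrunt v0.18.2 entry above, here is a minimal sketch, assuming the terraform.tfvars-based configuration format that Terragrunt 0.18.x uses. The script name ./get_bucket_name.sh, its argument, the key, and the region are hypothetical placeholders; see the run_cmd docs for the exact behavior.

```hcl
# terraform.tfvars (Terragrunt 0.18.x configuration format)
terragrunt = {
  remote_state {
    backend = "s3"

    config {
      # Hypothetical script: run_cmd() runs it with the argument "prod" and
      # substitutes whatever it writes to stdout as the bucket name.
      bucket  = "${run_cmd("./get_bucket_name.sh", "prod")}"
      key     = "terraform.tfstate"
      region  = "us-east-1"
      encrypt = true
    }
  }
}
```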

Other updates

module-vpc, v0.5.6: You can now customize the CIDR block calculations for each “tier” of subnet in the vpc-app module using the public_subnet_bits, private_subnet_bits, and persistence_subnet_bits input variables; enable public IPs by default on public subnets in the vpc-app module by setting the map_public_ip_on_launch input variable to true; and configure the VPC peering connection using the allow_remote_vpc_dns_resolution, allow_classic_link_to_remote_vpc, and allow_vpc_to_remote_classic_link input variables in the vpc-peering module.
package-kafka, v0.5.2: Advertise Kafka health-check listener on private IP instead of 127.0.0.1.
module-ci, v0.13.11: The ec2-backup module has been updated to use node 8.10, as node 6.10 was deprecated in AWS Lambda.
module-aws-monitoring, v0.12.1: This release introduces sqs-alarms, which can be used to set up CloudWatch alarms for SQS queues. Check out the example for how to set it up. This release also verifies compatibility with AWS Provider 2.x. NOTE: there are no changes to the underlying modules (only the examples), so there are no breaking changes with this release.
module-aws-monitoring, v0.12.2: This release extends the cloudwatch log aggregation IAM policy to allow logs:DescribeLogGroups as needed by fluentd.
module-ecs, v0.12.1: This release fixes a bug where ECS service creation would sometimes fail because it could not associate the IAM role for the task. It adds a sleep after each aws_iam_role creation to give the role time to propagate before it is associated.
module-ecs, v0.12.2: This release fixes #125, where the ALB health check was not verifying that all the tasks were registered, so it was prematurely passing the deployment check. Starting with this release, the LB checker verifies that all the tasks for the newest version are actually registered in the list before checking the health status.
terraform-aws-eks, v0.2.0: This release enhances the support for provisioning and managing separate worker groups for your EKS cluster. Previously, there were limitations that prevented you from provisioning additional EKS worker ASGs attached to the same cluster. Starting with this release, we’ve fixed and verified support for it. Additionally, this release introduces a helper script that can be used to translate EC2 instance tags into Kubernetes node labels. Using these features, you can implement use cases that require dedicated worker groups for specific Kubernetes workloads.
terraform-aws-eks, v0.2.1: This release introduces the eks-cloudwatch-container-logs module, which installs a DaemonSet on your EKS cluster to ship logs to CloudWatch using fluentd. Refer to the module documentation and eks-cluster-with-supporting-services for more information on how this works.
terraform-aws-eks, v0.2.2: This release fixes a bug where kubergrunt was still required even if all the feature flags were turned off.
kubergrunt, v0.3.4: The eks token command now accepts a new parameter --as-tf-data, which will encode the token output in a format that can be used as an external data source in Terraform. This allows you to configure the kubernetes and helm providers without configuring a kubectl context. Check out the eks-cluster-basic example in terraform-aws-eks for an example of how to use it.
module-data-storage, v0.8.8: This release fixes lambda-cleanup-snapshots, filtering snapshots by manual type, since automated snapshots may not be deleted manually.
package-static-assets, v0.4.3: The s3-static-website module now has a new output called website_bucket_endpoint_path_style that provides the path-style S3 bucket endpoint, in the format s3-<region>.amazonaws.com/<bucket-name>. The advantage of this style of endpoint is that it works over both HTTP and HTTPS.
package-sam, v0.1.11: Add a new create_resources input variable that, if set to false, will result in the api-gateway-account-settings module creating no resources. This weird parameter exists solely because Terraform does not support conditional modules. Therefore, this is a hack that will allow us to conditionally decide if the API Gateway account settings should be created or not.
module-security, v0.16.1: Add support for IAM role name prefix in the cross-account-iam-roles module via new input variable iam_role_name_prefix. Add a new create_resources input variable to the kms-master-key module that, if set to false, will result in the module creating no resources. This weird parameter exists solely because Terraform does not support conditional modules. Therefore, this is a hack that will allow us to conditionally decide if the KMS master key should be created or not.
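
The create_resources inputs mentioned above rely on a common workaround for the lack of conditional modules. Here is a minimal sketch of the general pattern, in Terraform 0.11-style syntax. The resource and its description are illustrative assumptions, not the actual source of kms-master-key or api-gateway-account-settings.

```hcl
variable "create_resources" {
  description = "Set to false to have this module create no resources."
  default     = true
}

# count evaluates to 0 when create_resources is false, so Terraform
# creates nothing for this resource.
resource "aws_kms_key" "master_key" {
  count       = "${var.create_resources ? 1 : 0}"
  description = "Illustrative key; not the real module's configuration."
}
```

In a module built this way, every resource would carry the same count expression, so a caller can pass create_resources = false to effectively switch the whole module off.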

DevOps News

The ALB now supports routing via HTTP headers, methods, query parameters, and source IPs

What happened: AWS has updated its Application Load Balancer (ALB) with support for routing requests based on HTTP headers, methods, query parameters, and source IP addresses. They have also improved their support for how rules and conditions are evaluated and combined.

Why it matters: The ALB has always been a great choice as the ingress for your apps due to its built-in support for automatic scaling, high availability, TLS termination (with free, auto-renewing certs from ACM), and integration with other AWS services (e.g., ECS and ASGs), but its lack of support for advanced routing rules meant that many teams still had to run nginx or Apache either behind or instead of the ALB. With this release, that becomes less necessary. You can now define complex routing rules directly within the ALB, defining how to handle requests based on hostname, URL paths, source IP, and HTTP headers, methods, and query string parameters.

What to do about it: You can start using this new advanced routing logic in your ALBs now! All the new functionality is supported in the latest Terraform aws_lb_listener_rule resource.
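
As a hedged sketch of what such a rule might look like: the example below uses the nested condition block syntax of the aws_lb_listener_rule resource, which may require a newer AWS provider release than the one you have pinned, so check the resource docs for your version. The two variables, the rule name, the header name, and all match values are hypothetical.

```hcl
# Hypothetical inputs: the ARNs of an existing listener and target group.
variable "listener_arn" {}
variable "target_group_arn" {}

resource "aws_lb_listener_rule" "beta_api" {
  listener_arn = "${var.listener_arn}"
  priority     = 100

  action {
    type             = "forward"
    target_group_arn = "${var.target_group_arn}"
  }

  # All four conditions must match for the rule to apply.
  condition {
    http_header {
      http_header_name = "X-Release-Channel"
      values           = ["beta"]
    }
  }

  condition {
    http_request_method {
      values = ["GET"]
    }
  }

  condition {
    query_string {
      key   = "version"
      value = "2"
    }
  }

  condition {
    source_ip {
      values = ["10.0.0.0/16"]
    }
  }
}
```

With a rule like this, only GET requests from the 10.0.0.0/16 range that carry the given header and query parameter are forwarded to the target group; everything else falls through to lower-priority rules or the listener's default action.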

Amazon EKS updates: API server access control, control plane logs

What happened: AWS has announced two updates to EKS, its managed Kubernetes service:

API server access control: You can now lock down the EKS API Server to only be accessible from within the VPC.
Control plane logs: EKS can now deliver logs from the Kubernetes Control Plane to CloudWatch Logs.

Why it matters:

API server access control: Previously, the EKS API endpoint was only accessible over the public Internet. Now, traffic between Kubernetes worker nodes, the kubectl command line tool, and the EKS API server stays within your VPC. This provides an additional layer of protection to harden clusters against malicious attack and accidental exposure.
Control plane logs: Previously, EKS control plane log data was not accessible, so you had no way to audit changes and monitor activity for your EKS clusters. Now, you can see your control plane logs in CloudWatch Logs, including the audit, API server, authenticator, controller-manager, and scheduler logs.

What to do about it: Check out the API server access control and control plane logs announcement blog posts for details.

AWS App Mesh is now generally available

What happened: AWS has announced that AWS App Mesh is now out of early preview and available to all users.

Why it matters: AWS App Mesh provides a managed service mesh so you have control over how all your applications and microservices communicate with each other. Instead of relying solely on Security Groups, you get application-level control over which apps can talk to which apps, routing logic, and advanced monitoring of network traffic. Under the hood, it uses the open source Envoy Proxy.

What to do about it: Now that App Mesh is generally available, anyone can take it for a spin. Try it out and let us know what you think. Check out the aws_appmesh_xxx resources for Terraform support.

Amazon is forking Elasticsearch

What happened: Amazon has announced that, along with a few partner companies (Expedia, Netflix), they will be forking the open source Elasticsearch project and creating a new “Open Distro for Elasticsearch.”

Why it matters: On the surface, this should matter to all Elasticsearch users as it may end up splitting the community. However, there is another, deeper, and arguably more important trend happening here: this sort of move calls into question the sustainability of certain types of open source business models.

In particular, Amazon is forking this repo because Elastic.co, the commercial entity behind Elasticsearch, has changed the license on some of the code to no longer be truly open source. The reason Elastic.co did this, of course, is largely because Amazon began offering a managed Elasticsearch service that competes with Elastic.co’s commercial offering. Amazon has created similar managed services that compete with the companies behind other open source projects, including Kafka (Amazon’s Managed Kafka Service vs Confluent Cloud) and MongoDB (Amazon’s DocumentDB competing with MongoDB’s Atlas). As a result, many of these open source companies have begun changing their licenses to ban competitive hosted service offerings.

What to do about it: Do you think the open source vendors are right to change their licenses? Is Amazon right to offer services that compete with the vendors that make some of these open source projects possible? Does the open-core business model have a future? Leave your thoughts in the comments!

Security Updates

Below is a list of critical security updates that may impact your services. We notify Gruntwork customers of these vulnerabilities as soon as we know of them via the Gruntwork Security Alerts mailing list. It is up to you to scan this list and decide which of these apply and what to do about them, but most of these are severe vulnerabilities, and we recommend patching them ASAP.

Google Chrome

CVE-2019-5786: A high severity flaw was found in Google Chrome that can be exploited by a remote attacker to execute arbitrary code and take full control of the target computer. The vulnerability, tracked as CVE-2019-5786, impacts all major operating systems, is so bad that few details are being released until most users have upgraded, and is actively being exploited in attacks in the wild (so this is not some theoretical issue!). Update your Chrome browser immediately! We notified the Security Alerts mailing list about this vulnerability on March 8th, 2019.
