MongoDB

Deploy a MongoDB cluster. Supports replica sets, sharding, automated bootstrapping, backup, recovery, and OS optimizations.

Setup EC2 Instance Scripts

This is a Gruntwork Script Module that installs bash scripts used to boot a MongoDB Instance. It includes:

  • attach-ebs-volume: Search for an available EBS Volume to attach. If none is found, terminate the EC2 Instance on which it runs.

  • attach-eni: Search for an available Elastic Network Interface to attach. If none is found, terminate the EC2 Instance on which it runs.

See the comments at the top of each script for additional information.

These scripts are meant to execute in the User Data of an EC2 Instance so that they are executed at boot time. Both scripts assume that a "pool" of available EBS Volumes and Elastic Network Interfaces has already been created, out of band from this script. When an EC2 Instance boots up, it will attach an Elastic Network Interface, and then attach, mount, and format an EBS Volume.
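The pool lookup that both scripts perform can be sketched as follows. This is a minimal illustration of the decision described above, not the scripts' actual implementation: the `PoolResource` type, the `pick_available` and `boot_action` functions, and the status values are hypothetical, and a real script would query the AWS API for the pool instead of using an in-memory list.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PoolResource:
    """Hypothetical stand-in for an EBS Volume or ENI in the pre-created pool."""
    resource_id: str
    status: str  # assumed values for this sketch: "available" or "in-use"


def pick_available(pool: Sequence[PoolResource]) -> Optional[PoolResource]:
    """Return the first resource that is free to attach, or None if there is none."""
    for resource in pool:
        if resource.status == "available":
            return resource
    return None


def boot_action(pool: Sequence[PoolResource]) -> str:
    """Describe what the instance should do at boot: attach a resource or terminate."""
    chosen = pick_available(pool)
    if chosen is None:
        # Mirrors the documented behavior: with nothing to attach, the instance gives up.
        return "terminate-instance"
    return f"attach {chosen.resource_id}"
```

Calling `boot_action` with a pool that contains no `"available"` entry returns `"terminate-instance"`, which corresponds to the scripts terminating the EC2 Instance they run on.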

Why attach an EBS Volume?

...

Why attach an Elastic Network Interface (ENI)?

In a typical Auto Scaling Group, each EC2 Instance that boots has a unique private IP address. As long as there's some kind of Service Discovery, applications can reach these new EC2 Instances. For example, an ELB or ALB associated with the Auto Scaling Group will automatically route traffic to the new EC2 Instances.

But when we're dealing with clustered services, where each node is aware of the others and there is usually one node that serves as the "primary" or "master" node, a load balancer usually isn't the right solution for routing a connection to the right EC2 Instance. If the master node changes, we want to know about this instantly, whereas an ELB takes at minimum 10 seconds to do such a switch. In addition, a load balancer adds a small amount of network latency and extra cost to our setup.

The alternative to using a load balancer is to rely on the MongoDB client to choose the right endpoint automatically. In the case of MongoDB, clients specify a connection string in the following format:

# MongoDB Client connection string format
mongodb://[username:password@]host1[:port1][,host2[:port2],...[,hostN[:portN]]][/[database][?options]]

Since there are potentially many hosts, most Mongo clients that can't connect to host1 will automatically retry host2. One elegant way to identify the hosts is to assign each of them a DNS name so that we could have a connection string like:

# Example MongoDB Client connection string
mongodb://mongo1.gruntwork.io:27017,mongo2.gruntwork.io:27017,mongo3.gruntwork.io:27017
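As a rough illustration of the format above, the following Python sketch builds a connection string from a list of hosts and extracts the host list back out of one. The helper names `build_mongodb_uri` and `seed_hosts` are hypothetical and not part of any MongoDB driver; the parsing is deliberately simplified and assumes any credentials are percent-encoded.

```python
from typing import List, Optional, Sequence, Tuple


def build_mongodb_uri(
    hosts: Sequence[Tuple[str, int]],
    database: Optional[str] = None,
) -> str:
    """Join (hostname, port) pairs into a mongodb:// connection string."""
    if not hosts:
        raise ValueError("at least one host is required")
    host_list = ",".join(f"{name}:{port}" for name, port in hosts)
    uri = f"mongodb://{host_list}"
    if database:
        uri += f"/{database}"
    return uri


def seed_hosts(uri: str) -> List[str]:
    """Return the host[:port] entries of a mongodb:// connection string."""
    prefix = "mongodb://"
    if not uri.startswith(prefix):
        raise ValueError("not a mongodb:// URI")
    # Everything before the first "/" is [username:password@]host1,...,hostN.
    authority = uri[len(prefix):].split("/", 1)[0]
    # Drop the optional credentials, then split the comma-separated hosts.
    return authority.rsplit("@", 1)[-1].split(",")
```

Passing the three `mongoN.gruntwork.io` hosts with port 27017 to `build_mongodb_uri` reproduces the example connection string above.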

But this suffers from a subtle problem: DNS lookups are cached at multiple layers. For example, the OS usually caches the IP address to which a DNS record resolves, as does the MongoDB client. Even the DNS resolver that answers a DNS query will cache responses!

In most cases, this isn't a problem. But if an EC2 Instance becomes unhealthy and reboots, one of our DNS records needs to be updated. Each of our caches now hopefully honors the DNS record's Time-To-Live (TTL) property, which can be as low as 5 seconds but is likely higher. In addition, some DNS caches do not honor TTLs at all. Until the DNS record points to the new EC2 Instance, the newly booted EC2 Instance is unreachable by MongoDB clients and therefore effectively out of commission.
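To make the staleness window concrete, here is a toy model of a single caching layer that honors TTLs. It is only a sketch: the `TtlDnsCache` class is hypothetical, the lookup function and clock are injected so the behavior can be observed without real DNS, and real resolvers are considerably more involved.

```python
from typing import Callable, Dict, Tuple


class TtlDnsCache:
    """Toy caching layer: reuses an answer until its TTL has expired."""

    def __init__(
        self,
        lookup: Callable[[str], str],
        ttl_seconds: float,
        clock: Callable[[], float],
    ) -> None:
        self._lookup = lookup      # authoritative lookup (injected for this sketch)
        self._ttl = ttl_seconds
        self._clock = clock        # time source (injected for this sketch)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def resolve(self, name: str) -> str:
        now = self._clock()
        cached = self._entries.get(name)
        if cached is not None and now < cached[1]:
            # Still within the TTL: return the cached answer, even if the
            # authoritative record has changed in the meantime.
            return cached[0]
        address = self._lookup(name)
        self._entries[name] = (address, now + self._ttl)
        return address
```

If the record for a host changes 1 second after it was cached with a 5-second TTL, `resolve` keeps returning the old address for the remaining 4 seconds. Stacking several such layers (OS, client, resolver) is what makes the total delay hard to predict.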

Admittedly, DNS caching issues causing a failure to connect to your database are unlikely. For example, if an EC2 Instance serving as the "primary" node is replaced, we would expect the MongoDB cluster to quickly promote a new primary and MongoDB clients to quickly try a different host if the first one is unreachable. But issues do happen.
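The "try a different host" behavior can be approximated with the sketch below. It is a simplification under stated assumptions: `connect_to_first_reachable` is a hypothetical helper, the `connect` callable is injected by the caller, and real MongoDB drivers implement more sophisticated server discovery and monitoring than a simple in-order retry.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Connection = TypeVar("Connection")


def connect_to_first_reachable(
    hosts: Sequence[str],
    connect: Callable[[str], Connection],
) -> Tuple[str, Connection]:
    """Try each host in order and return the first one that accepts a connection."""
    errors: List[str] = []
    for host in hosts:
        try:
            return host, connect(host)
        except OSError as exc:
            # Remember why this host failed, then fall through to the next one.
            errors.append(f"{host}: {exc}")
    raise ConnectionError("no reachable host; tried " + "; ".join(errors))
```

The function only gives up after every host in the list has failed, which is why a single unreachable node rarely prevents clients from connecting.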

By using a second Elastic Network Interface (ENI), we can assign a static private IP address to each EC2 Instance. When an EC2 Instance terminates, it automatically "detaches" its ENI. When a new EC2 Instance boots up, it can assume the previously used ENI and thus be accessible at the same private IP address. Now, our DNS records never need to change, and DNS caching issues are no longer a concern.

The attach-eni script makes this entire process automatic.
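The effect of reusing an ENI can be seen in a small in-memory simulation. This is not how attach-eni is implemented; the `EniPool` class, the ENI and instance IDs, and the IP addresses are all made up for illustration.

```python
from typing import Dict, Optional


class EniPool:
    """In-memory model of a pool of ENIs, each with a fixed private IP address."""

    def __init__(self, eni_ips: Dict[str, str]) -> None:
        self._eni_ips = dict(eni_ips)
        # Maps each ENI to the instance holding it, or None when it is free.
        self._attached_to: Dict[str, Optional[str]] = {eni: None for eni in eni_ips}

    def attach_any(self, instance_id: str) -> str:
        """Attach the first free ENI to the instance and return its private IP."""
        for eni_id, owner in self._attached_to.items():
            if owner is None:
                self._attached_to[eni_id] = instance_id
                return self._eni_ips[eni_id]
        raise LookupError("no available ENI; the instance should terminate")

    def release(self, instance_id: str) -> None:
        """Free any ENI held by a terminated instance."""
        for eni_id, owner in self._attached_to.items():
            if owner == instance_id:
                self._attached_to[eni_id] = None
```

When an instance is released and a replacement calls `attach_any`, the replacement receives the same private IP address the old instance had, so a DNS record pointing at that address stays valid.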
