This folder contains a Terraform module for running a cluster of Apache ZooKeeper
nodes. Under the hood, the cluster is powered by the server-group
module, so it supports attaching ENIs and
EBS Volumes, zero-downtime rolling deployment, and auto-recovery of failed nodes. This module assumes that you are
deploying an AMI that has both ZooKeeper and Exhibitor installed.
Quick start
See the root README for instructions on using Terraform modules.
See the zookeeper-cluster example for sample usage.
See vars.tf for all the variables you can set on this module.
ZooKeeper AMI
You specify the AMI to run in the cluster using the ami_id input variable. We recommend creating a
Packer template to define the AMI with the following modules installed:
- install-oracle-jdk: Install the Oracle JDK.
- install-supervisord: Install Supervisord as a process manager.
- install-zookeeper: Install ZooKeeper.
- install-exhibitor: Install Exhibitor.
- run-exhibitor: Start Exhibitor, which, in turn, will start ZooKeeper.
See the zookeeper-ami example for working sample code.
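A minimal usage sketch of the module itself. Only the ami_id and user_data inputs are confirmed by this README; the module source path, AMI ID, and file name are placeholders:

```hcl
# Hypothetical usage sketch; adjust the source path and required variables
# to match your setup (see the module's vars.tf for the full list).
module "zookeeper" {
  source = "../../modules/zookeeper-cluster"

  ami_id    = "ami-0123456789abcdef0"  # your Packer-built ZooKeeper AMI
  user_data = file("user-data.sh")     # boot script that runs run-exhibitor
}
```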
User Data
When your servers are booting, you need to tell them to start Exhibitor (which, in turn, will start ZooKeeper). The
easiest way to do that is to specify a User Data
script via the user_data input
variable that runs the run-exhibitor script. See
user-data.sh for an example.
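The boot flow can be sketched as a User Data script. This is a hedged sketch, not the module's actual script: the install path below is an assumption, and the real user-data.sh in the zookeeper-cluster example is the authoritative version.

```shell
#!/usr/bin/env bash
# Hypothetical User Data sketch: start Exhibitor (and thereby ZooKeeper) at boot.
# The install path is an assumption, not the module's actual layout.
set -e

RUN_EXHIBITOR="/usr/local/bin/run-exhibitor"

if [ -x "$RUN_EXHIBITOR" ]; then
  # run-exhibitor starts Exhibitor, which in turn supervises ZooKeeper.
  # The real script takes module-specific arguments; see its README.
  "$RUN_EXHIBITOR"
else
  echo "run-exhibitor not found at $RUN_EXHIBITOR (sketch only)"
fi
```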
Cluster size
Although you can run ZooKeeper on just a single server, in production, we strongly recommend running multiple
ZooKeeper servers in a cluster (called an ensemble) so that:
1. ZooKeeper replicates your data to all servers in the ensemble, so if one server dies, you don't lose any data, and
the other servers can continue serving requests.
2. Since the data is replicated across all the servers, any of the ZooKeeper nodes can respond to a read request, so
you can scale to more read traffic by increasing the size of the cluster.
Note that ZooKeeper achieves consensus by using a majority vote, which has three implications:
1. Your cluster must have an odd number of servers to make it possible to achieve a majority.
2. A ZooKeeper cluster can continue to operate as long as a majority of the servers are operational. That means a
cluster with n nodes can tolerate (n - 1) / 2 failed servers. So a 1-node cluster cannot tolerate any
failed servers, a 3-node cluster can tolerate 1 failed server, a 5-node cluster can tolerate 2 failed servers, and
a 7-node cluster can tolerate 3 failed servers.
3. Larger clusters actually make writes slower, since you have to wait on more servers to respond to the vote. Most
use cases are much more read-heavy than write-heavy, so this is typically a good trade-off. In practice, because
writes get more expensive as the cluster grows, it's unusual to see a ZooKeeper cluster with more than 7 servers.
Putting all of this together, we recommend that in production, you always use a 3, 5, or 7 node cluster depending on
your availability and scalability requirements.
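The majority arithmetic above is easy to verify directly; shell integer division gives the floor of (n - 1) / 2:

```shell
# For an ensemble of n nodes, ZooKeeper needs a majority to operate, so it
# tolerates floor((n - 1) / 2) failed servers.
tolerated_failures() {
  local n="$1"
  echo $(( (n - 1) / 2 ))
}

for n in 1 3 5 7; do
  echo "a $n-node cluster tolerates $(tolerated_failures "$n") failed server(s)"
done
# → a 1-node cluster tolerates 0 failed server(s)
# → a 3-node cluster tolerates 1 failed server(s)
# → a 5-node cluster tolerates 2 failed server(s)
# → a 7-node cluster tolerates 3 failed server(s)
```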
Health checks
We strongly recommend associating an Elastic Load Balancer
(ELB) with your ZooKeeper cluster and configuring it
to perform TCP health checks on the ZooKeeper client port (2181 by default). The zookeeper-cluster module allows you
to associate an ELB with ZooKeeper, using the ELB's health checks to perform zero-downtime
deployments (i.e., ensuring the previous node is passing health checks before deploying the next
one) and to detect when a server is down and needs to be automatically replaced.
Note that we do NOT recommend connecting to the ZooKeeper cluster via the ELB. That's because you access the ELB via
its domain name, and most ZooKeeper clients (including Kafka) cache DNS entries forever. So if the underlying IPs
stored in DNS for the ELB change (which could happen at any time!), the ZooKeeper clients won't find out about it until
after a restart. You should always connect directly to the ZooKeeper nodes themselves via their static IP
addresses.
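A minimal sketch of the kind of check the ELB performs: open a TCP connection to the ZooKeeper client port. The "ruok" four-letter word is a real ZooKeeper command (a healthy server answers "imok"), though newer ZooKeeper versions require it to be whitelisted; the host below is an example.

```shell
# Sketch of a TCP health check against the ZooKeeper client port.
# Returns success only if the server answers "imok" to "ruok".
zk_is_healthy() {
  local host="$1" port="${2:-2181}"
  [ "$(echo ruok | nc -w 2 "$host" "$port" 2>/dev/null)" = "imok" ]
}

# Example (10.0.1.10 is a placeholder IP):
if zk_is_healthy 10.0.1.10 2181; then
  echo "ZooKeeper is up"
else
  echo "ZooKeeper is not reachable"
fi
```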
Rolling deployments
To deploy updates to a ZooKeeper cluster, such as rolling out a new version of the AMI, you need to do the following:
1. Shut down ZooKeeper on one server.
2. Deploy the new code on the same server.
3. Wait for the new code to come up successfully and start passing health checks.
4. Repeat the process with the remaining servers.
This module can do this process for you automatically by using the server-group module's support for zero-downtime
rolling deployment.
Static IPs and ENIs
To connect to ZooKeeper, either from other ZooKeeper servers, or from ZooKeeper clients such as Kafka, you need to
provide the list of IP addresses for your ZooKeeper servers. Most ZooKeeper clients read this list of IPs during boot
and never update it after. That means you need a static list of IP addresses for your ZooKeeper nodes.
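For instance, Kafka's zookeeper.connect setting takes exactly such a static list, in the form "ip1:2181,ip2:2181,ip3:2181". A tiny helper to build it (illustrative only, not part of this module):

```shell
# Build a static ZooKeeper connection string from a port and a list of IPs,
# e.g. for Kafka's zookeeper.connect setting.
zk_connect_string() {
  local port="$1"; shift
  local result="" ip
  for ip in "$@"; do
    # Prepend a comma before every entry except the first.
    result+="${result:+,}${ip}:${port}"
  done
  echo "$result"
}

zk_connect_string 2181 10.0.1.10 10.0.1.11 10.0.1.12
# → 10.0.1.10:2181,10.0.1.11:2181,10.0.1.12:2181
```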
This is a problem in a dynamic cloud environment, where any of the ZooKeeper nodes could be replaced (either due to an
outage or deployment) with a different server, with a different IP address, at any time. Using DNS doesn't help, as
most ZooKeeper clients (including Kafka!) cache DNS results forever, so if the underlying IPs stored in the DNS record
change, those clients won't find out about it until they are restarted.
Our solution is to use Elastic Network Interfaces
(ENIs). An ENI is a network interface with a static IP address that you can
attach to any server. This module creates an ENI for each ZooKeeper server and gives each (server, ENI) a matching
eni-0 tag. You can use the attach-eni
script in the User
Data of each server to find an
ENI with a matching eni-0 tag and attach it to the server during boot. That way, if a server goes down and is
replaced, its replacement reattaches the same ENI and gets the same IP address.
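A hedged sketch of what an attach-eni style boot step does. The tag filter, device index, and overall flow here are assumptions for illustration; the module's real attach-eni script is the authoritative implementation. All AWS CLI subcommands used are real.

```shell
# Sketch: find the available ENI with a matching tag and attach it to this
# instance during boot. Tag key and device index are assumptions.
attach_matching_eni() {
  local tag_key="$1"

  # An EC2 instance can discover its own ID from instance metadata.
  local instance_id
  instance_id=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)

  # Find an unattached ENI carrying the matching tag...
  local eni_id
  eni_id=$(aws ec2 describe-network-interfaces \
    --filters "Name=tag-key,Values=${tag_key}" "Name=status,Values=available" \
    --query 'NetworkInterfaces[0].NetworkInterfaceId' --output text)

  # ...and attach it as a secondary network interface.
  aws ec2 attach-network-interface \
    --network-interface-id "$eni_id" \
    --instance-id "$instance_id" \
    --device-index 1
}

# On a real ZooKeeper node this would run during boot, e.g.:
#   attach_matching_eni "eni-0"
```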
Transaction logs and EBS Volumes
Every write to a ZooKeeper server is immediately persisted to disk for durability in ZooKeeper's transaction log.
We recommend using a separate EBS Volume to store these transaction logs. This ensures
the hard drive used for transaction logs does not have to contend with any other disk operations, which can
significantly improve ZooKeeper
performance.
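ZooKeeper's standard dataDir and dataLogDir settings are what let you split the two workloads; the mount points below are illustrative (e.g., the dedicated EBS Volume mounted at /zookeeper-txn-logs):

```properties
# zoo.cfg fragment: keep snapshots and transaction logs on separate devices.
# Snapshots stay on the root volume...
dataDir=/var/lib/zookeeper
# ...while the transaction log goes to the dedicated EBS Volume.
dataLogDir=/zookeeper-txn-logs
```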
This module creates an EBS Volume for each ZooKeeper server and gives each (server, EBS Volume) a matching
ebs-volume-0 tag. You can use the persistent-ebs-volume
module in the User
Data of each server to find an
EBS Volume with a matching ebs-volume-0 tag and attach it to the server during boot. That way, if a server goes down
and is replaced, its replacement reattaches the same EBS Volume.
Exhibitor
This module assumes that you are running an AMI with Exhibitor installed.
Exhibitor performs several functions, including acting as a process supervisor for ZooKeeper and cleaning up old
transaction logs. Exhibitor also exposes a UI you can use to see what's stored in your ZooKeeper cluster and to
manage it. By default, this UI is available at port 8080 of every ZooKeeper server. We also expose Exhibitor at
port 80 via the ELB used for health checks in the zookeeper-cluster example.
Data backup
ZooKeeper's primary mechanism for backing up data is replication within the cluster, since every node has a copy
of all the data. It is rare to back up data beyond that, as the type of data typically stored in ZooKeeper is
ephemeral in nature (e.g., the leader of a cluster), and it's unusual for older data to be of any use.
That said, if you need an extra backup, you can create one from the Exhibitor UI, which offers Backup/Restore
functionality that allows you to index the ZooKeeper transaction log and back up and restore specific transactions.