
Part 7. How to Set Up Networking


Yevgeniy Brikman

JUN 25, 2024

Update, June 25, 2024: This blog post series is now also available as a book called Fundamentals of DevOps and Software Delivery: A hands-on guide to deploying and managing production software, published by O’Reilly Media!

This is Part 7 of the Fundamentals of DevOps and Software Delivery series. In Part 6, you learned how to split your deployments into multiple environments and how to split your codebase into multiple (micro)services. Both of these items rely heavily on networking: namely, services need to be able to talk to other services over the network, and environments need to be isolated from each other so they can’t talk to each other over the network. In other words, networking plays two key roles: connectivity and security.

In this blog post, you’ll learn how to set up and configure networking, both for connectivity and security. The goal is to understand the high-level networking concepts you need in order to connect and secure your applications. In particular, this post will walk you through the concepts and examples shown in Table 12:

Table 12. Concepts and examples you’ll go through in this blog post
Concept | Description | Example
Public networking | Manage access to your apps over the public Internet with public IPs and domain names. | Deploy servers with public IPs in AWS and register a domain name for them in Route 53.
Private networking | Run your apps in a private network to protect them from public Internet access. | Create a Virtual Private Cloud (VPC) in AWS and deploy servers into it.
Network access | Learn how to securely access private networks using SSH, RDP, and VPN. | Connect to a server in a VPC in AWS using SSH and a bastion host.
Service communication | Connect and secure communication between apps in a (micro)services architecture. | Use Istio as a service mesh for microservices running in Kubernetes.

Let’s start with the first item, which is public networking.

Public Networking

Just about everything you’ve deployed so far in this blog post series has been accessible directly over the public Internet. For example, the EC2 instance you deployed in Part 1 had a public IP address like 3.22.99.215 that you could use to access it; similarly, the load balancer you deployed in Part 3 had a domain name like sample-app-tofu-656918683.us-east-2.elb.amazonaws.com that you could use to access it. Where did these IP addresses and domain names come from, and how do they work? These two items are the focus of the next two sections, starting with public IP addresses.

Public IP Addresses

Just about the entire Internet runs on top of the Internet Protocol (IP), which is a set of rules for how to route and address data across networks. The first major version of IP, IPv4, which has been around since the 1980s, is the dominant protocol used on the Internet today; its successor, IPv6, started rolling out around 2006, and is gradually gaining adoption.

IP addresses are a central part of IP: each address (a) identifies one host on the network and (b) specifies the location of the host on the network so you can route traffic to it. IPv4 addresses are 32-bit numbers, typically displayed as four decimal numbers (each 0 - 255) separated by dots, such as 11.22.33.44. With only 32 bits, the number of possible unique IPv4 addresses is 2^32, or roughly 4 billion, which is a problem, as we’ve had far more than 4 billion Internet-connected devices for a long time[35]. Running out of IPs is one of the reasons the world is moving to IPv6, which uses 128-bit addresses that are typically displayed as eight groups of four hexadecimal digits, such as 2001:0db8:85a3:0000:0000:8a2e:0370:7334. With 128-bit addresses, the number of possible unique addresses is 2^128, or roughly 340 undecillion (340 followed by 36 zeros), which is unlikely to ever run out. Unfortunately, IPv6 adoption world-wide is still well under 50%[36]: many older networking devices don’t support IPv6, so adoption takes a long time, as it requires updating software and hardware across millions of devices and networks around the world. Therefore, most of what you do with networking for now, as well as most of what this blog post will focus on, will be IPv4.

How do you get a public IP address? The Internet Assigned Numbers Authority (IANA) owns all public IP addresses and it assigns them in a hierarchical manner: at the top level, IANA delegates blocks of IP addresses to Internet registries that cover specific regions of the world. These registries, in turn, delegate blocks of IP addresses to network operators within their region, such as Internet Service Providers (ISPs), cloud providers (e.g., AWS, Azure, Google Cloud), enterprise companies, and so on. Finally, these network operators assign IPs to specific devices: for example, when you sign up for an Internet connection at home with an ISP, that ISP assigns you an IP address from its block of IPs; and when you deploy EC2 instances in AWS, AWS assigns you an IP address from its block of IPs.[37]

Key takeaway #1

You get public IP addresses from network operators such as cloud providers and ISPs.

IP addresses are a fundamental building block of the Internet, and they work very well for computers talking to other computers, but they aren’t particularly human-friendly: if the only way to access your servers was to memorize a bunch of random numbers that may change from time to time, the Internet and World Wide Web probably wouldn’t have made it very far. What you want instead is to use memorable, human-friendly, consistent names. This is precisely the role of the Domain Name System, which is the topic of the next section.

Domain Name System (DNS)

The Domain Name System (DNS) is a service that allows you to use a human-friendly domain name to access a web service instead of an IP address: for example, you can use www.google.com instead of 172.253.116.139 to access Google’s servers. DNS stores the mapping from names to IP addresses in a globally-distributed hierarchy of nameservers. When you enter www.google.com into your web browser, your computer doesn’t talk to the nameservers directly, but instead sends a request to a local DNS resolver: at home, your ISP typically configures itself as the DNS resolver; in the cloud, the cloud provider typically configures itself as the DNS resolver. The DNS resolver takes the domain name and processes the parts in reverse order by making a series of queries to the hierarchy of name servers in Figure 74:

Figure 74. The hierarchy of DNS servers

The DNS resolver’s first query goes to the root nameservers, which run at 13 IP addresses that are managed by IANA and hard-coded into most DNS resolvers, and these servers return the IP addresses of the top-level domain (TLD) nameservers for the TLD you requested (.com). The DNS resolver’s second query goes to these TLD nameservers, which are run by the registry operators that IANA delegates each TLD to (Verisign operates .com, for example), and they return the IP addresses to use for the authoritative nameservers for the domain you requested (google.com). Finally, the DNS resolver’s third query goes to these authoritative nameservers, which are operated by a variety of companies, such as Amazon Route 53, Azure DNS, GoDaddy, Namecheap, CloudFlare DNS, and so on, and these servers return the DNS records that contain the information that is associated with the domain name you requested (www.google.com).

There are many types of DNS records, each of which stores different kinds of information: for example, DNS A records and DNS AAAA records are "address" records that store IPv4 addresses and IPv6 addresses, respectively; DNS CNAME records are "canonical name" records that store aliases for a domain name; DNS TXT records are "text records" that can store arbitrary text; and so on. When your browser looks up www.google.com, it typically requests A or AAAA records. Three rounds of requests to get some DNS records may seem like a lot, but DNS is typically pretty fast, and there is a lot of caching along the way: e.g., your browser, your operating system, and the DNS resolver may all cache records for some period of time to reduce the number of lookups.
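
If you want to poke at these records yourself, you can query DNS from the command line. The following is a quick sketch using dig, a DNS client that’s preinstalled on macOS and available in the dnsutils/bind-utils package on most Linux distros; the addresses it returns will vary depending on your location and resolver:

$ dig +short www.google.com A      # Print the IPv4 addresses from the A records
$ dig +short www.google.com AAAA   # Print the IPv6 addresses from the AAAA records
$ dig +trace www.google.com        # Walk the hierarchy: root, TLD, then authoritative nameservers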

Key takeaway #2

DNS allows you to access web services via memorable, human-friendly, consistent names.

So that’s how DNS records are looked up, but how do they get there in the first place? Who decides who owns what domain? As with most things related to the Internet, this also goes back to IANA, which owns and manages all domain names. IANA delegates the management of these domain names to accredited registrars, who are allowed to sell domain names to end users. The registrars are often (but not always) the same companies that run authoritative name servers, such as Amazon Route 53, Azure DNS, GoDaddy, and so on. Note that, technically, you never own a domain name: you can only lease it, for which you pay an annual fee. If you stop paying that fee, the registrar can lease it to someone else.

Once you lease a domain name, you then have permissions to configure the DNS records for that domain in its authoritative nameservers, which allows users all over the world to access your servers via that domain name. DNS is a beautiful, scalable system, and getting your first domain name working can feel magical. Let’s try out an example of this magic by registering and configuring a domain name in Route 53.

Example: Register and Configure a Domain Name in Amazon Route 53

In this section, you’ll deploy a web app, and set up a domain name for it. We’ll use Amazon’s Route 53 as the domain name registrar and the web app will be a simple HTTP server running on several EC2 instances that respond with "Hello, World!" This involves three steps: register a domain name, deploy EC2 instances, and configure DNS records. Let’s start by registering a domain name.

Register a domain name

The first step is to register a domain name. Although you’ll manage most of your infrastructure as code in this blog post series, registering domain names involves a number of manual steps, so I typically do it using a web UI.

Watch out for snakes: registering domain names is not part of the AWS free tier!

While most of the examples in this book are part of the AWS free tier, registering domain names is not. As of July 2024, the pricing varies based on the domain name you register: for example, most .com addresses cost $14 per year. So please be aware that running the examples in this section will cost you a little bit of money.

Head to the Route 53 dashboard, choose "Register a domain," and click "Get started." On the next page, use the search box to find a domain name that is available for purchase. For example, as shown in Figure 75, I found that fundamentals-of-devops-example.com was available; you’ll want to search for other domains, as I’ve already registered that one. Have fun with it: you can register a variety of domain names, including standard ones like .com, .net, and .org, but also more unusual ones, such as .agency, .beer, .expert, .games, .me, and .fail, so get creative.

Once you’ve found a domain name that you like and that is available, click Select to add it to your cart, scroll to the bottom of the page, and click "Proceed to checkout." On the next page, choose how many years you want to register the domain for, decide whether you want the registration to auto-renew, and then click Next. You’ll end up on a page where you need to fill out the contact details for the domain: IANA requires every domain to have contact details, and anyone can look up the contact details for any domain using whois, as shown in Example 129:

Example 129. An example of using whois to look up the contact details for a domain
$ whois fundamentals-of-devops-example.com
Registrant Name: On behalf of fundamentals-of-devops-example.com owner
Registrant Organization: Identity Protection Service
Registrant Street: PO Box 786
Registrant City: Hayes
Registrant State/Province: Middlesex
Registrant Email: f7cbd7cd-401a-44fb-xxxx@identity-protect.org
(... truncated ...)

If you want to keep your details private, at the bottom of the contact details page, you can choose to enable privacy protection: if you do so, Amazon will list its own contact details on the domain, forwarding any messages about your domain to you, while keeping your contact details private (you can see in Example 129 that the contact details are for Amazon). Once you’re done filling in contact details, click Next, which should take you to a summary page where you can review what you’re buying, agree to the terms and conditions, and click Submit to start the registration process.

The registration process takes 5 - 30 minutes, so be patient. You can monitor the process on the registration requests page. During this process, Route 53 will send a confirmation email to the address you put on the contact details page: once you get this email, click the link within it to confirm you own the email address. When the registration process is complete, you should see your domain on the registered domains page: click on the domain, and you should see a page that looks like Figure 76:

Figure 76. Details for your registered domain name

In the Details section, you should see a number of name servers: when you register a domain in Route 53, it automatically configures its own servers as the authoritative nameservers for that domain. Route 53 also automatically creates a hosted zone for the domain, which is the container for the DNS records for that domain. Head to the hosted zones page, click on your domain in the list, and look for the "Hosted zone details" section at the top of the page, as shown in Figure 77:

Figure 77. Hosted zone details, including the hosted zone ID

Jot down the hosted zone ID, as you’ll need it a little later on.
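
If you prefer the command line, you can also look up the hosted zone ID with the AWS CLI, assuming you’ve installed it and authenticated to the same AWS account (the output will show your own domain and ID, of course):

$ aws route53 list-hosted-zones --query 'HostedZones[].[Name,Id]' --output text

The Id column has the format /hostedzone/<ID>; the part after the final slash is the hosted zone ID you’ll need below.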

Deploy EC2 instances

Example Code

As a reminder, you can find all the code examples in the blog post series’s sample code repo in GitHub.

The next step is to deploy some EC2 instances to run the "Hello, World" web app. Head into the folder where you’ve been working on the code samples for this blog post series and create a new folder for this blog post, and within that, a new OpenTofu root module called ec2-dns:

$ cd fundamentals-of-devops
$ mkdir -p ch7/tofu/live/ec2-dns
$ cd ch7/tofu/live/ec2-dns

Within the ec2-dns root module, you can create several EC2 instances using a module called ec2-instances, which is in the blog post series’s sample code repo in the ch7/tofu/modules/ec2-instances folder. This module is similar to the OpenTofu code you wrote in Part 2 to deploy an EC2 instance, except the ec2-instances module can deploy multiple EC2 instances. To use this module, create a file called main.tf in the ec2-dns folder, with the initial contents shown in Example 130:

Example 130. Create multiple EC2 instances using the ec2-instances module (ch7/tofu/live/ec2-dns/main.tf)
provider "aws" {

  region = "us-east-2"

}



module "instances" {

  source = "github.com/brikis98/devops-book//ch7/tofu/modules/ec2-instances"



  name          = "ec2-dns-example"

  num_instances = 3                                   (1)

  instance_type = "t2.micro"

  ami_id        = "ami-0900fe555666598a2"             (2)

  http_port     = 80                                  (3)

  user_data     = file("${path.module}/user-data.sh") (4)

}

The preceding code uses the ec2-instances module from the blog post series’s sample app repo to do the following:

1. Deploy three EC2 instances.
2. Run an Amazon Linux AMI on each instance.
3. Allow the instances to receive HTTP requests on port 80.
4. Have each instance run the user data script described next.

For a user data script, copy the one from all the way back in Part 2:

$ cp ../../../../ch2/bash/user-data.sh .

As a reminder, this is a simple Bash script that installs Node.js and runs a dirt-simple "Hello, World" Node.js server that listens on port 80.

Watch out for snakes: a step backwards in terms of orchestration and security

This example has all the problems described back in the earlier sidebar "Watch out for snakes: these examples have several problems," which you now know how to fix using more robust tools such as Packer, ASGs, and ALBs (as you learned in Part 3). So why the step backwards in this blog post? My goal in this blog post is to show you an idiomatic example of DNS, where you create a DNS A record that points to several servers listening on the standard HTTP port (80). The more robust tools are what you should choose for production, but as they handle DNS in non-standard ways (e.g., to set up a custom domain name for an ALB, you use an alias record, which is an AWS-specific DNS extension), I felt they were not as good a choice for teaching.

Finally, create an outputs.tf file to output the public IP addresses of the EC2 instances, as shown in Example 131:

Example 131. Output variables for the ec2-dns root module (ch7/tofu/live/ec2-dns/outputs.tf)
output "instance_ips" {

  description = "The IPs of the EC2 instances"

  value       = module.instances.public_ips

}

Deploy the ec2-dns module as usual, authenticating to AWS as described in Authenticating to AWS on the command line, and running init and apply:

$ tofu init

$ tofu apply

When apply completes, you should see the IP addresses of the instances in the instance_ips output variable:

instance_ips = [
  "3.145.172.12",
  "18.118.205.155",
  "18.224.38.87",
]

Give the instances a minute or two to boot up, copy one of the IP addresses, and check that the web app is working:

$ curl http://3.145.172.12

Hello, World!

Configure DNS records

Now that you have a web app running on several servers, you can point your domain name at these servers by adding the code shown in Example 132 to the bottom of main.tf in the ec2-dns module:

Example 132. Configure a DNS A record in the ec2-dns module (ch7/tofu/live/ec2-dns/main.tf)
resource "aws_route53_record" "www" {

  # TODO: fill in your own hosted zone ID!

  zone_id = "Z0701806REYTQ0GZ0JCF"                   (1)

  # TODO: fill in your own domain name!

  name    = "www.fundamentals-of-devops-example.com" (2)

  type    = "A"                                      (3)

  records = module.instances.public_ips              (4)

  ttl     = 300                                      (5)

}

This code adds a DNS A record to your Route 53 hosted zone as follows:

1. Create the DNS record in the hosted zone specified via zone_id. Make sure to fill in your own hosted zone ID here!
2. The DNS record is for www.<YOUR-DOMAIN>. Make sure to fill in your own domain here!
3. This is an A record, which points to IPv4 addresses.
4. Point the A record at the IPv4 addresses of the EC2 instances you deployed.
5. Set the time to live (TTL) for the record to 300 seconds (5 minutes). DNS resolvers should cache this record for the amount of time specified in the TTL (though be warned that not all DNS resolvers respect the TTL setting). Longer TTLs have the advantage of reducing latency for your users and load on your DNS server, but the drawback is that any updates you make will take longer to take effect.

Add the domain name as a new output variable to outputs.tf, as shown in Example 133:

Example 133. Add an output variable with the domain name (ch7/tofu/live/ec2-dns/outputs.tf)
output "domain_name" {

  description = "The domain name for the EC2 instances"

  value       = aws_route53_record.www.name

}

Run apply one more time. When it completes, check if your new domain name is working:

$ curl http://www.<YOUR-DOMAIN>

Hello, World!
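
You can also inspect the DNS record itself, rather than the web app, by querying your domain directly (again using dig, if you have it installed):

$ dig +short www.<YOUR-DOMAIN> A   # Should print the three IPs from the instance_ips output
$ dig www.<YOUR-DOMAIN> A          # The full output also shows the record's TTL in the ANSWER SECTION

Note that the TTL you see may be lower than the 300 seconds you configured if your DNS resolver has had the record cached for a while.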

Congrats, you just configured a domain name for your web app! You now have a single, human-friendly endpoint you can give your users, which under the hood, automatically resolves to the IP addresses of your servers. When you’re done testing and experimenting, commit your code, and run tofu destroy to clean everything up.

Get your hands dirty

Here are some exercises you can try at home to get a better feel for managing domain names:

  • Instead of several individual EC2 instances, use one of the orchestration approaches from Part 3, such as an ASG with an ALB, and figure out how to configure DNS records for that approach.

  • Figure out how to automatically redirect requests for your root domain name (sometimes called the apex domain or bare domain) to your www. sub-domain: e.g., redirect fundamentals-of-devops-example.com to www.fundamentals-of-devops-example.com. This is a good security practice because of how browsers handle cookies for root domains.

  • DNSSEC (DNS Security Extensions) is a protocol you can use to protect your domain from forged or manipulated DNS data. You may have noticed that in the Details section for your domain in Figure 76, it said that the "DNSSEC status" was "not configured." Fix this issue by following the Route 53 DNSSEC documentation.

You’ve now seen how to manage public IP addresses and public domain names, but it’s important to understand that not everything should be publicly accessible over the Internet. One reason is that there aren’t enough IP addresses in the world for everything to be public: as you saw earlier, we’ve already exhausted the IPv4 address space, while IPv6 adoption world-wide is still too low. A bigger reason is security: many devices are not locked down enough to be exposed publicly. As a result, a huge portion of networking is private, which is the focus of the next section.

Private Networking

Private networking is a key part of a defense in depth strategy, where you establish multiple layers of security, providing redundancy in case there is a vulnerability in one of the layers. Consider the castle shown in Figure 78:

Figure 78. Beaumaris Castle (photo by Llywelyn2000)

Builders of castles didn’t rely on just a single wall to keep them safe: they used multiple layers of defense, including moats, multiple layers of walls, gates, towers, keeps, soldiers, weapons, and traps. If one of these layers failed, you could fall back to the others, and still stay safe. You should design your software architecture in a similar manner: set up multiple layers of defense, so that if one of them fails, the others are there to keep you safe.

For example, the servers (EC2 instances) you’ve deployed so far throughout this blog post series have all been accessible over the public Internet. The only thing that kept them safe is the firewalls (security groups) which block access to all ports by default. This is a pretty thin layer of protection: all it takes is one mistake, and your servers may become vulnerable. In the real world, sooner or later, you will make (at least) one mistake: e.g., someone will accidentally misconfigure the firewall, and leave a port open that shouldn’t be. Malicious actors are scanning for open ports and other vulnerabilities all the time—if you ever have a chance to see the access logs for public servers, especially at well-known companies, it can be a scary and eye-opening experience—and many security incidents are not the result of brilliant algorithmic code cracking, but of opportunists jumping on easy vulnerabilities due to someone making a mistake. If one person making a mistake is all it takes to cause a security incident, then the fault isn’t with that person, but with the way you’ve set up your security posture.

Key takeaway #3

Use a defense-in-depth strategy to ensure you’re never one mistake away from a disaster.

A more secure approach would be to deploy just about all of your servers into private networks, which are networks set up by organizations solely for that organization’s use: e.g., an office network, university network, data center network, or a home network. Typically, private networks are locked down so they can only be accessed by authorized individuals from within that organization. This offers the following advantages:

Defense in depth

Servers in private networks have at least two layers of protection: first, a malicious actor would have to be able to get into your private network, and second, they would then have to find a vulnerability in a server, such as a misconfigured firewall. In fact, a good private networking setup can create many more layers than that, making it harder and harder for attackers, until they give up, and go seek out easier targets.

Isolate workloads

You saw in Part 6 that there are different ways to set up environments: different servers, different accounts, different data centers, and, as is the focus of this blog post, different networks. Private networks give you a way to isolate different types of workloads: one common pattern is to deploy different products and different teams in separate private networks; another common pattern is to deploy data store servers and application servers in separate private networks. You can then choose to either allow no network traffic between the different types of workloads, or only allow traffic between specific IPs and ports: either way, this reduces the chances of one workload accidentally—or maliciously—causing problems for another workload.

Better control and monitoring

Private networks give you fine-grained control over routing: you can better control north-south traffic, which is the traffic between your servers and the outside world, and east-west traffic, which is the traffic between servers within your network. This allows you to add security controls, set up monitoring (e.g., you can capture flow logs which show you all the traffic going through your private network), and manage traffic patterns (e.g., shift traffic around as part of a deployment or experiment).

Because of all of these advantages, you should almost always default to a private network. In fact, the typical deployment pattern at most companies is to have almost all servers in a private network, and only a handful of highly-locked down servers accessible publicly, such as load balancers.

Key takeaway #4

Deploy all your servers into private networks by default, exposing only a handful of locked-down servers directly to the public Internet.

In the next several sections, you’ll learn the basics of private networking by looking at physical networks in on-prem data centers and then virtual networks in the cloud.

Physical Private Networks

In this section, I’m going to walk you through an overview of how physical networks work, to help build your intuition for why private networks are useful and what’s happening under the hood.

Lossy compression

Networking is a huge topic, so I’ve had to compress a lot of information in this section, which is good in that it lets you grasp the concepts quickly, but be aware that in the effort to compress information, many nuances get lost, and what you’re seeing here is a highly simplified picture.

Let’s start by thinking through how you’d connect computers together. Connecting two computers is easy: all it takes is a single cable, as shown in Figure 79.

Figure 79. Connecting two computers

Connecting N computers is more complicated: if you had to connect every computer to every other computer, you’d need roughly N² cables, which would be very messy and expensive. The typical solution is to connect all the computers to a single switch, a device that can forward data between computers, which only requires N cables, as shown in Figure 80:

Figure 80. Connecting multiple computers using a switch

These connected computers form a network. Connecting two networks together is easy: you typically do it using routers, as shown in Figure 81.

Figure 81. Connecting two networks using routers

Connecting N networks together is hard: you’d have that N² problem again. The typical solution is to connect those routers together using the Internet, as shown in Figure 82:

Figure 82. Connecting many networks together via the Internet

The term "Internet" is derived from interconnected networks: it’s a network of networks. Many of those networks are private networks, which is the focus of this section.

Let’s look at two common private networks: a private network in your house and a private network in a data center. For your home network, you probably got a router from your ISP, which is actually both a switch and a router, and it creates a private network which allows the devices you have at home (e.g., your computer, laptop, phone, tablet, printer, TV) to talk to each other. For a data center network, the data center technicians set up various switches and routers, and this creates a private network which allows the servers in that data center to talk to each other.

Each of these private networks has several key characteristics:

  • Only authorized devices may connect to the private network

  • The private network uses private IP address ranges

  • The private network defines connectivity rules

  • Most devices in a private network access the public Internet through a gateway

The following sections go into detail on each of these, starting with only allowing authorized devices to connect to the private network.

Only authorized devices may connect to the private network

Whereas just about anyone can connect to the public Internet, a key characteristic of private networks is that only devices you explicitly allow may connect. For example, the only way to connect to the private network within a data center is to physically get into the data center and plug a cable into the routers and switches; the only way to connect to the private network within your house is to either physically connect to the ISP router with an ethernet cable, or if your router supports Wi-Fi, you have to be within range of the antenna, and you typically need a password.

The private network uses private IP address ranges

IPv4 (specifically, RFC 1918) reserves the following IP addresses for private networks:

10.0.0.0 - 10.255.255.255

172.16.0.0 - 172.31.255.255

192.168.0.0 - 192.168.255.255

Note that you can express ranges of IP addresses more succinctly using Classless Inter-Domain Routing (CIDR) notation. A CIDR block has the format a.b.c.d/e, where a.b.c.d is an IP address and e is a decimal number that specifies how many bits of that address, when expressed in binary, stay the same; the range of IPs is defined by all the remaining bits, which are free to change. For example, 0.0.0.0/0 represents all possible IP addresses, as zero of the bits stay the same; 1.2.3.4/32 represents just the single IP address 1.2.3.4, as all 32 bits stay the same; and 10.0.0.0/24 represents the IPs 10.0.0.0 - 10.0.0.255, as the first 24 bits stay the same, leaving the last 8 bits free to change. Using CIDR notation, the three private IP address ranges from RFC 1918 can be represented as:

10.0.0.0/8

172.16.0.0/12

192.168.0.0/16

While every public IP address must be unique, these private IPs can be used over and over again, as they can only be used for private networks (and never as public IPs). Just about every private network uses these IP address ranges: for example, if you look at your computer’s Wi-Fi or ethernet settings while on your home network, you’ll typically find that you have an IP address similar to 192.168.xxx.yyy. Most data center networks use 10.0.0.0/8 or 172.16.0.0/12.
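
If you want a quick gut check on how the /e suffix maps to the size of a CIDR block, you can do the arithmetic in any Bash shell: a /e block leaves 32 - e bits free to change, so it contains 2^(32 - e) addresses. For example, here are the sizes of the three RFC 1918 ranges:

$ echo $(( 2 ** (32 - 8) ))    # 10.0.0.0/8
16777216
$ echo $(( 2 ** (32 - 12) ))   # 172.16.0.0/12
1048576
$ echo $(( 2 ** (32 - 16) ))   # 192.168.0.0/16
65536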

The private network defines connectivity rules

In your home network, you can typically define a few basic connectivity rules: for example, depending on your router, you may be able to block outbound access to specific websites, block inbound requests from specific IP addresses, and block specific port numbers from being used. In a data center network, you have full control over connectivity: for every device in the network, you can specify what IP address it gets assigned, what ports it’s allowed to use, which other devices it can talk to, and how traffic gets routed to and from that device. You control some of this behavior through hardware: namely, whether certain devices are connected via cables or not. You control the rest through software, based on the configuration in your switches and routers.

Since data centers can have hundreds or thousands of servers, it’s common to partition the private network into subnets (subnetworks) and to assign specific rules to all the devices in a subnet. For example, a common approach is to run a small handful of servers, such as load balancers, in one subnet called a DMZ (demilitarized zone), which allows access to those servers directly from the public Internet, and to run the rest of your servers in another private subnet, which is not accessible from the public Internet, and is more locked down.

Most devices in a private network access the public Internet through a gateway

If all the devices in a private network have private IP addresses (e.g., 192.168.xxx.yyy), how do those devices access the public Internet? One option is to assign a public IP address to those devices: for example, you might assign a public IP address to a server in your DMZ; that means that server will have two IP addresses, one that is public, which it uses to communicate with the public Internet, and one that is private, which it uses to communicate with other devices in the private network. However, trying to assign a public IP to every device in a private network largely defeats the purpose of having a private network: namely, the desire to keep those devices secure and to avoid running out of IPv4 addresses.

Therefore, the vast majority of devices in a private network access the public Internet through a gateway. Here are a few of the most common types of gateways:

Load balancers

One type of gateway you’ve already seen is a load balancer, which allows requests that originate on the public Internet to be routed to app servers in your private network based on rules you define in that load balancer: e.g., if a user makes a request to the load balancer on port 80 for domain foo.com, forward the request to a specific set of app servers on port 8080.

NAT gateway

A Network Address Translation (NAT) gateway allows requests that originate in a private network to be routed out to the public Internet. A common approach with NAT gateways is to do port address translation (PAT): e.g., if an app server wants to make an API call to some-service.com, the app server sends that request to the NAT Gateway, which will then send the request to some-service.com, modifying ("translating") the request along the way to make it look like it originated from the public IP of the NAT gateway at a specific port number. When the response comes back from some-service.com to that port, the NAT gateway knows which app server to forward the response to, and it will translate the request to make it look like it came directly from some-service.com.

Outbound proxy

An outbound proxy is like a specialized NAT gateway that only allows apps to make outbound requests to an explicitly-defined list of trusted endpoints. Networking is all about layers of defense, and while most of those layers are about keeping attackers out, an outbound proxy is all about keeping them in: that is, if someone manages to get through all the other layers and break into your systems, then your last line of defense is to make it as hard as possible for them to escape with anything valuable, such as user data. Many attackers will try to send their stolen data to their own servers, and the goal of the outbound proxy is to make this data exfiltration as difficult as possible.

ISP router

On your home network, the router you got from your ISP typically configures itself as a NAT gateway. All the devices on your home network send all requests intended for the public Internet via the router, which uses PAT to get you a response, while keeping those devices hidden.

Gateways offer two major benefits. First, a single gateway can share one or a small number of public IP addresses amongst thousands of devices within its private network; this is one of the ways we’ve been able to get far more than 4 billion devices onto the public Internet, despite IPv4 limitations. Second, the gateway hides the devices in the private network, providing a layer of protection for them, and only allowing through traffic that you’ve explicitly allowed.
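
You can see this sharing and translation in action from just about any device on a home network: ask the device for its own IP address, and then ask a service on the public Internet what address your requests appear to come from. The commands below are a rough sketch (hostname -I is Linux-specific, and checkip.amazonaws.com is just one of several services that echo back your public IP); the first command will print a private RFC 1918 address, whereas the second will print the public IP of your ISP router acting as a NAT gateway:

$ hostname -I                          # The device's own (private) IP, e.g., 192.168.1.23
$ curl https://checkip.amazonaws.com   # The public IP your traffic appears to come from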

Now that you’ve seen the basics of private networking in the physical world, let’s see what private networking looks like in the cloud, where everything is virtualized.

Virtual Private Networks

If you deploy into the cloud, the cloud provider has already taken care of all the physical networking for you: all the servers, switches, routers, and cables are already hooked up, largely in a way you can’t see or control. What you can control is a virtual network, which is a network you configure entirely in software (which is why it’s sometimes referred to as software-defined networking). In the following several sections, you’ll learn about virtual networks in the cloud, virtual networks in orchestration tools, and then go through an example of creating a virtual network in AWS.

Virtual networks in the cloud

Each cloud provider offers slightly different networking features, but they typically have the following basic characteristics in common:

You can create a VPC

Most cloud providers allow you to create a virtual private network: the names vary a bit, but the most common one, and the one I’ll use throughout this blog post series, is virtual private cloud (VPC), which is the name used by AWS (as you saw in earlier blog posts) and Google Cloud; Azure calls them virtual networks (VNets).

The VPC consists of subnets

Each VPC contains one or more subnets. Each subnet has an IP address range from RFC 1918: e.g., 10.0.0.0/24.

The subnets assign IP addresses

The resources you deploy into a subnet get an IP address from that subnet’s IP address range: e.g., if you deploy three servers into a subnet with the IP address range 10.0.0.0/24, the servers might end up with the IPs 10.0.0.20, 10.0.0.21, and 10.0.0.22.

You enable connectivity with route tables

Each subnet has a route table that controls how traffic is routed within that subnet. Each row in a route table typically defines a destination and where to route traffic sent to that destination. Each time the VPC needs to route a packet, it will go through the route table, and use the most specific route that matches that packet’s destination. For example, consider the route table shown in Table 13:

Table 13. Example route table
Destination | Target
10.0.0.0/16 | VPC Foo
10.1.0.0/16 | VPC Bar
0.0.0.0/0 | NAT gateway

This route table configures all traffic to 10.0.0.0/16 to go to a VPC called Foo, all traffic to 10.1.0.0/16 to go to a VPC called Bar, and all other traffic (0.0.0.0/0) to go to the NAT gateway, to be routed to the public Internet. If you get a packet with the destination 10.0.0.8, the most specific route that matches will be VPC Foo. If you get a packet for destination 3.4.5.6, none of the VPC routes will match, so the catch-all 0.0.0.0/0 route will be the only thing that matches, and this packet will be sent to the NAT Gateway.
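
If you want to see the route tables in your own AWS account, you can dump them with the AWS CLI. This is just a sketch (replace the placeholder with one of your VPC IDs), but the DestinationCidrBlock and GatewayId/NatGatewayId fields map directly onto the Destination and Target columns of Table 13:

$ aws ec2 describe-route-tables \
    --filters "Name=vpc-id,Values=<YOUR-VPC-ID>" \
    --query 'RouteTables[].Routes[].[DestinationCidrBlock,GatewayId,NatGatewayId]' \
    --output table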

You block connectivity with firewalls

Each cloud provider provides different types of firewalls to block traffic. Some firewalls apply to individual resources, such as servers, and these firewalls typically block all traffic by default: for example, as you saw in earlier blog posts, every EC2 instance in AWS has a security group, and if you want that EC2 instance to be able to receive network traffic from specific ports and IPs, you have to explicitly open those up in the security group. Other firewalls apply to entire subnets or VPCs, and these firewalls typically allow all traffic by default, but allow you to specify what traffic to block: for example, AWS has a network firewall that you can use to filter inbound and outbound traffic across an entire VPC.

You access the public Internet through gateways

Just as with a physical data center, you can run various types of gateways to allow servers in the VPC to access the public Internet. For example, just about all the cloud providers offer load balancers and NAT gateways.

Note that, to make it easier to get started, most cloud providers allow you to deploy resources without creating a VPC: e.g., you saw in earlier Parts that AWS gives you a Default VPC out-of-the-box. That means you typically have to take extra steps to create a custom VPC and configure your resources to deploy into it: you’ll see an example of this later in this blog post.

Virtual networks in orchestration tools

Some orchestration tools (which you first saw in Part 3) include their own virtual network: for example, see Kubernetes Networking, OpenShift Networking, and Marathon Networking. This is because many orchestration tools, especially open source ones, are designed to work in any data center or cloud, and to be able to solve the core orchestration problems from Part 3 that involve networking (e.g., load balancing, service communication) in a way that’s portable, these tools create their own virtual networks. These virtual networks are typically responsible for the following tasks:

IP address management

Assigning IP addresses to apps running in the orchestration tool.

Service communication

Allowing multiple apps running in the orchestration tool to communicate with each other.

Ingress

Allowing apps running in the orchestration tool to receive requests from the outside world.

The key thing to understand is that if you’re using an orchestration tool that has its own virtual network, then you’re going to have to integrate two sets of networking technologies: one from the orchestration tool, and one from your data center or cloud provider. Since these orchestration tools can be deployed in many different environments, they typically offer plugins to handle this integration. For example, Kubernetes supports Container Network Interface (CNI) plugins to manage cluster networking and ingress controllers to manage ingress. Table 14 shows the typical CNI plugin and ingress controller you use when deploying Kubernetes with various cloud providers, and how that allows you to integrate Kubernetes' networking (IP address management, service communication, and ingress) with that cloud provider’s networking:

Table 14. Comparing the behavior of networking plugins for Kubernetes in various clouds
Cloud | Typical CNI plugin | Typical ingress controller | IP address management | Service communication | Ingress
AWS | Amazon VPC CNI plugin | AWS Load Balancer Controller | Assign IPs from the AWS VPC | Use AWS VPC routing | Deploy AWS Elastic Load Balancers
GCP | Cilium GKE plugin | GKE ingress | Assign IP addresses from Cloud VPC subnets | Use Cloud VPC routing | Deploy Cloud Load Balancers
Azure | Azure CNI plugin | Nginx ingress controller | Assign IP addresses from VNet subnets | Use VNet routing | Deploy Nginx
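
For example, if you run Kubernetes on AWS via EKS, you can see this integration at work: the Amazon VPC CNI plugin runs as a DaemonSet called aws-node in the kube-system namespace, and the IPs it assigns to pods come straight from your VPC's subnets. Here's a quick sketch, assuming you have kubectl configured to talk to an EKS cluster:

$ kubectl get daemonset aws-node -n kube-system   # The Amazon VPC CNI plugin
$ kubectl get pods -o wide                        # The IP column shows VPC (RFC 1918) addresses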

Now that you’ve seen the two most common types of virtual networks, let’s go through an example of deploying one in AWS.

Example: Create a VPC in AWS

In this section, you’re going to create a custom VPC in AWS, and deploy some EC2 instances into it. In this blog post series’s sample code repo, in the ch7/tofu/modules/vpc folder, you’ll find a reusable OpenTofu vpc module that can create the VPC shown in Figure 83:

Figure 83. The diagram of the VPC

This VPC will have the following configuration:

IP address range

The VPC will allow you to specify the IP address range (CIDR block) to use. For example, as shown in the preceding diagram, you could use 10.0.0.0/16, which is one of the private IP address ranges from RFC 1918, and /16 is the largest CIDR block AWS allows, which gives you 65,536 IP addresses, enough for most use cases. The VPC will automatically split this IP address range amongst two subnets, a public subnet and a private subnet, as described next.

Public subnet

The VPC will include a public subnet, which is a subnet that is directly accessible from the public Internet (a DMZ). You typically use public subnets to run servers meant to be accessed by your users directly, such as load balancers. In AWS, to make a subnet public, you have to do three things, all of which the vpc module handles for you (the CLI sketch after these subnet descriptions shows roughly what the same steps look like as raw AWS calls): first, you create an Internet Gateway, which is an AWS-specific component that allows communication between the public Internet and your VPC. Second, you create a route in the subnet’s route table to send traffic to the Internet Gateway; typically, you do this via a catch-all route (0.0.0.0/0) that assumes any traffic that doesn’t match a more specific destination must be targeted for the public Internet, so you forward it to the Internet Gateway. Third, you configure the subnet to assign public IP addresses to any EC2 instances you deploy into it. The public subnet will also assign private IP addresses to EC2 instances from a part of the VPC’s IP address range (e.g., 10.0.0.0/21).

Private subnet

The VPC will also include a private subnet, which is a subnet that is not directly accessible from the public Internet. You typically use private subnets to run the rest of your servers, and especially data stores, in a more protected environment. In AWS, subnets are private by default, which means servers in those subnets will be able to talk to other resources within the VPC, but nothing outside the VPC will be able to talk to those servers, and, unless you add a NAT gateway (which this vpc module does not do), those servers also won’t be able to talk to anything outside the VPC (such as the public Internet). This makes it harder for malicious actors to get into your servers in private subnets, and if they somehow do get in, it also makes it harder for them to get any data out. It also ensures you can’t accidentally (or maliciously) install software from the public Internet; if you’re using server templating and immutable infrastructure practices (as introduced in Part 2), this is a good thing, as it makes your servers more secure and easier to debug.
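
To make the public subnet steps more concrete, here's roughly what they look like as raw AWS CLI calls; this is only a sketch with placeholder IDs, as the vpc module does the equivalent for you in OpenTofu:

# Step 1: create an Internet Gateway and attach it to the VPC
$ aws ec2 create-internet-gateway
$ aws ec2 attach-internet-gateway --internet-gateway-id <IGW-ID> --vpc-id <VPC-ID>

# Step 2: add a catch-all route that sends Internet-bound traffic to the gateway
$ aws ec2 create-route --route-table-id <ROUTE-TABLE-ID> \
    --destination-cidr-block 0.0.0.0/0 --gateway-id <IGW-ID>

# Step 3: have the subnet assign public IPs to instances launched into it
$ aws ec2 modify-subnet-attribute --subnet-id <SUBNET-ID> --map-public-ip-on-launch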

To use the vpc module, create a new OpenTofu root module called vpc-ec2:

$ cd fundamentals-of-devops
$ mkdir -p ch7/tofu/live/vpc-ec2
$ cd ch7/tofu/live/vpc-ec2

Inside the vpc-ec2 folder, create a main.tf file with the initial contents shown in Example 134:

Example 134. Use the vpc module (ch7/tofu/live/vpc-ec2/main.tf)
provider "aws" {

  region = "us-east-2"

}



module "vpc" {

  source = "github.com/brikis98/devops-book//ch7/tofu/modules/vpc"



  name       = "example-vpc"   (1)

  cidr_block = "10.0.0.0/16"   (2)

}

The preceding code uses the vpc module from the blog post series’s sample code repo to do the following:

1. Set the name of the VPC to "example-vpc."
2. Configure the VPC to use 10.0.0.0/16 as its CIDR block.

Deploying a VPC by itself doesn’t do much, so let’s deploy some EC2 instances into it. First, update main.tf to deploy an EC2 instance into the public subnet, as shown in Example 135:

Example 135. Deploy an EC2 instance into the public subnet (ch7/tofu/live/vpc-ec2/main.tf)
module "public_instance" {

  source = "github.com/brikis98/devops-book//ch7/tofu/modules/ec2-instances"



  name          = "public-instance"                   (1)

  num_instances = 1                                   (2)

  instance_type = "t2.micro"

  ami_id        = "ami-0900fe555666598a2"

  http_port     = 80

  user_data     = file("${path.module}/user-data.sh") (3)

  vpc_id        = module.vpc.vpc.id                   (4)

  subnet_id     = module.vpc.public_subnet.id         (5)

}

The preceding code uses the same ec2-instances module that you saw earlier in this blog post in the DNS example to do the following:

1Name the instance "public-instance."
2Deploy just a single EC2 instance.
3Configure the instance to run the user data script shown in Example 136.
4Configure the instance to run in the VPC you created using the vpc module.
5Configure the instance to run in the public subnet of the VPC you created using the vpc module.
Example 136. The user data script (ch7/tofu/live/vpc-ec2/user-data.sh)
#!/usr/bin/env bash

set -e

curl -fsSL https://rpm.nodesource.com/setup_21.x | bash -
yum install -y nodejs

export MY_IP=$(hostname -I)                      (1)

tee app.js > /dev/null << "EOF"
const http = require('http');

const server = http.createServer((req, res) => {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end(`Hello from ${process.env.MY_IP}!\n`); (2)
});

const port = 80;
server.listen(port, () => {
  console.log(`Listening on port ${port}`);
});
EOF

nohup node app.js &

This user data script is identical to the one you saw in the DNS example earlier in this blog post, except for two changes:

1. Look up the private IP address of the server.
2. Include the private IP address of the server in the HTTP response.

Now that you have an instance in the public subnet, update main.tf to deploy an instance in the private subnet as shown in Example 137:

Example 137. Deploy an EC2 instance into the private subnet (ch7/tofu/live/vpc-ec2/main.tf)


module "private_instance" {

  source = "github.com/brikis98/devops-book//ch7/tofu/modules/ec2-instances"



  name          = "private-instance"                   (1)

  num_instances = 1

  instance_type = "t2.micro"

  ami_id        = "ami-0900fe555666598a2"

  http_port     = 80

  user_data     = file("${path.module}/user-data.sh")

  vpc_id        = module.vpc.vpc.id

  subnet_id     = module.vpc.private_subnet.id         (2)

}

This code is nearly identical to the code for the public instance. The only differences are:

1Name the instance "private-instance."
2Run the instance in the private subnet of the VPC.

Create a file called outputs.tf that has output variables for the vpc-ec2 module, as shown in Example 138:

Example 138. Output variables for the vpc-ec2 module (ch7/tofu/live/vpc-ec2/outputs.tf)
output "public_instance_public_ip" {

  description = "The public IP of the public instance"

  value       = module.public_instance.public_ips[0]

}



output "public_instance_private_ip" {

  description = "The private IP of the public instance"

  value       = module.public_instance.private_ips[0]

}



output "private_instance_public_ip" {

  description = "The public IP of the private instance"

  value       = module.private_instance.public_ips[0]

}



output "private_instance_private_ip" {

  description = "The private IP of the private instance"

  value       = module.private_instance.private_ips[0]

}

The preceding file outputs the public and private IP addresses for the two EC2 instances. Deploy the vpc-ec2 module as usual, authenticating to AWS as described in Authenticating to AWS on the command line, and running init and apply:

$ tofu init

$ tofu apply

When apply completes, you should see some outputs:

private_instance_private_ip = "10.0.80.65"
private_instance_public_ip = ""
public_instance_private_ip = "10.0.5.100"
public_instance_public_ip = "3.144.105.254"

The outputs include the private IP addresses for both instances, which should fall into the 10.0.0.0/16 CIDR block of the VPC, as well as the public IP of the public instance, but not the public IP of the private instance (it’ll be an empty string). This is not a bug; since you deployed the private instance into a private subnet, that instance shouldn’t have a public IP address!

To see if the instances are working, make an HTTP request to the public IP of the public instance (the IP in the public_instance_public_ip output):

$ curl http://3.144.105.254

Hello from 10.0.5.100

You should see a response with that instance’s private IP address. If that works, congrats, you now have an instance successfully running in a custom VPC!
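
If you're curious how the two subnets differ under the hood, one way to check is with the AWS CLI (a sketch; replace the placeholder with the VPC ID the vpc module created): the public subnet will show MapPublicIpOnLaunch as True, the private subnet will not, and if you inspect the route tables, only the public subnet will have a 0.0.0.0/0 route to an Internet Gateway.

$ aws ec2 describe-subnets \
    --filters "Name=vpc-id,Values=<YOUR-VPC-ID>" \
    --query 'Subnets[].[SubnetId,CidrBlock,MapPublicIpOnLaunch]' \
    --output table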

Get your hands dirty

Here are some exercises you can try at home to get a better feel for working with VPCs:

  • Update the VPC module to deploy a NAT gateway so that resources running in the private subnet can access the public Internet. Note: AWS offers a managed NAT gateway, which works very well and is easy to use, but is not part of the AWS free tier.

  • Update the VPC module to deploy each type of subnet (public and private) across multiple availability zones so that your architecture is resilient to the failure of a single AZ.

You’ve been able to confirm that the public instance is working, but how do you test the private instance? It has no public IP, and if you try to make a request to the private IP from your own computer, that won’t work:

$ curl http://10.0.80.65
curl: (7) Failed to connect to 10.0.80.65 port 80 after 19 ms:
Couldn't connect to server

To be able to test the instance in the private subnet, you’re going to have to learn about how to access private networks, which is the focus of the next section.

Accessing Private Networks

Deploying a server in a private network ensures that you can’t access that server directly from the public Internet: this is mostly a good thing, as it makes it harder for malicious actors to get access to your servers; however, if you can’t access those servers either, then that’s a problem. As you saw in the previous section, a server deployed in a private subnet has no public IP address. It might be running code and working, but if you can’t access it, testing, debugging, and development become harder.

Fortunately, there are a number of tools out there that are designed to give you secure, controlled access to your private networks. The next few sections will look at the following approaches for accessing private networks:

  • Castle-and-moat model

  • Zero-trust model

  • SSH

  • RDP

  • VPN

Let’s start with a look at the castle-and-moat model, which is the most common way to access a private network.

Castle-and-Moat Model

The traditional approach used at many companies for managing access to private networks is the castle-and-moat model, based on the analogy to a castle with an extremely secure perimeter (walls, moat, drawbridge, etc.), which makes it hard to get into the castle, but a soft interior, which allows you free rein to move around once you’re inside. The equivalent with a private network is one that doesn’t allow you to access anything from outside the network, but once you’re "in" the network, you can access everything.

In a physical network, with the castle-and-moat model, merely being connected to the network means you’re "in." For example, with many corporate office networks, if you are plugged into the network via a physical cable, you can access everything in that network: all the wiki pages, the issue tracker, the IT help desk, and so on. However, if you’re outside the physical network, how do you connect to it? For example, if you’re working from home, how do you get access to your corporate office network? Or if you have infrastructure deployed in a VPC in the cloud, how do you get access to the private subnets of that VPC?

A common solution is to deploy a bastion host. In a fortress, a bastion is a structure that is designed to stick out of the wall, allowing for more reinforcement and extra armaments, so that it can better withstand attacks. In a network, a bastion host is a server that is designed to be visible outside the network (i.e., it’s in the DMZ), and this server has extra security hardening and monitoring, so it can better withstand attacks. The idea is that you keep the vast majority of your servers private, with the network acting as a secure perimeter (like a wall and moat), and you use the bastion host as the sole entrypoint to that network. Since there’s just one bastion, you can put a lot of effort into making it as secure as possible. You then allow users to connect to the bastion host using protocols such as SSH, RDP, or VPN, each of which we’ll dive into later in this blog post. Since the bastion host is "in" the network, once you’ve successfully connected to the bastion host, you’re now also "in," and you can freely access everything else in the network, as shown in Figure 84:

Figure 84. A castle-and-moat networking model with a bastion host as the sole access point

For example, if you are able to connect to the bastion host in Figure 84, you can then access everything in the private subnets of that VPC, including the private servers and database with IPs 10.0.0.20, 10.0.0.21, and 10.0.0.22. This approach worked well enough in the past, but in the modern world, the castle-and-moat approach leads to security concerns, as discussed in the next section.

Zero-Trust Model

The castle-and-moat approach originated in a world where:

  • You had a physical network of routers, switches, and cables in a building owned by the company.

  • To access the network, you had to physically be in a building owned by the company.

  • To connect to the network, you had to be using a computer owned and configured by the company.

In short, your location on the network mattered: some locations could be trusted, while others could not. This is increasingly not the world we live in. In fact, these days:

  • Many of the networks are virtual, such as a VPC in an AWS account.

  • More and more employees work remotely, and need to be able to connect to the network from a variety of locations, extending your network into homes, coworking spaces, coffee shops, airports, and so on.

  • We have more devices than ever that we want to connect to the network, such as laptops, tablets, and phones.

As a result, for many companies, the idea of a secure perimeter and soft interior no longer makes sense. There’s no clear "perimeter" or "interior" anymore, and there’s no location on the network that can be implicitly trusted. This has led to the rise of the zero-trust architecture (ZTA), which is based on the concept of "never trust, always verify," where you never trust a user or device just because they have access to some location on the network. The core principles of ZTA can be summarized as follows:

Authenticate every user

Every connection requires the user to authenticate, typically using single sign-on (SSO) and multi-factor authentication (MFA).

Authenticate every device

Every connection also requires the user’s device (laptop, phone, tablet) to authenticate, which means you can only connect from devices that the company has approved, added to a device inventory, and configured with adequate security controls (e.g., security scanners).

Encrypt every connection

All network communication must be over encrypted channels. You’ll learn more about encryption in Part 8.

Define policies for authentication and authorization

Each piece of software you run can define flexible policies for who is allowed to access that software (authentication) and what level of trust and permissions they will have (authorization). These policies can make use of a variety of data sources, such as what location the user is connecting from (e.g., their typical home office or an unexpected different continent?), the time of day they are connecting (e.g., during normal work hours or three in the morning?), how often they are connecting (e.g., first time today or 5,000 times in the last 30 seconds?), and so on.

Enforce least-privilege access controls

With the castle-and-moat model, once you’re in the network, you get access to everything: e.g., once you connect to a bastion host, you get access to all the wiki pages, the issue tracker, the IT help desk, and so on. With the ZTA model, you follow the principle of least privilege, which means you get access only to the resources you absolutely need to do your specific task, and nothing else: e.g., getting access to the internal wiki only gives you access to the wiki, and does not give you access to the issue tracker, the IT help desk, or anything else.

Continuously monitor and validate

The assumption with ZTA is that you’re constantly under attack, so you need to continuously log and audit all traffic to identify suspicious behavior.

The zero-trust model has been evolving for many years. Some of the major publications on it include No More Chewy Centers: Introducing The Zero Trust Model Of Information Security by John Kindervag, where he coins the term "Zero Trust Model," Zero Trust Architecture by NIST, and BeyondCorp: A New Approach to Enterprise Security by Google. Google’s BeyondCorp paper is arguably what popularized the zero-trust model: even though the paper never uses that term explicitly, the principles are largely the same.

One of the more controversial principles in the BeyondCorp paper is that Google no longer requires employees working remotely to use VPN to access internal resources: instead, those resources are accessible directly via the public Internet. At first, this feels like a paradox: how can exposing internal resources to the public be more secure? Google’s take is that exposing internal tools publicly forces you to put much more effort into securing them than if you merely relied on the network perimeter for security. Figure 85 shows a simplified version of the type of architecture Google described in BeyondCorp:

Zero-trust architecture
Figure 85. Zero-trust architecture

The idea is that you expose all your internal resources to the public Internet via an access proxy, which uses your user database, device registry, and access policies to authenticate, authorize, and encrypt every single connection. At a quick glance, the zero-trust approach in Figure 85 might not look all that different from the castle-and-moat approach in Figure 84: both rely on a single entrypoint to the network (a bastion host or an access proxy) that authorizes connections before granting access to private resources. The key difference is that in the castle-and-moat approach, only the bastion host is protected, and all the private resources are open, so if you can get past the bastion, you get access to all of them; with the zero-trust approach, every single private resource is protected, and each one requires you to go through the authorization process with the access proxy. Instead of a single, strong perimeter around all the resources in your network, the zero-trust approach is a bit like putting a separate strong perimeter around each individual resource.

That means that zero-trust isn’t a single tool you adopt, but something you integrate into every part of your architecture, including the following:

User and device management

One of the first steps with using ZTA is to get better control over users and devices. You typically want to ensure that authentication for all the software you rely on—e.g., your email, version control system, bug tracker, cloud accounts, and so on—is done through a single identity provider (SSO) that requires MFA. Some tools that can help in this space include JumpCloud, Okta, OneLogin, Duo, Microsoft Entra ID, and Ping Identity. You’ll also want to figure out what sorts of devices you want to allow employees to use and how to track, secure, and authenticate those with a device registry. This is the domain of Mobile Device Management (MDM), and some of the major players in this space include JumpCloud, Rippling, NinjaOne, Microsoft Intune, and Scalefusion.

Infrastructure access

You’ll need to think through how you’ll grant employees access to servers (e.g., SSH or RDP), databases (e.g., MySQL or PostgreSQL clients), containers (e.g., Docker container running in Kubernetes), networks (e.g., a VPC in AWS), and so on, in a manner that works with the zero-trust approach. This is tricky to manage, as these infrastructure tools vary widely in terms of protocols, authentication, authorization, encryption, and so on. Fortunately, tools such as Teleport, Tailscale, Boundary, and StrongDM can help give you a consistent way to access a variety of infrastructure.

Service communication

Finally, you’ll have to rework how your (micro)services communicate with each other. In Part 6, you deployed a frontend and backend microservice in Kubernetes, and the frontend was able to talk to the backend with no authentication, authorization, or encryption. This is how many microservice architectures are designed, relying on the network perimeter to protect those services (the castle-and-moat model). In the ZTA world, this will no longer fly, so you will need to figure out how to secure all your service communication. You’ll learn more about this later in this blog post.

Implementing a true zero-trust architecture is a tremendous amount of work, and the reality is that very few companies are able to fully pull it off. It’s a good goal for all companies to strive for, but how far down the ZTA path you go depends on the scale of your company: smaller startups will typically use the castle-and-moat approach; mid-sized companies will often adopt a handful of ZTA principles, such as using SSO and securing microservice communication; large enterprises will try to go for all the ZTA principles. As you saw in Section 1.1.2, you need to adapt your architecture to the needs and capabilities of your company.

Key takeaway #5

In the castle-and-moat model, you create a strong network perimeter to protect all the resources in your private network; in the zero-trust model, you create a strong perimeter around each individual resource.

Whether you use the castle-and-moat model, the zero-trust model, or something in between, there is a common set of tooling that you will typically use to access any private network:

  • SSH

  • RDP

  • VPN

Note that these options are not mutually exclusive, as it’s common to combine VPN with SSH and RDP. The following sections will go through each of these options in detail, starting with SSH.

SSH

Secure Shell (SSH) is a protocol that allows you to connect to a computer over the network to execute commands. It uses a client-server architecture, as shown in Figure 86: for example, the client could be the computer of a developer on your team named Alice and the server could be the bastion host. When Alice connects to the bastion host over SSH, she gets a remote terminal where she can run commands and access the private network as if she was using the bastion host directly.

Using SSH to connect to a bastion host
Figure 86. Using SSH to connect to a bastion host

Let’s take a quick look at how to use SSH, followed by its advantages and drawbacks.

How to use SSH

To use SSH, the first step is to configure the client, such as Alice’s computer, as follows:

  1. SSH uses public-key cryptography for authentication and encryption. You’ll learn more about authentication and encryption in Part 8, but for now, all you need to know is that you need to create a key pair for Alice, which consists of a public key and a private key.

  2. Store the private key in a secure manner on Alice’s computer, ensuring unauthorized users can never access it.
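For example, here is a minimal sketch of both client-side steps; the ED25519 key type, file name, and comment are placeholders, not requirements:

# Generate a key pair: alice is the private key, alice.pub is the public key
$ ssh-keygen -t ed25519 -C "alice@example.com" -f ~/.ssh/alice

# Ensure only Alice's OS user can read the private key
$ chmod 400 ~/.ssh/alice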

Next, you configure one or more servers, such as the bastion host and the servers in the private subnets of Figure 86, as follows:

  1. Run SSH as a background process, known as a daemon. You typically do this using the sshd binary. On many servers, it’s enabled by default.

  2. Update the server’s firewall to allow SSH connections, typically on port 22.

  3. To allow someone to authenticate to a server, you need to add their public key to the authorized keys file for an OS user on that server, typically in ~/.ssh/authorized_keys. For example, if you wanted to allow Alice to SSH to the server as the OS user ec2-user, with home folder /home/ec2-user, you’d need to add Alice’s public key to /home/ec2-user/.ssh/authorized_keys.
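For example, assuming you have copied Alice’s public key to the server as alice.pub, step 3 might look roughly like this on an Amazon Linux server (run as ec2-user or with sudo):

# Append Alice's public key to ec2-user's authorized keys file
$ mkdir -p /home/ec2-user/.ssh
$ cat alice.pub >> /home/ec2-user/.ssh/authorized_keys

# With the default StrictModes setting, sshd rejects keys if permissions are too open
$ chmod 700 /home/ec2-user/.ssh
$ chmod 600 /home/ec2-user/.ssh/authorized_keys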

Now that you’ve configured your clients and servers, you can use the SSH client to connect to the server, and get a terminal where you can run commands as if you were sitting directly at that server. You also get access to that server’s network: e.g., if Alice connects to the bastion host in Figure 86, she could run the curl command in the terminal to access the server in the private subnet at 10.0.0.20.

You can make it even easier to access a private network over SSH using either port forwarding or a SOCKS proxy, either of which allows you to transmit arbitrary data over the encrypted SSH connection, sometimes referred to as tunneling. With port forwarding, Alice could use SSH to forward port 8080 on her local computer, via the bastion host, to port 8080 of the server at 10.0.0.20 in the private subnet of Figure 86, and then any request she sends from any app on her own computer to localhost:8080 will be sent to 10.0.0.20:8080. Alternatively, Alice could use SSH to run a SOCKS proxy on port 8080, and configure any app that supports SOCKS proxies (SOCKS is a standard protocol for proxying), such as her web browser, to send all traffic via localhost:8080, which will route that traffic through the bastion host, as if she was browsing the web directly from the bastion host.
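For example, here are hedged sketches of both approaches, using the bastion host and the 10.0.0.20 server from Figure 86 (the user name, ports, and <BASTION_PUBLIC_IP> placeholder are assumptions):

# Port forwarding: requests to localhost:8080 on Alice's computer are tunneled
# through the bastion host to 10.0.0.20:8080 in the private subnet
$ ssh -L 8080:10.0.0.20:8080 ec2-user@<BASTION_PUBLIC_IP>

# SOCKS proxy: apps configured to use localhost:8080 as a SOCKS proxy have
# their traffic routed via the bastion host
$ ssh -D 8080 ec2-user@<BASTION_PUBLIC_IP>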

Now that you’ve seen all the different ways you can use SSH, let’s try some of them out with a real example in AWS.

Example: SSH bastion host in AWS

Earlier in this blog post, you deployed a VPC and a couple EC2 instances, one in a public subnet you could access, and one in a private subnet that you could not. Let’s update that example so that you can access both instances over SSH. We’ll use an EC2 key pair to do this, which is an SSH key pair that AWS can create for you and use with its EC2 instances.

Watch out for snakes: EC2 key pairs are not recommended in production

I’m using EC2 key pairs in this example as they give you a chance to experiment with a native SSH experience where you have a private key on your computer and use the ssh client directly. However, AWS only supports associating a single EC2 key pair with each EC2 instance, so in a team setting, that would mean sharing a single, permanent, manually-managed private key with multiple developers, which is not a good security practice (you’ll learn more about this in Part 8). For production, you should instead use something like EC2 instance connect or Session Manager, which use automatically-managed, ephemeral key pairs that are generated for individual team members on-demand, and expire after a short period of time.

To create a key pair, head to the EC2 key pair page, making sure to select the same region in the top right corner that you used to deploy the VPC, and click "Create key pair." Enter a name for the key pair, leave all other settings at their defaults, and click "Create key pair." AWS will store the public key for the key pair in its own database, but it will not store the private key: instead, it’ll prompt you to download the private key to your computer. Make sure to save it in a secure location, such as your ~/.aws/.ssh folder.

Next, add a passphrase to the private key, so only you can access it:

$ ssh-keygen -p -f <KEYPAIR>.pem

Enter new passphrase (empty for no passphrase):

Enter same passphrase again:

Finally, set the permissions for the private key so that only your OS user can access it (ssh won’t let you use the private key otherwise):

$ chmod 400 <KEYPAIR>.pem

You now have the private key securely stored on your hard drive; the only thing left is to add your public key to the authorized keys file on each of those EC2 instances. If you specify a key pair when launching an EC2 instance, AWS will do this for you automatically for the default OS user of its AMIs: for example, if you launch an instance from an Amazon Linux AMI, AWS will add the public key to the authorized keys file of the OS user ec2-user.

Update main.tf in the vpc-ec2 root module to specify the name of your key pair as shown in Example 139:

Example 139. Update the vpc-ec2 root module to specify a key pair for SSH access (ch7/tofu/live/vpc-ec2/main.tf)
module "public_instance" {

  source = "github.com/brikis98/devops-book//ch7/tofu/modules/ec2-instances"



  # ... (other params omitted) ...



  # TODO: fill in your EC2 key pair name

  key_name = "<YOUR_KEYPAIR_NAME>"

}



module "private_instance" {

  source = "github.com/brikis98/devops-book//ch7/tofu/modules/ec2-instances"



  # ... (other params omitted) ...



  # TODO: fill in your EC2 key pair name

  key_name = "<YOUR_KEYPAIR_NAME>"

}

Make sure to update the key_name parameter for both the public and private instance to whatever you named your key pair. Once you specify a key_name, the ec2-instances module automatically opens up port 22 in the security group so that you can access that instance via SSH.

To deploy these changes, run apply:

$ tofu apply

You should see in the plan output that OpenTofu wants to deploy two new instances: this is expected, as AWS can only update the authorized keys file on the very first boot, so it will need to replace the instances. When apply completes, you should have new EC2 instances, with new IP addresses:

private_instance_private_ip = "10.0.80.242"

private_instance_public_ip = ""

public_instance_private_ip = "10.0.1.26"

public_instance_public_ip = "18.226.187.40"

Grab the public IP address of the public instance from the public_instance_public_ip output variable and try to SSH to the server as follows:

$ ssh -i <KEYPAIR>.pem ec2-user@<PUBLIC_IP>

The authenticity of host '<PUBLIC_IP>' can't be established.

ED25519 key fingerprint is SHA256:v+MXP6xY/O3lGxlyywpBhEmr+qFwS0H2ASy77XPodNY.

Are you sure you want to continue connecting (yes/no/[fingerprint])?

What’s this "authenticity of host can’t be established" warning? You’ll see this message the first time you SSH to any new server, as your SSH client can’t be sure that this is really the server you wanted to talk to, and not some malicious actor who has intercepted your request. If you want to be extremely diligent, you can go to the EC2 console, click on the checkbox next to the instance you’re trying to connect to, and in the nav on top, choose Actions, "Monitor and troubleshoot," and "Get system log," and you should see log output similar to Figure 87:

The system log for an EC2 instance
Figure 87. The system log for an EC2 instance

The system log can be useful for debugging and inspecting your EC2 instances directly from the web browser. Near the bottom of the system log file, you should see the text "BEGIN SSH HOST KEY FINGERPRINTS," and below that, the fingerprint you see there should match the one in the ssh warning message. If it does, type in yes on your terminal, and hit Enter. ssh will store this fingerprint in your ~/.ssh/known_hosts file, and not prompt you about it for this IP address in the future (unless the fingerprint changes, which suggests a malicious actor is trying a man-in-the-middle attack, in which case, you’ll get an error).
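If you prefer to stay in the terminal, you can most likely fetch the same system log with the AWS CLI; this is a hedged sketch, with a placeholder instance ID and region:

# Pull the instance's console output and look for the host key fingerprints
$ aws ec2 get-console-output \
    --instance-id i-0123456789abcdef0 \
    --region us-east-2 \
    --output text | grep -A 5 "SSH HOST KEY FINGERPRINTS"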

After the fingerprint check, ssh will prompt you to enter the password for your SSH key. Type it in and hit Enter. After a second or two more, you should be connected to the server via SSH, and you’ll get a terminal prompt on the EC2 instance:

Amazon Linux 2023

https://aws.amazon.com/linux/amazon-linux-2023

[ec2-user@ip-10-0-1-26 ~]$

At this point, you can run commands on this EC2 instance. For example, you can check if the simple web app is working locally:

$ curl localhost

Hello from 10.0.1.26

Perhaps even more interestingly, since you are now "in" the network, you can finally test if the web app is working on the private instance! Grab the private instance IP address from the private_instance_private_ip output variable and send an HTTP request to it:

$ curl <PRIVATE_IP>

Hello from <PRIVATE_IP>

Congrats, you’re finally able to access an instance in a private network! In fact, you’re effectively using the public instance as a bastion host. Is it possible to SSH to the private instance, too? That would mean using the bastion host as a jump host: a server you use as a hop on your way to other servers in the private network. Let’s give it a shot.

Hit CTRL+D to disconnect from the public instance, and you’ll end up back in a terminal on your own computer. If you use SSH frequently, having to specify a private key and enter the password each time can become tedious, so it’s common to use SSH agent, which is a key manager for SSH that temporarily stores your private key in memory, unencrypted, so you can authenticate without specifying a key or password. Use ssh-add to add a key to SSH agent:

$ ssh-add <KEYPAIR>.pem

You’ll be prompted for your password one more time: type it in and hit Enter. Now, re-run the SSH command for your public instance, but this time, you should omit the -i parameter, as your private key is already loaded in SSH agent, and you should add the -A parameter to enable agent forwarding, which will make your SSH agent available to the servers you connect to over SSH, allowing you to securely authenticate from an intermediary server like the bastion host without having to copy or expose your private key:

$ ssh -A ec2-user@<PUBLIC_IP>

After a few seconds, you should end up in a terminal on the EC2 instance, but this time, with no prompt about the host key or your SSH password. Next, run SSH again, but this time, point at the IP address of the private instance:

$ ssh ec2-user@<PRIVATE_IP>

This time, you’ll see the host key warning again, as you haven’t connected to the private instance before. Type in yes and hit Enter. After a second or two, you should get a terminal on the private instance, without any further prompts, as authentication should happen through SSH agent forwarding. You can now run commands on the private instance, such as checking if the web app is working locally:

$ curl localhost

Hello from <PRIVATE_IP>

Congrats, you just used the public instance as a jump host to SSH to a private instance! If you want to disconnect, hit CTRL+D twice, the first time to disconnect from the private instance, and the second time to disconnect from the public instance.
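As an aside, recent versions of OpenSSH can do the same jump in a single command using the -J (ProxyJump) flag, which tunnels the second connection through the bastion without needing agent forwarding; here’s a sketch using the same placeholders as above:

# Connect to the private instance, hopping through the public instance
$ ssh -J ec2-user@<PUBLIC_IP> ec2-user@<PRIVATE_IP>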

Get your hands dirty

Here are some exercises you can try at home to get a better feel for SSH:

  • Instead of EC2 key pairs, try using EC2 instance connect or Session Manager. How do these options compare when connecting to the public instance? And the private instance?

  • Try using the -L flag to set up port forwarding from your local computer to the private server at <PRIVATE_IP>: e.g., run ssh -L 8080:<PRIVATE_IP>:8080 ec2-user@<PUBLIC_IP> and then open http://localhost:8080 in your browser.

  • Try using the -D flag to set up a SOCKS proxy: e.g., run ssh -D 8080 ec2-user@<PUBLIC_IP>, configure your browser to use localhost:8080 as a SOCKS proxy, and then open http://<PRIVATE_IP>:8080 in your browser.

When you’re done testing, commit your code, and run tofu destroy to clean everything up in your AWS account.

Advantages and drawbacks of SSH

SSH has a number of advantages:

Widely available

Just about all modern Linux and Unix distributions, as well as macOS, support SSH natively, and there are multiple clients for Windows, so SSH is ubiquitous in the DevOps world.

Secure

SSH is generally considered a mature and secure protocol. It has been around for roughly 30 years, it’s an open standard with open source implementations, it’s based on other mature standards (e.g., public-key cryptography), and due to its ubiquity, it has a massive community around it, so vulnerabilities are rare, and are typically fixed quickly.

No extra infrastructure

You don’t typically need to run any extra infrastructure to use SSH: you just run sshd on your servers; in fact, most servers run sshd by default, so there’s not much to do. It’s just there!

Powerful dev tools

You get access to a number of powerful features that are useful for developers: remote terminal access, tunneling, proxying, and so on. This makes it an indispensable tool for dev teams, especially for testing, debugging, and troubleshooting. It used to also be the primary way that system administrators managed servers, but that’s considerably less frequent with the use of infrastructure as code and immutable infrastructure practices, as you saw in Part 2.

However, SSH also has several disadvantages:

Managing keys can be difficult, especially at scale

Configuring one server to accept one user’s public key is no problem, but if you need to support hundreds of servers, hundreds of developers, hundreds of keys, key rotation and revocation (e.g., when a developer leaves the company), and different levels of permissions and access (including temporary access), things get a lot more complicated. Fortunately, there are a number of tools out there to help solve this problem: some are available from cloud providers, such as EC2 instance connect and Systems Manager in AWS and metadata-managed SSH connections in Google Cloud, and some are available from cloud-agnostic third parties, such as Teleport, Boundary, and StrongDM.

It’s primarily a dev tool

SSH is great for developers; it’s not so great for anyone else at your company. Asking the typical Product Manager, Designer, or Sales Executive to use SSH to access your company’s internal tooling is not likely to go over well. Moreover, even for developers, there are many times when you want an easy way to access a private network without having to jump through various hoops with CLI commands, tunnels, and proxies. Sometimes, you just want an easy-to-use UI.

The last item, support for a UI, is precisely where the next option, RDP, truly shines, as discussed in the next section.

RDP

Remote Desktop Protocol (RDP) is a way to connect to a Windows server remotely and to manage it via the full Windows user interface, as shown in Figure 88. It’s just like being at the computer: you can use the mouse, keyboard, and all the desktop apps.

Using RDP to manage a Windows server remotely looks just like using Windows directly
Figure 88. Using RDP to manage a Windows server remotely looks just like using Windows directly

Let’s take a quick look at how to use RDP, followed by its advantages and drawbacks.

How to use RDP

Like SSH, RDP is based on a client-server architecture. The first step is to configure the server:

  1. Enable RDP in Windows settings.

  2. Update the server’s firewall to allow RDP connections, typically on port 3389. Note that RDP is not generally considered secure (it has had many security vulnerabilities over the years), so exposing port 3389 directly to the public Internet is not recommended. Instead, that port should only be exposed within your network to one of the two options described in the next step.

  3. Deploy either a VPN (you’ll learn more about this in the next section) or a Remote Desktop Gateway (RD Gateway) in front of the server(s) you have running RDP. This protects RDP from direct access, and offers more secure authentication and encryption.

  4. To allow someone to authenticate to a server with RDP, they will need the username and password of a Windows user on that server. There are many ways to manage user accounts on Windows: for example, if you launch a Windows EC2 instance in AWS using the default Windows AMI, AWS has an Administrator user built-in with a randomly-generated password which you can retrieve from the EC2 console; if you launch a Windows server in Azure, you specify the user and password at launch time; if you manage Windows user accounts with an identity provider (e.g., with Active Directory or Microsoft 365), then you’d use that identity provider’s login.

Next, you configure the client:

  1. Install the RDP client. It’s available out-of-the-box with most Windows installs, but if you’re on Mac or Linux, you’ll have to install it separately.

Now that you’ve configured your clients and servers, you open up the RDP client’s UI, type in the IP address of the server to connect to (which might be an RD Gateway IP), enter the username and password when prompted, and after a minute or two, you’ll be logged into the full Windows UI, as shown in Figure 88.
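As a small aside, on Windows you can also launch the built-in RDP client from a command prompt; here’s a hedged sketch with a placeholder address:

# Open an RDP session to the server (or RD Gateway) at the given address
C:\> mstsc /v:10.0.0.30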

Advantages and drawbacks of RDP

RDP has a number of advantages:

You get a fully-working Windows UI

The ability to manage a remote server using the exact same experience as if you were sitting at the server is a much nicer experience than being limited solely to a terminal.

Works for all employees

RDP is accessible to just about all roles at a company, and not just developers.

However, RDP also has several disadvantages:

Windows-only

While there are RDP clients for all major operating systems, the RDP server part only works with Windows.

Not secure without extra infrastructure

RDP is notorious for security vulnerabilities, so you can’t expose it directly to the public Internet, and must run other infrastructure in front of it, such as a VPN or RD Gateway.

Not your own computer

RDP gives you access to another computer, and whatever private network it can access, but sometimes, you want to be able to access the private network directly from your own computer, as that’s where you have all your apps and data.

The last item, where you want to be able to use your own computer to access a private network, is one of the areas where VPN shines, as discussed in the next section.

VPN

A Virtual Private Network (VPN) is a way to extend a private network across multiple other networks or devices. One of the main goals is for the VPN to be transparent to software running on those devices: that is, that software should be able to communicate with the private network as if the device was plugged physically into the network, without the software being aware of the VPN or having to do anything differently. Another main goal is to keep all traffic going over the VPN secure through the use of encryption.

These days, there are three common use cases for VPNs:

Connect remote employees to an office or data center network

If you’re working from home, you connect to a VPN, and you get access to your corporate office network as if you were in the office. Similarly, you can use a VPN to connect to a data center, whether on-prem or a VPC in your cloud account, and you get access to everything in that private network as if your computer was in the same data center. In this use case, the VPN acts as a bastion host. Some of the major players that address this use case include Cisco, Palo Alto Networks, Juniper Networks, Barracuda, SonicWall, Fortinet, OpenVPN, WireGuard, Tailscale, AWS Client VPN, and Google Cloud VPN.

Connect two data centers together

You can use site-to-site VPN to connect two data centers together: e.g., connect two on-prem data centers or connect your on-prem data center to a VPC in the cloud. In this use case, the VPN acts as a proxy between the data centers, securely forwarding certain traffic in one private network to certain endpoints in another private network. The VPN vendors you’d use on the on-prem side are largely the same ones as for an office network (e.g., Cisco, Palo Alto, Juniper); on the cloud side, you typically use site-to-site VPN services from the cloud provider, such as AWS Virtual Private Gateways and Google Cloud VPN.

Hide Internet browsing behavior

Some Internet users proxy their Internet traffic through a VPN in another country as a way to bypass geographical restrictions or censorship, or to keep their browsing history anonymous. Most of the office network VPNs are overkill for this use case, so it’s more common to use consumer VPN services such as NordVPN, ExpressVPN, and Proton VPN. I mention this use case for completeness, but it’s outside the scope of this book, so I won’t say much more on it.

Let’s take a quick look at how to use VPN, followed by its advantages and drawbacks.

How to use VPN

For the use case of connecting remote employees to an office or data center network, you typically use a client-server architecture. The first step is to configure the VPN server:

  1. Deploy a VPN server as your bastion host and configure the VPN software on it.

  2. Update the server’s firewall to allow VPN connections. The ports you use depend on which VPN solution you use. Many VPN solutions are based on Internet Protocol Security (IPsec), which typically uses UDP ports 500 and 4500, plus IP protocols 50 (ESP) and 51 (AH). Many others are based on TLS, which typically uses port 443.

  3. Configure the VPN server with the ability to authenticate users. How this works also depends on which VPN solution you use. For example, the traditional approach, used by tools such as OpenVPN, is to use certificates, which are based on public-key cryptography (like SSH), but allow for mutual authentication, where the client can verify the VPN server is really who it says it is using the server’s certificate, and the server can verify the user is really who they say they are using the client’s certificate. This approach has many security benefits, but it can be hard to securely sign, distribute, revoke, and manage certificates. As a result, some newer tools, such as Tailscale, allow users to authenticate using an existing identity provider (e.g., Active Directory, Google, Okta), including whatever MFA options that provider supports, and under the hood, they handle all the certificate logic transparently.

Next, you configure the client:

  1. Install the VPN client. The exact client you use depends on the VPN server, but it is usually a desktop or mobile app with a user interface. Some operating systems have VPN clients built-in.

  2. Once the VPN client is installed, you follow the instructions in its UI to authenticate.

  3. Once you’re authenticated, the VPN client will establish an encrypted tunnel to the VPN server and update the network settings on your device to route all network traffic through this tunnel (this is known as a full tunnel configuration). As a result, all the software on your device—your web browser, your email client, all your apps—will transparently get access to the private network, as if your device was physically plugged into that network. Note that a full tunnel configuration has some drawbacks: for example, if employees are watching lots of videos on Netflix or YouTube, all of that network traffic now goes through the VPN, which may put a lot of load on your VPN and cost a lot of money for bandwidth. As a result, some VPN software allows you to use split tunnel mode where only certain traffic is routed via the VPN: e.g., you could configure specific domain names and CIDR block ranges that correspond to internal tooling to go via the VPN tunnel, and everything else to go via the user’s normal Internet connection.

For the use case of connecting two data centers together, the details depend on what sort of devices you’re using, but at a high level, in each data center, you do the following:

  1. Set up a site-to-site VPN device. In an on-prem data center, that might be a physical appliance from Cisco, Palo Alto Networks, Juniper, etc. In the cloud, that might be a virtual configuration, such as a Virtual Private Gateway in AWS.

  2. Configure routing. Typically, you will want to route certain CIDR blocks through the VPN connection to the other data center: for example, if your on-prem data center network used the CIDR block 172.16.0.0/12, you might configure the route table in your AWS VPC to send all traffic to 172.16.0.0/12 to your virtual private gateway (see the sketch after this list).

  3. Configure connectivity and authentication. For the VPN in each data center, you’ll need the IP addresses it uses, various identifying information, such as a Border Gateway Protocol (BGP) Autonomous System Number (ASN), and a way to authenticate and encrypt the connection, which is typically done using IPsec and either certificates or pre-shared secret keys.

  4. Create the VPN tunnel. At this point, you establish an encrypted tunnel, and all traffic that is routed to your VPN is exchanged over this tunnel.
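To make step 2 concrete, here’s a hedged sketch of what the routing configuration might look like on the AWS side, using the AWS CLI with placeholder IDs (you could express the same thing in OpenTofu):

# Route traffic destined for the on-prem network (172.16.0.0/12) through the
# VPC's virtual private gateway
$ aws ec2 create-route \
    --route-table-id rtb-0123456789abcdef0 \
    --destination-cidr-block 172.16.0.0/12 \
    --gateway-id vgw-0123456789abcdef0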

Now that you’ve seen a high level overview of how to use VPN, let’s look at what advantages and drawbacks you get for your efforts.

Advantages and drawbacks of VPN

VPN has a number of advantages:

You get transparent network access from your own computer

Being able to access a private network, from your own computer, as if you were directly a part of that network, is great for usability.

Works for all employees

VPN is accessible to just about all roles at a company, and not just developers.

Works with all operating systems

There are VPN clients for every operating system and many mobile devices, too.

Secure

Many VPN tools are built around either IPsec or TLS, both of which are generally considered mature and secure: they have been around for roughly 30 years, are ubiquitous, and have massive communities around them, so vulnerabilities are rare, and are typically fixed quickly.

However, VPN also has several disadvantages:

Extra infrastructure to run

You have to deploy a VPN server, or possibly multiple servers for high availability.

Certificate management can be difficult

Many VPN tools are built around certificates, which can be difficult to securely sign, distribute, revoke, and manage.

Performance overhead

When you use a VPN, you are routing your network traffic through another server, which increases latency, and, depending on the bandwidth that server has available and how many other people are using it simultaneously, may degrade your network throughput.

Now that you’ve seen some common protocols for accessing a private network from the outside, let’s turn our attention to how services within a private network can communicate.

Service Communication in Private Networks

In Part 6, you saw that a common way to deal with problems of scale, such as more traffic and more employees, is to break the codebase up into multiple (micro)services that are deployed independently, typically on separate servers. These services typically need to communicate with each other, which they do by sending messages over the network. In order to be able to do this, there are three technical decisions you’ll have to make:

Service discovery

How does one service figure out what endpoint(s) to use to reach another service?

Service communication protocol

What is the format of the messages that services send to each other?

Service mesh

How do you handle security, resiliency, observability, and traffic management?

Everyone who deploys services has to deal with the first two decisions, service discovery and communication protocol, right away. The third decision, service mesh, is typically only necessary at larger scales. The following sections will go through each of these problems and discuss some of the tools and approaches you can use to solve them, starting with service discovery.

Service Discovery

As soon as you have one service, A, that needs to talk to another service, B, you have to figure out service discovery: how does A figure out the right IP addresses to use to talk to B? This can be a challenging problem as each service may have multiple replicas running on multiple servers, and the number of replicas and which servers they are running on may change frequently as you deploy new versions, replicas crash and are replaced, or you scale the number of replicas up or down in response to load.

Key takeaway #6

As soon as you have more than one service, you will need to figure out a service discovery solution.

In the next section, we’ll go over some of the tools you can use to solve this problem, and in the section after that, we’ll compare the tools to help you figure out which one is the right fit for your use cases.

Service discovery tools

A common way to handle service discovery is to repurpose one of the following generic tools:

Configuration files

The simplest solution is to hard-code server IP addresses in configuration files, using any of the application configuration tools you saw in Part 6, such as JSON, YAML, Cue, or Dhall. For example, service A might have a config file with the hard-coded IP addresses of the servers where B is deployed. This works as long as the IP addresses used by B don’t change too often: for example, in an on-prem data center, where you have a relatively fixed set of physical servers for B, or in the cloud, if you’re using private static IP addresses for B’s virtual servers.[38]

Load balancers

Instead of hard-coding the IP addresses of every server, you could deploy an internal load balancer (a load balancer only accessible within your private network) in front of your services using any of the load balancers you saw in Part 3, such as AWS ELB, GCP Cloud Load Balancer, or Nginx, and hard-code just the endpoints for the load balancer in each environment. Each service can then look up the load balancer endpoint in its configuration and make requests to other services using a convention: e.g., service A will know it can reach service B at the /B path of the load balancer.

DNS

If you squint at it, you might realize that service discovery is about translating a name (the name of a service) to a set of IP addresses. As it turns out, we have a system for doing just that: DNS! It’s common to have private DNS servers in a data center, and most cloud providers offer private DNS services such as Private hosted zones in AWS, Private zones in Google Cloud, and Private DNS in Azure, so you can create a DNS record that points to the IP addresses for each service, and use a convention for service discovery: e.g., service A would know that it can talk to service B at the domain B.internal.
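For example, assuming a private hosted zone with a hypothetical internal domain, service A might resolve and call service B like this (B.internal and the /health path are placeholders):

# Resolve the service name to the IP addresses behind it
$ dig +short B.internal

# Call the service by name and let DNS handle the lookup
$ curl http://B.internal/health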

Whereas config files, load balancers, and DNS repurpose generic tools for service discovery, these days there are also a number of tools that are purpose-built for service discovery. Broadly speaking, most of the dedicated service discovery tools fall into the following two buckets:

Library

Tools such as Consul (if you use the Consul client directly), Curator with ZooKeeper, and Eureka, come with two key ingredients: a service registry and a service discovery library. The service registry is a data store that stores the endpoint data for your services, performs health checks to detect when endpoints are up and down, and, most importantly, allows you to subscribe to updates, so you are notified immediately whenever endpoints are updated. The service discovery library is a library that you incorporate into your application code to (a) add endpoints to the service registry when your service is booting, and (b) fetch endpoint data from the service registry and subscribe to updates, so you can make service calls by looking up the latest service endpoint data in memory, and sending a request directly to one of those endpoints.

Local proxy

Tools such as Consul (if you use Consul DNS or Consul Template), gRPC with etcd, Synapse, and Envoy, as well as the service discovery mechanisms built into orchestration tools such as Kubernetes (and all the platforms built on top of it, such as EKS, GKE, AKS, etc.), Nomad, and Mesos, also come with two key ingredients: a service registry and a local proxy. The local proxy is a proxy you run on the same servers as your apps, either by deploying it as a sidecar container, which is a container that is always deployed in tandem with every one of your application containers, or by running it as a daemon that runs in the background of each server in your cluster.

The local proxy does exactly the same thing as the service discovery library: it adds endpoints to the service registry when your app is booting, and it fetches endpoint data from the service registry and subscribes to updates. The difference is that the local proxy is completely transparent, and does not require you to make any changes to your application code. To make this work, you override your local network settings in each container or server to either send all traffic through this proxy, or to use the proxy as a local DNS server. Either way, the proxy uses its local service registry data to route your app’s requests to the proper endpoints, without the app having to be aware of the service discovery tool.

Now that you’ve seen the various options for service discovery tools, how do you pick the right one? That is the focus of the next section.

Service discovery tool comparison

Here are some of the key trade-offs to consider when picking a service discovery tool:

Manual error

Any solution that involves hard-coding data is error-prone. Every place I’ve worked that hard-coded IP addresses, either of servers or load balancers, had frequent bugs and outages due to errors in the configuration files.

Update speed

One of the biggest advantages of the dedicated service discovery tools is that you can subscribe to updates from the service registry, so you get the latest endpoint data very quickly. On the other hand, hard-coded IPs only update when you update them by hand, which is much slower. DNS falls somewhere in between, depending on caching settings: a low TTL means you get updates faster, but at the cost of more latency (you’ll hear more about latency shortly).

Scalability

If you hard-code IPs in configuration files, you almost always hit scaling bottlenecks once you have more than a handful of services. Load balancers can also be tough to scale, as one request from the outside world can result in dozens of internal service calls going through the load balancer, which can become a serious bottleneck when you have a lot of traffic.

Transparency

Some service discovery solutions require you to update your app code to incorporate service discovery logic, such as having to use a service discovery library, or look up a load balancer endpoint. Other solutions are transparent, such as the local proxy or DNS, which don’t require you to update your app code. Of course, service discovery can never be completely transparent, as the app code still has to use some mechanism to make a service call, but the idea with transparent solutions is that the app does not need to be aware of the specific tool you use for service discovery, and can rely on generic, portable approaches, such as using domain names.

Latency

Server-side service discovery tools, such as load balancers, require every service call to go through extra network hops, which increases latency considerably (see Table 9). DNS also adds an extra network hop to query the DNS server; if you cache the DNS response, you can avoid that hop for most requests, but at the cost of reducing update speed. With client-side service discovery tools, such as a service discovery library, you have all the endpoint data cached locally, so you can send requests directly to those endpoints, without any extra network hops. The local proxy is an in-between solution: there is an extra hop to the proxy, but because it runs locally, the additional latency is minuscule compared to talking to another server.

CPU and memory usage

The local proxy approach requires you to run extra code with every container or every server, which adds CPU and memory usage overhead.

Extra infrastructure

Some of the service discovery tools require you to deploy and manage extra infrastructure, such as load balancers or service registries. This can add a lot of operational overhead, especially the service registries, as they are based on distributed data stores (e.g., Consul, etcd, ZooKeeper) that can be challenging to manage.

Table 15 shows a summary of how the service discovery tools from the previous section compare across these trade-offs:

Table 15. Comparing service discovery tools
|                                   | Configuration files | Load balancers | DNS         | Registry + library | Registry + local proxy |
|-----------------------------------|---------------------|----------------|-------------|--------------------|------------------------|
| Minimize manual error             | Poor                | Moderate       | Very strong | Very strong        | Very strong            |
| Update speed                      | Poor                | Very strong    | Moderate    | Very strong        | Very strong            |
| Scalability                       | Poor                | Moderate       | Strong      | Very strong        | Very strong            |
| Transparent to app code           | Moderate            | Moderate       | Very strong | Poor               | Very strong            |
| Minimize latency overhead         | Very strong         | Poor           | Moderate    | Very strong        | Strong                 |
| Minimize CPU & memory overhead    | Very strong         | Very strong    | Very strong | Very strong        | Poor                   |
| Minimize infrastructure overhead  | Very strong         | Moderate       | Strong      | Poor               | Poor                   |

Now that you’ve seen all the different options for solving service discovery and how they compare, let’s move on to the next challenge, the service communication protocol.

Service Communication Protocol

As you saw in Part 6, a big part of breaking your code into services is defining an API for the service, and maintaining it over the long term. One of the key decisions you’ll have to make is the protocol you will use for that API, which actually consists of two primary decisions:

Message encoding

How will you serialize data?

Network encoding

How will you send that data over the network?

In the next section, we’ll go over some of the most common protocols in use today, and in the section after that, we’ll go over the key factors to consider when picking a protocol.

Common protocols

Here are some of the most common protocols in use today:

REST APIs: HTTP + JSON

REST stands for Representational State Transfer, and it is the de facto standard for building web APIs. Going into all the details of REST APIs is beyond the scope of this book, but two of the key ingredients are that the network encoding is HTTP and the message encoding provides a "uniform interface." The uniform interface part of REST has always been a bit vague: it most likely referred to something like HTML, but when building APIs, most teams these days use HTTP + JSON (a minimal example appears after this list).[39]

Serialization libraries

There are a number of serialization libraries out there that support (a) defining a schema and (b) compiling stubs for various programming languages. These include Protocol Buffers, Cap’n Proto, FlatBuffers, Thrift, and Avro. These are sometimes sent over HTTP, but one of the reasons to use a serialization library instead of JSON for the message encoding is that serialization libraries typically offer better performance, so it’s common to pick a network encoding that offers better performance too, such as HTTP/2 or TCP.

RPC libraries

One level up from serialization libraries are libraries specifically designed for remote procedure calls (RPC), which is a way for a procedure on one computer to execute a procedure on another computer (e.g., one service sending a request to another service), often with code that looks just like the code for executing a procedure locally. Some of the popular tools in this space include gRPC, Connect RPC, drpc, and Twirp. Most of these tools define both the message encoding, which typically uses a serialization library such as Protocol Buffers, and the network encoding, which is often something performant like HTTP/2. These tools not only generate client stubs, but also server stubs to help you implement a service with the RPC libraries.
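To make the first option concrete, here’s a hedged sketch of an HTTP + JSON call between services; the service name, endpoint path, and payload are made up for illustration:

# Any generic HTTP client can produce (and inspect) HTTP + JSON traffic
$ curl -X POST http://sample-app-backend/names \
    -H 'Content-Type: application/json' \
    -d '{"name": "fundamentals-of-devops"}'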

So should you go with HTTP + JSON or HTTP/2 + Protocol Buffers or gRPC (which is also HTTP/2 + Protocol Buffers)? The next section goes through the key factors you should consider when deciding.

Key factors to consider

When trying to pick a service communication protocol, here are some of the key factors you should take into account:

Programming language support

What programming languages are you using at your company? How many of them have good support for the message encoding you’re considering? JSON is supported in virtually every programming language; other serialization protocols are more hit or miss, though the more mature ones are typically supported in most popular programming languages.

Client support

What clients does your API need to support? Will web browsers be talking directly to your services? Mobile apps? IoT? What protocols do those clients support, both for message and network encoding? HTTP + JSON are supported in virtually every client, and are native to web browsers; other serialization protocols are more hit or miss, especially with web browsers.

Schema and code generation

Does the message encoding support defining a schema? Can you automatically generate client stubs in various programming languages for that schema? Can you automatically generate documentation? This is one area where serialization libraries and RPC libraries typically shine and HTTP + JSON are weaker; that said, tools like OpenAPI can help fill that gap for HTTP + JSON.

Ease of debugging

How hard is it to test an API built with this tool or to debug problems? With HTTP + JSON, this is typically easy, as you can use any HTTP client, such as a web browser or curl. Serialization and RPC libraries often require special tooling for testing.

Performance

How efficient are the message and network encoding in terms of bandwidth, memory, and CPU usage? This is an area where serialization and RPC libraries are usually going to come out well ahead of HTTP + JSON.

Ecosystem

How big is the ecosystem around the message encoding? How is the documentation? How often are there updates and new releases? How many tools, plugins, and related projects are there? How hard is it to hire developers who know how to use this message encoding? How hard is it to find answers on StackOverflow? HTTP + JSON have the largest ecosystem, by far; Protocol Buffers and gRPC (which uses Protocol Buffers under the hood) are arguably a distant second.

As a general rule, I would default to HTTP + JSON for most APIs, and only consider alternatives in special cases: e.g., at very large scales, where you have hundreds of services and tens of thousands of queries per second, the better performance and standardization you get with gRPC may pay off.

Now that you know how to define APIs for your services, let’s talk about how to manage your services at scale using service meshes.

Service Mesh

A service mesh is a networking layer that is designed to help manage communication between applications in a (micro)service architecture by providing a single, unified solution to the following problems:

Security

In Part 6, you deployed a couple microservices in Kubernetes that were able to talk to each other via HTTP requests. In fact, not only could these microservices talk to each other, but anyone could talk to them, so long as they had network access: they’d respond blindly to any HTTP request that came in. Putting these microservices in a private network provides some protection (the castle-and-moat model), but as your company scales, you will most likely want to harden the security around your services (the zero-trust model) by enforcing encryption, authentication, and authorization. You’ll learn more about these topics in Part 8.

Observability

As you saw in Part 6, (micro)service architectures introduce many new failure modes and moving parts that can make debugging harder than with a monolith. In a large services architecture, understanding how a single request is processed can be a challenge, as that one request may result in dozens of API calls to dozens of services. This is where observability tools such as distributed tracing, metrics, and logging become essential. You’ll learn more about these topics in Part 10 [coming soon].

Resiliency

If you’re at the scale where you are running many services, you’re at a scale where bugs, performance issues, and other errors happen many times per day; if you had to deal with every issue manually, you’d never be able to sleep. In order to have a maintainable and resilient (micro)service architecture, there are a number of tools and techniques you can use to recover from or avoid errors automatically, including retries, timeouts, circuit breakers, and rate limiting.

Traffic management

As you saw in Part 6, breaking a monolith into services means you are now managing a distributed system. To be able to manage a large distributed system, you often need a lot of fine-grained control over network traffic, including load balancing between services (e.g., to minimize latency or maximize throughput), canary deployments (sending traffic to just a single new replica of an app, as you saw in Part 5), and traffic mirroring (i.e., sending a duplicate of traffic to an extra endpoint for analysis or testing).

Almost all of these are problems of scale: if you only have two or three services, a small team, and not a lot of load, these problems are not likely to affect you, and a service mesh may be an unnecessary overhead; however, if you have hundreds of services owned by dozens of teams, processing tens of thousands of requests per second, these are problems you’ll be dealing with every day. If you try to solve these problems individually, one at a time, you’ll find it is a huge amount of work, and that the solution to one has an impact on the other: e.g., how you manage encryption has a big impact on your ability to do tracing and traffic mirroring. Moreover, the simple solutions you’re likely to try first may require you to make code changes to every single app, and as you learned in Part 6, rolling out global changes across many services can take a very long time.

This is where a service mesh can be of use: it gives you an integrated, all-in-one solution to these problems, and just as importantly, it can solve most of these problems in a way that is transparent, and does not require you to make changes to your application code.

Key takeaway #7

A service mesh can improve security, observability, resiliency, and traffic management in a microservices architecture, without having to update the application code of each service.

When things are working, a service mesh can feel like a magical way to dramatically upgrade the security and debuggability of your (micro)service architecture. However, when things aren’t working, the service mesh itself can be difficult to debug. Service meshes solve many problems, which is great, but at the cost of introducing many new moving parts that can be the source of new problems: e.g., encryption, authentication, authorization, routing, firewalls, tracing, and so on. Understanding, installing, configuring, and managing a service mesh can be a lot of overhead. If you’re at the scale where you need solutions to the problems listed earlier, it’s worth it; if you’re a tiny startup, it’ll only slow you down.

Service mesh tools can be broken down into three buckets. The first bucket is the service mesh tools designed for use with Kubernetes:

The second bucket is managed service mesh tools from cloud providers:

The third bucket is service mesh tools that can be used with any orchestration approach in any deployment environment (e.g., Kubernetes, EC2, on-prem servers, etc.):

The best way to get a feel for what a service mesh does is to try one out, so let’s go through an example of using Istio with Kubernetes.

Example: Istio Service Mesh with Kubernetes Microservices

Istio is a popular service mesh for Kubernetes that was originally created by Google, IBM, and Lyft, and open sourced in 2017. Let's see how Istio can help you manage the two microservices you deployed with Kubernetes in Part 6: one was a backend app that exposed a simple JSON-over-HTTP REST API, and the other was a frontend app that called the backend, using the service discovery mechanism built into Kubernetes, and rendered the data it got back as HTML. Make a copy of those two sample apps into the folder you're using for this blog post's examples:

$ cd fundamentals-of-devops

$ cp -r ch6/sample-app-frontend ch7/

$ cp -r ch6/sample-app-backend ch7/

Next, let’s install Istio. We’ll use the same local Kubernetes cluster that’s part of Docker Desktop, just as you did in Part 3. Make sure you’re authenticated to that cluster as follows:

$ kubectl config use-context docker-desktop

Download Istio using curl:

$ curl -L https://istio.io/downloadIstio | sh -
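The command above grabs whatever the latest release is. If you'd rather pin a specific version (for example, so everyone on your team ends up with the same istioctl), the download script also honors an ISTIO_VERSION environment variable; the version number below is just an example:

$ curl -L https://istio.io/downloadIstio | ISTIO_VERSION=1.22.3 sh -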

This will download and extract Istio into a folder called istio-<VERSION>, where <VERSION> is the latest Istio version number, such as 1.22.3. Head into that folder:

$ cd istio-<VERSION>

Inside the bin folder, you’ll find istioctl, which is a CLI tool that has useful helper functions for working with Istio. Add the bin folder to your PATH:

$ export PATH=$PWD/bin:$PATH

Note that the preceding command only adds bin to your PATH in the current terminal window; if you want to make this a permanent change available in all terminal windows, add the same export line to your profile configuration (e.g., ~/.profile, ~/.bash_profile, or ~/.zprofile).
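For example, assuming you extracted Istio into your home directory and use Zsh, the line to append would look something like this (adjust the path and the profile file to match your setup):

$ echo 'export PATH="$HOME/istio-<VERSION>/bin:$PATH"' >> ~/.zprofile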

Next, install Istio as follows:

$ istioctl install --set profile=minimal -y

This uses a minimal profile to install Istio, which is good enough for learning and testing. If you’re installing Istio for production, see the install instructions for other profiles you can use.

The way Istio works is to inject its own sidecar into every Pod you deploy into Kubernetes. That sidecar is what provides all the security, observability, resiliency, and traffic management features, without you having to change your application code. To configure Istio to inject its sidecar into all Pods you deploy into the default namespace, run the following command:

$ kubectl label namespace default istio-injection=enabled
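To confirm the label took effect, you can ask kubectl to show the istio-injection label as a column; the default namespace should now show enabled:

$ kubectl get namespace -L istio-injection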

Istio supports a number of integrations with observability tools. For this example, let’s use the sample add-ons that come with the Istio installer, which include a dashboard for Istio called Kiali, a database for monitoring data called Prometheus, a UI for visualizing monitoring data called Grafana, and a distributed tracing tool called Jaeger:

$ kubectl apply -f samples/addons

$ kubectl rollout status deployment/kiali -n istio-system

At this point, you can verify everything is installed correctly by running the verify-install command:

$ istioctl verify-install

If everything looks good, deploy the frontend and backend as you did before:

$ cd ../sample-app-backend

$ kubectl apply -f sample-app-deployment.yml

$ kubectl apply -f sample-app-service.yml

$ cd ../sample-app-frontend

$ kubectl apply -f sample-app-deployment.yml

$ kubectl apply -f sample-app-service.yml

After a few seconds, you should be able to make a request to the frontend as follows:

$ curl localhost

<p>Hello from <b>backend microservice</b>!</p>

At this point, everything should be working exactly as before. So is Istio doing anything? One way to find out is to open up the Kiali dashboard you installed earlier:

$ istioctl dashboard kiali

This command should pop open the dashboard in your web browser. Click the Traffic Graph link in the nav on the left, and you should see something similar to Figure 89:

The traffic graph in Istio’s Kiali dashboard
Figure 89. The traffic graph in Istio’s Kiali dashboard

If the Traffic Graph doesn’t show you anything, run curl localhost several more times, and then click the refresh button in the top right of the dashboard. You should see a visualization of the path your requests take through your microservices, including through the Services and Pods. This is just one of the observability tools you get with Istio. To see another one, click on Workloads in the left nav, select sample-app-backend, and click the Logs tab. You should see logs for your backend, including logs from the Node.js app, as well as access logs from Istio components, such as the Envoy proxy, as shown in Figure 90:

Logs for the backend
Figure 90. Logs for the backend

Right away, you see one of the key benefits of a service mesh: observability. Istio gives you distributed tracing, logs, and metrics.

Another key benefit of service meshes is security, including support for automatically encrypting, authenticating, and authorizing all requests within the service mesh. By default, to make it possible to install Istio without breaking everything, Istio initially allows unencrypted, unauthenticated, and unauthorized requests to go through. However, you can change this by configuring policies in Istio.

Create a file called istio/istio-auth.yml with the policies shown in Example 140:

Example 140. An authentication and authorization policy for Istio (ch7/istio/istio-auth.yml)
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication               (1)
metadata:
  name: require-mtls
  namespace: default
spec:
  mtls:
    mode: STRICT

---                                    (2)
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy              (3)
metadata:
  name: allow-nothing
  namespace: default
spec:
  {}

This code does the following:

1. This creates an authentication policy that requires all service calls to use mTLS (mutual TLS), which is a way to enforce that every connection is encrypted and authenticated (you'll learn more about TLS in Part 8). One of the benefits of Istio is that it handles mTLS for you, completely transparently.
2. Note the use of ---: this is a divider that allows you to put multiple Kubernetes configurations in a single YAML file.
3. This is an authorization policy that creates a more secure default where all service calls will be blocked, rather than your services responding to anyone who happens to have network access. You can then add additional authorization policies to allow just the service communication that you know is valid.

Hit CTRL+C to shut down the Kiali dashboard and deploy these policies as follows:

$ cd ../istio

$ kubectl apply -f istio-auth.yml
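If you want to double-check that the policies were created, you can list them by the resource types Istio's CRDs register:

$ kubectl get peerauthentication,authorizationpolicy -n default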

Now, look what happens if you try to access the frontend app again:

$ curl localhost

curl: (52) Empty reply from server

Since your request to the frontend wasn’t using mTLS, Istio rejected the connection immediately. Enforcing mTLS makes sense for backends, as they should only be accessible to other services, but your frontend should be accessible to users outside your company, so you should disable the mTLS requirement for just the frontend, as shown in Example 141:

Example 141. An authentication policy to disable the mTLS requirement for the frontend
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: allow-without-mtls
  namespace: default
spec:
  selector:
    matchLabels:
      app: sample-app-frontend-pods (1)
  mtls:
    mode: DISABLE                   (2)

This is an authentication policy that works as follows:

1. Target the frontend Pods.
2. Disable the mTLS requirement for the frontend Pods.

You can put the YAML in Example 141 into a new YAML file, but dealing with too many YAML files for the frontend is tedious and error-prone. Let’s instead use the --- divider to combine the frontend’s sample-app-deployment.yml, sample-app-service.yml, and the YAML you just saw in Example 141 into a single file called kubernetes-config.yml, with the structure shown in Example 142:

Example 142. Combine multiple Kubernetes configurations into a single YAML file (ch7/sample-app-frontend/kubernetes-config.yml)
apiVersion: apps/v1
kind: Deployment
# ... (other params omitted) ...

---
apiVersion: v1
kind: Service
# ... (other params omitted) ...

---
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
# ... (other params omitted) ...

With all of your YAML in a single kubernetes-config.yml, you can delete the sample-app-deployment.yml and sample-app-service.yml files, and deploy changes to the frontend app with a single call to apply:

$ cd ../sample-app-frontend

$ kubectl apply -f kubernetes-config.yml

Try accessing the frontend again, adding the --write-out flag so that curl prints the HTTP response code after the response body:

$ curl --write-out '\n%{http_code}\n' localhost

RBAC: access denied

403

This time, Istio did not close the connection immediately, as authentication and encryption with mTLS are no longer required. However, you got a 403 response code (Forbidden) and "access denied" in the response body because the allow-nothing authorization policy is still blocking all requests. To fix this, you need to add authorization policies to the backend and the frontend.

This requires that Istio has some way to identify the frontend and backend (authentication). Istio uses Kubernetes service accounts as identities, providing a TLS certificate to each application based on its service identity, and using mTLS to provide mutual authentication: that is, it’ll not only have the backend verify the request is coming from the frontend, but before sending that request, it’ll have the frontend verify it is really talking to the backend. Istio will handle all the TLS details for you, so all you need to do is associate the frontend and backend with their own service accounts and add an authorization policy to each one.
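In case it helps to see what those identities look like: assuming the default cluster.local trust domain, the identity Istio encodes in each workload's certificate is built from the workload's namespace and service account, and it's also where the principals you'll reference in the authorization policies below come from:

# SPIFFE identity encoded in the workload's certificate (default trust domain)
spiffe://cluster.local/ns/<namespace>/sa/<service-account-name>

# The corresponding principal you reference in an AuthorizationPolicy
cluster.local/ns/<namespace>/sa/<service-account-name>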

Let’s start with the frontend. Update its kubernetes-config.yml as shown in Example 143:

Example 143. Configure the frontend with a service account and authorization policy (ch7/sample-app-frontend/kubernetes-config.yml)
apiVersion: apps/v1
kind: Deployment
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: sample-app-frontend-pods
    spec:
      serviceAccountName: sample-app-frontend-service-account (1)
      containers:
        - name: sample-app-frontend
# ... (other params omitted) ...

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sample-app-frontend-service-account                   (2)

---
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy                                     (3)
metadata:
  name: sample-app-frontend-allow-http
spec:
  selector:
    matchLabels:
      app: sample-app-frontend-pods                           (4)
  action: ALLOW                                               (5)
  rules:                                                      (6)
  - to:
    - operation:
        methods: ["GET"]

Here are the updates to make to the frontend:

1. Configure the frontend's Deployment with a service account. The service account itself is created in (2).
2. Create a service account for the frontend.
3. Add an authorization policy for the frontend.
4. The authorization policy targets the frontend's Pods.
5. The authorization policy will allow requests that match the rules in (6).
6. Define rules for the authorization policy, where each rule can optionally contain from (sources) and to (destinations) to match. The preceding code will allow the frontend to receive requests from all sources, as the rule doesn't include a from value, but it will only allow HTTP GET requests, based on the to value.

Run apply to deploy these changes to the frontend:

$ kubectl apply -f kubernetes-config.yml
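If you want to verify that the new objects exist, you can look up the service account and authorization policy by the names used in Example 143:

$ kubectl get serviceaccount sample-app-frontend-service-account

$ kubectl get authorizationpolicy sample-app-frontend-allow-http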

Next, head over to the backend, and combine its Deployment and Service definitions into a single kubernetes-config.yml file, separated by ---, just like you did for the frontend. Once that’s done, update the backend’s kubernetes-config.yml as shown in Example 144:

Example 144. Configure the backend with a service account and authorization policy (ch7/sample-app-backend/kubernetes-config.yml)
apiVersion: apps/v1
kind: Deployment
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: sample-app-backend-pods
    spec:
      serviceAccountName: sample-app-backend-service-account (1)
      containers:
        - name: sample-app-backend
# ... (other params omitted) ...

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sample-app-backend-service-account                   (2)

---
apiVersion: security.istio.io/v1                             (3)
kind: AuthorizationPolicy
metadata:
  name: sample-app-backend-allow-frontend
spec:
  selector:
    matchLabels:
      app: sample-app-backend-pods                           (4)
  action: ALLOW
  rules:                                                     (5)
  - from:
    - source:
        principals:
          - "cluster.local/ns/default/sa/sample-app-frontend-service-account"
    to:
    - operation:
        methods: ["GET"]

Here are the updates to make to the backend:

1. Configure the backend's Deployment with a service account. The service account itself is created in (2).
2. Create a service account for the backend.
3. Add an authorization policy for the backend.
4. Apply the authorization policy to the backend's Pods.
5. Define rules that allow HTTP GET requests to the backend from the service account of the frontend.

Run apply to deploy these changes to the backend:

$ cd ../sample-app-backend

$ kubectl apply -f kubernetes-config.yml

And now test the frontend one more time:

$ curl --write-out '\n%{http_code}\n' localhost

<p>Hello from <b>backend microservice</b>!</p>

200

Congrats, you got a 200 (OK) response code and the expected HTML response body, which means you now have microservices running in Kubernetes, using service discovery, and communicating securely via a service mesh! With the authentication and authorization policies you have in place, you have significantly improved your security posture: all communication between services (such as the request the frontend successfully made to the backend) is now encrypted, authenticated, and authorized—all without you having to modify the Node.js source code of either app. Moreover, you have access to all the other service mesh benefits, too: observability, resiliency, and traffic management.

Get your hands dirty

Here are some exercises you can try at home to get a better feel for service meshes and Istio:

  • Try out some of Istio’s other observability functionality, such as using Grafana to view your metrics: istioctl dashboard grafana.

  • Try out some of Istio’s traffic management functionality, such as request timeouts, circuit breaking, and traffic shifting (a starting-point sketch follows this list).

  • Consider if Istio’s ambient mode is a better fit for your workloads than the default sidecar mode.
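As a starting point for the traffic management exercise, here is a minimal, hypothetical sketch of circuit breaking with an Istio DestinationRule. The host must be the name of your backend's Kubernetes Service (taken from your sample-app-service.yml; shown here as a placeholder), and the thresholds are arbitrary values you'd want to tune:

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: sample-app-backend
spec:
  host: <backend-service-name>         # the backend's Kubernetes Service name
  trafficPolicy:
    connectionPool:
      http:
        http1MaxPendingRequests: 100   # cap queued requests to the backend
    outlierDetection:                  # eject misbehaving Pods from the pool
      consecutive5xxErrors: 5          # after 5 consecutive 5xx responses...
      interval: 30s                    # ...checked every 30 seconds...
      baseEjectionTime: 60s            # ...eject the Pod for at least 60 seconds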

When you’re done testing, you can run delete on the kubernetes-config.yml files of the frontend and backend to clean up the apps. If you wish to uninstall Istio, first, remove the global authorization and authentication policies:

$ cd ../istio

$ kubectl delete -f istio-auth.yml

Next uninstall the addons:

$ cd ../istio-<VERSION>

$ kubectl delete -f samples/addons

And finally, uninstall Istio itself, delete its namespace, and remove the istio-injection label from the default namespace:

$ istioctl uninstall -y --purge

$ kubectl delete namespace istio-system

$ kubectl label namespace default istio-injection-

One of the benefits of software-defined networking is that it’s fast and easy to try out different networking approaches. Instead of having to spend hours or days setting up physical routers, switches, and cables, you can try out a tool like Istio in minutes, and if it doesn’t work for you, it only takes a few more minutes to uninstall Istio, and try something else. And, of course, you can save your networking configuration (e.g., Istio policies) as code, which allows you to go back and forth as much as you want without losing any of your past work. Keep experimenting until you find the solution that best fits your needs.

Conclusion

You’ve now seen the central role networking plays in connectivity and security, as per the 7 key takeaways from this blog post:

  • You get public IP addresses from network operators such as cloud providers and ISPs.

  • DNS allows you to access web services via memorable, human-friendly, consistent names.

  • Use a defense-in-depth strategy to ensure you’re never one mistake away from a disaster.

  • Deploy all your servers into private networks by default, exposing only a handful of locked-down servers directly to the public Internet.

  • In the castle-and-moat model, you create a strong network perimeter to protect all the resources in your private network; in the zero-trust model, you create a strong perimeter around each individual resource.

  • As soon as you have more than one service, you will need to figure out a service discovery solution.

  • A service mesh can improve security, observability, resiliency, and traffic management in a microservices architecture, without having to update the application code of each service.

Putting these all together, you should now be able to picture the full network architecture you’re aiming for, as shown in Figure 91. Inside your data center, you have a private network, such as a VPC. Within this network, almost all of your servers are in private subnets. The only exceptions are highly-locked down servers designed to accept traffic directly from customers, such as load balancers, and highly-locked down bastion hosts for your employees, such as an access proxy.

Full network architecture
Figure 91. Full network architecture

You should also be able to picture the request flow in Figure 91, both for customers and for employees. When a customer visits your website, their computer looks up your domain name using DNS, gets the public IP addresses of your load balancers (which are assigned by your cloud provider), makes a request to one of those IPs, and the load balancer routes that request to an appropriate app server in the private subnets. That app server processes the request, possibly communicating with other services—using service discovery to find those services, and a service mesh to enforce authentication, authorization, and encryption—and returns a response. When an employee of your company needs to access something on your internal network, such as a wiki page, they use an authorized company device to authenticate to the access proxy, which checks the user, their device, and access policies, and if the employee is authorized, the access proxy gives them access to just that wiki.

As you went through this blog post, you repeatedly came across several key security concepts such as authentication and secrets. These concepts affect not only networking, but all aspects of software delivery, so let’s move on to Part 8, where we do a deeper dive on security.
