My little lab can afford some experimental allowances given that I’ll never (hopefully) breach the “thousands of hosts” mark. One experiment that paid off recently was ditching Traefik v1 for a hybrid setup that uses Nomad, consul-template, Caddy, and wireguard in order to provide the HTTP routing layer for my services.
I think it’s an interesting solution, and it has proven a) particularly resilient and b) very easy to maintain and extend.
The Problem
When you run a gaggle of containers in a runtime like kubernetes, one of the most fundamental needs is the ability to route incoming traffic to the right workloads. My browser wants to use my grafana installation, so I need to hit an IP address; that endpoint needs to terminate TLS and reverse proxy to a container running on the backend that may or may not have recently migrated due to fluctuations in resource constraints or cluster member availability.
For folks on something like GKE, this is pretty hands-off.
Depending on whether you use something like the Nginx ingress controller or rely entirely on GKE-native resources, you hand k8s a block of YAML asking it to route requests for Host: my.service.app to a certain set of listening containers, and GCP hastily assembles the requisite pieces to make it happen. A load balancer appears, it probably handles ad-hoc certificate provisioning, and most of it feels like magic (at least if, like this geriatric devops individual, you’re used to more manual methods).
Things are different if you’re doing it yourself or, in the case of a home lab, running on physical hardware without the magic of highly available load balancers that hand you an endpoint and let you carry on. Here are some considerations to factor in:
- What endpoint do you hit in your LAN (or outside of it)? Does local DNS point to a dedicated HTTP box, does the router serve as the termination point, or something else?
- How do you manage TLS? Using http-01 ACME validation assumes you’re fine opening up your private services so that Let’s Encrypt can perform validation, and dns-01 validation requires some extra finagling.
- How are new reverse proxy routes added? If a container migrates somewhere, how does the load balancer get updated?
- How does the load balancer that does this reverse proxying communicate with backend services securely? In the magic of The Cloud™, you trust AWS/GCP/Azure to get this right, but in your own environment, how do you ensure that a reverse proxy isn’t sending authentication cookies over to the backend in cleartext?
- Persistence. Assuming that your HTTP endpoint is “floating”, where do your certs live? If you’re trying to build into high availability, how do two load balancers “share” and concurrently manage a Let’s Encrypt certificate and renewals thereof?
There are even more considerations to weigh; suffice it to say that it’s sort of a sticky problem.
Many of these problems sort of disappear in a more traditional, single-host environment, since that just requires pointing nginx at something like 127.0.0.1:8080, installing acme on a cron job, and talking to one host. More than one machine but less than a fully-managed Cloud solution is the sweet sore spot.
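For contrast, that single-host baseline is roughly the following nginx server block - a minimal sketch, assuming a backend on port 8080 and certificates already renewed out-of-band by an ACME client on a cron job; the hostname and paths are illustrative:
# illustrative single-host reverse proxy; certs maintained separately by an ACME client
server {
    listen 443 ssl;
    server_name thing.example.com;
    ssl_certificate     /etc/ssl/thing.example.com/fullchain.pem;
    ssl_certificate_key /etc/ssl/thing.example.com/privkey.pem;
    location / {
        proxy_pass http://127.0.0.1:8080;
    }
}
None of that helps once the backend can move between hosts, which is the whole problem.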
Note: to comments like “why would you build such a large lab that requires this amount of work?”, please be aware: this is my hobby, and I enjoy it. Yes, I engage in devops labor in my free time. I do not have brain damage.
The Ghost of Infrastructure Past
In the heady days of Traefik v1, you could solve this pretty elegantly. Traefik could glean a catalog of running services from Consul, manage Let’s Encrypt certificates natively, and store them in a key/value store like Consul, which allows more than one instance for high availability because certificate keys and data persist cluster-wide instead of within a single directory. This is all well and good - you can now hit the Traefik endpoint and get a dynamically-updating reverse proxy that’ll manage certs for you, but:
- This doesn’t solve the “secure backend communication” problem natively (maybe there’s some k8s networking fabric you could use, or jerry-rig something with consul-connect).
- Traefik v2 dropped support for storing ACME certs in key/value stores. Once again we find OSS projects falling from grace at the altar of shareholder return on investment. Nicely done.
While Traefik configuration is pretty hands-off, you’ll need to use something else if you want feature parity while staying up-to-date and not paying enterprise prices for a hobby lab.
Dear reader, this is the situation I found myself in recently: still running an aged Traefik v1 deployment without sufficient reason to justify paying for whatever pound of flesh Enterprise Tier Traefik demanded.
Concept
I’ve tried to steer my lab toward less-complex solutions where I can (and yes, I see the irony of doing this in a homelab with dozens of machines present). For example, rather than running something like kubernetes in my lab, I operate Nomad instead, which is much simpler to wrap your head around.1 In this case, two candidates came to mind when thinking about “load balancing” and “secure backend communication”: Caddy and wireguard, respectively.
Caddy is probably best-in-class when it comes to ad-hoc certificate provisioning - it was among the first to do so by default, and the configuration is tremendously simpler than something like Apache (or nginx). I mean, behold this sample reverse proxy configuration and tell me you don’t love looking at it:
example.com
reverse_proxy localhost:5000
This is literally a valid Caddy configuration for a reverse proxy with TLS (assuming http-01 can work).
It’s honestly beautiful (and marginally less complex than this).
If you’re privy to it, you know that wireguard is the best thing since sliced bread, and probably better than sourdough toast. I’ve set up a few networks and it does for private networking what ssh does for remote access; it’s simple and effective. The “secure backend communication” problem made me consider how wireguard might come into play as an orchestration-agnostic solution.
Note: Consul connect does solve some of these problems, but I’m looking for fairly dynamic proxy configuration, and most Nomad examples require some extra configuration for workloads to communicate with the frontend proxy. It also requires a sidecar Envoy proxy, which I’d like to avoid on my slim ARM SBCs. But credit is due to Hashicorp for providing this for those who need it.
Solution
Phew, that’s enough set-up. Let’s dig into the meat.
Networking Plane
Remember when I said that I can build things that don’t scale? wesher is an auto-assembling wireguard mesh tool that works like this:
- Start up wesher, which brings up a wireguard interface and assigns it a private IP derived from your hostname
- wesher - along with a secret - talks to some common discovery point
- You join a common mesh network with all other nodes
And bam, encrypted private network mesh. The “doesn’t scale” part is that, if addresses are derived from hostnames, there’s an extant risk of address collisions if names “hash” to the same value in a subnet. Granted, it’s not huge, but it’s there. But I’m not going anywhere above thirty hosts! Or forty, maybe. Fifty.
To demonstrate this, consider the only two requisite configuration files: the systemd unit,
[Unit]
Description=wesher - wireguard mesh builder
After=network-online.target
[Service]
EnvironmentFile=-/etc/default/wesher
ExecStart=/usr/bin/wesher
Restart=on-failure
Type=simple
[Install]
WantedBy = multi-user.target
…and the environment file:
WESHER_CLUSTER_KEY=<snip>
WESHER_OVERLAY_NET=10.100.0.0/16
WESHER_NO_ETC_HOSTS=true
WESHER_JOIN=<endpoint>
We set up wesher on each node in the Nomad cluster and the host that runs the reverse proxy. As with some of my other technology choices, there are a variety of solutions you could probably pick here2 - and it doesn’t look like wesher is under super-active development, but again - it’s simple, so there’s not a ton that can go wrong.
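A quick sanity check that the mesh is actually up - the interface name comes from the doc’s wgoverlay configuration, and the peer address here is just an illustrative member of the 10.100.0.0/16 overlay:
# show the wireguard interface, its peers, and recent handshakes
wg show wgoverlay
# confirm another node is reachable over the overlay
ping -c 1 10.100.4.12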
Container Orchestration
There’s a minor change required for Nomad workloads to ensure that containers join the encrypted mesh, and it looks like this:
network {
  port "http" {
    host_network = "mesh"
  }
}
This instructs Nomad to bind the workload’s port to a host network that I define in each node’s Nomad configuration:
"host_network": {
"mesh": {
"interface": "wgoverlay"
}
}
Not a lot, but it ensures that communication in and out of workloads happens over an encrypted connection. Moreover, Nomad factors this in when adding services to Consul’s catalog, so once this change takes effect, service endpoints are registered with their private wireguard IPs, and whatever system we’re reverse proxying with simply uses those addresses when assembling its routes.
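For reference, the registration side of a job looks roughly like this - a sketch using the whoami service that shows up in the rendered config later; the "caddy" tag is what opts a service into the proxy template below, and "public" would mark it reachable from outside the LAN:
service {
  name = "whoami"
  port = "http"
  # "caddy" opts this service into the generated reverse proxy;
  # add "public" to skip the LAN-only restriction
  tags = ["caddy"]
}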
You end up with these interfaces scattered among every cluster member:
# ip addr show wgoverlay
5: wgoverlay: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN group default
link/none
inet 10.50.123.123/32 brd 10.50.83.255 scope global wgoverlay
valid_lft forever preferred_lft forever
Routing
Time for Caddy.
Caddy doesn’t natively know how to build a routing configuration like Traefik can (though Matt has commented about it).
Fortunately, building on Caddy’s simple syntax can make for a very concise consul-template configuration.
The strategy becomes: run consul-template on the HTTP box, generate an ad-hoc Caddyfile, and pretend that Caddy is just creating all of this dynamically for us, populated off of consul’s service catalog.
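Wiring that up is just a template stanza in consul-template’s config - a sketch; the file paths and reload command are assumptions about how you run Caddy (mine assume a systemd-managed Caddy):
template {
  # render the Caddyfile from the template below and reload Caddy on change
  source      = "/etc/consul-template.d/Caddyfile.ctmpl"
  destination = "/etc/caddy/Caddyfile"
  command     = "/usr/bin/systemctl reload caddy"
}
The interesting part is the template body itself.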
What does this look like? Here’s a slightly-cleaned-up version of what I ended up writing:
{{- range services -}}
{{- range service .Name -}}
{{- if (.Tags | contains "caddy") -}}
{{- scratch.MapSetX "vhosts" .Name true -}}
{{- if .Tags | contains "public" }}
{{- scratch.MapSet "vhosts" .Name false -}}
{{- end -}}
{{- end -}}
{{- end -}}
{{- end -}}
{
http_port 80
https_port 443
acme_ca "https://acme-v02.api.letsencrypt.org/directory"
storage "consul" {
address "127.0.0.1:8500"
prefix "caddytls"
}
}
https://*.example.com {
{{ range $vhost, $private := scratch.Get "vhosts" }}
@{{ $vhost }} host {{ $vhost }}.example.com
handle @{{ $vhost }} {
{{- if $private }}
@blocked not remote_ip 192.168.1.0/24
respond @blocked "Access denied" 403
{{- end }}
{{- range services }}
{{- range service .Name }}
{{- if (and (.Tags | contains "caddy") (eq .Name $vhost)) }}
{{- if index .ServiceMeta "path" }}
reverse_proxy {{ index .ServiceMeta "path" }} http://{{ .Address }}:{{ .Port }}
{{- else }}
reverse_proxy http://{{ .Address }}:{{ .Port }}
{{- end }}
{{- end }}
{{- end }}
{{- end }}
}
{{ end }}
handle {
abort
}
tls {
dns tylerjl-route53
resolvers 1.1.1.1
}
}
The first block of template logic establishes a go template variable called vhosts that maps each virtual host name to a boolean indicating whether it should be considered private; that is, whether it should only permit local traffic (in case this proxy receives port-forwarded traffic through a router).
The global options block that follows holds Caddy settings, including a directive to store Let’s Encrypt cert data in Consul, which makes this proxy stateless.
The https://*.example.com block asks for a wildcard to serve up vhosts for our domain, and the loop inside it walks the services present in consul’s catalog and sets up reverse proxies for each.
You could make this much prettier if consul-template included sprig libraries, but those functions aren’t there yet.
You can see from the reverse_proxy lines that it’s pretty easy to make this system flexible; I can add something like the following to a Nomad job definition to ask, for example, that only requests matching a certain path get proxied to that service’s backend.
meta {
path = "/subroute"
}
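With that meta in place, the rendered route picks up a path matcher - using the same illustrative backend address as the rendered example below, it comes out something like:
reverse_proxy /subroute http://10.50.123.123:28518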
Some more notes: this configuration requires two plugins for Caddy:
- caddy-tlsconsul to store ACME data in Consul - very convenient, and it lets your proxy “float” around without any storage following it
- caddy-route53 - I had to fork this to be able to specify a specific Route 53 Zone ID, but that’s the only difference between it and the upstream source.
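If you’re building a similar Caddy binary, xcaddy is the usual route; the module paths below are my best guess at the upstream sources (swap in a fork where needed - mine registers the DNS provider as tylerjl-route53):
# build a Caddy binary with the Consul storage and Route 53 DNS plugins
xcaddy build \
  --with github.com/pteich/caddy-tlsconsul \
  --with github.com/caddy-dns/route53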
Magic!
When consul-template kicks out a new Caddyfile, it looks sort of like this:
{
http_port 80
https_port 443
acme_ca "https://acme-v02.api.letsencrypt.org/directory"
storage "consul" {
address "127.0.0.1:8500"
prefix "caddytls"
}
}
https://*.example.com {
@whoami host whoami.example.com
handle @whoami {
@blocked not remote_ip 192.168.1.0/24
respond @blocked "Access denied" 403
reverse_proxy http://10.50.123.123:28518
}
handle {
abort
}
tls {
dns tylerjl-route53
resolvers 1.1.1.1
}
}
…and with this, Caddy will manage certificate provisioning and renewal along with the modern conveniences you’d expect from a reverse proxy, like websocket support. At ~50 lines, the template is also pretty easy to adapt if you need additional functionality, like adding or removing headers.
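As one example of such a tweak (an assumption about what you might want, not something the template above does), dropping a couple of header directives into each vhost’s handle block is all it takes:
# hide Caddy's default Server header and add a clickjacking guard
header -Server
header X-Frame-Options "DENY"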
Results
Overall this setup has replaced my aging Traefik v1 deployment, and I’ve both torn down and stood up new services in my Nomad cluster and watched as vhosts “magically” appear in my private LAN domain (that is, I nomad run thing.nomad, then a minute later start using it at https://thing.example.com).
In addition, there are a few new “features” I’ve begun to use in my homelab.
I still have a number of “legacy” services that I don’t operate in a container scheduler (for example, I run Transmission on a single host). Instead of running a bespoke nginx reverse proxy on that individual host, I now register the listening Transmission web port in consul and let Caddy serve as the central HTTP endpoint for any web services in my homelab. By dropping a systemd service like this onto the host, a route appears in Caddy when the host comes up and the service starts, and disappears when it stops.
[Unit]
Description=consul catalog registrar for transmission
Requisite=consul.service transmission.service wesher.service
After=consul.service transmission.service wesher.service
BindsTo=transmission.service wesher.service
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/env sh -c '\
addr=$(ip -4 addr show wgoverlay | awk \'$1 == "inet" { split($2, a, "/"); print a[1] }\'); \
consul services register -name=transmission -address=$addr -tag=caddy -port=9091 '
ExecStop=/usr/bin/consul services deregister -id=transmission
[Install]
WantedBy=transmission.service
I really like this because I only have to concern myself with configuring (and securing) one reverse proxy, and it becomes very easy to “expose” running services on any host in one place, with all the necessary TLS pieces ready to go.
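To confirm a registration actually landed, either of these works from any cluster member (assuming Consul’s DNS interface is on its default port, 8600):
# list everything in the catalog
consul catalog services
# or resolve the service directly through Consul DNS
dig +short @127.0.0.1 -p 8600 transmission.service.consul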
This also offers a lot of flexibility in terms of where the HTTP endpoint lives in my lab. For example, this consul-template configuration can dynamically update my dnsmasq instance to point DNS at wherever Caddy happens to be running - say, if Nomad moves it to another host while I’m performing maintenance on the machine it was running on before.
template {
contents = <<EOF
{{- range services }}{{- if (.Tags | contains "caddy") }}
address=/{{ .Name }}.example.com/192.168.1.1
{{- end }}
{{- end }}
EOF
command = "/usr/bin/systemctl reload dnsmasq"
destination = "/etc/dnsmasq.d/web.conf"
}
Conclusion
There are surely myriad ways you could solve this, but the combination of consul, wireguard, nomad, and caddy resolves the outstanding concerns (dynamic updates, TLS management, cluster capability, secure backend communication) with individually simple parts and the ability to extend into other systems with relative ease.
-
I understand that, because the Container Scheduler wars have largely been fought and won by kubernetes, there’s bound to be all sorts of takes about this. Believe what you will, but as someone who has operated both systems, the two aren’t really comparable along the complexity axis. If you’re a kubernetes devotee, do not feel hurt, you get more features. But Nomad is the more “assemble small pieces” system. ↩
-
I know about tailscale! It sounds great! But I also like to segment off portions of my lab from external dependencies when possible, and Tailscale (understandably) runs a central, remote endpoint to operate all the bits and pieces that permit discovery to happen. Wesher isn’t as fully featured, but literally all you need is two wesher processes to make the mesh work, and you’re done. ↩