Context
Often, I want to play with a Kubernetes cluster without having to pay a cloud provider for compute or set up a home lab cluster with kubeadm. At times like these, I reach for K8s Kind (although I'd love to have a home lab cluster).
Simply put, Kind allows me to run a small Kubernetes cluster within Docker/Podman/you choose. Whilst the number of pods I can run is limited by the power of my machine, it's great for running small proof-of-concepts (PoCs) and exploring the configuration of a product before deploying it to a real cluster.
I do have a small tip that I can share with you now to increase that pod limit if you're on a Debian-based machine:
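Something along these lines (the values here are the inotify limits suggested in the Kind documentation's known-issues page):

sudo sysctl fs.inotify.max_user_watches=524288
sudo sysctl fs.inotify.max_user_instances=512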
These commands boost the number of file watchers available for the duration of your session, returning those values to their defaults once you reboot. More file watchers = more pods.
If Docker/Others were already running before you ran these commands, you might want to restart those services.
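On a systemd-based machine that restart is a one-liner (adjust for whichever runtime you're using):

sudo systemctl restart docker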
But that's not what I really want to share...
Problem
Recently, I decided to spend far too much time writing YAML files while playing with the Argo Project, mostly to explore some features and generally play with GitOps-style processes. As part of my test-bed setup, I deployed Forgejo to replicate a scenario where there's a Git server and container registry hosted alongside the GitOps tooling.
My goals and reasons aren't really important here. The problem I came up against appeared only after I had code repos set up, events firing off workflows to build and push container images, and finally automatic deployment to a target Kubernetes namespace.
All of that worked, as defined in the glorious YAML files I had painstakingly created. The problem was at deployment time: ImagePullBackOff - Kubernetes couldn't pull my container images because it couldn't verify the self-signed SSL certificate I'd used for the Forgejo Ingress rule.
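If you want to see the reason behind an ImagePullBackOff for yourself, the pod's events spell it out (the pod and namespace names below are placeholders for whatever your deployment uses):

kubectl get pods -n my-namespace
kubectl describe pod my-pod -n my-namespace   # the Events section shows why the image pull failed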
How did I get around that problem? Well, first, I destroyed my Kind cluster.
Was I having a tantrum? Maybe. Would I have needed to nuke everything I created anyway? Unfortunately so.
Resolution
The reason I had to destroy my Kind cluster is that the container image used by Kind as a virtual Kubernetes node doesn't seem to contain a text editor. No Vim, no Vi, not even Nano. So I couldn't simply shell into the node, make a few edits and service containerd restart my way to happiness.
The first modification I needed to make was in the ContainerD config file, typically located at /etc/containerd/config.toml on a Kubernetes node.
A bit of reading pointed me to a way to modify this file upon cluster creation. Cool. I modified my Kind cluster definition (another YAML file) to look a little like this:
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: plaything
containerdConfigPatches:
- |-
  [plugins."io.containerd.grpc.v1.cri".registry]
    config_path = "/etc/containerd/certs.d"
nodes:
- role: control-plane
  extraPortMappings:
  - containerPort: 80
    hostPort: 80
    protocol: TCP
  - containerPort: 443
    hostPort: 443
    protocol: TCP
The magic is under the containerdConfigPatches key. It's basically telling ContainerD to look for per-registry configuration under the directory defined by config_path.
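For reference, ContainerD expects one sub-directory per registry hostname under that path, each holding a hosts.toml, so the layout we're building towards inside the node looks like this:

/etc/containerd/certs.d/
└── git.benzo.test/
    └── hosts.toml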
The next thing I needed to do was to create some config under that config_path for my self-hosted container registry. I had my Forgejo instance running under the git.benzo.test hostname, thanks to some Dnsmasq magic that could have also been achieved with an extra entry in my machine's hosts file.
Note: git.benzo.test is a local-only hostname, which isn't internet facing - it won't work for you.
I'm telling you the hostname because it defines the location and the contents of the next file we need to create: /etc/containerd/certs.d/<hostname>/hosts.toml. Because I wanted this configuration to be repeatable, I chose to create this file locally and then mount it to the Kind node at cluster creation time.
I fired up my text editor and smashed the following keys:
= "git.benzo.test"
[]
= ["pull", "resolve"]
= true
This config tells ContainerD, "Hey! There's this server...you can pull and resolve container images there...but keep your eyes off my dodgy self-signed SSL certificate. It's good, I promise."
The last thing I needed to do was to place that config file in the right location, via an additional modification to my Kind cluster definition:
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: plaything
containerdConfigPatches:
- |-
  [plugins."io.containerd.grpc.v1.cri".registry]
    config_path = "/etc/containerd/certs.d"
nodes:
- role: control-plane
  extraMounts:
  - hostPath: /home/ben/code/kind/my-cr.toml
    containerPath: /etc/containerd/certs.d/git.benzo.test/hosts.toml
  extraPortMappings:
  - containerPort: 80
    hostPort: 80
    protocol: TCP
  - containerPort: 443
    hostPort: 443
    protocol: TCP
Focusing on the extraMounts section, I'm instructing Kind to take a file from my local filesystem (hostPath) and place it in a specific location on the Kubernetes node (containerPath).
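If you want to double-check the mount once the cluster is up, Kind names the node container after the cluster (plaything in the definition above), so something like this should show the file in place:

docker exec plaything-control-plane cat /etc/containerd/certs.d/git.benzo.test/hosts.toml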
With all that done, I spooled up my cluster with just go. Thankfully, I had configured all the deployments as ArgoCD applications, which meant that once ArgoCD was deployed, everything else I had previously configured and deployed just came back to life, without any additional command-line input from myself. The container images hosted in Forgejo pulled without any issues this time around.
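If you're not wrapping things in a command runner like I am, creating the cluster from that definition is a one-liner (the file name is whatever you saved the cluster definition as):

kind create cluster --config kind-cluster.yaml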
For a Real Cluster
I would not recommend using the above hosts.toml for a production deployment. Blindly trusting whatever SSL certificate a server presents can open you up to a number of exploits.
It's quite common for enterprises to have their own SSL CA certificate and require it to be used for signing your Ingress SSL certs, instead of those you might get from a public CA like Let's Encrypt.
For this scenario, I'd suggest creating (a rough sketch follows the list):
- A Secret
  - Containing the SSL certificate which needs to be trusted by ContainerD
- A ConfigMap containing a shell script to:
  - Modify the ContainerD config.toml if necessary
  - Copy over the SSL cert to a location on a Kubernetes Node
  - Create the required hosts.toml file (echo 'blah' > /path/to/hosts.toml)
- A DaemonSet (Runs on each Node)
  - With the above Secret and ConfigMap mounted
  - Runs the script defined in the above ConfigMap
  - Note: You may need to configure the DaemonSet with tolerations if your nodes have taints (link)
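To make that a bit more concrete, here's a minimal sketch of how the three pieces could hang together. Everything in it is a placeholder of my own making (the names, the kube-system namespace, the git.example.test hostname, the busybox and pause images), it assumes the node's ContainerD already has config_path = "/etc/containerd/certs.d" set, and it skips the config.toml edit entirely - treat it as a starting point, not a drop-in manifest:

apiVersion: v1
kind: Secret
metadata:
  name: registry-ca
  namespace: kube-system
stringData:
  ca.crt: |
    -----BEGIN CERTIFICATE-----
    ...your CA certificate here...
    -----END CERTIFICATE-----
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: registry-trust-script
  namespace: kube-system
data:
  setup.sh: |
    #!/bin/sh
    set -eu
    # The node's /etc/containerd is mounted into this pod at /host/etc/containerd
    HOST_DIR=/host/etc/containerd/certs.d/git.example.test
    mkdir -p "${HOST_DIR}"
    cp /certs/ca.crt "${HOST_DIR}/ca.crt"
    cat > "${HOST_DIR}/hosts.toml" <<'EOF'
    server = "git.example.test"

    [host."git.example.test"]
      capabilities = ["pull", "resolve"]
      ca = "ca.crt"
    EOF
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: registry-trust
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: registry-trust
  template:
    metadata:
      labels:
        app: registry-trust
    spec:
      tolerations:
      - operator: Exists          # run on tainted nodes (e.g. control-plane) too
      initContainers:
      - name: setup
        image: busybox:1.36
        command: ["sh", "/scripts/setup.sh"]
        volumeMounts:
        - name: script
          mountPath: /scripts
        - name: ca-cert
          mountPath: /certs
        - name: containerd-etc
          mountPath: /host/etc/containerd
      containers:
      - name: pause               # keeps the pod alive once setup has run
        image: registry.k8s.io/pause:3.9
      volumes:
      - name: script
        configMap:
          name: registry-trust-script
      - name: ca-cert
        secret:
          secretName: registry-ca
      - name: containerd-etc
        hostPath:
          path: /etc/containerd
          type: Directory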
Finally, I would be using a hosts.toml file that looks a little like this:
= "hostname"
[]
= ["pull", "resolve"]
= "path/to/cert.pem"
This file tells ContainerD, "Hey! There's this server...you can pull and resolve container images there...it has an SSL certificate you might not know about but don't worry, here's a pem file you can use for validation."
The path to the cert.pem file can be relative to the hosts.toml file. I would recommend storing the certificate alongside the hosts.toml file, in the same directory, to keep things visible.
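On a node, that ends up looking something like this (the hostname and file names are illustrative), in which case the ca value would simply be "cert.pem":

/etc/containerd/certs.d/
└── registry.example.com/
    ├── cert.pem
    └── hosts.toml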
A little more reading on the use of this file can be found here.