Kubernetes The Harder Way Explorations
I’m trying to bring up a test Kubernetes cluster (just for learning) by following the Kubernetes The Hard Way steps, with one big twist… I want to do this on my M2 MacBook (arm64-based).
From the GitHub page, this is what they say about the steps for doing this:
This tutorial requires four (4) ARM64 based virtual or physical machines connected to the same network. While ARM64 based machines are used for the tutorial, the lessons learned can be applied to other platforms.
- Prerequisites (1)
- Setting up the Jumpbox (2)
- Provisioning Compute Resources (3)
- Provisioning the CA and Generating TLS Certificates (4)
- Generating Kubernetes Configuration Files for Authentication (5)
- Generating the Data Encryption Config and Key (6)
- Bootstrapping the etcd Cluster (7)
- Bootstrapping the Kubernetes Control Plane (8)
- Bootstrapping the Kubernetes Worker Nodes (9)
- Configuring kubectl for Remote Access (10)
- Provisioning Pod Network Routes (11)
- Smoke Test (12)
- Cleaning Up (13)
Try #1: Docker containers (failed)
Initially, I thought I would just use Docker on the Mac to create the four nodes used for this. It started out pretty well: provisioning, creating certs, creating and copying config files, and starting etcd.
I even optimized the process a bit with:
- Dockerfile to build nodes with the needed packages.
- Script to generate SSH keys for all nodes.
- Script to run a container for each node with its IP address, and to define /etc/hosts on all nodes with FQDNs and IPs.
- Script to distribute SSH keys and known hosts info.
- Several scripts that just contain the commands needed for each step, and any ssh commands to move from node to node.
My first problem occurred when trying to bring up the control plane (step 8) and starting its services. The issue was that my nodes (I tried debian:bookworm and ubuntu:noble bases) did NOT have systemd running. My guess was that this is because the Docker containers use the same kernel as my host, which does not run systemd for them.
Initially, I tackled this by using a SysV init script template and filling it out with info from the needed systemd .service files. I placed the arguments in an /etc/default/SERVICE_NAME file, sourced that in the script, and used a variable to add them to the command. Services were coming up for the control plane node, and it was looking good.
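To illustrate that approach, here is a simplified sketch of the kind of wrapper I was writing (not my actual template; the service name and paths are illustrative, and the start_daemon/killproc helpers come from the LSB init functions):
#!/bin/sh
# /etc/init.d/kube-apiserver -- hand-written SysV wrapper (sketch)
. /lib/lsb/init-functions
# OPTIONS holds the arguments pulled from the systemd .service file
[ -r /etc/default/kube-apiserver ] && . /etc/default/kube-apiserver
DAEMON=/usr/local/bin/kube-apiserver
PIDFILE=/var/run/kube-apiserver.pid
case "$1" in
  start)
    log_daemon_msg "Starting kube-apiserver"
    start_daemon -p $PIDFILE $DAEMON $OPTIONS
    log_end_msg $?
    ;;
  stop)
    log_daemon_msg "Stopping kube-apiserver"
    killproc -p $PIDFILE $DAEMON
    log_end_msg $?
    ;;
  *)
    echo "Usage: $0 {start|stop}"
    exit 1
    ;;
esac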
When I got to the next step (9), bringing up the first worker node, the systemd unit file for containerd had two directives for starting the service:
ExecStartPre=/sbin/modprobe overlay
ExecStart=/bin/containerd
This was the second problem. I found an old GitHub repo with a tool to convert systemd unit files to SysV init scripts. Unfortunately, it was very old and written for Python 2. I ran 2to3 on it, so it would work with the Python 3.13.1 that I’m using, and made a few changes (some import changes, print statement syntax, and some mixed tab/space indentation issues). The script ran and created a file that “looked” OK, so I ran it on the systemd unit files that I had for the workers.
However, there were two concerns with the results. One was that, for a systemd unit with arguments, lines were converted from this:
ExecStart=/usr/local/bin/kubelet \
--config=/var/lib/kubelet/kubelet-config.yaml \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--register-node=true \
--v=2
to this:
start_daemon -p $PIDFILE /usr/local/bin/kubelet \
start_daemon -p $PIDFILE -config=/var/lib/kubelet/kubelet-config.yaml \
start_daemon -p $PIDFILE -kubeconfig=/var/lib/kubelet/kubeconfig \
start_daemon -p $PIDFILE -register-node=true \
start_daemon -p $PIDFILE -v=2
Now, I don’t know much about SysV init scripts, but I’m wondering if this conversion is correct. Since the backslashes continue a single command line, I’d expect one start_daemon invocation carrying all the arguments, not one per line. With the scripts I did manually, I had the service name and then ${OPTIONS} holding all the args. I figured I’d just try it and see if it works.
The other concern was that the first service I need to apply has both an ExecStartPre and an ExecStart line:
ExecStartPre=/sbin/modprobe overlay
ExecStart=/bin/containerd
It was converted to this:
start_daemon /sbin/modprobe overlay
...
start_daemon -p $PIDFILE /bin/containerd
I was eager to see if that would work, so I gave it a try. This is when I hit the third, and fatal, problem: there was no modprobe command. I installed kmod so that the command worked, but found that there was no overlay module. Running lsmod did not list the overlay module.
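For reference, this is roughly the sequence inside a container that exposed the problem (assuming a Debian/Ubuntu base image):
apt-get update && apt-get install -y kmod   # provides modprobe and lsmod
lsmod | grep overlay                        # no output: module not loaded
modprobe overlay                            # fails: no overlay module available to the container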
My thought here was that, because these Docker containers share their kernel with the Docker Desktop VM rather than having one of their own, the module was not available to load. I may be wrong, but I think this method is sunk.
Try #2: Virtualization (failed)
It looks like there are a lot of choices here: Parallels (commercial), VirtualBox, QEMU, etc. I found a link about ways to run virtualization on an arm64-based Mac. I have VirtualBox on my Mac already, but I was intrigued by UTM, which is a virtualization/emulation app for iOS and macOS, and is based on QEMU.
I created a GitHub repo that has supporting scripts to make setting up the environment easier. For this attempt, I used the tag “initial-try” in the repo.
Prerequisites
This assumes you are on an arm64-based Mac (M1 or later), have a DockerHub account for building/pushing images, and have downloaded the Ubuntu 24.04 server ISO (or whatever you want to use, though some modifications to the steps may be needed).
Prep Work
I installed UTM following their instructions, and then created a new virtual machine (virtualization, not emulation), using the Ubuntu 24.04 ISO image I had laying around. I kept the default 4 GB RAM and 64 GB disk, and added a directory on my host to use as a shared area, in case I wanted to transfer files to/from the host. You can skip this, if desired.
Before starting the VM, I opened the settings, went to the network settings (which are set to shared network), and clicked on the “Show Advanced Settings” checkbox.
I told it to use the network 10.0.0.0/24, with DHCP handing out IPs from 10.0.0.1 to 10.0.0.100. You could leave it as-is, if desired; I just wanted an easy-to-type IP.
I ran the Ubuntu install process (selecting to use the Ubuntu updates that are available). I named the host “utm” and set up the disk with LVM (the default). As part of the process, I selected to install OpenSSH and import my SSH public key, so I can SSH in without a password, and to install Docker.
Upon completion, I stopped the VM, went to the UTM settings for the VM, and cleared the ISO image from the CD drive so that it would not boot into the installer again. I restarted the VM and verified that I could log in and SSH in to the IP 10.0.0.3. I ran “lsmod” and could see the “overlay” module, so that was a good sign.
To complete the setup of the shared area, I did the following commands:
sudo mkdir /mnt/utm
As root, edit /etc/fstab and add:
# Share area
share /mnt/utm 9p trans=virtio,version=9p2000.L,rw,_netdev,nofail,auto 0 0
You can update the mount with:
sudo systemctl daemon-reload
sudo mount -a
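To see the numeric UID and GID that the bindfs remap below needs, list the mount with numeric IDs (the values shown will be your Mac’s):
ls -ln /mnt/utm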
You can now see the shared area files under /mnt/utm. Note the ownership and group of the files, which will match those on the Mac, which is likely not what you want. To make them match this VM (while staying the same on the host), we’ll make another mount; first, make sure you have bindfs installed:
mkdir ~/share
sudo apt-get install bindfs -y
Add the following to /etc/fstab, substituting the macOS owner (UID) and group (GID) values that you noted above, and your username for the account you are on:
# bindfs mount to remap UID/GID
/mnt/utm /home/USERNAME/share fuse.bindfs map=UID/1000:@GID/@1000,x-systemd.requires=/mnt/utm,_netdev,nofail,auto 0 0
For me, my UID was already 1000 (I had changed it on my Mac so that it matches the Linux systems I have; normally it is something like 501 or 502), and my GID was 20.
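With those values, and pcm as the username, the filled-in line would look like this (adjust for your own UID, GID, and username):
/mnt/utm /home/pcm/share fuse.bindfs map=1000/1000:@20/@1000,x-systemd.requires=/mnt/utm,_netdev,nofail,auto 0 0
Update the mount again: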
sudo systemctl daemon-reload
sudo mount -a
You should now see the files in ~/share with the user and group matching your VM account. One of the files in this shared area is a file called “pcm” (the username I’m using), containing:
pcm ALL=(ALL) NOPASSWD: ALL
I did “sudo cp share/pcm /etc/sudoers.d/” so that, from now on, I don’t need a password for sudo commands. If you want, you can just create a file named after your username (with your username as the first word) and place it in /etc/sudoers.d/.
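If you don’t have such a file handy, the equivalent can be created directly (a one-liner sketch; adjust the username, and note the 440 permissions that sudo expects):
echo "$USER ALL=(ALL) NOPASSWD: ALL" | sudo tee /etc/sudoers.d/$USER
sudo chmod 440 /etc/sudoers.d/$USER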
Next, I want to set up Docker so that I can run it without sudo. This assumes you already have an account on DockerHub and have set up a passkey for logging in via the command line. I used these commands to set up the docker group access:
sudo groupadd docker
sudo usermod -aG docker $USER
sudo gpasswd -a $USER docker
newgrp docker
I had to reboot (they say to log out and back in, but that did not work for me).
Lastly, I installed tools I wanted for development (emacs, ripgrep).
Trying The Hard Way (Again)…
I’m now ready to give this a go. To review, here are the steps they have…
- Prerequisites (1)
- Setting up the Jumpbox (2)
- Provisioning Compute Resources (3)
- Provisioning the CA and Generating TLS Certificates (4)
- Generating Kubernetes Configuration Files for Authentication (5)
- Generating the Data Encryption Config and Key (6)
- Bootstrapping the etcd Cluster (7)
- Bootstrapping the Kubernetes Control Plane (8)
- Bootstrapping the Kubernetes Worker Nodes (9)
- Configuring kubectl for Remote Access (10)
- Provisioning Pod Network Routes (11)
- Smoke Test (12)
- Cleaning Up (13)
On the VM, pull my repo:
git clone https://github.com/pmichali/k8s-the-hard-way-on-mac.git
cd k8s-the-hard-way-on-mac
Before starting, set your Docker user ID, so it can be referenced in the scripts:
export DOCKER_ID=YOUR_DOCKER_USERNAME
First off, I wanted to create all the ssh keys for each node, build a docker image to use for the nodes, and create a network 10.10.10.0/24:
./prepare.bash
./build.bash
All four of the nodes are created as docker containers with:
./run.bash jumpbox
./run.bash server
./run.bash node-0
./run.bash node-1
Now we have the four machines running as Docker containers for step 1 (Prerequisites) of the process. The architecture is aarch64, when checking with “uname -mov”. Before continuing on, we’ll set up the known hosts and SSH keys on all the nodes so that we can SSH without passwords:
./known-hosts.bash
We’ll also copy over scripts to various nodes, so that we can run them. These scripts are the commands mentioned in the various steps of “Kubernetes The Hard Way”:
./copy-scripts.bash
For step 2 (Setting up the Jumpbox), we’ll access the jumpbox container, clone the kelseyhightower/kubernetes-the-hard-way repo, and install kubectl using the following command:
docker exec jumpbox /bin/bash -c ./jumpbox-install.bash
Because of the earlier steps we did to set up known_hosts and authorized_keys on each node and to configure /etc/hosts, there is nothing to do for step 3 (Provisioning Compute Resources) of the process. You can verify that you can SSH into any node from the jumpbox by accessing it with the following command and then trying to SSH to other nodes by name (e.g. ssh node-1):
docker exec -it jumpbox /bin/bash
For step 4 (Provisioning the CA and Generating TLS Certificates), run these scripts from the jumpbox:
./CA-certs.bash
./distribute-certs.bash
For step 5 (Generating Kubernetes Configuration Files for Authentication), run these scripts from the jumpbox:
./kubeconfig-create.bash
./distribute-kubeconfigs.bash
For step 6 (Generating the Data Encryption Config and Key), run this script on jumpbox:
./encryption.bash
For step 7 (Bootstrapping the etcd Cluster), run this script on jumpbox:
./etcd-files.bash
Then, from the jumpbox, ssh into the server and run the script to startup the service for etcd:
ssh server
./etcd-config.bash
This FAILED as, for some reason, systemd is not running on any of the docker containers, even though it is on the host VM.
I did a little bit of research, and I see that Docker does not normally run systemd, as the expectation is that a container will be running one service (not multi-service, like on the host). I see some “potential” solutions…
One is to run the container with a “systemd” replacement as the running process (command). This would handle the systemctl start/stop operations, and it reads and processes the corresponding systemd unit files for the services that are started. It’s detailed here, and seems like maybe the most straightforward option. I haven’t tried this, but I think it would have to be done from the UTM virtual machine, so that we also have the overlay module that is needed.
A second is to run systemd in the container. It looks like that requires a bunch of things: installing systemd, using /sbin/init as the command to run, volume-mounting several paths from the host (so I think it would still require running from a VM), and running in privileged mode. Several posts indicate different methods. I haven’t tried this either.
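As a rough sketch of what that second option might look like (untested here; the image name is made up, and the exact mounts and flags vary between the posts I saw):
# Assumes an image "my-systemd-node" that has systemd installed; boot it as PID 1:
docker run -d --name server \
  --privileged \
  --cgroupns=host \
  -v /sys/fs/cgroup:/sys/fs/cgroup:rw \
  --tmpfs /run --tmpfs /run/lock \
  my-systemd-node /sbin/init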
A third way would be a more heavyweight solution of running VMs for each of the nodes, so that systemd is running and the overlay module is present. Fortunately, I think I have found another way that may work…
Try #3: Podman (failed)
When looking for solutions for how to run systemd in a Docker container, I saw mention that podman supports running systemd inside containers by default, so I wanted to give it a try. Many of the commands are the same as Docker, so it would be easy to set up.
Before doing this in the UTM VM that I had, I decided to just try it from my Mac host. I installed the CLI version of podman on my Mac from https://podman.io/. Next, I updated the files in my GitHub repo to refer to podman, and to alter the container image specification that I was using.
With these changes, I was ready to give another try…
Kubernetes The Hard Way (yet again)
In the GitHub repo, there is a machines.txt file with a list of all the nodes with IP, FQDN, node name, and pod subnet (if applicable). This is read by the scripts to configure nodes as needed through the process. Let’s get started with the steps for the Kubernetes The Hard Way tutorial (listed above). They will be very similar to try #2.
For step 1 (Prerequisites), store your DockerHub user id in an environment variable for use by some of the scripts:
export DOCKER_ID=YOUR_DOCKER_ID
Create and startup the podman machine with the following script:
./init.bash
Note: If you happen to be running Docker Desktop, the podman machine setup will indicate that you can set the DOCKER_HOST environment variable so that Docker API clients can use the podman machine. In that case, just copy and paste the export command shown.
Create the ssh keys for all the nodes, build the image to be used by nodes (along with all the SSH keys generated), and create the network:
./prepare.bash
./build.bash
There will now be a container image named localhost/${DOCKER_ID}/node:v0.1.0 in the local registry. The four containers can now be created, using the container image:
./run.bash jumpbox
./run.bash server
./run.bash node-0
./run.bash node-1
When started, the container will get an IP, name, and FQDN from the machines.txt file, and will rename the public and private keys for the specific node to id_rsa.pub and id_rsa.
One issue is that, because openssh-server is installed as part of the container image, every container created will have the same host keys. To make unique keys, we’ll run “ssh-keygen -A” on each node to generate new host keys. Then, the host keys will be collected and placed into the ~/.ssh/known_hosts file on each node, so that we can SSH from node to node easily. Run the following to make those changes:
./known-hosts.bash
Note: You could probably remove the sshd install from the Containerfile that builds the container image, and instead install sshd after each node is running, so that unique host keys get created. Then, the above script could just collect the host keys and build the known_hosts file, without having to delete and regenerate host keys.
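For reference, the general shape of that fix-up is something like this (a sketch only, not the actual contents of known-hosts.bash; the node names really come from machines.txt):
nodes="jumpbox server node-0 node-1"
# Regenerate unique host keys on every node (sshd may need a restart to pick them up)
for node in $nodes; do
  podman exec "$node" bash -c 'rm -f /etc/ssh/ssh_host_* && ssh-keygen -A'
done
# On each node, collect every node's host key into ~/.ssh/known_hosts
for node in $nodes; do
  podman exec "$node" bash -c "ssh-keyscan -H $nodes > /root/.ssh/known_hosts"
done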
There are podman commands to see what has been created so far:
podman network ls
podman network inspect k8snet
podman ps -a
At this point, we can copy the scripts I created to the various nodes:
podman cp CA-certs.bash jumpbox:/root/
podman cp distribute-certs.bash jumpbox:/root/
podman cp kubeconfig-create.bash jumpbox:/root/
podman cp distribute-kubeconfigs.bash jumpbox:/root/
podman cp encryption.bash jumpbox:/root/
podman cp etcd-files.bash jumpbox:/root/
podman cp etcd-config.bash server:/root/
These scripts are just the commands listed in the tutorial, so that you can run the script, instead of copy and pasting all the commands in the steps.
For step 2 (Setting up the Jumpbox), we’ll access the jumpbox container, clone the kelseyhightower/kubernetes-the-hard-way repo, and install kubectl using the following command:
podman exec jumpbox /bin/bash -c ./jumpbox-install.bash
Because of the earlier steps we did to set up known_hosts and authorized_keys on each node and to configure /etc/hosts, there is nothing to do for step 3 (Provisioning Compute Resources) of the process. You can verify by accessing a node, like the jumpbox, with the following command, and then trying to SSH to other nodes by name (e.g. ssh node-1):
podman exec -it jumpbox /bin/bash
For step 4 (Provisioning the CA and Generating TLS Certificates), run these scripts from the jumpbox:
./CA-certs.bash
./distribute-certs.bash
For step 5 (Generating Kubernetes Configuration Files for Authentication), run these scripts from the jumpbox:
./kubeconfig-create.bash
./distribute-kubeconfigs.bash
For step 6 (Generating the Data Encryption Config and Key), run this script on jumpbox:
./encryption.bash
For step 7 (Bootstrapping the etcd Cluster), run this script on jumpbox:
./etcd-files.bash
Then, from the jumpbox, ssh into the server and run the script to startup the service for etcd:
ssh server
./etcd-config.bash
For step 8 (Bootstrapping the Kubernetes Control Plane), return to the jumpbox and run the script to place the API Server, Controller Manager, and Scheduler onto the server node:
exit
./push-controller-settings.bash
Then, from the jumpbox, ssh into the server and start up these three services:
ssh server
./bootstrap-controllers.bash
Return to the jumpbox and verify the controller is working by requesting the version:
exit
curl -k --cacert ca.crt https://server.kubernetes.local:6443/version
For step 9 (Bootstrapping the Kubernetes Worker Nodes), from the jumpbox node, run this script to place configs, kubelet, kube-proxy, and kubectl on worker nodes:
./push-worker-settings.bash
Then, on each of the worker nodes, node-0 and node-1, append the following to kube-proxy-config.yaml to fix an issue with containerd and the shim task creation:
conntrack:
  maxPerCore: 0
And then, invoke:
./bootstrap-workers.bash
Go back to the jumpbox (exit from node-*, or use podman exec if you exited the containers entirely). Check that the nodes are running with:
ssh server kubectl get nodes -o wide --kubeconfig admin.kubeconfig
Note: this showed the nodes ready, but I was seeing the kube-proxy service not starting on the worker nodes (systemctl status kube-proxy).
For step 10 (Configuring kubectl for Remote Access), on the jumpbox, invoke this script, which creates an admin config and runs kubectl commands to show the version and nodes:
./remote-access.bash
For step 11 (Provisioning Pod Network Routes), run the following script on jumpbox to create routes for pods to communicate:
./set-routes.bash
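Conceptually, the routes being added look something like this (the IPs and pod subnets here are illustrative; the real values come from machines.txt):
# On the server node: reach each worker's pod subnet via that worker's node IP
ip route add 10.200.0.0/24 via 10.10.10.20   # node-0's pod subnet
ip route add 10.200.1.0/24 via 10.10.10.21   # node-1's pod subnet
The workers get similar routes pointing at each other’s pod subnets.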
For step 12 (Smoke Test), you’ll perform a sequence of steps on the jumpbox to verify the installation. First, create a secret:
kubectl create secret generic kubernetes-the-hard-way \
--from-literal="mykey=mydata"
Check the secret:
ssh root@server 'etcdctl get /registry/secrets/default/kubernetes-the-hard-way | hexdump -C'
The key should be prefixed by “k8s:enc:aescbc:v1:key1”.
Second, create a deployment:
kubectl create deployment nginx \
--image=nginx:latest
Verify the pod is running:
kubectl get all
I’m hitting a problem where the pod is not coming up and is showing an error:
Warning FailedCreatePodSandBox 7s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox "73794753461123b907388be4a0b9dcb6b4cf304ceae312e7d2cedfbc0776ed69": failed to create containerd task: failed to create shim task: failed to mount rootfs component: invalid argument
It looks like containerd has some issues as well; there was mention in the log of an undefined “SUBNET”. Looking at the GitHub issues, I saw two things. There is mention of kubelet-config.yaml getting modified and copied to node-0/node-1, and then the original getting copied over it again (overwriting the change). The modification replaces the SUBNET placeholder with the actual pod subnet. I’m wondering if that was causing a problem with containerd.
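For context, the substitution in question looks roughly like this (a sketch of what the tutorial’s worker step does; the file names and field positions may differ):
SUBNET=$(grep node-0 machines.txt | cut -d " " -f 4)        # the node's pod subnet
sed "s|SUBNET|$SUBNET|g" configs/kubelet-config.yaml > kubelet-config.yaml
# If the unmodified configs/kubelet-config.yaml is later copied over this file,
# the literal "SUBNET" placeholder ends up in the running config.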
In another issue, someone mentioned that they created Kubernetes The Harder Way, which deploys locally on a Mac with QEMU. I may want to try that.
Trying again from the start, with the kubelet-config.yaml overwrite resolved.
Try #4: Harder Way
When looking at the GitHub issues for Kubernetes The Hard Way, I saw a post from someone who made a repo doing the same thing, only geared towards local development, with both a macOS/ARM64 and a Linux/AMD64 guide. It is called Kubernetes The Harder Way, and I’ll give that a try next.
Starting out, I cloned the repo:
git clone git@github.com:pmichali/k8s-the-hard-way-on-mac.git
I’m following the instructions there, which are very clear. Some observations/comments:
- They create 7 nodes, some with 2GB RAM, some with 4GB RAM. I used 1GB and 1.5GB, as my Mac only has 16GB RAM.
- A node is used for a load balancer, to allow API requests to use one IP that gets distributed to the control plane nodes. What I’ve done in the past is to use kube-vip for a common API IP.
- The tmux tool is used, with a script that creates a pane for each node and then makes use of the synchronize option, so that one command can be applied to multiple nodes (e.g. all control plane nodes) at the same time. A great idea (a rough sketch follows this list).
- A cloud-init image for Ubuntu is used, and the cloud-init files set the SSH key for remote access, install tools, and update the OS on startup.
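To illustrate the tmux idea mentioned above, here is my own rough sketch (not the Harder Way script; the node names are made up): one window with a pane per node, each SSH’d in, with synchronize-panes turned on so a command typed once goes to every pane.
#!/usr/bin/env bash
# Sketch: one tmux window, one pane per node, panes synchronized
nodes=(control0 control1 control2)
tmux new-session -d -s cluster -n nodes "ssh ${nodes[0]}"
for node in "${nodes[@]:1}"; do
  tmux split-window -t cluster:nodes "ssh $node"
  tmux select-layout -t cluster:nodes tiled
done
tmux set-window-option -t cluster:nodes synchronize-panes on
tmux attach -t cluster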