Part IV: Preparing to Create Kubernetes Cluster
There are a number of ways to create a cluster (microk8s, k3s, KOps, kubeadm, kubespray, …), and after looking at several of them, I decided to use kubespray (ref: https://kubespray.io/#/). I’m going to use my MacBook to drive all of this, so I set up an environment on that machine with all the tools I needed (and more).
Getting The Provisioning System Ready
I created a directory ~/workspace/picluster to hold everything, and created a git repo so that I have a version-controlled area to record all the changes and customizations I’m going to make. For the Mac, I used brew to install python 3.11 (ref: https://www.python.org/downloads/) and poetry (ref: https://python-poetry.org/docs/) to create a virtual environment for the tools I use and to pin their versions. Currently my poetry environment has:
[tool.poetry.dependencies]
python = "^3.11"
ansible = "7.6.0"
argcomplete = "^3.1.1"
"ruamel.yaml" = "^0.17.32"
pbr = "^5.11.1"
netaddr = "^0.8.0"
jmespath = "^1.0.1"
python is obvious, and I’m pinning it to 3.11. ansible is the tool used to provision the nodes. argcomplete is optional, if you want to have command completion. ruamel.yaml is a YAML parser, pbr is used for python builds, netaddr is a network address manipulation library for python, and jmespath is for JSON matching expressions.
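If you are recreating this environment from scratch, here is a rough sketch of the poetry commands involved (exact flags can vary a bit between poetry versions, so treat this as a guide rather than gospel):

cd ~/workspace/picluster
poetry init --no-interaction --python "^3.11"
poetry add "ansible==7.6.0" argcomplete "ruamel.yaml" pbr netaddr jmespath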
You can use any virtual environment tool you want to ensure that you have the desired dependencies. Under “poetry shell”, which creates a virtual environment, I continued with the prep work for using kubespray. I installed helm, jq, wget, and kubectl:
brew install helm
brew install jq
brew install wget
brew install kubectl
Note that, for kubectl, I really wanted v1.28 and specified it by version (kubectl@1.28); however, when trying months later, that version no longer appears to be available, and brew now installs 1.29.
Defining Your Inventory
I cloned the kubespray repo and then checked out the version I wanted.
git clone https://github.com/kubernetes-sigs/kubespray.git
cd kubespray
git tag | sort -V --reverse
git checkout v2.23.1  # for example
cd ..
The releases have tags, but you can choose to use any commit desired (or the latest). Sometimes, commits after a release use newer component versions. For a specific commit, you can see what default Kubernetes version and Calico version are configured with:
grep -R "kube_version: "
grep -R "calico_version: "
You can override the values in inventory/sample/group_vars/k8s_cluster/k8s-cluster.yml, but make sure the Kubernetes and Calico versions are compatible. You can check the Tigera documentation to see the Kubernetes versions supported for a Calico release. I have run into some issues when trying a Calico/Kubernetes pair that was not called out in the kubespray configuration (see the side bar in Part V).
With the kubespray repo, I copied the sample inventory to the current area (so that the inventory for my cluster is separate from the kubespray repo):
mkdir inventory
cp -r kubespray/inventory/sample/ inventory/mycluster/
Kubespray has a script that can be used to create the inventory file for the cluster. You can obtain it with:
wget https://raw.githubusercontent.com/kubernetes-sigs/kubespray/master/contrib/inventory_builder/inventory.py
Next, you can create a basic inventory with the following command, using the IPs that you have defined for each of your nodes. If you have four, like me, it would look something like this:
CONFIG_FILE=inventory/mycluster/hosts.yaml python3 inventory.py 10.11.12.198 10.11.12.196 10.11.12.194 10.11.12.192
This creates an inventory file, which is very generic. To customize it for my use, I changed each of the “node#” host names to the names I used for my cluster:
sed -i.bak 's/node1/mycoolname/g' inventory/mycluster/hosts.yaml
I kept the grouping of which nodes were in the control plane; however, later on, I want to have three control plane nodes set up. The last thing I did was to add the following clause to the end of the file so that proxy_env is defined, but empty (note that it is indented two and four spaces):
vars:
proxy_env: []
Here is a sample inventory:
all:
hosts:
cypher:
ansible_host: 10.11.12.198
ip: 10.11.12.198
access_ip: 10.11.12.198
lock:
ansible_host: 10.11.12.196
ip: 10.11.12.196
access_ip: 10.11.12.196
mouse:
ansible_host: 10.11.12.194
ip: 10.11.12.194
access_ip: 10.11.12.194
niobi:
ansible_host: 10.11.12.192
ip: 10.11.12.192
access_ip: 10.11.12.192
children:
kube_control_plane:
hosts:
cypher:
lock:
kube_node:
hosts:
cypher:
lock:
mouse:
niobi:
etcd:
hosts:
cypher:
lock:
mouse:
k8s_cluster:
children:
kube_control_plane:
kube_node:
calico_rr:
hosts: {}
vars:
proxy_env: []
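Before customizing further, it’s worth a quick sanity check that ansible can parse the inventory and that the groups look right. For example:

ansible-inventory -i inventory/mycluster/hosts.yaml --graph

This prints the group tree (kube_control_plane, kube_node, etcd, k8s_cluster) along with the hosts assigned to each group.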
There are some configurations in the inventory files that need to be changed. This may involve changing existing settings or adding new ones. In inventory/mycluster/group_vars/k8s_cluster/addons.yml we need to enable helm:
helm_enabled: true
In inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml, set the Kubernetes version, timezone, and DNS servers, pin the Calico version, use Calico IPIP mode, increase the logging level (if desired), use iptables, disable node-local DNS (otherwise you get an error, as the Raspberry PI kernel does not have the dummy module), and disable pod security policy:
kube_version: v1.27.3

# Set timezone
ntp_timezone: America/New_York

# DNS Servers (OpenDNS - use whatever you want here)
upstream_dns_servers: [YOUR_ROUTER_IP]
nameservers: [208.67.222.222, 208.67.220.220]

# Pinning Calico version
calico_version: v3.25.2

# Use IPIP mode
calico_ipip_mode: 'Always'
calico_vxlan_mode: 'Never'
calico_network_backend: 'bird'

# Added debug
kube_log_level: 5

# Using iptables
kube_proxy_mode: iptables

# Must disable, as kernel on RPI does not have dummy module
enable_nodelocaldns: false

# Pod security policy (RBAC must be enabled either by having 'RBAC' in authorization_modes or kubeadm enabled)
podsecuritypolicy_enabled: false
BTW, if you want to run ansible commands on other systems in your network, you can create inventory/mycluster/other_servers.yaml and add the host information there:
all:
hosts:
neo:
ansible_host: 10.11.12.180
ip: 10.11.12.180
access_ip: 10.11.12.180
ansible_port: 7666
rabbit:
ansible_host: 10.11.12.200
ip: 10.11.12.200
access_ip: 10.11.12.200
vars:
proxy_env: []
In this example, neo is accessed on SSH port 7666 (via the ansible_port setting), versus the default 22.
General Node Setup Steps
Kubespray uses ansible to communicate with and provision each node, and ansible uses SSH. At this point, from your provisioning host, make sure that you can SSH into each node without a password. Each node should have your public SSH key in its ~/.ssh/authorized_keys file. You can use the ssh-copy-id command to set up the authorized_keys files on each node, or just copy the public key over manually.
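For example, assuming an ubuntu account on each node and the ed25519 key used throughout this post (adjust the user, key, and IPs to match your setup), you could push the key to all four nodes with:

for ip in 10.11.12.198 10.11.12.196 10.11.12.194 10.11.12.192; do
  ssh-copy-id -i ~/.ssh/id_ed25519.pub ubuntu@${ip}
done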
Ansible runs “playbooks” of tasks against the nodes, so I looked around on the internet, found a number of useful ones, and placed them into a sub-directory called playbooks. Now is a good time to do some more node configuration and to make sure that ansible, the inventory file, and SSH are all set up correctly. It is way easier than logging into each node and making changes. And yes, I suspect one could put all this into one huge ansible playbook, but I like to do things one at a time and check that they work.
When we run a playbook, we’ll provide our private key for SSH access, turn on the verbose (-v) flag, and sometimes ask for privilege escalation. I’ll show the command both for a single node (substitute the hostname for HOSTNAME) that is in the inventory, and for all hosts in the inventory. When the playbook runs, it will display the steps being performed, and with the -v flag you can see whether things were changed or not. At the end, it will show a summary of the playbook run. For example, here is the output of a “ping” to every node in the cluster:
cd ~/workspace/picluster
ansible-playbook -i inventory/mycluster/hosts.yaml playbooks/ping.yaml --private-key=~/.ssh/id_ed25519

PLAY [Test Ping] *************************************************************

TASK [Gathering Facts] *******************************************************
ok: [lock]
ok: [cypher]
ok: [niobi]
ok: [mouse]

TASK [ping] ******************************************************************
ok: [niobi]
ok: [mouse]
ok: [cypher]
ok: [lock]

PLAY RECAP *******************************************************************
cypher : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
lock   : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
mouse  : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
niobi  : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
You’ll want to check the recap to see if there are any failures, and if there are, check above for the step and item that failed.
For the following playbooks, you can set the TARGET_HOST environment variable to a single host name and run the command for just that system, or run against all hosts in the inventory.
For the node preparation, the first item is a playbook to give your username passwordless sudo, so that you don’t have to enter a password when running sudo commands:
Single node:
ansible-playbook -i "${TARGET_HOST}," playbooks/passwordless_sudo.yaml -v --private-key=~/.ssh/id_ed25519 --ask-become-pass
All nodes:
ansible-playbook -i inventory/mycluster/hosts.yaml playbooks/passwordless_sudo.yaml -v --private-key=~/.ssh/id_ed25519 --ask-become-pass
The passwordless_sudo.yaml contains the following (it picks up USER from the environment of the provisioning host; adjust node_username if your account name on the nodes is different):
- name: Make users passwordless for sudo in group sudo
hosts: all
become: yes
vars:
node_username: "{{ lookup('env','USER') }}"
tasks:
- name: Add user to sudoers
copy:
dest: "/etc/sudoers.d/{{ node_username }}"
content: "{{ node_username }} ALL=(ALL) NOPASSWD: ALL"
You’ll have to provide your password, so that it can change the sudo permissions (hence the --ask-become-pass argument).
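To confirm the change took effect, an ad-hoc ansible command that escalates privileges (-b) without --ask-become-pass should now succeed and report root on every node:

ansible all -i inventory/mycluster/hosts.yaml -b -m command -a "whoami" --private-key=~/.ssh/id_ed25519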
Next, you can tighten up SSH by disabling root login and password-based login:
Single node:
ansible-playbook -i "${TARGET_HOST}," playbooks/ssh.yaml -v --private-key=~/.ssh/id_ed25519
All nodes:
ansible-playbook -i inventory/mycluster/hosts.yaml playbooks/ssh.yaml -v --private-key=~/.ssh/id_ed25519
The ssh.yaml file has:
- name: Secure SSH
  hosts: all
  become: yes
  tasks:
    - name: Disable Password Authentication
      ansible.builtin.lineinfile:
        dest: /etc/ssh/sshd_config
        regexp: '^PasswordAuthentication'
        line: "PasswordAuthentication no"
        state: present
        backup: yes
      register: no_password
    - name: Disable Root Login
      ansible.builtin.lineinfile:
        dest: /etc/ssh/sshd_config
        regexp: '^PermitRootLogin'
        line: "PermitRootLogin no"
        state: present
        backup: yes
      register: no_root
    - name: Restart service
      ansible.builtin.systemd:
        state: restarted
        name: ssh
      when:
        - no_password.changed or no_root.changed
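If you want to double-check the hardening, try forcing a password login from the provisioning host; with the change in place it should be refused (this uses one of my node IPs, so substitute your own):

ssh -o PubkeyAuthentication=no -o PreferredAuthentications=password 10.11.12.198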
For the Raspberry PI, we want to configure the fully qualified domain name and hostname and update the hosts file. Note: I use <hostname>.home for the FQDN.
Single node:
ansible-playbook -i "${TARGET_HOST}," playbooks/hostname.yaml -v --private-key=~/.ssh/id_ed25519
All nodes:
ansible-playbook -i inventory/mycluster/hosts.yaml playbooks/hostname.yaml -v --private-key=~/.ssh/id_ed25519
The hostname.yaml file is:
- name: FQDN and hostname
hosts: all
become: true
tasks:
- name: Configure FQDN and hostname
ansible.builtin.lineinfile:
path: /boot/firmware/user-data
regexp: "{{ item.regexp }}"
line: "{{ item.line }}"
loop:
- { regexp: '^fqdn\: ', line: 'fqdn: {{ ansible_hostname }}.home' }
- { regexp: '^hostname\:', line: 'hostname: {{ ansible_hostname }}' }
register: hostname
- name: Make sure hosts file updated
ansible.builtin.lineinfile:
path: /etc/hosts
regexp: "^127.0.1.1"
line: "127.0.1.1 {{ ansible_hostname }}.home {{ ansible_hostname }}"
register: hosts
- name: Reboot after change
reboot:
when:
- hostname.changed or hosts.changed
I place a number of tools on the nodes, but before doing so, I update the OS. I used an ansible role that handles updates and reboots, pulling it down with the following command, run from the ~/workspace/picluster/ area:
git clone https://github.com/ryandaniels/ansible-role-server-update-reboot.git roles/server-update-reboot
The syntax is a bit different for this command.
Single node:
ansible-playbook -i "${TARGET_HOST}," playbooks/os_update.yaml --extra-vars "inventory=all reboot_default=false proxy_env=[]" --private-key=~/.ssh/id_ed25519
All nodes:
ansible-playbook -i inventory/mycluster/hosts.yaml playbooks/os_update.yaml --extra-vars "inventory=all reboot_default=false" --private-key=~/.ssh/id_ed25519
Granted, I’ve had times where updates required some prompting and I don’t think the script handled it. You can always log in to each node and do the update manually, if desired. The os_update.yaml will update each system one at a time:
- hosts: '{{inventory}}'
max_fail_percentage: 0
serial: 1
become: yes
roles:
- server-update-reboot
Tools can now be installed on nodes by using:
Single node:
ansible-playbook -i "${TARGET_HOST}," playbooks/tools.yaml -v --private-key=~/.ssh/id_ed25519
All nodes:
ansible-playbook -i inventory/mycluster/hosts.yaml playbooks/tools.yaml -v --private-key=~/.ssh/id_ed25519
Here is what the playbook installs. Feel free to add to the list, or to remove emacs and ripgrep (tools I personally like):
- name: Install tools
hosts: all
vars:
host_username: "{{ lookup('env','USER') }}"
tasks:
- name: install tools
ansible.builtin.apt:
name: "{{item}}"
state: present
update_cache: yes
loop:
- emacs
- ripgrep
- build-essential
- make
- nfs-common
- open-iscsi
# - linux-modules-extra-raspi
become: yes
- name: Emacs config
copy:
src: "emacs-config.text"
dest: "/home/{{ host_username }}/.emacs"
- name: Git config
copy:
src: "git-config.text"
dest: "/home/{{ host_username }}/.gitconfig"
In the playbooks area, I have an emacs-config.text with:
(global-unset-key [(control z)])
(global-unset-key [(control x)(control z)])
(global-set-key [(control z)] 'undo)
(add-to-list 'load-path "~/.emacs.d/lisp")
(require 'package)
(add-to-list 'package-archives
'("melpa" . "http://melpa.milkbox.net/packages/") t)
(setq frame-title-format (list '(buffer-file-name "%f" "%b")))
(setq frame-icon-title-format frame-title-format)
(setq inhibit-splash-screen t)
(global-set-key [f8] 'goto-line)
(global-set-key [?\C-v] 'yank)
(setq column-number-mode t)
And a git-config.text with:
[user]
    name = YOUR NAME
    email = YOUR_EMAIL@ADDRESS
[alias]
    qlog = log --graph --abbrev-commit --pretty=oneline
    flog = log --all --pretty=format:'%h %ad | %s%d' --graph --date=short
    clog = log --graph --pretty=\"tformat:%C(yellow)%h%Creset %Cgreen(%ar)%Creset %C(bold blue)<%an>%Creset %C(red)%d%Creset %s\"
    lg = log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %C(green)%an%Creset %Cgreen(%cr)%Creset' --abbrev-commit --date=relative
[gitreview]
    username = YOUR_USERNAME
[core]
    pager = "more"
    editor = "emacs"
[color]
    diff = auto
    status = auto
    branch = auto
    interactive = auto
    ui = true
    pager = true
UPDATE: With Ubuntu 22.04, apt does not find linux-modules-extra-raspi. If I try to install a specific instance, like linux-modules-extra-6.8.0-1018-raspi, it switches to linux-modules-6.8.0-1018-raspi and says it is already installed. Maybe this is no longer needed. I’m ignoring it for now (everything worked without it).
Update For Kube-VIP
After setting up the Kubernetes cluster, I realized that I wanted to include the kube-vip feature for the Kubespray created cluster. This allows us to have one (load balancer) IP for the cluster API and requests will get forwarded to the control plane node that is the active “leader”. If a control plane node fails, leadership will change to another control plane node, but the API IP will remain the same.
To configure kube-vip, I uncommented and altered lines in inventory/mycluster/group_vars/k8s_cluster/addons.yml to enable the feature:
kube_vip_enabled: true
kube_vip_arp_enabled: true
kube_vip_controlplane_enabled: true
kube_vip_address: 10.11.12.240
loadbalancer_apiserver:
address: "{{ kube_vip_address }}"
port: 6443
kube_vip_lb_enable: true
I set the kube_vip_address to the IP that I wanted to be used for the API server’s VIP.
The issue I found, however, is that when Kubespray creates the cluster, the API will not use the IP I defined, but instead, will use a domain name. Specifically, lb-apiserver.kubernetes.local will be used. This works fine, but there are two problems.
The first problem is that this domain name is used in the kubeconfig file used for kubectl commands. If I copy the config to my laptop under ~/.kube/config and then run kubectl commands, they fail, as that domain is unknown. I can work around that by just replacing the domain name with the IP that I configured in the kube-vip setup.
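For example, something like this does the in-place substitution (keeping a backup), assuming the 10.11.12.240 VIP configured above:

sed -i.bak 's/lb-apiserver.kubernetes.local/10.11.12.240/' ~/.kube/config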
The second problem occurs when a node is rebooted. It will attempt to re-connect to the cluster, but fails, because it cannot register with the API server when trying to use the domain name that was set up. Since the cluster is running locally, on my home network, there is no lb-apiserver.kubernetes.local outside of the cluster, and since this rebooted node does not have the cluster’s DNS running yet, it cannot resolve the name.
The solution I chose was to add a host name entry to /etc/hosts. Since this file is automatically generated, I had to add the mapping of IP address to lb-apiserver.kubernetes.local in the file /etc/cloud/templates/hosts.debian.tmpl on each node.
I suspect that a playbook could be created to do this for each node in the inventory (a sketch of one is below). I haven’t done that, because my cluster is already up, so I just updated each node manually.
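Since I haven’t actually run one, treat this as a sketch only; it assumes the 10.11.12.240 VIP from the kube-vip configuration above, and it edits both the live /etc/hosts and the cloud-init template so the entry survives regeneration:

- name: Map kube-vip API server name on each node
  hosts: all
  become: true
  vars:
    # Assumption: the VIP set as kube_vip_address earlier
    kube_vip_address: 10.11.12.240
  tasks:
    - name: Add lb-apiserver.kubernetes.local to /etc/hosts
      ansible.builtin.lineinfile:
        path: /etc/hosts
        regexp: 'lb-apiserver\.kubernetes\.local'
        line: "{{ kube_vip_address }} lb-apiserver.kubernetes.local"
    - name: Add lb-apiserver.kubernetes.local to the cloud-init hosts template
      ansible.builtin.lineinfile:
        path: /etc/cloud/templates/hosts.debian.tmpl
        regexp: 'lb-apiserver\.kubernetes\.local'
        line: "{{ kube_vip_address }} lb-apiserver.kubernetes.local"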
Raspberry PI Specific Setups
For the configuration of the UCTRONICS OLED display and enabling the power switch to do a controlled shutdown, we need to place the sources on the node(s) and build them for that architecture. Before starting, get the sources:
cd ~/workspace
git clone https://github.com/pmichali/SKU_RM0004.git
cd picluster
Note: I forked the manufacturer’s repo and just renamed the image for now. Later, the manufacturer added a deployment script, but I’m sticking with manual install and using Ansible to set things up.
Next, use this command:
Single node:
ansible-playbook -i "${TARGET_HOST}," playbooks/uctronics.yaml -v --private-key=~/.ssh/id_ed25519
All nodes:
ansible-playbook -i inventory/mycluster/hosts.yaml playbooks/uctronics.yaml -v --private-key=~/.ssh/id_ed25519
The uctronics.yaml contains:
- name: UCTRONICS Hardware setup
hosts: all
become: true
tasks:
- name: Enable OLED display and power switch
ansible.builtin.lineinfile:
path: /boot/firmware/config.txt
regexp: "{{ item.regexp }}"
line: "{{ item.line }}"
loop:
- { regexp: '^dtparam=i2c_arm=on', line: 'dtparam=i2c_arm=on,i2c_arm_baudrate=400000' }
- { regexp: '^dtoverlay=gpio-shutdown', line: 'dtoverlay=gpio-shutdown,gpio_pin=4,active_low=1,gpio_pull=up'}
register: hardware
- name: Get sources
copy:
src: "~/workspace/SKU_RM0004"
dest: "/root/"
register: files
- name: Check built
stat:
path: "/root/SKU_RM0004/oled_display"
register: executable
- name: Build display code
community.general.make:
chdir: "/root/SKU_RM0004/"
make: /usr/bin/make
target: oled_display
when:
- files.changed or not executable.stat.exists
register: built
- name: Check installed
stat:
path: /usr/local/bin/oled_display
register: binary
- name: Install display code
ansible.builtin.command:
chdir: "/root/SKU_RM0004/"
cmd: "/usr/bin/install -m 755 oled_display /usr/local/bin"
when:
- built.changed or not binary.stat.exists
register: installed
- name: Service script
copy:
src: "oled.sh"
dest: "/usr/local/bin/oled.sh"
mode: 0755
register: oled_shell
- name: Service
copy:
src: "oled.service"
dest: "/etc/systemd/system/oled.service"
mode: 0640
register: oled_service
- name: Reload daemon
ansible.builtin.systemd_service:
daemon_reload: true
name: oled
enabled: true
state: restarted
when:
- installed.changed or oled_shell.changed or oled_service.changed
- name: Reboot after change
reboot:
when:
- hardware.changed
This uses oled.sh in the playbooks area:
#!/bin/bash
echo "oled.service: ## Starting ##" | systemd-cat -p info
/usr/local/bin/oled_display
And oled.service for the service:
[Unit]
Description=OLED Display
Wants=network.target
After=syslog.target network-online.target

[Service]
Type=simple
ExecStart=/usr/local/bin/oled.sh
Restart=on-failure
RestartSec=10
KillMode=process

[Install]
WantedBy=multi-user.target
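After the playbook has run, a quick way to confirm the display service is healthy is to log into a node and check its status and recent log output:

systemctl status oled
journalctl -u oled --no-pager -n 20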
Next up is to set up cgroups for the Raspberry PI with the command:
Single node:
ansible-playbook -i "${TARGET_HOST}," playbooks/cgroups.yaml -v --private-key=~/.ssh/id_ed25519
All nodes:
ansible-playbook -i inventory/mycluster/hosts.yaml playbooks/cgroups.yaml -v --private-key=~/.ssh/id_ed25519
The cgroups.yaml has:
- name: Prepare cgroups on Ubuntu based Raspberry PI
hosts: all
become: yes
tasks:
- name: Enable cgroup via boot commandline if not already enabled
ansible.builtin.lineinfile:
path: /boot/firmware/cmdline.txt
backrefs: yes
regexp: '^((?!.*\bcgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory swapaccount=1\b).*)$'
line: '\1 cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory swapaccount=1'
register: cgroup
- name: Reboot after change
reboot:
when:
- cgroup.changed
The following will load the overlay and br_netfilter modules, set up iptables to see bridged traffic, and allow IPv4 forwarding.
Single node:
ansible-playbook -i "${TARGET_HOST}," playbooks/iptables.yaml -v --private-key=~/.ssh/id_ed25519
All nodes:
ansible-playbook -i inventory/mycluster/hosts.yaml playbooks/iptables.yaml -v --private-key=~/.ssh/id_ed25519
The iptables.yaml contains:
- name: Prepare iptables on Ubuntu based Raspberry PI
hosts: all
become: yes
tasks:
- name: Load overlay modules
community.general.modprobe:
name: overlay
persistent: present
- name: Load br_netfilter module
community.general.modprobe:
name: br_netfilter
persistent: present
- name: Allow iptables to see bridged traffic
ansible.posix.sysctl:
name: net.bridge.bridge-nf-call-iptables
value: '1'
sysctl_set: true
- name: Allow iptables to see bridged IPv6 traffic
ansible.posix.sysctl:
name: net.bridge.bridge-nf-call-ip6tables
value: '1'
sysctl_set: true
- name: Allow IPv4 forwarding
ansible.posix.sysctl:
name: net.ipv4.ip_forward
value: '1'
sysctl_set: true
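To spot-check that the modules and sysctl settings took effect, you can run the following on any of the nodes:

lsmod | grep -E 'overlay|br_netfilter'
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward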
Tools For Cluster Nodes
With ansible, you can do a variety of operations on the nodes of the cluster. One is to ping the nodes. You can do this for other systems in your network as well, if you set up the other_servers.yaml file:
To ping the cluster...
ansible-playbook -i inventory/mycluster/hosts.yaml playbooks/ping.yaml -v --private-key=~/.ssh/id_ed25519

To ping other servers...
ansible-playbook -i inventory/mycluster/other_servers.yaml playbooks/ping.yaml -v --private-key=~/.ssh/id_ed25519
The ping.yaml script is pretty simple:
- name: Test Ping
  hosts: all
  tasks:
    - action: ping
You can make other playbooks, as needed, like the os_update.yaml shown earlier on this page. At this point, we are ready to cross our fingers and bring up the Kubernetes cluster in Part V.
Side Bar
If, during the UCTRONICS setup steps, I2C is not enabled and the OLED display does not work, you can do these steps (ref: https://dexterexplains.com/r/20211030-how-to-install-raspi-config-on-ubuntu):
In a browser, go to https://archive.raspberrypi.org/debian/pool/main/r/raspi-config/
Get the latest version with (for example):
wget https://archive.raspberrypi.org/debian/pool/main/r/raspi-config/raspi-config_20230214_all.deb
Install supporting packages:
sudo apt -y install libnewt0.52 whiptail parted triggerhappy lua5.1 alsa-utils
Fix any broken packages (just in case):
sudo apt install -fy
And then install the config util:
sudo dpkg -i ./raspi-config_20230214_all.deb
Run it with “sudo raspi-config”, select Interface Options, then I2C, and enable it. Finally, do a “sudo reboot”.
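To confirm I2C is enabled after the reboot, you can check for the device node and scan the bus (i2cdetect comes from the i2c-tools package, which may need to be installed first):

ls /dev/i2c-*
sudo apt install -y i2c-tools
sudo i2cdetect -y 1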