June 12

Ad-Blocking With PI-Hole

I had Pi-Hole running on a standalone Raspberry PI, but wanted to move this to my Kubernetes cluster. Digging around, I found a useful article on how to add PI-Hole to Kubernetes, which not only covered running PI-Hole, but also running redundant instances and keeping them in sync. It used MetalLB, ingress, and CertManager for Let's Encrypt certificates – something I was interested in.

There was another article based on using Helm, with some monitoring set up. I may try it someday.


A Few Things

First, as expected, this article used an older version of PI-Hole (2022.12.1). I tried the latest version (at this time, 2024.05.0), but the pods were stuck in crash loops. What I found was that the liveness/readiness probes in the YAML did an HTTP GET at the root of the Lighttpd web server. With the 2023.02.1 pihole image that worked, but with 2023.02.2 it failed.

Trying curl http://127.0.0.1/ inside the pod showed a 403 Forbidden error. Accessing http://127.0.0.1/admin returned a 301 Moved Permanently pointing to '/admin/'. Requesting http://127.0.0.1/admin/ returned a 302 Found pointing to 'login.php'. Finally, http://127.0.0.1/admin/login.php returned a 200 OK with content.

So, I changed the liveness and readiness probe configuration to add a path field of '/admin/login.php', and then the pods came up successfully.

Second, for the PI-Hole admin web pages, I chose to use a Service type of LoadBalancer (instead of ClusterIP with an ingress in front). Accessing locally is fine, as I just use the IP assigned by the load balancer. The article talks about setting up a certificate using Let's Encrypt to be able to access remotely.

I already have a domain name, and I’m using Dynamic DNS to redirect that domain to my router’s WAN IP. But, I’m currently port forwarding external HTTP/HTTPS traffic to my standalone Raspberry PI for a music server that uses Let’s Encrypt for certificates.

For now, I think I'll just access my PI-Hole admin page locally. I will, however, have to figure out how to set up Let's Encrypt once I move my music server and other web apps to the Kubernetes cluster, so it will be useful to keep this info in mind.


Setting Up PI-Hole

I'm doing the same thing as the article, running three replicas of the PI-Hole pods, but with the liveness/readiness checks altered as described above. Here is my manifest.yaml, in pieces:

apiVersion: v1
kind: Namespace
metadata:
  name: pihole
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: pihole-configmap
  namespace: pihole
data:
  TZ: "America/New_York"
  PIHOLE_DNS_: "208.67.220.220;208.67.222.222"


This sets up a namespace for PI-Hole, defines the timezone I'm using, and the upstream DNS servers I wanted to use (OpenDNS). You can customize these as desired.
---
apiVersion: v1
kind: Secret
metadata:
  name: pihole-password
  namespace: pihole
type: Opaque
data:
  WEBPASSWORD: <PUT_BASE64_PASSWORD_HERE> # Base64 encoded

This is the password that will be used when logging into the PI-Hole admin page. Encode it using "echo -n 'MY PASSWORD' | base64" and place the encoded string in the WEBPASSWORD attribute.
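For example, with a placeholder password (the value shown is obviously not one you should use):

echo -n 'MY_PASSWORD' | base64
# prints TVlfUEFTU1dPUkQ= - paste that string into the WEBPASSWORD field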

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: pihole
  namespace: pihole
spec:
  selector:
    matchLabels:
      app: pihole
  serviceName: pihole
  replicas: 3
  template:
    metadata:
      labels:
        app: pihole
    spec:
      containers:
        - name: pihole
          image: pihole/pihole:2024.05.0
          envFrom:
            - configMapRef:
                name: pihole-configmap
            - secretRef:
                name: pihole-password
          ports:
            - name: svc-80-tcp-web
              containerPort: 80
              protocol: TCP
            - name: svc-53-udp-dns
              containerPort: 53
              protocol: UDP
            - name: svc-53-tcp-dns
              containerPort: 53
              protocol: TCP
          livenessProbe:
            httpGet:
              port: svc-80-tcp-web
              path: /admin/login.php
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 5
          readinessProbe:
            httpGet:
              port: svc-80-tcp-web
              path: /admin/login.php
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 10
            failureThreshold: 10
          volumeMounts:
            - name: pihole-etc-pihole
              mountPath: /etc/pihole
            - name: pihole-etc-dnsmasq
              mountPath: /etc/dnsmasq.d
  volumeClaimTemplates:
    - metadata:
        name: pihole-etc-pihole
        namespace: pihole
      spec:
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: 3Gi
    - metadata:
        name: pihole-etc-dnsmasq
        namespace: pihole
      spec:
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: 3Gi

This is the StatefulSet that will create three replicas of the PI-Hole pods. I'm using the latest version at this time (2024.05.0), have modified the liveness/readiness checks as mentioned above, and am using PVs (Longhorn) for storing configuration.

---
apiVersion: v1
kind: Service
metadata:
  name: pihole
  namespace: pihole
  labels:
    app: pihole
spec:
  clusterIP: None
  selector:
    app: pihole
---
kind: Service
apiVersion: v1
metadata:
  name: pihole-web-svc
  namespace: pihole
spec:
  selector:
    app: pihole
    statefulset.kubernetes.io/pod-name: pihole-0
  type: LoadBalancer
  ports:
    - name: svc-80-tcp-web
      port: 80
      targetPort: 80
      protocol: TCP
---
kind: Service
apiVersion: v1
metadata:
  name: pihole-dns-udp-svc
  namespace: pihole
  annotations:
    metallb.universe.tf/allow-shared-ip: "pihole"
spec:
  selector:
    app: pihole
  type: LoadBalancer
  ports:
    - name: svc-53-udp-dns
      port: 53
      targetPort: 53
      protocol: UDP
---
kind: Service
apiVersion: v1
metadata:
  name: pihole-dns-tcp-svc
  namespace: pihole
  annotations:
    metallb.universe.tf/allow-shared-ip: "pihole"
spec:
  selector:
    app: pihole
  type: LoadBalancer
  ports:
    - name: svc-53-tcp-dns
      port: 53
      targetPort: 53
      protocol: TCP

These are the services for the UI and for DNS. Of note, the TCP and UDP DNS services share the same load balancer IP (via the metallb.universe.tf/allow-shared-ip annotation). I used a load balancer for the web UI as well (instead of using ClusterIP and setting up an ingress – maybe that will bite me later).

With this manifest, you can run "kubectl apply -f manifest.yaml" and then watch for all three pods to start up. You should be able to run nslookup/dig commands using the IP of the DNS service as the server to verify that DNS is working, and you can browse to the IP of the pihole-web-svc service with a path of /admin/ (e.g. http://10.11.12.203/admin/). Use the password you defined in the manifest to log in and see the ad blocker in operation.
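For example, a quick verification pass might look like this (the service IPs below are placeholders; use whatever MetalLB assigned in your cluster):

kubectl apply -f manifest.yaml
kubectl -n pihole get pods,svc -o wide    # wait for pihole-0/1/2 to be Running, note the EXTERNAL-IPs
dig @10.11.12.202 example.com             # placeholder DNS service IP; should return an answer via PI-Hole
# then browse to http://10.11.12.203/admin/ and log in with the password from the Secret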

Keeping The PI-Hole Pods In Sync

As mentioned in the article, we have three PI-Hole pods (one primary, two secondary), but need to keep their databases in sync. To do this, Orbital Sync is used to back up the primary pod's database and restore it to the secondary pods. Here is the orbital-sync.yaml manifest:

apiVersion: v1
kind: ConfigMap
metadata:
  name: orbital-sync-config
  namespace: pihole
data:
  PRIMARY_HOST_BASE_URL: "http://pihole-0.pihole.pihole.svc.cluster.local"
  SECONDARY_HOST_1_BASE_URL: "http://pihole-1.pihole.pihole.svc.cluster.local"
  SECONDARY_HOST_2_BASE_URL: "http://pihole-2.pihole.pihole.svc.cluster.local"
  INTERVAL_MINUTES: "1"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orbital-sync
  namespace: pihole
spec:
  selector:
    matchLabels:
      app: orbital-sync
  template:
    metadata:
      labels:
        app: orbital-sync
    spec:
      containers:
        - name: orbital-sync
          image: mattwebbio/orbital-sync:latest
          envFrom:
            - configMapRef:
                name: orbital-sync-config
          env:
            - name: "PRIMARY_HOST_PASSWORD"
              valueFrom:
                secretKeyRef:
                  name: pihole-password
                  key: WEBPASSWORD
            - name: "SECONDARY_HOST_1_PASSWORD"
              valueFrom:
                secretKeyRef:
                  name: pihole-password
                  key: WEBPASSWORD
            - name: "SECONDARY_HOST_2_PASSWORD"
              valueFrom:
                secretKeyRef:
                  name: pihole-password
                  key: WEBPASSWORD

It runs every minute, and uses the secret that was created with the password to access PI-Hole. You can look at the orbital sync pod log to see that it is backing up and restoring the database among the PI-Holes.
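For example (using the Deployment name from the manifest above):

kubectl -n pihole logs deploy/orbital-sync -f    # shows the periodic backup from pihole-0 and restore to pihole-1/2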


Finishing Touches

Under the UI's Local DNS entries section, I manually entered the hostname (with a .home suffix) and IP address for each of my devices on the local network, so that I can access them by "name.home".

I did not set up DHCP on PI-Hole, as I use my router's DHCP configuration.

To use the PI-Hole as the DNS server for all systems in your network, you can specify the IP of the PI-Hole on each host as the only DNS server. If you specify more than one DNS server, then depending on your OS, it may sometimes use the other server(s) and bypass the ad blocking.
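As a minimal sketch, on a Linux host this just means pointing the resolver at the DNS service's load balancer IP (placeholder address below):

# /etc/resolv.conf
nameserver 10.11.12.202    # IP assigned to the pihole-dns services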

For me, I have all my hosts using the router as the primary DNS server. The router is configured to use the PI-Hole as its primary server, with a public server as the secondary. Normally, requests always go to the PI-Hole, unless for some reason it is down. This was advantageous for two reasons. First, when I had my standalone PI-Hole, if it crashed, there was still DNS resolution. Second, it made it easy to switch from the standalone PI-Hole to the Kubernetes one, by just changing the router configuration.

The only odd thing with this setup is that when I use my laptop away from my network, my router's IP is (obviously) not available. I've been getting around this by using the Locations feature of macOS: the "Home" location uses my router's IP for DNS, and the "Roaming" location uses a public DNS server.

I guess I could set things up so that the DNS port on my domain name (which points to my router using Dynamic DNS) is port forwarded to the PI-Hole IP, but I didn't want to expose that to the Internet.


June 3

Kubespray Add-Ons

In Part IV of the PI cluster series, I mention how to set up Kubespray to create a cluster. You can look there for how to set up your inventory, and for the basic configuration settings for Kubespray. In that series, I mention how to add more features after the cluster is up. Some are pretty simple, and some require manual steps to get everything set up.

However, you can also have Kubespray install some "add-on" components as part of the cluster bring-up. In many cases, this makes the process more automated and "easier", but it does have some limitations.

First, you will be using the version and configuration that is defined in Kubespray’s Ansible templates and roles.  Granted, you can always customize Kubespray, with the caveat of having to keep your changes up to date with upstream.

Second, removing the feature from a running cluster can be more difficult. You'll have to manually delete all the resources (e.g. daemonsets, deployments, etc.), some of which may be hard to identify (CRDs, RoleBindings, secrets, etc.). Looking in the Kubespray templates may provide some insight into the resources that were created.

You may be able to find manifests for the feature and version in the feature's repo, pull them, and use "kubectl delete" on those manifests to remove the feature. Just note that there may be some differences between what is in the repo manifests for a version and what is in the manifests that Kubespray used. I haven't tried it, but if there is a Helm chart version of the feature that matches what Kubespray installed, you might be able to "helm install" over the already installed feature, and then "helm delete"?


Kube VIP (Virtual IP and Service Load Balancing)

To add kube-vip as a Kubespray add-on, I did these steps before creating the cluster.

First, I modified the inventory, so that etcd would run on each of my control-plane nodes (versus a mix of control-plane and worker nodes).

Second, in inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml, I enabled strict ARP, used IPVS (instead of iptables) for kube-proxy, and excluded my local network from kube-proxy (so that kube-proxy would not clear the IPVS entries created by kube-vip):

kube_proxy_strict_arp: true
kube_proxy_mode: ipvs
kube_proxy_exclude_cidrs: ["CIDR_FOR_MY_LOCAL_NETWORK",]

Third, I enabled kube-vip in inventory/mycluster/group_vars/k8s_cluster/addons.yml. I turned on ARP (vs BGP), enabled the VIP for the control plane, and specified the address to use for the API. I also enabled load balancing of that VIP. I did not enable load balancing for services, but that is an option too:

kube_vip_enabled: true
kube_vip_arp_enabled: true
kube_vip_controlplane_enabled: true
kube_vip_address: VIP_ON_MY_NETWORK
loadbalancer_apiserver:
  address: "{{ kube_vip_address }}"
  port: 6443
kube_vip_lb_enable: true

# kube_vip_services_enabled: false
# kube_vip_enableServicesElection: true

I had tried this out, but found that the kube-vip container was showing connection refused and permission problems, so leader election was not working for the virtual IP chosen.

I finally found a bug report on the issue when using Kubernetes 1.29 with kube-vip. Essentially, when the first control plane node is starting up, the admin.conf file used for kubectl commands does not yet have the permissions that kube-vip needs at that point in the process. The kube-vip team needs to create their own kubeconfig for this. In the meantime, the bug report proposes a work-around in Kubespray: switching to the super-admin.conf file, which does have the needed permissions at that point in time. However, the patch they have does not work. I did more hacking on it, and have this change, which works:

diff --git a/roles/kubernetes/node/tasks/loadbalancer/kube-vip.yml b/roles/kubernetes/node/tasks/loadbalancer/kube-vip.yml
index f7b04a624..b5acdac8c 100644
--- a/roles/kubernetes/node/tasks/loadbalancer/kube-vip.yml
+++ b/roles/kubernetes/node/tasks/loadbalancer/kube-vip.yml
@@ -6,6 +6,10 @@
     - kube_proxy_mode == 'ipvs' and not kube_proxy_strict_arp
     - kube_vip_arp_enabled
 
+- name: Kube-vip | Check if first control plane
+  set_fact:
+    is_first_control_plane: "{{ inventory_hostname == groups['kube_control_plane'] | first }}"
+
 - name: Kube-vip | Write static pod
   template:
     src: manifests/kube-vip.manifest.j2
diff --git a/roles/kubernetes/node/templates/manifests/kube-vip.manifest.j2 b/roles/kubernetes/node/templates/manifests/kube-vip.manifest.j2
index 11a971e93..7b59bca4c 100644
--- a/roles/kubernetes/node/templates/manifests/kube-vip.manifest.j2
+++ b/roles/kubernetes/node/templates/manifests/kube-vip.manifest.j2
@@ -119,6 +119,6 @@ spec:
   hostNetwork: true
   volumes:
   - hostPath:
-      path: /etc/kubernetes/admin.conf
+      path: /etc/kubernetes/{% if is_first_control_plane %}super-{% endif %}admin.conf
     name: kubeconfig
 status: {}
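A diff like this can be applied to the Kubespray checkout with git apply (the patch file name below is just an example):

cd ~/workspace/kubernetes/kubespray
git apply ~/kube-vip-super-admin.patch    # hypothetical file containing the diff above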


UPDATE: There is a fix that is in progress, which is a streamlined version of my change. Once that is merged, no patch will be needed.

With this change to Kubespray, I did a cluster create:

cd ~/workspace/kubernetes/picluster
poetry shell
cd ../kubespray
ansible-playbook -i ../picluster/inventory/mycluster/hosts.yaml -u ${USER} -b -vvv --private-key=~/.ssh/id_ed25519 cluster.yml

Everything was up and running, but kubectl commands were failing on my Mac, because the ~/.kube/config file uses the FQDN https://lb-apiserver.kubernetes.local:6443 for the server, and my Mac has no DNS entry for that host name (it does work on the nodes, however). The simple fix was to replace the FQDN with the IP address selected for the VIP.
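For example, on macOS, something like this swaps in the VIP (placeholder value shown):

sed -i '' 's|https://lb-apiserver.kubernetes.local:6443|https://VIP_ON_MY_NETWORK:6443|' ~/.kube/config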

Now, all requests to that IP are redirected to the node that is currently running the API server. If the node is not available, IPVS will redirect to another control plane node.
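On a control plane node, the IPVS virtual server for the VIP and the endpoints behind it can be inspected with (placeholder VIP again):

sudo ipvsadm -L -n | grep -A 3 VIP_ON_MY_NETWORK    # lists the control plane endpoints behind the virtual server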

MetalLB Load Balancer

Instead of setting this up after the cluster was created, you can opt to let Kubespray do this as well. In inventory/mycluster/group_vars/k8s_cluster/addons.yml, I made these changes:

metallb_enabled: true
metallb_speaker_enabled: "{{ metallb_enabled }}"
metallb_namespace: "metallb-system"

metallb_protocol: "layer2"
metallb_config:
  address_pools:
    primary:
      ip_range:
        - FIRST_IP_IN_RANGE-LAST_IP_IN_RANGE
      auto_assign: true
  layer2:
    - primary

Besides enabling the feature, I made sure that it was using layer 2 mode (rather than BGP), and under the config, set up an address pool with the range of IPs on my local network that I wanted to use for load balanced IPs. You can specify a CIDR instead, if desired.

Now, when the cluster is created with Kubespray, MetalLB will be set up, and you can create services with type "LoadBalancer" to have an IP from the pool assigned.
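As a minimal sketch, a Service like this (the name, selector, and ports are hypothetical) would be assigned one of the pool addresses:

apiVersion: v1
kind: Service
metadata:
  name: demo-web              # hypothetical service
spec:
  type: LoadBalancer
  selector:
    app: demo-web             # hypothetical app label
  ports:
    - port: 80
      targetPort: 8080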

As mentioned in the disclaimer above, the version of Kubespray I have installs MetalLB 0.13.9. I could have overridden 'metallb_version' to a newer version, like 'v0.14.5', but the templates for MetalLB in Kubespray use the older v0.11.0 kubebuilder image in several places. To get the same versioning as a Helm install of MetalLB, I would have to modify the templates to specify v0.14.0. I also saw other configuration differences from the CRDs used in the Helm version, like setting the tls_min_version argument, and not setting some priority and priorityClassName configurations.

NGINX Ingress

This one is pretty easy to enable, by changing this setting in inventory/mycluster/group_vars/k8s_cluster/addons.yml:

ingress_nginx_enabled: true

When the cluster comes up, there will be an ingress daemonset, which creates an ingress controller pod on each node, and an NGINX ingress service with an IP from the MetalLB address pool.
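To confirm, something like this should show the controller pods on each node and the service's external IP (assuming Kubespray's default ingress-nginx namespace):

kubectl -n ingress-nginx get daemonset,pods,svc -o wide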

There are example YAML files in the MetalLB/NGINX Ingress post that will let you create pods and services, and an ingress resource that provides access via path prefixes.

June 1

High Availability?

OK, so I have a cluster with three control plane nodes and four worker nodes (currently). However, if I shut down the control plane node that is hosting the API server, I lose API access. 🙁

I've been digging around, and it looks like kube-vip would be a good solution. It lets me create a virtual IP for the API server, and handles load balancing and leader election among the control plane nodes, so that if the node providing the API fails, another control plane node can take over. In addition, kube-vip can do load balancing for services (I'm not sure if that makes MetalLB redundant).

Before installing kube-vip, I needed to change the cluster configuration. I changed the inventory so that etcd runs ONLY on the control-plane nodes (and not on a mix of control plane and worker nodes).

Next, I made these changes to inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml:

kube_proxy_mode: ipvs
kube_proxy_strict_arp: true
kube_proxy_exclude_cidrs: ["CIDR_OF_LOCAL_NETWORK",]

This had kube-proxy also using IPVS (versus iptables), and running in strict ARP mode (needed for kube-vip). Lastly, to prevent kube-proxy from clearing IPVS settings made by kube-vip, the local network IPs must be excluded. With those changes, I re-created a cluster, and was ready to install kube-vip…
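Once the cluster is up, these settings can be sanity-checked in the kube-proxy ConfigMap (assuming the usual kubeadm-style ConfigMap name):

kubectl -n kube-system get configmap kube-proxy -o yaml | grep -E 'mode|strictARP|excludeCIDRs'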

There was a Medium article by Chris Kirby on using a Helm install of kube-vip for HA. It used an older version of kube-vip (0.6.4) and values.yaml settings for K3s. I added the Helm repo for kube-vip and pulled the values.yaml file to be able to customize it:

mkdir ~/workspace/kubernetes/kube-vip
cd ~/workspace/kubernetes/kube-vip
helm repo add kube-vip https://kube-vip.github.io/helm-charts
helm repo update

wget https://raw.githubusercontent.com/kube-vip/helm-charts/main/charts/kube-vip/values.yaml

Here are the changes I made to the values.yaml, saving it as values-revised.yaml:

6c6
< pullPolicy: IfNotPresent
---
> pullPolicy: Always
8c8
< # tag: "v0.7.0"
---
> tag: "v0.8.0"
11c11
< address: ""
---
> address: "VIP_ON_LOCAL_NETWORK"
20c20
< cp_enable: "false"
---
> cp_enable: "true"
22,23c22,24
< svc_election: "false"
< vip_leaderelection: "false"
---
> svc_election: "true"
> vip_leaderelection: "true"
> vip_leaseduration: "5"
61c62
< name: ""
---
> name: "kube-vip"
86c87,88
< nodeSelector: {}
---
> nodeSelector:
> node-role.kubernetes.io/control-plane: ""
91a94,97
> - effect: NoExecute
> key: node-role.kubernetes.io/control-plane
> operator: Exists
>
93,101c99,104
< # nodeAffinity:
< # requiredDuringSchedulingIgnoredDuringExecution:
< # nodeSelectorTerms:
< # - matchExpressions:
< # - key: node-role.kubernetes.io/master
< # operator: Exists
< # - matchExpressions:
< # - key: node-role.kubernetes.io/control-plane
< # operator: Exists
---
> nodeAffinity:
> requiredDuringSchedulingIgnoredDuringExecution:
> nodeSelectorTerms:
> - matchExpressions:
> - key: node-role.kubernetes.io/control-plane
> operator: Exists

Besides using a newer kube-vip version, this enables load balancing for the control plane and services, selects nodes that have the control-plane label (with no value, unlike the article), and sets the node affinity.

With this custom values file, I could do the install:

helm install my-kube-vip kube-vip/kube-vip -n kube-system -f values-revised.yaml

With this, all the kube-vip pods were up, and the daemonset showed three desired, current, and ready. However, when I changed the server IP to my VIP in ~/.kube/config and tried kubectl commands, they failed, saying that there was an x509 certificate for each of the control plane nodes and a cluster IP, but not for the VIP I'm using.

This can be fixed by re-generating the certificates on every control plane node:

sudo su
cd
kubectl -n kube-system get configmap kubeadm-config -o jsonpath='{.data.ClusterConfiguration}' --insecure-skip-tls-verify > kubeadm.yaml

mv /etc/kubernetes/pki/apiserver.{crt,key} ~
kubeadm init phase certs apiserver --config kubeadm.yaml

In the output, I saw the IPs of the control plane nodes AND the VIP I defined. Next, the kube-apiserver container needs to be stopped and removed, so that a new one is started.

crictl ps | grep kube-apiserver
crictl stop <ID-of-apiserver>
crictl rm <ID-of-apiserver>

Now, kubectl commands using the VIP will be directed to the control plane node running the API server, and if that node is unavailable, the requests will be redirected to another control plane node. You can see this by running arping against the VIP: when leadership changes, the MAC address displayed will change.
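For example (the interface name and VIP are placeholders):

sudo arping -I eth0 VIP_ON_LOCAL_NETWORK    # the answering MAC changes when the kube-vip leader changes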

Kind of involved, but this works!

I did have some problems when playing with HA for the API. I rebooted the control plane node that was actively providing the API. Kube-vip did its job, and IPVS redirected API requests to another control plane node that was "elected" as the new leader. All good so far.

However, when that control plane node came back up, it would appear in the “kubectl get node” output, but showed as “NotReady”, and it never seemed to become ready. It appeared that the network was not ready, and the calico-node pod was showing an error. I played around a bit, but couldn’t seem to clear the error.

One thing I did was run Kubespray's upgrade-cluster.yml with the --limit argument, specifying the node and one of the other control plane nodes (so that control plane "facts" were gathered). The kube-vip pod for the node was still failing with a connection refused error. On the node, I stopped/removed the kube-apiserver container and then the kube-vip container, and after that kube-vip no longer had any errors.

The only remaining oddity was that ipvsadm on the node did not show a load balancing entry for the VIP, and the other two control plane nodes only had their own IPs in the load balancing entry for the VIP. I didn't try rebooting another control-plane node.
