From London Hackspace Wiki


This page provides a bit of a braindump on the Kubernetes setup.

There is (currently) one master node on Blanton. I have thought about adding one on Landin, but it would require careful thought: an even number of masters is generally discouraged, because etcd needs a majority of members to agree, so a second master adds no fault tolerance and losing either one (or the link between them) stalls the cluster. Ideally we'd have a third machine to keep the number of masters odd.
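
The arithmetic behind the odd-number advice can be sketched in a few lines of shell. This assumes etcd runs on the masters (the kubeadm default) and that availability needs a strict majority of members; the function names are made up for illustration.

```shell
# Smallest number of etcd members that forms a majority of n members.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Number of members that can fail while a majority still remains.
fault_tolerance() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

# Print a small table for cluster sizes 1 to 5.
quorum_table() {
    local n
    for n in 1 2 3 4 5; do
        printf 'masters=%d quorum=%d tolerated_failures=%d\n' \
            "$n" "$(quorum "$n")" "$(fault_tolerance "$n")"
    done
}
```

Running quorum_table shows that going from one master to two still tolerates zero failures, while three masters tolerate one.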

Node               Role    Location  Notes
kube-master        Master  Blanton   4 cores, 4GB of RAM
kube-node-blanton  Worker  Blanton   8 cores, 8GB of RAM
kube-node-landin   Worker  Landin    8 cores, 8GB of RAM
kube-node          Worker  Blanton   8 cores, 8GB of RAM. Older worker, mostly providing redundant capacity when one of the others is down for maintenance

At the time of writing, all of our containerised services can fit on a single node.

General Notes

I did try doing something with docker-compose, but the networking got unwieldy fast, and I realised I was about to create something not unlike Kubernetes, but badly, in a bunch of scripts! A big part of what took so long to get this working was the dual-stack IPv4 and IPv6 support needed to fit into the rest of the hackspace environment.

A few quick notes:

  • Networking is provided by Calico
  • LoadBalancer requests are serviced by metallb
    • If you want both IPv4 and IPv6 you will need to create two LoadBalancer instances pointing to the same service
  • nginx-ingress is configured to support HTTP/HTTPS services
  • cert-manager is configured to issue LetsEncrypt certificates automatically, assuming DNS entries are already in place (as would be needed for a regular VM wanting a cert)
    • Mark your ingress with the annotation "letsencrypt-prod"
  • There's a single-node glusterfs "cluster" providing storage
  • While it's all currently on Blanton, if there were another box (or ideally two) available, it would be possible to make this much more resilient
  • It's running bleeding-edge versions of cert-manager and ingress-nginx because I updated to 1.22 before things were ready :-)

MetalLB is configured to allocate IP addresses in the ranges and 2a00:1d40:1843:182:f000::/68. It advertises these on the LAN in layer 2 mode (ARP for IPv4, NDP for IPv6).

Accessing the cluster

Currently (useful) access is limited to those in the Admins LDAP group. After a bit more testing though, I'd like to create a members namespace for members to play in.

First, go get kubectl (preferably version 1.22 which, at the time of writing this, is the version the cluster runs)

Now go get the binary for your OS from and put it somewhere on your PATH as ldap-kube-auth (or on Windows ldap-kube-auth.exe - you might just want to put it next to kubectl and run it from that directory, at least to get going)

Other filenames and paths would work, but you'll need to modify the config file below in that case.

Now stick the following in $HOME/.kube/config (if you already have a config in there, you'll need to merge them, or put it somewhere else and point the KUBECONFIG environment variable at it)

apiVersion: v1
clusters:
- cluster:
    server: <API server URL>
  name: lhs
contexts:
- context:
    cluster: lhs
    namespace: default
    user: lhsuser
  name: lhs
current-context: lhs
kind: Config
preferences: {}
users:
- name: lhsuser
  user:
    exec:
      command: ldap-kube-auth
      env:
      - name: AUTH_CLUSTER
        value: LHS
      - name: AUTH_URL
        value: <auth service URL>
      interactiveMode: Always
      apiVersion: ""
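
If you already have a config, one way to merge is to let kubectl do it: KUBECONFIG accepts a colon-separated list of files on Linux and macOS, and kubectl config view --flatten prints the merged result. The sketch below wraps that up; the function names and the lhs-config.yaml path in the usage comment are made up for illustration.

```shell
# Join the given paths with ':' to form a KUBECONFIG value (Linux/macOS separator).
kubeconfig_list() {
    local IFS=:
    printf '%s\n' "$*"
}

# Write one flattened config built from the given input files to the path in $1.
# The result goes through a temporary file so the output may also be an input.
merge_kubeconfigs() {
    local out="$1" tmp
    shift
    tmp="$(mktemp)" || return 1
    if KUBECONFIG="$(kubeconfig_list "$@")" kubectl config view --flatten > "$tmp"; then
        mv "$tmp" "$out"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Example (hypothetical path for the LHS config saved from this page):
#   merge_kubeconfigs "$HOME/.kube/config" "$HOME/.kube/config" "$HOME/lhs-config.yaml"
```

When the same setting (such as current-context) appears in more than one file, the first file listed wins, so put the config you want to stay the default first.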

Now you should be able to run commands:

$ kubectl get pods
kubectl LDAP login helper
Logging into LHS
Press enter for defaults
Username (michael): mich181189
Password: NAME                         READY   STATUS    RESTARTS   AGE
dockerreg-8568799fdb-vwtgg   1/1     Running   0          173m

The slightly gory details

Kubernetes does not natively support LDAP authentication, so we use webhook token authentication. To make this easier to use, there is a credential plugin.

The sources for the server and client implementations of this are on Github:

Tokens are valid for 6 hours, and get cached in $HOME/.kube/LHS_cache (assuming the above config is used). Delete this file if you think the cache is causing you problems. The caching is very necessary; otherwise you get a prompt (or sometimes multiple prompts!) per command. Server-side, while JWTs might be popular, this service just generates random strings and stashes them in Redis for later lookup.
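
A tiny helper for clearing the cache might look like this. The default path is the one given above; the function name is made up, and it only removes the file and makes no attempt to check the cached token.

```shell
# Remove the cached LDAP login token so the next kubectl command prompts again.
# Uses the cache path from the text by default; pass another path to override.
clear_lhs_token_cache() {
    local cache="${1:-$HOME/.kube/LHS_cache}"
    if [ -e "$cache" ]; then
        rm -f -- "$cache" && echo "removed $cache"
    else
        echo "no cache at $cache"
    fi
}
```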

Kubernetes is configured to check these tokens against a web service running in the cluster. This is done with the following chunk of kubeadm-config:

apiVersion: v1
data:
  ClusterConfiguration: |
    apiServer:
      extraArgs:
        authentication-token-webhook-config-file: /etc/kubernetes/auth/ldap-auth.yaml
        authentication-token-webhook-version: v1

and a patch passed to kubeadm to get it to mount /etc/kubernetes/auth into the apiserver container. If this patch is not used, it will cause the API server to refuse to start! (see the upgrade instructions below for how to specify the patch directory to kubeadm upgrade)

The server deployment is in the (private) GitHub repo kubernetes-config - this is just a fairly standard Kubernetes service deployment.

Instructions Braindump For Admins

Adding a node

Kubernetes mostly requires a basic OS install, but there are a few steps you need to make sure you do correctly.

A key point here is that until recently, Kubernetes didn’t support nodes with swap enabled. These instructions therefore set nodes up without swap. (I’m still not entirely convinced swap is a good idea!)

  1. Install latest Debian on a VM (without swap) with SSH and standard system utilities
  2. Stick your SSH key into /root/.ssh/authorized_keys and /home/<you>/.ssh/authorized_keys
  3. Add to the lhs-hosts section of Ansible (all nodes starting with kube-* get some basic kubernetes requirements installed) and deploy to it
  4. On the master, run
    kubeadm token create
    to get a token
  5. On the master run to get the cert hash:
    openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | \
    openssl dgst -sha256 -hex | sed 's/^.* //'
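  6. On the new node, run (kubeadm join needs the address of the API server as well as the token and hash)
    kubeadm join <master-host>:<port> --token <token> --discovery-token-ca-cert-hash sha256:<hash>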
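
Steps 4 to 6 can be glued together with a couple of shell functions, roughly as below. This is only a sketch: the function names are made up, and the endpoint in the usage comment is a placeholder. kubeadm can also print a ready-made join command for you with kubeadm token create --print-join-command, which avoids computing the hash by hand.

```shell
# Compute the sha256 hash of the cluster CA public key (same pipeline as step 5).
# Run on the master; the default path is the kubeadm CA certificate location.
ca_cert_hash() {
    local ca_cert="${1:-/etc/kubernetes/pki/ca.crt}"
    openssl x509 -pubkey -in "$ca_cert" \
        | openssl rsa -pubin -outform der 2>/dev/null \
        | openssl dgst -sha256 -hex \
        | sed 's/^.* //'
}

# Print the join command to run on the new node.
# Arguments: API server endpoint (host:port), token, CA cert hash.
build_join_command() {
    local endpoint="$1" token="$2" hash="$3"
    printf 'kubeadm join %s --token %s --discovery-token-ca-cert-hash sha256:%s\n' \
        "$endpoint" "$token" "$hash"
}

# Example, on the master (endpoint is a placeholder):
#   build_join_command "<master-host>:<port>" "$(kubeadm token create)" "$(ca_cert_hash)"
```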

Draining a node for maintenance

It's a good idea to shift work off a node when you're about to do anything to it (upgrade, reboot, etc.)

  1. run kubectl drain <node> --ignore-daemonsets

When you're done with the maintenance:

  1. run kubectl uncordon <node>

You might want to delete some pods that are running on remaining nodes so they get restarted more evenly spread across the nodes. Alternatively, you might want to just wait for usual updates and stuff to restart pods
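
The drain / do the work / uncordon sequence can be wrapped in one function so the uncordon doesn't get forgotten. A minimal sketch, with a made-up function name; it deliberately leaves the node cordoned if the maintenance command fails, so you can look at it first.

```shell
# Drain a node, run a maintenance command, then uncordon the node again.
# Usage: maintain_node <node> <command> [args...]
# If the drain or the maintenance command fails, the node is NOT uncordoned.
maintain_node() {
    local node="$1"
    shift
    kubectl drain "$node" --ignore-daemonsets || return 1
    "$@" || { echo "maintenance failed; $node left cordoned" >&2; return 1; }
    kubectl uncordon "$node"
}

# Example: reboot a worker over SSH (hostname is a placeholder)
#   maintain_node kube-node-landin ssh root@<node-host> reboot
```

For spreading pods back out afterwards, kubectl rollout restart deployment/<name> is a gentler alternative to deleting pods by hand, since it replaces them gradually.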

Removing a node

  1. Drain the node
     kubectl drain <node> --ignore-daemonsets
  2. Delete the node record from Kubernetes
     kubectl delete node <node>
  3. Probably delete the VM or something - it's done now

Upgrading Kubernetes

Full instructions here: READ THEM!

It’s perhaps worth ignoring the 1.x.0 releases, since experience suggests things like MetalLB and Calico might not yet support them in a stable version, which is a recipe for pain.

  1. On the master node, run apt-cache madison kubeadm to find a version to update to
  2. On the master node, run:
    sudo apt-get install -y --allow-change-held-packages kubeadm=<your-chosen-version>
    sudo kubeadm upgrade plan
    sudo kubeadm upgrade apply --patches=/etc/kubernetes/patches v<your-chosen-version>
    sudo apt-get install -y --allow-change-held-packages kubelet=<your-chosen-version> kubectl=<your-chosen-version>

On each node:

  1. Drain the node
  2. Run:
    sudo apt-get install -y --allow-change-held-packages kubeadm=<your-chosen-version>
    sudo kubeadm upgrade node
    sudo apt-get install -y --allow-change-held-packages kubelet=<your-chosen-version> kubectl=<your-chosen-version>
  3. Run kubectl get nodes wherever you normally run kubectl to make sure the node is running the expected version
  4. Uncordon the node
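
To check versions across the whole cluster after an upgrade, something like the following can help. It is a sketch with made-up function names: the first function asks kubectl for each node's kubelet version, and the second filters for nodes that don't match the version you expected.

```shell
# Print "node<TAB>kubelet-version" for every node in the cluster.
list_node_versions() {
    kubectl get nodes \
        -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.kubeletVersion}{"\n"}{end}'
}

# Read "node<TAB>version" lines on stdin and print the names of nodes
# whose version differs from the expected one given as $1.
nodes_not_at() {
    local expected="$1"
    awk -F '\t' -v want="$expected" '$2 != want { print $1 }'
}

# Example:
#   list_node_versions | nodes_not_at v<your-chosen-version>
```

No output from the second command means every node reports the expected version.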

Fixing Screwups

Re-adding a node you removed by mistake

If you accidentally run

kubectl delete node <node>

when you didn't mean to, don't panic - the workload should be shifted automatically to a remaining node. Here's how to re-add the one you just removed:

  1. Run kubeadm reset on the affected node
  2. Run kubeadm join as if it was a new node

ETCD Maintenance

It seems the ETCD database can grow very, very large, which can cause startup to take many minutes, making many things unhappy.

Running this on the master can help:

 ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/pki/etcd/ca.crt  --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key  get revisiontestkey -w json

Take the revision from the output above, subtract one, and use the result in place of 45368345 when you run:

 ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/pki/etcd/ca.crt  --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key  compact 45368345
 ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/pki/etcd/ca.crt  --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key defrag
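
The revision lookup and subtraction can be scripted rather than done by hand. A sketch, assuming etcdctl prints compact JSON with a "revision" field in the header (as it does for get ... -w json); the function names are made up.

```shell
# etcdctl with the certificate flags used above; extra arguments are passed through.
etcdctl_lhs() {
    ETCDCTL_API=3 etcdctl \
        --cacert=/etc/kubernetes/pki/etcd/ca.crt \
        --cert=/etc/kubernetes/pki/etcd/server.crt \
        --key=/etc/kubernetes/pki/etcd/server.key "$@"
}

# Read etcdctl JSON output on stdin and print the header revision minus one.
# Fails (non-zero exit, no output) if no revision is found.
compact_target() {
    local rev
    rev="$(grep -o '"revision":[0-9][0-9]*' | head -n 1 | grep -o '[0-9][0-9]*$')"
    [ -n "$rev" ] || return 1
    echo $(( rev - 1 ))
}

# Example, on the master:
#   rev="$(etcdctl_lhs get revisiontestkey -w json | compact_target)" \
#       && etcdctl_lhs compact "$rev" \
#       && etcdctl_lhs defrag
```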