My journey of building a custom Raspberry Pi Compute Module 4 based k3s cluster, and how I turned it from an idea into reality!
Inspired by NetworkChuck’s video tutorial “i built a Raspberry Pi SUPER COMPUTER!! // ft. Kubernetes (k3s cluster w/ Rancher)” on YouTube and the corresponding course on his website, as well as Jeff Geerling’s video channel and blog, I set out on four different learning experience paths:
I should add that I am well down the path of my career, but I have always enjoyed tech and Linux. However, I am doing all of this as a hobby. I am sure there are many different (and possibly better) ways of doing things, but this was my way.
Watching Jeff Geerling’s video New Raspberry Pi Projects - CM4 NAS, Piunora, and Seaberry!, in which he talks about mebs_t’s self-designed NAS, made me want to build something similar. At the time, the use case I was working on was staking Navcoin headlessly on a Raspberry Pi Compute Module 4.
I was looking for a Raspberry Pi Compute Module 4 (CM4) carrier board that only provides power to a CM4 with WiFi and eMMC storage. Many of the carrier boards available at the time provided a lot more functionality, so I set out to design and build the Less-is-More (LiM) carrier board.
What I was looking for was a minimalistic board, so I asked Jeff Geerling whether he knew of a more minimalistic design than this board. He pointed me to uptime.lab’s Upberry.
Neither of these boards was exactly what I was looking for, so I was motivated to design my own board, customized to my specs. That is how the Less-is-More board series came to be.
The original idea of the LiM Carrier Board series was to build a series of minimalistic Raspberry Pi (RPi) Compute Module 4 (CM4) Carrier Boards. Less-is-More (LiM) refers to the minimalistic design only providing the most rudimentary functionality to the CM4 such as 5V power via USB-C power connector and two status (power/activity) LEDs for the original LiM Carrier Board version. An additional LiM+ version of the LiM Carrier Board featured additional functionality by adding flashing capability through a jumper.
Except for the LiM CM4 Cluster Carrier Board, the LiM and LiM+ carrier boards are meant for CM4 modules with onboard eMMC storage and WiFi, as they provide nothing beyond power, status LEDs, and (for the LiM+) flashing capability. The LiM CM4 Cluster Carrier Board, by contrast, supports CM4 modules both with and without eMMC (Lite version), offers micro SD card storage, has a 2230/2242 M.2 M-key socket for an NVMe PCIe SSD, provides PoE Ethernet, and works with both WiFi and non-WiFi CM4 models.
Collage 0: Various LiM Carrier Board models
Three versions of the LiM Carrier Board series have been prototyped, and more information can be found on their respective pages.
The original LiM carrier board took about two months from idea (May 22nd, 2021) to delivery (July 14th, 2021). Special thanks to Anish Verma a.k.a. thelasthandyman and Muhammad S. for working with me on the CAD designs of the carrier board.
I forked this board design from Shawn Hymel, who has a two-part YouTube series in which he goes through how to design a CM4 carrier board.
Feel free to modify this design for your own application.
Here is the current spec and feature list of designed and planned carrier boards:
Model | Power (5V USB-C) | LEDs (Power/Activity) | Flashing Capability |
---|---|---|---|
LiM Board | ✔ | ✔ | ❌ |
LiM+ Board | ✔ | ✔ | ✔ |
While working on the LiM+ Board, I got interested in clustering and decided to switch directions. I wanted to add a couple more features that could be of value for a cluster type carrier board. And so, the LiM CM4 Cluster Carrier Board idea was born. The LiM CM4 Cluster Board has the following features:
This summarizes how I came up with the idea, then designed and built a custom carrier board for my LiM cluster.
Start Sponsored Content
Now that I had the designs drawn up for the LiM cluster board, I needed them manufactured. I chose PCBWay.
PCBWay is a China-based manufacturer specializing in PCB (Printed Circuit Board) production and assembly. They serve both hobbyists and professionals, offering a variety of services such as PCB prototyping, small-batch production, PCB assembly, and other related services.
The company is well-regarded for their competitive pricing, range of options (such as different board materials and finishes), and a user-friendly online ordering system which allows customers to get instant quotes and track their orders.
The company supports a variety of file formats for board design, and they also offer design and layout services. Additionally, PCBWay runs a community platform where PCB designers can share their projects and participate in contests.
There are certainly other services out there, but from my personal experience I can wholeheartedly recommend their PCB manufacturing services. Pricing is reasonable and their customer service is great: I got clarifying emails from their technicians to ensure the PCBs were manufactured correctly. Even with the timezone difference, they are responsive, leading to minimal delays in the manufacturing process. Shipping is also fast; however, I had to pay duty and import fees (as you do with all other manufacturers, so this is not a PCBWay issue).
End Sponsored Content
The idea for a cluster tray, or Pi Tray, was formed when I started out designing the Less-is-More (LiM) Raspberry Pi Compute Module 4 (CM4) cluster carrier board.
All the cluster setups for Raspberry Pis I found on the Internet have the boards lined up either side by side or stacked one above the other. This type of alignment requires venting and fans to move the heat away from the CPUs and out of the cases.
Horizontal Cluster | Vertical Cluster |
---|---|
Fig 1: Examples of Raspberry Pi Cluster Enclosures
This is what made me think of arranging the Raspberry Pis in an offset pattern to allow for better airflow. I was thinking of using a Pringles tube and aligning the trays in a star-shaped pattern.
The original idea was a 2D drawing and a model made from toilet paper rolls with Bristol board wings for the trays for the Raspberry Pis:
Fig 2: First Brainstorm Images of the Pi Cluster Tray
First Model:
Collage 1: Various Pictures of the First Pi Tray Model
For the next version of the Pi Tray prototype, I replaced the toilet paper rolls with a Pringles tube:
Collage 2: Various Pictures of the Pringles Pi Tray Prototype
As a next step, I thought maybe I could design and 3D print a modular tray that could allow for better air circulation. I found a guy on Fiverr who designed the first prototype of the Pi Tray.
The problem with the original prototype was that it was too big to be printed on a standard 3D printer surface. It had to be printed in two pieces and then glued together.
Another shortcoming of the first iteration was that it was only designed with mounting holes for the LiM cluster board. So another freelancer, Sergiy L. on Upwork, helped me design the next iteration of the Pi Tray. This time I had it designed to fit on a standard 3D printer bed (220mm x 220mm x 250mm) and to include mounting options not only for the LiM cluster board but also for a Raspberry Pi 4B.
Pi Tray v2 drawing | Pi Tray v2 3 levels drawing |
---|---|
Fig 4: 3D Drawings of the Modular Pi Tray
Here are some pictures of the Pi Tray v2 printed, with a LiM cluster board and a Raspberry Pi 4B mounted:
Collage 3: Various Pictures of the 3D Printed Modular Pi Tray
So, after I had two Pi Tray modules printed and started putting a cluster together, I noticed I had forgotten to add notches to the bottom of the design to let the cables pass out and allow the tray to sit evenly on a surface.
I took a Dremel to one of the modules and added four notches to let the network cables pass through. The final v3 of the Pi Tray will be updated to include these notches.
I will post pictures of the final v3 of the Pi Tray as well as the complete eight node cluster mounted on the modular trays when complete.
After what felt like an eternity (2+ years), I had all the components together to finally build my LiM cluster.
Parts List:
Assembly line: the parts to put all eight cluster nodes together:
And this is how eight of the cluster nodes look when put together:
Assembled cluster with eight cluster nodes in the Pi (Pringles) Tray, standing on top of an 8-port PoE router:
Since I wanted to use Rancher to manage the cluster, I also prepared an Intel-based UP 4000 board and installed Ubuntu 22.04 and Rancher 2.5 on it (more on the Rancher install further down). I set this up as the ‘clustercontroller’ (10.0.0.200). All management of the cluster (Rancher, Ansible) is done from the clustercontroller node.
So, for the most part, to install the cluster software I followed NetworkChuck’s video tutorial “i built a Raspberry Pi SUPER COMPUTER!! // ft. Kubernetes (k3s cluster w/ Rancher)”. I wanted to automate the cluster build as much as I could, so wherever possible I created Ansible playbooks. But first, I needed to install Raspberry Pi OS on the cluster nodes.
Initial set up/configuration. My Ansible inventory file defines the controller, master, and worker nodes:
```ini
[clustercontroller]
10.0.0.200

[clustermaster]
10.0.0.201

[clusterworkers]
10.0.0.202
10.0.0.203
10.0.0.204
10.0.0.205
10.0.0.206
10.0.0.207
10.0.0.208

[limcluster]
10.0.0.201
10.0.0.202
10.0.0.203
10.0.0.204
10.0.0.205
10.0.0.206
10.0.0.207
10.0.0.208

[testnode]
10.0.0.209
```
Then I tested connectivity by pinging the cluster nodes with Ansible, and wrote the following playbook to update and upgrade all nodes:
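A sketch of the ad-hoc ping command, assuming the inventory file above is saved as `hosts`, might look like this:

```shell
# Ping all nodes in the limcluster group as the pi user
ansible limcluster -i hosts -u pi -m ping
```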
```yaml
---
- hosts: "{{ variable_hosts }}"
  remote_user: pi
  become: true
  become_user: root
  gather_facts: false
  tasks:
    - name: Update apt repo and cache on all Debian/Ubuntu boxes
      apt:
        update_cache: yes
        force_apt_get: yes
        cache_valid_time: 3600
    - name: Upgrade all packages on servers
      apt:
        upgrade: dist
        force_apt_get: yes
    - name: Check if a reboot is needed on all servers
      register: reboot_required_file
      stat:
        path: /var/run/reboot-required
        get_md5: no
    - name: Reboot the server if kernel updated
      reboot:
        msg: "Reboot initiated by Ansible for kernel updates"
        connect_timeout: 5
        reboot_timeout: 300
        pre_reboot_delay: 0
        post_reboot_delay: 30
        test_command: uptime
      when: reboot_required_file.stat.exists
```
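The playbook can then be run against any inventory group through the `variable_hosts` extra var; the playbook filename and inventory path here are assumptions on my part:

```shell
# Update/upgrade every node in the limcluster inventory group
ansible-playbook -i hosts update-upgrade.yml -e "variable_hosts=limcluster"
```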
k3s requires memory cgroups, so the following kernel parameters need to be appended to /boot/cmdline.txt on each Raspberry Pi:

`cgroup_memory=1 cgroup_enable=memory`

I wrote the following Ansible playbook so I can easily do this across multiple cluster nodes and also easily add additional nodes if wanted/needed:

```yaml
---
- hosts: "{{ variable_hosts }}"
  remote_user: pi
  become: true
  become_user: root
  tasks:
    - name: Check whether /boot/cmdline.txt contains 'cgroup_memory' and append recommended cluster node vars if not found
      command: "grep 'cgroup_memory' /boot/cmdline.txt"
      register: checkmyconf
      check_mode: no
      ignore_errors: yes
      changed_when: no
      failed_when: false
    - meta: end_host
      when: checkmyconf.rc == 0
    - name: Add cgroup_memory to /boot/cmdline.txt
      ansible.builtin.lineinfile:
        path: "/boot/cmdline.txt"
        backrefs: true
        regexp: '^(.*rootwait.*)$'
        line: '\1 cgroup_memory=1 cgroup_enable=memory'
      register: updated
      when: checkmyconf.rc != 0
    - name: Reboot when /boot/cmdline.txt was updated with recommended cluster node vars
      reboot:
        connect_timeout: 5
        reboot_timeout: 300
        pre_reboot_delay: 0
        post_reboot_delay: 30
        test_command: uptime
      when: updated.failed == false
```
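The grep-then-append logic in this playbook can be sketched in plain shell. The snippet below works on a scratch copy of a cmdline.txt (with a made-up PARTUUID) rather than the real /boot/cmdline.txt:

```shell
# Work on a scratch file instead of the real /boot/cmdline.txt
cmdline="$(mktemp)"
# Placeholder cmdline contents; the PARTUUID is made up
echo 'console=serial0,115200 root=PARTUUID=00000000-02 rootfstype=ext4 fsck.repair=yes rootwait' > "$cmdline"

# Same idempotent logic as the playbook: append the flags only if missing
if ! grep -q 'cgroup_memory' "$cmdline"; then
  sed -i 's/\(.*rootwait.*\)/\1 cgroup_memory=1 cgroup_enable=memory/' "$cmdline"
fi

cat "$cmdline"
```

Because the append only happens when the marker string is absent, running this twice leaves the file unchanged, which is exactly what makes the playbook safe to re-run on all nodes.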
Next, the following playbook installs k3s on the master node, reads the join token from it, and then installs k3s on the worker nodes:

```yaml
---
- hosts: "{{ variable_hosts }}"
  remote_user: pi
  become: true
  become_user: root
  tasks:
    - name: Check if k3s is already installed
      register: k3s_installed
      stat:
        path: /usr/local/bin/k3s
        get_md5: no
    - name: k3s is already installed and exit
      debug:
        msg: "k3s is already installed on {{ ansible_hostname }}"
      when: k3s_installed.stat.exists
    - meta: end_host
      when: k3s_installed.stat.exists
    - name: Install k3s on master node remote target
      shell: curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.21.1+k3s1" K3S_KUBECONFIG_MODE="644" sh -s -

- hosts: "{{ variable_master }}"
  gather_facts: false
  user: pi
  become: true
  become_user: root
  tasks:
    - name: "Read k3s cluster master token"
      shell: |
        cat /var/lib/rancher/k3s/server/node-token
      register: file_content
    - name: "Add k3s cluster master token to dummy host"
      add_host:
        name: "master_token_holder"
        hash: "{{ file_content.stdout }}"
        ip: "{{ inventory_hostname }}"

- hosts: "{{ variable_worker }}"
  user: pi
  become: true
  become_user: root
  tasks:
    - name: Check if k3s is already installed
      register: k3s_installed
      stat:
        path: /usr/local/bin/k3s
        get_md5: no
    - name: k3s is already installed and exit
      debug:
        msg: "k3s is already installed on {{ ansible_hostname }}"
      when: k3s_installed.stat.exists
    - meta: end_host
      when: k3s_installed.stat.exists
    - name: Install k3s worker on remote target
      shell: curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.21.1+k3s1" K3S_KUBECONFIG_MODE="644" K3S_TOKEN="{{ hostvars['master_token_holder']['hash'] }}" K3S_URL="https://{{ hostvars['master_token_holder']['ip'] }}:6443" K3S_NODE_NAME="{{ ansible_hostname }}" sh -
```
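Running the playbook then just means supplying the three host variables from the inventory; the playbook filename is an assumption:

```shell
# Install k3s on the master, read its token, then join the workers
ansible-playbook -i hosts k3s-install.yml \
  -e "variable_hosts=clustermaster variable_master=clustermaster variable_worker=clusterworkers"
```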
I now had a functioning 8-node k3s cluster, managed by Rancher.
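As a sanity check, the node list can be inspected on the master node (or anywhere the cluster kubeconfig is available):

```shell
# List all cluster members, their roles, status, and k3s versions
kubectl get nodes -o wide
```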
Note: I ran into an issue that caused the Minecraft deployment to crashloop and constantly restart. The issue is described here:
I fixed it by changing the following values in the Helm chart for the Minecraft deployment:
```yaml
...
livenessProbe:
  ...
  initialDelaySeconds: 90
...
readinessProbe:
  ...
  initialDelaySeconds: 30
...
```
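Rolling the changed values out is a standard helm upgrade; the release name, chart, and values filename below are placeholders, not the actual names from my deployment:

```shell
# Re-deploy the Minecraft release with the adjusted probe delays
helm upgrade <release-name> <chart> --reuse-values -f probe-values.yaml
```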
This was more of a fun add-on that is not really necessary for the functionality of the cluster, but I thought it would be nice to display basic system information on a small OLED display for easier identification of the individual nodes.
See my separate page on this side project for now: oleddisplaystats, and the more Raspberry Pi/limcluster-specific: displaypistats