In praise of the homelab
Buy a Raspberry Pi and break something.
Over the course of a sort of sabbatical from the broader technology industry, I've been taking some time to revisit my favorite computer projects. One of these projects is my homelab.
Before going further, it is probably worthwhile to provide some definition of a homelab. For the purposes of this post, a homelab is a computing environment used for testing operating system, storage, and network configurations for the environment itself, generally hosted on a private home network. It is the computing equivalent of a scrap pile in a workshop; a user can pull material from it and test things out that might fail without incurring undesirable costs elsewhere.
Homelabs can take on a variety of forms, depending on the goals and interests of their creators. One person might have a single tower with a powerful CPU that runs a hypervisor and RAID array for testing virtual machines. Another might have a dedicated server rack filled with secondhand hardware communicating over 10G links for tinkering with networks. I myself keep a Raspberry Pi stack[1] and a NAS under a coffee table where I can look at the blinking lights.
One might wonder why it is useful to build such a thing in a world where cloud providers offer more or less complete computing environments at low cost. It is true that cloud providers solve tangible, concrete problems via their platforms. The biggest of these are cost and reliability. For a computer to run, many things must go right: an operating system must be available, storage devices must be in good working order, and the computer's network configuration must match what is required of it. Systems built on so many contingencies inevitably fail at some point, and accounting for those failure states costs real time and money and incurs real risk. Cloud platforms offer to manage that cost and risk for their users in exchange for a recurring fee. For businesses that either do not want or cannot afford to pay staff to manage servers, ASNs, and datacenter contracts, this can be of great benefit.
What is good for a business is not necessarily good for an engineer, however. Where the cost of a cloud provider's services for one is primarily measured in money, for the other it is measured in expertise. A server in a cloud platform is, after all, not really a piece of hardware. It is a rentable abstraction of hardware, with all of the rough edges of physical reality filed away and replaced with interfaces for additional rentable abstractions. Abstractions are not bad in and of themselves - like cloud providers, they solve real problems - but the human tendency to look for cognitive shortcuts means a user might mistake this simulated experience for the real thing.
Consider, for example, a public IP address on a cloud platform. Such an IP address will, in theory, allow "public" traffic to flow between some machine and the wider internet. The platform may allocate the address for a user as an independent, billable resource. It may allow assignment of that IP address to a given instance (i.e., a virtual machine or rentable physical server). What happens during these processes of allocation and assignment? The allocation process varies depending on the provider, but it generally goes something like so: The provider maintains a pool of available IP addresses that belong to some autonomous system (AS)[2] they own, removes an address from that pool, and stores it in a database record alongside a user's account. Assignment follows a similar logic, associating the IP address with a given record of a server instance.
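To make the bookkeeping concrete, here is a minimal sketch of allocation and assignment as pool-and-record operations. The class and field names are invented for illustration and do not correspond to any provider's actual schema.

```python
import ipaddress

class AddressPool:
    """A toy model of a provider's public IP pool and its allocation records."""

    def __init__(self, cidr: str):
        # Available addresses drawn from a prefix the provider announces.
        self.free = [str(ip) for ip in ipaddress.ip_network(cidr).hosts()]
        # ip -> {"account": ..., "instance": ...}, standing in for a database table.
        self.records = {}

    def allocate(self, account_id: str) -> str:
        ip = self.free.pop()                               # remove from the pool
        self.records[ip] = {"account": account_id, "instance": None}
        return ip                                          # now a billable resource on the account

    def assign(self, ip: str, instance_id: str) -> None:
        self.records[ip]["instance"] = instance_id         # associate with an instance record


pool = AddressPool("198.51.100.0/24")   # a documentation prefix standing in for the provider's AS
ip = pool.allocate(account_id="acct-42")
pool.assign(ip, instance_id="i-0123")
```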
At instance provisioning time, things get more complicated. Routers connected to the instance begin to advertise that IP address and direct traffic to it. DHCP[3] servers provide the IP address to the instance so that it will bind that address to some network interface. Virtual network interfaces might come into existence alongside the instance itself to bridge between physical and virtual hardware. If any network devices fail during or after this process, automated processes will ensure that traffic continues flowing to the instance. Similarly, if the instance is migrated to another physical server, additional processes will redirect all traffic to the new server. It is probably more accurate to say that a user does not lease an IP address from the cloud provider, but rather the management of that IP address and an instance's network state. The story is largely the same for instances themselves, block devices, and higher-level abstractions like machine images and load balancers: one does not rent these things from a cloud provider so much as the management of the logic necessary to simulate them.
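One rough way to picture "renting the management of network state" is as a reconciliation loop: the provider continuously compares where traffic should go with where the instance actually lives, and repairs any difference. The sketch below is purely illustrative; none of the names reflect a real provider's control plane.

```python
from dataclasses import dataclass, field

@dataclass
class Instance:
    id: str
    host: str          # physical server currently running the instance
    healthy: bool = True

@dataclass
class RouteTable:
    routes: dict = field(default_factory=dict)   # public IP -> physical host

    def point(self, ip: str, host: str) -> None:
        if self.routes.get(ip) != host:
            print(f"re-routing {ip} -> {host}")
            self.routes[ip] = host

def reconcile(assignments: dict, instances: dict, table: RouteTable) -> None:
    """Keep every assigned public IP forwarding to its instance's current host."""
    for ip, instance_id in assignments.items():
        instance = instances[instance_id]
        if instance.healthy:
            table.point(ip, instance.host)

# A migration or hardware failure just changes `host`; the next pass repairs the route.
instances = {"i-0123": Instance(id="i-0123", host="rack1-node7")}
table = RouteTable()
reconcile({"198.51.100.14": "i-0123"}, instances, table)   # re-routing ... -> rack1-node7
instances["i-0123"].host = "rack2-node3"                   # live migration
reconcile({"198.51.100.14": "i-0123"}, instances, table)   # re-routing ... -> rack2-node3
```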
It is in these many layers of simulation that an otherwise competent engineer can become trapped. Virtualized servers obscure the realities of operating system interactions with real hardware. Network technologies like VXLAN[4] encapsulate Layer 2 traffic in Layer 4 traffic, where the simulated Layer 2 network might become the foundation for a further simulated Layer 3 network. The rentable experience over time becomes a pale imitation of physical reality, tailored to and limited by the business goals of the cloud provider. Meanwhile, the engineer has no ability to interact with the fundamentals of the system beyond what the platform allows, leaving them dependent on the provider to understand the workings of these computing systems. The resultant loss of knowledge is the cloud provider's gain.
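To see what "Layer 2 in Layer 4" means concretely, here is a sketch of the 8-byte VXLAN header from RFC 7348 wrapped around an Ethernet frame; the result is simply the payload of an ordinary UDP datagram, conventionally sent to port 4789. The frame bytes and addresses are placeholders.

```python
import socket
import struct

def vxlan_wrap(ethernet_frame: bytes, vni: int) -> bytes:
    """Prepend the 8-byte VXLAN header (RFC 7348) to a Layer 2 frame."""
    flags = 0x08                                   # "I" bit set: the VNI field is valid
    return struct.pack("!B3xI", flags, vni << 8) + ethernet_frame   # VNI in the upper 24 bits

# The wrapped frame travels as the payload of a plain UDP datagram.
payload = vxlan_wrap(b"\x00" * 64, vni=42)         # a placeholder 64-byte frame
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(payload, ("192.0.2.10", 4789))         # 4789 is the IANA-assigned VXLAN port
```

In practice the kernel or a virtual switch does this wrapping; the point is only that the "Layer 2 network" the platform sells is data riding inside ordinary UDP packets.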
This brings us back to the value of a homelab. We are now nearly two decades into the cloud computing era; Amazon EC2 launched in 2006[5], though virtual private server providers existed well before Amazon. We are also roughly as far into the era of ultra-cheap commodity computer hardware. The Raspberry Pi, for example, first became available for $25 in 2012[6]. Just as it has never been easier for a large platform to siphon knowledge away from users, it has also never been easier or cheaper to gain that same knowledge.
Consider, for example, a lab environment made up of a single Raspberry Pi instance. At today's prices, a Raspberry Pi 5 with 4 CPU cores and 4GB of RAM costs roughly $60. A comparable EC2 instance, the t4g.medium with 2 vCPUs and 4GB of RAM, rents at a rate of $0.0336 per hour[7]. After 74 days and change, owning the Raspberry Pi becomes cheaper than renting the t4g.medium instance. (An aside: RAM is generally a more premium resource for cloud providers because unlike CPU time, it is not a compressible resource. This is a fancy way of saying the provider has to actually provide the thing they are selling. Cloud providers, like airlines, often oversell CPU capacity and hope users will not use the full amount to which they are entitled.)
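The break-even figure is straightforward division, using the approximate prices quoted above:

```python
pi_price = 60.00        # Raspberry Pi 5, 4 cores / 4GB RAM, approximate retail price
ec2_hourly = 0.0336     # t4g.medium on-demand rate, 2 vCPUs / 4GB RAM

break_even_days = pi_price / ec2_hourly / 24
print(f"{break_even_days:.1f} days")   # ~74.4 days
```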
Working with physical hardware also provides secondary benefits. When running a virtual machine via a web console or some other tool, a user simply presses a button and waits for a status indicator on a screen to change. By contrast, plugging a server into the wall is a much more immersive endeavor: the timings of ethernet port lights and disk activity convey subtle signals about whether the machine is doing what it is supposed to do. When things go well, it is immensely satisfying to watch the machines work as if by magic. When they don't, it becomes a journey to understand why, one that may end up deep in the internals of kernel behavior or some library's source code. In both cases, one comes away with a greater understanding of computing than is available from rented experiences.
Of course, once a person has such a machine or a fleet of machines in hand, one might ask what to actually do with them. The answer is "anything at all". The point of a homelab environment is to experiment. Getting machines to boot is a valuable exercise in itself. So is trying different storage configurations, network technologies, or operating system features that look interesting. As an example, BGP - a protocol for exchanging internet routes - is not easy to run in a cloud provider environment, where the provider generally assigns IP addresses from its own ASNs. In a homelab, however, there is no such restriction. A user can set up BIRD[8] and OpenWRT[9] to build a collection of peer routers, creating an effective scale model of one of the core technologies of the internet. The same is true for storage, where tools like Open-iSCSI[10] allow a home user to create networked storage robust enough to support running diskless servers that rely entirely on network block devices from boot time onward.
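As a taste of the BGP exercise, a minimal BIRD 2 configuration for one side of a lab peering session might look something like the sketch below. The addresses are made up, the ASNs come from the private range (64512-65534), and the OpenWRT router on the other end would need a matching peer definition.

```
# /etc/bird.conf -- one lab router peering with an OpenWRT neighbor (BIRD 2.x)
router id 192.168.10.1;

protocol device { }

protocol direct {
  ipv4;
  interface "eth0";                # pick up the directly connected lab subnet
}

protocol bgp openwrt_peer {
  local 192.168.10.1 as 64512;     # this router's private ASN
  neighbor 192.168.10.2 as 64513;  # the OpenWRT router's private ASN
  ipv4 {
    import all;                    # accept whatever the peer advertises
    export all;                    # advertise our routes back
  };
}
```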
In the current state of the computing industry, where work is hard to find and large providers obscure the often simple technologies they deploy to make computers run, a homelab provides a low-cost means to develop meaningful skills and independence from those same providers. It is not necessary for you to find a job at some technology company to learn the inner workings of computers; a few dollars and some ethernet cable will work just as well, if not better.