Diagrams That Lie to You, and the Beautiful Madness of Fixing That

Someone posted their homelab setup online recently and the project itself is genuinely clever: they took their network diagram out of a drawing tool and made it a build artefact instead. A text file in a repo, a GitHub Actions workflow that renders it on every push, icons pulled from public sources at render time so nothing drifts. The diagram can’t lie to you because it rebuilds itself from the thing that is actually true.

I’ve been thinking about that problem since I first read it. Stale documentation is one of those quiet plagues in tech that everyone acknowledges and almost no one solves properly. You update the thing, you mean to update the diagram, and then something else is on fire and three months later the diagram describes a system that no longer exists. I have done this professionally. I have done this in my own home network, which is considerably less sophisticated than what this person has built. The diagram in my head has definitely drifted from the cables behind my rack.

The tool they used is D2, a diagram-as-code language I hadn’t come across before. Text in, SVG out, with layout handled by ELK rather than D2’s default engine, dagre. The source file and the workflow are all public if you want to look. The choice of ELK is the bit worth noting, because the default layout can produce some questionable arrow routing, and a few people in the comments found that out the hard way.
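
To make “text in, SVG out” concrete, here is a minimal sketch of what that pipeline might look like when driven from Python. The node names and file names are my own invention, not taken from the author’s repo, and the script assumes the `d2` CLI is on your PATH; if it isn’t, `render` just returns False.

```python
import shutil
import subprocess
from pathlib import Path

# A tiny, made-up topology in D2 syntax (not the author's actual diagram).
D2_SOURCE = """\
internet: Internet {shape: cloud}
router: Router
switch: Switch
nas: NAS

internet -> router
router -> switch
switch -> nas
"""


def build_render_command(src: Path, out: Path, layout: str = "elk") -> list[str]:
    """Build the d2 invocation; --layout selects the layout engine."""
    return ["d2", f"--layout={layout}", str(src), str(out)]


def render(src: Path, out: Path) -> bool:
    """Render src to out with ELK. Returns False if d2 is not installed."""
    if shutil.which("d2") is None:
        return False
    subprocess.run(build_render_command(src, out), check=True)
    return True
```

Writing `D2_SOURCE` to something like `homelab.d2` and calling `render(Path("homelab.d2"), Path("homelab.svg"))` is roughly what a CI job would do on every push, minus the icon fetching.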

There’s a reasonable counter-argument buried in the discussion, and it’s worth taking seriously: are you trading manual PNG updates every few months for ongoing maintenance of a workflow that breaks whenever an upstream dependency changes something? The author’s response is that they just like working declaratively, everything lives in one monorepo, and future automation becomes easier when the source is structured text rather than a binary image file. That’s a fair answer. It’s not the only answer.

Someone else in the comments went further and pointed out that if you use NixOS, where the repo actually configures the systems rather than just describing them, the diagram can be generated from a genuine source of truth. Not “what I drew” and not even “what I committed last Tuesday” but “what the system actually is right now, before I deploy.” That’s a different category of problem being solved, and it’s interesting.
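
I haven’t seen how the NixOS folks actually do this, so treat the following as a sketch of the general idea rather than anyone’s real setup: if the hosts and links already exist as structured data somewhere, the D2 source can be generated from it instead of typed by hand. The inventory shape here (`ip`, `vlan`, a list of link pairs) is hypothetical.

```python
def inventory_to_d2(hosts: dict[str, dict], links: list[tuple[str, str]]) -> str:
    """Turn a (hypothetical) inventory into D2 source text.

    hosts maps a host name to a dict with 'ip' and 'vlan' keys;
    links is a list of (host_a, host_b) pairs.
    """
    lines = []
    # Sort host names so the output is stable from one run to the next.
    for name in sorted(hosts):
        info = hosts[name]
        label = f"{name} ({info['ip']}, VLAN {info['vlan']})"
        lines.append(f'{name}: "{label}"')
    for a, b in links:
        # Refuse to draw a link to a host the inventory does not know about.
        if a not in hosts or b not in hosts:
            raise KeyError(f"link references unknown host: {a} -- {b}")
        lines.append(f"{a} -- {b}")
    return "\n".join(lines) + "\n"


example_hosts = {
    "router": {"ip": "10.0.0.1", "vlan": 10},
    "nas": {"ip": "10.0.20.5", "vlan": 20},
}
print(inventory_to_d2(example_hosts, [("router", "nas")]))
```

The point of the unknown-host check is that the diagram fails loudly when it disagrees with the inventory, which is the whole appeal of generating it from the same data that configures the machines.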

The thing that made me stop and actually think, though, was the comment about overengineering. Someone noted, fairly drily, that engineers will spend two days automating a ten-minute monthly task without blinking. The author didn’t really dispute it. Neither would I. There’s a version of this where the time maths doesn’t add up and you know it doesn’t add up and you do it anyway because the problem is interesting and the act of solving it teaches you something. That’s not irrational; it’s just a different value function than the one that looks like efficiency.
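
For the record, the time maths is easy to run. A rough sketch, assuming “two days” means two eight-hour working days (my assumption; the commenter didn’t specify):

```python
def breakeven_months(build_minutes: float, saved_minutes_per_month: float) -> float:
    """Months until time saved equals time spent building the automation."""
    return build_minutes / saved_minutes_per_month


# Assumption: two working days of eight hours each.
build_minutes = 2 * 8 * 60  # 960 minutes
months = breakeven_months(build_minutes, saved_minutes_per_month=10)

print(months)       # 96.0
print(months / 12)  # 8.0
```

Eight years to break even, and that ignores whatever time the workflow itself eats when it breaks. Nobody is doing this for the payback period.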

I’ve spent an embarrassing number of hours on homelab stuff that has no commercial value and serves approximately one person. Some of it I’ve thrown out. Some of it I still use. The ratio is probably not great. But the Kubernetes comment from someone else in the thread rings true: you can’t blow up a customer’s environment to learn something, and you can’t easily break your work systems on purpose, so you build a lab where the cost of failure is a reboot and some mild swearing at 11pm.

The diagram being self-documenting is almost secondary to that. The real payoff is that the process of making it self-documenting taught someone something about GitHub Actions, about D2, about the ELK layout engine, about inlining icons into SVG. That knowledge is now theirs. The diagram is just the artefact.

One comment did land a fair hit: the diagram as shown doesn’t include IP addresses, VLANs, host assignments, or anything you’d actually need in an incident. It’s a topology sketch with nice icons. Useful for explaining the shape of things to someone without cluster access, maybe. Not useful for debugging at 3am, as another commenter noted with appropriate bleakness. The author knows this. It’s a starting point, not a finished product.

I don’t have a tidy conclusion here. Diagram-as-code is genuinely useful in the right context. The automation is clever. The scope of what it documents is limited. All three things are true simultaneously and that’s fine. Not every project needs to solve every problem; sometimes it just needs to solve the one that was bothering you.

The crayons comment made me laugh, though. Honestly, for my home network, crayons might be appropriate.