Hybrid AI on k3s: A Sleeping GPU, Local qwen2.5-coder, and Cloud Only When Asked
A homelab AI workflow where qwen2.5-coder:14b on an RTX 3060 12GB is the
default driver, the GPU box is asleep until the cluster wakes it on demand via WoL + SSH,
and Gemini 3.1 Pro runs only on explicit manual escalation. Cloud spend drops from ~$80 to ~$5/mo,
~25–30 kWh/mo is saved, and ~85% of routine k3s work stays fully local.
Frigate NVR Migration on k3s: What Breaks on Bare-Metal
Moving a stateful video pipeline off the control plane and onto a dedicated
i5-6600 with a GTX 1050 Ti, covering the GPU device-plugin chain, tiered
storage, --kube-reserved, and a VLAN held together by a router
startup script. Cuts control-plane CPU from ~95% to ~50%, kills DiskPressure,
and drops inference from ~300 ms to ~40 ms.
Running Edge AI on Broken Phones: The Multi-Model Engineering Workflow
From a drawer of shattered OnePlus phones to a decentralized YOLOv8 inference cluster — with hallucinated OpenCL drivers, air-gapped networking paradoxes, and Claude + Gemini as an AI engineering department.
Running Frigate NVR on Kubernetes: Taming Nine Cheap IP Cameras
From a licensed Blue Iris setup to a containerized, AI-driven surveillance pipeline on k3s — with Cloudflare Tunnel for remote access, nightly rsync backups, and why upscaling bad video is never the answer.
The Kubernetes Sidecar Pattern: Building Real Observability on a Phone Cluster
Adding structured logging and metrics to a FastAPI app using Fluent Bit, Loki, and a shared emptyDir volume — without touching the application code. Three simultaneous outputs, inotify-based tailing, and lessons from five things that went wrong.
Running a Local AI Model on My Homelab Kubernetes Cluster
GPU-accelerated Mistral 7B via Ollama on a GTX 1050 Ti, exposed to the k3s cluster through WSL2 port forwarding, with Open WebUI providing a ChatGPT-like interface accessible from any device on the network.
A k3s Cluster Over USB Cables: What postmarketOS and Linux Bridges Hide
Three OnePlus phones, one Lenovo laptop, no switch. Routed /30 USB links, Flannel VXLAN, an invisible nftables forward chain, and the slow education that every layer of the stack quietly resists being made into a Kubernetes node.
Deploying a Guitar Practice App on My Phone Kubernetes Cluster
Containerizing a React + FastAPI audio processing app and deploying it to a k3s cluster made of old smartphones. Dockerfiles, nginx reverse proxy, and MetalLB LoadBalancer.