Ivalin's Lab

May 2026 · 11 min read

Hybrid AI on k3s: A Sleeping GPU, Local qwen2.5-coder, and Cloud Only When Asked

A homelab AI workflow where qwen2.5-coder:14b on an RTX 3060 12GB is the default driver, the GPU box sleeps until the cluster wakes it on demand via WoL + SSH, and Gemini 3.1 Pro only ever runs on explicit manual escalation. Cloud spend drops from ~$80 to ~$5/mo, ~25–30 kWh/mo is saved, and ~85% of routine k3s work stays fully local.

k3s Ollama qwen2.5-coder Wake-on-LAN Local AI Platform Engineering

May 2026 · 10 min read

Frigate NVR Migration on k3s: What Breaks on Bare-Metal

Moving a stateful video pipeline off the control plane and onto a dedicated i5-6600 with a GTX 1050 Ti, covering the GPU device-plugin chain, tiered storage, --kube-reserved, and a VLAN held together by a router startup script. Cuts control-plane CPU from ~95% to ~50%, kills DiskPressure, and drops inference latency from ~300 ms to ~40 ms.

Frigate k3s GPU Bare-Metal Storage Platform Engineering

April 2026 · 7 min read

Running Edge AI on Broken Phones: The Multi-Model Engineering Workflow

From a drawer of shattered OnePlus phones to a decentralized YOLOv8 inference cluster — with hallucinated OpenCL drivers, air-gapped networking paradoxes, and Claude + Gemini as an AI engineering department.

YOLOv8 Edge AI ONNX MQTT Multi-Model k3s

April 2026 · 8 min read

Running Frigate NVR on Kubernetes: Taming Nine Cheap IP Cameras

From a licensed Blue Iris setup to a containerized, AI-driven surveillance pipeline on k3s — with Cloudflare Tunnel for remote access, nightly rsync backups, and why upscaling bad video is never the answer.

Frigate NVR IP Cameras Face Recognition Cloudflare k3s

February 2026 · 10 min read

The Kubernetes Sidecar Pattern: Building Real Observability on a Phone Cluster

Adding structured logging and metrics to a FastAPI app using Fluent Bit, Loki, and a shared emptyDir volume — without touching the application code. Three simultaneous outputs, inotify-based tailing, and lessons from five things that went wrong.

Kubernetes Fluent Bit Loki Grafana Observability k3s

February 2026 · 8 min read

Running a Local AI Model on My Homelab Kubernetes Cluster

GPU-accelerated Mistral 7B via Ollama on a GTX 1050 Ti, exposed to the k3s cluster through WSL2 port forwarding, with Open WebUI providing a ChatGPT-like interface accessible from any device on the network.

Ollama Mistral WSL2 GPU Open WebUI k3s

February 2026 · 12 min read

A k3s Cluster Over USB Cables: What postmarketOS and Linux Bridges Hide

Three OnePlus phones, one Lenovo laptop, no switch. Routed /30 USB links, Flannel VXLAN, an invisible nftables forward chain, and the slow education that every layer of the stack quietly resists being made into a Kubernetes node.

k3s postmarketOS USB networking Flannel VXLAN MetalLB bare-metal

February 2026 · 8 min read

Deploying a Guitar Practice App on My Phone Kubernetes Cluster

Containerizing a React + FastAPI audio processing app and deploying it to a k3s cluster made of old smartphones. Dockerfiles, nginx reverse proxy, and MetalLB LoadBalancer.

Docker FastAPI React k3s nginx