The Kubernetes Sidecar Pattern: Building Real Observability on a Phone Cluster
Adding structured logging and metrics to a FastAPI app on k3s using Fluent Bit, Loki, and a shared volume — without touching the application code.
The Problem
When the guitar practice application crashed, debugging was painfully
manual. I'd notice a 502 error on the frontend, run
kubectl logs guitar-backend-xxxxx, scroll through output
looking for a timestamp that matched the crash, then cross-reference it
with Grafana's memory graphs in a second browser tab. No automation, no
correlation, no way to catch it unless I was watching at the right
moment.
The root cause was known — librosa, the audio processing library, causes memory spikes on large files. But detecting when it happened and capturing the context around it required being in the right place at the right time.
This post is about solving that problem using one of Kubernetes' most elegant patterns: the sidecar.
What the Sidecar Pattern Actually Means
A Kubernetes Pod is not a container — it's a wrapper
that can hold one or more containers sharing certain resources. Every
container in a pod shares the same network namespace (they can all
reach one another on localhost) and can optionally share mounted
volumes. Each container still has its own filesystem root, its own
process space, and its own resource limits. Think of a pod as an
apartment building floor: each apartment (container) is separate, but
all share the building's electrical system (network) and can use shared
storage rooms (volumes) if you wire them that way.
A sidecar is a second container in that pod whose job is to help the first one — to extend or augment it — without being part of its core business logic. The name comes from motorcycle sidecars: the passenger car is attached to the bike, shares its momentum, but doesn't drive. It just handles what the driver can't do alone.
The pattern is powerful because it achieves separation of concerns at the infrastructure level. The guitar-backend container doesn't need to know how to ship logs to Loki, format Prometheus metrics, or handle retry logic when the log store is temporarily unavailable. That's the sidecar's job. The application just writes a file.
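"The application just writes a file" is genuinely all there is on the app side. As a sketch of what that can look like in a FastAPI service — the formatter name, field set, and helper are illustrative, not the post's actual code — a standard-library logging handler is enough to emit one JSON object per line in a shape Fluent Bit's json parser can read:

```python
import json
import logging
from datetime import datetime, timezone

class JsonLineFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line —
    the shape a tail input with a json parser expects."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "time": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3],
            "level": record.levelname,
            "message": record.getMessage(),
            "module": record.module,
        }
        return json.dumps(entry)

def setup_file_logging(path: str = "/var/log/guitar-backend/app.log") -> logging.Logger:
    # The path matches the shared emptyDir mount; no shipping logic here —
    # everything downstream of this file is the sidecar's problem.
    logger = logging.getLogger("guitar-backend")
    handler = logging.FileHandler(path)
    handler.setFormatter(JsonLineFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

The application code never learns a Loki hostname or a retry policy; swapping the log backend later means editing the sidecar's ConfigMap, not this module.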
Architecture
┌─────────────────────────────────────────────────────────────────────┐
│                         Guitar Backend Pod                          │
│                                                                     │
│  ┌──────────────────────┐            ┌───────────────────────────┐  │
│  │  guitar-backend      │   writes   │  fluent-bit-sidecar       │  │
│  │  (FastAPI + librosa) │  ───────►  │  (log processor)          │  │
│  │                      │            │                           │  │
│  │  :8000 (HTTP API)    │            │  :2020 (health API)       │  │
│  │                      │            │  :2021 (Prometheus out)   │  │
│  └──────────┬───────────┘            └─────────────┬─────────────┘  │
│             │                                      │                │
│             └──────────────────┬───────────────────┘                │
│                                │                                    │
│                     shared-logs (emptyDir)                          │
│              /var/log/guitar-backend/app.log                        │
└────────────────────────────────┬────────────────────────────────────┘
                                 │
                ┌────────────────┼────────────────┐
                ▼                ▼                ▼
           Loki :3100    Prometheus :2021      stdout
           (log store)   (metrics scrape)  (kubectl logs)
                │
                ▼
           Grafana :80
The critical mechanism is the shared emptyDir volume.
Both containers mount it at /var/log/guitar-backend. The
application writes a structured JSON log file there; Fluent Bit tails
that file and fans the data out to three outputs simultaneously.
Why Not a DaemonSet Log Collector?
The alternative is a DaemonSet: one Fluentd or Fluent Bit pod per node, scraping every container's stdout centrally. That works well for homogeneous workloads. The sidecar approach is better here for three reasons.
Structured files vs unstructured stdout. A node-level collector reads raw stdout and has to parse unstructured text to extract fields. With a sidecar, the application writes JSON directly to a file, and Fluent Bit reads it already structured — no regex required.
Portability. The logging configuration is part of the pod spec. If the guitar-backend moves to a different node or cluster entirely, the sidecar comes with it. No DaemonSet pre-installed on the destination required. On this cluster that matters — the OnePlus phone workers have constrained RAM and I'd rather not run an additional logging agent on each of them.
Isolation. The observability stack for this one service is self-contained. No shared Fluentd config to edit, no risk of a change affecting another pod's log shipping.
The Shared Volume: How It Works
An emptyDir volume is created when a pod starts, lives for
the pod's entire lifetime, is deleted when the pod stops, and is visible
to every container in that pod that mounts it. In the pod spec:
volumes:
- name: shared-logs
emptyDir: {}
Then both containers mount it:
# In guitar-backend container:
volumeMounts:
- name: shared-logs
mountPath: /var/log/guitar-backend
# In fluent-bit-sidecar container:
volumeMounts:
- name: shared-logs
mountPath: /var/log/guitar-backend
From each container's perspective it's just a regular directory. The
guitar-backend writes app.log there; Fluent Bit opens and
tails the same file. The kernel handles the rest — no network call, no
serialization, no protocol.
Fluent Bit uses Linux's inotify mechanism to watch the file. The kernel taps it on the shoulder the moment new bytes appear — no polling, no delay. When I tested this, Fluent Bit detected a newly created log file and registered a watch within milliseconds:
[2026/02/24 19:49:24] [ info] [input:tail:tail.0] inotify_fs_add(): inode=2234192 watch_fd=1 name=/var/log/guitar-backend/app.log
Fluent Bit Configuration
The full pipeline is defined in a ConfigMap mounted into the sidecar container:
[SERVICE]
Flush 5
HTTP_Server On
HTTP_Listen 0.0.0.0
HTTP_Port 2020
Parsers_File parsers.conf
[INPUT]
Name tail
Path /var/log/guitar-backend/app.log
Parser json
Tag guitar.backend
DB /var/log/guitar-backend/flb_pos.db
[FILTER]
Name record_modifier
Match guitar.*
Record cluster homelab-k3s
Record node ${NODE_NAME}
Record service guitar-backend
[OUTPUT]
Name loki
Match guitar.*
Host loki.default.svc.cluster.local
Port 3100
Labels job=guitar-backend, env=homelab
[OUTPUT]
Name prometheus_exporter
Match guitar.*
Host 0.0.0.0
Port 2021
[OUTPUT]
Name stdout
Match guitar.*
The DB parameter stores Fluent Bit's read position in a
small SQLite file inside the shared volume. If the sidecar restarts (but
the pod survives), Fluent Bit resumes from where it left off rather than
replaying the entire log file. The record_modifier FILTER
enriches every record with cluster identity fields that become Loki
labels — making it easy to filter by cluster, node, or service in
Grafana.
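The resume-from-offset behavior is easy to picture with a toy Python sketch — a stand-in for what Fluent Bit's position DB provides (the real implementation is a SQLite file per watched path, not JSON), shown only to make the mechanism concrete:

```python
import json
import os

def tail_from_offset(log_path: str, pos_path: str) -> list[str]:
    """Return only the lines appended since the last call,
    persisting the byte offset between runs the way
    Fluent Bit's DB parameter does."""
    offset = 0
    if os.path.exists(pos_path):
        with open(pos_path) as f:
            offset = json.load(f)["offset"]

    with open(log_path) as f:
        f.seek(offset)          # skip everything already shipped
        new_lines = f.readlines()
        offset = f.tell()       # remember where we stopped

    with open(pos_path, "w") as f:
        json.dump({"offset": offset}, f)

    return [line.rstrip("\n") for line in new_lines]
```

Because the offset file lives in the shared volume alongside the log, a sidecar-only restart resumes cleanly — but a full pod restart wipes both, which becomes relevant in the troubleshooting section below.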
The Pod Spec
Both containers side by side in a single deployment:
spec:
volumes:
- name: shared-logs
emptyDir: {}
- name: fluent-bit-config
configMap:
name: fluent-bit-sidecar-config
containers:
- name: guitar-backend
image: guitar-backend:latest
imagePullPolicy: Never
ports:
- containerPort: 8000
livenessProbe:
httpGet:
path: /api/health
port: 8000
volumeMounts:
- name: shared-logs
mountPath: /var/log/guitar-backend
resources:
limits:
memory: "1536Mi"
- name: fluent-bit-sidecar
image: fluent/fluent-bit:3.0
env:
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
ports:
- containerPort: 2020
- containerPort: 2021
resources:
limits:
memory: "64Mi"
cpu: "100m"
volumeMounts:
- name: shared-logs
mountPath: /var/log/guitar-backend
- name: fluent-bit-config
mountPath: /fluent-bit/etc
The NODE_NAME environment variable is populated from the
Kubernetes downward API — Kubernetes injects the actual node name at
runtime. No hardcoding, works correctly regardless of which node the pod
lands on.
Three Outputs From One Sidecar
Loki is the primary log storage backend. The mental
model from Prometheus translates almost directly: instead of indexing
time-series metric samples, Loki indexes labels (low-cardinality
key-value pairs like job=guitar-backend) and stores the
actual log text as compressed chunks. Grafana — already running at
192.168.100.201 — has native Loki support, so you get full
log exploration in the same tool already used for metrics.
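With those labels in place, log exploration happens in LogQL, Loki's query language. A representative Explore query — illustrative, not taken from this cluster — that narrows to warning-level librosa events might look like:

```logql
{job="guitar-backend", env="homelab"} |= "librosa" | json | level = "WARNING"
```

The curly-brace selector uses only the indexed labels; the |= line filter and the json parser stage run over the compressed chunk contents at query time, which is exactly why keeping labels low-cardinality matters.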
Prometheus metrics come from Fluent Bit's built-in
prometheus_exporter output plugin. This exposes a
/metrics endpoint on port 2021 of the pod, scraped by
Prometheus on its normal cycle. The metrics include records processed,
bytes sent, and output plugin error rates — so you can alert on "Fluent
Bit stopped shipping logs" as a distinct signal from "the application
crashed."
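How Prometheus discovers the :2021 endpoint depends on how its scrape configuration is set up, which this post doesn't show. If it uses the common annotation-based kubernetes_sd relabeling convention, the pod template would carry annotations like these (an assumption, not the cluster's verified config):

```yaml
# Pod template metadata — assumes Prometheus is configured with the
# widely used prometheus.io/* annotation relabeling rules.
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "2021"
    prometheus.io/path: "/metrics"
```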
Stdout remains available for real-time debugging.
kubectl logs <pod> -c fluent-bit-sidecar shows every
processed log record in human-readable form — invaluable during initial
setup and when investigating a specific incident without opening
Grafana.
Resource Cost
One concern with sidecars is resource consumption. On k3master's 8GB RAM, every megabyte counts. Fluent Bit is written in C, the binary is around 450KB, and under normal conditions consumes 10–20MB of actual memory for this workload.
| Container | Memory Request | Memory Limit | CPU Limit |
|---|---|---|---|
| guitar-backend | 256Mi | 1536Mi | 1000m |
| fluent-bit-sidecar | 32Mi | 64Mi | 100m |
| Pod total | 288Mi | 1600Mi | 1100m |
The sidecar represents about 4% of the pod's memory ceiling — a reasonable price for the observability it provides.
Things That Went Wrong
Fluent Bit 3.0 renamed plugin properties. The Loki
output plugin removed Batch_Size and
Batch_Wait in version 3.0 — valid in 2.x but a startup
failure in 3.x with a clear error:
unknown configuration property 'batch_size'. The fix is to
remove them; Fluent Bit 3.0 handles batching automatically.
The liveness probe path was wrong. The manifest used
/health as the probe path but the actual endpoint is
/api/health. Kubernetes would probe, get a 404, count
three consecutive failures, then send SIGTERM — and the server shut
down gracefully with exit code 0, which looks like a clean exit rather
than a crash. The tell was in the logs:
"GET /health HTTP/1.1" 404 Not Found appearing three times,
followed by Shutting down.
kubectl edit corrupts nested YAML. When
editing a ConfigMap that contains a multi-line string value (like a
Fluent Bit config), kubectl edit shows you
YAML-inside-YAML. Any indentation mistake corrupts the inner content
while the outer Kubernetes object saves successfully. The reliable fix
is to delete the ConfigMap and recreate it from a heredoc — content
written character-for-character with no editor interpretation layer.
Loki rejects future timestamps. When testing with a
hardcoded timestamp that was slightly in the future, every log entry was
rejected with entry for stream has timestamp too new. Loki
enforces time-ordered ingestion to maintain index integrity. The fix was
dynamic timestamp generation with
$(date -u +%Y-%m-%dT%H:%M:%S) inside the container.
Fluent Bit's position database retains stale offsets.
After fixing the timestamp issue, the position DB
(flb_pos.db) still held the offset from the rejected
entries, causing Fluent Bit to retry them indefinitely. A
kubectl rollout restart cleared the emptyDir volume and the
position DB together, letting fresh log entries flow through cleanly.
The Payoff
With this setup running, crash investigation transforms. When a librosa OOMKill happens, Prometheus captures the memory spike via Node Exporter, and Fluent Bit captures the log context — the filename being processed, the operation that triggered the spike, the stack trace if the app logged it. Both land in Grafana. The Explore view's Correlations feature lets you select a time range on a metric graph and jump directly to the Loki logs from that exact interval.
What was a manual, multi-tab, "I hope I was watching" process becomes a single Grafana investigation workflow.
To verify the pipeline end-to-end, I injected a test log line directly
into the shared volume using kubectl exec:
kubectl exec $POD -c guitar-backend -- /bin/sh -c \
'TS=$(date -u +%Y-%m-%dT%H:%M:%S) && echo \
"{\"time\":\"${TS}.000\",\"level\":\"WARNING\",\
\"message\":\"librosa memory spike detected\",\
\"module\":\"audio_processor\",\"file_size_mb\":42}" \
>> /var/log/guitar-backend/app.log'
Within seconds the record appeared in Grafana's Explore view under
{job="guitar-backend"}, enriched with cluster,
node, and service labels added by Fluent Bit's
record_modifier filter.
Cluster Context
This runs on a k3s cluster built from a Lenovo laptop (control plane) and three OnePlus smartphones running postmarketOS, connected via USB ethernet with routed /30 point-to-point links. Full cluster build documented in I Built a Kubernetes Cluster from Old Phones and a Laptop.
| Node | Hardware | Role |
|---|---|---|
| k3master | Lenovo i7-10750H, 8GB RAM | control-plane + workloads |
| one6t | OnePlus 6T, Snapdragon 845, 6GB | worker |
| one62 | OnePlus 6, Snapdragon 845, 8GB | worker |
| one61 | OnePlus 6, Snapdragon 845, 8GB | worker |
Previous posts: Building the cluster · Deploying the guitar app · Running Ollama locally