The Kubernetes Sidecar Pattern: Building Real Observability on a Phone Cluster

Adding structured logging and metrics to a FastAPI app on k3s using Fluent Bit, Loki, and a shared volume — without touching the application code.


The Problem

When the guitar practice application crashed, debugging was painfully manual. I'd notice a 502 error on the frontend, run kubectl logs guitar-backend-xxxxx, scroll through output looking for a timestamp that matched the crash, then cross-reference it with Grafana's memory graphs in a second browser tab. No automation, no correlation, no way to catch it unless I was watching at the right moment.

The root cause was known — librosa, the audio processing library, causes memory spikes on large files. But detecting when it happened and capturing the context around it required being in the right place at the right time.

This post is about solving that problem using one of Kubernetes' most elegant patterns: the sidecar.

A motorcycle with a sidecar racing through a neon-lit tunnel, representing the sidecar pattern


What the Sidecar Pattern Actually Means

A Kubernetes Pod is not a container — it's a wrapper that can hold one or more containers sharing certain resources. Every container in a pod shares the same network namespace (they can reach one another on localhost) and can optionally share mounted volumes. Each container still has its own filesystem root, its own process space, and its own resource limits. Think of a pod as an apartment building floor: each apartment (container) is separate, but all share the building's electrical system (network) and can use shared storage rooms (volumes) if you wire them that way.

A sidecar is a second container in that pod whose job is to help the first one — to extend or augment it — without being part of its core business logic. The name comes from motorcycle sidecars: the passenger car is attached to the bike, shares its momentum, but doesn't drive. It just handles what the driver can't do alone.

The pattern is powerful because it achieves separation of concerns at the infrastructure level. The guitar-backend container doesn't need to know how to ship logs to Loki, format Prometheus metrics, or handle retry logic when the log store is temporarily unavailable. That's the sidecar's job. The application just writes a file.
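
The application code isn't shown in this post, but the contract it has to honor is small: append one JSON object per line to the shared log file. Here is a minimal sketch of what that could look like with Python's standard logging module; the field names mirror the test record used later in this post, and the formatter itself is an illustration, not the app's actual logger.

import json
import logging
from datetime import datetime, timezone

LOG_PATH = "/var/log/guitar-backend/app.log"  # the shared emptyDir mount

class JsonLineFormatter(logging.Formatter):
    """Emit one JSON object per line, ready for Fluent Bit's json parser."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            # Millisecond precision, matching the 'time' field used in the test record
            "time": datetime.fromtimestamp(record.created, tz=timezone.utc)
                    .strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3],
            "level": record.levelname,
            "message": record.getMessage(),
            "module": record.module,
        }
        # Merge any structured extras passed via extra={"fields": {...}}
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)

handler = logging.FileHandler(LOG_PATH)
handler.setFormatter(JsonLineFormatter())
logger = logging.getLogger("guitar-backend")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Example: the same shape as the librosa warning injected later in this post
logger.warning("librosa memory spike detected", extra={"fields": {"file_size_mb": 42}})

Because each record is a single line of JSON, Fluent Bit's json parser picks the fields up directly: no regex, no multi-line stitching.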


Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                         Guitar Backend Pod                          │
│                                                                     │
│   ┌──────────────────────┐          ┌───────────────────────────┐   │
│   │   guitar-backend     │  writes  │   fluent-bit-sidecar      │   │
│   │   (FastAPI + librosa)│ ───────► │   (log processor)         │   │
│   │                      │          │                           │   │
│   │   :8000 (HTTP API)   │          │   :2020 (health API)      │   │
│   │                      │          │   :2021 (Prometheus out)  │   │
│   └──────────────────────┘          └───────────────────────────┘   │
│             │                                    │                   │
│             └──────────────┬─────────────────────┘                   │
│                            │                                         │
│                    shared-logs (emptyDir)                            │
│                    /var/log/guitar-backend/app.log                   │
└────────────────────────────┬─────────────────────────────────────────┘
                             │
           ┌─────────────────┼──────────────────┐
           ▼                 ▼                  ▼
      Loki :3100      Prometheus :2021      stdout
      (log store)     (metrics scrape)   (kubectl logs)
           │
           ▼
      Grafana :80

The critical mechanism is the shared emptyDir volume. Both containers mount it at /var/log/guitar-backend. The application writes a structured JSON log file there; Fluent Bit tails that file and fans the data out to three outputs simultaneously.


Why Not a DaemonSet Log Collector?

The alternative is a node-level collector: a DaemonSet that runs one Fluentd or Fluent Bit pod on every node and scrapes every container's stdout. That works well for homogeneous workloads. The sidecar approach is better here for three reasons.

Structured files vs unstructured stdout. A node-level collector reads raw stdout and has to parse unstructured text to extract fields. With a sidecar, the application writes JSON directly to a file, and Fluent Bit reads it already structured — no regex required.

Portability. The logging configuration is part of the pod spec. If the guitar-backend moves to a different node or cluster entirely, the sidecar comes with it. No pre-installed DaemonSet is required on the destination. On this cluster that matters — the OnePlus phone workers have constrained RAM and I'd rather not run an additional logging agent on each of them.

Isolation. The observability stack for this one service is self-contained. No shared Fluentd config to edit, no risk of a change affecting another pod's log shipping.


The Shared Volume: How It Works

An emptyDir volume is created when a pod starts, lives for the pod's entire lifetime, is deleted when the pod stops, and is visible to every container in that pod that mounts it. In the pod spec:

volumes:
  - name: shared-logs
    emptyDir: {}

Then both containers mount it:

# In guitar-backend container:
volumeMounts:
  - name: shared-logs
    mountPath: /var/log/guitar-backend

# In fluent-bit-sidecar container:
volumeMounts:
  - name: shared-logs
    mountPath: /var/log/guitar-backend

From each container's perspective it's just a regular directory. The guitar-backend writes app.log there; Fluent Bit opens and tails the same file. The kernel handles the rest — no network call, no serialization, no protocol.

Fluent Bit uses Linux's inotify mechanism to watch the file. The kernel taps it on the shoulder the moment new bytes appear — no polling, no delay. When I tested this, Fluent Bit detected a newly created log file and registered a watch within milliseconds:

[2026/02/24 19:49:24] [ info] [input:tail:tail.0] inotify_fs_add(): inode=2234192 watch_fd=1 name=/var/log/guitar-backend/app.log

Fluent Bit Configuration

The full pipeline is defined in a ConfigMap mounted into the sidecar container:

[SERVICE]
    Flush         5
    HTTP_Server   On
    HTTP_Listen   0.0.0.0
    HTTP_Port     2020
    Parsers_File  parsers.conf

[INPUT]
    Name              tail
    Path              /var/log/guitar-backend/app.log
    Parser            json
    Tag               guitar.backend
    DB                /var/log/guitar-backend/flb_pos.db

[FILTER]
    Name    record_modifier
    Match   guitar.*
    Record  cluster    homelab-k3s
    Record  node       ${NODE_NAME}
    Record  service    guitar-backend

[OUTPUT]
    Name            loki
    Match           guitar.*
    Host            loki.default.svc.cluster.local
    Port            3100
    Labels          job=guitar-backend, env=homelab

[OUTPUT]
    Name   prometheus_exporter
    Match  guitar.*
    Host   0.0.0.0
    Port   2021

[OUTPUT]
    Name  stdout
    Match guitar.*

The DB parameter stores Fluent Bit's read position in a small SQLite file inside the shared volume. If the sidecar restarts (but the pod survives), Fluent Bit resumes from where it left off rather than replaying the entire log file. The record_modifier FILTER enriches every record with cluster identity fields that become Loki labels — making it easy to filter by cluster, node, or service in Grafana.
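
To make the resume behavior concrete, here is a toy sketch of the idea in Python: persist the byte offset you have read up to, then seek back to it on the next run. This only illustrates the concept; Fluent Bit's real implementation uses SQLite and inotify, and the toy_pos.json path below is hypothetical.

import json
import os

LOG_PATH = "/var/log/guitar-backend/app.log"
POS_PATH = "/var/log/guitar-backend/toy_pos.json"  # hypothetical position file

def read_new_lines():
    # Load the last stored offset; default to the start of the file
    offset = 0
    if os.path.exists(POS_PATH):
        with open(POS_PATH) as f:
            offset = json.load(f).get("offset", 0)

    # Read only what was appended since the stored offset
    with open(LOG_PATH) as log:
        log.seek(offset)
        new_lines = log.readlines()
        offset = log.tell()

    # Persist the new offset so a restart resumes here instead of replaying the file
    with open(POS_PATH, "w") as f:
        json.dump({"offset": offset}, f)

    return [line.rstrip("\n") for line in new_lines]

for line in read_new_lines():
    print("new record:", line)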


The Pod Spec

Both containers side by side in a single deployment:

spec:
  volumes:
    - name: shared-logs
      emptyDir: {}
    - name: fluent-bit-config
      configMap:
        name: fluent-bit-sidecar-config

  containers:

    - name: guitar-backend
      image: guitar-backend:latest
      imagePullPolicy: Never
      ports:
        - containerPort: 8000
      livenessProbe:
        httpGet:
          path: /api/health
          port: 8000
      volumeMounts:
        - name: shared-logs
          mountPath: /var/log/guitar-backend
      resources:
        limits:
          memory: "1536Mi"

    - name: fluent-bit-sidecar
      image: fluent/fluent-bit:3.0
      env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
      ports:
        - containerPort: 2020
        - containerPort: 2021
      resources:
        limits:
          memory: "64Mi"
          cpu: "100m"
      volumeMounts:
        - name: shared-logs
          mountPath: /var/log/guitar-backend
        - name: fluent-bit-config
          mountPath: /fluent-bit/etc

The NODE_NAME environment variable is populated from the Kubernetes downward API — Kubernetes injects the actual node name at runtime. Nothing is hardcoded, and it works correctly regardless of which node the pod lands on.


Three Outputs From One Sidecar

Loki is the primary log storage backend. The mental model from Prometheus translates almost directly: instead of indexing the content of each entry, Loki indexes only the labels (low-cardinality key-value pairs like job=guitar-backend) and stores the log text itself as compressed, unindexed chunks. Grafana — already running at 192.168.100.201 — has native Loki support, so you get full log exploration in the same tool already used for metrics.

Prometheus metrics come from Fluent Bit's built-in prometheus_exporter output plugin. This exposes a /metrics endpoint on port 2021 of the pod, scraped by Prometheus on its normal cycle. The metrics include records processed, bytes sent, and output plugin error rates — so you can alert on "Fluent Bit stopped shipping logs" as a distinct signal from "the application crashed."

Stdout remains available for real-time debugging. kubectl logs <pod> -c fluent-bit-sidecar shows every processed log record in human-readable form — invaluable during initial setup and when investigating a specific incident without opening Grafana.
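
To confirm the sidecar's two HTTP ports are actually serving, a quick check like the following works from anywhere that can reach the pod, for example after kubectl port-forward <pod> 2020 2021. The endpoint paths are the ones Fluent Bit's monitoring server and the prometheus_exporter output serve by default; treat them as assumptions if your version differs.

import json
import urllib.request

BASE = "http://localhost"  # assumes kubectl port-forward <pod> 2020 2021 is running

# :2020 is Fluent Bit's own monitoring API, enabled by HTTP_Server On in [SERVICE]
with urllib.request.urlopen(f"{BASE}:2020/api/v1/uptime", timeout=5) as resp:
    print("sidecar uptime:", json.load(resp))

# :2021 is the prometheus_exporter output, plain Prometheus text exposition format
with urllib.request.urlopen(f"{BASE}:2021/metrics", timeout=5) as resp:
    for line in resp.read().decode().splitlines():
        if line and not line.startswith("#"):
            print(line)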


Resource Cost

One concern with sidecars is resource consumption. On k3master's 8GB of RAM, every megabyte counts. Fluent Bit is written in C; the binary is around 450KB, and under normal conditions it consumes 10–20MB of actual memory for this workload.

Container            Memory Request   Memory Limit   CPU Limit
guitar-backend       256Mi            1536Mi         1000m
fluent-bit-sidecar   32Mi             64Mi           100m
Pod total            288Mi            1600Mi         1100m

The sidecar represents about 4% of the pod's memory ceiling — a reasonable price for the observability it provides.


Things That Went Wrong

Fluent Bit 3.0 removed plugin properties. The Loki output plugin dropped Batch_Size and Batch_Wait in version 3.0 — valid in 2.x but a startup failure in 3.x with a clear error: unknown configuration property 'batch_size'. The fix is to remove them; Fluent Bit 3.0 handles batching automatically.

The liveness probe path was wrong. The manifest used /health as the probe path but the actual endpoint is /api/health. Kubernetes would probe, get a 404, wait for three consecutive failures, then terminate the container. Because the app shut down cleanly on SIGTERM, the exit code was 0 — which looks like a graceful shutdown rather than a crash. The tell was in the logs: GET /health HTTP/1.1" 404 Not Found appearing three times followed by Shutting down.

kubectl edit corrupts nested YAML. When editing a ConfigMap that contains a multi-line string value (like a Fluent Bit config), kubectl edit shows you YAML-inside-YAML. Any indentation mistake corrupts the inner content while the outer Kubernetes object saves successfully. The reliable fix is to delete the ConfigMap and recreate it from a heredoc — content written character-for-character with no editor interpretation layer.

Loki rejects future timestamps. When testing with a hardcoded timestamp that was slightly in the future, every log entry was rejected with entry for stream has timestamp too new. Loki enforces time-ordered ingestion to maintain index integrity. The fix was dynamic timestamp generation with $(date -u +%Y-%m-%dT%H:%M:%S) inside the container.

Fluent Bit's position database retains stale offsets. After fixing the timestamp issue, the position DB (flb_pos.db) still held the offset from the rejected entries, causing Fluent Bit to retry them indefinitely. A kubectl rollout restart cleared the emptyDir volume and the position DB together, letting fresh log entries flow through cleanly.


The Payoff

With this setup running, crash investigation transforms. When a librosa OOMKill happens, Prometheus captures the memory spike via Node Exporter, and Fluent Bit captures the log context — the filename being processed, the operation that triggered the spike, the stack trace if the app logged it. Both land in Grafana. The Explore view's Correlations feature lets you select a time range on a metric graph and jump directly to the Loki logs from that exact interval.

What was a manual, multi-tab, "I hope I was watching" process becomes a single Grafana investigation workflow.

To verify the pipeline end-to-end, I injected a test log line directly into the shared volume using kubectl exec:

kubectl exec $POD -c guitar-backend -- /bin/sh -c \
  'TS=$(date -u +%Y-%m-%dT%H:%M:%S) && echo \
  "{\"time\":\"${TS}.000\",\"level\":\"WARNING\",\
  \"message\":\"librosa memory spike detected\",\
  \"module\":\"audio_processor\",\"file_size_mb\":42}" \
  >> /var/log/guitar-backend/app.log'

Within seconds the record appeared in Grafana's Explore view under {job="guitar-backend"}, enriched with cluster, node, and service labels added by Fluent Bit's record_modifier filter.
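
Grafana isn't required for that check, either. Loki's HTTP API can be queried directly; the sketch below assumes the standard /loki/api/v1/query_range endpoint and that the Loki service has been made reachable locally, for example via kubectl port-forward svc/loki 3100.

import json
import time
import urllib.parse
import urllib.request

LOKI = "http://localhost:3100"
now_ns = time.time_ns()

params = urllib.parse.urlencode({
    # LogQL: select the stream by label, then grep raw lines for the test message
    "query": '{job="guitar-backend"} |= "librosa memory spike"',
    "start": now_ns - 15 * 60 * 10**9,  # last 15 minutes, in nanoseconds
    "end": now_ns,
    "limit": 20,
})

with urllib.request.urlopen(f"{LOKI}/loki/api/v1/query_range?{params}", timeout=10) as resp:
    data = json.load(resp)

for stream in data["data"]["result"]:
    print(stream["stream"])          # the label set: job, env, cluster, node, service, ...
    for _ts, line in stream["values"]:
        print("  ", line)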


Cluster Context

This runs on a k3s cluster built from a Lenovo laptop (control plane) and three OnePlus smartphones running postmarketOS, connected via USB ethernet with routed /30 point-to-point links. Full cluster build documented in I Built a Kubernetes Cluster from Old Phones and a Laptop.

Node       Hardware                          Role
k3master   Lenovo i7-10750H, 8GB RAM         control-plane + workloads
one6t      OnePlus 6T, Snapdragon 845, 6GB   worker
one62      OnePlus 6, Snapdragon 845, 8GB    worker
one61      OnePlus 6, Snapdragon 845, 8GB    worker

Previous posts: Building the cluster · Deploying the guitar app · Running Ollama locally