Observability Beyond Logs: Traces, Metrics, and Exemplars

Logs are not enough. Here is the three-pillar observability stack, and how exemplars connect the pillars in production.

Apr 30, 2026 4 min

The "logs and a Grafana dashboard" era is over. The modern observability stack is denser, more correlated, and more useful.

Logs were the first observability primitive. Then metrics became standard. Then distributed tracing arrived. The three-pillar model is mature, but the connection between the pillars is where the latest gains live.

The three pillars

  • Logs: structured records of discrete events. Best for root-cause analysis of a specific failure.
  • Metrics: numerical aggregates over time. Best for SLOs, alerting, and trend analysis.
  • Traces: end-to-end request paths across services. Best for understanding distributed system behavior.

Exemplars: the connecting tissue

An exemplar is a sample attached to a metric data point that carries a pointer, typically a trace ID, to a representative trace. When your latency histogram shows a p99 spike, clicking the spike in a backend that supports exemplars opens an actual trace from a request that experienced it. This is the feature that turns dashboards from "interesting" into "actionable" during incident response.
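To make the mechanism concrete, here is a minimal sketch in plain Python. It is not the API of any real metrics library: the ExemplarHistogram class, the bucket bounds, and the trace IDs are all hypothetical. It assumes the simplest possible policy, keeping the most recent trace ID seen in each histogram bucket, and shows how a query for the p99 bucket hands back a trace to open.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Exemplar:
    trace_id: str  # pointer to the trace that produced this observation
    value: float   # the observed latency in seconds


@dataclass
class ExemplarHistogram:
    """Toy latency histogram that keeps one exemplar per bucket (illustrative only)."""

    bounds: list[float]  # bucket upper bounds in seconds, ascending
    counts: list[int] = field(init=False)
    exemplars: list[Optional[Exemplar]] = field(init=False)

    def __post_init__(self) -> None:
        # One extra slot for the implicit +Inf bucket.
        self.counts = [0] * (len(self.bounds) + 1)
        self.exemplars = [None] * (len(self.bounds) + 1)

    def observe(self, value: float, trace_id: str) -> None:
        i = bisect_left(self.bounds, value)  # first bucket whose bound >= value
        self.counts[i] += 1
        self.exemplars[i] = Exemplar(trace_id, value)  # keep the latest sample

    def exemplar_for_quantile(self, q: float) -> Optional[Exemplar]:
        """Return the exemplar stored in the bucket that contains quantile q."""
        total = sum(self.counts)
        if total == 0:
            return None
        rank = q * total
        seen = 0
        for count, exemplar in zip(self.counts, self.exemplars):
            seen += count
            if seen >= rank:
                return exemplar
        return self.exemplars[-1]


# 98 fast requests and 2 slow ones: the p99 lands in the slow bucket.
hist = ExemplarHistogram(bounds=[0.1, 0.5, 1.0])
for n in range(98):
    hist.observe(0.05, trace_id=f"fast-{n}")
hist.observe(2.1, trace_id="slow-trace-a")
hist.observe(2.3, trace_id="slow-trace-b")

p99 = hist.exemplar_for_quantile(0.99)
print(p99)  # the exemplar a dashboard would link to from the p99 spike
```

Real backends choose which sample to keep in their own ways; "latest per bucket" is only the easiest policy to show.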

OpenTelemetry: the standard

OpenTelemetry (OTel) is now the de facto vendor-neutral instrumentation standard, with SDKs for most major languages. Replace vendor-specific SDKs (Datadog, New Relic, or another proprietary agent) with OTel, and you can route the same data to any backend that accepts it (Datadog, Honeycomb, Grafana Cloud, Jaeger). Vendor lock-in is no longer the price of observability.

What to instrument

  • HTTP/gRPC servers (auto-instrument via OTel instrumentation libraries).
  • Database clients (auto-instrument).
  • Outbound HTTP calls to dependencies.
  • Background job queues — frame each job as a span.
  • Critical business spans (e.g., "process_payment", "send_invoice") with attributes (customer_id, amount).

What to alert on

Alert on SLO burn rate, not on individual error counts. The multi-window, multi-burn-rate alerting policy described in Google's Site Reliability Workbook is the gold standard. It catches both fast burns (immediate page) and slow burns (something is degraded, so open a ticket) while keeping false positives low.
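As a rough illustration of how such a policy evaluates, here is a small Python sketch. The window pairs and thresholds (14.4 over 1h/5m, 6 over 6h/30m, 1 over 3d/6h) follow the workbook's recommended starting values for a 30-day SLO. The names BurnRule, burn_rate, and evaluate, and the sample error ratios, are hypothetical. In practice the error ratios would come from queries against your metrics backend.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BurnRule:
    long_window: str
    short_window: str
    threshold: float  # burn-rate multiple that triggers the rule
    severity: str


# Starting values for a 30-day SLO window; tune them for your own service.
RULES = [
    BurnRule("1h", "5m", 14.4, "page"),   # about 2% of the budget gone in 1 hour
    BurnRule("6h", "30m", 6.0, "page"),   # about 5% of the budget gone in 6 hours
    BurnRule("3d", "6h", 1.0, "ticket"),  # about 10% of the budget gone in 3 days
]


def burn_rate(error_ratio: float, slo: float) -> float:
    """How many times faster than 'exactly on budget' errors are arriving."""
    return error_ratio / (1.0 - slo)


def evaluate(error_ratios: dict[str, float], slo: float) -> list[BurnRule]:
    """Return the rules that fire; error_ratios maps window name to error ratio."""
    firing = []
    for rule in RULES:
        long_burn = burn_rate(error_ratios[rule.long_window], slo)
        short_burn = burn_rate(error_ratios[rule.short_window], slo)
        # Both windows must exceed the threshold: the long window shows the burn
        # is significant, the short window shows it is still happening.
        if long_burn > rule.threshold and short_burn > rule.threshold:
            firing.append(rule)
    return firing


SLO = 0.999  # 99.9% availability target

# Fast burn: a sharp outage over the last hour.
fast = {"5m": 0.02, "30m": 0.018, "1h": 0.016, "6h": 0.004, "3d": 0.0005}
# Slow burn: mild but persistent degradation over days.
slow = {"5m": 0.0015, "30m": 0.0015, "1h": 0.0015, "6h": 0.0015, "3d": 0.0012}

fast_alerts = [r.severity for r in evaluate(fast, SLO)]
slow_alerts = [r.severity for r in evaluate(slow, SLO)]
print(fast_alerts, slow_alerts)
```

Requiring the short window to agree is what lets an alert clear soon after the problem stops, instead of firing until the long window drains.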

What we install

For most clients: the OpenTelemetry SDK plus the Collector, deployed as a sidecar or DaemonSet, routing to Grafana Cloud for storage and visualization. Total monthly cost for a mid-size workload: under $500. The improvement in signal-to-noise ratio is dramatic.