The Three Pillars
Observability means understanding a system's internal state from its external outputs. The three pillars are logs, metrics, and traces; each provides a different lens on your application's behavior.
Structured Logging
Plain text logs are hard to query. Use structured JSON logging:
logger.info({
  event: "user_login",
  userId: user.id,
  duration_ms: Date.now() - start,
  ip: req.ip
});

This makes logs searchable and alertable in tools like Loki, Elasticsearch, or Cloud Logging.
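To make the idea concrete, here is a minimal sketch of what a structured logger does: it merges a timestamp and level into the fields you pass and writes one JSON object per line. `createLogger` and the `timestamp`/`level` field names are hypothetical choices for illustration, not the API of any particular logging library, and the user ID and IP address are made-up sample values.

```javascript
// Minimal structured logger sketch: one JSON object per line.
// `createLogger` is a hypothetical helper, not a real library API.
function createLogger(write = (line) => process.stdout.write(line + "\n")) {
  const log = (level) => (fields) => {
    // Caller-supplied fields are merged after the standard ones.
    const entry = { timestamp: new Date().toISOString(), level, ...fields };
    write(JSON.stringify(entry));
    return entry;
  };
  return { info: log("info"), warn: log("warn"), error: log("error") };
}

const logger = createLogger();
const start = Date.now();
logger.info({
  event: "user_login",
  userId: 42,            // sample value
  duration_ms: Date.now() - start,
  ip: "203.0.113.7",     // documentation-range address, sample value
});
```

Because every line is valid JSON with stable keys, a log backend can index fields such as `event` or `userId` instead of relying on regular expressions over free text.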
Metrics with Prometheus
Track business and technical metrics such as request rates, error rates, and latency percentiles (p50, p95, p99). Expose a /metrics endpoint and have Prometheus scrape it.
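The sketch below shows roughly what a /metrics endpoint returns: a request counter and a latency histogram in the Prometheus text exposition format. It is hand-rolled with Node's standard library purely for illustration; in practice you would more likely use a client library such as prom-client. The metric names, bucket bounds, and the `observeRequest` and `renderMetrics` helpers are assumptions, not part of any library.

```javascript
const http = require("node:http");

// Hypothetical in-memory metrics; names and bucket bounds are illustrative.
const requestCounts = new Map();           // status code -> request count
const bucketBounds = [0.05, 0.1, 0.5, 1];  // latency bucket upper bounds, in seconds
const bucketCounts = new Array(bucketBounds.length).fill(0);
let latencySum = 0;
let latencyCount = 0;

function observeRequest(status, seconds) {
  requestCounts.set(status, (requestCounts.get(status) || 0) + 1);
  // Histogram buckets are cumulative: increment every bucket whose bound >= value.
  bucketBounds.forEach((bound, i) => {
    if (seconds <= bound) bucketCounts[i] += 1;
  });
  latencySum += seconds;
  latencyCount += 1;
}

function renderMetrics() {
  const lines = ["# TYPE http_requests_total counter"];
  for (const [status, count] of requestCounts) {
    lines.push(`http_requests_total{status="${status}"} ${count}`);
  }
  lines.push("# TYPE http_request_duration_seconds histogram");
  bucketBounds.forEach((bound, i) => {
    lines.push(`http_request_duration_seconds_bucket{le="${bound}"} ${bucketCounts[i]}`);
  });
  lines.push(`http_request_duration_seconds_bucket{le="+Inf"} ${latencyCount}`);
  lines.push(`http_request_duration_seconds_sum ${latencySum}`);
  lines.push(`http_request_duration_seconds_count ${latencyCount}`);
  return lines.join("\n") + "\n";
}

// Serves the metrics at /metrics. Call server.listen(9100) to start it;
// the port is an arbitrary choice for this sketch.
const server = http.createServer((req, res) => {
  if (req.url === "/metrics") {
    res.writeHead(200, { "Content-Type": "text/plain; version=0.0.4" });
    res.end(renderMetrics());
  } else {
    res.writeHead(404);
    res.end();
  }
});
```

Exposing latency as histogram buckets rather than precomputed percentiles lets Prometheus estimate p50, p95, and p99 at query time with `histogram_quantile`, and lets it aggregate across instances.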
Distributed Tracing
In a microservices architecture, a single request can touch many services. Distributed tracing with tools such as OpenTelemetry and Jaeger shows the request's full journey and where time is spent.
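The mechanism that ties spans from different services into one trace is context propagation: each service forwards a trace ID and its own span ID to the next. The sketch below illustrates this with the W3C `traceparent` header format. The `startSpan` helper and the service names are hypothetical; a real deployment would normally rely on the OpenTelemetry SDK to do this rather than hand-written code.

```javascript
const { randomBytes } = require("node:crypto");

// Start a new trace, or continue one from an incoming W3C `traceparent` header.
// `startSpan` is a hypothetical helper for illustration only.
function startSpan(name, incomingTraceparent) {
  // Format: version "00", 32-hex trace ID, 16-hex parent span ID, 2-hex flags.
  const match = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(
    incomingTraceparent || ""
  );
  const traceId = match ? match[1] : randomBytes(16).toString("hex");
  const parentSpanId = match ? match[2] : null;
  const spanId = randomBytes(8).toString("hex");
  const startedAt = Date.now();
  return {
    name,
    traceId,
    spanId,
    parentSpanId,
    // Header to attach to outgoing requests so the next service joins this trace.
    traceparent: `00-${traceId}-${spanId}-01`,
    // Returns the finished span record that would be sent to a tracing backend.
    end() {
      return { name, traceId, spanId, parentSpanId, duration_ms: Date.now() - startedAt };
    },
  };
}

// The gateway starts the trace; the downstream service continues it.
const gatewaySpan = startSpan("gateway");
const ordersSpan = startSpan("orders-service", gatewaySpan.traceparent);
console.log(ordersSpan.end());
console.log(gatewaySpan.end());
```

Both span records share one `traceId`, and the downstream span's `parentSpanId` points at the gateway span, which is what lets a backend such as Jaeger reassemble the request's path and per-hop timing.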
Alerting Strategy
Alert on symptoms, not causes. "Error rate > 1%" is a good alert because it reflects what users experience. "CPU > 80%" is not: high CPU is not necessarily a problem for users.
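A symptom-based rule like "Error rate > 1%" can be sketched as a check over the requests seen in a recent window. The `shouldAlert` helper is hypothetical, and two details are assumptions beyond the text above: only 5xx responses count as errors, and a `minRequests` floor suppresses alerts when traffic is too low for the rate to be meaningful.

```javascript
// Hypothetical helper: decide whether to alert based on requests in a recent window.
// Each request is an object with a numeric `status` (HTTP status code).
function shouldAlert(requests, { threshold = 0.01, minRequests = 100 } = {}) {
  // Assumption: with very little traffic, one failure would dominate the rate.
  if (requests.length < minRequests) return false;
  // Assumption: only server errors (5xx) count against the error rate.
  const errors = requests.filter((r) => r.status >= 500).length;
  return errors / requests.length > threshold;
}

// Example window: 1000 requests, 20 of which failed with a 503.
const windowRequests = Array.from({ length: 1000 }, (_, i) => ({
  status: i < 20 ? 503 : 200,
}));
console.log(shouldAlert(windowRequests));
```

In a Prometheus setup the same condition would usually live in an alerting rule evaluated by the server rather than in application code.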
Golden Signals
Google's SRE book recommends monitoring four golden signals: latency, traffic, errors, and saturation. Start with these before adding more complexity.
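As a rough illustration, all four signals can be derived from a window of request records plus one capacity figure. The `goldenSignals` and `percentile` helpers, the record shape, and the definition of saturation as in-flight requests divided by a concurrency limit are all assumptions made for this sketch; the right saturation measure depends on which resource constrains your service.

```javascript
// Nearest-rank percentile over an array of numbers (p between 0 and 100).
function percentile(values, p) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Hypothetical summary of the four golden signals for one time window.
// `requests` items look like { status: 200, duration_ms: 35 }.
function goldenSignals(requests, { windowSeconds, inFlight, maxConcurrent }) {
  const errors = requests.filter((r) => r.status >= 500).length;
  return {
    latency_p95_ms: percentile(requests.map((r) => r.duration_ms), 95),
    traffic_rps: requests.length / windowSeconds,
    error_rate: requests.length ? errors / requests.length : 0,
    // Assumption: saturation is measured as concurrency used versus the limit.
    saturation: inFlight / maxConcurrent,
  };
}

// Example: 100 requests over 10 seconds, durations 1..100 ms, two failures.
const sampleRequests = Array.from({ length: 100 }, (_, i) => ({
  status: i < 2 ? 500 : 200,
  duration_ms: i + 1,
}));
console.log(
  goldenSignals(sampleRequests, { windowSeconds: 10, inFlight: 40, maxConcurrent: 80 })
);
```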