OpenTelemetry

OpenTelemetry (OTel) is the vendor-neutral observability standard that provides the foundation for LLM tracing. It defines how to create, propagate, and export telemetry data.

Background

Before OTel, every observability vendor shipped proprietary instrumentation, so switching backends meant rewriting code. OTel standardizes the pieces: an API for creating traces and spans, SDKs for processing and exporting them, OTLP as the wire protocol, and semantic conventions for attribute naming.

For LLM observability, your tracing code works with Phoenix, Datadog, Honeycomb, or any OTLP-compatible backend without changes.

Core Concepts

Traces and Spans

A trace tracks a request from start to finish. A span is a single operation within that trace.

Spans have a name (what operation this is), timing (start and duration), attributes (key-value metadata), a parent link forming a tree, and a status.

In agent contexts, a trace might cover an entire user session, with spans for each LLM call, tool execution, and decision point.
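As a sketch, the span structure above can be modeled in a few lines. This is a toy data model for illustration, not the OTel SDK; the span names and the "model" attribute are hypothetical.

```python
from dataclasses import dataclass, field
from time import monotonic

@dataclass
class Span:
    # A span: a name, timing, key-value attributes,
    # and a parent link that forms the trace tree.
    name: str
    parent: "Span | None" = None
    attributes: dict = field(default_factory=dict)
    start: float = field(default_factory=monotonic)
    end: "float | None" = None

    def finish(self):
        self.end = monotonic()

# One trace for a user request, with child spans for each step.
root = Span("handle_user_request")
llm_call = Span("llm.chat_completion", parent=root,
                attributes={"model": "gpt-4o"})  # hypothetical attribute
tool_call = Span("tool.search", parent=root)
for s in (llm_call, tool_call, root):
    s.finish()
```

Walking the parent links from any span reconstructs the tree a tracing UI renders as a waterfall.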

Context Propagation

Traces work across async boundaries and service calls because OTel propagates context. When you start a span, it automatically becomes the parent of any spans created within its scope.

Agent frameworks that use OTel produce coherent traces without manual span linking because context flows through the call stack.

TracerProvider

The TracerProvider is the central configuration point. It determines how spans are processed (batching, sampling), where spans are exported (which backends), and what resource attributes identify this service.

Applications typically configure one TracerProvider at startup, then obtain Tracers from it throughout the codebase.

Exporters and Processors

Exporters send spans to backends. The OTLP exporter speaks the standard protocol understood by most observability platforms.

Processors sit between span creation and export. The BatchSpanProcessor accumulates spans and sends them in batches. The SimpleSpanProcessor exports immediately (useful for debugging but not production).
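The difference between the two processors is easy to see in a stripped-down sketch (toy classes, not the SDK's): simple exports one span per call, batch buffers until a threshold and flushes the remainder at shutdown.

```python
class SimpleSpanProcessor:
    # Exports each span immediately: easy to debug, chatty in production.
    def __init__(self, export):
        self.export = export

    def on_end(self, span):
        self.export([span])

class BatchSpanProcessor:
    # Accumulates spans and exports them in batches.
    def __init__(self, export, max_batch=3):
        self.export = export
        self.max_batch = max_batch
        self.buffer = []

    def on_end(self, span):
        self.buffer.append(span)
        if len(self.buffer) >= self.max_batch:
            self.force_flush()

    def force_flush(self):
        if self.buffer:
            self.export(self.buffer)
            self.buffer = []

exports = []
batch = BatchSpanProcessor(exports.append, max_batch=3)
for i in range(5):
    batch.on_end(f"span-{i}")
batch.force_flush()  # flush the remainder, as at shutdown
# exports == [["span-0", "span-1", "span-2"], ["span-3", "span-4"]]
```

The shutdown flush is the practical gotcha: without it, whatever is still buffered when the process exits is lost.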

OpenTelemetry for AI

Standard OTel provides generic span kinds like CLIENT, SERVER, and INTERNAL. These don't capture AI-specific semantics. There's no built-in way to distinguish an LLM call from a database query.

OpenInference extends OTel's semantic conventions with AI-specific span kinds (LLM, AGENT, TOOL) and attributes (gen_ai.usage.input_tokens, llm.input_messages).

OTel provides the tracing infrastructure. OpenInference provides the AI vocabulary. Together they let tooling understand LLM workloads.
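Concretely, the vocabulary is just span attributes. A hedged sketch of what an LLM span's attributes might look like, using the key names mentioned above; the exact keys and the flattened-index shape for messages are assumptions about the convention version in use:

```python
# Attributes an instrumentation library might set on one LLM span.
llm_span_attributes = {
    "openinference.span.kind": "LLM",   # vs. AGENT, TOOL
    "gen_ai.usage.input_tokens": 12,
    # Message lists are often flattened into indexed keys (assumed shape):
    "llm.input_messages.0.message.role": "user",
    "llm.input_messages.0.message.content": "What's the weather?",
}
```

A backend that understands these conventions can then render token counts and chat transcripts instead of opaque key-value pairs.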

Protocol Options

OTel supports two transport protocols:

HTTP/protobuf works through standard proxies and firewalls. Slightly higher per-request overhead. The pragmatic default.

gRPC offers lower latency through persistent connections and streaming. Better for high-volume scenarios. May require gRPC-aware infrastructure.

Most development and moderate-scale production deployments use HTTP. gRPC becomes relevant at thousands of spans per second.
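The transport is typically selected through the standard OTel environment variables rather than code. A sketch, assuming the conventional default ports (4318 for HTTP, 4317 for gRPC); the endpoint host is hypothetical:

```
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf    # or: grpc
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
```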

Vendor Neutrality

The value of OTel's vendor neutrality is concrete:

  • Start with self-hosted Phoenix for development
  • Move to a managed platform for production without code changes
  • Run multiple backends simultaneously for comparison
  • Avoid lock-in as the observability market evolves

Your instrumentation investment transfers across backends because everyone speaks OTLP.
