Arize Phoenix

Arize Phoenix is an open-source observability platform for LLM applications.

Features

Trace Visualization

Phoenix renders traces as waterfall diagrams showing the hierarchy of operations. For an agent session, you see the parent agent span containing child LLM calls and tool executions, with timing for each.

The waterfall shows which operations took the longest, how many LLM calls were made, which tools were invoked, and where errors occurred.
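The kind of summary the waterfall surfaces can be sketched over a simplified span model. This is illustrative only, with invented field names, not Phoenix's internal types or API:

```typescript
// Illustrative only: a simplified span shape, not Phoenix's internal types.
interface Span {
  id: string;
  parentId: string | null;
  name: string;
  kind: "AGENT" | "LLM" | "TOOL"; // OpenInference-style span kinds
  startMs: number;
  endMs: number;
  error?: boolean;
}

// Derive what the waterfall view highlights for one trace:
// the slowest operation, the LLM call count, tools used, and failures.
function summarize(spans: Span[]) {
  const slowest = spans.reduce((a, b) =>
    b.endMs - b.startMs > a.endMs - a.startMs ? b : a
  );
  return {
    slowest: slowest.name,
    llmCalls: spans.filter((s) => s.kind === "LLM").length,
    toolsInvoked: spans.filter((s) => s.kind === "TOOL").map((s) => s.name),
    errors: spans.filter((s) => s.error).map((s) => s.name),
  };
}

const trace: Span[] = [
  { id: "1", parentId: null, name: "agent", kind: "AGENT", startMs: 0, endMs: 900 },
  { id: "2", parentId: "1", name: "plan", kind: "LLM", startMs: 10, endMs: 400 },
  { id: "3", parentId: "1", name: "search", kind: "TOOL", startMs: 410, endMs: 500, error: true },
  { id: "4", parentId: "1", name: "answer", kind: "LLM", startMs: 510, endMs: 880 },
];

console.log(summarize(trace));
// → { slowest: "agent", llmCalls: 2, toolsInvoked: ["search"], errors: ["search"] }
```

Here the parent agent span is the longest (it wraps everything), two LLM calls occurred, and the failed tool span is flagged.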

Because Phoenix understands OpenInference conventions, you can query traces by AI-specific criteria:

  • Find all traces using a particular model
  • Filter by token count
  • Search for tool invocations
  • Identify high-latency LLM calls
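These queries work because OpenInference attaches well-known attribute keys to each span. The filters below run over a simplified in-memory stand-in, not a Phoenix API; the attribute keys (`llm.model_name`, `llm.token_count.total`, `tool.name`) follow the OpenInference conventions, but verify the exact names against the spec for your version:

```typescript
// Simplified stand-in for spans carrying OpenInference-style attributes.
type AttrSpan = { name: string; attributes: Record<string, string | number> };

const spans: AttrSpan[] = [
  { name: "chat", attributes: { "llm.model_name": "gpt-4o", "llm.token_count.total": 1850 } },
  { name: "chat", attributes: { "llm.model_name": "gpt-4o-mini", "llm.token_count.total": 240 } },
  { name: "web_search", attributes: { "tool.name": "web_search" } },
];

// Find all spans using a particular model
const gpt4o = spans.filter((s) => s.attributes["llm.model_name"] === "gpt-4o");

// Filter by token count
const heavy = spans.filter(
  (s) => (s.attributes["llm.token_count.total"] as number) > 1000
);

// Search for tool invocations
const toolCalls = spans.filter((s) => "tool.name" in s.attributes);
```

A generic APM backend stores the same attributes but treats them as opaque strings; Phoenix's value is that these keys are first-class query dimensions.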

Evaluation Integration

Phoenix connects tracing with evaluation. Run LLM-as-judge evaluations on traced conversations, score responses for hallucination or relevance, and compare results across prompt variations.
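The LLM-as-judge pattern can be sketched as follows. The prompt template and labels here are illustrative assumptions, not a Phoenix-defined format, and the judge is a stub standing in for a real model call:

```typescript
// A minimal LLM-as-judge sketch. `Judge` stands in for a real model call.
type Judge = (prompt: string) => string;

// Hypothetical prompt template for a hallucination check.
function hallucinationPrompt(context: string, answer: string): string {
  return [
    "Given the context, label the answer 'factual' or 'hallucinated'.",
    `Context: ${context}`,
    `Answer: ${answer}`,
    "Label:",
  ].join("\n");
}

// Score traced (context, answer) pairs; return the fraction judged factual.
function evaluate(
  judge: Judge,
  examples: { context: string; answer: string }[]
): number {
  const factual = examples.filter(
    (ex) =>
      judge(hallucinationPrompt(ex.context, ex.answer)).trim().toLowerCase() ===
      "factual"
  ).length;
  return factual / examples.length;
}

// Stub judge for demonstration; in practice this would be an LLM call.
const stubJudge: Judge = (prompt) =>
  prompt.includes("capital of France is Paris") ? "factual" : "hallucinated";

const score = evaluate(stubJudge, [
  { context: "France's capital is Paris.", answer: "The capital of France is Paris." },
  { context: "France's capital is Paris.", answer: "The capital of France is Lyon." },
]);
console.log(score); // 0.5
```

Running the same evaluator over traces from two prompt variations gives a like-for-like score to compare them.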

Architecture

Phoenix sits at the end of the observability pipeline:

Agent → OTel SDK → OTLP Export → Phoenix

It receives spans via OTLP, stores them, and provides a web interface. Any OTel-instrumented application can send to Phoenix, and you can run Phoenix alongside other backends.
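Concretely, the export step amounts to pointing an OTLP exporter at Phoenix's collector endpoint. A minimal sketch of the endpoint wiring, assuming common local defaults (UI and collector on port 6006, OTLP/HTTP traces at `/v1/traces`), which you should verify against your Phoenix version and deployment:

```typescript
// Build the OTLP/HTTP traces endpoint from a Phoenix base URL.
// Port 6006 and the /v1/traces path are assumed local defaults;
// confirm them for your deployment.
function otlpTracesEndpoint(base: string = "http://localhost:6006"): string {
  return new URL("/v1/traces", base).toString();
}

// An OTel exporter (e.g. OTLPTraceExporter from
// @opentelemetry/exporter-trace-otlp-http) would be configured with
// this URL; any backend that speaks OTLP can sit behind the same setting.
console.log(otlpTracesEndpoint()); // http://localhost:6006/v1/traces
```

Because the endpoint is just an OTLP URL, switching backends, or fanning out to several, is an exporter configuration change, not an instrumentation change.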

Local-First Design

Phoenix runs as a single container with embedded storage. Spin it up locally for development and see traces immediately; traces stay on your infrastructure, and there are no external dependencies to get started.

For production, Phoenix supports PostgreSQL for durable storage and horizontal scaling, but local SQLite mode works for development.
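A deployment sketch of the two modes. The image name and environment variable below are assumptions to check against the Phoenix documentation, and the connection string is a placeholder:

```shell
# Local development: single container, embedded storage.
# (Image name assumed; verify against the Phoenix docs.)
docker run -p 6006:6006 arizephoenix/phoenix:latest

# Production sketch: point Phoenix at PostgreSQL for durable storage.
# The env var name is an assumption; the DSN is a placeholder.
docker run -p 6006:6006 \
  -e PHOENIX_SQL_DATABASE_URL="postgresql://user:pass@db-host:5432/phoenix" \
  arizephoenix/phoenix:latest
```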

TypeScript SDK

Phoenix provides TypeScript packages for instrumentation:

Package                        Purpose
@arizeai/phoenix-otel          Configures OTel with Phoenix defaults
@arizeai/openinference-core    Helpers for AI-specific spans
@arizeai/phoenix-client        REST API client

These packages are optional. Any OTel SDK works.

Phoenix and OTel

Phoenix is an OTel-native backend. OTel handles instrumentation and export. OpenInference adds AI-specific attributes. Phoenix consumes these traces and provides AI-focused analysis.

Because the layers are decoupled, Phoenix improvements don't require instrumentation changes, and instrumentation improvements benefit every backend.

Use Cases

  • Self-hosted LLM observability
  • Local development tracing
  • AI-specific analysis beyond generic APM
  • Evaluation workflows

For teams using Datadog or similar platforms, Phoenix can run alongside and receive the same traces.

External Resources