GenAI Semantic Conventions¶
Semantic conventions define standard attribute names for telemetry data. OpenTelemetry's GenAI semantic conventions establish a shared vocabulary for AI operations so that one team's model_name matches another team's expectations.
Tooling builds on these standards. When Phoenix shows token usage or Datadog calculates LLM costs, they rely on consistent attribute names.
Attribute Categories¶
Operation Identity¶
Every GenAI span should identify what it's doing:
| Attribute | Purpose |
|---|---|
| `gen_ai.operation.name` | `chat`, `text_completion`, or `embeddings` |
| `gen_ai.provider.name` | `openai`, `anthropic`, `aws.bedrock`, etc. |
These support filtering and grouping.
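A minimal sketch of setting these attributes by hand with the OpenTelemetry Python API (instrumentation libraries normally do this for you; the span name follows the `{operation} {model}` pattern the conventions recommend):

```python
from opentelemetry import trace

tracer = trace.get_tracer("genai-demo")

# Span name follows the "{operation} {model}" convention
with tracer.start_as_current_span("chat gpt-4o") as span:
    span.set_attribute("gen_ai.operation.name", "chat")
    span.set_attribute("gen_ai.provider.name", "openai")
```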
Model Information¶
| Attribute | Purpose |
|---|---|
| `gen_ai.request.model` | What model was requested |
| `gen_ai.response.model` | What model responded (may differ) |
The distinction matters because providers sometimes route to different versions.
Token Usage¶
| Attribute | Purpose |
|---|---|
| `gen_ai.usage.input_tokens` | Tokens in the prompt |
| `gen_ai.usage.output_tokens` | Tokens in the response |
Token counts are required for cost analysis.
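Putting the last two tables together, a sketch of recording model and usage attributes from a provider reply. The `response` shape here is hypothetical; adapt the field names to your client library:

```python
def record_response(span, requested_model: str, response) -> None:
    """Record model and token-usage attributes from a provider reply.

    `response` is assumed to expose `.model` and `.usage.input_tokens` /
    `.usage.output_tokens`; real client libraries vary.
    """
    span.set_attribute("gen_ai.request.model", requested_model)
    # The served model may differ from the requested one, e.g. a
    # dated snapshot behind an alias
    span.set_attribute("gen_ai.response.model", response.model)
    span.set_attribute("gen_ai.usage.input_tokens", response.usage.input_tokens)
    span.set_attribute("gen_ai.usage.output_tokens", response.usage.output_tokens)
```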
Request Parameters¶
Model configuration affects output quality and cost:
| Attribute | Purpose |
|---|---|
| `gen_ai.request.temperature` | Sampling randomness |
| `gen_ai.request.max_tokens` | Maximum tokens to generate |
| `gen_ai.request.top_p` | Nucleus sampling threshold |
Capturing these helps correlate parameters with output quality or debug reproducibility issues.
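A sketch that records only the parameters actually sent, since requests often leave them at provider defaults:

```python
def record_request_params(span, params: dict) -> None:
    # Attribute names mirror the table above; skip anything the
    # request didn't set rather than recording assumed defaults
    for key in ("temperature", "max_tokens", "top_p"):
        if key in params:
            span.set_attribute(f"gen_ai.request.{key}", params[key])
```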
OpenInference Extensions¶
OpenInference adds AI-specific span kinds and attributes beyond core GenAI conventions:
Span Kind Attribute¶
OpenInference defines `openinference.span.kind`, with values such as `LLM`, `CHAIN`, `RETRIEVER`, `EMBEDDING`, `TOOL`, and `AGENT`. Tooling uses this to categorize spans by AI operation type rather than generic client/server distinctions.
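For example, a sketch of tagging a retrieval span:

```python
from opentelemetry import trace

tracer = trace.get_tracer("openinference-demo")

with tracer.start_as_current_span("retrieve_documents") as span:
    # AI-specific categorization instead of CLIENT/SERVER span kinds
    span.set_attribute("openinference.span.kind", "RETRIEVER")
```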
Input/Output Capture¶
| Attribute | Purpose |
|---|---|
| `input.value` | What went into the operation |
| `output.value` | What came out |
| `llm.input_messages` | Structured chat messages sent to the model |
| `llm.output_messages` | Structured messages returned by the model |
These support content inspection but raise PII considerations in production.
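A sketch of populating both the flat and structured forms. OpenInference flattens message lists into indexed attribute keys (`llm.input_messages.0.message.role`, and so on), which is the shape assumed here:

```python
import json

def record_io(span, messages: list[dict], reply_text: str) -> None:
    # Flat values: quick to inspect in any trace viewer
    span.set_attribute("input.value", json.dumps(messages))
    span.set_attribute("output.value", reply_text)
    # Structured messages: flattened into indexed attribute keys
    for i, msg in enumerate(messages):
        span.set_attribute(f"llm.input_messages.{i}.message.role", msg["role"])
        span.set_attribute(f"llm.input_messages.{i}.message.content", msg["content"])
```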
Tool Attributes¶
| Attribute | Purpose |
|---|---|
| `gen_ai.tool.name` | Which tool was called |
| `gen_ai.tool.call.arguments` | Arguments passed to the tool |
| `gen_ai.tool.call.result` | What the tool returned |
Tool tracing helps debug agent behavior. You need to see what tools returned to understand subsequent decisions.
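A sketch of recording a single tool call; serializing arguments and results means the agent's next decision can be explained from the trace alone:

```python
import json

def record_tool_call(span, name: str, arguments: dict, result) -> None:
    span.set_attribute("gen_ai.tool.name", name)
    # Serialize so non-string payloads survive as span attributes
    span.set_attribute("gen_ai.tool.call.arguments", json.dumps(arguments))
    span.set_attribute("gen_ai.tool.call.result", json.dumps(result))
```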
Agent Attributes¶
| Attribute | Purpose |
|---|---|
| `gen_ai.agent.name` | Agent identifier |
| `gen_ai.agent.id` | Unique instance ID |
| `gen_ai.conversation.id` | Multi-turn session identifier |
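A sketch tying the three together across a session; the agent name and conversation ID are illustrative values:

```python
import uuid
from opentelemetry import trace

tracer = trace.get_tracer("agent-demo")

with tracer.start_as_current_span("invoke_agent research-assistant") as span:
    span.set_attribute("gen_ai.agent.name", "research-assistant")  # illustrative
    span.set_attribute("gen_ai.agent.id", str(uuid.uuid4()))
    # Reuse the same conversation.id on every turn of the session
    span.set_attribute("gen_ai.conversation.id", "conv-2847")
```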
Content Capture¶
Capturing prompts and responses supports debugging but creates challenges.
Full content capture lets you reproduce issues exactly and build evaluation datasets from production. But it risks PII exposure, increases storage costs for large contexts, and may conflict with compliance requirements.
Common approaches: capture in development but omit in production; hash or redact sensitive content; sample rather than capture everything.
The conventions define where content goes. Policy determines whether to populate those attributes.
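A sketch of the "omit in production" policy as a single gate; `GENAI_CAPTURE_CONTENT` is a made-up flag name, not a standard variable:

```python
import os

# Hypothetical flag; wire this to whatever config system you use
CAPTURE_CONTENT = os.getenv("GENAI_CAPTURE_CONTENT", "false").lower() == "true"

def maybe_record_content(span, prompt: str, completion: str) -> None:
    # The attribute names are standard; populating them is policy
    if CAPTURE_CONTENT:
        span.set_attribute("input.value", prompt)
        span.set_attribute("output.value", completion)
```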
Cardinality¶
Some attributes have bounded values (provider names, operation types). Others are unbounded (response IDs, conversation IDs).
High-cardinality attributes work in trace storage but cause problems in metrics aggregation. Include them on spans, exclude them from metrics dimensions.
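A sketch of that split: the unbounded ID goes on the span, while the metric records only bounded dimensions (`gen_ai.client.token.usage` is assumed here as the conventions' token-usage histogram name):

```python
from opentelemetry import metrics, trace

tracer = trace.get_tracer("genai-demo")
meter = metrics.get_meter("genai-demo")
token_usage = meter.create_histogram("gen_ai.client.token.usage")

with tracer.start_as_current_span("chat gpt-4o") as span:
    # Unbounded value: fine as a span attribute
    span.set_attribute("gen_ai.conversation.id", "conv-2847")
    # Bounded dimensions only, or metric cardinality explodes
    token_usage.record(
        512,
        attributes={
            "gen_ai.provider.name": "openai",
            "gen_ai.operation.name": "chat",
        },
    )
```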