Agent SDK for Go records OpenTelemetry metrics for agent API calls and runtime operations. Metrics are a no-op until you wire an OTLP exporter.

Wire OTLP

Use WithObservabilityConfig — the same block configures traces, metrics, and logs:
a, err := agent.NewAgent(
    agent.WithLLMClient(llmClient),
    agent.WithObservabilityConfig(&agent.ObservabilityConfig{
        Endpoint: "collector:4317",
        Protocol: agent.OTLPProtocolGRPC,
        Insecure: true,
        // DisableMetrics: true, // opt out of metrics only
    }),
)
if err != nil {
    // Check the error before using a; a may be nil here.
    log.Fatalf("create agent: %v", err)
}
defer a.Close()
Apply the same WithObservabilityConfig on both NewAgent and NewAgentWorker. LLM calls, tool executions, and memory operations run as Temporal activities on the worker — without matching config, those spans and metrics are silently dropped and never reach your collector.

Bring your own metrics

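import "github.com/agenticenv/agent-sdk-go/pkg/observability"

// Build a standalone metrics client that exports to the collector.
metrics, err := observability.NewMetrics(
    observability.WithName("my-service"),
    observability.WithEndpoint("collector:4317"),
    observability.WithInsecure(true),
)
if err != nil {
    log.Fatalf("create metrics: %v", err)
}

// Hand the client to the agent instead of using WithObservabilityConfig.
a, err := agent.NewAgent(
    agent.WithLLMClient(llmClient),
    agent.WithMetrics(metrics),
)
if err != nil {
    log.Fatalf("create agent: %v", err)
}
defer a.Close()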
When WithObservabilityConfig enables metrics, any WithMetrics value is replaced by the OTLP client built from config. For advanced tuning (metrics export interval, headers, sampling), use observability.NewMetrics with observability.Option directly.
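
The precedence rule may be easier to see in code. The following is a minimal sketch, not the SDK's implementation: MetricsClient, settings, and resolveMetrics are hypothetical names invented for illustration, and the case where DisableMetrics is set together with WithMetrics (the custom client is kept) is an assumption rather than documented behavior.

```go
package main

import "fmt"

// MetricsClient is a hypothetical stand-in for the SDK's metrics client.
type MetricsClient struct {
	Source string // where the client came from, for illustration only
}

// settings collects what the two options contribute (hypothetical type).
type settings struct {
	custom         *MetricsClient // set by a WithMetrics-style option
	otlpEndpoint   string         // set by a WithObservabilityConfig-style option
	disableMetrics bool           // mirrors DisableMetrics in the config
}

// resolveMetrics applies the documented rule: a config that enables
// metrics wins over a custom client. With neither, metrics are a no-op.
func resolveMetrics(s settings) *MetricsClient {
	if s.otlpEndpoint != "" && !s.disableMetrics {
		return &MetricsClient{Source: "otlp:" + s.otlpEndpoint}
	}
	if s.custom != nil {
		// Assumption: a custom client survives when the config disables metrics.
		return s.custom
	}
	return &MetricsClient{Source: "noop"}
}

func main() {
	custom := &MetricsClient{Source: "custom"}

	both := settings{custom: custom, otlpEndpoint: "collector:4317"}
	fmt.Println(resolveMetrics(both).Source)

	disabled := settings{custom: custom, otlpEndpoint: "collector:4317", disableMetrics: true}
	fmt.Println(resolveMetrics(disabled).Source)

	fmt.Println(resolveMetrics(settings{}).Source)
}
```

In practice this means you should pick one of the two options per agent; passing both only makes sense if the config sets DisableMetrics.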

Agent API metrics

Emitted by Agent.Run, RunAsync, and Stream:
| Metric | Kind | Description |
| --- | --- | --- |
| agent.run.started | counter | Each Run / RunAsync invocation |
| agent.run.completed | counter | Successful run |
| agent.run.failed | counter | Failed run; includes error attribute |
| agent.run.duration_ms | histogram | Run wall-clock time in ms |
| agent.stream.started | counter | Each Stream invocation |
| agent.stream.dispatched | counter | Workflow successfully dispatched |
| agent.stream.failed | counter | Dispatch failed |
| agent.stream.duration_ms | histogram | Stream dispatch wall-clock time in ms |
Stream metrics cover the dispatch phase — not per-token streaming duration.
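
As one illustration of how the run counters might feed an alert or SLO, the sketch below computes a failure ratio from counter increases over a window. RunCounts and FailureRatio are hypothetical helpers, not part of the SDK, and the sample numbers are made up; in a real setup the increases would come from your metrics backend.

```go
package main

import "fmt"

// RunCounts holds counter increases observed over one evaluation window.
// Hypothetical type; the values would be read from a metrics backend.
type RunCounts struct {
	Completed float64 // increase of agent.run.completed
	Failed    float64 // increase of agent.run.failed
}

// FailureRatio returns failed / (completed + failed).
// It returns 0 when no run finished in the window, avoiding division by zero.
func FailureRatio(c RunCounts) float64 {
	finished := c.Completed + c.Failed
	if finished == 0 {
		return 0
	}
	return c.Failed / finished
}

func main() {
	window := RunCounts{Completed: 97, Failed: 3} // illustrative values
	fmt.Printf("failure ratio: %.2f\n", FailureRatio(window))
}
```

The denominator uses finished runs rather than agent.run.started, since started also counts runs that are still in flight at the end of the window.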

Runtime metrics

Emitted per LLM call, tool execution, retriever search, and memory operation on both in-process and Temporal runtimes.

LLM

| Metric | Kind | Description |
| --- | --- | --- |
| agent.llm.call.started | counter | LLM call started |
| agent.llm.call.completed | counter | LLM call succeeded |
| agent.llm.call.failed | counter | LLM call failed |
| agent.llm.latency_ms | histogram | LLM wall-clock time in ms |
| agent.llm.tokens.input | histogram | Prompt tokens when provider reports usage |
| agent.llm.tokens.output | histogram | Completion tokens when provider reports usage |

Tools

| Metric | Kind | Description |
| --- | --- | --- |
| agent.tool.call.started | counter | Tool execute started |
| agent.tool.call.completed | counter | Tool execute succeeded |
| agent.tool.call.failed | counter | Tool execute failed |
| agent.tool.latency_ms | histogram | Tool wall-clock time in ms |

Retriever

| Metric | Kind | Description |
| --- | --- | --- |
| agent.retriever.call.started | counter | Retriever search started |
| agent.retriever.call.completed | counter | Retriever search succeeded |
| agent.retriever.call.failed | counter | Retriever search failed |
| agent.retriever.latency_ms | histogram | Search wall-clock time in ms |

Memory

| Metric | Kind | Description |
| --- | --- | --- |
| agent.memory.recall.* | counter + histogram | Recall load operations |
| agent.memory.store.* | counter + histogram | Store operations |
| agent.memory.dedup.* | counter + histogram | Semantic dedup lookups |
| agent.memory.extract.* | counter + histogram | Run-end extraction (StoreModeAlways) |
Each memory operation family has .started, .completed, .failed counters and a .latency_ms histogram.
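
To make the wildcard rows concrete, the sketch below expands each family into the four instrument names described above. memoryMetricNames is a hypothetical helper written for this page, not an SDK function; it can be handy when generating dashboard queries or checking that every expected series arrives.

```go
package main

import "fmt"

// memoryMetricNames expands one memory operation family (for example
// "recall") into its three counters and one latency histogram.
func memoryMetricNames(family string) []string {
	prefix := "agent.memory." + family
	return []string{
		prefix + ".started",
		prefix + ".completed",
		prefix + ".failed",
		prefix + ".latency_ms",
	}
}

func main() {
	// The four families listed in the Memory table.
	for _, family := range []string{"recall", "store", "dedup", "extract"} {
		for _, name := range memoryMetricNames(family) {
			fmt.Println(name)
		}
	}
}
```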

Common attributes

Runtime metrics include attributes when available:
| Attribute | Values |
| --- | --- |
| model | LLM model name |
| provider | openai, anthropic, gemini |
| tool | Tool name |
| retriever | Retriever name |
| memory.kind | Memory record kind |
| memory.dedup | Dedup decision metadata |

Export to Prometheus

The SDK pushes OTLP metrics — it does not expose a Prometheus scrape endpoint directly. Use the OpenTelemetry Collector as a bridge: receive OTLP from the SDK and expose a /metrics endpoint for Prometheus to scrape. Minimal otel-collector-config.yaml:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: "0.0.0.0:4317"

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"   # Prometheus scrapes this

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
Prometheus scrape_configs entry:
scrape_configs:
  - job_name: agent-sdk
    static_configs:
      - targets: ["otel-collector:8889"]
The agent exports to localhost:4317 (OTLP gRPC), and the collector bridges to Prometheus on :8889:
agent.WithObservabilityConfig(&agent.ObservabilityConfig{
    Endpoint:      "localhost:4317",
    Protocol:      agent.OTLPProtocolGRPC,
    Insecure:      true,
    DisableTraces: true, // metrics only
    DisableLogs:   true,
})
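
Once metrics pass through the collector, the names Prometheus sees usually differ from the OTLP names listed on this page. The sketch below approximates the common translation: dots become underscores, and counters gain a _total suffix. promName is a hypothetical helper, and the mapping is an assumption about the Collector's Prometheus exporter defaults; the exact result depends on the collector version, its add_metric_suffixes setting, and any unit the SDK attaches (a unit may add a further suffix). Check the collector's /metrics output for the real names.

```go
package main

import (
	"fmt"
	"strings"
)

// promName approximates the Prometheus-side name of an OTLP metric.
// Assumption: dots are replaced with underscores and counters get "_total".
// Unit suffixes are deliberately not modeled here.
func promName(otelName string, isCounter bool) string {
	name := strings.ReplaceAll(otelName, ".", "_")
	if isCounter && !strings.HasSuffix(name, "_total") {
		name += "_total"
	}
	return name
}

func main() {
	fmt.Println(promName("agent.run.started", true))
	fmt.Println(promName("agent.llm.call.failed", true))
	// Histograms are exposed as several series that share this base name
	// with _bucket, _sum, and _count appended.
	fmt.Println(promName("agent.llm.latency_ms", false))
}
```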

Telemetry vs metrics

|  | Telemetry (Result.Telemetry) | OTLP metrics |
| --- | --- | --- |
| Scope | Per-run summary on the result | Continuous export to collector |
| Use | App logging, eval assertions | Dashboards, alerts, SLO tracking |
| Setup | Always populated; no config | Requires WithObservabilityConfig or WithMetrics |
See Telemetry.

Example

Observability: OTLP traces, metrics, and logs

Tracing: Distributed spans for the same operations

Token Usage: Per-run token totals on AgentRunResult

Logs: Structured SDK logs via OTLP

Configuration: WithObservabilityConfig and WithMetrics