File | Last commit | Last updated
feat(agent_teams): add OpenTelemetry observability subsystem

Bridge OTel TracerProvider to three existing injection points without modifying any observed code:

- AsyncCallbackFramework: LLM invoke/stream/error, tool start/finish/error, agent invoke (~80% of observable surface). First-token latency captured on the first stream chunk; reasoning_content split into a child span.
- TeamAgent.add_event_listener: team / member / task / message events attached to a long-lived team root span, with task spans for terminal task events.
- DeepAgent rails: a minimal ObservabilityRail covers only the outer task-loop iteration boundary; the other 8 hooks are intentionally no-ops because the Callback handlers already cover them.

Attribute keys follow OpenLLMetry semantic conventions (gen_ai.*) plus agentteam.* / deepagent.* namespaces. Prompt and completion bodies are preserved by default; redaction is opt-in via ObservabilityConfig. Exporter is selectable (otlp_grpc / otlp_http / console) and can be disabled wholesale.

Tests: 9 pytest cases under tests/unit_tests/agent_teams/observability/ exercise the streaming TTFT path, reasoning child span, tool nesting, error propagation, monitor event dispatch, rail iteration spans, redaction toggle, and the disabled no-op path.

Deployment: docker-compose stack (OTel Collector + Langfuse) under deploy/observability/ for local end-to-end verification, plus a matching example entry point at examples/agent_teams/agent_team_observability_e2e.py.

Refs: #751

24 days ago
README.md

Agent Teams Observability Stack

Local docker-compose deployment for the OpenTelemetry observability subsystem shipped under openjiuwen/agent_teams/observability/.

What's in here

  • docker-compose.yml — OTel Collector + Langfuse (Postgres + ClickHouse).
  • otel-collector-config.yaml — Collector pipeline definition.

Quick start

cd deploy/observability
docker-compose up -d

After the stack is healthy:

  • Langfuse UI: http://localhost:3000 (sign up on first visit, then create a project).
  • OTLP gRPC endpoint for the application: http://localhost:4317.
  • OTLP HTTP endpoint: http://localhost:4318.
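Before wiring the application to the collector, it can help to confirm that the three ports listed above accept connections. The following is a minimal stdlib-only sketch, not part of the shipped subsystem; the helper names `port_is_open` and `check_stack` are hypothetical, and an open TCP port only shows that something is listening, not that the service behind it is healthy.

```python
import socket

# Ports taken from the endpoint list above; the labels are only for display.
STACK_PORTS = {
    "Langfuse UI": 3000,
    "OTLP gRPC": 4317,
    "OTLP HTTP": 4318,
}


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_stack(host: str = "localhost") -> dict[str, bool]:
    """Map each service label to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in STACK_PORTS.items()}


if __name__ == "__main__":
    for name, ok in check_stack().items():
        print(f"{name}: {'reachable' if ok else 'not reachable'}")
```

If a port reports as not reachable shortly after startup, the containers may still be initializing; rerunning the check after a few seconds is usually enough.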

Point the application at the collector via ObservabilityConfig:

from openjiuwen.agent_teams.observability import (
    ObservabilityConfig,
    init_observability,
)

init_observability(
    ObservabilityConfig(
        service_name="my-service",
        exporter="otlp_grpc",
        endpoint="http://localhost:4317",
    ),
)

Backend choice

Default: Langfuse. The reasons are documented in /Users/alan/.claude/plans/opentelemetry-replicated-quilt.md (section D2). To use Phoenix, Jaeger, or Tempo instead of langfuse-server, edit the exporters block in otel-collector-config.yaml and point the pipeline at the new exporter.

Production notes

  • Replace NEXTAUTH_SECRET, SALT, and the database password before exposing the stack outside localhost.
  • Set ObservabilityConfig.sample_rate < 1.0 in production (e.g. 0.1) to bound trace volume.
  • Migrate Postgres / ClickHouse to managed services for any non-toy deployment — running stateful databases in docker-compose is fine for evaluation only.
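To illustrate how a sample rate below 1.0 bounds trace volume, here is a stdlib-only sketch of trace-ID ratio sampling, the idea behind the OpenTelemetry SDK's TraceIdRatioBased sampler. How ObservabilityConfig.sample_rate is actually applied inside the subsystem is not stated in this README, so treat the mapping as an assumption; `should_sample` and `kept_fraction` are hypothetical helper names.

```python
import random

# Only the low 64 bits of the 128-bit trace ID take part in the decision.
TRACE_ID_MASK = (1 << 64) - 1


def should_sample(trace_id: int, sample_rate: float) -> bool:
    """Deterministically keep a trace if its ID falls below the rate-scaled bound."""
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be in [0.0, 1.0]")
    bound = round(sample_rate * (TRACE_ID_MASK + 1))
    return (trace_id & TRACE_ID_MASK) < bound


def kept_fraction(sample_rate: float, n_traces: int = 100_000, seed: int = 0) -> float:
    """Estimate the share of randomly generated trace IDs that would be kept."""
    rng = random.Random(seed)
    kept = sum(should_sample(rng.getrandbits(128), sample_rate) for _ in range(n_traces))
    return kept / n_traces


if __name__ == "__main__":
    for rate in (1.0, 0.5, 0.1):
        print(f"sample_rate={rate}: kept about {kept_fraction(rate):.3f} of traces")
```

Because the decision depends only on the trace ID, every service that sees the same trace reaches the same verdict, so sampled traces stay complete rather than losing random spans.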