| ✨ feat: per-call llm_generation_tracing observability (#15124)
* ✨ feat(database): add llm_generation_tracing schema + tracing package (LOBE-9462)
Foundation layer for per-call observability of `generateObject` calls.
- New Drizzle table `llm_generation_tracing` with identity / context / model /
result / usage / storage / feedback / audit columns and full single-column
index coverage (Postgres bitmap-scan friendly). Migration 0103 is idempotent
(CREATE TABLE/INDEX IF NOT EXISTS) for safe re-runs.
- `LlmGenerationTracingModel` with `record` / `updateFeedback` / `findById` /
`listRecent`, all userId-scoped to prevent cross-user leaks.
- New package `@lobechat/llm-generation-tracing` mirroring agent-tracing's
shape: `ITracingStore` interface, `FileTracingStore` (local/dev, scenario
subfolders + latest.json symlink), `computePromptHash` (6-char sha256 of
systemPrompt + schema), and `TRACING_SCENARIO_REGISTRY` + `resolveScenario`
with explicit scenario override.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(model-runtime): wire llm_generation_tracing into ModelRuntime.generateObject (LOBE-9462)
Per-call interception layer — one hook covers all generateObject callers.
- New `onGenerateObjectComplete` hook on `ModelRuntimeHooks`: always fires
(success or failure) with latency, usage, output/error. Fixes the gap where
`onGenerateObjectFinal` only fires when the runtime invokes `onUsage`.
- `S3TracingStore` (zstd level 3, key
`llm-generation-tracing/{scenario}/{v}-{hash}/{date}/{id}.json.zst`) and
`LLMGenerationTracingService` that does DB insert → store.save → patch
storage_key. Store failures preserve the row with `metadata.store_error`.
- `createLLMGenerationTracingHook` + `mergeModelRuntimeHooks` wired into
`initModelRuntimeFromDB`; tracing runs alongside business (billing) hooks
via `next/server.after()` when available, microtask fallback otherwise.
Unknown metadata keys (e.g. `parent_memory_trace_key`) pass through.
- Memory extractor accepts `parentMemoryTraceKey` option for the job-level
backlink. Follow-up-action caller given an explicit `scenario: 'follow_up'`
metadata override — it was the only OSS caller missing trigger metadata.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✅ test(llm-generation-tracing): type vi.fn mocks so tsgo accepts mock.calls indexing
The hook + service tests destructured `mock.calls[0][0]` and accessed nested
fields, which tsgo flagged as TS2493 / TS18046 because `vi.fn()` defaults to a
zero-arg signature. Add explicit type parameters to the mocks so tsgo can
infer the call tuple, and cast `call.payload` at the access point.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(model-runtime): move mergeModelRuntimeHooks into the package
It's a generic utility for composing `ModelRuntimeHooks` instances — same
import surface as `ModelRuntime` and the hooks interface — so it belongs
alongside them rather than tucked under a server-side consumer.
- New `packages/model-runtime/src/core/mergeHooks.ts` exports
`mergeModelRuntimeHooks` and is re-exported from the package index.
- Move the unit tests to `packages/model-runtime/src/core/mergeHooks.test.ts`,
including a new case covering the "a throws → b is skipped" load-bearing
semantics.
- `src/server/services/llmGenerationTracing/hook.ts` drops the local copy and
the consumer (`src/server/modules/ModelRuntime/index.ts`) imports from
`@lobechat/model-runtime`.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(llm-generation-tracing): version lives with the prompt, not in a central table
`promptVersion` was baked into `TRACING_SCENARIO_REGISTRY`, far from any
prompt definition — editing a prompt + forgetting to bump the entry in a
completely different file was an obvious foot-gun.
- Registry is now `Record<string, string>` mapping trigger → scenario only;
it's the stable concern that rarely changes.
- `resolveScenario` always passes `promptVersion` through from the caller,
defaulting to `UNKNOWN_PROMPT_VERSION` ('v0') when absent.
- Each call site declares its own `*_PROMPT_VERSION` constant next to the
prompt it describes. `followUpAction` ships the first one:
`FOLLOW_UP_PROMPT_VERSION` in `prompts/index.ts`, threaded through
`metadata.promptVersion` at the `generateObject` call. Other callers can
add the same constant when they next touch their prompts.
The 6-char prompt hash on the row still catches forgotten bumps.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(input-completion): wire prompt-version metadata at the auto-complete call site
Aligns input auto-complete with the FOLLOW_UP_PROMPT_VERSION convention so
each prompt iteration is recordable as the chat-side tracing lands.
- `INPUT_COMPLETION_PROMPT_VERSION = 'v1.0'` declared next to
`chainInputCompletion` — bump together with the prompt body.
- `fetchPresetTaskResult` accepts optional `metadata` and forwards it to
`getChatCompletion`; the existing chat path already plumbs metadata to
`ModelRuntime.chat` options.
- `InputEditor` call site passes
`{ scenario: 'input_completion', promptVersion }`.
Note: `llm_generation_tracing` currently only fires from
`onGenerateObjectComplete`. Input completion is a `chat` call, so this
metadata is forward-looking until a chat-side tracing hook lands.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* 🐛 fix(llm-generation-tracing): collapse bucketDir path.join args to silence turbopack glob warning
Turbopack's static analyzer treats `path.join(root, dyn1, dyn2)` as a
multi-segment glob pattern and warned that it could match ~12k files in
the project. Compose the relative subdir as a single string first, so
`path.join` only sees one dynamic segment.
Behavior unchanged — the resulting path is identical.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(input-completion): route auto-complete through generateObject for tracing
Auto-complete is the first preset-task caller migrated to the structured-
output path so it lands in `llm_generation_tracing` via the existing
`onGenerateObjectComplete` hook. No new server hook, no global chat-side
tracing.
- `chainInputCompletion` now returns `{ messages, schema }` with a minimal
`{ completion: string }` schema and a stable `INPUT_COMPLETION_SCHEMA_NAME`
constant. JSON wrapping costs ~15-30 tokens against a 100-token completion
budget — negligible for the observability win.
- `StructureOutputSchema` / `StructureOutputParams` accept optional
`metadata`; `aiChatRouter.outputJSON` merges caller metadata over the
default trigger so `{ scenario, promptVersion, schemaName }` reach
`ModelRuntime.generateObject` options unchanged.
- `IStructureSchema.description` is now optional to match the zod schema —
previously the TS type was stricter than runtime validation accepted.
- `InputEditor` switches from `chatService.fetchPresetTaskResult` to
`aiChatService.generateJSON`, reading `response.completion`. Streaming
is dropped because auto-complete already buffers the full result before
inserting; no UX change.
- Reverts the unused `metadata` field that was added to
`fetchPresetTaskResult` in the previous commit — no current caller needs
it now that input completion uses the generateObject path.
Bumps `INPUT_COMPLETION_PROMPT_VERSION` to v2.0 because the system prompt
gained an "output the completion field" instruction.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(aiGeneration): extract the runtime-init + generateObject dance into a service
Every server-side caller that produces structured output was repeating the
same two-step ritual: `initModelRuntimeFromDB(...)` → `runtime.generateObject(payload, { metadata })`.
`AiGenerationService` collapses it into one call so future cross-cutting
concerns (default metadata, retry, observability hooks) have one place to
land.
- New `src/server/services/aiGeneration/index.ts` exposes
`generateObject<T>(input, options)` and is unit-tested for provider
resolution + payload/metadata pass-through.
- `aiChatRouter.outputJSON` and `FollowUpActionService.extract` migrated to
the service (other callers move organically when next touched).
- Drops the unused `keyVaultsPayload` field from `StructureOutputParams`
and the placeholder at the InputEditor call site — key vaults are
server-resolved from DB, the client never supplies them.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(tracing): centralize TRACING_SCENARIOS const + inject AiGenerationService via trpc ctx
- New `packages/const/src/llmGenerationTracing.ts` exports `TRACING_SCENARIOS`
+ `TracingScenario` type — the single directory where every known scenario
name lives. Adds `@lobechat/const` as a workspace dep on llm-generation-
tracing so `TRACING_SCENARIO_REGISTRY` can reference the same literals.
- Callers (FollowUpActionService, InputEditor) replace `'follow_up'` /
`'input_completion'` string literals with `TRACING_SCENARIOS.FollowUp` /
`.InputCompletion`, so a typo or a rename fails the type-check instead of
silently drifting on the row.
- `AiGenerationService` is now injected into the `aiChatProcedure` ctx
middleware alongside `aiChatService`; `outputJSON` consumes it via
`ctx.aiGenerationService` instead of new-ing it inside the handler.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(llm-generation-tracing): add lt/llm-tracing CLI + drop local-only storage_key
- Add `lt` / `llm-tracing` CLI under @lobechat/llm-generation-tracing with
`list` (recent records, --scenario filter, --json) and `inspect` (by
tracing_id prefix or latest, --full, --json).
- `FileTracingStore.save` now returns `{ key: null }` so dev DB rows leave
`storage_key` empty instead of recording a non-resolvable local path; S3
store remains the source of truth for the real key. Add helpers
`findByTracingId` / `getLatest` used by the CLI.
- Wire `agentId` and `topicId` into `input_completion` tracing metadata
from the chat input auto-complete call site.
- Default `FileTracingStore` whenever NODE_ENV=development (drop the
ENABLE_LLM_GENERATION_TRACING_LOCAL opt-in env var).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* 💄 style(llm-generation-tracing): prettier CLI output (tree + colors)
Mirror the @lobechat/agent-tracing viewer style:
- Inline ANSI color helpers (dim/bold/cyan/magenta/green/yellow/red).
- Compact single-line header with id, scenario, version, model, status,
time — replaces the multi-line bullet list.
- Tree structure with `├─`/`└─` connectors instead of `── section ──`
banners.
- input arrays render per-message (role + char count + preview) rather
than dumping raw JSON.
- Small single-key outputs (e.g. `{ completion: "怎么样" }`) collapse
to inline `key: "value"`.
- `lt list` switches to a colored, properly padded table.
Default view stays compact; --full expands system_prompt / input /
schema bodies.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(llm-generation-tracing): split `tracing` config out of `metadata`
`options.metadata` was overloaded — half tracing-specific structured fields
(scenario / promptVersion / schemaName / agentId / topicId / ...), half
free-form jsonb passthrough. Callers couldn't tell which was which, and the
inputHint was always auto-extracted (useless when the prompt wraps the user's
text in a template).
This commit introduces a dedicated `tracing` option:
- Add `TracingOptions` to @lobechat/llm-generation-tracing — the typed shape
callers import (agentId / topicId / inputHint / scenario / promptVersion /
schemaName / systemPrompt / parentTracingId / metadata).
- Add loose `tracing?: Record<string, unknown>` to GenerateObjectOptions and
StructureOutputParams / StructureOutputSchema so the field flows through
the runtime + TRPC.
- Tracing hook now reads `context.options.tracing` for structured fields; it
still falls back to `metadata.trigger` for the cross-cutting trigger string
(ModelRuntime itself uses metadata.trigger for timing logs, so trigger
stays on metadata).
- Service `record()` accepts an explicit `inputHint`; otherwise falls back
to auto-extraction from the first user message. Always truncated.
- Free-form jsonb fields move to `tracing.metadata` (was unknown-key passthrough
on `metadata`).
- Call sites updated:
- FollowUpAction now passes `tracing: { scenario, promptVersion, schemaName,
topicId }` (previously `metadata`).
- InputCompletion now passes `tracing: { agentId, topicId, inputHint: input,
scenario, promptVersion, schemaName }` — `inputHint` is the user's actual
typed text, not the wrapper prompt's first user message.
- `aiChat.outputJSON` router forwards both metadata and tracing.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* Update inputCompletion.ts
* 🐛 fix(llm-generation-tracing): stop duplicating provider into the row's metadata jsonb
`provider` is already a first-class column on the `llm_generation_tracing`
row, so auto-stamping it into the `metadata` jsonb column on every call was
pure noise. The hook now writes the caller-supplied `tracing.metadata`
verbatim — empty/undefined when the caller had nothing to add.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> | 19 小时前 |
| ✨ feat: per-call llm_generation_tracing observability (#15124)
* ✨ feat(database): add llm_generation_tracing schema + tracing package (LOBE-9462)
Foundation layer for per-call observability of `generateObject` calls.
- New Drizzle table `llm_generation_tracing` with identity / context / model /
result / usage / storage / feedback / audit columns and full single-column
index coverage (Postgres bitmap-scan friendly). Migration 0103 is idempotent
(CREATE TABLE/INDEX IF NOT EXISTS) for safe re-runs.
- `LlmGenerationTracingModel` with `record` / `updateFeedback` / `findById` /
`listRecent`, all userId-scoped to prevent cross-user leaks.
- New package `@lobechat/llm-generation-tracing` mirroring agent-tracing's
shape: `ITracingStore` interface, `FileTracingStore` (local/dev, scenario
subfolders + latest.json symlink), `computePromptHash` (6-char sha256 of
systemPrompt + schema), and `TRACING_SCENARIO_REGISTRY` + `resolveScenario`
with explicit scenario override.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(model-runtime): wire llm_generation_tracing into ModelRuntime.generateObject (LOBE-9462)
Per-call interception layer — one hook covers all generateObject callers.
- New `onGenerateObjectComplete` hook on `ModelRuntimeHooks`: always fires
(success or failure) with latency, usage, output/error. Fixes the gap where
`onGenerateObjectFinal` only fires when the runtime invokes `onUsage`.
- `S3TracingStore` (zstd level 3, key
`llm-generation-tracing/{scenario}/{v}-{hash}/{date}/{id}.json.zst`) and
`LLMGenerationTracingService` that does DB insert → store.save → patch
storage_key. Store failures preserve the row with `metadata.store_error`.
- `createLLMGenerationTracingHook` + `mergeModelRuntimeHooks` wired into
`initModelRuntimeFromDB`; tracing runs alongside business (billing) hooks
via `next/server.after()` when available, microtask fallback otherwise.
Unknown metadata keys (e.g. `parent_memory_trace_key`) pass through.
- Memory extractor accepts `parentMemoryTraceKey` option for the job-level
backlink. Follow-up-action caller given an explicit `scenario: 'follow_up'`
metadata override — it was the only OSS caller missing trigger metadata.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✅ test(llm-generation-tracing): type vi.fn mocks so tsgo accepts mock.calls indexing
The hook + service tests destructured `mock.calls[0][0]` and accessed nested
fields, which tsgo flagged as TS2493 / TS18046 because `vi.fn()` defaults to a
zero-arg signature. Add explicit type parameters to the mocks so tsgo can
infer the call tuple, and cast `call.payload` at the access point.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(model-runtime): move mergeModelRuntimeHooks into the package
It's a generic utility for composing `ModelRuntimeHooks` instances — same
import surface as `ModelRuntime` and the hooks interface — so it belongs
alongside them rather than tucked under a server-side consumer.
- New `packages/model-runtime/src/core/mergeHooks.ts` exports
`mergeModelRuntimeHooks` and is re-exported from the package index.
- Move the unit tests to `packages/model-runtime/src/core/mergeHooks.test.ts`,
including a new case covering the "a throws → b is skipped" load-bearing
semantics.
- `src/server/services/llmGenerationTracing/hook.ts` drops the local copy and
the consumer (`src/server/modules/ModelRuntime/index.ts`) imports from
`@lobechat/model-runtime`.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(llm-generation-tracing): version lives with the prompt, not in a central table
`promptVersion` was baked into `TRACING_SCENARIO_REGISTRY`, far from any
prompt definition — editing a prompt + forgetting to bump the entry in a
completely different file was an obvious foot-gun.
- Registry is now `Record<string, string>` mapping trigger → scenario only;
it's the stable concern that rarely changes.
- `resolveScenario` always passes `promptVersion` through from the caller,
defaulting to `UNKNOWN_PROMPT_VERSION` ('v0') when absent.
- Each call site declares its own `*_PROMPT_VERSION` constant next to the
prompt it describes. `followUpAction` ships the first one:
`FOLLOW_UP_PROMPT_VERSION` in `prompts/index.ts`, threaded through
`metadata.promptVersion` at the `generateObject` call. Other callers can
add the same constant when they next touch their prompts.
The 6-char prompt hash on the row still catches forgotten bumps.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(input-completion): wire prompt-version metadata at the auto-complete call site
Aligns input auto-complete with the FOLLOW_UP_PROMPT_VERSION convention so
each prompt iteration is recordable as the chat-side tracing lands.
- `INPUT_COMPLETION_PROMPT_VERSION = 'v1.0'` declared next to
`chainInputCompletion` — bump together with the prompt body.
- `fetchPresetTaskResult` accepts optional `metadata` and forwards it to
`getChatCompletion`; the existing chat path already plumbs metadata to
`ModelRuntime.chat` options.
- `InputEditor` call site passes
`{ scenario: 'input_completion', promptVersion }`.
Note: `llm_generation_tracing` currently only fires from
`onGenerateObjectComplete`. Input completion is a `chat` call, so this
metadata is forward-looking until a chat-side tracing hook lands.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* 🐛 fix(llm-generation-tracing): collapse bucketDir path.join args to silence turbopack glob warning
Turbopack's static analyzer treats `path.join(root, dyn1, dyn2)` as a
multi-segment glob pattern and warned that it could match ~12k files in
the project. Compose the relative subdir as a single string first, so
`path.join` only sees one dynamic segment.
Behavior unchanged — the resulting path is identical.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(input-completion): route auto-complete through generateObject for tracing
Auto-complete is the first preset-task caller migrated to the structured-
output path so it lands in `llm_generation_tracing` via the existing
`onGenerateObjectComplete` hook. No new server hook, no global chat-side
tracing.
- `chainInputCompletion` now returns `{ messages, schema }` with a minimal
`{ completion: string }` schema and a stable `INPUT_COMPLETION_SCHEMA_NAME`
constant. JSON wrapping costs ~15-30 tokens against a 100-token completion
budget — negligible for the observability win.
- `StructureOutputSchema` / `StructureOutputParams` accept optional
`metadata`; `aiChatRouter.outputJSON` merges caller metadata over the
default trigger so `{ scenario, promptVersion, schemaName }` reach
`ModelRuntime.generateObject` options unchanged.
- `IStructureSchema.description` is now optional to match the zod schema —
previously the TS type was stricter than runtime validation accepted.
- `InputEditor` switches from `chatService.fetchPresetTaskResult` to
`aiChatService.generateJSON`, reading `response.completion`. Streaming
is dropped because auto-complete already buffers the full result before
inserting; no UX change.
- Reverts the unused `metadata` field that was added to
`fetchPresetTaskResult` in the previous commit — no current caller needs
it now that input completion uses the generateObject path.
Bumps `INPUT_COMPLETION_PROMPT_VERSION` to v2.0 because the system prompt
gained an "output the completion field" instruction.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(aiGeneration): extract the runtime-init + generateObject dance into a service
Every server-side caller that produces structured output was repeating the
same two-step ritual: `initModelRuntimeFromDB(...)` → `runtime.generateObject(payload, { metadata })`.
`AiGenerationService` collapses it into one call so future cross-cutting
concerns (default metadata, retry, observability hooks) have one place to
land.
- New `src/server/services/aiGeneration/index.ts` exposes
`generateObject<T>(input, options)` and is unit-tested for provider
resolution + payload/metadata pass-through.
- `aiChatRouter.outputJSON` and `FollowUpActionService.extract` migrated to
the service (other callers move organically when next touched).
- Drops the unused `keyVaultsPayload` field from `StructureOutputParams`
and the placeholder at the InputEditor call site — key vaults are
server-resolved from DB, the client never supplies them.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(tracing): centralize TRACING_SCENARIOS const + inject AiGenerationService via trpc ctx
- New `packages/const/src/llmGenerationTracing.ts` exports `TRACING_SCENARIOS`
+ `TracingScenario` type — the single directory where every known scenario
name lives. Adds `@lobechat/const` as a workspace dep on llm-generation-
tracing so `TRACING_SCENARIO_REGISTRY` can reference the same literals.
- Callers (FollowUpActionService, InputEditor) replace `'follow_up'` /
`'input_completion'` string literals with `TRACING_SCENARIOS.FollowUp` /
`.InputCompletion`, so a typo or a rename fails the type-check instead of
silently drifting on the row.
- `AiGenerationService` is now injected into the `aiChatProcedure` ctx
middleware alongside `aiChatService`; `outputJSON` consumes it via
`ctx.aiGenerationService` instead of new-ing it inside the handler.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(llm-generation-tracing): add lt/llm-tracing CLI + drop local-only storage_key
- Add `lt` / `llm-tracing` CLI under @lobechat/llm-generation-tracing with
`list` (recent records, --scenario filter, --json) and `inspect` (by
tracing_id prefix or latest, --full, --json).
- `FileTracingStore.save` now returns `{ key: null }` so dev DB rows leave
`storage_key` empty instead of recording a non-resolvable local path; S3
store remains the source of truth for the real key. Add helpers
`findByTracingId` / `getLatest` used by the CLI.
- Wire `agentId` and `topicId` into `input_completion` tracing metadata
from the chat input auto-complete call site.
- Default `FileTracingStore` whenever NODE_ENV=development (drop the
ENABLE_LLM_GENERATION_TRACING_LOCAL opt-in env var).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* 💄 style(llm-generation-tracing): prettier CLI output (tree + colors)
Mirror the @lobechat/agent-tracing viewer style:
- Inline ANSI color helpers (dim/bold/cyan/magenta/green/yellow/red).
- Compact single-line header with id, scenario, version, model, status,
time — replaces the multi-line bullet list.
- Tree structure with `├─`/`└─` connectors instead of `── section ──`
banners.
- input arrays render per-message (role + char count + preview) rather
than dumping raw JSON.
- Small single-key outputs (e.g. `{ completion: "怎么样" }`) collapse
to inline `key: "value"`.
- `lt list` switches to a colored, properly padded table.
Default view stays compact; --full expands system_prompt / input /
schema bodies.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(llm-generation-tracing): split `tracing` config out of `metadata`
`options.metadata` was overloaded — half tracing-specific structured fields
(scenario / promptVersion / schemaName / agentId / topicId / ...), half
free-form jsonb passthrough. Callers couldn't tell which was which, and the
inputHint was always auto-extracted (useless when the prompt wraps the user's
text in a template).
This commit introduces a dedicated `tracing` option:
- Add `TracingOptions` to @lobechat/llm-generation-tracing — the typed shape
callers import (agentId / topicId / inputHint / scenario / promptVersion /
schemaName / systemPrompt / parentTracingId / metadata).
- Add loose `tracing?: Record<string, unknown>` to GenerateObjectOptions and
StructureOutputParams / StructureOutputSchema so the field flows through
the runtime + TRPC.
- Tracing hook now reads `context.options.tracing` for structured fields; it
still falls back to `metadata.trigger` for the cross-cutting trigger string
(ModelRuntime itself uses metadata.trigger for timing logs, so trigger
stays on metadata).
- Service `record()` accepts an explicit `inputHint`; otherwise falls back
to auto-extraction from the first user message. Always truncated.
- Free-form jsonb fields move to `tracing.metadata` (was unknown-key passthrough
on `metadata`).
- Call sites updated:
- FollowUpAction now passes `tracing: { scenario, promptVersion, schemaName,
topicId }` (previously `metadata`).
- InputCompletion now passes `tracing: { agentId, topicId, inputHint: input,
scenario, promptVersion, schemaName }` — `inputHint` is the user's actual
typed text, not the wrapper prompt's first user message.
- `aiChat.outputJSON` router forwards both metadata and tracing.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* Update inputCompletion.ts
* 🐛 fix(llm-generation-tracing): stop duplicating provider into the row's metadata jsonb
`provider` is already a first-class column on the `llm_generation_tracing`
row, so auto-stamping it into the `metadata` jsonb column on every call was
pure noise. The hook now writes the caller-supplied `tracing.metadata`
verbatim — empty/undefined when the caller had nothing to add.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> | 19 小时前 |
| ✨ feat: per-call llm_generation_tracing observability (#15124)
* ✨ feat(database): add llm_generation_tracing schema + tracing package (LOBE-9462)
Foundation layer for per-call observability of `generateObject` calls.
- New Drizzle table `llm_generation_tracing` with identity / context / model /
result / usage / storage / feedback / audit columns and full single-column
index coverage (Postgres bitmap-scan friendly). Migration 0103 is idempotent
(CREATE TABLE/INDEX IF NOT EXISTS) for safe re-runs.
- `LlmGenerationTracingModel` with `record` / `updateFeedback` / `findById` /
`listRecent`, all userId-scoped to prevent cross-user leaks.
- New package `@lobechat/llm-generation-tracing` mirroring agent-tracing's
shape: `ITracingStore` interface, `FileTracingStore` (local/dev, scenario
subfolders + latest.json symlink), `computePromptHash` (6-char sha256 of
systemPrompt + schema), and `TRACING_SCENARIO_REGISTRY` + `resolveScenario`
with explicit scenario override.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(model-runtime): wire llm_generation_tracing into ModelRuntime.generateObject (LOBE-9462)
Per-call interception layer — one hook covers all generateObject callers.
- New `onGenerateObjectComplete` hook on `ModelRuntimeHooks`: always fires
(success or failure) with latency, usage, output/error. Fixes the gap where
`onGenerateObjectFinal` only fires when the runtime invokes `onUsage`.
- `S3TracingStore` (zstd level 3, key
`llm-generation-tracing/{scenario}/{v}-{hash}/{date}/{id}.json.zst`) and
`LLMGenerationTracingService` that does DB insert → store.save → patch
storage_key. Store failures preserve the row with `metadata.store_error`.
- `createLLMGenerationTracingHook` + `mergeModelRuntimeHooks` wired into
`initModelRuntimeFromDB`; tracing runs alongside business (billing) hooks
via `next/server.after()` when available, microtask fallback otherwise.
Unknown metadata keys (e.g. `parent_memory_trace_key`) pass through.
- Memory extractor accepts `parentMemoryTraceKey` option for the job-level
backlink. Follow-up-action caller given an explicit `scenario: 'follow_up'`
metadata override — it was the only OSS caller missing trigger metadata.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✅ test(llm-generation-tracing): type vi.fn mocks so tsgo accepts mock.calls indexing
The hook + service tests destructured `mock.calls[0][0]` and accessed nested
fields, which tsgo flagged as TS2493 / TS18046 because `vi.fn()` defaults to a
zero-arg signature. Add explicit type parameters to the mocks so tsgo can
infer the call tuple, and cast `call.payload` at the access point.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(model-runtime): move mergeModelRuntimeHooks into the package
It's a generic utility for composing `ModelRuntimeHooks` instances — same
import surface as `ModelRuntime` and the hooks interface — so it belongs
alongside them rather than tucked under a server-side consumer.
- New `packages/model-runtime/src/core/mergeHooks.ts` exports
`mergeModelRuntimeHooks` and is re-exported from the package index.
- Move the unit tests to `packages/model-runtime/src/core/mergeHooks.test.ts`,
including a new case covering the "a throws → b is skipped" load-bearing
semantics.
- `src/server/services/llmGenerationTracing/hook.ts` drops the local copy and
the consumer (`src/server/modules/ModelRuntime/index.ts`) imports from
`@lobechat/model-runtime`.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(llm-generation-tracing): version lives with the prompt, not in a central table
`promptVersion` was baked into `TRACING_SCENARIO_REGISTRY`, far from any
prompt definition — editing a prompt + forgetting to bump the entry in a
completely different file was an obvious foot-gun.
- Registry is now `Record<string, string>` mapping trigger → scenario only;
it's the stable concern that rarely changes.
- `resolveScenario` always passes `promptVersion` through from the caller,
defaulting to `UNKNOWN_PROMPT_VERSION` ('v0') when absent.
- Each call site declares its own `*_PROMPT_VERSION` constant next to the
prompt it describes. `followUpAction` ships the first one:
`FOLLOW_UP_PROMPT_VERSION` in `prompts/index.ts`, threaded through
`metadata.promptVersion` at the `generateObject` call. Other callers can
add the same constant when they next touch their prompts.
The 6-char prompt hash on the row still catches forgotten bumps.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(input-completion): wire prompt-version metadata at the auto-complete call site
Aligns input auto-complete with the FOLLOW_UP_PROMPT_VERSION convention so
each prompt iteration is recordable as the chat-side tracing lands.
- `INPUT_COMPLETION_PROMPT_VERSION = 'v1.0'` declared next to
`chainInputCompletion` — bump together with the prompt body.
- `fetchPresetTaskResult` accepts optional `metadata` and forwards it to
`getChatCompletion`; the existing chat path already plumbs metadata to
`ModelRuntime.chat` options.
- `InputEditor` call site passes
`{ scenario: 'input_completion', promptVersion }`.
Note: `llm_generation_tracing` currently only fires from
`onGenerateObjectComplete`. Input completion is a `chat` call, so this
metadata is forward-looking until a chat-side tracing hook lands.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* 🐛 fix(llm-generation-tracing): collapse bucketDir path.join args to silence turbopack glob warning
Turbopack's static analyzer treats `path.join(root, dyn1, dyn2)` as a
multi-segment glob pattern and warned that it could match ~12k files in
the project. Compose the relative subdir as a single string first, so
`path.join` only sees one dynamic segment.
Behavior unchanged — the resulting path is identical.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(input-completion): route auto-complete through generateObject for tracing
Auto-complete is the first preset-task caller migrated to the structured-
output path so it lands in `llm_generation_tracing` via the existing
`onGenerateObjectComplete` hook. No new server hook, no global chat-side
tracing.
- `chainInputCompletion` now returns `{ messages, schema }` with a minimal
`{ completion: string }` schema and a stable `INPUT_COMPLETION_SCHEMA_NAME`
constant. JSON wrapping costs ~15-30 tokens against a 100-token completion
budget — negligible for the observability win.
- `StructureOutputSchema` / `StructureOutputParams` accept optional
`metadata`; `aiChatRouter.outputJSON` merges caller metadata over the
default trigger so `{ scenario, promptVersion, schemaName }` reach
`ModelRuntime.generateObject` options unchanged.
- `IStructureSchema.description` is now optional to match the zod schema —
previously the TS type was stricter than runtime validation accepted.
- `InputEditor` switches from `chatService.fetchPresetTaskResult` to
`aiChatService.generateJSON`, reading `response.completion`. Streaming
is dropped because auto-complete already buffers the full result before
inserting; no UX change.
- Reverts the unused `metadata` field that was added to
`fetchPresetTaskResult` in the previous commit — no current caller needs
it now that input completion uses the generateObject path.
Bumps `INPUT_COMPLETION_PROMPT_VERSION` to v2.0 because the system prompt
gained an "output the completion field" instruction.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(aiGeneration): extract the runtime-init + generateObject dance into a service
Every server-side caller that produces structured output was repeating the
same two-step ritual: `initModelRuntimeFromDB(...)` → `runtime.generateObject(payload, { metadata })`.
`AiGenerationService` collapses it into one call so future cross-cutting
concerns (default metadata, retry, observability hooks) have one place to
land.
- New `src/server/services/aiGeneration/index.ts` exposes
`generateObject<T>(input, options)` and is unit-tested for provider
resolution + payload/metadata pass-through.
- `aiChatRouter.outputJSON` and `FollowUpActionService.extract` migrated to
the service (other callers move organically when next touched).
- Drops the unused `keyVaultsPayload` field from `StructureOutputParams`
and the placeholder at the InputEditor call site — key vaults are
server-resolved from DB, the client never supplies them.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(tracing): centralize TRACING_SCENARIOS const + inject AiGenerationService via trpc ctx
- New `packages/const/src/llmGenerationTracing.ts` exports `TRACING_SCENARIOS`
+ `TracingScenario` type — the single directory where every known scenario
name lives. Adds `@lobechat/const` as a workspace dep on llm-generation-
tracing so `TRACING_SCENARIO_REGISTRY` can reference the same literals.
- Callers (FollowUpActionService, InputEditor) replace `'follow_up'` /
`'input_completion'` string literals with `TRACING_SCENARIOS.FollowUp` /
`.InputCompletion`, so a typo or a rename fails the type-check instead of
silently drifting on the row.
- `AiGenerationService` is now injected into the `aiChatProcedure` ctx
middleware alongside `aiChatService`; `outputJSON` consumes it via
`ctx.aiGenerationService` instead of new-ing it inside the handler.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(llm-generation-tracing): add lt/llm-tracing CLI + drop local-only storage_key
- Add `lt` / `llm-tracing` CLI under @lobechat/llm-generation-tracing with
`list` (recent records, --scenario filter, --json) and `inspect` (by
tracing_id prefix or latest, --full, --json).
- `FileTracingStore.save` now returns `{ key: null }` so dev DB rows leave
`storage_key` empty instead of recording a non-resolvable local path; S3
store remains the source of truth for the real key. Add helpers
`findByTracingId` / `getLatest` used by the CLI.
- Wire `agentId` and `topicId` into `input_completion` tracing metadata
from the chat input auto-complete call site.
- Default `FileTracingStore` whenever NODE_ENV=development (drop the
ENABLE_LLM_GENERATION_TRACING_LOCAL opt-in env var).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* 💄 style(llm-generation-tracing): prettier CLI output (tree + colors)
Mirror the @lobechat/agent-tracing viewer style:
- Inline ANSI color helpers (dim/bold/cyan/magenta/green/yellow/red).
- Compact single-line header with id, scenario, version, model, status,
time — replaces the multi-line bullet list.
- Tree structure with `├─`/`└─` connectors instead of `── section ──`
banners.
- input arrays render per-message (role + char count + preview) rather
than dumping raw JSON.
- Small single-key outputs (e.g. `{ completion: "怎么样" }`) collapse
to inline `key: "value"`.
- `lt list` switches to a colored, properly padded table.
Default view stays compact; --full expands system_prompt / input /
schema bodies.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(llm-generation-tracing): split `tracing` config out of `metadata`
`options.metadata` was overloaded — half tracing-specific structured fields
(scenario / promptVersion / schemaName / agentId / topicId / ...), half
free-form jsonb passthrough. Callers couldn't tell which was which, and the
inputHint was always auto-extracted (useless when the prompt wraps the user's
text in a template).
This commit introduces a dedicated `tracing` option:
- Add `TracingOptions` to @lobechat/llm-generation-tracing — the typed shape
callers import (agentId / topicId / inputHint / scenario / promptVersion /
schemaName / systemPrompt / parentTracingId / metadata).
- Add loose `tracing?: Record<string, unknown>` to GenerateObjectOptions and
StructureOutputParams / StructureOutputSchema so the field flows through
the runtime + TRPC.
- Tracing hook now reads `context.options.tracing` for structured fields; it
still falls back to `metadata.trigger` for the cross-cutting trigger string
(ModelRuntime itself uses metadata.trigger for timing logs, so trigger
stays on metadata).
- Service `record()` accepts an explicit `inputHint`; otherwise falls back
to auto-extraction from the first user message. Always truncated.
- Free-form jsonb fields move to `tracing.metadata` (was unknown-key passthrough
on `metadata`).
- Call sites updated:
- FollowUpAction now passes `tracing: { scenario, promptVersion, schemaName,
topicId }` (previously `metadata`).
- InputCompletion now passes `tracing: { agentId, topicId, inputHint: input,
scenario, promptVersion, schemaName }` — `inputHint` is the user's actual
typed text, not the wrapper prompt's first user message.
- `aiChat.outputJSON` router forwards both metadata and tracing.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* Update inputCompletion.ts
* 🐛 fix(llm-generation-tracing): stop duplicating provider into the row's metadata jsonb
`provider` is already a first-class column on the `llm_generation_tracing`
row, so auto-stamping it into the `metadata` jsonb column on every call was
pure noise. The hook now writes the caller-supplied `tracing.metadata`
verbatim — empty/undefined when the caller had nothing to add.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> | 19 小时前 |
| ✨ feat: per-call llm_generation_tracing observability (#15124)
* ✨ feat(database): add llm_generation_tracing schema + tracing package (LOBE-9462)
Foundation layer for per-call observability of `generateObject` calls.
- New Drizzle table `llm_generation_tracing` with identity / context / model /
result / usage / storage / feedback / audit columns and full single-column
index coverage (Postgres bitmap-scan friendly). Migration 0103 is idempotent
(CREATE TABLE/INDEX IF NOT EXISTS) for safe re-runs.
- `LlmGenerationTracingModel` with `record` / `updateFeedback` / `findById` /
`listRecent`, all userId-scoped to prevent cross-user leaks.
- New package `@lobechat/llm-generation-tracing` mirroring agent-tracing's
shape: `ITracingStore` interface, `FileTracingStore` (local/dev, scenario
subfolders + latest.json symlink), `computePromptHash` (6-char sha256 of
systemPrompt + schema), and `TRACING_SCENARIO_REGISTRY` + `resolveScenario`
with explicit scenario override.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(model-runtime): wire llm_generation_tracing into ModelRuntime.generateObject (LOBE-9462)
Per-call interception layer — one hook covers all generateObject callers.
- New `onGenerateObjectComplete` hook on `ModelRuntimeHooks`: always fires
(success or failure) with latency, usage, output/error. Fixes the gap where
`onGenerateObjectFinal` only fires when the runtime invokes `onUsage`.
- `S3TracingStore` (zstd level 3, key
`llm-generation-tracing/{scenario}/{v}-{hash}/{date}/{id}.json.zst`) and
`LLMGenerationTracingService` that does DB insert → store.save → patch
storage_key. Store failures preserve the row with `metadata.store_error`.
- `createLLMGenerationTracingHook` + `mergeModelRuntimeHooks` wired into
`initModelRuntimeFromDB`; tracing runs alongside business (billing) hooks
via `next/server.after()` when available, microtask fallback otherwise.
Unknown metadata keys (e.g. `parent_memory_trace_key`) pass through.
- Memory extractor accepts `parentMemoryTraceKey` option for the job-level
backlink. Follow-up-action caller given an explicit `scenario: 'follow_up'`
metadata override — it was the only OSS caller missing trigger metadata.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✅ test(llm-generation-tracing): type vi.fn mocks so tsgo accepts mock.calls indexing
The hook + service tests destructured `mock.calls[0][0]` and accessed nested
fields, which tsgo flagged as TS2493 / TS18046 because `vi.fn()` defaults to a
zero-arg signature. Add explicit type parameters to the mocks so tsgo can
infer the call tuple, and cast `call.payload` at the access point.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(model-runtime): move mergeModelRuntimeHooks into the package
It's a generic utility for composing `ModelRuntimeHooks` instances — same
import surface as `ModelRuntime` and the hooks interface — so it belongs
alongside them rather than tucked under a server-side consumer.
- New `packages/model-runtime/src/core/mergeHooks.ts` exports
`mergeModelRuntimeHooks` and is re-exported from the package index.
- Move the unit tests to `packages/model-runtime/src/core/mergeHooks.test.ts`,
including a new case covering the "a throws → b is skipped" load-bearing
semantics.
- `src/server/services/llmGenerationTracing/hook.ts` drops the local copy and
the consumer (`src/server/modules/ModelRuntime/index.ts`) imports from
`@lobechat/model-runtime`.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(llm-generation-tracing): version lives with the prompt, not in a central table
`promptVersion` was baked into `TRACING_SCENARIO_REGISTRY`, far from any
prompt definition — editing a prompt + forgetting to bump the entry in a
completely different file was an obvious foot-gun.
- Registry is now `Record<string, string>` mapping trigger → scenario only;
it's the stable concern that rarely changes.
- `resolveScenario` always passes `promptVersion` through from the caller,
defaulting to `UNKNOWN_PROMPT_VERSION` ('v0') when absent.
- Each call site declares its own `*_PROMPT_VERSION` constant next to the
prompt it describes. `followUpAction` ships the first one:
`FOLLOW_UP_PROMPT_VERSION` in `prompts/index.ts`, threaded through
`metadata.promptVersion` at the `generateObject` call. Other callers can
add the same constant when they next touch their prompts.
The 6-char prompt hash on the row still catches forgotten bumps.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(input-completion): wire prompt-version metadata at the auto-complete call site
Aligns input auto-complete with the FOLLOW_UP_PROMPT_VERSION convention so
each prompt iteration is recordable as the chat-side tracing lands.
- `INPUT_COMPLETION_PROMPT_VERSION = 'v1.0'` declared next to
`chainInputCompletion` — bump together with the prompt body.
- `fetchPresetTaskResult` accepts optional `metadata` and forwards it to
`getChatCompletion`; the existing chat path already plumbs metadata to
`ModelRuntime.chat` options.
- `InputEditor` call site passes
`{ scenario: 'input_completion', promptVersion }`.
Note: `llm_generation_tracing` currently only fires from
`onGenerateObjectComplete`. Input completion is a `chat` call, so this
metadata is forward-looking until a chat-side tracing hook lands.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* 🐛 fix(llm-generation-tracing): collapse bucketDir path.join args to silence turbopack glob warning
Turbopack's static analyzer treats `path.join(root, dyn1, dyn2)` as a
multi-segment glob pattern and warned that it could match ~12k files in
the project. Compose the relative subdir as a single string first, so
`path.join` only sees one dynamic segment.
Behavior unchanged — the resulting path is identical.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(input-completion): route auto-complete through generateObject for tracing
Auto-complete is the first preset-task caller migrated to the structured-
output path so it lands in `llm_generation_tracing` via the existing
`onGenerateObjectComplete` hook. No new server hook, no global chat-side
tracing.
- `chainInputCompletion` now returns `{ messages, schema }` with a minimal
`{ completion: string }` schema and a stable `INPUT_COMPLETION_SCHEMA_NAME`
constant. JSON wrapping costs ~15-30 tokens against a 100-token completion
budget — negligible for the observability win.
- `StructureOutputSchema` / `StructureOutputParams` accept optional
`metadata`; `aiChatRouter.outputJSON` merges caller metadata over the
default trigger so `{ scenario, promptVersion, schemaName }` reach
`ModelRuntime.generateObject` options unchanged.
- `IStructureSchema.description` is now optional to match the zod schema —
previously the TS type was stricter than runtime validation accepted.
- `InputEditor` switches from `chatService.fetchPresetTaskResult` to
`aiChatService.generateJSON`, reading `response.completion`. Streaming
is dropped because auto-complete already buffers the full result before
inserting; no UX change.
- Reverts the unused `metadata` field that was added to
`fetchPresetTaskResult` in the previous commit — no current caller needs
it now that input completion uses the generateObject path.
Bumps `INPUT_COMPLETION_PROMPT_VERSION` to v2.0 because the system prompt
gained an "output the completion field" instruction.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(aiGeneration): extract the runtime-init + generateObject dance into a service
Every server-side caller that produces structured output was repeating the
same two-step ritual: `initModelRuntimeFromDB(...)` → `runtime.generateObject(payload, { metadata })`.
`AiGenerationService` collapses it into one call so future cross-cutting
concerns (default metadata, retry, observability hooks) have one place to
land.
- New `src/server/services/aiGeneration/index.ts` exposes
`generateObject<T>(input, options)` and is unit-tested for provider
resolution + payload/metadata pass-through.
- `aiChatRouter.outputJSON` and `FollowUpActionService.extract` migrated to
the service (other callers move organically when next touched).
- Drops the unused `keyVaultsPayload` field from `StructureOutputParams`
and the placeholder at the InputEditor call site — key vaults are
server-resolved from DB, the client never supplies them.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(tracing): centralize TRACING_SCENARIOS const + inject AiGenerationService via trpc ctx
- New `packages/const/src/llmGenerationTracing.ts` exports `TRACING_SCENARIOS`
+ `TracingScenario` type — the single directory where every known scenario
name lives. Adds `@lobechat/const` as a workspace dep on llm-generation-
tracing so `TRACING_SCENARIO_REGISTRY` can reference the same literals.
- Callers (FollowUpActionService, InputEditor) replace `'follow_up'` /
`'input_completion'` string literals with `TRACING_SCENARIOS.FollowUp` /
`.InputCompletion`, so a typo or a rename fails the type-check instead of
silently drifting on the row.
- `AiGenerationService` is now injected into the `aiChatProcedure` ctx
middleware alongside `aiChatService`; `outputJSON` consumes it via
`ctx.aiGenerationService` instead of new-ing it inside the handler.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(llm-generation-tracing): add lt/llm-tracing CLI + drop local-only storage_key
- Add `lt` / `llm-tracing` CLI under @lobechat/llm-generation-tracing with
`list` (recent records, --scenario filter, --json) and `inspect` (by
tracing_id prefix or latest, --full, --json).
- `FileTracingStore.save` now returns `{ key: null }` so dev DB rows leave
`storage_key` empty instead of recording a non-resolvable local path; S3
store remains the source of truth for the real key. Add helpers
`findByTracingId` / `getLatest` used by the CLI.
- Wire `agentId` and `topicId` into `input_completion` tracing metadata
from the chat input auto-complete call site.
- Default `FileTracingStore` whenever NODE_ENV=development (drop the
ENABLE_LLM_GENERATION_TRACING_LOCAL opt-in env var).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* 💄 style(llm-generation-tracing): prettier CLI output (tree + colors)
Mirror the @lobechat/agent-tracing viewer style:
- Inline ANSI color helpers (dim/bold/cyan/magenta/green/yellow/red).
- Compact single-line header with id, scenario, version, model, status,
time — replaces the multi-line bullet list.
- Tree structure with `├─`/`└─` connectors instead of `── section ──`
banners.
- input arrays render per-message (role + char count + preview) rather
than dumping raw JSON.
- Small single-key outputs (e.g. `{ completion: "怎么样" }`) collapse
to inline `key: "value"`.
- `lt list` switches to a colored, properly padded table.
Default view stays compact; --full expands system_prompt / input /
schema bodies.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(llm-generation-tracing): split `tracing` config out of `metadata`
`options.metadata` was overloaded — half tracing-specific structured fields
(scenario / promptVersion / schemaName / agentId / topicId / ...), half
free-form jsonb passthrough. Callers couldn't tell which was which, and the
inputHint was always auto-extracted (useless when the prompt wraps the user's
text in a template).
This commit introduces a dedicated `tracing` option:
- Add `TracingOptions` to @lobechat/llm-generation-tracing — the typed shape
callers import (agentId / topicId / inputHint / scenario / promptVersion /
schemaName / systemPrompt / parentTracingId / metadata).
- Add loose `tracing?: Record<string, unknown>` to GenerateObjectOptions and
StructureOutputParams / StructureOutputSchema so the field flows through
the runtime + TRPC.
- Tracing hook now reads `context.options.tracing` for structured fields; it
still falls back to `metadata.trigger` for the cross-cutting trigger string
(ModelRuntime itself uses metadata.trigger for timing logs, so trigger
stays on metadata).
- Service `record()` accepts an explicit `inputHint`; otherwise falls back
to auto-extraction from the first user message. Always truncated.
- Free-form jsonb fields move to `tracing.metadata` (was unknown-key passthrough
on `metadata`).
- Call sites updated:
- FollowUpAction now passes `tracing: { scenario, promptVersion, schemaName,
topicId }` (previously `metadata`).
- InputCompletion now passes `tracing: { agentId, topicId, inputHint: input,
scenario, promptVersion, schemaName }` — `inputHint` is the user's actual
typed text, not the wrapper prompt's first user message.
- `aiChat.outputJSON` router forwards both metadata and tracing.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* Update inputCompletion.ts
* 🐛 fix(llm-generation-tracing): stop duplicating provider into the row's metadata jsonb
`provider` is already a first-class column on the `llm_generation_tracing`
row, so auto-stamping it into the `metadata` jsonb column on every call was
pure noise. The hook now writes the caller-supplied `tracing.metadata`
verbatim — empty/undefined when the caller had nothing to add.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> | 19 小时前 |
| ✨ feat: per-call llm_generation_tracing observability (#15124)
* ✨ feat(database): add llm_generation_tracing schema + tracing package (LOBE-9462)
Foundation layer for per-call observability of `generateObject` calls.
- New Drizzle table `llm_generation_tracing` with identity / context / model /
result / usage / storage / feedback / audit columns and full single-column
index coverage (Postgres bitmap-scan friendly). Migration 0103 is idempotent
(CREATE TABLE/INDEX IF NOT EXISTS) for safe re-runs.
- `LlmGenerationTracingModel` with `record` / `updateFeedback` / `findById` /
`listRecent`, all userId-scoped to prevent cross-user leaks.
- New package `@lobechat/llm-generation-tracing` mirroring agent-tracing's
shape: `ITracingStore` interface, `FileTracingStore` (local/dev, scenario
subfolders + latest.json symlink), `computePromptHash` (6-char sha256 of
systemPrompt + schema), and `TRACING_SCENARIO_REGISTRY` + `resolveScenario`
with explicit scenario override.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(model-runtime): wire llm_generation_tracing into ModelRuntime.generateObject (LOBE-9462)
Per-call interception layer — one hook covers all generateObject callers.
- New `onGenerateObjectComplete` hook on `ModelRuntimeHooks`: always fires
(success or failure) with latency, usage, output/error. Fixes the gap where
`onGenerateObjectFinal` only fires when the runtime invokes `onUsage`.
- `S3TracingStore` (zstd level 3, key
`llm-generation-tracing/{scenario}/{v}-{hash}/{date}/{id}.json.zst`) and
`LLMGenerationTracingService` that does DB insert → store.save → patch
storage_key. Store failures preserve the row with `metadata.store_error`.
- `createLLMGenerationTracingHook` + `mergeModelRuntimeHooks` wired into
`initModelRuntimeFromDB`; tracing runs alongside business (billing) hooks
via `next/server.after()` when available, microtask fallback otherwise.
Unknown metadata keys (e.g. `parent_memory_trace_key`) pass through.
- Memory extractor accepts `parentMemoryTraceKey` option for the job-level
backlink. Follow-up-action caller given an explicit `scenario: 'follow_up'`
metadata override — it was the only OSS caller missing trigger metadata.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✅ test(llm-generation-tracing): type vi.fn mocks so tsgo accepts mock.calls indexing
The hook + service tests destructured `mock.calls[0][0]` and accessed nested
fields, which tsgo flagged as TS2493 / TS18046 because `vi.fn()` defaults to a
zero-arg signature. Add explicit type parameters to the mocks so tsgo can
infer the call tuple, and cast `call.payload` at the access point.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(model-runtime): move mergeModelRuntimeHooks into the package
It's a generic utility for composing `ModelRuntimeHooks` instances — same
import surface as `ModelRuntime` and the hooks interface — so it belongs
alongside them rather than tucked under a server-side consumer.
- New `packages/model-runtime/src/core/mergeHooks.ts` exports
`mergeModelRuntimeHooks` and is re-exported from the package index.
- Move the unit tests to `packages/model-runtime/src/core/mergeHooks.test.ts`,
including a new case covering the "a throws → b is skipped" load-bearing
semantics.
- `src/server/services/llmGenerationTracing/hook.ts` drops the local copy and
the consumer (`src/server/modules/ModelRuntime/index.ts`) imports from
`@lobechat/model-runtime`.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(llm-generation-tracing): version lives with the prompt, not in a central table
`promptVersion` was baked into `TRACING_SCENARIO_REGISTRY`, far from any
prompt definition — editing a prompt + forgetting to bump the entry in a
completely different file was an obvious foot-gun.
- Registry is now `Record<string, string>` mapping trigger → scenario only;
it's the stable concern that rarely changes.
- `resolveScenario` always passes `promptVersion` through from the caller,
defaulting to `UNKNOWN_PROMPT_VERSION` ('v0') when absent.
- Each call site declares its own `*_PROMPT_VERSION` constant next to the
prompt it describes. `followUpAction` ships the first one:
`FOLLOW_UP_PROMPT_VERSION` in `prompts/index.ts`, threaded through
`metadata.promptVersion` at the `generateObject` call. Other callers can
add the same constant when they next touch their prompts.
The 6-char prompt hash on the row still catches forgotten bumps.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(input-completion): wire prompt-version metadata at the auto-complete call site
Aligns input auto-complete with the FOLLOW_UP_PROMPT_VERSION convention so
each prompt iteration is recordable as the chat-side tracing lands.
- `INPUT_COMPLETION_PROMPT_VERSION = 'v1.0'` declared next to
`chainInputCompletion` — bump together with the prompt body.
- `fetchPresetTaskResult` accepts optional `metadata` and forwards it to
`getChatCompletion`; the existing chat path already plumbs metadata to
`ModelRuntime.chat` options.
- `InputEditor` call site passes
`{ scenario: 'input_completion', promptVersion }`.
Note: `llm_generation_tracing` currently only fires from
`onGenerateObjectComplete`. Input completion is a `chat` call, so this
metadata is forward-looking until a chat-side tracing hook lands.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* 🐛 fix(llm-generation-tracing): collapse bucketDir path.join args to silence turbopack glob warning
Turbopack's static analyzer treats `path.join(root, dyn1, dyn2)` as a
multi-segment glob pattern and warned that it could match ~12k files in
the project. Compose the relative subdir as a single string first, so
`path.join` only sees one dynamic segment.
Behavior unchanged — the resulting path is identical.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(input-completion): route auto-complete through generateObject for tracing
Auto-complete is the first preset-task caller migrated to the structured-
output path so it lands in `llm_generation_tracing` via the existing
`onGenerateObjectComplete` hook. No new server hook, no global chat-side
tracing.
- `chainInputCompletion` now returns `{ messages, schema }` with a minimal
`{ completion: string }` schema and a stable `INPUT_COMPLETION_SCHEMA_NAME`
constant. JSON wrapping costs ~15-30 tokens against a 100-token completion
budget — negligible for the observability win.
- `StructureOutputSchema` / `StructureOutputParams` accept optional
`metadata`; `aiChatRouter.outputJSON` merges caller metadata over the
default trigger so `{ scenario, promptVersion, schemaName }` reach
`ModelRuntime.generateObject` options unchanged.
- `IStructureSchema.description` is now optional to match the zod schema —
previously the TS type was stricter than runtime validation accepted.
- `InputEditor` switches from `chatService.fetchPresetTaskResult` to
`aiChatService.generateJSON`, reading `response.completion`. Streaming
is dropped because auto-complete already buffers the full result before
inserting; no UX change.
- Reverts the unused `metadata` field that was added to
`fetchPresetTaskResult` in the previous commit — no current caller needs
it now that input completion uses the generateObject path.
Bumps `INPUT_COMPLETION_PROMPT_VERSION` to v2.0 because the system prompt
gained an "output the completion field" instruction.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(aiGeneration): extract the runtime-init + generateObject dance into a service
Every server-side caller that produces structured output was repeating the
same two-step ritual: `initModelRuntimeFromDB(...)` → `runtime.generateObject(payload, { metadata })`.
`AiGenerationService` collapses it into one call so future cross-cutting
concerns (default metadata, retry, observability hooks) have one place to
land.
- New `src/server/services/aiGeneration/index.ts` exposes
`generateObject<T>(input, options)` and is unit-tested for provider
resolution + payload/metadata pass-through.
- `aiChatRouter.outputJSON` and `FollowUpActionService.extract` migrated to
the service (other callers move organically when next touched).
- Drops the unused `keyVaultsPayload` field from `StructureOutputParams`
and the placeholder at the InputEditor call site — key vaults are
server-resolved from DB, the client never supplies them.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(tracing): centralize TRACING_SCENARIOS const + inject AiGenerationService via trpc ctx
- New `packages/const/src/llmGenerationTracing.ts` exports `TRACING_SCENARIOS`
+ `TracingScenario` type — the single directory where every known scenario
name lives. Adds `@lobechat/const` as a workspace dep on llm-generation-
tracing so `TRACING_SCENARIO_REGISTRY` can reference the same literals.
- Callers (FollowUpActionService, InputEditor) replace `'follow_up'` /
`'input_completion'` string literals with `TRACING_SCENARIOS.FollowUp` /
`.InputCompletion`, so a typo or a rename fails the type-check instead of
silently drifting on the row.
- `AiGenerationService` is now injected into the `aiChatProcedure` ctx
middleware alongside `aiChatService`; `outputJSON` consumes it via
`ctx.aiGenerationService` instead of new-ing it inside the handler.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(llm-generation-tracing): add lt/llm-tracing CLI + drop local-only storage_key
- Add `lt` / `llm-tracing` CLI under @lobechat/llm-generation-tracing with
`list` (recent records, --scenario filter, --json) and `inspect` (by
tracing_id prefix or latest, --full, --json).
- `FileTracingStore.save` now returns `{ key: null }` so dev DB rows leave
`storage_key` empty instead of recording a non-resolvable local path; S3
store remains the source of truth for the real key. Add helpers
`findByTracingId` / `getLatest` used by the CLI.
- Wire `agentId` and `topicId` into `input_completion` tracing metadata
from the chat input auto-complete call site.
- Default `FileTracingStore` whenever NODE_ENV=development (drop the
ENABLE_LLM_GENERATION_TRACING_LOCAL opt-in env var).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* 💄 style(llm-generation-tracing): prettier CLI output (tree + colors)
Mirror the @lobechat/agent-tracing viewer style:
- Inline ANSI color helpers (dim/bold/cyan/magenta/green/yellow/red).
- Compact single-line header with id, scenario, version, model, status,
time — replaces the multi-line bullet list.
- Tree structure with `├─`/`└─` connectors instead of `── section ──`
banners.
- input arrays render per-message (role + char count + preview) rather
than dumping raw JSON.
- Small single-key outputs (e.g. `{ completion: "怎么样" }`) collapse
to inline `key: "value"`.
- `lt list` switches to a colored, properly padded table.
Default view stays compact; --full expands system_prompt / input /
schema bodies.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(llm-generation-tracing): split `tracing` config out of `metadata`
`options.metadata` was overloaded — half tracing-specific structured fields
(scenario / promptVersion / schemaName / agentId / topicId / ...), half
free-form jsonb passthrough. Callers couldn't tell which was which, and the
inputHint was always auto-extracted (useless when the prompt wraps the user's
text in a template).
This commit introduces a dedicated `tracing` option:
- Add `TracingOptions` to @lobechat/llm-generation-tracing — the typed shape
callers import (agentId / topicId / inputHint / scenario / promptVersion /
schemaName / systemPrompt / parentTracingId / metadata).
- Add loose `tracing?: Record<string, unknown>` to GenerateObjectOptions and
StructureOutputParams / StructureOutputSchema so the field flows through
the runtime + TRPC.
- Tracing hook now reads `context.options.tracing` for structured fields; it
still falls back to `metadata.trigger` for the cross-cutting trigger string
(ModelRuntime itself uses metadata.trigger for timing logs, so trigger
stays on metadata).
- Service `record()` accepts an explicit `inputHint`; otherwise falls back
to auto-extraction from the first user message. Always truncated.
- Free-form jsonb fields move to `tracing.metadata` (was unknown-key passthrough
on `metadata`).
- Call sites updated:
- FollowUpAction now passes `tracing: { scenario, promptVersion, schemaName,
topicId }` (previously `metadata`).
- InputCompletion now passes `tracing: { agentId, topicId, inputHint: input,
scenario, promptVersion, schemaName }` — `inputHint` is the user's actual
typed text, not the wrapper prompt's first user message.
- `aiChat.outputJSON` router forwards both metadata and tracing.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* Update inputCompletion.ts
* 🐛 fix(llm-generation-tracing): stop duplicating provider into the row's metadata jsonb
`provider` is already a first-class column on the `llm_generation_tracing`
row, so auto-stamping it into the `metadata` jsonb column on every call was
pure noise. The hook now writes the caller-supplied `tracing.metadata`
verbatim — empty/undefined when the caller had nothing to add.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> | 19 小时前 |
| ✨ feat: per-call llm_generation_tracing observability (#15124)
* ✨ feat(database): add llm_generation_tracing schema + tracing package (LOBE-9462)
Foundation layer for per-call observability of `generateObject` calls.
- New Drizzle table `llm_generation_tracing` with identity / context / model /
result / usage / storage / feedback / audit columns and full single-column
index coverage (Postgres bitmap-scan friendly). Migration 0103 is idempotent
(CREATE TABLE/INDEX IF NOT EXISTS) for safe re-runs.
- `LlmGenerationTracingModel` with `record` / `updateFeedback` / `findById` /
`listRecent`, all userId-scoped to prevent cross-user leaks.
- New package `@lobechat/llm-generation-tracing` mirroring agent-tracing's
shape: `ITracingStore` interface, `FileTracingStore` (local/dev, scenario
subfolders + latest.json symlink), `computePromptHash` (6-char sha256 of
systemPrompt + schema), and `TRACING_SCENARIO_REGISTRY` + `resolveScenario`
with explicit scenario override.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(model-runtime): wire llm_generation_tracing into ModelRuntime.generateObject (LOBE-9462)
Per-call interception layer — one hook covers all generateObject callers.
- New `onGenerateObjectComplete` hook on `ModelRuntimeHooks`: always fires
(success or failure) with latency, usage, output/error. Fixes the gap where
`onGenerateObjectFinal` only fires when the runtime invokes `onUsage`.
- `S3TracingStore` (zstd level 3, key
`llm-generation-tracing/{scenario}/{v}-{hash}/{date}/{id}.json.zst`) and
`LLMGenerationTracingService` that does DB insert → store.save → patch
storage_key. Store failures preserve the row with `metadata.store_error`.
- `createLLMGenerationTracingHook` + `mergeModelRuntimeHooks` wired into
`initModelRuntimeFromDB`; tracing runs alongside business (billing) hooks
via `next/server.after()` when available, microtask fallback otherwise.
Unknown metadata keys (e.g. `parent_memory_trace_key`) pass through.
- Memory extractor accepts `parentMemoryTraceKey` option for the job-level
backlink. Follow-up-action caller given an explicit `scenario: 'follow_up'`
metadata override — it was the only OSS caller missing trigger metadata.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✅ test(llm-generation-tracing): type vi.fn mocks so tsgo accepts mock.calls indexing
The hook + service tests destructured `mock.calls[0][0]` and accessed nested
fields, which tsgo flagged as TS2493 / TS18046 because `vi.fn()` defaults to a
zero-arg signature. Add explicit type parameters to the mocks so tsgo can
infer the call tuple, and cast `call.payload` at the access point.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(model-runtime): move mergeModelRuntimeHooks into the package
It's a generic utility for composing `ModelRuntimeHooks` instances — same
import surface as `ModelRuntime` and the hooks interface — so it belongs
alongside them rather than tucked under a server-side consumer.
- New `packages/model-runtime/src/core/mergeHooks.ts` exports
`mergeModelRuntimeHooks` and is re-exported from the package index.
- Move the unit tests to `packages/model-runtime/src/core/mergeHooks.test.ts`,
including a new case covering the "a throws → b is skipped" load-bearing
semantics.
- `src/server/services/llmGenerationTracing/hook.ts` drops the local copy and
the consumer (`src/server/modules/ModelRuntime/index.ts`) imports from
`@lobechat/model-runtime`.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(llm-generation-tracing): version lives with the prompt, not in a central table
`promptVersion` was baked into `TRACING_SCENARIO_REGISTRY`, far from any
prompt definition — editing a prompt + forgetting to bump the entry in a
completely different file was an obvious foot-gun.
- Registry is now `Record<string, string>` mapping trigger → scenario only;
it's the stable concern that rarely changes.
- `resolveScenario` always passes `promptVersion` through from the caller,
defaulting to `UNKNOWN_PROMPT_VERSION` ('v0') when absent.
- Each call site declares its own `*_PROMPT_VERSION` constant next to the
prompt it describes. `followUpAction` ships the first one:
`FOLLOW_UP_PROMPT_VERSION` in `prompts/index.ts`, threaded through
`metadata.promptVersion` at the `generateObject` call. Other callers can
add the same constant when they next touch their prompts.
The 6-char prompt hash on the row still catches forgotten bumps.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(input-completion): wire prompt-version metadata at the auto-complete call site
Aligns input auto-complete with the FOLLOW_UP_PROMPT_VERSION convention so
each prompt iteration is recordable as the chat-side tracing lands.
- `INPUT_COMPLETION_PROMPT_VERSION = 'v1.0'` declared next to
`chainInputCompletion` — bump together with the prompt body.
- `fetchPresetTaskResult` accepts optional `metadata` and forwards it to
`getChatCompletion`; the existing chat path already plumbs metadata to
`ModelRuntime.chat` options.
- `InputEditor` call site passes
`{ scenario: 'input_completion', promptVersion }`.
Note: `llm_generation_tracing` currently only fires from
`onGenerateObjectComplete`. Input completion is a `chat` call, so this
metadata is forward-looking until a chat-side tracing hook lands.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* 🐛 fix(llm-generation-tracing): collapse bucketDir path.join args to silence turbopack glob warning
Turbopack's static analyzer treats `path.join(root, dyn1, dyn2)` as a
multi-segment glob pattern and warned that it could match ~12k files in
the project. Compose the relative subdir as a single string first, so
`path.join` only sees one dynamic segment.
Behavior unchanged — the resulting path is identical.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(input-completion): route auto-complete through generateObject for tracing
Auto-complete is the first preset-task caller migrated to the structured-
output path so it lands in `llm_generation_tracing` via the existing
`onGenerateObjectComplete` hook. No new server hook, no global chat-side
tracing.
- `chainInputCompletion` now returns `{ messages, schema }` with a minimal
`{ completion: string }` schema and a stable `INPUT_COMPLETION_SCHEMA_NAME`
constant. JSON wrapping costs ~15-30 tokens against a 100-token completion
budget — negligible for the observability win.
- `StructureOutputSchema` / `StructureOutputParams` accept optional
`metadata`; `aiChatRouter.outputJSON` merges caller metadata over the
default trigger so `{ scenario, promptVersion, schemaName }` reach
`ModelRuntime.generateObject` options unchanged.
- `IStructureSchema.description` is now optional to match the zod schema —
previously the TS type was stricter than runtime validation accepted.
- `InputEditor` switches from `chatService.fetchPresetTaskResult` to
`aiChatService.generateJSON`, reading `response.completion`. Streaming
is dropped because auto-complete already buffers the full result before
inserting; no UX change.
- Reverts the unused `metadata` field that was added to
`fetchPresetTaskResult` in the previous commit — no current caller needs
it now that input completion uses the generateObject path.
Bumps `INPUT_COMPLETION_PROMPT_VERSION` to v2.0 because the system prompt
gained an "output the completion field" instruction.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(aiGeneration): extract the runtime-init + generateObject dance into a service
Every server-side caller that produces structured output was repeating the
same two-step ritual: `initModelRuntimeFromDB(...)` → `runtime.generateObject(payload, { metadata })`.
`AiGenerationService` collapses it into one call so future cross-cutting
concerns (default metadata, retry, observability hooks) have one place to
land.
- New `src/server/services/aiGeneration/index.ts` exposes
`generateObject<T>(input, options)` and is unit-tested for provider
resolution + payload/metadata pass-through.
- `aiChatRouter.outputJSON` and `FollowUpActionService.extract` migrated to
the service (other callers move organically when next touched).
- Drops the unused `keyVaultsPayload` field from `StructureOutputParams`
and the placeholder at the InputEditor call site — key vaults are
server-resolved from DB, the client never supplies them.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(tracing): centralize TRACING_SCENARIOS const + inject AiGenerationService via trpc ctx
- New `packages/const/src/llmGenerationTracing.ts` exports `TRACING_SCENARIOS`
+ `TracingScenario` type — the single directory where every known scenario
name lives. Adds `@lobechat/const` as a workspace dep on llm-generation-
tracing so `TRACING_SCENARIO_REGISTRY` can reference the same literals.
- Callers (FollowUpActionService, InputEditor) replace `'follow_up'` /
`'input_completion'` string literals with `TRACING_SCENARIOS.FollowUp` /
`.InputCompletion`, so a typo or a rename fails the type-check instead of
silently drifting on the row.
- `AiGenerationService` is now injected into the `aiChatProcedure` ctx
middleware alongside `aiChatService`; `outputJSON` consumes it via
`ctx.aiGenerationService` instead of new-ing it inside the handler.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(llm-generation-tracing): add lt/llm-tracing CLI + drop local-only storage_key
- Add `lt` / `llm-tracing` CLI under @lobechat/llm-generation-tracing with
`list` (recent records, --scenario filter, --json) and `inspect` (by
tracing_id prefix or latest, --full, --json).
- `FileTracingStore.save` now returns `{ key: null }` so dev DB rows leave
`storage_key` empty instead of recording a non-resolvable local path; S3
store remains the source of truth for the real key. Add helpers
`findByTracingId` / `getLatest` used by the CLI.
- Wire `agentId` and `topicId` into `input_completion` tracing metadata
from the chat input auto-complete call site.
- Default `FileTracingStore` whenever NODE_ENV=development (drop the
ENABLE_LLM_GENERATION_TRACING_LOCAL opt-in env var).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* 💄 style(llm-generation-tracing): prettier CLI output (tree + colors)
Mirror the @lobechat/agent-tracing viewer style:
- Inline ANSI color helpers (dim/bold/cyan/magenta/green/yellow/red).
- Compact single-line header with id, scenario, version, model, status,
time — replaces the multi-line bullet list.
- Tree structure with `├─`/`└─` connectors instead of `── section ──`
banners.
- input arrays render per-message (role + char count + preview) rather
than dumping raw JSON.
- Small single-key outputs (e.g. `{ completion: "怎么样" }`) collapse
to inline `key: "value"`.
- `lt list` switches to a colored, properly padded table.
Default view stays compact; --full expands system_prompt / input /
schema bodies.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(llm-generation-tracing): split `tracing` config out of `metadata`
`options.metadata` was overloaded — half tracing-specific structured fields
(scenario / promptVersion / schemaName / agentId / topicId / ...), half
free-form jsonb passthrough. Callers couldn't tell which was which, and the
inputHint was always auto-extracted (useless when the prompt wraps the user's
text in a template).
This commit introduces a dedicated `tracing` option:
- Add `TracingOptions` to @lobechat/llm-generation-tracing — the typed shape
callers import (agentId / topicId / inputHint / scenario / promptVersion /
schemaName / systemPrompt / parentTracingId / metadata).
- Add loose `tracing?: Record<string, unknown>` to GenerateObjectOptions and
StructureOutputParams / StructureOutputSchema so the field flows through
the runtime + TRPC.
- Tracing hook now reads `context.options.tracing` for structured fields; it
still falls back to `metadata.trigger` for the cross-cutting trigger string
(ModelRuntime itself uses metadata.trigger for timing logs, so trigger
stays on metadata).
- Service `record()` accepts an explicit `inputHint`; otherwise falls back
to auto-extraction from the first user message. Always truncated.
- Free-form jsonb fields move to `tracing.metadata` (was unknown-key passthrough
on `metadata`).
- Call sites updated:
- FollowUpAction now passes `tracing: { scenario, promptVersion, schemaName,
topicId }` (previously `metadata`).
- InputCompletion now passes `tracing: { agentId, topicId, inputHint: input,
scenario, promptVersion, schemaName }` — `inputHint` is the user's actual
typed text, not the wrapper prompt's first user message.
- `aiChat.outputJSON` router forwards both metadata and tracing.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* Update inputCompletion.ts
* 🐛 fix(llm-generation-tracing): stop duplicating provider into the row's metadata jsonb
`provider` is already a first-class column on the `llm_generation_tracing`
row, so auto-stamping it into the `metadata` jsonb column on every call was
pure noise. The hook now writes the caller-supplied `tracing.metadata`
verbatim — empty/undefined when the caller had nothing to add.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> | 19 小时前 |
| ✨ feat: per-call llm_generation_tracing observability (#15124)
* ✨ feat(database): add llm_generation_tracing schema + tracing package (LOBE-9462)
Foundation layer for per-call observability of `generateObject` calls.
- New Drizzle table `llm_generation_tracing` with identity / context / model /
result / usage / storage / feedback / audit columns and full single-column
index coverage (Postgres bitmap-scan friendly). Migration 0103 is idempotent
(CREATE TABLE/INDEX IF NOT EXISTS) for safe re-runs.
- `LlmGenerationTracingModel` with `record` / `updateFeedback` / `findById` /
`listRecent`, all userId-scoped to prevent cross-user leaks.
- New package `@lobechat/llm-generation-tracing` mirroring agent-tracing's
shape: `ITracingStore` interface, `FileTracingStore` (local/dev, scenario
subfolders + latest.json symlink), `computePromptHash` (6-char sha256 of
systemPrompt + schema), and `TRACING_SCENARIO_REGISTRY` + `resolveScenario`
with explicit scenario override.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(model-runtime): wire llm_generation_tracing into ModelRuntime.generateObject (LOBE-9462)
Per-call interception layer — one hook covers all generateObject callers.
- New `onGenerateObjectComplete` hook on `ModelRuntimeHooks`: always fires
(success or failure) with latency, usage, output/error. Fixes the gap where
`onGenerateObjectFinal` only fires when the runtime invokes `onUsage`.
- `S3TracingStore` (zstd level 3, key
`llm-generation-tracing/{scenario}/{v}-{hash}/{date}/{id}.json.zst`) and
`LLMGenerationTracingService` that does DB insert → store.save → patch
storage_key. Store failures preserve the row with `metadata.store_error`.
- `createLLMGenerationTracingHook` + `mergeModelRuntimeHooks` wired into
`initModelRuntimeFromDB`; tracing runs alongside business (billing) hooks
via `next/server.after()` when available, microtask fallback otherwise.
Unknown metadata keys (e.g. `parent_memory_trace_key`) pass through.
- Memory extractor accepts `parentMemoryTraceKey` option for the job-level
backlink. Follow-up-action caller given an explicit `scenario: 'follow_up'`
metadata override — it was the only OSS caller missing trigger metadata.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✅ test(llm-generation-tracing): type vi.fn mocks so tsgo accepts mock.calls indexing
The hook + service tests destructured `mock.calls[0][0]` and accessed nested
fields, which tsgo flagged as TS2493 / TS18046 because `vi.fn()` defaults to a
zero-arg signature. Add explicit type parameters to the mocks so tsgo can
infer the call tuple, and cast `call.payload` at the access point.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(model-runtime): move mergeModelRuntimeHooks into the package
It's a generic utility for composing `ModelRuntimeHooks` instances — same
import surface as `ModelRuntime` and the hooks interface — so it belongs
alongside them rather than tucked under a server-side consumer.
- New `packages/model-runtime/src/core/mergeHooks.ts` exports
`mergeModelRuntimeHooks` and is re-exported from the package index.
- Move the unit tests to `packages/model-runtime/src/core/mergeHooks.test.ts`,
including a new case covering the "a throws → b is skipped" load-bearing
semantics.
- `src/server/services/llmGenerationTracing/hook.ts` drops the local copy and
the consumer (`src/server/modules/ModelRuntime/index.ts`) imports from
`@lobechat/model-runtime`.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(llm-generation-tracing): version lives with the prompt, not in a central table
`promptVersion` was baked into `TRACING_SCENARIO_REGISTRY`, far from any
prompt definition — editing a prompt + forgetting to bump the entry in a
completely different file was an obvious foot-gun.
- Registry is now `Record<string, string>` mapping trigger → scenario only;
it's the stable concern that rarely changes.
- `resolveScenario` always passes `promptVersion` through from the caller,
defaulting to `UNKNOWN_PROMPT_VERSION` ('v0') when absent.
- Each call site declares its own `*_PROMPT_VERSION` constant next to the
prompt it describes. `followUpAction` ships the first one:
`FOLLOW_UP_PROMPT_VERSION` in `prompts/index.ts`, threaded through
`metadata.promptVersion` at the `generateObject` call. Other callers can
add the same constant when they next touch their prompts.
The 6-char prompt hash on the row still catches forgotten bumps.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(input-completion): wire prompt-version metadata at the auto-complete call site
Aligns input auto-complete with the FOLLOW_UP_PROMPT_VERSION convention so
each prompt iteration is recordable as the chat-side tracing lands.
- `INPUT_COMPLETION_PROMPT_VERSION = 'v1.0'` declared next to
`chainInputCompletion` — bump together with the prompt body.
- `fetchPresetTaskResult` accepts optional `metadata` and forwards it to
`getChatCompletion`; the existing chat path already plumbs metadata to
`ModelRuntime.chat` options.
- `InputEditor` call site passes
`{ scenario: 'input_completion', promptVersion }`.
Note: `llm_generation_tracing` currently only fires from
`onGenerateObjectComplete`. Input completion is a `chat` call, so this
metadata is forward-looking until a chat-side tracing hook lands.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* 🐛 fix(llm-generation-tracing): collapse bucketDir path.join args to silence turbopack glob warning
Turbopack's static analyzer treats `path.join(root, dyn1, dyn2)` as a
multi-segment glob pattern and warned that it could match ~12k files in
the project. Compose the relative subdir as a single string first, so
`path.join` only sees one dynamic segment.
Behavior unchanged — the resulting path is identical.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(input-completion): route auto-complete through generateObject for tracing
Auto-complete is the first preset-task caller migrated to the structured-
output path so it lands in `llm_generation_tracing` via the existing
`onGenerateObjectComplete` hook. No new server hook, no global chat-side
tracing.
- `chainInputCompletion` now returns `{ messages, schema }` with a minimal
`{ completion: string }` schema and a stable `INPUT_COMPLETION_SCHEMA_NAME`
constant. JSON wrapping costs ~15-30 tokens against a 100-token completion
budget — negligible for the observability win.
- `StructureOutputSchema` / `StructureOutputParams` accept optional
`metadata`; `aiChatRouter.outputJSON` merges caller metadata over the
default trigger so `{ scenario, promptVersion, schemaName }` reach
`ModelRuntime.generateObject` options unchanged.
- `IStructureSchema.description` is now optional to match the zod schema —
previously the TS type was stricter than runtime validation accepted.
- `InputEditor` switches from `chatService.fetchPresetTaskResult` to
`aiChatService.generateJSON`, reading `response.completion`. Streaming
is dropped because auto-complete already buffers the full result before
inserting; no UX change.
- Reverts the unused `metadata` field that was added to
`fetchPresetTaskResult` in the previous commit — no current caller needs
it now that input completion uses the generateObject path.
Bumps `INPUT_COMPLETION_PROMPT_VERSION` to v2.0 because the system prompt
gained an "output the completion field" instruction.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(aiGeneration): extract the runtime-init + generateObject dance into a service
Every server-side caller that produces structured output was repeating the
same two-step ritual: `initModelRuntimeFromDB(...)` → `runtime.generateObject(payload, { metadata })`.
`AiGenerationService` collapses it into one call so future cross-cutting
concerns (default metadata, retry, observability hooks) have one place to
land.
- New `src/server/services/aiGeneration/index.ts` exposes
`generateObject<T>(input, options)` and is unit-tested for provider
resolution + payload/metadata pass-through.
- `aiChatRouter.outputJSON` and `FollowUpActionService.extract` migrated to
the service (other callers move organically when next touched).
- Drops the unused `keyVaultsPayload` field from `StructureOutputParams`
and the placeholder at the InputEditor call site — key vaults are
server-resolved from DB, the client never supplies them.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(tracing): centralize TRACING_SCENARIOS const + inject AiGenerationService via trpc ctx
- New `packages/const/src/llmGenerationTracing.ts` exports `TRACING_SCENARIOS`
+ `TracingScenario` type — the single directory where every known scenario
name lives. Adds `@lobechat/const` as a workspace dep on llm-generation-
tracing so `TRACING_SCENARIO_REGISTRY` can reference the same literals.
- Callers (FollowUpActionService, InputEditor) replace `'follow_up'` /
`'input_completion'` string literals with `TRACING_SCENARIOS.FollowUp` /
`.InputCompletion`, so a typo or a rename fails the type-check instead of
silently drifting on the row.
- `AiGenerationService` is now injected into the `aiChatProcedure` ctx
middleware alongside `aiChatService`; `outputJSON` consumes it via
`ctx.aiGenerationService` instead of new-ing it inside the handler.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(llm-generation-tracing): add lt/llm-tracing CLI + drop local-only storage_key
- Add `lt` / `llm-tracing` CLI under @lobechat/llm-generation-tracing with
`list` (recent records, --scenario filter, --json) and `inspect` (by
tracing_id prefix or latest, --full, --json).
- `FileTracingStore.save` now returns `{ key: null }` so dev DB rows leave
`storage_key` empty instead of recording a non-resolvable local path; S3
store remains the source of truth for the real key. Add helpers
`findByTracingId` / `getLatest` used by the CLI.
- Wire `agentId` and `topicId` into `input_completion` tracing metadata
from the chat input auto-complete call site.
- Default `FileTracingStore` whenever NODE_ENV=development (drop the
ENABLE_LLM_GENERATION_TRACING_LOCAL opt-in env var).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* 💄 style(llm-generation-tracing): prettier CLI output (tree + colors)
Mirror the @lobechat/agent-tracing viewer style:
- Inline ANSI color helpers (dim/bold/cyan/magenta/green/yellow/red).
- Compact single-line header with id, scenario, version, model, status,
time — replaces the multi-line bullet list.
- Tree structure with `├─`/`└─` connectors instead of `── section ──`
banners.
- input arrays render per-message (role + char count + preview) rather
than dumping raw JSON.
- Small single-key outputs (e.g. `{ completion: "怎么样" }`) collapse
to inline `key: "value"`.
- `lt list` switches to a colored, properly padded table.
Default view stays compact; --full expands system_prompt / input /
schema bodies.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(llm-generation-tracing): split `tracing` config out of `metadata`
`options.metadata` was overloaded — half tracing-specific structured fields
(scenario / promptVersion / schemaName / agentId / topicId / ...), half
free-form jsonb passthrough. Callers couldn't tell which was which, and the
inputHint was always auto-extracted (useless when the prompt wraps the user's
text in a template).
This commit introduces a dedicated `tracing` option:
- Add `TracingOptions` to @lobechat/llm-generation-tracing — the typed shape
callers import (agentId / topicId / inputHint / scenario / promptVersion /
schemaName / systemPrompt / parentTracingId / metadata).
- Add loose `tracing?: Record<string, unknown>` to GenerateObjectOptions and
StructureOutputParams / StructureOutputSchema so the field flows through
the runtime + TRPC.
- Tracing hook now reads `context.options.tracing` for structured fields; it
still falls back to `metadata.trigger` for the cross-cutting trigger string
(ModelRuntime itself uses metadata.trigger for timing logs, so trigger
stays on metadata).
- Service `record()` accepts an explicit `inputHint`; otherwise falls back
to auto-extraction from the first user message. Always truncated.
- Free-form jsonb fields move to `tracing.metadata` (was unknown-key passthrough
on `metadata`).
- Call sites updated:
- FollowUpAction now passes `tracing: { scenario, promptVersion, schemaName,
topicId }` (previously `metadata`).
- InputCompletion now passes `tracing: { agentId, topicId, inputHint: input,
scenario, promptVersion, schemaName }` — `inputHint` is the user's actual
typed text, not the wrapper prompt's first user message.
- `aiChat.outputJSON` router forwards both metadata and tracing.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* Update inputCompletion.ts
* 🐛 fix(llm-generation-tracing): stop duplicating provider into the row's metadata jsonb
`provider` is already a first-class column on the `llm_generation_tracing`
row, so auto-stamping it into the `metadata` jsonb column on every call was
pure noise. The hook now writes the caller-supplied `tracing.metadata`
verbatim — empty/undefined when the caller had nothing to add.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> | 19 小时前 |
| ✨ feat: per-call llm_generation_tracing observability (#15124)
* ✨ feat(database): add llm_generation_tracing schema + tracing package (LOBE-9462)
Foundation layer for per-call observability of `generateObject` calls.
- New Drizzle table `llm_generation_tracing` with identity / context / model /
result / usage / storage / feedback / audit columns and full single-column
index coverage (Postgres bitmap-scan friendly). Migration 0103 is idempotent
(CREATE TABLE/INDEX IF NOT EXISTS) for safe re-runs.
- `LlmGenerationTracingModel` with `record` / `updateFeedback` / `findById` /
`listRecent`, all userId-scoped to prevent cross-user leaks.
- New package `@lobechat/llm-generation-tracing` mirroring agent-tracing's
shape: `ITracingStore` interface, `FileTracingStore` (local/dev, scenario
subfolders + latest.json symlink), `computePromptHash` (6-char sha256 of
systemPrompt + schema), and `TRACING_SCENARIO_REGISTRY` + `resolveScenario`
with explicit scenario override.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(model-runtime): wire llm_generation_tracing into ModelRuntime.generateObject (LOBE-9462)
Per-call interception layer — one hook covers all generateObject callers.
- New `onGenerateObjectComplete` hook on `ModelRuntimeHooks`: always fires
(success or failure) with latency, usage, output/error. Fixes the gap where
`onGenerateObjectFinal` only fires when the runtime invokes `onUsage`.
- `S3TracingStore` (zstd level 3, key
`llm-generation-tracing/{scenario}/{v}-{hash}/{date}/{id}.json.zst`) and
`LLMGenerationTracingService` that does DB insert → store.save → patch
storage_key. Store failures preserve the row with `metadata.store_error`.
- `createLLMGenerationTracingHook` + `mergeModelRuntimeHooks` wired into
`initModelRuntimeFromDB`; tracing runs alongside business (billing) hooks
via `next/server.after()` when available, microtask fallback otherwise.
Unknown metadata keys (e.g. `parent_memory_trace_key`) pass through.
- Memory extractor accepts `parentMemoryTraceKey` option for the job-level
backlink. Follow-up-action caller given an explicit `scenario: 'follow_up'`
metadata override — it was the only OSS caller missing trigger metadata.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✅ test(llm-generation-tracing): type vi.fn mocks so tsgo accepts mock.calls indexing
The hook + service tests destructured `mock.calls[0][0]` and accessed nested
fields, which tsgo flagged as TS2493 / TS18046 because `vi.fn()` defaults to a
zero-arg signature. Add explicit type parameters to the mocks so tsgo can
infer the call tuple, and cast `call.payload` at the access point.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(model-runtime): move mergeModelRuntimeHooks into the package
It's a generic utility for composing `ModelRuntimeHooks` instances — same
import surface as `ModelRuntime` and the hooks interface — so it belongs
alongside them rather than tucked under a server-side consumer.
- New `packages/model-runtime/src/core/mergeHooks.ts` exports
`mergeModelRuntimeHooks` and is re-exported from the package index.
- Move the unit tests to `packages/model-runtime/src/core/mergeHooks.test.ts`,
including a new case covering the "a throws → b is skipped" load-bearing
semantics.
- `src/server/services/llmGenerationTracing/hook.ts` drops the local copy and
the consumer (`src/server/modules/ModelRuntime/index.ts`) imports from
`@lobechat/model-runtime`.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(llm-generation-tracing): version lives with the prompt, not in a central table
`promptVersion` was baked into `TRACING_SCENARIO_REGISTRY`, far from any
prompt definition — editing a prompt + forgetting to bump the entry in a
completely different file was an obvious foot-gun.
- Registry is now `Record<string, string>` mapping trigger → scenario only;
it's the stable concern that rarely changes.
- `resolveScenario` always passes `promptVersion` through from the caller,
defaulting to `UNKNOWN_PROMPT_VERSION` ('v0') when absent.
- Each call site declares its own `*_PROMPT_VERSION` constant next to the
prompt it describes. `followUpAction` ships the first one:
`FOLLOW_UP_PROMPT_VERSION` in `prompts/index.ts`, threaded through
`metadata.promptVersion` at the `generateObject` call. Other callers can
add the same constant when they next touch their prompts.
The 6-char prompt hash on the row still catches forgotten bumps.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(input-completion): wire prompt-version metadata at the auto-complete call site
Aligns input auto-complete with the FOLLOW_UP_PROMPT_VERSION convention so
each prompt iteration is recordable as the chat-side tracing lands.
- `INPUT_COMPLETION_PROMPT_VERSION = 'v1.0'` declared next to
`chainInputCompletion` — bump together with the prompt body.
- `fetchPresetTaskResult` accepts optional `metadata` and forwards it to
`getChatCompletion`; the existing chat path already plumbs metadata to
`ModelRuntime.chat` options.
- `InputEditor` call site passes
`{ scenario: 'input_completion', promptVersion }`.
Note: `llm_generation_tracing` currently only fires from
`onGenerateObjectComplete`. Input completion is a `chat` call, so this
metadata is forward-looking until a chat-side tracing hook lands.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* 🐛 fix(llm-generation-tracing): collapse bucketDir path.join args to silence turbopack glob warning
Turbopack's static analyzer treats `path.join(root, dyn1, dyn2)` as a
multi-segment glob pattern and warned that it could match ~12k files in
the project. Compose the relative subdir as a single string first, so
`path.join` only sees one dynamic segment.
Behavior unchanged — the resulting path is identical.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(input-completion): route auto-complete through generateObject for tracing
Auto-complete is the first preset-task caller migrated to the structured-
output path so it lands in `llm_generation_tracing` via the existing
`onGenerateObjectComplete` hook. No new server hook, no global chat-side
tracing.
- `chainInputCompletion` now returns `{ messages, schema }` with a minimal
`{ completion: string }` schema and a stable `INPUT_COMPLETION_SCHEMA_NAME`
constant. JSON wrapping costs ~15-30 tokens against a 100-token completion
budget — negligible for the observability win.
- `StructureOutputSchema` / `StructureOutputParams` accept optional
`metadata`; `aiChatRouter.outputJSON` merges caller metadata over the
default trigger so `{ scenario, promptVersion, schemaName }` reach
`ModelRuntime.generateObject` options unchanged.
- `IStructureSchema.description` is now optional to match the zod schema —
previously the TS type was stricter than runtime validation accepted.
- `InputEditor` switches from `chatService.fetchPresetTaskResult` to
`aiChatService.generateJSON`, reading `response.completion`. Streaming
is dropped because auto-complete already buffers the full result before
inserting; no UX change.
- Reverts the unused `metadata` field that was added to
`fetchPresetTaskResult` in the previous commit — no current caller needs
it now that input completion uses the generateObject path.
Bumps `INPUT_COMPLETION_PROMPT_VERSION` to v2.0 because the system prompt
gained an "output the completion field" instruction.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(aiGeneration): extract the runtime-init + generateObject dance into a service
Every server-side caller that produces structured output was repeating the
same two-step ritual: `initModelRuntimeFromDB(...)` → `runtime.generateObject(payload, { metadata })`.
`AiGenerationService` collapses it into one call so future cross-cutting
concerns (default metadata, retry, observability hooks) have one place to
land.
- New `src/server/services/aiGeneration/index.ts` exposes
`generateObject<T>(input, options)` and is unit-tested for provider
resolution + payload/metadata pass-through.
- `aiChatRouter.outputJSON` and `FollowUpActionService.extract` migrated to
the service (other callers move organically when next touched).
- Drops the unused `keyVaultsPayload` field from `StructureOutputParams`
and the placeholder at the InputEditor call site — key vaults are
server-resolved from DB, the client never supplies them.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(tracing): centralize TRACING_SCENARIOS const + inject AiGenerationService via trpc ctx
- New `packages/const/src/llmGenerationTracing.ts` exports `TRACING_SCENARIOS`
+ `TracingScenario` type — the single directory where every known scenario
name lives. Adds `@lobechat/const` as a workspace dep on llm-generation-
tracing so `TRACING_SCENARIO_REGISTRY` can reference the same literals.
- Callers (FollowUpActionService, InputEditor) replace `'follow_up'` /
`'input_completion'` string literals with `TRACING_SCENARIOS.FollowUp` /
`.InputCompletion`, so a typo or a rename fails the type-check instead of
silently drifting on the row.
- `AiGenerationService` is now injected into the `aiChatProcedure` ctx
middleware alongside `aiChatService`; `outputJSON` consumes it via
`ctx.aiGenerationService` instead of new-ing it inside the handler.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(llm-generation-tracing): add lt/llm-tracing CLI + drop local-only storage_key
- Add `lt` / `llm-tracing` CLI under @lobechat/llm-generation-tracing with
`list` (recent records, --scenario filter, --json) and `inspect` (by
tracing_id prefix or latest, --full, --json).
- `FileTracingStore.save` now returns `{ key: null }` so dev DB rows leave
`storage_key` empty instead of recording a non-resolvable local path; S3
store remains the source of truth for the real key. Add helpers
`findByTracingId` / `getLatest` used by the CLI.
- Wire `agentId` and `topicId` into `input_completion` tracing metadata
from the chat input auto-complete call site.
- Default `FileTracingStore` whenever NODE_ENV=development (drop the
ENABLE_LLM_GENERATION_TRACING_LOCAL opt-in env var).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* 💄 style(llm-generation-tracing): prettier CLI output (tree + colors)
Mirror the @lobechat/agent-tracing viewer style:
- Inline ANSI color helpers (dim/bold/cyan/magenta/green/yellow/red).
- Compact single-line header with id, scenario, version, model, status,
time — replaces the multi-line bullet list.
- Tree structure with `├─`/`└─` connectors instead of `── section ──`
banners.
- input arrays render per-message (role + char count + preview) rather
than dumping raw JSON.
- Small single-key outputs (e.g. `{ completion: "怎么样" }`) collapse
to inline `key: "value"`.
- `lt list` switches to a colored, properly padded table.
Default view stays compact; --full expands system_prompt / input /
schema bodies.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(llm-generation-tracing): split `tracing` config out of `metadata`
`options.metadata` was overloaded — half tracing-specific structured fields
(scenario / promptVersion / schemaName / agentId / topicId / ...), half
free-form jsonb passthrough. Callers couldn't tell which was which, and the
inputHint was always auto-extracted (useless when the prompt wraps the user's
text in a template).
This commit introduces a dedicated `tracing` option:
- Add `TracingOptions` to @lobechat/llm-generation-tracing — the typed shape
callers import (agentId / topicId / inputHint / scenario / promptVersion /
schemaName / systemPrompt / parentTracingId / metadata).
- Add loose `tracing?: Record<string, unknown>` to GenerateObjectOptions and
StructureOutputParams / StructureOutputSchema so the field flows through
the runtime + TRPC.
- Tracing hook now reads `context.options.tracing` for structured fields; it
still falls back to `metadata.trigger` for the cross-cutting trigger string
(ModelRuntime itself uses metadata.trigger for timing logs, so trigger
stays on metadata).
- Service `record()` accepts an explicit `inputHint`; otherwise falls back
to auto-extraction from the first user message. Always truncated.
- Free-form jsonb fields move to `tracing.metadata` (was unknown-key passthrough
on `metadata`).
- Call sites updated:
- FollowUpAction now passes `tracing: { scenario, promptVersion, schemaName,
topicId }` (previously `metadata`).
- InputCompletion now passes `tracing: { agentId, topicId, inputHint: input,
scenario, promptVersion, schemaName }` — `inputHint` is the user's actual
typed text, not the wrapper prompt's first user message.
- `aiChat.outputJSON` router forwards both metadata and tracing.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* Update inputCompletion.ts
* 🐛 fix(llm-generation-tracing): stop duplicating provider into the row's metadata jsonb
`provider` is already a first-class column on the `llm_generation_tracing`
row, so auto-stamping it into the `metadata` jsonb column on every call was
pure noise. The hook now writes the caller-supplied `tracing.metadata`
verbatim — empty/undefined when the caller had nothing to add.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> | 19 小时前 |
| ✨ feat: per-call llm_generation_tracing observability (#15124)
* ✨ feat(database): add llm_generation_tracing schema + tracing package (LOBE-9462)
Foundation layer for per-call observability of `generateObject` calls.
- New Drizzle table `llm_generation_tracing` with identity / context / model /
result / usage / storage / feedback / audit columns and full single-column
index coverage (Postgres bitmap-scan friendly). Migration 0103 is idempotent
(CREATE TABLE/INDEX IF NOT EXISTS) for safe re-runs.
- `LlmGenerationTracingModel` with `record` / `updateFeedback` / `findById` /
`listRecent`, all userId-scoped to prevent cross-user leaks.
- New package `@lobechat/llm-generation-tracing` mirroring agent-tracing's
shape: `ITracingStore` interface, `FileTracingStore` (local/dev, scenario
subfolders + latest.json symlink), `computePromptHash` (6-char sha256 of
systemPrompt + schema), and `TRACING_SCENARIO_REGISTRY` + `resolveScenario`
with explicit scenario override.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(model-runtime): wire llm_generation_tracing into ModelRuntime.generateObject (LOBE-9462)
Per-call interception layer — one hook covers all generateObject callers.
- New `onGenerateObjectComplete` hook on `ModelRuntimeHooks`: always fires
(success or failure) with latency, usage, output/error. Fixes the gap where
`onGenerateObjectFinal` only fires when the runtime invokes `onUsage`.
- `S3TracingStore` (zstd level 3, key
`llm-generation-tracing/{scenario}/{v}-{hash}/{date}/{id}.json.zst`) and
`LLMGenerationTracingService` that does DB insert → store.save → patch
storage_key. Store failures preserve the row with `metadata.store_error`.
- `createLLMGenerationTracingHook` + `mergeModelRuntimeHooks` wired into
`initModelRuntimeFromDB`; tracing runs alongside business (billing) hooks
via `next/server.after()` when available, microtask fallback otherwise.
Unknown metadata keys (e.g. `parent_memory_trace_key`) pass through.
- Memory extractor accepts `parentMemoryTraceKey` option for the job-level
backlink. Follow-up-action caller given an explicit `scenario: 'follow_up'`
metadata override — it was the only OSS caller missing trigger metadata.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✅ test(llm-generation-tracing): type vi.fn mocks so tsgo accepts mock.calls indexing
The hook + service tests destructured `mock.calls[0][0]` and accessed nested
fields, which tsgo flagged as TS2493 / TS18046 because `vi.fn()` defaults to a
zero-arg signature. Add explicit type parameters to the mocks so tsgo can
infer the call tuple, and cast `call.payload` at the access point.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(model-runtime): move mergeModelRuntimeHooks into the package
It's a generic utility for composing `ModelRuntimeHooks` instances — same
import surface as `ModelRuntime` and the hooks interface — so it belongs
alongside them rather than tucked under a server-side consumer.
- New `packages/model-runtime/src/core/mergeHooks.ts` exports
`mergeModelRuntimeHooks` and is re-exported from the package index.
- Move the unit tests to `packages/model-runtime/src/core/mergeHooks.test.ts`,
including a new case covering the "a throws → b is skipped" load-bearing
semantics.
- `src/server/services/llmGenerationTracing/hook.ts` drops the local copy and
the consumer (`src/server/modules/ModelRuntime/index.ts`) imports from
`@lobechat/model-runtime`.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(llm-generation-tracing): version lives with the prompt, not in a central table
`promptVersion` was baked into `TRACING_SCENARIO_REGISTRY`, far from any
prompt definition — editing a prompt + forgetting to bump the entry in a
completely different file was an obvious foot-gun.
- Registry is now `Record<string, string>` mapping trigger → scenario only;
it's the stable concern that rarely changes.
- `resolveScenario` always passes `promptVersion` through from the caller,
defaulting to `UNKNOWN_PROMPT_VERSION` ('v0') when absent.
- Each call site declares its own `*_PROMPT_VERSION` constant next to the
prompt it describes. `followUpAction` ships the first one:
`FOLLOW_UP_PROMPT_VERSION` in `prompts/index.ts`, threaded through
`metadata.promptVersion` at the `generateObject` call. Other callers can
add the same constant when they next touch their prompts.
The 6-char prompt hash on the row still catches forgotten bumps.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(input-completion): wire prompt-version metadata at the auto-complete call site
Aligns input auto-complete with the FOLLOW_UP_PROMPT_VERSION convention so
each prompt iteration is recordable as the chat-side tracing lands.
- `INPUT_COMPLETION_PROMPT_VERSION = 'v1.0'` declared next to
`chainInputCompletion` — bump together with the prompt body.
- `fetchPresetTaskResult` accepts optional `metadata` and forwards it to
`getChatCompletion`; the existing chat path already plumbs metadata to
`ModelRuntime.chat` options.
- `InputEditor` call site passes
`{ scenario: 'input_completion', promptVersion }`.
Note: `llm_generation_tracing` currently only fires from
`onGenerateObjectComplete`. Input completion is a `chat` call, so this
metadata is forward-looking until a chat-side tracing hook lands.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* 🐛 fix(llm-generation-tracing): collapse bucketDir path.join args to silence turbopack glob warning
Turbopack's static analyzer treats `path.join(root, dyn1, dyn2)` as a
multi-segment glob pattern and warned that it could match ~12k files in
the project. Compose the relative subdir as a single string first, so
`path.join` only sees one dynamic segment.
Behavior unchanged — the resulting path is identical.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(input-completion): route auto-complete through generateObject for tracing
Auto-complete is the first preset-task caller migrated to the structured-
output path so it lands in `llm_generation_tracing` via the existing
`onGenerateObjectComplete` hook. No new server hook, no global chat-side
tracing.
- `chainInputCompletion` now returns `{ messages, schema }` with a minimal
`{ completion: string }` schema and a stable `INPUT_COMPLETION_SCHEMA_NAME`
constant. JSON wrapping costs ~15-30 tokens against a 100-token completion
budget — negligible for the observability win.
- `StructureOutputSchema` / `StructureOutputParams` accept optional
`metadata`; `aiChatRouter.outputJSON` merges caller metadata over the
default trigger so `{ scenario, promptVersion, schemaName }` reach
`ModelRuntime.generateObject` options unchanged.
- `IStructureSchema.description` is now optional to match the zod schema —
previously the TS type was stricter than runtime validation accepted.
- `InputEditor` switches from `chatService.fetchPresetTaskResult` to
`aiChatService.generateJSON`, reading `response.completion`. Streaming
is dropped because auto-complete already buffers the full result before
inserting; no UX change.
- Reverts the unused `metadata` field that was added to
`fetchPresetTaskResult` in the previous commit — no current caller needs
it now that input completion uses the generateObject path.
Bumps `INPUT_COMPLETION_PROMPT_VERSION` to v2.0 because the system prompt
gained an "output the completion field" instruction.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(aiGeneration): extract the runtime-init + generateObject dance into a service
Every server-side caller that produces structured output was repeating the
same two-step ritual: `initModelRuntimeFromDB(...)` → `runtime.generateObject(payload, { metadata })`.
`AiGenerationService` collapses it into one call so future cross-cutting
concerns (default metadata, retry, observability hooks) have one place to
land.
- New `src/server/services/aiGeneration/index.ts` exposes
`generateObject<T>(input, options)` and is unit-tested for provider
resolution + payload/metadata pass-through.
- `aiChatRouter.outputJSON` and `FollowUpActionService.extract` migrated to
the service (other callers move organically when next touched).
- Drops the unused `keyVaultsPayload` field from `StructureOutputParams`
and the placeholder at the InputEditor call site — key vaults are
server-resolved from DB, the client never supplies them.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(tracing): centralize TRACING_SCENARIOS const + inject AiGenerationService via trpc ctx
- New `packages/const/src/llmGenerationTracing.ts` exports `TRACING_SCENARIOS`
+ `TracingScenario` type — the single directory where every known scenario
name lives. Adds `@lobechat/const` as a workspace dep on llm-generation-
tracing so `TRACING_SCENARIO_REGISTRY` can reference the same literals.
- Callers (FollowUpActionService, InputEditor) replace `'follow_up'` /
`'input_completion'` string literals with `TRACING_SCENARIOS.FollowUp` /
`.InputCompletion`, so a typo or a rename fails the type-check instead of
silently drifting on the row.
- `AiGenerationService` is now injected into the `aiChatProcedure` ctx
middleware alongside `aiChatService`; `outputJSON` consumes it via
`ctx.aiGenerationService` instead of new-ing it inside the handler.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ✨ feat(llm-generation-tracing): add lt/llm-tracing CLI + drop local-only storage_key
- Add `lt` / `llm-tracing` CLI under @lobechat/llm-generation-tracing with
`list` (recent records, --scenario filter, --json) and `inspect` (by
tracing_id prefix or latest, --full, --json).
- `FileTracingStore.save` now returns `{ key: null }` so dev DB rows leave
`storage_key` empty instead of recording a non-resolvable local path; S3
store remains the source of truth for the real key. Add helpers
`findByTracingId` / `getLatest` used by the CLI.
- Wire `agentId` and `topicId` into `input_completion` tracing metadata
from the chat input auto-complete call site.
- Default `FileTracingStore` whenever NODE_ENV=development (drop the
ENABLE_LLM_GENERATION_TRACING_LOCAL opt-in env var).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* 💄 style(llm-generation-tracing): prettier CLI output (tree + colors)
Mirror the @lobechat/agent-tracing viewer style:
- Inline ANSI color helpers (dim/bold/cyan/magenta/green/yellow/red).
- Compact single-line header with id, scenario, version, model, status,
time — replaces the multi-line bullet list.
- Tree structure with `├─`/`└─` connectors instead of `── section ──`
banners.
- input arrays render per-message (role + char count + preview) rather
than dumping raw JSON.
- Small single-key outputs (e.g. `{ completion: "怎么样" }`) collapse
to inline `key: "value"`.
- `lt list` switches to a colored, properly padded table.
Default view stays compact; --full expands system_prompt / input /
schema bodies.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* ♻️ refactor(llm-generation-tracing): split `tracing` config out of `metadata`
`options.metadata` was overloaded — half tracing-specific structured fields
(scenario / promptVersion / schemaName / agentId / topicId / ...), half
free-form jsonb passthrough. Callers couldn't tell which was which, and the
inputHint was always auto-extracted (useless when the prompt wraps the user's
text in a template).
This commit introduces a dedicated `tracing` option:
- Add `TracingOptions` to @lobechat/llm-generation-tracing — the typed shape
callers import (agentId / topicId / inputHint / scenario / promptVersion /
schemaName / systemPrompt / parentTracingId / metadata).
- Add loose `tracing?: Record<string, unknown>` to GenerateObjectOptions and
StructureOutputParams / StructureOutputSchema so the field flows through
the runtime + TRPC.
- Tracing hook now reads `context.options.tracing` for structured fields; it
still falls back to `metadata.trigger` for the cross-cutting trigger string
(ModelRuntime itself uses metadata.trigger for timing logs, so trigger
stays on metadata).
- Service `record()` accepts an explicit `inputHint`; otherwise falls back
to auto-extraction from the first user message. Always truncated.
- Free-form jsonb fields move to `tracing.metadata` (was unknown-key passthrough
on `metadata`).
- Call sites updated:
- FollowUpAction now passes `tracing: { scenario, promptVersion, schemaName,
topicId }` (previously `metadata`).
- InputCompletion now passes `tracing: { agentId, topicId, inputHint: input,
scenario, promptVersion, schemaName }` — `inputHint` is the user's actual
typed text, not the wrapper prompt's first user message.
- `aiChat.outputJSON` router forwards both metadata and tracing.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* Update inputCompletion.ts
* 🐛 fix(llm-generation-tracing): stop duplicating provider into the row's metadata jsonb
`provider` is already a first-class column on the `llm_generation_tracing`
row, so auto-stamping it into the `metadata` jsonb column on every call was
pure noise. The hook now writes the caller-supplied `tracing.metadata`
verbatim — empty/undefined when the caller had nothing to add.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> | 19 小时前 |