GitHub: Release v3.8.43 (#5609)

Commit b729a8f2, created 16 hours ago

File	Last commit	Last updated
docs	Release v3.8.40: v3.8.40 cycle integration → main. All test gates green (Unit/Integration/Coverage/Node-compat/Quality-Ratchet). The only red check, 'PR Test Policy', is the test-masking heuristic firing on the cumulative ~57-commit release diff (legitimate assert consolidations already reviewed per-PR: Gemini CLI removal #5246, retired GPT models #5280, provider catalog refreshes); overridden with --admin per the documented release-PR convention. CodeQL/SonarQube advisory scans non-blocking; #5278's code already passed CodeQL on main. Homologated on VPS 192.168.0.15 (v3.8.40 healthy).	3 days ago
CHANGELOG.md	Release v3.8.43 (#5609) * chore(release): open v3.8.43 development cycle * docs(relay): clarify backend routing contract (#5621) Integrated into release/v3.8.43 (drift-shed: cherry-picked the real change onto the release tip; stale-base drift dropped). * fix(security): avoid rendering error stacks (#5624) Integrated into release/v3.8.43 (drift-shed: cherry-picked the real change onto the release tip; stale-base drift dropped). * fix(chatgpt-web): restore dot-form Pro model ids (#5549) Integrated into release/v3.8.43 (drift-shed: cherry-picked the real change onto the release tip; stale-base drift dropped). * feat(commandCode): add multimodal image support for CC vision models (#5557) Integrated into release/v3.8.43 (drift-shed: cherry-picked the real change onto the release tip; stale-base drift dropped). * fix(providers): validate M365 Copilot web credentials (#5432) Integrated into release/v3.8.43 (drift-shed: cherry-picked the real change onto the release tip; stale-base drift dropped). * fix(sse): bound chat hot-path heap — pressure-aware admission + response cap + clone reductions (#5152) (#5425) Integrated into release/v3.8.43 (drift-shed: cherry-picked the real change onto the release tip; stale-base drift dropped). * fix: model lockout not recording for 429 rate_limit_exceeded from Antigravity ## Problem When Antigravity returns HTTP 429 with `rate_limit_exceeded` error code, the model lockout system never records the failure, so the model is not cooled down despite being rate-limited. ### Root Cause Antigravity's 429 error text is: `"Resource has been exhausted (e.g. check quota)."` The QUOTA_PATTERNS in `classify429.ts` contained overly broad regexes: - `/resource.exhaust/i` — matches "Resource has been exhausted" - `/check.quota/i` — matches "check quota" This caused `classifyErrorText()` to return `QUOTA_EXHAUSTED` (wrong), which set `providerExhausted = true` in the combo target exhaustion logic. 
With `providerExhausted`, the retry path was skipped entirely, and while the "done retrying" path should still record lockout, the misclassification cascaded into incorrect provider-level exhaustion state. Additionally, `targetExhaustion.ts` used the raw error text string instead of the structured error code (`rate_limit_exceeded`) that was already parsed from the response body. ## Fix 1. classify429.ts — Removed overly broad `/resource.exhaust/i` and `/check.quota/i` from QUOTA_PATTERNS. Antigravity's rate-limit wording is not a true quota exhaustion signal. 2. targetExhaustion.ts — Added optional `structuredError` to `ApplyComboTargetExhaustionOptions`. When available, the structured error code (e.g. `rate_limit_exceeded`) takes precedence over raw error text for exhaustion classification. 3. combo.ts — Passes `structuredError` to both `applyComboTargetExhaustion` call sites (dispatch path + retry-or-rotate path). ## Effect `structuredError.code = "rate_limit_exceeded"` → classified as rate-limit (not quota) → `providerExhausted = false` → retry proceeds → `recordModelLockoutFailure` called → model enters lockout with proper cooldown (120s base, exponential backoff). ## Tests Added 2 new tests for `structuredError.code` precedence in exhaustion classification. All 28 related tests pass. * fix(checks): normalize route paths on windows (#5613) Integrated into release/v3.8.43. Windows path-normalization fix for the route-guard membership gate + regression test (Rule #18). Co-authored test added by maintainer. 
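The Antigravity 429 fix above hinges on one rule: a structured error code, when present, takes precedence over the raw error text. The following is a minimal sketch of that rule, not the project's actual code. The function names, the `StructuredError` shape, and the narrow example patterns are all hypothetical. The source states only a 120s base with exponential backoff, so the doubling factor and the one-hour cap are assumptions.

```typescript
// Hypothetical sketch; the real classify429.ts / targetExhaustion.ts may differ.
type ExhaustionKind = "rate_limit" | "quota_exhausted";

interface StructuredError {
  code?: string;
}

// Illustrative narrow patterns only (assumed). The broad /resource.exhaust/i
// and /check.quota/i patterns are deliberately absent, as the fix describes.
const QUOTA_PATTERNS: RegExp[] = [/quota exceeded/i, /insufficient_quota/i];

function classify429(
  errorText: string,
  structuredError?: StructuredError,
): ExhaustionKind {
  // The structured code wins over whatever the raw text happens to say.
  if (structuredError?.code === "rate_limit_exceeded") return "rate_limit";
  // Otherwise fall back to text matching.
  return QUOTA_PATTERNS.some((pattern) => pattern.test(errorText))
    ? "quota_exhausted"
    : "rate_limit";
}

// Cooldown for the Nth consecutive failure: 120s base (from the changelog),
// doubling per failure and capped at one hour (both assumptions).
function lockoutCooldownMs(
  consecutiveFailures: number,
  baseMs = 120_000,
  maxMs = 3_600_000,
): number {
  const exponent = Math.max(0, consecutiveFailures - 1);
  return Math.min(baseMs * 2 ** exponent, maxMs);
}
```

With this shape, Antigravity's "Resource has been exhausted (e.g. check quota)." text is classified as a rate limit whether or not the structured code is available, so the retry and lockout path is no longer skipped.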
* fix: truncate tool list when provider limit exceeds MAX_TOOLS_LIMIT (grok-cli 200) - Add proactive PROVIDER_TOOL_LIMITS map with grok-cli: 200 - Fix regex to capture 'maximum is 200' (not '427 tools provided') - Remove broken truncation gate that skipped limits >= MAX_TOOLS_LIMIT (128) - Add tests for Grok regex, proactive limits, and limits above threshold Refs #5563 * test(chatcore): cover grok-cli tool-list truncation via prepareUpstreamBody (#5563) Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> * fix(security): v3.8.15 hardening follow-ups (Seg2/Seg3/Seg4/Bug3) (#5512) Security v3.8.15 hardening follow-ups: Seg2 (CHANGEME boot warn), Seg3 (auth_token cookie maxAge 30d), Seg4 (VS Code path-token once-per-process warning), Bug3 (real global install path resolution), Bug1 (segment-match node_modules in auto-update detection). All 5 carry TDD regression guards. * Fix HuggingChat web session routing (#5592) (#5592) Integrated into release/v3.8.43. HuggingChat web session-routing fix (root parent-message fetch + cookie propagation + encrypted-credential guard) + 24-model catalog refresh. Maintainer adjustments (co-authored): reverted the freeModelCatalog.data.ts whole-file reformat down to the surgical 24-record huggingchat change (preserving the auto-generated compact format), and added a 502 regression test for the null parent-message-id path (Rule #18). 
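The grok-cli tool-limit fix above combines a proactive per-provider limit map with a regex that reads the limit, not the provided count, out of the upstream error. A rough sketch of how those pieces could fit together follows. `PROVIDER_TOOL_LIMITS`, `MAX_TOOLS_LIMIT` (128) and the `grok-cli: 200` entry come from the changelog. The helper names, the sample error wording, and the use of `MAX_TOOLS_LIMIT` as the default for unknown providers are assumptions.

```typescript
// Values named in the changelog.
const MAX_TOOLS_LIMIT = 128;
const PROVIDER_TOOL_LIMITS: Readonly<Record<string, number | undefined>> = {
  "grok-cli": 200,
};

// Hypothetical helper: capture the limit ("maximum is 200"),
// not the number of tools that were sent ("427 tools provided").
function parseToolLimitFromError(message: string): number | null {
  const match = /maximum is (\d+)/i.exec(message);
  return match ? Number(match[1]) : null;
}

// Hypothetical helper: a limit learned from an error wins, then the proactive
// map, then the global default (assumed fallback).
function resolveToolLimit(provider: string, learnedLimit?: number | null): number {
  return learnedLimit ?? PROVIDER_TOOL_LIMITS[provider] ?? MAX_TOOLS_LIMIT;
}

// Truncate whenever the list exceeds the resolved limit, including limits
// above MAX_TOOLS_LIMIT, which the removed gate used to skip.
function truncateTools<T>(tools: T[], limit: number): T[] {
  return tools.length > limit ? tools.slice(0, limit) : tools;
}
```

For example, a 427-tool request to `grok-cli` would be cut to 200 tools before dispatch, rather than passing through untouched because 200 is larger than 128.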
* fix: preserve system role for GLM 5.1/5.2 (#5610) (#5663) * fix: restore Codex Responses WS TLS profile + apply proxy (#5591, #5611) (#5668) * fix: allow saving providers without a live validator (#5565, #5567) (#5669) * fix: static model catalog for jules/linkup/ollama/searchapi search providers (#5569, #5571, #5573, #5575) (#5672) * fix: live AI/ML API catalog + deprecate dead CablyAI (#5570, #5568) (#5673) * fix: correct 404 provider setup links for ollama/searchapi/you.com (#5572, #5574, #5576) (#5674) * fix: page call_logs cleanup queries to avoid startup OOM on large DBs (#5618) (#5675) * fix: use PowerShell Expand-Archive on Windows for embedded-service install (#5590) (#5678) * fix: treat array content blocks as valid output in detectMalformedNonStream (#5559) (#5680) * fix: render memory engine status detail strings in English (#5596) (#5685) * fix: free proxy pool silent sync failure — iplocate txt + per-source isolation + surface errors (#5595) (#5686) * chore(quality): close QG v2 tail — drop orphan semcheck.yaml + Fase 9 maturity re-eval (#5681) - Remove semcheck.yaml: orphan config (zero workflow/script wiring) with stale rule counts; deterministic doc-accuracy coverage already exists (check:fabricated-docs --strict + docs-counts-sync + docs-symbols). Drop the REPOSITORY_MAP row referencing it. - Add docs/ops/MATURITY_REEVAL.md (Fase 9): re-measures maturity post-Ondas 0-3. The two biggest structural weaknesses from QUALITY_GATE_PLAYBOOK (2026-06-16) are now closed: fast-gates hole (quality.yml runs typecheck:core + impacted TIA unit tests + vitest + shards) and mutation-score-as-ratchet (check-mutation-ratchet.mjs + seeded baseline + nightly blocking job). Residual gap is owner/infra-gated (branch-protection main, SLSA L3, CodeQL advanced). - Record agent-lsp as deferred/opt-in (doc-only scaffold, no wiring). 
* fix(ci): stabilize nightly-mutation — guard tap.testFiles drift + anti-flake eps (#5682) Root cause (NOT a timeout): the nightly-mutation run fails on cold-cache nights because the blocking mutation-ratchet job measures modules below baseline, while warm-cache nights pass — the verdict tracked GitHub Actions cache state, not code quality. Proven via a local Stryker probe on headers.ts: covering unit tests (no-memory-header, strip-reasoning) had drifted OUT of stryker.conf.json tap.testFiles, so their mutants went covered-but-unkilled = Survived on a cold full run (COVERED score 61.73 vs 94.29 baseline); adding them restores the kills. - Add scripts/check/check-mutation-test-coverage.mjs: guards that every UNIT test importing a Stryker-mutated module is listed in tap.testFiles. Advisory by default, --strict in CI (wired in quality.yml fast-gates). Prevents recurrence. - Add the 38 drifted covering unit tests to stryker.conf.json tap.testFiles (138 -> 176). Monotonically safe: more covering tests only raise/hold the score. - Add MUTATION_RATCHET_EPS (1.0pt) anti-flake tolerance to check-mutation-ratchet so sub-point tap-runner jitter no longer false-fails the gate. Lowers no baseline. - Tests: check-mutation-test-coverage (3) + eps cases in check-mutation-ratchet. Residual: a clean post-merge nightly confirms scores return to/above baseline; any marginal residual gets a baseline re-seed (operator). * refactor(dashboard): split sidebarVisibility god-file into types + sections leaves (#5683) Behavior-preserving decomposition: src/shared/constants/sidebarVisibility.ts 1197 -> 291 LOC by extracting two leaves under sidebarVisibility/: - types.ts (160): HIDEABLE_SIDEBAR_ITEM_IDS + all sidebar types (self-contained). - sections.ts (762): section building-block consts + SIDEBAR_SECTIONS (imports types only — cycle-safe). COMPRESSION_CONTEXT_GROUP + SIDEBAR_SECTIONS stay exported; host re-exports both + 'export ' of types, so every consumer import path is unchanged. 
Byte-identical data verified via JSON.stringify of HIDEABLE_SIDEBAR_ITEM_IDS / SIDEBAR_ICON_ACCENTS / COMPRESSION_CONTEXT_GROUP / SIDEBAR_SECTIONS / SIDEBAR_PRESETS + getSectionItems output (identical before/after). typecheck:core, check:cycles (no cycles), check:file-size (3 files <800), and the 3 sidebar suites (20/20) pass. No logic changed. Note: file-size frozen baseline for sidebarVisibility.ts (1198) can ratchet to 291 to lock the shrink (left for the release ratchet / operator). fix: surface fusion-specific config on the Global Routing tab (#5598) (#5688) * fix(executor): route OpenAI-compatible MCP Responses requests to /responses (#5483) Closes #5483. OpenAI-compatible providers receiving a Responses-shaped request carrying MCP / tool_search tools now route to the upstream /responses endpoint instead of downgrading to /chat/completions, preserving Codex deferred tool discovery. Detection helpers extracted to open-sse/executors/forceResponsesUpstream.ts. Thanks to @KooshaPari. * fix(ci): make release-green pre-flight gates visible + bounded so unit reds are not missed (#5644) Integrated into release/v3.8.43. * fix(body-size): raise LLM API payload limit for responses routes (#5652) Integrated into release/v3.8.43. Thanks @JxnLexn! * fix(test): use lightweight health probe for batch e2e (#5651) Integrated into release/v3.8.43. Thanks @KooshaPari! * feat(compression): T05/C5 — preserveSystemPrompt mode enum + legacy back-compat (#5653) Integrated into release/v3.8.43. Includes the legacy-boolean back-compat derivation so existing preserveSystemPrompt=false installs keep whenNoCache behavior. * routing: optimize latency strategy with perf metrics (#5629) Integrated into release/v3.8.43. Thanks @KooshaPari! * feat(db): models/5004 — self-correcting model context-window overrides (#5667) Integrated into release/v3.8.43. * feat(providers): complete SenseNova free Token Plan — chat + Text-to-Image (port from 9router#2233) (#5679) Integrated into release/v3.8.43. 
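The `MUTATION_RATCHET_EPS` tolerance from the nightly-mutation fix (#5682) above amounts to comparing each module's score against its baseline minus a small epsilon. A minimal sketch follows, assuming a simple score map. The function name, the result shape, and the treatment of a missing measurement as 0 are hypothetical and not taken from check-mutation-ratchet.mjs.

```typescript
// 1.0 point of anti-flake tolerance, as stated in the changelog.
const MUTATION_RATCHET_EPS = 1.0;

interface RatchetResult {
  module: string;
  score: number;
  baseline: number;
  pass: boolean;
}

// Hypothetical check: a module passes when its score is no more than `eps`
// points below its baseline. The baselines themselves are never lowered.
function checkMutationRatchet(
  scores: Partial<Record<string, number>>,
  baselines: Record<string, number>,
  eps: number = MUTATION_RATCHET_EPS,
): RatchetResult[] {
  return Object.entries(baselines).map(([module, baseline]) => {
    const score = scores[module] ?? 0; // assumption: unmeasured counts as 0
    return { module, score, baseline, pass: score >= baseline - eps };
  });
}
```

Under this rule a sub-point dip from runner jitter still passes, while the cold-cache drop reported for headers.ts (61.73 against a 94.29 baseline) still fails.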
* feat(api): routing/4985 — configurable response-body validation + failover (#5684) Integrated into release/v3.8.43. * fix(chatcore): default Claude tool type to "custom" when missing (#5662) Integrated into release/v3.8.43. Port from 9router#2196. Co-authored-by: warelik <warelik@users.noreply.github.com> * fix(translator): merge consecutive same-role contents for Gemini (port from 9router#2191) (#5661) Integrated into release/v3.8.43. Port from 9router#2191. * chore(bun): add locked bun runtime dependency (#5615) Integrated into release/v3.8.43. Bun 1.3.10 pinned via npm lockfile (adopt-partial decision). Thanks @KooshaPari! * chore(bun): run validated ts scripts with bun (#5612) Integrated into release/v3.8.43. Thanks @KooshaPari! * chore(bun): run CI script checks with bun (#5617) Integrated into release/v3.8.43. Validated bun==node output for all 3 gates (provider-consistency, compression-budget, known-symbols). Thanks @KooshaPari! * fix(build): make pack validator bun safe (#5643) Integrated into release/v3.8.43. Forward-compat guard; node/npm path unchanged. Thanks @KooshaPari! * docs: document Bun as the allow-listed build/dev script runner (Node stays the published runtime) (#5703) Integrated into release/v3.8.43. * feat(analytics): show $0 cost for flat-rate subscription/cookie providers (#5552) (#5704) * refactor(api): extract unified-catalog helpers into cohesive leaf modules (#5699) BLOCO E2 of the god-files campaign. 
The module-level pure/standalone helpers in src/app/api/v1/models/catalog.ts (1611 LOC) were lifted out verbatim into five cohesive leaf modules so the catalog host shrinks toward the 800-LOC file-size cap without any behavior change (host now 1345 LOC; the heavy getUnifiedModelsResponse orchestrator is untouched — its in-function closures stay put): - catalogHelpers.ts — pure numeric/array/shape helpers + shared catalog types - catalogOpenrouter.ts — OpenRouter id/modality/free-model/display-name helpers - catalogVision.ts — vision-capability field derivation (+ isVisionModelId re-export) - catalogProviderMaps.ts — alias<->providerId resolution maps (buildAliasMaps) - catalogRequest.ts — /v1/models API-key auth gating + Codex CLI client detection The host re-exports getCustomVisionCapabilityFields and isVisionModelId so the public API consumed by other tests (llm-selector-custom-vision-models, vision-detection- consistency) is unchanged; all 9 catalog/vision suites stay green. Adds tests/unit/catalog-helpers-extraction.test.ts: characterization tests for every extracted helper + a guard asserting the host preserves its public exports. Validated: typecheck:core, 50 catalog characterization tests, 12 new leaf tests, integration-wiring, check:cycles, check:file-size (no new violations), ESLint, Prettier. * feat(mcp): T07 — expose RTK learn/discover as MCP tools (#5691) Adds two read-only MCP tools wrapping the existing RTK discovery primitives: omniroute_rtk_discover (discoverRepeatedNoise/suggestFilter over recently captured raw tool output → candidate noise patterns + suggested filter) and omniroute_rtk_learn (listRtkCommandSamples + commandToId). Scope read:compression, MCP audit-logged, no new engine logic. Regression guard: tests/unit/compression/rtk-mcp-tools.test.ts. gaps v3.8.42 — T07. 
* feat(compression): T05/C3 — opt-in LLM-tier compression engine (#5702) Adds an opt-in, default-off LLM-tier compression engine ('llm') that condenses non-system message prose via a pluggable chat-completion backend, mirroring the llmlingua contract. Safe by construction: no-op default backend (pass-through out of the box), not in the default stacked pipeline, enabled defaults false, fenced code blocks + system messages never sent to the model, fail-open everywhere, minTokens floor. Real production backend is a VPS-validated follow-up (Hard Rule #18). Regression guard: tests/unit/compression/llm-compressor-engine.test.ts (8). gaps v3.8.42 — T05/C3. * refactor(db): extract compat/aliases/mitm helpers from db/models.ts into leaf modules (#5705) BLOCO E3 of the god-files campaign. db/models.ts (1250 LOC) mixed six concerns; the three cleanly-separable ones plus the shared key_value helpers were lifted out verbatim into a new src/lib/db/models/ subdirectory, leaving the tightly-coupled custom/synced/ flags trio in the host (host now 936 LOC). The host re-exports every moved public symbol so the module's public API (consumed by ~29 test files + localDb) is unchanged. - models/shared.ts — asRecord / toNonEmptyString / getKeyValue + JsonRecord (19 LOC) - models/compat.ts — model-compat overrides + sanitizeUpstreamHeadersMap (249 LOC) - models/aliases.ts — model-alias CRUD + cascade delete (61 LOC) - models/mitmAlias.ts — MITM alias get/set (32 LOC) The custom/synced/flags trio stays in the host because it is genuinely coupled (flags->getCustomModelRow, flags->readCompatList, custom->removeModelCompatOverride, synced->getModelIsDeleted, setModelIsHidden->updateCustomModel) — splitting it cleanly is a follow-up. Dependency DAG is acyclic (verified by check:cycles). Adds tests/unit/db-models-split.test.ts: characterization of the pure extracted helpers + a guard asserting the host preserves its full public export surface. 
Validated: typecheck:core, check:cycles (no cycles), 77 existing db/models consumer tests (db-models-crud/extended/aliases-cascade + 7 more) green, 7 new tests, ESLint, Prettier, check:file-size (host 936 < frozen 1259; no new violations). * refactor(db): extract pricing/lkgp/cache-metrics from db/settings.ts into leaf modules (#5709) BLOCO E3 of the god-files campaign. db/settings.ts (1154 LOC) mixed five concerns; the three cleanly-separable ones plus the shared toRecord/JsonRecord helper were lifted out verbatim into a new src/lib/db/settings/ subdirectory, leaving the Settings-core + Proxy config concerns in the host (host now 646 LOC). The host re-exports every moved public symbol so the module's public API (consumed by ~93 test files + localDb) is unchanged. - settings/shared.ts — toRecord + JsonRecord (9 LOC) - settings/pricing.ts — pricing layers/sources/per-model + update/reset (254 LOC) - settings/lkgp.ts — Last-Known-Good-Provider get/set/clear (49 LOC) - settings/cacheMetrics.ts — cache metrics + trend (235 LOC) Settings-core + the Proxy-config concern stay in the host: proxy is the most tangled (245-line resolveProxyForConnection, resolution cache, imports from ./proxies) and getSettings is the most central function — leaving them is the correct coupled-core stop. Pricing/LKGP/Cache have NO dependency on Settings/Proxy helpers (verified); the dependency DAG is acyclic (check:cycles). Adds tests/unit/db-settings-split.test.ts: characterization of the shared toRecord helper + a guard asserting the host preserves its full public export surface. Validated: typecheck:core, check:cycles (no cycles), 149 existing+new db/settings consumer tests green (db-settings-crud/extended, 8 pricing suites, cache-metrics, 2 proxy-resolution suites + 29 new), ESLint, Prettier, check:file-size (host 646 < frozen 1155). 
* fix(translator): re-apply lost defensive hardening for Gemini merge + Claude tool defaults (#5706) Re-applies two dropped gemini-code-assist hardening fixes (defaultClaudeToolType non-object passthrough; mergeConsecutiveSameRoleContents shallow-copy) with regression tests. Follow-up to #5661/#5662. Integrated into release/v3.8.43. * feat(codex): generate fallback profiles for compatible models (#5701) setup-codex now generates Codex profiles for compatible text models from the live /v1/models catalog when the model id doesn't match a hand-tuned pattern, skipping media/embedding models. Integrated into release/v3.8.43. * docs(changelog): credit @Chewji9875 for #5563 + #5579 Add CHANGELOG credit bullets for grok-cli tool-limit (#5563) and Antigravity 429 lockout (#5579). Documentation-only. * test(dashboard): repoint sidebar quota-share placement scan to sections.ts (#5711) The D1 god-file split (#5683) moved the nav-item id definitions out of src/shared/constants/sidebarVisibility.ts into the extracted leaf src/shared/constants/sidebarVisibility/sections.ts. This source-scan test still read the old monolith path, so it found 0 occurrences of id: "costs-quota-share" and failed (base-red on release/v3.8.43). Repoint SIDEBAR_PATH to sections.ts where the ids now live. All four placement assertions (quota-share after quota, same array, far from costs-budget, exactly one occurrence) hold against the new source. * refactor(db): extract columns/nodes/rate-limit leaves from db/providers.ts (#5714) db/providers.ts was a 1106-line god-file mixing four concerns. Extract the three acyclic, cohesive slices into sibling leaf modules under src/lib/db/providers/, leaving the tightly-coupled connection-CRUD core in the host: - providers/columns.ts (116) 10 pure column-normalizer helpers (DB-free) - providers/nodes.ts (163) 6 provider-node CRUD functions - providers/rateLimit.ts (177) 6 rate-limit/quota runtime helpers + formatResetCountdown Host providers.ts: 1106 -> 719 lines. 
The connection-CRUD core does not call any node or rate-limit function (verified), so the host re-exports the 12 moved public symbols via `export { ... } from './providers/<leaf>'` — the module's public API stays IDENTICAL (23 symbols). Bodies moved verbatim (byte-identical); the only edit to a moved line is the added `export` on the 10 previously-private normalizers. Behavior-preserving: 122 existing provider/quota/rate-limit consumer tests stay green; new tests/unit/db-providers-split.test.ts guards the re-export barrel + characterizes the pure column helpers (38 assertions). Refs #3501 (god-file structural shrink). * refactor(db): extract types + pure mappers from db/proxies.ts (#5717) db/proxies.ts was a 1059-line god-file. Extract the two acyclic, DB-free slices into sibling leaf modules under src/lib/db/proxies/, leaving the tightly-coupled CRUD + assignment + resolution core in the host: - proxies/types.ts (65) 10 proxy type/interface declarations - proxies/mappers.ts (180) pure row mappers / scope normalizers / payload coercers (toRecord, mapProxyRow, mapAssignmentRow, isRelayProxyType, extractRelayAuth, toRegistryProxyResolution, normalizeScope, normalizeAssignmentScopeId, toLegacyProxyLevel, coerceProxyPayload, redactProxySecrets) Host proxies.ts: 1059 -> 847 lines. The resolution functions call createProxy/assignProxyToScope, so the CRUD+resolution core CANNOT be extracted without an import cycle and stays in the host. The host re-exports the 2 moved public functions (extractRelayAuth, redactProxySecrets) via `export { ... } from './proxies/mappers'` — the public API stays IDENTICAL (20 functions; no types were ever publicly exported). Bodies moved verbatim; the only host edits are the new leaf imports, the re-export, dropping the now unused `import { decrypt }`, and two prettier line-wrap reflows of retained ternary/union lines (token-identical). 
Behavior-preserving: 69 existing proxy/registry/relay/family consumer tests stay green; new tests/unit/db-proxies-split.test.ts guards the re-export barrel + characterizes the pure mappers (35 assertions). Refs #3501. * refactor(db): extract static migration data tables from migrationRunner.ts (#5721) migrationRunner.ts (1124 lines, frozen-baselined) is the startup migration orchestrator. As a conservative, zero-behaviour-risk first slice, extract the six static migration-compatibility DATA tables (verbatim) into a pure-data leaf, leaving the entire orchestrator + all SQL-running helpers in the host: - migrationRunner/constants.ts (118) RENAMED_MIGRATION_COMPATIBILITY, LEGACY_VERSION_SLOT_MIGRATIONS, SUPERSEDED_DUPLICATE_MIGRATIONS, PHYSICAL_SCHEMA_SENTINELS, INITIAL_SCHEMA_SENTINELS, OPTIONAL_FTS5_MIGRATION_VERSIONS Host migrationRunner.ts: 1124 -> 1023. The runtime fts5SupportCache (a WeakMap, mutable state) stays in the host. No public API change (these consts were module-internal). Data moved byte-identical (sed-extracted, verbatim verified); the only host edits are the leaf import + one prettier collapse of a pre-existing 2-line union type annotation to 1 line (token-identical, typecheck-confirmed). Characterize-first (operator-chosen): the existing db-migration-runner.test.ts (26 tests) + no-migration-collisions/weak-rng-fixes/check-db-rules (11) prove the reconciliation/dedup/already-applied BEHAVIOUR is unchanged; the new tests/unit/db-migrationrunner-constants-split.test.ts (7 tests) PINS THE DATA (counts + shape + spot-checks of every table) so a dropped/transposed row is caught immediately. Refs #3501. * refactor(db): extract pure SQL-source builders from usageAnalytics.ts (#5722) usageAnalytics.ts (924 lines, frozen-baselined) mixes two pure SQL-source builders with ~20 getXxxRows() query functions. 
Extract the contiguous, DB-free builder block verbatim into a leaf, leaving every query function in the host: - usageAnalytics/sources.ts (208) AnalyticsParams, BuildUnifiedSourceOptions, UnifiedSourceResult + buildUnifiedSource + buildPresetUnifiedSource (pure string builders; no DB, no imports) Host usageAnalytics.ts: 924 -> 723. The query functions do not call the builders (callers build the unified source then pass the string in), so the host re-exports the 5 moved public symbols (2 fns + 3 types) and imports AnalyticsParams as a type for its query signatures — the public API stays IDENTICAL (39 symbols). Builder bodies moved byte-identical; the two orphaned section-header banners that described the moved block were removed with it; the retained query-function suffix is byte-identical to the original. Behavior-preserving: 37 existing analytics consumer tests stay green (usage-analytics 12, usage-endpoint-dimension 3, db-usage-analytics-3500 22); new tests/unit/db-usageanalytics-split.test.ts (25 assertions) characterizes buildUnifiedSource's needsAggregated branching (raw-only vs raw+daily_usage_summary) + guards the 39-symbol re-export barrel. Refs #3501. 
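The god-file splits above share one pattern: pure helpers move verbatim into a leaf, the host re-exports them, and a guard test pins the host's public export surface. The sketch below illustrates that guard in a single file, with plain objects standing in for the leaf and host modules. `toRecord` is named in the changelog, but its behavior here is assumed, and `missingExports` and `EXPECTED_PUBLIC_EXPORTS` are hypothetical.

```typescript
// Leaf stand-in: a pure helper with no dependency on the host.
// The body is an assumption about what toRecord does.
const leaf = {
  toRecord(value: unknown): Record<string, unknown> {
    return value !== null && typeof value === "object" && !Array.isArray(value)
      ? (value as Record<string, unknown>)
      : {};
  },
};

// Host stand-in: keeps the coupled core and re-exports the moved symbol,
// in place of `export { toRecord } from "./leaf"`.
const host = {
  toRecord: leaf.toRecord,
  getSettings(): Record<string, unknown> {
    return {};
  },
};

// Hypothetical guard: list every expected public export that is not a function
// on the module namespace.
function missingExports(
  moduleNamespace: Record<string, unknown>,
  expected: readonly string[],
): string[] {
  return expected.filter((name) => typeof moduleNamespace[name] !== "function");
}

const EXPECTED_PUBLIC_EXPORTS = ["toRecord", "getSettings"] as const;
```

A guard test would assert that `missingExports(host, EXPECTED_PUBLIC_EXPORTS)` is empty, so a re-export dropped during a later split fails immediately instead of surfacing in a distant consumer.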
* docs(readme): refresh metrics, list 17 strategies, add Quota-Share + real provider logos - Unify provider count to 236; MCP tools 87->94; cloud agents 3->4 (+Cursor); compression 9->10 engines (+relevance) - Tests -> 21,000+ across 2,586 files; footer -> v3.8.43 - Raise lower bounds to real values: 90+ free, 80+ commands, 24+ CLIs - Language flag grid 33->43 (15/14/14, all locales) - List all 17 routing strategies; new Quota-Share section before Resilience - Real provider logos (lobe-icons + local agentrouter) in providers grid and Free Forever - Top Contributors: refreshed stats + add herjarsa; 280+ title; half-size avatars; contrib.rocks 100->200 - Acknowledgments: refreshed star counts; fix headroom repo rename * docs(readme): update provider counts and add new badges * feat(memory): T10/TV6 — opt-in typed memory decay (#5723) Opt-in typed memory decay so the conversational memory store self-prunes stale episodic noise. access_count + last_accessed_at telemetry (migration 111) is always-on/non-destructive; the sweep is opt-in (MEMORY_TYPED_DECAY_ENABLED, default false). Only episodic decays by default (30d); factual/procedural/semantic immune; access_count>=3 earns immunity; deletions reuse deleteMemory (SQLite+vec+Qdrant in sync), fail-open. Regression guard: tests/unit/memory/typed-decay.test.ts (15). gaps v3.8.42 — T10/TV6. * feat(dashboard): T06/T03 — drag-reorder compression pipeline editor + studio e2e (#5727) T06: named-combos editor gains a @dnd-kit/sortable drag-to-reorder stacked pipeline backed by a pure model (compressionPipelineModel.ts: add/remove/move/update, engine->intensity invariant, never-empty). CompressionPipelineEditor.tsx replaces the inline fixed list in CompressionCombosPageClient; order persists via the existing combos endpoint (no API change). T03: adds tests/e2e/compression-studio.spec.ts (Tela A render + Play/Compare tab switch), the dedicated compression-studio e2e combo-live-studio.spec.ts did not cover. 
TDD: compression-pipeline-model.test.ts (11) + compression-pipeline-editor.test.tsx (4). gaps v3.8.42 — T06 + T03. * fix(thinking): wire Thinking-Budget boot hydration into live instrumentation path (#5312) (#5729) hydrateThinkingBudgetConfig was only called from the unused src/server-init.ts, which never runs in production, so the dashboard Thinking-Budget mode silently reverted to passthrough on every restart. Wire it into the real boot path (src/instrumentation-node.ts), next to the Global System Prompt restore. Surfaced by live Anthropic-OAuth validation on the VPS (fix A of #5312 was non-functional even though its direct unit test passed). New guard tests/unit/thinking-budget-boot-wiring-5312.test.ts asserts the production boot module calls the hydration, closing the test gap that let this ship. * refactor(usage): extract pure formatting helpers from callLogs.ts (#5725) callLogs.ts (996 lines, frozen-baselined) mixes pure log-formatting / sanitization helpers with DB CRUD, disk-artifact, and rotation logic. Extract the ten pure, DB-free helpers verbatim into a leaf, leaving all stateful code in the host: - callLogs/format.ts (129) asRecord, toNumber, toStringOrNull, truncateText, parseInlineError, normalizeDetailState, sanitizeErrorForLog, toStoredErrorSummary, protectPipelinePayloads, buildRequestSummary Host callLogs.ts: 996 -> 885. The stateful generateLogId (mutates logIdCounter) stays in the host. These helpers were all module-internal, so the public API is unchanged (10 exported functions). Bodies moved byte-identical; the host's now unused 'sanitizePII' import (only referenced inside the moved bodies) moved to the leaf; prettier wrapped buildRequestSummary's signature across lines once the 'export' prefix pushed it past 100 cols (token-identical). 
Behavior-preserving: 46 existing call-log consumer tests stay green (call-log-cap 14, pagination 4, file-rotation 5, log-retention 5, startup 1, oom 2, trim-sql 2, db-settings-maintenance 13); new tests/unit/calllogs-format-split.test.ts (26 assertions) characterizes the pure helpers + guards the 10-function public API. Refs #3501. * refactor(usage): extract pure stat/coercer helpers from usageHistory.ts (#5728) usageHistory.ts (987 lines, frozen-baselined) mixes pure DB-free helpers with an in-memory pending-request state machine and DB CRUD. Extract the contiguous pure block verbatim into a leaf, leaving all stateful code in the host: - usageHistory/helpers.ts (85) asRecord, toStringOrNull, normalizeServiceTier, toNumber, percentile, stdDev, truncatePendingPreview (+ its MAX_PREVIEW_* bounds, co-located) Host usageHistory.ts: 987 -> 916. The pending-request state machine (module Maps + track/update/finalize/sweep) and DB CRUD stay in the host. These helpers were all module-internal, so the public API is unchanged (21 direct exports + the pre-existing getCompletedDetails re-export = 22). Bodies moved byte-identical (leaf 0 non-verbatim lines); the host's local 'type JsonRecord' moved with the bodies that used it (host no longer references it — typecheck-confirmed). Behavior-preserving: 38 existing usage-history consumer tests stay green (usage-history-db 5, api-key-usage-limits 6, log-retention 5, usage-endpoint-dimension 3, provider-request-failure-pipeline 6, database-settings-maintenance 13); new tests/unit/usagehistory-helpers-split.test.ts (30 assertions) pins the percentile/stdDev formulas + normalizeServiceTier + guards the public API. Refs #3501. * refactor(usage): extract pure quota-normalize helpers from providerLimits.ts (#5730) providerLimits.ts (954 lines, frozen-baselined) is the heavily DB/network-coupled provider quota sync module. 
Extract a small, fully SELF-CONTAINED leaf of pure quota-key/quota-value normalization helpers (+ the isRecord type guard they share), leaving all sync/DB/network code in the host: - providerLimits/quotaNormalize.ts (72) isRecord, isUsageQuotaKeyAllowed, normalizeUsageQuotaKey, normalizeUsageQuotasForProvider, sanitizeUsageQuotasForProvider Host providerLimits.ts: 954 -> 890. The leaf imports only the external antigravity/agy model-alias helpers the moved bodies reference (moved from the host's import block) — it does NOT import the host, so check:cycles stays clean (no cycle). isRecord (used ~9x in the host) is co-extracted and imported back. These five were all module-internal, so the public API is unchanged (13 exported functions). Bodies moved byte-identical. Behavior-preserving: 18 existing provider-limits consumer tests stay green (sanitize-scope 3, db-provider-limits 3, proxy-fail-closed 3, rotating-expired-guard 7, codex-quota-sync 2); new tests/unit/providerlimits-quotanormalize-split.test.ts (19 assertions) pins isRecord + isUsageQuotaKeyAllowed + guards the 13-function public API. Refs #3501. * refactor(memory): extract pure scoring/conversion helpers from retrieval.ts (#5733) retrieval.ts (1192 lines — ABOVE its 1171 frozen baseline) is the memory retrieval engine (DB + vector + rerank network). Extract the pure, DB-free scoring/conversion helpers (+ the MemoryRow row shape they share) verbatim into a self-contained leaf, leaving all DB/vector/network code in the host: - retrieval/scoring.ts (104) interface MemoryRow + estimateTokens, parseMetadata, rowToMemory, getRelevanceScore Host retrieval.ts: 1192 -> 1072 — back UNDER the 1171 frozen baseline (the split also repairs the pre-existing file-size drift). The leaf imports only ../types, never the host, so check:cycles stays clean (no cycle). MemoryRow moved to the leaf and imported back as a type by the host's DB row functions. 
The public estimateTokens is re-exported from the leaf; the host also imports it for its internal token-budget loops. The other three helpers were module-internal, so the public API is unchanged (7 exports). Bodies moved byte-identical. Behavior-preserving: 38 existing memory-retrieval consumer tests stay green (rerank 5, hybrid 6, semantic 6, engine-status 9, stats-api 12); new tests/unit/retrieval-scoring-split.test.ts (11 assertions) pins estimateTokens (ceil(len/4)) + parseMetadata + rowToMemory mapping + getRelevanceScore (+20 phrase / +3 token) and guards the public API. Refs #3501. * refactor(sse): extract reasoning-tag detection/extraction from responseSanitizer.ts (#5734) responseSanitizer.ts (1133 lines, frozen-baselined) mixes reasoning-tag detection/extraction with response/usage/streaming sanitization. Extract the cohesive, ZERO-IMPORT reasoning block verbatim into a self-contained leaf: - responseSanitizer/reasoning.ts (143) the reasoning regex consts + collapseExcessiveNewlines, cleanReasoningFragment, splitClosingOnlyReasoningPrefix, movePrefixBeforeContentTagToThinking, extractThinkingFromContent, normalizeReasoningRouteId, isAntigravityReasoningRoute, isTextualReasoningTagNativeRoute, shouldParseTextualReasoningTags Host responseSanitizer.ts: 1133 -> 1003. The block's helpers only call each other, so the leaf has ZERO imports — it cannot import the host (check:cycles clean). The host imports back collapseExcessiveNewlines (6 call sites) + extractThinkingFromContent, and re-exports the two public symbols (extractThinkingFromContent, shouldParseTextualReasoningTags) — the public API stays IDENTICAL (7 exports). Bodies moved byte-identical; two long declarations (REASONING_TAG_FRAGMENT_REGEX, movePrefixBeforeContentTagToThinking signature) were line-wrapped by prettier once the 'export' prefix pushed them past 100 cols (token-identical). 
Behavior-preserving: 47 existing consumer tests stay green (response-sanitizer 36, strip-reasoning-header 8, textual-toolcall-false-positive 3); new tests/unit/responsesanitizer-reasoning-split.test.ts (11 assertions) characterizes extractThinkingFromContent + shouldParseTextualReasoningTags and guards the public API. Refs #3501. * refactor(sse): extract rate-limit header parsing from rateLimitManager.ts (#5736) rateLimitManager.ts (1034 lines, frozen-baselined) is the stateful rate-limiter (Bottleneck limiters, watchdog timers, learned-limits Map). Extract the pure, ZERO-IMPORT header-parsing block verbatim into a self-contained leaf, leaving all stateful machinery in the host: - rateLimitManager/headers.ts (94) STANDARD_HEADERS, ANTHROPIC_HEADERS, parseResetTime, toPlainHeaders Host rateLimitManager.ts: 1034 -> 945. The four items are pure (no limiter state, no external deps), so the leaf has ZERO imports — it cannot import the host (check:cycles clean). The host imports all four back (used by updateFromHeaders). They were module-internal, so the public API is unchanged (17 exports). Bodies moved byte-identical. Behavior-preserving: 21 existing rate-limit consumer tests stay green (rate-limit-manager 7, limiter-lifecycle 4, queue-timeout-msg 2, idle-eviction 6, body-lock 2); new tests/unit/ratelimitmanager-headers-split.test.ts (7 assertions) pins parseResetTime (durations / bare-number / nullish) + toPlainHeaders + guards the 17-function public API (with a watchdog-timer teardown hook so the runner exits cleanly). Refs #3501. * fix(config): back boot-hydrated proxy config singletons with globalThis (#5312) (#5742) Next.js compiles instrumentation.ts as a separate webpack module graph from the app-route/open-sse executors, so a module-local `let _config` is duplicated: the boot-time hydration (applyRuntimeSettings / restore hooks) lands on the instrumentation graph's copy, but the request path (base.ts) reads a different, un-hydrated copy. 
Live VPS validation proved the Thinking-Budget hydrate ran to completion at boot yet base.ts still read the passthrough default — why #5312 fix A stayed broken after the boot-wiring fix. Back the singletons with globalThis (the pattern systemPrompt.ts already uses for #2470) so all graph copies share one instance: - thinkingBudget.ts — dashboard Thinking-Budget mode reaches the executor - backgroundTaskDetector.ts — opt-in background degradation actually fires - systemTransforms.ts — operator pipeline overrides reach the request path payloadRules.ts was already safe (lazy per-request DB self-load, #2986). Guards: thinking-budget-globalthis-5312 + runtime-config-globalthis-5312 (assert globalThis sharing; a module-local let fails them, RED->GREEN). * refactor(evals): extract built-in golden-set suites from evalRunner.ts (#5740) Move the 7 static built-in eval suites (golden-set, coding-proficiency, reasoning-logic, multilingual, safety-guardrails, instruction-following, codex-comparison) plus the builtInSuites aggregate into the pure-data leaf src/lib/evals/evalRunner/builtinSuites.ts (zero imports, no side effects). evalRunner.ts keeps all logic (register/get/list/evaluate/run/scorecard/reset) and registers the leaf suites at module load, mirroring the original inline calls. Public API is unchanged (7 exported functions; the suite consts were already module-private). Host 960->301 LOC; leaf 676 LOC (< 800 cap); host was frozen-satisfied (961), so this is debt reduction. Suite data moved verbatim (652 data lines byte-identical). New split-guard test characterizes the suite ids/case counts/key cases and proves the host registers every leaf suite at load. 
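The globalThis backing described for #5742 above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the store key, the config shape, and the function names are all hypothetical. Only the "passthrough" default and the idea of sharing one instance across module-graph copies come from the entry.

```typescript
// Hypothetical sketch: STORE_KEY, ThinkingBudgetConfig, setConfig and getConfig
// are illustrative names, not taken from thinkingBudget.ts.
type ThinkingBudgetConfig = { mode: string };

// A module-local `let _config` would be duplicated once per bundler module graph.
// Keeping the value on globalThis means every copy of this module sees the same slot.
const STORE_KEY = "__exampleThinkingBudgetConfig";

type GlobalStore = Record<string, ThinkingBudgetConfig | undefined>;

const DEFAULT_CONFIG: ThinkingBudgetConfig = { mode: "passthrough" };

function store(): GlobalStore {
  return globalThis as unknown as GlobalStore;
}

// Called at boot (for example from a hydration hook).
function setConfig(config: ThinkingBudgetConfig): void {
  store()[STORE_KEY] = config;
}

// Called on the request path; falls back to the default if boot hydration never ran.
function getConfig(): ThinkingBudgetConfig {
  return store()[STORE_KEY] ?? DEFAULT_CONFIG;
}
```

A guard test in the spirit of the ones named in the entry would set the value through one import of the module and assert that a read through globalThis observes it, which a module-local variable cannot satisfy.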
* refactor(models): extract pure transform layer from modelsDevSync.ts (#5743) Move the models.dev data-model types, the provider-id mapping table (MODELS_DEV_PROVIDER_MAP + mapProviderId), and the raw->OmniRoute transforms (transformModelsDevToPricing, transformModelsDevToCapabilities) into the pure leaf src/lib/modelsDevSync/transform.ts (zero imports, no DB, no module state). modelsDevSync.ts keeps all sync orchestration, DB access, caches and the periodic-sync timer; it imports the transforms for internal use and re-exports mapProviderId/transformModelsDevToPricing/transformModelsDevToCapabilities plus the ModelCapabilityEntry/CapabilitiesByProvider types, so the public API is unchanged. Host 924->677 LOC; leaf 279 LOC (< 800 cap); host was frozen-satisfied (934), so this is debt reduction. 238 moved lines are byte-identical. New split-guard test characterizes the provider map + both transforms and proves the host re-exports them. * refactor(resilience): split settings.ts into types + normalize leaves (#5745) Decompose the (fully pure) resilience settings module into two sibling leaves: - src/lib/resilience/settings/types.ts: the settings shape (11 public interfaces + JsonRecord/AuthCategory), zero imports. - src/lib/resilience/settings/normalize.ts: the coercers (asRecord/toInteger/ toBoolean/feature-flag resolvers) + the 11 per-section normalize* functions. settings.ts keeps DEFAULT_RESILIENCE_SETTINGS, DEFAULT_REQUEST_QUEUE_MAX_WAIT_MS, buildLegacyFallback, and the public orchestrators (resolveResilienceSettings, mergeResilienceSettings, buildLegacyResilienceCompat); it imports the coercers/normalizers for internal use and re-exports the 11 settings interfaces, so the public API is unchanged. Host 840->363 LOC; leaves 182 + 359 LOC (< 800 cap); host was frozen-satisfied (841), so this is debt reduction. 472 moved lines are byte-identical; no cycles (leaves never import the host). 
New split-guard test characterizes the coercers/normalizers and the host resolve/merge/compat orchestration. * docs(readme): document faster/leaner install — skip native build, sql.js fallback (#5713) Documents the optional better-sqlite3 + pure-JS fallback chain and OMNIROUTE_SKIP_POSTINSTALL/CI skip flags. Docs-only, claims verified. (#5550) * feat(compression): T02 opt-in per-engine pipeline circuit-breaker (#5735) Opt-in, default-off per-engine circuit-breaker for the stacked compression pipeline. Byte-identical to legacy when off. 9 regression tests. * docs: sync MCP tool count to 95 + routing-strategy count (#5732) Sync CLAUDE.md/README.md to canonical MCP tool count (95, 35 base) and routing strategies (17). Numbers fact-checked against getAllToolDefinitions()/ROUTING_STRATEGY_VALUES. * feat(api): add first-class Ollama local provider card (#5712) First-class ollama-local provider card (localhost:11434/v1, keyless, passthrough models) in LOCAL_PROVIDERS + SELF_HOSTED + default.ts executor case. Docs count 236→237, Local 11→12 (full README sweep). 4 tests. (#5578) * feat(api): add opt-in API-key provider quota-policy bypass scope (#5731) Adds an opt-in per-API-key scope (policy:bypass-provider-quota) that lets a key skip provider/account-side quota cutoffs during routing. Operator USD budgets/usage limits still enforced unconditionally (fail-closed, before the bypass). Default-off; UI toggle + badge in API Manager. Integrated into release/v3.8.43. * feat(codex): opt-in auto-sync of Codex profiles after model discovery (#5737) Auto-sync ~/.codex/.config.toml profiles after a provider model sync, reusing the setup-codex generator. Opt-in, default OFF (OMNIROUTE_AUTO_SYNC_CODEX_PROFILES=true; also honors CLI_ALLOW_CONFIG_WRITES). Never touches the active Codex config. Gating test added. 
Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> feat(providers): opt-in CLI profile auto-sync toggles + Claude Code auto-sync (#5755) Providers-dashboard 'CLI profile auto-sync' card (Codex + Claude Code toggles), feature-flag backed (default off), + Claude Code auto-sync mirroring the Codex path. Follow-up to #5737. * feat(compression): T08/H8 (2.3) — graduated CCR retrieval-feedback ramp (#5739) Turns CCR retrieval feedback from a binary cliff into a graduated ramp: each prior retrieval raises a block's effective minChars linearly (effectiveMinChars); >= 3 retrievals still excluded (Infinity). retrievalRampFactor default 2 (config/env COMPRESSION_CCR_RETRIEVAL_RAMP_FACTOR); 1 = legacy binary. Regression guard: tests/unit/compression/ccr-retrieval-ramp.test.ts (12); 51 existing CCR tests green. gaps v3.8.42 — T08/H8 (2.3). * feat(compression): T08/H5 (2.4) — usage-observed prefix freeze (opt-in) (#5744) Evolves the cache-aware guard to also learn which system prompts recur: observed >= threshold → treated as a stable cacheable prefix and preserved even for providers the static check misses. Content-addressed by a hash of the system prompt (OpenAI/Claude/Gemini), in-memory, freeze=preserve (never mutates). Opt-in/default-off (COMPRESSION_PREFIX_FREEZE_ENABLED); respects the never preserve-mode. New prefixFreeze.ts wired into resolveCacheAwareConfig. Regression guard: prefix-freeze.test.ts (10); 44 cache-aware tests green. gaps v3.8.42 — T08/H5 (2.4). * feat(compression): T08/H7 (2.5) — read-lifecycle engine (collapse superseded reads) (#5754) New opt-in, default-off read-lifecycle engine: collapses stale/superseded file-Read tool results (same path re-read OR modified later) to a stub, keeping the current Read intact. Anthropic + OpenAI tool shapes; conservative (known tool names, exact path, strictly-later); fail-open. Lossy → opt-in. Regression guard: read-lifecycle.test.ts (10); 41 registry/pipeline suites green. gaps v3.8.42 — T08/H7 (2.5). 
Completes Wave 2. * fix(sse): anti-thundering-herd guard tolerates numeric-epoch cooldowns (#5747) markAccountUnavailable's dedupe guard used a raw `new Date()` on rateLimitedUntil, which can hold a numeric-epoch string (e.g. the Antigravity full-quota path via setConnectionRateLimitUntil). That produced Invalid Date/NaN, so the guard never detected an already cooling connection. As a result, a second concurrent failure on the same connection overwrote a long quota-exhaustion cooldown with a much shorter fresh backoff cooldown, making the account selectable again far sooner than intended. The fix reuses the existing cooldownUntilMs normalizer (#3954) instead of a raw Date parse. * fix(chat): harden non-streaming SSE aggregation (#5746) * fix: repoint DashScope/Alibaba setup links to consoles (#5665) (#5762) * fix: point Quick Start step 1 to API Keys page, not Endpoint (#5695) (#5763) * fix: onboarding wizard saves providers with unsupported validation (#5692) (#5764) * docs(security): document full LOCAL_ONLY route set + GHSA-fhh6-4qxv-rpqj + audit path (#5599) (#5748) Expand ROUTE_GUARD_TIERS.md Tier 1 (LOCAL_ONLY): - link the GHSA advisory and explain the attack class (RCE via a subprocess spawn reachable from non-loopback traffic) - replace the 3-example prefix table with the full LOCAL_ONLY set, mirroring LOCAL_ONLY_API_PREFIXES / LOCAL_ONLY_API_PATTERNS in routeGuard.ts (the authoritative source; check-route-guard-membership enforces the code side) - add an "Operator guidance & auditing" section for users behind nginx/Cloudflare/Tailscale: don't forge X-Forwarded-For loopback, keep the manage-scope bypass minimal, and how to audit non-loopback access Docs-only; SECURITY.md already links here.
Closes #5599 * docs(security): document banned-keyword / account-ban detection (#5600) (#5756) * docs(security): add BAN_DETECTION.md — banned-keyword / account-ban detection (#5600) New docs/security/BAN_DETECTION.md documenting the previously-undocumented system: - the 8 built-in ACCOUNT_DEACTIVATED_SIGNALS + custom keywords are additive - detection flow (body substring match -> terminal `banned` state, skipped in account selection; `deactivated` on 401/403; autoDisableBannedAccounts) - scope: global (all providers); the signal strings target OAuth/subscription scrapers - custom keywords: add path, 200-char cap, hot-reload, and the false-positive warning (raw substring match -> prefer full ban sentences, not "quota"/"limit") - recovery: terminal states never auto-recover -> re-test / re-auth / re-enable Registered in security meta.json; cross-linked from RESILIENCE_GUIDE (terminal states). Docs-only. Closes #5600 * docs(security): clarify deactivated vs expired terminal-status split (#5600) The same ACCOUNT_DEACTIVATED signal surfaces as two different terminal statuses depending on the code path: chatCore.ts inline writes 'deactivated' (401/403 via classifyProviderError), while markAccountUnavailable() -> resolveTerminalConnectionStatus() writes 'expired'. Document both. 
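The detection flow summarized for BAN_DETECTION.md above (built-in signals plus additive custom keywords, raw substring match on the response body, 200-char cap) might look roughly like the sketch below. It rests on several assumptions: the built-in strings are placeholders because the real ACCOUNT_DEACTIVATED_SIGNALS values are not listed here, the match is assumed to be case-insensitive, and over-long custom keywords are assumed to be dropped rather than truncated.

```typescript
// Hypothetical placeholders: the 8 real built-in signal strings are not shown in this changelog.
const BUILT_IN_SIGNALS: readonly string[] = [
  "your account has been deactivated",
  "this account has been banned",
];

// The 200-character cap on custom keywords mentioned in the entry above.
const MAX_KEYWORD_LENGTH = 200;

// Assumption: empty and over-long keywords are discarded.
function normalizeCustomKeywords(keywords: readonly string[]): string[] {
  return keywords
    .map((k) => k.trim())
    .filter((k) => k.length > 0 && k.length <= MAX_KEYWORD_LENGTH);
}

// Returns the first matching signal, or null when the body looks clean.
// Custom keywords are additive: they extend the built-in list, never replace it.
function findBanSignal(body: string, customKeywords: readonly string[] = []): string | null {
  const haystack = body.toLowerCase(); // assumption: case-insensitive matching
  const signals = [...BUILT_IN_SIGNALS, ...normalizeCustomKeywords(customKeywords)];
  for (const signal of signals) {
    if (haystack.includes(signal.toLowerCase())) return signal;
  }
  return null;
}
```

Because the match is a plain substring test, a short custom keyword such as "quota" would also fire on an ordinary quota error, which is why the entry recommends full ban sentences.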
* fix: surface relay proxy-test errors instead of silent failure (#5716) (#5765) * refactor(api): extract pure discovery leaves from provider-models route (#5758) Split src/app/api/providers/[id]/models/route.ts (2511 -> 1818 LOC) by moving the cohesive, DB-free discovery building blocks into four leaves under discovery/: - helpers.ts record/string coercion, Azure + base-url helpers, bearer/named-openai header builders - normalizers.ts Antigravity / DataRobot / OpenAI-like / SAP models response normalizers - providerModelsConfig.ts PROVIDER_MODELS_CONFIG + ProviderModelsConfigEntry - providerSets.ts NAMED_OPENAI_STYLE_PROVIDERS + isNamedOpenAIStyleProvider The host keeps all request orchestration and imports the leaves back. The moved symbols were module-private, so the route's public export set (GET) is unchanged and no external importer needs updating. Bodies are byte-identical: the code-line multiset of host + leaves equals the original route verbatim. Tests: - repoint the qwen-web source-guard in catalog-updates-v3829-kimi-qwen to the new config leaf (assertions unchanged) - add provider-models-discovery-split as the split regression guard (leaf public surface + host wiring + the #5570 cablyai->aimlapi entry swap) * fix(memory): enabling Qdrant activates it as the engine + inline guidance (#5597) (#5741) * fix(memory): enabling Qdrant now activates it as the engine + inline guidance (#5597) Enabling Qdrant in the Engine tab was inert: retrieval only routes to Qdrant when memoryVectorStore === "qdrant" (the default "auto" never selects it), and the card only wrote qdrantEnabled — nothing set the engine selector, and there is no UI for it. So users configured Qdrant, saw "enabled", but it was never actually used. - PUT /api/settings/qdrant now sets memoryVectorStore alongside the toggle: enable -> "qdrant", disable -> "auto". Editing other fields leaves it untouched. 
- Add inline guidance to QdrantConfigCard: a Tier-1-vs-Tier-2 banner + per-field help (host, collection, embedding model). Note there is no "vector dimension" or "distance metric" field: dimension is auto-detected from the embedder, distance is always Cosine. - Document the real behavior in MEMORY.md: engine gate, no back-fill of existing memories, dimension auto-detect, Cosine-only, API-key-only auth. Tests: tests/integration/qdrant-routes.test.ts — enable->qdrant, disable->auto, and field-edit-without-enabled leaves the engine untouched (TDD: red -> green). Closes #5597 * fix(memory): invalidate memory-settings cache on Qdrant toggle (#5597) The PUT handler wrote memoryVectorStore to the DB but retrieval reads through getMemorySettings(), a module-level cache. Without busting it, the engine switch did not take effect until a process restart (the DB said qdrant, retrieval kept routing to sqlite-vec). Now calls invalidateMemorySettingsCache() after the write, mirroring src/app/api/settings/memory/route.ts. Regression test warms the cache, toggles via the route, and asserts getMemorySettings().vectorStore flips to qdrant (fails without the invalidate call). * fix(compression): record Context Editing telemetry on the streaming path (#5761) Streaming SSE responses now preserve context_management from the final message_delta snapshot and fire the telemetry hook in onStreamComplete, so context-clear savings surface in compression analytics for streaming (not just non-streaming). Additive telemetry, Claude-only, opt-in-neutral. gaps v3.8.42 — T01 (5.1). Test: context-editing-streaming-telemetry.test.ts (3, failing->passing). * Persist batch item checkpoints during recovery (#5753) * fix(sse): checkpoint batch item recovery * fix(db): renumber batch checkpoints migration 110→112 (collision with #5667) 110 was taken by 110_model_context_overrides.sql (#5667), which landed on the release branch after this PR branched. 
migrationRunner throws a hard version- collision error on startup when two files share a numeric prefix. 112 is the next free slot (110/111 taken on the release tip). Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> --------- Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> * fix: resolve CCR MCP retrieve principal from api-key auth context (#5649) (#5768) * feat(cli): show version in startup banner (integrates #5752) (#5769) * feat(cli): show version in startup banner Print dim 'v<version>' line below ASCII art logo in omniroute serve. Uses readFileSync (same pattern as program.mjs) to read package.json. Closes #5749. * test(cli): guard startup-banner version line (#5752) Source-inspection test (same pattern as cli-serve-port.test.ts) asserting serve.mjs parses the version from package.json and prints v${_pkg.version} in the startup banner — satisfies Hard Rule #8 for the bin/ change. Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> * docs(changelog): credit #5752 startup-banner version line (thanks @chirag127) --------- Co-authored-by: Chirag Singhal <76880977+chirag127@users.noreply.github.com> * fix(proxyfetch): skip fallback for non-replayable bodies (#5770) * chore(release): open v3.8.42 cycle Bump version to 3.8.42, add CHANGELOG placeholder, sync openapi/electron/open-sse + 42 i18n CHANGELOG mirrors. 
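The migration renumbering noted above for #5753 (two files sharing the numeric prefix 110, with 112 as the next free slot) can be illustrated with a small prefix check. This is only a sketch under assumptions: the function names are hypothetical, the file-name convention `<number>_<name>.sql` is inferred from `110_model_context_overrides.sql`, and the other file names are made up for the example. It does not reproduce migrationRunner itself.

```typescript
// Extracts the leading numeric version from a migration file name, or null if absent.
function migrationVersion(fileName: string): number | null {
  const match = /^(\d+)_/.exec(fileName);
  return match ? Number(match[1]) : null;
}

// Throws when two files claim the same numeric prefix, mirroring the hard
// startup error described in the entry (the message text here is illustrative).
function assertNoVersionCollisions(fileNames: readonly string[]): void {
  const seen = new Map<number, string>();
  for (const name of fileNames) {
    const version = migrationVersion(name);
    if (version === null) continue;
    const previous = seen.get(version);
    if (previous !== undefined) {
      throw new Error(`Migration version collision at ${version}: ${previous} vs ${name}`);
    }
    seen.set(version, name);
  }
}

// Picks the next unused version after the highest one present.
function nextFreeVersion(fileNames: readonly string[]): number {
  let max = 0;
  for (const name of fileNames) {
    const version = migrationVersion(name);
    if (version !== null && version > max) max = version;
  }
  return max + 1;
}
```

With `110_model_context_overrides.sql` and a hypothetical `111_example.sql` already on the release tip, `nextFreeVersion` yields 112, matching the slot chosen in the fix.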
* chore: remove unused qdrant schema aliases (#5404) Integrated into release/v3.8.42 * chore: remove unused memory schema aliases (#5403) Integrated into release/v3.8.42 * chore: remove unused quota schema types (#5402) Integrated into release/v3.8.42 * chore: remove unused playground row type (#5401) Integrated into release/v3.8.42 * chore: remove unused codegraph exports (#5400) Integrated into release/v3.8.42 * chore: remove unused notion client type (#5399) Integrated into release/v3.8.42 * chore: remove unused settings types (#5398) Integrated into release/v3.8.42 * chore: remove unused combo types (#5396) Integrated into release/v3.8.42 * chore: remove unused provider types (#5393) Integrated into release/v3.8.42 * chore: remove unused skillssh skill type (#5392) Integrated into release/v3.8.42 * chore: remove unused status hex key type (#5391) Integrated into release/v3.8.42 * chore: remove unused batch provider type (#5390) Integrated into release/v3.8.42 * chore: remove unused skills schema types (#5389) Integrated into release/v3.8.42 * chore: remove unused codex auth input type (#5388) Integrated into release/v3.8.42 * chore: remove unused memory schema types (#5387) Integrated into release/v3.8.42 * chore: remove unused playground row type (#5386) Integrated into release/v3.8.42 * chore: remove unused qdrant schema types (#5385) Integrated into release/v3.8.42 * chore: remove unused kiro social schema (#5384) Integrated into release/v3.8.42 * chore: remove unused memory schema types (#5383) Integrated into release/v3.8.42 * chore: remove unused audit action type (#5382) Integrated into release/v3.8.42 * chore: remove unused agent skills schema types (#5381) Integrated into release/v3.8.42 * chore: remove unused shared logger default export (#5380) Integrated into release/v3.8.42 * chore: remove unused sse logger helpers (#5378) Integrated into release/v3.8.42 * chore: remove unused sse model legacy helpers (#5377) Integrated into release/v3.8.42 * 
chore: remove unused v1 search response schema (#5376) Integrated into release/v3.8.42 * chore: remove unused cloud agent result schemas (#5375) Integrated into release/v3.8.42 * chore: remove unused a2a routing logger readers (#5374) Integrated into release/v3.8.42 * chore: remove unused webhook delivery detail export (#5372) Integrated into release/v3.8.42 * chore: remove unused api key type (#5395) Integrated into release/v3.8.42 * chore: remove unused usage types (#5397) Integrated into release/v3.8.42 * chore: remove unused cloud agent input types (#5373) Integrated into release/v3.8.42 * deps: bump electron from 42.4.1 to 42.5.1 in /electron (#5413) Integrated into release/v3.8.42 * deps: bump the production group with 11 updates (#5414) Integrated into release/v3.8.42 * fix: frame non-streaming JSON responses (#5416) Integrated into release/v3.8.42 * fix(services): runNpm shell on win32 + prefix via env for Node 24 EINVAL (#5379) (#5474) Node 24 refuses execFile of npm.cmd without a shell (nodejs/node#52554), so embedded-service install (9Router/CLIProxy) failed with spawn EINVAL on Windows. runNpm now enables shell on win32 only; to stay Hard-Rule-#13 safe under a shell, the install --prefix is passed via npm_config_prefix (env) instead of an argv path (survives spaces), and the user-supplied version is constrained by SERVICE_VERSION_PATTERN at the route boundary. 
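The runNpm change above (#5474) combines three ideas: enable a shell on win32 only, pass the install prefix through the `npm_config_prefix` environment variable instead of an argv path, and validate the user-supplied version before it reaches the command line. A rough sketch of how such an invocation could be assembled follows. The function name, the return shape, and especially the regex are assumptions; the real SERVICE_VERSION_PATTERN is not shown in this changelog.

```typescript
// Hypothetical stand-in for SERVICE_VERSION_PATTERN: plain x.y.z only.
const EXAMPLE_VERSION_PATTERN = /^\d+\.\d+\.\d+$/;

interface NpmInvocation {
  command: string;
  args: string[];
  options: { shell: boolean; env: Record<string, string | undefined> };
}

// Builds the pieces one would hand to child_process.execFile or spawn.
// `pkg` is assumed to be a fixed, trusted package name, not user input.
function buildNpmInstall(
  pkg: string,
  version: string,
  prefix: string,
  platform: string,
  baseEnv: Record<string, string | undefined> = {},
): NpmInvocation {
  // Reject anything that is not a plain version before it can reach a shell.
  if (!EXAMPLE_VERSION_PATTERN.test(version)) {
    throw new Error(`Rejected service version: ${version}`);
  }
  const isWindows = platform === "win32";
  return {
    command: isWindows ? "npm.cmd" : "npm",
    // No --prefix argument: a path with spaces would need quoting under a shell.
    args: ["install", `${pkg}@${version}`],
    options: {
      shell: isWindows, // shell only where npm.cmd requires it
      env: { ...baseEnv, npm_config_prefix: prefix }, // npm reads npm_config_* variables
    },
  };
}
```

Keeping the prefix in the environment means the only user-influenced token left in argv is the version, and that token is already constrained by the pattern check.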
* fix(cli): restore dist/tls-options.mjs to npm tarball (#5452) (#5503) Closes #5452 * fix(dashboard): render onboarding wizard on /providers/new (#5427) (#5505) Closes #5427 * fix(db): EBUSY-safe database import on Windows (#5406) (#5507) Closes #5406 * chore: remove unused gamification streak exports (#5463) * chore: remove unused headroom log tail export (#5464) * chore(dead-code): remove unused prompt cache control helper (#5466) * chore(duplication): share vscode metadata helpers (#5471) * chore(duplication): share auth zip extractors (#5475) * chore(duplication): share vscode tokenized request helper (#5479) * chore(duplication): share quota strategy ranking helpers (#5482) * chore(duplication): share recharts donut card (#5484) * chore(duplication): share provider specific validation (#5485) * chore(duplication): share batch response formatter (#5488) * chore(duplication): share redis runtime helpers (#5490) * chore(duplication): share version manager request parsing (#5492) * chore(duplication): share media generation route helpers (#5493) * chore(duplication): share settings transform schemas (#5496) * chore(duplication): share relay stream finalizer (#5497) * chore(duplication): share machine id fallback (#5498) * chore(duplication): share node sqlite adapter (#5500) * fix: treat terminal stream cancels as complete (#5491) * fix post-merge ci regressions (#5467) * fix: gate claude adaptive thinking defaults (#5480) Co-authored-by: KooshaPari <koosha@example.com> * fix(fallback): normalize provider error rule headers (#5473) Co-authored-by: KooshaPari <koosha@example.com> * fix(rate-limit): normalize queue refresh settings (#5499) Co-authored-by: KooshaPari <koosha@example.com> * chore(ci): add npm fetch-retry + release-freeze protocol (Hard Rule #21) (#5506) - .npmrc: bump fetch-retries 2->5 with backoff so transient registry ECONNRESET during npm ci (electron-release, v3.8.41) retries instead of failing the job; applies repo-wide. 
- CLAUDE.md Hard Rule #21: release-freeze coordination marker (label release-freeze) that campaign workflows honor before merging into the active release branch, preventing the mid-release commit races that forced CHANGELOG re-reconciliation in v3.8.40/v3.8.41. * chore(duplication): share service install helpers (#5495) Share service install helpers; re-add SERVICE_VERSION_PATTERN regex to the shared schema (dropped in extraction, #5474) + tests rejecting malformed versions. Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> * chore(duplication): share proxy route handlers (#5472) Share proxy route handlers; add resolveProxyLookupResponse regression test (3 branches + custom whereUsed param name). Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> * chore(duplication): share combo builder model options (#5477) Share combo builder model options; add regression test locking custom-model source classification (manual->custom, api-sync->imported). Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> * chore(dead-code): ratchet dead code baseline (#5468) Ratchet dead-code baseline to the true measured value (310 -> 225) after the v3.8.42 dead-code + duplication wave. Measured by check-dead-code.mjs on the tip. Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> * fix(dashboard): provider-add UX — i18n labels, surface import warning, default key name (#5511) * fix(dashboard): provider-add UX — real i18n labels, surface import warning, default key name (#5421 #5428 #5429 #5431 #5435) Three rough edges in the Add-API-Key / model-import flow, all from the provider-catalog audit: 1. Validation Model + Account ID form fields shipped untranslated i18n stub copy ('Validation Model Id Label', etc.) that rendered verbatim. Replaced with real copy in en.json. 2. Model import silently fell back to the cached/local catalog — the route returns a 'warning' field the import hook never read. New pure helper extractImportWarning surfaces it as a log line. 3. 
Required connection-name field defaulted to '' (let browser autofill inject garbage like 'wiw'); now defaults to 'main'. Regression guard: tests/unit/provider-add-ux-i18n-import-warning.test.ts. * fix(dashboard): compress AddApiKeyModal comment to keep file under frozen size cap * fix(providers): align Muse Spark (Meta AI) cookie copy to ecto_1_sess (#5449) (#5513) * fix(providers): align Muse Spark (Meta AI) cookie copy to ecto_1_sess (#5449) The default Meta AI session cookie migrated from the retired abra_sess to ecto_1_sess (META_AI_DEFAULT_COOKIE), but the provider form hint and one 401 auth-failure message still named abra_sess, telling users to paste a cookie that no longer exists. Both strings now name ecto_1_sess. Regression guard: tests/unit/muse-spark-cookie-copy-5449.test.ts. * chore: reconcile CHANGELOG with release (keep #5449 + #5511 bullets) * fix(providers): correct FriendliAI (serverless) + Novita (/openai/v1) endpoints (#5430 #5455) (#5515) * fix(providers): correct FriendliAI (serverless) and Novita (/openai/v1) endpoints (#5430 #5455) Both rejected valid keys, verified live with real provider keys: - FriendliAI baseUrl was /dedicated/v1/... which 403s a serverless flp_* token; switched to /serverless/v1/... + serverless modelsUrl. - Novita baseUrl was the legacy /v3/... with a typo'd model id ai-ai/... (both 404); switched to OpenAI-compat /openai/v1/... + meta-llama/llama-3.1-8b-instruct. Regression guard: tests/unit/provider-endpoints-friendliai-novita.test.ts. * chore: reconcile CHANGELOG with release (keep #5430/#5455 + prior bullets) * fix(providers): gate import for tool-only providers + sanitize Coze validation error (#5420 #5426) (#5522) #5420: the 'Import Models' button now hides for tool-only providers (web search / web fetch) via a capability check over resolved serviceKinds, not just the -search suffix — firecrawl/jina-reader (webFetch) no longer show an Import button that 400s. No LLM/media provider is affected. 
#5426: Coze key validation no longer leaks the raw upstream envelope ({code,msg,logId,from}) into the UI; the Coze error becomes a friendly message, scoped to provider === 'coze' so no other provider is affected. Regression guards: tests/unit/model-listing-capability-5420.test.ts, tests/unit/coze-validation-error-5426.test.ts. * fix(providers): correct LongCat free tier — GA LongCat-2.0, one-time 10M (KYC) (#5508) LongCat's preview ended and the Flash-* line was retired (2026-05-29); the API now exposes only the GA LongCat-2.0 (1M context, 128K output). The free tier is a ONE-TIME 10M-token grant unlocked after account signup + KYC verification — NOT a recurring daily/monthly allowance. The catalog still described the retired preview/Flash models and a recurring 150M / 5M-per-day budget; this corrects every reference. Config / code: - registry/longcat: model LongCat-2.0-Preview -> LongCat-2.0, name + comment reflect one-time 10M (KYC) and pay-as-you-go beyond it. - freeModelCatalog: longcat-2.0-preview (150M, recurring-daily) -> LongCat-2.0 (10M, freeType one-time-initial via creditTokens). - freeTierCatalog: drop longcat from the recurring-monthly budget map (one-time credits are excluded by that catalog's own rule). - regional.ts freeNote: one-time 10M after signup + KYC, not recurring. - providerCostData: longcat-flash-lite -> longcat-2.0 (pay-as-you-go 0.75/2.95 per 1M, 10M free quota). - validation probe model longcat -> LongCat-2.0. Tests: - free-tier-catalog: longcat now absent from FREE_TIER_BUDGETS; providerCount 22->21 (clean 21->20); documented total ~1.39B. - tierResolver: sample model flash-lite -> LongCat-2.0. Docs: - README, PROVIDERS-GUIDE, FREE-TIERS-GUIDE, FREE_TIERS: 50M/day Flash-Lite -> one-time 10M LongCat-2.0 (KYC); 'No auth' -> API key + KYC. - Regenerated PROVIDER_REFERENCE.md (picks up the new freeNote). typecheck:core clean; changed-file lint 0 errors; docs-sync PASS. 
* fix(providers): Bytez OpenAI-compat base URL + auth-only key validation (#5422) (#5528) Bytez IS OpenAI-compatible at .../models/v2/openai/v1, but the registry stored the bare .../models/v2 base, so validation's chat-probe hit .../models/v2/chat/completions -> 404 -> 'endpoint not supported'. Part A: registry baseUrl -> full OpenAI-compat chat path. Part B: a Bytez account only serves catalog-provisioned models, so chat-probe validation 404s even for valid keys. validateBytezProvider instead probes the auth-only GET .../models/v2/list/tasks (200=valid, 401/403=invalid). Verified live with a real key: list/tasks -> 200 (valid) / 401 (invalid). Regression guard: tests/unit/bytez-validation-5422.test.ts. * fix(providers): remove dead Phind provider + dedupe HuggingChat catalog listing (#5530) Integrated into release/v3.8.42 (round 3). Dead Phind removal + HuggingChat dedupe, verified complete. * fix: protect dynamic dashboard tests with CSRF (#5405) Integrated into release/v3.8.42 (round 3). Reworked CSRF (HMAC-signed synchronized token). * docs: clarify bifrost relay backend envs (#5520) Integrated into release/v3.8.42 (round 3). Doc-only: bifrost relay envs. * test(quota): guard Claude-Code identity version lockstep (Phase 2) (#5514) Integrated into release/v3.8.42 (round 3). Claude-Code identity version lockstep guard. * feat(compression): T02 — honest default-on pipeline inflation guard (H1) (#5527) Integrated into release/v3.8.42 (round 3). T02 pipeline inflation guard * feat(compression): T05/C2 — caveman dedup + ultra packs for de, fr, ja (#5529) Integrated into release/v3.8.42 (round 3). T05/C2 caveman packs de/fr/ja * feat(compression): T05/C6 — Chinese (zh / wenyan) caveman pack + detection (#5532) Integrated into release/v3.8.42 (round 3). T05/C6 zh/wenyan pack + detection * feat(compression): T07/R9 — gradle + dotnet RTK catalog filters (#5537) Integrated into release/v3.8.42 (round 3). 
T07/R9 RTK gradle+dotnet filters * refactor(dashboard): T11 — drop duplicate caveman on/off toggle from the compression settings tab (#5524) Integrated into release/v3.8.42 (round 3). T11 consolidate duplicate caveman controls; i18n'd the panel hint string (source key). * test relay routing fallback headers (#5526) Integrated into release/v3.8.42 (round 3). Relay fallback header extraction + tests (drift-shed: dependabot #5415 commit dropped). * fix(opencode-plugin): bump to 0.2.0 + auto-publish on release (#5363) - Bump @omniroute/opencode-plugin from 0.1.0 to 0.2.0 so CI publishes the accumulated fixes (auto combos, schema fields, debug logging) that were merged after the initial 0.1.0 publish on May 24. - Add auto-bump step in npm-publish.yml: detects if the plugin dir changed since the last release tag and auto-increments patch version, so the plugin never falls behind again on future releases. Co-authored-by: herjarsa <herjarsa@users.noreply.github.com> * [codex] add bifrost auto fallback cooldown (#5519) Integrated into release/v3.8.42 (round 3). Bifrost auto fallback cooldown; header reconciled with #5526 helper + env-doc. Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> * fix onboarding schema client import (#5525) Integrated into release/v3.8.42 (round 3). Browser-safe onboarding schema import (drift-shed: dependabot #5415 dropped). * docs: add relay backend strategy guide (#5547) Port #5533 relay strategy guide to release/v3.8.42 (doc-only). * fix(chatgpt-web): support GPT-5.5 Pro handoff (#5536) Integrated into release/v3.8.42 (round 3). GPT-5.5 Pro async stream_handoff support (drift-shed: dependabot #5415 dropped). * fix(providers): persist Configured filter across page reloads (#5510) Integrated into release/v3.8.42 (round 3). Persist Configured filter across reloads; extracted shouldSyncProviderDisplayMode race guard + TDD test (Closes #4059). 
Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> * fix(mimocode): route per-account traffic through SOCKS5 proxy dispatchers (#5521) Integrated into release/v3.8.42 (round 3). Per-account SOCKS5 dispatcher routing — completes #3837's stored proxy config with the actual undici dispatcher layer. Rebased onto .42 (dropped the CI-workflow-deletion commits; merged proxyUrlMap dispatch with #3837's acct.proxy storage). Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> * fix(chatgpt-web): portable SHA3-512 for sentinel PoW under Electron/BoringSSL (#5531) (#5540) * fix(build): keep ioredis out of the client/CLI bundle via SPAWN_CAPABLE_PREFIXES leaf (#5546) Fix the dast-smoke ioredis client-bundle regression (proven: dast-smoke green). Remaining reds are pre-existing base-reds/flakes (base.ts file-size, GOLDEN provider drift, shard-1 compression flakes) inherited by all PRs — not from this change. * chore(release): finalize v3.8.42 CHANGELOG + cycle-close reconciliation - Reconcile CHANGELOG.md for v3.8.42: 40 bullets covering all 89 commits since v3.8.41 (4 features, 26 fixes, 10 maintenance incl. 2 rollups for the 35-PR dead-code sweep + 17-PR DRY consolidation), dedup the merge- artifact duplicate New Features headers, set release date 2026-06-30. - Sync 42 docs/i18n//CHANGELOG.md mirrors. - Document 3 new chatgpt-web/TLS env vars in .env.example + ENVIRONMENT.md (OMNIROUTE_CGPT_WEB_PRO_TIMEOUT_MS, _PRO_POLL_INTERVAL_MS, OMNIROUTE_CHATGPT_STREAM_FIRST_BYTE_TIMEOUT_MS). - Cycle-close ratchet rebaselines: eslintWarnings 4116->4121, file-size base.ts/chatgpt-web.ts/strategySelector.ts/chatgpt-web.test.ts (all inherited drift, justified inline). - Regenerate provider translate-path golden snapshot for the merged bytez/friendliai/novita endpoint fixes. chore(changelog): cover #5415 dev-deps bump merged from main The release/v3.8.42 ↔ main merge (c4c1b56ba) brought #5415 (development dependency group, 9 updates) and #5533 (relay backend guide) from main. 
#5533's content is already covered by the #5547 port bullet; add a Maintenance bullet for #5415 and re-sync the 42 i18n CHANGELOG mirrors. * test: relocate 2 orphaned test files to collected runner paths check:test-discovery flagged two cycle-merged tests that no runner collects (they never ran → false coverage confidence): - compression-settings-tab-consolidation.test.tsx (#5524) → tests/unit/ui/ (vitest UI runner collects tests/unit/ui/*/.test.tsx); 3/3 pass. - providers/providerPageStorage.test.ts (#5510) → tests/unit/dashboard/ ('providers' is not a collected subdir; 'dashboard' is, same ../../../ import depth); 30/30 pass under the node runner. Both confirmed green when actually executed; no assertions weakened. * fix(release): repair inherited base-red tests from #5480/#5527/#5427/#5521 The fast-path (PR->release/*) does not run the full unit+integration suites, so four merged feature PRs shipped with stale/incorrect tests that only surface on the release PR (PR->main). Repairs (features are correct; align tests to the new behavior — no assertions weakened): - #5480 (gate claude adaptive thinking): adaptive thinking is now injected only for a real Claude Code client (x-app:cli / claude-code UA), not for any bare Claude OAuth token. claude-thinking-tool-choice-guard + base-thinking-budget-5312 now identify as a Claude Code client to exercise the adaptive path (3 tests). - #5527 (T02 inflation guard): the guard reverts a stacked body that did not shrink in tokens. The bail-out/advancement fixtures used growth-appending mock engines; they now carry a droppable padding message the engines empty, so the body realistically shrinks and the marker assertions survive. bailout (5), stacked-async (3), engine-enabled-toggle (2). - #5427 (render onboarding wizard at /providers/new): integration-wiring asserted the old redirect stub; now asserts the route renders ProviderOnboardingWizard. 
- #5521 (mimocode SOCKS5 per-account proxy): the constructor's default account omitted the proxy field (undefined), breaking the 'all proxies null' backward compat guard. Default it to null, mirroring syncAccountsFromCredentials(). fix(proxyfetch): skip fallback for non-replayable bodies --------- Co-authored-by: Diego Rodrigues de Sa e Souza <diegosouza.pw@gmail.com> Co-authored-by: Jan Leon <Jan.gaschler@gmail.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Randi <55005611+rdself@users.noreply.github.com> Co-authored-by: Diego Rodrigues de Sa e Souza <8016841+diegosouzapw@users.noreply.github.com> Co-authored-by: KooshaPari <42529354+KooshaPari@users.noreply.github.com> Co-authored-by: KooshaPari <koosha@example.com> Co-authored-by: backryun <bakryun0718@proton.me> Co-authored-by: Hernan Javier Ardila Sanchez <hjasgr@gmail.com> Co-authored-by: herjarsa <herjarsa@users.noreply.github.com> Co-authored-by: Arthur Bodera <abodera@gmail.com> Co-authored-by: PizzaV <103120356+pizzav-xyz@users.noreply.github.com> Co-authored-by: OpenClaw Auto <openclaw-auto@example.invalid> * Move CLI profile sync toggles to CLI Code (#5778) * move CLI profile sync toggles to CLI Code * test CLI profile auto-sync toggles * Document CLI profile auto-sync flags * docs(changelog): note CLI profile auto-sync card moved to CLI Code (#5778) --------- Co-authored-by: Diego Rodrigues de Sa e Souza <diegosouza.pw@gmail.com> * fix(grok-cli): parse expires_at from auth.json and exp from JWT to fix auto-refresh (#5775) * fix(grok-cli): parse expires_at from auth.json and exp from JWT to fix auto-refresh * docs(changelog): note grok-cli token auto-refresh fix (#5775) --------- Co-authored-by: Diego Rodrigues de Sa e Souza <diegosouza.pw@gmail.com> * fix(providers): import intentional local-catalog-only providers instead of 502 (#5460, #5465) (#5787) The model-sync route returned a hard 502 ('Remote model discovery failed; local catalog 
fallback not synced') for every provider whose local catalog is its ONLY discovery source (Reka #5460, t3.chat #5465, embedding/rerank like voyage-ai/jina-ai, Qwen-OAuth, and web-cookie providers). The /models route now flags catalogs that are the provider's intended source (no remote /models endpoint) with intentional:true; model-sync imports those instead of 502-ing, while a genuinely degraded remote fallback still surfaces. New dependency-free leaf degradedLocalCatalog.ts. Also fixes t3.chat's confusing add-credential hint: it no longer renders the circular 'Required cookie: convex-session-id + Cookie header...' copy and wires the step-by-step DevTools hint (t3ChatWebCookieHint) already translated in every locale. Regression guards: tests/unit/sync-models-degraded-local-catalog-5460-5465.test.ts, tests/unit/t3chat-web-cookie-hint-5465.test.ts, + intentional-flag assertions in tests/unit/provider-models-route.test.ts. * fix(api): self-hydrate model aliases from DB on GET after restart (#5777) * Fix grammatical errors in readme (#5738) * fix(api): self-hydrate model aliases from DB on GET when in-memory state is empty In the standalone production build, webpack creates two separate copies of modelDeprecation.ts — one hydrated by the startup path (used for request routing) and one used by the /api/settings/model-aliases API route. The API route's copy starts with an empty _customAliases after each server restart, causing the Settings → Routing UI to show 'No exact-match aliases configured' even though the aliases are persisted in the DB. The GET handler now detects an empty _customAliases state and reads the modelAliases key from the settings blob in the DB, calling setCustomAliases() to hydrate this module instance. This is a best-effort fallback — when _customAliases is already populated (e.g. by the startup path in dev mode), no DB read occurs. 
Regression test: tests/unit/model-aliases-settings-route-selfheal.test.ts - Verifies hydration from DB when in-memory state is empty - Verifies no hydration when in-memory state is already populated - Verifies graceful handling when no modelAliases exist in DB --------- Co-authored-by: Chirag Singhal <76880977+chirag127@users.noreply.github.com> Co-authored-by: marcelpeterson <marcelpeterson@users.noreply.github.com> Co-authored-by: Diego Rodrigues de Sa e Souza <diegosouza.pw@gmail.com> * refactor(usage): extract 5 provider usage families into leaves (#5782) Split open-sse/services/usage.ts (1723 -> 901 LOC) by moving the Cursor, Kimi, Codex, Claude and Kiro usage-fetcher families into cohesive leaves under open-sse/services/usage/ (mirroring the existing glm/minimax/antigravity/quota/ scalars leaves): - usage/cursor.ts getCursorUsage (+ CURSOR_USAGE_CONFIG, decodeCursorJwtSub) - usage/kimi.ts getKimiUsage (+ KIMI_CONFIG, getKimiPlanName) - usage/codex.ts getCodexUsage (+ CODEX_CONFIG) - usage/claude.ts getClaudeUsage / getClaudePlanLabel (+ CLAUDE_CONFIG, legacy) - usage/kiro.ts getKiroUsage / buildKiroUsageResult / discoverKiroProfileArn (+ helpers) The host keeps the getUsageForProvider dispatcher and imports the fetchers back; the public export set is unchanged — buildKiroUsageResult + discoverKiroProfileArn are re-exported from the kiro leaf (the kiro-* tests import them from services/usage) and __testing stays wired to the moved claude/kiro internals. Bodies are verbatim: the code-line multiset of host + leaves equals the original. Adds tests/unit/usage-families-split.test.ts pinning the leaf surface, the kiro re-export identity, the __testing wiring, and getClaudePlanLabel's pure logic. * chore(docs): sync i18n CHANGELOG mirrors with root [3.8.43] section (#5789) Regenerate the docs/i18n/<locale>/CHANGELOG.md [3.8.43] blocks from the root CHANGELOG so the mirror body size returns within the 25% docs-sync tolerance. 
Clears a pre-existing release-time drift (mirrors were ~26% smaller than root) that was failing check-docs-sync and blocking every local commit on the release branch. * fix(providers): correct stale/broken provider metadata (#5487, #5461, #5534, #5470) (#5790) - #5487 Qoder: replace the untranslated i18n stubs (personalAccessTokenLabel, qoderPatHint, qoderPatPlaceholder) with real copy; extend the STUB_KEYS guard. - #5461 Scaleway: website pointed at scaleway.com/en/ai/generative-apis (HTTP 404); repoint at the live docs URL /en/docs/ai-data/generative-apis/. - #5534 Microsoft 365 Copilot: rewrite the vague authHint with concrete DevTools WebSocket steps (the token lives on the Chathub WS URL, not an Authorization header). - #5470 Together AI: retired the $25 signup credit and is now fully prepaid (min $5); hasFree false + a prepaid notice instead of the stale free-tier freeNote (verified live). Regression guards: tests/unit/provider-metadata-5461-5470-5534.test.ts + Qoder keys added to tests/unit/provider-add-ux-i18n-import-warning.test.ts. * fix(dashboard): neutral badge for unsupported validation + clickable OAuth error links (#5442, #5486) (#5795) - #5442 LMArena (and any provider with no live validator) returns { unsupported: true } from /api/providers/validate and Save succeeds, but the Add-API-Key modal only had success/failed states so it rendered a red 'Invalid' badge. Add an 'unsupported' result → neutral info 'N/A' badge via the pure leaf validationBadgeProps(); both validate handlers now map data.unsupported to it. - #5486 GitLab Duo's OAuth setup error embeds a registration URL (gitlab.com/-/profile/applications) but the OAuth error step rendered it as dead red text. New LinkifiedText component (+ pure ReDoS-safe linkify util) makes any http(s) URL in an OAuth error clickable; the GitLab Duo backend message already carries the full setup steps. 
Regression guards: tests/unit/validation-badge-unsupported-5442.test.ts, tests/unit/oauth-error-linkify-5486.test.ts. Frozen god-files kept within cap (AddApiKeyModal 868/868, OAuthModal 968/969). * fix(system): route in-app auto-update npm calls through the win32 shell helper (#5542) (#5797) The in-app auto-update flow called execFileAsync("npm", ...) directly for the version lookup (versionCheck.getLatestVersionFromNpmCli), dependency install, global install, and native rebuild. On Windows npm is npm.cmd and Node >=24 refuses to execFile a .cmd without a shell (nodejs/node#52554), so those calls threw 'spawn npm ENOENT'. Route them through buildNpmExecOptions (the same win32-shell helper the embedded-services installer uses, fix #5379). The global install spec is validated with SERVICE_VERSION_PATTERN before it is shell-joined (Hard Rule #13). Not the pnpm/npx swap the issue proposed — that is the wrong direction for an 'npm install -g' flow already solved elsewhere in-repo. Regression guard: tests/unit/autoupdate-npm-win32-5542.test.ts. * refactor(sse): extract cursor protobuf wire primitives into a leaf (#5794) Split open-sse/utils/cursorAgentProtobuf.ts (1520 -> 1400 LOC) by moving the low-level protobuf wire-format primitives — varint/tag/length-delimited encode+ decode + the generic field walker (encodeVarint, encodeTag, encodeBytes, encodeString, encodeMessage, encode{UInt32,Bool,Double}Field, decodeVarint, checkedLen, decodeFields, findField, decode{String,Varint}Field, the Field type and the WT_VARINT/WT_LEN wire-type constants) — into cursorAgentProtobuf/wire.ts. These primitives were module-private, so the host's public API is unchanged; the host imports them back internally. Bodies are verbatim: the code-line multiset of host + wire.ts equals the original. 
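For readers unfamiliar with the wire primitives named in #5794, the varint pair can be sketched as below. This is a minimal illustration of the standard protobuf base-128 varint format, not the code in cursorAgentProtobuf/wire.ts: the function names are borrowed from the commit text, while the signatures, the return shape and the safe-integer limit are assumptions made for the example.

```typescript
// Hypothetical sketch of base-128 varint encode/decode.
// Signatures and limits are assumptions, not the real wire.ts API.

function encodeVarint(value: number): Uint8Array {
  if (!Number.isSafeInteger(value) || value < 0) {
    throw new RangeError("sketch only handles non-negative safe integers");
  }
  const out: number[] = [];
  let rest = value;
  while (rest >= 0x80) {
    out.push((rest % 0x80) | 0x80); // low 7 bits with the continuation bit set
    rest = Math.floor(rest / 0x80);
  }
  out.push(rest); // final byte, continuation bit clear
  return Uint8Array.from(out);
}

function decodeVarint(
  buf: Uint8Array,
  offset = 0,
): { value: number; next: number } {
  let value = 0;
  let scale = 1;
  for (let i = offset; i < buf.length; i++) {
    const byte = buf[i];
    value += (byte & 0x7f) * scale;
    if ((byte & 0x80) === 0) {
      return { value, next: i + 1 }; // next = offset of the following field
    }
    scale *= 0x80;
    if (scale > Number.MAX_SAFE_INTEGER) break; // too long for this sketch
  }
  // Buffer-overrun guard: ran out of bytes before the terminating byte.
  throw new RangeError("truncated or oversized varint");
}
```

Encoding 300 produces the two bytes 0xAC 0x02, and decoding those bytes returns 300 with `next` equal to 2. A buffer that ends while the continuation bit is still set throws instead of reading past the end.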
First layer of the codec decomposition — the value/framing codec and the message encoders/decoders build on this and stay in the host (they share host-retained helpers; splitting them is a separate step). Adds tests/unit/cursor-protobuf-wire-split.test.ts pinning the leaf surface, the encode/decode round-trip invariants, the buffer-overrun guard, and the host wiring. * test(runtime): guard tsx/esm→esbuild transform path on boot (#5757) (#5773) #5757 reported that a fresh `npm install omniroute` pulls `esbuild@0.28.1` transitively via `tsx` (a runtime dependency the CLI registers at boot in `bin/omniroute.mjs`), and proposed forcing `esbuild@0.27.4`. That override is unsafe: `tsx@4.22.4` requires `esbuild@~0.28.0` and `fumadocs-mdx@15` (also a runtime dep) requires `esbuild@^0.28.0`; forcing 0.27.x pushes esbuild below both, and 0.28.1 is currently the latest release. The reported transform failure also does not reproduce — OmniRoute targets ES2022, its minimum supported Node is 22.2 (destructuring is native), and tsx targets the running Node, so esbuild never lowers to an unsupported target. Instead of an unsafe version pin, add two regression guards: - functional: spawn the real `node --import tsx/esm` loader on a fixture packed with modern syntax (destructuring/spread, class+private fields, optional chaining, nullish, logical assignment, async + top-level await) and assert it transforms + runs correctly. Fails if a future esbuild regresses the boot path. - dependency-shape: assert the resolved esbuild stays within tsx's declared range, so nobody reintroduces the out-of-range override this issue proposed. No production code changed; no esbuild version pinned. * fix(deps): add missing runtime deps @toon-format/toon and safe-regex (#5771) Both packages are imported at runtime but were only declared for their type shims (safe-regex was via @types/safe-regex; @toon-format/toon had no declaration at all). 
Missing runtime deps mean: - open-sse/services/compression/engines/headroom/toon.ts imports @toon-format/toon → MODULE_NOT_FOUND on cold pnpm/npm install - open-sse/services/compression/engines/ccr/ccrQuery.ts imports safe-regex → MODULE_NOT_FOUND Both engines are wired into the stacked compression pipeline (default enabled), so a fresh clone that does not have a stale node_modules from a previous version crashes as soon as the pipeline runs. Verified with pnpm ls / grep before/after. * fix(oauth): clamp grok-cli expired-token expiresIn to a positive value (#5775 follow-up) (#5820) An already-expired grok-cli token (real expires_at/exp in the past) produced a negative expiresIn, which is truthy in the import-token route and maps to a PAST expiresAt — AutoCombo then reads that as 'already expired' and excludes the connection instead of refreshing it. Clamp with Math.max(1, expiresIn) so an expired token is treated as due-for-refresh. Extends #5775 (thanks @Chewji9875). Regression: 2 new cases in tests/unit/grok-cli-oauth.test.ts (expired JWT exp + expired JSON expires_at), both failing-then-passing. * fix(model-aliases): back custom-alias store with globalThis (#5777 follow-up) (#5821) #5777 self-healed the GET /api/settings/model-aliases symptom at the route layer, but the root cause remained: modelDeprecation.ts held _customAliases in a plain module-level let, which webpack duplicates across the startup and app-route module graphs (same class as #5312). Startup hydration landed on one copy; the API route read the other (empty) one. Back the store with globalThis (__omniroute_customAliases__) so both instances share one store — the exact pattern already used by thinkingBudget.ts/backgroundTaskDetector.ts (#5312). The route-layer DB self-heal from #5777 stays as a harmless fallback. Extends #5777 (thanks @jleonar2). 
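The globalThis-backed store described for #5821 can be sketched roughly as follows. Only the key name `__omniroute_customAliases__` and the `setCustomAliases` name come from the commit text; the store shape, the `aliasStore` and `getCustomAliases` helpers and the alias map type are assumptions made for illustration, not the actual modelDeprecation.ts code.

```typescript
// Hypothetical sketch: a module-level store that survives bundler duplication
// by living on globalThis instead of in a plain module-level `let`.

type AliasMap = Record<string, string>;
type AliasSlot = { aliases: AliasMap };

const STORE_KEY = "__omniroute_customAliases__";

function aliasStore(): AliasSlot {
  const g = globalThis as unknown as Record<string, AliasSlot | undefined>;
  // The first caller creates the slot; every other copy of this module
  // (for example one per bundle graph) finds and reuses the same object.
  return (g[STORE_KEY] ??= { aliases: {} });
}

function setCustomAliases(next: AliasMap): void {
  aliasStore().aliases = { ...next };
}

function getCustomAliases(): AliasMap {
  return aliasStore().aliases;
}
```

Because both functions resolve the slot through globalThis on every call, a write made by one duplicated module instance is visible to a read made by another, which is the property the regression test for this fix is described as checking.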
Regression: tests/unit/model-aliases-globalthis-5777.test.ts (fails on the plain-let store: never populates globalThis, never reads a sibling instance's write). * chore(release): rebaseline file-size + test-masking ratchets for v3.8.43 (#5609) Accumulated drift from the 109 commits of the v3.8.43 cycle (the fast-gate PR->release does not run check:file-size/test-masking; base-reds only surface on the release PR): - file-size: 8 existing god-files grew + 2 new files above the cap + 4 test files grew -> frozen baseline adjusted to the current state. - test-masking: chatgpt-web.test.ts 281->280 asserts allowlisted (#5549 consolidated 2 assert.equal calls into a single map-driven one; legitimate refactor, not masking). God-file modularization deferred (#3501). * refactor(sse): extract openai-to-gemini pure helpers into a leaf (#5824) Split open-sse/translator/request/openai-to-gemini.ts (873 -> 756 LOC, back under the 800-line cap) by moving the module-private pure helpers into openai-to-gemini/helpers.ts: the historical-tool-context string builders (stringifyHistoricalToolArguments, buildInertHistorical, escapeHistoricalContext, buildHistoricalToolResultContext), deepCleanUndefined, extractClientThoughtSignature, buildChangedToolNameMap, isVertexGeminiProvider, and applyAntigravityGenerationDefaults (with its GeminiGenerationConfig shape). These were module-private, so the translator's public API is unchanged; the host imports them back internally. Bodies are verbatim: the code-line multiset of host + leaf equals the original. Adds tests/unit/openai-to-gemini-helpers-split.test.ts pinning the leaf's pure behaviour (escaping, undefined-pruning, signature extraction, antigravity generation-config defaults) and the host wiring. 
* fix(db): re-export modelContextOverrides from localDb (check:db-rules #5609) * test(discovery): wire tests/unit/memory into node runner glob (#5609) typed-decay.test.ts (TV6 typed memory decay, 15 asserts) sat in tests/unit/memory/ which no runner glob collected -> orphan (never ran). Adds 'memory' to the subdir brace-glob in all runner sources (package.json scripts + ci.yml shards) and the COLLECTORS mirror in check-test-discovery.mjs (drift-check keeps them in sync). Passes standalone (15/15); DATA_DIR isolation handled per-file by tests/_setup/isolateDataDir.ts. * test: align 3 stale release tests to landed behavior (#5609) Base-reds surfaced on the release PR (fast-gate PR->release skips these shards): - api-manager-page-static: Self-service Visibility now has 5 switches (added the API-key provider quota-policy bypass toggle, #5731); bump inventory 4->5 while keeping the invariant that every switch declares type=button (verified 5/5 typed). - security-hardening (callLogs PII): #5725 extracted sanitizeErrorForLog into callLogs/format.ts; assert the new wiring (callLogs imports it + format.ts imports piiSanitizer) instead of the removed direct import — PII sanitization still intact. - memory-glm-injection: #5610 made GLM 5.1+ ACCEPT the system role (z.ai docs), so glm-5.1 must PRESERVE system, not fold it. Flip the stale #1701-era assertion. * test(shared): align t3-web web-session expected metadata with hintKey (#5835) The t3-web provider metadata intentionally carries `hintKey: "t3ChatWebCookieHint"` (#5465 — the generic cookie hint reads circular for t3.chat), but the metadata assertion in web-session-credentials was never updated, so it deep-equals against an object missing the field. This is a stale-test base-red on release/v3.8.43 that turns the whole PR queue's "Unit Tests fast-path (1/2)" red. Align the expected object to the shipped source of truth. 
* test(compression): de-flake rtk_discover sample seeding seedSamples() persisted two byte-identical raw outputs. The raw-output filename is keyed on Date.now() (ms) + a content hash (rawOutput.ts), so two identical captures landing in the same millisecond collapse to one file (the 2nd write overwrites the 1st) -> sampleCount 1 instead of 2. Reproduced at ~25% (501/2000 trials), matching the intermittent Coverage Shard (5/8) failure on fast CI runners. Seed two DISTINCT captures so the store deterministically holds 2 samples regardless of timing (0/2000 collisions after the change). * test(e2e): anchor compression-studio smoke on play-input, not async play-lane The T03 smoke asserted `play-lane` visible on mount, but those per-lane buttons only render after a preview-compression run populates `batch.lanes` (usePreviewCompression keeps batch null until run(); there is no mount auto-run). The smoke intentionally does not drive a compression cascade, so `play-lane` can never appear -> the E2E added in #5727 failed all 3 retries (E2E Tests 4/9). Anchor on the always-present `play-input` panel, which proves the studio body mounted without needing async lane data. * fix(security): explicit http(s) scheme allowlist in linkifyText href CodeQL flagged the <a href> in LinkifiedText (#5486) with js/xss (high) and js/client-side-unvalidated-url-redirection (medium) because href traces back to user-provided text. URL_RE already requires an http(s):// prefix, so a javascript:/data: scheme can never reach href — but that guarantee was only implied by the regex. Validate the scheme explicitly via new URL().protocol before exposing href (non-http(s) degrades to plain text): defense-in-depth that also makes the sink provably safe to static analysis. Regression test added. 
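The explicit scheme check described in the linkifyText fix above can be sketched as follows. The helper name `safeHttpHref` and its return convention are hypothetical; only the idea of validating via `new URL().protocol` and degrading non-http(s) input to plain text is taken from the commit text.

```typescript
// Hypothetical sketch: return a value that is safe to place in <a href>,
// or null when the caller should render the text without a link.

function safeHttpHref(candidate: string): string | null {
  try {
    const { protocol } = new URL(candidate);
    // Explicit allowlist: anything other than http(s) is never linked.
    return protocol === "http:" || protocol === "https:" ? candidate : null;
  } catch {
    // Not parseable as an absolute URL, so treat it as plain text.
    return null;
  }
}
```

A caller would render a link only when the helper returns a string. Under this sketch `javascript:` and `data:` inputs return null even if an upstream regex were later loosened, which is the defense-in-depth property the commit describes.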
* fix(ci): register mark-account-unavailable test in stryker tap.testFiles check:mutation-test-coverage --strict (Fast Quality Gates) flagged tests/unit/mark-account-unavailable-numeric-epoch-guard.test.ts as a covering unit test missing from stryker.conf.json tap.testFiles, so its mutant kills would not count (--strict). Add it. Pre-existing tap.testFiles drift on the release tip that fails Fast Quality Gates on every PR into release/v3.8.43, not just this branch. * chore(release): rebaseline eslintWarnings ratchet 4121->4158 (v3.8.43 cycle drift) * chore(release): rebaseline complexity 1981->1982 + cognitive-complexity 842->845 (v3.8.43 cycle drift) * chore(release): rebaseline deadExports 225->227 (v3.8.43 cycle drift) * fix(dashboard): add error boundaries for Combos and MITM Proxy pages (#5788) Integrated into release/v3.8.43 * fix(cli): rename process title to omniroute (#5791) Integrated into release/v3.8.43 * fix(providers): add claude-sonnet-5 to Kiro model catalog (#5796) Integrated into release/v3.8.43 * fix(kiro): bound Claude id dash->dot minor group to protect date-suffixed ids (#5825) Integrated into release/v3.8.43 * fix(db): allowlist modelContextOverrides as intentionally-internal to green release DB-rules gate (#5798) (#5827) Integrated into release/v3.8.43 * fix(sse): stop reasoning-summary drop + duplicated deltas on claude→codex streaming (#5786) (#5832) Integrated into release/v3.8.43 * fix(dashboard): guard null modelAliases values in model picker (#5792) Integrated into release/v3.8.43 * fix(github): drop trailing assistant prefill for Copilot chat (#5802) Integrated into release/v3.8.43 * fix(oauth): disambiguate OAuth connections on username to prevent cross-IdP overwrites (#5803) Integrated into release/v3.8.43 * fix(translator): strip orphaned tool results across request formats (#5805) Integrated into release/v3.8.43 * fix(kiro): stop injecting placeholder user turn on tool-result turns (#5807) Integrated into release/v3.8.43 * 
fix(mitm): clean up privileged hosts entries on exit when possible (#5808) Integrated into release/v3.8.43 * fix(translator): prevent doubled tool args in OpenAI-to-Claude (#5828) Integrated into release/v3.8.43 * fix(usage): keep tool definitions visible when request log is truncated (#5829) Integrated into release/v3.8.43 * fix(db): preserve healthCheckInterval=0 across create/update (#5822) Integrated into release/v3.8.43 * fix: unify dashboard csrf origin fallback (#5856) Integrated into release/v3.8.43 * fix(kimi-web): migrate to www.kimi.com Connect-RPC API (kimi.moonshot.cn retired) (#5858) Integrated into release/v3.8.43 * fix(qwen-web): unblock validator + chat completion (retired endpoint + missing SPA version header) (#5855) Integrated into release/v3.8.43 * fix(antigravity): 429 hang on credit exhaustion and precise reset time lockout (Cleaned) (#5846) Integrated into release/v3.8.43 * fix(cli): correct rootDir resolution in doctor.mjs on Windows (#5844) (#5845) Integrated into release/v3.8.43 * Show startup time in ready banner (#5799) Integrated into release/v3.8.43 * extracted CorrelationId observability changes from #5275 (#5834) Integrated into release/v3.8.43 * refactor(executors): deduplicate shared utilities and add comprehensive tests (#5720) Integrated into release/v3.8.43 * Harden provider node URL validation (#5760) Integrated into release/v3.8.43 * [codex] Tune adaptive stream readiness timeouts (#5767) Integrated into release/v3.8.43 * fix: restore om-usage HTTP endpoint (#5859) Integrated into release/v3.8.43 * fix(sse): strip zero-width markers from streamed responses (parity with non-streaming) (#5857) Integrated into release/v3.8.43 * [codex] Protect long-running agent goal streams (#5772) Integrated into release/v3.8.43 * refactor(oauth): remove dead legacy OAuth service classes (#5838) The src/lib/oauth/services/ service-class hierarchy is superseded — the live OAuth flow runs through src/lib/oauth/providers.ts + providers/. 
The old per-provider 'class Service extends OAuthService' implementations and their barrel had zero production or test references. Removed oauth/openai/github/claude/codex/antigravity/ qwen/qoder + the index barrel (-1559 LOC). Kept kiro.ts, cursor.ts, codexImport.ts (routes import them directly by path, never via the deleted barrel). Proven safe by typecheck:core staying green (a live reference would fail the build) + a filesystem guard test pinning the removal. Salvage of closed PR #5039. gaps v3.8.42 - T10 (5.7). chore(docs): scope release-freeze to /generate-release only (Hard Rule #21) (#5839) A freeze is authorized ONLY inside /generate-release (raised Phase 0a, lifted Phase 12c). No campaign/session/agent may open a release-freeze mid-development; if one is ever unavoidable outside the release flow it must be requested from the operator in chat first with an explicit "estou criando um freeze" alert. Also codifies: never lift an active captain freeze to unblock campaign merges (it auto-lifts at 12c). * fix(chat): preserve JSON default when stream is omitted (#5866) * fix(chat): preserve JSON default when stream is omitted * chore(chat): type route record guard * fix(api): gate early SSE keepalive on explicit stream intent, keep body untouched Remove the stream:false body normalization so the legacy streaming default (resolveStreamFlag) and the per-key streamDefaultMode json opt-in keep deciding the response framing; the keepalive wrapper is only applied when stream:true is explicit or Accept forces SSE. Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> --------- Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> * feat(usage): report usage command quotas as percentages + honor observed provider quota resets (#5874) * feat: report usage command quotas as percentages Convert @@om-usage and the HTTP usage endpoint to report personal API key quotas as remaining percentages while keeping USD amounts out of the command output. 
Scale provider quota remaining percentages by the configured quota cutoff so the protected reserve reads as 0% left. Restore provider USD cost drilldown in the quota dashboard. Also sync the 3.8.43 i18n changelog mirrors so the docs-sync pre-commit gate remains green. Tests: DISABLE_SQLITE_AUTO_BACKUP=true node --import tsx/esm --test tests/unit/internal-usage-command.test.ts; DISABLE_SQLITE_AUTO_BACKUP=true node --import tsx/esm --test tests/unit/api-key-usage-limits.test.ts; DISABLE_SQLITE_AUTO_BACKUP=true node --import tsx/esm --test tests/unit/provider-window-costs.test.ts; DISABLE_SQLITE_AUTO_BACKUP=true node --import tsx/esm --test tests/unit/api-manager-usage-command.test.ts tests/unit/apikeys-usage-command.test.ts; npx eslint <changed files>; npm run typecheck:core; npm run build; npm run check:migration-numbering; npm run check:docs-sync; docker build --target runner-base (cherry picked from commit f66abd2028a40f2950613da97b8880adfded9db8) * fix: honor observed provider quota resets Detect same-resetAt quota resets when provider usage drops back to the reset floor, and prefer that observed snapshot over stale recorded weekly events for provider USD windows and API-key USD quotas. Tests: npx eslint changed files; npm run typecheck:core; DISABLE_SQLITE_AUTO_BACKUP=true node --import tsx/esm --test tests/unit/lib/quota-reset-events.test.ts tests/unit/provider-window-costs.test.ts tests/unit/api-key-usage-limits.test.ts; npm run build; docker build --target runner-base --build-arg OMNIROUTE_BUILD_MEMORY_MB=4096 -t omniroute:quota-reset-window-20260702002300 . 
(cherry picked from commit 39c12a6f17995e3c797456fa1611075050f89aaf) * docs(changelog): credit usage quota percentages extraction from #5863 Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> --------- Co-authored-by: Wital <wital@example.com> * fix(github): keep Copilot access-token sessions active (#5875) * fix(github): keep Copilot access-token sessions active GitHub Copilot device-flow accounts may have a GitHub access token and short-lived Copilot token without a refresh token. The proactive health check was treating that as terminal no_refresh_token and marking the connection expired minutes after login. Keep those sessions active, clear stale no_refresh_token state, and refresh the Copilot sub-token when needed. Tests: npx eslint src/lib/tokenHealthCheck.ts tests/unit/token-health-no-refresh-token-expired-5326.test.ts; DISABLE_SQLITE_AUTO_BACKUP=true node --import tsx/esm --test tests/unit/token-health-no-refresh-token-expired-5326.test.ts tests/unit/token-health-check.test.ts tests/unit/token-health-check-circuit-breaker.test.ts tests/unit/token-refresh-service.test.ts tests/unit/token-refresh-route-service.test.ts tests/unit/executor-github.test.ts; npm run typecheck:core; npm run build (cherry picked from commit 68095d4796ce0ab9c1c8921bbcddbcf1cb62f2b1) * docs(changelog): credit Copilot token-health fix extraction from #5863 Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> --------- Co-authored-by: Wital <wital@example.com> * feat: add NEXT_PUBLIC_LIVE_WS_PUBLIC_URL for custom domain WebSocket support (#5878) * docs: add ai_features scope to GitLab Duo OAuth env setup instructions * docs: add LIVE_WS_ALLOWED_HOSTS env var to example config for LAN/Tailscale setups * feat: add web socket public URL for reverse proxy/Cloudflare Tunnel WebSocket setups * fix(dashboard): resolve live WS public URL at runtime via handshake with scheme validation - Read NEXT_PUBLIC_LIVE_WS_PUBLIC_URL lazily in /api/v1/ws (function, not module-level const) 
so runtime env changes are honored in prebuilt images. - Only echo/consume publicUrl when it is a ws:// or wss:// URL (server and client guards); anything else is rejected to null. - useLiveDashboard now fetches /api/v1/ws?handshake=1 before connecting and prefers: explicit wsUrl > build-time env > handshake publicUrl > default. - Align GitLab Duo scopes line in .env.example with GITLAB_DUO_CONFIG.scope. - Extend tests: lazy env read + scheme validation cases. - CHANGELOG entry for 3.8.43. Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> --------- Co-authored-by: Septianata Rizky Pratama <ian.rizkypratama@gmail.com> * Add .editorconfig to improve repository standards (#5879) * chore(ci): pass sonar.projectVersion to SonarQube scan so the new-code baseline advances per release (#5880) * fix(dashboard): Modal — two-field auth (Token ID + Token Secret) (#5446) (#5881) * fix(dashboard): add Modal Token ID + Token Secret fields (#5446) Modal authenticates with a Token ID (ak-…) + Token Secret (as-…) pair sent as `Authorization: Bearer <TOKEN_ID>:<TOKEN_SECRET>`. The add-connection form only exposed a single API-key field, so users could not enter both credentials. Add a dedicated two-field form for the `modal` provider: the existing field is relabeled "Token ID" and a new "Token Secret" field is rendered below it. Both are combined into the single encrypted `apiKey` value via a new pure helper `combineModalCredential(id, secret)` → `id:secret`, so the generic bearer executor path emits `Bearer <id:secret>` with no registry/executor/DB changes. An empty secret returns the id verbatim, preserving the ability to paste a pre-combined `id:secret` into the single field. The field hint points to https://modal.com/settings → API Tokens. Registry (baseUrl/executor), DB schema, and the request-time header path are untouched — Modal remains bring-your-own-deploy. Tests: tests/unit/modal-credential-combine.test.ts (5, TDD). 
* docs(changelog): add v3.8.43 bullet for Modal two-field auth (#5446) * fix(mcp): forwarded caller auth wins over OMNIROUTE_API_KEY env fallback (#5819) (#5882) * fix(middleware): run operator hook code in hardened vm sandbox instead of new Function (#5872) (#5885) * fix(providers): include custom compatible providers in auto/ routing (#5873) (#5886) * fix(db): honor autoBackupEnabled setting for pre-write backups (#5871) (#5888) * fix(dashboard): gate Token Expired badge on terminal testStatus, not raw token expiry (#5836) (#5883) * docs: use pnpm --allow-build flag instead of unsupported approve-builds -g (#5554) (#5884) * fix(dashboard): pre-fill Modal Validation Model Id with the server probe model (#5446) (#5892) * fix(api): strip upstream x-middleware-* headers from proxied responses (#5849) (#5893) * fix(providers): restore codex inference for unprefixed gpt-5.5 on codex-only setups (#5887) (#5895) * test(autoCombo): stabilize model fitness source expectation (#5890) * test(autoCombo): make fitness source test stable against model caps * chore(ci): retrigger checks for PR 5890 * docs(changelog): add 3.8.43 bullet for the autoCombo fitness-source test stabilization (#5890) --------- Co-authored-by: kooshapari <kooshapari@users.noreply.github.com> Co-authored-by: Diego Rodrigues de Sa e Souza <diegosouza.pw@gmail.com> * docs(architecture): Router Backends & Embedded Services ADR (#5603) (#5891) * routing: add router backend registry * docs(architecture): add Router Backends & Embedded Services ADR (#5603) Document the two orthogonal axes that #5603 asked to clarify: an engine's lifecycle (in-process / supervised / external / disabled) vs the relay routing backend selection (ts / bifrost / auto). Anchors the ADR on the typed `src/domain/routing/routerBackends.ts` registry as the single source of truth, and captures the /api/services/* status-code contract (409/200/404/403/500 + the LOCAL_ONLY loopback guard) so dashboard errors are interpretable. 
Stacked on the router-backend-registry work so it documents a real contract. * docs(architecture): reduce ADR PR to docs-only — registry lands via #5868; describe adoption as tracked, not current * docs(changelog): add 3.8.43 bullet for the Router Backends ADR (#5891) --------- Co-authored-by: KooshaPari <kooshapari@gmail.com> * fix(ci): re-green release/v3.8.43 fast-gates — db-rules stale allowlist + 4 more base-reds (#5798) (#5896) * fix(db): remove stale modelContextOverrides allowlist entry from check:db-rules (#5798) * fix(ci): clear release/v3.8.43 fast-gates base-reds (env-docs, ADR refs, mutation-cov, ratchets) (#5798) * fix(sse): type-safe resolveBaseUrl/resolveEffectiveKey coercions in BaseExecutor (typecheck:core base-red, #5798) * chore(quality): freeze base.ts at post-typecheck-fix size (#5798) * fix(docs): add required MDX frontmatter to ROUTER_BACKENDS ADR (build base-red, #5798) * fix(image): keep bare gpt-5.5 codex mapping in image resolver (#5902) * fix: preserve codex bare image model over combo shadowing * docs(changelog): credit #5902 codex bare image alias fix * docs(changelog): restore #5902 bullet after merge auto-resolve --------- Co-authored-by: Diego Rodrigues de Sa e Souza <diegosouza.pw@gmail.com> * fix(providers): route OpenAI responses-only models to /v1/responses (#5842) (#5901) * fix(providers): route OpenAI responses-only models to /v1/responses (#5842) * docs(changelog): restore #5842 bullet after merge auto-resolve ate it * docs(changelog): keep #5842 bullet additive over release tip * chore(release): v3.8.43 — 2026-07-02 * chore(release): allowlist 3 verified-legitimate test-assert reductions (#5805/#5856/#5855) * chore(release): rebaseline file-size caps for base.ts + 2 aligned test files (v3.8.43 release-close) * docs(changelog): add v3.8.43 Contributors section + sync i18n mirrors --------- Co-authored-by: KooshaPari <42529354+KooshaPari@users.noreply.github.com> Co-authored-by: Arthur Bodera <abodera@gmail.com> 
Co-authored-by: Wahyu Hidayatulloh Pamungkas <87377496+Stazyu@users.noreply.github.com> Co-authored-by: skyzea1 <161649495+skyzea1@users.noreply.github.com> Co-authored-by: José Victor Ferreira <root@josevictor.me> Co-authored-by: Choti Wongbussakorn <126886556+Chewji9875@users.noreply.github.com> Co-authored-by: backryun <bakryun0718@proton.me> Co-authored-by: Jan Leon <Jan.gaschler@gmail.com> Co-authored-by: warelik <warelik@users.noreply.github.com> Co-authored-by: WITALO ROCHA <witalo_rocha@hotmail.com> Co-authored-by: Randi <55005611+rdself@users.noreply.github.com> Co-authored-by: Alex <alexgild@gmail.com> Co-authored-by: Chirag Singhal <76880977+chirag127@users.noreply.github.com> Co-authored-by: Ardem2025 <ardemb22@gmail.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: KooshaPari <koosha@example.com> Co-authored-by: Hernan Javier Ardila Sanchez <hjasgr@gmail.com> Co-authored-by: herjarsa <herjarsa@users.noreply.github.com> Co-authored-by: PizzaV <103120356+pizzav-xyz@users.noreply.github.com> Co-authored-by: OpenClaw Auto <openclaw-auto@example.invalid> Co-authored-by: jleonar2 <92810914+jleonar2@users.noreply.github.com> Co-authored-by: marcelpeterson <marcelpeterson@users.noreply.github.com> Co-authored-by: Yuan Li <atom.long@outlook.com> Co-authored-by: janeza2 <49841619+janeza2@users.noreply.github.com> Co-authored-by: Aris <arissunandar399@gmail.com> Co-authored-by: Isha Tiwari <156085572+ishatiwari21@users.noreply.github.com> Co-authored-by: Markus Hartung <mail@hartmark.se> Co-authored-by: Nguyen Minh <lop123thcs@gmail.com> Co-authored-by: Denis Kotsyuba <kocubads96@gmail.com> Co-authored-by: Wital <wital@example.com> Co-authored-by: Septianata Rizky Pratama <ian.rizkypratama@gmail.com> Co-authored-by: Shiva Vinodkumar <127319648+shiva24082@users.noreply.github.com> Co-authored-by: kooshapari <kooshapari@users.noreply.github.com> Co-authored-by: KooshaPari <kooshapari@gmail.com>	16 hours ago
CLAUDE.md	Release v3.8.6 (#2804) * fix(gemini): preserve structured tool calls for antigravity * fix(gemini): parse prefixed textual tool calls * fix(antigravity): preserve textual SSE tool calls * fix(stream): normalize textual passthrough tool calls * fix(stream): normalize split textual tool calls * fix(stream): suppress malformed textual tool calls * fix(stream): suppress compact malformed tool calls * fix(stream): emit structured textual tool calls * fix(stream): suppress unknown textual tool calls * fix(stream): normalize responses textual tool calls * chore: ignore .claude/settings.local.json (per-user Claude Code permissions) * fix(opencode-go): route qwen3.x via claude messages + repair fixMissingToolResponses for Claude-shape upstreams (#2791) Integrated into release/v3.8.6 * fix: resolve npm install warnings — remove dead deps, relax engine constraint (#2792) Integrated into release/v3.8.6 * fix: register missing web-cookie validators (claude-web, gemini-web, copilot-web, t3-web) (#2793) Integrated into release/v3.8.6 * fix: Error: Unable to inspect existing database #2771 (#2795) Integrated into release/v3.8.6 * fix(oauth): repair Google loopback callback flow (#2796) Integrated into release/v3.8.6 * feat(logs): add clean history button (#2799) Integrated into release/v3.8.6 * [codex] home: restore settings-driven home layout and quota auto-refresh (#2800) Integrated into release/v3.8.6 * fix(gemini): emit signaturelessToolCallMode:text for GEMINI format models (#2801) Integrated into release/v3.8.6 * feat(modelSpecs): align opencode-go family with upstream provider limits (#2802) Integrated into release/v3.8.6 * chore: apply unit test fixes, polyfills, and environment precedence fixes * docs(agents): update release and triage flows Expands the release workflows to include a security audit, a complete per-commit CHANGELOG, a mandatory quality gate, homologation on a local VPS, official publication, deployment to Akamai, and artifact validation. 
Reorganizes feature triage with permanent per-bucket files, support for in-progress items, a reclaim rule after 15 days, and new handling for cataloged viable ideas. Corrects the discussion-review guidance to use the actual chronological order of comments and replies when identifying the latest activity. * fix(lockout): classify Gemini Antigravity resource exhaustion as quota_exhausted * fix(reasoning): gate replay by interleaved field * docs(rule-16): permit human Co-authored-by, restrict only AI/bot trailers Rule #16 previously banned all `Co-Authored-By` trailers absolutely. That blocked the upstream-port workflows (`/port-upstream-features` and `/port-upstream-issues`), which must credit human upstream PR authors and issue reporters in OmniRoute commits. Refine the rule to ban only AI/bot-attributed trailers (Claude, GPT, Copilot, Bot; anthropic.com / openai.com / bot-owned noreply.github.com emails) while allowing standard human `Co-authored-by: Name <email>` attribution. Sync the rule across the source CLAUDE.md, the E2E shakedown doc note, and 41 i18n translations. * fix(gitlawb): add specialty validators for connection test — bypass /models probe GitLawB OpenGateway API (xiaomi-mimo compatible) does not expose a /models endpoint, causing validateOpenAILikeProvider to 404 on the initial probe and report 'Provider validation endpoint not supported'. Add specialty validators for both gitlawb and gitlawb-gmi that follow the same pattern as the existing xiaomi-mimo validator: skip GET /models, validate directly via POST /chat/completions with a minimal test message. Any 401/403 response means an invalid key; all other responses mean auth is OK. Fixes test-connection returning 404 for GitLawB providers. * test(gitlawb): add 12 unit tests for gitlawb and gitlawb-gmi specialty validators Covers success, auth failure (401/403), non-auth acceptance (400/422/429), network errors, and custom baseUrl overrides for both providers. 
* feat(gitlawb): serve models from static registry without API-unavailable warning GitLawB's OpenGateway API does not expose a /models endpoint per provider-path. Previously the models route fell through to the generic fallback which returned static catalog models with the misleading 'API unavailable — using local catalog' warning. Now gitlawb and gitlawb-gmi are handled as static model providers (same pattern as reka and qwen OAuth) — models are served from the provider registry without any warning, since all registered models are functional via POST /chat/completions. * refactor(gitlawb): extract shared opengateway validator factory, fix docs path in test - Extract gitlawb/gitlawb-gmi validators into buildOpengatewayValidator factory - Fix dockerignore-docs-coverage test: update stale docs/AUTO-COMBO.md -> docs/routing/AUTO-COMBO.md * fix(reasoning): guard interleaved capability lookup * feat(gitlawb): dynamic model fetch with gmi-cloud fallback Hybrid approach: - gitlawb (xiaomi-mimo): dynamic /models endpoint → 356 models - gitlawb-gmi (gmi-cloud): 404 fallback → local catalog gracefully Mimics Gitlawb/openclaude's model-routing pattern * i18n(pt-BR): complete missing translations and sync with en.json * feat(build): nix multi-OS package manager install (#2806) Integrated into release/v3.8.6 * fix(i18n): translate 144 new __MISSING__ pt-BR strings (#2816) Integrated into release/v3.8.6 * chore(docs): set coverage gate to 40/40/40/40 in CLAUDE.md Aligns the documented coverage gate with the v3.8.6 release decision (lowered from 75/75/75/70). Matches the threshold already set in package.json by the large feature PRs (planos 11-22). * fix(cli): respect PORT env var in serve command (#2845) Integrated into release/v3.8.6. * fix(deepseek-web): return 400 when client sends tools[] - chat.deepseek.com has no tool support (#2854) Integrated into release/v3.8.6. * fix(qoder): reject invalid/expired PATs returning Cosy 500 error (#2860) Integrated into release/v3.8.6. 
* fix(cli): register openclaw in tool-detector (#2833) (#2850) Integrated into release/v3.8.6. * fix(api): include noAuth providers in /v1/models catalog (#2798) (#2814) Integrated into release/v3.8.6. * fix(combo): resolve custom provider targets via combo name (#2778) (#2812) Integrated into release/v3.8.6. * fix(translator): strip safety_identifier in openai-responses cleanup (#2770) (#2809) Integrated into release/v3.8.6. * fix(quota): honor explicit per-connection preflight opt-out (#2831) (#2844) Integrated into release/v3.8.6. * fix(usage): un-invert GitHub Copilot Free/limited quota — limited_user_quotas is remaining (#2876) (#2881) Integrated into release/v3.8.6. * fix(nous-research): correct baseUrl to include /chat/completions (#2826) (#2835) Integrated into release/v3.8.6. * fix(opencode): qwen3.x max/plus models lack vision support (#2822) (#2836) Integrated into release/v3.8.6. * fix(translator): pass-through tool_search built-in tool type (#2766) (#2811) Integrated into release/v3.8.6. * fix(github): route claude-opus-4.6 via chat completions (#2821) Integrated into release/v3.8.6. * docs(oauth): add Windsurf login fix design (Phase 1 hotfix + Phase 2 Firebase OAuth) Two-phase plan to fix the broken Windsurf OAuth flow: - Phase 1: drop the dead app.devin.ai/editor/signin PKCE path, promote import-token from windsurf.com/show-auth-token as the primary path - Phase 2: port Firebase OAuth + RegisterUser flow from fendoushaonian/WindSurf-gRPC-API for full browser-based automation Spec only - no code changes yet. * docs(plan): Phase 1 windsurf login hotfix implementation plan 10 tasks covering: - TDD assertions for flowType + 410 Gone responses - Provider switch to import_token - Route handler retiring authorize/start-callback-server/poll-callback - OAuthModal UI override - i18n sync - Verification + PR steps * fix(cli): replace cli-table3 with hand-rolled formatter (#2752) (#2813) Integrated into release/v3.8.6. 
* fix(skills): skip interception for unregistered client-native tools (#2815) (#2817) Integrated into release/v3.8.6. * feat(sse): add RTK filters for kubectl, docker-build, composer, gh (#2824) Integrated into release/v3.8.6. * fix(geminiHelper): support rec.image content shape + warn on dropped remote URLs (refs #2807) (#2855) Integrated into release/v3.8.6. * fix(cli): allow nullable/optional apiKey in cliMitmStartSchema (#2857) Integrated into release/v3.8.6. * fix(combo): preserve system messages during context handoff summary generation (#2865) Integrated into release/v3.8.6. * fix: wire CLIProxyAPI fallback settings into chatCore routing engine (#2866) Integrated into release/v3.8.6. * fix(usage): add opencode quota fetcher (#2852) (#2867) Integrated into release/v3.8.6. * feat(claude): default xhigh support for newer Opus models (#2874) Integrated into release/v3.8.6. * fix(cli): restore omniroute logs command stream (#2756) (#2810) Integrated into release/v3.8.6. * fix(combo): normalize upstream Headers for Node 24 undici interop (#2751) (#2823) Integrated into release/v3.8.6. * Rename proxy log Public IP to Client IP (#2880) Integrated into release/v3.8.6. * fix(claude): preserve max effort for supported models (#2875) Integrated into release/v3.8.6. * fix(oauth): switch windsurf provider to import_token flow The PKCE auth URL targeting app.devin.ai/editor/signin returns 404 post-rebrand. Until Phase 2 ports Firebase OAuth + RegisterUser, the only supported path is import-token via windsurf.com/show-auth-token. - windsurf.ts: drop buildAuthUrl, set flowType=import_token - generateAuthData returns supported:false + helpful error for windsurf/devin-cli - tests: assert flowType + disabled stub * fix(oauth): return 410 Gone for retired windsurf/devin-cli PKCE actions start-callback-server, authorize, and poll-callback (GET + POST) now return 410 Gone with a pointer to /import-token. 
The 410 short-circuit runs before auth so the response is honest about the action being permanently gone, not gated. Codex PKCE flow unchanged. Tests: 5 new assertions cover GET + POST 410 paths and a Codex regression check. * refactor(oauth): annotate retired PKCE fields in WINDSURF_CONFIG No behaviour change - comment-only update documenting that authorizeUrl, codeChallengeMethod, callbackPort, callbackPath, apiServerUrl, and exchangePath are no longer consumed. Active fields (inferenceUrl, showAuthTokenUrl, firebaseApiKey, ideName) called out separately. * fix(cli,docs): use requireCliToolsAuth in logs route + document OPENCODE quota env Post-merge contract fixes for v3.8.6: - src/app/api/cli-tools/logs/route.ts (#2810) now uses the shared requireCliToolsAuth guard (param renamed req->request) to satisfy the cli-tools-auth-hardening contract test. - Document OMNIROUTE_OPENCODE_QUOTA_URL (#2867) in docs/reference/ENVIRONMENT.md to satisfy the env/docs sync contract. * fix(dashboard): force import-token panel for windsurf/devin-cli Phase 1 hotfix: hide the 'Browser Login' tab and start in Paste API Key mode. Removes windsurf/devin-cli from PKCE_CALLBACK_SERVER_PROVIDERS so no callback server is started for them. Codex still uses the PKCE flow. The 'Get token' link continues to point at windsurf.com/show-auth-token via the existing supportsTokenPaste form copy. * fix(oauth): windsurf import-token mapTokens signature mismatch The route at `src/app/api/oauth/[provider]/[action]/route.ts` invokes `providerData.mapTokens({ accessToken: token })` (object), matching the cursor/kiro signature. The windsurf provider was declared with `mapTokens(token: string)` instead, so the entire object was stored as `accessToken`. When the connection record reached the SQLite layer it crashed with: SQLite3 can only bind numbers, strings, bigints, buffers, and null Fix by aligning windsurf's `mapTokens` signature with the route caller and the cursor/kiro convention. 
Also dedupe a copy-pasted second `if (action === "import-token")` block in the route handler — the second block was unreachable but identical to the first. Adds two regression tests asserting that `provider.mapTokens({ accessToken })` returns a string `accessToken` for both windsurf and devin-cli, so a future signature drift trips the gate instead of the SQLite bind error in production. * feat(compression): expand pt-BR pack with troglodita rules (15 → 49) (#2818) Integrated into release/v3.8.6 * fix(sse): repair RTK engine defaults so dedup and direct calls work (#2825) Integrated into release/v3.8.6 * fix(mcp): redirect console.log/warn to stderr in --mcp stdio mode (#2840) Integrated into release/v3.8.6 * fix(gemini-cli): prefer real project IDs over default-project (#2841) Integrated into release/v3.8.6 * fix(opencode-go): add provider limits quota fetcher (#2861) Integrated into release/v3.8.6 * Audit & add web cookie providers: fix 4 missing registry entries + DuckDuckGo (#2862) Integrated into release/v3.8.6 * fix(antigravity): harden signatureless tool history (#2878) Integrated into release/v3.8.6 * fix: provider model sync pruning and dynamic antigravity MITM proxy mappings (#2886) Integrated into release/v3.8.6 * feat(usage): per-API-key token limits scoped to model/provider/global (#2888) Integrated into release/v3.8.6 * fix(audio): build multipart body manually to preserve Content-Type (#2842) Integrated into release/v3.8.6 * refactor: remove agent skill documentation files and streamline maintenance workflows * test(stabilization): resolve unit test failures in blackbox-web, schema-coercion, translator-helper-branches, usage-service-hardening, and audio-transcription * fix(security): mitigate Socket.dev supply-chain findings + secrets opt-in + minimal build profile (#2863) (#2871) Two real security gaps closed and four cosmetic Socket.dev fingerprints removed. See docs/security/SOCKET_DEV_FINDINGS.md for the per-finding maintainer attestation. 
Real bugs fixed: - cloudSync: HMAC verification of `X-Cloud-Sig` + opt-in `OMNIROUTE_CLOUD_SYNC_SECRETS=true` before overwriting `accessToken` / `refreshToken` / `providerSpecificData` from a remote response. Closes the silent-credential-swap surface (a misconfigured or hostile CLOUD_URL could previously replace local tokens unverified). - Zed import: split into 2-step `/discover` + `/import` flow. `/import` now requires `confirmedAccounts: [{ service, account, fingerprint }]` and re-reads the keychain server-side to filter by fingerprint, so a tampered discover response cannot trick the endpoint into saving an unrelated token. Cosmetic Socket.dev mitigations: - runElevatedPowerShell writes the elevated payload to a per-call temp `.ps1` file (mode 0o600) and references it via `-File`. Removes the textbook `-EncodedCommand <base64utf16le>` pattern flagged as malware by Socket's AI classifier. - Maintainer attestation `SECURITY-AUDITOR-NOTE:` blocks added at every flagged call site pointing to `docs/security/SOCKET_DEV_FINDINGS.md`. Build-time hardening: - `OMNIROUTE_BUILD_PROFILE=minimal` (`npm run build:secure`) physically removes the four sensitive modules from the standalone bundle via webpack `NormalModuleReplacementPlugin`. Stubs throw `FeatureDisabledError` at runtime. Intended for the `omniroute-secure` artifact. Tests: - 24 new unit tests in `tests/unit/security/` covering the wrapper builder, HMAC verification (4 cases), credential fingerprint determinism (5 cases), confirmedAccounts validation + fingerprint filtering (6 cases), and the minimal-build stubs (5 cases). Docs: - New `docs/security/SOCKET_DEV_FINDINGS.md` — per-finding attestation. - New `socket.yml` — Socket.dev v2 config pointing at the attestation. - Updated `SECURITY.md` — supply-chain scanner section. - Updated `.env.example` — three new env vars documented. Backwards compatibility: - Cloud sync token overwrite is OFF by default. 
Users who relied on it must set `OMNIROUTE_CLOUD_SYNC_SECRETS=true`. Breaking change documented in CHANGELOG. - Zed import 2-step is the new default; legacy 1-step preserved behind `OMNIROUTE_ZED_IMPORT_LEGACY_ONE_STEP=true` and will be removed in v3.9. Closes #2863 * fix(security): redact public Firebase Web key from windsurf spec; doc SHA-256 cache-key rationale (#2894) Two security-scanning findings on release/v3.8.6: - Secret-scanning alert 7 (google_api_key): the windsurf login-fix design spec embedded the literal public Firebase Web API key on two lines. Firebase Web API keys are non-sensitive by design (they identify the project; access is gated by Firebase Security Rules + key restrictions), but the literal trips secret scanning. Redacted to a placeholder; the embedded default still goes through resolvePublicCred per rule #11. - Code-scanning alert 261 (js/insufficient-password-hash): tokenCacheKey() uses SHA-256 to derive an in-memory cache key from the session token, not for password-at-rest storage. Added a comment documenting why CWE-916 KDFs do not apply (false positive). * fix(ci): resolve release/v3.8.6 gate failures (docs-sync, any-budget, pack-artifact) (#2895) * fix(ci): resolve release/v3.8.6 gate failures (docs-sync, any-budget, pack-artifact) Three CI gates failed on release/v3.8.6 (run 26630300877): - docs-sync: CHANGELOG had a spurious "## [3.8.6-patch]" section above "## [3.8.6]", so the latest release no longer matched package.json (3.8.6) and the 41 i18n CHANGELOG mirrors were flagged as missing that section. Fold the lone #2752 entry into [3.8.6] and drop the patch heading. - any-budget:t11: open-sse/handlers/chatCore.ts regressed to 1 explicit `any` (budget 0). Type the persist callback arg as Record<string, unknown>, which matches runWithOnPersist's RefreshPersistFn contract exactly. 
- pack-artifact: open-sse/utils/setupPolyfill.ts ships via package.json "files" (bin/omniroute.mjs imports it at startup) but was missing from the pack policy allowlist. Allow it and add a regression test. * fix(security): redact public Firebase Web key from windsurf spec Redact the literal public Firebase Web API key (secret-scanning #7) to a placeholder, mirroring the redaction on release/v3.8.6 (PR #2894) and the windsurf fix branch. Non-sensitive public Web key; trips secret scanning. * feat(combo): Zero-Latency Combos (Hedging, Proactive Compression, Predictive TTFT) (#2868) * feat(combo): implement zero-latency combo optimizations (hedging, proactive compression, predictive TTFT) * fix(combo): fix predictive TTFT skip logic and unhandled promise rejections --------- Co-authored-by: Automation <automation@omniroute> * feat: implement automated skill workflows and update system configuration and validation schemas * test: eliminate dynamic cast warnings in cloud-sync unit test * test: isolate services-branch-hardening database directory to avoid concurrency issues * feat(providers): add 7 new web-cookie providers + research catalog + discovery tool New providers: - huggingchat: free LLM chat via huggingface.co/chat (no subscription) - phind: free dev-focused AI chat via phind.com/api/agent - poe-web: multi-model chat via poe.com GraphQL (p-b cookie) - venice-web: privacy-focused AI chat via venice.ai (session cookie) - v0-vercel-web: Vercel v0 code gen via v0.dev (session cookie) - kimi-web: Moonshot Kimi chat via kimi.moonshot.cn (session cookie) - doubao-web: ByteDance Doubao chat via doubao.com (session cookie) Additional: - Research catalog: docs/research/UNLIMITED_LLM_ACCESS.md - Discovery tool design + stub: src/lib/discovery/ + migration 073 - Unit tests: 33 tests for all 7 providers - Shared helpers consolidated in error.ts (slop cleanup) - All registered in WEB_COOKIE_PROVIDERS + providerRegistry + webSessionCredentials Closes #2885 * fix(typecheck): 
resolve typecheck errors in combo spec and compression modules * feat(api,oauth): add `agy` (Antigravity CLI) standalone provider with CLI token import (#2899) Add a standalone OAuth provider `agy` (Antigravity CLI) next to gemini-cli/antigravity. It reuses the antigravity inference backend (identical Google client_id + daily-cloudcode-pa.googleapis.com endpoint, executor and token-refresh) but ships its own model catalog — including the Claude models the backend exposes (claude-opus-4-6-thinking, claude-sonnet-4-6) — its own account pool, and four ways to connect: - token-file import (paste/upload the agy oauth token JSON) - auto-detect a local CLI login (~/.gemini/antigravity-cli/antigravity-oauth-token) - browser OAuth (via the shared OAuthModal Google loopback flow) - bulk / ZIP import New routes: POST /api/providers/agy-auth/{import,import-bulk,zip-extract,apply-local}. Catalog pinned from the live :fetchAvailableModels endpoint. Docs (openapi.yaml, ENVIRONMENT.md, .env.example, CHANGELOG) updated; new unit tests for registration, the token parser, and route auth-hardening. * fix(security): redact public Firebase Web key from windsurf spec (#2896) Redact the literal public Firebase Web API key (secret-scanning #7) to a placeholder. Firebase Web API keys are non-sensitive by design but the literal trips GitHub secret scanning. Mirrors the redaction landed on release/v3.8.6 (PR #2894). Embedded default still flows through resolvePublicCred (rule #11). * Pr 2871 (#2897) * fix(security): mitigate Socket.dev supply-chain findings + secrets opt-in + minimal build profile (#2863) Two real security gaps closed and four cosmetic Socket.dev fingerprints removed. See docs/security/SOCKET_DEV_FINDINGS.md for the per-finding maintainer attestation. Real bugs fixed: - cloudSync: HMAC verification of `X-Cloud-Sig` + opt-in `OMNIROUTE_CLOUD_SYNC_SECRETS=true` before overwriting `accessToken` / `refreshToken` / `providerSpecificData` from a remote response. 
Closes the silent-credential-swap surface (a misconfigured or hostile CLOUD_URL could previously replace local tokens unverified). - Zed import: split into 2-step `/discover` + `/import` flow. `/import` now requires `confirmedAccounts: [{ service, account, fingerprint }]` and re-reads the keychain server-side to filter by fingerprint, so a tampered discover response cannot trick the endpoint into saving an unrelated token. Cosmetic Socket.dev mitigations: - runElevatedPowerShell writes the elevated payload to a per-call temp `.ps1` file (mode 0o600) and references it via `-File`. Removes the textbook `-EncodedCommand <base64utf16le>` pattern flagged as malware by Socket's AI classifier. - Maintainer attestation `SECURITY-AUDITOR-NOTE:` blocks added at every flagged call site pointing to `docs/security/SOCKET_DEV_FINDINGS.md`. Build-time hardening: - `OMNIROUTE_BUILD_PROFILE=minimal` (`npm run build:secure`) physically removes the four sensitive modules from the standalone bundle via webpack `NormalModuleReplacementPlugin`. Stubs throw `FeatureDisabledError` at runtime. Intended for the `omniroute-secure` artifact. Tests: - 24 new unit tests in `tests/unit/security/` covering the wrapper builder, HMAC verification (4 cases), credential fingerprint determinism (5 cases), confirmedAccounts validation + fingerprint filtering (6 cases), and the minimal-build stubs (5 cases). Docs: - New `docs/security/SOCKET_DEV_FINDINGS.md` — per-finding attestation. - New `socket.yml` — Socket.dev v2 config pointing at the attestation. - Updated `SECURITY.md` — supply-chain scanner section. - Updated `.env.example` — three new env vars documented. Backwards compatibility: - Cloud sync token overwrite is OFF by default. Users who relied on it must set `OMNIROUTE_CLOUD_SYNC_SECRETS=true`. Breaking change documented in CHANGELOG. - Zed import 2-step is the new default; legacy 1-step preserved behind `OMNIROUTE_ZED_IMPORT_LEGACY_ONE_STEP=true` and will be removed in v3.9. 
Closes #2863 * feat: implement automated skill workflows and update system configuration and validation schemas * test: eliminate dynamic cast warnings in cloud-sync unit test * test: isolate services-branch-hardening database directory to avoid concurrency issues * chore(docs): refresh generated docs collection index Update the generated Fumadocs browser collection mapping to keep documentation imports in sync with the current docs structure. * docs: update generated browser docs collection manifest Refresh the generated Fumadocs browser collection mapping so the docs site can resolve the current documentation files correctly. --------- Co-authored-by: OpenClaw <openclaw@kuzhomesrv.local> Co-authored-by: Dmitry Kuznetsov <139351986+dmitry@users.noreply.local> Co-authored-by: KuzyaBot <kuzya@local> Co-authored-by: JeferssonLemes <jeferssondev@gmail.com> Co-authored-by: Paijo <14921983+oyi77@users.noreply.github.com> Co-authored-by: Markus Hartung <mail@hartmark.se> Co-authored-by: akarray <akarray@users.noreply.github.com> Co-authored-by: Apostol Apostolov <theapoapostolov@gmail.com> Co-authored-by: Hernan Javier Ardila Sanchez <hjasgr@gmail.com> Co-authored-by: Dmitry Kuznetsov <dmitry@kuznetsov.me> Co-authored-by: Nikolay Alafuzov <alafuzov_nn@rusklimat.ru> Co-authored-by: oyi77 <oyi77@users.noreply.github.com> Co-authored-by: Ronaldo Davi <alltomatos@users.noreply.github.com> Co-authored-by: levonk <277861+levonk@users.noreply.github.com> Co-authored-by: Lenine Júnior <lenine@engrene.com.br> Co-authored-by: Annas Alghoffar <aag.annas@gmail.com> Co-authored-by: Tushar Agarwal <76201310+Tushar49@users.noreply.github.com> Co-authored-by: GreatLiu <eurasiaxz@qq.com> Co-authored-by: yuna amelia <230527278+yunaamelia@users.noreply.github.com> Co-authored-by: Randi <55005611+rdself@users.noreply.github.com> Co-authored-by: Container <78986709+disonjer@users.noreply.github.com> Co-authored-by: nickwizard <35692452+nickwizard@users.noreply.github.com> Co-authored-by: 
Rajvardhan Patil <rajvardhanpatil7890@gmail.com> Co-authored-by: Raxxoor <manker_lol@hotmail.com> Co-authored-by: Muhammad Mugni Hadi <mugnimaestra3@gmail.com> Co-authored-by: mi <123757457+soyelmismo@users.noreply.github.com> Co-authored-by: Automation <automation@omniroute>	1 month ago
CODE_OF_CONDUCT.md	fix(combo): fallback to next model on all-accounts-rate-limited 503 (… (#1523) Integrated into release/v3.7.0	2 months ago
CONTRIBUTING.md	refactor(i18n): move existing locale mirrors to subfolder layout For all 40 locales, move docs/i18n/<lang>/docs/<DOC>.md into the matching subfolder (architecture/, guides/, reference/, ...). Mirror references in docs/i18n/<lang>/{llm.txt,CHANGELOG.md,CONTRIBUTING.md,README.md} are also rewritten to the new paths so the i18n llm.txt mirror check stays byte-equal to the root llm.txt body. These are mechanical moves only; the actual translations remain unchanged and will be regenerated incrementally in FASE 5 (hash-based pipeline). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	1 month ago
GEMINI.md	fix(combo): fallback to next model on all-accounts-rate-limited 503 (… (#1523) Integrated into release/v3.7.0	2 months ago
README.md	Release v3.8.43 (#5609) * chore(release): open v3.8.43 development cycle * docs(relay): clarify backend routing contract (#5621) Integrated into release/v3.8.43 (drift-shed: cherry-picked the real change onto the release tip; stale-base drift dropped). * fix(security): avoid rendering error stacks (#5624) Integrated into release/v3.8.43 (drift-shed: cherry-picked the real change onto the release tip; stale-base drift dropped). * fix(chatgpt-web): restore dot-form Pro model ids (#5549) Integrated into release/v3.8.43 (drift-shed: cherry-picked the real change onto the release tip; stale-base drift dropped). * feat(commandCode): add multimodal image support for CC vision models (#5557) Integrated into release/v3.8.43 (drift-shed: cherry-picked the real change onto the release tip; stale-base drift dropped). * fix(providers): validate M365 Copilot web credentials (#5432) Integrated into release/v3.8.43 (drift-shed: cherry-picked the real change onto the release tip; stale-base drift dropped). * fix(sse): bound chat hot-path heap — pressure-aware admission + response cap + clone reductions (#5152) (#5425) Integrated into release/v3.8.43 (drift-shed: cherry-picked the real change onto the release tip; stale-base drift dropped). * fix: model lockout not recording for 429 rate_limit_exceeded from Antigravity ## Problem When Antigravity returns HTTP 429 with `rate_limit_exceeded` error code, the model lockout system never records the failure, so the model is not cooled down despite being rate-limited. ### Root Cause Antigravity's 429 error text is: `"Resource has been exhausted (e.g. check quota)."` The QUOTA_PATTERNS in `classify429.ts` contained overly broad regexes: - `/resource.exhaust/i` — matches "Resource has been exhausted" - `/check.quota/i` — matches "check quota" This caused `classifyErrorText()` to return `QUOTA_EXHAUSTED` (wrong), which set `providerExhausted = true` in the combo target exhaustion logic. 
With `providerExhausted`, the retry path was skipped entirely, and while the "done retrying" path should still record lockout, the misclassification cascaded into incorrect provider-level exhaustion state. Additionally, `targetExhaustion.ts` used the raw error text string instead of the structured error code (`rate_limit_exceeded`) that was already parsed from the response body. ## Fix 1. classify429.ts — Removed overly broad `/resource.exhaust/i` and `/check.quota/i` from QUOTA_PATTERNS. Antigravity's rate-limit wording is not a true quota exhaustion signal. 2. targetExhaustion.ts — Added optional `structuredError` to `ApplyComboTargetExhaustionOptions`. When available, the structured error code (e.g. `rate_limit_exceeded`) takes precedence over raw error text for exhaustion classification. 3. combo.ts — Passes `structuredError` to both `applyComboTargetExhaustion` call sites (dispatch path + retry-or-rotate path). ## Effect `structuredError.code = "rate_limit_exceeded"` → classified as rate-limit (not quota) → `providerExhausted = false` → retry proceeds → `recordModelLockoutFailure` called → model enters lockout with proper cooldown (120s base, exponential backoff). ## Tests Added 2 new tests for `structuredError.code` precedence in exhaustion classification. All 28 related tests pass. * fix(checks): normalize route paths on windows (#5613) Integrated into release/v3.8.43. Windows path-normalization fix for the route-guard membership gate + regression test (Rule #18). Co-authored test added by maintainer. 
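The precedence rule from the Antigravity 429 fix above can be sketched as follows. This is an illustration under assumptions, not the project's `classify429.ts`: the function names, the two remaining quota patterns, the doubling factor and the cooldown cap are all hypothetical. Only the `rate_limit_exceeded` code, the 120s base and the removal of the broad patterns come from the commit message.

```typescript
type Classification = "RATE_LIMITED" | "QUOTA_EXHAUSTED";

interface StructuredError {
  code?: string;
}

// Deliberately narrow (hypothetical) patterns; the broad /resource.exhaust/i and
// /check.quota/i entries are the ones the fix removed.
const QUOTA_PATTERNS: RegExp[] = [/quota\s+exceeded/i, /insufficient[_\s]quota/i];

function classify429(errorText: string, structuredError?: StructuredError): Classification {
  // A parsed error code, when present, takes precedence over free-form text.
  if (structuredError && structuredError.code === "rate_limit_exceeded") {
    return "RATE_LIMITED";
  }
  return QUOTA_PATTERNS.some((p) => p.test(errorText)) ? "QUOTA_EXHAUSTED" : "RATE_LIMITED";
}

// 120s base with exponential backoff; the factor of 2 and the 1h cap are assumptions.
function lockoutCooldownMs(failureCount: number, baseMs = 120000, maxMs = 3600000): number {
  const exponent = Math.max(0, failureCount - 1);
  return Math.min(baseMs * Math.pow(2, exponent), maxMs);
}
```

With this shape, Antigravity's "Resource has been exhausted (e.g. check quota)." text classifies as a rate limit whether or not the structured code is available, so the retry and lockout path is not skipped.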
* fix: truncate tool list when provider limit exceeds MAX_TOOLS_LIMIT (grok-cli 200) - Add proactive PROVIDER_TOOL_LIMITS map with grok-cli: 200 - Fix regex to capture 'maximum is 200' (not '427 tools provided') - Remove broken truncation gate that skipped limits >= MAX_TOOLS_LIMIT (128) - Add tests for Grok regex, proactive limits, and limits above threshold Refs #5563 * test(chatcore): cover grok-cli tool-list truncation via prepareUpstreamBody (#5563) Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> * fix(security): v3.8.15 hardening follow-ups (Seg2/Seg3/Seg4/Bug3) (#5512) Security v3.8.15 hardening follow-ups: Seg2 (CHANGEME boot warn), Seg3 (auth_token cookie maxAge 30d), Seg4 (VS Code path-token once-per-process warning), Bug3 (real global install path resolution), Bug1 (segment-match node_modules in auto-update detection). All 5 carry TDD regression guards. * Fix HuggingChat web session routing (#5592) (#5592) Integrated into release/v3.8.43. HuggingChat web session-routing fix (root parent-message fetch + cookie propagation + encrypted-credential guard) + 24-model catalog refresh. Maintainer adjustments (co-authored): reverted the freeModelCatalog.data.ts whole-file reformat down to the surgical 24-record huggingchat change (preserving the auto-generated compact format), and added a 502 regression test for the null parent-message-id path (Rule #18). 
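The grok-cli tool-list fix described above (capture the limit after "maximum is", keep a proactive per-provider map, and truncate even when the limit exceeds 128) might look roughly like this. `MAX_TOOLS_LIMIT` and `PROVIDER_TOOL_LIMITS` are named in the commit message; the helper names, the exact regex and the sample error wording are assumptions for illustration.

```typescript
const MAX_TOOLS_LIMIT = 128;
const PROVIDER_TOOL_LIMITS: Record<string, number> = { "grok-cli": 200 };

// Capture the number after "maximum is", not the count of tools that were sent.
function parseToolLimit(errorText: string): number | undefined {
  const match = /maximum\s+is\s+(\d+)/i.exec(errorText);
  return match ? Number(match[1]) : undefined;
}

// Limit reported by the upstream error wins, then the proactive map, then the default.
function resolveToolLimit(provider: string, errorText?: string): number {
  const fromError = errorText ? parseToolLimit(errorText) : undefined;
  if (fromError !== undefined) return fromError;
  const proactive = PROVIDER_TOOL_LIMITS[provider];
  return proactive !== undefined ? proactive : MAX_TOOLS_LIMIT;
}

// No gate on "limit >= MAX_TOOLS_LIMIT": any list longer than the limit is cut.
function truncateTools<T>(tools: T[], limit: number): T[] {
  return tools.length > limit ? tools.slice(0, limit) : tools;
}
```

For example, `truncateTools(tools, resolveToolLimit("grok-cli"))` would cut a 427-entry list to 200 entries before the request is sent, rather than waiting for the upstream rejection.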
* fix: preserve system role for GLM 5.1/5.2 (#5610) (#5663) * fix: restore Codex Responses WS TLS profile + apply proxy (#5591, #5611) (#5668) * fix: allow saving providers without a live validator (#5565, #5567) (#5669) * fix: static model catalog for jules/linkup/ollama/searchapi search providers (#5569, #5571, #5573, #5575) (#5672) * fix: live AI/ML API catalog + deprecate dead CablyAI (#5570, #5568) (#5673) * fix: correct 404 provider setup links for ollama/searchapi/you.com (#5572, #5574, #5576) (#5674) * fix: page call_logs cleanup queries to avoid startup OOM on large DBs (#5618) (#5675) * fix: use PowerShell Expand-Archive on Windows for embedded-service install (#5590) (#5678) * fix: treat array content blocks as valid output in detectMalformedNonStream (#5559) (#5680) * fix: render memory engine status detail strings in English (#5596) (#5685) * fix: free proxy pool silent sync failure — iplocate txt + per-source isolation + surface errors (#5595) (#5686) * chore(quality): close QG v2 tail — drop orphan semcheck.yaml + Fase 9 maturity re-eval (#5681) - Remove semcheck.yaml: orphan config (zero workflow/script wiring) with stale rule counts; deterministic doc-accuracy coverage already exists (check:fabricated-docs --strict + docs-counts-sync + docs-symbols). Drop the REPOSITORY_MAP row referencing it. - Add docs/ops/MATURITY_REEVAL.md (Fase 9): re-measures maturity post-Ondas 0-3. The two biggest structural weaknesses from QUALITY_GATE_PLAYBOOK (2026-06-16) are now closed: fast-gates hole (quality.yml runs typecheck:core + impacted TIA unit tests + vitest + shards) and mutation-score-as-ratchet (check-mutation-ratchet.mjs + seeded baseline + nightly blocking job). Residual gap is owner/infra-gated (branch-protection main, SLSA L3, CodeQL advanced). - Record agent-lsp as deferred/opt-in (doc-only scaffold, no wiring). 
* fix(ci): stabilize nightly-mutation — guard tap.testFiles drift + anti-flake eps (#5682) Root cause (NOT a timeout): the nightly-mutation run fails on cold-cache nights because the blocking mutation-ratchet job measures modules below baseline, while warm-cache nights pass — the verdict tracked GitHub Actions cache state, not code quality. Proven via a local Stryker probe on headers.ts: covering unit tests (no-memory-header, strip-reasoning) had drifted OUT of stryker.conf.json tap.testFiles, so their mutants went covered-but-unkilled = Survived on a cold full run (COVERED score 61.73 vs 94.29 baseline); adding them restores the kills. - Add scripts/check/check-mutation-test-coverage.mjs: guards that every UNIT test importing a Stryker-mutated module is listed in tap.testFiles. Advisory by default, --strict in CI (wired in quality.yml fast-gates). Prevents recurrence. - Add the 38 drifted covering unit tests to stryker.conf.json tap.testFiles (138 -> 176). Monotonically safe: more covering tests only raise/hold the score. - Add MUTATION_RATCHET_EPS (1.0pt) anti-flake tolerance to check-mutation-ratchet so sub-point tap-runner jitter no longer false-fails the gate. Lowers no baseline. - Tests: check-mutation-test-coverage (3) + eps cases in check-mutation-ratchet. Residual: a clean post-merge nightly confirms scores return to/above baseline; any marginal residual gets a baseline re-seed (operator). * refactor(dashboard): split sidebarVisibility god-file into types + sections leaves (#5683) Behavior-preserving decomposition: src/shared/constants/sidebarVisibility.ts 1197 -> 291 LOC by extracting two leaves under sidebarVisibility/: - types.ts (160): HIDEABLE_SIDEBAR_ITEM_IDS + all sidebar types (self-contained). - sections.ts (762): section building-block consts + SIDEBAR_SECTIONS (imports types only — cycle-safe). COMPRESSION_CONTEXT_GROUP + SIDEBAR_SECTIONS stay exported; host re-exports both + 'export ' of types, so every consumer import path is unchanged. 
Byte-identical data verified via JSON.stringify of HIDEABLE_SIDEBAR_ITEM_IDS / SIDEBAR_ICON_ACCENTS / COMPRESSION_CONTEXT_GROUP / SIDEBAR_SECTIONS / SIDEBAR_PRESETS + getSectionItems output (identical before/after). typecheck:core, check:cycles (no cycles), check:file-size (3 files <800), and the 3 sidebar suites (20/20) pass. No logic changed. Note: file-size frozen baseline for sidebarVisibility.ts (1198) can ratchet to 291 to lock the shrink (left for the release ratchet / operator). fix: surface fusion-specific config on the Global Routing tab (#5598) (#5688) * fix(executor): route OpenAI-compatible MCP Responses requests to /responses (#5483) Closes #5483. OpenAI-compatible providers receiving a Responses-shaped request carrying MCP / tool_search tools now route to the upstream /responses endpoint instead of downgrading to /chat/completions, preserving Codex deferred tool discovery. Detection helpers extracted to open-sse/executors/forceResponsesUpstream.ts. Thanks to @KooshaPari. * fix(ci): make release-green pre-flight gates visible + bounded so unit reds are not missed (#5644) Integrated into release/v3.8.43. * fix(body-size): raise LLM API payload limit for responses routes (#5652) Integrated into release/v3.8.43. Thanks @JxnLexn! * fix(test): use lightweight health probe for batch e2e (#5651) Integrated into release/v3.8.43. Thanks @KooshaPari! * feat(compression): T05/C5 — preserveSystemPrompt mode enum + legacy back-compat (#5653) Integrated into release/v3.8.43. Includes the legacy-boolean back-compat derivation so existing preserveSystemPrompt=false installs keep whenNoCache behavior. * routing: optimize latency strategy with perf metrics (#5629) Integrated into release/v3.8.43. Thanks @KooshaPari! * feat(db): models/5004 — self-correcting model context-window overrides (#5667) Integrated into release/v3.8.43. * feat(providers): complete SenseNova free Token Plan — chat + Text-to-Image (port from 9router#2233) (#5679) Integrated into release/v3.8.43. 
* feat(api): routing/4985 — configurable response-body validation + failover (#5684) Integrated into release/v3.8.43. * fix(chatcore): default Claude tool type to "custom" when missing (#5662) Integrated into release/v3.8.43. Port from 9router#2196. Co-authored-by: warelik <warelik@users.noreply.github.com> * fix(translator): merge consecutive same-role contents for Gemini (port from 9router#2191) (#5661) Integrated into release/v3.8.43. Port from 9router#2191. * chore(bun): add locked bun runtime dependency (#5615) Integrated into release/v3.8.43. Bun 1.3.10 pinned via npm lockfile (adopt-partial decision). Thanks @KooshaPari! * chore(bun): run validated ts scripts with bun (#5612) Integrated into release/v3.8.43. Thanks @KooshaPari! * chore(bun): run CI script checks with bun (#5617) Integrated into release/v3.8.43. Validated bun==node output for all 3 gates (provider-consistency, compression-budget, known-symbols). Thanks @KooshaPari! * fix(build): make pack validator bun safe (#5643) Integrated into release/v3.8.43. Forward-compat guard; node/npm path unchanged. Thanks @KooshaPari! * docs: document Bun as the allow-listed build/dev script runner (Node stays the published runtime) (#5703) Integrated into release/v3.8.43. * feat(analytics): show $0 cost for flat-rate subscription/cookie providers (#5552) (#5704) * refactor(api): extract unified-catalog helpers into cohesive leaf modules (#5699) BLOCO E2 of the god-files campaign. 
The module-level pure/standalone helpers in src/app/api/v1/models/catalog.ts (1611 LOC) were lifted out verbatim into five cohesive leaf modules so the catalog host shrinks toward the 800-LOC file-size cap without any behavior change (host now 1345 LOC; the heavy getUnifiedModelsResponse orchestrator is untouched — its in-function closures stay put): - catalogHelpers.ts — pure numeric/array/shape helpers + shared catalog types - catalogOpenrouter.ts — OpenRouter id/modality/free-model/display-name helpers - catalogVision.ts — vision-capability field derivation (+ isVisionModelId re-export) - catalogProviderMaps.ts — alias<->providerId resolution maps (buildAliasMaps) - catalogRequest.ts — /v1/models API-key auth gating + Codex CLI client detection The host re-exports getCustomVisionCapabilityFields and isVisionModelId so the public API consumed by other tests (llm-selector-custom-vision-models, vision-detection- consistency) is unchanged; all 9 catalog/vision suites stay green. Adds tests/unit/catalog-helpers-extraction.test.ts: characterization tests for every extracted helper + a guard asserting the host preserves its public exports. Validated: typecheck:core, 50 catalog characterization tests, 12 new leaf tests, integration-wiring, check:cycles, check:file-size (no new violations), ESLint, Prettier. * feat(mcp): T07 — expose RTK learn/discover as MCP tools (#5691) Adds two read-only MCP tools wrapping the existing RTK discovery primitives: omniroute_rtk_discover (discoverRepeatedNoise/suggestFilter over recently captured raw tool output → candidate noise patterns + suggested filter) and omniroute_rtk_learn (listRtkCommandSamples + commandToId). Scope read:compression, MCP audit-logged, no new engine logic. Regression guard: tests/unit/compression/rtk-mcp-tools.test.ts. gaps v3.8.42 — T07. 
* feat(compression): T05/C3 — opt-in LLM-tier compression engine (#5702) Adds an opt-in, default-off LLM-tier compression engine ('llm') that condenses non-system message prose via a pluggable chat-completion backend, mirroring the llmlingua contract. Safe by construction: no-op default backend (pass-through out of the box), not in the default stacked pipeline, enabled defaults false, fenced code blocks + system messages never sent to the model, fail-open everywhere, minTokens floor. Real production backend is a VPS-validated follow-up (Hard Rule #18). Regression guard: tests/unit/compression/llm-compressor-engine.test.ts (8). gaps v3.8.42 — T05/C3. * refactor(db): extract compat/aliases/mitm helpers from db/models.ts into leaf modules (#5705) BLOCO E3 of the god-files campaign. db/models.ts (1250 LOC) mixed six concerns; the three cleanly-separable ones plus the shared key_value helpers were lifted out verbatim into a new src/lib/db/models/ subdirectory, leaving the tightly-coupled custom/synced/ flags trio in the host (host now 936 LOC). The host re-exports every moved public symbol so the module's public API (consumed by ~29 test files + localDb) is unchanged. - models/shared.ts — asRecord / toNonEmptyString / getKeyValue + JsonRecord (19 LOC) - models/compat.ts — model-compat overrides + sanitizeUpstreamHeadersMap (249 LOC) - models/aliases.ts — model-alias CRUD + cascade delete (61 LOC) - models/mitmAlias.ts — MITM alias get/set (32 LOC) The custom/synced/flags trio stays in the host because it is genuinely coupled (flags->getCustomModelRow, flags->readCompatList, custom->removeModelCompatOverride, synced->getModelIsDeleted, setModelIsHidden->updateCustomModel) — splitting it cleanly is a follow-up. Dependency DAG is acyclic (verified by check:cycles). Adds tests/unit/db-models-split.test.ts: characterization of the pure extracted helpers + a guard asserting the host preserves its full public export surface. 
Validated: typecheck:core, check:cycles (no cycles), 77 existing db/models consumer tests (db-models-crud/extended/aliases-cascade + 7 more) green, 7 new tests, ESLint, Prettier, check:file-size (host 936 < frozen 1259; no new violations). * refactor(db): extract pricing/lkgp/cache-metrics from db/settings.ts into leaf modules (#5709) BLOCO E3 of the god-files campaign. db/settings.ts (1154 LOC) mixed five concerns; the three cleanly-separable ones plus the shared toRecord/JsonRecord helper were lifted out verbatim into a new src/lib/db/settings/ subdirectory, leaving the Settings-core + Proxy config concerns in the host (host now 646 LOC). The host re-exports every moved public symbol so the module's public API (consumed by ~93 test files + localDb) is unchanged. - settings/shared.ts — toRecord + JsonRecord (9 LOC) - settings/pricing.ts — pricing layers/sources/per-model + update/reset (254 LOC) - settings/lkgp.ts — Last-Known-Good-Provider get/set/clear (49 LOC) - settings/cacheMetrics.ts — cache metrics + trend (235 LOC) Settings-core + the Proxy-config concern stay in the host: proxy is the most tangled (245-line resolveProxyForConnection, resolution cache, imports from ./proxies) and getSettings is the most central function — leaving them is the correct coupled-core stop. Pricing/LKGP/Cache have NO dependency on Settings/Proxy helpers (verified); the dependency DAG is acyclic (check:cycles). Adds tests/unit/db-settings-split.test.ts: characterization of the shared toRecord helper + a guard asserting the host preserves its full public export surface. Validated: typecheck:core, check:cycles (no cycles), 149 existing+new db/settings consumer tests green (db-settings-crud/extended, 8 pricing suites, cache-metrics, 2 proxy-resolution suites + 29 new), ESLint, Prettier, check:file-size (host 646 < frozen 1155). 
* fix(translator): re-apply lost defensive hardening for Gemini merge + Claude tool defaults (#5706) Re-applies two dropped gemini-code-assist hardening fixes (defaultClaudeToolType non-object passthrough; mergeConsecutiveSameRoleContents shallow-copy) with regression tests. Follow-up to #5661/#5662. Integrated into release/v3.8.43. * feat(codex): generate fallback profiles for compatible models (#5701) setup-codex now generates Codex profiles for compatible text models from the live /v1/models catalog when the model id doesn't match a hand-tuned pattern, skipping media/embedding models. Integrated into release/v3.8.43. * docs(changelog): credit @Chewji9875 for #5563 + #5579 Add CHANGELOG credit bullets for grok-cli tool-limit (#5563) and Antigravity 429 lockout (#5579). Documentation-only. * test(dashboard): repoint sidebar quota-share placement scan to sections.ts (#5711) The D1 god-file split (#5683) moved the nav-item id definitions out of src/shared/constants/sidebarVisibility.ts into the extracted leaf src/shared/constants/sidebarVisibility/sections.ts. This source-scan test still read the old monolith path, so it found 0 occurrences of id: "costs-quota-share" and failed (base-red on release/v3.8.43). Repoint SIDEBAR_PATH to sections.ts where the ids now live. All four placement assertions (quota-share after quota, same array, far from costs-budget, exactly one occurrence) hold against the new source. * refactor(db): extract columns/nodes/rate-limit leaves from db/providers.ts (#5714) db/providers.ts was a 1106-line god-file mixing four concerns. Extract the three acyclic, cohesive slices into sibling leaf modules under src/lib/db/providers/, leaving the tightly-coupled connection-CRUD core in the host: - providers/columns.ts (116) 10 pure column-normalizer helpers (DB-free) - providers/nodes.ts (163) 6 provider-node CRUD functions - providers/rateLimit.ts (177) 6 rate-limit/quota runtime helpers + formatResetCountdown Host providers.ts: 1106 -> 719 lines. 
The connection-CRUD core does not call any node or rate-limit function (verified), so the host re-exports the 12 moved public symbols via `export { ... } from './providers/<leaf>'` — the module's public API stays IDENTICAL (23 symbols). Bodies moved verbatim (byte-identical); the only edit to a moved line is the added `export` on the 10 previously-private normalizers. Behavior-preserving: 122 existing provider/quota/rate-limit consumer tests stay green; new tests/unit/db-providers-split.test.ts guards the re-export barrel + characterizes the pure column helpers (38 assertions). Refs #3501 (god-file structural shrink). * refactor(db): extract types + pure mappers from db/proxies.ts (#5717) db/proxies.ts was a 1059-line god-file. Extract the two acyclic, DB-free slices into sibling leaf modules under src/lib/db/proxies/, leaving the tightly-coupled CRUD + assignment + resolution core in the host: - proxies/types.ts (65) 10 proxy type/interface declarations - proxies/mappers.ts (180) pure row mappers / scope normalizers / payload coercers (toRecord, mapProxyRow, mapAssignmentRow, isRelayProxyType, extractRelayAuth, toRegistryProxyResolution, normalizeScope, normalizeAssignmentScopeId, toLegacyProxyLevel, coerceProxyPayload, redactProxySecrets) Host proxies.ts: 1059 -> 847 lines. The resolution functions call createProxy/assignProxyToScope, so the CRUD+resolution core CANNOT be extracted without an import cycle and stays in the host. The host re-exports the 2 moved public functions (extractRelayAuth, redactProxySecrets) via `export { ... } from './proxies/mappers'` — the public API stays IDENTICAL (20 functions; no types were ever publicly exported). Bodies moved verbatim; the only host edits are the new leaf imports, the re-export, dropping the now unused `import { decrypt }`, and two prettier line-wrap reflows of retained ternary/union lines (token-identical). 
Behavior-preserving: 69 existing proxy/registry/relay/family consumer tests stay green; new tests/unit/db-proxies-split.test.ts guards the re-export barrel + characterizes the pure mappers (35 assertions). Refs #3501. * refactor(db): extract static migration data tables from migrationRunner.ts (#5721) migrationRunner.ts (1124 lines, frozen-baselined) is the startup migration orchestrator. As a conservative, zero-behaviour-risk first slice, extract the six static migration-compatibility DATA tables (verbatim) into a pure-data leaf, leaving the entire orchestrator + all SQL-running helpers in the host: - migrationRunner/constants.ts (118) RENAMED_MIGRATION_COMPATIBILITY, LEGACY_VERSION_SLOT_MIGRATIONS, SUPERSEDED_DUPLICATE_MIGRATIONS, PHYSICAL_SCHEMA_SENTINELS, INITIAL_SCHEMA_SENTINELS, OPTIONAL_FTS5_MIGRATION_VERSIONS Host migrationRunner.ts: 1124 -> 1023. The runtime fts5SupportCache (a WeakMap, mutable state) stays in the host. No public API change (these consts were module-internal). Data moved byte-identical (sed-extracted, verbatim verified); the only host edits are the leaf import + one prettier collapse of a pre-existing 2-line union type annotation to 1 line (token-identical, typecheck-confirmed). Characterize-first (operator-chosen): the existing db-migration-runner.test.ts (26 tests) + no-migration-collisions/weak-rng-fixes/check-db-rules (11) prove the reconciliation/dedup/already-applied BEHAVIOUR is unchanged; the new tests/unit/db-migrationrunner-constants-split.test.ts (7 tests) PINS THE DATA (counts + shape + spot-checks of every table) so a dropped/transposed row is caught immediately. Refs #3501. * refactor(db): extract pure SQL-source builders from usageAnalytics.ts (#5722) usageAnalytics.ts (924 lines, frozen-baselined) mixes two pure SQL-source builders with ~20 getXxxRows() query functions. 
Extract the contiguous, DB-free builder block verbatim into a leaf, leaving every query function in the host: - usageAnalytics/sources.ts (208) AnalyticsParams, BuildUnifiedSourceOptions, UnifiedSourceResult + buildUnifiedSource + buildPresetUnifiedSource (pure string builders; no DB, no imports) Host usageAnalytics.ts: 924 -> 723. The query functions do not call the builders (callers build the unified source then pass the string in), so the host re-exports the 5 moved public symbols (2 fns + 3 types) and imports AnalyticsParams as a type for its query signatures — the public API stays IDENTICAL (39 symbols). Builder bodies moved byte-identical; the two orphaned section-header banners that described the moved block were removed with it; the retained query-function suffix is byte-identical to the original. Behavior-preserving: 37 existing analytics consumer tests stay green (usage-analytics 12, usage-endpoint-dimension 3, db-usage-analytics-3500 22); new tests/unit/db-usageanalytics-split.test.ts (25 assertions) characterizes buildUnifiedSource's needsAggregated branching (raw-only vs raw+daily_usage_summary) + guards the 39-symbol re-export barrel. Refs #3501. 
* docs(readme): refresh metrics, list 17 strategies, add Quota-Share + real provider logos - Unify provider count to 236; MCP tools 87->94; cloud agents 3->4 (+Cursor); compression 9->10 engines (+relevance) - Tests -> 21,000+ across 2,586 files; footer -> v3.8.43 - Raise lower bounds to real values: 90+ free, 80+ commands, 24+ CLIs - Language flag grid 33->43 (15/14/14, all locales) - List all 17 routing strategies; new Quota-Share section before Resilience - Real provider logos (lobe-icons + local agentrouter) in providers grid and Free Forever - Top Contributors: refreshed stats + add herjarsa; 280+ title; half-size avatars; contrib.rocks 100->200 - Acknowledgments: refreshed star counts; fix headroom repo rename * docs(readme): update provider counts and add new badges * feat(memory): T10/TV6 — opt-in typed memory decay (#5723) Opt-in typed memory decay so the conversational memory store self-prunes stale episodic noise. access_count + last_accessed_at telemetry (migration 111) is always-on/non-destructive; the sweep is opt-in (MEMORY_TYPED_DECAY_ENABLED, default false). Only episodic decays by default (30d); factual/procedural/semantic immune; access_count>=3 earns immunity; deletions reuse deleteMemory (SQLite+vec+Qdrant in sync), fail-open. Regression guard: tests/unit/memory/typed-decay.test.ts (15). gaps v3.8.42 — T10/TV6. * feat(dashboard): T06/T03 — drag-reorder compression pipeline editor + studio e2e (#5727) T06: named-combos editor gains a @dnd-kit/sortable drag-to-reorder stacked pipeline backed by a pure model (compressionPipelineModel.ts: add/remove/move/update, engine->intensity invariant, never-empty). CompressionPipelineEditor.tsx replaces the inline fixed list in CompressionCombosPageClient; order persists via the existing combos endpoint (no API change). T03: adds tests/e2e/compression-studio.spec.ts (Tela A render + Play/Compare tab switch), the dedicated compression-studio e2e combo-live-studio.spec.ts did not cover. 
TDD: compression-pipeline-model.test.ts (11) + compression-pipeline-editor.test.tsx (4). gaps v3.8.42 — T06 + T03. * fix(thinking): wire Thinking-Budget boot hydration into live instrumentation path (#5312) (#5729) hydrateThinkingBudgetConfig was only called from the unused src/server-init.ts, which never runs in production, so the dashboard Thinking-Budget mode silently reverted to passthrough on every restart. Wire it into the real boot path (src/instrumentation-node.ts), next to the Global System Prompt restore. Surfaced by live Anthropic-OAuth validation on the VPS (fix A of #5312 was non-functional even though its direct unit test passed). New guard tests/unit/thinking-budget-boot-wiring-5312.test.ts asserts the production boot module calls the hydration, closing the test gap that let this ship. * refactor(usage): extract pure formatting helpers from callLogs.ts (#5725) callLogs.ts (996 lines, frozen-baselined) mixes pure log-formatting / sanitization helpers with DB CRUD, disk-artifact, and rotation logic. Extract the ten pure, DB-free helpers verbatim into a leaf, leaving all stateful code in the host: - callLogs/format.ts (129) asRecord, toNumber, toStringOrNull, truncateText, parseInlineError, normalizeDetailState, sanitizeErrorForLog, toStoredErrorSummary, protectPipelinePayloads, buildRequestSummary Host callLogs.ts: 996 -> 885. The stateful generateLogId (mutates logIdCounter) stays in the host. These helpers were all module-internal, so the public API is unchanged (10 exported functions). Bodies moved byte-identical; the host's now unused 'sanitizePII' import (only referenced inside the moved bodies) moved to the leaf; prettier wrapped buildRequestSummary's signature across lines once the 'export' prefix pushed it past 100 cols (token-identical). 
Behavior-preserving: 46 existing call-log consumer tests stay green (call-log-cap 14, pagination 4, file-rotation 5, log-retention 5, startup 1, oom 2, trim-sql 2, db-settings-maintenance 13); new tests/unit/calllogs-format-split.test.ts (26 assertions) characterizes the pure helpers + guards the 10-function public API. Refs #3501. * refactor(usage): extract pure stat/coercer helpers from usageHistory.ts (#5728) usageHistory.ts (987 lines, frozen-baselined) mixes pure DB-free helpers with an in-memory pending-request state machine and DB CRUD. Extract the contiguous pure block verbatim into a leaf, leaving all stateful code in the host: - usageHistory/helpers.ts (85) asRecord, toStringOrNull, normalizeServiceTier, toNumber, percentile, stdDev, truncatePendingPreview (+ its MAX_PREVIEW_* bounds, co-located) Host usageHistory.ts: 987 -> 916. The pending-request state machine (module Maps + track/update/finalize/sweep) and DB CRUD stay in the host. These helpers were all module-internal, so the public API is unchanged (21 direct exports + the pre-existing getCompletedDetails re-export = 22). Bodies moved byte-identical (leaf 0 non-verbatim lines); the host's local 'type JsonRecord' moved with the bodies that used it (host no longer references it — typecheck-confirmed). Behavior-preserving: 38 existing usage-history consumer tests stay green (usage-history-db 5, api-key-usage-limits 6, log-retention 5, usage-endpoint-dimension 3, provider-request-failure-pipeline 6, database-settings-maintenance 13); new tests/unit/usagehistory-helpers-split.test.ts (30 assertions) pins the percentile/stdDev formulas + normalizeServiceTier + guards the public API. Refs #3501. * refactor(usage): extract pure quota-normalize helpers from providerLimits.ts (#5730) providerLimits.ts (954 lines, frozen-baselined) is the heavily DB/network-coupled provider quota sync module. 
Extract a small, fully SELF-CONTAINED leaf of pure quota-key/quota-value normalization helpers (+ the isRecord type guard they share), leaving all sync/DB/network code in the host: - providerLimits/quotaNormalize.ts (72) isRecord, isUsageQuotaKeyAllowed, normalizeUsageQuotaKey, normalizeUsageQuotasForProvider, sanitizeUsageQuotasForProvider Host providerLimits.ts: 954 -> 890. The leaf imports only the external antigravity/agy model-alias helpers the moved bodies reference (moved from the host's import block) — it does NOT import the host, so check:cycles stays clean (no cycle). isRecord (used ~9x in the host) is co-extracted and imported back. These five were all module-internal, so the public API is unchanged (13 exported functions). Bodies moved byte-identical. Behavior-preserving: 18 existing provider-limits consumer tests stay green (sanitize-scope 3, db-provider-limits 3, proxy-fail-closed 3, rotating-expired-guard 7, codex-quota-sync 2); new tests/unit/providerlimits-quotanormalize-split.test.ts (19 assertions) pins isRecord + isUsageQuotaKeyAllowed + guards the 13-function public API. Refs #3501. * refactor(memory): extract pure scoring/conversion helpers from retrieval.ts (#5733) retrieval.ts (1192 lines — ABOVE its 1171 frozen baseline) is the memory retrieval engine (DB + vector + rerank network). Extract the pure, DB-free scoring/conversion helpers (+ the MemoryRow row shape they share) verbatim into a self-contained leaf, leaving all DB/vector/network code in the host: - retrieval/scoring.ts (104) interface MemoryRow + estimateTokens, parseMetadata, rowToMemory, getRelevanceScore Host retrieval.ts: 1192 -> 1072 — back UNDER the 1171 frozen baseline (the split also repairs the pre-existing file-size drift). The leaf imports only ../types, never the host, so check:cycles stays clean (no cycle). MemoryRow moved to the leaf and imported back as a type by the host's DB row functions. 
The public estimateTokens is re-exported from the leaf; the host also imports it for its internal token-budget loops. The other three helpers were module-internal, so the public API is unchanged (7 exports). Bodies moved byte-identical. Behavior-preserving: 38 existing memory-retrieval consumer tests stay green (rerank 5, hybrid 6, semantic 6, engine-status 9, stats-api 12); new tests/unit/retrieval-scoring-split.test.ts (11 assertions) pins estimateTokens (ceil(len/4)) + parseMetadata + rowToMemory mapping + getRelevanceScore (+20 phrase / +3 token) and guards the public API. Refs #3501. * refactor(sse): extract reasoning-tag detection/extraction from responseSanitizer.ts (#5734) responseSanitizer.ts (1133 lines, frozen-baselined) mixes reasoning-tag detection/extraction with response/usage/streaming sanitization. Extract the cohesive, ZERO-IMPORT reasoning block verbatim into a self-contained leaf: - responseSanitizer/reasoning.ts (143) the reasoning regex consts + collapseExcessiveNewlines, cleanReasoningFragment, splitClosingOnlyReasoningPrefix, movePrefixBeforeContentTagToThinking, extractThinkingFromContent, normalizeReasoningRouteId, isAntigravityReasoningRoute, isTextualReasoningTagNativeRoute, shouldParseTextualReasoningTags Host responseSanitizer.ts: 1133 -> 1003. The block's helpers only call each other, so the leaf has ZERO imports — it cannot import the host (check:cycles clean). The host imports back collapseExcessiveNewlines (6 call sites) + extractThinkingFromContent, and re-exports the two public symbols (extractThinkingFromContent, shouldParseTextualReasoningTags) — the public API stays IDENTICAL (7 exports). Bodies moved byte-identical; two long declarations (REASONING_TAG_FRAGMENT_REGEX, movePrefixBeforeContentTagToThinking signature) were line-wrapped by prettier once the 'export' prefix pushed them past 100 cols (token-identical). 
Behavior-preserving: 47 existing consumer tests stay green (response-sanitizer 36, strip-reasoning-header 8, textual-toolcall-false-positive 3); new tests/unit/responsesanitizer-reasoning-split.test.ts (11 assertions) characterizes extractThinkingFromContent + shouldParseTextualReasoningTags and guards the public API. Refs #3501. * refactor(sse): extract rate-limit header parsing from rateLimitManager.ts (#5736) rateLimitManager.ts (1034 lines, frozen-baselined) is the stateful rate-limiter (Bottleneck limiters, watchdog timers, learned-limits Map). Extract the pure, ZERO-IMPORT header-parsing block verbatim into a self-contained leaf, leaving all stateful machinery in the host: - rateLimitManager/headers.ts (94) STANDARD_HEADERS, ANTHROPIC_HEADERS, parseResetTime, toPlainHeaders Host rateLimitManager.ts: 1034 -> 945. The four items are pure (no limiter state, no external deps), so the leaf has ZERO imports — it cannot import the host (check:cycles clean). The host imports all four back (used by updateFromHeaders). They were module-internal, so the public API is unchanged (17 exports). Bodies moved byte-identical. Behavior-preserving: 21 existing rate-limit consumer tests stay green (rate-limit-manager 7, limiter-lifecycle 4, queue-timeout-msg 2, idle-eviction 6, body-lock 2); new tests/unit/ratelimitmanager-headers-split.test.ts (7 assertions) pins parseResetTime (durations / bare-number / nullish) + toPlainHeaders + guards the 17-function public API (with a watchdog-timer teardown hook so the runner exits cleanly). Refs #3501. * fix(config): back boot-hydrated proxy config singletons with globalThis (#5312) (#5742) Next.js compiles instrumentation.ts as a separate webpack module graph from the app-route/open-sse executors, so a module-local `let _config` is duplicated: the boot-time hydration (applyRuntimeSettings / restore hooks) lands on the instrumentation graph's copy, but the request path (base.ts) reads a different, un-hydrated copy. 
Live VPS validation proved the Thinking-Budget hydrate ran to completion at boot yet base.ts still read the passthrough default — why #5312 fix A stayed broken after the boot-wiring fix. Back the singletons with globalThis (the pattern systemPrompt.ts already uses for #2470) so all graph copies share one instance: - thinkingBudget.ts — dashboard Thinking-Budget mode reaches the executor - backgroundTaskDetector.ts — opt-in background degradation actually fires - systemTransforms.ts — operator pipeline overrides reach the request path payloadRules.ts was already safe (lazy per-request DB self-load, #2986). Guards: thinking-budget-globalthis-5312 + runtime-config-globalthis-5312 (assert globalThis sharing; a module-local let fails them, RED->GREEN). * refactor(evals): extract built-in golden-set suites from evalRunner.ts (#5740) Move the 7 static built-in eval suites (golden-set, coding-proficiency, reasoning-logic, multilingual, safety-guardrails, instruction-following, codex-comparison) plus the builtInSuites aggregate into the pure-data leaf src/lib/evals/evalRunner/builtinSuites.ts (zero imports, no side effects). evalRunner.ts keeps all logic (register/get/list/evaluate/run/scorecard/reset) and registers the leaf suites at module load, mirroring the original inline calls. Public API is unchanged (7 exported functions; the suite consts were already module-private). Host 960->301 LOC; leaf 676 LOC (< 800 cap); host was frozen-satisfied (961), so this is debt reduction. Suite data moved verbatim (652 data lines byte-identical). New split-guard test characterizes the suite ids/case counts/key cases and proves the host registers every leaf suite at load. 
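A minimal sketch of why the globalThis backing in #5742 matters, assuming a simplified config shape. The names `SketchBudgetConfig`, `SKETCH_BUDGET_KEY`, and both factory functions are hypothetical and are not taken from thinkingBudget.ts. Each factory call stands in for one separately bundled copy of the same module.

```typescript
// Hypothetical config shape; the real thinkingBudget.ts type is not shown in this changelog.
type SketchBudgetConfig = { mode: string };

// Hypothetical global key, chosen only for this sketch.
const SKETCH_BUDGET_KEY = "__sketchThinkingBudgetConfig";

interface SketchConfigAccessor {
  setConfig(config: SketchBudgetConfig): void;
  getConfig(): SketchBudgetConfig;
}

// Stands in for one bundled copy of a module that keeps its state in a
// module-local variable. Each call models a separate module graph.
function createModuleLocalCopy(): SketchConfigAccessor {
  let config: SketchBudgetConfig = { mode: "passthrough" };
  return {
    setConfig(next) {
      config = next;
    },
    getConfig() {
      return config;
    },
  };
}

// Same module, but the state lives on globalThis, so every copy reads
// and writes the same slot.
function createGlobalBackedCopy(): SketchConfigAccessor {
  const store = globalThis as unknown as Record<string, SketchBudgetConfig | undefined>;
  return {
    setConfig(next) {
      store[SKETCH_BUDGET_KEY] = next;
    },
    getConfig() {
      return store[SKETCH_BUDGET_KEY] ?? { mode: "passthrough" };
    },
  };
}

// Boot-time hydration runs in one copy, the request path reads another.
const localBoot = createModuleLocalCopy();
const localRequest = createModuleLocalCopy();
localBoot.setConfig({ mode: "auto" });

const globalBoot = createGlobalBackedCopy();
const globalRequest = createGlobalBackedCopy();
globalBoot.setConfig({ mode: "auto" });
```

In this sketch `localRequest` never sees the hydrated value, which mirrors the reported symptom, while `globalRequest` does.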
* refactor(models): extract pure transform layer from modelsDevSync.ts (#5743) Move the models.dev data-model types, the provider-id mapping table (MODELS_DEV_PROVIDER_MAP + mapProviderId), and the raw->OmniRoute transforms (transformModelsDevToPricing, transformModelsDevToCapabilities) into the pure leaf src/lib/modelsDevSync/transform.ts (zero imports, no DB, no module state). modelsDevSync.ts keeps all sync orchestration, DB access, caches and the periodic-sync timer; it imports the transforms for internal use and re-exports mapProviderId/transformModelsDevToPricing/transformModelsDevToCapabilities plus the ModelCapabilityEntry/CapabilitiesByProvider types, so the public API is unchanged. Host 924->677 LOC; leaf 279 LOC (< 800 cap); host was frozen-satisfied (934), so this is debt reduction. 238 moved lines are byte-identical. New split-guard test characterizes the provider map + both transforms and proves the host re-exports them. * refactor(resilience): split settings.ts into types + normalize leaves (#5745) Decompose the (fully pure) resilience settings module into two sibling leaves: - src/lib/resilience/settings/types.ts: the settings shape (11 public interfaces + JsonRecord/AuthCategory), zero imports. - src/lib/resilience/settings/normalize.ts: the coercers (asRecord/toInteger/ toBoolean/feature-flag resolvers) + the 11 per-section normalize* functions. settings.ts keeps DEFAULT_RESILIENCE_SETTINGS, DEFAULT_REQUEST_QUEUE_MAX_WAIT_MS, buildLegacyFallback, and the public orchestrators (resolveResilienceSettings, mergeResilienceSettings, buildLegacyResilienceCompat); it imports the coercers/normalizers for internal use and re-exports the 11 settings interfaces, so the public API is unchanged. Host 840->363 LOC; leaves 182 + 359 LOC (< 800 cap); host was frozen-satisfied (841), so this is debt reduction. 472 moved lines are byte-identical; no cycles (leaves never import the host). 
New split-guard test characterizes the coercers/normalizers and the host resolve/merge/compat orchestration. * docs(readme): document faster/leaner install — skip native build, sql.js fallback (#5713) Documents the optional better-sqlite3 + pure-JS fallback chain and OMNIROUTE_SKIP_POSTINSTALL/CI skip flags. Docs-only, claims verified. (#5550) * feat(compression): T02 opt-in per-engine pipeline circuit-breaker (#5735) Opt-in, default-off per-engine circuit-breaker for the stacked compression pipeline. Byte-identical to legacy when off. 9 regression tests. * docs: sync MCP tool count to 95 + routing-strategy count (#5732) Sync CLAUDE.md/README.md to canonical MCP tool count (95, 35 base) and routing strategies (17). Numbers fact-checked against getAllToolDefinitions()/ROUTING_STRATEGY_VALUES. * feat(api): add first-class Ollama local provider card (#5712) First-class ollama-local provider card (localhost:11434/v1, keyless, passthrough models) in LOCAL_PROVIDERS + SELF_HOSTED + default.ts executor case. Docs count 236→237, Local 11→12 (full README sweep). 4 tests. (#5578) * feat(api): add opt-in API-key provider quota-policy bypass scope (#5731) Adds an opt-in per-API-key scope (policy:bypass-provider-quota) that lets a key skip provider/account-side quota cutoffs during routing. Operator USD budgets/usage limits still enforced unconditionally (fail-closed, before the bypass). Default-off; UI toggle + badge in API Manager. Integrated into release/v3.8.43. * feat(codex): opt-in auto-sync of Codex profiles after model discovery (#5737) Auto-sync ~/.codex/.config.toml profiles after a provider model sync, reusing the setup-codex generator. Opt-in, default OFF (OMNIROUTE_AUTO_SYNC_CODEX_PROFILES=true; also honors CLI_ALLOW_CONFIG_WRITES). Never touches the active Codex config. Gating test added. 
Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> feat(providers): opt-in CLI profile auto-sync toggles + Claude Code auto-sync (#5755) Providers-dashboard 'CLI profile auto-sync' card (Codex + Claude Code toggles), feature-flag backed (default off), + Claude Code auto-sync mirroring the Codex path. Follow-up to #5737. * feat(compression): T08/H8 (2.3) — graduated CCR retrieval-feedback ramp (#5739) Turns CCR retrieval feedback from a binary cliff into a graduated ramp: each prior retrieval raises a block's effective minChars linearly (effectiveMinChars); >= 3 retrievals still excluded (Infinity). retrievalRampFactor default 2 (config/env COMPRESSION_CCR_RETRIEVAL_RAMP_FACTOR); 1 = legacy binary. Regression guard: tests/unit/compression/ccr-retrieval-ramp.test.ts (12); 51 existing CCR tests green. gaps v3.8.42 — T08/H8 (2.3). * feat(compression): T08/H5 (2.4) — usage-observed prefix freeze (opt-in) (#5744) Evolves the cache-aware guard to also learn which system prompts recur: observed >= threshold → treated as a stable cacheable prefix and preserved even for providers the static check misses. Content-addressed by a hash of the system prompt (OpenAI/Claude/Gemini), in-memory, freeze=preserve (never mutates). Opt-in/default-off (COMPRESSION_PREFIX_FREEZE_ENABLED); respects the never preserve-mode. New prefixFreeze.ts wired into resolveCacheAwareConfig. Regression guard: prefix-freeze.test.ts (10); 44 cache-aware tests green. gaps v3.8.42 — T08/H5 (2.4). * feat(compression): T08/H7 (2.5) — read-lifecycle engine (collapse superseded reads) (#5754) New opt-in, default-off read-lifecycle engine: collapses stale/superseded file-Read tool results (same path re-read OR modified later) to a stub, keeping the current Read intact. Anthropic + OpenAI tool shapes; conservative (known tool names, exact path, strictly-later); fail-open. Lossy → opt-in. Regression guard: read-lifecycle.test.ts (10); 41 registry/pipeline suites green. gaps v3.8.42 — T08/H7 (2.5). 
Completes Onda 2. * fix(sse): anti-thundering-herd guard tolerates numeric-epoch cooldowns (#5747) markAccountUnavailable's dedupe guard used a raw `new Date()` on rateLimitedUntil, which can hold a numeric-epoch string (e.g. the Antigravity full-quota path via setConnectionRateLimitUntil). That produced Invalid Date/NaN, so the guard never detected an already cooling connection — a second concurrent failure on the same connection overwrote a long quota-exhaustion cooldown with a much shorter fresh backoff cooldown, making the account selectable again far sooner than intended. Reuses the existing cooldownUntilMs normalizer (#3954) instead of a raw Date parse. * fix(chat): harden non-streaming SSE aggregation (#5746) * fix: repoint DashScope/Alibaba setup links to consoles (#5665) (#5762) * fix: point Quick Start step 1 to API Keys page, not Endpoint (#5695) (#5763) * fix: onboarding wizard saves providers with unsupported validation (#5692) (#5764) * docs(security): document full LOCAL_ONLY route set + GHSA-fhh6-4qxv-rpqj + audit path (#5599) (#5748) Expand ROUTE_GUARD_TIERS.md Tier 1 (LOCAL_ONLY): - link the GHSA advisory and explain the attack class (RCE via a subprocess spawn reachable from non-loopback traffic) - replace the 3-example prefix table with the full LOCAL_ONLY set, mirroring LOCAL_ONLY_API_PREFIXES / LOCAL_ONLY_API_PATTERNS in routeGuard.ts (the authoritative source; check-route-guard-membership enforces the code side) - add an "Operator guidance & auditing" section for users behind nginx/Cloudflare/Tailscale: don't forge X-Forwarded-For loopback, keep the manage-scope bypass minimal, and how to audit non-loopback access Docs-only; SECURITY.md already links here. 
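A rough sketch of the normalization idea behind #5747, assuming the stored cooldown can be a number, a Date, an ISO string, or a numeric-epoch string. The function bodies are illustrative only. The changelog names a real cooldownUntilMs normalizer, but its implementation is not shown, so `sketchCooldownUntilMs` and `sketchIsStillCooling` are hypothetical.

```typescript
// Converts the supported cooldown representations to epoch milliseconds,
// or null when the value cannot be interpreted.
function sketchCooldownUntilMs(value: unknown): number | null {
  if (typeof value === "number") {
    return Number.isFinite(value) ? value : null;
  }
  if (value instanceof Date) {
    const time = value.getTime();
    return Number.isNaN(time) ? null : time;
  }
  if (typeof value === "string") {
    const trimmed = value.trim();
    // A digits-only string is treated as epoch milliseconds instead of
    // being handed to the Date parser.
    if (/^\d+$/.test(trimmed)) return Number(trimmed);
    const parsed = Date.parse(trimmed);
    return Number.isNaN(parsed) ? null : parsed;
  }
  return null;
}

// Dedupe guard: true when the connection already has a cooldown that
// extends past `nowMs`, so a shorter fresh backoff should not replace it.
function sketchIsStillCooling(rateLimitedUntil: unknown, nowMs: number): boolean {
  const untilMs = sketchCooldownUntilMs(rateLimitedUntil);
  return untilMs !== null && untilMs > nowMs;
}
```

Treating an unparseable value as "not cooling" is an assumption of this sketch; the real guard may choose differently.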
Closes #5599 * docs(security): document banned-keyword / account-ban detection (#5600) (#5756) * docs(security): add BAN_DETECTION.md — banned-keyword / account-ban detection (#5600) New docs/security/BAN_DETECTION.md documenting the previously-undocumented system: - the 8 built-in ACCOUNT_DEACTIVATED_SIGNALS + custom keywords are additive - detection flow (body substring match -> terminal `banned` state, skipped in account selection; `deactivated` on 401/403; autoDisableBannedAccounts) - scope: global (all providers); the signal strings target OAuth/subscription scrapers - custom keywords: add path, 200-char cap, hot-reload, and the false-positive warning (raw substring match -> prefer full ban sentences, not "quota"/"limit") - recovery: terminal states never auto-recover -> re-test / re-auth / re-enable Registered in security meta.json; cross-linked from RESILIENCE_GUIDE (terminal states). Docs-only. Closes #5600 * docs(security): clarify deactivated vs expired terminal-status split (#5600) The same ACCOUNT_DEACTIVATED signal surfaces as two different terminal statuses depending on the code path: chatCore.ts inline writes 'deactivated' (401/403 via classifyProviderError), while markAccountUnavailable() -> resolveTerminalConnectionStatus() writes 'expired'. Document both. 
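The substring-based detection described for BAN_DETECTION.md can be sketched as below. This is a sketch under stated assumptions, not the project's code: the two built-in strings are placeholders for the 8 real ACCOUNT_DEACTIVATED_SIGNALS, and both the case-insensitive match and the choice to drop over-long keywords are guesses.

```typescript
// Hypothetical stand-ins; the 8 real ACCOUNT_DEACTIVATED_SIGNALS strings
// are not listed in this changelog.
const SKETCH_BUILT_IN_SIGNALS: readonly string[] = [
  "your account has been deactivated",
  "your account has been suspended",
];

// The changelog states a 200-character cap on custom keywords.
const SKETCH_MAX_KEYWORD_LENGTH = 200;

// Assumption: empty or over-long keywords are dropped rather than truncated.
function sketchUsableKeywords(custom: readonly string[]): string[] {
  return custom
    .map((keyword) => keyword.trim())
    .filter((keyword) => keyword.length > 0 && keyword.length <= SKETCH_MAX_KEYWORD_LENGTH);
}

// Assumption: matching is case-insensitive. Custom keywords are additive
// to the built-in signals, as the changelog describes.
function sketchLooksBanned(responseBody: string, custom: readonly string[]): boolean {
  const haystack = responseBody.toLowerCase();
  const signals = SKETCH_BUILT_IN_SIGNALS.concat(sketchUsableKeywords(custom));
  return signals.some((signal) => haystack.includes(signal.toLowerCase()));
}
```

With a custom keyword such as "quota", an ordinary rate-limit body would match, which is the false-positive risk the documentation warns about.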
* fix: surface relay proxy-test errors instead of silent failure (#5716) (#5765) * refactor(api): extract pure discovery leaves from provider-models route (#5758) Split src/app/api/providers/[id]/models/route.ts (2511 -> 1818 LOC) by moving the cohesive, DB-free discovery building blocks into four leaves under discovery/: - helpers.ts record/string coercion, Azure + base-url helpers, bearer/named-openai header builders - normalizers.ts Antigravity / DataRobot / OpenAI-like / SAP models response normalizers - providerModelsConfig.ts PROVIDER_MODELS_CONFIG + ProviderModelsConfigEntry - providerSets.ts NAMED_OPENAI_STYLE_PROVIDERS + isNamedOpenAIStyleProvider The host keeps all request orchestration and imports the leaves back. The moved symbols were module-private, so the route's public export set (GET) is unchanged and no external importer needs updating. Bodies are byte-identical: the code-line multiset of host + leaves equals the original route verbatim. Tests: - repoint the qwen-web source-guard in catalog-updates-v3829-kimi-qwen to the new config leaf (assertions unchanged) - add provider-models-discovery-split as the split regression guard (leaf public surface + host wiring + the #5570 cablyai->aimlapi entry swap) * fix(memory): enabling Qdrant activates it as the engine + inline guidance (#5597) (#5741) * fix(memory): enabling Qdrant now activates it as the engine + inline guidance (#5597) Enabling Qdrant in the Engine tab was inert: retrieval only routes to Qdrant when memoryVectorStore === "qdrant" (the default "auto" never selects it), and the card only wrote qdrantEnabled — nothing set the engine selector, and there is no UI for it. So users configured Qdrant, saw "enabled", but it was never actually used. - PUT /api/settings/qdrant now sets memoryVectorStore alongside the toggle: enable -> "qdrant", disable -> "auto". Editing other fields leaves it untouched. 
- Add inline guidance to QdrantConfigCard: a Tier-1-vs-Tier-2 banner + per-field help (host, collection, embedding model). Note there is no "vector dimension" or "distance metric" field: dimension is auto-detected from the embedder, distance is always Cosine. - Document the real behavior in MEMORY.md: engine gate, no back-fill of existing memories, dimension auto-detect, Cosine-only, API-key-only auth. Tests: tests/integration/qdrant-routes.test.ts — enable->qdrant, disable->auto, and field-edit-without-enabled leaves the engine untouched (TDD: red -> green). Closes #5597 * fix(memory): invalidate memory-settings cache on Qdrant toggle (#5597) The PUT handler wrote memoryVectorStore to the DB but retrieval reads through getMemorySettings(), a module-level cache. Without busting it, the engine switch did not take effect until a process restart (the DB said qdrant, retrieval kept routing to sqlite-vec). Now calls invalidateMemorySettingsCache() after the write, mirroring src/app/api/settings/memory/route.ts. Regression test warms the cache, toggles via the route, and asserts getMemorySettings().vectorStore flips to qdrant (fails without the invalidate call). * fix(compression): record Context Editing telemetry on the streaming path (#5761) Streaming SSE responses now preserve context_management from the final message_delta snapshot and fire the telemetry hook in onStreamComplete, so context-clear savings surface in compression analytics for streaming (not just non-streaming). Additive telemetry, Claude-only, opt-in-neutral. gaps v3.8.42 — T01 (5.1). Test: context-editing-streaming-telemetry.test.ts (3, failing->passing). * Persist batch item checkpoints during recovery (#5753) * fix(sse): checkpoint batch item recovery * fix(db): renumber batch checkpoints migration 110→112 (collision with #5667) 110 was taken by 110_model_context_overrides.sql (#5667), which landed on the release branch after this PR branched. 
migrationRunner throws a hard version-collision error on startup when two files share a numeric prefix. 112 is the next free slot (110/111 taken on the release tip). Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> --------- Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> * fix: resolve CCR MCP retrieve principal from api-key auth context (#5649) (#5768) * feat(cli): show version in startup banner (integrates #5752) (#5769) * feat(cli): show version in startup banner Print dim 'v<version>' line below ASCII art logo in omniroute serve. Uses readFileSync (same pattern as program.mjs) to read package.json. Closes #5749. * test(cli): guard startup-banner version line (#5752) Source-inspection test (same pattern as cli-serve-port.test.ts) asserting serve.mjs parses the version from package.json and prints v${_pkg.version} in the startup banner; satisfies Hard Rule #8 for the bin/ change. Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> * docs(changelog): credit #5752 startup-banner version line (thanks @chirag127) --------- Co-authored-by: Chirag Singhal <76880977+chirag127@users.noreply.github.com> * fix(proxyfetch): skip fallback for non-replayable bodies (#5770) * chore(release): open v3.8.42 cycle Bump version to 3.8.42, add CHANGELOG placeholder, sync openapi/electron/open-sse + 42 i18n CHANGELOG mirrors. 
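The migration renumbering in #5753 (110 to 112) follows from a prefix-collision rule that might be checked roughly as follows. This is an illustrative sketch, not migrationRunner's code. Only `110_model_context_overrides.sql` is named in the changelog. The other two file names and both helper functions are hypothetical.

```typescript
// Groups migration file names by their leading numeric prefix and returns
// only the prefixes claimed by more than one file.
function sketchFindPrefixCollisions(fileNames: readonly string[]): Map<number, string[]> {
  const byPrefix = new Map<number, string[]>();
  for (const name of fileNames) {
    const match = /^(\d+)_/.exec(name);
    if (match === null) continue; // not a numbered migration file
    const prefix = Number(match[1]);
    const group = byPrefix.get(prefix) ?? [];
    group.push(name);
    byPrefix.set(prefix, group);
  }
  const collisions = new Map<number, string[]>();
  byPrefix.forEach((group, prefix) => {
    if (group.length > 1) collisions.set(prefix, group);
  });
  return collisions;
}

// Returns the lowest prefix at or above `start` that no file uses yet.
function sketchNextFreePrefix(fileNames: readonly string[], start: number): number {
  const used = new Set<number>();
  for (const name of fileNames) {
    const match = /^(\d+)_/.exec(name);
    if (match !== null) used.add(Number(match[1]));
  }
  let candidate = start;
  while (used.has(candidate)) candidate += 1;
  return candidate;
}

const sketchMigrationFiles = [
  "110_model_context_overrides.sql", // named in the changelog (#5667)
  "110_batch_checkpoints.sql", // hypothetical name for the colliding migration
  "111_example.sql", // hypothetical; the changelog only says 111 is taken
];
```

Running both helpers over `sketchMigrationFiles` reports the 110 collision and yields 112 as the next free slot, matching the renumbering described in the entry.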
* chore: remove unused qdrant schema aliases (#5404) Integrated into release/v3.8.42 * chore: remove unused memory schema aliases (#5403) Integrated into release/v3.8.42 * chore: remove unused quota schema types (#5402) Integrated into release/v3.8.42 * chore: remove unused playground row type (#5401) Integrated into release/v3.8.42 * chore: remove unused codegraph exports (#5400) Integrated into release/v3.8.42 * chore: remove unused notion client type (#5399) Integrated into release/v3.8.42 * chore: remove unused settings types (#5398) Integrated into release/v3.8.42 * chore: remove unused combo types (#5396) Integrated into release/v3.8.42 * chore: remove unused provider types (#5393) Integrated into release/v3.8.42 * chore: remove unused skillssh skill type (#5392) Integrated into release/v3.8.42 * chore: remove unused status hex key type (#5391) Integrated into release/v3.8.42 * chore: remove unused batch provider type (#5390) Integrated into release/v3.8.42 * chore: remove unused skills schema types (#5389) Integrated into release/v3.8.42 * chore: remove unused codex auth input type (#5388) Integrated into release/v3.8.42 * chore: remove unused memory schema types (#5387) Integrated into release/v3.8.42 * chore: remove unused playground row type (#5386) Integrated into release/v3.8.42 * chore: remove unused qdrant schema types (#5385) Integrated into release/v3.8.42 * chore: remove unused kiro social schema (#5384) Integrated into release/v3.8.42 * chore: remove unused memory schema types (#5383) Integrated into release/v3.8.42 * chore: remove unused audit action type (#5382) Integrated into release/v3.8.42 * chore: remove unused agent skills schema types (#5381) Integrated into release/v3.8.42 * chore: remove unused shared logger default export (#5380) Integrated into release/v3.8.42 * chore: remove unused sse logger helpers (#5378) Integrated into release/v3.8.42 * chore: remove unused sse model legacy helpers (#5377) Integrated into release/v3.8.42 * 
chore: remove unused v1 search response schema (#5376) Integrated into release/v3.8.42 * chore: remove unused cloud agent result schemas (#5375) Integrated into release/v3.8.42 * chore: remove unused a2a routing logger readers (#5374) Integrated into release/v3.8.42 * chore: remove unused webhook delivery detail export (#5372) Integrated into release/v3.8.42 * chore: remove unused api key type (#5395) Integrated into release/v3.8.42 * chore: remove unused usage types (#5397) Integrated into release/v3.8.42 * chore: remove unused cloud agent input types (#5373) Integrated into release/v3.8.42 * deps: bump electron from 42.4.1 to 42.5.1 in /electron (#5413) Integrated into release/v3.8.42 * deps: bump the production group with 11 updates (#5414) Integrated into release/v3.8.42 * fix: frame non-streaming JSON responses (#5416) Integrated into release/v3.8.42 * fix(services): runNpm shell on win32 + prefix via env for Node 24 EINVAL (#5379) (#5474) Node 24 refuses execFile of npm.cmd without a shell (nodejs/node#52554), so embedded-service install (9Router/CLIProxy) failed with spawn EINVAL on Windows. runNpm now enables shell on win32 only; to stay Hard-Rule-#13 safe under a shell, the install --prefix is passed via npm_config_prefix (env) instead of an argv path (survives spaces), and the user-supplied version is constrained by SERVICE_VERSION_PATTERN at the route boundary. 
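The runNpm change in #5474 combines three ideas: a shell on win32 only, the install prefix passed through the environment, and a validated version string. A minimal sketch of how such an invocation could be assembled is below. `sketchBuildNpmInstall`, `SketchNpmInvocation`, and the regex are hypothetical. The real SERVICE_VERSION_PATTERN is not shown in the changelog, and the package name is assumed to come from a fixed allow-list.

```typescript
interface SketchNpmInvocation {
  command: string;
  args: string[];
  shell: boolean;
  env: Record<string, string>;
}

// Hypothetical pattern; the real SERVICE_VERSION_PATTERN is not shown in this changelog.
const SKETCH_VERSION_PATTERN = /^(latest|\d+\.\d+\.\d+(-[0-9A-Za-z.]+)?)$/;

function sketchBuildNpmInstall(
  platform: string,
  packageName: string, // assumed to come from a fixed allow-list, never from user input
  version: string, // user-supplied, so it is validated before use
  prefixDir: string,
  baseEnv: Record<string, string>,
): SketchNpmInvocation {
  if (!SKETCH_VERSION_PATTERN.test(version)) {
    throw new Error("Rejected service version: " + JSON.stringify(version));
  }
  const isWindows = platform === "win32";
  return {
    command: isWindows ? "npm.cmd" : "npm",
    args: ["install", packageName + "@" + version],
    // A shell is enabled on Windows only, as the changelog describes.
    shell: isWindows,
    // npm reads npm_config_prefix from the environment, so the install
    // directory never appears in the shell-interpreted argument list.
    env: { ...baseEnv, npm_config_prefix: prefixDir },
  };
}
```

The returned object is shaped so it could be handed to a process-spawning call. Keeping the builder pure makes the shell and prefix decisions easy to unit-test without running npm.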
* fix(cli): restore dist/tls-options.mjs to npm tarball (#5452) (#5503) Closes #5452 * fix(dashboard): render onboarding wizard on /providers/new (#5427) (#5505) Closes #5427 * fix(db): EBUSY-safe database import on Windows (#5406) (#5507) Closes #5406 * chore: remove unused gamification streak exports (#5463) * chore: remove unused headroom log tail export (#5464) * chore(dead-code): remove unused prompt cache control helper (#5466) * chore(duplication): share vscode metadata helpers (#5471) * chore(duplication): share auth zip extractors (#5475) * chore(duplication): share vscode tokenized request helper (#5479) * chore(duplication): share quota strategy ranking helpers (#5482) * chore(duplication): share recharts donut card (#5484) * chore(duplication): share provider specific validation (#5485) * chore(duplication): share batch response formatter (#5488) * chore(duplication): share redis runtime helpers (#5490) * chore(duplication): share version manager request parsing (#5492) * chore(duplication): share media generation route helpers (#5493) * chore(duplication): share settings transform schemas (#5496) * chore(duplication): share relay stream finalizer (#5497) * chore(duplication): share machine id fallback (#5498) * chore(duplication): share node sqlite adapter (#5500) * fix: treat terminal stream cancels as complete (#5491) * fix post-merge ci regressions (#5467) * fix: gate claude adaptive thinking defaults (#5480) Co-authored-by: KooshaPari <koosha@example.com> * fix(fallback): normalize provider error rule headers (#5473) Co-authored-by: KooshaPari <koosha@example.com> * fix(rate-limit): normalize queue refresh settings (#5499) Co-authored-by: KooshaPari <koosha@example.com> * chore(ci): add npm fetch-retry + release-freeze protocol (Hard Rule #21) (#5506) - .npmrc: bump fetch-retries 2->5 with backoff so transient registry ECONNRESET during npm ci (electron-release, v3.8.41) retries instead of failing the job; applies repo-wide. 
- CLAUDE.md Hard Rule #21: release-freeze coordination marker (label release-freeze) that campaign workflows honor before merging into the active release branch, preventing the mid-release commit races that forced CHANGELOG re-reconciliation in v3.8.40/v3.8.41. * chore(duplication): share service install helpers (#5495) Share service install helpers; re-add SERVICE_VERSION_PATTERN regex to the shared schema (dropped in extraction, #5474) + tests rejecting malformed versions. Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> * chore(duplication): share proxy route handlers (#5472) Share proxy route handlers; add resolveProxyLookupResponse regression test (3 branches + custom whereUsed param name). Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> * chore(duplication): share combo builder model options (#5477) Share combo builder model options; add regression test locking custom-model source classification (manual->custom, api-sync->imported). Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> * chore(dead-code): ratchet dead code baseline (#5468) Ratchet dead-code baseline to the true measured value (310 -> 225) after the v3.8.42 dead-code + duplication wave. Measured by check-dead-code.mjs on the tip. Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> * fix(dashboard): provider-add UX — i18n labels, surface import warning, default key name (#5511) * fix(dashboard): provider-add UX — real i18n labels, surface import warning, default key name (#5421 #5428 #5429 #5431 #5435) Three rough edges in the Add-API-Key / model-import flow, all from the provider-catalog audit: 1. Validation Model + Account ID form fields shipped untranslated i18n stub copy ('Validation Model Id Label', etc.) that rendered verbatim. Replaced with real copy in en.json. 2. Model import silently fell back to the cached/local catalog — the route returns a 'warning' field the import hook never read. New pure helper extractImportWarning surfaces it as a log line. 3. 
Required connection-name field defaulted to '' (let browser autofill inject garbage like 'wiw'); now defaults to 'main'. Regression guard: tests/unit/provider-add-ux-i18n-import-warning.test.ts. * fix(dashboard): compress AddApiKeyModal comment to keep file under frozen size cap * fix(providers): align Muse Spark (Meta AI) cookie copy to ecto_1_sess (#5449) (#5513) * fix(providers): align Muse Spark (Meta AI) cookie copy to ecto_1_sess (#5449) The default Meta AI session cookie migrated from the retired abra_sess to ecto_1_sess (META_AI_DEFAULT_COOKIE), but the provider form hint and one 401 auth-failure message still named abra_sess, telling users to paste a cookie that no longer exists. Both strings now name ecto_1_sess. Regression guard: tests/unit/muse-spark-cookie-copy-5449.test.ts. * chore: reconcile CHANGELOG with release (keep #5449 + #5511 bullets) * fix(providers): correct FriendliAI (serverless) + Novita (/openai/v1) endpoints (#5430 #5455) (#5515) * fix(providers): correct FriendliAI (serverless) and Novita (/openai/v1) endpoints (#5430 #5455) Both rejected valid keys, verified live with real provider keys: - FriendliAI baseUrl was /dedicated/v1/... which 403s a serverless flp_* token; switched to /serverless/v1/... + serverless modelsUrl. - Novita baseUrl was the legacy /v3/... with a typo'd model id ai-ai/... (both 404); switched to OpenAI-compat /openai/v1/... + meta-llama/llama-3.1-8b-instruct. Regression guard: tests/unit/provider-endpoints-friendliai-novita.test.ts. * chore: reconcile CHANGELOG with release (keep #5430/#5455 + prior bullets) * fix(providers): gate import for tool-only providers + sanitize Coze validation error (#5420 #5426) (#5522) #5420: the 'Import Models' button now hides for tool-only providers (web search / web fetch) via a capability check over resolved serviceKinds, not just the -search suffix — firecrawl/jina-reader (webFetch) no longer show an Import button that 400s. No LLM/media provider is affected. 
#5426: Coze key validation no longer leaks the raw upstream envelope ({code,msg,logId,from}) into the UI; the Coze error becomes a friendly message, scoped to provider === 'coze' so no other provider is affected. Regression guards: tests/unit/model-listing-capability-5420.test.ts, tests/unit/coze-validation-error-5426.test.ts. * fix(providers): correct LongCat free tier — GA LongCat-2.0, one-time 10M (KYC) (#5508) LongCat's preview ended and the Flash-* line was retired (2026-05-29); the API now exposes only the GA LongCat-2.0 (1M context, 128K output). The free tier is a ONE-TIME 10M-token grant unlocked after account signup + KYC verification — NOT a recurring daily/monthly allowance. The catalog still described the retired preview/Flash models and a recurring 150M / 5M-per-day budget; this corrects every reference. Config / code: - registry/longcat: model LongCat-2.0-Preview -> LongCat-2.0, name + comment reflect one-time 10M (KYC) and pay-as-you-go beyond it. - freeModelCatalog: longcat-2.0-preview (150M, recurring-daily) -> LongCat-2.0 (10M, freeType one-time-initial via creditTokens). - freeTierCatalog: drop longcat from the recurring-monthly budget map (one-time credits are excluded by that catalog's own rule). - regional.ts freeNote: one-time 10M after signup + KYC, not recurring. - providerCostData: longcat-flash-lite -> longcat-2.0 (pay-as-you-go 0.75/2.95 per 1M, 10M free quota). - validation probe model longcat -> LongCat-2.0. Tests: - free-tier-catalog: longcat now absent from FREE_TIER_BUDGETS; providerCount 22->21 (clean 21->20); documented total ~1.39B. - tierResolver: sample model flash-lite -> LongCat-2.0. Docs: - README, PROVIDERS-GUIDE, FREE-TIERS-GUIDE, FREE_TIERS: 50M/day Flash-Lite -> one-time 10M LongCat-2.0 (KYC); 'No auth' -> API key + KYC. - Regenerated PROVIDER_REFERENCE.md (picks up the new freeNote). typecheck:core clean; changed-file lint 0 errors; docs-sync PASS. 
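For the Coze validation fix (#5426) in this release, the provider-scoped sanitization could look roughly like this. The sketch is hypothetical throughout: the friendly message text, the envelope check, and the fall-through for non-JSON errors are assumptions. Only the envelope field names ({code,msg,logId,from}) and the `provider === 'coze'` scoping come from the changelog.

```typescript
// Hypothetical friendly copy; the real message text is not shown in this changelog.
const SKETCH_COZE_FRIENDLY_MESSAGE = "Coze rejected this API key. Check the key and try again.";

// True when the value looks like the raw Coze envelope ({ code, msg, logId, from }).
function sketchIsCozeEnvelope(value: unknown): boolean {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return "code" in record && "msg" in record;
}

function sketchSanitizeValidationError(provider: string, rawError: string): string {
  // Scoped to Coze so every other provider's message passes through untouched.
  if (provider !== "coze") return rawError;
  try {
    return sketchIsCozeEnvelope(JSON.parse(rawError)) ? SKETCH_COZE_FRIENDLY_MESSAGE : rawError;
  } catch {
    return rawError; // not JSON, so there is no envelope to hide
  }
}
```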
* fix(providers): Bytez OpenAI-compat base URL + auth-only key validation (#5422) (#5528) Bytez IS OpenAI-compatible at .../models/v2/openai/v1, but the registry stored the bare .../models/v2 base, so validation's chat-probe hit .../models/v2/chat/completions -> 404 -> 'endpoint not supported'. Part A: registry baseUrl -> full OpenAI-compat chat path. Part B: a Bytez account only serves catalog-provisioned models, so chat-probe validation 404s even for valid keys. validateBytezProvider instead probes the auth-only GET .../models/v2/list/tasks (200=valid, 401/403=invalid). Verified live with a real key: list/tasks -> 200 (valid) / 401 (invalid). Regression guard: tests/unit/bytez-validation-5422.test.ts. * fix(providers): remove dead Phind provider + dedupe HuggingChat catalog listing (#5530) Integrated into release/v3.8.42 (round 3). Dead Phind removal + HuggingChat dedupe, verified complete. * fix: protect dynamic dashboard tests with CSRF (#5405) Integrated into release/v3.8.42 (round 3). Reworked CSRF (HMAC-signed synchronized token). * docs: clarify bifrost relay backend envs (#5520) Integrated into release/v3.8.42 (round 3). Doc-only: bifrost relay envs. * test(quota): guard Claude-Code identity version lockstep (Phase 2) (#5514) Integrated into release/v3.8.42 (round 3). Claude-Code identity version lockstep guard. * feat(compression): T02 — honest default-on pipeline inflation guard (H1) (#5527) Integrated into release/v3.8.42 (round 3). T02 pipeline inflation guard * feat(compression): T05/C2 — caveman dedup + ultra packs for de, fr, ja (#5529) Integrated into release/v3.8.42 (round 3). T05/C2 caveman packs de/fr/ja * feat(compression): T05/C6 — Chinese (zh / wenyan) caveman pack + detection (#5532) Integrated into release/v3.8.42 (round 3). T05/C6 zh/wenyan pack + detection * feat(compression): T07/R9 — gradle + dotnet RTK catalog filters (#5537) Integrated into release/v3.8.42 (round 3). 
T07/R9 RTK gradle+dotnet filters * refactor(dashboard): T11 — drop duplicate caveman on/off toggle from the compression settings tab (#5524) Integrated into release/v3.8.42 (round 3). T11 consolidate duplicate caveman controls; i18n'd the panel hint string (source key). * test relay routing fallback headers (#5526) Integrated into release/v3.8.42 (round 3). Relay fallback header extraction + tests (drift-shed: dependabot #5415 commit dropped). * fix(opencode-plugin): bump to 0.2.0 + auto-publish on release (#5363) - Bump @omniroute/opencode-plugin from 0.1.0 to 0.2.0 so CI publishes the accumulated fixes (auto combos, schema fields, debug logging) that were merged after the initial 0.1.0 publish on May 24. - Add auto-bump step in npm-publish.yml: detects if the plugin dir changed since the last release tag and auto-increments patch version, so the plugin never falls behind again on future releases. Co-authored-by: herjarsa <herjarsa@users.noreply.github.com> * [codex] add bifrost auto fallback cooldown (#5519) Integrated into release/v3.8.42 (round 3). Bifrost auto fallback cooldown; header reconciled with #5526 helper + env-doc. Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> * fix onboarding schema client import (#5525) Integrated into release/v3.8.42 (round 3). Browser-safe onboarding schema import (drift-shed: dependabot #5415 dropped). * docs: add relay backend strategy guide (#5547) Port #5533 relay strategy guide to release/v3.8.42 (doc-only). * fix(chatgpt-web): support GPT-5.5 Pro handoff (#5536) Integrated into release/v3.8.42 (round 3). GPT-5.5 Pro async stream_handoff support (drift-shed: dependabot #5415 dropped). * fix(providers): persist Configured filter across page reloads (#5510) Integrated into release/v3.8.42 (round 3). Persist Configured filter across reloads; extracted shouldSyncProviderDisplayMode race guard + TDD test (Closes #4059). 
Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> * fix(mimocode): route per-account traffic through SOCKS5 proxy dispatchers (#5521) Integrated into release/v3.8.42 (round 3). Per-account SOCKS5 dispatcher routing — completes #3837's stored proxy config with the actual undici dispatcher layer. Rebased onto .42 (dropped the CI-workflow-deletion commits; merged proxyUrlMap dispatch with #3837's acct.proxy storage). Co-authored-by: diegosouzapw <diegosouza.pw@gmail.com> * fix(chatgpt-web): portable SHA3-512 for sentinel PoW under Electron/BoringSSL (#5531) (#5540) * fix(build): keep ioredis out of the client/CLI bundle via SPAWN_CAPABLE_PREFIXES leaf (#5546) Fix the dast-smoke ioredis client-bundle regression (proven: dast-smoke green). Remaining reds are pre-existing base-reds/flakes (base.ts file-size, GOLDEN provider drift, shard-1 compression flakes) inherited by all PRs — not from this change. * chore(release): finalize v3.8.42 CHANGELOG + cycle-close reconciliation - Reconcile CHANGELOG.md for v3.8.42: 40 bullets covering all 89 commits since v3.8.41 (4 features, 26 fixes, 10 maintenance incl. 2 rollups for the 35-PR dead-code sweep + 17-PR DRY consolidation), dedup the merge- artifact duplicate New Features headers, set release date 2026-06-30. - Sync 42 docs/i18n//CHANGELOG.md mirrors. - Document 3 new chatgpt-web/TLS env vars in .env.example + ENVIRONMENT.md (OMNIROUTE_CGPT_WEB_PRO_TIMEOUT_MS, _PRO_POLL_INTERVAL_MS, OMNIROUTE_CHATGPT_STREAM_FIRST_BYTE_TIMEOUT_MS). - Cycle-close ratchet rebaselines: eslintWarnings 4116->4121, file-size base.ts/chatgpt-web.ts/strategySelector.ts/chatgpt-web.test.ts (all inherited drift, justified inline). - Regenerate provider translate-path golden snapshot for the merged bytez/friendliai/novita endpoint fixes. chore(changelog): cover #5415 dev-deps bump merged from main The release/v3.8.42 ↔ main merge (c4c1b56ba) brought #5415 (development dependency group, 9 updates) and #5533 (relay backend guide) from main. 
#5533's content is already covered by the #5547 port bullet; add a Maintenance bullet for #5415 and re-sync the 42 i18n CHANGELOG mirrors. * test: relocate 2 orphaned test files to collected runner paths check:test-discovery flagged two cycle-merged tests that no runner collects (they never ran → false coverage confidence): - compression-settings-tab-consolidation.test.tsx (#5524) → tests/unit/ui/ (vitest UI runner collects tests/unit/ui/*/.test.tsx); 3/3 pass. - providers/providerPageStorage.test.ts (#5510) → tests/unit/dashboard/ ('providers' is not a collected subdir; 'dashboard' is, same ../../../ import depth); 30/30 pass under the node runner. Both confirmed green when actually executed; no assertions weakened. * fix(release): repair inherited base-red tests from #5480/#5527/#5427/#5521 The fast-path (PR->release/*) does not run the full unit+integration suites, so four merged feature PRs shipped with stale/incorrect tests that only surface on the release PR (PR->main). Repairs (features are correct; align tests to the new behavior — no assertions weakened): - #5480 (gate claude adaptive thinking): adaptive thinking is now injected only for a real Claude Code client (x-app:cli / claude-code UA), not for any bare Claude OAuth token. claude-thinking-tool-choice-guard + base-thinking-budget-5312 now identify as a Claude Code client to exercise the adaptive path (3 tests). - #5527 (T02 inflation guard): the guard reverts a stacked body that did not shrink in tokens. The bail-out/advancement fixtures used growth-appending mock engines; they now carry a droppable padding message the engines empty, so the body realistically shrinks and the marker assertions survive. bailout (5), stacked-async (3), engine-enabled-toggle (2). - #5427 (render onboarding wizard at /providers/new): integration-wiring asserted the old redirect stub; now asserts the route renders ProviderOnboardingWizard. 
- #5521 (mimocode SOCKS5 per-account proxy): the constructor's default account omitted the proxy field (undefined), breaking the 'all proxies null' backward compat guard. Default it to null, mirroring syncAccountsFromCredentials(). fix(proxyfetch): skip fallback for non-replayable bodies --------- Co-authored-by: Diego Rodrigues de Sa e Souza <diegosouza.pw@gmail.com> Co-authored-by: Jan Leon <Jan.gaschler@gmail.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Randi <55005611+rdself@users.noreply.github.com> Co-authored-by: Diego Rodrigues de Sa e Souza <8016841+diegosouzapw@users.noreply.github.com> Co-authored-by: KooshaPari <42529354+KooshaPari@users.noreply.github.com> Co-authored-by: KooshaPari <koosha@example.com> Co-authored-by: backryun <bakryun0718@proton.me> Co-authored-by: Hernan Javier Ardila Sanchez <hjasgr@gmail.com> Co-authored-by: herjarsa <herjarsa@users.noreply.github.com> Co-authored-by: Arthur Bodera <abodera@gmail.com> Co-authored-by: PizzaV <103120356+pizzav-xyz@users.noreply.github.com> Co-authored-by: OpenClaw Auto <openclaw-auto@example.invalid> * Move CLI profile sync toggles to CLI Code (#5778) * move CLI profile sync toggles to CLI Code * test CLI profile auto-sync toggles * Document CLI profile auto-sync flags * docs(changelog): note CLI profile auto-sync card moved to CLI Code (#5778) --------- Co-authored-by: Diego Rodrigues de Sa e Souza <diegosouza.pw@gmail.com> * fix(grok-cli): parse expires_at from auth.json and exp from JWT to fix auto-refresh (#5775) * fix(grok-cli): parse expires_at from auth.json and exp from JWT to fix auto-refresh * docs(changelog): note grok-cli token auto-refresh fix (#5775) --------- Co-authored-by: Diego Rodrigues de Sa e Souza <diegosouza.pw@gmail.com> * fix(providers): import intentional local-catalog-only providers instead of 502 (#5460, #5465) (#5787) The model-sync route returned a hard 502 ('Remote model discovery failed; local catalog 
fallback not synced') for every provider whose local catalog is its ONLY discovery source (Reka #5460, t3.chat #5465, embedding/rerank like voyage-ai/jina-ai, Qwen-OAuth, and web-cookie providers). The /models route now flags catalogs that are the provider's intended source (no remote /models endpoint) with intentional:true; model-sync imports those instead of 502-ing, while a genuinely degraded remote fallback still surfaces. New dependency-free leaf degradedLocalCatalog.ts. Also fixes t3.chat's confusing add-credential hint: it no longer renders the circular 'Required cookie: convex-session-id + Cookie header...' copy and wires the step-by-step DevTools hint (t3ChatWebCookieHint) already translated in every locale. Regression guards: tests/unit/sync-models-degraded-local-catalog-5460-5465.test.ts, tests/unit/t3chat-web-cookie-hint-5465.test.ts, + intentional-flag assertions in tests/unit/provider-models-route.test.ts. * fix(api): self-hydrate model aliases from DB on GET after restart (#5777) * Fix grammatical errors in readme (#5738) * fix(api): self-hydrate model aliases from DB on GET when in-memory state is empty In the standalone production build, webpack creates two separate copies of modelDeprecation.ts — one hydrated by the startup path (used for request routing) and one used by the /api/settings/model-aliases API route. The API route's copy starts with an empty _customAliases after each server restart, causing the Settings → Routing UI to show 'No exact-match aliases configured' even though the aliases are persisted in the DB. The GET handler now detects an empty _customAliases state and reads the modelAliases key from the settings blob in the DB, calling setCustomAliases() to hydrate this module instance. This is a best-effort fallback — when _customAliases is already populated (e.g. by the startup path in dev mode), no DB read occurs. 
Regression test: tests/unit/model-aliases-settings-route-selfheal.test.ts - Verifies hydration from DB when in-memory state is empty - Verifies no hydration when in-memory state is already populated - Verifies graceful handling when no modelAliases exist in DB --------- Co-authored-by: Chirag Singhal <76880977+chirag127@users.noreply.github.com> Co-authored-by: marcelpeterson <marcelpeterson@users.noreply.github.com> Co-authored-by: Diego Rodrigues de Sa e Souza <diegosouza.pw@gmail.com> * refactor(usage): extract 5 provider usage families into leaves (#5782) Split open-sse/services/usage.ts (1723 -> 901 LOC) by moving the Cursor, Kimi, Codex, Claude and Kiro usage-fetcher families into cohesive leaves under open-sse/services/usage/ (mirroring the existing glm/minimax/antigravity/quota/ scalars leaves): - usage/cursor.ts getCursorUsage (+ CURSOR_USAGE_CONFIG, decodeCursorJwtSub) - usage/kimi.ts getKimiUsage (+ KIMI_CONFIG, getKimiPlanName) - usage/codex.ts getCodexUsage (+ CODEX_CONFIG) - usage/claude.ts getClaudeUsage / getClaudePlanLabel (+ CLAUDE_CONFIG, legacy) - usage/kiro.ts getKiroUsage / buildKiroUsageResult / discoverKiroProfileArn (+ helpers) The host keeps the getUsageForProvider dispatcher and imports the fetchers back; the public export set is unchanged — buildKiroUsageResult + discoverKiroProfileArn are re-exported from the kiro leaf (the kiro-* tests import them from services/usage) and __testing stays wired to the moved claude/kiro internals. Bodies are verbatim: the code-line multiset of host + leaves equals the original. Adds tests/unit/usage-families-split.test.ts pinning the leaf surface, the kiro re-export identity, the __testing wiring, and getClaudePlanLabel's pure logic. * chore(docs): sync i18n CHANGELOG mirrors with root [3.8.43] section (#5789) Regenerate the docs/i18n/<locale>/CHANGELOG.md [3.8.43] blocks from the root CHANGELOG so the mirror body size returns within the 25% docs-sync tolerance. 
Clears a pre-existing release-time drift (mirrors were ~26% smaller than root) that was failing check-docs-sync and blocking every local commit on the release branch. * fix(providers): correct stale/broken provider metadata (#5487, #5461, #5534, #5470) (#5790) - #5487 Qoder: replace the untranslated i18n stubs (personalAccessTokenLabel, qoderPatHint, qoderPatPlaceholder) with real copy; extend the STUB_KEYS guard. - #5461 Scaleway: website pointed at scaleway.com/en/ai/generative-apis (HTTP 404); repoint at the live docs URL /en/docs/ai-data/generative-apis/. - #5534 Microsoft 365 Copilot: rewrite the vague authHint with concrete DevTools WebSocket steps (the token lives on the Chathub WS URL, not an Authorization header). - #5470 Together AI: retired the $25 signup credit and is now fully prepaid (min $5); hasFree false + a prepaid notice instead of the stale free-tier freeNote (verified live). Regression guards: tests/unit/provider-metadata-5461-5470-5534.test.ts + Qoder keys added to tests/unit/provider-add-ux-i18n-import-warning.test.ts. * fix(dashboard): neutral badge for unsupported validation + clickable OAuth error links (#5442, #5486) (#5795) - #5442 LMArena (and any provider with no live validator) returns { unsupported: true } from /api/providers/validate and Save succeeds, but the Add-API-Key modal only had success/failed states so it rendered a red 'Invalid' badge. Add an 'unsupported' result → neutral info 'N/A' badge via the pure leaf validationBadgeProps(); both validate handlers now map data.unsupported to it. - #5486 GitLab Duo's OAuth setup error embeds a registration URL (gitlab.com/-/profile/applications) but the OAuth error step rendered it as dead red text. New LinkifiedText component (+ pure ReDoS-safe linkify util) makes any http(s) URL in an OAuth error clickable; the GitLab Duo backend message already carries the full setup steps. 
Regression guards: tests/unit/validation-badge-unsupported-5442.test.ts, tests/unit/oauth-error-linkify-5486.test.ts. Frozen god-files kept within cap (AddApiKeyModal 868/868, OAuthModal 968/969). * fix(system): route in-app auto-update npm calls through the win32 shell helper (#5542) (#5797) The in-app auto-update flow called execFileAsync("npm", ...) directly for the version lookup (versionCheck.getLatestVersionFromNpmCli), dependency install, global install, and native rebuild. On Windows npm is npm.cmd and Node >=24 refuses to execFile a .cmd without a shell (nodejs/node#52554), so those calls threw 'spawn npm ENOENT'. Route them through buildNpmExecOptions (the same win32-shell helper the embedded-services installer uses, fix #5379). The global install spec is validated with SERVICE_VERSION_PATTERN before it is shell-joined (Hard Rule #13). Not the pnpm/npx swap the issue proposed — that is the wrong direction for an 'npm install -g' flow already solved elsewhere in-repo. Regression guard: tests/unit/autoupdate-npm-win32-5542.test.ts. * refactor(sse): extract cursor protobuf wire primitives into a leaf (#5794) Split open-sse/utils/cursorAgentProtobuf.ts (1520 -> 1400 LOC) by moving the low-level protobuf wire-format primitives — varint/tag/length-delimited encode+ decode + the generic field walker (encodeVarint, encodeTag, encodeBytes, encodeString, encodeMessage, encode{UInt32,Bool,Double}Field, decodeVarint, checkedLen, decodeFields, findField, decode{String,Varint}Field, the Field type and the WT_VARINT/WT_LEN wire-type constants) — into cursorAgentProtobuf/wire.ts. These primitives were module-private, so the host's public API is unchanged; the host imports them back internally. Bodies are verbatim: the code-line multiset of host + wire.ts equals the original. 
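For readers unfamiliar with the wire format these primitives implement, here is a minimal sketch of varint encode/decode with a buffer-overrun guard. It reuses the names encodeVarint and decodeVarint from the commit text, but the signatures, return shape, and error messages are assumptions for illustration, not the contents of cursorAgentProtobuf/wire.ts.

```typescript
// Illustrative sketch only: signatures and error handling are assumptions,
// not the project's actual wire.ts.

/** Encode a non-negative safe integer as a protobuf base-128 varint. */
function encodeVarint(value: number): Uint8Array {
  if (!Number.isSafeInteger(value) || value < 0) {
    throw new RangeError("encodeVarint expects a non-negative safe integer");
  }
  const bytes: number[] = [];
  let rest = value;
  // Emit 7 bits at a time, low group first; the high bit means "more follows".
  while (rest >= 0x80) {
    bytes.push((rest % 0x80) | 0x80);
    rest = Math.floor(rest / 0x80);
  }
  bytes.push(rest);
  return Uint8Array.from(bytes);
}

/** Decode a varint at `offset`; returns the value and the offset after it. */
function decodeVarint(buf: Uint8Array, offset = 0): { value: number; next: number } {
  let value = 0;
  let scale = 1;
  // A varint is at most 10 bytes; never read past the end of the buffer.
  for (let i = offset; i < buf.length && i < offset + 10; i++) {
    const byte = buf[i] as number;
    value += (byte & 0x7f) * scale;
    if ((byte & 0x80) === 0) {
      if (!Number.isSafeInteger(value)) {
        throw new RangeError("varint exceeds safe integer range");
      }
      return { value, next: i + 1 };
    }
    scale *= 0x80;
  }
  throw new RangeError("truncated or overlong varint");
}
```

In this sketch, encodeVarint(300) yields the two bytes 0xAC 0x02, and decoding a buffer that ends on a continuation byte throws instead of reading out of bounds.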
First layer of the codec decomposition — the value/framing codec and the message encoders/decoders build on this and stay in the host (they share host-retained helpers; splitting them is a separate step). Adds tests/unit/cursor-protobuf-wire-split.test.ts pinning the leaf surface, the encode/decode round-trip invariants, the buffer-overrun guard, and the host wiring. * test(runtime): guard tsx/esm→esbuild transform path on boot (#5757) (#5773) #5757 reported that a fresh `npm install omniroute` pulls `esbuild@0.28.1` transitively via `tsx` (a runtime dependency the CLI registers at boot in `bin/omniroute.mjs`), and proposed forcing `esbuild@0.27.4`. That override is unsafe: `tsx@4.22.4` requires `esbuild@~0.28.0` and `fumadocs-mdx@15` (also a runtime dep) requires `esbuild@^0.28.0`; forcing 0.27.x pushes esbuild below both, and 0.28.1 is currently the latest release. The reported transform failure also does not reproduce — OmniRoute targets ES2022, its minimum supported Node is 22.2 (destructuring is native), and tsx targets the running Node, so esbuild never lowers to an unsupported target. Instead of an unsafe version pin, add two regression guards: - functional: spawn the real `node --import tsx/esm` loader on a fixture packed with modern syntax (destructuring/spread, class+private fields, optional chaining, nullish, logical assignment, async + top-level await) and assert it transforms + runs correctly. Fails if a future esbuild regresses the boot path. - dependency-shape: assert the resolved esbuild stays within tsx's declared range, so nobody reintroduces the out-of-range override this issue proposed. No production code changed; no esbuild version pinned. * fix(deps): add missing runtime deps @toon-format/toon and safe-regex (#5771) Both packages are imported at runtime but were only declared for their type shims (safe-regex was via @types/safe-regex; @toon-format/toon had no declaration at all). 
Missing runtime deps mean: - open-sse/services/compression/engines/headroom/toon.ts imports @toon-format/toon → MODULE_NOT_FOUND on cold pnpm/npm install - open-sse/services/compression/engines/ccr/ccrQuery.ts imports safe-regex → MODULE_NOT_FOUND Both engines are wired into the stacked compression pipeline (default enabled), so a fresh clone that does not have a stale node_modules from a previous version crashes as soon as the pipeline runs. Verified with pnpm ls / grep before/after. * fix(oauth): clamp grok-cli expired-token expiresIn to a positive value (#5775 follow-up) (#5820) An already-expired grok-cli token (real expires_at/exp in the past) produced a negative expiresIn, which is truthy in the import-token route and maps to a PAST expiresAt — AutoCombo then reads that as 'already expired' and excludes the connection instead of refreshing it. Clamp with Math.max(1, expiresIn) so an expired token is treated as due-for-refresh. Extends #5775 (thanks @Chewji9875). Regression: 2 new cases in tests/unit/grok-cli-oauth.test.ts (expired JWT exp + expired JSON expires_at), both failing-then-passing. * fix(model-aliases): back custom-alias store with globalThis (#5777 follow-up) (#5821) #5777 self-healed the GET /api/settings/model-aliases symptom at the route layer, but the root cause remained: modelDeprecation.ts held _customAliases in a plain module-level let, which webpack duplicates across the startup and app-route module graphs (same class as #5312). Startup hydration landed on one copy; the API route read the other (empty) one. Back the store with globalThis (__omniroute_customAliases__) so both instances share one store — the exact pattern already used by thinkingBudget.ts/backgroundTaskDetector.ts (#5312). The route-layer DB self-heal from #5777 stays as a harmless fallback. Extends #5777 (thanks @jleonar2). 
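A rough sketch of the globalThis-backed store pattern described above, assuming aliases are a plain string-to-string map. The key __omniroute_customAliases__ and the name setCustomAliases come from the commit text; the slot shape, the getCustomAliases accessor, and the signatures are hypothetical.

```typescript
// Sketch under assumptions: the slot shape and accessor signatures are
// illustrative, not the project's modelDeprecation.ts.
type AliasMap = Record<string, string>;
const STORE_KEY = "__omniroute_customAliases__";

function aliasStore(): { aliases: AliasMap } {
  const g = globalThis as unknown as Record<string, { aliases: AliasMap } | undefined>;
  // The first module copy to run creates the slot; later copies reuse it,
  // so a bundler duplicating this module no longer duplicates the state.
  let slot = g[STORE_KEY];
  if (!slot) {
    slot = { aliases: {} };
    g[STORE_KEY] = slot;
  }
  return slot;
}

function setCustomAliases(next: AliasMap): void {
  aliasStore().aliases = { ...next };
}

function getCustomAliases(): AliasMap {
  return aliasStore().aliases;
}
```

A module-level let would give each bundled copy its own variable. Here every copy resolves the same object on globalThis, so a write during startup is visible to a later read from a different module graph.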
Regression: tests/unit/model-aliases-globalthis-5777.test.ts (fails on the plain-let store: never populates globalThis, never reads a sibling instance's write).

* chore(release): rebaseline file-size + test-masking ratchets for v3.8.43 (#5609)

Accumulated drift from the 109 commits of the v3.8.43 cycle (the fast-gate PR->release does not run check:file-size/test-masking; base-reds only surface on the release PR):
- file-size: 8 existing god-files grew + 2 new files above the cap + 4 test files grew -> frozen values adjusted to the current state.
- test-masking: chatgpt-web.test.ts 281->280 asserts allowlisted (#5549 consolidated 2 assert.equal calls into a single map-driven one; a legitimate refactor, not masking).

God-file modularization deferred (#3501).

* refactor(sse): extract openai-to-gemini pure helpers into a leaf (#5824)

Split open-sse/translator/request/openai-to-gemini.ts (873 -> 756 LOC, back under the 800-line cap) by moving the module-private pure helpers into openai-to-gemini/helpers.ts: the historical-tool-context string builders (stringifyHistoricalToolArguments, buildInertHistorical, escapeHistoricalContext, buildHistoricalToolResultContext), deepCleanUndefined, extractClientThoughtSignature, buildChangedToolNameMap, isVertexGeminiProvider, and applyAntigravityGenerationDefaults (with its GeminiGenerationConfig shape). These were module-private, so the translator's public API is unchanged; the host imports them back internally. Bodies are verbatim: the code-line multiset of host + leaf equals the original. Adds tests/unit/openai-to-gemini-helpers-split.test.ts pinning the leaf's pure behaviour (escaping, undefined-pruning, signature extraction, antigravity generation-config defaults) and the host wiring.
* fix(db): re-export modelContextOverrides from localDb (check:db-rules #5609) * test(discovery): wire tests/unit/memory into node runner glob (#5609) typed-decay.test.ts (TV6 typed memory decay, 15 asserts) sat in tests/unit/memory/ which no runner glob collected -> orphan (never ran). Adds 'memory' to the subdir brace-glob in all runner sources (package.json scripts + ci.yml shards) and the COLLECTORS mirror in check-test-discovery.mjs (drift-check keeps them in sync). Passes standalone (15/15); DATA_DIR isolation handled per-file by tests/_setup/isolateDataDir.ts. * test: align 3 stale release tests to landed behavior (#5609) Base-reds surfaced on the release PR (fast-gate PR->release skips these shards): - api-manager-page-static: Self-service Visibility now has 5 switches (added the API-key provider quota-policy bypass toggle, #5731); bump inventory 4->5 while keeping the invariant that every switch declares type=button (verified 5/5 typed). - security-hardening (callLogs PII): #5725 extracted sanitizeErrorForLog into callLogs/format.ts; assert the new wiring (callLogs imports it + format.ts imports piiSanitizer) instead of the removed direct import — PII sanitization still intact. - memory-glm-injection: #5610 made GLM 5.1+ ACCEPT the system role (z.ai docs), so glm-5.1 must PRESERVE system, not fold it. Flip the stale #1701-era assertion. * test(shared): align t3-web web-session expected metadata with hintKey (#5835) The t3-web provider metadata intentionally carries `hintKey: "t3ChatWebCookieHint"` (#5465 — the generic cookie hint reads circular for t3.chat), but the metadata assertion in web-session-credentials was never updated, so it deep-equals against an object missing the field. This is a stale-test base-red on release/v3.8.43 that turns the whole PR queue's "Unit Tests fast-path (1/2)" red. Align the expected object to the shipped source of truth. 
* test(compression): de-flake rtk_discover sample seeding seedSamples() persisted two byte-identical raw outputs. The raw-output filename is keyed on Date.now() (ms) + a content hash (rawOutput.ts), so two identical captures landing in the same millisecond collapse to one file (the 2nd write overwrites the 1st) -> sampleCount 1 instead of 2. Reproduced at ~25% (501/2000 trials), matching the intermittent Coverage Shard (5/8) failure on fast CI runners. Seed two DISTINCT captures so the store deterministically holds 2 samples regardless of timing (0/2000 collisions after the change). * test(e2e): anchor compression-studio smoke on play-input, not async play-lane The T03 smoke asserted `play-lane` visible on mount, but those per-lane buttons only render after a preview-compression run populates `batch.lanes` (usePreviewCompression keeps batch null until run(); there is no mount auto-run). The smoke intentionally does not drive a compression cascade, so `play-lane` can never appear -> the E2E added in #5727 failed all 3 retries (E2E Tests 4/9). Anchor on the always-present `play-input` panel, which proves the studio body mounted without needing async lane data. * fix(security): explicit http(s) scheme allowlist in linkifyText href CodeQL flagged the <a href> in LinkifiedText (#5486) with js/xss (high) and js/client-side-unvalidated-url-redirection (medium) because href traces back to user-provided text. URL_RE already requires an http(s):// prefix, so a javascript:/data: scheme can never reach href — but that guarantee was only implied by the regex. Validate the scheme explicitly via new URL().protocol before exposing href (non-http(s) degrades to plain text): defense-in-depth that also makes the sink provably safe to static analysis. Regression test added. 
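The explicit scheme check described above might look roughly like the following. The helper name safeHttpHref and its return convention are hypothetical; only the use of new URL().protocol and the "non-http(s) degrades to plain text" behaviour are taken from the commit text.

```typescript
// Hypothetical helper for illustration; not the project's linkify util.
// Returns a normalized href for http(s) URLs, or null when the caller
// should render the text as plain text instead of a link.
function safeHttpHref(candidate: string): string | null {
  try {
    const url = new URL(candidate);
    // Explicit allowlist: anything other than http: or https: is rejected,
    // including javascript:, data:, and vbscript:.
    if (url.protocol === "http:" || url.protocol === "https:") {
      return url.href;
    }
    return null;
  } catch {
    // Not parseable as an absolute URL, so never expose it as an href.
    return null;
  }
}
```

Checking the parsed protocol rather than relying on the matching regex keeps the guarantee local to the sink, which is what static analysis needs to see.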
* fix(ci): register mark-account-unavailable test in stryker tap.testFiles check:mutation-test-coverage --strict (Fast Quality Gates) flagged tests/unit/mark-account-unavailable-numeric-epoch-guard.test.ts as a covering unit test missing from stryker.conf.json tap.testFiles, so its mutant kills would not count (--strict). Add it. Pre-existing tap.testFiles drift on the release tip that fails Fast Quality Gates on every PR into release/v3.8.43, not just this branch. * chore(release): rebaseline eslintWarnings ratchet 4121->4158 (v3.8.43 cycle drift) * chore(release): rebaseline complexity 1981->1982 + cognitive-complexity 842->845 (v3.8.43 cycle drift) * chore(release): rebaseline deadExports 225->227 (v3.8.43 cycle drift) * fix(dashboard): add error boundaries for Combos and MITM Proxy pages (#5788) Integrated into release/v3.8.43 * fix(cli): rename process title to omniroute (#5791) Integrated into release/v3.8.43 * fix(providers): add claude-sonnet-5 to Kiro model catalog (#5796) Integrated into release/v3.8.43 * fix(kiro): bound Claude id dash->dot minor group to protect date-suffixed ids (#5825) Integrated into release/v3.8.43 * fix(db): allowlist modelContextOverrides as intentionally-internal to green release DB-rules gate (#5798) (#5827) Integrated into release/v3.8.43 * fix(sse): stop reasoning-summary drop + duplicated deltas on claude→codex streaming (#5786) (#5832) Integrated into release/v3.8.43 * fix(dashboard): guard null modelAliases values in model picker (#5792) Integrated into release/v3.8.43 * fix(github): drop trailing assistant prefill for Copilot chat (#5802) Integrated into release/v3.8.43 * fix(oauth): disambiguate OAuth connections on username to prevent cross-IdP overwrites (#5803) Integrated into release/v3.8.43 * fix(translator): strip orphaned tool results across request formats (#5805) Integrated into release/v3.8.43 * fix(kiro): stop injecting placeholder user turn on tool-result turns (#5807) Integrated into release/v3.8.43 * 

🚀 OmniRoute — The Free AI Gateway (Kiswahili)

🌐 Languages: 🇺🇸 English · 🇸🇦 ar · 🇧🇬 bg · 🇧🇩 bn · 🇨🇿 cs · 🇩🇰 da · 🇩🇪 de · 🇪🇸 es · 🇮🇷 fa · 🇫🇮 fi · 🇫🇷 fr · 🇮🇳 gu · 🇮🇱 he · 🇮🇳 hi · 🇭🇺 hu · 🇮🇩 id · 🇮🇹 it · 🇯🇵 ja · 🇰🇷 ko · 🇮🇳 mr · 🇲🇾 ms · 🇳🇱 nl · 🇳🇴 no · 🇵🇭 phi · 🇵🇱 pl · 🇵🇹 pt · 🇧🇷 pt-BR · 🇷🇴 ro · 🇷🇺 ru · 🇸🇰 sk · 🇸🇪 sv · 🇰🇪 sw · 🇮🇳 ta · 🇮🇳 te · 🇹🇭 th · 🇹🇷 tr · 🇺🇦 uk-UA · 🇵🇰 ur · 🇻🇳 vi · 🇨🇳 zh-CN

Never stop coding. Smart routing to FREE & low-cost AI models with automatic fallback.

Your universal API proxy — one endpoint, 100+ providers, zero downtime. Now with MCP Server (25 tools), A2A Protocol, Memory/Skills Systems & Electron Desktop App.

Chat Completions • Embeddings • Image Generation • Video • Music • Audio • Reranking • Web Search • MCP Server • A2A Protocol • 100% TypeScript

🌐 Website • 🚀 Quick Start • 💡 Features • 📖 Docs • 💰 Pricing • 💬 WhatsApp

🖼️ Main Dashboard

📸 Dashboard Preview

Screenshots (images not reproduced here) cover these pages: Providers, Combos, Analytics, Health, Translator, Settings, CLI Tools, Usage Logs, Endpoints.

🤖 Free AI Provider for your favorite coding agents

Connect any AI-powered IDE or CLI tool through OmniRoute — free API gateway for unlimited coding.

OpenClaw (⭐ 205K)	NanoBot (⭐ 20.9K)	PicoClaw (⭐ 14.6K)	ZeroClaw (⭐ 9.9K)	IronClaw (⭐ 2.1K)
OpenCode (⭐ 106K)	Codex CLI (⭐ 60.8K)	Claude Code (⭐ 67.3K)	Kilo Code (⭐ 15.5K)

📡 All agents connect via http://localhost:20128/v1 or http://cloud.omniroute.online/v1: one config, unlimited models and quota

🤔 Why OmniRoute?

Stop wasting money and hitting limits:

Subscription quota expires unused every month
Rate limits stop you mid-coding
Expensive APIs ($20-50/month per provider)
Manual switching between providers

OmniRoute solves this:

✅ Maximize subscriptions - Track quota, use every bit before reset
✅ Auto fallback - Subscription → API Key → Cheap → Free, zero downtime
✅ Multi-account - Round-robin between accounts per provider

📧 Support

💬 Join our community! WhatsApp Group — Get help, share tips, and stay updated.

Website: omniroute.online
GitHub: github.com/diegosouzapw/OmniRoute
Issues: github.com/diegosouzapw/OmniRoute/issues
WhatsApp: Community Group
Contributing: See CONTRIBUTING.md, open a PR, or pick a good first issue

🐛 Reporting a Bug?

When opening an issue, please run the system-info command and attach the generated file:

npm run system-info

This generates a system-info.txt with your Node.js version, OmniRoute version, OS details, installed CLI tools (qoder, gemini, claude, codex, antigravity, droid, etc.), Docker/PM2 status, and system packages — everything we need to reproduce your issue quickly. Attach the file directly to your GitHub issue.

🔄 How It Works

┌─────────────┐
│  Your CLI   │  (Claude Code, Codex, OpenClaw, Cursor, Cline...)
│   Tool      │
└──────┬──────┘
       │ http://localhost:20128/v1
       ↓
┌─────────────────────────────────────────┐
│        OmniRoute (Smart Router)         │
│  • Format translation (OpenAI ↔ Claude) │
│  • Quota tracking + Embeddings + Images │
│  • Auto token refresh                   │
└──────┬──────────────────────────────────┘
       │
       ├─→ [Tier 1: SUBSCRIPTION] Claude Code, Codex
       │   ↓ quota exhausted
       ├─→ [Tier 2: API KEY] DeepSeek, Groq, xAI, Mistral, NVIDIA NIM, etc.
       │   ↓ budget limit
       ├─→ [Tier 3: CHEAP] GLM ($0.6/1M), MiniMax ($0.2/1M)
       │   ↓ budget limit
       └─→ [Tier 4: FREE] Qoder, Qwen, Kiro (unlimited)

Result: Never stop coding, minimal cost
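
The tier cascade in the diagram can be sketched as a loop that tries each tier in priority order. This is a minimal illustration, not OmniRoute's implementation: the `Tier` shape, the `available` flag, and `pickTier` are hypothetical names introduced here.

```typescript
// Hypothetical model of the four tiers shown in the diagram.
type TierName = "SUBSCRIPTION" | "API_KEY" | "CHEAP" | "FREE";

interface Tier {
  name: TierName;
  // Assumed flag: true while the tier still has quota or budget left.
  available: boolean;
}

// Return the first tier, in priority order, that can still serve a request.
function pickTier(tiers: Tier[]): TierName | null {
  for (const tier of tiers) {
    if (tier.available) return tier.name;
  }
  return null; // every tier is exhausted
}

const tiers: Tier[] = [
  { name: "SUBSCRIPTION", available: false }, // quota exhausted
  { name: "API_KEY", available: false },      // budget limit reached
  { name: "CHEAP", available: true },
  { name: "FREE", available: true },
];

console.log(pickTier(tiers)); // "CHEAP"
```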

🎯 What OmniRoute Solves — 30 Real Pain Points & Use Cases

Every developer using AI tools faces these problems daily. OmniRoute was built to solve them all — from cost overruns to regional blocks, from broken OAuth flows to protocol operations and enterprise observability.

💸 1. "I pay for an expensive subscription but still get interrupted by limits"

Developers pay $20–200/month for Claude Pro, Codex Pro, or GitHub Copilot. Even on a paid plan, quota has a ceiling: 5 hours of usage, weekly limits, or per-minute rate limits. In the middle of a coding session, the provider stops responding and the developer loses flow and productivity.

How OmniRoute solves it:

Smart 4-Tier Fallback — If subscription quota runs out, automatically redirects to API Key → Cheap → Free with zero manual intervention
Provider Limits Tracking — Cached quota snapshots refresh on a server-side schedule (default PROVIDER_LIMITS_SYNC_INTERVAL_MINUTES=70) with manual refresh available in the UI
Multi-Account Support — Multiple accounts per provider with auto round-robin — when one runs out, switches to the next
Custom Combos — Customizable fallback chains with 13 balancing strategies (priority, weighted, fill-first, round-robin, P2C, random, least-used, cost-optimized, strict-random, auto, lkgp, context-optimized, context-relay)
Structured Combo Builder — Build combos step-by-step with explicit provider + model + account selection, including repeated providers and fixed-account targets
Quota-Aware P2C — Power-of-two account selection now factors quota headroom, backoff, recent errors, and consecutive use
Codex Business Quotas — Business/Team workspace quota monitoring directly in the dashboard
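
To make the Quota-Aware P2C idea concrete, here is a small sketch of power-of-two-choices selection: draw two distinct accounts at random and keep the one with the better score. The `Account` fields mirror the factors listed above, but the field names and the scoring weights are assumptions for illustration only.

```typescript
// Hypothetical account snapshot; field names are illustrative.
interface Account {
  id: string;
  quotaHeadroom: number;   // fraction of quota left, 0..1
  inBackoff: boolean;      // currently cooling down
  recentErrors: number;    // errors in a recent window
  consecutiveUses: number; // times picked in a row
}

// Higher is better. The weights are assumptions, not OmniRoute's values.
function score(a: Account): number {
  if (a.inBackoff) return -Infinity;
  return a.quotaHeadroom - 0.1 * a.recentErrors - 0.05 * a.consecutiveUses;
}

// Power of two choices: sample two distinct accounts, return the better one.
function pickP2C(accounts: Account[], rng: () => number = Math.random): Account {
  if (accounts.length === 0) throw new Error("no accounts configured");
  if (accounts.length === 1) return accounts[0];
  const i = Math.floor(rng() * accounts.length);
  // Draw the second index from the remaining n-1 slots so it differs from i.
  let j = Math.floor(rng() * (accounts.length - 1));
  if (j >= i) j += 1;
  const first = accounts[i];
  const second = accounts[j];
  return score(first) >= score(second) ? first : second;
}

const accounts: Account[] = [
  { id: "a", quotaHeadroom: 0.9, inBackoff: false, recentErrors: 0, consecutiveUses: 1 },
  { id: "b", quotaHeadroom: 0.2, inBackoff: false, recentErrors: 2, consecutiveUses: 0 },
];

console.log(pickP2C(accounts).id); // "a" (with two accounts, both are always compared)
```

Sampling only two candidates keeps selection cheap while still steering traffic away from accounts that are nearly out of quota or recently failing.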

🔌 2. "I need to use multiple providers but each has a different API"

OpenAI uses one format, Claude (Anthropic) uses another, Gemini yet another. A developer who wants to test models from different providers, or fall back between them, has to reconfigure SDKs, change endpoints, and deal with incompatible formats. Custom providers (FriendLI, NIM) have non-standard model endpoints.

How OmniRoute solves it:

Unified Endpoint — A single http://localhost:20128/v1 serves as proxy for all 100+ providers
Format Translation — Automatic and transparent: OpenAI ↔ Claude ↔ Gemini ↔ Responses API
Response Sanitization — Strips non-standard fields (x_groq, usage_breakdown, service_tier) that break OpenAI SDK v1.83+
Role Normalization — Converts developer → system for non-OpenAI providers; system → user for GLM/ERNIE
Think Tag Extraction — Extracts <think> blocks from models like DeepSeek R1 into standardized reasoning_content
Structured Output for Gemini — json_schema → responseMimeType/responseSchema automatic conversion
stream defaults to false — Aligns with OpenAI spec, avoiding unexpected SSE in Python/Rust/Go SDKs
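
Two of the transformations above, Think Tag Extraction and Role Normalization, can be sketched as pure functions. This is an illustration under assumptions: the function names and the provider ids (`openai`, `glm`, `ernie`) are hypothetical, and chaining developer → system → user for GLM/ERNIE is a guess at how the two rules combine.

```typescript
interface Extracted {
  content: string;
  reasoning_content: string | null;
}

// Move every <think>...</think> block out of the text into reasoning_content.
function extractThink(text: string): Extracted {
  const blocks: string[] = [];
  const content = text
    .replace(/<think>([\s\S]*?)<\/think>/g, (_match: string, inner: string) => {
      blocks.push(inner.trim());
      return "";
    })
    .trim();
  return {
    content,
    reasoning_content: blocks.length > 0 ? blocks.join("\n") : null,
  };
}

type Role = "developer" | "system" | "user" | "assistant";

// Provider ids here are hypothetical placeholders.
function normalizeRole(role: Role, provider: string): Role {
  let result: Role = role;
  // developer -> system for non-OpenAI providers
  if (result === "developer" && provider !== "openai") result = "system";
  // system -> user for GLM/ERNIE (assumed to apply after the rule above)
  if (result === "system" && (provider === "glm" || provider === "ernie")) result = "user";
  return result;
}

console.log(extractThink("<think>2+2 is 4</think>The answer is 4."));
console.log(normalizeRole("developer", "glm")); // "user"
```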

🌐 3. "My AI provider blocks my region/country"

Providers like OpenAI/Codex block access from certain geographic regions. Users get errors like unsupported_country_region_territory during OAuth and API connections. This is especially frustrating for developers from developing countries.

How OmniRoute solves it:

3-Level Proxy Config — Configurable proxy at 3 levels: global (all traffic), per-provider (one provider only), and per-connection/key
Color-Coded Proxy Badges — Visual indicators: 🟢 global proxy, 🟡 provider proxy, 🔵 connection proxy, always showing the IP
OAuth Token Exchange Through Proxy — OAuth flow also goes through the proxy, solving unsupported_country_region_territory
Connection Tests via Proxy — Connection tests use the configured proxy (no more direct bypass)
SOCKS5 Support — Full SOCKS5 proxy support for outbound routing
TLS Fingerprint Spoofing — Browser-like TLS fingerprint via wreq-js to bypass bot detection
🔏 CLI Fingerprint Matching — Reorders headers and body fields to match native CLI binary signatures, drastically reducing account flagging risk. The proxy IP is preserved — you get both stealth and IP masking simultaneously
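
The three proxy levels suggest a simple precedence lookup. The sketch below assumes the most specific level wins (connection, then provider, then global); that ordering is an assumption, as are the type names and the example proxy URLs. The badge colors are the ones listed above.

```typescript
type ProxyScope = "connection" | "provider" | "global";

interface ProxyLevels {
  global?: string;
  provider?: string;
  connection?: string;
}

interface ResolvedProxy {
  scope: ProxyScope;
  url: string;
  badge: string;
}

// Badge colors as described in the feature list.
const BADGES: Record<ProxyScope, string> = {
  global: "🟢",
  provider: "🟡",
  connection: "🔵",
};

// Assumed precedence: connection overrides provider, which overrides global.
function resolveProxy(levels: ProxyLevels): ResolvedProxy | null {
  const order: ProxyScope[] = ["connection", "provider", "global"];
  for (const scope of order) {
    const url = levels[scope];
    if (url) return { scope, url, badge: BADGES[scope] };
  }
  return null; // no proxy configured, connect directly
}

// Example URLs are placeholders.
const resolved = resolveProxy({
  global: "http://proxy.example:8080",
  provider: "socks5://127.0.0.1:1080",
});
console.log(resolved?.scope); // "provider"
```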

🆓 4. "I want to use AI for coding but I have no money"

Not everyone can pay $20–200/month for AI subscriptions. Students, devs from emerging countries, hobbyists, and freelancers need access to quality models at zero cost.

How OmniRoute solves it:

Ollama Cloud — Cloud-hosted Ollama models at api.ollama.com with free "Light usage" tier; use ollamacloud/<model> prefix
Free-Only Combos — Chain if/kimi-k2-thinking → qw/qwen3-coder-plus = $0/month with zero downtime
NVIDIA NIM Free Access — ~40 RPM dev-forever free access to 70+ models at build.nvidia.com (transitioning from credits to pure rate limits)
Cost Optimized Strategy — Routing strategy that automatically chooses the cheapest available provider

🔒 5. "I need to protect my AI gateway from unauthorized access"

When exposing an AI gateway to the network (LAN, VPS, Docker), anyone with the address can consume the developer's tokens/quota. Without protection, APIs are vulnerable to misuse, prompt injection, and abuse.

How OmniRoute solves it:

API Key Management — Generation, rotation, and scoping per provider with a dedicated /dashboard/api-manager page
Model-Level Permissions — Restrict API keys to specific models (openai/*, wildcard patterns), with Allow All/Restrict toggle
API Endpoint Protection — Require a key for /v1/models and block specific providers from the listing
Auth Guard + CSRF Protection — All dashboard routes protected with withAuth middleware + CSRF tokens
Rate Limiter — Per-IP rate limiting with configurable windows
IP Filtering — Allowlist/blocklist for access control
Prompt Injection Guard — Sanitization against malicious prompt patterns
AES-256-GCM Encryption — Credentials encrypted at rest
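
Model-Level Permissions with wildcard patterns such as `openai/*` can be illustrated with a small matcher. This is a sketch, not the project's code: `isModelAllowed` is a hypothetical helper, and treating an empty pattern list as "Allow All" is an assumption.

```typescript
// Convert a wildcard pattern like "openai/*" into an anchored RegExp.
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters
    .replace(/\*/g, ".*");                 // "*" matches any run of characters
  return new RegExp(`^${escaped}$`);
}

// Assumption: an empty allow list means the key is unrestricted.
function isModelAllowed(model: string, allowedPatterns: string[]): boolean {
  if (allowedPatterns.length === 0) return true;
  return allowedPatterns.some((p) => patternToRegExp(p).test(model));
}

console.log(isModelAllowed("openai/gpt-4o", ["openai/*"]));        // true
console.log(isModelAllowed("anthropic/claude-x", ["openai/*"]));   // false
```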

🛑 6. "My provider went down and I lost my coding flow"

AI providers can become unstable, return 5xx errors, or hit temporary rate limits. If a dev depends on a single provider, they're interrupted. Without circuit breakers, repeated retries can crash the application.

How OmniRoute solves it:

Request Queue & Pacing — Per-connection request buckets smooth bursts before they hit upstream rate caps
Connection Cooldown — A single connection cools down after retryable failures with optional upstream Retry-After hints and exponential backoff
Provider Circuit Breaker — The provider only trips after fallback is exhausted and the provider request still fails with provider-wide transient errors; connection-scoped 429 rate limits stay in Connection Cooldown
Wait For Cooldown — The server can wait for the earliest connection cooldown to expire and retry the same client request automatically
Anti-Thundering Herd — Mutex + semaphore protection against concurrent retry storms
Combo Fallback Chains — If the primary provider fails, automatically falls through the chain with no intervention
Health Dashboard — Uptime monitoring, provider circuit breaker states, cooldowns, cache stats, p50/p95/p99 latency
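
The Connection Cooldown behavior (an upstream Retry-After hint when present, exponential backoff otherwise) can be sketched as a single function. The base delay, the cap, and the function name are assumptions chosen for illustration.

```typescript
// Compute how long a connection should cool down after a retryable failure.
// failures: consecutive failures so far (1 for the first failure).
// retryAfterSeconds: value of the upstream Retry-After header, if provided.
function cooldownMs(
  failures: number,
  retryAfterSeconds?: number,
  baseMs: number = 1000, // assumed base delay
  maxMs: number = 60000, // assumed cap for computed backoff
): number {
  // Honor the upstream hint when it is present and valid.
  if (retryAfterSeconds !== undefined && retryAfterSeconds >= 0) {
    return retryAfterSeconds * 1000;
  }
  // Otherwise double the delay on each consecutive failure, up to the cap.
  const exponent = Math.max(0, failures - 1);
  return Math.min(baseMs * 2 ** exponent, maxMs);
}

console.log(cooldownMs(1));    // 1000
console.log(cooldownMs(3));    // 4000
console.log(cooldownMs(10));   // 60000 (capped)
console.log(cooldownMs(2, 5)); // 5000 (Retry-After: 5)
```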

🔧 7. "Configuring each AI tool is tedious and repetitive"

How OmniRoute solves it:

CLI Tools Dashboard — Dedicated page with one-click setup for Claude Code, Codex CLI, OpenClaw, Kilo Code, Antigravity, Cline
GitHub Copilot Config Generator — Generates chatLanguageModels.json for VS Code with bulk model selection
Onboarding Wizard — Guided 4-step setup for first-time users
One endpoint, all models — Configure http://localhost:20128/v1 once, access 100+ providers

🔑 8. "Managing OAuth tokens from multiple providers is hell"

Claude Code, Codex, and Copilot all use OAuth 2.0 with expiring tokens. Developers have to re-authenticate constantly and deal with errors such as client_secret is missing and redirect_uri_mismatch, as well as failures on remote servers. OAuth on LAN/VPS is particularly problematic.

How OmniRoute solves it:

Auto Token Refresh — OAuth tokens refresh in background before expiration
OAuth 2.0 (PKCE) Built-in — Automatic flow for Claude Code, Codex, Copilot, Kiro, Qwen, Qoder
Multi-Account OAuth — Multiple accounts per provider via JWT/ID token extraction
OAuth LAN/Remote Fix — Private IP detection for redirect_uri + manual URL mode for remote servers
OAuth Behind Nginx — Uses window.location.origin for reverse proxy compatibility
Remote OAuth Guide — Step-by-step guide for Google Cloud credentials on VPS/Docker

📊 9. "I don't know how much I'm spending or where"

Developers use multiple paid providers but have no unified view of spending. Each provider has its own billing dashboard, but there's no consolidated view. Unexpected costs can pile up.

How OmniRoute solves it:

Cost Analytics Dashboard — Per-token cost tracking and budget management per provider
Budget Limits per Tier — Spending ceiling per tier that triggers automatic fallback
Per-Model Pricing Configuration — Configurable prices per model
Usage Statistics Per API Key — Request count and last-used timestamp per key
Analytics Dashboard — Stat cards, model usage chart, provider table with success rates and latency
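
Per-token cost tracking and a budget ceiling reduce to simple arithmetic. The sketch below uses the GLM figure quoted earlier ($0.6 per 1M tokens) and assumes, for illustration only, that the same rate applies to input and output tokens; the type and function names are hypothetical.

```typescript
interface ModelPrice {
  inputPerMillionUsd: number;
  outputPerMillionUsd: number;
}

// Cost of one request in USD from token counts and per-million prices.
function requestCostUsd(inputTokens: number, outputTokens: number, price: ModelPrice): number {
  return (
    (inputTokens * price.inputPerMillionUsd + outputTokens * price.outputPerMillionUsd) /
    1_000_000
  );
}

// A tier should fall back once accumulated spend reaches its ceiling.
function shouldFallback(spentUsd: number, budgetUsd: number): boolean {
  return spentUsd >= budgetUsd;
}

// Assumed: one flat rate for both directions.
const glm: ModelPrice = { inputPerMillionUsd: 0.6, outputPerMillionUsd: 0.6 };

const cost = requestCostUsd(500_000, 100_000, glm);
console.log(cost.toFixed(2));           // "0.36"
console.log(shouldFallback(cost, 0.3)); // true
```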

🐛 10. "I can't diagnose errors and problems in AI calls"

When a call fails, the dev doesn't know if it was a rate limit, expired token, wrong format, or provider error. Fragmented logs across different terminals. Without observability, debugging is trial-and-error.

How OmniRoute solves it:

Unified Logs Dashboard — 4 tabs: Request Logs, Proxy Logs, Audit Logs, Console
Console Log Viewer — Real-time terminal-style viewer with color-coded levels, auto-scroll, search, filter
SQLite Summary Logs — Request and proxy log indexes stay queryable across restarts without loading large payload blobs into SQLite
Translator Playground — 4 debugging modes: Playground (format translation), Chat Tester (round-trip), Test Bench (batch), Live Monitor (real-time)
Request Telemetry — p50/p95/p99 latency + X-Request-Id tracing
File-Based Detail Artifacts — App logs rotate by size, retention days, and archive count; detailed request/response payloads live in DATA_DIR/call_logs/ and rotate independently of SQLite summaries
System Info Report — npm run system-info generates system-info.txt with your full environment (Node version, OmniRoute version, OS, CLI tools, Docker/PM2 status). Attach it when reporting issues for instant triage.
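
For readers unfamiliar with the p50/p95/p99 figures mentioned under Request Telemetry, here is one common way to compute them, the nearest-rank method. Whether OmniRoute uses this exact method is not stated in this document, so treat it as an illustration.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  // Multiply before dividing to keep the rank exact for integer p.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Latencies of 1..100 ms, for a result that is easy to check by hand.
const latenciesMs: number[] = Array.from({ length: 100 }, (_unused, i) => i + 1);

console.log(percentile(latenciesMs, 50)); // 50
console.log(percentile(latenciesMs, 95)); // 95
console.log(percentile(latenciesMs, 99)); // 99
```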

🏗️ 11. "Deploying and maintaining the gateway is complex"

Installing, configuring, and maintaining an AI proxy across different environments (local, VPS, Docker, cloud) is labor-intensive. Problems like hardcoded paths, EACCES on directories, port conflicts, and cross-platform builds add friction.

How OmniRoute solves it:

npm global install — npm install -g omniroute && omniroute — done
Docker Multi-Platform — AMD64 + ARM64 native (Apple Silicon, AWS Graviton, Raspberry Pi)
Docker Compose Profiles — base (no CLI tools) and cli (with Claude Code, Codex, OpenClaw)
Electron Desktop App — Native app for Windows/macOS/Linux with system tray, auto-start, offline mode
Split-Port Mode — API and Dashboard on separate ports for advanced scenarios (reverse proxy, container networking)
Cloud Sync — Config synchronization across devices via Cloudflare Workers
DB Backups — Automatic backup, restore, export and import of all settings, with DISABLE_SQLITE_AUTO_BACKUP for externally managed backups

🌍 12. "The interface is English-only and my team doesn't speak English"

Teams in non-English-speaking countries, especially in Latin America, Asia, and Europe, struggle with English-only interfaces. Language barriers reduce adoption and increase configuration errors.

How OmniRoute solves it:

Dashboard i18n — 30 Languages — All 500+ keys translated including Arabic, Bulgarian, Danish, German, Spanish, Finnish, French, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Malay, Dutch, Norwegian, Polish, Portuguese (PT/BR), Romanian, Russian, Slovak, Swedish, Thai, Ukrainian, Vietnamese, Chinese, Filipino, English
RTL Support — Right-to-left support for Arabic and Hebrew
Multi-Language READMEs — 30 complete documentation translations
Language Selector — Globe icon in header for real-time switching

🔄 13. "I need more than chat — I need embeddings, images, audio"

AI isn't just chat completion. Devs need to generate images, transcribe audio, create embeddings for RAG, rerank documents, and moderate content. Each API has a different endpoint and format.

How OmniRoute solves it:

Embeddings — /v1/embeddings with 6 providers and 9+ models
Image Generation — /v1/images/generations with 10 providers and 20+ models (OpenAI, xAI, Together, Fireworks, Nebius, Hyperbolic, NanoBanana, Antigravity, SD WebUI, ComfyUI)
Text-to-Video — /v1/videos/generations — ComfyUI (AnimateDiff, SVD) and SD WebUI
Text-to-Music — /v1/music/generations — ComfyUI (Stable Audio Open, MusicGen)
Audio Transcription — /v1/audio/transcriptions — Whisper + Nvidia NIM, HuggingFace, Qwen3
Text-to-Speech — /v1/audio/speech — ElevenLabs, Nvidia NIM, HuggingFace, Coqui, Tortoise, Qwen3, Inworld, Cartesia, PlayHT, + existing providers
Moderations — /v1/moderations — Content safety checks
Reranking — /v1/rerank — Document relevance reranking
Responses API — Full /v1/responses support for Codex

🧪 14. "I have no way to test and compare quality across models"

Developers want to know which model is best for their use case — code, translation, reasoning — but comparing manually is slow. No integrated eval tools exist.

How OmniRoute solves it:

LLM Evaluations — Golden set testing with 10 pre-loaded cases covering greetings, math, geography, code generation, JSON compliance, translation, markdown, safety refusal
4 Match Strategies — exact, contains, regex, custom (JS function)
Translator Playground Test Bench — Batch testing with multiple inputs and expected outputs, cross-provider comparison
Chat Tester — Full round-trip with visual response rendering
Live Monitor — Real-time stream of all requests flowing through the proxy
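
The four match strategies (exact, contains, regex, custom) can be modeled as a tagged union with one evaluation function. The type and field names below are hypothetical, and trimming whitespace before an exact comparison is an assumption.

```typescript
type Matcher =
  | { strategy: "exact"; expected: string }
  | { strategy: "contains"; expected: string }
  | { strategy: "regex"; pattern: string }
  | { strategy: "custom"; fn: (output: string) => boolean };

// Decide whether a model output satisfies a golden-set expectation.
function matches(output: string, matcher: Matcher): boolean {
  switch (matcher.strategy) {
    case "exact":
      return output.trim() === matcher.expected; // assumed: ignore outer whitespace
    case "contains":
      return output.includes(matcher.expected);
    case "regex":
      return new RegExp(matcher.pattern).test(output);
    case "custom":
      return matcher.fn(output);
  }
}

console.log(matches("4", { strategy: "exact", expected: "4" }));                     // true
console.log(matches("The capital is Paris.", { strategy: "contains", expected: "Paris" })); // true
console.log(matches("id: 123", { strategy: "regex", pattern: "^id: \\d+$" }));       // true
```

A custom matcher is useful for checks such as JSON compliance, for example a function that returns true only when `JSON.parse` succeeds on the output.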

📈 15. "I need to scale without losing performance"

As request volume grows, repeated questions generate duplicate costs unless responses are cached. Without idempotency, duplicate requests waste processing. Per-provider rate limits must also be respected.

How OmniRoute solves it:

Semantic Cache — Two-tier cache (signature + semantic) reduces cost and latency
Request Idempotency — 5s deduplication window for identical requests
Rate Limit Detection — Per-provider RPM, min gap, and max concurrent tracking
Request Queue & Pacing — Configurable queue, pacing, and concurrency defaults in Settings → Resilience
API Key Validation Cache — 3-tier cache for production performance
Health Dashboard with Telemetry — p50/p95/p99 latency, cache stats, uptime
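
Request Idempotency with a 5s deduplication window can be sketched with a map from request key to last-seen time. The class name, the choice of key (here the serialized request body), and what happens to a detected duplicate are all assumptions; this sketch only detects duplicates.

```typescript
// Detect identical requests that arrive within a fixed time window.
class Deduplicator {
  private readonly seen = new Map<string, number>();
  private readonly windowMs: number;
  private readonly now: () => number;

  // The clock is injectable so the behavior can be tested deterministically.
  constructor(windowMs: number = 5000, now: () => number = Date.now) {
    this.windowMs = windowMs;
    this.now = now;
  }

  // Returns true when the same key was already seen inside the window.
  isDuplicate(key: string): boolean {
    const current = this.now();
    const last = this.seen.get(key);
    if (last !== undefined && current - last < this.windowMs) return true;
    this.seen.set(key, current);
    return false;
  }
}

// Usage with a fake clock.
let fakeNow = 0;
const dedup = new Deduplicator(5000, () => fakeNow);
const key = JSON.stringify({ model: "example-model", prompt: "hello" }); // hypothetical key

console.log(dedup.isDuplicate(key)); // false (first time)
fakeNow = 3000;
console.log(dedup.isDuplicate(key)); // true (3s later, inside the window)
fakeNow = 6000;
console.log(dedup.isDuplicate(key)); // false (6s after the first request)
```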

🤖 16. "I want to control model behavior globally"

Some developers want all responses in a specific language or tone, or want to limit reasoning tokens. Configuring this in every tool or request is impractical.

How OmniRoute solves it:

System Prompt Injection — Global prompt applied to all requests
Thinking Budget Validation — Reasoning token allocation control per request (passthrough, auto, custom, adaptive)
9 Routing Strategies — Global strategies that determine how requests are distributed
Wildcard Router — provider/* patterns route dynamically to any provider
Combo Enable/Disable Toggle — Toggle combos directly from the dashboard
Manual Combo Ordering — Drag combo cards by handle and persist the order in SQLite
Provider Toggle — Enable/disable all connections for a provider with one click
Blocked Providers — Exclude specific providers from /v1/models listing

🧰 17. "I need MCP tools as first-class product capabilities"

Many AI gateways expose MCP only as a hidden implementation detail. Teams need a visible, manageable operation layer.

How OmniRoute solves it:

MCP appears in the dashboard navigation and endpoint protocol tab
Dedicated MCP management page with process, tools, scopes, and audit
Built-in quick-start for omniroute --mcp and client onboarding

🧠 18. "I need A2A orchestration with sync + stream task paths"

Agent workflows need both direct replies and long-running streamed execution with lifecycle control.

How OmniRoute solves it:

A2A JSON-RPC endpoint (POST /a2a) with message/send and message/stream
SSE streaming with terminal state propagation
Task lifecycle APIs for tasks/get and tasks/cancel
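
As a rough illustration of what a client might send to POST /a2a, the sketch below builds JSON-RPC 2.0 envelopes for the methods named above. The envelope fields (`jsonrpc`, `id`, `method`, `params`) follow the JSON-RPC 2.0 specification; the shape of `params` is an assumption based on the public A2A protocol and is not confirmed by this document, and the helper names are hypothetical.

```typescript
type A2AMethod = "message/send" | "message/stream" | "tasks/get" | "tasks/cancel";

interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: string | number;
  method: A2AMethod;
  params: Record<string, unknown>;
}

// Build a text message request (sync or streamed). Params shape is assumed.
function textMessageRequest(
  method: "message/send" | "message/stream",
  text: string,
  messageId: string,
  id: string | number,
): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id,
    method,
    params: {
      message: { role: "user", parts: [{ kind: "text", text }], messageId },
    },
  };
}

// Build a task lifecycle request. Params shape is assumed.
function taskRequest(
  method: "tasks/get" | "tasks/cancel",
  taskId: string,
  id: string | number,
): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method, params: { id: taskId } };
}

const sendReq = textMessageRequest("message/send", "Summarize this repo", "msg-1", 1);
console.log(JSON.stringify(sendReq));
console.log(JSON.stringify(taskRequest("tasks/cancel", "task-42", 2)));
```

The serialized object would be sent as the body of a POST request with `Content-Type: application/json`; for `message/stream`, the response is delivered over SSE as described above.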

🛰️ 19. "I need real MCP process health, not guessed status"

Operational teams need to know if MCP is actually alive, not just whether an API is reachable.

How OmniRoute solves it:

Runtime heartbeat file with PID, timestamps, transport, tool count, and scope mode
MCP status API combining heartbeat + recent activity
UI status cards for process/uptime/heartbeat freshness
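
The heartbeat approach above can be sketched as a freshness check on a small JSON file. The field names (`pid`, `lastHeartbeatAt` as epoch seconds), the 30-second threshold, and both function names are assumptions for illustration only; the real heartbeat file format is not documented here.

```python
import json
import time


def load_heartbeat(path):
    """Read the heartbeat file; return None if it is missing or not valid JSON."""
    try:
        with open(path, "r", encoding="utf-8") as fh:
            data = json.load(fh)
    except (OSError, ValueError):
        return None
    return data if isinstance(data, dict) else None


def evaluate_heartbeat(beat, now=None, max_age_seconds=30.0):
    """Classify a heartbeat record as fresh or stale based on its age."""
    if beat is None:
        return {"alive": False, "reason": "no heartbeat file"}
    last = beat.get("lastHeartbeatAt")  # hypothetical field: epoch seconds
    if not isinstance(last, (int, float)):
        return {"alive": False, "reason": "no timestamp"}
    now = time.time() if now is None else now
    age = now - last
    fresh = age <= max_age_seconds
    return {
        "alive": fresh,
        "reason": "fresh" if fresh else "stale",
        "pid": beat.get("pid"),
        "ageSeconds": age,
    }
```

The point of the design is that a reachable HTTP API says nothing about the MCP process itself, whereas a stale or missing heartbeat is direct evidence that the process stopped writing.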

📋 20. "I need auditable MCP tool execution"

When tools mutate config or trigger ops actions, teams need forensic traceability.

How OmniRoute solves it:

SQLite-backed audit logging for MCP tool calls
Filters by tool, success/failure, API key, and pagination
Dashboard audit table + stats endpoints for automation

🔐 21. "I need scoped MCP permissions per integration"

Different clients should have least-privilege access to tool categories.

How OmniRoute solves it:

10 granular MCP scopes for controlled tool access
Scope enforcement and visibility in MCP management UI
Safe default posture for operational tooling

⚙️ 22. "I need operational controls without redeploying"

Teams need quick runtime changes during incidents or cost events.

How OmniRoute solves it:

Switch combo activation directly from the MCP dashboard
Tune queue, cooldown, breaker, and wait settings from the dedicated Resilience page
Review live provider breaker state from the Health dashboard

🔄 23. "I need live A2A task lifecycle visibility and cancellation"

Without lifecycle visibility, task incidents become hard to triage.

How OmniRoute solves it:

Task listing/filtering by state/skill with pagination
Drill-down on task metadata, events, and artifacts
Task cancellation endpoint and UI action with confirmation

🌊 24. "I need active stream metrics for A2A load"

Streaming workflows require operational insight into concurrency and live connections.

How OmniRoute solves it:

Active stream counters integrated into A2A status
Last task timestamp and per-state counts
A2A dashboard cards for real-time ops monitoring

🪪 25. "I need standard agent discovery for clients"

External clients and orchestrators need machine-readable metadata for onboarding.

How OmniRoute solves it:

Agent Card exposed at /.well-known/agent.json
Capabilities and skills shown in management UI
A2A status API includes discovery metadata for automation

🧭 26. "I need protocol discoverability in the product UX"

If users cannot discover protocol surfaces, adoption and support quality drop.

How OmniRoute solves it:

Consolidated Endpoints page with tabs for Proxy, MCP, A2A, and API Endpoints
Inline service status toggles (Online/Offline) for MCP and A2A
Links from overview to dedicated management tabs

🧪 27. "I need end-to-end protocol validation with real clients"

Mock tests are not enough to validate protocol compatibility before release.

How OmniRoute solves it:

E2E suite that boots the app and uses a real MCP SDK client transport
A2A client tests for discovery, send, stream, get, and cancel flows
Cross-check assertions against MCP audit and A2A tasks APIs

📡 28. "I need unified observability across all interfaces"

Splitting observability by protocol creates blind spots and longer MTTR.

How OmniRoute solves it:

Unified dashboards/logs/analytics in one product
Health + audit + request telemetry across OpenAI, MCP, and A2A layers
Operational APIs for status and automation

💼 29. "I need one runtime for proxy + tools + agent orchestration"

Running many separate services increases operational cost and failure modes.

How OmniRoute solves it:

OpenAI-compatible proxy, MCP server, and A2A server in one stack
Shared auth, resilience, data store, and observability
Consistent policy model across all interaction surfaces

🚀 30. "I need to ship agentic workflows without glue-code sprawl"

Teams lose velocity when stitching multiple ad-hoc services and scripts.

How OmniRoute solves it:

Unified endpoint strategy for clients and agents
Built-in protocol management UIs and smoke validation paths
Production-ready foundations (security, logging, resilience, backup)

📚 31. "My long sessions crash with 'context_length_exceeded' limits"

During deep debugging, long histories with tool results quickly exceed provider token windows, causing failed requests and orphaned context.

How OmniRoute solves it:

Proactive Context Compression — Evaluates token budgets before the request hits upstream and proactively prunes old conversation history with a smart binary-search mechanism.
Structural Integrity Guards — Automatically tracks explicit tool_use definitions and ensures that if a tool input is truncated, its corresponding tool_result is also safely removed, preventing API validation errors.
Multi-Layer Dropping — Progressively drops system messages, regular messages, and finally enforces strict length limits without breaking conversational logic.
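
The binary-search pruning and the tool_use/tool_result pairing described above can be sketched as follows. This is a simplified model under stated assumptions: messages are flat dicts (a `tool_use` carries `id`, a `tool_result` carries `tool_use_id`), `estimate_tokens` is a rough 4-characters-per-token heuristic rather than a real tokenizer, system messages are always kept, and all function names are hypothetical.

```python
def estimate_tokens(message):
    # Rough heuristic (assumption): about 4 characters per token.
    return max(1, len(str(message.get("content", ""))) // 4)


def drop_orphan_tool_results(messages):
    """Remove tool_result messages whose matching tool_use is no longer present."""
    known = {m["id"] for m in messages if m.get("type") == "tool_use"}
    return [
        m for m in messages
        if m.get("type") != "tool_result" or m.get("tool_use_id") in known
    ]


def compress_history(messages, budget):
    """Drop the fewest oldest non-system messages needed to fit the token budget."""
    system = [m for m in messages if m.get("role") == "system"]
    history = [m for m in messages if m.get("role") != "system"]

    def kept_after(drop_count):
        return system + drop_orphan_tool_results(history[drop_count:])

    def fits(drop_count):
        return sum(estimate_tokens(m) for m in kept_after(drop_count)) <= budget

    # Dropping more messages never increases the total, so fits() is monotonic
    # and the smallest workable drop count can be found by binary search.
    lo, hi = 0, len(history)
    while lo < hi:
        mid = (lo + hi) // 2
        if fits(mid):
            hi = mid
        else:
            lo = mid + 1
    return kept_after(lo)
```

If the cut falls between a tool call and its result, the orphaned `tool_result` is removed as well, which is what avoids the upstream validation error.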

Example Playbooks (Integrated Use Cases)

Playbook A: Maximize paid subscription + cheap backup

Combo: "maximize-claude"
  1. cc/claude-opus-4-7
  2. glm/glm-4.7
  3. if/kimi-k2-thinking

Monthly cost: $20 + small backup spend
Outcome: higher quality, near-zero interruption

Playbook B: Zero-cost coding stack

Combo: "free-forever"
  1. if/kimi-k2-thinking       (unlimited free)
  2. qw/qwen3-coder-plus       (unlimited free)

Monthly cost: $0
Outcome: stable free coding workflow

Playbook C: 24/7 always-on fallback chain

Combo: "always-on"
  1. cc/claude-opus-4-7
  2. cx/gpt-5.2-codex
  3. glm/glm-4.7
  4. minimax/MiniMax-M2.1
  5. if/kimi-k2-thinking

Outcome: deep fallback depth for deadline-critical workloads
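
Conceptually, a combo like the ones above is an ordered list that is tried until one target answers. The sketch below shows that idea only; `run_combo`, `ProviderUnavailable`, and the `send` callback are hypothetical names, and real routing also involves cooldowns, quotas, and strategy selection that are not modeled here.

```python
class ProviderUnavailable(Exception):
    """Raised by `send` when a target is rate-limited, out of quota, or down."""


def run_combo(models, send):
    """Try each model in order; return (model, response) from the first success."""
    errors = []
    for model in models:
        try:
            return model, send(model)
        except ProviderUnavailable as exc:
            errors.append((model, str(exc)))  # remember why this target was skipped
    raise RuntimeError(f"all combo targets failed: {errors}")


ALWAYS_ON = [
    "cc/claude-opus-4-7",
    "cx/gpt-5.2-codex",
    "glm/glm-4.7",
    "minimax/MiniMax-M2.1",
    "if/kimi-k2-thinking",
]
```

With this shape, the client only ever sees the first successful response; failures of earlier targets stay internal unless every target fails.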

Playbook D: Agent ops with MCP + A2A

1) Start MCP transport (`omniroute --mcp`) for tool-driven operations
2) Run A2A tasks via `message/send` and `message/stream`
3) Observe via /dashboard/endpoint (MCP and A2A tabs)
4) Toggle services via inline status controls

🆓 Start Free — Zero Configuration Cost

Set up AI coding in minutes at $0/month. Connect these free accounts and use the built-in Free Stack combo.

Step	Action	Providers Unlocked
1	Connect Kiro (AWS Builder ID OAuth)	Claude Sonnet 4.5, Haiku 4.5 — unlimited
2	Connect Qoder (Google OAuth)	kimi-k2-thinking, qwen3-coder-plus, deepseek-r1... — unlimited
3	Connect Qwen (Device Code)	qwen3-coder-plus, qwen3-coder-flash... — unlimited
4	`/dashboard/combos` → Free Stack ($0) template	Round-robin all free providers automatically

Point any IDE/CLI to: http://localhost:20128/v1 · API Key: any-string · Done.

Optional extra coverage (also free): Groq API key (30 RPM free), NVIDIA NIM (40 RPM free, 70+ models), Cerebras (1M tok/day), LongCat API key (50M tokens/day!), Cloudflare Workers AI (10K Neurons/day, 50+ models).

Quick Start

1) Install and run

npm install -g omniroute
omniroute

pnpm users: Pass --allow-build at install time to enable native build scripts required by better-sqlite3 and @swc/core (the approve-builds -g command is not supported for global installs on pnpm v11):
pnpm add -g omniroute@latest --allow-build=better-sqlite3 --allow-build=@swc/core
omniroute

Dashboard opens at http://localhost:20128 and API base URL is http://localhost:20128/v1.

Arch Linux (AUR)

Arch Linux users can install the AUR package, which installs OmniRoute and provides a systemd user service:

yay -S omniroute-bin
systemctl --user enable --now omniroute.service

Command	Description
`omniroute`	Start server (`PORT=20128`, API and dashboard on same port)
`omniroute --port 3000`	Set canonical/API port to 3000
`omniroute --mcp`	Start MCP server (stdio transport)
`omniroute --no-open`	Don't auto-open browser
`omniroute --help`	Show help

Optional split-port mode:

PORT=20128 DASHBOARD_PORT=20129 omniroute
# API:       http://localhost:20128/v1
# Dashboard: http://localhost:20129

Uninstalling

If you no longer need OmniRoute, two scripts provide a clean removal:

Command	Action
`npm run uninstall`	Removes the system app but keeps your DB and configurations in `~/.omniroute`.
`npm run uninstall:full`	Removes the app AND permanently erases all configurations, keys, and databases.

Note: Run these commands from the OmniRoute project folder (if you cloned it). If you installed OmniRoute globally, run npm uninstall -g omniroute instead.

Long-Running Streaming Timeouts

For most deployments, you only need:

Variable	Default	Purpose
`REQUEST_TIMEOUT_MS`	`600000`	Shared baseline for upstream response-start timeout, hidden Undici timeouts, TLS fingerprint requests, and API bridge request/proxy timeouts
`STREAM_IDLE_TIMEOUT_MS`	inherits `REQUEST_TIMEOUT_MS`	Maximum gap between streaming chunks before OmniRoute aborts the SSE stream

Backward compatibility is preserved: existing FETCH_TIMEOUT_MS, API_BRIDGE_PROXY_TIMEOUT_MS, and other per-layer timeout vars still work and override the shared baseline.

For Claude Code-compatible upstreams (anthropic-compatible-cc-*), OmniRoute also derives the outbound X-Stainless-Timeout header from the resolved fetch timeout so provider-side read timeouts stay aligned with your env configuration.

For third-party Claude Code-compatible reverse proxies, OmniRoute keeps the default anthropic-beta set conservative and, when Client Cache Control is left on Auto, only forwards client-provided cache_control markers. If the request does not include cache_control, OmniRoute does not inject bridge-owned markers.

Advanced overrides are available if you need finer control:

Variable	Default	Purpose
`FETCH_TIMEOUT_MS`	inherits `REQUEST_TIMEOUT_MS`	Upstream response-start timeout used until response headers arrive
`FETCH_HEADERS_TIMEOUT_MS`	inherits `FETCH_TIMEOUT_MS`	Undici time limit for receiving upstream response headers
`FETCH_BODY_TIMEOUT_MS`	inherits `FETCH_TIMEOUT_MS`	Undici time limit between upstream body chunks (`0` disables it)
`FETCH_CONNECT_TIMEOUT_MS`	`30000`	Undici TCP connect timeout
`FETCH_KEEPALIVE_TIMEOUT_MS`	`4000`	Undici idle keep-alive socket timeout
`TLS_CLIENT_TIMEOUT_MS`	inherits `FETCH_TIMEOUT_MS`	Timeout for TLS fingerprint requests made through `wreq-js`
`API_BRIDGE_PROXY_TIMEOUT_MS`	inherits `REQUEST_TIMEOUT_MS` or `600000`	Timeout for `/v1` proxy forwarding from API port to dashboard port
`API_BRIDGE_SERVER_REQUEST_TIMEOUT_MS`	`max(API_BRIDGE_PROXY_TIMEOUT_MS, 300000)`	Incoming request timeout on the API bridge server
`API_BRIDGE_SERVER_HEADERS_TIMEOUT_MS`	`60000`	Incoming header timeout on the API bridge server
`API_BRIDGE_SERVER_KEEPALIVE_TIMEOUT_MS`	`5000`	Keep-alive timeout on the API bridge server
`API_BRIDGE_SERVER_SOCKET_TIMEOUT_MS`	`0`	Socket inactivity timeout on the API bridge server (`0` disables it)

For streaming requests, FETCH_TIMEOUT_MS only covers connection setup / waiting for the first upstream response. Once the stream is active, OmniRoute will only abort on an actual stall (STREAM_IDLE_TIMEOUT_MS) or Undici body inactivity (FETCH_BODY_TIMEOUT_MS).

If you run OmniRoute behind Nginx, Caddy, Cloudflare, or another reverse proxy, make sure the proxy's own timeouts are higher than your OmniRoute stream/fetch timeouts; otherwise the proxy will cut long-running streams first.
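
The inheritance rules in the two tables above can be summarized in a few lines. This is a sketch of how the documented defaults compose, written from the tables rather than from OmniRoute's source; the `resolve_timeouts` function is hypothetical, and treating an explicit `API_BRIDGE_SERVER_REQUEST_TIMEOUT_MS` as an override of the `max(...)` default is an assumption.

```python
def resolve_timeouts(env):
    """Resolve effective timeouts (ms) from a mapping of environment variables."""

    def ms(name, fallback):
        raw = env.get(name)
        return int(raw) if raw not in (None, "") else fallback

    request = ms("REQUEST_TIMEOUT_MS", 600_000)          # shared baseline
    fetch = ms("FETCH_TIMEOUT_MS", request)              # inherits the baseline
    proxy = ms("API_BRIDGE_PROXY_TIMEOUT_MS", request)   # inherits the baseline
    return {
        "REQUEST_TIMEOUT_MS": request,
        "STREAM_IDLE_TIMEOUT_MS": ms("STREAM_IDLE_TIMEOUT_MS", request),
        "FETCH_TIMEOUT_MS": fetch,
        "FETCH_HEADERS_TIMEOUT_MS": ms("FETCH_HEADERS_TIMEOUT_MS", fetch),
        "FETCH_BODY_TIMEOUT_MS": ms("FETCH_BODY_TIMEOUT_MS", fetch),
        "TLS_CLIENT_TIMEOUT_MS": ms("TLS_CLIENT_TIMEOUT_MS", fetch),
        "API_BRIDGE_PROXY_TIMEOUT_MS": proxy,
        "API_BRIDGE_SERVER_REQUEST_TIMEOUT_MS": ms(
            "API_BRIDGE_SERVER_REQUEST_TIMEOUT_MS", max(proxy, 300_000)
        ),
    }
```

Calling `resolve_timeouts(os.environ)` (after `import os`) would show which values a given environment ends up with, for example that lowering only `FETCH_TIMEOUT_MS` leaves `STREAM_IDLE_TIMEOUT_MS` at the shared baseline.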

2) Connect providers and create your API key

Open Dashboard → Providers and connect at least one provider (OAuth or API key).
Open Dashboard → Endpoints and create an API key.
(Optional) Open Dashboard → Combos and set your fallback chain.

3) Point your coding tool to OmniRoute

Base URL: http://localhost:20128/v1
API Key:  [copy from Endpoint page]
Model:    if/kimi-k2-thinking (or any provider/model prefix)
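
For tools that are configured in code rather than through a settings screen, the same three values map onto a standard OpenAI-style request. The sketch below only builds the request object; the `/chat/completions` path and Bearer authentication follow the usual OpenAI-compatible convention and are assumed here, and `build_chat_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, model, prompt):
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_chat_request(
    "http://localhost:20128/v1",
    "your-api-key",            # placeholder: copy the real key from the Endpoints page
    "if/kimi-k2-thinking",
    "Say hello in one sentence.",
)
# With OmniRoute running, urllib.request.urlopen(request) would send it.
```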

4) Enable and validate protocols (v2.0)

MCP (for tool-driven operations):

omniroute --mcp

Then connect your MCP client over stdio and test tools like:

omniroute_get_health
omniroute_list_combos

A2A (for agent-to-agent workflows):

curl http://localhost:20128/.well-known/agent.json

curl -X POST http://localhost:20128/a2a \
  -H 'content-type: application/json' \
  -d '{"jsonrpc":"2.0","id":"quickstart","method":"message/send","params":{"skill":"quota-management","messages":[{"role":"user","content":"Give me a short quota summary."}]}}'
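
The same `message/send` call can be assembled and checked programmatically. The sketch below mirrors the curl payload above and applies the generic JSON-RPC 2.0 rule that a response carries either `result` or `error`; the helper names are hypothetical, and the shape of `result` is not specified here, so it is returned untouched.

```python
def build_message_send(request_id, skill, text):
    """Build the JSON-RPC 2.0 envelope used by the curl example above."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "message/send",
        "params": {
            "skill": skill,
            "messages": [{"role": "user", "content": text}],
        },
    }


def unwrap_response(response, request_id):
    """Return `result` from a JSON-RPC 2.0 response, or raise on error/mismatch."""
    if response.get("jsonrpc") != "2.0" or response.get("id") != request_id:
        raise ValueError("not a matching JSON-RPC 2.0 response")
    if "error" in response:
        err = response["error"]
        raise RuntimeError(f"A2A error {err.get('code')}: {err.get('message')}")
    return response["result"]
```

Checking the `id` before reading `result` matters once several requests are in flight, because it ties each reply to the call that produced it.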

5) Validate everything end-to-end (recommended)

npm run test:protocols:e2e

This suite validates real MCP and A2A client flows against a running app.

Alternative: run from source

cp .env.example .env
npm install
PORT=20128 DASHBOARD_PORT=20129 NEXT_PUBLIC_BASE_URL=http://localhost:20129 npm run dev

Void Linux (`xbps-src` template)

Void Linux users can build a native package with xbps-src. Save this block as srcpkgs/omniroute/template:

# Template file for 'omniroute'
pkgname=omniroute
version=3.4.1
revision=1
hostmakedepends="nodejs python3 make"
depends="openssl"
short_desc="Universal AI gateway with smart routing for multiple LLM providers"
maintainer="zenobit <zenobit@disroot.org>"
license="MIT"
homepage="https://github.com/diegosouzapw/OmniRoute"
distfiles="https://github.com/diegosouzapw/OmniRoute/archive/refs/tags/v${version}.tar.gz"
checksum=009400afee90a9f32599d8fe734145cfd84098140b7287990183dde45ae2245b
system_accounts="_omniroute"
omniroute_homedir="/var/lib/omniroute"
export NODE_ENV=production
export npm_config_engine_strict=false
export npm_config_loglevel=error
export npm_config_fund=false
export npm_config_audit=false

do_build() {
	# Determine target CPU arch for node-gyp
	local _gyp_arch
	case "$XBPS_TARGET_MACHINE" in
		aarch64*) _gyp_arch=arm64 ;;
		armv7*|armv6*) _gyp_arch=arm ;;
		i686*) _gyp_arch=ia32 ;;
		*) _gyp_arch=x64 ;;
	esac

	# 1) Install all deps – skip scripts (no network in do_build, native modules
	#    compiled separately below; better-sqlite3 is serverExternalPackage so
	#    Next.js does not execute it during next build)
	NODE_ENV=development npm ci --ignore-scripts

	# 2) Build the Next.js standalone bundle
	npm run build

	# 3) Copy static assets into standalone
	cp -r .next/static .next/standalone/.next/static
	[ -d public ] && cp -r public .next/standalone/public || true

	# 4) Compile better-sqlite3 native binding for the target architecture.
	#    Use node-gyp directly so CC/CXX from xbps-src cross-toolchain are used
	#    without npm altering them.
	local _node_gyp=/usr/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js
	(cd node_modules/better-sqlite3 && node "$_node_gyp" rebuild --arch="$_gyp_arch")

	# 5) Place the compiled binding into the standalone bundle
	local _bs3_release=.next/standalone/node_modules/better-sqlite3/build/Release
	mkdir -p "$_bs3_release"
	cp node_modules/better-sqlite3/build/Release/better_sqlite3.node "$_bs3_release/"

	# 6) Remove arch-specific sharp bundles – upstream sets images.unoptimized=true
	#    so sharp is not used at runtime; x64 .so files would break aarch64 strip
	rm -rf .next/standalone/node_modules/@img

	# 7) Copy pino runtime deps omitted by Next.js static analysis:
	#    pino-abstract-transport – required by pino's worker thread
	#    split2 – dep of pino-abstract-transport
	#    process-warning – dep of pino itself
	for _mod in pino-abstract-transport split2 process-warning; do
		cp -r "node_modules/$_mod" .next/standalone/node_modules/
	done
}

do_check() {
	npm run test:unit
}

do_install() {
	vmkdir usr/lib/omniroute/.next

	vcopy .next/standalone/. usr/lib/omniroute/.next/standalone

	# Prevent removal of empty Next.js app router dirs by the post-install hook
	for _d in \
		.next/standalone/.next/server/app/dashboard \
		.next/standalone/.next/server/app/dashboard/settings \
		.next/standalone/.next/server/app/dashboard/providers; do
		touch "${DESTDIR}/usr/lib/omniroute/${_d}/.keep"
	done

	cat > "${WRKDIR}/omniroute" <<'EOF'
#!/bin/sh
export PORT="${PORT:-20128}"
export DATA_DIR="${DATA_DIR:-${XDG_DATA_HOME:-${HOME}/.local/share}/omniroute}"
export APP_LOG_TO_FILE="${APP_LOG_TO_FILE:-false}"
mkdir -p "${DATA_DIR}"
exec node /usr/lib/omniroute/.next/standalone/server.js "$@"
EOF
	vbin "${WRKDIR}/omniroute"
}

post_install() {
	vlicense LICENSE
}

🐳 Docker

OmniRoute is available as a public Docker image on Docker Hub.

Quick run:

docker run -d \
  --name omniroute \
  --restart unless-stopped \
  --stop-timeout 40 \
  -p 20128:20128 \
  -v omniroute-data:/app/data \
  diegosouzapw/omniroute:latest

With environment file:

# Copy and edit .env first
cp .env.example .env

docker run -d \
  --name omniroute \
  --restart unless-stopped \
  --stop-timeout 40 \
  --env-file .env \
  -p 20128:20128 \
  -v omniroute-data:/app/data \
  diegosouzapw/omniroute:latest

Using Docker Compose:

# Base profile (no CLI tools)
docker compose --profile base up -d

# CLI profile (Claude Code, Codex, OpenClaw built-in)
docker compose --profile cli up -d

Dashboard support for Docker deployments now includes a one-click Cloudflare Quick Tunnel on Dashboard → Endpoints. The first enable downloads cloudflared only when needed, starts a temporary tunnel to your current /v1 endpoint, and shows the generated https://*.trycloudflare.com/v1 URL directly below your normal public URL.

Notes:

Quick Tunnel URLs are temporary and change after every restart.
Quick Tunnels are not auto-restored after an OmniRoute or container restart. Re-enable them from the dashboard when needed.
Managed install currently supports Linux, macOS, and Windows on x64 / arm64.
Managed Quick Tunnels default to HTTP/2 transport to avoid noisy QUIC UDP buffer warnings in constrained container environments. Set CLOUDFLARED_PROTOCOL=quic or auto if you want a different transport.
Docker images bundle system CA roots and pass them to managed cloudflared, which avoids TLS trust failures when the tunnel bootstraps inside the container.
SQLite runs in WAL mode. docker stop should be allowed to finish so OmniRoute can checkpoint the latest changes back into storage.sqlite.
The bundled Compose files already set a 40s stop grace period. If you run the image directly, keep --stop-timeout 40 (or similar) so manual stops do not cut off shutdown cleanup.
Set CLOUDFLARED_BIN=/absolute/path/to/cloudflared if you want OmniRoute to use an existing binary instead of downloading one.

Using Docker Compose with Caddy (HTTPS Auto-TLS):

OmniRoute can be securely exposed using Caddy's automatic SSL provisioning. Ensure your domain's DNS A record points to your server's IP.

services:
  omniroute:
    image: diegosouzapw/omniroute:latest
    container_name: omniroute
    restart: unless-stopped
    volumes:
      - omniroute-data:/app/data
    environment:
      - PORT=20128
      - NEXT_PUBLIC_BASE_URL=https://your-domain.com

  caddy:
    image: caddy:latest
    container_name: caddy
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    command: caddy reverse-proxy --from https://your-domain.com --to http://omniroute:20128

volumes:
  omniroute-data:

Image	Tag	Size	Description
`diegosouzapw/omniroute`	`latest`	~250MB	Latest stable release
`diegosouzapw/omniroute`	`3.6.2`	~250MB	Current version

🖥️ Desktop App — Offline & Always-On

🆕 NEW! OmniRoute is now available as a native desktop application for Windows, macOS, and Linux.

Run OmniRoute as a standalone desktop app — no terminal, no browser, no internet required for local models. The Electron-based app includes:

🖥️ Native Window — Dedicated app window with system tray integration
🔄 Auto-Start — Launch OmniRoute on system login
🔔 Native Notifications — Get alerts for quota exhaustion or provider issues
⚡ One-Click Install — NSIS (Windows), DMG (macOS), AppImage (Linux)
🌐 Offline Mode — Works fully offline with bundled server

Quick Start

# Development mode
npm run electron:dev

# Build for your platform
npm run electron:build         # Current platform
npm run electron:build:win     # Windows (.exe)
npm run electron:build:mac     # macOS (.dmg) — x64 & arm64
npm run electron:build:linux   # Linux (.AppImage)

System Tray

When minimized, OmniRoute lives in your system tray with quick actions:

Open dashboard
Change server port
Quit application

📖 Full documentation: electron/README.md

💰 Pricing at a Glance

Tier	Provider	Cost	Quota Reset	Best For
💳 SUBSCRIPTION	Claude Code (Pro)	$20/mo	5h + weekly	Already subscribed
	Codex (Plus/Pro)	$20-200/mo	5h + weekly	OpenAI users
	GitHub Copilot	$10-19/mo	Monthly	GitHub users
🔑 API KEY	NVIDIA NIM	FREE (dev forever)	~40 RPM	70+ open models
	Cerebras	FREE (1M tok/day)	60K TPM / 30 RPM	World's fastest
	Groq	FREE (30 RPM)	14.4K RPD	Ultra-fast Llama/Gemma
	DeepSeek V3.2	$0.27/$1.10 per 1M	None	Best price/quality reasoning
	xAI Grok-4 Fast	$0.20/$0.50 per 1M 🆕	None	Fastest + tool calling, ultralow
	xAI Grok-4 (standard)	$0.20/$1.50 per 1M 🆕	None	Reasoning flagship from xAI
	Mistral	Free trial + paid	Rate limited	European AI
	OpenRouter	Pay-per-use	None	100+ models aggr.
💰 CHEAP	GLM-5 (via Z.AI) 🆕	$0.5/1M	Daily 10AM	128K output, newest flagship
	GLM-4.7	$0.6/1M	Daily 10AM	Budget backup
	MiniMax M2.5 🆕	$0.3/1M input	5-hour rolling	Reasoning + agentic tasks
	MiniMax M2.1	$0.2/1M	5-hour rolling	Cheapest option
	Kimi K2.5 (Moonshot API) 🆕	Pay-per-use	None	Direct Moonshot API access
	Kimi K2	$9/mo flat	10M tokens/mo	Predictable cost
🆓 FREE	Qoder	$0	Unlimited	5 models unlimited
	Qwen	$0	Unlimited	4 models unlimited
	Kiro	$0	Unlimited	Claude Sonnet/Haiku (AWS Builder)
	LongCat Flash-Lite 🆕	$0 (50M tok/day 🔥)	1 RPS	Largest free quota on Earth
	Pollinations AI 🆕	$0 (no key needed)	1 req/15s	GPT-5, Claude, DeepSeek, Llama 4
	Cloudflare Workers AI 🆕	$0 (10K Neurons/day)	~150 resp/day	50+ models, global edge
	Scaleway AI 🆕	$0 (1M tokens total)	Rate limited	EU/GDPR, Qwen3 235B, Llama 70B

🆕 New models added (Mar 2026): Grok-4 Fast family at $0.20/$0.50/M (benchmarked at 1143ms — 30% faster than Gemini 2.5 Flash), GLM-5 via Z.AI with 128K output, MiniMax M2.5 reasoning, DeepSeek V3.2 updated pricing, Kimi K2.5 via Moonshot direct API.

💡 $0 Combo Stack — The Complete Free Setup:

# 🆓 Ultimate Free Stack 2026 — 11 Providers, $0 Forever
Kiro (kr/)             → Claude Sonnet/Haiku UNLIMITED
Qoder (if/)            → kimi-k2-thinking, qwen3-coder-plus, deepseek-r1 UNLIMITED
LongCat Lite (lc/)     → LongCat-Flash-Lite — 50M tokens/day 🔥
Pollinations (pol/)    → GPT-5, Claude, DeepSeek, Llama 4 — no key needed
Qwen (qw/)             → qwen3-coder-plus, qwen3-coder-flash, qwen3-coder-next UNLIMITED
Gemini (gemini/)       → Gemini 2.5 Flash — 1,500 req/day free API key
Cloudflare AI (cf/)    → Llama 70B, Gemma 3, Mistral — 10K Neurons/day
Scaleway (scw/)        → Qwen3 235B, Llama 70B — 1M free tokens (EU)
Groq (groq/)           → Llama/Gemma ultra-fast — 14.4K req/day
NVIDIA NIM (nvidia/)   → 70+ open models — 40 RPM forever
Cerebras (cerebras/)   → Llama/Qwen world-fastest — 1M tok/day

Zero cost. Never stops coding. Configure this as one OmniRoute combo and all fallbacks happen automatically — no manual switching ever.

🆓 Free Models — What You Actually Get

All models below are 100% free with zero credit card required. OmniRoute auto-routes between them when one quota runs out — combine them all for an unbreakable $0 combo.

🔵 CLAUDE MODELS (via Kiro — AWS Builder ID)

Model	Prefix	Limit	Rate Limit
`claude-sonnet-4.5`	`kr/`	Unlimited	No reported daily cap
`claude-haiku-4.5`	`kr/`	Unlimited	No reported daily cap
`claude-opus-4.6`	`kr/`	Unlimited	Latest Opus via Kiro

🟢 QODER MODELS (Free PAT via qodercli)

Model	Prefix	Limit	Rate Limit
`kimi-k2-thinking`	`if/`	Unlimited	No reported cap
`qwen3-coder-plus`	`if/`	Unlimited	No reported cap
`deepseek-r1`	`if/`	Unlimited	No reported cap
`minimax-m2.1`	`if/`	Unlimited	No reported cap
`kimi-k2`	`if/`	Unlimited	No reported cap

Recommended connection method: Personal Access Token + qodercli. Browser OAuth is experimental and disabled by default unless QODER_OAUTH_* environment variables are configured.

🟡 QWEN MODELS (Device Code Auth)

Model	Prefix	Limit	Rate Limit
`qwen3-coder-plus`	`qw/`	Unlimited	No reported cap
`qwen3-coder-flash`	`qw/`	Unlimited	No reported cap
`qwen3-coder-next`	`qw/`	Unlimited	No reported cap
`vision-model`	`qw/`	Unlimited	Multimodal (images)

⚫ NVIDIA NIM (Free API Key — build.nvidia.com)

Tier	Daily Limit	Rate Limit	Notes
Free (Dev)	No token cap	~40 RPM	70+ models; transitioning to pure rate limits mid-2025

Popular free models: moonshotai/kimi-k2.5 (Kimi K2.5), z-ai/glm4.7 (GLM 4.7), deepseek-ai/deepseek-v3.2 (DeepSeek V3.2), nvidia/llama-3.3-70b-instruct, deepseek/deepseek-r1

⚪ CEREBRAS (Free API Key — inference.cerebras.ai)

Tier	Daily Limit	Rate Limit	Notes
Free	1M tokens/day	60K TPM / 30 RPM	World's fastest LLM inference; resets daily

Available free: llama-3.3-70b, llama-3.1-8b, deepseek-r1-distill-llama-70b

🔴 GROQ (Free API Key — console.groq.com)

Tier	Daily Limit	Rate Limit	Notes
Free	14.4K RPD	30 RPM per model	No credit card; 429 on limit, not charged

Available free: llama-3.3-70b-versatile, gemma2-9b-it, mixtral-8x7b, whisper-large-v3

🔴 LONGCAT AI (Free API Key — longcat.chat) 🆕

Model	Prefix	Daily Free Quota	Notes
`LongCat-Flash-Lite`	`lc/`	50M tokens 💥	Largest free quota ever
`LongCat-Flash-Chat`	`lc/`	500K tokens	Multi-turn chat
`LongCat-Flash-Thinking`	`lc/`	500K tokens	Reasoning / CoT
`LongCat-Flash-Thinking-2601`	`lc/`	500K tokens	Jan 2026 version
`LongCat-Flash-Omni-2603`	`lc/`	500K tokens	Multimodal

100% free while in public beta. Sign up at longcat.chat with email or phone. Resets daily 00:00 UTC.

🟢 POLLINATIONS AI (No API Key Required) 🆕

Model	Prefix	Rate Limit	Provider Behind
`openai`	`pol/`	1 req/15s	GPT-5
`claude`	`pol/`	1 req/15s	Anthropic Claude
`gemini`	`pol/`	1 req/15s	Google Gemini
`deepseek`	`pol/`	1 req/15s	DeepSeek V3
`llama`	`pol/`	1 req/15s	Meta Llama 4 Scout
`mistral`	`pol/`	1 req/15s	Mistral AI

✨ Zero friction: No signup, no API key. Add the Pollinations provider with an empty key field and it works immediately.

🟠 CLOUDFLARE WORKERS AI (Free API Key — cloudflare.com) 🆕

Tier	Daily Neurons	Equivalent Usage	Notes
Free	10,000	~150 LLM resp / 500s audio / 15K embeds	Global edge, 50+ models

Popular free models: @cf/meta/llama-3.3-70b-instruct, @cf/google/gemma-3-12b-it, @cf/openai/whisper-large-v3-turbo (free audio!), @cf/qwen/qwen2.5-coder-15b-instruct

Requires API Token + Account ID from dash.cloudflare.com. Store Account ID in provider settings.

🟣 SCALEWAY AI (1M Free Tokens — scaleway.com) 🆕

Tier	Free Quota	Location	Notes
Free	1M tokens	🇫🇷 Paris, EU	No credit card needed within limits

Available free: qwen3-235b-a22b-instruct-2507 (Qwen3 235B!), llama-3.1-70b-instruct, mistral-small-3.2-24b-instruct-2506, deepseek-v3-0324

EU/GDPR compliant. Get API key at console.scaleway.com.

💡 The Ultimate Free Stack (11 Providers, $0 Forever):

Kiro (kr/)             → Claude Sonnet/Haiku UNLIMITED
Qoder (if/)            → kimi-k2-thinking, qwen3-coder-plus, deepseek-r1 UNLIMITED
LongCat Lite (lc/)     → LongCat-Flash-Lite — 50M tokens/day 🔥
Pollinations (pol/)    → GPT-5, Claude, DeepSeek, Llama 4 — no key needed
Qwen (qw/)             → qwen3-coder models UNLIMITED
Gemini (gemini/)       → Gemini 2.5 Flash — 1,500 req/day free
Cloudflare AI (cf/)    → 50+ models — 10K Neurons/day
Scaleway (scw/)        → Qwen3 235B, Llama 70B — 1M free tokens (EU)
Groq (groq/)           → Llama/Gemma — 14.4K req/day ultra-fast
NVIDIA NIM (nvidia/)   → 70+ open models — 40 RPM forever
Cerebras (cerebras/)   → Llama/Qwen world-fastest — 1M tok/day

🎙️ Free Transcription Combo

Transcribe any audio/video for $0 — Deepgram leads with $200 free, AssemblyAI $50 fallback, Groq Whisper as unlimited emergency backup.

Provider	Free Credits	Best Model	Rate Limit
🟢 Deepgram	$200 free (signup)	`nova-3` — best accuracy, 30+ languages	No RPM limit on free credits
🔵 AssemblyAI	$50 free (signup)	`universal-3-pro` — chapters, sentiment, PII	No RPM limit on free credits
🔴 Groq	Free forever	`whisper-large-v3` — OpenAI Whisper	30 RPM (rate limited)

Suggested combo in /dashboard/combos:

Name: free-transcription
Strategy: Priority
Nodes:
  [1] deepgram/nova-3          → uses $200 free first
  [2] assemblyai/universal-3-pro → fallback when Deepgram credits run out
  [3] groq/whisper-large-v3    → free forever, emergency fallback

Then in /dashboard/media → Transcription tab: upload any audio or video file → select your combo endpoint → get transcription in supported formats.

💡 Key Features

OmniRoute v3.6 is built as an operational platform, not just a relay proxy.

🆕 New — v3.6.x Highlights (Apr 2026)

Feature	What It Does
🌐 V1 WebSocket Bridge	OpenAI-compatible WebSocket traffic upgraded and proxied via `/v1/ws` — full streaming over WS with session auth (API key or session cookie)
🔑 Sync Tokens & Config Bundle	Issue/revoke sync tokens for config sync endpoints. Config bundles versioned with ETag for bandwidth-efficient polling
🧠 GLM Thinking (glmt) Preset	GLM Thinking registered first-class: 65 536 max tokens, 24 576 thinking budget, 900s timeout, usage sync & pricing — Claude-compatible API
🔢 Hybrid Token Counting	Uses provider-side `/messages/count_tokens` when available; falls back to estimation — accurate usage tracking without guessing
🌱 Model Alias Auto-Seed	30+ cross-proxy dialect aliases normalised at startup — no more routing mismatches
🛡️ Safe Outbound Fetch	All provider validation and model discovery go through a guarded fetch layer blocking private/local URLs with retry, timeout, and SSRF protection
⏳ Wait For Cooldown	Server-side chat retries when every candidate connection is cooling down; configurable `enabled`, `maxRetries`, and `maxRetryWaitSec`
🔍 Runtime Env Validation	Startup validates all env vars with Zod schemas — clear errors for missing secrets, invalid URLs, or wrong types
📋 Compliance Audit Expansion	Structured audit logs with pagination, request context, auth events, provider CRUD events, and SSRF-blocked validation logging
🔐 TPS Log Metric	Log details modal shows Tokens Per Second (TPS) — quick performance at-a-glance for every request
🗑️ Uninstall / Full Uninstall	`npm run uninstall` keeps data, `npm run uninstall:full` removes everything — clean removal for all install methods
🔧 OAuth Env Repair	One-click "Repair env" action for OAuth providers restores missing env vars and fixes broken auth state
🔒 Graceful Electron Shutdown	Electron `before-quit` shuts down Next.js gracefully, preventing SQLite WAL database locks on desktop close
👁️ Model Visibility Toggle	Per-model visibility toggle (👁 icon) with search filter and active-count badge (`N/M active`) on provider pages
📧 Email Privacy Masking	OAuth account emails masked (`di***@g**.com`), full address visible on hover
🔗 Context Relay Strategy	Combo strategy preserving session continuity via structured handoff summaries when accounts rotate mid-conversation
🛡️ Proxy Hardening	Token health check, API key validation, and undici dispatcher all honor proxy config
⚠️ Node.js 24 Login Warning	Login page proactively detects incompatible Node.js versions and shows a clear warning banner
📎 Gemini PDF Attachments	PDF attachments correctly routed to Gemini via `inline_data` and generic base64 detection
🔒 CodeQL Security Hardening	Resolved SSRF, insecure randomness, polynomial ReDoS, and incomplete URL sanitization alerts
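
The Safe Outbound Fetch row above describes blocking private and local URLs. A minimal sketch of that kind of check, using only the Python standard library, is shown below. It is not OmniRoute's guard: `is_blocked_target` is a hypothetical name, and the sketch inspects only literal hosts, whereas a production guard would also have to validate the addresses a hostname resolves to.

```python
import ipaddress
from urllib.parse import urlsplit


def is_blocked_target(url):
    """Return True if the URL points at a local, private, or non-HTTP target."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return True
    host = parts.hostname
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. This sketch lets hostnames through; resolved
        # addresses would need the same checks before connecting.
        return False
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_multicast
        or ip.is_unspecified
    )
```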

🆕 New — ClawRouter-Inspired Improvements (Mar 2026)

Feature	What It Does
⚡ Grok-4 Fast Family	xAI models at $0.20/$0.50/M — benchmarked 1143ms (30% faster than Gemini 2.5 Flash)
🧠 GLM-5 via Z.AI	128K output context, $0.5/1M — newest flagship from the GLM family
🔮 MiniMax M2.5	Reasoning + agentic tasks at $0.30/1M — significant upgrade from M2.1
🎯 toolCalling Flag per Model	Per-model `toolCalling: true/false` in registry — AutoCombo skips non-tool-capable models
🌍 Multilingual Intent Detection	PT/ZH/ES/AR keywords in AutoCombo scoring — better model selection for non-English content
📊 Benchmark-Driven Fallbacks	Real p95 latency from live requests feeds combo scoring — AutoCombo learns from actual data
🔁 Request Deduplication	Content-hash based dedup window — multi-agent safe, prevents duplicate charges
🔌 Pluggable RouterStrategy	Extensible `RouterStrategy` interface — add custom routing logic as plugins

🚀 Previous v2.0.9+ — Playground, CLI Fingerprints & ACP

Feature	What It Does
🎮 Model Playground	Dashboard page to test any model directly — provider/model/endpoint selectors, Monaco Editor, streaming, abort, timing
🔏 CLI Fingerprint Matching	Per-provider header/body ordering to match native CLI signatures — toggle per provider in Settings > Security. Your proxy IP is preserved
🤖 ACP Agents Dashboard	Debug › Agents page — grid of 14 agents with install status, version, custom agent form for any CLI tool. OpenCode users get a "Download opencode.json" button that auto-generates a ready-to-use config with all available models.
🔧 Custom Model `apiFormat` Routing	Custom models with `apiFormat: "responses"` now correctly route to the Responses API translator
🏢 Codex Workspace Isolation	Multiple Codex workspaces per email — OAuth correctly separates connections by workspace ID
🔄 Electron Auto-Update	Desktop app checks for updates and auto-installs them on restart

🤖 Agent & Protocol Operations (v2.0)

Feature	What It Does
🔧 MCP Server (25 tools)	IDE/agent tools via 3 transports: stdio, SSE (`/api/mcp/sse`), Streamable HTTP (`/api/mcp/stream`). 18 core + 3 memory + 4 skill tools
🤝 A2A Server (JSON-RPC + SSE)	Agent-to-agent task execution with sync and streaming flows
🧭 Consolidated Endpoints Page	Tabbed management page with Endpoint Proxy, MCP, A2A, and API Endpoints tabs
🎚️ Service Enable/Disable Toggles	ON/OFF switches for MCP and A2A with settings persistence (default: OFF)
🛰️ MCP Runtime Heartbeat	Real process status (pid, uptime, heartbeat age, transport, scope mode)
📋 MCP Audit Trail	Filterable audit logs with success/failure and key attribution
🔐 MCP Scope Enforcement	10 granular scope permissions for controlled tool access
📡 A2A Task Lifecycle Management	List/filter tasks, inspect events/artifacts, cancel running tasks
📋 Agent Card Discovery	`/.well-known/agent.json` for client auto-discovery
🧪 Protocol E2E Test Harness	Real MCP SDK + A2A client flows in `test:protocols:e2e`
⚙️ Operational Controls	Switch combos, tune resilience settings, and review breaker state from dedicated Health and Settings surfaces

🧠 Routing & Intelligence

Feature	What It Does
🎯 Smart 4-Tier Fallback	Auto-route: Subscription → API Key → Cheap → Free
📊 Real-Time Quota Tracking	Live token count + reset countdown per provider
🔄 Format Translation	OpenAI ↔ Claude ↔ Gemini ↔ Responses with schema-safe conversions
👥 Multi-Account Support	Multiple accounts per provider with intelligent selection
🔄 Auto Token Refresh	OAuth tokens refresh automatically with retry
🎨 Custom Combos	13 balancing strategies + fallback chain control
🔗 Context Relay	Session continuity handoffs when account rotation happens mid-session
🌐 Wildcard Router	`provider/*` dynamic routing
🧠 Thinking Budget Controls	Passthrough, auto, custom, and adaptive reasoning limits
🔀 Model Aliases	Built-in + custom model aliasing and migration safety
⚡ Background Degradation	Route low-priority background tasks to cheaper models
🧪 Task-Aware Smart Routing	Auto-select model by content type (coding/vision/analysis/summarization)
🔄 A2A Agent Workflows	Deterministic FSM orchestrator for stateful multi-step agent executions
🔀 Adaptive Routing	Dynamic strategy override based on token volume and prompt complexity
🎲 Provider Diversity	Shannon-entropy scoring that balances auto-combo traffic across providers
💬 System Prompt Injection	Global behavior controls applied consistently
📄 Responses API Compatibility	Full `/v1/responses` support for Codex and advanced agentic workflows
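
The Provider Diversity row refers to Shannon entropy. As a rough illustration, the entropy of a traffic distribution is H = -Σ p_i · log2(p_i), where p_i is the share of requests sent to provider i. The sketch below normalizes H by log2(n) so the score falls between 0 and 1; the function name `diversityScore` and the normalization are assumptions, not the scoring OmniRoute ships.

```typescript
// Normalized Shannon entropy of a traffic distribution.
// 0 = all traffic on one provider, 1 = perfectly even spread across all listed providers.
function diversityScore(requestCounts: Record<string, number>): number {
  const counts = Object.values(requestCounts);
  const total = counts.reduce((sum, c) => sum + c, 0);
  if (counts.length <= 1 || total <= 0) return 0;
  let entropy = 0;
  for (const c of counts) {
    if (c <= 0) continue; // a provider with no traffic contributes nothing
    const p = c / total;
    entropy -= p * Math.log2(p);
  }
  // Divide by the maximum possible entropy for this many providers.
  return entropy / Math.log2(counts.length);
}
```

A router could prefer candidates that raise this score when other factors are tied, which spreads load instead of concentrating it on one account.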

🎵 Multi-Modal APIs

Feature	What It Does
🖼️ Image Generation	`/v1/images/generations` with cloud and local backends
📐 Embeddings	`/v1/embeddings` for search and RAG pipelines
🎤 Audio Transcription	`/v1/audio/transcriptions` — 7 providers (Deepgram Nova 3, AssemblyAI, Groq Whisper, HuggingFace, ElevenLabs, OpenAI, Azure), auto-language detection, MP4/MP3/WAV support
🔊 Text-to-Speech	`/v1/audio/speech` — 10 providers (ElevenLabs, OpenAI, Deepgram, Cartesia, PlayHT, HuggingFace, Nvidia NIM, Inworld, Coqui, Tortoise) with correct error messages
🎬 Video Generation	`/v1/videos/generations` (ComfyUI + SD WebUI workflows)
🎵 Music Generation	`/v1/music/generations` (ComfyUI workflows)
🛡️ Moderations	`/v1/moderations` safety checks
🔀 Reranking	`/v1/rerank` for relevance scoring
🔍 Web Search 🆕	`/v1/search` — 5 providers (Serper, Brave, Perplexity, Exa, Tavily), 6,500+ free/month, auto-failover, cache

🛡️ Resilience, Security & Governance

Feature	What It Does
🔌 Provider Circuit Breakers	Provider-wide trip/recover after fallback exhaustion with configurable thresholds
🔒 Daily Quota Lock 🆕	Detects exhaustion signals and locks routing for the specific model until midnight
🎯 Endpoint-Aware Models	Custom models declare supported endpoints + API format
🛡️ Anti-Thundering Herd	Mutex + semaphore protections on retry/rate events
🧠 Semantic + Signature Cache	Cost/latency reduction with two cache layers
⚡ Request Idempotency	Duplicate protection window
🔒 TLS Fingerprint Spoofing	Browser-like TLS fingerprint — reduces bot detection and account flagging
🔏 CLI Fingerprint Matching	Matches native CLI request signatures — reduces ban risk while preserving proxy IP
🌐 IP Filtering	Allowlist/blocklist control for exposed deployments
🚦 Request Queue & Pacing	Configurable per-connection request buckets for RPM, spacing, concurrency, and max wait
📉 Graceful Degradation	Multi-layer capability fallbacks protecting core gateway operations
📜 Config Audit Trail	Diff-based change tracking preventing operational drift with simple rollbacks
⏳ Provider Health Sync	Proactive token expiration monitoring triggering alerts before authorization failures
❄️ Connection Cooldown	Retryable 408/429/5xx failures cool down a single connection with optional upstream hints
🚪 Auto-Disable Banned Accounts	Permanently blocked token accounts can be disabled automatically
🔑 API Key Management + Scoping	Secure key issuance/rotation and model/provider controls
👁️ Scoped API Key Reveal 🆕	Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL`
🛡️ Protected `/models`	Optional auth gating and provider hiding for model catalog
🛡️ Safe Outbound Fetch 🆕	Guarded fetch for provider calls — blocks private/local URLs, retries, SSRF protection
⏳ Wait For Cooldown 🆕	Auto-retry chat after connection cooldowns; configurable `enabled`, `maxRetries`, and `maxRetryWaitSec`
🔍 Runtime Env Validation 🆕	Zod-based env schema validation at startup with actionable error messages
📋 Compliance Audit v2 🆕	Pagination, request context, auth events, provider CRUD, and SSRF-blocked logging

📊 Observability & Analytics

Feature	What It Does
📝 Request + Proxy Logging	Full request/response and proxy logging
📉 Streamed Detailed Logs	Reconstructs SSE payload streams cleanly into the UI
🏷️ Real-Time Model Badges 🆕	Live model status and daily quota countdown timers
📋 Unified Logs Dashboard	Request, proxy, audit, and console views in one page
🔍 Request Telemetry	p50/p95/p99 latency and request tracing
🏥 Health Dashboard	Uptime, breaker states, lockouts, cache stats
💰 Cost Tracking	Budget controls and per-model pricing visibility
📈 Analytics Visualizations	Model/provider usage insights and trend views
🧪 Evaluation Framework	Golden set testing with configurable match strategies
📡 Live Diagnostics 🆕	Semantic cache bypass for accurate combo live testing
🔐 TPS Log Metric 🆕	Tokens Per Second badge in log details modal
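
For readers unfamiliar with the p50/p95/p99 figures in the Request Telemetry row, here is a minimal sketch of a nearest-rank percentile over latency samples. The function name `percentile` is hypothetical, and the nearest-rank method is one common choice; the README does not say which method OmniRoute uses.

```typescript
// Nearest-rank percentile over a list of latency samples (milliseconds).
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = [...samplesMs].sort((a, b) => a - b); // copy, then numeric sort
  const rank = Math.ceil((p / 100) * sorted.length);   // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}
```

With ten samples from 100 ms to 1000 ms in steps of 100, this returns 500 for p50 and 1000 for both p95 and p99, which shows why tail percentiles need many samples to be meaningful.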

☁️ Deployment & Platform

Feature	What It Does
🌐 Deploy Anywhere	Localhost, VPS, Docker, Cloud environments
🚇 Cloudflare Tunnel 🆕	One-click Quick Tunnel integration from the dashboard
🔑 API Key Model Filtering	Native /v1/models response filtered via assigned Bearer context roles
⚡ Smart Cache Bypass	Configurable TTL heuristics and forced refetch controls
🔄 Backup/Restore	Export/import and disaster recovery flows
🧙 Onboarding Wizard	First-run guided setup
🔧 CLI Tools Dashboard	One-click setup for popular coding tools
🎮 Model Playground	Test any provider/model/endpoint from the dashboard
🔏 CLI Fingerprint Toggle	Per-provider fingerprint matching in Settings > Security
🌐 i18n (30 languages)	Full dashboard + docs language support with RTL coverage
🧹 Clear All Models	One-click model list clearing in provider details
👁️ Sidebar Controls 🆕	Hide components and integrations from Appearance Settings
📋 Issue Templates	Standardized GitHub templates for bugs and features
📂 Custom Data Directory	`DATA_DIR` override for storage location
🌐 V1 WebSocket Bridge 🆕	OpenAI-compatible WebSocket traffic proxied via `/v1/ws`
🔑 Sync Tokens & Bundle 🆕	Config sync tokens + versioned bundle endpoint with ETag support

Feature Deep Dive

Smart fallback with practical cost control

Combo: "my-coding-stack"
  1. cc/claude-opus-4-7
  2. nvidia/llama-3.3-70b
  3. glm/glm-4.7
  4. if/kimi-k2-thinking

When a candidate fails because of quota exhaustion, rate limiting, or a failed health check, OmniRoute automatically moves to the next candidate without manual switching.
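
The fallback behavior can be sketched as a loop over the combo's ordered targets. This is a simplified, synchronous illustration: `runCombo`, `Attempt`, and `FailureReason` are hypothetical names, and real provider calls would be asynchronous.

```typescript
type FailureReason = "quota" | "rate_limit" | "unhealthy";
type AttemptResult = { ok: true; output: string } | { ok: false; reason: FailureReason };
// Hypothetical callback that tries one model and reports success or a failure reason.
type Attempt = (model: string) => AttemptResult;

// Try each combo target in order and return the first success.
function runCombo(models: string[], attempt: Attempt): { model: string; output: string } {
  const failures: string[] = [];
  for (const model of models) {
    const result = attempt(model);
    if (result.ok) return { model, output: result.output };
    failures.push(`${model}: ${result.reason}`); // remember why this target was skipped
  }
  throw new Error(`all combo targets failed (${failures.join(", ")})`);
}
```

Collecting the failure reasons makes the final error actionable when every target in the chain is exhausted.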

Protocol management that is visible and operable

MCP + A2A are discoverable in UI and docs (not hidden)
Protocol status APIs expose live operational data (/api/mcp/*, /api/a2a/*)
Dashboards include actions for day-2 ops (combo toggles, breaker resets, task cancellation)

Translator + validation workflow

The Translator area includes:

Playground: request transformation checks
Chat Tester: full request/response round-trip
Test Bench: multiple cases in one run
Live Monitor: real-time traffic view

Protocol validation with real clients is also available via npm run test:protocols:e2e.

📖 MCP Server README — Tool reference, IDE configs, and client examples

📖 A2A Server README — Skills, JSON-RPC methods, streaming, and task lifecycle

🧪 Evaluations (Evals)

OmniRoute includes a built-in evaluation framework to test LLM response quality against a golden set. Access it via Analytics → Evals in the dashboard.

Built-in Golden Set

The pre-loaded "OmniRoute Golden Set" contains test cases for:

Greetings, math, geography, code generation
JSON format compliance, translation, markdown generation
Safety refusal (harmful content), counting, boolean logic

Evaluation Strategies

Strategy	Description	Example
`exact`	Output must match exactly	`"4"`
`contains`	Output must contain substring (case-insensitive)	`"Paris"`
`regex`	Output must match regex pattern	`"1.2.3"`
`custom`	Custom JS function returns true/false	`(output) => output.length > 10`
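
The four strategies in the table can be sketched as a single matcher. The type `EvalCase` and function `passes` are hypothetical names, and details such as whether `exact` trims whitespace are assumptions (this sketch does not trim).

```typescript
type EvalCase =
  | { strategy: "exact"; expected: string }
  | { strategy: "contains"; expected: string }
  | { strategy: "regex"; expected: string }
  | { strategy: "custom"; check: (output: string) => boolean };

// Decide whether a model output satisfies one golden-set case.
function passes(output: string, testCase: EvalCase): boolean {
  switch (testCase.strategy) {
    case "exact":
      return output === testCase.expected; // strict equality, no trimming
    case "contains":
      return output.toLowerCase().includes(testCase.expected.toLowerCase()); // case-insensitive
    case "regex":
      return new RegExp(testCase.expected).test(output);
    case "custom":
      return testCase.check(output);
  }
}
```

For example, `passes("The capital is PARIS.", { strategy: "contains", expected: "Paris" })` succeeds because the comparison ignores case.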

📖 Setup Guide

Protocol Setup (MCP + A2A)

🧩 MCP Setup (Model Context Protocol)

Start MCP transport in stdio mode:

omniroute --mcp

Recommended validation flow:

Connect your MCP client over stdio.
Run omniroute_get_health.
Run omniroute_list_combos.
Open /dashboard/mcp to confirm heartbeat, activity, and audit.

Useful APIs for automation:

GET /api/mcp/status
GET /api/mcp/tools
GET /api/mcp/audit
GET /api/mcp/audit/stats

🤝 A2A Setup (Agent2Agent)

Discover the agent:

curl http://localhost:20128/.well-known/agent.json

Send a task:

curl -X POST http://localhost:20128/a2a \
  -H 'content-type: application/json' \
  -d '{"jsonrpc":"2.0","id":"setup-a2a","method":"message/send","params":{"skill":"quota-management","messages":[{"role":"user","content":"Summarize quota status."}]}}'
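
The same request body can be built programmatically. The sketch below only constructs the JSON-RPC 2.0 envelope used in the curl example; the helper name `buildSendRequest` and the interface names are hypothetical, and the field layout is copied from the command above rather than from a published schema.

```typescript
interface A2aMessage {
  role: string;
  content: string;
}

interface JsonRpcRequest<P> {
  jsonrpc: "2.0";
  id: string;
  method: string;
  params: P;
}

// Build the `message/send` envelope shown in the curl example.
function buildSendRequest(
  id: string,
  skill: string,
  text: string
): JsonRpcRequest<{ skill: string; messages: A2aMessage[] }> {
  return {
    jsonrpc: "2.0",
    id,
    method: "message/send",
    params: { skill, messages: [{ role: "user", content: text }] },
  };
}

const a2aBody = JSON.stringify(
  buildSendRequest("setup-a2a", "quota-management", "Summarize quota status.")
);
```

The resulting string can be sent as the body of a POST to `http://localhost:20128/a2a` with a `content-type: application/json` header, for instance with `fetch`.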

Manage lifecycle:

GET /api/a2a/status
GET /api/a2a/tasks
GET /api/a2a/tasks/:id
POST /api/a2a/tasks/:id/cancel

Operational UI:

/dashboard/a2a for task/state/stream observability and smoke actions

🧪 End-to-end protocol validation

Validate both protocols with real clients:

npm run test:protocols:e2e

This verifies:

MCP SDK client connect/list/call
A2A discovery/send/stream/get/cancel
Cross-check data in MCP audit and A2A task management APIs

💳 Subscription Providers

Claude Code (Pro/Max)

Dashboard → Providers → Connect Claude Code
→ OAuth login → Auto token refresh
→ 5-hour + weekly quota tracking

Models:
  cc/claude-opus-4-7
  cc/claude-sonnet-4-5-20250929
  cc/claude-haiku-4-5-20251001

Pro Tip: Use Opus for complex tasks, Sonnet for speed. OmniRoute tracks quota per model!

OpenAI Codex (Plus/Pro)

Dashboard → Providers → Connect Codex
→ OAuth login (port 1455)
→ 5-hour + weekly reset

Models:
  cx/gpt-5.2-codex
  cx/gpt-5.1-codex-max

Codex Account Limit Management (5h + Weekly)

Each Codex account has policy toggles in Dashboard → Providers:

5h (ON/OFF): enforce the 5-hour window threshold policy.
Weekly (ON/OFF): enforce the weekly window threshold policy.
Threshold behavior: when an enabled window reaches >=90% usage, that account is skipped.
Rotation behavior: OmniRoute routes to the next eligible Codex account automatically.
Reset behavior: when the provider resetAt time passes, the account becomes eligible again automatically.

Scenarios:

5h ON + Weekly ON: account is skipped when either window reaches threshold.
5h OFF + Weekly ON: only weekly usage can block the account.
5h ON + Weekly OFF: only 5-hour usage can block the account.
resetAt passed: account re-enters rotation automatically (no manual re-enable).
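
The toggle, threshold, and reset rules above can be expressed as a small eligibility check. This is a sketch under stated assumptions: the type and function names are hypothetical, usage is modeled as a percentage, and `resetAt` is modeled as an epoch timestamp in milliseconds.

```typescript
interface WindowUsage {
  enabled: boolean;    // policy toggle (ON/OFF)
  usedPercent: number; // usage of this window, 0 to 100
  resetAt: number;     // provider reset time, epoch milliseconds
}

interface CodexAccount {
  id: string;
  fiveHour: WindowUsage;
  weekly: WindowUsage;
}

const THRESHOLD_PERCENT = 90;

// A window blocks the account only if it is enabled, not yet reset, and at or over the threshold.
function windowBlocks(w: WindowUsage, nowMs: number): boolean {
  if (!w.enabled) return false;
  if (nowMs >= w.resetAt) return false;
  return w.usedPercent >= THRESHOLD_PERCENT;
}

function isEligible(account: CodexAccount, nowMs: number): boolean {
  return !windowBlocks(account.fiveHour, nowMs) && !windowBlocks(account.weekly, nowMs);
}

// Rotation: pick the first account that no enabled window is blocking.
function nextEligible(accounts: CodexAccount[], nowMs: number): CodexAccount | undefined {
  return accounts.find((a) => isEligible(a, nowMs));
}
```

Because the reset check depends only on the current time, an account re-enters rotation without any stored "disabled" flag needing to be cleared.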

GitHub Copilot

Dashboard → Providers → Connect GitHub
→ OAuth via GitHub
→ Monthly reset (1st of month)

Models:
  gh/gpt-5
  gh/claude-4.5-sonnet
  gh/gemini-3.1-pro-preview

🔑 API Key Providers

NVIDIA NIM (FREE developer access — 70+ models)

Sign up: build.nvidia.com
Get free API key (1000 inference credits included)
Dashboard → Add Provider → NVIDIA NIM:
- API Key: nvapi-your-key

Models: nvidia/llama-3.3-70b-instruct, nvidia/mistral-7b-instruct, and 50+ more

Pro Tip: OpenAI-compatible API — works seamlessly with OmniRoute's format translation!

DeepSeek

Sign up: platform.deepseek.com
Get API key
Dashboard → Add Provider → DeepSeek

Models: deepseek/deepseek-chat, deepseek/deepseek-coder

Groq (Free Tier Available!)

Sign up: console.groq.com
Get API key (free tier included)
Dashboard → Add Provider → Groq

Models: groq/llama-3.3-70b, groq/mixtral-8x7b

Pro Tip: Ultra-fast inference — best for real-time coding!

OpenRouter (100+ Models)

Sign up: openrouter.ai
Get API key
Dashboard → Add Provider → OpenRouter

Models: Access 100+ models from all major providers through a single API key.

Dashboard behavior: OpenRouter models are managed from Available Models. Manual add, import, and auto-sync all update the same list.

💰 Cheap Providers (Backup)

GLM-4.7 (Daily reset, $0.6/1M)

Sign up: Zhipu AI
Get API key from Coding Plan
Dashboard → Add API Key:
- Provider: glm
- API Key: your-key

Use: glm/glm-4.7

Pro Tip: Coding Plan offers 3× quota at 1/7 cost! Quota resets daily at 10:00 AM.

MiniMax M2.1 (5h reset, $0.20/1M)

Sign up: MiniMax
Get API key
Dashboard → Add API Key

Use: minimax/MiniMax-M2.1

Pro Tip: Cheapest option for long context (1M tokens)!

Kimi K2 ($9/month flat)

Subscribe: Moonshot AI
Get API key
Dashboard → Add API Key

Use: kimi/kimi-latest

Pro Tip: Fixed $9/month for 10M tokens = $0.90/1M effective cost!

🆓 FREE Providers (Emergency Backup)

Qoder (5 FREE models via OAuth)

Dashboard → Connect Qoder
→ Qoder OAuth login
→ Unlimited usage

Models:
  if/kimi-k2-thinking
  if/qwen3-coder-plus
  if/glm-4.7
  if/minimax-m2
  if/deepseek-r1

Qwen (4 FREE models via Device Code)

Dashboard → Connect Qwen
→ Device code authorization
→ Unlimited usage

Models:
  qw/qwen3-coder-plus
  qw/qwen3-coder-flash

Kiro (Claude FREE)

Dashboard → Connect Kiro
→ AWS Builder ID or Google/GitHub
→ Unlimited usage

Models:
  kr/claude-sonnet-4.5
  kr/claude-haiku-4.5

🎨 Create Combos

Example 1: Maximize Subscription → Cheap Backup

Dashboard → Combos → Create New

Name: premium-coding
Models:
  1. cc/claude-opus-4-7 (Subscription primary)
  2. glm/glm-4.7 (Cheap backup, $0.6/1M)
  3. minimax/MiniMax-M2.1 (Cheapest fallback, $0.20/1M)

Use in CLI: premium-coding

Example 2: Free-Only (Zero Cost)

Name: free-combo
Models:
  1. if/kimi-k2-thinking (unlimited)
  2. qw/qwen3-coder-plus (unlimited)

Cost: $0 forever!

🔧 CLI Integration

Cursor IDE

Settings → Models → Advanced:
  OpenAI API Base URL: http://localhost:20128/v1
  OpenAI API Key: [from OmniRoute dashboard]
  Model: cc/claude-opus-4-7

Claude Code

Use the CLI Tools page in the dashboard for one-click configuration, or edit ~/.claude/settings.json manually.

Codex CLI

export OPENAI_BASE_URL="http://localhost:20128"
export OPENAI_API_KEY="your-omniroute-api-key"

codex "your prompt"

OpenClaw

Option 1 — Dashboard (recommended):

Dashboard → CLI Tools → OpenClaw → Select Model → Apply

Option 2 — Manual: Edit ~/.openclaw/openclaw.json:

{
  "models": {
    "providers": {
      "omniroute": {
        "baseUrl": "http://127.0.0.1:20128/v1",
        "apiKey": "sk_omniroute",
        "api": "openai-completions"
      }
    }
  }
}

Note: OpenClaw only works with local OmniRoute. Use 127.0.0.1 instead of localhost to avoid IPv6 resolution issues.

Cline / Continue / RooCode

Settings → API Configuration:
  Provider: OpenAI Compatible
  Base URL: http://localhost:20128/v1
  API Key: [from OmniRoute dashboard]
  Model: if/kimi-k2-thinking

OpenCode

Step 1: Add OmniRoute as a custom provider:

opencode
/connect
# Select "Other" → Enter ID: "omniroute" → Enter your OmniRoute API key

Step 2: Create/edit opencode.json in your project root:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "omniroute": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "OmniRoute",
      "options": {
        "baseURL": "http://localhost:20128/v1"
      },
      "models": {
        "cc/claude-sonnet-4-20250514": { "name": "Claude Sonnet 4" },
        "gg/gemini-2.5-pro": { "name": "Gemini 2.5 Pro" },
        "if/kimi-k2-thinking": { "name": "Kimi K2 (Free)" }
      }
    }
  }
}

Step 3: Select the model in OpenCode:

/models
# Select any OmniRoute model from the list

Tip: Add any model available in your OmniRoute /v1/models endpoint to the models section. Use the format provider/model-id from your OmniRoute dashboard.

Troubleshooting

"Language model did not provide messages"

Provider quota exhausted → Check dashboard quota tracker
Solution: Use combo fallback or switch to cheaper tier

Rate limiting

Subscription quota out → Fallback to GLM/MiniMax
Add combo: cc/claude-opus-4-7 → glm/glm-4.7 → if/kimi-k2-thinking

OAuth token expired

Auto-refreshed by OmniRoute
If issues persist: Dashboard → Provider → Reconnect

High costs

Check usage stats in Dashboard → Costs
Switch primary model to GLM/MiniMax

Dashboard/API ports are wrong

PORT is the canonical base port (and API port by default)
API_PORT overrides only OpenAI-compatible API listener
DASHBOARD_PORT overrides only dashboard/Next.js listener
Set NEXT_PUBLIC_BASE_URL to your dashboard/public URL (for OAuth callbacks)
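
The port precedence described above can be sketched as a small resolver. The function name `resolvePorts` is hypothetical, and two details are assumptions: the default base port of 20128 (taken from the URLs used throughout this README) and the dashboard falling back to `PORT` when `DASHBOARD_PORT` is unset.

```typescript
type Env = Record<string, string | undefined>;

// Resolve the API and dashboard ports from environment variables.
function resolvePorts(env: Env): { apiPort: number; dashboardPort: number } {
  // Accept only integer values in the valid TCP port range; anything else counts as unset.
  const toPort = (value: string | undefined): number | undefined => {
    const n = Number(value);
    return value !== undefined && Number.isInteger(n) && n > 0 && n < 65536 ? n : undefined;
  };
  const base = toPort(env.PORT) ?? 20128; // assumed default, matching the README examples
  return {
    apiPort: toPort(env.API_PORT) ?? base,             // API_PORT overrides only the API listener
    dashboardPort: toPort(env.DASHBOARD_PORT) ?? base, // DASHBOARD_PORT overrides only the dashboard
  };
}
```

For instance, `resolvePorts({ PORT: "3000", API_PORT: "4000" })` yields an API port of 4000 while the dashboard stays on 3000.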

Cloud sync errors

Verify BASE_URL points to your running instance
Verify CLOUD_URL points to your expected cloud endpoint
Keep NEXT_PUBLIC_* values aligned with server-side values

First login not working

Check INITIAL_PASSWORD in .env
If unset, fallback password is 123456

No request logs

call_logs in SQLite stores summary metadata for the Request Logs table and analytics views
Detailed request/response payloads are written to DATA_DIR/call_logs/ as one JSON artifact per request
Enable pipeline capture from Dashboard → Logs → Request Logs if you need detailed per-stage payloads
Export Logs reads the artifact files on demand, while Export All includes the call_logs/ directory alongside storage.sqlite
Set APP_LOG_TO_FILE=true if you also want application console logs in logs/application/app.log
Adjust APP_LOG_MAX_FILE_SIZE, APP_LOG_RETENTION_DAYS, APP_LOG_MAX_FILES, and CALL_LOG_MAX_ENTRIES as needed

Connection test shows "Invalid" for OpenAI-compatible providers

Many providers don't expose a /models endpoint
OmniRoute v1.0.6+ includes fallback validation via chat completions
Ensure base URL includes /v1 suffix

🔐 OAuth on a Remote Server

⚠️ Important for users running OmniRoute on a VPS, Docker, or any remote server

The OAuth credentials bundled in OmniRoute are registered for localhost only. When you access OmniRoute on a remote server (e.g. https://omniroute.myserver.com), Google rejects the authentication with:

Error 400: redirect_uri_mismatch

Solution: Configure your own OAuth credentials

You need to create an OAuth 2.0 Client ID in Google Cloud Console with your server's URI.

Step-by-step

1. Open Google Cloud Console

Go to: https://console.cloud.google.com/apis/credentials

2. Create a new OAuth 2.0 Client ID

Click "+ Create Credentials" → "OAuth client ID"
Application type: "Web application"
Name: anything you like (e.g. OmniRoute Remote)

3. Add Authorized Redirect URIs

In the "Authorized redirect URIs" field, add:

https://your-server.com/callback

Replace your-server.com with your server's domain or IP (include the port if needed, e.g. http://45.33.32.156:20128/callback).

4. Save and copy the credentials

After creating, Google will show the Client ID and Client Secret.

5. Set environment variables

In your .env (or Docker environment variables):

# For Antigravity:
ANTIGRAVITY_OAUTH_CLIENT_ID=your-client-id.apps.googleusercontent.com
ANTIGRAVITY_OAUTH_CLIENT_SECRET=GOCSPX-your-secret

GEMINI_OAUTH_CLIENT_ID=your-client-id.apps.googleusercontent.com
GEMINI_OAUTH_CLIENT_SECRET=GOCSPX-your-secret

6. Restart OmniRoute

# npm:
npm run dev

# Docker:
docker restart omniroute

7. Try connecting again

Google will now redirect correctly to https://your-server.com/callback.

Temporary workaround (without custom credentials)

If you don't want to set up your own credentials right now, you can still use the manual URL flow:

OmniRoute opens the Google authorization URL
After authorizing, Google tries to redirect to localhost (which fails on the remote server)
Copy the full URL from your browser's address bar (even if the page doesn't load)
Paste that URL into the field shown in the OmniRoute connection modal
Click "Connect"

This works because the authorization code in the URL is valid regardless of whether the redirect page loaded.
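
To illustrate why pasting the failed redirect URL is enough, the sketch below pulls the `code` query parameter out of such a URL. The function name `extractAuthCode` and the example callback path are assumptions; how OmniRoute parses the pasted URL internally is not documented here.

```typescript
// Extract the OAuth authorization code from a pasted redirect URL.
// Returns undefined when the URL carries no `code` parameter.
function extractAuthCode(pastedUrl: string): string | undefined {
  const queryStart = pastedUrl.indexOf("?");
  if (queryStart === -1) return undefined;
  // Drop any fragment, then walk the key=value pairs of the query string.
  const query = pastedUrl.slice(queryStart + 1).split("#")[0];
  for (const pair of query.split("&")) {
    const eq = pair.indexOf("=");
    if (eq === -1) continue; // bare key without a value
    if (pair.slice(0, eq) === "code") {
      return decodeURIComponent(pair.slice(eq + 1));
    }
  }
  return undefined;
}
```

The code is percent-decoded because providers commonly URL-encode characters such as `/` in the redirect.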

🛠️ Tech Stack

Runtime: Node.js 18–22 LTS (⚠️ Node.js 24+ is not supported — better-sqlite3 native binaries are incompatible)
Language: TypeScript 5.9 — 100% TypeScript across src/ and open-sse/ (zero any in core modules since v2.0)
Framework: Next.js 16 + React 19 + Tailwind CSS 4
Database: better-sqlite3 (SQLite) + LowDB (JSON legacy) — domain state, proxy logs, MCP audit, routing decisions, memory, skills
Schemas: Zod (MCP tool I/O validation, API contracts)
Protocols: MCP (stdio/HTTP) + A2A v0.3 (JSON-RPC 2.0 + SSE)
Streaming: Server-Sent Events (SSE)
Auth: OAuth 2.0 (PKCE) + JWT + API Keys + MCP Scoped Authorization
Testing: Node.js test runner + Vitest (900+ tests including unit, integration, E2E)
CI/CD: GitHub Actions (auto npm publish + Docker Hub on release)
Website: omniroute.online
Package: npmjs.com/package/omniroute
Docker: hub.docker.com/r/diegosouzapw/omniroute
Resilience: Circuit breaker, exponential backoff, anti-thundering herd, TLS spoofing, auto-combo self-healing

Documentation

Document	Description
User Guide	Providers, combos, CLI integration, deployment
API Reference	All endpoints with examples
MCP Server	25 MCP tools, IDE configs, Python/TS/Go clients
A2A Server	JSON-RPC 2.0 protocol, skills, streaming, task mgmt
Auto-Combo Engine	6-factor scoring, mode packs, self-healing
Context Relay	Session handoff strategy for account rotation
Troubleshooting	Common problems and solutions
Architecture	System architecture and internals
Codebase Documentation	Beginner-friendly codebase walkthrough
Uninstall Guide	Clean removal for all install methods
Environment Config	Complete `.env` variables and references
Contributing	Development setup and guidelines
OpenAPI Spec	OpenAPI 3.0 specification
Security Policy	Vulnerability reporting and security practices
VM Deployment	Complete guide: VM + nginx + Cloudflare setup
Features Gallery	Visual dashboard tour with screenshots
Release Checklist	Pre-release validation steps

🗺️ Roadmap

OmniRoute has 218+ features planned across multiple development phases. Here are the key areas:

Category	Planned Features	Highlights
🧠 Routing & Intelligence	25+	Lowest-latency routing, tag-based routing, quota preflight, quota-aware P2C, step-based combo routing
🔒 Security & Compliance	20+	SSRF hardening, credential cloaking, rate-limit per endpoint, management key scoping
📊 Observability	15+	OpenTelemetry integration, real-time quota monitoring, combo target health, cost tracking per model
🔄 Provider Integrations	20+	Dynamic model registry, connection cooldowns, multi-account Codex, Copilot quota parsing
⚡ Performance	15+	Dual cache layer, prompt cache, response cache, streaming keepalive, batch API
🌐 Ecosystem	10+	WebSocket API, config hot-reload, distributed config store, commercial mode

🔜 Coming Soon

🔗 OpenCode Integration — Native provider support for the OpenCode AI coding IDE
🔗 TRAE Integration — Full support for the TRAE AI development framework
📦 Batch API — Asynchronous batch processing for bulk requests
🎯 Tag-Based Routing — Route requests based on custom tags and metadata
💰 Lowest-Cost Strategy — Automatically select the cheapest available provider

📝 Full feature specifications available in docs/new-features/ (217 detailed specs)

👥 Contributors

How to Contribute

Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

See CONTRIBUTING.md for detailed guidelines.

Releasing a New Version

# Create a release — npm publish happens automatically
gh release create v2.0.0 --title "v2.0.0" --generate-notes

🙏 Acknowledgments

Special thanks to CLIProxyAPI — the original Go implementation that inspired this JavaScript port.

License

MIT License - see LICENSE for details.

Built with ❤️ for developers who code 24/7
omniroute.online
