chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)
Replace x in (...) with x in {...} for all literal-tuple
membership tests. Set lookup is O(1) vs O(n) for tuple — consistent
micro-optimization across the codebase.
608 instances fixed via ruff --fix --unsafe-fixes, 0 remaining.
133 files, +626/-626 (net zero).
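A minimal sketch of the rewrite this rule performs. The function names and status strings are hypothetical and only illustrate the before/after shape; they are not taken from the codebase.

```python
# Hypothetical membership test, before the PLR6201 fix: literal tuple.
def is_terminal_before(status):
    return status in ("done", "failed", "cancelled")


# After the fix: literal set, same result for hashable values.
def is_terminal_after(status):
    return status in {"done", "failed", "cancelled"}


# Both forms agree for every hashable probe value.
samples = ["done", "running", None, 3]
agree = all(is_terminal_before(s) == is_terminal_after(s) for s in samples)
```

For literals this small the speed difference is negligible; the gain is mostly consistency.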
feat(skills): add skill bundles — alias /<name> loads multiple skills (#28373)
Skill bundles are tiny YAML files in ~/.hermes/skill-bundles/ that
group several skills under one slash command. Invoking /<bundle-name>
from any surface (CLI, TUI, dashboard, any gateway platform) loads
every referenced skill into a single combined user message.
Use cases:
- /backend-dev → loads github-code-review + test-driven-development
+ github-pr-workflow as one bundle.
- /research → loads several research skills together.
- Team task profiles shared via dotfiles.
Behavior:
- Bundles take precedence over individual skills when slugs collide.
- Missing skills are skipped with a note, not fatal.
- No system-prompt mutation — bundles generate a fresh user message
at invocation time, the same way /<skill> does. Prompt cache stays
intact.
- Works in CLI dispatch, gateway dispatch, autocomplete (CLI + TUI),
/help display.
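The behavior rules above can be sketched roughly as follows. This is an illustration only: build_bundle_message and resolve_slash are hypothetical names, the bundle is a plain dict shaped like the YAML schema, and the "## Skill:" message layout is an assumption, not the format used by agent/skill_bundles.py.

```python
def build_bundle_message(bundle, skills, user_text=""):
    """Combine every referenced skill into one user message.

    bundle: dict with 'skills' and optional 'instruction' (as in the YAML schema).
    skills: mapping of skill name -> skill body text (hypothetical stand-in
            for whatever the real skill loader returns).
    """
    parts, missing, seen = [], [], set()
    if bundle.get("instruction"):
        parts.append(bundle["instruction"].strip())
    for name in bundle.get("skills", []):
        if name in seen:  # load each skill at most once
            continue
        seen.add(name)
        body = skills.get(name)
        if body is None:  # missing skills are noted, never fatal
            missing.append(name)
            continue
        parts.append(f"## Skill: {name}\n{body.strip()}")
    if missing:
        parts.append("Note: skipped missing skills: " + ", ".join(missing))
    if user_text:
        parts.append(user_text.strip())
    return "\n\n".join(parts)


def resolve_slash(slug, bundles, skills):
    """Bundles win over individual skills when slugs collide."""
    if slug in bundles:
        return ("bundle", bundles[slug])
    if slug in skills:
        return ("skill", skills[slug])
    return None


skills = {"github-code-review": "Review PRs carefully.", "backend-dev": "solo skill"}
bundles = {"backend-dev": {"skills": ["github-code-review", "nope", "github-code-review"],
                           "instruction": "Backend feature work."}}
kind, bundle = resolve_slash("backend-dev", bundles, skills)
message = build_bundle_message(bundle, skills, "Add a /health endpoint")
```

Because the result is an ordinary user message, nothing in the system prompt changes, which is what keeps the prompt cache intact.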
Schema (~/.hermes/skill-bundles/<slug>.yaml):
name: backend-dev
description: Backend feature work.
skills:
- github-code-review
- test-driven-development
instruction: |
Optional extra guidance prepended to the loaded skills.
New module: agent/skill_bundles.py — load, scan, resolve, build
invocation message, save, delete. yaml.safe_load only; broken
bundles log a warning and are skipped, never raise.
New CLI subcommand: hermes bundles {list,show,create,delete,reload}.
Implementation in hermes_cli/bundles.py; wired in hermes_cli/main.py.
'bundles' added to _BUILTIN_SUBCOMMANDS so plugin discovery skips it.
New in-session slash command: /bundles lists installed bundles in
both CLI and gateway. /<bundle-name> dispatch added to CLI (cli.py)
and gateway (gateway/run.py) before the existing /<skill-name> path.
Autocomplete: SlashCommandCompleter gained an optional
skill_bundles_provider parameter that defaults to None — the prompt
shows '▣ <description> (N skills)' for bundles vs '⚡' for skills.
Tests:
- tests/agent/test_skill_bundles.py — 33 tests covering slugify,
scan/cache freshness, resolve (including underscore→hyphen
Telegram alias), build_bundle_invocation_message (loading, missing
skills, user/bundle instruction injection, dedup), save/delete,
reload diff, list sort.
- tests/hermes_cli/test_bundles.py — 8 tests for the CLI
subcommand (create/list/show/delete/reload, --force, missing
bundle errors).
- tests/gateway/test_bundles_command.py — 4 tests for the gateway
handler and bundle resolution priority.
Live E2E: verified subprocess invocations of hermes bundles
{list,create,show,reload,delete} round-trip correctly against an
isolated HERMES_HOME.
Docs:
- website/docs/user-guide/features/skills.md — new 'Skill Bundles'
section with quick example, YAML schema, management commands,
behavior notes.
- website/docs/reference/cli-commands.md — 'hermes bundles' added to
the top-level command table and given its own subcommand section.
fix: ESC cancels secret/sudo prompts, clearer skip messaging (#9902)
- Add ESC key binding (eager) for secret_state and sudo_state modal
prompts — fires immediately, same behavior as Ctrl+C cancel
- Update placeholder text: 'Enter to submit · ESC to skip' (was
'Enter to skip' which was confusing — Enter on empty looked like
submitting nothing rather than intentionally skipping)
- Update widget body text: 'ESC or Ctrl+C to skip'
- Change feedback message from 'Secret entry cancelled' to 'Secret
entry skipped' — more accurate for the action taken
- getpass fallback prompt also updated for non-TUI mode
fix(codex-runtime): de-dup [plugins.X] tables and stop leaking HERMES_HOME into config.toml
Builds on @steezkelly's Bug A fix (#25857, top-level default_permissions
via _insert_managed_block_at_top_level) by addressing the other two
config-corruption bugs described in #26250:
Bug B (duplicate [plugins.X] tables)
- Codex itself writes [plugins."<name>@<marketplace>"] tables to
config.toml when the user runs codex plugins enable directly,
before hermes-agent's managed block exists. On the next migrate run,
_query_codex_plugins() re-discovers the same plugins via plugin/list
and render_codex_toml_section() re-emits them inside the managed
block. Codex's strict TOML parser then rejects the duplicate table
header on startup.
- Add _strip_unmanaged_plugin_tables() that drops [plugins.*] tables
from the user-content portion of the file. Only run it when
plugin/list succeeded — if the RPC failed we can't re-emit and
must preserve the user's tables. plugin/list is the source of
truth when it answers.
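A rough sketch of the Bug B stripping step, under stated assumptions: strip_plugin_tables is a hypothetical stand-in for _strip_unmanaged_plugin_tables(), and it works line by line rather than with a TOML parser, so it does not handle multi-line strings or arrays whose lines look like table headers.

```python
import re

# Matches a TOML table header such as [plugins."tasks@openai-curated"].
_TABLE_HEADER = re.compile(r"^\s*\[\[?\s*([^\]]+?)\s*\]\]?\s*(#.*)?$")


def strip_plugin_tables(toml_text, plugin_list_ok=True):
    """Drop [plugins.*] tables from user content so the managed block can re-emit them.

    When plugin/list failed (plugin_list_ok=False) the text is returned
    untouched, because the tables could not be re-emitted.
    """
    if not plugin_list_ok:
        return toml_text
    kept, dropping = [], False
    for line in toml_text.splitlines():
        match = _TABLE_HEADER.match(line)
        if match:
            name = match.group(1)
            dropping = name == "plugins" or name.startswith("plugins.")
        if not dropping:
            kept.append(line)
    tail = "\n" if toml_text.endswith("\n") else ""
    return "\n".join(kept) + tail


user_toml = (
    "[mcp_servers.omx_team_run]\n"
    'command = "omx"\n'
    "\n"
    '[plugins."tasks@openai-curated"]\n'
    "enabled = true\n"
    "\n"
    "[other]\n"
    "x = 1\n"
)
cleaned = strip_plugin_tables(user_toml)
```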
Bug C (HERMES_HOME pytest-tempdir leak into ~/.codex/config.toml)
- _build_hermes_tools_mcp_entry() read HERMES_HOME directly from
os.environ, so a sibling pytest's monkeypatch.setenv("HERMES_HOME",
tmp_path) silently burned a transient pytest tempdir into the
user's real ~/.codex/config.toml. After pytest reaped the tempdir,
every codex-routed hermes-tools tool call failed silently.
- Derive HERMES_HOME from get_hermes_home() (the canonical resolver
that goes through the profile-aware path) and refuse to emit
obvious test-tempdir paths via _looks_like_test_tempdir() as
belt-and-suspenders for any other callsite that forgets to patch
migrate().
- test_enable_succeeds_when_codex_present in test_codex_runtime_switch.py
invoked the real migrate() (no mock), writing to Path.home() / .codex
using whatever HERMES_HOME the running pytest session had set. Add
the same migrate patch the other apply() tests already use, so the
suite stops touching the user's real ~/.codex/config.toml.
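The tempdir guard for Bug C might look something like the sketch below. The heuristic (any path component starting with "pytest-") is an assumption based on the repro path in this message; the real _looks_like_test_tempdir() may check other markers.

```python
from pathlib import PurePosixPath


def looks_like_test_tempdir(path):
    """Heuristic: pytest tempdirs contain components like pytest-of-<user>/pytest-<n>."""
    return any(part.startswith("pytest-") for part in PurePosixPath(path).parts)


def mcp_env_for(hermes_home):
    """Only emit HERMES_HOME when it does not look like a transient test path."""
    if looks_like_test_tempdir(hermes_home):
        return {}
    return {"HERMES_HOME": hermes_home}


leaky = "/private/var/folders/ab/T/pytest-of-alice/pytest-3/test_x0"
safe = "/home/alice/.hermes"
```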
E2E verification (replicating the issue's repro):
- Pre-state config.toml with user [mcp_servers.omx_team_run] +
codex-installed [plugins."tasks@openai-curated"],
HERMES_HOME="/private/var/folders/.../pytest-of-.../..."
- On origin/main: tomllib refuses to load the result with
"Cannot declare ('plugins', 'tasks@openai-curated') twice" AND
the pytest-tempdir HERMES_HOME is burned in.
- On this branch: file parses cleanly, default_permissions is
top-level, exactly one [plugins."tasks@openai-curated"] table
inside the managed block, no HERMES_HOME in the MCP env.
7 new regression tests covering all three bugs + the test-leak guard.
bash scripts/run_tests.sh tests/hermes_cli/test_codex_runtime_*.py —
95 passed, 0 failed.
Closes #26250
chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355)
Six days after #23937 (608 fixes) the codebase had accumulated 241 new
PLR6201 violations. Same mechanical x in (...) → x in {...} fix,
same zero-risk profile: set lookup is O(1) vs O(n) for tuple and the
two are semantically equivalent for hashable scalar membership tests.
All 241 instances fixed via `ruff check --select PLR6201 --fix
--unsafe-fixes`, zero remaining. Every changed value is a hashable
scalar (str/int/None/enum/signal); no risk of unhashable runtime
errors. No behavior change.
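To illustrate why the hashable-scalar check matters, here is a small sketch with made-up values: a tuple membership test compares by equality, while a set membership test hashes the probe value first, so an unhashable probe fails only in the set form.

```python
needle = ["a"]  # a list is unhashable

# Tuple form: falls back to equality comparison and simply returns False.
tuple_result = needle in ("a", "b")

# Set form: hashing the probe raises TypeError before any comparison happens.
try:
    needle in {"a", "b"}
    set_raised = False
except TypeError:
    set_raised = True
```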
Test plan:
- 119 files changed, +244/-244 (net zero) — exactly one-line edits
- ruff check clean afterward
- Compile checks pass on the largest touched files (cli.py, run_agent.py,
gateway/run.py, gateway/platforms/discord.py, model_tools.py)
- Subset broad test run on tests/gateway/ tests/hermes_cli/ tests/agent/
tests/tools/: 18187 passed, 59 pre-existing failures (verified against
origin/main with the same shape — identical failure count, identical
category — all xdist test-order flakes unrelated to this change)
Follows the same template as PR #23937 ([tracker: #23972](https://github.com/NousResearch/hermes-agent/issues/23972)).
feat: respect NO_COLOR env var and TERM=dumb (#4079)
Add should_use_color() function to hermes_cli/colors.py that checks
NO_COLOR (https://no-color.org/) and TERM=dumb before emitting ANSI
escapes. The existing color() helper now uses this function instead
of a bare isatty() check.
This is the foundation — cli.py and banner.py still have inline ANSI
constants that bypass this module (tracked in #4071).
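A minimal sketch of the check described above, not the actual hermes_cli/colors.py code. The stream and environ parameters are assumptions added here so the function can be exercised without touching the real environment.

```python
import os
import sys


def should_use_color(stream=None, environ=None):
    """Return True only when ANSI escapes are appropriate for this stream."""
    env = os.environ if environ is None else environ
    stream = sys.stdout if stream is None else stream
    if env.get("NO_COLOR"):  # set and non-empty disables color (no-color.org)
        return False
    if env.get("TERM") == "dumb":
        return False
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())


def color(text, code, stream=None, environ=None):
    """Wrap text in an ANSI sequence only when color is allowed."""
    if not should_use_color(stream, environ):
        return text
    return f"\033[{code}m{text}\033[0m"


class FakeTty:
    """Stand-in stream that always reports itself as a terminal."""

    def isatty(self):
        return True
```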
Closes #4066
Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>
feat(sessions): opt-in per-session JSON snapshot writer
PR #29182 deleted the per-session JSON snapshot writer outright because
state.db is canonical and the snapshots had no in-tree consumer. Some
users have external tooling that reads ~/.hermes/sessions/session_{sid}.json
directly, so reintroduce the writer behind a config flag that defaults
to off.
- Add sessions.write_json_snapshots (default False) to DEFAULT_CONFIG
- Restore AIAgent._save_session_log + _clean_session_content as
gated methods. When the flag is off the call is a fast no-op; when
on, the writer behaves as before (atomic write, truncation guard
preserved, REASONING_SCRATCHPAD → think tag normalization)
- Re-derive the target path from agent.session_id on each call so
/branch and /compress re-points happen automatically — no need
to restore the explicit re-point bookkeeping at call sites
- Wire the single call site in _persist_session (the cleanup-on-exit
hook). Did NOT restore the 7 intra-turn calls the original PR deleted
— those were redundant writes within the same turn that doubled disk
I/O without adding any persistence guarantee _persist_session does
not already provide
- Read the flag once at agent init via load_config(), cache as
agent._session_json_enabled
- Update TestNoSessionJsonSnapshot → TestSessionJsonSnapshotOptIn
to pin behavior: default off (no file), opt-in true (file written),
no-op method on default agents, logs_dir retained unconditionally
- Update CONTRIBUTING.md and the bundled hermes-agent skill to
document the flag and its default
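The gating and atomic-write behavior can be sketched as below. This is an assumption-laden illustration: save_session_snapshot is a hypothetical free function, the payload shape is invented, and the truncation guard and think-tag normalization are omitted.

```python
import json
import os
import tempfile
from pathlib import Path


def save_session_snapshot(enabled, logs_dir, session_id, messages):
    """Write session_<sid>.json atomically; a no-op when the flag is off."""
    if not enabled:
        return None
    logs_dir = Path(logs_dir)
    logs_dir.mkdir(parents=True, exist_ok=True)
    # The path is re-derived from session_id on every call, so a changed
    # session id is picked up without extra bookkeeping.
    target = logs_dir / f"session_{session_id}.json"
    fd, tmp = tempfile.mkstemp(dir=logs_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump({"session_id": session_id, "messages": messages}, fh)
        os.replace(tmp, target)  # atomic rename within the same directory
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    return target
```

Writing to a temporary file in the same directory and renaming it means readers never observe a half-written snapshot.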
fix: reset default SOUL.md to baseline identity text (#3159)
The default SOUL.md seeded for new users should match
DEFAULT_AGENT_IDENTITY — a short, neutral identity paragraph.
The elaborate voice spec (avoid lists, dialogue examples, symbol
conventions) was never intended as the default for all users.
Users who want a custom persona write their own SOUL.md.
feat(dep_ensure): complete Windows bootstrap — dep_ensure + install.ps1 + detection (#27845)
* feat(dep_ensure): complete Windows bootstrap — dep_ensure + install.ps1 + detection
dep_ensure.py gains Windows awareness: PowerShell invocation, platform-
specific browser detection, (path, shell) tuple returns.
install.ps1 gains -Ensure/-PostInstall modes using npm -g --prefix
(aligned with install.sh) and agent-browser install for Chromium.
browser_tool.py gains node/ in candidate dirs for Windows .cmd shims.
Both install scripts bundled in pip wheel.
Tracking: #27826
* fix(install.ps1): add --ignore-scripts to npm install for camofox
@askjo/camofox-browser has a dependency (impit) whose postinstall
script runs npx only-allow pnpm, which fails under npm. Adding
--ignore-scripts avoids the spurious failure without affecting
functionality.
Tracking: #27826
* fix: remove duplicate install scripts from git
CI already copies scripts/install.{sh,ps1} into hermes_cli/scripts/
during wheel build. No need to commit copies — .gitignore keeps them
out, _find_install_script() falls back to scripts/ for git-clone users.
Tracking: #27826
* fix: address review — remove env_extra, fix ps1 error handling
- Remove unused env_extra parameter from ensure_dependency()
- Invoke-EnsureMode node case now uses Test-Node consistently
- Install-AgentBrowser uses throw instead of exit 1
feat(kanban): add scheduled status for delayed follow-ups
Salvages #24533 by @roycepersonalassistant. Adds a first-class
'scheduled' Kanban status for time-delay follow-ups that aren't
waiting on human input.
- hermes kanban schedule <task_id> [reason] CLI command
- Dashboard/API transitions to/from Scheduled
- unblock_task() now releases both 'blocked' AND 'scheduled' tasks
(re-checking parent dependencies before moving to ready/todo)
- i18n + docs updates
Resolved conflicts: kept HEAD's failure-counter reset on unblock
alongside the PR's scheduled state, kept HEAD's 'running' direct-set
rejection, combined both bulk-status branches. Dropped the dist/
bundle changes (months-stale; would need rebuild from source).
fix(kanban): honor severity thresholds in diagnostics
Salvages #26431 by @LeonSGP43. Dashboard plugin_api list_diagnostics
was using exact-match (severity == filter), so '--severity warning'
hid 'error' and 'critical' diagnostics. Adds severity_at_or_above()
helper to kanban_diagnostics and uses it in the dashboard endpoint
(CLI already used SEVERITY_ORDER comparison correctly).
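A sketch of the threshold comparison, with assumed contents for SEVERITY_ORDER (the real mapping in kanban_diagnostics may differ in names or values):

```python
# Assumed ordering; only the relative ranks matter for the comparison.
SEVERITY_ORDER = {"info": 0, "warning": 1, "error": 2, "critical": 3}


def severity_at_or_above(severity, threshold):
    """True when severity ranks at or above threshold; unknown severities rank lowest."""
    return SEVERITY_ORDER.get(severity, -1) >= SEVERITY_ORDER.get(threshold, 0)


diagnostics = [
    {"id": 1, "severity": "info"},
    {"id": 2, "severity": "warning"},
    {"id": 3, "severity": "error"},
    {"id": 4, "severity": "critical"},
]

# The old exact-match filter hid everything except 'warning'.
exact = [d["id"] for d in diagnostics if d["severity"] == "warning"]
# The threshold filter keeps 'warning' and everything more severe.
at_or_above = [d["id"] for d in diagnostics
               if severity_at_or_above(d["severity"], "warning")]
```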
feat(cli): add kanban swarm topology helper
Salvages #26791 by @Niraven. Adds 'hermes kanban swarm' to create a
durable Kanban Swarm v1 graph: a completed root/blackboard card,
parallel worker cards, a verifier gated on all workers, and a
synthesizer gated on the verifier. Stores shared swarm blackboard
updates as structured JSON comments on the root card.
Self-contained: new hermes_cli/kanban_swarm.py module + CLI wiring +
unit tests.
feat: component-separated logging with session context and filtering (#7991)
* feat: component-separated logging with session context and filtering
Phase 1 — Gateway log isolation:
- gateway.log now only receives records from gateway.* loggers
(platform adapters, session management, slash commands, delivery)
- agent.log remains the catch-all (all components)
- errors.log remains WARNING+ catch-all
- Moved gateway.log handler creation from gateway/run.py into
hermes_logging.setup_logging(mode='gateway') with _ComponentFilter
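A minimal sketch of a prefix-based handler filter in the spirit of _ComponentFilter. The class name and constructor here are illustrative assumptions, not the real hermes_logging implementation.

```python
import logging


class ComponentFilter(logging.Filter):
    """Pass only records whose logger name falls under one of the given prefixes."""

    def __init__(self, prefixes):
        super().__init__()
        self._prefixes = tuple(prefixes)

    def filter(self, record):
        return any(
            record.name == prefix or record.name.startswith(prefix + ".")
            for prefix in self._prefixes
        )


def make_record(name):
    """Build a bare LogRecord for demonstration purposes."""
    return logging.LogRecord(name, logging.INFO, __file__, 1, "msg", (), None)


gateway_only = ComponentFilter(["gateway"])
```

Attaching such a filter to the gateway.log handler, and leaving the agent.log handler unfiltered, gives the split described in Phase 1.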
Phase 2 — Session ID injection:
- Added set_session_context(session_id) / clear_session_context() API
using threading.local() for per-thread session tracking
- _SessionFilter enriches every log record with session_tag attribute
- Log format: '2026-04-11 10:23:45 INFO [session_id] logger.name: msg'
- Session context set at start of run_conversation() in run_agent.py
- Thread-isolated: gateway conversations on different threads don't leak
Phase 3 — Component filtering in hermes logs:
- Added --component flag: hermes logs --component gateway|agent|tools|cli|cron
- COMPONENT_PREFIXES maps component names to logger name prefixes
- Works with all existing filters (--level, --session, --since, -f)
- Logger name extraction handles both old and new log formats
Files changed:
- hermes_logging.py: _SessionFilter, _ComponentFilter, COMPONENT_PREFIXES,
set/clear_session_context(), gateway.log creation in setup_logging()
- gateway/run.py: removed redundant gateway.log handler (now in hermes_logging)
- run_agent.py: set_session_context() at start of run_conversation()
- hermes_cli/logs.py: --component filter, logger name extraction
- hermes_cli/main.py: --component argument on logs subparser
Addresses community request for component-separated, filterable logging.
Zero changes to existing logger names — __name__ already provides hierarchy.
* fix: use LogRecord factory instead of per-handler _SessionFilter
The _SessionFilter approach required attaching a filter to every handler
we create. Any handler created outside our _add_rotating_handler (like
the gateway stderr handler, or third-party handlers) would crash with
KeyError: 'session_tag' if it used our format string.
Replace with logging.setLogRecordFactory() which injects session_tag
into every LogRecord at creation time — process-global, zero per-handler
wiring needed. The factory is installed at import time (before
setup_logging) so session_tag is available from the moment hermes_logging
is imported.
- Idempotent: marker attribute prevents double-wrapping on module reload
- Chains with existing factory: won't break third-party record factories
- Removes _SessionFilter from _add_rotating_handler and setup_verbose_logging
- Adds tests: record factory injection, idempotency, arbitrary handler compat
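The record-factory approach can be sketched as follows. The marker attribute name and the exact tag format are assumptions; logging.getLogRecordFactory() and logging.setLogRecordFactory() are standard library calls.

```python
import logging
import threading

_context = threading.local()


def set_session_context(session_id):
    _context.session_id = session_id


def clear_session_context():
    _context.session_id = None


def install_session_record_factory():
    """Wrap the current factory so every LogRecord carries session_tag."""
    current = logging.getLogRecordFactory()
    if getattr(current, "_injects_session_tag", False):
        return  # already wrapped: stay idempotent across reloads

    def factory(*args, **kwargs):
        record = current(*args, **kwargs)  # chain to the existing factory
        session_id = getattr(_context, "session_id", None)
        record.session_tag = f" [{session_id}]" if session_id else ""
        return record

    factory._injects_session_tag = True  # hypothetical marker attribute
    logging.setLogRecordFactory(factory)


install_session_record_factory()
first = logging.getLogRecordFactory()
install_session_record_factory()
second = logging.getLogRecordFactory()
```

Because the attribute is set when the record is created, any handler can use %(session_tag)s in its format string without a per-handler filter.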
fix(opencode-go): keep users on opencode-go instead of hijacking to native providers (#20802)
OpenCode Go and OpenCode Zen are flat-namespace model resellers — their
/v1/models returns bare IDs (deepseek-v4-flash, minimax-m2.7), and the
inference API rejects vendor-prefixed names with HTTP 401 'Model not
supported'. Two bugs fixed:
1. switch_model in hermes_cli/model_switch.py was silently switching the
user off opencode-go to native deepseek when they typed
/model deepseek-v4-flash. Step d found the model in opencode-go's live
catalog, but step e (detect_provider_for_model) still ran and matched
the bare name against deepseek's static catalog. Fix: track whether
the live catalog resolved it; skip step e when it did.
2. normalize_model_for_provider in hermes_cli/model_normalize.py only
stripped the exact opencode-zen/ prefix, leaving arbitrary vendor
prefixes like minimax/minimax-m2.7 (commonly copied from aggregator
slugs into fallback_model configs) intact — causing HTTP 401s when
the fallback chain activated. Fix: opencode-go/opencode-zen strip ANY
leading vendor prefix because their APIs are flat-namespace.
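Both fixes can be sketched in a few lines. The function names, catalog shapes, and the choice to keep only the text after the last slash are assumptions for illustration, not the actual code in model_switch.py or model_normalize.py.

```python
FLAT_NAMESPACE_PROVIDERS = {"opencode-go", "opencode-zen"}


def normalize_model_for_flat_provider(model, provider):
    """Strip any leading vendor prefix for flat-namespace resellers."""
    if provider in FLAT_NAMESPACE_PROVIDERS and "/" in model:
        return model.rsplit("/", 1)[-1]
    return model


def pick_provider(model, current_provider, live_catalog, static_catalogs):
    """Stay on the current provider when its live catalog already has the model."""
    if model in live_catalog:
        return current_provider  # resolved live: skip cross-provider detection
    for provider, models in static_catalogs.items():
        if model in models:
            return provider
    return current_provider


live = {"deepseek-v4-flash", "minimax-m2.7"}
static = {"deepseek": {"deepseek-v4-flash"}, "minimax": {"minimax-m2.7"}}
chosen = pick_provider("deepseek-v4-flash", "opencode-go", live, static)
```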
Tests: 11 new cases in tests/hermes_cli/test_opencode_go_flat_namespace.py
covering both normalization (prefix stripping, regression guards for
opencode-zen Claude hyphenation and openrouter vendor-prepending) and
switch_model (bare-name resolution on opencode-go's live catalog must
not trigger cross-provider hijack).
Reported by @Ufonik via Discord; Kimi K2.6 always worked because moonshotai
has no overlapping entry in a native provider's static catalog. Deepseek
and minimax failed because their v4/v2.7 names existed in the native
deepseek/minimax catalogs.
fix: detect gh-copilot deprecation and improve GitHub Models 413 errors (#10648)
Address two blocking issues when using GitHub Copilot integrations:
1. ACP mode: detect the gh-copilot CLI deprecation error from stderr
and surface an actionable message with alternatives instead of
hanging or showing a cryptic error.
2. GitHub Models (Azure) 413: recognize models.inference.ai.azure.com
as a known GitHub Models URL, and print a targeted hint explaining
the hard 8K token limit that makes this endpoint incompatible with
Hermes' system prompt size.
fix(oneshot): pass fallback_providers from profile config to AIAgent
Salvages #23368 by @uzunkuyruk. Oneshot workers (e.g. kanban workers
spawned via 'hermes -p <profile> chat -q ...') were not honouring the
profile's fallback_providers / fallback_model chain because oneshot.py
never read the config and never passed fallback_model= to AIAgent.
Reads cfg.get('fallback_providers') (new list format) or
cfg.get('fallback_model') (legacy single-dict) with the same
normalization cli.py applies, then forwards as fallback_model=_fb.
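A hedged sketch of the config read. The exact normalization in cli.py is not shown in this message, so the rule below (prefer the list form, wrap a legacy dict in a list, drop non-dict entries) is an assumption, and resolve_fallback_chain is a hypothetical name.

```python
def resolve_fallback_chain(cfg):
    """Return the fallback chain as a list of dicts, or [] when none is configured."""
    fallback = cfg.get("fallback_providers") or cfg.get("fallback_model")
    if not fallback:
        return []
    if isinstance(fallback, dict):  # legacy single-dict format
        return [fallback]
    return [entry for entry in fallback if isinstance(entry, dict)]


new_style = {"fallback_providers": [{"provider": "a", "model": "m1"},
                                    {"provider": "b", "model": "m2"}]}
legacy = {"fallback_model": {"provider": "a", "model": "m1"}}
```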
fix(pairing): enforce lockout on approve_code, not just generate_code (#10195) (#21325)
PairingStore.approve_code() didn't consult _is_locked_out(), so after
MAX_FAILED_ATTEMPTS bad approvals the lockout flag was set but a valid
code still got accepted — any pending code (legitimately issued or
attacker-obtained) could be approved during the 1-hour lockout window,
nullifying the brute-force protection.
- gateway/pairing.py: lockout check runs in approve_code() right after
_cleanup_expired, before the pending lookup. Returns None on lockout.
- tests/gateway/test_pairing.py: test_lockout_blocks_code_approval pins
the regression — reporter's exact reproducer (generate valid code,
exhaust attempts with WRONGCODE, try to approve valid code) must
return None and leave is_approved == False. Also pins recovery: once
lockout expires, the still-pending code approves normally.
- hermes_cli/pairing.py: _cmd_approve distinguishes the two None cases.
On lockout, prints 'Platform locked out... clears in N minutes. To
reset sooner, delete the _lockout:<platform> entry from
_rate_limits.json' instead of the misleading 'Code not found or
expired' message. 29/29 pairing tests pass; E2E-verified with
reporter's exact Python reproducer.
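The regression can be illustrated with a toy store. Everything below is a simplified stand-in for PairingStore: the attempt limit of 5 is an assumed value, expiry cleanup and persistence are omitted, and the injectable clock exists only to make the example testable.

```python
import time

MAX_FAILED_ATTEMPTS = 5   # assumed value for illustration
LOCKOUT_SECONDS = 3600    # the 1-hour window described above


class PairingStoreSketch:
    def __init__(self, clock=time.time):
        self._clock = clock
        self._pending = set()
        self._approved = set()
        self._failed = 0
        self._locked_until = 0.0

    def add_pending(self, code):
        self._pending.add(code)

    def _is_locked_out(self):
        return self._clock() < self._locked_until

    def is_approved(self, code):
        return code in self._approved

    def approve_code(self, code):
        # The fix: consult the lockout before looking at pending codes.
        if self._is_locked_out():
            return None
        if code not in self._pending:
            self._failed += 1
            if self._failed >= MAX_FAILED_ATTEMPTS:
                self._locked_until = self._clock() + LOCKOUT_SECONDS
                self._failed = 0
            return None
        self._pending.discard(code)
        self._approved.add(code)
        return code
```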
feat(browser): add BrowserProvider ABC mirroring web_search_provider template
Foundation commit for the browser-provider plugin migration (#25214).
Mirrors the architecture established by PR #25182 (web providers):
- agent/browser_provider.py — BrowserProvider ABC. Preserves the legacy
CloudBrowserProvider lifecycle contract bit-for-bit (create_session,
close_session, emergency_cleanup, session metadata shape) so the
dispatcher in tools/browser_tool.py becomes a pure registry lookup.
Renames is_configured() → is_available() for parity with WebSearchProvider.
- agent/browser_registry.py — selection registry with the same
three-rule resolution as web_search_registry:
1. Explicit config wins (returns even if is_available() == False so
the dispatcher surfaces a precise credentials error)
2. Single-eligible shortcut
3. Legacy preference walk: browser-use → browserbase, filtered by
availability. Firecrawl is intentionally NOT in the legacy walk
(matches pre-migration behaviour — Firecrawl was only reachable
via explicit browser.cloud_provider: firecrawl).
- hermes_cli/plugins.py — adds ctx.register_browser_provider() facade,
one-liner mirror of register_web_search_provider().
No plugins registered yet; no dispatcher cutover yet. The next commits
move browserbase/browser-use/firecrawl into plugins/browser/<vendor>/
and switch tools/browser_tool.py over to the registry.
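The three-rule resolution might be sketched like this. The provider class and resolve_provider function are hypothetical, and reading "eligible" as "is_available() returns True" is an assumption about the single-eligible shortcut.

```python
LEGACY_PREFERENCE = ("browser-use", "browserbase")  # Firecrawl deliberately absent


class FakeProvider:
    """Hypothetical stand-in for a BrowserProvider implementation."""

    def __init__(self, name, available):
        self.name = name
        self._available = available

    def is_available(self):
        return self._available


def resolve_provider(providers, configured=None):
    """providers: mapping of name -> provider; configured: explicit config value."""
    # Rule 1: explicit config wins, even if unavailable, so the caller can
    # report a precise credentials error.
    if configured and configured in providers:
        return providers[configured]
    # Rule 2: exactly one available provider.
    eligible = [p for p in providers.values() if p.is_available()]
    if len(eligible) == 1:
        return eligible[0]
    # Rule 3: legacy preference walk, filtered by availability.
    for name in LEGACY_PREFERENCE:
        provider = providers.get(name)
        if provider is not None and provider.is_available():
            return provider
    return None


registry = {
    "browser-use": FakeProvider("browser-use", True),
    "browserbase": FakeProvider("browserbase", True),
    "firecrawl": FakeProvider("firecrawl", False),
}
```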
feat(kanban): orchestrator-driven auto-decomposition on triage (#27572)
* feat(kanban): orchestrator-driven auto-decomposition on triage
Closes the core gap in the kanban system: dropping a one-liner into Triage
now decomposes it into a graph of child tasks routed to specialist
profiles by description, matching teknium's original vision ("main
orchestrator splits/creates actual tasks, doles them out to each agent").
The build
---------
- hermes_cli/profiles.py: new description + description_auto fields
on ProfileInfo, persisted in <profile_dir>/profile.yaml. Helpers
read_profile_meta / write_profile_meta. create_profile accepts
optional description.
- hermes_cli/profile_describer.py: new module — auto-generate a 1-2
sentence description from a profile's skills + model + name via the
auxiliary LLM (auxiliary.profile_describer).
- hermes_cli/main.py: new hermes profile create --description ...
flag; new `hermes profile describe [name] [--text ... | --auto |
--all --auto]` subcommand.
- hermes_cli/kanban_db.py: new decompose_triage_task atomic helper —
creates N child tasks, links the root as a child of every leaf
(root waits for the whole graph), flips root triage -> todo with
orchestrator assignee, records an audit comment + decomposed event
in a single write_txn.
- hermes_cli/kanban_decompose.py: new module — calls the auxiliary LLM
(auxiliary.kanban_decomposer) with the profile roster + descriptions
to produce a JSON task graph, then invokes the DB helper. Rewrites
unknown assignees to the configured kanban.default_assignee (or
the active default profile) so a task NEVER lands with assignee=None.
Falls back to specify-style single-task promotion when the LLM
returns fanout: false.
- hermes_cli/kanban.py: new hermes kanban decompose [task_id | --all]
CLI verb.
- hermes_cli/config.py: new DEFAULT_CONFIG keys —
kanban.orchestrator_profile, kanban.default_assignee,
kanban.auto_decompose (default True), kanban.auto_decompose_per_tick
(default 3), auxiliary.kanban_decomposer, auxiliary.profile_describer.
- gateway/run.py: kanban dispatcher watcher now runs auto-decompose
before each _tick_once, capped by auto_decompose_per_tick so a
bulk-load of triage tasks doesn't burst-spend the aux LLM.
- plugins/kanban/dashboard/plugin_api.py: new endpoints —
GET /profiles (list roster + descriptions),
PATCH /profiles/<name> (set description, user-authored),
POST /profiles/<name>/describe-auto (LLM-generate),
POST /tasks/<id>/decompose (run decomposer),
GET/PUT /orchestration (orchestrator/default-assignee/auto-decompose
pickers, with resolved fallbacks echoed back).
- plugins/kanban/dashboard/dist/index.js: new OrchestrationPanel
collapsible — dropdowns for orchestrator profile and default
assignee, auto-decompose toggle, per-profile description editor with
Save and Auto-generate buttons. New ⚗ Decompose button next to
✨ Specify on triage-column task drawers.
Behavior
--------
- A task in Triage gets fanned out into a small DAG of child tasks.
Children with no internal parents flip to ready immediately
(parallel dispatch). Children with sibling parents wait. The root
stays alive as a parent of every child — when the whole graph
finishes, it promotes to ready and the orchestrator profile wakes
back up to judge completion (the "adds more tasks until done" part
of the original vision).
- kanban.orchestrator_profile unset -> falls back to the default
profile (whichever hermes launches with no -p flag).
- kanban.default_assignee unset -> same fallback. Tasks NEVER end
up unassigned.
- kanban.auto_decompose=true (default) runs the decomposer
automatically on dispatcher ticks; manual hermes kanban decompose
is always available.
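A rough sketch of the planning step behind this behavior. It is not the decompose_triage_task helper: the function name, dict shapes, and the use of "todo" for waiting children are assumptions, and the real helper performs all of this inside a single database transaction.

```python
def plan_decomposition(root_id, children, known_profiles, default_assignee):
    """Turn an LLM-proposed task graph into statuses and dependency edges.

    children: list of dicts with 'id', 'assignee', and optional 'parents'
              (ids of sibling tasks that must finish first).
    """
    planned = []
    for child in children:
        assignee = child.get("assignee")
        if assignee not in known_profiles:
            assignee = default_assignee  # a task never ends up unassigned
        parents = list(child.get("parents", []))
        planned.append({
            "id": child["id"],
            "assignee": assignee,
            "parents": parents,
            # No sibling parents: dispatchable right away, in parallel.
            "status": "ready" if not parents else "todo",
        })
    # The root waits on every child, so it wakes only when the graph is done.
    root = {"id": root_id, "status": "todo",
            "parents": [task["id"] for task in planned]}
    return root, planned


root, tasks = plan_decomposition(
    "t0",
    [
        {"id": "t1", "assignee": "researcher"},
        {"id": "t2", "assignee": "ghost-profile"},
        {"id": "t3", "assignee": "writer", "parents": ["t1", "t2"]},
    ],
    known_profiles={"researcher", "writer", "default"},
    default_assignee="default",
)
```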
Tests
-----
- tests/hermes_cli/test_kanban_decompose_db.py — 7 tests for the
atomic DB helper (status transitions, dep graph, audit trail,
validation errors).
- tests/hermes_cli/test_kanban_decompose.py — 6 tests for the
decomposer module (fanout, no-fanout fallback, unknown-assignee
rewrite, malformed-JSON resilience, no-aux-client path).
- tests/hermes_cli/test_profile_describer.py — 10 tests for
profile.yaml r/w + the LLM auto-describer (yaml corrupt tolerance,
user-vs-auto description protection, --overwrite, fallback parsing).
E2E
---
- CLI end-to-end: created profiles with descriptions, dropped a triage
task, mocked the aux LLM with a 3-task graph -> verified all three
children were created with the right assignees, the dependency
edges matched the LLM's graph, root flipped to todo gated by every
child, audit comment + decomposed event recorded.
- Dashboard end-to-end: started the dashboard against an isolated
HERMES_HOME, verified all four new endpoints via curl (profile
listing, PATCH for description, PUT for orchestration settings,
POST for decompose). Opened the UI in the browser, confirmed the
OrchestrationPanel renders with all three pickers + the per-profile
description editor, typed a description, clicked Save, verified
~/.hermes/profile.yaml was written. Clicked Decompose on the triage
card and confirmed the inline error message surfaced as designed
("no auxiliary client configured").
* feat(kanban): surface decompose mode (Auto/Manual) as a one-click pill
The auto/manual toggle already existed as kanban.auto_decompose (default
true), but it was buried inside the collapsed Orchestration settings
panel — users couldn't tell at a glance which mode they were in. This
hoists it to a pill at the top of the kanban page so the state is always
visible and one click flips it.
UX
- New "⚗ Decompose: AUTO|MANUAL" pill in the kanban header. Emerald
styling when Auto is on (the default), muted/gray when Manual.
- Pill is visible both in the collapsed AND expanded Orchestration
settings views so context is preserved when the user opens the panel.
- Tooltip explains both states + what clicking does.
- Renamed the in-panel "Auto-decompose on triage / Enabled" checkbox
to "Decompose mode / Auto (default) | Manual" for language parity
with the pill.
Behavior preserved
- Default remains Auto (kanban.auto_decompose=true).
- Manual mode restores pre-PR behavior: triage tasks stay in triage
until the user clicks ⚗ Decompose on each card (or runs
hermes kanban decompose <id>).
Implementation
- plugins/kanban/dashboard/dist/index.js: load /orchestration on mount
(not just on expand) so the collapsed pill reflects real state.
Render mode pill in both collapsed and expanded headers. Reuses the
existing PUT /api/plugins/kanban/orchestration endpoint — no new
backend, no new tests required.
E2E verified
- Pill renders as "⚗ Decompose: AUTO" on page load (default).
- One click flips to "⚗ Decompose: MANUAL" with muted styling.
- config.yaml on disk shows auto_decompose: false after the flip.
- Second click round-trips back to Auto; config.yaml flips to true.
* feat(kanban): rename mode pill to "Orchestration: Auto/Manual"
Per Teknium feedback — "Decompose" was too implementation-specific.
"Orchestration" is the user-facing concept (the whole pitch is the
orchestrator profile routing work), and the pill is the front door to it.
- Pill text: "Orchestration: Auto" / "Orchestration: Manual" (title case,
no ⚗ prefix, no SHOUTY-CAPS for the mode value)
- In-panel checkbox label: "Orchestration mode" (was "Decompose mode")
- Tooltips updated to match
- No behavior change
* docs(kanban): document decompose, profile descriptions, orchestration mode
Brings the docs site up to parity with the PR. English build verified
locally (npx docusaurus build --locale en) — clean, no new broken links
or anchors. Pre-existing broken-link warnings (rl-training, llms.txt,
step-by-step-checklist, fallback-model) untouched.
- website/docs/reference/cli-commands.md
+ hermes kanban decompose action row in the action table, with
pointer to the Auto vs Manual orchestration section.
- website/docs/reference/profile-commands.md
+ --description "<text>" flag on hermes profile create.
+ Full hermes profile describe section: read, --text, --auto,
--overwrite, --all flags with examples.
- website/docs/user-guide/features/kanban.md (the big one)
+ Triage column intro rewritten around the Auto-decompose default
behavior, with pointer to the new Auto vs Manual section.
+ Status action row updated to mention both ⚗ Decompose and
✨ Specify on triage cards.
+ New "Auto vs Manual orchestration" section explaining the two
modes, how to flip them (pill, config), how routing-by-description
works, the no-None-assignee guarantee, plus a config knob table
(auto_decompose, auto_decompose_per_tick, orchestrator_profile,
default_assignee) and the two new auxiliary slots
(kanban_decomposer, profile_describer).
+ REST surface table gains 6 new endpoint rows: /tasks/:id/decompose,
/profiles (GET), /profiles/:name (PATCH), /profiles/:name/describe-auto,
/orchestration (GET + PUT).
- website/docs/user-guide/features/kanban-tutorial.md
+ Triage column blurb updated for Auto by default + Manual via the
pill, with cross-link to the Auto vs Manual orchestration section.
- website/docs/user-guide/profiles.md
+ Blank-profile flow now mentions --description and points to the
kanban routing model for context.
- website/docs/user-guide/configuration.md
+ kanban_decomposer and profile_describer added to the
hermes model -> Configure auxiliary models menu listing.
feat(profile): shareable profile distributions via git (#20831)
* feat(profile): shareable profile distributions (pack/install/update/info)
Closes #20456.
Turns a profile into a portable, versioned artifact. Packs SOUL.md, config,
skills, cron, and an env-var manifest into a tar.gz that others can install
from a local path, URL, or git repo. Updates re-pull the distribution while
preserving user data (memories, sessions, auth.json, .env) and the user's
config.yaml overrides.
New subcommands (under hermes profile, no parallel tree):
hermes profile pack <name> [-o FILE]
hermes profile install <source> [--name N] [--alias] [--force] [-y]
hermes profile update <name> [--force-config] [-y]
hermes profile info <name>
Manifest (distribution.yaml at the profile root): name, version,
hermes_requires, author, env_requires, distribution_owned.
Security:
- Installer shows manifest + env-var requirements before mutating disk;
confirmation required unless -y.
- auth.json and .env are never packed (same exclude set as profile export).
- Cron jobs are packed but NOT auto-scheduled — user is pointed at
'hermes -p <name> cron list' to review.
- Archive extraction rejects path traversal (../ members).
- Alias creation is opt-in via --alias.
Update semantics:
- Distribution-owned paths (SOUL.md, skills/, cron/, mcp.json, manifest):
replaced from the new archive.
- config.yaml: preserved by default; --force-config to overwrite.
- User-owned paths (memories/, sessions/, auth.json, .env, state.db*,
logs/, workspace/, plans/, home/, *_cache/, local/): never touched.
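The update rules above can be sketched as a small classifier. This is a minimal illustration, not the shipped code: the path lists are copied from this message, while the function name `classify`, the top-level-component matching, and the handling of unknown paths are assumptions.

```python
from fnmatch import fnmatch

# Path lists copied from the commit message; the matching logic is illustrative.
DISTRIBUTION_OWNED = ("SOUL.md", "skills", "cron", "mcp.json", "distribution.yaml")
USER_OWNED = (
    "memories", "sessions", "auth.json", ".env", "state.db*",
    "logs", "workspace", "plans", "home", "*_cache", "local",
)


def classify(relative_path: str, force_config: bool = False) -> str:
    """Return 'replace' or 'keep' for a path relative to the profile root."""
    # Only the first path component decides ownership in this sketch.
    top = relative_path.replace("\\", "/").split("/", 1)[0]
    if any(fnmatch(top, pattern) for pattern in USER_OWNED):
        return "keep"
    if top == "config.yaml":
        # Preserved by default; --force-config opts in to overwriting.
        return "replace" if force_config else "keep"
    if top in DISTRIBUTION_OWNED:
        return "replace"
    return "keep"  # Assumption: unknown paths are left alone.
```

Checking user-owned patterns first means a glob such as `state.db*` also protects sidecar files like `state.db-wal`.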
Version pin:
hermes_requires accepts >=, <=, ==, !=, >, < or a bare version (treated
as >=). Install fails with a clear error when the running Hermes version
doesn't satisfy the spec.
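A minimal sketch of how such a spec could be evaluated, assuming plain dotted numeric versions (no pre-release suffixes). The helper name `satisfies` and the zero-padding behaviour are assumptions for illustration, not the module's actual API.

```python
import re

# Operators listed in the commit message; a bare version is treated as ">=".
_SPEC_RE = re.compile(r"^\s*(>=|<=|==|!=|>|<)?\s*(\d+(?:\.\d+)*)\s*$")


def _parse(version: str) -> tuple[int, ...]:
    """Turn '1.2.3' into (1, 2, 3) so tuples compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies(spec: str, running: str) -> bool:
    """Return True when the running version meets a hermes_requires spec."""
    match = _SPEC_RE.match(spec)
    if match is None:
        raise ValueError(f"Unsupported hermes_requires spec: {spec!r}")
    op = match.group(1) or ">="
    want, have = _parse(match.group(2)), _parse(running)
    # Pad the shorter tuple with zeros so '1.2' and '1.2.0' compare equal.
    width = max(len(want), len(have))
    want += (0,) * (width - len(want))
    have += (0,) * (width - len(have))
    return {
        ">=": have >= want,
        "<=": have <= want,
        "==": have == want,
        "!=": have != want,
        ">": have > want,
        "<": have < want,
    }[op]
```

Comparing integer tuples rather than strings avoids the classic pitfall where "0.10" sorts before "0.9".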
Sources supported by 'install':
- Local .tar.gz / .tgz archive
- Local directory
- HTTP(S) URL pointing to a .tar.gz (uses httpx, already a dep)
- Git URL (github.com/user/repo, https://..., git@..., ssh://, git://)
Tests: 43 new unit tests (manifest parsing, version checks, env template,
pack/install/update round-trip, config-preservation, security).
E2E validated via real CLI invocations against an isolated HERMES_HOME
covering pack, install with confirmation, update preservation, update
--force-config, decline-preview, duplicate-install rejection, and
version-requirement rejection.
* refactor(profile-dist): git-only — drop tar.gz/HTTP transports and pack
Scope-cut on top of the original distribution PR: a profile distribution
is now exclusively a git repository (or a local directory during
development). The tar.gz / HTTP archive transports and the matching
hermes profile pack subcommand have been removed.
Why:
* GitHub tags, branches, and commits are already the right versioning
primitive. Tag pushes do for us what 'pack + upload' did.
* hermes profile export / import already cover local backup and
restore; they are not a distribution format and stay untouched.
* One transport means one install/update code path, one doc page,
and one mental model. The extra source types doubled the surface
for no real user win — GitHub auto-attaches release tarballs, and
git bundle / git clone --mirror cover the airgap case.
Changes:
* hermes_cli/profile_distribution.py — removed pack_profile,
_fetch_tar_archive (_http_fetch), _safe_extract, _archive_roots,
_safe_parts, _find_dist_root, tarfile/io/urlparse imports. The
new _stage_source has two arms: git URL → clone, local directory
→ use in place.
* hermes_cli/main.py — removed the 'pack' subparser and action
handler. Install help text updated to match the reduced source list.
* tests/hermes_cli/test_profile_distribution.py — rewritten around a
local-directory staging fixture. The install/update/describe suites
now build a distribution tree on disk directly and install from it,
which is what a real git clone produces after .git is stripped.
Dropped TestPack, TestFindDistRoot, and the tar-specific security
test. New tests cover _looks_like_git_url, env_example emission,
hermes_requires enforcement, and 'installer does not import
credentials if an author mistakenly leaks them in the staging tree'.
* website/docs/reference/profile-commands.md — 'Distribution commands'
section rewritten around git. Added a 'Publishing a distribution'
section. export/import stay documented as local backup/restore.
* website/docs/reference/cli-commands.md — dropped 'pack' from the
profile subcommand table.
* website/package.json — 'lint:diagrams' now passes
--exclude-code-blocks to ascii-guard. Without it, markdown tables
and box-drawing diagrams inside fenced code blocks were being
misidentified as malformed ASCII boxes, blocking the PR's
docs-site-checks CI with 8 false-positive errors.
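The two-arm staging decision described for _stage_source can be sketched roughly as follows. The prefix list mirrors the git URL forms named earlier in this PR (github.com/user/repo, https://, git@, ssh://, git://); the function names and the exact matching rules here are assumptions, not the real helper.

```python
from pathlib import Path

# URL forms named in the PR description; the exact list is an assumption.
_GIT_PREFIXES = ("https://", "ssh://", "git://", "git@")


def looks_like_git_url(source: str) -> bool:
    """Heuristically decide whether a source string should be git-cloned."""
    if source.startswith(_GIT_PREFIXES):
        return True
    # Shorthand such as github.com/user/repo (host plus two path segments).
    return source.startswith("github.com/") and source.count("/") >= 2


def stage_kind(source: str) -> str:
    """Return 'clone' for git URLs and 'local' for existing directories."""
    if looks_like_git_url(source):
        return "clone"
    if Path(source).expanduser().is_dir():
        return "local"
    raise ValueError(f"Not a git URL or local directory: {source!r}")
```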
Validation:
* Targeted suite: tests/hermes_cli/test_profile_distribution.py —
56/56 pass (up from 43; reorganized to cover the new
local-dir paths).
* Regression: test_profiles.py + test_profile_export_credentials.py
102/102 still pass. export/import behaviour unchanged.
* Docs lint: ascii-guard lint --exclude-code-blocks docs returns
0 errors (was 8 on the PR before the flag bump).
* E2E: ran the real hermes profile install/info against a
local staging dir under an isolated HERMES_HOME — install writes
SOUL.md + skills to the target profile, info reads the manifest
back, a bogus source produces a clear error, and `hermes profile
pack` is now rejected by argparse as expected.
* feat(profile-dist): distribution-aware list/show/delete + installed_at + env preview
Polish pass on top of the git-only scope cut. Five additions, all small,
wiring into existing commands rather than adding new surface.
1. installed_at timestamp on the manifest
* Stamped automatically inside plan_install() on both fresh install
and update — ISO-8601 UTC, seconds resolution.
* Surfaced in hermes profile info as Installed: <ts>.
* Lets users tell "installed 6 months ago, needs update" from
"installed yesterday" without guessing from file mtimes.
2. hermes profile list grows a Distribution column
* Plain profiles: "—"
* Distribution profiles: "<name>@<version>" (e.g. telemetry@1.2.3)
* ProfileInfo gains three optional fields — distribution_name,
distribution_version, distribution_source — populated by a new
_read_distribution_meta() helper that swallows manifest read errors
so a broken distribution.yaml in one profile can't break list
for the others.
3. hermes profile show and hermes profile delete surface
distribution provenance
* show: Distribution: name@version + Installed from: <source>
plus a pointer to hermes profile info <name> for the full
manifest.
* delete: same lines in the pre-confirmation preview, so a user
deleting "telemetry" can see it came from
github.com/kyle/telemetry-distribution before they type
telemetry to confirm. No change to the confirmation gate itself —
deletion semantics are identical to plain profiles.
4. Install preview checks env vars against the current environment
* Replaces the "Env vars you'll need to set:" header with a simpler
"Env vars:" block.
* Each required var is labeled:
- ✓ set — already in os.environ OR present as a key in the
target profile's existing .env (update case).
- needs setting — required but not found in either place.
- — — optional.
* Mirrors pip's "Requirement already satisfied" UX: no unnecessary
nagging about keys the user already has configured.
5. Docs: private distributions
* New "Private distributions" section in
website/docs/reference/profile-commands.md explaining that we
shell out to the user's git binary, so SSH keys / credential
helpers / GitHub CLI stored creds all work transparently. One
paragraph, two examples.
* hermes profile info section updated to mention Installed:.
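The env-var labelling in item 4 can be sketched like this. It is a minimal illustration under stated assumptions: the helper names, the simple dotenv parsing, and the plain-text label strings are hypothetical, and the real preview renders its own symbols.

```python
import os
from pathlib import Path


def env_keys(env_file: Path) -> set[str]:
    """Collect variable names from a dotenv-style file, ignoring comments."""
    if not env_file.is_file():
        return set()
    keys = set()
    for line in env_file.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            keys.add(line.split("=", 1)[0].removeprefix("export ").strip())
    return keys


def label_env_var(name: str, required: bool, profile_env: Path, environ=None) -> str:
    """Label one manifest env var for the install preview."""
    environ = os.environ if environ is None else environ
    if not required:
        return "optional"
    # "Set" means present in the process environment OR in the target
    # profile's existing .env (the update case).
    if name in environ or name in env_keys(profile_env):
        return "set"
    return "needs setting"
```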
Module-level hoist:
* from datetime import datetime, timezone was previously lazy-imported
inside plan_install(). Hoisted to module scope so tests can monkeypatch
hermes_cli.profile_distribution.datetime to freeze time.
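For reference, a stamp in the format described in item 1 (ISO-8601, UTC, seconds resolution) can be produced as below. The function name and the injectable `now` parameter are assumptions added to make the sketch testable; the real code stamps inside plan_install().

```python
from datetime import datetime, timezone


def installed_at_stamp(now=None) -> str:
    """Return an ISO-8601 UTC timestamp truncated to whole seconds."""
    now = datetime.now(timezone.utc) if now is None else now.astimezone(timezone.utc)
    # Dropping microseconds keeps the stamp at seconds resolution;
    # isoformat() on an aware UTC datetime ends with '+00:00'.
    return now.replace(microsecond=0).isoformat()
```

Called with `datetime(2099, 1, 1, tzinfo=timezone.utc)`, this returns `2099-01-01T00:00:00+00:00`.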
Tests (+5):
* TestInstalledAtStamp.test_install_stamps_installed_at — format check
(4-digit year, 'T', +00:00 suffix).
* TestInstalledAtStamp.test_update_refreshes_installed_at — freezes
datetime.now() to 2099-01-01 and confirms update writes a new stamp.
* TestProfileInfoDistribution.test_installed_distribution_shows_in_list
— ProfileInfo.distribution_{name,version,source} populated after install.
* TestProfileInfoDistribution.test_plain_profile_has_no_distribution_fields
— plain profiles have None.
* TestProfileInfoDistribution.test_malformed_manifest_does_not_break_list
— broken distribution.yaml in one profile doesn't break list_profiles().
Validation:
* 163/163 tests pass (56 distribution + 102 profile regression +
5 new from this commit — up from 158).
* docs-lint: 0 errors.
* E2E verified: install preview shows ✓/needs-setting per env var,
profile list shows Distribution column, profile show + delete
preview mentions source URL, info shows Installed: timestamp.
* fix(profile-dist): clean errors + warn when overwriting plain profiles
Two small polish fixes found during collision sweeps of the PR:
1. ValueError from validate_profile_name now caught cleanly
* A distribution.yaml whose 'name' field can't be used as a profile
identifier (spaces, path traversal, etc.) raises ValueError from
hermes_cli.profiles.validate_profile_name, which was escaping as a
raw Python traceback from 'hermes profile install/update/info'.
* Broadened the except clause in all three handlers to catch
(DistributionError, ValueError) — users now see:
Error: Invalid profile name '../../etc/passwd'. Must match
[a-z0-9][a-z0-9_-]{0,63}
instead of a stack trace.
2. Install preview distinguishes plain profile overwrite from
distribution re-install
* When plan.target_dir exists and IS a distribution (has
distribution.yaml), preview still shows the mild
(profile exists — will overwrite distribution-owned files only)
* When plan.target_dir exists but is a HAND-BUILT plain profile (no
distribution.yaml), preview now shows a loud warning:
⚠ Profile exists but is NOT a distribution. Installing here will
overwrite its SOUL.md, skills/, cron/, and mcp.json.
Your memories, sessions, auth.json, and .env will be preserved,
but any hand-edits to distribution-owned files will be lost.
* Users who type 'hermes profile install foo --force' against a
profile they hand-built now see what they're signing up for. User
data is still safe (memories, sessions, auth, .env are in
USER_OWNED_EXCLUDE), but custom SOUL/skills get stomped.
Tests (+2):
* TestErrorSurfaces.test_bad_profile_name_raises_valueerror_not_traceback
* TestErrorSurfaces.test_path_traversal_name_rejected
Validation:
* 165/165 tests pass (was 163).
* E2E: bad manifest names produce 'Error: Invalid profile name ...'
with no traceback; installing over a plain profile shows the warning;
re-installing over an existing distribution shows the normal
overwrite message.
* Bad HTTPS URLs still produce 'Error: git clone failed: ...' — git
itself generates a clean enough message that no wrapper is needed.
* 'install .' works correctly from any cwd.
* fix(profiles): reject reserved names at validate time
Before: hermes profile create hermes / profile install / profile rename
all silently accepted reserved names like hermes, test, tmp, root,
sudo. The profile directory was created; only alias creation failed (via
check_alias_collision), leaving a confusingly-named profile on disk — e.g.
~/.hermes/profiles/hermes/ sitting next to ~/.hermes/ itself.
The reserved set already exists (_RESERVED_NAMES, introduced alongside alias
collision detection). This commit moves the check up one layer to
validate_profile_name so every entry point — create, install, import,
rename, dashboard web API — shares the same gate.
The error message points the user at the cause without being cryptic:
Error: Profile name 'hermes' is reserved — it collides with either the
Hermes installation itself or a common system binary. Pick a different
name.
default continues to pass through (it's a special alias for ~/.hermes).
_HERMES_SUBCOMMANDS (chat, model, gateway, etc.) stays at
alias-collision time only — those are fine as bare profile names with
--no-alias.
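A minimal sketch of the single validation gate, assuming the name pattern quoted in the earlier error message. The reserved set below is only the subset named in this message (the real _RESERVED_NAMES may contain more), and the error wording is abbreviated.

```python
import re

_NAME_RE = re.compile(r"[a-z0-9][a-z0-9_-]{0,63}")
# Illustrative subset: the names this commit message gives as examples.
_RESERVED_NAMES = frozenset({"hermes", "test", "tmp", "root", "sudo"})


def validate_profile_name(name: str) -> None:
    """Raise ValueError for names that cannot be used as a profile identifier."""
    if name == "default":
        return  # Special alias for ~/.hermes, always allowed.
    if not _NAME_RE.fullmatch(name):
        raise ValueError(
            f"Invalid profile name {name!r}. Must match [a-z0-9][a-z0-9_-]{{0,63}}"
        )
    if name in _RESERVED_NAMES:
        raise ValueError(f"Profile name {name!r} is reserved.")
```

Because every entry point calls the same function, a caller only needs to catch ValueError once to turn both failure modes into a clean error line.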
Tests (+5): test_reserved_names_rejected parametrized over the full
_RESERVED_NAMES set, matching the existing pattern in TestValidateProfileName.
No existing test uses a reserved name as a profile identifier (grepped
create_profile("hermes|test|tmp|root|sudo") — zero hits).
Validation:
* 170/170 tests pass in the profile suites.
* E2E: profile create hermes, profile install with manifest
name=hermes, and profile install ... --name hermes all produce the
same clean Error: Profile name 'hermes' is reserved ... with rc=1
and no traceback. Normal names (mybot) still work.
fix(cli): make Ctrl+Enter insert newline on WSL/SSH/Windows Terminal (#22777)
Native Windows, WSL, SSH sessions, and Windows Terminal all send
Ctrl+Enter as bare LF (c-j). Hermes was binding c-j as submit on
every POSIX platform, so Ctrl+Enter submitted instead of inserting
a newline on those terminals. Reported in #22379.
Add _preserve_ctrl_enter_newline() predicate that detects the
environments where Ctrl+Enter must produce a newline (sys.platform
== 'win32', SSH_CONNECTION/SSH_CLIENT/SSH_TTY env, WT_SESSION,
WSL_DISTRO_NAME, /proc/version 'microsoft' marker). Gate the
c-j-as-submit binding off in those environments and gate the
c-j-as-newline handler on. Local POSIX TTYs without those markers
(docker exec, plain ssh from a Mac) keep c-j as submit so plain
Enter still works on thin PTYs.
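The detection described above could look roughly like the sketch below. The marker names come from this message; the function signature, the injectable parameters, and the order of checks are assumptions made so the example is testable without touching the real environment.

```python
import os
import sys

# Environment markers named in the commit message.
_ENV_MARKERS = ("SSH_CONNECTION", "SSH_CLIENT", "SSH_TTY", "WT_SESSION", "WSL_DISTRO_NAME")


def preserve_ctrl_enter_newline(environ=None, platform=None,
                                proc_version_path="/proc/version") -> bool:
    """Return True where Ctrl+Enter arrives as bare LF and must insert a newline."""
    environ = os.environ if environ is None else environ
    platform = sys.platform if platform is None else platform
    if platform == "win32":
        return True
    if any(environ.get(name) for name in _ENV_MARKERS):
        return True
    # WSL kernels mention "microsoft" in /proc/version.
    try:
        with open(proc_version_path, encoding="utf-8", errors="ignore") as fh:
            return "microsoft" in fh.read().lower()
    except OSError:
        return False  # No /proc (e.g. macOS): treat as a plain local TTY.
```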
Add install_ctrl_enter_alias() in hermes_cli/pt_input_extras.py
mapping the three CSI-u / modifyOtherKeys variants of Ctrl+Enter
('\x1b[13;5u', '\x1b[27;5;13~', '\x1b[27;5;13u') to the
(Escape, ControlM) tuple Alt+Enter produces. This lets Kitty /
mintty / xterm-with-modifyOtherKeys users over SSH get a Ctrl+Enter
newline through the existing Alt+Enter handler.
9 new tests + extended existing test_lf_enter_binds_to_submit_handler_posix
to cover bare-local vs SSH branches.
Closes #22379.
fix(security): derive <VENDOR>_API_KEY from host as final credential fallback
After #28660's host-gating fix, users with provider=custom and base_url
pointed at a commercial endpoint (DeepSeek, Groq, Mistral, …) hit
no-key-required even when they had the vendor-named env var set
(DEEPSEEK_API_KEY, GROQ_API_KEY, …). The issue author flagged this as
'what users intuitively expect'.
Adds _host_derived_api_key() to derive an env var name from the base URL
host using the *registrable* label (second-to-last). Appended to all three
api_key_candidates chains (_resolve_named_custom_runtime direct-alias path,
named-custom path, _resolve_openrouter_runtime non-openrouter branch).
Lookalike resistance: api.deepseek.com.attacker.test resolves to vendor
label 'attacker', NOT 'deepseek' — DEEPSEEK_API_KEY stays put. IPs and
loopback yield no vendor label. Already-handled vendors (OPENAI/OPENROUTER/
OLLAMA) are filtered to prevent bypass of the explicit host-gated paths.
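A minimal sketch of the derivation, assuming the second-to-last host label is used as the vendor. The function name and return convention are hypothetical, and taking the second-to-last label is a simplification rather than a full public-suffix lookup (a host under a two-part suffix such as .co.uk would need extra handling).

```python
import ipaddress
import re
from urllib.parse import urlparse

# Vendors that already have explicit host-gated paths, per the commit message.
_ALREADY_HANDLED = frozenset({"OPENAI", "OPENROUTER", "OLLAMA"})


def host_derived_env_name(base_url: str):
    """Return '<VENDOR>_API_KEY' for a base URL, or None if no vendor applies."""
    host = (urlparse(base_url).hostname or "").rstrip(".").lower()
    if not host or host == "localhost":
        return None
    try:
        ipaddress.ip_address(host)
        return None  # IP literals (including loopback) carry no vendor label.
    except ValueError:
        pass
    labels = host.split(".")
    if len(labels) < 2:
        return None
    # Second-to-last label: 'deepseek' for api.deepseek.com, but 'attacker'
    # for api.deepseek.com.attacker.test.
    vendor = re.sub(r"[^A-Z0-9]", "_", labels[-2].upper())
    if not vendor or vendor in _ALREADY_HANDLED:
        return None
    return f"{vendor}_API_KEY"
```

A caller would then read `os.environ.get(name)` only as the last candidate in the chain, after every explicit key source has been tried.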
Adds 6 tests covering positive paths (DeepSeek, Groq), the lookalike attack,
loopback rejection, the already-handled-vendor filter, and direct helper
unit tests.
Also adds erhnysr to AUTHOR_MAP.
feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback (#24220)
* feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback
Three coordinated mitigations for the Mini Shai-Hulud worm hitting
mistralai 2.4.6 on PyPI (2026-05-12) and for the next single-package
compromise that follows.
# What this PR makes true
1. Users with the poisoned mistralai 2.4.6 in their venv get a loud
detection banner with copy-pasteable remediation steps the moment
they run hermes (and on every gateway startup).
2. One quarantined / yanked PyPI package can no longer silently demote
a fresh install to 'core only' — the installer keeps every other
extra and tells the user which tier landed.
3. Future opt-in backends (Mistral, ElevenLabs, Honcho, etc.) can
lazy-install on first use under a strict allowlist, instead of
eagerly pulling everything at install time.
# Detection: hermes_cli/security_advisories.py
- ADVISORIES catalog (one entry currently: shai-hulud-2026-05 for
mistralai==2.4.6). Adding the next one is a single dataclass.
- detect_compromised() uses importlib.metadata.version() — no pip
dependency, works in uv venvs that lack pip.
- Banner cache (~/.hermes/cache/advisory_banner_seen) rate-limits
the startup banner to once per 24h per advisory.
- Acks persisted to security.acked_advisories in config.yaml; never
re-banner after ack.
- Wired into:
* hermes doctor — runs first, prints full remediation block
* hermes doctor --ack <id> — dismisses an advisory
* cli.py interactive run() and single-query branches — short
stderr banner pointing at hermes doctor
* gateway/run.py startup — operator-visible warning in gateway.log
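The detection step can be sketched as follows. The advisory ID, package, and version come from this message, and `importlib.metadata.version()` is the standard-library call it names; the dataclass field names and the injectable `get_version` parameter are assumptions for illustration.

```python
from dataclasses import dataclass
from importlib import metadata


@dataclass(frozen=True)
class Advisory:
    advisory_id: str
    package: str
    bad_versions: frozenset  # exact version strings known to be compromised


ADVISORIES = (
    Advisory("shai-hulud-2026-05", "mistralai", frozenset({"2.4.6"})),
)


def detect_compromised(advisories=ADVISORIES, get_version=metadata.version):
    """Return the advisories whose compromised version is currently installed."""
    hits = []
    for advisory in advisories:
        try:
            installed = get_version(advisory.package)
        except metadata.PackageNotFoundError:
            continue  # Package absent: nothing to warn about.
        if installed in advisory.bad_versions:
            hits.append(advisory)
    return hits
```

Reading installed metadata directly avoids shelling out to pip, which matters in uv-managed venvs where pip may not exist.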
# Lazy-install framework: tools/lazy_deps.py
- LAZY_DEPS allowlist maps namespaced feature keys (tts.elevenlabs,
memory.honcho, provider.bedrock, etc.) to pip specs.
- ensure(feature) installs missing deps in the active venv via the
uv → pip → ensurepip ladder (matches tools_config._pip_install).
- Strict spec safety regex rejects URLs, file paths, shell metas,
pip flag injection, control chars — only PyPI-by-name accepted.
- Gated on security.allow_lazy_installs (default true) plus the
HERMES_DISABLE_LAZY_INSTALLS env var for restricted/audited envs.
- Migrated three backends as proof of pattern:
* tools/tts_tool.py — _import_elevenlabs() calls ensure first
* plugins/memory/honcho/client.py — get_honcho_client lazy-installs
* tts.mistral / stt.mistral entries pre-registered for when PyPI
restores mistralai
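To illustrate the spec safety check, here is one possible shape for a "PyPI-by-name only" pattern. The regex below is an assumption written for this example, not the one shipped in tools/lazy_deps.py; it accepts a project name, optional extras, and comma-separated version clauses, and rejects everything else.

```python
import re

_SAFE_SPEC = re.compile(
    r"[A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?"        # project name
    r"(?:\[[A-Za-z0-9._,-]+\])?"                           # optional extras
    r"(?:(?:==|>=|<=|~=|!=|<|>)[A-Za-z0-9.*+!_-]+"         # first version clause
    r"(?:,(?:==|>=|<=|~=|!=|<|>)[A-Za-z0-9.*+!_-]+)*)?"    # further clauses
)


def is_safe_spec(spec: str) -> bool:
    """Accept only plain PyPI requirement strings.

    URLs, file paths, leading dashes (pip flags), whitespace, shell
    metacharacters, and control characters all fail the full match.
    """
    return bool(_SAFE_SPEC.fullmatch(spec))
```

Using `fullmatch` rather than `match` is the important design choice: a trailing newline or appended `; rm -rf /` cannot ride along behind an otherwise valid prefix.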
# Installer fallback tiers
scripts/install.sh, scripts/install.ps1, setup-hermes.sh:
- Centralised _BROKEN_EXTRAS list (currently: mistral). Edit one
array when a transitive breaks; users keep every other extra.
- New 'all minus known-broken' tier between [all] and the existing
PyPI-only-extras tier. Only kicks in when [all] fails resolve.
- All three tiers explicit: every fallback announces which tier
landed and prints a re-run hint when not on Tier 1.
- install.ps1 and install.sh both regenerate their tier specs from
the same _BROKEN_EXTRAS array so updates stay in sync.
Side effect: install.ps1 Tier 2 spec previously hardcoded 'mistral'
in its extra list — bug fixed by the refactor (mistral is filtered
out).
# Config
hermes_cli/config.py — DEFAULT_CONFIG.security gains:
- acked_advisories: [] (advisory IDs the user has dismissed)
- allow_lazy_installs: True (security gate for ensure())
No config version bump needed — both keys nest under existing
security: block, and load_config's deep-merge picks up DEFAULT_CONFIG
defaults for users with older configs.
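Why no version bump is needed can be seen from a generic deep-merge sketch. This is not load_config's actual implementation; the function below and the `existing_key` entry are hypothetical and only show how defaults fill in keys that an older user config lacks.

```python
import copy


def deep_merge(defaults: dict, user: dict) -> dict:
    """Overlay user values on defaults, recursing into nested dicts."""
    merged = copy.deepcopy(defaults)
    for key, value in user.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# An older config that predates the two new security keys.
DEFAULTS = {"security": {"existing_key": 1, "acked_advisories": [], "allow_lazy_installs": True}}
older_user_config = {"security": {"existing_key": 2}}
effective = deep_merge(DEFAULTS, older_user_config)
```

Here `effective["security"]` keeps the user's `existing_key` override while picking up both new defaults.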
# Tests
tests/hermes_cli/test_security_advisories.py — 23 tests covering:
- detect_compromised matches/non-matches, wildcard frozenset
- ack persistence, idempotence, blank rejection, config-failure path
- banner cache rate limiting + 24h re-banner + ack-stops-banner
- short_banner_lines / full_remediation_text / render_doctor_section /
gateway_log_message
- shipped catalog well-formedness invariant
tests/tools/test_lazy_deps.py — 40 tests covering:
- spec safety: 11 safe parametrized + 18 unsafe parametrized
- allowlist: unknown-feature rejection, namespace.name shape,
every shipped spec passes the safety regex
- security gating: config flag, env var, default, fail-open
- ensure() happy/sad paths: already-satisfied, install success,
pip stderr surfaced on failure, install-succeeds-but-still-missing
- is_available, feature_install_command
Combined: 63 new tests, all passing under scripts/run_tests.sh.
# Validation
- scripts/run_tests.sh tests/hermes_cli/test_security_advisories.py
tests/tools/test_lazy_deps.py → 63/63 passing
- scripts/run_tests.sh tests/hermes_cli/test_doctor.py
tests/hermes_cli/test_doctor_command_install.py
tests/tools/test_tts_mistral.py tests/tools/test_transcription_tools.py
tests/tools/test_transcription_dotenv_fallback.py → 165/165 passing
- scripts/run_tests.sh tests/hermes_cli/ tests/tools/ →
9191 passed, 8 pre-existing failures (verified on origin/main
before this change)
- bash -n on install.sh and setup-hermes.sh → OK
- py_compile on all modified .py files → OK
- End-to-end smoke test of detect_compromised + render_doctor_section
+ gateway_log_message with mocked installed version → produces
copy-pasteable remediation output
# Community
Full advisory + remediation steps:
website/docs/community/security-advisories/shai-hulud-mistralai-2026-05.md
Short-form post drafts (Discord, GitHub pinned issue, README banner):
scripts/community-announcement-shai-hulud.md
Refs: PR #24205 (mistral disabled), Socket Security advisory
<https://socket.dev/blog/mini-shai-hulud-worm-pypi>
* build(deps): pin every direct dep to ==X.Y.Z (no ranges)
Companion to the supply-chain advisory work: replace every >=/</~= range
in pyproject.toml's [project.dependencies] and [project.optional-dependencies]
with an exact ==X.Y.Z pin sourced from uv.lock.
Why: ranges allow PyPI to ship a fresh version of any direct dep at any
time without a code review on our side. With ranges, the malicious
mistralai 2.4.6 release would have been pulled by every fresh
'pip install -e .[all]' for the hours between upload and PyPI's
quarantine — exactly the install window we got hit on. Exact pins close
that window: the only way a new package version reaches a user is via
an intentional update on our end.
What the user-facing change is: nothing, behavior-wise. Every package
resolves to the same version it was already resolving to via uv.lock —
the pins just remove the resolver's freedom to pick a different one.
Cost: any user installing Hermes alongside another package that requires
a newer pin gets a resolver conflict. Acceptable for our isolated-venv
install path; documented in the new comment block.
Build-system requires line (setuptools>=61.0) is intentionally left
as a range — pinning the build backend would block fresh pip from
bootstrapping the build on architectures where that exact wheel isn't
available.
mistral extra (mistralai==2.3.0) is pinned but stays out of [all]
(per PR #24205). 'uv lock' regeneration will fail until PyPI restores
mistralai; lockfile regeneration is gated behind that, NOT on every PR.
LAZY_DEPS in tools/lazy_deps.py also moved to exact pins so the lazy-
install pathway can never resolve a different version than the one
declared in pyproject.toml.
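A minimal sketch of the "exact pins only" invariant, assuming LAZY_DEPS maps a feature name to a list of requirement strings (the real structure in tools/lazy_deps.py may differ). The regex, the helper names, and the example versions below are hypothetical and only illustrate the check.

```python
import re

# Hypothetical pattern: a package name, optional [extras], then an exact ==X.Y.Z pin.
_EXACT_PIN = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[A-Za-z0-9,._-]+\])?==\d+(\.\d+)*$"
)


def is_exact_pin(spec: str) -> bool:
    """Return True only for specs of the form name==X.Y.Z (no ranges)."""
    return _EXACT_PIN.fullmatch(spec) is not None


def unpinned_specs(lazy_deps: dict[str, list[str]]) -> list[str]:
    """Collect every spec that still lets the resolver choose a version."""
    return [
        spec
        for specs in lazy_deps.values()
        for spec in specs
        if not is_exact_pin(spec)
    ]


# Illustrative data only; the versions are made up for this example.
EXAMPLE_LAZY_DEPS = {
    "tts.edge": ["edge-tts==7.0.0"],
    "search.exa": ["exa-py>=1.0"],
}
```

A test over the shipped table could then assert that unpinned_specs(...) is empty, which is one way to keep ranges from creeping back in.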
Validation:
- Cross-checked all 77 pinned direct deps in pyproject.toml against
uv.lock — every pin matches the resolved version exactly.
- Cross-checked all LAZY_DEPS specs against uv.lock — same.
- 'uv pip install -e .[all] --dry-run' resolves 205 packages cleanly.
- tests/tools/test_lazy_deps.py + tests/hermes_cli/test_security_advisories.py
→ 63/63 passing (every shipped spec passes the safety regex).
- Doctor + TTS + transcription targeted suite → 146/146 passing.
* build(deps): hash-verify transitives via uv.lock; remove unresolvable [mistral] extra
You asked: 'what about the dependencies the dependencies rely on?' —
correctly noting that exact-pinning direct deps in pyproject.toml does
NOT cover the transitive graph. pip install and uv pip install both
re-resolve transitives fresh from PyPI at install time, so a compromised
transitive (e.g. httpcore if it got worm-poisoned tomorrow) would
still hit our users even with every direct dep exact-pinned.
# What this commit fixes
1. **Both real installer scripts now prefer uv sync --locked as Tier 0.**
uv.lock records SHA256 hashes for every transitive — a compromised
package with a different hash gets REJECTED. Falls through to the
existing uv pip install cascade if the lockfile is missing or
stale, with a loud warning that the fallback path does NOT
hash-verify transitives. Previously only setup-hermes.sh (the dev
path) used the lockfile; scripts/install.sh and scripts/install.ps1
(the paths fresh users actually run) skipped it.
2. **Removed the [mistral] extra entirely.** The mistralai PyPI
project is fully quarantined right now — every version returns 404,
so any pin we wrote was unresolvable, which broke uv lock --check
in CI. Restoration is documented in pyproject.toml as a 5-step
checklist (verify, re-add extra, re-enable in 4 modules, regenerate
lock, optionally re-add to [all]).
3. **Regenerated uv.lock.** 262 packages, mistralai/eval-type-backport/
jsonpath-python pruned. uv lock --check now passes.
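The hash check that a locked install relies on can be sketched in a few lines. This is not uv's implementation, only an illustration of the idea under the assumption that the lockfile stores a hex SHA256 digest per artifact; sha256_of and matches_locked_hash are hypothetical names.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_locked_hash(path: Path, locked_sha256: str) -> bool:
    """True when the downloaded artifact matches the hash recorded at lock time."""
    return sha256_of(path) == locked_sha256.lower()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        artifact = Path(tmp) / "example.whl"
        artifact.write_bytes(b"original wheel bytes")
        locked = sha256_of(artifact)  # what a lockfile would have recorded
        artifact.write_bytes(b"tampered wheel bytes")
        print(matches_locked_hash(artifact, locked))
```

A re-uploaded or tampered artifact changes the digest, so the comparison fails even when the package name and version are unchanged.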
# Defense-in-depth view
| Layer                             | Where             | Protects against                            |
|-----------------------------------|-------------------|---------------------------------------------|
| Exact pins in pyproject           | direct deps       | new mistralai 2.4.6-style direct compromise |
| uv.lock + --locked install        | transitive graph  | transitive worm injection                   |
| Tier-0 hash-verified path         | install.sh / .ps1 | actually USE the lockfile in fresh installs |
| uv lock --check CI gate           | every PR          | drift between pyproject and lockfile        |
| hermes_cli/security_advisories.py | runtime           | cleanup for users who already got hit       |
The exact pinning + hash verification together close the supply-chain
gap. Without the lockfile path, exact pins alone are theater.
# Validation
- uv lock --check → passes (262 packages resolved, no drift).
- bash -n on install.sh + setup-hermes.sh → OK.
- 209/209 tests passing across new + adjacent test files
(test_lazy_deps.py, test_security_advisories.py, test_doctor.py,
test_tts_mistral.py, test_transcription_tools.py).
- TOML parse OK.
* chore: remove community announcement drafts (PR body covers it)
* build(deps): lazy-install every opt-in backend (anthropic, search, terminal, platforms, dashboard)
Extends the lazy-install framework to cover everything that's not used by
every hermes session. Base install drops from ~60 packages to 45.
Moved out of core dependencies = []:
- anthropic (only when provider=anthropic native, not via aggregators)
- exa-py, firecrawl-py, parallel-web (search backends; only when picked)
- fal-client (image gen; only when picked)
- edge-tts (default TTS but still optional)
New extras in pyproject.toml: [anthropic] [exa] [firecrawl] [parallel-web]
[fal] [edge-tts]. All added to [all].
New LAZY_DEPS entries: provider.anthropic, search.{exa,firecrawl,parallel},
tts.edge, image.fal, memory.hindsight, platform.{telegram,discord,matrix},
terminal.{modal,daytona,vercel}, tool.dashboard.
Each import site now calls ensure() before importing the SDK. Where the
module had a top-level try/except (telegram, discord, fastapi), the
graceful-fallback pattern was extended to lazy-install on first
check_*_requirements() call and re-bind module globals.
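A minimal sketch of the ensure-before-import pattern. The signature here is an assumption for illustration (the real ensure() in tools/lazy_deps.py takes a feature name and applies the allowlist and security gating described earlier); the injectable install callable is hypothetical and exists so the example can run without touching pip.

```python
import importlib.util
import subprocess
import sys
from typing import Callable, Sequence


def _pip_install(specs: Sequence[str]) -> None:
    """Default installer: run pip inside the current interpreter's environment."""
    subprocess.run([sys.executable, "-m", "pip", "install", *specs], check=True)


def ensure(
    module: str,
    specs: Sequence[str],
    install: Callable[[Sequence[str]], None] = _pip_install,
) -> bool:
    """Return True if `module` is importable, installing `specs` first if needed."""
    if importlib.util.find_spec(module) is not None:
        return True  # already satisfied, nothing to install
    install(specs)
    importlib.invalidate_caches()  # let the import system see newly written files
    # Covers the install-succeeds-but-still-missing case.
    return importlib.util.find_spec(module) is not None
```

An import site would call ensure(...) first and only import the SDK when it returns True, falling back to the existing "feature unavailable" path otherwise.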
Updated test_windows_native_support.py tzdata check from snapshot
(>=2023.3 literal) to invariant (any version + win32 marker).
Validation:
- Base install: 45 packages (was ~60); 6 newly-extracted packages absent
- uv lock --check: passes (262 packages, no drift)
- 209/209 lazy_deps + advisory + doctor + tts/transcription tests passing
- py_compile clean on all 12 modified modules
chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355)
Six days after #23937 (608 fixes) the codebase had accumulated 241 new
PLR6201 violations. Same mechanical x in (...) → x in {...} fix,
same zero-risk profile: set lookup is O(1) vs O(n) for tuple and the
two are semantically equivalent for hashable scalar membership tests.
All 241 instances fixed via `ruff check --select PLR6201 --fix
--unsafe-fixes`, zero remaining. Every changed value is a hashable
scalar (str/int/None/enum/signal); no risk of unhashable runtime
errors. No behavior change.
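The equivalence claim, and the reason ruff classes this fix as unsafe, can be shown with a small example. The function names and literal values are illustrative only.

```python
def in_tuple(value):
    # Linear scan using ==; works for any left operand.
    return value in ("stop", "length", None)


def in_set(value):
    # Hash lookup; the left operand must be hashable.
    return value in {"stop", "length", None}


def set_membership_raises(value) -> bool:
    """Report whether the set form raises TypeError for this operand."""
    try:
        in_set(value)
    except TypeError:
        return True
    return False
```

For str, int, None, and enum operands the two forms return the same result. They diverge only for an unhashable operand such as a list: the tuple form returns False while the set form raises TypeError, which is why every changed site was checked to involve hashable scalars.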
Test plan:
- 119 files changed, +244/-244 (net zero) — exactly one-line edits
- ruff check clean afterward
- Compile checks pass on the largest touched files (cli.py, run_agent.py,
gateway/run.py, gateway/platforms/discord.py, model_tools.py)
- Subset broad test run on tests/gateway/ tests/hermes_cli/ tests/agent/
tests/tools/: 18187 passed, 59 pre-existing failures (verified against
origin/main with the same shape — identical failure count, identical
category — all xdist test-order flakes unrelated to this change)
Follows the same template as PR #23937 ([tracker: #23972](https://github.com/NousResearch/hermes-agent/issues/23972)).
refactor(config): migrate remaining 33 cfg_get call sites (#17311)
Completes the cfg_get migration started in PR #17304. Covers the
remaining hermes_cli/ and plugins/ config-access sites that the first
PR intentionally left opportunistic.
Migrated (33 sites across 14 files):
hermes_cli/setup.py 13 sites (terminal.*, agent.*, display.*, compression.*, tts.*)
hermes_cli/tools_config.py 7 sites (tts.*, browser.*, web.*, platform_toolsets.*)
hermes_cli/plugins_cmd.py 3 sites (plugins.*, memory.*, context.*)
plugins/memory/honcho/cli.py 3 sites (hosts.*)
hermes_cli/web_server.py 1 site (dashboard.*)
hermes_cli/skills_config.py 1 site (platform_disabled)
hermes_cli/plugins.py 1 site (plugins.disabled)
hermes_cli/status.py 1 site (terminal.backend)
hermes_cli/mcp_config.py 1 site (mcp_servers.*)
hermes_cli/webhook.py 1 site (platforms.webhook)
plugins/memory/__init__.py 1 site (memory.provider)
plugins/memory/hindsight/ 1 site (banks.hermes)
plugins/memory/holographic/ 1 site (plugins.hermes-memory-store)
run_agent.py 1 site (auxiliary.compression)
The helper supports non-literal keys too, so e.g.
cfg.get('hosts', {}).get(HOST, {})
becomes
cfg_get(cfg, 'hosts', HOST, default={})
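For readers who have not seen the helper, a minimal sketch of what a cfg_get with this call shape could look like. This is an assumption based on the call site above, not the implementation from PR #17304; in particular, returning the default when an intermediate value is not a mapping is a choice made for this sketch.

```python
from typing import Any, Mapping


def cfg_get(cfg: Any, *keys: Any, default: Any = None) -> Any:
    """Walk nested mappings key by key, returning `default` when a step is missing."""
    node = cfg
    for key in keys:
        if not isinstance(node, Mapping) or key not in node:
            return default
        node = node[key]
    return node


# Keys need not be literals: HOST is resolved at runtime.
HOST = "local"
cfg = {"hosts": {"local": {"workspace": "demo"}}}
host_cfg = cfg_get(cfg, "hosts", HOST, default={})
```

Because keys are plain positional arguments, literal and computed keys mix freely, and the chained .get(..., {}) boilerplate collapses into one call.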
Migration bugs caught and fixed during this PR:
1. An AST-based batch rewrite naïvely captured the first word token in
a chain, which corrupted 'self._config.get(...).get(...)' into
'self.cfg_get(_config, ...)' (dropping 'self.', creating a broken
method call). Plugins/memory/hindsight caught it via its test suite.
Fixed manually to 'cfg_get(self._config, ...)'.
2. Import-extension heuristic rewrote multi-line parenthesized imports
('from X import (\n A,\n B,\n)') as
'from X import cfg_get, (' — syntactically broken. Fixed by inserting
cfg_get as the first name inside the parentheses.
Combined with PR #17304, the cfg_get migration now covers:
PR #17304 (first batch): 20 sites in tools/ + gateway/
PR #17311 (this one): 33 sites in hermes_cli/ + plugins/ + run_agent.py
Total: 53 sites migrated. Remaining ~8 sites are either:
- Function-call chains (e.g. '_load_stt_config().get(...).get(...)')
that would need double-evaluation or a local binding to migrate
cleanly — intentionally deferred.
- JSON response-navigation (e.g. "response_data.get('data', {}).get('web')")
which is unrelated to config access and shouldn't use cfg_get.
Verified:
- 412/412 tests/plugins/ pass (including the hindsight test that caught
the self.X regex bug before commit)
- 3181/3189 tests/hermes_cli/ pass (8 pre-existing failures on main,
verified by git-stash comparison)
- Live 'hermes status' and 'hermes config' render correctly (exercise
the migrated terminal.backend, tts.provider, browser.cloud_provider,
compression.threshold, display.tool_progress sites)
- Live 'hermes chat': 1 turn + /quit, zero errors in 11-line log window
No semantic changes — cfg_get was already proven to be a 1:1 match for
the original .get("X",{}).get("Y",default) pattern in PR #17304.
chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)
Replace x in (...) with x in {...} for all literal-tuple
membership tests. Set lookup is O(1) vs O(n) for tuple — consistent
micro-optimization across the codebase.
608 instances fixed via ruff --fix --unsafe-fixes, 0 remaining.
133 files, +626/-626 (net zero).
perf(agent-loop): cut 47% of per-conversation function calls via 3 targeted hot-path optimizations (#28866)
* perf(config): add load_config_readonly() fast path for hot agent loop
load_config() is called from the agent loop's per-API-call hot path via
get_provider_request_timeout() and get_provider_stale_timeout() —
both invoked once per turn from _resolved_api_call_timeout() in
run_agent.py.
Profiling a synthetic 20-tool-call agent run revealed:
- 21 invocations of load_config() totaling 56ms (~17% of agent loop)
- 34,398 deepcopy calls totaling 37ms (config defensive deepcopy + chain)
- 8,652 _expand_env_vars invocations (~412 per turn)
Microbench (cache-hit, real config.yaml present):
load_config() 265us/call (125us deepcopy + 140us infra)
load_config_readonly() 138us/call (~48% faster)
load_config_readonly() returns the cached dict directly without the
defensive deepcopy. Documented contract: caller must not mutate. Returns
plain dict (not MappingProxyType) so downstream isinstance(x, dict)
guards keep working — caught during initial implementation when
MappingProxyType broke get_provider_request_timeout's guard logic.
Wired into hermes_cli/timeouts.py (the two functions called per agent
turn). load_config() is unchanged for the 263 other call sites that
mutate the result before save_config(), are not in the hot path, or
where the safety guarantee matters more than the perf.
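The difference between the two entry points can be sketched as follows. The cache variable, the _read_from_disk stand-in, and the config values are hypothetical; the real loader also handles file mtime checks and env-var expansion, which are omitted here.

```python
import copy
from typing import Any, Optional

_CACHE: Optional[dict[str, Any]] = None


def _read_from_disk() -> dict[str, Any]:
    """Stand-in for the real YAML parse; returns illustrative values."""
    return {"providers": {"example": {"request_timeout": 30}}}


def _cached() -> dict[str, Any]:
    global _CACHE
    if _CACHE is None:
        _CACHE = _read_from_disk()
    return _CACHE


def load_config() -> dict[str, Any]:
    """Safe path: every caller gets an isolated, mutable deep copy."""
    return copy.deepcopy(_cached())


def load_config_readonly() -> dict[str, Any]:
    """Fast path: return the cached dict itself. Callers must not mutate it."""
    return _cached()
```

Returning the plain dict keeps isinstance(x, dict) guards working, at the price of relying on the documented no-mutation contract rather than on an enforced read-only wrapper.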
Profile A/B (cached config, 21-turn agent loop):
BEFORE AFTER delta
get_provider_request_timeout 55ms 16ms -71%
total function calls 399k 160k -60%
deepcopy calls (in hotspots) 34,398 ~0 ~elim
Verified:
- isinstance(load_config_readonly(), dict) is True
- timeout/stale resolutions correct
- load_config() still returns isolated mutable deepcopies
- tests/hermes_cli/test_config*.py / test_timeouts.py: 102/102 pass
- tests/cli/ + tests/agent/test_auxiliary_client.py: 883/883 pass
* perf(redact): substring pre-screens skip non-matching regex chains
Every log record passes through RedactingFormatter.format which calls
redact_sensitive_text, which historically ran ALL 13 secret-pattern
regexes against every line — including DB connection strings, JWTs,
Discord mentions, Signal phone numbers, etc. — even for typical clean
log records like 'INFO run_agent: API call completed'.
Add cheap substring pre-checks before each regex pass. A false positive
only costs one regex run that matches nothing; false negatives are
impossible because no pattern can match text that lacks its gating
substring:
- _PREFIX_RE gated on any of 33 known credential prefix substrings
- _ENV_ASSIGN_RE gated on = in text
- _JSON_FIELD_RE gated on : and " in text
- _AUTH_HEADER_RE gated on uthorization/UTHORIZATION in text
- _TELEGRAM_RE gated on : in text
- _PRIVATE_KEY_RE gated on BEGIN and -----
- _DB_CONNSTR_RE gated on :// in text
- _JWT_RE gated on eyJ in text
- URL userinfo/query gated on ://
- _redact_form_body gated on & and =
- _DISCORD_MENTION_RE gated on <@
- _SIGNAL_PHONE_RE gated on +
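The gating idea can be sketched with two of the patterns. The regexes below are simplified stand-ins written for this example, not the ones in the redaction module, and the replacement markers are likewise assumptions.

```python
import re

# Simplified, illustrative patterns; the real ones are more thorough.
_JWT_RE = re.compile(r"eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")
_DB_CONNSTR_RE = re.compile(r"(\w+://[^:/\s]+:)([^@\s]+)(@)")


def redact(text: str) -> str:
    """Run each regex only when its required substring is present."""
    # `in` on str is a plain substring scan, much cheaper than a regex pass.
    if "eyJ" in text:
        text = _JWT_RE.sub("[REDACTED_JWT]", text)
    if "://" in text:
        text = _DB_CONNSTR_RE.sub(r"\1***\3", text)
    return text
```

A clean line such as 'INFO run_agent: API call completed' contains neither substring, so it skips both regex passes. The gate is sound only as long as every possible match of a pattern contains its gating substring, which is the property the before/after regression inputs are meant to confirm.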
Microbench (5 typical log records, 20k iterations each):
BEFORE AFTER delta
redact_sensitive_text per call 5.63us 1.79us -68%
Real-world impact: ~244 log records emitted in a 30-turn agent loop, so
the chain saves ~1ms of CPU per conversation. Bigger win is the
reduction in regex execution and GC pressure during heavy logging
sessions (verbose logging, gateway message processing).
Security regression test: 30 secret-containing inputs (sk-/ghp_/JWT/DB
connstr/Auth-Bearer/private key/URL userinfo/Discord/Signal/etc.)
verified to produce identical redacted output before/after. All 75
existing tests/agent/test_redact.py cases pass.
The ?access_token=foo&code=bar (bare query string, no scheme) case
that 'leaks' is pre-existing behavior — the URL query redaction
requires a well-formed URL with scheme+host. Not a regression.
* perf(run_agent): cache _needs_thinking_reasoning_pad result per (provider, model, base_url)
Profile of a 31-turn synthetic agent run shows _needs_thinking_reasoning_pad
fires 495 times (~16 per turn) and each call ran 3 helper methods, each
hitting base_url_host_matches 1-4 times via urlparse. Total cost:
3,342 base_url_host_matches calls + 3,373 urlparse calls accounting for
~36ms of agent-loop overhead (~7% of the entire post-network work).
Provider / model / base_url don't change during a conversation except via
switch_model and fallback activation — both of which already overwrite
those attributes atomically. Cache the result on a tuple key; since the
key is derived from the very fields that would change, the cache
auto-invalidates on the next read after a switch. No manual invalidation
needed in switch_model / _try_activate_fallback.
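A minimal sketch of the key-derived cache. The Agent class, the needs_pad name, the computations counter, and the host rule are all hypothetical stand-ins for the real run_agent.py code; only the caching shape is the point.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse


class Agent:
    def __init__(self, provider: str, model: str, base_url: str) -> None:
        self.provider = provider
        self.model = model
        self.base_url = base_url
        self._pad_cache: Optional[Tuple[Tuple[str, str, str], bool]] = None
        self.computations = 0  # instrumentation for this example only

    def _compute_needs_pad(self) -> bool:
        """Stand-in for the expensive host-matching helpers."""
        self.computations += 1
        host = urlparse(self.base_url).hostname or ""
        return host == "api.example.com"

    def needs_pad(self) -> bool:
        # The key is built from the same fields a model switch overwrites,
        # so a stale entry simply fails the comparison on the next read.
        key = (self.provider, self.model, self.base_url)
        if self._pad_cache is not None and self._pad_cache[0] == key:
            return self._pad_cache[1]
        result = self._compute_needs_pad()
        self._pad_cache = (key, result)
        return result
```

Storing a single (key, result) pair rather than a dict keeps memory constant and makes invalidation implicit: switching provider, model, or base_url changes the key and forces one recomputation.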
Profile A/B (31-turn cached-config agent run):
BEFORE AFTER delta
_needs_thinking_reasoning_pad cum 18ms 1ms -94%
_copy_reasoning_content_for_api cum 17ms 1ms -94%
base_url_host_matches calls 3,342 372 -89%
urlparse calls 3,373 403 -88%
total function calls 296k 223k -25%
Verified:
- tests/run_agent/test_deepseek_reasoning_content_echo.py: 36/36 pass
- tests/run_agent/ (full): 1383/1383 pass + 3 skipped
fix(tui): restore voice push-to-talk parity (#20897)
* fix(tui): restore classic CLI voice push-to-talk parity
(cherry picked from commit 93b9ae301bb89f5b5e01b4b9f8ac91ffa74fbd9d)
* fix(tui): harden voice push-to-talk stop flow
Address review feedback from PR #16189 by stopping the active recorder before background transcription, documenting single-shot voice capture, and covering the TUI gateway flags with regression tests.
* fix(tui): preserve silent voice strike tracking
Keep single-shot voice recording's no-speech counter alive across starts so the TUI can still emit the three-strikes auto-disable event, and bind the auto-restart state at module scope for type checking.
* fix(tui): clean up voice stop failure path
Address follow-up review by naming the TUI flow as single-shot push-to-talk and cancelling the recorder when forced stop cannot produce a WAV.
* fix(tui): report busy voice capture starts
Return explicit start state from the voice wrapper so the TUI gateway does not report recording while forced-stop transcription is still cleaning up.
* fix(tui): handle busy voice record responses
Apply the gateway busy status immediately in the TUI and route forced-stop voice events to the session that sent the stop request.
* fix(tui): clear voice recording on null response
Treat a null voice.record RPC result as a failed optimistic start so the REC badge cannot stick after gateway-side errors.
* fix(tui): count silent manual voice stops
Preserve single-shot voice no-speech strikes through forced stop transcription so empty push-to-talk captures still trigger the three-strikes guard.
---------
Co-authored-by: Montbra <montbra@gmail.com>
chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)
Replace x in (...) with x in {...} for all literal-tuple
membership tests. Set lookup is O(1) vs O(n) for tuple — consistent
micro-optimization across the codebase.
608 instances fixed via ruff --fix --unsafe-fixes, 0 remaining.
133 files, +626/-626 (net zero).