hermes-agent/tests · zqf201492/hermes-agent - AtomGit

TTekniumtest(telegram): add brand-new-topic regression for #31086

文件	最后提交记录	最后更新时间
acp	fix(acp): only deliver final_response after streaming when transformed PR #29119 dropped the 'not streamed_message' guard unconditionally so that plugin-transformed responses (transform_llm_output hook) would reach ACP clients. That regressed test_prompt_does_not_duplicate_streamed_final_message: when no transform happened, the streamed text was re-sent as a duplicate final delivery. Tighten the condition to mirror the gateway side: deliver after streaming only when response_transformed=True. Otherwise keep the old guard. Adds test_prompt_delivers_transformed_response_after_streaming so the transformed path stays covered.	11 天前
acp_adapter	feat(azure-foundry): add Microsoft Entra ID auth Use azure-identity DefaultAzureCredential for keyless Foundry auth. Preserve refreshable callable credentials through OpenAI and Anthropic client paths. Add setup, doctor, auth status, docs, and tests for Entra auth. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	17 天前
agent	fix(anthropic): skip mcp_ prefix on outgoing tool schemas when already prefixed Companion to the GH-25255 incoming-strip fix from @hayka-pacha. Without this, build_anthropic_kwargs unconditionally added 'mcp_' to every tool name in step 3, so a native MCP server tool registered as 'mcp_composio_X' was sent as 'mcp_mcp_composio_X' on the wire. The incoming strip only removes ONE prefix, which still worked on first call, but on subsequent calls the model pattern-matched the single-prefixed form from message history and produced names that stripped to 'composio_X' — registry miss, dispatch fail. The history-rewrite block (#4) already has this guard. Apply the same guard to the schema-rewrite block (#3) so round-trip is symmetric. Added 4 outgoing-side tests. Existing 7 incoming-side tests still pass. Author map: hayka-pacha added for PR #25270 salvage attribution. Refs GH-25255.	11 天前
cli	fix(cli): decouple tool_progress=verbose from global DEBUG logging (#31379) PR #6a1aa420e coupled `display.tool_progress: verbose` (a per-tool display toggle for full args / results / think blocks) to `self.verbose` — which controls root-logger DEBUG level. Result: setting tool_progress: verbose in config silently flipped every module in the process to DEBUG and flooded the terminal with internal logging, far beyond just full tool calls. The two concepts are separate: - `tool_progress_mode == 'verbose'` → display behavior (tool rendering) - `self.verbose` → logging behavior (root logger → DEBUG, line 9795) This change keeps PR #6a1aa420e's argparse.SUPPRESS / config-fallback plumbing but severs the verbose-display → debug-logging link. Changes: - cli.py:2868 — `self.verbose` only follows explicit `verbose=` arg; no longer auto-True when tool_progress_mode == 'verbose'. - cli.py:_toggle_verbose — slash-cycle through tool progress modes no longer flips `self.verbose` / `agent.verbose_logging` / `agent.quiet_mode`. - cli.py:9355 — fix misleading label (drop 'and debug logs'). - tui_gateway/server.py:_make_agent — same decoupling on the TUI side (verbose_logging no longer derived from tool_progress_mode). - tests/cli/test_tool_progress_scrollback.py — invert the test that asserted the broken coupling; add coverage for explicit `--verbose` still enabling DEBUG independent of tool_progress. Live verified: - tool_progress: verbose, no --verbose flag → 0 DEBUG/INFO log lines - --verbose flag explicit → 32 DEBUG/INFO log lines (as expected)	11 天前
cron	Fix unsafe gateway media path delivery	12 天前
e2e	refactor(gateway): migrate Discord adapter to bundled plugin (full Teams parity) First migration of an existing built-in platform adapter to the plugin system established by IRC / Teams / LINE / Google Chat. Closes #24325; advances the umbrella refactor in #3823. Matches Teams' shape exactly — adapter under `plugins/platforms/discord/` with the standard `__init__.py` / `adapter.py` / `plugin.yaml` shell, `register(ctx)` entry point, no back-compat shim at the old import path, and full parity for the four hooks Teams uses plus the `apply_yaml_config_fn` hook that landed in #25443 (the Discord plugin is the first consumer of that hook): * `standalone_sender_fn` — out-of-process cron delivery via REST API * `setup_fn` — interactive `hermes setup gateway` wizard * `apply_yaml_config_fn` — translate `config.yaml` `discord:` keys into `DISCORD_` env vars (replaces the hardcoded block in `gateway/config.py`) `is_connected` — declares connection state from `DISCORD_BOT_TOKEN` * `check_fn` — lazy-installs `discord.py` on demand * plus `allowed_users_env`, `allow_all_env`, `cron_deliver_env_var`, `max_message_length`, `emoji`, `required_env`, `install_hint` * `gateway/platforms/discord.py` (5,101 LOC) → `plugins/platforms/discord/adapter.py` (git rename, R090). * New `plugins/platforms/discord/{__init__.py, plugin.yaml}` with `requires_env` / `optional_env` declarations. * Append `register(ctx)` block + new hook implementations (`_standalone_send`, `interactive_setup`, `_apply_yaml_config`, `_clean_discord_user_ids`, `_is_connected`, `_build_adapter`, plus helpers `_DISCORD_CHANNEL_TYPE_PROBE_CACHE` etc.) to the adapter. * Replace the `Platform.DISCORD elif` branch in `GatewayRunner._create_adapter()` (−9 LOC) with a generic post-creation hook (+6 LOC) in the registry path: any plugin adapter that declares a `gateway_runner` attribute now gets it auto-injected. Webhook's built-in branch is unchanged (it doesn't go through the registry path). * Move `_send_discord` (190 LOC) and helpers (`_DISCORD_CHANNEL_TYPE_PROBE_CACHE`, `_remember_channel_is_forum`, `_probe_is_forum_cached`, `_derive_forum_thread_name`) from `tools/send_message_tool.py` into the plugin as `_standalone_send`. * Wire via `standalone_sender_fn=_standalone_send` (Teams pattern; same gap fixed in #21804 for other plugin platforms). * Replace the Discord `elif` in `tools/send_message_tool.py` `_send_to_platform` with a 10-line registry-hook dispatch. * Drop the `DiscordAdapter` import and the `Platform.DISCORD: DiscordAdapter.MAX_MESSAGE_LENGTH` `_MAX_LENGTHS` entry — the registry's `max_message_length=2000` covers it. * Move `_setup_discord` and `_clean_discord_user_ids` (68 LOC) from `hermes_cli/setup.py` into the plugin as `interactive_setup`. * Wire via `setup_fn=interactive_setup`. CLI helpers (`prompt`, `print_info`, etc.) are lazy-imported so the plugin's module-load surface stays minimal. * Remove `"discord": _s._setup_discord` from `hermes_cli/gateway.py::_builtin_setup_fn`. * Remove the entire 32-line `_PLATFORMS["discord"]` static dict entry — Discord's setup metadata is now discovered dynamically via `_all_platforms()` from the registry entry. * Move the 59-line `discord_cfg` YAML→env bridge from `gateway/config.py::load_gateway_config()` into the plugin as `_apply_yaml_config`. Covers `require_mention`, `thread_require_mention`, `free_response_channels`, `auto_thread`, `reactions`, `ignored_channels`, `allowed_channels`, `no_thread_channels`, ``allow_mentions.{everyone,roles,users, replied_user}`, and` reply_to_mode`` (including the YAML 1.1 `off`-as-False coercion and the `extra.reply_to_mode` fallback). * Wire via `apply_yaml_config_fn=_apply_yaml_config`. * The hook runs BEFORE `_apply_env_overrides` and after the generic shared-key loop, exactly as documented in `website/docs/developer-guide/adding-platform-adapters.md`. * Behavior is preserved exactly — every assignment still uses `not os.getenv(...)` guards so env vars take precedence over YAML. All 78 references to the old import path are rewritten — no back-compat shim: * 51 `from gateway.platforms.discord import X` → `from plugins.platforms.discord.adapter import X` * 5 `import gateway.platforms.discord as discord_platform` → `import plugins.platforms.discord.adapter as discord_platform` * 1 `from gateway.platforms import discord as discord_mod` → `from plugins.platforms.discord import adapter as discord_mod` * 21 `mock.patch("gateway.platforms.discord.X")` strings → `mock.patch("plugins.platforms.discord.adapter.X")` * 1 docstring reference in `hermes_cli/commands.py` * 1 import in `tools/send_message_tool.py` (now removed entirely) The import-safety test in `tests/gateway/test_discord_imports.py` is updated to purge the new canonical module name from `sys.modules`. 38 files changed, +621 / −473 — net positive due to the YAML hook implementation (89 new LOC in the plugin trading for 59 deleted in core), but every line moved has a clear plugin home now. The git rename is detected at R090 because the adapter gained ~340 LOC of moved-in hook implementations (`_standalone_send` + `interactive_setup` + `_apply_yaml_config` + helpers). * All 568 Discord-specific tests pass across 25 `test_discord_.py` files plus voice/send/text-batching/reload-skills/stream-consumer/ integration tests. All 147 tests in the YAML-touching subset (`test_discord_reply_mode`, `test_discord_free_response`, `test_discord_allowed_channels`, `test_discord_allowed_mentions`, `test_discord_channel_controls`, `test_discord_reactions`, `test_discord_thread_persistence`, `test_runtime_footer`) pass — this is the strongest signal that the YAML→env hook behaves identically to the legacy block. * Broader gateway/cron/integration sweep (1297 tests) introduces zero new failures vs `main`. Pre-existing failures in `tests/gateway/test_tts_media_routing.py` and `tests/e2e/test_platform_commands.py` reproduce identically on the unchanged `main` revision. * Plugin discovery sanity check confirms Discord registers alongside the other four platform plugins: Registered platforms: ['discord', 'google_chat', 'irc', 'line', 'teams'] These Discord-shaped tendrils in core were deliberately not moved — they are generic platform-registry concerns affecting every platform, not Discord-specific: * `gateway/config.py:1205` `DISCORD_BOT_TOKEN → config.token` env enablement — same shape Telegram has. The existing `env_enablement_fn` registry hook only seeds `extra`, not `.token`, so it can't replace this without an adapter refactor to read from `extra["bot_token"]`. * `gateway/run.py` voice-mode hooks (`self.adapters.get(Platform.DISCORD)` for `start_voice_mode`/`stop_voice_mode`), role-based auth, `DISCORD_ALLOW_BOTS` branch in `_is_user_authorized`, `_UPDATE_ALLOWED_PLATFORMS` frozenset, and the per-platform allowlist maps — generic platform-registry concerns. * `Platform.DISCORD` enum literal — stable identifier used as dict keys throughout the codebase; removing it is a separate refactor with no real benefit. * `tools/discord_tool.py` and `tools/environments/local.py` — first-class agent tools and env-passthrough config, neither is the gateway adapter. Each of these is worth its own scoping issue when the time comes.	13 天前
fakes	fix: streaming tool call parsing, error handling, and fake HA state mutation - Fix Gemini streaming tool call merge bug: multiple tool calls with same index but different IDs are now parsed as separate calls instead of concatenating names (e.g. ha_call_serviceha_call_service) - Handle partial results in voice mode: show error and stop continuous mode when agent returns partial/failed results with empty response - Fix error display during streaming TTS: error messages are shown in full response box even when streaming box was already opened - Add duplicate sentence filter in TTS: skip near-duplicate sentences from LLM repetition - Fix fake HA server state mutation: turn_on/turn_off/set_temperature correctly update entity states; temperature sensor simulates change when thermostat is adjusted	2 个月前
gateway	test(telegram): add brand-new-topic regression for #31086 The cherry-picked fix from #28605 inverts an existing test (an unknown non-lobby thread_id no longer rewrites to the most-recent binding), but that test only seeds two bindings and queries a third thread_id. Add a second regression test that more closely mirrors the live failure mode: seed exactly one prior binding, then query a brand-new thread_id and assert recovery returns None — so the new topic is allowed to get its own session row instead of being silently merged into the previous topic's session. Co-authored-by: Fábio Siqueira <fabioxxx@gmail.com> Co-authored-by: dillweed <dillweed@users.noreply.github.com>	11 天前
hermes_cli	feat(security): on-demand supply-chain audit via OSV.dev (#31460) Adds 'hermes security audit' — a one-shot vulnerability scan against OSV.dev covering three surfaces a Hermes user actually controls: 1. The running Python's installed PyPI dists (importlib.metadata) 2. Plugin requirements.txt / pyproject.toml pins under ~/.hermes/plugins/ 3. Pinned npx/uvx MCP servers in config.yaml Zero new dependencies (stdlib urllib + importlib.metadata + tomllib + concurrent.futures). No auth required for OSV's public batch API. Flags: --json, --fail-on {low,moderate,high,critical} (default: critical), --skip-venv, --skip-plugins, --skip-mcp Output groups findings by source, sorts by severity descending, surfaces fixed-versions inline. Exit 1 when any finding meets the --fail-on tier. Deliberately out of scope: globally-installed pip/npm, editor/browser extensions, daily background scans, auto-blocking of installs. The audit is on-demand by design — daily scans become noise the user trains themselves to ignore.	11 天前
hermes_state	feat(session_search): single-shape tool with discovery, scroll, browse — no LLM (#27590) * feat(session_search): single-shape tool with discovery, scroll, browse — no LLM Replaces the LLM-summarized session_search with a single-shape tool that returns actual messages from the DB. Three calling shapes inferred from args (no mode parameter): 1. Discovery — pass query. FTS5 + anchored ±5 window + bookends per hit, all in one call. ~20ms on a real DB instead of ~90s for the previous three aux-LLM calls. 2. Scroll — pass session_id + around_message_id. Returns a window centered on the anchor. To paginate, re-anchor on the first/last id of the returned window. Boundary message appears in both windows as the orientation marker. ~1ms per scroll call. 3. Browse — no args. Recent sessions chronologically. Bookend_start (first 3 user+assistant msgs) and bookend_end (last 3) give the agent goal + resolution on every discovery hit, so a single tool call reconstructs a long session's arc without loading the whole transcript. The aux-LLM summary path is gone: it cost ~$0.30/call, took ~30s, and laundered FTS5 hits through a model that could confabulate when the right session wasn't in the hit list. The merged shape returns byte-for-byte content from SQLite. History: - PR #20238 (JabberELF) seeded the fast/summary dual-mode split. - PR #26419 (yoniebans) expanded to fast/guided/summary with bookends, multi-anchor drill-down, default-mode config, and a teaching skill. This PR collapses that toolkit into one shape with explicit scroll support, drops the summary path, drops the mode parameter, drops the config knob, drops the skill. JabberELF's seed work is acknowledged via the AUTHOR_MAP entry. Validation: - 38/38 tool tests pass (tests/tools/test_session_search.py) - 12/12 get_messages_around tests pass (tests/hermes_state/) - 11/11 get_anchored_view tests pass (tests/hermes_state/) - Full tests/tools/ run: 5168 passing, 2 failures pre-exist on main (test ordering in test_delegate.py, unrelated) - E2E against live state DB: discovery 20ms, scroll 1ms, browse 280ms; pagination forward+backward works with boundary-message orientation; error paths return clean tool_error responses Co-authored-by: JabberELF <abcdjmm970703@gmail.com> Co-authored-by: yoniebans <jonny@nousresearch.com> * chore(session_search): prune dead LLM-summary config and docs Companion to the single-shape rewrite. The auxiliary.session_search config block, max_concurrency / extra_body tunables, and matching docs sections all referenced the removed LLM summarization path. Removing them so users don't try to tune knobs that nothing reads. - hermes_cli/config.py: drop dead auxiliary.session_search block from DEFAULT_CONFIG. Leftover keys in user config.yaml are harmless and ignored. - hermes_cli/tips.py: drop two tips referencing the removed max_concurrency / extra_body knobs. - website/docs/user-guide/configuration.md: drop 'Session Search Tuning' section and the auxiliary.session_search block from the example. - website/docs/user-guide/features/fallback-providers.md: drop session_search rows from the auxiliary-tasks tables and the dedicated tuning subsection. - website/docs/reference/tools-reference.md: rewrite the session_search entry to describe the new three-shape behaviour. - CONTRIBUTING.md: update the file-tree description. - tests/tools/test_llm_content_none_guard.py: remove TestSessionSearchContentNone class and test_session_search_tool_guarded — both guard against an unguarded .content.strip() call site in _summarize_session() that no longer exists. Validation: 97/97 targeted tests still pass (hermes_state + session_search + llm_content_none_guard). Config tests 55/55. --------- Co-authored-by: JabberELF <abcdjmm970703@gmail.com> Co-authored-by: yoniebans <jonny@nousresearch.com>	17 天前
honcho_plugin	chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) Six days after #23937 (608 fixes) the codebase had accumulated 241 new PLR6201 violations. Same mechanical `x in (...)` → `x in {...}` fix, same zero-risk profile: set lookup is O(1) vs O(n) for tuple and the two are semantically equivalent for hashable scalar membership tests. All 241 instances fixed via `ruff check --select PLR6201 --fix --unsafe-fixes`, zero remaining. Every changed value is a hashable scalar (str/int/None/enum/signal); no risk of unhashable runtime errors. No behavior change. Test plan: - 119 files changed, +244/-244 (net zero) — exactly one-line edits - `ruff check` clean afterward - Compile checks pass on the largest touched files (cli.py, run_agent.py, gateway/run.py, gateway/platforms/discord.py, model_tools.py) - Subset broad test run on tests/gateway/ tests/hermes_cli/ tests/agent/ tests/tools/: 18187 passed, 59 pre-existing failures (verified against origin/main with the same shape — identical failure count, identical category — all xdist test-order flakes unrelated to this change) Follows the same template as PR #23937 ([tracker: #23972](https://github.com/NousResearch/hermes-agent/issues/23972)).	18 天前
integration	refactor(gateway): migrate Discord adapter to bundled plugin (full Teams parity) First migration of an existing built-in platform adapter to the plugin system established by IRC / Teams / LINE / Google Chat. Closes #24325; advances the umbrella refactor in #3823. Matches Teams' shape exactly — adapter under `plugins/platforms/discord/` with the standard `__init__.py` / `adapter.py` / `plugin.yaml` shell, `register(ctx)` entry point, no back-compat shim at the old import path, and full parity for the four hooks Teams uses plus the `apply_yaml_config_fn` hook that landed in #25443 (the Discord plugin is the first consumer of that hook): * `standalone_sender_fn` — out-of-process cron delivery via REST API * `setup_fn` — interactive `hermes setup gateway` wizard * `apply_yaml_config_fn` — translate `config.yaml` `discord:` keys into `DISCORD_` env vars (replaces the hardcoded block in `gateway/config.py`) `is_connected` — declares connection state from `DISCORD_BOT_TOKEN` * `check_fn` — lazy-installs `discord.py` on demand * plus `allowed_users_env`, `allow_all_env`, `cron_deliver_env_var`, `max_message_length`, `emoji`, `required_env`, `install_hint` * `gateway/platforms/discord.py` (5,101 LOC) → `plugins/platforms/discord/adapter.py` (git rename, R090). * New `plugins/platforms/discord/{__init__.py, plugin.yaml}` with `requires_env` / `optional_env` declarations. * Append `register(ctx)` block + new hook implementations (`_standalone_send`, `interactive_setup`, `_apply_yaml_config`, `_clean_discord_user_ids`, `_is_connected`, `_build_adapter`, plus helpers `_DISCORD_CHANNEL_TYPE_PROBE_CACHE` etc.) to the adapter. * Replace the `Platform.DISCORD elif` branch in `GatewayRunner._create_adapter()` (−9 LOC) with a generic post-creation hook (+6 LOC) in the registry path: any plugin adapter that declares a `gateway_runner` attribute now gets it auto-injected. Webhook's built-in branch is unchanged (it doesn't go through the registry path). * Move `_send_discord` (190 LOC) and helpers (`_DISCORD_CHANNEL_TYPE_PROBE_CACHE`, `_remember_channel_is_forum`, `_probe_is_forum_cached`, `_derive_forum_thread_name`) from `tools/send_message_tool.py` into the plugin as `_standalone_send`. * Wire via `standalone_sender_fn=_standalone_send` (Teams pattern; same gap fixed in #21804 for other plugin platforms). * Replace the Discord `elif` in `tools/send_message_tool.py` `_send_to_platform` with a 10-line registry-hook dispatch. * Drop the `DiscordAdapter` import and the `Platform.DISCORD: DiscordAdapter.MAX_MESSAGE_LENGTH` `_MAX_LENGTHS` entry — the registry's `max_message_length=2000` covers it. * Move `_setup_discord` and `_clean_discord_user_ids` (68 LOC) from `hermes_cli/setup.py` into the plugin as `interactive_setup`. * Wire via `setup_fn=interactive_setup`. CLI helpers (`prompt`, `print_info`, etc.) are lazy-imported so the plugin's module-load surface stays minimal. * Remove `"discord": _s._setup_discord` from `hermes_cli/gateway.py::_builtin_setup_fn`. * Remove the entire 32-line `_PLATFORMS["discord"]` static dict entry — Discord's setup metadata is now discovered dynamically via `_all_platforms()` from the registry entry. * Move the 59-line `discord_cfg` YAML→env bridge from `gateway/config.py::load_gateway_config()` into the plugin as `_apply_yaml_config`. Covers `require_mention`, `thread_require_mention`, `free_response_channels`, `auto_thread`, `reactions`, `ignored_channels`, `allowed_channels`, `no_thread_channels`, ``allow_mentions.{everyone,roles,users, replied_user}`, and` reply_to_mode`` (including the YAML 1.1 `off`-as-False coercion and the `extra.reply_to_mode` fallback). * Wire via `apply_yaml_config_fn=_apply_yaml_config`. * The hook runs BEFORE `_apply_env_overrides` and after the generic shared-key loop, exactly as documented in `website/docs/developer-guide/adding-platform-adapters.md`. * Behavior is preserved exactly — every assignment still uses `not os.getenv(...)` guards so env vars take precedence over YAML. All 78 references to the old import path are rewritten — no back-compat shim: * 51 `from gateway.platforms.discord import X` → `from plugins.platforms.discord.adapter import X` * 5 `import gateway.platforms.discord as discord_platform` → `import plugins.platforms.discord.adapter as discord_platform` * 1 `from gateway.platforms import discord as discord_mod` → `from plugins.platforms.discord import adapter as discord_mod` * 21 `mock.patch("gateway.platforms.discord.X")` strings → `mock.patch("plugins.platforms.discord.adapter.X")` * 1 docstring reference in `hermes_cli/commands.py` * 1 import in `tools/send_message_tool.py` (now removed entirely) The import-safety test in `tests/gateway/test_discord_imports.py` is updated to purge the new canonical module name from `sys.modules`. 38 files changed, +621 / −473 — net positive due to the YAML hook implementation (89 new LOC in the plugin trading for 59 deleted in core), but every line moved has a clear plugin home now. The git rename is detected at R090 because the adapter gained ~340 LOC of moved-in hook implementations (`_standalone_send` + `interactive_setup` + `_apply_yaml_config` + helpers). * All 568 Discord-specific tests pass across 25 `test_discord_.py` files plus voice/send/text-batching/reload-skills/stream-consumer/ integration tests. All 147 tests in the YAML-touching subset (`test_discord_reply_mode`, `test_discord_free_response`, `test_discord_allowed_channels`, `test_discord_allowed_mentions`, `test_discord_channel_controls`, `test_discord_reactions`, `test_discord_thread_persistence`, `test_runtime_footer`) pass — this is the strongest signal that the YAML→env hook behaves identically to the legacy block. * Broader gateway/cron/integration sweep (1297 tests) introduces zero new failures vs `main`. Pre-existing failures in `tests/gateway/test_tts_media_routing.py` and `tests/e2e/test_platform_commands.py` reproduce identically on the unchanged `main` revision. * Plugin discovery sanity check confirms Discord registers alongside the other four platform plugins: Registered platforms: ['discord', 'google_chat', 'irc', 'line', 'teams'] These Discord-shaped tendrils in core were deliberately not moved — they are generic platform-registry concerns affecting every platform, not Discord-specific: * `gateway/config.py:1205` `DISCORD_BOT_TOKEN → config.token` env enablement — same shape Telegram has. The existing `env_enablement_fn` registry hook only seeds `extra`, not `.token`, so it can't replace this without an adapter refactor to read from `extra["bot_token"]`. * `gateway/run.py` voice-mode hooks (`self.adapters.get(Platform.DISCORD)` for `start_voice_mode`/`stop_voice_mode`), role-based auth, `DISCORD_ALLOW_BOTS` branch in `_is_user_authorized`, `_UPDATE_ALLOWED_PLATFORMS` frozenset, and the per-platform allowlist maps — generic platform-registry concerns. * `Platform.DISCORD` enum literal — stable identifier used as dict keys throughout the codebase; removing it is a separate refactor with no real benefit. * `tools/discord_tool.py` and `tools/environments/local.py` — first-class agent tools and env-passthrough config, neither is the gateway adapter. Each of these is worth its own scoping issue when the time comes.	13 天前
openviking_plugin	fix(openviking): pre-check fs/stat to route file URIs before hitting directory-only endpoints Adds a deterministic pre-check on top of htsh's exception-based fallback: before calling /content/abstract or /content/overview on a non-pseudo URI, probe /api/v1/fs/stat. If the server says the URI is a file, route straight to /content/read instead of eating a failing 500 round-trip. This is the same idea pty819 and chennest independently landed in PRs #12757 and #12937 — merged here on top of htsh's broader fix so we keep pseudo-URI normalization and v0.3.3 browse-shape handling while avoiding the slow exception path on servers that return a raised 500 every time. The exception fallback from #5886 stays in place for environments where fs/stat is unavailable or returns an unfamiliar shape. Also credits pty819, chennest, and htsh in AUTHOR_MAP so future release notes attribute them correctly.	1 个月前
plugins	fix(opencode-go): emit Kimi reasoning_effort, match KimiProfile shape The Kimi K2 branch added in the prior commit only emitted extra_body.thinking and dropped reasoning_effort entirely. KimiProfile (api.moonshot.ai/v1) sends both fields, and OpenCode Go proxies to the same Moonshot backend. Mirror that shape on the Go path so /reasoning effort actually reaches Kimi. - low/medium/high pass through verbatim - xhigh/max clamp to high (Moonshot's max supported value) - minimal / unknown effort → omit reasoning_effort, keep thinking on - disabled / no config → unchanged - DeepSeek branch unchanged	12 天前
providers	fix(custom): pass custom provider extra body Allow custom OpenAI-compatible providers declared under `custom_providers:` to set provider-specific `extra_body` fields and have Hermes merge them into chat-completions requests when the matching custom endpoint is active. This is a manual per-provider override rather than a model-name heuristic. OpenAI-compatible Gemma thinking support is real, but the on-wire payload shape is backend-specific: some servers want top-level `enable_thinking`, while vLLM Gemma and NIM-style endpoints expect `chat_template_kwargs`. A per-provider override is safer than picking one assumed payload. Example config: ```yaml custom_providers: - name: gemma-local base_url: http://localhost:8080/v1 model: google/gemma-4-31b-it extra_body: enable_thinking: true reasoning_effort: high ``` For vLLM Gemma or NIM-style endpoints, use the nested shape those servers expect: ```yaml extra_body: chat_template_kwargs: enable_thinking: true ``` Changes: - `hermes_cli/config.py`: preserve `extra_body` in normalized `custom_providers:` entries and allow it in the validated field set. - `hermes_cli/runtime_provider.py`: propagate custom-provider `extra_body` as `request_overrides.extra_body` for named custom runtime resolution, including credential-pool paths. - `agent/agent_init.py`: at agent init, locate the matching custom-provider entry by `base_url` (+ optional model) and merge its `extra_body` into `AIAgent.request_overrides`, with caller-provided overrides winning on conflicting top-level keys. - `plugins/model-providers/custom/__init__.py`: keep existing CustomProfile behavior (Ollama `num_ctx`, `think=False` when reasoning disabled); user-configured `extra_body` flows through `request_overrides`. - `website/docs/integrations/providers.md`: document the explicit `extra_body` override and the vLLM/Gemma `chat_template_kwargs` variant. - Tests cover config normalization, runtime propagation, model matching, trailing-slash equivalence, fallback when no `model` field is set, and caller-override merging precedence. Verified end-to-end against `CustomProfile` via `ChatCompletionsTransport`: configured `extra_body` reaches `kwargs.extra_body` on the wire request, and coexists with profile-generated entries (Ollama `num_ctx`, `think=False`) without clobber. Salvaged from #29022 onto current `main`. Cosmetic typing edit in `plugins/model-providers/custom/__init__.py` and a stale-base docs revert in `providers.md` were dropped during cherry-pick. Closes #29022	14 天前
run_agent	fix(agent): abort on HTTP 402 after pool rotation and fallback fail (#31443) Closes #31273. HTTP 402 (insufficient credits) was retried up to agent.api_max_retries times (default 3), burning paid requests against an exhausted balance. Real-world impact: ~$40 in 48h on a 24/7 Telegram+Discord gateway. Root cause: FailoverReason.billing was in the is_client_error exclusion set in agent/conversation_loop.py, which prevents the non-retryable-abort branch from firing. By the time control reaches that predicate: * credential-pool rotation has already run for billing and either continued the loop or returned False (pool exhausted/absent) * the eager-fallback branch has also fired on billing and either continued the loop or fell through (no fallback configured) Falling through to the backoff retry from here has no recovery mechanism left — it just burns more paid requests. Removing billing from the exclusion set makes 402 abort cleanly once pool+fallback recovery has failed, mirroring how 401/403 (also should_fallback=True) already behave. Added tests/run_agent/test_31273_402_not_retried.py which mirrors the is_client_error predicate shape from the source and asserts the invariant (plus a source-inspection guard against accidental re-introduction).	11 天前
scripts	feat(acp-registry): switch to uvx distribution, drop npm launcher The ACP Registry schema supports uvx as a first-class distribution method alongside npx and binary. Pointing the registry directly at the existing hermes-agent PyPI release removes: - the @nousresearch npm scope (we don't own it) - a separate npm publish step on every weekly release - 90 lines of Node launcher + tests in packages/hermes-agent-acp/ The Zed registry now installs Hermes via: uvx --from 'hermes-agent[acp]==<version>' hermes-acp This is the same command the npm launcher was shelling out to anyway, so end-user behavior is unchanged. Registry CI validates the PyPI URL + version-pin exact match automatically. Changes: - acp_registry/agent.json: distribution.npx -> distribution.uvx - delete packages/hermes-agent-acp/ entirely - scripts/release.py: drop npm-launcher bump paths, keep manifest lockstep - tests/acp/test_registry_manifest.py: assert uvx shape + version pin - tests/scripts/test_release_acp_registry.py: rewrite for uvx-only shape - docs (user-guide + dev-guide): drop all npm-launcher references - delete docs/plans/acp-registry-zed-integration.md (stale, npm-shaped) Validated against agentclientprotocol/registry agent.schema.json via jsonschema. hermes-agent==0.13.0 is already live on PyPI.	20 天前
skills	fix(skills): add timeout to Google OAuth urlopen calls	16 天前
stress	docs: align kanban readiness docs and smoke tests Salvages #28199 by @bensargotest-sys. Aligns Kanban docs with current tool registration: dispatcher-spawned task workers get task tools, profiles that explicitly enable the kanban toolset get orchestrator routing tools (kanban_list, kanban_unblock). Corrects failure-limit text to current default of 2. Hardens the e2e subprocess script to resolve repo root and use the spawnable default assignee. Updates the diagnostics severity fixture to assert error below the critical threshold.	16 天前
tools	fix(tests): align CI tests with recent security hardening (#31470) Four recent security PRs landed on main with stale/missing test updates, breaking 4 test shards on every subsequent PR's CI run: - test_discord_bot_auth_bypass.py (PR #30742 c3caca658): DISCORD_ALLOWED_ROLES no longer bypasses _is_user_authorized. Inverted 3 tests to assert the new (correct) behavior: role config alone does NOT authorize at the gateway layer. - test_msgraph_webhook.py (PR #30169 4ca77f105): adapter.is_connected is a @property, not a method. Test was calling it with () after the connect() change; TypeError: 'bool' is not callable. Removed the parens. - test_feishu_approval_buttons.py (PR #30744 bdb97b857): Card-action callbacks now go through _allow_group_message authorization. 3 tests in TestCardActionCallbackResponse didn't populate adapter._allowed_group_users so the operator's open_id got rejected. Added the allowlist setup to each test, matching the existing pattern in test_returns_card_for_approve_action. Also raise tolerance on test_wait_for_process_kills_subprocess_on_keyboardinterrupt: the SIGTERM → 3s TimeoutStopSec → SIGKILL → reap chain can exceed 10s under loaded xdist (40 workers). Bumped _wait_for_pgid_exit timeout 10→30s and worker join timeout 5→15s. Passes 100% in isolation already; this just makes it tolerant of CI-host load. Validation: 270/270 tests pass across the 5 affected files.	11 天前
tui_gateway	chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) Six days after #23937 (608 fixes) the codebase had accumulated 241 new PLR6201 violations. Same mechanical `x in (...)` → `x in {...}` fix, same zero-risk profile: set lookup is O(1) vs O(n) for tuple and the two are semantically equivalent for hashable scalar membership tests. All 241 instances fixed via `ruff check --select PLR6201 --fix --unsafe-fixes`, zero remaining. Every changed value is a hashable scalar (str/int/None/enum/signal); no risk of unhashable runtime errors. No behavior change. Test plan: - 119 files changed, +244/-244 (net zero) — exactly one-line edits - `ruff check` clean afterward - Compile checks pass on the largest touched files (cli.py, run_agent.py, gateway/run.py, gateway/platforms/discord.py, model_tools.py) - Subset broad test run on tests/gateway/ tests/hermes_cli/ tests/agent/ tests/tools/: 18187 passed, 59 pre-existing failures (verified against origin/main with the same shape — identical failure count, identical category — all xdist test-order flakes unrelated to this change) Follows the same template as PR #23937 ([tracker: #23972](https://github.com/NousResearch/hermes-agent/issues/23972)).	18 天前
website	docs(skills): explain restoring bundled skills	30 天前
__init__.py	A bit of restructuring for simplicity and organization	7 个月前
conftest.py	test: keep tirith checks hermetic	12 天前
run_interrupt_test.py	fix: thread safety for concurrent subagent delegation (#1672) * fix: thread safety for concurrent subagent delegation Four thread-safety fixes that prevent crashes and data races when running multiple subagents concurrently via delegate_task: 1. Remove redirect_stdout/stderr from delegate_tool — mutating global sys.stdout races with the spinner thread when multiple children start concurrently, causing segfaults. Children already run with quiet_mode=True so the redirect was redundant. 2. Split _run_single_child into _build_child_agent (main thread) + _run_single_child (worker thread). AIAgent construction creates httpx/SSL clients which are not thread-safe to initialize concurrently. 3. Add threading.Lock to SessionDB — subagents share the parent's SessionDB and call create_session/append_message from worker threads with no synchronization. 4. Add _active_children_lock to AIAgent — interrupt() iterates _active_children while worker threads append/remove children. 5. Add _client_cache_lock to auxiliary_client — multiple subagent threads may resolve clients concurrently via call_llm(). Based on PR #1471 by peteromallet. * feat: Honcho base_url override via config.yaml + quick command alias type Two features salvaged from PR #1576: 1. Honcho base_url override: allows pointing Hermes at a remote self-hosted Honcho deployment via config.yaml: honcho: base_url: "http://192.168.x.x:8000" When set, this overrides the Honcho SDK's environment mapping (production/local), enabling LAN/VPN Honcho deployments without requiring the server to live on localhost. Uses config.yaml instead of env var (HONCHO_URL) per project convention. 2. Quick command alias type: adds a new 'alias' quick command type that rewrites to another slash command before normal dispatch: quick_commands: sc: type: alias target: /context Supports both CLI and gateway. Arguments are forwarded to the target command. Based on PR #1576 by redhelix. --------- Co-authored-by: peteromallet <peteromallet@users.noreply.github.com> Co-authored-by: redhelix <redhelix@users.noreply.github.com>	2 个月前
test_account_usage.py	feat(account-usage): add per-provider account limits module Ports agent/account_usage.py and its tests from the original PR #2486 branch. Defines AccountUsageSnapshot / AccountUsageWindow dataclasses, a shared renderer, and provider-specific fetchers for OpenAI Codex (wham/usage), Anthropic OAuth (oauth/usage), and OpenRouter (/credits and /key). Wiring into /usage lands in a follow-up salvage commit. Authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	1 个月前
test_atomic_replace_symlinks.py	refactor: consolidate symlink-safe atomic replace into shared helper Extract the islink/realpath guard from the 16743 fix into a single atomic_replace() helper in utils.py, then migrate every os.replace() call site in the codebase to use it. The original PR #16777 correctly identified and fixed the bug, but only patched 9 of ~24 call sites. The same bug class (managed deployments that symlink state files silently losing the link on every write) still existed at auth.json, sessions file, gateway config, env_loader, webhook subscriptions, debug store, model catalog, pairing, google OAuth, nous rate guard, and more. Rather than add another 10+ copies of the same three-line guard, consolidate into atomic_replace(tmp, target) which: - resolves symlinks via os.path.realpath before os.replace - returns the resolved real path so callers can re-apply permissions - is a drop-in replacement for os.replace at the use sites Changes: - utils.py: new atomic_replace() helper + atomic_json_write / atomic_yaml_write now call it instead of inlining the guard - 16 files: all os.replace() call sites migrated to atomic_replace() - agent/{google_oauth, nous_rate_guard, shell_hooks}.py - cron/jobs.py - gateway/{pairing, session, platforms/telegram}.py - hermes_cli/{auth, config, debug, env_loader, model_catalog, webhook}.py - tools/{memory_tool, skill_manager_tool, skills_sync}.py Tests: tests/test_atomic_replace_symlinks.py pins the invariant for atomic_replace + atomic_json_write + atomic_yaml_write, covers plain files, first-time creates, broken symlinks, and permission preservation. Refs #16743 Builds on #16777 by @vominh1919.	1 个月前
test_base_url_hostname.py	security(runtime_provider): close OLLAMA_API_KEY substring-leak sweep miss (#13522) Two call sites still used a raw substring check to identify ollama.com: hermes_cli/runtime_provider.py:496: _is_ollama_url = "ollama.com" in base_url.lower() run_agent.py:6127: if fb_base_url_hint and "ollama.com" in fb_base_url_hint.lower() ... Same bug class as GHSA-xf8p-v2cg-h7h5 (OpenRouter substring leak), which was fixed in commit dbb7e00e via base_url_host_matches() across the codebase. The earlier sweep missed these two Ollama sites. Self-discovered during April 2026 security-advisory triage; filed as GHSA-76xc-57q6-vm5m. Impact is narrow — requires a user with OLLAMA_API_KEY configured AND a custom base_url whose path or look-alike host contains 'ollama.com'. Users on default provider flows are unaffected. Filed as a draft advisory to use the private-fork flow; not CVE-worthy on its own. Fix is mechanical: replace substring check with base_url_host_matches at both sites. Same helper the rest of the codebase uses. Tests: 67 -> 71 passing. 7 new host-matcher cases in tests/test_base_url_hostname.py (path injection, lookalike host, localtest.me subdomain, ollama.ai TLD confusion, localhost, genuine ollama.com, api.ollama.com subdomain) + 4 call-site tests in tests/hermes_cli/test_runtime_provider_resolution.py verifying OLLAMA_API_KEY is selected only when base_url actually targets ollama.com. Fixes GHSA-76xc-57q6-vm5m	1 个月前
test_batch_runner_checkpoint.py	test: regression coverage for checkpoint dedup and inf/nan coercion Covers the two bugs salvaged from PR #15161: - test_batch_runner_checkpoint: TestFinalCheckpointNoDuplicates asserts the final aggregated completed_prompts list has no duplicate indices, and keeps a sanity anchor test documenting the pre-fix pattern so a future refactor that re-introduces it is caught immediately. - test_model_tools: TestCoerceNumberInfNan asserts _coerce_number returns the original string for inf/-inf/nan/Infinity inputs and that the result round-trips through strict (allow_nan=False) json.dumps.	1 个月前
test_bitwarden_secrets.py	feat(secrets/bitwarden): EU Cloud + self-hosted server URL support (#31378) Closes #31370. bws defaults to the US identity endpoint, so EU Cloud and self-hosted machine-account tokens fail with [400 Bad Request] {"error":"invalid_client"} during 'hermes secrets bitwarden setup'. The token is valid — it's just being checked against the wrong region. Add a Bitwarden region step to the wizard between the access-token and project-list steps: Step 1 Install bws Step 2 Provide access token Step 3 Pick region <-- new (US / EU / self-hosted-custom-URL) Step 4 Pick project (now talks to the right endpoint) Step 5 Test fetch Region is stored in config.yaml as secrets.bitwarden.server_url and plumbed into every bws subprocess as BWS_SERVER_URL (project list, secret list, test fetch, and the env_loader startup pull). Also: - Non-interactive: 'hermes secrets bitwarden setup --server-url ...' - Pre-existing BWS_SERVER_URL in the shell is detected and reused - Cache key includes server_url so EU/US fetches don't collide - 'hermes secrets bitwarden status' shows the configured region - 'invalid_client' / '400 Bad Request' from bws now triggers a hint pointing at the region setting instead of looking like a bad token	11 天前
test_cli_file_drop.py	fix(tui): improve macOS paste and shortcut parity - support Cmd-as-super and readline-style fallback shortcuts on macOS - add layered clipboard/OSC52 paste handling and immediate image-path attach - add IDE terminal setup helpers, terminal parity hints, and aligned docs	1 个月前
test_cli_manual_compress.py	fix(tests): catch up six stale tests after compression/aux/kanban changes (#28465) - aux_config: drop session_search from _AUX_TASKS and remove stale test (PR #27590 removed auxiliary.session_search from DEFAULT_CONFIG) - compression_boundary_hook: set compressor._last_compress_aborted=False on MagicMock so the post-compress abort branch (PR #28117) doesn't short-circuit before the session-id rotation under test - kanban_dashboard_plugin: use consecutive_failures=3 so severity stays 'error' (failure_threshold default dropped from 3 to 2 in d9fef0c8a, so failures=5 now crosses the critical floor of 2*2=4) - cli_manual_compress: accept force kwarg on DummyAgent._compress_context (cli._manual_compress now passes force=True)	16 天前
test_cli_skin_integration.py	fix(ci): stabilize main test suite regressions (#17660) * fix: stabilize main test suite regressions * test(agent): update MiniMax normalization expectation * test: stabilize remaining CI assertions * test: harden config helper monkeypatching * test: harden CI-only assertions * fix(agent): propagate fast streaming interrupts	1 个月前
test_ctx_halving_fix.py	fix(cache): kill long-lived prefix layout — system prompt is now byte-static within a session (#24778) The long-lived prefix-cache layout split the system prompt into stable/ context/volatile blocks and re-derived them on every API call. The volatile tier (timestamp + memory snapshot + USER profile) ticks per turn, so the system message bytes mutated mid-conversation and broke upstream prompt caches (OpenRouter, Nous Portal, Anthropic). Diagnosed via live wire-format diffing: an 8-turn conversation showed OLD layout flipping system block[1] sha mid-session at the minute boundary, dropping cached_tokens to 0 on that turn (cumulative 66.6% vs 83.3% for the single-block layout). Hermes invariant: history (system + all but the last 1-2 messages) must be static. Fix: drop the long-lived layout entirely. Single layout everywhere — system_and_3 with one cached system string built once on first turn, replayed verbatim on every subsequent turn. Loses cross-session 1h prefix caching for Claude (the feature that motivated the split), but within-session caching now actually works on every provider. Removed: - run_agent.py: _use_long_lived_prefix_cache flag, _long_lived_cache_ttl, _supports_long_lived_anthropic_cache method, the long-lived branch in run_conversation, mark_tools_for_long_lived_cache call site - agent/prompt_caching.py: apply_anthropic_cache_control_long_lived, mark_tools_for_long_lived_cache, _mark_system_stable_block helper - hermes_cli/config.py: prompt_caching.long_lived_prefix and prompt_caching.long_lived_ttl config keys - tests/agent/test_prompt_caching_live.py (entire file) - tests/agent/test_prompt_caching.py: TestMarkToolsForLongLivedCache, TestApplyAnthropicCacheControlLongLived - tests/run_agent/test_anthropic_prompt_cache_policy.py: TestSupportsLongLivedAnthropicCache Targeted tests: 62/62 pass.	23 天前
test_empty_model_fallback.py	fix: fall back to provider's default model when model config is empty (#8303) When a user configures a provider (e.g. `hermes auth add openai-codex`) but never selects a model via `hermes model`, the gateway and CLI would pass an empty model string to the API, causing: 'Codex Responses request model must be a non-empty string' Now both gateway (_resolve_session_agent_runtime) and CLI (_ensure_runtime_credentials) detect an empty model and fill it from the provider's first catalog entry in _PROVIDER_MODELS. This covers all providers that have a static model list (openai-codex, anthropic, gemini, copilot, etc.). The fix is conservative: it only triggers when model is truly empty and a known provider was resolved. Explicit model choices are never overridden.	1 个月前
test_env_loader_secret_sources.py	feat(secrets): label detected credentials with their source (Bitwarden) (#30364) When Bitwarden Secrets Manager supplies a provider key, 'hermes model' and the setup wizard show 'credentials ✓' with no hint of where the key came from — identical to the .env case. Users assume the integration isn't wired up and re-enter the key (or hit Enter and cancel). env_loader now tracks which env vars were injected by an external secret source and exposes get_secret_source() / format_secret_source_suffix() so the provider flows can render 'Anthropic credentials: sk-ant-... ✓ (from Bitwarden)' instead of an unlabeled checkmark. Wired into _prompt_api_key (kimi, z.ai, minimax, opencode, ...), the Anthropic provider flow, the Bedrock flow, and the GitHub Copilot token display. Future secret sources (Vault, 1Password, etc.) drop in by setting their own label in _SECRET_SOURCES; format_secret_source_suffix() has a generic fallback so no call sites need updating.	13 天前
test_evidence_store.py	feat: add OSS Security Forensics skill (Skills Hub) (#1482) * feat: add OSS Security Forensics skill (Skills Hub) Salvaged from PR #1066 by zagiscoming. Adds a 7-phase multi-agent investigation framework for GitHub supply chain attack forensics. Skill contents (optional-skills/security/oss-forensics/): - SKILL.md: 420-line investigation framework with 8 anti-hallucination guardrails, 5 specialist investigators, ethical use guidelines, and API rate limiting guidance - evidence-store.py: CLI evidence manager with add/list/verify/query/ export/summary + SHA-256 integrity + chain of custody - references/: evidence types, GH Archive BigQuery guide (expanded with 12 event types and 6 query templates), recovery techniques (4 methods), investigation templates (5 attack patterns) - templates/: forensic report template (151 lines), malicious package report template Changes from original PR: - Dropped unrelated core tool changes (delegate_tool.py role parameter, AGENTS.md, README.md modifications) - Removed duplicate skills/security/oss-forensics/ placement - Fixed github-archive-guide.md (missing from optional-skills/, expanded from 33 to 160+ lines with all 12 event types and query templates) - Added ethical use guidelines and API rate limiting sections - Rewrote tests to match the v2 evidence store API (12 tests, all pass) Closes #384 * fix: use python3 and SKILL_DIR paths throughout oss-forensics skill - Replace all 'python' invocations with 'python3' for portability (Ubuntu doesn't ship 'python' by default) - Replace relative '../scripts/' and '../templates/' paths with SKILL_DIR/scripts/ and SKILL_DIR/templates/ convention - Add path convention note before Phase 0 explaining SKILL_DIR - Fix double --- separator (cosmetic) - Applies to SKILL.md, evidence-store.py docstring, recovery-techniques.md, and forensic-report.md template --------- Co-authored-by: zagiscoming <zagiscoming@users.noreply.github.com>	2 个月前
test_gateway_streaming_nested_config.py	fix(gateway): load streaming config from nested gateway.streaming key `hermes config set gateway.streaming.*` writes the streaming block nested under a `gateway:` key in config.yaml, but the config loader only checked for a top-level `streaming:` key — silently ignoring the nested variant. Fall back to `yaml_cfg['gateway']['streaming']` when the top-level key is absent, matching the pattern already used for other nested config sections. Closes #25676	21 天前
test_get_tool_definitions_cache_isolation.py	fix(tools): isolate get_tool_definitions quiet_mode cache + dedup LCM injection (#17335) Long-lived Gateway processes were sending duplicate tool names to providers that enforce uniqueness: - DeepSeek: 'Tool names must be unique.' - Xiaomi MiMo: 'tools contains duplicate names: lcm_expand' - Moonshot/Kimi: 'function name lcm_grep is duplicated' TUI was unaffected because TUI runs with quiet_mode=False and skips the cache entirely. Root cause (two layered bugs) - model_tools.get_tool_definitions(quiet_mode=True) memoizes its result in _tool_defs_cache. The cache-hit path returned list(cached) (safe), but the FIRST uncached call stored and returned the SAME object. run_agent.py mutates self.tools (memory + LCM context-engine schemas) in-place, so the very first agent init in a Gateway process poisoned the cache, and every subsequent init appended LCM schemas again on top of the already-polluted list. - run_agent.py's context-engine injection (lcm_grep / lcm_describe / lcm_expand) had no dedup, unlike the memory-tools injection right above it which already skips already-present names. Fix (defense in depth, per the issue's suggested fix) - model_tools.get_tool_definitions: on the uncached branch, cache the computed list but return list(result) to the caller. Same pattern as the cache-hit path. - run_agent.py: build _existing_tool_names from self.tools and skip schemas whose names are already present, mirroring the memory-tools block. This also defends against plugin paths that may register the same schemas via ctx.register_tool(). Tests (tests/test_get_tool_definitions_cache_isolation.py) - test_first_uncached_call_returns_fresh_list \u2014 pins the fix; without it, first-call alias caused all the symptoms. - test_cache_hit_returns_fresh_list \u2014 pre-existing behavior stays. - test_caller_mutation_does_not_poison_cache \u2014 simulates run_agent appending lcm_grep / lcm_expand to the returned list and asserts the next call doesn't see them. - test_repeated_caller_mutation_does_not_accumulate \u2014 reproduces the long-lived Gateway accumulation pattern across 5 agent inits. - test_non_quiet_mode_does_not_use_cache \u2014 sanity, explains why TUI was fine. 5/5 pass on the new file; 23/23 still pass on tests/test_model_tools.py.	1 个月前
test_hermes_bootstrap.py	fix(entry-points): guard hermes_bootstrap import so partial updates don't brick hermes (#22091) teknium1 hit ModuleNotFoundError: No module named 'hermes_bootstrap' after a code update, on both his Windows machine AND his Linux workstation. The failure mode is real and affects every user who updates hermes by any path OTHER than a fully-successful `hermes update`. ## What happens hermes_bootstrap.py is a top-level module registered via pyproject.toml's `py-modules` list (added by Brooklyn's Windows UTF-8 stdio work). It must be registered in the venv's editable-install .pth file before Python can find it as a bare `import hermes_bootstrap`. `hermes update` handles this correctly: (1) git reset --hard, (2) clear __pycache__, (3) uv pip install -e . (re-registers the package including the new py-modules list), (4) restart. BUT if any step AFTER (1) fails — network blip during pip install, PEP 668 on a system Python, venv locked, uv not in PATH, a crash mid-update — the user is left with new code that references hermes_bootstrap and a venv that doesn't know about it. Every hermes invocation after that crashes with ModuleNotFoundError, including `hermes update` itself. No recovery path without manual `uv pip install -e .`. Also affects users who `git pull` the repo directly without running hermes update — relatively common for developers. ## Fix Wrap `import hermes_bootstrap` in a try/except ModuleNotFoundError across all 6 entry points (hermes_cli/main, run_agent, gateway/run, acp_adapter/entry, cli, batch_runner). On Windows, missing bootstrap means the UTF-8 stdio setup doesn't run — degraded behavior (Unicode chars may fail to print) but NOT a crash. POSIX is unaffected either way since the bootstrap is a no-op there. Once hermes is running again, the user can `hermes update` to fully recover. ## Test update tests/test_hermes_bootstrap.py::test_entry_point_imports_bootstrap scans for the first top-level import in each entry point and asserts it is hermes_bootstrap. Extended the check to accept a Try block whose body is a lone Import of hermes_bootstrap — that's the recovery-friendly form we just introduced. Verified behavior by `mv hermes_bootstrap.py hermes_bootstrap.py.bak` and confirming `python -c "import hermes_cli.main"` succeeds. 82/82 tests pass (hermes_bootstrap + windows-native + windows-compat).	27 天前
test_hermes_constants.py	fix(security): guard os.chmod(parent) against / and top-level dirs Five call sites do os.chmod(path.parent, 0o700) without checking that the parent resolves to a safe directory. If HERMES_HOME or another path env var resolves to /, the chmod strips traversal permission from the root inode and bricks the entire host. Add secure_parent_dir() to hermes_constants.py that refuses to chmod / or any top-level directory (depth < 2). Replace all 5 call sites with this helper. Fixes #25821	14 天前
test_hermes_home_profile_warning.py	fix(constants): warn once when get_hermes_home() falls back under an active profile (#18746) When HERMES_HOME is unset but ~/.hermes/active_profile names a non-default profile, any data this process writes lands in the default profile — not the one the operator expects. Before this change the fallback was silent, so cross-profile contamination (#18594) was invisible until a user noticed their memory/state ended up in the wrong place. Now we emit a one-shot warning to stderr the first time this happens in a process. No raise — there are 30+ module-level callers of get_hermes_home() and raising from any of them would brick import. Behavior is otherwise unchanged; subprocess spawners (systemd template, kanban dispatcher, docker entrypoint) already propagate HERMES_HOME correctly. Bypasses logging.getLogger() because this runs before logging is configured in a significant fraction of callers (module import time). Refs #18594. Credit to @liuhao1024 for surfacing the silent-fallback case in PR #18600; we kept the diagnostic signal without the import-time raise.	1 个月前
test_hermes_logging.py	fix(tests): catch up 25 stale tests after recent merges (#28626) Sweep of all CI failures on origin/main, grouped by drift source: Telegram allowlist gate (db50af910 added user-authz to _should_process_message): - Hardcoded "[Telegram]" prefix in the logger.warning so the call no longer dereferences self.name → self.platform, which test fixtures built via object.__new__ never set. - test_telegram_format / test_allowed_channels_widening fixtures stub _is_callback_user_authorized → True so the new gate doesn't reject guest-mode / allowed-channels test messages. - test_telegram_approval_buttons::test_update_prompt_callback_not_affected sets TELEGRAM_ALLOWED_USERS="*" so the fail-closed default doesn't reject the callback before it writes .update_response. Approval surface (6d495d9e7 renamed status, 214b95392 detached stdin): - test_no_callback_returns_approval_required: status is now "pending_approval" (was "approval_required"). - test_close_stdin_allows_eof_driven_process_to_finish: switch to use_pty=True; non-PTY now uses stdin=DEVNULL. Mattermost (send() now resolves root_id via _api_get first): - test_send_with_thread_reply mocks _session.get with a thread-root response so the new resolver doesn't TypeError on a bare AsyncMock. Kanban (d8ad431de rename, f55d94a1e review column, _kanban_worker_skill_available): - _safe_int → _to_epoch in the two test_kanban_db tests. - Spawn-skills tests (×3) monkey-patch _kanban_worker_skill_available to True since the isolated kanban_home fixture has no devops/kanban-worker tree. - test_gateway_dispatcher_disables_corrupt_board: connect count 3 → 5 (review-column probe now also runs per tick). Aux-config severity at_or_above (a94ddd807): - test_diagnostics_endpoint_severity_filter expects warning filter to include error+critical now (was exact-match). Anthropic error handling (conversation loop extracted from run_agent): - _no_backoff_wait fixture patches BOTH run_agent.jittered_backoff AND agent.conversation_loop.jittered_backoff. The latter is the actual call site; without the second patch tests burn ~2s per retry and hit the 30s SIGALRM timeout on CI. Other test pollution / drift: - test_auto_does_not_select_copilot_from_github_token: patch agent.bedrock_adapter.has_aws_credentials → False so boto3's credential chain can't auto-pick Bedrock from developer ~/.aws. - test_setup_openclaw_migration: patch hermes_cli.gateway.get_env_value in addition to setup_mod.get_env_value — _platform_status reads through the gateway module's binding. - test_gateway_prefix: COMPONENT_PREFIXES["gateway"] now includes "hermes_plugins" too. - test_recommended_update_command_defaults_to_hermes_update: also short-circuit get_managed_update_command in case a stray ~/.hermes/.managed marker is present. - test_user_id_is_not_explicit: _parse_target_ref now returns is_explicit=False for Slack U.../W... IDs (chat.postMessage rejects them — a DM must be opened first via conversations.open).	16 天前
test_hermes_state.py	fix(gateway): separate observed Telegram group context	12 天前
test_hermes_state_wal_fallback.py	fix(sqlite): fall back to journal_mode=DELETE on NFS/SMB/FUSE (#22043) SQLite's WAL mode requires shared-memory (mmap) coordination and fcntl byte-range locks that don't reliably work on network filesystems. Upstream documents this explicitly: https://www.sqlite.org/wal.html#sometimes_queries_return_sqlite_busy_in_wal_mode On NFS / SMB / some FUSE mounts / WSL1, 'PRAGMA journal_mode=WAL' raises 'sqlite3.OperationalError: locking protocol' (SQLITE_PROTOCOL). Before this change, every feature backed by state.db or kanban.db broke silently: - /resume, /title, /history, /branch returned 'Session database not available.' with no cause - gateway logged the init failure at DEBUG (invisible in errors.log) - kanban dispatcher crashed every 60s, driving the known migration race (duplicate column name: consecutive_failures, #21708 / #21374) Changes: - hermes_state.apply_wal_with_fallback(): shared helper that tries WAL and falls back to DELETE on SQLITE_PROTOCOL-style errors with one WARNING explaining why - hermes_state.get_last_init_error() + format_session_db_unavailable(): capture the init failure cause and surface it in user-facing strings (with an NFS/SMB pointer for 'locking protocol') - hermes_cli/kanban_db.connect(): use the shared helper - gateway/run.py: bump SessionDB init failure log DEBUG -> WARNING (matches cli.py's existing correct behavior) - cli.py (4 sites) + gateway/run.py (5 sites): replace bare 'Session database not available.' with format_session_db_unavailable() Tests: 12 new tests in tests/test_hermes_state_wal_fallback.py + 1 new test in tests/hermes_cli/test_kanban_db.py. Existing suites (state, kanban, gateway, cli) remain green for all tests unrelated to pre-existing failures on main. Evidence: real-world user on NFSv3 mount (172.26.224.200:d2dfac12/home, local_lock=none) reporting 'Session database not available.' on /resume; 'locking protocol' appears in 4 distinct log entries across backup, kanban, TUI, and CLI paths in the same session. closes #22032	26 天前
test_honcho_client_config.py	feat(memory): pluggable memory provider interface with profile isolation, review fixes, and honcho CLI restoration (#4623) * feat(memory): add pluggable memory provider interface with profile isolation Introduces a pluggable MemoryProvider ABC so external memory backends can integrate with Hermes without modifying core files. Each backend becomes a plugin implementing a standard interface, orchestrated by MemoryManager. Key architecture: - agent/memory_provider.py — ABC with core + optional lifecycle hooks - agent/memory_manager.py — single integration point in the agent loop - agent/builtin_memory_provider.py — wraps existing MEMORY.md/USER.md Profile isolation fixes applied to all 6 shipped plugins: - Cognitive Memory: use get_hermes_home() instead of raw env var - Hindsight Memory: check $HERMES_HOME/hindsight/config.json first, fall back to legacy ~/.hindsight/ for backward compat - Hermes Memory Store: replace hardcoded ~/.hermes paths with get_hermes_home() for config loading and DB path defaults - Mem0 Memory: use get_hermes_home() instead of raw env var - RetainDB Memory: auto-derive profile-scoped project name from hermes_home path (hermes-<profile>), explicit env var overrides - OpenViking Memory: read-only, no local state, isolation via .env MemoryManager.initialize_all() now injects hermes_home into kwargs so every provider can resolve profile-scoped storage without importing get_hermes_home() themselves. Plugin system: adds register_memory_provider() to PluginContext and get_plugin_memory_providers() accessor. Based on PR #3825. 46 tests (37 unit + 5 E2E + 4 plugin registration). * refactor(memory): drop cognitive plugin, rewrite OpenViking as full provider Remove cognitive-memory plugin (#727) — core mechanics are broken: decay runs 24x too fast (hourly not daily), prefetch uses row ID as timestamp, search limited by importance not similarity. Rewrite openviking-memory plugin from a read-only search wrapper into a full bidirectional memory provider using the complete OpenViking session lifecycle API: - sync_turn: records user/assistant messages to OpenViking session (threaded, non-blocking) - on_session_end: commits session to trigger automatic memory extraction into 6 categories (profile, preferences, entities, events, cases, patterns) - prefetch: background semantic search via find() endpoint - on_memory_write: mirrors built-in memory writes to the session - is_available: checks env var only, no network calls (ABC compliance) Tools expanded from 3 to 5: - viking_search: semantic search with mode/scope/limit - viking_read: tiered content (abstract ~100tok / overview ~2k / full) - viking_browse: filesystem-style navigation (list/tree/stat) - viking_remember: explicit memory storage via session - viking_add_resource: ingest URLs/docs into knowledge base Uses direct HTTP via httpx (no openviking SDK dependency needed). Response truncation on viking_read to prevent context flooding. * fix(memory): harden Mem0 plugin — thread safety, non-blocking sync, circuit breaker - Remove redundant mem0_context tool (identical to mem0_search with rerank=true, top_k=5 — wastes a tool slot and confuses the model) - Thread sync_turn so it's non-blocking — Mem0's server-side LLM extraction can take 5-10s, was stalling the agent after every turn - Add threading.Lock around _get_client() for thread-safe lazy init (prefetch and sync threads could race on first client creation) - Add circuit breaker: after 5 consecutive API failures, pause calls for 120s instead of hammering a down server every turn. Auto-resets after cooldown. Logs a warning when tripped. - Track success/failure in prefetch, sync_turn, and all tool calls - Wait for previous sync to finish before starting a new one (prevents unbounded thread accumulation on rapid turns) - Clean up shutdown to join both prefetch and sync threads * fix(memory): enforce single external memory provider limit MemoryManager now rejects a second non-builtin provider with a warning. Built-in memory (MEMORY.md/USER.md) is always accepted. Only ONE external plugin provider is allowed at a time. This prevents tool schema bloat (some providers add 3-5 tools each) and conflicting memory backends. The warning message directs users to configure memory.provider in config.yaml to select which provider to activate. Updated all 47 tests to use builtin + one external pattern instead of multiple externals. Added test_second_external_rejected to verify the enforcement. * feat(memory): add ByteRover memory provider plugin Implements the ByteRover integration (from PR #3499 by hieuntg81) as a MemoryProvider plugin instead of direct run_agent.py modifications. ByteRover provides persistent memory via the brv CLI — a hierarchical knowledge tree with tiered retrieval (fuzzy text then LLM-driven search). Local-first with optional cloud sync. Plugin capabilities: - prefetch: background brv query for relevant context - sync_turn: curate conversation turns (threaded, non-blocking) - on_memory_write: mirror built-in memory writes to brv - on_pre_compress: extract insights before context compression Tools (3): - brv_query: search the knowledge tree - brv_curate: store facts/decisions/patterns - brv_status: check CLI version and context tree state Profile isolation: working directory at $HERMES_HOME/byterover/ (scoped per profile). Binary resolution cached with thread-safe double-checked locking. All write operations threaded to avoid blocking the agent (curate can take 120s with LLM processing). * fix(memory): thread remaining sync_turns, fix holographic, add config key Plugin fixes: - Hindsight: thread sync_turn (was blocking up to 30s via _run_in_thread) - RetainDB: thread sync_turn (was blocking on HTTP POST) - Both: shutdown now joins sync threads alongside prefetch threads Holographic retrieval fixes: - reason(): removed dead intersection_key computation (bundled but never used in scoring). Now reuses pre-computed entity_residuals directly, moved role_content encoding outside the inner loop. - contradict(): added _MAX_CONTRADICT_FACTS=500 scaling guard. Above 500 facts, only checks the most recently updated ones to avoid O(n^2) explosion (~125K comparisons at 500 is acceptable). Config: - Added memory.provider key to DEFAULT_CONFIG ("" = builtin only). No version bump needed (deep_merge handles new keys automatically). * feat(memory): extract Honcho as a MemoryProvider plugin Creates plugins/honcho-memory/ as a thin adapter over the existing honcho_integration/ package. All 4 Honcho tools (profile, search, context, conclude) move from the normal tool registry to the MemoryProvider interface. The plugin delegates all work to HonchoSessionManager — no Honcho logic is reimplemented. It uses the existing config chain: $HERMES_HOME/honcho.json -> ~/.honcho/config.json -> env vars. Lifecycle hooks: - initialize: creates HonchoSessionManager via existing client factory - prefetch: background dialectic query - sync_turn: records messages + flushes to API (threaded) - on_memory_write: mirrors user profile writes as conclusions - on_session_end: flushes all pending messages This is a prerequisite for the MemoryManager wiring in run_agent.py. Once wired, Honcho goes through the same provider interface as all other memory plugins, and the scattered Honcho code in run_agent.py can be consolidated into the single MemoryManager integration point. * feat(memory): wire MemoryManager into run_agent.py Adds 8 integration points for the external memory provider plugin, all purely additive (zero existing code modified): 1. Init (~L1130): Create MemoryManager, find matching plugin provider from memory.provider config, initialize with session context 2. Tool injection (~L1160): Append provider tool schemas to self.tools and self.valid_tool_names after memory_manager init 3. System prompt (~L2705): Add external provider's system_prompt_block alongside existing MEMORY.md/USER.md blocks 4. Tool routing (~L5362): Route provider tool calls through memory_manager.handle_tool_call() before the catchall handler 5. Memory write bridge (~L5353): Notify external provider via on_memory_write() when the built-in memory tool writes 6. Pre-compress (~L5233): Call on_pre_compress() before context compression discards messages 7. Prefetch (~L6421): Inject provider prefetch results into the current-turn user message (same pattern as Honcho turn context) 8. Turn sync + session end (~L8161, ~L8172): sync_all() after each completed turn, queue_prefetch_all() for next turn, on_session_end() + shutdown_all() at conversation end All hooks are wrapped in try/except — a failing provider never breaks the agent. The existing memory system, Honcho integration, and all other code paths are completely untouched. Full suite: 7222 passed, 4 pre-existing failures. * refactor(memory): remove legacy Honcho integration from core Extracts all Honcho-specific code from run_agent.py, model_tools.py, toolsets.py, and gateway/run.py. Honcho is now exclusively available as a memory provider plugin (plugins/honcho-memory/). Removed from run_agent.py (-457 lines): - Honcho init block (session manager creation, activation, config) - 8 Honcho methods: _honcho_should_activate, _strip_honcho_tools, _activate_honcho, _register_honcho_exit_hook, _queue_honcho_prefetch, _honcho_prefetch, _honcho_save_user_observation, _honcho_sync - _inject_honcho_turn_context module-level function - Honcho system prompt block (tool descriptions, CLI commands) - Honcho context injection in api_messages building - Honcho params from __init__ (honcho_session_key, honcho_manager, honcho_config) - HONCHO_TOOL_NAMES constant - All honcho-specific tool dispatch forwarding Removed from other files: - model_tools.py: honcho_tools import, honcho params from handle_function_call - toolsets.py: honcho toolset definition, honcho tools from core tools list - gateway/run.py: honcho params from AIAgent constructor calls Removed tests (-339 lines): - 9 Honcho-specific test methods from test_run_agent.py - TestHonchoAtexitFlush class from test_exit_cleanup_interrupt.py Restored two regex constants (_SURROGATE_RE, _BUDGET_WARNING_RE) that were accidentally removed during the honcho function extraction. The honcho_integration/ package is kept intact — the plugin delegates to it. tools/honcho_tools.py registry entries are now dead code (import commented out in model_tools.py) but the file is preserved for reference. Full suite: 7207 passed, 4 pre-existing failures. Zero regressions. * refactor(memory): restructure plugins, add CLI, clean gateway, migration notice Plugin restructure: - Move all memory plugins from plugins/<name>-memory/ to plugins/memory/<name>/ (byterover, hindsight, holographic, honcho, mem0, openviking, retaindb) - New plugins/memory/__init__.py discovery module that scans the directory directly, loading providers by name without the general plugin system - run_agent.py uses load_memory_provider() instead of get_plugin_memory_providers() CLI wiring: - hermes memory setup — interactive curses picker + config wizard - hermes memory status — show active provider, config, availability - hermes memory off — disable external provider (built-in only) - hermes honcho — now shows migration notice pointing to hermes memory setup Gateway cleanup: - Remove _get_or_create_gateway_honcho (already removed in prev commit) - Remove _shutdown_gateway_honcho and _shutdown_all_gateway_honcho methods - Remove all calls to shutdown methods (4 call sites) - Remove _honcho_managers/_honcho_configs dict references Dead code removal: - Delete tools/honcho_tools.py (279 lines, import was already commented out) - Delete tests/gateway/test_honcho_lifecycle.py (131 lines, tested removed methods) - Remove if False placeholder from run_agent.py Migration: - Honcho migration notice on startup: detects existing honcho.json or ~/.honcho/config.json, prints guidance to run hermes memory setup. Only fires when memory.provider is not set and not in quiet mode. Full suite: 7203 passed, 4 pre-existing failures. Zero regressions. * feat(memory): standardize plugin config + add per-plugin documentation Config architecture: - Add save_config(values, hermes_home) to MemoryProvider ABC - Honcho: writes to $HERMES_HOME/honcho.json (SDK native) - Mem0: writes to $HERMES_HOME/mem0.json - Hindsight: writes to $HERMES_HOME/hindsight/config.json - Holographic: writes to config.yaml under plugins.hermes-memory-store - OpenViking/RetainDB/ByteRover: env-var only (default no-op) Setup wizard (hermes memory setup): - Now calls provider.save_config() for non-secret config - Secrets still go to .env via env vars - Only memory.provider activation key goes to config.yaml Documentation: - README.md for each of the 7 providers in plugins/memory/<name>/ - Requirements, setup (wizard + manual), config reference, tools table - Consistent format across all providers The contract for new memory plugins: - get_config_schema() declares all fields (REQUIRED) - save_config() writes native config (REQUIRED if not env-var-only) - Secrets use env_var field in schema, written to .env by wizard - README.md in the plugin directory * docs: add memory providers user guide + developer guide New pages: - user-guide/features/memory-providers.md — comprehensive guide covering all 7 shipped providers (Honcho, OpenViking, Mem0, Hindsight, Holographic, RetainDB, ByteRover). Each with setup, config, tools, cost, and unique features. Includes comparison table and profile isolation notes. - developer-guide/memory-provider-plugin.md — how to build a new memory provider plugin. Covers ABC, required methods, config schema, save_config, threading contract, profile isolation, testing. Updated pages: - user-guide/features/memory.md — replaced Honcho section with link to new Memory Providers page - user-guide/features/honcho.md — replaced with migration redirect to the new Memory Providers page - sidebars.ts — added both new pages to navigation * fix(memory): auto-migrate Honcho users to memory provider plugin When honcho.json or ~/.honcho/config.json exists but memory.provider is not set, automatically set memory.provider: honcho in config.yaml and activate the plugin. The plugin reads the same config files, so all data and credentials are preserved. Zero user action needed. Persists the migration to config.yaml so it only fires once. Prints a one-line confirmation in non-quiet mode. * fix(memory): only auto-migrate Honcho when enabled + credentialed Check HonchoClientConfig.enabled AND (api_key OR base_url) before auto-migrating — not just file existence. Prevents false activation for users who disabled Honcho, stopped using it (config lingers), or have ~/.honcho/ from a different tool. * feat(memory): auto-install pip dependencies during hermes memory setup Reads pip_dependencies from plugin.yaml, checks which are missing, installs them via pip before config walkthrough. Also shows install guidance for external_dependencies (e.g. brv CLI for ByteRover). Updated all 7 plugin.yaml files with pip_dependencies: - honcho: honcho-ai - mem0: mem0ai - openviking: httpx - hindsight: hindsight-client - holographic: (none) - retaindb: requests - byterover: (external_dependencies for brv CLI) * fix: remove remaining Honcho crash risks from cli.py and gateway cli.py: removed Honcho session re-mapping block (would crash importing deleted tools/honcho_tools.py), Honcho flush on compress, Honcho session display on startup, Honcho shutdown on exit, honcho_session_key AIAgent param. gateway/run.py: removed honcho_session_key params from helper methods, sync_honcho param, _honcho.shutdown() block. tests: fixed test_cron_session_with_honcho_key_skipped (was passing removed honcho_key param to _flush_memories_for_session). * fix: include plugins/ in pyproject.toml package list Without this, plugins/memory/ wouldn't be included in non-editable installs. Hermes always runs from the repo checkout so this is belt- and-suspenders, but prevents breakage if the install method changes. * fix(memory): correct pip-to-import name mapping for dep checks The heuristic dep.replace('-', '_') fails for packages where the pip name differs from the import name: honcho-ai→honcho, mem0ai→mem0, hindsight-client→hindsight_client. Added explicit mapping table so hermes memory setup doesn't try to reinstall already-installed packages. * chore: remove dead code from old plugin memory registration path - hermes_cli/plugins.py: removed register_memory_provider(), _memory_providers list, get_plugin_memory_providers() — memory providers now use plugins/memory/ discovery, not the general plugin system - hermes_cli/main.py: stripped 74 lines of dead honcho argparse subparsers (setup, status, sessions, map, peer, mode, tokens, identity, migrate) — kept only the migration redirect - agent/memory_provider.py: updated docstring to reflect new registration path - tests: replaced TestPluginMemoryProviderRegistration with TestPluginMemoryDiscovery that tests the actual plugins/memory/ discovery system. Added 3 new tests (discover, load, nonexistent). * chore: delete dead honcho_integration/cli.py and its tests cli.py (794 lines) was the old 'hermes honcho' command handler — nobody calls it since cmd_honcho was replaced with a migration redirect. Deleted tests that imported from removed code: - tests/honcho_integration/test_cli.py (tested _resolve_api_key) - tests/honcho_integration/test_config_isolation.py (tested CLI config paths) - tests/tools/test_honcho_tools.py (tested the deleted tools/honcho_tools.py) Remaining honcho_integration/ files (actively used by the plugin): - client.py (445 lines) — config loading, SDK client creation - session.py (991 lines) — session management, queries, flush * refactor: move honcho_integration/ into the honcho plugin Moves client.py (445 lines) and session.py (991 lines) from the top-level honcho_integration/ package into plugins/memory/honcho/. No Honcho code remains in the main codebase. - plugins/memory/honcho/client.py — config loading, SDK client creation - plugins/memory/honcho/session.py — session management, queries, flush - Updated all imports: run_agent.py (auto-migration), hermes_cli/doctor.py, plugin __init__.py, session.py cross-import, all tests - Removed honcho_integration/ package and pyproject.toml entry - Renamed tests/honcho_integration/ → tests/honcho_plugin/ * docs: update architecture + gateway-internals for memory provider system - architecture.md: replaced honcho_integration/ with plugins/memory/ - gateway-internals.md: replaced Honcho-specific session routing and flush lifecycle docs with generic memory provider interface docs * fix: update stale mock path for resolve_active_host after honcho plugin migration * fix(memory): address review feedback — P0 lifecycle, ABC contract, honcho CLI restore Review feedback from Honcho devs (erosika): P0 — Provider lifecycle: - Remove on_session_end() + shutdown_all() from run_conversation() tail (was killing providers after every turn in multi-turn sessions) - Add shutdown_memory_provider() method on AIAgent for callers - Wire shutdown into CLI atexit, reset_conversation, gateway stop/expiry Bug fixes: - Remove sync_honcho=False kwarg from /btw callsites (TypeError crash) - Fix doctor.py references to dead 'hermes honcho setup' command - Cache prefetch_all() before tool loop (was re-calling every iteration) ABC contract hardening (all backwards-compatible): - Add session_id kwarg to prefetch/sync_turn/queue_prefetch - Make on_pre_compress() return str (provider insights in compression) - Add *kwargs to on_turn_start() for runtime context - Add on_delegation() hook for parent-side subagent observation - Document agent_context/agent_identity/agent_workspace kwargs on initialize() (prevents cron corruption, enables profile scoping) - Fix docstring: single external provider, not multiple Honcho CLI restoration: - Add plugins/memory/honcho/cli.py (from main's honcho_integration/cli.py with imports adapted to plugin path) - Restore full hermes honcho command with all subcommands (status, peer, mode, tokens, identity, enable/disable, sync, peers, --target-profile) - Restore auto-clone on profile creation + sync on hermes update - hermes honcho setup now redirects to hermes memory setup fix(memory): wire on_delegation, skip_memory for cron/flush, fix ByteRover return type - Wire on_delegation() in delegate_tool.py — parent's memory provider is notified with task+result after each subagent completes - Add skip_memory=True to cron scheduler (prevents cron system prompts from corrupting user representations — closes #4052) - Add skip_memory=True to gateway flush agent (throwaway agent shouldn't activate memory provider) - Fix ByteRover on_pre_compress() return type: None -> str * fix(honcho): port profile isolation fixes from PR #4632 Ports 5 bug fixes found during profile testing (erosika's PR #4632): 1. 3-tier config resolution — resolve_config_path() now checks $HERMES_HOME/honcho.json → ~/.hermes/honcho.json → ~/.honcho/config.json (non-default profiles couldn't find shared host blocks) 2. Thread host=_host_key() through from_global_config() in cmd_setup, cmd_status, cmd_identity (--target-profile was being ignored) 3. Use bare profile name as aiPeer (not host key with dots) — Honcho's peer ID pattern is ^[a-zA-Z0-9_-]+$, dots are invalid 4. Wrap add_peers() in try/except — was fatal on new AI peers, killed all message uploads for the session 5. Gate Honcho clone behind --clone/--clone-all on profile create (bare create should be blank-slate) Also: sanitize assistant_peer_id via _sanitize_id() * fix(tests): add module cleanup fixture to test_cli_provider_resolution test_cli_provider_resolution._import_cli() wipes tools.*, cli, and run_agent from sys.modules to force fresh imports, but had no cleanup. This poisoned all subsequent tests on the same xdist worker — mocks targeting tools.file_tools, tools.send_message_tool, etc. patched the NEW module object while already-imported functions still referenced the OLD one. Caused ~25 cascade failures: send_message KeyError, process_registry FileNotFoundError, file_read_guards timeouts, read_loop_detection file-not-found, mcp_oauth None port, and provider_parity/codex_execution stale tool lists. Fix: autouse fixture saves all affected modules before each test and restores them after, matching the pattern in test_managed_browserbase_and_modal.py.	2 个月前
test_install_sh_browser_install.py	fix(install): support non-sudo service-user installs on apt distros (#25814) The Debian/Ubuntu branch of install_node_deps() ran 'npx playwright install --with-deps chromium' unconditionally. Playwright invokes sudo interactively to apt-install Chromium's system libraries, which blocks the installer for non-sudo users (systemd service accounts, unprivileged operator users) on an unsatisfiable password prompt. Changes: - install.sh: gate --with-deps behind a sudo capability check on the apt branch (matches the existing Arch/pacman branch pattern). Non-sudo users fall back to 'npx playwright install chromium' alone and the installer prints the exact 'sudo npx playwright install-deps chromium' command an administrator can run separately. - install.sh: add --skip-browser (alias --no-playwright) to skip the Playwright step entirely for headless installs that don't need browser automation. Mirrors the existing --no-venv / --skip-setup shape. - installation.md: add a 'Non-Sudo / System Service User Installs' section covering the admin/service-user split, the --skip-browser flag, and the ~/.local/bin PATH gotcha (the root cause of the 'No module named dotenv' error users hit when running the repo source 'hermes' script with system Python instead of the venv launcher). - test_install_sh_browser_install.py: regression coverage for the --skip-browser flag and the sudo-gate on the apt branch. Reported by @ssilver in Discord.	21 天前
test_install_sh_pythonpath_sanitization.py	fix: harden install.sh against inherited Python env leakage	29 天前
test_install_sh_setup_wizard_tty_probe.py	fix(install): widen /dev/tty open-probe to sibling gates (#16746) The contributor's PR (#16750) scoped the fix to run_setup_wizard() and explicitly punted the two sibling sites. Both have the identical [ -e /dev/tty ] pattern followed by a < /dev/tty redirect and crash in Docker the same way: - scripts/install.sh:732 install_system_packages() -- apt sudo prompt fallback. sudo ... < /dev/tty dies with the same ENXIO. - scripts/install.sh:1395 maybe_start_gateway() -- gateway-install gate, same function path as the wizard reproducer. Fix both with the same (: </dev/tty) 2>/dev/null probe, and parametrize the regression test over all three gated functions so any future regression is caught regardless of which site breaks.	1 个月前
test_install_sh_symlink_stomp.py	fix(install): preserve pip entry point when re-running on symlinked install setup_path() writes the user-facing hermes shim with `cat >`, which follows existing symlinks. Older installs created `$command_link_dir/hermes` as a symlink to `$HERMES_BIN` (`venv/bin/hermes`), so re-running install.sh stomped the pip entry point with a bash shim that exec'd itself in an infinite loop. `rm -f` the link target before writing so the shim lands at `$command_link_dir/hermes` and the venv entry point is left intact. Adds a regression test that reproduces the symlink-stomp end-to-end (creates the symlink, drives the real shim-write block from setup_path, asserts the venv pip script body survives and the shim is now a regular file). Both new assertions fail on origin/main and pass with the fix. Closes #21454.	21 天前
test_install_sh_termux_network_prereqs.py	fix: strengthen termux install network prerequisites	28 天前
test_ipv4_preference.py	feat: add network.force_ipv4 config to fix IPv6 timeout issues (#8196) On servers with broken or unreachable IPv6, Python's socket.getaddrinfo returns AAAA records first. urllib/httpx/requests all try IPv6 connections first and hang for the full TCP timeout before falling back to IPv4. This affects web_extract, web_search, the OpenAI SDK, and all HTTP tools. Adds network.force_ipv4 config option (default: false) that monkey-patches socket.getaddrinfo to resolve as AF_INET when the caller didn't specify a family. Falls back to full resolution if no A record exists, so pure-IPv6 hosts still work. Applied early at all three entry points (CLI, gateway, cron scheduler) before any HTTP clients are created. Reported by user @29n — Chinese Ubuntu server with unreachable IPv6 causing timeouts on lobste.rs and other IPv6-enabled sites while Google/GitHub worked fine (IPv4-only resolution).	1 个月前
test_lazy_session_regressions.py	fix: resolve lazy session creation regressions (#18370 fallout) (#20363) Fix three regressions introduced by PR #18370 (lazy session creation): 1. _finalize_session() uses stale session_key after compression (#20001) 2. session_key not synced after auto-compression in run_conversation (#20001) 3. pending_title ValueError leaves title wedged forever (#19029) 4. Gateway silently swallows null responses when agent did work (#18765) 5. One-time cleanup for accumulated ghost compression continuations (#20001) Changes: - tui_gateway/server.py: _finalize_session() now uses agent.session_id (falls back to session_key when agent is None). Refactor _sync_session_key_after_compress() with clear_pending_title and restart_slash_worker policy flags. Call it post-run_conversation() to sync session_key after auto-compression. Add ValueError handler to pending_title flush. - gateway/run.py: Extract _normalize_empty_agent_response() helper that consolidates failed/partial/null response handling. Surfaces user-facing error when agent did work (api_calls > 0) but returned no text. - hermes_state.py: Add finalize_orphaned_compression_sessions() — marks ghost continuation sessions as ended (non-destructive, preserves data). - cli.py: One-time startup migration for orphaned compression sessions. Test changes: - tests/test_tui_gateway_server.py: Update pending_title ValueError test for post-#18370 architecture (title applied post-message, not at create). - tests/test_lazy_session_regressions.py: 14 new regression tests covering all fixed paths.	30 天前
test_lint_config.py	lint: enable PLW1514 as a blocking ruff rule Turns the existing 'all lints disabled' stance into 'exactly one lint enabled' — PLW1514 (unspecified-encoding) catches bare open() / read_text() / write_text() calls that default to locale encoding on Windows (cp1252), silently corrupting non-ASCII content. Changes: 1. pyproject.toml - Migrate [tool.ruff] top-level select → [tool.ruff.lint].select (deprecated config location, ruff was warning on every run) - Add preview = true (PLW1514 is a preview rule in ruff 0.15.x) - select = ['PLW1514'] (exactly one rule, deliberately minimal) - per-file-ignores exempt tests/, plugins/, skills/, optional-skills/ — those have their own conventions or intentionally exercise edge cases 2. website/scripts/extract-skills.py - Fix 3 remaining bare opens (website/ was excluded from the main sweep but needed for ruff check . to go green) 3. tests/test_lint_config.py (new, 5 tests) - Guards against accidental rule removal. If someone deletes PLW1514 from the select list or disables preview mode, these tests fail with a loud message explaining why the rule exists. Paired with a companion commit (held locally for now, pending a token with workflow scope) that adds a blocking ruff step to .github/workflows/ lint.yml. Without that companion commit, ruff is configured correctly but nothing in CI enforces it yet — the advisory PR comment will still surface new PLW1514 violations though, so authors see them. Verified: ruff check . → exit 0, 0 violations across the repo. Test suite: 90 passed, 14 skipped, 0 failed.	27 天前
test_live_system_guard_self_test.py	chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) Six days after #23937 (608 fixes) the codebase had accumulated 241 new PLR6201 violations. Same mechanical `x in (...)` → `x in {...}` fix, same zero-risk profile: set lookup is O(1) vs O(n) for tuple and the two are semantically equivalent for hashable scalar membership tests. All 241 instances fixed via `ruff check --select PLR6201 --fix --unsafe-fixes`, zero remaining. Every changed value is a hashable scalar (str/int/None/enum/signal); no risk of unhashable runtime errors. No behavior change. Test plan: - 119 files changed, +244/-244 (net zero) — exactly one-line edits - `ruff check` clean afterward - Compile checks pass on the largest touched files (cli.py, run_agent.py, gateway/run.py, gateway/platforms/discord.py, model_tools.py) - Subset broad test run on tests/gateway/ tests/hermes_cli/ tests/agent/ tests/tools/: 18187 passed, 59 pre-existing failures (verified against origin/main with the same shape — identical failure count, identical category — all xdist test-order flakes unrelated to this change) Follows the same template as PR #23937 ([tracker: #23972](https://github.com/NousResearch/hermes-agent/issues/23972)).	18 天前
test_mcp_serve.py	fix(mcp): unwrap platforms key in channels_list channels_list was iterating directory.items() directly, yielding ("updated_at", str) and ("platforms", dict) pairs — neither passed the isinstance(entries_list, list) check, so the inner loop never ran and every call returned count=0 even when channel_directory.json was populated. The writer (gateway/channel_directory.py) wraps the payload as {"updated_at": ..., "platforms": {...}}; every other reader in the codebase unwraps via directory.get("platforms", {}). This aligns channels_list with that convention. Also tightens the existing test_channels_with_directory test, which bypassed the bug by asserting against _load_channel_directory() directly instead of calling channels_list. It now calls the tool end-to-end and a new test_channels_with_directory_platform_filter covers the filter path. Both tests fail against the pre-fix code. Closes #21474 Co-authored-by: chrisworksai <262485129+chrisworksai@users.noreply.github.com>	28 天前
test_mini_swe_runner.py	fix(kimi): omit temperature entirely for Kimi/Moonshot models (#13157) Kimi's gateway selects the correct temperature server-side based on the active mode (thinking -> 1.0, non-thinking -> 0.6). Sending any temperature value — even the previously "correct" one — conflicts with gateway-managed defaults. Replaces the old approach of forcing specific temperature values (0.6 for non-thinking, 1.0 for thinking) with an OMIT_TEMPERATURE sentinel that tells all call sites to strip the temperature key from API kwargs entirely. Changes: - agent/auxiliary_client.py: OMIT_TEMPERATURE sentinel, _is_kimi_model() prefix check (covers all kimi-* models), _fixed_temperature_for_model() returns sentinel for kimi models. _build_call_kwargs() strips temp. - run_agent.py: _build_api_kwargs, flush_memories, and summary generation paths all handle the sentinel by popping/omitting temperature. - trajectory_compressor.py: _effective_temperature_for_model returns None for kimi (sentinel mapped), direct client calls use kwargs dict to conditionally include temperature. - mini_swe_runner.py: same sentinel handling via wrapper function. - 6 test files updated: all 'forces temperature X' assertions replaced with 'temperature not in kwargs' assertions. Net: -76 lines (171 added, 247 removed). Inspired by PR #13137 (@kshitijk4poor).	1 个月前
test_minimax_model_validation.py	fix(models): validate MiniMax models against static catalog (#12611, #12460, #12399, #12547)	1 个月前
test_minimax_oauth.py	fix(minimax-oauth): refresh short-lived access tokens per request (#30619) * fix(minimax-oauth): refresh short-lived access tokens per request MiniMax OAuth issues ~15-minute access tokens. The Anthropic SDK caches api_key as a static string at client construction, so a session that resolves credentials once at startup keeps sending the same bearer until MiniMax returns 401 mid-session. Swap the static string for a callable token provider, reusing the existing Entra-ID bearer-hook infrastructure in build_anthropic_client. The callable re-reads auth.json on each invocation and calls _refresh_minimax_oauth_state, which is a no-op when the token still has more than 60s of life left and refreshes proactively otherwise. Refreshes persist to auth.json so other processes (gateway, cron) see them immediately. The wire-up lives at the agent-init / model-switch boundary rather than in resolve_runtime_provider, so aux client paths that hand the api_key string to OpenAI(api_key=...) are unaffected. * docs: add infographic for minimax-oauth token refresh	13 天前
test_minisweagent_path.py	chore: remove all remaining mini-swe-agent references Complete cleanup after dropping the mini-swe-agent submodule (PR #2804): - Remove MSWEA_SILENT_STARTUP and MSWEA_GLOBAL_CONFIG_DIR env var settings from cli.py, run_agent.py, hermes_cli/main.py, doctor.py - Remove mini-swe-agent health check from hermes doctor - Remove 'minisweagent' from logger suppression lists - Remove litellm/typer/platformdirs from requirements.txt - Remove mini-swe-agent install steps from install.ps1 (Windows) - Remove mini-swe-agent install steps from website docs - Update all stale comments/docstrings referencing mini-swe-agent in terminal_tool.py, tools/__init__.py, code_execution_tool.py, environments/README.md, environments/agent_loop.py - Remove mini_swe_runner from pyproject.toml py-modules (still exists as standalone script for RL training use) - Shrink test_minisweagent_path.py to empty stub The orphaned mini-swe-agent/ directory on disk needs manual removal: rm -rf mini-swe-agent/	2 个月前
test_model_picker_scroll.py	fix: CLI/UX batch — ChatConsole errors, curses scroll, skin-aware banner, git state banner (#5974) * fix(cli): route error messages through ChatConsole inside patch_stdout Cherry-pick of PR #5798 by @icn5381. Replace self.console.print() with ChatConsole().print() for 11 error/status messages reachable during the interactive session. Inside patch_stdout, self.console (plain Rich Console) writes raw ANSI escapes that StdoutProxy mangles into garbled text. ChatConsole uses prompt_toolkit's native print_formatted_text which renders correctly. Same class of bug as #2262 — that fix covered agent output but missed these error paths in _ensure_runtime_credentials, _init_agent, quick commands, skill loading, and plan mode. * fix(model-picker): add scrolling viewport to curses provider menu Cherry-pick of PR #5790 by @Lempkey. Fixes #5755. _curses_prompt_choice rendered items starting unconditionally from index 0 with no scroll offset. The 'More providers' submenu has 13 entries. On terminals shorter than ~16 rows, items past the fold were never drawn. When UP-arrow wrapped cursor from 0 to the last item (Cancel, index 12), the highlight rendered off-screen — appearing as if only Cancel existed. Adds scroll_offset tracking that adjusts each frame to keep the cursor inside the visible window. * feat(cli): skin-aware compact banner + git state in startup banner Combined salvage of PR #5922 by @ASRagab and PR #5877 by @xinbenlv. Compact banner changes (from #5922): - Read active skin colors and branding instead of hardcoding gold/NOUS HERMES - Default skin preserves backward-compatible legacy branding - Non-default skins use their own agent_name and colors Git state in banner (from #5877): - New format_banner_version_label() shows upstream/local git hashes - Full banner title now includes git state (upstream hash, carried commits) - Compact banner line2 shows the version label with git state - Widen compact banner max width from 64 to 88 to fit version info Both the full Rich banner and compact fallback are now skin-aware and show git state.	1 个月前
test_model_tools.py	chore: remove Atropos RL environments and tinker-atropos integration (#26106) * chore: remove Atropos RL environments, tools, tests, skill, and tinker-atropos submodule Delete: - environments/ (43 files — base env, agent loop, tool call parsers, benchmarks) - rl_cli.py (standalone RL training CLI) - tools/rl_training_tool.py (all 10 rl_* tools) - tests: test_rl_training_tool, test_tool_call_parsers, test_managed_server_tool_support, test_agent_loop, test_agent_loop_vllm, test_agent_loop_tool_calling, test_terminalbench2_env_security - optional-skills/mlops/hermes-atropos-environments/ - tinker-atropos git submodule + .gitmodules * chore: remove RL/Atropos references from Python source - toolsets.py: remove rl toolset block + update comment - model_tools.py: remove rl_tools group + update async bridging comment - hermes_cli/tools_config.py: remove RL display entry, _DEFAULT_OFF_TOOLSETS, setup block, and rl_training post-setup handler - tools/budget_config.py: remove RL environment reference in docstring - tests/test_model_tools.py: remove rl_tools from expected groups - tests/run_agent/test_streaming_tool_call_repair.py: fix stale cross-reference * chore: remove rl/yc-bench extras and tinker-atropos refs from pyproject.toml - Remove rl extra (atroposlib, tinker, fastapi, uvicorn, wandb) - Remove yc-bench extra - Remove rl_cli from py-modules - Remove [tool.ty.src] exclude for tinker-atropos - Remove [tool.ruff] exclude for tinker-atropos - Regenerate uv.lock * chore: remove tinker-atropos from install/setup scripts - setup-hermes.sh: remove entire tinker-atropos submodule install block - scripts/install.sh: remove both tinker-atropos blocks (Termux + standard) - scripts/install.ps1: remove tinker-atropos block - nix/hermes-agent.nix: remove tinker-atropos pip install line * chore: remove RL references from cli-config.yaml.example * docs: remove Atropos/RL references from README, CONTRIBUTING, AGENTS.md * docs: remove RL/Atropos references from website - Delete: environments.md, rl-training.md, mlops-hermes-atropos-environments.md - sidebars.ts: remove rl-training and environments sidebar entries - optional-skills-catalog.md: remove hermes-atropos-environments row - tools-reference.md: remove entire rl toolset section - toolsets-reference.md: remove rl row + update example - integrations/index.md: remove RL Training bullet - architecture.md: remove environments/ from tree + RL section - contributing.md: remove tinker-atropos setup - updating.md: remove tinker-atropos install + stale submodule update * chore: remove remaining RL/Atropos stragglers - hermes_cli/config.py: remove TINKER_API_KEY + WANDB_API_KEY env var defs - hermes_cli/doctor.py: remove Submodules check section (tinker-atropos) - hermes_cli/setup.py: remove RL Training status check - hermes_cli/status.py: remove Tinker + WandB from API key status display - agent/display.py: remove both rl_* tool preview/activity blocks - website/docs: remove RL references from providers.md + env-variables.md - tests: remove TINKER_API_KEY from conftest, set_config_value, setup_script * chore: remove RL training section from .env.example	20 天前
test_model_tools_async_bridge.py	fix(model_tools): cancel coroutine on timeout so worker thread exits + log full traceback _run_async() bridges sync tool handlers to async code. When the handler is invoked from inside a running event loop (gateway / nested async), it spawns a worker thread and blocks on future.result(timeout=300). Before this change, a coroutine that ran past 300s leaked its worker thread: - future.cancel() is a no-op on a running ThreadPoolExecutor future (cancel only works on not-yet-started work). - pool.shutdown(wait=False, cancel_futures=True) let the caller proceed but the worker kept running the coroutine until it returned on its own. Every tool timeout leaked one thread. In long-lived gateway / RL sessions this is cumulative. The fix replaces bare asyncio.run() with a worker wrapper that creates its own event loop. On timeout, _run_async schedules task.cancel() on that loop via call_soon_threadsafe, then shuts the pool down with wait=False so the caller returns immediately. The coroutine observes CancelledError at its next await and the worker thread exits cleanly. Also switches logger.error() to logger.exception() in the top-level handle_function_call() except block so tool failures produce full stack traces in errors.log instead of just the message. Related: #17420 (contributor flagged the leak; the original fix used pool.shutdown(wait=True) which would have converted the leak into a hang — caller blocks forever on the same stuck coroutine). Credit for identifying the leak goes to the contributor. Co-authored-by: 0z! <162235745+0z1-ghb@users.noreply.github.com>	1 个月前
test_ollama_num_ctx.py	fix: provider/model resolution — salvage 4 PRs + MiniMax aux URL fix (#5983) Salvaged fixes from community PRs: - fix(model_switch): _read_auth_store → _load_auth_store + fix auth store key lookup (was checking top-level dict instead of store['providers']). OAuth providers now correctly detected in /model picker. Cherry-picked from PR #5911 by Xule Lin (linxule). - fix(ollama): pass num_ctx to override 2048 default context window. Ollama defaults to 2048 context regardless of model capabilities. Now auto-detects from /api/show metadata and injects num_ctx into every request. Config override via model.ollama_num_ctx. Fixes #2708. Cherry-picked from PR #5929 by kshitij (kshitijk4poor). - fix(aux): normalize provider aliases for vision/auxiliary routing. Adds _normalize_aux_provider() with 17 aliases (google→gemini, claude→anthropic, glm→zai, etc). Fixes vision routing failure when provider is set to 'google' instead of 'gemini'. Cherry-picked from PR #5793 by e11i (Elizabeth1979). - fix(aux): rewrite MiniMax /anthropic base URLs to /v1 for OpenAI SDK. MiniMax's inference_base_url ends in /anthropic (Anthropic Messages API), but auxiliary client uses OpenAI SDK which appends /chat/completions → 404 at /anthropic/chat/completions. Generic _to_openai_base_url() helper rewrites terminal /anthropic to /v1 for OpenAI-compatible endpoint. Inspired by PR #5786 by Lempkey. Added debug logging to silent exception blocks across all fixes. Co-authored-by: Hermes Agent <hermes@nousresearch.com>	1 个月前
test_package_json_lazy_deps.py	fix(update): make Camofox lazy-installed instead of eager (#27055) The `@askjo/camofox-browser` npm package was a top-level entry in the root `package.json` `dependencies` block, so `hermes update` ran its postinstall on every user, every update. That postinstall calls `npx camoufox-js fetch`, which silently downloads a ~300MB Firefox-fork browser binary from GitHub Releases — multi-minute on fast connections, and a hard block for users on slow / restricted networks (notably users in China running through a VPN). Camofox is an explicit opt-in browser backend. The runtime check in `tools/browser_tool.py` only routes through Camofox when the user has set `CAMOFOX_URL` (selected via `hermes tools` → Browser Automation → Camofox). Users who never opted in never touched the package at runtime, yet every `hermes update` paid for the binary fetch anyway. This change: * Removes `@askjo/camofox-browser` from root `package.json` dependencies (and the regenerated `package-lock.json` drops Camofox's entire transitive tree, ~2.6k lines). * Updates the Camofox `post_setup` handler in `hermes_cli/tools_config.py` to install `@askjo/camofox-browser@^1.5.2` explicitly when the user selects Camofox, and streams npm output (no `--silent`, no `capture_output`) so the ~300MB download is visible rather than appearing frozen. * Adds `tests/test_package_json_lazy_deps.py` as a regression guard so future PRs can't silently re-add Camofox (or any binary-postinstall package) to eager root dependencies. `agent-browser` stays eager — it is the default Chromium-driving backend used by every session that does not have a cloud browser provider configured, and its postinstall is small. Validation: \| \| Before \| After \| \|---\|---\|---\| \| `hermes update` time on slow network \| multi-minute hang at `→ Updating Node.js dependencies...` \| seconds (no binary fetch) \| \| Camofox opt-in install visibility \| silent, looked frozen \| streamed npm output \| \| Regression guard against re-adding \| none \| `test_package_json_lazy_deps.py` \| Tests: - `tests/test_package_json_lazy_deps.py`: 3/3 pass - `tests/tools/test_browser_camofox*`: 92/92 pass - `tests/hermes_cli/test_tools_config.py`: 66/66 pass - `tests/hermes_cli/test_cmd_update.py` + adjacent: green Reported by lulu (Discord, May 2026) — `hermes update` hangs at `→ Updating Node.js dependencies...` in China. Related: #18840, #18869.	19 天前
test_packaging_metadata.py	chore: prepare Hermes for Homebrew packaging (#4099) Co-authored-by: Yabuku-xD <78594762+Yabuku-xD@users.noreply.github.com>	2 个月前
test_plugin_skills.py	fix(skills): support category-qualified local skill names	30 天前
test_process_loop_event_loop_warning.py	fix(cli): replace get_event_loop() with get_running_loop() to silence RuntimeWarning in process_loop thread (#19285)	28 天前
test_project_metadata.py	fix(packaging): ship dashboard plugin assets in wheel Salvages #23737 by @LeonSGP43. Adds plugins/* manifest.json and dist/ glob entries to setuptools package-data so wheel installs ship the bundled dashboard plugin assets (kanban, achievements, etc.). Without these, /api/dashboard/plugins can't discover plugin assets outside a source checkout.	17 天前
test_retry_utils.py	feat(agent): add jittered retry backoff Adds agent/retry_utils.py with jittered_backoff() — exponential backoff with additive jitter to prevent thundering-herd retry spikes when multiple gateway sessions hit the same rate-limited provider. Replaces fixed exponential backoff at 4 call sites: - run_agent.py: None-choices retry path (5s base, 120s cap) - run_agent.py: API error retry path (2s base, 60s cap) - trajectory_compressor.py: sync + async summarization retries Thread-safe jitter counter with overflow guards ensures unique seeds across concurrent retries. Trimmed from original PR to keep only wired-in functionality. Co-authored-by: martinp09 <martinp09@users.noreply.github.com>	1 个月前
test_run_tests_parallel.py	test: use subprocesses for each test file (#29016) * ci(tests): install ripgrep from prebuilt tarball instead of apt apt-get update + install of ripgrep takes ~4 min on the GHA Ubuntu runners (the apt-get update against archive.ubuntu.com is the slow part; ripgrep itself is small). Switching to the upstream musl binary tarball cuts the step to a few seconds. - Pinned to ripgrep 15.1.0 with sha256 verification (same hash as published in the releases sha256 sidecar file). - Drops the `rg` binary into /usr/local/bin so it is on PATH for every subsequent step without GITHUB_PATH manipulation. - Applied to both the test and e2e jobs in tests.yml. * fix(cli): compile syntax check to tempdir, not source __pycache__ `_validate_critical_files_syntax` runs `py_compile.compile()` on each critical bootstrap file after a successful `git pull`. The default `py_compile` writes the resulting `.pyc` next to the source under `__pycache__/`, which causes two real problems: 1. Parallel test workers walking the same source tree (e.g. running the suite under per-file process isolation) can race against each other on the `__pycache__` write — manifests as flaky 'directory not empty' errors during teardown. 2. In production, the post-pull syntax check leaves a `.pyc` behind that the next interpreter run might pick up — fine when the interpreter version matches, sketchy if it doesn't. Fix: write the compiled output to a `tempfile.TemporaryDirectory()` that's discarded on function exit. We only care about the compile-or-not signal, not the artifact. * test(runner): per-file process isolation, drop manual state reset + xdist Replace fragile manual _reset_module_state test fixtures with robust per-file subprocess isolation. Each test file runs in a fresh `python -m pytest <file>` subprocess via ThreadPoolExecutor. No xdist, no custom pytest plugin, no shared worker state. Key changes: * scripts/run_tests_parallel.py — new runner: discovers test files, runs N in parallel via ThreadPoolExecutor, captures stdout per file, treats exit code 5 (no tests collected) as pass, kills all children on exit. Change from cpu_count to cpu_count2. The runner is I/O-bound (waiting on subprocess.communicate() from pytest children) The parent process does almost no CPU work, so 2x oversubscription keeps more pipes full. When a file fails, immediately show the last 30 lines of pytest output (stack traces + FAILED summary) plus a ready-to-copy repro command: python -m pytest tests/agent/test_auxiliary_client.py scripts/run_tests.sh — delegates to run_tests_parallel.py * .github/workflows/tests.yml — test step: python scripts/run_tests_parallel.py * pyproject.toml — drop pytest-xdist, pytest-split; simplify addopts * tests/conftest.py — remove ~200 lines of manual state-reset fixtures * AGENTS.md — update Testing section for per-file design * test(runner): speed gateway test antipattern scan up * fix(test): web search provider plugin test missing xai * fix(tests): make 14 test files pass under per-file subprocess isolation Tests that relied on cross-file state pollution from xdist workers fail when run in isolation (per-file subprocess model). Root causes and fixes: Tool registry not populated: - test_video_generation_tool_surface_matrix: add discover_builtin_tools() - test_web_providers_brave_free/ddgs/searxng/general: autouse fixtures registering all 8 bundled web providers, reset after each test - test_website_policy: same provider registration pattern - test_web_tools_tavily: same pattern across 3 dispatch test classes - Also add is_safe_url/check_website_access mocks where SSRF check blocks example.com (DNS resolution fails in isolated envs) Stale check_fn cache: - test_kanban_tools: invalidate_check_fn_cache() + _clear_tool_defs_cache() in both kanban guidance tests (prior test cached False for kanban_show) - test_discord_tool: cache invalidation in setup/teardown - test_homeassistant_tool: invalidate_check_fn_cache() before registry queries Module-level state pollution: - test_auxiliary_client: autouse fixture clearing _aux_unhealthy_until cache - test_skill_commands: set_session_vars() instead of patch.dict(os.environ) (ContextVar takes precedence over os.environ) - test_dm_topics: overwrite sys.modules + separate telegram.constants mock + force-reimport of gateway.platforms.telegram - test_terminal_tool_requirements: removed duplicate class declaration, autouse _clear_caches fixture * change(tests): run_tests.sh explicitly includes env vars instead of manually dropping some vars, now we just only include some * fix(tests): 5 more isolation/NixOS fixes - test_approval_plugin_hooks: isolate HERMES_HOME so real user's command_allowlist doesn't short-circuit the approval path - test_google_chat: skipif when Platform.GOOGLE_CHAT not in enum (feature not merged on this branch) - test_write_deny: test systemd prefix against tmp_path instead of /etc/systemd which resolves to /nix/store on NixOS - test_pty_bridge: use shutil.which('cat') instead of /bin/cat (doesn't exist on NixOS) - profiles.py: rmtree onexc handler chmod's parent dirs too, fixing profile deletion when copytree preserved read-only modes from nix store * fix(tests): clear unhealthy cache in autouse fixture for auxiliary_client * fix(tests): skip send_message when telegram not installed; handle missing worker_id in browser_supervisor * fix: py3.11 rmtree onexc compat + belt-and-suspenders unhealthy cache clear for expired codex test * fix: address PR #29016 review feedback - Remove tracked .pytest-cache/ artifact and add to .gitignore - Fix stale 'xdist worker' comment in conftest.py - Deduplicate web provider registration into tests/tools/conftest.py shared helper (register_all_web_providers), replacing 8 copy-pasted blocks across 6 test files - Update PR description: remove stale recovered-test-files claim, fix worker count to match code (cpu_count2) fix: eliminate race in stale-cache achievements test The background scan thread could complete and overwrite _SNAPSHOT_CACHE before evaluate_all() returned the stale data — only 10 fake sessions made the scan finish instantly. Added scan_delay param to _FakeSessionDB and set it to 2s in the stale-cache test so the background thread can't win the race.	14 天前
test_sanitize_tool_error.py	security: sanitize tool error strings before injecting into model context (#26823) Adds _sanitize_tool_error() in model_tools and routes both error paths through it: registry.dispatch's try/except (the primary path for tool exceptions) and handle_function_call's outer except (defense in depth). Stripping targets structural framing tokens that the model itself can react to even though json.dumps already handles wire-layer escaping: XML role tags (tool_call, function_call, result, response, output, input, system, assistant, user), CDATA sections, and markdown code fences. Caps message body at 2000 chars and wraps with [TOOL_ERROR] prefix. Defense-in-depth: a tool exception carrying '<tool_call>...' won't break message framing (json escapes it), but the model still reads those tokens and they nudge it toward role-confusion framing. Ported from ironclaw#1639 (one piece of #3838's three-feature scout). The truncated-tool-call (#1632) and empty-response-recovery (#1677, #1720) pieces are skipped because main now implements both far more thoroughly (run_agent.py L8147/L12209/L13012 for truncation retry + length rewrite; L4500/L15090+ for empty-response scaffolding stripper, multi-stage nudge, fallback model activation).	19 天前
test_sql_injection.py	fix(security): eliminate SQL string formatting in execute() calls Closes #1911 - insights.py: Pre-compute SELECT queries as class constants instead of f-string interpolation at runtime. _SESSION_COLS is now evaluated once at class definition time. - hermes_state.py: Add identifier quoting and whitelist validation for ALTER TABLE column names in schema migrations. - Add 4 tests verifying no injection vectors in SQL query construction.	2 个月前
test_subprocess_home_isolation.py	fix: avoid process-wide cron profile home mutation	17 天前
test_termux_all_extra_compat.py	fix: add termux-all install profile and safe fallbacks	28 天前
test_timezone.py	chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) Six days after #23937 (608 fixes) the codebase had accumulated 241 new PLR6201 violations. Same mechanical `x in (...)` → `x in {...}` fix, same zero-risk profile: set lookup is O(1) vs O(n) for tuple and the two are semantically equivalent for hashable scalar membership tests. All 241 instances fixed via `ruff check --select PLR6201 --fix --unsafe-fixes`, zero remaining. Every changed value is a hashable scalar (str/int/None/enum/signal); no risk of unhashable runtime errors. No behavior change. Test plan: - 119 files changed, +244/-244 (net zero) — exactly one-line edits - `ruff check` clean afterward - Compile checks pass on the largest touched files (cli.py, run_agent.py, gateway/run.py, gateway/platforms/discord.py, model_tools.py) - Subset broad test run on tests/gateway/ tests/hermes_cli/ tests/agent/ tests/tools/: 18187 passed, 59 pre-existing failures (verified against origin/main with the same shape — identical failure count, identical category — all xdist test-order flakes unrelated to this change) Follows the same template as PR #23937 ([tracker: #23972](https://github.com/NousResearch/hermes-agent/issues/23972)).	18 天前
test_toolset_distributions.py	test: add unit tests for 8 modules (batch 2) Cover model_tools, toolset_distributions, context_compressor, prompt_caching, cronjob_tools, session_search, process_registry, and cron/scheduler with 127 new test cases.	3 个月前
test_toolsets.py	test(toolsets): lock web search into default platform coverage Adds regression tests pinning web search into the WhatsApp and api-server default platform-coverage toolsets. Pure test additions, no runtime change. Salvage of the test-addition commit from #25692 by @wesleysimplicio. (The AUTHOR_MAP fixup commit from the same PR landed separately as 529ec85c7.)	21 天前
test_trajectory_compressor.py	fix(kimi): omit temperature entirely for Kimi/Moonshot models (#13157) Kimi's gateway selects the correct temperature server-side based on the active mode (thinking -> 1.0, non-thinking -> 0.6). Sending any temperature value — even the previously "correct" one — conflicts with gateway-managed defaults. Replaces the old approach of forcing specific temperature values (0.6 for non-thinking, 1.0 for thinking) with an OMIT_TEMPERATURE sentinel that tells all call sites to strip the temperature key from API kwargs entirely. Changes: - agent/auxiliary_client.py: OMIT_TEMPERATURE sentinel, _is_kimi_model() prefix check (covers all kimi-* models), _fixed_temperature_for_model() returns sentinel for kimi models. _build_call_kwargs() strips temp. - run_agent.py: _build_api_kwargs, flush_memories, and summary generation paths all handle the sentinel by popping/omitting temperature. - trajectory_compressor.py: _effective_temperature_for_model returns None for kimi (sentinel mapped), direct client calls use kwargs dict to conditionally include temperature. - mini_swe_runner.py: same sentinel handling via wrapper function. - 6 test files updated: all 'forces temperature X' assertions replaced with 'temperature not in kwargs' assertions. Net: -76 lines (171 added, 247 removed). Inspired by PR #13137 (@kshitijk4poor).	1 个月前
test_trajectory_compressor_async.py	fix(kimi): omit temperature entirely for Kimi/Moonshot models (#13157) Kimi's gateway selects the correct temperature server-side based on the active mode (thinking -> 1.0, non-thinking -> 0.6). Sending any temperature value — even the previously "correct" one — conflicts with gateway-managed defaults. Replaces the old approach of forcing specific temperature values (0.6 for non-thinking, 1.0 for thinking) with an OMIT_TEMPERATURE sentinel that tells all call sites to strip the temperature key from API kwargs entirely. Changes: - agent/auxiliary_client.py: OMIT_TEMPERATURE sentinel, _is_kimi_model() prefix check (covers all kimi-* models), _fixed_temperature_for_model() returns sentinel for kimi models. _build_call_kwargs() strips temp. - run_agent.py: _build_api_kwargs, flush_memories, and summary generation paths all handle the sentinel by popping/omitting temperature. - trajectory_compressor.py: _effective_temperature_for_model returns None for kimi (sentinel mapped), direct client calls use kwargs dict to conditionally include temperature. - mini_swe_runner.py: same sentinel handling via wrapper function. - 6 test files updated: all 'forces temperature X' assertions replaced with 'temperature not in kwargs' assertions. Net: -76 lines (171 added, 247 removed). Inspired by PR #13137 (@kshitijk4poor).	1 个月前
test_transform_llm_output_hook.py	test+docs: cover transform_llm_output hook + release author map - tests/test_transform_llm_output_hook.py: dispatch semantics (kwargs contract, first-non-empty-string-wins, empty-string pass-through, raising-plugin fail-open, no-plugins = no-op) - tests/hermes_cli/test_plugins.py: assert the new hook name is in VALID_HOOKS alongside the other transform_* hooks - website/docs/user-guide/features/hooks.md: summary-table entry + full section mirroring transform_tool_result / transform_terminal_output - scripts/release.py: map barnacleboy.jezzahehn@agentmail.to -> JezzaHehn (existing entry only covers the gmail address)	28 天前
test_transform_tool_result_hook.py	test: stop testing mutable data — convert change-detectors to invariants (#13363) Catalog snapshots, config version literals, and enumeration counts are data that changes as designed. Tests that assert on those values add no behavioral coverage — they just break CI on every routine update and cost engineering time to 'fix.' Replace with invariants where one exists, delete where none does. Deleted (pure snapshots): - TestMinimaxModelCatalog (3 tests): 'MiniMax-M2.7 in models' et al - TestGeminiModelCatalog: 'gemini-2.5-pro in models', 'gemini-3.x in models' - test_browser_camofox_state::test_config_version_matches_current_schema (docstring literally said it would break on unrelated bumps) Relaxed (keep plumbing check, drop snapshot): - Xiaomi / Arcee / Kimi moonshot / Kimi coding / HuggingFace static lists: now assert 'provider exists and has >= 1 entry' instead of specific names - HuggingFace main/models.py consistency test: drop 'len >= 6' floor Dynamicized (follow source, not a literal): - 3x test_config.py migration tests: raw['_config_version'] == DEFAULT_CONFIG['_config_version'] instead of hardcoded 21 Fixed stale tests against intentional behavior changes: - test_insights::test_gateway_format_hides_cost: name matches new behavior (no dollar figures); remove contradicting '$' in text assertion - test_config::prefers_api_then_url_then_base_url: flipped per PR #9332; rename + update to base_url > url > api - test_anthropic_adapter: relax assert_called_once() (xdist-flaky) to assert called — contract is 'credential flowed through' - test_interrupt_propagation: add provider/model/_base_url to bare-agent fixture so the stale-timeout code path resolves Fixed stale integration tests against opt-in plugin gate: - transform_tool_result + transform_terminal_output: write plugins.enabled allow-list to config.yaml and reset the plugin manager singleton Source fix (real consistency invariant): - agent/model_metadata.py: add moonshotai/Kimi-K2.6 context length (262144, same as K2.5). test_model_metadata_has_context_lengths was correctly catching the gap. Policy: - AGENTS.md Testing section: new subsection 'Don't write change-detector tests' with do/don't examples. Reviewers should reject catalog-snapshot assertions in new tests. Covers every test that failed on the last completed main CI run (24703345583) except test_modal_sandbox_fixes::test_terminal_tool_present + test_terminal_and_file_toolsets_resolve_all_tools, which now pass both alone and with the full tests/tools/ directory (xdist ordering flake that resolved itself).	1 个月前
test_tui_gateway_server.py	fix(tui): stop slash dropdown from chopping last char of /goal (#31311) Two independent bugs caused the slash-command autocomplete to render `/goal` as `/goa` (and `/gquota` as `/gquot` for that matter) in the TUI: 1. `tui_gateway/server.py` was forwarding `c.display` from prompt_toolkit's `Completion` straight into the JSON-RPC payload. prompt_toolkit normalizes `display=` into `FormattedText` (a `list` subclass), so the wire format became `[["", "/goal"]]` instead of the `string` that `CompletionItem.display` in the TUI declares. `meta` already went through `to_plain_text` — `display` did not. 2. The dropdown row in `appOverlays.tsx` used `flexDirection="row"` with the display `<Text>` and the (very long) meta `<Text>` as siblings. When the meta overflows the row width, Ink/Yoga shrinks the first column by one cell, lopping the trailing character off the command name. `/goal` triggers it reliably because its meta string is the longest of any built-in command (description + embedded `[text \| pause \| resume \| clear \| status]` usage hint). Wrapping the display column in `<Box flexShrink={0}>` keeps it at its natural width and lets the meta wrap or truncate instead.	11 天前
test_utils_truthy_values.py	Gate tool-gateway behind an env var, so it's not in users' faces until we're ready. Even if users enable it, it'll be blocked server-side for now, until we unlock for non-admin users on tool-gateway.	2 个月前
test_yuanbao_integration.py	yuanbao platform (#16298) Co-authored-by: loongzhao <loongzhao@tencent.com>	1 个月前
test_yuanbao_markdown.py	yuanbao platform (#16298) Co-authored-by: loongzhao <loongzhao@tencent.com>	1 个月前
test_yuanbao_pipeline.py	yuanbao platform (#16298) Co-authored-by: loongzhao <loongzhao@tencent.com>	1 个月前
test_yuanbao_proto.py	yuanbao platform (#16298) Co-authored-by: loongzhao <loongzhao@tencent.com>	1 个月前