fix(deepseek): set default_aux_model on profile so aux warning stops firing
Closes #26924 (and supersedes #26926) in spirit.
DeepSeek was missing default_aux_model on its ProviderProfile, so
_get_aux_model_for_provider("deepseek") returned an empty string and
the compression / vision / session-search paths emitted
"No auxiliary LLM provider configured -- context compression will
drop middle turns without a summary."
on every DeepSeek session, even when the user had perfectly working
DeepSeek credentials.
Fix lands at the profile layer rather than the legacy
_API_KEY_PROVIDER_AUX_MODELS_FALLBACK dict the original PR targeted.
Every modern provider (gemini, zai, minimax, anthropic, kimi-coding,
stepfun, ollama-cloud, gmi, novita, kilocode, ai-gateway, opencode-zen)
sets default_aux_model on its ProviderProfile; the fallback dict
only exists for providers that predate the profiles system.
Tests added under tests/plugins/model_providers/test_deepseek_profile.py:
- test_profile_advertises_deepseek_chat -- pins the profile attribute
- test_consumer_api_returns_deepseek_chat -- pins the consumer API behavior
- test_consumer_api_returns_non_empty -- regression guard for the
symptom in the issue
Original diagnosis and aux-model choice from @kriscolab in PR #26926;
moved one layer up.
Co-authored-by: kriscolab <71590782+kriscolab@users.noreply.github.com>
chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355)
Six days after #23937 (608 fixes) the codebase had accumulated 241 new
PLR6201 violations. Same mechanical x in (...) → x in {...} fix,
same zero-risk profile: set lookup is O(1) vs O(n) for tuple and the
two are semantically equivalent for hashable scalar membership tests.
All 241 instances fixed via `ruff check --select PLR6201 --fix
--unsafe-fixes`, zero remaining. Every changed value is a hashable
scalar (str/int/None/enum/signal); no risk of unhashable runtime
errors. No behavior change.
Test plan:
- 119 files changed, +244/-244 (net zero) — exactly one-line edits
- ruff check clean afterward
- Compile checks pass on the largest touched files (cli.py, run_agent.py,
gateway/run.py, gateway/platforms/discord.py, model_tools.py)
- Subset broad test run on tests/gateway/ tests/hermes_cli/ tests/agent/
tests/tools/: 18187 passed, 59 pre-existing failures (verified against
origin/main with the same shape — identical failure count, identical
category — all xdist test-order flakes unrelated to this change)
Follows the same template as PR #23937 ([tracker: #23972](https://github.com/NousResearch/hermes-agent/issues/23972)).
fix: mem0 API v2 compat, prefetch context fencing, secret redaction (#5423)
Consolidated salvage from PRs #5301 (qaqcvc), #5339 (lance0),
#5058 and #5098 (maymuneth).
Mem0 API v2 compatibility (#5301):
- All reads use filters={user_id: ...} instead of bare user_id= kwarg
- All writes use filters with user_id + agent_id for attribution
- Response unwrapping for v2 dict format {results: [...]}
- Split _read_filters() vs _write_filters() — reads are user-scoped
only for cross-session recall, writes include agent_id
- Preserved 'hermes-user' default (no breaking change for existing users)
- Omitted run_id scoping from #5301 — cross-session memory is Mem0's
core value, session-scoping reads would defeat that purpose
Memory prefetch context fencing (#5339):
- Wraps prefetched memory in <memory-context> fenced blocks with system
note marking content as recalled context, NOT user input
- Sanitizes provider output to strip fence-escape sequences, preventing
injection where memory content breaks out of the fence
- API-call-time only — never persisted to session history
Secret redaction (#5058, #5098):
- Added prefix patterns for Groq (gsk_), Matrix (syt_), RetainDB
(retaindb_), Hindsight (hsk-), Mem0 (mem0_), ByteRover (brv_)
chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355)
Six days after #23937 (608 fixes) the codebase had accumulated 241 new
PLR6201 violations. Same mechanical x in (...) → x in {...} fix,
same zero-risk profile: set lookup is O(1) vs O(n) for tuple and the
two are semantically equivalent for hashable scalar membership tests.
All 241 instances fixed via `ruff check --select PLR6201 --fix
--unsafe-fixes`, zero remaining. Every changed value is a hashable
scalar (str/int/None/enum/signal); no risk of unhashable runtime
errors. No behavior change.
Test plan:
- 119 files changed, +244/-244 (net zero) — exactly one-line edits
- ruff check clean afterward
- Compile checks pass on the largest touched files (cli.py, run_agent.py,
gateway/run.py, gateway/platforms/discord.py, model_tools.py)
- Subset broad test run on tests/gateway/ tests/hermes_cli/ tests/agent/
tests/tools/: 18187 passed, 59 pre-existing failures (verified against
origin/main with the same shape — identical failure count, identical
category — all xdist test-order flakes unrelated to this change)
Follows the same template as PR #23937 ([tracker: #23972](https://github.com/NousResearch/hermes-agent/issues/23972)).
feat(plugins): google_meet \u2014 join, transcribe, speak, follow up (#16364)
* feat(plugins): google_meet — bundled plugin for join+transcribe Meet calls
v1 shipping transcribe-only. Spawns headless Chromium via Playwright,
joins an explicit https://meet.google.com/ URL, enables live captions,
and scrapes them into a transcript file the agent can read across turns.
The agent then has the meeting content in context and can do followup
work (send recap, file issues, schedule followups) with its regular tools.
Surface:
- Tools: meet_join, meet_status, meet_transcript, meet_leave, meet_say
(meet_say is a v1 stub — returns not-implemented; v2 will wire
realtime duplex audio via OpenAI Realtime / Gemini Live +
BlackHole / PulseAudio null-sink.)
- CLI: hermes meet setup | auth | join | status | transcript | stop
- Lifecycle: on_session_end auto-leaves any still-running bot.
Safety:
- URL regex rejects anything that isn't https://meet.google.com/...
- No calendar scanning, no auto-dial, no auto-consent announcement.
- Single active meeting per install; a second meet_join leaves the first.
- Platform-gated to Linux + macOS (Windows audio routing for v2 untested).
- Opt-in: standalone plugin, user must add 'google_meet' to
plugins.enabled in config.yaml.
Zero core changes. Plugin uses existing register_tool /
register_cli_command / register_hook surfaces. 21 new unit tests cover the
URL safety gate, transcript dedup + status round-trip, process-manager
refusals/start/stop paths, tool-handler JSON shape under each branch,
session-end cleanup, and platform-gated register().
* feat(plugins/google_meet): v2 realtime audio + v3 remote node host
v2 \u2014 agent speaks in-meeting
audio_bridge.py: PulseAudio null-sink (Linux) + BlackHole probe (macOS).
On Linux we load pactl module-null-sink + module-virtual-source, track
module ids for teardown; Chrome gets PULSE_SOURCE=<virt src> env so its
fake mic reads what we write to the sink. macOS just probes BlackHole
2ch and returns its device name \u2014 the plugin refuses to switch the
user's default audio input (that would surprise them).
realtime/openai_client.py: sync WebSocket client for the OpenAI Realtime
API. RealtimeSession.speak(text) sends conversation.item.create +
response.create, accumulates response.audio.delta PCM bytes, appends
them to a file. RealtimeSpeaker runs a JSONL-queue loop consuming
meet_say calls. 'websockets' is an optional dep imported lazily.
meet_bot.py: when HERMES_MEET_MODE=realtime, provisions AudioBridge,
starts RealtimeSession + speaker thread, spawns paplay to pump PCM
into the null-sink, then cleans everything up on SIGTERM. If any
realtime setup step fails, falls back cleanly to transcribe mode
with an error flagged in status.json.
process_manager.enqueue_say(): writes a JSONL line to say_queue.jsonl;
refuses when no active meeting or active meeting is transcribe-only.
tools.meet_say: real implementation; requires active mode='realtime'.
meet_join: adds mode='transcribe'|'realtime' param.
v3 \u2014 remote node host
node/protocol.py: JSON envelope (type, id, token, payload) + validate.
node/registry.py: $HERMES_HOME/workspace/meetings/nodes.json, with
resolve() auto-selecting the sole registered node when name is None.
node/server.py: NodeServer \u2014 websockets.serve, bearer-token auth,
dispatches start_bot/stop/status/transcript/say/ping onto the local
process_manager. Token auto-generated + persisted on first run.
node/client.py: NodeClient \u2014 short-lived sync WS per RPC, raises
RuntimeError on error envelopes, clean API matching the server.
node/cli.py: 'hermes meet node {run,list,approve,remove,status,ping}'
subtree; wired into the main meet CLI by cli.py so 'hermes meet node'
Just Works.
tools.py: every meet_* tool accepts node='<name>'|'auto'; when set,
routes through NodeClient to the remote bot instead of running
locally. Unknown node \u2192 clear 'no registered meet node matches ...'
error.
cli.py: 'hermes meet join --node my-mac --mode realtime' and
'hermes meet say "..." --node my-mac' route to the node; 'hermes
meet node approve <name> <url> <token>' registers one.
Tests
21 v1 tests updated (meet_say is no longer a stub; active-record now
carries mode).
20 new audio_bridge + realtime tests.
42 new node tests (protocol/registry/server/client/cli).
17 new v1/v2/v3 integration tests at the plugin level covering
enqueue_say edge cases, env var passthrough, mode validation, node
routing (known/unknown/auto/ambiguous), and argparse wiring for
hermes meet say + hermes meet node + --mode/--node flags.
Total: 100 plugin tests + 58 plugin-system tests = 158 passing.
E2E verified on Linux with fresh HERMES_HOME: plugin loads, 5 tools
register, on_session_end hook wires, 'hermes meet' CLI tree wires
including the node subtree, NodeRegistry round-trips, meet_join routes
correctly to NodeClient under node='my-mac' with mode='realtime',
enqueue_say accepts realtime/rejects transcribe, argparse parses every
new flag cleanly.
Zero changes to core. All new code lives under plugins/google_meet/.
* feat(plugins/google_meet): auto-install, admission detect, mac PCM pump, barge-in, richer status
Ready-for-live-test follow-up on PR #16364. Five additions that matter for
the first live run on a real Meet, in priority order:
1. hermes meet install [--realtime] [--yes]
pip install playwright websockets + python -m playwright install chromium
--realtime: installs platform audio deps (pulseaudio-utils on Linux via
sudo apt, blackhole-2ch + ffmpeg on macOS via brew). Prompts before
sudo/brew unless --yes. Refuses on Windows. Refuses to auto-flip the
macOS default input — user still selects BlackHole in System Settings
(deliberate; surprise audio rerouting is worse than a manual step).
2. Admission detection
_detect_admission(page): Leave-button visible OR caption region
attached OR participants list present → we're in-call.
_detect_denied(page): 'You can\'t join this video call' / 'You were
removed' / 'No one responded to your request' → bail out.
HERMES_MEET_LOBBY_TIMEOUT (default 300s) caps how long we sit in
the lobby before giving up. in_call stays False until admitted.
Status surfaces leaveReason: duration_expired | lobby_timeout |
denied | page_closed.
3. macOS PCM pump
ffmpeg reads speaker.pcm (24kHz s16le mono) and writes to the
BlackHole AVFoundation output via -f audiotoolbox
-audio_device_index <N>. _mac_audio_device_index() probes
ffmpeg -f avfoundation -list_devices true to resolve 'BlackHole 2ch'
→ numeric index. Falls back to index 0 on probe failure. Linux
paplay pump unchanged.
4. Richer status dict
_BotState now tracks realtime, realtimeReady, realtimeDevice,
audioBytesOut, lastAudioOutAt, lastBargeInAt, joinAttemptedAt,
leaveReason. RealtimeSession.audio_bytes_out / last_audio_out_at
counters fold into the status file once a second so meet_status()
can show the agent's voice activity in near-real-time.
5. Barge-in
RealtimeSession.cancel_response() sends type='response.cancel' over
the same WS (lock-guarded so it's safe to call from the caption
thread while speak() is reading frames). Handles response.cancelled
as a terminal frame type. _looks_like_human_speaker() gates triggers
so the bot's own name, 'You', 'Unknown', and blanks don't self-cancel.
Called from the caption drain loop: when a new caption arrives
attributed to a real participant while rt.session exists, we fire
cancel_response() and stamp lastBargeInAt.
Tests: 20 new unit tests across _BotState telemetry, barge-in gating,
admission/denied probe error handling, cancel_response with and without
a connected WS, and hermes meet install CLI wiring (flag parsing +
end-to-end subprocess.run verification + Linux-already-installed fast
path). Total 171 passing across all google_meet test files + the
plugin-system regression suite.
E2E verified on Linux: plugin loads, all 5 tools register,
hermes meet install --realtime --yes parses, fresh-bot status.json
has every new telemetry key, cancel_response on a disconnected session
returns False without raising, barge-in helper gates the bot's own
name correctly.
Still out of scope (for a future PR, not blocking live test):
mic → Realtime duplex (the agent listening to meeting audio via
WebRTC), node-host TLS/pairing UX, Windows audio, Meet create+Twilio.
Docs updated: SKILL.md now lists the installer subcommand, lobby
timeout, barge-in caveat, and the full status-dict reference table.
README.md quick-start uses hermes meet install.
feat(plugins): google_meet \u2014 join, transcribe, speak, follow up (#16364)
* feat(plugins): google_meet — bundled plugin for join+transcribe Meet calls
v1 shipping transcribe-only. Spawns headless Chromium via Playwright,
joins an explicit https://meet.google.com/ URL, enables live captions,
and scrapes them into a transcript file the agent can read across turns.
The agent then has the meeting content in context and can do followup
work (send recap, file issues, schedule followups) with its regular tools.
Surface:
- Tools: meet_join, meet_status, meet_transcript, meet_leave, meet_say
(meet_say is a v1 stub — returns not-implemented; v2 will wire
realtime duplex audio via OpenAI Realtime / Gemini Live +
BlackHole / PulseAudio null-sink.)
- CLI: hermes meet setup | auth | join | status | transcript | stop
- Lifecycle: on_session_end auto-leaves any still-running bot.
Safety:
- URL regex rejects anything that isn't https://meet.google.com/...
- No calendar scanning, no auto-dial, no auto-consent announcement.
- Single active meeting per install; a second meet_join leaves the first.
- Platform-gated to Linux + macOS (Windows audio routing for v2 untested).
- Opt-in: standalone plugin, user must add 'google_meet' to
plugins.enabled in config.yaml.
Zero core changes. Plugin uses existing register_tool /
register_cli_command / register_hook surfaces. 21 new unit tests cover the
URL safety gate, transcript dedup + status round-trip, process-manager
refusals/start/stop paths, tool-handler JSON shape under each branch,
session-end cleanup, and platform-gated register().
* feat(plugins/google_meet): v2 realtime audio + v3 remote node host
v2 \u2014 agent speaks in-meeting
audio_bridge.py: PulseAudio null-sink (Linux) + BlackHole probe (macOS).
On Linux we load pactl module-null-sink + module-virtual-source, track
module ids for teardown; Chrome gets PULSE_SOURCE=<virt src> env so its
fake mic reads what we write to the sink. macOS just probes BlackHole
2ch and returns its device name \u2014 the plugin refuses to switch the
user's default audio input (that would surprise them).
realtime/openai_client.py: sync WebSocket client for the OpenAI Realtime
API. RealtimeSession.speak(text) sends conversation.item.create +
response.create, accumulates response.audio.delta PCM bytes, appends
them to a file. RealtimeSpeaker runs a JSONL-queue loop consuming
meet_say calls. 'websockets' is an optional dep imported lazily.
meet_bot.py: when HERMES_MEET_MODE=realtime, provisions AudioBridge,
starts RealtimeSession + speaker thread, spawns paplay to pump PCM
into the null-sink, then cleans everything up on SIGTERM. If any
realtime setup step fails, falls back cleanly to transcribe mode
with an error flagged in status.json.
process_manager.enqueue_say(): writes a JSONL line to say_queue.jsonl;
refuses when no active meeting or active meeting is transcribe-only.
tools.meet_say: real implementation; requires active mode='realtime'.
meet_join: adds mode='transcribe'|'realtime' param.
v3 \u2014 remote node host
node/protocol.py: JSON envelope (type, id, token, payload) + validate.
node/registry.py: $HERMES_HOME/workspace/meetings/nodes.json, with
resolve() auto-selecting the sole registered node when name is None.
node/server.py: NodeServer \u2014 websockets.serve, bearer-token auth,
dispatches start_bot/stop/status/transcript/say/ping onto the local
process_manager. Token auto-generated + persisted on first run.
node/client.py: NodeClient \u2014 short-lived sync WS per RPC, raises
RuntimeError on error envelopes, clean API matching the server.
node/cli.py: 'hermes meet node {run,list,approve,remove,status,ping}'
subtree; wired into the main meet CLI by cli.py so 'hermes meet node'
Just Works.
tools.py: every meet_* tool accepts node='<name>'|'auto'; when set,
routes through NodeClient to the remote bot instead of running
locally. Unknown node \u2192 clear 'no registered meet node matches ...'
error.
cli.py: 'hermes meet join --node my-mac --mode realtime' and
'hermes meet say "..." --node my-mac' route to the node; 'hermes
meet node approve <name> <url> <token>' registers one.
Tests
21 v1 tests updated (meet_say is no longer a stub; active-record now
carries mode).
20 new audio_bridge + realtime tests.
42 new node tests (protocol/registry/server/client/cli).
17 new v1/v2/v3 integration tests at the plugin level covering
enqueue_say edge cases, env var passthrough, mode validation, node
routing (known/unknown/auto/ambiguous), and argparse wiring for
hermes meet say + hermes meet node + --mode/--node flags.
Total: 100 plugin tests + 58 plugin-system tests = 158 passing.
E2E verified on Linux with fresh HERMES_HOME: plugin loads, 5 tools
register, on_session_end hook wires, 'hermes meet' CLI tree wires
including the node subtree, NodeRegistry round-trips, meet_join routes
correctly to NodeClient under node='my-mac' with mode='realtime',
enqueue_say accepts realtime/rejects transcribe, argparse parses every
new flag cleanly.
Zero changes to core. All new code lives under plugins/google_meet/.
* feat(plugins/google_meet): auto-install, admission detect, mac PCM pump, barge-in, richer status
Ready-for-live-test follow-up on PR #16364. Five additions that matter for
the first live run on a real Meet, in priority order:
1. hermes meet install [--realtime] [--yes]
pip install playwright websockets + python -m playwright install chromium
--realtime: installs platform audio deps (pulseaudio-utils on Linux via
sudo apt, blackhole-2ch + ffmpeg on macOS via brew). Prompts before
sudo/brew unless --yes. Refuses on Windows. Refuses to auto-flip the
macOS default input — user still selects BlackHole in System Settings
(deliberate; surprise audio rerouting is worse than a manual step).
2. Admission detection
_detect_admission(page): Leave-button visible OR caption region
attached OR participants list present → we're in-call.
_detect_denied(page): 'You can\'t join this video call' / 'You were
removed' / 'No one responded to your request' → bail out.
HERMES_MEET_LOBBY_TIMEOUT (default 300s) caps how long we sit in
the lobby before giving up. in_call stays False until admitted.
Status surfaces leaveReason: duration_expired | lobby_timeout |
denied | page_closed.
3. macOS PCM pump
ffmpeg reads speaker.pcm (24kHz s16le mono) and writes to the
BlackHole AVFoundation output via -f audiotoolbox
-audio_device_index <N>. _mac_audio_device_index() probes
ffmpeg -f avfoundation -list_devices true to resolve 'BlackHole 2ch'
→ numeric index. Falls back to index 0 on probe failure. Linux
paplay pump unchanged.
4. Richer status dict
_BotState now tracks realtime, realtimeReady, realtimeDevice,
audioBytesOut, lastAudioOutAt, lastBargeInAt, joinAttemptedAt,
leaveReason. RealtimeSession.audio_bytes_out / last_audio_out_at
counters fold into the status file once a second so meet_status()
can show the agent's voice activity in near-real-time.
5. Barge-in
RealtimeSession.cancel_response() sends type='response.cancel' over
the same WS (lock-guarded so it's safe to call from the caption
thread while speak() is reading frames). Handles response.cancelled
as a terminal frame type. _looks_like_human_speaker() gates triggers
so the bot's own name, 'You', 'Unknown', and blanks don't self-cancel.
Called from the caption drain loop: when a new caption arrives
attributed to a real participant while rt.session exists, we fire
cancel_response() and stamp lastBargeInAt.
Tests: 20 new unit tests across _BotState telemetry, barge-in gating,
admission/denied probe error handling, cancel_response with and without
a connected WS, and hermes meet install CLI wiring (flag parsing +
end-to-end subprocess.run verification + Linux-already-installed fast
path). Total 171 passing across all google_meet test files + the
plugin-system regression suite.
E2E verified on Linux: plugin loads, all 5 tools register,
hermes meet install --realtime --yes parses, fresh-bot status.json
has every new telemetry key, cancel_response on a disconnected session
returns False without raising, barge-in helper gates the bot's own
name correctly.
Still out of scope (for a future PR, not blocking live test):
mic → Realtime duplex (the agent listening to meeting audio via
WebRTC), node-host TLS/pairing UX, Windows audio, Meet create+Twilio.
Docs updated: SKILL.md now lists the installer subcommand, lobby
timeout, barge-in caveat, and the full status-dict reference table.
README.md quick-start uses hermes meet install.
feat(plugins): google_meet \u2014 join, transcribe, speak, follow up (#16364)
* feat(plugins): google_meet — bundled plugin for join+transcribe Meet calls
v1 shipping transcribe-only. Spawns headless Chromium via Playwright,
joins an explicit https://meet.google.com/ URL, enables live captions,
and scrapes them into a transcript file the agent can read across turns.
The agent then has the meeting content in context and can do followup
work (send recap, file issues, schedule followups) with its regular tools.
Surface:
- Tools: meet_join, meet_status, meet_transcript, meet_leave, meet_say
(meet_say is a v1 stub — returns not-implemented; v2 will wire
realtime duplex audio via OpenAI Realtime / Gemini Live +
BlackHole / PulseAudio null-sink.)
- CLI: hermes meet setup | auth | join | status | transcript | stop
- Lifecycle: on_session_end auto-leaves any still-running bot.
Safety:
- URL regex rejects anything that isn't https://meet.google.com/...
- No calendar scanning, no auto-dial, no auto-consent announcement.
- Single active meeting per install; a second meet_join leaves the first.
- Platform-gated to Linux + macOS (Windows audio routing for v2 untested).
- Opt-in: standalone plugin, user must add 'google_meet' to
plugins.enabled in config.yaml.
Zero core changes. Plugin uses existing register_tool /
register_cli_command / register_hook surfaces. 21 new unit tests cover the
URL safety gate, transcript dedup + status round-trip, process-manager
refusals/start/stop paths, tool-handler JSON shape under each branch,
session-end cleanup, and platform-gated register().
* feat(plugins/google_meet): v2 realtime audio + v3 remote node host
v2 \u2014 agent speaks in-meeting
audio_bridge.py: PulseAudio null-sink (Linux) + BlackHole probe (macOS).
On Linux we load pactl module-null-sink + module-virtual-source, track
module ids for teardown; Chrome gets PULSE_SOURCE=<virt src> env so its
fake mic reads what we write to the sink. macOS just probes BlackHole
2ch and returns its device name \u2014 the plugin refuses to switch the
user's default audio input (that would surprise them).
realtime/openai_client.py: sync WebSocket client for the OpenAI Realtime
API. RealtimeSession.speak(text) sends conversation.item.create +
response.create, accumulates response.audio.delta PCM bytes, appends
them to a file. RealtimeSpeaker runs a JSONL-queue loop consuming
meet_say calls. 'websockets' is an optional dep imported lazily.
meet_bot.py: when HERMES_MEET_MODE=realtime, provisions AudioBridge,
starts RealtimeSession + speaker thread, spawns paplay to pump PCM
into the null-sink, then cleans everything up on SIGTERM. If any
realtime setup step fails, falls back cleanly to transcribe mode
with an error flagged in status.json.
process_manager.enqueue_say(): writes a JSONL line to say_queue.jsonl;
refuses when no active meeting or active meeting is transcribe-only.
tools.meet_say: real implementation; requires active mode='realtime'.
meet_join: adds mode='transcribe'|'realtime' param.
v3 \u2014 remote node host
node/protocol.py: JSON envelope (type, id, token, payload) + validate.
node/registry.py: $HERMES_HOME/workspace/meetings/nodes.json, with
resolve() auto-selecting the sole registered node when name is None.
node/server.py: NodeServer \u2014 websockets.serve, bearer-token auth,
dispatches start_bot/stop/status/transcript/say/ping onto the local
process_manager. Token auto-generated + persisted on first run.
node/client.py: NodeClient \u2014 short-lived sync WS per RPC, raises
RuntimeError on error envelopes, clean API matching the server.
node/cli.py: 'hermes meet node {run,list,approve,remove,status,ping}'
subtree; wired into the main meet CLI by cli.py so 'hermes meet node'
Just Works.
tools.py: every meet_* tool accepts node='<name>'|'auto'; when set,
routes through NodeClient to the remote bot instead of running
locally. Unknown node \u2192 clear 'no registered meet node matches ...'
error.
cli.py: 'hermes meet join --node my-mac --mode realtime' and
'hermes meet say "..." --node my-mac' route to the node; 'hermes
meet node approve <name> <url> <token>' registers one.
Tests
21 v1 tests updated (meet_say is no longer a stub; active-record now
carries mode).
20 new audio_bridge + realtime tests.
42 new node tests (protocol/registry/server/client/cli).
17 new v1/v2/v3 integration tests at the plugin level covering
enqueue_say edge cases, env var passthrough, mode validation, node
routing (known/unknown/auto/ambiguous), and argparse wiring for
hermes meet say + hermes meet node + --mode/--node flags.
Total: 100 plugin tests + 58 plugin-system tests = 158 passing.
E2E verified on Linux with fresh HERMES_HOME: plugin loads, 5 tools
register, on_session_end hook wires, 'hermes meet' CLI tree wires
including the node subtree, NodeRegistry round-trips, meet_join routes
correctly to NodeClient under node='my-mac' with mode='realtime',
enqueue_say accepts realtime/rejects transcribe, argparse parses every
new flag cleanly.
Zero changes to core. All new code lives under plugins/google_meet/.
* feat(plugins/google_meet): auto-install, admission detect, mac PCM pump, barge-in, richer status
Ready-for-live-test follow-up on PR #16364. Five additions that matter for
the first live run on a real Meet, in priority order:
1. hermes meet install [--realtime] [--yes]
pip install playwright websockets + python -m playwright install chromium
--realtime: installs platform audio deps (pulseaudio-utils on Linux via
sudo apt, blackhole-2ch + ffmpeg on macOS via brew). Prompts before
sudo/brew unless --yes. Refuses on Windows. Refuses to auto-flip the
macOS default input — user still selects BlackHole in System Settings
(deliberate; surprise audio rerouting is worse than a manual step).
2. Admission detection
_detect_admission(page): Leave-button visible OR caption region
attached OR participants list present → we're in-call.
_detect_denied(page): 'You can\'t join this video call' / 'You were
removed' / 'No one responded to your request' → bail out.
HERMES_MEET_LOBBY_TIMEOUT (default 300s) caps how long we sit in
the lobby before giving up. in_call stays False until admitted.
Status surfaces leaveReason: duration_expired | lobby_timeout |
denied | page_closed.
3. macOS PCM pump
ffmpeg reads speaker.pcm (24kHz s16le mono) and writes to the
BlackHole AVFoundation output via -f audiotoolbox
-audio_device_index <N>. _mac_audio_device_index() probes
ffmpeg -f avfoundation -list_devices true to resolve 'BlackHole 2ch'
→ numeric index. Falls back to index 0 on probe failure. Linux
paplay pump unchanged.
4. Richer status dict
_BotState now tracks realtime, realtimeReady, realtimeDevice,
audioBytesOut, lastAudioOutAt, lastBargeInAt, joinAttemptedAt,
leaveReason. RealtimeSession.audio_bytes_out / last_audio_out_at
counters fold into the status file once a second so meet_status()
can show the agent's voice activity in near-real-time.
5. Barge-in
RealtimeSession.cancel_response() sends type='response.cancel' over
the same WS (lock-guarded so it's safe to call from the caption
thread while speak() is reading frames). Handles response.cancelled
as a terminal frame type. _looks_like_human_speaker() gates triggers
so the bot's own name, 'You', 'Unknown', and blanks don't self-cancel.
Called from the caption drain loop: when a new caption arrives
attributed to a real participant while rt.session exists, we fire
cancel_response() and stamp lastBargeInAt.
Tests: 20 new unit tests across _BotState telemetry, barge-in gating,
admission/denied probe error handling, cancel_response with and without
a connected WS, and hermes meet install CLI wiring (flag parsing +
end-to-end subprocess.run verification + Linux-already-installed fast
path). Total 171 passing across all google_meet test files + the
plugin-system regression suite.
E2E verified on Linux: plugin loads, all 5 tools register,
hermes meet install --realtime --yes parses, fresh-bot status.json
has every new telemetry key, cancel_response on a disconnected session
returns False without raising, barge-in helper gates the bot's own
name correctly.
Still out of scope (for a future PR, not blocking live test):
mic → Realtime duplex (the agent listening to meeting audio via
WebRTC), node-host TLS/pairing UX, Windows audio, Meet create+Twilio.
Docs updated: SKILL.md now lists the installer subcommand, lobby
timeout, barge-in caveat, and the full status-dict reference table.
README.md quick-start uses hermes meet install.
feat(plugins): google_meet \u2014 join, transcribe, speak, follow up (#16364)
* feat(plugins): google_meet — bundled plugin for join+transcribe Meet calls
v1 shipping transcribe-only. Spawns headless Chromium via Playwright,
joins an explicit https://meet.google.com/ URL, enables live captions,
and scrapes them into a transcript file the agent can read across turns.
The agent then has the meeting content in context and can do followup
work (send recap, file issues, schedule followups) with its regular tools.
Surface:
- Tools: meet_join, meet_status, meet_transcript, meet_leave, meet_say
(meet_say is a v1 stub — returns not-implemented; v2 will wire
realtime duplex audio via OpenAI Realtime / Gemini Live +
BlackHole / PulseAudio null-sink.)
- CLI: hermes meet setup | auth | join | status | transcript | stop
- Lifecycle: on_session_end auto-leaves any still-running bot.
Safety:
- URL regex rejects anything that isn't https://meet.google.com/...
- No calendar scanning, no auto-dial, no auto-consent announcement.
- Single active meeting per install; a second meet_join leaves the first.
- Platform-gated to Linux + macOS (Windows audio routing for v2 untested).
- Opt-in: standalone plugin, user must add 'google_meet' to
plugins.enabled in config.yaml.
Zero core changes. Plugin uses existing register_tool /
register_cli_command / register_hook surfaces. 21 new unit tests cover the
URL safety gate, transcript dedup + status round-trip, process-manager
refusals/start/stop paths, tool-handler JSON shape under each branch,
session-end cleanup, and platform-gated register().
* feat(plugins/google_meet): v2 realtime audio + v3 remote node host
v2 \u2014 agent speaks in-meeting
audio_bridge.py: PulseAudio null-sink (Linux) + BlackHole probe (macOS).
On Linux we load pactl module-null-sink + module-virtual-source, track
module ids for teardown; Chrome gets PULSE_SOURCE=<virt src> env so its
fake mic reads what we write to the sink. macOS just probes BlackHole
2ch and returns its device name \u2014 the plugin refuses to switch the
user's default audio input (that would surprise them).
realtime/openai_client.py: sync WebSocket client for the OpenAI Realtime
API. RealtimeSession.speak(text) sends conversation.item.create +
response.create, accumulates response.audio.delta PCM bytes, appends
them to a file. RealtimeSpeaker runs a JSONL-queue loop consuming
meet_say calls. 'websockets' is an optional dep imported lazily.
meet_bot.py: when HERMES_MEET_MODE=realtime, provisions AudioBridge,
starts RealtimeSession + speaker thread, spawns paplay to pump PCM
into the null-sink, then cleans everything up on SIGTERM. If any
realtime setup step fails, falls back cleanly to transcribe mode
with an error flagged in status.json.
process_manager.enqueue_say(): writes a JSONL line to say_queue.jsonl;
refuses when no active meeting or active meeting is transcribe-only.
tools.meet_say: real implementation; requires active mode='realtime'.
meet_join: adds mode='transcribe'|'realtime' param.
v3 \u2014 remote node host
node/protocol.py: JSON envelope (type, id, token, payload) + validate.
node/registry.py: $HERMES_HOME/workspace/meetings/nodes.json, with
resolve() auto-selecting the sole registered node when name is None.
node/server.py: NodeServer \u2014 websockets.serve, bearer-token auth,
dispatches start_bot/stop/status/transcript/say/ping onto the local
process_manager. Token auto-generated + persisted on first run.
node/client.py: NodeClient \u2014 short-lived sync WS per RPC, raises
RuntimeError on error envelopes, clean API matching the server.
node/cli.py: 'hermes meet node {run,list,approve,remove,status,ping}'
subtree; wired into the main meet CLI by cli.py so 'hermes meet node'
Just Works.
tools.py: every meet_* tool accepts node='<name>'|'auto'; when set,
routes through NodeClient to the remote bot instead of running
locally. Unknown node \u2192 clear 'no registered meet node matches ...'
error.
cli.py: 'hermes meet join --node my-mac --mode realtime' and
'hermes meet say "..." --node my-mac' route to the node; 'hermes
meet node approve <name> <url> <token>' registers one.
Tests
21 v1 tests updated (meet_say is no longer a stub; active-record now
carries mode).
20 new audio_bridge + realtime tests.
42 new node tests (protocol/registry/server/client/cli).
17 new v1/v2/v3 integration tests at the plugin level covering
enqueue_say edge cases, env var passthrough, mode validation, node
routing (known/unknown/auto/ambiguous), and argparse wiring for
hermes meet say + hermes meet node + --mode/--node flags.
Total: 100 plugin tests + 58 plugin-system tests = 158 passing.
E2E verified on Linux with fresh HERMES_HOME: plugin loads, 5 tools
register, on_session_end hook wires, 'hermes meet' CLI tree wires
including the node subtree, NodeRegistry round-trips, meet_join routes
correctly to NodeClient under node='my-mac' with mode='realtime',
enqueue_say accepts realtime/rejects transcribe, argparse parses every
new flag cleanly.
Zero changes to core. All new code lives under plugins/google_meet/.
* feat(plugins/google_meet): auto-install, admission detect, mac PCM pump, barge-in, richer status
Ready-for-live-test follow-up on PR #16364. Five additions that matter for
the first live run on a real Meet, in priority order:
1. hermes meet install [--realtime] [--yes]
pip install playwright websockets + python -m playwright install chromium
--realtime: installs platform audio deps (pulseaudio-utils on Linux via
sudo apt, blackhole-2ch + ffmpeg on macOS via brew). Prompts before
sudo/brew unless --yes. Refuses on Windows. Refuses to auto-flip the
macOS default input — user still selects BlackHole in System Settings
(deliberate; surprise audio rerouting is worse than a manual step).
2. Admission detection
_detect_admission(page): Leave-button visible OR caption region
attached OR participants list present → we're in-call.
_detect_denied(page): 'You can\'t join this video call' / 'You were
removed' / 'No one responded to your request' → bail out.
HERMES_MEET_LOBBY_TIMEOUT (default 300s) caps how long we sit in
the lobby before giving up. in_call stays False until admitted.
Status surfaces leaveReason: duration_expired | lobby_timeout |
denied | page_closed.
3. macOS PCM pump
ffmpeg reads speaker.pcm (24kHz s16le mono) and writes to the
BlackHole AVFoundation output via -f audiotoolbox
-audio_device_index <N>. _mac_audio_device_index() probes
ffmpeg -f avfoundation -list_devices true to resolve 'BlackHole 2ch'
→ numeric index. Falls back to index 0 on probe failure. Linux
paplay pump unchanged.
4. Richer status dict
_BotState now tracks realtime, realtimeReady, realtimeDevice,
audioBytesOut, lastAudioOutAt, lastBargeInAt, joinAttemptedAt,
leaveReason. RealtimeSession.audio_bytes_out / last_audio_out_at
counters fold into the status file once a second so meet_status()
can show the agent's voice activity in near-real-time.
5. Barge-in
RealtimeSession.cancel_response() sends type='response.cancel' over
the same WS (lock-guarded so it's safe to call from the caption
thread while speak() is reading frames). Handles response.cancelled
as a terminal frame type. _looks_like_human_speaker() gates triggers
so the bot's own name, 'You', 'Unknown', and blanks don't self-cancel.
Called from the caption drain loop: when a new caption arrives
attributed to a real participant while rt.session exists, we fire
cancel_response() and stamp lastBargeInAt.
Tests: 20 new unit tests across _BotState telemetry, barge-in gating,
admission/denied probe error handling, cancel_response with and without
a connected WS, and hermes meet install CLI wiring (flag parsing +
end-to-end subprocess.run verification + Linux-already-installed fast
path). Total 171 passing across all google_meet test files + the
plugin-system regression suite.
E2E verified on Linux: plugin loads, all 5 tools register,
hermes meet install --realtime --yes parses, fresh-bot status.json
has every new telemetry key, cancel_response on a disconnected session
returns False without raising, barge-in helper gates the bot's own
name correctly.
Still out of scope (for a future PR, not blocking live test):
mic → Realtime duplex (the agent listening to meeting audio via
WebRTC), node-host TLS/pairing UX, Windows audio, Meet create+Twilio.
Docs updated: SKILL.md now lists the installer subcommand, lobby
timeout, barge-in caveat, and the full status-dict reference table.
README.md quick-start uses hermes meet install.
fix(kanban-dashboard): restore implementations dropped during salvages (#28481)
Four kanban dashboard test failures, all from PR salvages that picked up
the test additions but dropped the corresponding implementations.
- BOARD_COLUMNS: add 'review' (status added by PR f55d94a1e but the
board API never grew the column → test_board_empty failed because
VALID_STATUSES - {archived} mismatched the rendered columns).
- update_task: enrich the 'ready' 409 detail with the blocking parent
list (id, title, status) and add _parents_blocking_ready helper.
Implementation lost in the #26744 salvage (commit e215558ba) which
pinned the test but not the server-side code.
- dist/index.js: add parseApiErrorMessage helper, wire it through the
drag/drop banner, add patchErr state to the TaskDrawer and surface
it inline by the action row. Lost in the same #26744 salvage.
- test_diagnostics_endpoint_severity_filter: update to at-or-above
semantics (PR a94ddd807 changed the filter from exact-match so the
warning filter now correctly includes error+critical too).
fix(langfuse): complete observability fix — trace I/O, tool outputs, placeholder credentials (closes #22342, #22763) (#26320)
* fix(langfuse): reject placeholder credentials with one-shot warning
When operators leave HERMES_LANGFUSE_PUBLIC_KEY / HERMES_LANGFUSE_SECRET_KEY
at a template value like 'placeholder', 'test-key', or 'your-langfuse-key',
the Langfuse SDK silently accepts the credentials at construction time and
drops every trace at flush time. No warning, no error — just an empty
Langfuse dashboard the operator only notices hours later.
Add prefix-based validation in _get_langfuse() against the documented
'pk-lf-' / 'sk-lf-' prefixes that Langfuse always issues server-side.
Anything else fires a single warning naming the offending env var(s)
with a log-safe value preview (full string for short placeholders so the
operator knows which template they left in place; truncated for long
values so a real secret pasted into the wrong field never hits the log),
then short-circuits via the existing _INIT_FAILED cache so the warning
fires once per process, not once per hook invocation.
The check sits after the 'Langfuse is None' SDK-installed guard so hosts
without the optional langfuse SDK don't see misleading 'set real keys'
hints when the actionable fix is 'pip install langfuse'. Missing
credentials remains the documented opt-out path and stays silent — no
log noise for unconfigured installs.
Fixes #22763
Fixes #23823
* fix(langfuse): use actual API request messages for generation input
on_pre_llm_request previously used the messages kwarg alone, which
could be None when Hermes passes the payload via request_messages,
conversation_history, or user_message instead. Add _coerce_request_messages
to pick the first available list across all variants, falling back to a
synthetic user message. Generations now show the real outbound payload
rather than an empty input.
* fix(langfuse): record tool call outputs in traces
Tool observations showed input (arguments) but output was always
undefined. Root cause: when tool_call_id is empty, pre_tool_call stored
observations under a unique time-based key that post_tool_call could
never reconstruct, so every tool span was closed without output by the
_finish_trace sweep.
Fix pre/post matching by routing empty-tool_call_id tools through a
per-name FIFO queue (pending_tools_by_name) instead of the time-based
key. Tools with a tool_call_id continue to use the id-keyed dict.
Also:
- Preserve OpenAI-style nested function shape in serialized tool calls
so Langfuse renders name/arguments correctly
- Keep name + tool_call_id on role:tool messages for proper pairing
- Backfill tool results onto the matching turn_tool_calls entry so the
generation's tool-call record carries the result alongside arguments
- Coerce request messages from whichever field the runtime provides
(request_messages, messages, conversation_history, user_message)
* fix(langfuse): salvage-review polish — drop dead is_first_turn, shallow-copy request_messages, real threaded FIFO test
Self-review of the combined #22345 + #23831 salvage surfaced three issues
worth fixing in the same PR rather than as follow-ups:
1. Drop is_first_turn from the pre_api_request hook. The boolean expression
not bool(conversation_history) was wrong: conversation_history is
reassigned to None mid-run after compression (5 sites in run_agent.py),
so the value flips False -> True mid-conversation on every post-compression
API call. The langfuse plugin never consumed it, so the kwarg was both
misleading AND dead.
2. Replace copy.deepcopy(request_messages) with shallow list() copy. The
pre_api_request hook contract discards return values (invoke_hook never
writes back to api_kwargs), and the langfuse plugin's _serialize_messages
already builds its own snapshot dicts via _safe_value. A deepcopy on every
API call would walk every tool result and base64 image — significant
overhead for no real isolation benefit. Shallow copy of the outer list
protects against later mutations of api_messages without paying for the
inner-dict walk.
3. Rename test_empty_tool_call_id_concurrent_fifo_order ->
test_empty_tool_call_id_observations_are_fifo_within_tool_name and add a
real test_threaded_post_calls_preserve_fifo_under_lock that spawns 8
threads behind a barrier to actually exercise _STATE_LOCK on the
pending_tools_by_name queue. The original test was sequential and only
validated Python list semantics; this one validates the lock discipline.
4. Fix stale 'Cleared by reset_cache_for_tests()' comment on _INIT_FAILED —
that function does not exist. Tests reload the module via sys.modules.pop
+ importlib.import_module instead.
Tests: 37 langfuse plugin tests pass, 658 plugin tests overall pass.
---------
Co-authored-by: xxxigm <tuancanhnguyen706@gmail.com>
Co-authored-by: Brian Conklin <brian@dralth.com>
test: speed up slow tests (backoff + subprocess + IMDS network) (#11797)
Cuts shard-3 local runtime in half by neutralizing real wall-clock
waits across three classes of slow test:
## 1. Retry backoff mocks
- tests/run_agent/conftest.py (NEW): autouse fixture mocks
jittered_backoff to 0.0 so the while time.time() < sleep_end
busy-loop exits immediately. No global time.sleep mock (would
break threading tests).
- test_anthropic_error_handling, test_413_compression,
test_run_agent_codex_responses, test_fallback_model: per-file
fixtures mock time.sleep / asyncio.sleep for retry / compression
paths.
- test_retaindb_plugin: cap the retaindb module's bound time.sleep
to 0.05s via a per-test shim (background writer-thread retries
sleep 2s after errors; tests don't care about exact duration).
Plus replace arbitrary time.sleep(N) waits with short polling
loops bounded by deadline.
## 2. Subprocess sleeps in production code
- test_update_gateway_restart: mock time.sleep. Production code
does time.sleep(3) after systemctl restart to verify the
service survived. Tests mock subprocess.run \u2014 nothing actually
restarts \u2014 so the wait is dead time.
## 3. Network / IMDS timeouts (biggest single win)
- tests/conftest.py: add AWS_EC2_METADATA_DISABLED=true plus
AWS_METADATA_SERVICE_TIMEOUT=1 and ATTEMPTS=1. boto3 falls back
to IMDS (169.254.169.254) when no AWS creds are set. Any test
hitting has_aws_credentials() / resolve_aws_auth_env_var() (e.g.
test_status, test_setup_copilot_acp, anything that touches
provider auto-detect) burned ~2-4s waiting for that to time out.
- test_exit_cleanup_interrupt: explicitly mock
resolve_runtime_provider which was doing real network auto-detect
(~4s). Tests don't care about provider resolution \u2014 the agent
is already mocked.
- test_timezone: collapse the 3-test "TZ env in subprocess" suite
into 2 tests by checking both injection AND no-leak in the same
subprocess spawn (was 3 \u00d7 3.2s, now 2 \u00d7 4s).
## Validation
| Test | Before | After |
|---|---|---|
| test_anthropic_error_handling (8 tests) | ~80s | ~15s |
| test_413_compression (14 tests) | ~18s | 2.3s |
| test_retaindb_plugin (67 tests) | ~13s | 1.3s |
| test_status_includes_tavily_key | 4.0s | 0.05s |
| test_setup_copilot_acp_skips_same_provider_pool_step | 8.0s | 0.26s |
| test_update_gateway_restart (5 tests) | ~18s total | ~0.35s total |
| test_exit_cleanup_interrupt (2 tests) | 8s | 1.5s |
| **Matrix shard 3 local** | **108s** | **50s** |
No behavioral contract changed \u2014 tests still verify retry happens,
service restart logic runs, etc.; they just don't burn real seconds
waiting for it.
Supersedes PR #11779 (those changes are included here).