File | Last commit message | Last updated
feat: new tui based on ink
2 months ago
chore: ruff auto-fix PLR6201 - tuple → set in membership tests (#23937)

Replace tuple literals with set literals for all literal-tuple membership tests. Set lookup is O(1) on average vs O(n) for a tuple, a consistent micro-optimization across the codebase.

608 instances fixed via ruff --fix --unsafe-fixes, 0 remaining. 133 files, +626/-626 (net zero).

23 days ago
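As a rough illustration of the rewrite this commit describes, the sketch below shows the same membership test in both forms. The function names and status strings are hypothetical and not taken from the codebase. It also shows one behavioural difference worth keeping in mind: a set test hashes the left-hand value, so an unhashable value raises TypeError where the tuple test would simply return False.

```python
def is_active_tuple(status):
    # Before: membership test against a tuple literal (linear scan).
    return status in ("ready", "running")


def is_active_set(status):
    # After: membership test against a set literal (hash lookup).
    return status in {"ready", "running"}


if __name__ == "__main__":
    print(is_active_tuple("ready"), is_active_set("ready"))
    print(is_active_tuple(["ready"]))  # a list is compared by equality only
    try:
        is_active_set(["ready"])  # a list cannot be hashed
    except TypeError as exc:
        print("set form raised:", exc)
```

For hashable values such as strings and integers the two forms give the same answer.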
chore: address copilot comments
1 month ago
tui: inherit Python-side rendering via gateway bridge
1 month ago
feat: auto-launch Chromium-family browser for CDP

Add browser CDP launch candidates for Chrome, Chromium, Brave, and Edge while preserving Chrome-first selection. Retry candidate launch failures instead of giving up after the first executable.

Update /browser CLI and TUI messaging, docs, and tool descriptions from Chrome-only wording to Chromium-family browser support.

Add regression coverage for Brave/Edge paths, Chrome-first precedence, fallback launches, and CDP endpoint probing.

15 days ago
feat: personality
1 month ago
fix(tui-gateway): harden stdio transport against half-closed pipes + SIGTERM races (#17118)

* fix(tui-gateway): harden stdio transport against half-closed pipes + SIGTERM races

  tui_gateway reports tui_gateway_crash.log traces where the main thread sits in sys.stdin while a worker holds _stdout_lock mid-flush, and SIGTERM then calls sys.exit(0) while the lock is still held, so the interpreter shutdown stalls behind the wedged write.

  Two narrowly scoped hardenings:

  **tui_gateway/transport.py**

  * Move JSON serialisation outside the lock, so long messages no longer block sibling writers while we serialise.
  * Treat BrokenPipeError, ValueError ("I/O on closed file") and generic OSError from both write and flush as "peer is gone": return False instead of bubbling, matching what write_json's callers in entry.py already expect.
  * Split flush into its own try block so a stuck flush never strands a partial write or holds the lock indefinitely on its way out.
  * Optional HERMES_TUI_GATEWAY_NO_FLUSH=1 env knob to skip explicit flush() entirely on environments where a half-closed read pipe produces an indefinite kernel-level block. Default unchanged.

  **tui_gateway/entry.py**

  * _log_signal now spawns a 1-second daemon timer that calls os._exit(0) if the orderly sys.exit(0) path is itself stuck behind a wedged worker. Atexit handlers run inside the grace window when they can; the timer is the safety net so a deadlocked flush no longer strands the gateway process.

  Tests:

  * test_write_json_closed_stream_returns_false: ValueError path.
  * test_write_json_oserror_on_flush_returns_false: OSError on flush must not strand the lock; the write portion still landed before the flush failure.
  * test_write_json_no_flush_env_skips_flush: env knob bypass.

  Validation:

  * scripts/run_tests.sh tests/tui_gateway/test_protocol.py (42/42 pass; one pre-existing failure on test_session_resume_returns_hydrated_messages is unrelated to this change: same include_ancestors mock kwarg issue tracked elsewhere).
  * scripts/run_tests.sh tests/test_tui_gateway_server.py 90/90 pass.

* review(copilot): tighten transport hardening comments + test cleanup

* review(copilot): narrow exception capture, configurable grace, simpler no-flush test

* fix(tui-gateway): narrow ValueError to closed-stream; surface UnicodeEncodeError

  Copilot review on PR #17118: UnicodeEncodeError is a ValueError subclass, so a non-UTF-8 stdout (mismatched PYTHONIOENCODING / locale) would have been silently swallowed as 'peer gone' under except ValueError. That hides a real environment bug.

  Now:

  - UnicodeEncodeError → log with exc_info (warning) and drop the frame
  - ValueError where str(e) contains 'closed file' → peer gone, return False
  - Any other ValueError → log loudly, drop frame (defensive, but visible)

  Same shape applied to flush. Adds two regression tests.

* fix(tui-gateway): reserve write() False for peer-gone; re-raise programming errors

  Round 2 Copilot review on PR #17118: Transport.write() returning False is documented as 'peer is gone', and entry.py reacts by calling sys.exit(0). But the implementation also returned False for non-IO conditions (non-JSON-safe payloads, UnicodeEncodeError, unrelated ValueErrors), so a programming error or local env bug would present as a clean disconnect, which is exactly the diagnosis pain we wanted to eliminate.

  Now:

  - json.dumps failure → re-raises (TypeError/ValueError surfaces in crash log)
  - BrokenPipeError → False (peer gone)
  - ValueError('...closed file...') → False (peer gone)
  - UnicodeEncodeError and any other ValueError → re-raise
  - OSError → False (existing IO-failure semantics, debug-logged)

  Tests updated to assert the re-raise behaviour and added a non-serializable-payload regression test.

* fix(tui-gateway): narrow OSError to peer-gone errnos; honest test naming

  Round 3 Copilot review on PR #17118:

  - Docstring claimed False = peer gone, but generic OSError on write/flush also returned False, meaning ENOSPC/EACCES/EIO would silently exit. Added _PEER_GONE_ERRNOS = {EPIPE, ECONNRESET, EBADF, ESHUTDOWN, +WSA} and narrowed the OSError handlers; non-peer-gone errnos re-raise. Docstring now lists OSError as peer-gone branch with the errno set.
  - The _DISABLE_FLUSH test was named after the env var but actually patched the module constant. Renamed it to reflect the contract being tested (skips flush when constant is true) AND added a real end-to-end test that sets the env var, reloads transport.py, and asserts the constant flips. Cleanup reload restores defaults so parallel tests stay isolated.

  Self-review (avoid round 4):

  - Verified TeeTransport's secondary-swallow stays intentional.
  - _log_signal grace path already covered by separate tests.

1 month ago
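The error classification this commit series ends up with can be sketched roughly as follows. This is not the code in tui_gateway/transport.py: the names write_json, is_peer_gone, PEER_GONE_ERRNOS and _write_lock are illustrative assumptions, the Windows (WSA) codes are left out, and the NO_FLUSH knob and logging are omitted. It only aims to show the contract described above, where False is reserved for "peer is gone" and everything else re-raises.

```python
import errno
import json
import threading

# Assumption: only the POSIX errnos named in the commit; the "+WSA" codes are omitted.
PEER_GONE_ERRNOS = {errno.EPIPE, errno.ECONNRESET, errno.EBADF, errno.ESHUTDOWN}

_write_lock = threading.Lock()  # hypothetical stand-in for _stdout_lock


def is_peer_gone(exc):
    """Return True only for errors that mean the reader went away."""
    if isinstance(exc, BrokenPipeError):
        return True
    if isinstance(exc, UnicodeEncodeError):
        return False  # environment bug; must stay visible
    if isinstance(exc, ValueError):
        return "closed file" in str(exc)
    if isinstance(exc, OSError):
        return exc.errno in PEER_GONE_ERRNOS
    return False


def write_json(stream, payload):
    """Write one JSON line. False means the peer is gone; other errors raise."""
    # Serialise before taking the lock; a non-JSON-safe payload raises here.
    line = json.dumps(payload) + "\n"
    with _write_lock:
        try:
            stream.write(line)
        except (ValueError, OSError) as exc:
            if is_peer_gone(exc):
                return False
            raise
        # Flush has its own try block, separate from the write.
        try:
            stream.flush()
        except (ValueError, OSError) as exc:
            if is_peer_gone(exc):
                return False
            raise
    return True
```

A caller in the style the commit describes would treat a False return as a clean disconnect, while errors such as ENOSPC or a TypeError from json.dumps propagate and show up in the crash log.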
fix(async): close unscheduled coroutines in all threadsafe bridges (#26584)

Wraps every sync->async coroutine-scheduling site in the codebase with a new agent.async_utils.safe_schedule_threadsafe() helper that closes the coroutine on scheduling failure (closed loop, shutdown race, etc.) instead of leaking it as 'coroutine was never awaited' RuntimeWarnings plus reference leaks.

22 production call sites migrated across the codebase:

- acp_adapter/events.py, acp_adapter/permissions.py
- agent/lsp/manager.py
- cron/scheduler.py (media + text delivery paths)
- gateway/platforms/feishu.py (5 sites, via existing _submit_on_loop helper which now delegates to safe_schedule_threadsafe)
- gateway/run.py (10 sites: telegram rename, agent:step hook, status callback, interim+bg-review, clarify send, exec-approval button+text, temp-bubble cleanup, channel-directory refresh)
- plugins/memory/hindsight, plugins/platforms/google_chat
- tools/browser_supervisor.py (3), browser_cdp_tool.py, computer_use/cua_backend.py, slash_confirm.py
- tools/environments/modal.py (_AsyncWorker)
- tools/mcp_tool.py (2 + 8 _run_on_mcp_loop callers converted to factory-style so the coroutine is never constructed on a dead loop)
- tui_gateway/ws.py

Tests: new tests/agent/test_async_utils.py covers helper behavior under live loop, dead loop, None loop, and scheduling exceptions. Regression tests added at three PR-original sites (acp events, acp permissions, mcp loop runner) mirroring contributor's intent.

Live-tested end-to-end:

- Helper stress test: 1500 schedules across live/dead/race scenarios, zero leaked coroutines
- Race exercised: 5000 schedules with loop killed mid-flight, 100 ok / 4900 None returns, zero leaks
- hermes chat -q with terminal tool call (exercises step_callback bridge)
- MCP probe against failing subprocess servers + factory path
- Real gateway daemon boot + SIGINT shutdown across multiple platform adapter inits
- WSTransport 100 live + 50 dead-loop writes
- Cron delivery path live + dead loop

Salvages PR #2657: adopts contributor's intent over a much wider site list and a single centralized helper instead of inline try/except at each site. 3 of the original PR's 6 sites no longer exist on main (environments/patches.py deleted, DingTalk refactored to native async); the equivalent fix lives in tools/environments/modal.py instead.

Co-authored-by: JithendraNara <jithendranaidunara@gmail.com>

19 days ago