| feat(web): parallel plugin — first async-extract plugin
Migrates Parallel.ai from inline _parallel_search() / _parallel_extract()
in tools/web_tools.py to a bundled plugin at plugins/web/parallel/.
First plugin in the codebase to expose an async extract():
- search() is sync — Parallel.beta.search
- extract() is **async def** — AsyncParallel.beta.extract
The ABC's docstring on supports_extract() already permits sync-or-async;
this commit is the first to exercise the async path. The web_extract_tool
dispatcher (next commit) detects coroutines via
inspect.iscoroutinefunction and awaits accordingly.
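A minimal sketch of the sync-or-async dispatch described above. The provider classes and the dispatch_extract name are hypothetical stand-ins, not the real web_extract_tool code; only inspect.iscoroutinefunction is taken from the commit message.

```python
import asyncio
import inspect


class SyncProvider:
    """Hypothetical provider whose extract() is a plain function."""

    def extract(self, urls):
        return [{"url": u, "content": "sync"} for u in urls]


class AsyncProvider:
    """Hypothetical provider whose extract() is declared with async def."""

    async def extract(self, urls):
        await asyncio.sleep(0)  # stand-in for a real network call
        return [{"url": u, "content": "async"} for u in urls]


async def dispatch_extract(provider, urls):
    """Await extract() only when the provider declares it as a coroutine function."""
    if inspect.iscoroutinefunction(provider.extract):
        return await provider.extract(urls)
    return provider.extract(urls)
```

Checking the function rather than its return value means the dispatcher can decide how to call extract() before any request is issued.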
Behavior preserved:
- PARALLEL_API_KEY required (a missing key raises ValueError, which is
surfaced as {"success": False, "error": "..."} rather than propagating)
- PARALLEL_SEARCH_MODE env var honored (agentic|fast|one-shot, default
agentic), validated via _resolve_search_mode()
- Limit capped at 20 server-side via min(limit, 20)
- Per-URL failure mode preserved: response.errors[] each become a
result dict with an "error" field rather than raising
- Module-level _parallel_client / _async_parallel_client caches kept
(mirrors legacy singleton pattern)
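The preserved behaviours listed above can be illustrated roughly as follows. This is a sketch under assumptions: every function name here is hypothetical, falling back to "agentic" on an invalid mode is assumed (the commit only says the value is validated), and the shape of each error entry (dicts with "url" and "message" keys) is invented for illustration.

```python
_VALID_MODES = ("agentic", "fast", "one-shot")
_MAX_LIMIT = 20


def resolve_search_mode(env):
    """Read PARALLEL_SEARCH_MODE; fall back to 'agentic' if unset or invalid (assumed)."""
    mode = env.get("PARALLEL_SEARCH_MODE", "agentic").strip().lower()
    return mode if mode in _VALID_MODES else "agentic"


def require_api_key(env):
    """Raise ValueError when PARALLEL_API_KEY is missing or empty."""
    key = env.get("PARALLEL_API_KEY")
    if not key:
        raise ValueError("PARALLEL_API_KEY is not set")
    return key


def errors_to_results(errors):
    """Turn per-URL failures into result dicts with an 'error' field (entry shape assumed)."""
    return [
        {"url": e.get("url", ""), "error": str(e.get("message", "unknown error"))}
        for e in errors
    ]


def plan_search(env, limit):
    """Surface a missing key as a failure dict instead of letting ValueError escape."""
    try:
        require_api_key(env)
    except ValueError as exc:
        return {"success": False, "error": str(exc)}
    return {
        "success": True,
        "mode": resolve_search_mode(env),
        "limit": min(limit, _MAX_LIMIT),  # cap at 20, as in the commit message
    }
```

For example, plan_search({}, 5) yields a failure dict, while plan_search({"PARALLEL_API_KEY": "k"}, 50) yields mode "agentic" and limit 20.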
Adds "parallel" to _WEB_PLUGIN_SKIPLIST in hermes_cli/tools_config.py so
the picker doesn't double-list.
The legacy inline _parallel_search, _parallel_extract, _get_parallel_client,
_get_async_parallel_client in tools/web_tools.py are NOT deleted yet — the
dispatcher still calls them. They go away when the dispatcher cuts over.
E2E verified:
- inspect.iscoroutinefunction(p.search) -> False
- inspect.iscoroutinefunction(p.extract) -> True
- extract() returns a coroutine (not a list)
- 5 providers register correctly (brave-free, ddgs, exa, parallel, searxng)
| 22 days ago |
| fix(web): align _LEGACY_PREFERENCE with legacy 7-provider order + doc cleanup
Self-review of the plugin migration surfaced one warning and a handful of
doc/dead-code cleanups. None affect production behaviour through the main
dispatcher (which always calls tools.web_tools._get_backend() first and
preserves the full 7-provider walk), but direct callers of
agent.web_search_registry.get_active_*_provider() previously diverged
from the legacy order and could return None for users with credentials
but no explicit web.backend config key.
Changes
-------
1. _LEGACY_PREFERENCE was shipped as a 4-tuple
("brave-free", "firecrawl", "searxng", "ddgs") while the PR
description and the legacy _get_backend() candidate order both
call for the 7-tuple
(firecrawl, parallel, tavily, exa, searxng, brave-free, ddgs).
Replaced with the 7-tuple. Verified empirically: with TAVILY+EXA keys
and no config, get_active_search_provider() now returns tavily
(was None); with EXA+PARALLEL it returns parallel (was None); with
BRAVE+FIRECRAWL it returns firecrawl (was brave-free).
2. agent/web_search_registry.py — module docstring, _resolve step-3
docstring, and inline comment all listed the old 4-tuple and claimed
"brave-free first because it was the shipped default". The legacy
default is "firecrawl". Rewritten to match the new ordering and
reference tools.web_tools._get_backend() as the source of truth.
3. agent/web_search_registry.py — get_active_crawl_provider
docstring said "only Tavily implements it among built-in providers".
Firecrawl also advertises supports_crawl=True after the previous
commit. Updated to "Tavily and Firecrawl".
4. plugins/web/tavily/provider.py — module docstring said "Tavily is
the only built-in backend that natively crawls". Updated.
5. agent/web_search_provider.py — ABC docstring mentioned only
search / extract capabilities. Added crawl for accuracy.
6. plugins/web/{firecrawl,parallel,exa}/provider.py — dead plugin-level
cache globals (_firecrawl_client, _parallel_client,
_async_parallel_client, _exa_client) were declared but never read
(all reads/writes go through _wt.* per the
`extracting-inline-helpers-to-plugins` recipe). Removed the dead declarations; the
reset-for-tests helpers in firecrawl + parallel now clear the
canonical _wt._<name> slots, matching the pattern exa already used.
Tests
-----
218/218 web-targeted tests still pass (no test changes needed). 4910/4910
in tests/tools/ still green.
| 22 days ago |