文件最后提交记录最后更新时间
feat: provider modules — ProviderProfile ABC, 33 providers, fetch_models, transport single-path Introduces providers/ package — single source of truth for every inference provider. Adding a simple api-key provider now requires one providers/<name>.py file with zero edits anywhere else. What this PR ships: - providers/ package (ProviderProfile ABC + 33 profiles across 4 api_modes) - ProviderProfile declarative fields: name, api_mode, aliases, display_name, env_vars, base_url, models_url, auth_type, fallback_models, hostname, default_headers, fixed_temperature, default_max_tokens, default_aux_model - 4 overridable hooks: prepare_messages, build_extra_body, build_api_kwargs_extras, fetch_models - chat_completions.build_kwargs: profile path via _build_kwargs_from_profile, legacy flag path retained for lmstudio/tencent-tokenhub (which have session-aware reasoning probing that doesn't map cleanly to hooks yet) - run_agent.py: profile path for all registered providers; legacy path variable scoping fixed (all flags defined before branching) - Auto-wires: auth.PROVIDER_REGISTRY, models.CANONICAL_PROVIDERS, doctor health checks, config.OPTIONAL_ENV_VARS, model_metadata._URL_TO_PROVIDER - GeminiProfile: thinking_config translation (native + openai-compat nested) - New tests/providers/ (79 tests covering profile declarations, transport parity, hook overrides, e2e kwargs assembly) Deltas vs original PR (salvaged onto current main): - Added profiles: alibaba-coding-plan, azure-foundry, minimax-oauth (were added to main since original PR) - Skipped profiles: lmstudio, tencent-tokenhub stay on legacy path (their reasoning_effort probing has no clean hook equivalent yet) - Removed lmstudio alias from custom profile (it's a separate provider now) - Skipped openrouter/custom from PROVIDER_REGISTRY auto-extension (resolve_provider special-cases them; adding breaks runtime resolution) - runtime_provider: profile.api_mode only as fallback when URL detection finds nothing (was breaking minimax /v1 override) - Preserved main's legacy-path improvements: deepseek reasoning_content preserve, gemini Gemma skip, OpenRouter response caching, Anthropic 1M beta recovery, etc. - Kept agent/copilot_acp_client.py in place (rejected PR's relocation — main has 7 fixes landed since; relocation would revert them) - _API_KEY_PROVIDER_AUX_MODELS alias kept for backward compat with existing test imports Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com> Closes #14418 30 天前
feat: provider modules — ProviderProfile ABC, 33 providers, fetch_models, transport single-path Introduces providers/ package — single source of truth for every inference provider. Adding a simple api-key provider now requires one providers/<name>.py file with zero edits anywhere else. What this PR ships: - providers/ package (ProviderProfile ABC + 33 profiles across 4 api_modes) - ProviderProfile declarative fields: name, api_mode, aliases, display_name, env_vars, base_url, models_url, auth_type, fallback_models, hostname, default_headers, fixed_temperature, default_max_tokens, default_aux_model - 4 overridable hooks: prepare_messages, build_extra_body, build_api_kwargs_extras, fetch_models - chat_completions.build_kwargs: profile path via _build_kwargs_from_profile, legacy flag path retained for lmstudio/tencent-tokenhub (which have session-aware reasoning probing that doesn't map cleanly to hooks yet) - run_agent.py: profile path for all registered providers; legacy path variable scoping fixed (all flags defined before branching) - Auto-wires: auth.PROVIDER_REGISTRY, models.CANONICAL_PROVIDERS, doctor health checks, config.OPTIONAL_ENV_VARS, model_metadata._URL_TO_PROVIDER - GeminiProfile: thinking_config translation (native + openai-compat nested) - New tests/providers/ (79 tests covering profile declarations, transport parity, hook overrides, e2e kwargs assembly) Deltas vs original PR (salvaged onto current main): - Added profiles: alibaba-coding-plan, azure-foundry, minimax-oauth (were added to main since original PR) - Skipped profiles: lmstudio, tencent-tokenhub stay on legacy path (their reasoning_effort probing has no clean hook equivalent yet) - Removed lmstudio alias from custom profile (it's a separate provider now) - Skipped openrouter/custom from PROVIDER_REGISTRY auto-extension (resolve_provider special-cases them; adding breaks runtime resolution) - runtime_provider: profile.api_mode only as fallback when URL detection finds nothing (was breaking minimax /v1 override) - Preserved main's legacy-path improvements: deepseek reasoning_content preserve, gemini Gemma skip, OpenRouter response caching, Anthropic 1M beta recovery, etc. - Kept agent/copilot_acp_client.py in place (rejected PR's relocation — main has 7 fixes landed since; relocation would revert them) - _API_KEY_PROVIDER_AUX_MODELS alias kept for backward compat with existing test imports Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com> Closes #14418 30 天前
fix(ci): stabilize shared test state after 21012 21 天前
feat(nous): unified client=hermes-client-v<version> tag on every Portal request (#24779) * feat(nous): unified client=hermes-client-v<version> tag on every Portal request Every Hermes request to Nous Portal now carries the same client=hermes-client-v<__version__> tag (e.g. client=hermes-client-v0.13.0 on this release), sourced live from hermes_cli.__version__. The release script's regex bump auto-aligns it on every release. Centralized in agent/portal_tags.py and wired into all four call sites: - NousProfile.build_extra_body (main agent loop, every chat completion) - auxiliary_client.NOUS_EXTRA_BODY + _build_call_kwargs (aux client) - run_agent.py compression-summary fallback path - tools/web_tools.py web_extract fallback Replaces the client=aux marker added in #24194 with the unified version tag. Tests assert against the helper output (invariant) rather than the literal string, so they don't need updating on every release. * feat(nous): cover /goal judge and kanban specify aux paths Two aux-using surfaces bypassed call_llm by invoking client.chat.completions.create() directly without extra_body, so they were missing the unified Portal client tag: - hermes_cli/goals.py — /goal standing-goal judge - hermes_cli/kanban_specify.py — kanban triage specifier Both now pass extra_body=get_auxiliary_extra_body() or None so they inherit the version tag when the aux client points at Nous Portal, and emit nothing otherwise (no tag leak to OpenRouter/Anthropic auxes).23 天前
feat(nvidia): add NIM billing origin header 20 天前
fix(custom): pass custom provider extra body Allow custom OpenAI-compatible providers declared under custom_providers: to set provider-specific extra_body fields and have Hermes merge them into chat-completions requests when the matching custom endpoint is active. This is a manual per-provider override rather than a model-name heuristic. OpenAI-compatible Gemma thinking support is real, but the on-wire payload shape is backend-specific: some servers want top-level enable_thinking, while vLLM Gemma and NIM-style endpoints expect chat_template_kwargs. A per-provider override is safer than picking one assumed payload. Example config: ```yaml custom_providers: - name: gemma-local base_url: http://localhost:8080/v1 model: google/gemma-4-31b-it extra_body: enable_thinking: true reasoning_effort: high ``` For vLLM Gemma or NIM-style endpoints, use the nested shape those servers expect: ```yaml extra_body: chat_template_kwargs: enable_thinking: true ``` Changes: - hermes_cli/config.py: preserve extra_body in normalized custom_providers: entries and allow it in the validated field set. - hermes_cli/runtime_provider.py: propagate custom-provider extra_body as request_overrides.extra_body for named custom runtime resolution, including credential-pool paths. - agent/agent_init.py: at agent init, locate the matching custom-provider entry by base_url (+ optional model) and merge its extra_body into AIAgent.request_overrides, with caller-provided overrides winning on conflicting top-level keys. - plugins/model-providers/custom/__init__.py: keep existing CustomProfile behavior (Ollama num_ctx, think=False when reasoning disabled); user-configured extra_body flows through request_overrides. - website/docs/integrations/providers.md: document the explicit extra_body override and the vLLM/Gemma chat_template_kwargs variant. - Tests cover config normalization, runtime propagation, model matching, trailing-slash equivalence, fallback when no model field is set, and caller-override merging precedence. Verified end-to-end against CustomProfile via ChatCompletionsTransport: configured extra_body reaches kwargs.extra_body on the wire request, and coexists with profile-generated entries (Ollama num_ctx, think=False) without clobber. Salvaged from #29022 onto current main. Cosmetic typing edit in plugins/model-providers/custom/__init__.py and a stale-base docs revert in providers.md were dropped during cherry-pick. Closes #29022 14 天前