new-api/relay · markho/new-api - AtomGit

GGitHubfix: handle ollama non-stream tool calls (#5865 )

文件	最后提交记录	最后更新时间
channel	fix: handle ollama non-stream tool calls (#5865) * fix: handle ollama non-stream tool calls * test: cover ollama non-stream tool call paths --------- Co-authored-by: CaIon <i@caion.me>	2 天前
common	feat: support Wan2.7 i2v media mapping (#4984) * feat: support Wan2.7 i2v media mapping * fix: normalize wan2.7 i2v image inputs	4 天前
common_handler	perf: avoid eager formatting in debug log calls (#4929)	1 个月前
constant	feat: openai response /v1/response/compact (#2644) * feat: openai response /v1/response/compact * feat: /v1/response/compact bill * feat: /v1/response/compact * feat: /v1/responses/compact -> codex channel * feat: /v1/responses/compact -> codex channel * feat: /v1/responses/compact -> codex channel * feat: codex channel default models * feat: compact model price * feat: /v1/responses/comapct test	5 个月前
helper	test: clean up reward-hacking backend tests (#5563)	18 天前
reasonmap	fix: reason convert	5 个月前
audio_handler.go	Merge branch 'origin/main' into nightly Resolve 4 conflicts: - relay/compatible_handler.go: accept main's refactor (postConsumeQuota -> service.PostTextConsumeQuota) - service/quota.go: accept main's PostClaudeConsumeQuota deletion, keep nightly's tiered billing in PostWssConsumeQuota and PostAudioConsumeQuota - web/src/i18n/locales/{en,zh-CN}.json: merge both sets of translation keys Post-merge integration: - Add tiered billing (TryTieredSettle, InjectTieredBillingInfo) to PostTextConsumeQuota - Update tool pricing calls to use nightly's generic GetToolPriceForModel/GetToolPrice API	3 个月前
chat_completions_via_responses.go	feat: support Responses to Chat (#5787) * fix(openai): harden Chat-to-Responses compatibility Add a shared Responses-to-Chat stream state machine and use it from the OpenAI relay path. Preserve assistant text alongside tool calls, bind tool argument deltas by output_index, map incomplete finish reasons, support reasoning/custom tool events, and buffer upstream SSE for non-stream Chat clients. Add deterministic service tests and relay SSE tests for the conversion path. Related to #5745. * refactor: rename openaicompat to relayconvert for improved clarity * feat(gemini): support responses request conversion * feat: add responses to chat conversion support * fix: harden responses chat conversion edge cases	7 天前
chat_completions_via_responses_test.go	feat: support Responses to Chat (#5787) * fix(openai): harden Chat-to-Responses compatibility Add a shared Responses-to-Chat stream state machine and use it from the OpenAI relay path. Preserve assistant text alongside tool calls, bind tool argument deltas by output_index, map incomplete finish reasons, support reasoning/custom tool events, and buffer upstream SSE for non-stream Chat clients. Add deterministic service tests and relay SSE tests for the conversion path. Related to #5745. * refactor: rename openaicompat to relayconvert for improved clarity * feat(gemini): support responses request conversion * feat: add responses to chat conversion support * fix: harden responses chat conversion edge cases	7 天前
claude_handler.go	fix(relay): fix Anthropic-compatible compatibility for GLM (avoid chunked encoding) (#5307)	30 天前
compatible_handler.go	perf: reduce heap residency for large base64 relay requests Three layered optimizations targeting Gemini-style 5MB base64 payloads where RSS could balloon to tens of GB under concurrent load: 1. Byte-based param override (relay/common/override.go) - Switch legacy/operations hot paths from common.Marshal round-trips and map[string]any conversions to gjson/sjson on []byte directly. - Avoids cloning 5MB strings during each Set/Delete operation. 2. strings.Builder for Gemini response markdown (relay/channel/gemini/relay-gemini.go) - Replace string concatenation + strings.Join when assembling "![image](data:...;base64,DATA)" content for inline image responses. - Pre-allocates capacity from inline_data byte sizes. 3. Outbound BodyStorage + streaming Decoder (this commit's core) - New relay/common/outbound_body.go helper wraps marshaled upstream bodies in common.BodyStorage, allowing disk-cache mode to offload jsonData to a temp file while waiting for upstream TTFB. The original []byte can then be GC'd, removing ~5MB/req of heap residency during the longest window of a request. - All 7 relay handlers (gemini/claude/responses/embedding/image/compatible/ rerank) plus chat_completions_via_responses adopt the helper with defer closer.Close() and explicit jsonData = nil. - relay/common/relay_info.go: new UpstreamRequestBodySize so relay/channel/api_request.go can populate req.ContentLength (lost when body becomes a type-erased io.Reader). - common/gin.go UnmarshalBodyReusable: when storage is disk-backed and content-type is JSON, decode via DecodeJson(storage) instead of storage.Bytes()+Unmarshal, removing one transient 5MB copy per request. memory mode and form/multipart paths unchanged.	1 个月前
embedding_handler.go	perf: reduce heap residency for large base64 relay requests Three layered optimizations targeting Gemini-style 5MB base64 payloads where RSS could balloon to tens of GB under concurrent load: 1. Byte-based param override (relay/common/override.go) - Switch legacy/operations hot paths from common.Marshal round-trips and map[string]any conversions to gjson/sjson on []byte directly. - Avoids cloning 5MB strings during each Set/Delete operation. 2. strings.Builder for Gemini response markdown (relay/channel/gemini/relay-gemini.go) - Replace string concatenation + strings.Join when assembling "![image](data:...;base64,DATA)" content for inline image responses. - Pre-allocates capacity from inline_data byte sizes. 3. Outbound BodyStorage + streaming Decoder (this commit's core) - New relay/common/outbound_body.go helper wraps marshaled upstream bodies in common.BodyStorage, allowing disk-cache mode to offload jsonData to a temp file while waiting for upstream TTFB. The original []byte can then be GC'd, removing ~5MB/req of heap residency during the longest window of a request. - All 7 relay handlers (gemini/claude/responses/embedding/image/compatible/ rerank) plus chat_completions_via_responses adopt the helper with defer closer.Close() and explicit jsonData = nil. - relay/common/relay_info.go: new UpstreamRequestBodySize so relay/channel/api_request.go can populate req.ContentLength (lost when body becomes a type-erased io.Reader). - common/gin.go UnmarshalBodyReusable: when storage is disk-backed and content-type is JSON, decode via DecodeJson(storage) instead of storage.Bytes()+Unmarshal, removing one transient 5MB copy per request. memory mode and form/multipart paths unchanged.	1 个月前
gemini_handler.go	perf: reduce heap residency for large base64 relay requests Three layered optimizations targeting Gemini-style 5MB base64 payloads where RSS could balloon to tens of GB under concurrent load: 1. Byte-based param override (relay/common/override.go) - Switch legacy/operations hot paths from common.Marshal round-trips and map[string]any conversions to gjson/sjson on []byte directly. - Avoids cloning 5MB strings during each Set/Delete operation. 2. strings.Builder for Gemini response markdown (relay/channel/gemini/relay-gemini.go) - Replace string concatenation + strings.Join when assembling "![image](data:...;base64,DATA)" content for inline image responses. - Pre-allocates capacity from inline_data byte sizes. 3. Outbound BodyStorage + streaming Decoder (this commit's core) - New relay/common/outbound_body.go helper wraps marshaled upstream bodies in common.BodyStorage, allowing disk-cache mode to offload jsonData to a temp file while waiting for upstream TTFB. The original []byte can then be GC'd, removing ~5MB/req of heap residency during the longest window of a request. - All 7 relay handlers (gemini/claude/responses/embedding/image/compatible/ rerank) plus chat_completions_via_responses adopt the helper with defer closer.Close() and explicit jsonData = nil. - relay/common/relay_info.go: new UpstreamRequestBodySize so relay/channel/api_request.go can populate req.ContentLength (lost when body becomes a type-erased io.Reader). - common/gin.go UnmarshalBodyReusable: when storage is disk-backed and content-type is JSON, decode via DecodeJson(storage) instead of storage.Bytes()+Unmarshal, removing one transient 5MB copy per request. memory mode and form/multipart paths unchanged.	1 个月前
image_handler.go	fix(relay): correct image quality parameter handling (#5103)	1 个月前
mjproxy_handler.go	perf: avoid eager formatting in debug log calls (#4929)	1 个月前
param_override_error.go	feat: add retry-aware param override with return_error and prune_objects	4 个月前
relay_adaptor.go	feat: advanced custom channel (#5590)	17 天前
relay_task.go	Return error when model price/ratio unset #3079 Change ModelPriceHelperPerCall to return (PriceData, error) and stop silently falling back to a default price. If a model price is not configured the helper now returns an error (unless the user has AcceptUnsetRatioModel enabled and a ratio exists). Propagate this error to callers: Midjourney handlers now return a MidjourneyResponse with Code 4 and the error message, and task submission returns a wrapped task error with HTTP 400. Also extract remix video_id in ResolveOriginTask for remix actions. This enforces explicit model price/ratio configuration and surfaces configuration issues to clients.	4 个月前
rerank_handler.go	perf: reduce heap residency for large base64 relay requests Three layered optimizations targeting Gemini-style 5MB base64 payloads where RSS could balloon to tens of GB under concurrent load: 1. Byte-based param override (relay/common/override.go) - Switch legacy/operations hot paths from common.Marshal round-trips and map[string]any conversions to gjson/sjson on []byte directly. - Avoids cloning 5MB strings during each Set/Delete operation. 2. strings.Builder for Gemini response markdown (relay/channel/gemini/relay-gemini.go) - Replace string concatenation + strings.Join when assembling "![image](data:...;base64,DATA)" content for inline image responses. - Pre-allocates capacity from inline_data byte sizes. 3. Outbound BodyStorage + streaming Decoder (this commit's core) - New relay/common/outbound_body.go helper wraps marshaled upstream bodies in common.BodyStorage, allowing disk-cache mode to offload jsonData to a temp file while waiting for upstream TTFB. The original []byte can then be GC'd, removing ~5MB/req of heap residency during the longest window of a request. - All 7 relay handlers (gemini/claude/responses/embedding/image/compatible/ rerank) plus chat_completions_via_responses adopt the helper with defer closer.Close() and explicit jsonData = nil. - relay/common/relay_info.go: new UpstreamRequestBodySize so relay/channel/api_request.go can populate req.ContentLength (lost when body becomes a type-erased io.Reader). - common/gin.go UnmarshalBodyReusable: when storage is disk-backed and content-type is JSON, decode via DecodeJson(storage) instead of storage.Bytes()+Unmarshal, removing one transient 5MB copy per request. memory mode and form/multipart paths unchanged.	1 个月前
responses_handler.go	perf: reduce heap residency for large base64 relay requests Three layered optimizations targeting Gemini-style 5MB base64 payloads where RSS could balloon to tens of GB under concurrent load: 1. Byte-based param override (relay/common/override.go) - Switch legacy/operations hot paths from common.Marshal round-trips and map[string]any conversions to gjson/sjson on []byte directly. - Avoids cloning 5MB strings during each Set/Delete operation. 2. strings.Builder for Gemini response markdown (relay/channel/gemini/relay-gemini.go) - Replace string concatenation + strings.Join when assembling "![image](data:...;base64,DATA)" content for inline image responses. - Pre-allocates capacity from inline_data byte sizes. 3. Outbound BodyStorage + streaming Decoder (this commit's core) - New relay/common/outbound_body.go helper wraps marshaled upstream bodies in common.BodyStorage, allowing disk-cache mode to offload jsonData to a temp file while waiting for upstream TTFB. The original []byte can then be GC'd, removing ~5MB/req of heap residency during the longest window of a request. - All 7 relay handlers (gemini/claude/responses/embedding/image/compatible/ rerank) plus chat_completions_via_responses adopt the helper with defer closer.Close() and explicit jsonData = nil. - relay/common/relay_info.go: new UpstreamRequestBodySize so relay/channel/api_request.go can populate req.ContentLength (lost when body becomes a type-erased io.Reader). - common/gin.go UnmarshalBodyReusable: when storage is disk-backed and content-type is JSON, decode via DecodeJson(storage) instead of storage.Bytes()+Unmarshal, removing one transient 5MB copy per request. memory mode and form/multipart paths unchanged.	1 个月前
websocket.go	format: package name -> github.com/QuantumNous/new-api (#2017)	8 个月前