atomcode/crates/atomcode-core/src/ctx · AtomCode/atomcode - AtomGit

fix(prompt,ctx): move Plan Mode and compression summaries out of system to preserve the prefix cache

File	Last commit	Last updated
default.rs	fix(ctx): token-aware compression keep-floor to bound reasoning-heavy turns. The compaction keep-floor was a fixed message count (KEEP_MESSAGES=20). On Include-policy thinking models (deepseek-v4, Kimi, Moonshot) the kept tail's reasoning_content, which the API requires echoed back in full and which no reducer condenses, can alone exceed the context window. A single user message driving many tool-call rounds stays a single turn, so the drop-oldest path is a structural no-op and microcompact/FINAL CEILING only touch ToolResults; reasoning accumulates unbounded (observed 690K on a 128K window → stream timeout). build_compression_content now takes a token `keep_ceiling` and advances the cut forward (dropping whole old rounds, the only API-legal way to shed echoed reasoning, and never stripping a kept message) until the kept tail fits. The old `len <= KEEP_MESSAGES` early-return was a pure message-count gate that let <20-message-but-huge-token sessions slip through; replaced with a saturating_sub so the token loop drives the decision. keep_ceiling is computed in the agent as window - system - tool_defs - cold_zone - output reserve (constant overhead), so it scales correctly to 1M (a fixed fraction would waste ~460K there). Faithful regression test simulates the real auto-compaction cycle and pins the invariant (rendered payload <= window). Also corrects a pre-existing test_cold_zone_compression expectation (first-user carve-out yields 6 turns). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	7 days ago
env.rs	feat: suppress Windows console windows & pass --client mode to daemon. - Add process_utils module with CREATE_NO_WINDOW helpers for Windows - Apply suppress_console_window to all child process spawns (git, curl, rg, find, LSP servers, MCP servers, hooks, bash tool, etc.) - Add --client CLI arg to daemon, propagate SessionMode to API handlers - Enhance build_api_system_prompt to align with TUI (instructions, memory, git snapshot, platform rules, model identity) - Bump vscode extension to 0.0.3	24 days ago
file_store.rs	refactor(d3): merge peek_file into read_file (transparent FileStore) Datalog 2026-05-06_15-33-23 made the case clear: 13 peek_file calls vs 59 read_file in the same session — the model defaults to read_file even when the same content is sitting in FileStore. Exposing the cache as a separate tool (peek_file with store_ids) added API surface without delivering the savings, because: - The model's read_file habit is decades old; peek_file is novel. - Two tools doing similar things is ambiguous to weak models. - store_id management (which the path-C reminder existed to paper over) was a leak of internal state into the model's cognitive surface. This commit folds the cache access behind read_file: - read_file now consults FileStore before reading disk. If a path is cached with the current mtime, the requested range is served from memory regardless of offset/limit. Output carries a `[NOTE: served from FileStore cache, no disk hit]` annotation on cache hits so the model can audit the source. - Every fresh disk read pushes its full content into FileStore, uniformly — small files included. The auto_skeleton and former-D3a paths no longer push their own; one upstream insert is the single source of truth. - A per-path access counter (`ToolContext.path_read_counts`) drives a soft "Nth read of <file>" hint after the third access: encourages weak models to look up specific regions freely instead of hoarding context-window with conservative reads. Removed (dead-end branches now redundant): - `tool/peek.rs` — peek_file tool deleted entirely. - `Conversation.active_files_snapshot` — path-C transient field no longer has a renderer to populate it. - `ctx/render.rs::render_file_store_reminder` + the system-level injection it produced — store_id is no longer model-visible, so there's nothing to remind the model about. - `FileSummary` + `FileStore::active_summaries` — only callers were the deleted reminder. - `TurnRunner::run_with_filter` snapshot-refresh block. 
- read_file's `LARGE_FILE_LINE_THRESHOLD` pointer-and-preview branch: every read now returns inline content, with cache semantics handled invisibly. Total: 295 lines deleted (peek.rs) + 127 (FileStore reminder infrastructure) + assorted glue = -573 net after the new helper additions. Tests rewritten (7 D3 integration cases): full-read populates store + returns inline / small file populates store too / range read after full hits cache and announces it / edit invalidates the cache so the next read hits disk / re-read keeps store at one entry / auto_skeleton populates store without exposing store_id / 3rd read of same path emits the count hint. 994 lib tests passing. Pre-existing env-dependent failures unchanged (3, same as before merge). What this changes for D3 effective value: The earlier "store_id survives compaction" story (path C) is moot: the model never sees store_ids, so it can't lose them. Cache benefits are now invisible: the model just calls read_file, and ranges that map onto a cached path skip disk. This sidesteps the tool-adoption problem entirely; we trade the "explicit cache tool" path (low adoption) for the "transparent cache hit" path (always works). Honest revised estimate vs no-D3 baseline: - Disk reads: every range read of a previously-read file is now free (was: full disk hit). Real session impact depends on re-read frequency. - Wall time: ~5-50ms saved per cached read; meaningful at ~30+ reads/session. - Hallucination: indirect (fewer disk failures, no stale-cache confusion). Not a primary driver. - Tokens: NEUTRAL. Same wire bytes per response. The previous pointer-and-preview format saved tokens on first read but cost tokens on every range follow-up; net was small. The 41%-input-savings claim is fully retracted. The honest pitch for this refactor is: "fewer moving parts, same baseline of caching value, no tool-adoption gap." Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	29 days ago
mod.rs	fix(ctx): token-aware compression keep-floor to bound reasoning-heavy turns. The compaction keep-floor was a fixed message count (KEEP_MESSAGES=20). On Include-policy thinking models (deepseek-v4, Kimi, Moonshot) the kept tail's reasoning_content, which the API requires echoed back in full and which no reducer condenses, can alone exceed the context window. A single user message driving many tool-call rounds stays a single turn, so the drop-oldest path is a structural no-op and microcompact/FINAL CEILING only touch ToolResults; reasoning accumulates unbounded (observed 690K on a 128K window → stream timeout). build_compression_content now takes a token `keep_ceiling` and advances the cut forward (dropping whole old rounds, the only API-legal way to shed echoed reasoning, and never stripping a kept message) until the kept tail fits. The old `len <= KEEP_MESSAGES` early-return was a pure message-count gate that let <20-message-but-huge-token sessions slip through; replaced with a saturating_sub so the token loop drives the decision. keep_ceiling is computed in the agent as window - system - tool_defs - cold_zone - output reserve (constant overhead), so it scales correctly to 1M (a fixed fraction would waste ~460K there). Faithful regression test simulates the real auto-compaction cycle and pins the invariant (rendered payload <= window). Also corrects a pre-existing test_cold_zone_compression expectation (first-user carve-out yields 6 turns). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	7 days ago
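ollama.rs	fix(ctx): freeze historical read_file results to stop periodic prefix-cache collapse. Each turn, when build_messages assembled the outgoing messages, replace_stale_reads took every historical read_file result whose file had since been edited, re-read it from disk with std::fs::read_to_string, re-applied absolute line numbers ({:>4}\|), and overwrote r.output in place. The tool_call_id stayed the same but the content changed, so the sent prefix broke at the earliest rewritten message and every token after it was recomputed cold on each turn. In production, long deepseek-v4-flash sessions periodically collapsed from a ~99% cache hit rate to 4-20% (cached 0.02 yuan/M vs uncached 1 yuan/M, a 50x cost difference). The function was meant to "let the model always see the latest code on an edit turn", which is exactly what killed the cache. Fix (direction 1, freeze history as immutable): delete the replace_stale_reads call and the function body. A historical tool_result is immutable once rendered; if the model wants the latest content, it reads the file again in the current turn (appended at the tail, so the cache only grows and never breaks). The ability to get the latest content in the current turn is unaffected. Regression tests (crates/atomcode-core/src/ctx/render.rs): - historical_read_results_are_frozen_after_edit - consecutive_turns_keep_byte_stable_prefix_across_disk_edit. Asserts that the turn N rendering is a byte-level prefix of turn N+1, even when the disk is edited between the two turns. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	1 day ago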
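render.rs	fix(prompt,ctx): move Plan Mode and compression summaries out of system to preserve the prefix cache. Follows up on the session-level system freeze (3cda63dc). Classification of 175 sampled system-break pairs from litellm production traffic, plus byte-by-byte diff evidence (systemA/B/D), showed that freezing build_system_prompt was not enough: two pieces of dynamic content still ended up in system message 0. == systemA: Plan Mode instructions in system (12% of system breaks) == In plan_mode, prompt.rs injected the whole "=== PLAN MODE (ACTIVE) ===" block into system; toggling plan mode mid-session rewrote messages[0] and reset the cache for the whole session. Fix: remove that block from system; instead, when the AgentCommand::SetPlanMode state toggles, inject a one-time notice into history via add_synthetic_user_message (one on entry, one on exit), so system stays a session-level constant; the read-only tool gate (use_read_only) continues to enforce the constraint on every turn. SetPlanMode no longer invalidates cached_system_prompt. == systemB: compression summaries folded into messages[0] by the merger == cold_summaries / overflow digests were originally pushed as Role::System (render.rs) and were folded into messages[0] by clean_message_pipeline's "merge consecutive system messages" step, so every compression also invalidated the ~16K system prefix. Changed to Role::User: the merger no longer pollutes messages[0], and the frozen system stays byte-stable across compressions. Impact: summaries now appear under the user role (labeled "[Earlier conversation history…]" / "[Context overflow…]") and merge into adjacent user messages; compression still breaks the cache once from messages[1] onward (the history really did change, which is unavoidable), but the system prefix is preserved. The position logic of truncate(cold_msgs)/microcompact is independent of role; verified. Not addressed (confirmed downstream, separate task): systemC/E, where main/sub agents and short tool-type requests share a session_id (about 22% of system breaks); this is not a rewrite bug and needs separate cache attribution, to be handled separately. Regression tests: - agent/mod.rs::plan_mode_is_not_in_system_prompt (plan_mode does not enter system and does not change a single byte of it). - ctx/render.rs::test_no_consecutive_system_messages_after_compression (updated: asserts the summary is not in system and instead rides on a user message). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	1 day ago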
resolver.rs	Merge branch 'pr_93' (feat/image-paste) into release/v4.22.0. Brings Ctrl+V clipboard image paste with Claude/OpenAI vision support. On top of the merge, layers UX/correctness fixes built during conflict resolution: - core: add `model_name_suggests_vision` heuristic as the single source of truth for vision capability, with `accepts_images()` as a thin wrapper. Drives both the TUI Ctrl+V gate and provider-side wire format selection. - openai: degrade `MultiPart` content to plain text when the active model lacks vision (fixes ModelArts.81001 from gitcode.com when switching to GLM-5.1 mid-conversation after a vision turn). - tuix: inline `[Image #N]` marker in the input buffer at paste time (replaces the out-of-buffer "pasted from clipboard" line). On submit, only attach images whose marker survives editing; deleting the marker now correctly detaches the image. Echo each kept marker as a `└ [Image #N]` sub-line beneath the user prompt. - tuix: show "Image in clipboard · ctrl+v to paste" hint in the status line, deduped via SipHash content fingerprint so a second distinct image still re-prompts. - tuix: gate Ctrl+V and bracketed-paste image handlers on `accepts_images()`; text-only models refuse with a clear message. - core: tolerate explicit JSON `null` for CodingPlan `claimed_at` / `expires_at` via a `null_to_default` deserializer. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	28 days ago
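truncate.rs	fix(session): preserve the original prompt after compression so /resume no longer opens with a tool_call. Bug: after /resume, scrolling to the top did not show the prompt the user originally asked; the view started directly with a run of list_directory / bash / read_file tool calls, and the session JSON file had no trace of the original prompt either. Root cause: `hard_truncate_to_target` (agent/mod.rs:3178) looks for the "last user message" as its sacred anchor, but during a turn the agent injects 3 kinds of synthetic messages as Role::User: `[Additional context from user]: ...`, `Output limit hit. ...`, and `[Context was compressed. ...]`. These synthetic messages made last_user_idx point at an injected message rather than the real original prompt, so when compression triggered, the original prompt was cut along with everything else in `drain(0..keep_from)`, and the persisted JSON lost it too. Fix (a subset of opencode's approach): 1. `Message` gains a `synthetic: bool` field with `#[serde(default)]` plus skip-if-false: old session.json files deserialize to false by default, and the common false value is not written on serialization, so there is no bloat. Adds a `Message::synthetic_user()` constructor. 2. The 3 synthetic injection points now use the new `Conversation::add_synthetic_user_message`, whose merge logic preserves the existing synthetic flag (real text with synthetic text appended is not wrongly promoted to synthetic). 3. The `hard_truncate_to_target` sacred set expands from `{last_user}` to `{first_real_user, last_real_user}`; both anchors filter on `!m.synthetic`. first keeps the session anchor for /resume, last keeps the current task context. In the single-prompt case the two coincide, and compaction would rather exceed the budget than drop the prompt (tiers 1/2 can still reduce tokens; tier 3 degrading to a no-op in this case is a design trade-off). 4. session.rs::auto_name_from_messages and event_loop::apply_session_messages now use the synthetic field as the primary signal and keep the bracket-prefix heuristic as a secondary fallback for old sessions, so /resume titles for old JSON do not regress. Reference: opencode's `message-v2.ts` `synthetic` part field plus its replay mechanism is a widely recognized engineering approach to "original prompt protection"; DeepSeek-TUI stores only a truncated version in metadata.title and cannot recover the full prompt. We copied opencode's synthetic field and dual-anchor sacred set but not replay (that is a separate mechanism for re-feeding context to the model after compression, outside the scope of this bug). Tests (12 new): - message.rs: 5 (constructor / serde default false / false not serialized / true serialized / deserialization compatibility) - conversation/mod.rs: 3 (synthetic injection is flagged / synthetic merged into real stays real / synthetic merged into synthetic stays synthetic) - agent/mod.rs: 3 (reproduces the bug scenario and verifies the original prompt is kept / multi-turn scenario verifies last real skips trailing synthetic messages / empty conv does not panic) - event_loop session_naming tests all pass (legacy bracket fallback still present). Across provider/render/test fixtures, 19 `Message {}` literals were completed with `synthetic: false` (batch script, brace-aware, skipping `-> Message {` function signatures). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	10 days ago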
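
The dual-anchor rule described in the truncate.rs commit (keep the first and last real user messages, ignoring synthetic ones) can be sketched roughly as below. This is a minimal illustration and not the crate's code: the `Role` and `Message` types are simplified stand-ins, `sacred_anchors` and `truncate_keeping_anchors` are hypothetical names, the budget is a message count rather than a token target, and tool-call pairing is ignored.

```rust
// Simplified stand-ins for illustration; the real types carry more fields.
#[derive(Debug, Clone, PartialEq)]
enum Role {
    System,
    User,
    Assistant,
    Tool,
}

#[derive(Debug, Clone)]
struct Message {
    role: Role,
    content: String,
    /// True for agent-injected user messages (e.g. compression notices).
    synthetic: bool,
}

impl Message {
    fn new(role: Role, content: &str) -> Self {
        Message { role, content: content.to_string(), synthetic: false }
    }

    /// A user-role message that the agent, not the human, produced.
    fn synthetic_user(content: &str) -> Self {
        Message { role: Role::User, content: content.to_string(), synthetic: true }
    }
}

/// Indices of the first and last *real* user messages, if any exist.
/// Synthetic user messages are skipped so they cannot become anchors.
fn sacred_anchors(messages: &[Message]) -> Option<(usize, usize)> {
    let is_real_user = |m: &Message| m.role == Role::User && !m.synthetic;
    let first = messages.iter().position(is_real_user)?;
    let last = messages.iter().rposition(is_real_user)?;
    Some((first, last))
}

/// Hypothetical truncation: drop the oldest non-sacred messages until at most
/// `max_len` remain. System messages and both anchors are never dropped, so
/// the result may exceed `max_len` when everything left is sacred.
fn truncate_keeping_anchors(messages: Vec<Message>, max_len: usize) -> Vec<Message> {
    let anchors = sacred_anchors(&messages);
    let mut excess = messages.len().saturating_sub(max_len);
    messages
        .into_iter()
        .enumerate()
        .filter(|(i, m)| {
            let is_anchor = matches!(anchors, Some((f, l)) if *i == f || *i == l);
            let sacred = m.role == Role::System || is_anchor;
            if excess > 0 && !sacred {
                excess -= 1;
                false // drop this message
            } else {
                true // keep it
            }
        })
        .map(|(_, m)| m)
        .collect()
}
```

In a single-prompt session both anchors point at the same message, so a trailing `[Context was compressed. ...]` notice cannot displace the original prompt. A production version would also need to keep tool calls and their results together, which this sketch does not attempt.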