DeepSeek-Reasonix/benchmarks/spike-mcp-reconnect · runningW/DeepSeek-Reasonix - AtomGit

chore(spike): live DeepSeek run for RFC #110 cache-prefix question (#113)

File	Last commit	Last updated
results.md	chore(spike): live DeepSeek run for RFC #110 cache-prefix question (#113)	27 days ago
runner.ts	chore(spike): live DeepSeek run for RFC #110 cache-prefix question (#113)	27 days ago

Commit message (the same commit for both files):

Two-file spike under benchmarks/spike-mcp-reconnect/:
- runner.ts: 5 chat calls against live deepseek-chat with controlled tool-list drifts (identity, append, mid-stream edit)
- results.md: captured run + empirical findings

The headline result overturns the RFC body's "any drift = full cache miss" claim. DeepSeek's prefix cache works at chunk granularity (≈128 tokens), so the cost depends on WHERE the drift falls:
- append a tool at the end → trivial cost (94.8% hit, even better than the no-drift 85% baseline because the new chunk gets cached)
- edit a tool's description in the middle → loses chunks past the edit (84.1% hit observed)
- replacing or reordering the tool list → effectively full miss

This nudges the C2b design call away from blanket "strict default" toward graduated permissive: silent on appends, warn on mid-stream edits, refuse on reorders / removals. `--strict` remains as an explicit flag.

Refs #110.
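The graduated policy described in the commit message can be sketched as a small classifier over two tool lists. This is an illustration only, not code from runner.ts: `ToolSpec`, `classifyDrift`, `actionFor`, and `cachedPrefixTokens` are hypothetical names, and the chunk model (whole 128-token chunks before the first changed token stay cached) is a simplifying assumption drawn from the "≈128 tokens" observation. The `--strict` flag is not modeled, since its exact behavior is not stated here.

```typescript
// Hypothetical shapes and names for illustration; not taken from runner.ts.
interface ToolSpec {
  name: string;
  description: string;
}

type Drift = "identity" | "append" | "edit" | "reorder-or-removal";
type Action = "silent" | "warn" | "refuse";

// Classify how the tool list changed between two connections.
function classifyDrift(before: ToolSpec[], after: ToolSpec[]): Drift {
  // Every old tool must still sit at the same index under the same name.
  // A removal, reorder, replacement, or mid-list insertion breaks this.
  const namesAligned = before.every((tool, i) => after[i]?.name === tool.name);
  if (!namesAligned) return "reorder-or-removal";

  // Same names in the same positions, but some description text changed.
  const edited = before.some((tool, i) => after[i]?.description !== tool.description);
  if (edited) return "edit";

  // Old list is an unchanged prefix of the new one.
  return after.length > before.length ? "append" : "identity";
}

// Graduated permissive policy: silent on appends, warn on edits,
// refuse on reorders / removals.
function actionFor(drift: Drift): Action {
  switch (drift) {
    case "identity":
    case "append":
      return "silent";
    case "edit":
      return "warn";
    case "reorder-or-removal":
      return "refuse";
  }
}

// Assumed model: only whole chunks that end before the first changed
// token remain cache hits. chunkSize = 128 is the approximate observed value.
function cachedPrefixTokens(firstChangedToken: number, chunkSize = 128): number {
  return Math.floor(firstChangedToken / chunkSize) * chunkSize;
}
```

In this sketch a list that is both edited and appended to is reported as "edit", so the stricter of the two actions applies. Under the assumed chunk model, a change that starts at token 1000 leaves 896 tokens (7 full chunks) eligible for cache hits.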