| fix: load .env in examples and benchmark entrypoints
loadDotenv was only called inside CLI commands, so any direct script import
(examples/*, benchmarks/*) hit 'DEEPSEEK_API_KEY is not set' even when the
user had a valid .env file. Promote env.ts from cli/ to src/, export loadDotenv
from the library index, and call it at the top of every script entrypoint.
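
A minimal sketch of what such a loader might do, for readers wiring their own entrypoints. The names `parseDotenv` and `loadDotenvSketch` are hypothetical stand-ins; the library's actual `loadDotenv` may parse and prioritise values differently.

```typescript
import { existsSync, readFileSync } from "node:fs";
import { resolve } from "node:path";

// Hypothetical parser: handles KEY=value lines, comments, and simple quoting only.
function parseDotenv(text: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) continue; // skip blanks and comments
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no key, or no "=" at all
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    // Strip one pair of matching surrounding quotes.
    const first = value[0];
    if (value.length >= 2 && (first === '"' || first === "'") && value[value.length - 1] === first) {
      value = value.slice(1, -1);
    }
    out[key] = value;
  }
  return out;
}

// Hypothetical loader (assumption: variables already set in the environment win).
function loadDotenvSketch(path: string = resolve(process.cwd(), ".env")): void {
  if (!existsSync(path)) return; // a missing .env file is not an error
  const parsed = parseDotenv(readFileSync(path, "utf8"));
  for (const [key, value] of Object.entries(parsed)) {
    if (process.env[key] === undefined) process.env[key] = value;
  }
}
```

Calling the loader as the first statement of a script, before any client is constructed, is what makes the key visible to code that reads `process.env` at import or construction time.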
| 1 month ago |
| feat: Esc cancels immediately · native fs tools · cursor fix · slow_count demo (0.4.9)
Three reported issues rolled into one release:
1. Esc now truly cancels in-flight work. AbortController threaded
through every I/O path:
- DeepSeekClient.chat/.stream signal wired at every call site
- ToolRegistry.dispatch gets a ToolCallContext { signal }
- McpClient.callTool({ signal }) fires notifications/cancelled
AND rejects the pending promise immediately, without waiting for
the subprocess to finish gracefully
- bridgeMcpTools forwards ctx.signal down
2. Built-in filesystem tools replace the
@modelcontextprotocol/server-filesystem subprocess inside
reasonix code. 10 native tools, sandboxed, R1-friendly schemas.
The new edit_file takes a flat SEARCH/REPLACE shape instead of
the JSON-in-string array that was the single biggest driver of
R1 DSML hallucinations in 0.4.x. Per-call latency drops from
~500ms–2s (Windows subprocess IPC) to <10ms (direct fs).
3. PromptInput placeholder cursor now renders at position 0
(before the hint text), not after it.
4. Bonus: slow_count demo tool in examples/mcp-server-demo.ts
emits real notifications/progress frames so the progress-bar
plumbing from 0.4.8 is testable end-to-end.
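
To illustrate item 2, here is a rough sketch of how a flat SEARCH/REPLACE edit could be applied to file contents. The `EditArgs` shape and `applySearchReplace` are hypothetical; the source does not show the real edit_file schema or its matching rules, and the "exactly one match" policy is an assumption.

```typescript
// Hypothetical flat argument shape: two plain strings, no JSON-in-string array.
interface EditArgs {
  search: string;
  replace: string;
}

// Replaces the single occurrence of `search`; refuses missing or ambiguous matches.
function applySearchReplace(content: string, edit: EditArgs): string {
  if (edit.search === "") throw new Error("search must not be empty");
  const first = content.indexOf(edit.search);
  if (first === -1) throw new Error("search text not found");
  if (content.indexOf(edit.search, first + 1) !== -1) {
    throw new Error("search text matches more than once");
  }
  // Slice-and-concatenate instead of String.prototype.replace, so "$&" and
  // similar patterns in the replacement text are inserted literally.
  return content.slice(0, first) + edit.replace + content.slice(first + edit.search.length);
}
```

A flat pair of strings gives the model nothing to escape or nest, which is the property the commit credits for the drop in malformed calls.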
ChatOptions grows seedTools: ToolRegistry so callers can
pre-register tools and still layer MCP on top. ToolCallContext
is re-exported for library users writing abortable tools.
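
For library users, an abortable tool might look roughly like the sketch below. The `ToolCallContext` interface here is an assumed shape based only on the `{ signal }` mention above, and `abortableDelay` and `slowCount` are hypothetical helpers, not the shipped implementations.

```typescript
// Assumed shape; the source only says ToolCallContext carries { signal }.
interface ToolCallContext {
  signal?: AbortSignal;
}

// Sleep that rejects as soon as the signal aborts instead of waiting out the timer.
function abortableDelay(ms: number, signal?: AbortSignal): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    if (signal?.aborted) {
      reject(new Error("aborted"));
      return;
    }
    const onAbort = () => {
      clearTimeout(timer);
      reject(new Error("aborted"));
    };
    const timer = setTimeout(() => {
      signal?.removeEventListener("abort", onAbort);
      resolve();
    }, ms);
    signal?.addEventListener("abort", onAbort, { once: true });
  });
}

// Hypothetical tool handler: counts to n, and cancellation interrupts the current wait.
async function slowCount(
  args: { n: number; stepMs: number },
  ctx: ToolCallContext,
): Promise<string> {
  for (let i = 1; i <= args.n; i++) {
    await abortableDelay(args.stepMs, ctx.signal);
  }
  return `counted to ${args.n}`;
}
```

The point of rejecting inside the delay, rather than only checking `signal.aborted` between steps, is that cancellation takes effect mid-wait, which matches the "rejects the pending promise immediately" behaviour described for McpClient.callTool.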
416 tests (+29: filesystem-tools +26, mcp abort +3).
| 1 month ago |
| chore(ci+examples): bench dry-run in CI + programmatic replay/diff example
Two small adds that make the existing v0.2 infrastructure easier to
pick up:
1. CI now runs npx tsx benchmarks/tau-bench/runner.ts --dry on every
push and PR. Catches bench harness regressions (task factories, CLI
parsing, file IO, per-sub-call transcript emission) without needing
a DEEPSEEK_API_KEY in CI.
2. examples/replay-and-diff.ts demonstrates the library surface
programmatically — readTranscript, computeReplayStats,
diffTranscripts, renderDiffSummary, plus direct report.pairs
inspection. Runs with no API key against the committed reference
transcripts; verifies the v0.1 numbers (94% cache, −20% cost, 1
byte-stable prefix) fall out of the JSONL alone.
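
As a rough illustration of how a figure like the cache percentage can be derived from JSONL alone, consider the sketch below. The `TurnRecord` fields and both helper functions are hypothetical; the real transcript schema and the signatures of readTranscript and computeReplayStats are not shown in the source.

```typescript
// Hypothetical per-turn record; field names are assumptions, not the real schema.
interface TurnRecord {
  cacheHitTokens: number;
  cacheMissTokens: number;
}

// Parses one JSON object per non-empty line.
function parseJsonl(text: string): TurnRecord[] {
  return text
    .split(/\r?\n/)
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as TurnRecord);
}

// Fraction of prompt tokens served from cache across the whole transcript.
function cacheHitRatio(records: TurnRecord[]): number {
  let hit = 0;
  let total = 0;
  for (const r of records) {
    hit += r.cacheHitTokens;
    total += r.cacheHitTokens + r.cacheMissTokens;
  }
  return total === 0 ? 0 : hit / total;
}
```

Because nothing here calls the API, a check of this kind can run in CI without a DEEPSEEK_API_KEY, in the same spirit as the dry-run step in item 1.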
Useful for anyone wiring a CI gate or eval dashboard on top of Reasonix
— which is the main use case that justifies shipping transcripts in
the package in the first place.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| 1 month ago |