1dba122c创建于 1 天前历史提交

SDK

The SDK is the in-process integration surface for @oh-my-pi/pi-coding-agent. Use it when you want direct access to agent state, event streaming, tool wiring, and session control from your own Bun/Node process.

If you need cross-language/process isolation, use RPC mode instead.

Installation

bun add @oh-my-pi/pi-coding-agent

Entry points

@oh-my-pi/pi-coding-agent exports the SDK APIs from the package root (and also via @oh-my-pi/pi-coding-agent/sdk).

Core exports for embedders:

  • createAgentSession
  • SessionManager
  • Settings
  • AuthStorage
  • ModelRegistry
  • discoverAuthStorage
  • Discovery helpers (discoverExtensions, discoverSkills, discoverContextFiles, discoverPromptTemplates, discoverSlashCommands, discoverCustomTSCommands, discoverMCPServers)
  • Tool factory surface (createTools, BUILTIN_TOOLS, tool classes)

Quick start (auto-discovery defaults)

import { createAgentSession } from "@oh-my-pi/pi-coding-agent";

const { session, modelFallbackMessage } = await createAgentSession();

if (modelFallbackMessage) {
  process.stderr.write(`${modelFallbackMessage}\n`);
}

const unsubscribe = session.subscribe((event) => {
  if (
    event.type === "message_update" &&
    event.assistantMessageEvent.type === "text_delta"
  ) {
    process.stdout.write(event.assistantMessageEvent.delta);
  }
});

await session.prompt("Summarize this repository in 3 bullets.");
unsubscribe();
await session.dispose();

What createAgentSession() discovers by default

createAgentSession() follows “provide to override, omit to discover”.

If omitted, it resolves:

  • cwd: getProjectDir()
  • agentDir: ~/.omp/agent (via getAgentDir())
  • authStorage: discoverAuthStorage(agentDir)
  • modelRegistry: new ModelRegistry(authStorage) + background refreshInBackground() when the registry is not provided
  • settings: await Settings.init({ cwd, agentDir })
  • sessionManager: SessionManager.create(cwd) (file-backed)
  • skills/context files/prompt templates/slash commands/extensions/custom TS commands
  • built-in tools via createTools(...)
  • MCP tools (enabled by default; Exa MCP servers are folded into native Exa integration, and browser automation MCP servers are filtered when the built-in browser tool is enabled)
  • LSP integration (enabled by default)
  • eventBus: new EventBus() unless supplied

Required vs optional inputs

Typically you must provide only what you want to control:

  • Must provide: nothing for a minimal session
  • Usually provide explicitly in embedders:
    • sessionManager (if you need in-memory or custom location)
    • authStorage + modelRegistry (if you own credential/model lifecycle)
    • model or modelPattern (if deterministic model selection matters)
    • settings (if you need isolated/test config)

Session manager behavior (persistent vs in-memory)

AgentSession always uses a SessionManager; behavior depends on which factory you use.

File-backed (default)

import { createAgentSession, SessionManager } from "@oh-my-pi/pi-coding-agent";

const { session } = await createAgentSession({
  sessionManager: SessionManager.create(process.cwd()),
});

console.log(session.sessionFile); // absolute .jsonl path
  • Persists conversation/messages/state deltas to session files.
  • Supports resume/open/list/fork workflows.
  • session.sessionFile is defined.

In-memory

import { createAgentSession, SessionManager } from "@oh-my-pi/pi-coding-agent";

const { session } = await createAgentSession({
  sessionManager: SessionManager.inMemory(),
});

console.log(session.sessionFile); // undefined
  • No filesystem persistence.
  • Useful for tests, ephemeral workers, request-scoped agents.
  • Session methods still work, but persistence-specific behaviors (file resume/fork paths) are naturally limited.

Resume/open/list helpers

import { SessionManager } from "@oh-my-pi/pi-coding-agent";

const recent = await SessionManager.continueRecent(process.cwd());
const listed = await SessionManager.list(process.cwd());
const opened = listed[0] ? await SessionManager.open(listed[0].path) : null;

Model and auth wiring

createAgentSession() uses ModelRegistry + AuthStorage for model selection and API key resolution.

Explicit wiring

import {
  createAgentSession,
  discoverAuthStorage,
  ModelRegistry,
  SessionManager,
} from "@oh-my-pi/pi-coding-agent";

const authStorage = await discoverAuthStorage();
const modelRegistry = new ModelRegistry(authStorage);
await modelRegistry.refresh();

const available = modelRegistry.getAvailable();
if (available.length === 0)
  throw new Error("No authenticated models available");

const { session } = await createAgentSession({
  authStorage,
  modelRegistry,
  model: available[0],
  thinkingLevel: "medium",
  sessionManager: SessionManager.inMemory(),
});

Selection order when model is omitted

When no explicit model/modelPattern is provided:

  1. restore model from existing session (if restorable + key available)
  2. settings default model role (default)
  3. first available model with valid auth

If restore fails, modelFallbackMessage explains fallback.

Auth priority

AuthStorage.getApiKey(...) resolves in this order:

  1. runtime override (setRuntimeApiKey, used by CLI --api-key)
  2. config-sourced API key override (models.yml provider apiKey)
  3. stored API-key credential in agent.db / broker-backed storage
  4. stored OAuth credential, including refresh when needed
  5. provider environment variables
  6. custom-provider resolver fallback

Event subscription model

Subscribe with session.subscribe(listener); it returns an unsubscribe function.

const unsubscribe = session.subscribe((event) => {
  switch (event.type) {
    case "agent_start":
    case "turn_start":
    case "tool_execution_start":
      break;
    case "message_update":
      if (event.assistantMessageEvent.type === "text_delta") {
        process.stdout.write(event.assistantMessageEvent.delta);
      }
      break;
  }
});

AgentSessionEvent includes core AgentEvent plus session-level events:

  • auto_compaction_start / auto_compaction_end
  • auto_retry_start / auto_retry_end
  • retry_fallback_applied / retry_fallback_succeeded
  • ttsr_triggered
  • todo_reminder / todo_auto_clear
  • irc_message

Prompt lifecycle

session.prompt(text, options?) is the primary entry point.

Behavior:

  1. optional command/template expansion (/ commands, custom commands, file slash commands, prompt templates)
  2. if currently streaming:
    • requires streamingBehavior: "steer" | "followUp"
    • queues instead of throwing work away
  3. if idle:
    • validates model + API key
    • appends user message
    • starts agent turn

Related APIs:

  • sendUserMessage(content, { deliverAs? })
  • steer(text, images?)
  • followUp(text, images?)
  • sendCustomMessage({ customType, content, ... }, { deliverAs?, triggerTurn? })
  • abort()

Tools and extension integration

Built-ins and filtering

  • Built-ins come from createTools(...) and BUILTIN_TOOLS.
  • toolNames acts as an allowlist for built-ins.
  • customTools and extension-registered tools are still included.
  • Hidden tools (for example yield) are opt-in unless required by options.
const { session } = await createAgentSession({
  toolNames: ["read", "search", "find", "write"],
  requireYieldTool: true,
});

Extensions

  • extensions: inline ExtensionFactory[]
  • additionalExtensionPaths: load extra extension files
  • disableExtensionDiscovery: disable automatic extension scanning
  • preloadedExtensions: reuse already loaded extension set

Runtime tool set changes

AgentSession supports runtime activation updates:

  • getActiveToolNames()
  • getAllToolNames()
  • setActiveToolsByName(names)
  • refreshMCPTools(mcpTools)

System prompt is rebuilt to reflect active tool changes.

Discovery helpers

Use these when you want partial control without recreating internal discovery logic:

  • discoverAuthStorage(agentDir?)
  • discoverExtensions(cwd?)
  • discoverSkills(cwd?, _agentDir?, settings?)
  • discoverContextFiles(cwd?, _agentDir?)
  • discoverPromptTemplates(cwd?, agentDir?)
  • discoverSlashCommands(cwd?)
  • discoverCustomTSCommands(cwd?, agentDir?)
  • discoverMCPServers(cwd?)
  • buildSystemPrompt(options?)

Subagent-oriented options

For SDK consumers building orchestrators (similar to task executor flow):

  • outputSchema: passes structured output expectation into tool context
  • requireYieldTool: forces yield tool inclusion
  • taskDepth: recursion-depth context for nested task sessions
  • parentTaskPrefix: artifact naming prefix for nested task outputs

These are optional for normal single-agent embedding.

createAgentSession() return value

type CreateAgentSessionResult = {
  session: AgentSession;
  extensionsResult: LoadExtensionsResult;
  setToolUIContext: (uiContext: ExtensionUIContext, hasUI: boolean) => void;
  mcpManager?: MCPManager;
  modelFallbackMessage?: string;
  lspServers?: Array<{
    name: string;
    status: "ready" | "error";
    fileTypes: string[];
    error?: string;
  }>;
  eventBus: EventBus;
};

Use setToolUIContext(...) only if your embedder provides UI capabilities that tools/extensions should call into.

Startup performance

createAgentSession() runs two background optimizations to overlap I/O with the rest of session setup:

  • Model-host preconnect. As soon as the model is resolved, the SDK fires a best-effort fetch.preconnect(model.baseUrl) so DNS + TCP + TLS + HTTP/2 to the provider's host happens in parallel with extension/skill load, tool registry build, and system-prompt assembly. The first real fetch(...) then reuses the warm connection, saving 100–300 ms on transcontinental hops (e.g. residential IP → api.anthropic.com). Implementation lives in preconnectModelHost() in packages/coding-agent/src/sdk.ts. If fetch.preconnect is unavailable (non-Bun runtime) or the call throws, the optimization is silently skipped — never a hard dependency. Applies to every mode (interactive, print, RPC, ACP).

  • Conditional LSP warmup. Startup LSP servers (those returned by discoverStartupLspServers(cwd)) are only warmed when all of these hold:

    • enableLsp !== false on the session options, and
    • options.hasUI === true (interactive TUI), and
    • the lsp.diagnosticsOnWrite setting is enabled.

    Print / script / RPC / ACP invocations (hasUI=false) skip the warmup entirely: they don't render the warmup status indicator and typically finish before the language servers would stabilize, so warming them just spends CPU parsing big initialize responses concurrently with the LLM stream consumer and jitters perceived latency. Tools that actually need an LSP server still spin one up on demand through getOrCreateClient() — only the startup warmup is skipped. The returned lspServers field in CreateAgentSessionResult is therefore undefined (not an empty array) whenever the warmup branch was bypassed.

Minimal controlled embed example

import {
  createAgentSession,
  discoverAuthStorage,
  ModelRegistry,
  SessionManager,
  Settings,
} from "@oh-my-pi/pi-coding-agent";

const authStorage = await discoverAuthStorage();
const modelRegistry = new ModelRegistry(authStorage);
await modelRegistry.refresh();

const settings = Settings.isolated({
  "compaction.enabled": true,
  "retry.enabled": true,
});

const { session } = await createAgentSession({
  authStorage,
  modelRegistry,
  settings,
  sessionManager: SessionManager.inMemory(),
  toolNames: ["read", "search", "find", "edit", "write"],
  enableMCP: false,
  enableLsp: true,
});

session.subscribe((event) => {
  if (
    event.type === "message_update" &&
    event.assistantMessageEvent.type === "text_delta"
  ) {
    process.stdout.write(event.assistantMessageEvent.delta);
  }
});

await session.prompt("Find all TODO comments in this repo and propose fixes.");
await session.dispose();