browser
Open, reuse, close, and script Puppeteer tabs against headless Chromium or CDP-attached apps.
Source
- Entry: `packages/coding-agent/src/tools/browser.ts`
- Model-facing prompt: `packages/coding-agent/src/prompts/tools/browser.md`
- Key collaborators:
  - `packages/coding-agent/src/tools/browser/tab-supervisor.ts`: global tab registry; worker lifecycle; run/close coordination.
  - `packages/coding-agent/src/tools/browser/tab-worker.ts`: executes `run` code; implements the `tab` helper API.
  - `packages/coding-agent/src/tools/browser/tab-worker-entry.ts`: worker-thread transport bootstrap.
  - `packages/coding-agent/src/tools/browser/registry.ts`: browser-handle registry keyed by browser kind.
  - `packages/coding-agent/src/tools/browser/launch.ts`: Puppeteer loading, Chromium resolution/download, headless launch, stealth injection.
  - `packages/coding-agent/src/tools/browser/attach.ts`: CDP attach/reuse, target picking, spawned-app process handling.
  - `packages/coding-agent/src/tools/browser/tab-protocol.ts`: worker init/run/result message schema.
  - `packages/coding-agent/src/tools/browser/readable.ts`: `tab.extract()` readability extraction.
  - `packages/coding-agent/src/tools/browser/render.ts`: TUI rendering for `open`/`close` status lines and `run` JS cells.
  - `packages/coding-agent/src/tools/puppeteer/00_stealth_tampering.txt`: mask patched functions/descriptors as native.
  - `packages/coding-agent/src/tools/puppeteer/01_stealth_activity.txt`: synthesize visibility/focus/scroll activity.
  - `packages/coding-agent/src/tools/puppeteer/02_stealth_hairline.txt`: fix Modernizr hairline detection.
  - `packages/coding-agent/src/tools/puppeteer/03_stealth_botd.txt`: spoof `navigator.webdriver`, `window.chrome`, and Chrome fingerprint surfaces.
  - `packages/coding-agent/src/tools/puppeteer/04_stealth_iframe.txt`: patch iframe `contentWindow`/`srcdoc` behavior.
  - `packages/coding-agent/src/tools/puppeteer/05_stealth_webgl.txt`: spoof WebGL vendor/renderer/precision.
  - `packages/coding-agent/src/tools/puppeteer/06_stealth_screen.txt`: normalize screen/viewport/device-pixel-ratio values.
  - `packages/coding-agent/src/tools/puppeteer/07_stealth_fonts.txt`: spoof local fonts and perturb canvas text rendering.
  - `packages/coding-agent/src/tools/puppeteer/08_stealth_audio.txt`: spoof audio latency/sample-rate and perturb offline rendering.
  - `packages/coding-agent/src/tools/puppeteer/09_stealth_locale.txt`: force locale/languages/timezone/date strings.
  - `packages/coding-agent/src/tools/puppeteer/10_stealth_plugins.txt`: synthesize `navigator.plugins`/`navigator.mimeTypes`.
  - `packages/coding-agent/src/tools/puppeteer/11_stealth_hardware.txt`: spoof `navigator.hardwareConcurrency`.
  - `packages/coding-agent/src/tools/puppeteer/12_stealth_codecs.txt`: spoof media codec support.
  - `packages/coding-agent/src/tools/puppeteer/13_stealth_worker.txt`: carry UA/platform spoofing into `Worker`/`SharedWorker`.
Inputs
Shared fields
| Field | Type | Required | Description |
|---|---|---|---|
| `action` | `"open" \| "close" \| "run"` | Yes | Dispatches to the open/close/run path. |
| `name` | `string` | No | Tab id. Defaults to `"main"`. Tabs live in a process-global map, so the same name is reused across later calls and in-process subagents until closed. |
| `timeout` | `number` | No | Tool wall-clock timeout in seconds. Defaults to `30`; clamped to the browser tool range before execution. |
action: "open"
| Field | Type | Required | Description |
|---|---|---|---|
| `url` | `string` | No | Navigate after the tab is ready. Existing reusable tabs also navigate when `url` is supplied. |
| `viewport` | `{ width: number; height: number; scale?: number }` | No | Requested viewport. For headless launch this becomes the initial viewport; for a page it is applied with `page.setViewport()`. `scale` maps to Puppeteer `deviceScaleFactor`. |
| `wait_until` | `"load" \| "domcontentloaded" \| "networkidle0" \| "networkidle2"` | No | Navigation wait condition. Defaults to `"load"` where omitted, including `open` navigation and later `tab.goto(...)`. |
| `dialogs` | `"accept" \| "dismiss"` | No | Installs a page dialog handler that auto-accepts or auto-dismisses dialogs. Omitted means no handler. |
| `app` | `{ path?: string; cdp_url?: string; args?: string[]; target?: string }` | No | Selects browser kind. No `app` uses the session `browser.headless` setting. `app.path` is resolved against the session cwd and used as the executable path for spawn/attach reuse. `app.cdp_url` connects to an existing CDP endpoint. `args` are appended only when spawning `app.path`. `target` is only used for attached/spawned-app page selection. |
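
The way `app` selects a browser kind can be sketched as a small pure function. This is an illustration only, not the tool's code: the names `resolveBrowserKindSketch` and `resolveAgainst`, the extra `cdpUrl`/`path` fields on the result, and the precedence of `cdp_url` over `path` when both are supplied are assumptions, and the path join is a POSIX-only simplification.

```typescript
// Hypothetical shapes for illustration; the real types live in the tool sources.
type AppInput = { path?: string; cdp_url?: string; args?: string[]; target?: string };

type BrowserKindSketch =
  | { kind: "connected"; cdpUrl: string }
  | { kind: "spawned"; path: string }
  | { kind: "headless"; headless: boolean };

// Simplified POSIX-only stand-in for real path resolution (assumption).
function resolveAgainst(cwd: string, p: string): string {
  return p.startsWith("/") ? p : `${cwd.replace(/\/+$/, "")}/${p}`;
}

function resolveBrowserKindSketch(
  app: AppInput | undefined,
  cwd: string,
  headlessSetting: boolean,
): BrowserKindSketch {
  if (app?.cdp_url) {
    // Connected: attach to an existing endpoint, trailing slashes trimmed.
    return { kind: "connected", cdpUrl: app.cdp_url.replace(/\/+$/, "") };
  }
  if (app?.path) {
    // Spawned: executable path resolved against the session cwd.
    return { kind: "spawned", path: resolveAgainst(cwd, app.path) };
  }
  // No app: fall back to the session's browser.headless setting.
  return { kind: "headless", headless: headlessSetting };
}
```

Under these assumptions, `{ cdp_url: "http://127.0.0.1:9222/" }` yields a connected kind with the slash removed, and omitting `app` yields the headless kind.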
action: "close"
| Field | Type | Required | Description |
|---|---|---|---|
| `all` | `boolean` | No | Close every known tab. Omitted closes only `name`. |
| `kill` | `boolean` | No | When a tab release drops a spawned-app browser handle to refcount 0, also terminate its process tree. Has no effect on headless shutdown and only disconnects connected CDP browsers. |
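
The interaction between `kill` and the browser-handle refcount can be sketched as follows. All names here (`HandleSketch`, `DisposeStep`, `releaseHandleSketch`) are hypothetical, and the `"close-browser"` step for headless handles is an assumption; the table above only states that `kill` has no effect on headless shutdown.

```typescript
type HandleKind = "headless" | "spawned" | "connected";

interface HandleSketch {
  kind: HandleKind;
  refCount: number;
}

type DisposeStep = "close-browser" | "disconnect" | "kill-process-tree";

// Decrements the refcount and returns the teardown steps to perform.
// An empty array means other tabs still hold the handle.
function releaseHandleSketch(handle: HandleSketch, kill: boolean): DisposeStep[] {
  handle.refCount = Math.max(0, handle.refCount - 1);
  if (handle.refCount > 0) return [];
  switch (handle.kind) {
    case "headless":
      return ["close-browser"]; // kill is ignored here (step name is an assumption)
    case "connected":
      return ["disconnect"]; // no process ownership, so never killed
    case "spawned":
      return kill ? ["disconnect", "kill-process-tree"] : ["disconnect"];
  }
}
```

The point of the sketch is that `kill` only matters at the moment the last tab on a spawned-app handle is released.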
action: "run"
| Field | Type | Required | Description |
|---|---|---|---|
| `code` | `string` | Yes | Async-function body executed in a VM context with `page`, `browser`, `tab`, `display`, `assert`, `wait`, `console`, timers, `URL`, `TextEncoder`, `TextDecoder`, and `Buffer` in scope. |
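
To make the three input shapes concrete, the tables above can be transcribed into a discriminated union with one sample call per action. The type names and the sample values (tab name, URL, viewport size) are illustrative assumptions; only the field names and allowed values come from the tables.

```typescript
type Viewport = { width: number; height: number; scale?: number };
type WaitUntil = "load" | "domcontentloaded" | "networkidle0" | "networkidle2";
type AppSpec = { path?: string; cdp_url?: string; args?: string[]; target?: string };

type Shared = { name?: string; timeout?: number };

type OpenInput = Shared & {
  action: "open";
  url?: string;
  viewport?: Viewport;
  wait_until?: WaitUntil;
  dialogs?: "accept" | "dismiss";
  app?: AppSpec;
};
type CloseInput = Shared & { action: "close"; all?: boolean; kill?: boolean };
type RunInput = Shared & { action: "run"; code: string };

type BrowserInput = OpenInput | CloseInput | RunInput;

// One call per action against the same named tab.
const exampleCalls: BrowserInput[] = [
  {
    action: "open",
    name: "docs",
    url: "https://example.com",
    viewport: { width: 1280, height: 800 },
    wait_until: "domcontentloaded",
  },
  {
    action: "run",
    name: "docs",
    timeout: 60,
    // The code string is an async-function body, so top-level await and return work.
    code: "await tab.waitFor('h1');\nreturn await tab.title();",
  },
  { action: "close", name: "docs" },
];
```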
Outputs
The tool returns one result per call; no streaming partial output is emitted from the browser implementation itself.
- `open`: text content with `Opened` or `Reused`, browser description, URL, and optional title. `details` includes `action`, `name`, `browser`, `url`, `viewport`, and the same text in `details.result`.
- `close`: text content with either `Closed ...` or `No tab named ...`. `details` includes `action`, `name`, and `details.result`.
- `run`: ordered `content` array built as:
  - every `display(value)` call in execution order,
  - final return value, JSON-stringified unless already a string,
  - or `Ran code on tab "..."` if nothing else was produced.
- `display(value)` coercion in `packages/coding-agent/src/tools/browser/tab-worker.ts`:
  - `{ type: "image", data: string, mimeType: string }` becomes image content,
  - `string` becomes text content,
  - other values become pretty JSON text when serializable, else `String(value)`.
- `tab.screenshot()` also appends text plus an image content item unless `silent: true`; `details.screenshots` records persisted screenshot metadata `{ dest, mimeType, bytes, width, height }`.
- `run` `details` includes `action`, `name`, current `browser`/`url` when the tab exists, optional `screenshots`, and `details.result` containing only the concatenated text outputs.
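
The `display(value)` coercion and the ordering of the `run` content array can be sketched like this. It is a minimal sketch under assumptions: the function and type names are hypothetical, and two-space indentation for "pretty JSON" is a guess.

```typescript
type ContentItem =
  | { type: "text"; text: string }
  | { type: "image"; data: string; mimeType: string };

function isImagePayload(v: unknown): v is { type: "image"; data: string; mimeType: string } {
  if (typeof v !== "object" || v === null) return false;
  const o = v as Record<string, unknown>;
  return o.type === "image" && typeof o.data === "string" && typeof o.mimeType === "string";
}

// Image payloads pass through, strings become text, everything else is
// pretty-printed JSON when serializable and String(value) otherwise.
function coerceDisplaySketch(value: unknown): ContentItem {
  if (isImagePayload(value)) return { type: "image", data: value.data, mimeType: value.mimeType };
  if (typeof value === "string") return { type: "text", text: value };
  try {
    const json = JSON.stringify(value, null, 2); // indent width is an assumption
    if (typeof json === "string") return { type: "text", text: json };
  } catch {
    // Not serializable (for example a circular structure); fall through.
  }
  return { type: "text", text: String(value) };
}

// display() items first, then the return value, then the fallback line.
function buildRunContentSketch(displays: unknown[], returnValue: unknown, tabName: string): ContentItem[] {
  const content = displays.map(coerceDisplaySketch);
  if (returnValue !== undefined) {
    const text =
      typeof returnValue === "string" ? returnValue : (JSON.stringify(returnValue) ?? String(returnValue));
    content.push({ type: "text", text });
  }
  if (content.length === 0) content.push({ type: "text", text: `Ran code on tab "${tabName}"` });
  return content;
}
```

Screenshot items are left out of the sketch; per the list above they are appended by `tab.screenshot()` itself.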
Flow
- `BrowserTool.execute()` (`packages/coding-agent/src/tools/browser.ts`) abort-checks, clamps `timeout` via `clampTimeout("browser", ...)`, defaults `name` to `"main"`, and dispatches on `action`.
- `open` resolves browser kind with `resolveBrowserKind()`:
  - `app.cdp_url` → `{ kind: "connected" }` after trimming trailing slashes.
  - `app.path` → `{ kind: "spawned" }` after resolving against session cwd.
  - otherwise → `{ kind: "headless", headless: session.settings.get("browser.headless") }`.
- `open` rejects reusing the same tab name across different browser kinds (`sameBrowserKind()`); callers must close first.
- `open` acquires a browser handle through `acquireBrowser()` (`packages/coding-agent/src/tools/browser/registry.ts`):
  - existing connected handle is reused by browser-kind key;
  - stale disconnected handles are disposed and recreated;
  - headless launches via `launchHeadlessBrowser()`;
  - `connected` waits for `${cdpUrl}/json/version`, then `puppeteer.connect()`;
  - `spawned` first tries `findReusableCdp()`, else kills same-path processes, allocates a free loopback port, spawns the executable with `--remote-debugging-port=<port>`, waits for CDP, then connects.
- `open` acquires a tab through `acquireTab()` (`packages/coding-agent/src/tools/browser/tab-supervisor.ts`):
  - same-name + same-browser + alive tab is reused unless `dialogs` changed;
  - same-name but different browser handle, dead state, or changed dialog policy forces release and recreation;
  - reusing with a new `url` navigates by issuing `await tab.goto(...)` through the worker, defaulting to `waitUntil: "load"` when `wait_until` is omitted.
- New tabs build a `WorkerInitPayload` in `buildInitPayload()`:
  - headless mode sends `url`, `waitUntil`, `viewport`, `dialogs`, and timeout; the worker defaults missing `waitUntil` to `"load"`.
  - attach mode resolves a page with `pickElectronTarget()`, gets its target id, and sends `targetId` plus `dialogs`.
- `acquireTab()` spawns a dedicated Bun `Worker` from `tab-worker-entry.ts`; if that fails it falls back to inline execution in the main thread (`spawnInlineWorker()`), preserving behavior but losing protection against synchronous infinite loops.
- `WorkerCore.#init()` (`packages/coding-agent/src/tools/browser/tab-worker.ts`) connects back to the browser websocket endpoint. Headless mode opens a new page, applies stealth patches, applies viewport, installs dialog handling if requested, and optionally navigates. Attach mode resolves the requested target page and optionally installs dialog handling.
- On success the worker sends `ready` with `{ url, title, viewport, targetId }`; the supervisor stores a `TabSession`, increments browser-handle refcount with `holdBrowser()`, and keeps the tab in a process-global `Map<string, TabSession>`.
- `run` requires non-empty `code`, looks up the tab with `getTab()`, then delegates to `runInTab()`.
- `runInTabWithSnapshot()` rejects dead tabs and concurrent runs (`Tab ... is busy`), captures session cwd plus optional `browser.screenshotDir`, registers an abort hook, sends a `run` message to the worker, and races the result against `timeoutMs + 750` ms. Timeouts force-kill the tab worker and, for headless tabs, close the orphaned page target.
- `WorkerCore.#run()` creates a VM context, exposes the raw Puppeteer `page`/`browser` plus a synthetic `tab` API, and executes `(async () => { ...code... })()` via `vm.runInContext()`.
- The `tab` helper API implemented in `#createTabApi()` is:
  - `tab.name: string`
  - `tab.page: Page`
  - `tab.signal?: AbortSignal`
  - `tab.url(): string`
  - `tab.title(): Promise<string>`
  - `tab.goto(url, { waitUntil? })`
  - `tab.observe({ includeAll?, viewportOnly? })`
  - `tab.screenshot({ selector?, fullPage?, save?, silent? })`
  - `tab.extract(format = "markdown")`
  - `tab.click(selector)`
  - `tab.type(selector, text)`
  - `tab.fill(selector, value)`
  - `tab.press(key, { selector? })`
  - `tab.scroll(deltaX, deltaY)`
  - `tab.drag(from, to)`
  - `tab.waitFor(selector)`
  - `tab.evaluate(fn, ...args)`
  - `tab.scrollIntoView(selector)`
  - `tab.select(selector, ...values)`
  - `tab.uploadFile(selector, ...filePaths)`
  - `tab.waitForUrl(pattern, { timeout? })`
  - `tab.waitForResponse(pattern, { timeout? })`
  - `tab.id(n)`
- Selector handling in `normalizeSelector()` accepts plain CSS and Puppeteer query handlers, and rewrites legacy Playwright-style prefixes `p-text/`, `p-xpath/`, `p-pierce/`, `p-aria/`; other `p-*` prefixes throw a `ToolError`.
- `tab.observe()` clears the element cache, takes a Puppeteer accessibility snapshot, filters to interactive nodes unless `includeAll`, optionally filters to viewport-visible nodes, assigns numeric ids, caches `ElementHandle`s, and returns URL/title/viewport/scroll metadata plus `elements`.
- `tab.id(n)` resolves the cached `ElementHandle`, verifies `el.isConnected`, and throws a stale-id error after cache invalidation if the DOM changed or the cache was cleared.
- `tab.goto()` clears the cached element ids before navigating. Any new `tab.observe()` also clears and rebuilds the cache.
- `tab.click()` uses a custom retry loop for `text/...` selectors to find an actionable visible match; other selectors use `page.locator(...).click()` with the run timeout.
- `tab.screenshot()` captures either the whole page or a selector PNG, downsizes a copy for model output, chooses a persistence path, writes the image to disk, records metadata, and optionally emits text + image display entries.
- `display()` calls accumulate in an array. After code finishes, the worker posts `{ displays, returnValue, screenshots }`; `BrowserTool.#run()` appends the return value as trailing text content when not `undefined`.
- `close` releases one tab or all tabs via `releaseTab()` / `releaseAllTabs()`. Each tab aborts pending runs, asks the worker to close, waits up to `750` ms for a `closed` ack, terminates the worker, decrements browser refcount, and disposes the browser handle when refcount reaches zero.
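
Two timing rules in the flow above, a run raced against `timeoutMs + 750` ms and close waiting up to `750` ms for a `closed` ack, both reduce to racing a promise against a timer. The sketch below shows the run side. `raceWithDeadline`, `runWithGraceSketch`, and the `killTab` callback are hypothetical names, not the supervisor's API.

```typescript
const GRACE_MS_SKETCH = 750; // mirrors the documented supervisor grace period

type Outcome<T> = { kind: "result"; value: T } | { kind: "timeout" };

// Resolves with the work's value, or with a timeout marker once deadlineMs elapses.
function raceWithDeadline<T>(work: Promise<T>, deadlineMs: number): Promise<Outcome<T>> {
  return new Promise<Outcome<T>>((resolve, reject) => {
    const timer = setTimeout(() => resolve({ kind: "timeout" }), deadlineMs);
    work.then(
      value => {
        clearTimeout(timer);
        resolve({ kind: "result", value });
      },
      err => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Gives the worker its own timeout plus a grace window, then tears the tab down.
async function runWithGraceSketch<T>(
  work: Promise<T>,
  timeoutMs: number,
  killTab: () => void,
  graceMs: number = GRACE_MS_SKETCH,
): Promise<T> {
  const outcome = await raceWithDeadline(work, timeoutMs + graceMs);
  if (outcome.kind === "timeout") {
    killTab(); // stand-in for force-killing the worker and closing an orphaned page
    throw new Error("Browser code execution hung past grace; tab killed");
  }
  return outcome.value;
}
```

The close path could reuse `raceWithDeadline` with a `750` ms deadline and simply proceed to worker termination on either outcome.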
Modes / Variants
- Action dispatch
  - `open`: acquire/reuse browser + tab.
  - `close`: release one tab or all tabs.
  - `run`: execute JS inside the tab worker.
- Browser kind
  - Headless: launches local Chromium with Puppeteer, applies stealth patches, and creates a fresh page per tab.
  - Spawned app (`app.path`): reuses an existing CDP-enabled process for that executable when possible; otherwise kills same-path processes, spawns the executable with remote debugging enabled, then attaches. No stealth patches are injected.
  - Connected browser (`app.cdp_url`): attaches to an already-running CDP endpoint. No process ownership; close only disconnects.
- Target selection for attached/spawned browsers
  - With `app.target`, `pickElectronTarget()` returns the first page whose URL or title contains the case-insensitive substring.
  - Without `app.target`, it skips titles/URLs matching `request handler|devtools|background page|background host|service worker` and otherwise falls back to the first page.
- Worker mode
  - Dedicated worker: normal path; user code runs off the main thread and can be aborted even when it blocks synchronously.
  - Inline fallback: activated when Bun worker spawn fails; behavior matches, but synchronous infinite loops in user code cannot be interrupted.
- Dialog policy
  - No `dialogs` field: no auto-handler.
  - `accept` / `dismiss`: page `dialog` events are handled automatically.
  - Changing dialog policy on an existing live tab forces tab recreation instead of mutating the worker in place.
- Screenshot persistence
  - `save` provided: persist full-resolution PNG at the resolved cwd-relative or absolute path.
  - `browser.screenshotDir` session setting set: persist full-resolution PNG under that directory with a timestamped filename.
  - Neither set: persist the resized image to a temp-file path under the OS temp dir.
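
The target-selection rules for attached and spawned browsers can be sketched over plain `{ url, title }` records. This is a sketch under assumptions: `pickTargetSketch` and `PageInfo` are hypothetical names, treating the skip pattern as case-insensitive is a guess, and the real function works on Puppeteer pages rather than plain objects.

```typescript
interface PageInfo {
  url: string;
  title: string;
}

// Pattern text taken from the list above; the "i" flag is an assumption.
const SKIP_PATTERN = /request handler|devtools|background page|background host|service worker/i;

function pickTargetSketch(pages: PageInfo[], target?: string): PageInfo {
  const first = pages[0];
  if (first === undefined) {
    throw new Error("No page targets available on the attached browser");
  }
  if (target) {
    // Case-insensitive substring match against URL or title; first hit wins.
    const needle = target.toLowerCase();
    const hit = pages.find(
      p => p.url.toLowerCase().includes(needle) || p.title.toLowerCase().includes(needle),
    );
    if (!hit) throw new Error(`No page target matched "${target}".`);
    return hit;
  }
  // No explicit target: prefer the first page that does not look like tooling.
  const preferred = pages.find(p => !SKIP_PATTERN.test(p.title) && !SKIP_PATTERN.test(p.url));
  return preferred ?? first;
}
```

With a DevTools page listed first and an app window second, calling the function without `target` returns the app window; if every page matches the skip pattern, the first page is returned.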
Side Effects
- Filesystem
  - `loadPuppeteer()` writes `{}` to `<puppeteer-safe-dir>/package.json` before importing `puppeteer-core`.
  - First headless launch may download Chromium into the Puppeteer cache directory returned by `getPuppeteerDir()`.
  - `tab.screenshot()` creates parent directories and writes image files.
  - `tab.uploadFile()` resolves supplied paths against the session cwd.
- Network
  - CDP attach paths poll `http://127.0.0.1:<port>/json/version` or the supplied `<cdp_url>/json/version`.
  - Headless/browser-attach sessions create CDP websocket connections.
  - Headless first-use Chromium download uses `@puppeteer/browsers`.
  - User `page`/`tab` operations perform normal browser network traffic.
- Subprocesses / native bindings
  - Headless mode launches Chromium through Puppeteer.
  - `app.path` mode may spawn the target executable via `Bun.spawn()`.
  - `killExistingByPath()` / `gracefulKillTreeOnce()` use `@oh-my-pi/pi-natives` process inspection/termination.
  - Worker mode uses Bun `Worker`; fallback mode does not.
- Session state (transcript, memory, jobs, checkpoints, registries)
  - Browser handles are cached in a process-global `Map` keyed by browser kind in `packages/coding-agent/src/tools/browser/registry.ts`.
  - Tabs are cached in a process-global `Map` keyed by `name` in `packages/coding-agent/src/tools/browser/tab-supervisor.ts`.
  - `run` captures session cwd and optional `browser.screenshotDir` for screenshot/save path resolution.
  - `restartForModeChange()` drops only headless tabs.
- User-visible prompts / interactive UI
  - None beyond normal tool output. Dialog auto-handling is invisible unless it fails and emits debug logs.
- Background work / cancellation
  - `open`, `run`, CDP waits, and browser actions thread through abort signals.
  - A timed-out `run` aborts the worker execution path and can tear down the tab.
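
The CDP readiness polling mentioned under Network can be sketched as a loop over `/json/version` with a deadline and an abort check. The names `waitForCdpSketch`, `Probe`, and `fetchProbe` are hypothetical; the probe is injected so the loop can be exercised without a browser, and the `150` ms default matches the documented polling cadence.

```typescript
type Probe = (versionUrl: string) => Promise<boolean>;

const sleepMs = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms));

// One possible probe: any successful HTTP response counts as ready (assumption).
const fetchProbe: Probe = async versionUrl => {
  try {
    return (await fetch(versionUrl)).ok;
  } catch {
    return false; // endpoint not listening yet
  }
};

async function waitForCdpSketch(
  baseUrl: string,
  timeoutMs: number,
  probe: Probe = fetchProbe,
  intervalMs: number = 150,
  signal?: AbortSignal,
): Promise<void> {
  const versionUrl = `${baseUrl.replace(/\/+$/, "")}/json/version`;
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    if (signal?.aborted) throw new Error("Aborted while waiting for CDP endpoint");
    if (await probe(versionUrl)) return;
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Timed out waiting for CDP endpoint ${versionUrl}`);
    }
    await sleepMs(intervalMs);
  }
}
```

A caller attaching to a spawned app might use `await waitForCdpSketch("http://127.0.0.1:9222", 30_000)` before connecting; the port here is only an example.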
Limits & Caps
- Tool timeout clamp: default `30` s, min `1` s, max `300` s (`TOOL_TIMEOUTS.browser` in `packages/coding-agent/src/tools/tool-timeouts.ts`).
- Supervisor grace period around init/run/close: `750` ms (`GRACE_MS` in `packages/coding-agent/src/tools/browser/tab-supervisor.ts`).
- Puppeteer protocol timeout for launch/connect operations: `60_000` ms (`BROWSER_PROTOCOL_TIMEOUT_MS` in `packages/coding-agent/src/tools/browser/launch.ts`).
- Connected-browser CDP readiness wait: `5_000` ms before `puppeteer.connect()` (`packages/coding-agent/src/tools/browser/registry.ts`).
- Spawned-app CDP readiness wait after spawn: `30_000` ms (`packages/coding-agent/src/tools/browser/registry.ts`).
- CDP polling cadence: 150 ms in `waitForCdp()` (`packages/coding-agent/src/tools/browser/attach.ts`).
- Headless default viewport: `1365x768` at `deviceScaleFactor: 1.25` (`DEFAULT_VIEWPORT` in `packages/coding-agent/src/tools/browser/launch.ts`).
- Screenshot model-attachment resize cap: `maxWidth 1024`, `maxHeight 1024`, `maxBytes 150 * 1024`, `jpegQuality 70` (`packages/coding-agent/src/tools/browser/tab-worker.ts`).
- `tab.waitForUrl()` polling interval: `200` ms (`packages/coding-agent/src/tools/browser/tab-worker.ts`).
- Drag simulation uses `12` mouse-move steps (`packages/coding-agent/src/tools/browser/tab-worker.ts`).
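
As a worked example of how the first two limits combine, the sketch below clamps a requested timeout to the documented range and derives the deadline a run would be raced against. The function and constant names are hypothetical, and falling back to the default for non-finite input is an assumption.

```typescript
interface TimeoutRange {
  default: number;
  min: number;
  max: number;
}

// Seconds; values copied from the limits above, constant name is illustrative.
const BROWSER_TIMEOUT_RANGE: TimeoutRange = { default: 30, min: 1, max: 300 };

function clampTimeoutSketch(
  requested: number | undefined,
  range: TimeoutRange = BROWSER_TIMEOUT_RANGE,
): number {
  if (requested === undefined || !Number.isFinite(requested)) return range.default;
  return Math.min(range.max, Math.max(range.min, requested));
}

// Deadline in milliseconds: clamped timeout plus the supervisor grace period.
function runDeadlineMs(timeoutSeconds: number | undefined, graceMs: number = 750): number {
  return clampTimeoutSketch(timeoutSeconds) * 1000 + graceMs;
}
```

For instance, a requested timeout of `1000` is clamped to `300`, and a request of `2` seconds gives a race deadline of `2750` ms.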
Errors
- `BrowserTool.execute()` converts DOM-style `AbortError` into `ToolAbortError`; other errors propagate.
- `run` hard-fails on missing code: `Missing required parameter 'code' for action 'run'`.
- `open` fails when reusing a name across browser kinds: `Tab "..." is bound to a different browser (...). Close it first.`
- `runInTabWithSnapshot()` fails when the tab is absent/dead (`Tab "..." is not alive. Reopen it.`) or already running (`Tab "..." is busy`).
- Worker init failures and run failures are serialized through `RunErrorPayload`; `ToolError` and abort state are reconstructed on the host side by `errorFromPayload()`.
- Attached-target mismatches surface as:
  - `No page targets available on the attached browser`
  - `No page target matched "...". Available pages:\n...`
  - `Target ... is no longer available on the attached browser`
- Spawned-app path validation requires an absolute executable path after cwd resolution, not an app bundle directory path.
- Spawn/attach failures are wrapped into `ToolError`s such as `Timed out waiting for CDP endpoint ...`, `Failed to attach to ...`, or `Connected to ... but puppeteer.connect failed: ...`.
- `tab` helper errors are user-visible `ToolError`s, including unsupported selector prefix, stale/unknown element id, invalid drag target, missing upload files, non-`<select>` for `tab.select()`, non-file-input for `tab.uploadFile()`, and screenshot selector misses.
- On run timeout, the worker reports `Browser code execution timed out after <ms>ms`; the supervisor may escalate to `Browser code execution hung past grace; tab killed` if the worker does not respond after the grace window.
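
Errors lose their class identity when they cross a worker boundary, which is why a payload plus host-side reconstruction is needed. The sketch below illustrates that round trip. The payload fields (`name`, `message`, `isToolError`, `isAbort`) and every class and function name are assumptions for illustration; the actual `RunErrorPayload` schema is defined in `tab-protocol.ts` and may differ.

```typescript
class ToolErrorSketch extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ToolError";
    Object.setPrototypeOf(this, new.target.prototype); // keeps instanceof working on ES5 targets
  }
}

class ToolAbortErrorSketch extends Error {
  constructor(message: string = "Aborted") {
    super(message);
    this.name = "ToolAbortError";
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Hypothetical wire shape: plain data only, so it survives message passing.
interface RunErrorPayloadSketch {
  name: string;
  message: string;
  isToolError: boolean;
  isAbort: boolean;
}

// Worker side: flatten any thrown value into plain data.
function toPayloadSketch(err: unknown): RunErrorPayloadSketch {
  if (err instanceof Error) {
    return {
      name: err.name,
      message: err.message,
      isToolError: err instanceof ToolErrorSketch,
      isAbort: err.name === "AbortError" || err instanceof ToolAbortErrorSketch,
    };
  }
  return { name: "Error", message: String(err), isToolError: false, isAbort: false };
}

// Host side: rebuild the right error class from the flags.
function errorFromPayloadSketch(payload: RunErrorPayloadSketch): Error {
  if (payload.isAbort) return new ToolAbortErrorSketch(payload.message);
  if (payload.isToolError) return new ToolErrorSketch(payload.message);
  const err = new Error(payload.message);
  err.name = payload.name;
  return err;
}
```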
Notes
- `loadPuppeteer()` and `loadPuppeteerInWorker()` temporarily redirect `cwd` to a safe Puppeteer directory before importing `puppeteer-core`, because Puppeteer probes the current working directory during module load.
- Headless launch prefers a detected system Chrome/Chromium, then `PUPPETEER_EXECUTABLE_PATH`, and only then downloads Chromium.
- Headless launch always passes `--no-sandbox`, `--disable-setuid-sandbox`, `--disable-blink-features=AutomationControlled`, and a `--window-size=...` matching the initial viewport. It also ignores Puppeteer default args `--disable-extensions`, `--disable-default-apps`, and `--disable-component-extensions-with-background-pages`.
- Proxy-related env vars only affect headless launch: `PUPPETEER_PROXY`, `PUPPETEER_PROXY_BYPASS_LOOPBACK`, and `PUPPETEER_PROXY_IGNORE_CERT_ERRORS`.
- Stealth patches are applied only in headless mode. Spawned or externally connected browsers are intentionally left untouched.
- `applyStealthPatches()` also strips Puppeteer's `//# sourceURL=__puppeteer_evaluation_script__` suffix from CDP `Runtime.evaluate` / `Runtime.callFunctionOn` payloads.
- `tab.extract()` reads `page.content()`, runs Readability first, then falls back to `main article` / `article` / `main` / `[role='main']` / `body`, and returns `null` if neither extraction path yields content.
- `close(all: true, kill: false)` disconnects from spawned/connected browsers when the last tab closes but leaves spawned app processes running.
- Headless orphan cleanup is best-effort: if a worker dies before closing its page, the supervisor searches browser targets by `targetId` and closes that page.
- Console methods inside `run` do not appear in tool output; they are forwarded as debug/warn/error logs through the worker transport.
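
The headless launch flags in the notes above can be assembled into a Puppeteer-style options object. This is a sketch, not the contents of `launch.ts`: the builder and type names are hypothetical, proxy handling is omitted, and whether the real code passes the viewport and protocol timeout through these exact option fields is an assumption. The field names themselves (`headless`, `defaultViewport`, `args`, `ignoreDefaultArgs`, `protocolTimeout`, `executablePath`) are standard Puppeteer launch options.

```typescript
interface ViewportSketch {
  width: number;
  height: number;
  deviceScaleFactor: number;
}

// Documented headless default viewport.
const DEFAULT_VIEWPORT_SKETCH: ViewportSketch = { width: 1365, height: 768, deviceScaleFactor: 1.25 };

interface LaunchOptionsSketch {
  headless: boolean;
  defaultViewport: ViewportSketch;
  args: string[];
  ignoreDefaultArgs: string[];
  protocolTimeout: number;
  executablePath?: string;
}

function buildLaunchOptionsSketch(
  headless: boolean,
  viewport: ViewportSketch = DEFAULT_VIEWPORT_SKETCH,
  executablePath?: string,
): LaunchOptionsSketch {
  const options: LaunchOptionsSketch = {
    headless,
    defaultViewport: viewport,
    args: [
      "--no-sandbox",
      "--disable-setuid-sandbox",
      "--disable-blink-features=AutomationControlled",
      // Chromium expects "width,height" for this flag.
      `--window-size=${viewport.width},${viewport.height}`,
    ],
    // Default args Puppeteer would otherwise add.
    ignoreDefaultArgs: [
      "--disable-extensions",
      "--disable-default-apps",
      "--disable-component-extensions-with-background-pages",
    ],
    protocolTimeout: 60000, // documented launch/connect protocol timeout in ms
  };
  // Only set executablePath when a system Chrome or env override was found.
  return executablePath ? { ...options, executablePath } : options;
}
```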