read
Read files, directories, archives, SQLite databases, internal resources, images, documents, and URLs through one `path` string.
Source
- Entry: `packages/coding-agent/src/tools/read.ts`
- Model-facing prompt: `packages/coding-agent/src/prompts/tools/read.md`
- Key collaborators:
  - `packages/coding-agent/src/tools/path-utils.ts`: split `path` from trailing selectors; normalize local paths.
  - `packages/coding-agent/src/tools/archive-reader.ts`: detect `archive.ext:inner/path`, index archives, list/read entries.
  - `packages/coding-agent/src/tools/sqlite-reader.ts`: detect SQLite targets, parse selectors, render tables.
  - `packages/coding-agent/src/tools/fetch.ts`: URL parsing, fetch/render pipeline, URL cache/artifacts.
  - `packages/coding-agent/src/internal-urls/router.ts`: resolve `agent://`, `artifact://`, `local://`, `mcp://`, `memory://`, `omp://`, `rule://`, `skill://`.
  - `packages/coding-agent/src/edit/notebook.ts`: convert `.ipynb` to editable `# %% [...] cell:N` text.
  - `packages/coding-agent/src/utils/file-display-mode.ts`: decide hashline vs line-number vs raw display.
  - `packages/coding-agent/src/workspace-tree.ts`: render directory trees.
  - `packages/coding-agent/src/edit/file-snapshot-store.ts`: stores read lines for later hashline edit verification/recovery.
  - `packages/coding-agent/src/tools/index.ts`: registers `read: s => new ReadTool(s)`.
Inputs
| Field | Type | Required | Description |
|---|---|---|---|
| `path` | `string` | Yes | Filesystem path, internal URL, or web URL. May end with a trailing selector such as `:50-100` or `:raw`. |
Selector grammar
For normal file-like reads, splitPathAndSel() in packages/coding-agent/src/tools/path-utils.ts recognizes the final suffix only when it matches one of these forms:
| Suffix | Meaning |
|---|---|
| `:raw` | Raw/verbatim mode. Disables structural summaries and line prefixes. |
| `:conflicts` | Render unresolved Git merge-conflict regions for a local file. |
| `:N` / `:LN` / `:N-` | Start at 1-indexed line `N`, open-ended. |
| `:A-B` / `:LA-LB` | Inclusive 1-indexed line range. |
| `:A+C` / `:LA+LC` | `C` lines starting at `A`; tool converts this to end line `A + C - 1`. |
| `:R1,R2,...` | Multiple ranges, sorted and merged before reading (for example `:5-16,960-973`). |
| `:range:raw` or `:raw:range` | Same line selection, but raw output. |
Validation in parseLineRangeChunk():
- line numbers are 1-indexed; `:0` throws.
- `+` counts must be `>= 1`.
- `-` end must be `>= start`.
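The selector forms and validation rules above can be sketched as a small parser. This is a minimal illustration, not the code in `path-utils.ts`: the names `LineRange`, `parseChunk`, and `parseSelector` are hypothetical, `:raw` and `:conflicts` are left out, and merging ranges that merely touch is an assumption.

```typescript
// Hypothetical shape; the real parseLineRangeChunk() may return something different.
interface LineRange {
  start: number; // 1-indexed, inclusive
  end: number | null; // inclusive; null means open-ended
}

// Parses one chunk: N, LN, N-, A-B, LA-LB, A+C, LA+LC.
function parseChunk(chunk: string): LineRange {
  const m = /^L?(\d+)(?:(-)(?:L?(\d+))?|(\+)L?(\d+))?$/.exec(chunk);
  if (!m) throw new Error(`Unrecognized line selector: ${chunk}`);
  const start = Number(m[1]);
  if (start < 1) {
    throw new Error(`Line selector ${start} is invalid; lines are 1-indexed. Use :1.`);
  }
  if (m[4] !== undefined) {
    const count = Number(m[5]);
    if (count < 1) throw new Error("+ count must be >= 1");
    return { start, end: start + count - 1 }; // A+C ends at A + C - 1
  }
  if (m[2] !== undefined && m[3] !== undefined) {
    const end = Number(m[3]);
    if (end < start) throw new Error("- end must be >= start");
    return { start, end };
  }
  return { start, end: null }; // N or N-
}

// Parses a comma-separated selector, then sorts and merges the ranges.
function parseSelector(selector: string): LineRange[] {
  const ranges = selector.split(",").map(parseChunk);
  ranges.sort((a, b) => a.start - b.start);
  const merged: LineRange[] = [];
  for (const range of ranges) {
    const last = merged[merged.length - 1];
    // Assumption: overlapping or touching ranges collapse into one.
    if (last && (last.end === null || range.start <= last.end + 1)) {
      if (last.end !== null) {
        last.end = range.end === null ? null : Math.max(last.end, range.end);
      }
    } else {
      merged.push({ ...range });
    }
  }
  return merged;
}
```

Under these assumptions, `parseSelector("20-30,5-10,8-12")` yields the two ranges `5-12` and `20-30`.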
Selector parsing intentionally falls through for unrecognized trailing :...; archive and SQLite paths consume their own colon syntax.
URL selectors are parsed separately in packages/coding-agent/src/tools/fetch.ts, but use the same line-range parser for :raw, :N, :A-B, :A+C, :5-10,20-30, and :range:raw / :raw:range. Because URL ports also use :, add a trailing slash before a selector on a host/port URL, e.g. https://example.com/:80.
Outputs
- Single-shot `AgentToolResult` built through `toolResult()` in `packages/coding-agent/src/tools/tool-result.ts`.
- `content` is usually one text block. Image reads may return `[text, image]`.
- `details` is path-dependent. `ReadToolDetails` may include:
  - `kind: "file" | "url"` (URL path uses `kind: "url"`; file reads usually omit `kind`)
  - `isDirectory`
  - `resolvedPath`
  - `suffixResolution`
  - URL fields: `url`, `finalUrl`, `contentType`, `method`, `notes`
  - `truncation`
  - `displayContent` (unprefixed text + starting line for TUI rendering)
  - `summary` (`lines`, `elidedSpans`, `elidedLines`) for structural summaries
  - `meta` from `packages/coding-agent/src/tools/output-meta.ts`
- `details.meta.source` is set to the backing path, URL, or internal URL.
- `details.meta.truncation` carries shown range, total lines/bytes, next offset, and optional `artifactId` for cached URL output.
- Directory/archive listings and SQLite table lists also set `details.meta.limits` when list limits trigger.
Flow
- `ReadTool.execute()` accepts `{ path }`. `file://...` inputs are expanded first with `expandPath()`.
- It tries URL handling first via `parseReadUrlTarget()` from `packages/coding-agent/src/tools/fetch.ts`.
  - Plain URL reads call `executeReadUrl()`.
  - URL reads with line selectors load or refresh the URL cache with `loadReadUrlCacheEntry()` and paginate the cached text locally with `#buildInMemoryTextResult()`.
- If not a web URL, it checks `session.internalRouter.canHandle(...)`.
  - Internal URLs are resolved with `internalRouter.resolve()`.
  - `agent://` query extraction (`/path` or `?q=`) bypasses pagination and returns the extracted content directly.
  - Other internal resources are paginated in-memory by `#buildInMemoryTextResult()`.
- It tries archive resolution next with `#resolveArchiveReadPath()`.
  - `parseArchivePathCandidates()` scans for `.tar`, `.tar.gz`, `.tgz`, or `.zip` anywhere before `:sub/path`.
  - On success, `#readArchive()` either lists a directory or decodes an entry as UTF-8 text.
- It tries SQLite resolution with `#resolveSqliteReadPath()`.
  - `parseSqlitePathCandidates()` scans for `.sqlite`, `.sqlite3`, `.db`, `.db3` before any `:table`, `:key`, or `?query` suffix.
  - `#readSqlite()` dispatches on `parseSqliteSelector()`.
- Otherwise it treats the input as a local filesystem path.
  - `resolveReadPath()` expands `~`, resolves relative to session cwd, treats bare `/` as session cwd, and retries macOS screenshot/NFD/curly-quote variants.
  - If the path does not exist, `findUniqueSuffixMatch()` does a workspace glob-based unique suffix lookup (skipped for remote mounts).
  - Directories go through `#readDirectory()`.
  - Non-directories branch by content type:
    - image metadata / inline image
    - editable notebook text
    - markit-converted document
    - structural summary for parseable code/prose
    - streamed text/line-range read
- Local text reads are streamed by `streamLinesFromFile()` rather than loading the whole file. Explicit bounded ranges gain up to 1 leading and 3 trailing context lines (`RANGE_LEADING_CONTEXT_LINES` / `RANGE_TRAILING_CONTEXT_LINES`).
- Non-empty contiguous local reads are recorded into `getFileReadCache(session)` for later hashline edit recovery.
- If suffix resolution happened, the first text block is prefixed with `[Path '...' not found; resolved to '...' via suffix match]`.
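The resolution order in this flow can be sketched as a pure string classifier. This is a simplification under stated assumptions: `classifyTarget` and `INTERNAL_SCHEMES` are hypothetical names, and the real tool also consults the internal router, the filesystem, and the SQLite file header rather than relying on string shape alone.

```typescript
type TargetKind = "url" | "internal" | "archive" | "sqlite" | "local";

// Schemes named in this document; in the tool the registry lives in the router.
const INTERNAL_SCHEMES = [
  "agent://", "artifact://", "issue://", "local://", "mcp://",
  "memory://", "omp://", "pr://", "rule://", "skill://",
];

// Applies the checks in the same order as the flow: first match wins.
function classifyTarget(path: string): TargetKind {
  if (/^(https?:\/\/|www\.)/i.test(path)) return "url";
  if (INTERNAL_SCHEMES.some((scheme) => path.startsWith(scheme))) return "internal";
  // Archive extension followed by ":inner/path" or end of string.
  if (/\.(tar|tar\.gz|tgz|zip)(:|$)/i.test(path)) return "archive";
  // SQLite extension followed by ":table", "?query", or end of string.
  if (/\.(sqlite3?|db3?)([:?]|$)/i.test(path)) return "sqlite";
  return "local";
}
```

Order matters here: a web URL ending in `.zip` stays a URL read because the URL check runs before archive detection.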
Modes / Variants
Local text files
- No selector: if summarization is enabled and the file is small enough, `#trySummarize()` calls `summarizeCode()`.
  - Guards: file size `<= 2 MiB` (`MAX_SUMMARY_BYTES`), line count `<= 20_000` (`MAX_SUMMARY_LINES`).
  - Summary output keeps selected declarations and replaces elided spans with `...` or merged brace-pair lines containing `...`. When at least one span is elided, the text content ends with a footer like `[NN lines elided; re-read needed ranges, e.g. <path>:5-16,40-80]` using concrete ranges from the actual elisions.
  - When an elided block sits between matching brace lines, `#renderSummary()` may merge them into one anchored line rather than emitting separate opener/closer lines.
- Explicit selector or summarization miss: streamed text read.
  - Default open-ended limit is `min(session setting read.defaultLimit, DEFAULT_MAX_LINES)`.
  - Explicit ranges expand by `RANGE_LEADING_CONTEXT_LINES = 1` / `RANGE_TRAILING_CONTEXT_LINES = 3` on the constrained sides only.
  - Non-raw output uses `resolveFileDisplayMode()`:
    - hashline numbered output when edit mode is hashline, read is not raw, source is mutable, edit tool exists, and `readHashLines !== false`
    - otherwise optional line numbers when `readLineNumbers === true`
    - raw mode suppresses both
- Prefix format in hashline mode is a `¶PATH#TAG` header followed by `LINE:TEXT`, e.g. `¶src/foo.ts#0A1B` and `41:def alpha():`, from the session snapshot store plus `formatNumberedLine()` / `formatHashlineHeader()`.
- The `edit`/hashline path consumes that header plus bare line numbers later; the four-hex tag is opaque and only meaningful in the session snapshot store that minted it. Immutable sources and `:raw` intentionally suppress hashline headers.
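The display-mode conditions and the hashline prefix format can be sketched as follows. The `DisplayInputs` shape and both function names are hypothetical; `resolveFileDisplayMode()` may take different inputs, and the tag is passed in here rather than minted by a snapshot store.

```typescript
type DisplayMode = "hashline" | "line-numbers" | "plain";

// Hypothetical input bundle mirroring the conditions listed above.
interface DisplayInputs {
  raw: boolean;
  editMode: "hashline" | "other";
  immutable: boolean;
  hasEditTool: boolean;
  readHashLines?: boolean;
  readLineNumbers?: boolean;
}

function chooseDisplayMode(input: DisplayInputs): DisplayMode {
  if (input.raw) return "plain"; // raw suppresses both prefix styles
  const hashline =
    input.editMode === "hashline" &&
    !input.immutable &&
    input.hasEditTool &&
    input.readHashLines !== false;
  if (hashline) return "hashline";
  return input.readLineNumbers === true ? "line-numbers" : "plain";
}

// Renders a "¶PATH#TAG" header followed by "LINE:TEXT" rows.
function formatHashlineBlock(path: string, tag: string, startLine: number, lines: string[]): string {
  const header = `¶${path}#${tag}`;
  const body = lines.map((text, i) => `${startLine + i}:${text}`);
  return [header, ...body].join("\n");
}
```

For example, `formatHashlineBlock("src/foo.ts", "0A1B", 41, ["def alpha():"])` produces the header `¶src/foo.ts#0A1B` followed by `41:def alpha():`.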
Directory listings
- `#readDirectory()` calls `buildDirectoryTree()` with:
  - `maxDepth = 2`
  - `perDirLimit = 12`
  - `rootLimit = null`
  - `lineCap = limit` when a line selector was present, else unlimited at this layer
- `buildDirectoryTree()` sorts siblings by recency, shows file sizes and relative ages, and may mark `limits.resultLimit` when the tree truncates.
- Empty directories render as `(empty directory)`.
Archives
- Supported archive containers: `.tar`, `.tar.gz`, `.tgz`, `.zip`.
- Syntax: `archive.ext`, `archive.ext:path/inside`, `archive.ext:path/inside:50-60`.
- `openArchive()` reads the whole archive into memory, then:
  - tar/tgz uses `new Bun.Archive(bytes)`
  - zip uses `fflate.unzipSync()`
- Archive paths normalize `/`, drop `.` segments, and reject `..`.
- Directory reads list immediate children; files show `name` plus `(size)` when size > 0.
- Directory listing default limit is `500` entries in `#readArchiveDirectory()`.
- File entries are UTF-8 decoded. Non-UTF-8 entries return `[Cannot read binary archive entry '...' (...)]` instead of bytes.
- Text archive entries reuse the normal in-memory pagination/anchoring path.
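The path normalization and immediate-children listing described above can be sketched like this. Both function names are hypothetical, and reading "normalize `/`" as collapsing repeated, leading, and trailing slashes is an assumption about `archive-reader.ts`.

```typescript
// Collapses slashes, drops "." segments, and rejects ".." segments.
function normalizeArchivePath(inner: string): string {
  const segments: string[] = [];
  for (const segment of inner.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      throw new Error(`Archive path may not contain '..': ${inner}`);
    }
    segments.push(segment);
  }
  return segments.join("/");
}

// Lists the immediate children of `dir`; subdirectories get a trailing "/".
function listImmediateChildren(entryPaths: string[], dir: string): string[] {
  const normalizedDir = normalizeArchivePath(dir);
  const prefix = normalizedDir === "" ? "" : `${normalizedDir}/`;
  const children = new Set<string>();
  for (const raw of entryPaths) {
    const entry = normalizeArchivePath(raw);
    if (!entry.startsWith(prefix)) continue;
    const rest = entry.slice(prefix.length);
    if (rest === "") continue;
    const slash = rest.indexOf("/");
    children.add(slash === -1 ? rest : `${rest.slice(0, slash)}/`);
  }
  return Array.from(children).sort();
}
```

With entries `src/a.ts`, `src/lib/b.ts`, and `README.md`, listing `src` returns `a.ts` and `lib/`, while listing the archive root returns `README.md` and `src/`.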
SQLite databases
- Database detection requires both a matching extension and a valid SQLite file header (`isSqliteFile()`).
- Selector forms from `parseSqliteSelector()`:
  - `db.sqlite`
    - `kind: "list"`
    - Lists non-`sqlite_%` tables with row counts.
    - `#readSqlite()` caps the rendered list to `500` tables via `applyListLimit()`.
  - `db.sqlite:table`
    - `kind: "schema"`
    - Returns `sqlite_master.sql` plus sample rows.
    - Sample size is `DEFAULT_SCHEMA_SAMPLE_LIMIT = 5`.
  - `db.sqlite:table:key`
    - `kind: "row"`
    - Resolves by primary key when the table has exactly one PK column; otherwise falls back to `rowid` lookup.
    - No query parameters allowed on row lookups.
  - `db.sqlite:table?limit=...&offset=...&order=...&where=...`
    - `kind: "query"`
    - Defaults: `limit = 20`, `offset = 0`.
    - `limit` is capped at `500`.
    - `order` accepts `column` or `column:asc|desc` and must name an existing column.
    - `where` is accepted only after `validateWhereClause()` rejects comments, semicolons, and control keywords like `LIMIT`, `OFFSET`, `UNION`, `ATTACH`, `PRAGMA`.
    - Unknown query parameters throw.
  - `db.sqlite?q=SELECT ...`
    - `kind: "raw"`
    - Cannot be combined with table selectors or any other query param.
    - Empty `q` throws.
    - `executeReadQuery()` runs `db.prepare(sql).all()` and rejects bound parameters; it does not verify that the SQL starts with `SELECT`.
- Rendering caps in `packages/coding-agent/src/tools/sqlite-reader.ts`:
  - ASCII table width `120` (`MAX_RENDER_WIDTH`)
  - per-column width `40` (`MAX_COLUMN_WIDTH`)
- `#readSqlite()` opens Bun SQLite in `{ readonly: true, strict: true }` and sets `PRAGMA busy_timeout = 3000`.
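The `kind: "query"` rules can be sketched as a standalone parameter parser. This is an illustration only: every name is hypothetical, the keyword list contains just the five keywords named above, and clamping an oversized `limit` to `500` (rather than throwing) is an assumption about what "capped" means.

```typescript
const DEFAULT_LIMIT = 20;
const MAX_LIMIT = 500;
// Assumption: the real list may be longer; the text names only these.
const FORBIDDEN_KEYWORDS = ["LIMIT", "OFFSET", "UNION", "ATTACH", "PRAGMA"];

interface TableQuery {
  limit: number;
  offset: number;
  order?: { column: string; direction: "asc" | "desc" };
  where?: string;
}

function validateWhere(where: string): void {
  if (where.includes(";")) throw new Error("where: semicolons are not allowed");
  if (where.includes("--") || where.includes("/*")) {
    throw new Error("where: SQL comments are not allowed");
  }
  for (const keyword of FORBIDDEN_KEYWORDS) {
    // Whole-word match; this also rejects the word inside string literals.
    if (new RegExp(`\\b${keyword}\\b`, "i").test(where)) {
      throw new Error(`where: keyword ${keyword} is not allowed`);
    }
  }
}

function parseCount(value: string, name: string): number {
  const n = Number(value);
  if (value === "" || !Number.isInteger(n) || n < 0) {
    throw new Error(`${name} must be a non-negative integer`);
  }
  return n;
}

// `query` is the text after "?"; `columns` are the table's column names.
function parseTableQuery(query: string, columns: string[]): TableQuery {
  const result: TableQuery = { limit: DEFAULT_LIMIT, offset: 0 };
  for (const pair of query.split("&")) {
    if (pair === "") continue;
    const eq = pair.indexOf("=");
    const key = decodeURIComponent(eq === -1 ? pair : pair.slice(0, eq));
    const value = decodeURIComponent(eq === -1 ? "" : pair.slice(eq + 1));
    if (key === "limit") {
      result.limit = Math.min(parseCount(value, "limit"), MAX_LIMIT);
    } else if (key === "offset") {
      result.offset = parseCount(value, "offset");
    } else if (key === "order") {
      const parts = value.split(":");
      const column = parts[0] ?? "";
      const dir = parts[1] ?? "asc";
      if (!columns.includes(column)) throw new Error(`order: unknown column ${column}`);
      if (dir !== "asc" && dir !== "desc") throw new Error(`order: bad direction ${dir}`);
      result.order = { column, direction: dir === "desc" ? "desc" : "asc" };
    } else if (key === "where") {
      validateWhere(value);
      result.where = value;
    } else {
      throw new Error(`Unknown query parameter: ${key}`);
    }
  }
  return result;
}
```

A keyword filter of this kind is deliberately blunt: it refuses a harmless clause such as `name = 'limit'` in exchange for a check that needs no SQL parser.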
Documents
- `CONVERTIBLE_EXTENSIONS` in `packages/coding-agent/src/tools/read.ts` covers `.pdf`, `.doc`, `.docx`, `.ppt`, `.pptx`, `.xls`, `.xlsx`, `.rtf`, `.epub`.
- `convertFileWithMarkit()` converts the file to text/markdown.
- Converted output is then head-truncated with normal shared limits; there is no line selector support inside the source document before conversion.
- Conversion failures return a text block like `[Cannot read .pdf file: ...]`.
Jupyter notebooks
- `.ipynb` goes through `readEditableNotebookText()` unless `:raw` was requested.
- Output is editable plain text with markers like:

      # %% [code] cell:0
      ...

- Raw mode bypasses that conversion and falls back to file-text reading.
Images
- Image detection is metadata-based (`readImageMetadata()`).
- Max accepted image size is `20 MiB` (`MAX_IMAGE_INPUT_BYTES`, re-exported as `MAX_IMAGE_SIZE`). Larger files throw.
- If `inspect_image.enabled` is true, `read` returns metadata only (MIME, bytes, dimensions, channels, alpha) plus a suggestion to call `inspect_image`.
- Otherwise it calls `loadImageInput()` and returns:
  - a text note from the image loader
  - an inline image block
- Unsupported/undecodable image formats throw a `ToolError`.
Internal URLs
- `read` does not resolve these itself; it delegates to `session.internalRouter.resolve()`.
- Registered protocols are outside this file, but the router in `packages/coding-agent/src/internal-urls/router.ts` is built for `agent://`, `artifact://`, `issue://`, `local://`, `mcp://`, `memory://`, `omp://`, `pr://`, `rule://`, and `skill://`.
- `#handleInternalUrl()` behavior:
  - parses the URL with `parseInternalUrl()` so colons inside the host segment are legal
  - for `agent://`, treats non-root path extraction or `?q=` extraction as a special no-pagination mode
  - otherwise paginates the resolved text in memory
  - passes `immutable` through to `resolveFileDisplayMode()` so anchors are suppressed for immutable resources such as artifacts, skills, memory, and agent outputs
  - sets `ignoreResultLimits: true` for `skill://` so the full skill text is paginated only by explicit selectors, not by the normal default line limit
- `issue://<N>` / `pr://<N>` (and the long form `issue://<owner>/<repo>/<N>` / `pr://<owner>/<repo>/<N>`) route through the same SQLite cache the `github` tool writes to; `?comments=0` selects the no-comments rendering.
  - Bare `issue://` / `pr://` (and `issue://<owner>/<repo>` / `pr://<owner>/<repo>`) issue a live `gh issue list` / `gh pr list` for browsing, accepting `?state=`, `?limit=`, `?author=`, `?label=`.
  - PR diffs share the same cache through `pr://<N>/diff` (numbered file listing with per-file hints), `pr://<N>/diff/<i>` (single file slice; 1-indexed), and `pr://<N>/diff/all` (verbatim unified diff); the listing and per-file slices are reconstructed from the cached unified-diff payload, so all three variants share one `gh pr diff` invocation per PR.
  - Diff content is served as `text/plain`.
  - Soft TTL `github.cache.softTtlSec` (default 5 minutes), hard TTL `github.cache.hardTtlSec` (default 7 days). A stale hit returns the cached row and schedules a background refresh.
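The soft/hard TTL behavior for cached `issue://` and `pr://` rows can be sketched as a three-way decision. This is a guess at the shape, not the cache implementation: `decideCache` and `CacheDecision` are hypothetical, the inclusive boundaries are arbitrary, and treating a row older than the hard TTL as a blocking refetch is an assumption the text does not state.

```typescript
type CacheDecision = "fresh" | "stale-refresh" | "refetch";

const DEFAULT_SOFT_TTL_SEC = 5 * 60; // documented default: 5 minutes
const DEFAULT_HARD_TTL_SEC = 7 * 24 * 60 * 60; // documented default: 7 days

// ageSec is the age of the cached row in seconds, or null on a cache miss.
function decideCache(
  ageSec: number | null,
  softTtlSec: number = DEFAULT_SOFT_TTL_SEC,
  hardTtlSec: number = DEFAULT_HARD_TTL_SEC,
): CacheDecision {
  if (ageSec === null) return "refetch"; // nothing cached yet
  if (ageSec <= softTtlSec) return "fresh"; // serve the cached row as-is
  // Stale hit: serve the cached row and schedule a background refresh.
  if (ageSec <= hardTtlSec) return "stale-refresh";
  return "refetch"; // assumption: past the hard TTL the row is not served
}
```

Under these assumptions, a ten-minute-old row is served immediately while a refresh runs in the background, which keeps repeated reads of the same issue or PR from blocking on `gh`.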
Web URLs
- `parseReadUrlTarget()` accepts `http://`, `https://`, or `www.` targets.
- Plain URL reads call `executeReadUrl()` in `packages/coding-agent/src/tools/fetch.ts`.
- `:raw` means raw HTML/body fallback path; plain URL reads prefer rendered/reader-friendly output.
- `:N`, `:A-B`, `:A+C`, and comma-separated multi-ranges do not refetch when cached output is usable. They page over cached output from the prior or current URL render.
- URL render pipeline in `renderUrl()`:
  - normalize scheme (`https://` added for bare `www.`)
  - try special handlers for known sites unless raw
  - fetch with `loadPage()`
  - if content is image/PDF/DOCX/etc., try binary fetch + markit/image handling
  - handle JSON directly, feeds via feed parser, plain text directly
  - for HTML and non-raw mode, try markdown alternates, `URL.md`, content negotiation, feed alternates, HTML-to-text renderers, extracted linked documents, then `llms.txt`
  - fall back to raw body text/html
- URL output is wrapped with a small header:

      URL: ...
      Content-Type: ...
      Method: ...
      Notes: ...
      ---

- `method` records the winning path (`json`, `feed`, `text`, `alternate-markdown`, `md-suffix`, `content-negotiation`, `image`, `markit`, `llms.txt`, `raw`, `raw-html`, etc.).
- URL reads may return an inline image block when the fetched resource is a supported image and survives resizing.
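Two small pieces of the URL path, target acceptance with scheme normalization and the output header, can be sketched as below. The function names and the `UrlHeader` shape are hypothetical, and joining notes with `"; "` is an assumption; only the four header labels and the `---` separator come from the text.

```typescript
// Returns a fetchable URL for http://, https://, or bare www. targets, else null.
function normalizeUrlTarget(input: string): string | null {
  if (/^https?:\/\//i.test(input)) return input;
  if (/^www\./i.test(input)) return `https://${input}`;
  return null; // not a web URL; later resolution stages take over
}

interface UrlHeader {
  url: string;
  contentType: string;
  method: string; // the winning render path, e.g. "json" or "raw-html"
  notes: string[];
}

// Builds the small header that precedes the rendered body.
function formatUrlHeader(header: UrlHeader): string {
  return [
    `URL: ${header.url}`,
    `Content-Type: ${header.contentType}`,
    `Method: ${header.method}`,
    `Notes: ${header.notes.join("; ")}`,
    "---",
  ].join("\n");
}
```

In this sketch `normalizeUrlTarget("www.example.com/docs")` returns `https://www.example.com/docs`, while a local path such as `src/index.ts` returns `null`.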
Side Effects
- Filesystem
  - Opens and streams local files.
  - Reads entire archives into memory before indexing.
  - May read URL-cache artifact files from the session artifacts directory.
  - Writes URL output artifacts when URL output is truncated or when line-range pagination needs a persisted cache body.
- Network
  - URL mode performs HTTP fetches, binary refetches, and alternate-endpoint probes.
- Subprocesses / native bindings
  - Uses Bun SQLite for `.db` / `.sqlite*`.
  - Uses `Bun.Archive` for tar/tgz and `fflate` for zip.
  - URL HTML rendering can delegate into site handlers and HTML-to-text backends from `packages/coding-agent/src/tools/fetch.ts`.
- Session state
  - Records local text lines into `session.fileReadCache` for later stale-anchor recovery.
  - Uses `session.internalRouter` for internal URLs.
  - Uses `session.allocateOutputArtifact()` for cached/truncated URL output.
- Background work / cancellation
  - Most branches honor `AbortSignal`; the tool itself is marked `nonAbortable = true`, but helper paths still call `throwIfAborted(signal)`.
Limits & Caps
- Shared text truncation defaults from `packages/coding-agent/src/session/streaming-output.ts`:
  - `DEFAULT_MAX_LINES = 3000`
  - `DEFAULT_MAX_BYTES = 50 * 1024`
- Local text open-ended default line limit: `read.defaultLimit`, clamped to `[1, DEFAULT_MAX_LINES]`.
- Explicit line ranges add `1` leading and `3` trailing context lines on the constrained sides (`RANGE_LEADING_CONTEXT_LINES` / `RANGE_TRAILING_CONTEXT_LINES`).
- File streaming chunk size: `8 * 1024` bytes (`READ_CHUNK_SIZE`).
- Local streamed byte budget for line reads: `max(DEFAULT_MAX_BYTES, maxLinesToCollect * 512)`.
- Structural summaries only run when file size `<= 2 MiB` and line count `<= 20_000`.
- Image input max: `20 MiB`.
- Directory tree caps for local directories: depth `2`, per-directory children `12`.
- Archive directory default list cap: `500` entries.
- SQLite:
  - default row query limit `20`
  - schema sample limit `5`
  - max query limit `500`
  - table list cap `500`
  - render width `120`, column width `40`
  - busy timeout `3000ms`
- URL read result shown to the model is truncated to `300` lines and `50 KiB` in `executeReadUrl()`; full cached output can be attached as an artifact.
- Inline fetched URL images:
  - source bytes cap `20 MiB`
  - post-resize inline output cap `300 KiB`
- Unique suffix auto-resolution glob timeout: `5000ms`.
- File-read cache holds `30` paths per session.
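The arithmetic behind the local text limits can be checked with a short sketch. The constants are the documented values; the three function names are hypothetical, and reading "constrained sides" as "the start always, the end only when it is bounded" is an interpretation rather than confirmed behavior.

```typescript
const DEFAULT_MAX_LINES = 3000;
const DEFAULT_MAX_BYTES = 50 * 1024;
const RANGE_LEADING_CONTEXT_LINES = 1;
const RANGE_TRAILING_CONTEXT_LINES = 3;

// read.defaultLimit clamped to [1, DEFAULT_MAX_LINES].
function effectiveDefaultLimit(readDefaultLimit: number): number {
  return Math.min(Math.max(1, readDefaultLimit), DEFAULT_MAX_LINES);
}

// Byte budget for a streamed line read: max(DEFAULT_MAX_BYTES, lines * 512).
function streamedByteBudget(maxLinesToCollect: number): number {
  return Math.max(DEFAULT_MAX_BYTES, maxLinesToCollect * 512);
}

// Widens an explicit range with context; a null end stays open-ended.
function addRangeContext(start: number, end: number | null): { start: number; end: number | null } {
  return {
    start: Math.max(1, start - RANGE_LEADING_CONTEXT_LINES),
    end: end === null ? null : end + RANGE_TRAILING_CONTEXT_LINES,
  };
}
```

Under this reading, `:50-100` is read as lines 49 through 103, and a 3000-line read gets a byte budget of 1,536,000 bytes instead of the 51,200-byte default.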
Errors
- Validation and operational failures surface as `ToolError`.
- Selector errors include:
  - `Line selector 0 is invalid; lines are 1-indexed. Use :1.`
  - invalid `A+B` / `A-B` shapes
  - `Cannot combine query extraction with offset/limit` for `agent://.../path:50`
- Missing local/archive/sqlite paths first attempt unique suffix resolution; if no unique match exists they error.
- Out-of-bounds line reads do not throw. They return explanatory text with a suggestion such as `Use :1 ...` or `Use :<last line> ...`.
- Binary archive entries do not throw; they return a text notice.
- Document conversion failure returns a text notice.
- Image oversize/unsupported/invalid cases throw.
- SQLite parser rejects unsupported parameter combinations early; DB/runtime errors are caught and rethrown as `ToolError(message)`.
- URL fetch failure does not throw when HTTP fetch succeeds but `response.ok === false`; it returns a failed URL read with `method: "failed"` and explanatory notes.
Notes
- Hashline anchors are suppressed for raw reads and immutable internal resources because there is no editable backing target for later `edit` consumption.
- `splitPathAndSel()` intentionally treats unknown trailing `:...` as part of the path so `archive.zip:inner/file` and `db.sqlite:table:key` still work.
- `resolveReadPath()` contains macOS-specific filename fallbacks for screenshot timestamps, NFD Unicode normalization, and curly apostrophes.
- A bare `/` resolves to the session cwd, not the filesystem root.
- URL cache keys are session-scoped and normalized by requested URL + raw/rendered mode; both requested URL and final redirected URL are cached.
- URL line-range reads request `ensureArtifact: true, preferCached: true` so a later paginated read can reopen the same rendered body from artifact storage.
- Raw SQLite `q=` execution is not keyword-restricted beyond “no bound parameters”; the read tool relies on the surrounding contract to keep it read-only.
- The file-read cache is not a read acceleration cache. It exists to recover hashline edits when the file changed after the read.