Mini CLI search engine for your docs, knowledge bases, meeting notes, whatever. Tracks current SOTA approaches while running entirely locally.

Tobi Lütke · release: v2.5.2

QMD - Query Markup Documents

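An on-device search engine for everything you need to remember. Index your Markdown notes, meeting transcripts, documentation, and knowledge bases. Search with keywords or natural language. Ideal for agentic workflows.

QMD combines BM25 full-text search, vector semantic search, and LLM re-ranking, all running locally via node-llama-cpp with GGUF models.

[Figure: QMD Architecture]

You can read more about QMD's progress in the CHANGELOG.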

Quick Start

# Install globally (Node or Bun)
npm install -g @tobilu/qmd
# or
bun install -g @tobilu/qmd

# Or run directly
npx @tobilu/qmd ...
bunx @tobilu/qmd ...

# Create collections for your notes, docs, and meeting transcripts
qmd collection add ~/notes --name notes
qmd collection add ~/Documents/meetings --name meetings
qmd collection add ~/work/docs --name docs

# Add context to help with search results, each piece of context will be returned when matching sub documents are returned. This works as a tree. This is the key feature of QMD as it allows LLMs to make much better contextual choices when selecting documents. Don't sleep on it!
qmd context add qmd://notes "Personal notes and ideas"
qmd context add qmd://meetings "Meeting transcripts and notes"
qmd context add qmd://docs "Work documentation"

# Generate embeddings for semantic search
qmd embed

# Search across everything
qmd search "project timeline"           # Fast keyword search
qmd vsearch "how to deploy"             # Semantic search
qmd query "quarterly planning process"  # Hybrid + reranking (best quality)

# Get a specific document
qmd get "meetings/2024-01-15.md"

# Get a document by docid (shown in search results)
qmd get "#abc123"

# Get multiple documents by glob pattern
qmd multi-get "journals/2025-05*.md"

# Search within a specific collection
qmd search "API" -c notes

# Export all matches for an agent
qmd search "API" --all --files --min-score 0.3

Using with AI Agents

QMD's --json and --files output formats are designed for agentic workflows:

# Get structured results for an LLM
qmd search "authentication" --json -n 10

# List all relevant files above a threshold
qmd query "error handling" --all --files --min-score 0.4

# Retrieve full document content
qmd get "docs/api-reference.md" --full

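MCP Server

The tool works perfectly well when an agent simply calls it on the command line, but it also exposes an MCP (Model Context Protocol) server for tighter integration.

Tools exposed:

query: search with typed sub-queries (lex/vec/hyde), combined via RRF + reranking
get: retrieve a document by path or docid (with fuzzy-match suggestions)
multi_get: batch retrieve by glob pattern, comma-separated list, or docids
status: index health and collection info

Claude Desktop configuration (~/Library/Application Support/Claude/claude_desktop_config.json):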

{
  "mcpServers": {
    "qmd": {
      "command": "qmd",
      "args": ["mcp"]
    }
  }
}

Claude Code: install the plugin (recommended):

claude plugin marketplace add tobi/qmd
claude plugin install qmd@qmd

Or configure MCP manually in ~/.claude/settings.json:

{
  "mcpServers": {
    "qmd": {
      "command": "qmd",
      "args": ["mcp"]
    }
  }
}

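HTTP Transport

By default, qmd's MCP server uses stdio (each client launches it as a subprocess). For a shared, long-lived server that avoids reloading models for every client, use the HTTP transport: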

# Foreground (Ctrl-C to stop)
qmd mcp --http                    # localhost:8181
qmd mcp --http --port 8080        # custom port

# Background daemon
qmd mcp --http --daemon           # start, writes PID to ~/.cache/qmd/mcp.pid
qmd mcp stop                      # stop via PID file
qmd status                        # shows "MCP: running (PID ...)" when active

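The HTTP server exposes two endpoints:

POST /mcp: MCP Streamable HTTP (JSON responses, stateless)
GET /health: liveness check with uptime

LLM models stay loaded in VRAM between requests. Embedding/reranking contexts are released after 5 minutes of inactivity and recreated automatically on the next request (about 1 second of latency; the models themselves stay loaded).

Point any MCP client at http://localhost:8181/mcp to connect.

SDK / Library Usage

Use QMD as a library in your own Node.js or Bun application.

Installation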

npm install @tobilu/qmd

Quick Start

import { createStore } from '@tobilu/qmd'

const store = await createStore({
  dbPath: './my-index.sqlite',
  config: {
    collections: {
      docs: { path: '/path/to/docs', pattern: '**/*.md' },
    },
  },
})

const results = await store.search({ query: "authentication flow" })
console.log(results.map(r => `${r.title} (${Math.round(r.score * 100)}%)`))

await store.close()

Store Creation

createStore() accepts three modes:

import { createStore } from '@tobilu/qmd'

// 1. Inline config — no files needed besides the DB
const store = await createStore({
  dbPath: './index.sqlite',
  config: {
    collections: {
      docs: { path: '/path/to/docs', pattern: '**/*.md' },
      notes: { path: '/path/to/notes' },
    },
  },
})

// 2. YAML config file — collections defined in a file
const store2 = await createStore({
  dbPath: './index.sqlite',
  configPath: './qmd.yml',
})

// 3. DB-only — reopen a previously configured store
const store3 = await createStore({ dbPath: './index.sqlite' })

Search

The unified search() method handles both simple queries and pre-expanded structured queries:

// Simple query — auto-expanded via LLM, then BM25 + vector + reranking
const results = await store.search({ query: "authentication flow" })

// With options
const results2 = await store.search({
  query: "rate limiting",
  intent: "API throttling and abuse prevention",
  collection: "docs",
  limit: 5,
  minScore: 0.3,
  explain: true,
})

// Pre-expanded queries — skip auto-expansion, control each sub-query
const results3 = await store.search({
  queries: [
    { type: 'lex', query: '"connection pool" timeout -redis' },
    { type: 'vec', query: 'why do database connections time out under load' },
  ],
  collections: ["docs", "notes"],
})

// Skip reranking for faster results
const fast = await store.search({ query: "auth", rerank: false })

For direct backend access:

// BM25 keyword search (fast, no LLM)
const lexResults = await store.searchLex("auth middleware", { limit: 10 })

// Vector similarity search (embedding model, no reranking)
const vecResults = await store.searchVector("how users log in", { limit: 10 })

// Manual query expansion for full control
const expanded = await store.expandQuery("auth flow", { intent: "user login" })
const results4 = await store.search({ queries: expanded })

Retrieval

// Get a document by path or docid
const doc = await store.get("docs/readme.md")
const byId = await store.get("#abc123")

if (!("error" in doc)) {
  console.log(doc.title, doc.displayPath, doc.context)
}

// Get document body with line range
const body = await store.getDocumentBody("docs/readme.md", {
  fromLine: 50,
  maxLines: 100,
})

// Batch retrieve by glob or comma-separated list
const { docs, errors } = await store.multiGet("docs/**/*.md", {
  maxBytes: 20480,
})

Collections

// Add a collection
await store.addCollection("myapp", {
  path: "/src/myapp",
  pattern: "**/*.ts",
  ignore: ["node_modules/**", "*.test.ts"],
})

// List collections with document stats
const collections = await store.listCollections()
// => [{ name, pwd, glob_pattern, doc_count, active_count, last_modified, includeByDefault }]

// Get names of collections included in queries by default
const defaults = await store.getDefaultCollectionNames()

// Remove / rename
await store.removeCollection("myapp")
await store.renameCollection("old-name", "new-name")

Context

Context adds descriptive metadata that improves search relevance and is returned alongside results:

// Add context for a path within a collection
await store.addContext("docs", "/api", "REST API reference documentation")

// Set global context (applies to all collections)
await store.setGlobalContext("Internal engineering documentation")

// List all contexts
const contexts = await store.listContexts()
// => [{ collection, path, context }]

// Remove context
await store.removeContext("docs", "/api")
await store.setGlobalContext(undefined)  // clear global

Indexing

// Re-index collections by scanning the filesystem
const result = await store.update({
  collections: ["docs"],  // optional — defaults to all
  onProgress: ({ collection, file, current, total }) => {
    console.log(`[${collection}] ${current}/${total} ${file}`)
  },
})
// => { collections, indexed, updated, unchanged, removed, needsEmbedding }

// Generate vector embeddings
const embedResult = await store.embed({
  force: false,           // true to re-embed everything
  chunkStrategy: "auto",  // "regex" (default) or "auto" (AST for code files)
  onProgress: ({ current, total, collection }) => {
    console.log(`Embedding ${current}/${total}`)
  },
})

Types

Key types exported for SDK consumers:

import type {
  QMDStore,            // The store interface
  SearchOptions,       // Options for search()
  LexSearchOptions,    // Options for searchLex()
  VectorSearchOptions, // Options for searchVector()
  HybridQueryResult,   // Search result with score, snippet, context
  SearchResult,        // Result from searchLex/searchVector
  ExpandedQuery,       // Typed sub-query { type: 'lex'|'vec'|'hyde', query }
  DocumentResult,      // Document metadata + body
  DocumentNotFound,    // Error with similarFiles suggestions
  MultiGetResult,      // Batch retrieval result
  UpdateProgress,      // Progress callback info for update()
  UpdateResult,        // Aggregated update result
  EmbedProgress,       // Progress callback info for embed()
  EmbedResult,         // Embedding result
  StoreOptions,        // createStore() options
  CollectionConfig,    // Inline config shape
  IndexStatus,         // From getStatus()
  IndexHealthInfo,     // From getIndexHealth()
} from '@tobilu/qmd'

Utility exports:

import {
  extractSnippet,              // Extract a relevant snippet from text
  addLineNumbers,              // Add line numbers to text
  DEFAULT_MULTI_GET_MAX_BYTES, // Default max file size for multiGet (10KB)
  Maintenance,                 // Database maintenance operations
} from '@tobilu/qmd'

Lifecycle

// Close the store — disposes LLM models and DB connection
await store.close()

The SDK requires an explicit dbPath; no default is assumed. This makes it safe to embed in any application without side effects.

Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                         QMD Hybrid Search Pipeline                          │
└─────────────────────────────────────────────────────────────────────────────┘

                              ┌─────────────────┐
                              │   User Query    │
                              └────────┬────────┘
                                       │
                        ┌──────────────┴──────────────┐
                        ▼                             ▼
               ┌────────────────┐            ┌────────────────┐
               │ Query Expansion│            │  Original Query│
               │  (fine-tuned)  │            │   (×2 weight)  │
               └───────┬────────┘            └───────┬────────┘
                       │                             │
                       │ 2 alternative queries       │
                       └──────────────┬──────────────┘
                                      │
              ┌───────────────────────┼───────────────────────┐
              ▼                       ▼                       ▼
     ┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
     │ Original Query  │     │ Expanded Query 1│     │ Expanded Query 2│
     └────────┬────────┘     └────────┬────────┘     └────────┬────────┘
              │                       │                       │
      ┌───────┴───────┐       ┌───────┴───────┐       ┌───────┴───────┐
      ▼               ▼       ▼               ▼       ▼               ▼
  ┌───────┐       ┌───────┐ ┌───────┐     ┌───────┐ ┌───────┐     ┌───────┐
  │ BM25  │       │Vector │ │ BM25  │     │Vector │ │ BM25  │     │Vector │
  │(FTS5) │       │Search │ │(FTS5) │     │Search │ │(FTS5) │     │Search │
  └───┬───┘       └───┬───┘ └───┬───┘     └───┬───┘ └───┬───┘     └───┬───┘
      │               │         │             │         │             │
      └───────┬───────┘         └──────┬──────┘         └──────┬──────┘
              │                        │                       │
              └────────────────────────┼───────────────────────┘
                                       │
                                       ▼
                          ┌───────────────────────┐
                          │   RRF Fusion + Bonus  │
                          │  Original query: ×2   │
                          │  Top-rank bonus: +0.05│
                          │     Top 30 Kept       │
                          └───────────┬───────────┘
                                      │
                                      ▼
                          ┌───────────────────────┐
                          │    LLM Re-ranking     │
                          │  (qwen3-reranker)     │
                          │  Yes/No + logprobs    │
                          └───────────┬───────────┘
                                      │
                                      ▼
                          ┌───────────────────────┐
                          │  Position-Aware Blend │
                          │  Top 1-3:  75% RRF    │
                          │  Top 4-10: 60% RRF    │
                          │  Top 11+:  40% RRF    │
                          └───────────────────────┘

Score Normalization & Fusion

Search Backends

Backend	Raw score	Conversion	Range
FTS (BM25)	SQLite FTS5 BM25	`Math.abs(score)`	0 to ~25+
Vector	Cosine distance	`1 / (1 + distance)`	0.0 to 1.0
Reranker	LLM 0-10 rating	`score / 10`	0.0 to 1.0
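
The conversions in the table above can be written out as three small functions. This is only a sketch: the function names are illustrative rather than QMD's internal API, and clamping the reranker rating to 0-10 is an added assumption.

```typescript
// Illustrative helpers for the three backend conversions; names are hypothetical.

/** SQLite FTS5 bm25() yields negative values for better matches, so take the magnitude. */
function normalizeBm25(raw: number): number {
  return Math.abs(raw);
}

/** Map a cosine distance (>= 0) into (0, 1]; a distance of 0 scores 1.0. */
function normalizeVector(distance: number): number {
  return 1 / (1 + distance);
}

/** Map a 0-10 reranker rating into 0.0-1.0 (clamping is an assumption). */
function normalizeRerank(rating: number): number {
  return Math.min(10, Math.max(0, rating)) / 10;
}
```

Note that only the vector and reranker scores end up on a 0.0-1.0 scale; the BM25 magnitude is unbounded, which is one reason rank-based fusion is used rather than adding raw scores.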

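Fusion Strategy

The query command uses Reciprocal Rank Fusion (RRF) combined with position-aware blending:

1. Query expansion: original query (weighted ×2) + 1 LLM variation
2. Parallel retrieval: each query searches both the FTS and vector indexes
3. RRF fusion: merge all result lists using score = Σ(1/(k+rank+1)), where k=60
4. Top-rank bonus: a document ranked #1 in any list gets +0.05; ranks #2-3 get +0.02
5. Top-K selection: keep the top 30 candidates for reranking
6. Reranking: the LLM scores each document (yes/no, with logprob confidence)
7. Position-aware blending:
- RRF rank 1-3: 75% retrieval score, 25% reranker score (preserves exact matches)
- RRF rank 4-10: 60% retrieval score, 40% reranker score
- RRF rank 11+: 40% retrieval score, 60% reranker score (trusts the reranker more)

Why this approach: pure RRF can dilute exact matches when the expanded queries do not match. The top-rank bonus preserves documents that score #1 for the original query. Position-aware blending keeps the reranker from destroying high-confidence retrieval results.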
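
The RRF and top-rank bonus steps above can be sketched as follows. This is a minimal illustration, not QMD's actual implementation: the `RankedList` shape and the `fuseRrf` name are hypothetical, ranks are taken to be 0-based (so the `+1` in the formula gives the top document 1/(k+1)), the ×2 weight is assumed to multiply each contribution, and the bonus is assumed to be applied once per document from its best rank in any list.

```typescript
// Hypothetical input shape: one ranked list per (query, backend) pair.
interface RankedList {
  docIds: string[]; // document ids, best match first
  weight: number;   // 2 for lists from the original query, 1 for expanded queries
}

const RRF_K = 60;

function fuseRrf(lists: RankedList[], keep = 30): Array<{ docId: string; score: number }> {
  const scores = new Map<string, number>();
  const bestRank = new Map<string, number>(); // best 0-based rank seen in any list

  for (const list of lists) {
    list.docIds.forEach((docId, rank) => {
      // rank is 0-based, hence the +1 in the denominator
      const contribution = list.weight / (RRF_K + rank + 1);
      scores.set(docId, (scores.get(docId) ?? 0) + contribution);
      bestRank.set(docId, Math.min(bestRank.get(docId) ?? Infinity, rank));
    });
  }

  const fused = Array.from(scores.entries()).map(([docId, score]) => {
    const rank = bestRank.get(docId) ?? Infinity;
    // +0.05 for a #1 placement in any list, +0.02 for #2-3
    const bonus = rank === 0 ? 0.05 : rank <= 2 ? 0.02 : 0;
    return { docId, score: score + bonus };
  });

  fused.sort((a, b) => b.score - a.score);
  return fused.slice(0, keep); // top candidates go on to reranking
}
```

A document that appears near the top of both original-query lists accumulates two double-weighted contributions plus the bonus, which is what keeps exact matches ahead of documents that only an expanded query surfaced.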

Score Interpretation

Score	Meaning
0.8 - 1.0	Highly relevant
0.5 - 0.8	Moderately relevant
0.2 - 0.5	Somewhat relevant
0.0 - 0.2	Low relevance
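
For scripts that consume `--json` output, the bands above can be turned into a small lookup. This is a sketch with a hypothetical function name; because the table's ranges share their endpoints, treating each lower bound as inclusive is an assumption.

```typescript
type Relevance =
  | "highly relevant"
  | "moderately relevant"
  | "somewhat relevant"
  | "low relevance";

/** Classify a 0.0-1.0 score; lower bounds are treated as inclusive (an assumption). */
function interpretScore(score: number): Relevance {
  if (score >= 0.8) return "highly relevant";
  if (score >= 0.5) return "moderately relevant";
  if (score >= 0.2) return "somewhat relevant";
  return "low relevance";
}
```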

Requirements

System Requirements

Node.js >= 22
Bun >= 1.0.0
macOS: Homebrew SQLite (for extension support)
```
brew install sqlite
```

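GGUF Models (via node-llama-cpp)

QMD uses three local GGUF models (downloaded automatically on first use):

Model	Purpose	Size
`embeddinggemma-300M-Q8_0`	Vector embeddings (default)	~300MB
`qwen3-reranker-0.6b-q8_0`	Re-ranking	~640MB
`qmd-query-expansion-1.7B-q4_k_m`	Query expansion (fine-tuned)	~1.1GB

Models are downloaded from HuggingFace and cached in ~/.cache/qmd/models/.

Custom Embedding Model

Override the default embedding model with the QMD_EMBED_MODEL environment variable. This is useful for multilingual corpora (for example Chinese, Japanese, or Korean), where embeddinggemma-300M has limited coverage.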

# Use Qwen3-Embedding-0.6B for better multilingual (CJK) support
export QMD_EMBED_MODEL="hf:Qwen/Qwen3-Embedding-0.6B-GGUF/Qwen3-Embedding-0.6B-Q8_0.gguf"

# After changing the model, re-embed all collections:
qmd embed -f

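Supported model families:

embeddinggemma (default): optimized for English, small footprint
Qwen3-Embedding: multilingual (119 languages, including CJK), top-ranked on MTEB

Note: when you switch embedding models, you must re-index with qmd embed -f, because vectors from different models are not compatible. The prompt format is adjusted automatically for each model family.

Installation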

npm install -g @tobilu/qmd
# or
bun install -g @tobilu/qmd

Development

git clone https://github.com/tobi/qmd
cd qmd
npm install
npm link

Usage

Collection Management

# Create a collection from current directory
qmd collection add . --name myproject

# Create a collection with explicit path and custom glob mask
qmd collection add ~/Documents/notes --name notes --mask "**/*.md"

# List all collections
qmd collection list

# Remove a collection
qmd collection remove myproject

# Rename a collection
qmd collection rename myproject my-project

# List files in a collection
qmd ls notes
qmd ls notes/subfolder

Generate Vector Embeddings

# Embed all indexed documents (900 tokens/chunk, 15% overlap)
qmd embed

# Force re-embed everything
qmd embed -f

# Enable AST-aware chunking for code files (TS, JS, Python, Go, Rust)
qmd embed --chunk-strategy auto

# Also works with query for consistent chunk selection
qmd query "auth flow" --chunk-strategy auto

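AST-aware chunking (--chunk-strategy auto) uses tree-sitter to chunk code files at function, class, and import boundaries instead of arbitrary text positions. This produces higher-quality chunks and better search results for codebases. Markdown and other file types always use regex-based chunking, regardless of strategy.

The default strategy is regex (the existing behavior). Use --chunk-strategy auto to opt in to AST-aware chunking. Run qmd status to check which grammars are available.

Note: the tree-sitter grammars are optional dependencies. If they are not installed, --chunk-strategy auto falls back to regex-only chunking automatically. Tested on both Node.js and Bun.

Context Management

Context adds descriptive metadata to collections and paths, helping search understand your content.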

# Add context to a collection (using qmd:// virtual paths)
qmd context add qmd://notes "Personal notes and ideas"
qmd context add qmd://docs/api "API documentation"

# Add context from within a collection directory
cd ~/notes && qmd context add "Personal notes and ideas"
cd ~/notes/work && qmd context add "Work-related notes"

# Add global context (applies to all collections)
qmd context add / "Knowledge base for my projects"

# List all contexts
qmd context list

# Remove context
qmd context rm qmd://notes/old

Search Commands

┌──────────────────────────────────────────────────────────────────┐
│                        Search Modes                              │
├──────────┬───────────────────────────────────────────────────────┤
│ search   │ BM25 full-text search only                           │
│ vsearch  │ Vector semantic search only                          │
│ query    │ Hybrid: FTS + Vector + Query Expansion + Re-ranking  │
└──────────┴───────────────────────────────────────────────────────┘

# Full-text search (fast, keyword-based)
qmd search "authentication flow"

# Vector search (semantic similarity)
qmd vsearch "how to login"

# Hybrid search with re-ranking (best quality)
qmd query "user authentication"

Options

# Search options
-n <num>           # Number of results (default: 5, or 20 for --files/--json)
-c, --collection   # Restrict search to a specific collection
--all              # Return all matches (use with --min-score to filter)
--min-score <num>  # Minimum score threshold (default: 0)
--full             # Show full document content
--line-numbers     # Add line numbers to output
--explain          # Include retrieval score traces (query, JSON/CLI output)
--index <name>     # Use named index

# Output formats (for search and multi-get)
--files            # Output: docid,score,filepath,context
--json             # JSON output with snippets
--csv              # CSV output
--md               # Markdown output
--xml              # XML output

# Get options
qmd get <file>[:line]  # Get document, optionally starting at line
-l <num>               # Maximum lines to return
--from <num>           # Start from line number

# Multi-get options
-l <num>           # Maximum lines per file
--max-bytes <num>  # Skip files larger than N bytes (default: 10KB)

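Output Format

The default output is a colorized CLI format (it respects the NO_COLOR environment variable).

When stdout is a TTY, result paths are rendered as clickable terminal hyperlinks (OSC 8). Clicking a path opens the file in your editor using the editor URI template.

When stdout is not a TTY (for example, when piped to another command or redirected to a file), QMD emits plain-text paths with no escape sequences.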
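
The TTY behavior described above can be illustrated with a small helper. This is a sketch under stated assumptions: `linkPath` is a hypothetical name, and the caller is assumed to pass in whether stdout is a TTY (in Node.js or Bun that would typically come from `process.stdout.isTTY`). The escape sequence itself is the standard OSC 8 form: `ESC ] 8 ; ; URI ESC \`, the link text, then `ESC ] 8 ; ; ESC \` to close.

```typescript
const ESC = "\u001b";

/** Wrap text in an OSC 8 hyperlink for terminals; return it unchanged otherwise. */
function linkPath(text: string, uri: string, isTTY: boolean): string {
  if (!isTTY) return text; // pipes and files get plain paths, no escape sequences
  const open = `${ESC}]8;;${uri}${ESC}\\`; // start hyperlink pointing at uri
  const close = `${ESC}]8;;${ESC}\\`;      // end hyperlink (empty target)
  return `${open}${text}${close}`;
}
```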

TTY example:

docs/guide.md:42 #a1b2c3
Title: Software Craftsmanship
Context: Work documentation
Score: 93%

This section covers the **craftsmanship** of building
quality software with attention to detail.
See also: engineering principles


notes/meeting.md:15 #d4e5f6
Title: Q4 Planning
Context: Personal notes and ideas
Score: 67%

Discussion about code quality and craftsmanship
in the development process.

Configure the editor link target with QMD_EDITOR_URI (or editor_uri in the config file):

# VS Code (default)
export QMD_EDITOR_URI="vscode://file/{path}:{line}:{col}"

# Cursor
export QMD_EDITOR_URI="cursor://file/{path}:{line}:{col}"

# Zed
export QMD_EDITOR_URI="zed://file/{path}:{line}:{col}"

# Sublime Text
export QMD_EDITOR_URI="subl://open?url=file://{path}&line={line}"

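Template placeholders:

{path}: absolute filesystem path (URI-encoded)
{line}: 1-based line number
{col} or {column}: 1-based column number

Fields in the example output:

Path: collection-relative path (e.g., docs/guide.md)
Docid: short hash identifier (e.g., #a1b2c3); use it with qmd get #a1b2c3
Title: extracted from the document (first heading or filename)
Context: path context configured via qmd context add
Score: color-coded (green for >70%, yellow for >40%, dim otherwise)
Snippet: context around the match, with query terms highlighted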
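
To make the editor URI template placeholders ({path}, {line}, {col} or {column}) concrete, here is one way the substitution could work. It is a sketch rather than QMD's code: `expandEditorUri` and `LinkTarget` are hypothetical names, encoding each path segment separately is an assumption about what "URI-encoded" means here, and defaulting the column to 1 is also an assumption.

```typescript
interface LinkTarget {
  path: string; // absolute filesystem path
  line: number; // 1-based line number
  col?: number; // 1-based column number; defaults to 1 (assumption)
}

function expandEditorUri(template: string, target: LinkTarget): string {
  // Encode each segment but keep "/" separators so the path stays readable.
  const encodedPath = target.path
    .split("/")
    .map((segment) => encodeURIComponent(segment))
    .join("/");
  const col = String(target.col ?? 1);
  return template
    .replace(/\{path\}/g, () => encodedPath)
    .replace(/\{line\}/g, () => String(target.line))
    .replace(/\{col(?:umn)?\}/g, () => col); // accepts {col} and {column}
}

// Usage with the VS Code template shown earlier in this section.
const exampleLink = expandEditorUri("vscode://file/{path}:{line}:{col}", {
  path: "/home/me/my notes/a.md",
  line: 42,
});
```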

Examples

# Get 10 results with minimum score 0.3
qmd query -n 10 --min-score 0.3 "API design patterns"

# Output as markdown for LLM context
qmd search --md --full "error handling"

# JSON output for scripting
qmd query --json "quarterly reports"

# Inspect how each result was scored (RRF + rerank blend)
qmd query --json --explain "quarterly reports"

# Use separate index for different knowledge base
qmd --index work search "quarterly reports"

Index Maintenance

# Show index status and collections with contexts
qmd status

# Re-index all collections
qmd update

# Re-index with git pull first (for remote repos)
qmd update --pull

# Get document by filepath (with fuzzy matching suggestions)
qmd get notes/meeting.md

# Get document by docid (from search results)
qmd get "#abc123"

# Get document starting at line 50, max 100 lines
qmd get notes/meeting.md:50 -l 100

# Get multiple documents by glob pattern
qmd multi-get "journals/2025-05*.md"

# Get multiple documents by comma-separated list (supports docids)
qmd multi-get "doc1.md, doc2.md, #abc123"

# Limit multi-get to files under 20KB
qmd multi-get "docs/*.md" --max-bytes 20480

# Output multi-get as JSON for agent processing
qmd multi-get "docs/*.md" --json

# Clean up cache and orphaned data
qmd cleanup

Data Storage

Index stored in: ~/.cache/qmd/index.sqlite

Schema

collections     -- Indexed directories with name and glob patterns
path_contexts   -- Context descriptions by virtual path (qmd://...)
documents       -- Markdown content with metadata and docid (6-char hash)
documents_fts   -- FTS5 full-text index
content_vectors -- Embedding chunks (hash, seq, pos, 900 tokens each)
vectors_vec     -- sqlite-vec vector index (hash_seq key)
llm_cache       -- Cached LLM responses (query expansion, rerank scores)

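Environment Variables

Variable	Default	Description
`XDG_CACHE_HOME`	`~/.cache`	Cache directory location
`QMD_LLAMA_GPU`	`auto`	Force a llama.cpp GPU backend (`metal`, `vulkan`, `cuda`), or set to `false` to disable the GPU
`QMD_FORCE_CPU`	unset	Set to `1`/`true` to force CPU mode before any CUDA/Vulkan/Metal probing. Equivalent CLI flag: `--no-gpu`.
`QMD_EMBED_PARALLELISM`	auto	Override the embedding/reranking context parallelism (1-8). Defaults to `1` on Windows CUDA, because parallel CUDA contexts can crash with a `ggml-cuda.cu:98` error; use Vulkan, or raise this value only when the driver is stable.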
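
As a rough illustration of how the GPU-related variables in the table interact, the sketch below resolves them from a plain object. The function names, the `GpuMode` type, and the exact precedence (QMD_FORCE_CPU checked first, unknown values falling back to `auto`) are assumptions for illustration and may not match QMD's actual parsing.

```typescript
type Env = Record<string, string | undefined>;
type GpuMode = "auto" | "metal" | "vulkan" | "cuda" | "cpu";

function resolveGpuMode(env: Env): GpuMode {
  const force = (env["QMD_FORCE_CPU"] ?? "").toLowerCase();
  if (force === "1" || force === "true") return "cpu"; // wins before any GPU probing
  const gpu = (env["QMD_LLAMA_GPU"] ?? "auto").toLowerCase();
  if (gpu === "false") return "cpu";
  if (gpu === "metal" || gpu === "vulkan" || gpu === "cuda") return gpu;
  return "auto"; // unset or unrecognized: let the runtime pick (assumption)
}

function resolveParallelism(env: Env, autoValue: number): number {
  const value = env["QMD_EMBED_PARALLELISM"];
  if (!value) return autoValue; // unset: keep the automatic value
  const parsed = Number(value);
  if (!Number.isInteger(parsed)) return autoValue;
  return Math.min(8, Math.max(1, parsed)); // documented range is 1-8
}
```

Passing the environment in as an argument keeps the helpers pure; in an application you would call them with `process.env`.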

How It Works

Indexing Flow

Collection ──► Glob Pattern ──► Markdown Files ──► Parse Title ──► Hash Content
    │                                                   │              │
    │                                                   │              ▼
    │                                                   │         Generate docid
    │                                                   │         (6-char hash)
    │                                                   │              │
    └──────────────────────────────────────────────────►└──► Store in SQLite
                                                                       │
                                                                       ▼
                                                                  FTS5 Index
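
The docid step in the flow above can be sketched like this. It assumes the docid is simply the first six hex characters of the content hash, which the diagram suggests but does not state; both function names are hypothetical.

```typescript
/** Assumption: the docid is the first six hex characters of the content hash. */
function docidFromHash(contentHash: string): string {
  return "#" + contentHash.slice(0, 6).toLowerCase();
}

/** Accept "#abc123" or "abc123" and return the bare id, or undefined if it is not a docid. */
function normalizeDocid(input: string): string | undefined {
  const match = /^#?([0-9a-f]{6})$/i.exec(input.trim());
  return match?.[1]?.toLowerCase();
}
```

A lookup such as `qmd get "#abc123"` would then only need to distinguish a docid from a path, which is what the second helper illustrates.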

Embedding Flow

Documents are split into chunks of roughly 900 tokens with 15% overlap, using smart boundary detection:

Document ──► Smart Chunk (~900 tokens) ──► Format each chunk ──► node-llama-cpp ──► Store Vectors
                │                           "title | text"        embedBatch()
                │
                └─► Chunks stored with:
                    - hash: document hash
                    - seq: chunk sequence (0, 1, 2...)
                    - pos: character position in original
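
To show what "900 tokens with 15% overlap" means numerically, the sketch below computes fixed chunk ranges. It deliberately ignores the smart boundary detection, works in token offsets rather than the character positions stored in `pos`, and uses hypothetical names, so treat it as arithmetic illustration only.

```typescript
interface ChunkRange {
  seq: number;   // chunk sequence (0, 1, 2...)
  start: number; // token offset, inclusive
  end: number;   // token offset, exclusive
}

function chunkRanges(totalTokens: number, chunkSize = 900, overlap = 0.15): ChunkRange[] {
  // Each chunk starts chunkSize * (1 - overlap) tokens after the previous one:
  // 765 tokens with the defaults, so neighbours share 135 tokens.
  const step = Math.max(1, Math.round(chunkSize * (1 - overlap)));
  const ranges: ChunkRange[] = [];
  for (let start = 0, seq = 0; start < totalTokens; start += step, seq++) {
    const end = Math.min(start + chunkSize, totalTokens);
    ranges.push({ seq, start, end });
    if (end === totalTokens) break; // the last chunk reached the end of the document
  }
  return ranges;
}
```

With these defaults a 2000-token document yields three chunks starting at tokens 0, 765, and 1530.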

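Smart Chunking

Instead of cutting at hard token boundaries, QMD uses a scoring algorithm to find natural Markdown break points. This keeps semantic units (sections, paragraphs, code blocks) together.

Break Point Scores:

Pattern	Score	Description
`# Heading`	100	H1, major section
`## Heading`	90	H2, subsection
`### Heading`	80	H3
`#### Heading`	70	H4
`##### Heading`	60	H5
`###### Heading`	50	H6
`` ``` ``	80	Code block boundary
`---` / `***`	60	Horizontal rule
Blank line	20	Paragraph boundary
`- item` / `1. item`	5	List item
Line break	1	Minimal break

Algorithm:

1. Scan the document for all break points and their scores
2. When approaching the 900-token target, search a 200-token window before the cutoff
3. Score each break point: finalScore = baseScore × (1 - (distance/window)² × 0.7)
4. Cut at the highest-scoring break point

The squared distance decay means an H1 heading 200 tokens back (score ~30) still beats a simple line break at the target (score 1), while a closer heading wins over a more distant one.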
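
The scoring formula and the window search can be sketched as follows. The `BreakPoint` shape and function names are hypothetical, positions are measured in tokens, and only break points at or before the target within the window are considered; QMD's real chunker may differ in these details.

```typescript
interface BreakPoint {
  pos: number;   // position in tokens from the start of the document
  score: number; // base score from the break point table
}

/** finalScore = baseScore × (1 - (distance/window)² × 0.7) */
function decayedScore(baseScore: number, distance: number, window = 200): number {
  const ratio = distance / window;
  return baseScore * (1 - ratio * ratio * 0.7);
}

/** Pick the best break point within `window` tokens before `target`. */
function bestBreak(points: BreakPoint[], target: number, window = 200): BreakPoint | undefined {
  let best: BreakPoint | undefined;
  let bestScore = -Infinity;
  for (const p of points) {
    const distance = target - p.pos;
    if (distance < 0 || distance > window) continue; // only look back from the cutoff
    const s = decayedScore(p.score, distance, window);
    if (s > bestScore) {
      best = p;
      bestScore = s;
    }
  }
  return best;
}
```

At the far edge of the window the decay factor is 1 - 0.7 = 0.3, so an H1 keeps about 30 of its 100 points, which is still well above a line break at the cutoff.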

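Code Fence Protection: break points inside code blocks are ignored, so code stays together. If a code block exceeds the chunk size, it is kept whole when possible.

AST-Aware Chunking (Code Files):

For supported code files, QMD also parses the source with tree-sitter and adds AST-derived break points, which are merged with the regex scores above:

AST node	Score	Languages
Class / interface / struct / impl / trait	100	All
Function / method	90	All
Type alias / enum	80	All
Import / use declaration	60	All

Supports .ts, .tsx, .js, .jsx, .py, .go, and .rs files. Enable it with --chunk-strategy auto. Markdown and other file types always use regex chunking.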
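
One plausible way to merge AST-derived break points with the regex ones is to keep the highest score at each position. The sketch below assumes that rule; the `BreakPoint` shape and the function name are illustrative, not taken from QMD's source.

```typescript
interface BreakPoint {
  pos: number;   // position in the document
  score: number; // break point score
}

/** Merge two break point lists; when both mark the same position, the higher score wins. */
function mergeBreakPointsSketch(regexPoints: BreakPoint[], astPoints: BreakPoint[]): BreakPoint[] {
  const byPos = new Map<number, number>();
  for (const p of regexPoints.concat(astPoints)) {
    byPos.set(p.pos, Math.max(byPos.get(p.pos) ?? 0, p.score));
  }
  return Array.from(byPos, ([pos, score]) => ({ pos, score })).sort((a, b) => a.pos - b.pos);
}
```

Under this rule a line break (score 1) that coincides with a function boundary (score 90) is promoted to 90, while regex break points inside comments or docstrings are kept as they are.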

Query Flow (Hybrid)

Query ──► LLM Expansion ──► [Original, Variant 1, Variant 2]
                │
      ┌─────────┴─────────┐
      ▼                   ▼
   For each query:     FTS (BM25)
      │                   │
      ▼                   ▼
   Vector Search      Ranked List
      │
      ▼
   Ranked List
      │
      └─────────┬─────────┘
                ▼
         RRF Fusion (k=60)
         Original query ×2 weight
         Top-rank bonus: +0.05/#1, +0.02/#2-3
                │
                ▼
         Top 30 candidates
                │
                ▼
         LLM Re-ranking
         (yes/no + logprob confidence)
                │
                ▼
         Position-Aware Blend
         Rank 1-3:  75% RRF / 25% reranker
         Rank 4-10: 60% RRF / 40% reranker
         Rank 11+:  40% RRF / 60% reranker
                │
                ▼
         Final Results
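
The position-aware blend at the end of the flow above can be sketched as a weighted average whose weights depend on the RRF rank. This assumes both scores are already normalized to the 0.0-1.0 range and that rank is 1-based; the function names are hypothetical.

```typescript
/** Share of the final score taken from retrieval, by 1-based RRF rank. */
function retrievalWeight(rrfRank: number): number {
  if (rrfRank <= 3) return 0.75; // top results: protect exact matches
  if (rrfRank <= 10) return 0.6;
  return 0.4;                    // lower ranks: trust the reranker more
}

/** Blend a retrieval score and a reranker score (both assumed to be in 0.0-1.0). */
function blendScore(rrfRank: number, retrievalScore: number, rerankScore: number): number {
  const w = retrievalWeight(rrfRank);
  return w * retrievalScore + (1 - w) * rerankScore;
}
```

A document at RRF rank 2 with a perfect retrieval score and a reranker score of 0 still ends up at 0.75, so a single reranker miss cannot push a strong exact match far down the list.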

Model Configuration

Models are configured in src/llm.ts as HuggingFace URIs:

const DEFAULT_EMBED_MODEL = "hf:ggml-org/embeddinggemma-300M-GGUF/embeddinggemma-300M-Q8_0.gguf";
const DEFAULT_RERANK_MODEL = "hf:ggml-org/Qwen3-Reranker-0.6B-Q8_0-GGUF/qwen3-reranker-0.6b-q8_0.gguf";
const DEFAULT_GENERATE_MODEL = "hf:tobil/qmd-query-expansion-1.7B-gguf/qmd-query-expansion-1.7B-q4_k_m.gguf";

EmbeddingGemma Prompt Format

// For queries
"task: search result | query: {query}"

// For documents
"title: {title} | text: {content}"

Qwen3-Reranker

Uses node-llama-cpp's createRankingContext() and rankAndSort() APIs for cross-encoder reranking. Returns documents sorted by relevance score (0.0 - 1.0).

Qwen3 (Query Expansion)

Used to generate query variations via LlamaChatSession.

License

MIT
