yiyanzhi_akane1workspace_autoresearch: Claude Code-driven operator optimization workspace

AKG Agents

📘 1. Project Overview

AKG Agents is an LLM-powered multi-agent collaboration framework for AI Infra and high-performance computing, aimed at boosting the development and optimization efficiency of high-performance code through intelligent agent collaboration.

The framework provides a complete agent infrastructure: ReAct Agent base classes, extensible Skill / Tools / SubAgent mechanisms, LangGraph workflow orchestration, tree-based Trace system, and a unified configuration and registry. Developers can rapidly build, compose, and deploy intelligent agents tailored to diverse tasks.

The current production scenario is AI kernel code generation: leveraging LLM planning and multi-agent collaboration to automate multi-backend, multi-DSL high-performance kernel generation and optimization. Future extensions will cover kernel migration, performance tuning, code refactoring, and more AI Infra related scenarios.

🗓️ 2. Changelog

2026-05-26: Added workspace_autoresearch — Claude Code-driven iterative kernel optimization workspace. Reuses akg_agents' KernelVerifier / WorkerManager / GitRepo / CodeChecker directly; phase machine + hooks + slash command are workspace-local.
2026-04-28: CLI module (akg_cli) deprecated and will no longer receive updates.
2026-03-31: Added AutoResearch workflow — agent-driven iterative deep optimization with KernelVerifier eval, supporting all DSLs.
2026-03-11: Streamlined the operator optimization process by integrating AKG Agents with OpenCode (akg-op Agent).
2026-02-26: Supported PyPTO backend code generation.
2026-02-15: Documentation reorganized. Legacy docs archived to docs/v1/, new docs consolidated under docs/v2/.
2026-02-10: Core framework refactored (v2). Decoupled general-purpose Agent capabilities from kernel-specific logic to build a reusable multi-agent collaboration framework. See Architecture, Agent System, Skill System, Workflow, Trace System, Configuration.
2025-12-01: Introduced LangGraph for task orchestration. The new LangGraphTask replaces the original task orchestration scheme. See Workflow Documentation.
2025-11-25: Added a service architecture with client-server-worker separation. See Service Architecture Documentation.
2025-10-14: Supported TileLang_CUDA backend code generation. See Benchmark Results.
2025-09-26: Supported CUDA C and CPP backend code generation. See Benchmark Results.
2025-09-14: Updated KernelBench Level1 kernel generation success rate. See Benchmark Results.
2025-08-12: Supported "Doc-Driven Integration" (now replaced by Skill System).
2025-06-27: Initial AIKG release with code generation support for Triton and SWFT backends.

🛠️ 3. Quick Start

Installation

# 1. Environment setup (optional, recommended Python 3.10/3.11/3.12)
conda create -n akg_agents python=3.11
conda activate akg_agents

# 2. Clone the repository
git clone https://gitcode.com/mindspore/akg.git -b br_agents
cd akg

# 3. Install dependencies
pip install -r akg_agents/requirements.txt

# 4. Install AKG Agents
pip install -e ./akg_agents --no-build-isolation

# 5. Download third-party benchmarks as needed
bash akg_agents/download.sh --with_all_benchmarks
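
After the install steps, a quick sanity check can confirm the interpreter and the editable install. The sketch below is not part of AKG Agents. It assumes the installed module is importable as `akg_agents` (inferred from the `python/akg_agents/` path), and the helper names are hypothetical.

```python
import importlib.util
import sys

# Versions the Quick Start recommends (3.10 / 3.11 / 3.12).
RECOMMENDED = {(3, 10), (3, 11), (3, 12)}


def python_is_recommended(version=None):
    """Return True if the (major, minor) pair is one of the recommended ones."""
    major, minor = (version or sys.version_info)[:2]
    return (major, minor) in RECOMMENDED


def package_is_installed(name="akg_agents"):
    """Return True if the import system can locate `name`.

    The default module name is an assumption; adjust it if your checkout differs.
    """
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("Python version recommended:", python_is_recommended())
    print("akg_agents importable:", package_is_installed())
```

If the second line prints `False`, re-running step 4 inside the activated conda environment is the first thing to try.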

Configure LLM

Copy the example config to ~/.akg/settings.json and fill in your API Key and model info:

mkdir -p ~/.akg
cp akg_agents/examples/settings.example.json ~/.akg/settings.json

A minimal configuration needs only one model, which is applied automatically to all levels:

{
  "models": {
    "standard": {
      "base_url": "https://api.deepseek.com/beta/",
      "api_key": "YOUR_API_KEY",
      "model_name": "deepseek-chat"
    }
  },
  "default_model": "standard"
}
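
Before launching anything, it can help to check the file for the most common mistakes. The following is a standalone sketch, not the framework's own loader. It treats the three fields shown in the minimal example as required, which is an assumption, and `check_settings` is a hypothetical helper name.

```python
import json
from pathlib import Path

# Fields that appear in the minimal example; treating them as required is an assumption.
REQUIRED_MODEL_KEYS = ("base_url", "api_key", "model_name")


def check_settings(settings):
    """Return a list of human-readable problems found in a parsed settings dict."""
    models = settings.get("models")
    if not isinstance(models, dict) or not models:
        return ["'models' must be a non-empty object"]
    problems = []
    for name, entry in models.items():
        if not isinstance(entry, dict):
            problems.append(f"models.{name} must be an object")
            continue
        for key in REQUIRED_MODEL_KEYS:
            if not entry.get(key):
                problems.append(f"models.{name}.{key} is missing or empty")
        if entry.get("api_key") == "YOUR_API_KEY":
            problems.append(f"models.{name}.api_key is still the placeholder")
    default = settings.get("default_model")
    if default not in models:
        problems.append(f"default_model {default!r} is not defined in 'models'")
    return problems


def check_settings_file(path=Path.home() / ".akg" / "settings.json"):
    """Load the JSON file at `path` and run check_settings on it."""
    return check_settings(json.loads(Path(path).read_text(encoding="utf-8")))
```

Calling `check_settings_file()` with no arguments reads `~/.akg/settings.json`; an empty list means none of these checks failed.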

Advanced Configuration:

For integrating locally deployed models, see Local Model Deployment Guide

For per-level model configuration (complex / standard / fast), thinking/reasoning parameters for different providers, Embedding/RAG setup, or environment variable usage, see Configuration Documentation, the basic example settings.example.json, and the multi-provider example settings.example.more.json
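
To illustrate how a single model can serve every level, here is a minimal sketch of a fallback lookup. The resolution order (use the entry named after the level if present, otherwise `default_model`) is an assumption made for illustration; the Configuration Documentation is the authority on the real behavior. `resolve_model` is a hypothetical name.

```python
# Level names taken from the text above.
LEVELS = ("complex", "standard", "fast")


def resolve_model(settings, level):
    """Pick the model entry for `level`, falling back to the default model.

    Assumed rule: a model keyed by the level name wins; otherwise the
    entry named by `default_model` is used.
    """
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    models = settings["models"]
    name = level if level in models else settings["default_model"]
    return models[name]


minimal = {
    "models": {"standard": {"model_name": "deepseek-chat"}},
    "default_model": "standard",
}
# With only one model configured, every level resolves to the same entry.
for level in LEVELS:
    print(level, "->", resolve_model(minimal, level)["model_name"])
```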

Backend Dependencies

The br_agents branch currently supports the following three DSLs. Other backends are pending adaptation:

Platform	Backend (DSL)	Reference Link
Huawei Atlas A2 Training Series	Triton	https://gitee.com/ascend/triton-ascend
NVIDIA GPU	Triton	https://github.com/triton-lang/triton
CPU (x86_64)	C++	GCC / Clang

Launch & Usage

AKG Agents provides kernel generation, optimization, and migration capabilities through scripts in three locations:

Directory	Content
`examples/kernel_related/`	Generation, optimization, migration examples (primary)
`python/akg_agents/op/tools/`	Adaptive search, evolve runner scripts (single/batch)
`scripts/`	AutoResearch

Feature	Example Scripts
Generation	`*_single.py` — direct generation and verification
Optimization	`_adaptive_search.py` (UCB), `_evolve.py` (evolution)
Migration	`run_cuda_to_ascend_conversion.py`, `run_cuda_to_ascend_evolve.py`

Usage examples:

# Kernel generation
python examples/kernel_related/run_torch_npu_triton_single.py

# Optimization (adaptive search)
python python/akg_agents/op/tools/run_single_adaptive_search.py

# Optimization (evolution)
python python/akg_agents/op/tools/run_single_evolve.py

# Migration (CUDA → Ascend)
python examples/kernel_related/run_cuda_to_ascend_conversion.py

# AutoResearch
python scripts/run_autoresearch.py
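
The commands above can also be wrapped in a small launcher. This is a sketch only: the task names (`generate`, `evolve`, and so on) are hypothetical labels, the script paths are copied from the usage examples, and the working directory is assumed to be the `akg_agents/` directory of the checkout.

```python
import subprocess
import sys

# Script paths copied from the usage examples; the keys are hypothetical labels.
SCRIPTS = {
    "generate": "examples/kernel_related/run_torch_npu_triton_single.py",
    "adaptive_search": "python/akg_agents/op/tools/run_single_adaptive_search.py",
    "evolve": "python/akg_agents/op/tools/run_single_evolve.py",
    "migrate": "examples/kernel_related/run_cuda_to_ascend_conversion.py",
    "autoresearch": "scripts/run_autoresearch.py",
}


def build_command(task, extra_args=()):
    """Return the argv list that runs the script registered for `task`."""
    if task not in SCRIPTS:
        raise KeyError(f"unknown task {task!r}; choose from {sorted(SCRIPTS)}")
    return [sys.executable, SCRIPTS[task], *extra_args]


def run(task, extra_args=(), cwd="."):
    """Run the script for `task` in `cwd` and return its exit code."""
    return subprocess.run(build_command(task, extra_args), cwd=cwd, check=False).returncode
```

For example, `run("evolve", cwd="akg_agents")` from the repository root would start the evolution runner, provided the assumed directory layout matches your checkout.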

▶️ 4. Tutorial Examples

examples/ directory

Example	Category	Description
NPU
`kernel_related/run_torch_npu_triton_single.py`	Kernel	Single kernel generation (Torch + Triton Ascend)
`kernel_related/run_torch_npu_triton_single_with_cache.py`	Kernel	Single kernel verification + Data Cache reuse demo (Torch + Triton Ascend)
`kernel_related/run_torch_adaptive_search_triton_ascend.py`	Kernel	UCB adaptive search (Torch + Triton Ascend)
`kernel_related/run_torch_evolve_triton_ascend.py`	Kernel	Evolutionary kernel optimization (Torch + Triton Ascend)
`kernel_related/run_cuda_to_ascend_conversion.py`	Kernel	CUDA to Ascend kernel conversion
`kernel_related/run_cuda_to_ascend_evolve.py`	Kernel	CUDA to Ascend evolutionary optimization
GPU
`kernel_related/gpu/run_triton_to_torch_single.py`	Kernel	Single kernel generation (Torch + Triton CUDA)
`kernel_related/gpu/run_torch_evolve_triton.py`	Kernel	Evolutionary kernel optimization (Torch + Triton CUDA)
`kernel_related/gpu/run_cudac_to_torch_single.py`	Kernel	Single kernel generation (Torch + CUDA C)
CPU
`kernel_related/cpu/run_torch_cpu_cpp_single.py`	Kernel	Single kernel generation (Torch + CPP)
`kernel_related/cpu/run_torch_evolve_cpu_cpp.py`	Kernel	Evolutionary kernel optimization (Torch + CPP)
`kernel_related/cpu/run_torch_adaptive_search_cpu_cpp.py`	Kernel	UCB adaptive search (Torch + CPP)
AutoResearch
`scripts/run_autoresearch.py`	Kernel	AutoResearch iterative optimization (all backends, `--desc` / `--ref` / `--kernel`)
Utilities
`kernel_related/run_kernel_profile.py`	Kernel	Kernel performance profiling
`run_skill/`	Skill	Skill loading, registry, hierarchy, versioning, installation, LLM selection examples
`build_a_simple_react_agent/`	Framework	Build a custom ReAct Agent using the framework
`build_a_simple_workflow/`	Framework	Build a custom LangGraph-based Workflow
`settings.example.json`	Config	Basic `settings.json` configuration template
`settings.example.more.json`	Config	Multi-provider examples (OpenAI, DeepSeek, Claude, Qwen, Kimi, Doubao, etc.)

🧭 Usage Mode vs Development Mode

akg_agents/
├── workspace/          ← Usage: open a Code Agent here to use operator optimization
│   ├── .opencode/        skills / agents definitions, auto-loaded
│   └── AGENTS.md
└── ...                 ← Development: open here to develop the framework itself
    ├── AGENTS.md
    └── python/akg_agents/

Usage mode (workspace/): For operator optimization users. Open OpenCode, Claude Code, or Cursor in this directory; the built-in Agents and Skills handle environment setup, kernel generation, fusion analysis, and related tasks.
Development mode (akg_agents/): For framework developers. Develop the akg_agents codebase guided by AGENTS.md and per-directory SPEC.md files.
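
As a rough illustration of the two modes, the sketch below guesses which one a directory corresponds to from the markers in the tree above (`AGENTS.md`, `.opencode/`, `python/akg_agents/`). The function is hypothetical and not part of the framework; real checkouts may contain additional files.

```python
from pathlib import Path


def detect_mode(directory):
    """Guess 'usage', 'development', or 'unknown' from the directory layout."""
    d = Path(directory)
    if not (d / "AGENTS.md").is_file():
        return "unknown"
    if (d / ".opencode").is_dir():
        # workspace/ carries the auto-loaded skills / agents definitions.
        return "usage"
    if (d / "python" / "akg_agents").is_dir():
        # The framework root carries the Python package itself.
        return "development"
    return "unknown"


if __name__ == "__main__":
    print(detect_mode("."))
```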

📐 5. Design Documentation

Start with Architecture for an overview, then read Workflow and Skill System to understand the core mechanisms.

Core Framework

Architecture - Overall architecture and module overview
Agent System - Agent base classes, ReAct Agent, registry
Skill System - Skill management and dynamic knowledge injection
Tools - Tool execution framework, built-in tools, domain tools
Workflow - LangGraph-based workflow orchestration
Trace System - Tree-based inference tracing (multi-fork, checkpoint resume)
Configuration - Unified configuration management (settings.json / env vars)
LLM - LLM provider, client, embedding

Scenarios

Generation - Direct kernel code generation and correctness verification
Optimization - Adaptive Search (UCB), Evolve (evolutionary algorithm), AutoResearch (agent-driven iteration)
Migration - CUDA → Ascend kernel conversion
Verifier Data Cache - Verifier-side local cache for reference data and baseline profile results
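
For readers unfamiliar with the UCB idea behind Adaptive Search, here is a generic UCB1 selection step. It is the textbook rule, not the framework's implementation: how AKG Agents defines candidates and rewards is not described here, so `counts` and `total_rewards` are illustrative inputs.

```python
import math


def ucb1_select(counts, total_rewards, c=math.sqrt(2)):
    """Return the index of the candidate to try next under UCB1.

    counts[i]        -- how many times candidate i has been tried
    total_rewards[i] -- sum of rewards observed for candidate i
    c                -- exploration weight (sqrt(2) gives classic UCB1)
    """
    # Try every candidate once before trusting the confidence bound.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    total = sum(counts)
    scores = [
        total_rewards[i] / counts[i] + c * math.sqrt(math.log(total) / counts[i])
        for i in range(len(counts))
    ]
    # Highest mean-plus-bonus wins; rarely tried candidates get a larger bonus.
    return max(range(len(scores)), key=scores.__getitem__)
```

The bonus term shrinks as a candidate is tried more often, so the search keeps revisiting promising candidates while still sampling neglected ones.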

OpenCode Integration

akg-op User Guide - End-to-end operator optimization Agent: env setup → fusion analysis (optional) → task extraction → operator generation → code integration, supporting single-operator optimization and model fusion analysis

Contributing

Skill Contribution Guide - How to contribute new Skills

Additional Modules (v1 Documentation)

Database - Database module
RAG - Vector retrieval-augmented generation
RAG Usage Guide - RAG configuration and usage tutorial
Server Architecture - Service architecture (Client-Server-Worker)
TaskPool - Task pool management
DevicePool - Device pool management