
Nemotron Deep Agent + GPU Skills

A general-purpose deep agent showcasing a multi-model architecture with GPU code execution: a frontier model orchestrates and processes data while NVIDIA Nemotron Super handles research, all backed by a GPU sandbox running NVIDIA RAPIDS.

Architecture

create_deep_agent (orchestrator: frontier model)
    |
    |-- researcher-agent (Nemotron Super)
    |       Conducts web searches, gathers and synthesizes information
    |
    |-- data-processor-agent (frontier model)
    |       Writes and executes Python scripts on GPU sandbox
    |       GPU-accelerated data analysis, ML, visualization, document processing
    |
    |-- skills/
    |       cudf-analytics             GPU data analysis (groupby, stats, anomaly detection)
    |       cuml-machine-learning      GPU ML (classification, regression, clustering, PCA)
    |       data-visualization         Publication-quality charts (matplotlib, seaborn)
    |       gpu-document-processing    Large document processing via GPU sandbox
    |
    |-- memory/
    |       AGENTS.md    Persistent agent instructions (self-improving)
    |
    |-- backend: Modal Sandbox (GPU or CPU, switchable at runtime)
            Skills + memory uploaded on sandbox creation
            Agent reads/writes/executes directly inside the sandbox

Why multi-model? The frontier model handles planning, synthesis, and code generation where reasoning quality matters. Nemotron Super handles the volume work (web research) where speed and cost matter.

How GPU execution works: The data-processor-agent reads skill documentation (SKILL.md), writes Python scripts using RAPIDS APIs (cuDF, cuML), and executes them on a Modal sandbox via the execute tool. Charts are displayed inline via read_file.

Quickstart

Install uv:

curl -LsSf https://astral.sh/uv/install.sh | sh

Install dependencies:

cd nemotron-deep-agent
uv sync

Set your API keys in your .env file or export them:

export ANTHROPIC_API_KEY=your_key    # For Claude frontier model
export NVIDIA_API_KEY=your_key       # For Nemotron Super via NIM
export TAVILY_API_KEY=your_key       # For web search
export LANGSMITH_API_KEY=your_key    # For tracing (optional)
export LANGSMITH_PROJECT="nemotron-deep-agent"
export LANGSMITH_TRACING="true"
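
Before starting the server, it can help to confirm that the keys are visible to the process. The check below is a minimal sketch and not part of the example's code: the `missing_keys` helper is hypothetical, and it assumes the three provider keys listed above are required while the LangSmith variables are optional.

```python
import os

# Keys the README lists for the frontier model, Nemotron Super, and web search.
REQUIRED_KEYS = ("ANTHROPIC_API_KEY", "NVIDIA_API_KEY", "TAVILY_API_KEY")


def missing_keys(env=None):
    """Return the required keys that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing API keys: " + ", ".join(missing))
    else:
        print("All required API keys are set.")
```

Keys kept only in a .env file will not show up in this check unless something loads that file into the environment first.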

Add your Modal keys (MODAL_TOKEN_ID and MODAL_TOKEN_SECRET) to your .env file, or authenticate with Modal's CLI:

uv run modal setup

Run with LangGraph server:

uv run langgraph dev --allow-blocking

GPU vs CPU Sandbox Switching

The agent supports runtime switching between GPU and CPU sandboxes via context_schema. Pass context={"sandbox_type": "gpu"} or context={"sandbox_type": "cpu"} when invoking. In Studio, change this with the manage assistants button in the bottom left.

GPU mode uses the NVIDIA RAPIDS Docker image with an A10G GPU. CPU mode uses a lightweight image with pandas, numpy, and scipy.
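
One way to picture the switch is a small context object whose `sandbox_type` field selects the sandbox settings. This is a sketch under assumptions, not the code in src/agent.py: the `Context` dataclass, the `sandbox_settings` function, and the image labels are hypothetical. Only the `sandbox_type` values and the A10G GPU come from this README.

```python
from dataclasses import dataclass
from typing import Literal


@dataclass
class Context:
    """Hypothetical runtime context; only `sandbox_type` is named in the README."""

    sandbox_type: Literal["gpu", "cpu"]


def sandbox_settings(ctx: Context) -> dict:
    """Map the requested sandbox type to illustrative creation settings."""
    if ctx.sandbox_type == "gpu":
        # GPU mode: RAPIDS image plus an A10G, as described above.
        return {"image": "rapids", "gpu": "A10G"}
    if ctx.sandbox_type == "cpu":
        # CPU mode: lightweight image, no GPU requested.
        return {"image": "cpu-lightweight", "gpu": None}
    raise ValueError(f"unknown sandbox_type: {ctx.sandbox_type!r}")


print(sandbox_settings(Context(sandbox_type="cpu")))
```

Rejecting unknown values early keeps a typo in the context from silently falling back to one of the two modes.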

Try It Out

Start the server:

uv run langgraph dev --allow-blocking

Then open LangSmith Studio and try:

Generate a 1000-row random dataset about credit card transactions with columns
(id, value, category, score) use your cudf skill, then do some cool analysis
and give me some insights on that data!

The agent will delegate to the data-processor-agent, which reads the cuDF skill, writes a Python script to generate and analyze the dataset on the GPU sandbox, and returns structured insights with inline charts.

Resume from human-in-the-loop interrupts in Studio by pasting:

{"decisions": [{"type": "approve"}]}
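
If several actions are pending at once, the same payload can be built programmatically. The sketch below assumes one decision per pending action, which this README does not state. `approve_all` is a hypothetical helper, and only the `approve` decision type shown above is used.

```python
import json


def approve_all(pending_actions: int) -> str:
    """Build a resume payload approving `pending_actions` interrupted actions."""
    if pending_actions < 1:
        raise ValueError("pending_actions must be at least 1")
    decisions = [{"type": "approve"} for _ in range(pending_actions)]
    return json.dumps({"decisions": decisions})


# With a single pending action this reproduces the payload shown above.
print(approve_all(1))
```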

Example Queries

Data Analysis: "Generate a 1000-row random dataset about credit card transactions with columns (id, value, category, score), then analyze it for trends and anomalies"

Research + Analysis: "Research the latest trends in renewable energy adoption, then create a visualization comparing solar vs wind capacity growth"

ML: "Upload this CSV and train a classifier to predict customer churn. Show feature importances."

Model Configuration

Frontier model

Configured in src/agent.py via init_chat_model (supports any provider):

from langchain.chat_models import init_chat_model

frontier_model = init_chat_model("anthropic:claude-sonnet-4-6")

Research subagent (NVIDIA Nemotron Super)

Configured via NVIDIA's NIM endpoint (OpenAI-compatible):

from langchain_nvidia_ai_endpoints import ChatNVIDIA

nemotron_super = ChatNVIDIA(
    model="private/nvidia/nemotron-3-super-120b-a12b"
)

GPU Sandbox

The agent uses a Modal sandbox with the NVIDIA RAPIDS base image (cuDF, cuML pre-installed). GPU type is A10G by default.

To use a different GPU tier, modify src/agent.py:

create_kwargs["gpu"] = "A100"  # or "T4", "H100"

Skills

Skills teach the agent how to use NVIDIA libraries via the Agent Skills Specification. Each skill is a SKILL.md file the agent reads when it encounters a matching task.

cudf-analytics

GPU-accelerated data analysis using NVIDIA RAPIDS cuDF. Pandas-like API on GPU for groupby, statistics, correlation, and anomaly detection.

cuml-machine-learning

GPU-accelerated machine learning using NVIDIA RAPIDS cuML. Scikit-learn compatible API for classification, regression, clustering, dimensionality reduction (PCA, UMAP, t-SNE), and preprocessing — all on GPU.

data-visualization

Publication-quality charts using matplotlib and seaborn in headless mode. Includes templates for bar, line, scatter, heatmap, histogram, box plots, and multi-panel dashboard summaries with a colorblind-safe palette. Charts are displayed inline in the conversation via read_file.

gpu-document-processing

Large document processing via the sandbox-as-tool pattern. The agent writes extraction scripts and runs them on the GPU sandbox.

Adding Your Own Skills

skills/
  my-skill/
    SKILL.md
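
A new skill directory can be scaffolded with a few lines of Python. This is a sketch, assuming the frontmatter layout of the Agent Skills Specification (a `name` and a `description` field). The `scaffold_skill` helper and the section headings in the template are placeholders, not files from this example.

```python
from pathlib import Path

# Placeholder template: YAML frontmatter followed by free-form instructions.
SKILL_TEMPLATE = """---
name: {name}
description: {description}
---

# {name}

## When to use

Describe the tasks this skill applies to.

## Instructions

Step-by-step guidance and code patterns for the agent.
"""


def scaffold_skill(skills_dir, name, description):
    """Create skills_dir/<name>/SKILL.md and return its path."""
    skill_file = Path(skills_dir) / name / "SKILL.md"
    skill_file.parent.mkdir(parents=True, exist_ok=True)
    skill_file.write_text(
        SKILL_TEMPLATE.format(name=name, description=description),
        encoding="utf-8",
    )
    return skill_file
```

Calling `scaffold_skill("skills", "my-skill", "What the skill does and when to use it")` from the project root would produce the layout shown above. A description containing YAML-special characters such as a colon would need quoting, which this sketch does not handle.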

Self-Improving Memory

The agent has persistent memory via AGENTS.md, loaded at startup through the memory parameter. When the agent discovers something reusable during execution — like a library API that doesn't exist, a better code pattern, or a non-obvious error fix — it edits its own skill files to capture that knowledge for future runs.

For example, if the data-processor-agent discovers that cudf.DataFrame.interpolate() isn't implemented, it updates skills/cudf-analytics/SKILL.md with a "Known Limitations" note so it won't repeat the mistake.
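
The kind of edit described above can be sketched as a small text transformation. This is an illustration only: in the example the agent edits the file with its own file tools inside the sandbox, and the `add_known_limitation` helper and the exact heading format are assumptions.

```python
HEADING = "## Known Limitations"


def add_known_limitation(skill_text: str, note: str) -> str:
    """Return `skill_text` with `note` listed under a Known Limitations heading."""
    bullet = f"- {note}"
    lines = skill_text.rstrip("\n").split("\n")
    if bullet in lines:
        # Already recorded; avoid duplicating the note on later runs.
        return skill_text
    if HEADING in lines:
        # Section exists: insert the note directly under the heading.
        lines.insert(lines.index(HEADING) + 1, bullet)
    else:
        # No section yet: append one at the end of the file.
        lines += ["", HEADING, bullet]
    return "\n".join(lines) + "\n"


skill = "# cudf-analytics\n\nUse cuDF for groupby and statistics.\n"
# The note text mirrors the example finding described above.
print(add_known_limitation(skill, "cudf.DataFrame.interpolate() is not implemented."))
```

Making the update idempotent matters here because the same finding may come up on several runs.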

Memory and skills are uploaded into the sandbox on creation via upload_files. The agent reads and edits them directly inside the sandbox; changes persist for the sandbox's lifetime. In production, swap the local file reads in _seed_sandbox for your storage layer (S3, database, etc.). See src/backend.py for the backend configuration.
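
Seeding can be pictured as collecting local files into (path, bytes) pairs and handing them to the backend. The sketch below is not the `_seed_sandbox` in src/backend.py: `collect_seed_files` is a hypothetical helper, the `/workspace` destination root is a made-up path, and the assumption that `upload_files` accepts a list of `(path, bytes)` tuples should be checked against the backend in use.

```python
from pathlib import Path, PurePosixPath


def collect_seed_files(local_root, subdirs=("skills", "memory"), dest_root="/workspace"):
    """Gather files under local_root/<subdir> as (sandbox_path, content) pairs."""
    local_root = Path(local_root)
    files = []
    for subdir in subdirs:
        base = local_root / subdir
        if not base.is_dir():
            continue  # Nothing to seed for this directory.
        for path in sorted(base.rglob("*")):
            if path.is_file():
                relative = path.relative_to(local_root)
                # Sandbox paths are POSIX-style regardless of the host OS.
                dest = PurePosixPath(dest_root, *relative.parts)
                files.append((str(dest), path.read_bytes()))
    return files


# Hypothetical usage once a sandbox backend exists:
# backend.upload_files(collect_seed_files("."))
```

To move off the local filesystem, the `path.read_bytes()` call is the piece to replace with a fetch from S3, a database, or another storage layer.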

Adapting to Your Domain

  1. Swap prompts in src/prompts.py
  2. Add/replace subagents with domain-specific agents
  3. Add skills for domain capabilities
  4. Change models in src/agent.py
  5. Swap sandbox for a different provider (Daytona, E2B, or local)

Full Enterprise Version

For a full enterprise deployment with NeMo Agent Toolkit, evaluation harnesses, knowledge layer, and frontend, see NVIDIA's AIQ Blueprint: https://github.com/langchain-ai/aiq-blueprint

Resources