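Built on Qwen3-Coder, with native support for a 256K context that can be extended to 1M tokens. Optimized for repository-scale code understanding, with strong agentic coding and tool-calling capabilities. [This summary was AI-generated]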


tags:
- unsloth
- qwen3
- qwen
base_model:
- Qwen/Qwen3-Coder-30B-A3B-Instruct
library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct/blob/main/LICENSE
pipeline_tag: text-generation

Note

Context length extended from 256K to 1M tokens.

See our collection for all versions of Qwen3, including GGUF, 4-bit, and 16-bit formats.

Learn to run Qwen3-Coder correctly - read our guide.

See Unsloth Dynamic 2.0 GGUFs for our quantization benchmarks.

✨ Read our Qwen3-Coder guide here!

- Fine-tune Qwen3 (14B) for free using our Google Colab notebook!
- Read our blog about Qwen3 support: unsloth.ai/blog/qwen3
- View the rest of our notebooks in our docs here.

| Unsloth supports | Free Notebooks | Performance | Memory use |
|------------------|----------------|-------------|------------|
| Qwen3 (14B) | ▶️ Start on Colab | 3x faster | 70% less |
| GRPO with Qwen3 (8B) | ▶️ Start on Colab | 3x faster | 80% less |
| Llama-3.2 (3B) | ▶️ Start on Colab | 2.4x faster | 58% less |
| Llama-3.2 (11B vision) | ▶️ Start on Colab | 2x faster | 60% less |
| Qwen2.5 (7B) | ▶️ Start on Colab | 2x faster | 60% less |

Qwen3-Coder-30B-A3B-Instruct

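Highlights

Qwen3-Coder is available in multiple sizes. Today, we're excited to introduce Qwen3-Coder-30B-A3B-Instruct. This streamlined model maintains strong performance and efficiency, with the following key enhancements:

- Standout performance among open models on Agentic Coding, Agentic Browser-Use, and other foundational coding tasks.
- Long-context capabilities with native support for 256K tokens, extendable to 1M tokens using Yarn, optimized for repository-scale understanding.
- Agentic coding support for mainstream platforms such as Qwen Code and CLINE, with a specially designed function call format.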
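
As a rough illustration of the 256K-to-1M extension mentioned above, the sketch below computes a YaRN scaling factor as the ratio of target to native context length. The helper name `yarn_scaling_factor` is hypothetical, "1M" is assumed to mean 4 x 262,144 tokens, and the dictionary keys follow the Hugging Face `rope_scaling` convention. The values actually stored in this repository's files were not checked, so this is an assumption rather than the shipped configuration.

```python
NATIVE_CONTEXT = 262_144      # native context length stated in the model overview
TARGET_CONTEXT = 1_048_576    # assumption: "1M" read as 4 x 262,144 tokens


def yarn_scaling_factor(native: int, target: int) -> float:
    """Return the context-extension ratio used as the YaRN factor."""
    if native <= 0 or target < native:
        raise ValueError("target must be at least the native context length")
    return target / native


# Hypothetical rope_scaling fragment in the Hugging Face config style.
rope_scaling = {
    "rope_type": "yarn",
    "factor": yarn_scaling_factor(NATIVE_CONTEXT, TARGET_CONTEXT),
    "original_max_position_embeddings": NATIVE_CONTEXT,
}

print(rope_scaling["factor"])
```

Under these assumptions the factor comes out to 4.0; a target of exactly 1,000,000 tokens would give a slightly smaller, non-integer ratio.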


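Model Overview

Qwen3-Coder-30B-A3B-Instruct has the following features:

- Type: Causal Language Models
- Training Stage: Pretraining & Post-training
- Number of Parameters: 30.5B in total and 3.3B activated
- Number of Layers: 48
- Number of Attention Heads (GQA): 32 for Q and 4 for KV
- Number of Experts: 128
- Number of Activated Experts: 8
- Context Length: 262,144 tokens natively.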
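
To make the sparsity in the list above concrete, here is a small sketch that derives a few ratios from the stated figures only. The constant names are illustrative, and the parameter counts are the rounded values given above, so the percentages are approximate.

```python
TOTAL_PARAMS_B = 30.5     # total parameters, in billions
ACTIVE_PARAMS_B = 3.3     # parameters activated per token, in billions
NUM_EXPERTS = 128
ACTIVE_EXPERTS = 8
Q_HEADS = 32
KV_HEADS = 4

active_param_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B   # share of weights used per token
active_expert_fraction = ACTIVE_EXPERTS / NUM_EXPERTS      # share of experts routed per token
gqa_group_size = Q_HEADS // KV_HEADS                       # query heads sharing one KV head

print(f"activated parameters: {active_param_fraction:.1%}")
print(f"activated experts:    {active_expert_fraction:.2%}")
print(f"GQA group size:       {gqa_group_size}")
```

Roughly one tenth of the weights participate in each token's forward pass, which is why the model is labeled "A3B" despite its 30B total size.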

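NOTE: This model supports only non-thinking mode and does not generate <think></think> blocks in its output. Specifying enable_thinking=False is no longer required.

For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our blog, GitHub, and documentation.

Quickstart

We advise you to use the latest version of transformers.

With transformers<4.51.0, you will encounter the following error: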

KeyError: 'qwen3_moe'
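
One way to catch this before loading the model is to check the installed version up front. The following is a minimal sketch using only the standard library; the helper names `release_tuple` and `supports_qwen3_moe` are hypothetical, and pre-release suffixes such as `.dev0` are ignored, which is a simplification.

```python
import re
from importlib import metadata

MIN_TRANSFORMERS = (4, 51, 0)  # versions below this raise KeyError: 'qwen3_moe'


def release_tuple(version: str) -> tuple:
    """Parse the leading numeric components, e.g. '4.51.0.dev0' -> (4, 51, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = [int(part) for part in match.group(0).split(".")]
    parts += [0] * (3 - len(parts))  # pad '4.51' to (4, 51, 0)
    return tuple(parts)


def supports_qwen3_moe(version: str) -> bool:
    """True if the version is at least 4.51.0 (pre-release suffixes are ignored)."""
    return release_tuple(version) >= MIN_TRANSFORMERS


try:
    installed = metadata.version("transformers")
    status = "OK" if supports_qwen3_moe(installed) else "too old, upgrade transformers"
    print(installed, status)
except metadata.PackageNotFoundError:
    print("transformers is not installed")
```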

The following code snippet shows how to use the model to generate content from given inputs.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-Coder-30B-A3B-Instruct"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# prepare the model input
prompt = "Write a quick sort algorithm."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=65536
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist() 

content = tokenizer.decode(output_ids, skip_special_tokens=True)

print("content:", content)

Note: If you encounter out-of-memory (OOM) issues, consider reducing the context length to a shorter value, such as 32,768.
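
To see why a shorter context helps, the sketch below estimates the key/value cache size for a single sequence. The layer and KV-head counts come from the model overview; the head dimension of 128 and the 2-byte (BF16/FP16) cache entries are assumptions not stated in this document. The estimate excludes model weights and activations.

```python
NUM_LAYERS = 48        # from the model overview
NUM_KV_HEADS = 4       # from the model overview (GQA)
HEAD_DIM = 128         # assumption: not stated in this document
BYTES_PER_VALUE = 2    # assumption: BF16/FP16 cache entries


def kv_cache_bytes(context_tokens: int) -> int:
    """Bytes needed to cache keys and values for one sequence."""
    per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * BYTES_PER_VALUE  # 2 = K and V
    return per_token * context_tokens


for tokens in (262_144, 32_768):
    print(f"{tokens:>7} tokens -> {kv_cache_bytes(tokens) / 2**30:.1f} GiB")
```

Under these assumptions the cache grows linearly with context: about 24 GiB at the full native length versus about 3 GiB at 32,768 tokens.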

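For local use, applications such as Ollama, LMStudio, MLX-LM, llama.cpp, and KTransformers also support Qwen3.

Agentic Coding

Qwen3-Coder excels at tool calling.

You can define or use any tool as shown in the following example.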

# Your tool implementation
def square_the_number(input_num: float) -> float:
    # Parameter name matches the "input_num" property in the tool schema
    return input_num ** 2

# Define Tools
tools=[
    {
        "type":"function",
        "function":{
            "name": "square_the_number",
            "description": "output the square of the number.",
            "parameters": {
                "type": "object",
                "required": ["input_num"],
                "properties": {
                    'input_num': {
                        'type': 'number', 
                        'description': 'input_num is a number that will be squared'
                        }
                },
            }
        }
    }
]

from openai import OpenAI
# Define LLM
client = OpenAI(
    # Use a custom endpoint compatible with OpenAI API
    base_url='http://localhost:8000/v1',  # api_base
    api_key="EMPTY"
)
 
messages = [{'role': 'user', 'content': 'square the number 1024'}]

completion = client.chat.completions.create(
    messages=messages,
    model="Qwen3-Coder-30B-A3B-Instruct",
    max_tokens=65536,
    tools=tools,
)

print(completion.choices[0])
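
The example above stops at printing the first choice. A possible next step is to route the returned tool call to the local Python function. The sketch below shows one way to do that without a running server: `TOOL_REGISTRY` and `dispatch_tool_call` are hypothetical names, and a hand-written name and JSON argument string stand in for the model's response.

```python
import json


def square_the_number(input_num: float) -> float:
    """Local implementation of the tool described in the schema."""
    return input_num ** 2


# Registry mapping tool names (as declared in the schema) to Python callables.
TOOL_REGISTRY = {"square_the_number": square_the_number}


def dispatch_tool_call(name: str, arguments_json: str):
    """Decode the JSON argument string and call the matching local function."""
    if name not in TOOL_REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    arguments = json.loads(arguments_json)
    return TOOL_REGISTRY[name](**arguments)


# With a live response, the inputs would come from
# completion.choices[0].message.tool_calls[i].function.name and .arguments.
# Here a hand-written call stands in for the model output.
print(dispatch_tool_call("square_the_number", '{"input_num": 1024}'))
```

Keeping the function's parameter name identical to the schema property (`input_num`) is what allows the decoded arguments to be passed straight through with `**arguments`.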

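Best Practices

To achieve optimal performance, we recommend the following settings:

- Sampling Parameters: We suggest using temperature=0.7, top_p=0.8, top_k=20, repetition_penalty=1.05.
- Adequate Output Length: We recommend an output length of 65,536 tokens for most queries, which is adequate for instruct models.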
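
The settings above could be collected in one place and reused for both inference paths shown earlier. This is a minimal sketch: the helper names are hypothetical, and passing `top_k` and `repetition_penalty` through `extra_body` assumes the OpenAI-compatible server accepts those fields, since they are not part of the standard OpenAI request schema.

```python
# Recommended sampling settings from this section.
SAMPLING = {
    "temperature": 0.7,
    "top_p": 0.8,
    "top_k": 20,
    "repetition_penalty": 1.05,
}
MAX_OUTPUT_TOKENS = 65_536


def transformers_generate_kwargs() -> dict:
    """Keyword arguments for model.generate(); do_sample=True enables sampling."""
    return {"do_sample": True, "max_new_tokens": MAX_OUTPUT_TOKENS, **SAMPLING}


def openai_compatible_kwargs() -> dict:
    """Keyword arguments for client.chat.completions.create().

    Assumption: the server accepts top_k and repetition_penalty as extra
    body fields (they are not standard OpenAI request parameters).
    """
    return {
        "temperature": SAMPLING["temperature"],
        "top_p": SAMPLING["top_p"],
        "max_tokens": MAX_OUTPUT_TOKENS,
        "extra_body": {
            "top_k": SAMPLING["top_k"],
            "repetition_penalty": SAMPLING["repetition_penalty"],
        },
    }


print(transformers_generate_kwargs())
print(openai_compatible_kwargs())
```

With transformers, these would be applied as `model.generate(**model_inputs, **transformers_generate_kwargs())`, replacing the explicit `max_new_tokens` argument.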

Citation

If you find our work helpful, feel free to cite it.

@misc{qwen3technicalreport,
      title={Qwen3 Technical Report}, 
      author={Qwen Team},
      year={2025},
      eprint={2505.09388},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.09388}, 
}
