PaddleOCR-VL (vLLM) Inference Guide

Overview
Inference Environment Preparation
Quick Start

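Overview

PaddleOCR-VL is a document parsing model released by Baidu. With only 0.9B parameters it is reported to outperform much larger models. It supports document recognition in 109 languages, parses complex document elements such as text, tables, mathematical formulas, and charts, and reaches SOTA on both OmniDocBench v1.5 and v1.0.

This document describes how to deploy the PaddleOCR-VL model, covering inference environment preparation, model deployment, and functional verification, so that users can deploy and verify the model quickly.

Note: The complete PaddleOCR-VL pipeline consists of the VLM component PaddleOCR-VL-0.9B and the layout analysis model PP-DocLayoutV2. PaddleOCR-VL-0.9B is not a variant of the PaddleOCR-VL model.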

Version information:

url=https://modelscope.cn/models/PaddlePaddle/PaddleOCR-VL
model_name=PaddleOCR-VL

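Inference Environment Preparation

The model requires the following plugins and drivers.
Only Atlas 800I A2 and Atlas 800T A2 are supported.

Table 1 Version compatibility

Component	Version	Environment setup guide
Firmware and driver	25.2.RC1	PyTorch framework inference environment preparation
CANN	8.3.RC2	-
Python	3.11	-
vLLM-Ascend	0.13.rc1	-
PyTorch	2.8.0	-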
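
Before going further, it can help to confirm that the Python-side packages match Table 1. The following is a minimal sketch, not part of the original guide: the distribution names `torch` and `vllm-ascend` are assumptions about how the packages are registered with pip, and the comparison is a coarse prefix match rather than a full version parse.

```python
import sys
from importlib import metadata
from typing import Callable, Dict, Optional

# Expected version prefixes taken from Table 1.
# The distribution names used as keys are assumptions, not confirmed by the guide.
EXPECTED = {"torch": "2.8.0", "vllm-ascend": "0.13"}


def installed_version(dist: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


def check_versions(expected: Dict[str, str],
                   lookup: Callable[[str], Optional[str]] = installed_version) -> Dict[str, bool]:
    """Map each distribution to True if its installed version starts with the expected prefix."""
    report = {}
    for dist, prefix in expected.items():
        found = lookup(dist)
        report[dist] = found is not None and found.startswith(prefix)
    return report


def python_matches(major: int = 3, minor: int = 11) -> bool:
    """Check the interpreter version against the Python row of Table 1."""
    return sys.version_info[:2] == (major, minor)
```

Calling `check_versions(EXPECTED)` inside the container returns a dictionary such as `{"torch": True, "vllm-ascend": False}`, which points at the package that needs attention.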

Quick Start

Environment Configuration

Clone the source code

git clone https://gitcode.com/ascend/ModelZoo-PyTorch.git
cd ModelZoo-PyTorch/ACL_PyTorch/built-in/ocr/PaddleOCR-VL

Pull the image

# Reuse the vLLM base image, or set up a vLLM environment by following the vllm-ascend community documentation:
docker pull quay.io/ascend/vllm-ascend:v0.13.rc1

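Update the torch_npu version

wget https://pytorch-package.obs.cn-north-4.myhuaweicloud.com/pta/Daily/v2.8.0/20260131.3/pytorch_v2.8.0_py311.tar.gz
# The download is a tar.gz archive; extract it first. The wheel installed below is assumed to be inside it, so adjust the path if it extracts elsewhere.
tar -xzf pytorch_v2.8.0_py311.tar.gz
pip install torch_npu-2.8.0.post3.dev20260131-cp311-cp311-manylinux_2_28_aarch64.whl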

Code Changes

Modify utils.py

vim /usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/npu/utils.py

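Find the function "_get_device_index" and add the following two lines directly above its definition, so that the decorator applies to that function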

from functools import lru_cache
@lru_cache()
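
The effect of this decorator can be illustrated in isolation. The sketch below uses a hypothetical stand-in function, `resolve_device_index`, which is not the torch_npu implementation; it only shows that `lru_cache` runs the function body once per distinct argument and serves repeated calls from the cache.

```python
from functools import lru_cache

call_count = 0  # counts how often the function body actually runs


@lru_cache()
def resolve_device_index(device: str) -> int:
    """Hypothetical stand-in for a device-index lookup; not torch_npu code."""
    global call_count
    call_count += 1
    # "npu:3" -> 3; a bare "npu" maps to index 0 in this toy example
    return int(device.split(":")[1]) if ":" in device else 0


# Repeated calls with the same argument are answered from the cache after the first one.
results = [resolve_device_index("npu:3") for _ in range(1000)]
info = resolve_device_index.cache_info()
print(results[0], call_count, info.hits, info.misses)  # 3 1 999 1
```

Note that `lru_cache` requires hashable arguments and assumes the result for a given argument does not change between calls; with no arguments it keeps up to 128 entries.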

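Apply the vllm-ascend patch

Enable the FIA update 4-in-1 interface and perform asynchronous tiling updates

cd /vllm-workspace/vllm-ascend/
# The patch file ships in the PaddleOCR-VL directory cloned above; copy it here or pass its full path to git apply
git apply use_4in1_and_add_update_early.patch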

Get the weights

Download the PaddleOCR-VL model weights and place them in the local directory PaddleOCR-VL

mkdir PaddleOCR-VL
modelscope download --model PaddlePaddle/PaddleOCR-VL --local_dir ./PaddleOCR-VL

PaddleOCR-VL-0.9B Model Inference

Start the vLLM service

export TASK_QUEUE_ENABLE=1
export PER_STREAM_QUEUE=1
export PYTORCH_NPU_ALLOC_CONF="expandable_segments:True"
export MODEL_PATH="[your_model_path]"
vllm serve ${MODEL_PATH} \
          --host 0.0.0.0 \
          --port 18888 \
          --max-num-batched-tokens 16384 \
          --served-model-name PaddleOCR-VL-0.9B \
          --trust-remote-code \
          --no-enable-prefix-caching \
          --mm-processor-cache-gb 0 \
          --compilation-config '{"cudagraph_mode":"FULL_DECODE_ONLY"}' \
          --async-scheduling
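
Model loading and graph compilation can take a while, so requests sent immediately after launching the command may fail. One way to handle this is to poll the OpenAI-compatible model listing endpoint until it answers. The following is a minimal sketch under the assumption that the server was started with --port 18888 as above; the helper names `http_ok` and `wait_until_ready` are illustrative, not part of vLLM.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str) -> bool:
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response: not ready yet
        return False


def wait_until_ready(url: str, timeout_s: float = 600.0, interval_s: float = 5.0,
                     probe: Callable[[str], bool] = http_ok) -> bool:
    """Poll `url` until the probe succeeds or `timeout_s` seconds have elapsed."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

For example, `wait_until_ready("http://127.0.0.1:18888/v1/models")` returns True once the service accepts requests and False if it does not come up within ten minutes.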

Send a request

Once the service is up, it can be queried with the OpenAI API client

from openai import OpenAI

client = OpenAI(
    api_key="EMPTY",
    base_url="http://localhost:18888/v1",  # must match the --port passed to vllm serve
    timeout=3600
)

# Task-specific base prompts
TASKS = {
    "ocr": "OCR:",
    "table": "Table Recognition:",
    "formula": "Formula Recognition:",
    "chart": "Chart Recognition:",
}

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://ofasys-multimodal-wlcb-3-toshanghai.oss-accelerate.aliyuncs.com/wpf272043/keepme/image/receipt.png"
                }
            },
            {
                "type": "text",
                "text": TASKS["ocr"]
            }
        ]
    }
]

response = client.chat.completions.create(
    model="PaddleOCR-VL-0.9B",
    messages=messages,
    temperature=0.0,
)
print(f"Generated text: {response.choices[0].message.content}")
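
The request above references a remote image by URL. For a local file, the image can instead be embedded as a base64 data URL in the same `image_url` field. The sketch below is an illustration rather than part of the original guide: `build_messages` is a hypothetical helper, and the task prompts are copied from the TASKS dictionary above.

```python
import base64

# Task-specific base prompts, identical to the TASKS dictionary in the request example
TASKS = {
    "ocr": "OCR:",
    "table": "Table Recognition:",
    "formula": "Formula Recognition:",
    "chart": "Chart Recognition:",
}


def build_messages(image_bytes: bytes, task: str = "ocr", mime: str = "image/png") -> list:
    """Build a chat message list that carries the image inline as a data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return [
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}},
                {"type": "text", "text": TASKS[task]},
            ],
        }
    ]
```

With the client created earlier, the result can be passed directly as `messages`, for example `build_messages(open("table.png", "rb").read(), task="table")`, where `table.png` is a placeholder file name.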

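End-to-End Inference with PP-DocLayoutV2

Deploy the client by following the PP-DocLayoutV2 guide

Note: When deploying the end-to-end PaddleOCR-VL pipeline, it is recommended to keep the PP-DocLayoutV2 and PaddleOCR-VL-0.9B environments isolated from each other to avoid dependency conflicts

Run end-to-end inference on the client with the inference script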

python infer.py --input_path image.jpg

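Parameters

layout_detection_model_dir: directory of the layout detection model; defaults to PP-DocLayoutV2
vllm_ip: vLLM server URL; defaults to "http://127.0.0.1:8000/v1". If the service was started with --port 18888 as shown above, set this to "http://127.0.0.1:18888/v1"
input_path: input path; an image, a PDF, or a directory
output_dir: output path; by default the parsing results are stored in the output directory, including JSON files with the intermediate results and the visualization images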
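
To make the parameter list concrete, the sketch below declares an equivalent command-line interface with argparse. It is a hypothetical reconstruction from the descriptions above, not the contents of infer.py, and the actual script may define its options differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI mirroring the documented parameters; not the real infer.py."""
    parser = argparse.ArgumentParser(description="End-to-end PaddleOCR-VL inference (sketch)")
    parser.add_argument("--layout_detection_model_dir", default="PP-DocLayoutV2",
                        help="directory of the layout detection model")
    parser.add_argument("--vllm_ip", default="http://127.0.0.1:8000/v1",
                        help="vLLM server URL")
    parser.add_argument("--input_path", required=True,
                        help="image, PDF, or directory to parse")
    parser.add_argument("--output_dir", default="output",
                        help="directory for JSON results and visualizations")
    return parser


# Override the default URL so it matches a server started with --port 18888
args = build_parser().parse_args(
    ["--input_path", "image.jpg", "--vllm_ip", "http://127.0.0.1:18888/v1"]
)
print(args.vllm_ip, args.output_dir)
```

Under this assumption, the equivalent shell invocation would add `--vllm_ip http://127.0.0.1:18888/v1` to the infer.py command shown earlier.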

Model Inference Performance

End-to-end latency measured on the official test image paddleocr_vl_demo.png:

Model	Hardware	Latency (s)
PaddleOCR-VL	Atlas 800I A2	1.4
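
The guide does not describe how this figure was measured. For readers who want to take a comparable measurement on their own hardware, the following is one possible timing harness; the helper names, the warm-up count, and the number of repeats are all assumptions rather than the procedure behind the number above.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_runs(fn: Callable[[], object], warmup: int = 1, repeats: int = 5) -> List[float]:
    """Call `fn` a few times untimed, then return the wall-clock duration of each timed call."""
    for _ in range(warmup):
        fn()  # warm-up calls absorb one-off costs such as compilation or caching
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Reduce a list of durations (seconds) to mean, minimum, and maximum."""
    return {
        "mean_s": statistics.mean(durations),
        "min_s": min(durations),
        "max_s": max(durations),
    }
```

Here `fn` would wrap a single end-to-end call, for instance a function that runs the pipeline once on paddleocr_vl_demo.png; reporting the mean together with the minimum and maximum makes run-to-run variation visible.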