ultravox-v0_5-llama-3_2-1b-ONNX: an audio-text-to-text model for Transformers.js, with multilingual support and configurable quantization

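This ONNX model converts audio to text and supports multiple languages. It integrates with Transformers.js and offers several quantization configurations that trade off performance against accuracy, which makes it suitable for tasks such as speech transcription. [This summary was AI-generated]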


language:

  • ar
  • be
  • bg
  • bn
  • cs
  • cy
  • da
  • de
  • el
  • en
  • es
  • et
  • fa
  • fi
  • fr
  • gl
  • hi
  • hu
  • it
  • ja
  • ka
  • lt
  • lv
  • mk
  • mr
  • nl
  • pl
  • pt
  • ro
  • ru
  • sk
  • sl
  • sr
  • sv
  • sw
  • ta
  • th
  • tr
  • uk
  • ur
  • vi
  • zh

library_name: transformers.js

license: mit

metrics:

  • bleu

pipeline_tag: audio-text-to-text

base_model:

  • fixie-ai/ultravox-v0_5-llama-3_2-1b

Usage (Transformers.js)

If you haven't already, you can install the Transformers.js JavaScript library from NPM:

npm i @huggingface/transformers

You can then use the model like this:

import { UltravoxProcessor, UltravoxModel, read_audio } from "@huggingface/transformers";

const processor = await UltravoxProcessor.from_pretrained(
  "onnx-community/ultravox-v0_5-llama-3_2-1b-ONNX",
);
const model = await UltravoxModel.from_pretrained(
  "onnx-community/ultravox-v0_5-llama-3_2-1b-ONNX",
  {
    dtype: {
      embed_tokens: "q8", // "fp32", "fp16", "q8"
      audio_encoder: "q4", // "fp32", "fp16", "q8", "q4", "q4f16"
      decoder_model_merged: "q4", // "q8", "q4", "q4f16"
    },
  },
);

const audio = await read_audio("http://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/mlk.wav", 16000);
const messages = [
  {
    role: "system",
    content: "You are a helpful assistant.",
  },
  { role: "user", content: "Transcribe this audio:<|audio|>" },
];
const text = processor.tokenizer.apply_chat_template(messages, {
  add_generation_prompt: true,
  tokenize: false,
});

const inputs = await processor(text, audio);
const generated_ids = await model.generate({
  ...inputs,
  max_new_tokens: 128,
});

const generated_texts = processor.batch_decode(
  generated_ids.slice(null, [inputs.input_ids.dims.at(-1), null]),
  { skip_special_tokens: true },
);
console.log(generated_texts[0]);
// "I can transcribe the audio for you. Here's the transcription:\n\n\"I have a dream that one day this nation will rise up and live out the true meaning of its creed.\"\n\n- Martin Luther King Jr.\n\nWould you like me to provide the transcription in a specific format (e.g., word-for-word, character-for-character, or a specific font)?"
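
The usage example above hard-codes one quantization choice per sub-model and one prompt. As a minimal sketch, the two helpers below show one way to vary both safely. `buildDtypeConfig` and `buildMessages` are hypothetical names, not part of Transformers.js. The allowed dtype values are copied from the inline comments in the example above and are assumed to be the complete list for this repository, and the defaults simply mirror that example.

```javascript
// Hypothetical helpers for illustration; they are NOT part of Transformers.js.

// Allowed values per sub-model, copied from the comments in the usage example.
// Assumption: these lists are complete for this repository.
const SUPPORTED_DTYPES = {
  embed_tokens: ["fp32", "fp16", "q8"],
  audio_encoder: ["fp32", "fp16", "q8", "q4", "q4f16"],
  decoder_model_merged: ["q8", "q4", "q4f16"],
};

// Defaults mirror the values used in the usage example.
const DEFAULT_DTYPES = {
  embed_tokens: "q8",
  audio_encoder: "q4",
  decoder_model_merged: "q4",
};

// Merge overrides into the defaults and reject unknown names or dtypes
// before any model files are requested.
function buildDtypeConfig(overrides = {}) {
  const config = { ...DEFAULT_DTYPES, ...overrides };
  for (const [name, dtype] of Object.entries(config)) {
    const allowed = SUPPORTED_DTYPES[name];
    if (!allowed) {
      throw new Error(`Unknown sub-model: ${name}`);
    }
    if (!allowed.includes(dtype)) {
      throw new Error(
        `Unsupported dtype "${dtype}" for ${name}; expected one of ${allowed.join(", ")}`,
      );
    }
  }
  return config;
}

// The user turn in the usage example contains the <|audio|> placeholder.
// This helper appends it when the instruction does not already include it.
const AUDIO_PLACEHOLDER = "<|audio|>";

function buildMessages(instruction, systemPrompt = "You are a helpful assistant.") {
  const userContent = instruction.includes(AUDIO_PLACEHOLDER)
    ? instruction
    : `${instruction}${AUDIO_PLACEHOLDER}`;
  return [
    { role: "system", content: systemPrompt },
    { role: "user", content: userContent },
  ];
}
```

With these in place, the options object passed to `UltravoxModel.from_pretrained` could be written as `{ dtype: buildDtypeConfig({ audio_encoder: "fp16" }) }`, and `buildMessages("Transcribe this audio:")` yields the same `messages` array as the example above.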

