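Facebook's Wav2Vec2-large model, pretrained on Romance-language speech from the VoxPopuli corpus (101.5 hours of unlabeled data). It needs a tokenizer and fine-tuning before it can be used for speech recognition, and it expects 16 kHz audio input. (This summary was generated by AI.)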
language: romance
tags:
- audio
- automatic-speech-recognition
- voxpopuli-v2
datasets:
- voxpopuli
license: cc-by-nc-4.0
inference: false
Wav2Vec2-large-VoxPopuli-V2
Facebook's Wav2Vec2 large model, pretrained only on Romance languages using 101.5 hours of unlabeled data from the VoxPopuli corpus.
The model is pretrained on speech audio sampled at 16 kHz. When using the model, make sure that your speech input is also sampled at 16 kHz.
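One way to check the sampling-rate requirement before feeding audio to the model is to read the rate from the file header. The snippet below is a minimal sketch that assumes the input is an uncompressed WAV file and uses only Python's standard `wave` module. The helper names `read_sample_rate` and `needs_resampling` are illustrative and not part of the model or any library.

```python
import wave

# The sampling rate the model was pretrained on, per the note above.
EXPECTED_RATE = 16_000


def read_sample_rate(path):
    """Return the sampling rate (in Hz) stored in a WAV file header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def needs_resampling(path, expected=EXPECTED_RATE):
    """True if the file's sampling rate differs from the expected rate."""
    return read_sample_rate(path) != expected
```

Files for which `needs_resampling` returns `True` would have to be resampled with an audio library of your choice before use. This sketch only detects the mismatch and does not resample.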
Note: This model does not have a tokenizer, as it was pretrained on audio alone. To use it for speech recognition, a tokenizer must be created and the model fine-tuned on labeled text data in Romance languages. Check out this blog for a more detailed explanation of how to fine-tune the model.
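As a rough illustration of the "create a tokenizer" step, a character-level vocabulary can be built from the transcripts of the labeled data. The sketch below uses only the standard library. The function names, the `[UNK]` and `[PAD]` token names, and the use of `|` as a word delimiter are assumptions chosen for illustration, not something this model card prescribes. It also assumes the transcripts themselves do not contain the `|` character.

```python
import json


def build_char_vocab(transcripts):
    """Map every character seen in the transcripts to an integer id."""
    chars = sorted(set("".join(transcripts)))
    vocab = {ch: idx for idx, ch in enumerate(chars)}
    # Replace the space with a visible word delimiter, keeping its id.
    if " " in vocab:
        vocab["|"] = vocab.pop(" ")
    # Reserve ids for unknown characters and for padding.
    vocab["[UNK]"] = len(vocab)
    vocab["[PAD]"] = len(vocab)
    return vocab


def save_vocab(vocab, path):
    """Write the vocabulary to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(vocab, f, ensure_ascii=False)
```

A file written this way could then be loaded by a CTC tokenizer class, for example `Wav2Vec2CTCTokenizer` from the `transformers` library, with matching `unk_token`, `pad_token`, and `word_delimiter_token` arguments. The fine-tuning itself is outside the scope of this sketch.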
Authors: Changhan Wang, Morgane Riviere, Ann Lee, Anne Wu, Chaitanya Talnikar, Daniel Haziza, Mary Williamson, Juan Pino, Emmanuel Dupoux from Facebook AI.
See the official website for more information, here.