Video Caption

Most video data does not come with descriptive text, so the videos must first be converted into textual descriptions to provide the training data that text-to-video models require.

Video Caption via CogVLM2-Video

🤗 Hugging Face | 🤖 ModelScope | 📑 Blog | 💬 Online Demo

CogVLM2-Video is a versatile video understanding model with timestamp-based question answering capabilities. Given a prompt such as "Please describe this video in detail.", the model returns a detailed video caption.

Users can use the provided code to load the model or configure a RESTful API to generate video captions.
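
As a rough illustration of the RESTful route, the sketch below builds and sends a caption request with the Python standard library. It is not the API shipped with this repository: the endpoint URL, the JSON field names (`video_path`, `prompt`) and the `caption` response key are all assumptions made for the example, and should be adjusted to match whatever service you actually deploy.

```python
import json
import urllib.request

# Prompt taken from the description above.
DEFAULT_PROMPT = "Please describe this video in detail."

# Hypothetical endpoint; replace with the address of your own deployment.
DEFAULT_URL = "http://127.0.0.1:8000/caption"


def build_caption_request(video_path, prompt=DEFAULT_PROMPT, url=DEFAULT_URL):
    """Build (but do not send) a POST request for a caption service.

    The payload layout is an assumption, not a documented schema.
    """
    payload = json.dumps({"video_path": video_path, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def request_caption(video_path, prompt=DEFAULT_PROMPT, url=DEFAULT_URL, timeout=600):
    """Send the request and return the caption text.

    Assumes the server replies with JSON containing a "caption" field.
    """
    req = build_caption_request(video_path, prompt, url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["caption"]
```

With a server running, `request_caption("demo.mp4")` would return the caption string. Keeping the request construction separate from the network call makes the payload easy to inspect or adapt before anything is sent.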