kalosm/fusor-ml/cli · fl/kalosm - AtomGit

GitHub: Add support for Qwen 2.5 Vision (#382)

fc1b9f7d, created on May 22, 2025

File	Last commit	Last updated
src	Add support for Qwen 2.5 Vision (#382) * implement qwen vision embed and patch merger * implement qwen vision block * calculate the rope index of images and videos * add get_window_index * fix get window index * unwrap less * Create media source api * integrate the new media support into the language model trait * Create QwenVisionTransformer * implement QwenVisionTransformer::forward * fix formatting * fix loading qwen 2.5 vl * fix rot_pos_emb * add image preprocessing utilities * fix vision rope * fix mask * Fix feed forward * qwen vision forward working * unwrap less * clean up * create tensor tools cli * fix cli * fix fuse tokenizer * move parse into its own module * Use llama.cpp compatible tensor names * add preset * load qwen vision metadata from the gguf file * fix loading the vision encoder * test process image * forward eps and add more tests * fix image processing * implement image chat templating * full pipeline running * fix formatting * use 3d rope index * fix dimension_sections decoding * qwen vl rope working * remove logs * fix rope tests * fix rope size * fix rope index to tensor conversion * Fix rope updates * normalize image input * match image resize behavior * fix fullatt_block calculation * vision model works * remove logs * add more qwen vl presets * fix some clippy lints * fix clippy * Fix ToChatMessage * expose image processing hints * remove unwraps * fix unwraps in tests * fix more examples	11 months ago
Cargo.toml	Add support for Qwen 2.5 Vision (#382)	11 months ago