| Add support for Qwen 2.5 Vision (#382)
* implement qwen vision embed and patch merger
* implement qwen vision block
* calculate the rope index of images and videos
* add get_window_index
* fix get window index
* unwrap less
* Create media source api
* integrate the new media support into the language model trait
* Create QwenVisionTransformer
* implement QwenVisionTransformer::forward
* fix formatting
* fix loading qwen 2.5 vl
* fix rot_pos_emb
* add image preprocessing utilities
* fix vision rope
* fix mask
* Fix feed forward
* qwen vision forward working
* unwrap less
* clean up
* create tensor tools cli
* fix cli
* fix fuse tokenizer
* move parse into its own module
* Use llama.cpp compatible tensor names
* add preset
* load qwen vision metadata from the gguf file
* fix loading the vision encoder
* test process image
* forward eps and add more tests
* fix image processing
* implement image chat templating
* full pipeline running
* fix formatting
* use 3d rope index
* fix dimension_sections decoding
* qwen vl rope working
* remove logs
* fix rope tests
* fix rope size
* fix rope index to tensor conversion
* Fix rope updates
* normalize image input
* match image resize behavior
* fix fullatt_block calculation
* vision model works
* remove logs
* add more qwen vl presets
* fix some clippy lints
* fix clippy
* Fix ToChatMessage
* expose image processing hints
* remove unwraps
* fix unwraps in tests
* fix more examples | 11 months ago |