
BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing

Paper, Demo Site, Video

This repo hosts the official implementation of BLIP-Diffusion, a text-to-image diffusion model with built-in support for multimodal subject-and-text conditioning. BLIP-Diffusion enables zero-shot subject-driven generation and efficient fine-tuning for customized subjects with up to 20x speedup. In addition, BLIP-Diffusion can be flexibly combined with ControlNet and prompt-to-prompt to enable novel subject-driven generation and editing applications.

Installation

Install the LAVIS library from source:

pip install -e .

Notebook Examples

  • Subject-driven Generation:

    • zero-shot inference: notebook, Open In Colab
    • inference with fine-tuned checkpoint: notebook, Open In Colab
  • Structure-Controlled Generation / Stylization: notebook, Open In Colab

  • Subject-driven Editing:

    • editing a synthetic image:
      • First generate an image, then edit the image with the specified subject visuals: notebook, Open In Colab
    • editing a real image with DDIM inversion:
      • zero-shot inference: notebook, Open In Colab
      • inference with fine-tuned checkpoint: notebook, Open In Colab
  • Virtual Try-On via Subject-driven Editing:

    • the model can naturally facilitate virtual try-on. We provide a zero-shot example: notebook, Open In Colab

Cite BLIP-Diffusion

If you find our work helpful, please consider citing:

@article{li2023blip,
  title={BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing},
  author={Li, Dongxu and Li, Junnan and Hoi, Steven CH},
  journal={arXiv preprint arXiv:2305.14720},
  year={2023}
}

@inproceedings{li2023lavis,
  title={LAVIS: A One-stop Library for Language-Vision Intelligence},
  author={Li, Dongxu and Li, Junnan and Le, Hung and Wang, Guangsen and Savarese, Silvio and Hoi, Steven CH},
  booktitle={Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)},
  pages={31--41},
  year={2023}
}