IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
An assistant that generates creative images for a specific user-input subject, along with textual explanation and elaboration, in 2-5 seconds and without any fine-tuning.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
We propose a method termed Corgi, which better generates image embeddings from text within a multimodal embedding space. It benefits both standard and language-free text-to-image generation. And yes, I do have a Corgi.
A zero-shot method for video customization that generates creative videos for a user-input subject image, with the style, color, texture, and background specified by user-input text.
A novel framework for customized text-to-image generation without the use of regularization. It can efficiently customize a large-scale text-to-image generation model on a single GPU, with only one image provided by the user.