Yufan Zhou

Research Scientist at Luma AI

Email: my full name without last u AT gmail DOT com

Bio

I'm a Research Scientist at Luma AI, working on multimodal generative models. My work focuses on controllable image and video generation and editing, including reference-based generation/editing, video-to-video generation, and scalable dataset construction.

Previously, I was a Research Scientist at Adobe Research. I obtained my Ph.D. from the Department of Computer Science and Engineering, University at Buffalo, under the supervision of Prof. Jinhui Xu and Prof. Changyou Chen. I received my B.E. degree from Zhejiang University. I worked as a Research Intern with Chunyuan Li (Microsoft), Ruiyi Zhang (Adobe), and Bingchen Liu (ByteDance).

NEWS

Uni-1.1 released
Uni-1 released
Ray3 Modify released
Two papers accepted by ICLR 2025
One paper accepted by AAAI 2025
One paper accepted by ACL 2024
Two papers accepted by CVPR 2024
One paper accepted by ICLR 2024

Selected Work [Publications]

Uni-1.1 · Luma AI

I designed a framework to improve controllable image generation, including image editing and reference-based generation. At the time of release, Uni-1.1 achieved competitive performance on arena.ai, behind only OpenAI's GPT Image 1.5/2 and Google's Nano Banana Pro/2 in single-image editing and multi-image editing.

Uni-1 · Luma AI

Our unified multimodal generative model. I'm the DRI (directly responsible individual) for reference-based generation, which lets users create images with up to 10+ arbitrary reference images: characters, objects, backgrounds, styles, patterns, and more. My work included running training experiments, dataset construction, and dataset captioning/annotation/filtering.

Ray3 Modify · Luma AI

I'm the DRI for reference-based generation. Character reference and reference-based video-to-video generation are features I delivered to product: the user can input a reference image, keyframes, and a source video to produce a target video. My work includes dataset construction, dataset captioning/annotation/filtering, and model training for the reference-based t2v/i2v/v2v features.

Image Blend · Luma AI

A feature that lets users generate creative images that are difficult to describe in natural language, by exploring the latent space.

Service

Reviewer for the conferences NeurIPS, ICML, ICLR, CVPR, ECCV, AAAI, AISTATS, IJCAI, EMNLP, and ACL, and for the journals TPAMI, TNNLS, and TCSVT.