Research
Figure 1. Lvmin Zhang looking at his looooooong TODO list at 1:30 am.
As shown in Fig. 1, Lvmin Zhang (Lyumin Zhang) has been a Ph.D. candidate in Computer Science at Stanford University since 2022, advised by Prof. Maneesh Agrawala. Before that, he was a Research Assistant in the lab of Prof. Tien-Tsin Wong at the Chinese University of Hong Kong starting in 2021. He has also collaborated with Prof. Edgar Simo-Serra on many interesting projects. He received his B.Eng. from Soochow University in 2021.
I want to find better ways to control and interact with computation mechanisms, and find better computation mechanisms to serve people. I want the world to have more prior knowledge, though I sadly agree that posteriors are equally important. I love intuitive things. I love observations. I love trade-offs.
lvmin AT cs.stanford.edu / lvminzhang AT acm.org / lllyasviel (GitHub) / Google Scholar (likely outdated)

FramePack: Frame Context Packing and Drift Prevention in Next-Frame-Prediction Video Diffusion Models
Lvmin Zhang, Shengqu Cai, Muyang Li, Gordon Wetzstein, Maneesh Agrawala
FramePack is a neural network structure for next-frame video prediction. It compresses input frames based on importance, enabling longer contexts by packing more frames into a fixed length. Drift prevention methods reduce error accumulation during generation.
ArXiv v1 / ArXiv P1 (coming soon) / FramePack / FramePack-P1 / Code
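For intuition, here is a minimal sketch of the frame-packing idea (not the released FramePack code): each past frame is assumed to be a latent tensor, and frames further from the prediction target are pooled more aggressively under an illustrative geometric schedule, so the total context length stays bounded regardless of history length. The function and schedule names are placeholders.

```python
# Illustrative sketch of frame-context packing (placeholder names, not the released code).
import math
import torch
import torch.nn.functional as F

def pack_frame_context(frame_latents, base_tokens=1024):
    """frame_latents[0] is the most recent frame; older frames follow."""
    packed = []
    for distance, latent in enumerate(frame_latents):
        c, h, w = latent.shape
        # Geometric schedule: each step back in time halves the token budget.
        budget = max(base_tokens // (2 ** distance), 16)
        # Pool spatially until roughly `budget` tokens remain (illustrative only).
        scale = max(math.ceil((h * w / budget) ** 0.5), 1)
        pooled = F.avg_pool2d(latent.unsqueeze(0), kernel_size=scale, stride=scale)
        packed.append(pooled.flatten(2).squeeze(0).T)  # (tokens, C)
    return torch.cat(packed, dim=0)  # bounded-length context for the next-frame predictor

# Example: 8 past frames, each a 16x64x64 latent.
context = pack_frame_context([torch.randn(16, 64, 64) for _ in range(8)])
print(context.shape)
```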
PaintsAlter: Generating Past and Future in Digital Painting Processes
ACM Transactions on Graphics (SIGGRAPH 2025)
Lvmin Zhang, Chuan Yan, Yuwei Guo, Jinbo Xing, Maneesh Agrawala
The PaintsAlter framework generates past and future drawing process states from a single user canvas image. It repurposes video diffusion models to learn a set-to-set mapping, enabling generation of non-contiguous preceding or succeeding drawing steps.
PaintsUndo / PaintsAlter / Code
IC-Light: Scaling In-the-Wild Training for Diffusion-based Illumination Harmonization and Editing by Imposing Consistent Light Transport
International Conference on Learning Representations (ICLR) 2025 Oral (1%)
Lvmin Zhang, Anyi Rao and Maneesh Agrawala
IC-Light is a training method for diffusion-based relighting models. It imposes a consistent light transport principle, based on linear light blending, to enable scalable training that modifies only illumination while preserving intrinsic image properties. OpenReview / Code
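As a rough illustration of the consistent-light-transport constraint, the sketch below writes it as an extra loss term, assuming a hypothetical relighting model `relight(image, light)` that renders the image under a given lighting condition and that lighting conditions blend linearly; all names are placeholders, not the paper's implementation.

```python
# Sketch of a light-transport consistency penalty (placeholder names).
import torch

def light_transport_consistency_loss(relight, image, light_a, light_b):
    """In linear (HDR-like) space, the appearance under mixed lighting should equal
    the blend of the appearances under each light source alone."""
    mixed = relight(image, light_a + light_b)           # render under combined light
    blended = relight(image, light_a) + relight(image, light_b)
    return torch.mean((mixed - blended) ** 2)           # penalize any inconsistency
```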
Transparent Image Layer Diffusion using Latent Transparency
(LayerDiffuse)
ACM Transactions on Graphics (SIGGRAPH 2024)
Lvmin Zhang and Maneesh Agrawala
LayerDiffuse is an approach for generating transparent images or layers with latent diffusion models. It introduces a "latent transparency" offset, encoding the alpha channel into the latent space via finetuning while preserving original model quality. ArXiv / Code
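A minimal sketch of the latent-transparency idea follows, assuming a frozen Stable-Diffusion-style VAE (`vae`) plus two small learned networks: a `transparency_encoder` that maps an RGBA input to a latent offset, and a `transparency_decoder` that recovers alpha from the adjusted latent. These names are placeholders, not the released API.

```python
# Sketch of "latent transparency" as a latent offset (placeholder names).
import torch

def encode_transparent(vae, transparency_encoder, rgb, alpha):
    latent = vae.encode(rgb)                          # latent of the RGB content
    offset = transparency_encoder(rgb, alpha)         # small perturbation carrying alpha
    return latent + offset                            # adjusted latent stays close to the
                                                      # original latent distribution, so the
                                                      # frozen decoder keeps working

def decode_transparent(vae, transparency_decoder, adjusted_latent):
    rgb = vae.decode(adjusted_latent)                 # RGB quality preserved
    alpha = transparency_decoder(adjusted_latent)     # alpha recovered from the offset
    return rgb, alpha
```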
Adding Conditional Control to Text-to-Image Diffusion Models
(ControlNet)
International Conference on Computer Vision (ICCV) 2023 Best Paper (Marr Prize)
Lvmin Zhang, Anyi Rao and Maneesh Agrawala
ControlNet is a neural network architecture adding spatial conditioning to large text-to-image models. It locks a pretrained model's weights while a trainable copy, connected via zero convolutions, learns controls like edges, depth, and human pose. ArXiv / Code / Code(v1.1)
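The wiring can be sketched as below (simplified; see the released code for the real architecture): a frozen pretrained block is paired with a trainable copy whose output enters through a zero-initialized 1x1 convolution, so training starts as a no-op and gradually learns the spatial control.

```python
# Simplified sketch of a ControlNet-style block (not the full released architecture).
import copy
import torch
import torch.nn as nn

class ControlNetBlock(nn.Module):
    def __init__(self, pretrained_block, channels):
        super().__init__()
        self.locked = pretrained_block                      # weights stay frozen
        for p in self.locked.parameters():
            p.requires_grad_(False)
        self.trainable = copy.deepcopy(pretrained_block)    # trainable copy
        self.zero_conv = nn.Conv2d(channels, channels, 1)   # zero convolution
        nn.init.zeros_(self.zero_conv.weight)
        nn.init.zeros_(self.zero_conv.bias)

    def forward(self, x, condition):
        # At initialization the zero conv outputs 0, so behavior equals the original
        # model; gradients through it let the copy learn the spatial control.
        return self.locked(x) + self.zero_conv(self.trainable(x + condition))
```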
Sprite-from-Sprite: Cartoon Animation Decomposition with Self-supervised Sprite Estimation
ACM Transactions on Graphics (SIGGRAPH ASIA 2022, Journal Track)
Lvmin Zhang, Tien-Tsin Wong, and Yuxin Liu
The "sprites" in real-world cartoons are unique: artists may draw arbitrary sprite animations for expressiveness, or alternatively, artists may also reduce their workload by tweening and adjusting contents. Can we use these properties to do a "reverse engineering" to get the original sprites in digital animation? Know more ...
SmartShadow: Artistic Shadow Drawing Tool for Line Drawings
IEEE International Conference on Computer Vision (ICCV) 2021 Oral (3%)
Lvmin Zhang, Jinyue Jiang, Yi Ji, and Chunping Liu
A flexible shadow drawing tool for line drawings, supporting interactive editing of cartoon-style shadows. Know more ...
Generating Digital Painting Lighting Effects via RGB-space Geometry
ACM Transactions on Graphics (Presented in SIGGRAPH 2020)
Lvmin Zhang, Edgar Simo-Serra, Yi Ji, and Chunping Liu
A project investigating how artists apply lighting effects to their artworks and how we can assist that workflow. The main idea is to observe artists' real painting behaviors and procedures so that we can model the illumination in their artworks. Know more ...
Generating Manga from Illustrations via Mimicking Manga Workflow
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021
Lvmin Zhang, Xinrui Wang, Qingnan Fan, Yi Ji, and Chunping Liu
We propose a data-driven framework to convert a digital illustration into three corresponding components: manga line drawing, regular screentone, and irregular screen texture. These components can be directly composed into manga images and can be further retouched for richer manga creation. Know more ...
User-Guided Line Art Flat Filling with Split Filling Mechanism
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021
Lvmin Zhang, Chengze Li, Edgar Simo-Serra, Yi Ji, Tien-Tsin Wong, and Chunping Liu
We present a deep learning framework for user-guided line art flat filling that explicitly controls the "influence areas" of the user's colour scribbles, i.e., the areas where each scribble should propagate and take effect. This lets users manipulate the colours of image details and avoid colour contamination between scribbles, while data-driven colour generation facilitates content creation. Know more ...
DanbooRegion: An Illustration Region Dataset
European Conference on Computer Vision (ECCV) 2020
Lvmin Zhang, Yi Ji, and Chunping Liu
Regions are a fundamental element of various cartoon animation techniques and artistic painting applications, and achieving satisfactory regions is essential to their success. To support diverse region-based cartoon applications, we use a semi-automatic method to annotate regions in in-the-wild artworks. Know more ...
Erasing Appearance Preservation in Optimization-based Smoothing
European Conference on Computer Vision (ECCV) 2020 Spotlight (5%)
Lvmin Zhang, Chengze Li, Yi Ji, Chunping Liu, and Tien-Tsin Wong
Optimization-based smoothing can be formulated as a smoothing energy plus an appearance preservation energy. We show that partially "erasing" the appearance preservation energy facilitates the smoothing. Know more ...
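For context, optimization-based smoothing is commonly written in the schematic form below (generic notation, not necessarily the paper's): the output I balances staying close to the input I* against being smooth, and "erasing" corresponds to dropping the appearance preservation term at selected pixels.

```latex
% Generic optimization-based smoothing objective (schematic):
% an appearance preservation term plus a smoothing term; "erasing" sets m_p = 0
% at chosen pixels so only the smoothing energy acts there.
\[
  I = \arg\min_{I} \; \sum_{p} m_p \,\bigl(I_p - I^{*}_{p}\bigr)^{2}
      \;+\; \lambda \sum_{(p,q)\in\mathcal{N}} w_{p,q}\,\bigl(I_p - I_q\bigr)^{2},
  \qquad m_p \in \{0, 1\}.
\]
```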
Two-stage Sketch Colorization
ACM Transactions on Graphics (SIGGRAPH Asia 2018)
Lvmin Zhang, Chengze Li, Tien-Tsin Wong, Yi Ji, and Chunping Liu
With the advances of neural networks, automatic and semi-automatic colorization of sketches has become feasible and practical. We present a state-of-the-art semi-automatic (as well as automatic) colorization method for line art. Our improvement comes from a divide-and-conquer scheme: we split this complex colorization task into two simpler, better-defined subtasks, drafting and refinement. Know more ...

Misc: Lvmin worked with digital painting artists for many years and founded the organization Style2Paints Research. Many years ago (maybe before college) Lvmin was a game developer and a pro YGO player and developed YGOPro2 (Google / YouTube). Lvmin has anonymous activities in some cracking communities (mainly Denuvo and some VMs). Lvmin has anonymous activities on AO3.



