MonoGaussianAvatar: Monocular Gaussian Point-based Head Avatar

SIGGRAPH 2024


Yufan Chen1,†, Lizhen Wang2, Qijing Li2, Hongjiang Xiao3, Shengping Zhang1*, Hongxun Yao1, Yebin Liu2

1Harbin Institute of Technology    2Tsinghua University    3Communication University of China
*Corresponding author     †Work done during an internship at Tsinghua University

Abstract


The ability to animate photo-realistic head avatars reconstructed from monocular portrait video sequences represents a crucial step in bridging the gap between the virtual and real worlds. Recent head avatar techniques, including explicit 3D morphable models (3DMMs), point clouds, and neural implicit representations, have been explored for this task. However, 3DMM-based methods are constrained by their fixed topology, point-based approaches suffer from a heavy training burden due to the extensive quantity of points involved, and neural implicit representations are limited in deformation flexibility and rendering efficiency. In response to these challenges, we propose MonoGaussianAvatar (Monocular Gaussian Point-based Head Avatar), a novel approach that harnesses a 3D Gaussian point representation coupled with a Gaussian deformation field to learn explicit head avatars from monocular portrait videos. We define our head avatars with Gaussian points of adaptable shape, enabling flexible topology. These points move with a Gaussian deformation field in alignment with the target pose and expression of a person, facilitating efficient deformation. Additionally, each Gaussian point has a controllable shape, size, color, and opacity and is rendered with Gaussian splatting, allowing for efficient training and rendering. Experiments demonstrate the superior performance of our method, which achieves state-of-the-art results compared with previous methods.
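To make the representation concrete, below is a minimal PyTorch-style sketch of a learnable 3D Gaussian point set with the per-point attributes the abstract lists (position, color, opacity, rotation, scale). All names (GaussianPoints, activated, the initialization constants) are our own illustrative assumptions, not the authors' released code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GaussianPoints(nn.Module):
    """Hypothetical sketch: a learnable set of 3D Gaussian points with
    mean position, color, opacity, rotation (quaternion), and scale."""

    def __init__(self, num_points: int):
        super().__init__()
        self.xyz = nn.Parameter(torch.randn(num_points, 3) * 0.01)   # mean positions
        self.color = nn.Parameter(torch.zeros(num_points, 3))        # pre-sigmoid RGB
        self.opacity = nn.Parameter(torch.zeros(num_points, 1))      # pre-sigmoid
        quat = torch.cat([torch.ones(num_points, 1),
                          torch.zeros(num_points, 3)], dim=1)        # identity quaternion
        self.rotation = nn.Parameter(quat)
        self.scale = nn.Parameter(torch.full((num_points, 3), -4.0)) # log-scale

    def activated(self):
        """Return rasterizer-ready parameters with standard activations."""
        return dict(
            xyz=self.xyz,
            color=torch.sigmoid(self.color),
            opacity=torch.sigmoid(self.opacity),
            rotation=F.normalize(self.rotation, dim=-1),
            scale=torch.exp(self.scale),
        )

Because every attribute is an nn.Parameter, the whole point set can be optimized end-to-end against rendered images, which is what makes this representation attractive for training from monocular video.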


Animation


Self-driven Animation
Cross-identity Reenactment

Method



Overview of MonoGaussianAvatar. We model the human head as a set of learned, parametrically deformed 3D Gaussian points, each comprising a mean position, color, opacity, rotation, and scale. These parameters collectively characterize the subject's geometry and intrinsic appearance in the deformed space. The left module details the initialization of our Gaussian representation: how we obtain the initial mean positions, derive the other Gaussian parameters, and transform the mean position xc from the initialization space to the canonical space. The middle module introduces the transformation of the mean positions from the canonical space to the deformed space using linear blend skinning (LBS), and describes how the other Gaussian parameters in the deformed space are adjusted to fit this transformation through the Gaussian parameter deformation field. The top-right module presents the rendering process with the 3D Gaussian parameters in the deformed space. The bottom-right module illustrates the point insertion and deletion strategy.
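The sketch below illustrates the middle and bottom-right modules, building on the hypothetical GaussianPoints class above: mean positions are moved from canonical to deformed space by an LBS function, a small MLP deformation field adjusts the remaining Gaussian parameters, and low-opacity points are pruned. DeformationField, lbs_transform, prune_points, and all dimensions are assumptions for illustration; the paper's actual networks and insertion/deletion criteria may differ.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DeformationField(nn.Module):
    """Hypothetical MLP predicting per-point offsets for the non-positional
    Gaussian parameters, conditioned on the canonical position and an
    expression/pose code."""

    def __init__(self, cond_dim: int, hidden: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4 + 3 + 3 + 1),  # d_rotation, d_scale, d_color, d_opacity
        )

    def forward(self, xyz_canonical, code):
        code = code.expand(xyz_canonical.shape[0], -1)
        out = self.mlp(torch.cat([xyz_canonical, code], dim=-1))
        return out.split([4, 3, 3, 1], dim=-1)

def deform_gaussians(points, field, lbs_transform, code):
    """Move mean positions canonical -> deformed, then adjust the rest."""
    p = points.activated()
    xyz_deformed = lbs_transform(p["xyz"], code)  # assumed FLAME-style LBS function
    d_rot, d_scale, d_color, d_opa = field(p["xyz"], code)
    return dict(
        xyz=xyz_deformed,
        rotation=F.normalize(p["rotation"] + d_rot, dim=-1),
        scale=p["scale"] * torch.exp(d_scale),
        color=(p["color"] + d_color).clamp(0.0, 1.0),
        opacity=(p["opacity"] + d_opa).clamp(0.0, 1.0),
    )

@torch.no_grad()
def prune_points(points, opacity_thresh: float = 0.02):
    """Deletion sketch: drop points whose opacity falls below a threshold;
    insertion would analogously add points in under-covered regions."""
    keep = torch.sigmoid(points.opacity).squeeze(-1) > opacity_thresh
    for name in ("xyz", "color", "opacity", "rotation", "scale"):
        setattr(points, name, nn.Parameter(getattr(points, name)[keep].clone()))

Splitting the motion into an LBS step for positions and a learned field for the remaining parameters mirrors the figure's middle module: the skinning handles large pose-driven motion, while the field captures how appearance and shape attributes should co-vary with that motion.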



Demo Video



Citation



@article{chen2023monogaussianavatar,
  title={MonoGaussianAvatar: Monocular Gaussian Point-based Head Avatar},
  author={Chen, Yufan and Wang, Lizhen and Li, Qijing and Xiao, Hongjiang and Zhang, Shengping and Yao, Hongxun and Liu, Yebin},
  journal={arXiv},
  year={2023}
}