Researchers presented Vid2Avatar, a method that reconstructs detailed 3D human avatars from monocular in-the-wild videos via self-supervised scene decomposition.
It models the human and the background jointly, each parameterized by a separate neural field, and produces highly detailed 3D reconstructions without relying on external segmentation modules.
"We define a temporally consistent human representation in canonical space and formulate a global optimization over the background model, the canonical human shape and texture, and per-frame human pose parameters. A coarse-to-fine sampling strategy for volume rendering and novel objectives are introduced for a clean separation of dynamic human and static background, yielding detailed and robust 3D human geometry reconstructions."
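The core idea of rendering a human field and a background field together can be illustrated with a minimal toy sketch. The fields below are hypothetical stand-ins (a soft sphere for the human, a uniform haze for the background), not the paper's neural networks; only the compositing logic, where per-sample densities add and colors are density-weighted, reflects the general decomposition idea.

```python
import numpy as np

def human_field(x):
    """Toy stand-in for a dynamic human field: a soft sphere at the origin."""
    d = np.linalg.norm(x, axis=-1)
    density = 5.0 * np.exp(-(d / 0.5) ** 2)          # dense near the center
    color = np.ones_like(x) * np.array([0.8, 0.6, 0.5])  # skin-like constant color
    return density, color

def background_field(x):
    """Toy stand-in for a static background field: uniform thin haze."""
    density = np.full(x.shape[:-1], 0.05)
    color = np.ones_like(x) * np.array([0.2, 0.4, 0.9])  # sky-like constant color
    return density, color

def render_ray(origin, direction, t_near=0.0, t_far=4.0, n_samples=64):
    """Alpha-composite samples from both fields along a single ray."""
    t = np.linspace(t_near, t_far, n_samples)
    pts = origin + t[:, None] * direction
    sigma_h, c_h = human_field(pts)
    sigma_b, c_b = background_field(pts)
    sigma = sigma_h + sigma_b                        # densities add at each sample
    # density-weighted mixture of the two colors at each sample
    color = (sigma_h[:, None] * c_h + sigma_b[:, None] * c_b) / (sigma[:, None] + 1e-10)
    delta = t[1] - t[0]
    alpha = 1.0 - np.exp(-sigma * delta)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = trans * alpha
    return (weights[:, None] * color).sum(axis=0)

# A ray through the "human" sphere: the human field dominates the pixel color.
pixel = render_ray(np.array([0.0, 0.0, -2.0]), np.array([0.0, 0.0, 1.0]))
```

Rays that hit the dense human region return mostly the human color, while rays that miss it return only the faint background, which is the separation the method's optimization exploits.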
You can learn more about the research here. The code should also be posted on GitHub soon.