Beyond Prompts: Unconditional 3D Inversion for Out-of-Distribution Shapes

Published:

Victoria Yue Chen, Emery Pierson, Léopold Maillard, Maks Ovsjanikov



TL;DR

We study text-driven inversion of 3D generative models.
We identify sink traps: during generation, the model can become insensitive to the prompt and produce a single shape regardless of the text.
Despite this property, the models retain strong geometric expressiveness in their unconditional distribution. We demonstrate the utility of this finding in a pose retargeting application on various out-of-distribution shapes.

Expressivity of Language and Geometry


Sink trap examples. Given various descriptions of a character (dancing girl, surgeon, labrador, astronaut, scary wolf) in different poses, we generate multiple assets using TRELLIS. However, we observe a mode collapse: the results are highly similar even though the prompts describe different actions.


Geometric expressivity. The velocity norm of Flux remains stable across different prompt types, whereas that of TRELLIS varies widely across language prompts. This is not the case with unconditional prompts.
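To make the velocity-norm comparison concrete, here is a minimal sketch of how such a diagnostic could be computed for any flow-based sampler. It is an illustration only and not the measurement code behind the figure: `velocity_norm_profile`, `prompt_sensitivity`, and the two toy fields are hypothetical names, and a real experiment would replace the toy fields with the velocity predictions of Flux or TRELLIS.

```python
import numpy as np

def velocity_norm_profile(velocity_fn, x0, cond, num_steps=20):
    """Euler-integrate dx/dt = v(x, t, cond) from t=0 to t=1 and
    record the L2 norm of the velocity at every step."""
    x = x0.copy()
    dt = 1.0 / num_steps
    norms = []
    for i in range(num_steps):
        t = i * dt
        v = velocity_fn(x, t, cond)
        norms.append(float(np.linalg.norm(v)))
        x = x + dt * v
    return np.array(norms)

def prompt_sensitivity(velocity_fn, x0, conds, num_steps=20):
    """Coefficient of variation of the mean velocity norm across conditions.
    Values near 0 mean the norm is stable across prompts."""
    means = np.array([
        velocity_norm_profile(velocity_fn, x0, c, num_steps).mean()
        for c in conds
    ])
    return float(means.std() / means.mean())

# Hypothetical toy fields standing in for real models (assumptions).
def stable_field(x, t, cond):
    # Velocity scale does not depend on the condition.
    return -x

def sensitive_field(x, t, cond):
    # Velocity scale depends strongly on the condition.
    return cond - x

rng = np.random.default_rng(0)
x0 = rng.normal(size=8)
conds = [k * np.ones(8) for k in (0.5, 2.0, 8.0)]

stable_score = prompt_sensitivity(stable_field, x0, conds)
sensitive_score = prompt_sensitivity(sensitive_field, x0, conds)
print(stable_score, sensitive_score)
```

The coefficient of variation is one possible summary statistic among many; it is used here because it is scale-free, so models with different latent sizes can be compared.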

Application: inversion-based character retargeting

Interactive viewer (two rows). Columns: Character + prompt (normal-shaded source mesh), Edit 1 (textured), Edit 2 (textured).

This application block shows two VTK.js comparison rows, one for prompt_maria and one for prompt_chicken, combining normal-shaded source meshes with textured VTK edit views.
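For readers unfamiliar with inversion-based editing, the sketch below shows the generic idea on a toy flow model: integrate the velocity field backward in time without a prompt to recover a noise latent, then integrate forward again, optionally with an edit condition. This is not the authors' pipeline. `integrate`, `invert_then_edit`, and `toy_field` are hypothetical names, plain Euler steps are an assumption, and the toy field only stands in for a real 3D generator.

```python
import numpy as np

def integrate(velocity_fn, x, t_start, t_end, num_steps, cond=None):
    """Euler integration of dx/dt = v(x, t, cond) from t_start to t_end.
    A negative step (t_end < t_start) runs the flow backward."""
    dt = (t_end - t_start) / num_steps
    x = x.copy()
    for i in range(num_steps):
        t = t_start + i * dt
        x = x + dt * velocity_fn(x, t, cond)
    return x

def invert_then_edit(velocity_fn, shape_latent, edit_cond, num_steps=1000):
    """Invert a latent to noise with the unconditional field (cond=None),
    then regenerate from that noise under edit_cond."""
    noise = integrate(velocity_fn, shape_latent, 1.0, 0.0, num_steps, cond=None)
    edited = integrate(velocity_fn, noise, 0.0, 1.0, num_steps, cond=edit_cond)
    return noise, edited

# Hypothetical toy velocity field (assumption): pulls x toward the
# condition, or toward the origin when no condition is given.
def toy_field(x, t, cond):
    target = np.zeros_like(x) if cond is None else cond
    return target - x

rng = np.random.default_rng(0)
z = rng.normal(size=8)

# Round trip with no edit should approximately reconstruct the input.
_, recon = invert_then_edit(toy_field, z, edit_cond=None)
# Regenerating with a condition moves the result away from the input.
_, edited = invert_then_edit(toy_field, z, edit_cond=3.0 * np.ones(8))
print(np.abs(recon - z).max(), np.abs(edited - z).max())
```

Euler inversion is only approximate, so the reconstruction error shrinks as `num_steps` grows; the step count used here is chosen for the toy example and says nothing about the settings used in the paper.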

Acknowledgements

TODO

Citation

If you consider our work useful, please cite:

@misc{incoming}

This webpage was inspired by Nerfies.