Skip to Main content Skip to Navigation

Reconstruction et correspondance de formes par apprentissage

Abstract : The goal of this thesis is to develop deep learning approaches to model and analyse 3D shapes. Progress in this field could democratize artistic creation of 3D assets which currently requires time and expert skills with technical software.We focus on the design of deep learning solutions for two particular tasks, key to many 3D modeling applications: single-view reconstruction and shape matching.A single-view reconstruction (SVR) method takes as input a single image and predicts the physical world which produced that image. SVR dates back to the early days of computer vision. In particular, in the 1960s, Lawrence G. Roberts proposed to align simple 3D primitives to the input image under the assumption that the physical world is made of cuboids. Another approach proposed by Berthold Horn in the 1970s is to decompose the input image in intrinsic images and use those to predict the depth of every input pixel.Since several configurations of shapes, texture and illumination can explain the same image, both approaches need to form assumptions on the distribution of images and 3D shapes to resolve the ambiguity. In this thesis, we learn these assumptions from large-scale datasets instead of manually designing them. Learning allows us to perform complete object reconstruction, including parts which are not visible in the input image.Shape matching aims at finding correspondences between 3D objects. Solving this task requires both a local and global understanding of 3D shapes which is hard to achieve explicitly. Instead we train neural networks on large-scale datasets to solve this task and capture this knowledge implicitly through their internal parameters.Shape matching supports many 3D modeling applications such as attribute transfer, automatic rigging for animation, or mesh editing.The first technical contribution of this thesis is a new parametric representation of 3D surfaces modeled by neural networks.The choice of data representation is a critical aspect of any 3D reconstruction algorithm. Until recently, most of the approaches in deep 3D model generation were predicting volumetric voxel grids or point clouds, which are discrete representations. Instead, we present an alternative approach that predicts a parametric surface deformation ie a mapping from a template to a target geometry. To demonstrate the benefits of such a representation, we train a deep encoder-decoder for single-view reconstruction using our new representation. Our approach, dubbed AtlasNet, is the first deep single-view reconstruction approach able to reconstruct meshes from images without relying on an independent post-processing, and can do it at arbitrary resolution without memory issues. A more detailed analysis of AtlasNet reveals it also generalizes better to categories it has not been trained on than other deep 3D generation approaches.Our second main contribution is a novel shape matching approach purely based on reconstruction via deformations. We show that the quality of the shape reconstructions is critical to obtain good correspondences, and therefore introduce a test-time optimization scheme to refine the learned deformations. For humans and other deformable shape categories deviating by a near-isometry, our approach can leverage a shape template and isometric regularization of the surface deformations. As category exhibiting non-isometric variations, such as chairs, do not have a clear template, we learn how to deform any shape into any other and leverage cycle-consistency constraints to learn meaningful correspondences. Our reconstruction-for-matching strategy operates directly on point clouds, is robust to many types of perturbations, and outperforms the state of the art by 15% on dense matching of real human scans
Document type :
Complete list of metadata
Contributor : Abes Star :  Contact
Submitted on : Tuesday, February 9, 2021 - 4:33:23 PM
Last modification on : Tuesday, February 16, 2021 - 8:02:13 AM


Version validated by the jury (STAR)


  • HAL Id : tel-03127055, version 2



Thibault Groueix. Reconstruction et correspondance de formes par apprentissage. Automatique. Université Paris-Est, 2020. Français. ⟨NNT : 2020PESC1024⟩. ⟨tel-03127055v2⟩



Record views


Files downloads