Visual Computing Seminar (Spring 2020)
The Visual Computing Seminar is a weekly seminar series on a variety of topics in the broader area of visual computing.
Why: The motivation for creating this seminar is that EPFL has a critical mass of people working on closely related topics in computational photography, computer graphics, geometry processing, human–computer interaction, computer vision, and signal processing. A weekly point of interaction provides exposure to interesting work in this area and increases awareness of our shared interests and other commonalities, such as the use of similar computational tools. Think of this as the visual computing edition of the “Know thy neighbor” seminar series.
Who: The target audience is faculty, students, and postdocs in the visual computing disciplines, but the seminar is open to anyone, and guests are welcome. There is no need to formally enroll in a course. The format is very flexible and will include 45-minute talks with Q&A, talks by external visitors, and shorter presentations. In particular, the seminar is also intended as a way for students to obtain feedback on shorter (~20-minute) talks preceding a presentation at a conference. If you are a student or postdoc in one of the visual computing disciplines, you’ll probably receive an email from me soon about scheduling a presentation.
Where and when: every Friday in BC410. Food is served at 11:50, and the actual talk starts at 12:15.
How to be notified: If you want to be kept up to date with announcements, please send me an email and I’ll put you on the list. If you are working in LCAV, CVLAB, IVRL, LGG, LSP, IIG, CHILI, LDM or RGL, you are automatically subscribed to future announcements, so there is nothing you need to do.
You may add the seminar events to Google Calendar (click the '+' button in the bottom-right corner), or download the iCal file.
Title: Computational Design of Metamaterials and Deployable Structures [practice job talk]
Title: Shape Reconstruction by Learning Differentiable Surface Representations
Abstract: Generative models that produce point clouds have emerged as a powerful tool to represent 3D surfaces, and the best current ones rely on learning an ensemble of parametric representations. Unfortunately, they offer no control over the deformations of the surface patches that form the ensemble and thus fail to prevent them from either overlapping or collapsing into single points or lines. As a consequence, computing shape properties such as surface normals and curvatures becomes difficult and unreliable. In this paper, we show that we can exploit the inherent differentiability of deep networks to leverage differential surface properties during training so as to prevent patch collapse and strongly reduce patch overlap. Furthermore, this lets us reliably compute quantities such as surface normals and curvatures. We will demonstrate on several tasks that this yields more accurate surface reconstructions than state-of-the-art methods in terms of normal estimation and the number of collapsed and overlapping patches.
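The core idea, that a differentiable patch parameterization gives surface normals essentially for free, can be sketched without any deep-learning machinery. In the illustrative Python snippet below, a hand-written sphere patch stands in for the learned network, and finite differences stand in for the automatic differentiation used in the paper; the function names are mine, not the authors'.

```python
import numpy as np

def patch(u, v):
    # Illustrative parametric surface patch (a unit-sphere octant); in the
    # paper this role is played by a learned network mapping (u, v) -> R^3.
    theta, phi = u * np.pi / 2, v * np.pi / 2
    return np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])

def surface_normal(f, u, v, eps=1e-6):
    # Tangent vectors via finite differences (autodiff in the actual method);
    # the normal is their normalized cross product.
    du = (f(u + eps, v) - f(u - eps, v)) / (2 * eps)
    dv = (f(u, v + eps) - f(u, v - eps)) / (2 * eps)
    n = np.cross(du, dv)
    return n / np.linalg.norm(n)

p = patch(0.3, 0.7)
n = surface_normal(patch, 0.3, 0.7)
# For a unit sphere, the normal is (anti)parallel to the position vector.
print(abs(abs(np.dot(n, p)) - 1.0) < 1e-4)  # True
```

The same construction extends to curvatures via second derivatives, which is what makes patch collapse (degenerate tangents) detectable and penalizable during training.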
Title: Perception in the Action Loop
Abstract: Artificial Intelligence seeks agents that can perceive the world and act accordingly. Despite remarkable progress toward this goal, a fundamental shortcoming persists on the perception front: difficulty in scaling to the complexity of the real world and, consequently, a restriction of the operating domain to perceptually simplified settings (e.g., video games, controlled spaces, tabletop manipulation scenarios). I’ll talk about efforts toward visual perception that could ultimately scale to real-world complexity and support the goals of active agents by going beyond isolated pattern-recognition problems.
I’ll present a method for tractably learning a large set of perception tasks using transfer learning (Taskonomy), toward forming a multi-task compositional perception dictionary. I’ll show how this dictionary can be turned into an intermediate perception module for active robotic agents (Mid-Level Vision), enabling them to improve their sample efficiency and generalization. This is accomplished using both real robots and a virtual environment rooted in real spaces (Gibson Environment). I will conclude by discussing cross-task consistency and quantifying uncertainty in perceptual estimates (X-TaC).
Title: 3D surface reconstruction from image(s)
Abstract: Using convolutional neural networks for single-view reconstruction has recently become a promising and trending topic in geometric deep learning. Presented with an image of a shape, the task is to accurately reconstruct the visible parts of the object and to plausibly hallucinate its unseen portions relying on a learned prior. Typical architectures comprise a CNN encoder and a decoder. Image encoders map a 2D view to a vectorized latent space, while decoders map a latent vector to a 3D shape (in the form of a point cloud, a mesh deformation, the zero-crossing of an implicit function…). To perform well under different viewpoints, the whole architecture has to implicitly learn non-trivial 3D geometric manipulations, which has been seen as a very limiting factor for generalization to unseen shapes and poses. We propose to construct a registered 3D latent space using reverse camera projections. A latent vector consists of a 3D grid aligned with the output object. 2D feature maps and depth maps are pushed into 3D space, and the network is relieved of the burden of localizing spatial features in 3D space. This also allows us to geometrically fuse codes in the latent space and more accurately reconstruct a surface from multiple views.

Our second contribution is a hybrid 3D decoder relying on both voxels and point clouds. A relatively coarse grid of occupancy voxels first predicts a low-resolution approximation of the desired surface. Then, within each activated voxel, a 2D patch is differentiably folded to capture higher-frequency details and smoother curvatures. This hybrid solution exploits both the good spatialization of 3D convolutions and the sparsity of point clouds.
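The reverse camera projection at the heart of the registered latent space can be sketched as follows. This is a minimal Python illustration, not the authors' code: the intrinsics matrix, grid size, and cell size are made-up values, and per-pixel features are lifted to 3D with a simple unprojection loop (a real implementation would vectorize this and use learned feature maps).

```python
import numpy as np

# Hypothetical pinhole intrinsics (focal lengths and principal point are
# illustrative values, not from the talk).
K = np.array([[100.0,   0.0, 32.0],
              [  0.0, 100.0, 32.0],
              [  0.0,   0.0,  1.0]])
K_inv = np.linalg.inv(K)

def unproject(px, py, depth):
    # Reverse camera projection: lift a pixel with known depth to a 3D point.
    ray = K_inv @ np.array([px, py, 1.0])
    return depth * ray

def features_to_grid(feat_map, depth_map, grid_shape, cell=0.1):
    # Push 2D feature vectors into a camera-aligned 3D grid, relieving the
    # network of having to learn where each image feature lives in 3D.
    grid = np.zeros(grid_shape + (feat_map.shape[-1],))
    h, w = depth_map.shape
    for py in range(h):
        for px in range(w):
            x, y, z = unproject(px, py, depth_map[py, px])
            i, j, k = (int(np.floor(c / cell)) for c in (x, y, z))
            if 0 <= i < grid_shape[0] and 0 <= j < grid_shape[1] and 0 <= k < grid_shape[2]:
                grid[i, j, k] += feat_map[py, px]
    return grid

# A pixel at the principal point unprojects straight ahead along the optical axis.
p3d = unproject(32.0, 32.0, 2.0)
print(p3d)  # [0. 0. 2.]
```

Because each view produces a grid in the same 3D frame, codes from multiple views can be fused by simple aggregation (e.g., summation) before decoding.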
Title: Generative models for solving inverse design problems
Abstract: We present a framework that trains a deep generative model to provide multifarious solutions to a given inverse problem. Inverse problems appear in several engineering tasks, where one tries to answer the question “How can I design a structure (usually this implies a choice of parameters) to achieve a certain target performance?” Currently, we study this on the example of deployable beam networks (X-Shells), which are fabricated in a flat configuration but can unfold into a three-dimensional shape. We propose a variant of the Generative Adversarial Nets (GAN) framework wherein the generator is trained to output high-quality creations (e.g., X-Shells) with respect to a given performance measure (e.g., deployability, flatness in the fabrication state, “beauty” of the deployed shape, ...). Since we have limited insight into how to create examples of good X-Shells, our training data samples the space of such structures insufficiently. Thus, we adapt our training framework to rely only on forward simulation of generated structures and drop the use of a dataset. The first version of this framework was highly prone to severe mode collapse, which led us to introduce a novel diversity regularization term into the loss. In addition to encouraging diversity among the creations, this term appears to stabilize GAN training. We demonstrate these results on a toy example, as the use of this framework for X-Shells is still a work in progress.
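One plausible form of such a diversity term is a penalty on small pairwise distances within a generated batch. The abstract does not specify the exact formulation, so the NumPy version below is only an illustrative guess, with made-up sample dimensions.

```python
import numpy as np

def diversity_penalty(samples):
    # Illustrative diversity regularizer: the negative mean pairwise distance
    # among generated structures. Adding this to the generator loss rewards
    # batches whose samples are spread out, discouraging mode collapse.
    n = len(samples)
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            total += np.linalg.norm(samples[i] - samples[j])
            pairs += 1
    return -total / pairs  # more negative = more diverse = lower loss

collapsed = np.zeros((4, 8))                    # all samples identical
diverse = np.random.RandomState(0).randn(4, 8)  # spread-out samples
print(diversity_penalty(collapsed) > diversity_penalty(diverse))  # True
```

In a training loop, this term would be weighted against the performance objective (deployability, flatness, etc.) computed by the forward simulator.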
Title: Specular Manifold Sampling for Rendering High-Frequency Caustics and Glints
Title: Computational Design of Woven Structures
Abstract: Weaving as a traditional craft has been used to efficiently construct three-dimensional surfaces from flat, straight ribbons. A regular weaving pattern usually has interlaced families of parallel ribbons. A classical basket weaving technique is to insert topological singularities in the pattern to produce nonplanar shapes. However, the weave pattern is dictated by the geometry of the target shape under this technique, and topological singularities often produce concentrated curvature.
In this project, we introduce a different approach to control the 3D equilibrium shape of a woven structure: weaving with ribbons that are planar, but curved in their fabricated states. We propose a simulation-based optimization framework that solves for the in-plane rest curvature of each individual ribbon in the weave to produce a rich variety of double-curved shapes. We highlight the capability to precisely approximate complex freeform shapes without imposing constraints on the weave pattern. We demonstrate the effectiveness of the curved ribbon approach and the performance of our optimization framework using simulation, physical fabrication, and measurements from micro-CT scans.
Title: Radiative Backpropagation: An Adjoint Method for Lightning-Fast Differentiable Rendering
Abstract: Physically based differentiable rendering has recently evolved into a powerful tool for solving inverse problems involving light. Methods in this area perform a differentiable simulation of the physical process of light transport and scattering to estimate partial derivatives relating scene parameters to pixels in the rendered image. Together with gradient-based optimization, such algorithms have interesting applications in diverse disciplines, e.g., to improve the reconstruction of 3D scenes, while accounting for interreflection and transparency, or to design metamaterials with specified optical properties.
We introduce Radiative Backpropagation, a fundamentally new approach to differentiable rendering. Rather than relying on automatic differentiation, we recast gradient computation as a continuous transport problem, which can then be solved efficiently. Unlike previous work, our method scales to complex scenes rendered at high resolutions. We demonstrate its efficiency with a GPU implementation which achieves up to three orders of magnitude speedup compared to Mitsuba 2.
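The adjoint idea can be illustrated on a toy scalar "transport" recursion. The sketch below is not radiative backpropagation itself (the actual method avoids storing a full differentiation transcript by recasting the gradient as a transport problem); it merely shows how a gradient can be obtained by propagating a single adjoint variable backward through a bounce recursion, checked against finite differences. All names and constants are illustrative.

```python
import numpy as np

def render(rho, emitted, bounces):
    # Toy "light transport": scalar radiance accumulated over bounces,
    # L_{k+1} = rho * L_k + emitted (standing in for the transport operator).
    L = 0.0
    for _ in range(bounces):
        L = rho * L + emitted
    return L

def grad_adjoint(rho, emitted, bounces, target):
    # Adjoint-style gradient of Loss = (L_N - target)^2 w.r.t. rho:
    # a single adjoint variable is propagated backward through the recursion.
    states = [0.0]
    for _ in range(bounces):
        states.append(rho * states[-1] + emitted)
    adj = 2.0 * (states[-1] - target)   # dLoss/dL_N
    g = 0.0
    for k in range(bounces, 0, -1):
        g += adj * states[k - 1]        # contribution of d(rho * L_{k-1})/drho
        adj *= rho                      # push the adjoint to the previous bounce
    return g

# Verify against central finite differences.
rho, e, n, t = 0.5, 1.0, 5, 2.0
eps = 1e-6
fd = ((render(rho + eps, e, n) - t) ** 2
      - (render(rho - eps, e, n) - t) ** 2) / (2 * eps)
print(abs(grad_adjoint(rho, e, n, t) - fd) < 1e-5)  # True
```

The payoff of the adjoint view in the actual renderer is that the backward pass becomes another light-transport simulation, which scales to complex scenes where taping every operation would be prohibitive.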
Title: Imitative Non-Autoregressive Modeling for Trajectory Forecasting and Imputation
Abstract: Trajectory forecasting and imputation are pivotal steps toward understanding the movement of humans and objects. They are quite challenging, since the future trajectories and missing values in a temporal sequence are full of uncertainties, and the spatio-temporal contextual correlations are hard to model. Yet, the relevance between sequence prediction and imputation is disregarded by existing approaches. To this end, we propose a novel imitative non-autoregressive modeling method to simultaneously handle the trajectory prediction and missing-value imputation tasks. Specifically, our framework adopts an imitation-learning paradigm, which contains a recurrent conditional variational autoencoder (RC-VAE) as a demonstrator and a non-autoregressive transformation model (NART) as a learner. By jointly optimizing the two models, the RC-VAE can predict the future trajectory and capture the temporal relationships in the sequence to supervise the NART learner. As a result, NART learns from the demonstrator and imputes the missing values in a non-autoregressive manner. We conduct extensive experiments on three popular datasets, and the results show that our model achieves state-of-the-art performance across all of them.
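The contrast between autoregressive and non-autoregressive imputation can be made concrete with a toy example, where simple linear interpolation stands in for the learned NART model: every missing entry is filled in one parallel pass conditioned only on the observed values, rather than sequentially on the model's own earlier predictions.

```python
import numpy as np

def impute_non_autoregressive(seq, mask):
    # Non-autoregressive imputation: all missing values are produced in a
    # single parallel pass from the observed entries only (linear
    # interpolation here is just a stand-in for the learned NART model).
    # An autoregressive imputer would instead fill positions one at a time,
    # each conditioned on its own previous outputs.
    t = np.arange(len(seq))
    return np.interp(t, t[mask], seq[mask])

seq = np.array([0.0, np.nan, np.nan, 3.0, np.nan, 5.0])
mask = ~np.isnan(seq)
print(impute_non_autoregressive(seq, mask))  # [0. 1. 2. 3. 4. 5.]
```

Decoding all positions at once is what makes the non-autoregressive learner fast at inference time; the RC-VAE demonstrator supplies the temporal structure it would otherwise lose.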
Title: How to Train Your Super-Net: An Analysis of Training Heuristics in Weight-Sharing NAS
Abstract: Weight sharing promises to make neural architecture search (NAS) tractable even on commodity hardware. Existing methods in this space rely on a diverse set of heuristics to design and train the shared-weight backbone network, a.k.a. the super-net. Since heuristics and hyperparameters substantially vary across different methods, a fair comparison between them can only be achieved by systematically analyzing the influence of these factors. In this paper, we therefore provide a systematic evaluation of the heuristics and hyperparameters that are frequently employed by weight-sharing NAS algorithms. Our analysis uncovers that some commonly-used heuristics for super-net training negatively impact the correlation between super-net and stand-alone performance, and evidences the strong influence of certain hyperparameters and architectural choices. Our code and experiments set a strong and reproducible baseline that future works can build on.
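The weight-sharing mechanism itself is simple to sketch: each layer of the super-net stores one weight tensor per candidate operation, and every sampled sub-architecture is evaluated with those same shared weights. The NumPy toy below uses illustrative dimensions and random weights (not the paper's setup) to show the mechanism.

```python
import numpy as np

rng = np.random.RandomState(0)

# Shared-weight super-net: one weight matrix per candidate op per layer.
# Every sub-architecture sampled during search reuses these same weights.
n_layers, n_ops, dim = 3, 2, 4
shared = [[rng.randn(dim, dim) * 0.1 for _ in range(n_ops)]
          for _ in range(n_layers)]

def forward(arch, x):
    # 'arch' selects one candidate op per layer, e.g. (0, 1, 0).
    for layer, op in enumerate(arch):
        x = np.maximum(shared[layer][op] @ x, 0.0)  # shared weights + ReLU
    return x

x = rng.randn(dim)
outputs = {arch: forward(arch, x)
           for arch in [(0, 0, 0), (1, 0, 1), (0, 1, 1)]}
# During search, candidate architectures are scored with these shared
# weights; the paper studies which training heuristics make this score
# correlate with the architecture's stand-alone performance.
print(len(outputs))  # 3
```

Training the super-net amounts to repeatedly sampling an `arch`, running `forward`, and updating only the weights on the sampled path, which is exactly where the heuristics under study (sampling strategy, learning-rate schedule, etc.) come into play.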