Visual Computing Seminar (Fall 2019)
The Visual Computing Seminar is a weekly seminar series on a variety of topics in the broader area of Visual Computing.
Why: The motivation for creating this seminar is that EPFL has a critical mass of people who are working on subtly related topics in computational photography, computer graphics, geometry processing, human–computer interaction, computer vision and signal processing. Having a weekly point of interaction will provide exposure to interesting work in this area and increase awareness of our shared interests and other commonalities like the use of similar computational tools — think of this as the visual computing edition of the “Know thy neighbor” seminar series.
Who: The target audience is faculty, students and postdocs in the visual computing disciplines, but the seminar is open to anyone and guests are welcome. There is no need to formally enroll in a course. The format is very flexible and will include 45-minute talks with Q&A, talks by external visitors, as well as shorter presentations. In particular, the seminar is also intended as a way for students to obtain feedback on shorter ~20-minute talks preceding a presentation at a conference. If you are a student or postdoc in one of the visual computing disciplines, you’ll probably receive an email from me soon about scheduling a presentation.
Where and when: every Friday in BC02. Food is served at 11:50, and the actual talk starts at 12:15.
How to be notified: If you want to be kept up to date with announcements, please send me an email and I’ll put you on the list. If you are working in LCAV, CVLAB, IVRL, LGG, LSP, IIG, CHILI, LDM or RGL, you are automatically subscribed to future announcements, so there is nothing you need to do.
You may add the seminar events to Google Calendar (click the '+' button in the bottom-right corner), or download the iCal file.
Title: Applications of Physically-based Visual Simulation
*Note*: This talk is scheduled on Wednesday, but most of the other talks during the semester will take place at the new time (Friday).
Abstract: Physically-based simulation plays an important role in computer graphics. We can create highly realistic images by simulating the optical interaction between light and virtual objects. Realistic animation of complex phenomena can be generated by solving their governing equations, such as the Navier-Stokes equations for fluids. Much of the existing research focuses on the accurate and efficient simulation of existing physical phenomena. Our research group takes a different direction: we 'use' physically-based simulation for different purposes. In this talk, I will introduce some of our research results, including inverse visual simulation of clouds, physically-based aerodynamic sound synthesis, and inverse optical simulation for digital fabrication.
Title: Computational Design of Robotic Characters and Architectural-Scale Structures
Abstract: With the emergence of modern manufacturing and construction technologies, we are witnessing a shift of the design complexity of robots and structures toward computation. With the aid of computational design tools, we can manage this rapidly growing design complexity when creating custom robotic characters and architectural-scale structures.
In this talk, I will first highlight how computation paves the way toward robots that are lightweight and inexpensive, yet functional. I will discuss how we can leverage differentiable simulation and optimization to design continuously deforming robots made of bent wire, and animate very soft robotic systems while suppressing visible mechanical oscillations. Finally, I will discuss how an optimization under worst-case loads enables the use of large-scale binder jetting at the architectural scale.
Bio: Moritz Bächer is a Research Scientist at Disney Research, where he leads the Computational Design and Manufacturing group. His research interests lie at the intersection of computer graphics, computational fabrication, and robotics. Before joining Disney, he received a Ph.D. from the Harvard School of Engineering and Applied Sciences and graduated with a Master's from ETH Zurich.
Title: A Study of Color Rendering in the In-Camera Imaging Pipeline
Abstract: Consumer cameras such as digital single-lens reflex (DSLR) cameras and smartphone cameras have onboard hardware that applies a series of processing steps to transform the initial captured raw sensor image into the final output image that is provided to the user. These processing steps collectively make up the in-camera image processing pipeline. This dissertation aims to study the processing steps related to color rendering, which can be categorized into two stages. The first stage converts an image's sensor-specific raw color space to a device-independent perceptual color space. The second stage further processes the image into a display-referred color space and includes photo-finishing routines to make the image appear visually pleasing to a human.
In this talk I will summarize four contributions towards the study of camera color rendering. The first contribution is the development of a software-based research platform that closely emulates the in-camera image processing pipeline hardware. This platform allows the examination of the various image states of the captured image as it is processed from the sensor response to the final display output. Our second contribution is to demonstrate the advantage of having access to intermediate image states within the in-camera pipeline that provide more accurate colorimetric consistency among multiple cameras. Our third contribution is to analyze the current colorimetric method used by consumer cameras and to propose a modification that is able to improve its color accuracy. Our fourth contribution is to describe how to customize a camera imaging pipeline using machine vision cameras to produce high-quality perceptual images for dermatological applications. The talk concludes with an overall summary.
Title: Beyond Cartesian Representations for Local Descriptors
Abstract: In this talk I'll give a brief introduction to local descriptors, where and how they are used, and describe current approaches and state-of-the-art solutions. I will then present a novel and better way of patch sampling: extracting the “support region” directly with a log-polar sampling scheme that simultaneously oversamples the immediate neighbourhood of the point and undersamples regions far away from it. This allows trainable models to match a much wider range of scales than was possible before, and also to leverage much larger support regions without suffering from occlusions.
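As a rough illustration of the log-polar sampling idea mentioned in the abstract, the sketch below generates sample coordinates around a keypoint. It is not the speaker's implementation: the function name `log_polar_grid` and all of its parameters are hypothetical, and the geometric ring spacing is simply one common way to build a log-polar grid.

```python
import math


def log_polar_grid(cx, cy, r_min, r_max, n_rings, n_angles):
    """Return sample coordinates on a log-polar grid centred at (cx, cy).

    Hypothetical helper for illustration only. Ring radii grow geometrically
    from r_min to r_max, so samples are dense near the keypoint
    (oversampling) and sparse far away from it (undersampling).
    The result is a list of rings, each a list of (x, y) tuples.
    """
    if n_rings < 2 or not (0 < r_min < r_max):
        raise ValueError("need n_rings >= 2 and 0 < r_min < r_max")
    # Constant ratio between consecutive ring radii.
    ratio = (r_max / r_min) ** (1.0 / (n_rings - 1))
    grid = []
    for i in range(n_rings):
        radius = r_min * ratio ** i
        ring = []
        for j in range(n_angles):
            theta = 2.0 * math.pi * j / n_angles
            ring.append((cx + radius * math.cos(theta),
                         cy + radius * math.sin(theta)))
        grid.append(ring)
    return grid


if __name__ == "__main__":
    grid = log_polar_grid(0.0, 0.0, 1.0, 16.0, n_rings=5, n_angles=8)
    for ring in grid:
        # Radius of each ring, measured from the first sample on it.
        print(round(math.hypot(*ring[0]), 3))
```

In a descriptor pipeline, the image would then be interpolated at these coordinates to form the patch fed to the network; that step is omitted here.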
Title: Backpropagation-Friendly Eigendecomposition
Abstract: Eigendecomposition (ED) is widely used in deep networks. However, the backpropagation of its results tends to be numerically unstable, whether using ED directly or approximating it with the Power Iteration (PI) method, particularly when dealing with large matrices. While this can be mitigated by partitioning the data into small and arbitrary groups, doing so has no theoretical basis and makes it impossible to exploit the power of ED to the full. In this paper, we introduce a numerically stable and differentiable approach to leveraging eigenvectors in deep networks. It can handle large matrices without requiring them to be split. We demonstrate the better robustness of our approach over standard ED and PI for ZCA whitening, an alternative to batch normalization, and for PCA denoising, which we introduce as a new normalization strategy for deep networks, aiming to further denoise the network's features.
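For readers unfamiliar with the Power Iteration baseline named in the abstract, here is a minimal sketch of the textbook method for a small dense matrix. It is not the stable approach proposed in the paper, and the function name `power_iteration` and its parameters are chosen here purely for illustration.

```python
import math


def power_iteration(matrix, n_iters=100):
    """Estimate the dominant eigenvalue and eigenvector of a square matrix.

    Textbook Power Iteration, shown for illustration only. `matrix` is a
    list of rows. The start vector is uniform, which is an assumption: the
    method fails if that vector is orthogonal to the dominant eigenvector.
    """
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(n_iters):
        # Multiply by the matrix, then renormalise to unit length.
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("iterate collapsed to the zero vector")
        v = [x / norm for x in w]
    # Rayleigh quotient v^T M v (v has unit length).
    mv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v[i] * mv[i] for i in range(n))
    return eigenvalue, v


if __name__ == "__main__":
    value, vector = power_iteration([[2.0, 0.0], [0.0, 1.0]])
    print(value, vector)
```

Each iteration is a matrix-vector product followed by a normalisation, so gradients can be propagated through the loop; the abstract's point is that doing so naively tends to become numerically unstable.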
Title: Mitsuba 2: A Retargetable Forward and Inverse Renderer
Abstract: Modern rendering systems are confronted with a dauntingly large and growing set of requirements: in their pursuit of realism, physically based techniques must increasingly account for intricate properties of light, such as its spectral composition or polarization. To reduce prohibitive rendering times, vectorized renderers exploit coherence via instruction-level parallelism on CPUs and GPUs. Differentiable rendering algorithms propagate derivatives through a simulation to optimize an objective function, e.g., to reconstruct a scene from reference images. Catering to such diverse use cases is challenging and has led to numerous purpose-built systems—partly, because retrofitting features of this complexity onto an existing renderer involves an error-prone and infeasibly intrusive transformation of elementary data structures, interfaces between components, and their implementations (in other words, everything).
We propose Mitsuba 2, a versatile renderer that is intrinsically retargetable to various applications including the ones listed above. Mitsuba 2 is implemented in modern C++ and leverages template metaprogramming to replace types and instrument the control flow of components such as BSDFs, volumes, emitters, and rendering algorithms. At compile time, it automatically transforms arithmetic, data structures, and function dispatch, turning generic algorithms into a variety of efficient implementations without the tedium of manual redesign. Possible transformations include changing the representation of color, generating a “wide” renderer that operates on bundles of light paths, just-in-time compilation to create computational kernels that run on the GPU, and forward/reverse-mode automatic differentiation. Transformations can be chained, which further enriches the space of algorithms derived from a single generic implementation.
In this talk, I will present some of the key architectural features of Mitsuba 2 and demonstrate it in an interactive Python session. A second talk on Mitsuba 2 will be given on December 6th by Delio Vicini, focusing on some of the applications.
Title: Reparameterizing discontinuous integrands for differentiable rendering
Abstract: Differentiable rendering has recently opened the door to a number of challenging inverse problems involving photorealistic images, such as computational material design and scattering-aware reconstruction of geometry and materials from photographs. Differentiable rendering algorithms strive to estimate partial derivatives of pixels in a rendered image with respect to scene parameters, which is difficult because visibility changes are inherently non-differentiable.
We propose a new technique for differentiating path-traced images with respect to scene parameters that affect visibility, including the position of cameras, light sources, and vertices in triangle meshes. Our algorithm computes the gradients of illumination integrals by applying changes of variables that remove or strongly reduce the dependence of the position of discontinuities on differentiable scene parameters. The underlying parameterization is created on the fly for each integral and enables accurate gradient estimates using standard Monte Carlo sampling in conjunction with automatic differentiation. Importantly, our approach does not rely on sampling silhouette edges, which has been a bottleneck in previous work and tends to produce high-variance gradients when important edges are found with insufficient probability in scenes with complex visibility and high-resolution geometry. We show that our method only requires a few samples to produce gradients with low bias and variance for challenging cases such as glossy reflections and shadows. Finally, we use our differentiable path tracer to reconstruct the 3D geometry and materials of several real-world objects from a set of reference photographs.
Note: This week's seminar talk will start slightly later than usual, around 12:30.
Title: Design and Structural Optimization of Topological Interlocking Assemblies
Abstract: We study assemblies of convex rigid blocks regularly arranged to approximate a given freeform surface. Our designs rely solely on the geometric arrangement of blocks to form a stable assembly, neither requiring explicit connectors or complex joints, nor relying on friction between blocks. The convexity of the blocks simplifies fabrication, as they can be easily cut from different materials such as stone, wood, or foam. However, designing stable assemblies is challenging, since adjacent pairs of blocks are restricted in their relative motion only in the direction orthogonal to a single common planar interface surface. We show that despite this weak interaction, structurally stable, and in some cases, globally interlocking assemblies can be found for a variety of freeform designs. Our optimization algorithm is based on a theoretical link between static equilibrium conditions and a geometric, global interlocking property of the assembly, namely that an assembly is globally interlocking if and only if the equilibrium conditions are satisfied for arbitrary external forces and torques. Inspired by this connection, we define a measure of stability that spans from single-load equilibrium to global interlocking, motivated by tilt analysis experiments used in structural engineering. We use this measure to optimize the geometry of blocks to achieve a static equilibrium for a maximal cone of directions, as opposed to considering only self-load scenarios with a single gravity direction. In the limit, this optimization can achieve globally interlocking structures. We show how different geometric patterns give rise to a variety of design options and validate our results with physical prototypes.
Title: DeepSphere: a graph-based spherical CNN
Abstract: Equivariance has emerged as a design principle for (Convolutional) Neural Networks (CNNs) that makes it possible to reduce sample complexity and guarantees generalization by exploiting symmetries. For a spherical CNN, proper equivariance to rotation (using the spherical harmonic transform) is computationally expensive. A commonly used scalable alternative is to locally apply 2D CNNs, exploiting the local resemblance of the manifold to Euclidean space. This approach, however, discards the spherical geometry and is not equivariant. To allow for a controllable balance between equivariance and scalability, we propose instead to model the sampled sphere as a graph of connected pixels. Moreover, this representation naturally accommodates non-uniformly distributed, partial, and changing samplings, without interpolation. As the main challenge is to design discrete operations that respect the underlying continuous geometry, we show both theoretically and empirically how equivariance is affected by the construction of the graph. The application of DeepSphere to the recognition of 3D objects, the discrimination of cosmological models, and the segmentation of extreme events in climate simulations yields state-of-the-art performance and demonstrates the efficiency and flexibility of the method.
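The idea of treating a sampled sphere as a graph of connected pixels can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the sampling or the graph construction, so the Fibonacci lattice, the k-nearest-neighbour rule, and the helper names `fibonacci_sphere` and `knn_graph` are all choices made here, not DeepSphere's actual design.

```python
import math


def fibonacci_sphere(n):
    """Return n roughly evenly spread points on the unit sphere.

    The Fibonacci lattice is an assumption made for this sketch; any set of
    points on the sphere (partial or non-uniform) could be used instead.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n      # height, strictly inside (-1, 1)
        r = math.sqrt(1.0 - z * z)         # radius of the circle at height z
        phi = golden_angle * i
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points


def knn_graph(points, k):
    """Connect every point to its k nearest neighbours (Euclidean distance).

    Returns a dict mapping each node index to a list of neighbour indices.
    Brute force, O(n^2), which is fine for a small illustration.
    """
    adjacency = {}
    for i, p in enumerate(points):
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()
        adjacency[i] = [j for _, j in dists[:k]]
    return adjacency


if __name__ == "__main__":
    pixels = fibonacci_sphere(50)
    graph = knn_graph(pixels, k=6)
    print(graph[0])
```

Because the graph is built directly from the sample positions, no interpolation onto a regular grid is needed, which is the property the abstract highlights. How the edges are chosen and weighted is what the abstract says affects equivariance, and that is not modelled here.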
Title: DeepWave, a recurrent neural network for real-time acoustic imaging
Abstract: Computational imaging refers to the problem of reconstructing a source object from indirect evidence of the latter obtained using an acquisition device. The traditional approach to solving these imaging tasks is to formulate a regularized inverse problem that takes into account the precise form of the forward operator and prior knowledge of the object of interest to obtain good estimates. Recently however, Deep Learning methods have been placed at the forefront of the field due to their excellent performance and fixed execution times.
Paper pre-print: https://infoscience.epfl.ch/record/265765/files/paper.pdf
Title: Mesh Modeling to Assist Segmentation in Volumetric Data
Abstract: CNN-based volumetric methods that label individual voxels now dominate the field of biomedical segmentation. In this paper, we show that simultaneously performing the segmentation and recovering a 3D mesh that models the surface can boost performance.
Title: Mitsuba 2: A Retargetable Forward and Inverse Renderer: Example Applications
Abstract: Modern rendering systems are confronted with a growing set of complex requirements: in their pursuit of realism, physically based techniques must increasingly account for intricate properties of light, such as its spectral composition or polarization. To reduce prohibitive rendering times, vectorized renderers exploit coherence via instruction-level parallelism on CPUs and GPUs. Differentiable rendering algorithms propagate derivatives through a simulation to optimize an objective function, e.g., to reconstruct a scene from reference images. Catering to such diverse use cases is challenging and has led to numerous purpose-built systems, partly because retrofitting features of this complexity onto an existing renderer involves an error-prone and infeasibly intrusive transformation of elementary data structures, interfaces between components, and their implementations (in other words, everything). We recently proposed Mitsuba 2 (presented at SIGGRAPH Asia 2019), a versatile renderer that is intrinsically retargetable to various applications including the ones listed above.
Title: Inflatables: A New Deployable Surface Structure
Abstract: I will present our work on a new type of deployable surface structure that transforms from a flat, easily fabricated sheet to a curved 3D surface simply by inflating. Our “inflatables” consist of two sheets of material that are fused together along a pattern of curves to form pockets. When inflated with air, these pockets expand into tubes, inducing transverse contractions of the sheet that force it to pop into a 3D shape. We use tools from differential geometry to understand this shape transformation and to approximately solve the inverse design problem: given a target surface, we generate a pattern of fusing curves so that the resulting inflated structure closely resembles it. We then use an accurate physical simulation of the inflation process to assess how well the target surface is approximated and run a shape optimization on the fusing curves to better fit the target. I will show simulated inflations for a variety of surfaces we are able to produce with this method, along with some fabricated examples.
Title: Estimating People Flows to Better Count Them in Crowded Scenes
Abstract: State-of-the-art methods for counting people in crowded scenes rely on deep networks to estimate people densities in individual images. As such, only very few take advantage of temporal consistency in video sequences, and those that do only impose weak smoothness constraints across consecutive frames.