Organizers
Sean Fanello, Christoph Rhemann, Jonathan Taylor, Sofien Bouaziz, Adarsh Kowdle, Rohit Pandey, Sergio Orts-Escolano, Paul Debevec, Shahram Izadi
Description
3D capture and rendering of humans has shown incredible progress in the last few years, reaching a level of quality that is approaching that of image-based rendering (IBR) approaches. These systems usually rely on multiview capture setups and sophisticated pipelines to build a consistent, parameterized mesh of the performer together with its reflectance properties. The final goal of these approaches is to render high-quality, photo-realistic humans that can match the quality of Hollywood productions, without any manual intervention or post-processing.
However, despite steady progress and the encouraging results obtained by these 3D capture systems, they still face important challenges and limitations. For example, translucent and transparent objects cannot be easily captured, and reconstructing thin structures (e.g. hair) remains very challenging even with high-resolution depth sensors. At the same time, the computer vision community has turned its attention to deep learning techniques to overcome the limitations of purely geometric approaches.
In this tutorial, we will show how to combine geometric pipelines with recent advances in neural rendering to construct disentangled 3D representations for photo-realistic renderings of humans from novel viewpoints and under desired lighting conditions. We will walk the audience through the current state of the art for 3D performance capture, highlighting the pros and cons of the various techniques.
In particular, in the first part of the tutorial we will focus on the capture system, which is the foundation of any machine learning method that relies on supervised ground-truth data. We will consider hardware design choices for cameras, sensors, and lighting, as well as depth estimation algorithms. We will then describe all the steps needed to select and design the right depth sensing technology for a given application.
In the second part we will detail state-of-the-art methods for reconstructing humans with high fidelity, focusing on topics such as 3D reconstruction, parametric and non-parametric tracking, mesh parameterization, and compression. We will also detail traditional methods for computing the reflectance and material properties of arbitrary objects.
In the third part of the tutorial we will show how deep learning can be applied to overcome the limitations of traditional capture and rendering pipelines. We will detail recent trends in disentangled representations for human capture, with particular emphasis on pose, viewpoint, and lighting. Finally, we will discuss multiple applications enabled by performance capture systems and machine learning.
When and Where
June 14th - 9:10 am PDT - Virtual, hosted on the official CVPR website.
Talks are pre-recorded and attendees can watch them asynchronously.
Organizers will periodically check the comments to answer questions.
Program
Time | Title | Speaker
---|---|---
**Morning Session: 3D Capture Systems for Groundtruth Generation** | |
9:10 - 9:30 | | Sean Fanello
9:30 - 10:00 | | Adarsh Kowdle
10:00 - 10:30 | | Jay Busch, Matt Whalen
10:30 - 10:45 | Coffee Break |
10:45 - 11:30 | | Sean Fanello
11:30 - 12:10 | | Alex Ma
12:10 - 13:30 | Lunch Break |
**Afternoon Session: Deep Learning meets Light Stage** | |
13:30 - 14:00 | | Paul Debevec
14:00 - 14:20 | | Yinda Zhang
14:20 - 14:40 | | Anastasia Tkach
14:40 - 15:00 | | Sergio Orts-Escolano
15:00 - 15:20 | | Chloe LeGendre
15:20 - 15:40 | | Danhang Tang
15:40 - 16:00 | Coffee Break |
16:00 - 16:20 | | Abhimitra Meka
16:20 - 17:00 | | Rohit Pandey
Please contact Shahram Izadi or Sean Fanello if you have any questions.