Total relighting: learning to relight portraits for background replacement.
ACM Trans. Graph., 2021
Compositing a person into a scene to look like they are really there is a fundamental technique in visual effects, with many other applications such as smartphone photography [Tsai and Pandey 2020] and video conferencing [Hou and Mullen 2020]. The most common practice in film-making has been to record an actor in front of a green or blue screen and use chroma-keying [Wright 2013] to derive an alpha matte and then change the background to a new one. However, this does nothing to ensure that the lighting on the subject appears consistent with the lighting in the new background environment, which must instead be solved with laborious lighting placement or elaborate LED lighting reproduction systems [Bluff et al. 2020; Debevec et al. 2002; Hamon et al. 2014]. Our goal is to design a system that allows for automated portrait relighting and background replacement.

There is a significant body of work both in relighting, e.g., [Barron and Malik 2015; Debevec et al. 2000; Nestmeyer et al. 2020; Sun et al. 2019; Wang et al. 2020; Zhou et al. 2019], and in determining alpha mattes and foreground colors, e.g., [Cai et al. 2019; Forte and Pitié 2020; Hou and Liu 2019; Lutz et al. 2018; Xu et al. 2017]. A few techniques simultaneously consider foreground estimation and compositing in a unified framework [Wang and Cohen 2006; Zhang et al. 2020b] and produce convincing composites when the input and target lighting conditions are similar. However, the absence of an explicit relighting step limits realism when the input and target illumination conditions differ.

To generate convincing relit composites, Einarsson et al. [2006] and Wenger et al. [2005] captured reflectance field basis images using time-multiplexed lighting conditions played back at very high frame rates (∼1000 Hz) in a computational illumination system, leveraging image-based relighting [Debevec et al. 2000] to match the lighting of the subject to the target background. Both methods also employed a simple ratio matting technique [Debevec et al. 2002] to derive the alpha channel, based on infrared or time-multiplexed mattes and a recorded “clean plate”. These hardware-based systems produced realistic composites by handling matting, relighting, and compositing in one complete system. However, their specialized hardware makes these techniques impractical in casual settings such as mobile phone photography and video conferencing.

Inspired by these approaches, we propose a system for realistic portrait relighting and background replacement, starting from just a single RGB image and a desired target high dynamic range (HDR) lighting environment [Debevec 1998]. Our approach relies on multiple deep learning modules trained to accurately detect the foreground and alpha matte from portraits and to perform foreground relighting and compositing under a target illumination condition. We train our models using data from a light stage computational illumination system [Guo et al. 2019] to record reflectance fields and alpha mattes of 70 diverse individuals in various poses and expressions. We process the data to estimate useful photometric information, such as per-pixel surface normals and surface albedo, which we leverage to help supervise the training of the relighting model. We extrapolate the recorded alpha mattes to all of the camera viewpoints using a deep learning framework that leverages clean plates of the light stage background, extending ratio matting to unconstrained backgrounds without the need for specialized lighting.
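As a minimal illustration of the matting and compositing operations described above, the sketch below derives a matte by ratio matting against a clean plate (in the spirit of Debevec et al. 2002) and then composites a (relit) foreground over a new background. The luminance inputs, clamping, and array shapes are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def ratio_matte(backlit, clean_plate, eps=1e-6):
    """Ratio matting sketch: where the subject occludes the lit backing, the
    observed backing brightness drops, so alpha = 1 - observed / clean_plate,
    clamped to [0, 1].

    backlit:     (H, W) luminance of the subject in front of the lit backing.
    clean_plate: (H, W) luminance of the lit backing with no subject present.
    """
    alpha = 1.0 - backlit / np.maximum(clean_plate, eps)
    return np.clip(alpha, 0.0, 1.0)[..., None]  # (H, W, 1)

def composite(foreground, alpha, background):
    """Standard alpha compositing of a (relit) foreground over a new background.

    foreground: (H, W, 3), alpha: (H, W, 1), background: (H, W, 3).
    """
    return alpha * foreground + (1.0 - alpha) * background
```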
With these reflectance fields, alpha mattes, and a database of high resolution HDR lighting environments, we use image-based relighting [Debevec et al …
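For reference, image-based relighting with a reflectance field [Debevec et al. 2000] amounts to a weighted sum of one-light-at-a-time (OLAT) basis images, with weights sampled from the target HDR environment in each light's direction. The sketch below is a minimal NumPy illustration under an assumed y-up, lat-long environment map convention with nearest-neighbor sampling; it is not the paper's implementation and omits solid-angle weighting.

```python
import numpy as np

def relight_from_reflectance_field(olat_images, light_dirs, hdr_env):
    """Image-based relighting sketch: sum OLAT basis images weighted by the
    target environment's radiance toward each light.

    olat_images: (N, H, W, 3) basis images, one per light-stage light.
    light_dirs:  (N, 3) unit direction vectors of those lights (y up).
    hdr_env:     (He, We, 3) lat-long HDR environment map.
    """
    He, We, _ = hdr_env.shape
    x, y, z = light_dirs[:, 0], light_dirs[:, 1], light_dirs[:, 2]
    # Map each light direction to lat-long pixel coordinates (assumed convention).
    u = (np.arctan2(x, -z) / (2.0 * np.pi) + 0.5) * (We - 1)   # azimuth -> column
    v = (np.arccos(np.clip(y, -1.0, 1.0)) / np.pi) * (He - 1)  # polar   -> row
    weights = hdr_env[v.astype(int), u.astype(int)]            # (N, 3) RGB weights

    # Weighted sum over the light axis, per color channel.
    return np.einsum('nc,nhwc->hwc', weights, olat_images)
```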