1 Introduction
Bidirectional Reflectance Distribution Function (BRDF) importance sampling is an essential variance reduction technique for Monte Carlo forward rendering. However, there is no simple counterpart for differentiable rendering. Taking the derivative of a BRDF with respect to one of its parameters transforms it into a real-valued differential BRDF. The differential BRDF can have a very different shape from the BRDF, and can also take negative values. Previous attempts at tackling this problem [Zeltner et al.
2021] are limited to the roughness derivatives of isotropic
GGX (Trowbridge-Reitz) and Beckmann BRDFs, and cannot handle even their anisotropic counterparts. Another method [Zhang et al.
2021a] was developed primarily for odd functions with symmetric positive and negative lobes, and can produce substantially higher variance when the derivative is close to an even function. We propose effective importance sampling of derivatives of not only anisotropic GGX and Beckmann BRDFs but also a wide variety of other analytic BRDF models like Ashikhmin-Shirley, Oren-Nayar, Hanrahan-Krueger, Mixture BRDFs, and ABC models. Figure
1 demonstrates the benefits of our method on several BRDFs compared to BRDF sampling.
Importance sampling a real-valued function leads to unique challenges. Its variance has two sources, (a) its sign and (b) its shape. Our idea is to decompose the function into a sum of single-signed functions, which we call single-signed decompositions. A single-signed function is either positive everywhere or negative everywhere in its domain. Single-signed functions, by definition, have no sign variance. Importance sampling these functions eliminates their shape variance.
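To make the two variance sources concrete, here is a minimal Python sketch on a 1D toy function of our own choosing (sin on [0, 2π], not one of the BRDF derivatives discussed later): a PDF proportional to the absolute value of the integrand removes shape variance but leaves sign variance, while a single-signed decomposition removes both.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10_000

# Toy 1D stand-in for a real-valued "differential BRDF":
# h(x) = sin(x) on [0, 2*pi], whose true integral is 0.
h = np.sin

# Shape-matched sampling: p ∝ |h|, i.e., p(x) = |sin(x)| / 4 on [0, 2*pi].
# Draw from sin(x)/2 on [0, pi] via inverse CDF, then shift by pi half the time.
x = np.arccos(1.0 - 2.0 * rng.random(N))
x = np.where(rng.random(N) < 0.5, x + np.pi, x)
w_detached = h(x) / (np.abs(np.sin(x)) / 4.0)      # weights flip between +4 and -4

# Single-signed decomposition: h = h_+ + h_- with supports [0, pi] and [pi, 2*pi].
# Each part is sampled with a PDF exactly proportional to it.
xp = np.arccos(1.0 - 2.0 * rng.random(N))          # p_+(x) = sin(x)/2 on [0, pi]
w_plus = h(xp) / (np.sin(xp) / 2.0)                # constant +2: zero variance
xm = np.arccos(1.0 - 2.0 * rng.random(N)) + np.pi  # p_-(x) = -sin(x)/2 on [pi, 2*pi]
w_minus = h(xm) / (-np.sin(xm) / 2.0)              # constant -2: zero variance

est_detached    = w_detached.mean()                # noisy estimate of 0
est_positivized = w_plus.mean() + w_minus.mean()   # exactly 2 + (-2) = 0
```

Here the weight of each single-signed part is constant, so each part is estimated with zero variance; the shape-matched weights only average out to the correct value.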
A classical strategy, positivization [Owen and Zhou
2000], is a special case of our single-signed decomposition. It has positive and negative parts with
non-overlapping support, which in turn requires (a) analytic knowledge of the roots and (b) analytic integrability of the BRDF derivative up to the roots, which is possible only for certain BRDF derivatives. To sidestep these issues due to a partition of the domain, we introduce the product and mixture decompositions for which
we allow the positive and negative parts to overlap. In fact, we ensure that both the positive and the negative parts have support over the entire hemisphere. This enables analytic integrability and significantly expands upon the set of BRDF derivatives we can handle.
Our main contributions are three single-signed decompositions and the corresponding importance sampling PDFs of a large set of BRDF derivatives; see Table 1.

Positivization. First, we introduce a simple decomposition called positivization (Section
4.2), which
partitions a real-valued function about its roots into a positive and a negative function. We show that Zeltner et al.’s [
2021] antithetic sampling is a special case of positivization, and positivization explains the efficiency of their approach. When applicable, positivization leads to significant variance reduction. For example, it can be applied to sample the isotropic GGX, Beckmann, and Hanrahan-Krueger BRDF derivatives. However, other derivatives, such as those of the anisotropic GGX, Beckmann, and Ashikhmin-Shirley BRDFs (Section
4.2.2), are not analytically integrable up to their roots, and the derivatives with respect to mixture weights (Section
6) do not have analytic roots. Positivization cannot handle these derivatives. Zeltner et al.’s antithetic sampling inherits these limitations too.
Product Decomposition. Second, we propose a
novel product decomposition (Section
5). Our key observation is that after differentiation, many BRDF derivatives can be decomposed into single-signed functions by
separating the terms that result from the derivative product rule. Product decomposition does
not require knowledge of the roots for the decomposition and only requires the resulting single-signed functions to be analytically integrable. Product decomposition can importance sample the derivatives of anisotropic GGX, Beckmann, Ashikhmin-Shirley, and more.
Mixture Decomposition. Finally, we introduce mixture decomposition (Section
6). Derivatives of BRDFs with linear combination coefficients, e.g., mixture weights of a layered BRDF, result in real-valued functions whose roots cannot be found analytically in most cases. Our mixture decomposition exploits the fact that this derivative is the difference between two positive-valued terms. Separating them results in a single-signed decomposition, and the two terms can then be importance sampled separately. Mixture decomposition handles the derivatives of Oren-Nayar and mixture weights of
Uber BRDFs such as the Disney BRDF or Autodesk Standard Surface.
It is likely that several other BRDF derivatives not surveyed in this article can also be dealt with by one of our three decompositions, and we provide a recipe for handling them in Section
7. We provide a library of importance sampling PDFs for the derivatives of all the BRDF models discussed in this work in Table
1.
3 Background
For the sake of simplicity, we begin our discussion by focusing on the direct lighting setting, and extend it to indirect lighting in Section
9. The reflected radiance
\(L_r\) , at a shading point
\(\boldsymbol {y}\) , in the direction
\(\boldsymbol {\omega }_o\) , is given by the reflection equation [Cohen and Wallace
1993],
\[
L_r(\boldsymbol {y}, \boldsymbol {\omega }_o; \alpha) = \int _{\mathcal {H}^2} L_i(\boldsymbol {y}, \boldsymbol {\omega }_i)\, f(\boldsymbol {y}, \boldsymbol {\omega }_i, \boldsymbol {\omega }_o, \alpha)\, \mathrm {d}\boldsymbol {\omega }_i. \tag{1}
\]
Here,
f is the cosine-weighted BRDF at
\(\boldsymbol {y}\) , and
\(\alpha\) is a scalar BRDF parameter that controls
f. In practice,
\(\alpha\) is the vector of all BRDF parameters in a given scene. However, for ease of exposition, we assume
\(\alpha\) is scalar-valued, with the results for the other parameters following similarly. For example,
\(\alpha\) could be the roughness of an isotropic GGX BRDF. Since we are dealing with only direct lighting, the incident radiance
\(L_i\) does not depend upon
\(\alpha\) . Differentiating the expression for the reflected radiance with
\(\alpha\) , we get
\[
\partial _\alpha L_r(\boldsymbol {y}, \boldsymbol {\omega }_o; \alpha) = \int _{\mathcal {H}^2} L_i(\boldsymbol {y}, \boldsymbol {\omega }_i)\, \partial _\alpha f(\boldsymbol {y}, \boldsymbol {\omega }_i, \boldsymbol {\omega }_o, \alpha)\, \mathrm {d}\boldsymbol {\omega }_i. \tag{2}
\]
Since the parameter
\(\alpha\) does not affect discontinuities, we do not have an additional boundary integral [Li et al.
2018].
In forward rendering, BRDF sampling aims to minimize the variance of the BRDF
f in the reflection equation, Equation (
1). Similarly, our goal is to
minimize the variance of the differential BRDF \(\partial _\alpha f\) in differentiable rendering, i.e., the variance of the estimator for
\[
I(\boldsymbol {\omega }_o, \alpha) = \int _{\mathcal {H}^2} \partial _\alpha f(\boldsymbol {\omega }_i, \boldsymbol {\omega }_o, \alpha)\, \mathrm {d}\boldsymbol {\omega }_i. \tag{3}
\]
We drop the spatial coordinate
\(\boldsymbol {y}\) , without loss of generality, for simplicity. We deal with the incident radiance
\(L_i\) using light source sampling. The estimators for
\(\partial _\alpha f\) and
\(L_i\) can be combined using Multiple Importance Sampling [Veach and Guibas
1995]. We finally want to compute
\(\partial _\alpha L_r\) so the final estimator must always include multiplication by
\(L_i\) . All following integrals in the article omit
\(L_i\) to stress that we only focus on
\(\partial _\alpha f\) .
3.1 Previous Work on Variance Reduction for Differentiable Rendering
3.1.1 Detached and Antithetic Sampling.
Zeltner et al. [
2021] noticed that standard BRDF sampling using a PDF
\(p \propto f\) for the differential BRDF
\(\partial _\alpha f\) leads to high variance, since
\(\partial _\alpha f\) and
f can be very different functions. They instead construct a PDF
\(p \propto |\partial _\alpha f|\) , called the differential detached PDF, which matches
\(\partial _\alpha f\) in shape. This eliminates variance from the shape of
\(\partial _\alpha f\) , i.e., the sample weights
\(\partial _\alpha f / p\) are constant in magnitude. There is, however, additional
sign variance resulting from the mismatch in the sign between the positive-valued
p and the real-valued integrand
\(\partial _\alpha f\) , which results in sample weights
\(\partial _\alpha f / p\) that change sign.
To deal with sign variance, Zeltner et al. [
2021] applied antithetic sampling. We show that Zeltner et al.’s method is a special case of another technique called positivization [Owen and Zhou
2000].
We show in Section
4.2 and Appendix
B that positivization provides a theoretical grounding of antithetic sampling: the effectiveness mainly comes from the stratification (separating the real-valued function into a positive and a negative function). The
major drawback of antithetic sampling is its inapplicability to several BRDF derivatives, due to the lack of closed forms of root finding and integration, which we discuss in Section
4.2.2.
Positivization has also been applied by concurrent work in differentiable rendering to
the target function q in
resampled importance sampling (RIS) [Talbot et al.
2005] by Chang et al. [
2023] (see Equation (11) in their paper). Our work is orthogonal to their method: They use BRDF importance sampling as their
candidate distribution p. The variance of RIS is a linear combination of
\(\text{Var}(f/p)\) and
\(\text{Var}(f/q)\) (see Equation 4.2 in Talbot et al. [
2005]’s work); positivized RIS reduces the variance of the second term by positivizing the target function
q into
\(q_+,q_-\) . Our method of positivizing the differential BRDF (using
\(p_+,p_-\) instead of
p) reduces the variance of the first term.
3.1.2 Antithetic Sampling of Odd Derivatives.
Zhang et al. [
2021a] introduce another antithetic-sampling-based method to deal with the derivative of the GGX Normal Distribution Function,
\(D(\boldsymbol {\omega }_h)\) with the half vector
\(\boldsymbol {\omega }_h\) . They exploit the fact that the derivative
\(\partial _{\boldsymbol {\omega }_h} D(\boldsymbol {\omega }_h)\) is odd about the local shading normal \(\boldsymbol {n}\) , i.e.,
\[
\partial _{\boldsymbol {\omega }_h} D(\boldsymbol {\omega }_h) = -\,\partial _{\boldsymbol {\omega }_h} D(\boldsymbol {\omega }_h^*), \qquad \boldsymbol {\omega }_h^* = 2 \left(\boldsymbol {\omega }_h \cdot \boldsymbol {n}\right) \boldsymbol {n}- \boldsymbol {\omega }_h. \tag{4}
\]
Their estimator for Equation (
3) requires two antithetic samples
\(\omega _{i,1}\) and
\(\omega _{i,2}\) , and is given by
\[
\langle I \rangle = \frac{\partial _\alpha f(\boldsymbol {\omega }_{i,1}) + \partial _\alpha f(\boldsymbol {\omega }_{i,2})}{p(\boldsymbol {\omega }_{i,1}) + p(\boldsymbol {\omega }_{i,2})}. \tag{5}
\]
Here, and going forward, we drop
\(\boldsymbol {\omega }_o\) and
\(\alpha\) from the function arguments of
\(I(\boldsymbol {\omega }_o, \alpha)\) and
\(f(\boldsymbol {\omega }_i, \boldsymbol {\omega }_o, \alpha)\) for brevity. This method works well for the odd derivative with
\(\boldsymbol {\omega }_h\) . However, for non-odd derivatives, there are no variance reduction guarantees. Furthermore, several BRDF derivatives are even, e.g., the roughness derivatives of GGX and Beckmann, and Zhang et al.’s method increases variance in these cases.
Additionally, Equation (
5) is not in the standard importance sampling form of
\(\partial _\alpha f / p\) due to the presence of a sum in the numerator and denominator. Hence, it is unclear how to use it in conjunction with multiple importance sampling.
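The role of oddness can be seen in a small numerical sketch (1D stand-ins of our own, not Zhang et al.'s actual estimator): an antithetic pair (x, −x) cancels an odd integrand exactly, but for an even integrand the two evaluations are perfectly correlated and carry no more information than a single sample.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 50_000
x = rng.standard_normal(N)

odd  = lambda t: t**3   # odd integrand: true integral over N(0, 1) is 0
even = lambda t: t**2   # even integrand: true integral is 1

# Antithetic pair (x, -x); the pair estimator averages both evaluations.
anti_odd  = 0.5 * (odd(x)  + odd(-x))    # identically zero: perfect cancellation
anti_even = 0.5 * (even(x) + even(-x))   # equals even(x): no cancellation at all

# Baseline: two *independent* samples per estimate, same total sample budget.
indep_even = 0.5 * (even(x[:N // 2]) + even(x[N // 2:]))
```

For the even integrand, the antithetic pair has roughly twice the variance of two independent samples, which matches the variance increase reported above.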
5 Product Decomposition
Our first new decomposition is product decomposition. It can handle the derivatives of anisotropic microfacet BRDFs, diffuse BSSRDFs, and the isotropic ABC BRDF that positivization could not handle. The key idea we exploit is that after differentiating any of these materials, they split up into two terms following the product rule. Both of these are single-signed, have no sign variance, and are analytically integrable over their simple domains of integration (hemisphere or plane).
Several BRDFs (or normal distribution functions) are of the form
\[
f(\boldsymbol {\omega }_h, \alpha) = N(\alpha)\, g(\boldsymbol {\omega }_h, \alpha),
\]
where
\(g(\boldsymbol {\omega }_h, \alpha)\) is a non-negative shape function, which determines the overall shape of the BRDF over all
\(\boldsymbol {\omega }_h\) , at the parameter value
\(\alpha\) .
\(N(\alpha)\) is a directionally constant (independent of
\(\boldsymbol {\omega }_h\) ) normalization term that ensures
f integrates to 1. Differentiating
f with
\(\alpha\) , we get
\[
\partial _\alpha f = \partial _\alpha N(\alpha)\, g(\boldsymbol {\omega }_h, \alpha) + N(\alpha)\, \partial _\alpha g(\boldsymbol {\omega }_h, \alpha).
\]
Because N and \(\partial _\alpha N\) are directionally constant, the variance in the two terms above comes from g and \(\partial _\alpha g\) , respectively. The first term above is single-signed, because
\(g \ge 0\) . The second term with
\(\partial _\alpha g\) can potentially be real-valued. However, we have found it to be single-signed for several common BRDFs. For example, for the anisotropic GGX normal distribution function
\(D(\boldsymbol {\omega }_h)\) and its roughness \(\alpha _x\) , we have
\[
D(\boldsymbol {\omega }_h) = \underbrace{\frac{1}{\pi \alpha _x \alpha _y}}_{N} \, \underbrace{\left[ \cos ^4\theta _h \left(1 + \tan ^2\theta _h \left(\frac{\cos ^2\phi _h}{\alpha _x^2} + \frac{\sin ^2\phi _h}{\alpha _y^2}\right) \right)^{2} \right]^{-1}}_{g},
\]
where
\(\partial _{\alpha _x} g\) is single-signed; see Figure
5. Additionally,
\(\partial _\alpha g\) is also analytically integrable over its hemispherical domain.
Let us provide some geometric intuition for why the shape derivative \(\partial _\alpha g\) is often single-signed. For our BRDFs, the parameter \(\alpha\) often controls the variance of the distribution, e.g., \(\alpha _x, \alpha _y\) for GGX and Beckmann, or \(n_u, n_v\) for Ashikhmin-Shirley. For all of these, increasing \(\alpha\) stretches g horizontally, increasing (or decreasing) its value at all locations and making its derivative single-signed. At the same time, \(\alpha\) scales \(N(\alpha)\) vertically to compensate for the change in area under g, ensuring that f still integrates to 1.
We construct importance sampling PDFs for the two single-signed terms separately, with PDFs
\(p_1 \propto g\) and
\(p_2 \propto \partial _{\alpha }g\) ,
\[
\langle I \rangle = \frac{\partial _\alpha N(\alpha)\, g(\boldsymbol {\omega }_{i,1}, \alpha)}{p_1(\boldsymbol {\omega }_{i,1})} + \frac{N(\alpha)\, \partial _\alpha g(\boldsymbol {\omega }_{i,2}, \alpha)}{p_2(\boldsymbol {\omega }_{i,2})}. \tag{16}
\]
Figure
7(a) describes the pipeline to generate importance sampling PDFs for product decomposition. Product decomposition can handle the derivatives of anisotropic GGX, Beckmann, Ashikhmin-Shirley, which are not analytically integrable over the positivized domains, and Burley’s diffuse BSSRDF and the isotropic ABC BRDF, which have no closed-form solution for the roots. However, they all have single-signed
\(\partial _\alpha g\) , which is analytically integrable.
Note that the product rule in and of itself does not guarantee a single-signed decomposition. For example, the product of the microfacet distribution (D) and geometric terms (G) does not lead to a single-signed decomposition for the derivative with \(\alpha _x\) (or \(\alpha _y\) ). This is because both \(\partial _{\alpha _x} D\) and the \(\partial _{\alpha _x} G\) terms are real-valued. The decomposition \(D = N g\) is one of the many product decompositions, but the only one we found to preserve the single-signed property.
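A hypothetical 1D analogue illustrates the mechanics (a Gaussian-shaped shape function of our own, not one of the BRDFs above): with g(x, α) = exp(−x²/α²) and N(α) = 1/(α√π), the shape derivative ∂_α g = (2x²/α³) exp(−x²/α²) is non-negative, and sampling each product-rule term with a matching PDF leaves only constant weights.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, N_samp = 0.7, 10_000

# f(x, a) = N(a) g(x, a) with g = exp(-x^2 / a^2), N = 1/(a sqrt(pi)), so that
# f integrates to 1 for every a. The product rule splits the derivative:
#   d_a f = (d_a N) g + N (d_a g),  with d_a g = (2 x^2 / a^3) exp(-x^2/a^2) >= 0.
Na  = 1.0 / (alpha * np.sqrt(np.pi))
dNa = -1.0 / (alpha**2 * np.sqrt(np.pi))
g   = lambda x: np.exp(-x**2 / alpha**2)
dg  = lambda x: (2.0 * x**2 / alpha**3) * np.exp(-x**2 / alpha**2)

# Term 1: p1 ∝ g is just a Gaussian with sigma = a / sqrt(2) ("BRDF sampling").
x1 = rng.normal(0.0, alpha / np.sqrt(2.0), N_samp)
p1 = g(x1) / (alpha * np.sqrt(np.pi))
w1 = dNa * g(x1) / p1                # identically -1/alpha: zero variance

# Term 2: p2 ∝ d_a g. With u = (x/a)^2 ~ Gamma(3/2, 1) and a random sign,
# x = ±a sqrt(u) has exactly the density d_a g / sqrt(pi).
u  = rng.gamma(1.5, 1.0, N_samp)
x2 = alpha * np.sqrt(u) * rng.choice([-1.0, 1.0], N_samp)
p2 = dg(x2) / np.sqrt(np.pi)
w2 = Na * dg(x2) / p2                # identically +1/alpha: zero variance

est = w1.mean() + w2.mean()          # d_a of the normalization: 0, recovered exactly
```

Both per-term weights are constant (−1/α and +1/α), so each term is estimated with zero variance; their sum recovers ∂_α ∫ f dx = 0 exactly, as it must since f is normalized for every α.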
6 Mixture Decomposition
Our second new decomposition further expands the set of BRDF derivatives we can handle. Consider, for example, a BRDF made up of a diffuse
\(f_d\) and specular
\(f_s\) lobe with scalar mixture weights
\(k_d\) and
\(1-k_d\) , respectively:
\[
f(\boldsymbol {\omega }_i, \boldsymbol {\omega }_o) = k_d\, f_d(\boldsymbol {\omega }_i, \boldsymbol {\omega }_o) + (1 - k_d)\, f_s(\boldsymbol {\omega }_i, \boldsymbol {\omega }_o).
\]
The derivative with the mixture weight
\(k_d\) is positive when the diffuse lobe contribution is higher than the specular lobe and negative otherwise. In general, this derivative is very hard to positivize, because
\(f_d\) and
\(f_s\) can be arbitrary BRDFs, and so the roots of
\(f_d - f_s\) are unlikely to have a simple analytic form.
However, we can once again decompose this derivative into single-signed functions with overlapping support; we refer to this as the mixture decomposition. Since
\(f_d\) and
\(f_s\) are non-negative valued BRDFs, they are single-signed, and can be importance sampled separately with appropriate PDFs
\(p_d\) and
\(p_s\) :
\[
\partial _{k_d} f = f_d - f_s, \qquad \langle I \rangle = \frac{f_d(\boldsymbol {\omega }_{i,1})}{p_d(\boldsymbol {\omega }_{i,1})} - \frac{f_s(\boldsymbol {\omega }_{i,2})}{p_s(\boldsymbol {\omega }_{i,2})}. \tag{18}
\]
Mixture weights show up in all
Uber BRDFs, like the Autodesk Standard Surface, Disney BRDF, and so on, and our mixture decomposition can be applied to all of them.
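A toy 1D sketch of the estimator (Gaussian stand-ins of our own for the two lobes, with p_d and p_s chosen to match them exactly; the function L stands in for incident radiance): since the weights f_d/p_d and f_s/p_s are identically 1, all remaining variance comes from L.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 200_000

# Toy mixture f = k_d f_d + (1 - k_d) f_s with Gaussian "lobes"; the
# mixture-weight derivative is d_{k_d} f = f_d - f_s, a real-valued
# function whose roots we never need.
sd, ss = 1.0, 0.2                            # wide "diffuse", narrow "specular"
L = lambda x: 1.0 + 0.5 * np.cos(x)          # stand-in for incident radiance

# Sample each lobe with its own exact PDF (p_d = f_d, p_s = f_s here), so
# the per-lobe weights are identically 1.
xd = rng.normal(0.0, sd, N)
xs = rng.normal(0.0, ss, N)
est = L(xd).mean() - L(xs).mean()            # estimates the integral of (f_d - f_s) L

# Closed-form reference: E[cos X] = exp(-s^2 / 2) for X ~ N(0, s).
ref = 0.5 * (np.exp(-sd**2 / 2.0) - np.exp(-ss**2 / 2.0))
```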
Mixture decomposition is also applicable to the derivative of BRDFs that are not explicitly mixture models, but internally are made up of different lobes with parametric weights. For example, the Oren-Nayar BRDF is a linear combination of two terms, where the positive weights
\(A(\sigma), B(\sigma)\) depend upon the roughness
\(\sigma\) of the BRDF:
\[
f(\boldsymbol {\omega }_i, \boldsymbol {\omega }_o) = \frac{\rho }{\pi } \left( A(\sigma) + B(\sigma) \max \left(0, \cos \left(\phi _i - \phi _o\right)\right) \sin \alpha \tan \beta \right) \cos \theta _i, \tag{19}
\]
where
\(\alpha =\max \left(\theta _i,\theta _o\right)\) ,
\(\beta =\min \left(\theta _i,\theta _o\right)\) . Once again, since both terms of the BRDF above are positive, the real-valued derivative with
\(\sigma\) is simply the sum of a positive and a negative term,
\[
\partial _\sigma f = \frac{\rho }{\pi } \left( \partial _\sigma A(\sigma) + \partial _\sigma B(\sigma) \max \left(0, \cos \left(\phi _i - \phi _o\right)\right) \sin \alpha \tan \beta \right) \cos \theta _i,
\]
with the sign of the term decided by the sign of
\(\partial _\sigma A\) and
\(\partial _\sigma B\) . Importance sampling the first term is simply cosine-hemispherical sampling, and we provide an importance sampling PDF for the second term in Appendix
A.3.2. Besides Oren-Nayar, the microcylinder BRDF [Sadeghi et al.
2013] is also a mixture model with weights
\(k_d, 1 - k_d\) , where
\(k_d\) is the isotropic scattering coefficient, and can be handled by mixture decomposition as well.
7 Recipe for Importance Sampling BRDF Derivatives
We now present a recipe to importance sample BRDF derivatives based on the key ideas introduced in the previous sections.
Step 1, Positivization. Given a real-valued BRDF derivative
\(\partial _\alpha f\) , check if it can be positivized. For positivization to be applicable,
\(\partial _\alpha f\) should have analytic roots. Compute the normalization constants for the solid angle PDFs
\(p_+(\boldsymbol {\omega }_i) \propto \max \left(\partial _\alpha f(\boldsymbol {\omega }_i), 0\right),\; p_-(\boldsymbol {\omega }_i) \propto -\min \left(\partial _\alpha f(\boldsymbol {\omega }_i), 0\right)\) , and their marginal and conditional counterparts
\(p_+(\phi _i), p_-(\phi _i), p_+(\theta _i|\phi _i), p_-(\theta _i|\phi _i)\) , if they are analytically integrable. See Figure
4 for the PDF generation and Equation (
8) for the estimator.
Step 2, Try Product or Mixture Decomposition. If positivization is inapplicable for either reason (no analytic roots or lack of analytic integrability), then either product or mixture decomposition might still be applicable.
Step 2.1, Product Decomposition. If the original BRDF is of the form
\(N(\alpha)g(\boldsymbol {\omega }_i,\alpha)\) , where
\(\alpha\) appears in a directionally invariant (independent of
\(\boldsymbol {\omega }_i\) ) normalization term
\(N(\alpha)\) and an unnormalized shape function
\(g(\boldsymbol {\omega }_i, \alpha)\) , then product decomposition may be applicable. First check if
\(\partial _\alpha g\) is single-signed, i.e., it has a constant sign for all
\(\boldsymbol {\omega }_i\) , and is analytically integrable. If these conditions hold, then product decomposition is applicable. Construct a PDF
\(p_2(\boldsymbol {\omega }_i) \propto \partial _\alpha g\) and compute the normalization terms for it and its conditional and marginal counterparts. The other PDF
\(p_1(\boldsymbol {\omega }_i) \propto g\) is simply the BRDF sampling PDF. See Figure
7 for the PDF generation and Equation (
16) for the estimator.
Step 2.2, Mixture Decomposition. If instead the parameter
\(\alpha\) appears in the form of linear combination weights either explicitly as a mixture model between two BRDFs, or implicitly as a mixture between two lobes that form a single BRDF, then mixture decomposition is likely applicable here. In this case, simply use the PDFs and sampling strategies most suitable for the two mixture lobes if they are available (e.g., visible normal distribution function sampling for a GGX lobe), or construct PDFs
\(p_1(\boldsymbol {\omega }_i) \propto f_1(\boldsymbol {\omega }_i),\; p_2(\boldsymbol {\omega }_i) \propto f_2(\boldsymbol {\omega }_i)\) for the two lobes, where
\(f_1, f_2\) are the two lobes. See Figure
7 for the PDF generation, and Equation (
18) for the estimator.
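The three steps can be summarized as a small dispatch sketch (the function and flag names are ours, purely illustrative):

```python
def choose_decomposition(has_analytic_roots: bool,
                         integrable_up_to_roots: bool,
                         is_normalization_times_shape: bool,
                         shape_deriv_single_signed_and_integrable: bool,
                         alpha_is_mixture_weight: bool) -> str:
    """Illustrative dispatch mirroring Steps 1, 2.1, and 2.2 of the recipe."""
    # Step 1: positivization needs analytic roots AND integrability up to them.
    if has_analytic_roots and integrable_up_to_roots:
        return "positivization"
    # Step 2.1: product decomposition needs f = N(alpha) * g(w_i, alpha)
    # with a single-signed, analytically integrable d_alpha g.
    if is_normalization_times_shape and shape_deriv_single_signed_and_integrable:
        return "product"
    # Step 2.2: mixture decomposition applies when alpha is a (possibly
    # implicit) linear combination weight between two lobes.
    if alpha_is_mixture_weight:
        return "mixture"
    return "fallback"  # none of the three decompositions applies directly
```

For instance, an anisotropic GGX roughness derivative would fail the Step 1 check but pass Step 2.1, matching the discussion above.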
Figure
8 depicts the estimators for all three of our decompositions for direct illumination. They all require two shadow rays at the shading point, corresponding to the positive and negative lobes of the corresponding decomposition.
Although we have not found examples that require it, our three decompositions can also be interleaved with one another for complicated BRDF derivatives. For example, it is possible that for some BRDF derivatives, the derivative of the shape function from the product rule \(\partial _\alpha g\) could be real-valued. It could then further be positivized to eliminate sign variance.
Forward Rendering Sampling Technique Reuse. Both product and mixture decomposition reuse BRDF sampling developed for forward rendering as one (or both) of the techniques for differential BRDF sampling. For product decomposition, this corresponds to \(p_1 \propto g\) . For mixture decomposition, perfect importance sampling can in some cases be achieved by employing only two standard BRDF sampling techniques from forward rendering. BRDF sampling, when used directly to estimate \(\partial _\alpha f\) , suffers from both sign and shape variance; however, when paired with the right decomposition, it correctly handles the shape variance of one of the terms.
Multiple Importance Sampling. For the product and mixture decompositions, the positive and negative decomposition PDFs can have overlapping support (for positivization they are necessarily non-overlapping). As a result, the samples generated for one decomposition can be shared with the other using Multiple Importance Sampling. Also, all three of our decompositions reduce the variance from the differential BRDF \(\partial _\alpha f\) and can be used in conjunction with light source sampling via Multiple Importance Sampling to reduce the lighting, \(L_i\) ’s variance.
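Sample sharing via the balance heuristic can be sketched with a toy 1D real-valued integrand and two overlapping Gaussian PDFs standing in for the positive- and negative-lobe strategies (all names and densities are our own illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
N = 100_000

# Two overlapping sampling techniques (stand-ins for the positive and
# negative decomposition PDFs): N(0, 1) and N(0, 2).
p1 = lambda x: np.exp(-x**2 / 2.0) / np.sqrt(2.0 * np.pi)
p2 = lambda x: np.exp(-x**2 / 8.0) / np.sqrt(8.0 * np.pi)
h  = lambda x: np.exp(-x**2 / 2.0) * (1.0 - x**2)   # real-valued; true integral 0

x1 = rng.normal(0.0, 1.0, N)
x2 = rng.normal(0.0, 2.0, N)

# One-sample-per-technique MIS with the balance heuristic: each sample's
# weight uses the sum of the PDFs of every technique that could produce it.
F = h(x1) / (p1(x1) + p2(x1)) + h(x2) / (p1(x2) + p2(x2))
est = F.mean()
```

Weighting each sample by the sum of all PDFs that could have generated it keeps the combined estimator unbiased wherever the supports overlap.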
8 Results
We organize our results into two subsections. First, we demonstrate that our decompositions do reduce variance in practice for a number of BRDF derivatives under a wide variety of lighting conditions in Section
8.1. Next, we demonstrate that lower variance in gradients indeed does enable better spatially varying texture recovery in inverse rendering, in Section
8.2.
Implementation Details. We implemented all the different decompositions and BRDFs on our own CPU-based differentiable renderer, using the Embree [Wald et al.
2014] library for ray tracing. At each shading point, all three of our decompositions require two shadow rays; see Figure
8. To have a fair comparison with BRDF sampling, we shoot out two shadow rays at each shading point for it too, which ensures an equal-ray comparison with our method. Since our sampling routines have similar timings to BRDF importance sampling (Table
2), all comparisons are also equal time, and we report the timings for each experiment in insets. All our standard deviation comparison images are computed by taking the square root of the averaged squared error of the gradient images, which were each generated at nine samples per pixel over 50 runs. The variance improvement can be easily computed by squaring all the standard deviation improvement numbers in the insets. Additionally, we also provide difference images to show the specific regions where our estimators outperform BRDF sampling in Figure
29.
Sample Generation Timings. We report the average times (over 1,000,000 runs) to generate samples using our routines and baselines in Table
2. Our routines involve CDF inversion, which we perform using a combination of Newton iterations and bisection search as described by Yuksel et al. [
2022] (maximum 64 iterations,
\(10^{-6}\) CDF error tolerance). Our sampling routines only add a small overhead in the overall path-tracing pipeline; see timings in Figures 10–13 and 18–23.
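A safeguarded Newton/bisection inversion in this spirit might look as follows (a generic sketch using the iteration cap and CDF tolerance quoted above, not Yuksel et al.'s exact routine):

```python
import numpy as np

def invert_cdf(cdf, pdf, u, lo, hi, tol=1e-6, max_iter=64):
    """Invert a monotone CDF on [lo, hi], i.e., solve cdf(x) = u.

    Safeguarded scheme: take a Newton step when it stays inside the current
    bracket, otherwise fall back to bisection. The iteration cap and CDF
    tolerance mirror the values quoted in the text; the routine itself is a
    generic sketch, not the exact implementation used in the paper."""
    x = 0.5 * (lo + hi)
    for _ in range(max_iter):
        err = cdf(x) - u
        if abs(err) < tol:
            break
        if err > 0.0:           # x overshoots the target: shrink from above
            hi = x
        else:                   # x undershoots: shrink from below
            lo = x
        d = pdf(x)              # the CDF's derivative is the PDF
        x_new = x - err / d if d > 0.0 else lo
        if not (lo < x_new < hi):
            x_new = 0.5 * (lo + hi)   # Newton left the bracket: bisect instead
        x = x_new
    return x

# Example test density: p(x) = sin(x)/2 on [0, pi] has CDF(x) = (1 - cos x)/2,
# with the analytic inverse arccos(1 - 2u) available for checking.
cdf = lambda x: 0.5 * (1.0 - np.cos(x))
pdf = lambda x: 0.5 * np.sin(x)
```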
Convergence Rates. Our estimators are intended to be used within inverse rendering pipelines. This necessitates low sample counts per pixel (1 to 16) for fast iteration times. For these sample budgets, our estimators outperform baselines both without stratification and with it (Figure
9, all other results in the article are with stratification).
Without stratification, the estimators show the typical \(\mathcal {O}(N^{-1})\) convergence. Stratification improves the convergence rate. Interestingly, with stratification, standard BRDF importance sampling performs as well as our methods at high sample counts (100+).
Mitchell [
1996] provides a possible explanation: at high sample counts with stratification, variance is dominated by visibility discontinuities in the integrand, so these pixels have convergence rates of
\(\mathcal {O}(N^{-1.5})\) . Better importance sampling improves the convergence rate for the continuous differential BRDF, noticeable at low sample counts. At higher sample counts, visibility dominates the variance, which none of the techniques handle. Also, BRDF sampling (which does not handle sign variance) benefits from stratification at high sample counts, since smaller strata are increasingly single-signed. Nevertheless, our methods are able to provide significant variance reduction (by nearly an order of magnitude) in the low sample regime, most relevant to us. This phenomenon is a general observation for importance sampling, not limited to differentiable rendering.
8.1 Derivative Comparison
8.1.1 Positivization.
First, we compare positivization with BRDF sampling for the derivative of two BRDFs in Figure
1. The scene is lit by two area lights. The isotropic GGX teapot (with
\(\alpha =0.02\) ) is differentiated with its roughness
\(\alpha\) , and the Hanrahan-Krueger (with
\(g=-0.9\) ) lion is differentiated with its Henyey-Greenstein parameter for anisotropy
g. The Henyey-Greenstein phase function at
\(g=-0.9\) is highly back-scattering and is very badly importance sampled by regular BRDF sampling, which cannot correctly account for the highly peaked and signed nature of the derivative. Since positivization is correctly able to handle both sign and shape related variance, we see significant standard deviation reduction of
\(1.96\times\) and
\(58.57\times\) for the teapot and lion, respectively.
8.1.2 Product Decomposition.
Next, we compare product decomposition with BRDF sampling for the derivative of an anisotropic Beckmann BRDF with its roughness
\(\alpha _x\) , lit under constant environment illumination in Figure
10. Positivization (and by extension Zeltner et al.) cannot handle this derivative, see Section
4.2.2, and Zhang et al.’s method fails for even derivatives like this one; see Figure
26. Constant illumination eliminates variance from lighting and only keeps variance from the BRDF derivative and visibility. Since product decomposition can correctly handle both the sign and shape variance of the BRDF derivative, it has an overall
\(8.63\times\) reduction in standard deviation, whereas BRDF sampling fails, because it cannot handle either source of variance. In most regions (Figure
10, see right inset), the derivative of the normal distribution function
\(\partial _\alpha D\) is the major source of BRDF derivative variance; we eliminate it and see a big improvement of
\(32.7\times\) . However, in the grazing angle regions (Figure
10, see left inset), the derivative of the shadowing function
\(\partial _\alpha G\) dominates. Here, our improvement is still significant (
\(7.45\times\) ), but relatively less pronounced, since our sampling strategy minimizes
\(\partial _\alpha D\) ’s variance.
Now, we change the lighting to realistic environment illumination (see insets for forward renderings), and the geometry to objects with intricate detail in Figures
11–13; we estimate the derivative of the anisotropic Beckmann, GGX, and Ashikhmin-Shirley BRDFs in these experiments, respectively. Apart from BRDF derivative variance, these scenes have two other major sources of variance, lighting and visibility. When the variance is significant from other sources too, we have found that sharing samples between the positive and negative decomposition is beneficial, see Section
7,
Multiple Importance Sampling (MIS). In all three experiments, our product decomposition outperforms BRDF importance sampling, achieving lower standard deviation in gradient estimation by
\(2.09\times , 1.31\times\) , and
\(1.21\times\) , respectively.
We show two more examples of product decomposition in Figure
1, for anisotropic GGX and Beckmann BRDF derivatives, which achieve standard deviation reductions of
\(1.56\times\) and
\(3.61\times\) , respectively. The insets in the top row of Figure
1 show the regions where our decomposition has lower variance than BRDF sampling in blue. Product decomposition outperforms BRDF sampling in almost all regions.
We provide additional comparisons with Zhang et al. for the anisotropic roughness derivatives of the GGX BRDF for the scenes modelled in Figures
10 and
12, in Figure
26; their estimator has higher standard deviation than BRDF importance sampling. We emphasize that this behavior is expected, since they designed their estimator to estimate odd derivatives (for example, the derivative with the half vector), but the roughness derivative is even.
8.1.3 Mixture Decomposition.
Finally, we compare BRDF sampling with Mixture Decomposition to estimate the derivative of a mixture model with its mixture weight for the fish-shaped pot in Figure
1. The mixture model is a linear combination of a Lambertian diffuse lobe, and a GGX specular lobe and the lighting is two area lights. Mixture decomposition can reduce the standard deviation by
\(4.72\times\) , because it correctly handles shape and sign variance, unlike BRDF sampling.
Figure
1 also shows an example of an Oren-Nayar pot, and its derivative with the roughness
\(\sigma\) . BRDF sampling here is simply cosine hemispherical sampling, and works quite well in the central regions of the pot, because the cosine lobe is dominant in the non-grazing angle regions; see Equation (
19). However, in the grazing angle regions towards the edges of the pot where the correction term is more dominant, BRDF sampling breaks down and has high variance. In contrast, our mixture decomposition with MIS correctly accounts for the derivative of both terms with regard to their sign and shape variance, achieves low variance in
all regions of the pot, and leads to a
\(3.91\times\) reduction in standard deviation.
8.2 Inverse Rendering Comparison
We demonstrate the benefits of correctly handling sign variance in gradients, for gradient-descent-based inverse rendering. We apply inverse rendering to the task of spatially varying texture recovery, and evaluate the effectiveness of all three of our decompositions on it. Our results for positivization are presented in Figure
14, product decomposition in Figure
15, and mixture decomposition in Figure
16. All our inverse rendering results use 4 samples per pixel for both forward and gradient rendering at each optimization iteration. We use the ADAM optimizer [Kingma and Ba
2015] and the respective loss graphs show the mean absolute texture recovery error (
L1) after some initial iterations. In Figure
14, we show that using the ADAM optimizer instead of
SGD with momentum (SGDm) reduces the impact of gradient outliers, similar to Nimier-David et al.’s observation [
2022]. Nonetheless, our estimators outperform the baselines in all setups.
For positivization (Figure
14), we recover the (spatially varying) scattering parameter
g of a Hanrahan-Krueger BRDF with the semi-infinite depth assumption, lit by a single area light. The ground-truth texture consists of a slightly back-scattering background region with
\(g = -0.3\) , and a highly back-scattering logo region with
\(g = -0.9\) ; see Figure
14(c). We initialized the scattering parameter
g with a random negative number. Positivization consistently has lower texture recovery error compared to BRDF sampling, especially in the highly back-scattering region.
For product decomposition (Figure
15), we optimize the spatially varying anisotropic roughness textures (
\(\alpha _x\) and
\(\alpha _y\) ) of a Beckmann BRDF under a photometric stereo setup under two illumination conditions. The two lighting conditions are rotated versions of the same environment map. Starting from a random initialization for both textures, product decomposition’s correct handling of the sign variance leads to a gradient estimator with lower overall variance, and consequently ensures lower texture recovery error across all iterations, as shown in Figure
15(b). The final recovery is displayed in Figures 15(c) and 15(d).
Our product decomposition computes the gradients for both roughness values using three samples at each shading point combined using multiple importance sampling (one each from \(p_1\) , \(p_{2,x}\) , \(p_{2,y}\) ). To ensure an equal-ray budget, we use three samples for BRDF sampling at each shading point too.
For mixture decomposition in Figure
16, we recover the spatially varying roughness of an Oren-Nayar BRDF under environment map illumination. Once again, mixture decomposition benefits from lowered variance in gradients and can recover a texture with lower error than BRDF sampling at an equal ray-triangle intersection budget; see Figure
16(b).
9 Global Illumination
We now describe how to importance sample BRDF derivatives under multiple bounce global illumination. The recursive rendering equation [Kajiya
1986] (ignoring emission) is given by a generalization of Equation (
1),
where we have substituted the incoming radiance
\(L_i(\boldsymbol {y}, \boldsymbol {\omega }_i)\) , with the outgoing/reflected radiance
\(L_r(\boldsymbol {z}, -\boldsymbol {\omega }_i; \alpha)\) , and
\(\boldsymbol {z}= \text{rayTrace}(\boldsymbol {y}, \boldsymbol {\omega }_i)\) is the first intersection point from
\(\boldsymbol {y}\) in the direction
\(\boldsymbol {\omega }_i\) . The recursive call of
\(L_r\) is a function of the BRDF parameter
\(\alpha\) , because upon unrolling the recursion, it may be a function of an
\(\alpha\)-dependent BRDF. Differentiating this expression, we get
which recursively describes how differential radiance is reflected. The two integrals (Equations (
22) and (
23)) can be importance sampled separately. We have seen how to importance sample Equation (
22) by applying different BRDF derivative decompositions in Sections
4.2,
5, and
6. Irrespective of the decomposition chosen, this requires two evaluations of
\(L_i\) corresponding to the positive and negative lobes and is done by regular path tracing (similar to the standard splitting approach [Arvo and Kirk
1990]). To importance sample Equation (
23), we follow standard BRDF sampling and continue the same recursive importance sampling of
\(\partial _\alpha L_r\) at the next shading point.
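For reference, the recursion and its derivative described above can be sketched as follows (a reconstruction from the surrounding text, with cosine/foreshortening factors folded into \(f\) for brevity; the numbering follows the original Equations (22) and (23)):

```latex
L_r(\boldsymbol{y}, \boldsymbol{\omega}_o; \alpha)
  = \int f(\boldsymbol{y}, \boldsymbol{\omega}_i, \boldsymbol{\omega}_o; \alpha)\,
         L_r(\boldsymbol{z}, -\boldsymbol{\omega}_i; \alpha)\,
         \mathrm{d}\boldsymbol{\omega}_i,
\qquad
\partial_\alpha L_r
  = \underbrace{\int \partial_\alpha f \cdot L_r\,
        \mathrm{d}\boldsymbol{\omega}_i}_{\text{Equation (22)}}
  \;+\;
    \underbrace{\int f \cdot \partial_\alpha L_r\,
        \mathrm{d}\boldsymbol{\omega}_i}_{\text{Equation (23)}}.
```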
This means that we need three samples at each shading point, one each for BRDF, positive lobe and negative lobe importance sampling. Fortunately, for product and mixture decomposition, we can reduce this to two samples at each shading point. For product decomposition, as we saw in Section
5, one of the positive and negative lobe decomposition PDFs is identical to the BRDF sampling PDF, and can share a sample with it. For mixture decomposition, BRDF sampling can be simulated by randomly choosing a sample from either the positive or the negative lobe, with probability equal to that lobe's mixture weight in the BRDF sampling PDF.
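This lobe-sharing trick can be sketched as follows (a minimal sketch under the assumption that the BRDF sampling PDF is the mixture \(w\,p_{+} + (1-w)\,p_{-}\); the sampler names are placeholders):

```python
import random

def sample_brdf_via_lobes(sample_pos, sample_neg, w_pos):
    """Simulate a draw from p_brdf = w_pos * p_pos + (1 - w_pos) * p_neg
    by picking one lobe with probability equal to its mixture weight.
    The chosen lobe's sample then does double duty for BRDF sampling."""
    if random.random() < w_pos:
        return sample_pos()
    return sample_neg()
```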
Branching Complexity and Comparison with BRDF sampling. Even though we use two samples to estimate Equation (
22), the total number of rays required to estimate
\(\partial _\alpha L_r\) for a maximum depth
d is quadratic, i.e.,
\(O(d^2)\), instead of exponential (see Figure 17), whereas it is
\(O(d)\) for BRDF sampling. This is because we apply splitting only when estimating Equation (22): its recursive calls of \(L_r\) are ordinary radiance estimates that are path traced without further splitting, and we do not split when estimating Equation (23). Hence no vertex branches more than once, which prevents exponential branching.
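The quadratic growth can be checked with a toy ray-count model (our own back-of-the-envelope sketch, matching the splitting scheme described above):

```python
def radiance_rays(depth):
    # Regular path tracing of L_r: one ray per bounce, no branching.
    return depth

def derivative_rays(depth):
    """Rays to estimate the derivative for a remaining depth `depth`:
    one ray at this vertex, two non-branching radiance subpaths for
    Equation (22), and one recursive derivative continuation for
    Equation (23)."""
    if depth == 0:
        return 0
    return 1 + 2 * radiance_rays(depth - 1) + derivative_rays(depth - 1)
```

Under this model, derivative_rays(d) equals \(d^2\) exactly, versus \(d\) rays for BRDF sampling.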
Variance Reduction under Global Illumination. All of our results use an equal-ray budget to ensure fair comparisons. We also report wall-clock times that reflect this.
We have found that for one-bounce global illumination, our mixture decomposition can reduce standard deviation by
\(1.55\times\) ; see Figure
18. Next, we show results for gradient estimation in two typical global illumination setups, glossy reflections and caustics. In Figure
19, we compute the derivative of a glossy reflection with respect to the mixture weight of the teapot that produces the reflection. Once again, our estimator produces gradients with much lower standard deviation,
\(6.23\times\) in this setting due to our better importance sampling strategy (mixture decomposition) at the second bounce on the teapot. In Figure
20, we compute the derivative of a caustic pattern formed on a diffuse surface with respect to the isotropic roughness of a metallic cylinder that produces the caustic. This setup uses the positivization estimator with a higher branching factor; nonetheless, it is still able to significantly reduce standard deviation (by
\(2.51\times\)) due to correct handling of the BRDF derivative at the second bounce on the cylinder's surface.
We also demonstrate that our estimators work well even in the presence of more detailed geometry and complicated visibility in Figure
21. We estimate the derivative with respect to the mixture weight under both direct and one-bounce global illumination. Our mixture decomposition reduces the standard deviation of the estimated gradients by
\(2.35\times\) and
\(2.27\times\) under the two settings, respectively, as compared to BRDF importance sampling.
Finally, we show the effectiveness of the product decomposition under one bounce global illumination with complex visibility, glossy reflections and other effects intertwined in a single scene in Figures
22 and
23. In Figure
22, we estimate the anisotropic roughness derivative of a chess set modelled by a Beckmann BRDF, which includes effects such as inter-occlusion among the pieces and glossy reflections on the chessboard. Product decomposition reduces the standard deviation by
\(1.78\times\) on this scene. In Figure
23, we consider the case of very complex visibility. We estimate the anisotropic roughness derivative of a metallic hairball. Even under this challenging condition, we are able to better handle the sign and shape variance of the differential BRDF, resulting in a
\(1.52\times\) reduction in standard deviation over BRDF sampling. We provide difference images, which show the specific regions of improvement for our techniques over BRDF sampling in Figure
29. Additionally, we show the effect of longer path lengths in Table
3.
Inverse rendering under global illumination. Positivization has the highest branching factor of our three decompositions. Applying it at a path vertex that requires a derivative splits the original path into three subpaths, one of which requires recursive derivative computation. It is conceivable that this added computation may worsen positivization's performance compared to BRDF sampling, since BRDF sampling does not induce any branching. Experimentally, we have found that, despite the additional branching, positivization reduces variance and better recovers textures under a single bounce of global illumination compared to BRDF importance sampling at an equal computation budget; see Figure
24. In this setup, we optimize the GGX roughness of a plate initialized uniformly with
\(\alpha =0.05\) to match a target with a checkerboard pattern with
\(\alpha =0.005,0.02\) . Light is incident upon the plate
only indirectly via a reflection off a reflector, also modelled as a GGX plate with a fixed roughness of
\(\alpha =0.01\) ; see Figure
24(b). The reduced variance due to better handling of the sign and shape variance by positivization leads to a better recovery of the texture. This is despite using the ADAM optimizer, which ameliorates the issues of higher variance estimators [Nimier-David et al.
2022].
Our next inverse rendering experiment under global illumination recovers the anisotropic roughness of a GGX BRDF; see Figure
25. The scene consists of two dragons (
\(\alpha _x=0.005,\alpha _y=0.007\) ) only viewed indirectly through a glossy reflection. The initialization starts with
\(\alpha _x=0.5,\alpha _y=0.007\) , and we optimize for the correct
\(\alpha _x\) value. Our product decomposition better handles the variance of the differential BRDF at the second bounce (on the dragons' surface), which leads to faster convergence.
10 Limitations and Future Work
Determining the number of samples for each decomposed component. For all three decompositions, our current implementation applies a two-sample estimator, which uses one sample per component. It is possible that a different estimator can be more efficient in some cases. For example, when the two components have different areas (i.e.,
\(\int \partial _\alpha f_1 \ne \int \partial _\alpha f_2\) for components
\(f_1\) and
\(f_2\) ), it might be useful to adjust the number of samples according to the area of the component (we show in Appendix
C that microfacet normal distribution functions always have components with equal area). Research in allocating budgets for multiple importance sampling can likely help in our case as well [He and Owen
2014; Sbert et al.
2018; Grittmann et al.
2022]. Our estimator that always samples all components belongs to the
deterministic mixture scheme [Owen
2013]. An alternative is a
random mixture, which randomly chooses one component. We opt for deterministic mixtures, since they consistently outperform random mixtures in our direct lighting experiments (due to the stratification effect, similar to standard MIS versus
one-sample MIS). For global illumination, random mixtures are equivalent to applying Russian roulette to keep only one of the two branches, and can be more computationally convenient in some cases, since they avoid quadratic branching.
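The stratification effect can be seen on a toy 1D example (our own construction, not from the renderer): take the signed function \(g(x)=\cos(\pi x)\) on \([0,1]\), whose positive and negative parts each have area \(1/\pi\) and can be sampled exactly, so all remaining variance comes from how the two components are combined:

```python
import math
import random

I_LOBE = 1.0 / math.pi  # area of each single-signed component of g

def g(x):
    return math.cos(math.pi * x)

def p_lobe(x):
    # PDF proportional to |g| on its lobe, normalized by the lobe area.
    return abs(g(x)) / I_LOBE

def sample_pos():  # x ~ p_lobe on [0, 1/2] via the inverse CDF of sin(pi x)
    return math.asin(random.random()) / math.pi

def sample_neg():  # mirrored sampler on [1/2, 1]
    return 1.0 - math.asin(random.random()) / math.pi

def deterministic_mixture():
    # One sample from each component; the signed contributions cancel exactly.
    xp, xn = sample_pos(), sample_neg()
    return g(xp) / p_lobe(xp) + g(xn) / p_lobe(xn)

def random_mixture():
    # Keep one component with probability 1/2 and reweight (one-sample MIS).
    if random.random() < 0.5:
        x = sample_pos()
    else:
        x = sample_neg()
    return g(x) / p_lobe(x) / 0.5
```

Both estimators are unbiased for \(\int_0^1 g = 0\), but the deterministic mixture has zero variance on this example while the random mixture does not.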
Branching and Global Illumination. Our adoption of deterministic mixtures requires path splitting for global illumination. While the branching complexity is quadratic instead of exponential (same as a bidirectional path tracer), it can add undesired overheads. There are several ways to reduce the branching, including (1) deterministically using only BRDF sampling, (2) using random mixtures instead of deterministic mixtures after a certain recursion depth, or (3) using path reconnection similar to Zhang et al.’s approach [
2020], to reconnect the branches back to a single primary path. Figuring out an effective strategy to deal with branching is crucial for applying our method to estimate derivatives in volumetric rendering and subsurface scattering with long paths; this is an exciting avenue for future research.
Multiple Parameter Optimization. Prior work [Zeltner et al.
2021] and ours only deal with estimating the derivative of a single material parameter (such as roughness, mixture weights, etc.) and constructs specialized sampling schemes for each of them. However, in inverse rendering, we are often interested in recovering several parameters simultaneously. For example, Uber shaders [Georgiev et al.
2019; Burley
2012] have a large number of tunable parameters. Naively applying our decompositions for each of these parameters independently can lead to very large branching factors. Developing sampling strategies that efficiently estimate the derivatives of all the parameters simultaneously is still an open research problem.
Unsupported BRDFs. Our work covers a large variety of popular analytic surface reflection models; see Table
1. However, our method currently does not support data driven BRDFs or Neural BRDFs. In both cases, there are a very large number of tunable parameters, and it is unclear as to which parameters one might want to differentiate or importance sample.
Better Optimization Schemes. Ultimately, for inverse rendering, the optimization is both ill-posed and non-convex. Recently, we have seen some work [Xing et al.
2022] that takes a step toward better optimization schemes. We believe the study of efficient derivative estimators is largely orthogonal and equally crucial.