1 Introduction
Bidirectional Reflectance Distribution Function (BRDF) importance sampling is an essential variance reduction technique for Monte Carlo forward rendering. However, there is no simple counterpart for differentiable rendering. Taking the derivative of a BRDF with respect to one of its parameters transforms it into a real-valued differential BRDF. The differential BRDF can have a very different shape from the BRDF, and can also take negative values. Previous attempts at tackling this problem [Zeltner et al.
2021] are limited to the roughness derivatives of isotropic
GGX (Trowbridge-Reitz) and Beckmann BRDFs, and cannot handle even their anisotropic counterparts. Another method [Zhang et al.
2021a] was developed primarily for odd functions with symmetric positive and negative lobes, and can produce substantially higher variance when the derivative is close to an even function. We propose effective importance sampling of derivatives of not only anisotropic GGX and Beckmann BRDFs but also a wide variety of other analytic BRDF models like Ashikhmin-Shirley, Oren-Nayar, Hanrahan-Krueger, Mixture BRDFs, and ABC models. Figure
1 demonstrates the benefits of our method on several BRDFs compared to BRDF sampling.
Importance sampling a real-valued function leads to unique challenges. Its variance has two sources, (a) its sign and (b) its shape. Our idea is to decompose the function into a sum of single-signed functions, which we call single-signed decompositions. A single-signed function is either positive everywhere or negative everywhere in its domain. Single-signed functions, by definition, have no sign variance. Importance sampling these functions eliminates their shape variance.
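To make the two variance sources concrete, here is a minimal Python sketch on a 1D toy function of our own choosing (sin on [0, 2π], not one of the BRDF derivatives discussed later): a PDF proportional to the absolute value of the integrand removes shape variance but leaves sign variance, while a single-signed decomposition removes both.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10_000

# Toy 1D stand-in for a real-valued "differential BRDF":
# h(x) = sin(x) on [0, 2*pi], whose true integral is 0.
h = np.sin

# Shape-matched sampling: p ∝ |h|, i.e., p(x) = |sin(x)| / 4 on [0, 2*pi].
# Draw from sin(x)/2 on [0, pi] via inverse CDF, then shift by pi half the time.
x = np.arccos(1.0 - 2.0 * rng.random(N))
x = np.where(rng.random(N) < 0.5, x + np.pi, x)
w_detached = h(x) / (np.abs(np.sin(x)) / 4.0)      # weights flip between +4 and -4

# Single-signed decomposition: h = h_+ + h_- with supports [0, pi] and [pi, 2*pi].
# Each part is sampled with a PDF exactly proportional to it.
xp = np.arccos(1.0 - 2.0 * rng.random(N))          # p_+(x) = sin(x)/2 on [0, pi]
w_plus = h(xp) / (np.sin(xp) / 2.0)                # constant +2: zero variance
xm = np.arccos(1.0 - 2.0 * rng.random(N)) + np.pi  # p_-(x) = -sin(x)/2 on [pi, 2*pi]
w_minus = h(xm) / (-np.sin(xm) / 2.0)              # constant -2: zero variance

est_detached    = w_detached.mean()                # noisy estimate of 0
est_positivized = w_plus.mean() + w_minus.mean()   # exactly 2 + (-2) = 0
```

Here the weight of each single-signed part is constant, so each part is estimated with zero variance; the shape-matched weights only average out to the correct value.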
A classical strategy, positivization [Owen and Zhou
2000], is a special case of our single-signed decomposition. It has positive and negative parts with
non-overlapping support, which in turn requires (a) analytic knowledge of the roots and (b) analytic integrability of the BRDF derivative up to the roots, which is possible only for certain BRDF derivatives. To sidestep these issues due to a partition of the domain, we introduce the product and mixture decompositions for which
we allow the positive and negative parts to overlap. In fact, we ensure that both the positive and the negative parts have support over the entire hemisphere. This enables analytic integrability and significantly expands upon the set of BRDF derivatives we can handle.
Our main contributions are three single-signed decompositions and the corresponding importance sampling PDFs of a large set of BRDF derivatives; see Table 1.

Positivization. First, we introduce a simple decomposition called positivization (Section
4.2), which
partitions a real-valued function about its roots into a positive and a negative function. We show that Zeltner et al.’s [
2021] antithetic sampling is a special case of positivization, and positivization explains the efficiency of their approach. When applicable, positivization leads to significant variance reduction. For example, it can be applied to sample the isotropic GGX, Beckmann, and Hanrahan-Krueger BRDF derivatives. However, other derivatives, such as those of the anisotropic GGX, Beckmann, and Ashikhmin-Shirley BRDFs (Section
4.2.2), are not analytically integrable up to their roots, and the derivatives with respect to mixture weights (Section
6) do not have analytic roots. Positivization cannot handle these derivatives. Zeltner et al.’s antithetic sampling inherits these limitations too.
Product Decomposition. Second, we propose a
novel product decomposition (Section
5). Our key observation is that after differentiation, many BRDF derivatives can be decomposed into single-signed functions by
separating the terms that result from the derivative product rule. Product decomposition does
not require knowledge of the roots for the decomposition and only requires the resulting single-signed functions to be analytically integrable. Product decomposition can importance sample the derivatives of anisotropic GGX, Beckmann, Ashikhmin-Shirley, and more.
Mixture Decomposition. Finally, we introduce mixture decomposition (Section
6). Derivatives of BRDFs with linear combination coefficients, e.g., mixture weights of a layered BRDF, result in real-valued functions whose roots cannot be found analytically in most cases. Our mixture decomposition exploits the fact that this derivative is the difference between two positive-valued terms. Separating them results in a single-signed decomposition, and the two terms can then be importance sampled separately. Mixture decomposition handles the derivatives of Oren-Nayar and mixture weights of
Uber BRDFs such as the Disney BRDF or Autodesk Standard Surface.
It is likely that several other BRDF derivatives not surveyed in this article can also be dealt with by one of our three decompositions, and we provide a recipe for handling them in Section
7. We provide a library of importance sampling PDFs for the derivatives of all the BRDF models discussed in this work in Table
1.
3 Background
For the sake of simplicity, we begin our discussion by focusing on the direct lighting setting, and extend it to indirect lighting in Section
9. The reflected radiance
\(L_r\) , at a shading point
\(\boldsymbol {y}\) , in the direction
\(\boldsymbol {\omega }_o\) , is given by the reflection equation [Cohen and Wallace
1993],
\[
L_r(\boldsymbol {y}, \boldsymbol {\omega }_o; \alpha) = \int _{\mathcal {H}^2} L_i(\boldsymbol {y}, \boldsymbol {\omega }_i)\, f(\boldsymbol {y}, \boldsymbol {\omega }_i, \boldsymbol {\omega }_o, \alpha)\, \mathrm {d}\boldsymbol {\omega }_i. \tag{1}
\]
Here,
f is the cosine-weighted BRDF at
\(\boldsymbol {y}\) , and
\(\alpha\) is a scalar BRDF parameter that controls
f. In practice,
\(\alpha\) is the vector of all BRDF parameters in a given scene. However, for ease of exposition, we assume
\(\alpha\) is scalar-valued, with the results for the other parameters following similarly. For example,
\(\alpha\) could be the roughness of an isotropic GGX BRDF. Since we are dealing with only direct lighting, the incident radiance
\(L_i\) does not depend upon
\(\alpha\) . Differentiating the expression for the reflected radiance with
\(\alpha\) , we get
\[
\partial _\alpha L_r(\boldsymbol {y}, \boldsymbol {\omega }_o; \alpha) = \int _{\mathcal {H}^2} L_i(\boldsymbol {y}, \boldsymbol {\omega }_i)\, \partial _\alpha f(\boldsymbol {y}, \boldsymbol {\omega }_i, \boldsymbol {\omega }_o, \alpha)\, \mathrm {d}\boldsymbol {\omega }_i. \tag{2}
\]
Since the parameter
\(\alpha\) does not affect discontinuities, we do not have an additional boundary integral [Li et al.
2018].
In forward rendering, BRDF sampling aims to minimize the variance of the BRDF
f in the reflection equation, Equation (
1). Similarly, our goal is to
minimize the variance of the differential BRDF \(\partial _\alpha f\) in differentiable rendering, i.e., the variance of the estimator for
\[
I(\boldsymbol {\omega }_o, \alpha) = \int _{\mathcal {H}^2} \partial _\alpha f(\boldsymbol {\omega }_i, \boldsymbol {\omega }_o, \alpha)\, \mathrm {d}\boldsymbol {\omega }_i. \tag{3}
\]
We drop the spatial coordinate
\(\boldsymbol {y}\) , without loss of generality, for simplicity. We deal with the incident radiance
\(L_i\) using light source sampling. The estimators for
\(\partial _\alpha f\) and
\(L_i\) can be combined using Multiple Importance Sampling [Veach and Guibas
1995]. We finally want to compute
\(\partial _\alpha L_r\) so the final estimator must always include multiplication by
\(L_i\) . All following integrals in the article omit
\(L_i\) to stress that we only focus on
\(\partial _\alpha f\) .
3.1 Previous Work on Variance Reduction for Differentiable Rendering
3.1.1 Detached and Antithetic Sampling.
Zeltner et al. [
2021] noticed that standard BRDF sampling using a PDF
\(p \propto f\) for the differential BRDF
\(\partial _\alpha f\) leads to high variance, since
\(\partial _\alpha f\) and
f can be very different functions. They instead construct a PDF
\(p \propto |\partial _\alpha f|\) , called the differential detached PDF, which matches
\(\partial _\alpha f\) in shape. This eliminates variance from the shape of
\(\partial _\alpha f\) , i.e., the sample weights
\(\partial _\alpha f / p\) are constant in magnitude. There is, however, additional
sign variance resulting from the mismatch in the sign between the positive-valued
p and the real-valued integrand
\(\partial _\alpha f\) , which results in sample weights
\(\partial _\alpha f / p\) that change sign.
To deal with sign variance, Zeltner et al. [
2021] applied antithetic sampling. We show that Zeltner et al.’s method is a special case of another technique called positivization [Owen and Zhou
2000].
We show in Section
4.2 and Appendix
B that positivization provides a theoretical grounding of antithetic sampling: the effectiveness mainly comes from the stratification (separating the real-valued function into a positive and a negative function). The
major drawback of antithetic sampling is its inapplicability to several BRDF derivatives, due to the lack of closed forms of root finding and integration, which we discuss in Section
4.2.2.
Positivization has also been applied by concurrent work in differentiable rendering to
the target function q in
resampled importance sampling (RIS) [Talbot et al.
2005] by Chang et al. [
2023] (see Equation (11) in their paper). Our work is orthogonal to their method: They use BRDF importance sampling as their
candidate distribution p. The variance of RIS is a linear combination of
\(\text{Var}(f/p)\) and
\(\text{Var}(f/q)\) (see Equation 4.2 in Talbot et al. [
2005]’s work); positivized RIS reduces the variance of the second term by positivizing the target function
q into
\(q_+,q_-\) . Our method of positivizing the differential BRDF (using
\(p_+,p_-\) instead of
p) reduces the variance of the first term.
3.1.2 Antithetic Sampling of Odd Derivatives.
Zhang et al. [
2021a] introduce another antithetic-sampling-based method to deal with the derivative of the GGX Normal Distribution Function,
\(D(\boldsymbol {\omega }_h)\) with the half vector
\(\boldsymbol {\omega }_h\) . They exploit the fact that the derivative
\(\partial _{\boldsymbol {\omega }_h} D(\boldsymbol {\omega }_h)\) is odd about the local shading normal \(\boldsymbol {n}\) , i.e.,
\[
\partial _{\boldsymbol {\omega }_h} D(\boldsymbol {\omega }_h) = -\,\partial _{\boldsymbol {\omega }_h} D(\boldsymbol {\omega }_h^*), \qquad \boldsymbol {\omega }_h^* = 2 \left(\boldsymbol {\omega }_h \cdot \boldsymbol {n}\right) \boldsymbol {n}- \boldsymbol {\omega }_h. \tag{4}
\]
Their estimator for Equation (
3) requires two antithetic samples
\(\omega _{i,1}\) and
\(\omega _{i,2}\) , and is given by
\[
\langle I \rangle = \frac{\partial _\alpha f(\boldsymbol {\omega }_{i,1}) + \partial _\alpha f(\boldsymbol {\omega }_{i,2})}{p(\boldsymbol {\omega }_{i,1}) + p(\boldsymbol {\omega }_{i,2})}. \tag{5}
\]
Here, and going forward, we drop
\(\boldsymbol {\omega }_o\) and
\(\alpha\) from the function arguments of
\(I(\boldsymbol {\omega }_o, \alpha)\) and
\(f(\boldsymbol {\omega }_i, \boldsymbol {\omega }_o, \alpha)\) for brevity. This method works well for the odd derivative with
\(\boldsymbol {\omega }_h\) . However, for non-odd derivatives, there are no variance reduction guarantees. Furthermore, several BRDF derivatives are even, e.g., the roughness derivatives of GGX and Beckmann, and Zhang et al.’s method increases variance in these cases.
Additionally, Equation (
5) is not in the standard importance sampling form of
\(\partial _\alpha f / p\) due to the presence of a sum in the numerator and denominator. Hence, it is unclear how to use it in conjunction with multiple importance sampling.
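The role of oddness can be seen in a small numerical sketch (1D stand-ins of our own, not Zhang et al.'s actual estimator): an antithetic pair (x, −x) cancels an odd integrand exactly, but for an even integrand the two evaluations are perfectly correlated and carry no more information than a single sample.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 50_000
x = rng.standard_normal(N)

odd  = lambda t: t**3   # odd integrand: true integral over N(0, 1) is 0
even = lambda t: t**2   # even integrand: true integral is 1

# Antithetic pair (x, -x); the pair estimator averages both evaluations.
anti_odd  = 0.5 * (odd(x)  + odd(-x))    # identically zero: perfect cancellation
anti_even = 0.5 * (even(x) + even(-x))   # equals even(x): no cancellation at all

# Baseline: two *independent* samples per estimate, same total sample budget.
indep_even = 0.5 * (even(x[:N // 2]) + even(x[N // 2:]))
```

For the even integrand, the antithetic pair has roughly twice the variance of two independent samples, which matches the variance increase reported above.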
5 Product Decomposition
Our first new decomposition is product decomposition. It can handle the derivatives of anisotropic microfacet BRDFs, diffuse BSSRDFs, and the isotropic ABC BRDF that positivization could not handle. The key idea we exploit is that after differentiating any of these materials, they split up into two terms following the product rule. Both of these are single-signed, have no sign variance, and are analytically integrable over their simple domains of integration (hemisphere or plane).
Several BRDFs (or normal distribution functions) are of the form
\[
f(\boldsymbol {\omega }_h, \alpha) = N(\alpha)\, g(\boldsymbol {\omega }_h, \alpha),
\]
where
\(g(\boldsymbol {\omega }_h, \alpha)\) is a non-negative shape function, which determines the overall shape of the BRDF over all
\(\boldsymbol {\omega }_h\) , at the parameter value
\(\alpha\) .
\(N(\alpha)\) is a directionally constant (independent of
\(\boldsymbol {\omega }_h\) ) normalization term that ensures
f integrates to 1. Differentiating
f with
\(\alpha\) , we get
\[
\partial _\alpha f = \partial _\alpha N(\alpha)\, g(\boldsymbol {\omega }_h, \alpha) + N(\alpha)\, \partial _\alpha g(\boldsymbol {\omega }_h, \alpha).
\]
Because N and \(\partial _\alpha N\) are directionally constant, the variance in the two terms above comes from g and \(\partial _\alpha g\) , respectively. The first term above is single-signed, because
\(g \ge 0\) . The second term with
\(\partial _\alpha g\) can potentially be real-valued. However, we have found it to be single-signed for several common BRDFs. For example, for the anisotropic GGX normal distribution function
\(D(\boldsymbol {\omega }_h)\) and its roughness \(\alpha _x\) , we have
\[
D(\boldsymbol {\omega }_h) = \underbrace{\frac{1}{\pi \alpha _x \alpha _y}}_{N} \, \underbrace{\left[ \cos ^4\theta _h \left(1 + \tan ^2\theta _h \left(\frac{\cos ^2\phi _h}{\alpha _x^2} + \frac{\sin ^2\phi _h}{\alpha _y^2}\right) \right)^{2} \right]^{-1}}_{g},
\]
where
\(\partial _{\alpha _x} g\) is single-signed; see Figure
5. Additionally,
\(\partial _\alpha g\) is also analytically integrable over its hemispherical domain.
Let us provide some geometric intuition for why the shape derivative \(\partial _\alpha g\) is often single-signed. For our BRDFs, the parameter \(\alpha\) often controls the variance of the distribution, e.g., \(\alpha _x, \alpha _y\) for GGX and Beckmann, or \(n_u, n_v\) for Ashikhmin-Shirley. For all of these, increasing \(\alpha\) stretches g horizontally, increasing (or decreasing) its value at all locations and making its derivative single-signed. At the same time, \(\alpha\) scales \(N(\alpha)\) vertically to compensate for the change in area under g, ensuring that f still integrates to 1.
We construct importance sampling PDFs for the two single-signed terms separately, with PDFs
\(p_1 \propto g\) and
\(p_2 \propto \partial _{\alpha }g\) ,
\[
\langle I \rangle = \frac{\partial _\alpha N(\alpha)\, g(\boldsymbol {\omega }_{i,1}, \alpha)}{p_1(\boldsymbol {\omega }_{i,1})} + \frac{N(\alpha)\, \partial _\alpha g(\boldsymbol {\omega }_{i,2}, \alpha)}{p_2(\boldsymbol {\omega }_{i,2})}. \tag{16}
\]
Figure
7(a) describes the pipeline to generate importance sampling PDFs for product decomposition. Product decomposition can handle the derivatives of anisotropic GGX, Beckmann, Ashikhmin-Shirley, which are not analytically integrable over the positivized domains, and Burley’s diffuse BSSRDF and the isotropic ABC BRDF, which have no closed-form solution for the roots. However, they all have single-signed
\(\partial _\alpha g\) , which is analytically integrable.
Note that the product rule in and of itself does not guarantee a single-signed decomposition. For example, the product of the microfacet distribution (D) and geometric terms (G) does not lead to a single-signed decomposition for the derivative with \(\alpha _x\) (or \(\alpha _y\) ). This is because both \(\partial _{\alpha _x} D\) and the \(\partial _{\alpha _x} G\) terms are real-valued. The decomposition \(D = N g\) is one of the many product decompositions, but the only one we found to preserve the single-signed property.
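A hypothetical 1D analogue illustrates the mechanics (a Gaussian-shaped shape function of our own, not one of the BRDFs above): with g(x, α) = exp(−x²/α²) and N(α) = 1/(α√π), the shape derivative ∂_α g = (2x²/α³) exp(−x²/α²) is non-negative, and sampling each product-rule term with a matching PDF leaves only constant weights.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, N_samp = 0.7, 10_000

# f(x, a) = N(a) g(x, a) with g = exp(-x^2 / a^2), N = 1/(a sqrt(pi)), so that
# f integrates to 1 for every a. The product rule splits the derivative:
#   d_a f = (d_a N) g + N (d_a g),  with d_a g = (2 x^2 / a^3) exp(-x^2/a^2) >= 0.
Na  = 1.0 / (alpha * np.sqrt(np.pi))
dNa = -1.0 / (alpha**2 * np.sqrt(np.pi))
g   = lambda x: np.exp(-x**2 / alpha**2)
dg  = lambda x: (2.0 * x**2 / alpha**3) * np.exp(-x**2 / alpha**2)

# Term 1: p1 ∝ g is just a Gaussian with sigma = a / sqrt(2) ("BRDF sampling").
x1 = rng.normal(0.0, alpha / np.sqrt(2.0), N_samp)
p1 = g(x1) / (alpha * np.sqrt(np.pi))
w1 = dNa * g(x1) / p1                # identically -1/alpha: zero variance

# Term 2: p2 ∝ d_a g. With u = (x/a)^2 ~ Gamma(3/2, 1) and a random sign,
# x = ±a sqrt(u) has exactly the density d_a g / sqrt(pi).
u  = rng.gamma(1.5, 1.0, N_samp)
x2 = alpha * np.sqrt(u) * rng.choice([-1.0, 1.0], N_samp)
p2 = dg(x2) / np.sqrt(np.pi)
w2 = Na * dg(x2) / p2                # identically +1/alpha: zero variance

est = w1.mean() + w2.mean()          # d_a of the normalization: 0, recovered exactly
```

Both per-term weights are constant (−1/α and +1/α), so each term is estimated with zero variance; their sum recovers ∂_α ∫ f dx = 0 exactly, as it must since f is normalized for every α.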
6 Mixture Decomposition
Our second new decomposition further expands the set of BRDF derivatives we can handle. Consider, for example, a BRDF made up of a diffuse
\(f_d\) and specular
\(f_s\) lobe with scalar mixture weights
\(k_d\) and
\(1-k_d\) , respectively:
\[
f(\boldsymbol {\omega }_i, \boldsymbol {\omega }_o) = k_d\, f_d(\boldsymbol {\omega }_i, \boldsymbol {\omega }_o) + (1 - k_d)\, f_s(\boldsymbol {\omega }_i, \boldsymbol {\omega }_o).
\]
The derivative with the mixture weight
\(k_d\) is positive when the diffuse lobe contribution is higher than the specular lobe and negative otherwise. In general, this derivative is very hard to positivize, because
\(f_d\) and
\(f_s\) can be arbitrary BRDFs, and so the roots of
\(f_d - f_s\) are unlikely to have a simple analytic form.
However, we can once again decompose this derivative into single-signed functions with overlapping support; we refer to this as the mixture decomposition. Since
\(f_d\) and
\(f_s\) are non-negative valued BRDFs, they are single-signed, and can be importance sampled separately with appropriate PDFs
\(p_d\) and
\(p_s\) :
\[
\partial _{k_d} f = f_d - f_s, \qquad \langle I \rangle = \frac{f_d(\boldsymbol {\omega }_{i,1})}{p_d(\boldsymbol {\omega }_{i,1})} - \frac{f_s(\boldsymbol {\omega }_{i,2})}{p_s(\boldsymbol {\omega }_{i,2})}. \tag{18}
\]
Mixture weights show up in all
Uber BRDFs, like the Autodesk Standard Surface, Disney BRDF, and so on, and our mixture decomposition can be applied to all of them.
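A toy 1D sketch of the estimator (Gaussian stand-ins of our own for the two lobes, with p_d and p_s chosen to match them exactly; the function L stands in for incident radiance): since the weights f_d/p_d and f_s/p_s are identically 1, all remaining variance comes from L.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 200_000

# Toy mixture f = k_d f_d + (1 - k_d) f_s with Gaussian "lobes"; the
# mixture-weight derivative is d_{k_d} f = f_d - f_s, a real-valued
# function whose roots we never need.
sd, ss = 1.0, 0.2                            # wide "diffuse", narrow "specular"
L = lambda x: 1.0 + 0.5 * np.cos(x)          # stand-in for incident radiance

# Sample each lobe with its own exact PDF (p_d = f_d, p_s = f_s here), so
# the per-lobe weights are identically 1.
xd = rng.normal(0.0, sd, N)
xs = rng.normal(0.0, ss, N)
est = L(xd).mean() - L(xs).mean()            # estimates the integral of (f_d - f_s) L

# Closed-form reference: E[cos X] = exp(-s^2 / 2) for X ~ N(0, s).
ref = 0.5 * (np.exp(-sd**2 / 2.0) - np.exp(-ss**2 / 2.0))
```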
Mixture decomposition is also applicable to the derivative of BRDFs that are not explicitly mixture models, but internally are made up of different lobes with parametric weights. For example, the Oren-Nayar BRDF is a linear combination of two terms, where the positive weights
\(A(\sigma), B(\sigma)\) depend upon the roughness
\(\sigma\) of the BRDF:
\[
f(\boldsymbol {\omega }_i, \boldsymbol {\omega }_o) = \frac{\rho }{\pi } \left( A(\sigma) + B(\sigma) \max \left(0, \cos \left(\phi _i - \phi _o\right)\right) \sin \alpha \tan \beta \right) \cos \theta _i, \tag{19}
\]
where
\(\alpha =\max \left(\theta _i,\theta _o\right)\) ,
\(\beta =\min \left(\theta _i,\theta _o\right)\) . Once again, since both terms of the BRDF above are positive, the real-valued derivative with
\(\sigma\) is simply the sum of a positive and a negative term,
\[
\partial _\sigma f = \frac{\rho }{\pi } \left( \partial _\sigma A(\sigma) + \partial _\sigma B(\sigma) \max \left(0, \cos \left(\phi _i - \phi _o\right)\right) \sin \alpha \tan \beta \right) \cos \theta _i,
\]
with the sign of the term decided by the sign of
\(\partial _\sigma A\) and
\(\partial _\sigma B\) . Importance sampling the first term is simply cosine-hemispherical sampling, and we provide an importance sampling PDF for the second term in Appendix
A.3.2. Besides Oren-Nayar, the microcylinder BRDF [Sadeghi et al.
2013] is also a mixture model with weights
\(k_d, 1 - k_d\) , where
\(k_d\) is the isotropic scattering coefficient, and can be handled by mixture decomposition as well.
7 Recipe for Importance Sampling BRDF Derivatives
We now present a recipe to importance sample BRDF derivatives based on the key ideas introduced in the previous sections.
Step 1, Positivization. Given a real-valued BRDF derivative
\(\partial _\alpha f\) , check if it can be positivized. For positivization to be applicable,
\(\partial _\alpha f\) should have analytic roots. Compute the normalization constants for the solid angle PDFs
\(p_+(\boldsymbol {\omega }_i) \propto \max \left(\partial _\alpha f(\boldsymbol {\omega }_i), 0\right),\; p_-(\boldsymbol {\omega }_i) \propto -\min \left(\partial _\alpha f(\boldsymbol {\omega }_i), 0\right)\) , and their marginal and conditional counterparts
\(p_+(\phi _i), p_-(\phi _i), p_+(\theta _i|\phi _i), p_-(\theta _i|\phi _i)\) , if they are analytically integrable. See Figure
4 for the PDF generation and Equation (
8) for the estimator.
Step 2, Try Product or Mixture Decomposition. If positivization is inapplicable for either reason (no analytic roots or lack of analytic integrability), then either product or mixture decomposition might still be applicable.
Step 2.1, Product Decomposition. If the original BRDF is of the form
\(N(\alpha)g(\boldsymbol {\omega }_i,\alpha)\) , where
\(\alpha\) appears in a directionally invariant (independent of
\(\boldsymbol {\omega }_i\) ) normalization term
\(N(\alpha)\) and an unnormalized shape function
\(g(\boldsymbol {\omega }_i, \alpha)\) , then product decomposition may be applicable. First check if
\(\partial _\alpha g\) is single-signed, i.e., it has a constant sign for all
\(\boldsymbol {\omega }_i\) , and is analytically integrable. If these conditions hold, then product decomposition is applicable. Construct a PDF
\(p_2(\boldsymbol {\omega }_i) \propto \partial _\alpha g\) and compute the normalization terms for it and its conditional and marginal counterparts. The other PDF
\(p_1(\boldsymbol {\omega }_i) \propto g\) is simply the BRDF sampling PDF. See Figure
7 for the PDF generation and Equation (
16) for the estimator.
Step 2.2, Mixture Decomposition. If instead the parameter
\(\alpha\) appears in the form of linear combination weights either explicitly as a mixture model between two BRDFs, or implicitly as a mixture between two lobes that form a single BRDF, then mixture decomposition is likely applicable here. In this case, simply use the PDFs and sampling strategies most suitable for the two mixture lobes if they are available (e.g., visible normal distribution function sampling for a GGX lobe), or construct PDFs
\(p_1(\boldsymbol {\omega }_i) \propto f_1(\boldsymbol {\omega }_i),\; p_2(\boldsymbol {\omega }_i) \propto f_2(\boldsymbol {\omega }_i)\) for the two lobes, where
\(f_1, f_2\) are the two lobes. See Figure
7 for the PDF generation, and Equation (
18) for the estimator.
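The three steps can be summarized as a small dispatch sketch (the function and flag names are ours, purely illustrative):

```python
def choose_decomposition(has_analytic_roots: bool,
                         integrable_up_to_roots: bool,
                         is_normalization_times_shape: bool,
                         shape_deriv_single_signed_and_integrable: bool,
                         alpha_is_mixture_weight: bool) -> str:
    """Illustrative dispatch mirroring Steps 1, 2.1, and 2.2 of the recipe."""
    # Step 1: positivization needs analytic roots AND integrability up to them.
    if has_analytic_roots and integrable_up_to_roots:
        return "positivization"
    # Step 2.1: product decomposition needs f = N(alpha) * g(w_i, alpha)
    # with a single-signed, analytically integrable d_alpha g.
    if is_normalization_times_shape and shape_deriv_single_signed_and_integrable:
        return "product"
    # Step 2.2: mixture decomposition applies when alpha is a (possibly
    # implicit) linear combination weight between two lobes.
    if alpha_is_mixture_weight:
        return "mixture"
    return "fallback"  # none of the three decompositions applies directly
```

For instance, an anisotropic GGX roughness derivative would fail the Step 1 check but pass Step 2.1, matching the discussion above.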
Figure
8 depicts the estimators for all three of our decompositions for direct illumination. They all require two shadow rays at the shading point, corresponding to the positive and negative lobes of the corresponding decomposition.
Although we have not found examples that require it, our three decompositions can also be interleaved with one another for complicated BRDF derivatives. For example, it is possible that for some BRDF derivatives, the derivative of the shape function from the product rule \(\partial _\alpha g\) could be real-valued. It could then further be positivized to eliminate sign variance.
Forward Rendering Sampling Technique Reuse. Both product and mixture decomposition reuse BRDF sampling developed for forward rendering as one (or both) of the techniques for differential BRDF sampling. For product decomposition, this corresponds to \(p_1 \propto g\) . For mixture decomposition, perfect importance sampling can in some cases be achieved by employing only two standard BRDF sampling techniques from forward rendering. BRDF sampling, when used directly to estimate \(\partial _\alpha f\) , suffers from both sign and shape variance; however, when paired with the right decomposition, it correctly handles the shape variance of one of the terms.
Multiple Importance Sampling. For the product and mixture decompositions, the positive and negative decomposition PDFs can have overlapping support (for positivization they are necessarily non-overlapping). As a result, the samples generated for one decomposition can be shared with the other using Multiple Importance Sampling. Also, all three of our decompositions reduce the variance from the differential BRDF \(\partial _\alpha f\) and can be used in conjunction with light source sampling via Multiple Importance Sampling to reduce the lighting, \(L_i\) ’s variance.
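Sample sharing via the balance heuristic can be sketched with a toy 1D real-valued integrand and two overlapping Gaussian PDFs standing in for the positive- and negative-lobe strategies (all names and densities are our own illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
N = 100_000

# Two overlapping sampling techniques (stand-ins for the positive and
# negative decomposition PDFs): N(0, 1) and N(0, 2).
p1 = lambda x: np.exp(-x**2 / 2.0) / np.sqrt(2.0 * np.pi)
p2 = lambda x: np.exp(-x**2 / 8.0) / np.sqrt(8.0 * np.pi)
h  = lambda x: np.exp(-x**2 / 2.0) * (1.0 - x**2)   # real-valued; true integral 0

x1 = rng.normal(0.0, 1.0, N)
x2 = rng.normal(0.0, 2.0, N)

# One-sample-per-technique MIS with the balance heuristic: each sample's
# weight uses the sum of the PDFs of every technique that could produce it.
F = h(x1) / (p1(x1) + p2(x1)) + h(x2) / (p1(x2) + p2(x2))
est = F.mean()
```

Weighting each sample by the sum of all PDFs that could have generated it keeps the combined estimator unbiased wherever the supports overlap.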
8 Results
We organize our results into two subsections. First, we demonstrate that our decompositions do reduce variance in practice for a number of BRDF derivatives under a wide variety of lighting conditions in Section
8.1. Next, we demonstrate that lower variance in gradients indeed does enable better spatially varying texture recovery in inverse rendering, in Section
8.2.
Implementation Details. We implemented all the different decompositions and BRDFs on our own CPU-based differentiable renderer, using the Embree [Wald et al.
2014] library for ray tracing. At each shading point, all three of our decompositions require two shadow rays; see Figure
8. To have a fair comparison with BRDF sampling, we shoot out two shadow rays at each shading point for it too, which ensures an equal-ray comparison with our method. Since our sampling routines have similar timings to BRDF importance sampling (Table
2), all comparisons are also equal time, and we report the timings for each experiment in insets. All our standard deviation comparison images are computed by taking the square root of the averaged squared error of the gradient images, which were each generated at nine samples per pixel over 50 runs. The variance improvement can be easily computed by squaring all the standard deviation improvement numbers in the insets. Additionally, we also provide difference images to show the specific regions where our estimators outperform BRDF sampling in Figure
29.
Sample Generation Timings. We report the average times (over 1,000,000 runs) to generate samples using our routines and baselines in Table
2. Our routines involve CDF inversion, which we perform using a combination of Newton iterations and bisection search as described by Yuksel et al. [
2022] (maximum 64 iterations,
\(10^{-6}\) CDF error tolerance). Our sampling routines only add a small overhead in the overall path-tracing pipeline; see timings in Figures 10–13 and 18–23.
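A safeguarded Newton/bisection inversion in this spirit might look as follows (a generic sketch using the iteration cap and CDF tolerance quoted above, not Yuksel et al.'s exact routine):

```python
import numpy as np

def invert_cdf(cdf, pdf, u, lo, hi, tol=1e-6, max_iter=64):
    """Invert a monotone CDF on [lo, hi], i.e., solve cdf(x) = u.

    Safeguarded scheme: take a Newton step when it stays inside the current
    bracket, otherwise fall back to bisection. The iteration cap and CDF
    tolerance mirror the values quoted in the text; the routine itself is a
    generic sketch, not the exact implementation used in the paper."""
    x = 0.5 * (lo + hi)
    for _ in range(max_iter):
        err = cdf(x) - u
        if abs(err) < tol:
            break
        if err > 0.0:           # x overshoots the target: shrink from above
            hi = x
        else:                   # x undershoots: shrink from below
            lo = x
        d = pdf(x)              # the CDF's derivative is the PDF
        x_new = x - err / d if d > 0.0 else lo
        if not (lo < x_new < hi):
            x_new = 0.5 * (lo + hi)   # Newton left the bracket: bisect instead
        x = x_new
    return x

# Example test density: p(x) = sin(x)/2 on [0, pi] has CDF(x) = (1 - cos x)/2,
# with the analytic inverse arccos(1 - 2u) available for checking.
cdf = lambda x: 0.5 * (1.0 - np.cos(x))
pdf = lambda x: 0.5 * np.sin(x)
```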
Convergence Rates. Our estimators are intended to be used within inverse rendering pipelines. This necessitates low sample counts per pixel (1 to 16) for fast iteration times. For these sample budgets, our estimators outperform baselines both without stratification and with it (Figure
9, all other results in the article are with stratification).
Without stratification, the estimators show the typical \(\mathcal {O}(N^{-1})\) convergence. Stratification improves the convergence rate. Interestingly, with stratification, standard BRDF importance sampling performs as well as our methods at high sample counts (100+).
Mitchell [
1996] provides a possible explanation: at high sample counts with stratification, variance is dominated by visibility discontinuities in the integrand, so these pixels have convergence rates of
\(\mathcal {O}(N^{-1.5})\) . Better importance sampling improves the convergence rate for the continuous differential BRDF, noticeable at low sample counts. At higher sample counts, visibility dominates the variance, which none of the techniques handle. Also, BRDF sampling (which does not handle sign variance) benefits from stratification at high sample counts, since smaller strata are increasingly single-signed. Nevertheless, our methods are able to provide significant variance reduction (by nearly an order of magnitude) in the low sample regime, most relevant to us. This phenomenon is a general observation for importance sampling, not limited to differentiable rendering.
8.1 Derivative Comparison
8.1.1 Positivization.
First, we compare positivization with BRDF sampling for the derivative of two BRDFs in Figure
1. The scene is lit by two area lights. The isotropic GGX teapot (with
\(\alpha =0.02\) ) is differentiated with its roughness
\(\alpha\) , and the Hanrahan-Krueger (with
\(g=-0.9\) ) lion is differentiated with its Henyey-Greenstein parameter for anisotropy
g. The Henyey-Greenstein phase function at
\(g=-0.9\) is highly back-scattering and is very badly importance sampled by regular BRDF sampling, which cannot correctly account for the highly peaked and signed nature of the derivative. Since positivization is correctly able to handle both sign and shape related variance, we see significant standard deviation reduction of
\(1.96\times\) and
\(58.57\times\) for the teapot and lion, respectively.
8.1.2 Product Decomposition.
Next, we compare product decomposition with BRDF sampling for the derivative of an anisotropic Beckmann BRDF with its roughness
\(\alpha _x\) , lit under constant environment illumination in Figure
10. Positivization (and by extension Zeltner et al.) cannot handle this derivative, see Section
4.2.2, and Zhang et al.’s method fails for even derivatives like this one; see Figure
26. Constant illumination eliminates variance from lighting and only keeps variance from the BRDF derivative and visibility. Since product decomposition can correctly handle both the sign and shape variance of the BRDF derivative, it has an overall
\(8.63\times\) reduction in standard deviation, whereas BRDF sampling fails, because it cannot handle either source of variance. In most regions (Figure
10, see right inset), the derivative of the normal distribution function
\(\partial _\alpha D\) is the major source of BRDF derivative variance; we eliminate it and see a big improvement of
\(32.7\times\) . However, in the grazing angle regions (Figure
10, see left inset), the derivative of the shadowing function
\(\partial _\alpha G\) dominates. Here, our improvement is still significant (
\(7.45\times\) ), but relatively less pronounced, since our sampling strategy minimizes
\(\partial _\alpha D\) ’s variance.
Now, we change the lighting to realistic environment illumination (see insets for forward renderings), and the geometry to objects with intricate detail in Figures
11–13; we estimate the derivative of the anisotropic Beckmann, GGX, and Ashikhmin-Shirley BRDFs in these experiments, respectively. Apart from BRDF derivative variance, these scenes have two other major sources of variance, lighting and visibility. When the variance is significant from other sources too, we have found that sharing samples between the positive and negative decomposition is beneficial, see Section
7,
Multiple Importance Sampling (MIS). In all three experiments, our product decomposition outperforms BRDF importance sampling, achieving lower standard deviation in gradient estimation by
\(2.09\times , 1.31\times\) , and
\(1.21\times\) , respectively.
We show two more examples of product decomposition in Figure
1, for anisotropic GGX and Beckmann BRDF derivatives, which achieve standard deviation reductions of
\(1.56\times\) and
\(3.61\times\) , respectively. The insets in the top row of Figure
1 show the regions where our decomposition has lower variance than BRDF sampling in blue. Product decomposition outperforms BRDF sampling in almost all regions.
We provide additional comparisons with Zhang et al. for the anisotropic roughness derivatives of the GGX BRDF for the scenes modelled in Figures
10 and
12, in Figure
26; their estimator has higher standard deviation than BRDF importance sampling. We emphasize that this behavior is expected, since they designed their estimator to estimate odd derivatives (for example, the derivative with the half vector), but the roughness derivative is even.
8.1.3 Mixture Decomposition.
Finally, we compare BRDF sampling with Mixture Decomposition to estimate the derivative of a mixture model with its mixture weight for the fish-shaped pot in Figure
1. The mixture model is a linear combination of a Lambertian diffuse lobe, and a GGX specular lobe and the lighting is two area lights. Mixture decomposition can reduce the standard deviation by
\(4.72\times\) , because it correctly handles shape and sign variance, unlike BRDF sampling.
Figure
1 also shows an example of an Oren-Nayar pot, and its derivative with the roughness
\(\sigma\) . BRDF sampling here is simply cosine hemispherical sampling, and works quite well in the central regions of the pot, because the cosine lobe is dominant in the non-grazing angle regions; see Equation (
19). However, in the grazing angle regions towards the edges of the pot where the correction term is more dominant, BRDF sampling breaks down and has high variance. In contrast, our mixture decomposition with MIS correctly accounts for the derivative of both terms with regard to their sign and shape variance, achieves low variance in
all regions of the pot, and leads to a
\(3.91\times\) reduction in standard deviation.
8.2 Inverse Rendering Comparison
We demonstrate the benefits of correctly handling sign variance in gradients, for gradient-descent-based inverse rendering. We apply inverse rendering to the task of spatially varying texture recovery, and evaluate the effectiveness of all three of our decompositions on it. Our results for positivization are presented in Figure
14, product decomposition in Figure
15, and mixture decomposition in Figure
16. All our inverse rendering results use 4 samples per pixel for both forward and gradient rendering at each optimization iteration. We use the ADAM optimizer [Kingma and Ba
2015] and the respective loss graphs show the mean absolute texture recovery error (
L1) after some initial iterations. In Figure
14, we show that using the ADAM optimizer instead of
SGD with momentum (SGDm) reduces the impact of gradient outliers, similar to Nimier-David et al.’s observation [
2022]. Nonetheless, our estimators outperform the baselines in all setups.
For positivization (Figure
14), we recover the (spatially varying) scattering parameter
g of a Hanrahan-Krueger BRDF with the semi-infinite depth assumption, lit by a single area light. The ground-truth texture consists of a slightly back-scattering background region with
\(g = -0.3\) , and a highly back-scattering logo region with
\(g = -0.9\) ; see Figure
14(c). We initialized the scattering parameter
g with a random negative number. Positivization consistently has lower texture recovery error compared to BRDF sampling, especially in the highly back-scattering region.
For product decomposition (Figure
15), we optimize the spatially varying anisotropic roughness textures (
\(\alpha _x\) and
\(\alpha _y\) ) of a Beckmann BRDF under a photometric stereo setup under two illumination conditions. The two lighting conditions are rotated versions of the same environment map. Starting from a random initialization for both textures, product decomposition’s correct handling of the sign variance leads to a gradient estimator with lower overall variance, and consequently ensures lower texture recovery error across all iterations, as shown in Figure
15(b). The final recovery is displayed in Figures 15(c) and 15(d).
Our product decomposition computes the gradients for both roughness values using three samples at each shading point combined using multiple importance sampling (one each from \(p_1\) , \(p_{2,x}\) , \(p_{2,y}\) ). To ensure an equal-ray budget, we use three samples for BRDF sampling at each shading point too.
For mixture decomposition in Figure
16, we recover the spatially varying roughness of an Oren-Nayar BRDF under environment map illumination. Once again, mixture decomposition benefits from lowered variance in gradients and can recover a texture with lower error than BRDF sampling at an equal ray-triangle intersection budget; see Figure
16(b).
9 Global Illumination
We now describe how to importance sample BRDF derivatives under multiple bounce global illumination. The recursive rendering equation [Kajiya
1986] (ignoring emission) is given by a generalization of Equation (
1),
where we have substituted the incoming radiance
\(L_i(\boldsymbol {y}, \boldsymbol {\omega }_i)\) , with the outgoing/reflected radiance
\(L_r(\boldsymbol {z}, -\boldsymbol {\omega }_i; \alpha)\) , and
\(\boldsymbol {z}= \text{rayTrace}(\boldsymbol {y}, \boldsymbol {\omega }_i)\) is the first intersection point from
\(\boldsymbol {y}\) in the direction
\(\boldsymbol {\omega }_i\) . The recursive call of
\(L_r\) is a function of the BRDF parameter
\(\alpha\) , because upon unrolling the recursion, it may be a function of an
\(\alpha\)-dependent BRDF. Differentiating this expression, we get
which recursively describes how differential radiance is reflected. The two integrals (Equations (
22) and (
23)) can be importance sampled separately. We have seen how to importance sample Equation (
22) by applying different BRDF derivative decompositions in Sections
4.2,
5, and
6. Irrespective of the decomposition chosen, this requires two evaluations of
\(L_i\) corresponding to the positive and negative lobes and is done by regular path tracing (similar to the standard splitting approach [Arvo and Kirk
1990]). To importance sample Equation (
23), we follow standard BRDF sampling and continue the same recursive importance sampling of
\(\partial _\alpha L_r\) at the next shading point.
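For reference, the recursion and its derivative described above can be sketched as follows (a reconstruction from the surrounding text, with cosine/foreshortening factors folded into \(f\) for brevity; the numbering follows the original Equations (22) and (23)):

```latex
L_r(\boldsymbol{y}, \boldsymbol{\omega}_o; \alpha)
  = \int f(\boldsymbol{y}, \boldsymbol{\omega}_i, \boldsymbol{\omega}_o; \alpha)\,
         L_r(\boldsymbol{z}, -\boldsymbol{\omega}_i; \alpha)\,
         \mathrm{d}\boldsymbol{\omega}_i,
\qquad
\partial_\alpha L_r
  = \underbrace{\int \partial_\alpha f \cdot L_r\,
        \mathrm{d}\boldsymbol{\omega}_i}_{\text{Equation (22)}}
  \;+\;
    \underbrace{\int f \cdot \partial_\alpha L_r\,
        \mathrm{d}\boldsymbol{\omega}_i}_{\text{Equation (23)}}.
```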
This means that we need three samples at each shading point, one each for BRDF, positive lobe and negative lobe importance sampling. Fortunately, for product and mixture decomposition, we can reduce this to two samples at each shading point. For product decomposition, as we saw in Section
5, one of the positive and negative lobe decomposition PDFs is identical to the BRDF sampling PDF, and can share a sample with it. For mixture decomposition, BRDF sampling can be simulated by randomly choosing a sample from either the positive or the negative lobe, with probability equal to that lobe's mixture weight in the BRDF sampling PDF.
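This lobe-sharing trick can be sketched as follows (a minimal sketch under the assumption that the BRDF sampling PDF is the mixture \(w\,p_{+} + (1-w)\,p_{-}\); the sampler names are placeholders):

```python
import random

def sample_brdf_via_lobes(sample_pos, sample_neg, w_pos):
    """Simulate a draw from p_brdf = w_pos * p_pos + (1 - w_pos) * p_neg
    by picking one lobe with probability equal to its mixture weight.
    The chosen lobe's sample then does double duty for BRDF sampling."""
    if random.random() < w_pos:
        return sample_pos()
    return sample_neg()
```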
Branching Complexity and Comparison with BRDF sampling. Even though we use two samples to estimate Equation (
22), the total number of rays required to estimate
\(\partial _\alpha L_r\) for a maximum depth
d is quadratic, i.e.,
\(O(d^2)\), instead of exponential (see Figure 17), whereas it is
\(O(d)\) for BRDF sampling. This is because we apply splitting only when estimating Equation (22): its recursive calls of \(L_r\) are ordinary radiance estimates that are path traced without further splitting, and we do not split when estimating Equation (23). Hence no vertex branches more than once, which prevents exponential branching.
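The quadratic growth can be checked with a toy ray-count model (our own back-of-the-envelope sketch, matching the splitting scheme described above):

```python
def radiance_rays(depth):
    # Regular path tracing of L_r: one ray per bounce, no branching.
    return depth

def derivative_rays(depth):
    """Rays to estimate the derivative for a remaining depth `depth`:
    one ray at this vertex, two non-branching radiance subpaths for
    Equation (22), and one recursive derivative continuation for
    Equation (23)."""
    if depth == 0:
        return 0
    return 1 + 2 * radiance_rays(depth - 1) + derivative_rays(depth - 1)
```

Under this model, derivative_rays(d) equals \(d^2\) exactly, versus \(d\) rays for BRDF sampling.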
Variance Reduction under Global Illumination. All of our results use an equal-ray budget to ensure fair comparisons. We also report wall-clock times that reflect this.
We have found that for one-bounce global illumination, our mixture decomposition can reduce standard deviation by
\(1.55\times\) ; see Figure
18. Next, we show results for gradient estimation in two typical global illumination setups, glossy reflections and caustics. In Figure
19, we compute the derivative of a glossy reflection with respect to the mixture weight of the teapot that produces the reflection. Once again, our estimator produces gradients with much lower standard deviation,
\(6.23\times\) in this setting due to our better importance sampling strategy (mixture decomposition) at the second bounce on the teapot. In Figure
20, we compute the derivative of a caustic pattern formed on a diffuse surface with respect to the isotropic roughness of a metallic cylinder that produces the caustic. This setup uses the positivization estimator with a higher branching factor; nonetheless, it is still able to significantly reduce standard deviation (by
\(2.51\times\)) due to correct handling of the BRDF derivative at the second bounce on the cylinder's surface.
We also demonstrate that our estimators work well even in the presence of more detailed geometry and complicated visibility in Figure
21. We estimate the derivative with respect to the mixture weight under both direct and one-bounce global illumination. Our mixture decomposition reduces the standard deviation of the estimated gradients by
\(2.35\times\) and
\(2.27\times\) under the two settings, respectively, as compared to BRDF importance sampling.
Finally, we show the effectiveness of the product decomposition under one bounce global illumination with complex visibility, glossy reflections and other effects intertwined in a single scene in Figures
22 and
23. In Figure
22, we estimate the anisotropic roughness derivative of a chess set modelled by a Beckmann BRDF, which includes effects such as inter-occlusion among the pieces and glossy reflections on the chessboard. Product decomposition reduces the standard deviation by
\(1.78\times\) on this scene. In Figure
23, we consider the case of very complex visibility. We estimate the anisotropic roughness derivative of a metallic hairball. Even under this challenging condition, we are able to better handle the sign and shape variance of the differential BRDF, resulting in a
\(1.52\times\) reduction in standard deviation over BRDF sampling. We provide difference images, which show the specific regions of improvement for our techniques over BRDF sampling in Figure
29. Additionally, we show the effect of longer path lengths in Table
3.
Inverse rendering under global illumination. Positivization has the highest branching factor of our three decompositions. Applying it at a path vertex that requires a derivative splits the original path into three subpaths, one of which requires recursive derivative computation. It is conceivable that this added computation may worsen positivization's performance compared to BRDF sampling, since BRDF sampling does not induce any branching. Experimentally, we have found that, despite the additional branching, positivization reduces variance and better recovers textures under a single bounce of global illumination compared to BRDF importance sampling at an equal computation budget; see Figure
24. In this setup, we optimize the GGX roughness of a plate initialized uniformly with
\(\alpha =0.05\) to match a target with a checkerboard pattern with
\(\alpha =0.005,0.02\) . Light is incident upon the plate
only indirectly via a reflection off a reflector, also modelled as a GGX plate with a fixed roughness of
\(\alpha =0.01\) ; see Figure
24(b). The reduced variance due to better handling of the sign and shape variance by positivization leads to a better recovery of the texture. This is despite using the ADAM optimizer, which ameliorates the issues of higher variance estimators [Nimier-David et al.
2022].
Our next inverse rendering experiment under global illumination recovers the anisotropic roughness of a GGX BRDF; see Figure
25. The scene consists of two dragons (
\(\alpha _x=0.005,\alpha _y=0.007\) ) only viewed indirectly through a glossy reflection. The initialization starts with
\(\alpha _x=0.5,\alpha _y=0.007\) , and we optimize for the correct
\(\alpha _x\) value. Our product decomposition better handles the variance of the differential BRDF at the second bounce (on the dragons' surface), which leads to faster convergence.
10 Limitations and Future Work
Determining the number of samples for each decomposed component. For all three decompositions, our current implementation applies a two-sample estimator, which uses one sample per component. It is possible that a different estimator can be more efficient in some cases. For example, when the two components have different areas (i.e.,
\(\int \partial _\alpha f_1 \ne \int \partial _\alpha f_2\) for components
\(f_1\) and
\(f_2\) ), it might be useful to adjust the number of samples according to the area of the component (we show in Appendix
C that microfacet normal distribution functions always have components with equal area). Research in allocating budgets for multiple importance sampling can likely help in our case as well [He and Owen
2014; Sbert et al.
2018; Grittmann et al.
2022]. Our estimator that always samples all components belongs to the
deterministic mixture scheme [Owen
2013]. An alternative is a
random mixture, which randomly chooses one component. We opt for deterministic mixtures, since they consistently outperform random mixtures in our direct lighting experiments (due to the stratification effect, similar to standard MIS versus
one-sample MIS). For global illumination, random mixtures are equivalent to applying Russian roulette to keep only one of the two branches, and can be more computationally convenient in some cases, since they avoid quadratic branching.
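The stratification effect can be seen on a toy 1D example (our own construction, not from the renderer): take the signed function \(g(x)=\cos(\pi x)\) on \([0,1]\), whose positive and negative parts each have area \(1/\pi\) and can be sampled exactly, so all remaining variance comes from how the two components are combined:

```python
import math
import random

I_LOBE = 1.0 / math.pi  # area of each single-signed component of g

def g(x):
    return math.cos(math.pi * x)

def p_lobe(x):
    # PDF proportional to |g| on its lobe, normalized by the lobe area.
    return abs(g(x)) / I_LOBE

def sample_pos():  # x ~ p_lobe on [0, 1/2] via the inverse CDF of sin(pi x)
    return math.asin(random.random()) / math.pi

def sample_neg():  # mirrored sampler on [1/2, 1]
    return 1.0 - math.asin(random.random()) / math.pi

def deterministic_mixture():
    # One sample from each component; the signed contributions cancel exactly.
    xp, xn = sample_pos(), sample_neg()
    return g(xp) / p_lobe(xp) + g(xn) / p_lobe(xn)

def random_mixture():
    # Keep one component with probability 1/2 and reweight (one-sample MIS).
    if random.random() < 0.5:
        x = sample_pos()
    else:
        x = sample_neg()
    return g(x) / p_lobe(x) / 0.5
```

Both estimators are unbiased for \(\int_0^1 g = 0\), but the deterministic mixture has zero variance on this example while the random mixture does not.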
Branching and Global Illumination. Our adoption of deterministic mixtures requires path splitting for global illumination. While the branching complexity is quadratic instead of exponential (same as a bidirectional path tracer), it can add undesired overheads. There are several ways to reduce the branching, including (1) deterministically using only BRDF sampling, (2) using random mixtures instead of deterministic mixtures after a certain recursion depth, or (3) using path reconnection similar to Zhang et al.’s approach [
2020], to reconnect the branches back to a single primary path. Figuring out an effective strategy to deal with branching is crucial for applying our method to estimate derivatives in volumetric rendering and subsurface scattering with long paths; this is an exciting avenue for future research.
Multiple Parameter Optimization. Prior work [Zeltner et al.
2021] and ours only deal with estimating the derivative of a single material parameter (such as roughness, mixture weights, etc.) and constructs specialized sampling schemes for each of them. However, in inverse rendering, we are often interested in recovering several parameters simultaneously. For example, Uber shaders [Georgiev et al.
2019; Burley
2012] have a large number of tunable parameters. Naively applying our decompositions for each of these parameters independently can lead to very large branching factors. Developing sampling strategies that efficiently estimate the derivatives of all the parameters simultaneously is still an open research problem.
Unsupported BRDFs. Our work covers a large variety of popular analytic surface reflection models; see Table
1. However, our method currently does not support data driven BRDFs or Neural BRDFs. In both cases, there are a very large number of tunable parameters, and it is unclear as to which parameters one might want to differentiate or importance sample.
Better Optimization Schemes. Ultimately, for inverse rendering, the optimization is both ill-posed and non-convex. Recently, we have seen some work [Xing et al.
2022] that takes a step toward better optimization schemes. We believe the study of efficient derivative estimators is largely orthogonal and equally crucial.