This paper presents a novel sparse non-parametric Bidirectional Reflectance Distribution Function (BRDF) model derived using a machine learning approach to represent the space of possible BRDFs using a set of multidimensional sub-spaces, or dictionaries. By training the dictionaries under a sparsity constraint, the model guarantees high-quality representations with minimal storage requirements and an inherent clustering of the BRDF-space. The model can be trained once and then reused to represent a wide variety of measured BRDFs. Moreover, the proposed method is flexible enough to incorporate new unobserved data sets, parameterizations, and transformations. In addition, we show that any two, or more, BRDFs can be smoothly interpolated in the coefficient space of the model rather than the significantly higher-dimensional BRDF space. The proposed sparse BRDF model is evaluated using the MERL, DTU, and RGL-EPFL BRDF databases. Experimental results show that the proposed approach yields about 9.75 dB higher signal-to-noise ratio on average for rendered images as compared to current state-of-the-art models.
1 Introduction
The bidirectional reflectance distribution function [Nicodemus et al. 1977] describes how light scatters at the surfaces of a scene, depending on their material characteristics. The Bidirectional Reflectance Distribution Function (BRDF) is a 4D function parameterized by the incident and exitant scattering angles and can be described using either parametric models [Blinn 1977; Cook and Torrance 1982; Ashikhmin and Shirley 2000; Walter et al. 2007; Löw et al. 2012] or data-driven models [Lawrence et al. 2004; Jakob et al. 2014; Bilgili et al. 2011; Bagher et al. 2016; Sun et al. 2018; Tongbuasirilai et al. 2019]. Parametric models present great artistic freedom and the possibility to interactively tweak parameters to achieve the desired look and feel. However, most analytical models are not designed for efficient and accurate representation of the scattering properties of measured real-world materials. Data-driven models, however, enable the use of measured BRDFs and real-world materials directly in the rendering pipeline, and are commonly used in computer vision applications [Romeiro et al. 2008]. Data-driven models can represent BRDFs in many different ways. Iterative-factored representations approximate BRDFs with multiple low-rank components [Lawrence et al. 2004; Bilgili et al. 2011; Tongbuasirilai et al. 2019], while hybrid analytical data-driven models [Bagher et al. 2016; Sun et al. 2018] rely on non-parametric components or basis functions computed using specific weighting and optimization schemes.
The efficiency, or performance, of a non-parametric model is typically measured in terms of the number of variables/coefficients required to represent a BRDF at a given quality. The number of coefficients, in turn, depends on the efficacy of the underlying basis functions, often called dictionaries, in modeling the BRDF. Most, if not all, existing methods either sacrifice the model accuracy to achieve fast reconstruction for real-time applications, or aim for high image fidelity leading to increased storage and computational requirements. Another important aspect is the complexity of the basis functions used in the representation. At one end of the spectrum, we have analytical basis functions such as spherical harmonics and wavelets [Ramamoorthi and Hanrahan 2001; Claustres et al. 2003], which do not require the storage of the basis functions, at the cost of low approximation accuracy, unless a large number of coefficients is used. On the other end of the spectrum, we have decomposition-based methods [Bilgili et al. 2011] that model the BRDF as a multiplication of a set of coefficients and a basis matrix/tensor computed from data. Unfortunately, these approaches require a computationally expensive decomposition, e.g., Principal Component Analysis (PCA) or Singular Value Decomposition (SVD), for each BRDF individually and suffer from a high storage cost for the basis itself. Another problem is that the expressiveness of existing bases/decomposition methods is limited. Except for a few, they are in most cases not designed for BRDF data, hence requiring high numbers of coefficients for accurate BRDF representation.
The goal in this article is to develop a new data-driven BRDF model using a set of trained basis functions that enable highly accurate representations with a minimal number of coefficients to represent any BRDF. To solve this challenge, we derive a model that in essence relies on decomposing BRDFs into a coefficient–basis pair, but uses machine learning to adapt the basis to the space of BRDFs to provide maximally sparse coefficients. Sparse BRDF modeling is achieved using a novel BRDF dictionary ensemble and a novel model selection algorithm to efficiently represent a wide range of real-world materials. The learned BRDF dictionary ensemble consists of a set of basis functions trained such that they guarantee a very sparse BRDF representation and near optimal signal reconstruction. Moreover, our model takes into account the multidimensional structure of measured BRDFs (e.g., 3D or 4D depending on the parameterization) and can exploit the information redundancy in the entire BRDF space to reduce the number of coefficients. While in this article, we focus on isotropic materials, our method can be readily applied to anisotropic materials, given a data set of such materials.
The learned ensemble is versatile and can be trained only once to be reused for representing a wide range of previously unseen materials. Additionally, unlike previous models, the dictionary ensemble is not restricted to a single BRDF transformation. Instead, multiple BRDF transformations can be included in the ensemble training such that for each individual BRDF the best representation can be automatically selected and used. This is achieved by a novel model selection method that picks the dictionary in the ensemble leading to the sparsest solution, the smallest reconstruction error, and the most suitable transformation with respect to rendering quality. For the experiments and evaluations presented here, we use the MERL [Matusik et al. 2003] and RGL-EPFL [Dupuy and Jakob 2018] databases, which are divided into a training set and a test set used for evaluation.
Another key contribution of this article is a novel algorithm for interpolating two or more BRDFs directly in the sparse coefficient space. The key challenge arises when the BRDFs to be interpolated do not lie in the same subspace, i.e., they use different dictionaries of the ensemble. We propose a novel formulation for the interpolation of BRDFs that lie in different coordinate systems. The proposed interpolation algorithm admits the construction of a smooth surface over the coefficient space of all BRDFs represented using our model.
The main contributions of this article are as follows:
• A novel non-parametric BRDF model using sparse representations that significantly outperforms existing decomposition-based methods with respect to both model error and rendering quality.
• A multidimensional dictionary ensemble learning method tailored to measured BRDFs.
• A novel BRDF model selection method that chooses the best dictionary for efficient BRDF modeling, as well as the most suitable BRDF normalization function. This enables a unified non-parametric BRDF model regardless of the characteristics of the material.
• A novel algorithm for BRDF interpolation in the sparse coefficient space rather than the BRDF space. We show that two or more measured BRDFs can be linearly interpolated even if each BRDF uses a distinct dictionary in the ensemble.
We compare the proposed non-parametric BRDF model to the current state-of-the-art methods and demonstrate that it performs significantly better in terms of rendering Signal-to-Noise Ratio (SNR) and visual quality, as well as having a lower reconstruction error. To the authors' knowledge, this is the first BRDF model based on sparse representations and dictionary learning.
Notations. Throughout the article, we use the following notational convention. Vectors and matrices are denoted by boldface lower-case (\(\mathbf {a}\)) and bold-face upper-case letters (\(\mathbf {A}\)), respectively. Tensors are denoted by calligraphic letters, e.g., \(\mathbf {\mathcal {A}}\). A finite set of objects is indexed by superscripts, e.g., \(\lbrace \mathbf {A}^{(i)}\rbrace _{i=1}^{N}\), whereas individual elements of \(\mathbf {a}\), \(\mathbf {A}\), and \(\mathbf {\mathcal {A}}\) are denoted \(\mathbf {a}_i\), \(\mathbf {A}_{i_1,i_2}\), \(\mathbf {\mathcal {A}}_{i_1,\dots ,i_n}\), respectively. The determinant of \(\mathbf {A}\) is denoted \(\mathrm{det}(\mathbf {A})\). The \(\ell _p\) norm of a vector \(\mathbf {s}\), for \(1\le p\le \infty\), is denoted by \(\Vert \mathbf {s}\Vert _p\). The Frobenius norm is denoted \(\Vert \cdot \Vert _F\). The \(\ell _0\) pseudo-norm of a vector, denoted \(\Vert \mathbf {s}\Vert _0\), defines the number of non-zero elements in the vector. The n-mode product of a tensor \(\mathbf {\mathcal {S}}\) and a matrix \(\mathbf {U}\) is denoted \(\mathbf {\mathcal {S}}\times _n\mathbf {U}\).
2 Background and Related Work
Measured BRDFs have proven to be an important tool in achieving photo-realism during rendering [Matusik et al. 2003; Dupuy and Jakob 2018; Dong et al. 2016]. Even highly complex surfaces, such as layered materials, require multiple components of measured data to construct novel complex materials [Jakob et al. 2014]. Measured materials, however, are high-dimensional signals with a large memory footprint, and a key challenge is that small approximation errors can lead to visual artifacts during rendering. Since densely sampled BRDF data is impractical to use directly in many applications, efficient representations are needed, using either parametric or data-driven models.
Parametric models. By careful modeling, BRDFs can be encoded with only a few parameters. The components or factors of such models are based either on assumptions describing the physics of light-surface interactions using, e.g., microfacet theory [Cook and Torrance 1982; Walter et al. 2007; Holzschuch and Pacanowski 2017], or on empirical observations of BRDF behaviors [Blinn 1977; Ward 1992; Ashikhmin and Shirley 2000; Nishino and Lombardi 2011; Löw et al. 2012]. However, in many practical cases and applications, existing parametric models cannot accurately fit measured real-world data, as mentioned in Bagher et al. [2016].
Data-driven models. Due to their non-parametric property, data-driven models are superior to parametric models in that the number of degrees of freedom, or implicit model parameters, is much higher. This means that the representative power is higher and the expected approximation error is lower. Factored BRDF models use decomposition techniques to factorize a BRDF into several components. Matrix and tensor decompositions have been used by Lawrence et al. [2004], Soler et al. [2018], Bilgili et al. [2011], and Tongbuasirilai et al. [2019]. Moreover, factorization-based models have been used for interactive BRDF editing [Kautz and McCool 1999; Ben-Artzi et al. 2006] and real-time rendering [Soler et al. 2015]. Analytical basis functions have been used for BRDF measurement using spherical harmonics [Ghosh et al. 2009; 2010; Tunwattanapong et al. 2013] and for BRDF modeling using rational functions [Pacanowski et al. 2012].
A problem with existing factored models is that rank-1 approximations in most cases lead to inferior results. Accurate modeling requires iterative solutions with many layered factors. Analytic-data-driven BRDF models [Bagher et al. 2016; Sun et al. 2018] employ analytical models extended to a higher number of parameters fitted with measured data to achieve higher accuracy. Cooper et al. [2021] employ a Gaussian mixture model for BRDF representation and clustering that is utilized in the inverse rendering problem. The recent advancement of machine learning algorithms, in particular deep learning, has led to new research directions on BRDF-related topics [Dong 2019]. Deep learning has been used for BRDF editing [Hu et al. 2020], BRDF acquisition [Deschaintre et al. 2018; 2019; Li et al. 2018], compact BRDF representation for importance sampling [Sztrajman et al. 2021], and inverse rendering using invertible neural BRDFs [Chen et al. 2021]. Moreover, deep learning has also been used for Bidirectional Texture Function (BTF) compression, see, e.g., Rainer et al. [2019].
Dictionary Learning. One of the most commonly used dictionary learning methods is K-SVD [Aharon et al. 2006], together with its many variants [Mukherjee et al. 2015; Mazhar and Gader 2008; Rusu and Dumitrescu 2012; Marsousi et al. 2014], where a 1D signal (i.e., a vector) is represented as a linear combination of a set of basis vectors, called atoms. A clear disadvantage of K-SVD for BRDF representation is signal dimensionality. For instance, a measured BRDF in the MERL data set, excluding the spectral information, is a \(90\times 90\times 180=\text{1,458,000}\)-dimensional vector. In practice, the number of data points needed for K-SVD dictionary training should be several times the signal dimensionality to achieve a high-quality dictionary. In addition to the infeasible computational cost of training, the limited number of available measured BRDF data sets renders the utilization of K-SVD impractical.
In contrast to 1D dictionary learning methods, multidimensional dictionary learning has received little attention in the literature [Roemer et al. 2014; Hawe et al. 2013; Ding et al. 2017]. In multidimensional dictionary learning, a data point is treated as a tensor, and a dictionary is trained along each mode. For instance, given our example above, instead of training one \(\text{1,458,000}\)-dimensional dictionary for the MERL data set, one can train three dictionaries (i.e., one for each mode), where the atom sizes for these dictionaries are 90, 90, and 180, corresponding to the dimensionality of each mode. To the best of our knowledge, there exist only a few multidimensional dictionary learning algorithms. Our sparse BRDF model in this article is inspired by the multidimensional dictionary ensemble training proposed in Miandji et al. [2019], which has been shown to perform well for high-dimensional signals such as light fields and light field videos. We will elaborate on our training scheme for BRDFs in Section 3.2.
3 Sparse Data-Driven BRDF Model
Our non-parametric model is based on learning a set of multidimensional dictionaries, a dictionary ensemble, spanning the space of BRDFs, i.e., the space in which each BRDF is a single multi-dimensional point. Each dictionary in the ensemble consists of a set of basis functions (i.e., orthonormal matrices), representing each dimension of the BRDF space, that admits sparse representation of any measured BRDF using only a small number of coefficients as illustrated in Figure 1. The dictionary ensemble is trained only once on a given training set of measured BRDFs and can then be reused to represent a wide range of different BRDFs. This is in contrast to previous models that use tensor or matrix decomposition techniques, where the basis and the coefficients are calculated for each BRDF individually.
Fig. 1.
A major challenge when using machine learning methods, and in particular dictionary learning, on BRDFs is the high dynamic range inherent to the data. In Section 3.1, we describe two data transformations that, when applied to measured BRDFs, improve the fit to our non-parametric model; see Section 4. The training of the multidimensional dictionaries is described in Sections 3.2 and 3.3, followed by our model selection technique in Section 3.4, where we describe a method to select the most suitable dictionary in the ensemble for any unseen BRDF such that the coefficients are maximally sparse, the modeling error is minimal, and the data transformation used is one that leads to a better rendering quality. Finally, in Section 3.5, we present a novel algorithm for interpolating two or more measured BRDFs, represented using our model, directly in the representation space (i.e., using the sparse coefficients).
A BRDF can be parameterized in many different ways [Rusinkiewicz 1998; Löw et al. 2012; Stark et al. 2005; Barla et al. 2015]. Our dictionary learning approach does not rely on the parameterization of given BRDFs as long as the resolution of these BRDFs is the same. For simplicity, and to facilitate a fair comparison to previous works, all the data sets we use here are based on Rusinkiewicz's parameterization [Rusinkiewicz 1998] defined by \((\theta _h, \theta _d, \phi _d)\), where \(\theta _h\) and \(\theta _d\) are the elevation angles of the half vector and the difference vector, respectively, and \(\phi _d\) is the azimuthal angle of the difference vector. The sample BRDFs we use here are of resolution \(90\times 90\times 180\). The data sets are sampled with the MERL encoding, where a square-root function is applied to \(\theta _h\); see Matusik et al. [2003].
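To make the encoding concrete, the square-root warping of \(\theta _h\) can be sketched in a few lines of numpy; this is a hypothetical re-implementation (the function name and the clamping are ours, not part of the MERL reference code):

```python
import numpy as np

def theta_half_index(theta_h, n_theta_h=90):
    # Square-root warping of the MERL encoding: allocates more of the
    # n_theta_h bins to small theta_h, where specular peaks are located.
    idx = np.sqrt(np.asarray(theta_h) / (np.pi / 2)) * n_theta_h
    return np.clip(idx.astype(int), 0, n_theta_h - 1)
```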
3.1 BRDF Data Transformation
Measured BRDFs often exhibit a very high dynamic range, which introduces many difficulties during parameter fitting and optimization. It is therefore necessary to apply an invertible transformation to the BRDF values using, e.g., a log-mapping as suggested by Löw et al. [2012], Nielsen et al. [2015], Sun et al. [2018], and Tongbuasirilai et al. [2019]. In this article, we use two data transformation functions to improve the performance of our model during training and testing.
The first transformation is based on the log-plus transformation proposed by Löw et al. [2012]:
\(\rho _{t1}(\omega _h, \omega _d) = \ln \left(1 + \rho (\omega _h, \omega _d)\right),\quad (1)\)
where \(\rho\) is the original BRDF value, \(\rho _{t1}\) is the transformed BRDF value, and (\(\omega _h\), \(\omega _d\)) are the half vector and the difference vector of Rusinkiewicz's parameterization, respectively. For the second transformation, we use the log-relative mapping proposed by Nielsen et al. [2015]; however, we exclude the denominator. We call this transformation cosine-weighted-log:
\(\rho _{t2}(\omega _h, \omega _d) = \ln \left(1 + \rho (\omega _h, \omega _d)\,{\bf cosMap}(\omega _h, \omega _d)\right),\quad (2)\)
where \({\bf cosMap}\) is a function that maps the \((\omega _h, \omega _d)\) directions in the MERL encoding to \((\omega _i, \omega _o)\) in standard spherical coordinates, and calculates the weights as \(\max ({\bf cos}(\theta _i)\,{\bf cos}(\theta _o), \epsilon)\), where \(\epsilon = 0.001\).
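A minimal numpy sketch of the two transformations, assuming the forms of Equations (1) and (2) above; `theta_i` and `theta_o` denote the spherical angles produced by \({\bf cosMap}\) for each table entry:

```python
import numpy as np

def rho_t1(rho):
    # Log-plus transformation (Equation (1)): ln(1 + rho).
    return np.log1p(rho)

def rho_t2(rho, theta_i, theta_o, eps=1e-3):
    # Cosine-weighted-log transformation (Equation (2)): log-relative
    # mapping without the denominator, weighted by the clamped cosines.
    w = np.maximum(np.cos(theta_i) * np.cos(theta_o), eps)
    return np.log1p(rho * w)
```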
Using the proposed non-parametric model, we have conducted experiments with both transformations; see Table 1. The log-plus transformation in Equation (1) yields better results than the cosine-weighted-log transformation in Equation (2) for glossy materials. The cosine-weighted-log is in most cases a better choice for near-diffuse BRDFs.
Table 1. SNR of rendered images using the BRDF dictionaries trained with different dictionary sparsity levels: 32, 64, 128, and 256. Each ensemble covers both transformations, \(\rho _{t1}\) and \(\rho _{t2}\); all values are SNR in dB. The test set consists of 15 MERL materials (not included in the training). The bottom row shows the average SNR over the test set.

| Material | \(\rho _{t1}\), \(\tau _l=32\) | \(\rho _{t2}\), \(\tau _l=32\) | \(\rho _{t1}\), \(\tau _l=64\) | \(\rho _{t2}\), \(\tau _l=64\) | \(\rho _{t1}\), \(\tau _l=128\) | \(\rho _{t2}\), \(\tau _l=128\) | \(\rho _{t1}\), \(\tau _l=256\) | \(\rho _{t2}\), \(\tau _l=256\) |
|---|---|---|---|---|---|---|---|---|
| blue-fabric | 53.9003 | 58.4393 | 57.0890 | 61.1925 | 56.7419 | 62.4704 | 58.7229 | 62.9932 |
| blue-metallic-paint | 54.8105 | 56.8017 | 52.4779 | 59.7930 | 54.2249 | 61.0643 | 52.5073 | 60.5738 |
| dark-red-paint | 44.1094 | 51.9677 | 45.8218 | 52.4098 | 48.4695 | 54.7743 | 46.3005 | 54.4020 |
| gold-metallic-paint2 | 46.9514 | 38.6907 | 45.6783 | 36.0324 | 46.1956 | 37.4564 | 42.4208 | 41.1227 |
| green-metallic-paint2 | 50.7108 | 41.8161 | 49.4635 | 39.3023 | 52.8230 | 43.0459 | 49.6811 | 50.2204 |
| light-red-paint | 41.4139 | 49.0550 | 43.7449 | 48.7451 | 47.7306 | 52.1905 | 45.1613 | 50.6002 |
| pink-fabric2 | 44.8244 | 49.3862 | 48.5446 | 52.5484 | 52.6230 | 53.5701 | 52.5405 | 54.4938 |
| purple-paint | 43.8932 | 38.8859 | 42.2491 | 47.5648 | 48.2735 | 48.7324 | 45.3798 | 47.1568 |
| red-fabric | 47.5606 | 52.3038 | 50.9287 | 54.7831 | 53.9668 | 56.7085 | 55.3687 | 58.5863 |
| red-metallic-paint | 47.2351 | 40.3386 | 46.9943 | 38.4251 | 49.1860 | 42.1207 | 48.6971 | 42.6229 |
| silver-metallic-paint2 | 40.3291 | 42.9256 | 44.0442 | 43.2292 | 44.0323 | 46.8208 | 46.4961 | 44.1504 |
| specular-green-phenolic | 48.4841 | 41.6432 | 47.3226 | 36.5157 | 49.4785 | 48.8586 | 49.2522 | 45.9519 |
| specular-violet-phenolic | 48.2384 | 42.7994 | 47.4994 | 37.9801 | 47.4863 | 44.5840 | 48.3638 | 41.3332 |
| specular-yellow-phenolic | 46.4907 | 39.1758 | 44.5259 | 36.1666 | 45.4146 | 35.4231 | 43.1846 | 36.4720 |
| violet-acrylic | 48.7179 | 44.0610 | 48.7112 | 38.9536 | 47.6828 | 42.0749 | 48.1322 | 36.7368 |
| Average | 47.1779 | 45.8860 | 47.6730 | 45.5761 | 49.6219 | 48.6596 | 48.8139 | 48.4944 |
While we use the two most commonly used BRDF transformations, our sparse BRDF model is not limited to the choice of the transformation function. Indeed, given any new such function, the previously trained dictionary ensemble can be directly applied. However, to further improve the model accuracy, one can train a small set of dictionaries given a training set obtained with the new BRDF transformation. We then take the union of the previously and newly trained ensembles of dictionaries. The expansion of the dictionary ensemble is a unique characteristic of our model. We utilize this property in Section 3.3 to combine different sets of dictionaries, each trained with a distinct training sparsity. The same approach can be used here for improving the model accuracy when a new measured BRDF data set, that requires a more sophisticated transformation, is given.
3.2 Multidimensional Dictionary Learning for BRDFs
To build the non-parametric BRDF model, we seek to accurately model the space of BRDFs using basis functions leading to a high degree of sparsity for the coefficients while maintaining the visual fidelity of each BRDF in the training set. To achieve this, the training algorithm needs to take into account the multidimensional nature of BRDF objects, typically 3D or 4D, depending on the parameterization. Let \(\lbrace \mathbf {\mathcal {X}}^{(i)}\rbrace _{i=1}^{N_l}\) be a set of \(N_l\) BRDFs, where \(\mathbf {\mathcal {X}}^{(i)}\in \mathbb {R}^{m_1\times m_2\times m_3}\). Here, we do not assume any specific parameterization and only require that all the BRDFs in \(\lbrace \mathbf {\mathcal {X}}^{(i)}\rbrace _{i=1}^{N_l}\) have the same resolution. Moreover, as discussed in Section 3.1, we utilize two BRDF transformations, \(\rho _{t1}\) and \(\rho _{t2}\). As a result, the training set consists of two versions of each BRDF.
To achieve a sparse three-dimensional representation of \(\lbrace \mathbf {\mathcal {X}}^{(i)}\rbrace _{i=1}^{N_l}\), we train an ensemble of K three-dimensional dictionaries, denoted \(\lbrace \mathbf {U}^{(1,k)},\mathbf {U}^{(2,k)},\mathbf {U}^{(3,k)}\rbrace _{k=1}^{K}\), such that each BRDF, \(\mathbf {\mathcal {X}}^{(i)}\), can be decomposed as
\(\mathbf {\mathcal {X}}^{(i)} = \mathbf {\mathcal {S}}^{(i)} \times _1 \mathbf {U}^{(1,k)} \times _2 \mathbf {U}^{(2,k)} \times _3 \mathbf {U}^{(3,k)},\quad (3)\)
where \(\mathbf {U}^{(1,k)}\in \mathbb {R}^{m_1\times m_1}\), \(\mathbf {U}^{(2,k)}\in \mathbb {R}^{m_2\times m_2}\), \(\mathbf {U}^{(3,k)}\in \mathbb {R}^{m_3\times m_3}\), and \(k\in \lbrace 1,\dots ,K\rbrace\). The notation \(\times _n\) denotes the n-mode product between a tensor and a matrix as described in Kolda and Bader [2009]. Moreover, the tensor \(\mathbf {\mathcal {S}}^{(i)}\in \mathbb {R}^{m_1\times m_2\times m_3}\) contains a set of sparse coefficients, and we require \(\Vert \mathbf {\mathcal {S}}^{(i)}\Vert _0\le \tau\), where \(\tau\) is a user-defined sparsity parameter. Recall from Section 1 that the \(\ell _0\) norm, denoted \(\Vert .\Vert _0\), counts the number of nonzero elements in a vector, a matrix, or a tensor. It is evident from Equation (3) that each BRDF is represented using one dictionary in the ensemble, in this case \(\lbrace \mathbf {U}^{(1,k)},\mathbf {U}^{(2,k)},\mathbf {U}^{(3,k)}\rbrace\). Moreover, we require the number of dictionaries to be much smaller than the number of BRDFs in the training set, i.e., \(K\ll N_l\). As a result, the training algorithm divides the training set into a set of clusters, where each cluster is represented using one dictionary.
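For reference, a compact numpy sketch of the reconstruction in Equation (3); the helper `nmode` implements the n-mode product of Kolda and Bader [2009] with zero-based mode indices:

```python
import numpy as np

def nmode(T, M, n):
    # n-mode product T x_n M: multiplies M with the mode-n fibers of T.
    return np.moveaxis(np.tensordot(M, T, axes=(1, n)), 0, n)

def reconstruct(S, U1, U2, U3):
    # X = S x_1 U1 x_2 U2 x_3 U3, as in Equation (3).
    return nmode(nmode(nmode(S, U1, 0), U2, 1), U3, 2)
```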
The ensemble training is performed by solving the following optimization problem:
\(\min _{\mathbf {U}^{(j,k)},\, \mathbf {\mathcal {S}}^{(i,k)},\, \mathbf {M}}\ \sum _{i=1}^{N_l}\sum _{k=1}^{K} \mathbf {M}_{i,k} \left\Vert \mathbf {\mathcal {X}}^{(i)} - \mathbf {\mathcal {S}}^{(i,k)} \times _1 \mathbf {U}^{(1,k)} \times _2 \mathbf {U}^{(2,k)} \times _3 \mathbf {U}^{(3,k)} \right\Vert _F^2,\quad (4a)\)
\(\text{s.t.}\quad (\mathbf {U}^{(j,k)})^T \mathbf {U}^{(j,k)} = \mathbf {I},\ \forall j,k,\quad (4b)\)
\(\Vert \mathbf {\mathcal {S}}^{(i,k)}\Vert _0 \le \tau _l,\ \forall i,k,\quad (4c)\)
\(\mathbf {M}_{i,k} \in \lbrace 0,1\rbrace ,\ \textstyle \sum _{k=1}^{K} \mathbf {M}_{i,k} = 1,\ \forall i,\quad (4d)\)
where the matrix \(\mathbf {M}\in \mathbb {R}^{N_l\times K}\) is a clustering matrix associating each BRDF in the training set to one multidimensional dictionary in the ensemble. Equation (4b) ensures the orthonormality of each multidimensional dictionary, the sparsity of the coefficients is enforced by Equation (4c), and the single-dictionary representation is enforced by the constraint in Equation (4d). The user-defined parameter \(\tau _l\) defines the training sparsity. It should be noted that the clustering matrix \(\mathbf {M}\) divides the BRDFs in the training set into a set of clusters such that optimal sparse representation is achieved with respect to the number of model parameters (or coefficients) and the representation error. This clustering is an integral part of our model and improves the accuracy of BRDF representations. The supplementary document accompanying this manuscript contains the update rules for solving Equation (4) iteratively with respect to \(\mathbf {U}^{(j,k)}\), \(\mathbf {\mathcal {S}}^{(i,k)}\), and \(\mathbf {M}_{i,k}\).
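The exact update rules are given in the supplementary document; purely as an illustration of the alternating structure of such a solver, the following simplified (and runnable) sketch alternates an HOSVD-style orthonormal dictionary update per cluster with a re-assignment of BRDFs to their best dictionaries. It is not the paper's algorithm, which also couples the sparsity constraint into the dictionary update:

```python
import numpy as np

def nmode(T, M, n):
    return np.moveaxis(np.tensordot(M, T, axes=(1, n)), 0, n)

def sparse_recon(X, U, tau):
    # Project onto the orthonormal dictionary U = (U1, U2, U3), keep the
    # tau largest-magnitude coefficients, and reconstruct.
    S = X
    for n, Un in enumerate(U):
        S = nmode(S, Un.T, n)
    thr = np.partition(np.abs(S).ravel(), -tau)[-tau]
    S = np.where(np.abs(S) >= thr, S, 0.0)
    for n, Un in enumerate(U):
        S = nmode(S, Un, n)
    return S

def train_ensemble(Xs, K=8, tau_l=32, n_iters=10, seed=0):
    rng = np.random.default_rng(seed)
    dims = Xs[0].shape
    labels = rng.integers(0, K, len(Xs))        # random initial clustering M
    U = [[np.linalg.qr(rng.standard_normal((d, d)))[0] for d in dims]
         for _ in range(K)]
    for _ in range(n_iters):
        for k in range(K):                      # dictionary update per cluster
            members = [X for X, l in zip(Xs, labels) if l == k]
            if not members:
                continue
            for n in range(3):                  # mode-n singular vectors
                unf = np.concatenate([np.moveaxis(X, n, 0).reshape(dims[n], -1)
                                      for X in members], axis=1)
                U[k][n] = np.linalg.svd(unf, full_matrices=False)[0]
        # cluster re-assignment: each BRDF moves to its best dictionary
        errs = np.array([[np.linalg.norm(X - sparse_recon(X, U[k], tau_l))
                          for k in range(K)] for X in Xs])
        labels = errs.argmin(axis=1)
    return U, labels
```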
Our sparse BRDF modeling is inspired by the Aggregate Multidimensional Dictionary Ensemble (AMDE) proposed by Miandji et al. [2019]. However, we do not perform pre-clustering of data points, in this case BRDFs, for the following two reasons: First, the number of existing measured BRDF data sets is very limited. Hence, if we apply pre-clustering, then the number of available BRDFs to train a dictionary ensemble becomes inadequate. Second, since we use each BRDF as a data point, the size of each data point is \(90\times 90\times 180 = \text{1,458,000}\), rendering the pre-clustering method proposed in Miandji et al. [2019] impractical. Indeed, the two BRDF transformations discussed in Section 3.1 can be seen as a pre-clustering of the training set. These transformations divide the training set into diffuse and glossy BRDFs. Moreover, as will be described in Section 3.3, and unlike the method of Miandji et al. [2019], we perform multiple training runs on the same training set but with different values for the training sparsity parameter, \(\tau _l\). The obtained ensembles are combined to form an ensemble that can efficiently represent BRDFs with less reconstruction error.
3.3 BRDF Dictionary Ensemble with Multiple Sparsities
Measured BRDFs exhibit a variable degree of sparsity in the representation space (also known as the coefficient space). Indeed, given a suitable dictionary, a diffuse material requires only a small number of coefficients, while a highly glossy BRDF needs a significantly higher number of coefficients for an accurate representation. This phenomenon has been observed in previous work on non-parametric modeling of BRDFs based on factorization or on commonly known basis functions such as spherical harmonics [Tunwattanapong et al. 2013; Lawrence et al. 2004; Sun et al. 2018; Nielsen et al. 2015]. A shortcoming of the dictionary ensemble learning method described in Section 3.2 is that it does not take into account the intrinsic sparsity of the various materials in the training set. In other words, since the training sparsity \(\tau _l\) is fixed for all the BRDFs in the training set, a small value for \(\tau _l\) will steer the optimization algorithm to more efficiently model low frequency (or diffuse-like) materials, while neglecting high frequency materials. Moreover, when \(\tau _l\) is large, the dictionary optimization adapts to high frequency materials, leading to degradation of quality for diffuse materials due to over-fitting. Indeed, finding the optimal value of \(\tau _l\) for a set of BRDFs is a tedious task and requires various assumptions on the behavior of these materials in the representation space.
Instead of finding the optimal values for \(\tau _l\) for a large and diverse set of BRDFs in the training set, we train multiple ensembles of dictionaries, each with a different value for \(\tau _l\), so that we can model both low and high frequency details of the training BRDFs more efficiently, while lowering the risk of over-fitting. After training each ensemble according to the method described in Section 3.2, we combine them all to form one ensemble that includes all the dictionaries. We then utilize our model selection algorithm, see Section 3.4, for selecting the optimal dictionary in the combined ensemble for a given BRDF. Regardless of the material properties, e.g., low or high frequency, our model selection is able to select the best dictionary in the ensemble for the vast majority of cases, see Section 4 for details. In this article, we train 4 ensembles, each with 8 dictionaries, trained with \(\tau _l=32\), \(\tau _l=64\), \(\tau _l=128\), and \(\tau _l=256\); hence, the final ensemble consists of 32 dictionaries. Note that the combined ensemble with 32 dictionaries only requires \((32\times (90\times 90+90\times 90+180\times 180)\times 4)\,/\,1{,}024^2=5.93\) MB of storage, assuming 4 bytes for each element. Therefore, our ensemble can be used in applications where storage capabilities are very limited. In Table 1, we present rendering SNR results obtained from ensembles trained with different values of the training sparsity, \(\tau _l\).
Our BRDF model represents each BRDF using one dictionary from the ensemble. Each dictionary can be seen as a set of basis functions spanning the BRDF manifold in a local neighborhood defined by the dictionary. Therefore, our ensemble of 32 dictionaries defines 32 local coordinate frames for representing the space of all BRDFs defined on a manifold. This is in contrast to the PCA dictionary used in Nielsen et al. [2015], where a single coordinate frame defines the BRDF manifold, or the diffuse-specular separation method [Sun et al. 2018], where two coordinate frames representing diffuse and specular properties are defined. Hence, our method, with its much larger number of dictionaries, or coordinate frames, is more flexible and expressive, covering a greater range of material classes. Indeed, the membership matrix, \(\mathbf {M}\), can be thought of as a clustering matrix for the BRDF manifold. During training, a BRDF is assigned to the dictionary in which it is best represented given the limited number of coefficients defined by \(\tau _l\). The different dictionaries represent different physical properties of BRDFs. Table 1 demonstrates that, e.g., phenolic and paint materials cluster in dictionaries with low sparsity and that more glossy or specular BRDFs reside in high sparsity dictionaries. In addition, our ensemble can be expanded by training more dictionaries and adding them to the existing ensemble. This is particularly useful for modeling new data sets or new data transformations, if the existing ensemble does not provide satisfactory results.
3.4 BRDF Model Selection
Once the ensemble of dictionaries is trained, the next step is to use it for the sparse representation of BRDFs. We call this stage model selection: since the dictionaries in the ensemble are trained with different transformations and sparsities, we need to find the one dictionary that leads to the sparsest coefficients with the least error, as well as the best performing transformation between \(\rho _{t1}\) and \(\rho _{t2}\). Indeed, as mentioned in Section 3.1, our method is not limited to a particular number of transformations. In what follows, the term ensemble refers to the combined ensemble that is trained with multiple sparsities and BRDF transformations, as described in Section 3.3.
We begin by describing our method for selecting the most suitable dictionary in the ensemble for BRDF representation. This can be achieved by projecting each BRDF onto all the dictionaries in the ensemble. Let \(\lbrace \mathbf {\mathcal {Y}}^{(i)}\rbrace _{i=1}^{N_t}\) be a BRDF test set. The projection step is formulated as
\(\hat{\mathbf {\mathcal {S}}}^{(i,k)} = \mathbf {\mathcal {Y}}^{(i)} \times _1 (\mathbf {U}^{(1,k)})^T \times _2 (\mathbf {U}^{(2,k)})^T \times _3 (\mathbf {U}^{(3,k)})^T,\ \forall k\in \lbrace 1,\dots ,K\rbrace ,\quad (5)\)
where \(\mathbf {\mathcal {Y}}^{(i)}\) is a BRDF in the testing set for which we would like to obtain a sparse representation. As discussed in Section 3.1, we utilize two BRDF transformations, \(\rho _{t1}\) and \(\rho _{t2}\). As a result, the testing set consists of two versions of each BRDF \(\lbrace \mathbf {\mathcal {Y}}^{(i)}\rbrace _{i=1}^{N_t}\). The smallest components in the coefficient tensors \(\hat{\mathbf {\mathcal {S}}}^{(i,k)}\) are progressively nullified until we reach a user-defined sparsity level, called the testing sparsity, \(\tau _t\), or until the representation error becomes larger than a user-defined threshold. The testing sparsity, which defines the model complexity, is distinct from the training sparsity \(\tau _l\), and we typically require \(\tau _t\ge \tau _l\). Generally, a higher value for \(\tau _t\) is required for glossy materials than for diffuse materials to achieve an accurate BRDF representation. Indeed, the number of coefficients provides a trade-off between quality and performance, making our model flexible enough to be applied in a variety of applications.
After sparsifying \(\hat{\mathbf {\mathcal {S}}}^{(i,k)}\), \(\forall k\in \lbrace 1,\dots ,K\rbrace\), we pick the dictionary corresponding to the sparsest coefficient tensor among \(\hat{\mathbf {\mathcal {S}}}^{(i,k)}\), \(\forall k\in \lbrace 1,\dots ,K\rbrace\). If all the coefficient tensors \(\hat{\mathbf {\mathcal {S}}}^{(i,k)}\), \(\forall k\in \lbrace 1,\dots ,K\rbrace\), achieve the same sparsity, then we pick the dictionary corresponding to the least reconstruction error. Note that since the number of BRDFs in a test set (in principle, the set of all possible BRDFs at the same resolution) is always larger than the number of dictionaries in the ensemble, i.e., \(N_t\gt K\), the dictionary selection algorithm described above clusters the test set based on the dictionaries such that each cluster of BRDFs in the test set uses one dictionary of the ensemble. Recall that the same property holds for the training set since \(N_l \gg K\).
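A numpy sketch of this selection loop under the assumptions above; because every dictionary is orthonormal, the representation error equals the energy of the discarded coefficients, so the search needs no explicit reconstruction (the error threshold `rel_err` is a hypothetical parameter):

```python
import numpy as np

def nmode(T, M, n):
    return np.moveaxis(np.tensordot(M, T, axes=(1, n)), 0, n)

def select_dictionary(Y, ensemble, tau_t, rel_err=1e-3):
    # Project Y onto every dictionary (Equation (5)), nullify the smallest
    # coefficients until tau_t nonzeros remain or the error budget is met,
    # then pick the sparsest result, breaking ties by error (Equation (6)).
    best = None
    for k, (U1, U2, U3) in enumerate(ensemble):
        S = nmode(nmode(nmode(Y, U1.T, 0), U2.T, 1), U3.T, 2)
        e = np.sort(np.abs(S).ravel() ** 2)      # ascending energies
        dropped = np.cumsum(e)                   # error when dropping the i+1 smallest
        budget = (rel_err ** 2) * dropped[-1]
        n_drop = int(np.searchsorted(dropped, budget, side='right'))
        keep = max(min(e.size - n_drop, tau_t), 1)
        err = dropped[e.size - keep - 1] if keep < e.size else 0.0
        if best is None or (keep, err) < best[:2]:
            best = (keep, err, k, S)
    keep, err, k, S = best
    thr = np.partition(np.abs(S).ravel(), -keep)[-keep]
    return k, np.where(np.abs(S) >= thr, S, 0.0)
```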
The reconstruction error for a BRDF in the test set, \(\mathbf {\mathcal {Y}}^{(i)}\), modeled using a dictionary \(\lbrace \mathbf {U}^{(1,k)},\mathbf {U}^{(2,k)},\mathbf {U}^{(3,k)}\rbrace\), \(k\in \lbrace 1,\dots ,K\rbrace\), is simply calculated as
\(E^{(i,k)} = \left\Vert \mathbf {\mathcal {Y}}^{(i)} - \hat{\mathbf {\mathcal {S}}}^{(i,k)} \times _1 \mathbf {U}^{(1,k)} \times _2 \mathbf {U}^{(2,k)} \times _3 \mathbf {U}^{(3,k)} \right\Vert _F^2.\quad (6)\)
Because the BRDF dictionary ensemble is trained once and can be used for the sparse representation of unobserved BRDFs, the storage cost of the model in Equation (3) is defined by the storage complexity of the sparse coefficient tensor \(\hat{\mathbf {\mathcal {S}}}^{(i)}\) in Equation (5). We store the nonzero elements in \(\hat{\mathbf {\mathcal {S}}}^{(i)}\) as tuples of nonzero element locations and values, denoted \(\lbrace l_t^1,l_t^2,l_t^3,\hat{\mathbf {\mathcal {S}}}^{(i)}_{l_t^1,l_t^2,l_t^3}\rbrace _{t=1}^{\tau _t}\), where the indices \(l_t^1\), \(l_t^2\), and \(l_t^3\) store the location of the tth nonzero element of \(\hat{\mathbf {\mathcal {S}}}^{(i)}\), while the corresponding value is \(\hat{\mathbf {\mathcal {S}}}^{(i)}_{l_t^1,l_t^2,l_t^3}\).
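Since no mode dimension exceeds 180, each location index fits in a single byte, which is also the accounting used for the storage comparison in Section 4. A small numpy sketch of this packing:

```python
import numpy as np

def pack_sparse(S):
    # Store nonzeros as (l1, l2, l3, value) tuples; one byte per index
    # suffices because the mode sizes (90, 90, 180) never exceed 255.
    l1, l2, l3 = np.nonzero(S)
    coords = np.stack([l1, l2, l3], axis=1).astype(np.uint8)
    return coords, S[l1, l2, l3]
```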
The reconstruction of a given BRDF, \(\mathbf {\mathcal {Y}}^{(i)}\), using our model is computed by multiplying the sparse coefficient tensor \(\hat{\mathbf {\mathcal {S}}}^{(i,k)}\) with the corresponding dictionary as follows:
\(\hat{\mathbf {\mathcal {Y}}}^{(i)} = \hat{\mathbf {\mathcal {S}}}^{(i,k)} \times _1 \mathbf {U}^{(1,k)} \times _2 \mathbf {U}^{(2,k)} \times _3 \mathbf {U}^{(3,k)},\quad (7)\)
where k is the index of the dictionary chosen by the model selection method described above. Thanks to the fact that the coefficient tensor \(\hat{\mathbf {\mathcal {S}}}^{(i,k)}\) is sparse, Equation (7) is computationally tractable even for real-time applications. Indeed, we can evaluate Equation (7) by only multiplying the nonzero elements of \(\hat{\mathbf {\mathcal {S}}}^{(i,k)}\) with the dictionary, as shown in Miandji et al. [2019]. Reconstructing a single element in a BRDF only requires \(3\tau _t\) operations of type multiply-add (MAD). Indeed, for ray tracing applications, we only need to reconstruct one or a few elements of the BRDF tensor at each ray intersection point. Moreover, since \(\tau _t\) is user-defined, our method is computationally flexible, with an intuitive quality-performance trade-off that makes it suitable for a variety of applications.
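A sketch of this per-element evaluation using the packed (location, value) tuples from above; each stored coefficient contributes one product of three dictionary entries, i.e., roughly \(3\tau _t\) multiply-adds:

```python
def eval_brdf_entry(i1, i2, i3, coords, values, U1, U2, U3):
    # X[i1, i2, i3] = sum_t S[l1, l2, l3] * U1[i1, l1] * U2[i2, l2] * U3[i3, l3]
    out = 0.0
    for (l1, l2, l3), v in zip(coords, values):
        out += v * U1[i1, l1] * U2[i2, l2] * U3[i3, l3]
    return out
```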
Since our dictionary is trained with two sets of transformed BRDFs, i.e., \(\rho _{t1}\) and \(\rho _{t2}\), we can obtain two reconstructed BRDFs from an unseen BRDF by employing the algorithm described above. This still leaves us with the problem of selecting the best reconstructed BRDF between \(\rho _{t1}\) and \(\rho _{t2}\). Due to the discrepancy between quantitative quality metrics computed over the BRDF space (such as MSE) and the rendering quality [Bieron and Peers 2020], model selection is a difficult task for BRDF fitting, as well as for learning-based methods such as ours. For instance, log-based metrics [Löw et al. 2012; Sun et al. 2018] have been used to improve the efficiency of fitting measured BRDFs to parametric functions. Indeed, the most reliable technique is to render a collection of images for all possible variations of the model and select the one that is closest to an image rendered using the reference BRDF. This approach has been used by Bieron and Peers [2020] for BRDF fitting. To reduce the number of renderings, multiple BRDF parameter fits are performed using a power function with different inputs. The model selection is then performed by rendering a test scene and choosing the best model based on image quality metrics.
We propose a model selection approach that does not require rendering the reconstructed \(\rho _{t1}\) and \(\rho _{t2}\) variants of BRDFs. From our observations, we found that using the MSE as the error metric in the model selection algorithm, see Equation (6), does not match a selection method based on rendering quality when choosing between \(\rho _{t1}\) and \(\rho _{t2}\), see Table 2. To address this problem, we use a Gamma mapping function, \(\Gamma (\rho ,\gamma)=\rho ^{1/\gamma }\), on the reference, \(\rho _{t1}\), and \(\rho _{t2}\), prior to computing the MSE. We refer to this error metric as Gamma-mapped-MSE. Note that since the reference BRDF is in the linear BRDF domain, i.e., it is not transformed, we invert \(\rho _{t1}\) and \(\rho _{t2}\) according to Equations (1) and (2), respectively, prior to computing the Gamma-mapped-MSE.
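A minimal sketch of the metric and the resulting selection rule, assuming both reconstructions have already been inverted back to the linear BRDF domain:

```python
import numpy as np

def gamma_mapped_mse(rho_ref, rho_recon, gamma=2.0):
    # MSE computed after the gamma mapping rho**(1/gamma).
    g = lambda r: np.power(np.maximum(r, 0.0), 1.0 / gamma)
    return np.mean((g(rho_ref) - g(rho_recon)) ** 2)

def pick_transformation(rho_ref, recon_t1, recon_t2):
    # Choose the transformation whose inverted reconstruction scores lower.
    e1 = gamma_mapped_mse(rho_ref, recon_t1)
    e2 = gamma_mapped_mse(rho_ref, recon_t2)
    return ('t1', recon_t1) if e1 <= e2 else ('t2', recon_t2)
```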
Table 2. Rendering SNR, Gamma-mapped-MSE, and MSE, obtained using our sparse BRDF model for \(\rho _{t1}\) and \(\rho _{t2}\).

| Material | SNR (dB), \(\rho _{t1}\) | SNR (dB), \(\rho _{t2}\) | Gamma-mapped-MSE, \(\rho _{t1}\) | Gamma-mapped-MSE, \(\rho _{t2}\) | MSE, \(\rho _{t1}\) | MSE, \(\rho _{t2}\) |
|---|---|---|---|---|---|---|
| blue-fabric | 53.99 | 62.16 | 0.0233 | 0.0038 | 0.0002 | 0.0062 |
| blue-metallic-paint | 51.65 | 60.53 | 0.0448 | 0.0375 | 0.0011 | 0.0317 |
| dark-red-paint | 49.16 | 54.80 | 0.0616 | 0.0242 | 0.0295 | 0.1209 |
| gold-metallic-paint2 | 48.29 | 37.68 | 0.9248 | 0.9350 | 72.0330 | 38.7850 |
| green-metallic-paint2 | 57.48 | 43.36 | 0.8767 | 0.8939 | 31.5660 | 11.9140 |
| light-red-paint | 46.51 | 51.68 | 0.0552 | 0.0312 | 0.0567 | 0.1699 |
| pink-fabric2 | 52.66 | 52.71 | 0.0230 | 0.0125 | 0.0003 | 0.0323 |
| purple-paint | 45.81 | 47.24 | 0.1991 | 0.1725 | 2.8225 | 2.1342 |
| red-fabric | 56.26 | 55.05 | 0.0177 | 0.0078 | 0.0002 | 0.0172 |
| red-metallic-paint | 52.70 | 42.62 | 1.2910 | 1.3086 | 45.0140 | 19.3610 |
| silver-metallic-paint2 | 44.70 | 44.55 | 0.0988 | 0.0895 | 0.0029 | 0.1217 |
| specular-green-phenolic | 53.09 | 36.67 | 0.9889 | 1.0161 | 27.1120 | 14.0560 |
| specular-violet-phenolic | 50.51 | 38.21 | 0.9722 | 0.9925 | 22.4920 | 14.3410 |
| specular-yellow-phenolic | 46.81 | 36.40 | 0.9454 | 0.9686 | 18.8780 | 10.6280 |
| violet-acrylic | 50.07 | 42.61 | 0.7770 | 0.7849 | 20.2560 | 12.7940 |

Comparing the transformation chosen based on rendering SNR with the transformations chosen by Gamma-mapped-MSE and by MSE in the BRDF space, we see that Gamma-mapped-MSE distinguishes the suitable transformation for 13 of 15 materials, while MSE selects the correct transformation for only 3 of 15 materials. For results generated using Gamma-mapped-MSE, we set \(\gamma =2.0\).
In Table 2, we report reconstruction quality measured with rendering SNR, Gamma-mapped-MSE, and MSE for both \(\rho _{t1}\) and \(\rho _{t2}\). For these results, we used 15 test materials from the MERL data set, while the remaining 85 materials were used for training. It can be seen that Gamma-mapped-MSE can well distinguish the best transformation among \(\rho _{t1}\) and \(\rho _{t2}\) with respect to rendering SNR for 13 of 15 materials. The two exceptions are red-fabric and silver-metallic-paint2. It can also be seen that MSE only selects the correct transformation for 3 of 15 materials. To obtain the Gamma-mapped-MSE results, we used \(\gamma =2.0\). Indeed, this parameter can be tuned per-BRDF to further improve our results; however, we found that a fixed value of \(\gamma =2.0\) is adequate to achieve a significant advantage over previous methods. Compared to Bieron and Peers [2020], our model selection approach provides greater flexibility by considering rendering quality (via Gamma-mapped-MSE), BRDF transformations (using \(\rho _{t1}\) and \(\rho _{t2}\)), as well as model complexity (via \(\tau _l\)).
3.5 BRDF Interpolation in Sparseland
In this section, we describe our method for interpolating two or more BRDFs, where we consider two scenarios: when the BRDFs use the same dictionary from the ensemble, and when each BRDF uses a distinct dictionary. Unlike previous methods, we perform the interpolation in the lower-dimensional space defined by the sparse coefficients, as opposed to interpolating the BRDFs in the original high-dimensional BRDF-space. We show that using our model, one can efficiently interpolate between BRDFs even when they use different dictionaries. BRDF interpolation on a nonlinear space has been previously explored using the charting algorithm [Matusik et al. 2003; Brand 2002], where locally linear lower-dimensional subspaces are constructed with the same dimensionality. Here, we show that when two or more BRDFs are sparse in distinct dictionaries, BRDF interpolation in the sparse coefficient space is possible. Moreover, each BRDF that is used for interpolation using our model can have different sparsity. Compared to Brand [2002] and Matusik et al. [2003], the local subspaces in our model can have different dimensionalities in the representation space. In what follows, and without loss of generality, we present our interpolation method for two BRDFs. Extension of our formulation to three or more BRDF data points is straightforward.
As described in Sections 1 and 2, existing non-parametric models obtain the model coefficients by fitting techniques that are applied for each BRDF individually. Therefore, the resulting coefficients for each BRDF do not necessarily share a common basis. Since the coefficients of each BRDF to be interpolated lie in a distinct coordinate system, one cannot perform the interpolation in the coefficient space. It should be noted that, once a BRDF is represented using a non-parametric model, we do not have access to the original BRDF. The only information that is available is the coefficients and the representation basis (i.e., the dictionary). Therefore, it is essential to perform the interpolation in the coefficient space.
To interpolate non-parametric BRDFs in the coefficient space, one should define a common basis for the BRDFs to be interpolated, e.g., by employing PCA as in Nielsen et al. [2015]. Since all the BRDFs lie in the same coordinate space, defined by the principal components, one can linearly interpolate the coefficients instead of the original BRDFs. Note that this property is not shared by the majority of non-parametric models [Bagher et al. 2016; Bilgili et al. 2011]. The interpolated coefficients and the principal components are then used to reconstruct the resulting BRDF. However, as it will be shown in Section 4, having one common basis (i.e., the set of principal components) for the entire space of possible BRDFs leads to vastly inferior results due to inadequate model capacity. If we define multiple dictionaries, as with our proposed method, then two BRDFs that are represented with two distinct dictionaries cannot be interpolated in the coefficient space. This is evident, since these two BRDFs use two distinct coordinate systems. Our proposed interpolation technique overcomes this limitation by transforming the sparse coefficients into a common basis, which enables the interpolation directly in the coefficient space.
Let \(\mathbf {\mathcal {B}}_1\), \(\mathbf {\mathcal {B}}_2\), and \(\mathbf {\mathcal {B}}_3\) be three BRDFs that use the dictionaries \(\lbrace \mathbf {U}^{(1,1)}\), \(\mathbf {U}^{(2,1)}\), \(\mathbf {U}^{(3,1)}\rbrace\), \(\lbrace \mathbf {U}^{(1,1)}\), \(\mathbf {U}^{(2,1)}\), \(\mathbf {U}^{(3,1)}\rbrace\), and \(\lbrace \mathbf {U}^{(1,2)}\), \(\mathbf {U}^{(2,2)}\), \(\mathbf {U}^{(3,2)}\rbrace\), respectively; i.e., we assume that \(\mathbf {\mathcal {B}}_1\) and \(\mathbf {\mathcal {B}}_2\) use the same dictionary from the ensemble. The linearly interpolated BRDF obtained from \(\mathbf {\mathcal {B}}_1\) and \(\mathbf {\mathcal {B}}_2\), which we denote \(\mathbf {\mathcal {I}}\), is then given by
\(\mathbf {\mathcal {I}} = \alpha \mathbf {\mathcal {B}}_1 + (1-\alpha)\mathbf {\mathcal {B}}_2\quad (8)\)
\(= \alpha \left(\mathbf {\mathcal {S}}_1 \times _1 \mathbf {U}^{(1,1)} \times _2 \mathbf {U}^{(2,1)} \times _3 \mathbf {U}^{(3,1)}\right) + (1-\alpha)\left(\mathbf {\mathcal {S}}_2 \times _1 \mathbf {U}^{(1,1)} \times _2 \mathbf {U}^{(2,1)} \times _3 \mathbf {U}^{(3,1)}\right)\quad (9)\)
\(= \left(\alpha \mathbf {\mathcal {S}}_1 + (1-\alpha)\mathbf {\mathcal {S}}_2\right) \times _1 \mathbf {U}^{(1,1)} \times _2 \mathbf {U}^{(2,1)} \times _3 \mathbf {U}^{(3,1)},\quad (10)\)
where \(\mathbf {\mathcal {S}}_1\) and \(\mathbf {\mathcal {S}}_2\) are the sparse coefficients of \(\mathbf {\mathcal {B}}_1\) and \(\mathbf {\mathcal {B}}_2\), respectively, and the parameter \(\alpha\) is the interpolation coefficient. Note that Equation (10) follows from Equation (9) only because the dictionary \(\lbrace \mathbf {U}^{(1,1)},\mathbf {U}^{(2,1)},\mathbf {U}^{(3,1)}\rbrace\) is used by both \(\mathbf {\mathcal {B}}_1\) and \(\mathbf {\mathcal {B}}_2\). One can apply the same approach for bilinear and bicubic interpolation, or even interpolation of BRDFs over triangles using barycentric coordinates.
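In code, Equation (10) reduces to a single blend of the coefficient tensors; a minimal sketch:

```python
def interp_same_dictionary(S1, S2, alpha):
    # Equation (10): with a shared dictionary, interpolating the sparse
    # coefficients is equivalent to interpolating the BRDFs themselves.
    return alpha * S1 + (1.0 - alpha) * S2
```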
To interpolate two BRDFs that do not share the same basis, i.e., when they are modeled using two distinct dictionaries from the ensemble, one cannot utilize Equation (10). For this case, we propose to transform the sparse coefficients of all BRDF data points that are used in the interpolation into a common basis by choosing one dictionary as the reference basis. In this way, we can utilize Equation (10) for interpolation in the coefficient space rather than the BRDF space. To elaborate on this, and without loss of generality, let us present our approach for interpolating \(\mathbf {\mathcal {B}}_1\) and \(\mathbf {\mathcal {B}}_3\), which use the two distinct dictionaries \(\lbrace \mathbf {U}^{(1,1)},\mathbf {U}^{(2,1)},\mathbf {U}^{(3,1)}\rbrace\) and \(\lbrace \mathbf {U}^{(1,2)},\mathbf {U}^{(2,2)},\mathbf {U}^{(3,2)}\rbrace\), respectively. Our method is based on a simple observation: Given two orthonormal matrices, say \(\mathbf {D}_1\) and \(\mathbf {D}_2\), of the same dimensionality, there exists a unique orthonormal transformation matrix, \(\mathbf {R}\), such that \(\mathbf {D}_1=\mathbf {D}_2\mathbf {R}\) or \(\mathbf {D}_2=\mathbf {D}_1\mathbf {R}^T\). More concretely:
Proposition 1. Let \(\mathbf {A}\) and \(\mathbf {B}\) be two orthonormal matrices in \(\mathbb {R}^{m\times m}\). Then
\(\mathbf {B} = \mathbf {A}\mathbf {R},\quad (11)\)
where the orthonormal matrix
\(\mathbf {R} = \mathbf {A}^T\mathbf {B}\quad (12)\)
is unique.
The proof of Proposition 1 can be found in Gower and Dijksterhuis [2004]. Note that Proposition 1 is a variant of the Orthogonal Procrustes problem with the difference that here we assume that the matrices \(\mathbf {A}\) and \(\mathbf {B}\) are orthonormal; as a result, the transformation matrix \(\mathbf {R}\) is either a rotation or rotation-reflection matrix, depending on whether \(\mathrm{det}(\mathbf {R})=1\) or \(\mathrm{det}(\mathbf {R})=-1\).
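A short numpy check of Proposition 1 for square orthonormal matrices (the function name is illustrative):

```python
import numpy as np

def transform_between(A, B):
    # R = A^T B is orthonormal and satisfies B = A R (Proposition 1).
    return A.T @ B

# Example: A, B from QR factorizations of random matrices.
rng = np.random.default_rng(0)
A = np.linalg.qr(rng.standard_normal((90, 90)))[0]
B = np.linalg.qr(rng.standard_normal((90, 90)))[0]
R = transform_between(A, B)
assert np.allclose(A @ R, B) and np.allclose(R.T @ R, np.eye(90))
```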
Let \(\mathbf {R}^{(i,k)\rightarrow (i,j)}\) be the transformation matrix that transforms the orthonormal matrix \(\mathbf {U}^{(i,k)}\) to \(\mathbf {U}^{(i,j)}\); i.e., we have that \(\mathbf {U}^{(i,j)} = \mathbf {U}^{(i,k)}\mathbf {R}^{(i,k)\rightarrow (i,j)}\). It should be noted that, according to Proposition 1, we can only compute \(\mathbf {R}^{(i,k)\rightarrow (i,j)}\) for two orthonormal matrices of the same dimensionality. This is reflected in the fact that the index i is shared by \(\mathbf {U}^{(i,k)}\) and \(\mathbf {U}^{(i,j)}\), i.e., the matrices that define an orthonormal basis for the ith mode of BRDFs. The transformation matrices, to be obtained using Proposition 1, between the dictionaries for \(\mathbf {\mathcal {B}}_1\) and \(\mathbf {\mathcal {B}}_3\) are
\(\mathbf {R}^{(1,1)\rightarrow (1,2)} = (\mathbf {U}^{(1,1)})^T\mathbf {U}^{(1,2)},\quad (13)\)
\(\mathbf {R}^{(2,1)\rightarrow (2,2)} = (\mathbf {U}^{(2,1)})^T\mathbf {U}^{(2,2)},\quad (14)\)
\(\mathbf {R}^{(3,1)\rightarrow (3,2)} = (\mathbf {U}^{(3,1)})^T\mathbf {U}^{(3,2)}.\quad (15)\)
We can now state the formula for interpolating between two BRDFs that are not represented by a common dictionary, namely, \(\mathbf {\mathcal {B}}_1\) and \(\mathbf {\mathcal {B}}_3\) as given by the example above:
\(\mathbf {\mathcal {I}} = \alpha \mathbf {\mathcal {B}}_1 + (1-\alpha)\mathbf {\mathcal {B}}_3\quad (16)\)
\(= \alpha \,\mathbf {\mathcal {S}}_1 \times _1 \mathbf {U}^{(1,1)} \times _2 \mathbf {U}^{(2,1)} \times _3 \mathbf {U}^{(3,1)} + (1-\alpha)\,\mathbf {\mathcal {S}}_3 \times _1 \mathbf {U}^{(1,1)}\mathbf {R}^{(1,1)\rightarrow (1,2)} \times _2 \mathbf {U}^{(2,1)}\mathbf {R}^{(2,1)\rightarrow (2,2)} \times _3 \mathbf {U}^{(3,1)}\mathbf {R}^{(3,1)\rightarrow (3,2)}\quad (17)\)
\(= \alpha \,\mathbf {\mathcal {S}}_1 \times _1 \mathbf {U}^{(1,1)} \times _2 \mathbf {U}^{(2,1)} \times _3 \mathbf {U}^{(3,1)} + (1-\alpha)\,\tilde{\mathbf {\mathcal {S}}}_3 \times _1 \mathbf {U}^{(1,1)} \times _2 \mathbf {U}^{(2,1)} \times _3 \mathbf {U}^{(3,1)}\quad (18)\)
\(= \left(\alpha \mathbf {\mathcal {S}}_1 + (1-\alpha)\tilde{\mathbf {\mathcal {S}}}_3\right) \times _1 \mathbf {U}^{(1,1)} \times _2 \mathbf {U}^{(2,1)} \times _3 \mathbf {U}^{(3,1)},\quad (19)\)
where \(\tilde{\mathbf {\mathcal {S}}}_3 = \mathbf {\mathcal {S}}_3 \times _1 \mathbf {R}^{(1,1)\rightarrow (1,2)} \times _2 \mathbf {R}^{(2,1)\rightarrow (2,2)} \times _3 \mathbf {R}^{(3,1)\rightarrow (3,2)}\), and Equation (18) follows from Equation (17) due to a fundamental property of the n-mode product [Kolda and Bader 2009]. Equation (18) shows that one can apply the transformation matrices onto the coefficients of \(\mathbf {\mathcal {B}}_3\) to transform them into the coordinate space of \(\mathbf {\mathcal {B}}_1\) defined by \(\lbrace \mathbf {U}^{(1,1)},\mathbf {U}^{(2,1)},\mathbf {U}^{(3,1)}\rbrace\). When we have more than two BRDFs to interpolate, where each BRDF uses a distinct dictionary from the ensemble, we choose one dictionary as the reference basis and utilize Equation (18) to transform all coefficients to this reference dictionary. Afterwards, linear, bilinear, or cubic interpolation can be done on the coefficients directly. The results are then multiplied by the reference dictionary to obtain the interpolated BRDF, as in Equation (19).
It can be noted from Equation (19) that \(\tilde{\mathbf {\mathcal {S}}}_3\) is not necessarily sparse, since it is obtained by multiplying the sparse tensor \(\mathbf {\mathcal {S}}_3\) by the dense matrices \(\lbrace \mathbf {R}^{(1,1)\rightarrow (1,2)},\mathbf {R}^{(2,1)\rightarrow (2,2)},\mathbf {R}^{(3,1)\rightarrow (3,2)}\rbrace\). To achieve a desired sparsity for the interpolated BRDF, we utilize the projection step of the model selection algorithm in Section 3.4. Note that since the dictionary is known, i.e., \(\lbrace \mathbf {U}^{(1,1)},\mathbf {U}^{(2,1)},\mathbf {U}^{(3,1)}\rbrace\), we do not need to perform the full model selection algorithm. This operation can be thought of as re-projecting the interpolated BRDF onto the sparse BRDF manifold. Moreover, it allows us to obtain a desired sparsity for the interpolated BRDF, since the re-projection sparsity can be different from the sparsity of the BRDFs used for interpolation. It should be noted that our interpolation algorithm does not require equal sparsity for the BRDFs used in the interpolation.
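Putting Equations (16)-(19) and the re-projection step together, a numpy sketch of cross-dictionary interpolation, with the first dictionary taken as the reference basis and a `tau_t` parameter controlling the re-projection sparsity:

```python
import numpy as np

def nmode(T, M, n):
    return np.moveaxis(np.tensordot(M, T, axes=(1, n)), 0, n)

def interp_cross_dictionary(S1, S3, U_ref, U_other, alpha, tau_t):
    # Rotate S3 into the coordinate frame of the reference dictionary,
    # blend the coefficients (Equation (19)), then re-project onto the
    # sparse manifold by keeping the tau_t largest coefficients.
    S3t = S3
    for n, (Ur, Uo) in enumerate(zip(U_ref, U_other)):
        S3t = nmode(S3t, Ur.T @ Uo, n)   # R for mode n, via Proposition 1
    S = alpha * S1 + (1.0 - alpha) * S3t
    thr = np.partition(np.abs(S).ravel(), -tau_t)[-tau_t]
    return np.where(np.abs(S) >= thr, S, 0.0)
```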
The interpolation algorithm proposed above admits the construction of a smooth surface over the coefficient space of all BRDFs represented using our model. Since our model enables the efficient sparse representation of any measured BRDF, any two (or more) BRDFs can be smoothly interpolated in the coefficient space using our model.
3.6 Importance Sampling
For the results presented in this article, we utilize Rusinkiewicz's parameterization defined by \((\theta _h, \theta _d, \phi _d)\). As a result, Multiple Importance Sampling (MIS) can be utilized in the same way as in the method presented in Bagher et al. [2016]. A two-dimensional Cumulative Distribution Function (CDF) can be constructed from our sparse BRDF representation using the luminance channel. The inverted CDF is then used to sample directions proportional to the BRDF. For the example renderings in Section 4, where HDR environment maps are used, this is also combined with a two-dimensional CDF for sampling the illumination from the environment maps. The two sampling strategies are combined using MIS as suggested in Bagher et al. [2016].
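As an illustration of the building block involved, a generic sketch of tabulated inverse-CDF sampling; `lum` is assumed to be a strictly positive 2D luminance table reconstructed from the sparse representation (the exact CDF construction of Bagher et al. [2016] differs in its parameterization details):

```python
import numpy as np

def build_cdfs(lum):
    # Marginal CDF over rows and conditional CDFs over columns.
    pdf = lum / lum.sum()
    row_cdf = np.cumsum(pdf.sum(axis=1))
    cond_cdf = np.cumsum(pdf / pdf.sum(axis=1, keepdims=True), axis=1)
    return row_cdf, cond_cdf

def sample_table(row_cdf, cond_cdf, u1, u2):
    # Inverse-transform sampling with two uniform random numbers.
    i = min(int(np.searchsorted(row_cdf, u1)), len(row_cdf) - 1)
    j = min(int(np.searchsorted(cond_cdf[i], u2)), cond_cdf.shape[1] - 1)
    return i, j
```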
4 Results and Discussion
This section presents an evaluation of the proposed BRDF model and comparisons to the current state-of-the-art models in terms of BRDF reconstruction error and rendering quality. The rendering results were generated using PBRT [Pharr and Humphreys 2010] with the Grace Cathedral environment map. The images were rendered at a resolution of \(512\times 512\) pixels using 512 pixel samples in PBRT with the directlighting surface integrator and 256 infinite light-source samples.
The BRDF dictionary was trained using materials from the MERL database [Matusik et al. 2003] and RGL-EPFL isotropic BRDF database [Dupuy and Jakob 2018]. We split the MERL and RGL-EPFL materials into a training set and a test set. The training set contains 136 materials, where 85 materials are from the MERL dataset and 51 materials are from the RGL-EPFL dataset. The test set contains 28 materials with 15 materials from the MERL dataset, 8 materials from the DTU data set [Nielsen et al. 2015] and 5 materials from RGL-EPFL. The training and test sets cover a wide range of material classes and none of the materials in the test set appear in the training set.
Each BRDF color channel is processed independently for the training and model selection. We use Rusinkiewicz's parameterization [Rusinkiewicz 1998] at a resolution of \(90\times 90\times 180\), i.e., we have \(m_1 = 90\), \(m_2 = 90\), and \(m_3 = 180\). For our experiments, we trained four ensembles, each with \(K = 8\) dictionaries and with training sparsities of \(\tau _l=32\), \(\tau _l=64\), \(\tau _l=128\), and \(\tau _l=256\). We then construct one ensemble by taking the union of the dictionaries in the four trained ensembles, as described in Section 3.3. The training BRDFs were transformed using the log-plus (\(\rho _{t1}\)) and cosine-weighted-log (\(\rho _{t2}\)) functions before starting the training, hence resulting in 272 materials. The training took about 11 hours on a machine with 40 cores clocked at 2.4 GHz.
Once the ensemble is trained, we use the model selection algorithm, described in Section 3.4, to obtain the optimal sparse coefficients of each BRDF in the test set. The model selection for each BRDF and for all three color channels takes about 61 seconds using a CPU with 16 cores clocked at 3.8 GHz. Note that the model selection can be parallelized over the BRDFs in the test set, since this task is performed independently for each BRDF. Moreover, the reconstruction of all the elements of a measured BRDF, i.e., a total of \(\text{4,374,000}\) elements, takes 17 seconds using the same hardware setup. It should be noted that during rendering, only one or a few values from the measured BRDF are needed at each intersection point. In this case, we utilize the algorithm proposed in Miandji et al. [2019] to recover a single element. In addition, since the reconstruction in Equation (7) is based on the n-mode product of a sparse tensor and dense matrices, we can utilize the GPU-based algorithm in Baravdish et al. [2019] for real-time applications. We have left this application for future work. A CPU-based implementation of our method for model selection and reconstruction is provided at https://github.com/emiandji/sparse-brdf.
To evaluate our sparse BRDF model, we use two quality metrics: SNR, which is calculated on the rendered images (floating-point images), and Relative Absolute Error (RAE), which is computed on linear BRDF values. Note that since the reference BRDF is in the linear BRDF domain, i.e., it is not transformed, we invert the transformation of the reconstructed BRDF (which may be \(\rho _{t1}\) or \(\rho _{t2}\) depending on the model selection result) to transform it back to the linear space prior to computing the RAE. The RAE is defined as
\(\mathrm{RAE} = \frac{\sum |\rho _{recon} - \rho _{ref}|}{\sum |\rho _{ref}|},\quad (20)\)
where \(\rho _{ref}\) is the reference BRDF, and \(\rho _{recon}\) is the reconstructed BRDF. Even though rendering SNR (or PSNR) is typically used to evaluate BRDF models in many publications, RAE is very useful to capture the model accuracy without relying on a specific rendering setup. This is because RAE is computed on all BRDF values, while rendering SNR only takes into account a portion of the BRDF values, depending on the rendering setup. Note that for rendering, we invert Equations (1) and (2) to transform the reconstructed BRDFs back to the original linear domain, which is required by the renderer.
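A one-function sketch of this metric, assuming the summation form of Equation (20):

```python
import numpy as np

def rae(rho_ref, rho_recon):
    # Relative Absolute Error over all linear-domain BRDF values.
    return np.abs(rho_recon - rho_ref).sum() / np.abs(rho_ref).sum()
```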
We compare our results to Bagher et al. [2016] (naive model), Bilgili et al. [2011] (Tucker decomposition), and Tongbuasirilai et al. [2019] (rank-1 CPD decomposition with L = 1) on 15 MERL test materials. The method of Bagher et al. [2016] stores \((90+90+180+2) = 362\) coefficients per channel, Bilgili et al. [2011] uses \((128+16+16+64+2) = 226\) coefficients, and the CPD decomposition from Tongbuasirilai et al. [2019] uses \((90+90+180) = 360\) coefficients per channel. Since the Tucker and CPD methods use an iterative approach, we limit our comparisons to L = 1, i.e., a single factorization was performed so that the number of coefficients used for all models was roughly the same. The CPD method was tested using two different parameterizations: the PDV [Löw et al. 2012; Tongbuasirilai et al. 2019] and HD [Rusinkiewicz 1998] parameterizations.
To the best of our knowledge, the model of Bagher et al. [2016] is the current state-of-the-art for non-parametric BRDF models. Therefore, we set the number of coefficients \(\tau _t\) for our model such that we match the storage complexity of Bagher et al. [2016]. Since our representation is sparse, we require \(1+1+1=3\) bytes for storing the location of each nonzero element in a sparse coefficient tensor of size \(90\times 90\times 180\) (one byte per mode index), as well as 8 bytes for each nonzero value. Simple calculations show that with \(\tau _t=262\) coefficients per color channel, our model matches the storage complexity of Bagher et al. [2016], which uses 362 coefficients per color channel. The discrepancy in coefficient counts arises because we must also store the locations of the nonzero values, which forces our method to use fewer coefficients.
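The match can be verified directly from the byte counts stated above:
\[
\underbrace{262 \times (3 + 8)}_{\text{ours}} = 2882~\text{bytes} \;\le\; \underbrace{362 \times 8}_{\text{Bagher et al. [2016]}} = 2896~\text{bytes}.
\]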
For the rendered images, shown in Figures 2, 3, and 9, we apply gamma-corrected tone-mapping. The error images, also known as false-color images, are produced by normalizing the absolute image-space error to the range \([0,1]\) and applying MATLAB's jet color map. Normalization is done individually for each BRDF represented using the different models. All the error images are multiplied by 10 to facilitate visual comparisons. Beyond the results presented below, additional quantitative and qualitative results are included in the supplementary materials.
Fig. 2. Renderings of four BRDFs from the MERL test set using our model (with Gamma-mapped-MSE model selection) and that of Bagher et al. [2016], under the Grace Cathedral environment map.
Fig. 3. Renderings of materials from the DTU data set using \(\rho _{t1}\), \(\rho _{t2}\), and our model selection, under the Grace Cathedral environment map.
Table 3 reports SNR statistics for 15 test materials in the MERL database. The average SNR of our model is about 8dB, 5dB, and 10dB higher than Bagher et al. for log-plus, cosine-weighted-log, and the proposed Gamma-mapped-MSE model selection, respectively; moreover, our results show a smaller standard deviation on SNR, indicating that the proposed model represents the MERL materials more consistently. Additionally, our proposed model selection method achieves a higher SNR on average compared to using only log-plus or only cosine-weighted-log. Table 4 shows a direct comparison of our model to that of Bagher et al. [2016] for each BRDF in the MERL test set using rendering SNR and BRDF-space RAE. Here, we use our Gamma-mapped-MSE metric to choose between the transformations. Compared to the model of Bagher et al., our approach achieves significantly higher visual quality on 13 of 15 materials. In Figure 2, we present example renderings of four BRDFs in the MERL test set modeled using our method and Bagher et al. [2016]. Our results are obtained using the proposed Gamma-mapped-MSE for model selection. The HDR environment map used here is Grace Cathedral.
Table 3. Average, Standard Deviation, Minimum, and Maximum Rendering SNR Values of Each BRDF Model Obtained from 15 Materials in the MERL Dataset

| Model | Avg. SNR (dB) | Std. | Min. | Max. |
|---|---|---|---|---|
| Tongbuasirilai et al. [2019] using CPD-PDV rank-1 (L = 1) | 33.83 | 5.5236 | 22.22 | 42.71 |
| Tongbuasirilai et al. [2019] using CPD-HD rank-1 (L = 1) | 32.51 | 8.4995 | 22.97 | 52.27 |
None of these materials were included in our training set. Yet, our method significantly outperforms state-of-the-art decomposition-based methods, such as Bagher et al. [2016], where the basis and coefficients must be computed for each given BRDF (i.e., the training and testing sets are not distinct).
Table 4. Rendering SNR and BRDF-space RAE Values Obtained with Our BRDF Model and That of Bagher et al. [2016], on 15 Test Materials of the MERL Dataset Using \(\tau _t=262\)

| Material | Our SNR (dB) | Bagher SNR (dB) | Our RAE | Bagher RAE |
|---|---|---|---|---|
| blue-fabric | 62.16 | **63.88** | 0.8695 | 0.3596 |
| blue-metallic-paint | **60.53** | 44.86 | 0.4287 | 0.2736 |
| dark-red-paint | 54.80 | **57.89** | 0.2717 | 0.4955 |
| gold-metallic-paint2 | **48.29** | 29.77 | 0.0811 | 0.7007 |
| green-metallic-paint2 | **57.48** | 51.04 | 0.0726 | 0.4209 |
| light-red-paint | **51.68** | 51.36 | 0.2583 | 0.5639 |
| pink-fabric2 | **52.71** | 52.44 | 0.8581 | 0.3388 |
| purple-paint | **47.22** | 44.29 | 0.1144 | 0.4428 |
| red-fabric | **55.05** | 51.44 | 0.7160 | 0.4163 |
| red-metallic-paint | **52.70** | 34.06 | 0.0752 | 0.7181 |
| silver-metallic-paint2 | **44.55** | 27.11 | 0.6256 | 0.3626 |
| specular-green-phenolic | **53.09** | 35.69 | 0.0673 | 0.6414 |
| specular-violet-phenolic | **50.51** | 37.76 | 0.0592 | 0.6592 |
| specular-yellow-phenolic | **46.81** | 28.14 | 0.0683 | 0.7370 |
| violet-acrylic | **50.07** | 31.68 | 0.0653 | 0.5156 |
These materials were not used in our training set. The higher rendering SNR for each material is highlighted in bold.
The cosine-weighted-log transformation suppresses grazing-angle BRDF values. For diffuse materials, this leads to better visual results and significantly higher rendering SNR. It is evident from Figure 2 that the log-plus transformation is better suited to glossy materials, as the cosine-weighted-log transformation leads to color artifacts for some of them, e.g., gold-metallic-paint2, red-metallic-paint, and violet-acrylic. More results for further analysis are available in the supplementary material.
Table 5 below shows rendering SNR and BRDF-space RAE values for the RGL-EPFL test set using both \(\rho _{t1}\) and \(\rho _{t2}\). Our BRDF model and selection method efficiently represent the RGL-EPFL data set with an average SNR of more than 38dB. Our model selection method chose the sub-optimal transformation for 2 of the 5 RGL-EPFL materials: cc-amber-citrine-rgb and vch-dragon-eye-red-rgb. The SNR values demonstrate that our data-driven model can accurately represent and faithfully reconstruct the BRDFs in the test set. See the supplementary materials for rendered images obtained using our model applied to the RGL-EPFL data set. Although we use a fixed number of coefficients for all materials here for the sake of comparisons, i.e., \(\tau _t=262\), ideally we would like to tune this parameter for each BRDF individually. For instance, for challenging materials such as cc-amber-citrine-rgb, the model accuracy significantly increases with a higher number of coefficients. We analyze the effect of \(\tau _t\) on rendering SNR and BRDF RAE in Figure 4. Rendering results for two example BRDFs, with \(\tau _t\) set to 16, 64, 256, 1,024, and 4,096, are included in Figure 5.
Fig. 4. Effect of the number of coefficients \(\tau _t\) on (a) BRDF-space RAE and (b) rendering SNR for all data sets.
Fig. 5. Renderings of two example BRDFs with \(\tau _t\) set to 16, 64, 256, 1,024, and 4,096.
Table 5. Rendering SNR and BRDF-space RAE Values Obtained with Our BRDF Model on 5 Test Materials of the RGL-EPFL Dataset Using \(\tau _t=262\)

| Material | Our \(\rho _{t1}\) SNR (dB) | Our \(\rho _{t1}\) RAE | Our \(\rho _{t2}\) SNR (dB) | Our \(\rho _{t2}\) RAE | Sel. SNR (dB) |
|---|---|---|---|---|---|
| acrylic-felt-green-rgb | 43.23 | 0.9922 | 45.91 | 0.3104 | 45.91 |
| cc-amber-citrine-rgb | 26.33 | 0.5394 | 26.24 | 0.8420 | 26.24 |
| ilm-l3-37-dark-green-rgb | 38.92 | 0.9558 | 43.45 | 0.6060 | 43.45 |
| paper-blue-rgb | 38.92 | 0.9871 | 40.23 | 0.4416 | 40.23 |
| vch-dragon-eye-red-rgb | 40.48 | 0.8973 | 38.16 | 0.7989 | 38.16 |
| Average | 37.57 | 0.8744 | 38.80 | 0.5998 | 38.80 |
The bottom row shows the mean of each column. The last column presents SNR results of our model selection method based on Gamma-mapped-MSE, described in Section 3.4.
To demonstrate the robustness of our sparse non-parametric model for representing BRDFs in a test set, we also evaluate it using eight test samples from the DTU data set [Nielsen et al. 2015]. Note that we use the same dictionary described above and that none of the materials from the DTU data set were used in the training set. The results are summarized in Table 6, where we report rendering SNR and BRDF-space RAE for \(\rho _{t1}\), \(\rho _{t2}\), and our model selection based on Gamma-mapped-MSE. Our BRDF model and selection method reproduce the DTU data set with an average SNR of more than 48dB. Our model selection algorithm chose the sub-optimal transformation for 2 of the 8 DTU materials: blue-book and green-cloth. Visual quality examples of the rendered images are presented in Figure 3. The difference between \(\rho _{t1}\) and \(\rho _{t2}\) is evident in this figure: \(\rho _{t1}\) is favored by glossy materials, while \(\rho _{t2}\) is more effective in modeling low-frequency or diffuse-like materials. For instance, the artifacts seen on the cardboard and green-cloth renderings for log-plus (\(\rho _{t1}\)) do not appear in the cosine-weighted-log (\(\rho _{t2}\)) renderings. For this figure, we used the Grace Cathedral HDR environment map.
Table 6. Rendering SNR and BRDF-space RAE Values Obtained with Our BRDF Model, on 8 Materials from the DTU Data Set [Nielsen et al. 2015] Using \(\tau _t=262\)

| Material | Our \(\rho _{t1}\) SNR (dB) | Our \(\rho _{t1}\) RAE | Our \(\rho _{t2}\) SNR (dB) | Our \(\rho _{t2}\) RAE | Sel. SNR (dB) |
|---|---|---|---|---|---|
| binder-cover | 45.70 | 0.0611 | 46.12 | 0.0303 | 46.12 |
| blue-book | 47.24 | 0.0574 | 45.07 | 0.0258 | 45.07 |
| cardboard | 44.72 | 0.1468 | 48.90 | 0.3779 | 48.90 |
| glossy-red-paper | 45.16 | 0.0436 | 41.41 | 0.0288 | 45.16 |
| green-cloth | 51.60 | 0.1145 | 51.58 | 0.7713 | 51.58 |
| notebook | 43.39 | 0.1838 | 47.15 | 0.2805 | 47.15 |
| painted-metal | 46.95 | 0.0817 | 51.84 | 0.1240 | 51.84 |
| yellow-paper | 47.14 | 0.1289 | 49.04 | 0.4873 | 49.04 |
| Average | 46.49 | 0.1022 | 47.64 | 0.2658 | 48.11 |
The bottom row shows the mean of each column. The last column presents SNR results of our model selection method based on Gamma-mapped-MSE, described in Section 3.4.
Our results confirm the discrepancy between BRDF-space error metrics (such as RAE) and rendering quality measured using SNR. For example, blue-metallic-paint in Table 4, cardboard in Table 6, and vch-dragon-eye-red-rgb in Table 5 show cases where RAE contradicts the rendering SNR. A lower BRDF-space RAE means the model represents the BRDF more accurately; a rendered image, however, also depends on additional factors such as object geometry, lighting environment, and viewing position. Consequently, BRDF-space RAE and rendering SNR have to be considered together when evaluating a BRDF model.
Figure 4(a) demonstrates the effectiveness of our BRDF model on all data sets with respect to the number of coefficients, \(\tau _t\). In terms of the decline of error with increasing number of coefficients, all data sets show a similar behavior. Moreover, all data sets show poor RAE when \(\tau _t \lt 32\). This is expected, since the smallest training sparsity is \(\tau _l = 32\); whenever \(\tau _t \lt \tau _l\), a decline in performance is expected. Figure 4(b) indicates that our model produces incrementally better rendering results as the number of coefficients increases, for all data sets. Our model performs best on the MERL data set because most of the training data comes from MERL, whereas RGL-EPFL contributes fewer materials to the training set. Although no materials from the DTU data set were included in the training, the rendering SNR and BRDF RAE are competitive with the MERL reconstructions.
We also compared our BRDF model against the PCA-based method presented in Nielsen et al. [2015]; see Figure 6. Our model achieves an RAE similar to that of the PCA-based method with 32 principal components when \(\tau _t=32\). For a lower number of coefficients, the PCA-based method outperforms ours, since our minimum training sparsity is 32. It should be noted, however, that the PCA dictionary has a very high storage cost compared to our BRDF dictionary ensemble. The PCA dictionary contains \(\hbox{1,458,000}\times p\) elements, where p is the number of principal components, while our combined dictionary ensemble consists of \((90\times 90+90\times 90+180\times 180)\times 32=\text{1,555,200}\) elements. The storage complexity of the PCA dictionary is therefore proportional to the number of principal components, whereas for our dictionary ensemble it is constant. In Figures 7(a), 7(b), and 7(c), we present the rendering SNR, and in Figures 7(d), 7(e), and 7(f), the logarithmic BRDF-space RAE, for all three data sets in comparison to Nielsen et al. [2015], where we compensate for the size of the PCA dictionary by increasing the number of coefficients of our method, \(\tau _t\). When the size of the dictionaries is taken into account, the PCA method with 2, 3, and 4 principal components corresponds to our model with \(\tau _t= \hbox{453,601}\), \(\tau _t= \hbox{939,601}\), and \(\tau _t= \hbox{1,425,601}\), respectively. The bar plots indicate that, for all data sets, our model outperforms the PCA technique by a large margin on both quality metrics.
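The element counts quoted above are easy to verify; the following snippet restates them (32 is the number of dictionaries in the combined ensemble, i.e., four ensembles of \(K=8\)):

```python
# Dictionary storage in number of elements, as quoted in the text.
pca_dictionary = lambda p: 1_458_000 * p               # 90*90*180 rows, p components
ensemble_dictionary = (90*90 + 90*90 + 180*180) * 32   # fixed, independent of p

assert ensemble_dictionary == 1_555_200
assert pca_dictionary(2) == 2_916_000   # already larger than our ensemble at p = 2
```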
Fig. 6. BRDF-space RAE of our model compared to the PCA-based method of Nielsen et al. [2015].
Fig. 7. Rendering SNR (a)-(c) and logarithmic BRDF-space RAE (d)-(f) for the three data sets, compared to Nielsen et al. [2015] with the PCA dictionary size compensated by increasing \(\tau _t\).
To demonstrate our BRDF interpolation method in the sparse coefficient space (the sparseland), we provide a number of challenging examples in Figure 8 for different material classes, using Equation (19) introduced in Section 3.5. The interpolation coefficient, \(\alpha\), was set between 0.0 and 1.0 with increments of 0.1. Figure 8(a) illustrates the interpolation between two materials, red-metallic-paint and specular-yellow-phenolic, and Figure 8(b) shows the interpolation of two diffuse materials, dark-red-paint and pink-fabric2. Figure 8(c) gives an example of a layered material, cc-amber-citrine-rgb, and a glossy material, silver-metallic-paint2. Finally, Figure 8(d) is an example of a vinyl film, vch-dragon-eye-red-rgb, and a diffuse material, red-fabric. Note that each BRDF pair used here shares the same transformation (i.e., \(\rho _{t1}\) or \(\rho _{t2}\), as discussed in Section 3.1). As shown in the rendered images, our BRDF interpolation in the sparse coefficient space provides a smooth transition between the material pairs. For these experiments, we used \(\tau _t=512\) for the model selection algorithm. Moreover, as mentioned in Section 3.5, we run the projection step of the model selection algorithm on the interpolated BRDF to obtain a desired sparsity, in this case \(\tau _t=512\); a sketch of this procedure follows Figure 8. Note that this parameter is user-defined and can be changed to trade off quality against storage complexity for the interpolated BRDF.
Fig. 8. BRDF interpolation in the sparse coefficient space; \(\alpha\) varies from 0.0 to 1.0 in increments of 0.1. (a) red-metallic-paint to specular-yellow-phenolic; (b) dark-red-paint to pink-fabric2; (c) cc-amber-citrine-rgb to silver-metallic-paint2; (d) vch-dragon-eye-red-rgb to red-fabric.
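A heavily simplified sketch of the interpolation is given below; the linear blend stands in for Equation (19) (not reproduced here), the hard-thresholding projection stands in for the projection step of the model selection algorithm, and we assume both BRDFs are expressed in the same dictionary and transformation:

```python
import numpy as np

def interpolate_sparse(c1, c2, alpha, tau_t=512):
    """Coefficient-space interpolation sketch (simplified; ours).

    c1, c2 : (90, 90, 180) sparse coefficient tensors, stored densely,
             expressed in the same dictionary and transformation.
    alpha  : interpolation weight in [0, 1].
    """
    c = (1.0 - alpha) * c1 + alpha * c2             # linear blend of coefficients
    keep = np.argsort(np.abs(c).ravel())[-tau_t:]   # tau_t largest magnitudes
    mask = np.zeros(c.size, dtype=bool)
    mask[keep] = True
    return np.where(mask.reshape(c.shape), c, 0.0)  # project back to sparsity tau_t
```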
In Figure 9, we evaluate our BRDF model using the Princeton scene with the following materials: blue-metallic-paint, gold-metallic-paint2, pink-fabric2, silver-metallic-paint2, and specular-yellow-phenolic. We rendered the scene with path tracing in PBRT [Pharr and Humphreys 2010], using the Uffizi environment map and \(2^{17}\) samples per pixel. Figures 9(a) and 9(c) present rendered images from our model and that of Bagher et al., respectively. Our model achieves an 8.03dB advantage in SNR over the model of Bagher et al.
Fig. 9. The Princeton scene with five MERL materials, rendered using our model and that of Bagher et al. [2016] under the Uffizi environment map.
5 Limitations and Future Work
The experiments conducted in this article are limited to isotropic BRDFs, partially due to the limited availability of measured anisotropic BRDF data. As future work, we aim to extend our sparse model to also efficiently represent anisotropic materials. Moreover, we acknowledge that the discrepancy between BRDF-space errors and rendering quality remains an open problem. Although we showed significant improvements using our Gamma-mapped-MSE, we believe that a more sophisticated metric that takes into account the support of the BRDF function could further improve our results. Our model is relatively robust to noise. However, a denoising pass tailored to measured BRDFs, applied prior to training and model selection, could greatly improve our results: it is well known that even a moderate amount of noise in measured BRDFs translates to lower rendering quality, and noise also reduces the sparsity of the representation, hence increasing the model complexity. An alternative to applying a denoiser is to make the training and model selection methods noise-aware.
The proposed interpolation algorithm enables a variety of applications where BRDF processing can be performed directly in the coefficient space rather than the substantially higher-dimensional BRDF space. For instance, rendering of smooth surfaces where each point on the surface has a distinct measured BRDF (i.e., spatially varying BRDFs) can be efficiently performed using our model. Moreover, editing of measured BRDFs can be performed directly in the coefficient space by utilizing the shallow network proposed in Hu et al. [2020], which maps our sparse representation of materials to known BRDF properties. We have left these applications of our model for future work. Another interesting avenue for future work is the utilization of our model for sampling BRDFs, using, e.g., a gonioreflectometer, where only a few samples are taken to reconstruct a full BRDF, similar to Nielsen et al. [2015].
Our model is GPU-friendly, because it consists of parallel operations in which atoms and coefficients can be efficiently distributed across GPU cores and operated on in parallel. There exists a variety of GPGPU algorithms for performing key tasks in our method; for instance, a GPU-based implementation of the n-mode product is described in Baravdish et al. [2019]. As the main focus of this article is to derive and evaluate a novel non-parametric BRDF model, we have left the GPU implementation of our method to future work. However, given the small memory footprint of our model and the fast BRDF reconstruction using Equation (7), existing real-time ray tracing systems can be modified to accommodate our model for measured materials. Moreover, since the quality of our model is controlled solely by the \(\tau _t\) parameter, this parameter provides direct control over the trade-off between rendering performance and quality.
Quality metrics for effective BRDF fitting are an open problem [Lavoué et al. 2021]. In this article, we showed the discrepancy between the results of RAE and rendering SNR, and proposed a model selection technique to reduce the gap between visual quality and numerical quality metrics. We believe that our model selection algorithm can benefit from more accurate error metrics to further increase visual quality. Owing to the flexibility of our model selection algorithm, any new error metric suitable for measured BRDFs can be substituted into Equation (6).
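As an illustration of this flexibility, a model selection loop with a pluggable metric can be sketched as follows; `gamma_mapped_mse` below is only a plausible reading of a gamma-mapped MSE, not the paper's Equation (6), and all names are ours:

```python
import numpy as np

def gamma_mapped_mse(ref, recon, gamma=2.2):
    # MSE after a display-style gamma mapping (plausible form; Equation (6)
    # is defined in Section 3.4 and may differ).
    g = lambda x: np.maximum(x, 0.0) ** (1.0 / gamma)
    return np.mean((g(ref) - g(recon)) ** 2)

def select_model(brdf, candidates, metric=gamma_mapped_mse):
    """Pick the (label, reconstruction) pair that minimizes the metric.

    candidates : list of (label, reconstruction) pairs, one per dictionary
                 and transformation, each mapped back to the linear domain.
    Swapping in a different `metric` is all that is needed to adopt a
    better BRDF error measure.
    """
    return min(candidates, key=lambda lc: metric(brdf, lc[1]))
```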
One limitation of the proposed model is the assumption that the BRDFs in the training and testing sets share the same resolution, which is limiting for BRDF data sets that are not densely sampled. One possible solution is to train the model on densely sampled BRDFs; given a sparsely sampled BRDF to be modeled, we first create a dense BRDF by copying the known values and setting the unknown values to zero. We then iteratively run the model selection algorithm on this BRDF: the first iteration yields a crude approximation of the unknown values, which is then refined by consecutive runs of the model selection algorithm, as sketched below. We leave a full exploration of such a method for future work.
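The completion loop described above admits a direct sketch; `model_select` stands in for the model selection and reconstruction of Section 3.4, and the names are ours:

```python
import numpy as np

def complete_brdf(samples, known, model_select, n_iters=10):
    """Iteratively complete a sparsely sampled BRDF (hedged sketch).

    samples      : (90, 90, 180) array with measured values where `known` is True
    known        : boolean mask marking the measured entries
    model_select : callable returning a dense sparse-model reconstruction
    """
    est = np.where(known, samples, 0.0)      # unknown entries start at zero
    for _ in range(n_iters):
        est = model_select(est)              # crude, then refined, approximation
        est = np.where(known, samples, est)  # reimpose the measured values
    return est
```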
6 Conclusions
This paper presented a novel non-parametric sparse BRDF model in which a measured BRDF is represented using a trained multidimensional dictionary ensemble and a set of sparse coefficients. We showed that with careful model selection over the space of multidimensional dictionaries and BRDF transformations, we achieve significantly higher rendering quality and model accuracy than the current state-of-the-art. Finally, we introduced a novel algorithm for interpolating two or more BRDFs, modeled using our approach, directly in the coefficient space, even when the BRDFs being interpolated utilize distinct dictionaries. We evaluated the performance of our model and algorithms on three data sets: MERL, RGL-EPFL, and the DTU data set of Nielsen et al. [2015]. For the vast majority of the BRDFs in the test sets, we achieve a significant advantage over previous models.
Michal Aharon, Michael Elad, and Alfred Bruckstein. 2006. K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans. Signal Process. 54, 11 (Nov. 2006), 4311–4322.
Mahdi M. Bagher, John Snyder, and Derek Nowrouzezahrai. 2016. A non-parametric factor microfacet model for isotropic BRDFs. ACM Trans. Graph. 35, 5, Article 159 (July 2016), 16 pages.
Gabriel Baravdish, Ehsan Miandji, and Jonas Unger. 2019. GPU accelerated sparse representation of light fields. In Proceedings of the 14th International Joint Conference on Computer Vision (VISAPP'19). INSTICC, SciTePress, 177–182.
Pascal Barla, Laurent Belcour, and Romain Pacanowski. 2015. In praise of an alternative BRDF parametrization. In Proceedings of the Workshop on Material Appearance Modeling. https://hal.inria.fr/hal-01172118.
Aner Ben-Artzi, Ryan Overbeck, and Ravi Ramamoorthi. 2006. Real-time BRDF editing in complex lighting. ACM Trans. Graph. 25, 3 (July 2006), 945–954.
Ahmet Bilgili, Aydın Öztürk, and Murat Kurt. 2011. A general BRDF representation based on tensor decomposition. Comput. Graph. Forum 30, 8 (2011), 2427–2439.
Matthew Brand. 2002. Charting a manifold. In Proceedings of the 15th International Conference on Neural Information Processing Systems (NIPS'02). MIT Press, Cambridge, MA, 985–992.
L. Claustres, M. Paulin, and Y. Boucher. 2003. BRDF measurement modelling using wavelets for efficient path tracing. Comput. Graph. Forum 22, 4 (2003).
V. Cooper, J. Bieron, and P. Peers. 2021. Estimating homogeneous data-driven BRDF parameters from a reflectance map under known natural lighting. IEEE Trans. Visual. Comput. Graph. (May 2021), 1–1.
Valentin Deschaintre, Miika Aittala, Frédo Durand, George Drettakis, and Adrien Bousseau. 2019. Flexible SVBRDF capture with a multi-image deep network. Comput. Graph. Forum (Proceedings of the Eurographics Symposium on Rendering) 38, 4 (July 2019). http://www-sop.inria.fr/reves/Basilic/2019/DADDB19.
Xin Ding, Wei Chen, and Ian J. Wassell. 2017. Joint sensing matrix and sparsifying dictionary optimization for tensor compressive sensing. IEEE Trans. Signal Process. 65, 14 (2017), 3632–3646.
Zhao Dong, Bruce Walter, Steve Marschner, and Donald P. Greenberg. 2016. Predicting appearance from measured microgeometry of metal surfaces. ACM Trans. Graph. 35, 1, Article 9 (Dec. 2016), 13 pages.
Jonathan Dupuy and Wenzel Jakob. 2018. An adaptive parameterization for efficient material acquisition and rendering. ACM Trans. Graph. (Proceedings of SIGGRAPH Asia) 37, 6 (Nov. 2018), 274:1–274:18.
Abhijeet Ghosh, Tongbo Chen, Pieter Peers, Cyrus A. Wilson, and Paul Debevec. 2009. Estimating specular roughness and anisotropy from second order spherical gradient illumination. Comput. Graph. Forum 28, 4 (2009), 1161–1170. https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1467-8659.2009.01493.x.
Abhijeet Ghosh, Wolfgang Heidrich, Shruthi Achutha, and Matthew O'Toole. 2010. A basis illumination approach to BRDF measurement. Int. J. Comput. Vision 90, 2 (2010), 183–197.
Simon Hawe, Matthias Seibert, and Martin Kleinsteuber. 2013. Separable dictionary learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'13). 438–445.
Bingyang Hu, Jie Guo, Yanjun Chen, Mengtian Li, and Yanwen Guo. 2020. DeepBRDF: A deep representation for manipulating measured BRDF. Comput. Graph. Forum 39, 2 (2020).
Jan Kautz and Michael D. McCool. 1999. Interactive rendering with arbitrary BRDFs using separable approximations. In Proceedings of the 10th Eurographics Conference on Rendering (EGWR'99). Eurographics Association, 247–260.
Guillaume Lavoué, Nicolas Bonneel, Jean-Philippe Farrugia, and Cyril Soler. 2021. Perceptual quality of BRDF approximations: Dataset and metrics. Comput. Graph. Forum 40, 2 (May 2021). https://hal.inria.fr/hal-03128383.
Jason Lawrence, Szymon Rusinkiewicz, and Ravi Ramamoorthi. 2004. Efficient BRDF importance sampling using a factored representation. ACM Trans. Graph. 23, 3 (Aug. 2004), 496–505.
Zhengqin Li, Kalyan Sunkavalli, and Manmohan Chandraker. 2018. Materials for masses: SVBRDF acquisition with a single mobile phone image. In Proceedings of the European Conference on Computer Vision (ECCV'18).
Joakim Löw, Joel Kronander, Anders Ynnerman, and Jonas Unger. 2012. BRDF models for accurate and efficient rendering of glossy surfaces. ACM Trans. Graph. 31, 1, Article 9 (Feb. 2012), 14 pages.
Mahdi Marsousi, Kaveh Abhari, Paul Babyn, and Javad Alirezaie. 2014. An adaptive approach to learn overcomplete dictionaries with efficient numbers of elements. IEEE Trans. Signal Process. 62 (June 2014), 3272–3283.
Wojciech Matusik, Hanspeter Pfister, Matt Brand, and Leonard McMillan. 2003. A data-driven reflectance model. ACM Trans. Graph. 22, 3 (July 2003), 759–769.
Raazia Mazhar and Paul D. Gader. 2008. EK-SVD: Optimized dictionary design for sparse representations. In Proceedings of the 19th International Conference on Pattern Recognition. 1–4.
Ehsan Miandji, Saghi Hajisharif, and Jonas Unger. 2019. A unified framework for compression and compressed sensing of light fields and light field videos. ACM Trans. Graph. 38, 3, Article 23 (May 2019), 18 pages.
Fred E. Nicodemus, Joseph C. Richmond, Jack J. Hsia, Irving W. Ginsberg, and T. Limperis. 1977. Geometrical considerations and nomenclature for reflectance. In Radiometry. Jones and Bartlett Publishers, Inc., 94–145.
Jannik Boll Nielsen, Henrik Wann Jensen, and Ravi Ramamoorthi. 2015. On optimal, minimal BRDF sampling for reflectance acquisition. ACM Trans. Graph. 34, 6 (Nov. 2015), 186:1–186:11.
Ko Nishino and Stephen Lombardi. 2011. Directional statistics-based reflectance model for isotropic bidirectional reflectance distribution functions. J. Opt. Soc. Am. A 28, 1 (Jan. 2011), 8–18.
Gilles Rainer, Wenzel Jakob, Abhijeet Ghosh, and Tim Weyrich. 2019. Neural BTF compression and interpolation. Comput. Graph. Forum (Proceedings of Eurographics) 38, 2 (Mar. 2019).
Ravi Ramamoorthi and Pat Hanrahan. 2001. An efficient representation for irradiance environment maps. In Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH'01). Association for Computing Machinery, New York, NY, 497–500.
Florian Roemer, Giovanni Del Galdo, and Martin Haardt. 2014. Tensor-based algorithms for learning multidimensional separable dictionaries. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'14). 3963–3967.
Fabiano Romeiro, Yuriy Vasilyev, and Todd Zickler. 2008. Passive reflectometry. In Proceedings of the 10th European Conference on Computer Vision (ECCV'08). Springer-Verlag, Berlin, 859–872.
Szymon Rusinkiewicz. 1998. A new change of variables for efficient BRDF representation. In Rendering Techniques (Eurographics), George Drettakis and Nelson L. Max (Eds.). Springer, 11–22.
Cristian Rusu and Bogdan Dumitrescu. 2012. Stagewise K-SVD to design efficient dictionaries for sparse representations. IEEE Signal Process. Lett. 19, 10 (2012), 631–634.
Cyril Soler, Kartic Subr, and Derek Nowrouzezahrai. 2018. A versatile parameterization for measured material manifolds. Comput. Graph. Forum 37, 2 (Apr. 2018), 135–144.
Michael Stark, James Arvo, and Brian Smits. 2005. Barycentric parameterizations for isotropic BRDFs. IEEE Trans. Visual. Comput. Graph. 11, 2 (Apr. 2005), 126–138.
Tanaboon Tongbuasirilai, Jonas Unger, Joel Kronander, and Murat Kurt. 2019. Compact and intuitive data-driven BRDF models. Visual Comput. 36 (May 2019), 855–872.
Bruce Walter, Stephen R. Marschner, Hongsong Li, and Kenneth E. Torrance. 2007. Microfacet models for refraction through rough surfaces. In Proceedings of the 18th Eurographics Conference on Rendering Techniques (EGSR'07). Eurographics Association, 195–206.