
Centimeter-wave Free-space Neural Time-of-Flight Imaging

Published: 03 March 2023

Abstract

Depth sensors have emerged as a cornerstone sensor modality with diverse applications in personal hand-held devices, robotics, scientific imaging, autonomous vehicles, and more. In particular, correlation Time-of-Flight (ToF) sensors have found widespread adoption for meter-scale indoor applications such as object tracking and pose estimation. While they offer high depth resolution at competitive costs, the precision of these indirect ToF sensors is fundamentally limited by their modulation contrast, which is in turn limited by the effects of photo-conversion noise. In contrast, optical interferometric methods can leverage short illumination modulation wavelengths to achieve depth precision three orders of magnitude greater than ToF, but their operating range is typically restricted to the sub-centimeter regime.
In this work, we merge concepts from both correlation ToF design and interferometric imaging, a step towards bridging the gap between these methods. We propose a computational ToF imaging method that optically computes the GHz ToF correlation signal in free space before photo-conversion. To acquire a depth map, we scan a scene point-wise and computationally unwrap the collected correlation measurements. Specifically, we repurpose electro-optical modulators used in optical communication for ToF imaging with centimeter-wave signals, and achieve all-optical correlation at 7.15 GHz and 14.32 GHz modulation frequencies. While GHz modulation frequencies increase depth precision, these high modulation rates also pose a technical challenge: they result in dozens of wraps per meter, which cannot be estimated robustly by existing phase unwrapping methods. We tackle this problem with a proposed segmentation-inspired phase unwrapping network, which exploits the correlation of adjacent GHz phase measurements to classify regions into their respective wrap counts. We validate this method in simulation and experimentally, and demonstrate precise depth sensing using centimeter-wave modulation that is robust to surface texture and ambient light. The proposed system outperforms existing analog demodulation methods across all tested scenarios.
CCS Concepts: •  Computing methodologies;

1 Introduction

From interactive gaming to precision industrial manufacturing, depth sensors have enabled advances in a broad set of consumer and research applications. Their ability to recover 3D data at scale [Silberman et al. 2012; Chang et al. 2015; Dai et al. 2017] and produce high-fidelity scene reconstructions [Izadi et al. 2011; Tulsiani et al. 2018] drives developments in 3D scene understanding [Dai et al. 2018; Song et al. 2015; Hickson et al. 2014], which in turn influence the fields of augmented reality, virtual reality, robotic scanning, autonomous vehicle guidance, and path planning for delivery drones.
Some of the most successful depth acquisition approaches for wide operating ranges are based on active time-of-flight sensing, as they offer high depth precision at a small sensor-illumination baseline [Hansard et al. 2012]. Passive approaches, that infer distance from parallax [Subbarao and Surya 1994; Mahjourian et al. 2018] or visual cues in monocular images [Bhat et al. 2021; Saxena et al. 2005], do not offer the same range and depth precision as they struggle with textureless regions and complex geometries [Smolyanskiy et al. 2018; Lazaros et al. 2008]. Active sensing approaches tackle this challenge by projecting light into the scene and reconstructing depth from the returned signal. Structured light methods such as active stereo systems use spatially patterned light to aid stereo matching [Ahuja and Abbott 1993; Baek and Heide 2021]. While being robust to textureless scenes, their accuracy is limited by illumination pattern density and sensor baseline, resulting in a large form factor. Time-of-flight (ToF) depth sensing approaches avoid these limitations by estimating depth from the travel time of photons leaving from and returning to the device, allowing for co-axial sensor setups with virtually no illumination-camera baseline.
Direct ToF systems, such as light detection and ranging (LiDAR) sensors [Schwarz 2010], directly measure the round-trip time of emitted light pulses to estimate point depths, and can theoretically provide accuracy over a long range. However, this direct acquisition approach demands fast pulsed lasers, accurate synchronization, narrow-band filters, and picosecond-resolution time-tagged detectors such as single-photon avalanche diodes (SPADs) [Aull et al. 2002; Bronzi et al. 2015; Niclass et al. 2005; Rochas et al. 2003]. Though affordable SPADs have recently entered the market, these have only 20 cm depth resolution [Callenberg et al. 2021], more than \(50\times\) lower than their costly picosecond-resolution counterparts.
Amplitude-modulated continuous-wave (AMCW) ToF methods—which we hereon refer to as correlation ToF methods—[Lange and Seitz 2001; Su et al. 2018; Shrestha et al. 2016; Gupta et al. 2015] flood a scene with periodic amplitude-modulated light and indirectly infer depth from the phase shift of returned light. In contrast to direct ToF sensing approaches, this modulation and correlation does not require ultra-short pulse generation and time-tagging, which lowers sensor and laser complexity requirements. Correlation ToF sensors that demodulate the amplitude-modulated flash illumination on-sensor have been widely adopted, for example, in the Microsoft Kinect One camera. These sensors implement multiple charge buckets per pixel and shift photo-electrons to an individual bucket by applying an electrical potential between the individual quantum wells [Lange and Seitz 2001]. Though amplitude modulation allows for depth precision comparable to picosecond-pulsed direct ToF at meter-scale distances, while remaining low-cost thanks to scalable CMOS technology, it is also this sensing mode that fundamentally limits the sensor. Specifically, modulation after photo-electric conversion limits the maximum achievable modulation frequency to a few hundred MHz in practice, restricted by the photon absorption depth in silicon [Lange and Seitz 2001]. This has limited the depth precision of existing correlation ToF sensors to the sub-centimeter regime. Fiber-coupled modulation approaches from optical communication that bypass this limit suffer from low modulation contrast due to coupling loss [Kadambi and Raskar 2017; Rogers et al. 2021; Marchetti et al. 2017; Bandyopadhyay et al. 2020].
In this work, we co-opt free-space electro-optic modulators (EOMs) from optical communication and combine them with a phase unwrapping neural network to build a GHz correlation ToF system. EOM-based ranging systems are known to offer fast intensity modulation and can be integrated with conventional intensity sensors and a continuous-wave laser, bypassing the more complex hardware requirements of time-tagged ToF devices [Froome and Bradsell 1961]. Inspired by existing EOM-based ranging methods, we devise a two-pass EOM-based GHz ToF sensing system that achieves a 7 GHz modulation frequency with \(\gt 50\%\) contrast. Our system inherits the benefits of EOM-based systems—large-area freespace modulation, single-digit driving voltage—using conventional intensity sensors and continuous-wave lasers.
Although a higher modulation frequency can increase phase contrast and allow for more precise depth measurement, it also greatly complicates the task of phase unwrapping, a major obstacle in applying EOMs to depth sensing. At 7 GHz, even a 2 cm depth change results in a phase wrap, in contrast to 3 m of unambiguous depth for a 100 MHz ToF camera. In addition to a few dozens of wraps, imaging noise and the small modulation bandwidth of EOMs—only a few MHz—imposes a further challenge for conventional look-up table approaches. We tackle this challenge with a segmentation-inspired neural phase unwrapping network, where the problem is decomposed into ordinal classification, mapping regions of measured data to their wrap count. Trained in an end-to-end fashion on simulated ToF data and fine-tuned on a small set of experimental measurements, the proposed network exploits the correlation of adjacent measurements to robustly unwrap them.
We validate the proposed ToF system in simulation and experimentally, and demonstrate robust depth imaging for macroscopic diffuse scenes with free-space centimeter-wave modulation at mW laser powers, corresponding to \(\lt\)100 femtosecond temporal resolution; see Figure 1. Jointly with the learned unwrapping, the all-optical modulation without coupling losses allows for robustness to low-reflectance texture regions and highly specular objects with low diffuse reflectance components. We assess the neural phase unwrapping network extensively on real and simulated data, and validate that it outperforms existing conventional and learned unwrapping approaches across all tested scenarios. We further validate precision and compare extensively against post-photoconversion modulation, which fails in low-flux scenarios, and interferometric approaches, which are limited to small ranges. As our free-space modulation is all-optical, we demonstrate that it can be readily combined with interferometric modulation, allowing us to narrow the gap between interferometry and correlation ToF imaging, with the future potential for photon-efficient imaging of macro-scale ultrafast phenomena.
Specifically, we make the following contributions in this work:
We introduce computational ToF imaging with fully optical free-space correlation and an EOM-based two-pass intensity modulation that allows for \(\ge\)10 GHz frequencies.
To tackle phase-unwrapping at centimeter wavelengths, we introduce a segmentation-based phase unwrapping network that poses phase recovery as a classification problem.
We validate the proposed method experimentally with a prototype, achieving robust depth imaging with freespace centimeter-wave modulation for macroscopic scenes.
To ensure reproducibility, we will share the schematics, code, and optical design of the proposed method.

2 Related Work

In this section, we seek to give the reader a broad overview of the current state of depth imaging to better illustrate the gap our work fills in the 3D vision ecosystem.
Depth Imaging. The wide family of modern depth imaging methods can be broadly categorized into passive and active systems. Passive approaches, which leverage solely image cues such as parallax [Hirschmuller 2005; Baek et al. 2016; Meuleman et al. 2020] or defocus [Subbarao and Surya 1994], can offer low-cost depth estimation solutions using commodity camera hardware [Garg et al. 2019]. Their reliance on visual features, however, means they struggle to achieve sub-cm accuracy even in favorable conditions, and can fail catastrophically for complex scene geometries and textureless regions [Smolyanskiy et al. 2018]. Active methods, which first project a known signal into the scene before attempting to recover depth, can reduce this reliance on visual features. For example, structured light approaches, such as those used in the Kinect V1 and Intel D415 depth cameras, improve local image contrast with active illumination patterns [Baek and Heide 2021; Scharstein and Szeliski 2003; Ahuja and Abbott 1993], at a detriment to form factor and power consumption. Even active stereo methods, however, still cannot disambiguate mm-scale features, as these are smaller than the illumination feature size itself, making accurate stereo correspondences infeasible. ToF imaging is an active method that does not rely on visual cues, and so avoids the pitfalls of stereo matching completely. ToF cameras instead directly or indirectly measure the travel time of light to infer distances [Lange and Seitz 2001; Hansard et al. 2012], with modern continuous-wave correlation ToF systems achieving sub-cm accuracy for megahertz-scale modulation frequencies. Interferometry extends this principle to the terahertz range, measuring the interference of electromagnetic waves to estimate their travel time. These systems can achieve micron-scale accuracy at the cost of mm-scale operating ranges [Hariharan 2003]. In this work, we seek to bridge the gap between commodity MHz-frequency correlation ToF systems and THz-frequency interferometry with a GHz-frequency correlation ToF system for meter-scale imaging.
Pulsed ToF. Pulsed ToF systems, such as LiDAR, are direct ToF acquisition methods, which directly measure the travel time of photon packets to infer depth. They send discrete laser pulses into the scene and detect their reflections with avalanche photodiodes [Cova et al. 1996; Pandey et al. 2011] or single-photon detectors [McCarthy et al. 2009; Heide et al. 2018; Gupta et al. 2019a; 2019b]. These sensors can extract depth from measured pulse returns without phase wrap ambiguities. Their depth precision is limited by their temporal resolution, however, and the complex detectors and narrow-band filters used to reject ambient light come at high cost compared to conventional intensity sensors, a result of their fabrication complexity. Low-cost pulsed sensors have recently appeared, albeit with coarse 20 cm depth precision [Callenberg et al. 2021]. In this work, we revisit indirect ToF with amplitude modulation paired with learned phase unwrapping as an approach to precise depth imaging that does not mandate time-resolved sensors and time-tagging electronics.
Correlation ToF. Amplitude-modulated continuous-wave ToF, which we refer to as simply correlation ToF, floods the scene with periodically modulated illumination and infers distance from phase differences in the returned light [Lange and Seitz 2001; Remondino and Stoppa 2013; Ringbeck 2007]. These systems, such as cameras in the prolific Microsoft Kinect series [Tölgyessy et al. 2021], can rely on affordable CMOS sensors and conventional CW laser diodes to produce dense depth measurements. This flood illumination can lead to multipath interference, though there exists a large body of work to mitigate this [Achar et al. 2017; Fuchs 2010; Freedman et al. 2014; Kirmani et al. 2013; Bhandari et al. 2014; Kadambi et al. 2013; Jiménez et al. 2014; Naik et al. 2015]. Correlation ToF measurements can also be used to resolve the travel-time of light in flight [Heide et al. 2013; Kadambi et al. 2013]. These time-resolved transient images have found a number of emerging applications, such as non-line-of-sight imaging [Heide et al. 2014; Kadambi et al. 2016], imaging through scattering media [Heide et al. 2014], and material classification [Su et al. 2016], which have also been solved with pulsed ToF systems [O’Toole et al. 2018; Heide et al. 2019] and interferometric methods [Gkioulekas et al. 2015]. All these methods, however, are restricted to working with modulation frequencies of only a few hundred MHz due to photon absorption depth in silicon [Lange and Seitz 2001], which governs how these devices perform photo-electric conversion. This limit places the depth resolution of modern correlation ToF sensors at mm- to cm-scale for operating ranges of up to several meters. Previous attempts at pushing this modulation frequency to the GHz regime struggle with low modulation contrast due to the energy loss from fiber coupling within eye-safe laser power levels [Kadambi and Raskar 2017; Li et al. 2018]. Li et al. [2018] overcome some of these limitations but solely rely on interferometric modulation, making the method susceptible to speckle, vibration, laser frequency drift, and other common interferometry errors. Notably, Bamji et al. [2018] achieve 200 MHz modulation frequency at high contrast but are limited to single-frequency modulation. Gupta et al. [2018] achieve 500 MHz modulation frequency with a fast photodiode and analog radio-frequency (RF) modulation, but contend with low modulation contrast at the GHz regime due to modulation after photo-conversion.
Interferometry and Frequency-Modulated Continuous-Wave ToF. Optical interferometry leverages the interference of electromagnetic waves to infer their path lengths, which are encoded in the measured amplitude and/or phase patterns. A detailed review of interferometry can be found in [Hariharan 2003]. Methods such as optical coherence tomography (OCT) [Huang et al. 1991] have found prolific use in biomedical applications [Fujimoto and Swanson 2016] for their ability to resolve micron-scale features in optically scattering media. This, however, comes with the caveat of a millimeter-scale operating range, as diffuse scattering leads to a sharp decline in SNR. In graphics, OCT approaches have been successfully employed to achieve micron-scale light transport decompositions [Gkioulekas et al. 2015] and light transport probing [Kotwal et al. 2020]. Fourier-domain OCT systems mitigate some of the sensitivity to vibration by using a spectrometer and a broadband light source [Leitgeb et al. 2003]. While these methods provide high temporal resolution, they are also limited to cm-scale scenes. Frequency-modulated continuous-wave (FMCW) ToF systems employ an alternative interferometric approach to measuring distance. These methods continuously apply frequency modulation to their output illumination, which, when combined in a wave-guide with the delayed light returned from the scene, produces constructive and destructive interference patterns from which travel time (and thereby depth) can be inferred. Experimental FMCW LiDAR setups can achieve millimeter precision for scenes at decimeter range [Behroozpour et al. 2016], but require complex tunable laser systems [Sandborn et al. 2016; Amann 1992; Gao and Hui 2012]. We revisit continuous-wave intensity modulation, which allows us to use conventional continuous-wave lasers modulated and demodulated in free space.
Phase Unwrapping. In correlation ToF systems, the analog correlation signal can experience phase shifts of more than one modulation wavelength. To recover the true phase, and thereby accurately reconstruct depth, phase unwrapping algorithms are required [Dorrington et al. 2011; Crabb and Manduchi 2015; Lawin et al. 2016; An et al. 2016]. Single-frequency phase unwrapping approaches are only able to recover relative depth, and require a priori assumptions to estimate scale [Crabb and Manduchi 2015; Ghiglia and Pritt 1998; Bioucas-Dias and Valadao 2007; Bioucas-Dias et al. 2008]. Multi-frequency phase unwrapping methods overcome this limitation by unwrapping high-frequency phases with their lower-frequency counterparts. Wrap count is recovered by either weighing Euclidean division candidates [Bioucas-Dias et al. 2009; Droeschel et al. 2010; Kirmani et al. 2013; Freedman et al. 2014; Lawin et al. 2016], or using a frequency-space lookup table [Gupta et al. 2015]. All of these methods, while powerful for MHz ToF imaging, fail in the presence of noise for the dozens of wrap counts observed in GHz correlation imaging. To tackle this challenge, in this work we introduce a neural network capable of unwrapping GHz-frequency ToF correlation measurements.
Electro-optic Modulators. EOMs control the refractive index of a crystal with an electric field to modulate the phase, frequency, amplitude, and polarization of incident light [Yariv and Yeh 2007]. As such, they have been employed in diverse applications, including fiber communications [Phare et al. 2015], frequency modulation spectroscopy [Tai et al. 2016], laser mode locking [Hudson et al. 2005], and optical interferometry [Minoni et al. 1991]. In particular, EOMs have been used in LiDAR systems to change the optical-carrier frequency for FMCW sensing [Behroozpour et al. 2017] or facilitate pulsed sensing [Chen et al. 2018]. Instead, we repurpose these EOMs for continuous-wave correlation ToF imaging. We employ a two-pass modulation scheme for our ranging system that, instead of optical frequency, modulates intensity with high contrast. We combine this acquisition scheme with a neural phase unwrapping method to then unwrap the dozens of phase wraps we encounter in the GHz regime.

3 Correlation ToF Imaging

In this section, we review the principles of correlation ToF imaging; for a detailed introduction, see [Lange 2000].
Image Formation. Correlation ToF cameras emit amplitude-modulated light into the scene
\begin{equation} p(t) = \alpha \cos (\omega _p t) + \beta , \end{equation}
(1)
where \(\omega _p\) is modulation frequency, \(\alpha\) is amplitude, and \(\beta\) is a DC offset. After traveling through the scene and reflecting off a target, the measured return signal
\begin{equation} \tilde{p}(t - \tau) = \tilde{\alpha } \cos (\omega _p t - \phi) + \tilde{\beta }, \quad \phi = 2\pi \omega _p \tau , \end{equation}
(2)
is a time-delayed \(p(t)\) by time \(\tau\) with an observed attenuation in amplitude \(\tilde{\alpha }\), a shift in bias \(\tilde{\beta }\), and a time-dependent phase shift \(\phi\). This measured signal is then correlated with a reference
\begin{equation} r(t)=\cos (\omega _r t + \psi) + 1/2, \end{equation}
(3)
where \(\omega _r\) and \(\psi\) are the demodulation frequency and phase, respectively. In existing multi-bucket imagers, this correlation occurs during exposure via photonic mixer device pixels [Lange and Seitz 2001; Foix et al. 2011], which are modulated according to the reference function \(r(t)\). When we modulate and demodulate at the same frequency, that is \(\omega _p = \omega _r = \omega\), this is called homodyne imaging. Integrating this signal over exposure time \(T\), we get a correlation measurement
\begin{align} C_\psi = \int _{0}^{T} {\tilde{p}({t - \tau })r({t}) \, \mathrm{d}t} = \frac{\tilde{\alpha }}{2}\cos (\psi - \phi) + TK, \end{align}
(4)
where \(K\) is a general constant offset, meant to model a non-zero modulation bias on the sensor. Given this measurement, we aim at estimating the phase delay \(\phi\) from which the scene depth can be computed. As illustrated in Figure 2(b), the correlation measurement \(C_\psi\) is a DC value that depends on the demodulation phase offset \(\psi\) (achieving its maximum at \(\psi =\phi + 2\pi n, n\in \mathbb {Z}\)). In practice, this means we never have to explicitly sample \(\tilde{p}(t-\tau)\), which would require expensive ultrafast detectors and modulation electronics. Although the correlation measurement \(C_\psi\) does not directly give us access to the true phase \(\phi\), by sampling this function for multiple demodulation phase offsets \(\psi\) we can make use of Fourier analysis to discern the true phase \(\phi\). Existing correlation imagers typically acquire four equally-spaced correlation measurements at \(\psi \in \lbrace 0,\, \pi /2, \, \pi , \, 3\pi /2\rbrace\). Using these, we can estimate the phase offset \(\hat{\phi }\) wrapped to the \(2\pi\) range as \(\hat{\phi }= \arctan ({\frac{C_{\pi /2}-C_{3\pi /2}}{C_{0}-C_{\pi }}})\). Phase unwrapping amounts to estimating the integer factor \(n\) to recover the unwrapped phase \(\phi =\hat{\phi }+ 2\pi n\). If successful, we can convert this phase estimate \(\phi\) to depth as \(z=\phi c/4\pi \omega _p\), where \(c\) is the speed of light.
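To make the four-sample recovery above concrete, the following Python sketch simulates ideal correlation samples, estimates the wrapped phase with the arctangent relation, and converts it to depth. The frequency, depth, and function names are illustrative assumptions, not values or code from the prototype.

```python
import numpy as np

C_LIGHT = 3e8  # speed of light [m/s]

def correlation_sample(psi, phi, amplitude=1.0, offset=0.5):
    """Ideal correlation measurement, proportional to cos(psi - phi) plus a constant offset (Eq. 4)."""
    return 0.5 * amplitude * np.cos(psi - phi) + offset

def wrapped_phase(c_0, c_90, c_180, c_270):
    """Four-bucket phase estimate, wrapped to [0, 2*pi)."""
    return np.mod(np.arctan2(c_90 - c_270, c_0 - c_180), 2 * np.pi)

def depth_from_phase(phi, freq_hz, n_wraps=0):
    """Convert an (unwrapped) phase to depth, z = phi * c / (4*pi*f)."""
    return (phi + 2 * np.pi * n_wraps) * C_LIGHT / (4 * np.pi * freq_hz)

if __name__ == "__main__":
    freq = 100e6                                       # 100 MHz modulation, for illustration
    z_true = 1.2                                       # meters, within the unambiguous range
    phi_true = 4 * np.pi * freq * z_true / C_LIGHT
    c = [correlation_sample(p, phi_true) for p in (0, np.pi / 2, np.pi, 3 * np.pi / 2)]
    print(depth_from_phase(wrapped_phase(*c), freq))   # ~1.2 m
```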
Fig. 1.
Fig. 1. We propose an all-optical neural time-of-flight (ToF) imaging system with centimeter-wave intensity-modulated illumination. To this end, we repurpose electro-optical modulators used in optical communication to compute GHz-frequency ToF correlation signals in free space, avoiding photo-conversion and fiber coupling. The proposed system provides correlation measurements at 7.15 GHz and 14.32 GHz modulation frequencies (second to left and center), which result in dozens of phase wraps over meter-scale scenes that cannot be estimated accurately by existing phase unwrapping methods. Therefore, we propose a segmentation-inspired neural phase unwrapping network that recovers accurate scene depth (right) from the correlation measurements and scene amplitude (second to right). See the scene photograph (inset, second to right) as reference. We demonstrate a robust method for all-optical GHz-frequency correlation ToF depth imaging of macroscopic scenes.
Fig. 2.
Fig. 2. Principle of correlation ToF. (a) Typical correlation ToF imagers emit coded illumination into a scene with time-varying sinusoidal intensity modulation. The reflected light then encodes travel time via its phase shift. (b) Homodyne detectors measure the correlation between the reflected sinusoidal signal and a reference signal with the same frequency, which produces a DC value as a function of the reference-signal phase and the phase shift from the scene.
Modulation Frequency. As we noted earlier in Equation (2), the round-trip path of the amplitude-modulated illumination imparts on it a phase shift \(\phi\). Setting \(t=0\), \(\tilde{\beta } = 0\), and \(\omega = 100\) MHz (a common modulation frequency in conventional ToF cameras) in Equation (2), we observe a 0.0009% signal difference for a 1 mm change in depth \(z\); see Figure 3. This means, with realistic imaging noise and quantization in existing sensors, we would practically not be able to discern millimeter-scale features on object surfaces for a setup with this modulation frequency. To achieve higher precision we move to higher frequencies: the same experiment repeated for \(\omega = 8\) GHz leads to a more detectable 5.6% difference in signal amplitude. In practice, there are many factors that affect signal contrast, which we explore in the remainder of this work.
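The sensitivity numbers above can be reproduced with a short calculation. The sketch below evaluates the normalized change of the return signal in Equation (2) for a 1 mm depth change, under the stated assumptions of \(t=0\), zero bias, unit amplitude, and a baseline phase of zero; the function name is hypothetical.

```python
import numpy as np

C_LIGHT = 3e8  # m/s

def signal_change(freq_hz, dz=1e-3):
    """Relative change of cos(phi) for a depth change dz, starting from phi = 0 (Eq. 2 with t = 0)."""
    dphi = 4 * np.pi * freq_hz * dz / C_LIGHT  # phase change for the 2*dz round trip
    return abs(np.cos(dphi) - 1.0)             # normalized by the unit amplitude

print(f"100 MHz: {100 * signal_change(100e6):.4f}% per mm")  # ~0.0009%
print(f"  8 GHz: {100 * signal_change(8e9):.1f}% per mm")    # ~5.6%
```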
Fig. 3.
Fig. 3. Illustration of depth estimation error versus modulation frequency. For each frequency, we simulate 1,000 samples with added Poisson-Gaussian noise of constant magnitude for an indoor scenario. We quantize the simulated measurement to 14 bits (mimicking a 14-bit digital-to-analog conversion), reconstruct the estimated depth, and plot the resultant mean measurement and standard deviation envelope. We see that as we increase modulation frequency from 100 MHz to 10 GHz, our expected precision similarly increases 100\(\times\).

4 Overview

Realizing correlation imaging at two orders of magnitude higher frequencies than existing systems is hindered by two technical challenges: modulating at GHz rates, and unwrapping the measured phase estimates, see Figure 4. Stable GHz demodulation is challenging as analog modulation after photo-conversion or with fiber coupling suffers from the high noise of ultra-fast photodiodes or large coupling losses, respectively. Phase unwrapping becomes a challenge as the increase in modulation frequency results in multiple dozens of wraps instead of a handful of wraps. The proposed computational imaging system tackles both limitations as follows.
Fig. 4.
Fig. 4. Simulated measurements for a 100 MHz ToF system, which exhibits only a single phase wrap, and a 7.15 GHz system, which experiences 35 wraps.
First, we present a convolutional network for high-frequency phase unwrapping, motivated by recent learning-based segmentation methods. Our approach represents wrap counts as class labels and segments measurements into their corresponding wrap regions, exploiting the fact that proximal measurements are highly correlated and are likely to be similarly phase wrapped.
Second, we introduce a two-pass EOM-based system with frequency doubling to tackle the problem of GHz frequency intensity modulation. The proposed method performs correlation computation optically in free-space rather than in the conventional analog domain. In this way, we avoid photo-conversion artifacts and energy loss from fiber-coupling, enabling high modulation contrast ToF imaging at 7.15 GHz and 14.32 GHz.

5 Neural Phase Unwrapping

Phase unwrapping methods estimate the wrap count \(n\) of the wrapped phase \(\hat{\phi }\) to recover the unwrapped phase \(\phi\) for depth estimation. Our GHz ToF system presents two main challenges for unwrapping. First, the high modulation frequencies (7.15 GHz and 14.32 GHz) result in dozens of wraps over meter-scale scenes, as opposed to one or two for conventional MHz systems, see Figure 4. Second, the modulation bandwidth of our GHz correlation ToF system is limited to \(\pm\)10 MHz, limiting the available sets of frequencies for multi-frequency approaches [Gutierrez-Barragan et al. 2019]. These challenges lead to lackluster performance from prior phase-unwrapping approaches including analytical solutions [Xia and Wang 2007], kernel methods [Lawin et al. 2016], and newer neural-network designs [Su et al. 2018; Zhang et al. 2019]. Here, we present a novel segmentation-inspired neural network tailored for high-frequency phase unwrapping. Rather than synthesizing the unwrapped phase directly, we pose this as an ordinal classification problem over wrap counts. Our network outputs \(N\) class weights for each input pixel, each corresponding to a candidate wrap count. Here, \(N\) is determined by the minimum and maximum expected wrap counts for the lowest modulation frequency, 7.15 GHz, to reduce class count.

5.1 Segmentation-based Fourier Phase Unwrapping

For our architecture, we modify the Fast SCNN [Poudel et al. 2019] image segmentation network. First, to encourage the network to learn local frequency unwrapping, rather than overfitting to global scene structure, we reduce its receptive field and add a full resolution skip layer directly to the output. We refer to the Supplemental Material for details on the network architecture. Second, as input to our network, in addition to measured amplitude, we use a Fourier feature encoding [Tancik et al. 2020] of wrapped phase \(\hat{\phi }\)
\begin{align} & \gamma (\hat{\phi }) = [\cos (2^0 \hat{\phi }), \sin (2^0\hat{\phi }), \cos (2^1\hat{\phi }),\ldots , \sin (2^{EC} \hat{\phi })]^\intercal . \end{align}
(5)
This encoding was used to great success in [Mildenhall et al. 2020] as a positional encoding method, mapping x, y, and z coordinates to a higher-dimensional space and improving training of their multilayer-perceptron representation. For our phase unwrapping network, the purpose is two-fold. The encoding increases the dimensionality of the input multi-frequency measurements to facilitate learning of high-frequency features, and effectively modulates the correlation values with a new set of sinusoids, as seen in Figure 5, which allows the network to perform a rudimentary frequency analysis of the underlying ToF signal.
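A minimal sketch of the encoding in Equation (5) is shown below; the number of octaves (corresponding to the exponent limit EC in the equation) is a placeholder choice for illustration.

```python
import numpy as np

def fourier_encode(phi_wrapped, num_octaves=4):
    """Fourier feature encoding of a wrapped phase map (Eq. 5).

    phi_wrapped: array of shape (H, W) with values in [0, 2*pi).
    Returns an array of shape (2 * (num_octaves + 1), H, W) holding
    [cos(2^0 phi), sin(2^0 phi), ..., cos(2^k phi), sin(2^k phi)].
    """
    features = []
    for k in range(num_octaves + 1):
        features.append(np.cos(2.0**k * phi_wrapped))
        features.append(np.sin(2.0**k * phi_wrapped))
    return np.stack(features, axis=0)

# Example: encode a toy 2x2 wrapped-phase map.
phi = np.array([[0.1, 3.0], [5.5, 1.2]])
print(fourier_encode(phi, num_octaves=2).shape)  # (6, 2, 2)
```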
Fig. 5.
Fig. 5. We pose the phase unwrapping problem as an ordinal classification problem and train a neural network to map phase and phase edge measurements to wrap counts. We use Fourier feature encodings of the phase measurements to allow the network to perform a rudimentary frequency analysis of the underlying ToF signal, and add realistic levels of Poisson-Gaussian noise to the simulated data to promote the learning of noise-robust unwrapping.

5.2 Ordinal Classification Loss

We calculate our final estimate unwrapped phase \(\phi ^{\prime }\) with
\begin{equation} \phi ^{\prime } = \sum _{n=0}^{N-1} n\left(\frac{e^{\gamma \hat{\phi }_n}}{\sum _{m=0}^{N-1}e^{\gamma \hat{\phi }_m}} \right), \end{equation}
(6)
a differentiable argmax. Here \(\phi _n\) is the predicted weight for phase class \(n\), corresponding to \(n\) wraps, and \(\gamma\) adjusts the hardness of the argmax function. The predicted depth is as before, \(z^{\prime } = \phi ^{\prime } c/4\pi \omega\). This differentiable argmax allows for backpropagation through our phase-unwrapping network, meaning we are able to use both entropy-based classification losses on the output class weights and standard image losses on the estimated phase or depth. Taking into consideration the ordinal nature of wrap counts—that is, predicting one wrap for a twice wrapped measurement is better than predicting twenty—we opt for a mixed cross-entropy \(\mathcal {L}_{CE}\) and \(\ell _1\) loss \(\mathcal {L}_{L1}\)
\begin{align} \mathcal {L} &= \mathcal {L}_{CE} + w_{L1}\mathcal {L}_{L1}, \nonumber \\ \mathcal {L}_{L1} &= |z - z^{\prime }|, \nonumber \\ \mathcal {L}_{CE} &= -\sum _{n=0}^{N-1} \phi _n \log (\phi _n^{\prime }), \end{align}
(7)
where \(z\) and \(\phi\) are ground truth measurements. The cross-entropy loss allows us to train the network as a classifier, while the smooth \(\ell _1\)-term provides a distance metric for the classes, penalizing the network for guessing wrap counts \(n^{\prime }\) far from the true \(n\).
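The following PyTorch sketch illustrates the differentiable argmax of Equation (6) and the mixed loss of Equation (7) for a per-pixel wrap classifier. Tensor shapes, the temperature, and the loss weight are illustrative assumptions; the cross-entropy term is written with integer ground-truth labels, and the unwrapped phase is formed as \(\hat{\phi }+ 2\pi n\) following Section 3.

```python
import math
import torch
import torch.nn.functional as F

C_LIGHT = 3e8  # m/s

def soft_wrap_count(logits, temperature=10.0):
    """Differentiable argmax over wrap-count classes (Eq. 6).

    logits: (B, N, H, W) predicted class weights for N candidate wrap counts.
    Returns a soft wrap-count map of shape (B, H, W).
    """
    probs = F.softmax(temperature * logits, dim=1)
    n = torch.arange(logits.shape[1], dtype=logits.dtype, device=logits.device)
    return (probs * n.view(1, -1, 1, 1)).sum(dim=1)

def unwrapping_loss(logits, phi_wrapped, n_gt, z_gt, omega, w_l1=0.1):
    """Mixed cross-entropy + L1 depth loss (Eq. 7)."""
    ce = F.cross_entropy(logits, n_gt)                    # classification over wrap counts
    n_soft = soft_wrap_count(logits)
    phi_unwrapped = phi_wrapped + 2 * math.pi * n_soft    # soft unwrapped phase
    z_pred = phi_unwrapped * C_LIGHT / (4 * math.pi * omega)
    return ce + w_l1 * F.l1_loss(z_pred, z_gt)

# Toy usage with random tensors (batch of 1, 36 wrap classes, 8x8 pixels).
logits = torch.randn(1, 36, 8, 8, requires_grad=True)
phi_w = torch.rand(1, 8, 8) * 2 * math.pi
n_gt = torch.randint(0, 36, (1, 8, 8))
z_gt = torch.rand(1, 8, 8) * 3.0
unwrapping_loss(logits, phi_w, n_gt, z_gt, omega=7.15e9).backward()
```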

5.3 ToF Simulation from RGBD

Given that there do not exist GHz ToF datasets, especially not ones with associated ground truth, we look to simulation to fill our need for training data. We simulate our measurements from the Hypersim RGB-D dataset [Roberts and Paczan 2021], containing 77,400 ground truth depth maps \(z\) (in mm) and images \(I\) from 461 computer-generated indoor scenes. We first calculate ground truth phase as \(\phi = 4\pi \omega z/c\), where \(\omega \in \lbrace 7.15\,\mathrm{GHz}, 14.32\,\mathrm{GHz}\rbrace\), and \(c\) is the speed of light. We simulate ToF correlation images \(C_\psi\) for \(\psi \in \lbrace 0,\, \pi /2,\, 3\pi /2,\, \pi \rbrace\) as
\begin{align} C_\psi = G T I_g (0.5+cos(\phi + \psi)/\pi) + \eta _P + \eta _G, \end{align}
(8)
where \(G\) is sensor gain, \(T\) is integration time, and \(I_g\) is the green channel of the image, meant to emulate the green laser in the experimental prototype. To simulate measurement fluctuations, we add Poisson noise \(\eta _{P}\) and Gaussian noise \(\eta _{G}\) with mean \(\mu\) and standard deviation \(\sigma\). We note that a typical correlation ToF camera follows a Skellam-Gaussian noise model [Hansard et al. 2012]; our all-optical correlation ToF design, however, has no photon bucketing and consequently encounters only Poisson-Gaussian noise.
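As a sketch of this simulation step, the routine below generates four-phase correlation images from a depth map following Equation (8). The gain, exposure, photon-count scaling of the Poisson noise, Gaussian noise level, and the standard ordering of \(\psi\) are placeholder assumptions for illustration.

```python
import numpy as np

C_LIGHT = 3e8  # m/s

def simulate_correlation_images(z_mm, img_green, omega, gain=1.0, exposure=1.0,
                                photon_scale=1000.0, gauss_sigma=0.01, rng=None):
    """Simulate four-phase ToF correlation images from depth (Eq. 8).

    z_mm:      ground-truth depth map in millimeters, shape (H, W).
    img_green: green channel of the rendered image, used as an albedo proxy, shape (H, W).
    omega:     modulation frequency in Hz (e.g., 7.15e9 or 14.32e9).
    Returns an array of shape (4, H, W) for psi in {0, pi/2, pi, 3*pi/2}.
    """
    rng = rng if rng is not None else np.random.default_rng()
    phi = (z_mm * 1e-3) * 4 * np.pi * omega / C_LIGHT
    images = []
    for psi in (0.0, np.pi / 2, np.pi, 3 * np.pi / 2):
        clean = gain * exposure * img_green * (0.5 + np.cos(phi + psi) / np.pi)
        shot = rng.poisson(np.maximum(clean, 0.0) * photon_scale) / photon_scale  # Poisson shot noise
        read = rng.normal(0.0, gauss_sigma, size=clean.shape)                     # Gaussian read noise
        images.append(shot + read)
    return np.stack(images, axis=0)
```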

5.4 Correlation Images to Wrapped Phase and Amplitude

From the correlation images \(C_\psi\) obtained either from the training dataset or our GHz ToF imaging system, we recover the wrapped phase \(\hat{\phi }\) and amplitude \(\hat{a}\) using a per-pixel Fourier transform
\begin{align} \hat{\phi }= \mathrm{angle}(\mathcal {F}_2 (C_\psi)), \hat{a} = 2|\mathcal {F}_2 (C_\psi)|, \end{align}
(9)
where \(\mathrm{angle}(\cdot)\) is the phase angle of a complex number and \(\mathcal {F}_i(\cdot)\) is the \(i\)th complex value of the Fourier-transformed signal (e.g., \(\mathcal {F}_0\) is the DC component). This process is repeated for the two modulation frequencies, 7.15 GHz and 14.32 GHz, and the arrays are stacked to form the raw multi-frequency measurements. As a result of the above Fourier recovery, \(\hat{\phi }\in [0, 2\pi ]\) is phase wrapped, and is passed into our phase-unwrapping network.
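A minimal per-pixel Fourier recovery is sketched below, assuming \(K\) equally spaced demodulation phases stacked along the first axis. The harmonic bin index, the amplitude normalization, and the sign convention of the recovered phase depend on the DFT definition and the sample ordering, so they are left as parameters and comments rather than fixed to the \(\mathcal {F}_2\) choice used above.

```python
import numpy as np

def phase_amplitude_from_correlation(corr, harmonic=1):
    """Recover wrapped phase and amplitude from phase-stepped correlation images (Eq. 9).

    corr:     array of shape (K, H, W), correlation images for K equally spaced phases psi.
    harmonic: index of the DFT bin carrying the correlation signal (1 for K samples
              spanning one period; the paper uses F_2 for its sampling scheme).
    """
    spectrum = np.fft.fft(corr, axis=0)              # per-pixel DFT over the psi axis
    bin_k = spectrum[harmonic]
    # Sign and offset of the recovered angle depend on the DFT convention and psi ordering;
    # adjust to match the acquisition convention if needed.
    phi_wrapped = np.mod(np.angle(bin_k), 2 * np.pi)
    amplitude = 2.0 * np.abs(bin_k) / corr.shape[0]  # rescale to the signal amplitude
    return phi_wrapped, amplitude
```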

6 All-optical GHz Correlation ToF Imaging

GHz modulation frequencies for correlation ToF can allow for high-precision depth imaging as illustrated in Figure 3. However, practically realizing this idea has been challenging due to the limited photon absorption depth in silicon [Lange and Seitz 2001] and the coupling losses of fiber-based modulation [Kadambi and Raskar 2017]. In this work, we take a different approach by co-opting free-space EOMs, mainly used for optical communication, and introducing a two-pass intensity modulation system with polarizing optics. Our method performs the correlation computation optically and, as such, permits the use of intensity sensors and continuous-wave lasers, as compared to the more complex hardware requirements of pulsed LiDAR.

6.1 Background on EOMs and Jones Calculus

We briefly review EOMs and Jones calculus before describing our novel two-pass intensity modulation scheme. Modern EOMs modulate the phase, amplitude, and polarization of light by applying an electric field to control the refractive indices of a bulk crystal, perpendicular to the direction of light propagation, according to the electro-optic Pockels effect [Yariv 1967]. To mathematically model the effect of an EOM, we rely on Jones vectors and Jones matrices. The Jones vector is a \(2\times 1\) vector that describes the amplitude and phase of the horizontal and vertical polarization components. The corresponding Jones matrix describes the change of the polarization state of light with a \(2 \times 2\) matrix that can be multiplied by a Jones vector. We refer the reader to Collet [2005] for a review on Jones calculus. The Jones matrix describes how an EOM shifts the horizontal and vertical polarization waves of light by an amount dependent on the applied voltage \(V\) as
\begin{equation} B(V) = \begin{bmatrix} e^{-iV/2} & 0 \\ 0 & e^{iV/2} \end{bmatrix}, \quad V=\eta \sin (\omega t - \phi), \end{equation}
(10)
where \(V\) is a time-varying voltage function, \(\eta\) is the modulation power, \(\omega\) is the voltage modulation frequency, and \(\phi\) is the modulation phase. Our custom-designed resonant EOM is capable of generating phase differences in two orthogonally-polarized components of light at a 7.15 GHz frequency with 20 MHz of bandwidth. See Supplemental Material for additional information on our EOM.

6.2 Two-pass GHz Intensity Modulation

For GHz ToF imaging, we propose a two-pass GHz intensity modulation method that uses polarizing optics and custom phase-modulating EOMs. This allows us to achieve a 7.15 GHz modulation frequency at the EOM's native resonance as well as a higher frequency of 14.32 GHz, enabled by intensity modulation with a combination of two-pass phase modulation and polarization changes. The two modulation frequencies provide high depth resolution, with an effective wavelength of 2.1 cm for the 14.32 GHz frequency, and overcome the small native bandwidth of 20 MHz of the EOMs, resulting in a frequency difference of 7.17 GHz for phase unwrapping. We note that our method enables high-frequency EOM-based intensity modulation, an entirely different concept than conventional optical-frequency doubling using EOMs. We describe the working principle of our method below.

6.2.1 Intensity Modulation.

Our custom resonant EOM delays the phase of the horizontal and vertical components of light at a frequency \(\omega =7.15\,GHz\). We exploit these polarization-dependent phase shifts to perform intensity modulation of incident light. Specifically, we use the following polarization optics: a polarizing beamsplitter (PBS), a half-wave plate (HWP), a quarter-wave plate (QWP), and a mirror, as shown in Figure 7. Incident light enters a PBS, which turns it into vertically linearly polarized light
\begin{equation} E_{0} = A \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \end{equation}
(11)
where \(A\) is the amplitude of the incident light. The polarization state of the light is then modulated by an HWP and a QWP followed by an EOM at a given voltage \(V\) as
\begin{equation} E_{1} = B(V)Q(\theta _q)H(\theta _h)E_{0}, \end{equation}
(12)
where \(H(\theta _h)\) and \(Q(\theta _q)\) are the Jones matrices of the HWP and the QWP oriented at angles \(\theta _h\) and \(\theta _q\). The light then propagates in free space over half a modulation wavelength, \(c/2\omega\), to where a mirror is placed, changing the Jones vector as
\begin{equation} E_{2} = M E_{1}, \end{equation}
(13)
where \(M\) is the Jones matrix of a mirror. The light then travels back through the EOM, the QWP, and the HWP. The PBS picks up the vertical linear polarization component of this light. Setting the HWP and QWP angles as \(\theta _h=11.25^\circ\) and \(\theta _q=45^\circ\), we obtain the output light
\begin{align} E_{3} & = L_h H(-\theta _h)Q(-\theta _q)B(V)MB(V)Q(\theta _q)H(\theta _h)E_0 \nonumber \\ & = A \begin{bmatrix} \frac{i(\cos V - \sin V)}{\sqrt {2}} \\ 0 \end{bmatrix}. \end{align}
(14)
We square the magnitude of \(E_{3}\), and observe a modulated intensity
\begin{align} I(V) = |E_{3}|^2 = \frac{A^2}{2} (1-\sin 2V). \end{align}
(15)
Equation (15) indicates that the output intensity of light is a function of the voltage \(V\) applied to the EOM. As we supply a time-varying sinusoidal voltage to the EOM as in Equation (10), we arrive at the time-varying intensity-modulated light as
\begin{align} I(t) &= \frac{A^2}{2} (1-\sin (2\eta \sin (\omega t - \phi))) \nonumber \\ &\approx \frac{A^2 }{2} (1-2\eta \sin (\omega t - \phi)). \end{align}
(16)
The last approximation is based on the Taylor expansion given that the modulation power \(\eta\) of our EOM is small. The applied voltage to the EOM has GHz modulation frequency \(\omega =7.15\,GHz\), enabling effective all-optical GHz modulation of light intensity. We refer to the Supplemental Material for detailed derivation.
Equation (16) describes the high-frequency intensity modulation realized by the proposed free-space two-pass phase modulation with polarizing optics shown in Figure 7. This optical configuration serves as a building block for both the illumination and detection modules in our imaging system. In the illumination module, we input continuous laser light into the EOM, resulting in sinusoidal intensity-modulated light emitted into the scene. For the detection module, the returned amplitude-modulated light from the scene is demodulated by an additional intensity modulation with the reference signal \(r\), recall Equation (4), and we optically multiply \(r\) and \(\tilde{p}\) before integration on the detector.

6.2.2 Doubled Intensity-Modulation Rate.

Even though the voltage modulation frequency \(\omega\) is limited to a narrow modulation band of 20 MHz in our resonant EOM, we can modulate at double the frequency, \(2\omega\), by adjusting the angle of the HWP, \(\theta _h\), in front of the EOM, achieving a 14.32 GHz modulation rate. While doubling the frequency of the optical carrier is well known in optics, we note that the proposed frequency doubling of the intensity modulation is novel. In the original operating mode, we set the HWP angle \(\theta _h\) to \(11.25^\circ\), resulting in intensity modulation at the original frequency \(\omega\). For frequency doubling, we rotate the HWP to \(\theta _h = 22.5^\circ\). To derive the modulation behavior, we rely on the same Jones calculus from above. Specifically, changing the HWP angle \(\theta _h\) results in the output light \(E_3\) as
\begin{align} E_{3} = A \begin{bmatrix} -i \sin V \\ 0 \end{bmatrix}. \end{align}
(17)
The intensity \(I(V)\) is the magnitude square of \(E_3\) as
\begin{equation} I(V) = |E_3|^2 = \frac{A^2}{2} (1-\cos (2V)). \end{equation}
(18)
Note that the difference between Equations (15) and (18) is that we have \(\cos ()\) instead of \(\sin ()\). This single difference enables intensity modulation at a doubled frequency. After applying the time-varying voltage modulation of Equation (10), the time-varying intensity of the output light is
\begin{align} I(t) &= \frac{A^2}{2} (1-\cos (2\eta \sin (\omega t - \phi))) \nonumber \\ &\approx \frac{A^2 }{2} \eta ^2 \sin ^2(\omega t - \phi) \nonumber \\ & = \frac{A^2 }{4} \eta ^2 (1 - \cos (2\omega t - 2\phi)). \end{align}
(19)
We use the same Taylor expansion as before. Equation (19) shows that we can obtain the doubled frequency \(2\omega\) at a quarter of the amplitude of the single-frequency mode at \(\omega\), solely by changing the polarization optics rather than the electro-optical modulation itself.
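To see the frequency doubling numerically, the sketch below evaluates the output intensities of Equations (15) and (18) under a sinusoidal drive as in Equation (10); the amplitude, modulation power, and time axis are placeholder values for illustration.

```python
import numpy as np

def drive_voltage(t, eta=0.2, freq_hz=7.15e9, phi=0.0):
    """Sinusoidal EOM drive voltage (cf. Eq. 10); eta and phi are placeholder values."""
    return eta * np.sin(2 * np.pi * freq_hz * t - phi)

def intensity_single(V, A=1.0):
    """Output intensity with the HWP at 11.25 deg (Eq. 15): modulation at the drive frequency."""
    return 0.5 * A**2 * (1 - np.sin(2 * V))

def intensity_doubled(V, A=1.0):
    """Output intensity with the HWP at 22.5 deg (Eq. 18): modulation at twice the drive frequency."""
    return 0.5 * A**2 * (1 - np.cos(2 * V))

t = np.linspace(0, 2 / 7.15e9, 1000)   # two periods of the 7.15 GHz drive
V = drive_voltage(t)
I_single, I_doubled = intensity_single(V), intensity_doubled(V)
# I_single oscillates once per drive period (7.15 GHz); I_doubled oscillates twice
# per drive period (14.32 GHz) with a much smaller amplitude, matching Eqs. (16) and (19).
```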

6.2.3 Validation of GHz Intensity Modulation.

We validate our GHz intensity modulation module consisting of a PBS, an HWP, a QWP, an EOM, and a mirror. We emit laser light onto a mirror at a fixed position and directly capture the intensity of the modulated light steered onto a GHz photodetector; see the Supplemental Material for the measurement configuration. Figure 8 demonstrates the effective GHz intensity modulation with high modulation contrast at two different HWP angles of \(11.25^\circ\) and \(22.5^\circ\), corresponding to the modulation frequencies of \(\omega =7.15~\)GHz and \(2\omega =14.32~\)GHz.

6.3 Coaxial Spatial Intensity Imaging

Equipped with the intensity modulation block, we design a coaxial imaging system with an illumination and a detection module, see Figure 6. For the illumination module, we opt for continuous-wave laser illumination at 532 nm (for eye-safe lab operation of the prototype) followed by a GHz intensity modulation block. A second GHz modulation block is used for the detection module, combined with an avalanche photodiode (APD) for intensity sensing. We employ an APD to allow for high gain at fast readout rates in low-flux scenarios, which is especially important for scene surfaces with low reflectance and objects at long distances; see Supplemental Material. Using a non-polarizing beamsplitter, we share the same path for the output light to a scene and the detected light from a scene, improving the signal-to-noise ratio of the system. For 2D spatial scanning, we use a 2-axis galvanometer in front of the beamsplitter, as shown at the bottom of Figure 6. Although the proposed free-space modulation method is not limited to co-axial scanning, the beam-steered acquisition effectively eliminates most multi-path interference, which we neglect in the remainder of this work.
Fig. 6.
Fig. 6. We propose an all-optical free-space correlation ToF imaging system using polarizing optics and resonant EOMs. (a) The schematic diagram of our imager as realized in (b) shows the light path from a laser source to a scene, and back to the photodiode. See text for details.
Fig. 7.
Fig. 7. Two-pass GHz intensity modulation with polarizing optics and an EOM. (a) We implement the GHz intensity modulation of incident light by using polarizing optics and an EOM. Incident light becomes linearly polarized after a PBS and is further polarization-modulated by an HWP and a QWP. An EOM with a sinusoidal voltage applied shifts the phases of the horizontal and vertical polarization components. The light is then reflected by a mirror placed half a modulation wavelength away and passes back through the EOM, the QWP, the HWP, and the PBS. The combination of forward and reverse paths results in (b) the optical intensity modulation of incident light at GHz frequency with unrolled polarization modulation, see the text for details.
Fig. 8.
Fig. 8. We validate the GHz intensity modulation module by capturing the signal reflected from a mirror at a fixed position. (b) When the HWP is oriented at \(11.25^\circ\), we achieve the single intensity modulation frequency of \(\omega =7.15\)GHz. (c) Changing the HWP angle to \(22.5^\circ\) at the EOM base frequency \(\omega =7.16\) GHz enables frequency doubling: \(2\omega =14.32\)GHz.
Analog Signal Integration. We use a conventional avalanche photodiode with gain \(G\) to detect the correlation signals from the detection module without any quantization involved. This generates an analog photocurrent which is then low-pass filtered with an electrical filter and a resistor-capacitor (RC) circuit that further integrates the constant correlation input signal over an exposure time \(T\). We read out the analog signal with a 14-bit ADC. This results in the digital read of
\begin{align} C_\psi = Q\left(G\int _{0}^{T} {\tilde{p}({t - \tau })r({t}) \, \mathrm{d}t} \right) = Q\left(\frac{\tilde{\alpha }}{2}\cos (\psi - \phi) + TK \right), \end{align}
(20)
where \(Q\) is the 14 bit ADC quantization, \(\phi\) is the illumination phase, and \(\psi\) is the phase of the reference \(r\) shown in Equation (4).
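For completeness, a simple model of the quantized readout \(Q(\cdot)\) is sketched below, assuming a normalized full-scale range; the prototype's actual ADC range and scaling are not specified here.

```python
import numpy as np

def adc_quantize(signal, full_scale=1.0, bits=14):
    """Model the ADC readout Q(.) in Eq. (20): clip to the input range and round
    to the nearest of 2**bits levels. The full-scale value is a placeholder."""
    levels = 2**bits - 1
    clipped = np.clip(signal / full_scale, 0.0, 1.0)
    return np.round(clipped * levels) / levels * full_scale
```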

6.4 Fine-tuning with Active Stereo Supervision

We fine-tune our phase-unwrapping network to allow for specialization to the specific noise characteristics of our experimental system and minor deviations of the modulation functions from ideal sinusoids. We acquire pseudo-ground-truth phase wrap maps by augmenting our system with stereo cameras (other ground-truth acquisition approaches are also possible; we chose stereo for ease of implementation). We build the acquisition system by mounting two CMOS cameras (FLIR Grasshopper3 GS3-U3-32S4C) to the ToF rig, with 8 mm lenses to match our system's FOV, as shown in Figure 9. This effectively creates an auxiliary active stereo system from which to recover coarse scene depth without additional captures. After geometric calibration, we triangulate the position of the ToF laser spot in 3D space with the stereo cameras as we scan the scene. The estimated depth for each laser spot allows us to generate a pseudo ground truth wrap map. We perform fine-tuning on a diverse dataset of captured stereo measurements, which are all withheld from the experimental validation section. For details on the fine-tuning, we refer to the Supplemental Document.
Fig. 9.
Fig. 9. Active stereo setup to generate ground truth depth for fine-tuning. (a) Stereo cameras are mounted to the ToF system. (b) Sample 7.15 GHz ToF phase map. (c) Corresponding wrap map recovered from stereo depth.

7 Experimental Setup

For experimental validation, we implement the prototype system shown in Figure 6. While we assemble the experimental prototype on an optical breadboard, we note that the EOMs and optics can be integrated in a small form factor similar to LiDAR sensors.

7.1 Illumination Module

We use a single transverse mode continuous-wave laser at 532 nm wavelength (Laser Quantum Gem 532). The laser beam is coupled into a custom-designed single-mode high-power optical fiber (OZ Optics QPMJ-A3AHPCA3AHPC-488-3.5/125-3AS-1-1), which removes higher-order modes and produces a uniform Gaussian beam at the output of the fiber, maintaining \(20{-}30\%\) of the laser output power. The light then enters a 2.5\(\times\) inverse beam expander (Thorlabs LC1060-A and LA1608-A) that reduces the beam diameter from 1.25 mm to 0.5 mm, matching the desired beam size for our EOM. The reduced beam becomes horizontally linearly polarized by passing through a first PBS (Thorlabs PBS101). Then, a pair of HWP (Thorlabs WPMH05M-532) and QWP (Thorlabs WPQ05M-532) modulates the polarization state of the beam. The polarization-modulated light passes through our custom GHz EOM that operates at the modulation frequency \(\omega\). The light is reflected by a mirror (Thorlabs PF10-03-P01), returning back through the EOM, the QWP, the HWP, and the PBS. This procedure results in the GHz-frequency intensity modulation of light.
The light is then directed by a mirror (Thorlabs PF10-03-P01) to a non-polarizing beamsplitter (NBS, Thorlabs CCM1-BS013) that divides the incident beam into two beams of equal intensity. One beam is directed to an integrating sphere (Thorlabs S140C), which measures the intensity of emitted light for calibration purposes, and the other beam passes through another NBS (Thorlabs CCM1-BS013). The purpose of this module is to calibrate out intensity fluctuations of the laser by normalizing the signal incident on the detection module. The optical intensity modulation is far faster than the integration time of a few milliseconds, which allows this compensation to be applied after the modulation without error. The second NBS splits the beam again into two paths of equal intensity: one half of the beam is used as the reference beam in interferometric measurement mode with a mirror (used for the precision comparison, see Figure 12); otherwise, this beam is discarded in intensity-measurement mode. The other half of the beam is sent to a scene through a mirror (Thorlabs PF30-03-P01) and a 2-axis galvo mirror system (Thorlabs GVS012) for spatial scanning. The emitted CW laser power is 3 mW. For photon efficiency estimates, see the Supplemental Document. To avoid speckle artifacts of the coherent laser illumination, we slightly defocus the projected beam.

7.2 Detection Module

The intensity-modulated light returns from a scene and passes through the galvo mirror system and the mirror, followed by an NBS which redirects the beam to the detection module. We use a 1.6\(\times\) inverse beam expander (Thorlabs LA1213-A and Thorlabs LC1060-A) and a mirror (Thorlabs PF10-03-P01), resulting in a collimated beam of 0.5 mm diameter that accurately enters the detection EOM. Symmetric to the emission module, we mount a PBS, an HWP, a QWP, an EOM, and a mirror, which in effect optically demodulate the light returned from the scene. The intensity-demodulated light is then captured by an avalanche photodiode (Thorlabs APD440A) with a focusing lens (Thorlabs LA1951-A). We use a 10 kHz low-pass filter (Thorlabs EF120) and an RC low-pass integrator circuit with time constant \(t_{RC} = 100\) ms to integrate the detected photocurrent signal, which is then passed into an analog-to-digital converter (LabJack T7) that samples the signal at up to 24 K samples per second. We integrate over 20 samples for a single phase measurement and sample 16 phases, corresponding to a 13 ms integration time for a single galvo measurement point.

7.3 RF Driver

To operate the EOMs with a sinusoidal voltage input, we use two custom RF drivers with high-frequency direct digital synthesis (DDS) signal generators, which are synchronized to the external clock source of a function generator (Siglent SDG2042X). Our DDS signal generators support sinusoidal modulation only, leaving non-sinusoidal modulation as interesting future work [Gupta et al. 2018]. The external clock enables accurate control of the phase of the modulation signal \(\phi\). Our driver contains two RF modulators to output an RF signal provided to the EOMs. The RF driver performs frequency locking to significantly increase the output power and reduce frequency drift in the EOM. For further details, refer to the Supplemental Document.

7.4 Comparison to Analog RF Demodulation

For comparison of the proposed system with demodulation of a signal after photo-conversion, we add a high-speed GaAs 12 GHz photodetector (EOT GaAs PIN Detector ET-4000) connected to an RF demodulation circuit. This measurement setup can be enabled by flipping a flip-mirror in the optical path, redirecting the scene illumination to the fast photodiode instead of the proposed detection module. This photodiode offered the highest photon-detection efficiency and high-frequency response available to us. The captured photocurrent from the detector is sent as input to an I/Q demodulator consisting of analog microwave electronics as follows. The photodetector signal is first amplified and band-pass filtered. It then enters an RF mixer to be demodulated with the local oscillator (LO) signal from the RF driver. This produces a signal at the difference of the two frequencies and a signal at the sum of the two frequencies. These signals are passed through a low-pass filter, which removes the higher-frequency component. The remaining homodyne DC signal is then output as two signals: an in-phase component \(I\), and a quadrature component \(Q\) shifted by 90 degrees. For a detailed circuit design, see the Supplemental Document.
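For reference, the wrapped phase and amplitude follow from the I/Q outputs via the standard homodyne relation below; this is a generic sketch, not code from the authors' demodulation circuit.

```python
import numpy as np

def phase_amplitude_from_iq(i_component, q_component):
    """Recover phase (wrapped to [0, 2*pi)) and amplitude from homodyne I/Q samples."""
    phase = np.mod(np.arctan2(q_component, i_component), 2 * np.pi)
    amplitude = np.hypot(i_component, q_component)
    return phase, amplitude
```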

8 Assessment

In this section, we validate the proposed computational ToF method in simulation and experimentally. Specifically, we first perform quantitative evaluation of our neural unwrapping approach on a synthetic dataset and compare with other baseline phase unwrapping methods. We then experimentally validate the proposed system quantitatively and qualitatively on unseen real-world measurements captured by our experimental prototype.

8.1 Simulated Analysis

Ablation Study. We conduct an ablation study to validate our choice of Fourier feature encoding and combined loss function. The different ablation configurations and corresponding quantitative results are shown in Table 1, and we refer to the Supplemental Document for qualitative results. We observe that Fourier encoding leads to a 10 percentage point boost in correct wrap predictions, supporting the theory that the doubly modulated phases provide valuable features during training, possibly in the form of a learned frequency analysis of the underlying measurements. We concatenate phase edges to the input in order to aid the network’s understanding of each wrap region. For the loss functions, we find the model trained on cross-entropy loss alone demonstrates competitive results, validating the choice to represent phase unwrapping as a classification problem. However, when we make use of the differentiable argmax function to directly introduce \(\ell _1\) loss on predicted depth, we see a reduction in outliers and an overall smoother final prediction. This reinforces the problem as ordinal classification, where the ordering of classes—in this case, wrap counts—is significant.
Table 1.
Method | Input: \(\hat{\phi }\) | \(\gamma (\hat{\phi })\) | \(|\nabla \hat{\phi }|\) | Loss: \(\mathcal {L}_{CE}\) | \(\mathcal {L}_{L1}\) | Performance: \(\uparrow \delta \lt 1\) | \(\uparrow \delta \lt 2\) | \(\uparrow \delta \lt 3\) | \(\downarrow \delta \ge 3\)
Proposed | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | 51.6% | 69.1% | 77.0% | 23.0%
\(\mathcal {L}_{CE}\) Only | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | – | 49.1% | 65.0% | 73.3% | 26.7%
F. Features | \(\checkmark\) | \(\checkmark\) | – | \(\checkmark\) | \(\checkmark\) | 40.2% | 59.6% | 68.8% | 31.2%
Phase Only | \(\checkmark\) | – | – | \(\checkmark\) | \(\checkmark\) | 30.3% | 52.3% | 65.7% | 34.3%
Table 1. Ablation Study Configurations and Corresponding Quantitative Results
Here the \(\delta\) metric reports the percentage of pixels whose predicted wrap count deviates from the ground truth by the indicated number of wraps. The first three columns are input features, the next two are loss terms, and the remaining columns report performance (%). The up arrow denotes "higher is better", the down arrow "lower is better". The best performing method is highlighted in bold.
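To make the ablated components concrete, the following minimal PyTorch sketch illustrates the three ingredients referenced in Table 1: a Fourier feature encoding \(\gamma (\hat{\phi })\) of the wrapped phases, a differentiable (soft) argmax that turns per-pixel wrap-count logits into an expected depth, and the combined cross-entropy plus \(\ell _1\) objective. Tensor shapes, the number of frequency bands, and function names are illustrative assumptions, not our exact implementation.

```python
import torch
import torch.nn.functional as F

def fourier_features(phase, n_bands=4):
    """Encode wrapped phase maps (B, 2, H, W) with sin/cos at doubling frequencies,
    in the spirit of the gamma(phi_hat) input ablated in Table 1."""
    feats = [phase]
    for k in range(n_bands):
        feats += [torch.sin((2 ** k) * phase), torch.cos((2 ** k) * phase)]
    return torch.cat(feats, dim=1)

def soft_argmax_depth(logits, wrap_depths):
    """Differentiable argmax: expected depth under the per-pixel wrap-count softmax.
    logits: (B, K, H, W) wrap-count scores; wrap_depths: (K,) depth of each wrap."""
    prob = F.softmax(logits, dim=1)
    return torch.einsum('bkhw,k->bhw', prob, wrap_depths)

def combined_loss(logits, wrap_gt, depth_gt, wrap_depths, w_l1=0.1):
    """Cross-entropy on wrap classes plus an L1 penalty on the soft-argmax depth."""
    ce = F.cross_entropy(logits, wrap_gt)
    l1 = F.l1_loss(soft_argmax_depth(logits, wrap_depths), depth_gt)
    return ce + w_l1 * l1
```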
Comparison to Phase Unwrapping Methods. We validate our neural unwrapping approach on a synthetic test set and discuss qualitative and quantitative results. As baselines, we compare against traditional unwrapping methods, including the approach used in phasor imaging [Gupta et al. 2015], the algebraic Chinese remainder theorem (CRT) solution [Pei et al. 1996; Xia and Wang 2007], and the kernel density estimation (KDE) method [Lawin et al. 2016], which is also used in the Kinect V2 software. We also compare to an unmodified U-Net [Ronneberger et al. 2015] baseline and three recent regression-based deep learning approaches [Wang et al. 2019; Zhang et al. 2019; Su et al. 2018].
We show the qualitative performance of our neural unwrapping method and the baseline methods, including CRT [Xia and Wang 2007], KDE [Lawin et al. 2016], and the next best network-based method [Zhang et al. 2019], in Figure 10, and refer to the Supplemental Document for additional qualitative comparisons. Table 2 presents quantitative classification results for the full range of methods in increasingly wide error bands, as well as outlier percentages. Visually, CRT and KDE achieve similar results, as they share similar underlying mechanics for wrap calculation; however, KDE's spatial aggregation allows it to better handle noise and make significantly more correct estimates. This is confirmed quantitatively: more than half of CRT's predictions are outliers (\(\delta \ge 10\)), while for KDE this number is less than 9%. The remaining conventional method, phasor imaging, struggles heavily under the added noise and sub-optimal modulation frequencies, leading to nearly all measurements being incorrectly unwrapped. The U-Net and the three deep learning baselines produce spatially smoother predictions than the classical methods, but often bin entire patches of the image into the wrong wrap count, leading to a marginally higher rate of outliers than KDE while making more than double the number of correct predictions. Deep-ToF [Su et al. 2018] approaches phase unwrapping by directly regressing the raw correlation measurements to depth with a modified U-Net with skip connections. This regression-based method often produces globally inaccurate unwrapping, as it hallucinates depth directly without an explicit notion of wrap counts. In contrast, our unwrapping network avoids this problem by estimating wrap counts with a segmentation-inspired network architecture and loss functions. The proposed neural unwrapping method more than doubles the rate of correct predictions compared to the baseline of Zhang et al. [2019]. This is confirmed qualitatively in Figure 10 by spatially consistent outputs and object discontinuities that accurately align with the amplitude measurement. The proposed network outperforms all existing methods in GHz frequency unwrapping.
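For reference, the sketch below illustrates a per-pixel lookup-style baseline in the spirit of the CRT solution: it enumerates wrap hypotheses at both modulation frequencies and returns the depth at which they agree best. This is a simplified stand-in, not the exact CRT or KDE implementation; KDE additionally aggregates such evidence spatially, which is what makes it more robust to noise.

```python
import numpy as np

C = 2.998e8  # speed of light [m/s]

def lookup_unwrap(phi1, phi2, f1=7.15e9, f2=14.32e9, max_range=1.0):
    """Brute-force per-pixel unwrapping of two wrapped phases (radians).

    Enumerates wrap hypotheses at both frequencies and returns the depth where
    the two hypotheses agree best. At 7.15 GHz the ambiguity range is ~2.1 cm,
    i.e., roughly 48 wrap hypotheses per meter.
    """
    r1, r2 = C / (2 * f1), C / (2 * f2)                  # ambiguity ranges [m]
    n1 = np.arange(int(np.ceil(max_range / r1)))
    n2 = np.arange(int(np.ceil(max_range / r2)))
    d1 = (n1 + phi1 / (2 * np.pi)) * r1                  # candidate depths at f1
    d2 = (n2 + phi2 / (2 * np.pi)) * r2                  # candidate depths at f2
    diff = np.abs(d1[:, None] - d2[None, :])
    i, j = np.unravel_index(np.argmin(diff), diff.shape)
    return 0.5 * (d1[i] + d2[j])
```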
Fig. 10.
Fig. 10. Phase unwrapping results comparing existing conventional and learned methods with our proposed approach. The analytic solutions of CRT [Xia and Wang 2007] and KDE [Lawin et al. 2016] suffer from rapid phase wraps and phase noise. The state-of-the-art neural network method partly overcomes these problems at the cost of smoothed geometry and low-frequency depth artifacts. Our method outperforms the previous methods, recovering both accurate scale and geometric detail. The error maps below the results are a visual representation of the \(\delta\) metric (see text and Tables 1 and 2).
Table 2.

| Method | \(\uparrow \delta \lt 1\) | \(\uparrow \delta \lt 2\) | \(\uparrow \delta \lt 3\) | \(\downarrow \delta \ge 3\) | \(\downarrow \delta \ge 10\) |
|---|---|---|---|---|---|
| Phasor [2015] | 0.74% | 1.66% | 3.50% | 96.5% | 84.4% |
| CRT [2007] | 9.29% | 14.7% | 19.7% | 80.3% | 56.0% |
| KDE [2016] | 9.46% | 18.56% | 27.0% | 73.0% | 8.93% |
| One-Step [2019] | 19.9% | 37.6% | 52.2% | 47.8% | 14.6% |
| U-Net [2015] | 21.8% | 45.6% | 64.4% | 35.6% | 10.0% |
| Deep-ToF [2018] | 20.1% | 47.5% | 67.6% | 32.4% | 8.40% |
| Rapid [2019] | 23.1% | 45.4% | 61.1% | 38.9% | 9.74% |
| Proposed | 51.6% | 69.1% | 77.0% | 23.0% | 7.59% |
Table 2. Quantitative Comparison Table for Proposed Neural Phase Unwrapping Method and Baselines, as Evaluated on the Synthetic Test Scenes
The \(\delta \ge 10\) metric is added to better quantify outlier performance. The best performing method is highlighted in bold.
Impact of Measurement Noise. In addition to simulating the Poisson-Gaussian noise described in Section 3, we further test our method under two additional measurement distortions. First, we simulate global phase offsets between the two high modulation frequencies. Note that we use a phase-accurate clock (SDG2042X) to mitigate phase offsets; however, GHz modulation frequencies can make the system sensitive to phase shifts. We add a \(\pm 0.01\) radian global offset to one of the phase measurements. Table 3 shows that our neural phase unwrapping is robust to such a phase shift, with a minor performance drop of less than 1% across all metrics. Second, we simulate a higher noise level of \(\sigma =2{,}000\) instead of 1,200 to mimic a low-signal scenario with strong ambient light. Again, we observe only a minor performance drop at this 1.6\(\times\) higher noise level than used in training, as shown in Table 3.
Table 3.

| Environment | \(\uparrow \delta \lt 1\) | \(\uparrow \delta \lt 2\) | \(\uparrow \delta \lt 3\) | \(\downarrow \delta \ge 3\) |
|---|---|---|---|---|
| Conventional | 51.6% | 69.1% | 77.0% | 23.0% |
| With Phase Offset | 51.0% | 68.9% | 77.0% | 23.0% |
| With Ambient Light | 42.7% | 60.4% | 70.0% | 30.0% |
Table 3. Quantitative Results for Our Phase Unwrapping Method Under Phase Offset Error and a Higher Noise Level Due to Ambient Light
The best performing method is highlighted in bold.
Training Details. For all methods except phasor imaging, we simulate measurements for two modulation frequencies: a fundamental 7.15 GHz signal and a shifted, frequency-doubled (7.15 GHz + 10 MHz) \(\times\) 2 \(=\) 14.32 GHz signal. These frequencies correspond to the ones we can realize in the experimental setup. For the phasor imaging method, we input 7.15 GHz and 7.16 GHz simulated measurements, as these are the locally optimal feasible shifts achievable with the optical amplitude modulation system. To be robust against real-world noise, we simulate measurements with sensor gain \(G=20\) and integration time \(T=1{,}000\) ms, with noise parameters \(\mu =0, \sigma =1{,}200\); see Figure 11. The models are trained for 1,000 epochs each, with 500 samples drawn per epoch, each consisting of a 512 \(\times\) 512 image and ground truth depth patch (sampled randomly from the full RGB-D datum). We use a OneCycle learning rate schedule with a decay ratio of 0.995 per epoch and an initial rate of \(10^{-3}\); training on 3 Nvidia V100 GPUs with a batch size of 12 takes approximately 24 hours. The synthetic test set consists of the 42nd frame of each simulated scene, withheld from both the training and validation sets. We balance the losses by setting \(w_{L1}=0.1\), which leads to noticeable improvements in smoothness without destabilizing the classifier early in training. During inference on a single Nvidia V100 GPU, we achieve an average runtime of 16.5 ms (\(\approx\) 60 FPS) for an image of size 256 \(\times\) 256, and 50 ms (\(\approx\) 20 FPS) at the full synthetic image size of 768 \(\times\) 1,024.
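As an illustration of this measurement simulation, the sketch below draws one noisy correlation sample under a simplified Poisson-Gaussian model with the parameters listed above; the exact image formation model is given in Section 3, and the function signature and intensity scaling here are assumptions.

```python
import numpy as np

C = 2.998e8  # speed of light [m/s]
rng = np.random.default_rng(0)

def simulate_sample(depth, intensity, f_mod, psi, gain=20, sigma=1200.0):
    """Simulate one noisy correlation sample for reference phase shift psi.

    Simplified Poisson-Gaussian model: the clean correlation signal (in
    photo-electron units, scaled by the sensor gain) receives shot noise,
    then zero-mean Gaussian read/ambient noise with std sigma is added.
    """
    phi = 4 * np.pi * f_mod * depth / C                      # round-trip phase
    clean = gain * intensity * (1.0 + np.cos(phi + psi))     # non-negative signal
    shot = rng.poisson(clean).astype(float)                  # shot noise
    return shot + rng.normal(0.0, sigma, size=np.shape(depth))
```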
Fig. 11.
Fig. 11. Noise matching example. Left: simulated synthetic data. Right: experimental measurement.

8.2 Experimental Assessment

In this section, we validate the proposed computational ToF imaging approach on experimental scenes.
Qualitative Reconstruction. We demonstrate depth captures of diverse real-world scenes in Figure 16. All scenes were captured with the galvo on the floor plane with respect to the scene, sweeping through 16 phase shifts from \(0\) to \(\pi\), corresponding to a 13 ms integration time per galvo measurement point. Note that we perform this capture procedure under strong ambient room light for all scenes, demonstrating robustness to ambient illumination. Operating outside the visible range (our system uses 532 nm for lab eye-safety reasons) and employing narrow-band spectral filters could further enhance this robustness. We use a single-frequency 7.15 GHz and frequency-doubled 14.32 GHz pair for depth measurement. Figure 16 shows that combining the proposed free-space correlation acquisition and neural unwrapping enables high-fidelity depth reconstruction of all tested objects over a wide dynamic range.
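For each scan point, amplitude and wrapped phase are obtained from the phase-shifted correlation samples; under the sinusoidal correlation model this reduces to a small least-squares fit, as in the following simplified sketch (it stands in for, but is not identical to, our full decoding pipeline).

```python
import numpy as np

def decode_phase(samples, shifts):
    """Least-squares amplitude/phase recovery from phase-shifted correlation samples.

    Assumes m_k ~ b + a*cos(phi + psi_k) for known shifts psi_k (here 16 shifts
    per scan point); any set of three or more distinct shifts is sufficient.
    """
    A = np.stack([np.cos(shifts), -np.sin(shifts), np.ones_like(shifts)], axis=1)
    coeffs, *_ = np.linalg.lstsq(A, samples, rcond=None)
    c, s, _offset = coeffs                                   # a*cos(phi), a*sin(phi), b
    amplitude = np.hypot(c, s)
    phase = np.arctan2(s, c) % (2 * np.pi)
    return amplitude, phase
```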
Compared to RF demodulation after photo-conversion, using the high-speed 12 GHz GaAs photodetector described in Section 7, the proposed method drastically outperforms post-detection demodulation across all experimental tests for an identical photon budget. We even tested a 10\(\times\) higher laser power of 30 mW with the same result, again validating the photon efficiency of the proposed free-space modulation approach. Our measured phase maps clearly show depth-dependent contours for diverse surface reflectance types (see also the Supplemental Material), demonstrating the robustness of the proposed system. Moreover, our imager handles large variations in object reflectance, from a diffuse bust and a highly specular helmet with a very small diffuse component to a textured owl object with low albedo. We evaluate the impact of our neural phase unwrapping on these challenging scenes against the existing KDE [Lawin et al. 2016], the recent learned network [Zhang et al. 2019], and micro ToF phasor unwrapping [Gupta et al. 2015]. KDE unwrapping [Lawin et al. 2016] struggles with the high modulation frequencies of the proposed system and residual measurement noise, failing to produce high-quality reconstructions. The other two methods [Gupta et al. 2015; Zhang et al. 2019] also fail to recover meaningful geometric structure, as shown in the Supplemental Document. The lookup-table-based unwrapping method of Gupta et al. [2015] fails here due to the small modulation bandwidth available in our experimental system; we note that we use the optimal frequency settings for phasor unwrapping [Gupta et al. 2015] within our operating bandwidth. Our neural phase unwrapping successfully handles the high wrap counts of the GHz regime, enabling accurate depth maps across all scenes. These experiments validate that the proposed method robustly performs high-frequency correlation depth imaging, outperforming existing acquisition approaches and phase unwrapping methods across all tested scenarios.
Validation of Correlation Profiles. We validate the functionality of the proposed imaging system by acquiring correlation measurements as a figure of merit. Specifically, we capture measurements of a static target without galvo movement while sweeping the phase of the reference signal driven by the RF driver. To this end, we place a mirror (Thorlabs PF10-03-P01) at a fixed position and uniformly sample \(\psi\) over a range of 0 to 2\(\pi\). Figure 12 confirms that the measured correlation values accurately follow the sinusoidal image formation model from Equation (20).
Fig. 12.
Fig. 12. (c) We validate our optical GHz modulation by capturing the instrument response of the complete system for a fixed mirror with varying phase shifts of the detection EOM, controlled by (a) the voltage-controlled RF drivers. (b) As predicted by the model, the amplitude measurements accurately follow a sinusoidal function, validating the effective GHz correlation mode.
Quantitative Evaluation of Depth Precision. We quantitatively evaluate the depth precision of our experimental prototype by capturing objects at known distances using a motion stage (Thorlabs MTS50/M-Z8), as shown in Figure 13 and Table 4. We control the position of a target object placed at a 60 cm distance from the setup. At this depth offset, we sweep over a 1 mm travel distance with a 50 \(\mu\)m step size (stage error 0.05 \(\mu\)m) and estimate the corresponding depth values using the proposed method. Our imaging system achieves a mean depth error of 32.5 \(\mu\)m and 32.9 \(\mu\)m for a specular mirror and a diffuse reflector, respectively.
Fig. 13.
Fig. 13. We measure the depth precision of our system for (a) a specular mirror and (b) a diffuse reflector mounted on a linear motion stage, placed at a 60 cm distance from the system. Compared to the known positions of the target objects in a sweep of the mirror and diffuse reflector, we achieve a depth accuracy of 33.5 \(\mu\)m with a standard deviation of 7.5 \(\mu\)m, outperforming the post-photoconversion RF-based method and approaching (c) optical interferometry. Note that while interferometry is limited to short travel distances and is sensitive to scene reflectance, the proposed method estimates depth effectively independently of these influences.
Furthermore, we measure the height difference between two precision-fabricated metallic gage blocks at a 100 cm distance, as shown in Figure 14. The two gage blocks (ACCUSIZE DIN861 Metric, Grade 2) are placed on a static mount. The difference of the measured depths, i.e., the height difference, is 0.4988 mm, only 1.2 \(\mu\)m off from the ground-truth 0.5 mm, demonstrating the precision of our depth acquisition. We also capture the shape of a large diffuse flat plane at a slanted angle. After obtaining the depth map from our measurements, we fit a plane to the reconstructed points; the fit achieves an R-squared value of 0.9998, demonstrating the accuracy of our method over a longer travel distance than in the translation-stage experiment.
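The slanted-plane evaluation reduces to a least-squares plane fit and a coefficient of determination, for instance as in this short sketch (variable names are ours):

```python
import numpy as np

def fit_plane_r2(points):
    """Least-squares fit of z = a*x + b*y + c to reconstructed 3D points (N, 3),
    returning the plane coefficients and the coefficient of determination R^2."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    A = np.column_stack([x, y, np.ones_like(x)])
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
    residual = z - A @ coeffs
    r2 = 1.0 - np.sum(residual ** 2) / np.sum((z - z.mean()) ** 2)
    return coeffs, r2
```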
Fig. 14.
Fig. 14. As additional illustrative validation of the depth resolution provided by the proposed method, we capture gage blocks and a slanted flat plane. Our method accurately recovers the height difference of the gage blocks. For the plane object, we fit a plane equation to the acquired 3D points and achieve an R-squared value of 0.9998. This validates the effectiveness and precision of our method independent of phase unwrapping.
Comparison with MHz Correlation ToF Imaging. The proposed method performs all-optical GHz modulation for high-resolution depth imaging. Figure 15 compares our method with conventional MHz correlation ToF imaging using a LUCID Helios Flex camera, whose illumination module of four VCSEL diodes emits a cumulative 8 mW at 850 nm; with an 8 ms exposure, this is comparable to the effective photon budget of our system, although it is less susceptible to ambient light thanks to its wavelength filter. Our estimated depth contains fine geometric details for challenging scenes at correct depth scales, whereas the MHz correlation ToF suffers from low-precision depth with mm-scale deviations on diffuse, highly reflective surfaces and larger cm to 10 cm deviations for surface areas of low reflectance. We exclude errors due to multipath effects (naturally suppressed by scanning in the proposed method) from this evaluation by focusing on convex object shapes. We note that we provide qualitative comparisons with RGB frames as reference rather than ground-truth depth, which is challenging to acquire for highly specular objects such as the helmet. Qualitatively, MHz correlation ToF fails to recover correct geometry for bright and dark spots on the owl, resulting in a 200 mm depth error (i.e., texture-dependent depth errors). The method also fails for the faint diffuse component returned from the specular helmet scene. While this trend is also confirmed in the estimated normals, we note that the holes in the helmet are “closed” by an incorrect wrapping estimate.
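The normal maps used in this comparison can be derived from a depth map by finite differences under a pinhole camera model; the sketch below shows one simple approximation suitable for visualization, with assumed known intrinsics fx and fy.

```python
import numpy as np

def normals_from_depth(depth, fx, fy):
    """Approximate per-pixel surface normals from a depth map (meters, assumed > 0)
    by finite differences, for a pinhole camera with focal lengths fx, fy in pixels."""
    dz_dv, dz_du = np.gradient(depth)        # numpy returns row (v), then column (u)
    dz_dx = dz_du * fx / depth               # convert pixel derivatives to metric ones
    dz_dy = dz_dv * fy / depth
    n = np.stack([-dz_dx, -dz_dy, np.ones_like(depth)], axis=-1)
    return n / np.linalg.norm(n, axis=-1, keepdims=True)
```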
Fig. 15.
Fig. 15. Our all-optical GHz ToF imaging captures fine geometric details that cannot be resolved by MHz correlation ToF imaging. In the MHz regime, where depth resolution and phase contrast are fundamentally limited by modulation in the analog domain instead of the optical domain, conventional correlation ToF fails to recover correct depth and fine-grained normals. Noticeably, we observe artifacts on the specular surface of the helmet, which returns only a very faint diffuse component, and texture-dependent artifacts on the bright and dark spots of the owl object.
Fig. 16.
Fig. 16. (a) We experimentally validate the proposed method on challenging real-world scenes, for which we show photographs and recovered amplitude measurements (b) and (c). While our all-optical neural ToF system captures accurate phases at 7.15 GHz and 14.32 GHz (frequency-doubled 7.16 GHz), (d) the existing GHz RF method electronically computes the correlation after photo-electron conversion (see text for details). As a result, this method struggles with the low photon flux of the returned signal, producing noisy depth reconstructions. (e) We also evaluate existing phase-unwrapping methods on the measured GHz phases, which fail for most scenes due to the high wrap counts of GHz frequencies (f) and (g). The proposed neural unwrapping method successfully resolves this issue and enables accurate geometry reconstruction, visualized as depth and normals.
Comparison to RF Demodulation and Optical Interferometry. We compare the proposed method to RF demodulation after photo-conversion and to interferometric depth estimation. For the RF demodulation comparison, we use the same high-speed 12 GHz GaAs photodetector as before; we note that this was the fastest photodiode available to us (see Section 7 and the Supplement for additional details). To compare interferometric depth estimation with the proposed method, we add a moving reference mirror and an intensity detector so that interference between the superposed reference and scene beams can be detected, as shown in Figure 13(c). To implement this approach within the same setup, we place a beam block in front of the reference mirror when operating the system in the proposed correlation mode. For a fair comparison, we unwrap the interferometric data with sequential unwrapping, which adds the smallest multiple of \(2\pi\) whenever the phase exceeds \(2\pi\).
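For a 1D sweep, a standard variant of this sequential unwrapping (essentially numpy.unwrap) can be sketched as follows; it keeps consecutive samples within \(\pi\) of each other by adding the appropriate multiple of \(2\pi\).

```python
import numpy as np

def sequential_unwrap(phases):
    """Sequentially unwrap a 1D sweep of wrapped phases (radians).

    Each sample is shifted by the multiple of 2*pi that keeps it within pi of its
    (already unwrapped) predecessor; numpy.unwrap implements the same behavior.
    """
    out = np.asarray(phases, dtype=float).copy()
    for k in range(1, len(out)):
        delta = out[k] - out[k - 1]
        out[k] -= 2 * np.pi * np.round(delta / (2 * np.pi))
    return out
```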
Table 4 shows that for a 1 mm sweep at 0.6 m distance, our proposed method with an emitter-decoder setup outperforms RF demodulation in depth accuracy. The proposed method has lower depth error than the RF method in both RMSE and MAE for the specular reflector shown in (a) and the diffuse reflector shown in (b). The depth estimates for all methods are shown over the 1 mm range against ground truth for both specular and diffuse reflectors. We validate that, while post-photoconversion demodulation performs well at high flux levels, typical diffuse scenes result in low photon counts that are challenging to sense at high frequencies. As such, the RF demodulation approach fails for the diffuse scene object. We note that, in contrast to the direct fast sampling at rates above 10 GHz required by the RF setup, our all-optical sensing lets us get away with low-frequency kHz sampling (six orders of magnitude slower) at high SNR.
Table 4.

| Method | RMSE Mirror | MAE Mirror | RMSE Diffuse | MAE Diffuse |
|---|---|---|---|---|
| Interferometry | 20 \(\mu\)m | 20 \(\mu\)m | 14 \(\mu\)m | 14 \(\mu\)m |
| RF | 49.5 \(\mu\)m | 48.8 \(\mu\)m | 11,800 \(\mu\)m | 11,800 \(\mu\)m |
| Proposed | 33.5 \(\mu\)m | 32.5 \(\mu\)m | 34.6 \(\mu\)m | 32.9 \(\mu\)m |
Table 4. Quantitative Comparison of the Proposed Method for Diffuse and Specular Objects Corresponding to the Measurements in Figure 13
The RMSE and MAE for the interferometry, RF, and proposed methods are listed in Table 4 for the specular and diffuse reflectors. Interferometric depth estimation performs best in terms of RMSE and MAE for both reflectors, as expected. In this experiment, the proposed method achieves a depth precision of around 30 microns. We note that the optical interferometry setup is extremely sensitive to scene scale and system vibrations, to the point where measurements had to be performed remotely from outside the lab and repeated multiple times due to tiny measurement fluctuations.

9 Discussion

We have introduced a computational imaging method that presents a complementary direction to existing ToF methods. Specifically, we have jointly designed the optics, sensing, and neural network reconstruction such that computation that is typically done on the sensor, or digitally after sensing, is executed optically on the incident photon stream. In doing so, we introduce electro-optical modulation concepts from optics to the graphics and imaging community, while devising a new method for two-pass modulation and a new method for unwrapping high-frequency phase measurements. Although we validate experimentally and in simulation that our system performs effective centimeter-wave ToF depth imaging, as a nascent technology our work also leaves the reader with some open questions regarding its future, which we discuss below.
Implementing Array Sensors. We have opted for sequential point-wise scanning using a galvo system because the beam diameter passing through our EOMs is limited by their small active area of 2.5 \(\times\) 2.5 mm. An alternative implementation, requiring further engineering effort, is to use telescope optics to spatially expand the EOM-modulated light, exploiting the fact that correlation ToF only requires global intensity modulation rather than per-pixel intensity control; see also [Kim et al. 2019].
Flood-Illumination and Multi-Path Interference. As our prototype performs point-wise scanning, direct reflection dominates the measurement, which mitigates multi-path interference. When implementing the proposed system with flood illumination in the future, using 2D array sensing with large-area EOMs, retraining the network on flood-illuminated data may be an immediate solution to multi-path interference. We note that the proposed high-frequency modulation may already provide sufficient robustness to the multi-path problem [Gupta et al. 2015].
Generalization to Complex Geometry and Reflectance. For scenes with simple shapes, such as the moving planar targets in Figure 13 and the gage blocks and slanted plane in Figure 14, we demonstrate micron-scale depth resolution. In the future, we hope this approach can be extended to resolve micron-scale features in more complex scenes. While the proposed method outperforms previous methods for complex macroscopic scenes, capturing accurate depth remains challenging; local geometries induce phase noise as the angled light beams are integrated over uneven depths. Narrow-beam sampling or flood-illuminated setups with array sensing might be a hardware solution to this challenge. In addition, more accurate ground truth sensing in the fine-tuning step might overcome this domain-gap issue in the neural network reconstruction.
Phase Unwrapping and Denoising. The proposed neural unwrapping method exploits the ordinal nature of the wrap counts and segmentation-based image semantics to recover dozens of wrap counts, while existing methods fail for more than a handful. While this approach shares some similarities with denoising in that we want to recover clean phase measurements from noisy readings, it does so in a joint manner. Rather than performing denoising and unwrapping sequentially, the proposed network ingests both correlation measurements simultaneously and can use their joint information—and independent noise distributions—to inform unwrapping. In this way, we avoid accidentally denoising phase measurements into the wrong wrap count bin.

10 Conclusion

We propose a computational ToF imaging method that correlates light all-optically at centimeter-wave frequencies, without fiber coupling or photo-conversion, enabling high-SNR sensing at more than 10 GHz modulation frequency. To this end, we solve two technical challenges: modulating without large signal losses at GHz rates, and unwrapping phase at these rates, which renders conventional phase unwrapping methods ineffective. Specifically, we propose a two-pass intensity modulation with free-space EOMs and polarizing optics, which works in tandem with a neural phase unwrapping method to handle the high wrap counts of GHz-frequency measurements, on the order of dozens of wraps. The resulting imaging method achieves ToF imaging with centimeter-wave intensity modulation for macroscopic scenes, robust to materials of low reflectance, highly specular materials, and ambient light. We demonstrate accurate depth reconstructions, outperforming existing phase-unwrapping and post-photo-conversion ToF methods in all synthetic and real-world experiments. Our approach makes a step toward closing the gap between interferometric and correlation ToF. Our method optically performs computation that has traditionally been done during or after the sensing process. As such, we envision that the proposed approach could serve in the future as an optical compute block for a diverse array of tasks, including velocity imaging, transient imaging, non-line-of-sight imaging, and imaging in scattering media, with the potential to fuel imaging of ultrafast phenomena across disciplines.

Supplementary Material

tog-21-0071-File004 (tog-21-0071-file004.mp4)
Supplemental video

References

[1]
Supreeth Achar, Joseph R. Bartels, William L.’Red’ Whittaker, Kiriakos N. Kutulakos, and Srinivasa G. Narasimhan. 2017. Epipolar time-of-flight imaging. ACM Transactions on Graphics 36, 4 (2017), 1–8.
[2]
Narendra Ahuja and A. Lynn Abbott. 1993. Active stereo: Integrating disparity, vergence, focus, aperture and calibration for surface estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence 15, 10 (1993), 1007–1029.
[3]
M.-C. Amann. 1992. Phase noise limited resolution of coherent LIDAR using widely tunable laser diodes. Electronics Letters 28, 18 (1992), 1694–1696.
[4]
Yatong An, Jae-Sang Hyun, and Song Zhang. 2016. Pixel-wise absolute phase unwrapping using geometric constraints of structured light system. Optics Express 24, 16 (2016), 18445–18459.
[5]
Brian F. Aull, Andrew H. Loomis, Douglas J. Young, Richard M. Heinrichs, Bradley J. Felton, Peter J. Daniels, and Deborah J. Landers. 2002. Geiger-mode avalanche photodiodes for three-dimensional imaging. Lincoln Laboratory Journal 13, 2 (2002), 335–349.
[6]
Seung-Hwan Baek, Diego Gutierrez, and Min H. Kim. 2016. Birefractive stereo imaging for single-shot depth acquisition. ACM Transactions on Graphics 35, 6 (2016), 1–11.
[7]
Seung-Hwan Baek and Felix Heide. 2021. Polka lines: Learning structured illumination and reconstruction for active stereo. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 5757–5767.
[8]
Cyrus S. Bamji, Swati Mehta, Barry Thompson, Tamer Elkhatib, Stefan Wurster, Onur Akkaya, Andrew Payne, John Godbaz, Mike Fenton, Vijay Rajasekaran, Larry Prather, Satya Nagaraja, Vishali Mogallapu, Dane Snow, Rich McCauley, Mustansir Mukadam, Iskender Agi, Shaun McCarthy, Zhanping Xu, Travis Perry, William Qian, Vei-Han Chan, Prabhu Adepu, Gazi Ali, Muneeb Ahmed, Aditya Mukherjee, Sheethal Nayak, Dave Gampell, Sunil Acharya, Lou Kordus, and Pat O’Connor. 2018. IMpixel 65nm BSI 320MHz demodulated TOF Image sensor with 3\(\mu\)m global shutter pixels and analog binning. In Proceedings of the 2018 IEEE International Solid-State Circuits Conference. IEEE, 94–96.
[9]
Sankhyabrata Bandyopadhyay, Li-yang Shao, Wang Chao, Zhijun Yan, Fei Hong, Guoqing Wang, Jiahao Jiang, Ping Shum, Xiaoping Hong, and Weizhi Wang. 2020. Highly efficient free-space fiber coupler with 45\(^{\circ }\) tilted fiber grating to access remotely placed optical fiber sensors. Optics Express 28, 11 (2020), 16569–16578.
[10]
Behnam Behroozpour, Phillip A. M. Sandborn, Niels Quack, Tae-Joon Seok, Yasuhiro Matsui, Ming C. Wu, and Bernhard E. Boser. 2016. Electronic-photonic integrated circuit for 3D microimaging. IEEE Journal of Solid-State Circuits 52, 1 (2016), 161–172.
[11]
Behnam Behroozpour, Phillip A. M. Sandborn, Ming C. Wu, and Bernhard E. Boser. 2017. Lidar system architectures and circuits. IEEE Communications Magazine 55, 10 (2017), 135–142.
[12]
Ayush Bhandari, Achuta Kadambi, Refael Whyte, Christopher Barsi, Micha Feigin, Adrian Dorrington, and Ramesh Raskar. 2014. Resolving multipath interference in time-of-flight imaging via modulation frequency diversity and sparse regularization. Optics Letters 39, 6 (2014), 1705–1708.
[13]
Shariq Farooq Bhat, Ibraheem Alhashim, and Peter Wonka. 2021. Adabins: Depth estimation using adaptive bins. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 4009–4018.
[14]
Jose Bioucas-Dias, Vladimir Katkovnik, Jaakko Astola, and Karen Egiazarian. 2008. Absolute phase estimation: Adaptive local denoising and global unwrapping. Applied Optics 47, 29 (2008), 5358–5369.
[15]
José Bioucas-Dias, Vladimir Katkovnik, Jaakko Astola, and Karen Egiazarian. 2009. Multi-frequency phase unwrapping from noisy data: Adaptive local maximum likelihood approach. In Proceedings of the Scandinavian Conference on Image Analysis. Springer, 310–320.
[16]
Jos M. Bioucas-Dias and Gonalo Valadao. 2007. Phase unwrapping via graph cuts. IEEE Transactions on Image Processing 16, 3 (2007), 698–709.
[17]
Danilo Bronzi, Yu Zou, Federica Villa, Simone Tisa, Alberto Tosi, and Franco Zappa. 2015. Automotive three-dimensional vision through a single-photon counting SPAD camera. IEEE Transactions on Intelligent Transportation Systems 17, 3 (2015), 782–795.
[18]
Clara Callenberg, Zheng Shi, Felix Heide, and Matthias B. Hullin. 2021. Low-cost SPAD sensing for non-line-of-sight tracking, material classification and depth imaging. ACM Transactions on Graphics 40, 4 (2021), 1–12.
[19]
Angel X. Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, Jianxiong Xiao, Li Yi, and Fisher Yu. 2015. ShapeNet: An Information-Rich 3D Model Repository. Technical Report arXiv:1512.03012 [cs.GR]. Stanford University — Princeton University — Toyota Technological Institute at Chicago.
[20]
Zhen Chen, B. O. Liu, Shengjie Wang, and Enhai Liu. 2018. Polarization-modulated three-dimensional imaging using a large-aperture electro-optic modulator. Applied Optics 57, 27 (2018), 7750–7757.
[21]
Edward Collett. 2005. Field Guide to Polarization. Spie Bellingham, WA.
[22]
Sergio Cova, Massimo Ghioni, Andrea Lacaita, Carlo Samori, and Franco Zappa. 1996. Avalanche photodiodes and quenching circuits for single-photon detection. Applied Optics 35, 12 (1996), 1956–1976.
[23]
Ryan Crabb and Roberto Manduchi. 2015. Fast single-frequency time-of-flight range imaging. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 58–65.
[24]
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, and Matthias Nießner. 2017. Scannet: Richly-annotated 3d reconstructions of indoor scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 5828–5839.
[25]
Angela Dai, Daniel Ritchie, Martin Bokeloh, Scott Reed, Jürgen Sturm, and Matthias Nießner. 2018. Scancomplete: Large-scale scene completion and semantic segmentation for 3d scans. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4578–4587.
[26]
Adrian A. Dorrington, John P. Godbaz, Michael J. Cree, Andrew D. Payne, and Lee V. Streeter. 2011. Separating true range measurements from multi-path and scattering interference in commercial range cameras. In Proceedings of the Three-Dimensional Imaging, Interaction, and Measurement. International Society for Optics and Photonics, 786404.
[27]
David Droeschel, Dirk Holz, and Sven Behnke. 2010. Multi-frequency phase unwrapping for time-of-flight cameras. In Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 1463–1469.
[28]
Sergi Foix, Guillem Alenya, and Carme Torras. 2011. Lock-in time-of-flight (ToF) cameras: A survey. IEEE Sensors Journal 11, 9 (2011), 1917–1926.
[29]
Daniel Freedman, Yoni Smolin, Eyal Krupka, Ido Leichter, and Mirko Schmidt. 2014. SRA: Fast removal of general multipath for ToF sensors. In Proceedings of the European Conference on Computer Vision. Springer, 234–249.
[30]
K. D. Froome and R. H. Bradsell. 1961. Distance measurement by means of a light ray modulated at a microwave frequency. Journal of Scientific Instruments 38, 12 (1961), 458.
[31]
Stefan Fuchs. 2010. Multipath interference compensation in time-of-flight camera images. In Proceedings of the 2010 20th International Conference on Pattern Recognition. IEEE, 3583–3586.
[32]
James Fujimoto and Eric Swanson. 2016. The development, commercialization, and impact of optical coherence tomography. Investigative Ophthalmology & visual Science 57, 9 (2016), OCT1–OCT13.
[33]
Shuang Gao and Rongqing Hui. 2012. Frequency-modulated continuous-wave lidar using I/Q modulator for simplified heterodyne detection. Optics Letters 37, 11 (2012), 2022–2024.
[34]
Rahul Garg, Neal Wadhwa, Sameer Ansari, and Jonathan T. Barron. 2019. Learning single camera depth estimation using dual-pixels. In Proceedings of the IEEE International Conference on Computer Vision. 7628–7637.
[35]
Dennis C. Ghiglia and Mark D. Pritt. 1998. Two-dimensional Phase Unwrapping: Theory, Algorithms, and Software. Wiley, New York.
[36]
Ioannis Gkioulekas, Anat Levin, Frédo Durand, and Todd Zickler. 2015. Micron-scale light transport decomposition using interferometry. ACM Transactions on Graphics 34, 4 (2015), 1–14.
[37]
Anant Gupta, Atul Ingle, and Mohit Gupta. 2019a. Asynchronous single-photon 3D imaging. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 7909–7918.
[38]
Anant Gupta, Atul Ingle, Andreas Velten, and Mohit Gupta. 2019b. Photon-flooded single-photon 3D cameras. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 6770–6779.
[39]
Mohit Gupta, Shree K. Nayar, Matthias B. Hullin, and Jaime Martin. 2015. Phasor imaging: A generalization of correlation-based time-of-flight imaging. ACM Transactions on Graphics 34, 5 (2015), 1–18.
[40]
Mohit Gupta, Andreas Velten, Shree K. Nayar, and Eric Breitbach. 2018. What are optimal coding functions for time-of-flight imaging? ACM Transactions on Graphics 37, 2 (2018), 18 pages. DOI:
[41]
Felipe Gutierrez-Barragan, Syed Azer Reza, Andreas Velten, and Mohit Gupta. 2019. Practical coding function design for time-of-flight imaging. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1566–1574.
[42]
Thorsten Ringbeck. 2007. A 3D time of flight camera for object detection. Measurement 9 (01 2007).
[43]
Miles Hansard, Seungkyu Lee, Ouk Choi, and Radu Patrice Horaud. 2012. Time-of-flight Cameras: Principles, Methods and Applications. Springer Science & Business Media.
[44]
Parameswaran Hariharan. 2003. Optical Interferometry, 2e. Elsevier.
[45]
Felix Heide, Steven Diamond, David B. Lindell, and Gordon Wetzstein. 2018. Sub-picosecond photon-efficient 3D imaging using single-photon sensors. Scientific Reports 8, 1 (2018), 1–8.
[46]
Felix Heide, Matthias B. Hullin, James Gregson, and Wolfgang Heidrich. 2013. Low-budget transient imaging using photonic mixer devices. ACM Transactions on Graphics 32, 4 (2013), 1–10.
[47]
Felix Heide, Matthew O’Toole, Kai Zang, David B. Lindell, Steven Diamond, and Gordon Wetzstein. 2019. Non-line-of-sight imaging with partial occluders and surface normals. ACM Transactions on Graphics 38, 3 (2019), 1–10.
[48]
Felix Heide, Lei Xiao, Andreas Kolb, Matthias B. Hullin, and Wolfgang Heidrich. 2014. Imaging in scattering media using correlation image sensors and sparse convolutional coding. Optics Express 22, 21 (2014), 26338–26350.
[49]
Steven Hickson, Stan Birchfield, Irfan Essa, and Henrik Christensen. 2014. Efficient hierarchical graph-based segmentation of RGBD videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 344–351.
[50]
Heiko Hirschmuller. 2005. Accurate and efficient stereo processing by semi-global matching and mutual information. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE, 807–814.
[51]
David Huang, Eric A. Swanson, Charles P. Lin, Joel S. Schuman, William G. Stinson, Warren Chang, Michael R. Hee, Thomas Flotte, Kenton Gregory, Carmen A. Puliafito, and James G. Fujimoto. 1991. Optical coherence tomography. science 254, 5035 (1991), 1178–1181.
[52]
Darren D. Hudson, Kevin W. Holman, R. Jason Jones, Steven T. Cundiff, Jun Ye, and David J. Jones. 2005. Mode-locked fiber laser frequency-controlled with an intracavity electro-optic modulator. Optics Letters 30, 21 (2005), 2948–2950.
[53]
Shahram Izadi, David Kim, Otmar Hilliges, David Molyneaux, Richard Newcombe, Pushmeet Kohli, Jamie Shotton, Steve Hodges, Dustin Freeman, Andrew Davison, et al. 2011. KinectFusion: Real-time 3D reconstruction and interaction using a moving depth camera. In Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology. 559–568.
[54]
David Jiménez, Daniel Pizarro, Manuel Mazo, and Sira Palazuelos. 2014. Modeling and correction of multipath interference in time of flight cameras. Image and Vision Computing 32, 1 (2014), 1–13.
[55]
Achuta Kadambi and Ramesh Raskar. 2017. Rethinking machine vision time of flight with GHz heterodyning. IEEE Access 5 (2017), 26211–26223. DOI:
[56]
Achuta Kadambi, Refael Whyte, Ayush Bhandari, Lee Streeter, Christopher Barsi, Adrian Dorrington, and Ramesh Raskar. 2013. Coded time of flight cameras: Sparse deconvolution to address multipath interference and recover time profiles. ACM Transactions on Graphics 32, 6 (2013), 10 pages. DOI:
[57]
Achuta Kadambi, Hang Zhao, Boxin Shi, and Ramesh Raskar. 2016. Occluded imaging with time-of-flight sensors. ACM Transactions on Graphics 35, 2 (2016), 12 pages. DOI:
[58]
Daehee Kim, Yang Lu, Jiyong Park, Byunggi Kim, Liping Yan, Liandong Yu, Ki-Nam Joo, and Seung-Woo Kim. 2019. Rigorous single pulse imaging for ultrafast interferometric observation. Optics Express 27, 14 (2019), 19758–19767.
[59]
Ahmed Kirmani, Arrigo Benedetti, and Philip A. Chou. 2013. Spumic: Simultaneous phase unwrapping and multipath interference cancellation in time-of-flight cameras using spectral methods. In Proceedings of the 2013 IEEE International Conference on Multimedia and Expo. IEEE, 1–6.
[60]
Alankar Kotwal, Anat Levin, and Ioannis Gkioulekas. 2020. Interferometric transmission probing with coded mutual intensity. ACM Transactions on Graphics 39, 4 (2020), Article 74.
[61]
Robert Lange. 2000. 3D time-of-flight distance measurement with custom solid-state image sensors in CMOS/CCD-technology. Ph.D. Dissertation, University of Siegen.
[62]
Robert Lange and Peter Seitz. 2001. Solid-state time-of-flight range camera. IEEE Journal of Quantum Electronics 37, 3 (2001), 390–397.
[63]
Felix Järemo Lawin, Per-Erik Forssén, and Hannes Ovrén. 2016. Efficient multi-frequency phase unwrapping using kernel density estimation. In Proceedings of the European Conference on Computer Vision. Springer, 170–185.
[64]
Nalpantidis Lazaros, Georgios Christou Sirakoulis, and Antonios Gasteratos. 2008. Review of stereo vision algorithms: From software to hardware. International Journal of Optomechatronics 2, 4 (2008), 435–462.
[65]
R. Leitgeb, C. K. Hitzenberger, and Adolf F. Fercher. 2003. Performance of fourier domain vs. time domain optical coherence tomography. Optics Express 11, 8 (2003), 889–894.
[66]
Fengqiang Li, Florian Willomitzer, Prasanna Rangarajan, Mohit Gupta, Andreas Velten, and Oliver Cossairt. 2018. SH-ToF: Micro resolution time-of-flight imaging with superheterodyne interferometry. In Proceedings of the 2018 IEEE International Conference on Computational Photography. IEEE.
[67]
Reza Mahjourian, Martin Wicke, and Anelia Angelova. 2018. Unsupervised learning of depth and ego-motion from monocular video using 3d geometric constraints. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 5667–5675.
[68]
Riccardo Marchetti, Cosimo Lacava, Ali Khokhar, Xia Chen, Ilaria Cristiani, David J. Richardson, Graham T. Reed, Periklis Petropoulos, and Paolo Minzioni. 2017. High-efficiency grating-couplers: Demonstration of a new design strategy. Scientific Reports 7, 1 (2017), 1–8.
[69]
Aongus McCarthy, Robert J. Collins, Nils J. Krichel, Verónica Fernández, Andrew M. Wallace, and Gerald S. Buller. 2009. Long-range time-of-flight scanning sensor based on high-speed time-correlated single-photon counting. Applied Optics 48, 32 (2009), 6241–6251.
[70]
Andreas Meuleman, Seung-Hwan Baek, Felix Heide, and Min H. Kim. 2020. Single-shot monocular RGB-D imaging using uneven double refraction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2465–2474.
[71]
Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. 2020. Nerf: Representing scenes as neural radiance fields for view synthesis. Commun. ACM 65, 1 (2021), 99–106.
[72]
U. Minoni, E. Sardini, E. Gelmini, F. Docchio, and D. Marioli. 1991. A high-frequency sinusoidal phase-modulation interferometer using an electro-optic modulator: Development and evaluation. Review of Scientific Instruments 62, 11 (1991), 2579–2583.
[73]
Nikhil Naik, Achuta Kadambi, Christoph Rhemann, Shahram Izadi, Ramesh Raskar, and Sing Bing Kang. 2015. A light transport model for mitigating multipath interference in time-of-flight sensors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[74]
Cristiano Niclass, Alexis Rochas, P.-A. Besse, and Edoardo Charbon. 2005. Design and characterization of a CMOS 3-D image sensor based on single photon avalanche diodes. IEEE Journal of Solid-State Circuits 40, 9 (2005), 1847–1854.
[75]
Matthew O’Toole, David B. Lindell, and Gordon Wetzstein. 2018. Confocal non-line-of-sight imaging based on the light-cone transform. Nature 555, 7696 (2018), 338.
[76]
Gaurav Pandey, James R. McBride, and Ryan M. Eustice. 2011. Ford campus vision and lidar data set. The International Journal of Robotics Research 30, 13 (2011), 1543–1552.
[77]
Dingyi Pei, Arto Salomaa, and Cunsheng Ding. 1996. Chinese Remainder Theorem: Applications in Computing, Coding, Cryptography. World Scientific.
[78]
Christopher T. Phare, Yoon-Ho Daniel Lee, Jaime Cardenas, and Michal Lipson. 2015. Graphene electro-optic modulator with 30 GHz bandwidth. Nature Photonics 9, 8 (2015), 511–514.
[79]
Rudra P. K. Poudel, Stephan Liwicki, and Roberto Cipolla. 2019. Fast-SCNN: Fast semantic segmentation network. In Proceedings of the British Machine Vision Conference.
[80]
Fabio Remondino and David Stoppa. 2013. TOF Range-imaging Cameras, Vol. 2. Springer.
[81]
Mike Roberts, Jason Ramapuram, Anurag Ranjan, Atulit Kumar, Miguel Angel Bautista, Nathan Paczan, Russ Webb, and Joshua M. Susskind. 2021. Hypersim: A photorealistic synthetic dataset for holistic indoor scene understanding. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 10912–10922.
[82]
Alexis Rochas, Michael Gösch, Alexandre Serov, Pierre-André Besse, Rade S. Popovic, Theo Lasser, and Rudolf Rigler. 2003. First fully integrated 2-D array of single-photon detectors in standard CMOS technology. IEEE Photonics Technology Letters 15, 7 (2003), 963–965.
[83]
Christopher Rogers, Alexander Y. Piggott, David J. Thomson, Robert F. Wiser, Ion E. Opris, Steven A. Fortune, Andrew J. Compston, Alexander Gondarenko, Fanfan Meng, Xia Chen, Graham T. Reed, and Remus Nicolaescu. 2021. A universal 3D imaging sensor on a silicon photonics platform. Nature 590, 7845 (2021), 256–261.
[84]
Olaf Ronneberger, Philipp Fischer, and Thomas Brox. 2015. U-Net: Convolutional networks for biomedical image segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 234–241.
[85]
Phillip A. M. Sandborn, Noriaki Kaneda, Young-Kai Chen, and Ming C. Wu. 2016. Dual-sideband linear FMCW lidar with homodyne detection for application in 3d imaging. In Proceedings of the 2016 Conference on Lasers and Electro-Optics. IEEE, 1–2.
[86]
Ashutosh Saxena, Sung H. Chung, and Andrew Y. Ng. 2005. Learning depth from single monocular images. In Proceedings of the NIPS, Vol. 18, 1–8.
[87]
Daniel Scharstein and Richard Szeliski. 2003. High-accuracy stereo depth maps using structured light. In Proceedings of the 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE, I–I.
[88]
Brent Schwarz. 2010. LIDAR: Mapping the world in 3D. Nature Photonics 4, 7 (2010), 429.
[89]
Shikhar Shrestha, Felix Heide, Wolfgang Heidrich, and Gordon Wetzstein. 2016. Computational imaging with multi-camera time-of-flight systems. ACM Transactions on Graphics 35, 4 (2016), 1–11.
[90]
Nathan Silberman, Derek Hoiem, Pushmeet Kohli, and Rob Fergus. 2012. Indoor segmentation and support inference from rgbd images. In Proceedings of the European Conference on Computer Vision. Springer, 746–760.
[91]
Nikolai Smolyanskiy, Alexey Kamenev, and Stan Birchfield. 2018. On the importance of stereo for accurate depth estimation: An efficient semi-supervised deep neural network approach. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 1007–1015.
[92]
Shuran Song, Samuel P. Lichtenberg, and Jianxiong Xiao. 2015. Sun rgb-d: A rgb-d scene understanding benchmark suite. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 567–576.
[93]
Shuochen Su, Felix Heide, Robin Swanson, Jonathan Klein, Clara Callenberg, Matthias Hullin, and Wolfgang Heidrich. 2016. Material classification using raw time-of-flight measurements. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3503–3511.
[94]
Shuochen Su, Felix Heide, Gordon Wetzstein, and Wolfgang Heidrich. 2018. Deep end-to-end time-of-flight imaging. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 6383–6392.
[95]
Murali Subbarao and Gopal Surya. 1994. Depth from defocus: A spatial domain approach. International Journal of Computer Vision 13, 3 (1994), 271–294.
[96]
Zhaoyang Tai, Lulu Yan, Yanyan Zhang, Xiaofei Zhang, Wenge Guo, Shougang Zhang, and Haifeng Jiang. 2016. Electro-optic modulator with ultra-low residual amplitude modulation for frequency modulation and laser stabilization. Optics Letters 41, 23 (2016), 5584–5587.
[97]
Matthew Tancik, Pratul P. Srinivasan, Ben Mildenhall, Sara Fridovich-Keil, Nithin Raghavan, Utkarsh Singhal, Ravi Ramamoorthi, Jonathan T. Barron, and Ren Ng. 2020. Fourier features let networks learn high frequency functions in low dimensional domains. Advances in Neural Information Processing Systems 33 (2020), 7537–7547.
[98]
Michal Tölgyessy, Martin Dekan, Ľuboš Chovanec, and Peter Hubinský. 2021. Evaluation of the Azure Kinect and its comparison to Kinect V1 and Kinect V2. Sensors 21, 2 (2021), 413.
[99]
Shubham Tulsiani, Alexei A. Efros, and Jitendra Malik. 2018. Multi-view consistency as supervisory signal for learning shape and pose prediction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2897–2905.
[100]
Kaiqiang Wang, Ying Li, Qian Kemao, Jianglei Di, and Jianlin Zhao. 2019. One-step robust deep learning phase unwrapping. Optics Express 27, 10 (2019), 15100–15115.
[101]
Xiang-Gen Xia and Genyuan Wang. 2007. Phase unwrapping and a robust Chinese remainder theorem. IEEE Signal Processing Letters 14, 4 (2007), 247–250.
[102]
Amnon Yariv. 1967. Quantum Electronics. Wiley.
[103]
Amnon Yariv and Pochi Yeh. 2007. Photonics: Optical Electronics in Modern Communications. Oxford University Press.
[104]
Teng Zhang, Shaowei Jiang, Zixin Zhao, Krishna Dixit, Xiaofei Zhou, Jia Hou, Yongbing Zhang, and Chenggang Yan. 2019. Rapid and robust two-dimensional phase unwrapping via deep learning. Optics Express 27, 16 (2019), 23173–23185.
