CMOS Imager With 1024 SPADs and TDCs for Single-Photon Timing and 3-D Time-of-Flight

IEEE Journal of Selected Topics in Quantum Electronics, 2014
IEEE JOURNAL OF SELECTED TOPICS IN QUANTUM ELECTRONICS, VOL. 20, NO. 6, NOVEMBER/DECEMBER 2014, 3804810

Federica Villa, Rudi Lussana, Danilo Bronzi, Student Member, IEEE, Simone Tisa, Alberto Tosi, Member, IEEE, Franco Zappa, Senior Member, IEEE, Alberto Dalla Mora, Davide Contini, Daniel Durini, Member, IEEE, Sasha Weyers, and Werner Brockherde

Abstract: We present a CMOS imager consisting of 32 × 32 smart pixels, each one able to detect single photons in the 300–900 nm wavelength range and to perform both photon-counting and photon-timing operations on very fast optical events with faint intensities. In photon-counting mode, the imager provides photon-number (i.e., intensity) resolved movies of the scene under observation, up to 100 000 frames/s. In photon-timing mode, the imager provides photon arrival times with 312 ps resolution. The results are videos with either time-resolved (e.g., fluorescence) maps of a sample, or 3-D depth-resolved maps of a target scene. The imager is fabricated in a cost-effective 0.35-µm CMOS technology, automotive certified. Each pixel consists of a single-photon avalanche diode with 30 µm photoactive diameter, coupled to an in-pixel 10-bit time-to-digital converter with 320-ns full-scale range, an INL of 10% LSB and a DNL of 2% LSB. The chip operates in global shutter mode, with full frame times down to 10 µs and just 1-ns conversion time. The reconfigurable imager design enables a broad set of applications, like time-resolved spectroscopy, fluorescence lifetime imaging, diffusive optical tomography, molecular imaging, time-of-flight 3-D ranging and atmospheric layer sensing through LIDAR.

Index Terms: Photon counting, CMOS imagers, single-photon avalanche diode (SPAD), 2-D imaging, 3-D ranging, time-of-flight, photon tagging, time-correlated single-photon counting (TCSPC), light detection and ranging (LIDAR).

Manuscript received January 31, 2014; revised June 11, 2014; accepted July 17, 2014. This work was supported by the “MiSPiA” Project, under the EC FP7-ICT Framework, G.A. No. 257646.
F. Villa, R. Lussana, D. Bronzi, A. Tosi, and F. Zappa are with the Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, I-20133 Milano, Italy (e-mail: federica.villa@polimi.it; rudi.lussana@polimi.it; danilo.bronzi@polimi.it; alberto.tosi@polimi.it; franco.zappa@polimi.it).
S. Tisa is with Micro Photon Device srl, Bolzano I-39100, Italy (e-mail: stisa@micro-photon-devices.com).
A. Dalla Mora and D. Contini are with the Dipartimento di Fisica, Politecnico di Milano, I-20133 Milano, Italy (e-mail: alberto.dallamora@polimi.it; davide.contini@polimi.it).
D. Durini is with Forschungszentrum Jülich, ZEA-2 Jülich, Germany (e-mail: d.durini@fz-juelich.de).
S. Weyers and W. Brockherde are with the Fraunhofer Institute for Microelectronic Circuits and Systems IMS, D-47057 Duisburg, Germany (e-mail: sascha.weyers@ims.fraunhofer.de; werner.brockherde@ims.fraunhofer.de).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JSTQE.2014.2342197
1077-260X © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

The last decades saw a growing interest in the scientific community toward single-photon time-resolved measurements of very faint and fast optical signals for safety and security, medical and biological applications. High-performance, ultra-compact, multi-channel instruments capable of measuring photon arrival times with a time jitter better than a few hundred picoseconds are required in many fields, such as time-of-flight (TOF) measurement in 3-D depth ranging and light detection and ranging (LIDAR), and TOF of gamma rays in advanced TOF-resolved positron emission tomography [1] in nuclear medicine imaging, just to mention a few. Other applications require an accurate waveform (intensity versus time) reconstruction of fast and faint optical signals through repetitive excitation of the sample under investigation by means of time-correlated single-photon counting (TCSPC) [2], [3]. The breakthrough is a multi-pixel sensor that combines single-photon sensitivity and highly precise timing electronics in a single chip. Furthermore, demanding TCSPC applications require extremely good accuracy, with integral (INL) and differential (DNL) non-linearities much better than ½ least-significant bit (LSB), i.e., the timing resolution.

Arrays of single-photon avalanche diodes (SPADs) are the best candidates because they have no readout noise, unlike CCDs; no limited sensitivity, unlike CMOS active pixel sensors (APS); and no excess noise, which is typical of analog avalanche photodiode arrays. SPAD arrays provide true single-photon sensitivity and high frame rates, thus bridging the gap between high-sensitivity (but low-speed) CCD imagers and high-speed (but low-sensitivity) CMOS APS sensors.

Arrays of SPADs with time-to-digital converters (TDCs) have already been reported in the literature [4]–[8]. Some of them present a multiplexed architecture, in which one TDC is shared among tens of SPADs [4], [5]. For instance, Ref. [4] reports a large number of SPADs (128 × 128), but the image resolution is low because each column works as a single macro-pixel. Other papers present fully parallel in-pixel electronics with good timing resolution but poor linearity, for example 52 ps with 40% DNL and 140% INL [6], 55 ps with 30% DNL and 250% INL [7], or 119 ps with 50% DNL and 120% INL [8]. Many SPAD arrays show non-uniform detection performance and also low uniformity in the measured arrival time. In Ref. [7], the peak photon detection efficiency (PDE) varies from 3% to 27.5% depending on the pixel. Ref. [6] and Ref. [8] have a pixel-to-pixel TDC non-uniformity larger than the LSB (8 LSB and 2 LSB, respectively), thus compensation is needed to actually exploit their timing resolution. Despite the high dark count rate (DCR) density of those SPADs, the DCR is kept to 50 cps [7] and 100 cps [8] by using a very small active diameter (5.6 µm [7] and 8.6 µm [8]).

Arrays with fully parallel electronics, i.e., one TDC per pixel, have a low fill-factor (e.g., 1% [7], 2.3% [8]), hence a low overall active area despite the large number of pixels (0.5 mm² [7], 0.06 mm² [8]). Furthermore, the peak PDE is lower than 30% and drops below 5% for wavelengths shorter than 350 nm and longer than 800 nm. The limited overall active area and PDE prevent their use in photon-starved applications, such as optical time-domain reflectometry [9], in which large collection areas are required. In Ref. [10], a high fill-factor is achieved, but with vertical integration of two wafers (one for SPADs and another one for timing electronics) and just 2 ns timing resolution.

In this paper we present a single-chip CMOS imager, fabricated in a low-cost 0.35 µm CMOS technology, consisting of 32 × 32 pixels with 150 µm pitch, each able to detect single photons (in the 300–900 nm wavelength range), to time-stamp their arrival times (with 312 ps resolution) for acquiring waveforms and time-resolved maps, and to count photons for providing photon-number (i.e., intensity) resolved 2-D videos (e.g., very useful for properly aligning the sensor with the optical setup).
Compared to arrays designed in more scaled technologies, the developed imager has worse timing resolution, but reaches a 10× better linearity, a larger overall active area (0.7 mm²) and better SPAD performance (much lower DCR, with only 3% hot pixels, and excellent PDE). Concerning detection performance, we designed for a large active area (both at pixel and overall array level) and high PDE, in a well-assessed low-DCR technology, aiming at advanced TCSPC applications, where uniformity and sensitivity are essential requirements. An in-depth optical characterization of the SPAD detectors was described in Ref. [11]. Concerning array design, we traded off in-pixel TDC electronics, pixel pitch, fill-factor, and pixel count, aiming at real applications where high throughput and simple optics are a must, such as confocal microscopy, in which well-spaced laser excitation/photon detection spots are required [12] and 1024 pixels are quite enough, though fewer than in denser arrays [7].

Target applications are fluorescence lifetime imaging microscopy (FLIM), where the average fluorescence relaxation time [13] is measured; FLIM-based Förster resonance energy transfer (FRET), for assessing the energy transfer between two chromophores with tens of picoseconds dynamics [14]; diffuse optical tomography, where the internal composition of biological tissues is characterized through photon delays from 1 ns to tens of nanoseconds [15]; proteomics and DNA sequencing [16].

The paper is organized as follows: Section II describes the chip architecture, i.e., the global timing electronics, the in-pixel circuitry, and the readout logic. Section III presents the characterization of the TDCs (linearity, precision, uniformity and crosstalk). Section IV reports 2-D and 3-D static images and videos. Section V draws conclusions and perspectives.

II. CHIP STRUCTURE

The 32 × 32 array has 150 µm × 150 µm pixels with SPAD detectors having a circular photoactive area of 30 µm diameter.
It is fabricated in a 0.35 µm CMOS technology, automotive certified, selected to provide high yield and uniformity of SPAD performance [11]. In fact, the developed SPADs reach state-of-the-art performance in terms of noise (i.e., DCR), afterpulsing, yield, reliability, and optical crosstalk compared to all other CMOS SPADs so far reported, and are even comparable to best-in-class custom-process SPADs [17]. Since the employed n-well isolation is not so deep, the achieved PDE in the near infra-red (NIR) is lower than that attained by custom-processed SPADs. Nevertheless, in the near ultra-violet (NUV) the obtained sensitivity outperforms all other SPADs, both custom- and CMOS-based, presented so far in the literature, as can be observed in Fig. 1, thanks to a thinner Si3N4 passivation layer. The PDE is 55% at 450 nm wavelength, 45% at 400 and 500 nm, and still 20% at 300 nm and at 650 nm, i.e., at the NUV and NIR ends of the silicon sensitivity range. The detector characterization of the SPADs integrated in the array was presented in [11].

Fig. 1. PDE of CMOS SPADs presented in this work, compared to custom processes, like the reach-through thick SPAD by Excelitas [18], the red-enhanced SPAD by Politecnico di Milano and MPD [17], [19], the planar thin SPAD by MPD [20], and CMOS SPADs in 130 nm CMOS technologies [21], [22]. For all SPADs, the PDE at the highest reported excess bias is shown.

Fig. 2. Dependence of DCR versus PDE of the array’s CMOS SPADs, at four different wavelengths and five excess biases (2, 3, 4, 5, and 6 V).

The 0.35 µm technology employed prevents shrinking the in-pixel electronics, thus limiting the attainable fill-factor to just 3.14% with no microlens array, but at the same time it allows the use of large 30 µm SPADs with a DCR of just 120 cps (counts/s) with no need for cooling (Fig. 2).
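The fill-factor and overall active area quoted in the text follow directly from the pixel geometry. The short sketch below reproduces that arithmetic, assuming an ideal circular photoactive area and no microlenses; the function and constant names are illustrative only and do not come from the paper.

```python
import math

# Geometry taken from the text: 32 x 32 pixels, 150 um pitch, 30 um SPAD diameter.
PIXELS_PER_SIDE = 32
PIXEL_PITCH_UM = 150.0
SPAD_DIAMETER_UM = 30.0


def spad_area_um2(diameter_um):
    """Area of an ideal circular photoactive region, in square micrometres."""
    return math.pi * (diameter_um / 2.0) ** 2


def fill_factor(diameter_um, pitch_um):
    """Ratio of photoactive area to pixel area (square pixel assumed)."""
    return spad_area_um2(diameter_um) / pitch_um ** 2


def total_active_area_mm2(diameter_um, pixels_per_side):
    """Sum of all photoactive areas in the array, converted from um^2 to mm^2."""
    n_pixels = pixels_per_side ** 2
    return n_pixels * spad_area_um2(diameter_um) * 1e-6


ff = fill_factor(SPAD_DIAMETER_UM, PIXEL_PITCH_UM)
area = total_active_area_mm2(SPAD_DIAMETER_UM, PIXELS_PER_SIDE)
print(f"fill-factor: {100 * ff:.2f} %")          # 3.14 %
print(f"overall active area: {area:.2f} mm^2")   # 0.72 mm^2
```

With these numbers the fill-factor is exactly π/100, which matches the 3.14% stated above, and the summed active area is consistent with the roughly 0.7 mm² quoted in the Introduction.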
Such large SPADs mitigate the fill-factor limitation, so that the resulting fill-factor is even better than that reported in more scaled technologies, e.g., 1% [7] or 2.3% [8] in 130 nm technology.

Fig. 3. Imager architecture, composed of: 32 × 32 pixels (split into two identical blocks to speed up readout), external counters (four blocks of 16 × 16) for multiple gate-on periods, readout circuitry (row selectors and output multiplexers), 16 multiphase clocks, global STOP channel and registers.

Fig. 4. TDC architecture, composed of global components (a frequency doubler (fx2), a 16-tap DLL providing 16 multiphase clocks, a global STOP interpolator) and the in-pixel TDC (with START fine interpolator, a 6 bit counter for coarse timing resolution, and synchronizers). Blocks with yellow background are replicated into each one of the 1024 pixels of the array.

In each pixel, when a photon hits and triggers the SPAD, the avalanche sensing electronics provides a START signal to the in-pixel TDC. The global STOP is provided to all TDCs by an external sync (e.g., the excitation laser sync-out). Fig. 3 shows all chip building blocks, namely the reference clock frequency doubler (fx2), delay locked loop (DLL), STOP interpolator and synchronizer for the global timing electronics, the array of pixels (each with SPAD and TDC), and the readout circuitry (with row selector and output multiplexer). The next paragraphs describe in detail the TDC technique employed and each component of the chip.

A. Sliding Scale and TDC Structure

The chip was designed to achieve sub-nanosecond timing resolution and a full-scale range of some hundreds of nanoseconds. Such ranges could be achieved through interpolation methods based, for instance, on pulse-shrinking delay line [23], tapped delay line [24], or standard and cyclic Vernier delay line [25] elements.
In all those methods, linearity is limited by component mismatches, but it can be greatly improved by employing the “Sliding Scale” technique, which requires separately measuring the time interval between a reference clock and each of the asynchronous START and STOP signals [26], [27]. In this way, even if the same START-STOP time interval is converted, different portions of the START and STOP interpolator ranges are exploited, thus the interpolators’ deterministic nonlinearities are converted into stochastic jitter [28]. Therefore the “Sliding Scale” improves linearity at the cost of single-shot precision, which degrades due to the interpolators’ INL [28]:

$$\sigma_{TDC} = \sqrt{\sigma_q^2 + \sigma_{INL-START}^2 + \sigma_{INL-STOP}^2 + \sigma_{clk}^2 + \sigma_n^2} \tag{1}$$

where $\sigma_q$ is the quantization error, given by

$$\sigma_q = \sqrt{\frac{LSB_{START}^2}{12} + \frac{LSB_{STOP}^2}{12}} = \frac{LSB}{\sqrt{6}} \tag{2}$$

and $\sigma_{INL-START}$ and $\sigma_{INL-STOP}$ are the INL standard deviations of the START and STOP interpolators respectively, $\sigma_{clk}$ is the reference clock jitter, and $\sigma_n$ is any additional jitter of signals within the TDC. Although the “Sliding Scale” technique adds the $\sigma_{INL}$ contributions to $\sigma_q$, both DNL and INL improve remarkably [29].

We carefully selected the TDC conversion technique by taking into consideration that the interpolator must be replicated into each pixel, hence it should require small dimensions and low power consumption. We designed an interpolator similar to a tapped delay line, but instead of propagating the START signal along the delay line, we propagate a reference clock, while the state of the multiphase clocks is sampled at the START and at the STOP signal by means of separate interpolators. In this way, we effectively implement the “Sliding Scale” technique and we can keep the DLL propagating the reference clock outside the pixel, as global electronics.
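As a numerical illustration of Eqs. (1) and (2), the sketch below evaluates the quantization term for the 312 ps LSB quoted in the abstract (taken here as 312.5 ps) and combines it with the other jitter terms in quadrature. The INL, clock and noise values passed in the example are hypothetical placeholders, not measured figures from the paper, and the function names are assumptions made for this sketch.

```python
import math


def quantization_sigma(lsb_start_ps, lsb_stop_ps):
    """Eq. (2): quantization error of two independent uniform interpolators."""
    return math.sqrt(lsb_start_ps ** 2 / 12.0 + lsb_stop_ps ** 2 / 12.0)


def tdc_sigma(sigma_q, sigma_inl_start, sigma_inl_stop, sigma_clk, sigma_n):
    """Eq. (1): single-shot precision as a quadrature sum of all contributions."""
    return math.sqrt(
        sigma_q ** 2
        + sigma_inl_start ** 2
        + sigma_inl_stop ** 2
        + sigma_clk ** 2
        + sigma_n ** 2
    )


LSB_PS = 312.5  # quoted as 312 ps in the text

sigma_q = quantization_sigma(LSB_PS, LSB_PS)
print(f"sigma_q = {sigma_q:.1f} ps")  # about 127.6 ps, i.e. LSB / sqrt(6)

# Hypothetical jitter values (ps), chosen only to show how the terms combine.
sigma_total = tdc_sigma(sigma_q, 30.0, 30.0, 10.0, 20.0)
print(f"sigma_TDC = {sigma_total:.1f} ps")
```

Because the terms add in quadrature, the quantization error sets a floor on the single-shot precision, and the INL terms introduced by the Sliding Scale can only raise it.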
By contrast, pulse-shrinking methods would require integrating a separate DLL into each pixel, and the Vernier delay line technique would provide a very short time bin, but at the expense of larger area and higher power consumption. We conceived a TDC structure similar to the one presented in Ref. [25], but we decided to employ just one interpolation stage, for smaller pixel dimensions and hence higher fill-factor.

Fig. 4 shows the TDC global architecture: the in-pixel TDC components are a coarse 6 bit counter, which assures a long full-scale range (more than 300 ns), and 4 bit interpolators, for fine resolution. The coarse counter is used to count the number of clock periods (of a 100 MHz external clock, internally doubled to 200 MHz) between the START and STOP signals. Then, the fine interpolator, based on a global DLL which divides the clock period into 16 intervals of 312 ps each, includes an in-pixel discriminator to detect the phase of the START signal with respect to the 16 multiphase clocks. An identical discriminator detects the phase of the asynchronous global STOP. Two synchronizers assure the correct synchronization between the coarse counter and the START and STOP interpolators, respectively. The TDC acts as a flash converter and requires a negligible (about 1 ns) conversion time, due only to the propagation delays.

Fig. 5. Simplified schematic of the DLL that generates the 16 multiphase clocks.

In summary, the 6 bit counter, the START interpolator and its synchronizer are integrated into each pixel, whereas the 16-tap DLL, the STOP interpolator and its synchronizer are all global components of the array. Each measurement consists of 6 bits from the in-pixel coarse counter ($N_{coarse}$), 4 bits from the in-pixel START interpolator ($N_{START}$), and 4 bits from the global STOP interpolator ($N_{STOP}$).
The photon time-tag $T_{meas}$ is the elapsed time between the START (i.e., the photon) and the STOP (i.e., the global sync) events and it is computed as

$$T_{meas} = \left( N_{coarse} - \frac{N_{START} - N_{STOP}}{16} \right) \cdot T_{ck} \tag{3}$$

where $T_{ck}$ is one period of the reference clock, which corresponds to 5 ns when a 100 MHz external clock is used. Therefore, the TDC provides an LSB of $T_{ck}/16$ = 312 ps, with a full-scale range FSR = 320 ns and 10-bit resolution.

B. Global Timing Logic

The global timing logic comprises a 16-tap DLL, for generating clocks with constant phase shift; a frequency multiplier, for doubling the frequency of the external reference clock; and the STOP interpolator and STOP synchronizer, for synchronization between fine interpolator and coarse counter.

Fig. 6. Schematics (top and bottom left) and main signals (bottom right) of the frequency doubler, consisting of a four-stage locked-loop delay line and logic to generate the internal clock. The Glitch blocks create short pulses (hundreds of picoseconds) in correspondence with the rising edge of the input signal.

The structure of the 16-tap DLL is shown in Fig. 5: it propagates the reference clock, generating 16 multiphase clocks (shifted by 312 ps steps when using a 100 MHz external clock), which feed the TDC interpolators. As shown in Fig. 5, the DLL consists of 16 delay cells, whose delay is adjustable by means of a control voltage (internally generated by the phase detector and charge pump). The phase detector generates voltage pulses (up or down depending on which clock leads), with a duration proportional to the phase delay between clk first and clk last. The charge pump then charges or discharges a capacitor, depending on which clock leads [30]. The internal control voltage makes the TDC insensitive to process, voltage and temperature drifts.
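The quantities introduced so far (DLL tap spacing, LSB, full-scale range, and the time-tag of Eq. (3)) can be tied together in a few lines. The sketch below is a plain numerical restatement of those formulas, assuming the 5 ns internal clock period given in the text; the function names and the range checks are assumptions added for illustration, not part of the chip's interface.

```python
T_CK_NS = 5.0     # internal clock period: 100 MHz external clock doubled to 200 MHz
FINE_STEPS = 16   # multiphase clocks (DLL taps) per clock period
COARSE_BITS = 6   # width of the in-pixel coarse counter

LSB_NS = T_CK_NS / FINE_STEPS              # 0.3125 ns, quoted as 312 ps
FSR_NS = (2 ** COARSE_BITS) * T_CK_NS      # 320 ns full-scale range


def dll_tap_delays_ns(t_ck_ns=T_CK_NS, taps=FINE_STEPS):
    """Nominal delay of each multiphase clock with respect to the first tap."""
    return [k * t_ck_ns / taps for k in range(taps)]


def time_tag_ns(n_coarse, n_start, n_stop, t_ck_ns=T_CK_NS):
    """Eq. (3): START-to-STOP interval from coarse count and fine codes."""
    if not 0 <= n_coarse < 2 ** COARSE_BITS:
        raise ValueError("n_coarse must fit in 6 bits")
    if not (0 <= n_start < FINE_STEPS and 0 <= n_stop < FINE_STEPS):
        raise ValueError("fine codes must fit in 4 bits")
    return (n_coarse - (n_start - n_stop) / FINE_STEPS) * t_ck_ns


# Example: 10 clock periods counted, START at fine code 12, STOP at fine code 4.
print(time_tag_ns(10, 12, 4))  # 47.5 (ns)
```

In the example, the START arrives 8 fine steps later in its clock period than the STOP does in its own, so half a clock period (2.5 ns) is subtracted from the 50 ns coarse interval.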
Simulations have been performed to ensure correct operation over a wide temperature range, from 0 to 80 °C, also considering worst-case process variations. The 16 multiphase clocks are connected to the 1024 TDC START interpolators, one for each pixel, and to the single global STOP interpolator. In order to reduce the power consumption of the multiphase clock buffers, clocks are distributed only when SPADs are enabled.

The frequency multiplier doubles the external reference clock frequency, allowing the use of external clock generators at lower frequency. Fig. 6 shows the implementation of the circuit, based on a four-tap DLL, which makes the internal clock stable and locked. A Set-Reset latch is used to lock both the rising and the falling edge of the output clock to the rising edge of a clock generated by the DLL, thus a 50% output duty-cycle is assured, independent of the input duty-cycle.

The global interpolator detects the phase of the STOP with respect to the 16 multiphase clocks, and provides a 16-level thermometric scale, which is then converted into a 4 bit binary code through a 16-to-4 encoder. Identical 4 bit interpolators are also employed for all the 1024 STARTs, one for each pixel. The interpolator is composed of 16 latches, which acquire the state of the multiphase clocks at the rising edge of the STOP (or START) signal. Since 16 of these latches are replicated into each pixel, it is important to reduce their power consumption. The schematic of the latch is shown in Fig. 7: M1–M4 form a sense-amplifier-based latch [31], M7 and M8 assure a fast recharge after commutation, and M5 and M6 form a pseudo p-MOS NOR gate, which pre-charges both branches to the same level when no input is received. Compared to a CMOS gate, it saves a large amount of power at such a high clock frequency (200 MHz).

The synchronizer block assures the correct synchronization between interpolator and coarse counter.
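The fine interpolator described above (16 latches sampling the multiphase clocks, followed by a 16-to-4 encoder) can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the clocks are taken as ideal 50% duty-cycle square waves, tap k is delayed by k·Tck/16, and the fine code is read as the index of the last tap that has already risen. The paper does not specify the actual encoding logic, so the function names and the decoding rule here are hypothetical.

```python
T_CK_NS = 5.0   # internal clock period (200 MHz)
TAPS = 16       # number of multiphase clocks


def sample_multiphase_clocks(t_ns):
    """Latch outputs when an event (START or STOP) arrives at time t_ns.

    Assumption: tap k is an ideal square wave, high during the first half
    of its own period and delayed by k * T_CK_NS / TAPS.
    """
    states = []
    for k in range(TAPS):
        phase = (t_ns - k * T_CK_NS / TAPS) % T_CK_NS
        states.append(1 if phase < T_CK_NS / 2 else 0)
    return states


def encode_16_to_4(states):
    """Return the 4 bit fine code: the tap that is high while the next is low."""
    for k in range(TAPS):
        if states[k] == 1 and states[(k + 1) % TAPS] == 0:
            return k
    raise ValueError("no clock edge found in the sampled pattern")


# An event 1.0 ns after the rising edge of tap 0 falls in fine bin 3
# (bins are 0.3125 ns wide, so bin 3 spans 0.9375 ns to 1.25 ns).
pattern = sample_multiphase_clocks(1.0)
print(pattern)
print(encode_16_to_4(pattern))  # 3
```

In this model each sampled pattern contains eight consecutive ones (cyclically), and the position of the one-to-zero boundary moves by one tap for every 312.5 ps of delay, which is what allows a 4 bit code to be extracted.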
If a START (or STOP) signal arrives too close to the clock rising edge, a metastable condition can occur, and it is necessary to refer to the fine conversion to resolve the uncertainty. Synchronizer circuits for the (in-pixel) START and the (global) STOP have been employed to resolve such metastable conditions.

Fig. 7. Schematics of the latch, based on a sense-amplifier, constituting the core block of the fine interpolator.

C. In-Pixel Electronics

All functionalities of the chip have been implemented in-pixel. Fig. 8 shows the block diagram of the pixel [29], which includes a large photoactive area SPAD (30 µm diameter) [11], a quenching circuit with active reset for fast avalanche sensing and quenching (two high-voltage transistors have been used to bias the SPADs at excess voltages as high as 7 V) [32], pulse shaping electronics for proper synchronization, and the 10 bit TDC. A 10 bit memory latch stores the result of the conversion, and the B0:B9 output buffers drive the readout data buses. Thanks to the 1024 in-pixel memory latches, a new frame can be acquired while the previous one is being read out, in a global shutter technique. The minimum frame duration, required for reading out the whole 32 × 32 array, is 10 µs, corresponding to a maximum frame rate of 100 000 fps.

Fig. 8. Pixel block diagram, with 30 µm SPAD and 10 bit TDC.

In photon-timing mode the maximum duration of the SPAD gate-on period is limited by the TDC range (320 ns with an external 100 MHz reference clock). Since in many applications the typical probability of detecting a photon for each laser shot is lower than 5% [33], opening a single gate per frame would make the overall photon-timing system very inefficient. To overcome this problem we implemented a multi-window timing feature: more than one gate-on period is opened within each frame, and the first photon within the frame triggers the TDC conversion. One global STOP is provided to the chip for each gate-on period, and up to 64 stop-interpolator results can be stored in a global memory and then read out at the end of each frame. In order to associate the proper STOP to each pixel, a further 1024 6-bit counters have been added: at every gate-on period each counter is incremented if no photon has yet been detected in the corresponding pixel. In this way the complete conversion result of each pixel has 16 bits: six from the coarse counter, four from the START fine interpolator and six from the additional counter for multiple gate-on windows. Since no strict timing constraints must be respected, these counters have been placed outside the pixel, as shown in Fig. 3.

Finally, since many applications benefit from combined 2-D (photon-counting) imaging and 3-D (photon-timing) ranging, we implemented both. In fact, the 6-bit asynchronous counter counts either the number of clock periods between START and STOP, in photon-timing mode, or the photons detected within a time window, in photon-counting mode.

D. Readout Circuitry

Since many applications require very high frame rates and synchronous acquisition from all pixels, we optimized the global electronics to improve data throughput. The planar array is divided into two sub-arrays, in which the pixels of each half-column share the same data bus. In each sub-array the in-pixel output buffers of one row are activated at the same time, and two separate multiplexers (one for the upper and one for the lower sub-array) serially feed the results of each column to the output pads. The two advantages are the doubling of the output throughput and the halving of the parasitic capacitance of the column data buses, at the cost of doubling the output pads. With a 50 MHz readout clock, a maximum frame rate of 100 000 fps is easily achieved (i.e., 128 MB/s data rate).

E. Overall Chip

Fig. 9 shows a micrograph of the overall chip (total chip size 9 × 9 mm²). The total power consumption in photon-counting mode is less than 70 mW, whereas in photon-timing mode it depends on the required timing resolution and on the number of gate-on windows per frame. Table I shows the power consumption for each array block in photon-timing mode, at two different timing resolutions, attained by varying the external reference clock.

Fig. 9. Micrograph of the 32 × 32 CMOS SPAD imager.

TABLE I
POWER CONSUMPTION OF THE IMAGER MAIN BLOCKS, IN PHOTON-TIMING MODE, FOR DLL CLOCK FREQUENCIES OF 160 AND 200 MHZ

Circuit block          Notes                     @ 390 ps   @ 312 ps   Units
DLL                                              260        315        mW
In-pixel electronics   1 conversion per frame    40         40         mW
Clock distribution     1 gate per frame          33         50         mW
                       50 gates per frame        1.65       2.5        W
Digital I/O            at 100 000 fps            20         20         mW
Analog I/O                                       <5         <5         mW
TOTAL                  1 gate per frame          358        430        mW
                       50 gates per frame        1.9        2.8        W

Fig. 10. Code density test, DNL and INL of one pixel in the array center.

III. TDCS CHARACTERIZATION

In this section, we present the characterization of the TDCs within the array, when all 1024 TDCs are running at the same time. We start from linearity (DNL and INL) and single-shot precision, i.e., the standard deviation of the TDC conversion results when a constant START-STOP time interval is measured many times. Then we report the uniformity of TDC conversions among different pixels and finally the crosstalk among TDCs.

A. Linearity

The array design aimed at optimizing the linearity of the measurement, both by taking care of the matching of the delay cells and by implementing the Sliding-Scale technique.
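For reference, DNL and INL are commonly estimated from a histogram of TDC output codes collected with events that are uniformly distributed in time, so that every code should ideally receive the same number of hits. The sketch below uses this common textbook formulation, which is an assumption and not necessarily the exact procedure used by the authors. The function names and the toy histogram are hypothetical.

```python
from itertools import accumulate
from math import sqrt


def dnl_inl(hist: list[int]) -> tuple[list[float], list[float]]:
    """DNL and INL, in LSB, from a code-density histogram.

    Assumes events uniformly distributed over the histogram range, so the
    ideal count per code equals the mean count. INL is the running sum of
    the DNL.
    """
    if not hist or sum(hist) == 0:
        raise ValueError("empty histogram")
    mean = sum(hist) / len(hist)
    dnl = [h / mean - 1.0 for h in hist]
    inl = list(accumulate(dnl))
    return dnl, inl


def rms(values: list[float]) -> float:
    """Root-mean-square value of a list of numbers."""
    return sqrt(sum(v * v for v in values) / len(values))


# Toy histogram with four codes (real tests use all TDC codes).
dnl, inl = dnl_inl([100, 110, 90, 100])
print("DNL [LSB]:", dnl, "rms:", rms(dnl))
print("INL [LSB]:", inl, "rms:", rms(inl))
```

Multiplying the results by the LSB duration converts them from fractions of an LSB to picoseconds.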
In order to quantitatively measure the linearity of the chip, we performed a code density test, collecting about 40 000 samples for each pixel within a range of 300 ns. No differences were observed among different pixels, hence Fig. 10 shows the code density test, the DNL and the INL of one pixel, in the array center. The obtained DNL is 2% LSB rms (±6% maximum) and the INL is 10% LSB rms (±22% maximum). Such relative linearity is much better than what is reported in the literature [4]–[8]. Considering the DNL absolute value, only Ref. [7] shows a slightly better DNL (16.5 ps [7] versus our 18.7 ps), but with a much worse INL (137.5 ps [7] versus our 78 ps).

Fig. 11. Single-shot precision with 390 ps bin (left), and 312 ps bin (right), corresponding to an rms precision of 280 and 254 ps, respectively.

B. Single-Shot Precision and Uniformity

We characterized the single-shot precision of the TDCs integrated in each pixel of the array, by illuminating the corresponding SPADs with a pulsed laser with 60 ps full-width at half maximum (FWHM). In order to check the reliability of the imager, two different reference clock frequencies were used, namely 80 and 100 MHz, corresponding to a time-bin, i.e., LSB, of 390 and 312 ps, respectively. The typical pulse responses of one pixel in the array center are shown in Fig. 11: the single-shot precision is 660 ps (left) and 600 ps (right) FWHM, corresponding to a standard deviation of 280 and 254 ps rms, respectively. Depending on the measured time delay, the precision shows variations of ±20 ps with respect to the typical response represented in Fig. 11.

For assessing TDC response uniformity, we homogeneously illuminated the imager with a pulsed laser and all 1024 TDCs measured the arrival time of the first photon per pixel. A 100 MHz clock was used, resulting in a 312 ps bin. After accumulating 1024 histograms, with about 40 000 conversions per pixel, the centroid and the histogram spread (in FWHM) were computed, across the whole TDC range. No significant differences (lower than ½ LSB) were observed at different delays. The maximum error between the measured time and the average time measured by each pixel of the array is shown in Fig. 12 left, whereas the single-shot precision is shown in Fig. 12 right. The maximum difference across centroids among different pixels is 260 ps (less than 1 LSB) and the standard deviation is 42 ps; it is a clear signature of the excellent pixel uniformity. Fig. 12 right shows the single-shot precision (i.e., the distribution width) of each pixel; the average value across all pixels is 609 ps FWHM, i.e., 259 ps rms. The standard deviation of the FWHM among the 1024 pixels is just 27.8 ps rms, showing again excellent uniformity of TDC performance over the entire array.

Fig. 12. Uniformity (left) and FWHM precision (right) of the photon arrival time histogram (about 40 000 conversions) in all 1024 array pixels.

Fig. 13. Centroid (left) and FWHM precision (right) of the photon arrival time histograms of each one of the 1024 array pixels, at low (red) and high (blue) optical illumination.

C. TDC Crosstalk

We assessed the crosstalk among different TDCs by measuring the FWHM of the photon arrival time histograms (with about 50 000 conversions) in each pixel, by illuminating the chip with a pulsed laser with both low (red lines in Fig. 13) and high (blue lines in Fig. 13) optical power. In the first case less than 1% of the pixels receive a photon, thus it is possible to assume that while the hit pixel is performing a TDC conversion, no neighbors are working. In the second case, more than 85% of the pixels receive a photon almost simultaneously, therefore it is possible to consider, with good approximation, that most of the neighboring pixels are working simultaneously. The centroid (Fig. 13 left) averaged among all 1024 array pixels is 97.21 ns and 97.26 ns, at low and high optical powers respectively, and in all pixels at high conversion rates the measured TOF is longer than at low conversion rates. This fact is mostly due to the heating of the array (that causes a longer propagation delay in the STOP signal input buffer) and not to crosstalk among TDCs. Furthermore, note that the time difference in the measured centroid at low and high optical power is only 50 ps, i.e., 16% LSB. The FWHM (Fig. 13 right) is 609.7 and 612.5 ps, at low and high optical power respectively, showing negligible dependence on counting conditions, i.e., photon rate. Therefore, we can conclude that there is almost no crosstalk among TDCs, since neither the accuracy (centroid) nor the precision (FWHM spread) of the measurements changes significantly with the number of working TDCs.

IV. 2-D AND 3-D MEASUREMENTS

In order to prove the photon-counting and photon-timing capability of the SPAD imager, we performed combined 2-D/3-D acquisitions, working at high frame rates. 2-D images and videos were acquired by using the chip in photon-counting mode (100 000 fps maximum frame rate), whereas 3-D distance-resolved ranging was achieved by measuring the round trip duration of a laser pulse illuminating the target scene.

Fig. 14. Raw data 2-D intensity image of a doll (5 cm × 5 cm FOV) obtained with no active illumination and no post processing.

Fig. 14 shows the acquired 2-D image of the face of a doll at 60 cm distance from the imager, acquired through a 55 mm objective, with no post-processing. The field-of-view (FOV) of the camera was about 5 × 5 cm², and the integration time was 5 ms with low ambient illumination. In order to increase the image dynamic range (ratio between the maximum and the minimum counts per frame), the 5 ms integration time ($T_{int}$) was obtained by summing 500 frames ($N_{frame}$) of 10 µs each.
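The frame accumulation just described can be checked numerically. The sketch below sums frames pixel by pixel and evaluates the ratio between the maximum recordable count (taken here as the number of frames times 2^6, for the 6-bit in-pixel counter) and the dark counts collected during the integration time, expressed in decibels as in Eq. (4). The dark count rate of 120 cps is the value reported by the authors; the function names are hypothetical.

```python
from math import log10


def accumulate_frames(frames: list[list[int]]) -> list[int]:
    """Pixel-wise sum of equally sized frames (each a flat list of counts)."""
    return [sum(pixel_counts) for pixel_counts in zip(*frames)]


def dynamic_range_db(n_frames: int, counter_bits: int,
                     dcr_cps: float, t_int_s: float) -> float:
    """Ratio of maximum counts to dark counts over T_int, in dB."""
    max_counts = n_frames * 2 ** counter_bits   # upper limit after summing
    dark_counts = dcr_cps * t_int_s             # floor set by dark counts
    return 20.0 * log10(max_counts / dark_counts)


# Three toy frames of two pixels each.
print(accumulate_frames([[1, 2], [3, 4], [5, 6]]))
# 500 frames of 10 us each, 6-bit counter, DCR = 120 cps, T_int = 5 ms.
print(round(dynamic_range_db(500, 6, 120.0, 5e-3), 1), "dB")
```

Summing frames off-chip raises the upper limit of the count range without lengthening the single-frame exposure, which is why the frame rate and the dynamic range can be traded against each other.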
The resulting dynamic range is:

$$DR = 20 \log_{10} \left( \frac{N_{frame} \cdot 2^{6}}{DCR \cdot T_{int}} \right) = 94.5\ \mathrm{dB} \qquad (4)$$

set also by the very low dark counting rate of 120 cps.

Fig. 15 shows some frames of a 2-D video, acquired at 50 000 fps. A chopper with five blades was rotating at 39 000 r/min (i.e., 1/5 period corresponds to about 300 µs) in a scene with neon lamp illumination, flickering at 100 Hz (5 ms half-period). Frames 1, 8 and 15 show the chopper rotating 1/5 of a full turn, with the neon lamp at the maximum brightness; instead, frames 262, 270 and 277 (about 5 ms after the previous ones) show that the neon gas discharge is almost off.

Fig. 15. Eight frames from a 2-D video acquired at 50 000 fps of an optical chopper rotating at 39 000 r/min, illuminated by a neon lamp flickering at 100 Hz. All images have the same color scale (0–60 counts).

Active illumination is instead necessary for 3-D acquisitions based on TOF photon timing: we employed a 750 nm pulsed laser (90 mW average emitted optical power, 70 ps FWHM) and a 250 mm objective (f/4.8) placed in front of the SPAD imager, thus obtaining a 60 × 60 cm² FOV at 5 m distance from the camera. Fig. 16 shows an example of 3-D reconstruction, with no post-processing, of a human target obtained with 5 ms dwell time. Each pixel acquired about 3000 photons, so the depth information is measured with a precision given by

$$\sigma_N = \frac{\sigma_1}{\sqrt{N}}. \qquad (5)$$

The overall precision $\sigma_N$ of 0.7 mm (i.e., 4.6 ps rms) is achieved after N = 3000 valid events, with the previously reported single-shot precision $\sigma_1$ of 254 ps rms.

Fig. 16. 3-D reconstruction of a human target with office lighting, acquired by the SPAD imager in 5 ms, with a 4 mm depth resolution, with no post-processing.

Fig. 17 represents some frames of a 3-D video acquired at 13 fps, in which a pendulum is oscillating back (red) and forth (yellow) with respect to the imager, with about 600 ms period. The scene was at 8 m distance from the camera.

Fig. 17. Eight frames from a video acquired at 13 fps of a pendulum oscillating back and forth with respect to the imager. The color scale indicates the distance of the target from the background (at 8 m distance from the camera). The movie has been acquired in dark conditions.

V. CONCLUSION

We presented a CMOS SPAD imager, based on 32 × 32 smart pixels, each containing a SPAD detector and a TDC. The chip can count the number of photons detected in each pixel at 100 000 fps or can measure the time tag, i.e., the photon arrival time, of one photon per pixel per frame. The measured time delay can be either the photon TOF in 3-D-ranging/LIDAR applications, or the fluorescence photon emission in FLIM/FRET imaging, or the time-resolved photon waveform of very fast optical signals. The 1024 fully independent pixels operate in parallel, with no multiplexing during either detection or TDC conversion, in a global shutter mode. In photon-counting mode, at every frame the camera provides 1024 words of 6 bits each, with 100 000 fps maximum frame rate. In photon-timing mode, at every frame the camera provides 1024 words of 10 bits each, i.e., 312 ps time bin and 320 ns full-scale range. The imager opens the way to ultra-sensitive (single-photon sensitivity) high-speed (a hundred kiloframes/s) acquisitions of fast optical phenomena and dynamic sequences of events. Ongoing applications are in time-resolved spectroscopy, fluorescence lifetime imaging, diffusive optical tomography, molecular imaging in life sciences, TOF 3-D ranging, and atmospheric layer sensing through LIDAR. These applications in particular benefit from the large total active area of the array (0.7 mm²) combined with the highest PDE among SPAD arrays for photon timing, which allows reducing the optical power of the active illumination employed to excite the samples and targets under investigation.
Other applications are gesture recognition for human-machine interfaces, in which sub-centimeter resolution and hundreds of frames per second are required [34]; gaming and mixed reality, requiring 3-D depth-resolved dynamic acquisitions at medium distances (up to ten meters) with centimeter resolution [35]; and also touchless interaction with mobile devices, even in light-starved environments.

ACKNOWLEDGMENT

The authors would like to thank S. Masci for valuable support in preparation and wire bonding of the chips.

REFERENCES

[1] M. E. Daube-Witherspoon, S. Matej, M. E. Werner, S. Surti, and J. S. Karp, “Comparison of list-mode and DIRECT approaches for time-of-flight PET reconstruction,” in Proc. IEEE Nucl. Sci. Symp. Conf. Record., 2010, pp. 2252–2258. [2] W. Becker, Advanced Time-Correlated Single Photon Counting Techniques. Berlin, Germany: Springer, 2005. [3] M. Wahl. (2009). Time-correlated single photon counting. Tech. Note TCSPC v. 2.1, PicoQuant GmbH. [Online]. Available: www.picoquant.com [4] C. Niclass, C. Favi, T. Kluter, M. Gersbach, and E. Charbon, “A 128 × 128 single-photon imager with on-chip column-level 97 ps 10 bit time-to-digital-converter array,” in Proc. IEEE Int. Solid-State Circuits Conf., Feb. 2008, pp. 44–494. [5] C. Niclass, M. Soga, H. Matsubara, S. Kato, and M. Kagami, “A 100-m range 10-Frame/s 340 × 96-pixel time-of-flight depth sensor in 0.18 µm CMOS,” IEEE J. Solid-State Circuits, vol. 48, no. 2, pp. 559–572, Feb. 2013. [6] J. Richardson, R. Walker, L. Grant, D. Stoppa, F. Borghetti, E. Charbon, M. Gersbach, and R. K. Henderson, “A 32 × 32 50 ps resolution 10 bit time to digital converter array in 130 nm CMOS for time correlated imaging,” in Proc. IEEE Custom Integr. Circuits Conf., Sep. 13–16, 2009, pp. 77–80. [7] C. Veerappan, J. Richardson, R. Walker, D. Li, M. W. Fishburn, Y. Maruyama, D. Stoppa, F. Borghetti, M. Gersbach, R. K.
Henderson, and E. Charbon, “A 160 × 128 single-photon image sensor with on-pixel 55 ps 10 b time-to-digital converter,” in Proc. IEEE Int. Solid-State Circuits Conf., 2011, pp. 312–314. [8] M. Gersbach, Y. Maruyama, E. Labonne, J. Richardson, R. Walker et al., “A parallel 32 × 32 time-to-digital converter array fabricated in a 130 nm imaging CMOS technology,” in Proc. 35th Eur. Solid-State Circuits Conf., 2009, pp. 196–199. [9] A. Pifferi, A. Torricelli, L. Spinelli, D. Contini, R. Cubeddu, F. Martelli et al., “Time-resolved diffuse reflectance using small source-detector separation and fast single-photon gating,” Phys. Rev. Lett., vol. 100, no. 13, p. 138101, 2008. [10] B. F. Aull, J. C. Aversa, E. A. Dauler, J. P. Donnelly, E. K. Duerr, and J. P. Frechette, “Geiger-mode avalanche photodiodes for laser communications and laser radar,” in Proc. Dig. LEOS Summer Top. Meet., 2006, pp. 1–2. [11] F. Villa, D. Bronzi, Y. Zou, C. Scarcella, G. Boso, S. Tisa, A. Tosi, F. Zappa, D. Durini, S. Weyers, W. Brockherde, and U. Paschen, “CMOS SPADs with up to 500 µm diameter and 55% detection efficiency at 420 nm,” J. Modern Opt., vol. 61, pp. 102–115, Jan. 2014. [12] X. Michalet, R. A. Colyer, G. Scalia, A. Ingargiola, R. Lin et al., “Development of new photon-counting detectors for single-molecule fluorescence microscopy,” Philosoph. Trans. B, vol. 368, no. 1611, pp. 1–22, Dec. 2012. [13] K. Suhling, P. M. W. French, and D. Phillips, “Time-resolved fluorescence microscopy,” Photochemical Photobiol. Sci., vol. 4, pp. 13–22, 2005. [14] E. A. Jares-Erijman and T. M. Jovin, “FRET imaging,” Nat. Biotechnol., vol. 21, pp. 1387–1395, 2002. [15] L. V. Wang and H. I. Wu, “Diffuse optical tomography,” Biomedical Optics: Principle and Measurements. Hoboken, NJ, USA: Wiley, 2007. [16] S. Carrara, A. Cavallini, Y. Maruyama, E. Charbon, and G. De Micheli, “A new ethylene glycol-silane monolayer for highly-specific DNA detection on silicon chips,” Surface Sci., vol. 604, pp. L71–L74, 2010. 
[17] A. Gulinatti, I. Rech, F. Panzeri, C. Cammi, P. Maccagnani, M. Ghioni, and S. Cova, “New silicon SPAD technology for enhanced red-sensitivity, high-resolution timing and system integration,” J. Modern Opt., vol. 59, no. 17, pp. 1489–1499, 2012. [18] Excelitas Technologies, Photonic Detection, Single Photon Counting Modules (SPCM) [Online]. Available: http://www.excelitas.com/Downloads/DTS_SPCM-AQRH.pdf [19] Micro Photon Devices, Photon Detector Modules Red Enhanced (PDM-R) [Online]. Available: http://www.micro-photon-devices.com/Docs/Datasheet/PDM-R.pdf [20] Micro Photon Devices, Photon Detector Modules (PDM) [Online]. Available: http://www.micro-photon-devices.com/Docs/Datasheet/PDM.pdf [21] M. Gersbach, J. Richardson, E. Mazaleyrat, S. Hardillier, C. Niclass, R. Henderson, L. Grant, and E. Charbon, “A low-noise single-photon detector implemented in a 130 nm CMOS imaging process,” J. Solid-State Electron., vol. 53, pp. 803–808, 2009. [22] J. A. Richardson, L. A. Grant, and R. K. Henderson, “A low dark count single photon avalanche diode structure compatible with standard nanometer scale CMOS technology,” IEEE Photon. Technol. Lett., vol. 21, no. 14, pp. 120–1022, 2009. [23] Y. Liu, U. Vollenbruch, Y. Chen, C. Wicpalek, L. Maurer, Z. Boos, and R. Weigel, “Multi-stage pulse shrinking time-to-digital converter for time interval measurements,” in Proc. Eur. Microw. Integr. Circuits Conf., Oct. 2007, pp. 267–270. [24] R. B. Staszewski, S. K. Vemulapalli, J. L. Wallberg, and P. T. Balsara, “1.3 V 20 ps time-to-digital converter for frequency synthesis in 90-nm CMOS,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 53, no. 3, pp. 220–224, Mar. 2006. [25] B. Markovic, S. Tisa, F. A. Villa, A. Tosi, and F. Zappa, “A high-linearity, 17 ps precision time-to-digital converter based on a single-stage delay Vernier loop fine interpolation,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 60, no. 3, pp. 557–569, Mar. 2013. [26] J. P. Jansson, A. Mäntyniemi, and J.
Kostamovaara, “A CMOS time-todigital converter with better than 10 ps single-shot precision,” IEEE J. Solid-State Circuits, vol. 41, no. 6, pp. 1286–1296, Jun. 2006. [27] A. Mantyniemi, T. Rahkonen, and J. Kostamovaara, “A CMOS time-todigital converter (TDC) based on a cyclic time domain successive approximation interpolation method,” IEEE J. Solid-State Circuits, vol. 44, no. 11, pp. 3067–3078, Nov. 2009. [28] B. Markovic, D. Tamborini, F. Villa, S. Tisa, A. Tosi, and F. Zappa, “10 ps resolution, 160 ns full scale range and less than 1.5 differential nonlinearity time-to-digital converter module for high performance timing measurements,” Rev. Sci. Instrum., vol. 83, no. 7, p. 074703, 2012 [29] F. Villa, B. Markovic, S. Bellisai, D. Bronzi, A. Tosi, F. Zappa, S. Tisa, D. Durini, S. Weyers, U. Paschen, and W. Brockherde, “SPAD smart-pixel for time-of-flight and time-correlated single-photon counting measurements,” IEEE Photon. J., vol. 4, no. 3, pp. 795–804, Jun. 2012. [30] D. M. Santos, S. F. Dow, and M. E. Levi, “CMOS delay locked loop and sub-nanosecond time-to-digital converter chip,” in Proc. IEEE Nucl. Sci. Symp. Med. Imag. Conf., 1995, vol. 1, pp. 289–291. [31] B. Nikolić, V. G. Oklobdžija, V. Stojanović, W. Jia, J. K.-S. Chiu, and M. M.-T. Leung, “Improved sense-amplifier-based flip-flop: Design and measurements,” IEEE J. Solid-State Circuits, vol. 35, no. 6, pp. 876–884, Jun. 2000. [32] D. Bronzi, S. Tisa, F. Villa, A. Tosi, and F. Zappa, “Fast sensing and quenching of CMOS SPADs for minimal afterpulsing effects,” IEEE Photon. Technol. Lett., vol. 25, no. 8, pp. 776–779, Apr. 2013. [33] D. V. O’Connor and V. Phillips, Time-Correlated Single Photon Counting. London, U.K.: Academic, 1984. [34] E. S. Nielsen, L. A. Canalis, and M. H. Tejera, “Hand gesture recognition for human-machine interaction,” J. WSCG, vol. 12, no. 1–3, pp. 395–402, 2004. [35] A. Kolb, E. Barth, R. Koch, and R. Larsen, “Time-of-flight sensors in computer graphics,” in Proc. Eur. Assoc. 
Comput. Graph., 2009, pp. 119–134. Federica Villa received the B.Sc. degree in biomedical engineering and the M.Sc. and Ph.D. degrees in electronic engineering from the Politecnico di Milano, in 2008, 2010, and 2014, respectively. In 2010, she interned in the Biochemistry Department, University of California, Los Angeles. Her current research interests include designing CMOS SPAD imagers for 2-D imaging of fluorescence decays and 3-D ranging through on-chip direct time-of-flight method, by means of in-pixel time-to-digital converters. Rudi Lussana was born in Bergamo, Italy, in 1989. He received the bachelor’s degree and the M.Sc. degree in electronic engineering from Politecnico di Milano, in 2011 and 2013, respectively. Since November 2013, he has been working toward the Ph.D. degree in electronic engineering at the Politecnico di Milano. His main research interests include the field of single-photon 3-D cameras, in particular on direct time of flight technique. Danilo Bronzi received the B.Sc. degree (cum laude) in biomedical engineering and the M.Sc. degree (cum laude) in electronic engineering from the Politecnico di Milano, Italy, in 2008 and July 2011, respectively. Since October 2011, he has been working toward the Ph.D. degree in the Department of Electronics, Information and Bioengineering, Politecnico di Milano. His current research interests include the design and development of novel CMOS single-photon sensors for 2-D and 3-D imaging applications. Alberto Dalla Mora was born in Fiorenzuola d’Arda, Italy, in 1981. He graduated summa cum laude in electronics engineering and received the Ph.D. degree summa cum laude in information and communication technology from the Politecnico di Milano, Italy, in 2006 and 2010, respectively.
From 2010 to 2011, he was a Postdoctoral Fellow in the Dipartimento di Fisica, Politecnico di Milano, where since 2011, he has been an Assistant Professor. His current research interests include time-resolved diffuse spectroscopy instrumentations and applications for biomedical diagnosis. Simone Tisa was born in Milano, Italy, in 1977. He received the M.Sc. degree in electronic engineering and the Ph.D. degree from Politecnico di Milano in 2001and 2006, respectively. He is currently a product R&D Manager for electronics design and system integration at Micro Photon Devices. In 2008, he pioneered the first monolithic 2-D SPAD imager of 32 × 32 pixels. His main research interests include the field of single-photon imaging and single-photon timing of fast phenomena, by means of fully integrated arrays of SPADs and associated microelectronics. Davide Contini was born in Angera, Italy, in 1978. He received the Master’s degree in electronics engineering and the Ph.D. degree in physics from the Politecnico, Italy, in 2003 and 2007, respectively. In 2004, he joined the Biomedical Optics Group in the Dipartimento di Fisica, Politecnico di Milano. Since 2008, he has been an Assistant Professor in physics at the Politecnico di Milano. His current research interests include optics and the interaction of laser light with matter, in particular the time-resolved spectroscopy of highly diffusive media for applications in biology and medicine. Alberto Tosi (M’07) was born in Borgomanero, Italy, in 1975. He received the Master’s degree in electronics engineering and the Ph.D. degree in information technology engineering from the Politecnico di Milano, Milan, Italy, in 2001 and 2005, respectively. He has been an Assistant Professor of electronics at the Politecnico di Milano since 2006. In 2004, he was a student with the IBM T. J. Watson Research Center, Yorktown Heights, NY, USA, working on optical testing of CMOS circuits. 
Currently, he works on silicon and InGaAs/InP singlephoton avalanche diodes (SPADs). He is involved in research on arrays of silicon SPADs for 2-D and 3-D applications and on time-correlated singlephoton counting electronics. Daniel Durini (M‘09) received the B.Sc. degree in electrical and electronic engineering from the National Autonomous University of Mexico (UNAM), the M.Sc. degree in area of microelectronics from the National Institute of Astrophysics, Optics and Electronics (INAOE), Mexico, and the Ph.D. degree in area of microelectronics from the University of Duisburg-Essen in Germany in 2002, 2003, and 2009, respectively. He is currently the Head of Systems Engineering of the Central Institute of Engineering, Electronics and Analytics (ZEA-2) at the Forschungszentrum Jülich in Germany, involved with radiation detector systems. He was with the Fraunhofer IMS in Duisburg, Germany, between 2004 and 2013, where he led the developing of special CMOS process modules for high-performance photodetector devices. He received the Duisburger Sparkasse Award for outstanding Ph.D. thesis in 2009, and two Best Paper Awards. Franco Zappa (SM’07) was born in Milano, Italy, in 1965. Since 2011, he has been a full Professor of electronics at the Politecnico di Milano. His research interests include microelectronic circuitry for single-photon detectors (SPAD) and CMOS SPAD imagers, for high-sensitivity time-resolved measurements, 2-D imaging and 3-D depth ranging. He is coauthor of more than 130 papers, published in peer-reviewed journals and in conference proceedings, and nice text books on Electronic Design and Electronic Systems. He is coauthor of four international patents. In 2004, he cofounded “Micro Photon Devices” focused on the production of SPAD modules and cameras for single photon-counting and photon-timing. Since 2007, he has been a Senior Member of the IEEE. Sasha Weyers is Project Manager at the Fraunhofer Institute of Microelectronics. 
Werner Brockherde received the Diploma degree in electrical engineering from the University of Dortmund, Germany, in 1982. Since 1985, he heads a research group in CMOS image sensor and mixed-signal IC design at IMS.