An Electro-Photonic System for Accelerating Deep Neural Networks

Published: 08 September 2023

Abstract

The number of parameters in deep neural networks (DNNs) is scaling at about 5× the rate of Moore’s Law. To sustain this growth, photonic computing is a promising avenue, as it enables higher throughput than electronic hardware for the general matrix-matrix multiplication (GEMM) operations that dominate DNN workloads. However, purely photonic systems face several challenges, including the lack of photonic memory and the accumulation of noise. In this article, we present an electro-photonic accelerator, ADEPT, which leverages a photonic computing unit for performing GEMM operations, a vectorized digital electronic application-specific integrated circuit (ASIC) for performing non-GEMM operations, and SRAM arrays for storing DNN parameters and activations. In contrast to prior works on photonic DNN accelerators, we adopt a system-level perspective and show that the gains, while large, are tempered relative to prior expectations. Our goal is to encourage architects to explore photonic technology in a more pragmatic way, considering the system as a whole, to understand its general applicability in accelerating today’s DNNs. Our evaluation shows that ADEPT provides, on average, 5.73× higher throughput per watt than traditional systolic arrays in a full-system comparison, and at least 6.8× and 2.5× better throughput per watt than state-of-the-art electronic and photonic accelerators, respectively.
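To make the division of labor described above concrete, the following sketch (ours, not the authors’; every throughput and power figure is a hypothetical placeholder rather than a value from the paper) models a hybrid accelerator in which GEMM work is routed to a photonic unit and non-GEMM work to a digital unit, and shows how a throughput-per-watt number could be derived from such a model.

# Illustrative sketch only: a toy model of an electro-photonic accelerator that
# routes GEMM work to a photonic unit and non-GEMM work to a digital unit.
# All device parameters below are hypothetical placeholders, not values from the paper.

from dataclasses import dataclass

@dataclass
class Unit:
    name: str
    ops_per_sec: float   # sustained throughput (ops/s), hypothetical
    power_watts: float   # average power draw (W), hypothetical

def gemm_ops(m: int, n: int, k: int) -> float:
    """Operation count for an (m x k) @ (k x n) GEMM, counting each multiply-accumulate as two ops."""
    return 2.0 * m * n * k

def layer_time_and_energy(gemm_shapes, non_gemm_ops, photonic: Unit, digital: Unit):
    """Return (seconds, joules) for one inference pass of a toy DNN:
    GEMMs run on the photonic unit, everything else on the digital unit."""
    gemm_work = sum(gemm_ops(m, n, k) for (m, n, k) in gemm_shapes)
    t_gemm = gemm_work / photonic.ops_per_sec
    t_other = non_gemm_ops / digital.ops_per_sec
    time = t_gemm + t_other  # assumes no overlap between the two units
    energy = t_gemm * photonic.power_watts + t_other * digital.power_watts
    return time, energy

if __name__ == "__main__":
    photonic = Unit("photonic GEMM core", ops_per_sec=2e14, power_watts=40.0)  # placeholder
    digital = Unit("digital vector unit", ops_per_sec=5e12, power_watts=10.0)  # placeholder
    # One toy "layer": a 1024 x 1024 x 1024 GEMM plus 1e6 activation/normalization ops.
    t, e = layer_time_and_energy([(1024, 1024, 1024)], 1e6, photonic, digital)
    throughput_per_watt = (1.0 / t) / (e / t)  # inferences/s per watt (equivalently, inferences per joule)
    print(f"time={t:.3e} s, energy={e:.3e} J, inferences/s/W={throughput_per_watt:.3e}")

With these placeholder numbers the GEMM-heavy layer spends most of its time on the photonic unit; swapping in measured device parameters would turn this toy model into a first-order performance estimate.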


Published In

ACM Journal on Emerging Technologies in Computing Systems, Volume 19, Issue 4
October 2023, 107 pages
ISSN: 1550-4832
EISSN: 1550-4840
DOI: 10.1145/3609501
Editor: Ramesh Karri

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 September 2023
Online AM: 12 July 2023
Accepted: 26 May 2023
Revised: 25 March 2023
Received: 05 December 2022
Published in JETC Volume 19, Issue 4

Author Tags

  1. Deep learning accelerators
  2. photonic computing

Qualifiers

  • Research-article

Article Metrics

  • Downloads (last 12 months): 965
  • Downloads (last 6 weeks): 87

Reflects downloads up to 09 Nov 2024

Cited By

  • (2024) A review of emerging trends in photonic deep learning accelerators. Frontiers in Physics, 12. DOI: 10.3389/fphy.2024.1369099. Online publication date: 15-Jul-2024.
  • (2024) Photonic Cryptographic Circuits for All-Photonics Network. IEICE ESS Fundamentals Review, 18(2), 158–166. DOI: 10.1587/essfr.18.2_158. Online publication date: 1-Oct-2024.
  • (2024) Realization of an integrated coherent photonic platform for scalable matrix operations. Optica, 11(4), 542. DOI: 10.1364/OPTICA.507525. Online publication date: 18-Apr-2024.
  • (2024) Optical Neural Networks with Tensor Compression and Photonic Memory. Optical Fiber Communication Conference (OFC) 2024, paper Tu3F.5. DOI: 10.1364/OFC.2024.Tu3F.5. Online publication date: 2024.
  • (2024) Waveguide-Integrated Plasmonic Photodetectors and Activation Function Units With Phase Change Materials. IEEE Photonics Journal, 16(1), 1–10. DOI: 10.1109/JPHOT.2023.3338415. Online publication date: Feb-2024.
  • (2024) Photonic Tensor Processing Unit With Single Dataflow and Programmable High-Precision Weighting Control. Journal of Lightwave Technology, 42(2), 659–669. DOI: 10.1109/JLT.2023.3317090. Online publication date: 15-Jan-2024.
  • (2024) A Comparative Analysis of Microrings Based Incoherent Photonic GEMM Accelerators. 2024 25th International Symposium on Quality Electronic Design (ISQED), 1–8. DOI: 10.1109/ISQED60706.2024.10528781. Online publication date: 3-Apr-2024.
  • (2024) Architecture-Level Modeling of Photonic Deep Neural Network Accelerators. 2024 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 307–309. DOI: 10.1109/ISPASS61541.2024.00040. Online publication date: 5-May-2024.
  • (2024) Mirage: An RNS-Based Photonic Accelerator for DNN Training. 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA), 73–87. DOI: 10.1109/ISCA59077.2024.00016. Online publication date: 29-Jun-2024.
  • (2024) Lightening-Transformer: A Dynamically-Operated Optically-Interconnected Photonic Transformer Accelerator. 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 686–703. DOI: 10.1109/HPCA57654.2024.00059. Online publication date: 2-Mar-2024.
