
Rhythmic pixel regions: multi-resolution visual sensing system towards high-precision visual computing at low power

Published: 17 April 2021

Abstract

High spatiotemporal resolution can offer high precision for vision applications, which is particularly useful for capturing the nuances of visual features, such as in augmented reality. Unfortunately, capturing and processing high spatiotemporal visual frames generates energy-expensive memory traffic. Low-resolution frames, on the other hand, reduce pixel memory throughput, but also forgo opportunities for high-precision visual sensing. Our intuition, however, is that not all parts of the scene need to be captured at a uniform resolution. Selectively and opportunistically reducing resolution for different regions of image frames can yield high-precision visual computing at energy-efficient memory data rates.
To this end, we develop a visual sensing pipeline architecture that flexibly allows application developers to dynamically adapt the spatial resolution and update rate of different "rhythmic pixel regions" in the scene. We develop a system that ingests pixel streams from commercial image sensors with their standard raster-scan pixel read-out patterns, but only encodes relevant pixels prior to storing them in the memory. We also present streaming hardware to decode the stored rhythmic pixel region stream into traditional frame-based representations to feed into standard computer vision algorithms. We integrate our encoding and decoding hardware modules into existing video pipelines. On top of this, we develop runtime support allowing developers to flexibly specify the region labels. Evaluating our system on a Xilinx FPGA platform over three vision workloads shows 43-64% reduction in interface traffic and memory footprint, while providing controllable task accuracy.
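The core idea described above, namely that each region of the frame carries its own spatial resolution (subsampling stride) and temporal update rate, and that only those pixels are encoded into memory and later decoded back into dense frames, can be sketched in a few lines. The `Region` fields, packet format, and nearest-neighbour upsampling below are illustrative assumptions for exposition, not the paper's actual hardware encoder interface:

```python
import numpy as np

class Region:
    """A 'rhythmic pixel region': a bounding box with its own
    spatial stride and temporal update period (illustrative only)."""
    def __init__(self, x, y, w, h, stride, period):
        self.x, self.y, self.w, self.h = x, y, w, h
        self.stride = stride    # keep every stride-th pixel in x and y
        self.period = period    # re-capture region every `period` frames

def encode(frame, regions, frame_idx, cache):
    """Keep only pixels of each region, at its chosen stride; regions
    not due for update this frame reuse their last stored patch."""
    packets = []
    for i, r in enumerate(regions):
        if frame_idx % r.period == 0:           # region due for update
            patch = frame[r.y:r.y+r.h:r.stride, r.x:r.x+r.w:r.stride]
            cache[i] = patch                    # emulate the stored stream
        packets.append((i, cache[i]))
    return packets

def decode(packets, regions, shape):
    """Reconstruct a dense frame for standard vision algorithms by
    nearest-neighbour upsampling each region back to full resolution."""
    out = np.zeros(shape, dtype=np.uint8)
    for i, patch in packets:
        r = regions[i]
        up = np.repeat(np.repeat(patch, r.stride, axis=0), r.stride, axis=1)
        out[r.y:r.y+r.h, r.x:r.x+r.w] = up[:r.h, :r.w]
    return out

regions = [Region(0, 0, 32, 32, stride=1, period=1),   # high-detail, every frame
           Region(32, 0, 32, 32, stride=4, period=4)]  # coarse, every 4th frame
cache = {}
frame = np.random.randint(0, 255, (32, 64), dtype=np.uint8)
pkts = encode(frame, regions, 0, cache)
dense = decode(pkts, regions, frame.shape)
kept = sum(p.size for _, p in pkts)
print(kept, frame.size)  # → 1088 2048: roughly half the pixel traffic
```

The savings the paper reports come from exactly this asymmetry: the coarse region contributes 64 stored pixels instead of 1024, and skipped frames contribute none, while the decoder still hands downstream algorithms a frame of the original dimensions.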



    Published In

    ASPLOS '21: Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems
    April 2021, 1090 pages
    ISBN: 9781450383172
    DOI: 10.1145/3445814

    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. augmented reality
    2. pixel discard
    3. visual computing

    Qualifiers

    • Research-article

    Conference

    ASPLOS '21
    Overall Acceptance Rate: 535 of 2,713 submissions, 20%


    Cited By

    • Reaching the Edge of the Edge: Image Analysis in Space. Proceedings of the Eighth Workshop on Data Management for End-to-End Machine Learning (Jun 2024), 29-38. DOI: 10.1145/3650203.3663330
    • BlissCam: Boosting Eye Tracking Efficiency with Learned In-Sensor Sparse Sampling. 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (Jun 2024), 1262-1277. DOI: 10.1109/ISCA59077.2024.00094
    • SMG: A System-Level Modality Gating Facility for Fast and Energy-Efficient Multimodal Computing. 2023 IEEE Real-Time Systems Symposium (RTSS) (Dec 2023), 291-303. DOI: 10.1109/RTSS59052.2023.00033
    • Software-Defined Imaging: A Survey. Proceedings of the IEEE 111:5 (May 2023), 445-464. DOI: 10.1109/JPROC.2023.3266736
    • Invited Paper: Learned In-Sensor Visual Computing: From Compression to Eventification. 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD) (Oct 2023), 1-9. DOI: 10.1109/ICCAD57390.2023.10323842
    • Energy-Efficient Object Tracking Using Adaptive ROI Subsampling and Deep Reinforcement Learning. IEEE Access 11 (2023), 41995-42011. DOI: 10.1109/ACCESS.2023.3270776
    • Real-Time Gaze Tracking with Event-Driven Eye Segmentation. 2022 IEEE Conference on Virtual Reality and 3D User Interfaces (VR) (Mar 2022), 399-408. DOI: 10.1109/VR51125.2022.00059
    • Adaptive Subsampling for ROI-Based Visual Tracking: Algorithms and FPGA Implementation. IEEE Access 10 (2022), 90507-90522. DOI: 10.1109/ACCESS.2022.3200755
    • ILLIXR: Enabling End-to-End Extended Reality Research. 2021 IEEE International Symposium on Workload Characterization (IISWC) (Nov 2021), 24-38. DOI: 10.1109/IISWC53511.2021.00014
