
Rhythmic pixel regions: multi-resolution visual sensing system towards high-precision visual computing at low power

Published: 17 April 2021

Abstract

High spatiotemporal resolution can offer high precision for vision applications, which is particularly useful for capturing the nuances of visual features, such as in augmented reality. Unfortunately, capturing and processing high spatiotemporal visual frames generates energy-expensive memory traffic. Low-resolution frames, on the other hand, reduce pixel memory throughput, but also forgo opportunities for high-precision visual sensing. Our intuition, however, is that not all parts of the scene need to be captured at a uniform resolution. Selectively and opportunistically reducing resolution for different regions of image frames can yield high-precision visual computing at energy-efficient memory data rates.
To this end, we develop a visual sensing pipeline architecture that flexibly allows application developers to dynamically adapt the spatial resolution and update rate of different "rhythmic pixel regions" in the scene. We develop a system that ingests pixel streams from commercial image sensors with their standard raster-scan pixel read-out patterns, but only encodes relevant pixels prior to storing them in the memory. We also present streaming hardware to decode the stored rhythmic pixel region stream into traditional frame-based representations to feed into standard computer vision algorithms. We integrate our encoding and decoding hardware modules into existing video pipelines. On top of this, we develop runtime support allowing developers to flexibly specify the region labels. Evaluating our system on a Xilinx FPGA platform over three vision workloads shows 43-64% reduction in interface traffic and memory footprint, while providing controllable task accuracy.
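The core idea described above, namely that each region of the frame carries its own spatial resolution (subsampling stride) and temporal update rate, and that only those pixels are encoded into memory and later decoded back into dense frames, can be sketched in a few lines. The `Region` fields, packet format, and nearest-neighbour upsampling below are illustrative assumptions for exposition, not the paper's actual hardware encoder interface:

```python
import numpy as np

class Region:
    """A 'rhythmic pixel region': a bounding box with its own
    spatial stride and temporal update period (illustrative only)."""
    def __init__(self, x, y, w, h, stride, period):
        self.x, self.y, self.w, self.h = x, y, w, h
        self.stride = stride    # keep every stride-th pixel in x and y
        self.period = period    # re-capture region every `period` frames

def encode(frame, regions, frame_idx, cache):
    """Keep only pixels of each region, at its chosen stride; regions
    not due for update this frame reuse their last stored patch."""
    packets = []
    for i, r in enumerate(regions):
        if frame_idx % r.period == 0:           # region due for update
            patch = frame[r.y:r.y+r.h:r.stride, r.x:r.x+r.w:r.stride]
            cache[i] = patch                    # emulate the stored stream
        packets.append((i, cache[i]))
    return packets

def decode(packets, regions, shape):
    """Reconstruct a dense frame for standard vision algorithms by
    nearest-neighbour upsampling each region back to full resolution."""
    out = np.zeros(shape, dtype=np.uint8)
    for i, patch in packets:
        r = regions[i]
        up = np.repeat(np.repeat(patch, r.stride, axis=0), r.stride, axis=1)
        out[r.y:r.y+r.h, r.x:r.x+r.w] = up[:r.h, :r.w]
    return out

regions = [Region(0, 0, 32, 32, stride=1, period=1),   # high-detail, every frame
           Region(32, 0, 32, 32, stride=4, period=4)]  # coarse, every 4th frame
cache = {}
frame = np.random.randint(0, 255, (32, 64), dtype=np.uint8)
pkts = encode(frame, regions, 0, cache)
dense = decode(pkts, regions, frame.shape)
kept = sum(p.size for _, p in pkts)
print(kept, frame.size)  # → 1088 2048: roughly half the pixel traffic
```

The savings the paper reports come from exactly this asymmetry: the coarse region contributes 64 stored pixels instead of 1024, and skipped frames contribute none, while the decoder still hands downstream algorithms a frame of the original dimensions.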



    Published In

    ASPLOS '21: Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems
    April 2021, 1090 pages
    ISBN: 9781450383172
    DOI: 10.1145/3445814

    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. augmented reality
    2. pixel discard
    3. visual computing

    Qualifiers

    • Research-article

    Conference

    ASPLOS '21
    Overall Acceptance Rate: 535 of 2,713 submissions, 20%


    Cited By

    • Reaching the Edge of the Edge: Image Analysis in Space. Proceedings of the Eighth Workshop on Data Management for End-to-End Machine Learning (Jun 2024), 29-38. DOI: 10.1145/3650203.3663330
    • BlissCam: Boosting Eye Tracking Efficiency with Learned In-Sensor Sparse Sampling. 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (Jun 2024), 1262-1277. DOI: 10.1109/ISCA59077.2024.00094
    • SMG: A System-Level Modality Gating Facility for Fast and Energy-Efficient Multimodal Computing. 2023 IEEE Real-Time Systems Symposium (RTSS) (Dec 2023), 291-303. DOI: 10.1109/RTSS59052.2023.00033
    • Software-Defined Imaging: A Survey. Proceedings of the IEEE 111:5 (May 2023), 445-464. DOI: 10.1109/JPROC.2023.3266736
    • Invited Paper: Learned In-Sensor Visual Computing: From Compression to Eventification. 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD) (Oct 2023), 1-9. DOI: 10.1109/ICCAD57390.2023.10323842
    • Energy-Efficient Object Tracking Using Adaptive ROI Subsampling and Deep Reinforcement Learning. IEEE Access 11 (2023), 41995-42011. DOI: 10.1109/ACCESS.2023.3270776
    • Real-Time Gaze Tracking with Event-Driven Eye Segmentation. 2022 IEEE Conference on Virtual Reality and 3D User Interfaces (VR) (Mar 2022), 399-408. DOI: 10.1109/VR51125.2022.00059
    • Adaptive Subsampling for ROI-Based Visual Tracking: Algorithms and FPGA Implementation. IEEE Access 10 (2022), 90507-90522. DOI: 10.1109/ACCESS.2022.3200755
    • ILLIXR: Enabling End-to-End Extended Reality Research. 2021 IEEE International Symposium on Workload Characterization (IISWC) (Nov 2021), 24-38. DOI: 10.1109/IISWC53511.2021.00014
