DOI: 10.1145/3469116.3470010
short-paper
Open access

Benchmarking Video Object Detection Systems on Embedded Devices under Resource Contention

Published: 24 June 2021

    Abstract

    Adaptive and efficient computer vision systems have been proposed to optimize computer vision tasks, such as object classification and object detection, for embedded boards and mobile devices. These studies focus on optimizing the model (deep network) or the system itself, either by designing an efficient network architecture or by adapting the network architecture at runtime using approximation knobs, such as the image size, the type of object tracker, and the head of the object detector (e.g., lighter-weight heads such as one-shot object detectors like YOLO over two-shot object detectors like FRCNN). In this work, we benchmark different video object detection protocols, including FastAdapt, with respect to accuracy, latency, and energy consumption on three different embedded boards that represent leading-edge mobile GPUs. Our set of protocols consists of Faster R-CNN, YOLOv3, SELSA, MEGA, and REPP. Further, we characterize their performance under different levels of resource contention, specifically GPU contention, as would arise when co-located applications on these boards contend with the video object detection task. Our key insights are as follows. First, object detectors have to be coupled with trackers to keep up with the latency requirements (e.g., 30 fps). With this coupling, FastAdapt achieves up to 76 fps on the most well-resourced NVIDIA Jetson-class board, the NVIDIA AGX Xavier. Second, adaptive protocols like FastAdapt, FRCNN, and YOLO (specifically our adaptive variants, FRCNN+ and YOLO+) work well under resource constraints. Among the latest video object detection heads, SELSA achieves the highest accuracy but at a latency of over 2 sec per frame. Our energy consumption experiments show that FastAdapt, adaptive FRCNN, and adaptive YOLO are best-in-class relative to the non-adaptive protocols SELSA, MEGA, and REPP.
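
    The frame-rate figures in the abstract translate directly into per-frame latency budgets: 30 fps leaves about 33.3 ms per frame, 76 fps corresponds to about 13.2 ms, and 2 sec per frame is 0.5 fps. The sketch below works through that arithmetic and illustrates, under a simplifying assumption, why coupling a detector with a tracker can meet a budget that the detector alone cannot. It assumes a fixed schedule in which the detector runs once every `interval` frames and a cheaper tracker handles the frames in between. That schedule, the function names, and the 200 ms and 10 ms timings are hypothetical illustrations, not the configuration or measurements of FastAdapt or any other benchmarked protocol.

```python
import math
from typing import Optional


def frame_budget_ms(fps: float) -> float:
    """Per-frame latency budget (ms) needed to sustain a given frame rate."""
    return 1000.0 / fps


def amortized_latency_ms(detector_ms: float, tracker_ms: float, interval: int) -> float:
    """Mean per-frame latency when the detector runs on one frame out of
    every `interval` frames and the tracker runs on the other interval - 1."""
    if interval < 1:
        raise ValueError("interval must be >= 1")
    return (detector_ms + (interval - 1) * tracker_ms) / interval


def min_detection_interval(detector_ms: float, tracker_ms: float,
                           budget_ms: float) -> Optional[int]:
    """Smallest interval whose amortized latency fits the budget.

    Returns None when even the tracker alone is too slow for the budget.
    """
    if detector_ms <= budget_ms:
        return 1  # the detector alone keeps up on every frame
    if tracker_ms >= budget_ms:
        return None  # no interval can bring the mean under the budget
    # Solve (D + (n - 1) * T) / n <= B for the smallest integer n.
    return math.ceil((detector_ms - tracker_ms) / (budget_ms - tracker_ms))


if __name__ == "__main__":
    budget = frame_budget_ms(30)  # latency requirement quoted in the abstract
    # Hypothetical timings, chosen only to illustrate the calculation.
    detector_ms, tracker_ms = 200.0, 10.0
    n = min_detection_interval(detector_ms, tracker_ms, budget)
    print(f"budget at 30 fps: {budget:.1f} ms")
    print(f"run the detector every {n} frames: "
          f"{amortized_latency_ms(detector_ms, tracker_ms, n):.1f} ms per frame")
```

    With these illustrative timings, a detector that takes 200 ms per frame cannot reach 30 fps on its own, but running it on every ninth frame with a 10 ms tracker in between brings the mean latency under the 33.3 ms budget. A mean-latency model ignores the per-frame spike on detector frames and the accuracy lost to tracker drift, both of which a real benchmark has to measure.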

        Published In

        EMDL'21: Proceedings of the 5th International Workshop on Embedded and Mobile Deep Learning
        June 2021
        44 pages
        ISBN:9781450385978
        DOI:10.1145/3469116
        This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. Adaptive inference
        2. approximate algorithms
        3. configuration tuning
        4. context-aware analytics
        5. embedded computing
        6. energy consumption
        7. mobile GPUs
        8. mobile devices
        9. video object detection

        Qualifiers

        • Short-paper
        • Research
        • Refereed limited

        Conference

        MobiSys '21

        Cited By

        • (2023) Análise da Execução de Algoritmos de Aprendizado de Máquina em Dispositivos Embarcados. Anais do XXIV Simpósio em Sistemas Computacionais de Alto Desempenho (SSCAD 2023), 61-72. DOI: 10.5753/wscad.2023.235915. Online publication date: 17-Oct-2023.
        • (2023) Virtuoso: Energy- and Latency-aware Streamlining of Streaming Videos on Systems-on-Chips. ACM Transactions on Design Automation of Electronic Systems 28:3, 1-32. DOI: 10.1145/3564289. Online publication date: 3-Apr-2023.
        • (2023) Analytical Modelling and Performance Analysis of Resource Utilization Using Proactive Contention System in Highly Mobile Environments. 2023 International Conference On Cyber Management And Engineering (CyMaEn), 45-49. DOI: 10.1109/CyMaEn57228.2023.10050889. Online publication date: 26-Jan-2023.
        • (2023) Applications of deep learning in precision weed management. Computers and Electronics in Agriculture 206:C. DOI: 10.1016/j.compag.2023.107698. Online publication date: 1-Mar-2023.
        • (2022) Design and Implementation of a UAV-Based Airborne Computing Platform for Computer Vision and Machine Learning Applications. Sensors 22:5, 2049. DOI: 10.3390/s22052049. Online publication date: 6-Mar-2022.
        • (2022) SmartDet: Context-Aware Dynamic Control of Edge Task Offloading for Mobile Object Detection. 2022 IEEE 23rd International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM), 357-366. DOI: 10.1109/WoWMoM54355.2022.00034. Online publication date: Jun-2022.
        • (2022) Smartadapt: Multi-branch Object Detection Framework for Videos on Mobiles. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2518-2528. DOI: 10.1109/CVPR52688.2022.00256. Online publication date: Jun-2022.
        • (2021) A data-driven approach to increasing the lifetime of IoT sensor nodes. Scientific Reports 11:1. DOI: 10.1038/s41598-021-01431-y. Online publication date: 17-Nov-2021.
