DOI: 10.1145/3576842.3582366
research-article
Open access

Eagle: End-to-end Deep Reinforcement Learning based Autonomous Control of PTZ Cameras

Published: 09 May 2023

Abstract

Existing approaches for autonomous control of pan-tilt-zoom (PTZ) cameras use multiple stages, in which object detection and localization are performed separately from control of the PTZ mechanisms. These approaches require manual labels and suffer from performance bottlenecks caused by error propagation across the multi-stage flow of information. The large size of object detection neural networks also makes prior solutions infeasible for real-time deployment on resource-constrained devices. We present Eagle, an end-to-end deep reinforcement learning (RL) solution that trains a neural network policy to control the PTZ camera directly from input images. Training RL policies in the real world is cumbersome because of labeling effort, runtime environment stochasticity, and fragile experimental setups. We therefore introduce a photo-realistic simulation framework for training and evaluating PTZ camera control policies. Eagle achieves superior camera control performance by keeping the object of interest close to the center of captured images at high resolution, and it tracks the object for up to 17% longer than the state of the art. Eagle policies are lightweight (90x fewer parameters than Yolo5s) and can run on embedded camera platforms such as Raspberry Pi (33 FPS) and Jetson Nano (38 FPS), facilitating real-time PTZ tracking in resource-constrained environments. With domain randomization, Eagle policies trained in our simulator can be transferred directly to real-world scenarios.
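
The tracking objective described in the abstract (object near the image center, at high resolution) can be illustrated with a toy per-frame reward. The sketch below is not Eagle's reward function, which this page does not specify. The function name `tracking_reward`, the normalized bounding-box input, and the `target_area` parameter are all hypothetical, and the sketch assumes a simulator that can report the object's ground-truth bounding box.

```python
import math


def tracking_reward(box, target_area=0.25):
    """Toy per-frame reward (hypothetical, not the paper's definition).

    `box` is (x_min, y_min, x_max, y_max) in normalized [0, 1] image
    coordinates, or None when the object is outside the field of view.
    `target_area` is an assumed desired fraction of the frame covered
    by the object; it must be greater than zero.
    """
    if box is None:
        # Object lost: no reward for this frame.
        return 0.0

    x_min, y_min, x_max, y_max = box
    cx = (x_min + x_max) / 2
    cy = (y_min + y_max) / 2

    # Distance of the box center from the image center, scaled so that
    # an image corner maps to 1.0.
    dist = math.hypot(cx - 0.5, cy - 0.5) / math.hypot(0.5, 0.5)
    centering = 1.0 - dist

    # Zoom term: 1.0 when the box covers exactly target_area of the
    # frame, falling toward 0.0 when it is much smaller or larger.
    area = (x_max - x_min) * (y_max - y_min)
    size = min(area, target_area) / max(area, target_area)

    return centering * size
```

Under these assumptions, a centered object at the target size scores close to 1.0, while a small or off-center object scores lower, so a policy maximizing the summed reward is pushed to pan, tilt, and zoom toward the object.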


Cited By

  • (2024) Active Visual Perception Enhancement Method Based on Deep Reinforcement Learning. Electronics 13:9 (1654). DOI: 10.3390/electronics13091654. Online publication date: 25-Apr-2024
  • (2024) Multiagent Reinforcement Learning and Game-Theoretic Optimization for Autonomous Sensor Control. 2024 IEEE Aerospace Conference (1-12). DOI: 10.1109/AERO58975.2024.10521284. Online publication date: 2-Mar-2024
  • (2023) Impact of Delays and Computation Placement on Sense-Act Application Performance in IoT. MILCOM 2023 - 2023 IEEE Military Communications Conference (MILCOM) (133-138). DOI: 10.1109/MILCOM58377.2023.10356219. Online publication date: 30-Oct-2023

    Published In

    IoTDI '23: Proceedings of the 8th ACM/IEEE Conference on Internet of Things Design and Implementation
    May 2023
    514 pages
    ISBN: 9798400700378
    DOI: 10.1145/3576842
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. deep reinforcement learning
    2. edge AI
    3. end-to-end control
    4. pan-tilt-zoom cameras
    5. simulation-to-reality transfer

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • Army Research Laboratory (ARL)
    • Air Force Office of Scientific Research (AFOSR)
    • Semiconductor Research Corporation (SRC) and DARPA
    • National Science Foundation (NSF)

    Conference

    IoTDI '23