Abstract
Computer vision is now widely used for many purposes, e.g. people counting and tracking. To run such an application in real time, a relatively complex algorithm must process data-intensive video streams to identify people in a visual scene. Although such algorithms frequently run on powerful servers in the Cloud, they often have to run on local commodity computers with limited capacity. In this work we used the Multi-Camera Multi-Target algorithm of the recent OpenVINO™ toolkit to detect and track people in small retail stores. We ran the algorithm on a common personal computer and analyzed how its performance varied across a set of relevant scenarios and algorithm configurations, providing insights into how these affect the algorithm's performance and computational cost. In the tested scenarios, the most influential factor was the number of people in the scene. The observed average frame processing time varied around 200 ms.
C. Pereira—This work was developed while this author was at NOS Comunicações.
Acknowledgments
This work was financially supported by: Base Funding - UIDB/04234/2020 of the Research Centre in Real-Time and Embedded Computing Systems - CISTER - funded by national funds through the FCT/MCTES (PIDDAC).
Copyright information
© 2021 IFIP International Federation for Information Processing
About this paper
Cite this paper
Ramos, M., Pereira, C., Almeida, L. (2021). A First Sensitivity Study of Multi-object Multi-camera Tracking Performance. In: Maglogiannis, I., Macintyre, J., Iliadis, L. (eds) Artificial Intelligence Applications and Innovations. AIAI 2021 IFIP WG 12.5 International Workshops. AIAI 2021. IFIP Advances in Information and Communication Technology, vol 628. Springer, Cham. https://doi.org/10.1007/978-3-030-79157-5_22
DOI: https://doi.org/10.1007/978-3-030-79157-5_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-79156-8
Online ISBN: 978-3-030-79157-5
eBook Packages: Computer Science, Computer Science (R0)