Abstract
Despite recent advances, pedestrian detection remains a major challenge for computer vision in a wide range of applications such as urban autonomous driving and intelligent transportation. Deep convolutional neural networks have contributed greatly to recent progress in pedestrian detection algorithms. The aim of this paper is to apply a modified single-shot detector (SSD) approach to pedestrian detection and then improve it with a novel deep architecture. The proposed architecture extracts initial Regions of Interest (RoIs) using the SSD approach and employs nine parallel fast RCNNs based on inception modules to estimate nine different parts of the body. The method uses a secure border around each initial RoI both to create an Extended Region of Candidate Pedestrian (ERCP) and to extract multiple RoIs. It then selects, as detected pedestrians, those RoIs within the ERCP that satisfy a few reasonable criteria. We also propose a new training approach based on the estimation of different body parts, which searches for the best RoIs. Comprehensive experimental results demonstrate that the proposed method, a deep model based on parts in pedestrian proposals, is highly effective and achieves very competitive performance on the two most popular pedestrian detection datasets, Caltech-USA and INRIA. We have improved the log-average miss rate on the Caltech-USA and INRIA pedestrian datasets to 7.28% and 4.96%, respectively.
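
For readers unfamiliar with the metric quoted above, the log-average miss rate is commonly computed as the geometric mean of the miss rate sampled at nine false-positives-per-image (FPPI) values evenly spaced in log space between 10^-2 and 10^0. The following is a minimal sketch of that computation, not the evaluation code used in this paper. The function name `log_average_miss_rate`, its parameters, and the convention of assigning a miss rate of 1.0 when the curve has no point at or below a reference FPPI are assumptions made for illustration.

```python
import bisect
import math


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Geometric mean of miss rates sampled at log-spaced FPPI references.

    fppi and miss_rate are equal-length sequences describing a
    miss-rate-versus-FPPI curve (hypothetical inputs for illustration).
    """
    # Sort the curve by increasing FPPI so it can be searched with bisect.
    pairs = sorted(zip(fppi, miss_rate))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]

    # Reference FPPI values evenly spaced in log space over [1e-2, 1e0].
    refs = [10.0 ** (-2.0 + 2.0 * i / (num_points - 1)) for i in range(num_points)]

    log_sum = 0.0
    for ref in refs:
        # Take the last curve point whose FPPI does not exceed the reference.
        idx = bisect.bisect_right(xs, ref) - 1
        # Assumed convention: no such point means every pedestrian is missed.
        mr = ys[idx] if idx >= 0 else 1.0
        # Clamp to avoid log(0) for a perfect detector.
        log_sum += math.log(max(mr, 1e-10))

    return math.exp(log_sum / num_points)


if __name__ == "__main__":
    # Toy curve, not data from the paper.
    print(log_average_miss_rate([0.005, 0.5], [0.4, 0.1]))
```

Averaging in the log domain keeps a single operating point from dominating the score, which is why results such as the 7.28% and 4.96% figures are reported as one number per dataset.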
















Cite this article
Saeidi, M., Ahmadi, A. High-performance and deep pedestrian detection based on estimation of different parts. J Supercomput 77, 2033–2068 (2021). https://doi.org/10.1007/s11227-020-03345-4
DOI: https://doi.org/10.1007/s11227-020-03345-4