PoseHD: boosting human detectors using human pose information

Authors:

Cewu Lu

AAAI'18/IAAI'18/EAAI'18: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence

Article No.: 880, Pages 7186 - 7193

Published: 02 February 2018

Abstract

Most recently proposed human detection methods already achieve sufficiently high recall within a reasonable number of proposals, so this paper focuses on improving the precision of human detectors. Two main challenges limit precision: i) hard background instances and ii) redundant partial proposals. To address them, we propose PoseHD, a top-down, pose-based framework that can be built on top of an arbitrary state-of-the-art human detector. PoseHD first applies human pose estimation (in a batch manner) and classifies the resulting pose heatmaps with a convolutional neural network, which eliminates hard negatives by extracting more detailed structural information. It then applies pose-based proposal clustering and reranking modules, which filter redundant partial proposals by jointly considering holistic and part information. Experimental results on multiple pedestrian benchmark datasets show that PoseHD generally improves the overall performance of recent state-of-the-art human detectors (by 2-4% in both mAP and MR metrics). Moreover, PoseHD can be easily extended to object detection given large-scale object part annotations. Finally, we present an extensive ablation analysis that compares our approach with traditional bottom-up pose-based models and highlights the importance of our framework design decisions.
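
The two-stage idea in the abstract (discard hard negatives using pose evidence, then collapse redundant partial proposals that belong to the same person) can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no formulas, so the `Proposal` fields, the `pose_score` threshold, the use of a tight keypoint box (`pose_box`) for clustering, and the reranking score `det_score * IoU(box, pose_box)` are all assumptions chosen only to make the flow concrete.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Proposal:
    box: Box           # proposal from the base human detector
    det_score: float   # confidence from the base detector
    pose_score: float  # hypothetical pose-heatmap classifier output in [0, 1]
    pose_box: Box      # hypothetical tight box around the estimated keypoints


def area(b: Box) -> float:
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def iou(a: Box, b: Box) -> float:
    inter = area((max(a[0], b[0]), max(a[1], b[1]),
                  min(a[2], b[2]), min(a[3], b[3])))
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def drop_hard_negatives(proposals: List[Proposal],
                        min_pose_score: float = 0.5) -> List[Proposal]:
    # Assumed rule: a proposal with weak pose evidence is treated as background.
    return [p for p in proposals if p.pose_score >= min_pose_score]


def cluster_by_pose(proposals: List[Proposal],
                    min_pose_iou: float = 0.5) -> List[List[Proposal]]:
    # Greedy grouping: proposals whose estimated poses overlap strongly are
    # assumed to cover the same person, even if the boxes themselves differ.
    clusters: List[List[Proposal]] = []
    for p in sorted(proposals, key=lambda q: q.det_score, reverse=True):
        for cluster in clusters:
            if iou(p.pose_box, cluster[0].pose_box) >= min_pose_iou:
                cluster.append(p)
                break
        else:
            clusters.append([p])
    return clusters


def rerank(cluster: List[Proposal]) -> Proposal:
    # Assumed score: favor the box that best matches the whole estimated pose,
    # so a confident but partial box loses to a holistic one.
    return max(cluster, key=lambda p: p.det_score * iou(p.box, p.pose_box))


def refine(proposals: List[Proposal]) -> List[Proposal]:
    kept = drop_hard_negatives(proposals)
    return [rerank(c) for c in cluster_by_pose(kept)]


full = Proposal((0, 0, 10, 30), 0.8, 0.9, (1, 1, 9, 29))
upper = Proposal((0, 0, 10, 15), 0.9, 0.9, (1, 1, 9, 29))
background = Proposal((50, 50, 60, 80), 0.7, 0.1, (50, 50, 60, 80))
result = refine([upper, full, background])
print([p.box for p in result])  # [(0, 0, 10, 30)]
```

In this toy input the upper-body box has the higher detector score, yet it overlaps the full-body box too little for plain box-level suppression at a typical threshold to merge them. Grouping by pose places both in one cluster, and the reranking step keeps the full-body box.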

Index Terms

PoseHD: boosting human detectors using human pose information
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object detection
        Object recognition
      2. Computer vision tasks
  2. Machine learning

Index terms have been assigned to the content through auto-classification.

Published In

AAAI'18/IAAI'18/EAAI'18: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence

February 2018

8223 pages

ISBN:978-1-57735-800-8

Copyright © 2018 Association for the Advancement of Artificial Intelligence.

Sponsors

Association for the Advancement of Artificial Intelligence

Publisher

AAAI Press

Qualifiers

Research-article
Research
Refereed limited
