End-to-end people detection in crowded scenes

R Stewart, M Andriluka, AY Ng - Proceedings of the IEEE …, 2016 - cv-foundation.org

Proceedings of the IEEE conference on computer vision and pattern …, 2016•cv-foundation.org

Abstract

Current people detectors operate either by scanning an image in a sliding window fashion or by classifying a discrete set of proposals. We propose a model that is based on decoding an image into a set of people detections. Our system takes an image as input and directly outputs a set of distinct detection hypotheses. Because we generate predictions jointly, common post-processing steps such as non-maximum suppression are unnecessary. We use a recurrent LSTM layer for sequence generation and train our model end-to-end with a new loss function that operates on sets of detections. We demonstrate the effectiveness of our approach on the challenging task of detecting people in crowded scenes.
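The abstract does not say how the loss over sets of detections is defined. As a rough illustration only, and not the authors' formulation, one way to make a loss independent of the order in which boxes are emitted is to score predictions against ground truth under the cheapest one-to-one assignment. The function names, the (x, y, w, h) box format, and the L1 distance below are all assumptions made for this sketch, and the brute-force search is only practical for a handful of boxes.

```python
from itertools import permutations


def l1_box_distance(a, b):
    """Sum of absolute coordinate differences between two (x, y, w, h) boxes."""
    return sum(abs(p - q) for p, q in zip(a, b))


def set_matching_loss(predicted, ground_truth):
    """Return (cost, assignment) for the cheapest one-to-one matching.

    assignment[g] is the index of the prediction matched to ground-truth
    box g. Hypothetical helper: brute force over permutations, so it only
    suits very small sets. Assumes at least as many predictions as
    ground-truth boxes; unmatched predictions are simply ignored here.
    """
    if len(predicted) < len(ground_truth):
        raise ValueError("need at least as many predictions as ground-truth boxes")
    best_cost, best_assignment = float("inf"), ()
    for assignment in permutations(range(len(predicted)), len(ground_truth)):
        cost = sum(
            l1_box_distance(predicted[p], ground_truth[g])
            for g, p in enumerate(assignment)
        )
        if cost < best_cost:
            best_cost, best_assignment = cost, assignment
    return best_cost, best_assignment
```

Because the cost is minimised over assignments, emitting the same boxes in a different order leaves the loss unchanged, which is the property a set-based loss needs. A practical version would replace the permutation search with a polynomial-time assignment solver and would also penalise confidence on unmatched predictions.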


Cited by 666
