This paper describes a pedestrian detection system that integrates image intensity information with motion information. We use a detection style algorithm that scans a detector over two consecutive frames of a video sequence. The detector is trained (using AdaBoost)
We present a new algorithm to detect pedestrians in still images utilizing covariance matrices as object descriptors. Since the descriptors do not form a vector space, well-known machine learning techniques are not well suited to learn the classifiers. The space of d-dimensional nonsingular covariance matrices can be represented as a connected Riemannian manifold. The main contribution of the paper is a novel approach for classifying points lying on a connected Riemannian manifold using the geometry of the space. The algorithm is tested on the INRIA and DaimlerChrysler pedestrian data sets where superior detection rates are observed over the previous approaches.
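To make the idea of a covariance matrix as an object descriptor concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical per-pixel feature vector of (x, y, intensity); the abstract does not say which features the paper uses, and the Riemannian classification step is not shown.

```python
def pixel_features(image):
    """Hypothetical feature map: one (x, y, intensity) vector per pixel."""
    return [
        (float(x), float(y), float(value))
        for y, row in enumerate(image)
        for x, value in enumerate(row)
    ]


def region_covariance(features):
    """Return the d x d sample covariance matrix of d-dimensional feature vectors."""
    n = len(features)
    if n < 2:
        raise ValueError("need at least two feature vectors")
    d = len(features[0])
    mean = [sum(f[k] for f in features) / n for k in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for f in features:
        diff = [f[k] - mean[k] for k in range(d)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j]
    # Unbiased estimate: divide by n - 1.
    return [[c / (n - 1) for c in row] for row in cov]


# A tiny 2 x 2 grayscale patch used purely for illustration.
patch = [[1, 2], [3, 4]]
C = region_covariance(pixel_features(patch))
```

The result is a small symmetric matrix whose size depends only on the number of features, not on the region size, which is what makes it usable as a fixed-size descriptor for windows of different sizes.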
Visual object analysis researchers are increasingly experimenting with video, because it is expected that motion cues should help with detection, recognition, and other analysis tasks. This paper presents the Cambridge-driving Labeled Video Database (CamVid) as the first collection of videos with object class semantic labels, complete with metadata. The database provides ground truth labels that associate each pixel with one of 32 semantic classes. The database addresses the need for experimental data to quantitatively evaluate emerging algorithms. While most videos are filmed with fixed-position CCTV-style cameras, our data was captured from the perspective of a driving automobile. The driving scenario increases the number and heterogeneity of the observed object classes. Over 10 min of high quality 30 Hz footage is being provided, with corresponding semantically labeled images at 1 Hz and in part, 15 Hz. The CamVid Database offers four contributions that are relevant to object analysis researchers. First, the per-pixel semantic segmentation of over 700 images was specified manually, and was then inspected and confirmed by a second person for accuracy. Second, the high-quality and large resolution color video images in the database represent valuable extended duration digitized footage to those interested in driving scenarios or ego-motion. Third, we filmed calibration sequences for the camera color response and intrinsics, and computed a 3D camera pose for each frame in the sequences. Finally, in support of expanding this or other databases, we present custom-made labeling software for assisting users who wish to paint precise class-labels for other images and videos. We evaluate the relevance of the database by measuring the performance of an algorithm from each of three distinct domains: multi-class object recognition, pedestrian detection, and label propagation.
We describe the functional and architectural breakdown of a monocular pedestrian detection system. We describe in detail our approach for single-frame classification based on a novel scheme of breaking down the class variability by repeatedly training a set of relatively simple classifiers on clusters of the training set. Single-frame classification performance results and system level performance figures for daytime conditions are presented with a discussion about the remaining gap to meet a daytime normal weather condition production system.
This paper describes the recent research on the enhancement of pedestrian safety to help develop a better understanding of the nature, issues, approaches, and challenges surrounding the problem. It presents a comprehensive review of research efforts underway dealing with pedestrian safety and collision avoidance. The importance of pedestrian protection is emphasized in a global context, discussing the research programs and efforts in various countries. Pedestrian safety measures, including infrastructure enhancements and passive safety features in vehicles, are described, followed by a systematic description of active safety systems based on pedestrian detection using sensors in vehicle and infrastructure. The pedestrian detection approaches are classified according to various criteria such as the type and configuration of sensors, as well as the video cues and classifiers used in detection algorithms. It is noted that collision avoidance not only requires detection of pedestrians but also requires collision prediction using pedestrian dynamics and behavior analysis. Hence, this paper includes research dealing with probabilistic modeling of pedestrian behavior for predicting collisions between pedestrians and vehicles.
Pedestrian detection is essential to avoid dangerous traffic situations. In this paper, we present a fast and robust algorithm for detecting pedestrians in a cluttered scene from a pair of moving cameras. This is achieved through stereo-based segmentation and neural network-based recognition. The algorithm includes three steps. First, we segment the image into sub-image object candidates using disparity discontinuities. Second, we merge and split the sub-image object candidates into sub-images that satisfy pedestrian size and shape constraints. Third, we use intensity gradients of the candidate sub-images as input to a trained neural network for pedestrian recognition. The experiments on a large number of urban street scenes demonstrate that the proposed algorithm: 1) can detect pedestrians in various poses, shapes, sizes, clothing, and occlusion status; 2) runs in real-time; and 3) is robust to illumination and background changes.
In this paper, we address the problem of multiperson tracking in busy pedestrian zones using a stereo rig mounted on a mobile platform. The complexity of the problem calls for an integrated solution that extracts as much visual information as possible and combines it through cognitive feedback cycles. We propose such an approach, which jointly estimates camera position, stereo depth, object detection, and tracking. The interplay between those components is represented by a graphical model. Since the model has to incorporate object-object interactions and temporal links to past frames, direct inference is intractable. We, therefore, propose a two-stage procedure: for each frame, we first solve a simplified version of the model (disregarding interactions and temporal continuity) to estimate the scene geometry and an overcomplete set of object detections. Conditioned on these results, we then address object interactions, tracking, and prediction in a second step. The approach is experimentally evaluated on several long and difficult video sequences from busy inner-city locations. Our results show that the proposed integration makes it possible to deliver robust tracking performance in scenes of realistic complexity.
The last few decades witnessed the birth and growth of a new sensibility to transportation efficiency. In particular, the need for efficient and improved mobility of people and goods pushed researchers to address the problem of intelligent transportation systems. This paper surveys the most advanced approaches to the (partial) customization of the road-following task, using on-board systems based on artificial vision. The functionalities of lane detection, obstacle detection and pedestrian detection are described and classified, and their possible application on future road vehicles is discussed.
This work proposes a novel classifier-fusion scheme using learning algorithms, i.e. syntactic models, instead of the usual Bayesian or heuristic rules. Moreover, this paper complements the previous comparative studies on DaimlerChrysler Automotive Dataset, offering a set of complementary experiments using feature extractor and classifier combinations. The experimental results provide evidence of the effectiveness of our methods regarding false positive rate, AUC, and accuracy, which reached 96.67%.
A perception system for pedestrian detection in urban scenarios using information from a LIDAR and a single camera is presented. Two sensor fusion architectures are described, a centralized and a decentralized one. In the former, the fusion process occurs at the feature level, i.e., features from LIDAR and vision spaces are combined in a single vector for posterior classification using a single classifier. In the latter, two classifiers are employed, one per sensor-feature space, which were selected offline based on information theory and fused by a trainable fusion method applied over the likelihoods provided by the component classifiers. The proposed schemes for sensor combination, and more specifically the trainable fusion method, lead to enhanced detection performance and, in addition, keep false alarms under tolerable values in comparison with single-based classifiers. Experimental results highlight the performance and effectiveness of the proposed pedestrian detection system and the related sensor data combination strategies.
This paper describes a system for pedestrian detection in infrared images, which has been implemented on an experimental vehicle equipped with an infrared camera. The proposed system has been tested in many situations and has proven to be efficient, with a very low false-positive rate. It is based on a multiresolution localization of warm symmetrical objects with specific size and aspect ratio; however, because road infrastructures and other road participants may also have such characteristics, a set of matched filters is included in order to reduce false detections. A final validation process, based on the human shape's morphological characteristics, is used to build the list of pedestrians appearing in the scene. Neither temporal correlation nor motion cues are used in this first part of the project: the processing is based on the analysis of single frames only.
Collision avoidance is one of the most difficult and challenging automatic driving operations in the domain of intelligent vehicles. In emergency situations, human drivers are more likely to brake than to steer, although the optimal maneuver would, more frequently, be steering alone. This statement suggests the use of automatic steering as a promising solution to avoid accidents in the future. The objective of this paper is to provide a collision avoidance system (CAS) for autonomous vehicles, focusing on pedestrian collision avoidance. The detection component involves a stereo-vision-based pedestrian detection system that provides suitable measurements of the time to collision. The collision avoidance maneuver is performed using fuzzy controllers for the actuators that mimic human behavior and reactions, along with a high-precision Global Positioning System (GPS), which provides the information needed for the autonomous navigation. The proposed system is evaluated in two steps. First, drivers' behavior and sensor accuracy are studied in experiments carried out by manual driving. This study will be used to define the parameters of the second step, in which automatic pedestrian collision avoidance is carried out at speeds of up to 30 km/h. The performed field tests provided encouraging results and proved the viability of the proposed approach.
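As a rough illustration of what a time-to-collision measurement involves, here is a minimal sketch under a constant-closing-speed assumption. The function name, the two-sample range input, and the units are hypothetical; the abstract does not describe how the paper's stereo system actually computes this quantity.

```python
import math


def time_to_collision(prev_range_m, curr_range_m, dt_s):
    """Estimate time to collision (seconds) from two successive range readings.

    Assumes the closing speed stays constant; returns math.inf when the
    pedestrian is not getting closer.
    """
    if dt_s <= 0:
        raise ValueError("dt_s must be positive")
    closing_speed = (prev_range_m - curr_range_m) / dt_s  # metres per second
    if closing_speed <= 0:
        return math.inf
    return curr_range_m / closing_speed


# Range drops from 20 m to 19 m in 0.1 s, i.e. a closing speed of 10 m/s.
ttc = time_to_collision(20.0, 19.0, 0.1)
```

In practice a single pair of range readings is noisy, so a real system would likely filter the range estimates over several frames before dividing.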
This paper describes an approach for pedestrian detection in infrared images. The developed system has been implemented on an experimental vehicle equipped with an infrared camera and preliminarily tested in different situations.
This article presents a stereo system for the detection of pedestrians using far-infrared cameras. Since pedestrian detection in far-infrared images can be difficult in some environmental conditions, the system exploits three different detection approaches: warm area detection, edge-based detection, and disparity computation. A final validation process is performed using head morphological and thermal characteristics. Currently, neither temporal correlation nor motion cues are used in this processing.
This paper details the filtering subsystem of a tetravision-based pedestrian detection system. The complete system is based on the use of both visible and far-infrared cameras; in an initial phase it produces a list of areas of attention in the images that may contain pedestrians. This list is further refined using symmetry-based assumptions. The result is then fed to a number of independent validators that evaluate the presence of human shapes inside the areas of attention.
This paper presents an analysis of color-, infrared-, and multimodal-stereo approaches to pedestrian detection. We design a four-camera experimental testbed consisting of two color and two infrared cameras for capturing and analyzing various configuration permutations for pedestrian detection. We incorporate this four-camera system in a test vehicle and conduct comparative experiments of stereo-based approaches to obstacle detection using unimodal color and infrared imageries. A detailed analysis of the color and infrared features used to classify detected obstacles into pedestrian regions is used to motivate the development of a multimodal solution to pedestrian detection. We propose a multimodal trifocal framework consisting of a stereo pair of color cameras coupled with an infrared camera. We use this framework to combine multimodal-image features for pedestrian detection and to demonstrate that the detection performance is significantly higher when color, disparity, and infrared features are used together. This result motivates experiments and discussion toward achieving multimodal-feature combination using a single color and a single infrared camera arranged in a cross-spectral stereo pair. We demonstrate an approach to registering multiple objects across modalities and provide an experimental analysis that highlights issues and challenges of pursuing the cross-spectral approach to multimodal and multiperspective pedestrian analysis.
In this paper we describe a fully integrated system for detecting, localizing, and tracking pedestrians from a moving vehicle. The system can reliably detect upright pedestrians to a range of 40 m in lightly cluttered urban environments. The system uses range data from stereo vision to segment the scene into regions of interest, from which shape features are extracted and used to classify pedestrians. The regions are tracked using shape and appearance features. Tracking is used to temporally filter classifications to improve performance and to estimate the velocity of pedestrians for use in path planning. The end-to-end system runs at 5 Hz on 1024 × 768 imagery using a standard 2.4 GHz Intel Core 2 Quad processor, and has been integrated and tested on multiple ground vehicles and environments. We show performance on a diverse set of datasets with ground truth in outdoor environments with varying degrees of pedestrian density and clutter. In highly cluttered urban environments, the detection rates are on a par with state-of-the-art but significantly slower systems.
This paper describes a system for pedestrian detection in stereo infrared images. The system is based on three different underlying approaches: warm area detection, edge-based detection, and v-disparity computation. Stereo is also used for computing the distance and size of detected objects. A final validation process is performed using head morphological and thermal characteristics. Neither temporal correlation nor motion cues are used in this processing.
During the next decade, on-board pedestrian detection systems will play a key role in the challenge of increasing traffic safety. The main target of these systems, to detect pedestrians in urban scenarios, implies overcoming difficulties like processing outdoor scenes from a mobile platform and searching for aspect-changing objects in cluttered environments. This requires such systems to combine state-of-the-art computer vision techniques. In this paper we present a three-module system based on both 2D and 3D cues. The first module uses 3D information to estimate the road plane parameters and thus select a coherent set of regions of interest (ROIs) to be further analyzed. The second module uses Real AdaBoost and a combined set of Haar wavelets and edge orientation histograms to classify the incoming ROIs as pedestrian or non-pedestrian. The final module loops again with the 3D cue in order to verify the classified ROIs and with the 2D cue in order to refine the final results. According to the results, the integration of the proposed techniques gives rise to a promising system.
This paper describes an approach for pedestrian detection in stereo infrared images. The developed system has been implemented on an experimental vehicle equipped with two infrared cameras and preliminarily tested in different situations. It is based on the localization and distance estimation of warm areas in the scene; the algorithm groups areas with similar position and considers only results with specific size and aspect ratio. A final validation process, based on the head shape's morphological and thermal characteristics, is used to build the list of potential pedestrians appearing in the scene. Neither temporal correlation nor motion cues are used in this processing.
Reliable detection and classification of vulnerable road users constitute a critical issue for safety/protection systems in intelligent vehicles driving in urban zones. In this area, most perception systems have LIDAR and/or radar as primary detection modules and vision-based systems for object classification. This work, on the other hand, presents an analysis of pedestrian detection in urban scenarios using exclusively LIDAR-based features. The aim is to explore how much information can be extracted from LIDAR sensors for pedestrian detection. Moreover, this study will be useful for composing multi-sensor pedestrian detection systems using not only LIDAR but also vision sensors. Experimental results using our data set and a detailed classification performance analysis are presented, with comparisons among various classification techniques.
This paper presents a system whose aim is to detect and classify road obstacles, like pedestrians and vehicles, by fusing data coming from different sensors: a camera, a radar, and an inertial sensor. The camera is mainly used to refine the vehicles' boundaries detected by the radar and to discard those that might be false positives; at the same time, a symmetry-based pedestrian detection algorithm is executed, and its results are merged with a set of regions of interest provided by a Motion Stereo technique. The tests have been performed in several environments and traffic situations; their results showed that the vision-based filtering provides an effective reduction of the radar's false positives. Furthermore, the regions of interest detected by the Motion Stereo algorithm improve the pedestrian detector's performance by keeping the number of detection errors low. The system was shown during the APALACI-PReVENT European IP final demonstration in September 2007 in Versailles (France).
Despite recent significant advances, pedestrian detection continues to be an extremely challenging problem in real scenarios. In order to develop a detector that successfully operates under these conditions, it becomes critical to leverage multiple cues, multiple imaging modalities and a strong multi-view classifier that accounts for different pedestrian views and poses. In this paper we provide an extensive evaluation that gives insight into how each of these aspects (multi-cue, multi-modality and strong multi-view classifier) affects performance both individually and when integrated together. In the multi-modality component we explore the fusion of RGB and depth maps obtained by high-definition LIDAR, a type of modality that is only recently starting to receive attention. As our analysis reveals, although all the aforementioned aspects significantly help in improving the performance, the fusion of visible spectrum and depth information boosts the accuracy by a much larger margin. The resulting detector not only ranks among the top performers in the challenging KITTI benchmark, but it is built upon very simple blocks that are easy to implement and computationally efficient. These simple blocks can be easily replaced with more sophisticated ones recently proposed, such as the use of convolutional neural networks for feature representation, to further improve the accuracy.
Despite recent significant advances, object detection continues to be an extremely challenging problem in real scenarios. In order to develop a detector that successfully operates under these conditions, it becomes critical to leverage multiple cues, multiple imaging modalities and a strong multi-view classifier that accounts for different object views and poses. In this paper we provide an extensive evaluation that gives insight into how each of these aspects (multi-cue, multi-modality and strong multi-view classifier) affects accuracy both individually and when integrated together. In the multi-modality component we explore the fusion of RGB and depth maps obtained by high-definition LIDAR, a type of modality that is starting to receive increasing attention. As our analysis reveals, although all the aforementioned aspects significantly help in improving the accuracy, the fusion of visible spectrum and depth information boosts the accuracy by a much larger margin. The resulting detector not only ranks among the top performers in the challenging KITTI benchmark, but it is built upon very simple blocks that are easy to implement and computationally efficient.
A single feature extractor-classifier is not usually able to deal with the diversity of multiple image scenarios. Therefore, integration of features and classifiers can help to cope with this problem, particularly when the parts are carefully chosen and synergistically combined. In this paper, we address the problem of pedestrian detection by a novel ensemble method. Initially, histograms of oriented gradients (HOGs) and local receptive fields (LRFs), which are provided by a convolutional neural network, have both been classified by multilayer perceptrons (MLPs) and support vector machines (SVMs). A diversity measure is used to refine the initial set of feature extractors and classifiers. A final classifier ensemble was then structured with an HOG and an LRF as features, classified by two SVMs and one MLP. We have analyzed the following two classes of fusion methods for combining the outputs of the component classifiers: 1) majority vote and 2) fuzzy integral. The first part of the performance evaluation consisted of running the final proposed ensemble over the DaimlerChrysler cropwise data set, which was also artificially modified to simulate sunny and shadowy illumination conditions, which are typical of outdoor scenarios. Then, a window-wise study has been performed over a collected video sequence. Experiments have highlighted a state-of-the-art classification system, performing consistently better than the component classifiers and other methods.
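The majority-vote fusion mentioned above can be sketched as follows. This is only an illustrative sketch: the label encoding (1 for pedestrian, 0 for non-pedestrian) and the tie handling are assumptions, and the fuzzy-integral alternative is not shown.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label predicted by most component classifiers.

    Returns None when the top two labels are tied, leaving the tie-breaking
    policy to the caller.
    """
    if not labels:
        raise ValueError("need at least one classifier output")
    ranked = Counter(labels).most_common()
    top_label, top_count = ranked[0]
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None
    return top_label


# Hypothetical outputs of three component classifiers for one window.
decision = majority_vote([1, 1, 0])
```

With an odd number of binary classifiers, such as the three-member ensemble described in the abstract, a tie cannot occur, so the None branch only matters for other configurations.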
This article presents a tetra-vision (four-camera) system for the detection of pedestrians through the simultaneous use of one far-infrared and one visible stereo camera pair. The main idea is to exploit the advantages of both far-infrared and visible cameras while benefiting from the use of each system. Initially, the two stereo flows are processed independently; then the results are fused together. The final result of this low-level processing is a list of obstacles that have a shape and a size compatible with the presence of a potential pedestrian. In addition, the system is able to remove the background from the detected obstacles to simplify possible further high-level processing.
This paper describes a system for pedestrian detection in infrared images implemented and tested on an experimental vehicle. A specific stabilization procedure is applied after image acquisition and before processing to cope with vehicle movements affecting the camera calibration. The localization of pedestrians is based on the search for warm symmetrical objects with specific size and aspect ratio. A set of filters is used to reduce false detections. The final validation process relies on the human shape’s morphological characteristics.
Pedestrian safety is a primary traffic issue in urban environments. The use of modern sensing technologies to improve pedestrian safety has remained an active research topic for years. A variety of sensing technologies have been developed for pedestrian detection. The application of pedestrian detection on transit vehicle platforms is desirable and feasible in the near future. In this paper, potential sensing technologies are first reviewed for their advantages and limitations. Several sensors are then chosen for further experimental testing and evaluation. A reliable sensing system will require a combination of multiple sensors to deal with near-range detection in stationary conditions and longer-range detection in moving conditions. A vehicle-infrastructure integrated solution is suggested for pedestrian detection in transit bus applications.
Visible light communication (VLC) systems are promising candidates for future indoor access and peer-to-peer networks. The performance of these systems, however, is vulnerable to the line of sight (LOS) link blockage due to objects inside the room. In this paper, we develop a probabilistic object detection method that takes advantage of the blockage status of the LOS links between the user devices and transceivers on the ceiling to locate those objects. The target objects are modeled as cylinders with random radii. The location and size of an object can be estimated by using a quadratic programming approach. Simulation results show that the root-mean-squared error can be less than 1 cm and 8 cm for estimating the center and the radius of the object, respectively.
This paper describes a vision-based pedestrian detection system for robots and autonomous vehicles. For that purpose, Haar-like features were used to discriminate pedestrians. Those features were used as input to a learning algorithm, based on AdaBoost, which selects a small number of critical visual features from a larger set and yields an extremely efficient classifier. The proposed system can run in real-time applications while achieving good detection rates.
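As a rough illustration of why Haar-like features are cheap to evaluate, the sketch below computes a two-rectangle feature from an integral image (summed-area table). It is a generic textbook construction, not the authors' implementation; the function names and the choice of a left-minus-right feature are assumptions made for this example.

```python
def integral_image(img):
    """Return a summed-area table with an extra leading zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the w-by-h rectangle whose top-left corner is (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_two_rect(ii, x, y, w, h):
    """Hypothetical two-rectangle feature: left half minus right half."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# Toy 2x4 image: bright left half, dark right half.
image = [[1, 1, 0, 0],
         [1, 1, 0, 0]]
table = integral_image(image)
feature = haar_two_rect(table, 0, 0, 4, 2)
```

Each rectangle sum costs four table lookups regardless of its size, which is what makes it practical to evaluate a large pool of such features before a boosting stage selects a small subset.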
This paper describes an improved stereovision system for the anticipated detection of car-to-pedestrian accidents. An improvement of the previous versions of the pedestrian-detection system is achieved by compensation of the camera's pitch angle, since it results in higher accuracy in the location of the ground plane and more accurate depth measurements. The system has been mounted on two different prototype cars, and several real collision-avoidance and collision-mitigation experiments have been carried out in private circuits using actors and dummies, which represents one of the main contributions of this paper. Collision avoidance is carried out by means of deceleration strategies whenever the accident is avoidable. Likewise, collision mitigation is accomplished by triggering an active hood system. Index Terms: collision avoidance and mitigation, pedestrian protection, pitch compensation, stereovision, virtual disparity image.
In this paper, a new approach for pedestrian detection is presented. We design an ensemble of classifiers that employ different feature representation schemes of the pedestrian images: Laplacian Eigen-Maps, Gabor filters, and invariant local binary patterns. Each ensemble is obtained by varying the patterns used to train the classifiers and extracting from each image two feature vectors for each feature extraction method: one for the upper part of the image and one for the lower part of the image. A different radial basis function support vector machine (SVM) classifier is trained using each feature vector; finally, these classifiers are combined by the "sum rule." Experiments are performed on a large data set consisting of 4000 pedestrian and more than 25 000 nonpedestrian images captured in outdoor urban environments. Experimental results confirm that the different feature representations give complementary information, which has been exploited by fusion rules, and we have shown that our method outperforms the state-of-the-art approaches among pedestrian detectors.
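The "sum rule" mentioned above can be sketched in a few lines: each classifier outputs one score per class, the scores are added across classifiers, and the class with the largest total is chosen. This is a minimal generic sketch; the function name, the score values, and the class ordering (index 0 for nonpedestrian, index 1 for pedestrian) are assumptions for illustration and do not come from the paper.

```python
def sum_rule(scores_per_classifier):
    """Fuse classifiers by summing their per-class scores.

    scores_per_classifier: list of lists, one inner list of class scores
    per classifier. Returns (index of winning class, summed scores).
    """
    n_classes = len(scores_per_classifier[0])
    totals = [0.0] * n_classes
    for scores in scores_per_classifier:
        for c, s in enumerate(scores):
            totals[c] += s
    best = max(range(n_classes), key=lambda c: totals[c])
    return best, totals


# Three hypothetical classifiers scoring [nonpedestrian, pedestrian].
scores = [
    [0.6, 0.4],  # e.g. a classifier trained on one feature type
    [0.3, 0.7],
    [0.2, 0.8],
]
label, totals = sum_rule(scores)
```

In this toy case the first classifier alone would vote for nonpedestrian, but the summed scores favor pedestrian, which is the kind of complementary behavior that fusion is meant to exploit.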
This paper describes a comprehensive combination of feature extraction methods for vision-based pedestrian detection in the framework of intelligent transportation systems. The basic components of pedestrians are first located in the image and then combined with an SVM-based classifier. This poses the problem of pedestrian detection in real, cluttered road images. Candidate pedestrians are located using a subtractive clustering attention mechanism based on stereo vision. A by-components learning approach is proposed in order to better deal with pedestrian variability, illumination conditions, partial occlusions, and rotations. Extensive comparisons have been carried out using different feature extraction methods, as a key to image understanding in real traffic conditions. A database containing thousands of pedestrian samples extracted from real traffic images has been created for learning purposes, both at daytime and nighttime. The results achieved to date suggest that a combination of feature extraction methods is essential for enhanced detection performance.
The ability to perform off-road autonomous navigation at any time of day or night is a requirement for some unmanned ground vehicle (UGV) programs. Because there are times when it is desirable for military UGVs to operate without emitting strong, detectable electromagnetic signals, a passive only terrain perception mode of operation is also often a requirement. Thermal infrared (TIR) cameras can be used to provide day and night passive terrain perception. TIR cameras have a detector sensitive to either mid-wave infrared (MWIR) radiation (3-5μm) or long-wave infrared (LWIR) radiation (7-14μm). With the recent emergence of high-quality uncooled LWIR cameras, TIR cameras have become viable passive perception options for some UGV programs. The Jet Propulsion Laboratory (JPL) has used a stereo pair of TIR cameras under several UGV programs to perform stereo ranging, terrain mapping, tree-trunk detection, pedestrian detection, negative obstacle detection, and water detection based on object reflections. In addition, we have evaluated stereo range data at a variety of UGV speeds, evaluated dual-band TIR classification of soil, vegetation, and rock terrain types, analyzed 24 hour water and 12 hour mud TIR imagery, and analyzed TIR imagery for hazard detection through smoke. Since TIR cameras do not currently provide the resolution available from megapixel color cameras, a UGV's daytime safe speed is often reduced when using TIR instead of color cameras. In this paper, we summarize the UGV terrain perception work JPL has performed with TIR cameras over the last decade and describe a calibration target developed by General Dynamics Robotic Systems (GDRS) for TIR cameras and other sensors.
This paper presents an approach to pedestrian detection in thermal infrared (thermal) images with limited annotations. The key idea is to adapt the abundance of color images associated with bounding box annotations to the thermal domain for training the pedestrian detector. To this end, we couple a domain adaptation component that consists of a pair of image transformers with a pedestrian detector in the thermal domain and train the entire network end-to-end. The image transformers act as a data augmentation tool that progressively improves synthetic examples on the fly for training the pedestrian detector. To aid the training process, we introduce a detection loss defined on both real thermal images and synthetic thermal images transformed from the color domain. The proposed detector outperforms existing methods on the thermal images from the KAIST detection benchmark [1].
In this paper, we propose an approach for fast pedestrian detection in images. Inspired by the histogram of oriented gradient (HOG) features, a set of multi-scale orientation (MSO) features is proposed as the feature representation. The features are extracted on square image blocks of various sizes (called units) and contain coarse and fine features: the coarse ones are the unit orientations and the fine ones are the pixel orientation histograms of the unit. A cascade of AdaBoost classifiers is trained on the coarse features, aiming at high detection speed. A greedy search algorithm is employed to select fine features, which are input into SVMs to train the fine classifiers, aiming at high detection accuracy. Experiments show that our approach obtains state-of-the-art results while running 12.4 times faster than the SVM+HOG method.
This work presents the vision-based system for detecting pedestrians in road environments implemented on the ARGO prototype vehicle developed by the University of Parma. The system is aimed at the localization of pedestrians in various poses, positions and clothing, and is not limited to moving people.
This paper presents an analytical study of the depth estimation error of a stereo vision-based pedestrian detection sensor for automotive applications such as pedestrian collision avoidance and/or mitigation. The sensor comprises two synchronized and calibrated low-cost cameras. Pedestrians are detected by combining a 3D clustering method with Support Vector Machine-based (SVM) classification. The influence of the sensor parameters in the stereo quantization errors is analyzed in detail providing a point of reference for choosing the sensor setup according to the application requirements. The sensor is then validated in real experiments. Collision avoidance maneuvers by steering are carried out by manual driving. A real time kinematic differential global positioning system (RTK-DGPS) is used to provide ground truth data corresponding to both the pedestrian and the host vehicle locations. The performed field test provided encouraging results and proved the validity of the proposed sensor for being used in the automotive sector towards applications such as autonomous pedestrian collision avoidance.
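To make the notion of stereo quantization error concrete, the sketch below uses the standard pinhole stereo relation Z = f·B/d and its first-order consequence that a disparity step Δd produces a depth error of roughly Z²/(f·B)·Δd. This is the common textbook model and is not necessarily the exact analysis carried out in the paper; the focal length, baseline, and disparity values are assumed purely for illustration.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth Z = f * B / d for a rectified, calibrated stereo pair."""
    return focal_px * baseline_m / disparity_px


def depth_quantization_error(focal_px, baseline_m, depth_m, disparity_step_px=1.0):
    """First-order depth error caused by one disparity quantization step."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_step_px


# Assumed setup: 800 px focal length, 0.30 m baseline (illustrative values only).
FOCAL_PX = 800.0
BASELINE_M = 0.30

z = depth_from_disparity(FOCAL_PX, BASELINE_M, 12.0)
err_near = depth_quantization_error(FOCAL_PX, BASELINE_M, 10.0)
err_far = depth_quantization_error(FOCAL_PX, BASELINE_M, 20.0)
```

Because the error grows with the square of the depth, doubling the range quadruples the quantization error under this model, which is why focal length and baseline have to be chosen with the target detection range in mind.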
In recent years, the European Union, Member States and the automotive industry have concentrated their efforts on improving road safety [1]. Avoiding accidents due to human error is one of the main objectives of Advanced Driver Assistance Systems (ADAS). European traffic accident figures show a very large number of collisions between pedestrians and vehicles. Every year, more than 200,000 pedestrians are injured, with over 6,000 killed. In spite of the very high number of victims, the protection of pedestrians has received little attention from the research community. The projects that have dealt with this problem are quite recent, as has been pointed out at the Fifth Framework Programme [1].
The paper focuses on motion-based information extraction from cluttered video image sequences. A novel method is introduced which can reliably detect walking human figures contained in such images. The method works with spatio-temporal input information to detect and classify patterns typical of human movement. Our algorithm consists of real-time operations, which is an important factor in practical applications. The paper presents a new information-extraction and temporal tracking method based on a simplified version of symmetry-pattern extraction, a pattern that is characteristic of the moving legs of a walking person. These spatio-temporal traces are labelled by kernel Fisher discriminant analysis (KFDA). With the use of temporal tracking and non-linear classification we have achieved pedestrian detection from cluttered image scenes with a correct classification rate of 97.6% from 1 to 2 step periods. The detection rates of a linear classifier and an SVM are also presented in the results, which demonstrate the necessity of a non-linear method and the power of KFDA for this detection task.
In this paper, we present a real-time pedestrian detection system that has been trained using a virtual environment. This is a very popular topic of research with many practical applications, and recently there has been increasing interest in deep learning architectures for performing such a task. However, the availability of large labeled datasets is key to training such algorithms effectively. For this reason, in this work, we introduce ViPeD, a new synthetically generated set of images extracted from a realistic 3D video game, where the labels can be automatically generated by exploiting 2D pedestrian positions extracted from the graphics engine. We exploit this new synthetic dataset by fine-tuning a state-of-the-art, computationally efficient Convolutional Neural Network (CNN). A preliminary experimental evaluation, compared to the performance of other existing approaches trained on real-world images, shows encouraging results.
In this paper we present a multistage method applied to pedestrian detection using information from a LIDAR and a monocular camera mounted on an electric vehicle driving in urban scenarios. The proposed method is a cascade of classifiers trained on two subsets of features, one with laser-based features and the other with a set of image-based features. A specific training approach was developed to adjust the cascade stages in order to enhance the classification performance. The proposed method differs from the conventional cascade in the way the selected samples are propagated through the cascade. Thus, the subsequent stages of the proposed cascade receive both negatives and positives from previous ones, relying on a decision margin process. Experiments were conducted in off-line mode, for a set of single component classifiers and for the proposed cascade technique. The results are compared in terms of classification performance metrics and ROC curves.