A Layered Approach to People Detection in 3D Range Data
L. Spinello , K. Arras, R. Triebel and R. Siegwart
AAAI Conference on Artificial Intelligence 2010 (AAAI)
Keywords: 3D people detection, 3D pedestrian detection, 3D range data learning
People tracking is a key technology for autonomous systems,
intelligent cars and social robots operating in populated environments.
What makes the task difficult is that the appearance
of humans in range data can change drastically as a function
of body pose, distance to the sensor, self-occlusion and occlusion
by other objects. In this paper we propose a novel approach
to pedestrian detection in 3D range data based on supervised
learning techniques to create a bank of classifiers for
different height levels of the human body. In particular, our
approach applies AdaBoost to train a strong classifier from
geometrical and statistical features of groups of neighboring
points at the same height. In a second step, the AdaBoost
classifiers mutually enforce their evidence across different
heights by voting into a continuous space. Pedestrians are finally
found efficiently by mean-shift search for local maxima
in the voting space. Experimental results carried out with 3D
laser range data illustrate the robustness and efficiency of our
approach even in cluttered urban environments. The learned
people detector reaches a classification rate up to 96% from a
single 3D scan.
3D point cloud
| 3D point cloud recorded in a busy urban environment where people, trams and cars are moving around.
Range data segmentation
| The layers of range data are segmented: colors indicate different segments.
| Segments are classified for belonging
to a part of the learned 3D model of a person. A bank of Adaboost
classifiers trained via 'one vs all' technique is employed for this
task. Each segment cast votes for the center of a person in a 3.5D
voting space: position (x,y,z) and classification 'confidence' w. Votes for different 3D model parts are displayed as colored balls.
| High density loci in the voting space
represent the detection hypotheses. A mean-shift is applied to extract
the modes of the vote distribution. The bigger the ball the higher the
detection score that combines votes quantities, detection confidence,
and parts detected.
| Detected people are depicted as boxes (Tannenstrasse dataset).
| Detected people are depicted as boxes (Polyterrasse dataset).