InOutDoorPeople RGB-D Dataset


This dataset collection has been used to train deep neural networks for the task of people detection in RGB-D data. The approach is presented in our paper "Choosing Smartly: Adaptive Multimodal Fusion for Object Detection in Changing Environments", which we published at IROS 2016.

The InOutDoorPeople dataset contains 8305 frames of RGB and depth images, annotated with bounding boxes, in both 1920x1080 and 960x540 pixel resolution. The data was collected at a frame rate of 30 Hz with a Kinect v2 camera mounted on a mobile robot. A distinctive feature of the recorded sequences is that the robot drove from indoor to outdoor environments in a single take. The dataset is subdivided into four sequences: two recorded at midday and two recorded at dusk.

Terms of use

This dataset is provided for research purposes only. Any commercial use is prohibited. If you use the dataset, please cite our paper:

    author = {Oier Mees and Andreas Eitel and Wolfram Burgard},
    title = {Choosing Smartly: Adaptive Multimodal Fusion for Object Detection in Changing Environments},
    booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
    year = 2016,
    address = {Daejeon, South Korea},
    url = {http://ais.informatik.uni-freiburg.de/publications/papers/mees16iros.pdf}


Download RGB images 1920x1080 18GB
Download RGB images 960x540 4.9GB
Download depth images 1920x1080 4.5GB
Download depth images 960x540 583MB (depth-jet encoding)
Download annotations 1920x1080 (.yml, VOC2007 format)
Download imagesets textfiles
Download camera calibration (to convert to original pointcloud use depth images at full resolution)
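
As a rough illustration of the point cloud conversion mentioned above, the following sketch back-projects a depth image with the standard pinhole camera model. It is not taken from the dataset tools: the function name backproject_depth, the intrinsics fx, fy, cx, cy, and the assumption that depth values are stored in millimetres are all hypothetical placeholders. The actual intrinsics and depth units should be taken from the camera calibration download.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def backproject_depth(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    depth_scale: float = 0.001,  # assumption: raw depth in millimetres
) -> List[Point]:
    """Back-project a depth image to 3D points with the pinhole model.

    depth is indexed as depth[v][u] (row, column). Pixels with a
    non-positive value are treated as missing and skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, d in enumerate(row):
            if d <= 0:
                continue  # no valid depth measurement at this pixel
            z = d * depth_scale      # depth along the optical axis, in metres
            x = (u - cx) * z / fx    # horizontal offset from the principal point
            y = (v - cy) * z / fy    # vertical offset from the principal point
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    # Tiny 2x2 example with made-up intrinsics, for illustration only.
    toy_depth = [[0, 1000], [2000, 0]]
    for p in backproject_depth(toy_depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5):
        print(p)
```

For full-resolution frames a vectorized implementation (for example with NumPy) would be much faster; the nested loops here only keep the sketch free of dependencies.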