MASSACHUSETTS INSTITUTE OF TECHNOLOGY
ARTIFICIAL INTELLIGENCE LABORATORY
A.I. Technical Report No. May
A Trainable System for Object
Detection in Images and Video
Sequences
Constantine P. Papageorgiou
cpapa@ai.mit.edu
This publication can be retrieved by anonymous ftp to publications.ai.mit.edu.
Abstract
This thesis presents a general, trainable system for object detection in static images
and video sequences. The core system finds a certain class of objects in static images
of completely unconstrained, cluttered scenes without using motion, tracking, or
handcrafted models and without making any assumptions on the scene structure or
the number of objects in the scene. The system uses a set of training data of positive
and negative example images as input, transforms the pixel images to a Haar wavelet
representation, and uses a support vector machine classifier to learn the difference
between in-class and out-of-class patterns. To detect objects in out-of-sample images,
we do a brute force search over all the subwindows in the image. This system
is applied to face, people, and car detection with excellent results.
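
As an illustration of the brute force search over subwindows, the following is a minimal sketch and not the thesis implementation. It assumes a single fixed window size, and it replaces the trained support vector machine with a stand-in scoring function built from one Haar-like vertical-edge response. The names `sliding_windows`, `vertical_haar_response`, and `detect` are hypothetical and introduced only for this example.

```python
def sliding_windows(height, width, win_h, win_w, stride=1):
    """Yield (top, left) for every win_h x win_w subwindow that fits in the image."""
    for top in range(0, height - win_h + 1, stride):
        for left in range(0, width - win_w + 1, stride):
            yield top, left


def vertical_haar_response(image, top, left, win_h, win_w):
    """Stand-in score: sum of the left half of the window minus the right half.

    This is a single Haar-like vertical-edge feature, used here in place of
    a trained classifier (an assumption made for illustration only).
    """
    half = win_w // 2
    rows = range(top, top + win_h)
    left_sum = sum(image[r][c] for r in rows for c in range(left, left + half))
    right_sum = sum(
        image[r][c] for r in rows for c in range(left + half, left + 2 * half)
    )
    return left_sum - right_sum


def detect(image, win_h, win_w, score_fn, threshold, stride=1):
    """Score every subwindow and keep those whose score exceeds the threshold."""
    height, width = len(image), len(image[0])
    hits = []
    for top, left in sliding_windows(height, width, win_h, win_w, stride):
        score = score_fn(image, top, left, win_h, win_w)
        if score > threshold:
            hits.append((top, left, score))
    return hits


if __name__ == "__main__":
    # Toy grayscale image: a bright band in columns 0-1, dark elsewhere.
    image = [[9, 9, 0, 0, 0, 0] for _ in range(4)]
    for top, left, score in detect(image, 2, 2, vertical_haar_response, 10):
        print(f"window at row {top}, column {left}: score {score}")
```

In a fuller system, the scoring function would presumably be the learned classifier applied to the wavelet features of each window, and the search would typically be repeated over several image scales.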
For our extensions to video sequences, we augment the core static detection system
in several ways: extending the representation to five frames, implementing
an approximation to a Kalman filter, and modeling detections in an image as a
density and propagating this density through time according to measured features.
In addition, we present a real-time version of the system that is currently running in
a DaimlerChrysler experimental vehicle.
As part of this thesis, we also present a system that, instead of detecting full patterns,
uses a component-based approach. We find it to be more robust to occlusions,
rotations in depth, and severe lighting conditions for people detection than the full
body version. We also experiment with various other representations, including pixels
and principal components, and show results that quantify how the number of features,
color, and gray level affect performance.