Car Detector

Project Dates: 3/5/2013 - 3/28/2013

Datasets: UIUC Image Database for Car Detection, Caltech 101

Summary:

Designed a MATLAB program that detects cars in an image using an image vocabulary built from a training set of images.

Algorithm Design:

The training set consisted of 502 side-view car images from the UIUC car detection dataset. These served as ground truth because each image is cropped closely around a car at its center. A Harris corner detector was used to extract features around each car. Using all 502 resulting feature sets directly could make the classifier too specific to the training images and would add significant processing time. To obtain a more manageable feature set, an image vocabulary was built with k-means clustering: centroids were found for groups of similar features, and this smaller set of centroids (50 in this case) became the restricted image vocabulary used for classification. Each word's likelihood was weighted by how many of the 502 training images fell within the window around the centroid that the word is based on.
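The vocabulary step can be illustrated with a small sketch. The project itself was written in MATLAB; the Python below is only an illustration, not the original code. It assumes each feature is a short numeric descriptor vector, and it approximates "falling within the window around a centroid" by nearest-centroid assignment. The names kmeans and word_weights and the toy data are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid (vocabulary word) closest to point."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means: returns k centroids that act as the vocabulary words."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster goes empty
                centroids[i] = [sum(c) / len(members) for c in zip(*members)]
    return centroids

def word_weights(features_per_image, centroids):
    """Weight of each word = fraction of training images that contain
    at least one feature assigned to that word (an assumed weighting)."""
    counts = [0] * len(centroids)
    for feats in features_per_image:
        for w in {nearest(f, centroids) for f in feats}:
            counts[w] += 1
    return [c / len(features_per_image) for c in counts]

# Toy data: three "training images" with 2-D descriptors (made up).
features_per_image = [
    [(0.0, 0.1), (5.0, 5.1)],
    [(0.2, 0.0), (5.1, 4.9)],
    [(0.1, 0.2)],
]
all_features = [f for feats in features_per_image for f in feats]
vocab = kmeans(all_features, k=2)   # the project used k = 50
weights = word_weights(features_per_image, vocab)
```

In this toy run, the word near the origin appears in all three images and the other word in only two, so the first receives the larger weight.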

A test image containing a car from the Caltech 101 dataset was then read in and its features were detected. Votes were cast to determine which feature sets in the image most closely matched the image vocabulary. Because many features in an image may resemble the car vocabulary, only the results with the highest votes were classified as cars, which reduces false positives, and a box was drawn around each detected car. The detector assumes that the image contains a car; if it does not, the set of features that most resembles a word from the car vocabulary is still classified as a car. The code was written and run in MATLAB.
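The voting step is not spelled out in detail here, so the following Python sketch shows only one plausible reading of it and is not the original MATLAB code. It assumes that each test feature has an image position and a descriptor, that candidate regions are given as rectangular windows, and that a feature votes with the weight of its nearest vocabulary word when it lies within a distance threshold. The function names, the threshold and the toy data are all hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def vote_for_feature(descriptor, vocab, weights, max_sq_dist):
    """Weight of the closest word, or 0.0 if no word is close enough."""
    best = min(range(len(vocab)), key=lambda i: sq_dist(descriptor, vocab[i]))
    if sq_dist(descriptor, vocab[best]) > max_sq_dist:
        return 0.0
    return weights[best]

def score_windows(features, windows, vocab, weights, max_sq_dist=1.0):
    """Sum the votes of the features inside each window and rank the windows.

    features: list of ((x, y), descriptor)
    windows:  list of (x0, y0, x1, y1)
    Returns (votes, window) pairs sorted from most to fewest votes.
    """
    scored = []
    for (x0, y0, x1, y1) in windows:
        votes = sum(
            vote_for_feature(desc, vocab, weights, max_sq_dist)
            for (x, y), desc in features
            if x0 <= x <= x1 and y0 <= y <= y1
        )
        scored.append((votes, (x0, y0, x1, y1)))
    scored.sort(key=lambda s: s[0], reverse=True)
    return scored

# Toy vocabulary and test features (made up for illustration).
vocab = [(0.0, 0.0), (5.0, 5.0)]
weights = [1.0, 0.5]
features = [
    ((10, 10), (0.1, 0.0)),  # matches word 0, inside the first window
    ((12, 11), (4.9, 5.1)),  # matches word 1, inside the first window
    ((40, 40), (5.2, 5.0)),  # corner-rich background, second window
    ((70, 70), (9.0, 0.0)),  # matches no word, third window
]
windows = [(0, 0, 20, 20), (30, 30, 50, 50), (60, 60, 80, 80)]
ranked = score_windows(features, windows, vocab, weights)
best_votes, best_window = ranked[0]  # window that would be boxed as the car
```

Because the top-ranked window is always reported, this sketch shares the limitation described above: it labels some window as a car even when the image contains none.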

Example of features detected in a training image. Because this image contains two cars, the features do not all belong to a single car. The errors this introduces into the vocabulary become more negligible as the number of training images increases.

Detected car boxed in blue. The feature sets with the second- and third-highest likelihoods of being cars are boxed in green. These boxes often overlap with the car because their features overlap with its features.

False positives were detected on a tree and a building. Both objects have high corner density, so their features have a greater chance of coinciding with the limited vocabulary. The votes cast for the car in the image still exceeded those for the false positives.