BACK-VIEW CAR MODEL RECOGNITION
LE THANH SACH
A THESIS SUBMITTED IN PARTIAL FULFILLMENT
OF THE REQUIREMENT FOR THE DEGREE OF
MASTER OF ENGINEERING IN COMPUTER ENGINEERING
SCHOOL OF GRADUATE STUDIES
KING MONGKUT’S INSTITUTE OF TECHNOLOGY LADKRABANG
2007
COPYRIGHT 2007
SCHOOL OF GRADUATE STUDIES
KING MONGKUT’S INSTITUTE OF TECHNOLOGY LADKRABANG
Thesis Title        Back-View Car Model Recognition
Student             Mr. Le Thanh Sach
Student ID          48060733
Degree              Master of Engineering
Program             Computer Engineering
Year                2550 B.E. (2007)
Thesis Advisor      Dr. Watchara Chatwiriya
Thesis Co-Advisor   Prof. Dr. Shozo Kondo
Abstract (Thai)

Automatic recognition of car models and manufacturers from images is a
challenging topic in image processing and visual analysis research. Previous
research on this subject comprises vehicle detection and vehicle type
classification. Vehicle detection can separate vehicles from other objects, while
vehicle type classification can identify a vehicle as a truck, bus, or motorcycle;
both, however, still have many limitations. From our survey of previous studies,
no significant work has been able to classify vehicles at a more sophisticated
level, such as the make and model of a vehicle.

This thesis aims to propose a new approach for recognizing the models and
manufacturers of cars from still images of car rears. First, the red color of the
tail lights is detected and tested against a color distribution model built from
samples of tail-light colors. Red regions that may belong to tail lights are found
by comparing each pixel with the learned red color density model. The geometric
properties of the rear image are then verified to find the locations that should be
tail lights. For classifying car makes and models, the Eigen technique is used
together with Fisher linear discriminant analysis.

In the experiments, images of 17 popular car models from several
manufacturers were collected for analysis and system testing. The classification
accuracy is approximately 93 percent. The experimental results show that this
work can be developed further to distinguish a larger number of models and
manufacturers.
Thesis Title        Back-View Car Model Recognition
Student             Mr. Le Thanh Sach
Student ID          48060733
Degree              Master of Engineering
Program             Computer Engineering
Year                2007
Thesis Advisor      Dr. Watchara Chatwiriya
Thesis Co-Advisor   Prof. Dr. Shozo Kondo
ABSTRACT
Automated Vision-Based Vehicle Recognition is a useful and challenging problem
for researchers in the image processing and visual perception disciplines. Recent
research in this field can be categorized into vehicle detection and vehicle type
recognition. While vehicle detection can only discriminate vehicles from the
surrounding background, vehicle type recognition is able to detect and classify
vehicles into types such as truck, bus, and motorcycle. However, the recognition
complexity achieved so far is quite limited: none of the serious work we have
investigated is able to recognize vehicles at a more sophisticated level, such as a
vehicle's make and model. This thesis proposes new approaches for detecting and
recognizing car makes and models from still images. In detection, colors in the red
areas of car tail lights are sampled, and their density in color space is modeled by a
proposed method. Segmenting input images with the modeled color density limits
the number of regions that must be searched for car back views. Cars are detected
after verifying the geometric properties of car back views in the candidate regions.
In recognition, the Eigen technique and Fisher Linear Discriminant Analysis are
shown to be suitable for extracting features and recognizing car models. A large
image database of 17 popular car models has been collected for this investigation.
The current recognition rate is over 93%. The experimental results show that our
approach is extensible to a larger number of car models.
Acknowledgements
I would like to thank Dr. Watchara Chatwiriya, my advisor, for his enthusiastic
guidance and extensive discussions during the past 24 months. I am also thankful to
my co-advisor, Prof. Dr. Shozo Kondo of Tokai University, for his encouragement and
practical suggestions.
I am especially grateful to all the members of my family; they are a constant
motivation for me to better myself.
I also received much kind help from the members of my laboratory; I was able to
study well at KMITL thanks to the friendly working environment they created for me.
Finally, I would like to mention that this thesis could not have been realized
without the support of the JICA project for AUN/SEED-Net.
Bangkok, Thailand
May, 2007
Le Thanh Sach
Contents
Page
Abstract (Thai) ..................................................................................................................I
Abstract .............................................................................................................................II
Acknowledgements ..........................................................................................................III
Contents........................................................................................................................... IV
List of Tables................................................................................................................... VII
List of Figures ................................................................................................................ VIII
Chapter 1 Introduction...................................................................................................... 1
1.1 Background........................................................................................................... 1
1.2 Objective of the Study ........................................................................................... 2
1.3 Statement of the Thesis ......................................................................................... 3
1.4 Assumption of this Study....................................................................................... 4
1.5 Theory or Concept to be Used in this Research ................................................... 4
Chapter 2 Literature Survey.............................................................................................. 6
2.1 Vehicle Recognition............................................................................................... 6
2.1.1 Sensor Selection ........................................................................................... 6
2.1.2 Vehicle Detection.......................................................................................... 7
2.1.3 Feature Extraction....................................................................................... 10
2.1.4 Recognition................................................................................................. 12
2.2 Color Image Segmentation.................................................................................. 13
2.3 Eigen-Technique ................................................................................................. 14
2.3.1 Principal Component Analysis (PCA).......................................................... 15
2.3.2 Fisher Discriminant Analysis ....................................................................... 18
Chapter 3 System Architecture and Data Collection ...................................................... 25
3.1 System Architecture ............................................................................................ 25
3.2 Dataset Collection ............................................................................................... 26
3.2.1 Conditions for Capturing Image.................................................................. 26
3.2.2 The Number of Car Makes and Models under Consideration..................... 29
3.3 Sample Reference Color Collection .................................................................... 30
Chapter 4 Car Back-View Image Segmentation ............................................................. 32
4.1 Introduction ......................................................................................................... 32
4.2 Reference Color Learning ................................................................................... 33
4.2.1 Color Density Modeling............................................................................... 33
4.2.2 Density Level Selection............................................................................... 46
4.3 Segmentation and Normalization ........................................................................ 47
4.3.1 Segmentation.............................................................................................. 47
4.3.2 Normalization .............................................................................................. 59
Chapter 5 Feature Selection ........................................................................................... 61
5.1 Introduction ......................................................................................................... 61
5.2 Image Space and Eigencar................................................................................. 62
5.3 Car Space and Car Feature ................................................................................ 64
5.4 Fisher Car Space and Fisher Car Feature........................................................... 65
Chapter 6 Recognition.................................................................................................... 69
6.1 Recognition with Quadratic Discriminant Functions............................................ 69
6.2 Recognition with Linear Discriminant Functions.................................................. 72
6.3 Recognition with Nearest Neighbor Rule ............................................................ 73
Chapter 7 Result and Discussion ................................................................................... 76
7.1 Segmentation ...................................................................................................... 76
7.1.1 Learning Reference Color........................................................................... 76
7.1.2 Separating Car back-views......................................................................... 77
7.2 Recognition ......................................................................................................... 79
Chapter 8 Conclusion ..................................................................................................... 86
Bibliography.................................................................................................................... 88
Appendix A Sample Car Images .................................................................................... 93
Appendix B Publication List.......................................................................................... 102
List of Tables
Table
Page
3.1 Makes, Models, Years and Number of Sample Images......................................... 30
4.1 Color Prototype Learning Algorithm....................................................................... 37
4.2 Color Combination Algorithm ................................................................................. 43
4.3 Likelihood Optimization Algorithm.......................................................................... 46
4.4 Formulation for Geometric Measurements ............................................................. 50
7.1 Parameters used in HPL, SA and EM algorithms................................................... 76
7.2 Recognition Performance....................................................................................... 80
7.3 Recognition Performance Using LDF..................................................................... 83
7.4 Recognition Performance Using QDF .................................................................... 84
7.5 Recognition Performance Using K-NN (K=5)......................................................... 85
List of Figures
Figure
Page
1.1 Objects being Recognized by Vehicle Detection, Vehicle Type Recognition, and
Car Model Recognition ............................................................................................ 3
1.2 Name of Several Components in Car Back-Sides.................................................... 4
2.1 Directions of Projections versus Scale Factors. (a-c) are Projections onto the Same
Direction with Different Scale Factors; (d) is Projection onto the Direction
Discovered by Fisher Mapping with Scale Factor Equal to 1................................. 20
3.1 Proposed System Architecture............................................................................... 26
3.2 System Configuration............................................................................................. 28
3.3 Slanted Angle of Car Back-Side............................................................................. 28
3.4 A Typical Image in the dataset............................................................................... 29
3.5 (a) An Example Distribution of Sample Colors in RGB Color Space (b) Several Red
Images Sliced from Tail Light Locations ................................................................ 31
4.1 (a) An Input Image (b) Its Separated Back-View Image ........................................ 32
4.2 Steps in Color Density Modeling............................................................................ 35
4.3 (a) A Sample Set of Reference Colors Projected onto 2D-plane (u*v*) (b) An
Example of Approximation Using Circular Prototypes ........................................... 36
4.4 Other Approximations for The Distribution in The Previous Figure; (a) Using Big
Size Circular Prototypes and (b) Using Small Size Prototypes .............................. 38
4.5 A Simple Distribution and Its Possible Approximations; (a) and (b) Use Spherical
Covariance Matrices with Different Orders of Sample Colors; (c) Uses Full
Covariance Matrix .................................................................................................. 39
4.6 Interested Regions and Their Boundaries Defined by Several Density Levels ...... 47
4.7 A Simple Case of Pixel Classification (a) Original Image (b) Filtered Image ......... 48
4.8 Definition of Car Back-View Parameters ................................................................ 50
4.9 Variances and Loci of Gravity Centers for Several Car Back-View Images ........... 52
4.10 Location and Size of Car Back-View Image Candidate ......................................... 53
4.11 Some Complicated Results of Pixel Classification ................................................. 54
4.12 Examples of Histograms and Lanes ...................................................................... 55
4.13 Removing Noisy Areas in Filtered Images (a) After Filtering (b) After Removing... 56
4.14 H-lanes Detection................................................................................................... 56
4.15 Bounding Rectangles for Red Areas...................................................................... 57
4.16 Rectangular Regions of Several Candidates ......................................................... 57
4.17 Symmetric Rule Verification.................................................................................... 58
4.18 Car Back-View Separation Flowchart..................................................................... 59
5.1 Steps in Selecting Representative Features........................................................... 62
5.2 Flowchart for Obtaining Eigencars......................................................................... 64
5.3 Representation of Data .......................................................................................... 66
5.4 Discrimination Between Classes............................................................................ 67
5.5 Algorithm for Obtaining Discrimination Directions ................................................. 68
6.1 Recognition Steps.................................................................................................. 72
6.2 K-NN Example........................................................................................................ 73
6.3 K-NN Algorithm ...................................................................................................... 75
7.1 Boundaries of Reference Regions defined by HPL and the Proposed Method..... 77
7.2 Failure Situations in Detecting Red Areas.............................................................. 79
7.3 The Impact of Number of Dimensions (a) for Car Space (b) for Fisher Car Space 82
Chapter 1
Introduction
1.1 Background
Car Model Recognition is a field inside Vision-Based Vehicle Recognition (VBVR)
which is one of active and important researches in image processing, pattern
classification and intelligent transportation system. A VBVR system is the one that can
answer information about vehicles in input images. Such information can vary from
simple form like the location of vehicles (i.e. vehicle detection) in images to more
complex forms as vehicle types, e.g. bus, truck and car. Even with the former, VBVR is
also very useful in applications such as road following [1] and surveillance [2]. The later
can help VBVR to be the base of applications such as traffic guidance, vehicle statistic
[3], toll collection, intelligent parking systems and so on.
Car Model Recognition deals only with the subset of vehicles called "cars". Although
there is a considerable number of existing works in vehicle recognition, most of them
address either vehicle detection or vehicle type classification. No serious research
has classified vehicles into subclasses of these types, such as the makes and models
of vehicles. The remainder of this section presents a brief introduction to VBVR and
several existing approaches; VBVR is detailed in the first section of Chapter 2.
Generally, a VBVR system is composed of three tasks: vehicle detection,
representative feature extraction, and recognition. All three tasks are important for
making the system accurate and usable.
Vehicle detection, sometimes called vehicle segmentation, is the task of locating
vehicles in images. Although locating objects is very simple for humans, it is truly
challenging for machines. Several approaches exist for detecting vehicles [4]; in this
thesis they are classified into two groups, called "exhaustive detection" and
"selective detection". In exhaustive detection, vehicles are searched for at every
pixel of the image; selective detection, in contrast, focuses the search on only the
most likely locations by using specific information. Obviously, exhaustive detection
is time-consuming and prohibitive for real-time applications [4].
The step following detection obtains representative features for the classes. From
the viewpoint of recognition, the term "classes" refers to the groups of objects being
distinguished by the system [5]. For example, in vehicle detection, a class can be
either the group of vehicles under investigation or the group of backgrounds and
other obstacles; in vehicle type recognition, the classes can be "Bus", "Truck", "Car",
and so forth. A typical way of obtaining vehicle features is to measure vehicle
properties such as length [2], height, width [6], and color [7]. Recent research shows
that features obtained by Principal Component Analysis (PCA) [8] [9] and in
transform domains (e.g., Wavelet, Gabor filter) [10] are also effective in
discriminating data between classes. The greatest challenge in feature extraction is
to obtain features that are discriminative and robust to noise, distortion, and
modifications of vehicles.
The last step in VBVR is to classify unknown vehicles into classes; this step is also
called recognition. The recognition method can be as simple as a comparison in
some applications [7] [11]. However, the majority of the literature uses classical
pattern classification methods such as the Quadratic Discriminant Function (QDF),
K-Nearest Neighbor (K-NN), Probabilistic Neural Network (PNN), and Support
Vector Machine (SVM) [8] [9] [10] for recognizing unknown objects.
1.2 Objective of the Study
It can be seen from the previous section that the most challenging tasks in VBVR
are to detect vehicles in images accurately and quickly, and to enhance the recognition
complexity which is defined as the number of classes being recognized and the amount
of information that a VBVR system can answer. Actually, expanding all of vehicle types
on the world into subclasses is likely impossible for a time-bounding work. For these
reasons, this thesis selects a subset of vehicles called “car” and aims to achieve the
following tasks.
1. It utilizes color and geometric properties of car back-sides in order to speed up
the segmentation process for cars from images that captured from the back-view
of cars in near-field view.
2. It increases the recognition complexity by trying to recognize car makes and
models as showed in Figure 1.1.
[Figure: a hierarchy from Scene down to Car Model Recognition. Vehicle Detection
separates vehicles from background and other obstacles; Vehicle Type Recognition
distinguishes bus, truck, car, etc.; Car Make Recognition distinguishes Honda,
Toyota, Nissan, etc.; Car Model Recognition distinguishes City, Civic, Vios, Altis,
Sunny, Presea, etc.]
Figure 1.1 Objects being Recognized by Vehicle Detection, Vehicle Type Recognition,
and Car Model Recognition
1.3 Statement of the Thesis
An investigation of many cars shows that car back sides contain red colors at the
tail lights and have other geometric properties, such as symmetry and correlation
between internal components. Based on these observations, the thesis tackles the
objectives above as follows.
1. It proposes a method for describing the region of red colors in color spaces. As
can be seen in section 3.3, red areas at tail lights do not contain only the single
pure red color, i.e., [255, 0, 0] in RGB color space; rather, they contain all the
colors in a certain region of the color space. Therefore, the thesis proposes a
statistical approach for approximating such distributions.
2. The thesis uses red colors to limit the region of the image searched for car back
views, thereby speeding up the segmentation task. Car back views are detected
in this thesis by verifying the geometric properties of candidate car back views.
3. The Eigen technique is used for selecting representative and discriminative
features of car models. These features are used to recognize car models with
linear discriminant functions, quadratic discriminant functions, and the nearest
neighbor rule.
1.4 Assumption of this Study
In order to realize the ideas above, the thesis assumes that the following conditions
are satisfied.
1. Images used in this thesis are captured by a digital camera in near-field view
from the back of cars. The setup of the camera system is shown in section 3.2.
2. The scenes surrounding the cars should be controlled so that they contain only
small red areas.
3. Car back sides should not be distorted or changed considerably. This
assumption guarantees that cars can be detected more reliably. Moreover, it
ensures that intensities at the same position across the car back-view images of
each class are correlated, which increases the recognition performance.
1.5 Theory or Concept to be Used in this Research
Definition 1.1: The terms car make and model are used to refer to subclasses of
vehicles, as shown in Figure 1.1.
Definition 1.2: The names of the components of car back sides that are referred
to in this thesis are given in Figure 1.2.
Definition 1.3: The term red color in this thesis does not mean the pure red color
of the color space, i.e., [255, 0, 0] in RGB; it can be any color that may appear in the
red areas of tail lights.
[Figure: a car back side with labeled components: windshield, spoiler, license
plate, bumper, left and right tail lights, and the red areas in the tail lights]
Figure 1.2 Name of Several Components in Car Back-Sides
Definition 1.4: In this thesis, we use the colors that appear in the red areas of tail
lights as reference objects for segmenting car back-view images. We name such
colors reference colors or, interchangeably, interested colors. Regions of color space
that contain such colors are named reference color regions or interested color
regions.
Definition 1.5: Rather than specifying each color of a color space as a reference
color individually, we collect a set of such colors and then seek a way to infer the
reference color regions from this set. Colors collected for this purpose are called
sample reference colors, or sample colors for short.
Chapter 2
Literature Survey
2.1 Vehicle Recognition
Car Model Recognition is a small branch of Vehicle Recognition, a very broad field
that can be approached in many ways. A large number of approaches originate from
researchers in Intelligent Transportation Systems and have only a weak connection
to Image Processing and Pattern Recognition. However, many others, which are
surveyed in this section, require substantial effort from researchers in these fields.
2.1.1 Sensor Selection
Generally, the first step in designing a vehicle recognition system is to select
suitable sensor types for acquiring the input data. The subsequent steps in vehicle
recognition depend heavily on the selected sensors.
Sensors can be classified into two types [1], active and passive. The term "active"
means that the sensor detects the distance of objects by measuring the travel time of
a signal emitted by the sensor and reflected by the objects. Radar-based, laser-based,
and acoustic-based sensors are examples of this category. Meanwhile, optical
sensors such as normal cameras are classified as passive sensors; they are
sometimes also called vision-based sensors. Vehicle recognition that uses
vision-based sensors is called vision-based vehicle recognition, which is the context
of the study in this thesis.
Although vision-based sensors are less robust than radar-based and laser-based
sensors in rain, fog, darkness, and direct sunshine, they are inexpensive and able to
provide a broad field of view (up to 360 degrees around the vehicle). Moreover, they
can be used for other specific applications, such as lane marking detection and
obstacle identification, without requiring any modification to the road
infrastructure. Vision-based sensors also avoid interference between sensors of the
same type, which can be critical when a large number of vehicles using active
sensors move simultaneously in the same environment. These reasons explain why
the vision-based approach has received much attention from vehicle recognition
researchers in recent years.
The following three sections survey existing approaches for the remaining steps
of vehicle recognition with vision-based sensors.
2.1.2 Vehicle Detection
Vehicle detection is the step of vehicle recognition that locates vehicles within
whole images. The locations of vehicles are usually described by rectangular regions
of the image. Such regions are called regions of interest, or ROIs, in some
applications [10]. Although detecting ROIs is straightforward in systems that use
active sensors, it is a complicated task in vision-based systems.
Generally, the framework for detecting ROIs contains two basic steps, as follows.
1. The first step generates candidate ROIs within the whole image. There are two
basic approaches, called exhaustive and selective detection in this thesis.
2. The second step verifies the candidates to decide whether each candidate is a
real vehicle ROI. Because the majority of the systems investigated, such as [2]
[8], treat verification as a two-class recognition problem, the description of
candidate verification is deferred to the "Recognition" section below.
2.1.2.1 Exhaustive Detection
Existing studies in this approach assume that no prior knowledge is available for
detection. Hence, in order to detect vehicles, windows of several different sizes are
slid over the whole image to generate candidates [8] [9] [12]. Research in this
approach can detect vehicles at every pixel of the input image. Clearly, however, a
tremendous number of candidates is generated this way; this approach therefore
needs powerful computing resources and seems prohibitive for real-time
applications. Inputs to systems developed with this approach are usually still
images.
2.1.2.2 Selective Detection
This approach generates candidates around only the most likely regions by utilizing
specific information, and can therefore speed up the detection process. Such specific
information can come from several sources, which are summarized as follows.
2.1.2.2.1 Subtraction-based Method
Most research that uses vision-based sensors alone follows this method of
candidate generation. Candidates are generated by subtracting a background image
from the input image, or by subtracting two consecutive images of an image
sequence [2] [11] [13] [14] [15]. The former is used only when the background can
be modeled or collected reliably, while the latter is usually used for detecting
moving objects in image sequences.
A typical background subtraction was studied in [2] [11]; because stationary
vision-based sensors were used in a controllable environment, the background
image, denoted $I_{bg}$, could be modeled reliably at program start-up. To detect
vehicles in an image $I$, a binary image $I_b$ was formed as in equation (2.1),
where $\theta$ is a threshold that converts the difference between the two images
into a binary image. White pixels of $I_b$ lying inside sufficiently large regions were
considered ROI pixels.
$$I_b(x, y) = \begin{cases} 1, & |I(x, y) - I_{bg}(x, y)| \geq \theta \\ 0, & \text{otherwise} \end{cases} \qquad (2.1)$$
On the other hand, the studies in [13] and [15] adapt the background to changes in
the environment by an algorithm called self-adaptive background subtraction. The
principle of the method is to update the background image ($CB$) using an
instantaneous background ($IB$) and an appropriate weight $\alpha$ as follows:
$$CB_{k+1} = (1 - \alpha)\, CB_k + \alpha\, IB_k$$
where $k$ is the frame index of the image sequence. The instantaneous background
is defined as $IB_k = M_k \cdot CB_k + (\sim M_k) \cdot I_k$, where $I_k$ is the
current frame and $M_k$ is the binary vehicle mask, analogous to $I_b$ above.
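A minimal NumPy sketch of both rules, assuming grayscale frames and illustrative
parameter values:

```python
import numpy as np

def vehicle_mask(frame, background, theta=30.0):
    """Equation (2.1): mark pixels whose difference from the background
    reaches the threshold theta."""
    diff = np.abs(frame.astype(np.float64) - background)
    return (diff >= theta).astype(np.uint8)

def update_background(cb, frame, mask, alpha=0.05):
    """Self-adaptive update CB_{k+1} = (1 - alpha)*CB_k + alpha*IB_k.
    The instantaneous background IB_k keeps the old background under
    vehicle pixels (mask == 1) and takes the current frame elsewhere."""
    ib = np.where(mask == 1, cb, frame.astype(np.float64))
    return (1.0 - alpha) * cb + alpha * ib
```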
2.1.2.2.2 Knowledge-based Method
Knowledge-based methods utilize properties of vehicles such as their symmetry,
colors, edges and textures to hypothesize vehicle locations in images.
1. Symmetry
Symmetry is one of the main signatures of man-made objects and is very useful for
detecting and recognizing vehicles [4]. Images of vehicles observed from the back
or front are in general symmetric in the horizontal or vertical direction. This
observation has been used for vehicle detection in several studies [16]-[20].
In [16] and [18], vertical symmetry axes were computed for the input images and
also for their edge maps. Perspective and size constraints were used to limit the
regions in which symmetry axes were sought. From these axes, candidates were
then established by finding corners, i.e., the four corners of rectangular ROIs, or by
a backtracking algorithm.
In [17], a symmetry measure $S_A(x_s, w)$ was computed for each scan-line of the
image, where $x_s$ is the position of a potential symmetry axis within an interval of
width $w$ on the scan-line. The symmetry measures of all scan-lines were
accumulated to form a symmetry histogram for the image, and ROI candidates were
then derived from the symmetry histogram and the edge map of the image.
On the other hand, the work in [19] used the symmetry property as a criterion for
validating ROI candidates, and in [20] symmetry detection was formulated as an
optimization problem solved using neural networks.
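The exact form of $S_A(x_s, w)$ used in [17] is not reproduced here, so the sketch
below uses one simple hypothetical choice: counting edge pixels that are mirrored
about a candidate axis, accumulated over all scan-lines into a histogram whose
peaks suggest vertical symmetry axes.

```python
import numpy as np

def symmetry_histogram(edges, w=40):
    """For each candidate axis position x_s, score how well a binary edge
    map is mirrored within a half-window w, summed over all scan-lines."""
    height, width = edges.shape
    hist = np.zeros(width)
    for xs in range(w, width - w):
        left = edges[:, xs - w:xs]                     # columns left of the axis
        right = edges[:, xs + 1:xs + w + 1][:, ::-1]   # mirrored right columns
        hist[xs] = np.sum(left * right)                # edges matched on both sides
    return hist  # peaks indicate likely vertical symmetry axes
```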
2. Color
Although color information is very useful in face detection [21] [22] and in related
applications such as lane and road detection [23], only a few existing systems use
color for detecting vehicles.
In [23], a set of sample road colors was collected; regions of color space containing
road colors were then approximated with spheres by a density-based learning
algorithm. The L*u*v* color space was used in order to achieve the best perceptual
uniformity. Roads were detected by checking whether each pixel of the input image
fell inside or outside the approximated region.
A typical study that uses color for detecting vehicles was presented in [24]. In that
work, the colors of cars and of the background were collected and normalized by a
method proposed there. Both the normalized car colors and the background colors
were assumed to follow Gaussian models; thereby, every pixel of an image could be
classified as foreground (car) or background by a Bayesian classifier. Pixels
classified as foreground were good suggestions for the locations of cars in the
image.
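A sketch of such a pixel classifier, under the stated Gaussian assumption and with
equal class priors (the color normalization step of [24] is omitted here):

```python
import numpy as np

def gaussian_model(colors):
    """Fit a mean and covariance to an (N, 3) array of sample colors."""
    mu = colors.mean(axis=0)
    cov = np.cov(colors, rowvar=False)
    return mu, np.linalg.inv(cov), np.log(np.linalg.det(cov))

def log_density(pixels, model):
    """Per-pixel Gaussian log-density, up to a constant shared by models."""
    mu, inv_cov, log_det = model
    d = pixels - mu
    return -0.5 * (np.einsum('ij,jk,ik->i', d, inv_cov, d) + log_det)

def classify_pixels(pixels, car_model, bg_model):
    """Bayesian rule with equal priors: a pixel is foreground when the
    car-color density exceeds the background-color density."""
    return log_density(pixels, car_model) > log_density(pixels, bg_model)
```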
3. Shadow
According to [25], the shadow underneath a vehicle can be used as a sign for
detecting the vehicle, because these regions are darker and cooler than others in
the image. Such signs are clearly useful for locating the position of a vehicle in an
image. However, it is difficult to choose a suitable threshold value to segment
shadows from other regions. Moreover, the shadow depends heavily on the
illumination conditions and the moving direction of the vehicle.
4. Vertical/Horizontal Edges and Corners
The boundaries of vehicle back sides are nearly rectangular; moreover, vehicle
back-view images usually contain many horizontal and vertical lines. Based on
these observations, the studies in [16] [18] [26] [27] [28] have proposed several
ways of using edges and corners to hypothesize the locations of vehicles in images.
The method presented in [16] and [18] generated vehicle candidates by combining
symmetry properties, corners, and edges obtained from the edge maps of images.
On the other hand, the method proposed in [26] segmented images into four
regions (pavement, sky, and two lateral regions) using edge grouping. Groups of
horizontal edges on the detected pavement were then considered for hypothesizing
the presence of vehicles.
2.1.3 Feature Extraction
Feature extraction is the step that obtains the characteristic features used either to
verify the candidates generated in the detection step above or to recognize vehicles
in vehicle recognition applications. Finding robust and discriminative features is the
main challenge of this step. The following sections present the ways of extracting
features used in the majority of the literature.
2.1.3.1 Vehicle Features
Studies in this group aim to extract features that are physical properties of vehicles,
e.g., length, height, width, the number of axles and wheels, and color. Except for the
numbers of axles and wheels, which are usually measured by active sensors [6]
[29], these properties have been estimated from images as in [2] [13] [15] [30]. In
those studies, lengths and widths were estimated as the width and the height of the
vehicle regions in 2-D images, respectively; meanwhile, heights were computed
from two images by a stereo-based approach in [6].
As another way, in [7], the distributions of colors in several areas of car back-view
images, such as the tail lights, license plate, and windshield, were employed as
features to characterize cars in a car detection application.
2.1.3.2 Statistical Features
In this approach, the images containing vehicles (ROIs) are converted into 1-D
vectors. Features are obtained by projecting these vectors onto pre-computed
directions. Such directions are eigenvectors derived from a set of training images.
This method is called Principal Component Analysis (PCA) and is presented in
detail later in this chapter. Typical works that follow this approach are [8] and [9].
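A minimal sketch of PCA feature extraction for vectorized ROIs, using the standard
small-sample ("snapshot") trick when the number of training images is far smaller
than the number of pixels:

```python
import numpy as np

def pca_basis(train_images, k=20):
    """Top-k eigenvectors of the training covariance, found via the small
    n x n eigenproblem A A^T instead of the huge d x d pixel covariance."""
    X = np.stack([im.ravel().astype(np.float64) for im in train_images])
    mean = X.mean(axis=0)
    A = X - mean                              # (n, d) centered training data
    vals, vecs = np.linalg.eigh(A @ A.T)      # eigenvectors of the Gram matrix
    order = np.argsort(vals)[::-1][:k]
    U = A.T @ vecs[:, order]                  # lift back to pixel space
    U /= np.linalg.norm(U, axis=0)            # unit-length directions, (d, k)
    return mean, U

def pca_features(image, mean, U):
    """Project a vectorized ROI onto the pre-computed directions."""
    return U.T @ (image.ravel() - mean)       # k-dimensional feature vector
```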
2.1.3.3 Transform Domain Features
Features extracted by this approach are computed as the results of a transformation
such as the Gabor filter [10] [11] [31] or the Wavelet transform [31].
Gabor filter responses for an image $I(x, y)$ of size $N \times N$ can be computed
as in the equation below:
$$g(X, Y, \theta_k, \lambda) = \sum_{x=-X}^{N-X-1} \sum_{y=-Y}^{N-Y-1} I(X+x,\, Y+y)\, f(x, y, \theta_k, \lambda)$$
where $\theta_k$ and $\lambda$ are the orientation and wavelength of the Gabor
kernel function, which is the result of modulating a 2-D sine signal with a Gaussian
envelope and is defined as:
$$f(x, y, \theta_k, \lambda) = \exp\!\left[-\frac{1}{2}\left\{\frac{(x\cos\theta_k + y\sin\theta_k)^2}{\sigma_x^2} + \frac{(-x\sin\theta_k + y\cos\theta_k)^2}{\sigma_y^2}\right\}\right] \cdot \exp\!\left[\,j\,2\pi\,\frac{x\cos\theta_k + y\sin\theta_k}{\lambda}\right]$$
Here $\theta_k$ depends on the total number $n$ of kernel orientations and can be
expressed as
$$\theta_k = \frac{\pi}{n}(k - 1), \qquad k = 1, 2, \ldots, n$$
The study in [11] demonstrated that Gabor filter responses alone can discriminate
three types of vehicles: sedan, van, and pickup. Meanwhile, Gabor filter responses
were combined with Legendre moments in [26] to characterize vehicles for vehicle
detection.
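The kernel translates directly into code. A sketch with illustrative parameter
values, building the complex kernel on a finite grid and taking the response
magnitude over the image (scipy.signal.convolve2d performs the summation):

```python
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(size, theta_k, lam, sigma_x=4.0, sigma_y=4.0):
    """Complex Gabor kernel: a 2-D sinusoid of wavelength lam at
    orientation theta_k, modulated by a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta_k) + y * np.sin(theta_k)    # rotated coordinates
    yr = -x * np.sin(theta_k) + y * np.cos(theta_k)
    envelope = np.exp(-0.5 * ((xr / sigma_x) ** 2 + (yr / sigma_y) ** 2))
    return envelope * np.exp(2j * np.pi * xr / lam)

def gabor_responses(image, n=4, size=21, lam=8.0):
    """Response magnitudes at n orientations theta_k = (pi/n)(k - 1)."""
    return [np.abs(convolve2d(image, gabor_kernel(size, np.pi * k / n, lam),
                              mode='same')) for k in range(n)]
```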
2.1.3.4 Generic Features
The term "generic" implies that the methods in this approach use general image
processing algorithms, such as edge detection [32] and histograms [33], for
extracting features.
Xiaoxu et al. proposed in [32] a method for extracting features that combines the
following steps.
1. Extract edge points using edge detection methods.
2. Use SIFT [34] as a local descriptor to extract local features for each edge point.
3. Segment the edge points into groups based on their similarity.
4. Form features from the edge-point segments.
On the other hand, features were obtained in [33] by forming the histogram of a
distance map, i.e., the map of distances from each pixel of the input image to the
corresponding pixel of the class mean image.
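A sketch of this distance-map feature, assuming 8-bit grayscale images and an
illustrative bin count:

```python
import numpy as np

def distance_map_histogram(image, class_mean, bins=32):
    """Histogram of per-pixel distances between an input image and the
    mean image of a class; the normalized histogram is the feature."""
    dist = np.abs(image.astype(np.float64) - class_mean)
    hist, _ = np.histogram(dist, bins=bins, range=(0.0, 255.0), density=True)
    return hist
```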
2.1.4 Recognition
Recognition is the step that assigns a class label to unknown objects [5]; however,
in VBVR there is some confusion in the use of the terms "recognition" and
"detection". This is probably because detection can be seen as a recognition
problem with two classes: vehicles versus background and other obstacles [2] [8].
The recognition step usually depends on the kind of representative features used
for vehicles. For example, in [7], the colors of several components of car back views,
such as the tail lights, license plate, and windshield, were modeled by a Gaussian
Mixture Model (GMM). To recognize vehicles, a likelihood ratio, defined as the
quotient of the likelihood of the test image over the likelihood of the training
images, was computed. This value was compared to a predefined range to yield the
recognition result, which in that study was also the detection result.
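A sketch of that decision rule using scikit-learn's GaussianMixture; working with
average log-likelihoods, the quotient test becomes a bound on the log-likelihood
difference (the component count and tolerance are illustrative):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_component_model(train_colors, n_components=3):
    """Fit a GMM to (N, 3) colors sampled from one component (e.g. the
    tail lights) of the training images, keeping the training score."""
    gmm = GaussianMixture(n_components=n_components, random_state=0)
    gmm.fit(train_colors)
    return gmm, gmm.score(train_colors)   # mean log-likelihood per sample

def region_matches(test_colors, model, tol=1.0):
    """Accept when the test log-likelihood stays within tol of the training
    value, i.e. the likelihood quotient lies in a predefined range."""
    gmm, train_ll = model
    return abs(gmm.score(test_colors) - train_ll) <= tol
```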
In a similarly simple way, to recognize an unknown object in [11], Gabor jets were
computed for each pixel of the test image and compared with those derived from
the training images. The unknown object was then given the label of the class that
best matched the test image.
In contrast to such specific methods, most existing studies utilize classical pattern
classification methods such as QDF, K-NN, PNN, and SVM [8] [9] [10] for
recognizing unknown objects.
2.2 Color Image Segmentation
Image segmentation plays a central role in vision-based recognition tasks such as
vehicle recognition and face recognition; it is the process of partitioning an image
into meaningful regions. Such a partition is the obligatory first step of a vision
system, and its quality deeply affects the performance and accuracy of the overall
system. Recent studies favor segmentation techniques for color images, which
naturally carry more features than monochrome images.
The colors of interested objects in images are characterized by their chromaticity
and brightness [35] and are therefore affected by the lighting conditions. For this
reason, interested colors are distributed randomly in color space with an unknown
probability density function. Although several color spaces have been employed to
make the perception of colors more uniform, the unknown form of the distribution
has not been filtered out completely. Hence, modeling the density of interested
colors is still a problematic task.
In some applications where the lighting condition is controllable and the form or
parameters of the color density functions can be acquired or simply estimated,
interested color regions in color spaces can be described using cubes [21] [36],
spheres [37], or ellipses [22]. Generally speaking, the assumptions of those
approaches are rarely satisfied in broader cases of color image segmentation.
Moreover, several of the aforementioned works employed a manual way of
extracting the parameters.
The research in [24] approached color distribution modeling in a statistical way. It
required the acquisition of both interested and background colors, and utilized a
Bayesian classifier to segment incoming image pixels. Generally, this approach
works well when the distributions of interested and background colors are normal
and separable from each other. However, that requirement is often impractical.
Based on the assumption that any distribution of points in multi-dimensional space
can be approximated by a GMM with enough mixing components [38], works in [39],