VIDEO BASED OBSTACLE DETECTION FOR MOBILE ROBOT


VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF ENGINEERING AND TECHNOLOGY

VIDEO BASED OBSTACLE DETECTION FOR
MOBILE ROBOT

SCIENCE RESEARCH CONTEST FOR UNDERGRADUATE
STUDENTS 2012

Student: Nguyễn Huy Tuân
Gender: Male
Class: K53CA – Computer Science
Supervisor: Ph.D. Lê Thanh Hà

HA NOI - 2012


ACKNOWLEDGEMENT

First and foremost, I would like to express my special thanks to my supervisor,
Ph.D. Le Thanh Ha, for his encouragement, good teaching and good ideas. I would
have been lost without his enthusiasm.

I am also grateful to the lecturers at the University of Engineering and
Technology, who taught me so much during my four years of study here.

My sincere thanks also go to my classmates, who helped me a lot during my
research.

Last but not least, I would like to thank my parents, who have continuously
supported and loved me throughout my life.


ABSTRACT

Computer vision is a field of computer science that has been heavily researched in
recent years. Among its many applications, which span image processing, video
processing and adaptive mobile robots, obstacle detection is a very important
sub-field. Obstacle detection is vital in a variety of safety and driver assistance
systems, such as autonomous cars, collision avoidance and crash warning systems.
This project investigates obstacle detection.

In this report we introduce several approaches to the obstacle detection task, such
as using a laser sensor, radar sensor, infra-red sensor, digital camera, or a
combination of any two of those.

Our system uses a digital camera to capture video of a real scene. We then combine
two fundamental techniques, motion estimation and image segmentation, to detect any
obstacle from two consecutive frames of that video. At the moment we are still in
phase 1 (motion estimation), for which we have some results, and we plan to start
phase 2 soon.


CONTENTS
I. Introduction
1. Obstacle Detection
2. This Project
3. Overview of Report structure
II. Literature Review
Related Work
1. Use Infrared sensor
2. Use Microwave radar
3. Use laser scanner
4. Use digital cameras
5. Combination of laser sensor and digital camera
Our Approach
1. Speeded-Up Robust Features (SURF)
2. Motion Estimation
3. Block Matching Algorithm
4. Image Segmentation
III. Implementation
1. Design Plan
2. Implementation
IV. Evaluation
V. Conclusion and Future Research


LIST OF FIGURES

Figure 1. Obstacle Detection using Infra-red sensor
Figure 2. A laser sensor is located at a height h on the vehicle with a depression
angle. The laser hits an obstacle of height p.
Figure 3. Block diagram of the navigation algorithm
Figure 4. Obstacle detection using webcam and laser pointer
Figure 5. Input images of SURF
Figure 6. Output images matching
Figure 7. Example of motion estimation
Figure 8. Block Matching a macro block of side 16 pixels and a search parameter p of
size 7 pixels
Figure 9. The idea of Block Matching Algorithm – Exhaustive Search
Figure 10. Example of image segmentation
Figure 11. Output of ES Block Matching Algorithm
Figure 12. Input consecutive images
Figure 13. Output motion field image


I. Introduction
Computer vision is a field that includes methods for acquiring, processing,
analyzing, and understanding images and, in general, high-dimensional data from
the real world in order to produce numerical or symbolic information, e.g., in the
forms of decisions. A theme in the development of this field has been to duplicate
the abilities of human vision by electronically perceiving and understanding an
image. This image understanding can be seen as the disentangling of symbolic
information from image data using models constructed with the aid of geometry,
physics, statistics, and learning theory.
Applications range from tasks such as industrial machine vision systems which,
say, inspect bottles speeding by on a production line, to research into artificial
intelligence and computers or robots that can comprehend the world around them.
The computer vision and machine vision fields have significant overlap. Computer
vision covers the core technology of automated image analysis which is used in
many fields. Machine vision usually refers to a process of combining automated
image analysis with other methods and technologies to provide automated
inspection and robot guidance in industrial applications [1].
1. Obstacle Detection
Obstacle Detection (OD) is defined as the determination of whether a given space
is free of obstacles for safe travel by an autonomous vehicle. OD has been applied
in many kinds of real-life applications, such as OD systems that help motorcars on
highways avoid collisions with other cars or foreign objects on the road, robot
navigation, and experiments on the vision of insects.
Previous work has made use of a variety of approaches, such as infra-red sensors,
common radar, microwave-based radar, digital cameras, laser sensors, or a
combination of laser sensors and video cameras.
2. This Project

Although the approach that uses a laser sensor returns very good results, for
reasons of cost and practicality (a laser sensor is quite expensive) we use a
digital camera in our system. Thus, our system operates on video captured from a
digital camera.
The ultimate goal of our system is to create an obstacle detection system for a
mobile robot mounted with a camera, so that it can detect an obstacle by processing
the video that the camera captures while the robot travels. The system should
detect an obstacle (if there is one) as quickly as possible and with high accuracy.
3. Overview of Report structure
The report of this project is divided into five chapters. The first chapter is the
Introduction, which has so far introduced our problem, our solution and the
approaches taken before in a general way. The remainder of this report is organized
as follows:
Chapter 2, Literature Review, presents several approaches that have been taken to
solve our problem in work related to this project, along with the background
algorithms and techniques used in our approach.
Chapter 3, Implementation, summarizes how we approached the problem. Specifically,
we describe our design plan (the phases to be carried out) and the implementation
of each phase.
Chapter 4, Evaluation, describes our experiments and compares the results with
another method.
Chapter 5, Conclusion and Future Research, concludes this report and presents what
we need to do in the future to complete and improve our project.

II. Literature Review
Related Work
Previous work taken in this field has made use of many approaches.
1. Use Infrared sensor
The basic concept of Infra-red (IR) obstacle detection is to transmit an IR signal
(radiation) in a given direction; a signal is received at the IR receiver when the
IR radiation bounces back from the surface of an object.

In Figure 1, the object can be anything that has a certain shape and size. The IR
LED transmits the IR signal onto the object, and the signal is reflected back from
the surface of the object. The reflected signal is received by an IR receiver. The
IR receiver can be a photodiode/phototransistor or a ready-made module that decodes
the signal [2].
Figure 1. Obstacle Detection using Infra-red sensor

2. Use Microwave radar
In this approach, researchers from Tokyo Institute of Technology used a
millimeter-wave radar to repeatedly scan the plane that might contain an obstacle.
Each scan outputs a set of data including radial distance, angle, relative radial
velocity and reflection intensity for each reflection. They then segmented each
radar frame based on this data to obtain several clusters. The segmented clusters
are tracked, and from them important information about the obstacle (if there is
one) is extracted, such as object position, distance to the object, object width
and relative velocity of the object [3].

3. Use laser scanner
Laser range scanners operate by sweeping a laser over a region of interest and
measuring, at each pixel, the time it takes for the laser to leave and return to the
sensor. Since the speed of light is known, the distance to every pixel can be
calculated. Most laser range scanners also provide the intensity of the returned
signal at every pixel. However, this second piece of information, often referred to
as the reflectance, has essentially been ignored by most researchers. The laser
reflectance ought to provide us with a direct means of finding obstacles or vertical
surfaces [4].
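As a worked illustration of the time-of-flight principle described above, the following Python sketch converts a measured round-trip time into a range. The function name and the sample value are illustrative assumptions and do not come from the cited work [4]; the only physics used is that the pulse travels to the target and back, so the one-way distance is half of the speed of light multiplied by the round-trip time.

```python
# Speed of light in vacuum, in metres per second (exact by definition).
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def range_from_round_trip(round_trip_seconds: float) -> float:
    """Return the one-way distance in metres for a measured round-trip time."""
    if round_trip_seconds < 0:
        raise ValueError("round-trip time must be non-negative")
    # The pulse covers the distance twice (out and back), hence the division by 2.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0


if __name__ == "__main__":
    # A hypothetical 100 ns round trip corresponds to roughly 15 m.
    print(f"{range_from_round_trip(100e-9):.2f} m")
```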

Figure 2. A laser sensor is located at a height h on the vehicle with a depression
angle. The laser hits an obstacle of height p.
4. Use digital cameras
Kahlouche Souhila and Achour Karim from Algeria developed an approach that uses a
digital camera to detect obstacles. They process an image sequence grabbed from a
camera mounted on a robot by computing the optical flow of these images and
extracting useful information from the optical flow for use in their navigation
algorithm.
Specifically, they first estimate the motion field from consecutive captured images
using a technique based on the intensity conservation of a moving point. After
computing the optical flow, they find the depth information for each motion vector
by combining the time-to-contact computation with the robot’s speed at the time the
images were taken. Using the computed depth image, the robot knows which scene
point is farthest and which is nearest, and finally decides on its navigation zone.

Figure 3. Block diagram of the navigation algorithm
5. Combination of laser sensor and digital camera

This method simply combines a digital camera with a laser sensor to give a
better result.

Figure 4. Obstacle detection using webcam and laser pointer
Our Approach
For reasons of cost and practicality, we approach the problem by using video
captured from a digital camera to detect obstacles ahead of a robot. Before going
into our design plan and implementation, the following sections cover several
fundamental concepts and the background of the algorithms and techniques that we
have tried to use in our method.
1. Speeded-Up Robust Features (SURF)
The goal of this algorithm is to find discrete image correspondences. That is,
given two images of the same scene taken from different angles and at different
zoom levels, SURF matches corresponding points in the first image to points in the
second image. The result is as follows:

Figure 5. Input images of SURF

Figure 6. Output images matching

The SURF algorithm includes 3 main steps:
Step 1. Select interest points at distinctive locations in the image (corners,
blobs, T-junctions, etc.).
Step 2. Extract the interest point descriptors. The neighborhood of every interest
point is represented by a feature vector:
a. Each interest point is assigned a reproducible orientation
b. A scale-dependent window is constructed for each interest point
c. A 64-dimensional vector is extracted from each descriptor window
Step 3. Match descriptor vectors between different images. (The matching process is
often based on a distance between vectors, e.g. the Mahalanobis or Euclidean
distance.)
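Step 3 can be illustrated with a short Python/NumPy sketch. It is not SURF itself: it assumes the 64-dimensional descriptors have already been extracted, and it only shows nearest-neighbour matching under the Euclidean distance. The function name and the synthetic descriptors are assumptions made for illustration.

```python
import numpy as np


def match_descriptors(desc_a: np.ndarray, desc_b: np.ndarray) -> np.ndarray:
    """For each row of desc_a, return the index of the closest row of desc_b.

    desc_a has shape (n, d) and desc_b has shape (m, d); closeness is the
    Euclidean distance between descriptor vectors.
    """
    # Pairwise differences via broadcasting: shape (n, m, d).
    diff = desc_a[:, None, :] - desc_b[None, :, :]
    # Euclidean distance between every pair of descriptors: shape (n, m).
    dist = np.sqrt((diff ** 2).sum(axis=2))
    return dist.argmin(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical 64-dimensional descriptors for 5 interest points.
    first = rng.normal(size=(5, 64))
    # The second image holds the same descriptors, shuffled and slightly noisy.
    order = np.array([3, 0, 4, 1, 2])
    second = first[order] + 0.01 * rng.normal(size=(5, 64))
    print(match_descriptors(first, second))
```

A practical matcher would usually also reject ambiguous matches, for example by comparing the best and second-best distances; that refinement is left out here to keep the sketch short.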
2. Motion Estimation
Motion estimation is the process of determining motion vectors that describe the
transformation from one 2D image to another, usually from adjacent frames in a
video sequence. It is an ill-posed problem as the motion is in three dimensions but
the images are a projection of the 3D scene onto a 2D plane. The motion vectors
may relate to the whole image (global motion estimation) or specific parts, such as
rectangular blocks, arbitrary shaped patches or even per pixel. The motion vectors
may be represented by a translational model or many other models that can
approximate the motion of a real video camera, such as rotation and translation in
all three dimensions and zoom.

Optical flow is a concept that is closely related to motion estimation, where the
vectors correspond to the perceived movement of pixels [5].

Figure 7. Example of motion estimation

3. Block Matching Algorithm


Figure 8. Block Matching a macro block of side 16 pixels and a search parameter p
of size 7 pixels

The underlying assumption behind motion estimation is that the patterns
corresponding to objects and background in a frame of a video sequence move
within the frame to form corresponding objects in the subsequent frame. The idea
behind block matching is to divide the current frame into a matrix of “macro
blocks”, each of which is compared with the corresponding block and its adjacent
neighbors in the previous frame to create a vector that describes the movement of
the macro block from one location to another between the two frames. This
movement, calculated for all the macro blocks comprising a frame, constitutes the
motion estimated in the current frame. The search area for a good macro block match
is constrained to p pixels on all four sides of the corresponding macro block in
the previous frame. This p is called the search parameter. Larger motions require a
larger p, and the larger the search parameter, the more computationally expensive
the process of motion estimation becomes. Usually the macro block is taken as a
square of side 16 pixels, and the search parameter p is 7 pixels. The idea is
represented in Figure 8. The matching of one macro block with another is based on
the output of a cost function. The macro block that results in the least cost is
the one that most closely matches the current block. There are various cost
functions, of which the most popular and least computationally expensive is the
Mean Absolute Difference (MAD), given by equation (i). Another cost function is the
Mean Squared Error (MSE), given by equation (ii) [6].
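Equations (i) and (ii) are not reproduced in this copy of the report. For reference, the usual definitions of these two cost functions, as given in [6], are restated below, where N is the side of the macro block and C_ij and R_ij are the pixels being compared in the current macro block and the reference macro block, respectively. The symbol names are an assumption of this restatement.

```latex
\begin{align}
\mathrm{MAD} &= \frac{1}{N^2} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left| C_{ij} - R_{ij} \right| \tag{i} \\
\mathrm{MSE} &= \frac{1}{N^2} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left( C_{ij} - R_{ij} \right)^2 \tag{ii}
\end{align}
```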

Many algorithms have been proposed to implement block matching, such as Exhaustive
Search (ES), Three Step Search (TSS), New Three Step Search (NTSS), Simple and
Efficient Search (SES), Four Step Search (4SS), Diamond Search (DS) and Adaptive
Rood Pattern Search (ARPS). ES is the most computationally expensive of these
algorithms because it calculates the cost function at every possible location in
the search window. However, ES returns the most accurate result, so we chose to
implement ES for the sake of accuracy.
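To make the cost of Exhaustive Search concrete, the sketch below counts how many candidate positions and per-pixel comparisons ES needs for one frame. It assumes that only whole macro blocks are processed and that every candidate inside the search window of p pixels on each side is evaluated (candidates falling outside the frame are not discounted), so the figures are upper bounds. The function name is illustrative.

```python
def exhaustive_search_cost(width: int, height: int, block: int = 16, p: int = 7):
    """Return (blocks, candidates_per_block, pixel_comparisons) for one frame."""
    # Number of whole macro blocks that fit in the frame.
    blocks = (width // block) * (height // block)
    # Displacements from -p to +p in both directions, including zero.
    candidates_per_block = (2 * p + 1) ** 2
    # Each candidate compares every pixel of the macro block once.
    pixel_comparisons = blocks * candidates_per_block * block * block
    return blocks, candidates_per_block, pixel_comparisons


if __name__ == "__main__":
    # A 360x300 frame with block size 16 and search parameter 7.
    print(exhaustive_search_cost(360, 300))  # (396, 225, 22809600)
```

With p = 7, each macro block requires up to 225 cost evaluations, which is the price ES pays for checking every location in the search window.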

Figure 9. The idea of Block Matching Algorithm – Exhaustive Search

4. Image Segmentation
The purpose of image segmentation is to partition an image into meaningful
regions with respect to a particular application. The result of image segmentation is
a set of segments that collectively cover the entire image, or a set of contours
extracted from the image. Each of the pixels in a region is similar with respect to
some characteristic or computed property, such as color, intensity, or texture.
Adjacent regions are significantly different with respect to the same characteristics.
When applied to a stack of images, typical in medical imaging, the resulting
contours after image segmentation can be used to create 3D reconstructions with
the help of interpolation algorithms [7].
There are several algorithms and techniques which have been developed for image
segmentation such as Thresholding, Clustering, Histogram-based, Edge detection
or Graph partitioning methods.

Figure 10. Example of image segmentation

III. Implementation
1. Design Plan

In order to create an obstacle detection system for our robot based on video
captured from the robot’s camera, we plan to build the system in two main
phases:

1.1. Estimate the motion field (optical flow) from two consecutive images. Each
pixel in one frame is assigned a motion vector that represents the
movement of that pixel between frame 1 and frame 2.
1.2. Image segmentation: segment the motion field extracted in phase 1. If
there is an obstacle ahead of the robot, the segmented motion field will
have two separate regions, because all pixels that belong to the obstacle
will have motion vectors whose direction differs from that of the motion
vectors of points on the road surface. If there is no obstacle on the
road, the segmentation result will have only one region.
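The idea behind phase 1.2 can be illustrated with a deliberately simple Python/NumPy sketch. It is not the project's segmentation method, since that phase has not been implemented yet. It only assumes that the road surface produces the dominant flow direction, and it labels any motion vector that deviates from that direction by more than a hypothetical threshold. The function name, the 45-degree threshold and the use of the component-wise median as the dominant vector are all assumptions made for illustration.

```python
import numpy as np


def label_by_direction(flow: np.ndarray, max_deviation_deg: float = 45.0) -> np.ndarray:
    """Label each motion vector: 0 = dominant direction, 1 = deviating direction.

    flow has shape (rows, cols, 2) and holds one (dx, dy) vector per position.
    """
    flow = flow.astype(float)
    # Take the component-wise median vector as the dominant (road) motion.
    dominant = np.median(flow.reshape(-1, 2), axis=0)
    dominant_norm = np.linalg.norm(dominant)
    labels = np.zeros(flow.shape[:2], dtype=int)
    if dominant_norm == 0:
        return labels  # no dominant motion to compare against
    norms = np.linalg.norm(flow, axis=2)
    moving = norms > 0  # zero vectors have no direction and stay labeled 0
    cosine = np.zeros_like(norms)
    cosine[moving] = (flow[moving] @ dominant) / (norms[moving] * dominant_norm)
    # A small cosine means a large angle to the dominant direction.
    threshold = np.cos(np.radians(max_deviation_deg))
    labels[moving & (cosine < threshold)] = 1
    return labels


if __name__ == "__main__":
    # Synthetic field: the road moves downward, a 2x2 patch moves sideways.
    field = np.zeros((4, 6, 2))
    field[..., 1] = 3.0
    field[1:3, 2:4] = (4.0, 0.0)
    print(label_by_direction(field))
```

When every vector shares the dominant direction, all labels are 0, which corresponds to the single-region case with no obstacle.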


2. Implementation
2.1. Estimate motion field: We have been estimating the motion vectors from
two consecutive images using the Block Matching algorithm with
Exhaustive Search. Below is the pseudo-code of the Exhaustive Search
algorithm:
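The pseudo-code itself is not reproduced in this copy of the report. As a stand-in, the following Python/NumPy sketch shows one way Exhaustive Search block matching with the MAD cost can be written; it is not the authors' MATLAB program. The function name, the array-based interface and the convention that each vector points from a block in the current frame to its best match in the previous frame are assumptions of this sketch.

```python
import numpy as np


def exhaustive_search(prev: np.ndarray, curr: np.ndarray,
                      block: int = 16, p: int = 7) -> np.ndarray:
    """Return an array of shape (rows, cols, 2) holding (dy, dx) per macro block."""
    height, width = curr.shape
    rows, cols = height // block, width // block
    vectors = np.zeros((rows, cols, 2), dtype=int)
    prev_f = prev.astype(float)
    curr_f = curr.astype(float)
    for r in range(rows):
        for c in range(cols):
            y, x = r * block, c * block
            target = curr_f[y:y + block, x:x + block]
            best_cost = np.inf
            # Try every displacement in the (2p + 1) x (2p + 1) search window.
            for dy in range(-p, p + 1):
                for dx in range(-p, p + 1):
                    yy, xx = y + dy, x + dx
                    # Skip candidates that fall partly outside the frame.
                    if yy < 0 or xx < 0 or yy + block > height or xx + block > width:
                        continue
                    candidate = prev_f[yy:yy + block, xx:xx + block]
                    cost = np.abs(target - candidate).mean()  # MAD cost
                    if cost < best_cost:
                        best_cost = cost
                        vectors[r, c] = (dy, dx)
    return vectors
```

For two grayscale frames stored as 2-D arrays, `exhaustive_search(prev, curr)` returns one (dy, dx) vector per 16 x 16 macro block; the four nested loops make the cost of checking every candidate position explicit.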

The result of estimating the motion with our implementation of this
algorithm in MATLAB 7.0.1 is shown below:


Figure 11. Output of ES Block Matching Algorithm

We also tried other open-source code for estimating motion vectors,
such as Motion Estimation using the Watson-Ahumada Algorithm. The
result is quite good, as follows:

Figure 12. Input consecutive images


Figure 13. Output motion field image

This implementation of the Watson-Ahumada algorithm is open source and written in
C#, and you can download it from this link:
/>Estimation-Using-the-Watson
Because the above Exhaustive Search program was written in MATLAB, it still
has a high computation time and low accuracy. Therefore we are going to
rewrite the program in C++ or use a good open-source implementation that is
already written in C++. The rewritten program will be integrated into another
program called CxImage to help display the result more visually.

2.2. Image segmentation
We are going to carry out this phase right after choosing the best motion
estimation algorithm to draw the optical flow in phase 1.

IV. Evaluation
We ran experiments with two algorithms, Block Matching Algorithm Exhaustive Search
(BMAES) and the Watson-Ahumada (WA) algorithm described above, and compared them to
see which one has the lower computation time and which one gives the more accurate
result. At the moment our ES program is written in MATLAB, while Watson-Ahumada is
implemented in C#. Our experiments were carried out with two sets of input images.
The first input set is the two consecutive images in Figure 11 and the second input
set is the two consecutive images in Figure 12. For the MATLAB program, we tested
the two input sets with two parameters, block size N = 16 and search range R = 7,
which are the most commonly used values. The results are shown in the following
table:

                           BMA Exhaustive Search    Watson-Ahumada Algorithm
Input set 1 (360x300 px)   204 seconds              25 seconds
Input set 2 (600x450 px)   583 seconds              75 seconds

Table 1: Time comparison between Exhaustive Search and Watson-Ahumada
As we can see from the table, the WA algorithm is roughly 8 times faster than
Exhaustive Search block matching (204/25 ≈ 8.2 for input set 1 and 583/75 ≈ 7.8 for
input set 2), although part of this gap may be due to the two programs being written
in different languages. The output optical flow of WA also looks more accurate than
that of BMA Exhaustive Search. In the near future we will find a way to evaluate the
accuracy of the motion field estimated by the two algorithms in order to state which
algorithm gives the more accurate optical flow.
V. Conclusion and Future Research
In this project, we introduced approaches to the obstacle detection problem. We
designed and began implementing one approach that uses a digital camera to capture
consecutive images of a real scene and detects obstacles by processing these images.
Next, we will complete our project by choosing the most suitable motion estimation
algorithm and carrying out the segmentation phase on the resulting optical flow.

REFERENCES
[1]
[2]
[3] Sugimoto, S., Tateda, H., Takahashi, H. and Okutomi, M., Obstacle Detection
Using Millimeter-wave Radar and Its Visualization on Image Sequence
[4] Hancock, J., Hebert, M. and Thorpe, C., Laser Intensity-based Obstacle
Detection
[5]
[6] Barjatya, A., Block Matching Algorithms For Motion Estimation
[7]
