Algorithms (Part 35)

FINDING THE CONVEX HULL
Exercises
1. Suppose it is known in advance that the convex hull of a set of points is
   a triangle. Give an easy algorithm for finding the triangle. Answer the
   same question for a quadrilateral.
2. Give an efficient method for determining whether a point falls within a
   given convex polygon.
3. Implement a convex hull algorithm like insertion sort, using your method
   from the previous exercise.
4. Is it strictly necessary for the Graham scan to start with a point
   guaranteed to be on the hull? Explain why or why not.
5. Is it strictly necessary for the package-wrapping method to start with a
   point guaranteed to be on the hull? Explain why or why not.
6. Draw a set of points that makes the Graham scan for finding the convex
   hull particularly inefficient.
7. Does the Graham scan work for finding the convex hull of the points
   which make up the vertices of any simple polygon? Explain why or give
   a counterexample showing why not.
8. What four points should be used for the Floyd-Eddy method if the input
   is assumed to be randomly distributed within a circle (using random polar
   coordinates)?
9. Run the package-wrapping method for large point sets with both x and
   y equally likely to be between 0 and 1000. Use your curve-fitting routine
   to find an approximate formula for the running time of your program for
   a point set of size N.
10. Use your curve-fitting routine to find an approximate formula for the
    number of points left after the Floyd-Eddy method is used on point sets
    with x and y equally likely to be between 0 and 1000.

26. Range Searching
Given a set of points in the plane, it is natural to ask which of those
points fall within some specified area. “List all cities within 50 miles of
Providence” is a question of this type which could reasonably be asked if a
set of points corresponding to the cities of the U.S. were available. When the
geometric shape is restricted to be a rectangle, the issue readily extends to
non-geometric problems. For example, “list all those people between 21 and
25 with incomes between $60,000 and $100,000” asks which “points” from a
file of data on people’s names, ages, and incomes fall within a certain rectangle
in the age-income plane.
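
The age-income query above can be sketched as a brute-force rectangle test over a
file of records. This is only an illustration in Python (the book's own programs
are in Pascal); the Person type, the sample records, and the function names
in_rectangle and range_query are hypothetical and do not come from the text.

```python
from typing import NamedTuple


class Person(NamedTuple):
    # Hypothetical record layout: a name plus two ordered attributes.
    name: str
    age: int
    income: int


def in_rectangle(p, age_lo, age_hi, income_lo, income_hi):
    """Return True if the record lies inside the closed age-income rectangle."""
    return age_lo <= p.age <= age_hi and income_lo <= p.income <= income_hi


def range_query(people, age_lo, age_hi, income_lo, income_hi):
    """Brute force: test every record, so each query costs time proportional to N."""
    return [p for p in people
            if in_rectangle(p, age_lo, age_hi, income_lo, income_hi)]


# Made-up sample data, for illustration only.
people = [
    Person("Ann", 23, 72_000),
    Person("Bob", 30, 65_000),
    Person("Cy", 22, 45_000),
    Person("Dee", 25, 100_000),
]

if __name__ == "__main__":
    for p in range_query(people, 21, 25, 60_000, 100_000):
        print(p.name)
```

Every query here examines all N records, which is the cost that a preprocessing
step is meant to avoid when only a few records fall in the range.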
Extension to more than two dimensions is immediate. If we want to list
all stars within 50 light years of the sun, we have a three-dimensional problem,
and if we want the rich young people of the second example in the paragraph
above to be tall and female as well, we have a four-dimensional problem. In
fact, the dimension can get very high for such problems.
In general, we assume that we have a set of records with certain
attributes that take on values from some ordered set. (This is sometimes called
a database, though more precise and complete definitions have been developed
for this important term.) The problem of finding all records in a database
which satisfy specified range restrictions on a specified set of attributes is
called range searching. For practical applications, this is a difficult and
important problem. In this chapter, we’ll concentrate on the two-dimensional
geometric problem in which records are points and attributes are their
coordinates, then we’ll discuss appropriate generalizations.
The methods that we’ll look at are direct generalizations of methods that
we have seen for searching on single keys (in one dimension). We presume that
many queries will be made on the same set of points, so the problem splits into
two parts: we need a preprocessing algorithm, which builds the given points
into a structure supporting efficient range searching, and a range-searching
algorithm, which uses the structure to return points falling within any given
(multidimensional) range. This separation makes different methods difficult
to compare, since the total cost depends not only on the distribution of the
points involved but also on the number and nature of the queries.
The range-searching problem in one dimension is to return all points
falling within a specified interval. This can be done by sorting the points
for preprocessing and then using binary search (to find all points in a given
interval, do a binary search on the endpoints of the interval and return all the
points that fall in between). Another solution is to build a binary search tree
and then do a simple recursive traversal of the tree, returning points that are
within the interval and ignoring parts of the tree that are outside the interval.
For example, the binary search tree that is built using the x coordinates of
our points from the previous chapter, when inserted in the given order, is the
following:
Now, the program required to find all the points in a given interval
is a direct generalization of the treeprint procedure of Chapter 14. If the
left endpoint of the interval falls to the left of the point at the root, we
(recursively) search the left subtree, similarly for the right, checking each
node we encounter to see whether its point falls within the interval:
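type interval = record x1, x2: integer end;
procedure range(t: link; int: interval);
  { key, l, r and the sentinel z are assumed to follow the tree nodes of Chapter 14 }
  var tx1, tx2: boolean;
  begin
  if t<>z then
    begin
    tx1:=t^.key>=int.x1;  { the left subtree may hold points in the interval }
    tx2:=t^.key<=int.x2;  { the right subtree may hold points in the interval }
    if tx1 then range(t^.l, int);
    if tx1 and tx2 then printnode(t);  { printnode is assumed, as for treeprint }
    if tx2 then range(t^.r, int);
    end
  end;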
(This program could be made slightly more efficient by maintaining the
interval int as a global variable rather than passing its unchanged values through
the recursive calls.) For example, when called on the interval for the
example tree above, range prints out E C H F I. Note that the points returned
do not necessarily need to be connected in the tree.
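
For readers who want something to run, the recursive search described above can
be sketched in Python roughly as follows. This is not the book's program: the
Node class, the insert helper, the name bst_range, and the sample keys are
assumptions made for illustration, and the book's own example tree is not
reproduced here.

```python
class Node:
    """A plain binary search tree node (hypothetical layout)."""

    def __init__(self, key, name):
        self.key = key
        self.name = name
        self.left = None
        self.right = None


def insert(root, key, name):
    """Standard unbalanced BST insertion; equal keys go to the right."""
    if root is None:
        return Node(key, name)
    if key < root.key:
        root.left = insert(root.left, key, name)
    else:
        root.right = insert(root.right, key, name)
    return root


def bst_range(t, lo, hi, out):
    """Append to out the names of all nodes whose key lies in [lo, hi]."""
    if t is None:
        return
    go_left = t.key >= lo    # the left subtree can still contain keys >= lo
    go_right = t.key <= hi   # the right subtree can still contain keys <= hi
    if go_left:
        bst_range(t.left, lo, hi, out)
    if go_left and go_right:
        out.append(t.name)   # this node's key is inside the interval
    if go_right:
        bst_range(t.right, lo, hi, out)


if __name__ == "__main__":
    # Made-up keys, inserted in this order.
    root = None
    for key, name in [(8, "a"), (3, "b"), (10, "c"), (1, "d"),
                      (6, "e"), (14, "f"), (4, "g"), (7, "h")]:
        root = insert(root, key, name)
    found = []
    bst_range(root, 4, 8, found)
    print(found)
```

Because the left subtree is searched before the node is reported and the right
subtree afterwards, the names come out in increasing key order, whatever order
the points were inserted in.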
These methods require time proportional to about N log N for
preprocessing, and time proportional to about R + log N for range, where R is the number
of points actually falling in the range. (The reader may wish to check that
this is true.) Our goal in this chapter will be to achieve these same running
times for multidimensional range searching.
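
One way to check these running times for the sorting method is to write it out.
The following Python sketch is an illustration under stated assumptions, not the
book's code: the names preprocess and range_search are hypothetical, and it uses
the standard bisect module for the two binary searches.

```python
from bisect import bisect_left, bisect_right


def preprocess(keys):
    """Sort once: about N log N work, shared by all later queries."""
    return sorted(keys)


def range_search(sorted_keys, lo, hi):
    """Return all keys in the closed interval [lo, hi].

    Two binary searches cost about log N each; copying the slice
    costs about R, the number of keys actually in the range.
    """
    i = bisect_left(sorted_keys, lo)    # first position with key >= lo
    j = bisect_right(sorted_keys, hi)   # first position with key > hi
    return sorted_keys[i:j]


if __name__ == "__main__":
    table = preprocess([8, 3, 10, 1, 6, 14, 4, 7])  # made-up sample keys
    print(range_search(table, 4, 8))
```

The query cost depends on N only through the two binary searches, so a query
that returns few points is cheap even for a large table.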
The parameter R can be quite significant: given the facility to make range
queries, it is easy for a user to formulate queries which could require all or
nearly all of the points. This type of query could reasonably occur in many
applications, but sophisticated algorithms are not necessary if all queries are
of this type. The algorithms that we consider are designed to be efficient for
queries which are not expected to return a large number of points.
Elementary Methods

In two dimensions, our “range” is an area in the plane. For simplicity, we’ll
consider the problem of finding all points whose x coordinates fall within a
given x-interval and whose y coordinates fall within a given y-interval: that
is, we seek all points falling within a given rectangle. Thus, we’ll assume a
type rectangle which is a record of four integers, the horizontal and vertical
interval endpoints. The basic operation that we’ll use is to test whether a
point falls within a given rectangle, so we’ll assume a function
insiderect(p: point; rect: rectangle) which checks this in the obvious way, returning true if
