Data Structures Lecture, Final Part

Chapter 19
Web Crawler
Copyright © 2005 Pearson Addison-Wesley. All rights reserved.
Chapter Objectives

Provide a case study example from
problem statement through
implementation

Demonstrate how hash tables and
graphs can be used to solve a problem
Web Crawler

A web crawler is a system that
searches the web, beginning with a
user-designated web page, looking for a
designated target string

A web crawler follows all of the links on
each page that it encounters until there
are no more pages or until it reaches a
designated limit
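
The two per-page operations described above (looking for the target string and finding the links to follow) can be sketched as below. This is a minimal illustration, not the chapter's implementation: the class name PageScan, both method names, and the simplified href pattern are assumptions made for the example, and a real crawler would more likely use an HTML parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
public class PageScan {
    // Simplified pattern for double-quoted href attributes.
    // It ignores single-quoted and unquoted attribute values.
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    /** Returns true if the page text contains the target string (case-sensitive). */
    public static boolean containsTarget(String html, String target) {
        return html.contains(target);
    }

    /** Returns the link targets found in the page text, in order of appearance. */
    public static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1)); // group 1 is the text between the quotes
        }
        return links;
    }
}
```

Links returned this way may be relative (for example "b.html"), so they would still need to be resolved against the page's own address before being followed.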
Web Crawler



For this case study, we will create a
graphical web crawler with the
following requirements:

Enter a designated starting web page

Enter a target string for which to search

Limit the search to 50 pages

Display the results when done
Web Crawler - Design

Our web crawler system consists of
three high-level components:

The driver

The graphical user interface

The web crawler implementation, which
makes use of graphs and hash tables

Web Crawler - Design

The algorithm for the web crawler is as
follows:

Add the starting page to a HashSet of pages to
be searched and to our graph

Remove a page from the set of pages to be
searched

Search the page for the target string

    If the string is found, add the page to the list of results

Search the page for links

    If the links have not already been searched, add them to
    the set of pages to be searched and to our graph

Repeat the three previous steps until our limit is
reached or the set is empty
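
The steps above can be sketched in Java roughly as follows. This is a sketch under stated assumptions rather than the chapter's code: the class name CrawlSketch, the two in-memory maps standing in for the web (page text and outgoing links), the adjacency-set representation of the graph, and the extra "seen" set used to decide whether a link has already been queued are all choices made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical class; names and data representation are assumptions.
public class CrawlSketch {
    /** Link graph as adjacency sets: page -> pages it links to. */
    public final Map<String, Set<String>> graph = new HashMap<>();
    /** Pages whose text contained the target string. */
    public final List<String> results = new ArrayList<>();

    public void crawl(String start, String target, int limit,
                      Map<String, String> pageText,
                      Map<String, List<String>> pageLinks) {
        Set<String> toSearch = new HashSet<>(); // pages still to be searched
        Set<String> seen = new HashSet<>();     // every page ever queued

        // Add the starting page to the set and to the graph.
        toSearch.add(start);
        seen.add(start);
        graph.put(start, new HashSet<>());

        int searched = 0;
        while (!toSearch.isEmpty() && searched < limit) {
            // Remove a page from the set (a HashSet gives no particular order).
            Iterator<String> it = toSearch.iterator();
            String page = it.next();
            it.remove();
            searched++;

            // Search the page for the target string.
            String text = pageText.getOrDefault(page, "");
            if (text.contains(target)) {
                results.add(page);
            }

            // Search the page for links; queue the ones not seen before.
            List<String> links =
                pageLinks.getOrDefault(page, Collections.<String>emptyList());
            for (String link : links) {
                if (seen.add(link)) {           // true only the first time
                    toSearch.add(link);
                    graph.put(link, new HashSet<>());
                }
                graph.get(page).add(link);      // record the edge either way
            }
        }
    }
}
```

Because pages are removed from a HashSet, the order in which they are visited (and therefore the order of the results list) is not fixed. With the case study's requirement, the limit argument would be 50.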
FIGURE 19.1 User interface design
FIGURE 19.2 UML description
