The Internet Encyclopedia, part 9


P1: 57
Yu WL040/Bidgolio-Vol I WL040-Sample.cls June 20, 2003 17:52 Char Count= 0
WEB SEARCH TECHNOLOGY 750
document selector is to utilize the fact that most search
engines return retrieved results in groups. Usually, only
the top 10 to 20 results are returned in the first result page
but the user can make additional requests for more result
pages and more results. Hence, a document selector may
ask each search engine to return the first few result pages.
This method tends to return the same number of pages
from each selected search engine. Since different search
engines may contain different numbers of useful pages for
a given query, retrieving the same number of pages from
each search engine is likely to cause over-retrieval from
less useful databases and under-retrieval from highly use-
ful databases.
More elaborate document selection methods try to tie
the number of pages to retrieve from a search engine to the
ranking score (or the rank) of the search engine relative
to the ranking scores (or ranks) of other search engines.
This can lead to proportionally more pages to be retrieved
from search engines that are ranked higher or have higher
ranking scores. This type of approach is referred to as a
weighted allocation approach by Meng et al. (2002).
For each user query, the database selector of the
metasearch engine computes a rank (i.e., 1st, 2nd, )
and a ranking score for each local search engine. Both
the rank information and the ranking score information
can be used to determine the number of pages to retrieve
from different local search engines. For example, in the
D-WISE system (Yuwono & Lee, 1997), the ranking score


information is used. Suppose for a given query q, r_i denotes
the ranking score of the local database D_i, i = 1, ..., k,
where k is the number of selected local databases for
the query, and α = Σ_{j=1}^{k} r_j denotes the total ranking score
for all selected local databases. D-WISE uses the ratio r_i/α
to determine how many pages should be retrieved from
D_i. More precisely, if m pages across these k databases
are to be retrieved, then D-WISE retrieves m * r_i/α pages
from database D_i. An example system that uses the rank
information to select documents is CORI Net (Callan
et al., 1995). Specifically, if m is the total number of pages
to be retrieved from the k selected local search engines, then

m * 2(1 + k − i) / (k(k + 1))

pages are retrieved from the ith ranked local database,
i = 1, ..., k. Since 2(1 + k − u) / (k(k + 1)) > 2(1 + k − v) / (k(k + 1))
for u < v, more pages will be retrieved from the uth ranked
database than from the vth ranked database. Because
Σ_{i=1}^{k} 2(1 + k − i) / (k(k + 1)) = 1,
exactly m pages will be retrieved from the k top-ranked
databases. In practice, it may be wise to retrieve slightly
more than m pages from local databases in order to reduce
the likelihood of missing useful pages.
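The two allocation schemes above can be sketched as follows. This is a minimal illustration; the function names and the list-based interface are ours, not taken from D-WISE or CORI Net:

```python
def dwise_allocation(scores, m):
    """D-WISE-style weighted allocation: database D_i with ranking
    score r_i is asked for m * r_i / alpha pages, where alpha is the
    sum of all ranking scores."""
    alpha = sum(scores)
    return [m * r / alpha for r in scores]

def corinet_allocation(k, m):
    """CORI Net-style allocation: the i-th ranked database (i = 1..k)
    is asked for m * 2(1 + k - i) / (k(k + 1)) pages; the k quotas
    always sum to exactly m."""
    return [m * 2 * (1 + k - i) / (k * (k + 1)) for i in range(1, k + 1)]
```

For example, with ranking scores 0.25, 0.5, 0.25 and m = 8, D-WISE allocates 2, 4, and 2 pages; with k = 3 and m = 12, CORI Net allocates 6, 4, and 2 pages to the first, second, and third ranked databases. Fractional quotas would be rounded in practice.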
It is possible to combine document selection and
database selection into a single integrated process. In
Database Selection, we described a method for ranking
databases in descending order of the estimated similar-
ity of the most similar document in each database for
a given query. A combined database selection and doc-
ument selection method for finding the m most similar
pages based on these ranked databases was proposed in

Yu et al. (1999). This method is sketched below. First, for
some small positive integer s (e.g., s can be 2), each of the
s top-ranked databases is searched to obtain the actual
global similarity of its most similar page. This may require
some locally top-ranked pages to be retrieved from
each of these databases. Let min_sim be the minimum of
these s similarities. Next, from these s databases, retrieve
all pages whose actual global similarities are greater than
or equal to min_sim. If m or more pages have been retrieved,
then sort them in descending order of similarities,
return the top m pages to the user, and terminate this process.
Otherwise, the next top-ranked database (i.e., the
(s + 1)th ranked database) is considered and its most similar
page is retrieved. The actual global similarity of this
page is then compared with the current min_sim and the
minimum of these two similarities will be used as the
new min_sim. Then retrieve from these s + 1 databases
all pages whose actual global similarities are greater than
or equal to the new min_sim. This process is repeated until
m or more pages are retrieved and the m pages with
the largest similarities are returned to the user. A seeming
problem with this combined method is that the same
database may be searched multiple times. In practice, this
problem can be avoided by retrieving and caching an appropriate
number of pages when a database is searched

for the first time. In this way, all subsequent “interactions”
with the database would be carried out using the cached
results. This method has the following property (Yu et al.,
1999). If the databases containing the m desired pages are
ranked higher than other databases and the similarity (or
desirability) of the mth most similar (desirable) page is
distinct, then all of the m desired pages will be retrieved
while searching at most one database that does not con-
tain any of the m desired pages.
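The combined selection procedure can be sketched as follows. This is our own illustration, in which each database is modeled simply as a descending list of the actual global similarities of its pages, glossing over the live searching and result caching described above:

```python
def combined_selection(ranked_dbs, m, s=2):
    """ranked_dbs: one list of global similarities per database, each
    sorted in descending order, with the databases themselves already
    ranked. Returns the similarities of the top m retrieved pages."""
    active = list(ranked_dbs[:s])
    # global similarity of each active database's most similar page
    min_sim = min(db[0] for db in active)
    nxt = s
    while True:
        # retrieve from the active databases every page whose actual
        # global similarity is >= min_sim
        retrieved = [sim for db in active for sim in db if sim >= min_sim]
        if len(retrieved) >= m or nxt >= len(ranked_dbs):
            return sorted(retrieved, reverse=True)[:m]
        # otherwise bring in the next ranked database, lower min_sim
        # to account for its most similar page, and repeat
        active.append(ranked_dbs[nxt])
        min_sim = min(min_sim, ranked_dbs[nxt][0])
        nxt += 1
```

With three databases holding similarities [0.9, 0.5], [0.8, 0.7], [0.6, 0.2] and m = 3, the first pass retrieves only two pages (0.9 and 0.8), so the third database is brought in, min_sim drops to 0.6, and the result is [0.9, 0.8, 0.7].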
Result Merging
Ideally, a metasearch engine should provide local system
transparency to its users. From a user's point of view,
such transparency means that the metasearch engine
should behave like a regular search engine: when a user
submits a query, the user does not need to be aware that
multiple search engines may be used to process it, and
when the user receives the search result, the fact that the
results are retrieved from multiple search engines should
be hidden. Result merging is a necessary task in providing
this transparency. When
merging the results returned from multiple search en-
gines into a single result, pages in the merged result
should be ranked in descending order of global similari-
ties (or global desirabilities). However, the heterogeneities
that exist among local search engines and between the
metasearch engine and local search engine make result
merging a challenging problem. Usually, pages returned
from a local search engine are ranked based on these
pages’ local similarities. Some local search engines make

the local similarities of returned pages available to the
user (as a result, the metasearch engine can also ob-
tain the local similarities) while other search engines
do not make them available. For example, Google and
AltaVista do not provide local similarities while Northern
Light and FirstGov do. To make things worse, local simi-
larities returned from different local search engines, even
when made available, may be incomparable due to the
use of different similarity functions and term-weighting
schemes by different local search engines. Furthermore,
the local similarities and the global similarity of the same
page may still be quite different, as the metasearch engine
may use a similarity function different from those used in
local systems. In fact, even if the same similarity function
were used by all local systems and the metasearch
engine, local and global similarities of the same page may
still be very different. This is because some statistics used
to compute term weights, for example the document fre-
quency of a term, are likely to be different in different
systems.
The challenge here is how to merge the pages returned
from multiple local search engines into a single ranked list
in a reasonable manner in the absence of local similarities
and/or in the presence of incomparable similarities. An
additional complication is that retrieved pages may be
returned by different numbers of local search engines. For
example, one page could be returned by one of the selected

local search engines and another may be returned by all of
them. The question is whether and how this should affect
the ranking of these pages.
Note that when we say that a page is returned by a
search engine, we really mean that the URL of the page
is returned. One simple approach that can solve all of the
above problems is to actually fetch/download all returned
pages from their local servers and compute their global
similarities in the metasearch engine. One metasearch
engine that employs this approach for result merging
is the Inquirus system. Inquirus ranks pages returned
from local search engines based on analyzing the con-
tents of downloaded pages, and it employs a ranking
formula that combines similarity and proximity matches
(Lawrence & Lee Giles, 1998). In addition to being able
to rank results based on desired global similarities, this
approach also has some other advantages (Lawrence
& Lee Giles, 1998). For example, when attempting to
download pages, obsolete URLs can be discovered. This
helps to remove pages with dead URLs from the final
result list. In addition, downloading pages on the fly
ensures that pages will be ranked based on their current
contents. In contrast, similarities computed by local
search engines may be based on obsolete versions of Web
pages. The biggest drawback of this approach is its slow
speed as fetching pages and analyzing them on the fly
can be time consuming.
Most result merging methods utilize the local similari-
ties or local ranks of returned pages to perform merging.
The following cases can be identified:

Selected Databases for a Given Query Do Not Share
Pages, and All Returned Pages Have Local Similarities
Attached.
In this case, each result page will be returned
from just one search engine. Even though all returned
pages have local similarities, these similarities may be nor-
malized using different ranges by different local search en-
gines. For example, one search engine may normalize its
similarities between 0 and 1 and another between 0 and
1000. In this case, all local similarities should be renor-
malized based on a common range, say [0, 1], to improve
the comparability of these local similarities (Dreilinger &
Howe, 1997; Selberg & Etzioni, 1997).
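A minimal sketch of this renormalization step, assuming the metasearch engine knows each local engine's similarity range:

```python
def renormalize(local_sims, lo, hi):
    """Linearly map similarities from a local engine's own range
    [lo, hi] onto the common range [0, 1], so that scores from
    different engines become comparable."""
    return [(s - lo) / (hi - lo) for s in local_sims]
```

For instance, an engine scoring on [0, 1000] would have the score 500 renormalized to 0.5.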
Renormalized similarities can be further adjusted
based on the usefulness of different databases for the
query. Recall that when database selection is performed
for a given query, the usefulness of each database is esti-
mated and is represented as a score. The database scores
can be used to adjust renormalized similarities. The idea
is to give preference to pages retrieved from highly ranked
databases. In CORI Net (Callan et al., 1995), the adjustment
works as follows. Let s be the ranking score of local
database D and s̄ be the average of the ranking scores of all
searched databases for a given query. Then the following
weight is assigned to D: w = 1 + k * (s − s̄)/s̄, where k
is the number of databases searched for the given query.
It is easy to see from this formula that databases with
higher scores will have higher weights. Let x be the renormalized
similarity of page p retrieved from D. Then CORI
Net computes the adjusted similarity of p as w * x. The result
merger lists returned pages in descending order of adjusted
similarities. A similar method is used in ProFusion
(Gauch et al., 1996). For a given query, the adjusted similarity
of a page p from a database D is the product of
the renormalized similarity of p and the ranking score of
D.
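The CORI Net adjustment can be sketched as follows (an illustration only; as noted, ProFusion would instead multiply the renormalized similarity by the raw database ranking score):

```python
def corinet_adjusted(renorm_sim, score, all_scores):
    """Weight w = 1 + k * (s - s_avg) / s_avg, where s is the
    database's ranking score and s_avg the average score of the k
    searched databases; the adjusted similarity is w times the
    renormalized similarity of the page."""
    k = len(all_scores)
    s_avg = sum(all_scores) / k
    w = 1 + k * (score - s_avg) / s_avg
    return w * renorm_sim
```

A database scoring exactly the average gets weight 1 (its similarities pass through unchanged), while above-average databases have their pages' similarities boosted.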
Selected Databases for a Given Query Do Not Share
Pages, but Some Returned Pages Do Not Have Local
Similarities Attached.
Again, each result page will be re-
turned by one local search engine. In general, there are
two types of approaches for tackling the result-merging
problem in this case. The first type uses the local rank
information of returned pages directly to perform the
merge. Note that in this case, local similarities that may
be available for some returned pages would be ignored.
The second type first converts local ranks to local simi-
larities and then applies techniques described for the first
case to perform the merge.
One simple way to use rank information only for result
merging is as follows (Meng et al., 2002). First, arrange
the searched databases in descending order of usefulness
scores. Next, a round-robin method based on the database
order and the local page rank order is used to produce
an overall rank for all returned pages. Specifically, in
the first round, the top-ranked page from each searched
database is taken and these pages are ordered based on the
database order such that the page order and the database

order are consistent; if not enough pages have been ob-
tained, the second round starts, which takes the second
highest-ranked page from each searched database, orders
these pages again based on the database order, and places
them behind those pages selected earlier. This process is
repeated until the desired number of pages is obtained.
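The round-robin merge can be sketched as follows (our illustration; `per_db_results` holds the ranked result lists, with the databases already ordered by descending usefulness score):

```python
def round_robin_merge(per_db_results, n):
    """Round j (j = 0, 1, ...) appends the j-th ranked page of each
    database in database order, until n pages are collected or all
    result lists are exhausted."""
    merged, rnd = [], 0
    while len(merged) < n:
        progressed = False
        for results in per_db_results:
            if rnd < len(results):
                merged.append(results[rnd])
                progressed = True
                if len(merged) == n:
                    return merged
        if not progressed:  # every database is exhausted
            break
        rnd += 1
    return merged
```

Note how the interleaving keeps the page order consistent with the database order within each round.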
In the D-WISE system (Yuwono & Lee, 1997), the following
method for converting ranks into similarities is
employed. For a given query, let r_i be the ranking score of
database D_i, r_min be the smallest database ranking score, r
be the local rank of a page from D_i, and g be the converted
similarity of the page. The conversion function is
g = 1 − (r − 1) * F_i, where F_i = r_min/(m * r_i) and m is the number
of documents desired across all searched databases. This
conversion has the following properties. First, all locally
top-ranked pages have the same converted similarity, namely
1. Second, F_i is the difference between the converted similarities
of the jth and the ( j + 1)th ranked pages from
database D_i, for any j = 1, 2, .... Note that the difference is
larger for databases with smaller ranking scores. Consequently,
if the rank of a page p in a higher ranked database is
the same as the rank of a page p′ in a lower ranked database
and neither p nor p′ is top-ranked, then the converted similarity
of p will be higher than that of p′. This property can
lead to the selection of more pages from databases with
higher scores into the merged result. As an example, consider
two databases D_1 and D_2. Suppose r_1 = 0.2, r_2 = 0.5,
and m = 4. Then r_min = 0.2, F_1 = 0.25, and F_2 = 0.1. Thus,
the three top-ranked pages from D_1 will have converted
similarities 1, 0.75, and 0.5, respectively, and the three top-ranked
pages from D_2 will have converted similarities 1,
0.9, and 0.8, respectively. As a result, the merged list will
contain three pages from D_2 and one page from D_1.
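The conversion function and the worked example above can be reproduced with a short sketch (the function name is ours):

```python
def dwise_converted_similarity(r, score, min_score, m):
    """D-WISE rank-to-similarity conversion: g = 1 - (r - 1) * F_i,
    with F_i = r_min / (m * r_i), where r is the page's local rank,
    score is the database's ranking score r_i, min_score is the
    smallest database ranking score r_min, and m is the number of
    documents desired across all searched databases."""
    f = min_score / (m * score)
    return 1 - (r - 1) * f
```

With r_1 = 0.2, r_2 = 0.5, and m = 4, the top three pages of D_1 convert to 1, 0.75, 0.5 and those of D_2 to 1, 0.9, 0.8, matching the example in the text.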
Selected Databases for a Given Query Share Pages. In
this case, the same page may be returned by multiple local
search engines. Result merging in this situation is usually
carried out in two steps. In the first step, techniques dis-
cussed in the first two cases can be applied to all pages,
regardless of whether they are returned by one or more

search engines, to compute their similarities for merging.
In the second step, for each page p returned by multi-
ple search engines, the similarities of p due to multiple
search engines are combined in a certain way to gener-
ate a final similarity for p. Many combination functions
have been proposed and studied (Croft, 2000), and some of
these functions have been used in metasearch engines. For
example, the max function is used in ProFusion (Gauch
et al., 1996), and the sum function is used in MetaCrawler
(Selberg & Etzioni, 1997).
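The two-step merge for overlapping results can be sketched as follows (an illustration; passing `max` as the combination function corresponds to ProFusion and `sum` to MetaCrawler):

```python
def merge_overlapping(per_engine_results, combine=max):
    """per_engine_results: one {url: similarity} dict per search
    engine, with similarities already made comparable (step 1).
    Step 2 combines the similarities of any page returned by several
    engines and ranks all pages by the final score."""
    pooled = {}
    for results in per_engine_results:
        for url, sim in results.items():
            pooled.setdefault(url, []).append(sim)
    final = {url: combine(sims) for url, sims in pooled.items()}
    return sorted(final, key=final.get, reverse=True)
```

Note that the choice of combination function changes the ranking: under `sum`, a page returned by several engines is rewarded for each appearance, while under `max` only its best single score counts.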
CONCLUSION
In the past decade, we have witnessed the explosive growth
of the Web. By now, the Web has become the largest
digital library, used by millions of people. Search engines
and metasearch engines have become indispensable tools
for Web users to find desired information.
While most Web users probably have used search en-
gines and metasearch engines, few know the technologies
behind these wonderful tools. This chapter has provided
an overview of these technologies, from basic ideas to
more advanced algorithms. As can be seen from this chapter,
Web-based search technology has its roots in text
retrieval techniques, but it also has many unique features.
Some efforts to compare the quality of different search
engines have been reported (see, for example, Hawking,
Craswell, Bailey, & Griffiths, 2001). An interesting issue is
how to evaluate and compare the effectiveness of different
techniques. Since most search engines employ multiple
techniques, it is difficult to isolate the effect of a particular
technique on effectiveness even when the effectiveness of

search engines can be obtained.
Web-based search is still a relatively young discipline, and
it has much room to grow. The upcoming transition
of the Web from mostly HTML pages to XML pages will
probably have a significant impact on Web-based search
technology.
ACKNOWLEDGMENT
This work is supported in part by NSF Grants
IIS-9902872, IIS-9902792, EIA-9911099, IIS-0208574,
IIS-0208434 and ARO-2-5-30267.
GLOSSARY
Authority page A Web page that is linked from hub
pages in a group of pages related to the same topic.
Collection fusion A technique that determines how
to retrieve documents from multiple collections and
merge them into a single ranked list.
Database selection The process of selecting potentially
useful data sources (databases, search engines, etc.) for
each user query.
Hub page A Web page with links to important (author-
ity) Web pages all related to the same topic.
Metasearch engine A Web-based search tool that uti-
lizes other search engines to retrieve information for
its user.
PageRank A measure of Web page importance based on
how Web pages are linked to each other on the Web.
Result merging The process of merging documents retrieved
from multiple sources into a single ranked list.
Search engine A Web-based tool that retrieves potentially
useful results (Web pages, products, etc.) for each
user query.
Text retrieval A discipline that studies techniques to
retrieve relevant text documents from a document
collection for each query.
Web (World Wide Web) Hyperlinked documents resid-
ing on networked computers, allowing users to navi-
gate from one document to any linked document.
CROSS REFERENCES
See Intelligent Agents; Web Search Fundamentals; Web Site
Design.
REFERENCES
Bergman, M. (2000). The deep Web: Surfacing the hid-
den value. Retrieved April 25, 2002, from http://www.
completeplanet.com/Tutorials/DeepWeb/index.asp
Callan, J. (2000). Distributed information retrieval. In W.
Bruce Croft (Ed.), Advances in information retrieval: Re-
cent research from the Center for Intelligent Information
Retrieval (pp. 127–150). Dordrecht, The Netherlands:
Kluwer Academic.
Callan, J., Connell, M., & Du, A. (1999). Automatic dis-
covery of language models for text databases. In ACM
SIGMOD Conference (pp. 479–490). New York: ACM
Press.
Callan, J., Croft, W., & Harding, S. (1992). The INQUERY
retrieval system. In Third DEXA Conference, Valencia,
Spain (pp. 78–83). Wien, Austria: Springer-Verlag.
Callan, J., Lu, Z., & Croft, W. (1995). Searching dis-

tributed collections with inference networks. In ACM
SIGIR Conference, Seattle (pp. 21–28). New York: ACM
Press.
Chakrabarti, S., Dom, B., Raghavan, P., Rajagopalan, S.,
Gibson, D., & Kleinberg, J. (1998). Automatic resource
compilation by analyzing hyperlink structure and asso-
ciated text. In 7th International World Wide Web Confer-
ence, Brisbane, Australia (pp. 65–74). Amsterdam, The
Netherlands: Elsevier.
Chakrabarti, S., Dom, B., Kumar, R., Raghavan, P.,
Rajagopalan, S., et al. (1999). Mining the Web’s link
structure. IEEE Computer, 32, 60–67.
Croft, W. (2000). Combining approaches to information
retrieval. In W. Bruce Croft (Ed.), Advances in infor-
mation retrieval: Recent research from the Center for
Intelligent Information Retrieval (pp. 1–36). Dordrecht:
Kluwer Academic.
Cutler, M., Deng, H., Manicaan, S., & Meng, W. (1999).
A new study on using HTML structures to improve
retrieval. In Eleventh IEEE Conference on Tools with
Artificial Intelligence, Chicago (pp. 406–409). Washing-
ton, DC: IEEE Computer Society.
Dreilinger, D., & Howe, A. (1997). Experiences with
selecting search engines using metasearch. ACM
Transactions on Information Systems, 15, 195–222.
Fan, Y., & Gauch, S. (1999). Adaptive agents for infor-
mation gathering from multiple, distributed informa-
tion sources. In AAAI Symposium on Intelligent Agents
in Cyberspace, Stanford University (pp. 40–46). Menlo
Park, CA: AAAI Press.

Gauch, S., Wang, G., & Gomez, M. (1996). ProFusion:
Intelligent fusion from multiple, distributed search
engines. Journal of Universal Computer Science, 2, 637–
649.
Gravano, L., Chang, C., Garcia-Molina, H., & Paepcke,
A. (1997). STARTS: Stanford proposal for Internet
meta-searching. In ACM SIGMOD Conference, Tucson,
AZ (pp. 207–218). New York: ACM Press.
Hawking, D., Craswell, N., Bailey, P., & Griffiths, K. (2001).
Measuring search engine quality. Journal of Informa-
tion Retrieval, 4, 33–59.
Hearst, M., & Pedersen, J. (1996). Reexamining the clus-
ter hypothesis: Scatter/gather on retrieval results. In
ACM SIGIR Conference (pp. 76–84). New York: ACM
Press.
Kahle, B., & Medlar, A. (1991). An information system for
corporate users: Wide area information servers (Tech.
Rep. TMC199). Thinking Machines Corporation.
Kirsch, S. (1998). The future of Internet search: Infoseek’s
experiences searching the Internet. ACM SIGIR Forum,
32, 3–7. New York: ACM Press.
Kleinberg, J. (1998). Authoritative sources in a hyper-
linked environment. In Ninth ACM-SIAM Symposium
on Discrete Algorithms (pp. 668–677). Washington, DC:
ACM–SIAM.
Koster, M. (1994). ALIWEB: Archie-like indexing in the
Web. Computer Networks and ISDN Systems, 27, 175–
182.
Lawrence, S., & Lee Giles, C. (1998). Inquirus, the NECI
meta search engine. In Seventh International World

Wide Web Conference (pp. 95–105). Amsterdam, The
Netherlands: Elsevier.
Manber, U., & Bigot, P. (1997). The search broker.
In USENIX Symposium on Internet Technologies and
Systems, Monterey, CA (pp. 231–239). Berkeley, CA:
USENIX.
Meng, W., Yu, C., & Liu, K. (2002). Building efficient and
effective metasearch engines. ACM Computing Surveys,
34, 48–84.
Page, L., Brin, S., Motwani, R., & Winograd, T. (1998).
The PageRank citation ranking: Bringing order to the Web
(Technical Report). Stanford, CA: Stanford University.
Pratt, W., Hearst, M., & Fagan, L. (1999). A knowledge-
based approach to organizing retrieved documents. In
Sixteenth National Conference on Artificial Intelligence
(pp. 80–85). Menlo Park, CA: AAAI Press and Cam-
bridge, MA: MIT Press.
Salton, G., & McGill, M. (1983). Introduction to modern
information retrieval. New York: McGraw-Hill.
Selberg, E., & Etzioni, O. (1997). The MetaCrawler ar-
chitecture for resource aggregation on the Web. IEEE
Expert, 12, 8–14.
Wu, Z., Meng, W., Yu, C., & Li, Z. (2001). Towards a
highly scalable and effective metasearch engine. In
Tenth World Wide Web Conference (pp. 386–395). New
York: ACM Press.
Yu, C., Meng, W., Liu, L., Wu, W., & Rishe, N. (1999).
Efficient and effective metasearch for a large number
of text databases. In Eighth ACM International Con-
ference on Information and Knowledge Management

(pp. 217–224). New York: ACM Press.
Yuwono, B., & Lee, D. (1997). Server ranking for dis-
tributed text resource systems on the Internet. In
Fifth International Conference on Database Systems
for Advanced Applications (pp. 391–400). Singapore:
World Scientific.
P1: JDW
Sahai WL040/Bidgolio-Vol I WL040-Sample.cls July 16, 2003 18:35 Char Count= 0
Web Services
Akhil Sahai, Hewlett-Packard Laboratories
Sven Graupner, Hewlett-Packard Laboratories
Wooyoung Kim, University of Illinois at Urbana-Champaign
Introduction 754
The Genesis of Web Services 754
Tightly Coupled Distributed Software
Architectures 754
Loosely Coupled Distributed Software
Architectures 755
Client Utility 755
Jini 755
TSpaces 755
Convergence of the Two Independent Trends 755
Web Services Today 755
Web Services Description 756
Web Services Discovery 756
Web Services Orchestration 757
Web Services Platforms 758
Security and Web Services 760
Single Sign-On and Digital Passports 760

Payment Systems for Web Services 762
The Future of Web Services 763
Dynamic Web Services Composition and
Orchestration 764
Personalized Web Services 764
End-to-End Web Service Interactions 764
Future Web Services Infrastructures 765
Conclusion 766
Glossary 766
Cross References 766
References 766
INTRODUCTION
There were two predominant trends in computing over
the past decade—(i) a movement from monolithic soft-
ware to distributed objects and components and (ii) an
increasing focus on software for the Internet. Web ser-
vices (or e-services) are a result of these two trends.
Web services are defined as distributed services that are
identified by Uniform Resource Identifiers (URIs), whose
interfaces and binding can be defined, described, and dis-
covered by eXtensible Markup Language (XML) artifacts,
and that support direct XML message-based interactions
with other software applications over the Internet. Web
services that perform useful tasks would often exhibit the
following properties:
Discoverable—The foremost requirement for a Web ser-
vice to be useful in commercial scenarios is that it be
discovered by clients (humans or other Web services).
Communicable—Web services adopt a message-driven
operational model where they interact with each other

and perform specified operations by exchanging XML
messages. The operational model is thus referred to
as the Document Object Model (DOM). Some of the preeminent
communication patterns used between Web services
are synchronous, asynchronous, and transactional
communication.
Conversational—Sending a document or invoking a met-
hod, and getting a reply are the basic communication
primitives in Web services. A sequence of the primi-
tives that are related to each other (thus, conversation)
forms a complex interaction between Web services.
Secure and Manageable—Properties such as security, re-
liability, availability, and fault tolerance are critical for
commercial Web services as well as manageability and
quality of service.
As Web services gain critical mass in the information
technology (IT) industry as well as in academia, the dominant
computing paradigm of software as a monolithic
object-oriented application is gradually giving way to software
as a service accessible via the Internet.
THE GENESIS OF WEB SERVICES
Contrary to general public perception, the development of
Web services followed a rather modest evolutionary path.
The underpinning technologies of Web services borrow
heavily from object-based distributed computing and de-
velopment of the World Wide Web (Berners-Lee, 1996).
In this section, we review the related technologies that
helped shape the notion of Web services.
Tightly Coupled Distributed
Software Architectures

The study of various aspects of distributed computing can
be dated back as early as the invention of time-shared mul-
tiprocessing. Despite the early start, distributed comput-
ing remained impractical until the introduction of Object
Management Group’s (OMG) Common Object Request
Broker Architecture (CORBA) and Microsoft’s Distributed
Component Object Model (DCOM), a distributed ex-
tension to the Component Object Model (COM). Both
CORBA and DCOM create an illusion of a single machine
over a network of (heterogeneous) computers and allow
objects to invoke remote objects as if they were on the
same machine, thereby vastly simplifying object sharing
among applications. They do so by building their abstrac-
tions on more or less OS- and platform-independent mid-
dleware layers. In these software architectures, objects de-
fine a number of interfaces and advertise their services
by registering the interfaces. Objects are assigned identi-
fiers at the time of creation. The identifiers are used for
discovering their interfaces and their implementations. In
addition, CORBA supports discovery of objects using de-
scriptions of the services they provide. Sun Microsystems’
Java Remote Method Invocation (Java RMI) provides a
similar functionality, where a network of platform-neutral
Java virtual machines provides the illusion of a single ma-
chine. Java RMI is a language-dependent solution, though
the Java Native Interface (JNI) provides language independence
to some extent.
The software architectures supported by CORBA and
DCOM are said to be tightly coupled because they define their
own binary message encoding, and thus objects are inter-
operable only with objects defined in the same software
architecture; for example, CORBA objects cannot invoke
methods on DCOM objects. Also, it is worth noting that
security was a secondary concern in these software archi-
tectures—although some form of access control is highly
desirable—partly because method-level/object-level ac-
cess control is too fine-grained and incurs too much over-
head, and partly because these software architectures
were developed for use within the boundary of a single
administrative domain, typically a local area network.
Loosely Coupled Distributed
Software Architectures
Proliferation and increased accessibility of diverse intel-
ligent devices in today’s IT market have transformed the
World Wide Web to a more dynamic, pervasive environ-
ment. The fundamental changes in computing landscape
from a static client-server model to a dynamic peer-to-peer
model encourage reasoning about interaction with these
devices in terms of more abstract notion of service rather
than a traditional notion of object. For example, printing
can be viewed as a service that a printer provides; print-
ing a document is to invoke the print service on a printer
rather than to invoke a method on a proxy object for a
printer.
Such services tend to be dispersed over a wide area,
often crossing administrative boundaries, for better re-

source utilization. This physical distribution calls for
more loosely coupled software architectures where scal-
able advertising and discovery are a must and low-latency,
high-bandwidth interprocessor communication is highly
desirable. As a direct consequence, a number of service-
centric middleware developments have come to light.
We note three distinctive systems from computer in-
dustry’s research laboratories, namely, HP’s client utility
(e-Speak), Sun Microsystems’ Jini, and IBM’s TSpaces
(here listed in the alphabetic order). These have been im-
plemented in Java for platform independence.
Client Utility
HP’s client utility is a somewhat underpublicized system
that became the launching pad for HP’s e-Speak (Karp,
2001). Its architecture represents one of the earlier forms
of peer-to-peer system, which is suitable for Web service
registration, discovery, and invocation (Kim, Graupner, &
Sahai, 2002). The fundamental idea is to abstractly repre-
sent every element in computing as a uniform entity called
“service (or resource).” Using the abstraction as a building
block, it provides facilities for advertising and discovery,
dynamic service composition, mediation and manage-
ment, and capability-based fine-grain security. What dis-
tinguishes client utility most from the other systems is the
fact that it makes advertisement and discovery visible to
clients. Clients can describe their services using vocabu-
laries and can specifically state what services they want to
discover.
Jini
The Jini technology at Sun Microsystems is a set of
protocol specifications that allows services to announce their
presence and discover other services in their vicinity. It ad-
vocates a network-centric view of computing. However,
it relies on the availability of multicast capability, which
practically limits its applicability to services and devices
connected to a local area network (such as a home network).
Jini exploits Java’s code mobility and allows a service to ex-
port stub code which implements a communication proto-
col using Java RMI. Joining, advertisement, and discovery
are done transparently to other services. Jini has been
developed mainly for collaboration within a small, trusted
workgroup and offers limited security and scalability
support.
TSpaces
IBM’s TSpaces (TSpaces, 1999) is network middleware
that aims to enable communication between applications
and devices in a network of heterogeneous computers and
operating systems. It is a network communication buffer
with database capabilities, which extends Linda’s Tuple
space communication model with asynchrony. TSpaces
supports hierarchical access control on the Tuple space
level. Advertisement and discovery are implicit in TSpaces
and provided indirectly through shared Tuple spaces.
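The Linda-style Tuple space model that TSpaces builds on can be illustrated with a minimal in-memory sketch. The class and method names below follow Linda's classic out/rd/in operations and are illustrative only, not the actual TSpaces API:

```python
# A minimal, in-memory sketch of a Linda-style tuple space,
# the communication model that TSpaces extends. Illustrative only.

class TupleSpace:
    def __init__(self):
        self._tuples = []

    def write(self, *fields):
        """Add a tuple to the space (Linda's 'out')."""
        self._tuples.append(tuple(fields))

    def _match(self, template, tup):
        # None acts as a wildcard field in the template.
        return len(template) == len(tup) and all(
            t is None or t == f for t, f in zip(template, tup)
        )

    def read(self, *template):
        """Return a matching tuple without removing it (Linda's 'rd')."""
        for tup in self._tuples:
            if self._match(template, tup):
                return tup
        return None

    def take(self, *template):
        """Remove and return a matching tuple (Linda's 'in')."""
        for i, tup in enumerate(self._tuples):
            if self._match(template, tup):
                return self._tuples.pop(i)
        return None

space = TupleSpace()
space.write("printer", "status", "idle")
print(space.read("printer", "status", None))   # ('printer', 'status', 'idle')
print(space.take("printer", "status", None))   # removes the tuple
print(space.read("printer", "status", None))   # None
```

Because producers and consumers never address each other directly, only the shared space, this style of communication is naturally decoupled in both space and time, which is what makes advertisement and discovery implicit in TSpaces.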
Convergence of the Two Independent Trends
Web services are defined at the cross point of the evolution
paths of service-centric computing and the World Wide
Web. The idea is to provide service-centric computing by
using the Internet as the platform; services are delivered over
the Internet (or an intranet). Since its inception, the World
Wide Web has strived to become a distributed, decentralized,
all-pervasive infrastructure where information is put
out for other users to retrieve. It is this decentralized,
distributed paradigm of information dissemination that
upon meeting the concept of service-centric computing
has led to the germination of the concept of Web services.
The Web services paradigm has caught the fancy of the
research and development community. Many computer
scientists and researchers from IT companies as well as
universities are working together to define concepts, plat-
forms, and standards that will determine how Web ser-
vices are created, deployed, registered, discovered, and
composed as well as how Web services will interact with
each other.
WEB SERVICES TODAY
Web services are appearing on the Internet in the
form of e-business sites and portal sites. For example,
P1: JDW
Sahai WL040/Bidgolio-Vol I WL040-Sample.cls July 16, 2003 18:35 Char Count= 0
WEB SERVICES756
Priceline.com and Expedia.com act as brokers for airlines,
hotels, and car rental companies. Through their portal
sites they offer statically composed Web services based on
prenegotiated understandings with certain airlines and
hotels. These are mostly business-to-consumer (B2C)
Web services. A large number of technologies
and platforms have appeared and been standardized so
as to enable the paradigm of Web services to support
business-to-business (B2B) and B2C scenarios alike in a
uniform manner. These standards enable creation and
deployment, description, and discovery of Web services, as
well as communication amongst them. We describe some
preeminent standards below.
The Web Services Description Language (WSDL) is a
standard to describe service interfaces and publish them
together with services’ access points (i.e., bindings) and
supported interfaces. Once described in WSDL, Web ser-
vices can be registered and discovered using the Univer-
sal Description, Discovery, and Integration (UDDI). Af-
ter having discovered its partners, Web services use the
Simple Object Access Protocol (SOAP), which is in fact
an incarnation of the Remote Procedure Call (RPC) in
XML, over the HyperText Transfer Protocol (HTTP) to ex-
change XML messages and invoke the partners’ services.
Though most services are implemented using platform-
independent languages such as Java and C#, development
and deployment platforms are also being standardized;
J2EE and .NET are two well known ones. Web services
and their users often expect different levels of security
depending on their security requirements and assumptions.
The primary means for enforcing security are digital
signatures and strong encryption using the Public
Key Infrastructure (PKI). SAML, XKMS, and XACML are
some of the recently proposed security standards. Also, many
secure payment mechanisms have been defined. (See
Figure 1).
Web Services Description
In traditional distributed software architectures, devel-
opers use an interface definition language (IDL) to de-
fine component interfaces. A component interface typically
describes the operations the component supports by
specifying their inputs and expected outputs. This enables
developers to decouple interfaces from actual implemen-
tations. As Web services are envisaged as software acces-
sible through the Web by other Web services and users,
Figure 1: Web services (standards and platforms: .NET, UDDI, WSDL, SOAP, J2EE, HPPM/MQSeries, Web Methods).
Web services need to be described so that their interfaces
are decoupled from their implementations. WSDL serves
as an IDL for Web services.
WSDL enables description of Web services indepen-
dently of the message formats and network protocols
used. For example, in WSDL a service is described as a set
of endpoints. An endpoint is in turn a set of operations.
An operation is defined in terms of messages received or
sent out by the Web service:
Message—An abstract definition of data being communi-
cated consisting of message parts.
Operation—An abstract definition of an action supported
by the service. Operations are of the following types:
one-way, request–response, solicit–response, and noti-
fication.
Port type—An abstract set of operations supported by one
or more endpoints.
Binding—A concrete protocol and data format specifica-
tion for a particular port type.
Port—A single endpoint defined as a combination of a
binding and a network address.
Service—A collection of related endpoints.
As the implementation of the service changes or evolves
over time, the WSDL definitions must be continuously
updated and the descriptions versioned.
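The abstract part (messages, port types, operations) and the concrete part (bindings, ports, services) of a description can be seen in a skeletal, heavily simplified WSDL 1.1 document; the quote-service names below are invented for illustration:

```python
# Parse a skeletal WSDL 1.1 document with the standard library and
# walk its abstract and concrete parts. The service, message, and
# operation names are invented; a real WSDL document is much richer.
import xml.etree.ElementTree as ET

WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             name="QuoteService">
  <message name="GetQuoteRequest"/>
  <message name="GetQuoteResponse"/>
  <portType name="QuotePortType">
    <operation name="GetQuote"/>
  </portType>
  <binding name="QuoteSoapBinding" type="QuotePortType"/>
  <service name="QuoteService">
    <port name="QuotePort" binding="QuoteSoapBinding"/>
  </service>
</definitions>"""

NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}
root = ET.fromstring(WSDL)

# Abstract part: port types and the operations they group.
for pt in root.findall("wsdl:portType", NS):
    ops = [op.get("name") for op in pt.findall("wsdl:operation", NS)]
    print(pt.get("name"), ops)   # QuotePortType ['GetQuote']

# Concrete part: services and their ports (binding + address).
for svc in root.findall("wsdl:service", NS):
    for port in svc.findall("wsdl:port", NS):
        print(svc.get("name"), port.get("name"), port.get("binding"))
```

The split mirrors the IDL idea above: a client implementor can code against the port type without ever seeing the binding or network address.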
Web Services Discovery
When navigating the Web for information, we use keywords
to find Web sites of interest through search engines.
Often, useful links in search results are mixed with
a lot of unnecessary ones that need to be sifted through.
Similarly, Web services need to discover compatible
Web services before they undertake business with them.
The need for efficient service discovery necessitates some
sort of Web services clearing house with which Web
services register themselves. UDDI, supported by Ariba,
IBM, Microsoft, and HP, is an initiative to build such
a Web service repository; it is now under the auspices
of OASIS.
These companies maintain public Web-based registries
(operator sites) consistent with each other that make
available information about businesses and their techni-
cal interfaces and application program interfaces (APIs).
A core component of the UDDI technology is registra-
tion, an XML document defining a business and the Web
services it provides. There are three parts to the regis-
tration, namely a white page for name, address, contact
information, and other identifiers; a yellow page for clas-
sification of a business under standard taxonomies; and
a green page that contains technical information about
the Web services being described. UDDI also lists a set of
APIs for publication and inquiry. The inquiry APIs are for
browsing information in a repository (e.g., find_business,
get_businessDetail). The publication APIs are for business
entities to put their information on a repository.
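The three-part registration and the publication/inquiry split might be modeled in a deliberately toy form; the field names and values below are invented and do not follow the actual UDDI schema:

```python
# A toy model of a UDDI-style registry: each registration carries a
# white, yellow, and green page, and the inquiry/publication APIs are
# mimicked by two functions. All names are illustrative assumptions.
registry = []

def publish(white, yellow, green):
    """Toy analogue of the UDDI publication APIs."""
    registry.append({"white": white, "yellow": yellow, "green": green})

def find_business(name_fragment):
    """Toy analogue of the find_business inquiry API."""
    return [r for r in registry if name_fragment in r["white"]["name"]]

publish(
    white={"name": "Acme Travel", "contact": "info@acme.example"},
    yellow={"taxonomy": "NAICS", "code": "561510"},   # classification
    green={"wsdl": "http://acme.example/booking?wsdl"},
)

hits = find_business("Acme")
print(hits[0]["green"]["wsdl"])   # http://acme.example/booking?wsdl
```

Note how the green page is what ties discovery back to the WSDL description: a client that finds a business can fetch the referenced interface and bind to it.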
E-marketplaces have been an important development
in the business transaction arena on the Internet. They
are a virtual meeting place for market participants
(i.e., Web services). In addition to the basic registration
and discovery, e-marketplaces offer their participants a
number of value-added services, including the following:
Enabling inter-Web service interaction after the discovery
(the actual interaction may happen with or without the
direct participation of the e-marketplace);
Enabling supply and demand mechanisms through tradi-
tional catalogue purchasing and request for purchase
(RFP), or through more dynamic auctions and ex-
changes;
Enabling supply-chain management through collabora-
tive planning and inventory handling; and
Other value-added services, such as rating, secured pay-
ment, financial handling, certification services, and no-

tification services.
Thus, e-marketplaces can be developed as entities that
use public UDDI registries. E-marketplaces are cat-
egorized as vertical and horizontal depending on their
target market. The vertical e-marketplaces, such as Ver-
ticalNet, GlobalNetXChange, and Retailer Market Ex-
change, target a specific industry sector where partici-
pants perform B2B transactions. In particular, Chemdex,
E-Steel, DirectAg.com, and many more have been success-
ful in their respective markets. By contrast, horizontal ex-
changes, such as eBay, are directed at a broad range of
clients and businesses.
Web Services Orchestration
By specifying a set of operations in their WSDL document,
Web services make visible to the external world a certain
subset of internal business processes and activities. There-
fore, the internal business processes must be defined and
some of their activities linked to the operations before
publication of the document. This in turn requires mod-
eling a Web service’s back-end business processes as well
as interactions between them. On the other hand, Web ser-
vices are developed to serve and utilize other Web services.
This kind of interaction usually takes the form of a sequence
of message exchanges and operation executions, termed a
conversation. Although conversations are described inde-
pendently of the internal flows of the Web services, they
result in executions of a set of backend processes. A Web
service and its ensuing internal processes together form
what is called a global process.
Intra-Web Service Modeling and Interaction

The Web Services Flow Language (WSFL) (Leymann,
2001), the Web Services Conversation Language (WSCL)
(W3C, 2002), the Web Service Choreography Interface
(WSCI) (BEA, 2002), and XLANG (Thatte, 2001) are some
of the many business process specification languages for
Web services.
WSFL introduces the notion of activities and flows
which are useful for describing both local business pro-
cess flows and global message flows between multiple Web
services. WSFL models business processes as a set of ac-
tivities and links. An activity is a unit of useful work, while
a link connects two activities. A link can be a control link,
where a decision about which activity to follow is made, or a
data link, specifying that a certain datum flows from one
activity to another. These activities may be made visible
through one or more operations grouped as endpoints. As
in WSDL, a set of endpoints defines a service. WSFL de-
fines global message flows in a similar way. A global flow
consists of plug links that link up operations of two ser-
vice providers. Complex services involving more than two
service providers are created by recursively defining plug
links.
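A minimal sketch of the WSFL-style activity/link model might look as follows; the activity names are invented, only control links are modeled, and data simply flows along with control:

```python
# A toy WSFL-style flow: named activities connected by control links,
# executed by following the links from a start activity. The order
# workflow and its activity names are invented for illustration.

activities = {
    "receive_order": lambda data: {**data, "received": True},
    "check_stock":   lambda data: {**data, "in_stock": True},
    "ship":          lambda data: {**data, "shipped": data["in_stock"]},
}

# Control links: (source activity, target activity)
control_links = [("receive_order", "check_stock"), ("check_stock", "ship")]

def run_flow(start, data):
    # Follow control links from the start activity; in this toy model
    # the data dictionary doubles as the data flow between activities.
    current = start
    while current:
        data = activities[current](data)
        successors = [t for s, t in control_links if s == current]
        current = successors[0] if successors else None
    return data

result = run_flow("receive_order", {"order": 42})
print(result["shipped"])   # True
```

In WSFL proper, a subset of these activities would then be exposed as operations grouped into endpoints, and plug links would wire one provider's operations to another's.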
XLANG, developed by Microsoft, extends WSDL to provide
a mechanism for process definition and global flow
coordination.
The extension elements describe the behavioral aspects
of a service. A behavior may span multiple operations.
Action is an atomic component of a behavior definition.
An action element can be an operation, a delay element,
or a raise element. A delay element can be of type delayFor
or delayUntil; delayFor and delayUntil introduce delays
in execution for a process to wait for something to happen
(for example, a timeout) and to wait until an absolute
date-time has been reached, respectively. Raise
elements are used to specify exception handling. Excep-
tions are handled by invoking the corresponding handler
registered with a raise definition. Finally, processes com-
bine actions in different ways: some of them are sequence,
switch, while, all, pick, and empty.
Inter-Web Service Modeling and Interaction
Web services must negotiate and agree on a protocol in
order to engage in a business transaction on the Web.
X-EDI, ebXML, BTP, TPA-ML, cXML, and CBL have been
proposed as inter-Web service interaction protocols. We
focus on ebXML as it is by far the most successful one.
(See Figure 2.)
In ebXML, parties that engage in a transaction have
Collaboration Protocol Profiles (CPPs) that they register
at ebXML registries. A CPP contains the following:
Process Specification Layer—Details the business transac-
tions that form the collaboration. It also specifies the
order of business transactions.
Delivery Channels—Describes a party’s message receiving
and sending characteristics. A specification can con-
tain more than one delivery channel.
Figure 2: Intra- and inter-Web service modeling and interaction.
Document Exchange Layer—Deals with processing of the
business documents like digital signatures, encryption,
and reliable delivery.
Transport Layer—Identifies the transport protocols to be
used with the endpoint addresses, along with other
properties of the transport layer. The transport proto-
cols could be SMTP, HTTP, and FTP.

When a party discovers another party's CPP, the two
negotiate an agreement and form a Collaboration Protocol
Agreement (CPA). The intent of the CPA is not to
expose the business process internals of the parties but
to make visible only the processes that are involved in
interactions between the parties. Message exchange be-
tween the parties can be facilitated with the ebXML Mes-
saging Service (ebMS). A CPA and the business process
specification document it references define a conversation
between parties. A typical conversation consists of mul-
tiple business transactions which in turn may involve a
sequence of message exchanges for requests and replies.
Although a CPA may refer to multiple business process
specification documents, any conversation is allowed to
involve only a single process specification document. Con-
ceptually, the B2B servers of parties involved are respon-
sible for managing CPAs and for keeping track of the
conversations. They also interface the operations defined
in a CPA with the corresponding internal business pro-
cesses.
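The CPP-to-CPA step can be caricatured as an intersection of two profiles: the agreement keeps only what both parties support. The field names and capability sets below are invented, and a real ebXML negotiation is far richer than set intersection:

```python
# A toy sketch of forming a CPA from two CPPs by intersecting the
# capabilities each party declares. Field names are illustrative;
# real CPPs cover process specifications, delivery channels,
# document exchange, and transport layers in much more detail.
cpp_a = {"transports": {"HTTP", "SMTP"},
         "processes":  {"PurchaseOrder", "Invoice"}}
cpp_b = {"transports": {"HTTP", "FTP"},
         "processes":  {"PurchaseOrder"}}

def negotiate_cpa(a, b):
    """Keep only the capabilities both profiles share."""
    return {key: a[key] & b[key] for key in a}

cpa = negotiate_cpa(cpp_a, cpp_b)
print(cpa)   # {'transports': {'HTTP'}, 'processes': {'PurchaseOrder'}}
```

This also illustrates the stated intent of a CPA: the agreement exposes only the interaction surface (here, HTTP transport and the PurchaseOrder process), not either party's internal business processes.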
Web Services Platforms
Web services platforms are the technologies, means, and
methods available to build and operate Web services. Plat-
forms have been developed and changed over the course
of time. A classification into four generations of platform
technology should help to structure the space:
First Generation: HTML and CGI—Characterized by Web
servers, static HTML pages, HTML FORMS for simple
dialogs, and the Common Gateway Interface (CGI) to
connect Web servers to application programs, mostly
Perl or Shell scripts. (See Figure 3.)
Second Generation: Java—Server-side dynamic genera-
tion of HTML pages and user session support; the Java
servlet interface became popular for connecting to ap-
plication programs.
Third Generation: Application servers—Richer development
and run-time environments; J2EE as the foundation
for application servers that later evolved toward the
fourth generation.
Figure 3: ebXML service-to-service interaction (two services with CPPs form a CPA via an ebXML registry).
Figure 4: Basic four-tier architecture for Web services (Internet; front end with firewall and load balancer; Web-server tier; application-server tier; back-end database tier).
Fourth Generation: Web services—Characterized by the
introduction of XML and WSDL interfaces for Web
services with SOAP-based messaging. A global service
infrastructure for service registration and discovery
emerged: UDDI. Dynamic Web services aggregation—
Characterized by flow systems, business negotiations,
agent technology, etc.
Technically, Web services have been built according to a
pattern of an n-tier architecture that consists of a front-
end tier, firewall (FW), load balancer (LB), a Web-server
tier (WS), an application (server) (AS) tier, and a back-
end tier for persistent data, or the database tier (DB). (See
Figure 4.)
First Generation: HTML and CGI
The emergence of the World Wide Web facilitated the
easy access and decent appearance of linked HTML mark-
up pages in a user’s browser. In the early days, it was
mostly static HTML content. Only passive information
services, providing users with nothing more than the ability
to navigate through static pages, could be built. However,
from the very beginning HTML supported FORMS that allowed
users to enter text or select from multiple-choice menus.
FORMS were treated specially by Web servers. They were
passed on to CGI, behind which small applications, mostly
Perl or Shell scripts, could read the user's input, perform
the respective actions, and return an HTML page that could
then be displayed in the user's browser. This primitive
mechanism enabled a first generation of services on the
Web beyond pure navigation through static content.
Second Generation: Java
With the growth of the Web and the desire for richer ser-
vices such as online shopping and booking, the initial
means to build Web services quickly became too primi-
tive. Java applets brought graphical interactivity to
the browser side, and Java appeared as the language of choice
for Web services. Servlets provided a better interface be-
tween the Web server and the application. Technology to
support dynamic generation of HTML pages at the server
side was introduced: JSP (Java Server Pages) by Sun Mi-
crosystems, ASP (Active Server Pages) by Microsoft, or
PHP pages in the Linux world enabled separation of pre-
sentation, the appearance of pages in browsers, from con-
tent data. Templates and content were then merged on
the fly at the server in order to generate the final page re-
turned to the browser. Since user identification was crit-
ical for business services, user log-in and user sessions
were introduced. Applications were becoming more com-
plex, and it turned out that there was a significant overlap
in common functions needed for many services such as
session support, connectivity to persistent databases, and
security functions.
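The separation of presentation from content described above, with templates and data merged on the fly at the server, can be sketched with a simple string template; the page layout and content values are invented:

```python
# A sketch of server-side template merging in the spirit of
# JSP/ASP/PHP: the template carries the presentation, the content
# dictionary carries the data, and they are merged per request.
from string import Template

page = Template("<html><body><h1>$title</h1><p>$items</p></body></html>")

content = {"title": "Your Cart", "items": "2 books, 1 pen"}
html = page.substitute(content)
print(html)
```

The same template can be merged with per-user content on every request, which is what made user sessions and dynamic pages practical at this stage.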
Figure 5: The J2EE platform.
Third Generation: Application Server

The observation that many functions were shared and
common among Web services drove the development
toward richer development environments based on the
Java language and Java libraries. A cornerstone of these
environments became J2EE (Java 2 Platform, Enterprise
Edition), which is a Java platform designed for enterprise-
scale computing. Sun Microsystems (together with in-
dustry partners such as IBM) designed J2EE (Figure 5)
to simplify application development for Web services by
decreasing the need for programming through reusable
modular components and by providing standard func-
tions such as session support and database connecti-
vity.
J2EE primarily manifests in a set of libraries used by
application programs performing the various functions.
Web service developers still had to assemble all the pieces,
link them together, connect them to the Web server, and
manage the various configurations. This led to the emer-
gence of software packages that could be deployed more
easily on a variety of machines. These packages later be-
came application servers. They significantly reduced the
amount of configuration work during service deployment
such that service developers could spend more time on
business logic and the actual function of the service. Most
application servers are based on J2EE technology. Exam-
ples are IBM’s WebSphere suite, BEA’s WebLogic environ-
ment, the Sun ONE Application Framework, and Oracle’s
9i application server. (See Figure 5.)
Fourth Generation: Web Services
Prior generations of Web services mostly focused on
end-users: people accessing services from Web browsers. How-
ever, accessing services from services other than browsers
turned out to be difficult. This circumstance prevented
Web service aggregation for a long time.
Web service aggregation meant that users would only have
to contact one Web service, and this service then would
resolve the user’s requests with further requests to other
Web services.
HTML is a language defined for rendering and presenting
content in Web browsers. It does not per se allow
separating content from presentation information. With
the advent of XML, it became the language of choice for
Web service interfaces, which could be accessed not
only by users through Web browsers but also
by other services. XML is now pervasively being used
in Web services messaging (mainly using SOAP) and for
Web service interface descriptions (WSDL). In regard to
platforms, XML enhancements were added to J2EE and
application servers. The introduction of XML is the major
differentiator between Web services platforms of the third
and the fourth generation in this classification.
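A SOAP-style XML request of the kind exchanged by fourth-generation services might be assembled as follows; the GetQuote operation and its namespace are invented, and a real SOAP stack adds headers, typing, and transport handling:

```python
# Assemble a minimal SOAP 1.1-style envelope with the standard
# library. The operation name and body namespace are illustrative
# assumptions, not taken from any real service.
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
ET.register_namespace("soap", SOAP_NS)

envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")

# The body carries the RPC-style call: operation name as the element,
# parameters as child elements.
call = ET.SubElement(body, "GetQuote", {"xmlns": "urn:example:quotes"})
ET.SubElement(call, "symbol").text = "HPQ"

message = ET.tostring(envelope, encoding="unicode")
print(message)
```

In practice this envelope would be POSTed over HTTP to the access point named in the service's WSDL binding, which is what makes the interface machine-consumable rather than browser-only.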
A major step toward the service-to-service integration
was the introduction of the UDDI service (see the above
section Web Services Discovery).
Three major platforms for further Web services in-
teraction and integration are: Sun Microsystems’ Sun
ONE (Open Net Environment), IBM WebSphere, and Mi-
crosoft’s .NET.
Sun ONE—Sun’s standards-based software architecture
and platform for building and deploying services on
demand. Sun ONE's architecture is built around existing
business assets: data, applications, reports, and
transactions, referred to as the DART model. Major
standards are supported: XML, SOAP, J2EE, UDDI,
LDAP, and ebXML. The architecture is composed of
several product lines: the iPlanet Application Frame-
work (JATO), Sun’s J2EE application framework for
enterprise Web services development, application ser-
ver, portal server, integration server, directory server,
e-commerce components, the Solaris Operating Envi-
ronment, and development tools.
IBM WebSphere—IBM’s platform to build, deploy, and
integrate your e-business, including components such
as foundation and tools, reach and user experience,
business integration, and transaction servers and
tools.
Microsoft .NET—Microsoft’s .NET platform for provid-
ing lead technology for future distributed applications
inherently seen as Web services. With Microsoft .NET,
Web services’ application code is built in discrete units,
XML Web services, which handle a specified set of
tasks. Because standard interfaces based on XML sim-
plify communication among software, XML Web ser-
vices can be linked together into highly specific applica-
tions and experiences. The vision is that the best XML
Web services from any provider around the globe can
be used to create a needed solution quickly and easily.

Microsoft will provide a core set of XML Web services,
called Microsoft .NET My Services, to provide func-
tions such as user identification and calendar access.
Security and Web Services
Because Web services are public in nature, security is vital
for them. Security attacks can be classified as threats of infor-
mation disclosure, unauthorized alteration of data, de-
nial of use, misuse or abuse of services, and, more rarely
considered, repudiation of access. Since Web services
link networks together with businesses, further attacks,
such as masquerading, stealing or duplicating identity
and conducting business under false identity, or accessing
or transferring funds from or to unauthorized accounts,
need to be considered.
Security is vital for establishing the legal basis for
businesses done over the Web. Identification and authen-
tication of business partners are the basic security re-
quirements. Others include integrity and authenticity of
electronic documents. Electronic contracts must have the
same binding legal status as conventional contracts. Re-
fusal and repudiation of electronic contracts must be
provable in order to be legally valid. Finally, payment and
transferring funds between accounts must be safe and se-
cure.
Security architectures in networks are typically com-
posed of several layers:
Secure data communication—IPsec (Internet Protocol
Security), SSL (Secure Socket Layer), TLS (Transport
Layer Security);
Secured networks—VPNs (Virtual Private Networks);

Authenticity of electronic documents and issuing
individuals—digital signatures;
Secure and authenticated access—digital certificates;
Secure authentication and certification—PKI (Public Key
Infrastructure); and
Single sign-on and digital passports.
Single Sign-On and Digital Passports
Digital passports emerged from the desire to provide an
individual's identity information from a trusted and secure
centralized place rather than repeatedly establishing this
information with each collaborating partner and main-
taining separate access credentials for each pair of collab-
orations. Individuals only need one such credential, the
passport, in order to provide collaborating partners with
certain parts of an individual’s identity information. This
consolidates the need for maintaining separate identities
with different partners into a single identification mech-
anism. Digital passports provide an authenticated access
to a centralized place where individuals have registered
their identity information such as phone numbers, social
security numbers, addresses, credit records, and payment
information. Participating individuals, both people and
businesses, will access the same authenticated information,
assuming trust in the authority providing the passport
service. Two initiatives have emerged: Microsoft's
.NET Passport and the Liberty Alliance Project, initiated
by Sun Microsystems.
Microsoft .NET Passport (Microsoft .NET, 2002) is a
single sign-on mechanism for users on the Internet. In-
stead of creating separate accounts and passwords with
every e-commerce site, users only need to authenticate
with a single Passport server. Then, through a series of
authentications and encrypted cookie certificates, the user
is able to purchase items at any participating e-commerce
site without verifying the user’s identity again. .NET Pass-
port is an online service that enables use of an e-mail ad-
dress and a single (Passport server) password to securely
sign in to any .NET Passport participating Web site or
service. It allows users to easily move among participat-
ing sites without the need to verify their identity again.
The Microsoft .NET Passport had initially been planned
for signing into Microsoft's own services. Expanding it
toward broader use on the Web has been viewed critically.
This concern gave rise to the Liberty Alliance Project
initiative, which is now widely supported by industry and
the public.
The Liberty Alliance Project (Liberty Alliance Project,
2002) is an organization being formed to create an open,
federated, single sign-on identity solution for the digi-
tal economy via any device connected to the Internet.
Membership is open to all commercial and noncommer-
cial organizations. The Alliance has three main objec-
tives:
1. To enable consumers and businesses to maintain per-
sonal information securely.
2. To provide a universal, open standard for single sign-on
with decentralized authentication and open authoriza-
tion from multiple providers.
3. To provide an open standard for network identity span-
ning all network-connected devices.

With the emergence of Web services, specific security
technology is also emerging. The two major classes are
Java-based security technology and XML-based security
technology.
Both classes basically provide mappings of security
technologies, such as authentication and authorization,
encryption, and signatures, into respective environments.
Java-Based Security Technology for Web Services
Java-based security technology is primarily available
through the Java 2 SDK and J2EE environments in the
form of sets of libraries:
Encryption—JSSE (Java Secure Socket Extension); the
JCE (Java Cryptography Extension) provides a frame-
work and implementations for encryption, key gener-
ation and key agreement, and Message Authentication
Code (MAC) algorithms. Support for encryption in-
cludes symmetric, asymmetric, block, and stream ci-
phers. The software also supports secure streams and
sealed objects.
Secure messaging—Java GSS-API is used for securely
exchanging messages between communicating appli-
cations. The Java GSS-API contains the Java bindings
for the Generic Security Services Application Program
Interface (GSS-API) defined in RFC 2853. GSS-API
offers application programmers uniform access to
security services atop a variety of underlying security
mechanisms, including Kerberos.

Authentication and Authorization—JAAS (Java Authenti-
cation and Authorization Service) for authentication
of users, to reliably and securely determine who is cur-
rently executing Java code, and for authorization of
users to ensure they have the access rights (permis-
sions) required to do security-sensitive operations.
Certification—Java Certification Path API.
X.509 Certificates and Certificate Revocation Lists (CRLs)
and Security Managers.
These libraries are available for use when Web services
are built using Java. They are usually used when building
individual Web services with application servers.
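The Message Authentication Code (MAC) algorithms mentioned above can be illustrated with a standard-library HMAC, here in Python rather than via the JCE; the key and message are invented:

```python
# Sketch of a MAC exchange: the sender computes an HMAC over the
# message with a shared secret key, and the receiver recomputes it
# to verify integrity and authenticity. Key and message are invented.
import hashlib
import hmac

key = b"shared-secret-key"          # established out of band
message = b"<order id='42'>2 books</order>"

tag = hmac.new(key, message, hashlib.sha256).hexdigest()

# The receiver recomputes the tag and compares in constant time.
expected = hmac.new(key, message, hashlib.sha256).hexdigest()
print(hmac.compare_digest(tag, expected))   # True
```

A MAC binds the message to the shared key, so any alteration of the message in transit makes the recomputed tag differ; unlike a digital signature, however, it cannot prove to a third party which holder of the key produced it.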
For Web services interaction, XML technology eliminates
the tight binding to Java. Consequently, a similar
set of XML-based security technologies enabling cross-
service interactions is emerging.
XML-Based Security Technology for Web Services
The Organization for the Advancement of Structured
Information Standards (OASIS) merges security into
Web services at a higher level than the common Inter-
net security mechanisms and practices described above.
Proposals are primarily directed toward providing XML
specifications for documents and protocols suitable for
cross-organizational Web services interactions. XML-
based security technology can be classified into the fol-
lowing:
XML Document-Level Security—encryption and digitally
signing XML documents;
Protocol-Level Security for XML Document Exchanges—
exchanging XML documents for authentication and
authorization of peers; and
XML-Based Security Frameworks—infrastructures for
establishing secure relationships among parties.
XML Document-Level Security: Encryption and
Signature.
The (preliminary) XML encryption specifi-
cation (Reagle, 2000) details requirements on how to
digitally encrypt a Web resource in general, and an XML
document in particular. XML encryption can be applied
to a part of or a complete XML document. The granularity
of encryption can be reduced to an element, attributes,
or text content. Encryption can be recursive. The specifi-
cation does not address confidence or trust relationships
and key establishment. The specification addresses both
key-encrypting-keys and data keys. The specification will
not address the expression of access control policies asso-
ciated with portions of the XML document. This will be
addressed by XACML.
XML signature defines the XML schema and process-
ing rules for creating and representing digital signatures
in any digital content (data object), including XML. An
XML signature may be applied to the content of one
or more documents. Enveloped or enveloping signatures
are over data within the same XML document as the
signature; detached signatures are over data external to
the signature element. More specifically, this specification
defines an XML signature element type and an XML sig-
nature application; conformance requirements for each
are specified by way of schema definitions and prose re-
spectively. This specification also includes other useful

types that identify methods for referencing collections of
resources, algorithms, and keying and management infor-
mation.
The XML Signature (Bartel, Boyer, Fox, LaMacchia,
& Simon, 2002) is a method of associating a key with
referenced data (octets); it does not normatively specify
how keys are associated with persons or institutions, nor
the meaning of the data being referenced and signed.
Consequently, while this specification is an important
component of secure XML applications, it itself is not suf-
ficient to address all application security/trust concerns,
particularly with respect to using signed XML (or other
data formats) as a basis of human-to-human communi-
cation and agreement. Such an application must specify
additional key, algorithm, processing, and rendering re-
quirements. The SOAP Digital Signature Extensions define
specifically how SOAP messages can be digitally signed.
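The detached-signature idea can be sketched in a few lines: a digest of the referenced content is computed, and a signature value is computed over that digest. An HMAC stands in here for the public-key algorithms (RSA, DSA) the specification actually defines, and the field names only loosely mirror the XML Signature structure:

```python
import hashlib
import hmac

def sign_detached(content: bytes, key: bytes) -> dict:
    # A Signature-like structure: a digest of the referenced data plus a
    # signature computed over that digest (HMAC stands in for RSA/DSA).
    digest = hashlib.sha256(content).hexdigest()
    sig = hmac.new(key, digest.encode(), hashlib.sha256).hexdigest()
    return {"DigestValue": digest, "SignatureValue": sig}

def verify_detached(content: bytes, key: bytes, signature: dict) -> bool:
    # Recompute the digest of the (external) data and check both values.
    digest = hashlib.sha256(content).hexdigest()
    expected = hmac.new(key, digest.encode(), hashlib.sha256).hexdigest()
    return (digest == signature["DigestValue"]
            and hmac.compare_digest(expected, signature["SignatureValue"]))

doc = b"<po>100 staplers</po>"
sig = sign_detached(doc, key=b"shared-secret")
print(verify_detached(doc, b"shared-secret", sig))                       # True
print(verify_detached(b"<po>999 staplers</po>", b"shared-secret", sig))  # False
```

Because the signature is detached, the signed document itself is unchanged; tampering with it invalidates the stored digest.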
Protocol-Level Security for XML Document
Exchanges.
Protocol-level security defines document
exchanges with the purpose of establishing secure rela-
tionships among parties, typically providing well-defined
interfaces and XML bindings to an existing public key in-
frastructure. Protocol-level security can be built upon the
document-level security.
The XML Key Management Specification (Ford et al.,
2001) defines protocols for validating and registering pub-
lic keys, suitable for use in conjunction with the pro-
posed standard for XML signature developed by the World

Wide Web Consortium (W3C) and the Internet Engineer-
ing Task Force (IETF) and an anticipated companion stan-
dard for XML encryption. The XML Key Management
Specification (XKMS) comprises two parts: the XML Key
Information Service Specification (X-KISS) and the XML
Key Registration Service Specification (X-KRSS).
The X-KISS specification defines a protocol for a trust
service that resolves public key information contained in
XML-SIG document elements. The X-KISS protocol al-
lows a client of such a service to delegate part or all of the
tasks required to process <ds:KeyInfo> elements embed-
ded in a document. A key objective of the protocol design
is to minimize the complexity of application implemen-
tations by allowing them to become clients and thereby be
shielded from the complexity and syntax of the underlying
Public Key Infrastructure (OASIS PKI Member Section,
2002), which is used to establish trust relationships based
on specifications such as X.509/PKIX or SPKI (Simple
Public Key Infrastructure, 1999).
The X-KRSS specification defines a protocol for a Web
service that accepts registration of public key information.
Once registered, the public key may be used in conjunction
with other Web services, including X-KISS.
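The division of labor between the two parts can be sketched as a toy trust service: registration (X-KRSS-style) stores a key binding, and location (X-KISS-style) resolves it on behalf of a client, which is thereby shielded from the underlying PKI. The class, method names, and dict-based "protocol" are invented for illustration and do not follow the XKMS message schemas:

```python
class ToyKeyService:
    """Stands in for an XKMS trust service: register (X-KRSS) and
    locate (X-KISS) public key bindings on behalf of clients."""

    def __init__(self):
        self._bindings = {}

    def register(self, subject: str, public_key: str) -> dict:
        # X-KRSS-style registration of public key information.
        self._bindings[subject] = public_key
        return {"Result": "Success", "Subject": subject}

    def locate(self, subject: str) -> dict:
        # X-KISS-style resolution of key information for a subject,
        # delegated by the client instead of parsing KeyInfo itself.
        key = self._bindings.get(subject)
        if key is None:
            return {"Result": "NoMatch"}
        return {"Result": "Success", "KeyValue": key}

svc = ToyKeyService()
svc.register("mailto:alice@example.com", "MFwwDQ-example-key-blob")
print(svc.locate("mailto:alice@example.com")["Result"])  # Success
print(svc.locate("mailto:bob@example.com")["Result"])    # NoMatch
```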
XML-Based Security Frameworks. XML-based se-
curity frameworks go one step further than the above.
P1: JDW
Sahai WL040/Bidgolio-Vol I WL040-Sample.cls July 16, 2003 18:35 Char Count= 0
WEB SERVICES 762
The Security Assertion Markup Language (SAML), de-
veloped under the guidance of OASIS (OASIS, 2002), is an

XML-based framework for exchanging security informa-
tion with established, SAML-compliant security services.
This security information is expressed in the form of as-
sertions about subjects, where a subject is an entity (either
human or program) that has an identity in some security
domain. A typical example of a subject is a person, iden-
tified by his or her e-mail address in a particular Internet
DNS domain.
Assertions can convey information about authentica-
tion acts performed by subjects, attributes of subjects,
and authorization decisions about whether subjects are
allowed to access certain resources. Assertions are repre-
sented as XML constructs and have a nested structure,
whereby a single assertion might contain several differ-
ent internal statements about authentication, authoriza-
tion, and attributes. Assertions containing authentication
statements merely describe acts of authentication that
happened previously.
Assertions are issued by SAML authorities, namely, au-
thentication authorities, attribute authorities, and policy
decision points. SAML defines a protocol by which rely-
ing parties can request assertions from SAML authori-
ties and get a response from them. This protocol, consist-
ing of XML-based request-and-response message formats,
can be bound to many different underlying communica-
tions and transport protocols. Currently it defines only
one binding, namely SOAP over HTTP.
SAML authorities can use various sources of informa-
tion, such as external policy stores and assertions that
were received as input in requests, in creating their re-

sponses. Thus, while clients always consume assertions,
SAML authorities can be both producers and consumers
of assertions.
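The nested assertion structure can be sketched as plain data: an authority issues an assertion about a subject containing one or more statements (authentication, attribute, authorization decision), and a relying party consumes it. All names and values below are illustrative; real SAML defines XML schemas and, currently, a SOAP-over-HTTP binding:

```python
from datetime import datetime, timezone

def issue_assertion(issuer: str, subject: str, statements: list) -> dict:
    # A single assertion may nest several different statements.
    return {
        "Issuer": issuer,
        "IssueInstant": datetime.now(timezone.utc).isoformat(),
        "Subject": subject,
        "Statements": statements,
    }

assertion = issue_assertion(
    issuer="https://idp.example.org",          # a SAML authority (made up)
    subject="alice@example.com",               # identity in a DNS domain
    statements=[
        {"type": "Authentication", "method": "password"},
        {"type": "Attribute", "name": "role", "value": "purchaser"},
        {"type": "AuthorizationDecision", "resource": "/orders",
         "decision": "Permit"},
    ],
)

# A relying party consumes the assertion, e.g. to make an access decision.
permit = any(s.get("decision") == "Permit" for s in assertion["Statements"])
print(permit)  # True
```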
Payment Systems for Web Services
Effective payment systems are a prerequisite for business
with Web services. This section introduces and classifies
different approaches for payment systems that have been
developed over the past years. However, payments on
the Internet are mostly conducted through the existing
payment infrastructure that was developed before the In-
ternet became pervasive. End-consumer retail business on
the Internet primarily relies on credit card transactions.
Other traditional payment methods are offered as well:
personal checks, money orders, or invoice billing. In the
business-to-business segment, traditional invoice billing
is still the major payment method. An overview is given
in (Weber, 1998). W3C has adopted payment standards
(Micropayment Overview, 2002).
Payments by Credit Cards
The reason why credit card payments are well accepted is
that credit card providers act as intermediaries between
payers and recipients of payments (payees). They also
guarantee payments up to a limit (important to the payee),
and they carry the risk of misuse. All parties must register
accounts before transfers can be conducted. Another
important service is the verification of the creditworthiness
of a person or a business before an account is opened.
SET—The Secure Electronic Transaction Standard
SET (Secure Electronic Transaction, 2002) is an open
technical standard for the commerce industry initially

developed by two major credit card providers, Visa and
MasterCard, as a way to facilitate secure payment card
transactions over the Internet. Digital certificates (Digital
Certificates, 1988) create a trust chain throughout the
transaction, verifying cardholders’ and merchants’
identities. SET is a system for ensuring the security of
financial transactions involving credit card providers or
bank accounts. Its main objective is to provide a higher security
standard for credit card payments on the Internet. A ma-
jor enhancement compared to traditional credit card pay-
ments is that neither credit card credentials nor payers’
identity are revealed to merchants. With SET, a user is
given an electronic wallet (digital certificate). A transac-
tion is conducted and verified using a combination of digi-
tal certificates and digital signatures among the purchaser,
a merchant, and the purchaser’s bank in a way that en-
sures privacy and confidentiality.
Not all payments required by Web services can be con-
ducted through credit card transactions. First, credit card
transactions are typically directed from an end-customer,
a person, to a business that can receive such payments.
Second, the amounts transferred through a credit card
transaction are limited to a range from roughly $0.10 up
to several thousand dollars, depending on an individual’s
credit limit. Micropayments below $0.10, as well as
macropayments above $10,000, are typically not supported.
The lower payment bound is also caused by the
cost-per-transaction model credit card providers use. Third,
payments among persons, as for instance required for
auctions among people or for buying and selling used

goods, cannot be conducted through credit card accounts.
Traditional payment methods are used here: personal
checks, money orders, or cash settlement. Fourth, only
individuals with registered accounts can participate in
credit card payments. Individuals that do not qualify are
excluded. This restriction is also a major barrier for Web
service business in developing countries.
Micropayments
Micropayments primarily serve “pay-per-use” models
in which usage is metered and immediately charged to
customers in very small amounts. Transaction costs for
micropayment systems need to be significantly lower, and
the number of transactions may be significantly higher,
than those of credit card payments; in return, accurate,
fine-grained charging is enabled. These are the two major
differentiators of micropayment systems. W3C proposes the
Common Markup for Micropayment “per-fee-links.”
Micropayments involve a buyer or customer, a vendor
or merchant, and potentially one or more additional
parties that keep accounts in order to aggregate
micropayments into a final charge. These mediators are
called brokers (in Millicent), billing servers (in IBM
MicroPayments), or intermediaries (in France Telecom
Micropayments), to name a few.
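The aggregation role these mediators play can be sketched directly: tiny per-use charges accumulate per customer and are settled as one conventional transaction once they cross a threshold, amortizing the fixed per-transaction fee. The threshold and names are invented; no particular broker protocol is modeled:

```python
class MicropaymentBroker:
    def __init__(self, settle_at_cents: int = 1000):
        self.settle_at = settle_at_cents
        self.balances: dict[str, int] = {}          # pending micro-charges
        self.settlements: list[tuple[str, int]] = []  # conventional payments

    def charge(self, customer: str, cents: int) -> None:
        # Accumulate small charges; settle via one conventional transaction
        # (e.g. a credit card payment) once the threshold is reached.
        self.balances[customer] = self.balances.get(customer, 0) + cents
        if self.balances[customer] >= self.settle_at:
            self.settlements.append((customer, self.balances.pop(customer)))

broker = MicropaymentBroker(settle_at_cents=1000)
for _ in range(400):          # 400 page views at 3 cents each
    broker.charge("alice", 3)
print(broker.settlements)     # one settlement of 1002 cents
print(broker.balances)        # 198 cents still pending
```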
Millicent. One micropayment system is Millicent
(Glassman, 2000). The MilliCent Microcommerce
Network provides new pay-per-click/earn-per-click func-
tionality for Internet users. It allows buying and selling
digital products costing from 1/10th of a cent to up to
$10.00 or more. MilliCent can be used by Web services to
build any number of parallel revenue streams through the
simultaneous use of pay-per-click purchases, subscrip-
tions, and advertising. It can also be used to make direct
monetary payments to users. MilliCent is optimized for
buying and selling digital products or services over the
Internet such as articles, newsletters, real-time data,
streaming audio, electronic postage, video streams, maps,
financial data, multimedia objects, interactive games,
software, and hyperlinks to other sites.
NetBill. NetBill is a Carnegie Mellon University Inter-
net billing server project, which is used as a payment
method for buying information goods and services via the
Internet. It aims at secure payment for and delivery of
information goods, e.g., library services, journal articles,
and CPU cycles. The NetBill system charges for transac-
tions and requires customers to have a prepaid NetBill ac-
count from which all payments are deducted. The NetBill
payment system uses both symmetric key and public key
cryptography. It relies on Kerberos for authentication. An
account server, called NetBill server, maintains accounts
for both customers and merchants. NetBill acts as an ag-
gregator to combine many small transactions into larger
conventional transactions, thus amortizing conventional
overhead fees. Customers and merchants have to trust the
NetBill server.
Digital Money and Digital Coins

In contrast to account-based payment systems, such as
credit card-based systems, where amounts are trans-
ferred between accounts inside or between credit card
or bank providers, digital money represents a value
amount flowing from a payer to a payee across the
network. Establishing accounts with providers before ser-
vices can actually be used is unnecessary. Advantages
are the same as for cash money: no mutual accounts
need to be established before a payment can be con-
ducted, and no mutual authentication is needed, which
improves convenience for both parties. In addition, as with
cash money, the payer does not need to reveal any
identity credentials to the payee or someone else. Pay-
ments are anonymous and nontraceable. A major hur-
dle for this approach is the prevention of duplication
and forging of digital money since no physical security
marks such as watermarks can be applied to digitized
bit strings.
The basic idea behind digital money is that a con-
sumer purchases “digital coins” from an issuer using a
regular payment method such as a credit card. The issuer
generates an account for that customer and deposits the
amount into it. It then hands out a set of digital coins to
the customer that he or she can use for payments. For a
payment, the customer transfers coins to the merchant or
service provider. The provider then transfers coins to the
issuer and deposits them into his account. The merchant,
however, may also use these coins to pay its suppliers.
Digital coins will thus flow among participants much as
cash flows among people.

The following requirements need to be met by digital
money systems:
digital money must be protected from duplication or forg-
ing; and
digital money should neither contain nor reveal identity
credentials of any involved party in order to be anony-
mous.
The first requirement is achieved by not actually repre-
senting an amount by a digital coin, but rather a reference
to an allocated amount in the possessor’s account with
the issuer. When digital coins are copied, the reference
is copied, not the amount itself. However, the first indi-
vidual redeeming a coin with the issuer will receive the
amount. Identity at redemption cannot be verified since
digital coins do not carry identifying credentials of the
possessor. The only term the issuer can verify is whether
or not a coin has already been redeemed. By thus, theft of
digital money is possible, and parties have an interest in
keeping their coins protected.
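The first requirement can be sketched directly: a coin is only an unguessable reference into the issuer's account system, and the issuer pays out each reference exactly once, so copying a coin copies the reference, not the value. Class and field names are made up for illustration:

```python
import secrets

class CoinIssuer:
    def __init__(self):
        self.allocated: dict[str, int] = {}  # coin reference -> cents
        self.redeemed: set[str] = set()

    def issue(self, cents: int) -> str:
        # The "coin" is an unguessable reference to an allocated amount.
        ref = secrets.token_hex(16)
        self.allocated[ref] = cents
        return ref

    def redeem(self, ref: str) -> int:
        # First redeemer wins; duplicates and forgeries pay nothing.
        if ref not in self.allocated or ref in self.redeemed:
            return 0
        self.redeemed.add(ref)
        return self.allocated[ref]

issuer = CoinIssuer()
coin = issuer.issue(50)
copy_of_coin = coin                 # duplication copies the reference only
print(issuer.redeem(coin))          # 50
print(issuer.redeem(copy_of_coin))  # 0: already redeemed
```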
Achieving complete anonymity between an issuer and
subsequent receivers of digital money is a key characteristic
of such systems. It is achieved by blinded signatures
(Chaum, 1985), which uniquely bind coins to allocated
amounts within the issuer’s account system without
revealing any identifying information about the holder of
that account.
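Chaum's blinded-signature construction can be sketched with textbook RSA: the customer blinds a coin's serial number with a random factor, the issuer signs without ever seeing the serial, and unblinding yields a valid signature that the issuer cannot later link to the account holder. The parameters below are tiny and insecure, purely for illustration:

```python
import secrets
from math import gcd

# Toy issuer RSA key (real systems use moduli of 2048 bits or more).
p, q = 1000003, 1000033
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))

def blind(serial: int) -> tuple[int, int]:
    # Multiply by r^e so the issuer sees only a random-looking value.
    while True:
        r = secrets.randbelow(n - 2) + 2
        if gcd(r, n) == 1:
            return (serial * pow(r, e, n)) % n, r

def issuer_sign(blinded: int) -> int:
    return pow(blinded, d, n)       # issuer never learns the serial

def unblind(signed: int, r: int) -> int:
    # (serial * r^e)^d = serial^d * r, so dividing by r recovers serial^d.
    return (signed * pow(r, -1, n)) % n

def verify(serial: int, signature: int) -> bool:
    return pow(signature, e, n) == serial % n

serial = secrets.randbelow(n)       # the coin's serial number
blinded, r = blind(serial)
signature = unblind(issuer_sign(blinded), r)
print(verify(serial, signature))    # True
```

The issuer's records show only the blinded value, so redemption of the unblinded coin cannot be correlated with the withdrawal.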
E-cash. E-cash (CryptoLogic Ecash FAQ, 2002) stands for
“electronic cash,” a system developed by DigiCash that
underwent field tests in the late 1990s. E-cash is a legal
form of computer-generated currency. This currency can

be securely purchased with conventional means: credit
cards, checks, money orders, or wire transfers.
MicroMint. MicroMint is a proposal by Rivest and
Shamir for coins that can be produced efficiently only in
very large quantities and are hard to produce in small
quantities. The validity of a coin is easily checked.
MicroMint is optimized for unrelated low-value payments.
It uses no public key operations. However, the scheme is
very complex and would require considerable initial and
operational effort; it is therefore unlikely that it will
ever gain much practical importance.
A broker will issue new coins at the beginning of a
period and will revoke those of the prior period. Coins
consist of multiple hash collisions, i.e., different values
that all hash to the same value. The broker mints coins by
computing such hash collisions. For that process many
computations are required, but more and more hash col-
lisions are detected with continued computation. The bro-
ker sells these MicroMint coins in batches to customers.
Unused coins can be returned to the broker at the end of a
period, e.g., a month. Customers render MicroMint coins
as payment to merchants.
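The minting idea can be sketched as follows: a coin is a pair of distinct strings whose truncated hashes collide. Finding the first collisions is expensive, but collisions accumulate faster as computation continues, while checking a coin costs just two hash evaluations. The 20-bit truncation and 2-way collisions below are scaled far down for illustration; MicroMint proposes k-way collisions at sizes that require massive minting effort:

```python
import hashlib
from collections import defaultdict

BITS = 20  # truncated hash width; tiny so this demo mints coins quickly

def h(x: str) -> int:
    digest = hashlib.sha256(x.encode()).digest()
    return int.from_bytes(digest, "big") >> (256 - BITS)

def mint(n_coins: int) -> list[tuple[str, str]]:
    # The broker hashes candidate strings; each 2-way collision is a coin.
    seen: dict[int, str] = {}
    coins, i = [], 0
    while len(coins) < n_coins:
        x = str(i)
        v = h(x)
        if v in seen:
            coins.append((seen[v], x))
        else:
            seen[v] = x
        i += 1
    return coins

def valid(coin: tuple[str, str]) -> bool:
    x, y = coin
    return x != y and h(x) == h(y)  # two hash evaluations to check

coins = mint(3)
print(all(valid(c) for c in coins))  # True
```

The asymmetry between minting (many hashes) and verification (two hashes) is what makes small-scale forgery uneconomical.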
THE FUTURE OF WEB SERVICES
In future we will see the unleashing of a Web services
phenomenon. This will involve the fulfillment of dynamic
Web service composition and orchestration vision, the ap-
pearance of personalized Web services, concepts of Web
service management, and the development of Web service
infrastructure as a reusable, reconfigurable, self-healing,
self-managing, large-scale system.
Dynamic Web Services Composition
and Orchestration
The vision of Web services intelligently interacting with
one another and performing useful tasks automatically
and seamlessly has yet to become reality. Major mile-
stones have been achieved: XML as a syntactic framework
and data representation language for Web services inter-
action; the Web infrastructure itself providing ubiquitous
access to Web services; the emergence of global registra-
tion and discovery services; and the technology to sup-
port the creation and maintenance of Web services, just
to name a few. However, major pieces such as the
formalization and description of service semantics are yet
to be developed. The effort of creating a semantic Web (Se-
mantic Web, 2001) is an extension of the current Web
in which information is given well-defined meaning, bet-
ter enabling computers and people to work in coopera-
tion. Ontologies define the structure, relationships, and
meaning of terms appearing in service descriptions. The
semantic Web vision is that these ontologies can be reg-
istered, discovered, and used for reasoning about Web
service selection before undertaking business. Languages
like DAML+OIL (DAML, 2001) have been developed in
this context.
In addition, sending a document or invoking a method
and getting a reply are the basic communication prim-
itives. However, complex interactions between Web ser-

vices will involve multiple steps of communication that
are related to each other. A conversation definition is a
sequencing of document exchanges (method invocations
in the network object model) that together accomplish
some business functionality. In addition to agreeing upon
vocabularies and document formats, conversational Web
services also agree upon conversation definitions before
communicating with each other. A conversation defini-
tion consists of descriptions of interactions and transi-
tions. Interactions define the atomic units of information
interchange between Web services. Essentially, each ser-
vice describes each interaction in terms of the documents
that it will accept as input or will produce as output. The
interactions are the building blocks of the conversation
definition. Transitions specify the ordering amongst the
interactions. Web services need to introspect other Web
services and obtain each other’s descriptions before they
start communicating and collaborating (Banerji et al.,
2002).
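A conversation definition can be sketched as a small state machine: interactions name the documents exchanged, and transitions constrain their order. The document and state names below are invented for illustration, not taken from any published conversation language:

```python
# Transitions: current state -> {accepted document -> next state}
CONVERSATION = {
    "Start":    {"PurchaseOrder": "Ordered"},
    "Ordered":  {"ShippingRequest": "Shipping", "Cancellation": "Done"},
    "Shipping": {"OrderConfirmation": "Done"},
}

def run_conversation(documents: list) -> str:
    # Replay a sequence of document exchanges against the definition,
    # rejecting any document that is out of order.
    state = "Start"
    for doc in documents:
        allowed = CONVERSATION.get(state, {})
        if doc not in allowed:
            raise ValueError(f"{doc!r} not accepted in state {state!r}")
        state = allowed[doc]
    return state

print(run_conversation(
    ["PurchaseOrder", "ShippingRequest", "OrderConfirmation"]))  # Done
```

Two services that introspect and agree on such a definition beforehand can each validate that the other sends documents only in a permitted order.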
RosettaNet (RosettaNet, 2002) is a nonprofit consor-
tium of major information technology, electronic com-
ponents, and semiconductor manufacturing companies
working to create and implement industry-wide, open
e-business process standards, particularly targeting busi-
ness-to-business market places, workflow, and supply-
chain management solutions. These standards form a
common e-business language, aligning processes between
supply-chain partners on a global basis. Several such
standards already exist. The centerpiece of the RosettaNet
model is the partner interface process (PIP). A PIP defines
the activities, decisions, and interactions that each
e-business trading participant is responsible for. Although
the RosettaNet model has been in development for some
time, it will be a while until Web services start using such
standards to undertake business on the Web.
Once these hurdles are overcome, the basis and platform
for true Web services will emerge, enabling agent
technologies to merge into Web services and provide the
envisioned dynamic, on-demand Web service aggregation
according to users’ specifications.
Personalized Web Services
As Web service technology evolves, we anticipate that
Web services will become increasingly sophisticated and
that the challenges the Web service community faces will
evolve along with these new capabilities. One of the most
important of these challenges is the question of what it
means to personalize Web services. Personalization can
be achieved by using user profiles, i.e., monitoring user
behavior, devices, and context to customize Web services
(Kuno & Sahai, 2002) for achieving metrics like quality of
experience (QoE) (van Moorsel, 2001). This would involve
providing and meeting guarantees of service performance
on the user’s side. Personalization could also result in the
creation of third-party rating agencies that will register
user experiences, which could be informative for other
first-time users. These rating mechanisms already exist in
an ad hoc manner, e.g., eBay and Amazon allow users to
rate sellers and commodities (books), respectively. Salcen-
tral.com and bizrate.com are third-party rating agencies
that rate businesses. These services could be also devel-

oped as extended UDDI services. These mechanisms will
also render Web services more “customer-friendly.”
End-to-End Web Service Interactions
Web services are federated in nature as they interact
across management domains and enterprise networks.
Their implementations can be vastly different in nature.
When two Web services connect to each other, they must
agree on a document exchange protocol and the appro-
priate document formats (Austin, Barbir, & Garg 2002).
From then on they can interoperate with each other,
exchanging documents. SOAP defines a common layer
for document exchange. Services can define their own
service-specific protocol on top of SOAP. Often, these Web
service transactions will span multiple Web services. A re-
quest originating at a particular Web service can lead to
transactions on a set of Web services. For example, a pur-
chase order transaction that begins when an employee
orders supplies and ends when he or she receives a con-
firmation could result in 10 messages being exchanged
between various services as shown in Figure 6.
The exchange of messages between Web services could
be asynchronous. Services sending a request message
need not be blocked waiting for a response message. In
some cases, all the participating services are like peers, in
which case there is no notion of a request or a response.
Some of the message flow patterns that result from this
asynchrony are shown in Figure 7. The first example in
Figure 7 shows a single request resulting in multiple
responses. The second example shows a broker-scenario,
in which a request is sent to a broker but responses are
received directly from a set of suppliers.
[Figure 6: SOAP messages exchanged between Web services (officesupplies.com, supplies.marketplace.com, stationery.com, supplies.workhard.com, shipme.com): 1. purchase order; 2. part of purchase order; 3. the other part of the purchase order; 4 and 5. shipping requests; 6 and 7. shipping confirmations; 8 and 9. order confirmations; 10. purchase order confirmation.]
These Web services also interact with a complex web of
business processes at their back-ends. Some of these busi-
ness processes are exposed as Web service operations. A
business process comprises a sequence of activities and
links as defined by WSFL and XLANG. These business
processes must be managed so as to manage Web ser-
vice interactions. Management of Web services thus is
a challenging task because of their heterogeneity, asyn-
chrony, and federation. Managing Web services involves
managing business transactions by correlation of mes-
sages across enterprises (Sahai, Machiraju, & Wurster,
2001) and managing the business processes.
Also, in order to manage business on the Web, users
will need to specify, agree, and monitor service level agree-
ments (SLAs) with each other. Thus, Web services will
invariably have a large number of SLAs. As less human
intervention is more desirable, the large number of SLAs
would necessitate automating the process as much as pos-
sible (Sahai, Machiraju, Sayal, Jin, & Casati, 2002).
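As a sketch of what such automated SLA monitoring checks, the function below evaluates measured response times against an agreed threshold and compliance level. The metric, threshold, and target are illustrative and not taken from the cited work:

```python
def sla_compliant(response_times_ms: list, threshold_ms: float = 500.0,
                  target: float = 0.9) -> bool:
    # The SLA holds if at least `target` of the requests in the
    # measurement window finished within the agreed threshold.
    if not response_times_ms:
        return True  # nothing measured, nothing violated
    within = sum(1 for t in response_times_ms if t <= threshold_ms)
    return within / len(response_times_ms) >= target

window = [120, 340, 90, 480, 2100, 150, 200, 310, 95, 130]  # ms
print(sla_compliant(window))  # True: 9 of 10 within 500 ms
```

A monitoring service would evaluate such predicates continuously and raise violations toward both parties without human intervention.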
Web service to Web service interaction management
can also be done through mediation (Machiraju, Sahai,
&amp; van Moorsel, 2002). The vision of Web service networks
is to mediate Web service interactions so as to make them
secure, manageable, and reliable. Such networks enable
versioning management, reliable messaging, and monitoring of
message flows (e.g., Flamenco Networks, GrandCentral,
Transact Plus, Talking Blocks).
Future Web Services Infrastructures
Deployment and operational costs are determinants in
the balance sheets for Web service providers.
[Figure 7: Asynchronous message patterns between Web services: (a) multiple responses; (b) broker.]
Web service providers are optimizing their IT infrastructures to
allow faster provisioning of new services and more reli-
able operation. Platforms and management solutions that
reduce Web services’ deployment and operational costs
are emerging. Those platforms support the deployment
of Web services (installation and configuration of soft-
ware and content data), the virtual wiring of machines
into application environments independently of the physi-
cal wiring in a data center. They allow rearrangements of
Web services’ applications among machines, the dynamic
sizing of service capacities according to fluctuations in de-
mands, and the isolation of service environments hosted
in the same data center.
HP’s Utility Data Center (HP Utility Data Center, 2001)
is such a platform. The HP Utility Data Center with its
Utility Controller Software creates and runs virtual IT en-
vironments as a highly automated service optimizing asset
utilization and reducing staffing loads. Resource virtual-
ization is invisible to applications, sitting underneath the

abstractions of operating systems.
Two types of resources are virtualized:
Virtualized network resources, permitting the rewiring of
servers and related assets to create entire virtual IT en-
vironments; and
Virtualized storage resources, for secure, effective
storage partitioning, with disk images containing
persistent states of application environments such as file
systems, bootable operating system images, and application
software.
Figure 8 shows the basic building blocks of such a utility
data center with two fabrics for network virtualization
and storage virtualization.
The storage virtualization fabric with the storage area
network attaches storage elements (disks) to processing
elements (machines). The network virtualization fabric
then allows linking processing elements together in a vir-
tual LAN.
[Figure 8: Architecture of a utility data center Web services platform infrastructure: pools of storage, cluster, firewall, load-balancer, and NAS resources, tied together by storage virtualization and network virtualization fabrics under data center management and connected to the Internet and intranets.]
Two major benefits for Web services management can
be achieved on top of the infrastructure:
Automated Web services deployment—By entirely main-
taining persistent Web services’ states in the storage
system and conducting programmatic control over
storage containing deployed service arrangements;
and
Dynamic capacity sizing of Web services—By the ability to
automatically launch additional service instances that
absorb additional load arriving at the service. Ser-
vice instances are launched by first allocating spare
machines from the pool maintained in the data center,
wiring them into the specific environment of the Web
service, attaching appropriate storage to those ma-
chines, and launching the applications obtained from
that storage. Web server farms are a good example for
such a “breathing” (meaning dynamically adjustable)
configuration (Andrzejak, Graupner, Kotov, & Trinks,
2002; Graupner, Kotov, & Trinks, 2002).
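Such breathing capacity can be sketched as a sizing rule: compare offered load against per-instance capacity and allocate or release machines, with a hysteresis band so the configuration does not oscillate. The capacities and thresholds are invented for illustration and do not model the cited systems:

```python
def resize(instances: int, load: float, capacity_per_instance: float = 100.0,
           low: float = 0.3, high: float = 0.8) -> int:
    # Scale out while utilization would exceed the high-water mark...
    while load > instances * capacity_per_instance * high:
        instances += 1
    # ...and scale in while one fewer instance would still sit below
    # the low-water mark (the gap between low and high avoids flapping).
    while instances > 1 and load < (instances - 1) * capacity_per_instance * low:
        instances -= 1
    return instances

print(resize(1, 500))  # 7: allocate spare machines to absorb a demand spike
print(resize(7, 50))   # 2: release idle machines back to the pool
```

In a utility data center, each increment corresponds to allocating a spare machine, wiring it into the service's virtual environment, attaching its storage, and launching the application from that storage.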
IBM’s Autonomic Computing vision is to provide for
self-managing systems. The intent is to create systems
that respond to capacity demands and system glitches
without human intervention. Such systems are intended to
be self-configuring, self-healing, self-protecting, and self-
optimizing (IBM Autonomic Computing, 2002).

CONCLUSION
The Web services paradigm has evolved substantially be-
cause of concerted efforts by the software community. The
genesis of Web services can be traced back to projects like
e-speak, Jini, and TSpaces. Although progress has been
made in Web service standardization, the full potential of
Web services remains unrealized. The future will see the
realization of Web services as a means of doing business
on the Web, the vision of dynamic composition of Web
services, personalized Web services, end-to-end manage-
ment of Web service interactions, and a dynamically
reusable service infrastructure that will adapt to varia-
tions in resource consumption.
GLOSSARY
Business process execution language for Web services
(BPEL4WS) A standard business process descrip-
tion language that combines features from WSFL and
XLANG.
Composition Creating composite Web services when
Web services outsource their functionalities to other
Web services.
Conversation A set of message exchanges that can be
logically grouped together.
Description Describing Web services in terms of the op-
erations and messages they support, so that they can
be registered and discovered at UDDI operator sites or
by using WS-Inspection.
End-to-end management Protocol required to track
and manage Web service composition leading to a
transaction being subdivided amongst multiple Web

services.
Orchestration Web service to Web service interaction
that leads to the coupling of internal business pro-
cesses.
Personalization Personalizing or customizing Web ser-
vices to user/client profiles and requirements.
Platform One or more execution engines over which a
Web service implementation is executed.
Service level agreement (SLA) An agreement that
specifies quality-of-service guarantees between parties.
Simple object access protocol (SOAP) A standard for
messaging between Web services.
Web service conversation language (WSCL) A lan-
guage to describe Web service conversations.
Web services flow language (WSFL) A language to de-
scribe business processes.
CROSS REFERENCES
See Client/Server Computing; Common Gateway Interface
(CGI) Scripts; Electronic Payment; Java; Perl; Personal-
ization and Customization Technologies; Secure Electronic
Transmissions (SET).
REFERENCES
Andrzejak, A., Graupner, S., Kotov, V., &amp; Trinks, H. (2002). Self-organizing control in planetary-scale computing. In IEEE International Symposium on Cluster Computing and the Grid (CCGrid), 2nd Workshop on Agent-based Cluster and Grid Computing (ACGC). New York: IEEE.
Austin, D., Barbir, A., &amp; Garg, S. (2002, April 29). Web services architecture requirements. Retrieved November 2002 from …/20020429
Banerji, A., Bartolini, C., Beringer, D., Chopella, V., Govindarajan, K., Karp, A., et al. (2002, March 14). WSCL: Web services conversation language. Retrieved November 2002.
Bartel, M., Boyer, J., Fox, B., LaMacchia, B., &amp; Simon, E. (2002, February 12). XML signature syntax and processing. Retrieved November 2002 from http://www.w3.org/TR/2002/REC-xmldsig-core-20020212
BEA Systems, Intalio, SAP AG, &amp; Sun Microsystems (2002). Web Service Choreography Interface (WSCI) 1.0 specification. Retrieved November 2002.
Berners-Lee, T. (1996, August). The World Wide Web: Past, present and future. Retrieved November 2002.
Chaum, D. (1985). Security without identification: Transaction systems to make Big Brother obsolete. Communications of the ACM, 28.
CryptoLogic Ecash FAQ (2002). Retrieved November 2002.
DAML: The DARPA Agent Markup Language homepage (2001). Retrieved November 2002 from http://www.daml.org
Digital Certificates, CCITT (1988). Recommendation X.509: The Directory—Authentication Framework.
ebXML: Enabling a global electronic market (2001). Retrieved November 2002.
Ford, W., Hallam-Baker, P., Fox, B., Dillaway, B., LaMacchia, B., Epstein, J., &amp; Lapp, J. (2001, March 30). XML key management specification (XKMS). Retrieved November 2002.
Glassman, S., Manasse, M., Abadi, M., Gauthier, P., &amp; Sobalvaro, P. (2000). The Millicent protocol for inexpensive electronic commerce. Retrieved November 2002.
Graupner, S., Kotov, V., &amp; Trinks, H. (2002). Resource-sharing and service deployment in virtual data centers. In IEEE Workshop on Resource Sharing in Massively Distributed Systems (RESH’02). New York: IEEE.
Hallam-Baker, P., &amp; Maler, E. (Eds.). (2002, March 29). Assertions and protocol for the OASIS Security Assertion Markup Language. Retrieved November 2002 from …/docs/draft-sstc-core-29.pdf
Karp, A., Gupta, R., Rozas, G., &amp; Banerji, A. (2001). The client utility architecture: The precursor to e-speak (HP technical report). Retrieved November 2002 from …/136.html
HP Utility Data Center: Enabling the adaptive infrastructure (2002, November). Retrieved November 2002.
Kim, W., Graupner, S., &amp; Sahai, A. (2002, January 7–10). A secure platform for peer-to-peer computing in the Internet. Paper presented at the 35th Hawaii International Conference on System Science (HICSS-35), Hawaii.
Kuno, H., &amp; Sahai, A. (2002). My agent wants to talk to your service: Personalizing Web services through agents. Retrieved November 2002 from …/techreports/2002/HPL-2002-114
IBM Autonomic Computing (n.d.). Retrieved from http://www.research.ibm.com/autonomic/
Leymann, F. (Ed.) (2001). WSFL: Web services flow language (WSFL 1.0). Retrieved July 2003 from …/pdf/WSFL.pdf
Liberty Alliance Project (2002). Retrieved November 2002.
Machiraju, V., Sahai, A., &amp; van Moorsel, A. (2002). Web service management network: An overlay network for federated service management. Retrieved November 2002 from …/HPL-2002-234.html
Micropayments overview (2002). Retrieved Novem-
ber 2002 from />Micropayments/
Microsoft .NET Passport (2002). Retrieved November
2002 from />OASIS PKI Member Section (2002). Retrieved November
2002 from />Organization for the Advancement of Structured Informa-
tion Standards (OASIS) (2002). Retrieved November
2002 from
Reagle, J. (Ed.) (2000, October 6). XML encryption require-
ments. Retrieved November 2002 from http://lists.w3.
org/Archives/Public/xml-encryption/2000Oct/att-0003/
01-06-xml-encryption-req.html
RosettaNet (2002). Retrieved November 2002 from

Sahai, A., Machiraju, V., Sayal, M., Jin, L. J., & Casati,
F. (2002). Automated SLA monitoring for Web ser-
vices. Retrieved November 2002 from http://www.
hpl.hp.com/techreports/2002/HPL-2002-191.html
Sahai, A., Machiraju, V., & Wurster, K. (2001, July).
Monitoring and controlling Internet based services.
In Second IEEE Workshop on Internet Applications
(WIAPP’01). New York: IEEE. [Also as HP Tech. Rep.
HPL-2000—120.]
Semantic Web (2001). Retrieved November 2002 from
/>SET Secure Electronic Transactions LLC (2002). Re-
trieved November 2002 from />Simple Public Key Infrastructure (SPKI). (1999). SPKI

Certificate Theory (RFC 2693).
Thatte, S. (2001). XLANG Web Services for Business
Process Design. Retrieved November 2002 from
/>wsspecs/xlang-c/
default.htm
TSpaces: Intelligent Connectionware (1999). Retrieved
November 2002 from />cs/TSpaces/
Van Moorsel, A. (2001). Metrics for the Internet Age—
Quality of experience and quality of business. Re-
trieved November 2002 from />techreports/2001/HPL-2001-179.html
Weber, R. (1998). Chablis—Market analysis of digital pay-
ment systems. Retrieved November 2002 from Univer-
sity of Munich Web site: -
muenchen.de/MStudy/x-a-marketpay.html
P1: 61
Irie WL040/Bidgoli-Vol III-Ch-62 June 23, 2003 16:43 Char Count= 0
Web Site Design
Robert E. Irie, SPAWAR Systems Center San Diego
Introduction
Web Site Components
Content
Presentation
Logic
Separation of Components
Implementation Issues
Static Sites
Dynamic Sites
Client Side
Server Side
Web Applications
Design Issues
Usability Issues
Basic Web Site Types
News/Information Dissemination
Portal
Community
Search
E-commerce
Company/Product Information
Entertainment
Basic Design Elements
Accessibility/Connectivity
Consistent Page Layout
Consistent Navigation Mechanism
Miscellaneous Interface Issues
Graphics
Layout Styles
Search Engines
Cross-Browser Support
Web Resources
Conclusion
Glossary
Cross References
References
INTRODUCTION
Designing and implementing a Web site is increasingly be-
coming a complex task that requires knowledge not only
of software programming principles but of graphical and
user interface design techniques as well. While good design is important in regular software engineering and ap-
plication development, nowhere is it more essential than
in Web site development, due to the diverse and dynamic
nature of Web content and the larger intended audience.
This chapter will cover some of the issues involved with
the two major components of a Web site, its design and
implementation. The scope of this chapter is necessarily
limited, as Web development is a rich and heterogeneous
field. A broad overview of techniques and technology is
given, with references to other chapters. The reader is di-
rected to consult other chapters in this encyclopedia for
more detailed information about the relevant technolo-
gies and concepts mentioned below. Occasionally links to
Web sites will be given. They are either representative ex-
amples or suggestions for further reference, and should
not be construed as an endorsement.
WEB SITE COMPONENTS
A Web site is an integration of three components, the con-
tent to be published on the Web, its presentation to the
user, and the underlying programming logic. Each com-
ponent has its own particular representation and role in
shaping the overall user experience.
Content
The content consists of all relevant data that are to be pub-
lished, or shown to the user. It usually constitutes the bulk
of a Web site’s storage requirements and can be in the form
of text, images, binary and multimedia data, etc. Static
textual and graphic content can be stored as HTML pages,
whereas multimedia files like videos and sound recordings
are usually stored in large databases and served, in whole or in parts, by dedicated servers. Most of the discussion
in this chapter will focus on the former type.
Presentation
The presentation component involves the user interface to
the Web site and the manner in which content is displayed.
Typical elements include the graphical and structural lay-
out of a Web document or page, text and graphic styles to
highlight particular content portions, and a mechanism
for the user to navigate the Web site. Originally, files with
HTML markups were used to store both content and in-
formation regarding its presentation. It is now common
practice to store neither exclusively in HTML. HTML is
primarily used to describe the structure of a Web docu-
ment, by breaking down the page into distinct elements
like paragraphs, headings, tables, etc. The actual textual
content of the document can be stored separately in a
database, to be dynamically inserted into the HTML page
using programming logic. A separate file, called the style
sheet, can be associated with the HTML page, and con-
sequently the content, to affect the presentation. A style
sheet file can describe how each structural element in an
HTML file is displayed; sizes, colors, positions of fonts,
blocks, backgrounds, etc., are all specified in a hierarchi-
cal organization, using the standard Web-based style sheet
language, cascading style sheets (CSS) (Lie & Bos, 1999).
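As a minimal sketch of this arrangement (the file names and rules below are hypothetical), the HTML file carries only structure, while an associated style sheet carries the presentation:

```html
<!-- page.html: structural markup only; presentation is external -->
<link rel="stylesheet" type="text/css" href="site.css">
<h1>Annual Report</h1>
<p class="summary">Revenue grew in every quarter.</p>
```

```css
/* site.css: fonts, colors, and spacing for the structural elements */
h1        { font-family: sans-serif; color: #003366; }
p.summary { font-size: 90%; margin-left: 2em; }
```

Changing site.css restyles every page that links to it, without touching the content.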
Logic
The programming logic determines which content to
display, processes information entered by the user, and
768
generates new data. It drives the interaction between the
Web site and the user and is the glue that binds the content
and its presentation. To be useful, it needs to access the
content as well as its presentation information and han-
dle user input accordingly. Logic is usually implemented
as small programs, or scripts, that are executed on the
Web server or the user’s browser. These scripts can be
stored within the HTML page, along with the presentation
and content, or separately as distinct program files that
are associated with the content. There are several stan-
dard programming languages that can be used in writing
scripts.
Separation of Components
With the existence of a variety of technologies, protocols,
and standards, Web development is remarkably flexible,
and there are often multiple ways of accomplishing the
same task. This is both an asset and a liability, as while
developers are free to choose their own techniques, it is
very easy to create sloppy or undisciplined documents and
code. In regular application development, it is important
to adhere to sound software engineering techniques to
manage a code base for future enhancements and simul-
taneous development efforts. The flexibility of Web deve-
lopment makes such good techniques even more critical.
Until very recently, there was a great deal of overlap,
in terms of storage and implementation, of the three Web
site components mentioned above. This led, for example,
to Web pages that contained all three components in a single, often unmanageable, HTML file. As Web site de-
velopment has matured, the principle of Web site com-
ponent separation has become widely encouraged, if not
accepted, and it is the central theme of this chapter.
IMPLEMENTATION ISSUES
The World Wide Web (WWW) is a series of client/server
interactions, where the client is the user’s Web browser,
and the server is a particular Web site. The WWW Con-
sortium (W3C) defines the hypertext markup language
(HTML) and the hypertext transfer protocol (HTTP) as
the standard mechanisms by which content is published
and delivered on the Web, respectively.
In essence, the local Web browser initiates HTTP re-
quests to the remote Web server, based on user input.
The Web server retrieves the particular content specified
by the requests and transmits it back to the browser as
an HTTP response. The Web browser then interprets the
response and renders the received HTML content into a
user-viewable Web page.
Web site implementations can be classified by the level
of interactivity and the way content is stored, retrieved,
and displayed.
Static Sites
Static sites are the simplest type of Web sites, with the
content statically stored in HTML files, which are simple
text files. Updating the Web site requires manually chang-
ing individual HTML text files. While this type of site was
prevalent in the beginning, most sites, especially commer-
cial ones, have come to incorporate at least some degree of dynamic behavior, and users have come to expect some interactivity.
Figure 1: Block diagram of a client/server architecture with a static Web site.
Figure 1 shows the basic client–server interaction for a
static Web site. The client browser makes an HTTP request
to a Web server. The URL specifies the particular Web
server and page. The Web server retrieves the requested
Web page, which is an HTML file, from the file system and
sends it back to the client through the HTTP response.
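The exchange can be sketched as a raw HTTP request and response (the host, path, and page body below are illustrative):

```
GET /index.html HTTP/1.1        request sent by the browser
Host: www.example.com

HTTP/1.1 200 OK                 response sent by the Web server
Content-Type: text/html

<html><body>Hello, Web!</body></html>
```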
This very basic interaction between browser and server
is the basis for more complex, dynamic interactions. This
type is static because the Web page content is straight
HTML, statically stored on disk, with no mechanism to
change the contents. The Web server here serves solely as
a file transfer program.
Developing static sites requires very few tools. All that
is required, besides the Web server and browser, is a text
editor application. The simplest text editor can be used to
manually create HTML files. Complex, graphical HTML
editors can make the task almost trivial by automatically
generating HTML files and behaving similarly to word
processors, with WYSIWYG (what you see is what you
get) interfaces. Creating graphics and images for static
Web sites is also straightforward, requiring any typical
paint or drawing program.
DYNAMIC SITES
Dynamic sites share the same basic architecture as static
ones, but with the addition of programming logic. The two
major types of dynamic sites reflect the place of execution
of the scripts. Client-side scripting involves embedding ac-

tual source code in HTML, which is executed by the client
browser in the context of the user’s computer. Server-side
scripts, on the other hand, are executed on the Web server.
While the following discussion examines both types sep-
arately, in an actual Web site both types can and often do
exist simultaneously.
Client Side
Figure 2 shows the basic architecture for a dynamic site
with client-side scripting. Scripts are embedded within
HTML documents with the <script> </script> tags or
stored in separate documents on the server’s file system.
Scripts are transmitted, without execution, to the client
browser along with the rest of the HTML page. When
the client browser renders the HTML page, it also in-
terprets and executes the client script. An example of a
client-side script is the functionality that causes a user
interface element, such as a menu, to provide visual feed-
back when the user moves the mouse pointer over a menu
option.
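Such a rollover can be sketched with a few lines of embedded script; the image paths and function names below are hypothetical:

```html
<script type="text/javascript">
  // Swap the menu button's image as the pointer enters and leaves it
  function highlight(img)   { img.src = "menu-on.gif"; }
  function unhighlight(img) { img.src = "menu-off.gif"; }
</script>
<a href="products.html">
  <img src="menu-off.gif" alt="Products"
       onmouseover="highlight(this)" onmouseout="unhighlight(this)">
</a>
```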
Figure 2: Block diagram of a Web site interaction
with client-side scripting.
There are several client-side scripting languages, the
most common one being JavaScript, an object-oriented
language originally developed by Netscape. It is now a standardized language, defined by the international industry
group European Computer Manufacturers Association,
and called ECMAScript (European Computer Manufacturers Association, 1999). Netscape continues to use the
term JavaScript, however, and Microsoft calls its imple-
mentation of ECMAScript for Windows browsers JScript.
The other major scripting language is Microsoft’s VB-
Script, short for Visual Basic Scripting Edition, which is
available only for Windows platforms (Champeon, 2001).
Regardless of the language, client-side scripts rely on a
standard programming interface, defined by the W3C and
called the Document Object Model (DOM), to dynamically
access and update the content, structure, and style of Web
documents (World Wide Web Consortium, 1998).
Cascading style sheets (CSS) is another W3C language
standard that allows styles (e.g., fonts, colors, and spac-
ing) to be associated with HTML documents. Any specific
HTML tag or group of HTML tags can be modified. It is
a language, separate from HTML, that expresses style in
common desktop publishing terminology. The combina-
tion of HTML, CSS, and DOM client-side scripts is often
referred to as dynamic HTML (Lie & Bos, 1999).
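A small dynamic HTML sketch combines a DOM lookup with a CSS style change (the element id below is hypothetical):

```html
<p id="notice">Special offer ends Friday.</p>
<script type="text/javascript">
  // Locate the element through the DOM and update its CSS style
  var el = document.getElementById("notice");
  el.style.color = "red";
  el.style.fontWeight = "bold";
</script>
```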
Client-side scripting is used primarily for dynamic user
interface elements, such as pull-down menus and ani-
mated buttons. The advantage of using client-side scripts
instead of server-side scripts for such elements is that
the execution is more immediate. Since the script, once
loaded from the server, is being executed by the browser
directly on the user’s computer, there are no delays asso-
ciated with the network or the server load. This makes the
user interface responsive and similar to standard platform
applications.
One of the disadvantages is that client-side scripting languages are usually more limited in functionality than
server-side languages, so that complex processing is not
possible. Such limitations are by design, for security rea-
sons, and are not usually apparent for simple user inter-
face programming.
Users may also specifically choose not to allow client-
side scripts to execute on their computers, resulting in a
partial or complete reduction in functionality and usabil-
ity of a Web site. In general, it is recommended that a site
incorporate user interface scripting only sparingly, and
always with clear and functional alternatives.
Finally, because client-side programs, whether embed-
ded or stored separately, must necessarily be accessible
and viewable by the Web browser, they are also ulti-
mately viewable by the user. This may not be desirable for
Figure 3: Block diagram of a Web site interaction with
server-side scripting.
commercial Web applications, where the programming
logic can be considered intellectual property.
Since client-side scripts are embedded in HTML pages,
any tool that creates and edits HTML pages can also be
used to create the scripts. The only requirement is that
the client browser support the particular scripting lan-
guage. Most modern browsers support some variation of
Javascript/ECMAScript/JScript, whereas a smaller subset
support VBScript.
Server Side
Figure 3 shows the basic architecture for a server-side dy-
namic site. Scripts are still stored in HTML documents on
the server’s file system, but are now executed on the server, with only the program results and output being sent to the
client browser, along with the rest of the HTML page. To
the client browser, the HTTP response is a normal static
HTML Web page. Scripts are embedded in HTML docu-
ments using special HTML-like tags, or templates, whose
syntax depends on the particular server-side scripting lan-
guage (Weissinger, 2000).
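As a sketch of the template idea, using PHP syntax (the file and its content are hypothetical): the code between the template tags runs on the server, and only the resulting HTML reaches the browser.

```php
<!-- greeting.php: ordinary HTML with one embedded server-side tag -->
<html><body>
  <p>Today is <?php echo date("l"); ?>.</p>
</body></html>
```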
There are several common server-side scripting lan-
guages, including PHP, Active Server Pages (ASP), and
Java Server Pages (JSP). The common gateway interface
(CGI) is also a server-side scripting mechanism, whereby
neither the Web content nor the programming logic is
stored in an HTML file. A separate program, stored in the
file system, dynamically generates the content. The Web
server forwards HTTP request information from the client
browser to the program using the CGI interface. The
program processes any relevant user input, generates an
HTML Web document and returns the dynamic content
to the browser via the Web server and the CGI interface.
This process is illustrated in Figure 4.
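A minimal CGI-style program can be sketched in Python (the parameter name and page content are hypothetical; a production script would also escape user input before echoing it):

```python
#!/usr/bin/env python3
# Minimal CGI sketch: the Web server passes the request's query string
# in the QUERY_STRING environment variable; the script writes an HTTP
# header block and a generated HTML document to standard output.
import os
from urllib.parse import parse_qs

def build_page(query_string):
    params = parse_qs(query_string)
    name = params.get("name", ["visitor"])[0]  # hypothetical parameter
    return ("Content-Type: text/html\r\n\r\n"
            "<html><body><h1>Hello, " + name + "!</h1></body></html>")

if __name__ == "__main__":
    print(build_page(os.environ.get("QUERY_STRING", "")))
```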
Figure 4: Block diagram of a Web site interaction with common gateway interface scripting.
Server-side scripting is used primarily for complex and time-consuming programming logic tasks, where immediacy of response is not as critical as with user interface elements. The advantage of using server-side scripts is the freedom and computational power that is available on the
server; server-side scripts do not have the same security
constraints as client-side scripts, and often have full ac-
cess to the server machine’s file system and resources. The
user may not disable execution of such scripts, so that the
Web developer can reasonably expect that the Web site
will behave exactly the same regardless of user configu-
ration. Finally, any proprietary server-side source code is
safely hidden from user view, as the client browser re-
ceives only the output of the script.
Server-side scripts have the disadvantage of requiring
a request–response round trip between the client browser
and the server, which leads to slower response times.
Server-side scripting languages normally interact closely with the Web server, which imposes some compatibil-
ity constraints. The choice of a Web server, particularly a
proprietary system, usually limits the selection of server-
side scripting languages, and vice versa.
WEB APPLICATIONS
As a Web site becomes more complex, a robust and effi-
cient mechanism for the separation of content, presenta-
tion, and logic is necessary. Web application servers are
Web sites that are more interactive, access large amounts
of data, and provide a rich functionality similar to that of
desktop applications. Unlike desktop applications, where
all components are stored and executed on the same com-
puter, Web applications usually follow a three-tier client/server architecture (see Figure 5) consisting of the Web
browser, the Web server, and a database. All content and
logic are stored in the database and are retrieved and pro-
cessed as necessary on the Web server. The presentation

information can be embedded with the content or stored
as a separate style sheet on the database or the server.
Usually a Web application server interfaces with a re-
lational database, which stores data in rows of tables.
Table 1: URLs of Various Web Resources
Browsers: Internet Explorer, Netscape Navigator, Lynx
Design guidelines: Fixing Your Web site (http://www.fixingyourwebsite.com), CSS, usability and accessibility issues
Programming: DevShed, Webmonkey, Javascript
Standards: World Wide Web Consortium
Web application servers: BEA WebLogic, IBM WebSphere, Macromedia ColdFusion, Apache Jakarta, Zope
Figure 5: Block diagram of a Web application server interaction.
The other major type of database is the object-oriented
database, which stores data by encapsulating them into
objects. Relational databases are often more efficient and
faster than equivalent object-oriented databases and sup-
port an almost universal database language, SQL (struc-
tured query language).
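The data tier of such a three-tier site can be sketched with Python's built-in sqlite3 module standing in for the site database (the table and column names are hypothetical):

```python
# Server-side logic retrieves content with SQL and merges it into HTML.
import sqlite3

def fetch_article_html(conn, article_id):
    row = conn.execute(
        "SELECT title, body FROM articles WHERE id = ?", (article_id,)
    ).fetchone()
    if row is None:
        return "<p>Article not found.</p>"
    title, body = row
    return "<h1>" + title + "</h1><p>" + body + "</p>"

# An in-memory database stands in for the site's content store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.execute("INSERT INTO articles VALUES (1, 'Welcome', 'Hello from the database.')")
print(fetch_article_html(conn, 1))
```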
The major disadvantage of developing with Web ap-
plication servers, besides the inherent complexity, is the
necessity of learning a nonstandard or proprietary server-
side programming interface or language. There are several major Web application servers that support standard pro-
gramming languages such as Java and C++, but each has
its own application programming interface (API). Table 1
lists some of the popular commercial and open source ap-
plication servers (see Web Resources).
DESIGN ISSUES
Unlike implementation issues, which usually are straight-
forward to specify and quantify, design issues are much
more subjective and are dependent on several factors, in-
cluding the particular type of Web site and its purpose.
Web site development efforts are often driven by con-
flicting objectives and considerations, and a balance must
be maintained between business and financial concerns,
which often stress the commercial viability and revenue-
generating aspects of a Web site, and more user-centric
design concerns, which usually deal with usability issues
(Murray & Costanzo, 1999). Since the former are very
domain-specific, only the latter will be discussed in this
chapter. In the discussion that follows, references to sam-
ple Web sites will be given.
USABILITY ISSUES
The goal of designing a Web site with usability issues in
mind is to ensure that the users of the site find it usable
and useful. Specifically, a Web site should be accessible,
appealing, consistent, clear, simple, navigable, and forgiv-
ing of user errors (Murray & Costanzo, 1999).
The first step in designing any Web site should be the determination of the purpose of the site. Too often the
rush to incorporate the latest Web technology or standard
prevents a thorough examination and determination of
the most important factor of the Web site, its intention or
purpose. Most Web sites in essence are information dis-
semination mechanisms; their purpose is to publish use-
ful content to as wide an audience as possible. Others also
have a commercial component, with the buying and sell-
ing of goods or services. Still others foster a community
or group activity and are used as collaboration devices.
The Web site’s purpose should drive the design and im-
plementation efforts. A Web site advertising or describing
a company’s products will most likely need eye-catching
graphical designs and images. A commerce site will need
to consider inventory mechanisms and secure transac-
tions. A community site will need to solve problems in-
volving simultaneous collaboration of a distributed group
of users.
It is also important to consider the intended audience
of a Web site. There is a wide range in browser capabilities
and user technical competencies that must be taken into
account. A Web site geared toward a younger, more tech-
nically inclined audience may contain highly interactive
and colorful designs, whereas a corporate site might want
to have a more professional, businesslike appearance. It
is generally a good practice, if not essential, to consider
accessibility issues for all users, including those who do
not have access to high-end graphics-capable browsers.
BASIC WEB SITE TYPES
Just as there are several implementation classifications for Web sites, we can also classify them based on their
purpose. Each type will lead to different choices in the
content, presentation, and logic components and require
emphasis on different usability issues. A single Web site
may incorporate features of more than one basic type.
News/Information Dissemination
This type of Web site design is geared toward providing
informational content to the Web user. The content is usu-
ally textual in form, with some graphics or images. The
presentation of the content and its navigation are kept as
clear and consistent as possible, so that the user will be
able to quickly access the desired information. Not sur-
prisingly, newspaper companies usually have Web sites
with online news content (e.g., http://www.nytimes.com).
Portal
A portal is a popular type of Web site that serves as a gate-
way to other Web sites. The content is usually in the form
of URL links and short descriptions, categorized based on
themes. The links should be organized so that they are eas-
ily searchable and navigable. Major commercial portals
have evolved from simple collections of related URL links
to incorporate more community-like features to prompt
users to return to their sites (e.g., ).
Community
Community sites foster interaction among their users
and provide basic collaboration or discussion capabili-
ties. Message boards, online chats, and file sharing are
all typical functionalities of community sites. The open
source software movement has promoted numerous Web
sites based on this type (e.g., ).
Search
There is a lot of overlap between this type of Web sites and
portals. Like portals, search sites provide a mechanism by
which users discover other Web sites to explore. Some so-
phisticated programming logic, the search engine, forms
the foundation of this type of Web site. Search sites
often emphasize simple, almost minimalist interfaces
(e.g., ).
E-commerce
This type of site is often a component of other Web site
types and allows users to purchase or sell goods and ser-
vices in a secure manner. Since potentially large amounts
of currency are involved, security is an important consid-
eration, as well as an interface that is tolerant of potential
user errors. An example of a successful commerce site
with elements of a community is .
Company/Product Information
With widespread Web use, having an official Web pres-
ence is almost a requirement for corporations. Such sites
usually serve purposes similar to those of informational
and e-commerce sites, but with a more focused interface,
reflecting the corporate image or logo (e.g., http://www.
microsoft.com).
Entertainment
This type of site is usually highly interactive and stresses
appealing, eye-catching interfaces and designs. Typical
applications include online gaming sites, where users may
play games with each other through the site, and sporting
event sites, where users may view streaming content in
the form of video or audio broadcasts of live events (e.g., ).
BASIC DESIGN ELEMENTS
There is obviously no single best design for a Web site,
even if one works within a single type. There are, however,
some guidelines that have gained common acceptance.
Like any creative process, Web site design is a matter of
tradeoffs. A typical usability tradeoff is between making
an interface appealing and interactive and making it clear
and simple. The former usually involves graphical designs
with animations and client-side scripting, whereas the lat-
ter favors minimal text-based interfaces. Where a particu-
lar Web site belongs on the continuous spectrum between
the two extremes depends on its intended purpose and au-
dience, and should be a subjective, yet conscious decision.
The safest design decision is to always offer alterna-
tives, usually divided into high- and low-bandwidth ver-
sions of the same Web pages, so that the user experience
can be tailored to suit different preferences. The major
disadvantage of this is the increase in development time
and management requirements.
Accessibility/Connectivity
The two major factors affecting accessibility and connec-
tivity issues are the bandwidth of the user’s network con-
nection, and the particular graphical capabilities of the
user browser. Low-bandwidth connections to the Internet
are still very common in homes. By some measures, dialup
modems are still used in 90% of all homes that regularly access the Internet (Marcus, 2001). This requires Web site
designers either to incorporate only small, compressed
images on their sites, or to provide alternative versions of
pages, for both high- and low-bandwidth users.
Some user browsers do not have any graphics capabil-
ity at all, for accessibility reasons or user preference. For
example, visually impaired users and PDA (personal digi-
tal assistant) users most often require accessibility consid-
eration. Estimates of the number of disabled users range
from 4 to 17% of the total online population (Solomon,
2000). PDA and mobile phone Internet usage is relatively
new in the United States, but is already approaching
10 million users (comScore Networks, 2002). For such
users, designing a separate text-only version of the Web
site is a possibility. A better approach is a Web site that automatically degrades its functionality to match each browser's capabilities. An example is to associate relevant tex-
tual content to graphical images; graphical browsers may
display the images, while text browsers may display the
descriptions.
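In HTML this association is made with the img element's alt attribute (the file name and description below are hypothetical):

```html
<img src="sales-chart.gif"
     alt="Bar chart: sales rose from 120 to 180 units between 1999 and 2002.">
```

Graphical browsers render the image; text-only browsers and screen readers present the description instead.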
Consistent Page Layout
One of the most important design elements for a Web site is a consistent page layout. While every single page does not need
to have the same layout, the more consistent each page
looks, the more straightforward it is for the user to nav-
igate through the site and the more distinctive the Web
site appears. A typical Web page layout utilizes parts or
all of an artificially defined border around the content (see
Figure 6).
Originally, HTML frames or tables were the standard way of laying out a page, and they are still the preferred
method for most developers. However, the W3C clearly is
favoring the use of cascading style sheets (CSS) for page
layout (World Wide Web Consortium, 2002). CSS also pro-
vides a mechanism for associating styles, such as color,
font type and size, and positioning, with Web content,
without actually embedding them in it. This is in keeping
with the principle of separating content from its presen-
tation.
Figure 6: A typical layout scheme for
a Web page.
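A border-style layout like the one in Figure 6 can be expressed with CSS positioning instead of frames or tables (the ids and dimensions below are hypothetical):

```css
/* A fixed banner across the top, a navigation panel on the left,
   and a content region filling the remainder of the page */
#banner  { position: absolute; top: 0;    left: 0;     width: 100%;  height: 60px; }
#sidenav { position: absolute; top: 60px; left: 0;     width: 150px; }
#content { position: absolute; top: 60px; left: 150px; right: 0; }
```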
Consistent Navigation Mechanism
Web site navigation is an important component of the de-
sign, and a consistent navigation mechanism supplements
a page layout and makes the user experience much sim-
pler and more enjoyable.
One of the best ways of providing a consistent navigation
mechanism is to have a menu bar or panel that
is consistent across all pages of the site. Such a menu
can be a static collection of links or a dynamic, interactive
component similar to that of a desktop application.
Figure 7 is an example of a simple and consistent navigation
scheme that utilizes two menu panels. The top panel
(with menu items A, B, C) is similar to a desktop application's
menu bar; it is a global menu that is consistent
throughout the site and refers to all top-level pages of a
site. The left side panel is a context-dependent menu that
provides further options for each top-level page. This type
of navigation scheme can be seen on several public Web
sites (e.g., ).
While there are no absolute rules or guidelines for good
navigation elements, they usually provide visual feedback
(e.g., mouse rollover effects), have alternate text displays
(for nongraphical or reduced-capability browsers), and
are designed to be easy both to use and to learn.
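Such a menu might be sketched as follows (the page names are hypothetical); the same fragment is repeated verbatim on every page for consistency, and the CSS :hover rule supplies the rollover feedback without any scripting.

```html
<div id="menu">
  <a href="home.html">Home</a>
  <a href="products.html">Products</a>
  <a href="contact.html">Contact</a>
</div>

<style>
  /* Identical rules on every page keep the menu's look consistent. */
  #menu a       { color: #336699; text-decoration: none; }
  #menu a:hover { color: #ffffff; background: #336699; } /* rollover feedback */
</style>
```

Because the links are ordinary text anchors, a text-only browser still renders them as a usable list, satisfying the alternate-display guideline above.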
MISCELLANEOUS INTERFACE ISSUES
The following are miscellaneous interface design issues.
Again, only suggestions for possible design choices are
offered, and they will not be applicable in all circumstances.
Figure 7: An example of a consistent navigation scheme for a Web site.
Graphics
The two major interface issues concerning graphics are
size and color. As the majority of Web users still access
Web sites over slow modem links, it is important to use
graphic images that are of reasonable size, to prevent
excessive download delays. For photographic images, the
JPEG format offers a good compromise between compressed
file size and image quality, with an adjustable
tradeoff point. For line art and solid color images, lossless
compression is preferred, and the proprietary GIF format
is common, although the open standard PNG format is
gaining in acceptance (Roelofs, 2000).
The issue of colors is a complex one and depends on
many factors. In general, Web graphic designers work
with 24-bit colors, with 256 possible values for each of
three color channels, red, green, and blue (RGB). Until
recently, the majority of users' computers and Web browsers
could only support a palette, or set, of 256 colors simultaneously.
To ensure that colors appear uniformly across
platforms and browsers, a "Web-safe palette" of 216 colors
was established, consisting of combinations of six possible
values, or points, for each of the three color channels
(6 possible reds × 6 possible greens × 6 possible blues =
216 possible colors) (Niederst, 2001).
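The six points per channel correspond to the hexadecimal values 00, 33, 66, 99, CC, and FF, so the whole palette can be enumerated mechanically. A small script (a sketch, not part of the chapter) that builds it:

```javascript
// Enumerate the Web-safe palette: six evenly spaced points per RGB channel.
var points = ["00", "33", "66", "99", "CC", "FF"];
var palette = [];
for (var r = 0; r < points.length; r++)
  for (var g = 0; g < points.length; g++)
    for (var b = 0; b < points.length; b++)
      palette.push("#" + points[r] + points[g] + points[b]);

console.log(palette.length);           // 216
console.log(palette[0], palette[215]); // #000000 #FFFFFF
```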
Recently, browsers and systems with 24-bit and 16-bit
color support have increased drastically and now account
for about 94% of all users (Lehn & Stern, 2000). Twenty-four-bit
color support results in the full display of the designer's
color scheme. Sixteen-bit color systems are sometimes
problematic, as they nonuniformly sample the three color
channels (5 bits for red, 6 bits for green, and 5 bits for blue)
and provide a nonpalettized approximation of 24-bit color.
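The 5-6-5 sampling can be illustrated with a little bit arithmetic (a sketch, not from the chapter): a 16-bit display keeps only the top 5, 6, and 5 bits of the red, green, and blue channels, so most 24-bit values are approximated.

```javascript
// Pack a 24-bit RGB color into 16 bits (5-6-5 sampling).
function to565(r, g, b) {
  return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3);
}
// Expand back to 24 bits to see the rounding error.
function from565(c) {
  return {
    r: ((c >> 11) & 0x1f) << 3,
    g: ((c >> 5) & 0x3f) << 2,
    b: (c & 0x1f) << 3
  };
}

var approx = from565(to565(0x33, 0x66, 0x99));
// Red 0x33 (51) loses its low 3 bits: 51 -> 48.
// Green 0x66 (102) loses only 2 bits: 102 -> 100.
```

Note that green is sampled more finely than red and blue, which is the nonuniformity the text refers to.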
Layout Styles
A comprehensive guide to layout styles is beyond the scope
of this chapter. The major design decision is between having
page layouts of fixed or variable size (Niederst, 2001).
By default, HTML documents are variable-sized, in that
text and graphics positioning and line breaks are determined
by the user's monitor resolution and browser window
size. Since a wide variety of sizes and resolutions is
almost a given, having a variable-sized page layout allows
flexible designs that scale to the capabilities and preferences
of each user. The disadvantage is that because each
user experience is different, and elements can be resized
or repositioned at will, it is difficult to design a consistent
and coherent interface; there is the possibility that some
configurations lead to poor or unusable interfaces.
The alternative to the default variable-sized page layout
is to explicitly design the size and position of some
or all of the elements of a Web document. An example of
this would be to limit the width of all content in a page
to fit within a certain horizontal screen resolution, such
as 640 pixels. All text and graphics will remain stationary
even if the user resizes the browser window to greater
than 640 horizontal pixels. The advantage of this method
is that designing an interface is much more deterministic,
so the Web designer will have some degree of control over
the overall presentation and is reasonably certain that all
users will have the same experience accessing the site.
The disadvantage is that the designer must pick constants
that may not be appropriate for all users. For example,
a Web page designed for a 640 × 480 resolution
screen will look small and limited on a 1280 × 1024 screen,
whereas a Web page designed for an 800 × 600 screen
would be clipped or unusable for users with only a 640 ×
480 screen.
Actually implementing either type of page layout can
be done with HTML frames, tables, or CSS style sheets,
or some combination of the three. Although using style
sheets is the currently preferred method for page layout,
browser support is still poor, and many sites still use
frames or tables (Niederst, 2001).
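A fixed-size layout of the kind described above might be sketched in CSS as follows (the identifiers and pixel values are invented for the example):

```css
/* Pin all content into a 640-pixel column. The column stays put
   when the window is wider, and is clipped when it is narrower. */
#page { width: 640px; }
#side { position: absolute; left: 0;     width: 150px; }
#main { position: absolute; left: 160px; width: 480px; }
```

Removing the width and position declarations yields the variable-sized alternative, in which the browser reflows the content to fit the window.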
Search Engines
A search engine is a useful tool to help users quickly
find a particular page or piece of content as a Web
site grows or its navigation scheme becomes complicated.
The search engine is a server-side software program,
often integrated with the Web server, that indexes
a site's Web content for efficient and quick retrieval based
on a keyword or phrase. Search engines are available with
a variety of configurations, interfaces, and capabilities.
A good resource that summarizes the major commercial
and open source engines is the Search Tools Web site
().
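The indexing idea behind such a program can be sketched in a few lines (a toy illustration, not any particular product): each word is mapped to the list of pages containing it, so a keyword lookup becomes a single table probe instead of a scan of every page.

```javascript
// Toy inverted index: word -> list of page names containing that word.
function buildIndex(pages) {
  var index = {};
  for (var name in pages) {
    var words = pages[name].toLowerCase().split(/\W+/);
    for (var i = 0; i < words.length; i++) {
      var w = words[i];
      if (!w) continue;                              // skip empty tokens
      if (!index[w]) index[w] = [];
      if (index[w].indexOf(name) < 0) index[w].push(name);
    }
  }
  return index;
}

// Hypothetical site content:
var index = buildIndex({
  "home.html":     "Welcome to our site",
  "products.html": "Our products and prices"
});
// index["our"] -> ["home.html", "products.html"]
```

A real engine adds ranking, phrase queries, and incremental re-indexing, but the core lookup structure is the same.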
Cross-Browser Support
Designing a Web site that is consistent across multiple
browsers and platforms is one of the most challenging
tasks a developer faces. Even different versions of the
same browser are sometimes incompatible. At the mini-
mum, the three major browsers to consider are Internet
Explorer (IE), Netscape Navigator (NN), and text-based
browsers such as Lynx.
For the most part, browser development and capabilities
have preceded the establishment of formal standards
by the W3C, leading to numerous incompatibilities and
nonuniform feature support. The latest versions of the two
common browsers (IE 6, NN 6.2) offer complete support
for the current W3C standard HTML 4.01. However, the
more common, earlier versions of the browsers (versions
4+ and 5+) had only incomplete support.
Even more troublesome was their support of the W3C
standard Document Object Model (DOM) Level 1, as each
has historically taken a different track and implemented
its own incompatible DOM features (Ginsburg, 1999).
In general, NN's DOM support is much closer to the
"official" W3C DOM Level 1 specification, whereas IE has
several extensions that are more powerful but are available
only on Windows platforms. The latest versions of the
two browsers have alleviated some of this problem by supporting,
as a baseline, the complete Level 1 specification.
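A common coping technique is object detection: probing for a DOM feature before using it, rather than branching on the browser's name. The sketch below (the mock objects merely stand in for real browser documents) shows the idea.

```javascript
// Look up an element by id using whichever DOM flavor is available:
// the W3C DOM Level 1 method, the older IE 4 document.all collection,
// or neither (e.g., a text browser).
function findElement(doc, id) {
  if (doc.getElementById) return doc.getElementById(id); // W3C DOM Level 1
  if (doc.all) return doc.all[id];                       // IE 4.x extension
  return null;                                           // no DOM support
}

// Mock documents standing in for two browser generations:
var w3cDoc = { getElementById: function (id) { return "w3c:" + id; } };
var ie4Doc = { all: { menu: "ie4:menu" } };

findElement(w3cDoc, "menu"); // uses getElementById
findElement(ie4Doc, "menu"); // falls back to document.all
```

Because the test is on the feature itself, the same page continues to work as vendors converge on the W3C specification.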
WEB RESOURCES
Table 1 summarizes some useful online resources for Web
site development. They are only suggestions and should
not be considered comprehensive or authoritative.
CONCLUSION
This chapter has given an overview of Web site development,
including its design and implementation aspects.