
the internet and its applications

Ch.9 - The Internet and its applications
1 What is the Internet?
2 The benefits of the Internet
3 History
4 Merging of computer technologies
5 Bandwidth
6 Client server Model
7 Main Services
8 Other Services
9 Finding information on the Internet
10 Relevant Documents and False Drops
11 Full search
12 Constrained search
13 Internet file formats
14 Compression and Archiving
15 Mail Encoding
16 Media
17 Handling file formats
18 URL
19 Protocols
20 Top-level domains
21 Present-day
The Internet is the largest network of computers. These computers can run on different
platforms, like Windows, Mac, UNIX, NeXT, Amiga and so on, but they can still
communicate with each other using TCP/IP, the "common language of the Internet".
In 1999 there were about 130 million people connected to the Internet. By the year 2004
there may be as many as 1 billion users. Why are so many people getting connected? What
are the benefits of the Internet, from the user’s point of view?


The most important benefit of the Internet is the ability to get in touch with and
communicate with other people. E-mail, for instance, lowers the threshold for making
contact with other people. One example is when I was going to attend a conference at Florida
Tech in 1996. I wrote an e-mail to the person who was administering the conference and
asked her if there were any people at Florida Tech who were interested in Multimedia and
Distance Learning. She sent me the names of four people and their e-mail addresses. One of
them was the Dean of Florida Tech. I wrote e-mail messages to all four of them and they
replied that they were willing to see me when I arrived.
Another example was when we tried to find a teacher for the "Mobile Datacom" part of this
course. There were people at Ericsson in Stockholm who knew this subject well, but no one
had time to teach, since they were involved in other activities.
So I made some searches on the Internet and found an Ericsson-owned company in
Gothenburg that worked with "Mobile Datacom". I read their home pages but they
contained only superficial information. I then looked at their employment opportunities
pages, and I found very detailed information about what the different departments were working
on. I found a department that was working on the parts that we wanted to cover in
our course. I made contact with the manager of that department and engaged him as a teacher.
Another benefit is the information that you will find on the Internet. There are millions of web
pages, news articles and so on. The information on the Internet differs from the information that
you will find in libraries and book stores. In libraries you only find broad information, that is,
information that interests a large number of people. There is an economic reason for this. No
publisher will publish something which interests only a few people. But on the Internet it
costs very little to publish, and you will often find this kind of narrow information, for
instance a home page describing a single person or a small company.
The information on the Internet is growing very rapidly. The number of web pages for
instance is doubling every 53 days.

A third benefit is all the software which you can find on the Internet. There is freeware,
shareware and commercial software. The Internet is also an excellent source for updates.
A fourth benefit is all the services which are popping up on the Internet. You can order
books, groceries, send flowers, buy stocks and do your banking. These kinds of services have
just begun to appear on the Internet, but we will see a lot more of them in the future.
In October 1957 the former Soviet Union launched Sputnik and took the USA by surprise. To
many Americans, Sputnik was proof of Russia's ability to launch intercontinental missiles,
and pessimists predicted the destruction of democracy. As an answer to that threat,
President Eisenhower formed ARPA, the Advanced Research Projects Agency, in January 1958.
ARPA's mission was to make sure that the USA took the lead in research, especially
research for military use.
One of ARPA's projects was ARPANET, a communication network built from
computers and a communication technique invented in 1962 called packet
switching. ARPANET was built to protect the USA's communication structure in the
face of a nuclear attack. If one communication path was destroyed, the information packets
simply took another path through the network. ARPANET consisted of four computers in
1969, and that was the seed from which the Internet grew.
In those days computers were very expensive. The people who built the ARPANET
thought that the main use was to use processor power from computers at a distance, through
a service called Telnet. But it soon turned out that scientists were more interested in their
colleagues’ brain power than in computers’ processor power.
The users invented a service called e-mail and it soon turned out that e-mail traffic amounted
to 75% of all the traffic. This trend has continued through the evolution of the Internet. The
primary interest of people is to communicate with other people.
In 1983 TCP/IP was adopted as a standard and ARPANET became the Internet. The
same year TCP/IP was included in the UNIX operating system, which made it easy for
system managers to connect to the Internet.
In 1988 IRC, which stands for Internet Relay Chat, was written. This was the first
Internet service for real-time communication. Up to the 1990’s the Internet was mostly a
playground for students, scientists and the military. But in the 1990’s it all changed. One
reason was that commercial companies and the general public were allowed to connect to the
Internet. Another reason was that WWW and Mosaic were invented and the use of the
Internet became much friendlier.
In 1994 there was a breakthrough for the presence of commercial companies on the Internet.
In 1996 there were 54 million users connected to the Internet and by 1999 that number had
increased to 130 million.
As we look into the future we see that the Internet is continuing to grow and that new
services are appearing all the time.
This picture shows how different computer technologies fit together. The vertical axis is
time, which flows from top to bottom. The core technology of the Internet is computer
network technology. As you can see, some interesting key events are marked. ARPANET
represents the beginning of the Internet and it is followed by the invention of the Internet
services like Telnet, E-mail, Usenet and so on.
There is another technology called Hypertext, which originated with Vannevar Bush,
President Roosevelt’s science advisor. In 1945 he wrote an article called "As We May Think",
where he described a device called "Memex" which used this hypertext or linking technology.
Two people read Bush's article and were profoundly influenced by it. One of them was
Douglas Engelbart, the inventor of the mouse, groupware and many other things. In the
sixties he built the first computer-based system, called NLS, which used this linking
technology. The other person was Theodore Nelson and it was he who coined the term
hypertext. Theodore Nelson also had the idea to use this technology through the telephone
network to link all the literature in the world, and make it accessible to people.
HyperCard, which appeared in 1987, was the first program on ordinary personal computers
which used hypertext technology.

Hypertext technology merged with the Internet when the World Wide Web was invented. The World
Wide Web uses network technology together with the linking mechanism of hypertext.
There is another technology called Graphical User Interface which was first invented by
Xerox when they developed their Star machine. This technology was later adopted by Apple
on the Macintosh and still later by Microsoft with Windows. All these technologies merged
with the Internet when Marc Andreessen, a 23-year-old student, wrote a program called
Mosaic and later its successor called Netscape.
Multimedia is another technology that was started by Philips when they invented the
LaserDisc. Other storage devices, like CD-ROM and DVD (Digital Video Disc) appeared later.
Multimedia means using several media, like text, graphics, sound, animation, video and so on,
in combination with each other to present information. Even if there are multimedia elements
on the Internet today, the multimedia technology has not yet merged with the Internet. In
order to do so, you need to be able to transfer full screen, full motion video through the
Internet, and that requires a bandwidth of approximately 500 kbps. But that will happen in
the next few years.
Expert system technology will also merge into the Internet. Expert systems are intelligent
programs that can use rules to reason and act intelligently. Fuzzy Logic is one powerful
technique used in expert systems. One of the first applications of expert system technology
on the Internet will be something called intelligent agents. An agent is a program that keeps
track of you and your interests. It will go out on the Internet and seek information that might
interest you.
In summary, what this picture is saying is that many powerful computer technologies are
merging into the Internet, which will be extremely powerful in the future, and people will
probably associate the information age with the Internet rather than with computers.
This is a table from Popular Science magazine that predicts what will happen to
bandwidth in the next couple of years. Today most people have ordinary modems with 28.8
kbps. Some have modems with 56 kbps. Those who have access to ISDN use from 64 up to
128 kbps.
You can also have access to the Internet through the cable television network. In 1998 some
people had 500 kbps through that network, and that speed will increase to 1 Mbps in the year
2000.
But you can also use the ordinary telephone network for higher speeds. A promising
technology is ADSL (Asymmetric Digital Subscriber Line), which can give a bandwidth of
up to 8 Mbps.
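To get a feeling for what these speeds mean in practice, the following rough sketch computes how long a file takes to transfer at each of them. The 1 MB file size and the unit conventions (1 kbps = 1000 bits per second, no protocol overhead) are assumptions made only for this illustration.

```python
# Ideal transfer times for the connection speeds mentioned in the text.
# Assumptions: 1 kbps = 1000 bits per second, 1 MB = 1,000,000 bytes,
# and no protocol overhead or network congestion.

SPEEDS_KBPS = {
    "modem 28.8": 28.8,
    "modem 56": 56,
    "ISDN 128": 128,
    "cable 500": 500,
    "ADSL 8000": 8000,
}


def transfer_seconds(size_bytes: int, speed_kbps: float) -> float:
    """Return the ideal time in seconds to move size_bytes at speed_kbps."""
    bits = size_bytes * 8
    return bits / (speed_kbps * 1000)


if __name__ == "__main__":
    one_megabyte = 1_000_000
    for name, kbps in SPEEDS_KBPS.items():
        seconds = transfer_seconds(one_megabyte, kbps)
        print(f"{name:>10} kbps: {seconds:8.1f} s")
```

Real transfers are slower than these figures, since the line is shared with protocol headers and other traffic.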
Most services on the Internet use the Client Server Model. A Client Server Model is a
distributed system in which software is split between server tasks and client tasks. A client
sends requests to a server, according to some protocol, asking for information or action, and
the server responds. A server typically serves many clients. A client can request services
from different servers. This model allows clients and servers to be placed independently on
nodes in a network, on different hardware and operating systems.
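The request and response cycle can be sketched with a few lines of socket code. This is only a toy illustration: the "protocol" (the server answers with the request in upper case), the function names and the use of the local machine for both roles are assumptions, not a real Internet service.

```python
import socket
import threading


def serve_once(server_sock: socket.socket) -> None:
    """Server task: accept one client, read its request, send a response."""
    conn, _addr = server_sock.accept()
    with conn:
        request = conn.recv(1024).decode("ascii")
        # Toy protocol (an assumption): reply with the request in upper case.
        conn.sendall(request.upper().encode("ascii"))


def ask(host: str, port: int, request: str) -> str:
    """Client task: send one request and return the server's response."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(request.encode("ascii"))
        return sock.recv(1024).decode("ascii")


def demo() -> str:
    """Run a server and a client on the same machine and return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        # Port 0 asks the operating system for any free port.
        server_sock.bind(("127.0.0.1", 0))
        server_sock.listen(1)
        port = server_sock.getsockname()[1]
        worker = threading.Thread(target=serve_once, args=(server_sock,))
        worker.start()
        reply = ask("127.0.0.1", port, "hello server")
        worker.join()
        return reply


if __name__ == "__main__":
    print(demo())
```

The client only needs to know the server's address and the protocol. It does not matter which hardware or operating system the server runs on.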
There are many services on the Internet but the four most popular and most widely used are
World Wide Web, E-mail, Usenet News and FTP. If you want to request information from a
server, you need to know how it organizes its information.
A WWW server organizes its information in units called web pages. A web page is a
document with text, pictures and links. A web page can also contain other media elements
like sound, animation, video clips and so on. Web pages are grouped into web sites. A web
site is a number of pages that describe a particular subject. Web pages within a web site have
links between them but they can also have links to other web sites.
The protocol used to transfer web pages is called HTTP, HyperText Transfer Protocol.
An e-mail server organizes its information in units called mail messages. Messages are stored
in mailboxes. When mail messages are transferred between e-mail servers, a protocol called
SMTP, Simple Mail Transfer Protocol, is used. When you retrieve messages from the e-mail
server to your computer a protocol called POP3, Post Office Protocol version 3, is used.
A news server organizes its information in units called news articles. News articles
discussing a particular subject are grouped in newsgroups. In 1999 there were about 40 000
different newsgroups discussing all kinds of topics. Newsgroups are grouped in news
categories and news categories are themselves grouped in news categories on a higher level.
For instance, the newsgroup "rec.travel.europe" belongs to a category "rec.travel", which
belongs to the category "rec".
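Because the category levels are separated by dots, the categories a newsgroup belongs to can be read directly from its name. A minimal sketch (the function name is made up for this illustration):

```python
def parent_categories(newsgroup: str) -> list[str]:
    """Return the categories a newsgroup belongs to, nearest category first."""
    parts = newsgroup.split(".")
    # Drop one trailing name at a time: rec.travel.europe -> rec.travel -> rec
    return [".".join(parts[:i]) for i in range(len(parts) - 1, 0, -1)]


if __name__ == "__main__":
    print(parent_categories("rec.travel.europe"))
```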
News articles are transferred from news server to news server with a copying mechanism.
Every time a news server gets in touch with another news server it checks if the other news
server has some new articles and copies them. News articles migrate in this way through the
Internet.
Since every news server has an almost complete collection of news articles, there is a lot of
redundancy. Since news servers have a limited amount of disk space, the news articles that are
older than one or two weeks are deleted. The protocol used to transfer news articles is called
NNTP, Network News Transfer Protocol.
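The copying mechanism and the deletion of old articles can be simulated in a few lines. This is a sketch of the idea only, not of NNTP itself. The class, its method names and the 14-day limit are assumptions chosen for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NewsServer:
    name: str
    # Maps an article id to the day on which the article was posted.
    articles: dict[str, int] = field(default_factory=dict)

    def post(self, article_id: str, day: int) -> None:
        """Store a new article posted by a local user."""
        self.articles[article_id] = day

    def pull_from(self, other: "NewsServer") -> int:
        """Copy every article the other server has and this one lacks."""
        new_ids = [a for a in other.articles if a not in self.articles]
        for article_id in new_ids:
            self.articles[article_id] = other.articles[article_id]
        return len(new_ids)

    def expire(self, today: int, max_age_days: int = 14) -> None:
        """Delete articles older than max_age_days to free disk space."""
        self.articles = {
            a: day for a, day in self.articles.items()
            if today - day <= max_age_days
        }


if __name__ == "__main__":
    a, b, c = NewsServer("a"), NewsServer("b"), NewsServer("c")
    a.post("article-1", day=1)
    b.pull_from(a)   # the article migrates from a to b
    c.pull_from(b)   # and then from b to c
    print(sorted(c.articles))
    c.expire(today=20)
    print(sorted(c.articles))
```

After two exchanges the article posted on server a is also stored on b and c, which is the redundancy the text describes.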
An FTP server organizes its information just like your own computer, that is by using
directories which can contain files, or sub directories. You can copy a file from an FTP
server to your computer (that is called downloading) or copy a file from your computer to
the FTP server (that is called uploading). The protocol used to transfer files is called FTP,
File Transfer Protocol.
Telnet is the oldest service on the Internet. The user interface is typically old-fashioned, with
text only. At the bottom of the screen you have a command line where you can enter your
commands. Many institutions like libraries still use telnet, but they are slowly replacing it with
WWW. The protocol used to communicate with a telnet server is also called telnet.
Mail list servers keep track of two lists. One is the subscriber list, which contains the
e-mail addresses of the subscribers. The other one is a list of all the messages. When somebody
sends a message to the mail list server, that message is stored on the list of messages and then
sent to all subscribers.
There are thousands of different mail lists that you can subscribe to. In order to subscribe to
a mail list, you need to send an e-mail message to the mail list server. Then you will
automatically get all e-mail messages that people send to the mail list and you can also send
messages yourself for other subscribers to read.
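The two lists kept by a mail list server can be modelled directly. In this sketch the class name, the method names and the "outbox" that stands in for real e-mail delivery are all assumptions made for the illustration.

```python
class MailListServer:
    """Toy model of a mail list server with its two lists."""

    def __init__(self) -> None:
        self.subscribers: list[str] = []   # e-mail addresses of subscribers
        self.messages: list[str] = []      # every message sent to the list
        # Stand-in for real delivery: (recipient, message) pairs.
        self.outbox: list[tuple[str, str]] = []

    def subscribe(self, address: str) -> None:
        """Add an address to the subscriber list (once)."""
        if address not in self.subscribers:
            self.subscribers.append(address)

    def post(self, message: str) -> None:
        """Store the message, then send a copy to every subscriber."""
        self.messages.append(message)
        for address in self.subscribers:
            self.outbox.append((address, message))


if __name__ == "__main__":
    server = MailListServer()
    server.subscribe("anna@example.com")
    server.subscribe("bo@example.com")
    server.post("Welcome to the list")
    print(server.outbox)
```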
A chat server organizes its information into channels (sometimes called rooms). In every
channel a real time discussion is going on. You type text on your keyboard and that text
appears on a shared screen area for other people to see. You can see what other people are
writing. You can also have private discussions with others if you want to.
A gopher server works just like an FTP server. The only difference is that a particular file or
directory that the gopher server shows may not reside on that gopher server but on another
one. The gopher service is dying out and is being replaced with WWW.
There are two ways to find information on the Internet. The first is by using catalogues and
the second is by using search engines. This holds not only for the World Wide Web but
also for other services like E-mail, News, FTP and so on.
The best-known catalogue for WWW is Yahoo and the best-known search engine
is Alta Vista. There are many others.
If you want to find an e-mail address you can use an e-mail catalogue. You enter the name of
a person and you get that person’s e-mail address. One good search engine for finding e-mail
addresses is the Yahoo People Search.
As you know most news servers just keep track of the news articles from the last one or
two weeks. The older articles are deleted. But what if you want to find some older articles?
Deja News keeps track of older news articles. Another benefit is that you can search with
keywords in Deja News, which greatly facilitates finding the relevant articles.
There are a lot of FTP servers in the world and you can connect to most of them. In most
cases it's enough to connect to a handful of good FTP sites to find the software you are
looking for.

If you are looking for a particular piece of software like Disinfectant, how do you find it? Well, if you
know the name or a part of the name you can use Archie. If you do not know the name but are
looking for some type of software, say video editing, you can search with VSL, Virtual
Software Library.
We will now look more closely at how to use a search engine for finding documents on
WWW. The circle to the right represents the set of relevant documents that you want to
find. Entering some keywords into a search engine will find the documents represented by the
left circle. Some of these documents will be relevant but others will be false drops. You want
to minimize the number of false drops and maximize the number of relevant documents.
How do you do that? Well, the first thing you must be aware of is the distinction between
full search and constrained search.
Full search means that the search engine is searching through the whole of the web pages. A
full search will give you a lot of hits but it will also produce a lot of false drops.
A constrained search will only look at a specified part of the documents. You can for
instance restrict the search to only the titles of documents. A constrained search will reduce
the number of false drops but you might also miss some of the relevant documents.
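The trade-off between the two kinds of search can be shown on a tiny, made-up collection of pages. The page texts, the "relevant" flags (set by hand for someone planning a trip to Orlando) and the function names are assumptions for this illustration. Real search engines work on an index, not on a list like this.

```python
# A made-up collection; "relevant" marks what the searcher actually wants.
PAGES = [
    {"title": "Orlando travel guide",
     "body": "Hotels and theme parks in Orlando.", "relevant": True},
    {"title": "Theme parks of Florida",
     "body": "Orlando has many parks.", "relevant": True},
    {"title": "Virginia Woolf novels",
     "body": "Orlando is a novel from 1928.", "relevant": False},
]


def search(keyword, fields):
    """Return the pages that contain keyword in any of the given fields."""
    keyword = keyword.lower()
    return [p for p in PAGES if any(keyword in p[f].lower() for f in fields)]


def summarize(hits):
    """Return (number of relevant hits, number of false drops)."""
    relevant = sum(1 for p in hits if p["relevant"])
    return relevant, len(hits) - relevant


if __name__ == "__main__":
    full = search("Orlando", ("title", "body"))   # full search
    constrained = search("Orlando", ("title",))   # titles only
    print("full search:       ", summarize(full))
    print("constrained search:", summarize(constrained))
```

In this collection the full search finds both relevant pages but also one false drop, while the title-only search has no false drops but misses one relevant page.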
Here are some rules for performing a full search in Alta Vista. Other search engines have
similar functions although the syntax may vary. One of the best ways to minimize the
number of false drops is to search for phrases instead of single keywords. Another is to use
unique keywords.
Here are some rules for performing a constrained search in Alta Vista. Other search engines
have similar functions although the syntax may vary.
To perform a constrained search you enter a search tag and a colon followed by the
keywords. If for instance you only want to search for documents containing the keyword
Orlando in their titles you would enter
title:Orlando
As you know, most web pages have links to other web pages. What if you want to search for
all web pages that contain a link to web pages with thomas.gov in their URL? That's easy, all
you have to enter is
link:thomas.gov
Say that you want to find a photo of a comet. Since photos are mostly saved in the jpg
format you could try by entering
image:comet.jpg
As you know, every web page on the Internet is identified by a URL. A URL consists of a
protocol, host name, path name and file name. The host name ends with a top-level domain,
which often identifies the country of the host.
Say that you want to search for all web pages which have the file name groupware.html. All
you need to do is to enter
url:groupware.html
Say that you want to find all web pages on Volvo's www server. You could try with
host:www.volvo.se
If you are interested in finding only Swedish documents you could enter
domain:se
You can of course use the search functions in combination with each other and find
documents with very few false drops.
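The tag:keyword form is easy to take apart mechanically. The sketch below shows how such queries could be matched against one page record. It is not how Alta Vista is implemented. The page record, its URL path, and the matching rules are assumptions, and the image tag is left out for brevity.

```python
from urllib.parse import urlsplit


def parse_query(query: str) -> tuple[str, str]:
    """Split 'tag:keyword' into its search tag and its keyword."""
    tag, _, keyword = query.partition(":")
    return tag.strip().lower(), keyword.strip().lower()


def matches(page: dict, query: str) -> bool:
    """Check one constrained query against one page record."""
    tag, keyword = parse_query(query)
    host = urlsplit(page["url"]).hostname or ""
    if tag == "title":
        return keyword in page["title"].lower()
    if tag == "url":
        return keyword in page["url"].lower()
    if tag == "host":
        return host == keyword
    if tag == "domain":
        return host.endswith("." + keyword)
    if tag == "link":
        return any(keyword in link.lower() for link in page["links"])
    raise ValueError(f"unsupported search tag: {tag}")


def matches_all(page: dict, queries: list[str]) -> bool:
    """Combine several constrained queries: all of them must match."""
    return all(matches(page, q) for q in queries)


# Hypothetical page record, invented for this example.
PAGE = {
    "url": "http://www.volvo.se/cars/index.html",
    "title": "Volvo cars",
    "links": ["http://thomas.gov/"],
}

if __name__ == "__main__":
    print(matches_all(PAGE, ["host:www.volvo.se", "domain:se"]))
    print(matches(PAGE, "title:Orlando"))
```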
There are a lot of different file formats on the Internet. File extensions reveal what kind of
format is used.
Your browser can only handle a few formats. In order to handle the other formats you need
to get a suitable helper application or plug-in. Normally you will find these helper
applications and plug-ins on the Internet and can download them to your computer for free.
The file formats can be divided into three main groups: Compression and Archiving, Mail
Encoding and Media.

Archiving means wrapping up many files into a single file. The resulting archive file can later
be de-archived into its components. People often archive several files before transferring
them through the Internet in order to preserve the correct file names and directory structures.
Compression means storing data in less space. Archiving and compression are often
combined in a single process. If you want to compress and archive, use well-established
products.
Mac users can use Stuffit, UNIX users TAR/GZIP, and Windows and MS-DOS users zip.
TAR/GZIP: This is used in the UNIX world. Archiving and compression are performed as a
two-stage process. For example, a group of files may be archived to form files.tar and then
compressed with the GZIP program to form files.tar.gz. Systems such as MS-DOS and
Windows don’t allow multiple extensions, so the name is frequently condensed to files.tgz.
In order to recover the files you first need to decompress the file and then de-archive it,
although some programs perform both steps simultaneously.
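The two-stage process can be reproduced with Python's standard tarfile and gzip modules. This is a sketch for illustration: the directory name "docs", the file "notes.txt" and the function names are made up, and the work is done in a temporary directory.

```python
import gzip
import shutil
import tarfile
import tempfile
from pathlib import Path


def archive_then_compress(source_dir: Path, work_dir: Path) -> Path:
    """Stage 1: wrap the directory into files.tar. Stage 2: gzip it."""
    tar_path = work_dir / "files.tar"
    with tarfile.open(tar_path, "w") as tar:          # archive only
        tar.add(source_dir, arcname=source_dir.name)
    gz_path = work_dir / "files.tar.gz"
    with open(tar_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)                  # compress the archive
    return gz_path


def read_back(gz_path: Path) -> dict:
    """Decompress and de-archive in one step; return file name -> content."""
    contents = {}
    with tarfile.open(gz_path, "r:gz") as tar:
        for member in tar.getmembers():
            if member.isfile():
                contents[member.name] = tar.extractfile(member).read()
    return contents


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)
        source = tmp_path / "docs"
        source.mkdir()
        (source / "notes.txt").write_text("hello " * 1000)
        gz_path = archive_then_compress(source, tmp_path)
        return gz_path.name, read_back(gz_path)


if __name__ == "__main__":
    name, contents = demo()
    print(name, sorted(contents))
```

Note that the directory structure ("docs/notes.txt") survives the round trip, which is the reason the text gives for archiving before a transfer.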
Zip: WinZip is a standard program among PC users that does both archiving and
compression in a single program. It also knows how to de-archive and decompress.
Stuffit: Stuffit is a standard program among Mac users for compressing and archiving. The
extensions used are ".sit", which stands for Stuffit, or ".sea", which stands for self-extracting
archive. To de-archive and decompress, Mac users use the program Stuffit Expander.
For historical reasons many computers on the Internet that transfer data are designed to
handle only text from the US ASCII character set (also called seven-bit ASCII). However,
you often need to transfer binary data (like programs, Word files and pictures) as
attachments to your mail. To do this, you encode the data, converting it into US ASCII
characters, and attach the result to your mail. The recipient needs to decode the attached files on
the other end.
UUEncode has the extension .uu. This encoding is one of the oldest and one of the most
popular. Unfortunately some mail gateways don’t understand some characters used by
UUEncoded data and that data can get damaged on its way to the recipient. Another flaw is
that there are several types of UUEncodings which are incompatible with each other.
Binhex has the extension .hqx. This is a standard way of converting Macintosh files. For
Mac users there are plenty of programs that convert files from and to Binhex, including the
popular Stuffit Expander program.
MIME. MIME stands for Multipurpose Internet Mail Extensions. It is a standard that
contains an encoding often built into mail clients, so that you can send and receive binary
data without having to worry about the actual mechanism of encoding. MIME is supported
by many mail clients, for instance Eudora and Netscape mail. MIME-enabled mail clients not
only automate encoding and decoding, but also mark the data type so that the recipient’s mail
client can show the data in the appropriate way.
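What a MIME-enabled mail client does behind the scenes can be sketched with Python's standard email package. The addresses, the subject and the file name below are made up, and the binary payload is arbitrary test data, not a real picture.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_message(payload: bytes) -> bytes:
    """Sender side: attach binary data; the library encodes it as ASCII."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"        # made-up addresses
    msg["To"] = "recipient@example.com"
    msg["Subject"] = "Binary attachment"
    msg.set_content("The picture is attached.")
    # maintype/subtype mark the data type so the recipient can display it.
    msg.add_attachment(payload, maintype="image", subtype="gif",
                       filename="picture.gif")
    return msg.as_bytes()


def read_attachment(raw: bytes) -> tuple[str, bytes]:
    """Recipient side: find the attachment and decode it again."""
    msg = message_from_bytes(raw, policy=policy.default)
    part = next(msg.iter_attachments())
    return part.get_content_type(), part.get_content()


if __name__ == "__main__":
    binary_data = bytes(range(256))           # every possible byte value
    raw = build_message(binary_data)
    print(all(byte < 128 for byte in raw))    # only seven-bit characters?
    print(read_attachment(raw)[0])
```

The message that travels over the network contains only seven-bit ASCII characters, yet the recipient gets back exactly the original bytes together with their data type.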
Different media use different formats.
The following text formats are common:
.txt is simple ASCII text. Sometimes the extension .asc is used.
.doc is simple ASCII text or a Microsoft Word document.
.rtf is Rich Text Format, recognized by many word processors, including Microsoft Word.
It encodes attributes such as bold, underline, italics, and so on. RTF is good for exchanging
text files between different systems.
.ps is PostScript. It is used by printers and high-end graphics systems. PostScript’s biggest
strength is that it describes what a page should look like without assuming anything about
the printer or a screen. The same PostScript file can be displayed on a 72 dpi screen or a
2400 dpi imagesetter, and the result in either case will be the best possible for that device.
.pdf is Portable Document Format, recognized by many publishing programs, including
QuarkXPress and PageMaker. PDF allows you to have total control of the appearance of a
document. PDF uses the same general approach as PostScript but avoids much of
PostScript's complexity.
The following graphic formats are common:

.gif is Graphics Interchange Format. It is used for exchanging eight-bit graphics. GIF is very
popular on the Internet since it compresses quite well and a partial image can be
displayed while it is downloading. GIF can be displayed by most browsers.
.jpg stands for Joint Photographic Experts Group. JPEG is used for high resolution
photographic images. JPEG achieves its impressive compression by selective removal of
information to which the human eye is less sensitive. JPEG can be displayed by most
browsers.
.tif stands for Tagged Image File Format. It is used for working with large, high resolution
images. TIFF is good for exchanging graphics between different systems.
.bmp is a native graphics format for Windows. BMP is not very good for exchanging files
between different systems.
.pct is a native graphics format for Macintosh. PICT is not very good for exchanging files
between different systems.
The following sound formats are common:
.au is often used for exchanging sound data.
.wav is a native sound format for Windows.
.aiff, which stands for Audio Interchange File Format, is a native sound format for Mac.
.mid stands for Musical Instrument Digital Interface. MIDI just stores note names and
instrument types and not recorded sound. That makes MIDI very compact for instrumental
music.
The following video formats are common:
.mov is Apple’s QuickTime Movie format. QuickTime is a native video format for Mac, but
is also very popular on Windows.
.avi stands for Audio Video Interleave. It is a native video format for Windows.
.mpg stands for Moving Picture Experts Group. MPEG offers excellent compression and
high quality but requires external hardware for processing power. Decoding with software is
possible but picture quality is lost.
A browser like Netscape Navigator recognizes a number of data formats like html, gif and
jpg. If the browser encounters a file with a data format that it doesn't understand then four
alternatives are possible.
In the first alternative you can save the file on your hard disk and deal with it later with
some other application.
In the second alternative the browser can start a separate helper application to deal with the
file. The helper application will open up a new window and display the data. You can look
at the data and when you are finished you must quit the helper application and return to
your browser.
In the third alternative the browser starts a plug-in that deals with the data. The
advantage of using a plug-in instead of a helper application is that a plug-in provides in-line
rendering, which means it can display the data within a preassigned rectangular area in the
window of the web browser. In this way the continuity of the browsing experience is not
lost.
The fourth alternative is to load Java applets together with the data to be displayed. The
advantage of using Java applets is that you don't have to install anything on your computer,
as is the case with helper applications and plug-ins. Java applets can also display the data
within a preassigned rectangular area in the window of the web browser.
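The choice between these alternatives can be pictured as a lookup on the file extension. The sketch below is a simplification: the tables of plug-ins and helper applications, the order in which they are tried, and the handler descriptions are assumptions, and the Java applet alternative is left out because it is decided by the web page rather than by the browser's configuration.

```python
from pathlib import PurePosixPath

# Formats the browser understands by itself (taken from the text).
NATIVE = {".html", ".gif", ".jpg"}
# Hypothetical configuration of installed plug-ins and helper applications.
PLUGINS = {".mov": "QuickTime plug-in"}
HELPERS = {".ps": "PostScript viewer"}


def choose_handler(filename: str) -> str:
    """Decide what the browser does with a file, based on its extension."""
    extension = PurePosixPath(filename).suffix.lower()
    if extension in NATIVE:
        return "display in the browser"
    if extension in PLUGINS:
        return f"render in-line with the {PLUGINS[extension]}"
    if extension in HELPERS:
        return f"open in a separate window with the {HELPERS[extension]}"
    return "save to disk"


if __name__ == "__main__":
    for name in ("index.html", "movie.mov", "paper.ps", "data.xyz"):
        print(f"{name}: {choose_handler(name)}")
```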
Everything on the Internet that is of value, like a web page, a file, a newsgroup and so on, is
called a resource. Every resource has an address and this address is called a URL, which stands
for Uniform Resource Locator.
Here is an example of a URL:
A URL consists of four parts. The part before the first colon specifies the protocol. The
next part comes after the two slashes and introduces a host name. The third part is called a
path name and comes after the first single slash and continues until the last single slash. The
path name defines the directory that contains the resource file. The fourth part is the name
of the resource file, and comes after the last single slash.
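The four parts can be separated with Python's standard urllib.parse module. The URL used here is a made-up address on the reserved example.com domain, not the one from the original slide, and the function name and dictionary keys are chosen only for this illustration.

```python
from urllib.parse import urlsplit


def split_url(url: str) -> dict:
    """Split a URL into protocol, host name, path name and file name."""
    parts = urlsplit(url)
    # Everything up to the last single slash is the path name,
    # everything after it is the file name.
    directory, _, filename = parts.path.rpartition("/")
    host = parts.hostname or ""
    return {
        "protocol": parts.scheme,
        "host": host,
        "top_level_domain": host.rsplit(".", 1)[-1],
        "path": directory + "/",
        "file": filename,
    }


if __name__ == "__main__":
    # Hypothetical URL, used only to show the four parts.
    print(split_url("http://www.example.com/docs/internet/index.html"))
```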

Different protocols are used for different services. The most common protocols are:
1. http (Hypertext Transfer Protocol), used for the World Wide Web.
2. ftp (File Transfer Protocol), used for file transfer.
3. news, used for receiving or sending news articles.
4. gopher, used for communication with gopher servers.
5. mailto, used for sending e-mail.
6. telnet, used for remote login.
