KRONE: 800-775-KRONE www.kroneamericas.com www.truenet-system.com
No part of this document may be reproduced without permission ©2001 KRONE, Inc.
The Effect of Errors on TCP
Application Performance
What can errors in Ethernet LANs do to application
performance? This question has arisen time and time again
since the KRONE® TrueNet™ structured cabling system was
launched. The simple answer is that errors can degrade
application performance; the complicated answer is that the
degradation depends on how the errors affect TCP. How TCP
works, and what can happen to TCP in the face of errors, is
the subject of this paper. KRONE enlisted the help of an
independent consultant, Dr. Phil Hippensteel, to study this
topic and provide his evaluation of what Ethernet errors
would do to application performance. The following paper
presents Dr. Hippensteel's findings.
--Editor
In this paper we will discuss the relationship between
errors that occur in networks and their impact on the
performance of applications that run over Transmission
Control Protocol (TCP). While many individuals in the
industry give the impression that they understand
network errors and error detection, and while much has
been published on the performance issues of TCP, few
attempts have been made to tie these two topics
together. However, this topic is important. A vast
number of applications use TCP, including most
transaction-oriented systems and virtually all web-enabled
processes. Our purpose will be to provide some insight
into how applications are affected by errors, particularly
at the physical level, and to illustrate this through case
studies.
We will begin by developing some background. We will
study how a typical TCP message is encapsulated as well
as the differences between connection-oriented
protocols and connectionless-oriented protocols. We will
investigate TCP and the User Datagram Protocol (UDP),
the two protocols nearly all applications use to
communicate. Then, we'll review the three classes of errors
and how they are detected. Once this background has been
introduced, we'll study how TCP operates in some detail.
Some case studies will be used to illustrate these
concepts because they are the most difficult that will be
covered. We will conclude the paper with a summary
and some observations about how to control errors in
your network.
Background Information
Messages sent from application to application in packet
data networks such as LANs and WANs are encapsu-
lated. For example, as a server responds to a client
request, the message flows down through what is
referred to as the protocol stack. This is illustrated in
Figure 1. In most implementations, each of the layers
shown creates one of the headers in the encapsulated
message.
As a specific example, suppose a client device such as a
PC makes a request of a web server to retrieve a web
page. The application program interface (API) Hypertext
Transfer Protocol (HTTP) would formulate the request in
this format:
HTTP Header HTTP Message (response)
Figure 1
This message and header would be given to TCP. TCP
would add its header and pass it to the Internet Protocol
(IP). IP would add its header and pass it to the network
interface card (NIC), for instance an Ethernet card.
Finally, the Ethernet card would be responsible for
sending the total Ethernet frame onto the network.
That frame would look like this:
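As a rough sketch of the nesting just described, the following Python snippet wraps a placeholder HTTP message in one header per layer. The bracketed labels are illustrative stand-ins, not real header contents, and the helper name `add_header` is hypothetical. The trailing field stands for the frame check sequence that the Ethernet card appends.

```python
# Placeholder byte strings; real headers are binary structures.
http_message = b"[HTTP header][HTTP message]"


def add_header(label: str, payload: bytes) -> bytes:
    """Prepend a labelled placeholder header to the payload."""
    return f"[{label} header]".encode() + payload


# Each layer wraps what it receives from the layer above.
tcp_segment = add_header("TCP", http_message)
ip_packet = add_header("IP", tcp_segment)
# The Ethernet card also appends a trailing frame check sequence (FCS).
ethernet_frame = add_header("Ethernet", ip_packet) + b"[FCS]"

print(ethernet_frame.decode())
```

Reading the printed frame from left to right gives the order in which the receiving layers process their parts of the message.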
When this frame arrives at the receiver, each layer
processes its part of the message. In a manner of
speaking, the message filters up through the protocol
stack to the application.
Next, let's focus on the two types of protocols.
Connection-oriented protocols are sometimes called
reliable protocols. This is because they make provisions
for important functions to be conducted; functions that
are essential so that the data will arrive in a form the
receiver can use. The four key functions implemented
by a connection-oriented protocol are: formal
set-up and termination of the session, flow control,
sequencing and error detection. The most widely used
connection-oriented protocol is TCP. By comparison,
connectionless-oriented protocols, such as IP and UDP,
usually make no provision for all of these features to be
implemented. Such protocols are sometimes called
unreliable because they depend on a reliable protocol
to carry out these functions. For example, when IP and
TCP are used together, IP depends on TCP to handle
flow control and sequencing. However, in some cases,
a connectionless protocol will implement one or two of
the functions.
One of the most important concepts in data communi-
cations is how connection and connectionless protocols
are related.
As a data message is created, an assumption is made
that if a connectionless-oriented protocol is used, a
higher layer will be connection-oriented. That is, at the
receiving end, if a lower layer protocol is connectionless,
a higher layer protocol will need to provide the
connection-oriented functions.
As an illustration of this, in the example of the web
client-server interaction above, Ethernet and IP operate
in a connectionless fashion. Therefore, it is assumed
that TCP will deal with the basic features of a connec-
tion-oriented exchange.
Error Detection in Packet Networks
There are three types of error detection schemes that
we will study. The first of these is the detection of an
individual bit that is corrupted. This is usually the
responsibility of the receiving interface card. The
metric that is used to describe the rate at which erred
bits arrive is the bit error rate (BER). When a test is run
to determine the BER, it is called the bit error rate test
(BERT). When the physical level specification for a link
is written, it frequently specifies values such as $10^{-9}$
or $10^{-10}$. If the BER were specified as $10^{-9}$, it
would mean
that the probability that a single bit would arrive in
error is one in one billion. This figure cannot be used
accurately to describe the probability that a data
frame will be corrupted upon arrival. The second
error detection method we will review is specifically
intended to provide a metric for the probability that a
data frame is corrupted. This technique is called the
cyclic redundancy check (CRC). This is a highly reliable
method that has become almost the exclusive means
by which network interface cards check incoming
data frames. Finally, we'll investigate the error
detection technique usually used in software. Most
protocols such as TCP, UDP and IP include such a value
in their header and call it the checksum. A variety of
techniques are used to calculate it but we will review
only one of the most commonly used procedures.
To determine whether a physical link supports the
specified BER, normally a test device sends a stream of
bits to a receiving test device. The receiving test
device knows the pattern that was transmitted and is
therefore able to determine which bits were
corrupted during the transmission. In nearly all
instances, an output screen shows the number of bits
transmitted, the number corrupted and the BER. As
pointed out above, the BER should not be applied to
blocks of data. A simple example should illustrate why
this is the case. Suppose that $p = \mathrm{BER} = 10^{-1}$, or 10%.
Although this is unusually high, it will help illustrate the
point. If a block of just two bits were transmitted, the
likelihood that a block or frame would arrive
corrupted would be:
$$
\begin{aligned}
P(\text{at least one bit is corrupt}) &= 1 - P(\text{no bits are corrupted}) \\
&= 1 - P(\text{both bits arrive uncorrupted}) \\
&= 1 - (1-p)^2 \\
&= 1 - (0.9)(0.9) = 1 - 0.81 = 0.19 \quad (19\%)
\end{aligned}
$$
$P(\,\cdot\,)$ is used to indicate "the probability that." Note
that the probability that a two-bit block will be
corrupted is neither the BER nor twice the BER. In a
similar manner, we can show that the probability that at
least one bit in a three-bit frame would be corrupted
is $1-(1-p)^3 = 1-(0.9)^3 = 0.271$. For interested readers, the
appendix contains a more sophisticated discussion of
the relationship between BER and the block error
rate.
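The same calculation can be generalized to a block of any length. The short Python sketch below assumes, as the example above does, that bit errors are independent and equally likely. The function name `block_error_probability` is ours, chosen for illustration.

```python
import math


def block_error_probability(ber: float, bits: int) -> float:
    """Probability that at least one of `bits` bits is corrupted.

    Computes 1 - (1 - ber)**bits, assuming independent bit errors.
    log1p/expm1 keep the result accurate when ber is very small.
    """
    return -math.expm1(bits * math.log1p(-ber))


# The two examples worked in the text (p = 0.1).
two_bit = block_error_probability(0.1, 2)    # 1 - 0.9**2
three_bit = block_error_probability(0.1, 3)  # 1 - 0.9**3

# A 1518-byte Ethernet frame on a link specified at a BER of 1e-9.
full_frame = block_error_probability(1e-9, 1518 * 8)

print(two_bit, three_bit, full_frame)
```

For realistic BER values the result is close to the BER multiplied by the number of bits, which is why longer frames are more likely to be corrupted than shorter ones on the same link.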
The calculation of the block error rate for local area
networks such as Token Ring, Ethernet and FDDI uses
the cyclic redundancy check algorithm. Figure 2
illustrates how it works:
Suppose a frame is to be transmitted by an Ethernet NIC
in a client station across a twisted pair link to a hub and
on to the NIC of a server. The NIC in the client will divide
the entire message by a fixed bit pattern (the degree-32
generator polynomial dictated in the IEEE 802.3
specification). It will ignore the quotient but
insert the remainder as the last field in the Ethernet
frame. This field is called the frame check sequence. It
can be established mathematically that the remainder is
always 32 bits (four bytes) in length. When the frame
arrives, the server NIC will repeat the division using the
same IEEE specified divisor. It subtracts the received
remainder from the calculated remainder to determine
if the result is zero. If it is zero, the transmitted and
calculated remainders must have been the same.
Consequently, it concludes that the transmitted message
had no errors. Otherwise, the NIC will record in a
register on the card that the frame was received with
an error. It is the responsibility of the NIC driver to react
to this indication. In almost all cases, the driver will
discard the frame. However, it may report this error to
the layer three protocol. This issue is protocol depen-
dent and we will discuss it later in this paper.
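The send-and-verify cycle described above can be sketched with the CRC-32 routine in Python's standard `zlib` module, which uses the same generator polynomial as IEEE 802.3. This is an illustration only: a real NIC performs the check in hardware, the helper names are hypothetical, and the byte order chosen here for the appended field is an assumption.

```python
import zlib


def append_fcs(message: bytes) -> bytes:
    """Sender side: compute CRC-32 and append it as a 4-byte trailer."""
    fcs = zlib.crc32(message)
    return message + fcs.to_bytes(4, "little")  # byte order is an assumption


def frame_is_intact(frame: bytes) -> bool:
    """Receiver side: recompute CRC-32 and compare with the trailer."""
    message, trailer = frame[:-4], frame[-4:]
    return zlib.crc32(message) == int.from_bytes(trailer, "little")


frame = append_fcs(b"GET /index.html HTTP/1.0\r\n\r\n")

# Simulate a single corrupted bit on the link.
corrupted = bytearray(frame)
corrupted[0] ^= 0x01

print(frame_is_intact(frame))             # intact frame passes the check
print(frame_is_intact(bytes(corrupted)))  # corrupted frame fails the check
```

A driver acting on the second result would normally discard the frame, which is the behavior the rest of this paper builds on.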
CRC is highly reliable. For example, the technique used
in Ethernet and Token Ring, called CRC-32, will detect:
· All single bit errors.
· All double bit errors.
· All odd-numbered bit errors.
· All burst errors with length under 32.
· Most bit errors with length 32 or over.
This last bullet is very conservative. The probability of a
burst error of length 32 or over is extremely small, so it is
virtually impossible for this to occur. Consequently, as a
matter of practice, think of CRC-32 as practically
infallible.
As previously indicated, calculating a checksum may
follow one of many specific procedures. However, most
follow a pattern that involves doing some form of parity
checking with redundancy. For example, in conventional
parity check techniques, the sender sends a group of bits
and adjusts the last bit in the block so that the total
number of one bits in the block is either odd or even,
whichever has been agreed upon in advance by the
transmitting and receiving stations. We can visualize the
message listed character by character vertically, like this:
Character bits
Character 1: 1 0 0 0 0 0 0 0
Character 2: 0 0 0 0 0 1 1 1
Character 3: 1 0 1 0 0 0 0 1
Character 4: 0 0 0 1 0 1 0 1
Character 5: 0 0 1 0 0 1 0 1
Character 6: 0 0 0 1 0 0 0 0
Character 7: 0 0 1 1 1 1 0 1
Character 8: 0 0 0 1 0 1 1 0
In this illustration odd parity is being used in the rows. So
the sender adjusts the eighth position to make sure that
each row contains an odd number of ones. For example,
in row one, the character is represented by the pattern
1000000. Since that contains a single one, which is an
odd number, the eighth position is made zero. Hence, the
full eight-bit pattern still contains an odd number of ones.
However, in row two, the character is represented by the
pattern 0000011. Since that pattern contains two ones,
which is an even number of ones, the eighth position is
adjusted to one, giving a total of an odd number of ones
in the pattern. Today parity checking is rarely used as the
sole error detection scheme except in low speed links
such as dial-up lines. Parity checking is too easy to
defeat; if two bits in a character are inverted during
transmission, the parity is unchanged and the scheme
fails to detect the error.
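The row parity rule, and the way a two-bit error slips past it, can be sketched in a few lines of Python. The function names are hypothetical, and the bit patterns are the first two rows of the illustration.

```python
def odd_parity_bit(bits7: str) -> str:
    """Return the eighth bit that makes the number of ones odd."""
    ones = bits7.count("1")
    return "0" if ones % 2 == 1 else "1"


def has_odd_parity(bits8: str) -> bool:
    """Receiver check: a valid character holds an odd number of ones."""
    return bits8.count("1") % 2 == 1


def flip(bits: str, *positions: int) -> str:
    """Invert the bits at the given positions to simulate line errors."""
    chars = list(bits)
    for i in positions:
        chars[i] = "1" if chars[i] == "0" else "0"
    return "".join(chars)


row1 = "1000000" + odd_parity_bit("1000000")  # one 1, so parity bit is 0
row2 = "0000011" + odd_parity_bit("0000011")  # two 1s, so parity bit is 1

print(row1, row2)
print(has_odd_parity(flip(row2, 0)))     # single-bit error is caught
print(has_odd_parity(flip(row2, 0, 1)))  # two-bit error goes unnoticed
```

The last line is the weakness described above: inverting two bits leaves the count of ones odd, so the character still looks valid.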
In order to add redundancy to the parity scheme, an
additional algorithm is introduced. Before the sender
transmits this block of eight characters, it will perform a
binary addition down the columns using a technique
called 1's-complement sum. It will invert that result
(switch zeroes and ones), and insert the result as the
block check character (BCC). We do not need to be
concerned with the details of the 1's-complement sum
process here. By appending the BCC to the block, we will
have the result shown below. The calculated BCC is
inserted in the checksum field in the data packet and
transmitted to the receiving station.
Character 1: 1 0 0 0 0 0 0 0
Character 2: 0 0 0 0 0 1 1 1
Character 3: 1 0 1 0 0 0 0 1
Character 4: 0 0 0 1 0 1 0 1
Character 5: 0 0 1 0 0 1 0 1
Character 6: 0 0 0 1 0 0 0 0
Character 7: 0 0 1 1 1 1 0 1
Character 8: 0 0 0 1 0 1 1 0
BCC:         0 0 1 1 1 0 0 1
When the receiver gets the message block, it will
perform a similar calculation. While this technique isn't
as foolproof as CRC-32, it is highly reliable. Normally,
additional fields such as network addresses are appended
to the message. Then the checksum calculation is
performed on the complete message block.
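For readers who do want to see the arithmetic, the sketch below reproduces the BCC from the eight characters in the illustration. It assumes each row is treated as an 8-bit value and that any carry out of the top bit is added back into the low end (the "end-around carry" of a 1's-complement sum). The function names are hypothetical.

```python
BLOCK = [
    "10000000", "00000111", "10100001", "00010101",
    "00100101", "00010000", "00111101", "00010110",
]


def ones_complement_sum8(values) -> int:
    """Add 8-bit values, folding any carry back into the low bits."""
    total = 0
    for value in values:
        total += value
        total = (total & 0xFF) + (total >> 8)  # end-around carry
    return total


def block_check_character(rows) -> str:
    """Sender: invert the 1's-complement sum of the rows."""
    total = ones_complement_sum8(int(row, 2) for row in rows)
    return format(total ^ 0xFF, "08b")


def block_is_valid(rows, bcc: str) -> bool:
    """Receiver: rows plus BCC must sum to all ones."""
    total = ones_complement_sum8(int(row, 2) for row in list(rows) + [bcc])
    return total == 0xFF


bcc = block_check_character(BLOCK)
print(bcc)  # 00111001, matching the BCC row in the illustration

damaged = ["00000000"] + BLOCK[1:]  # first character lost its leading 1
print(block_is_valid(BLOCK, bcc), block_is_valid(damaged, bcc))
```

The receiver-side check relies on the fact that a value added to its own inverse gives all ones in 1's-complement arithmetic.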
There are some interesting observations we can make
at this point. First, the protocols implemented in
software may calculate a checksum and transmit it
whether or not they are connection-oriented. Also, the
checksum doesn't need to be quite as robust as CRC-32
because it is almost always a redundant check. The
receiver assumes that the NIC or its driver would have
dropped the frame had it contained an error. The IPX
protocol is so confident that CRC-32 will detect the
error that it sets the checksum field to all ones as a
means of disabling it.
We are particularly interested in TCP/IP environments,
so how are these techniques used in such an
environment? Ethernet NICs use CRC-32. If the receiver
successfully passes the data frame to the IP protocol, IP
will calculate the checksum value based only on the IP
header. Since IP is connectionless, the assumption is
made that some other protocol will be concerned with
the integrity of the data field (which will be inspected
by TCP or UDP). On the other hand, TCP is
connection-oriented. So, the application is dependent on
TCP to provide error-free data. Therefore, TCP uses the IP
addresses, the protocol identifier field, the segment
length field, the TCP header, and the data field as the
basis for its checksum calculation. This combined set of
fields consists of the pseudo-header, the TCP header and
the data.
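A minimal Python sketch of this calculation for IPv4 follows. It uses the 16-bit 1's-complement sum that TCP, UDP and IP share, and builds the pseudo-header from the source and destination addresses, a zero byte, the protocol identifier (6 for TCP) and the segment length. The addresses, port numbers and function names are made up for illustration; the TCP header itself is part of the segment being summed.

```python
import socket
import struct


def ones_complement_sum16(data: bytes) -> int:
    """Sum 16-bit words with end-around carry, padding odd lengths."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)
    return total


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over the pseudo-header plus the TCP header and data."""
    pseudo_header = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, 6, len(segment))  # zero, protocol, length
    )
    return ~ones_complement_sum16(pseudo_header + segment) & 0xFFFF


def build_segment(checksum: int, data: bytes) -> bytes:
    """A 20-byte TCP header (illustrative field values) followed by data."""
    header = struct.pack("!HHIIBBHHH", 1025, 80, 17234, 0,
                         5 << 4, 0x18, 8192, checksum, 0)
    return header + data


# Sender: compute the checksum with the checksum field set to zero.
data = b"hello"
checksum = tcp_checksum("192.0.2.1", "192.0.2.2", build_segment(0, data))
segment = build_segment(checksum, data)

# Receiver: repeating the calculation over the full segment gives zero.
print(tcp_checksum("192.0.2.1", "192.0.2.2", segment) == 0)
```

Because the IP addresses are part of the sum, a segment delivered to the wrong host fails the check even if its own bytes are undamaged.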
Recall our conclusion from near the beginning of the
paper: connectionless-oriented protocols depend on
connection-oriented protocols to provide the
reliability that they cannot. System designers assume
that if a protocol is connectionless, a connection-
oriented protocol will process the packet to provide
the integrity that the receiving application expects.
This is the case with error checking. Refer again to
Figure 1; in a situation where a TCP application is
being used, if a frame error occurs at the Ethernet
layer in the physical network, TCP will have the
responsibility to deal with detecting this and taking
appropriate action. In the section on TCP performance,
we'll deal with how this occurs. Specifically, if
the data frame arrives and is dropped by the NIC
driver, IP will never know it. TCP will have to discover
that the frame was dropped and react appropriately.
While a study of real-time applications, such as IP
voice and video, is beyond the scope of this paper, in
such an environment the responsibility for responding
to errors is pushed even higher. Such applications
often replace TCP with UDP. UDP is connectionless,
so reacting to Ethernet errors would be handled by
a higher-level protocol, either the application program
interface or the application.
TCP Operation
Since TCP is a connection-oriented protocol, let's
review some of the key functions it is responsible for
implementing. These include:
· Formally establishing and terminating a session
between the two applications that wish to
communicate.
· Providing flow control so that neither application
will transmit information faster than its partner
can receive and process it; i.e., the sender should
not overfill the receiver's buffer.
· Providing a means by which the packets can be
sequenced in order that they can be correctly
organized for delivery at the receiving end.
· Detecting that errors have occurred in the
network and reacting appropriately.
· Adjusting to conditions in the network to allow for
a high level of throughput.
We'll discuss each of these in some detail. However,
first we need to actually look at the fields in the TCP
header. These are shown in Figure 3:
Figure 3
Since we need to understand most of the fields in the
TCP header, we will review all of them for the sake of
completeness. The combination of the TCP header and
its data is called a TCP segment. Port numbers are used
to identify the sending and receiving application
processes. They are not physical ports, rather they are
identifiers that indicate the specific application program
that sent or is to receive the TCP data. Port numbers up
to 1023 are referred to as well-known. This simply
means they are registered with the Internet committee
responsible for tracking Internet numbers; it also means
that everyone agrees on the meaning of these numbers.
For example, 25 is the port number for SMTP, the Simple
Mail Transfer Protocol. Port 80 is used for the web protocol
HTTP, mentioned earlier. The server end of a connection
will normally use a well-known port. The client will use
an arbitrary port number. The port numbers from
1024 through 65,535 are available to be defined as
necessary and are referred to as non-well-known ports.
The sequence number is used to make it possible for the
receiver to determine the order in which the TCP
segments were sent. It is assumed that this is the order
in which they are to be delivered. However, it is possible
that the packets may have followed different physical
paths through the packet network. While a detailed
analysis of a packet arriving out of sequence is beyond
the scope of this paper, later we will explain essentially
how TCP deals with it in the discussion of Figure 7.
Also, some packets may get lost or become corrupted
and therefore would be dropped by a NIC along the
way. The sequence numbers can also be used by the
receiver to detect missing segments. When a transmit-
ting TCP entity starts to deliver data bytes, it will include
a value (up to thirty-two bits in length) called the initial
sequence number (ISN) in the sequence number field.
This value is the number it associates with the first byte
in the transmitted segment. For example, suppose there
are 10,000 bytes to be transmitted and the ISN is
17,234. If 100 bytes are sent in the first segment, the
last byte would correspond to the tag 17,333.
The next segment sent will contain the sequence
number 17,334, exactly one hundred higher than the
previous sequence number. In this way, the receiver
can use the sequence numbers to determine how
many bytes were sent in the first segment. The
receiver can also tell the order of the segments since
the sequence numbers always increase in value. If a
receiver subtracts the first sequence number it
receives from the second sequence number it
receives, the data field of the first segment should
contain that number of bytes. If it does not, the
receiver knows that one or more segments were lost
or dropped.
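The arithmetic in this example can be sketched in Python as follows. The function names are hypothetical. The modulus reflects the thirty-two bit width of the sequence number field, which eventually wraps around to zero on a long transfer.

```python
SEQ_MODULUS = 2 ** 32  # the sequence number field is 32 bits wide


def next_sequence_number(seq: int, data_length: int) -> int:
    """Sequence number of the segment that follows this one."""
    return (seq + data_length) % SEQ_MODULUS


def missing_bytes(first_seq: int, first_length: int, second_seq: int) -> int:
    """Bytes unaccounted for between two segments received in a row."""
    return (second_seq - first_seq - first_length) % SEQ_MODULUS


isn = 17234
print(next_sequence_number(isn, 100))  # 17334, as in the example above
print(missing_bytes(isn, 100, 17334))  # 0: nothing was lost
print(missing_bytes(isn, 100, 17434))  # 100: one 100-byte segment is missing
```

A non-zero result only tells the receiver that bytes are missing at that moment; whether they were dropped or are simply arriving late is a separate question.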
The acknowledgement number is returned to the
sender to indicate the sequence number it expects to
receive next. In the illustration in the previous
paragraph, the receiver would return the
acknowledgement number 17,334 to indicate that it
received the frame with sequence number 17,234
and 100 bytes of data. Figure 4 shows how this is
often illustrated:
Figure 4
The first packet will contain 100 bytes of data. The
returned packet containing the acknowledgement
number may or may not contain data. Usually it does
not. In other words, most often the
acknowledgement segment will contain only a TCP
header encapsulated in IP and the data link informa-
tion such as Ethernet.
The field labeled Head Length in Figure 3 indicates
how many 32-bit blocks (rows in the diagram) are in
the TCP header. By reading this field the receiver can
determine whether the options block(s) contain any
information. The options block is the last row in the
diagram, just before the data row. Normally, the
header length has the value 5 (meaning no options
are included). The reserved field is not used. There
are six code bits. Each is read independently and
acts as an on-off switch to indicate a certain state.
We are interested in only the last two bits called the
SYN and FIN bits, respectively. Consequently,