Practical TCP/IP and Ethernet Networking

7 Host-to-host (transport) layer protocols
Objectives
When you have completed this chapter you should be able to:
• Explain the basic functions of the host-to-host layer
• Explain the basic operation of TCP and UDP
• Explain the fundamental differences between TCP and UDP
• Decide which protocol (TCP or UDP) to use for a particular application
• Explain the meaning of each field in the TCP and UDP headers

The host-to-host communications layer (also referred to as the service layer, or as the
transport layer in terms of the OSI model) is primarily responsible for ensuring
end-to-end delivery of packets transmitted by the Internet protocol (IP). This additional
reliability is needed to compensate for the lack of reliability in IP.
There are only two relevant protocols residing in the host-to-host communications
layer, namely TCP (transmission control protocol) and UDP (user datagram protocol). In
addition, the host-to-host layer includes the APIs (application programming
interfaces) used by programmers to gain access to these protocols from the
process/application layer.


Figure 7.1

TCP and UDP within the ARPA model
7.1 TCP (transmission control protocol)
7.1.1 Basic functions
TCP is a connection-oriented protocol and is therefore reliable, although this word is used
in a data communications context and not in an everyday sense. TCP establishes a
connection between two hosts before any data is transmitted. Because a connection is set
up beforehand, it is possible to verify that all packets are received on the other end and to
arrange retransmission in the case of lost packets. Because of all these built-in functions,
TCP involves significant additional overhead in terms of processing time and header size.
TCP includes the following functions:
• Segmentation of large chunks of data into smaller segments that can be
accommodated by IP. The word ‘segmentation’ is used here to differentiate
it from the ‘fragmentation’ performed by IP
• Data stream reconstruction from packets received
• Receipt acknowledgment
• Socket services for providing multiple connections to ports on remote hosts
• Packet verification and error control
• Flow control
• Packet sequencing and reordering

In order to achieve its intended goals, TCP makes use of ports and sockets,
connection-oriented communication, sliding windows, and sequence numbers/acknowledgments.


7.1.2 Ports
Whereas IP can route the message to a particular machine on the basis of its IP address,
TCP has to know for which process (i.e. software program) on that particular machine it
is destined. This is done by means of 16-bit port numbers ranging from 1 to 65 535
(port 0 is reserved).
Port numbers are controlled by IANA (the Internet Assigned Numbers Authority) and
can be divided into three groups.
Well known ports, ranging from 1 to 1023, have been assigned by IANA and are
globally known to all TCP users. For example, HTTP uses port 80.
Registered ports are registered by IANA in cases where the port number cannot be
classified as well known, yet it is used by a significant number of users. Examples are
port numbers registered for Microsoft Windows or for specific types of PLCs. These
numbers range from 1024 to 49 151, the upper bound being one less than 49 152, which
is 75% of 65 536.
A third class of port numbers is known as ephemeral ports (IANA calls them dynamic or
private ports). These range from 49 152 to 65 535 and can be used by anyone on an
ad hoc basis.
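The three groups can be expressed as a simple range check. The following is a minimal sketch, assuming the IANA boundaries 1023/1024 and 49 151/49 152; the function name port_class is purely illustrative.

```python
def port_class(port: int) -> str:
    """Return the group that a TCP/UDP port number falls into."""
    # Port 0 is reserved, so only 1..65535 is accepted here.
    if not 1 <= port <= 65535:
        raise ValueError("usable port numbers range from 1 to 65535")
    if port <= 1023:
        return "well known"
    if port <= 49151:
        return "registered"
    return "ephemeral"


for p in (80, 1024, 49151, 49152, 65535):
    print(p, port_class(p))
```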
7.1.3 Sockets
In order to identify both the location and the application to which a particular packet is to be
sent, the IP address (location) and port number (process) are combined into a functional
address called a socket. The IP address is contained in the IP header and the port number
is contained in the TCP or UDP header.
In order for any data to be transferred under TCP, a socket must exist both at the source
and at the destination. TCP is also capable of creating multiple sockets to the same port.
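To illustrate how several connections can share one server port, a socket can be modelled as an (IP address, port) pair and a connection as a pair of sockets. This is only a sketch: the type names and the addresses are hypothetical, and no real network traffic is involved.

```python
from typing import NamedTuple


class Socket(NamedTuple):
    ip: str    # location, carried in the IP header
    port: int  # process, carried in the TCP or UDP header


class Connection(NamedTuple):
    local: Socket
    remote: Socket


# Hypothetical addresses, chosen only for illustration.
server = Socket("192.168.0.10", 80)
conn_a = Connection(Socket("192.168.0.20", 49200), server)
conn_b = Connection(Socket("192.168.0.20", 49201), server)

# Both connections end at the same server socket, yet they remain
# distinguishable because the client-side sockets differ.
print(conn_a.remote == conn_b.remote, conn_a != conn_b)
```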
7.1.4 Sequence numbers
A fundamental notion in the TCP design is that every BYTE of data sent over the TCP
connection has a unique 32-bit sequence number. Of course this number cannot be sent
along with every byte, yet it is nevertheless implied. Only the sequence number of
the FIRST byte in each segment is included in the accompanying TCP header; for each
subsequent byte that number is simply incremented by the receiver in order to keep track
of the bytes.
Before any data transmission takes place, both sender and receiver (e.g. client and
server) have to agree on the initial sequence numbers (ISNs) to be used. This process is
described under ‘establishing a connection’.
Since TCP supports full duplex operation, both client and server will decide on their
initial sequence numbers for the connection, even though data may only flow in one
direction for that specific connection.
The sequence number, for obvious reasons, cannot start at 0 every time, as it will create
serious problems in the case of short-lived multiple sequential connections between two
machines. A packet with a sequence number from an earlier connection could easily
arrive late, during a subsequent connection. The receiver will have difficulty in deciding
whether the packet belongs to a former or to the current connection. It is easy to visualize
a similar problem in real life. Imagine tracking a parcel carried by UPS if all UPS agents
started issuing tracking numbers beginning with 0 every morning.

The sequence number is generated by means of a 32-bit software counter that starts at 0
during boot-up and increments at a rate of about once every 4 microseconds (although
this varies depending on the operating system being used). When TCP establishes a
connection, the value of the counter is read and used as the initial sequence number. This
creates an apparently random choice of the initial sequence number.
At some point during a connection the 32-bit sequence number could roll over from
4 294 967 295 (2^32 − 1) and start counting from 0 again. The TCP software takes care of this.
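The counter-based choice of the initial sequence number can be sketched as follows. This is a simplified model that assumes an idealized counter ticking exactly once every 4 microseconds; the function name isn_from_uptime is illustrative, and real operating systems differ in detail.

```python
COUNTER_BITS = 32
TICK_US = 4  # assumed tick period in microseconds


def isn_from_uptime(uptime_us: int) -> int:
    """Value of a 32-bit counter that started at 0 at boot-up."""
    return (uptime_us // TICK_US) % (1 << COUNTER_BITS)


# Time taken by the counter to wrap around to 0 again.
rollover_hours = TICK_US * (1 << COUNTER_BITS) / 1e6 / 3600

print(isn_from_uptime(1_000_000))  # counter value one second after boot-up
print(round(rollover_hours, 2))    # roughly 4.77 hours
```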
7.1.5 Acknowledgment numbers
TCP acknowledges data received on a PER SEGMENT basis, although several
consecutive segments may be acknowledged at the same time.
The acknowledgment number returned to the sender to indicate successful delivery
equals the number of the last byte received +1, hence it points to the next expected
sequence number. For example: 10 bytes are sent, with sequence number 33. This means
that the first byte is numbered 33 and the last byte is numbered 42. If received
successfully, an acknowledgment number (ACK) of 43 will be returned. The sender now
knows that the data has been received properly, as it agrees with that number.
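The arithmetic in this example can be written down directly. The sketch below uses an illustrative helper, ack_for, and applies modulo 2^32 because sequence numbers are 32-bit values.

```python
SEQ_MODULUS = 1 << 32  # sequence numbers are 32-bit values


def ack_for(seq: int, length: int) -> int:
    """Acknowledgment number for `length` bytes starting at `seq`."""
    return (seq + length) % SEQ_MODULUS


print(ack_for(33, 10))  # bytes 33..42 received, so 43 is expected next
```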
Basic TCP does not issue selective acknowledgments (these exist only as an option,
described in RFC 2018), so if a specific segment contains errors,
the acknowledgment number returned to the sender will point to the first byte in the
defective segment. This implies that the segment starting with that sequence number, and
all subsequent segments (even though they may have been transmitted successfully) have
to be retransmitted.
From the previous paragraph it should be clear that a duplicate acknowledgment
received by the sender means that there was an error in the transmission of one or more
bytes starting at that particular sequence number.
Please note that the sequence number and the acknowledgment number in one header
are NOT related at all. The former relates to outgoing data, the latter refers to incoming
data. During the connection establishment phase the sequence numbers for both hosts are
set up independently, hence these two numbers will normally bear no resemblance to each
other.

7.1.6 Sliding windows
Some form of acknowledgment is needed in order to guarantee delivery. The technique
used, called positive acknowledgment with
retransmission, requires the receiver to send back an acknowledgment message within a
given time. The transmitter starts a timer so that if no response is received from the
destination node within that time, another copy of the message will be transmitted.
An example of this situation is given in Figure 7.2.

Figure 7.2

Positive acknowledgment philosophy

TCP uses the sliding window form of positive acknowledgment, as it is very
time-consuming to wait for an individual acknowledgment to be returned for each packet
transmitted. The idea is that a number of packets (with a cumulative number of bytes
not exceeding the window size) are transmitted before the source receives an
acknowledgment to the first message (due to time delays, etc). As long as
acknowledgments are received, the window slides along and the next packet is
transmitted.
During the TCP connection phase each host will inform the other side of its permissible
window size. For example, for Windows 95/98 this is typically 8 KB (8192 bytes).
This means that, using Ethernet, 5 full data frames comprising 5 × 1460 = 7300 bytes can
be sent without acknowledgment. At this stage the window has shrunk to
8192 − 7300 = 892 bytes, less than one full frame, which means that unless an ACK is
received, the sender will have to pause its transmission.
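The window arithmetic can be checked with a few lines of code. This sketch assumes that only full-sized segments are sent and that no acknowledgment arrives in the meantime; the function name is illustrative.

```python
from typing import Tuple


def segments_before_pause(window: int, segment_size: int) -> Tuple[int, int]:
    """Return (full segments that fit in the window, bytes of window left)."""
    full_segments = window // segment_size
    remaining = window - full_segments * segment_size
    return full_segments, remaining


print(segments_before_pause(8192, 1460))  # (5, 892)
```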
7.1.7 Establishing a connection
A three-way SYN/SYN_ACK/ACK handshake (as indicated in Figure 7.3) is used to
establish a TCP connection. As this is a full duplex protocol it is possible (and necessary)
for a connection to be established in both directions at the same time.


Figure 7.3


TCP connection establishment

As mentioned before, TCP generates pseudo-random sequence numbers by means of a
32-bit software counter that resets at boot-up and then increments every 4 microseconds.
The host establishing the connection reads a value ‘x’ from the counter (where x can vary
between 0 and 2^32 − 1) and inserts it in the sequence number field. It then sets the SYN
flag = 1 and transmits the header (no data yet) to the appropriate IP address and port
number. Assuming that the chosen sequence number was 132, this action would then be
abbreviated as SYN 132.
The receiving host (e.g. the server) acknowledges this by incrementing the received
sequence number by one, and sending it back to the originator as an acknowledgment
number. It also sets the ACK flag = 1 to indicate that this is an acknowledgment. This
results in an ACK 133. The first byte expected would therefore be numbered 133. At the
same time the server obtains its own sequence number (y), inserts it in the header, and
also sets the SYN flag in order to establish a connection in the opposite direction. The
header is then sent off to the originator (the client), conveying the message e.g. SYN 567.
The composite ‘message’ contained within the header would thus be ACK 133, SYN 567.
The originator receives this, notes that its own request for a connection has been
complied with, and also acknowledges the other node’s request with an ACK 568. Two-
way communication is now established.
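The exchange can be traced with a small model of the three headers. This is a sketch only: the Header class and the function three_way_handshake are hypothetical names, and nothing is sent over a network.

```python
from dataclasses import dataclass
from typing import List, Optional

SEQ_MODULUS = 1 << 32


@dataclass
class Header:
    syn: bool = False
    ack: bool = False
    seq: Optional[int] = None
    ack_num: Optional[int] = None


def three_way_handshake(client_isn: int, server_isn: int) -> List[Header]:
    """Return the three headers exchanged while opening a connection."""
    # 1. Client to server: SYN x
    syn = Header(syn=True, seq=client_isn)
    # 2. Server to client: ACK x+1, SYN y
    syn_ack = Header(syn=True, ack=True, seq=server_isn,
                     ack_num=(syn.seq + 1) % SEQ_MODULUS)
    # 3. Client to server: ACK y+1
    final_ack = Header(ack=True, seq=syn_ack.ack_num,
                       ack_num=(syn_ack.seq + 1) % SEQ_MODULUS)
    return [syn, syn_ack, final_ack]


for header in three_way_handshake(132, 567):
    print(header)
```

With the values used in the text, the second header carries ACK 133 and SYN 567, and the third carries ACK 568.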
7.1.8 Closing a connection
An existing connection can be terminated in several ways.
Firstly, one of the hosts can request to close the connection by setting the FIN flag. The
other host can acknowledge this with an ACK, but does not have to close immediately as
it may need to transmit more data. This is known as a half-close. When the second host is
also ready to close, it will send a FIN that is acknowledged with an ACK. The resulting
situation is known as a full close.
Secondly, either of the nodes can terminate its connection with the issue of RST,
resulting in the other node also relinquishing its connection and (although not necessarily)
responding with an ACK.
Both situations are depicted in Figure 7.4.


Figure 7.4
Closing a connection
7.1.9 The push operation
TCP normally breaks the data stream into what it regards as appropriately sized
segments, based on some definition of efficiency. However, this may not be swift enough
for an interactive keyboard application. Hence the push instruction (PSH bit in the code
field), used by the application program, forces delivery of the bytes currently in the stream,
and the data will be immediately delivered to the process at the receiving end.
7.1.10 Maximum segment size
Both the transmitting and receiving nodes need to agree on the maximum size segments
they will transfer. This is specified in the options field.
On the one hand TCP ‘prefers’ IP not to perform any fragmentation as this leads to a
reduction in transmission speed due to the fragmentation process, and a higher probability
of loss of a packet and the resultant retransmission of the entire packet.
On the other hand, there is an improvement in overall efficiency if the data packets are
not too small and a maximum segment size is selected that fills the physical packets that
are transmitted across the network. The current specification recommends a maximum
segment size of 536 bytes (this is the 576-byte default size of an X.25 frame minus 20 bytes
each for the IP and TCP headers). If the size is not correctly specified, for example too
small, the framing bytes (headers etc) consume most of the packet size, resulting in
considerable overhead. Refer to RFC 879 for a detailed discussion on this issue.
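The effect of segment size on overhead can be estimated numerically. The sketch below assumes 20-byte IP and TCP headers without options and ignores link-layer framing; the function names are illustrative.

```python
IP_HEADER = 20   # bytes, assuming no IP options
TCP_HEADER = 20  # bytes, assuming no TCP options


def default_mss(datagram_size: int = 576) -> int:
    """Segment size left after subtracting the IP and TCP headers."""
    return datagram_size - IP_HEADER - TCP_HEADER


def header_overhead(payload: int) -> float:
    """Fraction of each packet that is taken up by the two headers."""
    headers = IP_HEADER + TCP_HEADER
    return headers / (payload + headers)


print(default_mss())  # 536
for size in (10, 40, 536, 1460):
    print(size, f"{header_overhead(size):.1%}")
```

A 40-byte payload, for instance, spends half of every packet on headers.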
7.1.11 The TCP frame
The TCP frame consists of a header plus data and is structured as follows:

Figure 7.5
TCP frame format
The various fields within the header are as follows:
Source port: 16 bits
The source port number.
Destination port: 16 bits
The destination port number.
Sequence number: 32 bits
The sequence number of the first data byte in the current segment, except when the
SYN flag is set. If the SYN flag is set, a connection is still being established and the
sequence number in the header is the initial sequence number (ISN). The first subsequent
data byte is ISN+1.
Refer to the discussion on sequence numbers.
Acknowledgment number: 32 bits
If the ACK flag is set, this field contains the value of the next sequence number the
sender of this message is expecting to receive. Once a connection is established this is
always sent.
Refer to the discussion on acknowledgment numbers.
Data offset: 4 bits
The number of 32-bit words in the TCP header. (Similar to IHL in the IP header.) This
indicates where the data begins. The TCP header (even one including options) is always
an integral number of 32-bit words long.
Reserved: 6 bits
Reserved for future use. Must be zero.
Control bits (flags): 6 bits
(From left to right)
URG: Urgent pointer field significant
ACK: Acknowledgment field significant
PSH: Push function
RST: Reset the connection
SYN: Synchronize sequence numbers
FIN: No more data from sender
Checksum: 16 bits
The checksum field is the 16-bit one’s complement of the one’s complement sum of all
16-bit words in the header and text. If a segment contains an odd number of header and
text octets to be check-summed, the last octet is padded on the right with zeros to form a
16-bit word for checksum purposes. The pad is not transmitted as part of the segment.
While computing the checksum, the checksum field itself is replaced with zeros.
This is known as the standard Internet checksum, and is the same as the one used for
the IP header.
The checksum also covers a 96-bit ‘pseudo header’ conceptually prefixed to the TCP
header. This pseudo header contains the source IP address, the destination IP address, the
protocol number (06), and the TCP length. It must be emphasized that this pseudo header is
only used for computation purposes and is NOT transmitted. This gives TCP protection
against misrouted segments.
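The checksum calculation, including the pseudo header, can be sketched as follows. This is a minimal illustration for IPv4 rather than a production implementation. The function names are hypothetical, and the pseudo header is assumed to be laid out as source address, destination address, one zero byte, the protocol number and the 16-bit TCP length (12 bytes in total).

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad on the right; the pad is never transmitted
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        # Fold the carries back into the low 16 bits (end-around carry).
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over the 96-bit pseudo header plus the TCP header and data.

    The checksum field inside `segment` must be zero when computing.
    """
    pseudo_header = struct.pack(
        "!4s4sBBH",
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
        0,             # zero byte
        6,             # protocol number for TCP
        len(segment),  # TCP length: header plus data
    )
    return internet_checksum(pseudo_header + segment)
```

A receiver can verify a segment by running the same calculation over the segment as received, checksum field included: an undamaged segment yields 0.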


Figure 7.6
Pseudo TCP header format
Window: 16 bits
The number of data octets beginning with the one indicated in the acknowledgement
field, which the sender of this segment is willing or able to accept.
Refer to the discussion on sliding windows.

Urgent pointer: 16 bits
Urgent data is placed at the beginning of a frame, and the urgent
pointer points at the last byte of urgent data (relative to the sequence number, i.e. the
number of the first byte in the frame). This field is only interpreted in segments
with the URG control bit set.
Options: variable
Options may occupy space at the end of the TCP header and are a multiple of
8 bits in length. All options are included in the checksum.
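The fixed 20-byte part of the header can be unpacked with a few lines of code, which also shows how the data offset and the six control bits share one 16-bit word. This is a sketch that assumes the field widths listed above, with a 16-bit urgent pointer; parse_tcp_header and the dictionary keys are illustrative names.

```python
import struct

# Control bits from left to right, as listed above.
FLAG_NAMES = ("URG", "ACK", "PSH", "RST", "SYN", "FIN")


def parse_tcp_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header."""
    (src_port, dst_port, seq, ack_num,
     offset_and_flags, window, checksum, urgent) = struct.unpack(
        "!HHIIHHHH", raw[:20])
    data_offset = offset_and_flags >> 12  # top 4 bits: header length in words
    flag_bits = offset_and_flags & 0x3F   # bottom 6 bits: control bits
    flags = [name for i, name in enumerate(FLAG_NAMES)
             if flag_bits & (0x20 >> i)]
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack_num": ack_num,
        "header_bytes": data_offset * 4,  # where the data begins
        "flags": flags,
        "window": window,
        "checksum": checksum,
        "urgent": urgent,
    }


# A hand-built SYN header: sequence number 132, data offset 5, SYN bit set.
raw = struct.pack("!HHIIHHHH", 49200, 80, 132, 0, (5 << 12) | 0x02, 8192, 0, 0)
print(parse_tcp_header(raw))
```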

7.2 UDP (user datagram protocol)
7.2.1 Basic functions
The second protocol that occupies the host-to-host layer is UDP. As in the case of TCP, it
makes use of the underlying IP protocol to deliver its datagrams.
UDP is a ‘connectionless’ or non-connection-oriented protocol and does not require a
connection to be established between two machines prior to data transmission. It is
therefore said to be an ‘unreliable’ protocol; the word ‘unreliable’ is used here as the
opposite of ‘reliable’ in the case of TCP.
As in the case of TCP, packets are still delivered to sockets or ports. However, no
connection is established beforehand and therefore UDP cannot guarantee that packets are
retransmitted if faulty, received in the correct sequence, or even received at all. In view
of this, one might doubt the desirability of such an unreliable protocol. There are,
however, some good reasons for its existence.
Sending a UDP datagram involves very little overhead in that there are no
synchronization parameters, no priority options, no sequence numbers, no retransmit
timers, no delayed acknowledgement timers, and no retransmission of packets. The
header is small; the protocol is quick, and streamlined functionally. The only major
drawback is that delivery is not guaranteed. UDP is therefore used for communications
that involve broadcasts, for general network announcements, or for real-time data. A
particularly good application is with streaming video and streaming audio where low
transmission overheads are a prerequisite, and where retransmission of lost packets is not
only unnecessary but also definitely undesirable.

7.2.2 The UDP frame
The format of the UDP frame and the interpretation of its fields are described in RFC 768.
The frame consists of a header plus data and contains the following fields:

Figure 7.7
UDP frame format
Source port: 16 bits
This is an optional field. When meaningful, it indicates the port of the sending process,
and may be assumed to be the port to which a reply should be addressed in the absence of
any other information. If not used, a value of zero is inserted.
Destination port: 16 bits
The port of the receiving process. Unlike the source port, this field is not optional.
Message length: 16 bits
This is the length in bytes of this datagram including the header and the data. (This
means the minimum value of the length is eight.)
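The length rule can be illustrated by building a datagram by hand. The sketch below assumes the four-field, 8-byte header defined in RFC 768, whose fourth field is a checksum; the checksum is left at zero here, which RFC 768 defines as "no checksum computed". The function names are illustrative.

```python
import struct

UDP_HEADER_LEN = 8  # four 16-bit fields


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prepend a UDP header to `payload`."""
    length = UDP_HEADER_LEN + len(payload)  # header plus data
    if length > 0xFFFF:
        raise ValueError("datagram too long for the 16-bit length field")
    checksum = 0  # zero means that no checksum was computed
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload


def message_length(datagram: bytes) -> int:
    """Read the message length field back out of a datagram."""
    return struct.unpack("!HHHH", datagram[:UDP_HEADER_LEN])[2]


print(message_length(build_udp_datagram(0, 53, b"")))       # 8, the minimum
print(message_length(build_udp_datagram(0, 53, b"hello")))  # 13
```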

