2 April 2003, 17:00:47 The Complete FreeBSD (netdebug.mm), page 401
23
Network debugging
In this chapter:
• How to approach network problems
• Link layer problems
• Network layer problems
• traceroute
• tcpdump
• Transport and application layers
• Ethereal
The chances are quite good that you’ll have some problems somewhere when you set up
your network. FreeBSD gives you a large number of tools with which to find and solve
the problem.
In this chapter, we’ll consider a methodology of debugging network problems. In the
process, we’ll look at the programs that help debugging. It will help to have your finger
in Chapter 16 while reading this section.
How to approach network problems
Recall from Chapter 16 that network software and hardware operate on at least four
layers. If one layer doesn’t work, the ones above won’t either. When solving problems,
it obviously makes sense to start at the bottom and work up.
Most people understand this up to a point. Nobody expects a PPP connection to the
Internet to work if the modem can’t dial the ISP. On the other hand, a large number of
messages to the FreeBSD-questions mailing list show that many people seem to think
that once this connection has been established, everything else will work automatically.
If it doesn’t, they’re puzzled.
Unfortunately, the Net isn’t that simple. In fact, it’s too complicated to give a
hard-and-fast methodology at all. Much network debugging can look more like magic than
anything rational. Nevertheless, a surprising number of network problems can be solved
by using the steps below. Even if they don’t solve your problem, read through them.
They might give you some ideas about where to look.
netdebug.mm,v v4.15 (2003/04/02 03:23:15) 401
Link layer problems
To test your link layer, start with ping. ping is a relatively simple program that sends an
ICMP echo packet to a specific IP address and checks the reply. ICMP, the Internet
Control Message Protocol, is used for error reporting and testing. See TCP/IP
Illustrated, by Richard Stevens, for more information.
A typical ping output might look like:
$ ping bumble
PING bumble.example.org (223.147.37.156): 56 data bytes
64 bytes from 223.147.37.156: icmp_seq=0 ttl=255 time=1.137 ms
64 bytes from 223.147.37.156: icmp_seq=1 ttl=255 time=0.640 ms
64 bytes from 223.147.37.156: icmp_seq=2 ttl=255 time=0.671 ms
64 bytes from 223.147.37.156: icmp_seq=3 ttl=255 time=0.612 ms
^C
--- bumble.example.org ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.612/0.765/1.137/0.216 ms
In this case, we are sending the messages to the system bumble.example.org. By default,
ping sends messages with 56 bytes of data. With the 8-byte ICMP header, this makes the
64 bytes reported in each reply; the IP header is not included in this count.
By default, ping continues until you stop it: notice the ^C indicating that this invocation
was stopped by pressing Ctrl-C.
The information that ping gives you isn’t much, but it’s useful:
• It tells you how long it takes for each packet to get to its destination and back.
• It tells you how many packets didn’t make it.
• It also prints a summary of packet statistics.
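The summary lines are regular enough to be checked by a script. The following is a minimal sketch, assuming the summary format shown in the transcript above; other ping implementations print different text, and the function name parse_ping_summary is hypothetical.

```python
import re

def parse_ping_summary(output):
    """Extract packet counts and round-trip times from ping's summary lines."""
    counts = re.search(
        r"(\d+)\s*packets transmitted, (\d+) packets received, "
        r"([\d.]+)% packet loss", output)
    if counts is None:
        raise ValueError("no ping summary found")
    result = {
        "transmitted": int(counts.group(1)),
        "received": int(counts.group(2)),
        "loss_percent": float(counts.group(3)),
    }
    # The round-trip line is absent when no replies came back at all.
    rtt = re.search(
        r"round-trip min/avg/max/stddev = "
        r"([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms", output)
    if rtt is not None:
        names = ("min_ms", "avg_ms", "max_ms", "stddev_ms")
        result.update(zip(names, map(float, rtt.groups())))
    return result

# Summary copied from the ping to bumble shown above.
sample = """\
--- bumble.example.org ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.612/0.765/1.137/0.216 ms
"""
print(parse_ping_summary(sample))
```

A script like this can flag any non-zero loss or an unusually high average time without somebody having to read each transcript.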
But what if this doesn’t work? You enter your ping command, and all you get is:
$ ping wait
PING wait.example.org (223.147.37.4): 56 data bytes
^C
--- wait.example.org ping statistics ---
5 packets transmitted, 0 packets received, 100% packet loss
Obviously, something’s wrong here. We’ll look at it in more detail below. This is very
different, however, from this situation:
$ ping presto
^C
In the second case, even after waiting a reasonable amount of time, nothing happened at
all. ping didn’t print the PING message, and when we hit Ctrl-C there was no further
output. This is indicative of a name resolution problem: ping can’t print the first line
(PING presto...) until it has found the IP address of the system, in other words, until it
has performed a DNS lookup. If we wait long enough, it will time out, and we get the
message ping: cannot resolve presto: Unknown host. If this happens, use the
IP address instead of the name. DNS is an application, so we won’t even try to debug it
until we’ve debugged the link and network layers.
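To separate a name resolution failure from a link problem before running ping, you can attempt the lookup on its own. This is a minimal sketch using Python's standard socket module; the function name resolve is hypothetical, and the lookup uses whatever resolver configuration the system already has.

```python
import socket

def resolve(host):
    """Return the first IPv4 address for host, or None if it cannot be resolved."""
    try:
        infos = socket.getaddrinfo(host, None, socket.AF_INET, socket.SOCK_STREAM)
    except (socket.gaierror, UnicodeError):
        # gaierror: the resolver could not find the name.
        # UnicodeError: the name is malformed (for example, an empty label).
        return None
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is an (address, port) pair.
    return infos[0][4][0]

# A numeric address needs no DNS lookup and is returned unchanged.
print(resolve("127.0.0.1"))
```

If resolve returns None for a name such as presto, ping the IP address directly and leave the name service for later.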
If things don’t work out, there are two possibilities:
• If both systems are on the same network, it’s a link layer problem. We’ll look at that
first.
• If the systems are on two different networks, it might be a network layer problem.
That’s more complicated: we don’t know which network to look at. It could be either
of the networks on which the systems are located, or it could also be a problem with
one of the networks on the way. How do you find out where your packets get lost?
First you check the link layer. If it checks out OK, and the problem still exists,
continue with the network layer on page 405.
So what can cause link layer problems? There are a number of possibilities:
• One of the interfaces (source or destination) could be misconfigured. They should
both be in the same range of network addresses. For example, the following two
interface configurations cannot talk to each other directly, even if they’re on the same
physical network:
machine 1
dc0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
inet 223.147.37.81 netmask 0xffffff00 broadcast 223.147.37.255
machine 2
xl0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
options=3<RXCSUM,TXCSUM>
inet 192.168.27.1 netmask 0xffffff00 broadcast 192.168.27.255
• If you see something like this on an Ethernet interface, it’s pretty clear that it has a
cabling problem:
xl0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
options=3<RXCSUM,TXCSUM>
inet 192.168.27.1 netmask 0xffffff00 broadcast 192.168.27.255
media: Ethernet autoselect (none)
status: no carrier
In this case, check the physical connections. If you’re using UTP, check that you
have the right kind of cable, normally a ‘‘straight-through’’ cable. If you accidentally
use a crossover cable where you need a straight-through cable, or vice versa, you will
not get any connection. Also, many hubs and switches have a ‘‘crossover’’ switch
that achieves the same result.
• If you’re on an RG-58 thin Ethernet, the most likely problem is a break in the cabling.
You can check the static resistance between the central pin and the external part of the
connector with a multimeter. It should be approximately 25 Ω, which is what the two
50 Ω terminators give in parallel. If it’s 50 Ω, it indicates that there is a break in the
cable, or that one of the terminators has been disconnected.
• If your interface is configured correctly, and you’re using a 10 Mb/s card, check
whether you are using the correct connection to the network. Some older Ethernet
boards support multiple physical connections (for example, both BNC and UTP). For
example, if your network runs on RG58 thin Ethernet, and your interface is set to
AUI, you may still be able to send data on the RG58, but you won’t be able to receive
any.
The method of setting the connection depends on the board you are using. PCI
boards are not normally a problem, because the driver can set the parameters directly,
but ISA boards can drive you crazy. In the case of very old boards, such as the
Western Digital 8003, you may need to set jumpers. In others, you may need to run
the setup utility under DOS, and with others you can set it with the link flags to
ifconfig. For example, on a 3Com 3c509 ‘‘combo’’ board, you can set the connection
like this:
# ifconfig ep0 -link0                set BNC
# ifconfig ep0 link0 -link1          set AUI
# ifconfig ep0 link0 link1           set UTP
This example is correct for the ep driver, but not necessarily for other Ethernet
boards: each board has its own flags. Read the man page for the board for the correct
flags.
• If your interface looks OK, the next thing to do is to see whether you can send data to
other machines on the network. If so, of course, you should continue your search on
the machine that isn’t responding. If none are working, you probably have a cabling
problem.
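The first check in the list above, whether two interfaces are in the same range of network addresses, can be done mechanically from the inet lines that ifconfig prints. The following is a minimal sketch using Python's standard ipaddress module; the function names network_of and same_network are hypothetical, and the netmask is assumed to be in the hexadecimal form that ifconfig shows.

```python
import ipaddress

def network_of(address, netmask_hex):
    """Return the IPv4 network for an address and an ifconfig-style hex netmask."""
    mask = ipaddress.IPv4Address(int(netmask_hex, 16))
    return ipaddress.IPv4Interface(f"{address}/{mask}").network

def same_network(a, b):
    """True if two (address, netmask) pairs describe the same IPv4 network."""
    return network_of(*a) == network_of(*b)

# Values taken from the two interface configurations shown above.
machine1 = ("223.147.37.81", "0xffffff00")
machine2 = ("192.168.27.1", "0xffffff00")

print(network_of(*machine1))             # 223.147.37.0/24
print(same_network(machine1, machine2))  # False
```

The two machines end up in 223.147.37.0/24 and 192.168.27.0/24 respectively, which is why they cannot talk to each other directly.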
On a wireless network, you need to check for a number of additional problems. ifconfig
should show something like this:
wi0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
inet6 fe80::202:2dff:fe04:93a%wi0 prefixlen 64 scopeid 0x3
inet 192.168.27.17 netmask 0xffffff00 broadcast 192.168.27.255
ether 00:02:2d:21:54:4c
media: IEEE 802.11 Wireless Ethernet autoselect (DS/11Mbps)
status: associated
ssid "FreeBSD IBSS" 1:""
stationname "FreeBSD WaveLAN/IEEE node"
channel 3 authmode OPEN powersavemode OFF powersavesleep 100
wepmode OFF weptxkey 1
wepkey 2:64-bit 0x123456789a 3:128-bit 0x123456789abcdef123456789ab
There are many things to check here:
• Do you have the same operating mode? This example shows a card operating in BSS
or IBSS mode. By contrast, you might see this:
media: IEEE 802.11 Wireless Ethernet autoselect (DS/11Mbps <adhoc, flag0>)
In this case, the interface is operating in so-called ‘‘Lucent demo ad-hoc’’ mode,
which is not the same thing as ‘‘ad-hoc’’ mode (which in turn is better called IBSS
mode). IBSS mode (‘‘ad-hoc’’) and BSS mode are compatible. IBSS mode and
‘‘Lucent demo ad-hoc’’ mode are not. See Chapter 17, page 306 for further details.
• Is the status associated? The alternative is no carrier. Some cards, including
this one, show no carrier when communicating with a station operating in IBSS
mode, but they never show associated unless they are really associated.
• If the card is not associated, check the frequencies and the network name.
• Check the WEP (encryption) parameters to ensure that they match. Note that
ifconfig does not display the WEP key unless you are root.
Your card may show associated even if the WEP key doesn’t match. In such a
case, it knows about the network, but it can’t communicate with it.
After checking all these things, you should have a connection. But you may not be home
yet:
• If you have a connection, check if all packets got there. Lost packets could mean line
quality problems. That’s not very likely on an Ethernet, but it’s very possible on a
PPP or DSL link. There’s an uncertainty about dropped packets: you might hit Ctrl-C
after the last packet went out, but before it came back. If the line is very slow, you
might lose multiple packets. Compare the sequence number of the last packet that
returns with the total number returned. Sequence numbers start at 0, so if it’s one
less, all the packets except the ones at the end made it.
• Check that each packet comes back only once. If not, there’s definitely something
wrong, or you have been pinging a broadcast address. That looks like this:
$ ping 223.147.37.255
PING 223.147.37.255 (223.147.37.255): 56 data bytes
64 bytes from 223.147.37.1: icmp_seq=0 ttl=255 time=0.428 ms
64 bytes from 223.147.37.88: icmp_seq=0 ttl=255 time=0.785 ms (DUP!)
64 bytes from 223.147.37.65: icmp_seq=0 ttl=64 time=1.818 ms (DUP!)
64 bytes from 223.147.37.1: icmp_seq=1 ttl=255 time=0.426 ms
64 bytes from 223.147.37.88: icmp_seq=1 ttl=255 time=0.442 ms (DUP!)
64 bytes from 223.147.37.65: icmp_seq=1 ttl=64 time=1.099 ms (DUP!)
64 bytes from 223.147.37.126: icmp_seq=1 ttl=255 time=45.781 ms (DUP!)
FreeBSD systems do not respond to broadcast pings, but most other systems do, so
this effectively counts the number of non-BSD machines on a network.
• Check the times. A ping across an Ethernet should take between about 0.2 and 2 ms,
a ping across a wireless connection should take between 2 and 12 ms, a ping across
an ISDN connection should take about 30 ms, a ping across a 56 kb/s analogue
connection should take about 100 ms, and a ping across a satellite connection should
take about 250 ms in each direction. All of these times are for idle lines, and the time
can go up to over 5 seconds for a slow line transferring large blocks of data across a
serial line (for example, ftping a file). In this example, some line traffic delayed the
response to individual pings.
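The sequence number and duplicate checks described in the list above can be sketched as follows. This is an illustration only, assuming reply lines in the icmp_seq=N format shown in the transcripts; the function name check_replies and the sample with a missing reply are hypothetical.

```python
import re
from collections import Counter

def check_replies(output, transmitted):
    """Classify ping replies by icmp_seq: lost, unanswered at the end, duplicated."""
    seqs = [int(s) for s in re.findall(r"icmp_seq=(\d+)", output)]
    counts = Counter(seqs)
    last = max(counts) if counts else -1
    return {
        # Gaps below the highest sequence number seen are real losses.
        "lost": [n for n in range(last + 1) if n not in counts],
        # Packets sent after the last reply may still have been in transit
        # when ping was interrupted.
        "unanswered_at_end": list(range(last + 1, transmitted)),
        # A sequence number that appears more than once was answered repeatedly.
        "duplicated": sorted(n for n, c in counts.items() if c > 1),
    }

# Hypothetical transcript: five packets sent, reply 2 missing, reply 4 not yet back.
sample = """\
64 bytes from 223.147.37.156: icmp_seq=0 ttl=255 time=1.137 ms
64 bytes from 223.147.37.156: icmp_seq=1 ttl=255 time=0.640 ms
64 bytes from 223.147.37.156: icmp_seq=3 ttl=255 time=0.612 ms
"""
print(check_replies(sample, transmitted=5))
```

Only the entries under "lost" point to a line quality problem; entries under "unanswered_at_end" are the uncertainty caused by pressing Ctrl-C, and entries under "duplicated" suggest a broadcast address or something more seriously wrong.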
Network layer problems
Once we know the link layer is working correctly, we can turn our attention to the next
layer up, the network layer. Well, first we should check if the problem is still with us.
We need additional tools for the network layer. ping is a useful tool for telling you
whether data is getting through to the destination, and if so, how much is getting through.
But what if your local network checks out just fine, and you can’t reach a remote
network? Or if you’re losing 40% of your packets to foo.bar.org, and the remaining ones
are taking up to 5 seconds to get through? Where’s the problem? Based on the recent
‘‘upgrade’’ your ISP performed, and the fact that you’ve had trouble getting to other sites,
you suspect that the performance problems might be occurring in the ISP’s net. How can
you find out?
As we saw while investigating the link layer, a complete failure is often easier to fix than
a partial failure. If nothing at all is getting through, you probably have a routing problem.
Check the routing table with netstat. On bumble, you might see:
$ netstat -r
Routing tables
Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            gw                 UGSc        0        8  xl0
localhost          localhost          UH          2      525  lo0
223.147.37         link#1             UC          6        0  xl0
sat-gw             00:80:c6:f9:d3:fa  UHLW        0        0  xl0   1150
bumble             00:50:da:cf:17:d3  UHLW        0       24  lo0
presto             00:80:c6:f9:a6:c8  UHLW        0        5  xl0   1200
freebie            00:50:da:cf:07:35  UHLW        6   760334  xl0   1159
223.147.37.255     ff:ff:ff:ff:ff:ff  UHLWb       1      403  xl0
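If you check routing tables on several machines, it can help to pick out the default route with a script. This is a minimal sketch, assuming the column layout shown above (Destination, Gateway, Flags, Refs, Use, Netif, Expire); other versions of netstat print different columns, and the function name default_gateway is hypothetical.

```python
def default_gateway(netstat_output):
    """Return (gateway, interface) of the default route, or None if there is none."""
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: Destination Gateway Flags Refs Use Netif [Expire]
        if len(fields) >= 6 and fields[0] == "default":
            return fields[1], fields[5]
    return None

# Two lines copied from the routing table shown above.
sample = """\
default gw UGSc 0 8 xl0
localhost localhost UH 2 525 lo0
"""
print(default_gateway(sample))
```

A result of None means that packets for remote networks have nowhere to go, which would explain a complete failure to reach them.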
The default route is via gw, which is correct. The first thing is to ensure that you can
ping gw; that’s a link level issue, so we’ll assume that you can. But what if you try to
ping a remote system and you see something like this?
# ping rider.fc.net
PING rider.fc.net (207.170.123.194): 56 data bytes
36 bytes from gw.example.org (223.147.37.5): Destination Host Unreachable
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
 4  5  00 6800 c5da   0 0000  fe  01 246d 223.147.37.2  207.170.123.194
36 bytes from gw.example.org (223.147.37.5): Destination Host Unreachable
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
 4  5  00 6800 c5e7   0 0000  fe  01 2460 223.147.37.2  207.170.123.194
^C
--- rider.fc.net ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss
These are ICMP messages from gw indicating that it does not know where to send the
data. This is almost certainly a routing problem; on gw you might see something like: