
Network Traffic Analysis Using tcpdump


Judy Novak
Johns Hopkins University Applied Physics Laboratory

Introduction to tcpdump
All material Copyright © Novak, 2000, 2001. All rights reserved.
Table of Contents
Topics
Introduction to tcpdump
Writing tcpdump Filters
Examination of Datagram Fields
Beginning Analysis
Real World Examples
Step by Step Analysis
References
Course Objectives

Introduce the fundamentals of tcpdump

Explain how to write tcpdump filters

Examine fields in datagram for uses/misuses


Analyze traffic by placing it in categories

Demonstrate “real-world” analysis using tcpdump

Let you participate in the analysis process
The objectives of this course are to introduce you to the fundamentals and benefits of using tcpdump
as a tool to analyze your network traffic. We’ll start by introducing the concepts and output of
tcpdump. One of the most important aspects of using tcpdump is being able to write filters
that look for specific traffic. Filter writing is fairly basic unless you want to examine fields in an IP
datagram that don’t fall on byte boundaries, which is why an entire section is devoted to the art of
writing filters.
Before we start to use tcpdump to analyze traffic, we’ll examine many of the fields found in the IP
datagram, both to familiarize you with those fields in theory and to show how they are used in
practice. We’ll study how and why fields might be changed and for what purpose. Next,
we’ll start the basic analysis process by looking at tcpdump output and categorizing the kinds of
traffic that you can see.
Then we’ll look at some real-world examples of how tcpdump was used on monitored
networks to discover what was happening. Finally, we’ll walk through the analysis process step by
step, missteps included, to get you comfortable with it.
Note that all tcpdump output shown in this course is activity that actually occurred. Source and
destination hosts and IP addresses have been altered to obscure their true identities.
Introduction to tcpdump

Objectives

Examine the strengths/weaknesses of tcpdump


Organize collection/analysis process of tcpdump data via Shadow

Examine tcpdump output

Standard

Hexadecimal

Length fields and how to convert them to bytes

Application layer

Interpretation of payload/hex output
Introduction
Strengths

Provides audit trail/historical record of network activity

Provides absolute fidelity

Universally available and used

One of the most important parts of your security infrastructure is at least one tool or
software package that captures an audit trail, or historical record, of the traffic that enters or
leaves your network. There will be times when you need to examine activity or connections that
occurred in your network, not just traffic that caused an alarm to sound. For instance, what if you
suspect that the packet-filtering router that acts as your perimeter defense has been behaving
strangely since some major network changes were made? You would have to examine the traffic that
was allowed into your network to help determine the problem. That is where tcpdump is invaluable.
Also, many tools, even firewall logs, will display suspicious traffic yet show only partial data.
What if you get a log of rejected traffic, but it doesn’t display or keep the TCP flags? You’ll
never know what kind of connection was attempted. tcpdump allows the analyst to examine all the
bits and fields that are collected. If nothing is “wrong” with the connection, examination at the bit
level is unnecessary. Yet if you suspect something “foul” in the traffic, you really need access to
all the data down to the bit level.
Finally, tcpdump is universally used and very portable. Once you become familiar with this
software or its Windows counterpart, windump, you can use it on just about any platform to assist
in your analysis of traffic.
Weaknesses

By default, doesn’t collect all the payload

Does not scale well on large networks

No idea of state

Limited operations

Do-it-yourself interpretations
By default, tcpdump captures only the first 68 bytes of each frame from the network interface. Some of
these bytes are used for the link layer frame header. For Ethernet, 14 bytes hold fields like the source
and destination MAC addresses, along with the type of embedded data. That leaves only 54 bytes to
capture the IP header, the embedded protocol header and any data. Most of the time this is enough to
capture the IP header and the embedded protocol header, but sometimes protocol headers or
data will be truncated. And if you are interested in the data payload, tcpdump is really not the tool
to use.
On larger networks, tcpdump can collect a very large volume of data. This can be alleviated by not
collecting all the traffic on the network, perhaps omitting web traffic (port 80), or by adding
disk space and faster processors to analyze all the collected data. But at some point the volume gets
unwieldy.
tcpdump blindly collects packet after packet. It has no notion of state, so it cannot tell that a given
packet is anomalous because it does not follow the flow of a normal connection. And while tcpdump has
some primitive arithmetic operations and ways to manipulate bits, it cannot do complex operations for
analyzing data.
Finally, while it is an excellent way to collect data, tcpdump does not attempt to interpret
what it sees. It does have some integrity checking for certain data to make sure that the data is
not irregular, but the analyst has to have the training and savvy to interpret the data. For the
sophisticated analyst, this is a bonus because she or he can make the correct call. Compare this with a
tool that is prone to false positives and gives no way of verifying the alarmed event. But for an
analyst who has little training, tcpdump can be daunting since it does not interpret events.
tcpdump Versions

tcpdump: Unix version; official current version 3.4

windump: Windows version

Collective effort; current version 3.5: www.tcpdump.org

tcpdump-3.5.tar.gz

libpcap-0.5.tar.gz
tcpdump is officially supported by the Lawrence Berkeley Labs. The current version is 3.4. There is
also a collective effort, open to anyone interested, to improve tcpdump and patch known problems
with tcpdump and libpcap. The software for this effort can be found at
www.tcpdump.org; its current version is 3.5.
For the Unix versions of tcpdump, you also need to download libpcap, a library that implements
a portable framework for capturing low-level network traffic. windump is a Windows variant of
tcpdump. It requires a corresponding application program interface for collecting traffic, known as
winpcap.
The unofficial version of tcpdump has some nice enhancements. It decodes more protocols
at the application layer and has a very nice capability of converting hexadecimal payload to
character output.
tcpdump in Action
[Figure: a host running tcpdump “sniffing” packets off the network and producing the output below]
tcpdump output
07:00:48.036746 ping.net > myhost.com: icmp: echo request (DF)
07:00:48.036776 myhost.com > ping.net: icmp: echo reply (DF)
07:02:12.622460 log.net.3155 > syslog.com.514: udp 101
07:03:01.132414 send.net.32938 > mail.com.25: S 248631:248631(0) win 8760
On this slide, we see a host running tcpdump and gathering records from the network interface,
along with the records that tcpdump has collected. tcpdump has a default output format that depends
on the protocol (TCP, UDP, ICMP) of the record displayed. The formats for the various
protocols are similar to one another, but each is distinct in what it displays.
By default, tcpdump will collect and print, in a standard format, all the traffic passing on the
network. Command line options alter the default behavior, either
by collecting only specified records, printing in a more verbose mode, printing in hexadecimal or
writing records as “raw packets” to a file instead of printing to standard output.
Sample tcpdump Output
Sample UDP Record
09:39:19.470000 nmap.edu.728 > dns.net.111: udp 56
Fields: timestamp, source.port > dest.port, protocol, bytes
Sample TCP Record
09:35:53.660000 nmap.edu.4 > dns.net.111: SF 136747297:136747297(0) win 1028
Fields: flags (SF), beginning seq # : ending seq # (data bytes)
09:32:43.910000 nmap.edu.1171 > dns.net.139: S 2490962508:2490962508(0) win 512
09:32:43.910000 nmap.edu.1173 > dns.net.21: S 62697789:62697789(0) win 512
09:32:43.910000 nmap.edu.1193 > dns.net.22: S 1360146849:1360146849(0) win 512
09:32:43.920000 nmap.edu.1194 > dns.net.1114: S 372884098:372884098(0) win 512
Since we’ll review a lot of tcpdump output in this course, here’s a chance to get more comfortable
with it. This is sample output from what appears to be a scan by nmap, a popular and informative
scanning tool.
All records have a timestamp. The sensor host (Red Hat Linux 5.2) that captured these records has a
clock precision of hundredths of a second, although tcpdump provides places for up to millionths.
Different protocols have different representations in tcpdump output. One of the first challenges
is to identify the protocol (TCP, UDP, ICMP). Most are labeled; TCP isn’t explicitly
labeled, but it is the only one with flag bits and sequence and acknowledgment numbers, to name a
few clues.
Some protocols, like DNS, are interpreted at the application layer. Because of this, you may not
see the usual clues, and it may not be obvious whether a record is UDP or TCP, so it is
important to look for other clues as to which it is.
In general, tcpdump displays each record as source host > destination host.
Note that the number of data bytes shown in parentheses on a SYN packet is normally 0: SYN packets
do not usually carry a payload because they are only part of establishing the three-way handshake.
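As a rough illustration of the record layout described above, the following Python sketch pulls the fields out of a TCP summary line. The regular expression and the names TCP_RECORD and parse_tcp_record are assumptions made for this example, not part of tcpdump, and the pattern only covers the simple records shown on this slide (no ack numbers or options).

```python
import re

# Hypothetical pattern for the simple TCP summary lines shown on this slide.
# Assumption: the port is the last dot-separated field of each endpoint.
TCP_RECORD = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<src>\S+)\.(?P<sport>\d+) > "
    r"(?P<dst>\S+)\.(?P<dport>\d+): "
    r"(?P<flags>[SFRP.]+) "
    r"(?P<seq_begin>\d+):(?P<seq_end>\d+)\((?P<data_bytes>\d+)\) "
    r"win (?P<window>\d+)"
)

NUMERIC_FIELDS = ("sport", "dport", "seq_begin", "seq_end", "data_bytes", "window")


def parse_tcp_record(line):
    """Return a dict of fields for a TCP record, or None if the line is not one."""
    match = TCP_RECORD.match(line)
    if match is None:
        return None
    record = match.groupdict()
    for name in NUMERIC_FIELDS:
        record[name] = int(record[name])
    return record


# Two records copied from the sample output above.
SAMPLE_TCP = "09:35:53.660000 nmap.edu.4 > dns.net.111: SF 136747297:136747297(0) win 1028"
SAMPLE_UDP = "09:39:19.470000 nmap.edu.728 > dns.net.111: udp 56"

if __name__ == "__main__":
    print(parse_tcp_record(SAMPLE_TCP))  # dict with flags "SF" and 0 data bytes
    print(parse_tcp_record(SAMPLE_UDP))  # None: a UDP record has no flags or sequence numbers
```

The absence of a match for the UDP line mirrors the point made in the notes: flags and sequence numbers are what give a TCP record away.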
Organizing tcpdump Using Shadow
How Does Shadow Help Organize?
Shadow:

Collects tcpdump data in hourly files

Analyzes each hour’s data for anomalies

Formats anomalous data in html for browsing

Comes with scripts to assist in examining data
Shadow (Secondary Heuristics for Defensive Online Warfare) is an intrusion detection system available
to all for free. It can be found at Shadow uses tcpdump as its
underlying collection and processing tool. Shadow turns tcpdump from a packet collecting tool into an
intrusion detection system. Shadow collects data from the network interface and stores it in hourly files
in compressed raw tcpdump format. It analyzes each hour’s collected data after the fact, running a
series of tcpdump filters against it to look for anomalies and for one-to-many source IP to destination
IP traffic.
Shadow formats into html all the events of interest detected by the tcpdump filters and processed by
some perl programs. The analyst can examine the output with a browser and further investigate activity
using additional perl scripts to look through an hour’s or a day’s worth of data.
Using Shadow relieves the analyst of having to worry about the collection of tcpdump data; it
automates this process. Further, it gives the analyst an automated way of examining activity. Still, as
with any other intrusion detection system, it takes a savvy analyst to interpret the output accurately.
And since Shadow is built on tcpdump, the analyst can examine all the collected data down to the bit
level.
What is Shadow?

Intrusion detection system based on tcpdump

Unix-based

Performs traffic analysis

Primary focus on datagram headers

Pull-based architecture

Analyst reviews hourly events of interest via web browser

Requires a savvy analyst to interpret output

Freeware available from www.nswc.navy.mil

Shadow is a Unix-based intrusion detection system with a sensor component and an analysis
component. The sensor collects network traffic, and the analysis component fetches that traffic and
analyzes it. Both the sensor and the analysis host process data in hourly increments.
The entire IP datagram is not captured because Shadow is mostly concerned with anomalies or
events of interest found in the header portions of the datagram. The headers examined are the IP,
TCP, UDP and ICMP headers. Much insight can be gained from examining these headers. By
default, some payload is also captured with each datagram. Shadow does not attempt to analyze it,
but it is there in case you want to.
Each hour the analysis host analyzes the previous hour’s traffic for events of interest. These events
are formatted in html for viewing by an analyst using a browser. This is known as a pull-based
approach since the analyst must go and examine the records; alerts about anomalous events are not
pushed to the analyst.
Shadow was developed by the Shadow team at the Naval Surface Warfare Center. It is still
maintained and upgraded by this team. Shadow can be downloaded at no cost from
Click on the link for Current Shadow Software.
Why Shadow?

$$$$ (free for all)

Tunable

Customize your own signatures

Change at will

Provides an audit trail of activity to/from network

Provides an intimate view of activity

While not the only reason to install and use Shadow, a very compelling one is the price tag. In many
cases, but not this one, you get what you pay for. Shadow is an excellent no-cost traffic analysis tool.
Another benefit is that once you master Shadow, you can change it whenever you want. For
instance, if you hear of a new exploit and can fashion a signature for it with a tcpdump filter, you can
modify Shadow immediately. Compare this with intrusion detection systems that do not let you
change filters or signatures: you have to wait for the software company to update the filters
when it gets around to it, and the updates may not include signatures that you would like to see.
Also, since you get all the source code with Shadow, you can customize it to your whims and needs. This
is highly unusual and allows you to make changes in keeping with your proficiency with the software.
Shadow uses tcpdump as its collection software. By default, you will collect most activity going into and
out of your network, which provides a valuable audit trail of activity in the network. If
you ever find yourself in the midst of some kind of incident, this may be a very valuable attribute for an
intrusion detection system to have.
Finally, some of the more GUI-oriented intrusion detection systems do not allow the user to examine the
actual traffic at the IP datagram level. Shadow, by virtue of tcpdump, gives the user a very intimate
view of the data collected. You maintain fidelity of data and can use all fields for interpretation
and analysis. If the traffic you are analyzing is corrupted in some way, you want to be able to inspect the
entire datagram.
Shadow Architecture
[Figure: a sensor on the DMZ collects tcpdump data into hourly files (hour 00, hour 01, hour 02); the analysis host retrieves them by secure copy, applies tcpdump filters and produces html output]
The Shadow architecture is a two-host system. Typically, the sensor resides on the DMZ, but it can
be placed anywhere on the network. It collects the traffic from the network interface and stores the
data in hourly files in compressed raw tcpdump format.
Each hour, the analysis host securely copies the files from the sensor. Using perl scripts, it
orchestrates the process of running the previous hour’s tcpdump data through a set of tcpdump filters
that look for anomalous activity. Another filter and perl script examine the data for signs of scans,
that is, one source IP attempting connections to multiple destination IPs. All of this information is
then formatted into html for viewing by the analyst.
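The one-to-many idea behind the scan check can be sketched in a few lines of Python. This is only an illustration of the concept, not Shadow’s actual perl code; the function name find_one_to_many, the threshold of three destinations and the sample address pairs are all assumptions made for the example.

```python
from collections import defaultdict


def find_one_to_many(pairs, min_destinations=3):
    """Group (source, destination) pairs by source and keep the sources
    that contacted at least `min_destinations` distinct destinations."""
    destinations = defaultdict(set)
    for src, dst in pairs:
        destinations[src].add(dst)
    return {
        src: sorted(dsts)
        for src, dsts in destinations.items()
        if len(dsts) >= min_destinations
    }


# Hypothetical (source IP, destination IP) pairs taken from one hour of records.
OBSERVED = [
    ("5.5.5.5", "172.16.1.1"),
    ("5.5.5.5", "172.16.1.2"),
    ("5.5.5.5", "172.16.1.3"),
    ("1.2.3.4", "172.16.1.9"),
    ("1.2.3.4", "172.16.1.9"),  # repeated contact with one host is not a scan
]

if __name__ == "__main__":
    for source, targets in find_one_to_many(OBSERVED).items():
        print(source, "->", ", ".join(targets))
```

The threshold is the tunable part: set it too low and ordinary clients that talk to a few servers show up, set it too high and slow scans slip through.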
What is a Shadow Event of Interest?

The default filters will extract the following types of inbound traffic:

Traffic sent to broadcast address

Traffic from reserved private networks

Fragmentation

Initial SYN connections

Particular UDP ports

Specific ICMP traffic


Scans

Traffic to core infrastructure hosts
Shadow comes with several tcpdump filters to examine each hour’s traffic. All records extracted by
the filters are processed and formatted in html. Shadow’s focus is mainly external-to-internal traffic.
Specifically, the default filters look for traffic to the .255 or .0 addresses. Any IP address that is
from the reserved private address spaces is displayed, as is any from the 127.0.0.1 address.
Fragments are also examined. For all traffic other than ICMP, only the first fragment is displayed.
Fragmentation can be normal, but it can also be a sign of subversive activity trying to get around a
stateless packet filtering device or elude the notice of a stateless network intrusion detection
system (NID). For ICMP, all fragments are displayed except the last, since ICMP messages should be
small enough not to require fragmentation.
For TCP records, the initial SYN connections are examined. A SYN doesn’t necessarily mean that the
connection was successful; it just indicates that a connection was attempted. Certain ports or
hosts may have to be excluded to avoid false alarms. For UDP records, you have to maintain a list
of UDP destination ports that are of interest to you.
Shadow looks for signs of a one-to-many relationship of one source IP to multiple destination hosts,
often indicative of a scan. Finally, Shadow can be tuned to look at more granular activity to the core
infrastructure hosts in your network.
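The address-based checks described above (traffic to .0 or .255, and traffic from private or loopback sources) can be approximated as follows. This is a hedged sketch in Python rather than the tcpdump filters Shadow really ships; the function name address_events is hypothetical, and two assumptions are made: that “reserved private address spaces” means the three RFC 1918 ranges, and that any 127.0.0.0/8 source counts as loopback.

```python
import ipaddress

# Assumption: "reserved private address spaces" refers to the RFC 1918 ranges.
RFC1918_NETWORKS = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def address_events(src, dst):
    """Return a list of address-based reasons a record might be of interest."""
    events = []
    source = ipaddress.ip_address(src)
    # Destination whose last octet is 0 or 255, as in the default filters.
    if dst.rsplit(".", 1)[-1] in ("0", "255"):
        events.append("broadcast destination")
    # Loopback is tested first so that 127.x.x.x gets its own label.
    if source.is_loopback:
        events.append("loopback source")
    elif any(source in net for net in RFC1918_NETWORKS):
        events.append("private source")
    return events


if __name__ == "__main__":
    print(address_events("10.1.2.3", "172.16.1.255"))
    print(address_events("127.0.0.1", "172.16.1.5"))
    print(address_events("5.5.5.5", "172.16.1.5"))
```

An empty list simply means none of these particular checks fired; the record could still be extracted by the fragment, SYN, UDP port or scan filters.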
Sample Shadow Output
Shadow output is sorted tcpdump output. It is sorted by source IP and time so that the analyst can
group the activity by source IP. The above activity indicates a probe of port 3128 (the squid proxy
server port) by host 1.2.3.4. A second host, 2.2.2.2, is displayed because it was extracted by one of
the tcpdump filters; it appears to be probing mydns.com on destination port 139,
which is a NetBIOS port. Typically, DNS servers do not have the NetBIOS ports open.
The final set of activity appears to be a full-blown scan from source IP 5.5.5.5. It is scanning the
hosts on the 172.16.1 subnet for port 1243, which is used by a trojan known as SubSeven or BackDoorG.
Having the output displayed in html makes it easier for the analyst to examine the
hour’s traffic.
Examining tcpdump Output
Default tcpdump Output
Command: tcpdump
Default display
11:55:52.069484 192.168.143.5 > 192.168.143.101: icmp: echo request
tcpdump writes any collected or processed output to standard output, typically the console or
terminal. It will also attempt to resolve IP addresses to host names and to
translate port numbers to known services. For instance, if a port number is 23 and the
file /etc/services associates it with telnet, tcpdump will print the service name and not the port
number, unless the -n option has been used to disable resolution.
As you can see, this does not display all the captured fields in the datagram. Other fields are
available for display, but different command line options have to be supplied in order to see
them. The above record is a captured ICMP echo request.
Warning!!!!!
In case you hadn’t gotten the subtle clues from the slide above, this is a not-so-hidden warning that
the material is going to be quite a bit more difficult in the next several slides. So, if you are on auto-
pilot or thinking about what to eat for lunch, you may have to devote a few extra neurons to the
following material.

Hexadecimal tcpdump Output
Command: tcpdump -x
Hexadecimal display
11:55:52.069484 192.168.143.5 > 192.168.143.101: icmp: echo request
4500 0054 064b 0000 4001 bc12 c0a8 8f05
c0a8 8f65 0800 620a 850a 0000 889f 4b39
510f 0100 0809 0a0b 0c0d 0e0f 1011 1213
1415 1617 1819
Underlined: IP header
Remainder: embedded protocol header (here, the ICMP header) and data
Suppose you want to examine all the bits that are captured when tcpdump is run. There are many
reasons for wanting this level of detail, especially when you believe that there has been some
kind of deliberate crafting or alteration of the datagram.
To dump the bits, tcpdump has an option to display the output in hexadecimal: the -x command line
option. From the hexadecimal output, the bits can be determined.
When output is displayed in hex, you need some idea of what fields you are
examining. A most excellent resource for this task is the “bible” of TCP/IP, TCP/IP
Illustrated, Volume 1 by W. Richard Stevens. Not only are the protocol headers conveniently located
directly inside the cover, but the book uses tcpdump output to assist in the understanding of TCP/IP.
One of the first things to do when looking at hex output is to determine where the
IP header is and how long it is. We’ll see how to do that in upcoming slides. You also want to
examine the embedded protocol and determine where its header starts and stops. Finally, you may
be interested in the embedded protocol’s payload.
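To make the hex output less abstract, here is a minimal Python sketch that turns the dump from this slide back into bytes and reads the fixed part of the IP header. The field offsets follow the standard IPv4 header layout; the helper names hex_to_bytes and decode_ipv4_header are assumptions made for this example, and the sketch ignores IP options and does not check the header checksum (the addresses in the course output have been altered, so a checksum test would not be meaningful).

```python
import ipaddress
import struct

# The hexadecimal lines printed by "tcpdump -x" on this slide.
HEX_DUMP = """
4500 0054 064b 0000 4001 bc12 c0a8 8f05
c0a8 8f65 0800 620a 850a 0000 889f 4b39
510f 0100 0809 0a0b 0c0d 0e0f 1011 1213
1415 1617 1819
"""


def hex_to_bytes(dump):
    """Strip all whitespace from a hex dump and convert it to raw bytes."""
    return bytes.fromhex("".join(dump.split()))


def decode_ipv4_header(data):
    """Decode the fixed 20-byte portion of an IPv4 header."""
    (version_ihl, _tos, total_length, _ident, _flags_frag,
     ttl, protocol, _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", data[:20])
    return {
        "version": version_ihl >> 4,                 # high nibble of the first byte
        "header_length": (version_ihl & 0x0F) * 4,   # low nibble counts 32-bit words
        "total_length": total_length,
        "ttl": ttl,
        "protocol": protocol,                        # 1 = ICMP, 6 = TCP, 17 = UDP
        "source": str(ipaddress.IPv4Address(src)),
        "destination": str(ipaddress.IPv4Address(dst)),
    }


if __name__ == "__main__":
    packet = hex_to_bytes(HEX_DUMP)
    header = decode_ipv4_header(packet)
    print(len(packet), "bytes captured")
    print(header)
    # The byte right after the IP header is the ICMP type (8 = echo request).
    print("ICMP type:", packet[header["header_length"]])
```

The decoded source and destination agree with the addresses on the summary line above the dump, which is a quick sanity check that the header was located correctly.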
Default snaplen


Default number of bytes captured is 68

Why do we see only 54 bytes of data from tcpdump?
4500 0054 064b 0000 4001 bc12 c0a8 8f05  (16)
c0a8 8f65 0800 620a 850a 0000 889f 4b39  (32)
510f 0100 0809 0a0b 0c0d 0e0f 1011 1213  (48)
1415 1617 1819  (54)
The snapshot length, or snaplen for short, is the number of bytes that tcpdump collects from each
frame. By default, this is not the entire frame or datagram. The default snaplen is 68 bytes. This is
usually enough to capture the IP header, the embedded protocol header and some data. But if there are
many options, either IP header options or TCP options (TCP is the only embedded protocol that can
have header options), not all of the headers may be captured.
In the above slide, the underlined portion appears to be a 20 byte IP header. Each full line of
tcpdump hex output holds 16 bytes, and we see that 54 bytes of data have been captured. The
actual datagram is longer than 68 bytes (we’ll see how to compute the datagram
length), but we only have 54 bytes of output. Any ideas why?
Answer: Frame Header
Frame header          IP Header   ICMP Header   ICMP Data
Ethernet = 14 bytes   20 bytes    8 bytes       26 bytes
The reason only 54 bytes of IP datagram data appear on the previous slide, even though the datagram
is longer than 68 bytes, is that the frame header is collected as well. In this case, we are running
on a host that has an Ethernet connection. Ethernet has a
14 byte frame header, which holds fields such as the source and destination MAC addresses and the
kind of embedded datagram (IP, ARP or RARP). This is why we only see 54 bytes of the IP datagram; 14
of the 68 captured bytes are used to record the Ethernet header.
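The byte accounting on this slide can be checked with a short calculation. The Python sketch below is illustrative only; the function name captured_layout is hypothetical, and the datagram length of 0x0054 (84 bytes) is read from the total length field, the third and fourth bytes of the hex dump.

```python
def captured_layout(snaplen, link_header, ip_header, proto_header, ip_total_length):
    """Work out how a snaplen-limited capture is split across the headers."""
    frame_length = link_header + ip_total_length       # bytes on the wire
    captured = min(snaplen, frame_length)               # bytes tcpdump keeps
    datagram_bytes = captured - link_header              # what the hex dump shows
    payload_bytes = max(datagram_bytes - ip_header - proto_header, 0)
    return {
        "frame_length": frame_length,
        "datagram_bytes_captured": datagram_bytes,
        "payload_bytes_captured": payload_bytes,
        "truncated": frame_length > snaplen,
    }


if __name__ == "__main__":
    # Values from the slide: 68 byte snaplen, 14 byte Ethernet header,
    # 20 byte IP header, 8 byte ICMP header, total length field 0x0054.
    layout = captured_layout(
        snaplen=68, link_header=14, ip_header=20, proto_header=8,
        ip_total_length=0x0054,
    )
    for name, value in layout.items():
        print(name, "=", value)
```

With these inputs the frame is 98 bytes on the wire, so the capture is truncated: 54 bytes of the datagram survive, of which 26 are ICMP data, matching the table on the slide.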
