Writing tcpdump Filters

Network Traffic Analysis
Using tcpdump
Judy Novak
Johns Hopkins University Applied Physics Laboratory

Writing tcpdump Filters
All material Copyright © Novak, 2000, 2001. All rights reserved.

Writing tcpdump Filters
• Introduction to tcpdump
• Writing tcpdump Filters
• Examination of Datagram Fields
• Beginning Analysis
• Real World Examples
• Step by Step Analysis

Objectives
• Review the foundations to understand and create tcpdump filters including:
  • tcpdump filter format
  • Review of bit/byte theory
  • Review of binary/hexadecimal numbering systems
  • Review of bit masking
• Learning to formulate tcpdump filters
• Review of tcpdump output

tcpdump filters are necessary to selectively gather or read records of network traffic.
This section may be difficult to follow, especially if you haven’t been exposed to this theory
before, but it is more than just an academic exercise. To comprehend network traffic at its most
basic level, you have to understand tcpdump filters. Familiarity with tcpdump filters is also
necessary if you want to search tcpdump files for some trait. For instance, to identify the
beginning of a TCP connection, you would search for traffic with only the SYN bit set.

Foundations For Understanding tcpdump Filters
• Specify item of interest for record selection
  • Any field in the IP datagram
  • Examples: header length or TCP flags
• Variables for more commonly used fields:
  • Examples: “port” or “host”
• Less common fields:
  • Identify protocol
  • Identify byte displacement
  • Examples: ip[0], tcp[13]

tcpdump filters need to specify an item of interest, a field in the IP datagram, for record selection.
Such an item can be part of the IP header, such as the IP header length; the TCP header, such as the
TCP flags; the UDP header, such as the destination port; or the ICMP message, such as the message type.
tcpdump provides a special name for each type of header. As you would expect, ip denotes a field in
the IP header or data portion of the IP datagram, tcp a field in the TCP header or segment, udp a
field in the UDP header or UDP datagram, and icmp a field in the ICMP message.
For instance, ip[0] indicates byte offset 0, the first byte of the IP datagram, which is part of the
IP header (remember that counting starts at 0). tcp[13] is byte offset 13 into the TCP segment, which
is also part of the TCP header, and icmp[0] is byte offset 0 of the ICMP message, which is the ICMP
message type.
Sample filters and reference material are found in:
• tcpdump man pages
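
As a rough illustration of the protocol[displacement] notation, the Python sketch below indexes into a hand-built sample packet. The packet bytes and the variable names are invented for this example, and the IP header is assumed to be exactly 20 bytes (no options) so that the TCP segment starts at a fixed position.

```python
# Hypothetical sample: a 20-byte IP header followed by a 20-byte TCP header.
# Every value here is invented for illustration only.
ip_datagram = bytes.fromhex(
    "45000028" "00010000" "40060000"  # version/length, TOS, total length, id, flags, TTL, protocol, checksum
    "01020304" "05060708"             # source IP 1.2.3.4, destination IP 5.6.7.8
    "04010017" "00000000" "00000000"  # TCP source port 1025, destination port 23, sequence, acknowledgement
    "50022000" "00000000"             # data offset, flags (SYN), window, checksum, urgent pointer
)

# Assumption: no IP options, so the TCP segment starts at byte 20.
tcp_segment = ip_datagram[20:]

print(ip_datagram[0])    # ip[0]: version and header length together -> 69 (0x45)
print(ip_datagram[9])    # ip[9]: protocol field -> 6 (TCP)
print(tcp_segment[13])   # tcp[13]: TCP flags byte -> 2 (only SYN set)
```

Note that ip[0] comes back as a single number covering two 4 bit fields, which is why byte-level indexing alone is not enough for every field.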

Specifying Fields
IP header (20 bytes); each row is 32 bits, bit 0 on the left and bit 31 on the right:
  4-bit version | 4-bit IP header length | 8-bit TOS | 16-bit total length (in bytes)
  16-bit IP identification number | 3-bit flags | 13-bit fragment offset
  8-bit time to live (TTL) | 8-bit protocol | 16-bit header checksum
  32-bit source IP address
  32-bit destination IP address
Two ways to specify a field:
  protocol[displacement]   ip[1]      (the 8-bit TOS field)
  macro                    src host   (the 32-bit source IP address)

Looking at the IP header as an example, we learn two ways to specify fields. The easier way to specify a
field of interest is by using a tcpdump macro, but not all fields have macros. The source IP can be
specified by combining the two macros “src” and “host”. But if we want to look at the type of service
field, we have to identify the protocol in which the field is found (ip, because the field is in the IP
header) and its displacement in bytes from the start of that header (1), giving ip[1].
What are some of the more common macros used in filters?
host       select the record if either the source or destination host matches this IP
net        select the record if either the source or destination subnet matches
           (useful if several IPs from the same subnet are of interest to you)
port       select the record if either the source or destination port matches
src host   select the record if the source host matches
dst host   select the record if the destination host matches
src net    select the record if the source subnet matches
dst net    select the record if the destination subnet matches
src port   select the record if the source port matches
dst port   select the record if the destination port matches
icmp       select the record if the protocol field ip[9] has a value of 1
tcp        select the record if the protocol field ip[9] has a value of 6
udp        select the record if the protocol field ip[9] has a value of 17

The tcpdump Filter Format
• The two different formats for a tcpdump filter are:
  • <protocol header>[offset:length] <relation> <value>
      ip[9] = 1
      tcp[2:2] < 20
      udp[4:2] != 0
      icmp[0] = 8
  • <variable> <value>
      port 23
      dst host 1.2.3.4
      src net 0

The first filter, ip[9] = 1, selects any record with an IP protocol of 1 (ICMP).
The second filter, tcp[2:2] < 20, selects any record with a TCP destination port less than 20.
The third filter, udp[4:2] != 0, selects any UDP record with a non-zero UDP length.
The fourth filter, icmp[0] = 8, selects any record with an ICMP message type of 8, an ICMP echo request.
The first variable filter, port 23, selects any record with a source or destination port of 23 (telnet).
The second variable filter, dst host 1.2.3.4, selects any record with destination host 1.2.3.4.
The third variable filter, src net 0, selects any record with a source subnet of 0.x.x.x.
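
To make the first format concrete, here is a minimal Python sketch of how a header[offset:length] expression can be read as a number and then compared with a value. The helper name field and the sample header bytes are hypothetical. The sketch assumes multi-byte fields are in network (big-endian) byte order, and it is not a description of how tcpdump itself is implemented.

```python
def field(header: bytes, offset: int, length: int = 1) -> int:
    """Return header[offset:length] as an unsigned big-endian integer."""
    return int.from_bytes(header[offset:offset + length], "big")

# Invented sample headers, for illustration only.
tcp_header = bytes.fromhex("0401001700000000000000005002200000000000")
udp_header = bytes.fromhex("00890089004c1fd7")
icmp_message = bytes.fromhex("0800000000010001")

print(field(tcp_header, 2, 2) < 20)   # tcp[2:2] < 20  -> False (destination port is 23)
print(field(udp_header, 4, 2) != 0)   # udp[4:2] != 0  -> True  (UDP length is 76)
print(field(icmp_message, 0) == 8)    # icmp[0] = 8    -> True  (echo request)
```

Omitting the length, as in icmp[0], reads a single byte, which mirrors the default in the filter notation.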

Bit/Byte Fundamentals
• A byte is an 8 bit field
• It is possible to denote a span of bytes, i.e. udp[0:2]
• Smallest precision that the tcpdump “language” offers is a byte
• How do you reference bits within a byte?
  • Bit masking
First 4 bytes (bytes 0 - 3) of the IP header:
  Byte 0      4 bit version, 4 bit length
  Byte 1      8 bit TOS
  Bytes 2-3   16 bit IP total length

The bit is the smallest unit that can be represented by a computer; it can have a value of either 0 or 1. A
byte is composed of 8 bits. Byte counting begins at byte 0, and all successive bytes fall on 8 bit
boundaries. udp[0:2] specifies the span of the UDP datagram that begins at byte 0 and is two bytes long.
Bit masking, which combines boolean arithmetic with binary/hexadecimal values, will help “isolate” bits.

Decimal/Binary Representations
Base 10 Arithmetic - Decimal
    10^2  10^1  10^0
      2     6     5
    = 2x100 + 6x10 + 5x1 = 265
Base 2 Arithmetic - Binary
    2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
    128   64   32   16    8    4    2    1
      1    0    0    0    0    0    0    1
    = 1x128 + 1x1 = 129

Because decimal is our native number system, we don’t have to do any conversion to understand the value
of a number. But if you examine a number, you realize that each digit has a value based on its placement.
The digits to the right are least significant and carry the least value; the digits to the left are most
significant and carry the most value. Moving left, each position represents the next higher power of the
base, 10.
The same theory applies when we are dealing with binary, or base 2. Instead of using exponents of 10, we
use exponents of 2 to work out the decimal representation of the number. Also, because we are talking in
terms of a byte, we use 8 bits, or binary digits, to represent a byte. So, we see above how we convert the
binary number 10000001 to decimal 129.
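
The positional arithmetic above can be checked with a few lines of Python. The function name positional_value is made up for this sketch; Python's built-in int(text, base) performs the same conversion.

```python
def positional_value(digits: str, base: int) -> int:
    """Sum each digit times the power of the base for its position."""
    total = 0
    # Walk from the rightmost digit (power 0) to the leftmost.
    for power, digit in enumerate(reversed(digits)):
        total += int(digit, base) * base ** power
    return total

print(positional_value("265", 10))       # 2x100 + 6x10 + 5x1 = 265
print(positional_value("10000001", 2))   # 1x128 + 1x1 = 129
print(int("10000001", 2))                # built-in conversion, also 129
```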

Binary/Hex Conversion
Base 2 Arithmetic - Binary
    2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
    128   64   32   16    8    4    2    1
      1    0    0    0    0    0    0    1
    = 1x128 + 1x1 = 129
Base 16 Arithmetic - Hexadecimal
    2^3  2^2  2^1  2^0     2^3  2^2  2^1  2^0
      1    0    0    0       0    0    0    1
4 binary bits represent one hex character. 1000 0001 binary is 81 hex. To denote hex we use the
0x prefix: 0x81.
    81 hex = 8x16^1 + 1x16^0 = 129

If you consider a byte as two hexadecimal characters, each character is 4 bits long. So 16 different
values (0 through 15) can be represented by one hex character: if all bits of a 4-bit chunk (nibble) are
turned on, or set to 1, the maximum value is 15 (8 + 4 + 2 + 1). Counting in hex goes from 0 to 9, then
10 = a, 11 = b, 12 = c, 13 = d, 14 = e, 15 = f.
The leftmost bits are called the high-order bits; they have the most value. The rightmost bits are
referred to as the low-order bits. The same holds true for bytes: the leftmost are known as high-order
bytes and the rightmost as low-order bytes.
Remember from arithmetic that any non-zero number raised to the power 0 is 1.
Terminology:
Byte = 8 bits
Nibble = 4 bits
Hex char = 4 bits
Word = 32 bits
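
As a small sketch of the nibble view of a byte, the Python below splits the byte 129 into its two 4-bit halves and prints the matching hex characters. The variable names are illustrative only.

```python
byte_value = 0b10000001            # 1000 0001 binary = 129 decimal

# Dividing by 16 separates the high-order nibble from the low-order nibble.
high_nibble, low_nibble = divmod(byte_value, 16)

print(high_nibble, low_nibble)     # 8 1
print(format(byte_value, "02x"))   # 81
print(hex(byte_value))             # 0x81
print(high_nibble * 16**1 + low_nibble * 16**0)  # 129
```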

Hexadecimal Representation
    2^3 2^2 2^1 2^0   (Hex)        2^3 2^2 2^1 2^0   (Hex)
     0   0   0   0  =  0            1   0   0   0  =  8
     0   0   0   1  =  1            1   0   0   1  =  9
     0   0   1   0  =  2            1   0   1   0  =  10 (a)
     0   0   1   1  =  3            1   0   1   1  =  11 (b)
     0   1   0   0  =  4            1   1   0   0  =  12 (c)
     0   1   0   1  =  5            1   1   0   1  =  13 (d)
     0   1   1   0  =  6            1   1   1   0  =  14 (e)
     0   1   1   1  =  7            1   1   1   1  =  15 (f)

When representing hexadecimal, we have a numbering system whose digits go from 0 to 15. The problem
is representing the values above 9 with a single character so that we can tell decimal and
hexadecimal apart. A value of 10 decimal is different from 10 hexadecimal, which has a value of 16
in decimal. So, for values above 9, we use letters to represent 10 through 15, as you can see in the
second column above. The letters in parentheses are the hexadecimal representations of the decimal
numbers.

Figuring Out Decimal Values for Hex Output
Use reference to discover where fields start and end
Each character in the hex output is a power of 16
Start at the rightmost character and increase power of 16
Multiply each character by its power of 16, add all values
First 8 bytes of hexadecimal output of a UDP header:
    0089 0089 004c 1fd7
    Field         Hex characters   Powers of 16
    Source Port   0  0  8  9       16^3 16^2 16^1 16^0
    Dest Port     0  0  8  9       16^3 16^2 16^1 16^0
    Length        0  0  4  c       16^3 16^2 16^1 16^0
    Checksum      1  f  d  7       16^3 16^2 16^1 16^0
Source port: 8*16^1 + 9*16^0 = 128 + 9 = 137

When you see hexadecimal output and need to translate it into something coherent, how do you start?
Let’s assume that we are looking at a field or fields that have numeric values; in other words, we are
not looking at a string payload. Let’s use 8 bytes of hexadecimal output from a UDP header to describe
the process of figuring out the decimal values of all the fields.
The first thing that you need to do is identify what you are looking at. Most of the time when you look
at hex output, it will be the entire datagram. In this case, for demonstration purposes, we take an
excerpt of the datagram: the 8 bytes of the UDP header. You’ll need a reference, such as TCP/IP
Illustrated, Volume 1 by Richard Stevens or the references at the back of the course, to identify the
fields in the UDP header. Remember that each character in the output is one hex character (4 bits), so
there are 2 hex characters in a byte. You’ll discover that the UDP header holds a 16-bit source port, a
16-bit destination port, a 16-bit UDP length and a 16-bit checksum. These are all 2 byte fields, or 4
hex characters each, so we divide up the hex output accordingly.
Next, start with the rightmost hex character of a field and label it 16^0. For each remaining hex
character in that field, move left and increase the power of 16 by one until you reach the leftmost
character in the field. Then multiply each character by the power of 16 above it and add all the values.
Using the source port 0089 as an example, we start with the rightmost character and label it 16^0. We
only have one more character that is non-zero, and we label that 16^1. Now we multiply the rightmost
character, 9, by 16^0 (anything to the 0 power is 1) and get 9. Then we multiply the next character, 8,
by 16^1 (or 16) and get 128. Adding 128 and 9, we arrive at 137, which is the source port typically
associated with NetBIOS name service queries.
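
The same steps can be sketched in Python using the 8 bytes of UDP header output shown above. The function name hex_to_decimal and the field labels are assumptions for this example; struct.unpack from the standard library serves as a cross-check.

```python
import struct

def hex_to_decimal(hex_field: str) -> int:
    """Label characters with powers of 16 from the right, multiply, and add."""
    total = 0
    for power, char in enumerate(reversed(hex_field)):
        total += int(char, 16) * 16 ** power
    return total

hex_output = "0089 0089 004c 1fd7"
names = ["source port", "destination port", "length", "checksum"]

# Each UDP header field is 2 bytes, or 4 hex characters.
fields = dict(zip(names, (hex_to_decimal(h) for h in hex_output.split())))
print(fields["source port"])   # 8*16 + 9 = 137

# Cross-check: unpack four unsigned 16-bit big-endian integers.
assert tuple(fields.values()) == struct.unpack("!HHHH", bytes.fromhex(hex_output))
```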

Your Turn
These are the first four bytes of the IP header:
    4500 0030
Use the reference pages at the end of the course to figure out what the 16-bit total length is in
decimal.

Figure out the decimal value of the 16-bit total length. Use the reference materials at the end of this
course to find a layout of the IP header and where the 16-bit total length falls in it. Once you’ve
found that field, use the methods discussed to figure out the decimal equivalent of the hex value.

Answer
    Hex characters   Field
    4                IP version
    5                IP header length
    0 0              TOS
    0 0 3 0          16-bit total length (16^3 16^2 16^1 16^0)
    3*16^1 = 48
Answer: 48 bytes in the IP datagram

The first thing we do is look at the layout for the IP header. The 16-bit total length field is found
at byte offsets 2 and 3 of the IP header (counting starts at 0). We find a value of 0030 in these 2
bytes. So, we methodically label all the hex digits in this field as powers of 16, starting with 16^0
at the rightmost digit, 0. Because we only have one non-zero value in the IP length field, we really
only need to figure out its value.
The non-zero value of 3 is located in the 16^1 position. So, we simply multiply 3*16 and discover
that the IP length is 48 bytes.

The Problem: Looking at Fields Less Than a Byte
Layout of first byte
    4 bit IP version | 4 bit header length
    0 1 0 0            Current value in IP version
    0 0 0 0            Desired value in IP version

We run into a slight problem when we deal with fields in an IP datagram that are less than a byte in
length. The first byte of the IP header is actually two different fields: a 4 bit IP version and a 4 bit
header length. If we use the protocol[displacement] notation, ip[0] covers both fields. What if we
wanted to look at the 4 bit IP header length only and were not interested in the 4 bit IP version?
No simple operation native to the tcpdump “language” lets us do this. But we can manipulate fields and
bits in a way that lets us look at the 4 bit header length only. In essence, if we can zero out all the
bits in the IP version field, then ip[0] holds just the 4 bit header length. How exactly do we zero out
this high-order nibble and preserve the low-order nibble that holds the 4 bit header length? This is
what we will discuss next.

More Fundamentals
• Individual bit or a range of bits selected by bit masking
• Uses the boolean AND operation to keep or discard a bit(s)
• Two bits are AND’ed; the following values yield the following results
    BIT A   AND   BIT B   =   RESULT
      0             0            0
      1             0            0
      0             1            0
      1             1            1

We will use the boolean AND operation to help us zero out unwanted bits. Let’s look at the
fundamentals of applying this theory.
Because we are dealing with computers that talk in binary, we consider every combination of the only
two possible bit values, 0 and 1. As you can see from the truth table above, the only time the
resulting value is 1 is when both bits that are AND’ed are 1.
If you imagine “BIT A” as the bit found in the original byte and “BIT B” as a mask value AND’ed with
“BIT A”, we can determine the appropriate mask value to either discard or preserve an original bit:
a mask bit of 0 always yields 0, discarding the original bit, and a mask bit of 1 yields the original
bit unchanged.
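
The truth table can be reproduced with Python's & operator, and applying it to a whole byte shows how a mask discards or preserves bits. The sample byte 0x45 is the first IP header byte from the earlier exercise. The mask 0x0f (binary 0000 1111) is this sketch's choice for keeping only the low-order nibble, on the reasoning that a mask bit of 0 discards a bit and a mask bit of 1 preserves it.

```python
# Truth table for AND: the result is 1 only when both bits are 1.
for bit_a, bit_b in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(bit_a, "AND", bit_b, "=", bit_a & bit_b)

first_byte = 0x45    # 0100 0101: IP version 4, header length 5
mask = 0x0f          # 0000 1111: zero out the high-order nibble

header_length = first_byte & mask
print(format(header_length, "08b"))  # 00000101
print(header_length)                 # 5
```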
