Research proposing several packet classification and conflict detection algorithms for developing high-performance firewalls (English summary)

MINISTRY OF EDUCATION AND TRAINING

MINISTRY OF NATIONAL DEFENCE

ACADEMY OF MILITARY SCIENCE AND TECHNOLOGY

VŨ DUY NHẤT

PROPOSAL ALGORITHMS FOR PACKET CLASSIFICATION
AND CONFLICT DETECTION IN RULE SET TO DEVELOP
HIGH-PERFORMANCE FIREWALL

Major: Mathematical foundation for informatics
Major code: 9 46 01 10

SUMMARY OF PH.D THESIS

HA NOI – 2019


The dissertation has been accomplished at: Academy of Military Science
and Technology – Ministry of Defense

Supervisor:
1. Ph.D Nguyen Manh Hung
2. Ph.D Thai Trung Kien
Reviewer 1: Assoc. Prof. PhD Nguyễn Long Giang
Information Technology Institute
Vietnam Academy of Science and Technology
Reviewer 2: Assoc. Prof. PhD Ngô Thành Long
Military Technical Academy
Reviewer 3: Assoc. Prof. PhD Nguyễn Ngọc Hóa
VNU University of Engineering and Technology

The thesis will be defended before the PhD thesis
examination committee at the Academy of Military Science and
Technology at … hour on …....

The thesis could be found at:
- The Library of Academy of Military Science and Technology
- The National Library of Vietnam.


INTRODUCTION
1. Dissertation's necessity
Today, computer networks are growing rapidly in connectivity, types of
services, and number of users. This growth is accompanied by advanced
transmission technologies, resulting in huge amounts of data being exchanged
over the network. A firewall is an access control device located at the
connection point between the network to be protected and an external network,
in order to secure the protected network. It does so by checking all packets
passing through it in both directions, inbound and outbound, against a security
policy set by the administrator.
Because of its function and location, the firewall becomes a barrier between
the protected network and other networks. The device affects the network
system in two ways: it secures the system by controlling the legality of passing
packets, and it reduces the speed of information exchange between the protected
network and external networks. A high-performance firewall strengthens the
protection of the internal network while limiting the degradation of the speed
of information exchange through it.
Researchers both at home and abroad have carried out many projects to
improve the performance of firewalls to meet usage requirements. Each solution
has its own advantages and disadvantages and often solves only a small part of
the problem of improving device performance; no solution is truly optimal and
general. Firewall performance has needed, and will continue to need,
enhancement to meet actual demands. That is why we selected this research
problem for the thesis.
2. Objectives of the research
The thesis has the following objective: to propose new techniques for
packet classification and for detecting conflicts in rules that improve the speed
of packet classification, and thereby to develop a high-performance firewall.
3. Scope, object and method of research
The scope of the thesis is software improvements, more specifically packet
classification algorithms that improve the throughput of the firewall.
The objects studied directly in the thesis are: the data structure of rules and
the classification algorithms based on that structure; and techniques to minimize
the average classification time for each packet on the firewall.
The thesis uses a combination of theoretical research and experimental
simulation.
4. The meaning of the research topic
Improving performance is an indispensable requirement for firewalls to
meet actual demands. Analyzing, evaluating and proposing solutions to improve
the performance of firewalls is an area of interest to domestic and foreign
researchers. The research contents of the thesis will be the basis for us to
master and develop firewalls that meet the security demands of network
systems in general and of national security network systems in particular.
5. The composition of the thesis
The dissertation consists of 4 chapters along with the introduction,
conclusion, list of published scientific papers and articles of the PhD student,
and appendices.


CHAPTER 1. OVERVIEW OF PACKET CLASSIFICATION ON
FIREWALL
1.1. Concepts about the firewall
This section covers the definition and development history of the firewall,
and the features and types of firewalls.
1.2. Performance and relationship to the packet classification process of
the firewall
The performance of a firewall is evaluated according to the criteria of "RFC 3511: Benchmarking Methodology for Firewall Performance", in which the IP
throughput criterion is defined first. This criterion is directly related to the
speed of packet classification in the firewall. Improving the speed of packet
classification on a firewall therefore also improves the performance of the
device.
1.3. Research fields to improve packet classification speed on the
firewall
1.3.1 Researches in the field of hardware

Recent hardware solutions fall into the following basic forms: FPGA
technology; ASIC technology; exploiting GPU computing power; developing
specialized network microprocessors; and parallel processing techniques
(Figure 1.4).

Figure 1.4 Researches in the field of hardware (diagram: "Improve hardware performance" branches into FPGA technology, ASIC technology, GPU computing power, specialized network microprocessors, and parallel processing techniques)

Each proposed hardware approach to enhancing firewall performance has its
advantages and disadvantages. However, building a high-performance firewall
based entirely on the hardware improvements above is very difficult in practice.
1.3.2 Researches in software field

The classification process of a firewall involves two components: the
classification algorithm and the rule set used for classification. The properties
of these two components directly affect the speed of packet classification, and
software research on improving that speed targets them accordingly. The two
research directions in this area are shown in Figure 1.5.

Figure 1.5 Researches in software field (diagram: "Improve software performance" splits into "Develop algorithms, classification techniques" and "Optimize the rule set"; the leaves are "Optimize the classifying time and memory storage in the worst case", "Early packet rejection", "Optimize the way of checking in the classification process", and "Detect and resolve conflicts")
1.3.3 Domestic researches


The development of high-performance firewalls has not been studied in
Vietnam. Domestic research on firewalls covers only: mastering and developing
firewalls with basic features and cryptographic integration; and deploying
firewalls in network models to ensure system security.
1.3.4 Determine the research directions in the thesis

New proposals are made for all steps and stages of the packet
classification process (Figure 1.10).

Figure 1.10 Improvements in the packet classification of the thesis (diagram: input packets and the rule set enter the packet classification module, which outputs classified packets; the three improvements are "Optimizing the rule set: detect and resolve conflicts", "Improved classification algorithm", and "Technical proposal for early packet rejection")

1.4. Conclusion of Chapter 1
Improving the performance of firewalls is an important requirement for
ensuring network security in the context of today's increasing demand for
information exchange. With the goal of "Proposing new techniques in packet
classification and detection of conflicts in firewall rules to improve the speed of
packet classification from which to develop high-performance firewalls", the
thesis focuses on the data structure of rules and the classification algorithms
based on that structure, and on techniques to minimize the average
classification time for each packet on the firewall. The solution is designed to
improve firewall performance with new proposals associated with each step of
the packet classification process: detecting and handling conflicts in the firewall
rule set (optimizing the input parameters of the classification problem); early
packet rejection against DoS attacks on default rules (reducing the average
classification time under attack); and improving the efficiency of the
classification process with new data structures and algorithms. These proposals
are presented in the following chapters of the thesis.


CHAPTER 2. CLASSIFICATION ALGORITHM ON FIREWALL
2.2. The basic concepts
Rule set: Each rule set consists of many rules, each of which has three
main parameters (Filter F; Action A; Rule index).
Filter: Each filter F contains the values that the fields must satisfy. Each field
can be represented as a range or as an (address/mask) pair.
2.3 Proposed packet classification algorithm based on Multi-Way
Priority – MWP trie.
2.3.3. Main ideas and definitions

2.3.3.1. Main ideas

Based on the Priority Trie (PT) [43] and the JA-trie [10], we build the
Multi-Way Priority (MWP) trie with the following characteristics:
- The MWP trie is built for one-dimensional packet classification (source
or destination IP address); the data stored on the trie is given as prefixes.
- Classification on the MWP trie returns the longest prefix matching the
input packet (BMP, Best Matching Prefix).
- The length of the prefix stored at a node is always greater than or equal
to the length of the prefixes stored in its child nodes. The search ends as soon
as the address matches the prefix at a node.
- MWP is a multi-way trie. Each node has multiple child nodes, where the
ith child node contains a prefix whose first i bits coincide with the first i bits
of the prefix stored in its parent node.
2.3.3.2 Definitions and theorems

DEFINITION 2.1. Degree of a prefix.
Consider prefixes P and Q, where the length of P is l and the length of Q is t.
Q is called an n-degree prefix of P if and only if the following three conditions
are satisfied:
- t ≤ l;
- The first n bits of Q coincide with the first n bits of P;
- The (n+1)th bit of Q is different from the (n+1)th bit of P.
Denote Q = Ln(P).
DEFINITION 2.2. Degree of a set of prefixes.
Let G be a set of prefixes. G is the nth-degree set of prefix P if and only if
every prefix Q of G satisfies Q = Ln(P).
Denote G = Sn(P).

DEFINITION 2.3. The biggest prefix.
Let G be a set of prefixes. P is the biggest prefix of G if and only if ∀Q ∈
G (Q ≠ P), the length of Q is less than or equal to the length of P.
THEOREM 2.1. Let G be a set of prefixes (G does not contain two identical
prefixes) and let P be the biggest prefix of G. If an IP address matches P, then P
is the Best Matching Prefix of that IP address.
THEOREM 2.2. Given two sets of prefixes G1, G2 and a prefix P, in which
G1 = Si(P), G2 = Sj(P) and i ≠ j: if an IP address matches a prefix P1 (P1 ∈ G1),
then there is no prefix P2 ∈ G2 such that P2 matches that IP address.
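
Definition 2.1 and Theorem 2.2 can be illustrated with a few lines of Python. This is only a sketch: the function names `degree` and `matches`, and the representation of prefixes and addresses as strings of '0' and '1', are assumptions made for the illustration and do not come from the thesis.

```python
from typing import Optional


def degree(q: str, p: str) -> Optional[int]:
    """Return n such that Q = Ln(P), or None if Q has no degree with respect to P.

    Prefixes are strings of '0'/'1' (an assumed representation).
    """
    if len(q) > len(p):
        return None  # Definition 2.1 requires t <= l
    n = 0
    while n < len(q) and q[n] == p[n]:
        n += 1
    if n == len(q):
        return None  # Q is itself a prefix of P: there is no differing (n+1)th bit
    return n


def matches(addr: str, prefix: str) -> bool:
    """True if the address bit string starts with the prefix."""
    return addr.startswith(prefix)


p = "110101"
g1 = [q for q in ["1100", "11001"] if degree(q, p) == 3]  # S3(P)
g2 = [q for q in ["10", "101"] if degree(q, p) == 1]      # S1(P)
addr = "11001111"
# Theorem 2.2: addr matches a prefix in S3(P), so no prefix in S1(P) can match it.
print(any(matches(addr, q) for q in g1), any(matches(addr, q) for q in g2))
```

A prefix Q that is itself a prefix of P has no degree under this reading; in the MWP node described next, such prefixes are the ones recorded through the Backtrack field.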
2.3.4 Structure of MWP trie
2.3.4.1 Node structure

Each node of the MWP trie is shown in Figure 2.1 and has the following
characteristics:
- Each node N stores a prefix P.
- Node N has a Backtrack field, used when there is a prefix Q that is a
prefix of P. In this case, we do not need to create a node to store Q; we
simply set the Backtrack field to the length of Q.
- Each node has a maximum of k child nodes (k = 32 with IPv4, k = 128
with IPv6).
- The length of the prefix stored in a child node is always less than or
equal to the length of the prefix stored in its parent node.
- The mth child of node N is the node that contains the biggest prefix of
the m-degree prefix set of P.

Figure 2.1 N-node structure of the MWP trie (diagram: node N holds prefix P and its Backtrack field; its children hold Max(S0(P)) with Backtrack-0, Max(S1(P)) with Backtrack-1, ..., Max(Sw(P)) with Backtrack-w)

2.3.4.2 Node construction algorithm
The procedure for building a node of the trie takes as input a set of prefixes
that all have the same degree with respect to the prefix stored at the parent
node.

Figure 2.4 Node building algorithm on MWP trie (flowchart; recoverable steps):
1. Input: the set of prefixes G; the length in bits of the IP address, W.
2. If G is empty, finish.
3. Prefixlongest = Max(G); i = W;
   node.key = [value of Prefixlongest] left-shifted by (W − length of Prefixlongest) bits;
   node.len = length of Prefixlongest.
4. Gi = Si(Prefixlongest); BuildNode(node.children[i], Gi); i = i − 1. This step is repeated while the loop test comparing i with 1 holds (the comparison operator is lost in the source).
5. UpdateBacktrack(node); finish.

2.3.4.3 Packet classification algorithm

Algorithm 2.2 classifies packets given an input IP address. Classification
proceeds as follows:
- The classification process starts from the root node of the trie.
- At each node, the IP address is compared with the stored prefix:
  - If it matches, the search ends and the longest matching prefix is the
    prefix stored in the node.
  - Otherwise:
    - If the node has no child node, the longest matching prefix is given
      by the Backtrack value.
    - Otherwise, the leading bits of the IP address are compared with the
      leading bits of the prefix stored in the node to select the branch to
      the next node.

Figure 2.6 Packet classification algorithm on MWP trie (flowchart; recoverable steps: input IP; Node = ROOT, BMP = 0; while Node is not NULL, pos = GetMatchPrefix(addr, node.key) is compared with node.len; on a match BMP = node.len, otherwise node = node.mChildren[pos]; a final test on BMP can set BMP = Node.Backtrack; return BMP)
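
The construction and lookup described in this section can be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis implementation: prefixes and addresses are strings of '0'/'1', children are kept in a dictionary keyed by degree, and the single Backtrack field is replaced by a hypothetical `covered` list holding the lengths of all stored prefixes that are prefixes of the node's own prefix (the summary does not give the details of UpdateBacktrack). The names `Node`, `common_len`, `build` and `lookup` are likewise invented for the example.

```python
from typing import Dict, List, Optional


class Node:
    def __init__(self, prefix: str) -> None:
        self.prefix = prefix                   # biggest prefix of the group
        self.covered: List[int] = []           # lengths of stored prefixes of self.prefix
        self.children: Dict[int, "Node"] = {}  # degree n -> node built from Sn(prefix)


def common_len(a: str, b: str) -> int:
    """Number of leading bits shared by a and b."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n


def build(prefixes: List[str]) -> Optional[Node]:
    """Build a node from a set of distinct prefixes (recursively for each degree)."""
    if not prefixes:
        return None
    longest = max(prefixes, key=len)
    node = Node(longest)
    groups: Dict[int, List[str]] = {}
    for q in prefixes:
        if q == longest:
            continue
        n = common_len(q, longest)
        if n == len(q):
            node.covered.append(n)              # q is a prefix of longest: no node needed
        else:
            groups.setdefault(n, []).append(q)  # q has degree n with respect to longest
    for n, group in groups.items():
        child = build(group)
        if child is not None:
            node.children[n] = child
    return node


def lookup(root: Optional[Node], addr: str) -> int:
    """Return the length of the best matching prefix of addr (0 if none)."""
    best = 0
    node = root
    while node is not None:
        pos = common_len(addr, node.prefix)
        if pos == len(node.prefix):
            return pos  # the node's prefix matches: it is the BMP (Theorem 2.1)
        # stored prefixes of node.prefix no longer than pos also match addr
        best = max([best] + [c for c in node.covered if c <= pos])
        node = node.children.get(pos)  # only the group of degree pos can match (Theorem 2.2)
    return best


root = build(["0", "01", "0110", "10", "1011", "111"])
print(lookup(root, "01110000"))  # 2, the prefix "01"
print(lookup(root, "11100000"))  # 3, the prefix "111"
```

The search never goes back up the trie: at each node only one child, selected by the number of matching leading bits, can contain a matching prefix.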

2.3.4.4 Evaluation of complexity

A comparison of the complexity of the three structures is shown in Table
2.5.
Table 2.5. Comparing the complexity of MWP structure with PT and JA-trie
Structure   Search complexity   Storage complexity
JA-Trie     O(W/k)              O(2^k NW/k)
MWP         O(W)                O(NW)
PT          O(W)                O(NW)

2.4 Conclusion of chapter 2
The MWP structure overcomes the limitations of average height and wasted
memory of the PT trie, removes the restrictions on prefix length of the JA-trie
structure, and eliminates the backtracking operation that current classification
algorithms need in some cases.
Packet classification algorithms based on the MWP structure can be used by
firewalls and routers. Test results show that the speed of classification with the
MWP structure can exceed 40 million packets per second. This speed can be
improved on more powerful hardware platforms or dedicated hardware. Given
the current trend toward parallel processing techniques, the simplicity of the
data structure and the efficiency of MWP's search and classification process
are advantages for deploying MWP in practice.
The firewall protects the intranet from external attacks, and the packet
classification algorithm on the MWP trie structure proposed in this chapter
improves the performance of the device. However, the firewall itself is subject
to direct attacks, so it must be able to protect itself against them. The next
chapter of the thesis presents the proposed technique for early packet rejection
on firewalls to limit a form of DoS attack on this device, thereby improving the
device's performance under attack.



CHAPTER 3. EARLY PACKET REJECTION ON THE FIREWALL
3.2. Proposed early packet rejection technique based on the
combination of fields
3.2.1 The idea of early packet rejection by combining fields

Observation points:
- The rules in a firewall can be divided into two groups: rules whose
action is to prohibit ("DENY") and rules whose action is to allow ("ACCEPT").
A packet that satisfies a rule of the "DENY" group will not satisfy any of the
"ACCEPT" rules and vice versa. Let CAccept be the condition for a packet to be
accepted (built from the set of "ACCEPT" rules); a packet that does not meet
CAccept will be denied. Thus, to reject packets, we can build the condition
NOT(CAccept) and check packets against it. The problem is how to build and
use the NOT(CAccept) condition effectively in packet classification on firewalls.
- In packet classification algorithms, checking must be performed on all
fields used for classification. The fields can be checked in parallel or
sequentially, but in either form the classification on each field costs resources
and time. The classification time is proportional to the number of dimensions.
If we reduce the number of dimensions to be checked, we reduce the cost of the
classification process.
Based on these observations, the idea of the proposed early packet rejection
technique is as follows:
- Reduce the number of dimensions checked for each packet arriving at the
early filter module. Instead of checking multiple fields, combine the original
fields into one field.
- Develop a rule set for early packet rejection on the combined field (build
the NOT(CAccept) condition).
- Use a balanced tree structure (B-tree, AVL tree, red-black tree) to store
the early packet rejection rules and filter incoming packets.
3.2.2 Early packet rejection using COM combining operations in two
dimensions
3.2.2.1 Combining COM operations

COM is the combination of the source address field and the destination
address field of a firewall rule into a single field according to the association
rules we propose; the result is called a COM combination.

Figure 3.1. How to create a COM prefix (diagram: a source IP address prefix of s bits and a destination IP address prefix of d bits, with s < d, are combined into a COM prefix of length s)

COM operation: Rule Ri has a source IP prefix of length s bits and a
destination IP prefix of length d bits, with d > s (Figure 3.1). The preCOM
prefix consists of s values generated by combining the s bits of the source IP
prefix with the first s bits of the destination IP prefix: the jth bit of the source
IP prefix is associated with the jth bit of the destination prefix to form the jth
value of the preCOM field (j = 0..s-1) according to the rules in Table 3.1.
Table 3.1. COM association rules
Case     Source IP prefix bit   Destination IP prefix bit   COM prefix value
Case 1   0                      0                           0
Case 2   0                      1                           1
Case 3   1                      0                           2
Case 4   1                      1                           3
3.2.2.2 Use the COM field in packet classification

- Definition 3.1: The value range of the COM prefix
The value range of the COM prefix preCOM of length l is defined as the
value range […]
- Definition 3.2: COM field of the packet
Let the packet Pkt have source IP address sIP and destination IP address
dIP. The COM field of Pkt is denoted by fCOM and calculated as follows:
fCOM = [sIP] COM [dIP]   (3.1)
- Theorem 3.1: If the packet Pkt has a source IP address sIP that matches
the source IP prefix preSIP and a destination IP address dIP that matches the
destination IP prefix preDIP of rule Ri, then the value of the fCOM field of
Pkt belongs to the range of the preCOM prefix.
- Theorem 3.2: If the packet Pkt has an fCOM field that does not belong
to the range of the preCOM of rule Ri, then either the sIP of Pkt does not
satisfy the preSIP prefix of Ri or its dIP does not satisfy the preDIP prefix
of Ri.
- Definition 3.3: Relationship between value ranges
Suppose there are two separate or overlapping ranges [a, b] and [x, y]. Then:
  - [a, b] < [x, y] if b < x
  - [a, b] = [x, y] if a = x and y = b
  - [a, b] > [x, y] if a > y
  - [a, b] ∈ [x, y] if x ≤ a and b ≤ y
  - [a, b] <| [x, y] if a < x < b
  - [a, b] >| [x, y] if x < a < y < b
3.2.2.3 Build early packet rejection rules from COM fields

The fCOM field is used to construct the early packet rejection rules, that is,
to build the NOT(CAccept) condition. Let R be the set of all "ACCEPT" rules in
the firewall rule set, containing n rules. Figure 3.2 shows how to build early
packet rejection rules based on the fCOM field of the rules belonging to R.

Figure 3.3. How to build early packet rejection rules based on fCOM (diagram: the COM value axis from 0 to 4^32 − 1 with the ranges COM1, COM2, COM3, ..., COMn marked on it, and the Ф set)

Let Ф be the set of all values covered by the early packet rejection rules
based on the fCOM field. Ф is defined by the formula:
$\Phi = \{x : x \in COM \text{ and } x \notin COM_i \;\; \forall i \in [1, n]\}$   (3.3)
In which COMi is the set of values defined by the prefix preCOMi, and
preCOMi = [preSIPi] COM [preDIPi]


3.2.2.4 Algorithm to build early packet rejection rules based on the fCOM

The algorithm to build the Φ set is implemented on a balanced tree structure
(AVL tree, red-black tree or B-tree). The key stored at each node of the tree is
a value range [a, b]. The tree construction algorithm includes the following steps:
Step 1: Create the root node of the balanced binary tree with the key interval
[0, MAX] (if the source and destination IP addresses are 32 bits long, then
MAX = 4^32 − 1).
Step 2: Construct the COMi prefix of the corresponding rule and convert it
into the range [x, y].
Step 3: Insert the range [x, y] into the tree.
When inserting the range [x, y], suppose the node N under consideration
contains the range [a, b]. Perform the insertion according to the rules in Table
3.3, based on the relationship between the two ranges as defined in
Definition 3.3.
Table 3.3. Rules for inserting the range [x, y] into node N
Case   Condition         Action
i      [a,b] ∈ [x,y]     Remove node N from the tree; insert [x,a-1], [b+1,y] into the tree
ii     [a,b] = [x,y]     Remove node N from the tree
iii    [a,b] < [x,y]     Insert [x,y] into the right child node of N
iv     [a,b] > [x,y]     Insert [x,y] into the left child node of N
v      [a,b] <| [x,y]    Replace [a,b] on N with the range [a,x-1]; insert [b+1,y] into the right child node of N
vi     [a,b] >| [x,y]    Replace [a,b] on N with the range [y+1,b]; insert [x,a-1] into the left child node of N

Step 4: Go back to Step 3 until the final rule.
3.2.2.5 Early packet rejection with fCOM field

When a packet Pkt arrives, its fCOM field is calculated from the source and
destination IP addresses and converted to a value P, and the balanced tree is
searched with key P. If P is found to lie within the value range of a node of the
tree, Pkt is rejected immediately; otherwise Pkt must be classified by the
original classification module.
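
The effect of building the Φ set and querying it can be sketched without a balanced tree. The example below is an assumption-laden simplification: instead of the AVL, red-black or B-tree and the insertion rules of Table 3.3, it keeps Φ as a sorted list of disjoint closed ranges, subtracts each "ACCEPT" range from it, and answers queries by binary search with the standard `bisect` module. The function names and the toy value space [0, 99] are invented for the illustration.

```python
import bisect
from typing import List, Tuple

Range = Tuple[int, int]


def subtract(ranges: List[Range], x: int, y: int) -> List[Range]:
    """Remove [x, y] from a sorted list of disjoint closed ranges."""
    out: List[Range] = []
    for a, b in ranges:
        if b < x or a > y:
            out.append((a, b))      # no overlap with [x, y]: keep unchanged
            continue
        if a < x:
            out.append((a, x - 1))  # the part to the left of [x, y] survives
        if b > y:
            out.append((y + 1, b))  # the part to the right of [x, y] survives
    return out


def build_reject_set(accept_ranges: List[Range], max_value: int) -> List[Range]:
    """Start from [0, MAX] and remove the range of every ACCEPT rule."""
    phi: List[Range] = [(0, max_value)]
    for x, y in accept_ranges:
        phi = subtract(phi, x, y)
    return phi


def rejected_early(phi: List[Range], value: int) -> bool:
    """True if value falls inside one of the rejection ranges."""
    i = bisect.bisect_right(phi, (value, float("inf"))) - 1
    return i >= 0 and phi[i][0] <= value <= phi[i][1]


phi = build_reject_set([(10, 19), (15, 30), (50, 50)], max_value=99)
print(phi)
print(rejected_early(phi, 5), rejected_early(phi, 20))
```

A packet whose combined value is not rejected here still has to pass through the original classifier, exactly as in the paragraph above.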
3.2.3 Early packet rejection using XOR operation combining multiple fields

This technique differs from the COM technique in two respects:
- It can be applied to more than two fields.
- It uses the XOR operation to improve classification speed.

3.2.3.1 XOR combination

The fXOR field is constructed by the formula:
$fXOR = preSIP(n) \;\mathrm{XOR}\; preDIP(n) \;\mathrm{XOR}\; preDPort(n)$   (3.4)
In which n is:
$n = \min\{length(preSIP),\; length(preDIP),\; length(preDPort)\}$   (3.5)

Figure 3.9 Combining fields with XOR (diagram, panels (a) and (b): source IP prefix XOR destination IP prefix XOR source port = XOR prefix)

3.2.3.2 Use fXOR field in early packet rejection

- Definition 3.4: XOR field of packet
For the packet Pkt whose source IP address is sIP, destination IP address is
dIP and source port is dPort, the XOR field of Pkt is denoted by fXOR and
calculated as follows:
fXOR = [sIP] XOR [dIP] XOR [dPort with m '0' bits appended on the right]   (3.6)
In which m = len(sIP) − len(dPort).
- Definition 3.5: Range of values of the XOR prefix
Suppose preXOR has length l. The range of values of the preXOR prefix is
defined as the range of the values […] of the binary string preXOR […]. We
denote (xyz)2 as the value of the number xyz in the base 2 system […] string
of (32 − l) number '1'. Let V be the value of the preXOR binary string; then
preXOR has the following range of base 10 values:
$[\,V \times 2^{32-l},\;\; V \times 2^{32-l} + 2^{32-l} - 1\,]$

- Theorem 3.3: If a packet Pkt has an fXOR field (calculated according to
formula 3.6) that does not belong to the range of the preXOR field of rule R,
then Pkt does not match R.
- Definition 3.6: Early packet rejection rule set
Let Q be the set of all values in the value space of the field fXOR, and let A
be the union of all the values defined by the XOR prefixes of the "ACCEPT"
rules. The early packet rejection rule set consists of the values of the set D
determined by the formula:
$D = Q \setminus A$   (3.7)
- Theorem 3.4: When a packet Pkt has an fXOR field that belongs to set D,
it does not satisfy any "ACCEPT" rule.
Building and using the set D is similar to building and using the set Ф with
the COM operation.
In addition to early packet rejection, the fXOR field can also be used directly
in packet classification. This is researched and presented in the thesis.
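
Formulas 3.4 to 3.6 and the range of Definition 3.5 can be checked numerically. The following is a sketch under assumptions: prefixes are represented as hypothetical (value, length) pairs, IP addresses are 32-bit integers, the port is a 16-bit integer, and all function names are invented for the example. The sample addresses and the rule are made up.

```python
from typing import Tuple

Prefix = Tuple[int, int]  # (prefix bits as an integer, prefix length in bits)


def first_bits(prefix: Prefix, n: int) -> int:
    """The first n bits of a prefix, as an integer."""
    value, length = prefix
    return value >> (length - n)


def rule_pre_xor(pre_sip: Prefix, pre_dip: Prefix, pre_dport: Prefix) -> Prefix:
    """preXOR of a rule (formulas 3.4 and 3.5)."""
    n = min(pre_sip[1], pre_dip[1], pre_dport[1])
    v = first_bits(pre_sip, n) ^ first_bits(pre_dip, n) ^ first_bits(pre_dport, n)
    return v, n


def xor_range(pre_xor: Prefix) -> Tuple[int, int]:
    """Value range [V * 2^(32-l), V * 2^(32-l) + 2^(32-l) - 1] of a preXOR prefix."""
    v, l = pre_xor
    span = 1 << (32 - l)
    return v * span, v * span + span - 1


def packet_f_xor(sip: int, dip: int, dport: int) -> int:
    """fXOR of a packet (formula 3.6): the 16-bit port is padded with m = 16 zero bits."""
    m = 32 - 16
    return sip ^ dip ^ (dport << m)


# Made-up rule: source 192.168.0.0/16, destination 10.0.0.0/8, port 443 (all 16 bits)
rule = rule_pre_xor((0xC0A8, 16), (0x0A, 8), (443, 16))
low, high = xor_range(rule)
# Made-up packet matching the rule: 192.168.1.5 -> 10.0.0.1, port 443
f_xor = packet_f_xor(0xC0A80105, 0x0A000001, 443)
print(hex(low), hex(high), hex(f_xor), low <= f_xor <= high)
```

Theorem 3.3 is the contrapositive of what the example shows: a packet that matches all three prefixes always lands inside the range, so a value outside the range rules the match out.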
3.2.4 Evaluate the effectiveness of using a combined field in early packet
rejection
3.2.4.2 Conditions for effective use of combined fields

Let T1 be the average time to classify a packet in the early rejection
module when using the combined field, T2 the average time to classify a
packet in the original classification module of the firewall, M the total number
of packets going through the firewall, and P the fraction of packets that are
rejected early. Then:
The time to classify M packets with the original classification module of the
firewall alone is T2M.
The time to classify M packets passing through the firewall with both the
early rejection module and the original module is T1M + T2(1−P)M.
The condition for early packet rejection to be effective (in terms of time) is:
$T_1 M + T_2 (1 - P) M < T_2 M \;\leftrightarrow\; T_1 M < T_2 M P \;\leftrightarrow\; P > \frac{T_1}{T_2}$   (3.8)

According to formula 3.8, the effectiveness of early packet rejection with a
combined field depends on the fraction of packets rejected early. P has practical
meaning only when T1 < T2, since P cannot exceed 1. […] rule set (Φ set with
the fCOM field, D set with the fXOR field). When the number of rules is large,
the early packet rejection technique on the combined field will not be effective.
T1 and T2 are not fixed; they depend on the nature of the data flow going
through the firewall. However, at any given time these values can be
determined. For the early rejection module with the combined field to remain
effective, we need to determine a minimum value of P, Pmin. If P drops to
Pmin, the early rejection module stops working and packets are classified only
by the original classification module.
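
Condition 3.8 can be checked with a few lines of arithmetic. The timings below are invented for illustration only and are not measurements from the thesis; the function names are likewise hypothetical.

```python
def time_with_early_rejection(t1: float, t2: float, p: float, m: int) -> float:
    """Total time T1*M + T2*(1-P)*M when both modules are used."""
    return t1 * m + t2 * (1.0 - p) * m


def time_original_only(t2: float, m: int) -> float:
    """Total time T2*M when only the original classifier is used."""
    return t2 * m


def min_rejection_rate(t1: float, t2: float) -> float:
    """Break-even value of P from formula 3.8: early rejection pays off when P > T1/T2."""
    return t1 / t2


# Hypothetical figures: T1 = 0.2 us, T2 = 1.0 us per packet, M = 1,000,000 packets
t1, t2, m = 0.2, 1.0, 1_000_000
print(min_rejection_rate(t1, t2))
for p in (0.1, 0.5):
    print(p, time_with_early_rejection(t1, t2, p, m) < time_original_only(t2, m))
```

With these figures the break-even rate is T1/T2 = 0.2, so rejecting 10% of packets early makes the firewall slower overall, while rejecting 50% makes it faster.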
3.3. Conclusion of chapter 3
This chapter covered a solution that improves performance as well as the
ability to withstand DoS attacks against the firewall itself, namely early packet
rejection. Based on a study of the nature of early packet rejection and of the
advantages and disadvantages of existing techniques, we have proposed new
early packet rejection techniques that reduce the number of dimensions to be
checked by using COM or XOR combinations, and that use balanced tree
structures to store the rules and reject packets early.
The new contribution is the use of a combination to reduce the number of
dimensions to be checked, which can be applied to early rejection or to normal
packet classification in order to obtain time-efficient results. The correctness of
the proposal has been proved theoretically and confirmed by test results. Of the
two proposed combinations, the XOR combination is more effective thanks to
its execution speed and its applicability to more than two fields.


Chapter 2 and Chapter 3 have proposed new algorithms that take part
directly in the packet classification process on the firewall, thereby improving
the performance of the device. However, packet classification in general is
based on the rule set. The rule set is therefore an input parameter of the
classification process, and optimizing it will increase the classification speed.
This is the content presented in the next chapter of the thesis.


CHAPTER 4. DETECTING AND RESOLVING CONFLICTS IN THE
RULE SET OF FIREWALL
4.2 Some concepts
4.2.2 Rule spaces and their relationships


Each rule R can be represented as R(f1, f2, …, fn, Action), in which fi is the
pattern value of the ith field; pattern values can be given as ranges, prefixes, or
sets of values. Each rule can be considered an entity in n-dimensional space,
and the space of that entity is the rule space.
The relationship between the rule spaces of a rule R and a rule P falls into
one of the four cases shown in Figure 4.1.

Figure 4.1. The spatial relationships of the two rule spaces (diagram: four panels (a) to (d) showing the relative positions of the rule spaces R and P; one panel marks the region R-P)

4.2.3 Types of conflicts in the rule set of firewall

Corresponding to the relationships between rule spaces, there are 4 types of
conflicts that can exist between two rules: Shadowing, Correlation,
Generalization and Redundancy.
4.3 Propose a CDT trie to detect conflicts on the rule set
4.3.1 New definitions

- Definition 4.1: Detail of the field
The detail of the field fn is denoted by |fn| and is determined as follows:
If fn is of prefix type, |fn| is the length of the prefix.
If fn is a range [a, b], |fn| is calculated from the number of values in that
range according to the formula:
$|f_n| = \left(\frac{MAX - (b - a)}{MAX}\right) N$   (4.1)
In which […]
- Definition 4.2: Relationship between two field values
The relationship between the field values V1 and V2 of the field fn is one of
the following types: Duplicate V1 ≈ V2, Include V1 ∈ V2, Intersection V1 § V2,
Separation V1 >< V2.
- Theorem 4.1: Given the field values V1 and V2 of the field fn:
  - A necessary condition for V1 ∈ V2 is |V1| > |V2|
  - A necessary condition for V1 ≈ V2 is |V1| = |V2|
- Theorem 4.2: Given a set of field values V = (V1, V2, …, Vm) of the field
fn, if Vk is the field value with the largest detail in V, then ∀Vi ∈ V (i ≠ k,
1 ≤ i ≤ m) we have Vi ∉ Vk.
In order to store rules in the CDT trie structure, the thesis rebuilds the rule
structure as a list of field data plus the Action. Each field is stored in a unit
record containing the field type, the detail of the field, the index of the rule
and the value of the field.
- Definition 4.3: Relationship between two units u1 and u2 (only used
when u1 and u2 have the same type of field): Coincidence: u1 coincides with
u2, denoted u1 ≈ u2; Include: u1 includes u2, denoted u1 ∈ u2; Intersection: u1
intersects u2, denoted u1 § u2.
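
For fields given as ranges, the relationship types of Definition 4.2 can be computed directly from the endpoints. The sketch below is an illustration under assumptions: field values are closed integer ranges (a, b), "include" is read as "V1 lies inside V2" (the reading consistent with Theorem 4.1, where the included value has the larger detail), and the extra label "contains" for the reverse case is an addition for completeness that is not part of the definition. The function name `relation` is invented.

```python
from typing import Tuple

Range = Tuple[int, int]


def relation(v1: Range, v2: Range) -> str:
    """Classify the relationship between two closed ranges v1 = [a, b], v2 = [x, y]."""
    a, b = v1
    x, y = v2
    if (a, b) == (x, y):
        return "duplicate"     # V1 ≈ V2
    if b < x or y < a:
        return "separation"    # V1 >< V2: no common value
    if x <= a and b <= y:
        return "include"       # V1 ∈ V2: V1 lies inside V2
    if a <= x and y <= b:
        return "contains"      # reverse inclusion (label added for this sketch)
    return "intersection"      # V1 § V2: partial overlap


# Made-up destination port ranges
print(relation((80, 80), (0, 1023)))      # include
print(relation((0, 1023), (1000, 2000)))  # intersection
print(relation((0, 1023), (8080, 8080)))  # separation
```

Because a narrower range has a larger detail under Definition 4.1, processing values in decreasing order of detail means the value under consideration is never reported as strictly "contains" with respect to a later one, which is the practical content of Theorem 4.2.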
4.3.2 The idea of the algorithm

The proposed algorithm builds a CDT trie from the rules in order to
determine the relationships between their rule spaces. The construction of the
CDT trie follows these main principles:
i. The relationship between two rule spaces is examined in each dimension
of that space.
ii. In each dimension of the rule space, the rule with the highest level of
detail is considered in relation to the remaining rules. The rule or group of rules
under consideration can only have relations of the following types with other
rules: Match; Subset; Overlap; Disjoint.
iii. For a rule under consideration, at dimension (i+1): the set of rules that
match it (Match) is selected from the Match set of the ith dimension; the set of
rules that contain it (Super) is selected from the Match and Super sets of the ith
dimension; the set of rules that intersect it (Overlap) is selected from the Match,
Super and Overlap sets of the ith dimension. The rules for transferring a rule
between the sets are as follows:
$(Match)_i \xrightarrow{condition\_match} (Match)_{i+1}$   (4.2)
$(Match)_i \xrightarrow{condition\_super} (Super)_{i+1}$   (4.3)
$(Super)_i \xrightarrow{condition\_super} (Super)_{i+1}$   (4.4)
$(Match)_i \xrightarrow{condition\_overlap} (Overlap)_{i+1}$   (4.5)
$(Super)_i \xrightarrow{condition\_overlap} (Overlap)_{i+1}$   (4.6)
$(Overlap)_i \xrightarrow{condition\_overlap} (Overlap)_{i+1}$   (4.7)
In which "condition_x" is the condition for a rule to move from a set of step
i to a set of step (i+1). Let R be the rule under consideration and R(fm) the mth
field value of R. The conditions under which a rule P is transferred in the
formulas above are:
condition_match: R(fi+1) ≈ P(fi+1)
condition_super: R(fi+1) ∈ P(fi+1)
condition_overlap: R(fi+1) § P(fi+1)
4.3.3 CDT trie structure

The CDT is a multi-way trie built from the unit set of the rules. The
root node contains a list of all the rules in the rule set. In the trie, the path from
the root node to a leaf node represents one complete rule or a set of rules that
meet the specific conditions on that path. Node N carries information about the
field type fn and the detail of that field. The children of N are constructed
according to the field value fn, and the detail of the field stored in N is always
greater than or equal to the detail of the field stored in its child nodes.
4.3.3.1 Node structure

The node of the CDT trie has the structure described in Figure 4.2.

Figure 4.2. Structure of the CDT trie node. The node has the following fields:
TOF          Type of field
DETAIL       Detail of field
M            List of matching rules
S            List of rules whose rule space contains the rule space of the rules of the set M
O            List of rules whose rule space intersects the rule space of the rules of the set M
Childs       List of child nodes
Labels       List of labels corresponding to child nodes
Other Child  Child node containing the rules that do not meet the DETAIL condition

4.3.3.2 Building node
Algorithm 4.1: BuildNode
Input: List of unit Unit-matchs;
       List of unit Unit-supers;
       List of unit Unit-overlaps;
Output: CDTNode N;
Begin
  UMAX = GetMaxUnit(Unit-matchs);
  lstUnit = GetUnits(UMAX, Unit-matchs);
  N.TOF = UMAX.type;
  N.DETAIL = UMAX.detail;
  For each u of lstUnit
  Begin
    ulable = CreateLable(u);
    If ulable not in N.Labels
    Begin
      N.Labels.add(ulable);
      (Unit-matchs)   --condition_match-->   (uMatchs);
      (Unit-matchs)   --condition_super-->   (uSupers);
      (Unit-supers)   --condition_super-->   (uSupers);
      (Unit-matchs)   --condition_overlap--> (uOverlaps);
      (Unit-supers)   --condition_overlap--> (uOverlaps);
      (Unit-overlaps) --condition_overlap--> (uOverlaps);
      CDTNode M;
      BuildNode(M, uMatchs, uSupers, uOverlaps);
      N.Childs.add(M);
      RemoveUnit(Unit-matchs, uMatchs);
    End
  End
  BuildNode(N.OtherChild, Unit-matchs, Unit-supers, Unit-overlaps);
End


4.3.4 Conflict detection on CDT trie

Information about the rule space relationships between rules is contained in
the leaf nodes of the trie. Specifically, at a leaf node N: the rules of the set N.M
have identical rule spaces; the rules of the set N.S have rule spaces that contain
the rule space of the rules in N.M; and the rules of the set N.O have rule spaces
that overlap the rule space of the rules in N.M.
4.3. Conclusion of chapter 4
In this chapter, the thesis has studied the problem of optimizing the rule set
of the firewall to increase the device's performance. In particular, the focus is on
detecting and resolving conflicts in the rule set. Based on an analysis and
evaluation of the strengths and limitations of existing techniques, the thesis
proposes a technique to detect and resolve conflicts in the firewall rule set
using the CDT trie. The study concentrates on how to determine effectively the
rule space relationship between two rules, which serves as the basis for
determining conflicts in the rule set. The CDT structure can be applied in
building a tool that supports system administrators in building new security
policies or checking the security policies deployed on firewalls.

