

A DEEP LEARNING MODEL THAT DETECTS THE DOMAIN
GENERATED BY THE ALGORITHM IN THE BOTNET
Nguyen Trung Hieu(*), Cao Chinh Nghia
Faculty of Mathematics – Informatics and Application of Science and Technology in Crime
Prevention, The People's Police Academy

Abstract: Domain Generation Algorithm (DGA) refers to a group of algorithms that generate
domain names for attack activities in botnets. In this paper, we present a Bi-LSTM deep
learning model based on an Attention mechanism to detect DGA-generated domains. Through
the experimental process, our model has given good results in detecting DGA-generated
domains belonging to the Post and Monerodownloader families. Overall, the F1 measure of the
model in the multi-class classification problem reaches 90%. The macro-average (macro avg)
efficiency is 86% and the weighted average (weighted avg) efficiency is 91%.
Keywords: Bi-LSTM deep learning network; deep learning; malicious URL detection;
Attention mechanism in deep learning.

Received 1 June 2022
Revised and accepted for publication 26 July 2022
(*) Email:

1. INTRODUCTION
Botnet Attacks
The development of the Internet has brought many benefits to users, but it has also become
an environment for cybercriminals to operate in.
A botnet attack is one of the most common attacks. Each member of a botnet is called a bot.
A bot is malicious software created by attackers to control infected computers remotely
through a command and control server (C&C server). The bot has a high degree of autonomy
and is equipped with the ability to use communication channels to receive commands and
update malicious code from the control system. Botnets are commonly used to distribute
malware, send spam, steal sensitive information, run phishing campaigns, or mount large-scale
cyberattacks such as distributed denial of service (DDoS) attacks [1].



The widespread distribution of bots and the connection between bots and control servers
usually require the Internet. The bots need to know the IP address of the control server in
order to access it and receive commands. To avoid detection, command and control servers do
not register static domain names; instead, they continuously change addresses and use
different domains at different intervals. Attackers use a Domain Generation Algorithm (DGA)
to generate different domain names for attacks [2], aiming to mask these command and
control servers.
Identifying the malicious domain behind an attack makes it possible to determine the purpose
of the attack and the tools and malware used, and to take preventive measures that greatly
reduce the damage caused by the attack.
Domain Generation Algorithm
A Domain Generation Algorithm (DGA) can use operators in combination with ever-changing
variables to generate random domain names. The variables can be day, month and year values,
hours, minutes, seconds or other keywords. These pseudo-random strings are concatenated
with a top-level domain (.com, .vn, .net, ...) to generate the domain names. The algorithm of
the Chinad malware, written in Python [3], takes as input a seed consisting of the letters a-z
and the digits 0-9 and combines it with the values of the day, month and year. The results are
combined with the TLDs ('.com', '.org', '.net', '.biz', '.info', '.ru', '.cn') to form the complete
domain names.
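As an illustration, the following is a minimal sketch of such a date-seeded generator. It is our own example in the spirit of Chinad, not the actual Chinad code; the MD5-based hashing scheme, the domain length and the counts are assumptions.

```python
import hashlib
from datetime import date

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"
TLDS = ['.com', '.org', '.net', '.biz', '.info', '.ru', '.cn']

def generate_domains(seed_date: date, count: int = 10, length: int = 16):
    """Derive pseudo-random domain names from a date-based seed."""
    domains = []
    for i in range(count):
        # Hash the (day, month, year, index) tuple to get a deterministic byte stream.
        seed = f"{seed_date.day}-{seed_date.month}-{seed_date.year}-{i}".encode()
        digest = hashlib.md5(seed).digest()
        # Map each byte onto the a-z0-9 alphabet to build the name.
        name = "".join(ALPHABET[b % len(ALPHABET)] for b in digest)[:length]
        domains.append(name + TLDS[digest[-1] % len(TLDS)])
    return domains

print(generate_domains(date(2022, 6, 1), count=3))
```

Because the generator is deterministic in the date, any party that knows the algorithm and the seed can reproduce the same list of domains for a given day.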
Table 1. Some DGA samples

Conficker      gfedo.info; ydqtkptuwsa.org; bnnkqwzmy.biz
Cryptolocker   nvjwoofansjbh.ru; qgrkvevybtvckik.org; eqmbcmgemghxbcj.co.uk
Bigviktor      support.showremote-conclusion.fans; turntruebreakfast.futbol; speakoriginalworld.one
Bamital        cd8f66549913a78c5a8004c82bcf6b01.info; a024603b0defd57ebfef34befde16370.org; 5e6efdd674c134ddb2a7a2e3c603cc14.org
Chinad         qowhi81jvoid4j0m.biz; 29cqdf6obnq462yv.com; 5qip6brukxyf9lhk.ru


A DGA can generate a large number of domains in a short time, and bots can select a small
portion of them to connect to the C&C server. Table 1 shows some examples of domain names
generated with DGAs [4]. The Chinad malware can generate 1,000 domain names per day from
the letters a-z and digits 0-9. Bigviktor combines 3 to 4 different words from 4 predefined
lists (dictionaries) and can generate 1,000 domains per month.

Figure 1. DGA-based botnet communication mechanism

Figure 1 depicts the connection process between the C&C server and the DGA domains [5].
The attacker uses the same DGA and the same initial seeds for the C&C server and the bots,
so both generate the same domain set. The attacker only needs to select one domain name from
the generated list and register it for the C&C server one hour before performing the attack.
The bots on the victims' machines will in turn send resolution requests for the domain names
in the generated list to the Domain Name System (DNS). The DNS system returns the IP
address of the corresponding C&C server, and the bots then communicate with the server to
receive commands. If the C&C server is not found at the current domain, the bots query the
next set of domains generated by the DGA until an active domain name is found [6].
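That bot-side rendezvous amounts to a simple resolution loop. A minimal sketch, assuming the candidate list comes from a generator like the one above:

```python
import socket

def find_cc_server(domains):
    """Try each candidate domain in turn until one resolves."""
    for domain in domains:
        try:
            ip = socket.gethostbyname(domain)   # DNS resolution request
            return domain, ip                   # active C&C domain found
        except socket.gaierror:
            continue                            # unregistered domain: try the next one
    return None                                 # no active domain in this batch
```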

2. MAIN CONTRIBUTIONS OF THE ARTICLE
The main contributions of the paper are:
1 - We introduce a deep learning approach that uses a Bidirectional Long Short-Term Memory
(Bi-LSTM) model based on an Attention mechanism to detect domains created by DGAs. Our
model has previously worked well on the problem of detecting malicious URLs [7].
2 - We present experimental results that show a significant improvement over previous
techniques, using open data sets.
The remainder of the paper is organized as follows: Section 2 presents related studies. Our
deep learning network architecture and solution are presented in Section 3. Section 4 presents
our experimental process, including the steps to select the dataset and the results obtained.
Finally, Section 5 concludes, comments on the results achieved and outlines the future
direction of this work.



2.1. Related studies
In recent years, much research on botnet detection has been published. Nguyen Van Can
and colleagues [8] proposed a model to classify benign domains and DGA domains based on
Neutrosophic Sets. Testing on 3 data sets from Alexa, Bambenek Consulting [9] and 360lab [4]
shows that the model achieves an accuracy of 81.09%.
R. Vinayakumar et al. [10] proposed a DGA detection method based on analyzing the
statistical features of DNS queries. Feature vectors are extracted from domain names by a text
representation method, and optimal features are computed from the numerical vectors using
the deep learning architecture in Table 2. The results show that the model reaches a high
accuracy of 97.8%.
Table 2. DBD deep architecture [10]

Layers         Output Shape
Embedding      (None, 91, 128)
Conv1D         (None, 8, 764)
MaxPooling1D   (None, 2, 164)
LSTM           (None, 70)
Dense          (None, 1)
Activation     (None, 1)

Yanchen Qiao et al. [2] proposed a method to detect DGA domain names based on LSTM
with an Attention mechanism. Their model was evaluated on the data set from Bambenek
Consulting [9], with an accuracy of 95.14%, overall precision of 95.05%, recall of 95.14% and
F1 score of 95.48%.
Duc Tran et al. [11] built LSTM.MI, a model that combines a binary classifier and a multiclass
classifier on an unbalanced dataset. In it, the original LSTM model is given a cost-sensitive
adaptation mechanism: cost items are included in the backpropagation learning process to
account for the importance of distinguishing between classes. They demonstrated that LSTM.MI
provides at least a 7% improvement in accuracy and macro-averaged recall over the original
LSTM and other modern cost-sensitive methods, while maintaining high accuracy on non-DGA
labels (0.9849 F1).
2.2. Proposed model
Our proposed model consists of an input layer, an embedding layer, two Bi-LSTM layers, an
attention layer and an output layer. The architecture of the model is shown in Figure 2 [7].



The detection module takes as input a data set of $T$ domain addresses of the form
$\{(u_1, y_1), \dots, (u_T, y_T)\}$, where $u_t$ (with $t = 1, \dots, T$) is a domain in the
training list and $y_t$ its associated label.
Each domain, in its raw form, is processed in two steps to form the input vector before
training:
- Step 1: Cut off the TLD part of the domain name, then tokenize the raw data: convert the
remaining character string into integer-encoded data using Keras's Tokenizer library;
- Step 2: Normalize the encoded data from step 1 to the same length. In this way we convert
the original domain string into the input vector $V = \{v_1, v_2, v_3, \dots, v_T\}$. Each
vector has a fixed length; any shorter vector is padded with zeros to reach it.

Figure 2. Bi-LSTM network architecture.
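The two steps can be sketched with Keras's text utilities. This is a minimal illustration; the example domains, the character-level setting and the padding length are our assumptions (the length 38 matches the input shape in Table 5).

```python
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

domains = ["gfedo.info", "ydqtkptuwsa.org", "google.com"]

# Step 1: strip the TLD, then encode each character as an integer.
names = [d.rsplit(".", 1)[0] for d in domains]   # crude TLD removal; multi-part TLDs need a real list
tokenizer = Tokenizer(char_level=True)           # character-level tokenization
tokenizer.fit_on_texts(names)
sequences = tokenizer.texts_to_sequences(names)

# Step 2: pad every sequence with zeros to the same fixed length.
MAX_LEN = 38
X = pad_sequences(sequences, maxlen=MAX_LEN, padding="post")
print(X.shape)  # (3, 38)
```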

Next, we use a bidirectional LSTM network (Bi-LSTM) to model the URL sequences based
on a word vector representation. The Bi-LSTM architecture has two layers of hidden nodes
from two separate LSTMs, which capture distant dependencies in two different directions.
Since the output vector of the embedding layer is $V = \{v_1, v_2, v_3, \dots, v_T\}$, the
forward LSTM reads the input from $v_1$ to $v_T$ and the backward LSTM reads the input
from $v_T$ to $v_1$; along the way, a pair of hidden states $\overrightarrow{h}_i$ and
$\overleftarrow{h}_i$ is computed. We obtain the output of the Bi-LSTM layer by
concatenating the two hidden states according to the formula:

$$h_i = [\overrightarrow{h}_i, \overleftarrow{h}_i]$$   (1)

The model uses two layers of Bi-LSTM and the experimental data set is quite large. Therefore,
a Batch Normalization layer is used to normalize the data in each batch to a normal
distribution, stabilizing the learning process and greatly reducing the number of epochs needed
to train the network, thereby increasing training speed.
As described in this paper, the hidden states at all positions are considered with different
Attention weights. We apply the Attention mechanism to capture the relationship between
$\overrightarrow{h}_i$ and $\overleftarrow{h}_i$. This information is aggregated with the
features from the output of the second Bi-LSTM layer, which helps the model focus only on
the important features instead of confounding or less valuable information.



First, the weights $u_t$ are calculated based on the correlation between the input and
output according to the following formula:

$$u_t = \tanh(W h_t + b)$$   (2)

These weights are renormalized into the Attention weight vector $\alpha_t$ using the softmax
function:

$$\alpha_t = \frac{\exp(u_t^T u)}{\sum_t \exp(u_t^T u)}$$   (3)

Then the vector $c_t$ is calculated based on the Attention weight vector and the hidden states
$h_1 \dots h_T$ as follows:

$$c_t = \sum_t \alpha_t h_t$$   (4)

The larger the value of $c_t$, the more important the role the feature $x_t$ plays in detecting
the DGA domain.
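A minimal Keras sketch of an attention layer implementing equations (2)-(4). This is our own illustrative implementation, not the authors' exact code; in it the context vector $u$ is a trainable parameter, and the layer pools the weighted hidden states directly into a single vector.

```python
import tensorflow as tf
from tensorflow.keras import layers

class AttentionWithContext(layers.Layer):
    """Attention over the Bi-LSTM hidden states, following Eqs. (2)-(4)."""

    def build(self, input_shape):
        dim = int(input_shape[-1])
        self.W = self.add_weight(name="W", shape=(dim, dim), initializer="glorot_uniform")
        self.b = self.add_weight(name="b", shape=(dim,), initializer="zeros")
        self.u = self.add_weight(name="u", shape=(dim,), initializer="glorot_uniform")

    def call(self, h):
        # Eq. (2): u_t = tanh(W h_t + b), computed for every time step at once.
        ut = tf.tanh(tf.tensordot(h, self.W, axes=1) + self.b)
        # Eq. (3): attention weights alpha_t = softmax(u_t^T u).
        alpha = tf.nn.softmax(tf.tensordot(ut, self.u, axes=1), axis=-1)
        # Eq. (4): context vector c = sum_t alpha_t * h_t.
        return tf.reduce_sum(h * tf.expand_dims(alpha, -1), axis=1)
```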
Finally, to predict a domain, the result is passed through a Dense layer with one hidden neuron
using the sigmoid activation function, which returns a value between 0 and 1. The resulting
$y$ helps determine whether a domain is benign or DGA-generated. Thus, the input domain
name is normalized into vector form, and this vector passes through the Embedding, Bi-LSTM,
Batch Normalization, Bi-LSTM and Attention layers before the output is produced. In addition,
the model uses the Adam optimization algorithm with the default parameters in Keras. To
prevent the model from overfitting the data, we also apply the Dropout technique to the
Bi-LSTM layers. The mechanism of Dropout is that during training, each time the weights are
updated, a random subset of nodes in a layer is removed, so that the model cannot depend on
any single node of the previous layer and instead tends to spread information evenly.
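Putting the pieces together, a Keras sketch of the binary-classification variant follows. Layer sizes follow Table 5; AttentionWithContext is the sketch above, and the vocabulary size, LSTM width and dropout rate are assumptions not stated in the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE = 40   # assumption: characters a-z, 0-9 plus a few extra symbols
MAX_LEN = 38      # input length taken from Table 5

model = models.Sequential([
    layers.Embedding(VOCAB_SIZE, 128, input_length=MAX_LEN),
    layers.Bidirectional(layers.LSTM(64, return_sequences=True, dropout=0.2)),  # (None, 38, 128)
    layers.BatchNormalization(),
    layers.Bidirectional(layers.LSTM(64, return_sequences=True, dropout=0.2)),  # (None, 38, 128)
    AttentionWithContext(),            # Eqs. (2)-(4): collapses to (None, 128)
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy", tf.keras.metrics.Precision(), tf.keras.metrics.Recall()])
model.summary()
```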
2.3. Experiment
In this paper, we conduct two experiments:
1 - Check the accuracy of the model in 2-class classification: domains generated by DGA
algorithms versus normal domains;
2 - Check the accuracy of the model in multi-class classification: detect the different DGA
algorithms present in a given data set.
Table 3. Distribution parameters of domain name character length

        DGA         Regular domain name
Count   30000       30000
Mean    14.245103   9.623797
Std     4.337851    3.300294
Min     6           6
25%     12          7
50%     13          9
75%     16          11
Max     25          25



2.4. Evaluation Dataset
In this paper, we use a dataset consisting of DGA domains collected from Bambenek
Consulting [9] and normal domains obtained from Alexa. For the two different tests, we use
two different data sets.
Table 2. Summary of the collected dataset

Domain type        Samples   Domain type    Samples   Domain type      Samples
tinba              64313     Chinad         1484      unknownjs        172
ramnit             62227     P2P            985       beautiful baby   161
necurs             30789     Volatile       966       pandabanker      94
murofet            24562     proslikefan    750       cryptowall       91
Post               21881     Sphinx         733       an               67
qakbot             19258     Pitou          749       Unknowndroppr    59
shiotob            14451     Dircrypt       699       sisron           59
monerodownloader   14422     Fobber         572       kingminer        29
ranbyus            13417     padcrypt       551       gozi             24
kraken             5529      Zloader        555       dromedan         2
Cryptolocker       5780      Geodo          557       madmax           2
locky              3869      MyDoom         333       g01              1
vawtrak            3022      Beebone        291       mirai            1
qadars             2309      tempedreve     242
ramdo              1932      Vidro          188
Dataset for test 1: consists of 30,000 DGA domains with label 1 and 30,000 normal domains
with label 0. This dataset is randomly shuffled, then divided into a training set and a test set.
It contains 46 different types of DGA domain names, with the counts given in Table 2. The
distribution parameters of the character length of each type of domain name are given in
Table 3: the shortest sample has length 6, the longest 25, and the average length is 14.2
characters for DGA domain names and 9.6 for normal domain names.
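A sketch of how such a shuffled split might be prepared; the input file names and the 80/20 split ratio are our assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical input files: one DGA domain and one Alexa domain per line.
dga = np.loadtxt("dga_domains.txt", dtype=str)
benign = np.loadtxt("alexa_domains.txt", dtype=str)

domains = np.concatenate([dga, benign])
labels = np.concatenate([np.ones(len(dga)), np.zeros(len(benign))])  # 1 = DGA, 0 = normal

# Shuffle and split into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    domains, labels, test_size=0.2, shuffle=True, random_state=42)
```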
Table 4. Label assignment for domain types

Domain type              Label
Post                     0
Kraken                   1
Legit                    2
Monerodownloader         3
Murofet                  4
Necurs                   5
Qakbot                   6
Ramnit                   7
Ranbyus                  8
Shiotob/urlzone/bebloh   9
Tinba                    10



Dataset for test 2: with the goal of testing multi-class classification, the DGA domain families
Post, Kraken, Monerodownloader, Murofet, Necurs, Shiotob/urlzone/bebloh, Qakbot, Ramnit,
Ranbyus and Tinba are labeled according to Table 4. The data for test 2 comprises 25,000
normal domain names and 25,000 domain names belonging to the DGA families.

3. PERFORMANCE METRIC
The performance of the algorithms is evaluated using the confusion matrix, where:
• True negatives (TN) are benign sites predicted to be benign.
• True positives (TP) are malicious sites predicted to be malicious.
• False negatives (FN) are malicious sites predicted to be benign.
• False positives (FP) are benign sites predicted to be malicious.

From these we obtain the following measures.
Accuracy:

$$ACC = \frac{TP + TN}{TP + TN + FP + FN}$$   (5)

The article also uses precision, recall and the F-measure, given by the following formulas:

$$Precision = \frac{TP}{TP + FP}$$   (6)

$$Recall = \frac{TP}{TP + FN}$$   (7)

$$F1 = \frac{2 \cdot Recall \cdot Precision}{Recall + Precision}$$   (8)
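These measures can be computed directly from predictions, for example with scikit-learn (a minimal sketch on toy labels of our choosing):

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # toy ground-truth labels (1 = DGA, 0 = benign)
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # toy model predictions

print("ACC       =", accuracy_score(y_true, y_pred))   # Eq. (5)
print("Precision =", precision_score(y_true, y_pred))  # Eq. (6)
print("Recall    =", recall_score(y_true, y_pred))     # Eq. (7)
print("F1        =", f1_score(y_true, y_pred))         # Eq. (8)
```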

A high precision value means that the accuracy of the points found is high. A high recall means
a high TP rate, i.e. the rate of truly positive points that are missed is low. The higher the F1,
the better the classifier. In addition, we also use the binary cross-entropy (BCE) loss function
to calculate the difference between two quantities: $\hat{y}$, the predicted label of the URL,
and $y$, the correct label of each URL. The loss function is a way to force the model to pay a
penalty for each wrong prediction, with the penalty proportional to the severity of the error.
The smaller the loss value, the better the prediction results; conversely, if the predictions differ
too much from reality, the loss value grows larger.

Table 5. Parameters of the model in experiment no. 1

Layers                   Output Shape
embedding                (None, 38, 128)
bidirectional            (None, 38, 128)
batch_normalization      (None, 38, 128)
bidirectional_1          (None, 38, 128)
attention_with_context   (None, 38, 128)
addition                 (None, 128)
dense                    (None, 1)

$$BCE = -(y \log(\hat{y}) + (1 - y) \log(1 - \hat{y}))$$   (9)
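As a worked example of Eq. (9), with toy numbers of our choosing: for a DGA domain ($y = 1$) predicted with $\hat{y} = 0.9$, $BCE = -\log(0.9) \approx 0.105$; for the same domain predicted with $\hat{y} = 0.1$, $BCE = -\log(0.1) \approx 2.303$, a much larger penalty for the more severe error.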




3.1. Experimental results
3.1.1. Experiment number 1
The model is built on the basic configuration of the Kaggle platform with a Keras kernel and
TensorFlow backend. It uses ModelCheckpoint to save the training process and EarlyStopping
to stop training as soon as the best value is found. The parameters of the model in the first
experiment are shown in Table 5.

Table 6. Experimental results no. 1

Loss         ACC      Precision   Recall   F1
3.2705e-04   0.9999   0.9999      0.9998   0.9999
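Continuing the sketches above, this training setup can be expressed with Keras callbacks; the checkpoint file name, the monitored metric and the patience value are assumptions.

```python
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping

callbacks = [
    # Save the best weights seen so far during training.
    ModelCheckpoint("best_model.h5", monitor="val_loss", save_best_only=True),
    # Stop training as soon as validation loss stops improving.
    EarlyStopping(monitor="val_loss", patience=3, restore_best_weights=True),
]
history = model.fit(X_train, y_train, validation_split=0.1,
                    epochs=50, batch_size=128, callbacks=callbacks)
```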
Table 7. Parameters of the model in experiment no. 2

Layers                   Output Shape
embedding                (None, 33, 128)
bidirectional            (None, 33, 128)
batch_normalization      (None, 33, 128)
bidirectional_1          (None, 33, 128)
attention_with_context   (None, 33, 128)
addition                 (None, 128)
dense                    (None, 11)

For the binary classification problem between DGA domains and normal domains, the model
gives the results in Table 6, with an accuracy above 99%. Given this result, we suspect that
part of the separation comes from the difference in the distributions of domain lengths. We
will run further tests to verify the stability of the model.
3.2. Experiment number 2

Table 8. Results of experiment 2

Class          Precision   Recall   F1     Support
0              1.00        0.99     1.00   3986
1              0.78        0.76     0.77   1114
2              0.98        1.00     0.99   50086
3              1.00        1.00     1.00   2665
4              0.85        0.59     0.69   4383
5              0.81        0.85     0.83   5746
6              0.52        0.82     0.64   3572
7              0.76        0.95     0.84   11535
8              0.89        0.88     0.89   2432
9              0.97        0.90     0.94   2602
10             0.91        0.59     0.71   11879
accuracy                            0.90   100000
macro avg      0.85        0.86     0.85   100000
weighted avg   0.91        0.91     0.90   100000

In this experiment, we test the multi-class detection ability of the model with the three
measures precision, recall and F1. The parameters used in the model are presented in Table 7.
Because this is multi-class classification, the output layer has size 11, corresponding to the 11
labels to be classified. The experimental results are presented in Table 8. For the normal
domains (label 2), precision is 98% and F1 is 99%. Our model gives the best results when
classifying DGA domains belonging to the Post family (label 0) and Monerodownloader
(label 3). In contrast, the model gives the worst result on the Qakbot family (label 6), where
benign sites are often classified as malicious, with a precision of 52%. For the Murofet family
(label 4) and Tinba (label 10), the model misclassifies DGA domain names as benign, with a
recall of 59%. Overall, the F1 measure of the model in the multi-class classification problem
reaches 90%. The macro-average (macro avg) efficiency is 86% and the weighted average
(weighted avg) efficiency is 91%.
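For this variant, only the head of the network changes. A sketch of the adaptation, reusing the AttentionWithContext layer and VOCAB_SIZE assumption from the earlier sketches (the loss choice follows standard Keras practice, not a detail stated in the paper):

```python
from tensorflow.keras import layers, models

multi_model = models.Sequential([
    layers.Embedding(VOCAB_SIZE, 128, input_length=33),  # input length taken from Table 7
    layers.Bidirectional(layers.LSTM(64, return_sequences=True, dropout=0.2)),
    layers.BatchNormalization(),
    layers.Bidirectional(layers.LSTM(64, return_sequences=True, dropout=0.2)),
    AttentionWithContext(),                    # same attention sketch as before
    layers.Dense(11, activation="softmax"),    # 11 output labels, per Table 4
])
multi_model.compile(optimizer="adam",
                    loss="sparse_categorical_crossentropy",
                    metrics=["accuracy"])
```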

4. COMPARISON WITH OTHER DGA DETECTION METHODS
The evaluation was performed on a dataset from the same source [9] as the studies being
compared. The comparison with the study of Chanwoong Hwang and colleagues, shown in
Table 9, indicates that our model has a higher detection capacity.

Table 9. Comparison with the model of Chanwoong Hwang

            Chanwoong Hwang   Proposed model
Accuracy    88.77%            90%
Precision   89.01%            91%
Recall      88.77%            90%
F1-score    88.695%           90%

Table 10 compares the ability to detect the DGA domains labeled 4, 5, 6, 7, 8, 9 and 10 with
the studies of Yanchen Qiao and Duc Tran. Yanchen Qiao [2] uses LSTM with an Attention
mechanism. Duc Tran's model [11] is a cost-sensitive variant of the original LSTM, in which
cost items are class dependent, taking into account the importance of classification between
classes. Our model exhibits good detectability across four DGA families, Necurs, Qakbot,
Ramnit and Ranbyus, and performs less well on the Shiotob and Tinba families.




Table 10. Results compared with Yanchen Qiao and Duc Tran

                  Yanchen Qiao                  Duc Tran                      Our Model
Label   Family    Precision  Recall   F1-score  Precision  Recall   F1-score  Precision  Recall  F1-score
4       Murofet   0.7641     0.7207   0.7418    0.5330     0.7423   0.6205    0.85       0.59    0.69
5       Necurs    0.6651     0.1722   0.2735    0.5248     0.1104   0.1824    0.81       0.85    0.83
6       Qakbot    0.7862     0.5013   0.6122    0.7716     0.4350   0.5564    0.52       0.82    0.64
7       Ramnit    0.4688     0.7525   0.5777    0.6068     0.8062   0.6925    0.76       0.95    0.84
8       Ranbyus   0.4672     0.8455   0.6018    0.3617     0.7073   0.4787    0.89       0.88    0.89
9       Shiotob   0.9751     0.9251   0.9494    0.9741     0.9004   0.9358    0.97       0.90    0.94
10      Tinba     0.9259     0.9920   0.9578    0.8951     0.9961   0.9429    0.91       0.59    0.71

5. CONCLUSION
In this paper, we have presented an approach using a Bi-LSTM deep learning network based
on the Attention mechanism [7] to solve the problem of detecting domains generated by
algorithms in botnets. The model shows a strong ability to detect DGA domains: with 2 layers
of Bi-LSTM combined with Attention, it detects DGA domains with 90% accuracy. In the
future, we will continue to improve the model and evaluate it on larger, more complex datasets
to verify the accuracy of the proposed approach. The research results in this direction can be
integrated into DNS domain name filtering systems to automatically discover the domains of
a botnet.

REFERENCES
[1] A. Soleymani and F. Arabgol (2021), "A Novel Approach for Detecting DGA-Based Botnets in DNS
Queries Using Machine Learning Techniques," Journal of Computer Networks and Communications.
[2] Y. Qiao, B. Zhang, W. Zhang, A. K. Sangaiah and H. Wu (2019), "DGA Domain Name Classification
Method Based on Long Short-Term Memory with Attention Mechanism," Applied Sciences, vol. 9,
no. 20.
A. Qi, J. Jiang, Z. Shi, R. Mao and Q. Wang (2018), "BotCensor: Detecting DGA-Based Botnet Using
Two-Stage Anomaly Detection," in 2018 17th IEEE International Conference on Trust, Security and
Privacy in Computing and Communications / 12th IEEE International Conference on Big Data
Science and Engineering (TrustCom/BigDataSE).
A. K. Sood and S. Zeadally (2016), "A Taxonomy of Domain-Generation Algorithms," IEEE Security
& Privacy, vol. 14, no. 4, pp. 46-53.
[7] N. T. Hiếu and T. N. Ngọc (2020), "Phát hiện URL độc hại sử dụng mạng học sâu Bi-LSTM dựa
trên cơ chế Attention" [Detecting malicious URLs using a Bi-LSTM deep learning network based on
the Attention mechanism], in Hội thảo quốc gia lần thứ XXIII: Một số vấn đề chọn lọc của Công nghệ
thông tin và truyền thông, Quảng Ninh.
[8] Nguyen Van Can et al. (2020), "A new method to classify malicious domain name using
Neutrosophic sets in DGA botnet detection," Journal of Intelligent & Fuzzy Systems, vol. 38,
pp. 4223-4236.
[10] R. Vinayakumar et al. (2019), "DBD: Deep Learning DGA-Based Botnet Detection," in Deep
Learning Applications for Cyber Security (Advanced Sciences and Technologies for Security
Applications), Cham, Switzerland: Springer, pp. 127-149.
[11] D. Tran, H. Mac, V. Tong, H. A. Tran and L. G. Nguyen (2018), "A LSTM based framework for
handling multiclass imbalance in DGA botnet detection," Neurocomputing, pp. 2401-2413.



