HỌC VIỆN KỸ THUẬT QUÂN SỰ (Military Technical Academy)
NGUYỄN ĐỨC ANH
A STUDY OF DATA MINING TECHNIQUES FOR PREDICTING CUSTOMERS LIKELY TO LEAVE THE VNPT NETWORK
Major: Information Systems
Code: 60480101
MASTER OF ENGINEERING THESIS
Hà Nội - 2014
TABLE OF CONTENTS
Declaration
Table of contents
Thesis summary
List of symbols and abbreviations
List of tables
List of figures
INTRODUCTION
Chapter 1
OVERVIEW OF DATA MINING
1.1. The concept of data mining
1.1.1. Why information needs to be mined and processed
1.1.2. The concept of data mining
1.2. Some common data mining methods
1.2.1. Induction
1.2.2. Decision trees and rules
1.2.3. Association rule discovery
1.2.4. Classification
1.2.5. Clustering
1.2.6. Example-based method (Based-on Pattern)
1.2.7. Dependency modeling on probabilistic graphs (Depending based-on Probability Graph)
1.2.8. Neural networks (Neuron Network)
1.2.9. Genetic algorithms (Genetic Algorithm)
1.3. Some applications of data mining
1.3.1. Data analysis and decision support
1.3.2. Other applications
Chapter 2
A STUDY OF DATA MINING TECHNIQUES IN TELECOMMUNICATIONS
2.1. Overview
2.1.1. Goals of data mining
2.1.2. Approaches in data mining
2.2. Some applications of data mining in telecommunications
2.2.1. Fraud detection
2.2.2. Customer management and customer care applications
2.2.3. Detecting and isolating faults in the telecommunications network (network fault isolation)
2.3. A study of the mobile network's databases
2.3.1. Subscriber management databases relevant to the problem
2.3.2. Classification for predicting customer trends
2.3.3. Bayes classification
Chapter 3
APPLICATION: PREDICTING CUSTOMERS' INTENTION TO LEAVE THE VNPT MOBILE NETWORK
3.1. Applying data mining to predict in advance a customer's intention to leave the network
3.1.1. Approach to meeting the problem requirements
3.1.2. Implementation method
3.1.3. Implementation content
3.1.4. Building the training database
3.2. Building the application
3.2.1. Introduction
3.2.2. Development process
3.2.3. Building the demo
3.2.4. Evaluation
CONCLUSIONS AND RECOMMENDATIONS
1. Conclusions
2. Recommendations
REFERENCES
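Thesis summary:
+ Student's full name: Nguyễn Đức Anh
+ Class: Computer Science    Cohort: 24
+ Supervisor: k-Nj'@k#
+ Thesis title: A STUDY OF DATA MINING TECHNIQUES FOR PREDICTING SUBSCRIBERS WHO LEAVE THE VNPT MOBILE NETWORK
+ Summary:
The thesis studies classification methods in data mining and uses them to classify subscribers and predict which VNPT subscribers are likely to leave the network.
It uses a SQL Server database to design and analyze the data and to select the attributes that are useful for prediction, producing the training database table. It then uses the Bayes classification algorithm to predict whether a particular subscriber is likely to leave the network.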
LIST OF SYMBOLS AND ABBREVIATIONS
KPDL    Khai phá dữ liệu (data mining)
CSDL    Cơ sở dữ liệu (database)
CDR     Call Data Record
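LIST OF TABLES
Table 1.1. Training data
Table 1.2. Testing data
Table 1.3. Classification results obtained with the decision tree
Table 2.1. Relationship between two variables: income and car type
Table 2.2. Row percentages and marginal percentages
Table 2.3. Row percentages
Table 2.4. Percentages of the total
Table 2.5. Column percentages
Table 2.6. Values in the column-percentage table
Table 2.7. Service-usage database table
Table 3.1. Call detail table (CDRs)
Table 3.2. Billing database table
Table 3.3. Customer information table
Table 3.4. Aggregated customer information table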
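LIST OF FIGURES
Figure 1.1. The knowledge discovery process
Figure 1.2. The data preprocessing stage in Data Mining
Figure 1.3. Decision tree built from the training data
Figure 2.1. Building the trained model
Figure 2.2. Using the trained model for prediction
Figure 2.3. Observed data and the hypothesis set
Figure 3.1. Architecture of the classification system
Figure 3.2. Main interface of the program
Figure 3.3. Original data
Figure 3.4. Training database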
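INTRODUCTION
1. Rationale for the topic:
a. Scientific basis:
The growth of information technology and its application in many areas of social and economic life over the years has meant that the volume of data collected and stored by organizations keeps accumulating. They store these data because they believe some value is hidden in them. According to statistics, however, only a small portion of these data is ever analyzed; for the rest, the owners do not know what to do. The question is how to organize and exploit these huge and diverse volumes of data.
The solution to these problems is to build a data warehouse (Data Warehouse) and to develop a new technical trend, namely knowledge discovery and data mining (KDD - Knowledge Discovery and Data Mining).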
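Today, knowledge discovery and data mining are being widely applied around the world in many different fields, such as health care, marketing, banking, telecommunications, and the internet. No one can deny the great benefits that these techniques bring.
The mobile communications market in Vietnam is a competitive one. Mobile technology develops quickly, technology cycles are getting shorter, and investment costs keep falling. This has created opportunities for new service providers to enter the market and, at the same time, challenges for the existing providers. Competition among mobile networks today relies mainly on continuous price cuts and promotions, which has made waves of subscribers moving from one network to another increasingly common. This situation shows that customers are no longer loyal to their provider as they were in the monopoly market before 2003.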
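With the slogan "Hãy nói theo cách của bạn" ("Say it your way"), the operator targets every customer segment, from high-income to low-income users. With nationwide coverage that reaches remote areas, borders, and islands, it aims to be the customers' first choice. In the Vietnamese market, however, it is not the only provider of mobile voice services; there are other mobile networks as well. Competitors include Vinaphone, Mobifone, and Vietnamobile. These are the main rivals that must be overcome on the way to winning the Vietnamese market.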
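b. Practical significance
By classifying customers into different groups, managers can grasp the needs, preferences, and habits of each customer group and predict in advance which subscribers will leave the network. They can then adopt suitable business strategies for approaching and caring for each customer group, so as to achieve high business efficiency.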
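2. Objectives of the thesis
To study data mining techniques and apply them to classify customers and predict in advance which customers will leave the network, helping managers to plan business strategy and to make sound decisions for each customer group.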
3. Research methods:
a. Theory:
- Survey data mining in general.
- Study several data mining techniques: association rules, classification, clustering.
- Study data mining tools.
- Study the actual data set and select suitable data mining methods.
b. Experiments:
- Apply the results of the theoretical study to the telecommunications database of the mobile network.
Chapter 1
OVERVIEW OF DATA MINING
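1.1. The concept of data mining.
1.1.1. Why information needs to be mined and processed
The growth of information technology and its application in many areas of social and economic life over the years has meant that the volume of data collected and stored by organizations keeps accumulating. According to statistics, however, only a small portion of these data is ever analyzed. For the rest, the owners do not know what to do, yet they keep collecting for fear that something important might otherwise be missed. The question is how to organize and exploit these huge and diverse volumes of data.
From the user's point of view, the usual difficulties are:
- The required data cannot be found.
- The required data cannot be retrieved.
- The data that are found cannot be understood.
- The data that are found cannot be used.
Problems on the information system side:
- Developing many different application programs is not simple.
- Maintaining these programs raises many problems.
- The volume of stored data grows very quickly.
- Data administration is complex.
The solution to these problems is to build a data warehouse (Data Warehouse) and to develop a new trend in data handling, namely knowledge discovery and data mining.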
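1.1.2. The concept of data mining
Data mining is the term used to describe the process of discovering knowledge in databases. This process extracts hidden knowledge from data to support business forecasting, production activities, and so on. Data mining reduces the time cost compared with earlier traditional methods. Below are some descriptive definitions of data mining given by various authors.
- "Data mining is the process of extracting valuable hidden information from the large volumes of data stored in databases, data warehouses, ..."
- Parsaye's definition: "Data mining is a decision support process in which we search for unknown and unexpected patterns of information in large databases."
- Fayyad's definition: "Knowledge discovery is a non-trivial process of identifying data patterns that are valid, novel, useful, potentially valuable, and understandable."
Many people treat data mining and another common term, knowledge discovery in databases (Knowledge Discovery in Databases - KDD), as the same thing. In practice, however, data mining is only one essential step in the process of knowledge discovery in databases.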
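Data mining is a process of learning new knowledge from data that have already been collected. The diagram in Figure 1.1 shows that it comprises four main stages.
Figure 1.1. The knowledge discovery process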
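1) The first stage is raw processing (data cleaning), also called data preprocessing. It improves data quality so that the mining process is accurate and efficient (Figure 1.2):
• Cleaning (data cleaning): removing noise and handling missing data.
+ Removing noise: noise can be smoothed with methods such as regression and clustering.
+ Handling missing data: ignore the missing value (used in classification), use a global constant, or replace the unknown value with the mean value or with the most probable value found by a method such as regression.
• Data integration and transformation (data integration & transformation): data gathered from several stores and several collection sources are combined and may be converted into suitable forms.
+ Data integration: make real-world entities coming from different sources match one another; use metadata to avoid errors while integrating schemas and converting data; and remove redundant data.
+ Data transformation: this includes removing noise from the data; applying summary or aggregation operations to the data; generalizing the data, where low-level or raw data are summarized at a higher conceptual level; normalizing the data so that they fall within a small range; and constructing new attributes that are added to the given attribute set.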
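• Data reduction (Data Reduction): the aim is a reduced representation of the data set that is much smaller in volume yet preserves the integrity of the original data, so that mining the reduced data is more efficient than mining the original data. Reducing the dimensionality of the data essentially means concentrating on the attributes needed for knowledge discovery. The data reduction steps are:
+ Data cube aggregation: applied when constructing the data cube;
+ Attribute subset selection: dimensions that are irrelevant, weakly relevant, or redundant can be detected and removed;
+ Dimensionality reduction: encoding methods are used to reduce the size of the data set;
+ Numerosity reduction: the data are replaced or estimated by other data that are smaller in volume, such as parametric models (only the model parameters need to be stored instead of the actual data) or non-parametric methods such as clustering, sampling, and histograms;
+ Discretization and concept hierarchy generation: the raw values of attributes are replaced by ranges or by higher-level concepts. Discretization is a form of numerosity reduction and is very useful for generating concept hierarchies automatically. Discretization and concept hierarchy generation are powerful data mining tools because they allow Data Mining at several levels of abstraction;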
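The cleaning and normalization steps listed above can be illustrated with a minimal sketch. It assumes a single numeric attribute in which missing values are marked as `None`; the function names and the sample values are hypothetical and do not come from the thesis.

```python
from statistics import mean

def fill_missing_with_mean(values):
    """Replace missing entries (None) with the mean of the known values."""
    known = [v for v in values if v is not None]
    avg = mean(known)
    return [avg if v is None else v for v in values]

def min_max_normalize(values, low=0.0, high=1.0):
    """Rescale values linearly so that they fall within [low, high]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant attribute: map everything to the lower bound
        return [low for _ in values]
    return [low + (v - lo) * (high - low) / (hi - lo) for v in values]

# Hypothetical monthly charges with two missing readings.
charges = [120.0, None, 80.0, 200.0, None]
cleaned = fill_missing_with_mean(charges)
scaled = min_max_normalize(cleaned)
print(cleaned)
print(scaled)
```

Mean imputation is only one of the options named in the text; a global constant or a regression estimate could be substituted in `fill_missing_with_mean` without changing the rest of the sketch.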
Figure 1.2. The data preprocessing stage in Data Mining
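2) The second stage: the data are loaded into the warehouse (Data Warehouse). This stage covers building the warehouse and using it to obtain information for decision making, with the OLAP technique.
3) The third stage is data mining (Data Mining). In this stage the data are extracted according to the requirements of the problem; in other words, patterns or models hidden in the data are extracted. It consists of several steps:
+ Choose the mining task according to the goal of the knowledge discovery process: classification, grouping, regression, summarization, etc.;
+ Choose a suitable data mining algorithm;
+ Mine the data to find patterns or models that express knowledge;
+ Evaluate, interpret, and re-test the mined patterns before the extracted knowledge is put to use.
4) The fourth stage is pattern evaluation (Pattern Evaluation). The discovered knowledge is evaluated, in particular by clarifying the descriptions and predictions, and the newly mined knowledge is refined so that it can be used effectively. This is also called adding value to the knowledge.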
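Data mining is a field related to many other disciplines, such as database systems, statistics, and visualization. Moreover, depending on the approach taken, data mining can also apply techniques such as neural networks, rough set or fuzzy set theory, and knowledge representation. Compared with these methods, data mining has some clear advantages:
+ Compared with machine learning: data mining has the advantage that it can be used on databases that contain much noise, are incomplete, or change continuously.
+ Compared with expert systems: the examples provided by experts are usually of much higher quality than the data in a database, and they usually cover only the important cases.
+ Compared with statistics: data mining overcomes some weaknesses of statistical methods:
- Standard statistical methods do not suit the structured data types found in many databases.
- Statistical methods are driven entirely by the data and do not use existing domain knowledge.
- The results of the analysis can be very numerous and hard to interpret.
- Statistical methods need guidance from the user to determine how and where to analyze the data.
With these advantages, data mining is being widely applied in many areas of business and daily life, such as marketing, finance, banking, insurance, health care, security, and telecommunications. Many organizations and large companies around the world have applied data mining techniques to their production and business activities and have obtained great benefits.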
1.2. Some common data mining methods
Data mining techniques are usually divided into two main groups:
- Descriptive data mining techniques: their task is to describe the properties or general characteristics of the data currently in the database. These techniques include clustering, summarization, visualization, association rule analysis (association rules), ...
- Predictive data mining techniques: their task is to make predictions based on inferences from the current data. These techniques include classification, regression, ...
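1.2.1. Induction.
There are two main techniques: deduction and induction.
+ Deduction: draws out information that is a logical consequence of the information in the database. Deduction relies on exact facts to derive new knowledge from existing information. The patterns extracted with this technique are usually deduction rules.
+ Induction: infers information generated from the database itself. That is, it searches, creates patterns, and generates knowledge on its own rather than starting from knowledge known in advance. The information this method yields is high-level information or knowledge that describes the objects in the database. The method involves searching for patterns in the database.
Induction is usually discussed in connection with decision trees and rule generation.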
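1.2.2. Decision trees (Decision tree) and rules (Rule).
+ Decision tree: a simple form of knowledge description that assigns data objects to a fixed number of classes. The nodes of the tree are labeled with attribute names, the edges are labeled with the possible values of those attributes, and the leaves represent the different classes. An object is classified by following a path through the tree, along the edges that correspond to the object's attribute values, down to a leaf.
In short, given data about objects with their attributes and classes, the decision tree produces rules for predicting the class of unknown objects.
Example:
We have data (training data) about 10 objects (people). Each object is described by 4 attributes, Gender, Car Ownership, Travel Cost/km, and Income Level, and by one class attribute (category attribute), Transportation mode. Gender is binary, Car Ownership is a quantitative integer (0, 1, 2), and Travel Cost/km and Income Level are ordinal.
The training data show which means of transport (car, bus, train) each person chose, given the 4 attributes.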
Attributes                                                  Classes
Gender   Car Ownership   Travel Cost/km   Income Level   Transportation mode
Male     0               Cheap            Low            Bus
Male     1               Cheap            Medium         Bus
Female   1               Cheap            Medium         Train
Female   0               Cheap            Low            Bus
Male     1               Cheap            Medium         Bus
Male     0               Standard         Medium         Train
Female   1               Standard         Medium         Train
Female   1               Expensive        High           Car
Male     2               Expensive        Medium         Car
Female   2               Expensive        High           Car
Table 1.1. Training data
From the training data above, we can build the following decision tree:
Travel Cost/km?
  Expensive -> Car
  Standard  -> Train
  Cheap     -> Gender?
                 Male   -> Bus
                 Female -> Car Ownership?
                             0 -> Bus
                             1 -> Train
Figure 1.3. Decision tree built from the training data
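- Rule generation: rules are generated to infer data patterns that are statistically significant. Rules have the form "If P then Q", where P is a proposition that holds for part of the data in the database and Q is the predicted proposition.
Suppose we have data about 3 people whose values for the attributes Gender, Car Ownership, Travel Cost/km, and Income Level are known, but we do not yet know which means of transport they will choose. We use the decision tree built above to predict it. The data below are called testing data.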
Person Name   Gender   Car Ownership   Travel Cost/km   Income Level   Transportation mode
Alex          Male     1               Standard         High           ?
Buddy         Male     0               Cheap            Medium         ?
Cherry        Female   1               Cheap            High           ?
Table 1.2. Testing data
We start from the root node (the attribute Travel Cost/km).
From the decision tree above, the rules (Series of Rules) generated from the tree and used for prediction are as follows:
Rule 1: If Travel Cost/km is Expensive then mode = Car
Rule 2: If Travel Cost/km is Standard then mode = Train
Rule 3: If Travel Cost/km is Cheap and Gender is Male then mode = Bus
Rule 4: If Travel Cost/km is Cheap and Gender is Female and she owns no car then mode = Bus
Rule 5: If Travel Cost/km is Cheap and Gender is Female and she owns 1 car then mode = Train
With these rules, predicting the class of unknown data is very simple.
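The classification results obtained with the decision tree are as follows:
Person Name   Gender   Car Ownership   Travel Cost/km   Transportation mode
Alex          Male     1               Standard         Train
Buddy         Male     0               Cheap            Bus
Cherry        Female   1               Cheap            Train
Table 1.3. Classification results obtained with the decision tree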
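Decision trees and rules have the advantage of being a simple form of description, and the resulting model is fairly easy for users to understand. However, trees and rules can represent only certain kinds of functions, so they are limited in the accuracy of the model.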
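Rules 1 to 5 translate directly into code. The sketch below applies them to the three people of the testing data; the function name `predict_mode` is illustrative, and raising an error for combinations the five rules do not cover (for example a woman with cheap travel cost who owns 2 cars) is an assumption of this sketch, not something the section specifies.

```python
def predict_mode(gender, car_ownership, travel_cost):
    """Apply Rules 1-5; Income Level is not used by the decision tree."""
    if travel_cost == "Expensive":                     # Rule 1
        return "Car"
    if travel_cost == "Standard":                      # Rule 2
        return "Train"
    if travel_cost == "Cheap":
        if gender == "Male":                           # Rule 3
            return "Bus"
        if gender == "Female" and car_ownership == 0:  # Rule 4
            return "Bus"
        if gender == "Female" and car_ownership == 1:  # Rule 5
            return "Train"
    # Assumption of this sketch: anything else is outside the rule set.
    raise ValueError("no rule covers this combination of attribute values")

# Name, Gender, Car Ownership, Travel Cost/km from the testing data.
testing_data = [
    ("Alex", "Male", 1, "Standard"),
    ("Buddy", "Male", 0, "Cheap"),
    ("Cherry", "Female", 1, "Cheap"),
]
predictions = {name: predict_mode(g, c, t) for name, g, c, t in testing_data}
print(predictions)  # {'Alex': 'Train', 'Buddy': 'Bus', 'Cherry': 'Train'}
```

The predictions agree with the classification results reported for Alex, Buddy, and Cherry.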
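1.2.3. Association rule discovery.
This method discovers association rules between the data components in a database. The output of the data mining algorithm is a set of association rules, each of the form X -> Y (if X is present then Y is present). Each discovered rule comes with two parameters, its support and its confidence. Support and confidence are two measures of interestingness; they reflect the usefulness and the certainty of the rule, and are computed as follows:
Support = (number of records containing X) / (total number of records)
Confidence = (number of records containing both X and Y) / (number of records containing X)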
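Example:
Analyzing a sales database yields the information that customers who buy a computer also tend to buy management software in the same purchase. This is described by the following association rule:
"Computer -> Management software"
[support: 10%, confidence: 70%]
The rule says that 10% of all customers bought a computer and that, among the customers who bought a computer, 70% also bought management software.
Association rule mining is thus an important and widely used way of processing information; its aim is to discover relationships between data patterns.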
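The two measures can be checked numerically with a small sketch. The transaction list below is hypothetical, built only so that the numbers match the example (100 records, 10 containing a computer, 7 of those also containing software). The `support` function follows the definition given in this section, which counts records containing X; many texts instead define the support of a rule X -> Y over records containing both X and Y, which the same function gives when called with the union of the two item sets.

```python
def support(transactions, items):
    """Fraction of records that contain every item in `items`."""
    return sum(1 for t in transactions if items <= t) / len(transactions)

def confidence(transactions, x, y):
    """Among the records containing x, the fraction that also contain y."""
    with_x = [t for t in transactions if x <= t]
    return sum(1 for t in with_x if y <= t) / len(with_x)

# Hypothetical sales database, constructed to match the example.
transactions = (
    [{"computer", "software"}] * 7
    + [{"computer"}] * 3
    + [{"printer"}] * 90
)
x, y = {"computer"}, {"software"}
print(support(transactions, x))        # 0.1
print(confidence(transactions, x, y))  # 0.7
print(support(transactions, x | y))    # joint support of X and Y: 0.07
```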
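1.2.4. Classification.
The goal of data classification is to predict class labels for data samples. The classification process usually has two steps: building a model, and using the model to classify data.
+ Building the model: a model is built by analyzing the available data samples. Each sample belongs to one class, determined by an attribute called the class attribute. These samples are also called the training data set. The class labels of the training data set must all be known before the model is built, which is why this method is also called supervised learning, in contrast to data clustering, which is unsupervised learning.
+ Using the model: in the next step we must estimate the accuracy of the model. If the accuracy is acceptable, the model is used to predict class labels for other data samples in the future.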
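The two steps can be sketched with a deliberately simple model. The sketch assumes a one-attribute classifier that predicts, for each value of Travel Cost/km, the most frequent class seen in training; this is an illustrative stand-in, not the classifier used in the thesis. The samples are the (Travel Cost/km, Transportation mode) pairs of Table 1.1, and the hold-out split is chosen arbitrarily for the example.

```python
from collections import Counter, defaultdict

# (travel_cost, transportation_mode) pairs taken from Table 1.1.
samples = [
    ("Cheap", "Bus"), ("Cheap", "Bus"), ("Cheap", "Train"), ("Cheap", "Bus"),
    ("Cheap", "Bus"), ("Standard", "Train"), ("Standard", "Train"),
    ("Expensive", "Car"), ("Expensive", "Car"), ("Expensive", "Car"),
]
test_idx = {2, 4, 6, 9}  # hypothetical hold-out split
train = [s for i, s in enumerate(samples) if i not in test_idx]
test = [s for i, s in enumerate(samples) if i in test_idx]

def build_model(rows):
    """Step 1: learn the majority class for each attribute value."""
    counts = defaultdict(Counter)
    for value, label in rows:
        counts[value][label] += 1
    return {value: c.most_common(1)[0][0] for value, c in counts.items()}

def accuracy(model, rows):
    """Step 2: fraction of labeled samples that the model predicts correctly."""
    correct = sum(1 for value, label in rows if model.get(value) == label)
    return correct / len(rows)

model = build_model(train)
print(model)                  # {'Cheap': 'Bus', 'Standard': 'Train', 'Expensive': 'Car'}
print(accuracy(model, test))  # 0.75
```

Measuring accuracy on samples that were held out from training is what makes the figure a fair estimate of how the model will behave on future data.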
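1.2.5. Clustering.
Data clustering groups a set of objects into classes of similar objects. A cluster is a collection of data objects that are similar to one another within the same cluster and dissimilar to the objects in other clusters. The number of clusters can be fixed in advance from experience or determined automatically by the clustering method.
Some of the main clustering methods in data mining:
- Hierarchical clustering: hierarchical clustering works by grouping the objects into a tree of clusters.
- Agglomerative and divisive hierarchical clustering:
+ Agglomerative hierarchical clustering: starts by placing each object in its own cluster, then merges these atomic clusters into larger and larger clusters until all objects are in a single cluster or a given stopping condition is met.
+ Divisive hierarchical clustering: works the other way round. It starts with all objects in one cluster and splits it into smaller and smaller parts until each object forms a cluster of its own or a given stopping condition is met.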
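A minimal sketch of the agglomerative variant follows. It assumes one-dimensional data, single linkage (the distance between two clusters is the distance between their closest members), and a stopping condition expressed as a target number of clusters `k`; all three are choices of this sketch, and the sample values are hypothetical.

```python
def agglomerative(points, k):
    """Bottom-up clustering of 1-D points with single linkage, stopping at k clusters."""
    clusters = [[p] for p in points]  # every object starts in its own cluster
    while len(clusters) > k:
        best = None
        # Find the pair of clusters whose closest members are nearest to each other.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical call durations in minutes.
minutes = [1, 2, 3, 20, 22, 50]
print(agglomerative(minutes, 3))  # [[1, 2, 3], [20, 22], [50]]
```

The divisive variant would run the same idea in reverse, starting from one cluster and splitting until the stopping condition holds.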
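1.2.6. Example-based method (Based-on Pattern)
This method uses descriptive examples from the database to build a model that predicts new samples by deriving attributes similar to those of the known samples in the model. The techniques used include nearest-neighbor classification, regression algorithms, and case-based reasoning systems. The drawback of this approach is that a distance must be defined and the similarity between samples must be measured. The model is usually evaluated by cross-validation on the prediction errors. The approach is applied to methods that approximate attribute values; however, such models are hard to interpret because they have no explicit form.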
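Nearest-neighbor classification and its cross-validated error can be sketched as follows. The sketch assumes Euclidean distance as the similarity measure and leave-one-out as the cross-validation scheme; the samples, their two features, and the labels "leave" and "stay" are hypothetical.

```python
import math

def nearest_neighbor(train, query):
    """Return the label of the training sample closest to `query` (Euclidean distance)."""
    _, label = min(train, key=lambda s: math.dist(s[0], query))
    return label

def leave_one_out_error(samples):
    """Cross-validation: predict each sample from all the others and count the mistakes."""
    errors = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if nearest_neighbor(rest, features) != label:
            errors += 1
    return errors / len(samples)

# Hypothetical samples: (calls per month, spend per month) -> label
samples = [
    ((10, 5), "leave"), ((12, 6), "leave"), ((11, 4), "leave"),
    ((60, 40), "stay"), ((65, 42), "stay"), ((58, 39), "stay"),
]
print(nearest_neighbor(samples, (14, 7)))  # leave
print(leave_one_out_error(samples))        # 0.0
```

The choice of distance matters: with features on very different scales, one of them can dominate `math.dist`, which is one reason the data are usually normalized first.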
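1.2.7. Dependency modeling on probabilistic graphs (Depending based-on Probability Graph)
Graphical models specify the probabilistic dependencies between events through direct links given by the edges of a graph (Pearl 1988, Whittaker 1990). In its simplest form, the model specifies which variables depend directly on one another. Models of this kind use variables with discrete or categorical values. They have, however, also been extended to some special cases such as Gaussian densities, or to real-valued variables.
This method was first developed within expert systems, where the model structure and the parameters were derived from the experts. Today these methods have advanced, and both the structure and the parameters of the graphical model can be learned directly from the database (Buntine, Heckerman). The criterion for evaluating the model is mainly of Bayesian form. The search for a model is based on the "hill-climbing" method over many graph structures. Although the method is still new, it is promising because the graphical form is easier to understand and conveys more meaning to people.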
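1.2.8. Neural networks (Neuron Network).
Neural networks are a new computational approach that involves developing mathematical structures with the ability to learn. They are the result of research on models of how the human nervous system learns. A network can derive meaning from complex or imprecise data and can be used to extract patterns and detect trends that are too complex for people, or for other computing techniques, to detect.
When data mining is discussed, neural networks are often mentioned. Although neural networks have some limitations that make them difficult to apply and deploy, they also have considerable advantages. One of these is the ability to produce highly accurate predictive models that can be applied to many different problems and meet the tasks set for data mining, such as classification, grouping, modeling, and forecasting time-dependent events.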