With the frequent itemset $\{i_1, i_2, i_3\}$, an association rule of the following form can be generated:
50% of the customers who buy MANY of $\{i_1, i_2\}$ also buy MANY of $\{i_3\}$.
1.7.4. Finding association rules across fuzzy data mining contexts [7]
Let $FFS(O, I, RF, \{\mu_i\}, \tau, minsupp)$ denote the set of frequent itemsets of the fuzzy data mining context that corresponds to the family of membership functions $\{\mu_i\}$, the context conversion threshold $\tau$ and the threshold $minsupp$. With three families of membership functions $\{\mu_i^{MANY}\}$, $\{\mu_i^{AVER}\}$ and $\{\mu_i^{FEW}\}$ for each item $i \in I$, three different fuzzy data mining contexts can be created. The frequent itemset algorithms presented in the preceding sections can then be applied to find the sets:
• $FFS_1 = FFS(O, I, RF, \{\mu_i^{MANY}\}, \tau, minsupp)$, for the MANY membership functions
• $FFS_2 = FFS(O, I, RF, \{\mu_i^{AVER}\}, \tau, minsupp)$, for the AVERAGE membership functions
• $FFS_3 = FFS(O, I, RF, \{\mu_i^{FEW}\}, \tau, minsupp)$, for the FEW membership functions.
Find $SF_1 \in FFS_1$ and $SF_2 \in FFS_2$ such that $SF_{12} = SF_1 \cap SF_2 \neq \emptyset$, then split $SF_{12}$ into non-empty subsets $X$, $Y$ with $SF_{12} = X \cup Y$ and $X \cap Y = \emptyset$ to form an association rule $X \rightarrow Y$ between different contexts. If the confidence of this rule exceeds the threshold $minconf$, association rules of the following form are obtained:
• 56% of the customers who buy MANY of item X will buy FEW of item Y.
1.8. USING ASSOCIATION RULES TO CLASSIFY DATA AND TO EXTEND THE ATTRIBUTE DEPENDENCY COEFFICIENT IN ROUGH SET THEORY [9]
1.8.1. Basic concepts
Definition 1.22. Binary decision table
Consider a data mining context $(O, D, R)$, where $O$ is a non-empty set of objects and $D$ is a non-empty set of indicators (binary attributes). Let $H$ and $C$ be non-empty subsets of $D$ such that $D = H \cup C$ and $H \cap C = \emptyset$. The triple $(O, D = H \cup C, R)$ is called a binary decision table.
Table 1.11: An example of a binary decision table
Table 1.11 is an example of a binary decision table with $H = \{d1, d2, d3, d4, d5\}$ and $C = \{c1, c2\}$. Attribute c1 identifies the negative class; attribute c2 identifies the positive class.
Definition 1.23. Classification rule on a binary decision table
Given a binary decision table $(O, D = H \cup C, R)$, let $S$ be a non-empty subset of $H$. A classification rule on the binary decision table has the form $S \rightarrow \{c\}$ with $c \in C$. The classification function $f$ derived from a classification rule has the form $f = \bigwedge_{d \in H'} d$ with $H' \subset H$.
Example 1.8. Some classification rules on the binary decision table in Table 1.11:
R1: {d3, d4} -> {c2}; R2: {d2, d5} -> {c1}; R3: {d5} -> {c1}
The corresponding classification functions are $f_1 = d3 \wedge d4$, $f_2 = d2 \wedge d5$ and $f_3 = d5$. An object $o$ satisfies the classification function $f$ if $o$ contains all the indicators in $H'$.
1.8.2. Accuracy of a classification function
Given a binary decision table $(O, D = H \cup C, R)$ whose objects are assigned to two classes, let $O^+$ be the set of objects of $O$ that belong to class c2 and $O^-$ the set of objects of $O$ that belong to class c1. For a classification function $f$, the following criteria can be used to determine its accuracy [24], [38], [48].
Let $TP = \{o \in O^+ \mid f(o) \text{ is true}\}$; $FP = \{o \in O^+ \mid f(o) \text{ is false}\}$
Table 1.11 (data):
      d1  d2  d3  d4  d5  c1  c2
o1     1   0   0   1   0   1   0
o2     0   1   0   1   0   0   1
o3     0   0   1   1   0   0   1
o4     1   0   0   0   1   1   0
o5     0   1   0   0   1   1   0
o6     0   0   1   0   1   0   1
o7     0   1   0   0   1   1   0
o8     0   0   1   1   0   0   1
$TN = \{o \in O^- \mid f(o) \text{ is true}\}$; $FN = \{o \in O^- \mid f(o) \text{ is false}\}$
The accuracy for class c1 is computed as:
$$\frac{|TN|}{|TP| + |TN|} \qquad (1.5)$$
The accuracy for class c2 is computed as:
$$\frac{|TP|}{|TP| + |TN|} \qquad (1.6)$$
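Example 1.9. For the binary decision table in Table 1.11:
• Consider the classification rule for class c1, {d2, d5} -> {c1}, with $f = d2 \wedge d5$:
$O^+ = \{o2, o3, o6, o8\}$ for c2; $O^- = \{o1, o4, o5, o7\}$ for c1
$TP = \{o \in O^+ \mid f(o) \text{ is true}\} = \emptyset$
$FP = \{o \in O^+ \mid f(o) \text{ is false}\} = \{o2, o3, o6, o8\}$
$TN = \{o \in O^- \mid f(o) \text{ is true}\} = \{o5, o7\}$
$FN = \{o \in O^- \mid f(o) \text{ is false}\} = \{o1, o4\}$
Accuracy for class c1: $\dfrac{|TN|}{|TP| + |TN|} = \dfrac{|\{o5, o7\}|}{|\emptyset| + |\{o5, o7\}|} = 1.0$
• Consider the classification rule for class c2, {d3, d4} -> {c2}, with $f = d3 \wedge d4$:
$O^+ = \{o2, o3, o6, o8\}$ for c2; $O^- = \{o1, o4, o5, o7\}$ for c1
$TP = \{o \in O^+ \mid f(o) \text{ is true}\} = \{o3, o8\}$
$FP = \{o \in O^+ \mid f(o) \text{ is false}\} = \{o2, o6\}$
$TN = \{o \in O^- \mid f(o) \text{ is true}\} = \emptyset$
$FN = \{o \in O^- \mid f(o) \text{ is false}\} = \{o1, o4, o5, o7\}$
Accuracy for class c2: $\dfrac{|TP|}{|TP| + |TN|} = \dfrac{|\{o3, o8\}|}{|\{o3, o8\}| + |\emptyset|} = 1.0$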
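The computation in Example 1.9 can be reproduced with a short Python sketch. The encoding of Table 1.11 as a dictionary of indicator sets and the helper names (`split_counts`, `accuracy_c1`, `accuracy_c2`) are illustrative assumptions, not part of the original text; the sets TP, FP, TN and FN follow the definitions of this section, which differ from the usual confusion-matrix convention.

```python
# Table 1.11: each object mapped to the indicators it contains.
ROWS = {
    "o1": {"d1", "d4", "c1"}, "o2": {"d2", "d4", "c2"},
    "o3": {"d3", "d4", "c2"}, "o4": {"d1", "d5", "c1"},
    "o5": {"d2", "d5", "c1"}, "o6": {"d3", "d5", "c2"},
    "o7": {"d2", "d5", "c1"}, "o8": {"d3", "d4", "c2"},
}

def split_counts(rows, conjunction, positive="c2", negative="c1"):
    """Return (TP, FP, TN, FN) as sets, using this section's definitions."""
    pos = {o for o, ind in rows.items() if positive in ind}       # O+
    neg = {o for o, ind in rows.items() if negative in ind}       # O-
    holds = {o for o, ind in rows.items() if conjunction <= ind}  # f(o) is true
    return pos & holds, pos - holds, neg & holds, neg - holds

def accuracy_c1(tp, tn):
    return len(tn) / (len(tp) + len(tn))  # formula (1.5)

def accuracy_c2(tp, tn):
    return len(tp) / (len(tp) + len(tn))  # formula (1.6)

tp1, fp1, tn1, fn1 = split_counts(ROWS, {"d2", "d5"})  # f = d2 AND d5
tp2, fp2, tn2, fn2 = split_counts(ROWS, {"d3", "d4"})  # f = d3 AND d4
acc1 = accuracy_c1(tp1, tn1)
acc2 = accuracy_c2(tp2, tn2)
print(sorted(tn1), acc1)
print(sorted(tp2), acc2)
```

Both formulas divide by $|TP| + |TN|$, so this sketch would raise a `ZeroDivisionError` for a function $f$ that no object satisfies; a real implementation would need to decide how to treat that case.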
1.8.3. Using association rules as data classification rules
Given a decision table $(O, D = H \cup C, R)$ and the thresholds $minsupp$ and $minconf$, find the association rules of the form $r: S \rightarrow \{c\}$ with $c \in C$ and $S \subset H$. These association rules can be used as data classification rules. By definition, the confidence of the association rule $r: S \rightarrow \{c\}$ is:
$$CF(r) = \frac{|\rho(S) \cap \rho(\{c\})|}{|\rho(S)|}$$
where $\rho(S)$ is the set of objects that contain the attributes in $S$ and $\rho(\{c\})$ is the set of objects in class $c$. Hence $\rho(S) \cap \rho(\{c\})$ is the set of objects that belong to class $c$ and contain the attributes in $S$. If $c$ is class c2, then $\rho(S) \cap \rho(\{c2\}) = TP$ and $\rho(S) = TP \cup TN$, so $|\rho(S)| = |TP| + |TN|$ because $TP \cap TN = \emptyset$. In other words:
$$CF(S \rightarrow \{c1\}) = \frac{|TN|}{|TP| + |TN|} \qquad (1.7)$$
$$CF(S \rightarrow \{c2\}) = \frac{|TP|}{|TP| + |TN|} \qquad (1.8)$$
Remark: The confidence of an association rule can be used to evaluate the accuracy of a classification function.
Example 1.10. For the binary decision table in Table 1.11, with minimum support threshold minsupp = 0.2 and minimum confidence threshold minconf = 0.7, the association rules are:
r1: {d1} -> {c1}; SP = 0.25, CF = 1.00
r2: {d3} -> {c2}; SP = 0.38, CF = 1.00
r3: {d4} -> {c2}; SP = 0.38, CF = 0.75
r4: {d5} -> {c1}; SP = 0.38, CF = 0.75
r5: {d2, d5} -> {c1}; SP = 0.25, CF = 1.00
r6: {d3, d4} -> {c2}; SP = 0.25, CF = 1.00
Among them, the rules r1, r2, r5 and r6 classify with 100% accuracy.
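The rules of Example 1.10 can be checked by brute force. The sketch below is only an illustration, not the mining algorithm of the earlier sections: it enumerates every left-hand side of one or two indicators, and it assumes minsupp = 0.2 and minconf = 0.7. The names `rho` and `mine_rules` are hypothetical helpers.

```python
from itertools import combinations

# Table 1.11: each object mapped to the indicators it contains.
ROWS = {
    "o1": {"d1", "d4", "c1"}, "o2": {"d2", "d4", "c2"},
    "o3": {"d3", "d4", "c2"}, "o4": {"d1", "d5", "c1"},
    "o5": {"d2", "d5", "c1"}, "o6": {"d3", "d5", "c2"},
    "o7": {"d2", "d5", "c1"}, "o8": {"d3", "d4", "c2"},
}
H = ["d1", "d2", "d3", "d4", "d5"]  # condition indicators
C = ["c1", "c2"]                    # class indicators

def rho(items):
    """Objects that contain every indicator in `items`."""
    return {o for o, ind in ROWS.items() if set(items) <= ind}

def mine_rules(minsupp, minconf, max_len=2):
    """Exhaustively list rules S -> {c} that meet both thresholds."""
    rules = []
    for size in range(1, max_len + 1):
        for s in combinations(H, size):
            covered = rho(s)
            if not covered:
                continue  # S never occurs, so no rule can be formed
            for c in C:
                both = covered & rho([c])
                sp = len(both) / len(ROWS)     # support of S union {c}
                cf = len(both) / len(covered)  # confidence of S -> {c}
                if sp >= minsupp and cf >= minconf:
                    rules.append((s, c, sp, cf))
    return rules

rules = mine_rules(0.2, 0.7)
for s, c, sp, cf in rules:
    print(set(s), "->", c, f"SP={sp:.3f} CF={cf:.2f}")
```

Under these assumptions the sketch returns the six rules r1 to r6; the supports it reports are the exact fractions (for example 0.375), which the text rounds to two decimals.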
1.8.4. Using association rules to extend the attribute dependency coefficient in rough set theory
1.8.4.1. Basic concepts of rough set theory
This section uses the basic definitions of rough set theory as the foundation for constructing the extended attribute dependency coefficient [33], [79].
Definition 1.24: Information system
Let $O$ be a finite, non-empty set of objects and $A$ a finite, non-empty set of discrete attributes. Let $dom(a_i)$ be the value domain of attribute $a_i \in A$ and $V = \bigcup_{i=1}^{|A|} dom(a_i)$. The function $f_s: O \rightarrow A \times V$ determines the values of the objects for the attributes of $A$. An information system is the triple $(O, A, f_s)$.
Table 1.12: An example of an information system
Table 1.12 is an example of an information system with $O = \{o1, o2, o3, o4, o5, o6, o7, o8\}$ and $A = \{a, b, c\}$.
Given an information system $(O, A, f_s)$ and $B \subset A$, let $u(B)$ denote the values of the attributes in $B$ for the object $u$. Each object $o \in O$ corresponds to a feature vector whose components are the pairs <a, v> with $a \in A$ and $v = o(\{a\})$. Object o1 in Table 1.12 corresponds to the feature vector (<a,1>, <b,4>, <c,6>).
Table 1.12 (data):
O/A    a   b   c
o1     1   4   6
o2     2   4   7
o3     3   4   7
o4     1   5   6
o5     2   5   6
o6     3   5   7
o7     2   5   6
o8     3   4   7
Definition 1.25. Indiscernibility relation and partition of the object set
Given an information system $(O, A, f_s)$ and $B \subset A$, the indiscernibility relation $ind(B)$ on the object set $O$ is defined as follows:
$$\forall B \subset A,\ \forall u, v \in O:\quad u \; ind(B) \; v \iff u(B) = v(B) \qquad (1.9)$$
The relation $ind(B)$ holds between two objects $u$ and $v$ that have the same value on every attribute in $B$ (that is, $u(B) = v(B)$).
For $B \subset A$, it can be verified that $ind(B)$ is an equivalence relation. It therefore partitions the object set $O$ into equivalence classes. For $u \in O$, $[u]_{ind(B)}$ denotes the equivalence class of $u$ under $ind(B)$, and $O/B$ denotes the partition induced by $ind(B)$. Each element of the partition $O/B$ is called an elementary set or an equivalence class.
Example 1.11: For the data in Table 1.12 and $B = \{c\}$, the equivalence classes are:
• For <c,6>:
$[o1]_{ind(B)} = [o4]_{ind(B)} = [o5]_{ind(B)} = [o7]_{ind(B)} = \{o1, o4, o5, o7\}$
• For <c,7>:
$[o2]_{ind(B)} = [o3]_{ind(B)} = [o6]_{ind(B)} = [o8]_{ind(B)} = \{o2, o3, o6, o8\}$
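The partition induced by $ind(B)$ amounts to grouping objects by their values on the attributes in $B$. A minimal Python sketch follows, using Table 1.12 encoded as a dictionary; the name `partition` and the encoding are assumptions made for illustration.

```python
from collections import defaultdict

# Table 1.12: attribute values of each object.
TABLE = {
    "o1": {"a": 1, "b": 4, "c": 6}, "o2": {"a": 2, "b": 4, "c": 7},
    "o3": {"a": 3, "b": 4, "c": 7}, "o4": {"a": 1, "b": 5, "c": 6},
    "o5": {"a": 2, "b": 5, "c": 6}, "o6": {"a": 3, "b": 5, "c": 7},
    "o7": {"a": 2, "b": 5, "c": 6}, "o8": {"a": 3, "b": 4, "c": 7},
}

def partition(table, attrs):
    """Equivalence classes of ind(attrs), keyed by the shared value tuple."""
    classes = defaultdict(set)
    for obj, row in table.items():
        key = tuple(row[a] for a in attrs)  # u(B): values of u on B
        classes[key].add(obj)
    return dict(classes)

classes_c = partition(TABLE, ["c"])
classes_ab = partition(TABLE, ["a", "b"])
for key, members in classes_c.items():
    print(key, sorted(members))
```

Two objects land in the same class exactly when their value tuples coincide, which is the condition $u(B) = v(B)$ of (1.9).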
Definition 1.26: Decision table in rough set theory
Given an information system $(O, A, f_s)$, let $HR$ and $CR$ be non-empty subsets of $A$ such that $A = HR \cup CR$ and $HR \cap CR = \emptyset$. Then $(O, A = HR \cup CR, f_s)$ is called a decision table in rough set theory. $HR$ is the set of condition attributes and $CR$ is the set of decision attributes. Table 1.12 is an example of a decision table in rough set theory with $HR = \{a, b\}$ and $CR = \{c\}$.
Definition 1.27. Set approximation
Given an information system $(O, A, f_s)$, let $X$ be a non-empty subset of $O$ and $B$ a non-empty subset of $A$. To estimate the object set $X$ through the attribute set $B$, Z. Pawlak uses the lower approximation of $X$ with respect to $B$, denoted $B_*(X)$, and the upper approximation of $X$ with respect to $B$, denoted $B^*(X)$ [79]. The lower and upper approximations are defined as follows:
$$B_*(X) = \{u \in O \mid [u]_{ind(B)} \subset X\}$$
$$B^*(X) = \{u \in O \mid [u]_{ind(B)} \cap X \neq \emptyset\} \qquad (1.10)$$
Definition 1.28. Attribute dependency coefficient
Given two non-empty subsets $U$, $V$ of the attribute set $A$, the attribute dependency coefficient of $V$ on $U$ is used to examine how the attribute set $V$ depends on the attribute set $U$ and is defined as follows:
$$\gamma(U, V) = \sum_{X \in O/V} \frac{|U_*(X)|}{|O|} \qquad (1.11)$$
The dependency of $V$ on $U$ is written $U \rightarrow_k V$, where $k = \gamma(U, V)$. If $k = 1$, the attribute set $V$ depends totally on the attribute set $U$. If $0 < k < 1$, $V$ depends partially on $U$. If $k = 0$, $V$ does not depend on $U$ at all.
The attribute dependency coefficient $\gamma(U, V)$ is used to reflect the degree of dependency between two attribute sets [79].
Example 1.12. For the information system in Table 1.12, let $U = \{a, b\}$ and $V = \{c\}$. Compute $\gamma(U, V)$.
a) For $U = \{a, b\}$, the equivalence classes are:
• {<a,1>; <b,4>}: $U_1 = [o1]_{ind(U)} = \{o1\}$
• {<a,2>; <b,4>}: $U_2 = [o2]_{ind(U)} = \{o2\}$
• {<a,3>; <b,4>}: $U_3 = [o3]_{ind(U)} = [o8]_{ind(U)} = \{o3, o8\}$
• {<a,1>; <b,5>}: $U_4 = [o4]_{ind(U)} = \{o4\}$
• {<a,2>; <b,5>}: $U_5 = [o5]_{ind(U)} = [o7]_{ind(U)} = \{o5, o7\}$
• {<a,3>; <b,5>}: $U_6 = [o6]_{ind(U)} = \{o6\}$
b) For $V = \{c\}$, the equivalence classes are:
• For <c,6>:
$X_1 = [o1]_{ind(V)} = [o4]_{ind(V)} = [o5]_{ind(V)} = [o7]_{ind(V)} = \{o1, o4, o5, o7\}$
• For <c,7>:
$X_2 = [o2]_{ind(V)} = [o3]_{ind(V)} = [o6]_{ind(V)} = [o8]_{ind(V)} = \{o2, o3, o6, o8\}$
To compute the attribute dependency coefficient of $V$ on $U$ with formula (1.11), $U_*(X)$ must be computed for each $X \in O/V$.
• For $X_1 = \{o1, o4, o5, o7\}$: $U_*(X_1) = \{o1, o4, o5, o7\}$
• For $X_2 = \{o2, o3, o6, o8\}$: $U_*(X_2) = \{o2, o3, o6, o8\}$
$$\gamma(U, V) = \sum_{X \in O/V} \frac{|U_*(X)|}{|O|} = \frac{|U_*(X_1)| + |U_*(X_2)|}{8} = 1.0$$
Hence the attribute dependency coefficient of $V$ on $U$ is 1.0, that is, $V$ depends totally on $U$.
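The lower and upper approximations of Definition 1.27 and the coefficient $\gamma(U, V)$ of formula (1.11) translate directly into set operations. The sketch below is a minimal illustration under the same dictionary encoding of Table 1.12; the function names are assumptions, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Table 1.12: attribute values of each object.
TABLE = {
    "o1": {"a": 1, "b": 4, "c": 6}, "o2": {"a": 2, "b": 4, "c": 7},
    "o3": {"a": 3, "b": 4, "c": 7}, "o4": {"a": 1, "b": 5, "c": 6},
    "o5": {"a": 2, "b": 5, "c": 6}, "o6": {"a": 3, "b": 5, "c": 7},
    "o7": {"a": 2, "b": 5, "c": 6}, "o8": {"a": 3, "b": 4, "c": 7},
}

def partition(table, attrs):
    """Elementary sets of the partition O/attrs."""
    classes = {}
    for obj, row in table.items():
        classes.setdefault(tuple(row[a] for a in attrs), set()).add(obj)
    return list(classes.values())

def lower(table, attrs, X):
    """B_*(X): union of the elementary sets fully contained in X."""
    return {u for cls in partition(table, attrs) if cls <= X for u in cls}

def upper(table, attrs, X):
    """B^*(X): union of the elementary sets that intersect X."""
    return {u for cls in partition(table, attrs) if cls & X for u in cls}

def gamma(table, U, V):
    """Attribute dependency coefficient of V on U, formula (1.11)."""
    total = sum(len(lower(table, U, X)) for X in partition(table, V))
    return Fraction(total, len(table))

X1 = {"o1", "o4", "o5", "o7"}
print(gamma(TABLE, ["a", "b"], ["c"]))  # dependency of {c} on {a, b}
print(gamma(TABLE, ["b"], ["c"]))       # dependency of {c} on {b}
```

With $U = \{b\}$ no elementary set of $O/U$ fits inside a class of $O/V$, so every lower approximation is empty while every upper approximation is the whole of $O$.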
1.8.4.2. Extending the attribute dependency coefficient [9]
This section presents the theoretical basis for defining and computing the extended attribute dependency coefficient.
Definition 1.29. Inclusion degree function
Given an inclusion degree threshold $\theta \in [0, 1]$, let $\mu_c(S, T)$ be the function that reflects the degree to which $S$ is included in $T$. The function $\mu_c(S, T)$ is defined as follows:
$$\mu_c(S, T) = \frac{|S \cap T|}{|S|} \qquad (1.12)$$
If $\mu_c(S, T) \geq \theta$, the set $S$ is said to be included in $T$ with inclusion degree $\theta$. If $\theta = 1.0$, then $S \subset T$.
Definition 1.30. Extended lower approximation
With the inclusion degree function, the extended lower approximation $B_{**}(X)$ in rough set theory can be defined as follows:
$$B_{**}(X) = \{u \in O \mid \mu_c([u]_{ind(B)}, X) \geq \theta \wedge u \in X\} \qquad (1.13)$$
Definition 1.31. Extended attribute dependency coefficient
The extended dependency coefficient is defined through the inclusion degree function. Given two attribute sets $U$ and $V$, the extended attribute dependency coefficient of $V$ on $U$ is denoted $\Psi(U, V)$ and defined as follows:
$$\Psi(U, V) = \sum_{X \in O/V} \frac{|U_{**}(X)|}{|O|} \qquad (1.14)$$
Example 1.13 below illustrates the classification capability of the extended attribute dependency coefficient.
Example 1.13: Consider the decision table in Table 1.12 and let $U = \{b\}$ and $V = \{c\}$.
• For $U = \{b\}$, the equivalence classes are:
$[o1]_{ind(U)} = [o2]_{ind(U)} = [o3]_{ind(U)} = [o8]_{ind(U)} = \{o1, o2, o3, o8\}$
$[o4]_{ind(U)} = [o5]_{ind(U)} = [o6]_{ind(U)} = [o7]_{ind(U)} = \{o4, o5, o6, o7\}$
• For $V = \{c\}$, the equivalence classes are:
$[o1]_{ind(V)} = [o4]_{ind(V)} = [o5]_{ind(V)} = [o7]_{ind(V)} = \{o1, o4, o5, o7\}$
$[o2]_{ind(V)} = [o3]_{ind(V)} = [o6]_{ind(V)} = [o8]_{ind(V)} = \{o2, o3, o6, o8\}$
The traditional attribute dependency coefficient gives $\gamma(U, V) = \sum_{X \in O/V} |U_*(X)| / |O| = 0$.
In rough set theory, $\gamma(U, V) = 0$ means that $V$ does not depend on $U$. Under the requirements of approximate classification, however, $V$ can still be inferred from $U$ through two classification rules:
<b,4> -> <c,7>, classification accuracy = 0.75
<b,5> -> <c,6>, classification accuracy = 0.75
Based on this observation, the thesis extends the lower approximation of rough sets in order to define the extended attribute dependency coefficient $\Psi(U, V)$.
For the elementary sets of the partition $O/V$ and inclusion degree $\theta = 0.75$:
For $X_1 = \{o1, o4, o5, o7\}$: $U_{**}(X_1) = \{o4, o5, o7\}$
For $X_2 = \{o2, o3, o6, o8\}$: $U_{**}(X_2) = \{o2, o3, o8\}$
$$\Psi(U, V) = \sum_{X \in O/V} \frac{|U_{**}(X)|}{|O|} = \frac{|\{o4, o5, o7\}| + |\{o2, o3, o8\}|}{|O|} = \frac{6}{8} = 0.75$$
The extended attribute dependency coefficient therefore has better classification capability than the traditional attribute dependency coefficient, especially for approximate classification [9].
Remark: When the inclusion degree threshold is $\theta = 1.0$, $\Psi(U, V) = \gamma(U, V)$.
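Example 1.13 and the remark can be verified with a few lines of Python. This is a minimal sketch of formulas (1.12) to (1.14) on Table 1.12; the function names `inclusion`, `extended_lower` and `psi` are illustrative assumptions.

```python
from fractions import Fraction

# Table 1.12: attribute values of each object.
TABLE = {
    "o1": {"a": 1, "b": 4, "c": 6}, "o2": {"a": 2, "b": 4, "c": 7},
    "o3": {"a": 3, "b": 4, "c": 7}, "o4": {"a": 1, "b": 5, "c": 6},
    "o5": {"a": 2, "b": 5, "c": 6}, "o6": {"a": 3, "b": 5, "c": 7},
    "o7": {"a": 2, "b": 5, "c": 6}, "o8": {"a": 3, "b": 4, "c": 7},
}

def partition(table, attrs):
    """Elementary sets of the partition O/attrs."""
    classes = {}
    for obj, row in table.items():
        classes.setdefault(tuple(row[a] for a in attrs), set()).add(obj)
    return list(classes.values())

def inclusion(S, T):
    """mu_c(S, T) = |S & T| / |S|, formula (1.12)."""
    return Fraction(len(S & T), len(S))

def extended_lower(table, attrs, X, theta):
    """B_**(X): objects of X whose class is included in X to degree theta."""
    result = set()
    for cls in partition(table, attrs):
        if inclusion(cls, X) >= theta:
            result |= cls & X
    return result

def psi(table, U, V, theta):
    """Extended attribute dependency coefficient, formula (1.14)."""
    total = sum(len(extended_lower(table, U, X, theta))
                for X in partition(table, V))
    return Fraction(total, len(table))

theta = Fraction(3, 4)
X1 = {"o1", "o4", "o5", "o7"}
print(sorted(extended_lower(TABLE, ["b"], X1, theta)))
print(psi(TABLE, ["b"], ["c"], theta), psi(TABLE, ["b"], ["c"], 1))
```

Setting `theta` to 1 makes the inclusion test equivalent to set containment, which is why the extended coefficient then coincides with $\gamma(U, V)$.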
1.8.4.3. Converting a decision table in rough set theory into a binary decision table
Given an information system $(O, A = HR \cup CR, f_s)$ with $V = \bigcup_{i=1}^{|A|} dom(a_i)$, let $D$ be the set of indicators $d =$ <a, v> $\in A \times V$ that satisfy the function $f_s$. From $(O, A = HR \cup CR, f_s)$, build the binary relation $R \subset O \times D$ such that $o \; R \; d \iff o(a) = v$ and $d =$ <a, v>.
Table 1.11 is the binary decision table obtained by converting the traditional decision table (Table 1.12), with the indicators $d$ as follows:
d1 = <a,1>; d2 = <a,2>; d3 = <a,3>; d4 = <b,4>; d5 = <b,5>; c1 = <c,6>; c2 = <c,7>
Consider the function attributes, defined as follows:
$$\forall S \subset D,\quad attributes(S) = \{a \in A \mid \langle a, v \rangle \in S\} \qquad (1.15)$$
The function attributes returns the names of the attributes that occur in a subset $S$ of the indicators of $D$.
Property 1.6: With the pair of functions $(\rho, \lambda)$ defined above, let $U \subset A$, let $O/U$ be the partition of $O$ under the indiscernibility relation $ind(U)$, and let $U_1, U_2, \ldots, U_k$ be the elementary sets of the partition $O/U$. Then $\rho(\lambda(U_j)) = U_j$ for all $j = 1, \ldots, k$.
Example 1.14: Let $U = \{a, b\}$ and take the elementary set of the partition $O/U$ given by the equivalence class $U_5 = [o5]_{ind(U)} = [o7]_{ind(U)} = \{o5, o7\}$, which is determined by <a,2> and <b,5>. Under the encoding above, the two corresponding indicators are d2 = <a,2> and d5 = <b,5>. Using the pair of functions $\rho$, $\lambda$ defined above:
$\lambda(\{o5, o7\}) = \{d2, d5, c1\}$; $\rho(\lambda(\{o5, o7\})) = \rho(\{d2, d5, c1\}) = \{o5, o7\} = U_5$
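The conversion and Property 1.6 can be illustrated in Python. The functions $\rho$ and $\lambda$ are defined earlier in the thesis, outside this section; the sketch assumes, consistently with Example 1.14, that $\lambda$ returns the indicators shared by a set of objects and $\rho$ returns the objects that contain a set of indicators. The names `to_binary`, `rho`, `lam` and `attributes` and the encoding of an indicator as an `(attribute, value)` tuple are illustrative choices.

```python
# Table 1.12: attribute values of each object.
TABLE = {
    "o1": {"a": 1, "b": 4, "c": 6}, "o2": {"a": 2, "b": 4, "c": 7},
    "o3": {"a": 3, "b": 4, "c": 7}, "o4": {"a": 1, "b": 5, "c": 6},
    "o5": {"a": 2, "b": 5, "c": 6}, "o6": {"a": 3, "b": 5, "c": 7},
    "o7": {"a": 2, "b": 5, "c": 6}, "o8": {"a": 3, "b": 4, "c": 7},
}

def to_binary(table):
    """Map each object to the set of indicators <a, v> it satisfies."""
    return {o: set(row.items()) for o, row in table.items()}

BINARY = to_binary(TABLE)

def rho(indicators):
    """Assumed rho: objects that contain every given indicator."""
    return {o for o, ind in BINARY.items() if set(indicators) <= ind}

def lam(objects):
    """Assumed lambda: indicators shared by all given objects (non-empty set)."""
    return set.intersection(*(BINARY[o] for o in objects))

def attributes(indicators):
    """Formula (1.15): attribute names occurring in a set of indicators."""
    return {a for a, _ in indicators}

U5 = {"o5", "o7"}
shared = lam(U5)             # corresponds to {d2, d5, c1}
print(sorted(shared))
print(sorted(rho(shared)))   # recovers U5, as Property 1.6 states
print(sorted(attributes(shared)))
```

Here `("a", 2)`, `("b", 5)` and `("c", 6)` play the roles of d2, d5 and c1.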
1.8.4.4. Computing the extended attribute dependency coefficient from the confidence and support of association rules
Lemma 1.1: Let $S \subset D$ and $T \subset D$. The degree to which $\rho(S)$ is included in $\rho(T)$ is:
$$\mu_c(\rho(S), \rho(T)) = \frac{|\rho(S) \cap \rho(T)|}{|\rho(S)|} = CF(S \rightarrow T) \qquad (1.16)$$
Theorem 1.7 ([9]). Let $(O, A = HR \cup CR, f_s)$ be a decision table and $(O, D = H \cup C, R)$ the corresponding converted binary decision table. Let $U$ and $V$ be two subsets of $A$, let $U_j$ be the elementary sets of the partition $O/U$ and $X$ an elementary set of the partition $O/V$, and let $J$ be the set of indices such that $\mu_c(U_j, X) \geq \theta$ for all $j \in J$. Then:
$$\Psi(U, V) = \sum_{X \in O/V} \sum_{j \in J} CF(\lambda(U_j) \rightarrow \lambda(X)) \cdot SP(\lambda(U_j)) \qquad (1.17)$$
where $D$ is the indicator set of the binary decision table $(O, D, R)$ converted from the decision table $(O, A, f_s)$.
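Proof: Let $J$ be the set of indices such that $\mu_c(U_j, X) \geq \theta$ for all $j \in J$, where $U_j$ is an elementary set of the partition $O/U$. Then $|U_{**}(X)|$ can be computed as:
$$|U_{**}(X)| = \sum_{j \in J} |U_j \cap X|$$
Since $\lambda(U_j) \subset D$ and $\lambda(X) \subset D$, the support and confidence of the association rule $\lambda(U_j) \rightarrow \lambda(X)$ are available, so $CF(\lambda(U_j) \rightarrow \lambda(X)) = |\rho(\lambda(U_j)) \cap \rho(\lambda(X))| / |\rho(\lambda(U_j))|$. By Property 1.6, because $U_j$ and $X$ are elementary sets of partitions, $\rho(\lambda(U_j)) = U_j$ and $\rho(\lambda(X)) = X$, hence $|\rho(\lambda(U_j)) \cap \rho(\lambda(X))| = |U_j \cap X| = CF(\lambda(U_j) \rightarrow \lambda(X)) \cdot |U_j|$. Moreover, the support of the set $\lambda(U_j)$ is $SP(\lambda(U_j)) = |\rho(\lambda(U_j))| / |O| = |U_j| / |O|$, so $|U_j| = SP(\lambda(U_j)) \cdot |O|$. In summary: $|U_j \cap X| = CF(\lambda(U_j) \rightarrow \lambda(X)) \cdot SP(\lambda(U_j)) \cdot |O|$.
If $\lambda(U_j)$ is a frequent itemset and $\lambda(U_j) \rightarrow \lambda(X)$ is an association rule, the extended attribute dependency coefficient can be computed as follows:
$$\Psi(U, V) = \sum_{X \in O/V} \sum_{j \in J} CF(\lambda(U_j) \rightarrow \lambda(X)) \cdot SP(\lambda(U_j))$$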
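As a numerical sanity check of Theorem 1.7 on Table 1.12, the sketch below computes $\Psi(U, V)$ twice: once by counting objects as in the proof, and once from $CF$ and $SP$ as in formula (1.17). It is not a proof, only a check on one table, and it relies on the same assumed readings of $\rho$ and $\lambda$ (objects containing a set of indicators; indicators shared by a set of objects). All function names are hypothetical.

```python
from fractions import Fraction

# Table 1.12: attribute values of each object.
TABLE = {
    "o1": {"a": 1, "b": 4, "c": 6}, "o2": {"a": 2, "b": 4, "c": 7},
    "o3": {"a": 3, "b": 4, "c": 7}, "o4": {"a": 1, "b": 5, "c": 6},
    "o5": {"a": 2, "b": 5, "c": 6}, "o6": {"a": 3, "b": 5, "c": 7},
    "o7": {"a": 2, "b": 5, "c": 6}, "o8": {"a": 3, "b": 4, "c": 7},
}
N = len(TABLE)
BINARY = {o: set(row.items()) for o, row in TABLE.items()}

def partition(attrs):
    classes = {}
    for obj, row in TABLE.items():
        classes.setdefault(tuple(row[a] for a in attrs), set()).add(obj)
    return list(classes.values())

def rho(indicators):
    return {o for o, ind in BINARY.items() if set(indicators) <= ind}

def lam(objects):
    return set.intersection(*(BINARY[o] for o in objects))

def cf(S, T):
    return Fraction(len(rho(S) & rho(T)), len(rho(S)))  # confidence of S -> T

def sp(S):
    return Fraction(len(rho(S)), N)                     # support of S

def selected_pairs(U, V, theta):
    """Pairs (U_j, X) with mu_c(U_j, X) >= theta, i.e. j in J for that X."""
    return [(Uj, X) for X in partition(V) for Uj in partition(U)
            if Fraction(len(Uj & X), len(Uj)) >= theta]

def psi_by_counting(U, V, theta):
    return Fraction(sum(len(Uj & X) for Uj, X in selected_pairs(U, V, theta)), N)

def psi_by_rules(U, V, theta):
    return sum((cf(lam(Uj), lam(X)) * sp(lam(Uj))
                for Uj, X in selected_pairs(U, V, theta)), Fraction(0))

for U in (["a"], ["b"], ["a", "b"]):
    print(U, psi_by_counting(U, ["c"], Fraction(3, 4)),
          psi_by_rules(U, ["c"], Fraction(3, 4)))
```

The two functions agree because $CF \cdot SP = \frac{|U_j \cap X|}{|U_j|} \cdot \frac{|U_j|}{|O|}$, which is the cancellation used in the proof.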
1.8.4.5. Building an algorithm based on the extended attribute dependency coefficient
Given a decision table $(O, A = HR \cup CR, f_s)$ and a classification accuracy threshold $minprecision \in [0, 1]$, find the classification rules $S \rightarrow T$ with $S \subset HR$ and $T \subset CR$ such that the accuracy of the classification rule $S \rightarrow T$ is greater than or equal to $minprecision$. Let $(O, D = H \cup C, R)$ be the binary decision table converted from the decision table $(O, A = HR \cup CR, f_s)$, and let the thresholds $minsupp$, $minconf$ and $minprecision$ be given. Let $FS(O, D = H \cup C, R, minsupp)$ be the set of frequent itemsets of $(O, D = H \cup C, R)$, and let $R(O, D = H \cup C, R, minsupp, minconf)$ be the set of association rules in the form of classification rules $S \rightarrow T$ with $S \subset H$, $T \subset C$ and $D = H \cup C$.
Algorithm 1.11 below uses the extended attribute dependency coefficient to find data classification rules.
Algorithm 1.11: Find classification rules based on the extended dependency coefficient
Input: Decision table (O, A = HR ∪ CR, fs)
       Thresholds minsupp, minconf, minprecision
Output: The set of classification rules S -> T such that S ⊂ H, T ⊂ C, D = H ∪ C, with classification threshold minprecision.
Step 1: Convert the decision table (O, A = HR ∪ CR, fs) into the binary decision table (O, D = H ∪ C, R).
Step 2: Compute FS(O, D = H ∪ C, R, minsupp) and R(O, D = H ∪ C, R, minsupp, minconf) with the algorithms for finding frequent itemsets and association rules.
Step 3: Partition the set R(O, D = H ∪ C, R, minsupp, minconf) into groups of classification rules S -> T whose sets S contain the same attributes and whose sets T contain the same attributes. Let C = {G1, G2, ..., Gk} be the rule groups obtained.
Step 4: Perform the following steps:
    For each G ∈ C do
        Take r ∈ G with r = S -> T
        Let U = Attributes(S) and V = Attributes(T)
        // Compute Ψ(U,V)
        Psi = 0
        For each r: S -> T with r ∈ G do
            Compute CF(S -> T) and SP(S)   // using the association rule mining algorithm
            Psi = Psi + CF(S -> T) * SP(S)
        Endfor // r
        If Psi ≥ minprecision
            Write (U,V) to the result set KetQua
        Endif
    Endfor // G
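The four steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the thesis implementation: Step 2 is replaced by exhaustive enumeration instead of the frequent itemset algorithms of the earlier sections, the right-hand side T is restricted to a single decision indicator, and the names `find_dependencies`, `COND` and `DEC` are hypothetical. Exact fractions are used so that the threshold comparisons are not affected by floating-point rounding.

```python
from fractions import Fraction
from itertools import combinations

# Table 1.12: attribute values of each object.
TABLE = {
    "o1": {"a": 1, "b": 4, "c": 6}, "o2": {"a": 2, "b": 4, "c": 7},
    "o3": {"a": 3, "b": 4, "c": 7}, "o4": {"a": 1, "b": 5, "c": 6},
    "o5": {"a": 2, "b": 5, "c": 6}, "o6": {"a": 3, "b": 5, "c": 7},
    "o7": {"a": 2, "b": 5, "c": 6}, "o8": {"a": 3, "b": 4, "c": 7},
}
COND, DEC = ["a", "b"], ["c"]  # HR and CR

# Step 1: binary decision table, one indicator per (attribute, value) pair.
BINARY = {o: set(row.items()) for o, row in TABLE.items()}
N = len(BINARY)

def rho(items):
    """Objects that contain every indicator in `items`."""
    return {o for o, ind in BINARY.items() if set(items) <= ind}

def find_dependencies(minsupp, minconf, minprecision):
    H = sorted({d for ind in BINARY.values() for d in ind if d[0] in COND})
    C = sorted({d for ind in BINARY.values() for d in ind if d[0] in DEC})
    psi = {}
    for size in range(1, len(COND) + 1):
        for S in combinations(H, size):
            covered = rho(S)
            if not covered:
                continue  # e.g. two values of the same attribute
            for c in C:
                # Step 2 (brute force): keep S -> {c} if it meets both thresholds.
                both = covered & rho([c])
                sp_rule = Fraction(len(both), N)
                cf = Fraction(len(both), len(covered))
                if sp_rule < minsupp or cf < minconf:
                    continue
                # Step 3: group rules by the attribute names on each side.
                key = (tuple(sorted({a for a, _ in S})), (c[0],))
                # Step 4: accumulate CF(S -> T) * SP(S) within the group.
                psi[key] = psi.get(key, 0) + cf * Fraction(len(covered), N)
    kept = {k: v for k, v in psi.items() if v >= minprecision}
    return psi, kept

psi, kept = find_dependencies(Fraction("0.1"), Fraction("0.75"), Fraction("0.75"))
for (U, V), value in sorted(psi.items()):
    print(U, "->", V, float(value), "kept" if (U, V) in kept else "dropped")
```

Note that the weight inside the sum is the support of the left-hand side S, not the support of the whole rule, which matches the line `Psi = Psi + CF(S -> T) * SP(S)` of the algorithm.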
Illustrative example for Algorithm 1.11
For the decision table in Table 1.12 (binary form in Table 1.11), let the minimum support threshold be minsupp = 0.1, the minimum confidence threshold minconf = 0.75 and the minimum accuracy threshold minprecision = 0.75. Applying the algorithms that derive classification rules from association rules gives the following rule groups.
Group G1:
• Association rule {d1} -> {c1}
  r1: <a,1> -> <c,6>
  Left-hand attribute: a; right-hand attribute: c.
  SP(r1) = 0.25, CF(r1) = 1.00, SP({d1}) = 0.25
• Association rule {d3} -> {c2}
  r2: <a,3> -> <c,7>
  Left-hand attribute: a; right-hand attribute: c.
  SP(r2) = 0.38, CF(r2) = 1.00, SP({d3}) = 0.38
Ψ({a},{c}) = CF(r1)*SP({d1}) + CF(r2)*SP({d3}) = 0.63
Group G2:
• Association rule {d4} -> {c2}
  r3: <b,4> -> <c,7>
  Left-hand attribute: b; right-hand attribute: c.
  SP(r3) = 0.38, CF(r3) = 0.75, SP({d4}) = 0.5
• Association rule {d5} -> {c1}
  r4: <b,5> -> <c,6>
  Left-hand attribute: b; right-hand attribute: c.
  SP(r4) = 0.38, CF(r4) = 0.75, SP({d5}) = 0.5
Ψ({b},{c}) = CF(r3)*SP({d4}) + CF(r4)*SP({d5}) = 0.75*0.5 + 0.75*0.5 = 0.75
Group G3:
• Association rule {d1,d4} -> {c1}
  r5: <a,1> ∧ <b,4> -> <c,6>
  Left-hand attributes: a, b; right-hand attribute: c.
  SP(r5) = 0.13, CF(r5) = 1.00, SP({d1,d4}) = 0.125
• Association rule {d1,d5} -> {c1}
  r6: <a,1> ∧ <b,5> -> <c,6>
  Left-hand attributes: a, b; right-hand attribute: c.
  SP(r6) = 0.13, CF(r6) = 1.00, SP({d1,d5}) = 0.125
• Association rule {d2,d4} -> {c2}
  r7: <a,2> ∧ <b,4> -> <c,7>
  Left-hand attributes: a, b; right-hand attribute: c.
  SP(r7) = 0.13, CF(r7) = 1.00, SP({d2,d4}) = 0.125
• Association rule {d2,d5} -> {c1}
  r8: <a,2> ∧ <b,5> -> <c,6>
  Left-hand attributes: a, b; right-hand attribute: c.
  SP(r8) = 0.25, CF(r8) = 1.00, SP({d2,d5}) = 0.25
• Association rule {d3,d4} -> {c2}
  r9: <a,3> ∧ <b,4> -> <c,7>
  Left-hand attributes: a, b; right-hand attribute: c.
  SP(r9) = 0.25, CF(r9) = 1.00, SP({d3,d4}) = 0.25
• Association rule {d3,d5} -> {c2}
  r10: <a,3> ∧ <b,5> -> <c,7>
  Left-hand attributes: a, b; right-hand attribute: c.
  SP(r10) = 0.13, CF(r10) = 1.00, SP({d3,d5}) = 0.125
Ψ({a,b},{c}) = CF(r5)*SP({d1,d4}) + CF(r6)*SP({d1,d5}) + CF(r7)*SP({d2,d4}) + CF(r8)*SP({d2,d5}) + CF(r9)*SP({d3,d4}) + CF(r10)*SP({d3,d5}) = 1.0
1.9. CONCLUSION
This chapter develops efficient algorithms for finding frequent itemsets and association rules in databases by reducing computational complexity and the number of database accesses. Two kinds of algorithms are developed: non-incremental algorithms and incremental algorithms.
For the non-incremental algorithms, a vector model representing itemsets and closures is proposed. It represents the database as a binary context held in main memory and reduces the number of candidate sets whose support must be computed, which improves the performance of the algorithms.
For the incremental algorithms, R. Godin's concept lattice construction algorithm is adapted to find frequent itemsets from the formal concepts in the concept lattice. Besides being incremental, the lattice-based algorithm has the advantage that a single pass over the database suffices to build the concept lattice.
Next come the studies that extend traditional association rules to negative association rules and fuzzy association rules.
Finally, the chapter presents studies on using association rules as data classification rules and on constructing the extended attribute dependency coefficient in rough set theory, in order to improve the ability to examine the degree of dependency between attribute sets in approximate data classification problems.