Proceedings of ICT.rda'06. Hanoi, May 20-21, 2006

NHẬN DẠNG TIẾNG NÓI TIẾNG VIỆT SỬ DỤNG MỨC DƯỚI TỪ VÀ THỬ NGHIỆM SO SÁNH MỘT SỐ PHƯƠNG PHÁP NHẬN DẠNG TIẾNG VIỆT

Vietnamese Speech Recognition using Subword Models and Test
Experiments for Comparing Some Methods of Vietnamese Recognition
Nguyen Phu Binh, Trinh Van Loan

Abstract
This paper presents our results in applying hidden Markov model (HMM) theory to build a Vietnamese speech recognition system capable of modelling any pronunciation unit. Using this system, we ran experiments comparing several recognition methods that differ in the choice of pronunciation unit, and on that basis we propose an effective method that uses semi-syllable models. The approach was tested on a speech database of 10,575 utterances of 888 toneless Vietnamese single words spoken by one male voice, and the recognition accuracy reached 96.89%.
Keywords: speech recognition, hidden Markov model, Vietnamese, subword, semi-syllable
1. INTRODUCTION
To date, the most successful speech recognition systems have mainly followed the pattern recognition approach, and the most widely used pattern recognition technique is the hidden Markov model (HMM). Around the world, many large-vocabulary speech recognition systems based on HMMs achieve very high accuracy. In Vietnam, several HMM-based speech recognition programs have also produced fairly promising results. However, most of those programs still use HMMs only at the word level ([3], [7], [9]), so their vocabulary remains limited and they are hard to apply to continuously spoken speech. Some recognition systems do use HMMs at the level of acoustic units smaller than the word, such as phonemes or initial consonant + rhyme ([5], [6], [8]), but the experiments with those systems still [...]. In addition, most [...] foreign open-source toolkits such as SPHINX, CSLU and HTK, so the results have stopped at the research level and are hard to apply in practice because of this technological dependence.

Based on these observations, this paper presents our studies and experiments on choosing the acoustic unit for the hidden Markov models so that, with a modest number of models, the system can still recognise a relatively large vocabulary with acceptable results.
2. CHOOSING THE ACOUSTIC UNIT FOR MODEL TRAINING
2.1. Common recognition units
2.1.1. Word and syllable models
Choosing the word as the basic recognition unit covers the allophonic variation of phonemes: the same phoneme, when it belongs to different words, can be captured by different acoustic models. Vietnamese is a monosyllabic language. In Vietnamese the smallest natural acoustic unit is the syllable, not the word, so the syllable is the real target of Vietnamese speech recognition systems. Syllable-based acoustic models are highly robust because they cover both the coarticulation of the phonemes that make up a syllable and the variation of a phoneme across different syllables. The syllable can therefore serve as the basic recognition unit for Vietnamese speech recognition applications with small and medium vocabularies. However, each training sample of a syllable can only be used to train that syllable and cannot be shared with other syllables, and Vietnamese has more than 10,000 syllables. Choosing syllable models for large-vocabulary Vietnamese recognition is therefore clearly not easy, because collecting enough speech samples for training is extremely difficult.

2.1.2. Phoneme models
To let different words share training samples, the phoneme is commonly chosen as the basic recognition unit in speech recognition systems. Vietnamese has about 40 phonemes (consonants, not counting the zero initial; single and double vowels; and 2 semivowels). Even combined with the tones, there are at most 40 x 6 = 240 tone-bearing phonemes. A large-vocabulary Vietnamese speech recognition system can therefore be built with the phoneme as its basic recognition unit.

However, the phoneme model has a rather serious drawback: it treats a phoneme as being produced in the same way in any word and at any position. Under this assumption, phoneme-based Vietnamese recognition systems cannot fully exploit the distinguishing characteristics of the phonemes. For example, in Vietnamese the consonants k, m, n, ng, p and t can be either initial or final sounds, and the acoustic-phonetic characteristics of one of these consonants as an initial differ from those of the same consonant as a final. Likewise, the vowels u and o can be either the medial or the main sound. As the main sound (for example in bun, to), u and o are pronounced strongly and carry the dominant timbre of the syllable, whereas as the medial (for example in qua, xoai) they do not carry the dominant timbre.

Because of this drawback, phoneme models are rarely used in Vietnamese speech recognition systems.

2.1.3. Diphone and triphone models
Single-phoneme (monophone) models are seldom used because of the drawbacks above. For languages such as English and French, recognition systems therefore usually use diphone (biphone) or triphone models, which overcome some of the weaknesses of the phoneme model.

The use of diphones in speech recognition rests on the observation that some [...] and the transitions between them are almost fixed. A diphone is thus modelled by its two constituent phonemes together with part of the context between them. If we instead represent the relationship between a phoneme and the two phonemes before and after it, we obtain the triphone model. With about 50 phonemes, there are nearly 1,500 diphones, and about 7,300 triphones can be modelled.

Because diphone and triphone models take into account the change in acoustic-phonetic properties between adjacent phonemes, they can be applied effectively to large-vocabulary recognition. However, they are usually used only in recognition systems for polysyllabic languages without tones (such as English and French). For these languages, diphone and triphone models represent the linking and transitions between the phonemes within a word fairly well. Moreover, since these languages have no tones, the number of diphones and triphones in use is not too large, and in practice enough audio samples can be collected for training.

Vietnamese, by contrast, is a tonal language, so with diphone or triphone models the actual number of models would increase considerably (about sixfold). On the other hand, because Vietnamese is monosyllabic, other ways of modelling the recognition unit can take full advantage of this property. One of them is presented in Section 2.2.

2.1.4. Initial + rhyme models
As noted, Vietnamese is a monosyllabic language. A Vietnamese syllable is pronounced in a single breath, yet it has a composite structure: its parts can be detached from one syllable and swapped with the corresponding parts of another. A Vietnamese syllable has three components: the initial part, the final part and the tone. The initial part is the onset, a position filled by exactly one phoneme. The final part is called the rhyme. The sounds at the start, middle and end of the rhyme are called the medial, the main sound and the final sound.

Given this characteristic structure, some studies have proposed initial + rhyme as the basic recognition unit [8]. In terms of robustness, the initial + rhyme model is more robust than the phoneme model, because the rhyme covers the allophonic variation of the medial, main and final sounds, but less robust than the syllable model, which covers the variation of the initial, medial, main and final sounds together. In terms of trainability, the initial + rhyme model is worse than the phoneme model because it needs more training samples, but better than the syllable model because it needs fewer.

The initial + rhyme model has 22 initials (not counting the zero initial) and about 150 rhymes (according to [6]). Combined with the tones, the total number of initial + rhyme units must be less than (22 + 150) x 6 = 1032, because many rhymes and tones cannot combine. This number shows that enough training samples can be collected for recognition systems that use the initial + rhyme model. In our assessment, however, the initial + rhyme model is still not the best choice, because it does not capture a close link between the initial and the rhyme. The experiments in a later section show that it performs worse than the semi-syllable model described next.

2.2. Proposed basic recognition unit: the semi-syllable
If a Vietnamese syllable is divided into two parts with the main vowel of the syllable as the boundary, we obtain two semi-syllables. Note that the "semi" syllables here are not two separate halves of a syllable. They are the beginning and ending parts of a syllable, and they share part of the main vowel. The two semi-syllable models therefore have a transition and a link between them, much like two diphone models. For example, the syllable anh consists of the two semi-syllables _a and anh (the underscore stands for silence: _a is a beginning semi-syllable, such as the _a in anh, while a_ is an ending semi-syllable, such as the a_ in cha).

Because both semi-syllables share the main vowel of the word, the semi-syllable model represents the linking and transitions between the phonemes within a word fairly well. The total number of semi-syllables needed to compose all Vietnamese words in this way is 389, of which 61 already carry a tone (for example ác, ép, óc, ...) [4]. Combined with the tones, the total number of semi-syllables needed to model all Vietnamese syllables is therefore at most (389 - 61) x 6 + 61 = 2029. With this number, enough training samples can certainly be collected for the models.
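The unit counts quoted in Sections 2.1.2, 2.1.4 and 2.2 all come from the same rule: multiply the toneless inventory by the six tones, counting units that already carry a fixed tone only once. The short Python sketch below reproduces that arithmetic. It uses only the figures given in the text; the function names are illustrative and are not part of the authors' software.

```python
# Upper bounds on the number of tone-bearing units, using the inventory
# sizes quoted in Section 2. Function names are illustrative only.
TONES = 6  # number of Vietnamese tones assumed by the text


def phoneme_bound(phonemes=40):
    """Every phoneme combined with every tone."""
    return phonemes * TONES


def initial_rhyme_bound(initials=22, rhymes=150):
    """Bound used in Section 2.1.4: (initials + rhymes) x tones."""
    return (initials + rhymes) * TONES


def semi_syllable_bound(total=389, already_toned=61):
    """Units with a fixed tone are counted once; the rest take any tone."""
    return (total - already_toned) * TONES + already_toned


if __name__ == "__main__":
    print(phoneme_bound(), initial_rhyme_bound(), semi_syllable_bound())
```

The three values match the 240, 1032 and 2029 given in the text. They are upper bounds, since not every unit combines with every tone.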
3. THE RECOGNITION SYSTEM
3.1. Main components
To pursue the research goals set out above, we built a Vietnamese speech recognition system designed to cope with a large vocabulary. The system consists of several modules, each performing a specific function. Dividing the system into modules makes it easy to integrate them into other systems for different purposes. All modules were developed with Microsoft Visual C++ 6.0 using only the Windows API, so they run very fast, and their interface is plain but easy to use.

The main modules of the system are:
• VSRCutter: Cuts an audio file containing several utterances of one word into separate files and names them according to a fixed rule, for use in model training and system testing.
• VSRAutoSplit: Automatically finds all audio files in a directory and performs the function of VSRCutter on them.
• VSRTraining: Trains the hidden Markov models of the system for any acoustic unit (word or subword) and evaluates the quality of the models after training.
• VSRTiny: A compact program that recognises spoken Vietnamese utterances in real time.

3.2. Speech database
The speech database of the system contains utterances of about 900 toneless Vietnamese single words by one male voice. It also includes a collection of audio files containing utterances of the digits 0 to 9 by many different speakers, used to test and evaluate the models.

These audio files were recorded in a normal working environment with a hand-held dynamic microphone, sampled at [...] Hz, mono, 16 bit, and stored as uncompressed WAVE PCM.

The speech database is built by the two programs VSRCutter and VSRAutoSplit. Their input is the path to the WAVE files containing the utterances to be processed, the speaker's name code, and the label of the utterance for each audio segment. The label of a Vietnamese word is encoded in TELEX form; for example, the word "không" has the label "khoong". With this information, VSRCutter and VSRAutoSplit use a silence detection algorithm [3] to extract the audio segments that contain speech and write each one to a WAVE file named according to the rule <label>_<speaker name>_<sequence number>.wav. This naming rule lets the recognition module compute the recognition accuracy for each word and for each voice.

3.3. Model training
Once the speech database is available, it is used to train the hidden Markov models for recognition. The program VSRTraining was designed for this task.
VSRTraining uses the embedded training algorithm [2] to train the models. Its output is therefore not limited to whole-word models. It can also produce models of acoustic units smaller than the word (subword models), such as phonemes or initial + rhyme units.

Whole-word training uses the utterances of one word at a time to train the model of that word. VSRTraining instead uses all the audio files set aside for the training phase (containing utterances of many different words) to update all models in the system simultaneously. This is done in a loop. The algorithm is described in detail in [2] and can be summarised as follows.

First, VSRTraining loads all audio files into memory. It then extracts the speech features of each file and processes the training files one after another. For each file, VSRTraining uses the transcription of the utterance to build a composite HMM that spans the whole utterance. The composite model is built by concatenating the subword models corresponding to each label in the transcription. The Forward-Backward algorithm is then applied, and the resulting weighted averages are accumulated in the usual way. When all training files have been processed, new parameter estimates are formed from the accumulated weighted sums, giving the updated set of HMMs.
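The key step in the summary above is that one utterance updates every subword model named in its transcription. The Python sketch below illustrates only that bookkeeping: it concatenates subword models into a composite left-to-right chain and credits state occupancies back to the subword unit that owns each composite state. It is a simplified illustration, not the VSRTraining code. The number of states per unit, the self-loop probability and the function names are assumptions, and the Forward-Backward pass that would produce the occupancies is omitted.

```python
import numpy as np


def build_composite(transcription, n_states, self_loop=0.6):
    """Concatenate subword models into one left-to-right chain.

    transcription: list of subword labels, e.g. ["_a", "anh"]
    n_states: dict mapping label -> number of emitting states (assumed)
    Returns (owners, trans): owners[i] is the (label, local_state) that
    composite state i belongs to; trans has one extra column for the exit.
    """
    owners = [(unit, s) for unit in transcription for s in range(n_states[unit])]
    n = len(owners)
    trans = np.zeros((n, n + 1))
    for i in range(n):
        trans[i, i] = self_loop            # stay in the same state
        trans[i, i + 1] = 1.0 - self_loop  # move on to the next state
    return owners, trans


def accumulate(acc, owners, occupancy):
    """Add per-state occupancy of one utterance to shared accumulators.

    occupancy: array of shape (frames, composite_states) holding state
    posteriors, as a Forward-Backward pass would produce them.
    """
    for i, owner in enumerate(owners):
        acc[owner] = acc.get(owner, 0.0) + float(occupancy[:, i].sum())
    return acc
```

Because the accumulators are keyed by subword label rather than by word, two different words that share a semi-syllable both contribute statistics to the same model, which is what allows training data to be shared across words.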
The input of VSRTraining is a text file of many lines, each containing the following information:

<[...] of a word> <label of the word> <[...]>

where [...] holds the names of the subword models corresponding to the word. The choice of acoustic unit to model is therefore left to the user (it can be the whole word, phonemes, initial + rhyme, and so on).
Besides training the models, VSRTraining can also automatically recognise all audio files in a directory and tabulate the recognition results in order to assess the quality of the models. This function uses the recognition module described in the next section.
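Since every audio file is named <label>_<speaker name>_<sequence number>.wav, the reference label and the speaker can be read back from the file name when recognition results are tabulated. The Python sketch below shows one way to do this. It is an illustration under that naming rule only; the function names and the example file names are hypothetical and do not come from the authors' software.

```python
from collections import defaultdict
from pathlib import Path


def parse_name(filename):
    """Split '<label>_<speaker>_<index>.wav' into its three fields."""
    # rsplit from the right so the speaker and index are always the
    # last two fields, even if a label were to contain an underscore.
    label, speaker, index = Path(filename).stem.rsplit("_", 2)
    return label, speaker, int(index)


def accuracy_by_speaker(results):
    """results: iterable of (filename, recognised_label) pairs.

    Returns a dict mapping each speaker to the percentage of that
    speaker's files whose recognised label equals the label in the name.
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for filename, recognised in results:
        label, speaker, _ = parse_name(filename)
        totals[speaker] += 1
        hits[speaker] += int(recognised == label)
    return {s: 100.0 * hits[s] / totals[s] for s in totals}
```

Grouping by `label` instead of `speaker` in the same loop would give the per-word accuracy mentioned in Section 3.2.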
3.4. Real-time speech recognition
The output of embedded training is a set of subword models. Combined with a dictionary file containing the transcriptions of the words to be recognised, these models form a network of connected hidden Markov models. The task is to search this network for a sequence of states, passing from one subword model to the next, that best matches the observation sequence of the input speech signal. Decoding this path yields a sequence of transcriptions; in other words, the continuously spoken speech segment has been recognised.

Search algorithms for this problem are usually based on the Viterbi algorithm. However, if the vocabulary to be recognised is fairly large, the number of model nodes in the network is also large. Pruning techniques must therefore be applied to the search tree to speed up recognition.

The program decodes the observation sequence with Viterbi beam search, which is described in detail in [1]. By using a heuristic to discard branches of the search tree, the running time of the Viterbi search is improved considerably. Recognising one Vietnamese word takes only about 30 ms on average (on a 2 GHz Pentium 4), so the recognition procedure can run in real time.
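To make the pruning idea concrete, the Python sketch below runs Viterbi decoding over a single small HMM in the log domain and, at each frame, drops every hypothesis whose score falls more than a fixed beam width below the best one. It is a generic textbook-style illustration, not the implementation from [1] or from the authors' program. The beam width, the data layout and the function name are assumptions, and a real decoder would search a network of subword models rather than one model.

```python
import math


def viterbi_beam(log_init, log_trans, log_emit, beam=10.0):
    """Viterbi decoding with beam pruning over one HMM.

    log_init[s]      log-probability of starting in state s
    log_trans[s][t]  log-probability of moving from state s to state t
    log_emit[f][s]   log-likelihood of frame f in state s
    Returns (best_log_score, best_state_path).
    """
    n = len(log_init)
    # Active hypotheses: state -> (score, path that reached it).
    active = {s: (log_init[s] + log_emit[0][s], [s])
              for s in range(n) if log_init[s] > -math.inf}
    for frame in log_emit[1:]:
        best = max(score for score, _ in active.values())
        # Beam pruning: keep only hypotheses close to the current best.
        active = {s: v for s, v in active.items() if v[0] >= best - beam}
        nxt = {}
        for s, (score, path) in active.items():
            for t in range(n):
                if log_trans[s][t] == -math.inf:
                    continue  # transition not allowed
                cand = score + log_trans[s][t] + frame[t]
                if t not in nxt or cand > nxt[t][0]:
                    nxt[t] = (cand, path + [t])
        active = nxt
    return max(active.values(), key=lambda v: v[0])
```

A narrow beam makes the search faster but can discard the path that would have won in the end, so the beam width trades speed against accuracy.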


The program VSRTiny implements this. Its processing flow is shown in Figure 1 and can be summarised as follows:
• The speech signal is recorded directly from the microphone into an input buffer at a sampling rate of 11025 Hz, mono, 16 bit. The signal in this buffer is divided into frames of 256 samples (about 23 ms), with adjacent frames 128 samples apart.
• The program then uses the energy of each frame and a previously estimated energy threshold for the background noise to remove the silent parts. Next, for each frame, 12 MFCC values plus the frame energy, together with their first- and second-order time derivatives (39 coefficients in all), are computed as the feature parameters of each speech segment.
• With these input features, the program runs Viterbi beam search using a set of previously trained subword HMMs, a bigram language model, and a list of transcriptions of the words that can be recognised.
• The best path found by the search is the recognition result.

Figure 1. Real-time speech recognition

The diagram includes a block of language model information. The program currently supports a bigram language model and allows continuous speech recognition. However, because the probability table of the language model has not yet been optimised, [...] the program has so far been tested only on isolated-word Vietnamese speech recognition.

4. EXPERIMENTAL RESULTS
4.1. Comparing the initial + rhyme model with the semi-syllable model
• Speaker: one male voice, Ha [...], [...] years old.
• Training data: 888 toneless Vietnamese words ([...] of [4]); the model of each word was trained with 10 utterances of that word (stored in 10 audio files).
• Test data: 10575 audio files containing utterances of the 888 words.
• Model parameters: each subword model in these experiments has 5 states (including the dummy states used to connect the models to one another) and 3 Gaussian mixture components.
• Results: each type of model was trained with 5 iterations; the recognition results on the test set are shown in Table 1.

Table 1: Comparison of the two models

Model type        Subword models   Training time (s)   Recognition time (s)   Accuracy (%)
Initial + rhyme   129              2108                767                    [...]
Semi-syllable     293              1917                529                    9[...]

The times in this experiment were measured on a laptop with a Pentium 4 at 2.0 GHz, 512 MB SDRAM, running Microsoft Windows [...] Professional. The time for automatic recognition includes the time to find and read the audio files as well as the recognition processing time.

This experiment shows that, for toneless Vietnamese words, the program gives better recognition results with the semi-syllable model than with the initial + rhyme model. The reason is that in the semi-syllable model both semi-syllables share part of the main vowel of the word, so the model represents the linking and transitions between the acoustic units within a word fairly well. The initial + rhyme model does not capture this link clearly.

4.2. Effect of the number of training iterations
With the same test set as in the previous experiment, varying the number of iterations of the training loop for the semi-syllable model gives the results in Table 2.

With a vocabulary of 888 toneless Vietnamese single words, the program reached a highest accuracy of 96.89% in this experiment. As the number of training iterations increases, the recognition accuracy of the model also increases, but beyond a certain point it saturates.

Table 2: Effect of the number of iterations

Iterations   Accuracy (%)   Training time (s)   Recognition time (s)
1            40.42          137                 2099
2            86.60          445                 816
3            92.18          444                 563
4            93.31          443                 538
5            93.74          448                 529
6            94.03          579                 535
7            94.17          722                 510
8            94.18          778                 495
9            94.31          784                 486
10           94.40          784                 463
11           94.56          760                 445
12           94.91          761                 437
13           95.31          762                 413
14           95.79          761                 384
15           96.35          762                 362
16           96.49          766                 348
17           96.67          774                 335
18           96.77          774                 323
19           96.78          774                 316
20           96.89          775                 312
21           96.89          763                 315
22           96.86          786                 338
23           96.85          787                 338
24           96.85          790                 331
25           96.79          788                 326

4.3. Recognising words that were not trained

Although the training speech data covers only 888 toneless Vietnamese single words, the program is not limited to recognising those words. Because the recognition unit is smaller than the word (specifically, the semi-syllable), the program can also recognise a large number of other toneless single words formed by joining pairs of semi-syllables.

The program VSRTiny makes this possible. Its input is a file VSR.HMM containing the semi-syllable models and a text file VSR.DIC containing the label and transcription (which two semi-syllables make up the word with that label) of every word the program can recognise.

With the input data of the experiment above, VSR.DIC contains the definitions of 1351 single words, of which 888 are in the training list and the rest are formed by joining pairs of semi-syllables. These new words were recorded and recognised directly in real time. Initial observations show that the correct recognition rate is lower than for the trained words but still acceptable.
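The way new dictionary entries arise from pairs of semi-syllables can be sketched in a few lines of Python. The labelling follows the examples in Section 2.2 (anh gives _a and anh; cha gives cha and a_), with "_" standing for silence. Everything else is an assumption made for illustration: the syllable is given already split into initial, main vowel and final, medial sounds are ignored, and the real layout of VSR.DIC is not reproduced.

```python
from itertools import product


def split_syllable(initial, nucleus, final):
    """Return the (leading, trailing) semi-syllable labels of a syllable.

    Both labels contain the main vowel; '_' marks an empty initial or final.
    """
    leading = (initial or "_") + nucleus
    trailing = nucleus + (final or "_")
    return leading, trailing


def composable_words(leading_units, trailing_units):
    """Pair every leading unit with every trailing unit sharing its vowel.

    leading_units:  iterable of (initial, nucleus) pairs
    trailing_units: iterable of (nucleus, final) pairs
    Returns a dict mapping each word label to its two semi-syllable labels.
    """
    words = {}
    for (initial, v1), (v2, final) in product(leading_units, trailing_units):
        if v1 == v2:  # the two halves must share the same main vowel
            words[initial + v1 + final] = split_syllable(initial, v1, final)
    return words
```

With the leading units of anh and cha and the trailing units of the same two words, this yields four entries (a, anh, cha, chanh), two of which never appeared as whole words. The same effect, on a larger inventory, is how 888 trained words can yield a 1351-word dictionary.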

4.4. Experiments with a small vocabulary
In this experiment, the recognition vocabulary consists only of the numbers 0 to 10.
• Training data: see Table 3.
• Recognition database 1: 3411 audio files containing utterances of the numbers 0 to 10 by the 7 voices in the training set (the audio files used for training are not reused in the recognition step).
• Recognition database 2: 1466 audio files containing utterances by 6 voices that did not take part in training (Table 4).
• The results for the syllable-level and semi-syllable-level models are shown in Table 5.

Table 3: Training data

No.   Speaker   Gender   Age   Home province   Samples per model
1     NPB       Male     25    [...]           30
2     NNV       Male     25    [...] Binh      30
3     TCT       Male     27    [...]           30
4     VNA       Male     27    [...] Tay       30
5     NHQ       Male     27    [...]           30
6     DTTT      [...]    24    [...] Ninh      30
7     DTNH      Female   25    [...]           30

Table 4: Recognition database 2

No.   Speaker   Gender   Age   Home province
1     NBA       Male     34    Thai Binh
2     NTH       Female   31    Thai Binh
3     PVH       Male     29    Thai Nguyen
4     NTN       Female   25    Thai Nguyen
5     NLT       Male     23    Hai Duong
6     LTV       Male     23    Hai Duong

Table 5: Recognition results on the two databases

Model type            Recognition database    Accuracy (%)
Syllable model        1 (trained voices)      [...]
Syllable model        2 (untrained voices)    96.18
Semi-syllable model   1 (trained voices)      99.44
Semi-syllable model   2 (untrained voices)    95.09

Although the recognition results are fairly high in all experiments, the syllable-level model (which in this case is the word level) gives higher accuracy than the model at a level smaller than the word (here, the semi-syllable model). The reason is that the syllable-level model is highly robust: it covers both the coarticulation of the phonemes that make up a syllable and the variation of a phoneme across different syllables. For recognition applications with a small vocabulary, the syllable-level model should therefore be used.

5. CONCLUSIONS AND FUTURE WORK
The experimental results above show that a speech recognition system based on subword hidden Markov models, specifically semi-syllable models, gives good results in both cases of isolated-word Vietnamese speech recognition: speaker-dependent recognition with a large vocabulary and speaker-independent recognition with a small vocabulary.

Recognition errors still occur in some cases. They are due partly to the limited amount of training data we were able to collect and partly to program parameters that may not have been chosen optimally. The accuracy of the system can therefore be improved by collecting much more training data and by tuning the parameter values.

Although the vocabulary that the current system can recognise is already fairly large, it consists only of toneless Vietnamese single words. In future we will continue to collect audio samples with tones in order to build the number of models needed to enlarge the vocabulary, with the aim of recognising all Vietnamese single words. This is feasible because, as shown in Section 2.2, a little over 2000 semi-syllables are enough to model all Vietnamese syllables.

Another direction is to combine the system's existing recognition of toneless single words with a tone recognition module, so as to build a system that can recognise all Vietnamese single words. The most obvious advantage of this direction is that it does not need many models. It will, however, require further study of the relationship between tone-bearing single words and their toneless counterparts.

References

[1] C. Becchetti and L.P. Ricotti: "Speech Recognition: Theory and C++ Implementation", John Wiley & Sons Ltd, 1999, pp. 309-320.
[2] S. Young et al.: "The HTK Book (for HTK version 3.2.1)", Microsoft Corporation, 2002, pp. 129-133.
[3] Nguyen Phu Binh, Trinh Van Loan, Eric Castelli: "A real-time processing system for recognising isolated Vietnamese words" (in Vietnamese), Proceedings of ICT.rda'03, Hanoi, 2003, pp. 310-316.
[4] Nguyen Phu Binh: "Vietnamese speech recognition using the subword level" (in Vietnamese), Master's thesis in Information Technology, Hanoi University of Technology, 11/2004.
[5] Dang Ngoc Duc, Luong Chi Mai: "Improving the accuracy of a neural network system for Vietnamese recognition" (in Vietnamese), Tap chi Buu chinh Vien thong, No. 11, 3/2004, pp. 75-81.
[6] Ngo Hoang Huy, Luong Chi Mai, Bui Quang Trung, Nguyen Thi Thanh Mai, Vu Kim Bang, Vu Thi Hai Ha: "Designing real-time Vietnamese recognition systems" (in Vietnamese), Proceedings of FAIR, 2003, pp. 349-357.
[7] Report of the sub-project "Speech-based human-machine interaction", national-level project KC.01.09: "Research, development and application of human-machine interaction techniques and [...]" (in Vietnamese), Faculty of Information Technology, Hanoi University of Technology, 2004.
[8] Nguyen Thanh Phuc: "A method for Vietnamese speech recognition: applying a combination of neural networks and hidden Markov models to Vietnamese speech recognition systems" (in Vietnamese), Doctoral dissertation in engineering, Hanoi University of Technology, 2000.
[9] Nguyen Hong Quang, Trinh Van Loan: "Recognition of continuously spoken Vietnamese speech" (in Vietnamese), Proceedings of ICT.rda'04, Hanoi, 2004.

About the authors:

Nguyen Phu Binh, M.Sc., graduated in Information Technology in 2002 and defended his Master's thesis in Information Processing and Communication in 2004 at Hanoi University of Technology. Since 2002 he has worked at the Faculty of Information Technology, Hanoi University of Technology. His main research interests are speech processing, e-Learning, and techniques for extracting information from the Web.

Dr. Trinh Van Loan received his doctorate in [...] systems from [...] (France). Since [...] he has been with Hanoi University of Technology, where he is currently head of the [...] department, Faculty of Information Technology. His current research interests are speech processing and signal processing.