A comparative acoustic study of hanoi vietnamese and general american english monophthongs m a thesis linguistics 60 22 15


Vietnam National University, Hanoi
University of Languages and International Studies
Faculty of Post Graduate Studies

DON MINH MO

A comparative acoustic study of Hanoi
Vietnamese and General American English
monophthongs
Phân tích âm học so sánh hệ thống nguyên âm đơn Tiếng Việt Hà Nội
và tiếng Anh Mỹ Phổ thông

A thesis submitted to the Faculty of Post Graduate Studies,
University of Languages and International Studies, VNU, Hanoi
in partial fulfillment of the requirements for the degree of
Master of Arts
in
English Linguistics
Code: 60.22.15

Supervisor: Pham Xuan Tho, M.A.

HA NOI 2012


LIST OF TABLES

Table   Title                                                          Page
1       The first and second formant frequencies of all the
        subjects for each vowel ..................................... 34
2       The values of the first and the second token of each
        sound produced by each speaker .............................. 47
3       The average values of F1 and F2 for each vowel as spoken
        by ten speakers ............................................. 53

LIST OF FIGURES

(The titles below are cut off in the source; each truncation is marked [...].)

Figure  Title
1       The spectrogram of the author’s [...]
2       The effect of [ɲ] on [i] in inh [...] researcher.
3       The effect of [ɲ] on [a] in nha [...]
4       The difference between the vow[...] subject.
5       The difference between th[...] produced by another subj[...]
6       The similarities between t[...] produced by a subject.
7       The similarities between t[...] produced by another subj[...]
8       Spectrograms of [ɤ] and [ɤ̆n] [...]
9       Spectrograms of [ɤn], on the lef[...]
10      The similarities between t[...] produced by a subject. Th[...]
        left, and of [ăi] is on the r[...]
11      The duration of [a] and [ă[...] the right.
12      The correlation between the two [...] F2 by the first 4 subjects.
13      The monophthongs of ten speak[...] dialect
14      The formant chart showing the a[...] for each monophthong as
        spoke[...]
15      A formant chart showing the for[...] eight English monophthongs.
        Th[...] arranged at Bark scale intervals
16      The formant chart of Vietnames[...] female speakers
17      The formant chart of General Am[...] produced by female speakers

TABLE OF CONTENTS
STATEMENT OF AUTHORSHIP ............................................... i
ABSTRACT ............................................................. iv
ACKNOWLEDGEMENT ....................................................... v
LIST OF TABLES ....................................................... vi
LIST OF FIGURES ..................................................... vii
TABLE OF CONTENTS .................................................. viii
Chapter 1: INTRODUCTION ............................................... 1
1. Rationales ......................................................... 1
2. Scope of the research and the research questions ................... 3
Chapter 2: THE REVIEW OF LITERATURE ................................... 5
2.1 The articulatory description of Hanoi Vietnamese monophthongs ..... 5
2.2 The acoustic description attempts ................................ 10
2.3 Characterizing vowel qualities with the acoustic properties ...... 16
2.4 General American English ......................................... 24
2.4.1 The traditional description .................................... 24
2.4.2 The acoustics of GA ............................................ 27
Chapter 3: RESEARCH METHODOLOGY ...................................... 30
3.1 The subjects ..................................................... 30
3.2 The stimuli ...................................................... 30
3.3 The recording process ............................................ 31
3.4 The analysis process ............................................. 32
Chapter 4: FINDINGS AND DISCUSSION ................................... 34
4.1 The acoustics of Hanoi Vietnamese monophthongs ................... 34
4.1.1 [ɛ̆] and [ɛ] .................................................... 35
4.1.2 [ɤ] and [ɤ̆] .................................................... 40
4.1.3 [a] and [ă] .................................................... 43
4.1.4 Regression analysis ............................................ 48
4.1.5 Charting the formants of Hanoi Vietnamese monophthongs ......... 51
4.2 The monophthongs of Hanoi Vietnamese and General American
    English in comparison ............................................ 58
Chapter 5: CONCLUSION ................................................ 62
5.1 The main findings on the acoustics of Hanoi Vietnamese
    monophthongs ..................................................... 62
5.2 The monophthongs of Hanoi Vietnamese and General American
    English in comparison ............................................ 63
5.3 The limitations of the study and suggestions for further
    research ......................................................... 64
REFERENCES ........................................................... 66
Appendix 1: Consent form for participation in the research ........... 68
Appendix 2: The stimuli .............................................. 69

Chapter 1: INTRODUCTION
1. Rationales
The ultimate aim of this research is a cross-language comparison of the
acoustic properties of Hanoi Vietnamese monophthongs and General
American English monophthongs. The findings are significant from both
the linguistic and the pedagogical perspectives.
Ladefoged states firmly that “The best way of describing vowels is not
in terms of the articulations involved, but in terms of their acoustic
properties” (2003, p.104). A considerable part of this thesis is
devoted to the researcher’s analysis of the monophthongs, or pure
vowels (Wells, 1962, p.1), of the Hanoi dialect of Vietnamese. Aside
from a few studies conducted overseas, whose important limitations are
discussed in detail in the Review of Literature, there has so far been
no attempt to study the vowel acoustics of the recognized standard
variety of Vietnamese.
The literature on Vietnamese vowels has been mainly concerned with
describing the sounds from the viewpoint of articulatory phonetics. The
investigations conducted by Nguyễn (1998) and Đoàn (2000) are typical
examples. These studies examined the behavior of the vocal organs
involved in the articulation of a particular sound. This method, while
having the advantage of being straightforward, has put forward ideas
which remain only an approximation to the truth. Ladefoged and Johnson
(2011, p.197) comment,
Traditional articulatory descriptions are often not in accord with the
actual articulatory facts. For well a hundred years, phoneticians have
been describing vowels in terms such as high versus low and front
versus back. To some extent, they have been using these terms as
labels to specify acoustic dimensions rather than as descriptions of
actual tongue positions. Phoneticians are thinking in terms of acoustic

fact, and using physiological fantasy to express the idea.

Acoustics offers sufficient tools for explaining vowel quality. The
production of a speech sound involves, firstly, the vibration of the
vocal cords, which produces sound waves, and secondly, the vocal tract,
which can assume various shapes and acts as a filter on those waves.
Vowel sounds are characterized acoustically by formants, which are
frequency regions of high energy concentration corresponding to the
pass bands of the throat and mouth cavities (Wells, 1962, p.1).
Therefore, instead of studying a particular sound only from the
outside, rather subjectively, by observing with the eyes and compiling
a list of its articulatory features, there should be a rigorous method
of description in which every dimension of a sound is measured and
displayed objectively on the screen of an electronic device.
The analysis, if carried out appropriately, results in an acoustic
vowel chart that accurately represents the linguistic aspects of Hanoi
Vietnamese monophthongs and serves as a valuable source of reference
for cross-language comparison.
General American English and Hanoi Vietnamese are acknowledged as the
reference accents of English and Vietnamese respectively. From the
pedagogical point of view, therefore, the findings of the research are
of high practical value in teaching the pronunciation of one language
to learners of the other.

2. Scope of the research and the research questions
The study first examined the quality of the pure vowels in Hanoi
Vietnamese. The frequencies of the first two formants (F1, F2) of each
monophthong were measured on spectrograms generated by the speech
analysis program Praat.
The results obtained from the analysis were then compared with the
results of a recent study of the monophthongs of General American
English, conducted by Clark, M. J., Hillenbrand, J., et al. (1995).
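
Once F1 and F2 have been read off the spectrograms, one way to reduce the measurements to a single point per vowel on a formant chart is sketched below. It is only an illustration: the data values, the speaker labels, and the choice to average within each speaker before averaging across speakers are assumptions, not details taken from this study.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical measurements in Hz: (speaker, vowel, F1, F2).
# Illustrative values only, not data from the study.
measurements = [
    ("S1", "i", 300.0, 2400.0),
    ("S1", "i", 320.0, 2300.0),
    ("S2", "i", 280.0, 2500.0),
    ("S1", "a", 800.0, 1400.0),
    ("S2", "a", 900.0, 1500.0),
]

def mean_formants(rows):
    """Average F1 and F2 per vowel: first over each speaker's tokens,
    then over speakers, so that no speaker is weighted by token count."""
    tokens = defaultdict(list)
    for speaker, vowel, f1, f2 in rows:
        tokens[(vowel, speaker)].append((f1, f2))

    speaker_means = defaultdict(list)
    for (vowel, _speaker), values in tokens.items():
        speaker_means[vowel].append(
            (mean(v[0] for v in values), mean(v[1] for v in values))
        )

    return {
        vowel: (mean(p[0] for p in points), mean(p[1] for p in points))
        for vowel, points in speaker_means.items()
    }

chart = mean_formants(measurements)
for vowel, (f1, f2) in chart.items():
    print(f"[{vowel}]  F1 = {f1:.0f} Hz   F2 = {f2:.0f} Hz")
```

Each (F1, F2) pair can then be plotted as one point on the chart, and the same reduction can be applied to published values for another language before the two charts are compared.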

The research aims to answer two questions:
1) What are the acoustic properties characterizing Hanoi Vietnamese
monophthongs?
2) What are the common and distinctive features of the relative
positions of the monophthongs of Vietnamese and General American
English on the formant charts?

Chapter 2: THE REVIEW OF LITERATURE
2.1. The articulatory description of Hanoi Vietnamese monophthongs
There have been considerable attempts to describe the vowel system of
Hanoi Vietnamese, both impressionistically and acoustically. This part
of the review is concerned firstly with the set of Hanoi Vietnamese
monophthongs whose description has generated a great deal of debate
among phoneticians. I shall then examine the second set, which has been
described with fair consistency.

As mentioned above, the vowel inventory of Vietnamese includes some
monophthongs that have been described consistently in the literature;
they also have a transparent orthographic representation: i /i/, u /u/,
ô /o/, o /ɔ/, ê /e/, e /ɛ/, a /a/. However, for some other
monophthongs, orthographically realized by ư, ơ, â, and ă, there are
important conflicts in description. For example, Lindau (1978), as
cited in Matt (2009), describes ư as high back unrounded, while
Thompson (1965) insists that it is high central unrounded, and Pham
(2003) describes it simply as high central. Hwa-Froelich (2002), as
cited in Matt (2009), suggests that ư covers both /ɯ/ and /ʊ/, denoting
a high back unrounded vowel and a lower-high back rounded vowel
respectively. Lindau (1978) identifies ơ as back unrounded, /ɤ/ or /ʌ/,
while according to Thompson (1965) it should be represented by /ə/.

According to Matt, Alina, and Alison (2009), there are two reasons for
the inconsistency in the description of ư and ơ. Firstly, the acoustic
distinction between lip-rounding and tongue backness is not clear.
Traditional spectrogram analysis cannot convincingly differentiate the
two because their acoustic effects are almost similar, or even equal
(Ladefoged, 2011). The second reason is the different goals behind
phonetic and phonological descriptions of the vowels concerned.
Phonetic descriptions aim to describe the vowels’ features as realized
in speech, and are concerned with their articulatory or acoustic
features. Phonological descriptions, on the other hand, are concerned
with the vowels’ structure and function in relation to each other in a
system. Naturally, these different goals have resulted in inconsistent
descriptions.

As mentioned earlier, two other Vietnamese vowels have been described
with conflicting features. The vowels realized by â and ă are
traditionally described as “short”, low central. However, there has
been a great deal of debate about whether these vowels are the short
counterparts of ơ and a respectively, that is, of long vowels of
similar quality, or whether they are short vowels with distinct
qualities of their own. One of the ultimate goals of the current study
is to provide a systematic description of the quality of the Hanoi
Vietnamese pure vowel inventory; therefore, it shall not be concerned
with vowel duration.

Thompson (1965) is among the most frequently cited references. In his
rather comprehensive account of the Vietnamese language, a good deal of
space is devoted to the vowel system of the Hanoi dialect.

According to Thompson (1965), the dialect’s vocalic system consists of
two sub-systems: the upper vocalics, comprising six vowels and three
semivowels articulated relatively high in the mouth, and the lower
vocalics, comprising five vowels and one semivowel articulated
relatively low. The table below gives further details.

The Vocalic System, Thompson (1965, p.12)

The table makes Thompson’s (1965) analysis clearer. The upper vocalics
occupy three positions, relatively distinct from each other: front,
back unrounded, and back rounded. Each position is occupied by a high
vowel, an upper-mid vowel, and a semivowel. He emphasizes that there
are no vowels that occur at the final position. Further descriptions of
the upper vocalic vowels are provided as follows.
/i/ is proposed here as a high front or central unrounded vowel. It is
lower high central before final ch, nh, as in ích (be useful) and lính
(soldier). Before ê, p, m in the same syllable, it is an upper high
front vowel, as in biết (know), miệng (mouth), kíp (be urgent), and tìm
(search for). It is lower high front elsewhere in the same syllable.
/e/ is characterized as upper mid front or central, unrounded. It is
upper mid central before final ch, nh, and after [i] before [w, p, m,
t, n] in the same syllable, where it is “slightly lower before [w]”
(p.30). Examples include ếch (frog), bênh (defend), hiểu (understand),
and tiếp (receive). The vowel is upper mid front elsewhere.
/u/ is described as a high back rounded vowel. Thompson (1965)
emphasizes that “it tends to be upper high, but only before [m] and
[p]” (p.31), as in chụp (seize suddenly) and chum (earthenware jar),
and that it is lower high elsewhere, as in núi (mountain), mũ (hat),
and tuổi (age).

/o/ is identified as upper mid back rounded. It is higher mid before
[j, w], as in tôi (I), rồi (be already accomplished), cô (aunt), and lỗ
(hole), and is mid and strongly centralized after [u], as in buồn (be
sad), quốc (country), tuổi (age), and chuột (rat). Finally, it is upper
mid elsewhere, that is, before [p, m, t, n].
/ɛ/ is proposed to be lower mid front unrounded. There is little
variation when the sound is realized in different contexts.
/ɔ/, much like /ɛ/, maintains its quality across different contexts.
The vowel is described as lower mid back rounded.
/a/ is characterized as a lower low front unrounded vowel.
Đoàn (2000) has proposed the largest vowel inventory of Vietnamese,
with thirteen monophthongs: /i/, /e/, /ɛ/, /ɛ̆/, /ɯ/, /u/, /o/, /ɔ/,
/ɔ̆/, /ɤ/, /ɤ̆/, /a/, and /ă/. The author did not attempt to describe
these vowels in terms of how they are articulated, as articulatory
phoneticians have often done. Instead, the qualities of all the vowels
are described firstly in terms of their timbre, which is classified as
high (bổng), mid-low (trầm vừa), or low (trầm). The list below shows
how Vietnamese monophthongs are distinguished from each other in terms
of their timbre, according to the author (p.191).
- High category: /i, e, ɛ, ɛ̆/
- Mid-low category: /ɯ, ɤ, ɤ̆, a, ă/
- Low category: /u, o, ɔ, ɔ̆/

However, it is not clear from the explanation in what respect the
vowels are high, mid-low, or low. If the terms refer to pitch, there
appears to be a confusion between vowel quality and the pitch at which
a vowel is produced. Acoustic studies of vowels have demonstrated that
the pitch of a vowel, as perceived by listeners, is determined by the
fundamental frequency (F0) of the sound wave, and has practically no
effect on the vowel quality.

According to the study, there are four pairs of Vietnamese vowels that
are differentiated by duration: /ɛ̆/ and /ɛ/, /ɔ̆/ and /ɔ/, /a/ and /ă/,
/ɤ/ and /ɤ̆/. It is maintained that the vowels in each of these four
pairs have the same quality and are in long-short opposition (p.195).

2.2 The acoustic description attempts
Matt et al. (2009) explored the Vietnamese monophthongs produced by a
small group of native speakers from both northern and southern Vietnam.
The researchers also attempted to compare the native productions with
those of American adult learners. The goals of the study are
significant. The method of conducting the study, however, is
problematic. In order to eliminate the anatomical differences among
participants, the normalization method inspired by Watt and Fabricius
(1973) was employed in the study. This method has been severely
criticized by modern phoneticians.
Johnson (2005) pointed out that “Talkers may differ from each other at
the level of their articulatory habits of speech. This, in itself, would
suggest that perception may not be able to depend on vocal tract
normalization to “remove” talker differences by removing vocal tract
differences” (p.19). Johnson et al. (1993) go further:
The presence of individual differences in speech production also complicates
matters for vocal tract normalization. Though normalization research has
usually focused on male/female differences in vocal tract size and shape, vocal
tracts - even within genders - come in lots of different sizes and shapes. Talkers
apparently adopt different (possibly arbitrarily different) articulatory strategies
to produce the “same” sounds. Thus, accurate recovery of the talker’s
articulatory gestures would not completely succeed in “normalizing” speech.
(p.20)
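
To make concrete what a vocal tract normalization of this kind does, the sketch below implements a centroid-based scheme of the sort usually associated with Watt and Fabricius: each speaker's formants are divided by the centroid of a vowel triangle whose corners are that speaker's [i], that speaker's [a], and a constructed high back corner with F1 and F2 both equal to the F1 of [i]. This is an assumed, simplified formulation for illustration; the exact procedure used by Matt et al. (2009) is not described in the text above, and all formant values are hypothetical.

```python
def centroid_normalize(vowels):
    """Divide one speaker's (F1, F2) values by the centroid of a
    triangle built from [i], [a] and a constructed high back corner.

    `vowels` maps a vowel label to its mean (F1, F2) in Hz and must
    contain the keys "i" and "a".
    """
    f1_i, f2_i = vowels["i"]
    f1_a, f2_a = vowels["a"]
    # Constructed corner (assumption of this sketch): F1 = F2 = F1 of [i].
    f1_u = f2_u = f1_i

    centroid_f1 = (f1_i + f1_a + f1_u) / 3
    centroid_f2 = (f2_i + f2_a + f2_u) / 3
    return {
        vowel: (f1 / centroid_f1, f2 / centroid_f2)
        for vowel, (f1, f2) in vowels.items()
    }

# Hypothetical speaker, and a second speaker whose formants are all
# 20% higher, as a uniformly shorter vocal tract would roughly produce.
speaker_a = {"i": (300.0, 2400.0), "a": (900.0, 1500.0), "e": (450.0, 2100.0)}
speaker_b = {v: (f1 * 1.2, f2 * 1.2) for v, (f1, f2) in speaker_a.items()}

norm_a = centroid_normalize(speaker_a)
norm_b = centroid_normalize(speaker_b)
print(norm_a["e"], norm_b["e"])
```

In this toy case the two speakers become indistinguishable after normalization, because they differ only by a uniform scale factor. A speaker who differs in articulatory habit for one vowel, rather than in overall vocal tract size, would remain different after the same procedure, which is the situation Johnson describes.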

The second problem of the method lies in its scale. The study was
conducted on too small a scale to provide conclusive support for the
researchers’ claims in the discussion of the findings. According to the
researchers,
Native speaker participants included 3 Northern dialect speakers (1 female, 2
males) and 1 Southern dialect speaker (female). All were originally from
Vietnam and had been living in an English-speaking country for 6 to 26 years.
They ranged from 42 to 64, and all had experience teaching Vietnamese as a
foreign language to adults.


Firstly, the number of participants is too small to support any
statistical generalization. This can be attributed to the authors’
reliance on the normalization method adopted, as mentioned before.
Secondly, while the qualities of Vietnamese vowels are recognized as
varying substantially from dialect to dialect, there is no indication
that the subjects were screened for dialect, and very little
information is provided about the dialects of the speakers. The present
research represents the researcher’s attempt to address these
limitations (see Chapter 3 for further details).

Srihari and Nguyen (2004) is another attempt to describe Vietnamese
vowel characteristics by means of spectrogram analysis. In deciding on
the set of vowels to record, the authors follow the work of Thompson
(1965, 1987) closely, claiming that there are eleven monophthongs in
the Vietnamese vowel system (Hanoi dialect): /i, ɯ, u, e, ɤ, o, ɛ, ɔ,
ɐ, a, ɑ/.

The vocalics systems (Thompson, 1987, as cited in Srihari and Nguyen, 2004)

A comparison with the system proposed by Mai, Vu, and Hoang (2008)
reveals considerable differences. In the latter account, it is
suggested that there are 13 pure vowels in the system and, notably,
that there is no /ɑ/, the low back unrounded vowel that Srihari and
Nguyen (2004) maintain. In addition, these authors support the claim
that /ɤ, o, ɛ/ have three counterparts differing only in duration,
namely /ɤ̆/, /ɔ̆/, and /ɛ̆/. This is part of the inconsistent description
of the Vietnamese vowel inventory mentioned earlier. Even Thompson
(1987) has departed from his previous proposal in Thompson (1965) with
regard to the existence of /ɑ/. As a result, the decision to work with
a set of eleven monophthongs poses a threat to the validity of the
findings.
The aims of the study, as stated by its authors, are to provide “a
preliminary quantitative description of formant values for F1 and F2
for each vowel and plot the vowel chart of Vietnamese.” (p.2). However,
what makes the study even more problematic, again, is its scale. The
subject of the study, as described, is “a 24-year-old native male
speaker of Hanoi dialect, the standard dialect of Vietnam. The speaker
can speak English fluently but not well-trained in phonetics.” (p.2).
This problem also occurred in the previous study. There are anatomical
differences among the speakers of any language; therefore, examining a
single subject cannot provide findings that are representative of the
population. Given that the authors set out to analyze the qualitative
aspects of the vowels in question, a conclusion about the acoustics of
the vowels of a language drawn from the recording of a single speaker
is seriously questionable. Ladefoged (2003) pointed out that “The fact
that data has been measured correctly does not show that there are no
problems with the speakers. When looking at the formants of a group of
people you should check whether any one speaker is different in any way
from the others.” (p.129)

The vowels of five speakers of Banawa, Ladefoged (2003, p.129)


The ellipse in the figure encloses four stressed [e] vowels of a speaker.
As can be seen, the first formant values of his [e] are distinct from those
of the other speakers. This speaker, therefore, has produced this sound in
a way that is significantly different from the others. This deviation,
according to Ladefoged (2003), cannot be ascribed to some anatomical
factor such as a very small vocal tract size. This is because the other
vowels produced by him are similar to those made by the rest of the
speakers. The author’s suggestion is that, “if you find a speaker who
pronounces a word in a significantly different way, you should leave this
part of the data out when providing diagrams of the vowel qualities of
the language, noting, however, that there are speakers who deviate from
the general pattern.” (p.129).
The second problem with the study under review involves the set of
words containing the vowels chosen for recording.

The word list containing the vowels in question, Srihari and Nguyen (2004, p.3)

The /t-/ context is not the best choice. According to Ladefoged (2011,
p.199), a stop closure causes the vowel’s first formant (F1) to rise
from a low position. As a result, the accuracy of the formant values
calculated might be affected. A number of studies (James et al., 1995;
Broadbent & Ladefoged, 1957; Wells, 1962; Ladefoged, 2011) suggest that
a word list in the /h-d/ context provides the best spectrograms, as /h/
has almost no effect on the formants of the adjacent vowel in the same
syllable.
2.3. Characterizing vowel qualities with the acoustic properties
The current study is inspired by Ladefoged’s (2003) firm statement that
“the best way of describing vowels is not in terms of the articulations
involved, but in terms of their acoustic properties.” (p.104). In this
section we shall take a closer look at the acoustics of vowels.

The different sounds of language are physically characterized along
four dimensions: the fundamental frequency, the amplitude, the
duration, and the formant distribution of the sound wave. The four
corresponding perceptual dimensions are pitch, loudness, length, and
quality.
The current study has not investigated the amplitude and the
fundamental frequency of vowels, being primarily concerned with the
spectral distribution of the pure vowels. Vowel duration has been
measured only insofar as it distinguishes the pairs of vowels that have
been described inconsistently in articulatory phonetics.
Articulatory phonetics describes how a vowel is articulated, in terms
of the behavior of the articulators, but it has no term for the
difference between the quality, or timbre, of one vowel and that of
another. Among the dimensions of the complex sound waves produced by
the human vocal cords, we need to consider carefully the spectral
distribution of the component frequencies. A speaker can pronounce a
vowel on any pitch within the range of his voice without changing its
identity. Ladefoged (2003) provides a prime example:

I can say the vowels in heed, hid, head, had on a low pitch, when the vocal folds are
vibrating about 80 times a second, and then I can say them again with vocal folds
vibrating 160 times a second. The pitch of my voice will have changed, but the
vowels will still have the same quality. I can also say any vowel loudly or softly. The
quality, the factor that distinguishes one vowel from another, remains the same when
I shout or talk quietly. (p.31)

The differences among vowels are often compared to the differences
among musical instruments. The same note can be played on a guitar, a
violin, or a piano, because in each case the sound is produced at the
same rate of repetition of the basic component wave, i.e., the same
fundamental frequency. What is interesting is that the quality of the
sound produced by one instrument differs from that of any other. This
is due to differences in the amplitudes and frequencies of the
component waves. The quality of one vowel differs from that of another
in just the same way. Irrespective of the pitch on which a vowel is
produced, its quality stays unchanged.
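
The independence of pitch and quality can be illustrated with a little arithmetic. A voiced sound contains energy only at whole-number multiples of the fundamental frequency (the harmonics), while the formants are fixed by the shape of the vocal tract. The sketch below, which assumes round formant values of 500 Hz and 1500 Hz purely for illustration, lists which harmonic falls closest to each formant when the same vowel is spoken at 80 Hz and at 160 Hz, the two rates in Ladefoged's example.

```python
def harmonics(f0, max_hz=3000.0):
    """Frequencies of the harmonics of f0 up to max_hz."""
    return [f0 * k for k in range(1, int(max_hz // f0) + 1)]

def nearest_harmonic(f0, formant):
    """The harmonic of f0 closest to a given formant frequency."""
    return min(harmonics(f0), key=lambda h: abs(h - formant))

# Assumed formant frequencies (Hz) of one vowel; not measured values.
formants = (500, 1500)

for f0 in (80, 160):
    closest = [nearest_harmonic(f0, f) for f in formants]
    print(f"F0 = {f0} Hz: {len(harmonics(f0))} harmonics below 3000 Hz, "
          f"closest to the formants: {closest}")
```

Doubling the fundamental frequency halves the number of harmonics and changes which ones are present, but the regions in which harmonics are amplified stay near 500 Hz and 1500 Hz. It is the position of those regions, not the spacing of the harmonics, that determines the vowel quality.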
A popular way for phoneticians to describe the acoustics of human
speech sounds is by means of tube models. The current research is
primarily concerned with the monophthongs (of Vietnamese), so the
models can be conveniently summarized as follows.
The air in a bottle is set vibrating when one blows across the top of
the bottle. Naturally, the note produced depends on the size and the
shape of the bottle. The larger the volume of air inside, the lower the
note. This is because a smaller body of air vibrates more quickly than
a larger one and therefore has a higher frequency of resonance.
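
The relation between the volume of air and the note of the bottle can be made explicit with the standard formula for a Helmholtz resonator, f = (c / 2π) · √(A / (V · L)), where c is the speed of sound, A the cross-sectional area of the neck, L the length of the neck, and V the volume of the enclosed air. The sketch below applies it to an assumed bottle; the dimensions are hypothetical, and the end correction normally added to the neck length is ignored for simplicity.

```python
import math

def helmholtz_frequency(volume_m3, neck_area_m2, neck_length_m,
                        speed_of_sound=343.0):
    """Resonance frequency (Hz) of an idealized Helmholtz resonator.
    No end correction is applied to the neck length."""
    return (speed_of_sound / (2 * math.pi)) * math.sqrt(
        neck_area_m2 / (volume_m3 * neck_length_m)
    )

# Hypothetical bottle: neck area 3 cm^2, neck length 5 cm.
neck_area = 3e-4      # m^2
neck_length = 0.05    # m

f_empty = helmholtz_frequency(7.5e-4, neck_area, neck_length)   # 0.75 l of air
f_half = helmholtz_frequency(3.75e-4, neck_area, neck_length)   # half the air

print(f"0.750 l of air: {f_empty:.0f} Hz")
print(f"0.375 l of air: {f_half:.0f} Hz")
```

Halving the volume of air raises the resonance by a factor of √2, in line with the observation that the smaller body of air gives the higher note.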
When a vowel is being produced, the vocal tract acts like a bottle
whose size and shape are constantly being altered. The air in a bottle
is set in vibration by blowing across its top; the air in the vocal
tract is set in vibration by the pulses of air from the vocal folds.
What makes the tract different from the bottle is its very complex
shape, which changes constantly with the movements of the organs
involved. For convenience, phoneticians often consider the body of air
in the throat to be the first tube, and that in the mouth to be the
second. The resonances of the vocal tract are called the formants,
which correspond to the basic frequencies of the vibrations of the air
in the vocal tract. Therefore, the formants of a