Analysis Grammar of Japanese in the Mu-Project
- A Procedural Approach to Analysis Grammar -
Jun-ichi TSUJII, Jun-ichi NAKAMURA and Makoto NAGAO
Department of Electrical Engineering
Kyoto University
Kyoto, JAPAN
Abstract
Analysis grammar of Japanese in the Mu-project
is presented. It is emphasized that rules
expressing constraints on single linguistic
structures and rules for selecting the most
preferable readings are completely different in
nature, and that rules for selecting preferable
readings should be utilized in analysis grammars of
practical MT systems. It is also claimed that
procedural control is essential in integrating such
rules into a unified grammar. Some sample rules
are given to make the points of discussion clear
and concrete.
1. Introduction
The Mu-Project is a Japanese national project
supported by grants from the Special Coordination
Funds for Promoting Science & Technology of
STA (Science and Technology Agency), which aims to
develop Japanese-English and English-Japanese
machine translation systems. We currently restrict
the domain of translation to abstracts of
scientific and technological papers. The systems
are based on the transfer approach[1], and consist
of three phases: analysis, transfer and generation.
In this paper, we focus on the analysis grammar of
Japanese in the Japanese-English system. The
grammar has been developed by using GRADE, which is
a programming language specially designed for this
project[2]. The grammar now consists of about 900
GRADE rules. The experiments so far show that the
grammar works very well and is comprehensive enough
to treat various linguistic phenomena in abstracts.
In this paper we will discuss some of the basic
design principles of the grammar together with its
detailed construction. Some examples of grammar
rules and analysis results will be shown to make
the points of our discussion clear and concrete.
2. Procedural Grammar
There has been a prominent tendency in recent
computational linguistics to re-evaluate CFG and
use it directly or augment it to analyze
sentences[3,4,5]. In these systems (frameworks),
CFG rules independently describe constraints on
single linguistic structures, and a universal rule
application mechanism automatically produces a set
of possible structures which satisfy the given
constraints. It is well-known, however, that such
sets of possible structures often become
unmanageably large.
Because two separate rules such as
   NP --> NP PREP-P
   VP --> VP PREP-P
are usually prepared in CFG grammars in order to
analyze noun and verb phrases modified by
prepositional phrases, CFG grammars provide two
syntactic analyses for
   She was given flowers by her uncle.
Furthermore, the ambiguity of the sentence is
doubled by the lexical ambiguity of "by", which can
be read as either a locative or an agentive
preposition. Since the two syntactic structures
are recognized by completely independent rules and
the semantic interpretations of "by" are given by
independent processes in the later stages, it is
difficult to compare these four readings during the
analysis and to give a preference to one of them.
A rule such as
   "If a sentence is passive and there is a
   "by"-prepositional phrase, it is often the case
   that the prepositional phrase fills the deep
   agentive case. (try this analysis first)"
seems reasonable and quite useful for choosing the
most preferable interpretation, but it cannot be
expressed by refining the ordinary CFG rules. This
kind of rule is quite different in nature from a
CFG rule. It is not a rule of constraint on a
single linguistic structure (in fact, the above four
readings are all linguistically possible), but it
is a "heuristic" rule concerned with preference of
readings, which compares several alternative
analysis paths and chooses the most feasible one.
Human translators (or humans in general) have many
such preference rules based on various sorts of cues
such as morphological forms of words, collocations
of words, text styles, word semantics, etc. These
heuristic rules are quite useful not only for
increasing efficiency but also for preventing
proliferation of analysis results. As Wilks[6]
pointed out, we cannot use semantic information as
constraints on single linguistic structures, but
just as preference cues to choose the most feasible
interpretations among linguistically possible
interpretations. We claim that many sorts of
preference cues other than semantic ones exist in
real texts which cannot be captured by CFG rules.
We will show in this paper that, by utilizing
various sorts of preference cues, our analysis
grammar of Japanese can work almost
deterministically to give the most preferable
interpretation as the first output, without any
extensive semantic processing (note that even
"semantic" processing cannot disambiguate the above
sentence. The four readings are semantically
possible. It requires deep understanding of
contexts or situations, which we cannot expect in a
practical MT system).
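
The difference between a constraint rule and a
preference rule can be made concrete with a small
sketch. The Python fragment below is not GRADE and
is not taken from the Mu-Project grammar. It only
illustrates the idea, under the assumption that the
four readings of the example sentence are encoded
as (attachment, sense) pairs and that the agentive
reading is the one attached to the verb phrase.
All names in it are hypothetical.

```python
from itertools import product

# Hypothetical encoding of the four readings of
# "She was given flowers by her uncle."
ATTACHMENTS = ["NP", "VP"]            # phrase modified by the "by"-PP
BY_SENSES = ["locative", "agentive"]  # lexical readings of "by"


def rank_readings(is_passive):
    """Return all four readings, the most preferable one first."""
    readings = list(product(ATTACHMENTS, BY_SENSES))

    def penalty(reading):
        attachment, sense = reading
        # Preference cue: in a passive sentence, try the reading in
        # which the by-phrase fills the deep agentive case first.
        # (Tying the agentive sense to VP attachment is an assumption.)
        if is_passive and attachment == "VP" and sense == "agentive":
            return 0
        return 1

    # sorted() is stable: the other readings keep their original order.
    return sorted(readings, key=penalty)


ranked = rank_readings(is_passive=True)
print(ranked[0])    # ('VP', 'agentive')
print(len(ranked))  # 4
```

Note that the rule discards nothing: all four
linguistically possible readings remain available,
and only the order in which they are tried changes.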
In order to integrate heuristic rules based on
various levels of cues into a unified analysis
grammar, we have developed a programming language,
GRADE. GRADE provides us with the following
facilities.
- Explicit Control of Rule Applications :
Heuristic rules can be ordered according to their
strength (See 4-2).
- Multiple Relation Representation : Various
levels of information including morphological,
syntactic, semantic, logical etc. are expressed in
a single annotated tree and can be manipulated at
any time during the analysis. This is required not
only because many heuristic rules are based on
heterogeneous levels of cues, but also because the
analysis grammar should perform semantic/logical
interpretation of sentences at the same time and
the rules for these phases should be written in the
same framework as syntactic analysis rules (See
4-2, 4-4).
- Lexicon Driven Processing : We can write
heuristic rules specific to a single or a limited
number of words such as rules concerned with
collocations among words. These rules are strong
in the sense that they almost always succeed. They
are stored in the lexicon and invoked at
appropriate times during the analysis without
decreasing efficiency (See 4-1).
- Explicit Definition of Analysis Strategies :
The whole analysis phase can be divided into steps.
This makes the whole grammar efficient, natural and
easy to read. Furthermore, strategic consideration
plays an essential role in preventing undesirable
interpretations from being generated (See 4-3).
3. Organization of Grammar
In this section, we outline the organization
of the grammar, which is necessary for understanding
the discussion in the following sections. The main
components of the grammar are as follows.
(1) Post-Morphological Analysis
(2) Determination of Scopes
(3) Analysis of Simple Noun Phrases
(4) Analysis of Simple Sentences
(5) Analysis of Embedded Sentences (Relative
Clauses)
(6) Analysis of Relationships of Sentences
(7) Analysis of Outer Cases
(8) Contextual Processing (Processing of Omitted
case elements, Interpretation of 'HA', etc.)
(9) Reduction of Structures for Transfer Phase
Each component consists of 60 to 120
GRADE rules.
47 morpho-syntactic categories are provided
for Japanese analysis, each of which has its own
lexical description format. 12,000 lexical entries
have already been prepared according to the
formats. In this classification, Japanese nouns
are categorized into 8 sub-classes according to
their morpho-syntactic behaviour, and 53 semantic
markers are used to characterize their semantic
behaviour. Each verb has a set of case frame
descriptions (CFD) which correspond to different
usages of the verb. A CFD gives mapping rules
between surface case markers (SCM - postpositional
case particles are used as SCM's in Japanese) and
their deep case interpretations (DCI - 33 deep
cases are used). The DCI of an SCM often depends on
the verb, so the mapping rules are given in the CFD's
of individual verbs. A CFD also gives a normal
collocation between the verb and
SCM's (postpositional case particles). Detailed
lexical descriptions are given and discussed in
another paper[7].
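
As a rough illustration of what a CFD provides, the
following Python sketch represents one case frame
as a mapping from surface case markers to deep
cases. It is not the Mu-Project lexical format
(which is described in [7]). The verb entry, the
deep case names AGENT and OBJECT, and the function
name are assumptions made only for this example.

```python
# Hypothetical case frame description (CFD) for one usage of a verb.
# The deep case labels are illustrative; the paper uses 33 deep cases
# but does not list them here.
CFD_TSUKAU = {
    "verb": "TSUKAU",  # "to use"
    "mapping": {"GA": "AGENT", "O": "OBJECT"},
}


def interpret_inner_cases(cfd, surface_cases):
    """Map {surface case marker: noun} to {deep case: noun}.

    Markers the CFD does not mention are returned separately, since
    they would have to be interpreted later as outer cases.
    """
    deep, unresolved = {}, {}
    for marker, noun in surface_cases.items():
        if marker in cfd["mapping"]:
            deep[cfd["mapping"][marker]] = noun
        else:
            unresolved[marker] = noun
    return deep, unresolved


deep, rest = interpret_inner_cases(CFD_TSUKAU, {"O": "SOUCHI", "DE": "JIKKEN"})
print(deep)  # {'OBJECT': 'SOUCHI'}
print(rest)  # {'DE': 'JIKKEN'}
```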
The analysis results are dependency trees
which show the semantic relationships among input
words.
4. Typical Steps of Analysis Grammar
In the following, we will take some sample
rules to illustrate our points of discussion.
4-1 Relative Clauses
Relative clause constructions in Japanese
express several different relationships between
modifying clauses (relative clauses) and their
antecedents. Some relative clause constructions
cannot be translated as relative clauses in
English. We classified Japanese relative clauses
into the following four types, according to the
relationships between clauses and their
antecedents.
(1) Type 1 : Gaps in Cases
One of the case elements of the relative
clause is deleted and the antecedent fills the gap.
(2) Type 2 : Gaps in Case Elements
The antecedent modifies a case element in the
clause. That is, a gap exists in a noun phrase in
the clause.
(3) Type 3 : Apposition
The clause describes the content of the
antecedent, as the English "that"-clause does in
'the idea that the earth is round'.
(4) Type 4 : Partial Apposition
The antecedent and the clause are related by
certain semantic/pragmatic relationships. The
relative clause of this type doesn't have any gaps.
This type cannot be translated directly into
English relative clauses. We have to interpolate
in English appropriate phrases or clauses which are
implicit in Japanese, in order to express the
semantic/pragmatic relationships between the
antecedents and relative clauses explicitly. In
other words, gaps exist in the interpolated phrases
or clauses.
Because the above four types of relative
clauses have the same surface forms in Japanese
   ........ (verb)    (noun),
   Relative Clause    Antecedent
careful processing is required to distinguish them
(note that the "antecedents" - modified nouns - are
located after the relative clauses in Japanese). A
sophisticated analysis procedure has already been
developed, which fully utilizes various levels of
heuristic cues as follows.
(Rule 1) There are a limited number of nouns which
are often used as antecedents of Type 3 clauses.
(Rule 2) When nouns with certain semantic markers
appear in the relative clauses and those nouns are
followed by one of specific postpositional case
particles, there is a high possibility that the
relative clauses are Type 2. In the following
example, the word "SHORISOKUDO"(processing speed)
has the semantic marker AO (attribute).
[ex-1] [Type 2]
  "SHORISOKUDO"       "GA"             "HAYAI"  "KEISANKI"
  (processing speed)  (case particle:  (high)   (computer)
                       subject case)
  |____________ Relative Clause ____________|   Antecedent
  > (English Translation)
  A computer whose processing speed is high
(Rule 3) Nouns such as "MOKUTEKI"(purpose),
"GEN-IN"(reason), "SHUDAN"(method) etc. express
deep case relationships by themselves, and, when
these nouns appear as antecedents, it is often the
case that they fill the gaps of the corresponding
deep cases in the relative clauses.
[ex-2] [Type 1]
  "KONO"  "SOUCHI"  "O"              "TSUKAT"  "TA"               "MOKUTEKI"
  (this)  (device)  (case particle:  (to use)  (tense formative:  (purpose)
                     object case)               past)
  |__________________ Relative Clause __________________|         Antecedent
  > (English Translation)
  The purpose for which (someone) used this device
  The purpose of using this device
(Rule 4) There is a limited number of nouns which
are often used as antecedents in Type 4 relative
clauses. Each of such nouns requires a specific
phrase or clause to be interpolated in English.
[ex-3] [Type 4]
  "KONO"  "SOUCHI"  "O"              "TSUKAT"  "TA"               "KEKKA"
  (this)  (device)  (case particle:  (to use)  (tense formative:  (result)
                     object case)               past)
  |__________________ Relative Clause __________________|         Antecedent
  > (English Translation)
  The result which was obtained by using this device
In the above example, the clause "the result which
someone obtained (the result : gap)" is omitted in
Japanese, which relates the antecedent
"KEKKA"(result) and the relative clause "KONO
SOUCHI O TSUKAT-TA"(someone used this device).
A set of lexical rules is defined for
"KEKKA"(result), which basically works as follows :
it examines first whether the deep object case has
already been filled by a noun phrase in the
relative clause. If so, the relative clause is
taken as type 4 and an appropriate phrase is
interpolated as in [ex-3]. If not, the relative
clause is taken as type 1 as in the following
example where the noun "KEKKA" (result) fills the
gap of object case in the relative clause.
[ex-4] [Type 1]
  "KONO"  "JIKKEN"      "GA"             "TSUKAT"  "TA"               "KEKKA"
  (this)  (experiment)  (case particle:  (to use)  (tense formative:  (result)
                         subject case)              past)
  |____________________ Relative Clause ____________________|         Antecedent
  > (English Translation)
  The result which this experiment used
Such lexical rules are invoked at the beginning of
the relative clause analysis by a rule in the main
flow of processing. The noun "KEKKA" (result) is
given a mark as a lexical property which indicates
the noun has special rules to be invoked when it
appears as an antecedent of a relative clause. All
the nouns which require special treatments in the
relative clause analysis are given the same
marker.
The rule in the main flow only checks this mark and
invokes the lexical rules defined in the lexicon.
(Rule 5) Only the cases marked by postpositional
case particles 'GA', 'WO' and 'NI' can be deleted
in Type 1 relative clauses, when the antecedents
are ordinary nouns. Gaps in Type 1 relative clauses
can have other surface case marks, only when the
antecedents are special nouns such as described in
Rule (3).
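
The lexicon-driven treatment of "KEKKA" described
under Rule 4 can be sketched as follows. This is a
Python illustration, not GRADE code. The lexicon
layout, the field names and the deep case label
OBJECT are assumptions introduced for the example.
Only the decision itself (object case filled means
Type 4, otherwise Type 1) comes from the text.

```python
def kekka_rule(relative_clause):
    """Lexical rule for the antecedent "KEKKA" (result)."""
    # If the deep object case is already filled inside the clause,
    # KEKKA cannot fill it: Type 4, a phrase must be interpolated.
    if "OBJECT" in relative_clause["filled_deep_cases"]:
        return "Type 4"
    # Otherwise KEKKA fills the gap of the object case: Type 1.
    return "Type 1"


# Hypothetical lexicon: nouns needing special treatment carry a mark
# and the rule to be invoked when they appear as antecedents.
LEXICON = {
    "KEKKA": {"special_antecedent": True, "rule": kekka_rule},
    "KEISANKI": {"special_antecedent": False},
}


def analyze_relative_clause(antecedent, relative_clause):
    """Main-flow rule: only checks the mark and invokes the lexical rule."""
    entry = LEXICON.get(antecedent, {})
    if entry.get("special_antecedent"):
        return entry["rule"](relative_clause)
    return None  # general rules would apply here (not modelled)


# [ex-3]: KONO SOUCHI O TSUKAT-TA KEKKA (object case filled by SOUCHI)
print(analyze_relative_clause("KEKKA", {"filled_deep_cases": {"OBJECT"}}))
# [ex-4]: KONO JIKKEN GA TSUKAT-TA KEKKA (object case not filled)
print(analyze_relative_clause("KEKKA", {"filled_deep_cases": {"AGENT"}}))
```

Keeping the word-specific rule in the lexicon means
that the main flow pays only the cost of one mark
check for ordinary nouns.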
4-2 Conjuncted Noun Phrases
Conjuncted noun phrases often appear in
abstracts of scientific and technological papers.
It is important to analyze them correctly,
especially to determine scopes of conjunctions
correctly, because they often lead to proliferation
of analysis results. The particle 'TO' plays
almost the same role as the English "and" to
conjunct noun phrases. There are several heuristic
rules based on various levels of information to
determine the scopes.
<Scope Decision Rules of Conjuncted Noun Phrases
by Particle 'TO'>
(Rule 1) Since the particle 'TO' is also used as a
case particle, if it appears in the position:
   Noun 'TO' verb Noun,
   Noun 'TO' adjective Noun,
there are two possible interpretations, one in
which 'TO' is a case particle and 'Noun TO
adjective(verb)' forms a relative clause that
modifies the second noun, and the other one in
which 'TO' is a conjunctive particle to form a
conjuncted noun phrase. However, it is very likely
that the particle 'TO' is not a conjunctive
particle but a postpositional case particle, if
the adjective (verb) is one of adjectives (verbs)
which require case elements with surface case mark
'TO' and there are no extra words between 'TO' and
the adjective (verb). In the following example,
"KOTONARU(to be different)" is an adjective which
is often collocated with a noun phrase followed by
case particle 'TO'.
[ex-5]
  YOSOKU-CHI         'TO'  KOTONARU           ATAI
  (predicted value)        (to be different)  (value)
[dominant interpretation]
  Relative-Clause[YOSOKU-CHI 'TO' KOTONARU] Antecedent[ATAI]
  = the value which is different from the
    predicted value
[less dominant interpretation]
  Conjuncted-NP[ NP[YOSOKU-CHI] 'TO' NP[KOTONARU ATAI] ]
  = the predicted value and the different value
(Rule 2) If two 'TO' particles appear in the
position:
   Noun-1 'TO' ... Noun-2 'TO' 'NO' Noun-3
the right boundary of the scope of the conjunction
is almost always Noun-2. The second 'TO' plays a
role of a delimiter which delimits the right
boundary of the conjunction. This 'TO' is
optional, but in real texts one often places it to
make the scope unambiguous, especially when the
second conjunct is a long noun phrase and the scope
is highly ambiguous without it. Because the second
'TO' can be interpreted as a case particle (not as
a delimiter of the conjunction) and 'NO' following
a case particle turns the preceding phrase to a
modifier of a noun, an interpretation in which
'Noun-2 TO NO' is taken as a modifier of Noun-3 and
Noun-3 is taken as the head noun of the second
conjunct is also linguistically possible. However,
in most cases, when two 'TO' particles appear in
the above position, the second 'TO' is just a
delimiter of the scope (see [ex-6]).
[ex-6]
  YOSOKU-CHI   TO  JIKKEN        DE         NO  JISSOKU-CHI     TO NO  SA
  (predicted       (experiment)  (case          (actual value)         (difference)
   value)                         particle:
                                  place)
[dominant interpretation]
  NP[ Conjuncted-NP[ NP[YOSOKU-CHI] TO NP[JIKKEN DE NO JISSOKU-CHI] TO ] NO SA ]
  = the difference between the predicted value
    and the actual value in the experiment
[less dominant interpretations]
(A)
  Conjuncted-NP[ NP[YOSOKU-CHI] TO NP[JIKKEN] ] DE NO JISSOKU-CHI TO NO SA
  = the difference with the actual value in the
    predicted value and the experiment
(B)
  Conjuncted-NP[ NP[YOSOKU-CHI] TO NP[JIKKEN DE NO JISSOKU-CHI TO NO SA] ]
  = the predicted value and the difference with
    the actual value in the experiment
(Rule 3) If a special noun which is often
collocated with conjunctive noun phrases appears in
the position:
   Noun-1 'TO' ... Noun-2 'NO' <special-noun>,
the right boundary of the conjunction is almost
always Noun-2. Such special nouns are marked in
the lexicon. In the following example, "KANKEI" is
such a special noun.
[ex-7]
  JISSOKU-CHI     'TO'  RIRON-DE  E-TA         YOSOKU-CHI         NO  KANKEI
  (actual value)        (theory)  (to obtain)  (predicted value)      (relationship)
                                                                      special noun
[dominant interpretation]
  Conjuncted-NP[ NP[JISSOKU-CHI] 'TO'
                 NP[ Relative-Clause[RIRON-DE E-TA] Antecedent[YOSOKU-CHI] ] ]
  NO KANKEI
  = the relationship between the actual value
    and the predicted value obtained by the
    theory
[less dominant interpretations]
(A)
  NP[ Relative-Clause[ Conjuncted-NP[ NP[JISSOKU-CHI] 'TO' NP[RIRON] ] DE E-TA ]
      Antecedent[YOSOKU-CHI] ]
  NO KANKEI
  = the relationship of the predicted value which
    was obtained by the actual value and the theory
(B)
  Conjuncted-NP[ NP[JISSOKU-CHI] 'TO'
                 NP[ RIRON-DE E-TA YOSOKU-CHI NO KANKEI ] ]
  = the actual value and the relationship of
    the predicted value which was obtained by
    the theory
(Rule 4) In
   Noun-1 'TO' ... Noun-2,
if Noun-1 and Noun-2 are the same nouns, the right
boundary of the conjunction is almost always
Noun-2.
(Rule 5) In
   Noun-1 'TO' ... Noun-2,
if Noun-1 and Noun-2 are not exactly the same but
nouns with the same morphemes, the right boundary
is often Noun-2. In [ex-7] above, both of the head
nouns of the conjuncts, JISSOKU-CHI(actual value)
and YOSOKU-CHI(predicted value), have the same
morpheme "CHI" (which means "value"). Thus, this
rule can correctly determine the scope, even if the
special word "KANKEI"(relationship) does not exist.
(Rule 6) If some special words (like 'SONO',
'SORE-NO' etc. which roughly correspond to 'the',
'its' in English) appear in the position:
   [Phrases which modify noun phrases] Noun-1 'TO' <special word> Noun-2,
the modifiers preceding Noun-1 modify only Noun-1
but not the whole conjuncted noun phrase.
(Rule 7) In
   Noun-1 'TO' ... Noun-2,
if Noun-1 and Noun-2 belong to the same specific
semantic categories, like action nouns, abstract
nouns etc., the right boundary is often Noun-2.
(Rule 8) In most conjuncted noun phrases, the
structures of conjuncts are well-balanced.
Therefore, if a relative clause precedes the first
conjunct and the length of the second conjunct (the
number of words between 'TO' and Noun-2) is short
like
   [Relative Clause] Noun-1 'TO' ... Noun-2
the relative clause modifies both conjuncts, that
is, the antecedent of the relative clause is the
whole conjuncted phrase.
These heuristic rules are based on different
levels of information (some are based on surface
lexical items, some on morphemes of words, some on
semantic information) and may lead to different
decisions about scopes. However, we can
distinguish strong heuristic rules (i.e. rules
which almost always give correct scopes when they
are applied) from others. In fact, there exists
some ordering of heuristic rules according to their
strength. Rules (1), (2), (3), (4) and (6), for
example, almost always succeed, and rules like (7)
and (8) often lead to wrong decisions. Rules like
(7) and (8) should be treated as default rules
which are applied only when the other stronger
rules cannot decide the scopes. We can define in
GRADE an arbitrary ordering of rule applications.
This capability of controlling the sequences of
rule applications is essential in integrating
heuristic rules based on heterogeneous levels of
information into a unified set of rules.
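
The ordering of rules by strength can be
illustrated with a minimal Python sketch. It does
not reproduce GRADE's control constructs. It only
shows strong rules being tried before default rules,
with a default rule consulted only when no strong
rule decides. The phrase representation, the
semantic category label and the simplified versions
of Rule (4) and Rule (7) are assumptions made for
the example.

```python
def rule4_same_noun(phrase):
    """Strong rule: a later noun identical to Noun-1 closes the scope."""
    for cand in phrase["candidates"]:
        if cand["noun"] == phrase["noun1"]["noun"]:
            return cand["noun"]
    return None


def rule7_same_category(phrase):
    """Default rule: the first noun in Noun-1's semantic category."""
    for cand in phrase["candidates"]:
        if cand["category"] == phrase["noun1"]["category"]:
            return cand["noun"]
    return None


STRONG_RULES = [rule4_same_noun]       # almost always succeed
DEFAULT_RULES = [rule7_same_category]  # applied only as a last resort


def decide_right_boundary(phrase):
    """Apply rules in order of strength and stop at the first decision."""
    for rule in STRONG_RULES + DEFAULT_RULES:
        boundary = rule(phrase)
        if boundary is not None:
            return boundary
    return None


# Hypothetical input: candidate nouns following Noun-1 'TO', left to right.
phrase = {
    "noun1": {"noun": "ATAI", "category": "abstract"},
    "candidates": [
        {"noun": "SA", "category": "abstract"},
        {"noun": "ATAI", "category": "abstract"},
    ],
}
print(decide_right_boundary(phrase))  # ATAI
```

Here the default rule alone would have chosen SA,
but the stronger rule decides first, so the default
rule is never consulted.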
Note that most of these rules cannot be
naturally expressed by ordinary CFG rules. Rule
(2), for example, is a rule which blocks the
application of the ordinary CFG rule such as
   NP --> NP <case-particle> NO N
when the <case-particle> is 'TO' and a conjunctive
particle 'TO' precedes this sequence of words.
4-3 Determination of Scopes
Scopes of conjuncted noun phrases often
overlap with scopes of relative clauses, which
makes the problem of scope determination more
complicated. For the surface sequence of phrases
like
   NP-1 'TO' NP-2 <case-particle> <verb> NP-3
there are two possible relationships between the
scopes of the conjuncted noun phrase and the
relative clause, like
(1) NP[ Relative-Clause[ Conjuncted-NP[NP-1 'TO' NP-2] <case-particle> <verb> ]
        Antecedent[NP-3] ]
(2) Conjuncted-NP[ NP-1 'TO'
        NP[ Relative-Clause[NP-2 <case-particle> <verb>] Antecedent[NP-3] ] ]
This ambiguity together with genuine ambiguities in
scopes of conjuncted noun phrases in 4-2 produces
combinatorial interpretations in CFG grammars, most
of which are linguistically possible but
practically unthinkable. It is not only
inefficient but also almost impossible to compare
such an enormous number of linguistically possible
structures after they have been generated. In our
analysis grammar, a set of scope decision rules are
applied in the early stages of processing in order
to block the generation of combinatorial
interpretations. In fact, the structure (2) in
which a relative clause exists within the scope of
a conjuncted noun phrase is relatively rare in real
texts, especially when the relative clause is
rather long. Such constructions with long relative
clauses are a kind of garden path sentence.
Therefore, unless strong heuristic rules like (2),
(3) and (4) in 4-2 suggest the structure (2), the
structure (1) is adopted as the first choice (Note
that, in [ex-7] in 4-2, the strong heuristic
rule [rule (3)] suggests the structure (2)). Since
the result of such a decision is explicitly
expressed in the tree:
   SCOPE-OF-CONJUNCTION
and the grammar rules in the later stages of
processing work on this structure, the other
interpretations of scopes will not be tried unless
the first choice fails at a later stage for some
reason or alternative interpretations are
explicitly requested by a human operator. Note
that a structure
like
   NP-1 'TO' ... <verb> NP-2 <verb> NP-3
   (two relative clause / antecedent pairs, one nested in the
    other, inside a conjuncted noun phrase)
which is linguistically possible but extremely rare
in real texts, is naturally blocked.
4-4 Sentence Relationships and Outer Case Analysis
Corresponding to English sub-ordinators and
co-ordinators like 'although', 'in order to', 'and'
etc., we have several different syntactic
constructions as follows.
(1) [ ........ (Verb with the specific inflection form) ] [ ........ ]
                       S1                                     S2
(2) [ ........ (Verb)(a postpositional particle) ] [ ........ ]
                       S1                              S2
(3) [ ........ (Verb)(a conjunctive noun) ] [ ........ ]
                       S1                       S2
(1) roughly corresponds to English co-ordinate
constructions, and (2) and (3) to English
sub-ordinate constructions. However, the
correspondence between the forms of Japanese and
English sentence connections is not so
straightforward. Some postpositional particles in
(2), for example, are used to express several
different semantic relationships between sentences,
and therefore, should be translated into different
sub-ordinators in English according to the semantic
relationships. The postpositional particle 'TAME'
expresses either 'purpose-action' relationships or
'cause-effect' relationships. In order to
disambiguate the semantic relationships expressed
by 'TAME', a set of lexical rules is defined in the
dictionary of 'TAME'. The rules are roughly as
follows.
(1) If S1 expresses a completed action or a
stative assertion, the relationship is
'cause-effect'.
(2) If S1 expresses neither a completed
event nor a stative assertion and S2 expresses a
controllable action, the relationship is 'purpose-
action'.
[ex-8]
(A) S1: TOKYO-NI  IT-      TEITA
        (Tokyo)   (to go)  (aspect formative)
    TAME
    S2: KAIGI-NI   SHUSSEKI     DEKINAKA-  TA
        (meeting)  (to attend)  (cannot)   (tense formative: past)
    S1: completed action
        (the aspect formative "TEITA" means
        completion of an action)
    > [cause-effect]
    = Because I was in Tokyo, I couldn't
      attend the meeting.
(B) S1: TOKYO-NI  IKU
        (Tokyo)   (to go)
    TAME
    S2: KAIGI-NI   SHUSSEKI     DEKINAI
        (meeting)  (to attend)  (cannot)
    S1: neither a completed action nor
        a stative assertion
    S2: "whether I can attend the meeting
        or not" is not controllable.
    > [cause-effect]
    = Because I go to Tokyo, I cannot attend
      the meeting.
(C) S1: TOKYO-NI  IKU
        (Tokyo)   (to go)
    TAME
    S2: KIPPU-O   KAT-      TA
        (ticket)  (to buy)  (tense formative: past)
    S1: neither a completed action nor
        a stative assertion
    S2: volitional action
    > [purpose-action]
    = In order to go to Tokyo, I bought a
      ticket.
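
The two lexical rules for 'TAME' and the three
readings in [ex-8] can be summarized in a short
Python sketch. It is an illustration only, not the
rule set stored in the Mu-Project dictionary. The
boolean features are assumed to have been computed
by earlier phases, and the final fallback to
'cause-effect' is an assumption that merely
reproduces the outcome of example (B), which
neither stated rule covers explicitly.

```python
def tame_relationship(s1_completed, s1_stative, s2_controllable):
    """Choose the semantic relationship expressed by 'TAME'."""
    # Rule (1): S1 is a completed action or a stative assertion.
    if s1_completed or s1_stative:
        return "cause-effect"
    # Rule (2): S1 is neither, and S2 is a controllable action.
    if s2_controllable:
        return "purpose-action"
    # Assumed fallback, matching the reading given for example (B).
    return "cause-effect"


# (A) TOKYO-NI IT-TEITA TAME ...: S1 is a completed action
print(tame_relationship(True, False, False))   # cause-effect
# (B) TOKYO-NI IKU TAME KAIGI-NI SHUSSEKI DEKINAI: S2 not controllable
print(tame_relationship(False, False, False))  # cause-effect
# (C) TOKYO-NI IKU TAME KIPPU-O KAT-TA: S2 is a volitional action
print(tame_relationship(False, False, True))   # purpose-action
```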
Note that whether S1 expresses a completed
action or not is determined in the preceding phases
by using rules which utilize aspectual features of
verbs described in the dictionary and aspect
formatives following the verbs (The classification
of Japanese verbs based on their aspectual features
and related topics are discussed in [8]). We have
already written rules (some of which are heuristic
ones) for 57 postpositional particles for
conjunctions of sentences like 'TAME'.
Postpositional particles for cases, which
follow noun phrases and express case relationships,
are also very ambiguous in the sense that they
express several different deep cases. While the
interpretations of inner case elements are directly
given in the verb dictionary in the form of mappings
between surface case particles and their deep case
interpretations, the outer case elements should be
semantically interpreted by referring to semantic
categories of noun phrases and properties of verbs.
Lexical rules for 62 case particles have also been
implemented and tested.
5. Conclusions
Analysis Grammar of Japanese in the Mu-project
is discussed in this paper. By integrating various
levels of heuristic information, the grammar can
work very efficiently to produce the most natural
and preferable reading as the first output result,
without any extensive semantic processing.
The concept of procedural grammars was
originally proposed by Winograd[9] and
independently pursued by other research groups[10].
However, their claims have not been well
appreciated by other researchers (or even by
themselves). One often argues against procedural
grammars, saying that the linguistic facts
Winograd's grammar captures can also be expressed
by ATN, and the expressive power of ATN is
equivalent to that of the augmented CFG.
Therefore, procedural grammars have no advantages
over the augmented CFG. They just make the whole
grammars complicated and hard to maintain.
The above argument, however, misses an
important point and confuses procedural grammar
with the representation of grammars in the form of
programs (as shown in Winograd[9]). We showed in
this paper that the rules which give structural
constraints on final analysis results and the rules
which choose the most preferable linguistic
structures (or the rules which block "garden path"
structures) are different in nature. In order to
integrate the latter type of rules in a unified
analysis grammar, it is essential to control the
sequence of rule applications explicitly and
introduce strategic knowledge into grammar
organizations. Furthermore, introduction of
control specifications doesn't necessarily lead to
the grammar in the form of programs. Our grammar
writing system GRADE allows us a rule based
specification of grammar, and the grammar developed
by using GRADE is easy to maintain.
We also discuss the usefulness of lexicon
driven processing in treating idiosyncratic
phenomena in natural languages. Lexicon driven
processing is extremely useful in the transfer phase
of machine translation systems, because the
transfer of lexical items (selection of appropriate
target lexical items) is highly dependent on each
lexical item[11].
The current version of our analysis grammar works
quite well on 1,000 sample sentences in real
abstracts without any pre-editing.
Acknowledgements
Appreciations go to the members of the
Mu-Project, especially to the members of the
Japanese analysis group [Mr. E.Sumita (Japan IBM),
Mr. M.Kato (Sord Co.), Mr. S.Taniguchi (Kyosera
Co.), Mr. A.Kosaka (NEC Co.), Mr. H.Sakamoto (Oki
Electric Co.), Miss H.Kume (JCS), Mr. N.Ishikawa
(Kyoto Univ.)] who are engaged in implementing the
comprehensive Japanese analysis grammar, and also
to Dr. B.Vauquois, Dr. C.Boitet (Grenoble Univ.,
France) and Dr. P.Sabatier (CNRS, France) for
their fruitful discussions and comments.
References
[1] B.Vauquois: La Traduction Automatique à
Grenoble, Documents de Linguistique Quantitative,
No. 24, Paris, Dunod, 1975
[2] J.Nakamura et.al.: Grammar Writing System
(GRADE) of Mu-Machine Translation Project and its
Characteristics, Proc. of COLING 84, 1984
[3] J.Slocum: A Status Report on the LRC Machine
Translation System, Working Paper LRC-82-3,
Linguistic Research Center, Univ. of Texas, 1982
[4] F.Pereira et.al.: Definite Clause Grammars of
Natural Language Analysis, Artificial Intelligence,
Vol. 13, 1980
[5] G.Gazdar: Phrase Structure Grammars and Natural
Languages, Proc. of 8th IJCAI, 1983
[6] Y.Wilks: Preference Semantics, in The Formal
Semantics of Natural Language (ed: E.L.Keenan),
Cambridge University Press, 1975
[7] Y.Sakamoto et.al.: Lexicon Features for
Japanese Syntactic Analysis in Mu-Project-JE, Proc.
of COLING 84, 1984
[8] J.Tsujii: The Transfer Phase in an
English-Japanese Translation System, Proc. of
COLING 82, 1982
[9] T.Winograd: Understanding Natural Language,
Academic Press, 1975
[10] C.Boitet et.al.: Recent Developments in
Russian-French Machine Translation at Grenoble,
Linguistics, Vol. 19, 1981
[11] M.Nagao et.al.: Dealing with Incompleteness
of Linguistic Knowledge on Language Translation,
Proc. of COLING 84, 1984