
Augmenting WordNet for Deep
Understanding of Text
Peter Clark, Phil Harrison,
Bill Murray, John Thompson (Boeing)
Christiane Fellbaum (Princeton Univ)
Jerry Hobbs (ISI/USC)


“Deep Understanding”
• Not (just) parsing + word senses
• Construction of a coherent representation of the scene
the text describes
• Challenge: much of that representation is not in the text

“A soldier was killed in a gun battle”

“The soldier died”
“The soldier was shot”
“There was a fight”



“Deep Understanding”
How do we get this
knowledge into the
machine?
How do we exploit it?

“A soldier was killed in a gun battle”

Because
A battle involves a fight.
Soldiers use guns.
Guns shoot.
Guns can kill.
If you are killed, you are
dead.
….
“The soldier died”
“The soldier was shot”
“There was a fight”



“Deep Understanding”
Several partially useful
resources exist.
WordNet is already used
a lot…can we extend it?

“A soldier was killed in a gun battle”

Because
A battle involves a fight.
Soldiers use guns.

Guns shoot.
Guns can kill.
If you are killed, you are
dead.
….
“The soldier died”
“The soldier was shot”
“There was a fight”



The Initial Vision
• Our vision:
– Rapidly expand WordNet to be more of a knowledge-base
– Question-answering software to demonstrate its use


The Evolution of WordNet
lexical resource

• v1.0 (1986)
– synsets (concepts) + hypernym (isa) links

• v1.7 (2001)
– add in additional relationships
• has-part
• causes
• member-of
• entails-doing (“subevent”)


• v2.0 (2003)
– introduce the instance/class distinction
• Paris isa Capital-City is-type-of City
– add in some derivational links
• explode related-to explosion

knowledge base?

• …
• v10.0 (200?)
– ?????
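
A minimal sketch of browsing the relation types above programmatically, using NLTK's WordNet interface (NLTK is an assumption here; the slides do not name a toolkit, and the example synsets are just illustrative):

from nltk.corpus import wordnet as wn   # needs: pip install nltk; nltk.download('wordnet')

car = wn.synset('car.n.01')
print(car.hypernyms())                        # isa (hypernym) links, present since v1.0
print(car.part_meronyms())                    # has-part links added in v1.7
print(wn.synset('kill.v.01').causes())        # causes links (v1.7)
print(wn.synset('snore.v.01').entailments())  # entails-doing ("subevent") links (v1.7)

paris = wn.synset('paris.n.01')
print(paris.instance_hypernyms())             # v2.0: Paris is an instance, not a subclass

explode = wn.lemma('explode.v.01.explode')
print(explode.derivationally_related_forms()) # v2.0 derivational links, e.g. "explosion"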


Augmenting WordNet
• World Knowledge
– Sense-disambiguate the glosses (by hand)
– Convert the glosses to logic
• Similar to LCC’s Extended WordNet attempt
– Axiomatize “core theories”

• WordNet links
– Morphosemantic links
– Purpose links

• Experiments


Converting the Glosses to Logic

“ambition#n2: A strong drive for success”

Convert gloss to form “word is gloss”
Parse (Charniak)
LFToolkit: Generate logical form fragments
Lexical output rules applied to “A strong drive for success” produce logical form fragments:
strong(x1) & drive(x2) & for(x3,x4) & success(x5)


Converting the Glosses to Logic
“ambition#n2: A strong drive for success”

Convert gloss to form “word is gloss”
Parse (Charniak)
LFToolkit: Generate logical form fragments
Identify equalities, add senses


Converting the Glosses to Logic
Lexical output rules applied to “A strong drive for success” produce logical form fragments:
strong(x1) & drive(x2) & for(x3,x4) & success(x5)

Composition rules identify variables: x1=x2, x2=x3, x4=x5

Identify equalities, add senses


Converting the Glosses to Logic
“ambition#n2: A strong drive for success”

Convert gloss to form “word is gloss”
Parse (Charniak)
LFToolkit: Generate logical form fragments
Identify equalities, add senses
ambition#n2(x1) → a(x1) & strong#a1(x1) & drive#n2(x1) &
for(x1,x2) & success#a3(x2)
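
A minimal sketch of the composition step (illustrative only, not the actual LFToolkit; all names below are made up): apply the identified variable equalities to the lexical fragments to obtain the axiom body.

# Toy version of "composition rules identify variables":
fragments = [('strong', ['x1']), ('drive', ['x2']),
             ('for', ['x3', 'x4']), ('success', ['x5'])]
equalities = [('x1', 'x2'), ('x2', 'x3'), ('x4', 'x5')]

parent = {}                      # simple union-find over variable names
def find(v):
    parent.setdefault(v, v)
    while parent[v] != v:
        v = parent[v]
    return v
def union(a, b):
    parent[find(b)] = find(a)

for a, b in equalities:
    union(a, b)

body = ' & '.join('{}({})'.format(pred, ','.join(find(v) for v in args))
                  for pred, args in fragments)
print('ambition#n2(x1) -> ' + body)
# prints: ambition#n2(x1) -> strong(x1) & drive(x1) & for(x1,x4) & success(x4)
# (alpha-equivalent to the axiom above, which renames x4 to x2 and adds sense tags)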


Converting the Glosses to Logic



Sometimes works well!
But often not. Primary problems:
1. Errors in the language processing
2. Only capture definitional knowledge
3. “flowery” language, many gaps, metonymy, ambiguity;
If logic closely follows syntax → “logico-babble”


“hammer#n2: tool used to deliver an impulsive force by striking”
hammer#n2(x1) →
tool#n1(x1) & use#v1(e1,x2,x1) & to(e1,e2) & deliver#v2(e2,x3) &
driving#a1(x3) & force#n1(x3) & by(e3,e4) & strike#v3(e4,x4).

→ Hammers hit things??


Augmenting WordNet
• World Knowledge
– Sense-disambiguate the glosses (by hand)
– Convert the glosses to logic
– Axiomatize “core theories”

• WordNet links
– Morphosemantic links
– Purpose links

• Experiments


Core Theories
• Many domain-specific facts are instantiations of more
general, “core” knowledge
• By encoding this core knowledge, get leverage
• e.g., 517 “vehicle” noun senses, 185 “cover” verb senses

• Approach:
– Analysis and grouping of words in Core WordNet

– Identification and encoding of underlying theories


Core Theories
Composite Entities: perfect, empty, relative, secondary, similar, odd, ...
Scales:  step, degree, level, intensify, high, major, considerable, ...
Events: constraint, secure, generate, fix, power, development, ...
Space:  grade, inside, lot, top, list, direction, turn, enlarge, long, ...
Time: year, day, summer, recent, old, early, present, then, often, ...
Cognition:  imagination, horror, rely, remind, matter, estimate, idea, ...
Communication: journal, poetry, announcement, gesture, charter, ...
Persons and their Activities: leisure, childhood, glance, cousin, jump, ...
Microsocial:  virtue, separate, friendly, married, company, name, ...
Material World: smoke, shell, stick, carbon, blue, burn, dry, tough, ... 
Geo:  storm, moon, pole, world, peak, site, village, sea, island, ...
Artifacts: bell, button, van, shelf, machine, film, floor, glass, chair, ...
Food:  cheese, potato, milk, break, cake, meat, beer, bake, spoil, ... 
Macrosocial: architecture, airport, headquarters, prosecution, ...
Economic: import, money, policy, poverty, profit, venture, owe, ...


Augmenting WordNet
• World Knowledge
– Sense-disambiguate the glosses (by hand)
– Convert the glosses to logic
– Axiomatize “core theories”

• WordNet links
– Morphosemantic links
– Purpose links


• Experiments


Morphosemantic Links
• Often need to cross part-of-speech
T: A council worker cleans up after Tuesday's violence in Budapest.
H: There were attacks in Budapest on Tuesday.

• Can solve with WN’s derivation links:
(“attack”) attack_v3 ↔ aggression_n4 (← “violence”),
via the derivation link “aggress”/“aggression”
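
A small sketch of crossing part-of-speech with derivation links, using NLTK's WordNet interface purely for illustration (the system described in these slides is not NLTK-based; the helper name is made up):

from nltk.corpus import wordnet as wn

def derivational_nouns(verb):
    """Noun synsets reachable from any sense of the verb via derivation links."""
    nouns = set()
    for synset in wn.synsets(verb, pos=wn.VERB):
        for lemma in synset.lemmas():
            for related in lemma.derivationally_related_forms():
                if related.synset().pos() == 'n':
                    nouns.add(related.synset().name())
    return nouns

# For "attack", the "aggress"/"aggression" lemma pair in attack_v3 is the link
# that lets H's "attacks" reach T's "violence" (aggression_n4) above.
print(derivational_nouns('attack'))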


Morphosemantic Links
• But can go wrong!
T: Paying was slow
H1: The transaction was slow
H2: *The person was slow [NOT entailed]
(“pay”) pay_v1 ↔ payment_n1 (→ “transaction”), via “pay”/“payment”
(“pay”) pay_v1 ↔ payer_n1 (→ “person”), via “pay”/“payer”

Problem: The type of relation matters for derivatives! (Event? Agent? ...)
A pays B → The payment (event-noun) by A
A is the payer (agent-noun) of B


Morphosemantic Links
• Task: Classify the 22,000 links in WordNet:
Verb Synset     Noun Synset      Relationship
hammer_v1       hammer_n1        instrument
execute_v1      execution_n1     event (equal)
sign_v2         signatory_n1     agent

• Semi-automatic process
– Exploit taxonomy and morphology

• 15 semantic types used

– agent, undergoer, instrument, result, material, destination,
location, by-means-of, event, uses, state, property,
body-part, vehicle.
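
A minimal sketch of how the typed links are used (hand-written toy records, not the released database of 22,000 links; all names below are illustrative): only links of the right type license a given rewrite.

MORPHOSEMANTIC_LINKS = [
    ('hammer_v1',  'hammer_n1',    'instrument'),
    ('execute_v1', 'execution_n1', 'event'),
    ('sign_v2',    'signatory_n1', 'agent'),
    ('pay_v1',     'payment_n1',   'event'),
    ('pay_v1',     'payer_n1',     'agent'),
]

def nouns_for(verb_synset, relation):
    """Only 'event' links allow rewriting "A pays B" as "the payment by A";
    'agent' links (payer) do not, which is the failure case shown earlier."""
    return [noun for v, noun, rel in MORPHOSEMANTIC_LINKS
            if v == verb_synset and rel == relation]

print(nouns_for('pay_v1', 'event'))   # ['payment_n1'], but not 'payer_n1'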


Experimentation


Task: Recognizing Entailment
• Experiment with WordNet, logical glosses, DIRT
• Text interpretation to logic using Boeing’s NLP system
“A soldier was killed in a gun battle”

Initial Logic:
“soldier”(soldier01),
“kill”(…..
object(kill01,soldier01),
“in”(kill01,battle01),
modifier(battle01,gun01).

Final Logic:
isa(soldier01,soldier_n1),
isa(……
object(kill01,soldier01)
during(kill01,battle01)
instrument(battle01,gun01)


• Entailment: T → H if:
– T is subsumed by H (“cat eats mouse” → “animal was eaten”)
– An elaboration of T using inference rules is subsumed by H
• (“cat eats mouse” → “cat swallows mouse”)

• No statistical similarity metrics
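
A minimal sketch of the subsumption test described above (toy taxonomy and hand-built literals, not the Boeing interpreter's actual representation; all names are made up):

ISA = {'cat': 'animal', 'mouse': 'animal', 'eat': 'consume'}   # child -> parent

def subsumes(general, specific):
    """True if `specific` equals `general` or lies below it in the taxonomy."""
    while specific is not None:
        if specific == general:
            return True
        specific = ISA.get(specific)
    return False

def entailed(t_literals, h_literals):
    """T -> H if every H literal is matched, term by term, by a T literal whose
    terms are equal or more specific (no statistical similarity metrics)."""
    return all(any(len(t) == len(h) and
                   all(subsumes(ht, tt) for ht, tt in zip(h, t))
                   for t in t_literals)
               for h in h_literals)

T = [('eat', 'cat', 'mouse')]        # "cat eats mouse"
H = [('eat', 'animal', 'mouse')]     # "an animal ate a mouse"
print(entailed(T, H))                # True: 'cat' is subsumed by 'animal'

# The second clause above (elaborate T with inference rules, e.g. gloss-derived
# axioms, then re-test subsumption) would expand t_literals before the check.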


Successful Examples with the Glosses
• Good example
14.H4
T: Britain puts curbs on immigrant labor from Bulgaria and Romania.
H: Britain restricted workers from Bulgaria.


Successful Examples with the Glosses
• Good example
14.H4
T: Britain puts curbs on immigrant labor from Bulgaria and Romania.
H: Britain restricted workers from Bulgaria.
WN: limit_v1 (“restrict”): place limits on.
T: Britain puts curbs on immigrant labor from Bulgaria and Romania.
H: Britain placed limits on workers from Bulgaria.
→ ENTAILED (correct)
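
A small sketch of the gloss-substitution step, with NLTK used only to fetch glosses and the sense picked by hand (matching the hand-disambiguated glosses used in this work; the inflected substitution string is supplied manually):

from nltk.corpus import wordnet as wn

# Inspect the verb senses of "restrict" to locate the one glossed above
# ("place limits on"); sense selection here is manual, not automatic.
for s in wn.synsets('restrict', pos=wn.VERB):
    print(s.name(), '-', s.definition())

# Substitute the (hand-inflected) gloss for the word in H:
hypothesis = 'Britain restricted workers from Bulgaria.'
print(hypothesis.replace('restricted', 'placed limits on'))
# -> "Britain placed limits on workers from Bulgaria."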


Successful Examples with the Glosses
• Another (somewhat) good example
56.H3

T: The administration managed to track down the perpetrators.
H: The perpetrators were being chased by the administration.


Successful Examples with the Glosses
• Another (somewhat) good example
56.H3
T: The administration managed to track down the perpetrators.
H: The perpetrators were being chased by the administration.
WN: hunt_v1 “hunt” “track down”: pursue for food or sport
T: The administration managed to pursue the perpetrators [for food
or sport!].
H: The perpetrators were being chased by the administration.
→ ENTAILED (correct)

