Augmenting WordNet for Deep Understanding of Text
Peter Clark, Phil Harrison, Bill Murray, John Thompson (Boeing)
Christiane Fellbaum (Princeton Univ)
Jerry Hobbs (ISI/USC)
“Deep Understanding”
• Not (just) parsing + word senses
• Construction of a coherent representation of the scene the text describes
• Challenge: much of that representation is not in the text
“A soldier was killed in a gun battle”
→ “The soldier died”
→ “The soldier was shot”
→ “There was a fight”
→ …
“Deep Understanding”
How do we get this knowledge into the machine?
How do we exploit it?
“A soldier was killed in a gun battle”
Because:
A battle involves a fight.
Soldiers use guns.
Guns shoot.
Guns can kill.
If you are killed, you are dead.
….
→ “The soldier died”
→ “The soldier was shot”
→ “There was a fight”
→ …
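One way to picture how such background knowledge could be exploited is simple forward chaining over hand-written rules. The sketch below is illustrative only: the predicate names, the rule set, and the `forward_chain` helper are assumptions made for this example, not the system described in these slides.

```python
# Hypothetical rules: (set of premises, conclusion). The predicate strings are
# made up for illustration and loosely paraphrase the "Because" facts above.
RULES = [
    ({"gun_battle(b)"}, "battle(b)"),                  # a gun battle is a battle
    ({"battle(b)"}, "fight(b)"),                       # a battle involves a fight
    ({"killed(soldier)"}, "dead(soldier)"),            # if you are killed, you are dead
    # Guns shoot and guns can kill: a plausible (defeasible) conclusion.
    ({"killed(soldier)", "gun_battle(b)"}, "shot(soldier)"),
]


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new fact can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


# Facts stated explicitly by "A soldier was killed in a gun battle".
TEXT_FACTS = {"killed(soldier)", "gun_battle(b)"}
closure = forward_chain(TEXT_FACTS, RULES)
print(sorted(closure - TEXT_FACTS))
```

The derived facts correspond to the three implied sentences; everything hinges on where the rules come from, which is the question the rest of the deck addresses.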
“Deep Understanding”
Several partially useful resources exist.
WordNet is already used a lot… can we extend it?
The Initial Vision
• Our vision:
– Rapidly expand WordNet to be more of a knowledge base
– Build question-answering software to demonstrate its use
The Evolution of WordNet
lexical resource → knowledge base?
• v1.0 (1986)
– synsets (concepts) + hypernym (isa) links
• v1.7 (2001)
– add in additional relationships
• has-part
• causes
• member-of
• entails-doing (“subevent”)
• v2.0 (2003)
– introduce the instance/class distinction
• Paris isa Capital-City is-type-of City
– add in some derivational links
• explode related-to explosion
• …
• v10.0 (200?)
– ?????
Augmenting WordNet
• World Knowledge
– Sense-disambiguate the glosses (by hand)
– Convert the glosses to logic
• Similar to LCC’s Extended WordNet attempt
– Axiomatize “core theories”
• WordNet links
– Morphosemantic links
– Purpose links
• Experiments
Converting the Glosses to Logic
“ambition#n2: A strong drive for success”
Convert gloss to form “word is gloss”
Parse (Charniak)
LFToolkit: Generate logical form fragments
Lexical output rules produce logical form fragments:
“strong drive for success” → strong(x1) & drive(x2) & for(x3,x4) & success(x5)
Identify equalities, add senses
Converting the Glosses to Logic
Lexical output rules produce logical form fragments:
“A strong drive for success” → strong(x1) & drive(x2) & for(x3,x4) & success(x5)
Composition rules identify variables:
x1=x2, x2=x3, x4=x5
ambition#n2(x1) → a(x1) & strong#a1(x1) & drive#n2(x1) &
for(x1,x2) & success#a3(x2)
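The "identify equalities" step can be sketched as merging variables with a small union-find and then renumbering them. This is a minimal illustration on the slide's own fragment string; `merge_variables` is a hypothetical helper, not part of LFToolkit, and it leaves out sense tagging and the determiner predicate a(x1).

```python
import re


def merge_variables(fragments, equalities):
    """Give equated variables one shared name, numbered by first appearance."""
    parent = {}

    def find(v):
        # Follow parent pointers to the representative of v's equivalence class.
        parent.setdefault(v, v)
        while parent[v] != v:
            v = parent[v]
        return v

    for a, b in equalities:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    names = {}

    def rename(match):
        root = find(match.group(0))
        names.setdefault(root, f"x{len(names) + 1}")
        return names[root]

    return re.sub(r"\bx\d+\b", rename, fragments)


fragments = "strong(x1) & drive(x2) & for(x3,x4) & success(x5)"
equalities = [("x1", "x2"), ("x2", "x3"), ("x4", "x5")]
merged = merge_variables(fragments, equalities)
print(merged)  # strong(x1) & drive(x1) & for(x1,x2) & success(x2)
```

Apart from the sense tags and a(x1), the merged string has the same shape as the right-hand side of the axiom above.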
Converting the Glosses to Logic
• Sometimes works well!
• But often not. Primary problems:
1. Errors in the language processing
2. Only captures definitional knowledge
3. “Flowery” language, many gaps, metonymy, ambiguity; if logic closely follows syntax → “logico-babble”
“hammer#n2: tool used to deliver an impulsive force by striking”
hammer#n2(x1) →
tool#n1(x1) & use#v1(e1,x2,x1) & to(e1,e2) & deliver#v2(e2,x3) &
driving#a1(x3) & force#n1(x3) & by(e3,e4) & strike#v3(e4,x4).
→ Hammers hit things??
Core Theories
• Many domain-specific facts are instantiations of more general, “core” knowledge
• By encoding this core knowledge, we get leverage
• e.g., 517 “vehicle” noun senses, 185 “cover” verb senses
• Approach:
– Analysis and grouping of words in Core WordNet
– Identification and encoding of underlying theories
Core Theories
Composite Entities: perfect, empty, relative, secondary, similar, odd, ...
Scales: step, degree, level, intensify, high, major, considerable, ...
Events: constraint, secure, generate, fix, power, development, ...
Space: grade, inside, lot, top, list, direction, turn, enlarge, long, ...
Time: year, day, summer, recent, old, early, present, then, often, ...
Cognition: imagination, horror, rely, remind, matter, estimate, idea, ...
Communication: journal, poetry, announcement, gesture, charter, ...
Persons and their Activities: leisure, childhood, glance, cousin, jump, ...
Microsocial: virtue, separate, friendly, married, company, name, ...
Material World: smoke, shell, stick, carbon, blue, burn, dry, tough, ...
Geo: storm, moon, pole, world, peak, site, village, sea, island, ...
Artifacts: bell, button, van, shelf, machine, film, floor, glass, chair, ...
Food: cheese, potato, milk, bread, cake, meat, beer, bake, spoil, ...
Macrosocial: architecture, airport, headquarters, prosecution, ...
Economic: import, money, policy, poverty, profit, venture, owe, ...
Morphosemantic Links
• Often need to cross part-of-speech
T: A council worker cleans up after Tuesday's violence in Budapest.
H: There were attacks in Budapest on Tuesday.
• Can solve with WN’s derivation links:
(“attack”) attack_v3 ↔ aggression_n4 (← “violence”)
via derivation link “aggress”/“aggression”
Morphosemantic Links
• But can go wrong!
T: Paying was slow
H1: The transaction was slow
H2: *The person was slow [NOT entailed]
(“pay”) pay_v1 ↔ payment_n1 (→ “transaction”), via “pay”/“payment”
(“pay”) pay_v1 ↔ payer_n1 (→ “person”), via “pay”/“payer”
Problem: The type of relation matters for derivatives! (Event? Agent? ...)
A pays B → the payment (event noun) by A
A pays B → A is the payer (agent noun) of B
Morphosemantic Links
• Task: Classify the 22,000 links in WordNet:
Verb Synset    Noun Synset     Relationship
hammer_v1      hammer_n1       instrument
execute_v1     execution_n1    event (equal)
sign_v2        signatory_n1    agent
• Semi-automatic process
– Exploit taxonomy and morphology
• 15 semantic types used
– agent, undergoer, instrument, result, material, destination,
location, result, by-means-of, event, uses, state, property,
body-part, vehicle.
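Once the links are typed, a reasoner can follow only the derivations whose relation fits the context. The sketch below stores the example links from these slides in a plain dictionary. The table layout and the `nominalizations` helper are assumptions made for illustration, not the released data format.

```python
# Hypothetical in-memory table: (verb synset, noun synset) -> relation type.
# Entries are the examples given on the slides.
TYPED_LINKS = {
    ("hammer_v1", "hammer_n1"): "instrument",
    ("execute_v1", "execution_n1"): "event",
    ("sign_v2", "signatory_n1"): "agent",
    ("pay_v1", "payment_n1"): "event",
    ("pay_v1", "payer_n1"): "agent",
}


def nominalizations(verb_synset, relation):
    """Return the noun synsets linked to verb_synset by the given relation."""
    return sorted(
        noun
        for (verb, noun), rel in TYPED_LINKS.items()
        if verb == verb_synset and rel == relation
    )


# "Paying was slow" talks about the event, so only event links are followed.
event_nouns = nominalizations("pay_v1", "event")
print(event_nouns)  # ['payment_n1']
```

With this filter, "Paying was slow" can reach payment_n1 (“transaction”) but not payer_n1 (“person”), which is the behavior the untyped links could not guarantee.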
Experimentation
Task: Recognizing Entailment
• Experiment with WordNet, logical glosses, DIRT
• Text interpretation to logic using Boeing’s NLP system
“A soldier was killed in a gun battle”
Initial Logic:
“soldier”(soldier01),
“kill”(…..
object(kill01,soldier01),
“in”(kill01,battle01),
modifier(battle01,gun01).
Final Logic:
isa(soldier01,soldier_n1),
isa(……
object(kill01,soldier01)
during(kill01,battle01)
instrument(battle01,gun01)
• Entailment: T → H if:
– T is subsumed by H (“cat eats mouse” → “animal was eaten”)
– An elaboration of T using inference rules is subsumed by H
• (“cat eats mouse” → “cat swallows mouse”)
• No statistical similarity metrics
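The two entailment conditions above can be sketched with a toy taxonomy and one elaboration rule. Everything here is a simplified assumption for illustration: the role dictionaries, the `HYPERNYM` and `ELABORATIONS` tables, and the function names are hypothetical, and the real system works over the logic produced by the NLP pipeline.

```python
# Toy child -> parent taxonomy, a hypothetical stand-in for WordNet hypernym links.
HYPERNYM = {"cat": "animal", "mouse": "animal", "animal": "entity"}

# Hypothetical elaboration rule: an eating event also involves swallowing.
ELABORATIONS = {"eat": ["swallow"]}


def subsumes(general, specific):
    """True if `general` equals `specific` or is one of its ancestors."""
    while specific is not None:
        if specific == general:
            return True
        specific = HYPERNYM.get(specific)
    return False


def entails(text, hypothesis):
    """H subsumes T if every role H mentions is filled in T by an equal or more specific term."""
    return all(
        role in text and subsumes(filler, text[role])
        for role, filler in hypothesis.items()
    )


def entails_with_rules(text, hypothesis):
    """Try T itself, then each elaboration of T obtained from the rules."""
    candidates = [text]
    candidates += [dict(text, event=e) for e in ELABORATIONS.get(text["event"], [])]
    return any(entails(t, hypothesis) for t in candidates)


T = {"event": "eat", "subject": "cat", "object": "mouse"}
H1 = {"event": "eat", "object": "animal"}                       # "animal was eaten"
H2 = {"event": "swallow", "subject": "cat", "object": "mouse"}  # "cat swallows mouse"
print(entails(T, H1), entails(T, H2), entails_with_rules(T, H2))
```

H1 follows by subsumption alone, while H2 needs the elaboration rule first, mirroring the two bullets above.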
Successful Examples with the Glosses
• Good example
14.H4
T: Britain puts curbs on immigrant labor from Bulgaria and Romania.
H: Britain restricted workers from Bulgaria.
WN: limit_v1 (“restrict”): place limits on.
T: Britain puts curbs on immigrant labor from Bulgaria and Romania.
H: Britain placed limits on workers from Bulgaria.
→ ENTAILED (correct)
Successful Examples with the Glosses
• Another (somewhat) good example
56.H3
T: The administration managed to track down the perpetrators.
H: The perpetrators were being chased by the administration.
WN: hunt_v1 (“hunt”, “track down”): pursue for food or sport
T: The administration managed to pursue the perpetrators [for food or sport!].
H: The perpetrators were being chased by the administration.
→ ENTAILED (correct)