Identifying Anaphoric and Non-Anaphoric Noun Phrases to Improve Coreference Resolution

Identifying Anaphoric and Non-Anaphoric Noun Phrases to Improve Coreference Resolution. Vincent Ng and Claire Cardie, Department of Computer Science, Cornell University

Plan for the Talk
Noun phrase coreference resolution: general machine learning approach; baseline coreference resolution system
Identification of anaphoric/non-anaphoric noun phrases (anaphoricity determination): why anaphoricity information can help coreference resolution; general machine learning approach; anaphoricity determination system
Using anaphoricity information in coreference resolution

Noun Phrase Coreference: identify all noun phrases that refer to the same entity. Queen Elizabeth set about transforming her husband, King George VI, into a viable monarch. Logue, a renowned speech therapist, was summoned to help the King overcome his speech impediment...

A Machine Learning Approach: Classification. Given a description of two noun phrases, NP_i and NP_j, classify the pair as coreferent or not coreferent. Example: [Queen Elizabeth] set about transforming [her] [husband], ... (Aone & Bennett [1995]; Connolly et al. [1994]; McCarthy & Lehnert [1995]; Soon, Ng & Lim [2001])

A Machine Learning Approach: Clustering coordinates the pairwise coreference decisions. In the example, the clustering algorithm groups the NPs into the chains {Queen Elizabeth, her}, {husband, King George VI, the King, his}, and {Logue, a renowned speech therapist}.

Machine Learning Issues: training data creation, instance representation, learning algorithm, clustering algorithm [Ng and Cardie, ACL 02]

Baseline System: Training Data Creation
Creating training instances from texts annotated with coreference information: one instance for each pair of noun phrases.
Feature vector: describes the two NPs and their context.
Class value: coref for pairs on the same coreference chain, not coref otherwise.
Use sampling to deal with the skewed class distribution.
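
To make the instance-creation step concrete, here is a minimal Python sketch of building pairwise training instances from one coreference-annotated text. It assumes mentions arrive in document order as dicts with a hypothetical chain id (None when the NP is on no chain); the neg_rate down-sampling is only a placeholder for the sampling mentioned above, not the system's actual scheme.

```python
from itertools import combinations
import random

def make_pair_instances(nps, neg_rate=0.1, seed=0):
    """Pairwise training instances from one coreference-annotated text.
    `nps` lists mentions in document order; each carries a `chain` id or None.
    Label 1 (coref) if both NPs share a chain, else 0, with negatives
    down-sampled to soften the skewed class distribution."""
    rng = random.Random(seed)
    instances = []
    for np_i, np_j in combinations(nps, 2):     # np_i precedes np_j
        coref = np_i["chain"] is not None and np_i["chain"] == np_j["chain"]
        if coref:
            instances.append((np_i, np_j, 1))
        elif rng.random() < neg_rate:           # keep only a fraction of negatives
            instances.append((np_i, np_j, 0))
    return instances
```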

Baseline System: Instance Representation, 53 features per instance
Lexical (9): NP string matching operations
Semantic (6): semantic compatibility tests, aliasing
Positional (2): distance in terms of number of sentences/paragraphs
Knowledge-based (2): naïve pronoun resolution, rule-based coreference resolution
Grammatical (34): NP type, grammatical role, linguistic constraints, linguistic preferences, heuristics
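
The sketch below shows how a handful of such pair features might be computed; the feature names (string_match, sent_distance, etc.) and the mention fields (text, sent, is_pronoun) are hypothetical simplifications, not the system's actual 53-feature set.

```python
def pair_features(np_i, np_j):
    """A few illustrative pair features; simplified stand-ins for the full
    53-feature representation.  Mentions are dicts with hypothetical `text`,
    `sent` (sentence index) and `is_pronoun` fields."""
    s_i, s_j = np_i["text"].lower(), np_j["text"].lower()
    return {
        "string_match": s_i == s_j,                        # lexical
        "head_match": s_i.split()[-1] == s_j.split()[-1],  # lexical
        "sent_distance": np_j["sent"] - np_i["sent"],      # positional
        "j_is_pronoun": np_j["is_pronoun"],                # grammatical (NP type)
        "j_is_definite": s_j.startswith("the "),           # grammatical (NP type)
    }
```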

Baseline System: Learning Algorithm. C4.5 (Quinlan, 1993), decision tree induction; the classifier outputs a coreference likelihood.

Baseline System: Clustering Algorithm. Best-first single-link clustering: for each noun phrase, select as antecedent the NP with the highest coreference likelihood from among the preceding NPs classified as coreferent with it.
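
A minimal sketch of best-first single-link clustering under the description above, assuming a coref_likelihood(np_i, np_j) callable stands in for the C4.5 classifier's confidence and the 0.5 threshold is a placeholder:

```python
def best_first_cluster(nps, coref_likelihood, threshold=0.5):
    """Best-first single-link clustering: each NP is linked to the preceding
    NP with the highest coreference likelihood above `threshold`; coreference
    chains are the connected components of those links (tracked here with a
    small union-find)."""
    parent = list(range(len(nps)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]       # path halving
            x = parent[x]
        return x

    for j in range(len(nps)):
        scored = [(coref_likelihood(nps[i], nps[j]), i) for i in range(j)]
        scored = [(p, i) for p, i in scored if p > threshold]
        if scored:
            _, best_i = max(scored)             # best-first: highest likelihood
            parent[find(j)] = find(best_i)      # single-link merge
    return [find(j) for j in range(len(nps))]   # one cluster id per NP
```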

Baseline System: Evaluation. MUC-6 and MUC-7 coreference data sets (documents annotated w.r.t. coreference): MUC-6 has 30 training texts + 30 test texts; MUC-7 has 30 training texts + 20 test texts. Scoring with the MUC scoring program: recall, precision, F-measure.
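
For reference, the F-measure reported here is the harmonic mean of recall and precision; the MUC scorer itself derives R and P from coreference-link counts, and only the combination step is shown in this sketch.

```python
def f_measure(recall, precision):
    """Balanced F-measure combining the recall and precision reported by the
    MUC scoring program."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)

# e.g. f_measure(0.703, 0.583) ≈ 0.64, in line with the MUC-6 baseline row below (63.8)
```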

Baseline System: Results
                  MUC-6               MUC-7
                  R     P     F       R     P     F
Baseline          70.3  58.3  63.8    65.5  58.2  61.6
Best MUC System   59    72    65      56.1  68.8  61.8
Worst MUC System  36    44    40      52.5  21.4  30.4

Plan for the Talk
Noun phrase coreference resolution: general machine learning approach; baseline coreference resolution system
Identification of anaphoric/non-anaphoric noun phrases (anaphoricity determination): why anaphoricity information can help coreference resolution; general machine learning approach; anaphoricity determination system
Using anaphoricity information in coreference resolution

Motivation
Baseline coreference system: the single-link clustering algorithm attempts to find an antecedent for each noun phrase.
What we really want: the clustering algorithm should attempt to find an antecedent only for each anaphoric noun phrase.
Availability of anaphoricity information can therefore increase the precision of the coreference system.

Anaphoricity Determination For each noun phrase in a text, determine whether it is part of a coreference chain but is not the head of the chain. Queen Elizabeth set about transforming her husband, King George VI, into a viable monarch. Logue, a renowned speech therapist, was summoned to help the King overcome his speech impediment...

A Machine Learning Approach: Classification. Given a description of a noun phrase, NP_i, classify NP_i as anaphoric or not anaphoric. Example: [Queen Elizabeth] set about transforming [her] [husband], ... (her is anaphoric; Queen Elizabeth and husband are not).

Anaphoricity Determination System: Training data creation from texts annotated with coreference information, with one instance for each noun phrase. Learning algorithm: C4.5.

Anaphoricity Determination System: Instance representation, 37 features per instance
Lexical (4): case, string matching, head matching
Positional (3): header, first sentence, first paragraph
Semantic (4): title, aliasing, semantic compatibility
Grammatical (35): NP type (definite, indefinite, bare plural); NP property (pre-modified, post-modified, number); syntactic pattern (THE_N, THE_PN, THE_ADJ_N)
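
A minimal sketch of how anaphoricity training instances could be derived from coreference-annotated mentions, with labels following the definition given earlier (on a chain but not its first element). The feature set and mention fields (chain, text, sent, is_pronoun, plural) are illustrative stand-ins, not the actual 37 features.

```python
def make_anaphoricity_instances(nps):
    """One training instance per NP: the label is 1 (anaphoric) if the NP lies
    on a coreference chain but is not the chain's first (head) element.
    Mentions are assumed to be in document order, each carrying a `chain` id
    or None."""
    seen_chains = set()
    instances = []
    for np_ in nps:
        chain = np_["chain"]
        anaphoric = chain is not None and chain in seen_chains
        if chain is not None:
            seen_chains.add(chain)
        instances.append((anaphoricity_features(np_), int(anaphoric)))
    return instances

def anaphoricity_features(np_):
    """A few illustrative single-NP features; simplified stand-ins only."""
    text = np_["text"].lower()
    first_word = text.split()[0] if text else ""
    return {
        "is_definite": first_word == "the",
        "is_indefinite": first_word in ("a", "an"),
        "is_pronoun": np_["is_pronoun"],
        "in_first_sentence": np_["sent"] == 0,
        "is_bare_plural": first_word not in ("the", "a", "an") and np_.get("plural", False),
    }
```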

Anaphoricity Determination System: Evaluation (MUC-6 and MUC-7 coreference data sets)
Corpus      Instances  % Negatives  Accuracy
MUC-6 test  4565       66.3         86.1
MUC-7 test  3558       73.2         84.0

Existing Approaches to Anaphoricity Determination. Heuristic-based approaches: Paice and Husk (1987), Lappin and Leass (1994), Kennedy and Boguraev (1996), Denber (1998), Vieira and Poesio (2000). Machine learning approaches: unsupervised, Bean and Riloff (1999); supervised, Evans (2001).

Comparison with Previous Work (I): Approaches to anaphoricity determination
Our Approach: can operate on all types of noun phrases
Previous Approaches: handle specific types of noun phrases only (e.g., focus on common nouns)

Comparison with Previous Work (I): Existing anaphoricity determination algorithms address only specific types of NPs:
pleonastic pronouns: Paice and Husk (1987), Lappin and Leass (1994), Kennedy and Boguraev (1996), Denber (1998)
definite descriptions: Bean and Riloff (1999), Vieira and Poesio (2000)
anaphoric and non-anaphoric uses of "it": Evans (2001)

Comparison with Previous Work (II): Using anaphoricity information in coreference resolution
Our Coref System: employs anaphoricity determination as a separate component
Previous Coref Systems: perform anaphoricity determination within the coreference system

Comparison with Previous Work (II): Most previous work performs anaphoricity determination implicitly, e.g. via a specific feature in the coreference system. One exception: Harabagiu et al. (2001), which assumes perfect anaphoricity information and so effectively employs a separate (manual) anaphoricity determination component.

Comparison with Previous Work (III): Evaluation of the anaphoricity determination system
Our System: evaluated both as a standalone component and in the context of coreference resolution
Previous Systems: evaluated as a standalone component only; the contribution to coreference resolution is not evaluated

Comparison with Previous Work (III): Little previous work evaluates the effects of anaphoricity determination in anaphora/coreference resolution. [Table: for each of Bean and Riloff (1999), Denber (1998), Evans (2001), Kennedy and Boguraev (1996), Lappin and Leass (1994), Mitkov et al. (2001), Paice and Husk (1987), and Vieira and Poesio (2000), whether an anaphoricity determination system is presented and whether its effects on coreference resolution are evaluated.]

Plan for the Talk
Noun phrase coreference resolution: general machine learning approach; baseline coreference resolution system
Identification of anaphoric/non-anaphoric noun phrases (anaphoricity determination): why anaphoricity information can help coreference resolution; general machine learning approach; anaphoricity determination system
Using anaphoricity information in coreference resolution

How can anaphoricity information be used? The clustering algorithm will search for an antecedent only for anaphoric noun phrases. Hypothesis: anaphoricity information will improve precision.
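
A sketch of how the resolution loop changes once an anaphoricity filter is available: the antecedent search is simply skipped for NPs judged non-anaphoric. Here is_anaphoric and coref_likelihood are hypothetical callables standing in for the two learned classifiers, and the threshold is a placeholder.

```python
def resolve_with_anaphoricity(nps, coref_likelihood, is_anaphoric, threshold=0.5):
    """Resolution loop that consults an anaphoricity filter: an antecedent is
    searched for only when the NP is judged anaphoric; other NPs are left to
    start their own chains.  Returns the chosen antecedent index per NP."""
    antecedents = [None] * len(nps)
    for j, np_j in enumerate(nps):
        if not is_anaphoric(np_j):              # skip non-anaphoric NPs
            continue
        scored = [(coref_likelihood(nps[i], np_j), i) for i in range(j)]
        scored = [(p, i) for p, i in scored if p > threshold]
        if scored:
            antecedents[j] = max(scored)[1]     # best-first: highest likelihood
    return antecedents
```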

Anaphoricity Determination for Coref Resolution
          MUC-6               MUC-7
          R     P     F       R     P     F
Baseline  70.3  58.3  63.8    65.5  58.2  61.6
The coreference system has fairly low precision.

Results (Perfect Anaphoricity Information)
                                MUC-6               MUC-7
                                R     P     F       R     P     F
Baseline                        70.3  58.3  63.8    65.5  58.2  61.6
With perfect anaphoricity info  66.3  81.4  73.1    61.5  83.2  70.7
Perfect anaphoricity information can improve precision.

Results (Learned Anaphoricity Information)
                                MUC-6               MUC-7
                                R     P     F       R     P     F
Baseline                        70.3  58.3  63.8    65.5  58.2  61.6
With learned anaphoricity info  57.4  71.6  63.7    47.0  77.1  58.4
The improvement in precision comes at the expense of a significant loss in recall.

What went wrong? Hypothesis 1: the drop in recall and overall performance is caused by poor accuracy of the anaphoricity classifier on positive instances.
Accuracy of the anaphoricity classifier: overall, 86.1% (MUC-6) and 84.0% (MUC-7); on positives only, 73.1% (MUC-6) and 66.2% (MUC-7).
The anaphoricity classifier misclassifies 414 and 322 anaphoric entities as non-anaphoric for the MUC-6 and MUC-7 data sets, respectively.
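
For clarity, "accuracy on positive instances" here is the fraction of truly anaphoric NPs that the classifier also labels anaphoric, as in this small sketch (0/1 labels per NP instance):

```python
def accuracy_on_positives(gold_labels, predicted_labels):
    """Fraction of truly anaphoric (positive) NPs that the anaphoricity
    classifier also labels anaphoric."""
    positives = [(g, p) for g, p in zip(gold_labels, predicted_labels) if g == 1]
    if not positives:
        return 0.0
    return sum(p == 1 for _, p in positives) / len(positives)
```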

Need more accuracy? Hypothesis 1.1: accuracy levels of 66-73% on positive instances for anaphoricity determination are not adequate for improving coreference resolution. Goal: improve the accuracy on positive instances. How?

Improving Accuracy on Positive Instances
Observations: string matching and aliasing are strong indicators of coreference, but weaker indicators of anaphoricity.
Goal: ensure that anaphoric NPs involved in these two types of relations are correctly classified.

Classification with Constraints: assume that an NP is anaphoric (and bypass the anaphoricity classifier) if anaphoricity is indicated by either the string matching or the aliasing constraint.
Accuracy on positive instances: no constraints, 73.1% (MUC-6) and 66.2% (MUC-7); with constraints, 82.0% (MUC-6) and 80.8% (MUC-7).
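
A sketch of the constrained decision rule, assuming simplified string_match and is_alias tests; the coreference system's actual string-matching and aliasing tests are more elaborate.

```python
def string_match(np_i, np_j):
    # crude stand-in: identical strings after lower-casing
    return np_i["text"].lower() == np_j["text"].lower()

def is_alias(np_i, np_j):
    # crude stand-in: one NP string contains the other (e.g. a shortened name)
    a, b = np_i["text"].lower(), np_j["text"].lower()
    return a != b and (a in b or b in a)

def is_anaphoric_with_constraints(np_j, preceding_nps, anaphoricity_classifier):
    """Constrained anaphoricity decision: if NP_j string-matches or is an alias
    of any preceding NP, assume it is anaphoric and bypass the learned
    classifier; otherwise fall back to the classifier's decision."""
    for np_i in preceding_nps:
        if string_match(np_i, np_j) or is_alias(np_i, np_j):
            return True
    return anaphoricity_classifier(np_j)
```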

Results (Classification with Constraints)
                                      MUC-6               MUC-7
                                      R     P     F       R     P     F
Baseline                              70.3  58.3  63.8    65.5  58.2  61.6
With anaphoricity (no constraints)    57.4  71.6  63.7    47.0  77.1  58.4
With anaphoricity (with constraints)  63.4  68.3  65.8    59.7  69.3  64.2
Large gains in precision come with smaller drops in recall: automatically acquired anaphoricity information can be used to improve the performance of coreference resolution.

Results (Comparison with Best MUC Systems)
                                      MUC-6               MUC-7
                                      R     P     F       R     P     F
With anaphoricity (with constraints)  63.4  68.3  65.8    59.7  69.3  64.2
Best MUC System                       59    72    65      56.1  68.8  61.8

Results (Comparison with Perfect Anaphoricity)
                                      MUC-6               MUC-7
                                      R     P     F       R     P     F
With anaphoricity (with constraints)  63.4  68.3  65.8    59.7  69.3  64.2
With perfect anaphoricity info        66.3  81.4  73.1    61.5  83.2  70.7
There is substantial room for improvement in anaphoricity determination.

Summary: Presented a supervised learning approach to anaphoricity determination that can handle all types of NPs. Investigated the use of anaphoricity information in coreference resolution. Showed that automatically acquired knowledge of anaphoricity can be used to improve the performance of a learning-based coreference system.