That's Your Evidence?: Using Mechanical Turk To Develop A Computational Account Of Debate And Argumentation In Online Forums


Natural Language and Dialogue Systems Lab
Prof. Marilyn Walker

Debate and Deliberation: Key Human Activity
- Navy Research Lab funding
- IARPA, 3rd year of funding
- Identify subgroups on sides of issue

Persuasion and Argumentation on ConvinceMe

STANCE
- An overall position held by a person towards an object, idea or position (Somasundaran & Wiebe, 2009)
- Stance Groups: IARPA's subgroup (cells)

Stance Classification: Death Penalty

Yes we should keep it
  I value human life so much that if someone takes one than his should be taken. Also if someone is thinking about taking a life they are less likely to do so knowing that they might lose theirs

No we should not
  There is no proof that the death penalty acts as a deterrent, plus due to the finalty of the sentence it would be impossible to amend a mistaken conviction which happens with regualrity especially now due to DNA and improved forensic science

Dialogic Properties of ConvinceMe
- Every site offers different contextual affordances
- ConvinceMe provides three sources of dialogue structure
  - Original post: the topic and responses on either side can be considered a response to the original post
  - Rebuttal links: explicitly link to a previous post on the other side
  - Temporal context at the time of your post: what the page looked like, existing posts the user could see (is lost)
- Timestamps only by day: get partial order by day, plus order within day only via rebuttals
- No agree links

Death Penalty: Monologic Posts

Yes we should keep it
  I value human life so much that if someone takes one than his should be taken. Also if someone is thinking about taking a life they are less likely to do so knowing that they might lose theirs

No we should not
  There is no proof that the death penalty acts as a deterrent, plus due to the finalty of the sentence it would be impossible to amend a mistaken conviction which happens with regualrity especially now due to DNA and improved forensic science

Death Penalty: Rebuttal Chain

RIGHT
  Studies have shown that using the death penalty saves 4 to 13 lives per execution. That alone makes killing murderers worthwhile.
  When Texas and Florida were executing people one after the other in the late 90's, the murder rates in both states plunged, like Rosie O'donnel off a diet...

WRONG
  What studies? I have never seen ANY evidence that capital punishment acts as a deterrant to crime. I have not seen any evidence that it is "just" either.
  That's your evidence? What happened to those studies? In the late 90s a LOT of things were different than the periods preceding and following the one you mention. We have no way to determine what of those contributed to a lower murder rate, if indeed there was one. You have to prove a cause and effect relationship and you have failed.

How do humans do at this task?

1113 Debates, 4873 posts

Topic                     Rebuttals  P/A
Cats v. Dogs              40%        1.68
Firefox vs. IE            40%        1.28
Mac vs. PC                47%        1.85
Superman/Batman           34%        1.41
2nd Amendment             59%        2.09
Abortion                  70%        2.82
Climate Change            69%        2.97
Communism vs. Capitalism  70%        3.03
Death Penalty             62%        2.44
Evolution                 76%        3.91
Exist God                 77%        4.24
Gay Marriage              65%        2.12
Healthcare                80%        3.24
Marijuana Legalization    52%        1.55

- Ideological topics always more than 50% rebuttals
- More author investment

Mechanical Turk Stance Siding

Data Preparation

Map Debates into Topic Sets

Open debates matching "capital punishment"
- Should Capital Punishment be Allowed? (Mar 10)
- Do you agree with capital punishment? (Sep 22)

Open debates matching "death penalty"
- Capital Punishment (Feb 09)
- Should young adults who are convicted of extreme crimes, be issued the death penalty? (Jan 28)
- Should death penalty be repealed again in the Philippines? (Sep 15)
- Should the death penalty be brought back?why? (Jun 24)
- death penalty (Mar 14)
- Is the death penalty morally correct as it is SUPPOSED to be used in the United States? (Aug 27)
- death penalty (Mar 02)
- The Death Penalty should be legal (Feb 04)

ConvinceMe: 1113 Two-Sided Debates

Topic                     Rebuttals  P/A
Cats v. Dogs              40%        1.68
Firefox vs. IE            40%        1.28
Mac vs. PC                47%        1.85
Superman/Batman           34%        1.41
2nd Amendment             59%        2.09
Abortion                  70%        2.82
Climate Change            69%        2.97
Communism vs. Capitalism  70%        3.03
Death Penalty             62%        2.44
Evolution                 76%        3.91
Exist God                 77%        4.24
Gay Marriage              65%        2.12
Healthcare                80%        3.24
Marijuana Legalization    52%        1.55

Mechanical Turk: HIT
- 9 annotators/post

Mechanical Turk: Human Topline

Human Topline for Stance Classification

Class         Correct  Total  Accuracy
Rebuttal      606      827    .73
Non-Rebuttal  427      493    .87

- Overall accuracy about 78%
- Harder for humans to classify stance of rebuttals
  - Rebuttals are more context dependent
  - Sometimes people post on wrong side
- Harder for humans to classify ideological posts
  - 76% of ideological posts sided correctly, 85% non-ideological
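The accuracies on this slide follow directly from the correct/total counts. A small Python check, using only the numbers shown above (the variable and function names are illustrative, not from the slides):

```python
# Correct/total counts copied from the Human Topline slide.
counts = {"Rebuttal": (606, 827), "Non-Rebuttal": (427, 493)}


def accuracy(correct, total):
    """Fraction of posts whose stance the annotators sided correctly."""
    return correct / total


# Per-class accuracy, rounded to two decimals as on the slide.
per_class = {name: round(accuracy(c, t), 2) for name, (c, t) in counts.items()}

# Overall accuracy pools both classes before dividing.
total_correct = sum(c for c, _ in counts.values())
total_posts = sum(t for _, t in counts.values())
overall = accuracy(total_correct, total_posts)

print(per_class)
print(round(overall, 3))
```

Pooling the two classes gives 1033 correct out of 1320 posts, which matches the "about 78%" stated on the slide.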

Stance Classification & Rebuttal Classification

Experimental Setup
- Stance classification
- Within topic
- Remove cases where majority of annotators got it wrong
- But have additional NOISY data self-annotated (as posted)
- Explore the role of different feature sets
- 10-fold cross-validation
- Naïve Bayes, JRip, SVM learners
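The filtering step ("remove cases where majority of annotators got it wrong") could look roughly like the sketch below. It assumes that the reference label is the side the author posted on and that each post carries the nine annotator votes mentioned on the HIT slide; the record layout and the function names are hypothetical, and the slides do not say how the filter was actually implemented.

```python
from collections import Counter


def majority_label(votes):
    """Return the most common label among the annotator votes."""
    return Counter(votes).most_common(1)[0][0]


def filter_posts(posts):
    """Keep posts whose annotator majority matches the side the author posted on."""
    return [p for p in posts if majority_label(p["votes"]) == p["posted_side"]]


# Hypothetical records: nine votes per post, two possible sides (so no ties).
posts = [
    {"id": 1, "posted_side": "yes", "votes": ["yes"] * 7 + ["no"] * 2},
    {"id": 2, "posted_side": "no", "votes": ["yes"] * 5 + ["no"] * 4},
]

kept = filter_posts(posts)
print([p["id"] for p in kept])
```

Post 2 is dropped because five of its nine annotators chose the side opposite to the one it was posted on.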

Context Features: (naïve) IsRebuttal, Poster, Parent Post Features

RIGHT
  Studies have shown that using the death penalty saves 4 to 13 lives per execution. That alone makes killing murderers worthwhile.
  When Texas and Florida were executing people one after the other in the late 90's, the murder rates in both states plunged, like Rosie O'donnel off a diet...

WRONG
  What studies? I have never seen ANY evidence that capital punishment acts as a deterrant to crime. I have not seen any evidence that it is "just" either.
  That's your evidence? What happened to those studies? In the late 90s a LOT of things were different than the periods preceding and following the one you mention. We have no way to determine what of those contributed to a lower murder rate, if indeed there was one. You have to prove a cause and effect relationship and you have failed.

STANCE Classification Results

Topic                MTurk  Uni    Best   Best FeatSet
Cats v. Dogs         94     59.23  62.31  All, no context
Firefox vs. IE       74     51.25  53.75  LIWC, no context
Mac vs. PC           76     53.33  56.67  LIWC, no context
Superman Batman      89     54.84  57.26  LIWC with context
2nd Amendment        69     56.41  69.23  Unigram with context
Abortion             75     50.97  53.70  LIWC with context
Climate Change       66     53.65  58.33  LIWC
Comm vs. Capitalism  68     48.81  56.55  LIWC with context
Death Penalty        79     51.80  57.55  Generalized Dep POS with context
Evolution            72     57.24  57.24  Unigram, no context
Existence of God     73     52.71  53.42  Generalized Dep POS with context
Gay Marriage         88     60.28  60.28  Unigram, no context
Healthcare           86     52.13  60.64  LIWC with context
MJ Legalization      81     57.55  59.43  All, no context

- Two topics unigram best
- Ideological topics: context tends to help
- Need better context features

Compare Somasundaran & Wiebe 2010
- Results range from 60 to 70%
- Arg + Sent statistically better than Unigram
- Their unigram baseline is higher
- Data doesn't contain rebuttals

Domain (#posts)    Distribution  Unigram  Sentiment  Arguing  Arg+Sent
Overall (2232)     50            62.50    55.02      62.59    63.93
Guns Rights (306)  50            66.67    58.82      69.28    70.59
Gay Rights (846)   50            61.70    52.84      62.05    63.71
Abortion (550)     50            59.1     54.73      59.46    60.55
Creationism (530)  50            64.91    56.60      62.83    63.96

Table 4: Accuracy of the different systems

Current Work: Incorporate Social Network Models
- Use information over whole discussion
- Graph-based algorithm
  - Speaker always agrees with self (P/A ranges 1.28 to 3.91)
  - Rebuttal indicates disagreement
  - A, B both disagree with C, agree with each other
- Evolution topic: now 71.5% accuracy (from 57%)
- Firefox vs. IE: no improvement
- Graph topology differences, under investigation now
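The three constraints on this slide can be illustrated with a very simple propagation over the rebuttal graph. The sketch below is not the lab's algorithm (the slides only say "graph-based", and the related work mentions Mincut/Maxcut); it is a plain breadth-first propagation under the assumptions that sides are coded 0/1, that at least one author's side is known, and that every rebuttal link means disagreement. The function name and input layout are hypothetical.

```python
from collections import deque


def propagate_sides(rebuttals, seeds):
    """Spread known sides through rebuttal links.

    rebuttals: list of (author, rebutted_author) pairs.
    seeds: dict mapping some authors to a known side, 0 or 1.
    An author keeps one side for all of their posts (agrees with self),
    and a rebuttal link puts the two authors on opposite sides.
    """
    graph = {}
    for a, b in rebuttals:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    sides = dict(seeds)
    queue = deque(seeds)
    while queue:
        author = queue.popleft()
        for other in graph.get(author, ()):
            if other not in sides:  # first assignment wins; conflicts are ignored
                sides[other] = 1 - sides[author]
                queue.append(other)
    return sides


# A and B both rebut C, so they end up on the same side as each other.
sides = propagate_sides([("A", "C"), ("B", "C")], {"C": 0})
print(sides)
```

A real system would need to handle noisy links (for example, people who rebut their own side), which is where a cut-based formulation is more robust than this first-assignment-wins propagation.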

QUESTIONS?

Discourse Relation Recognition
- Rebuttal is a kind of discourse relation
- Initiative Response Recognition
  - Wang & Rosé 2009
  - Get about 70% accuracy, LSA-CART
  - Don't distinguish rebuttals from other
- Feature Engineering for Discourse Relation Recognition
  - Marcu & Echihabi 2002: cartesian pairs
  - Pitler, Louis & Nenkova, 2009

http://pcon.soe.ucsc.edu/mturk_external/convinceme/cme.php?totalhits=100&pagegroup=1

Related Work
- Stance Classification in (Ideological) Online Debates
  - Somasundaran & Wiebe, 2009; 2010
  - Discourse Relations (in scope of concession)
  - Argumentation Features
  - Results range from 60% to 70% accuracy
- Debate Side Classification (congress or forums)
  - Thomas et al. 2006, Bansal, Cardie 2010, Mishne Glance, Murakami & Raymond 2010
  - Use graph-based algorithms (social network structure via Mincut, Maxcut)
  - Simple features for agreement/disagreement from text improve performance

Compare Somasundaran & Wiebe 09, 10

Topic              Posts  OPPr + Discourse Feats, Best Accuracy
Firefox vs. IE     62     66.13%
Windows vs. Mac    15     66.67%
SonyPS3 vs. WII    36     61.0%
Opera vs. Firefox  4      100%

Domain (#posts)    Distribution  Unigram  Sentiment  Arguing  Arg+Sent
Overall (2232)     50            62.50    55.02      62.59    63.93
Guns Rights (306)  50            66.67    58.82      69.28    70.59
Gay Rights (846)   50            61.70    52.84      62.05    63.71
Abortion (550)     50            59.1     54.73      59.46    60.55
Creationism (530)  50            64.91    56.60      62.83    63.96

Table 4: Accuracy of the different systems

Feature Sets

Set                       Description/Examples
Post Info                 IsRebuttal, Poster
Unigrams                  Word frequencies
Bigrams                   Word pair frequencies
Cue Words                 Initial unigram, bigram, and trigram
Repeated Punctuation      Collapsed into one of the following: ??, !!, ?!
LIWC                      LIWC measures and frequencies
Dependencies              Dependencies derived from the Stanford Parser.
Generalized Dependencies  Dependency features generalized with respect to POS of the head word and opinion polarity of both words.
Opinion Dependencies      Subset of Generalized Dependencies with opinion words from MPQA.
Context Features          Matching Features used for the post from the parent post.
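Two of the simpler rows in this table, Cue Words and Repeated Punctuation, can be sketched in a few lines of Python. The slides give only the one-line descriptions above, so the details here are assumptions: whitespace tokenization, a mixed run of "?" and "!" mapping to "?!", and hypothetical function and feature names.

```python
import re


def cue_words(post):
    """Initial unigram, bigram, and trigram of a post (whitespace tokens)."""
    tokens = post.split()
    return {
        "cue_unigram": " ".join(tokens[:1]),
        "cue_bigram": " ".join(tokens[:2]),
        "cue_trigram": " ".join(tokens[:3]),
    }


def collapse_punctuation(post):
    """Collapse runs of two or more ?/! characters into ??, !!, or ?!."""

    def collapse(match):
        run = match.group(0)
        if "?" in run and "!" in run:
            return "?!"  # any mixed run becomes ?!
        return run[0] * 2  # a pure run becomes ?? or !!

    return re.sub(r"[?!]{2,}", collapse, post)


print(cue_words("What studies? I have never seen ANY evidence"))
print(collapse_punctuation("That's your evidence???? Really?!?!"))
```

Collapsing punctuation before counting means "????" and "??" yield the same feature, which keeps the feature space small on short forum posts.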

LIWC Lexical Categories

Feature                  Example
Anger words              hate, kill, pissed
Metaphysical issues      God, heaven, coffin
Physical state/function  ache, breast, sleep
Inclusive words          with, and, include
Social processes         talk, us, friend
Family members           mom, brother, cousin
Past tense verbs         walked, were, had
References to friends    pal, buddy, coworker
Causation                because, know, ought
Discrepancy              should, would, could
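To make "LIWC measures and frequencies" concrete, the sketch below counts category hits per token for a post. It does not use the real LIWC dictionary: the tiny lexicon is built only from the example words on this slide, matching is exact rather than stem-based, and the names `LEXICON` and `category_frequencies` are hypothetical.

```python
import re
from collections import Counter

# Toy lexicon taken from the example words above; not the actual LIWC dictionary.
LEXICON = {
    "anger": {"hate", "kill", "pissed"},
    "causation": {"because", "know", "ought"},
    "discrepancy": {"should", "would", "could"},
}


def category_frequencies(post):
    """Return, for each category, the fraction of tokens that belong to it."""
    tokens = re.findall(r"[a-z']+", post.lower())
    counts = Counter()
    for token in tokens:
        for category, words in LEXICON.items():
            if token in words:
                counts[category] += 1
    total = len(tokens) or 1  # avoid division by zero on empty posts
    return {category: counts[category] / total for category in LEXICON}


freqs = category_frequencies("We should not kill because we could be wrong")
print(freqs)
```

Normalizing by post length keeps long and short posts comparable, which matters when posts range from one-liners to multi-paragraph rebuttals.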

Dependencies
- Dependencies derived using the Stanford Parser

Example post:
  Well, maybe the pistol and the hunting rifle, what do you need an automatic weapon? Have the deer gotten faster?

Features:
  'amod(weapon, automatic)', 'dep:dobj(need, weapon)', 'dep:det(weapon, an)', ...

Generalized Dependencies (POS)
- Generalized over POS of head word
- Joshi and Rosé 2009 show that semi-lexicalized dependency features are better than fully lexicalized or fully generalized

Example post:
  Well, maybe the pistol and the hunting rifle, what do you need an automatic weapon? Have the deer gotten faster?

Features:
  'amod(nn, automatic)', 'dep:dobj(vbp, weapon)', 'dep:det(nn, an)', ...
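The generalization step itself is a simple substitution: the head word of each dependency is replaced by its part-of-speech tag while the dependent stays lexical. The sketch below shows that substitution on hand-written triples taken from the example above. It does not call the Stanford Parser, it omits the "dep:" prefix that appears on some features, and the function name and input layout are hypothetical.

```python
def generalize(dependencies, pos_tags):
    """Replace the head word of each dependency with its lowercased POS tag.

    dependencies: list of (relation, head, dependent) triples.
    pos_tags: dict mapping each head word to its POS tag.
    """
    features = []
    for relation, head, dependent in dependencies:
        head_pos = pos_tags[head].lower()
        features.append(f"{relation}({head_pos}, {dependent})")
    return features


# Triples and tags written out by hand from the slide's example sentence.
dependencies = [
    ("amod", "weapon", "automatic"),
    ("dobj", "need", "weapon"),
    ("det", "weapon", "an"),
]
pos_tags = {"weapon": "NN", "need": "VBP"}

features = generalize(dependencies, pos_tags)
print(features)
```

Backing off only the head keeps the opinion-bearing or topical dependent word ("automatic") while letting the feature fire for any noun head, which is the semi-lexicalized middle ground the slide attributes to Joshi and Rosé 2009.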

Generalized Dependencies (Opinion)
- MPQA opinion dictionary over each word independently
- Intended to approximate Somasundaran & Wiebe's opinion target

Example post:
  but if I want to own a pistol, a shotgun, and some fancy automatic, well that's my right.

Features:
  'dep_opinion:amod(automatic, positive)', 'dep_opinion:poss(positive, my)', ...