Statistics, Politics, and Policy

Statistics, Politics, and Policy, Volume 3, Issue 1 (2012), Article 5

Comment on "Why and When 'Flawed' Social Network Analyses Still Yield Valid Tests of no Contagion"

Cosma Rohilla Shalizi, Carnegie Mellon University

Recommended Citation: Shalizi, Cosma Rohilla (2012) "Comment on Why and When 'Flawed' Social Network Analyses Still Yield Valid Tests of no Contagion," Statistics, Politics, and Policy: Vol. 3: Iss. 1, Article 5.
DOI: 10.1515/2151-7509.1053
© 2012 De Gruyter. All rights reserved.

Abstract

VanderWeele et al.'s paper is a useful contribution to the on-going scientific conversation about the detection of contagion from purely observational data. It is especially helpful as a corrective to some of the more extreme statements of Lyons (2011). Unfortunately, this paper, too, goes too far in some places, and so needs some correction itself.

KEYWORDS: social networks, causal inference, contagion, social influence

Shalizi: Comment on VanderWeele et al.

The paper by VanderWeele et al. is a useful contribution to the on-going scientific conversation about the detection of contagion from purely observational data. It is especially helpful as a corrective to some of the more extreme statements of Lyons (2011). Unfortunately, this paper, too, goes too far in some places, and so needs some correction itself.

To begin with, Lyons was so unrelentingly hostile in his paper, from the title onwards, that it's quite natural and even laudable to want to defend the objects of his attack. That said, look at exactly what is being offered here as a defense. There is no disputing Lyons's claim that the Christakis and Fowler (2007) model [1] is, in the strictest mathematical sense, simply meaningless, unless there is no contagion. The present paper says that estimating this nonsensical model allows one to test, not a clean hypothesis of no contagion, but rather a joint hypothesis of no contagion and no latent homophily and the complete correctness of the specification for the observed covariates.

Suppose I take my data and test this joint hypothesis with the model, and I reject it. It seems to me that there are two big obstacles in the way of saying that I have really tested the hypothesis of no contagion, rejected it, and so can infer contagion with some modicum of confidence.

1. The power of the test is quite unknown. But unless the test has power, it doesn't provide any evidence for an inference (Mayo, 1996; Mayo and Cox, 2006). More exactly, it provides no more evidence than what my teachers called a Gygax test, which generates an independent random number between 0 and 1, and rejects if the number falls into an appropriately-sized interval [2]. Specifically, one would need to know how much power the test had to detect departures from the null hypothesis in the direction of contagion. Since the model in question cannot, mathematically, be extended to allow for contagion, finding this power seems like a hard thing to do. I don't want to say it's impossible, but it is a pre-requisite for the test to have any scientific value.

2. Assuming the joint null hypothesis is rejected, how is one to know which component is at fault? Leaving latent homophily aside for the moment, in my experience of applied statistics I can recall exactly one case where a generalized linear model has actually passed even moderately severe mis-specification checks [3]. (Perhaps the authors of the present paper have been luckier than me in this regard.) Unless some guidance can be given for reliably locating the problem in the hypothesis of no contagion, it's a stretch to say that this tests no contagion, as opposed to a model which posits that, along with a lot of a priori most dubious assumptions.

[1] I'll join everyone else in calling it their model, but, if I can decipher their somewhat obscure citations, they actually took it from Valente (2005), which gives the impression that it is common in both the network-epidemiology and diffusion-of-innovations literatures.

[2] If invoking an independent random number feels like cheating, substitute complicated calculations which are sensitive only to low-significance digits in continuous quantities.

I am happy to agree with VanderWeele et al. that matters are better when one uses the model where ego's state at time t is supposed to be caused by alter's state at time t - 1, and that then some of Lyons's criticisms lose their force. That model at least gives rise to a self-consistent stochastic process when there is contagion, so it is not necessarily wrong. One can even give it a coherent causal interpretation, unlike the simultaneous-regression model. (That is why we used the time-delayed model in Shalizi and Thomas (2011).) My point about power above is at least mitigated, since the power of the test proposed, assuming contagion but maintaining the other assumptions, could at least be directly approximated by simulation. In all, I can think of no reason for ever using the simultaneous model.

This however still leaves the matter of what one learns from rejecting the joint null hypothesis. The issue of latent homophily returns here. VanderWeele (2011) is a truly ingenious paper, which advanced the field by providing the second approach [4] to something like partial identification, as called for in Shalizi and Thomas (2011, §4.2). However, it did so under very strong parametric and substantive assumptions, such as, e.g., all latent homophily being due to a single binary variable, which interacts with observables in very specific and limiting ways.
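The earlier remark that the power of the time-delayed test "could at least be directly approximated by simulation" can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from any of the papers under discussion: the toy model (independent ego-alter pairs, a binary alter state at time t - 1, a logistic link with a hypothetical contagion coefficient `beta`), the two-proportion z-test used as the test statistic, and the names `simulate_dyads`, `z_test_rejects`, `gygax_rejects` and `rejection_rate`. Covariates, network structure and latent homophily are all left out. A Gygax test is included as the no-power baseline.

```python
import math
import random


def simulate_dyads(n, beta, rng, intercept=-1.0):
    """Draw n (alter_prev, ego_now) pairs from a toy time-delayed model.

    alter_prev is alter's binary state at time t-1; ego_now is ego's binary
    state at time t, with log-odds equal to intercept + beta * alter_prev.
    beta = 0 is the null of no contagion (all other assumptions maintained).
    """
    pairs = []
    for _ in range(n):
        alter_prev = 1 if rng.random() < 0.5 else 0
        p = 1.0 / (1.0 + math.exp(-(intercept + beta * alter_prev)))
        ego_now = 1 if rng.random() < p else 0
        pairs.append((alter_prev, ego_now))
    return pairs


def z_test_rejects(pairs, z_crit=1.96):
    """Two-sided two-proportion z-test at a nominal 5% level."""
    n1 = sum(a for a, _ in pairs)          # pairs with alter_prev == 1
    n0 = len(pairs) - n1                   # pairs with alter_prev == 0
    if n0 == 0 or n1 == 0:
        return False                       # cannot compare; do not reject
    y1 = sum(e for a, e in pairs if a == 1)
    y0 = sum(e for a, e in pairs if a == 0)
    pooled = (y1 + y0) / (n1 + n0)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n0))
    if se == 0.0:
        return False
    return abs(y1 / n1 - y0 / n0) / se > z_crit


def gygax_rejects(rng, alpha=0.05):
    """Reject with probability alpha, ignoring the data entirely."""
    return rng.random() < alpha


def rejection_rate(beta, n=400, reps=1000, seed=0):
    """Monte Carlo rejection frequencies for the z-test and the Gygax test."""
    rng = random.Random(seed)
    real = sum(z_test_rejects(simulate_dyads(n, beta, rng)) for _ in range(reps))
    gygax = sum(gygax_rejects(rng) for _ in range(reps))
    return real / reps, gygax / reps


if __name__ == "__main__":
    for beta in (0.0, 0.5, 1.0):
        real, gygax = rejection_rate(beta)
        print(f"beta={beta}: z-test rejects {real:.3f}, Gygax rejects {gygax:.3f}")
```

With `beta = 0` both rejection frequencies should sit near the nominal level. As `beta` grows, the z-test's frequency rises while the Gygax test's stays put, which is the sense in which the latter supplies no evidence. The sketch speaks only to power under the maintained assumptions. It says nothing about which component of the joint null is at fault when it is rejected.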
Proving results under the restrictions VanderWeele (2011) imposes is more than anyone else has done, but before one appeals to the results in empirical problems, one needs to either have some scientific reason to think the restrictions hold, or a mathematical reason to think that the conclusions are robust to substantial departures from those assumptions. Since those mathematical reasons are, at least for now, unavailable, we are forced to rely on scientific knowledge. Is anyone prepared to argue that we ought, on biological or sociological grounds, to think that everything relevant to friendship formation and obesity (in suburban Massachusetts) boils down to one binary variable?

[3] The exception involved a lot of work with the client to craft covariates which were highly nonlinear in the raw data.

[4] After Ver Steeg and Galstyan (2010), who however have to assume that the latent variables have the same relationship to observables at all times, i.e., that aging does not matter.

To sum up, there seem to me to be three major weaknesses with the argument of the present paper.

1. If the sensitivity analysis of VanderWeele (2011) is to be invoked, the assumptions underlying that analysis must be shown to apply.

2. If rejecting the null hypothesis of no-contagion-and-no-latent-homophily-and-completely-correct-specification-of-everything-else is to provide evidence of contagion, then
   (a) it must be shown that the test has power to detect departures from the null in the direction of contagion; and
   (b) there really ought to be some guidance as to how one tells that the problem with the null is contagion, specifically.

I suspect that these weak points can be patched up, but they do need repair.

References

Christakis, N. A. and J. H. Fowler (2007): "The spread of obesity in a large social network over 32 years," The New England Journal of Medicine, 357, 370-379, URL http://content.nejm.org/cgi/content/abstract/357/4/370.

Lyons, R. (2011): "The spread of evidence-poor medicine via flawed social-network analysis," Statistics, Politics, and Policy, 2, URL http://arxiv.org/abs/1007.2876.

Mayo, D. G. (1996): Error and the Growth of Experimental Knowledge, Chicago: University of Chicago Press.

Mayo, D. G. and D. R. Cox (2006): "Frequentist statistics as a theory of inductive inference," in J. Rojo, ed., Optimality: The Second Erich L. Lehmann Symposium, Bethesda, Maryland: Institute of Mathematical Statistics, 77-97, URL http://arxiv.org/abs/math.st/0610846.

Shalizi, C. R. and A. C. Thomas (2011): "Homophily and contagion are generically confounded in observational social network studies," Sociological Methods and Research, 40, 211-239, URL http://arxiv.org/abs/1004.4704.

Valente, T. W. (2005): "Network models and methods for studying the diffusion of innovations," in P. J. Carrington, J. Scott, and S. Wasserman, eds., Models and Methods in Social Network Analysis, Cambridge, England: Cambridge University Press, 98-116.

VanderWeele, T. J. (2011): "Sensitivity analysis for contagion effects in social networks," Sociological Methods and Research, 40, 240-255.

Ver Steeg, G. and A. Galstyan (2010): "Ruling out latent homophily in social networks," in NIPS Workshop on Social Computing, URL http://mlg.cs.purdue.edu/lib/exe/fetch.php?id=schedule&cache=cache&media=machine_learning_group:projects:paper19.pdf.

Published by De Gruyter, 2012