Scientific errors should be controlled, not prevented. Daniel Lakens (@Lakens), Eindhoven University of Technology

1) Error control is the central aim of empirical science.

2) We need statistical decision theory to manage scientific progress.

Many empirical scientists are scientific realists.

When scientific theories successfully make novel predictions, that is a good reason to believe they are more or less true (have verisimilitude).

Feyerabend: Remain agnostic. Van Fraassen: Believe in empirical adequacy. Scientific realism: Verisimilitude (truth-likeness).

Verisimilitude is an ontological, not epistemological question. Niiniluoto, 1998

In practice, the notion that theories have money in the bank (Meehl, 1990)

We don't need to know the truth as long as we move towards it (comparative scientific realism). Kuipers, 2016

One way to do this is by successfully predicting novel features of the world.

          Responses: Color Naming   Responses: Word Naming
World 1   Slower                    Slower
World 2   Slower                    Not Slower
World 3   Not Slower                Slower
World 4   Not Slower                Not Slower

What matters is whether theories are truthlike, not whether you believe they are truthlike.

As to degree of corroboration, it is nothing but a measure of the degree to which a hypothesis h has been tested, and of the degree to which it has stood up to tests. It must not be interpreted, therefore, as a degree of the rationality of our belief in the truth of h. Popper, 2012, p. 434

We do not care what you believe, we barely care what we believe, what we are interested in is what you can show. Taper and Lele (2011)

From the axiomatic foundational definition of probability, Bayesianism is doomed to answer questions irrelevant to science. Taper and Lele (2011)

Good research practices

Bem, 2011

Features can be identified through methodological falsificationism. Lakatos (1978)

Probabilistic statements can be made falsifiable by specifying certain rejection rules which may render statistically interpreted evidence 'inconsistent' with the probabilistic theory. Lakatos (1978)

We are inclined to think that as far as a particular hypothesis is concerned, no test based upon the theory of probability can by itself provide any valuable evidence of the truth or falsehood of that hypothesis. But we may look at the purpose of tests from another view-point. Without hoping to know whether each separate hypothesis is true or false, we may search for rules to govern our behaviour with regard to them, in following which we insure that, in the long run of experience, we shall not be too often wrong. Neyman and Pearson (1933)

If we agree error control is a central aim, the real question is how to do so optimally.

Type 2 errors have largely been ignored (in psychology).

Studies in psychology often have low power. Estimates average around 50%. Cohen, 1962; Fraley & Vazire, 2014

Non-significant studies should be expected: with 80% power per study, the probability that four studies are all significant is 0.8 × 0.8 × 0.8 × 0.8 = 0.41.
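The arithmetic on this slide can be checked in a few lines. This is a minimal sketch that assumes the four studies are independent and that each examines a true effect with exactly 80% power; the variable names are illustrative only.

```python
# Probability that every study in a set is significant,
# assuming independent studies of a true effect with equal power.
power = 0.8      # power per study, as on the slide
n_studies = 4    # number of studies in the set

p_all_significant = power ** n_studies
p_at_least_one_nonsignificant = 1 - p_all_significant

print(f"P(all {n_studies} significant)       = {p_all_significant:.2f}")
print(f"P(at least one non-significant) = {p_at_least_one_nonsignificant:.2f}")
```

Under these assumptions, a set of four studies with at least one non-significant result is more likely than a set in which all four are significant.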

Researchers need to assign a utility, u(e, z, a, θ), to performing an experiment {e}, observing a statistical outcome {z}, and taking an action {a} (follow up or abandon), depending on the true state of the world {θ}.
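One way such utilities might be put to work is sketched below. The decision rule is "follow up if the result is significant, abandon otherwise", and the sketch computes its expected utility separately under each state of the world. All utility values are hypothetical and chosen only for illustration. As a simplification, the utility here depends only on the action and the true state, so the cost of the experiment itself is left out.

```python
def expected_utility(alpha, power, utilities):
    """Expected utility of the rule 'follow up if significant, abandon otherwise',
    computed separately for each true state of the world (H0 true or H1 true)."""
    under_h0 = (alpha * utilities[("follow_up", "H0")]
                + (1 - alpha) * utilities[("abandon", "H0")])
    under_h1 = (power * utilities[("follow_up", "H1")]
                + (1 - power) * utilities[("abandon", "H1")])
    return {"H0": under_h0, "H1": under_h1}


# Hypothetical utilities (arbitrary units), keyed by (action, true state).
utilities = {
    ("follow_up", "H0"): -10.0,  # false positive: resources spent on a null effect
    ("abandon", "H0"): 0.0,      # true negative
    ("follow_up", "H1"): 20.0,   # true positive
    ("abandon", "H1"): -5.0,     # false negative: a real effect is dropped
}

result = expected_utility(alpha=0.05, power=0.80, utilities=utilities)
print(result)
```

Changing alpha or power changes both expected utilities at once, which is why the error rates and the utilities need to be considered together.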

Assigning utilities is essential for a coherent approach to science.

We need more applied work on setting alpha levels.

We need more applied work on controlling alpha levels.

Ask any empirical scientist if one-sided testing is allowed, and you'll know what I mean.

The first is the decision made by the individual experimenter who frequently plans one experiment from his evaluation of a previous one. We concede that here a one-tailed test is often proper. The second is the decision which determines the place of his findings in the literature of psychology. Here the one-tailed test seems inadmissible. Burke, 1953, p. 385

Some people say they will never publish something without first replicating. Two significant results in a row imply a joint Type 1 error rate of 0.05 × 0.05 = 0.0025.

Is it more valuable to show an effect three times with N = 300, or once with N = 900?
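A rough way to explore this question is to compare error rates for the two strategies. The sketch below assumes a two-group design with equal group sizes, a two-sided z-test (normal approximation), and a hypothetical true effect of d = 0.3. None of these details are given on the slides, and the function name is illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_two_groups(d, n_total, alpha):
    """Approximate power of a two-sided, two-group z-test with equal group sizes."""
    n_per_group = n_total / 2
    ncp = d * (n_per_group / 2) ** 0.5            # expected z statistic under H1
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)    # two-sided critical value
    return (1 - STD_NORMAL.cdf(z_crit - ncp)) + STD_NORMAL.cdf(-z_crit - ncp)


d = 0.3        # hypothetical true effect size (an assumption, not from the slides)
alpha = 0.05

# Strategy A: three studies with N = 300 each, all required to be significant.
joint_alpha = alpha ** 3
power_three = power_two_groups(d, 300, alpha) ** 3

# Strategy B: one study with N = 900, tested at the same joint alpha level.
power_one_matched = power_two_groups(d, 900, joint_alpha)

print(f"joint alpha of three studies:       {joint_alpha:.6f}")
print(f"power, three studies of N = 300:    {power_three:.3f}")
print(f"power, one study of N = 900:        {power_one_matched:.3f}")
```

With these assumed values, the single large study has the higher power at the same Type 1 error rate. The comparison covers error rates only and leaves out other reasons to replicate.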

We need even more applied work on controlling Type 2 error rates.

Where to start?
- Real-life costs & benefits
- Theoretical models

Plan for the change you would like to see in the world. Ask yourself: What is your smallest effect size of interest (SESOI)?

Requires you to specify H1! That's a good thing. What does your theory predict, or what do you care about if H0 is false?

If we don't, science becomes unfalsifiable. We can never accept the null.

Researcher: But I'm not interested in the size of the effect; the presence of any effect supports my theory!

Detecting d = 0.001 requires 42 million people.
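The slide does not state which design or error rates sit behind this number. As a sketch, the standard normal-approximation formula for two independent groups gives a total of roughly 42 million when a two-sided alpha of 0.05 and 90% power are assumed. Both values are assumptions made here, not taken from the talk.

```python
import math
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.90):
    """Normal-approximation sample size per group for a two-sided,
    two-independent-groups comparison of means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)


total_n = 2 * n_per_group(0.001)
print(f"Total sample size for d = 0.001: {total_n:,}")
```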

You make implicit choices about which effects are too small to matter all the time.

If you expect a medium effect size (d = 0.5) and plan for 80% power, observed effects smaller than d = 0.35 will never be significant.
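The 0.35 figure follows from the sample size that such a power analysis returns. The sketch below assumes two independent groups, a two-sided alpha of 0.05, and a normal approximation. A t-based calculation gives a slightly larger sample size and nearly the same critical effect.

```python
import math
from statistics import NormalDist

z = NormalDist()
alpha, power, d_expected = 0.05, 0.80, 0.5   # medium effect, 80% power

z_crit = z.inv_cdf(1 - alpha / 2)

# Sample size per group needed to detect d_expected with the requested power.
n = math.ceil(2 * ((z_crit + z.inv_cdf(power)) / d_expected) ** 2)

# Smallest observed effect that reaches significance with n per group.
d_critical = z_crit * math.sqrt(2 / n)

print(f"n per group: {n}")
print(f"smallest significant observed d: {d_critical:.2f}")
```

Any observed effect below `d_critical` yields p > alpha in this design, so the planned sample size already fixes which effects can ever be declared significant.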

If nothing else, the maximum sample you are willing to collect determines your SESOI.

When thinking about utilities, the sample size researchers are willing to collect is often the easiest to quantify.

At least initially, we can bootstrap what we care about, based on the resources we want to invest.

In time, we might need to collaborate to control errors for our SESOI.

Now you can also reject effects as large as, or larger than, your SESOI, using an equivalence test.

R package (TOSTER) & Excel
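To illustrate the logic of an equivalence test, here is a minimal sketch of two one-sided tests (TOST) from summary statistics. It is not the TOSTER interface. It uses a normal approximation where a t distribution would normally be used, and the function name, arguments, and example numbers are all hypothetical.

```python
import math
from statistics import NormalDist


def tost_z(mean1, mean2, sd1, sd2, n1, n2, low_d, high_d, alpha=0.05):
    """Two one-sided z-tests for equivalence of two independent means.

    Equivalence bounds are given in standardized units (Cohen's d) and
    converted to raw units with the pooled standard deviation.
    """
    sd_pooled = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    se = sd_pooled * math.sqrt(1 / n1 + 1 / n2)
    diff = mean1 - mean2
    low, high = low_d * sd_pooled, high_d * sd_pooled

    z = NormalDist()
    p_lower = 1 - z.cdf((diff - low) / se)   # tests H0: diff <= lower bound
    p_upper = z.cdf((diff - high) / se)      # tests H0: diff >= upper bound
    p_tost = max(p_lower, p_upper)           # both one-sided tests must reject
    return p_tost, p_tost < alpha


# Hypothetical summary data with a SESOI of d = 0.3 in either direction.
p_large, equivalent_large = tost_z(5.03, 5.00, 1.0, 1.0, 500, 500, -0.3, 0.3)
p_small, equivalent_small = tost_z(5.03, 5.00, 1.0, 1.0, 20, 20, -0.3, 0.3)
print(equivalent_large, equivalent_small)
```

With the same observed difference, the large sample rejects effects as large as the SESOI while the small sample does not, because the small sample has too little precision.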

If effect sizes are uncertain, sequential analyses let you collect data at lower costs.

Optional stopping: Collecting data until p < 0.05 inflates the Type 1 error.

Sequential analysis controls Type 1 error rates (e.g., Pocock correction).

Wald, 1945

Pocock boundary
Number of analyses   p-value threshold
2                    0.0294
3                    0.0221
4                    0.0182
5                    0.0158
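A small simulation can show the difference between uncorrected optional stopping and a Pocock threshold. This sketch assumes a one-sample z-test with known variance, five equally spaced looks, and data generated under H0. The function name, batch size, number of simulations, and seed are arbitrary choices.

```python
import random
from statistics import NormalDist


def type1_error_rate(n_looks, p_threshold, n_sims=20000, n_per_look=20, seed=1):
    """Simulate a two-sided one-sample z-test under H0 with interim looks,
    stopping at the first look where p falls below p_threshold."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - p_threshold / 2)
    false_positives = 0
    for _ in range(n_sims):
        total = 0.0
        for look in range(1, n_looks + 1):
            # Sum of n_per_look standard-normal observations, drawn in one step.
            total += rng.gauss(0.0, n_per_look ** 0.5)
            z_stat = total / (look * n_per_look) ** 0.5
            if abs(z_stat) > z_crit:
                false_positives += 1
                break
    return false_positives / n_sims


naive = type1_error_rate(n_looks=5, p_threshold=0.05)      # test at 0.05 every look
pocock = type1_error_rate(n_looks=5, p_threshold=0.0158)   # Pocock threshold, 5 looks
print(f"uncorrected: {naive:.3f}, Pocock: {pocock:.3f}")
```

In this setup the uncorrected procedure rejects a true null well above 5% of the time, while the Pocock threshold keeps the overall rate close to 5%.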

Error control is an important goal that can only be achieved by quantifying utilities.

Thanks! @Lakens http://daniellakens.blogspot.nl/