ECE 5424: Introduction to Machine Learning


ECE 5424: Introduction to Machine Learning
Topics:
- (Finish) Regression
- Model selection, Cross-validation
- Error decomposition
Readings: Barber 17.1, 17.2
Stefan Lee
Virginia Tech

Administrative
- Project Proposal
  - Due: Fri 09/23, 11:55 pm (NOTE: DEADLINE SHIFTED)
  - <= 2 pages, NIPS format
- HW2
  - Due: Wed 09/28, 11:55 pm
  - Implement linear regression, Naïve Bayes, Logistic Regression
- Reminder: Participation on Scholar forum is part of your grade
  - Ask questions if you have them!
(C) Dhruv Batra

Recap of last time

Regression

(C) Dhruv Batra Slide Credit: Greg Shakhnarovich 5

But, why?
Why sum squared error???
Gaussians, Watson, Gaussians
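The "Gaussians" answer can be spelled out as a short derivation. This is a sketch, assuming N i.i.d. training pairs (x_i, y_i) and the Gaussian conditional y ~ N(w^T x, σ^2) used elsewhere in these slides; the index N and the subscript notation are assumptions of this note, not taken from the slide.

```latex
\begin{align*}
\log P(y_{1:N} \mid x_{1:N}, w)
  &= \sum_{i=1}^{N} \log \mathcal{N}\!\left(y_i \mid w^\top x_i, \sigma^2\right) \\
  &= -\frac{N}{2}\log\!\left(2\pi\sigma^2\right)
     - \frac{1}{2\sigma^2}\sum_{i=1}^{N}\left(y_i - w^\top x_i\right)^2 \\
\hat{w}_{\mathrm{ML}}
  &= \arg\max_w \log P(y_{1:N} \mid x_{1:N}, w)
   = \arg\min_w \sum_{i=1}^{N}\left(y_i - w^\top x_i\right)^2
\end{align*}
```

The first term does not depend on w, so maximizing the Gaussian log-likelihood is the same as minimizing the sum of squared errors.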

Is OLS Robust?
- Demo: http://www.calpoly.edu/~srein/statdemo/all.html
- Bad things happen when the data does not come from your model!
- How do we fix this?

Robust Linear Regression
y ~ Lap(w^T x, b)
On paper
[Figure: L2, L1, and Huber loss curves]
[Figure: least squares vs. Laplace fits to linear data with noise and outliers]
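A Laplace likelihood turns the squared-error objective into a sum of absolute residuals, which is what makes the fit less sensitive to outliers. The sketch below compares the two on synthetic data. The solver (iteratively reweighted least squares), the function names, and the data are illustrative assumptions, not the method shown in lecture.

```python
import numpy as np

def fit_least_squares(X, y):
    """Ordinary least squares: minimizes the sum of squared residuals."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def fit_least_absolute(X, y, n_iter=100, eps=1e-6):
    """Approximate L1 (Laplace-likelihood) fit via iteratively reweighted least squares."""
    w = fit_least_squares(X, y)
    for _ in range(n_iter):
        r = np.abs(y - X @ w)
        # Weight 1/|r| makes the weighted squared loss mimic the absolute loss.
        weights = 1.0 / np.maximum(r, eps)
        Xw = X * weights[:, None]
        w = np.linalg.solve(X.T @ Xw, Xw.T @ y)
    return w

# Synthetic linear data with a few large outliers (assumed for illustration).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
w_true = np.array([1.0, 2.0])            # intercept, slope
X = np.column_stack([np.ones_like(x), x])
y = X @ w_true + 0.05 * rng.normal(size=x.size)
y[-5:] += 10.0                           # corrupt the last five targets

w_ls = fit_least_squares(X, y)
w_l1 = fit_least_absolute(X, y)
print("least squares:", w_ls)
print("least absolute:", w_l1)
```

On data like this, the least-absolute fit should stay much closer to the true line than the least-squares fit, because each outlier contributes linearly rather than quadratically to the objective.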

Plan for Today
- (Finish) Regression
  - Bayesian Regression
  - Different prior vs likelihood combination
  - Polynomial Regression
- Error Decomposition
  - Bias-Variance
  - Cross-validation

Robustify via Prior
Ridge Regression
y ~ N(w^T x, σ^2)
w ~ N(0, t^2 I)
P(w | x, y) =
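With a Gaussian likelihood and a zero-mean Gaussian prior, maximizing the posterior over w is equivalent to minimizing ||y - Xw||^2 + λ||w||^2 with λ = σ^2 / t^2. The sketch below computes that MAP estimate in closed form. The function name `ridge_map` and the synthetic data are assumptions for illustration.

```python
import numpy as np

def ridge_map(X, y, sigma2, t2):
    """MAP weights under y ~ N(Xw, sigma2 * I) and w ~ N(0, t2 * I)."""
    lam = sigma2 / t2                      # regularization strength
    d = X.shape[1]
    # Setting the gradient to zero gives (X^T X + lam I) w = X^T y.
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=30)

w_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
w_loose = ridge_map(X, y, sigma2=0.01, t2=1e8)    # nearly flat prior
w_tight = ridge_map(X, y, sigma2=0.01, t2=1e-4)   # strong pull toward zero
print(w_ols, w_loose, w_tight)
```

A very broad prior (large t^2) recovers ordinary least squares, while a narrow prior shrinks the weights toward zero.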

Summary
Likelihood   Prior      Name
Gaussian     Uniform    Least Squares
Gaussian     Gaussian   Ridge Regression
Gaussian     Laplace    Lasso
Laplace      Uniform    Robust Regression
Student      Uniform    Robust Regression

Example
Demo: http://www.princeton.edu/~rkatzwer/polynomialregression/

What you need to know
- Linear Regression
  - Model
  - Least Squares Objective
  - Connections to Max Likelihood with Gaussian Conditional
  - Robust regression with Laplacian Likelihood
  - Ridge Regression with priors
  - Polynomial and General Additive Regression
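Polynomial regression, the last item above, is still linear regression: the model is linear in the weights once the input is expanded into powers of x. A minimal sketch, with hypothetical helper names `poly_features` and `fit_poly`:

```python
import numpy as np

def poly_features(x, degree):
    """Map scalar inputs to the columns [1, x, x^2, ..., x^degree]."""
    return np.vander(x, N=degree + 1, increasing=True)

def fit_poly(x, y, degree):
    """Least-squares weights for a polynomial of the given degree."""
    w, *_ = np.linalg.lstsq(poly_features(x, degree), y, rcond=None)
    return w

# Noise-free quadratic, so the fit should recover the coefficients.
x = np.linspace(-1.0, 1.0, 20)
y = 1.0 - 2.0 * x + 3.0 * x**2
w = fit_poly(x, y, degree=2)
print(w)
```

Swapping `poly_features` for any other fixed set of basis functions gives the general additive form without changing the solver.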

New Topic: Model Selection and Error Decomposition

Example for Regression
- Demo: http://www.princeton.edu/~rkatzwer/polynomialregression/
- How do we pick the hypothesis class?

Model Selection
- How do we pick the right model class?
- Similar questions
  - How do I pick magic hyper-parameters?
  - How do I do feature selection?
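One common answer to these questions is k-fold cross-validation, listed in the topics for this lecture: score each candidate on held-out folds and keep the one with the lowest average error. The sketch below picks a polynomial degree this way. The function `kfold_mse`, the synthetic sine data, and the candidate range are assumptions for illustration.

```python
import numpy as np

def kfold_mse(x, y, degree, k=5, seed=0):
    """Average held-out mean squared error of a polynomial fit of the given degree."""
    idx = np.random.default_rng(seed).permutation(len(x))
    folds = np.array_split(idx, k)
    errs = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        Phi_tr = np.vander(x[train], N=degree + 1, increasing=True)
        Phi_va = np.vander(x[val], N=degree + 1, increasing=True)
        # Fit on the k-1 training folds, score on the held-out fold.
        w, *_ = np.linalg.lstsq(Phi_tr, y[train], rcond=None)
        errs.append(np.mean((Phi_va @ w - y[val]) ** 2))
    return float(np.mean(errs))

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=60)
y = np.sin(3.0 * x) + 0.1 * rng.normal(size=60)

scores = {d: kfold_mse(x, y, d) for d in range(0, 10)}
best_degree = min(scores, key=scores.get)
print(best_degree, scores[best_degree])
```

The same loop works for any hyper-parameter or feature subset: only the fitting step inside the loop changes.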

Errors
- Expected Loss/Error
- Training Loss/Error
- Validation Loss/Error
- Test Loss/Error
- Reporting Training Error (instead of Test) is CHEATING
- Optimizing parameters on Test Error is CHEATING
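The two "cheating" rules translate into a simple protocol: fit on the training split, choose hyper-parameters on the validation split, and touch the test split once at the end. A sketch under assumed choices (60/20/20 split, ridge regression, synthetic data, and a small grid of λ values):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 300, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.5 * rng.normal(size=n)

# Disjoint index sets for the three splits.
idx = rng.permutation(n)
train, val, test = idx[:180], idx[180:240], idx[240:]

def fit_ridge(X, y, lam):
    """Closed-form L2-regularized least squares."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def mse(X, y, w):
    return float(np.mean((X @ w - y) ** 2))

# Hyper-parameter is chosen on the validation split only.
lams = [0.0, 0.1, 1.0, 10.0, 100.0]
val_err = {lam: mse(X[val], y[val], fit_ridge(X[train], y[train], lam))
           for lam in lams}
best_lam = min(val_err, key=val_err.get)

# Test error is computed once, after all choices are fixed.
w = fit_ridge(X[train], y[train], best_lam)
train_err = mse(X[train], y[train], w)
test_err = mse(X[test], y[test], w)
print(best_lam, train_err, test_err)
```

The reported number is `test_err`; `train_err` and `val_err` are diagnostics, since both were used to make decisions about the model.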

Typical Behavior

Overfitting
Slide Credit: Carlos Guestrin
Overfitting: a learning algorithm overfits the training data if it outputs a solution w when there exists another solution w' such that:
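The condition itself is not captured in this transcript. The usual statement of this definition (it appears in Mitchell's Machine Learning textbook) is sketched below; the symbols error_train and error_true are notation assumed here, standing for error on the training set and expected error on the data distribution.

```latex
\[
\mathrm{error}_{\mathrm{train}}(w) < \mathrm{error}_{\mathrm{train}}(w')
\quad\text{and}\quad
\mathrm{error}_{\mathrm{true}}(w') < \mathrm{error}_{\mathrm{true}}(w)
\]
```

In words: w looks better than w' on the training data but is actually worse on unseen data.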

Error Decomposition
[Figure: diagram labeled "Reality" and "Higher-Order Potentials"]

Error Decomposition
- Approximation/Modeling Error
  - You approximated reality with model
- Estimation Error
  - You tried to learn model with finite data
- Optimization Error
  - You were lazy and couldn't/didn't optimize to completion
- (Next time) Bayes Error
  - Reality just sucks