ECE 6504: Deep Learning for Perception

ECE 6504: Deep Learning for Perception
Topics:
- Recurrent Neural Networks (RNNs)
- BackProp Through Time (BPTT)
- Vanishing / Exploding Gradients
- [Abhishek:] Lua / Torch Tutorial
Dhruv Batra
Virginia Tech

Administrativia
- HW3: out today, due in 2 weeks
- Please please please please please start early
- https://computing.ece.vt.edu/~f15ece6504/homework3/
(C) Dhruv Batra 2

Plan for Today
- Model: Recurrent Neural Networks (RNNs)
- Learning: BackProp Through Time (BPTT), Vanishing / Exploding Gradients
- [Abhishek:] Lua / Torch Tutorial

New Topic: RNNs
Image Credit: Andrej Karpathy

Synonyms
- Recurrent Neural Networks (RNNs)
- Recursive Neural Networks
  - General family; think graphs instead of chains
- Types:
  - Long Short Term Memory (LSTMs)
  - Gated Recurrent Units (GRUs)
  - Hopfield network
  - Elman networks
- Algorithms
  - BackProp Through Time (BPTT)
  - BackProp Through Structure (BPTS)

What's wrong with MLPs?
- Problem 1: Can't model sequences
  - Fixed-sized inputs & outputs
  - No temporal structure
- Problem 2: Pure feed-forward processing
  - No memory, no feedback
Image Credit: Alex Graves, book

Sequences are everywhere
Image Credit: Alex Graves and Kevin Gimpel

Even where you might not expect a sequence
Image Credit: Vinyals et al.

Even where you might not expect a sequence
Input ordering = sequence
Image Credit: Ba et al.; Gregor et al.

Image Credit: [Pinheiro and Collobert, ICML14]

Why model sequences?
Figure Credit: Carlos Guestrin

Why model sequences?
Image Credit: Alex Graves

Name that model
- Hidden variables: Y_t = {a, z} for t = 1, ..., 5
- Observations: X_1, ..., X_5 (shown as images on the original slide)
- Answer: Hidden Markov Model (HMM)
Figure Credit: Carlos Guestrin
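For reference, the standard HMM factorization for a chain of this length can be written as below. This is a sketch that assumes the usual first-order Markov and emission independence assumptions; the slide itself only names the model.

```latex
P(X_{1:5}, Y_{1:5}) \;=\; P(Y_1) \prod_{t=2}^{5} P(Y_t \mid Y_{t-1}) \prod_{t=1}^{5} P(X_t \mid Y_t)
```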

How do we model sequences?
No input
Image Credit: Bengio, Goodfellow, Courville

How do we model sequences?
With inputs
Image Credit: Bengio, Goodfellow, Courville

How do we model sequences?
With inputs and outputs
Image Credit: Bengio, Goodfellow, Courville

How do we model sequences?
With Neural Nets
Image Credit: Alex Graves

How do we model sequences? It's a spectrum
- Input: No sequence, Output: No sequence
  Example: standard classification / regression problems
- Input: No sequence, Output: Sequence
  Example: Im2Caption
- Input: Sequence, Output: No sequence
  Example: sentence classification, multiple-choice question answering
- Input: Sequence, Output: Sequence
  Example: machine translation, video captioning, open-ended question answering, video question answering
Image Credit: Andrej Karpathy

Things can get arbitrarily complex
Image Credit: Herbert Jaeger

Key Ideas
- Parameter Sharing + Unrolling
  - Keeps number of parameters in check
  - Allows arbitrary sequence lengths!
- Depth
  - Measured in the usual sense of layers
  - Not unrolled timesteps
- Learning
  - Is tricky even for shallow models due to unrolling
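Parameter sharing and unrolling can be sketched in a few lines of plain Python. This is a minimal illustration, not the course's Torch code: the names `rnn_forward`, `W_xh`, `W_hh`, `b_h`, the tanh nonlinearity, and the zero initial state are all assumptions made for the example.

```python
import math
import random

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rnn_forward(xs, W_xh, W_hh, b_h):
    """Unroll a vanilla tanh RNN over xs; return the hidden state at each step."""
    h = [0.0] * len(W_hh)                      # h_0 = 0 (assumed initial state)
    hs = []
    for x in xs:                               # one unrolled step per input
        pre = [a + b + c for a, b, c in zip(matvec(W_xh, x), matvec(W_hh, h), b_h)]
        h = [math.tanh(p) for p in pre]        # the same weights are reused every step
        hs.append(h)
    return hs

def rand_matrix(rows, cols):
    """Small random weights; the scale 0.1 is an arbitrary choice for the demo."""
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

random.seed(0)
input_dim, hidden_dim = 3, 5
W_xh = rand_matrix(hidden_dim, input_dim)
W_hh = rand_matrix(hidden_dim, hidden_dim)
b_h = [0.0] * hidden_dim
n_params = hidden_dim * input_dim + hidden_dim * hidden_dim + hidden_dim

# The parameter count is fixed, yet sequences of different lengths are handled.
lengths = {}
for T in (4, 11):
    xs = [[random.gauss(0.0, 1.0) for _ in range(input_dim)] for _ in range(T)]
    hs = rnn_forward(xs, W_xh, W_hh, b_h)
    lengths[T] = len(hs)
    print(f"T={T}: {len(hs)} hidden states, {n_params} parameters")
```

The loop is the "unrolling"; the depth in the usual sense is still one layer, regardless of T.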

Plan for Today
- Model: Recurrent Neural Networks (RNNs)
- Learning: BackProp Through Time (BPTT), Vanishing / Exploding Gradients
- [Abhishek:] Lua / Torch Tutorial

BPTT
Image Credit: Richard Socher
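The slide's figure is not preserved in this transcript. As a stand-in, here is the usual BPTT gradient for a vanilla RNN. The notation is an assumption for illustration: hidden state h_t, recurrent weights W_hh, input weights W_xh, per-step loss E_t, and the "immediate" partial derivative written with a superscript plus.

```latex
\begin{align*}
h_t &= \tanh(W_{hh} h_{t-1} + W_{xh} x_t), \qquad E = \sum_{t=1}^{T} E_t \\
\frac{\partial E}{\partial W_{hh}} &= \sum_{t=1}^{T} \sum_{k=1}^{t}
  \frac{\partial E_t}{\partial h_t}\,
  \frac{\partial h_t}{\partial h_k}\,
  \frac{\partial^{+} h_k}{\partial W_{hh}} \\
\frac{\partial h_t}{\partial h_k} &= \prod_{j=k+1}^{t} \frac{\partial h_j}{\partial h_{j-1}}
  = \prod_{j=k+1}^{t} \operatorname{diag}\!\left(1 - h_j \odot h_j\right) W_{hh}
\end{align*}
```

The factors in the product are ordered from j = t down to j = k + 1. The repeated multiplication by W_hh is what makes the gradient shrink or blow up as t - k grows.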

Illustration [Pascanu et al.]
- Intuition
  - Error surface of a single hidden unit RNN; high curvature walls
  - Solid lines: standard gradient descent trajectories
  - Dashed lines: gradient rescaled to fix problem
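A toy computation shows why a single hidden unit is enough to produce the problem. The sketch below assumes a linear scalar recurrence h_t = w * h_{t-1} (no nonlinearity, no input), which is a simplification chosen for clarity and not the exact model behind the figure.

```python
def grad_through_time(w, T):
    """Return d h_T / d h_0 for the linear scalar recurrence h_t = w * h_{t-1}."""
    grad = 1.0
    for _ in range(T):
        grad *= w          # each unrolled step contributes one factor of w
    return grad

T = 50
for w in (0.5, 1.0, 1.5):
    print(f"w={w}: d h_T / d h_0 = {grad_through_time(w, T):.3e}")
```

With |w| < 1 the gradient vanishes over 50 steps, and with |w| > 1 it explodes; only w near 1 keeps it at a usable scale.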

Fix #1
Pseudocode
Image Credit: Richard Socher
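The pseudocode image is not preserved in this transcript. Assuming Fix #1 is the gradient rescaling shown by the dashed lines in the Pascanu et al. illustration (norm clipping), a minimal sketch looks like this. The function name `clip_gradient` and the threshold value are hypothetical.

```python
import math

def clip_gradient(grad, threshold):
    """Rescale grad (a list of floats) so its L2 norm is at most threshold."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm > threshold:
        scale = threshold / norm          # shrink the length, keep the direction
        return [g * scale for g in grad]
    return list(grad)                     # small gradients pass through unchanged

exploded = [30.0, -40.0]                  # L2 norm is 50
clipped = clip_gradient(exploded, threshold=5.0)
small = clip_gradient([0.3, -0.4], threshold=5.0)
print(clipped, small)
```

Clipping changes only the step size, so the update still points downhill along the computed gradient.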

Fix #2
Smart Initialization and ReLUs
- [Socher et al. 2013]
- A Simple Way to Initialize Recurrent Networks of Rectified Linear Units, Le et al. 2015
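The initialization proposed in Le et al. 2015 sets the recurrent weight matrix to the identity and the biases to zero, with ReLU hidden units. Below is a minimal plain-Python sketch of that idea; the helper names `identity_init` and `relu_rnn_step` are hypothetical, and `x_proj` stands in for the projected input W_xh x_t.

```python
def identity_init(hidden_dim):
    """Recurrent weights set to the identity matrix, biases set to zero."""
    W_hh = [[1.0 if i == j else 0.0 for j in range(hidden_dim)]
            for i in range(hidden_dim)]
    b_h = [0.0] * hidden_dim
    return W_hh, b_h

def relu_rnn_step(h_prev, x_proj, W_hh, b_h):
    """Compute h_t = ReLU(W_hh h_{t-1} + x_proj + b_h)."""
    pre = [sum(w * h for w, h in zip(row, h_prev)) + xp + b
           for row, xp, b in zip(W_hh, x_proj, b_h)]
    return [max(0.0, p) for p in pre]

W_hh, b_h = identity_init(3)
h = [0.5, 2.0, 0.0]
for _ in range(100):                      # 100 steps with zero input
    h = relu_rnn_step(h, [0.0, 0.0, 0.0], W_hh, b_h)
print(h)
```

At initialization, with no input, the hidden state is copied forward unchanged, so the per-step Jacobian is the identity on active units and gradients neither vanish nor explode at the start of training.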