The curious case of Mark V. Shaney Comp 140 Fall 2008

Who is Mark V. Shaney? Mark was a member of a UseNet News group called net.singles, a users group chock full of dating tips, lonely heart chatter, frank discussions of sexual problems and high tech missionary gospel about the sins of premarital smut-typing. (Penn Jillette's description)

Who is MVS? Mr Shaney was a little goofy but he was always there. He chimed in with poetic opinions on romantic empathy: "As I've commented before, really relating to someone involves standing next to impossible."

MVS, contd. And he had a great Groucho Marx sense of humor: "One morning I shot an elephant in my arms and kissed him. So it was too small for a pill? Well, it was too small for a while."

MVS And his idea of a good closing was: "Oh, sorry. Nevermind. I am afraid of it becoming another island in a nice suit."

MVS on Bush's speech Mr. Chairman, delegates, fellow citizens, I'm honored to aid the rise of democracy in Germany and Japan, Nicaragua and Central Europe and the freedom of knowing you can take them. Tonight, I remind every parent and every school must teach, so we do to improve health care and a more hopeful America. I am in their days of worry. We see that character in our future. We will build a safer world today. The progress we and our friends and allies seek in the life of our work. The terrorists are fighting freedom with all their cunning and cruelty because freedom is not America's gift to every man and woman in this place, that dream is renewed. Now we go forward, grateful for our older workers. With the huge baby boom generation approaching retirement, many of our work. About 40 nations stand beside us in the next four years.

Compiled musings of MVS: http://www.harlanlandes.com/shaney/1984_09.html

Mark V. Shaney A program created at AT&T Research Labs by Bruce Ellis, Rob Pike, and Don Mitchell. The name is a play on "Markov chain", which is the underlying technology.

Motivating the reconstruction of Shaney Wouldn't you like to write a program that could read a thousand words of something and spew out lovable nonsense in the same style? Your own little desktop Bret Easton Ellis, that sucks up the culture of your choice and spits it back at you? Don't let the Murray Hill address scare you, now that rob and brucee have done the hard work of thinking it up, even you and I can understand how Mark V. Shaney works and with a little work you and I can write our own (but let's hope to hell we all have something better to do with our lives - what is on the Weather Channel tonight?) --- Penn Jillette

What does Shaney do? Input text --> Shaney --> Output text. We know Shaney riffs on texts that he reads, so we can guess his inputs and outputs. We also know that Shaney generates output text that is similar to the input text (of the same genre, on the same topic, with similar words).

Outline of lecture
Reverse engineering Shaney (10 minute group exercise)
Shaney's recipe
Mathematical model
Computational realization of the model
Fun with our model

Allen B. Downey The goal is to teach you to think like a computer scientist. This way of thinking combines some of the best features of mathematics, engineering, and natural science. Like mathematicians, computer scientists use formal languages to denote ideas (specifically computations). Like engineers, they design things, assembling components into systems and evaluating tradeoffs among alternatives. Like scientists, they observe the behavior of complex systems, form hypotheses, and test predictions. The single most important skill for a computer scientist is problem solving. Problem solving means the ability to formulate problems, think creatively about solutions, and express a solution clearly and accurately. -- How to think like a computer scientist

Questions
Abstraction: Was the problem specified precisely? What are the inputs and outputs? How did you represent the inputs and outputs?
Automation: How did you express your recipe for a solution? How can you demonstrate that your recipe solves the problem? How expensive is it to run/use your recipe (where cost is defined in units related to the size of the input)? Are there other recipes to solve the problem? Is your recipe the best there could ever be?

Abstraction
Inputs: a sequence of words.
Outputs: a sequence of words similar to the inputs: using the same or similar vocabulary (about the same topics) and the same or similar phrases (short sequences, i.e., similar linguistic style).

Automation Input text --> Shaney --> Output text. Shaney reads posts on net.singles or some other source, creates a GENERATIVE mathematical model of these posts, and computationally constructs new posts based on this model.

The lyrics of She loves you She loves you, yeh, yeh, yeh. She loves you, yeh, yeh, yeh. She loves you, yeh, yeh, yeh, yeeeh! You think you lost your love, when I saw her yesterday. It's you she's thinking of, and she told me what to say. She says she loves you, and you know that can't be bad. Yes, she loves you, and you know you should be glad. Ooh! She said you hurt her so, she almost lost her mind. And now she says she knows, you're not the hurting kind. She says she loves you, and you know that can't be bad. Yes, she loves you, and you know you should be glad. Ooh! She loves you, yeh, yeh, yeh! She loves you, yeh, yeh, yeh! And with a love like that, you know you should be glad. And now it's up to you, I think it's only fair, if I should hurt you too, apologize to her, because she loves you, and you know that can't be bad. Yes, she loves you, and you know you should be glad. Ooh! She loves you, yeh, yeh, yeh! She loves you, yeh, yeh, yeh! And with a love like that, you know you should be glad. And with a love like that, you know you should be glad. And with a love like that, you know you shouuuld be glad. Yeh, yeh, yeh; yeh, yeh, yeh; yeh, yeh, yeeeh!

The simplest model Get all the words from the lyrics and put them in a giant bowl/envelope.

Generative model based on randomization
Extract all words from the text and put them in a giant bowl/envelope.
Repeat N times:
    Draw a word at random (with replacement) from the bowl/envelope.
    Print it out.

Computational mapping How to represent the bowl of words? Our old friend, the Python list: ['She', 'loves', 'you', 'yeh', ..., 'yeeeh!']. Now throw a dart at this list with your eyes closed, and pick the word your dart lands on. Repeat the dart throw as many times as the length of the text you want to generate.

How to extract words into a list (making the bowl)

def read_file_into_word_list(filename):
    # Open the file for reading.
    inputfile = open(filename, 'r')
    # Read the entire file into a string called text.
    text = inputfile.read()
    # Split the text into a list of words, separating on whitespace.
    words = text.split()
    # Return the words as a list.
    return words

How to throw a computational dart

import random

def make_random_text_simple(words, num_words=100):
    random_text = ''
    for i in range(num_words):
        next = random.choice(words)
        random_text = random_text + ' ' + next
    return random_text

Putting it all together

words = read_file_into_word_list('shelovesyou.txt')
riff = make_random_text_simple(words)
print riff

More complex models The model we just developed (random drawing out of a list of words) is called a zeroth-order Markov model. Each word is generated independently of any other. However, English has sequential structure. We will now build better models to capture this structure.

The lyrics of She loves you She loves you, yeh, yeh, yeh. She loves you, yeh, yeh, yeh. She loves you, yeh, yeh, yeh, yeeeh! You think you lost your love, when I saw her yesterday. It's you she's thinking of, and she told me what to say. She says she loves you, and you know that can't be bad. Yes, she loves you, and you know you should be glad. Ooh! She said you hurt her so, she almost lost her mind. And now she says she knows, you're not the hurting kind. She says she loves you, and you know that can't be bad. Yes, she loves you, and you know you should be glad. Ooh! She loves you, yeh, yeh, yeh! She loves you, yeh, yeh, yeh! And with a love like that, you know you should be glad. And now it's up to you, I think it's only fair, if I should hurt you too, apologize to her, because she loves you, and you know that can't be bad. Yes, she loves you, and you know you should be glad. Ooh! She loves you, yeh, yeh, yeh! She loves you, yeh, yeh, yeh! And with a love like that, you know you should be glad. And with a love like that, you know you should be glad. And with a love like that, you know you shouuuld be glad. Yeh, yeh, yeh; yeh, yeh, yeh; yeh, yeh, yeeeh! Look for patterns!

Example of structure In the lyrics of She loves you by the Beatles, what words follow the word she? She --> ['loves', 'loves', 'loves', 'says', 'loves', 'said', 'almost', 'says', 'knows', 'says', 'loves', 'loves', 'loves', 'loves', 'loves', 'loves', 'loves']
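
Here is a minimal sketch (mine, not the deck's) of how one might collect those followers directly from the lyrics file; it assumes the lyrics are saved as shelovesyou.txt, strips punctuation, and ignores case, so its output may differ slightly from the hand-made list above.

# Collect the words that immediately follow 'she' in the lyrics.
# Assumes the lyrics are saved in shelovesyou.txt; punctuation is
# stripped and case is ignored for simplicity.
words = [w.strip('.,;!?"').lower() for w in open('shelovesyou.txt').read().split()]
followers = [words[i+1] for i in range(len(words) - 1) if words[i] == 'she']
print(followers)   # e.g. ['loves', 'loves', 'loves', 'told', 'says', ...]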

Computational mapping How to represent this structure? For every distinct word in the text, store a list of the words that immediately follow it in the text: a prefix dictionary.

Creating the prefix dictionary
Example text: She loves you yeh yeh yeh She loves you yeh yeh yeh
Prefix dictionary:
She --> [loves, loves]
loves --> [you, you]
you --> [yeh, yeh]
yeh --> [yeh, yeh, She, yeh, yeh]

Generation recipe
Generate a random word w from the text, and set riff = w.
Repeat N times:
    Get the list associated with word w from the prefix dictionary.
    Make a random choice from that list, say w'; add w' to riff.
    Set w = w'.
Print riff.

Generation example
Random word picked at start = loves.
What word is likely to be picked after that? you (probability = 1).
What word is likely to be picked after that? yeh (probability = 1).
What word is likely to be picked after that? With probability 4/5 it will be yeh; with probability 1/5 it will be She.
Prefix dictionary:
She --> [loves, loves]
loves --> [you, you]
you --> [yeh, yeh]
yeh --> [yeh, yeh, She, yeh, yeh]
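
As a quick check, a small sketch (again mine, not the deck's) that recomputes these probabilities from the hand-built dictionary, as the fraction of times each follower appears in a word's list:

# Compute transition probabilities from the toy prefix dictionary.
prefix = {'She':   ['loves', 'loves'],
          'loves': ['you', 'you'],
          'you':   ['yeh', 'yeh'],
          'yeh':   ['yeh', 'yeh', 'She', 'yeh', 'yeh']}
for word in prefix:
    followers = prefix[word]
    for f in sorted(set(followers)):
        # e.g. yeh --> She : 0.20, yeh --> yeh : 0.80
        print('%s --> %s : %.2f' % (word, f, followers.count(f) / float(len(followers))))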

The generation process (state diagram): loves --> you with probability 1 (pick a word at random from prefix['loves']); you --> yeh with probability 1 (pick a word at random from prefix['you']); yeh --> yeh with probability 4/5 and yeh --> She with probability 1/5 (pick a word at random from prefix['yeh']).

Recipe for constructing the prefix dictionary
Example text: She loves you yeh yeh yeh She loves you yeh yeh yeh
Scan the text one adjacent pair of words at a time. After the first pair (She loves):
She --> [loves]

Recipe for constructing the prefix dictionary, contd. After the second pair (loves you):
She --> [loves]
loves --> [you]

Recipe for constructing the prefix dictionary, contd. After the third pair (you yeh):
She --> [loves]
loves --> [you]
you --> [yeh]

How to make a prefix dictionary using Python

def make_prefix_dictionary(words):
    prefix = {}
    for i in range(len(words)-1):
        # First time we see this word: start an empty follower list.
        if words[i] not in prefix:
            prefix[words[i]] = []
        # Record the word that immediately follows words[i].
        prefix[words[i]].append(words[i+1])
    return prefix

Generating text using the prefix dictionary in Python

def make_random_text(prefix, num_words=100):
    current_word = random.choice(prefix.keys())
    random_text = current_word
    for i in range(num_words-1):
        # last word in document may not have a suffix
        if current_word not in prefix:
            break
        next = random.choice(prefix[current_word])
        random_text = random_text + ' ' + next
        current_word = next
    return random_text

Putting it all together

words = read_file_into_word_list('shelovesyou.txt')
prefix = make_prefix_dictionary(words)
riff = make_random_text(prefix)
print riff

The model we just built is called a first-order Markov model.

Making even more complex models Idea: why look at the current word alone to determine the next word? How about making a prefix dictionary indexed by the two previous words, rather than a single word? Such a model is a second-order Markov model (a sketch follows below).
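
A minimal sketch of how such a dictionary might be built, assuming the same word list as before; the function name make_prefix_dictionary_2 is mine, not the deck's. The only change from make_prefix_dictionary is that each key is a tuple of two consecutive words:

# A second-order prefix dictionary: each key is a pair of consecutive
# words, and its value lists the words seen right after that pair.
def make_prefix_dictionary_2(words):
    prefix = {}
    for i in range(len(words) - 2):
        pair = (words[i], words[i+1])
        if pair not in prefix:
            prefix[pair] = []
        prefix[pair].append(words[i+2])
    return prefix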

Penn Jillette's description Mr Shaney takes the input text and measures how many times each triple occurs - How many times does "you like to" occur in our sample - let's say twice. And how many times does "you like macrame" (for example) occur? Let's say once. All you got to do to generate output text is have Shaney print a pair of words and then choose, according to the probability of the input text, what the next word should be. So after it prints "you like" it will print the word "to" 2/3rds of the time and the word "macrame" 1/3rd of the time at random. Now, let's say, it prints "macrame". Now the current pair becomes "like macrame" (you see? this IS nonsense) - Shaney looks to see what word could follow that pair and he's off and running.

Creating a more complex prefix dictionary
Example text: She loves you yeh yeh yeh She loves you yeh yeh yeh
Prefix dictionary:
[She loves] --> [you, you]
[loves you] --> [yeh, yeh]
[you yeh] --> [yeh, yeh]
[yeh yeh] --> [yeh, She, yeh]
[yeh She] --> [loves]

Generation using the more complex prefix dictionary
Random word pair picked at start = loves you.
What word is likely to be picked after that? yeh (probability = 1).
What word is likely to be picked after that? yeh (probability = 1).
What word is likely to be picked after that? With probability 2/3 it will be yeh; with probability 1/3 it will be She.
Prefix dictionary:
[She loves] --> [you, you]
[loves you] --> [yeh, yeh]
[you yeh] --> [yeh, yeh]
[yeh yeh] --> [yeh, She, yeh]
[yeh She] --> [loves]
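
To round this out, a matching generation sketch in the style of make_random_text above (the name make_random_text_2 and the details are mine, not the deck's): keep a current pair, draw a follower of that pair, then slide the pair forward one word.

import random

def make_random_text_2(prefix, num_words=100):
    # Start from a random pair of words drawn from the dictionary's keys.
    pair = random.choice(list(prefix.keys()))
    random_text = pair[0] + ' ' + pair[1]
    for i in range(num_words - 2):
        # The last pair in the document may not have a suffix.
        if pair not in prefix:
            break
        next_word = random.choice(prefix[pair])
        random_text = random_text + ' ' + next_word
        # Slide the window forward one word.
        pair = (pair[1], next_word)
    return random_text

Putting it all together, as before:

words = read_file_into_word_list('shelovesyou.txt')
prefix = make_prefix_dictionary_2(words)
print(make_random_text_2(prefix))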