
Gas Price regression

This is based on data file GasolineMarket.mpj. Here is a schematic of the data file:

Year  Expenditure  Population  GasPrice  Income  NewCars  UsedCars  PublicTrans  Durables  Nondurables  Services
1953  7.4  96  6.668  8883  47.2  26.7  6.8  37.7  29.7  9.4
1954  7.8  6239  7.29  868  46.  22.7  8.  36.8  29.7  2.
1955  8.6  627  7.2  937  44.8  2.  8.  36.  29.  2.4
1956  9.4  6822  7.729  9436  46.  2.7  9.2  36.  29.9  2.9
1957  .2  7274  8.497  934  48.  23.2  9.9  37.2  3.9  2.8
1958  .6  744  8.36  9343  .  24.  2.9  37.8  3.7  22.6
...
2003  9.3  2973.4  26437  34.7  42.9  29.3  7.  6.3  26.
2004  224.  2939  23.9  273  33.9  33.3  29.  4.8  72.2  222.8

This is certainly a time series. We can see very strong patterns in the correlation matrix. This comes out in this form...

Correlations: Expenditure, Population, GasPrice, Income, NewCars,...

             Expenditure  Population  GasPrice  Income
Population   .9
GasPrice     .978         .927
Income       .96          .993        .934
NewCars      .942         .9          .936      .96
UsedCars     .936         .946        .923      .93
PublicTrans  .966         .96         .927      .964
Durables     .92          .94         .939      .949
Nondurables  .979         .97         .963      .978
Services     .977         .96         .939      .97

             NewCars  UsedCars  PublicTrans  Durables
UsedCars     .994
PublicTrans  .98      .982
Durables     .993     .988      .98
Nondurables  .989     .982      .99          .977
Services     .978     .977      .998         .96

             Nondurables
Services     .994

Cell Contents: Pearson correlation

It looks nicer if we re-organize the layout:

            Expend-           Gas               New      Used     Public            Non-
            iture    Popn     Price    Income   Cars     Cars     Trans    Durables durables
Population  .9
GasPrice    .978     .927
Income      .96      .993     .934
NewCars     .942     .9       .936     .96
UsedCars    .936     .946     .923     .93      .994
PublicTrans .966     .96      .927     .964     .98      .982
Durables    .92      .94      .939     .949     .993     .988     .98
Nondurables .979     .97      .963     .978     .989     .982     .99      .977
Services    .977     .96      .939     .97      .978     .977     .998     .96      .994

The problem is that everything is moving forward in time together. So what explains GasPrice? Let's try the run on just NewCars, UsedCars, Population.

Regression Analysis: GasPrice versus NewCars, UsedCars, Population

The regression equation is
GasPrice = - 8.2 + .4 NewCars - .42 UsedCars + .326 Population

Predictor   Coef    SE Coef  T      P   VIF
Constant    -8.9    2.88     -3.84  .
NewCars     .364    .36      2.87   .6  87.684
UsedCars    -.423   .24      -.66   .4  82.269
Population  .3263   .23      2.7    .9  .23

S = .223   R-Sq = 89.7%   R-Sq(adj) = 89.%

Analysis of Variance
Source      DF  SS     MS    F     P
Regression  3   4346   4487  38.9  .
Error       48  6      4
Total       51  48467

Source      DF  Seq SS
NewCars     1   42466
UsedCars    1   226
Population  1   768

Unusual Observations
Obs  NewCars  GasPrice  Fit    SE Fit  Residual  St Resid
29   94       84.2      9.9    2.76    24.43     2.48R
3    97       79.77     9.3    .62     2.64      2.R
32   3        76.       6.     3.89    9.9       2.R
46   4        7.87      92.3   2.49    -2.44     -2.6R
2    34       23.9      98.36  4.24    2.4       2.7R

R denotes an observation with a large standardized residual.

Durbin-Watson statistic = 0.45399

With no critical thought, this looks great!

But... here are facts about the residuals. This was obtained through Stat > Regression > Regression > Graphs > Four in one.

[Figure: Residual Plots for GasPrice, four panels: Normal Probability Plot, Versus Fits (x-axis: Fitted Value), Histogram (y-axis: Frequency), Versus Order (x-axis: Observation Order)]

The plot in sequence order is a clear indication that the residuals have some type of time-based dependence. Moreover, the Durbin-Watson statistic is very small.

As a side note, we'll record the residuals and then get the autocorrelation function plot. You'll get the residuals through Stat > Regression > Regression > Storage > Residuals. You can get the autocorrelation function of the residuals through Stat > Time Series > Autocorrelation.
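For readers working outside Minitab, here is a minimal sketch of how the Durbin-Watson statistic is computed from a list of residuals. The function name and the two small residual series are illustrative assumptions, not values from GasolineMarket.mpj.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squares."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den

# Made-up residuals that drift slowly (positive time dependence).
smooth = [1.0, 1.2, 1.1, 0.9, 0.5, 0.1, -0.3, -0.8, -1.0, -0.7]
# Made-up residuals that flip sign every period (negative dependence).
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

print(durbin_watson(smooth))       # well below 2
print(durbin_watson(alternating))  # well above 2
```

Values near 2 are what we would hope for; slowly drifting residuals like those in the versus-order plot push the statistic toward 0.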

Here's what that plot looks like:

[Figure: Autocorrelation Function for RESI1 (with 5% significance limits for the autocorrelations); x-axis: Lag, y-axis: Autocorrelation]

And here are the autocorrelations:

Lag  ACF      T     LBQ
1    .7287    .7    27.2
2    .3998    2.    3.78
3    .2994    .39   4.67
4    .6749    .74   42.2
5    -.79     -.32  42.49
6    -.8789   -.84  44.6
7    -.79286  -.8   46.6
8    -.96     -.87  49.6
9    -.24262  -.6   2.9
10   -.26948  -.2   7.49

This is a common situation. We note that the first autocorrelation is large (.7) and statistically significant. The T is an ordinary t statistic. Values bigger than 2 or less than -2 indicate statistical significance. The LBQ refers to the Ljung-Box Q statistic, which tests the null hypothesis that the autocorrelations for all lags up to lag k are equal to zero. If you really wish to do that test, you can get information from Minitab's Help.

So... there is a problem that has to be corrected.
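The three columns in that table can be sketched in a few lines. This is a hedged illustration, assuming the usual large-lag standard error for T and the standard Ljung-Box formula; the function name and the trending test series are made up for the example.

```python
import math

def acf_table(x, max_lag):
    """Return (lag, ACF, T, LBQ) rows for lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    rows = []
    prior_r2 = 0.0  # sum of squared autocorrelations at earlier lags
    q_sum = 0.0     # running sum of r_j^2 / (n - j) for Ljung-Box
    for k in range(1, max_lag + 1):
        r = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        se = math.sqrt((1.0 + 2.0 * prior_r2) / n)  # large-lag standard error
        q_sum += r * r / (n - k)
        rows.append((k, r, r / se, n * (n + 2) * q_sum))
        prior_r2 += r * r
    return rows

series = list(range(20))  # a pure trend, so strongly autocorrelated
for lag, acf, t_stat, lbq in acf_table(series, 3):
    print(lag, round(acf, 4), round(t_stat, 2), round(lbq, 2))
```

At lag 1 the T column is just the autocorrelation times the square root of the sample size, which is why a large first autocorrelation is significant even in a short series.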

Correction Attempt 1: Use time itself as a predictor.

Regression Analysis: GasPrice versus NewCars, UsedCars,...

The regression equation is
GasPrice = - 26 + .937 NewCars - .38 UsedCars - .7 Population + . Year

Predictor   Coef    SE Coef  T     P    VIF
Constant    -26     3434     -.63  .3
NewCars     .9368   .3988    2.3   .23  .7
UsedCars    -.383   .264     -.44  .    87.74
Population  -.699   .6638    -.    .97  38.27
Year        .2      .84      .6    .47  364.94

S = .28   R-Sq = 89.8%   R-Sq(adj) = 88.9%

Analysis of Variance
Source      DF  SS     MS  F    P
Regression  4   43     87  2.9  .
Error       47  4967   6
Total       51  48467

Source      DF  Seq SS
NewCars     1   42466
UsedCars    1   226
Population  1   768
Year        1   39

Unusual Observations
Obs  NewCars  GasPrice  Fit    SE Fit  Residual  St Resid
29   94       84.2      9.86   2.8     24.6      2.44R
32   3        76.       7.67   4.69    8.33      2.R
2    34       23.9      96.89  4.9     27.2      2.99R

R denotes an observation with a large standardized residual.

Durbin-Watson statistic = .4668

This has failed. The Durbin-Watson statistic is very small. Plots involving the residuals are bad also, but they are not shown here.

Correction Attempt 2: Use the differenced data. The dependent variable and all the independent variables should be differenced. In Minitab, use Stat > Time Series > Differences. This will reduce the sample size by 1.
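A small sketch of what differencing does, using two made-up series that share an upward trend (the numbers are illustrative assumptions, not the gasoline data). The levels are almost perfectly correlated simply because both rise over time; the year-to-year changes are not.

```python
import math

def difference(series, lag=1):
    """First differences; the result is shorter than the input by `lag`."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

def pearson(a, b):
    """Ordinary Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

# Hypothetical wiggles added to two different linear trends.
wx = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3, 0.0, 0.2]
wy = [-0.1, 0.3, -0.3, 0.2, 0.4, -0.2, -0.4, 0.1, 0.3, -0.2]
x = [t + wx[t] for t in range(10)]
y = [2 * t + wy[t] for t in range(10)]

print(pearson(x, y))                          # close to 1: shared trend
print(pearson(difference(x), difference(y)))  # much weaker
```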

The plots look much better.

[Figure: Residual Plots for GasPriceDiff, four panels: Normal Probability Plot, Versus Fits (x-axis: Fitted Value), Histogram (y-axis: Frequency), Versus Order (x-axis: Observation Order)]

The Durbin-Watson statistic is 1.46699, which is at the low end of borderline values.

Correction Attempt 3: Use the lagged version of the dependent variable. In Minitab, use Stat > Time Series > Lag. Again, this will drop the sample size by 1.

Regression Analysis: GasPrice versus NewCars, UsedCars,...

The regression equation is
GasPrice = - 49.3 + .497 NewCars - .43 UsedCars + .27 Population + .89 GasPriceLag

51 cases used, 1 cases contain missing values

Predictor    Coef    SE Coef  T      P    VIF
Constant     -49.28  4.       -3.2   .
NewCars      .4966   .2248    2.2    .32  92.46
UsedCars     -.4297  .38      -2.79  .8   82.2
Population   .2673   .78      2.6    .    .292
GasPriceLag  .88997  .964     9.26   .    .66

S = 6.2837   R-Sq = 96.3%   R-Sq(adj) = 96.%

Analysis of Variance
Source      DF  SS    MS   F      P
Regression  4   43    378  32.96  .
Error       46  728   38
Total       50  4724

Source       DF  Seq SS
NewCars      1   422
UsedCars     1   28
Population   1   82
GasPriceLag  1   329

Unusual Observations
Obs NewCars GasPrice Fit SE Fit Residual St Resid
28 88 7.9 63.34 2.723 2.64 2.22R
34 6.7 76.838 .72 -6.663 -2.8R
46 4 7.874 86.463 .628 -4.89 -2.47R
48 4. 8.88 2.38 8.92 3.29R

R denotes an observation with a large standardized residual.

Durbin-Watson statistic = .6793

This is not perfect either, but the Durbin-Watson statistic has crossed, just barely, into the zone at which we can accept ρ = 0. Here are the relevant plots:

[Figure: Residual Plots for GasPrice, four panels: Normal Probability Plot, Versus Fits (x-axis: Fitted Value), Histogram (y-axis: Frequency), Versus Order (x-axis: Observation Order)]

This is tough to live with, but we could do it.
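The lagged column and the one-case loss can be sketched as follows. This is a hedged illustration of the idea behind the Lag step, with hypothetical helper names and made-up prices; it is not Minitab's implementation.

```python
def lag(series, k=1):
    """Shift a series down by k rows, padding the top with missing values."""
    return [None] * k + list(series[:len(series) - k])

def complete_cases(*columns):
    """Keep only the rows in which no column is missing."""
    return [row for row in zip(*columns) if None not in row]

gas_price = [10.0, 12.0, 15.0, 19.0]  # hypothetical values
gas_price_lag = lag(gas_price)        # [None, 10.0, 12.0, 15.0]

rows = complete_cases(gas_price, gas_price_lag)
print(len(gas_price), "cases,", len(rows), "cases used")
print(rows)
```

The first row has no previous value, so it is dropped from the regression. That is the reason the output reports one case with missing values.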

Note that the coefficient on GasPriceLag is .89, close to 1. The fitted equation was

GasPrice = - 49.3 + .497 NewCars - .43 UsedCars + .27 Population + .89 GasPriceLag

This can be rearranged as

GasPrice - .89 GasPriceLag = - 49.3 + .497 NewCars - .43 UsedCars + .27 Population

The left side is almost the same as GasPriceDiff. An objection to using the lagged variable on the right side of the equation is that we are mixing up dependent and independent variables.

Correction Attempt 4: Use the Cochrane-Orcutt method. There are several variations on this method. The essence of the concept is estimating the autocorrelation coefficient $\rho$ and then computing $Y_i^* = Y_i - \hat{\rho}\, Y_{i-1}$ and doing the same thing for each independent variable.

There are several ways to get $\hat{\rho}$. Start with the initial regression, which is where we were on pages 1-3. In the printout from the autocorrelation, we found that the first autocorrelation was computed as .7287, and we can call this $\hat{\rho}$. Some like to use $\hat{\rho} = 1 - \frac{DW}{2}$; here this is $1 - \frac{0.45399}{2} \approx 0.773$. These are not all that far apart. Let's use $\hat{\rho}$ = .7. In Minitab, we will need to create the lagged variable, and then use Calc > Calculator to perform (original) - .7 × (lagged).
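The two pieces of that calculation can be sketched as below. The function names, the Durbin-Watson value of 0.5, and the short series are illustrative assumptions chosen so the arithmetic is easy to check by hand.

```python
def rho_from_dw(dw):
    """Rough estimate of the first-order autocorrelation from Durbin-Watson."""
    return 1.0 - dw / 2.0

def quasi_difference(series, rho):
    """Compute (original) - rho * (lagged); the first case is lost."""
    return [series[t] - rho * series[t - 1] for t in range(1, len(series))]

rho_hat = rho_from_dw(0.5)          # 0.75 for this hypothetical DW
y = [10.0, 12.0, 15.0, 19.0]        # hypothetical dependent variable
y_adj = quasi_difference(y, rho_hat)
print(rho_hat, y_adj)               # 0.75 [4.5, 6.0, 7.75]
```

The same transformation would be applied to every independent variable before rerunning the regression. With rho set to 1 it reduces to ordinary differencing, which is the connection between this method and Correction Attempt 2.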

Here are the results:

Regression Analysis: GasPriceAdj versus NewCarAdj, UsedCarAdj, PopnAdj

The regression equation is
GasPriceAdj = - 66.3 + .482 NewCarAdj - .76 UsedCarAdj + .2 PopnAdj

51 cases used, 1 cases contain missing values

Predictor   Coef   SE Coef  T      P    VIF
Constant    -66.3  3.6      -4.86  .
NewCarAdj   .482   .832     2.63   .    .22
UsedCarAdj  -.76   .483     -.9    .24  6.977
PopnAdj     .79    .23      .6     .    6.36

S = 6.346   R-Sq = 7.7%   R-Sq(adj) = 69.9%

Analysis of Variance
Source      DF  SS     MS   F      P
Regression  3   489.6  66.  39.78  .
Error       47  897.9  4.4
Total       50  677.

Source      DF  Seq SS
NewCarAdj   1   934.6
UsedCarAdj  1   282.2
PopnAdj     1   32.8

Unusual Observations
Obs  NewCarAdj  GasPriceAdj  Fit     SE Fit  Residual  St Resid
28   46.3       37.42        23.84   2.687   3.7       2.36R
46   34.9       4.69         29.248  .3      -.79      -2.46R
48   33.2       4.2          29.48   .89     .797      2.9R
2    33.9       .293         3.98    3.3     4.33      2.9RX

R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large leverage.

Durbin-Watson statistic = .27866

This has actually made the Durbin-Watson statistic a little worse (lower).

Here are the plots:

[Figure: Residual Plots for GasPriceAdj, four panels: Normal Probability Plot, Versus Fits (x-axis: Fitted Value), Histogram (y-axis: Frequency), Versus Order (x-axis: Observation Order)]

This particular data set may have incurable issues. The series are all very smooth for about the first twenty years and then become irregular. The statistical word for this problem is non-stationarity.

By the way, the most commonly used correction is differencing, the second method illustrated here. It's simple and easy to understand. Using time as a predictor, the first method done here, seems promising, but it rarely works.