Regression Analysis
Essay by 24 • March 18, 2011 • 1,257 Words (6 Pages) • 2,462 Views
Introduction
This presentation on regression analysis covers the simple linear regression model. It first explores the regression model and the regression equation, then takes a brief look at the estimated regression equation. The case study used involves a large Chinese food restaurant chain.
Business Case
In this instance, the restaurant chain's management wants to determine the best locations in which to expand the business. So far, the most successful locations have been near college campuses. This opinion is based on the positive relationship observed between quarterly sales (y) and the size of the student population (x). Management's view is that, overall, restaurants close to college campuses with large student bodies generate more sales than restaurants located near campuses with small student bodies.
In the sample data below, xi is the size of the student population (in thousands) and yi is the quarterly sales (in thousands of dollars). The values of xi and yi for the 10 Chinese food restaurants in the sample are as follows:
Sample Data:
Restaurant (i)   Student Population, xi (1,000s)   Quarterly Sales, yi ($1,000s)
1                2                                 58
2                6                                 105
3                8                                 88
4                8                                 118
5                12                                117
6                16                                137
7                20                                157
8                20                                169
9                22                                149
10               26                                202
Methodology
To quote the textbook definition, "in the simple linear regression model, y is a linear function of x (the β0 + β1 Xi part) plus єi. β0 and β1 are referred to as the parameters of the model, and є (the Greek letter epsilon) is a random variable referred to as the error term" (Anderson, Sweeney & Williams, 2000, pg. 450). Stated succinctly, the simple linear regression model is yi = β0 + β1xi + єi, where yi is the observed value of Y for observation i. Generally, the values of the parameters are unknown, so they must be estimated with the sample statistics b0 and b1.
"This equation requires us to determine the regression coefficients, b0 and b1 in order to predict the value of Y. Once the regression coefficients are obtained, the straight line can be plotted on a scatter diagram. We then make a visual comparison of how well the statistical model fits the original data by observing whether the original data lie close to the fitted line or deviate greatly from the fitted line" (Levine, Berenson, & Stephen, 1999, pg. 777).
The following summarizes the elements of simple linear regression:
Regression Model:
y = β0 + β1x + є
Regression Equation:
E(y) = β0 + β1x
Unknown Parameters:
β0, β1 (estimated by the sample statistics b0 and b1)
Sample Data:
x1, x2, ..., xn
y1, y2, ..., yn
Estimated Regression Equation Computation:
ŷ = b0 + b1x
For this particular case, in which management is trying to assess where to expand, the analysis will show the relationship between sales (y) and the student population (x).
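As a rough illustration of how strongly sales and student population move together in this sample, the sample correlation coefficient can be computed directly from the ten data points. This is only a sketch; the helper name pearson_r is made up for this example and is not taken from the cited textbooks.

```python
import math

# Sample data from the essay:
# x = student population (1,000s), y = quarterly sales ($1,000s)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]


def pearson_r(xs, ys):
    """Sample correlation coefficient (hypothetical helper for illustration)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    s_xx = sum((a - x_bar) ** 2 for a in xs)
    s_yy = sum((b - y_bar) ** 2 for b in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(x, y)
print(f"r = {r:.3f}")  # about 0.950 for this sample
```

A value close to +1 points to a strong positive linear relationship, which is consistent with management's impression.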
Scatter Plot
Given the sample data, the best-fit line is found with the least squares method, which produces the estimates b0 and b1. "The least squares method is a procedure for using sample data to compute an estimated regression equation" (Anderson, Sweeney & Williams, 2000, pg. 452). In essence, the method uses the sample data to give values to "b0 and b1 that minimize the sum of the squares of the deviations between the observed values of the dependent variable yi and the estimated values of dependent variable ŷi."
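The least squares estimates for this sample can be worked out with the standard closed-form formulas, b1 = Σ(xi - x̄)(yi - ȳ) / Σ(xi - x̄)² and b0 = ȳ - b1x̄. The sketch below applies them to the ten restaurants; the function name least_squares and the 16,000-student campus used for the prediction are illustrative assumptions, not part of the original analysis.

```python
# Sample data from the essay:
# x = student population (1,000s), y = quarterly sales ($1,000s)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]


def least_squares(xs, ys):
    """Return (b0, b1) for the line y_hat = b0 + b1 * x (illustrative helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope: sum of cross-deviations over sum of squared x-deviations
    b1 = (sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
          / sum((a - x_bar) ** 2 for a in xs))
    # Intercept: forces the line through the point of means (x_bar, y_bar)
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = least_squares(x, y)
print(f"y_hat = {b0:.1f} + {b1:.1f}x")

# Hypothetical use: predicted quarterly sales for a campus of 16,000 students
predicted = b0 + b1 * 16
print(f"predicted sales: ${predicted:.0f} thousand")
```

For this sample the estimates come out to b0 = 60 and b1 = 5, so each additional 1,000 students is associated with about $5,000 more in quarterly sales.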
An F test for the significance of the regression can help determine whether sales are related to the student population.
The evaluation is done at the .05 significance level.
The critical F value at 1 and 8 df = 5.32
H0: β1 = 0; there is no change in sales for a restaurant as the student population increases
H1: β1 ≠ 0; there is a change in sales if there is a change in the student population
The F value of 74.25 is greater than 5.32, so the null hypothesis is rejected. This is supported by the p-value of 2.55E-05, which is smaller than our .05 level of significance, so there is a statistically significant relationship between student population and sales.
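The F statistic behind this decision can be reproduced from the sample data by splitting the total variation in sales into a regression part and a residual part. The sketch below is one way to do that in plain Python. The constant F_CRIT is an assumption: it is the standard tabulated .05 critical value for 1 and 8 degrees of freedom, hard-coded rather than computed.

```python
# Sample data from the essay:
# x = student population (1,000s), y = quarterly sales ($1,000s)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares estimates of slope and intercept
b1 = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
      / sum((a - x_bar) ** 2 for a in x))
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * a for a in x]

# Sums of squares: total, residual (error), and regression
sst = sum((b - y_bar) ** 2 for b in y)
sse = sum((b - f) ** 2 for b, f in zip(y, y_hat))
ssr = sst - sse

# Mean squares: 1 df for regression (one predictor), n - 2 df for residual
msr = ssr / 1
mse = sse / (n - 2)
f_stat = msr / mse
r_squared = ssr / sst

# Assumption: tabulated critical value of F at alpha = .05 with (1, 8) df
F_CRIT = 5.32
reject_h0 = f_stat > F_CRIT

print(f"SSR = {ssr:.1f}, SSE = {sse:.1f}, SST = {sst:.1f}")
print(f"F = {f_stat:.2f}, r^2 = {r_squared:.3f}, reject H0: {reject_h0}")
```

The sums of squares and the F value should agree with the ANOVA table in the regression output.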
Regression Analysis
r²           0.903     n           10
r            0.950     k           1
Std. Error   13.829    Dep. Var.   (yi)
ANOVA table
Source       SS            df   MS            F       p-value
Regression   14,200.0000   1    14,200.0000   74.25   2.55E-05
Residual     1,530.0000    8    191.2500
Total        15,730.0000   9
Regression output                                             confidence interval
variables   coefficients   std. error   t (df=8)   p-value   95% lower   95% upper
Intercept   60.0000        9.2260       6.503      .0002     38.7247     81.2753
(xi)        5.0000         0.5803       8.617
...
...
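The standard errors, t statistics, and confidence limits in the coefficient table follow from the residual standard error and the spread of the x values. The sketch below shows one way to recompute them from the sample. The constant T_CRIT is an assumption: it is the standard tabulated two-sided .05 critical value of t for 8 degrees of freedom, hard-coded rather than computed.

```python
import math

# Sample data from the essay:
# x = student population (1,000s), y = quarterly sales ($1,000s)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((a - x_bar) ** 2 for a in x)

# Least squares estimates of slope and intercept
b1 = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar

# Residual standard error: square root of SSE / (n - 2)
sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
s = math.sqrt(sse / (n - 2))

# Standard errors of the intercept and slope estimates
se_b0 = s * math.sqrt(1 / n + x_bar ** 2 / s_xx)
se_b1 = s / math.sqrt(s_xx)

# t statistics for H0: coefficient = 0
t_b0 = b0 / se_b0
t_b1 = b1 / se_b1

# Assumption: tabulated two-sided critical t value at alpha = .05, df = 8
T_CRIT = 2.306
ci_b0 = (b0 - T_CRIT * se_b0, b0 + T_CRIT * se_b0)
ci_b1 = (b1 - T_CRIT * se_b1, b1 + T_CRIT * se_b1)

print(f"Intercept: se = {se_b0:.4f}, t = {t_b0:.3f}, 95% CI = {ci_b0}")
print(f"Slope:     se = {se_b1:.4f}, t = {t_b1:.3f}, 95% CI = {ci_b1}")
```

Small differences in the last decimal place relative to the software output are expected, because the critical t value is rounded here.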