A regression analysis measures how changes in one variable, called the dependent or explained variable, can be explained by changes in one or more other variables, called the independent or explanatory variables.
A scatter plot is a visual representation of the relationship between the dependent variable and a given independent variable.
POPULATION REGRESSION FUNCTION
Assuming that the 30 observations represent the population of hedge funds in the same class, their relationship can be described by a population regression function. Such a function consists of parameters called regression coefficients. The regression equation includes an intercept term and one slope coefficient for each independent variable.
THE ERROR TERM
There is a dispersion of y-values around each conditional expected value. The difference between each y and its corresponding conditional expectation is the error term, or noise component, denoted εi.
Sample Regression Function: The sample regression function is an equation that represents the relationship between the y and x variable(s) based only on the information in a sample of the population.
Linear Regression Equation:
Y = a + b(X) + e
b is the regression slope coefficient.
a is the intercept, the value of Y when X is zero.
Three conditions must be satisfied to use linear regression.
Ordinary Least Squares (OLS) estimation is a process that estimates the population parameters βi with corresponding values bi that minimize the squared residuals. The formulas for the coefficients are:
b1 = cov(x, y) / var(x)
b0 = ȳ – b1x̄
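The two OLS formulas above can be sketched in a few lines of Python; the data here is made up purely for illustration and does not come from the text.

```python
# Minimal sketch of the OLS formulas: b1 = cov(x, y) / var(x), b0 = y_bar - b1 * x_bar.
# The data below is illustrative, not from the text.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# The 1/n factors in cov and var cancel in the ratio, so sums of deviations suffice
cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
var_x = sum((xi - x_bar) ** 2 for xi in x)

b1 = cov_xy / var_x       # slope coefficient
b0 = y_bar - b1 * x_bar   # intercept
```

Because the 1/n normalization cancels between covariance and variance, only the sums of cross-products and squared deviations are needed.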
The sum of squared residuals (SSR), sometimes denoted SSE for sum of squared errors, is the sum of squares that results from placing a given intercept and slope coefficient into the equation, computing the residuals, squaring the residuals, and summing them. It is represented by Σeᵢ².
THE COEFFICIENT OF DETERMINATION
The coefficient of determination, represented by R², is a measure of the "goodness of fit" of the regression. It is interpreted as the percentage of variation in the dependent variable explained by the independent variable. The underlying concept is that for the dependent variable, there is a total sum of squares (TSS) around the sample mean.
Total sum of squares (TSS) = Explained sum of squares (ESS) + Sum of squared residuals (SSR)
Σ(yᵢ – ȳ)² = Σ(ŷᵢ – ȳ)² + Σ(yᵢ – ŷᵢ)²
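The decomposition above can be verified numerically; the sketch below fits an OLS line to made-up data (not from the text) and checks that TSS equals ESS plus SSR.

```python
# Numeric check of TSS = ESS + SSR on illustrative data (not from the text).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
     sum((xi - x_bar) ** 2 for xi in x)
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]  # fitted values

tss = sum((yi - y_bar) ** 2 for yi in y)                 # total variation
ess = sum((yh - y_bar) ** 2 for yh in y_hat)             # explained variation
ssr = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))    # unexplained variation
# tss equals ess + ssr up to floating-point rounding
```

The identity holds exactly only for OLS fits that include an intercept, which is why the fitted values are computed with the OLS formulas rather than an arbitrary line.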
THE STANDARD ERROR OF THE REGRESSION
The standard error of the regression (SER) measures the degree of variability of the actual Y-values relative to the estimated Y-values from a regression equation. The SER gauges the “fit” of the regression line. The smaller the standard error, the better the fit.
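For a simple regression, the SER is commonly computed as the square root of SSR divided by n − 2; the residuals below are illustrative assumptions, not values from the text.

```python
import math

# SER sketch for a simple (one-regressor) regression: sqrt(SSR / (n - 2)).
# The residuals are made up for illustration.
residuals = [0.0, -0.16, 0.18, 0.12, -0.14]
n = len(residuals)

ssr = sum(e ** 2 for e in residuals)   # sum of squared residuals
ser = math.sqrt(ssr / (n - 2))         # smaller SER means a tighter fit
```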
When a variable is binary in nature—it is either on or off—it falls under the category of dummy variables. Dummy variables are assigned a value of 0 or 1.
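Encoding such a binary category as 0/1 can be sketched as follows; the fund records and the "style" field are hypothetical, chosen only to illustrate the encoding.

```python
# Dummy-variable sketch: encode a binary category as 0 or 1.
# The records and the "style" field are hypothetical.
funds = [
    {"style": "long_only"},
    {"style": "market_neutral"},
    {"style": "long_only"},
]
# 1 if the fund is market-neutral, 0 otherwise
dummies = [1 if f["style"] == "market_neutral" else 0 for f in funds]
```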
The R² of a regression model captures the fit of the model. For a one-regressor model, the correlation coefficient is the square root of R².
The formula for R² is ESS / TSS.
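The claim that R² equals the squared correlation coefficient in a one-regressor model can be checked directly; the data below is illustrative, not from the text.

```python
import math

# Check that R^2 = ESS / TSS equals the squared correlation r^2
# for a one-regressor model, using illustrative data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

ess = sum((yh - y_bar) ** 2 for yh in y_hat)
r_squared = ess / syy                # R^2 = ESS / TSS
r = sxy / math.sqrt(sxx * syy)       # sample correlation coefficient
# r_squared equals r ** 2 up to floating-point rounding
```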
A t-test may also be used to test the hypothesis that the true slope coefficient, B1, is equal to some hypothesized value. Letting b1 be the point estimate of B1, the appropriate test statistic with n – 2 degrees of freedom is:
t = (b1 – B1) / sb1
The decision rule for tests of significance for regression coefficients is:
Reject H0 if t > +tcritical or t < –tcritical
Rejection of the null means that the slope coefficient is different from the hypothesized value of B1.
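The test statistic and decision rule above can be sketched as follows; the slope estimate, standard error, and sample size are assumed values for illustration, and the critical t comes from a standard t-table.

```python
# t-test sketch for H0: B1 = 0 against HA: B1 != 0.
# All numeric inputs are illustrative assumptions, not values from the text.
b1 = 1.96        # estimated slope coefficient
B1_null = 0.0    # hypothesized value under H0
s_b1 = 0.101     # assumed standard error of the slope
n = 5

t_stat = (b1 - B1_null) / s_b1
t_critical = 3.182  # two-tailed 5% critical t for n - 2 = 3 df (from a t-table)

# Reject H0 if t > +t_critical or t < -t_critical
reject_null = abs(t_stat) > t_critical
```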
Hypothesis testing for a regression coefficient may use the confidence interval for the coefficient being tested. The null hypothesis is H0: B1 = 0 and the alternative hypothesis is HA: B1 ≠ 0.
If the confidence interval at the desired level of significance does not include zero, the null is rejected and the coefficient is said to be statistically different from zero.
The confidence interval for the regression coefficient, B1, is calculated as: b1 ± (tc × sb1)
tc is the critical two-tailed t-value for the selected confidence level with the appropriate number of degrees of freedom, equal to n – 2.
The standard error of the regression coefficient is denoted as sb1.
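Putting the pieces above together, the confidence-interval test can be sketched as follows; the slope, its standard error, and the critical t are assumed values for illustration.

```python
# Confidence-interval sketch for the slope: b1 ± (t_c * s_b1).
# All numeric inputs are illustrative assumptions, not values from the text.
b1 = 1.96       # estimated slope coefficient
s_b1 = 0.0554   # assumed standard error of the slope
t_c = 3.182     # two-tailed 5% critical t for n - 2 = 3 df (from a t-table)

lower = b1 - t_c * s_b1
upper = b1 + t_c * s_b1

# If zero lies outside [lower, upper], reject H0: B1 = 0 at this level
slope_significant = not (lower <= 0.0 <= upper)
```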
P Value: The p-value is the smallest level of significance for which the null hypothesis can be rejected.
Predicted values are values of the dependent variable based on the estimated regression coefficients and a prediction about the value of the independent variable.
For a simple regression, the predicted value of Y is:
Ŷ = b0 + b1Xp
Where: Ŷ = predicted value of the dependent variable.
Xp = forecasted value of the independent variable.
CONFIDENCE INTERVALS FOR PREDICTED VALUES
The equation for the confidence interval for a predicted value of Y is: Ŷ ± (tc × sf) = [Ŷ – (tc × sf) < Y < Ŷ + (tc × sf)]
Where: tc = two-tailed critical t-value at the desired level of significance with df = n – 2.
sf = standard error of the forecast.
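A forecast and its confidence interval can be sketched as below; the coefficients, the forecasted X value, the critical t, and the standard error of the forecast are all assumed values for illustration.

```python
# Sketch: predicted value and its confidence interval, Y_hat ± (t_c * s_f).
# All numeric inputs are illustrative assumptions, not values from the text.
b0, b1 = 0.14, 1.96   # estimated intercept and slope
x_p = 6.0             # forecasted value of the independent variable
t_c = 3.182           # two-tailed 5% critical t for n - 2 = 3 df (from a t-table)
s_f = 0.25            # assumed standard error of the forecast

y_hat = b0 + b1 * x_p                                   # point forecast
interval = (y_hat - t_c * s_f, y_hat + t_c * s_f)       # (lower, upper)
```

Note that sf reflects forecast uncertainty at the specific Xp, so it is generally larger than the SER and grows as Xp moves away from the sample mean of X.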