Bus Stats II Exam 3
Terms
Term | Definition |
---|---|
ANOVA Table | The analysis of variance table used to summarize the computations associated with the F test for significance. |
Coefficient of Determination | A measure of the goodness of fit of the estimated regression equation. It can be interpreted as the proportion of the variability in the dependent variable y that is explained by the estimated regression equation |
Confidence Interval | The interval estimate of the mean value of y for a given value of x |
Correlation Coefficient | A measure of the strength of the linear relationship between two variables |
Dependent Variable | The variable that is being predicted or explained. It is denoted by y. |
Estimated Regression Equation | The estimate of the regression equation developed from sample data by using the least squares method. For simple linear regression, the estimated regression equation is $\hat{y} = b_0 + b_1 x$. |
High Leverage Points | Observations with extreme values for the independent variables |
Independent Variable | The variable that is doing the predicting or explaining. It is denoted by x |
Influential Observation | An observation that has a strong influence or effect on the regression results. |
ith Residual | The difference between the observed value of the dependent variable and the value predicted using the estimated regression equation; for the ith observation the ith residual is $y_i - \hat{y}_i$. |
Least Squares Method | A procedure used to develop the estimated regression equation. The objective is to minimize $\sum (y_i - \hat{y}_i)^2$. |
Mean Square Error | The unbiased estimate of the variance of the error term, $\sigma^2$. It is denoted by MSE or $s^2$. |
Normal Probability Plot | A graph of the standardized residuals plotted against values of the normal scores. This plot helps determine whether the assumption that the error term has a normal probability distribution appears to be valid |
Outlier | A data point or observation that does not fit the trend shown by the remaining data |
Prediction Interval | The interval estimate of an individual value of y for a given value of x. |
Regression Equation | The equation that describes how the mean or expected value of the dependent variable is related to the independent variable; in simple linear regression, $E(y) = \beta_0 + \beta_1 x$. |
Regression Model | The equation that describes how y is related to x and an error term; in simple linear regression, the regression model is $y = \beta_0 + \beta_1 x + \epsilon$. |
Residual Analysis | The analysis of the residuals used to determine whether the assumptions made about the regression model appear to be valid. Residual analysis is also used to identify outliers and influential observations |
Residual Plot | Graphical representation of the residuals that can be used to determine whether the assumptions made about the regression model appear to be valid |
Scatter Diagram | A graph of bivariate data in which the independent variable is on the horizontal axis and the dependent variable is on the vertical axis |
Simple Linear Regression | Regression analysis involving one independent variable and one dependent variable in which the relationship between the variables is approximated by a straight line. |
Standard Error of the Estimate | The square root of the mean square error, denoted by $s$. It is the estimate of $\sigma$, the standard deviation of the error term $\epsilon$. |
Standardized Residual | The value obtained by dividing a residual by its standard deviation |
Adjusted Multiple Coefficient of Determination | A measure of the goodness of fit of the estimated multiple regression equation that adjusts for the number of independent variables in the model and thus avoids overestimating the impact of adding more independent variables |
Categorical Independent Variable | An independent variable with categorical data |
Cook's Distance Measure | A measure of the influence of an observation based on both the leverage of observation i and the residual for observation i |
Dummy Variable | A variable used to model the effect of categorical independent variables. A dummy variable may take only the value zero or one |
Estimated Logistic Regression Equation | The estimate of the logistic regression equation based on sample data; that is, $\hat{y} = \text{estimate of } P(y = 1 \mid x_1, x_2, \ldots, x_p)$. |
Influential Observation | An observation that has a strong influence on the regression results. |
Least Squares Method | The method used to develop the estimated regression equation. It minimizes the sum of squared residuals (the deviations between the observed values of the dependent variable, $y_i$, and the predicted values of the dependent variable, $\hat{y}_i$). |
Leverage | A measure of how far the values of the independent variables are from their mean values |
Multicollinearity | The term used to describe the correlation among the independent variables. |
Multiple Coefficient of Determination | A measure of the goodness of fit of the estimated multiple regression equation. It can be interpreted as the proportion of the variability in the dependent variable that is explained by the estimated regression equation |
Multiple Regression Analysis | Regression analysis involving two or more independent variables |
Multiple Regression Equation | The mathematical equation relating the expected value or mean value of the dependent variable to the values of the independent variables; that is, $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p$. |
Multiple Regression Model | The mathematical equation that describes how the dependent variable $y$ is related to the independent variables $x_1, x_2, \ldots, x_p$ and an error term $\epsilon$. |
Outlier | An observation that does not fit the pattern of the other data |
Studentized Deleted Residuals | Standardized residuals that are based on a revised standard error of the estimate obtained by deleting observation i from the data set and then performing the regression analysis and computations |
Additive Decomposition Model | In an additive decomposition model, the actual time series value at time period t is obtained by adding the values of a trend component, a seasonal component, and an irregular component. |
Cyclical Pattern | A cyclical pattern exists if the time series plot shows an alternating sequence of points below and above the trend line lasting more than one year. |
Deseasonalized Time Series | A time series from which the effect of season has been removed by dividing each original time series observation by the corresponding seasonal index. |
Exponential Smoothing | A forecasting method that uses a weighted average of past time series values as the forecast; it is a special case of the weighted moving averages method in which we select only one weight, the weight for the most recent observation. |
Forecast Error | The difference between the actual time series value and the forecast. |
Horizontal Pattern | A horizontal pattern exists when the data fluctuate around a constant mean. |
Mean Absolute Error | The average of the absolute values of the forecast errors. |
Mean Absolute Percentage Error | The average of the absolute values of the percentage forecast errors. |
Mean Squared Error | The average of the squared forecast errors. |
Moving Averages | A forecasting method that uses the average of the most recent k data values in the time series as the forecast for the next period. |
Multiplicative Decomposition Model | In a multiplicative decomposition model, the actual time series value at time period t is obtained by multiplying the values of a trend component, a seasonal component, and an irregular component. |
Seasonal Pattern | A seasonal pattern exists if the time series plot exhibits a repeating pattern over successive periods. The successive periods are often one-year intervals, which is where the name seasonal pattern comes from. |
Smoothing Constant | A parameter of the exponential smoothing model that provides the weight given to the most recent time series value in the calculation of the forecast value. |
Stationary Time Series | A time series whose statistical properties are independent of time. For a stationary time series, the process generating the data has a constant mean and the variability of the time series is constant over time. |
Time Series | A sequence of observations on a variable measured at successive points in time or over successive periods of time. |
Time Series Decomposition | A time series method that is used to separate or decompose a time series into seasonal and trend components. |
Time Series Plot | A graphical presentation of the relationship between time and the time series variable. Time is shown on the horizontal axis and the time series values are shown on the vertical axis. |
Trend Pattern | A trend pattern exists if the time series plot shows gradual shifts or movements to relatively higher or lower values over a longer period of time. |
Weighted Moving Averages | A forecasting method that involves selecting a different weight for each of the most recent k data values in the time series and then computing a weighted average of the values. The sum of the weights must equal one. |
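
Several of the terms above are easier to remember once computed by hand. The following is a minimal Python sketch, not taken from the course material, that works through the least squares method, the coefficient of determination, the regression mean square error, the standard error of the estimate, moving averages, exponential smoothing, and the forecast accuracy measures. All function names are hypothetical, and it assumes the common textbook conventions that the regression MSE divides SSE by n − 2 and that the first exponential smoothing forecast equals the first observed value.

```python
from math import sqrt


def fit_simple_regression(x, y):
    """Least squares estimates (b0, b1) for y-hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def regression_summary(x, y):
    """Residuals, r-squared, MSE and standard error for simple regression."""
    n = len(x)
    b0, b1 = fit_simple_regression(x, y)
    y_bar = sum(y) / n
    # ith residual: observed y_i minus predicted y-hat_i
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)          # sum of squared residuals
    sst = sum((yi - y_bar) ** 2 for yi in y)      # total sum of squares
    mse = sse / (n - 2)                           # assumed divisor: n - 2
    return {
        "b0": b0,
        "b1": b1,
        "residuals": residuals,
        "r_squared": 1 - sse / sst,               # coefficient of determination
        "mse": mse,
        "std_error": sqrt(mse),                   # standard error of the estimate
    }


def moving_average_forecasts(series, k):
    """Forecast each period from the mean of the k values just before it."""
    return [sum(series[t - k:t]) / k for t in range(k, len(series))]


def exponential_smoothing_forecasts(series, alpha):
    """Forecasts for periods 2..n; alpha is the smoothing constant.

    Assumes the forecast for period 2 is the period-1 actual value, then
    F(t+1) = alpha * Y(t) + (1 - alpha) * F(t).
    """
    forecasts = [series[0]]
    for y_t in series[1:-1]:
        forecasts.append(alpha * y_t + (1 - alpha) * forecasts[-1])
    return forecasts


def forecast_accuracy(actual, forecast):
    """MAE, MSE and MAPE from paired actual and forecast values."""
    errors = [a - f for a, f in zip(actual, forecast)]  # forecast errors
    n = len(errors)
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "mse": sum(e ** 2 for e in errors) / n,
        "mape": sum(abs(e / a) for e, a in zip(errors, actual)) / n * 100,
    }


if __name__ == "__main__":
    # Made-up illustrative data, not from the source.
    print(regression_summary([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))
    sales = [17, 21, 19, 23, 18]
    smoothed = exponential_smoothing_forecasts(sales, alpha=0.2)
    print(forecast_accuracy(sales[1:], smoothed))
```

Note that the two "mean square error" entries in the table are different quantities: the regression version divides the sum of squared residuals by its degrees of freedom, while the forecasting version is a plain average of squared forecast errors. The sketch keeps them in separate functions for that reason.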