2Qunatit
data analysis
Question | Answer |
---|---|
useful graphs | scatterplot: gives a sense of the nature of the relationship |
what to look for in a graph | location, spread, shape, nature of the relationship, outliers |
location | where most of the data lies |
spread | variability of the data, how far apart or close together it is |
shape | symmetric, skewed, etc. |
nature of relationship | existent/non-existent, strong/weak, increasing/decreasing, linear/non-linear |
outliers in scatterplots | may represent unexplainable anomalies in the data, or could reveal possible systematic structure worthy of investigation |
causal relationship | relationship between two variables where one variable causes changes to another |
explanatory variable | explains or causes the change; plotted on the x-axis |
response variable | is changed; plotted on the y-axis |
useful numbers | correlation and regression |
formula for the correlation coefficient | $r = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right) \left(\frac{y_i - \bar{y}}{s_y}\right)$ |
xi or yi | the individual data values on the corresponding axis |
xbar or ybar | mean of the data values on the corresponding axis |
sx or sy | standard deviation of the data values on the corresponding axis |
properties of r | close to 1 = strong positive linear relationship; close to -1 = strong negative linear relationship; close to 0 = weak or non-existent linear relationship |
cautions about the use of r | only useful for describing linear relationships; sensitive to outliers |
regression models | focus on general linear relationships between variables; negative = decrease |
what regression modelling does | describes the behaviour of the response variable (the variable of interest) in terms of a collection of predictors (related variables, i.e. explanatory variable(s)) |
a linear framework is used to look at? | the relationship between the response and the regressors; formula: Y = α + βx, where α is the intercept and β is the slope |
ideal model for linear framework in terms of responses and regressors | one unique response to one given regressor |
real world model for linear framework in terms of responses and regressors | must approximate |
statistical model | relates the response to physical model predictions; allows for better predictions and quantification of uncertainty concerning the response, in order to make decisions |
what does regression analysis do? | finds the best relationship between responses and regressors for a particular class of models |
experimenter controls predictors, why? | may be important for making inferences about the effect of predictors on response |
course assumption | predictors are controlled in an experiment or at least accurately measured |
define a good statistical model | fit, predictive performance, parsimony, interpretability |
qualitative description of model | response = signal + noise: Y = α + βx + ε, where ε = noise |
define signal | the systematic part of the model: variation in the response explained in terms of predictors, using a small number of unknown parameters |
define noise | residual variation left unexplained by the systematic part of the model; can be described in terms of unknown parameters |
what does a good statistical model do to possibly large and complex data | reduces it to a small number of parameters |
a model will fit well if | the systematic part of the model describes much of the variation in the response (low noise); a large number of parameters may be required to do this |
define parsimony: | smaller number of parameters = greater reduction of data, more useful for making a decision |
there is a cycle between what? | tentative model formulation, estimation of parameters and model criticism |
a good model will | manage the balance between goodness of fit and complexity; provide a useful reduction of the data |
model response variable in terms of a single predictor | yn = values of the response variable |
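
The correlation formula and the model Y = α + βx + ε from the cards above can be tried out on a small data set. The following is only a rough sketch: the data values are made up for illustration, the function names `correlation` and `fit_line` are hypothetical, and least squares is assumed as the estimation method, since the cards do not name one.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Sample correlation: r = 1/(n-1) * sum(((x_i - xbar)/s_x) * ((y_i - ybar)/s_y))."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1 divisor)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return total / (n - 1)


def fit_line(xs, ys):
    """Estimate (alpha, beta) in Y = alpha + beta * x.

    Assumption: ordinary least squares, for which
    beta = r * s_y / s_x and alpha = ybar - beta * xbar.
    """
    beta = correlation(xs, ys) * stdev(ys) / stdev(xs)
    alpha = mean(ys) - beta * mean(xs)
    return alpha, beta


# Hypothetical data: explanatory variable xs, response variable ys.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

r = correlation(xs, ys)
alpha, beta = fit_line(xs, ys)

# Noise: what the systematic part alpha + beta * x leaves unexplained.
residuals = [y - (alpha + beta * x) for x, y in zip(xs, ys)]

# Replace the last response with an outlier to see how sensitive r is.
r_outlier = correlation(xs, ys[:-1] + [0.0])

print("r =", r)
print("alpha =", alpha, "beta =", beta)
print("residuals =", residuals)
print("r with one outlier =", r_outlier)
```

Comparing the two printed values of r shows the caution from the cards in action: changing a single point can move r a long way, so a scatterplot is worth checking before trusting the number.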