Statistics Test 1
Question | Answer |
---|---|
observation | is an individual entity in a study |
variable | characteristic that may differ among individuals. |
Sample data | collected when measurements are taken from a subset of a larger population. |
Population data | collected when all individuals in a population are measured. |
statistic | a summary measure of sample data. |
parameter | summary measure of population data. |
categorical variables | consist of group or category names that don’t necessarily have a logical ordering. Examples: eye color, country of residence. |
ordinal variables | categorical variables for which the categories have a logical ordering. Examples: highest educational degree earned, tee shirt size (S, M, L, XL). |
quantitative variables | consist of numerical values taken on each individual. Examples: height, number of siblings. |
explanatory variable and response variable | In general, the value of the explanatory variable for an individual is thought to partially explain the value of the response variable for that individual. |
relative frequency distribution | is a listing of all categories along with their relative frequencies (given as proportions or percentages, for example). |
A frequency distribution for a categorical variable | is a listing of all categories along with their frequencies (counts). |
Bar Graphs | useful for summarizing one or two categorical variables and particularly useful for making comparisons when there are two categorical variables. |
Pie Charts | useful for summarizing a single categorical variable if there are not too many categories. |
extremes | the highest and lowest values |
quartiles | medians of lower and upper halves of the values |
Outliers | data points that are not consistent with the bulk of the data |
Shape | whether the values are clumped in the middle or at one end |
Spread | variability, e.g., the difference between the two extremes or the two quartiles. |
Location | center or average, e.g., the median. |
Histograms | similar to bar graphs, used for any number of data values. |
Stem-and-leaf plots and dotplots | present all individual values, useful for small to moderate sized data sets. |
Boxplot or box-and-whisker plot | useful summary for comparing two or more groups. |
To illustrate shape | histograms and stem-and-leaf plots are best. |
To illustrate location and spread, | any of the pictures work well |
To see individual values | use stem-and-leaf plots and dotplots. |
To sort values | use stem-and-leaf plots. |
To identify outliers | using the standard definition, use a boxplot. |
To compare groups | use side-by-side boxplots. |
What would outliers do to the mean | High outliers will increase the mean. Low outliers will decrease the mean. |
The Influence of Outliers on the Mean and Median | Outliers have a larger influence on the mean than on the median. |
Range | high value – low value |
Interquartile Range (IQR) = | upper quartile – lower quartile |
Q1 = lower quartile | median of data values that are below the median |
Q3 = upper quartile | median of data values that are above the median |
Quartile Percentiles | Lower quartile = 25th percentile; median = 50th percentile; upper quartile = 75th percentile. |
The greater the distance a value is from the center, the fewer individuals have that value | describes a “bell-shaped” distribution. A special case is called a normal distribution or normal curve. |
Standard deviation | measures variability by summarizing how far individual data values are from the mean. |
Descriptive Statistics | numerical and graphical summaries to characterize a data set or describe a relationship. |
Inferential Statistics: | using sample information to make conclusions about a broader range of individuals than just those observed. |
Population | the entire group of units about which inferences are to be made. |
Sample | the smaller group of units actually measured or surveyed. |
Census: | every unit in the population is measured or surveyed. |
Simple Random Sample | every conceivable group of units of the required size from the population has the same chance to be the selected sample. |
Sample Survey | a subgroup of a large population is questioned on a set of topics. A special type of observational study. Less costly and less time-consuming than a census. |
Advantages of a Sample Survey over a Census | (1) Sometimes a census isn’t possible, e.g., when measurements destroy units. (2) Speed, especially if the population is large. (3) Accuracy: resources can be devoted to getting accurate sample results. |
Selection bias | occurs if the method for selecting participants produces a sample that does not represent the population of interest. |
Nonparticipation bias (nonresponse bias) | occurs when a representative sample is chosen but a subset cannot be contacted or doesn’t respond. |
Biased response or response bias | occurs when participants respond differently from how they truly feel. |
Stratified Random Sampling | Divide population of units into groups (called strata) and take a simple random sample from each of the strata. |
Cluster Sampling | Divide population of units into groups (called clusters), take a random sample of clusters and measure only those items in these clusters. |
Systematic Sampling | Order the population of units in some way, select one of the first k units at random, and then select every kth unit thereafter. |
Multistage Sampling | Using a combination of the sampling methods, at various stages. |
Random-Digit Dialing | A method that approximates a simple random sample of all households in the United States that have telephones. |
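
The definitions of range, quartiles, and IQR in the table, and the card on how outliers affect the mean and median, can be checked with a short Python sketch. The data values and the function name `quartiles` are made up for illustration. The quartile rule used here is the "median of each half" rule from the cards, with the middle value left out of both halves when the number of values is odd; software packages often use slightly different rules, so their quartiles may not match exactly.

```python
from statistics import mean, median, stdev


def quartiles(values):
    """Return (Q1, median, Q3) using the 'median of each half' rule.

    Assumption: when the number of values is odd, the middle value is
    excluded from both halves. Requires at least two values.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]           # values below the median position
    upper = data[half + n % 2:]   # values above the median position
    return median(lower), median(data), median(upper)


# Hypothetical test scores, invented for this example.
scores = [62, 70, 71, 75, 78, 80, 84, 85, 91]

q1, med, q3 = quartiles(scores)
data_range = max(scores) - min(scores)   # high value - low value
iqr = q3 - q1                            # upper quartile - lower quartile
spread_sd = stdev(scores)                # sample standard deviation

# Add one high outlier and compare how far the mean and the median move.
with_outlier = scores + [200]
mean_shift = mean(with_outlier) - mean(scores)
median_shift = median(with_outlier) - median(scores)

print("five-number summary:", min(scores), q1, med, q3, max(scores))
print("range:", data_range, "IQR:", iqr, "standard deviation:", round(spread_sd, 2))
print("shift in mean:", round(mean_shift, 2), "shift in median:", median_shift)
```

Running the sketch shows the high outlier pulling the mean up much more than the median, which is why the median and IQR are often preferred as summaries when outliers are present.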