| Question | Answer |
| Define population | The whole set of items that are of interest |
| Define sample | Some subset of the population intended to represent the population |
| Define sampling unit | Each individual thing in the population that can be sampled |
| Define sampling frame | A list in which the sampling units of a population are individually named or numbered |
| Define census | Data collected from the entire population |
| Define simple random sample | -Each sample has an equal chance of selection
-Each item has a number
-Items chosen using a random number generator |
| Advantages of simple random sample | -No bias
-Easy
-Cheap
-Equal selection chance |
| Disadvantages of simple random sample | -Not suitable for large population
-Sampling frame needed |
| Define systematic sample | -Elements ordered into list
-Every kth element
-k=pop size/samp size
-Start at random number between 1 and k |
| Advantages of systematic sample | -Simple
-Quick
-Suitable for large populations |
| Disadvantages of systematic sample | -Sampling frame needed
-Can introduce bias if the sampling frame is not random |
| Define stratified sample | -Population divided into strata
-Simple random sample for each group
-Samp size/pop size sampled from each group
-Used when population is large and divided into groups |
| Advantages of stratified sample | -Reflects population structure
-Proportional representation within population |
| Disadvantages of stratified sample | -Population must be clearly classified into distinct strata
-Selection within each stratum suffers from the same disadvantages as simple random sampling |
| Define quota sample | -Population divided into groups according to a given characteristic
-Interviewer fills quotas that reflect each group's proportion of the population |
| Advantages of quota sample | -Small sample is still representative
-Easy
-Cheap
-Allows easy comparison between groups |
| Disadvantages of quota sample | -Can introduce bias
-Population must be divided into groups
-Non-responses are not recorded |
| Define opportunity sample | Sample taken from people who are available at the time and who meet the criteria |
| Advantages of opportunity sample | -Easy
-Cheap |
| Disadvantages of opportunity sample | -Not representative
-Dependent on the individual researcher |
| What is the equation for a stratified sample | Number sampled from a stratum = stratum size x sample size / total population |
| What is the difference between qualitative and quantitative data | Qualitative- Descriptive
Quantitative- Numerical |
| What is the difference between discrete and continuous data | Discrete- Only takes certain values
Continuous- Can take any value in a given range |
| How do we find outliers | Greater than Q3+k(Q3-Q1)
Less than Q1-k(Q3-Q1) |
| Define cleaning the data | Removing outliers |
| What do we plot for cumulative frequency diagrams | Cumulative frequency (y-axis) against the upper class boundary (x-axis) |
| What is the equation for frequency density | Frequency density=(Frequency x k)/Class width |
| When do we use a histogram | Continuous data |
| When do we use a bar chart | Discrete data |
| What do we comment on when comparing data | -Measure of location
-Measure of spread |
| What axis is the independent variable on | X |
| What axis is the dependent variable on | Y |
| Define bivariate | There are pairs of values for two variables |
| Define causal relationship | A change in one variable causes a change in the other |
| Define interpolation | Estimating a variable within the data range |
| Define extrapolation | Estimating a variable outside the data range |
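| What is the purpose of a regression line | To model the linear relationship between the two variables; the least squares line minimises the sum of the squared vertical distances of the points from the line |
| When can we use regression lines | To estimate the dependent variable for values of the independent variable within the data range (interpolation) |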
| Define mutually exclusive | If one event happens the other events can't happen |
| If events are mutually exclusive:
P(A or B)= | P(A)+P(B) |
| Define independent events | One event has no effect on the other |
| If events are independent
P(A and B)= | P(A) x P(B) |
| Define random variable | A variable whose value depends on the outcome of a random event |
| Define discrete uniform distribution | All outcomes have the same probability |
| ΣP(X=x)= | 1 |
| When can you model a random variable with a binomial distribution | -Fixed number of trials (n)
-2 possible outcomes (success or failure)
-Fixed probability of success (p)
-Trials are independent of each other |
| P(X<Y)= | P(X≤Y-1) |
| P(X≥Y)= | 1-P(X≤Y-1) |
| P(X>Y)= | 1-P(X≤Y) |
| Define test statistic | The result of the experiment or the statistic that is calculated |
| Define null hypothesis | The hypothesis you assume to be correct |
| Define alternate hypothesis | Tells you about the parameter if your assumption is wrong |
| Define critical region | A region of the probability distribution which if the test statistic falls within it would cause you to reject the null hypothesis |
| Define critical value | The first value to fall in the critical region |
| What is the actual significance level | The probability of incorrectly rejecting the null hypothesis |
| What are the steps of a one tailed hypothesis test | -Formulate a model for test statistic
-Identify suitable null and alternate hypotheses
-Calculate the probability of test statistic being observed assuming null hypothesis is true
-Compare to significance level
-Write conclusion in context of question |
| What must you do for a two tailed hypothesis test | Halve the significance level, so that half is in each tail |
| If y=ax^n then logy= | loga+nlogx |
| If y=kb^x then logy= | logk+xlogb |
| What does the PMCC describe | The strength and direction of the correlation |
| When can the PMCC be used | If there is LINEAR correlation |
| If we are hypothesis testing for correlation what are the null and alternate hypotheses | H0: ρ=0
H1: ρ≠0 (two tailed; use ρ>0 or ρ<0 for a one tailed test) |
| How do we write the probability that B occurs given that A has already occurred | P(B|A) |
| What is the rule for independent events and conditional probability | P(A|B)=P(A|B')=P(A) |
| P(A)+P(B)-P(A∩B)= | P(A∪B) |
| (P(B∩A))/(P(A))= | P(B|A) |
| What are the probability symbols for and and or | And=∩
Or=∪ |
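
Several of the formulas in the cards above can be checked numerically. The following is a minimal Python sketch, not a definitive implementation: all function names are hypothetical, `k=1.5` for the outlier fences is an assumed common default (the card leaves k unspecified), and the binomial helper assumes X takes integer values from 0 to n.

```python
import random
from math import comb


def systematic_sample(items, sample_size, rng=random):
    """Take every kth element, starting at a random position between 1 and k."""
    k = len(items) // sample_size          # k = population size / sample size
    start = rng.randint(1, k)              # 1-based random starting position
    # Positions start, start + k, start + 2k, ... (converted to 0-based indices)
    return [items[i - 1] for i in range(start, start + k * sample_size, k)]


def stratified_allocation(strata_sizes, sample_size):
    """Number to sample from each stratum: stratum size x sample size / total."""
    total = sum(strata_sizes.values())
    # Results are left unrounded; rounding to whole items is up to the user.
    return {name: size * sample_size / total
            for name, size in strata_sizes.items()}


def outlier_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences: Q1 - k(Q3 - Q1) and Q3 + k(Q3 - Q1)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def binomial_cdf(n, p, x):
    """P(X <= x) for X ~ B(n, p), summed directly from the binomial formula."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(0, x + 1))


def prob_at_least(n, p, y):
    """P(X >= y) = 1 - P(X <= y - 1), as on the card above."""
    return 1 - binomial_cdf(n, p, y - 1)
```

For example, `stratified_allocation({"Year 12": 120, "Year 13": 80}, 50)` allocates 30 and 20 to the two (made-up) strata, and any value outside the pair returned by `outlier_fences(q1, q3)` would be flagged as an outlier.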