This review includes statistical concepts and R commands from the following lectures:
Understand that in an experiment or study we make observations on a sample but want to make inferences about the population from which the sample is drawn.
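The Central Limit Theorem tells us about the distribution of sample means taken from a given distribution: when the sample size is large enough, the sample means are approximately normally distributed, whatever the shape of the original distribution (provided it has finite variance).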
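A small simulation can make this concrete. The sketch below is only an illustration: the exponential distribution, the sample size of 30, the number of repeats, and the seed are all assumptions chosen for the example, not values from the lectures.

```r
# Illustrative simulation: means of samples drawn from a skewed distribution.
set.seed(1)          # arbitrary seed so the example is reproducible
n <- 30              # size of each sample (assumed for illustration)
n_samples <- 5000    # number of samples drawn (assumed for illustration)

# Draw many samples from an exponential distribution (mean 1, sd 1)
# and record the mean of each one.
sample_means <- replicate(n_samples, mean(rexp(n, rate = 1)))

# The histogram of sample means looks roughly bell-shaped,
# even though the exponential distribution itself is strongly skewed.
hist(sample_means, breaks = 40,
     main = "Means of samples of size 30",
     xlab = "Sample mean")

mean(sample_means)   # close to the population mean, 1
sd(sample_means)     # close to 1 / sqrt(n)
```

Increasing `n` should make the histogram narrower and more symmetric.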
The standard error of the mean, \[\frac{s}{\sqrt{n}}\] where \(s\) is the sample standard deviation, is an estimate of the standard deviation of the means of all samples of size \(n\).
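As a minimal sketch, the standard error of the mean can be computed directly from a sample. The data vector `x` below is made up for illustration.

```r
# Made-up sample, for illustration only
x <- c(4.1, 5.3, 4.8, 6.0, 5.5, 4.9, 5.2, 5.8)

n <- length(x)           # sample size
sem <- sd(x) / sqrt(n)   # sample standard deviation divided by sqrt(n)
sem
```

Note that `sd()` uses the \(n - 1\) denominator, so it returns the sample standard deviation.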
We also looked at how to draw bar charts and error bars in R. We saw how to make error bars representing the standard deviation, standard error of the mean, and 95% confidence intervals.
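One possible way to draw such a chart in base R is sketched below. The data frame `scores`, its column names, and the group labels are hypothetical, and the lectures may have used a different plotting approach. The same `arrows()` call draws any of the three kinds of error bar, depending on which half-width is passed in.

```r
# Hypothetical data: three groups of ten simulated measurements each
set.seed(2)  # arbitrary seed
scores <- data.frame(
  group = rep(c("A", "B", "C"), each = 10),
  value = c(rnorm(10, 10, 2), rnorm(10, 12, 2), rnorm(10, 15, 2))
)

# Per-group summaries
means <- tapply(scores$value, scores$group, mean)
sds   <- tapply(scores$value, scores$group, sd)
ns    <- tapply(scores$value, scores$group, length)
sems  <- sds / sqrt(ns)                    # standard error of the mean
ci95  <- qt(0.975, df = ns - 1) * sems     # half-width of a 95% CI

# Choose which error bar to show: sds, sems, or ci95
half_width <- ci95

# barplot() returns the x positions of the bar midpoints
mids <- barplot(means,
                ylim = c(0, max(means + half_width) * 1.1),
                xlab = "Group", ylab = "Mean value")

# Draw the error bars as double-headed arrows with flat (90 degree) heads
arrows(mids, means - half_width, mids, means + half_width,
       angle = 90, code = 3, length = 0.1)
```

Whichever error bar is used, the figure caption should say which one it is, since the three can differ a lot in length.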
A confidence interval is a range we construct from sample data when estimating a population parameter, for example a mean or a proportion.
We choose a level of confidence, e.g. 95%.
We then construct an interval that we are 95% confident contains the "true" value of that parameter.
If we repeated this process (including the data gathering) over and over again, 95% of the intervals we constructed like this would include the true value.
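The repeated-sampling interpretation can be checked with a simulation. The sketch below assumes a t-based interval for a mean. The helper `ci_for_mean()` is a hypothetical function written for this example, and the population (normal, mean 50, sd 10) and sample size are made up.

```r
set.seed(3)        # arbitrary seed
true_mean <- 50    # assumed "true" population mean
n <- 25            # assumed sample size

# Hypothetical helper: t-based confidence interval for a mean
ci_for_mean <- function(x, level = 0.95) {
  n <- length(x)
  sem <- sd(x) / sqrt(n)
  t_crit <- qt(1 - (1 - level) / 2, df = n - 1)
  c(lower = mean(x) - t_crit * sem, upper = mean(x) + t_crit * sem)
}

# Repeat the whole process (gather data, build interval) many times
# and record whether each interval contains the true mean.
covers <- replicate(2000, {
  ci <- ci_for_mean(rnorm(n, mean = true_mean, sd = 10))
  ci[["lower"]] <= true_mean && true_mean <= ci[["upper"]]
})

coverage <- mean(covers)   # proportion of intervals containing the true mean
coverage
```

The proportion should come out close to 0.95. For a single sample `x`, `t.test(x)$conf.int` gives the same interval as the helper.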
We discussed the general framework for hypothesis testing:
These tests are for the C -> C case (a categorical explanatory variable and a categorical response variable). They test the null hypothesis that the proportion of each possible outcome is the same for each possible value of the explanatory variable.
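As an illustration, one common test of this kind is the chi-squared test on a table of counts. The text above does not name the test, so the choice of `chisq.test()` here is an assumption, and the counts and labels are made up.

```r
# Made-up 2 x 2 table of counts: rows are values of the explanatory
# variable, columns are outcomes
counts <- matrix(c(30, 20,
                   15, 35),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(treatment = c("drug", "placebo"),
                                 outcome = c("improved", "not improved")))

# Observed proportion of each outcome within each row
prop.table(counts, margin = 1)

# Chi-squared test; for 2 x 2 tables chisq.test() applies
# Yates' continuity correction by default
result <- chisq.test(counts)

result$expected   # counts expected if the null hypothesis were true
result$p.value    # small values are evidence against the null hypothesis
```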
t-tests cover the C -> Q case (a categorical explanatory variable and a quantitative response variable), where the explanatory variable has only two possible values.
Make sure you understand when to use paired ("matched pairs") t-tests, and when to use the two-class t-test.
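A minimal sketch of the two cases in R, using made-up data. The paired test applies when each observation in one group is matched with a specific observation in the other (for example, the same subject measured twice). The two-sample test applies when the groups are independent.

```r
# Paired: the same eight (made-up) subjects measured before and after
before <- c(72, 68, 75, 80, 66, 71, 77, 69)
after  <- c(70, 65, 74, 76, 66, 68, 75, 66)
paired_result <- t.test(before, after, paired = TRUE)
paired_result$p.value

# A paired t-test is the same as a one-sample t-test on the differences
t.test(before - after)$p.value

# Two independent (made-up) groups, possibly of different sizes
group_a <- c(5.1, 4.8, 6.0, 5.5, 5.3, 4.9)
group_b <- c(6.2, 5.9, 6.8, 6.1, 7.0, 6.4, 6.6)
two_sample_result <- t.test(group_a, group_b)  # Welch's test by default
two_sample_result$p.value
```

By default `t.test()` does not assume equal variances in the two-sample case. Passing `var.equal = TRUE` gives the classical pooled-variance version.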