BMR 617: Statistical Techniques for the Biomedical Sciences

Inference: Two-Class t-test (Two-Sample t-test)

This type of t-test compares the means of a quantitative variable between two independent groups.

Recall sample mean and population mean.

We estimate the mean of quantitative (Q) variables, such as weight, height, blood pressure, and IQ.

Sample mean $$ { \bar{x} = \sum_{i=1}^n \frac{x_{i}}{n} = \frac{x_{1}+x_{2}+\cdots+x_{n}}{n} } $$ where
x_i = i-th value of x
n = sample size (total number of observations)

It is basically the sum of all values divided by the total number of values.

The sample mean is a point estimate of the population mean (μ), i.e., we estimate the population mean ("true mean") by the sample mean.
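As a small illustration of the formula above, the sample mean can be computed by hand and checked against R's built-in `mean()`. The five weights below are made-up values used only for this sketch.

```r
# Hypothetical body weights in grams (made up for illustration only)
x <- c(30.1, 32.4, 29.8, 33.0, 31.2)

n <- length(x)              # sample size
xbar_manual <- sum(x) / n   # sum of all values divided by the number of values
xbar_builtin <- mean(x)     # built-in equivalent

xbar_manual
xbar_builtin
```

Both approaches give the same point estimate of the population mean.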

Normal Population with Known Population Standard Deviation σ

Recall normal population with known population standard deviation.

We will use the following variables in this section:

H₀: null hypothesis
H_a: alternative hypothesis
μ: population mean
σ: population standard deviation
n: size of random sample
X̄: sample mean
x̄: computed sample mean
μ₁: mean of population 1
μ₂: mean of population 2
σ₁²: variance of population 1
σ₂²: variance of population 2
σ²: common population variance
X̄₁ and X̄₂: sample means of groups 1 and 2, respectively
x̄₁ and x̄₂: computed (observed) sample means of groups 1 and 2, respectively
n₁ and n₂: sample sizes of groups 1 and 2, respectively

Standardizing X̄ gives the standard normal variable Z, which has a mean of zero (0) and a standard deviation of one (1): $$ {Z = \frac{\bar{X} - \mu}{\frac{\sigma}{\sqrt{n}}}} $$

For two independent normal populations, standardizing X̄₁ - X̄₂ gives the standardized variable $$ {Z = \frac{\bar{X_1} - \bar{X_2} - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{\bar{X_1} - \bar{X_2} - (\mu_1 - \mu_2)}{\sqrt{\sigma^2({\frac{1}{n_1} + \frac{1}{n_2}})}}} $$ where the second form holds when the two populations have equal variances (σ₁² = σ₂² = σ²), i.e., the only possible difference is where they are centered.

Under the null hypothesis of no difference in the population means, μ₁ - μ₂ is set to zero (0).
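The two-sample Z statistic can be sketched in R as follows. All numbers here (sample means, sample sizes, and the known common σ) are hypothetical and chosen only to show the arithmetic; in practice σ is rarely known, which is why the t-tests in the following sections are used instead.

```r
# Hypothetical summary values; sigma is assumed known and common to both populations
xbar1 <- 32.0   # observed sample mean, group 1
xbar2 <- 36.0   # observed sample mean, group 2
n1 <- 15
n2 <- 14
sigma <- 5      # assumed known common population standard deviation
delta0 <- 0     # mu1 - mu2 under the null hypothesis

# Standard error of the difference in sample means
se <- sqrt(sigma^2 * (1 / n1 + 1 / n2))

# Standardized difference and two-sided p-value from the standard normal
z <- (xbar1 - xbar2 - delta0) / se
p_two_sided <- 2 * pnorm(-abs(z))

z
p_two_sided
```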

Classical t-test (“pooled”)

The classical t-test is used when the variances of the two groups being compared are equal.

We will use the following variables in this section:

H₀: null hypothesis
H_a: alternative hypothesis
X̄: sample mean
x̄: computed sample mean
μ₁: mean of population 1
μ₂: mean of population 2
σ₁²: variance of population 1
σ₂²: variance of population 2
σ²: common population variance
X̄₁ and X̄₂: sample means of groups 1 and 2, respectively
x̄₁ and x̄₂: computed (observed) sample means of groups 1 and 2, respectively
n₁ and n₂: sample sizes of groups 1 and 2, respectively
S_p²: common sample variance
s₁² and s₂²: computed sample variances of groups 1 and 2, respectively
s_p²: computed common sample variance (estimator of the pooled variance of groups 1 and 2)
s_p: computed common sample standard deviation
df: degree(s) of freedom

H₀: μ₁ = μ₂ or μ₁ - μ₂ = 0
H_a: μ₁ ≠ μ₂ or μ₁ - μ₂ ≠ 0

H₀: μ₁ ≤ μ₂ or μ₁ - μ₂ ≤ 0
H_a: μ₁ > μ₂ or μ₁ - μ₂ > 0

H₀: μ₁ ≥ μ₂ or μ₁ - μ₂ ≥ 0
H_a: μ₁ < μ₂ or μ₁ - μ₂ < 0

From statistical theory, if the common sample variance S_p² replaces the common population variance σ² in Z, the resulting standardized variable follows a t-distribution with n₁ + n₂ - 2 degrees of freedom (df). The test statistic T and its computed value t both have the form (sample mean difference - hypothesized population mean difference) / standard error: $$ {T = \frac{\bar{X_1} - \bar{X_2} - \Delta_0}{\sqrt{S_p^2({\frac{1}{n_1} + \frac{1}{n_2}})}}} $$ and $$ {t = \frac{\bar{x_1} - \bar{x_2} - \Delta_0}{\sqrt{s_p^2({\frac{1}{n_1} + \frac{1}{n_2}})}} = \frac{\bar{x_1} - \bar{x_2}}{s_p\sqrt{{\frac{1}{n_1} + \frac{1}{n_2}}}}} $$ where μ₁ - μ₂ is replaced by the null value Δ₀, which equals zero here.

The computed common sample variance, which has n₁ + n₂ - 2 degrees of freedom, is $$ {s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} } $$
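To make the pooled formulas concrete, the sketch below computes s_p², t, and df by hand and compares them with `t.test(..., var.equal = TRUE)`. The two vectors `g1` and `g2` are hypothetical values made up for this illustration; they are not the course data.

```r
# Hypothetical measurements for two independent groups (illustration only)
g1 <- c(30.1, 32.4, 29.8, 33.0, 31.2, 34.5)
g2 <- c(35.2, 36.8, 33.9, 38.1, 37.0)

n1 <- length(g1)
n2 <- length(g2)

# Pooled (common) sample variance: df-weighted average of the two sample variances
sp2 <- ((n1 - 1) * var(g1) + (n2 - 1) * var(g2)) / (n1 + n2 - 2)

# Pooled t statistic with null value Delta_0 = 0
t_manual <- (mean(g1) - mean(g2)) / sqrt(sp2 * (1 / n1 + 1 / n2))
df_manual <- n1 + n2 - 2

# Built-in classical (pooled) t-test for comparison
fit <- t.test(g1, g2, var.equal = TRUE)

c(manual = t_manual, builtin = unname(fit$statistic))
c(manual = df_manual, builtin = unname(fit$parameter))
```

The hand-computed statistic and degrees of freedom should agree with the values reported by `t.test()`.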

Alternative Hypothesis	Rejection Region for Level α Test
H_a: μ₁ ≠ μ₂ or μ₁ - μ₂ ≠ 0	either t ≥ t_α/2,n1+n2-2 or t ≤ -t_α/2,n1+n2-2 (two-tailed test)
H_a: μ₁ > μ₂ or μ₁ - μ₂ > 0	t ≥ t_α,n1+n2-2 (upper-tailed test)
H_a: μ₁ < μ₂ or μ₁ - μ₂ < 0	t ≤ -t_α,n1+n2-2 (lower-tailed test)
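The rejection regions can be expressed with `qt()`, which returns quantiles of the t-distribution. The helper `reject_h0()` below is a hypothetical function written for this sketch (it is not part of base R). By symmetry of the t-distribution, the lower-tailed critical value is the negative of the upper-tailed one.

```r
# Hypothetical helper: TRUE if the computed t falls in the rejection region
reject_h0 <- function(t, df, alpha = 0.05,
                      alternative = c("two.sided", "greater", "less")) {
  alternative <- match.arg(alternative)
  switch(alternative,
         two.sided = abs(t) >= qt(1 - alpha / 2, df),  # |t| >= t_{alpha/2, df}
         greater   = t >= qt(1 - alpha, df),           # t >= t_{alpha, df}
         less      = t <= -qt(1 - alpha, df))          # t <= -t_{alpha, df}
}

# Example with n1 = 15 and n2 = 14, so df = n1 + n2 - 2 = 27
df <- 15 + 14 - 2
qt(1 - 0.05 / 2, df)   # two-tailed critical value
qt(1 - 0.05, df)       # one-tailed critical value

reject_h0(-2.2229, df)                       # two-tailed test
reject_h0(-2.2229, df, alternative = "less") # lower-tailed test
```

Because the function takes df as an argument, the same sketch applies to any t-test once its degrees of freedom are known.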

Welch t-test (“unpooled”)

The Welch t-test is used when the variances of the two groups being compared differ from each other.

We will use the following variables in this section:

H₀: null hypothesis
H_a: alternative hypothesis
X̄₁ and X̄₂: sample means of groups 1 and 2, respectively
x̄₁ and x̄₂: computed (observed) sample means of groups 1 and 2, respectively
n₁ and n₂: sample sizes of groups 1 and 2, respectively
s₁² and s₂²: computed sample variances of groups 1 and 2, respectively
df: degree(s) of freedom

H₀: μ₁ = μ₂ or μ₁ - μ₂ = 0
H_a: μ₁ ≠ μ₂ or μ₁ - μ₂ ≠ 0

H₀: μ₁ ≤ μ₂ or μ₁ - μ₂ ≤ 0
H_a: μ₁ > μ₂ or μ₁ - μ₂ > 0

H₀: μ₁ ≥ μ₂ or μ₁ - μ₂ ≥ 0
H_a: μ₁ < μ₂ or μ₁ - μ₂ < 0

The test statistic is the Welch t-statistic (with the null value of μ₁ - μ₂ equal to zero) $$ {t = \frac{\bar{x_1} - \bar{x_2}}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}} $$ and the number of degrees of freedom (df) is estimated by $$ {df = \frac{(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2})^2}{(\frac{1}{n_1 - 1})(\frac{s_1^2}{n_1})^2 + (\frac{1}{n_2 - 1})(\frac{s_2^2}{n_2})^2}} $$

Some references use ν to denote the degrees of freedom of an “unpooled” (Welch) t-test. Unlike the pooled case, this df is generally not a whole number.
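The Welch statistic and its estimated degrees of freedom can be checked by hand in the same way. The sketch below uses hypothetical vectors `g1` and `g2` (made up for illustration) and compares the manual results with `t.test()`, whose default is `var.equal = FALSE`.

```r
# Hypothetical measurements for two independent groups (illustration only)
g1 <- c(30.1, 32.4, 29.8, 33.0, 31.2, 34.5)
g2 <- c(35.2, 36.8, 33.9, 38.1, 37.0)

n1 <- length(g1)
n2 <- length(g2)

# Per-group variance of the sample mean: s^2 / n
v1 <- var(g1) / n1
v2 <- var(g2) / n2

# Welch t statistic (unpooled standard error)
t_manual <- (mean(g1) - mean(g2)) / sqrt(v1 + v2)

# Estimated (Welch-Satterthwaite) degrees of freedom
df_manual <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

# Built-in Welch t-test; var.equal = FALSE is the default
fit <- t.test(g1, g2)

c(manual = t_manual, builtin = unname(fit$statistic))
c(manual = df_manual, builtin = unname(fit$parameter))
```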

Alternative Hypothesis	Rejection Region for Level α Test
H_a: μ₁ ≠ μ₂ or μ₁ - μ₂ ≠ 0	either t ≥ t_α/2,df or t ≤ -t_α/2,df (two-tailed test)
H_a: μ₁ > μ₂ or μ₁ - μ₂ > 0	t ≥ t_α,df (upper-tailed test)
H_a: μ₁ < μ₂ or μ₁ - μ₂ < 0	t ≤ -t_α,df (lower-tailed test)

Two-Sample t-test using R

Let us use the metabolic data set from a mouse experiment in Dr. Kim’s lab.


	library(tidyverse)
	# Read the data and split MouseID on "-" into Strain, Diet, and ID columns
	met <- read_csv("https://denvirlab.marshall.edu/BMR617-2022/data/TH-B6-metabolic.csv") %>%
		separate(MouseID, sep = "-", into = c("Strain", "Diet", "ID"))

It contains the body weights of 29 mice: 15 B6 (group 1) and 14 TH (group 2).

You want to ask the question, "Is the mean weight of B6 (μ₁) different from that of TH (μ₂)?"

H₀: μ₁ = μ₂ or μ₁ - μ₂ = 0
H_a: μ₁ ≠ μ₂ or μ₁ - μ₂ ≠ 0

H₀: μ₁ ≤ μ₂ or μ₁ - μ₂ ≤ 0
H_a: μ₁ > μ₂ or μ₁ - μ₂ > 0

H₀: μ₁ ≥ μ₂ or μ₁ - μ₂ ≥ 0
H_a: μ₁ < μ₂ or μ₁ - μ₂ < 0

You can create vectors containing the body weight per strain by:


	b6_bw <- pull((filter(met, Strain=="B6")), BodyWeight)
	th_bw <- pull((filter(met, Strain=="TH")), BodyWeight)

Does each of them follow a normal distribution?

We can run the Shapiro-Wilk normality test.


	shapiro.test(b6_bw)
	shapiro.test(th_bw)

Output for B6:


		Shapiro-Wilk normality test

	data:  b6_bw
	W = 0.85243, p-value = 0.0188

Since the p-value of 0.0188 is less than our predetermined threshold of 0.05, we would reject the null hypothesis that the population is normally distributed. For this sample, we would conclude that the distribution of body weights of B6 mice does not follow a normal distribution.

Output for TH:


		Shapiro-Wilk normality test

	data:  th_bw
	W = 0.93361, p-value = 0.3425

Since the p-value of 0.3425 is greater than our predetermined threshold of 0.05, we cannot reject the null hypothesis that the population is normally distributed. For this sample, we have no evidence that the body weights of TH mice deviate from a normal distribution, so we treat them as approximately normal.

Since the body weight data for B6 mice do not appear to follow a normal distribution, the normality assumption of the t-test is, strictly speaking, not met. For illustration purposes, let us assume that both groups follow a normal distribution.

Are their variances equal? Use the F-test to test for homogeneity of variances. In R, we use var.test().


	var.test(BodyWeight ~ Strain, data = met)

Output for F-test:


		F test to compare two variances

	data:  BodyWeight by Strain
	F = 0.68079, num df = 14, denom df = 13, p-value = 0.4844
	alternative hypothesis: true ratio of variances is not equal to 1
	95 percent confidence interval:
	 0.2209036 2.0504749
	sample estimates:
	ratio of variances 
			 0.6807926

Since the p-value of 0.4844 is greater than our predetermined threshold of 0.05, we cannot reject the null hypothesis that there is no difference between the two variances, i.e., in this sample, we failed to demonstrate a difference between the variance of body weight of B6 and variance of body weight of TH. This means that we can use the classic t-test.
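As a check on how the F-test output is put together, the reported p-value and confidence interval can be approximately reproduced from the reported F ratio and degrees of freedom using `pf()` and `qf()`. This is a sketch that starts from the rounded values printed above, so small rounding differences are expected.

```r
# Values taken from the var.test() output above
f_stat <- 0.6807926   # ratio of sample variances (B6 / TH)
df1 <- 14             # numerator df  (n1 - 1)
df2 <- 13             # denominator df (n2 - 1)

# Two-sided p-value: twice the smaller tail area of the F distribution
p_lower <- pf(f_stat, df1, df2)
p_value <- 2 * min(p_lower, 1 - p_lower)

# 95% confidence interval for the true ratio of variances
ci <- c(f_stat / qf(0.975, df1, df2),
        f_stat / qf(0.025, df1, df2))

p_value
ci
```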

Use classic t-test


	t.test(BodyWeight ~ Strain, data = met, var.equal = TRUE)

Output for Classic t-test:


		Two Sample t-test

	data:  BodyWeight by Strain
	t = -2.2229, df = 27, p-value = 0.03479
	alternative hypothesis: true difference in means between group B6 and group TH is not equal to 0
	95 percent confidence interval:
	 -8.2435537 -0.3298761
	sample estimates:
	mean in group B6 mean in group TH 
			31.96400         36.25071

Since the p-value of 0.03479 is less than our predetermined threshold of 0.05, we would reject the null hypothesis that there is no difference between the two means. For this sample, we would conclude that there is a significant difference between the mean body weights of B6 and TH mice.
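The reported p-value and confidence interval can be approximately recovered from the printed t statistic, degrees of freedom, and group means. The sketch below works only from the rounded numbers in the output above, so the results match to a few decimal places rather than exactly.

```r
# Values taken from the t.test() output above
t_stat <- -2.2229
df <- 27                 # n1 + n2 - 2 = 15 + 14 - 2
mean_b6 <- 31.96400
mean_th <- 36.25071

# Two-sided p-value: area in both tails beyond |t|
p_value <- 2 * pt(-abs(t_stat), df)

# Observed difference in means and the standard error implied by t = diff / se
diff_means <- mean_b6 - mean_th
se <- diff_means / t_stat

# 95% confidence interval for mu1 - mu2
ci <- diff_means + c(-1, 1) * qt(0.975, df) * se

p_value
ci
```

The interval excludes zero, which is consistent with rejecting the null hypothesis at the 0.05 level in the two-tailed test.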

References

Devore, J.L. (2010). Probability and Statistics for Engineering and the Sciences (8th ed.). Cengage Learning, Boston, MA, USA. https://faculty.ksu.edu.sa/sites/default/files/probability_and_statistics_for_engineering_and_the_sciences.pdf

Motulsky, H. (2018). Intuitive Biostatistics: A Nonmathematical Guide to Statistical Thinking (4th ed.). Oxford University Press, New York, NY, USA. pp. 318-328.

Elston, R.C. and Johnson, W.D. (2008). Basic Biostatistics for Geneticists and Epidemiologists: A Practical Approach. John Wiley & Sons Ltd, West Sussex, UK.