To motivate the study of ANOVA, we'll use our usual data set, available at
https://denvirlab.marshall.edu/BMR617-2021/data/TH-B6-metabolic.csv.
These data are metabolic measurements for mice from two strains: C57BL/6, the control strain, and TALLYHO, a model strain for metabolic
syndrome. Each mouse was maintained on one of three diets: Chow (the control diet), a high-sugar,
high-fat diet, or a high-sugar, low-fat diet. The data are taken from
Parkman et al. (2016).
The question we want to address is whether Strain and/or Diet affect the metabolic variables. For this
section of the course, we will focus on Cholesterol.
Using R, you can load the data into a "tidyverse"-style table with the following code:
library(tidyverse)
met <- read_csv('https://denvirlab.marshall.edu/BMR617-2021/data/TH-B6-metabolic.csv')
The MouseID column combines the strain, the diet, and an individual ID, separated by hyphens. Split it into separate Strain, Diet, and ID columns. In R:
met <- separate(met, MouseID, sep="-", into=c("Strain", "Diet", "ID"))
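To see what separate does, here is a minimal sketch on a small made-up table. The MouseID values and cholesterol numbers below are hypothetical, and the sketch assumes IDs of the form strain-diet-number; the real file may differ in detail.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data: these IDs and values are made up for illustration
# and are not taken from the real file.
toy <- tibble(
  MouseID = c("B6-Chow-1", "B6-HF-2", "TH-LF-3"),
  Cholesterol = c(100, 120, 150)
)

# Split MouseID at each "-" into three new columns. By default the original
# MouseID column is dropped and the new columns are character columns.
toy_split <- separate(toy, MouseID, sep = "-", into = c("Strain", "Diet", "ID"))
toy_split
```

The new columns take the place of MouseID, so the table now has one column per piece of information.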
View the data, noting the variables (columns) and the values in them.
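One possible way to inspect a table from the console is sketched below on a small made-up stand-in (all values hypothetical). With the real data loaded, the same calls should work on met.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for the real table; values are made up.
toy <- tibble(
  Strain = c("B6", "B6", "B6", "TH", "TH"),
  Diet = c("Chow", "Chow", "HF", "Chow", "LF"),
  Cholesterol = c(100, 105, 120, 140, 150)
)

# One line per column: its name, its type, and the first few values.
glimpse(toy)

# Number of mice in each Strain/Diet combination.
group_sizes <- count(toy, Strain, Diet)
group_sizes
```

In RStudio, View(met) opens the same table in a spreadsheet-style viewer.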
What type and role does the variable Cholesterol have?
Categorical explanatory variable
Categorical response variable
Quantitative explanatory variable
Quantitative response variable
Incorrect
Cholesterol is one of the metabolic variables; remember we are interested in whether or not
Strain and/or Diet affect the metabolic variables. Look at the data to determine what type the
Cholesterol data are.
Correct!
Cholesterol is measured in mg/dL, so it is quantitative, and it is one of the outcomes (responses)
in which we're interested.
What type and role does the variable Strain have?
Categorical explanatory variable
Categorical response variable
Quantitative explanatory variable
Quantitative response variable
Incorrect
Remember we are interested in whether or not
Strain and/or Diet affect the metabolic variables. Look at the data to determine what type the
Strain data are.
Correct!
The variable Strain can take on the values "B6" or "TH", so it is categorical.
Since we're interested in whether or not Strain (and/or Diet) affect the metabolic
variables, Strain is an explanatory variable.
What type and role does the variable Diet have?
Categorical explanatory variable
Categorical response variable
Quantitative explanatory variable
Quantitative response variable
Incorrect
Remember we are interested in whether or not
Strain and/or Diet affect the metabolic variables. Look at the data to determine what type the
Diet data are.
Correct!
The variable Diet can take on the values "Chow", "HF", or "LF", so it is categorical.
Since we're interested in whether or not Diet (and/or Strain) affect the metabolic
variables, Diet is an explanatory variable.
In any statistical software, it is very important that the software "knows" the type of
variable you are using; in particular, it needs to be able to distinguish between
quantitative variables and categorical variables in order to perform the correct analysis.
In R, the correct type for a categorical variable is a factor.
To change the Strain and Diet columns to factors, we can mutate the columns and use the
as.factor function:
met <- mutate(met, Strain = as.factor(Strain), Diet = as.factor(Diet))
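To confirm that a conversion like this has worked, you can check the column types before and after. The following is a minimal sketch on a made-up table (values hypothetical); the same checks should apply to met.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data; values are made up for illustration.
toy <- tibble(
  Strain = c("B6", "TH", "TH"),
  Diet = c("Chow", "HF", "LF"),
  Cholesterol = c(100, 140, 150)
)

# Before conversion, Strain is a plain character column.
class_before <- class(toy$Strain)

# Same conversion as above, applied to the toy table.
toy <- mutate(toy, Strain = as.factor(Strain), Diet = as.factor(Diet))

# After conversion, the categorical columns are factors and the
# quantitative column is still numeric.
is.factor(toy$Strain)
is.factor(toy$Diet)
is.numeric(toy$Cholesterol)
```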
How many levels does the variable Strain have?
0
1
2
3
4
5
6
An infinite number
Incorrect
The number of levels of a categorical variable is the number of values that it can take on,
in the context of the experiment.
Correct!
Strain has two possible values: B6 and TH, so there are two levels.
How many levels does the variable Diet have?
0
1
2
3
4
5
6
An infinite number
Incorrect
The number of levels of a categorical variable is the number of values that it can take on,
in the context of the experiment.
Correct!
Diet has three possible values: Chow, HF, and LF, so there are three levels.
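R can report the levels of a factor directly. The sketch below uses two small made-up vectors (hypothetical values, not the real data); with the real table you would pass met$Strain and met$Diet instead.

```r
# Hypothetical vectors standing in for the Strain and Diet columns.
strain <- as.factor(c("TH", "B6", "TH", "B6"))
diet <- as.factor(c("LF", "Chow", "HF", "Chow"))

# nlevels() gives the number of levels; levels() lists them.
nlevels(strain)
levels(strain)

nlevels(diet)
levels(diet)
```

as.factor sorts the levels of a character vector alphabetically, so here Chow becomes the first level of diet and B6 the first level of strain.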