BMR 617: Statistical Techniques for the Biomedical Sciences

ANOVA Part one: Introduction

Introduction

To motivate the study of ANOVA, we'll use our usual data set, available at https://denvirlab.marshall.edu/BMR617-2021/data/TH-B6-metabolic.csv. It contains metabolic measurements for mice from two strains: C57BL/6, a control strain, and TALLYHO, a model strain for metabolic syndrome. Each mouse was maintained on one of three diets: Chow (the control diet), a high-sugar, high-fat diet, or a high-sugar, low-fat diet. The data are taken from Parkman et al. (2016). The question we want to address is whether Strain and/or Diet affect the metabolic variables. For this section of the course, we will focus on Cholesterol.

Using R, you can load the data into a "tidyverse"-style table with the following code:


library(tidyverse)
met <- read_csv('https://denvirlab.marshall.edu/BMR617-2021/data/TH-B6-metabolic.csv')
	

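The MouseID column combines the strain, the diet, and an individual identifier, separated by hyphens. Split it into separate Strain, Diet, and ID columns. In R: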


met <- separate(met, MouseID, sep="-", into=c("Strain", "Diet", "ID"))
    
View the data, noting the variables (columns) and the values in them.
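To see what the separate step does without downloading anything, here is a minimal sketch on a small made-up table. The MouseID values and Cholesterol numbers below are invented for illustration, and the ID format is an assumption based on the separator and column names used above; they are not taken from the real data set.

```r
library(tidyr)

# Hypothetical toy rows: the IDs and cholesterol values are invented for illustration
toy <- data.frame(
  MouseID = c("B6-Chow-1", "B6-HF-2", "TH-LF-3"),
  Cholesterol = c(100, 120, 150)
)

# Split MouseID at each "-" into three new columns, mirroring the course code
toy <- separate(toy, MouseID, sep = "-", into = c("Strain", "Diet", "ID"))

# Compact overview of the resulting columns and their types
str(toy)
```

On the real table, the same kind of overview is available from str(met), or from View(met) in RStudio.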

  1. What type and role does the variable Cholesterol have?
    1. Categorical explanatory variable
    2. Categorical response variable
    3. Quantitative explanatory variable
    4. Quantitative response variable
    Incorrect
    Cholesterol is one of the metabolic variables; remember we are interested in whether or not Strain and/or Diet affect the metabolic variables. Look at the data to determine what type the Cholesterol data are.
    Correct!
    Cholesterol is measured in mg/dl, so it is quantitative. It is also one of the outcomes (responses) in which we're interested.
  2. What type and role does the variable Strain have?
    1. Categorical explanatory variable
    2. Categorical response variable
    3. Quantitative explanatory variable
    4. Quantitative response variable
    Incorrect
    Remember we are interested in whether or not Strain and/or Diet affect the metabolic variables. Look at the data to determine what type the Strain data are.
    Correct!
    The variable Strain can take on the values "B6" or "TH", so it is categorical. Since we're interested in whether or not Strain (and/or Diet) affect the metabolic variables, Strain is an explanatory variable.
  3. What type and role does the variable Diet have?
    1. Categorical explanatory variable
    2. Categorical response variable
    3. Quantitative explanatory variable
    4. Quantitative response variable
    Incorrect
    Remember we are interested in whether or not Strain and/or Diet affect the metabolic variables. Look at the data to determine what type the Diet data are.
    Correct!
    The variable Diet can take on the values "Chow", "HF", or "LF", so it is categorical. Since we're interested in whether or not Diet (and/or Strain) affect the metabolic variables, Diet is an explanatory variable.

In any statistical software, it is very important that the software "knows" the type of variable you are using; in particular, it needs to be able to distinguish between quantitative variables and categorical variables in order to perform the correct analysis.
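One way to check what R currently "knows" about each column is to ask for its class. The sketch below uses a small invented table as a stand-in for the course data (the values are hypothetical); it suggests that text columns start out as plain character data, which R does not yet treat as categorical.

```r
# Hypothetical toy table standing in for the course data; values are invented
toy <- data.frame(
  Strain = c("B6", "TH", "TH"),
  Diet = c("Chow", "HF", "LF"),
  Cholesterol = c(100, 150, 140)
)

# Report the class R has assigned to each column
col_types <- sapply(toy, class)
print(col_types)
```

Here the numeric column is already recognised as quantitative, while the two text columns are only character strings.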

In R, the correct type for a categorical variable is a factor. To convert the Strain and Diet columns to factors, we can mutate the columns and use the as.factor function:


met <- mutate(met, Strain = as.factor(Strain), Diet = as.factor(Diet))
		
  1. How many levels does the variable Strain have?
    1. 0
    2. 1
    3. 2
    4. 3
    5. 4
    6. 5
    7. 6
    8. An infinite number
    Incorrect
    The number of levels of a categorical variable is the number of values that it can take on, in the context of the experiment.
    Correct!
    Strain has two possible values: B6 and TH, so there are two levels.
  2. How many levels does the variable Diet have?
    1. 0
    2. 1
    3. 2
    4. 3
    5. 4
    6. 5
    7. 6
    8. An infinite number
    Incorrect
    The number of levels of a categorical variable is the number of values that it can take on, in the context of the experiment.
    Correct!
    Diet has three possible values: Chow, HF, and LF, so there are three levels.
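The level counts asked about above can also be checked directly in R. The following is a minimal sketch on an invented table (the rows are hypothetical, using only the strain and diet labels mentioned above); levels() lists the values a factor can take and nlevels() counts them.

```r
library(dplyr)

# Hypothetical toy table: one invented row per Strain/Diet combination
toy <- data.frame(
  Strain = c("B6", "B6", "B6", "TH", "TH", "TH"),
  Diet = c("Chow", "HF", "LF", "Chow", "HF", "LF")
)

# Convert both columns to factors, as in the course code
toy <- mutate(toy, Strain = as.factor(Strain), Diet = as.factor(Diet))

# List the levels of each factor and count them
print(levels(toy$Strain))
print(nlevels(toy$Strain))
print(levels(toy$Diet))
print(nlevels(toy$Diet))
```

Running the same levels() and nlevels() calls on met$Strain and met$Diet should report the levels present in the real data.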