Last time we discussed the methods section of a journal article, with respect to presenting statistical methods. The methods section should include details about which statistical analyses were used, and which software was used to implement them. Ultimately, the aim is that anyone should be able to reproduce the results given the raw data and the information in the article.
This time we'll look at presenting results in more detail. We'll continue with our example analysis:
library(tidyverse)
met <- read_csv("https://denvirlab.marshall.edu/BMR617-2022/data/TH-B6-metabolic.csv") %>%
  separate(MouseID, into=c("Strain", "Diet", "ID"), sep='-')
aov.full <- aov(Cholesterol ~ Strain * Diet, data = met)
aov.full$coefficients
confint(aov.full)
summary(aov.full)
png(filename = "figure1.png", width=4*960, height = 4*480, res=4*72)
ggplot(met, aes(x=Strain, y=Cholesterol, fill=Diet)) +
  geom_boxplot(outlier.shape = NA) +
  geom_point(position=position_jitterdodge(jitter.width = 0.1, dodge.width = 0.75)) +
  ylab("Cholesterol Level (mg/dl)") +
  ggtitle("Cholesterol Level by Mouse Strain and Diet")
dev.off()
sessionInfo()
Here are the results of the ANOVA:
> aov.full <- aov(Cholesterol ~ Strain * Diet, data = met)
> aov.full$coefficients
    (Intercept)        StrainTH          DietHF          DietLF StrainTH:DietHF StrainTH:DietLF
      42.945000       58.290000       72.140998       38.783333        8.934004        7.221669
> confint(aov.full)
                      2.5 %    97.5 %
(Intercept)      13.1490209  72.74098
StrainTH         19.8235566  96.75644
DietHF           32.1654973 112.11650
DietLF            0.3168892  77.24978
StrainTH:DietHF -49.1490529  67.01706
StrainTH:DietLF -45.5208619  59.96420
> summary(aov.full)
            Df Sum Sq Mean Sq F value   Pr(>F)
Strain       1  19984   19984  24.082 5.87e-05 ***
Diet         2  25773   12887  15.529 5.40e-05 ***
Strain:Diet  2    102      51   0.062     0.94
Residuals   23  19086     830
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The key points here are:
We can summarize the results above in a narrative. Each statement should be backed with quantities and confidence intervals. We should also include p-values. The following provides a concise, precise summary of the results:
To create a table with the results, we need to combine different parts of the output from R into a single table. We can do this with a bit of effort and some data wrangling in R. We'd like to create a table with the parameters, estimates, and 95% confidence intervals.
The estimates are contained in aov.full$coefficients, as a named numeric vector.
The 95% confidence intervals are contained in confint(aov.full), which
is a matrix. The row names of the matrix contain the parameter names.
We can start by converting the matrix containing the confidence intervals to a tidyverse data table:
resultsTable <- as_tibble(confint(aov.full), rownames="Parameter")
Now let's add a column for the estimate. We can turn the vector aov.full$coefficients
into another data table, and bind its columns (cbind(...)) with our existing table:
resultsTable <- cbind(resultsTable, tibble(Estimate = aov.full$coefficients))
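cbind() pairs rows by position, which works here because the coefficients and the confidence intervals list the parameters in the same order. As a hedged alternative, the sketch below looks up each estimate by parameter name instead. It uses base R and three of the values copied from the output above, so it runs without the data file; coefs, ci, and resultsSketch are names made up for this illustration.

```r
# Three estimates copied from the output above (a named numeric vector, not a list)
coefs <- c("(Intercept)" = 42.945, StrainTH = 58.29, DietHF = 72.140998)

# Matching confidence limits, stored as a matrix with parameter names as row names
ci <- matrix(c(13.1490209, 19.8235566, 32.1654973,
               72.74098, 96.75644, 112.11650),
             ncol = 2,
             dimnames = list(names(coefs), c("2.5 %", "97.5 %")))

# Build the table from the matrix, then look up each estimate by name
resultsSketch <- data.frame(Parameter = rownames(ci), ci,
                            check.names = FALSE, row.names = NULL)
resultsSketch$Estimate <- unname(coefs[resultsSketch$Parameter])
resultsSketch
```

Matching by name gives the same table as cbind() here, but it would still be correct if the two objects ever came back in different orders.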
Let's tidy this up a bit. We can round the numerical columns to two digits, and
combine the two columns representing the confidence interval into a single column:
resultsTable <- resultsTable %>%
  mutate(`2.5 %` = round(`2.5 %`, digits = 2),
         `97.5 %` = round(`97.5 %`, digits = 2),
         Estimate = round(Estimate, digits = 2)) %>%
  mutate(`95% Confidence Interval`=paste0("[", `2.5 %`, ", ", `97.5 %`, "]"))
and finally we can just select the columns we need:
resultsTable <- resultsTable %>%
  select(Parameter, Estimate, `95% Confidence Interval`)
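One detail of the rounding step is worth knowing: round() returns numbers, so paste0() drops any trailing zeros when it converts them to text. The sketch below shows one possible alternative using base R's sprintf(), which always prints a fixed number of decimal places. format_ci is a hypothetical helper name, not part of the analysis above.

```r
# format_ci is a hypothetical helper (not from the original analysis).
# sprintf("%.2f", x) always prints two decimal places, so trailing zeros survive.
format_ci <- function(lower, upper, digits = 2) {
  fmt <- paste0("%.", digits, "f")
  paste0("[", sprintf(fmt, lower), ", ", sprintf(fmt, upper), "]")
}

# Bounds for StrainTH copied from the confint() output above
format_ci(19.8235566, 96.75644)                          # "[19.82, 96.76]"

# Illustrative values that end in a zero
paste0("[", round(0.30, 2), ", ", round(77.20, 2), "]")  # "[0.3, 77.2]"
format_ci(0.30, 77.20)                                   # "[0.30, 77.20]"
```

Inside mutate(), a call such as format_ci(`2.5 %`, `97.5 %`) could stand in for the separate rounding and paste0() steps.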
We can save this as a CSV file and paste it into a Word document.
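A minimal sketch of the export step, using base R's write.csv() (readr's write_csv() from the tidyverse would do the same job). To keep it runnable without the data file, it rebuilds a three-row excerpt of the table from the values shown above; the file name resultsTable.csv is an arbitrary choice for illustration.

```r
# Three-row excerpt of the results table, with values taken from the output above
resultsTable <- data.frame(
  Parameter = c("StrainTH", "DietHF", "DietLF"),
  Estimate = c(58.29, 72.14, 38.78),
  `95% Confidence Interval` = c("[19.82, 96.76]", "[32.17, 112.12]", "[0.32, 77.25]"),
  check.names = FALSE  # keep the column name with spaces and "%" unchanged
)

# csv_file is a hypothetical file name; row.names = FALSE avoids an extra index column
csv_file <- "resultsTable.csv"
write.csv(resultsTable, csv_file, row.names = FALSE)
```

The resulting file opens in Excel, from where the table can be copied into Word.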
RStudio has options for exporting graphics directly. However, this will not give you publication-quality images. A better way is to export a figure to a graphics file from code.
The basic code structure is
png("ImageFilename.png", ...)
# Graphics commands
dev.off()
This will create an image file with the name ImageFilename.png.
Any graphics commands will be written to an off-screen "graphics device", and
when dev.off() ("device off") is called, all the graphics will be
written to that file, and the file will be closed.
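A minimal, self-contained sketch of this pattern, using a base R plot so that it runs without the data file. out_file is a temporary path chosen for illustration.

```r
# out_file is a hypothetical output path in R's temporary directory
out_file <- file.path(tempdir(), "ImageFilename.png")

png(out_file, width = 480, height = 480)  # open the off-screen device
plot(1:10, (1:10)^2, main = "Example")    # drawn to the file, not the screen
dev.off()                                 # write the image and close the file

file.exists(out_file)
```

One thing to watch for: when the plotting code runs inside a function or a loop, a ggplot object is not printed automatically, so it needs an explicit print() before dev.off(). Otherwise the file may come out blank.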
Computer graphics are represented by an array of individual dots, called pixels. Each pixel is a small rectangle in one solid color. Our aim is to generate images that look good on a screen and in print. On a screen, a user might zoom in to see more detail.
When we create the png file, we can specify the size in pixels. The default size is 480 by 480, which is not a very high resolution.
png("lowResChol.png", width=480, height=480)
ggplot(met, aes(x=Strain, y=Cholesterol, fill=Diet)) +
  geom_boxplot(outlier.shape = NA) +
  geom_point(position=position_jitterdodge(jitter.width = 0.1, dodge.width = 0.75)) +
  ylab("Cholesterol Level (mg/dl)") +
  ggtitle("Cholesterol Level by Mouse Strain and Diet")
dev.off()
If we increase the image size, we get a much higher-resolution image:
png("hiResChol.png", width=8*480, height=8*480)
ggplot(met, aes(x=Strain, y=Cholesterol, fill=Diet)) +
  geom_boxplot(outlier.shape = NA) +
  geom_point(position=position_jitterdodge(jitter.width = 0.1, dodge.width = 0.75)) +
  ylab("Cholesterol Level (mg/dl)") +
  ggtitle("Cholesterol Level by Mouse Strain and Diet")
dev.off()
The problem now is that the text is too small to read. Text size is not measured in
pixels, but in points. By convention, there are 72 points to an inch, so these
units describe "print size", not "image size". We can control the conversion with the res
parameter, which sets the nominal resolution in pixels per inch, and so determines how
many pixels each point occupies.
png("hiResChol.png", width=4*480, height=4*480, res=4*72)
ggplot(met, aes(x=Strain, y=Cholesterol, fill=Diet)) +
  geom_boxplot(outlier.shape = NA) +
  geom_point(position=position_jitterdodge(jitter.width = 0.1, dodge.width = 0.75)) +
  ylab("Cholesterol Level (mg/dl)") +
  ggtitle("Cholesterol Level by Mouse Strain and Diet")
dev.off()
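The arithmetic behind these settings can be checked directly. The sketch below assumes the 72 points per inch convention mentioned above; print_size_in is a hypothetical helper name, and the output path is a temporary file chosen for illustration.

```r
# print_size_in is a hypothetical helper: nominal width (inches) = pixels / res
print_size_in <- function(pixels, res = 72) pixels / res

print_size_in(480)              # default image: about 6.67 inches wide
print_size_in(8 * 480)          # 3840 px at 72 ppi: about 53.3 inches, so text looks tiny
print_size_in(4 * 480, 4 * 72)  # 1920 px at 288 ppi: about 6.67 inches again

# The same figure can be requested in inches, letting png() work out the pixels.
# 20/3 inches at 288 pixels per inch is 1920 pixels.
out_file <- file.path(tempdir(), "sizeInInches.png")
png(out_file, width = 20 / 3, height = 20 / 3, units = "in", res = 4 * 72)
plot(1:10, main = "Placeholder plot")
dev.off()
```

Scaling the pixel dimensions and res by the same factor keeps the nominal print size fixed, which is why the text keeps its proportions while the image gains detail.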