Inferential Statistics Part III – ANOVA

07-Apr-18 . 9 mins read

R statistics

One-way ANOVA
Two way ANOVA
Post-hoc analysis
Example
Inference
ANOVA Assumptions
Unbalanced ANOVA
Summary

Photo by Tisma Jrdl on Unsplash

In the previous post, we looked at t-tests for comparing the means of one or two samples. T-tests can still be used when there are more than two samples, but there are two issues with that approach:

Comparing every sample with every other sample is tedious: k samples need k(k-1)/2 pairwise tests.
The probability of making at least one Type I error (a false positive, where we reject a null hypothesis that is actually true) grows quickly with the number of tests.
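
To put a number on the second issue, here is a minimal sketch of how the chance of at least one false positive grows with the number of pairwise tests. It assumes the tests are independent, which is a simplification, and the function name `familywise_error` is purely illustrative.

```r
# Family-wise error rate for all pairwise comparisons among k groups,
# assuming independent tests (a simplifying assumption).
familywise_error <- function(k, alpha = 0.05) {
  m <- choose(k, 2)     # number of pairwise t-tests
  1 - (1 - alpha)^m     # P(at least one false positive)
}

groups <- 2:6
data.frame(groups      = groups,
           comparisons = choose(groups, 2),
           fwer        = round(familywise_error(groups), 3))
```

With three groups there are already three comparisons, and the error rate under these assumptions is roughly 14% rather than 5%.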

The ANOVA (analysis of variance) methods were developed by Ronald Fisher as an extension of the t and z tests. They compare the between-group variability with the within-group variability, and can be used to test whether the means of two or more groups differ.

There are two types of ANOVA tests covered here:

One-way ANOVA is used when there is one independent variable (a factor with two or more levels) and one dependent variable
Two-way ANOVA is used when there are two independent variables (each a factor with two or more levels) and one dependent variable

One-way ANOVA

ANOVA differs from the tests seen so far in that several intermediate measures have to be computed and then combined into the F statistic.
The measures that go into the ANOVA table are given below.

Total Sum of Squares (SST) is the sum of squared differences between each value $y_i$ and the grand mean $\bar{y}$ over all $N$ observations:
$$SST = \sum_{i=1}^{N} (y_i - \bar{y})^2$$
Sum of Squares Between (SSB), for $k$ groups with $n_j$ observations in group $j$ and $\bar{y}_j$ being the mean of group $j$, is given by:
$$SSB = \sum_{j=1}^{k} n_j (\bar{y}_j - \bar{y})^2$$
Sum of Squares Within groups (SSW), where $y_{ij}$ is observation $i$ in group $j$:
$$SSW = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (y_{ij} - \bar{y}_j)^2 = SST - SSB$$
The degrees of freedom are given by:
$$df_B = k - 1, \qquad df_W = N - k, \qquad df_T = N - 1$$
We then calculate the mean squares by dividing each sum of squares by its degrees of freedom.
- $MSB = SSB / (k - 1)$ measures between-group variability
- $MSW = SSW / (N - k)$ measures variability within each of the groups
And finally the F statistic is the ratio:
$$F = \frac{MSB}{MSW}$$
When the null hypothesis is true, any differences among the sample means are due to chance alone, so MSB and MSW should be roughly equal.
F grows as MSB becomes larger than MSW, and a large F is strong evidence against the null hypothesis. If there is no difference between the groups, F will be close to 1 and we fail to reject the null hypothesis.
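
The steps above can be sketched by hand in a few lines of R. This example assumes the built-in ToothGrowth data (used in the Example section further down) with dose as the grouping variable; the variable names are illustrative only.

```r
# One-way ANOVA "by hand" for tooth length grouped by dose.
y <- ToothGrowth$len
g <- factor(ToothGrowth$dose)

N <- length(y)                       # total number of observations
k <- nlevels(g)                      # number of groups
grand_mean  <- mean(y)
group_means <- tapply(y, g, mean)
group_sizes <- tapply(y, g, length)

sst <- sum((y - grand_mean)^2)                          # total
ssb <- sum(group_sizes * (group_means - grand_mean)^2)  # between groups
ssw <- sum((y - ave(y, g))^2)                           # within groups

msb <- ssb / (k - 1)
msw <- ssw / (N - k)
f_stat <- msb / msw

round(c(SST = sst, SSB = ssb, SSW = ssw, MSB = msb, MSW = msw, F = f_stat), 2)
```

The resulting F should agree with the `F value` column that `summary(aov(len ~ factor(dose), data = ToothGrowth))` reports.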

Two way ANOVA

In one-way ANOVA, we had one dependent variable and one independent variable.
In two-way ANOVA there are two independent variables, so we compute an F ratio for each of them (the main effects) and a further F ratio for the interaction between the two, i.e. whether the effect of one independent variable depends on the level of the other.
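
In R's formula syntax, the two main effects and their interaction are usually written with `*`. The sketch below, which assumes the built-in ToothGrowth data and illustrative object names, shows that `dose * supp` expands to `dose + supp + dose:supp` and that both spellings fit the same model.

```r
# `dose * supp` is shorthand for both main effects plus their interaction.
term_labels <- attr(terms(len ~ dose * supp), "term.labels")
term_labels

tg <- transform(ToothGrowth, dose = factor(dose))
fit_short <- aov(len ~ dose * supp, data = tg)
fit_long  <- aov(len ~ dose + supp + dose:supp, data = tg)

# Both spellings give identical fitted values.
all.equal(fitted(fit_short), fitted(fit_long))
```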

Post-hoc analysis

The ANOVA test tells us which variables have significantly different group means, but not which pairs of groups differ. To find that out, we need to run a post-hoc analysis.
Two common post-hoc methods are Tukey's Honest Significant Differences (HSD) and the Bonferroni correction.
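
Tukey HSD is used in the example further down. For the Bonferroni approach, one option in base R is `pairwise.t.test()`; the sketch below assumes the built-in ToothGrowth data and compares the three dose groups, with `bonf` as an illustrative name.

```r
# Bonferroni-adjusted pairwise t-tests between dose groups
# (pooled standard deviation, the default in pairwise.t.test).
bonf <- pairwise.t.test(ToothGrowth$len, factor(ToothGrowth$dose),
                        p.adjust.method = "bonferroni")

# Lower-triangular matrix of adjusted p-values, one per pair of doses.
bonf$p.value
```

Each raw p-value is multiplied by the number of comparisons (capped at 1), which makes Bonferroni simple but somewhat conservative.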

Example

We will use the ToothGrowth dataset in R.
- The dataset records tooth growth in 60 guinea pigs that were given one of three doses of vitamin C (0.5, 1 and 2 mg/day).
- Each dose was delivered using one of two supplement types: OJ (orange juice) or VC (ascorbic acid, a form of vitamin C).
A quick look at the boxplot shows a couple of outliers, but we will ignore them. It also indicates that the 2 mg/day dose resulted in the longest tooth growth, and so did the 1 mg/day dose when given in orange juice.

str(ToothGrowth)

## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...

ToothGrowth$dose <- as.factor(ToothGrowth$dose)

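library(ggplot2)

# varPlotCaption, theme_darklightmix() and ggplot_color_theme are not part of
# ggplot2; they come from setup code not shown here. Remove the caption
# argument and the theme_darklightmix() line to run this standalone.
ggplot(data = ToothGrowth) +
  geom_boxplot(aes(x = dose, y = len, fill = supp)) +
  labs(subtitle = "Box plot of tooth growth for various doses",
       caption = varPlotCaption) +
  theme_darklightmix(color_theme = ggplot_color_theme) +
  scale_fill_brewer(palette = "Set3")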

We will run one-way ANOVA on dosage and supplement separately and then run the two-way ANOVA with both of them together with their interaction

One-way ANOVA on dose

Null hypothesis:
- Mean tooth growth is the same for the 3 doses
Interpretation of the results
- p-value < 0.001:
  - Reject the null hypothesis, i.e.
  - Mean tooth growth is not the same for all doses: dose has an effect on tooth growth
- F statistic of 67.42:
  - Much higher than the critical value of about 3.16 at α = 0.05 (df1 = 2, df2 = 57)
  - i.e. between-group variability (MSB) is much higher than within-group variability (MSW)
- TukeyHSD:
  - All three pairwise differences are significant, and mean tooth growth is higher for larger doses

summary(aov(len~dose,data=ToothGrowth))

##             Df Sum Sq Mean Sq F value               Pr(>F)    
## dose         2   2426    1213   67.42 0.000000000000000953 ***
## Residuals   57   1026      18                                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# dose has 3 levels, so df1 = 3 - 1 = 2 and df2 = 60 - 3 = 57
paste("Critical value of F Statistic:",round(qf(df1=2,df2 = 57, p= 0.05,lower.tail=F),2))

## [1] "Critical value of F Statistic: 3.16"
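
Comparing F with a critical value and comparing the p-value with the significance level lead to the same decision. As a rough sketch, taking the F value from the ANOVA table above and the degrees of freedom for dose (2) and residuals (57), with illustrative variable names:

```r
f_value <- 67.42   # F value reported by summary(aov(len ~ dose, ...))
df1 <- 2           # dose: 3 levels - 1
df2 <- 57          # residuals: 60 observations - 3 groups

f_crit  <- qf(0.05, df1, df2, lower.tail = FALSE)     # 5% critical value
p_value <- pf(f_value, df1, df2, lower.tail = FALSE)  # upper-tail probability

round(f_crit, 2)
f_value > f_crit   # decision via the critical value
p_value < 0.05     # decision via the p-value
```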

TukeyHSD(aov(len~factor(dose),data=ToothGrowth))

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = len ~ factor(dose), data = ToothGrowth)
## 
## $`factor(dose)`
##         diff       lwr       upr     p adj
## 1-0.5  9.130  5.901805 12.358195 0.0000000
## 2-0.5 15.495 12.266805 18.723195 0.0000000
## 2-1    6.365  3.136805  9.593195 0.0000425

One-way ANOVA on supplement type

Null hypothesis:
- Mean tooth growth is the same for the 2 supplement types
Interpretation of results
- p-value > 0.05:
  - Fail to reject the null hypothesis, i.e.
  - There is not enough evidence at the 5% level that supplement type, taken on its own, affects mean tooth growth
- F statistic of 3.668:
  - Lower than the critical value of about 4.01 at α = 0.05
  - i.e. between-group variability (MSB) is higher than within-group variability (MSW), but not by enough to be significant
- TukeyHSD:
  - Not run, as the null hypothesis is not rejected (and with only two groups there is a single pair to compare anyway)

summary(aov(len~supp,data=ToothGrowth))

##             Df Sum Sq Mean Sq F value Pr(>F)  
## supp         1    205  205.35   3.668 0.0604 .
## Residuals   58   3247   55.98                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

paste("Critical value of F Statistic:",qf(df1=1,df2 = 58, p= 0.05,lower.tail=F))

## [1] "Critical value of F Statistic: 4.00687288633273"
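
Because supplement type has only two levels, this one-way ANOVA is equivalent to the pooled-variance two-sample t-test from the previous post: the F statistic is the square of the t statistic and the p-values match. A small sketch to check this, with illustrative object names:

```r
# With only two groups, one-way ANOVA matches the pooled-variance t-test.
tt  <- t.test(len ~ supp, data = ToothGrowth, var.equal = TRUE)
fit <- anova(lm(len ~ supp, data = ToothGrowth))

t_squared <- unname(tt$statistic)^2
f_value   <- fit["supp", "F value"]

c(t_squared = t_squared, F = f_value)
c(t_test_p = tt$p.value, anova_p = fit["supp", "Pr(>F)"])
```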

Two-way ANOVA on both

Null hypotheses:
- Mean tooth growth is the same for the 3 doses
- Mean tooth growth is the same for the 2 supplement types
- There is no interaction between dose and supplement type, i.e. the effect of one does not depend on the other
Interpretation of results
- p-values:
  - Dose has p-value < 0.05
    - Reject the null hypothesis that the 3 doses have the same effect on tooth growth, i.e.
    - The 3 doses are significantly different, as seen in the one-way ANOVA
  - Supplement type now has p-value < 0.05
    - Reject the null hypothesis that the 2 supplement types have the same effect on tooth growth, i.e.
    - Supplement type now has a significant effect on tooth growth after controlling for the level of dose and the dose:supp interaction (these absorb variation that the one-way model left in the residuals)
  - dose:supp has p-value < 0.05 (0.0219)
    - Reject the null hypothesis of no interaction
    - Had the significance level been set at, say, 0.02, we would have failed to reject it, i.e. found no evidence of an interaction effect on tooth growth
- TukeyHSD:
  - dose:
    - The interpretation for dose is the same as in the one-way ANOVA
  - supp:
    - Mean tooth growth with VC is lower than with OJ (diff of -3.7)
    - Tukey HSD is not really needed here, as there are only two factor levels
  - dose:supp:
    - We will look at a couple of cases
    - 0.5:VC-2:OJ: mean tooth growth for 0.5:VC is much lower than for 2:OJ (diff of -18.08), and the difference is significant with p adj < 0.05
    - 2:VC-1:OJ: the diff in mean tooth growth is 3.44 with p adj > 0.05, so there is no significant difference between these two combinations

summary(aov(len~dose*supp,data=ToothGrowth))

##             Df Sum Sq Mean Sq F value               Pr(>F)    
## dose         2 2426.4  1213.2  92.000 < 0.0000000000000002 ***
## supp         1  205.4   205.4  15.572             0.000231 ***
## dose:supp    2  108.3    54.2   4.107             0.021860 *  
## Residuals   54  712.1    13.2                                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

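# Critical values here need each term's own df: df1=2 for dose and dose:supp, df1=1 for supp, df2=54
#paste("Critical value of F Statistic:",qf(df1=2,df2 = 54, p= 0.05,lower.tail=F))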

TukeyHSD(aov(len~factor(dose)*supp,data=ToothGrowth))

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = len ~ factor(dose) * supp, data = ToothGrowth)
## 
## $`factor(dose)`
##         diff       lwr       upr     p adj
## 1-0.5  9.130  6.362488 11.897512 0.0000000
## 2-0.5 15.495 12.727488 18.262512 0.0000000
## 2-1    6.365  3.597488  9.132512 0.0000027
## 
## $supp
##       diff       lwr       upr     p adj
## VC-OJ -3.7 -5.579828 -1.820172 0.0002312
## 
## $`factor(dose):supp`
##                 diff        lwr         upr     p adj
## 1:OJ-0.5:OJ     9.47   4.671876  14.2681238 0.0000046
## 2:OJ-0.5:OJ    12.83   8.031876  17.6281238 0.0000000
## 0.5:VC-0.5:OJ  -5.25 -10.048124  -0.4518762 0.0242521
## 1:VC-0.5:OJ     3.54  -1.258124   8.3381238 0.2640208
## 2:VC-0.5:OJ    12.91   8.111876  17.7081238 0.0000000
## 2:OJ-1:OJ       3.36  -1.438124   8.1581238 0.3187361
## 0.5:VC-1:OJ   -14.72 -19.518124  -9.9218762 0.0000000
## 1:VC-1:OJ      -5.93 -10.728124  -1.1318762 0.0073930
## 2:VC-1:OJ       3.44  -1.358124   8.2381238 0.2936430
## 0.5:VC-2:OJ   -18.08 -22.878124 -13.2818762 0.0000000
## 1:VC-2:OJ      -9.29 -14.088124  -4.4918762 0.0000069
## 2:VC-2:OJ       0.08  -4.718124   4.8781238 1.0000000
## 1:VC-0.5:VC     8.79   3.991876  13.5881238 0.0000210
## 2:VC-0.5:VC    18.16  13.361876  22.9581238 0.0000000
## 2:VC-1:VC       9.37   4.571876  14.1681238 0.0000058

Inference

dose has the most significant effect on tooth growth
supplement type on its own is not significant at the 5% level, but it becomes significant once dose is included in the model, and the significant interaction shows that its effect depends on the dose
Thus changing the dose of vitamin C, or the supplement type used to deliver it, has a significant impact on tooth growth
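
The cell means behind the Tukey table help explain the role of supplement type. The sketch below (object names are illustrative) tabulates mean tooth length for each supplement and dose combination, then the OJ minus VC gap at each dose.

```r
# Mean tooth length for each supplement/dose cell.
cell_means <- with(ToothGrowth,
                   tapply(len, list(supp = supp, dose = dose), mean))
round(cell_means, 2)

# OJ minus VC at each dose.
gap <- cell_means["OJ", ] - cell_means["VC", ]
round(gap, 2)
```

The gap is several units at 0.5 and 1 mg/day but close to zero at 2 mg/day, which is the pattern the dose:supp interaction term picks up.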

ANOVA Assumptions

ANOVA has three assumptions
- All observations are independent of one another and randomly selected from the population which they represent
- The variance across groups must be roughly the same (homoscedasticity)
- The distribution should be approximately normal at each factor level

The homogeneity of variances can be checked using the residuals vs fitted plot, which shows that the residuals are scattered evenly around the zero line and that there is no relationship between residuals and fitted values. It can also be checked using car::leveneTest().

plot(aov(len~dose*supp,data=ToothGrowth),1)
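
For readers without the car package, the median-centred form of Levene's test can be sketched in base R as a one-way ANOVA on the absolute deviations from each cell's median. This is an approximation of what `car::leveneTest()` does by default, written from its documented definition rather than taken from the package, and the column names `cell` and `abs_dev` are made up for this example.

```r
# Levene's test (median-centred, i.e. the Brown-Forsythe variant) in base R:
# a one-way ANOVA on absolute deviations from each cell's median.
tg <- ToothGrowth
tg$cell    <- interaction(tg$supp, tg$dose)   # 6 supp x dose cells
tg$abs_dev <- abs(tg$len - ave(tg$len, tg$cell, FUN = median))

levene <- anova(lm(abs_dev ~ cell, data = tg))
levene

# If car is installed, this should give a matching table:
# car::leveneTest(len ~ supp * factor(dose), data = ToothGrowth)
```

A p-value above the chosen significance level means there is no evidence that the variances differ across the cells.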

The normality assumption can be checked using the QQ plot, which shows that the residuals lie along the diagonal line. It can also be verified using the Shapiro-Wilk test; here the p-value of 0.67 gives no evidence against normality. If the check had failed, we could transform the dependent variable.

plot(aov(len~dose*supp,data=ToothGrowth),2)

shapiro.test(x = residuals(aov(len~dose*supp,data=ToothGrowth)))

## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(aov(len ~ dose * supp, data = ToothGrowth))
## W = 0.98499, p-value = 0.6694

Unbalanced ANOVA

The ToothGrowth dataset is balanced: the number of observations in each group is equal.
If it were not balanced, we would use car::Anova() with Type II or Type III sums of squares.

library(knitr)       # kable()
library(kableExtra)  # kable_styling(), also provides %>%

table(ToothGrowth$supp,ToothGrowth$dose) %>%
  kable(format.args = list(decimal.mark = '.', big.mark = ",")) %>%
  kable_styling(bootstrap_options = "condensed"
              ,full_width = FALSE
              ,position = "center"
              ,font_size = 10)

	0.5	1	2
OJ	10	10	10
VC	10	10	10
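
To see why balance matters, the sketch below makes the data unbalanced by dropping the first five rows (an arbitrary choice, purely for illustration) and compares the sequential sum of squares for supp when it is entered before or after dose. The helper `ss_supp` is a made-up name for this example.

```r
# Sequential (Type I) sums of squares depend on term order once the design
# is unbalanced. Dropping the first 5 rows is arbitrary, for illustration.
tg <- transform(ToothGrowth, dose = factor(dose))
unbalanced <- tg[-(1:5), ]
table(unbalanced$supp, unbalanced$dose)

ss_supp <- function(formula, data) {
  anova(lm(formula, data = data))["supp", "Sum Sq"]
}

# Balanced data: the order of terms does not matter.
balanced_ss <- c(first = ss_supp(len ~ supp + dose, tg),
                 last  = ss_supp(len ~ dose + supp, tg))
# Unbalanced data: the sum of squares for supp changes with the order.
unbalanced_ss <- c(first = ss_supp(len ~ supp + dose, unbalanced),
                   last  = ss_supp(len ~ dose + supp, unbalanced))
balanced_ss
unbalanced_ss
```

Type II and Type III sums of squares avoid this order dependence by adjusting each term for the others, which is why `car::Anova()` is the usual choice for unbalanced designs.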

Summary

The null hypothesis for ANOVA is that the mean of the dependent variable is the same for all groups
Further learning
- Unbalanced design experiments for ANOVA
- Three classes of models
  - Fixed-effects model (class I)
  - Random-effects model (class II)
  - Mixed-effects model (class III)
- See also http://www.sthda.com/english/wiki/two-way-anova-test-in-r
