Hypothesis Testing — Analysis of Variance (ANOVA)

Ken Hoffman
Published in Analytics Vidhya
Jan 17, 2021

Introduction:

ANOVA (Analysis of Variance) is a statistical test of whether two or more population means are equal. Suppose we want to determine whether multiple groups differ from one another on some measurement. For example, let's say we want to determine whether the number of Uber riders in New York City differs by season. You could use t-tests to determine this, but comparing every pair of the four seasons would require 6 tests (k(k-1)/2 with k = 4 groups). The more tests you conduct, the greater the risk that at least one of them leads you to a false conclusion. To counteract this issue, you can use an ANOVA test. Instead of looking at each individual difference, ANOVA examines the ratio of the variance between groups to the variance within groups and determines whether that ratio is large enough to be statistically significant.
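To make the multiple-testing risk concrete, here is a minimal sketch that counts the pairwise tests for four seasons and estimates the chance of at least one false positive. The function names and the 0.05 significance level are assumptions for illustration, and the error-rate formula treats the tests as independent, which pairwise t-tests on shared groups are not, so read the number as a rough guide.

```python
from math import comb


def pairwise_test_count(num_groups: int) -> int:
    """Number of distinct pairs among the groups: k(k-1)/2."""
    return comb(num_groups, 2)


def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive, assuming independent tests."""
    return 1 - (1 - alpha) ** num_tests


seasons = ["winter", "spring", "summer", "fall"]
num_tests = pairwise_test_count(len(seasons))

print(num_tests)                                   # 6
print(round(familywise_error_rate(num_tests), 3))  # 0.265
```

Under these assumptions, six tests at the 5% level carry roughly a 26% chance of at least one false positive, which is the problem a single ANOVA test avoids.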

Types of ANOVA Tests:

  • One-way ANOVA: used when you want to test whether the means of two or more groups differ on a single independent variable. Both the t-test and the one-way ANOVA test can compare the means of two groups, but only the one-way ANOVA test can compare the means of more than two groups at once. If you run a one-way ANOVA test and a two-sample t-test (assuming equal variances) on the same two groups, the results are equivalent: the F statistic equals the square of the t statistic, and the p-values match.
  • Two-way ANOVA: an extension of the one-way ANOVA test. A two-way ANOVA test allows you to test the effect of two independent variables at the same time. For example, if you want to compare the strength of athletes by country and by gender, you could use a two-way ANOVA test to accomplish this.
  • Three-way ANOVA: an extension of the one-way ANOVA test and two-way ANOVA test that allows you to test the effect of three independent variables at the same time. The three-way ANOVA test is also referred to as a three-factor ANOVA test.
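The equivalence between a one-way ANOVA and a t-test on two groups can be checked numerically. The sketch below uses only the standard library; the helper names and the sample data are made up for illustration, and the t statistic is the pooled (equal-variance) version.

```python
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Two-sample t statistic assuming equal population variances."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5


def one_way_f_statistic(*groups):
    """F = between-group mean square / within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical measurements for two groups
group_a = [12.0, 15.0, 11.0, 14.0, 13.0]
group_b = [16.0, 18.0, 15.0, 17.0, 19.0]

t = pooled_t_statistic(group_a, group_b)
f = one_way_f_statistic(group_a, group_b)
print(t ** 2, f)  # both are 16.0 (up to floating-point rounding)
```

With two groups, squaring the t statistic reproduces the F statistic, so both tests lead to the same conclusion.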

Calculating ANOVA:

For ANOVA tests, we would set up a null and alternative hypothesis like so:

Hnull → µ1 = µ2 = µ3 = µ4

Halternative → Hnull is not true; at least one group mean differs from the others.

To calculate ANOVA, you would use the following formulas:
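For reference, the standard one-way ANOVA quantities can be written as below. The notation is an assumption made here rather than the article's own: x_ij is observation j in group i, n_i is the number of observations in group i, x̄_i is the mean of group i, and x̄ is the grand mean over all N observations in the k groups.

```latex
\begin{aligned}
SS_{\text{between}} &= \sum_{i=1}^{k} n_i \,(\bar{x}_i - \bar{x})^2 \\
SS_{\text{within}}  &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 \\
SS_{\text{total}}   &= SS_{\text{between}} + SS_{\text{within}} \\
MS_{\text{between}} &= \frac{SS_{\text{between}}}{k - 1}, \qquad
MS_{\text{within}}   = \frac{SS_{\text{within}}}{N - k} \\
F &= \frac{MS_{\text{between}}}{MS_{\text{within}}}
\end{aligned}
```

Each mean square is a sum of squares divided by its degrees of freedom, which is why the degrees of freedom are needed for the calculation.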

Degrees of Freedom for ANOVA:

  • DF between = k - 1
  • DF within = N - k
  • DF total = N - 1

Key:

  • k = number of groups
  • N = total number of observations
  • n = number of observations for each group
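As a worked illustration of these quantities, the following sketch computes a one-way ANOVA table by hand for four groups. The function name `one_way_anova` and the ride counts are hypothetical and chosen only to keep the arithmetic easy to follow; they are not real Uber data.

```python
from statistics import mean


def one_way_anova(groups):
    """Return sums of squares, degrees of freedom, mean squares, and F."""
    k = len(groups)                        # number of groups
    n_total = sum(len(g) for g in groups)  # N, total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "df_total": n_total - 1,
        "f": ms_between / ms_within,
    }


# Made-up daily ride counts (in thousands), four observations per season
rides = {
    "winter": [20, 22, 19, 21],
    "spring": [25, 27, 26, 24],
    "summer": [30, 29, 31, 32],
    "fall": [24, 23, 25, 26],
}

table = one_way_anova(list(rides.values()))
print(table["df_between"], table["df_within"], table["df_total"])  # 3 12 15
print(round(table["f"], 2))                                        # 40.6
```

The F statistic is then compared against the F distribution with (k - 1, N - k) degrees of freedom. For 3 and 12 degrees of freedom at the 5% level, the critical value is about 3.49, so an F of 40.6 on this made-up data would lead you to reject the null hypothesis.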

Limitations of the one-way ANOVA:

A one-way ANOVA will tell you whether at least one of the groups differs from the others, but it cannot tell you which specific groups differ from one another. To find out, you should run a post hoc test (for example, Tukey's HSD) to determine which groups have different means.

Assumptions for the two-way ANOVA and three-way ANOVA:

  • Each population is approximately normally distributed
  • The samples are independent
  • Population variances are equal
  • Groups have equal sample sizes
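Two of these assumptions lend themselves to a quick sanity check before running the test. The sketch below is an informal screen, not a formal test: the function name and data are hypothetical, and the threshold in the comment (largest standard deviation no more than about twice the smallest) is a commonly quoted rule of thumb rather than something stated in this article.

```python
from statistics import stdev


def balance_and_spread(groups):
    """Report whether group sizes match and how unequal the spreads are."""
    sizes = [len(g) for g in groups]
    sds = [stdev(g) for g in groups]  # sample standard deviation per group
    return {
        "equal_sizes": len(set(sizes)) == 1,
        # Assumes every group has some spread; a zero SD would divide by zero
        "sd_ratio": max(sds) / min(sds),
    }


# Hypothetical groups: the third one is noticeably more spread out
groups = [
    [20, 22, 19, 21],
    [25, 27, 26, 24],
    [30, 26, 34, 32],
]

check = balance_and_spread(groups)
print(check["equal_sizes"])         # True
print(round(check["sd_ratio"], 2))  # 2.65, above the rough threshold of 2
```

A ratio well above 2 suggests looking more closely at the equal-variance assumption, for example with a formal test or a plot of each group.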
