This article describes statistical tests for comparing the variances of two or more samples. Equal variances across samples is called homogeneity of variances.

Some statistical tests, such as the independent two-sample t-test and ANOVA, assume that variances are equal across groups. Bartlett’s test, Levene’s test, or the Fligner-Killeen test can be used to check that assumption.

Compare Multiple Sample Variances in R

Statistical tests for comparing variances

There are many solutions to test for the equality (homogeneity) of variance across groups, including:

F-test: Compare the variances of two samples. The data must be normally distributed.
Bartlett’s test: Compare the variances of k samples, where k can be more than two. The data must be normally distributed.
Levene’s test: Compare the variances of k samples, where k can be more than two. It is an alternative to Bartlett’s test that is less sensitive to departures from normality.
Fligner-Killeen test: Compare the variances of k samples with a non-parametric test that is very robust against departures from normality.

The F-test has been described in our previous article: F-test to compare equality of two variances. In the present article, we’ll describe the tests for comparing more than two variances.
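For completeness, the two-sample case can be sketched with the base R function var.test(). The example below is only an illustration: the choice of ToothGrowth and of the grouping variable supp is ours, and the object name f_res is arbitrary.

```r
# Load the built-in data set
data(ToothGrowth)

# F-test comparing the variance of tooth length (len)
# between the two supplement types (OJ and VC)
f_res <- var.test(len ~ supp, data = ToothGrowth)
f_res
```

As with the tests described below, a p-value above the chosen significance level gives no evidence that the two variances differ.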

Statistical hypotheses

For all these tests (Bartlett’s test, Levene’s test, and the Fligner-Killeen test):

the null hypothesis is that all population variances are equal;
the alternative hypothesis is that at least two of them differ.

Import and check your data into R

To import your data, use the following R code:

# If .txt tab file, use this
my_data <- read.delim(file.choose())

# Or, if .csv file, use this
my_data <- read.csv(file.choose())

Here, we’ll use ToothGrowth and PlantGrowth data sets:

# Load the data
data(ToothGrowth)

data(PlantGrowth)

To get an idea of what the data look like, we start by displaying a random sample of 10 rows using the function sample_n() [in the dplyr package]. First, install dplyr if you don’t have it: install.packages("dplyr").

Show 10 random rows:

set.seed(123)
# Show PlantGrowth
dplyr::sample_n(PlantGrowth, 10)

   weight group
24   5.50  trt2
12   4.17  trt1
25   5.37  trt2
26   5.29  trt2
2    5.58  ctrl
14   3.59  trt1
22   5.12  trt2
13   4.41  trt1
11   4.81  trt1
21   6.31  trt2

# PlantGrowth data structure
str(PlantGrowth)

'data.frame':   30 obs. of  2 variables:
 $ weight: num  4.17 5.58 5.18 6.11 4.5 4.61 5.17 4.53 5.33 5.14 ...
 $ group : Factor w/ 3 levels "ctrl","trt1",..: 1 1 1 1 1 1 1 1 1 1 ...

# Show ToothGrowth
dplyr::sample_n(ToothGrowth, 10)

    len supp dose
28 21.5   VC  2.0
40  9.7   OJ  0.5
34  9.7   OJ  0.5
6  10.0   VC  0.5
51 25.5   OJ  2.0
14 17.3   VC  1.0
3   7.3   VC  0.5
18 14.5   VC  1.0
50 27.3   OJ  1.0
46 25.2   OJ  1.0

# ToothGrowth data structure
str(ToothGrowth)

'data.frame':   60 obs. of  3 variables:
 $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
 $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
 $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...

Note that R treats the column “dose” [in the ToothGrowth data set] as a numeric vector. We want to convert it to a grouping variable (factor).

ToothGrowth$dose <- as.factor(ToothGrowth$dose)

We want to test the equality of variances between groups.
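Before running any formal test, it can help to look at the sample variances themselves. The following is a minimal sketch using base R only; the object names plant_var and tooth_var are ours.

```r
data(PlantGrowth)
data(ToothGrowth)

# Sample variance of weight within each treatment group
plant_var <- tapply(PlantGrowth$weight, PlantGrowth$group, var)
plant_var

# Sample variance of len within each supp x dose combination
tooth_var <- tapply(ToothGrowth$len,
                    list(ToothGrowth$supp, ToothGrowth$dose),
                    var)
tooth_var
```

These descriptive values do not replace the tests below. They only show which groups appear more or less spread out.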

Compute Bartlett’s test in R

Bartlett’s test is used for testing homogeneity of variances in k samples, where k can be more than two. It’s adapted for normally distributed data. The Levene test, described in the next section, is a more robust alternative to the Bartlett test when the distributions of the data are non-normal.

The R function bartlett.test() can be used to compute Bartlett’s test. The simplified format is as follows:

bartlett.test(formula, data)

formula: a formula of the form values ~ groups
data: a matrix or data frame

The function returns a list containing the following components:

statistic: Bartlett’s K-squared test statistic
parameter: the degrees of freedom of the approximate chi-squared distribution of the test statistic.
p.value: the p-value of the test

To perform the test, we’ll use the PlantGrowth data set, which contains the weight of plants obtained under 3 treatment groups.

Bartlett’s test with one independent variable:

res <- bartlett.test(weight ~ group, data = PlantGrowth)
res


    Bartlett test of homogeneity of variances

data:  weight by group
Bartlett's K-squared = 2.8786, df = 2, p-value = 0.2371

From the output, the p-value of 0.2371 is not less than the significance level of 0.05. This means there is no evidence that the variance in plant weight differs significantly among the three treatment groups.
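The components of the returned list can also be accessed individually, which is convenient when the decision has to be made in a script. A short sketch, where the significance level alpha is our own choice:

```r
data(PlantGrowth)
res <- bartlett.test(weight ~ group, data = PlantGrowth)

res$statistic   # Bartlett's K-squared
res$parameter   # degrees of freedom
res$p.value     # p-value of the test

# Compare the p-value with the chosen significance level
alpha <- 0.05
reject_h0 <- res$p.value < alpha
reject_h0       # FALSE: equal variances are not rejected
```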

Bartlett’s test with multiple independent variables: the interaction() function must be used to collapse multiple factors into a single variable containing all combinations of the factors.

bartlett.test(len ~ interaction(supp,dose), data=ToothGrowth)


    Bartlett test of homogeneity of variances

data:  len by interaction(supp, dose)
Bartlett's K-squared = 6.9273, df = 5, p-value = 0.2261
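To see what interaction() produces, the combined grouping variable can be inspected on its own. This is a small sketch; the object name cells is ours.

```r
data(ToothGrowth)
ToothGrowth$dose <- as.factor(ToothGrowth$dose)

# One level per supp/dose combination
cells <- interaction(ToothGrowth$supp, ToothGrowth$dose)
nlevels(cells)   # number of groups compared by the test
table(cells)     # number of observations in each group
```

The test then compares the variances of len across these groups, which is why the output above reports 5 degrees of freedom (number of groups minus one).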

Compute Levene’s test in R

As mentioned above, Levene’s test is an alternative to Bartlett’s test when the data are not normally distributed.

The function leveneTest() [in car package] can be used.

library(car)
# Levene's test with one independent variable
leveneTest(weight ~ group, data = PlantGrowth)

Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  2  1.1192 0.3412
      27

# Levene's test with multiple independent variables
leveneTest(len ~ supp*dose, data = ToothGrowth)

Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  5  1.7086 0.1484
      54
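With center = median, the test amounts to a one-way ANOVA on the absolute deviations of each observation from its group median. The sketch below reproduces the one-factor result in base R, without the car package. It illustrates the idea and is not the package’s own code; the names grp_median, abs_dev and lev_manual are ours.

```r
data(PlantGrowth)

# Median of weight within each group, repeated for every observation
grp_median <- ave(PlantGrowth$weight, PlantGrowth$group, FUN = median)

# Absolute deviation of each observation from its group median
PlantGrowth$abs_dev <- abs(PlantGrowth$weight - grp_median)

# One-way ANOVA on the absolute deviations
lev_manual <- anova(lm(abs_dev ~ group, data = PlantGrowth))
lev_manual
```

The F value and p-value in the first row of this table should agree with the leveneTest() output for PlantGrowth shown above.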

Compute Fligner-Killeen test in R

The Fligner-Killeen test is one of the tests for homogeneity of variances that is most robust against departures from normality.

The R function fligner.test() can be used to compute the test:

fligner.test(weight ~ group, data = PlantGrowth)


    Fligner-Killeen test of homogeneity of variances

data:  weight by group
Fligner-Killeen:med chi-squared = 2.3499, df = 2, p-value = 0.3088
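As with Bartlett’s test, several factors can be combined with interaction(). A minimal sketch on the ToothGrowth data, where the object name fk_res is ours:

```r
data(ToothGrowth)
ToothGrowth$dose <- as.factor(ToothGrowth$dose)

# Fligner-Killeen test across all supp/dose combinations
fk_res <- fligner.test(len ~ interaction(supp, dose), data = ToothGrowth)
fk_res

# The p-value can be extracted as for the other tests
fk_res$p.value
```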

Infos

This analysis has been performed using R software (ver. 3.2.4).
