Wilcoxon Rank Sum Test

To calculate the Test Statistic, W:

  1. Put both samples into a single array

  2. Label each observation according to which group / condition it belongs to

  3. Sort the resulting list, keeping the sample label attached to each observation / value, and assign a rank to each value

    1. Ties get average ranks: a 2-way tie occupying ranks i and i+1 gets a rank of (i + (i+1))/2 for both values; a 3-way tie starting at rank i gets (i + (i+1) + (i+2))/3 for all three; etc.

  4. Add up the ranks for each sample / condition

  5. Assess significance by comparing the smaller sum of the ranks with a critical value

    1. Reject the null hypothesis if our value is smaller than the critical value in the table; we then conclude that there is a difference between the groups.
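
The steps above can be sketched as a small R function. This is only an illustration: `rank_sums` is a hypothetical helper name (not part of R), and the two toy samples are made-up values chosen so that a 3-way tie appears.

```r
# Hypothetical helper: rank sums for two samples, ties given average ranks
rank_sums <- function(x, y) {
  values <- c(x, y)                                    # step 1: pool both samples
  group  <- rep(c("x", "y"), c(length(x), length(y)))  # step 2: label each observation
  r      <- rank(values, ties.method = "average")      # step 3: rank, averaging ties
  tapply(r, group, sum)                                # step 4: sum the ranks per group
}

# Made-up toy data: the three 2s occupy ranks 2, 3 and 4, so each gets rank 3
sums <- rank_sums(c(1, 2, 2, 4), c(2, 5, 6, 7))
sums
min(sums)  # step 5: the value to compare with a tabulated critical value
```

`rank()` already averages ties by default; `ties.method = "average"` is spelled out only to make step 3.1 explicit.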

Example

  1. Make a combined vector of the samples

> ozone<-c(gardenA,gardenB)

> ozone

[1] 3 4 4 3 2 3 1 3 5 2 5 5 6 7 4 4 3 5 6 5

  2. Assign a group label to each observation

> label<-c(rep("A",10),rep("B",10))

> label

[1] "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "B" "B" "B" "B" "B" "B" "B" "B" "B"

[20] "B"

  3. Sort and rank resulting list, averaging over tied values

> combined.ranks<-rank(ozone)

> combined.ranks

[1] 6.0 10.5 10.5 6.0 2.5 6.0 1.0 6.0 15.0 2.5 15.0 15.0 18.5 20.0 10.5

[16] 10.5 6.0 15.0 18.5 15.0

  4. Sum all the ranks for each sample (garden, in this case)

> tapply(combined.ranks,label,sum)

A B

66 144
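
A quick sanity check on the rank sums, as a sketch: the ranks 1..N always add up to N(N+1)/2, so the two sums here should total 20 x 21 / 2 = 210. The values of gardenA and gardenB are not shown in these notes; below they are read back from the printed ozone vector (first ten values for A, last ten for B, as the labels imply).

```r
# Values reconstructed from the printed 'ozone' vector (assumption: A first, then B)
gardenA <- c(3, 4, 4, 3, 2, 3, 1, 3, 5, 2)
gardenB <- c(5, 5, 6, 7, 4, 4, 3, 5, 6, 5)

ozone <- c(gardenA, gardenB)
label <- c(rep("A", 10), rep("B", 10))
rank_sum <- tapply(rank(ozone), label, sum)

n_total <- length(ozone)
sum(rank_sum) == n_total * (n_total + 1) / 2  # both sides should equal 210
```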

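  5. Assess significance by comparing the smaller sum of the ranks (here 66, for garden A) with a critical value. For two samples of 10, the 5% critical value in standard tables is 78; 66 is smaller, so we reject the null hypothesis.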
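
If no table is to hand, a critical value can be approximated in R. A rough sketch, under stated assumptions: qwilcox() works with the statistic U = (rank sum) - n(n+1)/2 and assumes no ties, so with tied data like these it is only a guide; the garden values are read back from the printed ozone vector; and equal sample sizes are assumed when converting U back to a rank sum.

```r
# Values reconstructed from the printed 'ozone' vector (assumption: A first, then B)
gardenA <- c(3, 4, 4, 3, 2, 3, 1, 3, 5, 2)
gardenB <- c(5, 5, 6, 7, 4, 4, 3, 5, 6, 5)
n_a <- length(gardenA)
n_b <- length(gardenB)

rank_sum <- tapply(rank(c(gardenA, gardenB)),
                   rep(c("A", "B"), c(n_a, n_b)), sum)
smaller_sum <- min(rank_sum)

# qwilcox() returns the smallest q with P(U <= q) >= 0.025, so q - 1 is the
# largest U whose lower-tail probability is still below 0.025 (two-sided 5% test)
crit_u <- qwilcox(0.025, n_a, n_b) - 1

# Convert back from U to the rank-sum scale (valid here because n_a == n_b)
crit_sum <- crit_u + n_a * (n_a + 1) / 2

smaller_sum <= crit_sum  # TRUE means: reject the null hypothesis
```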

  6. Carry out procedure automatically, with no need for tables / critical values

> wilcox.test(gardenA,gardenB)



Wilcoxon rank sum test with continuity correction



data: gardenA and gardenB

W = 11, p-value = 0.002988

alternative hypothesis: true location shift is not equal to 0



(Warning message:

In wilcox.test.default(gardenA, gardenB) :

cannot compute exact p-value with ties)
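
Two points about this output, sketched in R below. First, W = 11 is not the smaller rank sum (66) from the manual steps: wilcox.test() reports the rank sum of the first sample minus its smallest possible value, n(n+1)/2 = 55, and 66 - 55 = 11. Second, the warning only says that R fell back to a normal approximation because of the ties; asking for the approximation explicitly with exact = FALSE should give the same result without the warning. The garden values are read back from the printed ozone vector.

```r
# Values reconstructed from the printed 'ozone' vector (assumption: A first, then B)
gardenA <- c(3, 4, 4, 3, 2, 3, 1, 3, 5, 2)
gardenB <- c(5, 5, 6, 7, 4, 4, 3, 5, 6, 5)

# Request the normal approximation explicitly, so no exact p-value is attempted
res <- wilcox.test(gardenA, gardenB, exact = FALSE)

# Reproduce W by hand: rank sum of the first sample minus n(n+1)/2
n_a <- length(gardenA)
rank_sum_A <- sum(rank(c(gardenA, gardenB))[seq_len(n_a)])
w_by_hand <- rank_sum_A - n_a * (n_a + 1) / 2

unname(res$statistic)  # W as reported by wilcox.test()
w_by_hand              # 66 - 55
```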



N.B. When the data meet the assumptions of the t-test, the Wilcoxon test is the more conservative of the two; a difference that is significant in the Wilcoxon test is likely to be even more significant in the t-test.
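
This can be checked on the garden data by running both tests side by side. A minimal sketch: the values are read back from the printed ozone vector, t.test() is used with its defaults (Welch version), and whether the t-test's assumptions really hold for these data is not examined here. On these data the t-test p-value comes out smaller than the Wilcoxon one.

```r
# Values reconstructed from the printed 'ozone' vector (assumption: A first, then B)
gardenA <- c(3, 4, 4, 3, 2, 3, 1, 3, 5, 2)
gardenB <- c(5, 5, 6, 7, 4, 4, 3, 5, 6, 5)

# exact = FALSE avoids the ties warning; the p-value is the normal approximation
p_wilcox <- wilcox.test(gardenA, gardenB, exact = FALSE)$p.value
p_t      <- t.test(gardenA, gardenB)$p.value  # Welch two-sample t-test

c(wilcoxon = p_wilcox, t = p_t)
```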