Wilcoxon Rank Sum Test
Non-parametric alternative to the two-sample t-test, used when the t-test's assumptions are not met (e.g. the errors are non-normal)
To calculate the Test Statistic, W:
Put both samples in an array
Label each observation according to which group / condition it belongs to
Sort the resulting list, keeping the sample label attached to each value, and assign a rank to each value
Ties get average ranks: a 2-way tie occupying ranks i and i+1 gets (i + (i+1))/2 for each value; a 3-way tie occupying ranks i, i+1 and i+2 gets (i + (i+1) + (i+2))/3 = i+1 for each value, etc.
Add up the ranks for each sample / condition
Assess significance by comparing the smaller of the two rank sums with a critical value from a table
Reject the null hypothesis if the smaller rank sum is less than the tabulated critical value, and conclude that there is a difference between the groups.
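As a small illustration of the tie rule in the steps above, R's rank() averages tied ranks by default (ties.method = "average"). The values below are made up for illustration and are not part of the ozone data.

```r
# Hypothetical values chosen to show a 2-way and a 3-way tie
x <- c(10, 20, 20, 30, 30, 30)

# rank() uses ties.method = "average" by default
r <- rank(x)
print(r)

# The two 20s occupy ranks 2 and 3, so each gets (2 + 3) / 2 = 2.5
# The three 30s occupy ranks 4, 5 and 6, so each gets (4 + 5 + 6) / 3 = 5

# Averaging preserves the total: the ranks still sum to n(n + 1) / 2
n <- length(x)
print(sum(r) == n * (n + 1) / 2)
```

Because the total is preserved, the two rank sums in a two-sample comparison always add up to N(N + 1)/2 for N observations overall, which is a handy check on hand calculations.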
Example
Make combined vector of the samples
> ozone<-c(gardenA,gardenB)
> ozone
[1] 3 4 4 3 2 3 1 3 5 2 5 5 6 7 4 4 3 5 6 5
Assign a group label to each observation
> label<-c(rep("A",10),rep("B",10))
> label
[1] "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "B" "B" "B" "B" "B" "B" "B" "B" "B"
[20] "B"
Sort and rank resulting list, averaging over tied values
> combined.ranks<-rank(ozone)
> combined.ranks
[1] 6.0 10.5 10.5 6.0 2.5 6.0 1.0 6.0 15.0 2.5 15.0 15.0 18.5 20.0 10.5
[16] 10.5 6.0 15.0 18.5 15.0
Sum all the ranks for each sample (garden, in this case)
> tapply(combined.ranks,label,sum)
A B
66 144
Assess significance by comparing the smaller sum of the ranks with a critical value
A table of Wilcoxon rank sum critical values shows that, for sample sizes (10, 10) at the 5% level, the critical value is 78. Our smaller rank sum, 66, is below this, so we reject the null hypothesis.
Carry out procedure automatically, with no need for tables / critical values
> wilcox.test(gardenA,gardenB)
Wilcoxon rank sum test with continuity correction
data: gardenA and gardenB
W = 11, p-value = 0.002988
alternative hypothesis: true location shift is not equal to 0
p = 0.003 (p < 0.01), so we reject H0 in favour of H1; the ozone concentrations were significantly different in Garden A and Garden B.
(Warning message:
In wilcox.test.default(gardenA, gardenB) :
cannot compute exact p-value with ties)
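Note that the W = 11 reported by wilcox.test is not the rank sum of 66 computed by hand. R's help page for wilcox.test states that R subtracts the minimum possible rank sum, m(m+1)/2 for a first sample of size m, from the rank sum of the first sample. A minimal sketch of that relationship, assuming gardenA and gardenB are the first and last ten values of the ozone vector printed earlier:

```r
# Assumed data: reconstructed from the printed ozone vector and its labels
gardenA <- c(3, 4, 4, 3, 2, 3, 1, 3, 5, 2)
gardenB <- c(5, 5, 6, 7, 4, 4, 3, 5, 6, 5)

# Rank the combined sample (ties averaged) and sum the ranks for garden A
ranks <- rank(c(gardenA, gardenB))
nA <- length(gardenA)
sumA <- sum(ranks[seq_len(nA)])

# Subtract the smallest rank sum that nA observations could possibly have
W.manual <- sumA - nA * (nA + 1) / 2

# suppressWarnings() hides the "cannot compute exact p-value with ties" warning
W.r <- suppressWarnings(wilcox.test(gardenA, gardenB))$statistic

print(c(rank.sum = sumA, W.manual = W.manual, W.r = unname(W.r)))
```

Here 66 - 10 * 11 / 2 = 66 - 55 = 11, which matches the W in the output.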
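N.B. When the t-test's assumptions hold (normal errors), the Wilcoxon test is more conservative than the t-test: it typically gives the larger p-value, so a difference that is significant under the Wilcoxon test is likely to be even more significant under the t-test.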
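This can be checked on the garden data by running both tests and comparing the p-values. The sketch below assumes gardenA and gardenB are the first and last ten values of the printed ozone vector; t.test is called with its defaults, which in R means the Welch two-sample test.

```r
# Assumed data: reconstructed from the printed ozone vector and its labels
gardenA <- c(3, 4, 4, 3, 2, 3, 1, 3, 5, 2)
gardenB <- c(5, 5, 6, 7, 4, 4, 3, 5, 6, 5)

# p-value from the (Welch) two-sample t-test
p.t <- t.test(gardenA, gardenB)$p.value

# p-value from the Wilcoxon rank sum test; the ties warning is suppressed
p.w <- suppressWarnings(wilcox.test(gardenA, gardenB))$p.value

print(c(t.test = p.t, wilcox.test = p.w))

# TRUE if the t-test gives the smaller p-value on these data
print(p.t < p.w)
```

A single data set only illustrates the tendency; it does not show that the t-test will always give the smaller p-value.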