How to calculate effect size for a Mann-Whitney test?
Answers
Before looking at an example of a Mann-Whitney's U test, let's take a look at what the test does. The point of a Mann-Whitney's U test is that it treats the data as ordinal. That is, you can order the values, but the distance between any two adjacent values is not assumed to be the same. Instead of using the values as-is, a Mann-Whitney's U test calculates a rank for each value. Let's think about some data from a 5-point Likert scale question and say you have the following data.
Group A 1 3 2 4 2
Group B 3 5 5 2 4
Then, you assign a rank (R) to each value, ranking the values of both groups together from smallest to largest. So,
Group A 1 (R1) 3 (R6) 2 (R2) 4 (R7) 2 (R4)
Group B 3 (R5) 5 (R9) 5 (R10) 2 (R3) 4 (R8)
For now, I just ranked the ties arbitrarily. But obviously this may cause a problem if we want to do a fair statistical test. One thing we can do is to take the average of the ranks that the tied values occupy and give each of them that average. For instance, the value 2 occupies ranks 2, 3, and 4 in this example. Instead of deciding which data point gets a higher rank, we just use the average of those ranks. So, all three will get rank 3 in this case. Thus, with this correction, this example becomes
Group A 1 (R1) 3 (R5.5) 2 (R3) 4 (R7.5) 2 (R3)
Group B 3 (R5.5) 5 (R9.5) 5 (R9.5) 2 (R3) 4 (R7.5)
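The tie-averaged ranking above can be reproduced with a few lines of Python. This is only a minimal sketch using the standard library; the helper name `average_ranks` is made up for this illustration and is not part of any package.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of the ranks they occupy."""
    # Indices of the values, sorted from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Positions are 0-based, ranks are 1-based.
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


group_a = [1, 3, 2, 4, 2]
group_b = [3, 5, 5, 2, 4]

# Rank both groups together, then split the ranks back into the two groups.
ranks = average_ranks(group_a + group_b)
ranks_a, ranks_b = ranks[:len(group_a)], ranks[len(group_a):]

print(ranks_a)  # [1.0, 5.5, 3.0, 7.5, 3.0]
print(ranks_b)  # [5.5, 9.5, 9.5, 3.0, 7.5]
```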
The means of the ranks of Group A and Group B are 4.0 and 7.0. The null hypothesis of a Mann-Whitney's U test is that the samples of both groups came from the same population. So intuitively, if the null hypothesis holds, there should be no difference in the mean ranks between the two groups, because both groups have the same chance of getting low and high ranks. Thus, if the mean ranks differ enough, you can say that you have a significant effect.
Please remember that this is not exactly what a Mann-Whitney test does. It calculates a statistic called the U value. The U value for each group is calculated by subtracting the minimum possible rank sum for that group, n(n+1)/2 for a group of n samples, from the observed sum of its ranks, and the smaller of the two U values is used for the test. The distribution of the standardized U value is known to be close to the normal distribution when the sample size is more than 20. Thus, if the observed standardized U value is far from the center of the normal distribution (= 0), the test will reject the null hypothesis.
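As a rough sketch, here is the U calculation for the example data, followed by one common way to standardize it. The standardization below is an assumption on my side: it uses the plain normal approximation without a tie correction or continuity correction, so statistical packages may report a slightly different value. With only 5 samples per group the approximation is also not reliable; the numbers are for illustration only.

```python
import math

# Tie-averaged ranks from the example above.
ranks_a = [1, 5.5, 3, 7.5, 3]
ranks_b = [5.5, 9.5, 9.5, 3, 7.5]
n_a, n_b = len(ranks_a), len(ranks_b)

# U for a group = its rank sum minus the smallest rank sum it could have,
# which is 1 + 2 + ... + n = n * (n + 1) / 2.
u_a = sum(ranks_a) - n_a * (n_a + 1) / 2
u_b = sum(ranks_b) - n_b * (n_b + 1) / 2
u = min(u_a, u_b)  # the smaller U value is used for the test

# Normal approximation (assumption: no tie or continuity correction).
mean_u = n_a * n_b / 2
sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
z = (u - mean_u) / sd_u

print(u_a, u_b, u)  # 5.0 20.0 5.0
print(round(z, 3))
```

Note that the two U values always add up to n_a * n_b (here 5 + 20 = 25), which is a handy sanity check.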
Effect size
The calculation of the effect size of Mann-Whitney's U test is fairly easy.
$$r = \frac{Z}{\sqrt{N}},$$
where Z is the standardized U value and N is the total number of samples. The conventional thresholds for r are 0.1 (small), 0.3 (medium), and 0.5 (large). The sign does not contain much information, so we often just report the absolute value.
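In code, the commonly used effect size r = Z / sqrt(N) is a one-liner. The sketch below uses made-up inputs (z = -2.5 from a test on 40 samples in total) purely for illustration, and the function name `effect_size_r` is hypothetical.

```python
import math


def effect_size_r(z, n_total):
    """Effect size r = Z / sqrt(N), where N is the total number of samples."""
    return z / math.sqrt(n_total)


# Hypothetical inputs: a standardized U value of -2.5 with 40 samples in total.
r = effect_size_r(-2.5, 40)

# Report the absolute value, since the sign carries little information.
print(round(abs(r), 3))
```

In practice you would take Z from the output of your statistical software rather than computing it by hand.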