Different ways of performing a Wilcoxon rank sum test, and how to interpret the resulting W statistic

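In R, what is the practical difference between wilcox.test(x, y, paired=F) and wilcox.test(x~y, paired=F) (that is, using a comma versus a tilde), and how should the resulting W statistic be interpreted? These should be the same statistical test, yet the two calls report different results.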

I have a data frame with 24 rows, each holding an individual's sex and length:

mydata <- structure(
  list(
    ID = 1:24,
    Sex = structure(
      c(2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L,
        1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L),
      .Label = c("F", "M"), class = "factor"),
    Length = c(63.8, 79.6, 58, 140, 293, 28.6, 147, 31.3, 33.2, 4.55, 16.4, 19.5,
               26.4, 3.34, 29.3, 42.9, 55.6, 122, 30.3, 48.4, 130, 64.7, 93.3, 76.1)),
  .Names = c("ID", "Sex", "Length"),
  class = "data.frame",
  row.names = c(NA, -24L))

I want to use the Mann-Whitney U test to examine whether length differs between the two sexes.

Version 1:

wilcox.test(mydata$Length[mydata$Sex == 'M'], mydata$Length[mydata$Sex == 'F'], paired=F)

        Wilcoxon rank sum test

data:  mydata$Length[mydata$Sex == "M"] and mydata$Length[mydata$Sex == "F"]
W = 118, p-value = 0.0003698
alternative hypothesis: true location shift is not equal to 0

Version 2:

wilcox.test(mydata$Length ~ mydata$Sex, paired=F)

        Wilcoxon rank sum test

data:  mydata$Length by mydata$Sex
W = 10, p-value = 0.0003698
alternative hypothesis: true location shift is not equal to 0

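Both give the same p-value, but the W statistics are very different (118 versus 10). I don't understand why this happens, or which value I should use for inference or reporting. Shouldn't both calls give the same answer? And how should the resulting W statistic be interpreted?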
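
For what it is worth, the two W values look like the same count taken from opposite directions. Below is a minimal sketch, assuming (as I read the wilcox.test help page) that W for the first sample is the number of (x, y) pairs in which the x value exceeds the y value, with ties counted as one half. The names males, females, w_by_pairs and w_by_ranks are made up for illustration, and the two vectors are copied by hand from mydata above.

```r
# Lengths from mydata, split by Sex (16 males, 8 females)
males   <- c(63.8, 79.6, 58, 140, 293, 147, 33.2, 19.5, 29.3, 42.9,
             55.6, 122, 130, 64.7, 93.3, 76.1)
females <- c(28.6, 31.3, 4.55, 16.4, 26.4, 3.34, 30.3, 48.4)

# Hypothetical helper: count the (x, y) pairs with x > y; ties count as 1/2
w_by_pairs <- function(x, y) {
  sum(outer(x, y, ">")) + 0.5 * sum(outer(x, y, "=="))
}

# Hypothetical helper: the same quantity from rank sums, i.e. the sum of the
# ranks of x in the pooled sample minus n_x * (n_x + 1) / 2
w_by_ranks <- function(x, y) {
  r <- rank(c(x, y))
  sum(r[seq_along(x)]) - length(x) * (length(x) + 1) / 2
}

w_m <- w_by_pairs(males, females)   # males treated as the first sample
w_f <- w_by_pairs(females, males)   # females treated as the first sample

# W for each ordering, their sum, and n_males * n_females
print(c(w_m, w_f, w_m + w_f, length(males) * length(females)))
# 118 10 128 128
```

Both helpers give 118 with males first and 10 with females first, and 118 + 10 = 16 × 8 = 128, so either value determines the other. In the formula interface the first factor level ("F" here) is taken as the first sample, which would account for W = 10 in Version 2 while the p-value stays the same.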