pingouin.wilcoxon

pingouin.wilcoxon(x, y=None, alternative='two-sided', **kwargs)

Wilcoxon signed-rank test. It is the non-parametric version of the paired t-test.

Parameters:
x : array_like

Either the first set of measurements (in which case y is the second set of measurements), or the differences between two sets of measurements (in which case y must not be specified). Must be one-dimensional.

y : array_like

Either the second set of measurements (if x is the first set of measurements), or not specified (if x is the differences between two sets of measurements). Must be one-dimensional.

alternative : string

Defines the alternative hypothesis, or tail of the test. Must be one of “two-sided” (default), “greater” or “less”. See scipy.stats.wilcoxon() for more details.

**kwargs : dict

Additional keyword arguments that are passed to scipy.stats.wilcoxon().

Returns:
stats : pandas.DataFrame
  • 'W-val': W-value

  • 'alternative': tail of the test

  • 'p-val': p-value

  • 'RBC': matched pairs rank-biserial correlation (effect size)

  • 'CLES': common language effect size

Notes

The Wilcoxon signed-rank test [1] tests the null hypothesis that two related (paired) samples come from the same distribution. In particular, it tests whether the distribution of the differences x - y is symmetric about zero.

Important

Pingouin automatically applies a continuity correction. Therefore, the p-values will differ slightly from those of scipy.stats.wilcoxon() unless correction=True is explicitly passed to the latter.
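To make the effect of the correction concrete, here is a minimal sketch of the usual normal approximation to the signed-rank statistic, with and without the 0.5 continuity shift. It is an illustration under stated assumptions (zero differences dropped, average ranks for ties, standard tie correction of the variance), not Pingouin's own code path. The helper two_sided_p is a hypothetical name, and the data are the x and y arrays from the Examples section.

```python
import numpy as np
from scipy.stats import norm, rankdata

x = np.array([20, 22, 19, 20, 22, 18, 24, 20, 19, 24, 26, 13])
y = np.array([38, 37, 33, 29, 14, 12, 20, 22, 17, 25, 26, 16])

# Signed-rank statistic: drop zero differences, rank the absolute values
d = x - y
d = d[d != 0]
n = d.size
ranks = rankdata(np.abs(d))  # tied values receive their average rank
w = min(ranks[d > 0].sum(), ranks[d < 0].sum())

# Mean and tie-corrected variance of W under the null hypothesis
mean_w = n * (n + 1) / 4
_, tie_counts = np.unique(np.abs(d), return_counts=True)
var_w = n * (n + 1) * (2 * n + 1) / 24 - (tie_counts**3 - tie_counts).sum() / 48
se_w = np.sqrt(var_w)


def two_sided_p(shift):
    """Two-sided p-value from the normal approximation (hypothetical helper).

    Because w is the smaller rank sum, w <= mean_w, so a positive shift
    moves the statistic toward the mean.
    """
    z = (w - mean_w + shift) / se_w
    return 2 * norm.sf(abs(z))


p_plain = float(two_sided_p(0.0))      # no continuity correction
p_corrected = float(two_sided_p(0.5))  # with continuity correction
print(p_plain, p_corrected)
```

The corrected p-value is the larger of the two, and the pair should land close to the 0.266166 and 0.285765 reported in the Examples section.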

In addition to the test statistic and p-values, Pingouin also computes two measures of effect size. The matched pairs rank-biserial correlation [2] is the simple difference between the proportion of favorable and unfavorable evidence; in the case of the Wilcoxon signed-rank test, the evidence consists of rank sums (Kerby 2014). With f the proportion of the total rank sum carried by favorable (positive) differences and u the proportion carried by unfavorable (negative) differences:

\[r = f - u\]
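The simple difference formula can be sketched directly from the rank sums. The snippet below is an illustration rather than Pingouin's implementation; it assumes that zero differences are discarded before ranking and that tied absolute differences get their average rank. The data are the x and y arrays from the Examples section.

```python
import numpy as np
from scipy.stats import rankdata

x = np.array([20, 22, 19, 20, 22, 18, 24, 20, 19, 24, 26, 13])
y = np.array([38, 37, 33, 29, 14, 12, 20, 22, 17, 25, 26, 16])

d = x - y
d = d[d != 0]                # assumption: zero differences are dropped
ranks = rankdata(np.abs(d))  # average ranks for tied absolute differences

total = ranks.sum()
f = ranks[d > 0].sum() / total  # share of the rank sum favoring x > y
u = ranks[d < 0].sum() / total  # share of the rank sum favoring x < y
rbc = float(f - u)
print(round(rbc, 6))  # -0.378788
```

Here the positive differences carry a rank sum of 20.5 and the negative ones 45.5 out of 66, so r = (20.5 - 45.5) / 66, which matches the RBC column in the Examples section.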

The common language effect size is the proportion of pairs where x is higher than y. It was first introduced by McGraw and Wong (1992) [3]. Pingouin uses a brute-force version of the formula given by Vargha and Delaney (2000) [4]:

\[\text{CL} = P(X > Y) + .5 \times P(X = Y)\]

The advantages of this method are twofold. First, the brute-force approach pairs each observation of x to its y counterpart, and therefore does not require normally distributed data. Second, the formula takes ties into account and therefore works with ordinal data.

When alternative is 'less', the CLES is set to \(1 - \text{CL}\), which gives the proportion of pairs where x is lower than y.
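A brute-force evaluation of the CL formula can be sketched with NumPy broadcasting. This sketch assumes the comparison runs over every combination of an x value with a y value; that reading is an assumption rather than something stated here, although it does reproduce the CLES values of 0.395833 and 0.604167 shown in the Examples section. The data are the x and y arrays from that section.

```python
import numpy as np

x = np.array([20, 22, 19, 20, 22, 18, 24, 20, 19, 24, 26, 13])
y = np.array([38, 37, 33, 29, 14, 12, 20, 22, 17, 25, 26, 16])

# Assumption: compare every x value with every y value (12 x 12 = 144 pairs)
diff = x[:, None] - y[None, :]

# CL = P(X > Y) + 0.5 * P(X = Y); ties count for half
cl = float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))

# Value reported when alternative='less'
cl_less = 1 - cl

print(round(cl, 6), round(cl_less, 6))  # 0.395833 0.604167
```

Of the 144 comparisons, 54 have x above y and 6 are ties, giving (54 + 3) / 144.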

References

[1] Wilcoxon, F. (1945). Individual comparisons by ranking methods. Biometrics Bulletin, 1(6), 80-83.

[2] Kerby, D. S. (2014). The simple difference formula: An approach to teaching nonparametric correlation. Comprehensive Psychology, 3, 11-IT.

[3] McGraw, K. O., & Wong, S. P. (1992). A common language effect size statistic. Psychological Bulletin, 111(2), 361.

[4] Vargha, A., & Delaney, H. D. (2000). A Critique and Improvement of the “CL” Common Language Effect Size Statistics of McGraw and Wong. Journal of Educational and Behavioral Statistics, 25(2), 101–132. https://doi.org/10.2307/1165329

Examples

Wilcoxon test on two related samples.

>>> import numpy as np
>>> import pingouin as pg
>>> x = np.array([20, 22, 19, 20, 22, 18, 24, 20, 19, 24, 26, 13])
>>> y = np.array([38, 37, 33, 29, 14, 12, 20, 22, 17, 25, 26, 16])
>>> pg.wilcoxon(x, y, alternative='two-sided')
          W-val alternative     p-val       RBC      CLES
Wilcoxon   20.5   two-sided  0.285765 -0.378788  0.395833

The same test using pre-computed differences. In this case the CLES cannot be computed because it requires the raw data.

>>> pg.wilcoxon(x - y)
          W-val alternative     p-val       RBC  CLES
Wilcoxon   20.5   two-sided  0.285765 -0.378788   NaN

Compare with SciPy

>>> import scipy.stats
>>> scipy.stats.wilcoxon(x, y)
WilcoxonResult(statistic=20.5, pvalue=0.2661660677806492)

The p-value differs slightly from Pingouin's. This is because Pingouin automatically applies a continuity correction. Disabling it gives the same p-value as SciPy:

>>> pg.wilcoxon(x, y, alternative='two-sided', correction=False)
          W-val alternative     p-val       RBC      CLES
Wilcoxon   20.5   two-sided  0.266166 -0.378788  0.395833

One-sided test

>>> pg.wilcoxon(x, y, alternative='greater')
          W-val alternative     p-val       RBC      CLES
Wilcoxon   20.5     greater  0.876244 -0.378788  0.395833
>>> pg.wilcoxon(x, y, alternative='less')
          W-val alternative     p-val       RBC      CLES
Wilcoxon   20.5        less  0.142883 -0.378788  0.604167