pingouin.kruskal

pingouin.kruskal(data=None, dv=None, between=None, detailed=False)

Kruskal-Wallis H-test for independent samples.

Parameters:
data : pandas.DataFrame

DataFrame containing the dependent variable and the between-factor columns.

dv : string

Name of column containing the dependent variable.

between : string

Name of column containing the between factor.

Returns:
stats : pandas.DataFrame
  • 'H': The Kruskal-Wallis H statistic, corrected for ties

  • 'p-unc': Uncorrected p-value

  • 'ddof1': Degrees of freedom

Notes

The Kruskal-Wallis H-test tests the null hypothesis that the population medians of all groups are equal. It is a non-parametric version of one-way ANOVA. The test works on two or more independent samples, which may have different sizes.
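To make the statistic concrete, here is a minimal sketch of the textbook tie-corrected H computation, checked against scipy.stats.kruskal. The helper name kruskal_h and the sample values are illustrative assumptions; this is not pingouin's own implementation.

```python
import numpy as np
from scipy import stats


def kruskal_h(*groups):
    """Tie-corrected Kruskal-Wallis H (hypothetical helper, for illustration)."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    sizes = np.array([len(g) for g in groups])
    pooled = np.concatenate(groups)
    n = pooled.size

    # Rank all observations together; tied values receive their average rank.
    ranks = stats.rankdata(pooled)
    rank_sums = np.array(
        [r.sum() for r in np.split(ranks, np.cumsum(sizes)[:-1])]
    )

    # Uncorrected H statistic.
    h = 12.0 / (n * (n + 1)) * np.sum(rank_sums**2 / sizes) - 3 * (n + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    _, tie_counts = np.unique(pooled, return_counts=True)
    h /= 1 - np.sum(tie_counts**3 - tie_counts) / (n**3 - n)

    # Under the null hypothesis, H is approximately chi-squared with k - 1 dof.
    ddof = len(groups) - 1
    p = stats.chi2.sf(h, ddof)
    return h, p, ddof


# Made-up sample data with unequal group sizes and some ties.
a = [1.0, 2.0, 2.0, 5.0, 7.0]
b = [3.0, 4.0, 4.0, 8.0, 9.0, 10.0]
c = [2.0, 6.0, 11.0, 12.0, 13.0]

h, p, ddof = kruskal_h(a, b, c)
reference = stats.kruskal(a, b, c)
print(np.isclose(h, reference.statistic), np.isclose(p, reference.pvalue))
```

The degrees of freedom are the number of groups minus one, which is why the example in this page reports 3 for a four-level factor.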

Because H is assumed to follow a chi-squared distribution, the number of observations in each group must not be too small. A typical rule is that each group should have at least 5 measurements.
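One way to check this rule of thumb before running the test is to count the non-missing observations per group with pandas. The sketch below is only a suggestion: small_groups, its min_n argument, and the toy data are hypothetical and not part of pingouin.

```python
import numpy as np
import pandas as pd


def small_groups(data, dv, between, min_n=5):
    """Return the groups with fewer than min_n non-missing values of dv.

    Hypothetical helper for illustration; not a pingouin function.
    """
    # GroupBy.count() excludes NaN, so missing values do not inflate the sizes.
    counts = data.groupby(between)[dv].count()
    return counts[counts < min_n]


# Toy data: group "a" has 5 rows but one missing value, group "b" has 6 rows.
df = pd.DataFrame({
    "score": [3.1, 2.4, np.nan, 4.0, 5.2, 3.3, 2.9, 4.8, 4.1, 3.7, 5.0],
    "group": ["a"] * 5 + ["b"] * 6,
})

too_small = small_groups(df, dv="score", between="group")
print(too_small)
```

Counting only non-missing values matters here because rows with NaN do not contribute to the test, so a group can fall below the threshold even when its raw row count looks sufficient.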

NaN values are automatically removed.

Examples

Compute the Kruskal-Wallis H-test for independent samples.

>>> from pingouin import kruskal, read_dataset
>>> df = read_dataset('anova')
>>> kruskal(data=df, dv='Pain threshold', between='Hair color')
             Source  ddof1         H     p-unc
Kruskal  Hair color      3  10.58863  0.014172