pingouin.friedman

pingouin.friedman(data=None, dv=None, within=None, subject=None, method='chisq')

Friedman test for repeated measurements.

Parameters:
data : pandas.DataFrame

DataFrame. Both wide-format and long-format dataframes are supported for this test.

dv : string

Name of column containing the dependent variable (only required if data is in long format).

within : string

Name of column containing the within-subject factor (only required if data is in long format). Two or more within-subject factors are not currently supported.

subject : string

Name of column containing the subject/rater identifier (only required if data is in long format).

method : string

Statistical test to perform. Must be 'chisq' (chi-square test, the default) or 'f' (F test). See the Notes below for an explanation.

Returns:
stats : pandas.DataFrame
  • 'W': Kendall’s coefficient of concordance, corrected for ties

If method='chisq'

  • 'Q': The Friedman chi-square statistic, corrected for ties

  • 'ddof1': degrees of freedom

  • 'p-unc': Uncorrected p-value of the chi-squared test

If method='f'

  • 'F': The Friedman F statistic, corrected for ties

  • 'ddof1': degrees of freedom of the numerator

  • 'ddof2': degrees of freedom of the denominator

  • 'p-unc': Uncorrected p-value of the F test

Notes

The Friedman test is used for non-parametric (rank-based) one-way repeated measures ANOVA.

It is equivalent to the test of significance of Kendall's coefficient of concordance (Kendall's W). Most commonly, a Q statistic, which has an asymptotic chi-squared distribution, is computed and used for testing. However, the chi-squared test tends to be overly conservative for small numbers of samples and/or repeated measures, in which case an F-test is more appropriate [1].
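
The link between W, Q and F can be sketched with NumPy and SciPy alone. The snippet below is an illustration and not necessarily how pingouin computes these values internally: it uses the textbook tie-corrected formula for Kendall's W, together with Q = n(k - 1)W and F = W(n - 1) / (1 - W), where n is the number of subjects and k the number of repeated measures. The variable names are chosen here for illustration; the data are the wine ratings from the Examples section.

```python
import numpy as np
from scipy import stats

# Wide-format data from the Examples section:
# rows are subjects, columns are conditions (white, red, rose).
X = np.array([
    [10, 7, 8], [8, 5, 5], [7, 8, 6], [9, 6, 4],
    [7, 5, 7], [4, 7, 5], [5, 9, 3], [6, 6, 7],
    [5, 4, 6], [10, 6, 4], [4, 7, 4], [7, 3, 3],
], dtype=float)
n, k = X.shape

# Rank the conditions within each subject (ties receive the average rank).
ranked = stats.rankdata(X, axis=1)
ssbn = (ranked.sum(axis=0) ** 2).sum()

# Tie correction: sum t * (t^2 - 1) over every group of t tied values in a row.
ties = 0.0
for row in X:
    _, counts = np.unique(row, return_counts=True)
    ties += float((counts * (counts**2 - 1)).sum())

# Kendall's W, corrected for ties.
W = (12 * ssbn - 3 * n**2 * k * (k + 1) ** 2) / (
    n**2 * k * (k - 1) * (k + 1) - n * ties
)

# Chi-squared variant.
Q = n * (k - 1) * W
p_chisq = stats.chi2.sf(Q, k - 1)

# F variant, with the adjusted degrees of freedom.
F = W * (n - 1) / (1 - W)
ddof1 = k - 1 - 2 / n
ddof2 = (n - 1) * ddof1
p_f = stats.f.sf(F, ddof1, ddof2)

print(W, Q, p_chisq, F, ddof1, ddof2, p_f)
```

With these data the two variants share the same W and differ only in the reference distribution, which is why the p-values in the Examples section are close but not identical.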

Data can be in wide or long format. Missing values are automatically removed using a strict listwise approach (complete-case analysis). In other words, any subject with one or more missing values is completely removed from the dataframe prior to running the test.
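
The effect of listwise deletion can be previewed with plain pandas before running the test. This is a minimal sketch of the behaviour described above, not pingouin's internal code; the small dataframe and the names `wide`, `long` and `complete` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical wide-format ratings with two missing values.
wide = pd.DataFrame({
    "white": [10, 8, np.nan, 9],
    "red": [7, 5, 8, 6],
    "rose": [8, np.nan, 6, 4],
})

# Listwise deletion in wide format: drop every subject (row) with any NaN.
complete = wide.dropna(axis=0, how="any")

# Same idea in long format: pivot to one row per subject, then drop
# incomplete subjects.
long = wide.melt(ignore_index=False).reset_index()
long_complete = (
    long.pivot(index="index", columns="variable", values="value")
    .dropna(axis=0, how="any")
)

print(complete.index.tolist())
print(long_complete.index.tolist())
```

Only subjects 0 and 3 have a rating for every wine, so they are the only ones that would enter the test in this sketch.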

References

[1]

Marozzi, M. (2014). Testing for concordance between several criteria. Journal of Statistical Computation and Simulation, 84(9), 1843–1850. https://doi.org/10.1080/00949655.2013.766189

Examples

Compute the Friedman test for repeated measurements, using a wide-format dataframe

>>> import pandas as pd
>>> import pingouin as pg
>>> df = pd.DataFrame({
...    'white': {0: 10, 1: 8, 2: 7, 3: 9, 4: 7, 5: 4, 6: 5, 7: 6, 8: 5, 9: 10, 10: 4, 11: 7},
...    'red': {0: 7, 1: 5, 2: 8, 3: 6, 4: 5, 5: 7, 6: 9, 7: 6, 8: 4, 9: 6, 10: 7, 11: 3},
...    'rose': {0: 8, 1: 5, 2: 6, 3: 4, 4: 7, 5: 5, 6: 3, 7: 7, 8: 6, 9: 4, 10: 4, 11: 3}})
>>> pg.friedman(df)
          Source         W  ddof1    Q     p-unc
Friedman  Within  0.083333      2  2.0  0.367879

Compare with SciPy

>>> from scipy.stats import friedmanchisquare
>>> friedmanchisquare(*df.to_numpy().T)
FriedmanchisquareResult(statistic=1.9999999999999893, pvalue=0.3678794411714444)

Using a long-format dataframe

>>> df_long = df.melt(ignore_index=False).reset_index()
>>> pg.friedman(data=df_long, dv="value", within="variable", subject="index")
            Source         W  ddof1    Q     p-unc
Friedman  variable  0.083333      2  2.0  0.367879

Using the F-test method

>>> pg.friedman(df, method="f")
          Source         W     ddof1      ddof2    F     p-unc
Friedman  Within  0.083333  1.833333  20.166667  1.0  0.378959