pingouin.friedman
- pingouin.friedman(data=None, dv=None, within=None, subject=None, method='chisq')
Friedman test for repeated measurements.
- Parameters:
  - data : pandas.DataFrame
    DataFrame. Both wide-format and long-format dataframes are supported for this test.
  - dv : string
    Name of the column containing the dependent variable (only required if data is in long format).
  - within : string
    Name of the column containing the within-subject factor (only required if data is in long format). Two or more within-subject factors are not currently supported.
  - subject : string
    Name of the column containing the subject/rater identifier (only required if data is in long format).
  - method : string
    Statistical test to perform. Must be 'chisq' (chi-square test) or 'f' (F test). See the notes below for an explanation.
- Returns:
  - stats : pandas.DataFrame
    - 'W': Kendall’s coefficient of concordance, corrected for ties
    If method='chisq':
    - 'Q': The Friedman chi-square statistic, corrected for ties
    - 'ddof1': degrees of freedom
    - 'p-unc': Uncorrected p-value of the chi-squared test
    If method='f':
    - 'F': The Friedman F statistic, corrected for ties
    - 'ddof1': degrees of freedom of the numerator
    - 'ddof2': degrees of freedom of the denominator
    - 'p-unc': Uncorrected p-value of the F test
Notes
The Friedman test is used for non-parametric (rank-based) one-way repeated measures ANOVA.
It is equivalent to the test of significance of Kendall’s coefficient of concordance (Kendall’s W). Most commonly, a Q statistic, which asymptotically follows a chi-squared distribution, is computed and used for testing. However, the chi-squared test tends to be overly conservative for small numbers of samples and/or repeated measures, in which case an F-test is more adequate [1].
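To make the relationship between W, Q and F concrete, here is a minimal pure-Python sketch that uses the same values as the wide-format example on this page. The helper names `average_ranks` and `friedman_sketch` are hypothetical. The formulas are the standard tie-corrected Kendall's W with the usual conversions to Q and F, and they are not taken from the library source. They do reproduce the W, Q, F and degrees-of-freedom values shown in the example outputs on this page.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1..k, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value is tied with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # ranks are 1-based
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def friedman_sketch(rows):
    """rows: one list per subject, one value per condition (wide format)."""
    n, k = len(rows), len(rows[0])
    ranked = [average_ranks(row) for row in rows]
    # Sum over conditions of the squared rank totals
    ssbn = sum(sum(r[j] for r in ranked) ** 2 for j in range(k))
    # Tie correction: sum of t * (t^2 - 1) over each group of t tied values
    ties = sum(t * (t * t - 1) for row in rows for t in Counter(row).values())
    w = (12 * ssbn - 3 * n**2 * k * (k + 1) ** 2) / (
        n**2 * k * (k - 1) * (k + 1) - n * ties
    )
    q = n * (k - 1) * w            # chi-squared statistic, k - 1 degrees of freedom
    f = w * (n - 1) / (1 - w)      # F statistic
    ddof1 = k - 1 - 2 / n          # numerator degrees of freedom for the F test
    ddof2 = (n - 1) * ddof1        # denominator degrees of freedom for the F test
    return {"W": w, "Q": q, "F": f, "ddof1": ddof1, "ddof2": ddof2}


white = [10, 8, 7, 9, 7, 4, 5, 6, 5, 10, 4, 7]
red = [7, 5, 8, 6, 5, 7, 9, 6, 4, 6, 7, 3]
rose = [8, 5, 6, 4, 7, 5, 3, 7, 6, 4, 4, 3]
rows = [list(r) for r in zip(white, red, rose)]

stats = friedman_sketch(rows)
print({name: round(value, 6) for name, value in stats.items()})
```

The p-values would then come from the survival functions of the corresponding distributions, for instance `scipy.stats.chi2.sf(Q, k - 1)` for the chi-squared test and `scipy.stats.f.sf(F, ddof1, ddof2)` for the F test.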
Data can be in wide or long format. Missing values are automatically removed using a strict listwise approach (= complete-case analysis). In other words, any subject with one or more missing value(s) is completely removed from the dataframe prior to running the test.
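As an illustration of what listwise (complete-case) deletion means for a wide-format dataframe, the sketch below uses made-up values and plain pandas. It shows the effect described above, not necessarily how the function implements it internally.

```python
import numpy as np
import pandas as pd

# Hypothetical wide-format data: one row per subject, one column per condition
df_missing = pd.DataFrame({
    "white": [10, 8, np.nan, 9],
    "red": [7, 5, 8, 6],
    "rose": [8, np.nan, 6, 4],
})

# Listwise deletion: drop every subject (row) with at least one missing value
complete = df_missing.dropna(axis=0, how="any")

# Subjects 1 and 2 each have one missing value, so only subjects 0 and 3 remain
print(complete.index.tolist())
```

Note that a single missing rating is enough to exclude a subject, so the effective sample size can shrink quickly when missing values are scattered across subjects.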
References
[1] Marozzi, M. (2014). Testing for concordance between several criteria. Journal of Statistical Computation and Simulation, 84(9), 1843–1850. https://doi.org/10.1080/00949655.2013.766189
Examples
Compute the Friedman test for repeated measurements, using a wide-format dataframe
>>> import pandas as pd
>>> import pingouin as pg
>>> df = pd.DataFrame({
...     'white': {0: 10, 1: 8, 2: 7, 3: 9, 4: 7, 5: 4, 6: 5, 7: 6, 8: 5, 9: 10, 10: 4, 11: 7},
...     'red': {0: 7, 1: 5, 2: 8, 3: 6, 4: 5, 5: 7, 6: 9, 7: 6, 8: 4, 9: 6, 10: 7, 11: 3},
...     'rose': {0: 8, 1: 5, 2: 6, 3: 4, 4: 7, 5: 5, 6: 3, 7: 7, 8: 6, 9: 4, 10: 4, 11: 3}})
>>> pg.friedman(df)
          Source         W  ddof1    Q     p-unc
Friedman  Within  0.083333      2  2.0  0.367879
Compare with SciPy
>>> from scipy.stats import friedmanchisquare
>>> friedmanchisquare(*df.to_numpy().T)
FriedmanchisquareResult(statistic=1.9999999999999893, pvalue=0.3678794411714444)
Using a long-format dataframe
>>> df_long = df.melt(ignore_index=False).reset_index()
>>> pg.friedman(data=df_long, dv="value", within="variable", subject="index")
            Source         W  ddof1    Q     p-unc
Friedman  variable  0.083333      2  2.0  0.367879
Using the F-test method
>>> pg.friedman(df, method="f")
          Source         W     ddof1      ddof2    F     p-unc
Friedman  Within  0.083333  1.833333  20.166667  1.0  0.378959