pingouin.cochran
- pingouin.cochran(data=None, dv=None, within=None, subject=None)
Cochran Q test. A special case of the Friedman test when the dependent variable is binary.
- Parameters:
- data : pandas.DataFrame
DataFrame. Both wide-format and long-format dataframes are supported for this test.
- dv : string
Name of the column containing the dependent variable (only required if data is in long format).
- within : string
Name of the column containing the within-subject factor (only required if data is in long format). Two or more within-subject factors are not currently supported.
- subject : string
Name of the column containing the subject/rater identifier (only required if data is in long format).
- Returns:
- stats : pandas.DataFrame
'Q': The Cochran Q statistic
'p-unc': Uncorrected p-value
'dof': Degrees of freedom
Notes
The Cochran Q test [1] is a non-parametric test for ANOVA with repeated measures where the dependent variable is binary.
The Q statistic is defined as:
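\[Q = \frac{(r-1)\left(r\sum_{j=1}^{r}x_j^2-N^2\right)}{rN-\sum_{i=1}^{n}x_i^2}\]
where \(N\) is the total sum of all observations, \(x_j\) is the sum of the observations in condition \(j=1,...,r\), with \(r\) the number of repeated measures, and \(x_i\) is the sum of the observations for subject \(i=1,...,n\), with \(n\) the number of observations per condition.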
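As a sketch of how the formula maps onto code, the snippet below computes Q with NumPy from a subjects-by-conditions array of 0/1 values. The helper name `cochran_q` and the toy data are illustrative assumptions, not part of pingouin. Here \(x_j\) is read as the column (condition) totals and \(x_i\) as the row (subject) totals.

```python
import numpy as np


def cochran_q(x):
    """Cochran Q for an (n_subjects, r_conditions) array of 0/1 values.

    Hypothetical helper for illustration; not the pingouin implementation.
    """
    x = np.asarray(x)
    n, r = x.shape
    col_totals = x.sum(axis=0)  # x_j: sum of observations per condition
    row_totals = x.sum(axis=1)  # x_i: sum of observations per subject
    N = x.sum()                 # total sum of all observations
    numerator = (r - 1) * (r * np.sum(col_totals ** 2) - N ** 2)
    denominator = r * N - np.sum(row_totals ** 2)
    return numerator / denominator


# Toy data (made up): 6 subjects measured under 3 conditions.
x = np.array([
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
])
q = cochran_q(x)
```

For this toy array the condition totals are 5, 4 and 1, the subject totals are 2, 1, 3, 1, 1 and 2, and \(N = 10\), so \(Q = 2(3 \cdot 42 - 100) / (30 - 20) = 5.2\).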
The p-value is then approximated using a chi-square distribution with \(r-1\) degrees of freedom:
\[Q \sim \chi^2(r-1)\]
Data can be in either wide or long format. Missing values are automatically removed using a strict listwise approach (= complete-case analysis). In other words, any subject with one or more missing values is completely removed from the dataframe prior to running the test.
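The listwise deletion and the chi-square approximation can be sketched with pandas and SciPy as follows. This is a minimal illustration on made-up wide-format data, not the code pingouin runs internally. The column names `t1`, `t2`, `t3` and all variable names are assumptions.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2

# Toy wide-format data (hypothetical): one row per subject, one column per condition.
wide = pd.DataFrame({
    "t1": [1, 1, 1, 1, 0, 1, np.nan],
    "t2": [1, 0, 1, 0, 1, 1, 1],
    "t3": [0, 0, 1, 0, 0, 0, 1],
})

# Listwise deletion: drop every subject (row) with at least one missing value.
complete = wide.dropna(axis=0, how="any")

# Q statistic on the complete cases, following the formula in the Notes.
x = complete.to_numpy()
n, r = x.shape
N = x.sum()
q = (r - 1) * (r * (x.sum(axis=0) ** 2).sum() - N ** 2) / (r * N - (x.sum(axis=1) ** 2).sum())

# Uncorrected p-value: upper tail of a chi-square distribution with r - 1 dof.
dof = r - 1
p_unc = chi2.sf(q, dof)
```

The last subject has a missing value in `t1` and is dropped entirely, so the statistic is computed on the six remaining subjects.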
References
[1] Cochran, W.G., 1950. The comparison of percentages in matched samples. Biometrika 37, 256–266. https://doi.org/10.1093/biomet/37.3-4.256
Examples
Compute the Cochran Q test for repeated measurements.
>>> from pingouin import cochran, read_dataset
>>> df = read_dataset('cochran')
>>> cochran(data=df, dv='Energetic', within='Time', subject='Subject')
        Source  dof         Q     p-unc
cochran   Time    2  6.705882  0.034981
Same but using a wide-format dataframe
>>> df_wide = df.pivot_table(index="Subject", columns="Time", values="Energetic")
>>> cochran(df_wide)
         Source  dof         Q     p-unc
cochran  Within    2  6.705882  0.034981