pingouin.pairwise_tests

pingouin.pairwise_tests(data=None, dv=None, between=None, within=None, subject=None, parametric=True, marginal=True, alpha=0.05, alternative='two-sided', padjust='none', effsize='hedges', correction='auto', nan_policy='listwise', return_desc=False, interaction=True, within_first=True)

Pairwise tests.

Parameters:
data : pandas.DataFrame

DataFrame. Note that this function can also directly be used as a Pandas method, in which case this argument is no longer needed.

dv : string

Name of column containing the dependent variable.

between : string or list with 2 elements

Name of column(s) containing the between-subject factor(s).

within : string or list with 2 elements

Name of column(s) containing the within-subject factor(s), i.e. the repeated measurements.

subject : string

Name of column containing the subject identifier. This is mandatory when within is specified.

parametric : boolean

If True (default), use the parametric ttest() function. If False, use pingouin.wilcoxon() or pingouin.mwu() for paired or unpaired samples, respectively.

marginal : boolean

If True (default), the between-subject pairwise T-test(s) will be calculated after averaging across all levels of the within-subject factor in mixed design. This is recommended to avoid violating the assumption of independence and inflating the degrees of freedom by the number of repeated measurements.

Added in version 0.3.2.

alpha : float

Significance level.

alternative : string

Defines the alternative hypothesis, or tail of the test. Must be one of “two-sided” (default), “greater” or “less”. Both “greater” and “less” return one-sided p-values. “greater” tests against the alternative hypothesis that the mean of x is greater than the mean of y.

padjust : string

Method used for testing and adjustment of p-values.

  • 'none': no correction

  • 'bonf': one-step Bonferroni correction

  • 'sidak': one-step Sidak correction

  • 'holm': step-down method using Bonferroni adjustments

  • 'fdr_bh': Benjamini/Hochberg FDR correction

  • 'fdr_by': Benjamini/Yekutieli FDR correction
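As a rough illustration (not Pingouin's internal code), the one-step Bonferroni adjustment simply multiplies each uncorrected p-value by the number of comparisons and clips the result at 1:

```python
import numpy as np

# Hypothetical uncorrected p-values from three pairwise comparisons.
p_unc = np.array([0.087, 0.008, 0.310])

# One-step Bonferroni ('bonf'): multiply each p-value by the number of
# comparisons, then clip at 1 so the result stays a valid probability.
p_bonf = np.clip(p_unc * len(p_unc), None, 1.0)
# p_bonf is approximately [0.261, 0.024, 0.930]
```

The step-down and FDR methods are more involved, but follow the same input/output contract: a vector of uncorrected p-values in, a vector of corrected p-values out.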

effsize : string or None

Effect size type. Available methods are:

  • 'none': no effect size

  • 'cohen': Unbiased Cohen d

  • 'hedges': Hedges g

  • 'r': Pearson correlation coefficient

  • 'eta-square': Eta-square

  • 'odds-ratio': Odds ratio

  • 'AUC': Area Under the Curve

  • 'CLES': Common Language Effect Size

correction : string or boolean

For independent two-sample T-tests, specify whether or not to correct for unequal variances using Welch's separate-variances T-test. If 'auto', the Welch T-test is automatically used when the sample sizes are unequal, as recommended by Zimmerman (2004).

Added in version 0.3.2.
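The Student/Welch distinction can be reproduced with SciPy (a sketch with made-up data, not Pingouin's own ttest()):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Two hypothetical independent groups with unequal sizes and variances.
a = rng.normal(0.0, 1.0, size=30)
b = rng.normal(0.5, 2.0, size=50)

# Student's T-test assumes equal variances (correction=False)...
t_student, p_student = stats.ttest_ind(a, b, equal_var=True)
# ...whereas Welch's T-test does not (correction=True, or 'auto'
# when the sample sizes differ, as they do here).
t_welch, p_welch = stats.ttest_ind(a, b, equal_var=False)
```

With unequal variances and unequal group sizes, the two tests generally yield different degrees of freedom and therefore different p-values.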

nan_policy : string

Can be ‘listwise’ for listwise deletion of missing values in repeated measures design (= complete-case analysis) or ‘pairwise’ for the more liberal pairwise deletion (= available-case analysis). The former (default) is more appropriate for post-hoc analysis following an ANOVA; however, it can drastically reduce the power of the test: any subject with one or more missing value(s) will be completely removed from the analysis.

Added in version 0.2.9.

return_desc : boolean

If True, append group means and standard deviations to the output dataframe.

interaction : boolean

If there are multiple factors and interaction is True (default), Pingouin will also calculate T-tests for the interaction term (see Notes).

Added in version 0.2.9.

within_first : boolean

Determines the order of the interaction in mixed design. Pingouin will return within * between when this parameter is set to True (default), and between * within otherwise.

Added in version 0.3.6.

Returns:
stats : pandas.DataFrame
  • 'Contrast': Contrast (= independent variable or interaction)

  • 'A': Name of first measurement

  • 'B': Name of second measurement

  • 'Paired': indicates whether the two measurements are paired or independent

  • 'Parametric': indicates whether a parametric or non-parametric test was used

  • 'T': T statistic (only if parametric=True)

  • 'U-val': Mann-Whitney U stat (if parametric=False and unpaired data)

  • 'W-val': Wilcoxon W stat (if parametric=False and paired data)

  • 'dof': degrees of freedom (only if parametric=True)

  • 'alternative': tail of the test

  • 'p-unc': Uncorrected p-values

  • 'p-corr': Corrected p-values

  • 'p-adjust': p-values correction method

  • 'BF10': Bayes Factor

  • 'hedges': effect size (or any effect size defined in effsize)

Notes

Data are expected to be in long-format. If your data is in wide-format, you can use the pandas.melt() function to convert from wide to long format.
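For instance, a hypothetical wide-format dataframe (one row per subject, one column per time point) can be reshaped as follows before being passed to pairwise_tests():

```python
import pandas as pd

# Hypothetical wide-format data: one row per subject,
# one column per time point.
wide = pd.DataFrame({
    "Subject": [1, 2, 3],
    "August": [5.5, 6.1, 5.9],
    "January": [6.0, 6.4, 6.2],
    "June": [6.3, 6.8, 6.5],
})

# Long format: one row per (subject, time point) observation,
# which is the layout pairwise_tests() expects.
long_df = wide.melt(id_vars="Subject", var_name="Time", value_name="Scores")
print(long_df.shape)  # (9, 3)
```

The resulting dataframe has one Scores value per row, with Subject and Time identifying each observation.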

If between or within is a list (e.g. [‘col1’, ‘col2’]), the function returns 1) the pairwise T-tests between each value of the first column, 2) the pairwise T-tests between each value of the second column and 3) the interaction between col1 and col2. The interaction depends on the order of the list, so [‘col1’, ‘col2’] will not yield the same results as [‘col2’, ‘col1’]. Furthermore, the interaction will only be calculated if interaction=True.

If between is a list with two elements, the output model is between1 + between2 + between1 * between2.

Similarly, if within is a list with two elements, the output model is within1 + within2 + within1 * within2.

If both between and within are specified, the output model is within + between + within * between (= mixed design), unless within_first=False in which case the model becomes between + within + between * within.

Missing values in repeated measurements are automatically removed using a listwise (default) or pairwise deletion strategy. The former is more conservative, as any subject with one or more missing value(s) will be completely removed from the dataframe prior to calculating the T-tests. The nan_policy parameter can therefore have a huge impact on the results.
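The difference between the two strategies can be illustrated with plain pandas (a conceptual sketch, not the function's actual code path):

```python
import numpy as np
import pandas as pd

# Toy repeated-measures data: subject 2 is missing one measurement.
df = pd.DataFrame({
    "Subject": [1, 1, 2, 2, 3, 3],
    "Time": ["pre", "post"] * 3,
    "Scores": [4.0, 5.0, np.nan, 6.0, 5.5, 6.5],
})

# Listwise deletion (nan_policy='listwise'): any subject with at least
# one missing value is dropped entirely before the T-tests.
listwise = df.groupby("Subject").filter(lambda g: g["Scores"].notna().all())
print(listwise["Subject"].nunique())  # 2

# Pairwise deletion (nan_policy='pairwise'): only the individual missing
# observations are dropped, so subject 2 still contributes where possible.
pairwise = df.dropna(subset=["Scores"])
print(pairwise["Subject"].nunique())  # 3
```

Under listwise deletion, subject 2 is removed from every comparison; under pairwise deletion, subject 2 is only excluded from comparisons involving the missing time point.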

Examples

For more examples, please refer to the Jupyter notebooks

  1. One between-subject factor

>>> import pandas as pd
>>> import pingouin as pg
>>> pd.set_option('display.expand_frame_repr', False)
>>> pd.set_option('display.max_columns', 20)
>>> df = pg.read_dataset('mixed_anova.csv')
>>> pg.pairwise_tests(dv='Scores', between='Group', data=df).round(3)
  Contrast        A           B  Paired  Parametric     T    dof alternative  p-unc   BF10  hedges
0    Group  Control  Meditation   False        True -2.29  178.0   two-sided  0.023  1.813   -0.34
  2. One within-subject factor

>>> post_hocs = pg.pairwise_tests(dv='Scores', within='Time', subject='Subject', data=df)
>>> post_hocs.round(3)
  Contrast        A        B  Paired  Parametric      T   dof alternative  p-unc   BF10  hedges
0     Time   August  January    True        True -1.740  59.0   two-sided  0.087  0.582  -0.328
1     Time   August     June    True        True -2.743  59.0   two-sided  0.008  4.232  -0.483
2     Time  January     June    True        True -1.024  59.0   two-sided  0.310  0.232  -0.170
  3. Non-parametric pairwise paired test (Wilcoxon)

>>> pg.pairwise_tests(dv='Scores', within='Time', subject='Subject',
...                    data=df, parametric=False).round(3)
  Contrast        A        B  Paired  Parametric  W-val alternative  p-unc  hedges
0     Time   August  January    True       False  716.0   two-sided  0.144  -0.328
1     Time   August     June    True       False  564.0   two-sided  0.010  -0.483
2     Time  January     June    True       False  887.0   two-sided  0.840  -0.170
  4. Mixed design (within and between) with Bonferroni-corrected p-values

>>> posthocs = pg.pairwise_tests(dv='Scores', within='Time', subject='Subject',
...                               between='Group', padjust='bonf', data=df)
>>> posthocs.round(3)
       Contrast     Time        A           B Paired  Parametric      T   dof alternative  p-unc  p-corr p-adjust   BF10  hedges
0          Time        -   August     January   True        True -1.740  59.0   two-sided  0.087   0.261     bonf  0.582  -0.328
1          Time        -   August        June   True        True -2.743  59.0   two-sided  0.008   0.024     bonf  4.232  -0.483
2          Time        -  January        June   True        True -1.024  59.0   two-sided  0.310   0.931     bonf  0.232  -0.170
3         Group        -  Control  Meditation  False        True -2.248  58.0   two-sided  0.028     NaN      NaN  2.096  -0.573
4  Time * Group   August  Control  Meditation  False        True  0.316  58.0   two-sided  0.753   1.000     bonf  0.274   0.081
5  Time * Group  January  Control  Meditation  False        True -1.434  58.0   two-sided  0.157   0.471     bonf  0.619  -0.365
6  Time * Group     June  Control  Meditation  False        True -2.744  58.0   two-sided  0.008   0.024     bonf  5.593  -0.699
  5. Two between-subject factors. The order of the between factors matters!

>>> pg.pairwise_tests(dv='Scores', between=['Group', 'Time'], data=df).round(3)
       Contrast       Group        A           B Paired  Parametric      T    dof alternative  p-unc     BF10  hedges
0         Group           -  Control  Meditation  False        True -2.290  178.0   two-sided  0.023    1.813  -0.340
1          Time           -   August     January  False        True -1.806  118.0   two-sided  0.074    0.839  -0.328
2          Time           -   August        June  False        True -2.660  118.0   two-sided  0.009    4.499  -0.483
3          Time           -  January        June  False        True -0.934  118.0   two-sided  0.352    0.288  -0.170
4  Group * Time     Control   August     January  False        True -0.383   58.0   two-sided  0.703    0.279  -0.098
5  Group * Time     Control   August        June  False        True -0.292   58.0   two-sided  0.771    0.272  -0.074
6  Group * Time     Control  January        June  False        True  0.045   58.0   two-sided  0.964    0.263   0.011
7  Group * Time  Meditation   August     January  False        True -2.188   58.0   two-sided  0.033    1.884  -0.558
8  Group * Time  Meditation   August        June  False        True -4.040   58.0   two-sided  0.000  148.302  -1.030
9  Group * Time  Meditation  January        June  False        True -1.442   58.0   two-sided  0.155    0.625  -0.367
  6. Same but without the interaction, and using a directional test

>>> df.pairwise_tests(dv='Scores', between=['Group', 'Time'], alternative="less",
...                    interaction=False).round(3)
  Contrast        A           B  Paired  Parametric      T    dof alternative  p-unc   BF10  hedges
0    Group  Control  Meditation   False        True -2.290  178.0        less  0.012  3.626  -0.340
1     Time   August     January   False        True -1.806  118.0        less  0.037  1.679  -0.328
2     Time   August        June   False        True -2.660  118.0        less  0.004  8.998  -0.483
3     Time  January        June   False        True -0.934  118.0        less  0.176  0.577  -0.170