pingouin.ptests
- pingouin.ptests(self, paired=False, decimals=3, padjust=None, stars=True, pval_stars={0.001: '***', 0.01: '**', 0.05: '*'}, **kwargs)
Pairwise T-test between columns of a dataframe.
T-values are reported on the lower triangle of the output pairwise matrix and p-values on the upper triangle. This method is a faster, but less exhaustive, matrix version of the pingouin.pairwise_tests() function. Missing values are automatically removed from each pairwise T-test.
Added in version 0.5.3.
- Parameters:
- self : pandas.DataFrame
Input dataframe.
- paired : boolean
Specify whether the two observations are related (i.e. repeated measures) or independent.
- decimals : int
Number of decimals to display in the output matrix.
- padjust : string or None
P-values adjustment for multiple comparisons:
  - 'none': no correction
  - 'bonf': one-step Bonferroni correction
  - 'sidak': one-step Sidak correction
  - 'holm': step-down method using Bonferroni adjustments
  - 'fdr_bh': Benjamini/Hochberg FDR correction
  - 'fdr_by': Benjamini/Yekutieli FDR correction
- stars : boolean
If True, only significant p-values are displayed as stars using the pre-defined thresholds of pval_stars. If False, all the raw p-values are displayed.
- pval_stars : dict
Significance thresholds. Default is 3 stars for p-values <0.001, 2 stars for p-values <0.01 and 1 star for p-values <0.05.
- **kwargs : optional
Optional argument(s) passed to the lower-level scipy functions, i.e. scipy.stats.ttest_ind() for independent T-test and scipy.stats.ttest_rel() for paired T-test.
- Returns:
- mat : pandas.DataFrame
Pairwise T-test matrix, of dtype str, with T-values on the lower triangle and p-values on the upper triangle.
Examples
>>> import numpy as np
>>> import pandas as pd
>>> import pingouin as pg
>>> # Load an example dataset of personality dimensions
>>> df = pg.read_dataset('pairwise_corr').iloc[:30, 1:]
>>> df.columns = ["N", "E", "O", 'A', "C"]
>>> # Add some missing values
>>> df.iloc[[2, 5, 20], 2] = np.nan
>>> df.iloc[[1, 4, 10], 3] = np.nan
>>> df.head().round(2)
      N     E     O     A     C
0  2.48  4.21  3.94  3.96  3.46
1  2.60  3.19  3.96   NaN  3.23
2  2.81  2.90   NaN  2.75  3.50
3  2.90  3.56  3.52  3.17  2.79
4  3.02  3.33  4.02   NaN  2.85
Independent pairwise T-tests
>>> df.ptests()
        N       E      O      A    C
N       -     ***    ***    ***  ***
E  -8.397       -                ***
O  -8.332  -0.596      -         ***
A  -8.804    0.12   0.72      -  ***
C  -4.759   3.753  4.074  3.787    -
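The star display can be pictured as a lookup of each p-value against the thresholds in pval_stars. The sketch below is only an illustration of that idea: the helper name `p_to_stars` is hypothetical, and the strict `<` comparison is an assumption taken from the parameter description, not from pingouin's source code.

```python
def p_to_stars(p, pval_stars=None):
    """Map a p-value to a star label; return "" when no threshold is met.

    Hypothetical helper, not part of pingouin.
    """
    if pval_stars is None:
        # Same defaults as the documented signature.
        pval_stars = {0.001: "***", 0.01: "**", 0.05: "*"}
    # Check the smallest threshold first so the strictest matching label wins.
    for threshold in sorted(pval_stars):
        if p < threshold:
            return pval_stars[threshold]
    return ""


for p in (0.0004, 0.003, 0.02, 0.2):
    print(p, repr(p_to_stars(p)))
```

A non-significant p-value maps to an empty string, which is consistent with the blank cells in the upper triangle of the matrix.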
Let’s compare with SciPy
>>> from scipy.stats import ttest_ind
>>> np.round(ttest_ind(df["N"], df["E"]), 3)
array([-8.397,  0.   ])
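To make the matrix layout concrete, here is a minimal sketch of how a similar matrix could be filled by calling scipy.stats.ttest_ind() on every pair of columns. It covers only the independent case without p-value adjustment. The helper `pairwise_t_matrix` and the toy data are hypothetical, the per-column removal of missing values is an assumption, and none of this is pingouin's actual implementation.

```python
import numpy as np
from scipy.stats import ttest_ind


def pairwise_t_matrix(columns, decimals=3):
    """Return (names, matrix) with T-values below the diagonal, p-values above.

    `columns` maps a column name to a 1-D sequence of floats.
    Hypothetical helper, not pingouin's implementation.
    """
    names = list(columns)
    k = len(names)
    mat = [["-" if i == j else "" for j in range(k)] for i in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            x = np.asarray(columns[names[i]], dtype=float)
            y = np.asarray(columns[names[j]], dtype=float)
            # Assumption: missing values are dropped separately for each pair.
            x, y = x[~np.isnan(x)], y[~np.isnan(y)]
            t, p = ttest_ind(x, y)
            mat[j][i] = f"{t:.{decimals}f}"  # lower triangle: T-value
            mat[i][j] = f"{p:.{decimals}f}"  # upper triangle: p-value
    return names, mat


# Toy data with a few missing values.
data = {
    "a": [1.0, 2.0, 3.0, 4.0, np.nan],
    "b": [2.0, 3.0, 4.0, 5.0, 6.0],
    "c": [10.0, 11.0, np.nan, 13.0, 14.0],
}
names, mat = pairwise_t_matrix(data)
for name, row in zip(names, mat):
    print(name, row)
```

The cell at row j, column i (with i < j) holds the T-value of column i tested against column j, which mirrors the sign convention seen in the comparison with SciPy.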
Passing custom parameters to the lower-level scipy.stats.ttest_ind() function
>>> df.ptests(alternative="greater", equal_var=True)
        N       E      O      A    C
N       -
E  -8.397       -                ***
O  -8.332  -0.596      -         ***
A  -8.804    0.12   0.72      -  ***
C  -4.759   3.753  4.074  3.787    -
Paired T-test, showing the actual p-values instead of stars
>>> df.ptests(paired=True, stars=False, decimals=4)
         N        E       O       A       C
N        -   0.0000  0.0000  0.0000  0.0002
E  -7.0773        -  0.8776  0.7522  0.0012
O  -8.0568  -0.1555       -  0.8137  0.0008
A  -8.3994   0.3191  0.2383       -  0.0009
C  -4.2511   3.5953  3.7849  3.7652       -
Adjusting for multiple comparisons using the Holm-Bonferroni method
>>> df.ptests(paired=True, stars=False, padjust="holm")
        N       E      O      A      C
N       -   0.000  0.000  0.000  0.001
E  -7.077       -     1.     1.  0.005
O  -8.057  -0.155      -     1.  0.005
A  -8.399   0.319  0.238      -  0.005
C  -4.251   3.595  3.785  3.765      -
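For readers who want to see what the Holm step-down adjustment does to a set of p-values, the following is a minimal pure-Python sketch of the textbook procedure. The function name `holm_adjust` is hypothetical, and the inputs are the rounded upper-triangle p-values from the paired example (read row by row). It assumes the correction is applied across those ten p-values and is not pingouin's code.

```python
def holm_adjust(pvals):
    """Return Holm step-down adjusted p-values in the original order.

    Hypothetical helper implementing the textbook procedure.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Scale the rank-th smallest p-value by the number of hypotheses
        # not yet processed, and cap the result at 1.
        candidate = min(1.0, (m - rank) * pvals[idx])
        # Enforce monotonicity: an adjusted p-value never drops below
        # the one obtained for a smaller raw p-value.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Rounded p-values from the paired example, upper triangle, row by row:
# N-E, N-O, N-A, N-C, E-O, E-A, E-C, O-A, O-C, A-C
raw = [0.0000, 0.0000, 0.0000, 0.0002, 0.8776,
       0.7522, 0.0012, 0.8137, 0.0008, 0.0009]
print([round(p, 3) for p in holm_adjust(raw)])
```

Because the inputs are already rounded to four decimals, the result can only be expected to agree with the adjusted matrix up to rounding.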