pingouin.pairwise_corr
- pingouin.pairwise_corr(data, columns=None, covar=None, alternative='two-sided', method='pearson', padjust='none', nan_policy='pairwise')
Pairwise (partial) correlations between columns of a pandas dataframe.
- Parameters:
- data
pandas.DataFrame
DataFrame. Note that this function can also be used directly as a pandas method, in which case this argument is no longer needed.
- columns
list or str
Column names in data:
["a", "b", "c"]: combination between columns a, b, and c.
["a"]: product between a and all the other numeric columns.
[["a"], ["b", "c"]]: product between ["a"] and ["b", "c"].
[["a", "d"], ["b", "c"]]: product between ["a", "d"] and ["b", "c"].
[["a", "d"], None]: product between ["a", "d"] and all other numeric columns in the dataframe.
If columns is None, the function returns the pairwise correlations between all combinations of the numeric columns in data. See the examples section for more details.
- covar
None, string or list
Covariate(s) for partial correlation. Must be one or more columns in data. Use a list if there is more than one covariate. If covar is not None, a partial correlation will be computed using the pingouin.partial_corr() function.
Important
Only method='pearson' and method='spearman' are currently supported in partial correlation.
- alternative
string
Defines the alternative hypothesis, or tail of the correlation. Must be one of “two-sided” (default), “greater” or “less”. Both “greater” and “less” return a one-sided p-value. “greater” tests against the alternative hypothesis that the correlation is positive (greater than zero), “less” tests against the hypothesis that the correlation is negative.
- method
string
Correlation type:
'pearson': Pearson \(r\) product-moment correlation
'spearman': Spearman \(\rho\) rank-order correlation
'kendall': Kendall's \(\tau_B\) correlation (for ordinal data)
'bicor': Biweight midcorrelation (robust)
'percbend': Percentage bend correlation (robust)
'shepherd': Shepherd's pi correlation (robust)
'skipped': Skipped correlation (robust)
- padjust
string
Method used for testing and adjustment of p-values.
'none': no correction
'bonf': one-step Bonferroni correction
'sidak': one-step Sidak correction
'holm': step-down method using Bonferroni adjustments
'fdr_bh': Benjamini/Hochberg FDR correction
'fdr_by': Benjamini/Yekutieli FDR correction
- nan_policy
string
Can be 'listwise' for listwise deletion of missing values (= complete-case analysis) or 'pairwise' (default) for the more liberal pairwise deletion (= available-case analysis).
Added in version 0.2.9.
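The practical difference between the two nan_policy options is the sample size each pair ends up with. The sketch below is not Pingouin's implementation: it uses plain Python lists with None marking a missing value, and the helper names pairwise_n and listwise_n are made up for illustration.

```python
from itertools import combinations

# Toy data (hypothetical, not a Pingouin dataset); None marks a missing value.
data = {
    "a": [1.0, 2.0, None, 4.0, 5.0],
    "b": [2.0, None, 3.0, 4.0, 6.0],
    "c": [1.0, 1.5, 2.0, 2.5, 3.0],
}


def pairwise_n(data, x, y):
    """Rows usable for the (x, y) pair: only these two columns must be present."""
    return sum(1 for u, v in zip(data[x], data[y]) if u is not None and v is not None)


def listwise_n(data):
    """Rows usable under listwise deletion: every column must be present."""
    return sum(1 for row in zip(*data.values()) if all(v is not None for v in row))


for x, y in combinations(data, 2):
    print(x, y, "pairwise n =", pairwise_n(data, x, y), "| listwise n =", listwise_n(data))
```

With pairwise deletion each pair keeps every row where both of its columns are observed, so n can differ from pair to pair. With listwise deletion all pairs share the same, usually smaller, n.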
- Returns:
- stats
pandas.DataFrame
'X': Name(s) of first columns.
'Y': Name(s) of second columns.
'method': Correlation type.
'covar': List of specified covariate(s), only when covariates are passed.
'alternative': Tail of the test.
'n': Sample size (after removal of missing values).
'r': Correlation coefficients.
'CI95': 95% parametric confidence intervals.
'p-unc': Uncorrected p-values.
'p-corr': Corrected p-values.
'p-adjust': P-value correction method.
'BF10': Bayes Factor of the alternative hypothesis (only for Pearson correlation).
'power': Achieved power of the test (= 1 - type II error).
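As a sanity check on the confidence interval column, a common parametric interval for a correlation coefficient is built with the Fisher z-transform. The sketch below assumes that construction (this page does not state which formula Pingouin uses), and fisher_ci is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, confidence=0.95):
    """Two-sided confidence interval for a correlation via the Fisher z-transform."""
    z = math.atanh(r)                                   # Fisher transform of r
    se = 1.0 / math.sqrt(n - 3)                         # approximate standard error of z
    crit = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


lo, hi = fisher_ci(-0.350, 500)
print(round(lo, 2), round(hi, 2))
```

For r = -0.350 and n = 500 this yields roughly [-0.42, -0.27], in line with the two-sided Pearson interval reported for Neuroticism and Extraversion in the examples on this page.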
Notes
Please refer to the pingouin.corr() function for a description of the different methods. By default, missing values are removed from the data using pairwise deletion (see nan_policy).
This function is more flexible and gives a much more detailed output than the pandas.DataFrame.corr() method (e.g. p-values, confidence intervals, Bayes Factor…). This comes, however, at an increased computational cost. While this should not be discernible for a dataframe with fewer than 10,000 rows and/or fewer than 20 columns, this function can be slow for very large datasets.
A faster alternative to get the r-values and p-values in a matrix format is to use the pingouin.rcorr() function, which works directly as a pandas.DataFrame method (see example below).
This function also works with two-dimensional multi-index columns. In this case, columns must be list(s) of tuple(s). Please refer to this example Jupyter notebook for more details.
If and only if covar is specified, this function will compute the pairwise partial correlation between the variables. If you are only interested in computing the partial correlation matrix (i.e. the raw pairwise partial correlation coefficient matrix, without the p-values, sample sizes, etc), a better alternative is to use the pingouin.pcorr() function (see example 7).
Examples
One-sided Spearman correlation corrected for multiple comparisons
>>> import pandas as pd
>>> import pingouin as pg
>>> pd.set_option('display.expand_frame_repr', False)
>>> pd.set_option('display.max_columns', 20)
>>> data = pg.read_dataset('pairwise_corr').iloc[:, 1:]
>>> pg.pairwise_corr(data, method='spearman', alternative='greater', padjust='bonf').round(3)
               X                  Y    method  alternative    n      r         CI95%  p-unc  p-corr  p-adjust  power
0    Neuroticism       Extraversion  spearman      greater  500 -0.325  [-0.39, 1.0]  1.000   1.000      bonf  0.000
1    Neuroticism           Openness  spearman      greater  500 -0.028   [-0.1, 1.0]  0.735   1.000      bonf  0.012
2    Neuroticism      Agreeableness  spearman      greater  500 -0.151  [-0.22, 1.0]  1.000   1.000      bonf  0.000
3    Neuroticism  Conscientiousness  spearman      greater  500 -0.356  [-0.42, 1.0]  1.000   1.000      bonf  0.000
4   Extraversion           Openness  spearman      greater  500  0.243   [0.17, 1.0]  0.000   0.000      bonf  1.000
5   Extraversion      Agreeableness  spearman      greater  500  0.062  [-0.01, 1.0]  0.083   0.832      bonf  0.398
6   Extraversion  Conscientiousness  spearman      greater  500  0.056  [-0.02, 1.0]  0.106   1.000      bonf  0.345
7       Openness      Agreeableness  spearman      greater  500  0.170    [0.1, 1.0]  0.000   0.001      bonf  0.985
8       Openness  Conscientiousness  spearman      greater  500 -0.007  [-0.08, 1.0]  0.560   1.000      bonf  0.036
9  Agreeableness  Conscientiousness  spearman      greater  500  0.161   [0.09, 1.0]  0.000   0.002      bonf  0.976
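To make the relation between the uncorrected and corrected p-value columns concrete, here is a minimal pure-Python sketch of three of the corrections listed under padjust. It follows the textbook definitions and is not Pingouin's code; the function names are illustrative.

```python
def bonferroni(pvals):
    """One-step Bonferroni: multiply each p-value by the number of tests, cap at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def sidak(pvals):
    """One-step Sidak: 1 - (1 - p) ** m."""
    m = len(pvals)
    return [1.0 - (1.0 - p) ** m for p in pvals]


def holm(pvals):
    """Step-down Holm: factors m, m-1, ..., 1 applied to the sorted p-values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Keep adjusted p-values non-decreasing along the sorted order.
        running_max = max(running_max, min(1.0, (m - rank) * pvals[i]))
        adjusted[i] = running_max
    return adjusted


pvals = [0.01, 0.04, 0.03]
print(bonferroni(pvals))
print(holm(pvals))
print([round(p, 4) for p in sidak(pvals)])
```

In the table above there are ten pairwise tests, so an uncorrected p-value of about 0.083 becomes about 0.83 after Bonferroni correction, and any product above 1 is capped at 1.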
Robust two-sided biweight midcorrelation with uncorrected p-values
>>> pcor = pg.pairwise_corr(data, columns=['Openness', 'Extraversion',
...                                        'Neuroticism'], method='bicor')
>>> pcor.round(3)
              X             Y  method  alternative    n      r           CI95%  p-unc  power
0      Openness  Extraversion   bicor    two-sided  500  0.247    [0.16, 0.33]  0.000  1.000
1      Openness   Neuroticism   bicor    two-sided  500 -0.028   [-0.12, 0.06]  0.535  0.095
2  Extraversion   Neuroticism   bicor    two-sided  500 -0.343  [-0.42, -0.26]  0.000  1.000
One-versus-all pairwise correlations
>>> pg.pairwise_corr(data, columns=['Neuroticism']).round(3)
             X                  Y   method  alternative    n      r           CI95%  p-unc       BF10  power
0  Neuroticism       Extraversion  pearson    two-sided  500 -0.350  [-0.42, -0.27]  0.000  6.765e+12  1.000
1  Neuroticism           Openness  pearson    two-sided  500 -0.010    [-0.1, 0.08]  0.817      0.058  0.056
2  Neuroticism      Agreeableness  pearson    two-sided  500 -0.134  [-0.22, -0.05]  0.003      5.122  0.854
3  Neuroticism  Conscientiousness  pearson    two-sided  500 -0.368  [-0.44, -0.29]  0.000  2.644e+14  1.000
Pairwise correlations between two lists of columns (cartesian product)
>>> columns = [['Neuroticism', 'Extraversion'], ['Openness']]
>>> pg.pairwise_corr(data, columns).round(3)
              X         Y   method  alternative    n      r         CI95%  p-unc       BF10  power
0   Neuroticism  Openness  pearson    two-sided  500 -0.010  [-0.1, 0.08]  0.817      0.058  0.056
1  Extraversion  Openness  pearson    two-sided  500  0.267  [0.18, 0.35]  0.000  5.277e+06  1.000
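The different forms accepted by columns boil down to combinations within one list or a cartesian product between two lists. The sketch below shows one way to enumerate the resulting (X, Y) pairs with itertools. expand_pairs is a hypothetical helper written for this illustration, not a Pingouin function, and it assumes flat (non multi-index) column names.

```python
from itertools import combinations, product


def expand_pairs(columns, numeric_cols):
    """Hypothetical helper: list the (X, Y) pairs implied by `columns`."""
    if columns is None:
        # All pairwise combinations of the numeric columns.
        return list(combinations(numeric_cols, 2))
    if isinstance(columns, str):
        columns = [columns]
    if all(isinstance(c, str) for c in columns):
        if len(columns) == 1:
            # One-versus-all: the single column against every other numeric column.
            others = [c for c in numeric_cols if c != columns[0]]
            return list(product(columns, others))
        # Flat list: combinations within the list.
        return list(combinations(columns, 2))
    # Two lists: cartesian product; None stands for "all other numeric columns".
    first, second = columns
    if second is None:
        second = [c for c in numeric_cols if c not in first]
    return list(product(first, second))


numeric = ['Neuroticism', 'Extraversion', 'Openness',
           'Agreeableness', 'Conscientiousness']
print(len(expand_pairs(None, numeric)))
print(expand_pairs([['Neuroticism', 'Extraversion'], ['Openness']], numeric))
```

With the five personality columns used in these examples, columns=None gives the ten pairs of the first example, and the two-list form gives the two Neuroticism/Openness and Extraversion/Openness rows shown above.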
As a Pandas method
>>> pcor = data.pairwise_corr(covar='Neuroticism', method='spearman')
Pairwise partial correlation
>>> pg.pairwise_corr(data, covar=['Neuroticism', 'Openness'])
               X                  Y   method                        covar  alternative    n         r          CI95%     p-unc
0   Extraversion      Agreeableness  pearson  ['Neuroticism', 'Openness']    two-sided  500 -0.038737  [-0.13, 0.05]  0.388361
1   Extraversion  Conscientiousness  pearson  ['Neuroticism', 'Openness']    two-sided  500 -0.071427  [-0.16, 0.02]  0.111389
2  Agreeableness  Conscientiousness  pearson  ['Neuroticism', 'Openness']    two-sided  500  0.123108   [0.04, 0.21]  0.005944
Pairwise partial correlation matrix using pingouin.pcorr()
>>> data[['Neuroticism', 'Openness', 'Extraversion']].pcorr().round(3)
              Neuroticism  Openness  Extraversion
Neuroticism         1.000     0.092        -0.360
Openness            0.092     1.000         0.281
Extraversion       -0.360     0.281         1.000
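With three variables, each entry of the matrix above is the correlation between two variables after controlling for the third. As a cross-check, the standard first-order partial correlation formula can be applied to the rounded Pearson coefficients reported in the earlier examples. This is a sketch of the textbook formula, not of Pingouin's internals, and partial_r is an illustrative name.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Rounded Pearson coefficients taken from the examples on this page.
r_no = -0.010   # Neuroticism vs Openness
r_ne = -0.350   # Neuroticism vs Extraversion
r_oe = 0.267    # Openness vs Extraversion

print(round(partial_r(r_no, r_ne, r_oe), 3))  # Neuroticism-Openness, given Extraversion
print(round(partial_r(r_ne, r_no, r_oe), 3))  # Neuroticism-Extraversion, given Openness
print(round(partial_r(r_oe, r_no, r_ne), 3))  # Openness-Extraversion, given Neuroticism
```

Because the inputs are already rounded to three decimals, the results should agree with the off-diagonal entries of the matrix (about 0.092, -0.360 and 0.281) only up to rounding.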
Correlation matrix with p-values using pingouin.rcorr()
>>> data[['Neuroticism', 'Openness', 'Extraversion']].rcorr()
             Neuroticism Openness Extraversion
Neuroticism            -                   ***
Openness           -0.01        -          ***
Extraversion       -0.35    0.267            -