pingouin.compute_esci

pingouin.compute_esci(stat=None, nx=None, ny=None, paired=False, eftype='cohen', confidence=0.95, decimals=2, alternative='two-sided')

Parametric confidence intervals around a Cohen d or a correlation coefficient.

Parameters:
stat : float

Original effect size. Must be either a correlation coefficient or a Cohen-type effect size (Cohen d or Hedges g).

nx, ny : int

Lengths of vectors x and y.

paired : bool

Indicates whether the effect size was estimated from a paired sample. This is only relevant for Cohen-type effect sizes (Cohen d or Hedges g).

eftype : string

Effect size type. Must be “r” (correlation) or “cohen” (Cohen d or Hedges g).

confidence : float

Confidence level (0.95 = 95%).

decimals : int

Number of decimals to which the interval bounds are rounded.

alternative : string

Defines the alternative hypothesis, or tail for the correlation coefficient. Must be one of “two-sided” (default), “greater” or “less”. This parameter only has an effect if eftype is “r”.

Returns:
ci : array

Lower and upper bounds of the confidence interval around the effect size.

Notes

To compute the parametric confidence interval around a Pearson r correlation coefficient, one must first apply a Fisher’s r-to-z transformation:

\[z = 0.5 \cdot \ln \frac{1 + r}{1 - r} = \text{arctanh}(r)\]

and compute the standard error:

\[\text{SE} = \frac{1}{\sqrt{n - 3}}\]

where \(n\) is the sample size.

The lower and upper bounds of the confidence interval, in z-space, are then given by:

\[\text{ci}_z = z \pm \text{crit} \cdot \text{SE}\]

where \(\text{crit}\) is the critical value of the normal distribution corresponding to the desired confidence level (e.g. 1.96 in case of a 95% confidence interval).

These bounds can then be converted back to r-space:

\[\text{ci}_r = \frac{\exp(2 \cdot \text{ci}_z) - 1} {\exp(2 \cdot \text{ci}_z) + 1} = \text{tanh}(\text{ci}_z)\]
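
The steps above can be sketched in a few lines of standard-library Python. This is only an illustrative sketch of the two-sided case, not Pingouin's implementation: the helper name fisher_ci is hypothetical, and the critical value is taken from statistics.NormalDist rather than from SciPy.

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, confidence=0.95, decimals=2):
    """Two-sided parametric CI around a Pearson r (hypothetical helper)."""
    z = math.atanh(r)              # Fisher's r-to-z transformation
    se = 1 / math.sqrt(n - 3)      # standard error in z-space
    # Two-sided critical value of the standard normal (about 1.96 for 95%)
    crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower_z, upper_z = z - crit * se, z + crit * se
    # Convert both bounds back to r-space
    return round(math.tanh(lower_z), decimals), round(math.tanh(upper_z), decimals)


print(fisher_ci(0.7468, 11))  # (0.27, 0.93)
```

Because the interval is symmetric in z-space but not in r-space, the back-transformed bounds are not equidistant from r.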

A formula for calculating the confidence interval for a Cohen d effect size is given by Hedges and Olkin (1985, p. 86). If the effect size estimated from the sample is \(d\), then it follows a T distribution with standard error:

\[\text{SE} = \sqrt{\frac{n_x + n_y}{n_x \cdot n_y} + \frac{d^2}{2 (n_x + n_y)}}\]

where \(n_x\) and \(n_y\) are the sample sizes of the two groups.

In a one-sample or paired test, this becomes:

\[\text{SE} = \sqrt{\frac{1}{n_x} + \frac{d^2}{2 n_x}}\]

The lower and upper bounds of the confidence interval are then given by:

\[\text{ci}_d = d \pm \text{crit} \cdot \text{SE}\]

where \(\text{crit}\) is the critical value of the T distribution corresponding to the desired confidence level.
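
As a rough illustration, the two standard-error formulas and the resulting interval can be written as follows. The helper names cohen_d_se and cohen_d_ci are hypothetical, and the critical value is passed in by the caller. The value 2.086 used here assumes a two-sided 95% interval with nx + ny - 2 = 20 degrees of freedom, which is an assumption, since the degrees of freedom are not stated above.

```python
import math


def cohen_d_se(d, nx, ny=None, paired=False):
    """Standard error of a Cohen d (hypothetical helper)."""
    if paired or ny is None:
        # One-sample or paired case
        return math.sqrt(1 / nx + d**2 / (2 * nx))
    # Two independent samples
    return math.sqrt((nx + ny) / (nx * ny) + d**2 / (2 * (nx + ny)))


def cohen_d_ci(d, se, crit, decimals=3):
    """Symmetric interval d -/+ crit * SE, rounded (hypothetical helper)."""
    return round(d - crit * se, decimals), round(d + crit * se, decimals)


se = cohen_d_se(0.1538, nx=11, ny=11)
# crit = 2.086 assumes 20 degrees of freedom (assumption, see the text above);
# it could be obtained with, e.g., scipy.stats.t.ppf(0.975, 20).
print(cohen_d_ci(0.1538, se, crit=2.086))  # (-0.737, 1.045)
```

Passing the critical value explicitly keeps the sketch free of third-party dependencies and makes the degrees-of-freedom assumption visible at the call site.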

Examples

  1. Confidence interval of a Pearson correlation coefficient

>>> import pingouin as pg
>>> x = [3, 4, 6, 7, 5, 6, 7, 3, 5, 4, 2]
>>> y = [4, 6, 6, 7, 6, 5, 5, 2, 3, 4, 1]
>>> nx, ny = len(x), len(y)
>>> stat = pg.compute_effsize(x, y, eftype='r')
>>> ci = pg.compute_esci(stat=stat, nx=nx, ny=ny, eftype='r')
>>> print(round(stat, 4), ci)
0.7468 [0.27 0.93]

  2. Confidence interval of a Cohen d

>>> stat = pg.compute_effsize(x, y, eftype='cohen')
>>> ci = pg.compute_esci(stat, nx=nx, ny=ny, eftype='cohen', decimals=3)
>>> print(round(stat, 4), ci)
0.1538 [-0.737  1.045]