pingouin.compute_esci
- pingouin.compute_esci(stat=None, nx=None, ny=None, paired=False, eftype='cohen', confidence=0.95, decimals=2, alternative='two-sided')
Parametric confidence intervals around a Cohen d or a correlation coefficient.
- Parameters:
- stat : float
Original effect size. Must be either a correlation coefficient or a Cohen-type effect size (Cohen d or Hedges g).
- nx, ny : int
Lengths of vectors x and y, i.e. the sample sizes of the two groups.
- paired : bool
Indicates whether the effect size was estimated from a paired sample. This is only relevant for Cohen d or Hedges g effect sizes.
- eftype : string
Effect size type. Must be “r” (correlation) or “cohen” (Cohen d or Hedges g).
- confidence : float
Confidence level (0.95 = 95%).
- decimals : int
Number of decimals to which the result is rounded.
- alternative : string
Defines the alternative hypothesis, or tail, for the correlation coefficient. Must be one of “two-sided” (default), “greater” or “less”. This parameter only has an effect if eftype is “r”.
- Returns:
- ci : array
Lower and upper bounds of the confidence interval around the effect size, rounded to the requested number of decimals.
Notes
To compute the parametric confidence interval around a Pearson r correlation coefficient, one must first apply a Fisher’s r-to-z transformation:
\[z = 0.5 \cdot \ln \frac{1 + r}{1 - r} = \text{arctanh}(r)\]
and compute the standard error:
\[\text{SE} = \frac{1}{\sqrt{n - 3}}\]
where \(n\) is the sample size.
The lower and upper confidence bounds, in z-space, are then given by:
\[\text{ci}_z = z \pm \text{crit} \cdot \text{SE}\]
where \(\text{crit}\) is the critical value of the normal distribution corresponding to the desired confidence level (e.g. 1.96 for a 95% confidence interval).
These bounds can then be converted back to r-space:
\[\text{ci}_r = \frac{\exp(2 \cdot \text{ci}_z) - 1} {\exp(2 \cdot \text{ci}_z) + 1} = \text{tanh}(\text{ci}_z)\]
A formula for calculating the confidence interval for a Cohen d effect size is given by Hedges and Olkin (1985, p86). If the effect size estimate from the sample is \(d\), then it follows a T distribution with standard error:
\[\text{SE} = \sqrt{\frac{n_x + n_y}{n_x \cdot n_y} + \frac{d^2}{2 (n_x + n_y)}}\]
where \(n_x\) and \(n_y\) are the sample sizes of the two groups.
In a one-sample or paired test, this becomes:
\[\text{SE} = \sqrt{\frac{1}{n_x} + \frac{d^2}{2 n_x}}\]
The lower and upper confidence bounds are then given by:
\[\text{ci}_d = d \pm \text{crit} \cdot \text{SE}\]
where \(\text{crit}\) is the critical value of the T distribution corresponding to the desired confidence level.
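The Cohen d formulas above can be sketched by hand as follows. This is an illustrative sketch, not the library's implementation: the helper name cohen_d_ci is hypothetical, and the degrees of freedom used for the critical value (n_x + n_y - 2 for two independent groups, n_x - 1 for the one-sample or paired case) are an assumption, since the text above does not state them.

```python
import numpy as np
from scipy.stats import t


def cohen_d_ci(d, nx, ny, paired=False, confidence=0.95):
    """Hypothetical helper: parametric CI around a Cohen d (sketch only)."""
    if paired:
        # One-sample or paired standard error
        se = np.sqrt(1 / nx + d**2 / (2 * nx))
        dof = nx - 1  # assumed degrees of freedom
    else:
        # Two independent groups
        se = np.sqrt((nx + ny) / (nx * ny) + d**2 / (2 * (nx + ny)))
        dof = nx + ny - 2  # assumed degrees of freedom
    # Two-sided critical value of the T distribution
    crit = t.ppf(1 - (1 - confidence) / 2, dof)
    return np.array([d - crit * se, d + crit * se])


# Values taken from the Cohen d example in the Examples section
print(np.round(cohen_d_ci(0.1538, nx=11, ny=11), 3))  # roughly -0.737 and 1.045
```

Under these assumptions, the sketch gives bounds that agree with the Cohen d interval reported in the Examples section.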
References
Hedges, L., and Ingram Olkin. “Statistical methods for meta-analysis.” (1985).
Examples
Confidence interval of a Pearson correlation coefficient
>>> import pingouin as pg
>>> x = [3, 4, 6, 7, 5, 6, 7, 3, 5, 4, 2]
>>> y = [4, 6, 6, 7, 6, 5, 5, 2, 3, 4, 1]
>>> nx, ny = len(x), len(y)
>>> stat = pg.compute_effsize(x, y, eftype='r')
>>> ci = pg.compute_esci(stat=stat, nx=nx, ny=ny, eftype='r')
>>> print(round(stat, 4), ci)
0.7468 [0.27 0.93]
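The interval for the correlation coefficient can be checked by hand with the Fisher r-to-z steps described in the Notes. The sketch below uses only the standard library; the helper name fisher_ci is hypothetical, and the rounded value r = 0.7468 is used as input, so it is a check of the formulas rather than a copy of the library code.

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, confidence=0.95):
    """Hypothetical helper: two-sided CI for a Pearson r via Fisher's z."""
    z = math.atanh(r)              # r-to-z transformation
    se = 1 / math.sqrt(n - 3)      # standard error in z-space
    # Two-sided critical value of the normal distribution (about 1.96 for 95%)
    crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower_z, upper_z = z - crit * se, z + crit * se
    # Convert the bounds back to r-space
    return math.tanh(lower_z), math.tanh(upper_z)


lower, upper = fisher_ci(0.7468, n=11)
print(round(lower, 2), round(upper, 2))  # 0.27 0.93
```

Note that the interval is symmetric around z but not around r, because tanh compresses values as they approach 1.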
Confidence interval of a Cohen d
>>> stat = pg.compute_effsize(x, y, eftype='cohen')
>>> ci = pg.compute_esci(stat, nx=nx, ny=ny, eftype='cohen', decimals=3)
>>> print(round(stat, 4), ci)
0.1538 [-0.737 1.045]
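For the correlation case, the alternative parameter selects a one-sided interval. The sketch below illustrates one common convention for such intervals (the one used by R's cor.test): the full error probability is placed in a single tail, and the open side of the interval is set to the bound of the correlation scale (1 or -1). Whether compute_esci follows exactly this convention is an assumption here, and the helper name fisher_ci_one_sided is hypothetical.

```python
import math
from statistics import NormalDist


def fisher_ci_one_sided(r, n, alternative="greater", confidence=0.95):
    """Hypothetical helper: one-sided CI for a Pearson r (sketch only)."""
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    # One-tailed critical value (about 1.645 for 95%)
    crit = NormalDist().inv_cdf(confidence)
    if alternative == "greater":
        # Lower bound only; the upper side is open, so it is set to 1
        return math.tanh(z - crit * se), 1.0
    if alternative == "less":
        # Upper bound only; the lower side is open, so it is set to -1
        return -1.0, math.tanh(z + crit * se)
    raise ValueError("alternative must be 'greater' or 'less'")


print(fisher_ci_one_sided(0.7468, n=11, alternative="greater"))
print(fisher_ci_one_sided(0.7468, n=11, alternative="less"))
```

Because the one-tailed critical value is smaller than the two-sided one, the finite bound lies closer to r than the corresponding two-sided bound.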