pingouin.compute_bootci
- pingouin.compute_bootci(x, y=None, func=None, method='cper', paired=False, confidence=0.95, n_boot=2000, decimals=2, seed=None, return_dist=False)
Bootstrapped confidence intervals of univariate and bivariate functions.
- Parameters:
- x : 1D-array or list
First sample. Required for both bivariate and univariate functions.
- y : 1D-array, list, or None
Second sample. Required only for bivariate functions.
- func : str or custom function
Function to compute the bootstrapped statistic. Accepted string values are:
'pearson': Pearson correlation (bivariate, paired x and y)
'spearman': Spearman correlation (bivariate, paired x and y)
'cohen': Cohen d effect size (bivariate, paired or unpaired x and y)
'hedges': Hedges g effect size (bivariate, paired or unpaired x and y)
'mean': Mean (univariate = only x)
'std': Standard deviation (univariate)
'var': Variance (univariate)
- method : str
Method to compute the confidence intervals (see Notes):
'cper': Bias-corrected percentile method (default)
'norm': Normal approximation with bootstrapped bias and standard error
'per': Simple percentile
- paired : boolean
Indicates whether x and y are paired. For example, x and y are assumed to be paired for correlation functions and for a paired t-test. Pingouin resamples the pairs (x_i, y_i) when paired=True, and resamples x and y separately when paired=False. If paired=True, x and y must have the same number of elements.
- confidence : float
Confidence level (0.95 = 95%).
- n_boot : int
Number of bootstrap iterations. More iterations give more stable intervals at the cost of longer computation.
- decimals : int
Number of decimals to which the confidence interval is rounded.
- seed : int or None
Random seed for generating bootstrap samples.
- return_dist : boolean
If True, return the confidence intervals and the bootstrapped distribution (e.g. for plotting purposes).
- Returns:
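- ci : array
Bootstrapped confidence intervals (lower and upper bound). If return_dist=True, the bootstrapped distribution is returned as a second output.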
Notes
Results have been tested against the bootci Matlab function.
Since version 1.7, SciPy also includes a built-in bootstrap function, scipy.stats.bootstrap(). The SciPy implementation has two advantages over Pingouin: it is faster when using vectorized=True, and it supports the bias-corrected and accelerated (BCa) confidence intervals for univariate functions. However, unlike Pingouin, it does not return the bootstrap distribution.
The percentile bootstrap method (per) is defined as the \(100 \times \frac{\alpha}{2}\) and \(100 \times (1 - \frac{\alpha}{2})\) percentiles of the distribution of \(\theta\) estimates obtained from resampling, where \(\alpha\) is the level of significance (1 - confidence, default = 0.05 for 95% CIs).
The bias-corrected percentile method (cper) corrects for bias of the bootstrap distribution. This method is different from the BCa method (the default in Matlab and SciPy), which corrects for both bias and skewness of the bootstrap distribution using jackknife resampling.
The normal approximation method (norm) calculates the confidence intervals with the standard normal distribution using bootstrapped bias and standard error.
References
DiCiccio, T. J., & Efron, B. (1996). Bootstrap confidence intervals. Statistical Science, 189-212.
Davison, A. C., & Hinkley, D. V. (1997). Bootstrap methods and their application (Vol. 1). Cambridge University Press.
Jung, Lee, Gupta, & Cho (2019). Comparison of bootstrap confidence interval methods for GSCA using a Monte Carlo simulation. Frontiers in Psychology, 10, 2215.
Examples
Bootstrapped 95% confidence interval of a Pearson correlation
>>> import pingouin as pg
>>> import numpy as np
>>> rng = np.random.default_rng(42)
>>> x = rng.normal(loc=4, scale=2, size=100)
>>> y = rng.normal(loc=3, scale=1, size=100)
>>> stat = np.corrcoef(x, y)[0][1]
>>> ci = pg.compute_bootci(x, y, func='pearson', paired=True, seed=42, decimals=4)
>>> print(round(stat, 4), ci)
0.0945 [-0.098 0.2738]
Let’s compare to SciPy’s built-in bootstrap function
>>> from scipy.stats import bootstrap
>>> bt_scipy = bootstrap(
...     data=(x, y), statistic=lambda x, y: np.corrcoef(x, y)[0][1],
...     method="basic", vectorized=False, n_resamples=2000, paired=True, random_state=42)
>>> np.round(bt_scipy.confidence_interval, 4)
array([-0.0952, 0.2883])
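To make the paired=True setting used above more concrete, here is a minimal NumPy sketch of the two resampling schemes described under the paired parameter. It is an illustration of the idea only, not Pingouin's internal code; the toy arrays and the seed are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, chosen for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 10 * x  # y_i is tied to x_i, so broken pairs are easy to spot

# Paired: draw ONE set of indices and apply it to both samples,
# so every resampled (x_i, y_i) is a pair from the original data.
idx = rng.integers(0, x.size, size=x.size)
x_paired, y_paired = x[idx], y[idx]

# Unpaired: draw indices independently for each sample. The pairing is
# lost, which is why x and y may then have different sizes.
x_unpaired = x[rng.integers(0, x.size, size=x.size)]
y_unpaired = y[rng.integers(0, y.size, size=y.size)]

print(np.array_equal(y_paired, 10 * x_paired))  # True: pairs stay intact
```

For a statistic such as a correlation, only the paired scheme is meaningful, because the statistic depends on which y value goes with which x value.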
Bootstrapped 95% confidence interval of a Cohen d
>>> stat = pg.compute_effsize(x, y, eftype='cohen')
>>> ci = pg.compute_bootci(x, y, func='cohen', seed=42, decimals=3)
>>> print(round(stat, 4), ci)
0.7009 [0.403 1.009]
Bootstrapped confidence interval of a standard deviation (univariate)
>>> import numpy as np
>>> stat = np.std(x, ddof=1)
>>> ci = pg.compute_bootci(x, func='std', seed=123)
>>> print(round(stat, 4), ci)
1.5534 [1.38 1.8 ]
Compare to SciPy’s built-in bootstrap function, which returns the bias-corrected and accelerated CIs (see Notes).
>>> def std(x, axis):
...     return np.std(x, ddof=1, axis=axis)
>>> bt_scipy = bootstrap(data=(x, ), statistic=std, n_resamples=2000, random_state=123)
>>> np.round(bt_scipy.confidence_interval, 2)
array([1.39, 1.81])
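For readers wondering what "accelerated" adds on top of a bias correction, the following is a rough sketch of a BCa interval built from the textbook formulas (bias term z0 from the bootstrap distribution, acceleration from jackknife estimates). The helper name bca_interval is hypothetical, and this is not claimed to match the SciPy or Matlab implementations in detail.

```python
from statistics import NormalDist

import numpy as np


def bca_interval(x, func, n_boot=2000, confidence=0.95, seed=None):
    """Hypothetical helper: BCa interval for a univariate statistic ``func``."""
    x = np.asarray(x, dtype=float)
    n = x.size
    rng = np.random.default_rng(seed)
    theta_hat = func(x)

    # Bootstrap distribution of the statistic (resampling with replacement)
    idx = rng.integers(0, n, size=(n_boot, n))
    bootstat = np.array([func(x[row]) for row in idx])

    # Bias term: where the observed statistic falls in the bootstrap distribution.
    # Assumes the proportion is strictly between 0 and 1.
    nd = NormalDist()
    z0 = nd.inv_cdf(np.mean(bootstat < theta_hat))

    # Acceleration from leave-one-out (jackknife) estimates; reflects skewness.
    # Assumes the jackknife estimates are not all identical.
    jack = np.array([func(np.delete(x, i)) for i in range(n)])
    diff = jack.mean() - jack
    accel = np.sum(diff**3) / (6 * np.sum(diff**2) ** 1.5)

    # Adjusted percentile levels; with accel = 0 they depend on z0 only
    alpha = 1 - confidence
    levels = []
    for p in (alpha / 2, 1 - alpha / 2):
        z = z0 + nd.inv_cdf(p)
        levels.append(100 * nd.cdf(z0 + z / (1 - accel * z)))
    return np.percentile(bootstat, levels)


sample = np.random.default_rng(42).normal(loc=4, scale=2, size=100)
ci_bca = bca_interval(sample, lambda a: np.std(a, ddof=1), seed=123)
print(np.round(ci_bca, 2))
```

Setting the acceleration to zero leaves only the bias term, which is the sense in which a bias-corrected percentile interval is a simpler relative of BCa.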
Changing the confidence intervals type in Pingouin
>>> pg.compute_bootci(x, func='std', seed=123, method="norm")
array([1.37, 1.76])
>>> pg.compute_bootci(x, func='std', seed=123, method="percentile")
array([1.35, 1.75])
Bootstrapped confidence interval using a custom univariate function
>>> from scipy.stats import skew
>>> round(skew(x), 4), pg.compute_bootci(x, func=skew, n_boot=10000, seed=123)
(-0.137, array([-0.55, 0.32]))
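Conceptually, a custom univariate function is simply applied to each resample of x. The sketch below shows that loop in plain NumPy, assuming the function takes a 1D array and returns a scalar. The helper bootstrap_distribution is hypothetical and only illustrates the idea; it is not part of Pingouin.

```python
import numpy as np


def bootstrap_distribution(x, func, n_boot=2000, seed=None):
    """Hypothetical helper: apply ``func`` to ``n_boot`` resamples of ``x``."""
    x = np.asarray(x)
    rng = np.random.default_rng(seed)
    # Each row of ``idx`` holds the indices of one resample (with replacement)
    idx = rng.integers(0, x.size, size=(n_boot, x.size))
    return np.array([func(x[row]) for row in idx])


data = np.random.default_rng(42).normal(loc=4, scale=2, size=100)
dist = bootstrap_distribution(data, np.median, n_boot=1000, seed=123)

# Simple percentile interval for a 95% confidence level
lower, upper = np.percentile(dist, [2.5, 97.5])
print(dist.shape)  # (1000,)
```

Fixing the seed makes the resampling indices, and therefore the interval, reproducible across runs.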
Bootstrapped confidence interval using a custom bivariate function. Here, x and y are not paired and can therefore have different sizes.
>>> def mean_diff(x, y):
...     return np.mean(x) - np.mean(y)
>>> y2 = rng.normal(loc=3, scale=1, size=200)  # y2 has 200 samples, x has 100
>>> ci = pg.compute_bootci(x, y2, func=mean_diff, n_boot=10000, seed=123)
>>> print(round(mean_diff(x, y2), 2), ci)
0.88 [0.54 1.21]
We can also get the bootstrapped distribution
>>> ci, bt = pg.compute_bootci(x, y2, func=mean_diff, n_boot=10000, return_dist=True, seed=9)
>>> print(f"The bootstrap distribution has {bt.size} samples. The mean and standard "
...       f"deviation are {bt.mean():.4f} ± {bt.std():.4f}")
The bootstrap distribution has 10000 samples. The mean and standard deviation are 0.8807 ± 0.1704
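Once a bootstrap distribution is available, the three interval types described in the Notes (per, cper, norm) can be computed by hand. The sketch below follows the textbook definitions and is not claimed to reproduce Pingouin's output exactly. The helper ci_from_distribution and the simulated data are assumptions made for illustration.

```python
from statistics import NormalDist

import numpy as np


def ci_from_distribution(bootstat, reference, confidence=0.95, method="cper"):
    """Hypothetical helper: CI from bootstrap estimates and the observed statistic."""
    bootstat = np.asarray(bootstat, dtype=float)
    alpha = 1 - confidence
    std_normal = NormalDist()  # standard normal from the standard library
    if method == "per":
        # Simple percentile: cut alpha / 2 from each tail
        q = [100 * alpha / 2, 100 * (1 - alpha / 2)]
        return np.percentile(bootstat, q)
    if method == "norm":
        # Normal approximation with bootstrapped bias and standard error
        z = std_normal.inv_cdf(1 - alpha / 2)
        bias = bootstat.mean() - reference
        se = bootstat.std(ddof=1)
        return np.array([reference - bias - z * se, reference - bias + z * se])
    if method == "cper":
        # z0 measures how far the observed statistic sits from the median of
        # the bootstrap distribution (assumes the proportion is not 0 or 1)
        z0 = std_normal.inv_cdf(np.mean(bootstat < reference))
        q = [100 * std_normal.cdf(2 * z0 + std_normal.inv_cdf(p))
             for p in (alpha / 2, 1 - alpha / 2)]
        return np.percentile(bootstat, q)
    raise ValueError(f"Unknown method: {method}")


# Simulated data and a bootstrap distribution of the mean
rng = np.random.default_rng(1)
sample = rng.normal(loc=0.9, scale=2.0, size=100)
idx = rng.integers(0, sample.size, size=(2000, sample.size))
bootstat = sample[idx].mean(axis=1)

for method in ("per", "cper", "norm"):
    ci = ci_from_distribution(bootstat, sample.mean(), method=method)
    print(method, np.round(ci, 2))
```

When the bootstrap distribution is roughly symmetric around the observed statistic, z0 is close to zero and the three intervals are nearly identical. They diverge as the distribution becomes biased or skewed.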