pingouin.gzscore
- pingouin.gzscore(x, *, axis=0, ddof=1, nan_policy='propagate')
Geometric standard (Z) score.
- Parameters:
- x : array_like
Array of raw values.
- axis : int or None, optional
Axis along which to operate. Default is 0. If None, compute over the whole array x.
- ddof : int, optional
Degrees of freedom correction in the calculation of the standard deviation. Default is 1.
- nan_policy : {‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle input that contains nan. ‘propagate’ returns nan, ‘raise’ throws an error, and ‘omit’ performs the calculations ignoring nan values. Default is ‘propagate’. Note that with ‘omit’, nans in the input still propagate to the output, but they do not affect the geometric z-scores computed for the non-nan values.
- Returns:
- gzscore : array_like
Array of geometric z-scores (same shape as x).
Notes
Geometric z-scores are better measures of dispersion than arithmetic z-scores when the sample data come from a log-normally distributed population [1].
Given the raw scores \(x\), the geometric mean \(\mu_g\) and the geometric standard deviation \(\sigma_g\), the standard score is given by the formula:
\[z = \frac{\log(x) - \log(\mu_g)}{\log(\sigma_g)}\]
References
Examples
Standardize a lognormal-distributed vector:
>>> import numpy as np
>>> from pingouin import gzscore
>>> np.random.seed(123)
>>> raw = np.random.lognormal(size=100)
>>> z = gzscore(raw)
>>> print(round(z.mean(), 3), round(z.std(), 3))
-0.0 0.995
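The formula in the Notes can also be sketched directly with NumPy, which may help clarify what `ddof` and `nan_policy='omit'` do. The sketch relies on two standard identities: the log of the geometric mean is the mean of `log(x)`, and the log of the geometric standard deviation is the standard deviation of `log(x)`. The function name `gzscore_sketch` and its `omit_nan` flag are hypothetical and are not part of pingouin. This is an illustration under those assumptions, not the library's implementation.

```python
import numpy as np


def gzscore_sketch(x, axis=0, ddof=1, omit_nan=False):
    """Hypothetical NumPy sketch of the geometric z-score formula.

    Not pingouin's code. `omit_nan=True` mimics the documented behavior
    of nan_policy='omit'; `omit_nan=False` mimics 'propagate'.
    """
    log_x = np.log(np.asarray(x, dtype=float))
    # nan-aware reducers skip nan values; the plain ones propagate them.
    mean = np.nanmean if omit_nan else np.mean
    std = np.nanstd if omit_nan else np.std
    # log(mu_g): mean of the log-transformed values.
    log_mu_g = mean(log_x, axis=axis, keepdims=True)
    # log(sigma_g): standard deviation of the log-transformed values.
    log_sigma_g = std(log_x, axis=axis, ddof=ddof, keepdims=True)
    return (log_x - log_mu_g) / log_sigma_g


# Values evenly spaced on a log scale give z-scores symmetric around zero.
raw = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
z = gzscore_sketch(raw)

# With a nan in the input, omitting it leaves a nan only at that position,
# while propagating it turns every output value into nan.
raw_nan = np.array([1.0, 2.0, np.nan, 8.0])
z_omit = gzscore_sketch(raw_nan, omit_nan=True)
z_prop = gzscore_sketch(raw_nan, omit_nan=False)
```

Because the whole computation happens on `log(x)`, the result is the ordinary z-score of the log-transformed data, which is why it suits log-normally distributed samples.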