pingouin.box_m
- pingouin.box_m(data, dvs, group, alpha=0.001)
Test the equality of covariance matrices using Box’s M test.
- Parameters:
  - data : pandas.DataFrame
    Long-format dataframe.
  - dvs : list
    Dependent variables.
  - group : str
    Grouping variable.
  - alpha : float
    Significance level. Default is 0.001, as recommended in [2]. A non-significant p-value (higher than alpha) indicates that the covariance matrices are homogeneous (i.e., equal).
- Returns:
  - stats : pandas.DataFrame
    - 'Chi2' : Test statistic
    - 'pval' : p-value
    - 'df' : The chi-square statistic’s degrees of freedom
    - 'equal_cov' : True if data has equal covariance matrices across groups
Notes
Warning
Box’s M test is susceptible to errors if the data does not meet the assumption of multivariate normality or if the sample size is too large or small [3].
Pingouin uses pandas.DataFrameGroupBy.cov() to calculate the variance-covariance matrix of each group. Missing values are automatically excluded from the calculation by Pandas.
Mathematical expressions can be found in [1].
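The group-wise covariance step can be reproduced directly in pandas. The snippet below is a small illustration with made-up data (the column names and values are hypothetical); it shows that a missing value is dropped only from the pairs of columns it affects, which is how pandas computes covariances.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: group "x" has one missing value in column B.
toy = pd.DataFrame({
    "A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "B": [2.0, np.nan, 1.0, 5.0, 3.0, 4.0],
    "group": ["x", "x", "x", "y", "y", "y"],
})

# One covariance matrix per group, stacked under a (group, variable) index.
covs = toy.groupby("group")[["A", "B"]].cov()
print(covs)

# Var(A) in group "x" uses all three rows; Cov(A, B) uses only the two
# rows where both A and B are present.
var_a_x = covs.loc[("x", "A"), "A"]
cov_ab_x = covs.loc[("x", "A"), "B"]
```

Because exclusion is pairwise, the entries of one group's matrix can be based on different numbers of observations when data are missing.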
This function has been tested against the boxM function of the biotools R package [4].
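For readers who want to see the computation spelled out, here is a minimal sketch of the chi-square approximation to Box's M as it is usually presented in multivariate textbooks such as [1]. It is an illustration only and is not claimed to be Pingouin's actual implementation: the function name box_m_sketch is hypothetical, the data are randomly generated, and the sketch assumes there are no missing values.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2


def box_m_sketch(data, dvs, group):
    """Chi-square approximation to Box's M (illustrative sketch only)."""
    p = len(dvs)
    covs, dofs = [], []
    for _, sub in data.groupby(group):
        covs.append(sub[dvs].cov().to_numpy())  # sample covariance S_i
        dofs.append(len(sub) - 1)               # v_i = n_i - 1
    dofs = np.asarray(dofs, dtype=float)
    k = len(covs)
    v = dofs.sum()                              # N - k

    # Pooled covariance: weighted average of the group covariances.
    pooled = sum(d * s for d, s in zip(dofs, covs)) / v

    # M = (N - k) ln|S_pooled| - sum_i v_i ln|S_i|
    log_dets = np.array([np.linalg.slogdet(s)[1] for s in covs])
    m = v * np.linalg.slogdet(pooled)[1] - np.sum(dofs * log_dets)

    # Scaling constant for the chi-square approximation.
    c1 = (np.sum(1.0 / dofs) - 1.0 / v) * (2 * p**2 + 3 * p - 1) / (
        6.0 * (p + 1) * (k - 1)
    )
    stat = (1.0 - c1) * m
    dof = p * (p + 1) * (k - 1) / 2.0
    return stat, dof, chi2.sf(stat, dof)


# Hypothetical data: 4 groups of 25 observations on 3 variables.
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(size=(100, 3)), columns=["A", "B", "C"])
demo["group"] = np.repeat([1, 2, 3, 4], 25)

stat, dof, pval = box_m_sketch(demo, dvs=["A", "B", "C"], group="group")
print(stat, dof, pval)
```

With p dependent variables and k groups, the degrees of freedom are p(p + 1)(k - 1) / 2, which gives 18 for 3 variables and 4 groups and 3 for 2 variables and 2 groups.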
References
[1] Rencher, A. C. (2003). Methods of multivariate analysis (Vol. 492). John Wiley & Sons.
[2] Hahs-Vaughn, D. (2016). Applied Multivariate Statistical Concepts. Taylor & Francis.
Examples
Box’s M test with 3 dependent variables and 4 groups (equal sample sizes)
>>> import pandas as pd
>>> import pingouin as pg
>>> from scipy.stats import multivariate_normal as mvn
>>> data = pd.DataFrame(mvn.rvs(size=(100, 3), random_state=42),
...                     columns=['A', 'B', 'C'])
>>> data['group'] = [1] * 25 + [2] * 25 + [3] * 25 + [4] * 25
>>> data.head()
          A         B         C  group
0  0.496714 -0.138264  0.647689      1
1  1.523030 -0.234153 -0.234137      1
2  1.579213  0.767435 -0.469474      1
3  0.542560 -0.463418 -0.465730      1
4  0.241962 -1.913280 -1.724918      1
>>> pg.box_m(data, dvs=['A', 'B', 'C'], group='group')
          Chi2    df      pval  equal_cov
box  11.634185  18.0  0.865537       True
Box’s M test with 2 dependent variables and 2 groups (unequal sample sizes)
>>> data = pd.DataFrame(mvn.rvs(size=(30, 2), random_state=42),
...                     columns=['A', 'B'])
>>> data['group'] = [1] * 20 + [2] * 10
>>> pg.box_m(data, dvs=['A', 'B'], group='group')
         Chi2   df      pval  equal_cov
box  0.706709  3.0  0.871625       True