pingouin.ancova

pingouin.ancova(data=None, dv=None, between=None, covar=None, effsize='np2')

ANCOVA with one or more covariate(s).

Parameters:
data : pandas.DataFrame

Input DataFrame. This function can also be called directly as a pandas DataFrame method, in which case this argument is not needed.

dv : string

Name of column in data with the dependent variable.

between : string

Name of column in data with the between factor.

covar : string or list

Name(s) of column(s) in data with the covariate(s).

effsize : str

Effect size. Must be ‘np2’ (partial eta-squared) or ‘n2’ (eta-squared).

Returns:
aov : pandas.DataFrame

ANCOVA summary:

  • 'Source': Name of the factor considered

  • 'SS': Sums of squares

  • 'DF': Degrees of freedom

  • 'F': F-values

  • 'p-unc': Uncorrected p-values

  • 'np2' or 'n2': Partial eta-squared or eta-squared, depending on effsize

See also

anova

One-way and N-way ANOVA

Notes

Analysis of covariance (ANCOVA) is a general linear model that blends ANOVA and regression. It evaluates whether the means of a dependent variable (dv) are equal across the levels of a categorical independent variable (between), often called a treatment, while statistically controlling for the effects of other continuous variables that are not of primary interest, known as covariates or nuisance variables (covar).

Pingouin uses statsmodels.regression.linear_model.OLS to compute the ANCOVA.

Important

Rows with missing values are automatically removed (listwise deletion).
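To make listwise deletion concrete, the following sketch drops every row that has a missing value in any analyzed column, using plain pandas. The toy DataFrame is hypothetical, and restricting the check to the dv, between, and covar columns is an assumption about which columns matter, not a statement of Pingouin's internals.

```python
import numpy as np
import pandas as pd

# Hypothetical data with gaps in the dependent variable and the covariate.
toy = pd.DataFrame({
    "Scores": [12.0, 39.0, np.nan, 30.0, 24.0],
    "Income": [17.5, 104.6, 64.7, np.nan, 47.0],
    "Method": ["A", "C", "B", "B", "A"],
    "Notes": [None, "x", "y", "z", None],  # column not used in the analysis
})

# Keep only the analyzed columns, then drop any row with a missing value.
used = ["Scores", "Method", "Income"]
complete = toy[used].dropna()

print(len(toy), "rows before,", len(complete), "rows after")
```

Running a check like this before the analysis shows how many observations listwise deletion would discard.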

Examples

1. Evaluate the reading scores of students across teaching methods, with family income as a covariate.

>>> from pingouin import ancova, read_dataset
>>> df = read_dataset('ancova')
>>> ancova(data=df, dv='Scores', covar='Income', between='Method')
     Source           SS  DF          F     p-unc       np2
0    Method   571.029883   3   3.336482  0.031940  0.244077
1    Income  1678.352687   1  29.419438  0.000006  0.486920
2  Residual  1768.522313  31        NaN       NaN       NaN

2. Evaluate the reading scores of students across teaching methods, with family income and BMI as covariates.

>>> ancova(data=df, dv='Scores', covar=['Income', 'BMI'], between='Method',
...        effsize="n2")
     Source           SS  DF          F     p-unc        n2
0    Method   552.284043   3   3.232550  0.036113  0.141802
1    Income  1573.952434   1  27.637304  0.000011  0.404121
2       BMI    60.013656   1   1.053790  0.312842  0.015409
3  Residual  1708.508657  30        NaN       NaN       NaN
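The two effect sizes can be reproduced by hand from the 'SS' column of the tables above. The sketch below uses only the standard library and the printed sums of squares. Treating the sum of all listed rows as the total sum of squares for eta-squared is an assumption, though it is consistent with the printed n2 values.

```python
# Sums of squares copied from the two output tables above.
ex1 = {"Method": 571.029883, "Income": 1678.352687, "Residual": 1768.522313}
ex2 = {"Method": 552.284043, "Income": 1573.952434, "BMI": 60.013656,
       "Residual": 1708.508657}


def partial_eta_squared(ss):
    """SS_effect / (SS_effect + SS_residual) for each non-residual row."""
    resid = ss["Residual"]
    return {k: v / (v + resid) for k, v in ss.items() if k != "Residual"}


def eta_squared(ss):
    """SS_effect / SS_total, with SS_total taken as the sum of all rows."""
    total = sum(ss.values())
    return {k: v / total for k, v in ss.items() if k != "Residual"}


np2 = partial_eta_squared(ex1)  # compare with the 'np2' column of example 1
n2 = eta_squared(ex2)           # compare with the 'n2' column of example 2

print({k: round(v, 6) for k, v in np2.items()})
print({k: round(v, 6) for k, v in n2.items()})
```

Partial eta-squared relates each effect only to the residual, so its values do not share a common denominator, whereas the eta-squared values are all fractions of the same total.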