Contingency Tables
Classes for statistics based on contingency tables for categorical data.
MulticlassContingencyStats runs summary stats and \(\chi^2\) test
stats for a contingency table with any number of IV & DV levels.
BooleanContingencyStats inherits from
MulticlassContingencyStats, and is a special case for a 2x2 contingency
table, also implementing Fisher’s & Boschloo’s exact tests.
- class unistat.contingency.MulticlassContingencyStats(table_rows: Series, table_cols: Series, row_title: str | None = None, row_names: list[str] | None = None, col_title: str | None = None, col_names: list[str] | None = None)
Bases: _ContingencyStats
Compute contingency table stats for 3+ IV and/or DV levels.
Takes 2 pandas Series representing the row and column variables, computes a contingency table, and provides methods for summary stats, chi-squared tests of independence, and post hoc testing.
- Parameters:
table_rows (pd.Series) – Series representing the row variable (typically predictor).
table_cols (pd.Series) – Series representing the column variable (typically outcome).
row_title (str, optional) – Title for the row index. Defaults to the name of table_rows.
row_names (list[str], optional) – Custom names for the row levels.
col_title (str, optional) – Title for the column index. Defaults to the name of table_cols.
col_names (list[str], optional) – Custom names for the column levels.
- idx_series#
The row variable series.
- Type:
pd.Series
- col_series#
The column variable series.
- Type:
pd.Series
- matrix
Crosstabulated frequency counts, with levels of idx_series and col_series as the index and columns, respectively. matrix does not include marginal row/column totals.
- Type:
pd.DataFrame
- exp_freq
Crosstabulated expected frequency counts. Format mirrors matrix.
- Type:
pd.DataFrame
- get_table(as_pct=False, axis='rows')
Crosstabulated frequency counts with marginal row/column totals. Format otherwise mirrors matrix.
See also
Notes
This class assumes categorical data in the input series. For better structure, consider passing intervention and outcome series explicitly in future versions.
- pairwise_post_hoc(alpha: float = 0.05, p_corr_method: PCorrectionMethod = 'holm')
Perform pairwise chi-square or Boschloo exact post hoc tests.
Appropriate if either rows or columns are binary.
By default, Holm-Bonferroni correction is used, but any correction method accepted by statsmodels.stats.multitest.multipletests() can be selected.
- Parameters:
- Returns:
Pairwise results condensed to a single DataFrame. Only returned when a pairwise test is performed.
- Return type:
pd.DataFrame
- Raises:
NotImplementedError – If both rows & columns have 3+ levels. Pairwise testing is technically possible in such cases, but currently only adjusted standardized residuals are supported for them.
Examples
>>> four_by_two.matrix
Outcome    dv0  dv1
Predictor
iv0        102   63
iv1         20    8
iv2          3    7
iv3          1   10
>>> four_by_two.pairwise_post_hoc()
               test  test_stat   p-value    p_corr
iv0          X^2(1)   2.572025  0.108768  0.272463
iv1          X^2(1)   2.095681  0.147716  0.272463
iv2  Boschloo exact   0.090821  0.095690  0.272463
iv3  Boschloo exact   0.000983  0.000722  0.003931
The binary axis is automatically detected, and tested against each level of the axis with 3+ levels:
>>> two_by_four.matrix
Predictor  iv0  iv1  iv2  iv3
Outcome
dv0        102   20    3    1
dv1         63    8    7   10
>>> two_by_four.pairwise_post_hoc()
               test  test_stat   p-value    p_corr
iv0          X^2(1)   2.572025  0.108768  0.272463
iv1          X^2(1)   2.095681  0.147716  0.272463
iv2  Boschloo exact   0.090821  0.095690  0.272463
iv3  Boschloo exact   0.000983  0.000722  0.003931
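The test column in the examples above can be reproduced with a small standalone sketch. It assumes that each level is compared against all remaining levels pooled into one row, and that an exact test is chosen whenever any expected frequency in that 2x2 table is below 5. Both points are inferred from the example output and the documented threshold, not confirmed details of unistat; the helper names are hypothetical.

```python
# Observed counts from the four_by_two example (per level: dv0, dv1).
observed = {"iv0": (102, 63), "iv1": (20, 8), "iv2": (3, 7), "iv3": (1, 10)}


def min_expected_level_vs_rest(level, table):
    """Smallest expected count in the 2x2 table of `level` vs. all other levels pooled."""
    a, b = table[level]
    c = sum(row[0] for key, row in table.items() if key != level)
    d = sum(row[1] for key, row in table.items() if key != level)
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    # Expected count of a cell = row total * column total / grand total.
    return min(r * col / n for r in row_totals for col in col_totals)


def choose_test(level, table, threshold=5.0):
    """Pick an exact test when any expected count falls below the threshold (assumed rule)."""
    if min_expected_level_vs_rest(level, table) < threshold:
        return "Boschloo exact"
    return "X^2(1)"


chosen = {level: choose_test(level, observed) for level in observed}
for level, test in chosen.items():
    print(level, test)
```

Under these assumptions, iv2 and iv3 fall below the threshold (their smallest expected counts are roughly 4.1 and 4.5), which agrees with the tests reported in the example output.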
- residuals_post_hoc(alpha: float = 0.05, p_corr_method: PCorrectionMethod = 'holm')
Perform post hoc tests using adjusted standardized residuals.
Preferred post hoc method if both rows & columns have 3+ levels (see Notes).
Uses adjusted standardized residuals (a/k/a adjusted Pearson residuals; cell-wise Z-scores), which are converted to p-values and corrected.
By default, Holm-Bonferroni correction is used, but any correction method accepted by statsmodels.stats.multitest.multipletests() can be selected.
- Parameters:
- Returns:
residuals (pd.DataFrame) – Adjusted standardized residuals in same format as self.matrix. Shows direction of effect in each cell.
p_values (pd.DataFrame) – Uncorrected p-values in same format as self.matrix.
p_corr (pd.DataFrame) – Corrected p-values in same format as self.matrix.
- Raises:
NotImplementedError – If both rows & columns have 3+ levels, which is technically possible, but currently, use of adjusted standardized residuals is mandatory for such cases.
See also
pairwise_post_hoc – Pairwise tests if either variable is binary.
Notes
We prefer post hoc testing via adjusted standardized residuals only in cases where both rows & columns have 3+ levels.
If either rows or columns are binary, p-values are identical for both levels of the binary factor (see Examples). For a \(k \times 2\) table (where \(k \geq 3\)), there are therefore only \(k\) distinct p-values, each appearing twice. This method detects these cases and corrects for only \(k\) p-values, to avoid inflating the type II error rate by naively correcting for \(2k\) p-values.
However, this approach is effectively equivalent to running \(k\) pairwise \(\chi^2\) tests without Yates correction. Use of the pairwise_post_hoc() method is preferred, since this will automatically use Boschloo exact tests when expected frequencies are <5.
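The equivalence noted above can be checked by hand. The sketch below uses the iv0 row of the four_by_two table from the Examples, pooled against the other three levels (the pooling is an assumption about how the pairwise comparison is formed). It shows that the squared adjusted standardized residual equals the uncorrected Pearson \(\chi^2\) statistic of the corresponding 2x2 table, so both give the same p-value. Only the standard library is used; none of this is unistat's own code.

```python
from math import erfc, sqrt

# iv0 versus all other levels pooled, from the four_by_two example.
a, b = 102, 63  # iv0: dv0, dv1
c, d = 24, 25   # iv1 + iv2 + iv3 pooled: dv0, dv1
n = a + b + c + d

# Pearson chi-squared for a 2x2 table, without Yates' correction.
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Adjusted standardized residual for the (iv0, dv0) cell.
expected = (a + b) * (a + c) / n
z = (a - expected) / sqrt(expected * (1 - (a + b) / n) * (1 - (a + c) / n))

# With 1 degree of freedom, the chi-squared p-value equals the
# two-sided standard normal p-value of the residual.
p_chi2 = erfc(sqrt(chi2 / 2))
p_resid = erfc(abs(z) / sqrt(2))

print(round(chi2, 6), round(z ** 2, 6))
print(round(p_chi2, 6), round(p_resid, 6))
```

The statistic and p-value agree with the iv0 row shown in the Examples (2.572025 and 0.108768), up to rounding.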
Examples
>>> four_by_four.matrix
Outcome    dv0  dv1  dv2  dv3
Predictor
iv0         70    0    0    0
iv1         53    4    3    0
iv2         28   17    4    2
iv3         14    7    3    9
Adjusted standardized residuals (a/k/a adjusted Pearson residuals, equivalent to Z-scores) indicate whether each cell was less (negative) or more (positive) frequent than expected by chance. p-Values indicate whether this difference is statistically significant.
>>> four_by_four.residuals_post_hoc().residuals
Outcome         dv0       dv1       dv2       dv3
Predictor
iv0        5.558156 -3.957284 -2.258185 -2.374231
iv1        2.440597 -1.737651  0.141516 -2.125546
iv2       -4.323559  4.913430  1.229105 -0.451581
iv3       -5.155365  1.505522  1.307525  6.260734
>>> four_by_four.residuals_post_hoc().p_values
Outcome             dv0           dv1       dv2           dv3
Predictor
iv0        2.726397e-08  7.580683e-05  0.023934  1.758554e-02
iv1        1.466299e-02  8.227228e-02  0.887462  3.354109e-02
iv2        1.535320e-05  8.949665e-07  0.219032  6.515711e-01
iv3        2.531376e-07  1.321900e-01  0.191034  3.831699e-10
>>> four_by_four.residuals_post_hoc().p_corr
Outcome             dv0       dv1       dv2           dv3
Predictor
iv0        4.089596e-07  0.000834  0.191473  1.582699e-01
iv1        1.466299e-01  0.493634  1.000000  2.347877e-01
iv2        1.842384e-04  0.000012  0.764137  1.000000e+00
iv3        3.543927e-06  0.660950  0.764137  6.130718e-09
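For readers who want to see where these numbers come from, the following standalone sketch recomputes the adjusted standardized residuals and uncorrected two-sided p-values for the four_by_four table using the textbook formula \((O - E) / \sqrt{E (1 - r/N)(1 - c/N)}\), where \(r\) and \(c\) are the row and column totals. It uses only the standard library and is an illustration of the technique, not unistat's implementation.

```python
from math import erfc, sqrt

# Observed counts from the four_by_four example (rows iv0..iv3, columns dv0..dv3).
observed = [
    [70, 0, 0, 0],
    [53, 4, 3, 0],
    [28, 17, 4, 2],
    [14, 7, 3, 9],
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)


def adjusted_residual(i, j):
    """Adjusted standardized (Pearson) residual for cell (i, j)."""
    expected = row_totals[i] * col_totals[j] / n
    variance = expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
    return (observed[i][j] - expected) / sqrt(variance)


residuals = [[adjusted_residual(i, j) for j in range(4)] for i in range(4)]

# Two-sided p-value of a standard normal Z-score: erfc(|z| / sqrt(2)).
p_values = [[erfc(abs(z) / sqrt(2)) for z in row] for row in residuals]

for row in residuals:
    print(" ".join(f"{z:10.6f}" for z in row))
```

The printed residuals agree with the residuals table above to the displayed precision; the correction step applied to obtain p_corr is not included here.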
When one axis is binary, compare the results of pairwise_post_hoc() to those of residuals_post_hoc():
>>> four_by_two.matrix
Outcome    dv0  dv1
Predictor
iv0        102   63
iv1         20    8
iv2          3    7
iv3          1   10
pairwise_post_hoc() uses either \(\chi^2\) or Boschloo exact tests, depending on expected frequencies for each comparison.
>>> four_by_two.pairwise_post_hoc()
               test  test_stat   p-value    p_corr
iv0          X^2(1)   2.572025  0.108768  0.272463
iv1          X^2(1)   2.095681  0.147716  0.272463
iv2  Boschloo exact   0.090821  0.095690  0.272463
iv3  Boschloo exact   0.000983  0.000722  0.003931
residuals_post_hoc() gives similar results to pairwise \(\chi^2\) tests (cf. uncorrected p-values), but cannot use Boschloo (or Fisher) exact tests when expected frequencies are low (iv2 & iv3):
>>> four_by_two.residuals_post_hoc().p_values
Outcome         dv0       dv1
Predictor
iv0        0.108768  0.108768
iv1        0.147716  0.147716
iv2        0.057318  0.057318
iv3        0.000570  0.000570
>>> four_by_two.residuals_post_hoc().p_corr
Outcome         dv0       dv1
Predictor
iv0        0.217537  0.217537
iv1        0.217537  0.217537
iv2        0.171955  0.171955
iv3        0.002279  0.002279
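The effect of correcting for \(k\) rather than \(2k\) p-values, as described in the Notes, can be illustrated with a standalone sketch of the Holm step-down procedure. The `holm` helper below is hypothetical and written in plain Python for illustration; it is not unistat's code path. The inputs are the rounded uncorrected p-values printed above.

```python
def holm(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted p-value never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# One uncorrected p-value per level (iv0..iv3) of the four_by_two example.
unique_p = [0.108768, 0.147716, 0.057318, 0.000570]

corrected_k = holm(unique_p)       # correct for k = 4 distinct p-values
corrected_2k = holm(unique_p * 2)  # naive correction over all 2k = 8 cells

print([round(p, 6) for p in corrected_k])
print([round(p, 6) for p in corrected_2k[:4]])
```

Correcting the four distinct p-values reproduces the p_corr table above up to rounding of the inputs, while the naive correction over all eight cells yields larger adjusted p-values for every level.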
- post_hoc(alpha: float = 0.05, p_corr_method: PCorrectionMethod = 'holm')
Perform appropriate post hoc test, based on IV/DV levels.
Convenience method to choose type of post hoc test.
If either rows or columns are binary, results of pairwise \(\chi^2\) or Boschloo exact tests (as appropriate for expected frequencies) are returned with uncorrected & corrected p-values.
If rows & columns both have 3+ levels, adjusted standardized residuals (a/k/a adjusted Pearson residuals; cell-wise Z-scores) are converted to p-values and corrected.
By default, Holm-Bonferroni correction is used, but any correction method accepted by statsmodels.stats.multitest.multipletests() can be selected.
- Parameters:
- Returns:
pd.DataFrame – Pairwise results condensed to a single DataFrame. Only returned when a pairwise test is performed.
residuals (pd.DataFrame) – Adjusted standardized residuals in same format as self.matrix. Shows direction of effect in each cell.
p_values (pd.DataFrame) – Uncorrected p-values in same format as self.matrix.
p_corr (pd.DataFrame) – Corrected p-values in same format as self.matrix.
See also
pairwise_post_hoc – When either rows or columns are binary.
residuals_post_hoc – Preferred when rows & columns have 3+ levels.
- class unistat.contingency.BooleanContingencyStats(table_rows: Series, table_cols: Series, row_title: str | None = None, row_names: list[str] | None = None, col_title: str | None = None, col_names: list[str] | None = None)
Bases: _ContingencyStats
Perform contingency statistics on boolean (2x2) tables.
Extends MulticlassContingencyStats with methods specific to 2x2 tables, such as odds ratio and Fisher’s exact test.
- Parameters:
table_rows (pd.Series) – Series representing the row variable (typically predictor), with dtype collapsible to Boolean (True/False, 1/0, 1.0/0.0).
table_cols (pd.Series) – Series representing the column variable (typically outcome), with dtype collapsible to Boolean (True/False, 1/0, 1.0/0.0).
row_title (str, optional) – Title for the row index.
row_names (list[str], optional) – Custom names for the row levels.
col_title (str, optional) – Title for the column index.
col_names (list[str], optional) – Custom names for the column levels.
- Return type:
BooleanContingencyStats object
- Warns:
ExpectedFrequencyWarning – If any cell-wise expected frequency is < 5.
See also
Notes
Assumes the contingency table is 2x2.
p-values are displayed for both the chi-square test of independence (ToI) and an exact test like Fisher’s. Deciding which test to report can follow Cochran’s rule-of-thumb criteria [1] [2], which include (but are not limited to) the following as indications for use of an exact test (Fisher’s exact test in the original 1952 article) over chi-squared:
- Any cell-wise expected frequency < 5.
  The actual rule is that fewer than 20% of cells may have an expected frequency < 5. In a 2x2 table each cell is 25% of the table, so no cell can have a low expected frequency.
- N < 20.
Cochran (1952) [1] recommends using Yates’ correction if N > 40 but any expected frequency < 500; unistat does not implement this by default.
By default, unistat never implements Yates’ correction factor. Hasselblad & Lokhnygina (2007) [3] found that in all cases, Yates-corrected chi-squared is inferior to Fisher’s exact test. Furthermore, they found that even Fisher’s exact test is too conservative, and that, depending on sample size, Fisher’s mid-p test or Barnard’s exact test offer better power while maintaining target Type I error control.
Alternative exact test(s) will be implemented in later releases; expect that at a minimum, this will include Boschloo’s exact test.
Lydersen et al. (2009) [4] compared multiple different exact tests, and noted the following:
- Standard Fisher’s exact test is near-uniformly too conservative, though it always maintains the Type I error rate.
- Fisher’s mid-p generally improves power, but occasionally violates the Type I error rate.
- Barnard’s exact test is an excellent performer, but is computationally intensive to a prohibitive degree (exponential time complexity).
- Boschloo’s exact test (aka Fisher-Boschloo test) is an extension of Fisher’s exact test, and was considered the gold standard by Lydersen et al.; it is universally more powerful than traditional Fisher’s exact & mid-p, and in trials did not violate the target Type I error rate.
- odds_ratio(kind: str = 'sample')
Compute the odds ratio.
- Parameters:
kind (str, optional) – Type of odds ratio: ‘sample’ (default), ‘conditional’, or ‘unconditional’.
- Returns:
Result object with statistic and confidence interval.
- Return type:
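As an illustration of what a sample odds ratio is, the sketch below computes \((a \cdot d) / (b \cdot c)\) for a 2x2 table together with a Wald confidence interval on the log scale. The interval method and the `sample_odds_ratio` helper are assumptions made for this example; the source does not state how unistat constructs its confidence interval. The counts are hypothetical (the iv0 row of the four_by_two example against the remaining levels pooled).

```python
from math import exp, log, sqrt


def sample_odds_ratio(a, b, c, d, z_crit=1.959964):
    """Sample odds ratio (a*d)/(b*c) with a Wald interval (95% by default).

    Assumes all four cells are non-zero; a zero cell would need a
    continuity adjustment or an exact method instead.
    """
    estimate = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = exp(log(estimate) - z_crit * se_log)
    high = exp(log(estimate) + z_crit * se_log)
    return estimate, (low, high)


# Rows: group 1, group 2; columns: outcome absent, outcome present.
estimate, (ci_low, ci_high) = sample_odds_ratio(102, 63, 24, 25)
print(round(estimate, 4), round(ci_low, 4), round(ci_high, 4))
```

For these counts the interval spans 1, which is consistent with the non-significant \(\chi^2\) result for the same 2x2 table.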
- fisher_exact(alternative: Literal['two-sided', 'less', 'greater'] = 'two-sided')
Perform Fisher’s exact test.
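To make the mechanics of the test concrete, here is a minimal standard-library sketch of a two-sided Fisher exact p-value. With all margins fixed, it sums the hypergeometric probabilities of every table that is no more likely than the observed one. The `fisher_exact_two_sided` helper and the example counts are hypothetical, and this is not how unistat itself computes the test.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def weight(x):
        # Number of ways to obtain a table whose top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x)

    # Range of top-left values compatible with the fixed margins.
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)

    observed_weight = weight(a)
    # Integer weights keep the "no more likely than observed" comparison exact.
    extreme = sum(
        weight(x)
        for x in range(lowest, highest + 1)
        if weight(x) <= observed_weight
    )
    return extreme / comb(n, col1)


p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 6))
```

For this small table the five possible top-left values have weights 1, 16, 36, 16 and 1 out of 70, so every table except the central one counts as at least as extreme as the observed one.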
- boschloo_exact(alternative: Literal['two-sided', 'less', 'greater'] = 'two-sided', n_sampling_points: int = 32)
Perform Boschloo’s exact test.
- Parameters:
alternative (str, default: 'two-sided') – Alternative hypothesis: ‘two-sided’, ‘less’, or ‘greater’
n_sampling_points (int, default 32) – Number of sampling points used in the construction of the sampling method. See scipy.stats.boschloo_exact() documentation for further detail.
- Returns:
statistic (float) – Test statistic for Boschloo’s exact test, which is the lesser of the p-values given by 2 one-sided Fisher exact tests.
pvalue (float) – Boschloo’s exact p-value.
Notes
Lydersen et al. (2009) [4] compared multiple different exact tests, and found Boschloo’s exact test to be universally more powerful than both traditional and mid-p Fisher’s exact tests. Boschloo’s exact test can be further improved using the Berger-Boos correction, particularly for unbalanced designs (e.g. if survival occurs much more often than mortality) [4] [5].