discopt.doe.selection#
Post-experiment model selection helpers.
Derived from discopt.estimate.EstimationResult, whose
.objective equals the deviance −2 · log L(θ̂) up to a
constant under Gaussian noise. This makes every metric in this module
a one-line derivation:
- log L̂ = −0.5 · objective
- AIC = 2·p + objective
- BIC = p · log(n) + objective
- LRT G² = objective_nested − objective_full on df = p_full − p_nested
- Vuong z requires the per-observation log-likelihoods; these are rebuilt from the residuals via each candidate model’s responses.
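The identities above can be sketched in a few lines of plain Python. This is a minimal illustration, not the module's implementation: the helper name `information_criteria` and the numeric inputs are hypothetical, the AICc term is the standard Hurvich & Tsai small-sample correction, and returning infinity when n ≤ p + 1 is an assumption about how that edge case might be handled.

```python
import math


def information_criteria(objective: float, p: int, n: int) -> dict[str, float]:
    """Score one fit from its deviance-style objective (-2 log L up to a constant).

    Hypothetical helper: p is the number of fitted parameters,
    n the number of observations.
    """
    aic = 2 * p + objective
    bic = p * math.log(n) + objective
    # Standard AICc correction; it is undefined for n <= p + 1, so this
    # sketch returns infinity there (an assumption, not library behavior).
    aicc = aic + 2 * p * (p + 1) / (n - p - 1) if n > p + 1 else math.inf
    return {"log_lik": -0.5 * objective, "aic": aic, "aicc": aicc, "bic": bic}


# Illustrative numbers only.
scores = information_criteria(objective=12.0, p=3, n=20)
```

Because all three criteria add a penalty to the same objective, they differ only in how strongly extra parameters are penalized.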
Usage#
>>> from discopt.doe import model_selection, likelihood_ratio_test
>>> res = model_selection({"arrh": est_a, "eyr": est_e}, method="aic")
>>> res.best_model, res.weights
References#
Akaike (1973); Schwarz (1978); Hurvich & Tsai (1989) AICc; Wilks (1938) LRT; Vuong (1989) non-nested LRT.
Classes#
ModelSelectionResult | Bundle of scores / weights / p-values from a selection test.
Functions#
model_selection | Rank candidate fits by AIC, AICc, or BIC.
likelihood_ratio_test | Likelihood-ratio test on a nested pair (Wilks 1938).
vuong_test | Vuong (1989) likelihood-ratio test for non-nested models.
Module Contents#
- class discopt.doe.selection.ModelSelectionResult#
Bundle of scores / weights / p-values from a selection test.
Attributes#
- method : {“aic”, “aicc”, “bic”, “lrt”, “vuong”}
Selection method that produced this result.
- scores : dict[str, float]
Per-model score (lower is better for AIC/BIC).
- weights : dict[str, float] or None
Softmax weights of -0.5 * Δscore (None for LRT / Vuong).
- best_model : str
Name of the top-ranked model.
- p_value : float or None
Null-hypothesis p-value for LRT / Vuong tests.
- nested_pair : (str, str) or None
(nested, full) names for LRT only.
- z_statistic : float or None
Vuong test statistic only.
- warnings : list[str]
Diagnostic messages accumulated during selection.
- method: SelectionMethod#
- scores: dict[str, float]#
- weights: dict[str, float] | None#
- best_model: str#
- p_value: float | None = None#
- nested_pair: tuple[str, str] | None = None#
- z_statistic: float | None = None#
- warnings: list[str] = None#
- discopt.doe.selection.model_selection(estimation_results: dict[str, discopt.estimate.EstimationResult], *, method: Literal['aic', 'aicc', 'bic'] = 'aic') ModelSelectionResult#
Rank candidate fits by AIC, AICc, or BIC.
All candidates must have been fit to the same data (same n_observations). The deviance convention of discopt.estimate.estimate_parameters() makes these one-liners.
Parameters#
- estimation_results : dict[str, EstimationResult]
Per-model fitted results keyed by user-chosen names.
- method : {“aic”, “aicc”, “bic”}, default “aic”
Information criterion used to score models.
Returns#
- ModelSelectionResult
scores, softmax weights, and the best_model.
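The softmax weighting of −0.5 · Δscore can be illustrated as follows. This is a sketch under stated assumptions: `akaike_weights` is a hypothetical helper, Δscore is taken as each score minus the lowest score, and the two score values are made up (only the model names echo the usage example).

```python
import math


def akaike_weights(scores: dict[str, float]) -> dict[str, float]:
    """Softmax of -0.5 * (score - min score); hypothetical helper."""
    best = min(scores.values())
    # Subtracting the best score first keeps the exponentials well scaled.
    raw = {name: math.exp(-0.5 * (s - best)) for name, s in scores.items()}
    total = sum(raw.values())
    return {name: r / total for name, r in raw.items()}


# Illustrative scores only; lower is better.
scores = {"arrh": 18.0, "eyr": 20.0}
weights = akaike_weights(scores)
best_model = min(scores, key=scores.get)
```

The weights sum to one and can be read as the relative support for each candidate among the set that was compared.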
- discopt.doe.selection.likelihood_ratio_test(nested: discopt.estimate.EstimationResult, full: discopt.estimate.EstimationResult, *, nested_name: str = 'nested', full_name: str = 'full', alpha: float = 0.05) ModelSelectionResult#
Likelihood-ratio test on a nested pair (Wilks 1938).
Returns the G² = objective_nested − objective_full statistic and its \(\chi^2_{df}\) p-value with df = p_full − p_nested. The full model is declared “best” when the null is rejected at level alpha; otherwise the parsimony principle keeps the nested model as “best”.
Parameters#
- nested, full : EstimationResult
Fitted results. nested.parameter_names must be a subset of full.parameter_names (nested relationship) and both must share n_observations.
- nested_name, full_name : str
Labels to use in the result’s scores and best_model.
- alpha : float, default 0.05
Significance level at which the null (nested model) is rejected.
Returns#
- ModelSelectionResult
scores are the two deviances; p_value is the χ² tail probability; nested_pair = (nested_name, full_name).
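The decision rule described above can be sketched with the standard library alone. Everything named here is hypothetical: `chi2_sf` is a closed-form χ² upper tail for integer degrees of freedom written for this example, `lrt_decision` mirrors the documented rule rather than the library's code, and the input numbers are invented.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = 0.5 * x
    if df % 2 == 0:
        # Even df = 2k: exp(-y) * sum_{i=0}^{k-1} y**i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= y / i
            total += term
        return math.exp(-y) * total
    # Odd df = 2k + 1: erfc(sqrt(y)) plus k correction terms.
    total = math.erfc(math.sqrt(y))
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-y) * y ** (i - 0.5) / math.gamma(i + 0.5)
    return total


def lrt_decision(obj_nested, obj_full, p_nested, p_full, alpha=0.05):
    """Return (G2, df, p_value, winner) for a nested pair of deviances."""
    df = p_full - p_nested
    if df < 1:
        raise ValueError("full model must have more parameters than nested")
    g2 = obj_nested - obj_full
    p_value = chi2_sf(g2, df)
    # Reject the null (nested is adequate) only when p_value < alpha.
    winner = "full" if p_value < alpha else "nested"
    return g2, df, p_value, winner


# Illustrative numbers only.
g2, df, p_value, winner = lrt_decision(25.0, 18.0, p_nested=2, p_full=3)
```

With these inputs G² = 7 on one degree of freedom, so the null is rejected at the 0.05 level and the full model is preferred.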
- discopt.doe.selection.vuong_test(res_a: discopt.estimate.EstimationResult, res_b: discopt.estimate.EstimationResult, data: dict, experiments: dict[str, discopt.estimate.Experiment], *, name_a: str | None = None, name_b: str | None = None, alpha: float = 0.05) ModelSelectionResult#
Vuong (1989) likelihood-ratio test for non-nested models.
Computes per-observation log-likelihoods \(\ell^A_n, \ell^B_n\) under Gaussian noise, forms the mean difference \(\bar m\), and reports the z-statistic \(z = \sqrt{N}\,\bar m / s_m\).
\(|z| < z_{1-\alpha/2}\) is read as “statistically indistinguishable”; otherwise the sign of \(\bar m\) picks the winner.
Parameters#
- res_a, res_b : EstimationResult
Fits to be compared.
- data : dict
Observed data used for both fits (same keys and values that were passed to estimate_parameters()).
- experiments : dict[str, Experiment]
Mapping containing at least the candidate experiment for each result. Keys must include name_a and name_b (if these are None, the two keys present in experiments are used in the order given).
- name_a, name_b : str, optional
Labels for each model. Default to the first two keys of experiments in iteration order.
- alpha : float, default 0.05
Two-sided significance level for the “indistinguishable” region.
Returns#
- ModelSelectionResult
scores gives each model’s summed log-likelihood; p_value is the two-sided p; z_statistic is the Vuong z. best_model = "indistinguishable" inside the acceptance region.
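Given per-observation log-likelihoods for two models, the statistic and decision rule can be sketched as below. The function `vuong_z` is hypothetical and takes the log-likelihoods as plain lists rather than rebuilding them from residuals. Two details are assumptions of this sketch: the difference is taken as A minus B (so a positive mean favors A), and s_m uses the population (1/N) standard deviation.

```python
import math
from statistics import NormalDist, fmean, pstdev


def vuong_z(ll_a, ll_b, alpha=0.05):
    """Return (z, two-sided p, winner) from per-observation log-likelihoods."""
    if len(ll_a) != len(ll_b):
        raise ValueError("both models must be scored on the same observations")
    m = [a - b for a, b in zip(ll_a, ll_b)]  # assumed orientation: A minus B
    n = len(m)
    m_bar = fmean(m)
    s_m = pstdev(m)  # assumption: 1/N normalisation
    if s_m == 0.0:
        raise ValueError("differences have zero variance; z is undefined")
    z = math.sqrt(n) * m_bar / s_m
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    if abs(z) < z_crit:
        winner = "indistinguishable"
    else:
        winner = "A" if m_bar > 0 else "B"
    return z, p_value, winner


# Illustrative log-likelihoods only.
ll_a = [-1.0, -1.2, -0.9, -1.1]
ll_b = [-1.5, -1.4, -1.6, -1.3]
z, p_value, winner = vuong_z(ll_a, ll_b)
```

Swapping the two inputs flips the sign of z but leaves the p-value unchanged, which is why the winner is read from the sign of the mean difference.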