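fennomix_mhc.tda_fmm module
Classes:
| DecoyModel(gaussian_outlier_sigma, *args, **kwargs) | A simplified model for fitting decoy score distributions. |
| TDA_fmm(n_components[, external_model]) | Finite Mixture Model (FMM) for Target-Decoy Analysis (TDA). |
Functions:
| gamma_pdf(X, u, sigma) | Calculate Gamma probability density function values using mean and std. |
| gauss_pdf(X, u, sigma) | Calculate Gaussian probability density function values for input array X. |
| select_best_fmm(target_scores, decoy_fmm[, ...]) | Selects the best TDA_fmm model by BIC criterion. |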
- class fennomix_mhc.tda_fmm.DecoyModel(gaussian_outlier_sigma, *args, **kwargs)
Bases: TDA_fmm
A simplified model for fitting decoy score distributions.
Uses a single Gaussian and optionally filters outliers using a sigma threshold.
Methods:
| __init__(gaussian_outlier_sigma, *args, **kwargs) | Initializes the decoy model. |
| fit(X) | Fits a single Gaussian to the decoy scores. |
| pdf(X) | Computes the Gaussian PDF for the given scores. |
- __init__(gaussian_outlier_sigma, *args, **kwargs)
Initializes the decoy model.
- Parameters:
gaussian_outlier_sigma (float|None) – If provided, scores below (mu - sigma * gaussian_outlier_sigma) are filtered out before fitting.
*args (Any) – Ignored; accepted for compatibility.
**kwargs – Ignored; accepted for compatibility.
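The outlier rule described for gaussian_outlier_sigma can be illustrated with a minimal standard-library sketch. It is not the library's implementation: the function name fit_decoy_gaussian, the use of the population standard deviation, and the single filtering pass are all assumptions made for illustration.

```python
import statistics


def fit_decoy_gaussian(scores, gaussian_outlier_sigma=None):
    """Fit one Gaussian (mean, std) to scores, optionally dropping low outliers.

    Hypothetical helper: estimates mu and sigma once, removes scores below
    mu - sigma * gaussian_outlier_sigma, then re-estimates on what is left.
    """
    mu = statistics.fmean(scores)
    sigma = statistics.pstdev(scores)  # population std is an assumption
    if gaussian_outlier_sigma is not None:
        cutoff = mu - sigma * gaussian_outlier_sigma
        kept = [s for s in scores if s >= cutoff]
        mu = statistics.fmean(kept)
        sigma = statistics.pstdev(kept)
    return mu, sigma


# Nine scores near 1.0 plus one extreme low score.
scores = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05, 1.0, 0.98, 1.02, -5.0]
mu_all, sigma_all = fit_decoy_gaussian(scores)
mu_filtered, sigma_filtered = fit_decoy_gaussian(scores, gaussian_outlier_sigma=2.0)
```

In this toy input the low score pulls the unfiltered mean well below the bulk of the data and inflates the spread; with the threshold applied, the fit reflects only the cluster near 1.0.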
- class fennomix_mhc.tda_fmm.TDA_fmm(n_components, external_model=None)
Bases: object
Finite Mixture Model (FMM) for Target-Decoy Analysis (TDA).
This class estimates score distributions using a mixture of Gaussians. It supports modeling both target and decoy distributions, where the decoy model can be incorporated as an external component in the target model.
- n_components
Number of Gaussian components in the mixture.
- external_model
Optional fitted decoy model (for target modeling).
- max_iter
Maximum number of EM iterations.
- main_pdf
PDF function used for the first component.
- helper_pdf
PDF function used for other components.
- weights
Learned mixture weights (pi_k).
- mu
Learned means for each component.
- sigma
Learned standard deviations for each component.
Methods:
| __init__(n_components[, external_model]) | Initializes the TDA_fmm model. |
| fit(X) | Fits the FMM model using the Expectation-Maximization (EM) algorithm. |
| get_pi0() | Returns the estimated proportion of decoy (null) components in the mixture. |
| loglik_BIC(X) | Computes log-likelihood and Bayesian Information Criterion (BIC). |
| pdf(X) | Computes the PDF of the main mixture components (excluding external model). |
| pdf_mix(X[, external_pdf]) | Computes the full mixture PDF, including external model if present. |
| pep(X[, external_pdf]) | Estimates Posterior Error Probabilities (PEP). |
| plot(title, plot_scores[, false_scores]) | Plots the fitted mixture model against a histogram of scores. |
- __init__(n_components, external_model=None)
Initializes the TDA_fmm model.
- Parameters:
n_components (int) – Number of Gaussian components in the mixture.
external_model (Optional[TDA_fmm]) – Pre-fitted decoy model. If None, this instance models the decoy distribution; if provided, it models the target distribution with the decoy model as one component.
- fit(X)
Fits the FMM model using the Expectation-Maximization (EM) algorithm.
- Parameters:
X (ndarray|list[float]) – Input scores to fit the model on.
- Return type:
None
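For readers unfamiliar with EM, the following is a generic sketch of fitting a one-dimensional Gaussian mixture, written with the standard library only. It is not the code behind fit(): the function name em_gaussian_mixture, the quantile-based initialisation, the convergence tolerance, and the absence of an external decoy component are all assumptions of this illustration.

```python
import math
import random


def gauss_pdf(x, mu, sigma):
    """Gaussian density at a single point x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def em_gaussian_mixture(xs, n_components, max_iter=200, tol=1e-8):
    """Plain EM for a 1-D Gaussian mixture; returns (weights, mu, sigma)."""
    n = len(xs)
    ordered = sorted(xs)
    # Assumed initialisation: means at evenly spaced quantiles, shared spread.
    mu = [ordered[int((k + 0.5) * n / n_components)] for k in range(n_components)]
    mean = sum(xs) / n
    spread = math.sqrt(sum((x - mean) ** 2 for x in xs) / n) or 1.0
    sigma = [spread] * n_components
    weights = [1.0 / n_components] * n_components
    prev_loglik = -math.inf
    for _ in range(max_iter):
        # E-step: responsibility of each component for each score.
        resp = []
        loglik = 0.0
        for x in xs:
            dens = [w * gauss_pdf(x, m, s) for w, m, s in zip(weights, mu, sigma)]
            total = sum(dens) or 1e-300
            loglik += math.log(total)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and standard deviations.
        for k in range(n_components):
            nk = sum(r[k] for r in resp) or 1e-300
            weights[k] = nk / n
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sigma[k] = max(math.sqrt(var), 1e-6)
        # Stop once the log-likelihood no longer changes meaningfully.
        if abs(loglik - prev_loglik) < tol:
            break
        prev_loglik = loglik
    return weights, mu, sigma


# Synthetic scores: two well-separated groups of equal size.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(300)]
data += [rng.gauss(6.0, 1.0) for _ in range(300)]
weights, mu, sigma = em_gaussian_mixture(data, n_components=2)
```

On this synthetic input the two recovered means land near the generating values and the weights near one half each. Real score distributions overlap far more, which is why the number of components is chosen by BIC rather than fixed in advance.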
- get_pi0()
Returns the estimated proportion of decoy (null) components in the mixture.
- Return type:
float
- Returns:
pi0 value (between 0 and 1). Returns 0 if the model is not fitted or has no external model.
- loglik_BIC(X)
Computes log-likelihood and Bayesian Information Criterion (BIC).
BIC = -2 * loglik + num_params * log(n)
- Parameters:
X (ndarray|list[float]) – Input scores.
- Return type:
tuple[float,float]
- Returns:
A tuple of (log-likelihood, BIC). Returns (0, 0) if the model is not fitted.
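The BIC formula above can be worked through on a tiny example. This sketch uses only the standard library; the helper names mixture_loglik and bic are hypothetical, and how the library counts num_params is not stated in this page, so the count is passed in explicitly (the value 2 below, one mean plus one standard deviation for a single Gaussian, is an assumption).

```python
import math


def gauss_pdf(x, mu, sigma):
    """Gaussian density at a single point x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_loglik(xs, weights, mu, sigma):
    """Sum of log mixture densities over all scores."""
    total = 0.0
    for x in xs:
        density = sum(w * gauss_pdf(x, m, s) for w, m, s in zip(weights, mu, sigma))
        total += math.log(density)
    return total


def bic(loglik, num_params, n):
    """BIC = -2 * loglik + num_params * log(n), as in the formula above."""
    return -2.0 * loglik + num_params * math.log(n)


# Three scores evaluated under a single standard normal component.
xs = [-1.0, 0.0, 1.0]
loglik = mixture_loglik(xs, weights=[1.0], mu=[0.0], sigma=[1.0])
bic_value = bic(loglik, num_params=2, n=len(xs))
```

For a fixed log-likelihood, a larger num_params always yields a larger BIC, so extra components are only preferred when they improve the fit enough to offset the penalty.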
- pdf(X)
Computes the PDF of the main mixture components (excluding external model).
- Parameters:
X (ndarray|list[float]) – Input scores of shape (n,).
- Return type:
ndarray
- Returns:
PDF values of shape (n,). Returns zeros if the model is not fitted.
- pdf_mix(X, external_pdf=None)
Computes the full mixture PDF, including external model if present.
f_mixture(x) = pi0 * f_decoy(x) + (1-pi0) * f_target(x)
- Parameters:
X (ndarray|list[float]) – Input scores.
external_pdf (ndarray|None) – Optional precomputed PDF values from external model.
- Return type:
ndarray
- Returns:
Mixture PDF values. Returns zeros if the model is not fitted.
- pep(X, external_pdf=None)
Estimates Posterior Error Probabilities (PEP).
PEP = pi0 * f_decoy(x) / f_mixture(x)
- Parameters:
X (ndarray|list[float]) – Input scores.
external_pdf (ndarray|None) – Optional precomputed PDF values from external model. If None and external_model exists, it will be computed.
- Return type:
ndarray
- Returns:
Array of PEP values for each score in X. Returns zeros if the model is not fitted.
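The PEP and mixture formulas can be checked numerically with a two-component toy model. This is a standard-library sketch, not the library's pep(): the function pep_two_component, the single-Gaussian decoy and target densities, and the parameter values are assumptions chosen so the result is easy to verify by hand.

```python
import math


def gauss_pdf(x, mu, sigma):
    """Gaussian density at a single point x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def pep_two_component(x, pi0, decoy, target):
    """PEP = pi0 * f_decoy(x) / f_mixture(x) for one decoy and one target Gaussian.

    decoy and target are (mean, std) pairs; both are hypothetical inputs.
    """
    f_decoy = gauss_pdf(x, *decoy)
    f_target = gauss_pdf(x, *target)
    f_mixture = pi0 * f_decoy + (1.0 - pi0) * f_target
    if f_mixture == 0.0:
        # Both densities underflowed; this sketch reports the PEP as undefined.
        return float("nan")
    return pi0 * f_decoy / f_mixture


decoy_params = (0.0, 1.0)   # assumed decoy mean and std
target_params = (4.0, 1.0)  # assumed target mean and std
peps = [
    pep_two_component(x, 0.5, decoy_params, target_params)
    for x in (0.0, 2.0, 4.0)
]
```

With equal weights and equal spreads, the score halfway between the two means has a PEP of one half; scores near the decoy mean approach 1 and scores near the target mean approach 0.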
- plot(title, plot_scores, false_scores=None)
Plots the fitted mixture model against a histogram of scores.
If an external model exists and false_scores are provided, plots:
- Decoy model (external)
- Target histogram + mixture fit
- Separated true and false components
Otherwise, plots only the decoy model fit.
- Parameters:
title (str) – Title prefix for plots.
plot_scores (ndarray|list[float]) – Scores to plot (e.g., target scores).
false_scores (ndarray|list[float]|None) – Optional decoy scores for comparison.
- Return type:
None
- fennomix_mhc.tda_fmm.gamma_pdf(X, u, sigma)
Calculate Gamma probability density function values using mean and std.
The shape and scale parameters are derived from mean (u) and std (sigma).
- Parameters:
X (ndarray|list[float]|float) – Input array of shape (n,) representing scores.
u (float) – Mean of the distribution.
sigma (float) – Standard deviation of the distribution.
- Return type:
ndarray
- Returns:
Array of Gamma PDF values with same shape as X.
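One standard way to derive shape and scale from a mean and standard deviation is moment matching: a Gamma distribution has mean shape * scale and variance shape * scale^2, which gives shape = (u / sigma)^2 and scale = sigma^2 / u. The sketch below assumes that mapping; this page does not confirm that gamma_pdf uses exactly this derivation, and the scalar-only function gamma_pdf_from_moments is a hypothetical stand-in.

```python
import math


def gamma_pdf_from_moments(x, u, sigma):
    """Gamma density at x, parameterised by mean u (> 0) and std sigma (> 0)."""
    shape = (u / sigma) ** 2   # from mean = shape * scale
    scale = sigma ** 2 / u     # and variance = shape * scale ** 2
    if x <= 0.0:
        # Outside the support; this sketch also treats x == 0 as zero density.
        return 0.0
    log_pdf = (
        (shape - 1.0) * math.log(x)
        - x / scale
        - math.lgamma(shape)
        - shape * math.log(scale)
    )
    return math.exp(log_pdf)


# With u == sigma the shape is 1, so the Gamma reduces to an exponential.
density_at_mean = gamma_pdf_from_moments(2.0, u=2.0, sigma=2.0)
densities = [gamma_pdf_from_moments(x, u=2.0, sigma=2.0) for x in (-1.0, 0.5, 2.0)]
```

Working in log space with math.lgamma avoids overflow in the Gamma function for large shape values, that is, when sigma is small relative to u.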
- fennomix_mhc.tda_fmm.gauss_pdf(X, u, sigma)
Calculate Gaussian probability density function values for input array X.
- Parameters:
X (ndarray|list[float]|float) – Input array of shape (n,) representing scores.
u (float) – Mean (mu) of the Gaussian distribution.
sigma (float) – Standard deviation (sigma) of the Gaussian distribution.
- Return type:
ndarray
- Returns:
Array of PDF values with same shape as X.
- fennomix_mhc.tda_fmm.select_best_fmm(target_scores, decoy_fmm, _max_component_=3, verbose=True)
Selects the best TDA_fmm model by BIC criterion.
Fits models with 1 to _max_component_ components and selects the one with the lowest BIC.
- Parameters:
target_scores (ndarray|list[float]) – Scores to fit the target model on.
decoy_fmm (DecoyModel) – Pre-fitted decoy model.
_max_component_ – Maximum number of components to try.
verbose (bool) – Whether to print progress.
- Return type:
TDA_fmm
- Returns:
Best-fitted TDA_fmm model (target model).
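The selection rule described above (try 1 to _max_component_ components, keep the lowest BIC) reduces to a short loop. The sketch below shows only that loop: select_lowest_bic and the fit_and_score callback are hypothetical names, and the BIC values in the usage example are made up for illustration rather than produced by any fitted model.

```python
def select_lowest_bic(fit_and_score, max_components=3, verbose=False):
    """Return (n_components, model, bic) for the candidate with the lowest BIC.

    fit_and_score is a hypothetical callback that fits a model with the given
    number of components and returns a (model, bic) pair.
    """
    best = None
    for n_components in range(1, max_components + 1):
        model, bic = fit_and_score(n_components)
        if verbose:
            print(f"n_components={n_components} BIC={bic:.2f}")
        if best is None or bic < best[2]:
            best = (n_components, model, bic)
    return best


# Made-up BIC values standing in for real fits, purely to exercise the loop.
toy_bic = {1: 520.0, 2: 480.5, 3: 495.2}


def toy_fit_and_score(n_components):
    return f"model-{n_components}", toy_bic[n_components]


best_k, best_model, best_bic = select_lowest_bic(toy_fit_and_score, max_components=3)
```

Ties are resolved in favour of the smaller model here because the comparison is strict; whether select_best_fmm does the same is not stated in this page.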