qbiocode.evaluation package#

Submodules#

qbiocode.evaluation.dataset_evaluation module#

evaluate(df, y, file)[source]#

This function evaluates a dataset and returns a transposed summary DataFrame of statistical measures derived from it. Using the functions defined in this module, it computes intrinsic dimension, condition number, Fisher Discriminant Ratio, total correlation, mutual information, variance, coefficient of variation, data sparsity, low-variance features, data density, fractal dimension, data distributions (skewness and kurtosis), entropy of the target variable, and manifold complexity. The summary DataFrame is transposed for easier readability and contains the dataset name, number of features, number of samples, feature-to-sample ratio, and the statistical measures above. This function is useful for quickly summarizing the characteristics of a dataset, especially in machine learning and data analysis, allowing you to correlate a dataset's properties with its performance in predictive modeling tasks.

Parameters:
  • df (pandas.DataFrame) – Dataset in pandas with observation in rows, features in columns

  • y (int) – supervised binary class label

  • file (str) – Name of the dataset file for identification in the summary DataFrame

Returns:

Summary DataFrame containing various statistical measures of the dataset

Return type:

transposed (pandas.DataFrame)

get_coefficient_var(df)[source]#

Get the coefficient of variation

Parameters:

df (pandas.DataFrame) – Dataset in pandas with observation in rows, features in columns

Returns:

  • avg_co_of_v (float): Mean coefficient of variation

  • std_var (float): Standard deviation of the coefficient of variation

Return type:

tuple

get_complexity(df, n_neighbors=10, n_components=2)[source]#

Measure the manifold complexity by fitting Isomap and analyzing the geodesic vs. Euclidean distances. This function computes the reconstruction error of the Isomap algorithm, which serves as an indicator of the complexity of the manifold represented by the data.

Parameters:
  • df (pandas.DataFrame) – Dataset in pandas with observation in rows, features in columns

  • n_neighbors – Number of neighbors for the Isomap algorithm. Default value 10

  • n_components – Number of components (dimensions) for Isomap projection. Default value 2

Returns:

reconstruction_error: The reconstruction error of the Isomap model, i.e. the residual error between geodesic and Euclidean distances, which indicates the complexity of the manifold.

Return type:

float
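
The idea can be sketched with scikit-learn's Isomap, whose `reconstruction_error()` this function reports (the data below is synthetic, for illustration only):

```python
import numpy as np
from sklearn.manifold import Isomap

# Synthetic data on a curved 2-D manifold embedded in 3-D (swiss-roll-like)
rng = np.random.default_rng(0)
t = rng.uniform(0, 3 * np.pi, 300)
X = np.column_stack([t * np.cos(t), rng.uniform(0, 10, 300), t * np.sin(t)])

# Fit Isomap; its reconstruction error is the residual between geodesic
# distances on the neighborhood graph and distances in the low-D embedding
iso = Isomap(n_neighbors=10, n_components=2).fit(X)
error = iso.reconstruction_error()
print(error >= 0)  # True: a non-negative scalar; larger = more complex manifold
```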

get_condition_number(df)[source]#

Get the condition number of a matrix.

A matrix with a high condition number is said to be ill-conditioned. Ill-conditioned matrices produce large errors in their output even for small errors in their input, while a low condition number means the output errors stay stable.

Parameters:

df (pandas.DataFrame) – Dataset in pandas with observation in rows, features in columns

Returns:

condition number of the matrix represented in df

Return type:

float
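
NumPy's `np.linalg.cond` illustrates the contrast between a well- and an ill-conditioned matrix:

```python
import numpy as np

well = np.eye(2)                              # orthonormal: best possible conditioning
ill = np.array([[1.0, 1.0], [1.0, 1.0001]])   # columns nearly identical

print(np.linalg.cond(well))        # 1.0
print(np.linalg.cond(ill) > 1e3)   # True: tiny input noise blows up the output
```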

get_dimensions(df)[source]#

Get the number of features, samples, and feature-to-sample ratio from a DataFrame.

Parameters:

df (pandas.DataFrame) – Dataset in pandas with observation in rows, features in columns

Returns:

(num_features, num_samples, ratio)
  • num_features (int): Number of features in the DataFrame

  • num_samples (int): Number of samples in the DataFrame

  • ratio (float): Feature-to-sample ratio

Return type:

tuple
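
With observations in rows and features in columns, the three values follow directly from the DataFrame's shape:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros((50, 200)))  # 50 observations (rows), 200 features (columns)

num_samples, num_features = df.shape
ratio = num_features / num_samples      # feature-to-sample ratio
print(num_features, num_samples, ratio)  # 200 50 4.0
```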

get_entropy(y)[source]#

Calculate entropy of the target variable

Parameters:

y (int) – supervised binary class label

Returns:

  • avg_y_entropy (float): Mean entropy of the target variable

  • std_y_entropy (float): Standard deviation of entropy

Return type:

tuple
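
The quantity itself can be sketched with NumPy (entropy in bits here; the module's log base and exact aggregation may differ):

```python
import numpy as np

def label_entropy(y):
    """Shannon entropy (bits) of a label vector."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

print(label_entropy([0, 1, 0, 1]))       # 1.0: balanced binary labels are maximally uncertain
print(label_entropy([0, 0, 0, 0]) == 0)  # True: a single class carries no information
```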

get_fdr(df, y)[source]#

Calculate Fisher Discriminant Ratio for a given dataset.

Parameters:
  • df (pandas.DataFrame) – Dataset in pandas with observation in rows, features in columns

  • y (int) – supervised binary class label

Returns:

Fisher Discriminant ratio

Return type:

float
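
A minimal per-feature sketch of the ratio for two classes, assuming the textbook form (between-class separation over within-class spread):

```python
import numpy as np

def fisher_discriminant_ratio(X, y):
    """Per-feature FDR for binary labels: (mu0 - mu1)^2 / (var0 + var1)."""
    X0, X1 = X[y == 0], X[y == 1]
    return (X0.mean(axis=0) - X1.mean(axis=0)) ** 2 / (X0.var(axis=0) + X1.var(axis=0))

rng = np.random.default_rng(1)
# Two well-separated Gaussian classes, 2 features each
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

fdr = fisher_discriminant_ratio(X, y)
print(np.all(fdr > 1))  # True: both features discriminate the classes well
```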

get_fractal_dim(df, k_max)[source]#

Calculate the fractal dimension of the data using Higuchi’s method

Parameters:
  • df (pandas.DataFrame) – Dataset in pandas with observation in rows, features in columns

  • k_max (int) – Maximum number of k values to use in the calculation

Returns:

Fractal dimension of the data

Return type:

float
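
Higuchi's method can be sketched for a single 1-D series as follows (the module applies it across a DataFrame; this standalone version shows the core recurrence):

```python
import numpy as np

def higuchi_fd(x, k_max):
    """Estimate the Higuchi fractal dimension of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    lk = []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(k):
            idx = np.arange(m, n, k)  # subsample x[m], x[m+k], x[m+2k], ...
            if len(idx) < 2:
                continue
            # normalized curve length at this offset and scale
            lengths.append(np.abs(np.diff(x[idx])).sum() * (n - 1) / ((len(idx) - 1) * k * k))
        lk.append(np.mean(lengths))
    # slope of log L(k) against log(1/k) estimates the fractal dimension
    slope, _ = np.polyfit(np.log(1.0 / np.arange(1, k_max + 1)), np.log(lk), 1)
    return slope

noise = np.random.default_rng(0).standard_normal(1000)
fd = higuchi_fd(noise, k_max=10)
print(1.6 < fd < 2.2)  # True: white noise has a fractal dimension near 2
```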

get_intrinsic_dim(df)[source]#

Get intrinsic dimension of the data using lPCA from skdim.

Parameters:

df (pandas.DataFrame) – Dataset in pandas with observation in rows, features in columns

Returns:

Intrinsic dimension of the data

Return type:

float
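
A simplified global-PCA version of the idea (skdim's lPCA applies a local criterion, so this is only a sketch, and the 95% variance threshold is an illustrative choice):

```python
import numpy as np

def pca_intrinsic_dim(X, var_threshold=0.95):
    """Number of principal components needed to explain `var_threshold`
    of the variance -- a global-PCA stand-in for an intrinsic dimension."""
    Xc = X - X.mean(axis=0)
    eigvals = np.linalg.svd(Xc, compute_uv=False) ** 2
    cum = np.cumsum(eigvals) / eigvals.sum()
    return int(np.searchsorted(cum, var_threshold) + 1)

rng = np.random.default_rng(0)
# 5-D data that actually lives on a 2-D linear subspace
W = np.linalg.qr(rng.standard_normal((5, 2)))[0].T  # 2 orthonormal directions in 5-D
X = rng.standard_normal((200, 2)) @ W
print(pca_intrinsic_dim(X))  # 2
```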

get_log_density(df)[source]#

Calculate the mean log density of the data

Parameters:

df (pandas.DataFrame) – Dataset in pandas with observation in rows, features in columns

Returns:

mean log kernel density

Return type:

float

get_low_var_features(df, num_features)[source]#

Get the count of low-variance features

Parameters:
  • df (pandas.DataFrame) – Dataset in pandas with observation in rows, features in columns

  • num_features (int) – number of features in the dataset

Raises:

ValueError – If no feature is strong enough to keep

Returns:

count of features with low variance

Return type:

int
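
The count can be sketched with a plain variance threshold (the cutoff value here is hypothetical; the module's threshold may differ):

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.column_stack([
    np.ones(100),               # constant: zero variance
    np.full(100, 5.0),          # constant: zero variance
    rng.standard_normal(100),   # informative feature
])

threshold = 1e-8  # hypothetical cutoff
low_var_count = int(np.sum(X.var(axis=0) < threshold))
print(low_var_count)  # 2
```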

get_moments(df)[source]#

Compute third and fourth order moments of the data

Parameters:

df (pandas.DataFrame) – Dataset in pandas with observation in rows, features in columns

Returns:

  • avg_skew (float): Mean skewness

  • std_skew (float): Standard deviation of skewness

  • avg_kurt (float): Mean kurtosis

  • std_kurt (float): Standard deviation of kurtosis

Return type:

tuple

get_mutual_information(df, y)[source]#

Calculate mutual information via sklearn

Parameters:
  • df (pandas.DataFrame) – Dataset in pandas with observation in rows, features in columns

  • y (int) – supervised binary class label

Returns:

Mutual information

Return type:

float
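
The module computes this via sklearn; a self-contained plug-in estimate for discrete variables shows what the quantity measures:

```python
import numpy as np

def discrete_mi(x, y):
    """Plug-in mutual information estimate (nats) for two discrete variables;
    a sketch of the concept, not the module's sklearn-based estimator."""
    x, y = np.asarray(x), np.asarray(y)
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            p_xy = np.mean((x == xv) & (y == yv))
            if p_xy > 0:
                p_x, p_y = np.mean(x == xv), np.mean(y == yv)
                mi += p_xy * np.log(p_xy / (p_x * p_y))
    return mi

y = np.array([0, 1, 0, 1, 0, 1])
print(round(discrete_mi(y, y), 3))              # 0.693, i.e. log(2): x determines y fully
print(discrete_mi([0, 0, 1, 1], [0, 1, 0, 1]))  # 0.0: independent variables
```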

get_nnz(df)[source]#

Count the nonzero values in the data

Parameters:

df (pandas.DataFrame) – Dataset in pandas with observation in rows, features in columns

Returns:

nonzero count

Return type:

int

get_total_correlation(df)[source]#

Calculate Total Correlation

Parameters:

df (pandas.DataFrame) – Dataset in pandas with observation in rows, features in columns

Returns:

Total correlation

Return type:

float
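
Under a Gaussian assumption, total correlation reduces to a closed form in the covariance matrix; the sketch below uses that form (it is not necessarily the module's exact estimator):

```python
import numpy as np

def gaussian_total_correlation(X):
    """TC = 0.5 * (sum(log var_i) - log det(Sigma)) for Gaussian data;
    zero iff the features are uncorrelated."""
    cov = np.cov(X, rowvar=False)
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (np.sum(np.log(np.diag(cov))) - logdet)

rng = np.random.default_rng(0)
indep = rng.standard_normal((2000, 3))      # independent features
corr = indep.copy()
corr[:, 1] = corr[:, 0] + 0.1 * corr[:, 1]  # feature 1 nearly duplicates feature 0

print(gaussian_total_correlation(indep) < 0.05)  # True: near zero for independent data
print(gaussian_total_correlation(corr) > 1.0)    # True: redundancy inflates TC
```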

get_variance(df)[source]#

Get variance

Parameters:

df (pandas.DataFrame) – Dataset in pandas with observation in rows, features in columns

Returns:

  • avg_var (float): Mean variance

  • std_var (float): Standard deviation of variance

Return type:

tuple

get_volume(df)[source]#

Get volume of the data from Convex Hull

Parameters:

df (pandas.DataFrame) – Dataset in pandas with observation in rows, features in columns

Returns:

Volume of the space spanned by the features of the data

Return type:

volume (float)
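
SciPy's `ConvexHull` illustrates the measurement on toy 2-D data (for 2-D input, `volume` is the enclosed area):

```python
import numpy as np
from scipy.spatial import ConvexHull

# Four corners of the unit square plus an interior point
points = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5]])
hull = ConvexHull(points)
print(hull.volume)  # 1.0: the interior point does not change the hull
```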

qbiocode.evaluation.model_evaluation module#

modeleval(y_test, y_predicted, beg_time, params, args, model, verbose=True, average='weighted')[source]#

Evaluates the model performance using accuracy, F1 score, and AUC.

Parameters:
  • y_test (array-like) – True labels for the test set.

  • y_predicted (array-like) – Predicted labels by the model.

  • beg_time (float) – Start time for measuring execution time.

  • params (dict) – Model parameters used during training.

  • args (dict) – Additional arguments, including grid search flag.

  • model (str) – Name of the model being evaluated.

  • verbose (bool) – If True, prints the evaluation results.

  • average (str) – Type of averaging to use for F1 score calculation. Default is ‘weighted’.

Returns:

DataFrame containing the evaluation results, including accuracy, F1 score, AUC, and model parameters.

Return type:

pd.DataFrame
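
The three scores can be reproduced with scikit-learn's metrics on toy labels (a sketch of what the function computes, not its full bookkeeping of timings and parameters):

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

y_test = np.array([0, 1, 1, 0, 1, 0])       # true labels
y_predicted = np.array([0, 1, 0, 0, 1, 1])  # model predictions, 2 mistakes

acc = accuracy_score(y_test, y_predicted)
f1 = f1_score(y_test, y_predicted, average='weighted')  # the function's default averaging
auc = roc_auc_score(y_test, y_predicted)
print(round(acc, 3), round(auc, 3))  # 0.667 0.667
```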

qbiocode.evaluation.model_run module#

model_run(X_train, X_test, y_train, y_test, data_key, args)[source]#

This function runs the ML methods, with or without a grid search, as specified in the config.yaml file. It returns a Python dictionary containing these results, which can then be parsed out. It is designed to run each of the ML methods in parallel for each dataset (done by calling the Parallel module in results below). The arguments X_train, X_test, y_train, y_test are passed in from the main script (qmlbench.py) as the input datasets are processed, while the remaining arguments come from the config.yaml file.

Parameters:
  • X_train (pd.DataFrame) – Training features.

  • X_test (pd.DataFrame) – Testing features.

  • y_train (pd.Series) – Training labels.

  • y_test (pd.Series) – Testing labels.

  • data_key (str) – Key for the dataset being processed.

  • args (dict) – Dictionary containing configuration parameters, including:
    - model: List of models to run.
    - n_jobs: Number of parallel jobs to run.
    - grid_search: Boolean indicating whether to perform grid search.
    - cross_validation: Cross-validation strategy.
    - gridsearch_<model>_args: Arguments for grid search for each model.
    - <model>_args: Additional arguments for each model.

Returns:

A dictionary containing the results of the models run, with keys as model names and values as their respective results. This dictionary can readily be converted to a Pandas Dataframe, as seen in the ‘ModelResults.csv’ files that are produced in the results directory when the main profiler is run (qbiocode-profiler.py).

Return type:

model_total_result (dict)
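
The expected shape of `args` can be sketched as a plain dictionary mirroring the config.yaml keys listed above (model names and parameter values here are illustrative, not the project's defaults):

```python
# Hypothetical configuration; "rf" stands in for a concrete model name
args = {
    "model": ["lr", "rf"],                               # models to run
    "n_jobs": 4,                                         # parallel jobs
    "grid_search": True,                                 # whether to grid-search
    "cross_validation": 5,                               # CV strategy, e.g. fold count
    "gridsearch_rf_args": {"n_estimators": [100, 200]},  # grid for model "rf"
    "rf_args": {"random_state": 0},                      # fixed args for model "rf"
}
print(args["grid_search"])  # True
```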

Module contents#

Evaluation Module for QBioCode#

This module provides comprehensive evaluation tools for machine learning models and datasets. It includes functions for model performance assessment, dataset complexity analysis, and automated model execution.

Available Functions#

  • modeleval: Evaluate model performance with multiple metrics

  • evaluate: Comprehensive dataset complexity evaluation

  • model_run: Automated model training and evaluation pipeline

Usage#

>>> from qbiocode.evaluation import modeleval, evaluate
>>> # Evaluate model performance
>>> metrics = modeleval(y_test, y_predicted, beg_time, params, args, model)
>>> # Evaluate dataset complexity
>>> complexity_metrics = evaluate(df, y, file)