qbiocode.evaluation.model_run module

Summary

Functions:

model_run

This function runs the ML methods, with or without a grid search, as specified in the config.yaml file.

Reference

model_run(X_train, X_test, y_train, y_test, data_key, args)

This function runs the ML methods, with or without a grid search, as specified in the config.yaml file. It returns a Python dictionary containing these results, which can then be parsed out. It is designed to run the ML methods in parallel for each data set (this is done by calling the Parallel module in results below). The arguments X_train, X_test, y_train, and y_test are passed in from the main script (qmlbench.py) as the input datasets are processed, while the remaining arguments are passed from the config.yaml file.
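The parallel pattern described above can be illustrated with a minimal sketch. The source does not say which library provides Parallel, so this example swaps in concurrent.futures from the Python standard library instead. The helper names run_one_model and run_models_in_parallel, the model names, and the returned values are all hypothetical and are not part of qbiocode.

```python
from concurrent.futures import ThreadPoolExecutor


def run_one_model(model_name, data_key):
    """Hypothetical stand-in for fitting and scoring a single model."""
    return model_name, {"dataset": data_key, "status": "done"}


def run_models_in_parallel(model_names, data_key, n_jobs):
    """Run one task per model and collect the results keyed by model name."""
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # pool.map yields (model_name, result) pairs in submission order.
        pairs = pool.map(lambda name: run_one_model(name, data_key), model_names)
        return dict(pairs)


# Placeholder model names and dataset key, for illustration only.
results = run_models_in_parallel(["svc", "rf"], data_key="dataset_a", n_jobs=2)
```

Collecting the results into a dictionary keyed by model name mirrors the shape of the return value documented for model_run.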

Parameters:

X_train (pd.DataFrame) – Training features.
X_test (pd.DataFrame) – Testing features.
y_train (pd.Series) – Training labels.
y_test (pd.Series) – Testing labels.
data_key (str) – Key for the dataset being processed.
args (dict) – Dictionary containing configuration parameters, including:
- model: List of models to run.
- n_jobs: Number of parallel jobs to run.
- grid_search: Boolean indicating whether to perform grid search.
- cross_validation: Cross-validation strategy.
- gridsearch_<model>_args: Arguments for grid search for each model.
- <model>_args: Additional arguments for each model.
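As a rough illustration of the key layout described for args, the sketch below builds such a dictionary by hand. The model names ("svc", "rf"), every value, and the helper args_for are hypothetical. It also assumes that gridsearch_<model>_args is consulted when grid_search is true and <model>_args otherwise, which the documentation does not state explicitly.

```python
# Hypothetical configuration; model names and argument values are placeholders.
args = {
    "model": ["svc", "rf"],
    "n_jobs": 2,
    "grid_search": True,
    "cross_validation": 5,  # assumed to be a fold count; other forms may be accepted
    "gridsearch_svc_args": {"C": [0.1, 1, 10]},
    "gridsearch_rf_args": {"n_estimators": [50, 100]},
    "svc_args": {"kernel": "rbf"},
    "rf_args": {"max_depth": 4},
}


def args_for(model_name, args):
    """Return the per-model argument dict, following the key patterns above."""
    # Assumption: grid search settings are used only when grid_search is true.
    prefix = "gridsearch_" if args.get("grid_search") else ""
    return args.get(f"{prefix}{model_name}_args", {})


# Map each configured model to the arguments it would receive.
selected = {name: args_for(name, args) for name in args["model"]}
```

In practice these values come from config.yaml rather than being written inline.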

Returns:

A dictionary containing the results of the models that were run, keyed by model name, with each value holding that model's results. This dictionary can readily be converted to a pandas DataFrame, as seen in the ‘ModelResults.csv’ files that are produced in the results directory when the main profiler (qbiocode-profiler.py) is run.
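The conversion to a DataFrame might look like the following sketch. The inner structure of each model's results is not documented here, so the metric names and numbers below are invented placeholders, not real output from model_run.

```python
import pandas as pd

# Hypothetical results; metric names and values are made up for illustration.
model_total_result = {
    "svc": {"accuracy": 0.90, "f1": 0.88},
    "rf": {"accuracy": 0.85, "f1": 0.83},
}

# orient="index" produces one row per model and one column per metric.
df = pd.DataFrame.from_dict(model_total_result, orient="index")
df.index.name = "model"

# With no path argument, to_csv returns the CSV content as a string.
csv_text = df.to_csv()
```

Passing a file path to to_csv would write the table to disk instead of returning a string.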

Return type:

model_total_result (dict)