Model data
Module containing the ModelData class, which stores and operates on a training dataset for use in machine learning applications.
- class topsearch.data.model_data.ModelData(training_file: str, response_file: str)
Class that stores the data associated with a machine learning model and provides methods to modify that data.
- training
All training data points of the dataset
- Type:
NDArray
- response
The corresponding response values for each data point
- Type:
NDArray
- n_points
The number of data points in the dataset
- Type:
int
- resp_props
Dictionary containing the statistics of the response array
- Type:
dict
- train_props
Dictionary containing the statistics of the training array
- Type:
dict
- hull
Object storing the convex hull of the training data
- Type:
NDArray
- append_data(new_training: NDArray[Any, Any], new_response: NDArray[Any, Any]) None
Append new training and response data to the corresponding class attributes
- convex_hull() None
Compute the convex hull for the training data
- feature_subset(features: list) None
Get a subset of the features of the training data
- limit_response_maximum(upper_limit: float) None
Limits the maximum allowed response value
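The description does not say whether values above the limit are clipped or removed. The sketch below assumes clipping, which keeps n_points unchanged; `clip_response_maximum` is a hypothetical helper written for illustration and is not part of topsearch.

```python
import numpy as np


def clip_response_maximum(response, upper_limit):
    """Cap every response value at upper_limit (assumed behaviour: clipping)."""
    # Element-wise minimum leaves values below the limit untouched
    return np.minimum(response, upper_limit)


response = np.array([0.5, 2.0, 7.5, 3.0])
capped = clip_response_maximum(response, upper_limit=3.0)
```

Only the third value exceeds the limit here, so it is the only one that changes.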
- normalise_response() None
Normalises the response values, scaling them to lie in the range (0, 1)
- normalise_training() None
Normalises each feature of the training data to lie within the range (0, 1)
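As an illustration of this kind of scaling, the following is a minimal sketch of per-feature min-max normalisation and its inverse, assuming that is the scheme in use. The function names are hypothetical and the stored minimum and span stand in for whatever statistics the class keeps in train_props.

```python
import numpy as np


def min_max_normalise(training):
    """Scale each column (feature) to the unit interval."""
    minimum = training.min(axis=0)
    span = training.max(axis=0) - minimum
    # Assumption: constant features are left at zero rather than dividing by 0
    span = np.where(span == 0.0, 1.0, span)
    return (training - minimum) / span, minimum, span


def min_max_unnormalise(scaled, minimum, span):
    """Invert min_max_normalise using the stored statistics."""
    return scaled * span + minimum


training = np.array([[1.0, 10.0], [2.0, 30.0], [4.0, 20.0]])
scaled, minimum, span = min_max_normalise(training)
restored = min_max_unnormalise(scaled, minimum, span)
```

Keeping the minimum and span alongside the scaled array is what makes the later unnormalise step possible.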
- point_in_hull(point: NDArray[Any, Any]) bool
Determine if point is within convex hull of training data
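One common way to perform such a test is to check the point against every facet of the hull. The sketch below uses scipy.spatial.ConvexHull and its equations attribute; whether ModelData uses this approach internally is an assumption, and the standalone `point_in_hull` function and its `tolerance` parameter are illustrative only.

```python
import numpy as np
from scipy.spatial import ConvexHull


def point_in_hull(point, hull, tolerance=1e-12):
    """Return True if point satisfies every facet inequality of the hull."""
    # Each row of hull.equations is [normal, offset], and points inside the
    # hull satisfy normal @ x + offset <= 0 for all facets
    normals = hull.equations[:, :-1]
    offsets = hull.equations[:, -1]
    return bool(np.all(normals @ point + offsets <= tolerance))


# Unit square with one interior point that does not affect the hull
training = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 0.5]])
hull = ConvexHull(training)
inside = point_in_hull(np.array([0.25, 0.75]), hull)
outside = point_in_hull(np.array([1.5, 0.5]), hull)
```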
- read_data(training_file: str, response_file: str) None
Read in the training and response values needed for an ML model and store them in the class attributes
- remove_duplicates(dist_cutoff: float = 1e-07) None
Remove any data points within dist_cutoff of each other, retaining only the first
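The retain-the-first rule can be sketched as follows, assuming Euclidean distance between training points and that the matching response values are dropped together with their points. This standalone function is a hypothetical illustration, not the library's code.

```python
import numpy as np


def remove_duplicates(training, response, dist_cutoff=1e-7):
    """Drop points closer than dist_cutoff to an earlier retained point."""
    keep = []
    for i, point in enumerate(training):
        # Compare only against points that have already been retained
        if all(np.linalg.norm(point - training[j]) >= dist_cutoff for j in keep):
            keep.append(i)
    return training[keep], response[keep]


training = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0 + 1e-9]])
response = np.array([1.0, 2.0, 3.0, 4.0])
unique_training, unique_response = remove_duplicates(training, response)
```

This pairwise loop is quadratic in the number of points, which is acceptable for a sketch but may be slow for large datasets.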
- standardise_response() None
Standardises response values, enforcing a mean of 0, and a standard deviation of 1
- standardise_training() None
Standardises each feature of the training data to have mean 0, standard deviation 1
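Standardisation in this sense is the usual z-score transform. The sketch below shows it together with its inverse, assuming the population standard deviation (NumPy's default) and non-constant features. The function names are hypothetical.

```python
import numpy as np


def standardise(values):
    """Shift and scale each column to mean 0 and standard deviation 1."""
    mean = values.mean(axis=0)
    std = values.std(axis=0)  # assumes no column is constant (std > 0)
    return (values - mean) / std, mean, std


def unstandardise(scaled, mean, std):
    """Invert standardise using the stored mean and standard deviation."""
    return scaled * std + mean


training = np.array([[1.0, 10.0], [2.0, 30.0], [4.0, 20.0]])
scaled, mean, std = standardise(training)
restored = unstandardise(scaled, mean, std)
```

The same two functions apply unchanged to a one-dimensional response array, since axis=0 is then the only axis.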
- unnormalise_response()
Undo the normalisation of the response array
- unnormalise_training()
Undo the normalisation of the training array
- unstandardise_response()
Undo the standardisation of the response array
- unstandardise_training()
Undo the standardisation of the training array
- write_data(training_file: str, response_file: str) None
Writes the training and response attributes into the specified files