Gaussian process
Module that contains classes to implement Gaussian process regression and evaluate the resulting model
- class topsearch.potentials.gaussian_process.GaussianProcess(model_data: ModelData, kernel_choice: str, kernel_bounds: list, standardise_training: bool = False, standardise_response: bool = True, limit_highest_data: bool = False, matern_nu: float = None)
Description
Fit and evaluate a Gaussian process regression model to a given dataset. The function we evaluate is the value of the regression model at any point in feature space
- model_data
The object containing the training and response data we will fit
- Type:
ModelData instance
- kernel_choice
The choice of kernel, can be ‘RBF’ or ‘Matern’
- Type:
str
- kernel_bounds
Limits on the kernel lengthscales and noise (final element) hyperparameters
- Type:
list
- standardise_training
Choose whether to standardise the training data before GP fit
- Type:
bool
- standardise_response
Choose whether to standardise the response data before GP fit
- Type:
bool
- limit_highest_data
Specifies whether we should limit the largest response value. Useful in molecular applications where the steep repulsive wall gives huge values
- Type:
bool
- matern_nu
The nu parameter of the Matern kernel
- Type:
float
- gpr
The sklearn Gaussian process object
- Type:
class
- add_data(new_training: NDArray[Any, Any], new_response: NDArray[Any, Any]) None
Add data to the model data, accounting for standardisation
- function(position: NDArray[Any, Any]) float
Return the mean of the GP fit at position
- function_and_std(position: NDArray[Any, Any]) float
Return the mean and standard deviation of the GP fit at position
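The two evaluation methods above can be illustrated directly with scikit-learn. The following is a minimal sketch, not the topsearch implementation: the toy dataset, the kernel settings and the variable names are all assumptions made for illustration. It shows how a fitted regressor gives the predictive mean and an uncertainty estimate at a single position.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Toy 2D dataset (illustrative only, not from topsearch)
rng = np.random.default_rng(0)
training = rng.uniform(0.0, 1.0, size=(20, 2))
response = np.sin(3.0 * training[:, 0]) + training[:, 1] ** 2

# One lengthscale per feature plus a noise term (assumed kernel layout)
kernel = (RBF(length_scale=[1.0, 1.0], length_scale_bounds=(1e-2, 1e2))
          + WhiteKernel(noise_level=1e-3, noise_level_bounds=(1e-6, 1e-1)))
gpr = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=5,
                               random_state=0)
gpr.fit(training, response)

# predict expects a 2D array, so a single position is reshaped to (1, n_dims)
position = np.array([0.3, 0.7])
mean, std = gpr.predict(position.reshape(1, -1), return_std=True)
mean, std = float(mean[0]), float(std[0])
```

scikit-learn's `predict` can return either the standard deviation (`return_std=True`) or the covariance (`return_cov=True`); the sketch requests the standard deviation to match the method name.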
- get_score() float
Get the R^2 score of the GP fit
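For reference, the R^2 (coefficient of determination) score can be computed as below. This is a sketch of the standard definition, which is also what scikit-learn regressors report from their `score` method; the helper name `r_squared` is hypothetical, and which data topsearch scores against is not stated here.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Hypothetical helper: R^2 = 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

observed = np.array([1.0, 2.0, 3.0, 4.0])
perfect = r_squared(observed, observed)
baseline = r_squared(observed, np.full(4, observed.mean()))
```

A perfect fit gives a score of 1, while a model that always predicts the mean of the observed values gives 0.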
- initialise_gaussian_process(n_restarts: int = 50) None
Initialise the Gaussian process from sklearn. Stores the gpr object as an attribute of this class
- initialise_kernel() None
Create a specified kernel for use in a Gaussian process
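One way such a kernel could be assembled with scikit-learn is sketched below. The helper `build_kernel` is hypothetical, and the layout of `kernel_bounds` (a list of `(lower, upper)` pairs, one per lengthscale, with the noise bounds last) is an assumption based on the attribute description above. The starting values are also an arbitrary choice: the geometric mean of each pair of bounds.

```python
from math import sqrt
from sklearn.gaussian_process.kernels import RBF, Matern, WhiteKernel

def build_kernel(kernel_choice, kernel_bounds, matern_nu=2.5):
    """Hypothetical helper: lengthscale kernel plus a white-noise term."""
    # Assumed layout: one (lower, upper) pair per lengthscale, noise last
    *length_bounds, noise_bounds = kernel_bounds
    # Start each hyperparameter at the geometric mean of its bounds
    start_lengths = [sqrt(lo * hi) for lo, hi in length_bounds]
    if kernel_choice == "RBF":
        base = RBF(length_scale=start_lengths,
                   length_scale_bounds=length_bounds)
    elif kernel_choice == "Matern":
        base = Matern(length_scale=start_lengths,
                      length_scale_bounds=length_bounds, nu=matern_nu)
    else:
        raise ValueError(f"Unknown kernel_choice: {kernel_choice!r}")
    noise = WhiteKernel(noise_level=sqrt(noise_bounds[0] * noise_bounds[1]),
                        noise_level_bounds=noise_bounds)
    return base + noise

# Two features, so two lengthscale bounds followed by the noise bounds
kernel = build_kernel("Matern",
                      [(1e-2, 1e2), (1e-2, 1e2), (1e-5, 1e-1)],
                      matern_nu=1.5)
```

The resulting kernel has three tunable hyperparameters (two lengthscales and one noise level), which scikit-learn optimises in log space within the given bounds.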
- lowest_point() float
Find the lowest point in the current dataset
- prepare_training_data()
Modify the data provided to the Gaussian process: standardise the training and response data and limit their values
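The kind of preprocessing described above might look like the following sketch. Both helpers (`standardise`, `cap_response`) and the cap value are hypothetical; how topsearch selects the limit for the highest response values is not specified here.

```python
import numpy as np

def standardise(data):
    """Hypothetical helper: z-score each column, returning the statistics
    needed to apply the same transform to new data or to undo it."""
    mean = data.mean(axis=0)
    std = data.std(axis=0)
    std = np.where(std == 0.0, 1.0, std)  # avoid dividing by zero
    return (data - mean) / std, mean, std

def cap_response(response, max_value):
    """Hypothetical helper: clip very large responses to max_value."""
    return np.minimum(response, max_value)

training = np.array([[0.0, 10.0], [1.0, 20.0], [2.0, 30.0]])
scaled, mean, std = standardise(training)

# A huge value, such as one from a steep repulsive wall, is capped
response = np.array([0.1, 0.5, 1.0e6])
capped = cap_response(response, max_value=10.0)
```

Keeping `mean` and `std` matters when data is added later: new points need the same transform as the original training set to stay consistent with the fitted model.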
- refit_model(n_restarts: int = 50) None
Refit the GP model based on the current model_data
- update_bounds(scaling: float) None
Change the lengthscale bounds for the kernel
- write_fit() None
Write the hyperparameters of the best GP fit