Utilities
Data utilities
- inFairness.utils.datautils.convert_tensor_to_numpy(tensor)
Converts a PyTorch tensor to a NumPy array.
If the provided object is not a PyTorch tensor, it is returned unchanged.
- Parameters:
tensor (torch.Tensor) – Tensor to be converted to a NumPy array
- Returns:
array_np – NumPy array holding the contents of the provided tensor
- Return type:
numpy.ndarray
- inFairness.utils.datautils.generate_data_pairs(n_pairs, datasamples_1, datasamples_2=None, comparator=None)
Utility function to generate comparable or incomparable data pairs from the given data samples. One use case is building a dataset of comparable and incomparable pairs for the EXPLORE distance metric, which learns from such pairs.
- Parameters:
n_pairs (int) – Number of pairs to construct
datasamples_1 (numpy.ndarray) – Array of data samples of shape (N_1, *)
datasamples_2 (numpy.ndarray) – (Optional) Array of data samples of shape (N_2, *). If datasamples_2 is provided, data pairs are constructed between datasamples_1 and datasamples_2; otherwise, data pairs are constructed within datasamples_1
comparator (function) – A function (for example, a lambda) that, given two samples, returns True if they should be paired and False otherwise. If comparator is not defined, random data samples are paired together. Example: comparator = lambda x, y: (x == y)
- Returns:
idxs – An array of shape (n_pairs, 2) containing the indices of the paired data samples
- Return type:
numpy.ndarray
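The pairing behavior described above can be sketched in NumPy as follows. This is a minimal illustration, not the library's implementation: the helper name sketch_generate_pairs, the seed argument, and the choice to sample candidate pairs without replacement are assumptions made for the example.

```python
from itertools import product

import numpy as np


def sketch_generate_pairs(n_pairs, samples_1, samples_2=None, comparator=None, seed=0):
    """Hypothetical stand-in: sample index pairs, optionally filtered by a comparator."""
    if samples_2 is None:
        samples_2 = samples_1  # pair samples within the first set
    rng = np.random.default_rng(seed)
    # Enumerate every (i, j) index pair that the comparator accepts.
    candidates = [
        (i, j)
        for i, j in product(range(len(samples_1)), range(len(samples_2)))
        if comparator is None or comparator(samples_1[i], samples_2[j])
    ]
    if n_pairs > len(candidates):
        raise ValueError("not enough candidate pairs to sample from")
    # Assumption: pairs are drawn uniformly, without replacement.
    chosen = rng.choice(len(candidates), size=n_pairs, replace=False)
    return np.array([candidates[k] for k in chosen])


labels = np.array([0, 0, 1, 1, 0])
same_label = lambda x, y: bool(x == y)
pairs = sketch_generate_pairs(4, labels, comparator=same_label)
```

Here pairs has shape (4, 2), and every row indexes two samples that share a label. For multi-dimensional samples, a comparator such as x == y yields an array, so it would need to be reduced to a single boolean (for example with numpy.array_equal).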
- inFairness.utils.datautils.get_device(obj)
Returns the device (CPU or CUDA) based on the type of the reference object
- Parameters:
obj (torch.Tensor) – Reference object used to determine the device
- inFairness.utils.datautils.include_exclude_terms(data_terms: Iterable[str], include: Iterable[str] = (), exclude: Iterable[str] = ())
Given a set of data terms, returns the resulting set of terms after applying the specified inclusions and exclusions.
- Parameters:
data_terms (string iterable) – Set of terms to be filtered
include (string iterable) – Set of terms to be included. If not specified, all data_terms are included
exclude (string iterable) – Set of terms to be excluded from data_terms
- Returns:
terms – Resulting terms, in alphabetical order
- Return type:
list of strings
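The filtering rules above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: the name sketch_include_exclude is hypothetical, and the sketch assumes that a non-empty include replaces data_terms outright rather than being intersected with it, which the description above does not specify.

```python
def sketch_include_exclude(data_terms, include=(), exclude=()):
    """Hypothetical stand-in for the include/exclude filtering described above."""
    include = list(include)
    # Assumption: a non-empty `include` is used as-is; otherwise keep all data terms.
    terms = set(include) if include else set(data_terms)
    # Remove any excluded terms, then return the result in alphabetical order.
    terms -= set(exclude)
    return sorted(terms)


all_terms = ["b", "a", "c"]
kept_all = sketch_include_exclude(all_terms)
kept_included = sketch_include_exclude(all_terms, include=["c", "a"])
kept_after_exclude = sketch_include_exclude(all_terms, exclude=["b"])
```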
Post-processing utilities
- inFairness.utils.postprocessing.build_graph_from_dists(dists: Tensor, scale: float | None = None, threshold: float | None = None, normalize: bool = False)
Builds the adjacency matrix W from the given distances
- Parameters:
dists (torch.Tensor) – Distance matrix between data points. Shape: (N, N)
scale (float) – Parameter used to scale the computed distances
threshold (float) – Parameter used to determine whether two data points are connected. Data points whose distance is below the threshold are connected; those whose distance is beyond it are disconnected.
normalize (bool) – Whether to normalize the adjacency matrix or not
- Returns:
W (torch.Tensor) – Adjacency matrix. It contains only the data points that are connected to at least one other data point. Isolated data points, that is, ones not connected to any other data point, are not included in the adjacency matrix.
idxs_in (torch.Tensor) – Indices of data points which are included in the adjacency matrix
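The thresholding and the removal of isolated points can be illustrated with a small NumPy sketch. It is not the library's torch implementation: the helper name sketch_build_graph, the exponential edge weight exp(-scale * distance), and the removal of self-loops are all assumptions chosen for the example, and the normalize option is omitted.

```python
import numpy as np


def sketch_build_graph(dists, scale=1.0, threshold=np.inf):
    """Hypothetical NumPy stand-in: thresholded adjacency with isolated points dropped."""
    dists = np.asarray(dists, dtype=float)
    # Assumed edge weight: exp(-scale * distance), kept only below the threshold.
    W = np.exp(-scale * dists) * (dists < threshold)
    np.fill_diagonal(W, 0.0)  # assumption: a point is not its own neighbor
    # Keep only the points that are connected to at least one other point.
    idxs_in = np.where(W.sum(axis=1) > 0)[0]
    W = W[np.ix_(idxs_in, idxs_in)]
    return W, idxs_in


dists = np.array([
    [0.0, 1.0, 9.0],
    [1.0, 0.0, 8.0],
    [9.0, 8.0, 0.0],
])
W, idxs_in = sketch_build_graph(dists, threshold=2.0)
```

With a threshold of 2.0, the third point is farther than the threshold from both other points, so it is dropped: idxs_in contains indices 0 and 1, and W is a 2 x 2 matrix.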
- inFairness.utils.postprocessing.get_laplacian(W: Tensor, normalize: bool = False)
Computes the graph Laplacian of the adjacency matrix W
- Parameters:
W (torch.Tensor) – Adjacency matrix of shape (N, N)
normalize (bool) – Whether to normalize the computed Laplacian or not
- Returns:
Laplacian – Laplacian of the adjacency matrix
- Return type:
torch.Tensor
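For reference, the standard unnormalized graph Laplacian is L = D - W, where D is the diagonal matrix of row sums of W. The NumPy sketch below illustrates that definition; the helper name sketch_laplacian is hypothetical, and the symmetric normalization D^{-1/2} L D^{-1/2} used for normalize=True is an assumption, since the normalization the library applies is not stated here.

```python
import numpy as np


def sketch_laplacian(W, normalize=False):
    """Hypothetical NumPy stand-in: graph Laplacian L = D - W."""
    W = np.asarray(W, dtype=float)
    degrees = W.sum(axis=1)
    L = np.diag(degrees) - W
    if normalize:
        # Assumed normalization: D^{-1/2} L D^{-1/2}; needs strictly positive degrees.
        d_inv_sqrt = 1.0 / np.sqrt(degrees)
        L = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    return L


# Path graph on three nodes: 0 - 1 - 2
W = np.array([
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
])
L = sketch_laplacian(W)
L_norm = sketch_laplacian(W, normalize=True)
```

Each row of the unnormalized L sums to zero, and the symmetric normalization puts ones on the diagonal.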
- inFairness.utils.postprocessing.laplacian_solve(L: Tensor, y_hat: Tensor, lambda_param: float | None = None)
Solves a system of linear equations to obtain the post-processed output. The system it solves is: \(\hat{f} = (I + \lambda L)^{-1} \hat{y}\)
- Parameters:
L (torch.Tensor) – Laplacian matrix
y_hat (torch.Tensor) – Model predicted output class probabilities
lambda_param (float) – Weight for the Laplacian regularizer
- Returns:
y – Post-processed solution according to the equation above
- Return type:
torch.Tensor
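The equation above can be checked numerically with a short NumPy sketch. It is not the library's torch implementation: the helper name sketch_laplacian_solve and the default lambda_param of 1.0 are assumptions made for the example.

```python
import numpy as np


def sketch_laplacian_solve(L, y_hat, lambda_param=1.0):
    """Hypothetical NumPy stand-in: solve (I + lambda * L) f = y_hat for f."""
    L = np.asarray(L, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    A = np.eye(L.shape[0]) + lambda_param * L
    # Solve the linear system directly instead of forming the inverse explicitly.
    return np.linalg.solve(A, y_hat)


# Laplacian of two points joined by a single edge of weight 1.
L = np.array([
    [1.0, -1.0],
    [-1.0, 1.0],
])
# Predicted class probabilities for the two points (each row sums to 1).
y_hat = np.array([
    [0.9, 0.1],
    [0.1, 0.9],
])
f_hat = sketch_laplacian_solve(L, y_hat, lambda_param=1.0)
```

In this toy case the two connected points start with class-0 probabilities of 0.9 and 0.1 and are pulled toward each other, while each row of f_hat still sums to 1. A larger lambda_param pulls them closer together.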