Post-Processing
Post-Processing methods included in the package
Base post-processing class
- class inFairness.postprocessing.BasePostProcessing(distance_x, is_output_probas)
Base class for Post-Processing methods
- Parameters:
distance_x (inFairness.distances.Distance) – Distance metric in the input space
is_output_probas (bool) – True if data_Y (the model output) contains probabilities, which implies a classification setting; False if data_Y lies in Euclidean space, which implies a regression setting.
- add_datapoints(X: Tensor, y: Tensor)
Add datapoints to the post-processing method
- Parameters:
X (torch.Tensor) – New input datapoints
y (torch.Tensor) – New output datapoints
- property data
Input and Output data used for post-processing
- Returns:
data – A tuple of (X, Y) data points
- Return type:
Tuple(torch.Tensor, torch.Tensor)
- property distance_matrix
Distance matrix
- Returns:
distance_matrix – Matrix of distances of shape (N, N) where N is the number of data samples
- Return type:
Graph Laplacian Individual Fairness (GLIF)
- class inFairness.postprocessing.GraphLaplacianIF(distance_x, is_output_probas)
Implements the Graph Laplacian Individual Fairness Post-Processing method.
Proposed in the paper "Post-processing for Individual Fairness".
- Parameters:
distance_x (inFairness.distances.Distance) – Distance metric in the input space
is_output_probas (bool) – True if data_Y (the model output) contains probabilities, which implies a classification setting; False if data_Y lies in Euclidean space, which implies a regression setting.
- get_objective(y_solution, lambda_param: float, scale: float, threshold: float, normalize: bool = False, W_graph=None, idxs=None, L=None)
Compute the individual fairness objective values as follows:
\[\widehat{\mathbf{f}} = \arg \min_{\mathbf{f}} \ \|\mathbf{f} - \hat{\mathbf{y}}\|_2^2 + \lambda \ \mathbf{f}^{\top}\mathbb{L}_n \mathbf{f}\]
Refer to equation 3.1 in the paper.
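To make the two terms of the objective concrete, the following minimal sketch evaluates the fidelity term, the Laplacian regularizer, and their weighted sum for a one-dimensional solution. It uses plain Python lists instead of torch tensors so that it runs without dependencies. The function name `glif_objective` and the toy two-point graph are illustrative assumptions and are not part of the inFairness API.

```python
def glif_objective(f, y_hat, laplacian, lambda_param):
    """Return (fidelity, smoothness, total) for a 1-D candidate solution f.

    Hypothetical helper for illustration; not the library's get_objective.
    """
    n = len(f)
    # Fidelity term: squared L2 distance between the solution and the model output.
    fidelity = sum((f[i] - y_hat[i]) ** 2 for i in range(n))
    # Laplacian regularizer f^T L f: large when connected points get different outputs.
    smoothness = sum(f[i] * laplacian[i][j] * f[j] for i in range(n) for j in range(n))
    return fidelity, smoothness, fidelity + lambda_param * smoothness


# Toy graph: two points joined by one edge of weight 1, so L = D - W.
L = [[1.0, -1.0], [-1.0, 1.0]]
y_hat = [0.0, 1.0]
fidelity, smoothness, total = glif_objective([0.25, 0.75], y_hat, L, lambda_param=2.0)
```

For this toy graph the regularizer reduces to the squared difference between the two outputs, so pulling the outputs toward each other lowers the second term while raising the first. `lambda_param` controls that trade-off.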
- Parameters:
y_solution (torch.Tensor) – Post-processed solution values of shape (N, C)
lambda_param (float) – Weight for the Laplacian Regularizer
scale (float) – Parameter used to scale the computed distances. Refer to equation 2.2 in the proposing paper.
threshold (float) – Parameter used to construct the graph from distances. Pairs whose distance is below the threshold are treated as connected by an edge, while pairs beyond the threshold are treated as disconnected. Refer to equation 2.2 in the proposing paper.
normalize (bool) – Whether to normalize the computed Laplacian or not
W_graph (torch.Tensor) – Adjacency matrix of shape (N, N)
idxs (torch.Tensor) – Indices of data points which are included in the adjacency matrix
L (torch.Tensor) – Laplacian of the adjacency matrix
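The roles of scale, threshold, W_graph and L can be sketched as follows. This sketch assumes an exponential kernel of the form exp(-scale * d^2) for pairs within the threshold. That form is an assumption made for illustration; the authoritative definition is equation 2.2 of the paper, and the library's own graph construction may differ in detail. The helper names `build_graph` and `unnormalized_laplacian` are hypothetical.

```python
import math


def build_graph(distances, scale, threshold):
    """Build an adjacency matrix W from an (N, N) distance matrix.

    Assumed kernel (illustrative only): exp(-scale * d^2) when d <= threshold,
    and 0 otherwise. Self-loops are left at 0.
    """
    n = len(distances)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = distances[i][j]
            if i != j and d <= threshold:
                W[i][j] = math.exp(-scale * d ** 2)
    return W


def unnormalized_laplacian(W):
    """Return L = D - W, where D is the diagonal matrix of row sums of W."""
    n = len(W)
    return [
        [(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
        for i in range(n)
    ]


# Three points: 0 and 1 are close, 2 is far from both.
distances = [
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 4.0],
    [5.0, 4.0, 0.0],
]
W = build_graph(distances, scale=0.5, threshold=2.0)
L = unnormalized_laplacian(W)
```

With these values only the pair (0, 1) is connected, so point 2 contributes nothing to the regularizer and its output is left unconstrained by the graph.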
- Returns:
objective – Post-processed solution containing two parts:
- Post-processed output probabilities of shape (N, C), where N is the number of data samples and C is the number of output classes
- Objective values. Refer to equation 3.1 in the paper for an explanation of the various parts
- Return type:
- postprocess(method: str, lambda_param: float, scale: float, threshold: float, normalize: bool = False, batchsize: int | None = None, epochs: int | None = None)
Implements the Graph Laplacian Individual Fairness Post-processing algorithm
- Parameters:
method (str) –
GLIF method type. Possible values are:
(a) coordinate-descent, which is better suited to large-scale data and post-processes by splitting the data into minibatches (see section 3.2.2 of the paper), or
(b) exact, which computes the exact solution but is not appropriate for large-scale data (refer to equation 3.3 in the paper).
lambda_param (float) – Weight for the Laplacian Regularizer
scale (float) – Parameter used to scale the computed distances. Refer to equation 2.2 in the proposing paper.
threshold (float) – Parameter used to construct the graph from distances. Pairs whose distance is below the threshold are treated as connected by an edge, while pairs beyond the threshold are treated as disconnected. Refer to equation 2.2 in the proposing paper.
normalize (bool) – Whether to normalize the computed Laplacian or not
batchsize (int) – Batch size. Required when method=`coordinate-descent`
epochs (int) – Number of coordinate descent epochs. Required when method=`coordinate-descent`
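The difference between the two method values can be illustrated with a small dependency-free sketch. For a symmetric Laplacian L, setting the gradient of the objective ‖f − ŷ‖² + λ fᵀLf to zero gives the linear system (I + λL) f = ŷ. The exact route solves that system directly, while a coordinate-wise route updates one entry of f at a time. The update below is a simplified scalar variant chosen for illustration; it is not the library's minibatch implementation, and both function names are hypothetical.

```python
def solve_exact(y_hat, laplacian, lambda_param):
    """Solve (I + lambda * L) f = y_hat by Gaussian elimination with partial pivoting."""
    n = len(y_hat)
    # Augmented matrix [I + lambda * L | y_hat].
    A = [
        [(1.0 if i == j else 0.0) + lambda_param * laplacian[i][j] for j in range(n)]
        + [y_hat[i]]
        for i in range(n)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for k in range(col, n + 1):
                A[row][k] -= factor * A[col][k]
    # Back-substitution on the upper-triangular system.
    f = [0.0] * n
    for i in reversed(range(n)):
        f[i] = (A[i][n] - sum(A[i][k] * f[k] for k in range(i + 1, n))) / A[i][i]
    return f


def solve_coordinate_descent(y_hat, laplacian, lambda_param, epochs):
    """Minimize the same objective one coordinate at a time (assumes symmetric L)."""
    n = len(y_hat)
    f = list(y_hat)  # start from the unprocessed model output
    for _ in range(epochs):
        for i in range(n):
            # Closed-form minimizer over f[i] with all other coordinates held fixed.
            off_diag = sum(laplacian[i][j] * f[j] for j in range(n) if j != i)
            f[i] = (y_hat[i] - lambda_param * off_diag) / (1.0 + lambda_param * laplacian[i][i])
    return f


L = [[1.0, -1.0], [-1.0, 1.0]]
y_hat = [0.0, 1.0]
f_exact = solve_exact(y_hat, L, lambda_param=2.0)
f_cd = solve_coordinate_descent(y_hat, L, lambda_param=2.0, epochs=50)
```

On this toy problem both routes agree to numerical precision. The direct solve costs cubic time in N, which is the practical reason an iterative, batched method is preferred for large datasets.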
- Returns:
solution – Post-processed solution containing two parts:
- Post-processed output probabilities of shape (N, C), where N is the number of data samples and C is the number of output classes
- Objective values. Refer to equation 3.1 in the paper for an explanation of the various parts
- Return type: