Partitioning algorithms

random_partition(df, test_size, random_state=42, **kwargs)

Uses a random partitioning algorithm to generate training and evaluation subsets. This is a wrapper around the train_test_split function from scikit-learn.

Parameters:

Name | Type | Description | Default
df | DataFrame | DataFrame with the entities to partition. | required
test_size | float | Proportion of entities to allocate to the test subset, e.g. 0.2. | required
random_state | int | Seed for the pseudo-random number generator. | 42

Returns:

Type | Description
Tuple[pd.DataFrame, pd.DataFrame] | A tuple with the indexes of training and evaluation samples.
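
As a rough illustration of what such a wrapper might look like, the sketch below calls scikit-learn's train_test_split on a small DataFrame. The function name random_partition_sketch, the sample columns entity_id and label, and the forwarding of **kwargs to train_test_split are assumptions made for this example; they are not taken from the library's source. The return type is documented as a pair of DataFrames, so the sketch returns the two subsets, and their .index attribute gives the sample indexes.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def random_partition_sketch(df, test_size, random_state=42, **kwargs):
    """Hypothetical stand-in for random_partition; not the library's code."""
    # Assumption: extra keyword arguments are forwarded to train_test_split.
    train_df, test_df = train_test_split(
        df, test_size=test_size, random_state=random_state, **kwargs
    )
    return train_df, test_df


# Illustrative data: ten entities with a binary label.
entities = pd.DataFrame({"entity_id": range(10), "label": [0, 1] * 5})

train_df, test_df = random_partition_sketch(entities, test_size=0.2)

# The original row labels are preserved, so the indexes can be read directly.
train_index = train_df.index
test_index = test_df.index
```

Because random_state is fixed, repeated calls on the same DataFrame should yield the same split, which helps keep evaluation results reproducible.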