Generic Datamodules
terratorch.datamodules.generic_pixel_wise_data_module
This module contains generic data modules for instantiation at runtime.
GenericNonGeoPixelwiseRegressionDataModule
Bases: NonGeoDataModule
This is a generic datamodule class for instantiating data modules at runtime. Composes several GenericNonGeoPixelwiseRegressionDatasets.
__init__(batch_size, num_workers, train_data_root, val_data_root, test_data_root, means, stds, predict_data_root=None, img_grep='*', label_grep='*', train_label_data_root=None, val_label_data_root=None, test_label_data_root=None, train_split=None, val_split=None, test_split=None, ignore_split_file_extensions=True, allow_substring_split_file=True, dataset_bands=None, output_bands=None, predict_dataset_bands=None, predict_output_bands=None, constant_scale=1, rgb_indices=None, train_transform=None, val_transform=None, test_transform=None, expand_temporal_dimension=False, reduce_zero_label=False, no_data_replace=None, no_label_replace=None, drop_last=True, pin_memory=False, check_stackability=True, **kwargs)
Constructor
Parameters:
Name | Type | Description | Default |
---|---|---|---|
batch_size | int | description | required |
num_workers | int | description | required |
train_data_root | Path | description | required |
val_data_root | Path | description | required |
test_data_root | Path | description | required |
predict_data_root | Path | description | None |
img_grep | str | description | '*' |
label_grep | str | description | '*' |
means | list[float] | description | required |
stds | list[float] | description | required |
train_label_data_root | Path \| None | description. Defaults to None. | None |
val_label_data_root | Path \| None | description. Defaults to None. | None |
test_label_data_root | Path \| None | description. Defaults to None. | None |
train_split | Path \| None | description. Defaults to None. | None |
val_split | Path \| None | description. Defaults to None. | None |
test_split | Path \| None | description. Defaults to None. | None |
ignore_split_file_extensions | bool | Whether to disregard extensions when using the split file to determine which files to include in the dataset. E.g. necessary for Eurosat, since the split files specify ".jpg" but files are actually ".tif". Defaults to True. | True |
allow_substring_split_file | bool | Whether the split files contain substrings that must be present in file names to be included (as in mmsegmentation), or exact matches (e.g. eurosat). Defaults to True. | True |
dataset_bands | list[HLSBands \| int] \| None | Bands present in the dataset. Defaults to None. | None |
output_bands | list[HLSBands \| int] \| None | Bands that should be output by the dataset. Naming must match that of dataset_bands. Defaults to None. | None |
predict_dataset_bands | list[HLSBands \| int] \| None | Overwrites dataset_bands with this value at predict time. Defaults to None, which does not overwrite. | None |
predict_output_bands | list[HLSBands \| int] \| None | Overwrites output_bands with this value at predict time. Defaults to None, which does not overwrite. | None |
constant_scale | float | description. Defaults to 1. | 1 |
rgb_indices | list[int] \| None | description. Defaults to None. | None |
train_transform | Compose \| None | Albumentations transform to be applied to the train dataset. Should end with ToTensorV2(). If used through the generic_data_module, should not include normalization. Not supported for multi-temporal data. Defaults to None, which simply applies ToTensorV2(). | None |
val_transform | Compose \| None | Albumentations transform to be applied to the val dataset. Should end with ToTensorV2(). If used through the generic_data_module, should not include normalization. Not supported for multi-temporal data. Defaults to None, which simply applies ToTensorV2(). | None |
test_transform | Compose \| None | Albumentations transform to be applied to the test dataset. Should end with ToTensorV2(). If used through the generic_data_module, should not include normalization. Not supported for multi-temporal data. Defaults to None, which simply applies ToTensorV2(). | None |
no_data_replace | float \| None | Replace NaN values in input images with this value. If None, does no replacement. Defaults to None. | None |
no_label_replace | int \| None | Replace NaN values in the label with this value. If None, does no replacement. Defaults to None. | None |
expand_temporal_dimension | bool | Reshape from (time*channels, h, w) to (channels, time, h, w). Defaults to False. | False |
reduce_zero_label | bool | Subtract 1 from all labels. Useful when labels start from 1 instead of the expected 0. Defaults to False. | False |
drop_last | bool | Drop the last batch if it is not complete. Defaults to True. | True |
pin_memory | bool | If True, the data loader copies tensors into pinned memory before returning them. Defaults to False. | False |
check_stackability | bool | Check whether all files in the dataset have the same size and can be stacked. Defaults to True. | True |
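The interaction between ignore_split_file_extensions and allow_substring_split_file is easier to see on a toy example. The following is a minimal sketch of the semantics described in the table, not terratorch's actual implementation; the helper `filter_by_split` and the file names are hypothetical.

```python
from pathlib import PurePath


def filter_by_split(file_names, split_entries, ignore_extensions=True, allow_substring=True):
    """Keep the file names selected by a split file (hypothetical helper)."""

    def normalize(name):
        # Optionally drop the extension, e.g. "a_1.jpg" -> "a_1".
        return PurePath(name).stem if ignore_extensions else name

    # Normalize the split entries once up front.
    wanted = {normalize(entry) for entry in split_entries}
    selected = []
    for name in file_names:
        key = normalize(name)
        if allow_substring:
            # An entry only needs to occur somewhere in the file name.
            match = any(entry in key for entry in wanted)
        else:
            # The (normalized) file name must equal an entry exactly.
            match = key in wanted
        if match:
            selected.append(name)
    return selected


# Hypothetical data: the split file lists ".jpg" names, the files on disk are ".tif".
files = ["AnnualCrop_1.tif", "AnnualCrop_10.tif", "Forest_2.tif"]
split = ["AnnualCrop_1.jpg"]

print(filter_by_split(files, split, ignore_extensions=True, allow_substring=False))
print(filter_by_split(files, split, ignore_extensions=True, allow_substring=True))
print(filter_by_split(files, split, ignore_extensions=False, allow_substring=False))
```

In this sketch, exact matching with extensions ignored selects only `AnnualCrop_1.tif`, substring matching also picks up `AnnualCrop_10.tif`, and keeping the extensions selects nothing because no file ends in ".jpg". This suggests that exact matching is the safer choice when file names share prefixes.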
GenericNonGeoSegmentationDataModule
Bases: NonGeoDataModule
This is a generic datamodule class for instantiating data modules at runtime. Composes several GenericNonGeoSegmentationDatasets.
__init__(batch_size, num_workers, train_data_root, val_data_root, test_data_root, img_grep, label_grep, means, stds, num_classes, predict_data_root=None, train_label_data_root=None, val_label_data_root=None, test_label_data_root=None, train_split=None, val_split=None, test_split=None, ignore_split_file_extensions=True, allow_substring_split_file=True, dataset_bands=None, output_bands=None, predict_dataset_bands=None, predict_output_bands=None, constant_scale=1, rgb_indices=None, train_transform=None, val_transform=None, test_transform=None, expand_temporal_dimension=False, reduce_zero_label=False, no_data_replace=None, no_label_replace=None, drop_last=True, pin_memory=False, **kwargs)
Constructor
Parameters:
Name | Type | Description | Default |
---|---|---|---|
batch_size | int | description | required |
num_workers | int | description | required |
train_data_root | Path | description | required |
val_data_root | Path | description | required |
test_data_root | Path | description | required |
predict_data_root | Path | description | None |
img_grep | str | description | required |
label_grep | str | description | required |
means | list[float] | description | required |
stds | list[float] | description | required |
num_classes | int | description | required |
train_label_data_root | Path \| None | description. Defaults to None. | None |
val_label_data_root | Path \| None | description. Defaults to None. | None |
test_label_data_root | Path \| None | description. Defaults to None. | None |
train_split | Path \| None | description. Defaults to None. | None |
val_split | Path \| None | description. Defaults to None. | None |
test_split | Path \| None | description. Defaults to None. | None |
ignore_split_file_extensions | bool | Whether to disregard extensions when using the split file to determine which files to include in the dataset. E.g. necessary for Eurosat, since the split files specify ".jpg" but files are actually ".tif". Defaults to True. | True |
allow_substring_split_file | bool | Whether the split files contain substrings that must be present in file names to be included (as in mmsegmentation), or exact matches (e.g. eurosat). Defaults to True. | True |
dataset_bands | list[HLSBands \| int] \| None | Bands present in the dataset. Defaults to None. | None |
output_bands | list[HLSBands \| int] \| None | Bands that should be output by the dataset. Naming must match that of dataset_bands. Defaults to None. | None |
predict_dataset_bands | list[HLSBands \| int] \| None | Overwrites dataset_bands with this value at predict time. Defaults to None, which does not overwrite. | None |
predict_output_bands | list[HLSBands \| int] \| None | Overwrites output_bands with this value at predict time. Defaults to None, which does not overwrite. | None |
constant_scale | float | description. Defaults to 1. | 1 |
rgb_indices | list[int] \| None | description. Defaults to None. | None |
train_transform | Compose \| None | Albumentations transform to be applied to the train dataset. Should end with ToTensorV2(). If used through the generic_data_module, should not include normalization. Not supported for multi-temporal data. Defaults to None, which simply applies ToTensorV2(). | None |
val_transform | Compose \| None | Albumentations transform to be applied to the val dataset. Should end with ToTensorV2(). If used through the generic_data_module, should not include normalization. Not supported for multi-temporal data. Defaults to None, which simply applies ToTensorV2(). | None |
test_transform | Compose \| None | Albumentations transform to be applied to the test dataset. Should end with ToTensorV2(). If used through the generic_data_module, should not include normalization. Not supported for multi-temporal data. Defaults to None, which simply applies ToTensorV2(). | None |
no_data_replace | float \| None | Replace NaN values in input images with this value. If None, does no replacement. Defaults to None. | None |
no_label_replace | int \| None | Replace NaN values in the label with this value. If None, does no replacement. Defaults to None. | None |
expand_temporal_dimension | bool | Reshape from (time*channels, h, w) to (channels, time, h, w). Defaults to False. | False |
reduce_zero_label | bool | Subtract 1 from all labels. Useful when labels start from 1 instead of the expected 0. Defaults to False. | False |
drop_last | bool | Drop the last batch if it is not complete. Defaults to True. | True |
pin_memory | bool | If True, the data loader copies tensors into pinned memory before returning them. Defaults to False. | False |
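To make the effect of no_data_replace, no_label_replace and reduce_zero_label concrete, here is a small plain-Python sketch on flat lists of values. The helpers `replace_nan` and `reduce_zero_label` are hypothetical stand-ins written for illustration; they are not terratorch functions, and the order in which the library applies these steps is not stated in this reference.

```python
import math


def replace_nan(values, replacement):
    """Replace NaN entries with `replacement`; None means no replacement."""
    if replacement is None:
        return list(values)
    return [
        replacement if isinstance(v, float) and math.isnan(v) else v
        for v in values
    ]


def reduce_zero_label(labels):
    """Subtract 1 from every label, so labels starting at 1 start at 0."""
    return [label - 1 for label in labels]


pixels = [0.2, math.nan, 0.7]  # toy image values with one missing pixel
labels = [1, 2, 3, 1]          # toy labels that start at 1

print(replace_nan(pixels, 0.0))    # NaN replaced by 0.0
print(replace_nan(pixels, None))   # values left as they are
print(reduce_zero_label(labels))   # labels shifted to start at 0
```

With `replacement=None` the NaN stays in the data, which is why a replacement value is usually worth setting when the source rasters contain missing pixels.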
terratorch.datamodules.generic_scalar_label_data_module
This module contains generic data modules for instantiation at runtime.
GenericNonGeoClassificationDataModule
Bases: NonGeoDataModule
This is a generic datamodule class for instantiating data modules at runtime. Composes several GenericNonGeoClassificationDatasets.
__init__(batch_size, num_workers, train_data_root, val_data_root, test_data_root, means, stds, num_classes, predict_data_root=None, train_split=None, val_split=None, test_split=None, ignore_split_file_extensions=True, allow_substring_split_file=True, dataset_bands=None, predict_dataset_bands=None, output_bands=None, constant_scale=1, rgb_indices=None, train_transform=None, val_transform=None, test_transform=None, expand_temporal_dimension=False, no_data_replace=0, drop_last=True, check_stackability=True, **kwargs)
Constructor
Parameters:
Name | Type | Description | Default |
---|---|---|---|
batch_size | int | description | required |
num_workers | int | description | required |
train_data_root | Path | description | required |
val_data_root | Path | description | required |
test_data_root | Path | description | required |
means | list[float] | description | required |
stds | list[float] | description | required |
num_classes | int | description | required |
predict_data_root | Path | description | None |
train_split | Path \| None | description. Defaults to None. | None |
val_split | Path \| None | description. Defaults to None. | None |
test_split | Path \| None | description. Defaults to None. | None |
ignore_split_file_extensions | bool | Whether to disregard extensions when using the split file to determine which files to include in the dataset. E.g. necessary for Eurosat, since the split files specify ".jpg" but files are actually ".tif". | True |
allow_substring_split_file | bool | Whether the split files contain substrings that must be present in file names to be included (as in mmsegmentation), or exact matches (e.g. eurosat). Defaults to True. | True |
dataset_bands | list[HLSBands \| int] \| None | description. Defaults to None. | None |
predict_dataset_bands | list[HLSBands \| int] \| None | description. Defaults to None. | None |
output_bands | list[HLSBands \| int] \| None | description. Defaults to None. | None |
constant_scale | float | description. Defaults to 1. | 1 |
rgb_indices | list[int] \| None | description. Defaults to None. | None |
train_transform | Compose \| None | Albumentations transform to be applied to the train dataset. Should end with ToTensorV2(). If used through the generic_data_module, should not include normalization. Not supported for multi-temporal data. Defaults to None, which simply applies ToTensorV2(). | None |
val_transform | Compose \| None | Albumentations transform to be applied to the val dataset. Should end with ToTensorV2(). If used through the generic_data_module, should not include normalization. Not supported for multi-temporal data. Defaults to None, which simply applies ToTensorV2(). | None |
test_transform | Compose \| None | Albumentations transform to be applied to the test dataset. Should end with ToTensorV2(). If used through the generic_data_module, should not include normalization. Not supported for multi-temporal data. Defaults to None, which simply applies ToTensorV2(). | None |
no_data_replace | float | Replace NaN values in input images with this value. Defaults to 0. | 0 |
expand_temporal_dimension | bool | Reshape from (time*channels, h, w) to (channels, time, h, w). Defaults to False. | False |
drop_last | bool | Drop the last batch if it is not complete. Defaults to True. | True |
check_stackability | bool | Check whether all files in the dataset have the same size and can be stacked. Defaults to True. | True |
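The drop_last and check_stackability flags both concern how samples are grouped into batches. The sketch below illustrates the idea with plain Python, under the assumption that batching follows the usual PyTorch DataLoader convention; `num_batches` and `all_same_shape` are hypothetical helpers, and this reference does not state which splits the flags apply to.

```python
import math


def num_batches(num_samples, batch_size, drop_last=True):
    """Number of batches produced for `num_samples` items."""
    if drop_last:
        # The trailing incomplete batch is discarded.
        return num_samples // batch_size
    # The trailing incomplete batch is kept.
    return math.ceil(num_samples / batch_size)


def all_same_shape(shapes):
    """True if every sample has the same shape and can therefore be stacked."""
    return len(set(shapes)) <= 1


print(num_batches(10, 4, drop_last=True))
print(num_batches(10, 4, drop_last=False))

# Toy (channels, height, width) shapes; the last one differs in height.
print(all_same_shape([(6, 224, 224), (6, 224, 224)]))
print(all_same_shape([(6, 224, 224), (6, 256, 224)]))
```

With 10 samples and a batch size of 4, dropping the last batch leaves 2 full batches, while keeping it yields 3 batches with the final one holding only 2 samples.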