Datamodules
terratorch.datamodules.biomassters

BioMasstersNonGeoDataModule

Bases: NonGeoDataModule

NonGeo LightningDataModule implementation for the BioMassters dataset.

__init__(data_root, batch_size=4, num_workers=0, bands=BioMasstersNonGeo.all_band_names, train_transform=None, val_transform=None, test_transform=None, predict_transform=None, aug=None, drop_last=True, sensors=['S1', 'S2'], as_time_series=False, metadata_filename=default_metadata_filename, max_cloud_percentage=None, max_red_mean=None, include_corrupt=True, subset=1, seed=42, use_four_frames=False, **kwargs)

Initializes the DataModule for the non-geospatial BioMassters dataset.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data_root | str | Root directory containing the dataset. | required |
| batch_size | int | Batch size for DataLoaders. Defaults to 4. | 4 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| bands | dict[str, Sequence[str]] \| Sequence[str] | Band configuration; either a dict mapping sensors to bands or a list for the first sensor. Defaults to BioMasstersNonGeo.all_band_names. | all_band_names |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training data. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation data. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for testing data. | None |
| predict_transform | Compose \| None \| list[BasicTransform] | Transformations for prediction data. | None |
| aug | AugmentationSequential | Augmentation or normalization to apply. Defaults to normalization if not provided. | None |
| drop_last | bool | Whether to drop the last incomplete batch. Defaults to True. | True |
| sensors | Sequence[str] | List of sensors to use (e.g., ["S1", "S2"]). Defaults to ["S1", "S2"]. | ['S1', 'S2'] |
| as_time_series | bool | Whether to treat data as a time series. Defaults to False. | False |
| metadata_filename | str | Metadata filename. Defaults to "The_BioMassters_-_features_metadata.csv.csv". | default_metadata_filename |
| max_cloud_percentage | float \| None | Maximum allowed cloud percentage. Defaults to None. | None |
| max_red_mean | float \| None | Maximum allowed red band mean. Defaults to None. | None |
| include_corrupt | bool | Whether to include corrupt data. Defaults to True. | True |
| subset | float | Fraction of the dataset to use. Defaults to 1. | 1 |
| seed | int | Random seed for reproducibility. Defaults to 42. | 42 |
| use_four_frames | bool | Whether to use a four-frame configuration. Defaults to False. | False |
| **kwargs | Any | Additional keyword arguments. | {} |

Returns:

| Type | Description |
|---|---|
| None | None. |

setup(stage)

Set up datasets.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| stage | str | Either fit, validate, test, or predict. | required |
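
A minimal usage sketch (the data_root path below is hypothetical; the remaining arguments follow the defaults documented above):

```python
from terratorch.datamodules.biomassters import BioMasstersNonGeoDataModule

# Hypothetical local copy of the BioMassters data
datamodule = BioMasstersNonGeoDataModule(
    data_root="data/biomassters",
    batch_size=8,
    num_workers=4,
    sensors=["S1", "S2"],
    as_time_series=True,
)
datamodule.setup("fit")  # builds the training and validation datasets
train_batch = next(iter(datamodule.train_dataloader()))
```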
terratorch.datamodules.burn_intensity

BurnIntensityNonGeoDataModule

Bases: NonGeoDataModule

NonGeo LightningDataModule implementation for the BurnIntensity dataset.

__init__(data_root, batch_size=4, num_workers=0, bands=BurnIntensityNonGeo.all_band_names, train_transform=None, val_transform=None, test_transform=None, predict_transform=None, use_full_data=True, no_data_replace=0.0001, no_label_replace=-1, use_metadata=False, **kwargs)

Initializes the DataModule for the non-geospatial BurnIntensity dataset.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data_root | str | Root directory of the dataset. | required |
| batch_size | int | Batch size for DataLoaders. Defaults to 4. | 4 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| bands | Sequence[str] | List of bands to use. Defaults to BurnIntensityNonGeo.all_band_names. | all_band_names |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for testing. | None |
| predict_transform | Compose \| None \| list[BasicTransform] | Transformations for prediction. | None |
| use_full_data | bool | Whether to use the full dataset or only data with less than 25 percent zeros. Defaults to True. | True |
| no_data_replace | float \| None | Value to replace missing data. Defaults to 0.0001. | 0.0001 |
| no_label_replace | int \| None | Value to replace missing labels. Defaults to -1. | -1 |
| use_metadata | bool | Whether to return metadata info (time and location). | False |
| **kwargs | Any | Additional keyword arguments. | {} |

setup(stage)

Set up datasets.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| stage | str | Either fit, validate, test, or predict. | required |
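
A minimal usage sketch (the data_root path is hypothetical; bands default to BurnIntensityNonGeo.all_band_names):

```python
from terratorch.datamodules.burn_intensity import BurnIntensityNonGeoDataModule

# Hypothetical dataset location
datamodule = BurnIntensityNonGeoDataModule(
    data_root="data/burn_intensity",
    batch_size=4,
    num_workers=2,
    use_full_data=True,
    use_metadata=False,
)
datamodule.setup("fit")
train_loader = datamodule.train_dataloader()
```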
terratorch.datamodules.carbonflux

CarbonFluxNonGeoDataModule

Bases: NonGeoDataModule

NonGeo LightningDataModule implementation for the Carbon Flux dataset.

__init__(data_root, batch_size=4, num_workers=0, bands=CarbonFluxNonGeo.all_band_names, train_transform=None, val_transform=None, test_transform=None, predict_transform=None, aug=None, no_data_replace=0.0001, use_metadata=False, **kwargs)

Initializes the CarbonFluxNonGeoDataModule.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data_root | str | Root directory of the dataset. | required |
| batch_size | int | Batch size for DataLoaders. Defaults to 4. | 4 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| bands | Sequence[str] | List of bands to use. Defaults to CarbonFluxNonGeo.all_band_names. | all_band_names |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training data. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation data. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for testing data. | None |
| predict_transform | Compose \| None \| list[BasicTransform] | Transformations for prediction data. | None |
| aug | AugmentationSequential | Augmentation sequence; if None, applies multimodal normalization. | None |
| no_data_replace | float \| None | Value to replace missing data. Defaults to 0.0001. | 0.0001 |
| use_metadata | bool | Whether to return metadata info. | False |
| **kwargs | Any | Additional keyword arguments. | {} |

setup(stage)

Set up datasets.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| stage | str | Either fit, validate, test, or predict. | required |
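
A minimal usage sketch (the data_root path is hypothetical; leaving aug=None applies the built-in multimodal normalization):

```python
from terratorch.datamodules.carbonflux import CarbonFluxNonGeoDataModule

# Hypothetical dataset location
datamodule = CarbonFluxNonGeoDataModule(
    data_root="data/carbon_flux",
    batch_size=4,
    num_workers=2,
)
datamodule.setup("test")
test_loader = datamodule.test_dataloader()
```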
terratorch.datamodules.forestnet

ForestNetNonGeoDataModule

Bases: NonGeoDataModule

NonGeo LightningDataModule implementation for the ForestNet dataset.

__init__(data_root, batch_size=4, num_workers=0, label_map=ForestNetNonGeo.default_label_map, bands=ForestNetNonGeo.all_band_names, train_transform=None, val_transform=None, test_transform=None, predict_transform=None, fraction=1.0, aug=None, use_metadata=False, **kwargs)

Initializes the ForestNetNonGeoDataModule.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data_root | str | Directory containing the dataset. | required |
| batch_size | int | Batch size for data loaders. Defaults to 4. | 4 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| label_map | dict[str, int] | Mapping of labels to integers. Defaults to ForestNetNonGeo.default_label_map. | default_label_map |
| bands | Sequence[str] | List of band names to use. Defaults to ForestNetNonGeo.all_band_names. | all_band_names |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training data. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation data. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for testing data. | None |
| predict_transform | Compose \| None \| list[BasicTransform] | Transformations for prediction. | None |
| fraction | float | Fraction of data to use. Defaults to 1.0. | 1.0 |
| aug | AugmentationSequential | Augmentation/normalization pipeline; if None, uses Normalize. | None |
| use_metadata | bool | Whether to return metadata info. | False |
| **kwargs | Any | Additional keyword arguments. | {} |

setup(stage)

Set up datasets.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| stage | str | Either fit, validate, test, or predict. | required |
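
A minimal usage sketch passing an Albumentations list as the training transform (the data_root path and the specific transforms are illustrative assumptions):

```python
import albumentations as A
from albumentations.pytorch import ToTensorV2

from terratorch.datamodules.forestnet import ForestNetNonGeoDataModule

# Hypothetical dataset location; a small Albumentations pipeline for training
datamodule = ForestNetNonGeoDataModule(
    data_root="data/forestnet",
    batch_size=8,
    fraction=0.5,  # use half of the data, as allowed by the `fraction` argument
    train_transform=[A.HorizontalFlip(p=0.5), ToTensorV2()],
)
datamodule.setup("fit")
```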
terratorch.datamodules.fire_scars

FireScarsDataModule

Bases: GeoDataModule

Geo Fire Scars data module implementation that merges input data with ground truth segmentation masks.

FireScarsNonGeoDataModule

Bases: NonGeoDataModule

NonGeo LightningDataModule implementation for the Fire Scars dataset.

__init__(data_root, batch_size=4, num_workers=0, bands=FireScarsNonGeo.all_band_names, train_transform=None, val_transform=None, test_transform=None, predict_transform=None, drop_last=True, no_data_replace=0, no_label_replace=-1, use_metadata=False, **kwargs)

Initializes the FireScarsNonGeoDataModule.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data_root | str | Root directory of the dataset. | required |
| batch_size | int | Batch size for DataLoaders. Defaults to 4. | 4 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| bands | Sequence[str] | List of band names. Defaults to FireScarsNonGeo.all_band_names. | all_band_names |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for testing. | None |
| predict_transform | Compose \| None \| list[BasicTransform] | Transformations for prediction. | None |
| drop_last | bool | Whether to drop the last incomplete batch. Defaults to True. | True |
| no_data_replace | float \| None | Replacement value for missing data. Defaults to 0. | 0 |
| no_label_replace | int \| None | Replacement value for missing labels. Defaults to -1. | -1 |
| use_metadata | bool | Whether to return metadata info. | False |
| **kwargs | Any | Additional keyword arguments. | {} |

setup(stage)

Set up datasets.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| stage | str | Either fit, validate, test, or predict. | required |
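
A minimal usage sketch for the non-geospatial variant (the data_root path is hypothetical; missing pixels and labels are replaced per the defaults above):

```python
from terratorch.datamodules.fire_scars import FireScarsNonGeoDataModule

# Hypothetical dataset location
datamodule = FireScarsNonGeoDataModule(
    data_root="data/fire_scars",
    batch_size=4,
    no_data_replace=0,
    no_label_replace=-1,
)
datamodule.setup("fit")
val_loader = datamodule.val_dataloader()
```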
terratorch.datamodules.landslide4sense

Landslide4SenseNonGeoDataModule

Bases: NonGeoDataModule

NonGeo LightningDataModule implementation for the Landslide4Sense dataset.

__init__(data_root, batch_size=4, num_workers=0, bands=Landslide4SenseNonGeo.all_band_names, train_transform=None, val_transform=None, test_transform=None, predict_transform=None, aug=None, **kwargs)

Initializes the Landslide4SenseNonGeoDataModule.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data_root | str | Root directory of the dataset. | required |
| batch_size | int | Batch size for data loaders. Defaults to 4. | 4 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| bands | Sequence[str] | List of band names to use. Defaults to Landslide4SenseNonGeo.all_band_names. | all_band_names |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training data. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation data. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for testing data. | None |
| predict_transform | Compose \| None \| list[BasicTransform] | Transformations for prediction data. | None |
| aug | AugmentationSequential | Augmentation pipeline; if None, applies normalization using computed means and stds. | None |
| **kwargs | Any | Additional keyword arguments. | {} |

setup(stage)

Set up datasets.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| stage | str | Either fit, validate, test, or predict. | required |
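
A minimal usage sketch (the data_root path is hypothetical; aug=None applies the dataset's normalization):

```python
from terratorch.datamodules.landslide4sense import Landslide4SenseNonGeoDataModule

# Hypothetical dataset location
datamodule = Landslide4SenseNonGeoDataModule(
    data_root="data/landslide4sense",
    batch_size=4,
    num_workers=2,
)
datamodule.setup("fit")
train_loader = datamodule.train_dataloader()
```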
terratorch.datamodules.m_eurosat

MEuroSATNonGeoDataModule

Bases: GeobenchDataModule

NonGeo LightningDataModule implementation for the M-EuroSAT dataset.

__init__(batch_size=8, num_workers=0, data_root='./', bands=None, train_transform=None, val_transform=None, test_transform=None, aug=None, partition='default', **kwargs)

Initializes the MEuroSATNonGeoDataModule for the MEuroSATNonGeo dataset.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| batch_size | int | Batch size for DataLoaders. Defaults to 8. | 8 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| data_root | str | Root directory of the dataset. Defaults to "./". | './' |
| bands | Sequence[str] \| None | List of bands to use. Defaults to None. | None |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for testing. | None |
| aug | AugmentationSequential | Augmentation/normalization pipeline. Defaults to None. | None |
| partition | str | Partition size. Defaults to "default". | 'default' |
| **kwargs | Any | Additional keyword arguments. | {} |
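
All GeobenchDataModule subclasses on this page (M-EuroSAT above, as well as M-BigEarthNet, M-BrickKiln, M-ForestNet, M-So2Sat, M-Pv4ger, M-Cashew Plantation, M-NZCattle, M-ChesapeakeLandcover, M-Pv4gerSeg, M-SA-CropType, and M-NeonTree below) share this constructor pattern, so the following sketch applies to any of them by swapping the class name. The data_root path is a placeholder:

```python
from terratorch.datamodules.m_eurosat import MEuroSATNonGeoDataModule

# Placeholder GEO-Bench data root; partition="default" uses the full partition
datamodule = MEuroSATNonGeoDataModule(
    data_root="data/geobench",
    batch_size=16,
    num_workers=4,
    partition="default",
)
datamodule.setup("fit")
train_loader = datamodule.train_dataloader()
```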
terratorch.datamodules.m_bigearthnet

MBigEarthNonGeoDataModule

Bases: GeobenchDataModule

NonGeo LightningDataModule implementation for the M-BigEarthNet dataset.

__init__(batch_size=8, num_workers=0, data_root='./', bands=None, train_transform=None, val_transform=None, test_transform=None, aug=None, partition='default', **kwargs)

Initializes the MBigEarthNonGeoDataModule for the M-BigEarthNet dataset.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| batch_size | int | Batch size for DataLoaders. Defaults to 8. | 8 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| data_root | str | Root directory of the dataset. Defaults to "./". | './' |
| bands | Sequence[str] \| None | List of bands to use. Defaults to None. | None |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for testing. | None |
| aug | AugmentationSequential | Augmentation/normalization pipeline. Defaults to None. | None |
| partition | str | Partition size. Defaults to "default". | 'default' |
| **kwargs | Any | Additional keyword arguments. | {} |
terratorch.datamodules.m_brick_kiln

MBrickKilnNonGeoDataModule

Bases: GeobenchDataModule

NonGeo LightningDataModule implementation for the M-BrickKiln dataset.

__init__(batch_size=8, num_workers=0, data_root='./', bands=None, train_transform=None, val_transform=None, test_transform=None, aug=None, partition='default', **kwargs)

Initializes the MBrickKilnNonGeoDataModule for the M-BrickKilnNonGeo dataset.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| batch_size | int | Batch size for DataLoaders. Defaults to 8. | 8 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| data_root | str | Root directory of the dataset. Defaults to "./". | './' |
| bands | Sequence[str] \| None | List of bands to use. Defaults to None. | None |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for testing. | None |
| aug | AugmentationSequential | Augmentation/normalization pipeline. Defaults to None. | None |
| partition | str | Partition size. Defaults to "default". | 'default' |
| **kwargs | Any | Additional keyword arguments. | {} |
terratorch.datamodules.m_forestnet

MForestNetNonGeoDataModule

Bases: GeobenchDataModule

NonGeo LightningDataModule implementation for the M-ForestNet dataset.

__init__(batch_size=8, num_workers=0, data_root='./', bands=None, train_transform=None, val_transform=None, test_transform=None, aug=None, partition='default', use_metadata=False, **kwargs)

Initializes the MForestNetNonGeoDataModule for the MForestNetNonGeo dataset.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| batch_size | int | Batch size for DataLoaders. Defaults to 8. | 8 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| data_root | str | Root directory of the dataset. Defaults to "./". | './' |
| bands | Sequence[str] \| None | List of bands to use. Defaults to None. | None |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for testing. | None |
| aug | AugmentationSequential | Augmentation/normalization pipeline. Defaults to None. | None |
| partition | str | Partition size. Defaults to "default". | 'default' |
| use_metadata | bool | Whether to return metadata info. | False |
| **kwargs | Any | Additional keyword arguments. | {} |
terratorch.datamodules.m_so2sat

MSo2SatNonGeoDataModule

Bases: GeobenchDataModule

NonGeo LightningDataModule implementation for the M-So2Sat dataset.

__init__(batch_size=8, num_workers=0, data_root='./', bands=None, train_transform=None, val_transform=None, test_transform=None, aug=None, partition='default', **kwargs)

Initializes the MSo2SatNonGeoDataModule for the MSo2SatNonGeo dataset.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| batch_size | int | Batch size for DataLoaders. Defaults to 8. | 8 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| data_root | str | Root directory of the dataset. Defaults to "./". | './' |
| bands | Sequence[str] \| None | List of bands to use. Defaults to None. | None |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for testing. | None |
| aug | AugmentationSequential | Augmentation/normalization pipeline. Defaults to None. | None |
| partition | str | Partition size. Defaults to "default". | 'default' |
| **kwargs | Any | Additional keyword arguments. | {} |
terratorch.datamodules.m_pv4ger

MPv4gerNonGeoDataModule

Bases: GeobenchDataModule

NonGeo LightningDataModule implementation for the M-Pv4ger dataset.

__init__(batch_size=8, num_workers=0, data_root='./', bands=None, train_transform=None, val_transform=None, test_transform=None, aug=None, partition='default', use_metadata=False, **kwargs)

Initializes the MPv4gerNonGeoDataModule for the MPv4gerNonGeo dataset.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| batch_size | int | Batch size for DataLoaders. Defaults to 8. | 8 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| data_root | str | Root directory of the dataset. Defaults to "./". | './' |
| bands | Sequence[str] \| None | List of bands to use. Defaults to None. | None |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for testing. | None |
| aug | AugmentationSequential | Augmentation/normalization pipeline. Defaults to None. | None |
| partition | str | Partition size. Defaults to "default". | 'default' |
| use_metadata | bool | Whether to return metadata info. | False |
| **kwargs | Any | Additional keyword arguments. | {} |
terratorch.datamodules.m_cashew_plantation

MBeninSmallHolderCashewsNonGeoDataModule

Bases: GeobenchDataModule

NonGeo LightningDataModule implementation for the M-Cashew Plantation dataset.

__init__(batch_size=8, num_workers=0, data_root='./', bands=None, train_transform=None, val_transform=None, test_transform=None, aug=None, partition='default', use_metadata=False, **kwargs)

Initializes the MBeninSmallHolderCashewsNonGeoDataModule for the M-BeninSmallHolderCashewsNonGeo dataset.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| batch_size | int | Batch size for DataLoaders. Defaults to 8. | 8 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| data_root | str | Root directory of the dataset. Defaults to "./". | './' |
| bands | Sequence[str] \| None | List of bands to use. Defaults to None. | None |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for testing. | None |
| aug | AugmentationSequential | Augmentation/normalization pipeline. Defaults to None. | None |
| partition | str | Partition size. Defaults to "default". | 'default' |
| use_metadata | bool | Whether to return metadata info. | False |
| **kwargs | Any | Additional keyword arguments. | {} |
terratorch.datamodules.m_nz_cattle

MNzCattleNonGeoDataModule

Bases: GeobenchDataModule

NonGeo LightningDataModule implementation for the M-NZCattle dataset.

__init__(batch_size=8, num_workers=0, data_root='./', bands=None, train_transform=None, val_transform=None, test_transform=None, aug=None, partition='default', use_metadata=False, **kwargs)

Initializes the MNzCattleNonGeoDataModule for the MNzCattleNonGeo dataset.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| batch_size | int | Batch size for DataLoaders. Defaults to 8. | 8 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| data_root | str | Root directory of the dataset. Defaults to "./". | './' |
| bands | Sequence[str] \| None | List of bands to use. Defaults to None. | None |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for testing. | None |
| aug | AugmentationSequential | Augmentation/normalization pipeline. Defaults to None. | None |
| partition | str | Partition size. Defaults to "default". | 'default' |
| use_metadata | bool | Whether to return metadata info. | False |
| **kwargs | Any | Additional keyword arguments. | {} |
terratorch.datamodules.m_chesapeake_landcover

MChesapeakeLandcoverNonGeoDataModule

Bases: GeobenchDataModule

NonGeo LightningDataModule implementation for the M-ChesapeakeLandcover dataset.

__init__(batch_size=8, num_workers=0, data_root='./', bands=None, train_transform=None, val_transform=None, test_transform=None, aug=None, partition='default', **kwargs)

Initializes the MChesapeakeLandcoverNonGeoDataModule for the M-ChesapeakeLandcover dataset.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| batch_size | int | Batch size for DataLoaders. Defaults to 8. | 8 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| data_root | str | Root directory of the dataset. Defaults to "./". | './' |
| bands | Sequence[str] \| None | List of bands to use. Defaults to None. | None |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for testing. | None |
| aug | AugmentationSequential | Augmentation/normalization pipeline. Defaults to None. | None |
| partition | str | Partition size. Defaults to "default". | 'default' |
| **kwargs | Any | Additional keyword arguments. | {} |
terratorch.datamodules.m_pv4ger_seg

MPv4gerSegNonGeoDataModule

Bases: GeobenchDataModule

NonGeo LightningDataModule implementation for the M-Pv4gerSeg dataset.

__init__(batch_size=8, num_workers=0, data_root='./', bands=None, train_transform=None, val_transform=None, test_transform=None, aug=None, partition='default', use_metadata=False, **kwargs)

Initializes the MPv4gerSegNonGeoDataModule for the MPv4gerSegNonGeo dataset.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| batch_size | int | Batch size for DataLoaders. Defaults to 8. | 8 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| data_root | str | Root directory of the dataset. Defaults to "./". | './' |
| bands | Sequence[str] \| None | List of bands to use. Defaults to None. | None |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for testing. | None |
| aug | AugmentationSequential | Augmentation/normalization pipeline. Defaults to None. | None |
| partition | str | Partition size. Defaults to "default". | 'default' |
| use_metadata | bool | Whether to return metadata info. | False |
| **kwargs | Any | Additional keyword arguments. | {} |
terratorch.datamodules.m_SA_crop_type

MSACropTypeNonGeoDataModule

Bases: GeobenchDataModule

NonGeo LightningDataModule implementation for the M-SA-CropType dataset.

__init__(batch_size=8, num_workers=0, data_root='./', bands=None, train_transform=None, val_transform=None, test_transform=None, aug=None, partition='default', **kwargs)

Initializes the MSACropTypeNonGeoDataModule for the MSACropTypeNonGeo dataset.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| batch_size | int | Batch size for DataLoaders. Defaults to 8. | 8 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| data_root | str | Root directory of the dataset. Defaults to "./". | './' |
| bands | Sequence[str] \| None | List of bands to use. Defaults to None. | None |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for testing. | None |
| aug | AugmentationSequential | Augmentation/normalization pipeline. Defaults to None. | None |
| partition | str | Partition size. Defaults to "default". | 'default' |
| **kwargs | Any | Additional keyword arguments. | {} |
terratorch.datamodules.m_neontree

MNeonTreeNonGeoDataModule

Bases: GeobenchDataModule

NonGeo LightningDataModule implementation for the M-NeonTree dataset.

__init__(batch_size=8, num_workers=0, data_root='./', bands=None, train_transform=None, val_transform=None, test_transform=None, aug=None, partition='default', **kwargs)

Initializes the MNeonTreeNonGeoDataModule for the MNeonTreeNonGeo dataset.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| batch_size | int | Batch size for DataLoaders. Defaults to 8. | 8 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| data_root | str | Root directory of the dataset. Defaults to "./". | './' |
| bands | Sequence[str] \| None | List of bands to use. Defaults to None. | None |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for testing. | None |
| aug | AugmentationSequential | Augmentation/normalization pipeline. Defaults to None. | None |
| partition | str | Partition size. Defaults to "default". | 'default' |
| **kwargs | Any | Additional keyword arguments. | {} |
terratorch.datamodules.multi_temporal_crop_classification

MultiTemporalCropClassificationDataModule

Bases: NonGeoDataModule

NonGeo LightningDataModule implementation for multi-temporal crop classification.

__init__(data_root, batch_size=4, num_workers=0, bands=MultiTemporalCropClassification.all_band_names, train_transform=None, val_transform=None, test_transform=None, predict_transform=None, drop_last=True, no_data_replace=0, no_label_replace=-1, expand_temporal_dimension=True, reduce_zero_label=True, use_metadata=False, metadata_file_name='chips_df.csv', **kwargs)

Initializes the MultiTemporalCropClassificationDataModule for multi-temporal crop classification.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data_root | str | Directory containing the dataset. | required |
| batch_size | int | Batch size for DataLoaders. Defaults to 4. | 4 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| bands | Sequence[str] | List of bands to use. Defaults to MultiTemporalCropClassification.all_band_names. | all_band_names |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training data. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation data. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for testing data. | None |
| predict_transform | Compose \| None \| list[BasicTransform] | Transformations for prediction data. | None |
| drop_last | bool | Whether to drop the last incomplete batch during training. Defaults to True. | True |
| no_data_replace | float \| None | Replacement value for missing data. Defaults to 0. | 0 |
| no_label_replace | int \| None | Replacement value for missing labels. Defaults to -1. | -1 |
| expand_temporal_dimension | bool | Go from shape (time*channels, h, w) to (channels, time, h, w). Defaults to True. | True |
| reduce_zero_label | bool | Subtract 1 from all labels. Useful when labels start from 1 instead of the expected 0. Defaults to True. | True |
| use_metadata | bool | Whether to return metadata info (time and location). | False |
| **kwargs | Any | Additional keyword arguments. | {} |

setup(stage)

Set up datasets.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| stage | str | Either fit, validate, test, or predict. | required |
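
A minimal usage sketch (the data_root path is hypothetical; temporal reshaping and label shifting follow the defaults documented above):

```python
from terratorch.datamodules.multi_temporal_crop_classification import (
    MultiTemporalCropClassificationDataModule,
)

# Hypothetical dataset location
datamodule = MultiTemporalCropClassificationDataModule(
    data_root="data/multi_temporal_crop",
    batch_size=4,
    expand_temporal_dimension=True,  # (time*channels, h, w) -> (channels, time, h, w)
    reduce_zero_label=True,          # shift labels so classes start at 0
)
datamodule.setup("fit")
```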
terratorch.datamodules.open_sentinel_map

OpenSentinelMapDataModule

Bases: NonGeoDataModule

NonGeo LightningDataModule implementation for Open Sentinel Map.

__init__(bands=None, batch_size=8, num_workers=0, data_root='./', train_transform=None, val_transform=None, test_transform=None, predict_transform=None, spatial_interpolate_and_stack_temporally=True, pad_image=None, truncate_image=None, **kwargs)

Initializes the OpenSentinelMapDataModule for the Open Sentinel Map dataset.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| bands | list[str] \| None | List of bands to use. Defaults to None. | None |
| batch_size | int | Batch size for DataLoaders. Defaults to 8. | 8 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| data_root | str | Root directory of the dataset. Defaults to "./". | './' |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training data. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation data. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for testing data. | None |
| predict_transform | Compose \| None \| list[BasicTransform] | Transformations for prediction data. | None |
| spatial_interpolate_and_stack_temporally | bool | If True, the bands are interpolated and concatenated over time. Defaults to True. | True |
| pad_image | int \| None | Number of timesteps to pad the time dimension of the image. If None, no padding is applied. | None |
| truncate_image | int \| None | Number of timesteps to truncate the time dimension of the image. If None, no truncation is performed. | None |
| **kwargs | Any | Additional keyword arguments. | {} |

setup(stage)

Set up datasets.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| stage | str | Either fit, validate, test, or predict. | required |
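
A minimal usage sketch (the data_root path and the timestep count are hypothetical; pad_image and truncate_image together fix the length of the time dimension):

```python
from terratorch.datamodules.open_sentinel_map import OpenSentinelMapDataModule

# Hypothetical dataset location and series length
datamodule = OpenSentinelMapDataModule(
    data_root="data/open_sentinel_map",
    batch_size=8,
    spatial_interpolate_and_stack_temporally=True,
    pad_image=12,       # pad series shorter than 12 timesteps
    truncate_image=12,  # truncate series longer than 12 timesteps
)
datamodule.setup("fit")
```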
terratorch.datamodules.openearthmap

OpenEarthMapNonGeoDataModule

Bases: NonGeoDataModule

NonGeo LightningDataModule implementation for Open Earth Map.

__init__(batch_size=8, num_workers=0, data_root='./', train_transform=None, val_transform=None, test_transform=None, predict_transform=None, aug=None, **kwargs)

Initializes the OpenEarthMapNonGeoDataModule for the Open Earth Map dataset.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| batch_size | int | Batch size for DataLoaders. Defaults to 8. | 8 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| data_root | str | Root directory of the dataset. Defaults to "./". | './' |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training data. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation data. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for test data. | None |
| predict_transform | Compose \| None \| list[BasicTransform] | Transformations for prediction data. | None |
| aug | AugmentationSequential | Augmentation pipeline; if None, defaults to normalization using computed means and stds. | None |
| **kwargs | Any | Additional keyword arguments. Can include 'bands' (list[str]) to specify the bands; defaults to OpenEarthMapNonGeo.all_band_names if not provided. | {} |

setup(stage)

Set up datasets.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| stage | str | Either fit, validate, test, or predict. | required |
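
A minimal usage sketch (the data_root path is hypothetical; bands may be passed through **kwargs as noted above):

```python
from terratorch.datamodules.openearthmap import OpenEarthMapNonGeoDataModule

# Hypothetical dataset location
datamodule = OpenEarthMapNonGeoDataModule(
    data_root="data/open_earth_map",
    batch_size=8,
    num_workers=4,
)
datamodule.setup("fit")
train_loader = datamodule.train_dataloader()
```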
terratorch.datamodules.pastis

PASTISDataModule

Bases: NonGeoDataModule

NonGeo LightningDataModule implementation for PASTIS.

__init__(batch_size=8, num_workers=0, data_root='./', truncate_image=None, pad_image=None, train_transform=None, val_transform=None, test_transform=None, predict_transform=None, **kwargs)

Initializes the PASTISDataModule for the PASTIS dataset.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| batch_size | int | Batch size for DataLoaders. Defaults to 8. | 8 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| data_root | str | Directory containing the dataset. Defaults to "./". | './' |
| truncate_image | int | Truncate the time dimension of the image to a specified number of timesteps. If None, no truncation is performed. | None |
| pad_image | int | Pad the time dimension of the image to a specified number of timesteps. If None, no padding is applied. | None |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training data. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation data. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for testing data. | None |
| predict_transform | Compose \| None \| list[BasicTransform] | Transformations for prediction data. | None |
| **kwargs | Any | Additional keyword arguments. | {} |

setup(stage)

Set up datasets.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| stage | str | Either fit, validate, test, or predict. | required |
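
A minimal usage sketch (the data_root path and the timestep count are hypothetical; pad_image and truncate_image control the number of timesteps per sample):

```python
from terratorch.datamodules.pastis import PASTISDataModule

# Hypothetical dataset location and series length
datamodule = PASTISDataModule(
    data_root="data/pastis",
    batch_size=8,
    pad_image=60,
    truncate_image=60,
)
datamodule.setup("fit")
```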
terratorch.datamodules.sen1floods11

Sen1Floods11NonGeoDataModule

Bases: NonGeoDataModule

NonGeo LightningDataModule implementation for the Sen1Floods11 dataset.

__init__(data_root, batch_size=4, num_workers=0, bands=Sen1Floods11NonGeo.all_band_names, train_transform=None, val_transform=None, test_transform=None, predict_transform=None, drop_last=True, constant_scale=0.0001, no_data_replace=0, no_label_replace=-1, use_metadata=False, **kwargs)

Initializes the Sen1Floods11NonGeoDataModule.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data_root | str | Root directory of the dataset. | required |
| batch_size | int | Batch size for DataLoaders. Defaults to 4. | 4 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| bands | Sequence[str] | List of bands to use. Defaults to Sen1Floods11NonGeo.all_band_names. | all_band_names |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training data. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation data. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for test data. | None |
| predict_transform | Compose \| None \| list[BasicTransform] | Transformations for prediction data. | None |
| drop_last | bool | Whether to drop the last incomplete batch. Defaults to True. | True |
| constant_scale | float | Scale constant applied to the dataset. Defaults to 0.0001. | 0.0001 |
| no_data_replace | float \| None | Replacement value for missing data. Defaults to 0. | 0 |
| no_label_replace | int \| None | Replacement value for missing labels. Defaults to -1. | -1 |
| use_metadata | bool | Whether to return metadata info (time and location). | False |
| **kwargs | Any | Additional keyword arguments. | {} |

setup(stage)

Set up datasets.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| stage | str | Either fit, validate, test, or predict. | required |
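
A minimal usage sketch (the data_root path is hypothetical; constant_scale is applied to the input data as documented above):

```python
from terratorch.datamodules.sen1floods11 import Sen1Floods11NonGeoDataModule

# Hypothetical dataset location
datamodule = Sen1Floods11NonGeoDataModule(
    data_root="data/sen1floods11",
    batch_size=4,
    constant_scale=0.0001,
    use_metadata=False,
)
datamodule.setup("fit")
train_loader = datamodule.train_dataloader()
```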
terratorch.datamodules.sen4agrinet

Sen4AgriNetDataModule

Bases: NonGeoDataModule

NonGeo LightningDataModule implementation for Sen4AgriNet.

__init__(bands=None, batch_size=8, num_workers=0, data_root='./', train_transform=None, val_transform=None, test_transform=None, predict_transform=None, seed=42, scenario='random', requires_norm=True, binary_labels=False, linear_encoder=None, **kwargs)

Initializes the Sen4AgriNetDataModule for the Sen4AgriNet dataset.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| bands | list[str] \| None | List of bands to use. Defaults to None. | None |
| batch_size | int | Batch size for DataLoaders. Defaults to 8. | 8 |
| num_workers | int | Number of workers for data loading. Defaults to 0. | 0 |
| data_root | str | Root directory of the dataset. Defaults to "./". | './' |
| train_transform | Compose \| None \| list[BasicTransform] | Transformations for training data. | None |
| val_transform | Compose \| None \| list[BasicTransform] | Transformations for validation data. | None |
| test_transform | Compose \| None \| list[BasicTransform] | Transformations for test data. | None |
| predict_transform | Compose \| None \| list[BasicTransform] | Transformations for prediction data. | None |
| seed | int | Random seed for reproducibility. Defaults to 42. | 42 |
| scenario | str | Splitting scenario to use. Options are 'random' (random split of the data), 'spatial' (split by geographical regions: Catalonia and France), and 'spatio-temporal' (split by region and year: France 2019 and Catalonia 2020). | 'random' |
| requires_norm | bool | Whether normalization is required. Defaults to True. | True |
| binary_labels | bool | Whether to use binary labels. Defaults to False. | False |
| linear_encoder | dict | Mapping for label encoding. Defaults to None. | None |
| **kwargs | Any | Additional keyword arguments. | {} |

setup(stage)

Set up datasets.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| stage | str | Either fit, validate, test, or predict. | required |
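
A minimal usage sketch (the data_root path is hypothetical; the 'spatial' scenario splits by region as documented above):

```python
from terratorch.datamodules.sen4agrinet import Sen4AgriNetDataModule

# Hypothetical dataset location
datamodule = Sen4AgriNetDataModule(
    data_root="data/sen4agrinet",
    batch_size=8,
    scenario="spatial",   # split by geographical regions (Catalonia and France)
    requires_norm=True,
    seed=42,
)
datamodule.setup("fit")
```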
terratorch.datamodules.sen4map

Sen4MapLucasDataModule

Bases: LightningDataModule

NonGeo LightningDataModule implementation for Sen4Map.

__init__(batch_size, num_workers, prefetch_factor=0, train_hdf5_path=None, train_hdf5_keys_path=None, test_hdf5_path=None, test_hdf5_keys_path=None, val_hdf5_path=None, val_hdf5_keys_path=None, **kwargs)

Initializes the Sen4MapLucasDataModule for handling Sen4Map monthly composites. Parameters marked "(from kwargs)" are read from **kwargs.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| batch_size | int | Batch size for DataLoaders. | required |
| num_workers | int | Number of worker processes for data loading. | required |
| prefetch_factor | int | Number of samples to prefetch per worker. Defaults to 0. | 0 |
| train_hdf5_path | str | Path to the training HDF5 file. | None |
| train_hdf5_keys_path | str | Path to the training HDF5 keys file. | None |
| test_hdf5_path | str | Path to the testing HDF5 file. | None |
| test_hdf5_keys_path | str | Path to the testing HDF5 keys file. | None |
| val_hdf5_path | str | Path to the validation HDF5 file. | None |
| val_hdf5_keys_path | str | Path to the validation HDF5 keys file. | None |
| train_hdf5_keys_save_path | str | (from kwargs) Path to save generated train keys. | required |
| test_hdf5_keys_save_path | str | (from kwargs) Path to save generated test keys. | required |
| val_hdf5_keys_save_path | str | (from kwargs) Path to save generated validation keys. | required |
| shuffle | bool | (from kwargs) Global shuffle flag. | required |
| train_shuffle | bool | (from kwargs) Shuffle flag for training data; defaults to the global shuffle flag if unset. | required |
| val_shuffle | bool | (from kwargs) Shuffle flag for validation data. | required |
| test_shuffle | bool | (from kwargs) Shuffle flag for test data. | required |
| train_data_fraction | float | (from kwargs) Fraction of training data to use. Defaults to 1.0. | required |
| val_data_fraction | float | (from kwargs) Fraction of validation data to use. Defaults to 1.0. | required |
| test_data_fraction | float | (from kwargs) Fraction of test data to use. Defaults to 1.0. | required |
| all_hdf5_data_path | str | (from kwargs) General HDF5 data path for all splits. If provided, overrides the split-specific paths. | required |
| resize | bool | (from kwargs) Whether to resize images. Defaults to False. | required |
| resize_to | int or tuple | (from kwargs) Target size for resizing images. | required |
| resize_interpolation | str | (from kwargs) Interpolation mode for resizing ('bilinear', 'bicubic', etc.). | required |
| resize_antialiasing | bool | (from kwargs) Whether to apply antialiasing during resizing. Defaults to True. | required |
| **kwargs | | Additional keyword arguments. | {} |

setup(stage)

Set up datasets.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| stage | str | Either fit or test. | required |
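
A minimal usage sketch (all HDF5 and keys paths below are hypothetical; batch_size and num_workers are required arguments):

```python
from terratorch.datamodules.sen4map import Sen4MapLucasDataModule

# Hypothetical HDF5 file and keys-file locations
datamodule = Sen4MapLucasDataModule(
    batch_size=16,
    num_workers=4,
    train_hdf5_path="data/sen4map/train.h5",
    train_hdf5_keys_path="data/sen4map/train_keys.pkl",
    val_hdf5_path="data/sen4map/val.h5",
    val_hdf5_keys_path="data/sen4map/val_keys.pkl",
    test_hdf5_path="data/sen4map/test.h5",
    test_hdf5_keys_path="data/sen4map/test_keys.pkl",
)
datamodule.setup("fit")
```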
terratorch.datamodules.torchgeo_data_module

Ugly proxy objects so that parsing config files works with transforms.

These are necessary because, for LightningCLI to instantiate arguments as objects from the config, they must have type annotations. In TorchGeo, transforms is passed in **kwargs, so it has no type annotation. To get around that, we create these wrappers that have transforms type annotated. They create the transforms and forward all method and attribute calls to the original TorchGeo datamodule.

Additionally, TorchGeo datasets pass the data to the transforms callable as a dict of tensors, whereas Albumentations expects the data as separate key-value arguments holding numpy arrays. We handle that conversion here.

TorchGeoDataModule

Bases: GeoDataModule

Proxy object for using Geo data modules defined by TorchGeo.

Allows transforms to be defined and passed using config files. The only reason this class exists is so that we can annotate the transforms argument with a type, which is required for LightningCLI and config files. As such, all getattr and setattr calls are redirected to the underlying class.

__init__(cls, batch_size=None, num_workers=0, transforms=None, **kwargs)

Constructor

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| cls | type[GeoDataModule] | TorchGeo DataModule class to be instantiated. | required |
| batch_size | int \| None | Batch size. Defaults to None. | None |
| num_workers | int | Number of workers. Defaults to 0. | 0 |
| transforms | None \| list[BasicTransform] | List of Albumentations transforms. Should end with ToTensorV2. Defaults to None. | None |
| **kwargs | Any | Arguments passed to the wrapped class on instantiation. | {} |

TorchNonGeoDataModule

Bases: NonGeoDataModule

Proxy object for using NonGeo data modules defined by TorchGeo.

Allows transforms to be defined and passed using config files. The only reason this class exists is so that we can annotate the transforms argument with a type, which is required for LightningCLI and config files. As such, all getattr and setattr calls are redirected to the underlying class.

__init__(cls, batch_size=None, num_workers=0, transforms=None, **kwargs)

Constructor

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| cls | type[NonGeoDataModule] | TorchGeo DataModule class to be instantiated. | required |
| batch_size | int \| None | Batch size. Defaults to None. | None |
| num_workers | int | Number of workers. Defaults to 0. | 0 |
| transforms | None \| list[BasicTransform] | List of Albumentations transforms. Should end with ToTensorV2. Defaults to None. | None |
| **kwargs | Any | Arguments passed to the wrapped class on instantiation. | {} |
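
A sketch of wrapping a TorchGeo datamodule so that Albumentations transforms can be declared directly (EuroSAT100DataModule and its root/download keyword arguments come from TorchGeo, not from this module, and are used here purely for illustration):

```python
import albumentations as A
from albumentations.pytorch import ToTensorV2
from torchgeo.datamodules import EuroSAT100DataModule

from terratorch.datamodules.torchgeo_data_module import TorchNonGeoDataModule

# Albumentations pipeline ending with ToTensorV2, as required above
transforms = [A.Resize(64, 64), A.HorizontalFlip(p=0.5), ToTensorV2()]

# The wrapper annotates `transforms` for LightningCLI and forwards everything
# else (root, download, ...) to the wrapped TorchGeo datamodule.
datamodule = TorchNonGeoDataModule(
    EuroSAT100DataModule,
    batch_size=16,
    num_workers=2,
    transforms=transforms,
    root="data/eurosat100",
    download=True,
)
```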