Specific Datasets
terratorch.datasets.biomassters
BioMasstersNonGeo
Bases: BioMassters
BioMassters Dataset for Aboveground Biomass prediction.
Dataset intended for Aboveground Biomass (AGB) prediction over Finnish forests based on Sentinel 1 and 2 data with corresponding target AGB mask values generated by Light Detection and Ranging (LiDAR).
Dataset Format:
- .tif files for Sentinel 1 and 2 data
- .tif file for pixel wise AGB target mask
- .csv files for metadata regarding features and targets
Dataset Features:
- 13,000 target AGB masks of size (256x256px)
- 12 months of data per target mask
- Sentinel 1 and Sentinel 2 data for each location
- Sentinel 1 available for every month
- Sentinel 2 available for almost every month (some months are missing because ESA halted acquisition over the region during particular periods)
If you use this dataset in your research, please cite the following paper:
- https://nascetti-a.github.io/BioMasster/
Added in version 0.5.
Source code in terratorch/datasets/biomassters.py
__init__(root='data', split='train', bands=BAND_SETS['all'], transform=None, mask_mean=63.4584, mask_std=72.21242, sensors=['S1', 'S2'], as_time_series=False, metadata_filename=default_metadata_filename, max_cloud_percentage=None, max_red_mean=None, include_corrupt=True, subset=1, seed=42, use_four_frames=False)
Initialize a new instance of BioMassters dataset.
If as_time_series=False (the default), each time step becomes its own
sample with the target being shared across multiple samples.
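The effect of as_time_series on the sample index can be pictured with a small sketch. This is not the library's implementation: the build_index helper, the chip IDs, and the month lists are hypothetical, and only the behavior described above (one sample per time step when as_time_series=False, all sharing one target) is taken from the text.

```python
def build_index(chips, as_time_series=False):
    """Return (chip_id, months) pairs, one pair per dataset sample.

    `chips` maps a hypothetical chip ID to the months with available imagery.
    """
    index = []
    for chip_id, months in chips.items():
        if as_time_series:
            # One sample per target location, carrying every available month.
            index.append((chip_id, list(months)))
        else:
            # One sample per time step; all of them point at the same target.
            index.extend((chip_id, [month]) for month in months)
    return index


# Hypothetical availability: chip_a has three months, chip_b has two.
chips = {"chip_a": [0, 1, 2], "chip_b": [0, 1]}
flat_index = build_index(chips, as_time_series=False)
series_index = build_index(chips, as_time_series=True)
print(len(flat_index), len(series_index))
```

With as_time_series=False the index grows with the number of time steps, so the same target mask is visited once per available month.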
Parameters:
Name | Type | Description | Default |
---|---|---|---|
root |  | root directory where dataset can be found | 'data' |
split | str | train or test split | 'train' |
sensors | Sequence[str] | which sensors to consider for the sample, Sentinel 1 and/or Sentinel 2 ('S1', 'S2') | ['S1', 'S2'] |
as_time_series | bool | whether or not to return all available time-steps or just a single one for a given target location | False |
metadata_filename | str | metadata file to be used | default_metadata_filename |
max_cloud_percentage | float \| None | maximum allowed cloud percentage for images | None |
max_red_mean | float \| None | maximum allowed red_mean value for images | None |
include_corrupt | bool | whether to include images marked as corrupted | True |
Raises:
Type | Description |
---|---|
AssertionError | if |
DatasetNotFoundError | If dataset is not found. |
Source code in terratorch/datasets/biomassters.py
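The three filtering parameters (max_cloud_percentage, max_red_mean, include_corrupt) can be read as row filters over the metadata file. The sketch below illustrates that reading only. The filter_metadata helper and the column names cloud_percentage, red_mean, and corrupt are assumptions made for the example, not names confirmed by the source.

```python
def filter_metadata(rows, max_cloud_percentage=None, max_red_mean=None,
                    include_corrupt=True):
    """Keep the metadata rows that pass every active filter.

    A threshold of None disables that filter. Column names are hypothetical.
    """
    kept = []
    for row in rows:
        if (max_cloud_percentage is not None
                and row["cloud_percentage"] > max_cloud_percentage):
            continue  # too cloudy
        if max_red_mean is not None and row["red_mean"] > max_red_mean:
            continue  # red band mean above the allowed maximum
        if not include_corrupt and row["corrupt"]:
            continue  # flagged as corrupted
        kept.append(row)
    return kept


# Hypothetical metadata rows.
rows = [
    {"chip_id": "a", "cloud_percentage": 5.0, "red_mean": 120.0, "corrupt": False},
    {"chip_id": "b", "cloud_percentage": 60.0, "red_mean": 100.0, "corrupt": False},
    {"chip_id": "c", "cloud_percentage": 10.0, "red_mean": 900.0, "corrupt": False},
    {"chip_id": "d", "cloud_percentage": 0.0, "red_mean": 50.0, "corrupt": True},
]

unfiltered = filter_metadata(rows)  # defaults keep every row
strict = filter_metadata(rows, max_cloud_percentage=20.0,
                         max_red_mean=500.0, include_corrupt=False)
print([row["chip_id"] for row in strict])
```

With the defaults (None, None, True) nothing is filtered out, which matches the default values in the signature above.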
plot(sample, show_titles=True, suptitle=None)
Plot a sample from the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample | dict[str, Tensor] | a sample returned by the dataset | required |
show_titles | bool | flag indicating whether to show titles above each panel | True |
suptitle | str \| None | optional suptitle to use for figure | None |
Returns:
Type | Description |
---|---|
Figure | a matplotlib Figure with the rendered sample |
Source code in terratorch/datasets/biomassters.py
terratorch.datasets.burn_intensity
BurnIntensityNonGeo
Bases: NonGeoDataset
Dataset implementation for Burn Intensity classification.
Source code in terratorch/datasets/burn_intensity.py
__init__(data_root, split='train', bands=BAND_SETS['all'], transform=None, use_full_data=True, no_data_replace=0.0001, no_label_replace=-1, use_metadata=False)
Initialize the BurnIntensity dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root | str | Path to the data root directory. | required |
split | str | One of 'train' or 'val'. | 'train' |
bands | Sequence[str] | Bands to output. Defaults to all bands. | BAND_SETS['all'] |
transform | Optional[Compose] | Albumentations transform to be applied. | None |
use_metadata | bool | Whether to return metadata info (location). | False |
use_full_data | bool | Whether to use full data or data with less than 25 percent zeros. | True |
no_data_replace | Optional[float] | Value to replace NaNs in images. | 0.0001 |
no_label_replace | Optional[int] | Value to replace NaNs in labels. | -1 |
Source code in terratorch/datasets/burn_intensity.py
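The no_data_replace and no_label_replace parameters describe NaN substitution in images and labels. A minimal pure-Python sketch of that idea follows. The replace_nan helper and the toy arrays are hypothetical, and treating None as "no replacement" is an assumption based on the Optional types in the table.

```python
import math


def replace_nan(values, replacement):
    """Return a copy of a 2D list with NaNs swapped for `replacement`.

    Passing None leaves the values unchanged (assumed behavior).
    """
    if replacement is None:
        return [list(row) for row in values]
    return [[replacement if math.isnan(v) else v for v in row] for row in values]


nan = float("nan")
image = [[0.12, nan], [0.30, 0.45]]  # toy single-band image
label = [[1.0, nan], [nan, 3.0]]     # toy label mask

clean_image = replace_nan(image, 0.0001)  # default no_data_replace
clean_label = replace_nan(label, -1)      # default no_label_replace
print(clean_image, clean_label)
```

Using a label value such as -1 that never occurs as a real class makes it possible to ignore those pixels later, for example in a loss function.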
plot(sample, suptitle=None)
Plot a sample from the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample | dict[str, Tensor] | A sample returned by the dataset. | required |
suptitle | str \| None | Optional string to use as a suptitle. | None |
Returns:
Type | Description |
---|---|
Any | A matplotlib Figure with the rendered sample. |
Source code in terratorch/datasets/burn_intensity.py
terratorch.datasets.carbonflux
CarbonFluxNonGeo
Bases: NonGeoDataset
Dataset for Carbon Flux regression from HLS images and MERRA data.
Source code in terratorch/datasets/carbonflux.py
__init__(data_root, split='train', bands=BAND_SETS['all'], transform=None, gpp_mean=None, gpp_std=None, no_data_replace=0.0001, use_metadata=False, modalities=('image', 'merra_vars'))
Initialize the CarbonFluxNonGeo dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root | str | Path to the data root directory. | required |
split | str | 'train' or 'test'. | 'train' |
bands | Sequence[str] | Bands to use. Defaults to all bands. | BAND_SETS['all'] |
transform | Optional[Compose] | Albumentations transform to be applied. | None |
use_metadata | bool | Whether to return metadata (coordinates and date). | False |
merra_means | Sequence[float] | Means for MERRA data normalization. | required |
merra_stds | Sequence[float] | Standard deviations for MERRA data normalization. | required |
gpp_mean | float | Mean for GPP normalization. | None |
gpp_std | float | Standard deviation for GPP normalization. | None |
no_data_replace | Optional[float] | Value to replace NO_DATA values in images. | 0.0001 |
Source code in terratorch/datasets/carbonflux.py
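The mean and standard deviation parameters above suggest a standard z-score normalization of the MERRA variables and the GPP target. The sketch below assumes exactly that. The standardize helper and all numeric values are hypothetical, and the source does not confirm the formula the dataset applies.

```python
def standardize(values, means, stds):
    """Apply (value - mean) / std element-wise to a sequence."""
    return [(v - m) / s for v, m, s in zip(values, means, stds)]


# Hypothetical MERRA variables with made-up statistics.
merra_vars = [290.0, 0.5, 3.0]
merra_means = [280.0, 0.25, 2.0]
merra_stds = [10.0, 0.25, 0.5]
normalized_merra = standardize(merra_vars, merra_means, merra_stds)

# Hypothetical GPP target and statistics.
gpp, gpp_mean, gpp_std = 7.0, 4.0, 2.0
normalized_gpp = (gpp - gpp_mean) / gpp_std

# Predictions made in normalized space map back with the inverse transform.
restored_gpp = normalized_gpp * gpp_std + gpp_mean
print(normalized_merra, normalized_gpp, restored_gpp)
```

Keeping gpp_mean and gpp_std available at evaluation time lets model outputs be converted back to the original units.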
plot(sample, suptitle=None)
Plot a sample from the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample | dict[str, Any] | A sample returned by the dataset. | required |
suptitle | str \| None | Optional title for the figure. | None |
Returns:
Type | Description |
---|---|
Any | A matplotlib figure with the rendered sample. |
Source code in terratorch/datasets/carbonflux.py
terratorch.datasets.forestnet
ForestNetNonGeo
Bases: NonGeoDataset
NonGeo dataset implementation for ForestNet.
Source code in terratorch/datasets/forestnet.py
__init__(data_root, split='train', label_map=default_label_map, transform=None, fraction=1.0, bands=BAND_SETS['all'], use_metadata=False)
Initialize the ForestNetNonGeo dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root | str | Path to the data root directory. | required |
split | str | One of 'train', 'val', or 'test'. | 'train' |
label_map | Dict[str, int] | Mapping from label names to integer labels. | default_label_map |
transform | Compose \| None | Transformations to be applied to the images. | None |
fraction | float | Fraction of the dataset to use. Defaults to 1.0 (use all data). | 1.0 |
Source code in terratorch/datasets/forestnet.py
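The label_map and fraction parameters can be illustrated with a short sketch. Everything here is hypothetical: the label names, the record list, and the encode and take_fraction helpers are invented for the example, and the way the dataset actually subsamples is not stated in the source.

```python
import random

# Hypothetical label map; the contents of default_label_map are not shown here.
label_map = {"Plantation": 0, "Smallholder agriculture": 1}

records = [
    ("img_0", "Plantation"),
    ("img_1", "Smallholder agriculture"),
    ("img_2", "Plantation"),
    ("img_3", "Smallholder agriculture"),
]


def encode(records, label_map):
    """Replace each label name with its integer label."""
    return [(name, label_map[label]) for name, label in records]


def take_fraction(items, fraction, seed=0):
    """Return a reproducible random subset holding `fraction` of the items."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    if fraction == 1.0:
        return list(items)
    rng = random.Random(seed)
    count = max(1, int(len(items) * fraction))
    return rng.sample(list(items), count)


encoded = encode(records, label_map)
half = take_fraction(encoded, 0.5)
print(encoded[0], len(half))
```

A fixed seed keeps the subset stable between runs, which matters when comparing experiments trained on a reduced dataset.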
map_label(index)
terratorch.datasets.fire_scars
FireScarsHLS
Bases: RasterDataset
RasterDataset implementation for fire scars input images.
Source code in terratorch/datasets/fire_scars.py
FireScarsNonGeo
Bases: NonGeoDataset
NonGeo dataset implementation for fire scars.
Source code in terratorch/datasets/fire_scars.py
__init__(data_root, split='train', bands=BAND_SETS['all'], transform=None, no_data_replace=0, no_label_replace=-1, use_metadata=False)
Constructor
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root | str | Path to the data root directory. | required |
bands | list[str] | Bands that should be output by the dataset. Defaults to all bands. | BAND_SETS['all'] |
transform | Compose \| None | Albumentations transform to be applied. Should end with ToTensorV2(). If used through the corresponding data module, should not include normalization. Defaults to None, which applies ToTensorV2(). | None |
no_data_replace | float \| None | Replace NaN values in input images with this value. If None, does no replacement. Defaults to 0. | 0 |
no_label_replace | int \| None | Replace NaN values in label with this value. If None, does no replacement. Defaults to -1. | -1 |
use_metadata | bool | Whether to return metadata info (time and location). | False |
Source code in terratorch/datasets/fire_scars.py
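The bands parameter selects which channels the dataset outputs. One way to picture this is as a lookup from requested band names to channel positions, sketched below. The band names in ALL_BANDS and both helpers are hypothetical stand-ins and are not taken from BAND_SETS in the library.

```python
# Hypothetical channel order of the stored imagery.
ALL_BANDS = ("BLUE", "GREEN", "RED", "NIR_NARROW", "SWIR_1", "SWIR_2")


def band_indices(bands, all_bands=ALL_BANDS):
    """Translate requested band names into channel indices, in request order."""
    unknown = [band for band in bands if band not in all_bands]
    if unknown:
        raise ValueError(f"Unknown bands: {unknown}")
    return [all_bands.index(band) for band in bands]


def select_bands(pixel, bands):
    """Pick the requested channels from one pixel's per-band values."""
    return [pixel[i] for i in band_indices(bands)]


pixel = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]  # one value per band in ALL_BANDS
rgb = select_bands(pixel, ("RED", "GREEN", "BLUE"))
print(band_indices(("RED", "GREEN", "BLUE")), rgb)
```

The output follows the order of the requested names, so asking for RED, GREEN, BLUE yields channels ready for RGB display.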
plot(sample, suptitle=None)
Plot a sample from the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample | dict[str, Tensor] | a sample returned by the dataset | required |
suptitle | str \| None | optional string to use as a suptitle | None |
Returns:
Type | Description |
---|---|
Figure | a matplotlib Figure with the rendered sample |
Source code in terratorch/datasets/fire_scars.py
FireScarsSegmentationMask
Bases: RasterDataset
RasterDataset implementation for fire scars segmentation mask. Can be easily merged with input images using the & operator.
Source code in terratorch/datasets/fire_scars.py
terratorch.datasets.landslide4sense
Landslide4SenseNonGeo
Bases: NonGeoDataset
NonGeo dataset implementation for Landslide4Sense.
Source code in terratorch/datasets/landslide4sense.py
__init__(data_root, split='train', bands=BAND_SETS['all'], transform=None)
Initialize the Landslide4Sense dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root | str | Path to the data root directory. | required |
split | str | One of 'train', 'validation', or 'test'. | 'train' |
bands | Sequence[str] | Bands to be used. Defaults to all bands. | BAND_SETS['all'] |
transform | Compose \| None | Albumentations transform to be applied. Defaults to None, which applies default_transform(). | None |
Source code in terratorch/datasets/landslide4sense.py
terratorch.datasets.m_eurosat
MEuroSATNonGeo
Bases: NonGeoDataset
NonGeo dataset implementation for M-EuroSAT.
Source code in terratorch/datasets/m_eurosat.py
__init__(data_root, split='train', bands=BAND_SETS['all'], transform=None, partition='default')
Initialize the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root | str | Path to the data root directory. | required |
split | str | One of 'train', 'val', or 'test'. | 'train' |
bands | Sequence[str] | Bands to be used. Defaults to all bands. | BAND_SETS['all'] |
transform | Compose \| None | Albumentations transform to be applied. Defaults to None, which applies default_transform(). | None |
partition | str | Partition name for the dataset splits. Defaults to 'default'. | 'default' |
Source code in terratorch/datasets/m_eurosat.py
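The split and partition parameters together decide which samples a dataset instance exposes. The sketch below shows one plausible mechanism: a partition stored as JSON that maps split names to sample IDs. The JSON layout, the sample IDs, and the ids_for_split helper are assumptions for illustration, and the files the library reads may be organized differently.

```python
import json

# Hypothetical contents of a partition file mapping split names to sample IDs.
PARTITION_JSON = '{"train": ["s0", "s1", "s2"], "val": ["s3"], "test": ["s4"]}'


def ids_for_split(partition_text, split):
    """Return the sample IDs that a partition assigns to `split`."""
    partition = json.loads(partition_text)
    if split not in partition:
        valid = ", ".join(sorted(partition))
        raise ValueError(f"Unknown split {split!r}; expected one of: {valid}")
    return partition[split]


train_ids = ids_for_split(PARTITION_JSON, "train")
val_ids = ids_for_split(PARTITION_JSON, "val")
print(len(train_ids), val_ids)
```

Validating the split name early gives a clear error for a value such as 'validation', which this class does not list among its accepted splits.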
plot(sample, suptitle=None)
Plot a sample from the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample | dict[str, Tensor] | A sample returned by the dataset. | required |
suptitle | str \| None | Optional string to use as a suptitle. | None |
Returns:
Type | Description |
---|---|
Figure | A matplotlib Figure with the rendered sample. |
Source code in terratorch/datasets/m_eurosat.py
terratorch.datasets.m_bigearthnet
MBigEarthNonGeo
Bases: NonGeoDataset
NonGeo dataset implementation for M-BigEarthNet.
Source code in terratorch/datasets/m_bigearthnet.py
__init__(data_root, split='train', bands=BAND_SETS['all'], transform=None, partition='default')
Initialize the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root | str | Path to the data root directory. | required |
split | str | One of 'train', 'val', or 'test'. | 'train' |
bands | Sequence[str] | Bands to be used. Defaults to all bands. | BAND_SETS['all'] |
transform | Compose \| None | Albumentations transform to be applied. Defaults to None, which applies default_transform(). | None |
partition | str | Partition name for the dataset splits. Defaults to 'default'. | 'default' |
Source code in terratorch/datasets/m_bigearthnet.py
plot(sample, suptitle=None)
Plot a sample from the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample | dict[str, Tensor] | A sample returned by the dataset. | required |
suptitle | str \| None | Optional string to use as a suptitle. | None |
Returns:
Type | Description |
---|---|
Figure | A matplotlib Figure with the rendered sample. |
Source code in terratorch/datasets/m_bigearthnet.py
terratorch.datasets.m_brick_kiln
MBrickKilnNonGeo
Bases: NonGeoDataset
NonGeo dataset implementation for M-BrickKiln.
Source code in terratorch/datasets/m_brick_kiln.py
__init__(data_root, split='train', bands=BAND_SETS['all'], transform=None, partition='default')
Initialize the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root | str | Path to the data root directory. | required |
split | str | One of 'train', 'val', or 'test'. | 'train' |
bands | Sequence[str] | Bands to be used. Defaults to all bands. | BAND_SETS['all'] |
transform | Compose \| None | Albumentations transform to be applied. Defaults to None, which applies default_transform(). | None |
partition | str | Partition name for the dataset splits. Defaults to 'default'. | 'default' |
Source code in terratorch/datasets/m_brick_kiln.py
plot(sample, suptitle=None)
Plot a sample from the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample | dict[str, Tensor] | A sample returned by the dataset. | required |
suptitle | str \| None | Optional string to use as a suptitle. | None |
Returns:
Type | Description |
---|---|
Figure | A matplotlib Figure with the rendered sample. |
Source code in terratorch/datasets/m_brick_kiln.py
terratorch.datasets.m_forestnet
MForestNetNonGeo
Bases: NonGeoDataset
NonGeo dataset implementation for M-ForestNet.
Source code in terratorch/datasets/m_forestnet.py
__init__(data_root, split='train', bands=BAND_SETS['all'], transform=None, partition='default', use_metadata=False)
Initialize the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root | str | Path to the data root directory. | required |
split | str | One of 'train', 'val', or 'test'. | 'train' |
bands | Sequence[str] | Bands to be used. Defaults to all bands. | BAND_SETS['all'] |
transform | Compose \| None | Albumentations transform to be applied. Defaults to None, which applies default_transform(). | None |
partition | str | Partition name for the dataset splits. Defaults to 'default'. | 'default' |
use_metadata | bool | Whether to return metadata info (time and location). | False |
Source code in terratorch/datasets/m_forestnet.py
plot(sample, suptitle=None)
Plot a sample from the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample | dict[str, Tensor] | A sample returned by the dataset. | required |
suptitle | str \| None | Optional string to use as a suptitle. | None |
Returns:
Type | Description |
---|---|
Figure | A matplotlib Figure with the rendered sample. |
Source code in terratorch/datasets/m_forestnet.py
terratorch.datasets.m_so2sat
MSo2SatNonGeo
Bases: NonGeoDataset
NonGeo dataset implementation for M-So2Sat.
Source code in terratorch/datasets/m_so2sat.py
__init__(data_root, split='train', bands=BAND_SETS['all'], transform=None, partition='default')
Initialize the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root | str | Path to the data root directory. | required |
split | str | One of 'train', 'val', or 'test'. | 'train' |
bands | Sequence[str] | Bands to be used. Defaults to all bands. | BAND_SETS['all'] |
transform | Compose \| None | Albumentations transform to be applied. Defaults to None, which applies default_transform(). | None |
partition | str | Partition name for the dataset splits. Defaults to 'default'. | 'default' |
Source code in terratorch/datasets/m_so2sat.py
plot(sample, suptitle=None)
Plot a sample from the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample | dict[str, Tensor] | A sample returned by the dataset. | required |
suptitle | str \| None | Optional string to use as a suptitle. | None |
Returns:
Type | Description |
---|---|
Figure | A matplotlib Figure with the rendered sample. |
Source code in terratorch/datasets/m_so2sat.py
terratorch.datasets.m_pv4ger
MPv4gerNonGeo
Bases: NonGeoDataset
NonGeo dataset implementation for M-PV4GER.
Source code in terratorch/datasets/m_pv4ger.py
__init__(data_root, split='train', bands=BAND_SETS['all'], transform=None, partition='default', use_metadata=False)
Initialize the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root | str | Path to the data root directory. | required |
split | str | One of 'train', 'val', or 'test'. | 'train' |
bands | Sequence[str] | Bands to be used. Defaults to all bands. | BAND_SETS['all'] |
transform | Compose \| None | Albumentations transform to be applied. Defaults to None, which applies default_transform(). | None |
partition | str | Partition name for the dataset splits. Defaults to 'default'. | 'default' |
use_metadata | bool | Whether to return metadata info (location coordinates). | False |
Source code in terratorch/datasets/m_pv4ger.py
plot(sample, suptitle=None)
Plot a sample from the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample | dict[str, Tensor] | A sample returned by the dataset. | required |
suptitle | str \| None | Optional string to use as a suptitle. | None |
Returns:
Type | Description |
---|---|
Figure | A matplotlib Figure with the rendered sample. |
Source code in terratorch/datasets/m_pv4ger.py
terratorch.datasets.m_cashew_plantation
MBeninSmallHolderCashewsNonGeo
Bases: NonGeoDataset
NonGeo dataset implementation for M-BeninSmallHolderCashews.
Source code in terratorch/datasets/m_cashew_plantation.py
__init__(data_root, split='train', bands=BAND_SETS['all'], transform=None, partition='default', use_metadata=False)
Initialize the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root | str | Path to the data root directory. | required |
split | str | One of 'train', 'val', or 'test'. | 'train' |
bands | Sequence[str] | Bands to be used. Defaults to all bands. | BAND_SETS['all'] |
transform | Compose \| None | Albumentations transform to be applied. Defaults to None, which applies default_transform(). | None |
partition | str | Partition name for the dataset splits. Defaults to 'default'. | 'default' |
use_metadata | bool | Whether to return metadata info (time). | False |
Source code in terratorch/datasets/m_cashew_plantation.py
plot(sample, suptitle=None)
Plot a sample from the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample | dict[str, Tensor] | A sample returned by the dataset. | required |
suptitle | str \| None | Optional string to use as a suptitle. | None |
Returns:
Type | Description |
---|---|
Figure | A matplotlib Figure with the rendered sample. |
Source code in terratorch/datasets/m_cashew_plantation.py
terratorch.datasets.m_nz_cattle
MNzCattleNonGeo
Bases: NonGeoDataset
NonGeo dataset implementation for M-NZ-Cattle.
Source code in terratorch/datasets/m_nz_cattle.py
__init__(data_root, split='train', bands=BAND_SETS['all'], transform=None, partition='default', use_metadata=False)
Initialize the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root | str | Path to the data root directory. | required |
split | str | One of 'train', 'val', or 'test'. | 'train' |
bands | Sequence[str] | Bands to be used. Defaults to all bands. | BAND_SETS['all'] |
transform | Compose \| None | Albumentations transform to be applied. Defaults to None, which applies default_transform(). | None |
partition | str | Partition name for the dataset splits. Defaults to 'default'. | 'default' |
use_metadata | bool | Whether to return metadata info (time and location). | False |
Source code in terratorch/datasets/m_nz_cattle.py
plot(sample, suptitle=None)
Plot a sample from the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample | dict[str, Tensor] | A sample returned by the dataset. | required |
suptitle | str \| None | Optional string to use as a suptitle. | None |
Returns:
Type | Description |
---|---|
Figure | A matplotlib Figure with the rendered sample. |
Source code in terratorch/datasets/m_nz_cattle.py
terratorch.datasets.m_chesapeake_landcover
MChesapeakeLandcoverNonGeo
Bases: NonGeoDataset
NonGeo dataset implementation for M-ChesapeakeLandcover.
Source code in terratorch/datasets/m_chesapeake_landcover.py
__init__(data_root, split='train', bands=BAND_SETS['all'], transform=None, partition='default')
Initialize the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root | str | Path to the data root directory. | required |
split | str | One of 'train', 'val', or 'test'. | 'train' |
bands | Sequence[str] | Bands to be used. Defaults to all bands. | BAND_SETS['all'] |
transform | Compose \| None | Albumentations transform to be applied. Defaults to None, which applies default_transform(). | None |
partition | str | Partition name for the dataset splits. Defaults to 'default'. | 'default' |
Source code in terratorch/datasets/m_chesapeake_landcover.py
plot(sample, suptitle=None)
#
Plot a sample from the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample
|
dict[str, Tensor]
|
A sample returned by :meth: |
required |
suptitle
|
str | None
|
Optional string to use as a suptitle. |
None
|
Returns:
Type | Description |
---|---|
Figure
|
matplotlib.figure.Figure: A matplotlib Figure with the rendered sample. |
Source code in terratorch/datasets/m_chesapeake_landcover.py
terratorch.datasets.m_pv4ger_seg
#
MPv4gerSegNonGeo
#
Bases: NonGeoDataset
NonGeo dataset implementation for M-PV4GER-SEG.
Source code in terratorch/datasets/m_pv4ger_seg.py
__init__(data_root, split='train', bands=BAND_SETS['all'], transform=None, partition='default', use_metadata=False)
#
Initialize the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root
|
str
|
Path to the data root directory. |
required |
split
|
str
|
One of 'train', 'val', or 'test'. |
'train'
|
bands
|
Sequence[str]
|
Bands to be used. Defaults to all bands. |
BAND_SETS['all']
|
transform
|
Compose | None
|
Albumentations transform to be applied. Defaults to None, which applies default_transform(). |
None
|
partition
|
str
|
Partition name for the dataset splits. Defaults to 'default'. |
'default'
|
use_metadata
|
bool
|
Whether to return metadata info (location coordinates). |
False
|
Source code in terratorch/datasets/m_pv4ger_seg.py
plot(sample, suptitle=None)
#
Plot a sample from the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample
|
dict[str, Tensor]
|
A sample returned by :meth: |
required |
suptitle
|
str | None
|
Optional string to use as a suptitle. |
None
|
Returns:
Type | Description |
---|---|
Figure
|
matplotlib.figure.Figure: A matplotlib Figure with the rendered sample. |
Source code in terratorch/datasets/m_pv4ger_seg.py
terratorch.datasets.m_SA_crop_type
#
MSACropTypeNonGeo
#
Bases: NonGeoDataset
NonGeo dataset implementation for M-SA-Crop-Type.
Source code in terratorch/datasets/m_SA_crop_type.py
__init__(data_root, split='train', bands=BAND_SETS['all'], transform=None, partition='default')
#
Initialize the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root
|
str
|
Path to the data root directory. |
required |
split
|
str
|
One of 'train', 'val', or 'test'. |
'train'
|
bands
|
Sequence[str]
|
Bands to be used. Defaults to all bands. |
BAND_SETS['all']
|
transform
|
Compose | None
|
Albumentations transform to be applied. Defaults to None, which applies default_transform(). |
None
|
partition
|
str
|
Partition name for the dataset splits. Defaults to 'default'. |
'default'
|
Source code in terratorch/datasets/m_SA_crop_type.py
plot(sample, suptitle=None)
#
Plot a sample from the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample
|
dict[str, Tensor]
|
A sample returned by :meth: |
required |
suptitle
|
str | None
|
Optional string to use as a suptitle. |
None
|
Returns:
Type | Description |
---|---|
Figure
|
matplotlib.figure.Figure: A matplotlib Figure with the rendered sample. |
Source code in terratorch/datasets/m_SA_crop_type.py
terratorch.datasets.m_neontree
#
MNeonTreeNonGeo
#
Bases: NonGeoDataset
NonGeo dataset implementation for M-NeonTree.
Source code in terratorch/datasets/m_neontree.py
__init__(data_root, split='train', bands=rgb_bands, transform=None, partition='default')
#
Initialize the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root
|
str
|
Path to the data root directory. |
required |
split
|
str
|
One of 'train', 'val', or 'test'. |
'train'
|
bands
|
Sequence[str]
|
Bands to be used. Defaults to RGB bands. |
rgb_bands
|
transform
|
Compose | None
|
Albumentations transform to be applied. Defaults to None, which applies default_transform(). |
None
|
partition
|
str
|
Partition name for the dataset splits. Defaults to 'default'. |
'default'
|
Source code in terratorch/datasets/m_neontree.py
plot(sample, suptitle=None)
#
Plot a sample from the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample
|
dict[str, Tensor]
|
A sample returned by :meth: |
required |
suptitle
|
str | None
|
Optional string to use as a suptitle. |
None
|
Returns:
Type | Description |
---|---|
Figure
|
matplotlib.figure.Figure: A matplotlib Figure with the rendered sample. |
Source code in terratorch/datasets/m_neontree.py
terratorch.datasets.multi_temporal_crop_classification
#
MultiTemporalCropClassification
#
Bases: NonGeoDataset
NonGeo dataset implementation for multi-temporal crop classification.
Source code in terratorch/datasets/multi_temporal_crop_classification.py
__init__(data_root, split='train', bands=BAND_SETS['all'], transform=None, no_data_replace=None, no_label_replace=None, expand_temporal_dimension=True, reduce_zero_label=True, use_metadata=False, metadata_file_name='chips_df.csv')
#
Constructor
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root
|
str
|
Path to the data root directory. |
required |
split
|
str
|
One of 'train' or 'val'. |
'train'
|
bands
|
list[str]
|
Bands that should be output by the dataset. Defaults to all bands. |
BAND_SETS['all']
|
transform
|
Compose | None
|
Albumentations transform to be applied. Should end with ToTensorV2(). If used through the corresponding data module, should not include normalization. Defaults to None, which applies ToTensorV2(). |
None
|
no_data_replace
|
float | None
|
Replace nan values in input images with this value. If None, does no replacement. Defaults to None. |
None
|
no_label_replace
|
int | None
|
Replace NaN values in the label with this value. If None, does no replacement. Defaults to None. |
None
|
expand_temporal_dimension
|
bool
|
Go from shape (time*channels, h, w) to (channels, time, h, w). Defaults to True. |
True
|
reduce_zero_label
|
bool
|
Subtract 1 from all labels. Useful when labels start from 1 instead of the expected 0. Defaults to True. |
True
|
use_metadata
|
bool
|
Whether to return metadata info (time and location). |
False
|
Source code in terratorch/datasets/multi_temporal_crop_classification.py
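The shape change described for expand_temporal_dimension can be illustrated with plain NumPy. This is only a sketch of the documented shapes, not the dataset's own code: the function name expand_temporal and the time_major flag are hypothetical, and which of the two factors (time or channel) varies fastest along the stacked axis is an assumption that should be checked against the data files.

```python
import numpy as np


def expand_temporal(stacked: np.ndarray, num_channels: int, time_major: bool = True) -> np.ndarray:
    """Reshape (time*channels, h, w) into (channels, time, h, w).

    Hypothetical helper for illustration only.
    """
    tc, h, w = stacked.shape
    if tc % num_channels != 0:
        raise ValueError("leading axis must be a multiple of num_channels")
    num_steps = tc // num_channels
    if time_major:
        # Assumed layout: [t0c0, t0c1, ..., t1c0, t1c1, ...]
        return stacked.reshape(num_steps, num_channels, h, w).transpose(1, 0, 2, 3)
    # Alternative layout: [c0t0, c0t1, ..., c1t0, c1t1, ...]
    return stacked.reshape(num_channels, num_steps, h, w)


# Three timesteps of two channels each, on a 1x1 image.
stacked = np.arange(6).reshape(6, 1, 1)
expanded = expand_temporal(stacked, num_channels=2)
print(expanded.shape)
```

With expand_temporal_dimension=False the stacked (time*channels, h, w) layout would simply be kept as loaded.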
plot(sample, suptitle=None)
#
Plot a sample from the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample
|
dict[str, Tensor]
|
a sample returned by :meth: |
required |
suptitle
|
str | None
|
Optional string to use as a suptitle. |
None
|
Returns:
Type | Description |
---|---|
Figure
|
A matplotlib Figure with the rendered sample. |
Source code in terratorch/datasets/multi_temporal_crop_classification.py
terratorch.datasets.open_sentinel_map
#
OpenSentinelMap
#
Bases: NonGeoDataset
PyTorch Dataset class to load samples from the OpenSentinelMap dataset, supporting multiple bands and temporal sampling strategies.
Source code in terratorch/datasets/open_sentinel_map.py
__init__(data_root, split='train', bands=None, transform=None, spatial_interpolate_and_stack_temporally=True, pad_image=None, truncate_image=None, target=0, pick_random_pair=True)
#
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root
|
str
|
Path to the root directory of the dataset. |
required |
split
|
str
|
Dataset split to load. Options are 'train', 'val', or 'test'. Defaults to 'train'. |
'train'
|
bands
|
list of str
|
List of band names to load. Defaults to ['gsd_10', 'gsd_20', 'gsd_60']. |
None
|
transform
|
Compose
|
Albumentations transformations to apply to the data. |
None
|
spatial_interpolate_and_stack_temporally
|
bool
|
If True, the bands are interpolated and concatenated over time. Default is True. |
True
|
pad_image
|
int
|
Number of timesteps to pad the time dimension of the image. If None, no padding is applied. |
None
|
truncate_image
|
int
|
Number of timesteps to truncate the time dimension of the image. If None, no truncation is performed. |
None
|
target
|
int
|
Specifies which target class to use from the mask. Default is 0. |
0
|
pick_random_pair
|
bool
|
If True, selects two random images from the temporal sequence. Default is True. |
True
|
Source code in terratorch/datasets/open_sentinel_map.py
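The temporal options above (pick_random_pair, truncate_image, pad_image) all act on the time dimension. The following is a rough sketch of one way such options could behave, using a plain list of per-timestep frames. The helper names are hypothetical, and the details are assumptions rather than documented behaviour: that the pair is returned in chronological order, that truncation keeps the earliest frames, and that padding is appended at the end.

```python
import random


def pick_pair(num_timesteps: int, rng: random.Random) -> list[int]:
    """Return two distinct timestep indices, sorted chronologically (assumption)."""
    if num_timesteps < 2:
        raise ValueError("need at least two timesteps to pick a pair")
    return sorted(rng.sample(range(num_timesteps), 2))


def fit_time_axis(frames: list, truncate_to=None, pad_to=None, fill=None) -> list:
    """Truncate, then pad, a list of per-timestep frames."""
    if truncate_to is not None:
        frames = frames[:truncate_to]  # assumption: keep the earliest frames
    if pad_to is not None and len(frames) < pad_to:
        # assumption: padding frames are appended after the real ones
        frames = frames + [fill] * (pad_to - len(frames))
    return frames


rng = random.Random(0)
print(pick_pair(5, rng))
print(fit_time_axis(["t0", "t1", "t2", "t3"], truncate_to=3))
print(fit_time_axis(["t0"], pad_to=3, fill="pad"))
```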
terratorch.datasets.openearthmap
#
OpenEarthMapNonGeo
#
Bases: NonGeoDataset
OpenEarthMapNonGeo Dataset for non-georeferenced imagery.
This dataset class handles non-georeferenced image data from the OpenEarthMap dataset. It supports configurable band sets and transformations, and performs cropping operations to ensure that the images conform to the required input dimensions. The dataset is split into "train", "test", and "val" subsets based on the provided split parameter.
Source code in terratorch/datasets/openearthmap.py
__init__(data_root, bands=BAND_SETS['all'], transform=None, split='train', crop_size=256, random_crop=True)
#
Initialize a new instance of the OpenEarthMapNonGeo dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root
|
str
|
The root directory containing the dataset files. |
required |
bands
|
Sequence[str]
|
A list of band names to be used. Default is BAND_SETS["all"]. |
BAND_SETS['all']
|
transform
|
Compose or None
|
A transformation pipeline to be applied to the data. If None, a default transform converting the data to a tensor is applied. |
None
|
split
|
str
|
The dataset split to use ("train", "test", or "val"). Default is "train". |
'train'
|
crop_size
|
int
|
The size (in pixels) of the crop to apply to images. Must be greater than 0. Default is 256. |
256
|
random_crop
|
bool
|
If True, performs a random crop; otherwise, performs a center crop. Default is True. |
True
|
Raises:
Type | Description |
---|---|
Exception
|
If the provided split is not one of "train", "test", or "val". |
AssertionError
|
If crop_size is not greater than 0. |
Source code in terratorch/datasets/openearthmap.py
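The difference between random_crop=True and random_crop=False comes down to how the crop window is placed. A minimal sketch of that choice follows. The function crop_window is hypothetical, and the ValueError for a crop larger than the image is an assumption added for the example; only the crop_size > 0 assertion comes from the documentation above.

```python
import random


def crop_window(height, width, crop_size=256, random_crop=True, rng=None):
    """Return (top, left, bottom, right) of a square crop window."""
    assert crop_size > 0, "crop_size must be greater than 0"
    if crop_size > height or crop_size > width:
        # Assumption for this sketch: oversized crops are rejected.
        raise ValueError("crop_size exceeds the image size")
    if random_crop:
        rng = rng or random.Random()
        top = rng.randint(0, height - crop_size)
        left = rng.randint(0, width - crop_size)
    else:
        # Center crop: split the leftover margin evenly.
        top = (height - crop_size) // 2
        left = (width - crop_size) // 2
    return top, left, top + crop_size, left + crop_size


print(crop_window(1024, 1024, crop_size=256, random_crop=False))
```

A center crop is deterministic, which is usually what is wanted for validation and test splits, while a random crop acts as a light augmentation during training.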
terratorch.datasets.pastis
#
PASTIS
#
Bases: NonGeoDataset
PyTorch Dataset class to load samples from the PASTIS dataset, for semantic and panoptic segmentation.
Source code in terratorch/datasets/pastis.py
__init__(data_root, norm=True, target='semantic', folds=None, reference_date='2018-09-01', date_interval=(-200, 600), class_mapping=None, transform=None, truncate_image=None, pad_image=None, satellites=['S2'])
#
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root
|
str
|
Path to the dataset. |
required |
norm
|
bool
|
If True, images are standardised using pre-computed channel-wise means and standard deviations. |
True
|
reference_date
|
str
|
Reference date in 'YYYY-MM-DD' format, relative to which all observation dates are expressed. Along with the image time series and the target tensor, this dataloader yields the sequence of observation dates (as the number of days since the reference date). This sequence is used, for instance, for positional encoding in attention-based approaches. |
'2018-09-01'
|
target
|
str
|
'semantic' or 'instance'. Defines which type of target is returned by the dataloader. If 'semantic', the target tensor contains the class of each pixel. If 'instance', the target tensor is the concatenation of several signals needed to train the Parcel-as-Points module: the centerness heatmap, the instance ids, the Voronoi partitioning of the patch with regard to the parcels' centers, the (height, width) size of each parcel, the semantic label of each parcel, and the semantic label of each pixel. |
'semantic'
|
folds
|
list
|
List of ints specifying which of the 5 official folds to load. By default (when None is specified), all folds are loaded. |
None
|
class_mapping
|
dict
|
A dictionary to define a mapping between the default 18 class nomenclature and another class grouping. If not provided, the default class mapping is used. |
None
|
transform
|
callable
|
A transform to apply to the loaded data (images, dates, and masks). By default, no transformation is applied. |
None
|
truncate_image
|
int
|
Truncate the time dimension of the image to a specified number of timesteps. If None, no truncation is performed. |
None
|
pad_image
|
int
|
Pad the time dimension of the image to a specified number of timesteps. If None, no padding is applied. |
None
|
satellites
|
list
|
Defines the satellites to use. If you are using PASTIS-R, you have access to Sentinel-2 imagery and Sentinel-1 observations in Ascending and Descending orbits, respectively S2, S1A, and S1D. For example, use satellites=['S2', 'S1A'] for Sentinel-2 + Sentinel-1 ascending time series, or satellites=['S2', 'S1A', 'S1D'] to retrieve all time series. If you are using PASTIS, only S2 observations are available. |
['S2']
|
Source code in terratorch/datasets/pastis.py
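The reference_date parameter expresses every observation date as a signed number of days relative to one fixed day. A minimal sketch of that conversion with the standard library is shown here. The function name days_since_reference is hypothetical, and the ISO-formatted input strings are an assumption made for the example; the format in which the dataset stores its dates is not stated above.

```python
from datetime import date


def days_since_reference(observation_dates, reference_date="2018-09-01"):
    """Convert 'YYYY-MM-DD' strings into day offsets from the reference date."""
    ref = date.fromisoformat(reference_date)
    return [(date.fromisoformat(d) - ref).days for d in observation_dates]


offsets = days_since_reference(["2018-08-31", "2018-09-01", "2018-09-11", "2019-09-01"])
print(offsets)
```

Dates before the reference date produce negative offsets, which is consistent with a default date_interval that starts below zero.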
terratorch.datasets.sen1floods11
#
Sen1Floods11NonGeo
#
Bases: NonGeoDataset
NonGeo dataset implementation for sen1floods11.
Source code in terratorch/datasets/sen1floods11.py
__init__(data_root, split='train', bands=BAND_SETS['all'], transform=None, constant_scale=0.0001, no_data_replace=0, no_label_replace=-1, use_metadata=False)
#
Constructor
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root
|
str
|
Path to the data root directory. |
required |
split
|
str
|
One of 'train', 'val', or 'test'. |
'train'
|
bands
|
list[str]
|
Bands that should be output by the dataset. Defaults to all bands. |
BAND_SETS['all']
|
transform
|
Compose | None
|
Albumentations transform to be applied. Should end with ToTensorV2(). Defaults to None, which applies ToTensorV2(). |
None
|
constant_scale
|
float
|
Factor to multiply image values by. Defaults to 0.0001. |
0.0001
|
no_data_replace
|
float | None
|
Replace nan values in input images with this value. If None, does no replacement. Defaults to 0. |
0
|
no_label_replace
|
int | None
|
Replace NaN values in the label with this value. If None, does no replacement. Defaults to -1. |
-1
|
use_metadata
|
bool
|
Whether to return metadata info (time and location). |
False
|
Source code in terratorch/datasets/sen1floods11.py
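The constant_scale, no_data_replace and no_label_replace parameters describe simple element-wise preprocessing. The sketch below shows one way to apply them with NumPy. It assumes that NaN replacement happens before scaling and that labels are cast to integers afterwards; prepare_image and prepare_label are illustrative names, not part of terratorch.

```python
import numpy as np


def prepare_image(raw, constant_scale=0.0001, no_data_replace=0):
    """Replace NaNs (if requested), then multiply by constant_scale."""
    image = raw.astype(np.float32)
    if no_data_replace is not None:
        image = np.where(np.isnan(image), np.float32(no_data_replace), image)
    return image * np.float32(constant_scale)


def prepare_label(raw, no_label_replace=-1):
    """Replace NaNs in the label and cast to integers (assumed order)."""
    if no_label_replace is None:
        return raw  # nothing replaced, NaNs stay in place
    return np.where(np.isnan(raw), no_label_replace, raw).astype(np.int64)


raw_image = np.array([[1000.0, np.nan], [2500.0, 0.0]])
raw_label = np.array([0.0, 1.0, np.nan])
print(prepare_image(raw_image))
print(prepare_label(raw_label))
```

Replacing missing labels with -1 keeps them distinguishable from the valid classes, so a loss configured to ignore that index can skip them.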
plot(sample, suptitle=None)
#
Plot a sample from the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample
|
dict[str, Tensor]
|
a sample returned by :meth: |
required |
suptitle
|
str | None
|
Optional string to use as a suptitle. |
None
|
Returns:
Type | Description |
---|---|
Figure
|
A matplotlib Figure with the rendered sample. |
Source code in terratorch/datasets/sen1floods11.py
terratorch.datasets.sen4agrinet
#
Sen4AgriNet
#
Bases: NonGeoDataset
Source code in terratorch/datasets/sen4agrinet.py
__init__(data_root, bands=None, scenario='random', split='train', transform=None, truncate_image=4, pad_image=4, spatial_interpolate_and_stack_temporally=True, seed=42)
#
PyTorch Dataset class to load samples from the Sen4AgriNet dataset, supporting multiple scenarios for splitting the data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_root
|
str
|
Root directory of the dataset. |
required |
bands
|
list of str
|
List of band names to load. Defaults to all available bands. |
None
|
scenario
|
str
|
Defines the splitting scenario to use. Options are: 'random' (random split of the data), 'spatial' (split by geographical region: Catalonia and France), and 'spatio-temporal' (split by region and year: France 2019 and Catalonia 2020). |
'random'
|
split
|
str
|
Specifies the dataset split. Options are 'train', 'val', or 'test'. |
'train'
|
transform
|
Compose
|
Albumentations transformations to apply to the data. |
None
|
truncate_image
|
int
|
Number of timesteps to truncate the time dimension of the image. If None, no truncation is applied. Default is 4. |
4
|
pad_image
|
int
|
Number of timesteps to pad the time dimension of the image. If None, no padding is applied. Default is 4. |
4
|
spatial_interpolate_and_stack_temporally
|
bool
|
Whether to interpolate bands and concatenate them over time. |
True
|
seed
|
int
|
Random seed used for data splitting. |
42
|
Source code in terratorch/datasets/sen4agrinet.py
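For the 'random' scenario, the seed parameter is what makes the split reproducible. The following sketch shows the general idea of a seeded split and is not the dataset's own logic. The function random_split is hypothetical, and the 60/20/20 fractions are an assumption chosen for the example, since the actual proportions are not stated above.

```python
import random


def random_split(sample_ids, seed=42, fractions=(0.6, 0.2, 0.2)):
    """Shuffle ids with a fixed seed and cut them into train/val/test."""
    ids = sorted(sample_ids)  # fixed starting order, independent of input order
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * fractions[0])
    n_val = int(len(ids) * fractions[1])
    return {
        "train": ids[:n_train],
        "val": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],
    }


splits = random_split(range(10))
print({name: len(ids) for name, ids in splits.items()})
```

Because the generator is seeded locally, constructing the 'train', 'val' and 'test' datasets separately with the same seed yields disjoint, consistent subsets.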
terratorch.datasets.sen4map
#
Sen4MapDatasetMonthlyComposites
#
Bases: Dataset
Sen4Map Dataset for Monthly Composites.
Dataset intended for land-cover and crop classification tasks based on monthly composites derived from multi-temporal satellite data stored in HDF5 files.
Dataset Format:
- HDF5 files containing multi-temporal acquisitions with spectral bands (e.g., B2, B3, …, B12)
- Composite images computed as the median across available acquisitions for each month.
- Classification labels provided via HDF5 attributes (e.g., 'lc1') with mappings defined for:
- Land-cover: using land_cover_classification_map
- Crops: using crop_classification_map
Dataset Features:
- Supports two classification tasks: "land-cover" (default) and "crops".
- Pre-processing options include center cropping, reverse tiling, and resizing.
- Option to save the HDF5 keys for later filtering.
- Input channel selection via a mapping between available bands and input bands.
Source code in terratorch/datasets/sen4map.py
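The monthly composites mentioned above are medians over all acquisitions that fall within the same month. A minimal NumPy sketch of that aggregation is given here. The function name, the (time, channels, h, w) input layout, and the choice to leave months without any acquisition as NaN are all assumptions made for illustration.

```python
import numpy as np


def monthly_median_composites(acquisitions, months, num_months=12):
    """Median-aggregate acquisitions per calendar month.

    acquisitions: array of shape (T, C, H, W)
    months: length-T sequence of month numbers in 1..num_months
    Returns an array of shape (num_months, C, H, W).
    """
    acquisitions = np.asarray(acquisitions, dtype=np.float32)
    months = np.asarray(months)
    composites = np.full((num_months,) + acquisitions.shape[1:], np.nan, dtype=np.float32)
    for m in range(1, num_months + 1):
        selected = acquisitions[months == m]
        if len(selected) > 0:
            composites[m - 1] = np.median(selected, axis=0)
    return composites


# Two acquisitions in January, one in February, single band on a 1x1 image.
acq = np.array([1.0, 3.0, 10.0]).reshape(3, 1, 1, 1)
print(monthly_median_composites(acq, months=[1, 1, 2])[:3, 0, 0, 0])
```

The median is less sensitive than the mean to single outlier acquisitions, for example a cloudy scene within an otherwise clear month.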
__init__(h5py_file_object, h5data_keys=None, crop_size=None, dataset_bands=None, input_bands=None, resize=False, resize_to=[224, 224], resize_interpolation=InterpolationMode.BILINEAR, resize_antialiasing=True, reverse_tile=False, reverse_tile_size=3, save_keys_path=None, classification_map='land-cover')
#
Initialize a new instance of Sen4MapDatasetMonthlyComposites.
This dataset loads data from an HDF5 file object containing multi-temporal satellite data and computes monthly composite images by aggregating acquisitions (via median).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
h5py_file_object
|
File
|
An open h5py.File object containing the dataset. |
required |
h5data_keys
|
Optional list of keys to select a subset of data samples from the HDF5 file. If None, all keys are used. |
None
|
|
crop_size
|
None | int
|
Optional integer specifying the square crop size for the output image. |
None
|
dataset_bands
|
list[HLSBands | int] | None
|
Optional list of bands available in the dataset. |
None
|
input_bands
|
list[HLSBands | int] | None
|
Optional list of bands to be used as input channels.
Must be provided along with |
None
|
resize
|
Boolean flag indicating whether the image should be resized. Default is False. |
False
|
|
resize_to
|
Target dimensions [height, width] for resizing. Default is [224, 224]. |
[224, 224]
|
|
resize_interpolation
|
Interpolation mode used for resizing. Default is InterpolationMode.BILINEAR. |
BILINEAR
|
|
resize_antialiasing
|
Boolean flag to apply antialiasing during resizing. Default is True. |
True
|
|
reverse_tile
|
Boolean flag indicating whether to apply reverse tiling to the image. Default is False. |
False
|
|
reverse_tile_size
|
Kernel size for the reverse tiling operation. Must be an odd number >= 3. Default is 3. |
3
|
|
save_keys_path
|
Optional file path to save the list of dataset keys. |
None
|
|
classification_map
|
String specifying the classification mapping to use ("land-cover" or "crops"). Default is "land-cover". |
'land-cover'
|
Raises:
Type | Description |
---|---|
ValueError
|
If |
ValueError
|
If an invalid |
Source code in terratorch/datasets/sen4map.py
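Selecting input channels from the bands available in the dataset amounts to an index lookup. The sketch below is a hedged illustration of that mapping. The function input_channel_indices is hypothetical, plain strings stand in for band identifiers, and the rule that dataset_bands and input_bands must be supplied together is inferred from the parameter notes above rather than confirmed.

```python
def input_channel_indices(dataset_bands, input_bands):
    """Map each requested input band to its position in dataset_bands."""
    if (dataset_bands is None) != (input_bands is None):
        # Assumed rule: the two lists only make sense as a pair.
        raise ValueError("dataset_bands and input_bands must be given together")
    if input_bands is None:
        return None  # no selection requested, keep all channels
    missing = [band for band in input_bands if band not in dataset_bands]
    if missing:
        raise ValueError(f"bands not present in the dataset: {missing}")
    return [dataset_bands.index(band) for band in input_bands]


available = ["B2", "B3", "B4", "B8"]
print(input_channel_indices(available, ["B4", "B3", "B2"]))
```

The resulting index list can be used to index the channel axis of an image array, which also reorders channels to match the order a model expects.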
reverse_tiling_pytorch(img_tensor, kernel_size=3)
#
Upscales an image so that every pixel is expanded into kernel_size*kernel_size pixels.
Used to test whether the benefit of resizing images to the pre-trained size comes from the bilinearly interpolated pixels,
or if the same would be realized with no interpolated pixels.
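The description does not say what fills each kernel_size*kernel_size block. The sketch below takes one plausible reading, stated here as an assumption: each pixel is replaced by its own kernel_size x kernel_size neighbourhood, with border pixels replicated, so the upscaled image contains only original pixel values. The NumPy function reverse_tile is a hypothetical stand-in operating on a single (H, W) array and should be checked against the source before relying on it.

```python
import numpy as np


def reverse_tile(img: np.ndarray, kernel_size: int = 3) -> np.ndarray:
    """Expand every pixel of an (H, W) image into a k x k block.

    Assumed behaviour: block (i, j) holds the k x k neighbourhood of
    pixel (i, j), using edge replication at the image border.
    """
    if kernel_size < 3 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be an odd number >= 3")
    k = kernel_size
    pad = (k - 1) // 2
    h, w = img.shape
    padded = np.pad(img, pad, mode="edge")
    out = np.empty((h * k, w * k), dtype=img.dtype)
    for i in range(h):
        for j in range(w):
            # Copy the neighbourhood centred on original pixel (i, j).
            out[i * k:(i + 1) * k, j * k:(j + 1) * k] = padded[i:i + k, j:j + k]
    return out


img = np.array([[1, 2], [3, 4]])
print(reverse_tile(img, kernel_size=3))
```

Under this reading the centre of every block equals the original pixel, and no value is produced by interpolation, which is the property the comparison against bilinear resizing relies on.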