
Necks

terratorch.models.necks.Neck #

Bases: ABC, Module

Base class for Neck

A neck must implement self.process_channel_list, which returns the new channel list.
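As a sketch of the contract (illustrative, not the library source), a custom neck subclasses nn.Module, transforms the backbone's list of embeddings in forward, and reports the resulting channels via process_channel_list. The ScaleNeck name and its behavior are hypothetical:

```python
import torch
from torch import nn

class ScaleNeck(nn.Module):  # hypothetical example neck
    """Minimal sketch of the Neck contract: transform the embedding list
    and report the resulting channel list."""

    def __init__(self, channel_list: list[int], factor: float = 2.0):
        super().__init__()
        self.channel_list = channel_list
        self.factor = factor

    def process_channel_list(self, channel_list: list[int]) -> list[int]:
        # Scaling values does not change the number of channels.
        return channel_list

    def forward(self, features: list[torch.Tensor]) -> list[torch.Tensor]:
        return [f * self.factor for f in features]

neck = ScaleNeck([64, 128])
out = neck([torch.ones(1, 64, 8, 8), torch.ones(1, 128, 4, 4)])
```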

terratorch.models.necks.SelectIndices #

Bases: Neck

__init__(channel_list, indices) #

Select indices from the embedding list

Parameters:

Name Type Description Default
indices list[int]

list of indices to select.

required
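The effect can be sketched in plain PyTorch: keep only the chosen entries of the embedding list.

```python
import torch

# Four intermediate embeddings, e.g. from a hierarchical backbone.
embeddings = [torch.zeros(1, 64, 32, 32), torch.zeros(1, 64, 16, 16),
              torch.zeros(1, 64, 8, 8), torch.zeros(1, 64, 4, 4)]
indices = [1, 3]
# SelectIndices keeps only the embeddings at the requested positions.
selected = [embeddings[i] for i in indices]
```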

terratorch.models.necks.PermuteDims #

Bases: Neck

__init__(channel_list, new_order) #

Permute dimensions of each element in the embedding list

Parameters:

Name Type Description Default
new_order list[int]

list of indices to be passed to tensor.permute()

required
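A typical use (assumed example) is moving a channels-last embedding to the channels-first layout that conv decoders expect; each element of the embedding list is passed through tensor.permute(new_order):

```python
import torch

x = torch.randn(1, 16, 16, 64)   # (batch, H, W, C)
new_order = [0, 3, 1, 2]
y = x.permute(*new_order)        # (batch, C, H, W)
```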

terratorch.models.necks.InterpolateToPyramidal #

Bases: Neck

__init__(channel_list, scale_factor=2, mode='nearest') #

Spatially interpolate embeddings so that embedding[i - 1] is scale_factor times larger than embedding[i]

Useful to make non-pyramidal backbones compatible with hierarchical ones.

Parameters:

Name Type Description Default
scale_factor int

Amount to scale embeddings by each layer. Defaults to 2.

2
mode str

Interpolation mode to be passed to torch.nn.functional.interpolate. Defaults to 'nearest'.

'nearest'
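The pyramid construction can be sketched with torch.nn.functional.interpolate (plain PyTorch, not the library code): starting from N same-size embeddings, each earlier embedding is upsampled so embedding[i - 1] ends up scale_factor times larger than embedding[i].

```python
import torch
import torch.nn.functional as F

# Four same-size ViT embeddings.
embeddings = [torch.randn(1, 64, 14, 14) for _ in range(4)]
scale_factor = 2
# Embedding i is upsampled by scale_factor ** (N - 1 - i).
pyramid = [
    F.interpolate(e, scale_factor=scale_factor ** (len(embeddings) - 1 - i),
                  mode="nearest")
    for i, e in enumerate(embeddings)
]
# Spatial sizes: 112, 56, 28, 14.
```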

terratorch.models.necks.MaxpoolToPyramidal #

Bases: Neck

__init__(channel_list, kernel_size=2) #

Spatially downsample embeddings so that embedding[i - 1] is kernel_size times larger than embedding[i]

Useful to make non-pyramidal backbones compatible with hierarchical ones.

Parameters:

Name Type Description Default
kernel_size int

Base kernel size to use for maxpool. Defaults to 2.

2
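A plain-PyTorch sketch of the same idea, using max pooling instead of interpolation: later embeddings are pooled progressively harder, so each level is kernel_size times smaller than the one before it.

```python
import torch
import torch.nn.functional as F

# Four same-size embeddings from a non-pyramidal backbone.
embeddings = [torch.randn(1, 64, 56, 56) for _ in range(4)]
kernel_size = 2
# Embedding i is pooled with window kernel_size ** i (stride defaults
# to the kernel size, so each pool also downsamples by that factor).
pyramid = [
    F.max_pool2d(e, kernel_size ** i) if i > 0 else e
    for i, e in enumerate(embeddings)
]
# Spatial sizes: 56, 28, 14, 7.
```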

terratorch.models.necks.ReshapeTokensToImage #

Bases: Neck

__init__(channel_list, remove_cls_token=True, effective_time_dim=1) #

Reshape output of transformer encoder so it can be passed to a conv net.

Parameters:

Name Type Description Default
remove_cls_token bool

Whether to remove the cls token from the first position. Defaults to True.

True
effective_time_dim int

The effective temporal dimension the transformer processes. For a ViT, his will be given by num_frames // tubelet size. This is used to determine the temporal dimension of the embedding, which is concatenated with the embedding dimension. For example: - A model which processes 1 frame with a tubelet size of 1 has an effective_time_dim of 1. The embedding produced by this model has embedding size embed_dim * 1. - A model which processes 3 frames with a tubelet size of 1 has an effective_time_dim of 3. The embedding produced by this model has embedding size embed_dim * 3. - A model which processes 12 frames with a tubelet size of 4 has an effective_time_dim of 3. The embedding produced by this model has an embedding size embed_dim * 3. Defaults to 1.

1
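The underlying reshape can be sketched in plain PyTorch (illustrative, not the library code): drop the cls token, then fold the sequence of patch tokens back into a 2D feature map.

```python
import torch

batch, embed_dim, h, w = 2, 768, 14, 14
# Transformer output: cls token at position 0, then h * w patch tokens.
tokens = torch.randn(batch, 1 + h * w, embed_dim)
x = tokens[:, 1:, :]                                  # remove_cls_token=True
# (B, N, C) -> (B, C, N) -> (B, C, H, W), ready for a conv net.
image = x.transpose(1, 2).reshape(batch, embed_dim, h, w)
```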

collapse_dims(x) #

When the encoder output has more than 3 dimensions, it is necessary to reshape it.

terratorch.models.necks.AddBottleneckLayer #

Bases: Neck

Add a layer that reduces the channel dimension of the final embedding by half, and concatenates it

Useful for compatibility with some smp decoders.
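A sketch of the assumed behavior (the exact layer the library uses is not shown here): a 1x1 convolution halves the channels of the final embedding, and the result is added to the embedding list.

```python
import torch
from torch import nn

embeddings = [torch.randn(1, 256, 7, 7)]
# Hypothetical bottleneck: 1x1 conv halving the channel dimension.
bottleneck = nn.Conv2d(256, 128, kernel_size=1)
embeddings.append(bottleneck(embeddings[-1]))
```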

terratorch.models.necks.LearnedInterpolateToPyramidal #

Bases: Neck

Use learned convolutions to transform the outputs of a non-pyramidal encoder into pyramidal ones

Always requires exactly 4 embeddings
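The idea can be sketched with transposed convolutions (illustrative layer choices, not the library's exact architecture): instead of parameter-free interpolation, learned layers upsample the four same-size embeddings into a pyramid.

```python
import torch
from torch import nn

c = 32  # embedding channels, kept small for illustration
embeddings = [torch.randn(1, c, 14, 14) for _ in range(4)]
# Embedding i needs 2 ** (3 - i) x upsampling; each ConvTranspose2d with
# kernel 2 / stride 2 doubles the spatial size. An empty Sequential is
# the identity, so the last embedding passes through unchanged.
ups = [nn.Sequential(*[nn.ConvTranspose2d(c, c, kernel_size=2, stride=2)
                       for _ in range(3 - i)])
       for i in range(4)]
pyramid = [up(e) for up, e in zip(ups, embeddings)]
# Spatial sizes: 112, 56, 28, 14.
```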


Last update: March 24, 2025