Implementation of Temporal Attention Encoder (TAE) for satellite image time series classification.
TAE is a simplified version of the self-attention architecture used in large language models. It modifies the standard self-attention scheme by using the input embeddings directly as values. TAE defines a single master query for each sequence, computed as the temporal average of the queries. This master query is compared to the sequence of keys to produce a single attention mask, which is used to weight the temporal mean of the values into a single feature vector.
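The sketch below illustrates this master-query mechanism using the torch package for R. The projections w_q and w_k, the toy dimensions, and all variable names are illustrative assumptions for exposition; they are not the internals of sits_tae().

library(torch)
# toy dimensions: batch of 4 sequences, 12 time steps, 16-dim embeddings
b <- 4; t <- 12; d <- 16
x <- torch_randn(b, t, d)                # input embeddings (also used as values)
w_q <- nn_linear(d, d)                   # hypothetical query projection
w_k <- nn_linear(d, d)                   # hypothetical key projection
q <- w_q(x)                              # queries: (b, t, d)
k <- w_k(x)                              # keys:    (b, t, d)
# master query: temporal average of the queries, one per sequence
q_master <- q$mean(dim = 2)$unsqueeze(2)                        # (b, 1, d)
# attention mask: scaled comparison of the master query to every key
scores <- torch_matmul(q_master, k$transpose(2, 3)) / sqrt(d)   # (b, 1, t)
attn <- nnf_softmax(scores, dim = 3)
# weighted temporal mean of the values yields one feature vector per sequence
feature <- torch_matmul(attn, x)$squeeze(2)                     # (b, d)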
Usage
sits_tae(
    samples = NULL,
    samples_validation = NULL,
    epochs = 150L,
    batch_size = 64L,
    validation_split = 0.2,
    optimizer = torch::optim_adamw,
    opt_hparams = list(lr = 0.001, eps = 1e-08, weight_decay = 1e-06),
    lr_decay_epochs = 1L,
    lr_decay_rate = 0.95,
    patience = 20L,
    min_delta = 0.01,
    seed = NULL,
    verbose = FALSE
)
Arguments
- samples
Time series with the training samples.
- samples_validation
Time series with the validation samples. If the samples_validation parameter is provided, the validation_split parameter is ignored.
- epochs
Number of iterations to train the model.
- batch_size
Number of samples per gradient update.
- validation_split
Number between 0 and 1. Fraction of training data to be used as validation data.
- optimizer
Optimizer function to be used.
- opt_hparams
Hyperparameters for the optimizer: lr (learning rate), eps (term added to the denominator to improve numerical stability), and weight_decay (L2 regularization).
- lr_decay_epochs
Interval, in epochs, between learning rate reductions.
- lr_decay_rate
Decay factor for reducing learning rate.
- patience
Number of epochs without improvements until training stops.
- min_delta
Minimum improvement to reset the patience counter.
- seed
Seed for random values.
- verbose
Verbosity mode (TRUE/FALSE). Default is FALSE.
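As an illustration, the sketch below specifies a TAE model with custom training hyperparameters, using only the parameters documented above; the values are arbitrary examples, not tuned recommendations. Note that with lr_decay_epochs = 1 and lr_decay_rate = 0.95, the learning rate after n epochs is 0.001 * 0.95^n.

# sketch: a TAE model specification with illustrative (not recommended) values
tae_spec <- sits_tae(
    epochs = 100L,
    batch_size = 32L,
    opt_hparams = list(lr = 0.001, eps = 1e-08, weight_decay = 1e-06),
    lr_decay_epochs = 1L,
    lr_decay_rate = 0.95,
    patience = 10L,
    min_delta = 0.01
)
# the resulting specification is passed to sits_train(), as in the Examples below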
Note
sits provides a set of default values for all classification models. These settings have been chosen based on testing by the authors. Nevertheless, users can control all parameters for each model. Novice users can rely on the default values, while experienced ones can fine-tune deep learning models using sits_tuning.
This function is based on the paper by Vivien Garnot referenced below and on the code available on GitHub at https://github.com/VSainteuf/pytorch-psetae.
We also used the code made available by Maja Schneider in her work with Marco Körner, referenced below and available at https://github.com/maja601/RC2020-psetae.
If you use this method, please cite Garnot's and Schneider's work.
References
Vivien Garnot, Loic Landrieu, Sebastien Giordano, and Nesrine Chehata, "Satellite Image Time Series Classification with Pixel-Set Encoders and Temporal Self-Attention", Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 12322-12331. DOI: 10.1109/CVPR42600.2020.01234.
Maja Schneider and Marco Körner, "[Re] Satellite Image Time Series Classification with Pixel-Set Encoders and Temporal Self-Attention", ReScience C 7(2), 2021. DOI: 10.5281/zenodo.4835356.
Author
Charlotte Pelletier, charlotte.pelletier@univ-ubs.fr
Gilberto Camara, gilberto.camara@inpe.br
Rolf Simoes, rolfsimoes@gmail.com
Felipe Souza, lipecaso@gmail.com
Examples
if (sits_run_examples()) {
    # create a TAE model
    torch_model <- sits_train(samples_modis_ndvi, sits_tae())
    # plot the model
    plot(torch_model)
    # create a data cube from local files
    data_dir <- system.file("extdata/raster/mod13q1", package = "sits")
    cube <- sits_cube(
        source = "BDC",
        collection = "MOD13Q1-6.1",
        data_dir = data_dir
    )
    # classify a data cube
    probs_cube <- sits_classify(
        data = cube, ml_model = torch_model, output_dir = tempdir()
    )
    # plot the probability cube
    plot(probs_cube)
    # smooth the probability cube using Bayesian statistics
    bayes_cube <- sits_smooth(probs_cube, output_dir = tempdir())
    # plot the smoothed cube
    plot(bayes_cube)
    # label the probability cube
    label_cube <- sits_label_classification(
        bayes_cube,
        output_dir = tempdir()
    )
    # plot the labelled cube
    plot(label_cube)
}