AIModels.ModelTraining module

Training, validation, and prediction methods for the Informer model.

Utilities

AIModels.ModelTraining.train_model(model, epoch, train_loader, optimizer, lr=0.001, patience=5, clip=1.0, device=None, criterion=None)[source]

Train the model.

Parameters:
  • model (torch model) -- Model to be trained

  • epoch (int) -- Number of epochs

  • train_loader (torch DataLoader) -- Training data

  • optimizer (torch optimizer) -- Optimizer

  • lr (float) -- Learning rate

  • patience (int) -- Number of epochs to wait before early stopping

  • clip (float) -- Gradient clipping value

  • device (torch device) -- Device to run the model ('cpu' or 'mps' for Apple silicon)

  • criterion (torch loss function) -- Loss function

Returns:
  • model (torch model) -- Trained model

  • train_loss (float) -- Training loss
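
Example (a minimal sketch): the call below assumes a simple DataLoader yielding (inputs, targets) batches and uses a plain linear layer as a stand-in for the Informer model; substitute your own model and data pipeline.

    import torch
    from torch.utils.data import DataLoader, TensorDataset
    from AIModels.ModelTraining import train_model

    # Stand-in model and toy data; replace with the Informer model and
    # your real training set.
    model = torch.nn.Linear(8, 1)
    x, y = torch.randn(64, 8), torch.randn(64, 1)
    train_loader = DataLoader(TensorDataset(x, y), batch_size=16, shuffle=True)

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    criterion = torch.nn.MSELoss()
    device = torch.device('mps' if torch.backends.mps.is_available() else 'cpu')

    # Returns the trained model and the final training loss, as documented.
    model, train_loss = train_model(model, epoch=20, train_loader=train_loader,
                                    optimizer=optimizer, lr=1e-3, patience=5,
                                    clip=1.0, device=device, criterion=criterion)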

AIModels.ModelTraining.validate_model(model, epoch, val_loader, lr=0.001, patience=5, clip=1.0, device=None, criterion=None)[source]

Validate the model.

Parameters:
  • model (torch model) -- Model to be validated

  • epoch (int) -- Number of epochs

  • val_loader (torch DataLoader) -- Validation data

  • lr (float) -- Learning rate

  • patience (int) -- Number of epochs to wait before early stopping

  • clip (float) -- Gradient clipping value

  • device (torch device) -- Device to run the model ('cpu' or 'mps' for Apple silicon)

  • criterion (torch loss function) -- Loss function

Returns:
  • model (torch model) -- Validated model

  • val_loss (float) -- Validation loss
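
Example (a sketch, assuming val_loader yields batches in the same (inputs, targets) format as the training sketch above; model, criterion, and device are reused from there):

    import torch
    from torch.utils.data import DataLoader, TensorDataset
    from AIModels.ModelTraining import validate_model

    # Held-out toy data; replace with your real validation set.
    x_val, y_val = torch.randn(16, 8), torch.randn(16, 1)
    val_loader = DataLoader(TensorDataset(x_val, y_val), batch_size=16)

    model, val_loss = validate_model(model, epoch=20, val_loader=val_loader,
                                     lr=1e-3, patience=5, clip=1.0,
                                     device=device, criterion=criterion)
    print(f"validation loss: {val_loss:.4f}")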

AIModels.ModelTraining.predict(model, val_loader, Tpredict, device=None, criterion=None)[source]

Generate predictions with the model.

Parameters:
  • model (torch model) -- Model used for prediction

  • val_loader (torch DataLoader) -- Input sequence data for prediction

  • Tpredict (int) -- Number of time steps to predict

  • device (torch device) -- Device to run the model ('cpu' or 'mps' for Apple silicon)

  • criterion (torch loss function) -- Loss function. If defined, a deterministic model is assumed
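
Example (a sketch; the return value is not documented here, so the variable name below is illustrative):

    from AIModels.ModelTraining import predict

    # Passing criterion signals a deterministic model, per the docstring;
    # omit it for a probabilistic/sampling model. Reuses model, val_loader,
    # device, and criterion from the sketches above.
    predictions = predict(model, val_loader, Tpredict=24,
                          device=device, criterion=criterion)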

AIModels.ModelTraining.deter_generate(model, past_values: Tensor, past_time_features: Tensor, future_time_features: Tensor, past_observed_mask: Tensor | None = None, static_categorical_features: Tensor | None = None, static_real_features: Tensor | None = None, output_attentions: bool | None = None, output_hidden_states: bool | None = None) → SampleTSPredictionOutput[source]

Greedily generate sequences of predictions from the last hidden state. This is a modified version of the generate method from the transformers library.

Parameters:
  • past_values (`torch.FloatTensor` of shape `(batch_size, sequence_length)` or `(batch_size, sequence_length, input_size)`) -- Past values of the time series that serve as context for predicting the future. The sequence size of this tensor must be larger than the context_length of the model, since the model will use the larger size to construct lag features, i.e. additional values from the past which are added in order to serve as "extra context".

    The sequence_length here is equal to config.context_length + max(config.lags_sequence), which, if no lags_sequence is configured, is equal to config.context_length + 7 (as by default, the largest look-back index in config.lags_sequence is 7). The property _past_length returns the actual length of the past.

    The past_values is what the Transformer encoder gets as input (with optional additional features, such as static_categorical_features, static_real_features, past_time_features and lags).

    Missing values, if any, need to be replaced with zeros and indicated via the past_observed_mask.

    For multivariate time series, the input_size > 1 dimension is required and corresponds to the number of variates in the time series per time step.

  • past_time_features (`torch.FloatTensor` of shape `(batch_size, sequence_length, num_features)`) -- Required time features, which the model internally will add to past_values. These could be things like "month of year", "day of the month", etc. encoded as vectors (for instance as Fourier features). These could also be so-called "age" features, which basically help the model know "at which point in life" a time-series is. Age features have small values for distant past time steps and increase monotonically the more we approach the current time step. Holiday features are also a good example of time features.

    These features serve as the "positional encodings" of the inputs. So, contrary to a model like BERT, where the position encodings are learned from scratch internally as parameters of the model, the Time Series Transformer requires you to provide additional time features. The Time Series Transformer only learns additional embeddings for static_categorical_features.

    Additional dynamic real covariates can be concatenated to this tensor, with the caveat that these features must be known at prediction time.

    The num_features here is equal to config.num_time_features + config.num_dynamic_real_features.

  • future_time_features (`torch.FloatTensor` of shape `(batch_size, prediction_length, num_features)`) -- Required time features for the prediction window, which the model internally will add to sampled predictions. These could be things like "month of year", "day of the month", etc. encoded as vectors (for instance as Fourier features). These could also be so-called "age" features, which basically help the model know "at which point in life" a time-series is. Age features have small values for distant past time steps and increase monotonically the more we approach the current time step. Holiday features are also a good example of time features.

    These features serve as the "positional encodings" of the inputs. So, contrary to a model like BERT, where the position encodings are learned from scratch internally as parameters of the model, the Time Series Transformer requires you to provide additional time features. The Time Series Transformer only learns additional embeddings for static_categorical_features.

    Additional dynamic real covariates can be concatenated to this tensor, with the caveat that these features must be known at prediction time.

    The num_features here is equal to config.num_time_features + config.num_dynamic_real_features.

  • past_observed_mask (`torch.BoolTensor` of shape `(batch_size, sequence_length)` or `(batch_size, sequence_length, input_size)`, *optional*) -- Boolean mask to indicate which past_values were observed and which were missing. Mask values selected in [0, 1]:

    • 1 for values that are observed,

    • 0 for values that are missing (i.e. NaNs that were replaced by zeros).

  • static_categorical_features (`torch.LongTensor` of shape `(batch_size, number of static categorical features)`, *optional*) -- Optional static categorical features for which the model will learn an embedding, which it will add to the values of the time series. Static categorical features are features which have the same value for all time steps (static over time). A typical example of a static categorical feature is a time series ID.

  • static_real_features (`torch.FloatTensor` of shape `(batch_size, number of static real features)`, *optional*) -- Optional static real features which the model will add to the values of the time series. Static real features are features which have the same value for all time steps (static over time). A typical example of a static real feature is promotion information.

  • output_attentions (`bool`, *optional*) -- Whether or not to return the attentions tensors of all attention layers.

  • output_hidden_states (`bool`, *optional*) -- Whether or not to return the hidden states of all layers.

Returns:

  [SampleTSPredictionOutput] where the output sequences tensor will have shape (batch_size, number of samples, prediction_length), or (batch_size, number of samples, prediction_length, input_size) for multivariate predictions.
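
Example (a shape sketch only; the config values below, context_length=24 with a largest lag of 7 and prediction_length=12, are assumptions chosen to illustrate the required tensor shapes, and model is assumed to be the Informer-style transformers model this module wraps):

    import torch
    from AIModels.ModelTraining import deter_generate

    batch_size = 4
    past_len = 24 + 7   # config.context_length + max(config.lags_sequence)
    pred_len = 12       # config.prediction_length
    num_feat = 2        # config.num_time_features + config.num_dynamic_real_features

    past_values = torch.randn(batch_size, past_len)  # univariate series
    past_time_features = torch.randn(batch_size, past_len, num_feat)
    future_time_features = torch.randn(batch_size, pred_len, num_feat)
    past_observed_mask = torch.ones(batch_size, past_len, dtype=torch.bool)

    out = deter_generate(model, past_values=past_values,
                         past_time_features=past_time_features,
                         future_time_features=future_time_features,
                         past_observed_mask=past_observed_mask)
    # out.sequences has shape (batch_size, number of samples, prediction_length)
    # here, with a trailing input_size dimension for multivariate models.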