Returns predictions and weights calculated by online-learning algorithms using CRPS Learning.

online(
  y,
  experts,
  tau = 1:dim(experts)[2]/(dim(experts)[2] + 1),
  lead_time = 0,
  loss_function = "quantile",
  loss_parameter = 1,
  loss_gradient = TRUE,
  method = "bewa",
  basis_knot_distance = 1/(dim(experts)[2] + 1),
  basis_knot_distance_power = 1,
  basis_deg = 1,
  forget_regret = 0,
  soft_threshold = -Inf,
  hard_threshold = -Inf,
  fixed_share = 0,
  p_smooth_lambda = -Inf,
  p_smooth_knot_distance = basis_knot_distance,
  p_smooth_knot_distance_power = basis_knot_distance_power,
  p_smooth_deg = basis_deg,
  p_smooth_ndiff = 1.5,
  gamma = 1,
  parametergrid_max_combinations = 100,
  parametergrid = NULL,
  forget_past_performance = 0,
  allow_quantile_crossing = FALSE,
  init_weights = NULL,
  loss = NULL,
  regret = NULL,
  trace = TRUE
)

Arguments

y

A numeric matrix of realizations. In probabilistic settings, a matrix of dimension Tx1. In multivariate settings, a TxP matrix can be used; in that case, each slice of the experts array is evaluated using the corresponding column of y.

experts

An array of predictions with dimension (Observations, Quantiles, Experts).

tau

A numeric vector of probabilities.
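The default grid places one probability per quantile column of experts, evenly spaced and excluding 0 and 1. A minimal sketch in base R, assuming P = 9 quantiles as in the Examples section:

```r
P <- 9                # number of quantile columns, i.e. dim(experts)[2]
tau <- 1:P / (P + 1)  # same expression as the default value of `tau`
print(tau)
```

With P = 9 this gives the deciles 0.1, 0.2, ..., 0.9.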

lead_time

Offset for expert forecasts. Defaults to 0, which means that experts forecast t+1 at t. Setting this to h means that the experts' predictions refer to t+1+h at time t. The weight updates are delayed accordingly.

loss_function

Either "quantile", "expectile" or "percentage".
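For orientation, the textbook quantile (pinball) loss can be sketched as below. The helper name pinball_loss is hypothetical, and the package's internal implementation may differ, for example through loss_parameter or loss_gradient.

```r
# Hypothetical helper: standard pinball loss for a quantile forecast q
# of the realization y at probability level tau.
pinball_loss <- function(y, q, tau) {
  # equals tau * (y - q) if y >= q and (1 - tau) * (q - y) otherwise
  (as.numeric(y < q) - tau) * (q - y)
}

pinball_loss(y = 1, q = 0, tau = 0.9)  # forecast too low at a high quantile
pinball_loss(y = 0, q = 1, tau = 0.9)  # forecast too high at a high quantile
```

At tau = 0.9, under-predicting is penalized more heavily than over-predicting by the same amount.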

loss_parameter

Optional parameter scaling the power of the loss function.

loss_gradient

Determines if a linearized version of the loss is used.

method

One of "boa", "bewa", "ml_poly" or "ewa". "bewa" refers to a mixture of boa and ewa: it includes the second-order refinement of boa but updates the weights with simple exponential weighting.

basis_knot_distance

Determines the distance between the knots in the probability basis. Defaults to 1 / (dim(experts)[2] + 1), which means that one knot is created for every quantile. Takes a vector of values > 0, where values > 0.5 correspond to constant weights (a single knot).

basis_knot_distance_power

Parameter which defines the symmetry of the basis reducing the probability space. Defaults to 1, which corresponds to equidistant knots. Values less than 1 create more knots in the center, while values above 1 concentrate more knots in the tails.

basis_deg

Degree of the basis reducing the probability space. Defaults to 1.

forget_regret

Share of past regret not to be considered, resp. to be forgotten in every iteration of the algorithm. Defaults to 0.
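One plausible reading of this parameter is an exponential discounting of the accumulated regret. The sketch below assumes the recursion R_t = (1 - forget_regret) * R_{t-1} + r_t; this exact form and the helper name accumulate_regret are assumptions, not taken from the package source.

```r
# Hypothetical helper: cumulative regret with forgetting.
accumulate_regret <- function(r, forget_regret) {
  R <- 0
  path <- numeric(length(r))
  for (t in seq_along(r)) {
    # discount what has been accumulated so far, then add the new regret
    R <- (1 - forget_regret) * R + r[t]
    path[t] <- R
  }
  path
}

accumulate_regret(rep(1, 3), forget_regret = 0)    # plain cumulative sum
accumulate_regret(rep(1, 3), forget_regret = 0.5)  # older regret fades out
```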

soft_threshold

If specified, the following soft threshold will be applied to the weights: w = sgn(w)*max(abs(w)-t,0), where t is the soft_threshold parameter. Defaults to -Inf, which means that no threshold will be applied. If all expert weights are thresholded to 0, a weight of 1 will be assigned to the expert with the highest weight prior to thresholding. Thus soft_threshold = 1 leads to the 'follow the leader' strategy if method is set to "ewa".

hard_threshold

If specified, the following hard threshold will be applied to the weights: w = w*(abs(w)>t), where t is the hard_threshold parameter. Defaults to -Inf, which means that no threshold will be applied. If all expert weights are thresholded to 0, a weight of 1 will be assigned to the expert with the highest weight prior to thresholding. Thus hard_threshold = 1 leads to the 'follow the leader' strategy if method is set to "ewa".
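The two thresholding rules and their shared fallback can be sketched in base R as follows. The helper names are hypothetical; only the two formulas and the fallback rule come from the argument descriptions. Whether the package renormalizes the weights afterwards is not stated here, so the sketch does not.

```r
# Hypothetical helper: if everything was thresholded to 0, put all weight
# on the expert that had the highest weight before thresholding.
fallback_to_leader <- function(out, w) {
  if (all(out == 0)) {
    out <- numeric(length(w))
    out[which.max(w)] <- 1
  }
  out
}

soft_threshold_weights <- function(w, t) {
  if (t == -Inf) return(w)  # default: thresholding is skipped
  fallback_to_leader(sign(w) * pmax(abs(w) - t, 0), w)
}

hard_threshold_weights <- function(w, t) {
  if (t == -Inf) return(w)  # default: thresholding is skipped
  fallback_to_leader(w * (abs(w) > t), w)
}

w <- c(0.6, 0.3, 0.1)
soft_threshold_weights(w, 0.2)  # every weight shrinks by 0.2, floored at 0
hard_threshold_weights(w, 0.2)  # weights above 0.2 are kept unchanged
soft_threshold_weights(w, 1)    # all thresholded, so the leader gets weight 1
```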

fixed_share

Amount of fixed share to be added to the weights. Defaults to 0. A value of 1 leads to uniform weights.
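A common form of the fixed-share step mixes the current weights with the uniform distribution. The sketch below assumes w_new = (1 - fixed_share) * w + fixed_share / K, which reproduces the two documented edge cases (0 leaves the weights unchanged, 1 gives uniform weights); the helper name and the exact formula are assumptions.

```r
# Hypothetical helper: convex combination of w and uniform weights.
apply_fixed_share <- function(w, fixed_share) {
  K <- length(w)  # number of experts
  (1 - fixed_share) * w + fixed_share / K
}

w <- c(0.9, 0.1)
apply_fixed_share(w, 0)    # weights unchanged
apply_fixed_share(w, 0.2)  # 20% of the mass is redistributed uniformly
apply_fixed_share(w, 1)    # uniform weights
```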

p_smooth_lambda

Penalization parameter used in the smoothing step. -Inf causes the smoothing step to be skipped (default).

p_smooth_knot_distance

Determines the distance between the knots. Defaults to the value of basis_knot_distance. Corresponds to the grid steps when p_smooth_knot_distance_power = 1 (the default).

p_smooth_knot_distance_power

Parameter which defines the symmetry of the P-Spline basis. Takes the value of basis_knot_distance_power if unspecified.

p_smooth_deg

Degree of the B-Spline basis functions. Defaults to the value of basis_deg.

p_smooth_ndiff

Degree of the differencing operator in the smoothing equation. 1.5 (default) leads to shrinkage towards a constant. Can take values from 1 to 2. If a value in between is used, a weighted sum of the first and second differentiation matrix is calculated.
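To illustrate the "weighted sum" for non-integer values, the sketch below builds first- and second-order difference matrices with base R and combines their penalties. The weights (2 - ndiff) and (ndiff - 1) and the helper name penalty_matrix are assumptions chosen for illustration; the package may combine the matrices differently.

```r
# Hypothetical helper: mixed difference penalty for n basis coefficients.
penalty_matrix <- function(n, ndiff) {
  D1 <- diff(diag(n), differences = 1)  # (n - 1) x n first differences
  D2 <- diff(diag(n), differences = 2)  # (n - 2) x n second differences
  # assumed weighting: ndiff = 1 uses only D1, ndiff = 2 uses only D2
  (2 - ndiff) * crossprod(D1) + (ndiff - 1) * crossprod(D2)
}

pen <- penalty_matrix(n = 5, ndiff = 1.5)
pen %*% rep(1, 5)  # constant coefficients are not penalized
pen %*% (1:5)      # a linear trend still is, through the first-difference part
```

Under this weighting, any ndiff below 2 leaves only constant coefficient vectors unpenalized, which is in line with shrinkage towards a constant.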

gamma

Scaling parameter for the learning rate.

parametergrid_max_combinations

Integer specifying the maximum number of parameter combinations that should be considered. If the number of possible combinations exceeds this threshold, the maximum allowed number is randomly sampled. Defaults to 100.
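The capping behaviour described above can be mimicked in base R. This is only an illustration of sampling from a full grid, not the package's internal code; the candidate values are made up.

```r
set.seed(1)  # only to make the sketch reproducible

# 8 * 4 * 4 = 128 candidate combinations (illustrative values)
combos <- expand.grid(
  p_smooth_lambda = 2^(0:7),
  gamma = c(0.25, 0.5, 1, 2),
  fixed_share = c(0, 0.01, 0.1, 0.2)
)

max_combinations <- 100
if (nrow(combos) > max_combinations) {
  keep <- sample(nrow(combos), max_combinations)  # sample rows without replacement
  combos <- combos[keep, ]
}
nrow(combos)
```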

parametergrid

User-supplied grid of parameters. Can be used if not all combinations of the input vectors should be considered. Must be a matrix with 13 columns (online) or 12 columns (batch) in the following order: basis_knot_distance, basis_knot_distance_power, basis_deg, forget_regret, soft_threshold, hard_threshold, fixed_share, p_smooth_lambda, p_smooth_knot_distance, p_smooth_knot_distance_power, p_smooth_deg, p_smooth_ndiff, gamma.
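A grid in the required column order can be assembled with expand.grid and then trimmed by hand. The sketch assumes P = 9 quantiles and uses the documented defaults for every parameter except p_smooth_lambda and gamma; the candidate values are illustrative.

```r
P <- 9  # number of quantiles, assumed as in the Examples section

grid <- expand.grid(
  basis_knot_distance = 1 / (P + 1),
  basis_knot_distance_power = 1,
  basis_deg = 1,
  forget_regret = 0,
  soft_threshold = -Inf,
  hard_threshold = -Inf,
  fixed_share = 0,
  p_smooth_lambda = c(1, 10),
  p_smooth_knot_distance = 1 / (P + 1),
  p_smooth_knot_distance_power = 1,
  p_smooth_deg = 1,
  p_smooth_ndiff = 1.5,
  gamma = c(0.5, 1)
)

# Drop one combination that should not be considered
grid <- grid[!(grid$p_smooth_lambda == 10 & grid$gamma == 0.5), ]

parametergrid <- as.matrix(grid)  # numeric matrix with 13 columns
dim(parametergrid)
```

The resulting matrix would then be passed as the parametergrid argument of online.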

forget_past_performance

Share of past performance not to be considered, resp. to be forgotten in every iteration of the algorithm when selecting the best parameter combination. Defaults to 0.

allow_quantile_crossing

Should quantile crossing be allowed? Defaults to FALSE, which means that predictions are sorted in ascending order.
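Sorting removes crossing by reordering the predicted quantiles within each observation. A minimal base R sketch on a made-up prediction matrix (rows are observations, columns are quantiles):

```r
pred <- rbind(
  c(-0.2, 0.3, 0.1),  # second and third quantile cross
  c(-1.0, 0.0, 1.0)   # already monotone
)

# Sort every row in ascending order; apply() returns the sorted rows as
# columns, so the result is transposed back.
sorted <- t(apply(pred, 1, sort))
sorted
```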

init_weights

Matrix of dimension 1xK or PxK used as starting weights. 1xK represents the constant solution with equal weights over all P, whereas specifying a PxK matrix allows different starting weights for each P.
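Both accepted shapes can be built with matrix(). The sketch assumes K = 2 experts and P = 9 quantiles as in the Examples section; the weight values are illustrative.

```r
K <- 2  # experts
P <- 9  # quantiles

# 1xK: the same starting weights for every quantile
w_const <- matrix(1 / K, nrow = 1, ncol = K)

# PxK: one row of starting weights per quantile; here the first expert
# starts with weight 0.7 everywhere (matrix() fills column by column)
w_full <- matrix(rep(c(0.7, 0.3), each = P), nrow = P, ncol = K)

dim(w_const)
dim(w_full)
```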

loss

User-specified loss array. Can also be a list with elements "loss_array" and "share", where share mixes the provided loss with the loss calculated by profoc. A share of 1 means that only the provided loss will be used. share can also be a vector of shares to consider.

regret

User-specified regret array. If specified, the regret will not be calculated by profoc. Can also be a list with elements "regret_array" and "share", where share mixes the provided regret with the regret calculated by profoc. A share of 1 means that only the provided regret will be used. share can also be a vector of shares to consider.

trace

Print a progress bar to the console? Defaults to TRUE.

Value

Returns weights and corresponding predictions.

Details

online can tune various parameters automatically based on the past loss. For this, lambda, forget, fixed_share, gamma, ndiff, deg and knot_distance can be specified as numeric vectors containing the values to consider. online will then try all possible combinations of the values provided.

Examples

if (FALSE) {
T <- 50 # Observations
N <- 2 # Experts
P <- 9 # Quantiles
prob_grid <- 1:P / (P + 1)

y <- rnorm(n = T) # Realized
experts <- array(dim = c(T, P, N)) # Predictions
for (t in 1:T) {
  experts[t, , 1] <- qnorm(prob_grid, mean = -1, sd = 1)
  experts[t, , 2] <- qnorm(prob_grid, mean = 3, sd = sqrt(4))
}

model <- online(
  y = matrix(y),
  experts = experts,
  p_smooth_lambda = 10
)

print(model)
plot(model)
autoplot(model)

new_y <- matrix(rnorm(1)) # Realized
new_experts <- experts[T, , , drop = FALSE]

# Update will update the model object, no need for new assignment
update(model, new_y = new_y, new_experts = new_experts)

# Use predict to combine new_experts, model$predictions will be extended
predict(model, new_experts = new_experts)
}