
Accuracy and training metrics

What the accuracy KPIs (BIAS, MAE, MAPE, RMSE, Accuracy) and the training KPIs (MASE, WAPE, wQL, MSE) measure, and why they are best read together.

Where to find the formulas

This page explains the purpose of each family of metrics. Formulas, reference thresholds, and numerical examples live in Reference → Forecast accuracy and Reference → Model training.

Two families, two different questions

Forecasting produces two types of evaluation that answer different questions and must not be confused:

Accuracy metrics — evaluate the forecast the human team delivered to the business (including all manual adjustments). They are the yardstick for the entire S&OP process.

Training metrics — evaluate only the quality of the underlying statistical model, before any human adjustment. They indicate whether the algorithm learned well from historical data and whether it is worth continuing to use or whether alternatives should be explored.

Mixing the two leads to wrong conclusions: a statistically good model can look poor if the team adds adjustments that worsen the final forecast, and vice versa.
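One way to keep the two families apart is to score the unadjusted model output and the delivered forecast against the same actuals. The sketch below is illustrative only: it assumes the unadjusted model forecast was stored for the same periods, uses MAE as the yardstick, and all names and numbers are hypothetical.

```python
def mae(forecast, actual):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)

# Hypothetical data for three periods of one SKU.
actual = [100, 120, 90]
model_forecast = [105, 115, 95]    # statistical model, before adjustments
final_forecast = [120, 140, 110]   # what the team delivered after adjustments

model_mae = mae(model_forecast, actual)
final_mae = mae(final_forecast, actual)

# Positive value: the adjustments made the delivered forecast worse.
adjustment_effect = final_mae - model_mae
print(model_mae, final_mae, adjustment_effect)
```

Here the model on its own is close to actual demand, while the delivered forecast is not, so blaming the algorithm would be the wrong conclusion.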

Accuracy metrics and how to read them together

BIAS: which direction we miss

BIAS measures whether the forecast tends to be systematically higher or lower than actual demand. A positive BIAS means consistent overestimation (the team forecasts more than the market asks for); a negative BIAS means underestimation (less is forecasted than actually occurs).

The critical point is that BIAS can appear close to zero even when errors are large. If one month overestimated by 500 units and the next underestimated by 500, the errors cancel and BIAS reads as zero — but the process was far from accurate. That is why BIAS is never read in isolation.
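The cancellation effect can be reproduced in a few lines. This is a minimal sketch that assumes BIAS is the mean of (forecast − actual), which matches the sign convention described above; the exact formula is in the Reference section, and the function names and demand figures are illustrative.

```python
def bias(forecast, actual):
    """Mean signed error; positive means overestimation (assumed convention)."""
    return sum(f - a for f, a in zip(forecast, actual)) / len(actual)

def mae(forecast, actual):
    """Mean absolute error, ignoring direction."""
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)

# One month overestimated by 500 units, the next underestimated by 500.
actual = [1000, 1000]
forecast = [1500, 500]

bias_value = bias(forecast, actual)
mae_value = mae(forecast, actual)
print(bias_value, mae_value)
```

BIAS comes out as zero while MAE is 500 units, which is why the two are read together.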

MAE / MAPE: how much we miss

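MAE (Mean Absolute Error) and MAPE (Mean Absolute Percentage Error) measure the magnitude of the error regardless of direction. Where BIAS indicates direction, MAE and MAPE indicate size: how far the forecast is from the actual value on average, whether above or below. MAE is expressed in units of demand; MAPE is expressed as a percentage of actual demand.

The combined BIAS + MAE reading is the basic diagnostic rule. When both are expressed in the same units, the absolute value of BIAS can never exceed MAE, so only three combinations occur: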

  • Low BIAS + low MAE → accurate forecast with no systematic bias.
  • Low BIAS + high MAE → errors are large but cancel each other out; high variability without a clear directional bias.
  • High BIAS + high MAE → there is a systematic problem of both direction and magnitude; the model or the adjustments consistently point in the wrong direction.
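The rule above can be sketched as a small classifier. The thresholds are hypothetical (5 % of average demand for BIAS, 20 % for MAE) and are not the product's reference thresholds, which live in the Reference section; the function name and labels are also illustrative.

```python
def diagnose(forecast, actual, bias_limit=0.05, mae_limit=0.20):
    """Classify a forecast using BIAS and MAE relative to average demand.

    bias_limit and mae_limit are assumed cut-offs, not official thresholds.
    """
    n = len(actual)
    mean_demand = sum(actual) / n
    errors = [f - a for f, a in zip(forecast, actual)]
    rel_bias = abs(sum(errors) / n) / mean_demand
    rel_mae = (sum(abs(e) for e in errors) / n) / mean_demand

    low_bias = rel_bias <= bias_limit
    low_mae = rel_mae <= mae_limit
    if low_bias and low_mae:
        return "accurate, no systematic bias"
    if low_bias:
        return "large errors that cancel out"
    if not low_mae:
        return "systematic problem of direction and magnitude"
    # Falls between the two assumed cut-offs; not one of the three cases above.
    return "moderate error with a consistent direction"

stable = diagnose([1000, 1010], [1000, 1000])
cancelling = diagnose([1500, 500], [1000, 1000])
one_sided = diagnose([1500, 1400], [1000, 1000])
print(stable, cancelling, one_sided, sep="\n")
```

Because the two cut-offs differ, a forecast can land between them; the last return value flags that case for manual review rather than forcing it into one of the three categories.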

RMSE: large errors matter more

RMSE (Root Mean Squared Error) penalizes large errors disproportionately. Because errors are squared before they are averaged, an error of 100 units weighs four times as much as an error of 50. This makes RMSE useful when error spikes have severe operational consequences (a stockout of an A product during peak season costs far more than the average error suggests). If RMSE is substantially larger than MAE, a few outlier errors are driving the result and deserve individual review.
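The gap between RMSE and MAE can be seen by comparing two error series with the same MAE. The sketch below uses the standard definitions of both metrics; the error values are made up for illustration.

```python
import math

def mae_of(errors):
    """Mean of the absolute errors."""
    return sum(abs(e) for e in errors) / len(errors)

def rmse_of(errors):
    """Square root of the mean squared error."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

steady_errors = [50, -50, 50, -50]   # every period misses by the same amount
spiky_errors = [10, -10, 10, 170]    # same MAE, but one large spike

steady_mae, steady_rmse = mae_of(steady_errors), rmse_of(steady_errors)
spiky_mae, spiky_rmse = mae_of(spiky_errors), rmse_of(spiky_errors)

print(steady_mae, steady_rmse)
print(spiky_mae, round(spiky_rmse, 1))
```

Both series have an MAE of 50 units, but only the series with the spike shows an RMSE well above its MAE.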

Accuracy: the business metric

Accuracy expresses precision as a positive percentage (100 % − MAPE, simplified). It is the easiest metric to communicate to non-technical audiences: "the forecast was 85 % accurate this month." Its limitation is that it aggregates everything into a single number and can hide problems in specific segments of the portfolio.
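A short sketch shows how a single Accuracy figure can hide a weak segment. It uses the simplified definition given above (100 − MAPE) and assumes every actual value is greater than zero; the segments and figures are hypothetical.

```python
def accuracy_pct(forecast, actual):
    """Simplified Accuracy: 100 minus MAPE. Assumes all actuals are > 0."""
    ape = [abs(f - a) / a for f, a in zip(forecast, actual)]
    return 100 - 100 * sum(ape) / len(ape)

# Hypothetical portfolio: three well-forecast SKUs and one poorly forecast SKU.
segment_a = {"forecast": [105, 95, 105], "actual": [100, 100, 100]}
segment_b = {"forecast": [145], "actual": [100]}

overall = accuracy_pct(
    segment_a["forecast"] + segment_b["forecast"],
    segment_a["actual"] + segment_b["actual"],
)
only_a = accuracy_pct(segment_a["forecast"], segment_a["actual"])
only_b = accuracy_pct(segment_b["forecast"], segment_b["actual"])

print(round(overall, 1), round(only_a, 1), round(only_b, 1))
```

The portfolio reads as roughly 85 % accurate, while segment B on its own is only about 55 % accurate.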

Training metrics

Training metrics (MASE, WAPE, wQL, MSE) are internal to the modeling process. They reflect how well the algorithm learned from the history, comparing its performance against simple benchmarks or measuring fit on the validation set.

Their primary use is technical: comparing candidate models during per-SKU selection and detecting overfitting. When the training MASE is good but operational accuracy is poor, the cause is usually manual adjustments or market changes that occurred after training, not the model itself.
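As an illustration of "comparing against simple benchmarks", the sketch below computes MASE and WAPE using their commonly published definitions: MASE divides the model's MAE on the validation set by the MAE of a one-step naive forecast on the training history, and WAPE divides the sum of absolute errors by the sum of absolute actuals. Whether the engine uses a seasonal or non-seasonal naive benchmark is not stated here, so the non-seasonal version is an assumption, and the data is made up.

```python
def mase(history, actual, forecast):
    """MAE on the validation set scaled by the naive in-sample MAE.

    The naive benchmark repeats the previous observation (non-seasonal,
    which is an assumption of this sketch).
    """
    naive_mae = sum(
        abs(curr - prev) for prev, curr in zip(history, history[1:])
    ) / (len(history) - 1)
    model_mae = sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)
    return model_mae / naive_mae

def wape(actual, forecast):
    """Sum of absolute errors divided by the sum of absolute actuals."""
    total_error = sum(abs(f - a) for f, a in zip(forecast, actual))
    return total_error / sum(abs(a) for a in actual)

history = [100, 110, 105, 115]      # training periods
validation_actual = [120, 118]      # held-out periods
validation_forecast = [117, 121]    # model output for the held-out periods

mase_value = mase(history, validation_actual, validation_forecast)
wape_value = wape(validation_actual, validation_forecast)
print(round(mase_value, 2), round(wape_value, 4))
```

A MASE below 1 means the model beat the naive benchmark on this data; a value above 1 would suggest the simpler benchmark is doing at least as well.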

TODO: link to the engine technical configuration guide when available.

Why no single metric is enough on its own

If you only looked at… | You would miss…
BIAS | Large errors that cancel each other out
MAE/MAPE | Whether the error has a systematic direction
RMSE | The absolute average magnitude
Accuracy | Problems in portfolio segments
Training metrics | Whether human adjustments improve or worsen the final outcome

AInventory's metrics panel presents the families together precisely so that none is read in isolation.

The terms BIAS, MAE, MAPE, RMSE, Accuracy, MASE, wQL, and WAPE have entries in the glossary.