The four-level statistical cascade

How each SKU is assigned a statistical method suited to the amount of history it has, through a four-level cascade that migrates automatically to more precise methods as data accumulates.

The central principle

Statistics is not neutral with respect to sample size. A product with three weeks of sales and one with three years do not deserve the same treatment: the former carries high parametric uncertainty (we do not know well which distribution its demand follows), while the latter has enough data to let the data speak without distributional assumptions.

The four-level statistical cascade formalizes that intuition: the most appropriate method for each SKU is determined automatically from the number of rows n available in the historical error matrix, and the system migrates to the next level without manual intervention as the product accumulates sales.

The underlying principle is simple: the more data, the fewer assumptions.

The four levels

The cut-off variable is n = number of observations in the SKU's historical error matrix.
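The mapping from n to a level can be sketched as a small function. This is a minimal illustration, not the system's actual code: the names `Level`, `select_level`, `moderate_min`, and `mature_min` are hypothetical, and the cut-offs are passed in explicitly because the default values are not stated here.

```python
from enum import Enum


class Level(Enum):
    NO_HISTORY = "no_history"
    LITTLE_HISTORY = "little_history"
    MODERATE_HISTORY = "moderate_history"
    MATURE = "mature"


def select_level(n: int, moderate_min: int, mature_min: int) -> Level:
    """Map the number of error-matrix rows n to a cascade level.

    moderate_min and mature_min are hypothetical names for the
    configurable cut-offs; their real defaults are not assumed here.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    if not 1 <= moderate_min < mature_min:
        raise ValueError("expected 1 <= moderate_min < mature_min")
    if n == 0:
        return Level.NO_HISTORY        # no policy, BLOCKING warning
    if n < moderate_min:
        return Level.LITTLE_HISTORY    # parametric bootstrap, MoM
    if n < mature_min:
        return Level.MODERATE_HISTORY  # parametric bootstrap, MLE
    return Level.MATURE                # non-parametric bootstrap
```

Because the level is recomputed from n each time, a SKU moves to the next level as soon as its error matrix grows past a cut-off, with no manual step.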

No history

The system computes no policy for this SKU. Instead, it emits a BLOCKING warning and marks it for planner review. The reason is that, with zero error rows, any number the model produced would be pure invention with no statistical or empirical backing.

Design decision

Emitting a blocking warning — rather than silently recommending a default value — is a deliberate decision of honesty about uncertainty. A fabricated recommendation is more dangerous than an acknowledged absence.

Little history

With few observations, the system uses a parametric bootstrap whose parameters are estimated by the Method of Moments (MoM). It fits three candidates (Normal, Gamma, and Lognormal) by MoM and chooses the distribution with the lowest AIC (Akaike Information Criterion):

distribution* = argmin_{d ∈ {Normal, Gamma, Lognormal}} AIC(d)

MoM on small samples is less efficient than MLE, but MLE is unstable with little history (variance estimates can collapse or diverge). AIC penalizes model complexity, so with little data the most parsimonious distribution tends to win.
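The selection step can be sketched with the standard library alone. This is an illustration under stated assumptions, not the system's implementation: the function names are hypothetical, the observations are assumed to be strictly positive whenever Gamma or Lognormal is considered (otherwise only Normal is scored), and each candidate is used in its two-parameter form. Under that last assumption the penalty term 2k is the same for all three, so in this sketch the ranking is driven by the log-likelihood.

```python
import math
from statistics import fmean, pvariance


def mom_aic(x: list[float]) -> dict[str, float]:
    """Fit Normal, Gamma and Lognormal by Method of Moments; return AIC per fit."""
    m, v = fmean(x), pvariance(x)
    if v <= 0:
        raise ValueError("need at least two distinct observations")
    n = len(x)
    k = 2  # assumption: every candidate has two free parameters

    # Normal: the mean and variance are the first two moments themselves.
    loglik = {
        "normal": -0.5 * n * math.log(2 * math.pi * v)
        - sum((xi - m) ** 2 for xi in x) / (2 * v)
    }

    if min(x) > 0:  # Gamma and Lognormal need strictly positive support
        # Gamma: shape = m^2 / v, scale = v / m.
        shape, scale = m * m / v, v / m
        loglik["gamma"] = sum(
            (shape - 1) * math.log(xi) - xi / scale for xi in x
        ) - n * (shape * math.log(scale) + math.lgamma(shape))

        # Lognormal: s2 = ln(1 + v / m^2), mu = ln(m) - s2 / 2.
        s2 = math.log(1 + v / (m * m))
        mu = math.log(m) - s2 / 2
        loglik["lognormal"] = sum(
            -math.log(xi) - (math.log(xi) - mu) ** 2 / (2 * s2) for xi in x
        ) - 0.5 * n * math.log(2 * math.pi * s2)

    # AIC = 2k - 2 ln L for each candidate.
    return {name: 2 * k - 2 * ll for name, ll in loglik.items()}


def choose_distribution(x: list[float]) -> str:
    """Return the name of the candidate with the lowest AIC."""
    aic = mom_aic(x)
    return min(aic, key=aic.get)
```

The chosen distribution and its MoM parameters would then feed the parametric bootstrap.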

Moderate history

With moderate history, Maximum Likelihood Estimation (MLE) becomes stable (Lawless, 2003). The workflow adds a normality test:

  1. The Shapiro-Wilk test is applied to the historical errors.
  2. If normality is not rejected (p ≥ α, with α typically 0.05), the Normal distribution is adopted.
  3. If normality is rejected, Gamma and Lognormal are both fitted by MLE and the one with the lower AIC is adopted.

The Shapiro-Wilk test in this n-range has good power to detect severe deviations from normality without being overly sensitive to sample noise. The system emits an INFO warning when the test rejects, so the planner knows which distribution was used and why.
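The three steps above can be sketched with SciPy. This is a minimal illustration under stated assumptions, not the system's code: the function name is hypothetical, the location parameter of Gamma and Lognormal is pinned at zero (so each has two free parameters), and the data are assumed to be strictly positive whenever those two candidates are fitted.

```python
import logging

import numpy as np
from scipy import stats

logger = logging.getLogger(__name__)


def fit_moderate_history(errors, alpha=0.05):
    """Return (distribution name, fitted parameters, Shapiro-Wilk p-value)."""
    x = np.asarray(errors, dtype=float)
    _, p_value = stats.shapiro(x)
    if p_value >= alpha:
        # Normality not rejected: adopt the Normal, fitted by MLE.
        return "normal", stats.norm.fit(x), p_value

    if x.min() <= 0:
        raise ValueError("Gamma and Lognormal need strictly positive data")

    # Normality rejected: compare Gamma and Lognormal by AIC = 2k - 2 ln L,
    # with k = 2 because floc=0 fixes the location parameter.
    candidates = {"gamma": stats.gamma, "lognormal": stats.lognorm}
    best = None
    for name, dist in candidates.items():
        params = dist.fit(x, floc=0)  # MLE with location fixed at zero
        aic = 2 * 2 - 2 * dist.logpdf(x, *params).sum()
        if best is None or aic < best[0]:
            best = (aic, name, params)

    logger.info("Shapiro-Wilk rejected normality (p=%.4f); using %s",
                p_value, best[1])
    return best[1], best[2], p_value
```

Returning the p-value alongside the choice gives the planner the same information the INFO warning carries.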

Mature product (broad history)

With broad history, the system drops all distributional assumptions and uses non-parametric bootstrap of full rows: it resamples with replacement directly from the historical error matrix, preserving the within-horizon correlation structure. The academic basis is Efron & Tibshirani (1993, §6): the empirical coverage of non-parametric bootstrap quantiles converges to nominal coverage with sufficient history.

The result is that the Monte Carlo simulation at this level captures asymmetries, heavy tails, and any unusual demand behavior without forcing it to fit a Gaussian bell curve.
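Row-wise resampling can be sketched with NumPy. The function name is hypothetical, and the layout is an assumption: each row of the error matrix is taken to hold the errors of one historical forecast across all horizons, so drawing whole rows keeps those errors together.

```python
import numpy as np


def bootstrap_rows(error_matrix, n_draws, seed=None):
    """Resample whole rows with replacement from the historical error matrix."""
    errors = np.asarray(error_matrix, dtype=float)
    if errors.ndim != 2 or errors.shape[0] == 0:
        raise ValueError("expected a non-empty 2-D matrix")
    rng = np.random.default_rng(seed)
    # One random row index per draw; indices may repeat (with replacement).
    idx = rng.integers(0, errors.shape[0], size=n_draws)
    # Fancy indexing copies entire rows, so every draw keeps the errors
    # of all horizons exactly as they occurred together in history.
    return errors[idx]
```

Quantiles are then read directly from the draws, for example `np.quantile(draws.sum(axis=1), 0.95)` for the cumulative error over the horizon. Resampling each column independently would destroy the dependence between horizons that this approach preserves.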

Why automatic migration matters

Traditional systems assign a fixed method to all SKUs — usually the Normal distribution, which is the most mathematically convenient. The problem is that this choice penalizes the extremes: with little history, Normal overstates confidence in the parameters; with lots of history, it discards real information about demand tails.

The cascade avoids both failure modes: it never uses a more sophisticated method than the data can support, and never uses a simpler one than the data deserves.

The thresholds between levels are configurable per instance. The default values follow the cited references, but markets with very stable demand patterns can lower the threshold for the mature-product level, while highly volatile categories may want to require more data before trusting the non-parametric bootstrap.
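Per-instance configuration might look like the sketch below. The class and field names are hypothetical, and the numeric values are purely illustrative; they are not the system's defaults.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CascadeThresholds:
    """Hypothetical per-instance cut-offs on n (illustrative only)."""
    moderate_min: int  # first n treated as moderate history
    mature_min: int    # first n treated as a mature product

    def __post_init__(self):
        if not 1 <= self.moderate_min < self.mature_min:
            raise ValueError("expected 1 <= moderate_min < mature_min")


# Illustrative values only, not taken from the cited references.
base = CascadeThresholds(moderate_min=30, mature_min=100)
stable_market = replace(base, mature_min=80)       # trust the bootstrap sooner
volatile_category = replace(base, mature_min=150)  # require more data first
```

Validating the ordering at construction time keeps an instance from being configured with levels that overlap or vanish.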

Summary

| Level | History | Method | Basis |
|---|---|---|---|
| No history | No data | No computation; BLOCKING warning | |
| Little history | Little history | Parametric bootstrap, MoM, argmin AIC | Parsimony with small sample |
| Moderate history | Moderate history | Parametric bootstrap, MLE + Shapiro-Wilk | Lawless (2003) |
| Mature product | Broad history | Non-parametric bootstrap (full rows) | Efron & Tibshirani (1993) §6 |