Forecasting layer¶
This page covers the v1 forecasting surface (rolling-ridge). A second-generation forecast layer with tree / NN predictors is planned and not yet shipped.
A rolling-fit linear predictor (RollingRidge) exposed to .qe
configs as predict_return(...). The motivation is to start
data-driven strategy research from a known baseline — ridge
regression — that any later ML approach has to beat to justify
its complexity.
Quick start¶
backtest(
    data = yahoo("SPY", "1d", "2018-01-01", "2024-12-31"),
    strategy = signal(
        entry = predict_return(
            252, 1.0,
            lag_return(close, 1),
            lag_return(close, 5),
            rolling_zscore(close, 20),
            rsi(close, 14) / 100.0
        ) > 0.001,
        exit = predict_return(
            252, 1.0,
            lag_return(close, 1),
            lag_return(close, 5),
            rolling_zscore(close, 20),
            rsi(close, 14) / 100.0
        ) < 0,
        symbol = "SPY",
    ),
    execution = execution(capital = 100_000),
    output = output(results = "out/forecast.json"),
)
What this does, per bar:
- Compute the four feature values from the bar's close.
- If a prior bar's features are stashed, push (features_{t-1}, return_t) into the ridge (keeping the training pair causal) and refit.
- Predict return_{t+1} from this bar's features.
- Compare the prediction to the entry / exit thresholds, which becomes the strategy's signal for the next bar's fill.
The first lookback + feature_warmup bars produce NaN forecasts; the
strategy stays flat through warm-up.
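The per-bar steps above can be sketched in plain Python. This is a minimal illustration, not the shipped solver: the class name RollingRidgeSketch, the unpenalized intercept (handled by centering), and the choice to wait for a full window before the first fit are all assumptions made for the example.

```python
import math
from collections import deque


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


class RollingRidgeSketch:
    """Hypothetical stand-in for RollingRidge; not the shipped solver."""

    def __init__(self, lookback, alpha):
        self.alpha = alpha
        self.window = deque(maxlen=lookback)  # (features_{t-1}, return_t) pairs
        self.prev = None                      # features stashed from the prior bar
        self.coef = None                      # (intercept, weights) after first fit

    def _fit(self):
        xs = [x for x, _ in self.window]
        ys = [y for _, y in self.window]
        n, k = len(xs), len(xs[0])
        xbar = [sum(x[j] for x in xs) / n for j in range(k)]
        ybar = sum(ys) / n
        # Centered normal equations: (Xc'Xc + alpha*I) w = Xc'yc.
        gram = [[sum((x[i] - xbar[i]) * (x[j] - xbar[j]) for x in xs)
                 + (self.alpha if i == j else 0.0) for j in range(k)]
                for i in range(k)]
        rhs = [sum((x[j] - xbar[j]) * (y - ybar) for x, y in self.window)
               for j in range(k)]
        w = solve(gram, rhs)
        self.coef = (ybar - sum(wj * xj for wj, xj in zip(w, xbar)), w)

    def step(self, features, realized_return):
        """Process one bar; realized_return is the return that closed on this bar."""
        if any(math.isnan(f) for f in features):
            self.prev = None          # break the causal chain for one bar
            return math.nan
        if self.prev is not None and not math.isnan(realized_return):
            self.window.append((self.prev, realized_return))
            if len(self.window) == self.window.maxlen:
                self._fit()           # refit on the latest full window
        self.prev = list(features)
        if self.coef is None:
            return math.nan           # still warming up
        b, w = self.coef
        return b + sum(wj * fj for wj, fj in zip(w, features))
```

Calling step(features, realized_return) once per bar returns NaN for the first lookback bars and a next-bar forecast afterwards. The training pair is always the previous bar's features with the current bar's return, so the model never sees a target that was not yet known when its features were computed.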
API¶
predict_return(lookback, alpha, f1, f2, ..., fN)¶
| Arg | Type | Constraints |
|---|---|---|
| lookback | int literal | > 0, >= n_features + 1 |
| alpha | double literal | >= 0 (use 0 for plain OLS) |
| f1..fN | signal expr | at least one; up to 64 in v1 |
Returns a double — the next-bar forecast. Warm-up returns NaN; any
feature being NaN this bar also returns NaN and breaks the causal
training chain for one bar (better than corrupting the ring with
garbage). Each call site allocates an independent RollingRidge
instance; an entry and exit that both call predict_return(...)
do not share state in v1 — they train in parallel on the same data
and converge to similar coefficients.
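The argument constraints in the table can be restated as a small checker. This is a sketch only: check_predict_return_args and MAX_FEATURES_V1 are hypothetical names for illustration, and the real binder's error messages and ordering are not documented here.

```python
MAX_FEATURES_V1 = 64  # "up to 64 in v1" per the constraints table


def check_predict_return_args(lookback, alpha, n_features):
    """Return a list of constraint violations; an empty list means accepted."""
    errors = []
    if not 1 <= n_features <= MAX_FEATURES_V1:
        errors.append(f"need 1..{MAX_FEATURES_V1} features, got {n_features}")
    if lookback <= 0:
        errors.append("lookback must be > 0")
    elif lookback < n_features + 1:
        errors.append(f"lookback must be >= n_features + 1 = {n_features + 1}")
    if alpha < 0:
        errors.append("alpha must be >= 0")
    return errors
```

For the quick-start call (lookback 252, alpha 1.0, four features) the list comes back empty; a lookback of 4 with the same four features would be rejected because the intercept adds a fifth coefficient.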
Feature primitives¶
| Function | Output |
|---|---|
| lag_return(price, k) | (p_t - p_{t-k}) / p_{t-k}; NaN for the first k bars |
| rolling_vol(x, n) | n-window sample std; NaN for the first n-1 bars |
| rolling_zscore(x, n) | (x_t - mean_n) / std_n; 0 on constants |
All three are stateful per call site. lag_return costs O(1) per push
and the rolling stats cost O(n) per push, which is well under the
per-bar budget for any reasonable window.
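A rough Python rendering of the three primitives, matching the formulas and warm-up counts in the table. The class names are hypothetical, one instance stands for one call site, and two details are assumptions: rolling_zscore is taken to use the same sample std as rolling_vol, and to return NaN until its window is full.

```python
import math
from collections import deque


class LagReturn:
    """(p_t - p_{t-k}) / p_{t-k}; NaN until k earlier prices exist."""

    def __init__(self, k):
        self.k = k
        self.buf = deque(maxlen=k + 1)

    def push(self, price):
        self.buf.append(price)
        if len(self.buf) <= self.k:
            return math.nan
        old = self.buf[0]             # price from k bars ago
        return (price - old) / old    # O(1) per push


class RollingStats:
    """n-window mean and sample std, recomputed in O(n) per push.

    Use one instance per call site and call either vol() or zscore() on it,
    since each call pushes the new value into the window.
    """

    def __init__(self, n):
        self.buf = deque(maxlen=n)    # requires n >= 2 for a sample std

    def _mean_std(self, x):
        self.buf.append(x)
        n = self.buf.maxlen
        if len(self.buf) < n:
            return math.nan, math.nan
        mean = sum(self.buf) / n
        var = sum((v - mean) ** 2 for v in self.buf) / (n - 1)
        return mean, math.sqrt(var)

    def vol(self, x):
        return self._mean_std(x)[1]

    def zscore(self, x):
        mean, std = self._mean_std(x)
        if math.isnan(std):
            return math.nan           # assumed warm-up behaviour
        return 0.0 if std == 0.0 else (x - mean) / std
```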
Walk-forward validation¶
The forecasting layer composes with the walk_forward(...)
harness. Wrap a backtest(...) that uses predict_return in
walk-forward windows; each test slice reports OOS metrics that are
not in the predictor's training set:
walk_forward(
    base = backtest(
        data = yahoo("SPY", "1d", "2018-01-01", "2024-12-31"),
        strategy = signal(
            entry = predict_return(60, 1.0, lag_return(close, 1)) > 0.001,
            exit = predict_return(60, 1.0, lag_return(close, 1)) < 0.0,
            symbol = "SPY",
        ),
        execution = execution(capital = 100_000),
    ),
    train_window = "365d",
    test_window = "90d",
    step_window = "90d",
)
Read walk_forward.oos_metrics in the resulting results.json: if
in-sample Sharpe is high but OOS Sharpe sits near zero, the predictor
is fitting noise. This is the canonical overfit smell test the
forecasting layer is designed to support.
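To make the window arithmetic concrete, the sketch below enumerates train/test slices for the 365d / 90d / 90d settings used above. It assumes a rolling (not expanding) train window, half-open date ranges, and that windows which do not fit entirely before the end date are dropped; the harness may differ on any of these, and walk_forward_windows is a hypothetical helper name.

```python
from datetime import date, timedelta


def walk_forward_windows(start, end, train_days, test_days, step_days):
    """Yield (train_start, train_end, test_end); the test slice is [train_end, test_end)."""
    train = timedelta(days=train_days)
    test = timedelta(days=test_days)
    step = timedelta(days=step_days)
    cursor = start
    while cursor + train + test <= end:
        train_end = cursor + train
        yield cursor, train_end, train_end + test
        cursor += step


windows = list(walk_forward_windows(date(2018, 1, 1), date(2024, 12, 31), 365, 90, 90))
```

Under these assumptions the first slice trains on 2018-01-01 to 2019-01-01 and tests on the following 90 days. Because step_window equals test_window, consecutive test slices tile the calendar without overlapping.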
Inspecting forecasts in the dashboard¶
Opt in by setting record_forecast = true on execution(...):
qe_run then drains the per-bar (y_hat, y_realized) trace from
every predict_return(...) call site after the run and appends a
forecasts[] block to results.json:
"forecasts": [
  {
    "id": "entry#0",
    "lookback": 60,
    "alpha": 1.0,
    "n_features": 4,
    "metrics": {
      "rmse": 0.0142,
      "directional_accuracy": 0.51,
      "r2_vs_naive": 0.03,
      "n": 243
    },
    "series": [
      { "ts_ns": ..., "bar_index": ..., "y_hat": ..., "y_realized": ... },
      ...
    ]
  }
]
The F4 BCKT screen's bottom-right slot detects the block and switches
into a scatter view: y_hat on x, y_realized on y, with a 45°
identity line and a y=0 reference. The header badge restates
n / directional accuracy / RMSE / R² vs naïve. Points above the
identity line are bars where the model under-predicted the realized
return; the top-right and bottom-left quadrants are direction-correct.
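The badge metrics can be recomputed from a series[] trace along these lines. The exact definitions used by qe_run are not spelled out on this page, so the sketch makes three assumptions: the naive benchmark forecasts a zero return, a bar counts as direction-correct only when y_hat and y_realized share a strict sign, and NaN warm-up bars are skipped. It also expects at least one usable pair and a non-zero realized return somewhere in the trace.

```python
import math


def forecast_metrics(y_hat, y_realized):
    """Summarize a forecast trace; forecast_metrics is a hypothetical helper."""
    pairs = [(p, r) for p, r in zip(y_hat, y_realized)
             if not (math.isnan(p) or math.isnan(r))]   # skip warm-up bars
    n = len(pairs)
    sse_model = sum((r - p) ** 2 for p, r in pairs)
    sse_naive = sum(r ** 2 for _, r in pairs)           # naive forecast: zero return
    hits = sum(1 for p, r in pairs if p * r > 0)        # same sign; zeros count as misses
    return {
        "rmse": math.sqrt(sse_model / n),
        "directional_accuracy": hits / n,
        "r2_vs_naive": 1.0 - sse_model / sse_naive,
        "n": n,
    }
```

Under this definition a positive r2_vs_naive means the model's squared error is lower than that of always forecasting zero.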
Cost note: each trace point is 24 bytes on disk and a few hundred
bytes per render. For 60-bar smoke runs this is free; for 1M-bar
minute runs you'll add ~25 MB to results.json. Leave the flag off
unless you actually want the overlay.
Limitations of the v1 overlay:
- Only the first call site is rendered (typical use: entry#0 and
exit#0 are identical predictors, so the picture is the same). A
selector across call sites is a follow-up.
- walk_forward(...) doesn't currently drain predictor traces
through its window-stitched runner — the overlay shows up on
single-pass backtests only.
Limitations (v1)¶
- Target is hardcoded to next-bar return. To predict other targets (volatility, multi-bar return, direction probability), v2 will introduce a target = ... arg.
- Method is ridge-only. alpha = 0 recovers OLS via the Tikhonov fallback. Lasso, Bayesian regression, and tree-based models (XGBoost, LightGBM) are planned as separate predict_* builtins.
- No cross-sectional models. The predictor takes per-bar features from one symbol; cross-asset work (e.g. residualizing on a market factor) needs a future multi-symbol predict primitive.
- No shared state across signal positions. An entry and exit expression that both call predict_return(...) instantiate independent ridges. Practical impact: ~2× compute for the same fit; no correctness issue.
- No live coefficient inspection from the dashboard yet. The F4 BCKT screen renders the OOS curve when walk-forward is used and the forecast-vs-realized scatter on single-pass runs with record_forecast = true; neither view exposes the fitted coefficients.
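For reference, the textbook ridge estimator behind the alpha bullet above, with X the matrix of the last lookback feature rows and y the matching next-bar returns. Whether RollingRidge penalizes the intercept, and how its alpha = 0 fallback is implemented, is not stated on this page, so read this as the standard formulation rather than a description of the shipped solver. The second equality holds only when X^T X is invertible.

```latex
\hat{\beta}_{\text{ridge}} = \left(X^\top X + \alpha I\right)^{-1} X^\top y,
\qquad
\hat{\beta}_{\text{ridge}}\Big|_{\alpha = 0} = \left(X^\top X\right)^{-1} X^\top y = \hat{\beta}_{\text{OLS}}
```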
Implementation pointers¶
| Concern | File |
|---|---|
| Closed-form ridge solver | include/qe/forecast/rolling_ridge.hpp |
| Feature indicators | include/qe/indicators/lag_return.hpp, include/qe/indicators/rolling_stats.hpp |
| DSL builtin registration | src/dsl/env.cpp |
| Per-bar dispatch | src/dsl/evaluator.cpp (case PredictReturn) |
| Warmup math | src/dsl/analysis.cpp |
| End-to-end smoke | tests/fixtures/forecast_smoke.qe |
Pitfalls¶
- lookback too small: with 5 features + intercept = 6 coefficients, lookback = 6 is the minimum, but the design matrix will be rank-deficient on any collinear inputs; bump to at least 4 × n_features for sane fits. The binder rejects lookback < n_features + 1 to catch the obviously broken case.
- Highly correlated features: the ridge is robust against collinearity at the LDLT level, but the learned slopes become noisy. Prefer a small set of orthogonal-ish features (lag_return at different horizons, a z-score, RSI normalized).
- Refit cost on minute bars: at lookback 1000 and 5 features RollingRidge::fit() runs in ~250 µs. For 1M bars that's about 4 minutes wall time of pure refit: acceptable for one-shot research, painful inside a sweep × walk-forward grid. Drop the lookback or thin the features when that bites.
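The sizing rules in the pitfalls above reduce to a few lines of arithmetic. The helper names are hypothetical, the per-fit time must come from your own measurement (the ~250 µs figure applies to lookback 1000 with 5 features), and n_runs stands in for however many backtests a sweep or walk-forward grid ends up launching.

```python
def min_lookback(n_features):
    """Hard floor enforced by the binder: one row per coefficient incl. intercept."""
    return n_features + 1


def recommended_lookback(n_features):
    """Rule of thumb from the pitfalls list: at least 4 x n_features."""
    return max(min_lookback(n_features), 4 * n_features)


def refit_wall_seconds(n_bars, fit_seconds, n_runs=1):
    """Pure refit time: one fit per bar, repeated for every run in a grid."""
    return n_bars * fit_seconds * n_runs


single_run = refit_wall_seconds(1_000_000, 250e-6)   # about 250 s, a bit over 4 minutes
```

Multiplying by n_runs shows why the same predictor that is fine for one-shot research becomes the bottleneck once a parameter sweep is layered on top.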