Abandoned · LightGBM · PostgreSQL · Recursive Forecasting

Ride Whisperer v6
LightGBM + Live Inference

An earlier iteration of the wait-time prediction engine that took a fundamentally different approach from v8 Prophet: a single global LightGBM model trained on all attractions, a live PostgreSQL backend for real-time wait data, and a recursive multi-step forecasting loop where each predicted value is fed back as a lag feature for the next interval. Seven prediction horizons from 15 minutes to 3 hours.

v8 (Prophet) ↗ ← All Projects

⚠️

Abandoned before v8. This version existed alongside and before the Prophet experiments. The recursive forecasting loop was a clever partial solution to the inference-time lag problem — but predicted wait times used as lag features accumulate error with each step. By the 3-hour horizon, lag features were almost entirely model-generated rather than observed. The approach also required a live PostgreSQL feed of wait times at inference time, which added infrastructure dependency.

v6 — This Version

ModelLightGBM (GBDT)

ScopeGlobal (all rides, one model)

Inference dataLive PostgreSQL

Lag strategyRecursive (predicted → re-feed)

Horizon15 min → 3 hr (7 intervals)

Weather sourceMETAR/ASOS (KISM)

v8 — Prophet (Later)

ModelFacebook Prophet

ScopePer-ride (N separate models)

Inference dataBigQuery (batch)

Lag strategyImputed to zero (broken)

Horizon24 hr (hourly)

Weather sourceNWS forecast API → BigQuery

Iteration

Global Model

Horizons

3hr

Max Horizon

6hr

Lookback Window

WDW Parks

The Key Innovation

Recursive Multi-Step Forecasting Loop

The fundamental challenge with any lag-based wait time model is: at t+60 min, the lag features for t+15 and t+30 don't exist yet. v6 solved this by running predictions sequentially — each predicted value is appended to the live data buffer and used as input for the next step.

Prediction generation order per attraction

+15m

LightGBM predict
lag features from
live PostgreSQL data

↓ append to buffer

+30m

lag_15min = predicted
wait from +15m step

↓ append to buffer

+60m

rolling_mean_1hr
computed from buffer
(mix of real + predicted)

↓ append to buffer

+90m

lag_1hr = predicted
from +30m; rolling
from mixed buffer

↓ append

+3hr

All lag/rolling features
model-generated — no
observed data used

Buffer management: prediction_buffer.tail(20) — only the 20 most recent points are kept to prevent memory growth. At prediction start, the buffer is seeded with 6 hours of real PostgreSQL wait-time observations from park_live_status_logs. Each appended predicted value gradually replaces real observations as the horizon extends. At +3hr, the buffer contains almost exclusively model-generated values — error compounds across steps.

Inference Pipeline

Per-Attraction Real-Time Prediction Flow

Model Load

LightGBM model loaded from trained_wait_time_model.pkl Feature list: model_features.pkl — exact column order the model was trained on Categorical features: model_categorical_features.pkl Category levels: model_categorical_feature_categories.pkl — maps each cat column to its training-time levels to prevent unseen-category errors

Entity Filter

Entity_ID categories loaded from pkl → filter park_entities to only trained IDs Entity type filter: ATTRACTION or RIDE — shows excluded Each entity row: entity_id, name, parent_id (Park_ID), entity_type Park_ID derived from parent_id → maps to MK/HS/EP/AK prefix for schedule columns

Live Wait Buffer

park_live_status_logs PostgreSQL table — query last 6 hours per entity_id Column: queue_details → 'STANDBY' → 'waitTime' (JSONB path extraction) Output: list of {datetime, wait_time} sorted by recorded_at ASC Timestamps normalized to UTC (tzinfo-aware) 6-hour lookback provides lag_7day (zeroed), lag_24hr (zeroed), lag_1hr, lag_30min, lag_15min

Park Context

park_schedules table → today, yesterday, tomorrow hours (Operating / Early Entry / Extended Evening) Derived: {PREFIX}HOURS (today), {PREFIX}HOURSYEST, {PREFIX}HOURSTOM — operating hour duration in float hours METAR weather from weather_data table (station KISM) — closest observation by UTC timestamp Weather fields: TEMP (°F), HUMIDITY, PRECIP_TOTAL, WIND_SPEED, WIND_GUST, PRESSURE, CONDITION, VISIBILITY, CLOUD_COVER Park_Name from parent entity lookup in park_entities School session flags defaulted to 0 if not in DB (Total_Schools_inSession, Schools_inSession_wdw, etc.)

Feature Assembly

All feature columns initialized to NaN from pkl feature list Time features re-derived from prediction timestamp: year, month, day, day_of_week, day_of_year, hour, minute, is_weekend, DAYOFWEEK, DAYOFYEAR, WEEKOFYEAR, MONTHOFYEAR, YEAR Lag features computed from sorted buffer: lag_15min, lag_30min, lag_1hr, lag_24hr, lag_7day Rolling features: rolling_mean/std/max/min over 1hr and 4hr windows from buffer Entity_ID and Park_ID as categorical features — model disambiguates between attractions Lat/lon from park_entities populated as children.location.latitude/longitude Numerical NaNs filled to 0 to match training behavior; categorical columns cast with explicit training-time category levels

Recursive Predict

model.predict(features_df) per interval [15, 30, 60, 90, 120, 150, 180 min] Output clamped: max(0, predicted_wait_time) — no negative wait times Each prediction appended to buffer before next interval's feature assembly Buffer pruned to tail(20) to limit memory per ride Result: dict of {prediction_datetime → wait_time_seconds} per entity_id

Table	Key Columns	Used For
park_live_status_logs	entity_id recorded_at queue_details JSONB	Real-time wait-time buffer — 6-hour lookback window per ride. JSONB path: `queue_details → 'STANDBY' → 'waitTime'`
park_entities	entity_id entity_type parent_id latitude longitude	Entity metadata — name, type (ATTRACTION/RIDE), park parent_id, geographic coordinates
park_schedules	entity_id date type start_time end_time	Today / yesterday / tomorrow park hours. Schedule types: Operating, Early Entry, Extended Evening Hours
weather_data	station_id observation_time temp_c precip_in wx_string	METAR/ASOS observations from KISM (Disney World weather station). Closest observation by UTC delta selected.

Feature Engineering

Feature Categories — Assembled at Inference Time

Lag Features5

wait_time_lag_15min

wait_time_lag_30min

wait_time_lag_1hr

wait_time_lag_24hr

wait_time_lag_7day

Short lags: real data. Long lags (24hr, 7day): 0 unless buffer has deep history.

Rolling Window Features6

wait_time_rolling_mean_1hr

wait_time_rolling_std_1hr

wait_time_rolling_max_1hr

wait_time_rolling_min_1hr

wait_time_rolling_mean_4hr

wait_time_rolling_std_4hr

Computed from the live buffer window — blend of real + predicted at longer horizons.

Time Decomposition13

year, month, day

hour, minute

day_of_week, is_weekend

day_of_year, DAYOFYEAR

WEEKOFYEAR, MONTHOFYEAR

YEAR, DAYOFWEEK

Derived from prediction_datetime — always available, no imputation needed.

Park Schedules8+

{PREFIX}OPEN, {PREFIX}CLOSE

{PREFIX}HOURS (today duration)

{PREFIX}HOURSYEST, {PREFIX}HOURSTOM

{PREFIX}CLOSEYEST, {PREFIX}OPENTOM

{PREFIX}EMHOPEN, {PREFIX}EMHCLOSE

PREFIX = MK/HS/EP/AK. Yesterday + tomorrow hours provide schedule context for crowd patterns.

Weather (METAR)10

TEMP (°F, from °C)

HUMIDITY

PRECIP_TOTAL, RAIN_SUM

WIND_SPEED, WIND_GUST

PRESSURE (altimeter in Hg)

CONDITION (wx_string)

VISIBILITY, CLOUD_COVER

Entity / Categorical6+

Entity_ID (categorical)

Park_ID (categorical)

Entity_name, Park_Name

children.location.latitude/longitude

SEASON, school session flags

Training-time categories loaded from pkl to prevent unseen-level prediction errors.

Design Decisions

Why These Choices, and What They Got Wrong

Why LightGBM instead of Prophet (or XGBoost)+

LightGBM was chosen for three concrete reasons over the alternatives available at this stage. First, it handles categorical features natively — Entity_ID can be passed directly as a pandas category dtype without one-hot encoding, which matters when you have 130+ attractions as a single categorical. Second, training speed: LightGBM's leaf-wise growth is significantly faster than XGBoost's level-wise growth on this size dataset, which mattered when iterating on feature sets. Third, it tolerates missing values without explicit imputation unlike many other algorithms.

The explicit model_categorical_feature_categories.pkl file was added specifically to prevent a common LightGBM deployment bug: if a category appears at inference time that wasn't in training (e.g., a new attraction added after training), the model raises an error. Serializing the training-time category levels and re-applying them at inference forces any new values to be treated as known categories.

Global model vs. per-ride model — the tradeoffs+

v6 used a single LightGBM model trained on all rides simultaneously, with Entity_ID as a categorical feature. This is the opposite of v8's one-Prophet-per-ride approach. The global model benefits from cross-ride patterns: if rain causes Space Mountain's waits to drop and Haunted Mansion's waits to stay flat, the model can learn that pattern from the combined data rather than each ride having to independently learn weather effects from its own sparse rain observations.

The tradeoff: a global model can't capture highly ride-specific seasonality patterns. If Big Thunder Railroad has a unique crowd pattern during Halloween parties that no other ride shares, the global model's Entity_ID feature has to carry that signal alone — it can't model the full interaction between entity, time, and context the way a dedicated per-ride model can. This is why v8 swung hard in the other direction, but then hit the training time problem.

Recursive forecasting — why it degrades and how far it gets+

The recursive approach is theoretically sound for short horizons. At +15 minutes, all lag features come from observed PostgreSQL data — the prediction quality is limited only by model accuracy. At +30 minutes, lag_15min comes from the +15 min prediction — one compounded error, but small. At +60 minutes, rolling_mean_1hr is an average of observed data and one or two predicted values — still mostly real. The degradation is gradual.

At +3 hours, lag_1hr is the prediction from +2hr, which itself was built from predicted lags. Rolling_mean_4hr is almost entirely model-generated. The model is predicting based on its own prior predictions rather than anything observed. Empirically this produces "smooth" but increasingly average-regressed forecasts — the model tends toward the historical mean of that entity_id for that hour of day, which is arguably better than the v8 zero-imputation approach but still not reliable.

The buffer was pruned to tail(20) — keeping only the 20 most recent points. This helped memory but also meant at +3hr prediction, you might have only 3 actual observed points and 17 model-generated ones in the window, with the observed ones being 4+ hours stale.

The live PostgreSQL dependency — what it required+

The v6 architecture required a continuously running wait-time ingestion process writing to park_live_status_logs. Without this, the 6-hour lookback buffer would be empty at inference time, all lag and rolling features would fall back to NaN → 0, and the model would be operating in near-identical conditions to v8's imputation approach. The live data was the entire justification for v6's architecture over v8's batch approach.

The queue_details column used a JSONB structure — the actual wait time was at path queue_details → 'STANDBY' → 'waitTime'. This JSONB path is the same structure as the live queue data from the park entity API, so the ingestion pipeline wrote API responses directly as JSON blobs. This meant the schema was flexible but the extraction path had to be hardcoded in the prediction query.

In hindsight, this was the right call for the data model — the current production PostgreSQL database in the successor system uses the same JSONB-first approach for queue data, and the table structure of park_live_status_logs directly influenced the production schema.

What this version got right that v8 didn't+

Three things v6 got right that v8 abandoned in the pivot to Prophet:

Real-time lag features — even if they degrade at long horizons, having actual observed data for the first 60-90 minutes of predictions is a meaningful improvement over imputing to zero
Global model scalability — one trained artifact, not 130+ per-ride Prophet models requiring hours to retrain
Categorical feature handling — serializing training-time category levels and re-applying at inference is good practice that prevented silent category mismatch bugs, regardless of which model algorithm is used

The successor architecture (Knowledge RAG + precomputed aggregates) ultimately rendered the forecasting approach obsolete for the product use case — real-time wait prediction was replaced by historical pattern queries ("typical Saturday morning wait at Space Mountain") which require no live model at all.

Ride Whisperer v6LightGBM + Live Inference

Recursive Multi-Step Forecasting Loop

Per-Attraction Real-Time Prediction Flow

Four Tables Queried at Inference Time

Feature Categories — Assembled at Inference Time

Why These Choices, and What They Got Wrong

Ride Whisperer v6
LightGBM + Live Inference