Abandoned · Prophet · BigQuery · GCP · Python

Ride Whisperer
Wait Time Prediction

A per-attraction time-series forecasting engine for all four Walt Disney World parks. One Facebook Prophet model per attraction, trained on 2+ years of hourly wait data from BigQuery, with 50+ external regressors: weather (temperature, precipitation, humidity, thunderstorm probability), rolling windows at 1h/2h/4h, precipitation lags, park operating hours, holidays, and time decomposition features. 24-hour ahead forecasts written back to BigQuery.

v6: LightGBM + Live Inference ↗ Successor: Knowledge RAG ↗ ← All Projects

⚠️

Abandoned at v8. The project reached v8 iterations before being shelved. The primary blocker was feature imputation at forecast time: rolling mean and lag features are computed from actual historical data and can't be faithfully forward-filled for a 24-hour forecast horizon — the regressors the model relied on most were unavailable at inference time. The dataset infrastructure and feature engineering work directly informed the design of the current PostgreSQL pre-computed aggregates system, which achieves the same analytical goals through a different architecture.

Final Iteration

N×1

Models (per ride)

50+

Regressors

24hr

Horizon

2015

Training Start

WDW Parks

Training + Inference Pipeline

Four Stages: Acquire → Train → Forecast → Store

Training Data

BigQuery: prophet_ready_data_view unified feature view 2023-01-01 → 2025-07-01 training window (2+ years) Joined sources: historical wait CSVs (2015-2021), NWS weather, park schedules, holiday metadata Target variable y: actual wait time in seconds Partition key: entity_id (one row per attraction per hour) 50+ columns: time, weather, rolling stats, park context, one-hot categoricals

Per-Ride Training

One Prophet model per entity_id — no global model Prophet config: additive seasonality, daily + weekly + yearly, linear growth changepoint_prior_scale=0.05 — moderate trend flexibility All 50+ numerical features added via m.add_regressor() Categorical features excluded (land_name, attraction_name) — not directly supported as regressors Rides with zero observations still trained (MIN_OBSERVATIONS=0)

Future Context

Weather: weather_forecasts BQ table → temperature, precipitation, dewpoint, humidity, cloud cover, wind, thunderstorm prob Park hours: park_open_status_hourly_v1 BQ table → per-park is_park_open for 24hr horizon Historical regressors (rolling means, lags) imputed to 0 — cannot be computed without actuals Time-based features re-derived from ds: day_of_week, hour_of_day, month, quarter, is_weekend, is_holiday

Prediction

m.predict(future_ride) per ride — returns yhat + confidence interval (yhat_lower, yhat_upper) 24-hour horizon at hourly resolution Output annotated with ride_entity_id, park_name, prediction_generated_at UTC Rides with missing critical regressors skipped with error log

BigQuery Output

prophet_predicted_wait_times table — WRITE_APPEND disposition Schema: prediction_timestamp (TIMESTAMP), ride_entity_id (STRING), predicted_wait_time_seconds (FLOAT64), yhat_lower, yhat_upper, prediction_generated_at Downstream: feed to dashboard, alerts, and routing recommendations

Feature Engineering

50+ Regressors Across Five Categories

All features were computed in BigQuery and materialized into prophet_ready_data_view before training. The feature set grew from ~10 in v2 to 50+ by v8 as each iteration added a new data source or transformation.

Weather — Raw10

avg_temperature_f

precipitation_inches

dewpoint_f

apparent_temperature_f

humidity_percent

cloud_cover_percent

wind_speed_mph

is_adverse_heavy_rain_1hr

adverse_thunderstorm_prob_3hr

weather_condition_category_* (6 one-hot)

Weather — Derived13

temp_x_precipitation

is_weekend_x_temperature

precipitation_if_open

precipitation_lag_15min

precipitation_lag_30min

temp_f_rolling_mean_1hr / 2hr / 4hr

precip_sum_1hr / 2hr / 4hr

wind_speed_max_1hr / 2hr / 4hr

Wait Time — Rolling9

y_rolling_mean_1hr

y_rolling_std_1hr

y_rolling_mean_2hr

y_rolling_std_2hr

y_rolling_mean_4hr

y_rolling_std_4hr

⚠ Unavailable at inference time — imputed to 0. Known accuracy degrader.

Time Decomposition8

hour_of_day

day_of_week

is_weekend

day_of_year

month_of_year

quarter_of_year

time_of_day_category_* (4 one-hot)

Park & Holiday Context10

is_park_open

is_park_operating_hour

is_ride_open

is_holiday

park_name_magic_kingdom

park_name_epcot

park_name_hollywood_studios

park_name_animal_kingdom

Data Architecture

BigQuery as the Central Feature Store

All data was ingested to a GCP BigQuery data warehouse under the park-whisperer.park_data_warehouse project. The training view joined four independent datasets into a single query surface.

Historical Ride Data

Source: public attraction wait-time CSVs, 2015–2021 (~7 years) Columns: datetime, SACTMIN (actual wait), SPOSTMIN (posted wait) Preprocessed: ride ID extracted from filename, missing values (-999) handled, merged with metadata.csv (holiday flags, ticket season, park hours, events count, capacity_lost_percent) Output: long_term_historical_ride_data.parquet → loaded to BQ 130+ attraction CSV files across 4 parks ingested via preprocess.py

Weather Data

Historical: historical_weather.py → NWS hourly observations → BQ Forecast: weather_forecasts BQ table (live NWS feed) Computed in BQ: adverse_heavy_rain_pred_1hr, adverse_thunderstorm_prob_3hr, weather_condition_category

Park Operations

park_open_status_hourly_v1 — hourly is_park_open flag per park Derived from park schedule data: MKOPEN/MKCLOSE, EPOPEN/EPCLOSE, HSOPEN/HSCLOSE, AKOPEN/AKCLOSE is_park_operating_hour distinguishes scheduled hours from actual operation

Feature View

prophet_ready_data_view — SQL view joining all sources Rolling windows and lag features computed in SQL (y_rolling_mean_*hr, precip_lag_*min) One-hot encoding done in SQL: CASE WHEN park_name = 'Magic Kingdom' THEN 1 ELSE 0 Training query reads from view, filters by date range, orders by ds, entity_id

Column	Type	Description
prediction_timestamp	TIMESTAMP	Hourly timestamp for which the wait is predicted (Eastern Time, as UTC)
ride_entity_id	STRING	Attraction identifier matching entity_id in training data (e.g., "seven-dwarfs-mine-train")
predicted_wait_time_seconds	FLOAT64	Prophet yhat — point forecast in seconds. Divide by 60 for minutes.
yhat_lower	FLOAT64	80% confidence interval lower bound
yhat_upper	FLOAT64	80% confidence interval upper bound
prediction_generated_at	TIMESTAMP	UTC timestamp when the prediction batch was computed. WRITE_APPEND — each run adds rows.

Version History

v2 → v8 — What Changed Each Iteration

First working Prophet model

Basic per-ride Prophet with daily/weekly seasonality. ~10 regressors: temperature, precipitation, is_holiday, day_of_week, hour_of_day. Training data from CSV files. No BigQuery integration yet.

BigQuery integration + weather expansion

Moved training data to BigQuery. Added full NWS weather set: dewpoint, apparent temperature, humidity, cloud cover, wind speed. Added park operating hours as a regressor. First prophet_ready_data_view.

Rolling window features

Added y_rolling_mean/std at 1hr, 2hr, 4hr windows for autocorrelation signal. Added temp_f_rolling_mean, precip_sum, wind_speed_max at same windows. This significantly improved training accuracy but introduced the future-imputation problem.

Interaction terms + lag features

Added cross-feature interactions: temp_x_precipitation, is_weekend_x_temperature, precipitation_if_open. Added precipitation lags at 15min and 30min. Added weather_condition_category and time_of_day_category one-hot encodings.

LightGBM parallel experiment

Parallel track exploring LightGBM (gradient boosting) as an alternative to Prophet. Achieved better in-sample accuracy on rolling features but required more complex feature pipelines. Model artifact: lightgbm_model.joblib.

Full regressor set — final iteration before abandonment

50+ regressor Prophet. Removed debug MAX_RIDES_TO_TRAIN limit — trained on all attractions. Added yearly seasonality. MIN_OBSERVATIONS set to 0 to include all entity_ids. Rolling features explicitly imputed to 0 at forecast time with warning logs. Project stalled on this imputation problem.

Post-Mortem

Why It Was Abandoned and What It Built Toward

The imputation problem that killed the forecast accuracy+

By v5, the most predictive features were the rolling window statistics: y_rolling_mean_1hr, y_rolling_mean_2hr, y_rolling_mean_4hr. These captured autocorrelation in wait times — if a ride has been running 60-minute waits for the past two hours, that's a strong signal for the next hour. In training, these values are computed from actual historical data and are highly accurate.

At forecast time, there is no "actual" data for the future. The pipeline imputed these to 0, which is demonstrably wrong — 0 means the ride has had zero wait time for the past four hours, a state that doesn't exist during park operating hours. The model trained with real values and predicted with zeroes. Every prediction was effectively generated without the features the model weighted most heavily. Adding more weather features didn't compensate for this structural gap.

Why one-model-per-ride didn't scale well in practice+

Prophet is computationally heavy: fitting a full Prophet model with 50 regressors and 2+ years of hourly data takes 30–120 seconds per attraction. With 130+ attractions across 4 parks, a single training run was measured in hours on commodity hardware. Retraining every night to keep models fresh on new data was not practical without significant infrastructure — a Cloud Run or Dataflow job with parallelism across all rides.

A global model (one model for all rides with entity_id as a categorical feature) was considered but Prophet doesn't natively support entity-level models — that's more LightGBM/XGBoost territory, which is why the v7 LightGBM parallel track existed. Neither approach reached production readiness.

What the data infrastructure enabled after abandonment+

The BigQuery warehouse, the NWS weather ingestion, the historical CSV preprocessing pipeline, and the prophet_ready_data_view feature view all survived the Prophet experiment and fed directly into the next architecture. The feature categories that proved most important — weather condition, time-of-day category, park operating hours — became the basis for the knowledge extractor taxonomy in the current PostgreSQL RAG system.

Instead of serving predictions from a live model, the current system precomputes day-of-week and hour-of-day aggregates from the same historical dataset and stores them as knowledge documents. "What's the typical wait for Space Mountain on a Saturday morning?" is answered directly from those precomputed aggregates — no real-time inference required, and the answer is based on 7+ years of actual data rather than a forecast from a model with imputed zero-value features.

What would need to change to revive it+

The imputation problem is solvable — but not with Prophet. A model architecture that treats the rolling features as autoregressive inputs natively (LSTM, N-BEATS, or a tree-based model with an explicit lag feature pipeline) would handle the forecast horizon correctly. The BigQuery feature store is largely ready; the main engineering work would be:

Replace Prophet with a global LightGBM or XGBoost model trained on all rides simultaneously with entity embeddings
Build an online feature pipeline that computes the rolling/lag features from a streaming wait-time source (the existing Cosmos DB selloutEventv2 / wait-time records) at inference time rather than from precomputed training data
Move training to Vertex AI or a scheduled Cloud Run job to parallelize per-ride training
The output schema is already defined — prophet_predicted_wait_times just needs a better upstream producer

Ride WhispererWait Time Prediction